The DataTools4Heart project aims to develop a cardiology data toolbox for data scientists and clinicians. The toolbox will primarily include descriptive analytics and decision support tools based on machine learning algorithms. These tools will address three use cases: determining the role of chronic kidney disease and hyperkalaemia in heart failure; developing and validating a risk score for acute heart failure patients presenting at the emergency department, intended for use by clinicians at the point of care; and investigating referral pathways before and after admission to cardiology clinics.
In addition, multilingual natural language processing approaches will help extract as much useful information as possible from unstructured data, while data synthesis methods will help generate realistic synthetic patient data. All these tools will be powered by a privacy-preserving data pipeline with data ingestion and feature extraction capabilities. Data ingestion transforms heterogeneous data from hospital information systems into a common data model; feature extraction then processes the common data model to produce machine-learning-ready datasets.
All software modules will be deployed within each hospital’s IT ecosystem, so sensitive data never leaves the hospital. The tools will be deployed at eight hospital sites:
- University Medical Centre Utrecht
- Fondazione Policlinico A. Gemelli IRCCS
- St. Anne’s University Hospital Brno
- University College London Hospital
- Vall d’Hebron Hospital Research Institute
- Karolinska University Hospital
- Bucharest Emergency Clinical Hospital
- Amsterdam Medical Centre
A data catalogue and accompanying virtual assistants will help data scientists discover datasets across pilot sites and identify those that meet the specific requirements of each use case.
SRDC will mainly be responsible for making patient data residing in multiple heterogeneous data sources available to data scientists via a unified interface. To this end, SRDC will work on the following items:
- Analysis of the use cases and clinical protocols, and definition of an HL7 FHIR-based common data model covering them
- Collaboration with the pilot sites to understand the format of the data provided by the hospital information systems, and development of mappings from these data sources to the common data model
- Collaboration with the pilot sites and data scientists to elaborate the features composing the datasets
- Development of extraction specifications on top of the common data model to obtain the data from the onFHIR patient data repository
- Development of software modules to define and execute data mappings and feature extraction configurations
- FAIRification of datasets by annotating them with EOSC Dataset Minimum Information metadata
- Development of a dataset catalogue to find datasets satisfying specific content and quality criteria
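The mapping and feature extraction steps listed above can be illustrated with a minimal Python sketch. It is not the project's actual mapping language or extraction specification: the source field names (`patient_id`, `taken_at`, `result`, `unit`), both function names, and the 5.5 mmol/L threshold are assumptions made for illustration only. The example maps hypothetical laboratory rows to FHIR Observation-shaped dictionaries and then derives one row of features per patient.

```python
# Illustrative sketch only: field names, function names and the threshold
# are assumptions, not part of the DataTools4Heart specification.

POTASSIUM_LOINC = "2823-3"  # LOINC code for potassium in serum or plasma


def map_lab_row_to_observation(row: dict) -> dict:
    """Map one hypothetical source lab row to a minimal FHIR Observation-shaped dict."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org", "code": row["loinc_code"]}]},
        "subject": {"reference": f"Patient/{row['patient_id']}"},
        "effectiveDateTime": row["taken_at"],
        "valueQuantity": {"value": float(row["result"]), "unit": row["unit"]},
    }


def extract_latest_value_features(observations: list, code: str, threshold: float) -> list:
    """Build one feature row per patient from the most recent observation with `code`."""
    latest = {}
    for obs in observations:
        if obs["code"]["coding"][0]["code"] != code:
            continue
        patient = obs["subject"]["reference"]
        previous = latest.get(patient)
        # Timestamps in the same ISO 8601 format compare correctly as strings.
        if previous is None or obs["effectiveDateTime"] > previous["effectiveDateTime"]:
            latest[patient] = obs
    return [
        {
            "patient": patient,
            "latest_value": obs["valueQuantity"]["value"],
            "above_threshold": obs["valueQuantity"]["value"] > threshold,
        }
        for patient, obs in sorted(latest.items())
    ]


if __name__ == "__main__":
    source_rows = [
        {"patient_id": "p1", "loinc_code": POTASSIUM_LOINC,
         "taken_at": "2023-01-01T08:00:00Z", "result": "4.2", "unit": "mmol/L"},
        {"patient_id": "p1", "loinc_code": POTASSIUM_LOINC,
         "taken_at": "2023-02-01T08:00:00Z", "result": "5.9", "unit": "mmol/L"},
        {"patient_id": "p2", "loinc_code": POTASSIUM_LOINC,
         "taken_at": "2023-01-15T08:00:00Z", "result": "4.0", "unit": "mmol/L"},
    ]
    observations = [map_lab_row_to_observation(r) for r in source_rows]
    for feature_row in extract_latest_value_features(observations, POTASSIUM_LOINC, 5.5):
        print(feature_row)
```

Keeping the mapping step separate from the feature step mirrors the split described above: each site only needs its own mapping to the common data model, while the feature logic can be shared across sites.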
| No. | Partner | Country |
| --- | --- | --- |
| 1 | University of Barcelona | Spain |
| 2 | Lynkeus Srl. | Italy |
| 3 | Barcelona Supercomputing Centre | Spain |
| 4 | SRDC | Turkey |
| 5 | Athena Research Centre | Greece |
| 6 | University Medical Centre Utrecht | Netherlands |
| 7 | Panetta Studio Legale | Italy |
| 8 | Translated Srl. | Italy |
| 9 | Siemens Healthcare | Romania |
| 10 | Fondazione Policlinico A. Gemelli IRCCS | Italy |
| 11 | St. Anne’s University Hospital Brno | Czechia |
| 12 | University College London Hospital | United Kingdom |
| 13 | Vall d’Hebron Hospital Research Institute | Spain |
| 14 | Karolinska University Hospital | Sweden |
| 15 | Bucharest Emergency Clinical Hospital | Romania |
| 16 | European Society of Cardiology | France |
| 17 | Amsterdam Medical Centre | Netherlands |