GE HealthCare to Head Consortium Focused on Creating Synthetic Data for Healthcare AI

15 October 2024

GE HealthCare will lead the Synthia consortium project, which aims to evaluate methods for generating synthetic data. This initiative will focus on creating synthetic datasets and using them to develop artificial intelligence (AI) algorithms.

Alongside GE HealthCare, the consortium includes organisations such as Gates Ventures, Novo Nordisk, and Pfizer, as well as academic institutions such as La Fe University, the Fraunhofer Institute, and the University of Bologna.

Data is essential to AI products at every stage, from initial development to final deployment. Synthetic data, which is artificially generated to imitate real patient data, can help address problems such as the scarcity of real datasets, bias in training data, and privacy concerns.

However, both the reliability of synthetic data generation tools and the quality of the datasets they produce still need to be established.

The Synthia project aims to develop reliable methods, standards, and frameworks for generating synthetic data and using it effectively in AI development, training, and validation. By bringing together expertise from healthcare providers, academics, and industry professionals, the project will tackle challenges related to synthetic data, including legal, ethical, and regulatory aspects, while also exploring ways to increase the availability of high-quality training datasets.

A key objective of the project is to establish data generation workflows, together with assessment frameworks for evaluating the privacy, quality, and relevance of generated datasets. These resources will be made accessible to the research community through a dedicated platform that will host a repository of high-quality synthetic datasets, each labelled for specific applications.

The tools developed through Synthia will encompass various data types, including laboratory results, clinical notes, genomics, imaging, and mobile health data, allowing for the generation of longitudinal data.

The project will assess the effectiveness of synthetic data on six diseases across four areas: oncology (lung and breast cancer), haematology (multiple myeloma and diffuse large B-cell lymphoma), neurology (Alzheimer’s disease), and metabolic health (type 2 diabetes).

Ultimately, the Synthia platform aims to foster trust among stakeholders regarding the utility of synthetic data and promote its responsible use in health research.

The project is part of the Innovative Health Initiative (IHI), a public-private partnership between the European Union and the life sciences sector. It is supported by the European Union’s Horizon Europe research and innovation programme, as well as various industry partners.

GE HealthCare's involvement builds on its strong history of AI research and innovation, which includes patents for generating realistic synthetic images and improving machine learning model generalisability with synthetic training data.

Source: gehealthcare.com