Healthcare Technology Advancements: Powering Healthcare Innovation with Synthetic Patient Data Simulation 

Patient-level data presents a huge challenge for executives tasked with healthcare innovation. The process of accessing patient data and insights can be costly. And because the use of patient data is governed by rigorous procedures, regulatory controls, and ethical and legal constraints, many healthcare organizations end up unable to undertake the effort or justify the risk.  

For small and medium-sized healthcare teams especially, sample size is a challenge. These organizations struggle to assemble a data set large enough to produce useful insights. 

At 11TEN Innovation Partners, we can build proofs of concept (POCs) and develop machine learning models that augment health research, using robust samples of patient-level data for teams of any size. We have successfully constructed algorithms and machine learning models in combination with external data sets with minimal security issues.  

WHAT IS SYNTHEA?

Synthea is an open-source, synthetic patient generator that models the medical history of artificial patients from birth to death. It provides realistic, but not real, patient data that simulates synthetic patients’ entire lives and associated electronic health records covering every aspect of healthcare.

Because patient records are artificially created, they do not contain the personally identifiable information (PII) of real patients, which means these synthetic patients can be used without the same legal concerns or privacy restrictions. Ultimately, this makes it possible to generate large, easy-to-use sets of patient-level data quickly, representing diverse populations.
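As a rough illustration of what working with this kind of output can look like, the sketch below counts how many distinct patients carry each condition. It assumes records shaped like Synthea's CSV export, with PATIENT and DESCRIPTION columns in a conditions file; the sample rows are invented for illustration, and the function name patients_per_condition is hypothetical rather than part of Synthea.

```python
import csv
import io
from collections import Counter

# Invented rows shaped like a Synthea-style conditions export.
# Assumption: the file has PATIENT and DESCRIPTION columns.
# A real run would read the exported conditions file from disk instead.
SAMPLE_CONDITIONS_CSV = """START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION
2015-03-01,,p1,e1,195967001,Asthma
2018-07-12,,p1,e2,59621000,Hypertension
2019-01-20,,p2,e3,59621000,Hypertension
2020-05-05,2020-05-19,p3,e4,444814009,Viral sinusitis (disorder)
2021-02-02,,p3,e5,59621000,Hypertension
2021-06-30,,p3,e6,59621000,Hypertension
"""


def patients_per_condition(csv_text):
    """Count distinct patients recorded with each condition description."""
    seen = set()
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["PATIENT"], row["DESCRIPTION"])
        # A patient may have the same condition recorded more than once;
        # count each patient/condition pair only one time.
        if key not in seen:
            seen.add(key)
            counts[row["DESCRIPTION"]] += 1
    return counts


counts = patients_per_condition(SAMPLE_CONDITIONS_CSV)
for description, n in counts.most_common():
    print(f"{description}: {n} patient(s)")
```

Because no real patient is behind any of these rows, a summary like this can be shared and iterated on freely while a team is still designing its analysis.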

THE USE OF SYNTHETIC PATIENT DATA IN HEALTHCARE INNOVATION IS ON THE RISE

Recent momentum around synthetic patient data modeling is evident in several government research initiatives.

THE U.S. FOOD AND DRUG ADMINISTRATION (FDA) LEVERAGES SYNTHEA FOR MACHINE LEARNING MODELING 

In the wake of the COVID-19 pandemic, the FDA used Synthea to identify risk and protective factors for COVID-19 among Veterans, a population with a higher incidence of risk factors for severe COVID-19 illness. This research combined real-world data sources with Synthea synthetic patient records in machine learning models to predict COVID-19-related health outcomes.

THE OFFICE OF THE NATIONAL COORDINATOR FOR HEALTH INFORMATION TECHNOLOGY (ONC) FOCUSES ON SYNTHETIC DATA TO ACCELERATE RESEARCH

Synthea has also been leveraged by the ONC for a multi-year challenge that invites participants to build momentum for, and adoption of, synthetic patient data modeling to accelerate patient-centered outcomes research (PCOR). A statement from the study reads:

“Clinical data are critical for the process of PCOR, which focuses on the effectiveness of prevention and treatment options. However, realistic patient data are often difficult to access because of cost, patient privacy concerns, or other legal restrictions. Synthetic health data help address these issues and speed the initiation, refinement, and testing of innovative health and research approaches.”

Through collaboration, participants in the challenge have contributed to the enhancement of synthetic patient health records and further expanded Synthea’s potential for researching diverse populations.

SYNTHEA OFFERS SYNTHETIC PATIENT DATA SIMULATION FOR VARIOUS DISEASE CONDITIONS 

Synthea is available for 31 different disease areas and can model the entire U.S. population.

Originally, Synthea started with modules for the top 10 reasons patients visit primary care providers and the top 10 medical conditions that result in years of lost life. Some of these modules include:

  • Asthma 
  • Attention Deficit Disorder (ADD) 
  • Bronchitis 
  • Breast Cancer, Colorectal Cancer, and Lung Cancer 
  • Chronic Obstructive Pulmonary Disease (COPD) 
  • Dementia  
  • Ear Infections  
  • Food Allergies  
  • Hypertension  
  • Opioid Addiction  
  • Osteoporosis  
  • Sinusitis 
  • Urinary Tract Infections (UTIs) 

The team at 11TEN has used Synthea to build POC Artificial Intelligence (AI) and predictive models to help showcase underrepresented populations in healthcare, such as those that have a higher propensity for opioid addiction.

Since Synthea is an open-source data tool, it is ripe for further healthcare innovation. It is constantly being updated with new disease models that can have a greater impact on patient experiences and patient outcomes, both of which are essential as we transition to more patient-centric solutions such as value-based care.

SYNTHETIC PATIENT DATA: “LOW-RISK” COMES WITH LIMITATIONS

Synthetic data simulation can be used to test existing systems, build POCs, or develop machine learning protocols without the complications of real-world patient data, but healthcare teams should not consider synthetic data a replacement for real patient records.  

Because synthetic patient records are artificial, some outlier insights that would occur within the real world may be absent.  

In testing and learning scenarios, however, synthetic data can bypass the difficulty most healthcare organizations face in housing and accessing patient data, and shows promise when augmenting existing, insufficient data sources.  
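One way to keep augmentation honest is to tag every record with its origin and to evaluate only on real records, so synthetic rows never inflate the measured results. The sketch below is a minimal illustration under that assumption; the field names (age, hypertension, source), the sample values, and both helper functions are hypothetical and not taken from Synthea or any particular project.

```python
import random


def combine_with_provenance(real_rows, synthetic_rows):
    """Merge both sources, tagging each record with where it came from."""
    tagged = [dict(r, source="real") for r in real_rows]
    tagged += [dict(r, source="synthetic") for r in synthetic_rows]
    return tagged


def split_for_evaluation(rows, holdout_fraction=0.5, seed=0):
    """Hold out only real records for testing; train on everything else."""
    real = [r for r in rows if r["source"] == "real"]
    synthetic = [r for r in rows if r["source"] == "synthetic"]
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    rng.shuffle(real)
    n_holdout = max(1, int(len(real) * holdout_fraction))
    test = real[:n_holdout]
    train = real[n_holdout:] + synthetic
    return train, test


# Invented example records: a small real sample plus a larger synthetic one.
real_rows = [
    {"age": 54, "hypertension": 1},
    {"age": 37, "hypertension": 0},
    {"age": 68, "hypertension": 1},
    {"age": 45, "hypertension": 0},
]
synthetic_rows = [
    {"age": 61, "hypertension": 1},
    {"age": 29, "hypertension": 0},
    {"age": 73, "hypertension": 1},
    {"age": 50, "hypertension": 0},
    {"age": 42, "hypertension": 0},
    {"age": 66, "hypertension": 1},
]

rows = combine_with_provenance(real_rows, synthetic_rows)
train, test = split_for_evaluation(rows)
print(f"training records: {len(train)}, real-only test records: {len(test)}")
```

Keeping the source tag on every record also makes it straightforward to rerun an analysis on real records alone and check how much the synthetic rows changed the outcome.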

PUT SYNTHEA TO WORK FOR YOUR HEALTHCARE ORGANIZATION 

Interested in exploring how Synthea could help your organization accelerate healthcare innovation, testing, learning, and insights? We’d love to hear about your project and help you determine whether synthetic data can assist your organization with any research initiatives. Contact us for more information.