How synthetic data can boost efficiency for clinical researchers and IT leaders

Washington University and Ottawa Hospital used synthetic data to review a sepsis prediction model used at both institutions. The review took two phone calls and avoided months of conventional research arrangements.
By Bill Siwicki
10:49 AM

Hospitals and health systems have traditionally struggled to find insights from data at scale, because their data universes are so complicated. A typical health system holds a huge volume of data from a wide variety of sources.

To take one example, the data associated with a surgery and its follow-up come from HIS, inpatient EHR, perioperative systems, genomics, lab, pathology, imaging, ambulatory EHR and others. Further, each source has its own formats, data types and data elements.

If a service line leader, a clinician and an administrator want to find insights around surgical success rates, they need to navigate through data from all of those sources to find correlations and conclusions that inform an analysis.

For example, a clinician may want to know if there’s a correlation between patients taking a certain post-operative pain medication and the likelihood of readmission and/or complication, essentially asking, “Should we use this medication for these types of patients going forward?”

To obtain the answer to that question, the clinician would need to understand the entirety of the health system's data universe, which is a very high bar.

"To address this challenge, the typical approach is for health systems to employ teams of data analysts that clinicians send these types of questions to, and the data analysts then have to pull together that data universe, and that takes a lot of time and energy and expertise," explained Josh Rubel, chief commercial officer at MDClone, a data analytics and synthetic data technology vendor.

"And the analysts face an ever-growing number of requests, which creates a bottleneck," he continued. "In that example, if a surgeon has an interesting question about a specific pain medication, she is forced to get in line with anyone else who is asking questions. The process is inherently slow and over-mediated, and therefore it’s not scalable."

To discuss potential solutions to these challenges and dive into surrounding technologies, Healthcare IT News interviewed Rubel. He had much to say.

Q. For those who are not fully up to speed with the technology, please explain what synthetic data is and how it can be used in healthcare.

A. Synthetic data are non-human-subject data. Good synthetic data, especially in healthcare, are data that are derived from original or real data, that tell the same story as real data, but contain no real human subject data.

In healthcare, with the important emphasis on patient privacy, synthetic data dramatically increases the number and types of users, especially external users, who can interact with data. For example, with robust and accurate synthetic data, health systems can give data access to algorithm developers to train a disease-progression predictive model.
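As a rough illustration of what "derived from real data" and "tell the same story" can mean, here is a minimal Python sketch. It is not MDClone's method, which the interview does not describe. It assumes a deliberately simple technique: summarize two numeric columns (means, standard deviations and their correlation), then sample brand-new records from those summaries. The source records are themselves made up by `make_toy_source`, a hypothetical stand-in for real patient data, and the column names (age, length of stay) are assumptions chosen for the example.

```python
import math
import random
import statistics


def fit(records):
    """Summarize two numeric columns: means, standard deviations, correlation."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance, then Pearson correlation.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in records) / (len(records) - 1)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sd_x,
        "sd_y": sd_y,
        "corr": cov / (sd_x * sd_y),
    }


def synthesize(summary, n, seed=0):
    """Draw n new (x, y) records from a bivariate normal matching the summary."""
    rng = random.Random(seed)
    r = summary["corr"]
    residual = math.sqrt(max(0.0, 1.0 - r * r))
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        x = summary["mean_x"] + summary["sd_x"] * z1
        # Mixing z1 into y reproduces the correlation between the columns.
        y = summary["mean_y"] + summary["sd_y"] * (r * z1 + residual * z2)
        rows.append((x, y))
    return rows


def make_toy_source(n, seed=1):
    """Hypothetical stand-in for real records: (age, length_of_stay_days).

    Every value here is invented for illustration; none is patient data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(30, 90)
        stay = 1.0 + 0.08 * age + rng.gauss(0, 1.5)
        rows.append((age, stay))
    return rows


source = make_toy_source(500)
synthetic = synthesize(fit(source), 500)

source_summary = fit(source)
synthetic_summary = fit(synthetic)
print("source   :", {k: round(v, 2) for k, v in source_summary.items()})
print("synthetic:", {k: round(v, 2) for k, v in synthetic_summary.items()})
```

The two printed summaries should be close to each other even though no synthetic row is copied from the source. Matching a few summary statistics is only the simplest version of the idea; on its own it does not demonstrate the privacy or accuracy properties a production system would have to establish.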

Q. What do healthcare provider organization CIOs and other health IT leaders need to know about synthetic data? What's the news here that should have their attention?

A. Health IT leaders understand how valuable their data is, and they want to get more use out of it – internally and externally. The reason to get excited about synthetic data is the promise of wider and greater use of data.

Synthetic data enables providers to drive better and faster therapeutic development through their life science or technology partners, to improve quality and performance by partnering with third parties, and to accelerate the research coming out of academic institutions.

Q. Please give an example of how a provider organization can use synthetic data to explore, discover or collaborate to improve patient health.

A. The U.S. Department of Veterans Affairs developed an innovation initiative called ARCHES, where MDClone is deployed as core infrastructure. The VA’s ARCHES program is using synthetic data to collaborate with several third-party companies in a program to help understand patient behavior and improve processes and strategies for suicide prevention.

Because the data is synthetic, the VA can safely and easily share it with outside organizations developing models to predict which veterans may be likely to attempt to end their lives. Without synthetic data, this sort of collaborative effort would be very difficult.

We expect many more programs like this one at the VA and elsewhere, all enabled by reliable, accurate and privacy-preserving synthetic data.

Q. Please give one more example of how a provider organization can use synthetic data to explore, discover or collaborate to improve patient health.

A. Washington University in St. Louis and the Ottawa Hospital in Canada wanted to review data that was relevant for a sepsis-prediction model and sepsis care in both of their institutions, and they were interested in doing cross-site comparison work.

In the old way of doing things, maybe they could have come up with a clinical research arrangement and started to share data with each other over a period of several months. Instead, over the course of two phone calls, the organizations were able to look at each other's synthetic data.

No data sharing or data use agreements had to be signed. They weren't looking at human subject data, and they built joint research together on sepsis modeling and sepsis care after about two hours of engaging with each other.

Collaboration at this speed and scale is nearly unheard of among large academic organizations working together. Synthetic data is a game-changer for this sort of cross-institutional research.

Q. How does your company's technology enable synthetic data use for provider organizations?

A. MDClone is focused on maximizing data privacy and data utility in healthcare. With more than 20 implementations around the world, MDClone’s core offering, the ADAMS platform, enables flexible and high-velocity exploration of data to drive insights.

A differentiating capability of ADAMS is our synthetic data generation for precision cohorts, therapeutic areas, and populations of interest. The platform includes the ability for nontechnical end users to generate and explore synthetic data.

In provider organizations like the VA or Intermountain Health or Sheba in Israel, our synthetic engine is used hundreds of times per day to help find opportunities to improve quality, performance and overall health.

Follow Bill's HIT coverage on LinkedIn: Bill Siwicki
Email the writer: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
