Could big data beat breast cancer?
After just months of study, a multiyear, $100 million enterprise analytics project at UPMC is already showing promise in the fight against breast cancer, researchers say.
With the basic architecture of UPMC's leading-edge enterprise data warehouse now in place, Pitt researchers say they've been able to integrate clinical and genomic information for 140 breast cancer patients.
The results show promise for what big data and analytics technology can do to further personalized medicine and the fight against disease, they say.
"One of the first questions we asked was, 'Is there a difference, a unique difference between pre-menopausal and post-menopausal breast cancer?'" said Adrian V. Lee, director of the Women's Cancer Research Center at the University of Pittsburgh Cancer Institute, in a press statement.
"We are interested in this question from a research standpoint because we are moving toward personalized medicine, and personalized medicine is all about finding subgroups of patients who have a specific type of disease for which we could develop novel therapies," he added.
Researchers found what they call "intriguing" molecular differences in the makeup of pre-menopausal and post-menopausal breast cancer; understanding them will require more research, said Lee, but the findings eventually could provide a roadmap for developing targeted therapies.
These investigations are the first step in UPMC's and Pitt's plans to mine huge troves of clinical, genomic, proteomic, imaging and financial data in the pursuit of smarter medicine, researchers say.
While these types of data usually reside in separate information systems – making it difficult, if not impossible, to integrate and analyze dozens of variables – this five-year, $100 million enterprise analytics project, built in collaboration with Oracle, IBM, Informatica and dbMotion, stands to change all that.
As the data warehouse grows and evolves, data from more than 200 sources of information across UPMC, UPMC Health Plan and outside entities – including labs and pharmacies – will be integrated for further research, officials say. When the first phase of the multi-year project is completed in the spring of 2014, many researchers, clinicians and administrators will have secure, real-time access to data and analytic tools that fit their particular interests and needs.
"The integration of data, which is the goal of the enterprise data warehouse, allows us to ask questions that we just simply couldn't ask before," said Lee, adding that UPMC's analytics tools will now make it easier to explore other types of cancer and other diseases.
This breast cancer research was chosen as a test of the enterprise data warehouse because of the rich genomics data available on these 140 patients. Their de-identified information previously had been submitted as part of a federally funded project called The Cancer Genome Atlas (TCGA), a multi-center effort to produce comprehensive genomic maps of the most common cancers. Pitt was the largest contributor of tissue to TCGA.
While the data warehouse started with only two types of breast cancer "omic" data – gene expression and copy number variant data, measuring changes in the amount of DNA – many more will be added, say officials.
"Every patient is different; every patient has a unique story," said Steven D. Shapiro, MD, chief medical and scientific officer at UPMC, when the data warehouse and analytics platform was first announced this past October. "This comprehensive analytics approach will enable us to treat each patient in a personalized way to produce the best possible results."