Amazon Omics announces direct upload to storage and new automated capabilities

The new capabilities let customers integrate existing processing pipelines so they can quickly build applications and scale up robust genomic and secondary analyses.
By Andrea Fox
07:58 AM


At the annual AWS Life Sciences Executive Symposium, Amazon announced new Amazon Omics capabilities to help customers build, run and scale 35 Ready2Run workflows by Element Biosciences, NVIDIA and Sentieon Inc. 

WHY IT MATTERS

Large-scale analysis of omics data typically requires specialized bioinformatics workflows to process data sets.

For use cases where customers want to run industry-standard analysis workflows without any modification, Amazon Omics now allows healthcare organizations to use Ready2Run workflows from Sentieon, NVIDIA Parabricks and Element Biosciences, as well as open-source pipelines including GATK best practices, nf-core scRNAseq, and AlphaFold and ESMFold for protein prediction.

AWS said on its blog that a new multi-part direct upload API allows customers to upload their data directly to Omics storage and write their outputs to a sequence store "in minutes."

The workflows can also be converted into private workflows to support larger file sizes, according to the announcement. The transfer manager utility has also been updated to upload these large files directly with a single Python command.
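As a rough illustration of what a multi-part upload involves on the client side, the sketch below splits a large file into numbered byte ranges that could then be sent one part at a time. It deliberately makes no AWS calls. The `plan_parts` helper, the `Part` type and the 100 MiB part size are assumptions made for this example and are not taken from the Amazon Omics API or the transfer manager utility.

```python
from dataclasses import dataclass
from typing import List

# Assumed part size for illustration only; check the service's documented limits.
PART_SIZE = 100 * 1024 * 1024


@dataclass(frozen=True)
class Part:
    number: int  # 1-based part number
    offset: int  # byte offset where the part starts
    length: int  # number of bytes in the part


def plan_parts(file_size: int, part_size: int = PART_SIZE) -> List[Part]:
    """Split a file of file_size bytes into consecutive parts (hypothetical helper)."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    parts: List[Part] = []
    offset = 0
    number = 1
    while offset < file_size:
        # The final part may be shorter than part_size.
        length = min(part_size, file_size - offset)
        parts.append(Part(number=number, offset=offset, length=length))
        offset += length
        number += 1
    return parts
```

An uploader would read each range from disk and send it with its part number, then tell the service the upload is complete once every part has been accepted.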

In addition to direct data ingestion, other updates include the ability to: 

  • Use NVIDIA T4 and A10G graphics processing units in Omics workflows to support computationally intensive pipelines and easier querying and analysis of variants.
  • Automatically parse variant data containing Variant Effect Predictor (VEP) annotation into a separate data structure.
  • Leverage integration with Amazon EventBridge and use published events as part of an event-driven architecture.
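To give a sense of what an event-driven setup could look like, the sketch below filters events with a rule in the style of an EventBridge event pattern and only reacts to completed workflow runs. The source name `aws.omics`, the detail type `Run Status Change` and the `status` field are assumptions made for illustration and should be checked against the service documentation. The `matches` function is a simplified local stand-in written for this example and is not the EventBridge matching engine.

```python
from typing import Any, Dict

# Assumed event pattern: field names and values are illustrative, not confirmed.
RUN_COMPLETED_PATTERN: Dict[str, Any] = {
    "source": ["aws.omics"],
    "detail-type": ["Run Status Change"],
    "detail": {"status": ["COMPLETED"]},
}


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Simplified matcher: lists hold allowed values, nested dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event field must be a dict that matches too.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical event payloads for demonstration.
completed_event = {
    "source": "aws.omics",
    "detail-type": "Run Status Change",
    "detail": {"status": "COMPLETED", "runId": "example-run"},
}
failed_event = {
    "source": "aws.omics",
    "detail-type": "Run Status Change",
    "detail": {"status": "FAILED", "runId": "example-run"},
}
```

In this sketch a downstream step, such as starting a tertiary analysis, would fire for `completed_event` but not for `failed_event`.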

THE LARGER TREND

AWS launched the cloud-based Amazon Omics precision medicine platform in November 2022 to eliminate the need for specialized infrastructure and workflows.

The platform utilizes artificial intelligence, machine learning and other AWS and partner products and services to run IT-heavy bioinformatics workflows. 

Combining multiple clinical modalities is key to delivering care at Children's Hospital of Philadelphia, said Jeff Pennington, associate vice president and chief research informatics officer.

"With Amazon Omics, we can expand our understanding of our patients' health, all the way down to their DNA," Pennington said in the launch announcement.

ON THE RECORD

"The ease of use, scalability and cost transparency of AWS’s Ready2Run workflows for GATK Best Practices, NVIDIA Parabricks and Sentieon’s pipelines significantly lowers barriers to bioinformaticians at any scale," said Dr. Daniel S. T. Hughes, director of bioinformatics at the Institute for Genomic Medicine & Precision Genomics Laboratory at Columbia University Medical Center, in a statement.

"Combined with native integration into AWS’s analytical ecosystem this should significantly accelerate the pace of clinico-genomics."

Andrea Fox is senior editor of Healthcare IT News.
Email: afox@himss.org

Healthcare IT News is a HIMSS Media publication.
