October 25, 2013
Genomics workloads have proven to be a perfect match for the cloud era, a point that was brought to light once again. As part of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Project, Baylor College of Medicine, Amazon Web Services (AWS), and DNAnexus have teamed up to run the largest ever cloud-based analysis of genomic data.
It's become a familiar tale. The CHARGE project had a job to run – a massive analysis load that needed processing. But the job exceeded the computing and storage resources of the partner organization, the Human Genome Sequencing Center (HGSC) at Baylor College of Medicine.
The options were to purchase more hardware for a short-term project; "jam the cluster" to attempt to get the job done at the cost of pushing back other important work; or identify a suitable cloud-based solution. They decided to go for option number three, and signed on to work with DNAnexus and Amazon Web Services for this ultra-large-scale genomic analysis project.
As part of its participation in the project, the HGSC used the DNAnexus enterprise cloud platform (hosted by AWS) to power its Mercury pipeline, a semi-automated and modular set of tools for the analysis of next-generation sequencing data. With DNAnexus providing the platform-as-a-service (PaaS) on top of AWS infrastructure, HGSC was able to analyze the genomes of over 14,000 patients, encompassing 3,751 whole genomes and 10,771 whole exomes.
As the case study describes, the entire job was run over a four-week period, using approximately 2.4 million core-hours of computational time with a peak of 20,800 cores, generating 440 TB of results and requiring nearly 1 PB of data storage. The output from the pipeline and the analysis of the CHARGE data, as well as the tools themselves, were made available to over 300 researchers across five collaborating institutions.
Having the option to run this ultra large-scale clinical analysis of genomic data without any capital investment helps the CHARGE Consortium get closer to its goal of unlocking the mysteries of human genetics with regard to heart disease and aging, paving the way for the development of new medical interventions and analysis tools.