Being competitive in today’s world economy means companies have to shorten the time it takes to go from concept to profitable products and services. There is no shortage of good new ideas and inventions to solve the problems we face, yet the world market demands solutions faster.
Specifically, high-performance parallel compute technologies have become more affordable for commercial research and development operations, but data storage technologies have not kept pace.
As an example, huge leaps in technology are reducing the time it takes to sequence DNA from a month down to a day. This shortens the front-end time it takes to get data into the chemical analysis process, and could help doctors shorten the time it takes to make a clinical diagnosis of a patient’s illness from weeks to just a few days.
To achieve this vision, storage bottlenecks have to be removed from the compute clusters. Today’s general-purpose network storage, while easy to implement, cannot keep up with the throughput demanded by the compute side of the analysis; we can keep adding compute cheaply but storage remains the bottleneck. Some high-end HPC storage solutions can address the performance bottleneck but are either too costly or too difficult to implement and manage.
To make breakthroughs in diagnosis and treatment, we have to break this logjam throughout the entire system. Until this happens, additional chemical analysis will remain impractical, and we’ll remain restricted in our ability to fully understand autism, Alzheimer’s and other medical puzzles.
Similar data bottlenecks are emerging as research and development departments across many industries (specifically manufacturing simulation and modeling, energy exploration, weather and climate analysis, media and entertainment, and economics and financial analysis) seek more data granularity, accuracy and resolution from their applications. These departments have attempted to leverage traditional general-purpose storage platforms sold to the enterprise, but they quickly hit performance barriers that affect sustained productivity. They recognize the need for high-performance storage to break their application bottlenecks, but they can’t justify the acquisition cost or the deployment and operational management expense that comes with supercomputing complexity. Could commercial research and development operations invest in this more costly technology? Of course, but the return on investment would suffer and the resulting end products would cost consumers too much.
Removing the storage bottleneck in a cost-effective manner for commercial ventures can accelerate time to results for all players in a given market and ultimately will benefit consumers. It is time for an affordable, next-generation HPC storage platform that can meet these new productivity demands. Research and development departments should seek out new storage solutions that can deliver on the following requirements:
Costs of deployment and management must be minimized. Storage system deployment should not require new expertise and should be achievable by the in-house IT team or the R&D department. With the exception of provisioning and service operations, ongoing storage system management must be hands-off.
Flexibility to scale out storage in both throughput performance and capacity. This will accommodate new product development, application of new algorithms, increased data resolutions and multiple simulation analyses.
High performance efficiency. Unlike general-purpose network file systems, next-generation HPC storage solutions must offer robust parallel file system technology to match compute cluster performance. Choosing an open source file system such as Lustre® provides flexible scalability and broad application interoperability, and incorporates ongoing enhancements from the wider development, test and support community.
Enterprise availability. Research and development operations that were once confined to workstations now depend on clustered servers and storage. Next-generation HPC storage must provide out-of-the-box high availability, because accelerating application results depends on compute parallelization and shared storage resources.
Deploying compute clusters to accelerate your time to results will make your organization more competitive. Use these requirements as a guide to choosing the right HPC storage solution, keeping in mind that the system you choose should be specifically designed to handle your technical computing workloads now and in the future.
Interested in learning more about Xyratex HPC storage products and solutions? Come see us at http://www.xyratex.com/products/hpc-big-data and don’t forget to check back with us from time to time to see what’s new.