September 12, 2011
In a recent whitepaper on SGI's role in the coming wave of data-intensive computing requirements, IDC's high performance computing (HPC) guru, Steve Conway, presented an overview of how the HPC and "big data" markets are merging in terms of hardware challenges.
Conway outlined a clear set of needs shared between data-intensive computing in HPC and enterprise "big data" hardware and software, grounded in his prediction that big data will keep getting bigger, necessitating changes in what (and how) hardware vendors design for HPC and enterprise customers.
In the HPC context, IDC defines data-intensive computing as a set of big data problems that includes "tasks involving sufficient data volumes and complexity to require HPC-based modeling and simulation." The firm goes on to explain that these problems are rooted in isolated masses or combinations of structured and unstructured data, and can come from traditional HPC spheres (academia, the public sector, etc.) or can "be upward extensions of commercial problems that have grown large and complex enough at the high end to require HPC."
IDC claims that data-intensive workloads are going to become par for the HPC course in coming years, making up a more sizable portion of the overall high performance computing market. Conway notes that “in addition, while many big data problems will be run on standard clusters, limitations in the memory sizes and memory architectures of clusters make them ill-suited for the most challenging classes of data-intensive problems.” He points to a number of HPC sites that are looking to upgrade their systems to those that have fatter memory profiles, a trend that IDC expects to see playing out in the next few years and beyond.
Conway points to a number of requirements that are specific to data-intensive computing hardware, noting that HPC and data-intensive problems vary in terms of their emphasis on speed or time to solution. He says that "data-intensive problem solving performance typically is gauged by how fast the computer can traverse one or more large data sets, sometimes using special frameworks such as MapReduce, Hadoop (Linux) or Dryad (Windows)," in contrast to, say, floating point operations per second (FLOPS) on the HPC side.
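To make the contrast concrete, here is a minimal, hypothetical sketch of the MapReduce pattern Conway mentions, written in plain Python (no Hadoop cluster assumed): the figure of merit is how quickly the map and reduce phases can traverse the input records, not how many FLOPS they sustain.

from collections import defaultdict

def map_phase(records):
    # Map: emit (key, value) pairs from each input record.
    for record in records:
        for word in record.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce: aggregate all values that share a key.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# Usage: a word count over a tiny, made-up "data set".
records = ["big data keeps getting bigger", "big memory helps big data"]
print(reduce_phase(map_phase(records)))
# {'big': 3, 'data': 2, 'keeps': 1, ...}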
Also of interest are a number of trends affecting the data-intensive solutions that will be rolling out, including SGI's Altix line. Conway says that the high-end data explosion has had an impact on the entire IT spectrum, and that this is even more pronounced at the HPC end. He also says that there is a trend toward "unbalanced HPC systems." In effect, he is referring to the fact that, over the last decade, per-node and system-wide memory speeds have not kept pace with advancements in processors, which he says makes it "more difficult to feed the processors enough data to keep them busy." This is the famous "memory wall" that is holding back the type of standard clusters dominant in the HPC realm.
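A quick back-of-the-envelope illustration of that wall, using made-up numbers rather than anything from the whitepaper: a node's "machine balance" is the ratio of the memory bandwidth it can deliver to the FLOPS its processors can issue, and when an application needs more bytes per FLOP than that balance allows, the processors sit idle waiting on memory.

# Hypothetical node, for illustration only (not SGI or IDC figures).
peak_gflops = 100.0        # processor peak, GFLOP/s
mem_bandwidth_gbs = 25.0   # memory bandwidth, GB/s

# Machine balance: bytes of memory traffic the node can supply per FLOP.
balance = mem_bandwidth_gbs / peak_gflops   # 0.25 bytes/FLOP

# A data-traversal kernel such as summing a large array of 8-byte doubles
# needs about 8 bytes per FLOP -- far more than the node can deliver.
kernel_bytes_per_flop = 8.0
attainable_gflops = min(peak_gflops,
                        mem_bandwidth_gbs / kernel_bytes_per_flop)
print(f"attainable: {attainable_gflops:.1f} of {peak_gflops:.0f} GFLOP/s")
# attainable: 3.1 of 100 GFLOP/s -- the processor mostly waits on memory.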
Conway claims HPC vendors need to recognize the demand for a system-wide emphasis on memory size, capability, bandwidth, and latency. He says that even though many big data challenges can be tackled on commodity clusters, those clusters aren't always well designed for such problems because of limited system and cluster memory size, memory-sharing issues, and communications barriers. He claims that "the latencies of standard clusters typically are too high to support cache coherency across the clusters' distributed memory locations."
Although the bias factor should be noted here (this was a whitepaper for an HPC vendor), Conway makes the argument that commodity clusters just can’t do the job that a specialized HPC solution can do when it comes to data-intensive computing.
This article was originally published in BigCompute, a Tabor Communications publication slated to launch later this year.