June 18, 2013
The world's largest supercomputers, like Tianhe-2, are great at traditional, compute-intensive HPC workloads, such as simulating atomic decay or modeling tornados. But data-intensive applications--such as mining big data sets for connections--are a different sort of workload, and run best on a different sort of computer.
TH-2 nabbed the top spot on the latest iteration of the Top 500 list released at ISC 2013 this week in Germany. With 33.8 petaflops of computing power, it has nearly as much capacity as the next two largest supercomputers on the planet, Sequoia and Titan, combined.
However, TH-2 didn't even make the top five on a competing list of large computers, the Graph 500. In fact, it debuted at number six on the latest iteration of that list, which was also released this week at ISC 2013.
The two lists share obvious similarities: both come out twice a year, and both have "500" in their names. But there are important differences. While the Top 500 measures how many floating point operations per second (FLOPS) a system can sustain, the Graph 500 measures how quickly a computer can traverse a large graph, expressed in traversed edges per second (TEPS).
Supercomputers like TH-2 strive to pack as many processing cores as possible into a single system image. But data-intensive and graph applications, such as Facebook Graph Search, do best on systems that have been optimized for memory access, according to Richard Murphy, a senior architect of advanced memory systems at Micron Technology and a founder of Graph 500.
"Graph 500 is more challenging on the data movement parts of the machine--on the memory and interconnect--and there are strong commercial driving forces for addressing some of those problems," the Sandia National Laboratories veteran tells IEEE Spectrum.
Math is important to both traditional HPC systems and the emerging breed of big data systems, but of a different kind. The "needle in a haystack" problem that big data mining applications try to solve depends heavily on rapid integer manipulation, which TEPS captures as the rate at which a system can follow the edges between nodes in a graph.
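To make the TEPS metric concrete, here is a minimal sketch of how edge traversals are counted during a breadth-first search. This is a toy illustration, not the Graph 500 benchmark itself: the official benchmark generates enormous Kronecker graphs and uses heavily tuned, distributed BFS kernels, and the `bfs_teps` helper and the tiny example graph below are inventions for demonstration.

```python
from collections import deque
import time

def bfs_teps(adj, source):
    """Breadth-first search over an adjacency-list graph.

    Returns (edges_traversed, TEPS). Every edge inspected during the
    search counts as a traversal, whether or not it discovers a new
    vertex -- roughly how the Graph 500 metric is defined.
    """
    visited = {source}
    queue = deque([source])
    edges = 0
    start = time.perf_counter()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            edges += 1  # count every inspected edge
            if v not in visited:
                visited.add(v)
                queue.append(v)
    elapsed = time.perf_counter() - start
    return edges, edges / elapsed

# Tiny undirected example: a 4-node cycle (0-1-2-3-0) plus a 1-3 chord,
# stored with both directions of each edge.
adj = {0: [1, 3], 1: [0, 2, 3], 2: [1, 3], 3: [2, 0, 1]}
edges, teps = bfs_teps(adj, 0)  # 5 undirected edges -> 10 traversals
```

Note that the work here is pointer chasing and set lookups, not floating point arithmetic, which is why memory and interconnect performance, rather than raw FLOPS, dominate this kind of benchmark.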
The ability to get lots of data into and out of memory and disk quickly is important for the sort of data mining workloads that companies like Amazon, Facebook, and Netflix use to churn out recommendations. It's also important for the types of workloads that the NSA is supposedly running with PRISM and related programs.
The DOE's Sequoia cluster at Lawrence Livermore National Laboratory, by the way, retained the number one spot on the latest Graph 500 list, with 15,363 GTEPS. Systems built on IBM's BlueGene/Q platform own four of the top five Graph 500 spots and eight of the top 10.