February 27, 2013
Reigning TOP500 champ, Titan, is not performing as expected. Jeff Nichols, head of Oak Ridge National Laboratory's scientific computing division, told Knoxville News that the massive supercomputer encountered technical issues that halted the final acceptance test.
This means that the DOE's Oak Ridge National Laboratory (ORNL) won't yet be taking official ownership of the $100 million machine, and payments to Cray will be put on hold.
On the bright side, the problem has been identified and both parties are working on a solution.
"We've found a few bugs that have held us back," Nichols said, "and we're doing some repair work with Cray in order to get the stability tests where we want them to be."
The problems were traced to the interconnect fabric that enables the CPU and GPU components to communicate. The CPU side of this hybrid supercomputer is operational, but applications that call on GPUs have encountered sporadic faults. ORNL is sending sections of the system back to Cray on a rotating basis for repair.
Even with these issues, Titan came close to meeting the goals for a successful acceptance test. Passing requires completing 95 percent of the jobs in the test, and the Cray supercomputer came in at 92-93 percent, two to three percentage points shy.
From what Nichols told Knoxville News, the issues sound more like a speed bump than a fatal flaw. Nichols expects final acceptance of Titan to be delayed by no more than a month or two. He believes that once the connectors are repaired, the rest of the process should be a "slam dunk."
Despite recent setbacks, Titan passed initial testing in time for the November 2012 TOP500 list. This 27-petaflops (peak) Cray XK7 scored 17.59 petaflops on the Linpack benchmark, earning it bragging rights as the "world's fastest supercomputer."
The DOE's Oak Ridge National Laboratory describes Titan as "the world's most powerful supercomputer for open science with a theoretical peak performance exceeding 20 petaflops (quadrillion calculations per second)." This unprecedented level of power opens up new possibilities for ground-breaking research, including complex climate change models and sophisticated nuclear reactor simulations.