September 12, 2010
Updated I/O performance library improves ease of use and achieves even better performance
Big machines are one thing. Taking advantage of their full potential is quite another. Application performance has long been trailing hardware as supercomputers have sought, entered, and now surpassed petaflop performance.
One of the factors commonly affecting application performance is input/output (I/O). Researchers regularly find themselves having to choose between the performance of their applications and the amount and quality of the data they write.
It's a problem familiar to the Oak Ridge Leadership Computing Facility's (OLCF's) Scott Klasky from his early years as a researcher with a team from Princeton Plasma Physics Laboratory using the Gyrokinetic Toroidal Code.
"We looked at the performance of how often we would like to write, and we were spending over 30 percent of the time writing the analysis files in a very popular file format. Thirty percent of all your computational time writing data to files is too much," said Klasky. "The scientists eventually decided that unless it was a run that we definitely wanted to get some visualization out of, we weren't going to write those because we were wasting our valuable computing time doing this."
Klasky and a team of researchers (Qing Liu, Norbert Podhorszki, Jay Lofstead, Hasan Abbasi, Ron Oldfield, Matt Wolf, Fang Zheng, Ciprian Docan, Manish Parashar, Weikuan Yu, Yuan Tian, Nagiza Samatova, Sriram Lakshminarasimh, Todd Kordenbrock, and others) from Georgia Tech, the OLCF, Rutgers University, and Sandia National Laboratories are the developers of ADIOS, an open-source middleware whose primary goal is to make getting information in and out of a supercomputer easier and more effective.
Last week the team released ADIOS 1.2, the latest incarnation of one of computational science's most effective I/O tools. So far ADIOS has helped researchers make huge strides in fusion, astrophysics and combustion. The new version features some interesting improvements that will doubtless aid researchers in taking full advantage of leading supercomputing platforms.
For starters, previous versions of ADIOS had users construct an external XML file that allowed them to organize their simulation variables into distinct groups and add important metadata to their output. With the new application programming interface (API), users can now place the API calls directly into their code and construct new variables during run time. This is especially important for adaptive mesh refinement (AMR) codes, such as Chombo, that can alter the variables placed on disk during run time. The new API makes ADIOS much more flexible: researchers can define their output either in an external file, for maximum flexibility, or in their codes.
ADIOS also features a custom I/O method that writes data to subfiles and aggregates it into larger pieces for maximum performance on the leadership-class systems. This method has been shown to get near peak I/O performance for many codes, particularly S3D, on the Cray XT5 and Cray XT4 at the OLCF and Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center.
"We are now able to speed up applications such as S3D to near-peak I/O bandwidth through simple and easy-to-use ADIOS APIs," said Qing Liu, a member of the ADIOS team at the OLCF. "We are also able to speed up S3D by a factor of more than 15. This is achieved by intelligently aggregating and writing data to storage targets in ADIOS."
Users who run on large systems can now switch transparently among writing P files from P processors, writing one file, and writing M files. They can also switch to the best method for an individual system, including the IBM Blue Gene/P at Argonne National Laboratory, where PhD student Yuan Tian, along with her advisor Weikuan Yu at Auburn University, has created a custom method to write more efficiently with ADIOS.
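To make the trade-off between P files, M files, and one file concrete, here is a minimal Python sketch of the aggregation idea. It is not ADIOS code and does not use the ADIOS API; the function names `aggregator_for` and `aggregate` are hypothetical, and the contiguous rank-to-subfile mapping is an assumption chosen for simplicity.

```python
from collections import defaultdict


def aggregator_for(rank: int, num_ranks: int, num_files: int) -> int:
    """Map a writer rank to one of num_files subfiles, in contiguous blocks."""
    if not 0 <= rank < num_ranks:
        raise ValueError("rank out of range")
    if not 1 <= num_files <= num_ranks:
        raise ValueError("num_files must be between 1 and num_ranks")
    return rank * num_files // num_ranks


def aggregate(chunks: list[bytes], num_files: int) -> dict[int, bytes]:
    """Concatenate per-rank chunks into num_files larger buffers, in rank order.

    chunks[r] stands in for the data that rank r wants to write.
    """
    grouped = defaultdict(list)
    for rank, chunk in enumerate(chunks):
        grouped[aggregator_for(rank, len(chunks), num_files)].append(chunk)
    # Each joined buffer would be written as one large subfile.
    return {index: b"".join(parts) for index, parts in grouped.items()}
```

With eight writers, `aggregate(chunks, 8)` corresponds to one file per process, `aggregate(chunks, 1)` to a single file, and `aggregate(chunks, 2)` to two subfiles that each hold the data of four writers. In this sketch the calling code stays the same across all three cases and only the file count changes.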
Version 1.2 also features further support for self-describing data in the output. Users can now write more statistics into their data and have more flexibility in their output. For example, users can automatically retrieve the average value, minimum, maximum, and standard deviation for all arrays at negligible computational cost. This feature allows users to take large files (terabytes) and automatically determine these parameters in less than 2 seconds when listing the contents of the data. Furthermore, users can get these statistics for each independent time step in the output.
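One way such statistics can be listed without rescanning terabytes of data is to record a small summary for each block as it is written and combine the summaries later. The Python sketch below illustrates that general idea under stated assumptions: it is not ADIOS's implementation or file format, the names `BlockStats`, `summarize`, and `merge` are hypothetical, and the standard deviation shown is the population form.

```python
import math
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Summary recorded alongside one written block of an array."""
    count: int
    minimum: float
    maximum: float
    total: float      # sum of the values
    total_sq: float   # sum of the squared values


def summarize(values) -> BlockStats:
    """Compute the summary for one block at write time."""
    values = list(values)
    if not values:
        raise ValueError("cannot summarize an empty block")
    return BlockStats(
        count=len(values),
        minimum=min(values),
        maximum=max(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def merge(blocks) -> dict:
    """Combine block summaries into array-wide statistics without the data."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("no block summaries to merge")
    count = sum(b.count for b in blocks)
    mean = sum(b.total for b in blocks) / count
    # Population variance from sums; clamp tiny negative rounding errors.
    variance = max(sum(b.total_sq for b in blocks) / count - mean * mean, 0.0)
    return {
        "min": min(b.minimum for b in blocks),
        "max": max(b.maximum for b in blocks),
        "avg": mean,
        "std": math.sqrt(variance),
    }
```

Because `merge` touches only one small record per block, its cost depends on the number of blocks rather than on the size of the array. Keeping one set of summaries per time step would likewise allow per-step statistics. The sum-of-squares formula is numerically naive and is used here only to keep the sketch short.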
Finally, version 1.2 features some new asynchronous transport methods, allowing even faster I/O. The trick is scheduling. I/O uses the network bandwidth, and by taking advantage of the downtime during communication between processors, researchers "can essentially get I/O for free," said Klasky.
For example, both the DataTap and the Network Scalable Service Interface (NSSI) methods, from Georgia Tech and Sandia Labs respectively, send data to a user-defined set of nodes (a staging area) and write the data from these nodes, reducing the performance linkage between the file system and the application. Furthermore, the DataSpace method from Rutgers creates a partitioned global address space (PGAS) environment in the staging area so that independently compiled codes that use ADIOS can act as services and be coupled together efficiently.
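The staging idea, handing data off quickly and letting something else do the slow write, can be illustrated on a single machine with a queue and a background thread. This Python sketch is only an analogy: it does not use DataTap, NSSI, or any ADIOS call, the thread stands in for a separate set of staging nodes, and `run_with_staging`, `compute_step`, and `write_step` are hypothetical names.

```python
import queue
import threading


def run_with_staging(num_steps, compute_step, write_step):
    """Run a simulation loop whose output is written by a background worker.

    compute_step(step) returns the data for one time step.
    write_step(step, data) performs the slow write.
    """
    staged = queue.Queue()  # stands in for the staging area

    def drain():
        # Pull staged items in order until the None sentinel arrives.
        while True:
            item = staged.get()
            if item is None:
                break
            write_step(*item)

    writer = threading.Thread(target=drain)
    writer.start()
    try:
        for step in range(num_steps):
            # put() returns immediately, so the next step can start
            # computing while earlier steps are still being written.
            staged.put((step, compute_step(step)))
    finally:
        staged.put(None)  # tell the writer to finish
        writer.join()     # wait until everything staged has been written
```

A call such as `run_with_staging(3, lambda s: s * s, print)` prints the three steps in order while the loop itself never waits on a write. A real staging method would also need to bound the amount of buffered data, which this sketch leaves out.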
"The focus for this release is broader compatibility and user convenience. The introduction of the API calls to replace the XML file addresses long-standing requests from a small but vocal part of our user community," said team member Jay Lofstead. "The AMR-focused enhancements broaden the classes of application that can use ADIOS while maintaining 100 percent backward compatibility. Some additional changes smooth the user experience."
Taken separately, all of ADIOS's individual improvements represent significant advances toward more efficient simulations. Taken together, they embody a major innovation in the way computational science will be conducted.
"Working with Scott Klasky and his team has moved our research and our software, such as DataTap and Data Staging, from being interesting research prototypes to becoming artifacts that address the real needs of petascale simulations," said Georgia Tech team member Karsten Schwan. "By then also interacting with the fusion, astrophysics and combustion modeling communities, we have not only found ways to alleviate their problems with I/O at scale, but we have also gained valuable information about ways to better organize data and quickly analyze it to help scientists understand the behavior of their petascale codes and gain the scientific insights they seek."
There are few foreseeable limits to ADIOS's potential. As it is expanded to additional platforms, simulating big science will become correspondingly simpler, allowing researchers to concentrate more on their results than the technical aspects of their simulations. And as high-performance computing becomes an increasingly powerful research tool, there will be no shortage of grateful scientists.
For more information on ADIOS and/or to download the source, check out the project's Web page.