August 16, 2012
Last week, HPC in the Cloud discussed what types of HPC applications are best suited for cloud technologies. While capabilities offered by cloud providers (minimal upfront costs, high scalability and quick time to deployment) remain attractive to HPC users, the needs of their workloads are sometimes at odds with the technology. One particular hurdle is the amount of bandwidth between the end user and their provider of choice. Earlier this week, a scalability.org blog covered this dilemma, calling it a "non-trivial" issue.
Most public cloud providers are best suited for Web hosting, email services and similar ongoing tasks. Their infrastructures are geared toward these purposes, scaling up capacity relative to end user demand. However, if a single user wants to store and process massive datasets, the lack of high-bandwidth connectivity between that user and the provider can severely hinder their research.
NASA is familiar with this problem. The agency recently launched a program called NEX, which houses 40 years of Earth satellite data in a storage cluster next to its Pleiades supercomputer. NASA Ames Earth scientist Ramakrishna Nemani spoke to us about the project. He described how long it took to migrate a large collection of Landsat images from a datacenter in South Dakota to the Ames facility.
"I'll give you an example about how difficult this has been. We brought about 400 terabytes of data from the EROS datacenter in Sioux Falls, South Dakota. I was blown away, it took us nearly 6 ½ months."
With a turnaround time like that, it probably would have been easier to FedEx the dataset on a set of hard drives. The scalability.org blog blames this kind of issue on a lack of competition among ISPs in the US.
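To put Nemani's figure in perspective, the sustained rate implied by moving 400 terabytes in 6 ½ months can be estimated with a few lines of arithmetic. This is only a rough sketch: it assumes decimal terabytes, an average month of about 30.44 days, and a transfer that ran continuously, none of which the source states. The function name is ours, chosen for illustration.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30.44  # assumption: average calendar month


def effective_rate_mbit_s(terabytes, months):
    """Average sustained rate in Mbit/s for moving `terabytes` in `months`."""
    total_bits = terabytes * 1e12 * 8  # assumption: decimal terabytes
    seconds = months * DAYS_PER_MONTH * SECONDS_PER_DAY
    return total_bits / seconds / 1e6


rate = effective_rate_mbit_s(400, 6.5)
print(f"Implied sustained rate: {rate:.0f} Mbit/s")
```

Under these assumptions the average works out to a little under 200 Mbit/s, well below what a dedicated research link might be expected to sustain.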
The blog priced an asymmetric connection delivering 100 Mbit/s down and 10-15 Mbit/s up at roughly $300/mo. That translates to 12.5 MByte/s down and 1.25 to 1.9 MByte/s up.
Given that performance, an end user could download roughly one terabyte per day. But since the upload side, at the 10 Mbit/s end of the range, runs at 10 percent of the download speed, it would take approximately 10 days to upload a single terabyte.
Although standard service providers have been slow to match throughput with demand, they may get more incentive from Google. The Internet search giant has decided to throw itself into the mix, launching its own fiber service in Kansas City. For $70 a month, users can get symmetrical 1,000 Mbit/s (1 Gbit/s) connectivity. With that performance, the 10 day/TB upload becomes a more practical transfer of roughly two hours.
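The transfer times quoted above follow directly from the link rates. The short sketch below reproduces them for one terabyte, assuming decimal units (1 TB = 10^12 bytes), a fully saturated link, and no protocol overhead, so real transfers would take longer. The function and dictionary names are illustrative only.

```python
def transfer_hours(terabytes, mbit_per_s):
    """Hours needed to move `terabytes` over a link running at `mbit_per_s`."""
    bytes_per_s = mbit_per_s * 1e6 / 8  # 8 bits per byte
    return terabytes * 1e12 / bytes_per_s / 3600


# Link rates taken from the article; labels are ours.
links = {
    "asymmetric download (100 Mbit/s)": 100,
    "asymmetric upload (10 Mbit/s)": 10,
    "symmetric fiber (1,000 Mbit/s)": 1000,
}

for name, rate in links.items():
    hours = transfer_hours(1, rate)
    print(f"{name}: {hours:.1f} hours ({hours / 24:.1f} days) per TB")
```

At these idealized rates, one terabyte takes a bit under a day to download and a bit over nine days to upload on the asymmetric link, and a little over two hours on the gigabit link, in line with the rounded figures in the text.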
With the bandwidth bottleneck effectively eliminated, end users can implement a new range of cloud-based services, including high-capacity storage and data-intensive research. Unfortunately, Google's service is limited to Kansas City, and no plans to expand the program have been announced.