October 19, 2011
Vendors in the HPC universe are jumping on the Hadoop bandwagon. This week SGI announced that it was marrying Cloudera's CDH (Cloudera's Distribution including Apache Hadoop) software with its own cluster machines. This is not too surprising, considering Hadoop's role as the leading open source framework for data-intensive analytics on distributed platforms, and Cloudera's position as a top Hadoop distributor and supporter.
According to the press release, the SGI-Cloudera partnership will "enable the two companies to jointly build, sell and deploy integrated, high performance Apache Hadoop-based commercial solutions." But as pointed out by Derrick Harris over at GigaOM, this is not necessarily an HPC play in the conventional sense. Even though Hadoop can be used for technical workloads like genomics and seismology, its more typical application is for search engines, social media analytics, and advertising optimization.
According to Harris, the Cloudera integration with SGI gear appears to be targeted more toward the latter. On SGI's website, the pre-configured Hadoop clusters come in two flavors: Rackable Servers and CloudRack Servers. Both are from the non-HPC side of the house. That doesn't mean such systems won't be running technical computing workloads, however, given the somewhat different nature of these data-intensive applications (i.e., you don't necessarily need top-bin CPUs, or even InfiniBand, for I/O-bound Hadoop apps).
Harris also points out that Microsoft recently announced its Hadoop integration with Windows Server and Azure. This is an even more nuanced move, considering that Microsoft already has a Hadoop alternative for HPC called LINQ to HPC (formerly Dryad). The latter is also packaged with HPC Server 2008 R2, and eventually will be supported in Azure as well.
The implication is that Microsoft will position its LINQ technology for HPC-type applications, and its standard Hadoop integration for non-HPC use cases. There are other Hadoop alternatives designed specifically for performance-obsessed users. In this category are platforms like LexisNexis' Data Analytics Supercomputer (DAS) offering, as well as non-standard flavors of Hadoop that are being tweaked for performance.
Unfortunately, copycats and derivatives are the ultimate endorsement of a successful technology. If they succeed, though, at least some of these performance-minded frameworks for data-intensive analytics could find a happy home in HPC.
Full story at GigaOM