The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing / August 17, 2006
Since launching its first products in November 2005, Fabric7 Systems has attracted a good deal of attention with its innovative commodity-based server solutions for the data center. Using AMD Opteron processors and the Linux or Windows operating system as the foundation of its "fabric" computing model, the company offers systems that provide features normally seen only in high-end enterprise servers: hardware partitioning and I/O virtualization.
At the LinuxWorld Conference in San Francisco, Sharad Mehrotra, President and CEO of Fabric7 Systems, talked about the I/O bottleneck in the data center in his presentation titled "The Coming I/O Crisis: Why Virtualization Technology Needs to Move Beyond Processors and Memory." In this Q&A with HPCwire, he discusses how Fabric7 is addressing I/O virtualization as well as other data center computing challenges.
HPCwire: Briefly, what are the IT challenges that are driving the interest in server virtualization?
Mehrotra: The most obvious driver is the proliferation of servers in the data center. There are hundreds of small x86 Windows and Linux servers running at only 15 percent utilization. Server virtualization offers the promise of consolidating five to ten servers or more onto a few larger multi-core servers. The second, less obvious driver is flexibility and simplified operations. Server virtualization software provides a common standard platform for both system and application software. Entire stacks of OS, drivers, patches, apps, and updates can be moved from one hardware platform to another in a matter of minutes. This flexibility is the strategic objective for which server consolidation provides the financial justification. Of course, software virtualization on its own comes with a performance penalty, which is why our systems also deliver hardware partitioning of the SMP, to enable application workload consolidation and system flexibility with no degradation in system performance.
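The consolidation arithmetic behind this answer can be sketched in a few lines. This is an illustration only, not a Fabric7 sizing tool: the function name `hosts_needed`, the 75 percent target ceiling, and the `capacity_ratio` parameter (how many old servers' worth of capacity one new host provides) are all assumptions introduced here. Only the 15 percent utilization figure comes from the interview.

```python
def hosts_needed(num_servers, avg_utilization_pct,
                 target_utilization_pct=75, capacity_ratio=1):
    """Estimate how many consolidation hosts a server estate needs.

    num_servers:            number of existing small servers
    avg_utilization_pct:    their average utilization, in percent
    target_utilization_pct: assumed utilization ceiling for each new host
    capacity_ratio:         assumed capacity of one new host, measured in
                            units of one old server
    """
    # Total work, expressed as percent of one old server's capacity.
    total_load_pct = num_servers * avg_utilization_pct
    # Work one new host can absorb while staying under the ceiling.
    usable_pct_per_host = capacity_ratio * target_utilization_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_load_pct // usable_pct_per_host)


# Five servers at 15 percent carry 75 percent of one server's capacity.
print(hosts_needed(5, 15))
# One hundred such servers on hosts assumed to be four times larger.
print(hosts_needed(100, 15, capacity_ratio=4))
```

Under these assumptions, five servers at 15 percent utilization fit on a single host of the same size, which is consistent with the "five to ten servers" range quoted above once larger multi-core hosts are considered.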
HPCwire: In the enterprise, the conventional solution has been to provide hypervisors on top of RISC/UNIX systems. How does your solution differ?
Mehrotra: Prior to VMware, the only server platforms that supported hardware and software partitioning were the higher-end Unix servers and mainframe systems. VMware brought software partitioning to x86 servers. What is emerging now as the conventional solution is VMware/XenSource/MSFT Virtual Server on x86 servers. This development will force RISC/UNIX to follow the same legacy path as mainframes, which became less numerous and saw workloads move to the less expensive x86 hardware. Fabric7 differs from these conventional x86 solutions in that we provide hardware-based partitioning as a complement to software partitioning, so that customers can deploy both in any combination they see fit. This flexibility will accelerate the transition away from Unix servers built using proprietary chips and operating systems, by enabling larger x86 systems like ours to deliver the same enterprise-class capabilities required of RISC/UNIX systems, but at a significantly lower cost.
HPCwire: Other enterprise solutions have ignored I/O virtualization, whereas in the Fabric7 systems it appears to be one of the fundamental technologies. Why has I/O virtualization lagged behind processor and memory virtualization and why did you focus so heavily on this aspect in your solution?
Mehrotra: I/O virtualization was not ignored on mainframe systems and is still critical to these systems today. This capability did not cross over to RISC/UNIX systems because larger shared memory multiprocessing and memory capacity were the primary drivers of that transition, and as a result, everyone instead just focused on building monolithic SMP-friendly applications. Fabric7 invested heavily in I/O virtualization with our higher-end x86 system, the Q160, for the same reason mainframes originally did -- to decouple I/O processing from the main compute complex. This return to I/O performance as a priority dramatically improves the flexibility and simplicity of the rapidly growing x86 processing infrastructure in large enterprises. Server virtualization makes this even more critical as it drives the need for flexible I/O to keep pace with the flexible processing that is being made possible with multi-core chips and large x86 SMPs.
HPCwire: The combination of virtual I/O and hardware partitioning gives you what might be described as "virtual clustering," which is a fundamental aspect of a Grid computing solution. How does your solution differ from Grid computing?
Mehrotra: Conventional Grid computing consists of a large number of stand-alone servers, each with their own fixed Ethernet, Fibre Channel and cluster interconnect switch and cabling infrastructure. All these servers are "hard-wired" together to form a Grid. Fabric7's approach creates a flexible "fabric" that allows all of a server's key parts, including processing, storage I/O, and network bandwidth, to be disaggregated from the chassis and shared across the fabric, creating what is for all intents and purposes a virtual, easy-to-manage pool of computing resources that can be provisioned and assigned to different workloads in minutes. While "grid" and "fabric" are used loosely together in the market, they represent quite different approaches in our eyes.
HPCwire: Can you talk a little bit about how you have addressed QoS?
Mehrotra: Careful attention to Quality of Service, or QoS, for all I/O traffic is one of Fabric7's fundamental architectural strengths. Our founders' previous experience in carrier-grade networking made QoS for I/O a major aspect of the company's product design philosophy from the start. We've engineered the systems so that when more than one server shares the network infrastructure connecting multiple Fabric7 machines, each virtual I/O channel on a server can define its own QoS and the amount of dedicated I/O bandwidth that it requires. This unique QoS profile can be changed "on-the-fly" to respond to rapidly changing demands, such as peak loads. We are also working to automate this real-time QoS profiling with additional management tools. By adding policy or pre-defined business-rule functions that shift resources on the fabric, these tools will make it even easier to respond to daily, yearly, or other seasonal peaks our customers face, so that QoS levels are protected automatically.
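The per-channel QoS idea described here can be illustrated with a small model. This is a sketch under stated assumptions, not Fabric7's implementation or management API: the `SharedLink` class, the channel names, and the bandwidth figures are all hypothetical. It shows only the general technique of reserving dedicated bandwidth per virtual I/O channel and changing a reservation at run time without oversubscribing the shared link.

```python
class SharedLink:
    """Hypothetical model of a shared I/O link divided among virtual channels."""

    def __init__(self, capacity_mbps):
        self.capacity_mbps = capacity_mbps
        self.reserved = {}  # channel name -> dedicated bandwidth in Mbps

    def available(self):
        """Bandwidth not yet dedicated to any channel."""
        return self.capacity_mbps - sum(self.reserved.values())

    def set_profile(self, channel, dedicated_mbps):
        """Create or change a channel's reservation on the fly.

        Returns True if the change was applied, or False if it would
        oversubscribe the link (existing reservations are left untouched).
        """
        if dedicated_mbps < 0:
            raise ValueError("dedicated bandwidth must be non-negative")
        current = self.reserved.get(channel, 0)
        # Only the increase over the current reservation needs free capacity.
        if dedicated_mbps - current > self.available():
            return False
        self.reserved[channel] = dedicated_mbps
        return True


link = SharedLink(capacity_mbps=10_000)          # assumed 10 Gb/s link
link.set_profile("db-storage", 4_000)            # hypothetical channels
link.set_profile("web-network", 3_000)

# A peak-load request that does not fit is refused...
print(link.set_profile("web-network", 7_000))
# ...until another channel releases some of its reservation.
link.set_profile("db-storage", 2_000)
print(link.set_profile("web-network", 7_000))
print(link.available())
```

A policy layer of the kind mentioned in the answer could call `set_profile` on a schedule or in response to load, which is why the sketch keeps admission control inside a single method.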
HPCwire: Since your solution is based on Opterons/HyperTransport on the hardware side and Linux and Windows Server on the software side, that would seem to limit your competitive reach to midrange RISC/Unix servers. Do you think these technologies will evolve sufficiently in the next few years so that you can compete with high-end servers? What additional features would be needed?
Mehrotra: Yes, Windows and Linux will evolve to compete with high-end servers. Many, if not most, industry analysts are declaring that Windows and Linux are almost there now and are forecasting zero to negative growth for RISC/UNIX servers going forward. Windows and Linux are the only operating systems predicted to grow in the future. AMD Opteron and the HyperTransport Technology Consortium have clear roadmaps to deliver 32-way (core) servers within the next 12 months and 128-way (core) servers the year after with the HyperTransport 3.0 interconnect. From our perspective, the technologies are here now. The next step is for users to establish proof points that mainstream "fast-followers" can use to show that it is OK to shift to these architectures of the future.
HPCwire: Fabric7's eight-socket Opteron system represents one of the few platforms of this type in the industry. This SMP architecture is the basis of the recent Tokyo Tech supercomputer, which was constructed from Sun Microsystems' new Sun Fire X4600 servers. Does this make you think your solution would have applicability in the high performance computing space? If so, how would the unique characteristics of your solution -- especially hardware partitioning and virtual I/O -- apply to HPC?
Mehrotra: Yes, Fabric7 does think that the pendulum will swing back towards larger SMP servers. The Tokyo Tech example is the lighthouse that is making heads turn and prompting people to question the conventional wisdom of today. Fabric7's implementation of hardware partitioning across its entire product line provides HPC users with the flexibility to move quickly from small SMP to large SMP configurations in a matter of minutes. In the dynamic world of HPC, one run might consist of lots of small nodes and the next a small number of large nodes. Who can afford the waste of maintaining two separate server farms? Additionally, the switched, virtualized I/O capability available in our larger system, the Q160, provides customers the flexibility in network infrastructure that is required to keep up with the variable compute needs of the processing farm. We believe that the HPC world will shift from "grid" to "fabric" computing in the future as the benefits of our approach become more apparent with real-world deployments.
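The "many small nodes on one run, a few large nodes on the next" scenario can be pictured with a toy model of repartitioning. The function `partition` and the socket numbering are hypothetical and say nothing about how Fabric7's hardware actually carves up an SMP. The only figure taken from the interview is the eight-socket system size.

```python
def partition(total_sockets, sockets_per_node):
    """Split a pool of CPU sockets into equal-sized SMP partitions.

    Returns a list of partitions, each a list of socket indices.
    Raises ValueError if the pool cannot be divided evenly.
    """
    if sockets_per_node <= 0 or total_sockets % sockets_per_node != 0:
        raise ValueError("sockets_per_node must evenly divide total_sockets")
    return [
        list(range(start, start + sockets_per_node))
        for start in range(0, total_sockets, sockets_per_node)
    ]


# One run: an eight-socket system presented as four 2-socket nodes.
print(partition(8, 2))
# Next run: the same hardware presented as a single 8-socket SMP.
print(partition(8, 8))
```

The same physical pool serves both layouts, which is the point of the rhetorical question about maintaining two separate server farms.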
-----
Dr. Sharad Mehrotra, president and chief executive officer of Fabric7, leads the company with over 20 years of experience in enterprise servers and high performance networking equipment. He served as entrepreneur-in-residence at New Enterprise Associates, where he developed the key technology and market concepts that led to the formation of Fabric7. Dr. Mehrotra was a founder, chairman, and CEO of Procket Networks, an advanced networking startup that was acquired by Cisco Systems. Previously, as an architect at Sun Microsystems, he was a key contributor in the design of the Sun UltraSPARC V microprocessor. Dr. Mehrotra holds four degrees in Electrical Engineering and Computer Science, including a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign and a master's degree in Computer Engineering from the University of Massachusetts at Amherst. During his career, he has been granted ten U.S. patents.