Wednesday, January 4, 2012

The advantage of commercial clouds versus HPC for scientific research

[The Department of Energy (DoE) recently came out with an excellent report, called the Magellan report, on the advantages and disadvantages of using commercial clouds versus in-house High Performance Computers for leading-edge scientific research.
The DoE probably supports the largest concentration of HPC facilities in the world. I agree with the report that traditional applications such as computational chemistry, astrophysics, etc. will still need large HPC facilities. But traditional computationally intensive applications are becoming a niche market, and increasingly many of these applications can now run on specialized commercial “HPC” clouds such as Nimbix.

The biggest growth in demand for computing is not in computationally intensive modeling but in data-intensive processing. New disciplines such as Astroinformatics, Matinformatics (real-time chemical analysis), Systems biology, Meta-genomics, Computational history, computational linguistics, etc. are the driving force for research computing. Most of these data-intensive applications are loosely coupled and are ideally suited to clouds.

While the growth of data-intensive science and the use of clouds is well recognized, there is still an ongoing debate about whether researchers should use in-house clouds or commercial facilities. The DoE report did an extensive analysis of the cost of commercial clouds versus in-house facilities. They compared the cost per compute core of an in-house facility versus that of a commercial provider. I may argue with some of the assumptions in their analysis: for example, they did not include the cost of money or real estate, nor did they use the much lower spot-market pricing for commercial clouds. Still, I agree that, in the near term, commercial clouds will be marginally more expensive than in-house facilities.
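
To make the per-core comparison concrete, here is a minimal sketch of how such a calculation might be set up, including the cost of money and real estate mentioned above. Every figure and name in it (the cluster size, prices, the 5% rate, `InHouseCluster`, `in_house_cost_per_core_hour`) is a hypothetical placeholder chosen for easy arithmetic. None of it comes from the Magellan report or from any provider's price list, and the interest term is a deliberately crude approximation.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365  # 8760


@dataclass
class InHouseCluster:
    """Hypothetical cost model for an in-house facility (all values assumed)."""
    cores: int
    capital_cost: float              # purchase price, paid up front
    lifetime_years: float            # period over which the capital is written off
    annual_operating_cost: float     # power, cooling, staff
    annual_real_estate_cost: float = 0.0
    cost_of_money_rate: float = 0.0  # annual rate charged on the capital tied up
    utilization: float = 1.0         # fraction of core-hours actually used


def in_house_cost_per_core_hour(c: InHouseCluster) -> float:
    """Amortized cost per used core-hour.

    Simplification: interest is charged on the full capital cost every
    year, rather than on a declining balance.
    """
    annual_capital = c.capital_cost / c.lifetime_years
    annual_interest = c.capital_cost * c.cost_of_money_rate
    annual_total = (annual_capital + annual_interest
                    + c.annual_operating_cost + c.annual_real_estate_cost)
    used_core_hours = c.cores * HOURS_PER_YEAR * c.utilization
    return annual_total / used_core_hours


if __name__ == "__main__":
    # Hypothetical cluster, first without and then with the omitted costs.
    bare = InHouseCluster(cores=1000, capital_cost=876_000,
                          lifetime_years=4, annual_operating_cost=219_000)
    full = InHouseCluster(cores=1000, capital_cost=876_000,
                          lifetime_years=4, annual_operating_cost=219_000,
                          annual_real_estate_cost=43_800,
                          cost_of_money_rate=0.05)

    # Hypothetical cloud prices per core-hour (not real quotes).
    cloud_on_demand = 0.08
    cloud_spot = 0.03

    print(f"in-house, capital + operations only: {in_house_cost_per_core_hour(bare):.3f}")
    print(f"in-house, plus money and real estate: {in_house_cost_per_core_hour(full):.3f}")
    print(f"cloud on-demand: {cloud_on_demand:.3f}  cloud spot: {cloud_spot:.3f}")
```

In a model like this, the `utilization` parameter matters as much as any price: an in-house cluster that sits idle half the time costs twice as much per used core-hour, while a cloud user pays only for the hours consumed.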

From a funding agency perspective, however, commercial clouds have a huge advantage over an in-house facility. Despite the higher per-core costs, the elimination of up-front capital costs with a commercial cloud is incredibly significant, especially in this period of fiscal constraint. Any capital expense that can be delayed or eliminated without affecting the quality of the research has a huge cost benefit for funding agencies. This is advantageous to the researcher as well. It usually takes several years to make a proposal, get approval, and acquire and install a large HPC facility. With commercial clouds, researchers can start their computational research immediately. The up-front cost is very small and their time to market (i.e. publishing the results) can be much faster with a commercial facility. In fact, some commercial clouds like Amazon and Azure offer a free pilot service to allow researchers and businesses to migrate their software to the cloud and shake out any possible kinks in their software.

With a commercial facility, researchers can scale their applications as warranted without incurring any additional capital costs. There is no need for peer review to determine the resources that may be made available to the researcher. More importantly, because the incremental per-core costs are very small, many other avenues of funding for the computation facility are available, as opposed to the limited funding channels available for the purchase of an in-house facility. For example, some commercial organizations will broker their cloud infrastructure for little or no cost to university researchers, as opposed to commercial users. Many R&E networks are also negotiating significant bulk discounts for commercial cloud services on behalf of the R&E community –BSA]

DoE report

Clouds for HPC applications

Big data can lead to big breakthroughs in research

Green Internet Consultant. Practical solutions to reducing GHG emissions such as free broadband and electric highways.

twitter:  BillStArnaud
skype:    Pocketpro