Sunday, August 12, 2012

NSF and XSEDE survey on cloud use cases for researchers and educators

[It is good to see the National Science Foundation (NSF) and XSEDE (eXtreme Science and Engineering Discovery Environment) undertake a survey to determine cloud use cases by researchers and educators and plan accordingly for the seamless integration of cloud resources into the XSEDE architecture.
A good example of such a seamless architecture is SURFconext, which uses university credentials via SAML for access to commercial clouds such as GreenQloud. This is the kind of study that my co-author Dr Denis Therien and I recommended in a report we wrote for the Canada Foundation for Innovation on the future of cyber-infrastructure in Canada. In our report we uncovered considerable anecdotal evidence from researchers and from funding councils in Canada, the USA and Europe that many researchers and educators are already using commercial clouds, paid for out of their own pockets.

We also noted that many small and medium-sized research teams are acquiring their own clusters when they might be better served by commercial clouds. In our report we speculated that the aggregate spending on these small clusters could be greater than all the money spent on HPC. Unfortunately there is no way of tracking these expenditures, as the purchase of small clusters is often buried amongst other, larger equipment and research costs.

Most of these research teams are not in the traditional compute-intensive disciplines. They have little concern about their computer being in the top 500 and are for the most part focused on “occasional” computational data analysis. These teams are largely in the humanities, health sciences, biological sciences, civil engineering, etc. They prefer to acquire their own clusters because it is far less hassle than applying for permission to use a large campus HPC facility. While a fully loaded university private cloud may be cheaper than commercial facilities, most small and medium-sized research teams need such facilities only occasionally, so a commercial cloud is often more convenient. We also noted anecdotal evidence that many of the tools and applications these research teams need are available only on commercial clouds. Many graduate students and small businesses are motivated to build tools for commercial clouds because they see a significant revenue opportunity.

I hope that the NSF and XSEDE will also undertake a more proactive analysis beyond a volunteer survey. Traffic to commercial clouds from R&E networks at major peering points is skyrocketing. Tracking some of the IP addresses to determine who the heaviest users of commercial clouds at universities are may be more revealing than depending on volunteers, especially those outside engineering and the physical sciences, to complete a survey. – BSA]