Bill St. Arnaud
- Bill St. Arnaud is an R&E Network and Green IT consultant who works with clients on a variety of subjects, such as next-generation research and education and Internet networks. He also works with clients to develop practical solutions to reduce GHG emissions, such as free broadband and dynamic charging of eVehicles (see http://green-broadband.blogspot.com/).
Wednesday, April 11, 2012
Great example of public-private network-computer partnership to support big data research
[Here is a great example of how optical exchange points and advanced networks (StarLight), working in partnership with public clouds, can accelerate fundamental research.
Such partnerships can significantly reduce the capital cost of campus computing resources, as well as the operational costs of on-campus computing in terms of energy consumption. Thanks to Ed Lucente for this pointer – BSA]
Open Cloud Consortium Announces First Integrated Set of Cloud Services for Researchers Working with Big Data
CHICAGO, April 4 — Today, the Open Cloud Consortium (OCC) announced the availability of Tukey, which is an innovative integrated set of cloud services designed specifically to enable scientific researchers to manage, analyze and make discoveries with big data.
Several public cloud service providers provide resources for individual scientists and small research groups, and large research groups can build their own dedicated infrastructure for big data. Currently, however, no cloud service provider focuses on serving projects that must work with big data but are not large enough to build their own dedicated clouds.
Tukey is the first set of integrated cloud services to fill this niche.
Tukey was developed by the Open Cloud Consortium, a not-for-profit multi-organizational partnership. Many scientific projects are more comfortable hosting their data with a not-for-profit organization than with a commercial cloud service provider.
Cloud Service Providers (CSPs) that are focused on meeting the needs of the research community are beginning to be called Science Cloud Service Providers or Sci CSPs (pronounced psi-sip). Cloud Service Providers serving the scientific community must support the long-term archiving of data, large data flows so that large datasets can be easily imported and exported, parallel processing frameworks for analyzing large datasets, and high-end computing.
"The Open Cloud Consortium is one of the first examples of an innovative resource that is being called a Science Cloud Service Provider or Sci CSP," says Robert Grossman, Director of the Open Cloud Consortium. "Tukey makes it easy for scientific research projects to manage, analyze and share big data, something that is quite difficult to do with the services from commercial Cloud Service Providers."
The beta version of Tukey is being used by several research projects, including: the Matsu Project, which hosts over two years of data from NASA's EO-1 satellite; Bionimbus, which is a system for managing, analyzing, and sharing large genomic datasets; and bookworm, which is an application that extracts patterns from large collections of books.
The services include: hosting large public scientific datasets; standard installations of the open source OpenStack and Eucalyptus systems, which provide instant, on-demand computing infrastructure; standard installations of the open source Hadoop system, which is the most popular platform for processing big data; standard installations of UDT, which is a protocol for transporting large datasets; and a variety of domain-specific applications.
Tukey has a direct 10 Gbps connection to StarLight, an advanced national and international communications exchange facility, which in turn connects to dozens of high performance research networks around the nation and the globe. "Tukey enables scientists to share their big datasets with researchers around the country and the world," says Joe Mambretti, Director, International Center for Advanced Internet Research (iCAIR) at Northwestern University.
About the Open Cloud Consortium
The Open Cloud Consortium (OCC) is a not-for-profit that manages and operates cloud computing infrastructure to support scientific, medical, health care, and environmental research. The Open Cloud Consortium is a consortium managed by the Center for Computational Science Research, Inc., which is an Illinois-based 501(c)(3) not-for-profit corporation. (http://www.opencloudconsortium.org)
Tukey is named after the American scientist John Wilder Tukey (1915 - 2000), who made a number of fundamental contributions to statistics. He helped popularize exploratory data analysis, which is an important technique when working with big data. He also introduced the term "bit."
StarLight is the world's most advanced national and international communications exchange facility. StarLight provides advanced networking services and technologies that are optimized for high-performance, large-scale metro, regional, national and global applications, especially for data intensive research science communities. (http://www.startap.net/starlight).
R&E Network and Green Internet Consultant.
at 5:38 AM