Friday, February 5, 2010

Scientists Given Free Access to Cloud

[Another example of why content peering at major IXs will be critical for the future of scientific research. More and more research projects need access to the global Internet community for crowdsourcing, cloud computing, distribution of educational video and citizen science. Excerpts from NY Times and Dan Reed’s blog – BSA]

U.S. Scientists Given Access to Cloud
The National Science Foundation and the Microsoft Corporation have agreed to offer American scientific researchers free access to the company’s new cloud computing service.

A goal of the three-year project is to give scientists the computing power to cope with exploding amounts of research data. It uses Microsoft’s Windows Azure computing system, which the company recently introduced to compete with cloud computing services from companies like Amazon, Google, I.B.M. and Yahoo. These cloud computing systems allow organizations and individuals to run computing tasks and Internet services remotely in relatively low-cost data centers.

Neither Microsoft nor the foundation was willing to place a dollar amount on the agreement, but Dan Reed, the corporate vice president for technology strategy and policy at Microsoft, said that the company was prepared to invest millions of dollars in the service and that it could support thousands of scientific research programs.

Access to the service will come in grants from the foundation to new and continuing scientific research. Microsoft executives said they planned eventually to make the new service global.

Simplicity of use is one Microsoft goal. Programming modern cloud systems for full efficiency has been difficult. The company is trying to overcome this difficulty by creating a variety of software tools for scientists, said Ed Lazowska, a University of Washington computer scientist who works with the Microsoft researchers.

Dr. Lazowska said the explosion of data being collected by scientists had transformed the staffing needs of the typical campus research program from a graduate student working one day a week to a full-time employee dedicated to managing the data. He said such exponential growth in cost was increasingly hampering scientific research.

Innovation Via Client Plus Cloud: Microsoft-NSF Partnership

Today, February 4, Microsoft and the U.S. National Science Foundation (NSF) announced a collaborative project where Microsoft will offer individual researchers and research groups (selected through NSF's merit review process) free access to advanced client-plus-cloud computing. Our focus is on empowering researchers via intuitive and familiar client tools whose capabilities extend seamlessly in power and scope via the cloud.

I am very excited about this, as it is the fruit of nearly two years of planning and collaboration across Microsoft product and research teams, as well as many discussions with researchers, university leaders and government agencies. As part of this project, a technical computing engagement team, led by Dennis Gannon and Roger Barga, will work directly with NSF-funded researchers to port, extend and enhance client tools for data analysis and modeling. We also appreciate the support of the Microsoft DreamSpark, Technical Computing, Windows Azure, Azure Dallas, Public Sector, education and evangelism (DPE) teams, among others, to build and deliver this capability.

21st Century Innovation

The brief history of computing is replete with social and technological inflection points, when a set of quantitative and qualitative technology changes led to new computing modalities. I believe we are now at such an inflection point in computing-mediated discovery and innovation, enabled by four social and technical trends:
• Massive, highly efficient cloud infrastructures, driven by search and social networking demands
• Explosive data growth, enabled by inexpensive sensors and high-capacity storage
• Research at the interstices of multiple disciplines, conducted by distributed, virtual teams
• Powerful, popular and easy-to-use client tools that facilitate data analysis and modeling

The first two of these are well documented, and I have written about them before. (See Beyond the Azure Blue and Language Shapes Behavior: Our Poor Cousin Data.) The late Jim Gray also lectured and wrote perspicuously about data-intensive scientific discovery, which he called The Fourth Paradigm. As a logical complement to theory, experiment and computation, the fourth paradigm is based on extracting insight from the prodigious amounts of social, business and scientific data now stored in facilities whose scale now dwarfs all previous computing capabilities.

Climate change and its environmental, economic and social implications; genetics, proteomics, lifestyle, environment, health care and personalized medicine; economics, global trade, social dynamics and security – these are all complex, multidisciplinary problems whose exploration and understanding depend critically on expertise from diverse disciplines with differing cultures and reward metrics.

As our research problems rise in complexity, the enabling tools must rise commensurately in power while retaining simplicity. I have seen far too many multidisciplinary projects founder on the rocks of infrastructure optimization and complexity, when they should have focused on simplicity, familiarity and ease of use.

As Fred Brooks once remarked, "We must build tools so powerful that full professors want to use them, and so simple that they can." Simplicity really, really matters. It is for this reason that Excel spreadsheets, high-level scripting languages and domain-specific client toolkits are now the lingua franca of multidisciplinary innovation, the harbingers of invisibility.

Invisible Simplicity

Sadly, our technical computing experiences have been dominated and shaped by a focus on technology and infrastructure, rather than empowerment and simplicity. We talk routinely of data and software repositories, toolkits and packages; of cyberinfrastructure and technology roadmaps. In our technological fascination, it is all too easy to lose sight of the true objective. Infrastructure exists to enable. If it is excessively complex, cumbersome or difficult to use, its value is limited. The mathematician Richard Hamming's admonition remains apt: "The purpose of computing is insight, not numbers."

Empowering the Majority

To address 21st century challenges, we must democratize access to data and computational models, recognizing that the computing cognoscenti are, by definition, the minority of those who can and should benefit from computing-mediated innovation and discovery. Instead, we must enfranchise the majority, those who do not and will not use the low-level technologies – clusters, networks, file systems and databases – but wish to ask and answer complex questions. Remember, most researchers do not write low-level code; nor should they need to.

I believe we must focus on human productivity, not cyberinfrastructure, and leverage popular and intuitive client tools that hide infrastructural idiosyncrasies. This means deploying tools like Excel that can manipulate data of arbitrary size, reaching into the cloud to access petabytes of data and executing massive computations for epidemiological analysis as easily as one might balance a checkbook. It means coupling multiple, domain-specific toolkits via a script of a dozen lines, and launching a parametric microclimate study as easily as one searches the web. Perhaps more importantly, it means rethinking public and private sector partnerships for innovation, identifying and leveraging core competencies.
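To make the "script of a dozen lines" idea concrete, here is a minimal, hypothetical sketch of a parametric sweep of the kind described above. Everything in it is invented for illustration: `microclimate_model` is a toy stand-in for a real simulation kernel from a domain toolkit, and the local thread pool stands in for a cloud-backed executor (such as a pool of Azure workers) that would actually carry the computation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def microclimate_model(temperature_c, humidity_pct):
    # Toy stand-in for a real simulation kernel: a heat-index-like score.
    return round(temperature_c + 0.05 * humidity_pct * (temperature_c - 14.0), 2)

def run_sweep(temperatures, humidities, workers=4):
    # Enumerate the full parameter grid, then fan the runs out to workers.
    grid = list(product(temperatures, humidities))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda params: microclimate_model(*params), grid)
    # Map each (temperature, humidity) pair to its model output.
    return dict(zip(grid, scores))

results = run_sweep([10.0, 20.0, 30.0], [40.0, 80.0])
```

The point of the sketch is the shape, not the model: the researcher writes only the grid and the call, while the executor (local here, cloud in the vision above) hides where and how the runs actually execute.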

It all comes back to simplicity and invisibility. Technical computing can and should be an invisible intellectual amplifier, as easy to use as any other successful consumer technology. Now is the time, and Microsoft is committed to making it a reality. We look forward to working with the community.

twitter: BillStArnaud
skype: Pocketpro