Cloud Computing: UC Berkeley Overview

Diego Vergara

Nov 09, 2011

Microsoft, Sun Microsystems, Google AppEngine, and Amazon EC2 are the principal exponents of the cloud computing world. Basically, cloud computing is about distributing processing across many machines on a network to create the illusion of an infinite computing resource available on demand. But how real, and how expensive, is this alternative for IT departments, businesses, and developers? UC Berkeley published a paper, "Above the Clouds: A Berkeley View of Cloud Computing", which makes some interesting points: a definition of cloud computing, some obstacles to adopting it, and some possible uses for utility computing.

Before adopting the Berkeley definition of cloud computing, it's worth noting that there's a lot of confusion about what exactly cloud computing is:

"The interesting thing about Cloud Computing is that we've redefined Cloud Computing to include everything that we already do... I don't understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads."
— Larry Ellison - Oracle CEO - Wall Street Journal, September 26, 2008

"A lot of people are jumping on the [cloud] bandwagon, but I have not heard two people say the same thing about it. There are multiple definitions out there of "the cloud"."
— Andy Isherwood - HP Vice President - ZDnet News, December 11, 2008

Even Richard Stallman has argued against cloud computing, calling it a trap that makes users dependent on proprietary systems.

The UC Berkeley point of view treats both hardware and software as part of the cloud. Software as a Service (SaaS) and hardware resources are now commonly referred to as XaaS, where X can be Software, Infrastructure, Hardware, or Platform. The paper, however, focuses on the hardware side and identifies three aspects of what cloud computing should be:

The illusion of infinite computing resources available on demand
The elimination of an up-front commitment by Cloud Users: they can start small and increase hardware based on their needs
The ability to pay for use of computing resources and release them as needed
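
To make the second and third points concrete, here is a minimal sketch comparing two ways of paying for the same workload: provisioning for the peak hour versus acquiring and releasing servers hour by hour. The demand figures, the price, and the function names are hypothetical, and the sketch assumes the same price per server-hour in both cases, which is a simplification.

```python
from math import ceil

def pay_per_use_cost(hourly_demand, price_cents):
    """Cost in cents when servers are acquired and released every hour."""
    return sum(ceil(d) for d in hourly_demand) * price_cents

def peak_provisioned_cost(hourly_demand, price_cents):
    """Cost in cents when enough servers for the busiest hour run all the time."""
    peak = max(ceil(d) for d in hourly_demand)
    return peak * len(hourly_demand) * price_cents

# Hypothetical workload: servers needed in each of six consecutive hours.
demand = [2, 2, 3, 10, 4, 2]
price_cents = 10  # assumed price per server-hour, in cents

elastic = pay_per_use_cost(demand, price_cents)
fixed = peak_provisioned_cost(demand, price_cents)
print(f"pay per use: {elastic} cents, provisioned for peak: {fixed} cents")
```

With a spiky workload like this one, most of the peak-provisioned capacity sits idle, which is the gap that pay-for-use pricing is meant to close.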

On new application opportunities, the paper builds on Gray's insight (Jim Gray concluded in 2003 that data must be near the application, since the cost of wide-area networking had fallen more slowly than all other IT hardware costs) to describe which kinds of applications represent good opportunities:

Mobile interactive applications
Parallel batch processing
The rise of analytics
Extension of compute-intensive desktop applications

The paper also classifies utility computing offerings by the level of abstraction they present and how much control they give over the hardware:

Amazon EC2 is at one end of the spectrum. It lets an IT department control almost the entire software stack, and the virtualized hardware can be configured through an API. This low level of abstraction comes with a limitation: it is difficult for Amazon to offer automatic scalability and failover.
In the middle is Microsoft Azure (here at Oshyn we have tested some of its available features), which represents an intermediate point between flexibility and programmer convenience. Right now, Azure applications are written using the .NET libraries and compiled to the CLR. The system supports general-purpose computing, which means users get a choice of language, but they cannot control the underlying operating system or runtime.
At the other end of the spectrum are application domain-specific platforms such as Google AppEngine. Even though AppEngine has impressive automatic scaling and high-availability mechanisms, it is not suitable for general-purpose computing.

There are many resources that discuss the obstacles to adoption. However, I advise you to read the paper, because it includes a good description of each obstacle and the corresponding opportunity to overcome it.

Obstacle	Opportunity
Availability of Service	Use Multiple Cloud Providers to provide Business Continuity; Use Elasticity to Defend Against DDOS attacks
Data Lock-In	Standardize APIs; Make compatible software available to enable Surge Computing
Data Confidentiality and Auditability	Deploy Encryption, VLANs and Firewalls; Accommodate National Laws via Geographical Data Storage
Data Transfer Bottlenecks	FedExing Disks; Data Backup/Archival; Lower WAN Router Costs; Higher Bandwidth LAN Switches
Performance Unpredictability	Improved Virtual Machine Support; Flash Memory; Gang Scheduling VMs for HPC Apps
Scalable Storage	Invent Scalable Store
Bugs in Large-Scale Distributed Systems	Invent Debugger that relies on Distributed VMs
Scaling Quickly	Invent Auto-Scaler that relies on Machine Learning; Snapshots to encourage Cloud Computing Conservationism
Reputation Fate Sharing	Offer reputation-guarding services like those for email
Software Licensing	Pay-for-use licenses; Bulk use sales
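
The "Data Transfer Bottlenecks" row is easier to appreciate with some arithmetic. The sketch below estimates how long it takes to move a dataset over a WAN link. The 10 TB dataset size and the 20 Mbit/s sustained bandwidth are illustrative assumptions, as is the function name.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def transfer_days(size_bytes, bandwidth_bits_per_second):
    """Days needed to send size_bytes over a link with the given sustained bandwidth."""
    seconds = size_bytes * 8 / bandwidth_bits_per_second
    return seconds / SECONDS_PER_DAY

# Illustrative assumptions: a 10 TB dataset and a 20 Mbit/s sustained WAN link.
dataset_bytes = 10 * 10**12
wan_bps = 20 * 10**6

days = transfer_days(dataset_bytes, wan_bps)
print(f"{days:.1f} days over the WAN")
```

At these assumed figures the transfer takes over six weeks, which is why shipping disks by courier can be the faster option.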