The report released by UC Berkeley’s Reliable Adaptive Distributed Systems Lab (RAD Lab) carries significant implications for IT organizations.
With its report, RAD has indicated that academic institutions are paying attention to technology trends. The work RAD did for its report is excellent and valuable and should be required reading for anyone considering the potential of cloud computing.
RAD then goes on to make three recommendations, and this is where you should pay very, very close attention. They say:
“Hence, developers would be wise to design their next generation of systems to be deployed into Cloud Computing. In general, the emphasis should be horizontal scalability to hundreds or thousands of virtual machines over the efficiency of the system on a single virtual machine.”
“There are specific implications as well:
1. Applications Software of the future will likely have a piece that runs on clients and a piece that runs in the Cloud. The cloud piece needs to both scale down rapidly as well as scale up, which is a new requirement for software systems. The client piece needs to be useful when disconnected from the Cloud, which is not the case for many Web 2.0 applications today. Such software also needs a pay-for-use licensing model to match needs of Cloud Computing.
2. Infrastructure Software of the future needs to be cognizant that it is no longer running on bare metal but on virtual machines. Moreover, it needs to have billing built in from the beginning, as it is very difficult to retrofit an accounting system.
3. Hardware Systems of the future need to be designed at the scale of a container (at least a dozen racks) rather than at the scale of a single 1U box or single rack, as that is the minimum level at which it will be purchased. Cost of operation will match performance and cost of purchase in importance in the acquisition decision. Hence, they need to strive for energy proportionality by making it possible to put into low power mode the idle portions of the memory, storage, and networking, which already happens inside a microprocessor today.
Hardware should also be designed assuming that the lowest level software will be virtual machines rather than a single native operating system, and it will need to facilitate flash as a new level of the memory hierarchy between DRAM and disk.
Finally, we need improvements in bandwidth and costs for both datacenter switches and WAN routers.”
Overall, they strongly recommend use of cloud computing due to its elasticity and transfer of risk. Elasticity refers to the first point in their description of cloud computing: the ability to scale up and down rapidly, with the illusion that unlimited capacity is available should it ever be needed.
Transfer of risk is a really intriguing concept. Essentially, they are referring to a situation of uncertain or fluctuating demand. For traditional IT shops, this type of situation is risky, posing a dilemma: either overprovision capacity to be certain of meeting peak demand (thus probably wasting capital on capacity that sits idle), or purchase capacity to meet average demand and thereby forgo the processing capacity needed when user demand exceeds the average.
Essentially, this risk boils down to being uncertain about how much hardware capacity to purchase; in environments with potential for extreme swings in demand, this risk is enormous. By relying on a cloud provider to provide sufficient capacity to absorb whatever user load might occur, an IT organization transfers the risk of capacity planning to the cloud provider. By using a cloud provider that operates on a scale one or two orders of magnitude greater than any one end user organization, the user can be confident that, whatever demand it places on the cloud provider, the provider’s large infrastructure can easily absorb the fluctuation.
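To make the capacity-planning dilemma concrete, here is a minimal sketch of the arithmetic. The hourly demand figures are invented purely for illustration (they do not come from the RAD report), and the function name is my own.

```python
# Hypothetical demand in "server units" for each hour of one day.
# Illustrative numbers only: 12 quiet hours, 8 normal hours, 4 peak hours.
demand = [20] * 12 + [50] * 8 + [140] * 4


def fixed_capacity_outcome(demand, capacity):
    """Return (idle server-hours, unmet server-hours) for a fixed-size fleet."""
    idle = sum(max(capacity - d, 0) for d in demand)
    unmet = sum(max(d - capacity, 0) for d in demand)
    return idle, unmet


peak = max(demand)                     # provision for the worst hour
average = sum(demand) // len(demand)   # provision for the typical hour

for label, capacity in (("peak", peak), ("average", average)):
    idle, unmet = fixed_capacity_outcome(demand, capacity)
    print(f"{label:>7}: capacity={capacity} idle={idle} unmet={unmet}")

# With pay-per-use capacity that tracks demand, the organization pays only
# for the server-hours it actually consumes.
print(f"elastic: used={sum(demand)} idle=0 unmet=0")
```

Under these assumed numbers, buying for peak leaves most of the purchased capacity idle, while buying for average leaves demand unserved during the peak hours. That gap is the risk the cloud provider takes on.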
To sum up, RAD identifies the key characteristics of cloud computing as an easy-to-get-going, scalable-on-demand, no-upfront-commitment computing environment. Its recommendations describe systems attuned to that infrastructure and its operating characteristics. Finally, the report identifies a number of impediments to cloud computing and recommends ways to mitigate them.
I’d like to address some implications of the report as well as what I view as, if not its shortcomings, its rather too-glib assumptions regarding this new environment.
As to implications, in its first recommendation, RAD suggests that “the emphasis should be horizontal scalability to hundreds or thousands of virtual machines over the efficiency of the system on a single virtual machine.” This sentence carries significant implications for IT organizations. In particular, it suggests that cloud-based applications need to be architected with specific capabilities: they must be easily partitionable, able to incorporate additional resources without manual intervention, free of concurrency contention for key resources, and so on. These capabilities map extremely well to the cloud characteristics they identify.
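As a rough illustration of what “easily partitionable” can mean in practice, the sketch below assigns work items to workers by hashing a key, so each worker owns its own slice of the data and no shared resource is contended. This is one common approach, not something the RAD report prescribes; the function names and the sample keys are hypothetical.

```python
import hashlib


def shard_for(key: str, num_workers: int) -> int:
    """Map a key to a worker index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def partition(keys, num_workers):
    """Group keys into one list per worker; each key lands on exactly one worker."""
    shards = [[] for _ in range(num_workers)]
    for key in keys:
        shards[shard_for(key, num_workers)].append(key)
    return shards


# Hypothetical customer IDs standing in for real work items.
keys = [f"customer-{i}" for i in range(1000)]

# Scaling out is a change to one number rather than a redesign.
for num_workers in (4, 8):
    sizes = [len(shard) for shard in partition(keys, num_workers)]
    print(num_workers, "workers ->", sizes)
```

Simple modulo hashing like this reassigns many keys whenever the worker count changes; consistent hashing is a common alternative when that movement is expensive.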
The key challenge this poses to IT organizations is that most applications today are not architected this way; most assume a relatively stable load and a fixed topology. In other words, most of the skills in today’s IT shops are not oriented toward the architecture RAD recommends as the basis for cloud-based apps. Probably the nearest most mainstream IT shops come to this kind of application is web-based systems, but these represent only a fraction of the overall application portfolio.
This is not to say that traditionally-architected applications cannot be run in cloud environments, merely to note that to fully leverage the unique characteristics of clouds a different architecture is required. Moreover, the types of applications that clouds enable, along with the suggested architecture, are just the types that could not be countenanced in traditional data centers, operated as they are with assumptions of capacity scarcity.
There are two aspects of the report that minimize challenges mainstream IT organizations will confront when moving to cloud computing. One of those challenges emanates from outside IT: it is located within the vendor community. The second resides within IT itself and relates to internal cultural issues.
In its second recommendation, RAD advises that “Infrastructure Software of the future needs to have billing built in from the beginning.” In that statement a whole range of issues resides. Infrastructure software,