When you get right down to it, most computers spend their lives sitting around waiting for work, and the unused CPU cycles perish into the ether. Grid computing, in its purest form, is out to change all that.
Since grids are designed to reduce unused and idle CPU cycles, they can, theoretically at least, do more with less. But defining what is, and what isn't, grid computing? Well, that's another matter.
Much like Web services and unified messaging, grid computing seems to have as many definitions as it does players.
Some companies view it as the IT equivalent of electricity, where a user plugs into a system to access CPU cycles when needed. Others view it more as a series of interconnected corporate machines that can simultaneously work on a complex problem.
Neither definition is inherently wrong, and that is why a concise definition is next to impossible. But one commonality runs through them all: the idea of sharing IT resources.
The underlying strategy behind adopting a grid-computing solution in the enterprise is to resolve the fundamental conflict that exists between IT and business.
Ideally, a company has all the computing power it needs, but because demand is uneven and, at best, only roughly predictable, most companies are forced to buy additional systems to handle peak loads.
If you're a national weather forecaster and peak loads are almost continual, there is no business/IT conflict, since both sides agree on the case for building large, powerful systems with peak demand in mind.
But for most Canadian companies, peak demand is infrequent yet still predictable, say weekly or quarterly. If a company's systems are designed to handle this load, then there are a lot of under-used machines. The other side of the coin, especially in tight economic times, is to design systems for average, rather than peak, use. The result, when demand outstrips supply, is that everything slows to a crawl. With grid, the goal is to reduce the total number of corporate CPUs while still handling peak demands, and to save money at the same time.
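As a rough back-of-the-envelope sketch of that trade-off (the figures below are invented for illustration, not drawn from any particular company), the arithmetic looks something like this:

```python
# Invented numbers illustrating the peak-versus-average sizing trade-off.
average_load_cpus = 40        # CPUs needed on a typical day
peak_load_cpus = 100          # CPUs needed during the weekly or quarterly peak
peak_hours_per_week = 8       # how long the peak actually lasts
hours_per_week = 24 * 7

# Size the fleet for the peak: most of the capacity idles most of the time.
cpu_hours_used = (average_load_cpus * (hours_per_week - peak_hours_per_week)
                  + peak_load_cpus * peak_hours_per_week)
cpu_hours_bought = peak_load_cpus * hours_per_week
print(f"Peak-sized fleet utilization: {cpu_hours_used / cpu_hours_bought:.0%}")  # about 43%

# Size the fleet for the average: cheaper, but the peak overwhelms it.
print(f"CPUs short at peak if sized for average: {peak_load_cpus - average_load_cpus}")
```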
Grid on a grand scale
The University of California at Berkeley's SETI@home project is not only the largest example of grid computing, but also, in its own simple way, one of the most powerful supercomputers ever created.
SETI (Search for Extraterrestrial Intelligence) has more than five million participants, each of whom has downloaded a free program that analyzes radio telescope data and sends the results back to SETI.
It has generated more than two million years of CPU number crunching by using participants’ idle CPU cycles.
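In outline, the SETI@home model is just a fetch-compute-report loop running on spare cycles. The sketch below is a structural analogy only: the work units and analysis are stand-ins, and the function names are invented, not SETI's actual client code.

```python
import random
import time

def fetch_work_unit():
    """Stand-in for downloading a small, self-contained chunk of telescope data."""
    return [random.random() for _ in range(1_000)]

def analyze(work_unit):
    """Stand-in for the signal-analysis number crunching done on the volunteer's machine."""
    return max(work_unit)          # e.g. report the strongest 'signal' in the chunk

def report(result):
    """Stand-in for sending the small result back to the project's servers."""
    print(f"reporting result: {result:.6f}")

for _ in range(3):                 # the real client loops indefinitely, and only when the CPU is idle
    unit = fetch_work_unit()       # pull down one work unit
    report(analyze(unit))          # crunch it locally, send back only the answer
    time.sleep(0.1)                # placeholder for waiting until the machine is idle again
```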
But this design, while interesting, is fundamentally unusable in the enterprise. First of all, most corporate software applications don’t have SETI’s wafer-thin 791KB profile and thus can’t easily be loaded onto a machine in search of idle CPU cycles.
Secondly, little corporate data is as easily parsed as individual radio telescope data. There is, generally speaking, just too much interactivity between data sets to run an enterprise-level ERP application over a SETI-like distributed network.
In practice, it is often difficult to write an application so that disparate CPUs can successfully execute different portions of an application without causing data-continuity conflicts. Because solutions to these obstacles are relatively recent, it is only in the past 24 months that grid has truly piqued the interest of the enterprise.
Today the software needed to partition the tasks running within an application and parcel them out to disparate CPUs has improved dramatically, as has the functionality needed to silo unrelated data running simultaneously on a grid.
Needless to say, from both a security and business continuity perspective, application data cannot risk being influenced by outside forces.
When an application runs on a dedicated server there is no risk of crosstalk, but when it is jumping on and off shared systems, the solutions needed to handle siloing and continuity are quite complex. They do exist, though. Large enterprise vendors such as Sun Microsystems Inc. and IBM Corp. are now offering utility-model solutions for customers, as well as helping others build internal grid solutions.
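As a loose analogy for what that software does (this sketch uses Python's standard concurrent.futures on a single machine rather than any vendor's grid middleware, and the job names are invented), the core idea is to split a job into self-contained pieces and keep each job's data to itself:

```python
from concurrent.futures import ProcessPoolExecutor

def crunch(chunk):
    """Work on one self-contained piece; it shares no state with any other chunk."""
    return sum(x * x for x in chunk)

def run_job(job_id, data, chunk_size=1_000):
    """Partition one job's data and farm the pieces out to separate CPUs.
    Each job gets its own pool and its own data, a crude stand-in for the
    siloing described above."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ProcessPoolExecutor() as pool:
        return job_id, sum(pool.map(crunch, chunks))

if __name__ == "__main__":
    # Two unrelated jobs run back to back without ever seeing each other's data.
    print(run_job("risk-analysis", list(range(10_000))))
    print(run_job("portfolio-optimization", list(range(5_000))))
```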
High barriers
Today, unfortunately, the biggest hurdles to grid adoption are anything but ephemeral: they tend to be political and psychological.
With an external utility-grid model there is still residual fear about crunching data outside the confines of the corporate infrastructure. But most utility models have superior security simply because business survival demands it.
Data transfer is done through a dedicated VPN and can be encrypted if needed. To paraphrase a grid guru from IBM Canada Ltd., security, technically at least, is no longer a problem. Internal grid adoption has other crosses to bear: priority allocation often rears its ugly head.
With individual departments ruling their own domains, sharing is not a problem. But if a large internal grid is set up, how does a company, politically at least, decide who gets priority over whom? Does finance rule over R&D? What about sales wanting to do some data mining? The battle can be furious. Few companies are at this level of resource allocation today, but they are aware of the potential political hurdles if they do decide to build a grid.
Success stories
But before you decide the political hassles aren’t worth the effort, be advised there are many Canadian success stories, both small and large. Tundra Semiconductor Corp. in Ottawa, for instance, uses a utility-computing service from GridWay Computing Corp. to test new product designs.
Meanwhile, one of the big-five Canadian banks has built an internal grid to run CPU-intensive applications focusing on risk analysis and portfolio optimization — two different departments that have achieved peaceful co-existence.
For companies unsure about whether a grid option is viable, there is help available to measure potential return on investment: most of the consulting firms, along with the major IT players, have groups dedicated to the grid cause. Whether a grid is internal or external, the rules of engagement are pretty simple, though the technology is complex.
When an application needs resources beyond what it is normally allocated, an automatic request is made to the grid. Often, an exact mirror of the enterprise application sits on a gateway server physically located next to the grid (especially with the utility model). This speeds up the process, since the application, typically larger than the data itself, does not have to be sent over the corporate network.
When the request is made for additional power, the application, along with the pertinent data sets, is sent out onto the grid. The numbers are crunched, and the data is returned with the application to the gateway server. Then the data makes its trip back home to the enterprise application server. Properly designed, the entire process is seamless, with no human intervention required.
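Rendered as schematic code (the class and method names below are invented to mirror the steps just described; they are not any vendor's API), the round trip looks roughly like this:

```python
class GridNode:
    """One shared machine on the grid."""
    def execute(self, app, data_set):
        return app(data_set)        # run the mirrored application against one data set

class GatewayServer:
    """Holds a mirror of the enterprise application right next to the grid,
    so only the data, not the application, has to cross the corporate network."""
    def __init__(self, application_mirror, grid_nodes):
        self.app = application_mirror
        self.nodes = grid_nodes

    def run_on_grid(self, data_sets):
        # Fan the data sets out across the grid, crunch, and collect the results.
        return [node.execute(self.app, data)
                for node, data in zip(self.nodes, data_sets)]

class EnterpriseAppServer:
    """The in-house application server that asks the grid for extra capacity."""
    def __init__(self, gateway, local_capacity):
        self.gateway = gateway
        self.local_capacity = local_capacity

    def process(self, data_sets):
        if len(data_sets) <= self.local_capacity:
            return [self.compute_locally(d) for d in data_sets]
        # Demand outstrips the normal allocation: an automatic request goes to
        # the grid, and only the data travels over the (VPN-protected) link.
        return self.gateway.run_on_grid(data_sets)

    def compute_locally(self, data_set):
        return sum(data_set)        # the same work, done in-house when capacity allows

if __name__ == "__main__":
    gateway = GatewayServer(application_mirror=sum,
                            grid_nodes=[GridNode() for _ in range(4)])
    server = EnterpriseAppServer(gateway, local_capacity=2)
    print(server.process([[1, 2], [3, 4], [5, 6], [7, 8]]))   # too big for local, goes to grid
```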
The result is much faster data processing since the application’s work is distributed over many machines. And since the machines are shared by many applications, there are fewer idle CPU cycles. When one job is done the next is run.
With many grid designs, multiple jobs run at once (hence the need to silo data), further reducing wait times.
Many external grid offerings have a combination of systems available. For data that is easily parsed, they offer clusters of single and dual CPU systems.
For applications that run best in a shared memory environment, there are huge multi-processor machines. IBM recently announced that its Blue Gene supercomputer — the most powerful in the world, according to the respected top500.org — is available for customer use.
Admittedly not every application lends itself to a grid environment, but many industries — oil and gas, life sciences, financial, for example — find grid solutions, either internal or external utility models, to be a cost-effective way to process information quickly and effectively.