No news here: The federal government wants to reduce head counts, centralize computing resources, and move service to Canadians to an online model. The results should be better service, streamlined operations and billions of dollars in savings.
The question is: Can grid computing contribute to the solution?
The answer: Yes, but... In theory, the technology is simplicity itself.
Computers spend most of their time doing nothing: grid computing ties them together with high-speed networks and management software to create virtual supercomputers from that idle capacity.
Many computer users have contributed to the SETI search for extraterrestrial life, or supported medical research projects.
In Canada, academic research networks pool their computing power to study how diseases spread, to map genes or to model the entire universe.
Dr. Roger Impey is Group Leader for High Performance Computing at the National Research Council’s Institute for Information Technology. He knows how effective – and how demanding – grid computing can be.
“At NRC, we have something like six or seven thousand PCs on people’s desks, most of which are pretty significant machines,” Impey said. “Most of which spend almost all of their time doing e-mail and Word, right?”
Impey said grid computing can be a powerful tool for certain sets of problems, but is relatively immature in some respects.
Scientists doing research in high energy physics, for example, can easily tolerate the drawbacks of ad hoc networking.
“They have hundreds of thousands of jobs they want to run, so they are tolerant about this kind of environment. If for some reason a couple of hundred of their jobs don’t run, or they don’t get their results back, they’re not too worried. (But) if you’re working with an insurance company or a bank and a couple of hundred of their database updates don’t get done — well, that could be a big problem. You have to pick your partners based on the state of the technology.”
Henk Dykhuizen, Vice President Government, Education and Health Care with Oracle Canada, finds that “the infrastructure that is in place today (in government) does not lend itself to a shared resource environment.”
“The systems that are over in one department are really not tightly connected to the systems in another department. In terms of sharing resources on a system, sharing data, to create a true grid, they are at least a generation behind on a lot of those machines.”
It takes a great deal of effort to make it all work, Dykhuizen said. “You need an enterprise console. Can you manage remote resources? Can you bridge different environments, the different platforms and architectures?”
Even with compatible machinery, Impey said, “very little in life is for free. You can get additional resources, but the down side is a layer of complexity and some of this stuff is relatively complicated.”
And compatible resources don’t necessarily address serious security questions.
As Dr. Impey said, high energy physics researchers have no problem with security; their jobs can run anywhere. Government, he said, is a different story.
“The ultimate thing we are striving for in grid computing is in some sense a security nightmare. What I would love to do is submit my job to some user-friendly interface, and it just takes it and does the work. You don’t know when or where or how. You just get the results back. That is the ultimate in grid computing for us, but I can imagine for some government departments, that is a bit of a nightmare.”
Even if wide-scale grid computing were technologically simple and inherently secure, it could still be difficult for federal departments to implement collectively. As Oracle’s Dykhuizen pointed out, “Step one may be Service Canada and doing shared services in a real way, but look at the struggle they’re having right now with that, trying to get all the departments to agree.”
Realistically, the first operational grid computing systems within government will probably be in-house, under the complete administrative and operational control of one department. That model is already working at NRC, in the Condor system.
“Condor is a bit like SETI At Home but for private networks,” Impey said. “Anyone in a group of several hundred machines can submit jobs to their own machine and then it decides whether to run it itself or run it on this group of network machines.” In one example, a scientist ran 10,000 jobs, each of which took several hours. “He had thousands of hours of computer time, really for free. It’s called ‘CPU salvaging.’ Essentially scavenging, which is where the word Condor comes in.”
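The submission step Impey describes is driven by a plain-text submit description file. A minimal sketch is below; the program name, file paths and job count are illustrative assumptions, not the NRC scientist's actual workload:

```
# Illustrative HTCondor submit description file (names are hypothetical).
# Each queued job runs the same program with a different process number;
# Condor decides whether to run it on the submitter's own machine or on
# an idle machine elsewhere in the pool.
universe   = vanilla
executable = analyze_dataset
arguments  = $(Process)
output     = results/out.$(Process)
error      = results/err.$(Process)
log        = analyze.log
queue 10000
```

Handed to the scheduler with condor_submit, a file like this is all it takes to fan thousands of independent jobs out across the salvaged CPUs.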
With the right management structure, Impey said, a department with operations across the country would have enough computing power available at any given time to make virtual supercomputers available almost 24 hours a day. “In Canada’s case we have 5 1/2 time zones. If you come in early, machines in Vancouver are still idle and you can get a lot of time on those machines before people come in, in the morning. From Vancouver, in the afternoon, they can start using machines on the east coast, because most people have gone home.”
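Impey's time-zone arithmetic can be sketched in a few lines of Python. This is a toy illustration: the city list, Eastern-time offsets and the nine-to-five office hours are assumptions rather than NRC data, and Newfoundland's half-hour zone is omitted for simplicity.

```python
# Hypothetical sketch: which regional desktop pools are likely idle at a
# given hour, assuming local office hours of 9:00-17:00 (an assumption).
# Offsets are hours relative to Eastern Time.
OFFICES = {
    "Halifax": 1,
    "Ottawa": 0,
    "Winnipeg": -1,
    "Calgary": -2,
    "Vancouver": -3,
}

def idle_in_region(hour_eastern, offset):
    """True if a region's desktops are outside local office hours."""
    local = (hour_eastern + offset) % 24
    return not (9 <= local < 17)

def idle_regions(hour_eastern):
    """Regions whose machines are likely free at a given Eastern-time hour."""
    return [city for city, off in OFFICES.items()
            if idle_in_region(hour_eastern, off)]

# Early morning in Ottawa: western offices have not opened yet.
print(idle_regions(8))   # -> ['Ottawa', 'Winnipeg', 'Calgary', 'Vancouver']
```

The same function shows the evening mirror image: by late afternoon Eastern time, east-coast machines free up for users in Vancouver, which is the near-24-hour coverage Impey describes.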
Impey said departments should be aware of grid computing’s potential today, to keep their options open for the future. “Don’t put things into place that will prevent these things from happening,” he advised. “Even if you don’t enable it, don’t prevent it.”
Richard Bray (rbray@itworldcanada.com) is an Ottawa-based freelance writer specializing in high technology issues.