As the executive director of the National Center for Supercomputing Applications (NCSA), Dan Reed may have one of the better jobs in all of IT. He leads an organization committed to pushing the computer-science envelope, one that pioneers many of the concepts that will become part of mainstream IT infrastructure in the future. In an interview with InfoWorld editor in chief Michael Vizard and Test Center Director Steve Gillmor, Reed explains why his organization has become heavily committed to researching Linux clusters and grid computing.
InfoWorld: How did the concept of grid computing come about?
DR: There were analogies with the electric power grid. You don’t really worry about where generating plants are. You expect that those electrons will be there at the outlet when you need them. The notion of grid computing is not only to make computing cycles available but [also] data and access to remote instruments in the same way, so that you don’t need to know where the facility is; the information is available when you request it.
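A minimal sketch of the location transparency Reed describes, assuming a toy Python broker; the GridBroker class and the site names are invented for illustration and stand in for real grid middleware, not any actual toolkit API. The caller asks for a resource, and the broker decides which facility serves the request.

    # Hypothetical illustration of location transparency: the caller never names a site.
    import random

    class GridBroker:
        """Toy resource broker; in a real grid, middleware plays this role."""
        def __init__(self, sites):
            self.sites = sites  # facilities offering cycles, data, or instruments

        def submit(self, job):
            # Pick any site that advertises the needed resource; the user never chooses.
            candidates = [s for s in self.sites if job["needs"] in s["offers"]]
            site = random.choice(candidates)
            return f"job '{job['name']}' scheduled on {site['name']}"

    broker = GridBroker([
        {"name": "site-a", "offers": {"cycles", "data"}},
        {"name": "site-b", "offers": {"cycles", "instrument"}},
    ])
    print(broker.submit({"name": "climate-model", "needs": "cycles"}))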
InfoWorld: At what point does grid computing begin to make it out of labs and into commercial applications?
DR: We’re definitely still in the early adopter phase. You’ll start to see deployment, at increasing scale, during the next three to four years. There are already vendors who’ve committed to support a lot of the standard software infrastructure – Sun and IBM being two notable examples. There are still pieces of the infrastructure being developed. One of the things that we’re doing in concert with Argonne [National Laboratory] and Caltech and San Diego [Supercomputer Center] is building what we hope will be a high-end backbone. Just like the ARPAnet, it will start out with a small number of sites that will morph by aggregation into a teragrid that will be the initial backbone for a science [and] research grid that will span the United States. What we’ve chosen as the standard computing component for those four sites are Linux clusters based on [Intel] McKinley processors. The thing that’s going to connect those four sites is a 40Gbps backbone that Qwest is helping us deploy. We want to view this really as a test bed we can use to deploy the infrastructure, test it, and then start to plug in other facilities and sites around the country.
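A rough sketch of the test-bed topology described above, using plain Python data structures; the add_site helper and the example site name are hypothetical, while the four partner sites, the Linux-cluster platform, and the 40Gbps figure come from the answer itself.

    # Toy model of a four-site backbone that additional facilities can plug into later.
    backbone = {
        "link_gbps": 40,  # capacity of the Qwest-provisioned backbone cited above
        "sites": [
            {"name": "NCSA", "platform": "Linux cluster (McKinley)"},
            {"name": "Argonne", "platform": "Linux cluster (McKinley)"},
            {"name": "Caltech", "platform": "Linux cluster (McKinley)"},
            {"name": "SDSC", "platform": "Linux cluster (McKinley)"},
        ],
    }

    def add_site(grid, name, platform):
        """Later facilities join by registering against the same backbone."""
        grid["sites"].append({"name": name, "platform": platform})

    add_site(backbone, "example-university", "Linux cluster")
    print([s["name"] for s in backbone["sites"]])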
InfoWorld: What role does XML play in your project?
DR: I think the real challenge for both business and scientific data archives is the whole metadata issue — how you support interoperability across multiple databases where the naming conventions are different. That’s where we’re trying to draw on some expertise we have here in large-scale data mining. XML is a big piece of that story.
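A hedged sketch of the metadata problem Reed raises, assuming only Python's standard library and invented element names: two archives describe the same observation under different tags, and a small crosswalk maps both onto shared names so queries can span the two databases.

    # Illustration only: reconcile two archives whose XML metadata uses different names.
    import xml.etree.ElementTree as ET

    # Crosswalk from each archive's local element names to a shared vocabulary.
    CROSSWALK = {
        "archive_a": {"temp_c": "temperature", "obs_time": "timestamp"},
        "archive_b": {"Temperature": "temperature", "TimeOfObservation": "timestamp"},
    }

    def normalize(xml_text, archive):
        """Return metadata keyed by the shared names, whatever the source called them."""
        record = {}
        for element in ET.fromstring(xml_text):
            shared_name = CROSSWALK[archive].get(element.tag)
            if shared_name:
                record[shared_name] = element.text
        return record

    a = "<record><temp_c>21.4</temp_c><obs_time>2001-10-01T12:00</obs_time></record>"
    b = "<record><Temperature>21.4</Temperature><TimeOfObservation>2001-10-01T12:00</TimeOfObservation></record>"
    print(normalize(a, "archive_a") == normalize(b, "archive_b"))  # True: same metadata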
InfoWorld: Given the emphasis on Linux, what’s your relationship with Microsoft like?
DR: We made a strategic decision about a year ago to shift the focus of our efforts from [Microsoft Windows] NT to Linux. Microsoft’s interaction with us has changed based on that. But clearly the whole issue of how the grid interacts with .Net is a big part of the story. There are ongoing discussions with Microsoft and some collaborations to integrate some of the grid software with .Net, so one gets interoperability between the Linux pieces and the desktop pieces running Windows.
InfoWorld: What impact do you think Linux will ultimately have on the industry as a whole?
DR: What I hope will be the value of Linux is that it will raise the development effort [higher] up the value chain. When I talk to people at Hewlett-Packard or IBM or other vendors about why they’re investing in Linux, the answer in the end largely seems to devolve to [this]: In a world of limited resources and people, where do you maximize your return? If most of the value-add as a seller of software is in the applications or things layered on top of those, then the extent to which you can minimize investment in the plug-and-play operating system level gives you flexibility to invest those resources further up the value chain. That’s what I think is going to happen.