CERN, the European nuclear research lab, has passed a milestone in building its worldwide data grid, sustaining a continuous data flow averaging 600 megabytes per second (MB/s) for 10 days between eight facilities across Europe and the U.S.
The service challenge is a step towards creating the reliable, high-speed distributed computing network, or grid, needed to handle the estimated 15 petabytes (PB) of data per year expected to be generated by proton collisions at CERN’s Large Hadron Collider (LHC). Last month CERN completed building the 100-site grid, based in 31 countries.
While grid computing, a form of computing cluster, is much talked of in enterprise IT as a way of flexibly using computing power across an organization, CERN is largely breaking new ground in building its grid, the lab says.
The project is using some commercial components, such as Oracle databases and some storage systems, but its own engineers had to develop the middleware to operate the grid. The project’s grid will also operate on a greater scale than has been previously achieved. “When the LHC starts operating in 2007, it will be the most data-intensive physics instrument on the planet, producing more than 1,500 megabytes of data every second for over a decade,” said Jamie Shiers, manager of the service challenges at CERN, in a statement.
The new service challenge channelled data between CERN in Switzerland and seven other participants: Brookhaven National Laboratory and Fermi National Accelerator Laboratory (Fermilab) in the US, Forschungszentrum Karlsruhe in Germany, CCIN2P3 in France, INFN-CNAF in Italy, SARA/NIKHEF in the Netherlands and Rutherford Appleton Laboratory in the UK.
The test sustained roughly one-third of the data rate ultimately needed for the LHC, and reached peak rates of more than 800 MB/s. It used underlying high-speed networks including DFN, GARR, GEANT, ESnet, LHCnet, NetherLight, Renater and UKLight. The project’s next service challenge, starting this summer, will extend the operating period to three months and will add many more facilities to the current eight.
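For readers who want to check the scale of these figures, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not anything CERN published: it assumes decimal units (1 TB = 1,000,000 MB), treats the reported 600 MB/s average as constant over the full 10 days, and uses the quoted 1,500 MB/s as the LHC target. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_terabytes(rate_mb_per_s, days):
    """Total volume moved, in decimal terabytes (assumes 1 TB = 1,000,000 MB)."""
    return rate_mb_per_s * days * SECONDS_PER_DAY / 1_000_000


def fraction_of_target(rate_mb_per_s, target_mb_per_s):
    """Share of a target data rate that a sustained rate represents."""
    return rate_mb_per_s / target_mb_per_s


# Figures taken from the article; the target is the quoted lower bound.
sustained_mb_per_s = 600    # average rate during the test
lhc_mb_per_s = 1500         # "more than 1,500 megabytes" per second
test_days = 10

volume_tb = total_terabytes(sustained_mb_per_s, test_days)
share = fraction_of_target(sustained_mb_per_s, lhc_mb_per_s)

print(f"Volume moved in {test_days} days: about {volume_tb:.1f} TB")
print(f"Share of a 1,500 MB/s target: {share:.0%}")
```

Under these assumptions the test moved a little over 500 TB. The 40 percent share is an upper bound, since the LHC figure is quoted as "more than" 1,500 MB/s, which is consistent with the "roughly one-third" description.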
CERN’s grid is based on the Globus Toolkit from the Globus Alliance, adding scheduling software developed by the University of Wisconsin and tools developed under the aegis of the E.U.’s DataGrid project.