We routinely talk about the electrical power grid or the telephone grid, and it’s pretty clear what we mean — a large, decentralized network with massive interconnectivity and coordinated management. A grid is, in fact, a meshed network in which no single centralized switch or hub controls routing. Grids offer almost unlimited scalability in size and performance because they aren’t constrained by the need for ever-larger central switches. Grid networks thus reduce component costs and produce a reliable and resilient structure.
Applying the grid concept to a computer network lets us harness available but unused resources by dynamically allocating and deallocating capacity, bandwidth and processing among numerous distributed computers. A computing grid can span locations, organizations, machine architectures and software boundaries, offering power, collaboration and information access to connected users. Universities and research facilities are using grids to build what amounts to supercomputer capability from PCs, Macintoshes and Linux boxes.
After grid computing came into being, it was only a matter of time before a similar model would emerge for making use of distributed data storage. Most storage networks are built in star configurations, where all servers and storage devices are connected to a single central switch. In contrast, a grid topology is built from a network of smaller interconnected switches that can scale as bandwidth demands grow while continuing to deliver improved reliability, higher performance and connectivity.
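To make the contrast concrete, here is a minimal Python sketch that models a star and a small grid as graphs and checks whether a server can still reach a storage array after one switch fails. The device names, the four-switch ring and the dual attachments are assumptions chosen for illustration, not a description of any real product.

```python
from collections import deque

def build_links(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

def reachable(links, start, failed=frozenset()):
    """Return the set of nodes reachable from start, skipping failed nodes."""
    if start in failed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in failed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Star: every server and array hangs off one central switch (hypothetical names).
star = build_links([
    ("server1", "core"), ("server2", "core"),
    ("array1", "core"), ("array2", "core"),
])

# Grid: four small switches in a ring; each device attaches to two of them.
grid = build_links([
    ("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw1"),
    ("server1", "sw1"), ("server1", "sw2"),
    ("array1", "sw3"), ("array1", "sw4"),
])

# Losing the central switch cuts the star; the grid survives any single switch loss.
star_ok = "array1" in reachable(star, "server1", failed={"core"})
grid_ok = all(
    "array1" in reachable(grid, "server1", failed={sw})
    for sw in ("sw1", "sw2", "sw3", "sw4")
)
print(star_ok, grid_ok)  # False True
```

In this toy model the star has a single point of failure, while the grid keeps at least one route open whichever switch is removed.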
What Is Grid Storage?
Based on current and proposed products, it appears that a grid storage system should include the following:
Modular storage arrays: These systems, built on serial ATA disks, are connected across a storage network. They can be block-oriented storage arrays or network-attached storage gateways and servers.
Common virtualization layer: Storage must be organized as a single logical pool of resources available to users.
Data redundancy and availability: Multiple copies of data should exist across nodes in the grid, creating redundant data access and availability in case of a component failure.
Common management: A single level of management across all nodes should cover the areas of data security, mobility and migration, capacity on demand, and provisioning.
Simplified platform/management architecture: Because common management is so important, the tasks involved in administration should be organized in modular fashion, allowing the autodiscovery of new nodes in the grid and automating volume and file management.
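The virtualization and redundancy requirements above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the GridPool class, its method names and the simple placement rule are hypothetical, and writes assume all nodes are online. It shows a single logical pool that keeps each block on more than one node so a read still succeeds after a node failure.

```python
class GridPool:
    """Toy logical pool that stores each block on several nodes (hypothetical)."""

    def __init__(self, node_names, copies=2):
        if copies > len(node_names):
            raise ValueError("need at least as many nodes as copies")
        self.nodes = {name: {} for name in node_names}  # node -> {key: data}
        self.online = set(node_names)
        self.copies = copies

    def _placement(self, key):
        # Deterministic placement: start at a position derived from the key
        # and take the next `copies` nodes in sorted order.
        names = sorted(self.nodes)
        start = sum(key.encode()) % len(names)
        return [names[(start + i) % len(names)] for i in range(self.copies)]

    def write(self, key, data):
        # Simplification: assumes every placement node is online at write time.
        for name in self._placement(key):
            self.nodes[name][key] = data

    def read(self, key):
        # Return the first copy held by a node that is still online.
        for name in self._placement(key):
            if name in self.online and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name):
        """Mark a node as offline."""
        self.online.discard(name)


pool = GridPool(["node-a", "node-b", "node-c"], copies=2)
pool.write("volume1/block0", b"payload")

# Take down the first node holding the block; the second copy still serves reads.
primary = pool._placement("volume1/block0")[0]
pool.fail(primary)
data = pool.read("volume1/block0")
print(data)  # b'payload'
```

Users of the pool address data by key only; which node actually answers is hidden behind the pool, which is the point of a common virtualization layer.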
Three Basic Benefits
Applying grid topology to a storage network provides three main benefits:
Reliability. A well-designed grid network is extremely resilient. Rather than providing just two paths between any two nodes, the grid offers multiple paths between each pair of storage nodes. This makes it easy to service and replace failed components with minimal impact on system availability.
Performance. The same factors that lead to reliability also can improve performance. Not requiring a centralized switch with many ports eliminates a potential performance bottleneck, and applying load-balancing techniques to the multiple paths available offers consistent performance for the entire network.
Scalability. It’s easy to expand a grid network using inexpensive switches with low port counts to accommodate additional servers for increased performance, bandwidth and capacity. In essence, grid storage is a way to scale out rather than up, using relatively inexpensive storage building blocks.
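The load balancing mentioned under performance, combined with the failover described under reliability, can be illustrated with a short Python sketch. It uses plain round-robin rotation over the available paths and skips any path marked as failed. The PathBalancer class and the path names are hypothetical, and real storage fabrics may well use more sophisticated policies.

```python
from itertools import cycle

class PathBalancer:
    """Round-robin selection over multiple paths, skipping failed ones (hypothetical)."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.healthy = set(self.paths)
        self._rotation = cycle(self.paths)

    def mark_failed(self, path):
        """Take a path out of rotation, for example while a switch is serviced."""
        self.healthy.discard(path)

    def next_path(self):
        if not self.healthy:
            raise RuntimeError("no healthy paths")
        # One full rotation is enough to find a healthy path if any exists.
        for _ in range(len(self.paths)):
            path = next(self._rotation)
            if path in self.healthy:
                return path
        raise RuntimeError("no healthy paths")


balancer = PathBalancer(["via-sw1", "via-sw2", "via-sw3"])
before = [balancer.next_path() for _ in range(3)]

# Simulate a switch failure: traffic keeps flowing over the remaining paths.
balancer.mark_failed("via-sw2")
after = [balancer.next_path() for _ in range(4)]

print(before)  # ['via-sw1', 'via-sw2', 'via-sw3']
print(after)   # ['via-sw1', 'via-sw3', 'via-sw1', 'via-sw3']
```

Adding capacity in this model is just a matter of appending another path to the list, which mirrors the scale-out idea: more small building blocks rather than one larger central switch.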
Kay is a Computerworld contributing writer. You can reach him at russkay@charter.net.