The City of Saskatoon’s technology infrastructure faces the same demands as any given city of 200,000. Peter Farquharson, manager of technology integration for the City’s corporate services department, says his situation is a little uncommon in that he doesn’t just own the network infrastructure.
“I own the (database administrators), which makes me unique in some organizations,” he says. “The DBAs are on my side of the fence, not the applications side. And we have this sort of see-through brick wall between the applications group and technology. We have this separation of duties, so the applications people can’t change or update any production data.”
Data patches and record changes go through the DBA group, “and that keeps our auditing group very happy, because we’ve got this division of duties.” With 13 TB of data on the floor and perhaps 100 servers, configuration changes were a chore. “We’re very much a Microsoft cluster shop,” Farquharson says.
“There’s no Linux in place whatsoever and there’s no mainframe. Everything is Microsoft clustering. In that realm you need shared disks…to upgrade it, migrate it, change its size, whatever, was a nightmare.” When IBM launched its SAN Volume Controller, a Fibre Channel storage virtualization product, the City jumped at the chance, buying its first SVC in March 2004. “I think we were definitely the first SVC in Western Canada, if not Canada,” says Farquharson. “We didn’t do a full cost justification. It’s just part of our infrastructure management.”
Virtualization saves the department headaches and money. Staff can reallocate storage resources on the fly and move data to faster or slower drives depending on how it’s used. Applications keep running while the data is being shuffled.
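The idea of shuffling data underneath a running application can be pictured as a mapping table between the volume an application sees and the physical disks behind it. The following is a minimal sketch under that assumption; it is not how IBM's SAN Volume Controller is implemented, and every class, method and disk name in it is hypothetical.

```python
# Illustrative sketch only; all names are hypothetical.

class PhysicalDisk:
    def __init__(self, name, num_extents):
        self.name = name
        self.extents = [None] * num_extents  # None marks a free extent

    def allocate(self):
        """Reserve the first free extent and return its index."""
        index = self.extents.index(None)  # raises ValueError if the disk is full
        self.extents[index] = b""
        return index


class VirtualVolume:
    """The volume the application sees; the mapping underneath can change."""

    def __init__(self, disk, num_extents):
        # Each virtual extent maps to a (disk, physical extent index) pair.
        self.mapping = [(disk, disk.allocate()) for _ in range(num_extents)]

    def write(self, virtual_index, data):
        disk, physical_index = self.mapping[virtual_index]
        disk.extents[physical_index] = data

    def read(self, virtual_index):
        disk, physical_index = self.mapping[virtual_index]
        return disk.extents[physical_index]

    def migrate(self, target_disk):
        """Copy every extent to target_disk, then repoint the mapping."""
        for virtual_index, (old_disk, old_index) in enumerate(self.mapping):
            new_index = target_disk.allocate()
            target_disk.extents[new_index] = old_disk.extents[old_index]
            self.mapping[virtual_index] = (target_disk, new_index)
            old_disk.extents[old_index] = None  # free the old extent


fast = PhysicalDisk("fast-15k", 4)
slow = PhysicalDisk("slow-10k", 4)
volume = VirtualVolume(fast, 2)
volume.write(0, b"payroll")
volume.write(1, b"ledger")
volume.migrate(slow)  # the application keeps using the same virtual indexes
```

After the migration, reads against the same virtual indexes return the same data, although every extent now lives on the slower disk.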
Computer Associates’ BrightStor storage resource management software lets staff analyze storage use and decide which data belongs in which performance tier. The SVC’s appliance-based approach is what John Sloan, senior research analyst with Info Tech Research Group, calls “Big Iron” storage virtualization. He also calls it “neutron bomb virtualization.”
“You know how the neutron bomb’s supposed to knock out your enemies, but leave their infrastructure standing?” Sloan says. The Big Iron approach — that of IBM and Hitachi Data Systems, among others — allows vendors with a virtual storage appliance to control other vendors’ storage infrastructure.
ISOLATING COMPLEXITY
Storage virtualization is about abstracting the management of storage from the hardware, and isolating the user from the complexity of the storage network. It’s a similar concept to processor virtualization, but it doesn’t benefit from the standardized x86 architecture that processor virtualization layers on, Sloan says.
“The reason that storage virtualization is a little harder to get a handle on is because it isn’t as standardized and open and non-proprietary as the server side,” Sloan says.
“When storage area networks became a reality, it really wasn’t this open, malleable cloud of storage behind the switch. It was a proprietary array from a vendor and that array had its own APIs and management software. And if you were a larger company and you were going to buy more than one array from more than one vendor, that would add complexity, because you couldn’t just manage all those arrays as one resource. You’d have to manage them each separately.”
Closer than the Big Iron approach to “that elusive ideal,” says Sloan, is the iSCSI clustered storage virtualization fostered by IP SAN vendors like LeftHand Networks and EqualLogic.
“It’s more or less based on standardized hardware, where you have SATA drives and usually a Xeon processor and you’re connecting with Ethernet,” says Sloan. “So it’s using more commodity hardware and virtualizing it behind an Ethernet switch.”
Whichever the approach, there are certain capabilities a virtualization system needs, according to Jag Chahal, technology business consultant, network attached storage, with EMC Canada.
Chief among them are capacity management, file management, global namespace management and migration and consolidation. Picture a company with file servers across the country, managing various types of documents. One or two servers might be under-utilized while the head office server is being hammered.
“By setting policies and rules for capacity management, you can move around that data transparently,” thus balancing the load and making more efficient use of the storage network, Chahal says.
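A capacity policy of the kind Chahal describes could look something like the sketch below. It is only an illustration: the 80 per cent high-water mark, the server names and the `rebalance` function are assumptions, not part of any EMC product.

```python
def rebalance(servers, capacity_gb, high_water=0.8):
    """Move files off servers that exceed the high-water utilization mark.

    servers maps a server name to a dict of {file name: size in GB}.
    Returns the moves made as (file, source, destination) tuples.
    """
    limit = high_water * capacity_gb

    def used(name):
        return sum(servers[name].values())

    moves = []
    for source in servers:
        # Consider the largest files first; stop once the source is under the limit.
        largest_first = sorted(servers[source].items(), key=lambda item: -item[1])
        for file_name, size in largest_first:
            if used(source) <= limit:
                break
            target = min(servers, key=used)  # least-loaded server
            if target == source or used(target) + size > limit:
                continue  # this file does not fit anywhere useful
            servers[target][file_name] = servers[source].pop(file_name)
            moves.append((file_name, source, target))
    return moves


# Hypothetical file servers, each with 100 GB of capacity.
servers = {
    "head-office": {"a.doc": 50, "b.doc": 30, "c.doc": 15},
    "branch-1": {"d.doc": 10},
    "branch-2": {},
}
moves = rebalance(servers, capacity_gb=100)
```

In this example the head office server starts at 95 GB used, so its largest file is moved to the empty branch server, which brings it back under the mark.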
Policy-based storage rules also play a role in information lifecycle management. The virtualization system can automatically shuffle data between tiers and platforms based, for example, on the age of the document or when it was last accessed. Virtualization also manages content-addressable storage, which hashes files for secure archiving, ensuring inactive files haven’t been changed.
“Once it becomes inactive, and the only requirement for us is to read the file but make sure it’s secure and managed, then the content-addressable storage is a technology that allows you to store a single instance of that file and keep it for as long as you need to, then expunge it,” says Iain Anderson, EMC Canada’s client director.
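The single-instance and tamper-checking behaviour Anderson describes follows from using a hash of the content as its address. Here is a minimal sketch of that idea; the choice of SHA-256 and the `ContentAddressableStore` class are assumptions for illustration, since the article does not say which hash or interface EMC's products use.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory store that addresses each file by a hash of its content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same address, so only one copy is kept.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read to confirm the archived content has not changed.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content no longer matches its address")
        return data

    def expunge(self, address: str) -> None:
        del self._blobs[address]


store = ContentAddressableStore()
first = store.put(b"2004 council minutes")
second = store.put(b"2004 council minutes")  # same content, same address
```

Both `put` calls return the same address and the store holds one copy. Calling `store.expunge(first)` removes it once the retention period is over.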
TIERED DATA
In Saskatoon, Farquharson tiers the City’s data based on performance needs. “We’ve got various speeds and sizes of disk drives. We do have some 15K, 36 GB drives. We’ve some 146 GB 10K drives.
“We’ve got some applications that are highly transactional; they’ll wind up on the 36 GB, 15K drives. Something that’s more archival will end up on relatively slow storage,” he says.
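A placement rule along these lines, combining how transactional the data is with when it was last touched, might be sketched as follows. The tier names, the one-year threshold and the `choose_tier` function are assumptions for illustration, not the City's actual policy.

```python
from datetime import datetime, timedelta

# Hypothetical tier labels, fastest first.
FAST = "15K rpm, 36 GB drives"
STANDARD = "10K rpm, 146 GB drives"
SLOW = "slow archival storage"


def choose_tier(transactional, last_accessed, now, archive_after=timedelta(days=365)):
    """Pick a storage tier for a data set.

    Data untouched for longer than archive_after goes to the slow tier,
    whatever the application type; otherwise transactional data gets the
    fastest drives and everything else the middle tier.
    """
    if now - last_accessed > archive_after:
        return SLOW
    if transactional:
        return FAST
    return STANDARD


now = datetime(2006, 10, 1)
billing = choose_tier(True, now - timedelta(days=1), now)
shared_docs = choose_tier(False, now - timedelta(days=30), now)
old_permits = choose_tier(False, now - timedelta(days=400), now)
```

Here the billing data lands on the 15K drives, the shared documents on the 10K drives and the year-old permits on the slow tier.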
The next tier, likely a 2007 purchase, will be a virtual tape library. Another SVC, bought in 2005, is being brought online so the City can split the data centre in two and run production on both servers. “It’s like a recovery site,” Farquharson says.
“We’ll run one node of the cluster on one site, the other node on the other site. We’ll have storage in both sites, and should we have this building issue, like something burns down, for the less critical applications, we’ll have storage available, we’ll just restore them from backup and bring them back up. For the extremely critical applications, we’ll probably have mirrored storage in both locations anyway.”
The delay isn’t technological. It’s due to real estate. The City had lined up a site for the second data centre, but a deal with the federal government fell through. So while the two SVCs are being organized as two separate data centres, they’re actually being operated in the same building, Farquharson says.
“All the racks are being built for Building A and Building B. Very hypothetically, it should be a matter of, we come in here on a long weekend, we shut down a bunch of equipment, we cart it across to the other location, we turn it back on, and everything should work,” he says. “It shouldn’t be tremendously more difficult than that.”
To an extent, says Info Tech’s Sloan, the storage virtualization market in Canada is limited by the proportion of mid-sized to large enterprises. The virtualization pitch, after all, is managing multiple SANs.
“A lot of larger enterprises have encountered the problem of multiple SANs, because what they have is multiple switch fabrics going to multiple arrays and the arrays are from different vendors and they’re not management-compatible with each other,” Sloan says.
“Midsize enterprises don’t have this issue with multiple SANs. Maybe they just have one SAN, and the one SAN is built around an array, and the array is already carved up into multiple logical units. So they haven’t hit that problem.”
But there’s an interesting wrinkle that might have an impact on the virtual storage market, and it comes from the processor virtualization side of the fence. VMware’s latest version of its processor virtualization product, VMware Infrastructure 3, can use an iSCSI SAN as a storage location for its virtual processors.
“I think that’s where the iSCSI-type storage virtualization is bound for some growth going forward,” Sloan says. “You’ve got a company now that can say, ‘We can look at building multiple virtual servers with VMware and we can connect that to a cluster of storage servers that are using iSCSI and connect it all with Ethernet.’ And, in essence, you’re now doing utility infrastructure because you’re managing both storage volumes and processors that are virtual.”