SAN FRANCISCO – As IT contemplates the rapidly expanding universe of storage options, at least one detail has become clear: In the majority of infrastructures, most data just sits around, feeling lonely, while a small percentage is more or less constantly in use. Addressing this issue in an elegant and cost-saving way paves the road to lower capital expenditures for storage, as well as reduced power and cooling costs, with a side order of performance gains. What’s not to love?
Several storage tiering solutions are available today, but they tend to be on the upper end of the market. For most solutions, you choose SAS disks, perhaps with an older SATA-based unit that’s already in place; you might equip another array with solid-state disks for extra juice. Without any smarts to bind these together, you wind up with manual tiering: Old data sits on the SATA/SAS boxes, and the high-turnover data lives on the SSDs. It’s a workable solution, but requires care and feeding to maintain proper residence for each type of data.
Dell’s EqualLogic iSCSI SANs now offer automated tiering across arrays, even across arrays of disparate types. In the lab, I ran a Dell EqualLogic PS4100E with 12 SAS drives and a PS6100XVS with a hybrid disk set of eight SSDs and 16 SAS drives. Each array was equipped with redundant controllers and two 10GbE interfaces.
Multiple arrays, one system
The PS4100E and PS6100XVS were placed in the same storage group and managed as a single entity. The Dell EqualLogic management software allows the use of groups to maintain volumes that can be spread across multiple individual arrays. In the days of yore, it was important to maintain consistency between the arrays so that volumes wouldn’t be spread across faster disks in one unit and slower disks in another, but it’s no longer a requirement.
Because both arrays are members of a group with a single IP address and iSCSI gateway, hosts that bind to the various iSCSI LUNs perceive only a single storage host on the other side. iSCSI traffic is load balanced between the active interfaces on the controllers and the arrays themselves.
Further, working in concert with the automated storage tiering features, the controllers understand which storage blocks are experiencing the most turnover. The controllers move these “hot” blocks to and from the fastest storage, ensuring that the data needing faster access will not wind up on a slower array, but will be prioritized on the set of SSDs, should they be available. This capability is also available with traditional disks, but the inclusion of the SSDs (specifically, the hybrid PS6100XVS coupled with the lower-cost PS4100E) really shows off the benefits of these features in production workloads.
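To make the idea of "hot" blocks concrete, here is a minimal sketch in Python. It is not Dell's algorithm, which this article does not describe in detail; it simply assumes that heat is a raw access count and that the fast tier holds a fixed number of blocks. The function name, the block labels, and the capacity figure are all hypothetical.

```python
from collections import Counter

def place_blocks(access_log, ssd_capacity):
    """Split blocks into (fast, slow) sets by access count.

    Illustrative assumption: the hottest `ssd_capacity` blocks go to the
    fast tier and everything else stays on the slow tier.
    """
    heat = Counter(access_log)
    # Rank by descending access count; break ties by block id so the
    # result is deterministic.
    ranked = sorted(heat, key=lambda block: (-heat[block], block))
    fast = set(ranked[:ssd_capacity])
    slow = set(ranked[ssd_capacity:])
    return fast, slow

# Hypothetical access log: two busy blocks, one warm block, one cold block.
log = ["db-index"] * 50 + ["db-log"] * 40 + ["vm-boot"] * 5 + ["archive"]
fast, slow = place_blocks(log, ssd_capacity=2)
print("fast tier:", sorted(fast))
print("slow tier:", sorted(slow))
```

A real controller would also have to weigh the cost of moving data and the free space on each array, both of which this sketch ignores.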
Let’s envision a fairly traditional storage workload for a medium-size infrastructure. We have a bunch of hypervisors driving several hundred VMs, along with general-purpose file sharing, and a passel of databases that drive a Web application tier to provide critical line-of-business applications.
It’s common to satisfy all of these storage requirements through the same homogeneous storage array, but there are drawbacks. For instance, it means that the long-forgotten, never-again-to-be-accessed 2GB movie file that a user once stored in a home directory will sit right next to the bits that the core database servers are constantly reading and writing. In a perfect world, these files wouldn’t mix, but we all know that the world we inhabit is rife with similar examples.
With automated tiering, that neglected movie file will eventually wind up on the slowest disks in the data center, while the database volume will wind up on the fastest — without any administrative intervention required.
In practice, this process is as simple as setting up the disparate arrays in the same group and introducing the workload. As the controllers get an idea of which data is flowing where, they will automatically distribute the blocks throughout the arrays according to the demand.
In our example, this would mean that the database volumes and high-transaction VMs would wind up on the SSDs, while the movie file winds up on the slower drives in the PS4100E. As the load changes, the solution automatically adapts. If the user shared a link to that movie with the entire company and the movie began streaming to a few hundred people, the controllers would migrate it to faster storage. Thankfully, the Dell EqualLogic SAN HQ software provides the controls to ensure that an odd workload change such as this does not bump more critical data sets from the fastest disk.
Another benefit of automated tiering is that weekly or monthly workloads can be granted the benefit of fast disk only when they actually need it. As a monthly batch job progresses and the responsible databases start churning for a 24-hour period, they will reap the benefits of the SSD-backed storage, then fall back to the slower disk as their processing completes. Another example might be a virtual desktop infrastructure that experiences heavy loads during the morning log-ins and the evening log-offs, when desktop VMs are being quickly spun up and put away, respectively, with lower disk I/O utilization in the middle.
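The promote-then-fall-back behavior of a periodic workload can be illustrated with a decaying heat score. This is a sketch under stated assumptions, not a description of how the EqualLogic controllers actually measure activity: the halving decay factor, the promotion threshold, and the per-interval access counts are all invented for illustration.

```python
def update_heat(heat, accesses, decay=0.5):
    """Advance one interval: age the old heat, then add new accesses."""
    return heat * decay + accesses

def tier_for(heat, threshold=100.0):
    """Pick a tier from the current heat score (hypothetical threshold)."""
    return "ssd" if heat >= threshold else "slow"

# Hypothetical batch job: idle, busy for three intervals, then idle again.
workload = [0, 0, 400, 400, 400, 0, 0, 0, 0]

heat = 0.0
history = []
for accesses in workload:
    heat = update_heat(heat, accesses)
    history.append(tier_for(heat))

print(history)
```

Because old activity is aged rather than forgotten at once, the volume lingers on the fast tier for a couple of intervals after the job finishes before dropping back, which avoids bouncing data between tiers on every brief lull.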
Performance in numbers
Automated tiering isn’t completely new to Dell EqualLogic, but the ability to extend the tiering across multiple high-speed and low-speed arrays such as the PS6100XVS and the PS4100E puts the performance benefits in bold relief. Rather than having to add three or four arrays of differing storage types to fully realize the benefits, the PS6100XVS accomplishes much the same goal internally, as it can drive both SSD and SAS drives in a single 24-disk 2U chassis. And demonstrating the effects of storage tiering is relatively simple, requiring only a repetitive workload that extends for a reasonable period of time.
Using IOMeter to test the PS6100XVS and the PS4100E arrays was the simplest way to investigate the solution. When hit with a mix of streaming and random reads and writes, the throughput grew substantially in some cases and less so in others, depending on variables such as block size and the proportion of random reads and writes. As with any storage device, your mileage may vary depending on the workload, but my general-purpose testing shows that the combination of the PS6100XVS and the PS4100E should adapt very nicely to most infrastructures.
By extending automated storage tiering throughout the EqualLogic product line, Dell has made it both simpler and less costly to implement an extremely useful storage management capability in the data center.