Tens of thousands of users are deploying open-source storage software in an effort to avoid pricey proprietary products such as array clustering and disk eraser applications and to get some long-term protection through the availability of source code.
Rafiu Fakunle, CEO of London-based open-source vendor Xinit Systems Ltd., said that users have downloaded more than 38,000 copies of its Openfiler NAS and SAN software from the sourceforge.net Web site. And Zmanda Inc., in Sunnyvale, Calif. — the company providing support for the open source backup software product Amanda — says that it supports 20,000 users worldwide.
Open source storage software is available to address a number of user needs, experts say. Amanda is a backup software product targeted at small and midsize businesses that allows the creation of a single master backup server to back up multiple hosts. DBAN (Darik’s Boot and Nuke) allows users to securely wipe the hard drives of their computers.
Other open-source storage software includes Lustre, OpenAFS and SAMBA, which are each network file systems used for different tasks. Lustre is used in large scale cluster computing while OpenAFS is deployed to create a single file space across all computers so that any computer can access a file on any other computer. SAMBA allows Linux servers to provide file and print services to Microsoft Windows clients.
Integrators like the Network Resource Group (NRG) in Manhattan, Kan., say they can deliver substantial savings for their customers using open-source storage software. Terry Hull, a principal network engineer with NRG, recently put together a VLAN for a client using iSCSI and open source storage software.
“The incremental costs for the solution were US$1,500 for the open source software versus $25,000 for a comparable configuration from Lefthand Networks and $75,000 for a Dell Fibre Channel SAN,” Hull says.
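To put Hull's figures side by side, the savings can be worked out as a percentage of each proprietary quote. This is only illustrative arithmetic on the three numbers he cites; the dictionary keys and the function name are made up for this sketch.

```python
# Incremental costs quoted by Hull, in US dollars.
costs = {
    "open source": 1_500,
    "Lefthand Networks": 25_000,
    "Dell Fibre Channel SAN": 75_000,
}


def savings_percent(baseline: float, alternative: float) -> float:
    """Return how much cheaper `alternative` is than `baseline`, in percent."""
    return (baseline - alternative) / baseline * 100


for name in ("Lefthand Networks", "Dell Fibre Channel SAN"):
    pct = savings_percent(costs[name], costs["open source"])
    print(f"Open source vs {name}: {pct:.1f}% lower incremental cost")
```

On these numbers the open source configuration comes in 94.0% below the Lefthand quote and 98.0% below the Dell quote. The comparison covers incremental purchase cost only, not the support and staffing costs discussed later in the article.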
However, experts remain skeptical about the wisdom of implementing open source storage software products. Jacob Farmer, the CTO of Cambridge Computer Inc., in Waltham, Mass., has some clients who implemented OpenAFS and Lustre in order to avoid the high cost of clustered file system software from a company like TerraScale Technologies Inc. in Montreal, Quebec, Canada.
Despite Cambridge Computer’s successes in deploying open source storage software, Farmer says, “Only those with highly skilled personnel were able to pull it off. The rest found that these products were too complex and had deceptively high costs of ownership.”
Key questions that users need to answer before using open source storage software are:
– What is open source storage software’s value proposition?
– What products are available for their specific needs?
– How stable and scalable are the products?
– What risks do they present?
– Under what circumstances should end users consider open source?
– What level of user skill is required to implement and support them?
– What software support options are available?
Value Proposition
The three primary value propositions for open source storage software are:
– Minimal or no upfront software costs
– Baseline features comparable to those of proprietary storage software products
– Availability of source code provides some level of long term protection
Open source storage software can be obtained in one of two ways – freely downloaded from a web site or purchased. For example, users interested in trying the open source Amanda backup software may either download a community edition from the sourceforge.net web site or purchase an enterprise edition from Zmanda’s web site. While the underlying source code should be the same in both instances, Zmanda provides a ‘sanity check’ of the enterprise edition code ensuring that the version that the user downloads and installs is fully tested and compiled at their labs.
DBAN (Darik’s Boot and Nuke) is an open source storage software product available in both free and commercial versions. DBAN meets the Department of Defense (DoD) 5220.22-M standard for data erasure by overwriting all disk locations three times. Network Resource Group’s Hull primarily supports Linux in his clients’ environments and says he uses DBAN on a “constant basis to clean hard drives or partitions for my clients.”
Other users, like David Ritchie, an IT manager with an Atlanta-based staffing firm, find that DBAN is not quite ready for their environments. With DBAN, which is often used on smaller servers with internal disk drives, Ritchie encountered some quirks when trying to erase data on volumes on external storage. “The amount of storage it displays is different than what is presented by the external storage array and the program runs single-threaded so you need to be strategic in how you deploy it,” he says.
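The idea of overwriting every location three times can be sketched in a few lines. This is not DBAN's code: it is a minimal illustration that works on an ordinary file rather than a whole disk, and the pass patterns (zeros, then ones, then random data) are an assumption based on how the three-pass method is commonly described. The function name `overwrite_file` and its parameters are hypothetical.

```python
import os


def overwrite_file(path, passes=(b"\x00", b"\xff", None), chunk_size=1024 * 1024):
    """Overwrite every byte of a regular file once per pass.

    Each entry in `passes` is either a single-byte pattern to repeat,
    or None to write random data. Returns the file size in bytes.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                data = os.urandom(n) if pattern is None else pattern * n
                f.write(data)
                remaining -= n
            # Push this pass out of the OS cache before starting the next one.
            f.flush()
            os.fsync(f.fileno())
    return size
```

A tool such as DBAN boots outside the installed operating system and writes to the raw drive, which is what lets it reach every disk location. A file-level sketch like this one cannot guarantee that, since a file system may keep other copies of the data.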
The AoE (ATA over Ethernet) protocol provides a method that is comparable to Fibre Channel for users to connect to external storage using the common 1Gbit/sec Ethernet protocol and network switches. As a registered IEEE protocol, AoE runs at a lower level in the Ethernet stack than TCP/IP, so it does not impact server performance in the same way that the iSCSI protocol does, yet it provides approximately the same level of performance as more expensive Fibre Channel SANs. Coraid’s CEO, Jim Kemp, says, “On a 1Gb Ethernet link, AoE can achieve 110MB of throughput without burdening the host processor.”
However, AoE does have a number of downsides. First, while drivers are freely available for Linux, FreeBSD and Solaris, Windows users still need to purchase an AoE driver such as Rocket Division Software’s Starport software. Second, AoE is not a routable protocol, so it cannot be used to access storage on other segments of the LAN. Third, storage products that support this protocol are only available from a few vendors such as Coraid. Finally, AoE requires newer network switches that provide flow control to maximize throughput and limit network collisions.
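Kemp's figure is easier to judge against the raw capacity of the link. The arithmetic below assumes that "110MB" means 110 megabytes per second and that a megabyte is 10^6 bytes; neither is stated in the quote, and the function names are illustrative.

```python
def line_rate_mb_per_s(gbit_per_s: float) -> float:
    """Convert a link speed in Gbit/sec to megabytes per second (1 MB = 10**6 bytes)."""
    return gbit_per_s * 1000 / 8


def link_utilization(throughput_mb_per_s: float, gbit_per_s: float) -> float:
    """Fraction of the raw link rate that a given throughput represents."""
    return throughput_mb_per_s / line_rate_mb_per_s(gbit_per_s)


raw = line_rate_mb_per_s(1)             # 125.0 MB/sec on a 1Gbit/sec link
share = link_utilization(110, 1)        # 0.88
print(f"Raw rate: {raw} MB/sec, quoted throughput uses {share:.0%} of it")
```

Under those assumptions, the quoted throughput is 88% of the link's raw 125MB/sec, before Ethernet framing overhead is taken into account.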
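The routing limitation follows from how an AoE frame is built: the AoE header sits directly after the Ethernet header, with no IP header for a router to act on. The sketch below packs such a header, assuming the field layout in the published AoE specification (EtherType 0x88A2, then version/flags, error, major and minor shelf address, command and tag). The MAC addresses and the function name are made up, and the field layout should be checked against the specification before being relied on.

```python
import struct

AOE_ETHERTYPE = 0x88A2  # EtherType registered for ATA over Ethernet


def build_aoe_header(dst_mac: bytes, src_mac: bytes, major: int, minor: int,
                     command: int, tag: int, version: int = 1, flags: int = 0) -> bytes:
    """Pack an Ethernet header followed by the common AoE header (24 bytes total)."""
    ver_flags = (version << 4) | (flags & 0x0F)  # version in high nibble, flags in low
    return struct.pack(
        "!6s6sHBBHBBI",
        dst_mac, src_mac, AOE_ETHERTYPE,  # Ethernet header: 14 bytes
        ver_flags,                        # AoE version and flags
        0,                                # error field, unused in a request
        major, minor,                     # shelf and slot address of the target
        command, tag,                     # command code and request tag
    )


# Hypothetical addresses: broadcast destination, made-up source MAC.
frame = build_aoe_header(b"\xff" * 6, bytes.fromhex("020000000001"),
                         major=0xFFFF, minor=0xFF, command=1, tag=1)
print(len(frame), frame[12:14].hex())
```

Nothing in the header carries an IP address, so the frame can only be delivered within a single Ethernet broadcast domain.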
The availability and accessibility of the source code is also a major advantage of open source storage software, especially for organizations that archive data for long periods of time. Charles Wegryzn is a developer for Retriever Technologies in Santa Fe, N.M., which is working on open source content management and digital archiving software. Wegryzn says it used to be fairly typical for users to buy software from IBM and for IBM to include the source code. “Then Microsoft came along and changed everything. With open source, we are going back to our roots of how computer software sales used to work.”
Cambridge Computer’s Farmer thinks archived data is the single largest value proposition for open source storage software. Long-term support for proprietary data formats, and the possibility that the vendors who provide those formats will go out of business, are valid user concerns. Farmer says, “With open source, at least you know you will have support in 25 years since you own the code.”
Open Source’s Hidden Costs
Despite the benefits open source storage software offers, users need to establish what the hidden costs of open source storage software are, experts say. The major factors that affect the total cost of ownership are:
– Product installation and configuration documentation
– Product support
– Breadth of product functionality
– Hardware and software interoperability
One hidden upfront cost with open source storage software is finding documentation and scripts that ease its installation and configuration. Coraid’s Kemp says, “The open source community is rich in information but it is a scavenger hunt to find exactly what you need.”
Because of these concerns, the commercial versions of Amanda, DBAN and Openfiler, available from Zmanda, Techway Services Inc. in Grapevine, Texas, and UK-based Xinit Systems, respectively, include documentation and install scripts. Protocols like AoE are included with the Linux 2.6.11 kernel or bundled with hardware like Coraid’s EtherDrive SR1520.
The costs for supporting open source storage software show up in different ways. Open source vendors are in general agreement that managing open source code and changes to it requires, as a rule of thumb, administrators with at least two to four years of experience.
“Users who like the idea of modifying open source code need to take a close look at the code to make certain that they can work with it and that it is within their skill set to modify,” Cambridge Computer’s Farmer says. Integrators like Terry Hull of the Network Resource Group (NRG) also encounter other issues with product support.
“Getting to the root of a problem when you have open source layer upon open source layer is rarely easy and the thing we (NRG) know we are giving up with open source storage software is a significant margin of management,” Hull says.
Another major concern for open source storage software is the depth of product functionality. Open source products like Amanda and OpenSMS, a policy-driven systems management storage software product, almost always have certain product restrictions. For example, Amanda will not back up Microsoft Windows hosts unless SAMBA, a file and print sharing utility, is first installed on the Windows host, and Amanda offers no media server option, so all backups must go through a central server. OpenSMS officially supports only Linux 2.4 and 2.6 running on an XFS file system, though its developers suggest it should work on other UNIX platforms and, with some porting, on JFS file systems. OpenSMS offers no integration with Microsoft Windows platforms.
The final major concern for enterprise shops is the lack of verifiable interoperability testing between the open source storage software and other hardware and software products in the user’s environment. NRG’s Hull notes that while interoperability is not a major concern for over 90% of his installs, he still never discounts the possibility of having to troubleshoot interoperability issues. Cambridge Computer’s CTO Farmer says, “Unless the software has comprehensive support services behind it such as Amanda does, one needs a really good reason to mess around with it since primary storage is such a vital piece of the IT infrastructure.”
Next Steps with Open Source
Open source storage software is still largely a work in progress that requires users to have years of practical experience as well as the time to research and support the products. However, there are initiatives under way to make it easier and more practical for users to deploy open source storage software.
First, vendors like Zmanda are helping to make a product like Amanda a more viable option for the average user. One of Zmanda’s goals for the next 12 months is to simplify the install, configuration and management of Amanda so that it can be set up and managed by a novice or entry-level administrator.
Second, open source projects like Aperi are creating a standards-based, open source software framework to manage storage networks. The Storage Management Initiative Specification (SMI-S), which defines a method for interoperable management in heterogeneous SANs, is now included with most storage software but provides users with no software to manage storage devices. Aperi goes the final mile and provides users with the open source storage software needed to manage storage that supports SMI-S.
In the meantime, Cambridge Computer’s Farmer offers this advice:
1. Find open source software that has a large community behind it, and make sure that you have the skills and time to modify and manage the code.
2. Go with a low-cost solution that offers an easy way to migrate your data out if need be.
3. Don’t be afraid to pay an enormous premium for a big-name vendor to avoid the risk.