As the number of servers being virtualized grows, backing up and protecting them becomes more of a problem. It’s not enough for IT administrators to simply back up each virtual server and its data. Protection also is needed for the virtual server’s image — its operating system, configuration and settings — and the metadata on the physical server that identifies the virtual server’s relationship to networked storage.
A challenge for IT managers is choosing from among a variety of virtual server backup options, which include:
• Traditional agent-based backup software, which installs a software agent on each virtual machine to back it up.
• Serverless backup or consolidated backup, which offloads backup processing from virtual machines to a separate physical server.
• Snapshotting or cloning the virtual machines using software from a vendor such as Network Appliance or software included with the virtualization package to protect data and images.
• Writing scripts and executing them to quiesce (minimize the number of processes running on) the virtual machine, back up its contents and restore the virtual machine.
• A combination of agent-based software and cloning.
Each virtual machine backup approach has its advantages and disadvantages. Chief among the disadvantages is the effect on network performance and utilization.
While virtualization can result in better utilization of server resources, backing up all the newly created virtual machines concurrently for a physical server can overwhelm the network and take resources from applications running in other virtual machines.
Because virtualizing physical machines increases the number of servers contending for a single bus, Chris Wolf, senior analyst at Burton Group, suggests that users virtualize only physical servers that contain a PCI-Express (PCI-e) bus.
“The problem, especially in virtualization, is that when you have a shared I/O channel for all your PCI devices, the bandwidth of the bus becomes a really important issue. Traditional PCI devices can severely slow you down when you talk about six to 10 virtual machines sharing the same bus,” Wolf says. “PCI-Express should be the bus of choice for all new virtualization deployments, as it offers a transfer rate up to 16Gbps in full duplex, compared to PCI Extended, which has a maximum throughput of 4Gbps.”
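To put Wolf's figures in perspective, the short Python sketch below divides each quoted bus rate evenly among six and 10 virtual machines. The even split is a simplifying assumption made only for illustration; real contention depends on workload, and the function names are hypothetical.

```python
# Bus rates as quoted by Wolf, in Gbps. "PCI-X" is PCI Extended.
BUS_GBPS = {"PCI-Express": 16.0, "PCI-X": 4.0}


def per_vm_share_gbps(bus_gbps: float, vm_count: int) -> float:
    """Bandwidth per VM if the bus were split evenly (an assumption)."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return bus_gbps / vm_count


def share_table(vm_counts=(6, 10)):
    """Map each bus type to {vm_count: per-VM share rounded to 2 places}."""
    return {
        bus: {n: round(per_vm_share_gbps(gbps, n), 2) for n in vm_counts}
        for bus, gbps in BUS_GBPS.items()
    }


if __name__ == "__main__":
    for bus, shares in share_table().items():
        for n, share in shares.items():
            print(f"{bus}: {n} VMs -> {share:.2f} Gbps each")
```

Under this even-split assumption, 10 virtual machines on the slower bus would each see well under 1Gbps, which is the kind of squeeze Wolf describes.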
Another factor to consider is the cost associated with agent-based backup software used in a virtual environment. Since most vendors of backup software require a separate license for each virtual machine that is being backed up, as well as one for the physical machine hosting the virtual machines, licensing costs can increase quickly.
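The licensing arithmetic can be sketched in a few lines of Python. The sketch assumes the model described above, one license per virtual machine plus one per physical host; the function names are hypothetical and the price is a placeholder supplied by the caller, not a vendor figure.

```python
def licenses_needed(hosts: int, vms: int) -> int:
    """One license per VM plus one per physical host, per the model above."""
    if hosts < 0 or vms < 0:
        raise ValueError("counts must be non-negative")
    return hosts + vms


def licensing_cost(hosts: int, vms: int, price_per_license: float) -> float:
    """Total cost, assuming every license carries the same (placeholder) price."""
    return licenses_needed(hosts, vms) * price_per_license


if __name__ == "__main__":
    # A single host running eight VMs needs nine licenses,
    # where the same box needed one before it was virtualized.
    print(licenses_needed(hosts=1, vms=8))
```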
The advantage of agent-based backup software is that IT administrators are familiar with it, having deployed it for many years to back up the physical machines in their environment.
Increasingly, IT users are opting for a combination of methods to back up virtual servers. A common approach is to use agent-based and serverless backup for protecting data on virtual machines, combined with cloning or snapshot technology for protecting and recovering server images in the event of a hardware failure.
One user who has adopted this combined approach is Jim Klein, director of information services and technology for the Saugus Union School District in Santa Clarita, Calif. Klein uses the open source Xen virtualization hypervisor to virtualize the blade servers in his environment.
“We treat virtual machines like any other server by using backup agents from Bacula, an open source backup solution,” says Klein.
While Klein uses agent-based backup to protect the data on his virtual machines, he uses cloning technology to deal with server failures.
“For the base virtual machine images, we store them on the host computers and replicate them to the failed server or store them on a [network attached storage, or NAS] device,” Klein says. The metadata for Xen, which describes how servers attach to storage resources, is stored in a database called XenStore, which is included with the Xen hypervisor, and can be backed up easily by simply copying files to a backup device.
Art Beane, IT enterprise architect at IFCO in Houston, also has found that a combination of backup technologies works best for him. Beane uses NetApp’s SnapManager software to snapshot the data on his NetApp SAN and uses cloning to back up the servers attached to it.
He has six physical machines virtualized with VMware Infrastructure 3 into 23 virtual machines.
“Our backup strategy is common to both virtual and physical servers,” Beane says. “No persistent data is allowed on a server, only on the SAN. The SAN gets a snapshot backup every two hours and a full backup daily.” The system drives — both physical and virtual — in Beane’s servers get imaged weekly using Acronis’ TrueImage. In the event of a catastrophic server loss, the Acronis image can be restored either to a physical box or as a virtual machine.
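The schedule Beane describes implies a set of restore points for any given failure. The Python sketch below works out the most recent one at each layer. It assumes snapshots fall on even hours starting at midnight, the daily full backup runs at midnight and the weekly image is taken on Sunday; none of those timings come from Beane, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=2)  # interval from Beane's schedule
IMAGE_WEEKDAY = 6                       # assumption: Sunday (Monday is 0)


def _midnight(moment: datetime) -> datetime:
    """Start of the day containing `moment`."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)


def last_snapshot_before(failure: datetime) -> datetime:
    """Latest two-hour snapshot boundary at or before the failure."""
    start = _midnight(failure)
    completed = (failure - start) // SNAPSHOT_INTERVAL  # whole intervals
    return start + completed * SNAPSHOT_INTERVAL


def last_system_image_before(failure: datetime) -> datetime:
    """Midnight of the most recent image day at or before the failure."""
    days_back = (failure.weekday() - IMAGE_WEEKDAY) % 7
    return _midnight(failure) - timedelta(days=days_back)


def restore_points(failure: datetime) -> dict:
    """Most recent copy available at each layer of the strategy."""
    return {
        "snapshot": last_snapshot_before(failure),
        "full_backup": _midnight(failure),
        "system_image": last_system_image_before(failure),
    }


if __name__ == "__main__":
    failure = datetime(2024, 1, 10, 15, 30)
    for layer, when in restore_points(failure).items():
        print(f"{layer}: {when} (age {failure - when})")
```

With a two-hour snapshot interval, the data on the SAN is never more than two hours behind, while a restored system drive can be up to a week old. That gap is tolerable only because, as Beane notes, no persistent data lives on the servers themselves.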
This multilayered approach is the configuration Wolf most often recommends.
“For smaller dedicated application servers, running the agent inside the virtual machine is certainly ideal,” Wolf says. “That needs to be combined with a policy and change-control process for the creation of storage and snapshots as well.”
In this way, when it comes time to recover files, all an IT administrator needs to do is bring online the backup copy of the virtual machine snapshot and then restore the latest data files from the agent-based backup.
Another method for backing up virtual machines is to use serverless or consolidated backup technology. In consolidated backups, backup processing is offloaded from the virtual machine and physical server to a separate backup server called a proxy, thus helping to avoid any performance impairments.
Consolidated backup is most commonly deployed using a combination of VMware Consolidated Backup (VCB) and agent-based backup. VCB consists of a set of drivers and scripts that enable LAN-free backup of virtual machines.
In a consolidated backup, a job is created for each virtual machine and that job is executed on the proxy server. The pre-backup script takes a virtual-machine snapshot and mounts the snapshot to the proxy server directly from the SAN. The pre-backup script also quiesces the Windows NT file system (NTFS) inside the virtual machine. The backup client then backs up the contents of the virtual machine as a virtual disk image. Finally, the post-backup script tears down the mount and takes the virtual disk out of snapshot mode.
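The sequence of a consolidated backup job can be sketched in Python as follows. This is not VCB's actual interface: the class and method names are hypothetical, each step only records what it would do, and quiescing before the snapshot is an assumed ordering. The point of the sketch is that the post-backup teardown runs even if the backup step fails.

```python
class ProxyBackupJob:
    """Hypothetical stand-in for one per-VM job executed on the proxy server."""

    def __init__(self, vm_name: str):
        self.vm_name = vm_name
        self.log = []  # records each step instead of touching real storage

    def pre_backup(self):
        # Assumed ordering: quiesce first so the snapshot is consistent.
        self.log.append("quiesce file system")
        self.log.append("take snapshot")
        self.log.append("mount snapshot on proxy")

    def back_up(self):
        self.log.append("back up virtual disk image")

    def post_backup(self):
        self.log.append("unmount snapshot")
        self.log.append("leave snapshot mode")

    def run(self):
        """Run pre-backup, backup and post-backup; teardown always runs."""
        self.pre_backup()
        try:
            self.back_up()
        finally:
            self.post_backup()
        return self.log


if __name__ == "__main__":
    # One job per virtual machine; the VM names here are placeholders.
    for name in ("vm-01", "vm-02"):
        print(name, ProxyBackupJob(name).run())
```

Putting the teardown in a `finally` clause is a design choice of this sketch: a virtual disk left in snapshot mode after a failed backup would otherwise need manual cleanup.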
Taking snapshots and cloning virtual machine images have many advantages. As with agent-based backups, most IT administrators are familiar with these techniques. Snapshot and cloning capability also is included in many virtualization packages, such as those from VMware and XenSource, and with many traditional backup tools.
“There’s not a one-size-fits-all solution,” Wolf says. “When you are dealing with large amounts of data such as in databases … I prefer to configure virtual machines to use a raw LUN [logical unit] so the virtual machine is not using a virtual hard disk, but actually mapping to actual storage resources on the SAN. That opens up the flexibility to use serverless backups, some of your snapshotting agents and all of the capabilities of your backup software that exist in the physical world.”