Increasingly, organizations find themselves having to archive data in order to meet regulatory requirements and to avoid legal exposure. In Singapore, for example, MAS requires banks to keep records of financial information for seven years.
There has also been growing awareness of the need to archive email following a landmark case in March, in which the High Court ruled that agreements made by email for property transactions can be legally binding.
The case involved a deal in which the Singapore unit of German logistics firm Schenker was to lease a warehouse for two years from SM Integrated. The court upheld a lease agreement made via email and awarded S$500,000 (US$303,000) in damages based on loss of rental to the warehouse landlord.
A perfect storm of litigation requiring electronic data discovery, regulations governing data retention, disappearing backup windows (thanks to enormous data growth and nonstop business operations) and large-scale catastrophes has catapulted backup and recovery to IT’s head table.
And, reflecting its newfound status, backup and recovery is taking on a more sophisticated, grown-up name: data protection, which encompasses backup, recovery, archiving, retrieval, disaster recovery and business continuity. “This is a phenomenal time for storage, and particularly for data protection,” says Arun Taneja, president of Taneja Group.
According to IDC, the backup, archiving and replication software market will grow from US$4.3 billion in 2003 to US$6.58 billion by 2008, representing 54 percent of storage software expenditures.
While the term “data protection” covers a lot of ground, it’s the first four areas – backup, recovery, archiving and retrieval – that are currently of highest interest, says Pete Gerr, senior analyst at Enterprise Strategy Group.
Companies now realize they must be able to recover specific pieces of data from financial records, email, instant messaging logs and the like if they are subpoenaed as evidence in a legal case, Gerr says.
The bottom line is that backup, restoration and safe archiving of electronic data can no longer be a “hope it works” proposition, because very often it does not work. Forrester Research has said that 30 percent of all failed data recoveries are due to botched backups.
Jon Murray, regional program manager, EMC South, says botched backups typically occur because of the enormous and continuing growth of information in the production environment, a failure to match service levels to the right tier of storage, or the use of multiple backup servers streaming to multiple tape drives.
“Most backups simply stop in the middle of the backup cycle and businesses are left unaware that they do not have a full and complete copy of production data,” he says.
Most failed backups are due to human errors, not mechanical errors, says Jim Simon, director of Marketing, Asia Pacific, Quantum. He recommends the use of an automated backup system such as an autoloader or tape library working in conjunction with backup software like Veritas Backup Exec. “Automating the backup eliminates the need for a human to run the backup every night as well as reducing or eliminating the need to physically swap tapes during the backup process.”
Narayan M, senior consultant, Brocade Communications, suggests that one way to minimize backup errors is to centralize backups so that there is more control over the backup process. As additional safeguards, backups should be checked for successful completion and tested periodically to ensure that the backed-up data can be restored.
Indeed, a complete data protection strategy should include verifying that the data can be restored, says Edward Pearson, Tape Storage Solutions engineer at Exabyte. “Strategies should include checks on equipment reliability and restore performance through actual restore operations.”
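The restore check that Narayan and Pearson describe can be illustrated with a minimal sketch. It assumes a trial restore has already been written to a scratch directory and simply compares SHA-256 checksums of each production file against its restored copy. The function names and directory layout are hypothetical and not tied to any backup product mentioned here.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Compare every file under source_dir with its restored counterpart.

    Returns a list of problems; an empty list means the restore matched.
    (Hypothetical helper: a real check would also handle files that
    changed in production after the backup was taken.)
    """
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            # The backup stopped early or skipped this file.
            problems.append(f"missing: {rel}")
        elif file_digest(src) != file_digest(restored):
            # The file came back, but its contents differ.
            problems.append(f"mismatch: {rel}")
    return problems
```

A scheduled job could call verify_restore after each trial restore and alert an operator whenever the returned list is not empty, so that an incomplete backup is discovered before the data is actually needed.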
In the case of archived data, retrieval is a major pain point for many enterprises today, says Murray of EMC. He highlights a common example of companies having to recover an entire email server just to extract one email required urgently by a chief executive officer.
“In archiving, there must be a mechanism for an organization to search through massive amounts of archives, which could exist in various different media (disk or tape), filed according to the value of the information,” says Yeong Chee Wai, manager, Pre-sales Consulting, Symantec Singapore.
Archives are built to maintain and protect fixed content for long periods of time, to be retrieved for specific business uses, Murray points out. They are quite different from backups, which are generally short-term.
“Understanding the difference between archive and backup allows users to fit the best storage technology for each purpose maximizing reliability and cost in their data protection strategy,” says Pearson.
An archive is a complete set of data from a certain point in time, he explains. “An archive is meant to preserve data in the event it must be referenced at a later time.” In his view, a reliable and inexpensive storage medium would be a good choice for this application.
A backup, on the other hand, is a rolling updated copy of data that allows a restore of the most recent version in the case of data corruption. Implementing a backup copy of data on nearline storage enables quick recovery from an unplanned event and would be a good choice to keep data highly available, says Pearson.
Murray emphasizes that backup and archiving are designed to deliver against very specific and very distinct requirements. As such, many large organizations actually keep backup and archive separate as they are used for different purposes. He provides three simple rules of thumb to differentiate between backup and archive:
1) Backups are for recovery; archives are for retrieval.
2) Backups are short-term; archives are long-term.
3) Backups aren’t good for compliance; archives are.
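The contrast between a rolling, short-term backup and a long-term archive can be expressed as two retention rules. The sketch below is illustrative only: the function names, the number of backup copies kept and the retention period are assumptions made for this example, not rules stated by Murray or any vendor.

```python
from datetime import date


def years_before(day: date, years: int) -> date:
    """Return the same calendar day a given number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use the 28th.
        return day.replace(year=day.year - years, day=28)


def prune_backups(snapshots: list[date], keep_last: int) -> list[date]:
    """Rolling backup: keep only the most recent keep_last snapshots."""
    return sorted(snapshots)[-keep_last:] if keep_last > 0 else []


def expired_archives(archives: list[date], today: date, retain_years: int) -> list[date]:
    """Archive: a copy may be dropped only after its retention period ends."""
    cutoff = years_before(today, retain_years)
    return [d for d in sorted(archives) if d < cutoff]
```

With, say, keep_last=7 for nightly backups and retain_years=7 for archives, the backup set turns over within a week while every archive copy stays retrievable for the full retention period, which is why the two are usually managed separately.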
Simon of Quantum uses a banking scenario to illustrate the differences. Each day the bank should back up its data in case of a physical catastrophe such as a fire, or a virus attack, he says. At the same time, the bank should save an archive copy of the data for an indefinite period so that it will be possible for the bank to provide customers with a copy of their account history, even years later.
According to Simon, companies turn to tape for backup and archiving because it is removable and can be stored off-site. Other factors going for tape are that it has a long shelf life of 30 years, offers the lowest cost per gigabyte of any medium (for example, a Quantum LTO-3 cartridge has a cost per GB of US$8), and it is easy to clone, so multiple copies can be kept at multiple locations for added protection.
In recent years, however, disk has been encroaching on this space, with disk manufacturers increasing disk capacities and pushing down the cost per MB of disk storage. “For the first time, companies have a cost-effective alternative to tape for their enterprise storage needs,” says Robert Yang, Seagate’s senior director and general manager, Channel Sales and Marketing, Asia Pacific.
“They can now choose to backup or archive their information on online disk storage or nearline storage versus on tape. This enables them to still continue their business while gaining access to their information in a timely manner due to faster retrieval times,” he says.
He gives the example of Seagate’s newly-launched External Hard Drive, with a new FireWire 800 interface which offers faster access to backup or archival information. Having the information on disk enables companies to retrieve it rapidly and prevent any significant downtime or gap in the cus