If they develop it, will it work?

Google Inc.’s free e-mail service, Gmail, has attracted huge interest in recent weeks, thanks mostly to its promise of 1GB of storage for each user.

It seems safe to assume that within a few days of the service going live (it is currently in beta), several million people will apply for an account. One gigabyte multiplied by several million could represent the world’s largest-ever storage order.

Microsoft Corp.’s Hotmail and Yahoo Inc. offer just a few megabytes of free e-mail storage each. Users pay for additional storage. Google is disrupting this model of Web-hosted e-mail.

If, say, one million users sign up for Gmail then, on the face of it, 1PB (one petabyte, or one million gigabytes) of hard disk would be needed. Double that for redundancy, add more for indexing, and some lucky supplier could find a 2.5PB HDD order in the in-tray.
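The back-of-the-envelope estimate above can be reproduced in a few lines. This is only a sketch: the one-million-user figure is the article's hypothetical, decimal units (1PB = 1,000,000GB) are assumed, and the 0.5PB indexing allowance is inferred from the 2.5PB total rather than being a number Google has stated. The function name and parameters are illustrative.

```python
# Back-of-the-envelope Gmail storage estimate (illustrative only).
GB_PER_PB = 1_000_000  # decimal units: one petabyte = one million gigabytes


def estimate_storage_pb(users, gb_per_user=1, copies=2, index_overhead_pb=0.5):
    """Return (raw, total) storage in petabytes.

    copies=2 models the "double that for redundancy" step.
    index_overhead_pb is an assumed allowance, chosen so that the total
    matches the article's 2.5PB figure for one million users.
    """
    raw_pb = users * gb_per_user / GB_PER_PB
    total_pb = raw_pb * copies + index_overhead_pb
    return raw_pb, total_pb


raw, total = estimate_storage_pb(1_000_000)
print(f"raw: {raw} PB, with redundancy and indexing: {total} PB")
```

Changing `users` to several million shows how quickly the naive figure grows, which is why the architecture question matters.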

But Google doesn’t work like this. Google operates a massively distributed server and storage design using clustered Linux x86 server nodes with one or two hard drives each. The servers store Google’s Web page index separately from the Web documents themselves.

A Google spokeswoman confirmed: “Gmail is built on existing Google search technology, letting people quickly search over the large amount of information in their emails. Using keywords or the advanced search feature, Gmail users can find what they need, when they need it.” The Gmail service, incidentally, is already up and running and all Google employees have their own “gmail.com” address.

But such a system architecture is unusual in a world where storage networking is the norm. It may also be a gamble for the search engine giant, with storage experts noting that alternative methods are better when dealing with so much data.

Google’s system can be defined as direct-attached storage (DAS), in which each disk is attached directly to a single computer. The vast majority of big storage systems in use are instead either network-attached storage (NAS), where a data server on a network provides storage accessed via the network, or a storage area network (SAN), a high-speed subnetwork of shared storage devices.

Tom Clark, director for SAN technology at McData, thinks Google may have it wrong. “With individual servers with separate, direct-attached storage, there are inherent scaling problems over time and I would think increased administrative overhead as more servers are added,” he said. “The success of SANs to date is based on the ability to reduce administrative overhead through centralized sharing of storage assets, streamlining backup operations, gaining performance via SAN-based RAID, plus five-nines (99.999 per cent) availability through enterprise-class storage.”

He continued: “I would think Google would see a significant benefit from implementing a high-performance SAN, which would also scale more readily over time compared to NAS.”

The understanding is that Google is proposing to treat an e-mail as a quasi-Web page. It will be indexed and this index data added to the Gmail overall index. The e-mails themselves plus attachments will be stored as quasi-Web documents.
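To make the idea of keeping an index separate from the stored messages concrete, here is a toy inverted index. It is a minimal sketch and not Google's implementation: the `MailStore` class, its method names and the whitespace-based keyword splitting are all assumptions made for illustration.

```python
from collections import defaultdict


class MailStore:
    """Toy store: message bodies and the keyword index are kept separately."""

    def __init__(self):
        self.documents = {}            # message id -> full message text
        self.index = defaultdict(set)  # keyword -> ids of messages containing it

    def add(self, msg_id, text):
        """Store the message, then record each of its words in the index."""
        self.documents[msg_id] = text
        for word in text.lower().split():
            self.index[word].add(msg_id)

    def search(self, *keywords):
        """Return sorted ids of messages containing every keyword."""
        if not keywords:
            return []
        matches = [self.index.get(k.lower(), set()) for k in keywords]
        return sorted(set.intersection(*matches))


store = MailStore()
store.add(1, "Quarterly storage order")
store.add(2, "Lunch order")
print(store.search("storage", "order"))
```

A search touches only the index; the message text is fetched by id afterwards, which is what would allow the two to live on different machines.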

The infrastructure needs will be massive, but Google currently operates more than 15,000 Linux servers in clusters of over a thousand machines.

As for whether Google will be able to cope with the huge demand, and whether its search technology and DAS approach to storage will revolutionize Web e-mail or leave a huge black spot on Google’s untarnished image, only time will tell. But one of the reasons Google is so popular is its tendency to achieve the unachievable.

