If they develop it, will it work? Exploring Gmail

April 7, 2004, 09:37 AM —  Techworld.com — 

Google Inc.'s free e-mail service, Gmail, has received a huge amount of interest in the past week thanks mostly to its claim that it will offer 1GB of storage to each user.

It seems safe to assume that within a few days of the service going live, several million people will have applied for an account. One gigabyte multiplied by several million could represent the world's largest-ever storage order. It could also represent the world's single largest privacy problem, because Google's business model means content-related ads will appear as you read your incoming mail.

Microsoft Corp.'s Hotmail and Yahoo Inc. offer just a few megabytes of free e-mail storage each. Users pay for additional storage. Google is comprehensively disrupting this model of Web-hosted e-mail. And already a Mac web-hosting service, Spymac Network Inc., is also offering a free gigabyte of e-mail storage to its members.

If, say, one million users take up Gmail then, on the face of it, 1PB (one petabyte, or one million gigabytes) of hard disk would be needed. Double that for redundancy, add in more for indexing, and some lucky supplier could find a 2.5PB HDD order in the in-tray.
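As a rough check on that back-of-the-envelope figure, the arithmetic can be sketched in a few lines of Python. The function name and the 25 per cent indexing overhead are our own assumptions; the overhead was chosen only so that the total matches the 2.5PB estimate, and no exact breakdown was given.

```python
# Decimal units, as in the estimate above: 1PB = one million GB.
GB_PER_PB = 1_000_000


def raw_storage_pb(users, gb_per_user=1, copies=2, index_overhead=0.25):
    """Estimate raw disk capacity in petabytes.

    copies=2 doubles the primary data for redundancy.
    index_overhead=0.25 is an assumed figure (not from any vendor or
    from Google) for extra space used by indexing, applied on top of
    the replicated data.
    """
    primary_pb = users * gb_per_user / GB_PER_PB
    return primary_pb * copies * (1 + index_overhead)


# One million users at 1GB each.
print(raw_storage_pb(1_000_000))  # 2.5
```

Changing `users` to several million scales the result linearly, which is why the order could be so large.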

But Google doesn't work like this.

As we described, Google operates a massively distributed server and storage design using clustered Linux x86 server nodes with one or two hard drives each. The servers store Google's web page index separately from the web documents themselves.

A Google spokeswoman confirmed: "Gmail is built on existing Google search technology, letting people quickly search over the large amount of information in their emails. Using keywords or the advanced search feature, Gmail users can find what they need, when they need it." The Gmail service, incidentally, is already up and running and all Google employees have their own "gmail.com" address.

But such a system architecture is unusual in a world where storage networking is the norm. It may also be a gamble for the search engine giant, with storage experts telling us that alternative methods are better when dealing with so much data.

Google's system can be defined as direct-attached storage (DAS), where, oddly enough, storage is attached directly to a computer. The vast majority of big storage networks in use are either network-attached storage (NAS), where a data server on a network provides storage accessed via the network, or storage area networks (SANs), which are high-speed subnetworks of shared storage devices.

Tom Clark, director for SAN technology at McData, thinks Google may have it wrong. "With individual servers with separate, direct-attached storage, there are inherent scaling problems over time and I would think increased administrative overhead as more servers are added," he told us. "The success of SANs to date is based on the ability to reduce administrative overhead through centralized sharing of storage assets, streamlining backup operations, gaining performance via SAN-based RAID, plus five-nines availability (meaning, 99.999 per cent availability) through enterprise-class storage."

He continued: "I would think Google would see a significant benefit from implementing a
