If they develop it, will it work? Exploring Gmail

By Chris Mellor, Techworld.com

Google Inc.'s free e-mail service, Gmail, has received a huge amount of interest in the past week thanks mostly to its claim that it will offer 1GB of storage to each user.

It seems safe to assume that within a few days of the service going live, several million people will apply for an account. One gigabyte multiplied by several million could represent the world's largest-ever storage order. It could also represent the world's single largest privacy problem, because Google's business model means content-related ads will pop up as you read your incoming mail.

Microsoft Corp.'s Hotmail and Yahoo Inc. offer just a few megabytes of free e-mail storage each. Users pay for additional storage. Google is comprehensively disrupting this model of Web-hosted e-mail. A Mac web-hosting service, Spymac Network Inc., is already offering a free gigabyte of e-mail storage to its members as well.

If, say, one million users take up Gmail then, on the face of it, one petabyte (1PB, or one million gigabytes) of hard disk would be needed. Double that for redundancy, add in more for indexing, and some lucky supplier could find a 2.5PB HDD order in the in-tray.
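The back-of-envelope sum above can be sketched in a few lines of Python. The function name and parameters are illustrative only, and the 0.5PB indexing allowance is an assumption inferred from the gap between the doubled 2PB and the 2.5PB total; the article does not break it out.

```python
# Back-of-envelope estimate of the raw disk capacity discussed above.
GB_PER_PB = 1_000_000  # decimal units: one petabyte = one million gigabytes


def raw_capacity_pb(users, quota_gb=1, copies=2, index_overhead_pb=0.5):
    """Return the disk capacity in petabytes needed for `users` mailboxes.

    copies=2 models "double that for redundancy". index_overhead_pb=0.5 is
    an assumption inferred from the 2.5PB total, not a published figure.
    """
    mail_pb = users * quota_gb / GB_PER_PB
    return mail_pb * copies + index_overhead_pb


print(raw_capacity_pb(1_000_000))  # 2.5
```

Changing `users` to 10 million or 100 million shows how quickly the same assumptions run into tens or hundreds of petabytes.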

But Google doesn't work like this.

As we have described, Google operates a massively distributed server and storage design using clustered Linux x86 server nodes with one or two hard drives each. The servers store Google's web page index separately from the web documents themselves.

A Google spokeswoman confirmed: "Gmail is built on existing Google search technology, letting people quickly search over the large amount of information in their emails. Using keywords or the advanced search feature, Gmail users can find what they need, when they need it." The Gmail service, incidentally, is already up and running and all Google employees have their own "gmail.com" address.

But such a system architecture is unusual in a world where storage networking is the norm. It may also be a gamble for the search engine giant, with storage experts telling us that alternative methods are better when dealing with so much data.

Google's system can be defined as direct-attached storage (DAS), where, oddly enough, storage is attached directly to a computer. The vast majority of big storage systems in use are either network-attached storage (NAS) -- where a data server on a network provides storage accessed via the network -- or a storage area network (SAN) -- a high-speed subnetwork of shared storage devices.

Tom Clark, director for SAN technology at McData, thinks Google may have it wrong. "With individual servers with separate, direct-attached storage, there are inherent scaling problems over time and I would think increased administrative overhead as more servers are added," he told us. "The success of SANs to date is based on the ability to reduce administrative overhead through centralized sharing of storage assets, streamlining backup operations, gaining performance via SAN-based RAID, plus five-nines availability (meaning, 99.999 per cent availability) through enterprise-class storage."

He continued: "I would think Google would see a significant benefit from implementing a high-performance SAN, which would also scale more readily over time compared to NAS. Even global file systems such as Sistina benefit from having SANs as the shared storage infrastructure."
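For a sense of what the "five-nines" figure Clark cites means in practice, the permitted downtime can be worked out directly. This is a minimal sketch; the function name is illustrative and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Return the minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


print(round(downtime_minutes_per_year(99.999), 2))  # 5.26
```

In other words, 99.999 per cent availability leaves a little over five minutes of outage a year.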

Paul Ligget, sales and business development director for 3PARdata Inc. in Europe, is also sceptical. "There are massive benefits of centralized storage rather than DAS. These have been well documented. In terms of our systems, the major benefit to Google would be ease of provisioning new storage, ease of backup, snapshots to protect against corruption and allow rapid recovery, DR planning would be easier with our replication, rather than having to replicate each server."

Our understanding is that Google is proposing to treat an e-mail as a quasi-web page. It will be indexed and this index data added to the Gmail overall index. The e-mails themselves plus attachments will be stored as quasi-web documents.
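The general idea of indexing messages like documents, with the index held separately from the stored text, can be illustrated with a toy inverted index. This is only a sketch of the technique under simple assumptions (whitespace tokenizing, in-memory dictionaries); the class and method names are hypothetical and it says nothing about how Google actually implements Gmail.

```python
from collections import defaultdict


class MailStore:
    """Toy mail store: an inverted index kept separately from message bodies."""

    def __init__(self):
        self.documents = {}            # message id -> full message text
        self.index = defaultdict(set)  # keyword -> ids of messages containing it

    def add(self, msg_id, text):
        """Store the message and add each of its words to the index."""
        self.documents[msg_id] = text
        for word in text.lower().split():
            self.index[word].add(msg_id)

    def search(self, *keywords):
        """Return the sorted ids of messages that contain every keyword."""
        if not keywords:
            return []
        hits = [self.index.get(k.lower(), set()) for k in keywords]
        return sorted(set.intersection(*hits))


store = MailStore()
store.add(1, "Quarterly storage budget review")
store.add(2, "Storage array quote from vendor")
print(store.search("storage"))           # [1, 2]
print(store.search("storage", "quote"))  # [2]
```

A keyword lookup touches only the index; the message bodies are fetched by id afterwards, which is why the two can live on different machines.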

Google will use its existing search technology to enable users to find their e-mails using keywords or other search features. Its website states: "Each message is grouped with all its replies and displayed as a conversation." This is similar to a newsgroup thread.

The infrastructure needs will be massive, but Google currently operates more than 15,000 Linux servers in clusters of over a thousand machines. Wayne Rosing, Google's vice president of engineering, said in a report: "It will take many petabytes. The infrastructure is quite amazing ... and we don't even flinch at the thought of 10 million or even 100 million users."
