Google fixes lengthy, widespread Gmail malfunction

A 10-hour disruption to email delivery and attachment downloads hit close to 50 percent of Gmail users

Google has fixed a Gmail glitch that lasted about 10 hours and hit close to 50 percent of the webmail service's users, ending one of the longest, most widespread Gmail disruptions in years.

Affected users endured email delivery delays and difficulties downloading attachments because of a still-unexplained bug that Google first acknowledged at around 10:30 a.m. U.S. Eastern Time Monday. The company declared it patched at 10 p.m.

On its Google Apps Status site, the company pegged the start of the problem at close to 9 a.m. and its resolution at 6:30 p.m.

The issue affected individuals who use the free version of Gmail as well as businesses, schools and government agencies that pay for it as part of the Google Apps cloud collaboration and email suite.

In the U.S., the disruption covered most of the workday on both coasts, which heightened the impact of the bug for millions.

People who depend on Gmail for critical tasks took to Twitter, discussion groups and other online forums to express their frustration.

The last time Google gave an official figure for active Gmail users was more than a year ago, when it said there were more than 425 million.

Assuming conservatively that the service now has about 450 million active users, Monday's disruption likely affected more than 200 million users, plus senders on other email platforms whose messages weren't received in a timely fashion.

Google said that the severity and length of the impact varied among users. About 29 percent of messages received were delayed by an average of 2.6 seconds, but some mail was "severely delayed."

"We apologize for the duration of today's event; we're aware that prompt email delivery is an important part of the Gmail experience, and today's experience fell far short of our standards," the company wrote on the status site.

The incident is a big deal for both Google and those affected, but it shouldn't on its own dissuade CIOs from using the suite, said Forrester Research analyst TJ Keitt.

"Data centers hosting multi-tenant collaboration services aren't immune to disruptions. So, when they happen, the way to judge the vendor is on how well they identify and resolve the problem, and then inform the public to how they resolved the issue," Keitt said.

By that standard, Google's updates during the incident could have been more transparent and detailed about the nature of the problem and the strength of the fix that was put in place, he said via email.

"They have clearly not communicated this publicly, so I hope they've been forthcoming with this information with their clients," Keitt said.

Meanwhile, Matthew Cain, a Gartner analyst, said the incident raises fundamental questions about what is considered downtime, especially as it relates to service-level agreements from cloud application vendors.

"If message delivery is delayed 15 minutes, is that considered downtime? What about 2 hours?" he said via email. "The move to cloud email puts a spotlight on these essential questions about how to meter and compensate for subpar messaging performance that is not traditionally classified as 'downtime.'"

Juan Carlos Perez covers enterprise communication/collaboration suites, operating systems, browsers and general technology breaking news for The IDG News Service. Follow Juan on Twitter at @JuanCPerezIDG.
