FRAMINGHAM 1 MARCH 2011 - Two days after tens of thousands of Google Gmail users discovered that their e-mail, chat histories and contacts had disappeared from their accounts, the problem still has not been fixed.
Google announced Monday night that the Gmail issue, which struck some users on Sunday, was caused by a bug in a storage software update. While Google had said Monday afternoon that the issue would be resolved for all users within 12 hours, the company now says that the problem has not been fixed but will be "soon."
The good news is that Google reported that users' e-mails, contacts, folders and settings have not been lost. They are retrievable and should be back in users' accounts once the problem is resolved.
"Imagine the sinking feeling of logging in to your Gmail account and finding it empty," wrote Ben Treynor, Google's vice president of engineering and site reliability czar, in a blog post. "That's what happened to 0.02% of Gmail users yesterday, and we're very sorry. The good news is that email was never lost and we've restored access for many of those affected. Though it may take longer than we originally expected, we're making good progress and things should be back to normal for everyone soon."
Estimates of the number of users affected have varied. At first, the company estimated that 0.08% of its users, or 150,000 people, had been affected. Later on Monday, Google reduced that estimate to 0.02%, or 35,000 people.
On Monday afternoon, a Google spokesman told Computerworld that engineers had restored service to about one-third of those affected.
In his blog post Monday night, Treynor addressed the question of how such a problem could occur, given that Google keeps multiple copies of users' data in multiple data centers.
"Well, in some rare instances software bugs can affect several copies of the data," Treynor wrote. "That's what happened here. Some copies of mail were deleted, and we've been hard at work over the last 30 hours getting it back for the people affected by this issue."
He added that Google also backs up users' data to tape as a further safeguard.
"To protect your information from these unusual bugs, we also back it up to tape," he said. "Since the tapes are offline, they're protected from such software bugs. But restoring data from them also takes longer than transferring your requests to another data center, which is why it's taken us hours to get the email back instead of milliseconds."
Treynor said a detailed incident report will be posted to Google's Apps Status Dashboard.