Google's reliability team was on Reddit's AMA when Gmail went down

By VCPOST Staff Reporter

Jan 24, 2014 05:50 PM EST

An hour ago, four engineers from the team responsible for maintaining Google services participated in Reddit's Ask Me Anything (AMA) thread. At about the same time, Gmail and Google+ experienced a worldwide outage, according to TechCrunch.

The team is called the Site Reliability Engineering (SRE) team. According to a previous AMA held last year, the SRE team is responsible for Google's round-the-clock operations, including the technical infrastructure of Gmail, Google+, Maps, and other Google products, the report detailed.

The timing of the outage was almost certainly a coincidence, and a funny one at that. Still, when Google's services went down, the SRE team worked together both to contribute answers to the AMA thread and to restore the affected services, which came back after 50 minutes, TechCrunch reported.

During the AMA, Reddit user notcaffeinefree asked the team: "Sooo....what's it like there when a Google service goes down? How much freaking out is done?"

Dave O'Connor responded: "Very little freaking out actually, we have a well-oiled process for this that all services use - we use thoroughly documented incident management procedures, so people understand their role explicitly and can act very quickly. We also exercise [these] processes regularly as part of our DiRT testing. Running regular service-specific drills is also a big part of making sure that once something goes wrong, we're straight on it."

© 2024 VCPOST, All rights reserved. Do not reproduce without permission.
