I plan to collect reports here over time, and I welcome links to other write-ups of outages and how to survive them. My naming convention is {vendor} {primary scope} {cause}. The scope may be global, a specific region, or a single zone within a region. In some cases there are secondary impacts with a wider scope but shorter duration, such as regional control planes becoming unavailable for a short time during a zone outage.
This post was written while researching my AWS re:Invent talk.
Slides: http://www.slideshare.net/AmazonWebServices/arc203-netflixha
Video: http://www.youtube.com/watch?v=dekV3Oq7pH8
November 18th, 2014 - Azure Global Storage Outage
Microsoft Reports
http://azure.microsoft.com/blog/2014/11/19/update-on-azure-storage-service-interruption/
http://azure.microsoft.com/blog/2014/12/17/final-root-cause-analysis-and-improvement-areas-nov-18-azure-storage-service-interruption/
January 10th, 2014 - Dropbox Global Outage
Dropbox Report
April 20th, 2013 - Google Global API Outage
Google Report
February 22nd, 2013 - Azure Global Outage Cert Expiry
Azure Report
December 24th, 2012 - AWS US-East Partial Regional ELB State Overwritten
AWS Service Event Report
http://aws.amazon.com/message/680587/
Netflix Techblog Report
http://techblog.netflix.com/2012/12/a-closer-look-at-christmas-eve-outage.html
October 26th, 2012 - Google AppEngine Network Router Overload
Google Outage Report
October 22nd, 2012 - AWS US-East Zone EBS Data Collector Bug
AWS Outage Report
Netflix Techblog Report
June 29th, 2012 - AWS US-East Zone Power Outage During Storm
AWS Outage Report
Netflix Techblog Report
June 13th, 2012 - AWS US-East SimpleDB Region Outage
AWS Outage Report
February 29th, 2012 - Microsoft Azure Global Leap-Year Outage
Azure Outage Report
August 17th, 2011 - AWS EU-West Zone Power Outage
AWS Outage Report
April 2011 - AWS US-East Zone EBS Outage
AWS Outage Report
Netflix Techblog Report
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html