Wednesday, November 12, 2008
This is why I'm not a server admin
XMission Outage
---------------

XMission experienced a serious outage while we were performing some standard UPS maintenance today. The outage affected all services and started at approximately 2:00 p.m. on Tuesday, November 11th. Network services for many were partially restored by about 2:30 p.m., but some other services required a lot of attention and took much longer.

Details
-------

About 40% of our data center, including our server room, suffered a power outage when a technician flipped a mislabeled breaker during some standard maintenance on one of our 3 UPS units. Although the power outage was momentary, servers and routers often respond very poorly to losing power and sometimes take extensive work to come back up. Unfortunately, such was the case today with many systems.

Seriously Affected Systems
--------------------------

** An important router, which some connections and servers rely on, required extensive attention from our network administrators.
** DNS (Domain Name System) was sporadic for some customers for over an hour.
** Email services were down for over 5 hours.
** Web hosting suffered the longest outage because our NetApp storage appliance, which houses all customer files and web sites, lost multiple hard drives. As a result, we are currently restoring files to our new NetApp 2020 from our November 9th backup, which will take many hours yet to complete. We recently purchased this new NetApp and were merely days away from getting it online.

Conclusion
----------

Today's outage was exacerbated by multiple systems responding poorly to losing power. In spite of the holiday, our systems administrators were on site within minutes and continue to work tirelessly to restore all services. In the end, we should have performed this maintenance on a day when our systems administrators were on site because problems can arise no matter how carefully you proceed.