
Authorize.net categorizes downtime events as 'a perfect storm'

July 6, 2009, 06:58 AM

Authorize.net has posted an explanation of what happened over the weekend to bring its services down. Users of the service can access the information from the 'Announcements' menu in the Authorize.net dashboard.

In this document, they call the situation a "perfect storm" of events. The fire at Fisher Plaza happened late at night (11:10 pm PT on July 2nd) at the start of a long holiday weekend when many Authorize.net IT engineers were off on holiday, and it took time to get them all back to work on the problem. The Seattle Fire Department wouldn't allow operation of the backup generators due to their proximity to the fire location, nor would they allow customers into the damaged building to access hardware. These factors were outside of Authorize.net's control.

Of more concern is the question of a backup data center. Authorize.net states that they were approaching the capacity of their current backup data center and were in the midst of transitioning to a new one: a true "hot" site (in other words, real-time synchronization), so that the Authorize.Net platform could be switched from one data center to the other "on the fly." When the fire took out the primary data center, they attempted to fail over to the new, still-in-testing backup data center and encountered "a number of unanticipated errors." They offer no explanation as to why they tried to fail over to the new backup data center rather than the old (presumably well-tested) one.

The document finishes with a section entitled 'Lessons':

Even as our engineering and operations teams continue to ensure normal operations, the postmortem process is already under way. We are examining all aspects of this outage and implementing steps to mitigate future risks. Over the next weeks, we will be completing the work to ensure that we have two fully functional, synchronized hot sites. Failing over from one to the other will occur in a matter of seconds. Steps are also being taken to ensure that we have the ability to implement emergency communication by distributing our voice, e-mail and Web capabilities across multiple sites.

Over the next days and weeks the postmortem will continue. Processes will be refined and further protections put into place.

While Monday-morning quarterbacking is always easy, it seems that some mistakes were made in the handling of the backup data center. It's unclear whether the old backup center was no longer live, or whether the engineers simply determined that the new one was 'ready enough' to fail over to. At the same time, having been in that kind of position, I know that the engineers were under tremendous pressure and were doing their best to come up with solutions that would get services back online as soon as possible. The more egregious issue is that Authorize.net didn't have other ways to keep in touch with customers. When the fire broke out, all the Authorize.net websites went down. Eventually they opened a Twitter account, and for some time that was their only means of getting information out to the customers who were losing revenue as a result of the downtime.

Comments

Fisher Plaza fire

Our company LiveRez uses Authorize.net for our credit card processing, but managed to avert disaster.
Take a look:


http://www.liverez.com/blog/

Mr Tracy Lotz is available to chat if you are interested in a story of a company that managed to avoid disruption, in spite of having a large portion of our business (our credit card processing company) housed in the burned building. Because we provide e-commerce solutions for online bookings of vacation rentals, and the 4th of July is the biggest vacation weekend of the summer, you can imagine that we had some VERY happy property managers when they realized it was business as usual. Due to some forward thinking on the part of Lotz, all of our customers were able to continue with all credit card transactions and online bookings.


Fisher

I'd like to clarify for Liverez.com that Fisher is not a high rise, it's about 6 floors high... There are homes in the area here that are larger.

The CTO of Authorize.Net should be fired

This is unacceptable. For them to have their backup center in the same area and not have a detailed disaster strategy is totally unacceptable.

This goes back to your Information Technology 101 class, wherein you NEVER have your backup in the same area as your main data center.

Heads should roll for this!