Authorize.net categorizes downtime events as 'a perfect storm'

By Peter Smith

Authorize.net has posted an explanation of what happened over the weekend to bring its services down. Users of the service can access the information from the 'Announcements' menu in the Authorize.net dashboard.

In this document, they call the situation a "perfect storm" of events. The fire at Fisher Plaza happened late at night (11:10 pm PT on July 2nd) at the start of a long holiday weekend when many Authorize.net IT engineers were off on holiday, and it took time to get them all back to work on the problem. The Seattle Fire Department wouldn't allow operation of the backup generators due to their proximity to the fire location, nor would they allow customers into the damaged building to access hardware. These factors were outside of Authorize.net's control.

Of more concern is the question of a backup data center. Authorize.net states that they were approaching the capacity of their current backup data center and were in the midst of transitioning to a new one: a true "hot" site (in other words, real-time synchronization), so that the Authorize.Net platform could be switched from one data center to the other "on the fly." When the fire took out the primary data center, they attempted to fail over to the new, still-in-testing backup data center and encountered "a number of unanticipated errors." They offer no explanation as to why they tried to fail over to the new backup data center rather than the old (presumably well-tested) one.
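To make the failover question concrete, here is a minimal sketch of one way a failover target could be chosen. It is not Authorize.net's actual logic, which the document does not describe. The `Site` class, its fields, and the site names are all hypothetical, and the rule it applies (prefer a reachable site that has already passed a failover drill) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str            # hypothetical label, not a real site name
    reachable: bool      # did the most recent health check succeed?
    drill_passed: bool   # has a full failover test succeeded here?


def pick_failover_target(sites: List[Site]) -> Optional[Site]:
    """Return a reachable site, preferring ones that have passed a drill."""
    reachable = [s for s in sites if s.reachable]
    # sorted() is stable, so sites keep their listed order within each group;
    # drill_passed=True sorts first because `not True` is False.
    ranked = sorted(reachable, key=lambda s: not s.drill_passed)
    return ranked[0] if ranked else None


# Assumed scenario: primary is down, both backups are reachable.
sites = [
    Site("primary", reachable=False, drill_passed=True),
    Site("new-hot-site", reachable=True, drill_passed=False),
    Site("old-backup", reachable=True, drill_passed=True),
]
target = pick_failover_target(sites)
print(target.name if target else "no site available")  # prints "old-backup"
```

Under this rule the untested site is used only when nothing better is reachable. That would be the case if, for example, the old backup center had already been taken out of service.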

The document finishes with a section entitled 'Lessons':

Even as our engineering and operations teams continue to ensure normal operations, the postmortem process is already under way. We are examining all aspects of this outage and implementing steps to mitigate future risks. Over the next weeks, we will be completing the work to ensure that we have two fully functional, synchronized hot sites. Failing over from one to the other will occur in a matter of seconds. Steps are also being taken to ensure that we have the ability to implement emergency communication by distributing our voice, e-mail and Web capabilities across multiple sites.

Over the next days and weeks the postmortem will continue. Processes will be refined and further protections put into place.

While Monday-morning quarterbacking is always easy, it seems that some mistakes were made in the handling of the backup data center. It's unclear whether the old backup center was no longer live, or whether the engineers simply determined that the new one was 'ready enough' to fail over to. At the same time, having been in that kind of position, I know that the engineers were under tremendous pressure and were doing their best to come up with solutions that would get services back online as soon as possible. The more egregious issue is that Authorize.net didn't have other ways to keep in touch with customers. When the fire broke out, all the Authorize.net websites went down. Eventually they opened a Twitter account, and for some time that was their only means of getting information out to the customers who were losing revenue as a result of the downtime.


Peter Smith writes about personal technology for ITworld.

3 comments

    Anonymous 2 years ago
    This is unacceptable. For them to have their backup center in the same area and not have a detailed disaster strategy is totally unacceptable. This goes back to your Information Technology 101 class, wherein you NEVER have your backup in the same area as your main data center. Heads should roll for this!
    Anonymous 2 years ago
    I'd like to clarify for Liverez.com that Fisher is not a high rise, it's about 6 floors high... There are homes in the area here that are larger.
    Anonymous 2 years ago
    Our company, LiveRez, uses Authorize.net for our credit card processing, but managed to avert disaster. Take a look: http://www.liverez.com/blog/ Mr. Tracy Lotz is available to chat if you are interested in a story of a company that managed to avoid disruption, in spite of having a large portion of our business (our credit card processing company) housed in the burned building. Being that we provide e-commerce solutions for online bookings of vacation rentals, and the 4th of July is the biggest vacation weekend of the summer, you can imagine that we had some VERY happy property managers when they realized it was business as usual. Due to some forward thinking on the part of Lotz, all of our customers were able to continue with all credit card transactions and online bookings.
