After a high-profile disaster like a days-long service outage that cost many customers a lot of money and permanently destroyed data for others, most companies might narrow their technology vision and focus on fixing short-term problems.
Not Amazon CEO Jeff Bezos. In an opinion piece in today's San Francisco Chronicle, Bezos lays out his case for advanced IT R&D in a purely commercial organization.
The piece is mainly an exercise in firefighting – an attempt to defend Amazon's spending decisions to shareholders following a quarterly earnings report that showed revenue went up 38 percent compared to a year ago, but earnings went down by 33 percent.
That makes for an uuugly financial report and as much disgust among financial geeks as the Amazon EC2 outage caused among technical ones.
Bezos didn't address the data outage, but part of the piece did give his macro-view of the architecture of both Amazon's internal systems and its public cloud service, including the storage and database services that caused the outage:
>"State management is the heart of any system that needs to grow to very large size. Many years ago, Amazon’s requirements reached a point where many of our systems could no longer be served by any commercial solution: our key data services store many petabytes of data and handle millions of requests per second.
To meet these demanding and unusual requirements, we’ve developed several alternative, purpose-built persistence solutions, including our own key-value store and single table store.
To do so, we’ve leaned heavily on the core principles from the distributed systems and database research communities and invented from there.
The storage systems we’ve pioneered demonstrate extreme scalability while maintaining tight control over performance, availability, and cost.
To achieve their ultra-scale properties these systems take a novel approach to data update management: by relaxing the synchronization requirements of updates that need to be disseminated to large numbers of replicas, these systems are able to survive under the harshest performance and availability conditions.
These implementations are based on the concept of eventual consistency. The advances in data management developed by Amazon engineers have been the starting point for the architectures underneath the cloud storage and data management services offered by Amazon Web Services (AWS). For example, our Simple Storage Service, Elastic Block Store, and SimpleDB all derive their basic architecture from unique Amazon technologies."
That's a lot more technical savvy than most CEOs could manage, but it still doesn't address how Amazon plans to avoid outages in its cloud service as well as it does in its internal ones.
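For readers unfamiliar with the term, the eventual consistency Bezos describes can be illustrated with a toy sketch. This is not Amazon's implementation: the class names (`Replica`, `EventuallyConsistentStore`), the single logical clock, and the last-writer-wins rule are all assumptions chosen for illustration. It only shows the trade-off in the quote: a write is acknowledged before every replica has it, so reads can briefly be stale, and the replicas converge once the queued updates are delivered.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, version):
        # Last-writer-wins (an assumed rule): keep only strictly newer versions.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class EventuallyConsistentStore:
    """Hypothetical store that replicates updates lazily."""

    def __init__(self, replica_names):
        self.replicas = [Replica(n) for n in replica_names]
        self.pending = []  # undelivered updates: (replica, key, value, version)
        self.clock = 0     # simple logical clock used as the version number

    def put(self, key, value):
        # Acknowledge after writing to one replica; queue the update for the rest.
        self.clock += 1
        first, *others = self.replicas
        first.write(key, value, self.clock)
        for replica in others:
            self.pending.append((replica, key, value, self.clock))

    def propagate(self):
        # Stand-in for background replication: deliver every queued update.
        for replica, key, value, version in self.pending:
            replica.write(key, value, version)
        self.pending.clear()

    def read_all(self, key):
        return [r.read(key) for r in self.replicas]


store = EventuallyConsistentStore(["a", "b", "c"])
store.put("cart", ["book"])
stale = store.read_all("cart")      # only the first replica has the update
store.propagate()
converged = store.read_all("cart")  # all replicas now agree
```

In a real system the propagation step would run continuously in the background and conflict handling would be far more involved; the sketch collapses that into a single `propagate()` call to make the stale window visible.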
Bezos does give a justification to any IT manager interested in expanding the scope of R&D, or even test and development of existing products, beyond their current limits.
Bezos believes R&D isn't a luxury, or an expensive gamble that internal IT will come up with the same solution to a problem faster than an external vendor. He sees it as a way to keep the company continually moving forward into new markets without gaps in the technology getting in the way.
"While many of our systems are based on the latest in computer science research, this often hasn’t been sufficient: our architects and engineers have had to advance research in directions that no academic had yet taken. Many of the problems we face have no textbook solutions, and so we -- happily -- invent new approaches," Bezos wrote.
Amazon didn't get into cloud computing and service-oriented architectures for its internal application development because it thought SOA was the future, liked the cost efficiencies or wanted to make sure the company was using the latest technology available, whether or not there was a good reason for it.
Way back in the earliest days of e-commerce, Amazon started designing apps as services because it kept the apps modular. Rather than forcing the whole company to hold back some new line of business while developers tinkered with complex soup-to-nuts systems, SOA let Amazon break all the functions into separate pieces that would use standard methods to exchange data and requests, and move forward incrementally as each was completed.
Each additional service added functions such as personalization, better data mining and analysis of product categories, more accurate product classifications and personalized search that, he writes, prove the concrete value of IT R&D:
"In my opinion, these techniques are not idly pursued – they lead directly to free cash flow," he wrote.
That's more true in a pure e-commerce play like Amazon than it is for companies with more complex real-world cost structures, but it's still true for any company that understands the value of its own information.
The trick is to make sure all that research really is focused on solving problems that are costing the company money or slowing its growth.
Amazon is able to do that very consistently; most other companies aren't.
As good as his justification is in Amazon's case, anyone else using the same rationale has to be able to back it up with clearly practical, clearly profitable results before it will carry them very far with anyone outside IT.