May 26, 2009, 9:26 AM — Running an Internet startup remains a tricky business, says link-sharing service ShareThis. During the past two years, more than 110,000 sites have added the ShareThis embedded link, allowing readers to forward articles or videos to their friends. The popularity has made the company's data requirements enormous: it handles up to 12,000 requests a second and 130 million page views every day.
In the planning stages, the startup realized that potential success would mean establishing a data center, and paying to provision the infrastructure, before its business took off, says Nanda Kishore, CTO of ShareThis. Understandably, the business balked at such costs.
"If you look at a traditional data center model, it is very clumsy," Kishore says. "Your need to buy capacity is always ahead of demand (and) if you don't have the traffic, you've built up all this capacity and you are stuck with a fixed cost."
The unpredictability of demand for the ShareThis service and the need to minimize up-front costs are two reasons why the company decided to outsource its needs and use Amazon's Elastic Compute Cloud (EC2) service, he says.
Exactly What Is an Elastic Compute Cloud?
The much-discussed but often not fully understood EC2 technology is part of Amazon's Web Services infrastructure, an alphabet soup of computing services delivered via the Internet to give businesses access to flexible computing resources. With EC2, companies can provision a number of virtual computers, depending on their immediate processing needs, to crunch numbers, and they pay only for what they use.
ShareThis uses the data it collects from each customer site to analyze how content is forwarded among Internet users. The company crunches its link logs every night, adding them to its 10-terabyte data warehouse stored on Amazon's service. The process could take a single computer 100 hours, but ShareThis creates more than a dozen virtual instances and finishes the job overnight.
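The arithmetic behind that overnight job can be sketched in a few lines of Python. The 100-hour figure comes from ShareThis; the even split of work across instances and the rule that every started instance-hour is billed in full are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
import math


def wall_clock_hours(total_compute_hours: float, instances: int) -> float:
    """Ideal elapsed time, assuming the work splits evenly across instances."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return total_compute_hours / instances


def billed_instance_hours(total_compute_hours: float, instances: int) -> int:
    """Instance-hours charged, assuming each started hour is billed in full."""
    hours_per_instance = math.ceil(wall_clock_hours(total_compute_hours, instances))
    return hours_per_instance * instances


TOTAL_COMPUTE_HOURS = 100  # single-machine estimate quoted in the article

for count in (1, 12, 20):
    elapsed = wall_clock_hours(TOTAL_COMPUTE_HOURS, count)
    billed = billed_instance_hours(TOTAL_COMPUTE_HOURS, count)
    print(f"{count:>2} instances: {elapsed:6.2f} h elapsed, {billed} instance-hours billed")
```

Under these assumptions, a dozen instances would cut the wait from more than four days to a little over eight hours while billing for roughly the same number of instance-hours. Real jobs rarely split perfectly, so actual run times would be somewhat longer.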
"We do the data analysis in the cloud," Kishore says. "We instantaneously configure 100 compute hours to process these large files."
Not having traditional data center costs or management hassles allows ShareThis employees to focus on the business, not on managing its infrastructure, he says.
"You can do intellectually stimulating things, rather than turn the dials and knobs," says Kishore. "It feels like, to me, that this is an entirely new world, and it seems that here, there are large savings."
EC2 is the processing part of Amazon's Web Services. The Internet giant also offers the Simple Storage Service (S3), which lets customers back up large quantities of data; the Simple Queue Service (SQS) for messaging between computers; and the Simple Database (SimpleDB) for structured-data storage needs.
The services have already taken off. According to Amazon, the bandwidth for EC2 and S3 has exceeded the bandwidth for all of Amazon's global websites. The storage service, S3, holds more than 52 billion objects and regularly peaks at 80,000 requests per second, according to the company.
The Usual Cloud Concerns Apply
However, customers interested in using Amazon's Web Services should be ready to change their applications to minimize costs, states Gartner analyst Lydia Leong in a research note on the service.
"Broadly, in cloud-computing environments, access to computing capacity is inexpensive, but access to storage is expensive," she writes. "Consequently, applications written for such environments should trade off greater CPU usage, or more memory usage, for less I/O, whenever possible."
In addition, applications running on Linux instances are less expensive than those running on Windows instances, and instances that require less memory and fewer processing cores are cheaper, she adds.
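Leong's advice to spend CPU cycles in order to save I/O can be illustrated with compression, one common way to make that trade. The Python sketch below is not taken from the research note; the log format and the function name are invented for illustration, and real savings would depend on the data.

```python
import zlib


def pack_log(lines: list[str]) -> tuple[bytes, bytes]:
    """Return the raw and compressed bytes for a batch of log lines."""
    raw = "\n".join(lines).encode("utf-8")
    # Level 9 spends the most CPU time to produce the smallest output.
    packed = zlib.compress(raw, 9)
    return raw, packed


# Hypothetical link-sharing log entries; repetitive text compresses well.
sample_lines = [
    f"2009-05-26 GET /share?url=http://example.com/article/{i % 50}"
    for i in range(10_000)
]

raw_bytes, packed_bytes = pack_log(sample_lines)
print(f"raw: {len(raw_bytes)} bytes, compressed: {len(packed_bytes)} bytes")

# Reading the data back costs CPU again, but far fewer bytes cross the wire.
assert zlib.decompress(packed_bytes) == raw_bytes
```

The application does extra work on every write and read, but it stores and transfers fewer bytes, which is the side of the trade that Leong describes as expensive.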
Companies that deploy applications on Amazon's EC2 must also be aware of the security considerations. EC2 instances have to be secured, just like any other server connected to the Internet. There is a user-configurable firewall, but because the virtual machines are accessible only via the Internet, connecting internal systems to EC2 instances requires additional security precautions, Leong says. And, because Amazon does not provide details of its internal infrastructure, companies that have to comply with regulations may not have access to the necessary level of auditing, she writes.