Twitter Picks Up Cassandra

"NoSQL" database a popular alternative to MySQL amongst web application hosters

By Brian Proffitt

Something interesting crossed the wires today... Twitter, the popular microblogging service, is switching to the Cassandra data management system from MySQL.

Like MySQL, Cassandra is open source. It was originally developed at Facebook and is now an Apache Software Foundation project. Facebook obviously uses Cassandra, as does Digg, which announced its switch to Cassandra in September last year.

Why so popular in the social media applications set? It seems that MySQL has issues scaling to the needs of these high-traffic, high-storage sites. For Digg, "... the case of the traditional architecture, the lack of redundancy on the write masters is painful, and both approaches have significant management overhead to keep running," wrote Ian Eure back in September. The rest of Eure's blog post is extremely illuminating. He goes into a detailed explanation of the challenges of using a relational database like MySQL, but this passage really sums it up well:

"The fundamental problem is endemic to the relational database mindset, which places the burden of computation on reads rather than writes. This is completely wrong for large-scale web applications, where response time is critical. It’s made much worse by the serial nature of most applications. Each component of the page blocks on reads from the data store, as well as the completion of the operations that come before it."

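To make Eure's point concrete, here is a minimal sketch of the two approaches, assuming a toy in-memory "data store" built from Python dictionaries. Every name in it (`follows`, `write_post`, `timeline_computed_on_read`, and so on) is hypothetical and chosen for illustration; it is not Digg's, Twitter's, or Cassandra's actual design. The first read function assembles a reader's timeline at read time, the way a relational join would; the second simply returns a list that was already built when each post was written.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for a data store; all names are illustrative.
follows = {"alice": ["bob", "carol"], "dave": ["bob"]}  # reader -> authors followed
posts_by_author = defaultdict(list)     # normalized: each post stored once
timeline_by_reader = defaultdict(list)  # denormalized: one prebuilt list per reader


def followers_of(author):
    """Return the readers who follow the given author."""
    return [reader for reader, authors in follows.items() if author in authors]


def write_post(author, seq, text):
    """Store the post, then copy it into every follower's timeline at write time."""
    post = (seq, author, text)
    posts_by_author[author].append(post)
    for reader in followers_of(author):
        timeline_by_reader[reader].append(post)


def timeline_computed_on_read(reader):
    """Read-heavy approach: gather and sort posts from every followed author."""
    gathered = []
    for author in follows.get(reader, []):
        gathered.extend(posts_by_author[author])
    return sorted(gathered, reverse=True)  # newest first, by sequence number


def timeline_precomputed(reader):
    """Write-heavy approach: one lookup of a list that already exists."""
    # Assumes posts are written in increasing sequence order.
    return list(reversed(timeline_by_reader[reader]))
```

After a few calls such as `write_post("bob", 1, "hello")`, both read functions return the same timeline for a given reader. The difference is where the work happens: the first does it on every page view, while the second pays once per write, at the cost of storing the same post several times.
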
That reasoning seems to be the same for Twitter's move to Cassandra this week. In fact, when Ryan King, a Twitter software engineer, was interviewed Feb. 23 about a potential move to Cassandra, he laid out the options exactly as Eure did in September: either Twitter would have to move to a "more Automated sharded MySQL setup" or switch to one of a new class of non-relational databases that have much more appeal for web application hosts. These include HBase, Voldemort, MongoDB, MemcacheDB, Redis, Cassandra, HyperTable, Tokyo Cabinet/Tyrant, and Dynomite.

Informally, this set of databases is referred to as the "NoSQL" family, in that they all shift away from the relational model that SQL databases use.

With its flexible, scalable architecture and its clustering capabilities, Cassandra is well suited to the kind of work web applications demand. So, are MySQL and the other SQL databases in trouble? Not likely... there are still some things the non-relational databases can't do very well yet. Data handling is not as rigorous, since NoSQL databases tend to trade some of the data-integrity guarantees of relational systems for performance.

I would keep an eye on this new family of databases... if the performance and data issues can be better tuned, the NoSQL movement may represent a new future for data management, especially since the Web seems to be the most popular platform for apps.

Brian Proffitt is a veteran Linux and open source journalist/analyst with experience in a variety of technologies, including cloud, virtualization, and consumer devices.
