Digg, like Twitter, rips out MySQL
Digg's engineering team has stopped using MySQL, echoing a move recently made by Twitter, another major social networking site.
Digg is abandoning MySQL in favor of a "NoSQL" environment, because of the "increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight," Digg vice president of engineering John Quinn wrote in a blog post last week.
Digg made several other changes, rewriting all its application code, installing a new client and server architecture and moving away from the LAMP open source software bundle. But the switch away from MySQL may be the most significant infrastructure change of all at Digg, Quinn writes.
"To someone like me who's been building systems almost exclusively on relational databases for almost 20 years, this feels like a bold move," he says.
MySQL is now controlled by Oracle, the new owner of Sun Microsystems. MySQL creator Michael Widenius has expressed concern over Oracle's control of MySQL. It remains to be seen how MySQL fares under new leadership, but Quinn did not mention the Oracle/Sun merger as a driving factor.
Digg is moving to Cassandra, a distributed database management system originally developed by Facebook. Digg is not the only major social networking site moving from MySQL -- Twitter is doing the same.
As of now, Digg has re-implemented most of its functionality with Cassandra, while making its own improvements to the open source Cassandra software, Quinn writes.
"Digg is committed to the use and development of open source software and we're keen to avoid the cost of proprietary large-scale storage solutions," he states. "Cassandra … is column-oriented and allows for the storage of relatively structured data. It has a fully decentralized model; every node is identical and there is no single point of failure. It's also extremely fault tolerant; data is replicated to multiple nodes and across data centers. Cassandra is also very elastic; read and write throughput increase linearly as new machines are added."
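To make the decentralized model Quinn describes more concrete, here is a minimal Python sketch of a hash ring in which every node plays the same role and each key is copied to several nodes. It is an illustration only, not Cassandra's or Digg's actual code: the `Ring` class, the `token` helper, the node names, the use of MD5 and the replica count of three are all assumptions made for this example.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: all nodes are peers, and there is no master node."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Sorted (position, node name) pairs form the ring.
        self.tokens = sorted((token(n), n) for n in nodes)

    def add_node(self, node):
        """Insert a new peer at its hashed position on the ring."""
        bisect.insort(self.tokens, (token(node), node))

    def owners(self, key):
        """Return the nodes holding copies of `key`, walking clockwise."""
        count = min(self.replicas, len(self.tokens))
        start = bisect.bisect_right(self.tokens, (token(key), ""))
        return [
            self.tokens[(start + i) % len(self.tokens)][1]
            for i in range(count)
        ]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"])
    print(ring.owners("story:42"))  # three distinct node names
    ring.add_node("node-e")         # a new machine joins as an equal peer
    print(ring.owners("story:42"))
```

Because each key lives on several nodes, losing one machine leaves other copies available, and a new machine takes over a slice of the ring instead of requiring a central coordinator. A real system adds much more, including cross-data-center placement and consistency controls, none of which is modeled here.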
Digg's main focus now is readying its latest release for general availability and "championing Cassandra's development and adoption," Quinn writes.
Follow Jon Brodkin on Twitter: www.twitter.com/jbrodkin
Read more about software in Network World's Software section.