February 21, 2012, 1:47 PM — The exploding big data sector hasn't completely left behind the old fogeys yet, even if there's a little bit of scrambling to re-adjust the technology.
For every Lucid Imagination, there's still an IBM or an Oracle, with a lot of experience in data management -- big, little, or otherwise.
That was demonstrated again with the general availability of MySQL Cluster 7.2, the distributed carrier-grade edition of the popular open source database from Oracle.
If MySQL in general feels like yesterday's news in the context of big data, it should be noted that big data is more than just Hadoop. Hadoop is largely a storage and batch-processing layer, and many businesses still need a relational database to hold structured data and process information. MySQL has long been an open source fixture within the relational database world, and has evolved well into the big data space.
The recent announcement adds more confirmation of that: MySQL Cluster, which already features C/C++ and Java APIs that enable analysts to plug in non-relational queries (the "NoSQL" class of queries), has added an API in this 7.2 release to enable memcached-based NoSQL queries. Memcached is, according to its website, "an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering."
This means that as queries get more complex, memcached-based queries can deliver better performance -- highly important in a large-scale data environment. It also means, from a wider perspective, that application developers will have more ways to access data -- from traditional SQL queries to NoSQL-class queries.
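To give a sense of what a memcached-style key-value request looks like next to a SQL statement, here is a minimal Python sketch of the standard memcached text protocol. It assumes the MySQL Cluster memcached interface accepts that protocol; the key `user:42` and the helper names are hypothetical, and how keys map to table rows is a configuration matter not covered here. The sketch only encodes and decodes messages, so it runs without a server.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol 'set' command for one key."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol 'get' command for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> dict:
    """Decode a 'get' reply: zero or more VALUE blocks followed by END."""
    values = {}
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        line = raw[pos:line_end]
        if line == b"END":
            return values
        # Each hit starts with: VALUE <key> <flags> <bytes>
        tag, key, _flags, nbytes = line.split(b" ")
        if tag != b"VALUE":
            raise ValueError(f"unexpected reply line: {line!r}")
        start = line_end + 2
        length = int(nbytes)
        values[key.decode("ascii")] = raw[start:start + length]
        # Skip the data block and its trailing \r\n.
        pos = start + length + 2


# Hypothetical key; the bytes below would be sent over a TCP socket.
request = build_get("user:42")
reply = parse_get_response(b"VALUE user:42 0 5\r\nalice\r\nEND\r\n")
print(request, reply)
```

The whole request is a key and a verb, with no statement to parse or plan, which is the kind of access the memcached API is meant to serve alongside SQL.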
Complex queries are something that the MySQL Cluster team has been working on quite a bit, according to Tomas Ulin, VP of MySQL Development.
"Looking at two-, four-, or even eleven-way joins," Ulin explained in an interview, "MySQL Cluster has been given a 10- to 100-times performance boost." One customer was able to get a 140-times performance improvement for an eleven-way joined query, Ulin added, though he did caution that the mileage may vary on that kind of boost.
The improvements stem from a shift in the workloads within MySQL Cluster 7.2, where join work will now mostly be done on the data nodes, using processors closer to the actual data and reducing the amount of data that has to be transferred to answer a query.
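Why joining near the data cuts network traffic can be illustrated with a toy Python model. This is a deliberately simplified sketch, not how MySQL Cluster is implemented: the tables, column layout, and function names are all invented for illustration, and "rows transferred" simply counts rows that would cross from a data node to the SQL node.

```python
# Hypothetical sample data: (id, name, country) and (order_id, customer_id).
CUSTOMERS = [(1, "Ada", "SE"), (2, "Bo", "US"), (3, "Cy", "SE"), (4, "Di", "DE")]
ORDERS = [(100, 1), (101, 2), (102, 2), (103, 3), (104, 4), (105, 4)]


def join_orders(orders, customers, country):
    """Return (order_id, customer_name) for customers in the given country."""
    names = {cid: name for cid, name, ctry in customers if ctry == country}
    return [(oid, names[cid]) for oid, cid in orders if cid in names]


def join_at_sql_node(orders, customers, country):
    """Model: data nodes ship every row; the SQL node does the join."""
    rows_transferred = len(orders) + len(customers)
    return join_orders(orders, customers, country), rows_transferred


def join_at_data_node(orders, customers, country):
    """Model: the join runs next to the data; only result rows are shipped."""
    result = join_orders(orders, customers, country)
    return result, len(result)


slow_result, slow_rows = join_at_sql_node(ORDERS, CUSTOMERS, "SE")
fast_result, fast_rows = join_at_data_node(ORDERS, CUSTOMERS, "SE")
print(slow_result == fast_result, slow_rows, fast_rows)
```

Both strategies return the same rows; the difference in this model is only how many rows leave the data node, and that gap widens as tables grow and more joins are chained.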