MySQL Cluster 7.2 release brings legacy tech to big data
Think big data is a game for NoSQL and start-ups only? MySQL Cluster's new release shows otherwise
The exploding big data sector hasn't completely left the old fogeys behind yet, even if they're scrambling a little to readjust their technology.
For every Lucid Imagination, there's still an IBM or an Oracle, with a lot of experience in data management -- big, little, or otherwise.
That was demonstrated again with the general availability of MySQL Cluster 7.2, the distributed carrier-grade edition of the popular open source database from Oracle.
If MySQL in general feels like yesterday's news in the context of big data, it should be noted that big data is more than just Hadoop. Hadoop handles distributed storage and batch processing, but many businesses still need a relational database to hold structured data and process information. MySQL has long been an open source fixture within the relational database world, and has evolved well into the big data space.
The recent announcement confirms that further: MySQL Cluster, which already features C/C++ and Java APIs that enable analysts to plug in non-relational queries (the "NoSQL" class of queries), has added an API in this 7.2 release to enable memcached-based NoSQL queries. Memcached is, according to its website, "an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering."
This means that as workloads grow, memcached-based key-value queries can deliver better performance than routing every lookup through SQL -- highly important in a large-scale data environment. It also means, from a wider perspective, that application developers will have more ways to access data -- from traditional SQL queries to NoSQL-class queries.
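To give a feel for what a memcached-style query looks like on the wire, here is a minimal sketch of the memcached text protocol's `set` and `get` commands. It only builds and parses the byte strings and does not open a connection. The key name `user:42` is made up for illustration, and how MySQL Cluster maps such keys onto tables is not covered by the announcement, so nothing here should be read as the Cluster API itself.

```python
# Sketch of the memcached text protocol (no network I/O).
# The helper names and the key "user:42" are illustrative assumptions.
from typing import Optional


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a 'set' request: header line, then the data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a single-key 'get' request."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> Optional[bytes]:
    """Decode a single-key 'get' reply; return None on a cache miss."""
    if raw == b"END\r\n":
        return None
    # Hit format: VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n
    header, rest = raw.split(b"\r\n", 1)
    _value, _key, _flags, length = header.split(b" ")
    return rest[: int(length)]


if __name__ == "__main__":
    print(build_set("user:42", b"alice"))
    print(build_get("user:42"))
    print(parse_get_response(b"VALUE user:42 0 5\r\nalice\r\nEND\r\n"))
```

The appeal is visible in the shape of the request: a lookup is one key and one short line, with no SQL statement to parse or plan.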
Complex queries are something that the MySQL Cluster team has been working on quite a bit, according to Tomas Ulin, VP of MySQL Development.
"Looking at two-, four-, or even eleven-way joins," Ulin explained in an interview, "MySQL Cluster has been given a 10- to 100-times performance boost." One customer was able to get a 140-times performance improvement for an eleven-way joined query, Ulin added, though he did caution that the mileage may vary on that kind of boost.
The improvements stem from a shift in how work is divided within MySQL Cluster 7.2: join work will now mostly be done at the data nodes, using processors closer to the actual data and reducing the amount of data that has to be transferred to satisfy a query.
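A toy model helps show why moving join work to the data nodes cuts traffic. The sketch below is not MySQL Cluster code: the two "data nodes", the `orders` and `items` tables, and the assumption that matching rows sit on the same node are all invented for illustration. It simply counts how many rows cross the network when the join runs centrally versus locally.

```python
# Toy comparison of a central join versus a join pushed down to data nodes.
# All table names, rows, and the co-location of matching rows are assumptions.
from collections import defaultdict

NODES = [
    {"orders": [(1, "alice"), (2, "bob")], "items": [(1, "book"), (1, "pen")]},
    {"orders": [(3, "carol"), (4, "dave")], "items": [(3, "lamp")]},
]


def join(orders, items):
    """Hash join of orders(order_id, customer) with items(order_id, product)."""
    products_by_order = defaultdict(list)
    for order_id, product in items:
        products_by_order[order_id].append(product)
    return [
        (order_id, customer, product)
        for order_id, customer in orders
        for product in products_by_order.get(order_id, [])
    ]


def join_at_sql_node(nodes):
    """Ship every row to one place, then join there."""
    orders = [row for node in nodes for row in node["orders"]]
    items = [row for node in nodes for row in node["items"]]
    rows_shipped = len(orders) + len(items)
    return join(orders, items), rows_shipped


def join_at_data_nodes(nodes):
    """Join on each node and ship only the joined rows."""
    results = []
    for node in nodes:
        results.extend(join(node["orders"], node["items"]))
    return results, len(results)


if __name__ == "__main__":
    central, central_shipped = join_at_sql_node(NODES)
    pushed, pushed_shipped = join_at_data_nodes(NODES)
    print(sorted(central) == sorted(pushed), central_shipped, pushed_shipped)
```

In this tiny data set the central join moves seven rows and the pushed-down join moves three, with identical results. Real savings depend on the data and the query, which fits Ulin's caution that mileage may vary.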
Scalability is another big data attribute that MySQL Cluster 7.2 seems geared to address: the new release supports multi-site clusters, in which individual data nodes can be located in different data centers, with databases automatically sharded between them. This, coupled with the long-standing capability of the whole MySQL family to run on commodity servers, means databases can be put together with a strong degree of flexibility.
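For readers new to the term, automatic sharding generally means the database, not the application, decides which node stores each row, usually by hashing a key. The sketch below shows that general idea only. The MD5-based hash, the function names, and the node count are assumptions chosen for illustration, and the announcement does not describe the scheme MySQL Cluster 7.2 actually uses.

```python
# Generic hash-based sharding sketch; not MySQL Cluster's actual algorithm.
import hashlib
from collections import defaultdict


def node_for_key(primary_key: str, node_count: int) -> int:
    """Map a key to a node index by hashing it (hypothetical scheme)."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def distribute(keys, node_count: int):
    """Group keys by the node each one hashes to."""
    placement = defaultdict(list)
    for key in keys:
        placement[node_for_key(key, node_count)].append(key)
    return dict(placement)


if __name__ == "__main__":
    keys = [f"subscriber-{n}" for n in range(10)]
    for node, held in sorted(distribute(keys, 4).items()):
        print(node, held)
```

Because the placement is a pure function of the key, any node can work out where a row lives without consulting a central directory.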
MySQL Cluster 7.2 is at the high end of the MySQL product line, "above" the MySQL Standard and Enterprise editions. It has a long track record in the telecom sector, and as such, knows a thing or two about this whole big data hullabaloo.
Something these young whippersnapper startups might do well to remember.
Read more of Brian Proffitt's Zettatag and Open for Discussion blogs at ITworld.