September 10, 2012, 11:29 AM — Big data is all about non-relational databases and distributed data storage, so the hype goes, and if you don't believe us, WE HAVE AN ELEPHANT THAT WILL STEP ON YOU! RARGH, MORE DATA!
If that sounds like the typical line from the usual big-data story (minus, of course, all the caffeine this writer had this morning), it's not only wildly hyperbolic, it's also not completely accurate.
The truth is, open source relational databases like MySQL and PostgreSQL are still a big part of the big data story, no matter how much noise the Hadoop elephant makes. This is why today's announced release of PostgreSQL 9.2, while it may not get a lot of media attention, is still a big deal.
It's not just me saying this: NASA, Instagram, and the Chicago Mercantile Exchange are all putting PostgreSQL to use. That's some serious user-cred.
Two new features will give developers some interesting toys to play with. First, PostgreSQL now has native JSON support, so any application developer who wants to store documents within PostgreSQL is now more than welcome to.
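To give a feel for what a validating JSON column buys an application developer, here is a minimal Python sketch of the kind of well-formedness check a JSON-aware store performs before accepting a document. This is an illustration only, not PostgreSQL's implementation: the function name `validate_document` and the sample documents are made up for this example.

```python
import json


def _reject_constant(name):
    # Python's parser accepts NaN and Infinity by default; strict JSON does not.
    raise ValueError(f"{name} is not valid JSON")


def validate_document(text: str) -> str:
    """Return the text unchanged if it is well-formed JSON, else raise ValueError.

    Hypothetical helper: it mimics, on the client side, a store that
    refuses malformed documents instead of saving arbitrary text.
    """
    json.loads(text, parse_constant=_reject_constant)
    return text


# A well-formed document passes through untouched.
profile = validate_document('{"user": "brian", "tags": ["postgres", "json"]}')
print(profile)

# A truncated document is rejected before it could ever be stored.
try:
    validate_document('{"user": ')
except ValueError as err:
    print("rejected:", err)
```

The point of doing the check at the storage layer is that every reader of the data can rely on the column holding parseable documents, rather than each application re-validating on its own.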
There is also new support for range types, which are useful because they represent many element values in a single range value, and because concepts such as overlapping ranges can be expressed clearly. This means that time and date ranges, as well as ranges of financial and scientific data, can be handled much more robustly. According to the PostgreSQL team, this is the only major SQL database that supports this feature.
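The idea of "one value that stands for a whole span" is easier to see in code. The Python sketch below models a half-open date range and an overlap test, which is roughly the question PostgreSQL's range types let you ask directly in SQL. The `DateRange` class and the booking names are hypothetical and exist only for this illustration; they are not part of PostgreSQL or any library.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Hypothetical half-open date range [lower, upper), for illustration only."""

    lower: date  # first day inside the range
    upper: date  # first day after the range

    def __post_init__(self):
        # Keep the sketch simple by refusing empty or inverted ranges.
        if self.lower >= self.upper:
            raise ValueError("lower bound must be before upper bound")

    def contains(self, day: date) -> bool:
        return self.lower <= day < self.upper

    def overlaps(self, other: "DateRange") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.lower < other.upper and other.lower < self.upper


booking_a = DateRange(date(2012, 9, 1), date(2012, 9, 10))
booking_b = DateRange(date(2012, 9, 8), date(2012, 9, 15))
booking_c = DateRange(date(2012, 9, 10), date(2012, 9, 12))

print(booking_a.overlaps(booking_b))  # True: Sept. 8 and 9 fall in both
print(booking_a.overlaps(booking_c))  # False: a ends exactly where c begins
```

Without a range type, an application typically stores two separate columns and repeats this comparison logic in every query; treating the range as a single value keeps the bounds and the overlap rule in one place.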
IT managers will also be pretty darned intrigued by the performance tune-up under 9.2's hood. The database now has linear scalability up to 64 cores, index-only scans, and reductions in CPU power consumption. This, coupled with improvements in PostgreSQL's capability to use hardware resources on bigger servers, translates to up to 350,000 read queries per second, which is more than four times faster than earlier versions of PostgreSQL. Index-only scans for data warehousing queries are anywhere from two to twenty times faster, and data writes can reach up to 14,000 per second.
PostgreSQL is already a good system to use for fast data analysis, particularly when used in conjunction with the mighty Hadoop elephant. With this revved-up engine, it will be able to handle a lot more data even faster, and deliver some serious power to database developers who don't want to jump away from the world of relational databases.
Read more of Brian Proffitt's Open for Discussion blog and follow the latest IT news at ITworld.