January 18, 2012, 9:11 PM — Online database repository provider RainStor today announced what it calls the industry's first enterprise-class database that runs natively on Hadoop.
RainStor's Big Data Analytics on Hadoop software enables faster analytics processing because there's no need to move data out of the Hadoop Distributed File System (HDFS) environment.
Hadoop is an open-source framework for storing and processing huge volumes of structured and unstructured data across clusters of machines, allowing information such as online transactions, social media data and web logs to be analyzed.
RainStor provides a specialized database purpose-built for online big data retention.
By running the database natively on Hadoop, RainStor said it can deliver faster queries and analysis against multi-structured data inside Hadoop. By contrast, neither Oracle nor SQL databases run natively on Hadoop or HDFS, so they require that data be analyzed outside Hadoop.
RainStor can also analyze compressed data in its native state.
RainStor's new Big Data Analytics on Hadoop combines a compression algorithm that delivers roughly 40:1 data reduction with SQL and Oracle access and MapReduce support. Storing the compressed data set, both structured and unstructured, on HDFS reduces the cluster size by 50% to 80%, which significantly lowers operating cost, according to RainStor CEO John Bantleman.
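To put those figures in concrete terms, here is a back-of-the-envelope sketch in Python. The 1,000 GB data set and the 100-node cluster are made-up inputs for illustration, not numbers from RainStor, and the article does not say how the 40:1 ratio translates into the 50% to 80% cluster reduction, so the two are computed separately.

```python
import math


def compressed_size_gb(raw_gb, ratio=40):
    """Size after compression at the quoted ratio (40:1 by default)."""
    return raw_gb / ratio


def nodes_after_reduction(nodes, reduction_pct):
    """Cluster size after shrinking it by reduction_pct percent, rounded up."""
    return math.ceil(nodes * (100 - reduction_pct) / 100)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
raw_gb = 1000
nodes = 100

small = compressed_size_gb(raw_gb)          # 25.0 GB at 40:1
low_end = nodes_after_reduction(nodes, 50)  # 50 nodes at a 50% reduction
high_end = nodes_after_reduction(nodes, 80) # 20 nodes at an 80% reduction
```

The cluster reduction is smaller than the raw compression ratio would suggest, presumably because cluster sizing depends on more than storage capacity alone.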
"We do partition filtering. Our index says, 'go find me this record,' and filtering asks 'does this value exist in a partition? Yes or no,'" Bantleman said. "So when you ask a question, we can look at a tiny slice of metadata and decide quickly what not to read, so it does a lot less work than big databases do."
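The general idea Bantleman describes can be sketched in a few lines of Python. This is only an illustration of partition filtering in principle, not RainStor's implementation: the `Partition` class, the `customer_ids` metadata set and the `find_rows` function are all hypothetical names, and a real system would likely use a more compact summary than a full set of values.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """A hypothetical data partition with a small metadata summary."""
    rows: list                             # (customer_id, amount) tuples
    customer_ids: set = field(init=False)  # metadata: distinct ids present

    def __post_init__(self):
        self.customer_ids = {cid for cid, _ in self.rows}


def find_rows(partitions, customer_id):
    """Return the matching rows and how many partitions were actually read."""
    matches, scanned = [], 0
    for part in partitions:
        # Filtering step: ask the metadata "does this value exist here?"
        # and skip the partition entirely when the answer is no.
        if customer_id not in part.customer_ids:
            continue
        scanned += 1
        matches.extend(row for row in part.rows if row[0] == customer_id)
    return matches, scanned


partitions = [
    Partition([(1, 10.0), (2, 5.5)]),
    Partition([(3, 7.25), (4, 1.0)]),
    Partition([(2, 9.0), (5, 3.0)]),
]

matches, scanned = find_rows(partitions, 2)
# matches == [(2, 5.5), (2, 9.0)]; only 2 of the 3 partitions were read
```

The saving comes from the middle partition never being opened: the decision about what not to read is made from metadata alone.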
Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian or subscribe to Lucas's RSS feed. His e-mail address is firstname.lastname@example.org.