Moving beyond Hadoop for big data needs

Hadoop isn't enough anymore for enterprises that need new and faster ways to extract business value from massive datasets

Computerworld | Big Data, Hadoop

"Dremel was architected from the ground up to be an analytical data store," Driscoll said. Its column-oriented, parallelized, in-memory design makes it several orders of magnitude faster than a traditional data store, he said.

"We have a very similar architecture," Driscoll said. "We are column-oriented, distributed and in-memory."

The Metamarkets technology, though, lets enterprises run queries over data even before it is streamed into a data store, yielding even faster insight than Dremel, he said.

Metamarkets earlier this year released Druid to the open source community to spur more development activity around the technology.

The demand for such technology is driven by the need for speed, Driscoll said.

Hadoop, he said, is simply too slow for companies that need sub-second query response times. Analytics technologies such as those offered by the traditional enterprise vendors are faster than Hadoop but still don't scale as well as a Dremel or a Druid, Driscoll said.

Nodeable, another venture-backed startup, offers a cloud-hosted service called StreamReduce that is similar to the Metamarkets offering.

StreamReduce is powered by Storm, an open source data analytics technology originally developed by BackType before it was acquired by Twitter last year. Storm, also used internally by Twitter, is designed to let enterprises run real-time analytics on streaming data.

Nodeable offers a connector to Hadoop so enterprises can use the service to run interactive queries against data stored in their Hadoop environment as well, CEO Dave Rosenberg said.

Nodeable was launched as a cloud system management company but switched tracks after seeing an opportunity for big data analytics technology. "We realized there was a lack of a real-time complement to Hadoop. We asked ourselves, how do we get real-time with Hadoop?" Rosenberg said.

Services such as Nodeable's do not replace Hadoop; they complement it, Rosenberg said.

StreamReduce gives companies a way to extract actionable information from streaming data that can be stored in a Hadoop environment or in another data store for more traditional batch processing later, he said.

Streaming engines such as those offered by Nodeable and Metamarkets differ from technologies like Dremel in one important respect: they are designed to analyze raw data before it hits a database. Dremel and similar technologies are designed for ad hoc querying of data that is already in a data store such as a Hadoop environment.
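
The difference between the two styles can be sketched in a few lines of Python. This is a toy illustration only, not how Storm, Druid or Dremel are actually implemented; the event fields and the function names `stream_counts` and `adhoc_count` are assumptions invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical page-view events; the field names are made up for illustration.
EVENTS = [
    {"minute": 0, "page": "home"},
    {"minute": 0, "page": "pricing"},
    {"minute": 1, "page": "home"},
    {"minute": 1, "page": "home"},
]


def stream_counts(events):
    """Streaming style: update per-minute counts as each event arrives,
    before anything has been written to a data store."""
    running = defaultdict(Counter)
    for event in events:
        window = running[event["minute"]]
        window[event["page"]] += 1
        # An up-to-date answer is available after every single event.
        yield event["minute"], dict(window)


def adhoc_count(store, page):
    """Ad hoc style: scan data that is already at rest to answer
    a question that was only posed after the data was stored."""
    return sum(1 for row in store if row["page"] == page)


if __name__ == "__main__":
    for minute, counts in stream_counts(EVENTS):
        print("minute", minute, counts)
    print("home views in store:", adhoc_count(EVENTS, "home"))
```

In the streaming function, results are refreshed on every incoming event. In the ad hoc function, nothing is computed until someone asks a question of the stored data, which is why the two approaches are usually described as complementary.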

Meanwhile, major Hadoop players are not standing by idly.


Originally published on Computerworld.