Hadoop growing, not replacing RDBMS in enterprises

In most cases, Hadoop coexists with conventional relational database management tools, a new study says

By Jaikumar Vijayan, Computerworld

The growing need for companies to manage surging volumes of structured and unstructured data is continuing to propel enterprise use of open source Apache Hadoop software.

However, instead of replacing existing technologies, Hadoop appears to be finding more of a place working alongside conventional relational database management system platforms, according to a new report from Ventana Research.

Hadoop is designed to help companies manage and process petabytes of data. Much of the technology's appeal lies in its ability to break up very large data sets into smaller data blocks that are then distributed across a cluster of commodity hardware for faster processing.
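The block-splitting idea described above can be illustrated with a toy sketch. This is not Hadoop's API: the function names (`split_into_blocks`, `count_levels`, `process`) and the sample log lines are hypothetical, and local worker threads merely stand in for the nodes of a cluster. It only shows the general pattern of dividing a data set into blocks, processing each block independently, and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Break a list of records into consecutive blocks of at most block_size items."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def count_levels(block):
    """Count the first token (the log level) of each non-blank line in one block."""
    return Counter(line.split()[0] for line in block if line.strip())


def process(records, block_size=2, workers=4):
    """Process blocks independently, then merge the per-block counts."""
    blocks = split_into_blocks(records, block_size)
    # Assumption for illustration: threads play the role of cluster nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_levels, blocks))
    # Combine the partial results from every block into one total.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    # Hypothetical log lines, made up for this example.
    logs = [
        "ERROR disk full",
        "INFO started",
        "INFO heartbeat",
        "WARN slow response",
        "ERROR timeout",
    ]
    print(process(logs))
```

Because each block is counted without reference to the others, the same work could in principle be spread over many machines, which is the property the article attributes to Hadoop.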

Early users of the technology, including Facebook, Amazon, eBay and Yahoo, have been using Hadoop to store and analyze petabytes' worth of unstructured data not easily handled by conventional RDBMS platforms.

Ventana's report, which is based on a survey of more than 160 companies, shows that a growing number of enterprises have begun harnessing Hadoop for the same kind of reasons.

More than half of all enterprises looking to glean business insights by analyzing very large volumes of structured and unstructured data are said to have begun using Hadoop to help them with the task.

Most are using Hadoop to add new capabilities rather than to replace existing technologies, said David Menninger, author of the Ventana report.

Ventana's research shows that a majority of companies that are using Hadoop are using it mainly to collect and analyze huge volumes of unstructured and machine-generated data such as log and event data, search engine data, and text and multimedia data from social media sites.

The technology is much less likely to be used for analyzing conventional structured data such as transaction data, customer data and call records data, where traditional RDBMS tools still appear to have the edge, Menninger said.

People are using Hadoop because it enables new kinds of data analytics capabilities, he said. "In two-thirds of the cases we found that people are using Hadoop for advanced analytics, and for types of analysis that they were not doing before."

Hadoop appears to be delivering new capabilities especially on the operations side of the business, Menninger said. In most cases, Hadoop is being used by business units such as sales and marketing rather than by groups such as human resources and finance, he said.

"Operations is the place where the most up-to-the-minute and granular data occurs. It is a place where a lot of the data is machine generated," he said. "It is also the group which is asked to support other areas of the business."

Despite its early promise, enterprises still face some significant challenges in adopting Hadoop. One of the biggest problems continues to be the relative shortage of people who are skilled with Hadoop, Menninger said.

The obstacles survey respondents cited most often related to staff availability and training, Menninger said. A lot of companies also appear to be having problems dealing with Hadoop's clustered computing approach.

Another area where enterprises appear a bit unsure about Hadoop is security, with just 49% of the respondents saying they were satisfied with Hadoop's security.

Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld. Follow Jaikumar on Twitter at @jaivijayan. His e-mail address is jvijayan@computerworld.com.


Originally published on Computerworld.
