What's the big deal about Hadoop?

By Todd R. Weiss, Computerworld

Hadoop is all the rage, it seems. With more than 150 enterprises of various sizes using it -- including major companies such as JP Morgan Chase, Google and Yahoo -- it may seem inevitable that the open-source Big Data management system will land in your shop, too.

But before rushing in, make sure you know what you're signing up for. Using Hadoop requires training and a level of analytics expertise that not all companies have yet, customers and industry analysts say. And it's still a very young market; a number of Hadoop vendors are duking it out with various implementations, including cloud-based ones.

Enterprise Hadoop vendors

The free open source application, Apache Hadoop, is available for enterprise IT departments to download, use and change however they wish.

But for many business users, the need for support and technical expertise often outweighs the lure of free do-it-yourself applications, especially when there are critical IT systems at stake.

That's where supported, enterprise-ready versions of Hadoop can instead be a better, more realistic option.

Here is a sampling of some of the major commercial vendors that can help your company get started with Hadoop. Some offer on-premises software packages; others sell Hadoop in the cloud. There are also some Hadoop database appliances beginning to appear, including the recently announced .

Amazon Web Services runs Amazon Elastic MapReduce, a hosted Hadoop framework running on Amazon's Elastic Compute Cloud and its Simple Storage Service

The Cloudera Enterprise subscription service

The Datameer Analytics Solution using Hadoop

The DataStax Enterprise Hadoop software

Greenplum, a Division of EMC, offers Greenplum HD Enterprise-Ready Apache Hadoop

The Hortonworks Data Platform

BigInsights, an unstructured-data cloud service from IBM based on Hadoop

Karmasphere Analyst, a toolkit to help produce data using Hadoop

MapR provides an enterprise-ready M5 edition of its Hadoop software

This list features only some of the many vendors offering enterprise Hadoop products and services today. The number of vendors is constantly growing as Hadoop gains steady traction in the data marketplace.

- Todd R. Weiss

Most important, perhaps: Don't buy into the hype. Forrester Research analyst James Kobielus points out that only 1% of U.S. enterprises are using Hadoop in production environments. "That will double or triple in the coming year," he expects, but caution is still called for, as with any up-and-coming technology.

To be sure, Hadoop has advantages over traditional database management systems, especially the ability to handle both structured data, like that found in relational databases, and unstructured information such as video -- and lots of it. The system can also scale up with a minimum of fuss and bother. eBay, the online global marketplace, has 9 petabytes of data, split between structured data on clusters from Teradata and unstructured data on Hadoop-based clusters running on "thousands" of nodes, according to Hugh Williams, vice president of experience, search and platforms for the company.

"Hadoop has really changed the landscape for us," he says.

"You can run lots of different jobs of different types on the same hardware. The world pre-Hadoop was fairly inflexible that way," Williams explains. "You can make full use of a cluster in a way that's different from the way the last user used it. It allows you to create innovation with very little barrier to entry. That's pretty powerful."

Scaling up, and up

One early Hadoop adopter, Duluth, Ga.-based Concurrent, sells video-streaming systems. It also stores and analyzes huge quantities of video data for its customers. To better cope with the ever-rising amount of data it processes, Concurrent started using Hadoop CDH from Cloudera two years ago.


Originally published on Computerworld.
