Hadoop goes mainstream for big BI tasks

By Jaikumar Vijayan, Computerworld | Business Intelligence, analytics, data management

Corporate efforts to glean business intelligence from the massive volumes of data generated by Web server logs and social media have led to a surge of interest in open-source Hadoop software.

Hadoop is designed to process terabytes and even petabytes of unstructured and structured data. It breaks large workloads into smaller data blocks that are distributed across a cluster of commodity hardware for faster processing.
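The split-and-distribute idea can be sketched in a few lines of plain Python. This is only an illustration of the general pattern, not Hadoop's API: the function names, the sample log lines and the block size are all made up for the example, and a local thread pool stands in for the cluster of commodity machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, block_size):
    """Break a large input into fixed-size blocks of lines."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def count_status_codes(block):
    """Per-block work: count the last field (the status code) of each line."""
    counts = Counter()
    for line in block:
        fields = line.split()
        if fields:
            counts[fields[-1]] += 1
    return counts


def process(lines, block_size=2, workers=4):
    """Process each block independently, then merge the partial results."""
    blocks = split_into_blocks(lines, block_size)
    # Threads are a stand-in (assumption) for separate nodes in a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_status_codes, blocks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical Web server log lines, invented for this sketch.
log = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /about.html 200",
    "GET /old 404",
]
totals = process(log)
```

Because each block is counted without reference to any other block, adding more workers (or, in a real cluster, more machines) lets more blocks be processed at the same time.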

The technology -- already used by Web giants such as Facebook, eBay, Amazon and Yahoo -- is increasingly being adopted by banking, advertising, biotech and pharmaceutical companies, said Stephen O'Grady, an analyst at RedMonk.

Tynt Multimedia, a Web analytics firm that collects and analyzes nearly 1TB of data per day, switched to Hadoop about 18 months ago when its MySQL database system began collapsing under the sheer volume of data it was collecting, said Cameron Befus, Tynt's vice president of engineering.

Relational database systems are good at data retrieval and queries but don't accept new data quickly. "Hadoop reverses that. You can put data into Hadoop at ridiculously fast rates," Befus said. Getting the data back out, however, requires programming tools such as Pig or Hive, which are used to write SQL-like queries.

This version of this story was originally published in Computerworld's print edition. It was adapted from an article that appeared earlier on Computerworld.com.

Originally published on Computerworld.
