Informatica adds support for 'big data,' Hadoop

A Hadoop connector is part of Informatica 9.1

By Chris Kanaracus, IDG News Service

Informatica is joining the growing ranks of vendors moving to support Hadoop, the open-source framework for large-scale or "big data" processing, the company announced Monday.

The 9.1 version of Informatica's platform features a connector to the Hadoop file system (HDFS), allowing customers to move data in and out of Hadoop clusters.

While the Hadoop project has its roots in Web companies, having been led by Yahoo, enterprises are quickly warming up to it as well, said James Markarian, Informatica executive vice president and CTO.

The problem is that corporate IT shops may not have the right sort of in-house expertise, Markarian said.

"It's early days for Hadoop, but as it starts to get more mainstream, the kind of developers that can make use of it is really changing," he said. "Your average guy in IT, they don't know MapReduce, they don't know Hive, they don't know Pig," he added, referring to a range of Hadoop tools. "That's where we come in. You don't need to learn anything else. Take your Informatica skills and we'll bring you to Hadoop."

Informatica 9.1 also targets the growing number of information types that get lumped under the "big data" header. Along with "near-universal" connections to transactional databases like IBM DB2 and Oracle, as well as analytic-focused data platforms such as Netezza and Teradata, the new release can also pull in data from social sites like Facebook, Twitter and LinkedIn.

Other aspects of Informatica's platform, such as MDM (master data management), data quality and self-service tools, are also getting a range of updates as part of the 9.1 release.

The social media and Hadoop connectors are sold separately from the core platform, according to Markarian. Pricing was not immediately available.

It is indeed early days for Hadoop, according to Forrester Research analyst James Kobielus.

While Informatica now has the ability to load and retrieve data from Hadoop clusters, that's not necessarily different from what a number of data warehousing vendors already have, and it's likely that other data integration vendors will follow Informatica's move, he said.

Overall, effective use of Hadoop is not about one tool, according to Kobielus. Early adopters should look to standardize their activities on a core stack of technologies, which hasn't emerged yet, he said. So far, it seems like the only common element in Hadoop projects is the use of MapReduce for the modeling layer, he said.

"I would like to see Informatica and other data integration vendors offer rich IDEs (integrated development environments) for [Hadoop]," Kobielus said. "I strongly expect they will do that."

Chris Kanaracus covers enterprise software and general technology breaking news for The IDG News Service. Chris's e-mail address is Chris_Kanaracus@idg.com
