Hewlett-Packard's Autonomy subsidiary will release an add-on component to link its flagship IDOL search software to the Apache Hadoop data processing platform, the company announced Monday at its HP Discover user conference, held this week in Las Vegas.
While Hadoop provides a good platform for holding vast amounts of information, it offers little in the way of prebuilt analysis tools, said Matt Malden, Autonomy vice president. Organizations must write their own Java programs in the MapReduce framework to analyze their data.
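To give a sense of what writing such a program involves, the sketch below walks through the map, shuffle and reduce steps of a word count. It is an in-memory illustration in plain Python, not Hadoop's Java API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents, all in memory."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling word_count on a list of strings returns a dictionary of lowercased words and their totals. On a real cluster the framework would spread the map and reduce calls across nodes, but even this simple count has to be coded by hand.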
With Autonomy's Hadoop package, users can instead embed an IDOL 10 engine in each node of their Hadoop cluster. They then can use IDOL's 500 functions to analyze and summarize data on the Hadoop implementation.
Autonomy's IDOL (Intelligent Data Operating Layer) provides enterprise users with the ability to conduct complex queries across large amounts of unstructured data, such as Web pages, email and digitized office documents. Over 400 organizations use this software, according to the company.
All the functionality in IDOL itself can be applied to a Hadoop dataset, Malden said. The software offers such functionality as concept searching, where a search on one word will also return items containing synonyms of that word. It can do sentiment analysis, offering a summary of how negative or positive the information in a set of documents may be. Such sentiment analysis can be used to understand user satisfaction levels, perhaps over a select period of time. IDOL can also offer conceptual clustering, whereby it groups documents under broad themes, potentially simplifying a search process.
The pairing of Hadoop and IDOL was a natural fit, Malden said. "You don't need to move data into IDOL to use its functions. Whatever technology choice you make for data storage, we are able to process it," he said, adding that IDOL has 400 connectors to various other platforms, and can understand over 1,000 different data formats.
Autonomy is one of a number of data-centered software companies that have bridged their offerings to the increasingly popular Hadoop. Teradata partnered with Hadoop distributor Hortonworks to bridge Hadoop with its own data warehouse software. In a similar move, Oracle partnered with Cloudera, another Hadoop distributor.
HP Autonomy declined to reveal pricing or availability of the IDOL Hadoop plug-in.
Autonomy also announced on Monday that it has released a tool to analyze how users surf around a company's website, called Autonomy Optimost Clickstream Analytics. The software summarizes user visits, as well as online purchases and other pertinent information.