EMC pools enterprise smarts to create data 'lakes'

Technology from VMware, Pivotal and EMC storage will go into big-data analytics systems

EMC is drawing on its “federation” of companies to help customers build data lakes using EMC storage, VMware virtualization and Pivotal big-data smarts.

The Federation Business Data Lake will ingest and analyze data from diverse sources to give enterprises new insights that can help them make better decisions, EMC says. It can tie together existing EMC assets with new software to run the data lake, and the whole package can be built and started up in as little as seven days, according to the company.

EMC’s aim is to help enterprises of all sizes make better use of information they collect, including both structured and unstructured data. Building the data lakes may also show how EMC can make the diverse businesses it owns add up to more than the sum of their parts.

A data lake is a repository that can hold different kinds of data, allowing for cross-functional analysis. The data lake EMC announced on Monday is a combination of existing EMC products: storage from EMC Information Infrastructure, VMware vCloud Suite, Pivotal Big Data Suite, and Pivotal Cloud Foundry. But it also includes newly developed software for ingesting and distributing data and for controlling access to information based on policies.

Configuring a data lake may require custom work, depending on what a particular enterprise needs and the infrastructure it already has. The systems can be built by EMC’s own services business or by system integrators such as Deloitte and Capgemini. Federation Business Data Lake will be available starting in April.

For now, the data lakes can tap only into EMC storage infrastructure, though the company plans to bring in other vendors’ systems through its ViPR architecture in the future, said Aidan O’Brien, senior director of big data solutions at EMC. But there are other roles for third parties: Customers can choose Hadoop distributions from other vendors, including Cloudera and Hortonworks. And because EMC doesn’t make a visualization tool, it has qualified products including Tableau and MongoDB for that component, O’Brien said.

EMC has its roots in storage, but following several acquisitions, the company now pitches itself as a federation of companies. In addition to its traditional business, now called EMC Information Infrastructure, those units include VMware, Pivotal, security company RSA and the converged-systems venture VCE.

Some critics have said it would be better for business if EMC spun off parts of that federation, but the company has rejected those calls. CEO Joe Tucci says the structure lets each unit focus on its own goals and lets EMC offer customers a set of technologies while giving them the freedom to make their own choices.

Federation Business Data Lake is one way EMC can combine assets from three of its member companies to make something new. Last year, the company announced the EMC Enterprise Hybrid Cloud Solution, a combination of hardware, software and services from EMC and VMware that is supposed to let enterprises set up a hybrid cloud in 28 days or less.
