October 12, 2011, 3:56 PM - Microsoft is the latest of the world's top IT vendors to climb aboard the Hadoop 'big data' bandwagon.
The company announced Wednesday that it will collaborate with Yahoo spin-off Hortonworks to develop an Apache Hadoop implementation for its Windows Server and Windows Azure platforms.
Under the strategic partnership, Hortonworks will lend its domain expertise to help Microsoft integrate Hadoop into its Windows technology.
Microsoft said it expects to have a preview of a Hadoop-based service for Windows Azure by the end of this year, and one for Windows Server sometime in 2012. The Windows Server Hadoop implementation will work with existing Microsoft BI tools, Microsoft said in a statement.
Microsoft made the announcement at the PASS Summit, a SQL Server user conference held in Seattle.
The move will help Microsoft customers better manage their 'big data' requirements, said Microsoft corporate vice president Ted Kummert in a statement. "The next frontier is all about uniting the power of the cloud with the power of data to gain insights that simply weren't possible even just a few years ago."
Microsoft's move comes barely a week after Oracle unveiled a Hadoop-based big data appliance, along with a new Oracle NoSQL database and an open source distribution of the R programming language for statistical analysis.
Like Microsoft, Oracle said its Hadoop offering aims to tap a growing enterprise interest in big data analytics.
Just yesterday, IBM announced plans to acquire Platform Computing, a Toronto-based maker of software for managing the large computing clusters on which Hadoop typically runs.
Hadoop, an open-source software framework that supports big data applications, is increasingly attracting the attention of top IT executives for its ability to handle massive volumes of unstructured data like email content, weblogs, clickstream data, audio and video files, and sensor data.
A growing number of companies are looking to collect and analyze such unstructured data to glean new business insights. But to date they have been somewhat hampered in that effort by the inherent scalability limitations of conventional relational database management products, which are designed mostly to handle structured, relational data.