December 10, 2012, 8:32 AM — For some time Microsoft didn't offer a solution for processing big data in cloud environments. SQL Server is good for storage, but its ability to analyze terabytes of data is limited. Hadoop, which was designed for this purpose, is written in Java and was not available to .NET developers. So, Microsoft launched the Hadoop on Windows Azure service to make it possible to distribute the load and speed up big data computations.
But it is hard to find guides explaining how to work with Hadoop on Windows Azure, so here we present an overview of two out-of-the-box ways of processing big data with Hadoop on Windows Azure and compare their performance.
HOW-TO: Get Hadoop certified ... fast
IN PICTURES: 'The Human Face of Big Data'
We created eight types of queries in both languages and measured how fast they were processed.
A data set was generated based on US Air Carrier Flight Delays information downloaded from Windows Azure Marketplace. It was used to test how the system would handle big data. Here, we present the results of the following four queries: