Dependency between the block size and the number of Map tasks
Dependency between performance and the number of Map/Reduce tasks
Dependency between performance and the type of a query
The diagram below shows how the processing speed depends on the query type for a 64 MB data set.
Queries 1 and 2, like queries 3 and 4, differ only in the number of grouping parameters. The first query calculated flight delay times by year; the second grouped additionally by month and day. The third query returned the average flight delay time by year, which involves a different arithmetic operation, and the fourth returned the average flight delay time by year, month, and day.
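To make the difference between the four queries concrete, here is a minimal Python sketch of the same four aggregations over a tiny in-memory sample. It is not the code used in the benchmark. The record layout, the sample values, and the `aggregate` helper are hypothetical, and the sketch assumes that the first two queries compute a total (sum) of delay times, which the description above does not state explicitly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records: (year, month, day, delay_minutes).
# These values are made up for illustration; they are not benchmark data.
FLIGHTS = [
    (2007, 1, 1, 10),
    (2007, 1, 1, 30),
    (2007, 1, 2, 20),
    (2008, 3, 5, 40),
    (2008, 3, 5, 0),
]


def aggregate(records, key_len, reduce_fn):
    """Group delays by the first `key_len` fields, then reduce each group.

    key_len=1 groups by year; key_len=3 groups by year, month and day.
    """
    groups = defaultdict(list)
    for record in records:
        # The grouping key plays the role of the Map output key;
        # the delay is the value collected for that key.
        groups[record[:key_len]].append(record[3])
    # Applying reduce_fn to each group mirrors the Reduce step.
    return {key: reduce_fn(delays) for key, delays in groups.items()}


# Query 1: delay by year (assumed here to be a sum).
q1 = aggregate(FLIGHTS, 1, sum)
# Query 2: the same operation, grouped by year, month and day.
q2 = aggregate(FLIGHTS, 3, sum)
# Query 3: average delay by year.
q3 = aggregate(FLIGHTS, 1, mean)
# Query 4: average delay by year, month and day.
q4 = aggregate(FLIGHTS, 3, mean)

if __name__ == "__main__":
    for name, result in (("q1", q1), ("q2", q2), ("q3", q3), ("q4", q4)):
        print(name, len(result), "groups:", result)
```

In this sample, grouping by year yields two groups, while grouping by year, month and day yields three. On a real data set the finer key produces far more distinct groups, so more intermediate keys have to be shuffled and reduced.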