Facebook's Graph Search puts Apache Giraph on the map

Powering Facebook Open Graph, Apache Giraph was built from Yahoo and Google technologies

By Joab Jackson, IDG News Service

Using the Bulk Synchronous Parallel model of computing, Google designed Pregel to generate graphs from very large data sets, using lots of commodity servers.
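As a rough illustration of the Bulk Synchronous Parallel idea, the sketch below runs a vertex-centric computation in supersteps: every active vertex sends messages, a barrier separates sending from receiving, and the job ends when no vertex has anything left to do. It is a single-process simulation written for this article, not Pregel or Giraph code; the class name BspSketch, the method propagateMax and the small ring graph are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BspSketch {
    /**
     * Spreads the largest vertex value through the graph, one superstep at a time.
     * adjacency[v] lists the vertices that vertex v sends messages to.
     */
    public static int[] propagateMax(int[] initial, int[][] adjacency) {
        int n = initial.length;
        int[] value = Arrays.copyOf(initial, n);
        boolean[] active = new boolean[n];
        Arrays.fill(active, true); // every vertex is active in the first superstep
        boolean anyActive = true;

        while (anyActive) {
            // Send phase: each active vertex sends its value along its outgoing edges.
            List<List<Integer>> inbox = new ArrayList<>();
            for (int v = 0; v < n; v++) {
                inbox.add(new ArrayList<>());
            }
            for (int v = 0; v < n; v++) {
                if (!active[v]) {
                    continue;
                }
                for (int target : adjacency[v]) {
                    inbox.get(target).add(value[v]);
                }
            }

            // Barrier: messages are read only after every vertex has finished sending.
            anyActive = false;
            for (int v = 0; v < n; v++) {
                active[v] = false; // halt unless an incoming message raises the value
                for (int message : inbox.get(v)) {
                    if (message > value[v]) {
                        value[v] = message;
                        active[v] = true;
                        anyActive = true;
                    }
                }
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical four-vertex ring: 0 -> 1 -> 2 -> 3 -> 0.
        int[][] ring = { {1}, {2}, {3}, {0} };
        System.out.println(Arrays.toString(propagateMax(new int[] {3, 6, 2, 1}, ring)));
    }
}
```

In a real cluster the vertices would be partitioned across many servers and the barrier would be a network-wide synchronization point, which this sketch does not attempt to model.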

As it did with Hadoop, Yahoo bequeathed Giraph to the Apache Software Foundation, where it is now a fully open-source project worked on by developers from Facebook, LinkedIn, Twitter and Hortonworks.

Because Giraph is written in Java, Ching explained, it can connect very easily with the various parts of Facebook's Hadoop deployment, which it relies upon for data storage management and resource scheduling.

Facebook stores its user-generated data in a data warehouse running on Apache Hive, a component of Hadoop. Giraph, however, can generate graphs four times faster than Hive itself. Because it runs on Hadoop's MapReduce, a Giraph job can be split across multiple servers so it can be executed in parallel.

Facebook modified Giraph in a number of ways to make it run more efficiently, according to Ching.

Company engineers devised a number of tweaks to trim Giraph's memory usage on servers. "Giraph was a memory behemoth due to all data types being stored as separate Java objects," Ching wrote.
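To see why one Java object per data item can be costly, compare two ways of holding the same edge targets. The sketch below is only an illustration of the general effect Ching describes, not Facebook's actual change to Giraph; the class name EdgeStorageSketch and both methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeStorageSketch {
    /**
     * Object-per-item layout: the list holds references to boxed Long objects,
     * so each edge costs a reference plus (in general) an object header and payload.
     */
    public static List<Long> boxedEdges(int count) {
        List<Long> edges = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            edges.add((long) i); // autoboxing turns each long into a Long object
        }
        return edges;
    }

    /**
     * Packed layout: one array object holds every target as a raw 8-byte long.
     */
    public static long[] packedEdges(int count) {
        long[] edges = new long[count];
        for (int i = 0; i < count; i++) {
            edges[i] = i;
        }
        return edges;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        List<Long> boxed = boxedEdges(count);
        long[] packed = packedEdges(count);
        // Both layouts carry the same information; only the memory footprint differs.
        System.out.println(boxed.size() + " boxed edges, " + packed.length + " packed edges");
    }
}
```

The exact per-object overhead depends on the JVM and its settings, so the sketch makes no claim about specific byte counts.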

To improve Giraph's scalability, Facebook linked it with the Netty event-driven framework.

In one test using user interaction data, Facebook was able to use Giraph to create a 1 trillion-edge social graph in under four minutes, using 200 commodity servers.
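The figures above imply a per-server rate that is easy to work out. The short calculation below assumes exactly 1 trillion edges, 200 servers and the full four minutes (240 seconds); since the run finished in under four minutes, the result is a lower bound. The class and method names are made up for the example.

```java
public class GiraphThroughputSketch {
    /** Edges handled per server per second, rounded down to a whole number. */
    public static long edgesPerServerPerSecond(long edges, long servers, long seconds) {
        return edges / (servers * seconds);
    }

    public static void main(String[] args) {
        long edges = 1_000_000_000_000L; // 1 trillion edges, as reported
        long servers = 200L;             // commodity servers, as reported
        long seconds = 4L * 60L;         // upper bound: "under four minutes"

        long rate = edgesPerServerPerSecond(edges, servers, seconds);
        System.out.println("At least " + rate + " edges per server per second");
    }
}
```

Under those assumptions each server handled at least roughly 20.8 million edges per second.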

Facebook's benchmark dwarfed previously published Giraph tests by other companies by at least two orders of magnitude. Previously, researchers had been able to create a 6.6 billion-edge graph using Yahoo AltaVista data and a graph of Twitter data with 1.5 billion edges.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
