January 26, 2013, 7:24 AM — In the summer of 2011 Facebook czar Mark Zuckerberg and former Googler Lars Rasmussen, who had joined the company about a year earlier, had a conversation about building a third "pillar" of Facebook.
Facebook's Timeline is a snapshot of a person's life: who they are, what they do, who their friends are, what they like. Newsfeed is a look into the latest happenings in each user's social network. Graph Search, introduced this week during a private unveiling at the company's headquarters, is a third pillar, Zuckerberg says. It is an advanced search that lets users type in what they're looking for and returns results based on their distinctive social surroundings.
Building such a search system is no trivial task. Facebook has 1 billion active users each month who have shared more than 240 billion photos. The social network is a complex web of more than 1 trillion connections of thousands of different types. "Every day, people share billions of pieces of new content, and Graph Search needs those indexed within seconds of their creation," Rasmussen explains. So how did they do it?
Facebook is fairly mum about its internal operations, although it does give some hints on the company's engineering Facebook page. Database experts, and even those who track Facebook closely, don't know for sure exactly how Facebook built Graph Search. But combining structured and unstructured data into a unified search tool that also takes into account the individual privacy settings of each user is undoubtedly a massive technical challenge.
"Searching a database on this scale is massively complex, with the key problem being how to search the entire database without degrading the performance of Facebook itself," says Matt Aslett, a data expert at the 451 Group. "Add in the fact that Facebook is doing a graph search - which not only searches all the data but the relationships between the data - and it is very hard to do efficiently."
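To see why searching relationships is harder than a plain lookup, consider a toy version of a query such as "friends who like a given page." The Python sketch below is purely illustrative: the data, the names (`friends`, `likes`, `likes_visibility`, `visible_to`, `friends_who_like`) and the privacy rule are all assumptions made for this example, not a description of how Facebook built Graph Search.

```python
# Toy social graph. All names and data here are hypothetical.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
}
likes = {
    "bob": {"cycling"},
    "carol": {"cycling", "jazz"},
}
# Assumed privacy rule: a user's likes are visible to "friends" or "only_me".
likes_visibility = {"bob": "friends", "carol": "only_me"}


def visible_to(viewer, owner):
    """Return True if viewer may see owner's likes under the toy rule."""
    if viewer == owner:
        return True
    setting = likes_visibility.get(owner, "friends")
    return setting == "friends" and viewer in friends.get(owner, set())


def friends_who_like(viewer, page):
    """Follow friend edges, then like edges, filtering by visibility."""
    return sorted(
        f
        for f in friends.get(viewer, set())
        if page in likes.get(f, set()) and visible_to(viewer, f)
    )


print(friends_who_like("alice", "cycling"))
```

Even in this small example, each answer requires following two kinds of connection and checking a per-user privacy setting, which hints at the cost of doing the same across a trillion connections.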
Graph Search, Aslett believes, is likely based on Facebook's internally developed Tao database, a caching layer on top of Facebook's thousands of sharded MySQL databases, according to a blog post a Facebook engineer wrote last fall describing the challenges of working with cached content.
To create Graph Search, the engineers likely used some combination of open source tools that are available on the market and internally developed code written specifically for Facebook's unique use case, predicts Jeffrey Kelly, big data expert at The Wikibon Project. Candidates include tools like Apache Lucene, Solr and Cassandra, used by Netflix to index its movie library in Amazon Web Services' cloud. "FB doesn't use straight off the shelf software and hardware," he says. It can't; it either customizes open source technology or develops its own in-house.
"None of us had built anything remotely like this before," Rasmussen says in describing the "Under the hood" details of Graph Search beta. No one had built something like Graph Search before because it's a completely new type of search tool, one designed specifically for use by a social network.