For the Graph 500 benchmark, the supercomputer is given a large set of data, called a graph. A graph is an interconnected set of data, such as a group of connected friends on a social network like Facebook. A graph consists of a set of vertices and edges; in the social media context, a vertex would be a person and an edge would be that person's connection to another person. Some vertices have many connections, while many others have only a few. The computer is given a single starting vertex and is timed on how quickly it discovers all the other vertices in the graph that can be reached by following the edges.
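The search described above is a breadth-first search, and a toy version fits in a few lines of Python. This is only a minimal sketch for illustration, not the benchmark's reference implementation: the `bfs` function, the `friends` graph and the names in it are all hypothetical, and a real run works on graphs with billions of edges spread across many machines.

```python
from collections import deque


def bfs(adjacency, start):
    """Breadth-first search from `start`.

    Returns the set of vertices reached and the number of edge
    inspections performed along the way.
    """
    visited = {start}
    queue = deque([start])
    edges_followed = 0
    while queue:
        vertex = queue.popleft()
        # Look at every connection of the current vertex.
        for neighbor in adjacency.get(vertex, ()):
            edges_followed += 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited, edges_followed


# Hypothetical toy "friend" graph: each person maps to a list of friends.
# Every friendship is stored in both directions, so it is inspected twice.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana"],
    "dev": ["ben"],
    "eli": [],  # no connections, so never reached from "ana"
}

reached, edges_followed = bfs(friends, "ana")
print(sorted(reached), edges_followed)
```

Dividing the number of edges followed by the elapsed time gives a rate in edges per second, which is the kind of figure the list reports for each machine.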
Currently, IBM's BlueGene/Q systems dominate this edition of the Graph 500. Nine out of the top 10 systems on the list are BlueGene/Q models -- compared to four BlueGene/Q systems on the November 2011 compilation. For Bader, this is proof that IBM is becoming more sensitive to current data processing needs. IBM's previous BlueGene system, BlueGene/L, was geared more towards floating point operations, and does not score as highly on the list.
Like the Top500, each successive edition of the Graph 500 shows steady performance gains among its participants. The top machine on the new list, Sequoia, traversed 15,363 billion edges per second. In contrast, the top entrant of the first list, compiled in 2010, followed only 7 billion edges per second. This jump of more than three orders of magnitude, roughly a 2,200-fold increase, is "staggering," Bader said.
The Graph 500 list is compiled twice a year, and, like the Top500, the results are announced at either the Supercomputing conference, usually held in November, or the International Supercomputing Conference, usually held in June. Participation is voluntary: entrants run either the reference implementations of the benchmark or their own implementations and submit the results.
Despite its name, the Graph 500 has yet to attract 500 submissions, though the numbers are improving with each edition. The first contest garnered 9 participants, and this latest edition has 124 entrants.
Bader is quick to point out that the Graph 500 is not a replacement for the Top500 but rather a complementary benchmark. Still, the data-intensive benchmark could help answer some of the criticisms of the Top500's use of the Linpack benchmark.