Why the U.S. may lose the race to exascale
Unlike the U.S., Japan and Europe have set firm goals for systems by 2020, and China could beat everyone
In the global race to build the next generation of supercomputers -- exascale -- there is no guarantee the U.S. will finish first. But the stakes are high for the U.S. tech industry.
Today, U.S. firms -- Hewlett-Packard, IBM and Intel, in particular -- dominate the global high performance computing (HPC) market. On the Top 500 list, the worldwide ranking of the most powerful supercomputers, HP now has 39% of the systems, IBM, 33%, and Cray, nearly 10%.
That lopsided U.S. market share does not sit well with other countries, which are busy building their own chips, interconnects and new technologies in the push for exascale. Europe, China and Japan are the major challengers to the U.S., which has yet to set an overall budget or a target date for its own efforts.
The Europeans are now building an exascale system using ARM chips designed by British semiconductor firm ARM Holdings, and hope to deliver the system by 2020. They also hope that the exascale effort can do for the region's tech firms what Airbus accomplished for the aircraft industry. (Airbus grew out of a European government initiative and consortium of aircraft makers that successfully challenged Boeing.)
"It's not that Europe just wants an exaflop system," said Alex Ramirez, the computer architecture research manager at the Barcelona Supercomputing Center. "It wants to be able to build the exaflop system and not buy it from a provider." Ramirez is leading the effort behind the ARM-based system, and attended the annual supercomputing conference, SC13, here this week.
Europe has already committed to spending the equivalent of $1.6 billion, in contrast to U.S. funding, which has been slight and is waiting for action from Congress.
Meanwhile, China could deliver a system before 2020. Its Tianhe-2, a supercomputer developed by China's National University of Defense Technology, maintained its global top ranking in the latest Top 500 benchmark of the world's most powerful supercomputers. The Tianhe-2 system runs at nearly 34 petaflops.
China is expected to produce two 100-petaflop size systems as early as 2015, one built entirely from China-made chips and interconnects.
In reaching exascale, "I think the Chinese are two years ahead of the U.S.," said Earl Joseph, an analyst at IDC who covers high performance computing.
Kimihiko Hirao, director of the RIKEN Advanced Institute for Computational Science of Japan, said in an interview that Japan is already discussing the creation of an exascale system by 2020, one that would use less than 30 megawatts of power.
RIKEN is the home of the world's fourth largest system, Fujitsu's K system, which runs at 10.5 petaflops and uses SPARC chips. Asked whether he sees the push to exascale as a race between nations, Hirao said yes. Will Japan try to win that race? "I hope so," he said.
"We are rather confident," said Hirao, arguing that Japan has the technology and the people to achieve the goal.
Jack Dongarra, a professor of computer science at the University of Tennessee and one of the academic leaders of the Top 500 supercomputing list, said Japan is serious and on target to deliver a system by 2020. Citing Japan's previous accomplishments in supercomputing, Dongarra said that "when the Japanese put down a plan to deliver a machine, they deliver the machine."
Separately, Dongarra does not believe that China has a head-start on the U.S.
"They are not ahead in terms of software, they are not ahead in terms of applications," said Dongarra. But he said China has shown a willingness to invest in HPC, "where we haven't seen that same level in the U.S. at this point."
Exascale computing isn't seen as just a performance goal. A nation's system can be designed to run a wide range of scientific applications, although there are often concerns that if it takes too much power to run, it might not be financially viable.
There is, nonetheless, a clear sense that HPC is at an exciting juncture, because new technologies are needed to achieve exascale. DRAM, for instance, is too slow and too expensive to support exascale, which is one million trillion calculations per second, or 1,000 times faster than the single petaflop systems available today. Among the possibilities is phase-change memory, which has 100 times the performance of flash memory products.
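To put those scale figures side by side, here is a minimal sketch of the arithmetic, assuming only the numbers quoted in this article. The function name and the dictionary of systems are illustrative, and Tianhe-2's "nearly 34 petaflops" is rounded to 34.

```python
# Sketch only: unit constants follow the article's definitions.
PETAFLOP = 10**15  # calculations per second
EXAFLOP = 10**18   # "one million trillion" calculations per second


def factor_to_exascale(petaflops):
    """Return how many times faster a 1-exaflop system is than one at `petaflops`."""
    return EXAFLOP / (petaflops * PETAFLOP)


# Figures quoted in the article; 34 is an approximation of "nearly 34".
systems = {
    "single petaflop system": 1,
    "K system": 10.5,
    "Tianhe-2": 34,
}

for name, pf in systems.items():
    print(f"{name}: exascale is about {factor_to_exascale(pf):.0f}x faster")
```

By this rough measure, even the current top-ranked system would need to grow by a factor of nearly 30 to reach an exaflop.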
Developing those new technologies will require major research investments by governments. The gridlock in Congress is partially to blame for the absence of major U.S. exascale funding at a level at least on par with Europe's. But political gridlock isn't wholly to blame. The White House's recent emphasis on big data is seen by some as delivering mixed messages about U.S. focus. The Department of Energy (DOE) has yet to offer up a clear exascale delivery date, simply describing the goal more generally as "in the 2020 timeframe."
A major constraint is the cost of power. Roughly, 1 megawatt a year costs $1 million. While the DOE has set a goal of building an exascale system that uses 20 megawatts or less, Joseph said that may be too stringent a goal. Instead, he envisioned 50-to-100-megawatt data centers built to support large-scale systems.
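The power budgets mentioned in this article can be compared with the rough rule of thumb above. The sketch below assumes the $1 million per megawatt-year figure holds linearly at every scale, which is a simplification; the function and scenario labels are illustrative and not drawn from any DOE or IDC model.

```python
# Rough rule of thumb from the article: 1 megawatt for a year costs about $1 million.
USD_PER_MEGAWATT_YEAR = 1_000_000


def annual_power_cost_usd(megawatts):
    """Estimate yearly power cost, assuming cost scales linearly with megawatts."""
    return megawatts * USD_PER_MEGAWATT_YEAR


# Power figures quoted in the article.
scenarios = [
    ("DOE goal", 20),
    ("Japan's stated upper bound", 30),
    ("Joseph's estimate, low end", 50),
    ("Joseph's estimate, high end", 100),
]

for label, mw in scenarios:
    print(f"{label}: {mw} MW -> ${annual_power_cost_usd(mw):,} per year")
```

Under that assumption, the gap between the DOE goal and the high end of Joseph's range is about $80 million a year in power costs alone.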
Dongarra and others remain optimistic that Congress will deliver on funding. There is clear bipartisan support. In the U.S. House, Rep. Randy Hultgren (R-Ill) has been working to get funding passed, and has 18 co-sponsors from both parties. Similar efforts are under way in the Senate.
Global exascale competition isn't necessarily about the basic science or the programming.
The Department of Energy's Argonne National Laboratory, for instance, just announced a cooperation agreement on petascale computing with Japan. Peter Beckman, a top computer scientist at the laboratory and head of an international exascale software effort, said the pact calls for information sharing with Japanese HPC scientists. The two groups are expected to discuss how they manage their machines, their power and other operational topics. The effort is analogous to Facebook's Open Compute project, where some aspects of data center designs and operations are openly shared.
"We're not competing at this level," said Beckman. "We're just trying to run stuff."
On a broader scale, there is considerable effort internationally on writing programs for large-scale parallel machines, but no agreement on approach.
"That is one area where people really want to work together," said Beckman. "You want to be able to write portable code, and there does not seem to be competition in that. We want the railroad gauge to be the same in every country, because it just makes our lives a lot easier."
Patrick Thibodeau covers cloud computing and enterprise applications, outsourcing, government IT policies, data centers and IT workforce issues for Computerworld.