Supercomputer makers will have to construct their machines so that cost and power consumption do not increase linearly with performance, lest the machines grow too expensive to purchase and run, Dongarra said. An exascale machine should cost about $200 million and use only about 20 megawatts, which works out to an efficiency of about 50 gigaflops per watt.
Dongarra expects that half the cost of building such a computer would be earmarked for buying memory for the system. Judging from the roadmaps of memory manufacturers, Dongarra estimated that $100 million would purchase between 32 petabytes and 64 petabytes of memory by 2020.
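The targets above can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the function and variable names are hypothetical, and it assumes decimal units (one exaflop is 10^18 floating-point operations per second, one petabyte is one million gigabytes), which the article does not specify.

```python
# Back-of-the-envelope check of the exascale targets quoted above.
# All names are illustrative; decimal (SI) units are an assumption.

EXAFLOP = 1e18           # floating-point operations per second
POWER_BUDGET_W = 20e6    # 20 megawatts, expressed in watts
MEMORY_BUDGET_USD = 100e6  # half of the $200 million machine cost


def gigaflops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency implied by a performance and power target."""
    return flops / watts / 1e9


def dollars_per_gigabyte(budget_usd: float, petabytes: float) -> float:
    """Memory price implied by a budget and a capacity (1 PB = 1e6 GB)."""
    return budget_usd / (petabytes * 1e6)


efficiency = gigaflops_per_watt(EXAFLOP, POWER_BUDGET_W)
price_low_capacity = dollars_per_gigabyte(MEMORY_BUDGET_USD, 32)
price_high_capacity = dollars_per_gigabyte(MEMORY_BUDGET_USD, 64)

print(f"Efficiency target: {efficiency:.0f} gigaflops per watt")
print(f"Implied memory price at 32 PB: ${price_low_capacity:.2f} per GB")
print(f"Implied memory price at 64 PB: ${price_high_capacity:.2f} per GB")
```

Under those assumptions, the memory estimate implies a price of roughly $1.50 to $3 per gigabyte by 2020.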
In addition to challenges in hardware, designers of exascale supercomputers must also grapple with software issues. One issue will be synchronization, Dongarra said. Today's machines pass tasks among many different nodes, though this approach needs to be streamlined as the number of nodes increases.
"Today, our model for parallel processing is a fork/join model, but you can't do that at [the exascale] level of a parallelism. We have to change our model. We have to be more synchronous," Dongarra said. Along the same lines, algorithms need to be developed that reduce the amount of overall communication among nodes.
Other factors must be considered as well. The software must come with built-in routines for optimization. "We can't rely on the user setting the right knobs and dials to get the software to run anywhere near peak performance," Dongarra said. Fault resilience will be another important feature, as will reproducibility of results, or the guarantee that a complex calculation will produce the exact same answer when run more than once.
Reproducibility may seem like an obvious trait for a computer. But in fact, it can be a challenge for huge calculations on multinode supercomputers.
"From the standpoint of numerical methods, it is hard to guarantee bit-wise reproducibility," Dongarra said. "The primary problem is in doing a reduction -- a summing up of numbers in parallel. If I can't guarantee the order in which those numbers come together, I'll have different round-off errors. That small difference can be magnified in a way that can cause answers to diverge catastrophically," he said.
"We have to come up with a scenario in which we can guarantee the order in which those operations are done, so we can guarantee we have the same results," Dongarra said.