Nvidia held its annual GPU Technology Conference this week, and the opening keynote made it clear that the leader in GPU technology isn't slowing down, nor is it remotely held back by the slowing of Moore's Law.
The company introduced the Maxwell architecture just last year, but it is already moving on to the next one: CEO Jen-Hsun Huang went into detail on Pascal, the architecture due next year. Thanks to advances in both the GPU and the memory architecture, Huang promised a 10-fold improvement in performance over Maxwell, which is already a barn-burner.
Pascal will use TSMC’s 16nm FinFET+ process, which builds transistors as three-dimensional structures rather than flat, planar ones. Huang claimed that Pascal will achieve more than twice the performance per watt of Maxwell in single-precision math. That means little to gamers, who want raw performance and aren't concerned with power. For supercomputing installations, however, where thousands of cards run in thousands of machines, power efficiency is a major concern.
The other big improvement is the use of High Bandwidth Memory. HBM is 3D stacked memory, which will provide three times the bandwidth and nearly three times the frame buffer capacity of Maxwell. Pascal will have its memory chips stacked on top of each other and placed adjacent to the GPU, rather than farther away on the board. Because data will have to travel just millimeters instead of inches, memory performance should improve substantially as well.
Nvidia said it will offer up to 32GB of RAM per GPU. This will allow for up to five times better performance in what Nvidia calls "deep learning applications," which are applications capable of gathering data and learning to recognize patterns or images. It's also a sign that this card is aimed at high-performance computing, as the majority of video cards have just 2GB of memory.
Pascal will be the first Nvidia GPU to feature NVLink, the company's high-speed interconnect, which is capable of transferring data at a rate of up to 200 gigabytes per second. It will allow the CPU and GPU to communicate five to 12 times faster than standard PCI Express Gen3 interconnects and uses 3D stacked memory, increasing its bandwidth fourfold.
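To put the interconnect figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a PCI Express Gen3 x16 link peaks at roughly 16 gigabytes per second in each direction (a figure not given in the keynote coverage) and uses the 200 gigabytes per second quoted for NVLink. The function name and the 32GB payload are illustrative choices, and real-world transfers would fall short of these peak numbers.

```python
# Rough comparison of peak transfer times; all figures are theoretical peaks.

# Assumption: approximate one-direction peak of a PCIe Gen3 x16 link, in GB/s.
PCIE_GEN3_X16_GB_PER_S = 16.0
# Peak NVLink rate quoted in the article, in GB/s.
NVLINK_GB_PER_S = 200.0


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Ideal time to move size_gb gigabytes over a link of the given bandwidth."""
    return size_gb / bandwidth_gb_per_s


# Illustrative payload: enough data to fill the 32GB maximum per-GPU memory.
payload_gb = 32.0

pcie_time = transfer_seconds(payload_gb, PCIE_GEN3_X16_GB_PER_S)
nvlink_time = transfer_seconds(payload_gb, NVLINK_GB_PER_S)
speedup = pcie_time / nvlink_time

print(f"PCIe Gen3 x16: {pcie_time:.2f} s")
print(f"NVLink:        {nvlink_time:.2f} s")
print(f"Speedup:       {speedup:.1f}x")
```

Under these assumptions the peak speedup works out to about 12.5 times, in line with the upper end of the five-to-12-times range Nvidia cited.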
Huang also showed off the company's new top-of-the-line card, the Titan X. Built on the Maxwell architecture, it has 3,072 processing cores and delivers more than 7 teraflops of single-precision performance. In a demo, Huang showed off the card rendering 15 million plants in a single scene in real time.
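The core count and the teraflops figure are related by simple arithmetic. The sketch below assumes each core can complete one fused multiply-add per clock cycle, which is conventionally counted as two floating-point operations. Under that assumption it works out the clock speed implied by 7 teraflops. The function names are illustrative, and Nvidia did not present this calculation in the keynote.

```python
# Relating core count, clock speed and peak single-precision throughput.

TITAN_X_CORES = 3072  # core count from the article

# Assumption: one fused multiply-add per core per clock, counted as 2 flops.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_tflops(cores: int, clock_ghz: float) -> float:
    """Theoretical peak in teraflops for a given core count and clock in GHz."""
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


def clock_ghz_for(cores: int, tflops: float) -> float:
    """Clock speed in GHz needed to reach the given peak teraflops figure."""
    return tflops * 1000.0 / (cores * FLOPS_PER_CORE_PER_CLOCK)


implied_clock = clock_ghz_for(TITAN_X_CORES, 7.0)
print(f"Clock implied by 7 teraflops: {implied_clock:.3f} GHz")
print(f"Peak at 1.0 GHz: {peak_tflops(TITAN_X_CORES, 1.0):.3f} teraflops")
```

By this estimate, 3,072 cores would need to run at a little over 1.1 GHz to reach 7 teraflops.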
It'll cost you, though. The Titan X has a retail price of $999.