The i860, though, never caught on as the general-purpose CPU Intel intended. It had ideas like zero signal process instructions and single instruction multiple data (SIMD) that are now standard in GPUs. But it was terrible at context switching, said Culver. It could take 2,000 cycles to switch tasks, which is an eternity for a processor. It was also terrible at multitasking.
The floating-point registers in the i860 that made it so popular in multimedia would find their way into the x86 in the form of MMX, or MultiMedia Extensions. They were introduced in the Pentium line in 1996 and are still in use today.
What started out as a "brilliant" idea on paper, as Reynolds puts it, has turned into an embarrassment. "It was going back to simple. We were going to make the internals very fast and use the compiler to set the instructions up for us so they run very quickly. To that point, hardware had to take instructions, pull them apart and reassemble them to run through the processor," says Reynolds.
Intel predicted IA-64 would replace x86, so it didn't bother working on a 64-bit version of x86. DEC, HP and SGI all gave up their RISC efforts in favor of Itanium, and HP was Intel's development partner on the project. Sun promised a port of Solaris. Only IBM stayed out, content with its Power architecture.
And then the wheels came off. Performance was poor and no one would port their x86 apps to Itanium. DEC and SGI ceased to be effective competitors in the marketplace. Sun stayed with Sparc. The initial Itanium chips sold for as much as $2,000. Support dropped off before the chip even reached the market in 2001.
"Intel went to software developers and said 'we need code for this.' Code writers said 'we need compilers because nothing we have is optimized for this architecture'," says Culver. He faulted Intel for not making optimal compilers available for developers when it shipped, which stopped the processor's momentum dead.
Reynolds also faults Intel for not providing the compilers developers needed. "Developers rely on the compiler to build the code in such a way it can use all those execution units. Without good compiler support, that's not going to happen. Clock speed doesn't matter as much because you do everything in parallel. Poor branch prediction slows the computer way down because then clock speed becomes most important," he said.
Then AMD struck. Former DEC engineer Dirk Meyer, who helped design the 64-bit Alpha processor, designed the Athlon desktop and eventually Opteron server chips, which were 64-bit x86, had the memory controller on the chip, and eventually became dual-core.
AMD went from also-ran to major competitor almost overnight. Suddenly there was a 64-bit x86 desktop available for under $200. Opteron was the first 64-bit x86 processor and was the first platform used in server consolidation efforts because it busted the 4GB memory limit of 32-bit processors. In the space of two years, AMD went from 0% server market share to 20%.
This forced Intel's hand on x86. After Intel had said x86 didn't need 64 bits because everyone would move to Itanium, x86 went 64-bit. Core processors for desktops and Xeon for servers began gaining more and more features found in the Itanium, such as memory error correction. The Xeon 7500, launched in 2010, added a number of RAS (Reliability, Availability, Serviceability) features previously found only in Itanium.
Then in November 2012, Intel announced plans to merge the Itanium and Xeon architectures, sharing essential on-chip features. Intel said it was doing this to reduce development costs, but with 80 percent of servers running x86 and 14 percent using RISC/Itanium and shrinking fast, according to Reynolds, the future does not look good for Itanium.
In all three cases, the processors were undone by two things: a lack of adequate compilers and the entrenchment of x86. With each passing year, x86 only accumulates more software. Intel has a compiler business but the standard is Microsoft's Visual Studio and Microsoft plans Visual Studio around its own releases, not Intel's.
Despite multiple attempts to retire x86, it may end up that ARM will be its undoing. The processor in your smartphone is getting faster and more powerful with each generation. Initial tests of Nvidia's upcoming Tegra 4 processor indicate that it will be three times faster than the Tegra 3, according to Jim McGregor, president of Tirias Research.
"That's equal to somewhere between a Core i3 and i5. Even when you don't include new chips, if you look at the rapid progression of latest processors from Qualcomm, Nvidia and Apple, you see how quickly those things are ramping up and these are still 32-bit processors," says McGregor.
The worst part for Intel is that it had an answer to ARM: the StrongARM/XScale ARM processor that it sold off to Marvell in 2006. "If they stuck with StrongARM, they'd be leaps and bounds ahead of where they are now with Atom," says McGregor.
It will be a challenge awaiting Intel's next CEO, as current CEO Paul Otellini is headed for retirement later this year. He was with Intel the entire time, through the iAPX432, the i860, the Pentium bug fiasco (he was the general manager of the Pentium group at the time) and Itanium. Otellini did a lot to turn around the mess at Intel, including responding to AMD's Athlon challenge, but in the end, he couldn't stem the ARM tide.
It would be ironic, then, if the undoing of x86 isn't a faster, more powerful processor but a smaller, lower-power chip.