During one of the IDF technical sessions, Intel Senior Principal Engineer Ronak Singhal noted several times that Intel added no new features to the CPU that would have imposed a power penalty. Even so, CPU designers have a number of options for improving performance while accommodating the need for greater power efficiency.
One trick is branch prediction, which lets the CPU anticipate instructions that are likely to be executed in the near future. If the CPU knows which instructions will be coming through the pipeline next, it can allocate CPU resources much more efficiently, turning on only the parts of the CPU that the new task requires. So Intel tweaked Haswell's architectural elements to improve branch prediction, enlarging the internal buffers and the out-of-order windows.
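To make the idea concrete, here is a minimal sketch of a textbook two-bit saturating-counter predictor. It is not Haswell's actual predictor, whose design is not described here; the function name and the loop pattern are assumptions chosen purely for illustration. The sketch shows how a simple history counter guesses whether a branch will be taken, and how often that guess is right for a typical loop.

```python
def simulate_two_bit_predictor(outcomes, initial_state=2):
    """Return the fraction of branches predicted correctly.

    outcomes: iterable of booleans, True meaning the branch was taken.
    States 0-1 predict "not taken"; states 2-3 predict "taken".
    (Hypothetical helper for illustration, not an Intel design.)
    """
    state = initial_state
    correct = 0
    total = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        total += 1
    return correct / total if total else 0.0


# A loop branch: taken nine times, then not taken once, repeated ten times.
loop_pattern = ([True] * 9 + [False]) * 10
accuracy = simulate_two_bit_predictor(loop_pattern)
print(f"Prediction accuracy: {accuracy:.0%}")
```

In this toy model the only misprediction comes at each loop exit, because the counter needs two wrong guesses in a row before it changes its prediction. Real predictors track far more history.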
The more work a CPU can do in a single cycle, the better its performance will be at the same level of power usage. So Intel added the ability to run two floating-point multiply-add operations every clock cycle, doubling peak floating-point throughput over Ivy Bridge without raising power consumption. L1 and L2 cache throughput is better, too, reducing how long the CPU must wait for data to arrive.
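The doubling can be checked with a rough back-of-the-envelope calculation. The sketch below rests on assumptions the article does not state: 256-bit vector units holding eight single-precision values each, and an Ivy Bridge core that issues one vector multiply and one vector add per cycle (the same peak as a single multiply-add unit). The function name is hypothetical.

```python
def peak_flops_per_cycle(multiply_add_units, lanes_per_unit):
    """Peak floating-point operations per cycle for one core.

    Each multiply-add counts as two operations (one multiply, one add)
    per vector lane. Hypothetical helper for illustration only.
    """
    return multiply_add_units * lanes_per_unit * 2


# Assumption: 256-bit vectors / 32-bit floats = 8 single-precision lanes.
LANES = 256 // 32

# Assumption: one multiply plus one add per cycle on Ivy Bridge gives the
# same peak as a single multiply-add unit.
ivy_bridge = peak_flops_per_cycle(1, LANES)

# Haswell: two multiply-add operations per cycle, as described above.
haswell = peak_flops_per_cycle(2, LANES)

print(f"Ivy Bridge: {ivy_bridge} FLOPs/cycle, Haswell: {haswell} FLOPs/cycle")
print(f"Speedup: {haswell / ivy_bridge:.1f}x")
```

These are peak figures; real code reaches them only when it can keep both units busy with independent multiply-add operations.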
Of course, none of this good stuff comes for free. Though power efficiency has improved, Haswell pays a price in chip real estate. Given that Haswell will still be built on the 22nm CMOS process, the chips themselves are likely to be larger than Ivy Bridge CPUs.
The chip size will likely increase for another reason as well: graphics.
High-end PC gaming on tablets: Haswell graphics
Haswell builds on the existing Intel HD graphics core in Sandy Bridge, adding refinements and improving power efficiency. Haswell offers three integrated graphics options (called GT1, GT2, and GT3), as opposed to the two options (Intel HD 2500 and HD 4000) available with Ivy Bridge.
From a performance perspective, GT3 is the most interesting new graphics option. GT3 doubles graphics performance over what was possible with the older HD 4000 GPU, simply by doubling the number of execution units. Execution units act as the GPU's core computation engine, handling graphics shader and GPU compute tasks. These execution units are built into a common modular unit, which Intel calls a "slice common."
The slice common contains a number of other key components for real-time graphics, such as the raster engines and cache. To double the number of compute engines over the HD 4000, Intel added a second slice common to GT3. This additional slice takes up some chip space, but it saves power because the GPU doesn't need to enter turbo mode to deliver the additional performance.