October 19, 2001, 12:00 AM — SMP in a Nutshell
In a previous newsletter (http://www.itworld.com/nl/lnx_tip/03302001),
I presented the principles of symmetric multiprocessing (SMP). An SMP
system combines multiple processors that run under a single
operating system and share the same memory over a common bus.
However, SMP's scalability is rather limited; once the system includes
more than 16 processors, performance usually deteriorates. The problem
lies with the throughput of the shared bus that connects the processors
to memory devices. As the number of processors increases, the bus
becomes saturated and turns into a performance bottleneck.
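
The saturation effect can be illustrated with a deliberately simplified
model. The sketch below assumes that every processor needs a fixed share
of bus bandwidth. The function name smp_speedup and the figures (a bus
able to feed 16 processors) are illustrative assumptions, not
measurements of any real machine.

```python
def smp_speedup(n_cpus, bus_capacity, demand_per_cpu):
    """Idealized speedup of n_cpus processors sharing one memory bus.

    bus_capacity and demand_per_cpu use the same arbitrary bandwidth
    unit; both are hypothetical inputs chosen for illustration.
    """
    total_demand = n_cpus * demand_per_cpu
    if total_demand <= bus_capacity:
        # The bus keeps up, so every processor runs at full speed.
        return float(n_cpus)
    # The bus is saturated: the processors split its capacity, so adding
    # more of them no longer increases total throughput.
    return bus_capacity / demand_per_cpu


for n in (1, 4, 16, 32, 64):
    print(n, smp_speedup(n, bus_capacity=16.0, demand_per_cpu=1.0))
```

In this model the speedup simply stops growing once the bus is full. It
does not account for contention overhead, which is one reason a real SMP
system can actually get slower rather than merely level off.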
NUMA (Non-Uniform Memory Access) is a relatively new method of
configuring a cluster of processors in a multiprocessor system so that
they share memory locally, thereby improving performance and allowing
the system to expand beyond the inherent limits of SMP. NUMA adds an
intermediate level of memory
shared among a few processors so that most data accesses don't have to
travel on the main bus. A NUMA system uses three cache layers, where a
lower number indicates a faster cache: L1, L2 and L3. When a processor
looks for data, it first checks the L1 cache on the processor itself
(MMX processors, for instance, have a private 32KB cache each), then a
larger L2 cache chip nearby, and then the L3 cache that NUMA provides.
Only if all the previous lookups have failed does the processor seek
the data in the external memory, which is significantly slower. Put
differently, NUMA introduces an additional cache layer that reduces the
number of accesses to the external memory.
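
The lookup order described above can be sketched as a short simulation.
This is only an illustration: the latencies are made-up cycle counts,
and the names LEVELS, MEMORY_LATENCY and lookup are hypothetical, not
part of any real NUMA interface.

```python
# Hypothetical access costs in processor cycles (illustrative only).
LEVELS = [("L1", 1), ("L2", 10), ("L3", 50)]
MEMORY_LATENCY = 200


def lookup(address, caches):
    """Search L1, then L2, then L3, and fall back to external memory.

    Returns a tuple (where the data was found, total cycles spent).
    """
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency              # pay for probing this level
        if address in caches[name]:
            return name, cycles        # hit: stop searching
    # Every cache missed, so the processor goes to external memory.
    return "memory", cycles + MEMORY_LATENCY


# Toy cache contents: each larger level also holds the smaller one's data.
caches = {"L1": {0x10}, "L2": {0x10, 0x20}, "L3": {0x10, 0x20, 0x30}}

for addr in (0x10, 0x20, 0x30, 0x40):
    print(hex(addr), lookup(addr, caches))
```

With these assumed costs, a request that is satisfied by the L3 cache
takes 61 cycles, while one that misses every cache takes 261, which is
the saving the extra layer is meant to provide.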
A typical NUMA-based machine consists of multiple clusters, or units.
Each unit consists of four processors interconnected by a local bus to
a shared memory (the L3 cache) on a single motherboard. A common SMP
bus interconnects several units, thus forming an SMP system. Such a
system may contain up to 256 processors. NUMA views each of these units
as a node in the interconnection network. However, a user-level
application views all the individual clusters' memories as a single
memory.
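
The arithmetic behind this layout is easy to check. The sketch below
takes the four-processors-per-unit and 256-processor figures from the
text. It assumes, purely for illustration, that processors are numbered
consecutively from 0 so that each unit owns a contiguous block of IDs.
The names unit_of and is_local are hypothetical.

```python
CPUS_PER_UNIT = 4    # from the article: four processors per unit
MAX_CPUS = 256       # from the article: upper limit for such a system
MAX_UNITS = MAX_CPUS // CPUS_PER_UNIT   # 64 units (nodes) at most


def unit_of(cpu_id):
    """Return the unit (node) that a processor belongs to.

    Assumes consecutive numbering: CPUs 0-3 are unit 0, 4-7 unit 1, ...
    """
    if not 0 <= cpu_id < MAX_CPUS:
        raise ValueError("cpu_id out of range")
    return cpu_id // CPUS_PER_UNIT


def is_local(cpu_id, memory_unit):
    """True if the processor can reach memory_unit over its local bus."""
    return unit_of(cpu_id) == memory_unit


print(MAX_UNITS)           # 64
print(unit_of(5))          # 1
print(is_local(5, 1))      # True: served by the unit's own shared memory
print(is_local(5, 2))      # False: must cross the common SMP bus
```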
For further information about NUMA see: