February 01, 2002, 12:00 AM
Theory and Practice
The rationale behind MMDBs is straightforward: a disk access operation
takes a few milliseconds whereas a RAM access operation is at least a
thousand times faster. Therefore, loading an entire database into RAM
should -- in theory at least -- improve performance significantly. In
practice, however, things are more complicated.
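To put rough numbers on that rationale, here is a back-of-the-envelope
sketch in Python. The 5 ms disk access time and the 100,000 random
lookups are assumed figures chosen purely for illustration; only the
factor of one thousand comes from the text above.

```python
# Assumed figures, for illustration only.
DISK_ACCESS_US = 5_000                   # "a few milliseconds" per disk access
RAM_ACCESS_US = DISK_ACCESS_US // 1_000  # "a thousand times faster"
LOOKUPS = 100_000                        # hypothetical workload of random reads


def total_seconds(accesses, cost_us):
    """Total time in seconds for `accesses` operations of `cost_us` microseconds each."""
    return accesses * cost_us / 1_000_000


disk_seconds = total_seconds(LOOKUPS, DISK_ACCESS_US)
ram_seconds = total_seconds(LOOKUPS, RAM_ACCESS_US)

print(f"disk: {disk_seconds:.1f} s, RAM: {ram_seconds:.1f} s")
# prints "disk: 500.0 s, RAM: 0.5 s"
```

Under these assumptions, raw access time drops from more than eight
minutes to half a second. That is the theoretical ceiling, not what an
ordinary database loaded into RAM will actually deliver.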
Loading an ordinary database into the system's RAM will certainly
improve performance, but the effect won't be as dramatic as one might
expect. The problem is that disk-based databases run background
processes that are responsible for caching and transaction processing.
Even when these processes aren't needed, they incur substantial
overhead and can't be turned off.
Caching and Transaction Processing Overhead
In order to improve performance, disk-based databases cache frequently
accessed portions of the database (e.g., table indexes) by loading them
into RAM. When the cached data is no longer used, or when the cache is
full, the DBMS swaps out parts of the current cache into a disk file.
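A minimal Python sketch of this kind of cache follows. It assumes a
least-recently-used eviction policy, which the text does not specify,
and uses a plain dictionary as a stand-in for the disk file. The
PageCache class and its method names are hypothetical.

```python
from collections import OrderedDict


class PageCache:
    """Tiny LRU cache: hot pages stay in RAM, evicted pages go to 'disk'."""

    def __init__(self, capacity, disk):
        self.capacity = capacity   # maximum number of pages held in RAM
        self.disk = disk           # dict standing in for the disk file
        self.ram = OrderedDict()   # page_id -> data, least recently used first
        self.disk_reads = 0        # counts how often we had to go to disk

    def get(self, page_id):
        if page_id in self.ram:
            self.ram.move_to_end(page_id)  # mark as most recently used
            return self.ram[page_id]
        self.disk_reads += 1               # cache miss: fetch from disk
        data = self.disk[page_id]
        self.put(page_id, data)
        return data

    def put(self, page_id, data):
        self.ram[page_id] = data
        self.ram.move_to_end(page_id)
        if len(self.ram) > self.capacity:
            # Cache is full: swap the least recently used page out to disk.
            old_id, old_data = self.ram.popitem(last=False)
            self.disk[old_id] = old_data


disk = {1: "index page", 2: "row page A", 3: "row page B"}
cache = PageCache(capacity=2, disk=disk)
for page in (1, 2, 1, 3):
    cache.get(page)
print(list(cache.ram), cache.disk_reads)  # prints "[1, 3] 3"
```

Note that every lookup pays for the bookkeeping (the membership test
and the reordering) even when the page is already in memory. That
bookkeeping is the kind of overhead that serves no purpose once the
whole database lives in RAM.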
Transaction processing is another cause of overhead. Disk-based
databases constantly log information that enables the DBMS to recover
from disasters such as a power or network failure. Clearly, both of
these features are useless in MMDBs. However, they can't be turned off
because they are integral constituents of ordinary database engines.
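The logging described above can be sketched in Python as an
append-only log that is flushed to disk before each change is applied
in memory, and replayed on restart. This is an illustrative
simplification, not how any particular DBMS implements recovery; the
LoggedStore class, the JSON record format, and the file name are all
assumptions.

```python
import json
import os
import tempfile


class LoggedStore:
    """Key-value store that logs every change to disk before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recover state left behind by an earlier run

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def set(self, key, value):
        # Write and sync the log record first, so the change survives a crash.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "changes.log")
    store = LoggedStore(path)
    store.set("balance", 100)
    store.set("balance", 75)
    recovered = LoggedStore(path)  # simulates a restart after a failure
    print(recovered.data)          # prints "{'balance': 75}"
```

Each call to set() costs a disk write and a sync regardless of where
the data itself is kept, which is why this overhead persists even after
an ordinary database has been loaded into RAM.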
MMDBs work differently. They eliminate the unnecessary overhead of
caching and transaction processing and use alternative algorithms for
storing and retrieving data. In addition, they enable users to store
the contents of a RAM database on a physical disk. That said,
production MMDBs must be able to cope with memory exhaustion and
fall back to traditional disk storage in a user-transparent fashion. In
fast-growing databases, this isn't an unlikely scenario. Furthermore,
if the database is extremely large, loading it into RAM on startup can
be impractical.
Instead, a sophisticated MMDB uses heuristics to determine which
portions of the database should be loaded initially, leaving the rest
of the data to be loaded on demand.
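One simple heuristic of this kind is to preload the tables that were
accessed most often in the past and fetch everything else on first use.
The Python sketch below assumes access counts are available from a
previous run; the LazyDatabase class, the table names, and the preload
limit are hypothetical and not taken from any particular MMDB.

```python
class LazyDatabase:
    """Preloads the most-used tables into RAM; loads the rest on demand."""

    def __init__(self, disk, access_counts, preload_limit):
        self.disk = disk  # dict standing in for on-disk storage
        # Heuristic (assumed): the tables used most in the past are the
        # ones most likely to be needed first.
        hottest = sorted(access_counts, key=access_counts.get, reverse=True)
        self.ram = {name: disk[name] for name in hottest[:preload_limit]}
        self.loads_on_demand = 0

    def get(self, name):
        if name not in self.ram:
            # Not preloaded: bring it into RAM on first access.
            self.ram[name] = self.disk[name]
            self.loads_on_demand += 1
        return self.ram[name]


disk = {"orders": [101, 102], "users": ["ann", "bob"], "audit": ["login"]}
access_counts = {"orders": 90, "users": 40, "audit": 1}

db = LazyDatabase(disk, access_counts, preload_limit=2)
print(sorted(db.ram))      # prints "['orders', 'users']"
db.get("audit")            # first use triggers a load from disk
print(db.loads_on_demand)  # prints "1"
```

A production system would also need the reverse path, evicting data
back to disk when memory runs out, which this sketch leaves out.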
Currently there are several MMDBs for Linux. Some of them simply tweak
a traditional disk-based DBMS and load it into RAM. Others, such as
McObject's eXtremeDB (http://www.mcobject.com/index.htm), were
originally designed according to the principles of MMDB.
A Note Regarding Last Week's Newsletter
Several readers have pointed out to me that state 4 is customizable. By
default, Linux doesn't associate a meaningful value with it, but users may
create scripts to specify which processes and daemons will be run when
the system is set to state 4.