Main-Memory Database

Theory and Practice

The rationale behind MMDBs is straightforward: a disk access operation

takes a few milliseconds whereas a RAM access operation is at least a

thousand times faster. Therefore, loading an entire database into RAM

should -- in theory at least -- improve performance significantly. In

practice, however, things are more complicated.

Loading an ordinary database into the system's RAM will certainly

improve performance, but the effect won't be as dramatic as one might

expect. The problem is that disk-based databases run background

processes that are responsible for caching and transaction processing.

Even when these processes aren't needed, they incur substantial

overhead and can't be turned off.

Caching and Transaction Processing Overhead

In order to improve performance, disk-based databases cache frequently

accessed portions of the database (e.g., table indexes) by loading them

into RAM. When the cached data is no longer used, or when the cache is

full, the DBMS swaps parts of the current cache out to a disk file.
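
To make the caching behavior described above concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular DBMS implements its cache: the PageCache class, the least-recently-used eviction policy, and the plain dict standing in for the disk file are all assumptions made for the example.

```python
from collections import OrderedDict


class PageCache:
    """Toy page cache: hot pages stay in RAM, cold pages are swapped out.

    Hypothetical illustration; eviction policy (LRU) is an assumption.
    """

    def __init__(self, capacity, backing_store):
        self.capacity = capacity        # maximum number of pages kept in RAM
        self.backing = backing_store    # dict standing in for the disk file
        self.pages = OrderedDict()      # page_id -> data, least recent first
        self.disk_reads = 0             # counts cache misses

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        self.disk_reads += 1                  # miss: fetch from "disk"
        data = self.backing[page_id]
        self._admit(page_id, data)
        return data

    def put(self, page_id, data):
        self._admit(page_id, data)

    def _admit(self, page_id, data):
        self.pages[page_id] = data
        self.pages.move_to_end(page_id)
        # When the cache is full, swap the least recently used pages out.
        while len(self.pages) > self.capacity:
            victim_id, victim_data = self.pages.popitem(last=False)
            self.backing[victim_id] = victim_data
```

Every call to get() or put() pays for this bookkeeping, even when the whole database would fit in RAM anyway, which is the overhead the paragraph above refers to.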

Transaction processing is another cause of overhead. Disk-based

databases constantly log information that enables the DBMS to recover

from disasters such as a power fault or network failure. Clearly, both

these features are useless in MMDBs. However, they can't be turned off

because they are integral constituents of ordinary database engines.
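
The logging overhead can be sketched in the same spirit. The example below is a hypothetical, heavily simplified append-only log in Python; the LoggedStore class and its JSON record format are inventions for this illustration and do not reflect any specific database engine. The point to notice is that every update is forced to disk before it is applied in memory.

```python
import json
import os


class LoggedStore:
    """Toy key-value store that logs every update before applying it.

    Hypothetical illustration; the record format is an assumption.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()                      # recover state after a crash
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break                   # torn final record: stop here
                self.data[record["key"]] = record["value"]

    def set(self, key, value):
        # Write the log record and force it to disk first...
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # ...and only then change the in-memory state.
        self.data[key] = value

    def close(self):
        self.log.close()
```

Reopening a LoggedStore on the same file rebuilds the data by replaying the log. The price is one synchronous disk write per update, regardless of where the data itself lives.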

MMDB Systems

MMDBs work differently. They eliminate the unnecessary overhead of

caching and transaction processing and use alternative algorithms for

storing and retrieving data. In addition, they enable users to store

the contents of a RAM database on a physical disk. That said,

production MMDBs must be able to cope with memory exhaustion and

fall back to traditional disk storage in a user-transparent fashion. In

fast-growing databases, this isn't an unlikely scenario. Furthermore,

if the database is extremely large, loading it into RAM on startup can

take hours.

Instead, a sophisticated MMDB uses heuristics to determine which

portions of the database should be loaded initially, leaving the rest

of the data to be loaded on demand.
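
One way to picture such a heuristic is sketched below in Python. Everything in it is an assumption made for illustration: the LazyDatabase class, the use of past access counts as the heuristic, and a budget measured in rows. Real products may decide very differently.

```python
class LazyDatabase:
    """Toy loader: preload the hottest tables, fetch the rest on demand.

    Hypothetical illustration; the heuristic and names are assumptions.
    """

    def __init__(self, disk_tables, access_counts, preload_budget):
        self.disk = disk_tables          # table name -> rows (stands in for disk)
        self.memory = {}                 # tables currently held in RAM
        self.loads_on_demand = 0
        used = 0
        # Heuristic: consider the most frequently accessed tables first and
        # preload each one that still fits in the row budget.
        by_popularity = sorted(
            disk_tables,
            key=lambda name: access_counts.get(name, 0),
            reverse=True,
        )
        for name in by_popularity:
            size = len(disk_tables[name])
            if used + size > preload_budget:
                continue                 # too big for startup; load it later
            self.memory[name] = list(disk_tables[name])
            used += size

    def table(self, name):
        if name not in self.memory:      # first access: load on demand
            self.loads_on_demand += 1
            self.memory[name] = list(self.disk[name])
        return self.memory[name]
```

With three tables and a budget of five rows, for instance, two small, frequently used tables would be loaded at startup, while a large, rarely used one would be read only when a query first touches it.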

Currently, there are several MMDBs for Linux. Some of them simply tweak

a traditional disk-based DBMS and load it into RAM. Others, such as

McObject's eXtremeDB (http://www.mcobject.com/index.htm), were

originally designed according to the principles of MMDB.

A Note Regarding Last Week's Newsletter

Several readers have pointed out to me that state 4 is customizable. By

default, Linux doesn't associate a meaningful value with it but users may

create scripts to specify which processes and daemons will be run when

the system is set to state 4.
