Journaling and Logging
The traditional Linux file systems were based on the legacy Unix file
systems. Such file systems (e.g. ext2fs) are static, which means they
do not track changes applied to files and directories to guarantee that
all updates are performed safely. Furthermore, ext2fs works
asynchronously. Information about a file -- for example its
permissions, creation date, and ownership -- is written in a delayed
fashion and, often, in several distinct operations.
This approach results in a noticeable performance gain; however, it
also incurs data consistency problems. If a power failure occurs
after the file system has updated the contents of a file but
before it has updated the file's header, the file becomes
corrupted. Worse yet, if the disk is highly fragmented, other files
are likely to have been corrupted as well, and the
entire directory needs to be restored.
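
The failure window described above can be illustrated with a small
simulation. This is only a sketch, not how ext2fs is implemented: the
ToyFileSystem class, the PowerFailure exception, and the idea that a
header stores just the file size are all assumptions made for the
example.

```python
class PowerFailure(Exception):
    """Simulated crash (hypothetical; not a real OS signal)."""


class ToyFileSystem:
    """Hypothetical in-memory model: contents and headers are updated separately."""

    def __init__(self):
        self.contents = {}  # file name -> data
        self.headers = {}   # file name -> size recorded in the header

    def write(self, name, data, crash_after_contents=False):
        # Step 1: update the file contents.
        self.contents[name] = data
        if crash_after_contents:
            # The power fails before the header is brought up to date.
            raise PowerFailure(name)
        # Step 2: update the header in a separate, later operation.
        self.headers[name] = len(data)

    def is_consistent(self, name):
        # In this model a file is consistent when its header matches its contents.
        return self.headers.get(name) == len(self.contents.get(name, ""))


fs = ToyFileSystem()
fs.write("notes.txt", "hello")
ok_before = fs.is_consistent("notes.txt")

try:
    fs.write("notes.txt", "hello, world", crash_after_contents=True)
except PowerFailure:
    pass  # the machine "reboots" with whatever reached the disk

ok_after = fs.is_consistent("notes.txt")
print(ok_before, ok_after)
```

After the simulated crash, the header still describes the old five-byte
file while the contents have already changed, which is the kind of
mismatch a checker such as fsck has to detect at reboot.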
Traditionally, a process called fsck (file system check) would check
the file system during reboot and detect the corrupt files. In some
cases, it would manage to fix them too, but usually you would have to
reconstitute the files from a backup set. In the Internet age, when
servers are required to stay up for months, this approach is
unacceptable. The demand for a more reliable file system and faster
recovery time led to the development of several journaling and logging
file systems.
What is journaling?
The concept, introduced about a decade ago in database systems, ensures
data consistency and integrity in the event of a failure during a
transaction. A typical database journaling system records every
operation applied to the database records. If a transaction can't be
completed due to a hardware fault or a network failure, then the
database system restores the records to their original state. A
journaling file system uses a similar method by constantly monitoring
changes to inodes.
Logging, as opposed to journaling, keeps track of both inode changes
and file content changes. Each of these approaches has advantages and
drawbacks. In terms of performance overhead, journaling requires fewer
resources, but logging enables faster recovery. In either case,
recovery is much faster than with a static file system.
Furthermore, it doesn't necessitate a reboot.
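
The database-style journaling described earlier (record every
operation, then restore the original state if the transaction cannot
complete) can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the design of any
particular database or file system: the JournaledStore class and its
method names are hypothetical, and a real journal would live on disk
rather than in memory.

```python
_MISSING = object()  # marks "this key did not exist before the update"


class JournaledStore:
    """Hypothetical key/value store that keeps an undo journal per transaction."""

    def __init__(self):
        self.records = {}
        self.journal = []  # (key, previous value) pairs for the open transaction

    def begin(self):
        self.journal = []

    def update(self, key, value):
        # Record the old value *before* changing the record.
        self.journal.append((key, self.records.get(key, _MISSING)))
        self.records[key] = value

    def commit(self):
        # The transaction completed, so the undo information is no longer needed.
        self.journal = []

    def recover(self):
        # Undo the interrupted transaction, newest change first.
        for key, old in reversed(self.journal):
            if old is _MISSING:
                self.records.pop(key, None)
            else:
                self.records[key] = old
        self.journal = []


store = JournaledStore()
store.begin()
store.update("a", 1)
store.commit()

store.begin()
store.update("a", 2)
store.update("b", 3)
# A failure strikes here, before commit() is reached.
store.recover()
print(store.records)
```

Because recovery only has to walk the journal of the interrupted
transaction instead of scanning every record, it stays fast regardless
of how much data the store holds.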
Next week, I will explore this issue in further detail and present some
of the available journaling file systems for Linux.