Journaling, Part 2

By Danny Kalev, ITworld |  How-to

Following last week's discussion about journaling file systems, today I
will explore the inner workings of journaling file systems and present
available products. Before I explain how journaling file systems work,
let's review the vulnerabilities of traditional static file systems,
such as ext2fs.

Under a static file system, each file consists of two logical units: a
metadata block (commonly known as the inode) and the file's data. The
inode (information node) records the physical locations of the file's
data blocks, the modification time, and so on. The second logical unit
consists of one or more blocks of data, which needn't be contiguous.
Thus, when an application changes the contents of a file, ext2fs
modifies the file's inode and its data in two distinct, synchronous
write operations. If an outage occurs between them, the file system's
state is unknown and must be checked for consistency. A metadata
logging file system overcomes this vulnerability by using a
wrap-around, append-only log area on the disk.
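The inode/data split is easy to observe from user space: a file's metadata lives in its inode, separate from the data blocks. A small sketch (using only the standard library; the temporary file is created just to keep the example self-contained):

```python
import os
import tempfile

# Create a throwaway file so the example stands on its own.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

info = os.stat(path)
# st_ino is the inode number -- the metadata block the article describes.
# The size and modification time come from the inode, not the data blocks.
print("inode:", info.st_ino)
print("size: ", info.st_size)
print("mtime:", info.st_mtime)

os.remove(path)
```

A write that grows the file must update both the data blocks and the inode's size and timestamps, which is exactly the two-step operation that leaves a window for inconsistency.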

The logging system records the state of each disk transaction in the
log area. Before any change is applied to the file system, an intent-to-
commit record is appended to the log. When the change has been
completed, the log entry is marked as complete. In the event of a
recovery from a failure, the system replays the log and checks for an
intent-to-commit record without a matching completion mark. Since every
modification to the file system is recorded in the log, recovery only
needs to read the log rather than perform a full file system scan. If
an intent-to-commit record without a completion mark is found, the
change logged in that record is undone.
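The intent-to-commit protocol above can be sketched in a few lines. This is an illustration only, with hypothetical names (`apply_change`, `recover`), not the API of any real file system:

```python
log = []    # the append-only log area
data = {}   # the "file system" state: block name -> contents

def apply_change(txn_id, block, value):
    """Log an intent (including the old value), apply, then mark complete."""
    log.append(("intent", txn_id, block, data.get(block)))
    data[block] = value
    log.append(("commit", txn_id))

def recover():
    """Replay the log; undo every intent without a matching commit."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "intent" and rec[1] not in committed:
            _, txn_id, block, old = rec
            if old is None:
                data.pop(block, None)   # block didn't exist before
            else:
                data[block] = old       # restore the pre-change contents

# A completed transaction, then a simulated crash mid-transaction:
apply_change(1, "B1", "new")
log.append(("intent", 2, "B2", data.get("B2")))  # intent written...
data["B2"] = "half-written"                      # ...change applied...
                                                 # ...crash before the commit record
recover()
print(data)   # transaction 2 has been undone; only B1 survives
```

Note that recovery touches only the log entries, never the rest of the disk, which is why replay is so much cheaper than a full consistency check.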

Let's look at a concrete example. Suppose we have a file that contains
three data blocks: 1, 2, and 3. The first two blocks are contiguous:

bbb12bbb3Hbbb

The b area indicates discarded data blocks and H is the file header.
Now an application updates blocks 2 and 3. Consequently, the file
system looks as follows (the a areas mark obsolete data blocks that
previously held blocks 2 and 3 and the header):

bbb1abbbaabbb23H

Notice that the modified data was appended to the end: first blocks 2
and 3, and finally the header. The previous locations of blocks 2 and 3
and of the header were discarded. This approach has several advantages.
It's faster, because the system doesn't need to seek all over the disk
to write parts of the file. It's safer, because the old copies of
changed blocks aren't lost until the log has successfully written the
new ones. Finally, recovery after a crash is much faster, because the
logging system needs to check only the updates that took place after
the last checkpoint.
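The append-only update in the diagram can be modeled directly. In this sketch (hypothetical names, illustration only) the "disk" is a list that only grows at the end; updated blocks get fresh copies appended, and the header is written last so that a crash before the header lands leaves the old file fully intact:

```python
disk = []     # each entry: ("block", number, payload) or ("header", {...})
header = {}   # current header: block number -> index into disk

def write_block(number, payload):
    """Append a block at the end of the disk and point the header at it."""
    disk.append(("block", number, payload))
    header[number] = len(disk) - 1

def update(blocks):
    """Append new copies of the changed blocks, then a fresh header last."""
    for number, payload in blocks:
        write_block(number, payload)
    disk.append(("header", dict(header)))

# Initial file with blocks 1, 2, 3 (the "bbb12bbb3H" state)...
for n in (1, 2, 3):
    write_block(n, f"old{n}")
disk.append(("header", dict(header)))

# ...then the application updates blocks 2 and 3.
update([(2, "new2"), (3, "new3")])

# The latest header points at the new copies; the old ones are obsolete
# (the "a" areas in the diagram) but were never overwritten in place.
latest = disk[-1][1]
print(disk[latest[2]])
```

Because the old copies survive until the new header is in place, a crash at any point leaves either the complete old file or the complete new one reachable, never a half-written mixture.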

At present, several journaling file systems are available for Linux.
SGI's XFS file system, for one, is an open source product.
