Advanced I/O Techniques, Part 1

By Danny Kalev, ITworld |  How-to

This week I will introduce the principles of I/O multiplexing. This
introduction will serve as the basis for next week's discussion of
the select() syscall.

Fast and Slow Files
I/O operations on regular files always block: once you call read() or
write(), the process waits until the function returns. With disk files
this isn't a problem, because they are stored on a local disk and the
syscalls that access them complete in a more or less predictable time.
Certain file types, however, have unpredictable completion times. For
example, a read from a pipe that has no data in it blocks the process
until data becomes available. Files on which an I/O operation may take
an indeterminate amount of time to complete are called "slow files".
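
The blocking behavior of a slow file can be observed with a small experiment. The following is only a sketch: the function name seconds-style helper elapsed_ms_until_read() and the one-byte payload are made up for illustration. A child process writes to a pipe after a delay, and the parent measures how long it waited for its read() to return.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: a child writes one byte after delay_sec seconds,
   while the parent sits in a blocking read() on the empty pipe.
   Returns the milliseconds elapsed from just before fork() until
   read() returned, or -1 on error. */
long elapsed_ms_until_read(unsigned delay_sec)
{
    int fds[2];
    char byte;
    struct timespec start, end;
    long elapsed_ms;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &start);
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                    /* child: the slow writer */
        close(fds[0]);
        sleep(delay_sec);
        if (write(fds[1], "x", 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);                     /* parent keeps only the read end */
    n = read(fds[0], &byte, 1);        /* blocks: the pipe is still empty */
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fds[0]);
    waitpid(pid, NULL, 0);

    elapsed_ms = (end.tv_sec - start.tv_sec) * 1000L
               + (end.tv_nsec - start.tv_nsec) / 1000000L;
    assert(elapsed_ms >= 0);           /* CLOCK_MONOTONIC never runs backwards */
    return n == 1 ? elapsed_ms : -1;
}
```

Calling elapsed_ms_until_read(1) should report roughly one second: the parent does nothing useful for as long as the writer chooses to stay silent.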

I/O Multiplexing
Things get more complicated when dealing with multiple file descriptors
simultaneously. Consider a Web server process that constantly polls 200
client connections, each of which sends requests to the server. A naive
implementation of this server would look as follows:

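/* Naive approach: read each descriptor in turn using blocking I/O. */
for (;;)
{
    for (int i = 0; i < 200; i++)
    {
        /* Blocks here until client i has data to read. */
        ssize_t n = read(file_descriptors[i], buff, buffsize);
        if (n > 0)
        {
            /*...process the data*/
        }
    }
}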

Alas, if one descriptor has no data available, read() blocks on it
until data arrives, even though some of the remaining 199 clients may
have requests waiting. Obviously, this is a very bad idea.

Enter Nonblocking I/O
Using nonblocking I/O to access slow files would undoubtedly improve
matters. The fcntl() syscall enables you to switch an open descriptor
to nonblocking mode by setting its O_NONBLOCK flag. When a slow file is
nonblocking, read() always returns immediately; if no data is available,
it returns -1 and sets errno to EAGAIN (a return value of 0 means end
of file).
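
As a minimal sketch of the nonblocking setup, assuming a POSIX system: on such systems a nonblocking read() that finds no data reports it by returning -1 with errno set to EAGAIN (or EWOULDBLOCK), whereas a return of 0 signals end of file. The helper names set_nonblocking() and empty_pipe_read_would_block() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch an already-open descriptor to nonblocking
   mode while preserving its other status flags.
   Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical demo: read from an empty nonblocking pipe.
   Returns 1 if read() failed with EAGAIN/EWOULDBLOCK, 0 if it did
   something else, and -1 if the setup itself failed. */
int empty_pipe_read_would_block(void)
{
    int fds[2];
    char byte;
    int would_block;
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;
    if (set_nonblocking(fds[0]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    assert(fcntl(fds[0], F_GETFL, 0) & O_NONBLOCK);  /* flag is now set */

    n = read(fds[0], &byte, 1);        /* returns at once: nothing to read */
    would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```

Reading the current flags with F_GETFL before calling F_SETFL is a deliberate choice here: it avoids clearing any status flags the descriptor already carries.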

Still, this isn't a perfect solution. The problem with polling
nonblocking file descriptors is that the program never blocks! It
executes the loop continually, consuming CPU time even when no client
has sent anything. What we really want is for the kernel to notify our
process when data is available on one or more file descriptors. When no
data is available, the process should block and thus avoid wasting
system resources.
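
To make the cost of busy polling concrete, here is a rough sketch, assuming two empty pipes stand in for idle client connections. The names poll_pass() and count_wasted_passes() are hypothetical. Every pass over the descriptors completes immediately and finds nothing, so all of the work is wasted.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make one pass over nonblocking descriptors.
   Returns how many of them yielded data; read() never blocks here. */
int poll_pass(const int *fds, int nfds, char *buf, size_t bufsize)
{
    int ready = 0;

    assert(nfds > 0);
    for (int i = 0; i < nfds; i++) {
        ssize_t n = read(fds[i], buf, bufsize);
        if (n > 0)
            ready++;                   /* ...process the data */
        /* n == -1 with errno EAGAIN: nothing to read yet, move on */
    }
    return ready;
}

/* Hypothetical demo: spin over two empty nonblocking pipes and count
   the passes that found no data at all. Returns -1 on setup failure. */
int count_wasted_passes(int passes)
{
    int a[2], b[2], fds[2];
    char buf[64];
    int wasted = 0;

    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    fcntl(a[0], F_SETFL, fcntl(a[0], F_GETFL, 0) | O_NONBLOCK);
    fcntl(b[0], F_SETFL, fcntl(b[0], F_GETFL, 0) | O_NONBLOCK);
    fds[0] = a[0];
    fds[1] = b[0];

    for (int i = 0; i < passes; i++)
        if (poll_pass(fds, 2, buf, sizeof buf) == 0)
            wasted++;                  /* a full pass that achieved nothing */

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return wasted;
}
```

Since nothing is ever written to either pipe, count_wasted_passes(1000) performs 2000 read() calls without retrieving a single byte, which is exactly the overhead a blocking notification from the kernel would avoid.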

Next week, we will see exactly how to do that.
