Errno Libretto

By Hal Stern, Unix Insider |  Operating Systems

One of the most frustrating exercises performed by system
administrators is explaining (calmly) to users why applications that
behave routinely on machine A suddenly fail or exhibit strange side
effects on machine B. For well-known tools like the C shell, you can
wade through .cshrc scripts and find minor environmental
differences. But how do you deal with shrink-wrapped code? Use
truss to identify the configuration and initialization
files opened by the application. Capture a trace on each machine
(truss.out1 on the good machine and truss.out2 on the problem machine
in the commands below), pull out the open() calls, and compare the two
lists:

huey% fgrep  'open(' truss.out1 > /tmp/out1
huey% fgrep  'open(' truss.out2 > /tmp/out2
huey% diff /tmp/out1 /tmp/out2

Look for the string "Err#2 ENOENT", which signals a missing file.
Double-check automounter maps, environment variables, and installation
processes that modify files local to each machine, in /etc or
/usr/lib, for example. Some applications search for
configuration files in several directories, and may find identical
files on the two hosts but process them in a different order. Again,
checking the sequence of the open() calls and the ENOENT
results will tell you if you have a configuration problem.
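
As a rough sketch of that check, the shell function below lists the files whose open() failed with ENOENT, in the order the application tried them. The name missing_files is made up for illustration, and the sed expression assumes truss prints the pathname as the first quoted string on each line.

```shell
# Hypothetical helper: print each pathname whose open() returned ENOENT,
# preserving the order of the calls in the truss output file "$1".
missing_files() {
    grep -F 'open(' "$1" |
        grep -F 'Err#2 ENOENT' |
        sed 's/^[^"]*"\([^"]*\)".*/\1/'
}

# Usage (trace file names as in the example above):
#   missing_files truss.out1 > /tmp/missing1
#   missing_files truss.out2 > /tmp/missing2
#   diff /tmp/missing1 /tmp/missing2
```

Comparing only the failed lookups keeps the diff short when the two traces differ in unrelated ways, such as file descriptor numbers.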




Also look for EACCES errors (shown by truss as "Err#13 EACCES"), caused
by insufficient file or directory permissions. If the file exists but
can't be read by the user, ensure that user and group IDs are consistent
between the machines in question. Group-readable files help only if you
enforce the same group membership, with the same numeric group IDs, on
all machines at which users may camp.
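
One way to check ID consistency, sketched here on the assumption that ls accepts the -n (numeric IDs) option on both machines: print the numeric owner and group of the file, then compare them with the numeric IDs that id reports for the user on each host. The helper name owner_ids is hypothetical.

```shell
# Hypothetical helper: print the numeric uid and gid that own file "$1".
owner_ids() {
    ls -ldn "$1" | awk '{ print $3, $4 }'
}

# Run on each machine and compare:
#   owner_ids /path/to/config    # numeric owner and group of the file
#   id                           # the user's numeric uid, gid, and groups
```

Numeric IDs matter here because two machines can agree on a user or group name while mapping it to different numbers.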

Here's a nastier version of the same problem: a user is scrambling to
set up a demo environment. Rather than create new users and their
environments, he runs the demo as root, only to have it fail
miserably. Even root gets slapped with EACCES violations if the files
being accessed are NFS mounted. Over the network, root is mapped by
default to the anonymous user nobody, and relies on world read and execute
permissions to open files and search directories. Any application that
works for non-privileged users but fails for root is probably opening
configuration or data files over NFS. If you suspect that NFS access
is contributing to your problem, locate the filesystem for the file in
question using df:
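
A minimal sketch of that check, assuming a df that supports the POSIX -P option (which keeps each filesystem on a single line): the first column of the output names the filesystem, and an NFS mount normally appears there in server:/path form. The helper names fs_of and looks_remote are made up for illustration.

```shell
# Hypothetical helper: print the filesystem (first df column) holding "$1".
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Hypothetical helper: succeed if a filesystem name has the server:/path
# form that NFS mounts normally show.
looks_remote() {
    case "$1" in
        *:/*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage:
#   fs=$(fs_of /path/to/config)
#   looks_remote "$fs" && echo "NFS mounted from $fs"
```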
