Unix: Looking at files this way and that

Don't throw up your hands if your boss won't buy you Tripwire and a modern debugger. There are a lot of things that your Unix system -- right out of the box -- can tell you about files and processes.

Unix systems include, as part of the base OS, a number of commands that can help you characterize files that strike you as suspicious. Even without a file integrity scanner, you can make some headway toward identifying the nature of files on your system.

find

The find command can find files that have been changed within so many days, are newer than some reference file, are owned by particular users or are greater (or smaller) than a certain size. Don't place a lot of value in NOT finding the files you are looking for if you are basing your search on time stamps, because these can be easily changed with the touch command.

The find command can locate files based on numerous criteria -- type, size, age, permissions, ownership, group, etc. It's a good idea, for example, to look for files that have the setuid bit set -- outside of a small number of system binaries, a real no-no by today's security standards. Finding files with setuid:

$ find . -type f -perm -4000 -ls
15089715    4 -rws------   1 root     staff        11 Sep 23  2009 ./bin/oops

Finding files newer than a reference file:

$ find /usr -newer /root/ref -ls
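Here is a minimal sketch of the reference-file search, run in a throwaway directory so it is safe to try anywhere. The file names (ref, changed, backdated) are made up for the illustration. It also shows why time stamps are weak evidence: a file whose time stamp has been set back with touch never shows up.

```shell
# Sketch: -newer in a scratch directory (all file names are hypothetical).
demo=$(mktemp -d)
touch -t 202001010000 "$demo/ref"        # reference file dated 1 Jan 2020
touch -t 202101010000 "$demo/changed"    # modified after the reference
touch -t 201901010000 "$demo/backdated"  # time stamp set back before the reference

# Only files modified more recently than ref are reported;
# the backdated file is silently skipped.
newer=$(find "$demo" -type f -newer "$demo/ref")
echo "$newer"

rm -rf "$demo"
```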

You can also use the find command to change the files in some way -- whether you want to strip execute permission, remove the files entirely, change ownership ... In the command below, we look for and remove the setuid bit.

$ find . -type f -perm -4000 -exec chmod u-s {} \;

You might like this version better because it also shows you the files prior to the fix.

$ find . -type f -perm -4000 -ls -exec chmod u-s {} \;
15089715    4 -rws------   1 root     staff        11 Sep 23  2009 ./bin/oops
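If you want to try the find-and-fix approach before pointing it at real files, something like the following sketch works in a scratch directory. The file names are invented, and ending -exec with + instead of \; is simply a choice to run chmod once for the whole batch of files rather than once per file.

```shell
# Sketch: strip the setuid bit from files in a scratch directory.
demo=$(mktemp -d)
touch "$demo/plain" "$demo/oops"
chmod 4700 "$demo/oops"                  # set the setuid bit on one file

before=$(find "$demo" -type f -perm -4000)
echo "setuid before: $before"

# {} + hands all matching files to a single chmod invocation.
find "$demo" -type f -perm -4000 -exec chmod u-s {} +

after=$(find "$demo" -type f -perm -4000)
echo "setuid after: ${after:-none}"

rm -rf "$demo"
```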

file

The file command displays the basic type of a file such as text, jpg, binary, symbolic link, block special, etc. It examines the content of files to make this determination, so it won't be fooled by file names. If you have an executable with an extension of .txt or .jpg, file looks at the file and tells you what it really is. What is this mysterious beerlist file? Is it worth investigating now or during Happy Hour?

$ file beerlist
beerlist: UTF-8 Unicode English text

Hmm, a new jpg file? Really?

$ file bash.jpg
bash.jpg: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/
Linux 2.6.9, dynamically linked (uses shared libs), stripped
...

Let's look at some more files using the file command:

$ file /bin/bash
/bin/bash: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/
Linux 2.6.9, dynamically linked (uses shared libs), stripped
$ file /dev/cciss/c0d0p1
/dev/cciss/c0d0p1: block special (104/1)
$ file /proc/cpuinfo
/proc/cpuinfo: empty

strings

The strings command displays printable character sequences from binary files. Its output isn't restricted to the content of print statements, but includes library names, the names of library functions and system calls, etc. The examples below should give you a feel for what this output looks like. Some will mean nothing to you. Some will tell you a lot about what you're looking at.

$ strings /bin/bash | head -7
/lib/ld-linux.so.2
AH!D
@BH@D
B80(
E@{B
!        @
Cl $

Looking just at shared object (library) files:

$ strings /bin/bash | grep "\.so"
/lib/ld-linux.so.2
libtermcap.so.2
libdl.so.2
libc.so.6
/lib/ld-linux.so.2

Looking at system calls and other strings that start with "f":

$ strings /bin/bash | grep ^f
fbLE
fflush
fork
fgets
fputc
fputs
fclose
fileno
fwrite
fdopen
freeaddrinfo
fcntl
fopen64
ferror
free_buffered_stream
find_function
find_variable_internal
find_variable
find_shell_builtin
file_status
find_reserved_word

od

The od (octal dump) command -- which doesn't limit you to octal, by the way -- can be used to display the contents of a file in a number of formats. The hex/character or octal/character formats can be useful. An example is shown below.

$ od -xc /bin/bash | more
0000000 457f 464c 0101 0001 0000 0000 0000 0000
        177   E   L   F 001 001 001  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000020 0002 0003 0001 0000 c7e0 0805 0034 0000
        002  \0 003  \0 001  \0  \0  \0 340 307 005  \b   4  \0  \0  \0
0000040 3564 000b 0000 0000 0034 0020 0008 0028
          d   5  \v  \0  \0  \0  \0  \0   4  \0      \0  \b  \0   (  \0

Years ago, when I was particularly interested in the format of jpg files, I used the od command to display the contents of a jpg file and was able to find the identifying information (the magic number at the top, which od -x displays as d8ff e0ff on a little-endian system or ffd8 ffe0 on a big-endian one) along with the color maps.

$ od -xc FrontDoor2.jpg | head -16
0000000 d8ff e0ff 1000 464a 4649 0100 0101 2c01
        377 330 377 340  \0 020   J   F   I   F  \0 001 001 001 001   ,
0000020 2c01 0000 e1ff 521d 7845 6669 0000 4949
        001   ,  \0  \0 377 341 035   R   E   x   i   f  \0  \0   I   I
0000040 002a 0008 0000 000a 010f 0002 0006 0000
          *  \0  \b  \0  \0  \0  \n  \0 017 001 002  \0 006  \0  \0  \0
0000060 0086 0000 0110 0002 000d 0000 008c 0000
        206  \0  \0  \0 020 001 002  \0  \r  \0  \0  \0 214  \0  \0  \0
0000100 0112 0003 0001 0000 0001 0000 011a 0005
        022 001 003  \0 001  \0  \0  \0 001  \0  \0  \0 032 001 005  \0
0000120 0001 0000 009a 0000 011b 0005 0001 0000
        001  \0  \0  \0 232  \0  \0  \0 033 001 005  \0 001  \0  \0  \0
0000140 00a2 0000 0128 0003 0001 0000 0002 0000
        242  \0  \0  \0   ( 001 003  \0 001  \0  \0  \0 002  \0  \0  \0
0000160 0131 0002 000c 0000 00aa 0000 0132 0002
          1 001 002  \0  \f  \0  \0  \0 252  \0  \0  \0   2 001 002  \0
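If the byte swapping in the -x output gets in your way, one option is to ask od for single bytes instead. The sketch below writes just the four leading bytes shown above (ff d8 ff e0) to a made-up file name and dumps them with -t x1, which prints the bytes in the order they sit on disk no matter what kind of system you are on.

```shell
# Sketch: read a file's leading bytes independent of host byte order.
demo=$(mktemp -d)

# Four bytes, ff d8 ff e0, written with octal escapes (hypothetical file).
printf '\377\330\377\340' > "$demo/fake.jpg"

# -An drops the offset column, -t x1 prints one byte at a time,
# -N4 stops after four bytes; tr removes the spacing.
bytes=$(od -An -tx1 -N4 "$demo/fake.jpg" | tr -d ' \n')
echo "$bytes"

rm -rf "$demo"
```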

cksum or md5sum

The cksum and md5sum commands compute a checksum for a file. Since checksums are mathematically defined in such a way that virtually any change in a file, no matter how small, will result in a significant change in the computed checksum, these commands are useful in determining whether files have changed in any way -- particularly if you can compare files across systems. In fact, I often use checksums to compare files on systems since this technique means that I don't have to transfer files from one system to another to do an accurate comparison -- at least this is true if all I want to know is whether they're the same or different.

$ cksum /bin/bash
1562613511 735804 /bin/bash
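The comparison itself is easy to script. In this sketch the three files stand in for copies of a configuration file on different systems; the names and contents are invented. Feeding each file to cksum on standard input keeps the file name out of the output, so the results can be compared as plain strings.

```shell
# Sketch: compare files by checksum (file names and contents are made up).
demo=$(mktemp -d)
printf 'PermitRootLogin no\n'  > "$demo/a.conf"
printf 'PermitRootLogin no\n'  > "$demo/b.conf"
printf 'PermitRootLogin yes\n' > "$demo/c.conf"

# With input on stdin, cksum prints only the checksum and byte count.
sum_a=$(cksum < "$demo/a.conf")
sum_b=$(cksum < "$demo/b.conf")
sum_c=$(cksum < "$demo/c.conf")

if [ "$sum_a" = "$sum_b" ]; then echo "a and b match"; fi
if [ "$sum_a" != "$sum_c" ]; then echo "a and c differ"; fi

rm -rf "$demo"
```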

wc

The wc (word count) command can provide useful data on the line, word, and character counts associated with a file. This is a low level check, but could prove useful from time to time. At least this is true if you're looking into configuration files or some other form of text files.
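As a quick sketch, here is wc applied to a two-line stand-in for a configuration file (the file name and contents are invented). Note that -c counts bytes; where it is supported, -m counts characters, which can differ for multibyte text.

```shell
# Sketch: line, word and byte counts for a small text file.
demo=$(mktemp -d)
printf 'PermitRootLogin no\nPort 22\n' > "$demo/sample.conf"

# Reading from stdin keeps the file name out of the output.
lines=$(wc -l < "$demo/sample.conf")
words=$(wc -w < "$demo/sample.conf")
bytes=$(wc -c < "$demo/sample.conf")
echo "lines=$lines words=$words bytes=$bytes"

rm -rf "$demo"
```

A count that differs from what you recorded earlier tells you the file has changed, though not how.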

stat

The stat command can provide information on when files were last modified and accessed. While modify and change are synonyms to most of us, this is not so with the stat command, which reports Modify as the time the file's content was last changed and Change as the time its status (metadata such as permissions or ownership) last changed.

$ stat /bin/bash
  File: `/bin/bash'
  Size: 735804          Blocks: 1448       IO Block: 4096   regular file
Device: 6802h/26626d    Inode: 4756181     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-02-10 12:32:29.000000000 -0500
Modify: 2011-05-13 08:29:43.000000000 -0400
Change: 2013-01-11 04:04:34.000000000 -0500
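The difference between the two time stamps is easy to see in a scratch directory. This sketch assumes the GNU version of stat, where -c selects an output format and %Y and %Z print the Modify and Change times as seconds since the epoch; other stat implementations use different options. The file name is made up.

```shell
# Sketch (GNU stat assumed): Modify vs. Change time stamps.
demo=$(mktemp -d)
f="$demo/report.txt"
echo data > "$f"

touch -t 202001010000 "$f"       # set the Modify time back to 1 Jan 2020
mtime_before=$(stat -c %Y "$f")

chmod 600 "$f"                   # a metadata-only change
mtime_after=$(stat -c %Y "$f")   # Modify time is unaffected by chmod
ctime_after=$(stat -c %Z "$f")   # Change time reflects the touch and chmod

echo "modify: $mtime_before -> $mtime_after, change: $ctime_after"

rm -rf "$demo"
```

Because touch cannot set the Change time to an arbitrary value, a Change time much later than the Modify time is one hint that a file's time stamp or permissions have been altered.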

strace

The strace command will not examine a file per se, but can be used when running a file to trace system calls and signals. This can tell you a lot about what an executable is doing and is a good debugging tool, though you wouldn't want to run a suspect executable this way -- at least not on a production system. Running strace with only an executable as an argument will provide you with a pile of information on what it's reading and writing, how it's using memory, etc. Adding the -e open option restricts the output to open system calls. On newer Linux systems, where the C library uses openat instead, -e trace=open,openat catches both.

$ strace -e open ls
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib/librt.so.1", O_RDONLY)       = 3
open("/lib/libacl.so.1", O_RDONLY)      = 3
open("/lib/libselinux.so.1", O_RDONLY)  = 3
open("/lib/libc.so.6", O_RDONLY)        = 3
open("/lib/libpthread.so.0", O_RDONLY)  = 3
open("/lib/libattr.so.1", O_RDONLY)     = 3
open("/lib/libdl.so.2", O_RDONLY)       = 3
open("/lib/libsepol.so.1", O_RDONLY)    = 3
open("/etc/selinux/config", O_RDONLY|O_LARGEFILE) = 3
open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
open("/proc/meminfo", O_RDONLY)         = 3
1                   CheckIn             hlinks      new           Sumr2011
135                 checkPassword       hold        newfile       symcal

If you want to send your strace output to a file, add -o to your command like this:

$ strace -o ls.out -e open ls

netstat

The netstat command displays information on network connections. This, of course, has little to do with the content of files, but it's a good command to keep in your pocket if you're ever feeling wary about what's running on your system and how. I use the command netstat -a | grep "LISTEN " to show me listening ports. Notice that I put LISTEN in quotes with an extra blank so that I'm not picking up all the lines with LISTENING in them.

$ netstat -a | grep "LISTEN "
tcp        0      0 boson.bugfarm.org:2208      *:*                  LISTEN
tcp        0      0 *:repscmd                   *:*                  LISTEN
tcp        0      0 *:sunrpc                    *:*                  LISTEN
tcp        0      0 *:ftp                       *:*                  LISTEN
tcp        0      0 *:ssh                       *:*                  LISTEN
tcp        0      0 boson.bugfarm.org:ipp       *:*                  LISTEN
tcp        0      0 *:smtp                      *:*                  LISTEN
tcp        0      0 *:52383                     *:*                  LISTEN
tcp        0      0 boson.bugfarm.org:2207      *:*                  LISTEN
tcp        0      0 *:imaps                     *:*                  LISTEN
tcp        0      0 *:pop3s                     *:*                  LISTEN
tcp        0      0 *:pop3                      *:*                  LISTEN
tcp        0      0 *:imap                      *:*                  LISTEN
tcp        0      0 *:http                      *:*                  LISTEN
tcp        0      0 *:ssh                       *:*                  LISTEN

Use the command netstat -a | grep ESTABLISHED and you will get a list of current connections.

$ netstat -a | grep ESTABLISHED
tcp    0    0 boson.bugfarm.org:con     fermion:nfs                 ESTABLISHED
tcp    0    0 boson.bugfarm.org:ssh     static-71-127-157-21.:56673 ESTABLISHED
tcp    0    0 boson.bugfarm.org:sunrpc  10.1.2.3:58811              ESTABLISHED

Looking at files and processes with basic Unix commands may not tell you everything you want to know, but you can learn a lot by using them, especially if you use them often enough that you're comfortable with their output.

Read more of Sandra Henry-Stocker's Unix as a Second Language blog and follow the latest IT news at ITworld, Twitter and Facebook.
