13 Things that a Unix sysadmin will never do
Unix systems administrators are an odd lot. We generally love doing somersaults on the command line and we don't like sharing the power, often not even with other sysadmins. We know how special we are just because most of the people we interact with can barely understand the words that we use. At the same time, we generally have work ethics that reveal a deep respect for the systems we manage and the people who use them -- or, at least, for their success in getting their work done on our servers.
So here's a list of 13 things that Unix admins just won't do.
1: Share our passwords
More than just about anyone, Unix systems administrators are reluctant to share passwords of any kind, and we will never share our own. It's bad enough if we have to share root passwords with other admins; it's worse if we have to share them with anyone in our user population.
Between temporary accounts and sudo privileges, we can generally provide anyone with whatever access they are going to need to do any kind of work on the systems we manage. We can give them precisely what they need and nothing more and then take it back when the work is done, generally relieved to be doing so. We never ever have to share our passwords to allow someone else to accomplish a specific task.
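The sudo side of that can be as narrow as a single command. Here's a minimal sketch of such a grant, assuming a hypothetical user jdoe whose only task is restarting one web service (the user, service and path are invented for illustration):

```
# Hypothetical /etc/sudoers.d/jdoe entry -- always edit with "visudo -f".
# jdoe may restart the web service as root, and do nothing else:
jdoe    ALL = (root) /usr/bin/systemctl restart httpd.service
```

When the work is done, deleting that one file takes the access back, which is exactly the "give it, then take it back" pattern described above.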
2: Run backups in the middle of the day or not do them at all
Unix sysadmins know that backups should always be run when the systems are as quiescent (that means not doing much of anything) as possible. This prevents the kind of problems that occur when files are changing as backups are being made. In addition, backing up a system involves considerable I/O and, often, considerable network bandwidth as well (when the backup device is not connected to the server itself). If we're going to kick up dust, we want to do it when no one is trying to breathe.
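Scheduling that quiet-hours run typically falls to cron. A sketch of a crontab entry, with a made-up destination path and a 2 a.m. slot standing in for whichever hour your systems are quietest:

```
# Hypothetical root crontab entry: back up /home at 02:00 daily.
# (% must be escaped as \% inside a crontab command field.)
0 2 * * *  tar -czf /backup/home-$(date +\%F).tar.gz /home
```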
That said, Unix system administrators won't fail to back up important servers. Avoiding backups would be like not having a spare key to our homes. We never forget that important files can be erased by mistake, that files can be corrupted and that applications can sometimes do funny things. So, we like to set ourselves up so that we can recover from anything that happens. No matter how careless our users may be or whether or not the applications that we support behave themselves, we can make it all work out.
We're also generally pretty good about remembering that protecting our backup tapes is as important as protecting the same data on our servers. We label them, lock them up, and are downright persnickety about who we will trust with the keys to our backup cabinets.
And, of course, backing systems up is only half the battle. Unix admins also know that backups need to be tested from time to time to make sure that they are complete and that anyone who might be tasked with restoring from them has some recent experience doing just that. We consider verifying as well as labeling our backups to be part of the job.
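That back-up-then-verify routine can be miniaturized in a few lines of shell; everything below uses scratch paths, so it stands in for a real backup policy rather than describing one:

```shell
#!/bin/sh
# A miniature of the back-up-then-verify routine: archive a directory,
# then list the archive to confirm it is complete and readable.
# All paths here are throwaway examples, not a real backup policy.
src=$(mktemp -d)
echo "payroll data" > "$src/important.txt"

archive=/tmp/backup-demo.$$.tar.gz
tar -czf "$archive" -C "$src" .    # the "backup"
tar -tzf "$archive"                # the test: list what actually got saved

rm -rf "$src" "$archive"
```

Listing the archive (tar -tzf) is the cheapest form of verification; a fuller test restores into a scratch directory and compares against the original.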
3: Give a script the same name as a Unix command
Never. Creating a script that has the same name as a Unix command can lead to some very odd effects. This is one of the reasons why some Unix admins will give their scripts file extensions like .sh or .pl. It means that, even if we're unaware that there's a Unix command named mknod, we won't end up with a script named mknod and find that it's been run by mistake. If ever in doubt, we can ask the system -- with which mknod or man mknod -- to verify that we're not about to use a name we shouldn't.
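That lookup habit is easy to wrap in a helper. A small sketch (the function name and the second test name are invented):

```shell
#!/bin/sh
# Before naming a new script, ask the shell whether the name is taken.
# "mknod" is the example from the text; the second name is an invented
# one that should be free on just about any system.
name_is_free() {
    if type "$1" >/dev/null 2>&1; then
        echo "taken: $1"
        return 1
    fi
    echo "free: $1"
}

name_is_free mknod
name_is_free my_odd_little_script
```

Using type rather than which also catches shell builtins and aliases, which which can miss.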
4: Set the root password on a critical server to some variation of root (e.g., r00t) or password (e.g., pa55w0rd)
No way we Unix admins will ever use a "stupid" password, even when we're setting up a system that we're about to pass over to someone else to administer. We're more likely to use one that you wouldn't remember ten seconds after we told it to you.
5: Issue an rm * command without checking where we are in the file system
Unix admins are fully aware of the power of root. We've come around to issuing commands like rm * only after we have verified that we're running the command from the proper directory. And, if we're logged into more than one system at a time -- as I often am -- we also verify that we're issuing the command on the proper system. A few seconds of hesitation before pressing the return key can save us hours of time and months of embarrassment, so we make that hesitation a part of our routine any time we're about to enter a command that, under the wrong conditions, could lead to disaster.
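The habit can be rehearsed safely in a throwaway directory; a sketch of the routine:

```shell
#!/bin/sh
# The look-before-you-leap habit for "rm *", demonstrated in a
# throwaway directory so nothing real is at risk.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch alpha beta gamma

hostname            # right machine?
pwd                 # right directory?
ls                  # these are the files about to go

rm -- *             # only after the checks do we pull the trigger
ls -A               # prints nothing: the directory is now empty
cd / && rmdir "$sandbox"
```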
6: Run a command without knowing what it's supposed to do
It's tempting at times. We run into a problem, Google to see if others have seen the same errors or the same strange results, and then find an alleged solution that purports to return our system to a state of OS Nirvana. Even so, we're cautious and distrustful by nature. So we make sure that we understand what the proposed solution actually does and any potential side effects before we risk our troubled system to further chaos.
7: Delete an account without knowing what it's for
Removing an account and related directory without verifying what it's intended to be used for could end up causing more problems than would ignoring the account. Unix sysadmins have come to understand that many Unix accounts are essential for various services and applications and that, were we to remove one simply because no one's logged into it all year or because it looks strange, we could break something important. A good rule of thumb that we use is to annotate these accounts in the "gecos" (description) field in the /etc/passwd file so that the purpose for these accounts is recorded.
We also don't generally remove user accounts without some reassurance that they are no longer needed. Even if a Unix user left the company last month, there may still be files in his or her home directory that are needed and should perhaps be made available to someone else on staff. We know and take advantage of the difference between deleting an account and disabling one, and we use the latter whenever there is any question about whether the files in the ex-user's home directory are of value.
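Disabling rather than deleting can be sketched with the standard Linux account tools, run as root; the user name jdoe here is hypothetical:

```
usermod -L jdoe                     # lock the password
usermod -s /usr/sbin/nologin jdoe   # refuse interactive logins
# jdoe's home directory and files stay intact for later review
```

Reversing both steps later is trivial; recreating a deleted account and its files is not.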
8: Ignore our log files
Unix admins know that most of the data that ends up in log files on Unix systems is routine and boring. At the same time, we never forget that log files are one of the first places to look for clues when something goes wrong on one of our systems. One of my favorite knee-jerk reactions when something isn't quite right on one of the servers I manage is to cd down to /var/log and do an ls -ltr. This quickly shows me where the most recent log messages have gone. Then I tail those files and see if there's anything there that helps me to understand the nature of whatever is going wrong. The tail -f (or tail --follow) command can help with ongoing problems, as it shows us messages as they are added to our log files.
I also take advantage of log analysis tools like logwatch to help bring the more interesting events included in my log files to the surface for me or I build my own log analyzers in Perl and email the results to myself on a daily basis to be sure that I don't miss anything important.
We're all too busy to read through our log files the way we would if we had nothing better to do, but ignoring them is never an option.
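The homegrown-analyzer idea scales down to a one-liner; here is a toy version over a fabricated log, counting how many lines mention an error:

```shell
#!/bin/sh
# A toy version of the homegrown log analyzer idea: count how many
# "error" lines a log contains.  The sample log lines are fabricated.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 10 02:11:01 host sshd[981]: error: Bind to port 22 failed
Jan 10 02:11:05 host cron[1012]: (root) CMD (run-parts /etc/cron.hourly)
Jan 10 02:12:44 host kernel: I/O error on device sda
EOF

grep -c 'error' "$log"    # prints 2: two of the three lines match
rm -f "$log"
```

Piping a count (or the matching lines themselves) into a daily mail is all the "analyzer" many systems need.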
9: Issue a mkfs or format command without triple checking the target
A world in which small mistakes led to small problems and only really obvious big mistakes led to big problems would be a nice place to live, but that's not the world we inhabit. On Unix systems, putting one measly blank in the wrong place in a command (e.g., rm f * when we meant rm f*) or getting one character wrong when we issue a command to create a new file system (e.g., mkfs -t ext4 /dev/sda2 when we mean /dev/sdb2) can leave us with irate users and hours of work to put things back the way they were. These are the times when we exercise the ten-second rule: stare at the command for ten seconds before pressing the return key. We know the rest of our day can depend on that brief moment of hesitation.
10: Edit a complex configuration file without making a backup copy
There's always vi's :q!, but most Unix admins will make a backup copy of an important configuration file before editing the original. Commands like cp complicated.cfg complicated.bak have been commonplace for as long as I can remember. If there's anything wrong with the modified file, we can always back up and start over again.
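The whole edit-and-back-out cycle can be rehearsed on a throwaway file:

```shell
#!/bin/sh
# The cp-before-edit habit on a throwaway file: make the .bak copy,
# "edit", compare, and back out when the edit goes wrong.
cfg=$(mktemp)
echo "option = original" > "$cfg"

cp -p "$cfg" "$cfg.bak"                 # -p preserves mode and timestamps
echo "option = experimental" > "$cfg"   # the (possibly regrettable) edit

diff "$cfg.bak" "$cfg"                  # shows exactly what changed
cp -p "$cfg.bak" "$cfg"                 # back out: restore the original
cat "$cfg"                              # prints: option = original
rm -f "$cfg" "$cfg.bak"
```

The diff against the .bak copy is a bonus: it documents the change as well as enabling the retreat.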
11: Forget that rm .* includes ..
Ever cautious of wild cards, Unix admins are very aware that the pattern .* matches .. along with .settings and .bashrc. We're likely to remove "dot files," as we like to call them, with a command such as rm .[a-z0-9]* to be sure that we're avoiding .. and the disaster that would befall us if we started removing all the directories above our current location in the file system.
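The difference between the two patterns is easy to see in a scratch directory. Note that exactly which shells let .* match . and .. varies (traditional Bourne-style shells do; some newer shells skip them), so the comments below hedge accordingly:

```shell
#!/bin/sh
# Why "rm .*" is dangerous, shown in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch .bashrc .settings regular.txt

echo .*            # in many shells this expands to: . .. .bashrc .settings
echo .[a-z0-9]*    # the safer pattern: .bashrc .settings only

rm -f .[a-z0-9]*   # removes the dot files, never . or ..
ls -A              # only regular.txt remains
cd / && rm -rf "$tmp"
```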
12: Reboot as the first option when resolving a problem
Unix sysadmins understand that Unix systems rarely need to be rebooted. In my decades of administering Unix systems, I have encountered quite a few that ran for years without a reboot. Of course, with occasional power outages and OS upgrades, this is something of a rarity. Even so, most anything can be accomplished -- including adding disks -- without a reboot, and we generally know how to make that happen. When we do feel compelled to reboot, it's often because we want to watch our systems come back up cleanly now, so that we can trust they'll do the same when we're not around to watch.
13: Start file names with a hyphen
Just about every Unix sysadmin has run into a problem in which a file with a name that starts with a hyphen causes a disproportionate amount of confusion, so we don't start file names with hyphens any more than we include blanks in them. That would be just too much work! We know that we'd have to insert -- into commands (e.g., cat -- -dumfilename) just to look at these files and rm -- -dumfilename to remove them, so we avoid them altogether.
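Both escape hatches are easy to demonstrate in a scratch directory, reusing the -dumfilename example from above:

```shell
#!/bin/sh
# Coping with a file name that starts with a hyphen, in a scratch
# directory.  "-dumfilename" is the example name from the text.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

touch -- -dumfilename   # "--" ends option parsing, so the name is just a name
cat -- -dumfilename     # read it with -- ...
cat ./-dumfilename      # ... or with a ./ prefix, which every command accepts
rm -- -dumfilename      # and the same trick deletes it

cd / && rmdir "$tmp"
```

The ./ prefix is the more portable habit, since a few older commands don't honor --.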
Unix systems administrators use caution in most everything we do. We're basically lazy and don't want to make work for ourselves. Besides, recovering from the same disaster more than once isn't fun. So we learn from our mistakes and are well aware of the many things that can go wrong when we're making changes on the systems we manage.