Disk error detection
Q: I sometimes see messages on the console when a disk has an error. How can I automatically look for a disk that is giving soft errors before it fails completely? Are there any tools that will do this for me?
A: Traditionally, the disk device driver reacts to a problem by logging an error via syslog, which prints it on the console and stores it in the /var/adm/messages file. While this is useful, it's an inefficient way to report a problem, and it's difficult to ensure that messages aren't lost or ignored. The error messages are also hard to parse automatically, because you might not know all the possible error conditions that could be reported.
The SyMON 1.x and 2.x products include a log file monitor that watches the /var/adm/messages file and warns the user if it sees certain kinds of errors. A new way to monitor disks was added in Solaris 2.6, and the iostat command was extended with -e and -E options that report error counts. I've now also extended the SE toolkit to look at the same information as part of the disk monitoring rule.
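To illustrate the log-file approach, here is a minimal sketch of a message scanner. It is not how SyMON works internally; the function name scan_messages and the two patterns are assumptions chosen for illustration, and real driver messages vary by release and device, which is exactly why this approach is fragile.

```python
import re

# ASSUMPTION: illustrative patterns only. Real disk driver messages differ
# between drivers and Solaris releases, so tune these for your own logs.
PATTERNS = [
    re.compile(r"WARNING:"),
    re.compile(r"Error Level: (Retryable|Fatal)"),
]


def scan_messages(lines, patterns=PATTERNS):
    """Return (line_number, text) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(p.search(line) for p in patterns):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To try it, open /var/adm/messages and pass the file object to scan_messages. Any message that the pattern list doesn't anticipate is silently missed, so a scanner like this can only complement the error counters described below.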
This isn't normally considered a performance question, but disks that are doing multiple retries due to transient errors can cause hard-to-find performance problems -- and the performance of a dead disk is zero!
Monitoring errors with iostat
Since Solaris 2.6, a new kstat data structure has been maintained in the kernel for every disk. Here is an example output from an Ultra 2 with two disks and a CD-ROM. The disk name is given in the normal device form of sd1 unless the -n option is also used to translate the name to c0t1d0s0 form.
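To automate the check, a minimal sketch (separate from the output listing mentioned above) might parse the iostat -e report and flag any device with a nonzero total. It assumes each data row is a device name followed by four integer counts: soft errors, hard errors, transport errors, and the total. The function names are hypothetical, so verify the column layout on your own system before relying on it.

```python
import subprocess


def parse_iostat_errors(text):
    """Map device name -> error counts parsed from `iostat -e` style text.

    ASSUMPTION: data rows look like 'name soft hard transport total'.
    Header rows and anything else that doesn't fit are skipped.
    """
    results = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue
        soft, hard, transport, total = (int(f) for f in fields[1:])
        results[fields[0]] = {
            "soft": soft,
            "hard": hard,
            "transport": transport,
            "total": total,
        }
    return results


def disks_with_errors(text):
    """Return sorted names of devices whose total error count is nonzero."""
    counts = parse_iostat_errors(text)
    return sorted(name for name, c in counts.items() if c["total"] > 0)


def run_iostat():
    """Run `iostat -e` and return its text output (needs Solaris 2.6 or later)."""
    completed = subprocess.run(
        ["iostat", "-e"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Calling disks_with_errors(run_iostat()) from a cron job and mailing any non-empty result is one simple way to catch a disk that is accumulating soft errors before it fails outright.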












