Step 2: Recognizing a Failing Disk
This section explains how to look for signs that one of your disks is having problems, and how to
determine which disk it is.
I/O Errors in the System Log
Often an error message in the system log file, /var/adm/syslog/syslog.log, is your first
indication of a disk problem. You might see the following error:
Asynchronous write failed on LUN (dev=0x3000015)
IO details : blkno : 2345, sector no : 23
These system log errors are described fully in “Matching Error Messages to Physical Disks and
Volume Groups” (page 163), which shows how to map this type of error message to a specific
disk.
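As a first pass over a busy log, it can help to collect the distinct device numbers that appear in these messages. The following is a minimal sketch, not a supported tool: the function name failed_write_devs is hypothetical, and it assumes the messages have exactly the form shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct dev= value found in
# "Asynchronous write failed" messages of a log file.
failed_write_devs() {
    # $1 = log file to scan; defaults to the system log named in this section
    logfile="${1:-/var/adm/syslog/syslog.log}"
    grep 'Asynchronous write failed' "$logfile" |
        sed -n 's/.*(dev=\(0x[0-9a-fA-F]*\)).*/\1/p' |
        sort -u
}
```

Running failed_write_devs with no argument scans /var/adm/syslog/syslog.log. Each value it prints still has to be mapped to a physical disk as described in the section referenced above.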
Disk Failure Notification Messages from Diagnostics
If you have Event Monitoring Service (EMS) hardware monitors installed on your system, and you
enabled the disk monitor disk_em, a failing disk can trigger an EMS event. Depending
on how you configured EMS, you might get an email message, a message in
/var/adm/syslog/syslog.log, or messages in another log file. EMS error messages identify a
hardware problem, what caused it, and what must be done to correct it. The following example is
part of an error message:
Event Time..........: Tue Oct 26 14:06:00 2004
Severity............: CRITICAL
Monitor.............: disk_em
Event #.............: 18
System..............: myhost
Summary:
Disk at hardware path 0/2/1/0.2.0 : Drive is not responding.
Description of Error:
The hardware did not respond to the request by the driver.
The I/O request was not completed.
Probable Cause / Recommended Action:
The I/O request that the monitor made to this device failed because the
device timed-out. Check cables, power supply, ensure the drive is powered ON,
and if needed contact your HP support representative to check the drive.
For more information on EMS, see the Diagnostics section on the http://docs.hp.com website.
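If EMS events are written to a file, the severity and hardware path of each disk event can be pulled out for a quick overview. The sketch below is illustrative only: the function name ems_disk_summary is hypothetical, and it assumes the event text is laid out exactly as in the example above, with each field starting at the beginning of a line.

```shell
#!/bin/sh
# Hypothetical helper: read EMS event text on standard input and print
# "<severity> <hardware path>" for each "Disk at hardware path" summary line.
ems_disk_summary() {
    awk '
        # Remember the most recent severity, e.g. "CRITICAL"
        /^Severity\.*:/ { sub(/^Severity\.*: */, ""); sev = $0 }
        # The hardware path is the fifth word of the summary line
        /^Disk at hardware path / { print sev, $5 }
    '
}
```

For the example event shown above, this prints CRITICAL followed by 0/2/1/0.2.0.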
LVM Command Errors
Sometimes LVM commands, such as vgdisplay, return an error suggesting that a disk has
problems. For example:
#vgdisplay -v | more
…
--- Physical volumes ---
PV Name /dev/dsk/c0t3d0
PV Status unavailable
Total PE 1023
Free PE 173
…
The physical volume status of unavailable indicates that LVM is having problems with the disk. You
can get the same status information from pvdisplay.
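On a system with many disks, the verbose output can be filtered so that only the unavailable physical volumes are listed. This is a minimal sketch under stated assumptions: the function name unavailable_pvs is hypothetical, and it assumes that each PV Name line is followed by its own PV Status line, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: read "vgdisplay -v" output on standard input and
# print the name of every physical volume whose status is "unavailable".
unavailable_pvs() {
    awk '
        # Remember the most recent physical volume name
        $1 == "PV" && $2 == "Name" { pv = $3 }
        # Report it when its status line says "unavailable"
        $1 == "PV" && $2 == "Status" && $3 == "unavailable" { print pv }
    '
}
```

A typical invocation would be vgdisplay -v | unavailable_pvs. For the example output above, it prints /dev/dsk/c0t3d0.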
The next two examples are warnings from vgdisplay and vgchange indicating that LVM has
no contact with a disk:
#vgdisplay -v vg
vgdisplay: Warning: couldn't query physical volume "/dev/dsk/c0t3d0":
The specified path does not correspond to physical volume attached to