Server initialization, recovery, and resets
S8100 Initialization
Maintenance Procedures
December 2003
However, with an engineer’s guidance, recovery can be disabled by setting the
sampling-interval or occupancy-threshold values to 0. More likely, the sampling-interval and
CPU-occupancy thresholds need to be fine-tuned to values that do not cause erroneous recovery
attempts.
NOTE:
The value of the sampling interval must be greater than or equal to 0. If the sampling
interval is set to 0, the top command is not run and no recovery is performed.
The threshold CPU-occupancy percentage must be between 0 and 100. If the threshold
CPU-occupancy percentage is set to 0, no recovery is performed but the top command’s
output is logged.
Setting the sampling interval and the threshold CPU-occupancy percentage to 0 may help
achieve stability by obtaining useful data without disrupting the processes.
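The rules in this note can be summarized in a short Python sketch. It is illustrative only and is not the Watchdog's actual code: the names `Action` and `decide` are hypothetical, and the assumption that recovery triggers when occupancy is greater than or equal to the threshold (rather than strictly greater) is ours.

```python
from enum import Enum


class Action(Enum):
    DISABLED = "top not run, no recovery"
    LOG_ONLY = "top output logged, no recovery"
    RECOVER = "recovery attempted"
    OK = "occupancy below threshold"


def decide(sampling_interval: int, threshold_pct: int, occupancy_pct: float) -> Action:
    """Map the two tunable values onto the behavior described in the note."""
    # Range checks stated in the note.
    if sampling_interval < 0:
        raise ValueError("sampling interval must be greater than or equal to 0")
    if not 0 <= threshold_pct <= 100:
        raise ValueError("threshold percentage must be between 0 and 100")

    # A sampling interval of 0 means top is never run.
    if sampling_interval == 0:
        return Action.DISABLED
    # A threshold of 0 means top output is logged but nothing is recovered.
    if threshold_pct == 0:
        return Action.LOG_ONLY
    # Assumption: ">=" is used here; the manual does not state the comparison.
    if occupancy_pct >= threshold_pct:
        return Action.RECOVER
    return Action.OK
```

For example, `decide(10, 0, 99.0)` returns `Action.LOG_ONLY`, which corresponds to the data-gathering configuration described above.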
Watchdog’s hardware timer
The Watchdog’s HiMonitor resets the timer on the hardware Watchdog circuitry via the Hardware-Sanity
device driver. If the Watchdog is unable to reset the timer, the timer’s value eventually decrements to 0,
and the processor is reset.
Hardware-Sanity device driver
The Hardware-Sanity device driver (a loadable module) is a modified Linux driver for the hardware
Watchdog. A Sanity thread periodically writes to the Hardware-Sanity driver, which resets the timer on
the hardware Watchdog. If the Sanity thread does not write to the Hardware-Sanity driver, the:
• Driver does not reset the timer on the hardware Watchdog
• Timer expires
• Hardware Watchdog reboots Linux
The driver has three capabilities: set the time-out interval to a configurable value, reset the timer to the
time-out interval, and reboot Linux.
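The interaction between the Sanity thread, the driver, and the hardware timer can be modeled with a small simulation. This is a sketch under stated assumptions, not the real driver: the class `SimulatedHardwareWatchdog`, its method names, and the tick-based timing are all hypothetical, and the sketch assumes that setting a new time-out also restarts the countdown.

```python
class SimulatedHardwareWatchdog:
    """Toy model of the three driver capabilities described above."""

    def __init__(self, timeout_ticks: int = 5) -> None:
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.rebooted = False

    def set_timeout(self, ticks: int) -> None:
        # Capability 1: set the time-out interval to a configurable value.
        if ticks <= 0:
            raise ValueError("time-out must be positive")
        self.timeout_ticks = ticks
        self.remaining = ticks  # assumption: a new time-out restarts the countdown

    def reset_timer(self) -> None:
        # Capability 2: reset the timer to the time-out interval.
        self.remaining = self.timeout_ticks

    def reboot(self) -> None:
        # Capability 3: reboot Linux (only recorded as a flag in this model).
        self.rebooted = True

    def tick(self) -> None:
        # The hardware timer counts down; reaching 0 triggers a reboot.
        if self.rebooted:
            return
        self.remaining -= 1
        if self.remaining <= 0:
            self.reboot()


def run(watchdog: SimulatedHardwareWatchdog, sanity_writes) -> bool:
    """Advance one tick per entry; True means the Sanity thread wrote that tick."""
    for wrote in sanity_writes:
        if wrote:
            watchdog.reset_timer()
        watchdog.tick()
    return watchdog.rebooted
```

With a time-out of 3 ticks, a Sanity thread that writes on every tick keeps the timer from expiring, while one that stops writing lets the countdown reach 0 and the simulated reboot occurs.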
Rolling reboots
There may be cases where recovering the system using a reboot does not correct the problem. If this
occurs, the server continually reboots. This repeated rebooting increases the difficulty of diagnosing the
problem. The Watchdog handles this with the fixed “MaxReboots” and “MaxRebootInterval” parameters
in the watchd.conf file. These fixed values are set to 3 reboots within 60 minutes. If the Watchdog detects
that the software is rebooting too quickly, it logs a message to syslog and does not start
Communication Manager software.
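The rolling-reboot check amounts to counting recent reboots inside a time window. The sketch below is a hypothetical illustration, not the Watchdog's implementation: the function `is_rolling_reboot` and the way reboot times are stored are assumptions, as is treating 3 or more reboots within the interval as "too quickly". The format of watchd.conf is not shown here because the manual does not describe it.

```python
from datetime import datetime, timedelta

# Fixed values quoted in the text: 3 reboots within 60 minutes.
MAX_REBOOTS = 3
MAX_REBOOT_INTERVAL = timedelta(minutes=60)


def is_rolling_reboot(reboot_times, now,
                      max_reboots=MAX_REBOOTS,
                      interval=MAX_REBOOT_INTERVAL) -> bool:
    """Return True if at least max_reboots reboots fall within the interval before now."""
    # Keep only reboots that happened inside the window ending at `now`.
    recent = [t for t in reboot_times if timedelta(0) <= now - t <= interval]
    # Assumption: reaching the limit (>=) already counts as rebooting too quickly.
    return len(recent) >= max_reboots


now = datetime(2003, 12, 1, 12, 0)
history = [now - timedelta(minutes=m) for m in (5, 20, 40)]
if is_rolling_reboot(history, now):
    # At this point the text says Watchdog logs to syslog and does not start
    # Communication Manager software; here the condition is only reported.
    print("rolling reboot detected")
```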
Restarts
Restart is a traditional Avaya term for a system restart of less severity than a full recreation. Restarts are
accomplished by retaining the memory state of certain processes.
Summary of Contents for CMC1
Page 1: ...Maintenance Procedures 555 245 103 Issue 1 1 December 2003 ...