Harmonic Spectrum X Installation Manual


Chapter 9: MediaStore 5100 and 5000 series hardware reference

cycled. When this condition is detected, the disk will be failed, regardless of any of the requisite conditions mentioned previously. In extreme circumstances, this may cause the MediaDirector to shut down its file system, stopping all playback and recording on that Spectrum server (but not affecting other Spectrum servers in an EFS system). The drive will be automatically bypassed, and an alarm message will be generated instructing the operator to remove and reinsert the drive. If this is done within five minutes of the failure, an automatic “surgical” rebuild will start immediately. Otherwise, one of the Spectrum servers in the system will automatically start a rebuild (provided a hot spare is available).
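
The recovery decision described above can be summarized in a short Python sketch. It is illustrative only and is not Spectrum code: the function name, its parameters, and the returned labels are hypothetical, and treating exactly five minutes as still inside the window is an assumption. The text does not say what happens when the drive is reinserted late and no hot spare is available, so the sketch only reports that no automatic rebuild is described for that case.

```python
from datetime import datetime, timedelta

# From the text: reinsertion "within five minutes of the failure".
REINSERT_WINDOW = timedelta(minutes=5)


def recovery_action(failed_at, reinserted_at, hot_spare_available):
    """Return a label for the recovery path the text describes.

    failed_at           -- datetime of the drive failure
    reinserted_at       -- datetime of reinsertion, or None if not reinserted
    hot_spare_available -- True if the system has a hot spare
    """
    # Drive pulled and reinserted in time: surgical rebuild on the same drive.
    if reinserted_at is not None and reinserted_at - failed_at <= REINSERT_WINDOW:
        return "surgical_rebuild"
    # Otherwise a server starts a rebuild, but only if a hot spare exists.
    if hot_spare_available:
        return "hot_spare_rebuild"
    # Assumption: the manual does not describe this case.
    return "no_automatic_rebuild_described"
```

For example, a drive reinserted three minutes after the failure maps to "surgical_rebuild", while one reinserted after six minutes on a system with a hot spare maps to "hot_spare_rebuild".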

About bad-block auto-repair

Review the details on Spectrum bad-block auto-repair.
When a block cannot be read or written (a “Read Error” or “Write Error”), the block is internally marked as bad. Bad-block errors can occur occasionally on any system and do not, by themselves, imply catastrophic drive failure. After a short delay that allows clusters of bad blocks to be collected, and if it is safe to do so, a bad-block auto-repair is performed as follows:
• Any drive with any unrepaired bad blocks (“Read/Write” and hard/soft errors) will be temporarily auto-failed from the RAID set.
• The failed blocks will be recovered or reallocated on the disk.
• Blocks that could not be recovered will be marked for later rebuild.
• The drive will be added back into the RAID set.
• A surgical rebuild will be performed. A surgical rebuild uses RAID functionality to regenerate the missing blocks that could not be recovered or were written to the RAID set while the drive was removed.
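
To illustrate the idea behind a surgical rebuild, the Python sketch below regenerates only a listed set of stripes on one drive from the other members of the RAID set. The manual does not state the RAID level or the algorithm Spectrum uses, so single-parity XOR is an assumption here, and the function names and the in-memory stripe layout are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def surgical_rebuild(stripes, drive, stale_stripes):
    """Regenerate only the stale stripes on one drive.

    stripes       -- list of stripes; each stripe is a list with one block
                     (bytes) per drive, parity included, so that the XOR of
                     all blocks in a stripe is zero (assumed layout)
    drive         -- index of the drive that was temporarily removed
    stale_stripes -- stripe indices that were unrecoverable or were written
                     while the drive was out of the set
    """
    for s in stale_stripes:
        # With single parity, any one block equals the XOR of all the others.
        others = [blk for d, blk in enumerate(stripes[s]) if d != drive]
        stripes[s][drive] = xor_blocks(others)
    return stripes
```

Only the stripes in stale_stripes are touched, which is why such a rebuild can be much shorter than a full rebuild of the drive.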

For bad-block auto-repair (BBAR) to be performed on a drive, the RAID set that contains the drive must be on-line with redundancy, and the file system that contains that RAID set must be on-line and writable.
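
The preconditions above amount to a simple all-of check, sketched here in Python. The RaidSetState type and its field names are hypothetical and exist only for illustration; they do not correspond to any Spectrum or SystemManager interface.

```python
from dataclasses import dataclass


@dataclass
class RaidSetState:
    """Hypothetical snapshot of the state relevant to BBAR."""
    online: bool        # RAID set is on-line
    redundant: bool     # RAID set currently has redundancy
    fs_online: bool     # containing file system is on-line
    fs_writable: bool   # containing file system is writable


def bbar_allowed(state):
    """True only if every precondition named in the text holds."""
    return (state.online and state.redundant
            and state.fs_online and state.fs_writable)
```

For instance, a RAID set that is already running without redundancy fails the check, which matches the requirement that the set be on-line with redundancy before a drive is temporarily failed.
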
If BBAR is unable to fix the block, the drive will be failed and will need to be replaced.
If an auto-fail is unsuccessful, be cautious when manually failing or removing any drives in the RAID set. Failing another drive in a rebuilding or compromised RAID set could cause all Spectrum servers to stop the file system. Contact Technical Support if you are unsure of what action to take.

Bad-block auto-repair in SystemManager

Warning alarms (yellow) are generated in SystemManager for any drive that reports a bad block.
In some cases, the bad-block auto-repair process may generate a red “CRIT” (critical) alarm in SystemManager when the drive is temporarily auto-failed. This alarm is normal and can be ignored, provided the subsequent alarms include the “RAID Set rebuild completed” message. The following is an example of the SystemManager alarm sequence:
1. Disk diagnostics detect bad blocks (after a “Read Error” or “Write Error” on the affected block).
2. Disk diagnostics deactivate the drive to recover bad blocks. During this time, RAID set protection is momentarily lost while bad blocks are recovered or reallocated and bad-block tables are updated.
3. The drive is added back to the RAID set and, if needed, a rebuild is scheduled. At this point, the drive is active, and RAID set protection is restored. After the rebuild is complete, disk diagnostics will confirm and report zero bad blocks.
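
An operator reviewing an exported alarm history could apply the rule above with a check like the following Python sketch. The actual SystemManager alarm format is not given in this section, so the (severity, message) tuples and the function name are assumptions; only the “CRIT” label and the “RAID Set rebuild completed” text come from the manual.

```python
# Message text quoted in the manual.
REBUILD_DONE = "RAID Set rebuild completed"


def crit_alarms_resolved(alarms):
    """Check whether every CRIT alarm was followed by a completed rebuild.

    alarms -- chronological list of (severity, message) tuples
              (hypothetical format, not a SystemManager export format)
    """
    unresolved = False
    for severity, message in alarms:
        if severity == "CRIT":
            # A critical alarm stays open until a rebuild completes.
            unresolved = True
        elif REBUILD_DONE in message:
            unresolved = False
    return not unresolved
```

If the function returns False, the critical alarm has not yet been followed by a completed rebuild and should not be dismissed as part of a normal auto-repair.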
