6 Housekeeping
In this chapter:
• What is housekeeping?
• What effect does housekeeping have on performance?
• Why is housekeeping important?
• What do I need to do?
What is housekeeping?
If data is deleted from the StoreOnce Backup System (e.g. a virtual cartridge is overwritten or
erased), any unique chunks are marked for removal, and any non-unique chunks are de-referenced
and their reference count decremented. The process of removing chunks of data is not an inline
operation because this would significantly impact performance. This process, termed
“housekeeping”, runs on the appliance as a background operation. It runs on a per-cartridge and
per-NAS-file basis, and will run as soon as the cartridge is unloaded and returned to its storage
slot or a NAS file has completed writing and has been closed by the appliance.
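The split between the inline delete and the background removal can be illustrated with a small reference-counting model. This is a minimal sketch only: the class `ChunkStore` and all of its method and attribute names are hypothetical and do not describe the appliance's internal implementation.

```python
from collections import defaultdict


class ChunkStore:
    """Toy deduplicating store: chunks are shared and reference-counted."""

    def __init__(self):
        self.refcount = defaultdict(int)   # chunk id -> number of references
        self.pending_removal = set()       # chunks waiting for housekeeping
        self.cartridges = {}               # cartridge name -> list of chunk ids

    def write(self, cartridge, chunk_ids):
        """Overwrite a cartridge; its old contents are de-referenced first."""
        self.erase(cartridge)
        self.cartridges[cartridge] = list(chunk_ids)
        for cid in chunk_ids:
            self.refcount[cid] += 1
            self.pending_removal.discard(cid)  # chunk is in use again

    def erase(self, cartridge):
        """Inline part of a delete: adjust counts only, free nothing yet."""
        for cid in self.cartridges.pop(cartridge, []):
            self.refcount[cid] -= 1
            if self.refcount[cid] == 0:
                self.pending_removal.add(cid)  # unique chunk: mark for removal

    def housekeeping(self):
        """Background part: remove the chunks that were marked for removal."""
        removed = sorted(self.pending_removal)
        for cid in removed:
            del self.refcount[cid]
        self.pending_removal.clear()
        return removed
```

In this model, erasing a cartridge is cheap because it only updates counters; the expensive removal is deferred to `housekeeping()`, which can be called when the system is otherwise idle.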
What effect does housekeeping have on performance?
Whilst the housekeeping process can run as soon as a virtual cartridge is returned to its slot, this
could cause a high level of disk access and processing overhead, which would affect other
operations such as further backups, restores, tape offload jobs or replication.
To avoid this problem, the housekeeping process checks for available resources before
running and, if other operations are in progress, housekeeping will dynamically hold off to
prevent impacting the performance of those operations. It is, however, important to note that the
hold-off is not binary (i.e. on or off), so even if backup jobs are in progress, some low level of
housekeeping will still take place, which may have a slight impact on backup performance.
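A non-binary hold-off can be pictured as a throttle that shrinks the share of resources given to housekeeping as other activity increases, without ever reaching zero. The sketch below is illustrative only: the function `housekeeping_share`, the linear scaling and the 5% floor are all assumptions, and the text above does not state how the appliance actually calculates the hold-off.

```python
def housekeeping_share(foreground_load, floor=0.05):
    """Return the fraction of resources granted to housekeeping.

    foreground_load: 0.0 (idle) to 1.0 (fully busy with backup, restore,
    tape offload or replication). floor is a hypothetical minimum share
    that keeps a low level of housekeeping running at all times.
    """
    load = min(max(foreground_load, 0.0), 1.0)  # clamp to the range [0, 1]
    return max(floor, 1.0 - load)
```

With these assumed values, an idle system gives housekeeping everything, while a fully loaded system still leaves it the small floor share, which is why backups may see a slight impact.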
Why is housekeeping important?
Housekeeping is important for maximizing the deduplication efficiency of the
appliance and, as such, it is important to ensure that it has enough time to complete. Running
backup, restore, tape offload and replication operations with no break (i.e. 24 hours a day) will
result in housekeeping never being able to complete.
As a general rule a number of minutes per day should be allowed for every 100 GB of data
overwritten on a virtual cartridge or NAS share. For example: if, on a daily basis, the backup
application overwrites two cartridges in different virtual libraries with 400 GB of data on each
cartridge, an HP D2D4106 appliance would need approximately 30 minutes of quiescent time
over the course of the next 24 hours to run housekeeping in order to de-reference data and reclaim
any free space.
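The arithmetic in this example can be sketched as follows. The rate used here is back-calculated from the example itself (30 minutes for 2 x 400 GB), so it is an approximation that applies only to the HP D2D4106; it is not a published per-product figure, and the function name `housekeeping_minutes` is hypothetical.

```python
def housekeeping_minutes(overwritten_gb, minutes_per_100_gb):
    """Estimate the quiescent time needed per day, in minutes."""
    return overwritten_gb / 100 * minutes_per_100_gb


# Rate implied by the example above: 30 minutes for 2 x 400 GB overwritten.
# This is an assumption derived from that example, for the D2D4106 only.
implied_rate = 30 / (2 * 400 / 100)   # minutes per 100 GB

daily_overwrite_gb = 2 * 400          # two cartridges, 400 GB each
estimate = housekeeping_minutes(daily_overwrite_gb, implied_rate)
```

Other models will need a different rate, so `minutes_per_100_gb` is left as a parameter rather than fixed in the function.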
What do I need to do?
Configuring backup rotation schemes correctly is very important to ensure the maximum efficiency
of the product; doing so reduces the amount of housekeeping that is required and creates a
predictable load. As a backup to one library or to one directory in a NAS share finishes, it
triggers housekeeping, which then impacts the performance of the backup to the next library or
NAS share. If backup jobs can be scheduled to complete at the same time, the impact of
housekeeping on backup performance will be greatly reduced.
Large housekeeping loads are created when large numbers of cartridges are manually erased or
re-formatted. In general all media overwrites should be controlled by the backup rotation scheme
so that they are predictable. Create enough virtual library cartridges for at least one backup rotation