If a NetDefendOS cluster has the Anti-Virus or IDP subsystems enabled, updates to the
Anti-Virus signature database or IDP pattern database will occur routinely. These updates are
downloaded from the external D-Link databases, and NetDefendOS must reconfigure before the
new database contents become active.
A database update causes the following sequence of events to occur in an HA cluster:
1. The active (master) unit downloads the new database files from the D-Link servers. The
   download is done via the shared IP address of the cluster.
2. The active (master) node sends the new database files to the inactive peer.
3. The inactive (slave) unit reconfigures to activate the new database files.
4. The active (master) unit now reconfigures to activate the new database files causing a
   failover to the slave unit. The slave is now the active unit.
5. After reconfiguration of the master is complete, failover occurs again so that the master
   once again becomes the active unit.
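The sequence above can be traced with a small model. This is only an illustrative sketch and not NetDefendOS code: the `Unit` class, the `run_database_update` function and the integer `db_version` field are all hypothetical names introduced here to show how the active role moves between the units during an update.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical model of one cluster unit (not a NetDefendOS object)."""
    name: str
    active: bool
    db_version: int


def run_database_update(master, slave, new_version):
    """Apply the five documented steps and return a list of event strings."""
    events = []

    # Step 1: the active (master) unit downloads the new database files.
    staged_version = new_version
    events.append("master downloads new database via shared IP")

    # Step 2: the master sends the files to the inactive peer.
    events.append("master sends new database to slave")

    # Step 3: the inactive (slave) unit reconfigures first.
    slave.db_version = staged_version
    events.append("slave reconfigures")

    # Step 4: the master reconfigures, which causes a failover to the slave.
    master.active, slave.active = False, True
    master.db_version = staged_version
    events.append("master reconfigures, slave becomes active")

    # Step 5: once the master is done, a second failover restores it.
    master.active, slave.active = True, False
    events.append("master becomes active again")

    return events


master = Unit("master", active=True, db_version=1)
slave = Unit("slave", active=False, db_version=1)
for line in run_database_update(master, slave, new_version=2):
    print(line)
```

The point the model makes is that each database update involves two failovers, and that the slave is always updated before the master so that the unit taking over already has the new database.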
Dealing with Sync Failure
An unusual situation that can occur in an HA cluster is a failure of the sync connection
between the master and slave, with the result that heartbeats and state updates are no longer
received by the inactive unit.
Should such a failure occur then the consequence is that both units will continue to function but
they will lose their synchronization with each other. In other words, the inactive unit will no
longer have a correct copy of the state of the active unit. A failover will not occur in this situation
since the inactive unit will realize that synchronization has been lost.
Failure of the sync interface results in the generation of hasync_connection_failed_timeout
log messages by the active unit. However, it should be noted that this log message is also
generated whenever the inactive unit appears to be not working, such as during a software
upgrade.
Failure of the sync interface can be confirmed by comparing the output from certain CLI
commands for each unit. The number of connections could be compared with the stats
command. If IPsec tunnels are heavily used, the ipsecglobalstat -verbose command could be
used instead and significant differences in the numbers of IPsec SAs, IKE SAs, active users and
IP pool statistics would indicate a failure to synchronize.
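The comparison described above can be done by eye, but a short script makes the idea concrete. The following is a minimal sketch that assumes the counters have already been copied by hand from the CLI output of each unit into dictionaries. The key names, the `diverging_counters` function and the 10% tolerance are assumptions chosen for illustration. They are not D-Link values and the script does not parse real CLI output.

```python
def diverging_counters(active_stats, inactive_stats, tolerance=0.10):
    """Return the names of counters that differ by more than `tolerance`.

    The difference is measured relative to the larger of the two values,
    so small drift between healthy units is ignored.
    """
    suspicious = []
    for key in sorted(active_stats):
        a = active_stats[key]
        b = inactive_stats.get(key, 0)
        baseline = max(a, b)
        if baseline == 0:
            continue  # both counters are zero, nothing to compare
        if abs(a - b) / baseline > tolerance:
            suspicious.append(key)
    return suspicious


# Hypothetical figures noted down from each unit.
active = {"connections": 1000, "ipsec_sas": 40, "ike_sas": 20}
inactive = {"connections": 990, "ipsec_sas": 3, "ike_sas": 1}

print(diverging_counters(active, inactive))
```

With these example figures the connection counts are close enough to pass, while the IPsec and IKE SA counts differ widely, which is the pattern that would suggest the units are no longer synchronized.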
Once the broken sync interface is fixed, perhaps by replacing the connecting cable,
resynchronization of the two units will take place automatically. If the sync interface is now
functioning correctly, there may still be some small differences in the statistics from each cluster
unit but these will be minor compared with the differences seen in the case of failure.
In unusual circumstances, synchronization between the active and inactive unit will not take
place automatically. In this case, it may be necessary to manually restart the unsynchronized
inactive unit in order to force resynchronization. This can be achieved using the CLI command:
gw-world:/> shutdown
A restart of the inactive unit will cause the following to take place:
• During startup, the inactive unit sends a message to the active unit to flag that its state has
  been initialized and it requires the entire state of the active unit to be sent.
• The active unit then sends a copy of its entire state to the inactive unit.
• The inactive unit then becomes synchronized after which a failover can take place.
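The restart exchange listed above can be pictured as a simple request and reply. The sketch below is a conceptual model only: the `ActiveUnit` and `InactiveUnit` classes, their method names and the dictionary used for state are hypothetical, and the real protocol used over the sync interface is not described in this section.

```python
class ActiveUnit:
    """Hypothetical stand-in for the unit currently handling traffic."""

    def __init__(self, state):
        self.state = dict(state)

    def handle_full_state_request(self):
        # Reply with a copy of the entire state, as in the second bullet.
        return dict(self.state)


class InactiveUnit:
    """Hypothetical stand-in for the restarted, unsynchronized unit."""

    def __init__(self):
        self.state = {}
        self.synchronized = False

    def restart(self, peer):
        # On startup the state is initialized (empty) and the unit
        # asks its peer for the complete state.
        self.state = {}
        self.synchronized = False
        self.state = peer.handle_full_state_request()
        # Only now is the unit a valid failover target.
        self.synchronized = True


active = ActiveUnit({"conn-1": "established", "conn-2": "established"})
inactive = InactiveUnit()
inactive.restart(active)
print(inactive.synchronized, len(inactive.state))
```

The model shows why a restart forces resynchronization: the inactive unit discards whatever partial state it held and rebuilds it from a full copy instead of relying on incremental updates it may have missed.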