RUGGEDCOM ROS
User Guide
Chapter 6
Troubleshooting
VLANs
Problem
Solution
Another possible explanation is that some links in the network run in half-duplex mode.
RSTP uses a peer-to-peer protocol called Proposal-Agreement to ensure rapid transitioning in the
event of a link failure. This protocol requires full-duplex operation. When RSTP detects a
port that is not running in full-duplex mode, it cannot rely on the Proposal-Agreement protocol
and must transition the port the slow (i.e. STP) way. If possible, configure the port for
full-duplex operation. Otherwise, configure the port’s point-to-point setting to true.
Either change allows the Proposal-Agreement protocol to be used.
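The rule described above can be summarized as a small decision function. This is a minimal sketch, not ROS code: the function name, the "full"/"half" duplex labels, and the handling of an "auto" point-to-point value (assumed here to follow the duplex mode) are illustrative assumptions.

```python
def can_use_proposal_agreement(duplex: str, point_to_point: str = "auto") -> bool:
    """Return True if the port may use the fast Proposal-Agreement handshake.

    duplex:         "full" or "half" (hypothetical labels)
    point_to_point: "true", "false", or "auto" (assumed: auto follows duplex)
    """
    if point_to_point == "true":
        return True           # forced point-to-point: handshake allowed
    if point_to_point == "false":
        return False          # treated as shared media: slow STP-style transition
    return duplex == "full"   # auto: only full-duplex links qualify


print(can_use_proposal_agreement("half"))          # False
print(can_use_proposal_agreement("half", "true"))  # True
```

The second call corresponds to the fallback in the text: a half-duplex port converges quickly only when its point-to-point setting is forced to true.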
When the switch is tested by deliberately
breaking a link, it takes a long time before
devices beyond the switch can be polled.
Is it possible that some ports participating in the topology have been configured to STP mode
or that the port’s point-to-point parameter is set to false? STP and multipoint ports converge
slowly after failures occur.
Is it possible that the port has migrated to STP? If the port is connected to the LAN segment
by shared media and STP bridges are connected to that media, then convergence after link
failure will be slow.
Delays on the order of tens or hundreds of milliseconds can result when the broken link is
the sole link to the root bridge and the secondary root bridge is poorly chosen. The worst
possible design places the secondary root bridge at the edge of the network farthest from
the root. In this case, a configuration message must propagate out to the edge and then
back to reestablish the topology.
The network is composed of a ring of
bridges, of which two (connected to
each other) are managed and the rest are
unmanaged. Why does the RSTP protocol
work quickly when a link is broken between
the managed bridges, but not in the
unmanaged bridge part of the ring?
A properly operating unmanaged bridge is transparent to STP configuration messages. The
managed bridges exchange configuration messages through the unmanaged part of the ring
as if it did not exist. When a link in the unmanaged part of the ring fails, however, the
managed bridges can detect the failure only when hello messages time out. Restoring full
connectivity then takes three hello times plus two forward delay times.
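To get a feel for how long this takes, the recovery time can be computed from the two timers. The sketch below is illustrative only. The function name is hypothetical, and the default values (2 s hello time, 15 s forward delay) are the commonly used IEEE 802.1D defaults, assumed here rather than taken from this guide. Substitute the values configured on the managed bridges.

```python
def unmanaged_ring_recovery_seconds(hello_time: float = 2.0,
                                    forward_delay: float = 15.0) -> float:
    """Time to restore connectivity after a failure in the unmanaged part of the ring.

    Three hello times are needed to time out the missing hello messages,
    followed by two forward delay times before the port forwards again.
    """
    return 3 * hello_time + 2 * forward_delay


print(unmanaged_ring_recovery_seconds())          # 36.0
print(unmanaged_ring_recovery_seconds(1.0, 4.0))  # 11.0
```

With the assumed defaults, recovery takes 36 seconds, compared with the much faster recovery seen when the failed link is between the managed bridges.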
The network becomes unstable when a
specific application is started. The network
returns to normal when the application is
stopped.
RSTP sends its configuration messages using the highest possible priority level. If CoS is
configured to allow traffic flows at the highest priority level and these traffic flows burst
continuously to 100% of the line bandwidth, spanning tree operation may be disrupted.
Avoid assigning application traffic to the highest CoS.
When a new port is brought up, the root
moves on to that port instead of the port it
should move to or stay on.
Is it possible that the port cost is incorrectly programmed or that auto-negotiation derives an
undesired value? Inspect the port and path costs with each port active as root.
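The effect of an auto-negotiated cost on root port selection can be illustrated with a short calculation. This is a simplified sketch under stated assumptions: the cost table uses the IEEE 802.1D-2004 recommended RSTP-style values, the function names and port figures are hypothetical, and the tie-breaking rules (bridge ID, port ID) are omitted.

```python
# Assumed RSTP-style port costs by link speed in Mb/s (IEEE 802.1D-2004 recommended values).
RSTP_COST = {10: 2_000_000, 100: 200_000, 1000: 20_000}


def path_cost_via(advertised_cost, speed_mbps, configured_cost=None):
    """Root path cost through a port: the neighbor's advertised cost plus the port cost.

    If no cost is configured, the cost is derived from the negotiated speed.
    """
    port_cost = configured_cost if configured_cost is not None else RSTP_COST[speed_mbps]
    return advertised_cost + port_cost


def pick_root_port(candidates):
    """Return the port with the lowest root path cost (tie-breaking omitted)."""
    return min(candidates, key=candidates.get)


# Hypothetical example: the intended root port runs at 100 Mb/s,
# while a newly connected port negotiates 1000 Mb/s.
candidates = {
    "port 1": path_cost_via(200_000, 100),    # 400000
    "port 2": path_cost_via(220_000, 1000),   # 240000
}
print(pick_root_port(candidates))  # port 2

# Configuring a higher cost on the new port keeps the root on port 1.
candidates["port 2"] = path_cost_via(220_000, 1000, configured_cost=2_000_000)
print(pick_root_port(candidates))  # port 1
```

In this example the new port wins only because its negotiated speed yields a lower derived cost, which is the situation the question above asks you to check for.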
An Intelligent Electronic Device (IED) or
controller does not work with the device.
Certain controllers with low CPU bandwidth have been found to misbehave when they
receive unexpected traffic. Try disabling STP for the port.
If the controller fails around the time of a link outage, there is the remote possibility that
frame disordering or duplication may be the cause of the problem. Try setting the root port
of the failing controller’s bridge to STP.
Polls to other devices are occasionally lost.
Review the network statistics to determine whether the root bridge is receiving Topology
Change Notifications (TCNs) around the time of observed frame loss. Intermittent links in
the network may be the cause.
The root is receiving a number of TCNs.
Where are they coming from?
Examine the RSTP port statistics to determine the port from which the TCNs are arriving.
Sign on to the switch at the other end of the link attached to that port. Repeat this step until
the switch generating the TCNs is found (i.e. the switch that is itself not receiving a large
number of TCNs). Determine the problem at that switch.
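The hop-by-hop search described above can be sketched as a simple loop. This is an illustration only: the function name, the data structures, the switch and port names, and the threshold for "a large number" of TCNs are all hypothetical, and the counts would in practice be read manually from each switch's RSTP port statistics.

```python
def find_tcn_source(start, tcn_rx, neighbor, threshold=10):
    """Follow the port receiving the most TCNs from switch to switch.

    tcn_rx:    {switch: {port: number of TCNs received on that port}}
    neighbor:  {(switch, port): switch at the other end of that link}
    threshold: assumed cut-off for "a large number" of TCNs
    Returns the first switch that is not itself receiving many TCNs.
    """
    current, visited = start, set()
    while current not in visited:
        visited.add(current)
        counts = tcn_rx.get(current, {})
        if not counts:
            return current                    # no TCNs received here
        port = max(counts, key=counts.get)    # port with the most TCNs
        if counts[port] < threshold:
            return current                    # this switch generates the TCNs
        current = neighbor[(current, port)]   # move to the switch on that link
    return current                            # counts led back to a visited switch; stop


# Hypothetical statistics gathered from three switches.
tcn_rx = {
    "root": {"1": 0, "2": 57},
    "sw-a": {"1": 0, "3": 55},
    "sw-b": {"1": 1},
}
neighbor = {("root", "2"): "sw-a", ("sw-a", "3"): "sw-b"}

print(find_tcn_source("root", tcn_rx, neighbor))  # sw-b
```

Here the search ends at sw-b, the first switch along the path that is not receiving a large number of TCNs, so that is where to look for the underlying problem.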
Section 6.4
VLANs
The following describes common problems related to VLANs.