Advanced System Diagnostics and Troubleshooting Guide
Packet Errors and Packet Error Detection
described in the section “System (CPU and Backplane) Health Check” on page 70. For example, the
system health check facility can be configured so that ExtremeWare inserts a message into the
system log when a checksum error is detected.
Failure Modes
Although packet errors are extremely rare, they can occur anywhere along the data path,
along the control path, or while a packet is stored in packet memory. A checksum mismatch can result
from a fault in any of the components between the ingress and egress points, including, but not
limited to, the packet memory (SRAM), ASICs, MAC, or bus transceiver components.
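As a rough illustration of how a checksum mismatch exposes corruption between the ingress and egress points, the following Python sketch computes a checksum when a packet enters, flips one bit to simulate a fault (for example, in packet memory), and verifies the checksum on the way out. CRC-32 from Python's zlib module is used purely as a stand-in; the checksum algorithm that the switch hardware actually uses is not stated here, and all function names are hypothetical.

```python
import zlib


def ingress_checksum(payload: bytes) -> int:
    """Compute a checksum at ingress (CRC-32 is an assumption, not the switch's algorithm)."""
    return zlib.crc32(payload)


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit fault somewhere between ingress and egress."""
    corrupted = bytearray(payload)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def egress_check(payload: bytes, expected: int) -> bool:
    """Return True if the payload still matches the checksum computed at ingress."""
    return zlib.crc32(payload) == expected


packet = b"example packet payload"
checksum = ingress_checksum(packet)

clean_ok = egress_check(packet, checksum)                  # unmodified packet
corrupt_ok = egress_check(flip_bit(packet, 13), checksum)  # one bit altered in transit
```

The sketch only shows that a mismatch reveals that the data changed somewhere along the path. It does not identify which component introduced the fault.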
There are many causes and conditions that can lead to packet error events. These causes and conditions
can fall into one of these categories:
• Transient errors
• Systematic errors
  - Soft-state errors
  - Permanent errors
The failure modes that can result in the above categories are described in the sections that follow.
Transient Failures
Transient failures are errors that occur as one-time events during normal system processing. These types
of errors will occur as single events, or might recur for short durations. Because these transient events
usually occur randomly throughout the network, there is usually no single locus of packet errors. They
are temporary (do not persist), do not have a noticeable impact on network functionality, and require no
user intervention to correct: There is no need to swap a hardware module or other equipment.
Systematic Failures
Systematic errors are repeatable events: some hardware device or component is malfunctioning in such a
way that it persistently exhibits incorrect behavior. In the context of the ExtremeWare Advanced System
Diagnostics, the appearance of a checksum error message in the system log, for example, indicates
that the normal error detection mechanisms in the switch have detected that the data in a packet has
been modified inappropriately. While checksums provide a strong check of data integrity, checksum errors
must be qualified according to their risk to the system and by what you can do to resolve the problem.
Systematic errors can be subdivided into two subgroups:
• Soft-state failures
• Permanent, or hard, failures
Soft-State Failures
These types of error events are characterized by a prolonged period of reported error messages and
might, or might not, be accompanied by noticeable degradation of network service. These events require
user intervention to correct, but are resolved without replacing hardware.
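The distinction drawn above between transient failures (single events or short bursts) and systematic failures (persistent, prolonged error reporting) can be sketched as a simple heuristic over the timestamps of checksum error messages reported for one device. This is a minimal sketch, not an ExtremeWare feature: the function name and both thresholds are hypothetical and would need tuning for a real network.

```python
from typing import List

# Hypothetical thresholds (assumptions, not values from ExtremeWare).
MAX_TRANSIENT_EVENTS = 3
MAX_TRANSIENT_SPAN_SECONDS = 60.0


def classify_error_run(timestamps: List[float]) -> str:
    """Classify checksum error reports from one device by how long they persist.

    timestamps: times, in seconds, at which error messages were logged.
    """
    if not timestamps:
        return "none"
    span = max(timestamps) - min(timestamps)
    # A few events confined to a short window match the description of
    # transient failures; anything more prolonged is treated as systematic.
    if len(timestamps) <= MAX_TRANSIENT_EVENTS and span <= MAX_TRANSIENT_SPAN_SECONDS:
        return "transient"
    return "systematic"


single_event = classify_error_run([10.0])
short_burst = classify_error_run([0.0, 5.0, 20.0])
prolonged = classify_error_run([0.0, 600.0, 1200.0, 1800.0, 2400.0])
```

Log timing alone cannot separate soft-state failures from permanent ones. That distinction depends on whether user intervention resolves the problem without replacing hardware.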