Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
(Troubleshooting)
Ch. 7 Troubleshooting Performance
Issues Part 2
Rick Graziani
Cabrillo College
graziani@cabrillo.edu
Fall 2014
Materials
Book:
Troubleshooting and Maintaining
Cisco IP Networks (TSHOOT)
Foundation Learning Guide:
Foundation learning for the CCNP
TSHOOT 642-832
By Amir Ranjbar
Book
ISBN-10: 1-58705-876-6
ISBN-13: 978-1-58705-876-9
eBook
ISBN-10: 1-58714-170-1
ISBN-13: 978-1-58714-170-6
Topics
Part 1: Troubleshooting Network Application Services
NetFlow accounting
IP SLAs (Service Level Agreements)
NBAR (Network-Based Application Recognition) packet
inspection
SLB (Server Load Balancing)
QoS (Quality of Service) and AutoQoS
Part 2: Routers and Switches
Performance Issues on Switches
Performance Issues on Routers
Troubleshooting Performance
Issues on Switches
10
The control plane CPU and memory are not involved in switching traffic.
Does not have a direct impact on switch performance.
Responsible for updating the information in the forwarding
hardware.
So has an indirect effect on the forwarding capability.
If CPU is constantly running at a high load, this could eventually affect the
forwarding behavior of the device.
Control plane handles any traffic that cannot be handled by the
forwarding hardware.
A high load on the control plane hardware could therefore be an indication:
Forwarding hardware has reached its maximum capacity
Forwarding hardware is not handling traffic as it should.
11
12
14
When you find indications of packet loss on a switch, the first place to look
is usually the output of the show interface command.
show interfaces interface counters: Displays the total number of:
Input and output unicast
Multicast and broadcast packets
Input and output byte counts.
show interfaces interface counters errors: Displays the total number of:
Error statistics for each interface
17
InOctets
647140108
InUcastPkts
499128
OutOctets
28533484
OutUcastPkts
319996
InMcastPkts
4305
InBcastPkts
0
OutMcastPkts
52
OutBcastPkts
3
Align-Err
0
Single-Col
0
FCS-Err
12618
Multi-Col
0
Xmit-Err
0
Late-Col
0
Rcv-Err
12662
Excess-Col
0
UnderSize
0
Carri-Sen
0
OutDiscards
0
Runts
0
Giants
44
When you find indications of packet loss on a switch, the first place to look
is usually the output of the show interface command.
show interfaces interface counters: Displays the total number of:
Input and output unicast
Multicast and broadcast packets
Input and output byte counts.
show interfaces interface counters errors: Displays the total number of:
Error statistics for each interface
Align-Err
Alignment errors - Frames that do not end with an even number of octets
(padded) and have a bad Cyclic Redundancy Check (CRC), received on the
port.
Usually indicate a physical problem when the cable is first connected to the
port:
Bad Cabling
Bad port
Bad NIC
Bad interface module (UCSC)
Can also indicate a duplex mismatch:
If there is a hub connected to the port, collisions between other devices
on the hub can cause these errors.
20
FCS-Err
The number of valid size frames with Frame Check Sequence (FCS) errors
but no framing errors.
This is typically a physical issue:
Bad cabling
Bad port
Bad Network Interface Card (NIC)
Can also indicate a duplex mismatch
21
22
Undersize
The frames received that are smaller than the minimum IEEE 802.3 frame
size of 64 bytes long
Cause:
Check the device that sends out these frames.
23
Single-Col
The number of times one collision occurs before the port transmits a frame
to the media successfully.
Multi-Col
This is the number of times multiple collisions occur before the port
transmits a frame to the media successfully.
Collisions are normal for ports operating in half-duplex mode, but should not
be seen on ports operating in full-duplex mode.
Cause: If collisions are increasing dramatically:
Indicates a highly utilized link on half-duplex links
Possibly a duplex mismatch with the attached device
24
Late-Col
This is the number of times that a collision is detected on a particular port
late in the transmission process.
For a 10 Mb/s port, this is later than 512 bit-times into the transmission of a
packet, which is 51.2 microseconds
This error can indicate a duplex mismatch among other things seen on the
half-duplex side.
As the half-duplex side transmits
The full-duplex side does not wait its turn and transmits simultaneously
This causes a late collision.
Collisions should not be seen on ports configured as full-duplex
Half duplex: Late collisions can also indicate an Ethernet cable or segment
that is too long.
25
Excess-Col
This is a count of frames transmitted on a particular port, which fail due to
excessive collisions.
Occurs when:
A packet has a collision 16 times in a row.
The packet is then dropped.
Indication:
The load on the segment needs to be split across multiple segments
Can also point to a duplex mismatch with the attached device.
Collisions should not be seen on ports configured as full-duplex.
26
Carri-Sen
This occurs every time an Ethernet controller wants to send data on a halfduplex connection.
The controller senses the wire and checks if it is not busy before
transmitting.
This is normal on a half-duplex Ethernet segment.
27
Runts
The frames received that are smaller than the minimum IEEE 802.3 frame
size (64 bytes for Ethernet) and with a bad CRC.
Causes:
Duplex mismatch
Physical problems:
Bad cable
Bad port
Bad NIC
28
Giants
These are frames that exceed the maximum IEEE 802.3 frame size (1518
bytes for non-jumbo Ethernet) and have a bad FCS.
Maybe okay and may be permitted (see Cisco documentation)
Try to find the offending device and remove it from the network.
In many cases, it is the result of a bad NIC.
29
Switchport issues:
No cable connected
Wrong port
Device has no power
Wrong cable type
Bad cable
Loose connections
Path panels
Media converters (eliminate if possible)
Wrong or bad gigabit interface converter (GBIC)
31
32
33
ASW1 Fa 0/2 which leads to the client, does not show a significant number
of errors.
However, ASW1 Fa 0/1 which leads to CSW1, there is a high percentage of
FCS errors.
FCS errors can have various causes, such as:
bad cabling
bad interface hardware
duplex mismatch
34
35
Auto-MDIX
Manual
The default setting for auto-MDIX was changed from disabled to enabled
starting from Cisco IOS software release 12.2(20)SE.
If you have a switch that supports auto-MDIX, but is running older software,
you can enable this feature manually using the mdix auto
Speed and duplex auto negotiation must be enabled as well.
37
To verify the status of auto-MDIX, speed, and duplex for an interface you
can use the show interface transceiver properties command
38
The Forwarding
Hardware
Forwarding hardware:
The components that are involved in switching the frames from the
ingress interface to the egress interface
Can have a big impact that they have on the performance of the switch.
Two components:
Backplane
Decision-making logic
39
Backplane:
The backplane carries traffic between interfaces.
There are many different types of backplane architectures.
The impact is limited.
Designed for very high switching capacity.
In most cases is the capacity of the links between the devices, not the
capacity of the backplanes of the switches.
40
TCAM
Decision-making logic:
For each incoming frame, the decision-making logic makes the decision to
either:
Forward the frame
Discard the frame
Also called performing layer 2 and layer 3 switching actions.
Provides the information that is necessary to:
Rewrite and forward the frame and..
Processing of access-lists (if applicable)
Processing quality of service (QoS) features (if applicable)
Consists of specialized high performance lookup memory Ternary Content
Addressable Memory (TCAM)
41
42
CAM (FYI)
43
TCAM (FYI)
44
http://www.cisco.com/en/US/products/hw/switches/ps5023/products
_tech_note09186a00801e7bb9.shtml
45
TCAM
If for some reason frames cannot be forwarded by the TCAM, they will be
handed off (punted) to the CPU for processing.
CPU is also used to execute the control plane processes, so it can only
forward traffic at certain rate.
Can effect the traffic throughput and control plane processes.
46
TCAM
TCAM will punt any frames to the CPU for forwarding that it cannot forward
itself.
Traffic may be punted or handled by the CPU for many reasons:
Packets destined for any of the switch IP addresses (Telnet, SSH,)
Multicasts and broadcasts from control plane protocols such as the STP
or routing protocols.
Because a feature is not supported in hardware (GRE tunnels)
TCAM is full (The TCAM has a limited capacity)
47
49
50
One remedy to the TCAM utilization and exhaustion problem is reducing the
amount of information that the control plane feeds into the TCAM.
Use techniques such as:
route summarization
route filtering
access-list (prefix-list) optimization
51
52
Checking CPU
Utilization
(Review)
Over the past minute 31% of the available CPU has been used
SSH Process was responsible for roughly half of these CPU cycles
(15.67%) over that period.
What the remaining 15% CPU cycles recorded over the last minute
were spent on?
53
Checking CPU
Utilization
(Review)
What the remaining 15% CPU cycles recorded over the last minute were spent
on?
CPU runs both IOS processes and is also responsible for packet switching.
The packet switching task is not handled by a specific process.
The CPU is:
Interrupted to suspend the current process that it is executing
Switch one or more packets
Resume the execution of scheduled processes
Packet switching calculated by adding the CPU percentages for all processes
54
and subtracting that total from the total CPU percentage listed at the top.
If CPU interrupt mode is high, this means that the switch is forwarding part
of the traffic in software instead of the TCAM handling it.
Cause:
TCAM allocation failures
Features that cannot be handled in hardware
Note 1: Important to have a baseline
Note 2: In general, an average CPU load of 50 percent is not problematic
and temporary bursts to 100 percent are not problematic, as long as
there is a reasonable explanation for the observed peaks.
56
57
DHCP Issues
60
61
Spanning Tree
Issues
Ill-behaving instance of STP might slow down the network and the switch.
STP issues can cause:
Topology loops
Switch ports not going into forwarding state
Other STP situations include issues related to capacity planning
Per VLAN Spanning Tree Plus (PVST+)
Deterministic approach to selecting root bridges
62
HSRP
63
64
show interfaces
We see that the interface connecting to the PC is up and line protocol is up.
File server interface, shows the same status.
So, there is no problem with the connection itself.
65
67
68
70
G0/11
G0/13
72
73
showintg/2
Switch#showprocessescpu
76
77
The show vlan command reveals that Gig 0/2, 9, 11, 12, 13 and 22 are in
VLAN 10.
Use show interface controller include broadcasts on these interfaces to
find out which of these ports is the source of the excessive ARP packets
g0/11 and g0/13 ports have excessive broadcasts
These ports are the ones that the Wireless Access Points (WAPs) connect
to.
We now know that these are the broadcasts from the wireless clients
Because the WAPs act like hubs, they forward all their client broadcasts to
the switch.
78
We can limit the amount of broadcasts the switch accepts from those ports.
Use the storm-control command on g0/11 and g0/13 interfaces to limit
broadcasts to 3 packets per second.
show processes cpu sorted Observe positive results
You confirm with our users that they are no longer experiencing any
problems we document our work.
79
80
confirm that the PC is connected using the show interfaces command, and
see that it is up/up
81
83
84
85
86
Huge output similar to the output when you displayed access-list vlan10_out
You cant help but to wonder if this access list is affecting switch
performance so badly that users cannot connect.
87
88
89
Troubleshooting Performance
Issues on Routers
90
91
92
Some of the most common router processes that could cause high CPU
utilization:
ARP Input: If the router has to originate an excessive number of ARP
requests.
Interesting notes:
Multiple ARP requests for the same IP address are rate-limited to one
request every two seconds.
Excessive numbers of ARP requests can only occur if the router needs
to originate ARP requests for many different IP addresses.
This can happen if an IP route has been configured pointing to a
broadcast address.
93
94
95
Data Link
Header
IP Packet
Data Link
Trailer
Fast Switching
Process by the Input Process that is running on the central CPU:
Router strips the Layer 2 header from an incoming frame
Looks up the Layer 3 network address in the routing table
Router initializes the fast-switching cache and stores the
corresponding Data Link Layer
Sends the frame with a rewritten Layer 2 header with a newly
computed CRC to the outgoing interface
When subsequent frames to that same destination arrive, a cache
lookup is performed and the destination is found in the fast-switching
cache (bypassing the Input Process running on the central CPU).
New CRC computed for the frame.
Interface Configuration:
Router(configif)#iproutecache
High CPU utilization when:
High number of new flows per second
This can be due to a network attack
98
99
100
103
Troubleshooting CEF
105
show adjacency - To see the adjacency table entries for this next-hop we
use the
Note the difference that there is no ip in this command.
Can see the full Layer 2 frame header associated with this next-hop
Layer 2 MAC address, built through ARP.
107
108
109
110
Memory Size Does not Support the Cisco IOS Software Image:
Check minimum memory size for the Cisco IOS software feature set and
version that you are running:
Release Notes (available to registered customers only)
IOS Upgrade Planner (available to registered customers only)
The actual memory requirements will vary based on protocols used, routing
tables, and traffic patterns on the network.
111
113
114
115
show buffers - Displays statistics for the buffer pools on the router.
Output reveals a buffer leak in the middle buffers pool.
There are a total of 17602 middle buffers in the router
Only 11 are in the free list.
This implies that some process takes all the buffers, but does not return
them.
Similar to a generic memory leak, a buffer leak is caused by a software bug
Fix: Upgrade the Cisco IOS software
116
http://www.cisco.com/en/US/products/hw/modules/ps2643/products_tech_n
ote09186a0080093fc5.shtml
117