BU Control Technologies

Consult IT
Web Tech Talks

© 2014 ABB Automation GmbH


May 27, 2018 | Slide 1
WT262
800xA Troubleshooting Networks and Network Loops

Presented by:
Stephan Hissbach

Date:
 June 11th, 2014, at 9:00 a.m. and 3:00 p.m. (CET)

Duration:
 60-90 minutes

Contact:
 Web Tech Talks: Thomas Kruse (thomas.kruse@de.abb.com)

Agenda

Introduction
 Network Overload Situations
 Broadcast Storms

Protective Measures in 800xA
 RNRP Loop Detection and Loop Protection
 Network Filtering and Loop Protection in AC 800M
 Network Storm Protection in AC 800M

Live Demo

Preventive Actions

Document References

Questions

Introduction
Network Overload Situations

 A network overload (sometimes called a network storm) is a
fault-initiated transmission of network packets at
abnormally high transfer rates.
 Causes:
 Malicious attacks (Denial of Service).
 Hardware failures (typically of network adapters).
 Network loops (“broadcast storms”).
 Impacts:
 High network utilization, up to 100%.
 Delay or stop of all other network communication.
 High CPU load on the connected devices.

Introduction
Causes for Network Loops

 Incorrect cabling.
 Steady-state effect, exists until the cabling is corrected.
 Easy to prevent in small networks, but harder in large
ones.
 Routing changes in networks with ring redundancy.
 Transient effect, mostly lasting only a few seconds.
 Difficult to prevent.
 Root cause is difficult to find.

Introduction
Network Ring Redundancy

 Multiple switches are often connected in a ring structure to
provide link redundancy. This is a network loop by design.
 The failure of one cable link between the switches can be
compensated.
 Commonly used protocols:
 Spanning Tree Protocol (STP), outdated
 Rapid Spanning Tree Protocol (RSTP), commonly used
 Hiper Ring, a proprietary Hirschmann protocol. It is similar
to RSTP, but has faster reaction times.
 One switch acts as "loop guard" (or "redundancy
manager"), preventing packets from looping around.
 The loop guard role may be moved from one switch to
another. This may result in transient network loops!

Introduction
Broadcast Storms

 Broadcast messages are sent to every device in the
network.
 Used by Windows ARP, DHCP, NetBIOS and more.
 Broadcast addresses:
 FF:FF:FF:FF:FF:FF (MAC address)
 255.255.255.255 (IP address)
 Switches send broadcast telegrams to every outgoing port.
If there is a loop, they receive them again and send them out
over and over again.
 This results in a broadcast storm, which may load the
network up to 100%.
 Multicast messages in a network loop (on the TCP/IP
level) may give similar results.
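
The flooding behaviour described above can be illustrated with a small simulation. This is a minimal sketch and not a model of any real switch: the function `flood` and both topologies are hypothetical, and each switch is assumed to forward a broadcast frame to every neighbour except the one it received the frame from.

```python
def flood(links, start, max_hops):
    """Count the broadcast frame copies forwarded at each hop.

    links: dict mapping a switch name to the list of its neighbours.
    A switch forwards a broadcast to every neighbour except the one
    the frame came from. Returns the number of copies per hop.
    """
    # Each in-flight frame is a pair (current_switch, previous_switch).
    in_flight = [(start, None)]
    history = []
    for _ in range(max_hops):
        next_frames = []
        for switch, came_from in in_flight:
            for neighbour in links[switch]:
                if neighbour != came_from:
                    next_frames.append((neighbour, switch))
        history.append(len(next_frames))
        if not next_frames:
            break  # the broadcast has died out
        in_flight = next_frames
    return history


# Hypothetical topologies: three switches in a line, and the same
# three switches with the ring closed (A-B-C-A).
line = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
ring = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

print("line:", flood(line, "A", 10))
print("ring:", flood(ring, "A", 10))
```

In the line topology the broadcast dies out once it reaches the last switch. In the closed ring, one copy per direction keeps circulating for as long as the simulation runs, which is the mechanism behind a broadcast storm.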

Protective Measures in 800xA
Network Storm Handling of AC 800M

 In SV 5.0 SP2 Rev C there was a requirement that the AC
800M controller should handle denial-of-service attacks.
 This is validated by a standard "Achilles level 1 test".
 This guarantees handling of attacks, and other high-load
situations, up to at least 10% network utilization. AC 800M
has passed this test.
 AC 800M uses the following preventive methods to avoid CPU
overload caused by network overload:
 Network Traffic Filtering.
 RNRP Loop Detection.
 RNRP Loop Protection.
 Network Storm Protection.
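
To relate a utilization figure such as 10% to a packet rate, the arithmetic can be sketched as follows. The link speeds and frame sizes below are assumptions chosen for illustration and are not taken from the test specification. The only fixed values are the standard Ethernet per-frame overhead of 8 bytes (preamble and start frame delimiter) and 12 byte times (inter-frame gap).

```python
PREAMBLE_BYTES = 8    # preamble plus start frame delimiter
INTERFRAME_GAP = 12   # minimum gap between frames, in byte times


def wire_bits(frame_bytes):
    """Bits occupied on the wire by one frame, including overhead."""
    return (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP) * 8


def packets_per_second(target_utilization, link_bps, frame_bytes):
    """Packet rate that produces the given utilization (0.0 to 1.0)."""
    return target_utilization * link_bps / wire_bits(frame_bytes)


def utilization(pps, link_bps, frame_bytes):
    """Link utilization (0.0 to 1.0) produced by a given packet rate."""
    return pps * wire_bits(frame_bytes) / link_bps


# Assumed link speeds (10 and 100 Mbit/s) and frame sizes (minimum
# and maximum untagged Ethernet frames).
for link_bps in (10_000_000, 100_000_000):
    for frame_bytes in (64, 1518):
        rate = packets_per_second(0.10, link_bps, frame_bytes)
        print(f"{link_bps // 1_000_000} Mbit/s, {frame_bytes}-byte frames: "
              f"{rate:.0f} packets/s at 10% utilization")
```

The same utilization corresponds to very different packet rates depending on frame size, so a utilization limit and a packets-per-second limit are not directly interchangeable.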

Protective Measures in 800xA
Broadcast Storms might stop AC 800M

 The preventive methods improve the overload resistance
significantly, but there is no guarantee that the controller
can resist a 100% overload.
 An extremely high amount of incoming network traffic might
load the network thread so heavily that threads with lower
priority do not get any execution time.
 Such a fault situation is detected by the AC 800M thread
supervision functions in SV 5.0 SP2 and later.
 If this situation occurs, the thread supervision stops the
controller and removes the application program (cold
reset).
 This is the "safe state". If the application were kept,
there would be a risk that it had actually become corrupted.

Protective Measures in 800xA
Network Traffic Filtering in AC 800M

 Since SV 5.0 SP1 the AC 800M has a filter that discards
messages for protocols that AC 800M does not support.
 This filter was enhanced in SV 5.0 SP2 Rev C for
safe handling of DoS attacks with 10% utilization.
 If the filter detects that many packets are received, while
lower priority threads do not get any chance to execute, it
starts discarding packets.
 The filter reduces the CPU load, but it does not guarantee
that the controller can survive a broadcast storm.

Protective Measures in 800xA
RNRP Loop Detection

 In a network loop, multicast (and broadcast) messages are
sent to all nodes over and over again.
 RNRP uses multicast messages. RNRP messages contain
sequence numbers that allow RNRP to detect if the same
message is received more than once.
 If RNRP messages are received several times, RNRP
assumes that there is a network loop.
 If this happens, RNRP issues the error message “Suspected
Network Loop detected”.
 In a controller the error message is written in the
controller log.
 In a PC the error message can be seen with the RNRP
monitor and in the RNRP log file.
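
The idea of detecting a loop from repeated sequence numbers can be sketched as below. This is an illustration only: the actual RNRP message layout and detection criteria are not described here, so the class `LoopDetector`, the window size and the duplicate threshold are all assumptions.

```python
from collections import defaultdict, deque


class LoopDetector:
    """Flags a suspected loop when already-seen sequence numbers
    from the same sender keep arriving again.

    The window size and the alarm threshold are assumed values.
    """

    def __init__(self, window=64, duplicates_to_alarm=3):
        self.duplicates_to_alarm = duplicates_to_alarm
        # Per sender: the most recently seen sequence numbers.
        self.recent = defaultdict(lambda: deque(maxlen=window))
        # Per sender: how many duplicates have been counted so far.
        self.duplicate_count = defaultdict(int)

    def receive(self, sender, seq):
        """Return True if this message raises a loop suspicion."""
        if seq in self.recent[sender]:
            self.duplicate_count[sender] += 1
            return self.duplicate_count[sender] >= self.duplicates_to_alarm
        self.recent[sender].append(seq)
        return False


detector = LoopDetector()
# Hypothetical traffic: three fresh messages, then the last one
# arrives three more times, as it might in a loop.
for seq in (1, 2, 3, 3, 3, 3):
    if detector.receive("node-11", seq):
        print("Suspected Network Loop detected")
```

Requiring several duplicates before raising the alarm is a design choice in this sketch, intended to tolerate an occasional duplicated packet that is not caused by a loop.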

Protective Measures in 800xA
RNRP Loop Protection

 In addition to loop detection, RNRP also has a function to
respond to and deal with a network loop.
 With loop protection enabled, RNRP can disable the
network port on which the loop was detected.
 Loop protection disables the network port subjected to the
network loop only if there is a redundant network available.
 If only a single network is used, loop protection will not
take any action.
 In AC 800M, the HW Unit “Ethernet” creates a high priority
alarm with the text “Port Disabled, Network Loop detected”.
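
The decision rule described on this slide can be written down compactly. The function name and the returned strings are hypothetical and serve only to make the rule explicit. This is not the RNRP implementation.

```python
def loop_protection_action(loop_detected, redundant_network_available,
                           protection_enabled=True):
    """Decide what happens to a port on which a loop was detected."""
    if not loop_detected:
        return "no action"
    if protection_enabled and redundant_network_available:
        # Communication can continue on the other network path.
        return "disable port"
    # Single network or protection switched off: the loop is only reported.
    return "report only"


print(loop_protection_action(True, redundant_network_available=True))
print(loop_protection_action(True, redundant_network_available=False))
```

The port is only disabled when a second network path exists, so the protection never cuts a node off from the network completely.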

Protective Measures in 800xA
Network Storm Protection

 Beginning in 800xA SV 5.1 Rev A, a storm protection
function in AC 800M replaces the RNRP loop protection.
 The storm protection is capable of protecting the controller
from all types of excessive network traffic, not only network
storms caused by loops.
 Thresholds at which storm protection is activated:
 PM86x: if more than 800 packets per second are
received.
 PM891: if more than 1,600 packets per second are
received.
 The overloaded port is disabled for 10 seconds. If the
network remains overloaded, it is disabled for a certain
time period (10 seconds in 5.1 Rev A, 2 minutes from 5.1
Rev B onward). This check is repeated cyclically.
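
The threshold-and-timer behaviour can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the controller firmware: the class `StormProtection` is hypothetical, the load is evaluated once per second, and a single disable time is used for every cycle. The threshold of 800 packets per second and the 10-second disable time are the PM86x and 5.1 Rev A values from the slide.

```python
class StormProtection:
    """Disables a port when the received packet rate exceeds a threshold."""

    def __init__(self, threshold_pps, disable_seconds):
        self.threshold_pps = threshold_pps
        self.disable_seconds = disable_seconds
        self.disabled_until = None  # None means the port is enabled

    def on_second(self, now, packets_last_second):
        """Evaluate one second of traffic.

        now: current time in seconds.
        packets_last_second: packets offered to the port in that second.
        Returns True if the port is enabled after the evaluation.
        """
        if self.disabled_until is not None:
            if now < self.disabled_until:
                return False  # still inside the disable period
            self.disabled_until = None  # period over: re-enable, check again
        if packets_last_second > self.threshold_pps:
            self.disabled_until = now + self.disable_seconds
            return False
        return True


port = StormProtection(threshold_pps=800, disable_seconds=10)
# Hypothetical load profile: (time in seconds, packets per second).
for now, load in [(0, 500), (1, 900), (5, 900), (11, 900), (21, 100)]:
    state = "enabled" if port.on_second(now, load) else "disabled"
    print(f"t={now:>2}s load={load:>3} packets/s -> port {state}")
```

Because the port is re-enabled and re-checked at the end of each disable period, a persistent storm keeps the port disabled cycle after cycle, while a transient one clears without manual action.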

Protective Measures in 800xA
Availability of Network Overload Handling

800xA System Version     AC 800M Version  RNRP Version     Remarks
4.0                      4.0.0/0          2.14             Loop Detection introduced.
5.0 SP2                  5.0.2/0          2.24             Loop Protection introduced.
5.0 SP2 Rev D and later  5.0.2/4          2.28             Loop Protection enabled by default in AC 800M and on
                                                           Windows XP / Windows Server 2003.
5.1                      5.1.0/0          3.12             Loop Protection enabled by default in AC 800M.
                                                           Loop Protection not supported for Windows 7 /
                                                           Windows Server 2008.
5.1 Rev A                5.1.0/1          3.14             Loop Protection works for Windows 7 / Windows Server 2008.
                                                           Storm Protection replaces the Loop Protection in AC 800M.
                                                           Port disable time is 10 seconds.
5.1 Rev B and later      5.1.0/2          3.19 and higher  Storm Protection port disable time changed to 2 minutes.

 In AC 800M HI, the RNRP Loop Protection is not enabled
by default in SV 5.1.
 If loop protection is to be activated where it is not enabled
by default, a special RNRP parameter configuration is
required. See details in Technical Description 3BSE060651.

Protective Measures in 800xA
Availability of Network Overload Handling

Storm protection is also available for CI Modules, with the
following detection thresholds:

Module type  Protocol     Storm limit (packets/s)
CI857        INSUM-2      600
CI860        FF HSE       800
CI867        ModbusTCP    1000
CI868        IEC 61850    1200
CI871        PROFINET IO  1200
CI873        EtherNet/IP  1200

Live Demo
Setup with/without Redundant Control Net

[Figure: demo network setup. Labels: 172.16.80.11, 172.17.80.11, 172.16.80.151, 172.17.80.151; "RSTP Enabled / Disabled"; "RSTP Disabled"; "Network Loop Connection".]

Troubleshooting
Example for AC 800M Log and RNRP Event Log

 The log file "RnrpEvent.log" is located in the folder
"C:\ProgramData" on the RNRP nodes.

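When many nodes have to be checked, the RNRP event log can be searched for loop messages with a short script. This is a sketch under assumptions: the log is treated as plain text with one event per line, the encoding is a guess, and the function `find_loop_events` is hypothetical. Only the file name, the folder and the message text "Suspected Network Loop" come from the slides.

```python
from pathlib import Path

LOOP_MARKER = "Suspected Network Loop"


def find_loop_events(log_path, marker=LOOP_MARKER):
    """Return (line_number, text) for every log line containing the marker."""
    hits = []
    # errors="replace" keeps the scan going if the encoding guess is wrong.
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if marker.lower() in line.lower():
                hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    log_file = Path(r"C:\ProgramData") / "RnrpEvent.log"
    if log_file.exists():
        for number, text in find_loop_events(log_file):
            print(f"{number}: {text}")
    else:
        print(f"{log_file} not found")
```

The search is case-insensitive, so small differences in how the message is capitalized in the log do not cause misses.
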
Preventive Actions
By Design

 Conventional troubleshooting is barely possible. When a
broadcast storm occurs, it is in most cases too late!
 Preventive actions by design:
 Avoid ring structures, especially for Control Networks.
In this case, RSTP is not required.
 Strictly separate the Primary from the Secondary RNRP
network.
 Separate the Control Network from the Client/Server
Network.
 Do not mix different redundancy ring protocols, such as
RSTP and Hiper Ring.
 Managed switches may have different loop protection
mechanisms. If possible, use switches from the same
vendor.
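
A planned cabling list can be checked for unintended rings before installation. The following is a minimal sketch using a union-find structure. The function `find_loop_links` and the device names are hypothetical, and the check only covers the documented cabling, not wiring mistakes made on site.

```python
def find_loop_links(links):
    """Return the links that close a loop in an undirected cabling list.

    links: iterable of (device_a, device_b) pairs, one per cable.
    """
    parent = {}

    def find(node):
        # Find the representative of the group that node belongs to.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    closing = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected, so this cable closes a loop.
            closing.append((a, b))
        else:
            parent[root_a] = root_b
    return closing


# Hypothetical cabling plan: three switches wired as a ring plus one PC.
plan = [("SW1", "SW2"), ("SW2", "SW3"), ("SW3", "SW1"), ("SW3", "PC1")]
print(find_loop_links(plan))
```

Every link reported by the check is either an intended redundancy ring, which then needs a ring protocol such as RSTP, or a cabling error to be corrected.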
Preventive Actions
Cabling and Switches

 Attach clearly readable and durable tags to all cables.
 Use switches with dedicated uplink ports. This avoids using ports
with “Auto MDI/MDI-X” detection, which may induce transient
distortions.
 Don't use any cross-over cables.
 Avoid resetting or powering down switches when the plant is in
operation.
 If a switch, which is part of a redundancy loop, needs to be replaced,
open the ring first. After replacement, verify the redundancy
configuration. Avoid doing this when the plant is in production state.
 Use the latest available firmware version for the switches. Avoid
mixing different versions.
 If the Network Adapter has the multi core feature “Receive Side
Scaling” enabled, this must be disabled. See 3BSE053117.
Preventive Actions
Software

 If the PC network adapters in use offer the multi-core
feature “Receive Side Scaling”, it must be disabled. This
is done in the network adapter's Properties – Configure –
Advanced tab.
 If this is not done, duplicate network messages may be
received, and RNRP may misinterpret this as a “Suspected
Network Loop”.
 See Product Bulletin 3BSE053117, and Network
Configuration Manual 3BSE034463-510, Section 2.

Document References
Web Tech Talk

 WT192 – Troubleshooting 800xA Networks (Nov. 2009).

Document References
Technical Documentation

 3BSE060651 – Technical Description System 800xA,
Network Loops and Storm Protection.
 3BSE034463-510 – System 800xA Network Configuration,
chapter "RNRP Network Loop Detection and Protection" in
section 2.
 3BSE066739 – System 800xA - RNRP Network
Configuration Requirements ().
 3BSE067587 – AC 800M Controller Certificate of
Compliance Achilles Level 1 Certification.

Document References
Problem Reports

 3BSE047421D0068 – AC 800M Controller Firmware 5.0.2 -
A Network Loop may cause AC 800M to shut down.
 3BSE073384 – System 800xA Base SV5.1 Rev A (64 bit),
FP1 - RNRP reports false network down.
 3BSE053117 – Product Bulletin, 800xA Operations SV 5.0
SP1, RNRP may report "Suspected Network Loop" after
installation of Windows Server 2003 SP2.
 3BSE066759 – ALERT - System 800xA SV 5.0, 5.1 RNRP
node status spoofing vulnerability.
 2PAA111288-510 – System 800xA Release Notes, Fixed
Problems System Version 5.1 Rev D, "800xACON-AD-5101-012".
 ABB PowerHelp case ABB20131113-0006 – All AC 800M
controllers (7x redundant PM864) crashed at the same
time.
Document References
Wikipedia

 "Broadcast radiation" –
http://en.wikipedia.org/wiki/Broadcast_storm
 "Switching loop" –
http://en.wikipedia.org/wiki/Switching_loop
 "Routing loop problem" –
http://en.wikipedia.org/wiki/Routing_loop_problem

Questions

Questions?
