
Preventive actions in protection relays network using SNMP
Breno Jácomo de Freitas
Infrastructure & Cities, Smart Grid, Energy Automation
Siemens
Jundiaí, SP, Brazil
freitas.breno@siemens.com

Abstract: Energy automation systems have been expanding, and there are
more Ethernet-capable devices in the network. Protection relays,
switches, GPS receivers and computers have a great deal of critical
diagnostic information available via the network interface. The
intention of this paper is to present how this already available
information can be used in human-machine interfaces and SCADA systems
to obtain a more detailed diagnosis of the network and its devices.
Information such as IED (Intelligent Electronic Device) health status,
CPU load, fiber-optic light intensity and GOOSE (Generic Object Oriented
Substation Event) message errors can be used for early detection of
equipment failure and network instability. Early analysis can prevent
problems with equipment interlocking and protection functions and
improves power system availability. The Simple Network Management
Protocol (SNMP) is a widely implemented management protocol, which
makes it an ideal manufacturer-independent solution for this purpose.
To test this concept, a Siemens SCADA system was used as the manager of
the SNMP-enabled devices in an already commissioned power plant with
Siemens Siprotec 4 protection relays. Further ideas were also developed.

Keywords: Electrical System Availability; Energy Automation; HMI;
IEC 61850; Prevention; SCADA; SNMP; Substation

The development of this paper had the support of Siemens Ltda (Brazil),
BU of Jundiaí, São Paulo, which provided the laboratory and devices of
IC SG EA (Infrastructure & Cities, Smart Grid, Energy Automation).
B. J. Freitas works in support and after-sales at Siemens Ltda, Jundiaí,
São Paulo, Brazil (e-mail: freitas.breno@siemens.com).
978-1-4577-1829-8/12/$26.00 ©2012 IEEE

I. INTRODUCTION
The study of the application of SNMP in electrical systems presented in
this paper started when a project was developed in which Siprotec 4
protection relays communicate on the same Ethernet network using the
IEC 61850 standard. The project has more than 250 protection relays in
7 different substations. There is at least one switch in each of them.
These switches form a gigabit fiber-optic ring network with RSTP (Rapid
Spanning Tree Protocol). There is a local network ring with relays and
switches in each substation, as shown in Figure 1. The IEDs communicate
with the SCADA systems via the IEC 61850 standard. When a fiber directly
connected to an IED is damaged, the failure generates a message to the
SCADA system, and the substation operators can verify the problem and
take corrective action. When a fiber between the switches is damaged,
there is no supervision of this failure, since the switches are not
IEC 61850 capable.

After some research it was noticed that SNMP (Simple Network Management
Protocol) is available in most of the network devices used in modern
substations (manageable switches, protection relays with network
interfaces, GPS receivers and computers). The availability of SNMP
enables diagnostic information, such as failures and performance, to be
monitored in the SCADA. By monitoring these data it is possible to take
preventive actions on problems that, in a conventional supervisory
system, would be visible only when the failure had already caused
system degradation.

Figure 1 - Architecture

Nowadays, most control centers have an HMI and a SCADA system. The idea
is to integrate the engineering tools used for diagnosis into the
operator HMI already used to control the power system. With these
functions integrated in the local or remote supervisory system, it is
possible to take faster preventive actions before the failure. When
actions are taken before the failure, the mean time to repair (MTTR) is
reduced or eliminated [1].

RSTP is already used in the network; it makes the network highly
available and can decrease the MTTR. Using SNMP as a tool to diagnose
intrinsic failures in the communication, it is possible to improve the
availability of this power system. Even with hardware redundancy
intended to increase the reliability of the network, sometimes the
network has a problem and is working in a critical mode. To avoid this
dangerous condition, the HMI can show intrinsic device problems and
network instability. Sometimes the whole system is working, but a fiber
is damaged and the maintenance team does not know it, or the GPS antenna
is receiving just 1 or 2 satellites and should be installed in another
place. The aim of this application is to show some points that help to
manage the network in a way that is not normally observed in an energy
system, improving system availability by decreasing the MTTR.
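
The effect of a shorter MTTR on availability can be illustrated
numerically. The sketch below assumes the inherent-availability relation
that the paper gives later as Equation (1), availability = MTBF / (MTBF +
MTTR), where MTBF is the mean time between failures. The function name
and the hour figures are hypothetical values chosen only for
illustration; they are not measurements from the project.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Inherent availability = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, for illustration only.
MTBF_HOURS = 8760.0       # assumed: one failure per year on average
MTTR_CORRECTIVE = 8.0     # assumed: repair starts after the failure is noticed
MTTR_PREVENTIVE = 2.0     # assumed: fault located early through diagnostics

a_corrective = inherent_availability(MTBF_HOURS, MTTR_CORRECTIVE)
a_preventive = inherent_availability(MTBF_HOURS, MTTR_PREVENTIVE)

print(f"corrective only : {a_corrective:.6f}")
print(f"with prevention : {a_preventive:.6f}")
```

With the MTBF held constant, any reduction of the MTTR raises the
result, which is the argument made in the text.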
SNMP monitoring is done by direct communication between the devices and
the manager, through a frame requested by the network manager. Each
device has intrinsic information that can be reported on request of an
SNMP manager. Normally these points are not observed in the substation
network.

The application shown in this paper suggests monitoring signals such as:

• how many frames an IED has lost (FrameLoss);
• how many frames with errors an IED has received (CRC-Check error, a
  counter of data packets with CRC errors);
• a count of broadcast Ethernet packets that were not evaluated because
  of a processor overload (FNF queue overflow);
• a counter of telegrams that were too long and were discarded (more
  than 1520 bytes);
• a counter of received frames whose number of bits is not divisible by
  8 (RxAlForm); if the value is not zero, there may be problems on the
  transmission link [2];
• processor load in percent (when it reaches 100%, the device will lose
  information);
• device temperature and other parameters that will not be explored in
  this paper.

II. SIMPLE NETWORK MANAGEMENT PROTOCOL (SNMP) AND AVAILABILITY CONCEPT
In the late 1980s the Simple Network Management Protocol (SNMP) started
to be developed by the Internet Engineering Task Force (IETF). The goal
was to simplify the monitoring of devices in the network by a manager.
Today, research in network management aims to obtain the maximum return
from the network [3]. SNMP works with manageable objects organized in
management information bases (MIBs).

A. Definition
SNMP was designed for flexibility and ease of implementation, with
future devices in mind. It is a management protocol defined in the
application layer. It is used to obtain information from SNMP agents,
which are devices in a network based on the TCP/IP protocol stack. Data
are obtained by a request from the manager to one or more agents, using
UDP as the transport protocol to send and receive messages on the
network. Network management allows the state of devices to be followed
in real time, in order to manage several kinds of systems. The command
set is small and built on a get/set mechanism: changing the value of an
object, reading the value of an object, and variations of these. This
makes the implementation easy, simple, stable and flexible. SNMP is used
in the communication between managers and agents, carrying management
operations, information and traps, and the collected information is kept
in a central database [4].

As shown in Figure 2, the operation is based on two kinds of device: the
manager and the agent. Each manageable device has some variables that
the manager can access when it needs to. Each of these devices has to
have a management information base (MIB).

Figure 2 - Manager and agents

B. Manageable objects
A manageable object is an abstract view of a real resource of the
system. So all resources of the network that have to be manageable must
be modeled, and the data structure is described in a MIB file.
Manageable objects can be read-only or read/write (changeable). Each
read represents the real state of the object. Each change acts on the
managed device.

C. The MIB file
The MIB is a set of manageable objects that tries to include all the
information necessary for the network to be managed [5].

Figure 3 shows a screenshot of a MIB file in the iReasoning MIB Browser
software. In this example it is possible to see some intrinsic IED
information. When an object is selected, it is possible to see
information such as the object's name, SNMP address, status,
description and syntax.

The object rstpRoleChannel1 is selected in Figure 3. It shows the RSTP
role of port 1 (which can be root, designated or alternate); RSTP will
be the focus of this paper. The figure also shows other information
about the Siprotec's Ethernet port that can be monitored (nRxTelInt,
nRxTelTask…).
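
The ideas of manageable objects with read-only or read/write access,
collected in a MIB, can be sketched in a few lines of Python. This is
only an illustration under assumptions: the class names, the OID strings
(placed under a made-up enterprise number) and the stored values are
hypothetical and are not taken from the Siprotec MIB file. Only the
object name rstpRoleChannel1 appears in the text; the name
bridgePriority and its write permission are assumed for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"


@dataclass
class ManagedObject:
    name: str
    oid: str          # hypothetical address, not from a real MIB file
    access: Access
    value: object


class Mib:
    """A toy information base: managed objects indexed by OID."""

    def __init__(self, objects):
        self._by_oid = {obj.oid: obj for obj in objects}

    def get(self, oid):
        # A read returns the current state of the object.
        return self._by_oid[oid].value

    def set(self, oid, value):
        # A write is accepted only for read/write objects.
        obj = self._by_oid[oid]
        if obj.access is not Access.READ_WRITE:
            raise PermissionError(f"{obj.name} is read-only")
        obj.value = value


ROLE_OID = "1.3.6.1.4.1.99999.1.1"        # made-up OID
PRIORITY_OID = "1.3.6.1.4.1.99999.1.2"    # made-up OID

mib = Mib([
    ManagedObject("rstpRoleChannel1", ROLE_OID, Access.READ_ONLY, "root"),
    ManagedObject("bridgePriority", PRIORITY_OID, Access.READ_WRITE, 32768),
])

print(mib.get(ROLE_OID))
mib.set(PRIORITY_OID, 4096)
print(mib.get(PRIORITY_OID))
```

A real agent exposes these objects over SNMP rather than through direct
method calls; the sketch only shows the data model and the access rule.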
Figure 3 - Tree structure (MIB)

D. The Agent
The agent is a process executed in the manageable device, responsible
for maintaining the information base consulted by the manager [6]. The
main functions of the agent are to answer the requests sent by the
manager and to send information automatically when previously programmed
to do so. In the studied case the agents are the protection relays
(IEDs), the manageable switches and the NTP synchronization server.

E. The Manager
The manager is a process executed in a server station that allows
management information to be obtained from and sent to one or more
agents. The manager is responsible for monitoring, creating reports and
making decisions when an event occurs. The agent is responsible for
sending and changing information and also for notifying the manager of
specific events. In the studied case the manager is the HMI.

F. Availability
Availability is the probability that a system is in its functional
condition, and so capable of being used in a stated environment under
given conditions, at a given instant of time or during a given time
interval. When only corrective maintenance is considered, the inherent
availability given by Equation (1) is employed [1], where MTBF is the
mean time between failures and MTTR is the mean time to repair.

$$\text{Availability} = \frac{MTBF}{MTBF + MTTR} \qquad (1)$$

When the MTTR is decreased, the availability is increased. It is not
possible to change the MTBF but, with preventive actions, we can
decrease the MTTR and get a better availability of the power system.

III. TEST
The test was done at the Siemens test field in Jundiaí, Brazil, on a
real project that was being developed. The project had nine Siprotec
protection relays with EN100 Ethernet ports (IEC 61850) and two
redundant switches in a ring architecture using RSTP. There was an NTP
server to synchronize the IEDs in the network. All these devices are
SNMP agents and can be managed by the Siemens supervisory software
SICAM230. This software is used as a SCADA system that acts as the SNMP
network manager. Each agent has some variables that were considered
important points to be managed by the supervisory system.

Some of the information that SNMP agents can provide focuses on network
and RSTP problems. At each IED it is possible to see the RSTP status of
each port, the device's bridge priority (which defines the RSTP root),
the counter of GOOSE messages received, the count of GOOSE telegrams
that passed the multicast filter but are not addressed to the device
(parametrization fault), and the status and role of each port in the
RSTP network.

At the managed switches, the main information to monitor is
LastTopChange, which shows when the last topology change in the network
happened. A change means that some relay was rebooted, some optical
fiber was broken, or some other device entered the network (directly at
the switch or in the ring). The switch also shows its up time, the CPU
load [%] (to keep it from reaching 100% and losing frames), the
processor temperature and the state of the network ports (on, off or
disconnected).

The NTP server shows whether it is synchronized, how many satellites are
connected, the UTC time (GMT 00:00), the antenna status (open or
connected) and the local time.

IED (protection relay):
• Link Status (up/down)
• RSTP Status (forwarding, discarding...)
• RSTP Role (root, designated, alternate...)
• Bridge priority
• GOOSE matched
• GOOSE mismatched

Manageable switch:
• Time since last Topology Change
• CPU Usage [%]
• Up time
• Internal temperature [°C]
• Available RAM [MBytes]
• Port status

NTP server:
• Antenna (Sync)
• Local Time
• Number of satellites
• UTC time
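
One way to turn the points listed above into HMI alarms is to evaluate a
polled snapshot of each device against simple rules. The sketch below
assumes the values have already been read by an SNMP manager. The
dictionary keys, the thresholds (90% CPU load, at least 4 satellites)
and the sample values are hypothetical and would have to be chosen for
the real installation.

```python
CPU_LIMIT_PERCENT = 90    # hypothetical alarm threshold
MIN_SATELLITES = 4        # hypothetical minimum number of satellites


def evaluate(device_type, snapshot):
    """Return a list of alarm texts for one polled device snapshot."""
    alarms = []
    if device_type == "ied":
        if snapshot["link_status"] != "up":
            alarms.append("link down")
        if snapshot["goose_mismatched"] > 0:
            alarms.append("GOOSE telegrams not addressed to this device")
    elif device_type == "switch":
        if snapshot["cpu_usage_percent"] >= CPU_LIMIT_PERCENT:
            alarms.append("CPU load high")
    elif device_type == "ntp":
        if not snapshot["antenna_ok"]:
            alarms.append("antenna fault")
        if snapshot["satellites"] < MIN_SATELLITES:
            alarms.append("too few satellites")
    else:
        raise ValueError(f"unknown device type: {device_type}")
    return alarms


# Sample snapshots with made-up values.
print(evaluate("ied", {"link_status": "up", "goose_mismatched": 3}))
print(evaluate("switch", {"cpu_usage_percent": 13}))
print(evaluate("ntp", {"antenna_ok": True, "satellites": 2}))
```

Keeping the rules in one function per snapshot makes it easy to add
further points, such as internal temperature or available RAM, without
changing the polling code.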

The main point observed in this paper is a failure simulated as a broken
fiber optic between two IEDs. It was simulated by removing one pair of
fibers and analyzing what happened. It is possible to observe at the
same time the behavior of the other intrinsic points of the IEDs and of
the other SNMP agents.

All the tags listed are shown in Figure 4. For a better view, forwarding
ports are green and discarding ports are orange. A port in the root or
designated role is yellow and an alternate port is orange. A blue square
with the number 1 means that the channel is up, and a red square with
the number 0 means that it is down. An important point is LastTopChange.
It shows the time since the topology last changed. It is useful for
knowing how long the network has been stable. When the time interval is
short, some failure has happened in the network, such as a communication
problem in some device, or some IED or switch may have been turned off.

Another important parameter to observe is the root port. The root port
is always on the side from which a telegram can reach the root of the
network, which normally is the switch. In Figure 4 below, all the
communication links were OK.

Figure 4 - Stable architecture

When a failure happens, as shown in Figure 5, where a fiber optic is
broken, the network rearranges itself, moving the discarding port to the
actually broken ring connection. All the IEDs have their root port on
the side nearer the network root. Each change of root port is one
topology change, so when a change happens the value of LastTopChange
goes to 0 minutes.

Figure 5 - Broken ring

When the fiber optic is restored or replaced, the IEDs and the switch
detect it automatically, and the virtual break (discarding port) returns
to its initial position, where the cost is lowest according to the RSTP
rules. Every device returns to its initial state and the time since the
last topology change is zero.

Figure 6 - Ring stable again

When some device sends a lot of telegrams, the traffic becomes too
intense and the switch starts to work hard; when its CPU load reaches
100%, datagrams are lost on the network. This can cause the network to
malfunction without the operator in the control center seeing the
problem, or a relay can miss a GOOSE message and not operate correctly,
causing problems in the power system.

An avalanche test was done by polling every device once per second.
Figure 7 shows that the processor load reaches 79%, whereas it is
normally about 13% when the polling interval is 10 seconds. This is an
interesting tag to observe, because an overload can cause problems in
the electrical system or the loss of important information at the
human-machine interface.
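
Because the time since the last topology change restarts from zero at
every change, as described for LastTopChange in this section, a manager
can detect a change by comparing two successive polls of that value. The
following is a minimal sketch under assumptions: the values are taken to
be in minutes, and the function names, the stability window of 10
minutes and the sample readings are hypothetical.

```python
def topology_changed(previous_minutes, current_minutes):
    """True if the counter restarted between two polls.

    The time since the last change normally grows from poll to poll,
    so a smaller value means a new topology change has happened.
    """
    return current_minutes < previous_minutes


def is_stable(current_minutes, window_minutes=10):
    """True if no topology change happened within the window."""
    return current_minutes >= window_minutes


# Made-up readings taken once per minute: the ring breaks after the
# third poll, so the fourth reading falls back to zero.
readings = [120, 121, 122, 0, 1, 2]

for prev, curr in zip(readings, readings[1:]):
    if topology_changed(prev, curr):
        print(f"topology change detected ({prev} -> {curr} min)")

print("stable now:", is_stable(readings[-1]))
```

Comparing successive polls catches the event itself, while the window
check indicates how long the ring has been stable since then.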
Figure 7 - CPU load [%]

For the NTP server (GPS), it is possible to monitor whether the antenna
is connected and how many satellites it can reach. Figure 8 shows some
GPS tags and a test that was done. The picture on the left shows the GPS
in normal status, with the antenna OK, synchronized with 14 satellites,
the UTC time and the local time (Brazil, GMT -03:00). On the right, the
GPS is without an antenna and cannot find any satellites.

Figure 8 - NTP server (GPS) status

IV. CONCLUSION
There are some variables to be observed in a protection relay network
that are not available in most SCADA systems. The purpose of this paper
is to show this information to the operation and maintenance teams so
that they can take action. By observing all this intrinsic information
it is possible to take preventive actions that reduce the mean time to
repair and improve the availability of the power system.

By managing all the important intrinsic information of the network
devices, the HMI can display it and generate alarms to alert the
maintenance team. These are intrinsic errors that only the IED or
network devices can count, such as erroneous messages exchanged, lost
datagrams, discarded frames and mismatched GOOSE messages, and all this
information can help the maintenance team act and prevent future
problems in the energy system.

In an energy system's HMI these variables are usually not acquired and
are invisible to the operator and maintenance team. If this information
is brought into the most critical HMIs, problems such as the loss of
important information like GOOSE messages and intrinsic defects of the
network devices could be prevented. With this system implemented, most
of these problems should not happen, which increases the availability of
the electrical system.

REFERENCES
[1] Availability of Maintained Systems: A State-of-the-Art Survey,
    C. H. Lie, C. L. Hwang & F. A. Tillman.
[2] Ethernet & IEC 61850 Concepts, Implementation, Commissioning
    Manual, Siemens E50417-F1176-C361-A4.
[3] Communication Network Management in Power Automation Systems
    under the IEC 61850 protocol through the use of SNMP, Paes, Fred.
[4] Protocolo de Gerenciamento SNMP, Beethoven Zanella Dias and Nilton
    Alves Jr.
[5] http://penta.ufrgs.br/gr952/trab1/2osi.html read on 08/15/2011 at
    15:58.
[6] http://www.dpstele.com/white-paperssnmp-tutorialsnmp_glossary.php
    read on 23/08/2011 at 12:19.
