R. Froschauer, F. Auinger, G. Grabmair, University of Applied Sciences Wels, {r.froschauer, f.auinger, gu.grabmair}@fh-wels.at
T. Strasser, PROFACTOR Produktionsforschungs GmbH, thomas.strasser@profactor.at
Abstract
Modern industrial automation systems are expected to execute applications distributed across heterogeneous networks. In addition, the market demands operation without downtime and the ability to change automation and control functions in such systems. Consequently, appropriate concepts for recovering and reconfiguring devices during full operation are needed. The open standard IEC 61499 provides a scalable architecture to model applications for such distributed control systems. It supports interoperability, configurability and portability of control applications and therefore provides the basis for online configuration and recovery of heterogeneous systems. The main goal of this paper is to present a concept for autonomous recovery of applications within the context of distributed systems. The proposed concept facilitates the exchange of devices without any need for extra configuration. It supports automatic recognition of new components in a distributed automation and control system and automated upload and download of control applications, user data and configuration data. The introduced approach is tested using an Ethernet-interconnected distributed demonstration system consisting of standard personal computers running the IEC 61499 Function Block Development Kit.
1. Introduction
Today many automation and control systems exhibit an increasing and almost overwhelming complexity. Only highly skilled engineers can cope with these requirements, and therefore the engineering costs for replacing faulty components are increasing as well. Currently many vendors in the field of automation and control focus their research activities on new mechanisms to assemble and maintain industrial plants without any special programming skills. In the context of distributed systems the control logic of industrial plants is distributed across several small devices, which are interconnected through a flat communication network. The FIT-IT project Crons [1] introduces a middleware approach which enables application-centred development on top of mechatronic devices called Crons. These devices combine the mechanical or electrical functionality with the corresponding control logic. To enable this approach, each mechatronic device (e.g. drives, pneumatic cylinders, linear axes) has to have its own microcontroller executing the Crons middleware. Furthermore, each Cron has a standardized communication interface and a hardware abstraction layer, which allows interaction with the physical part without any special programming skills. Within the next ten years distributed systems are expected to be built of Crons or a comparable technology and will therefore support interoperability, configurability and portability. Furthermore, zero-downtime operation within a real-time execution environment will become a common technology. The first step towards zero-downtime operation is to find an approach for replacing faulty devices without stopping and restarting the whole system [7, 8]. This paper presents an approach for a basic Plug & Work functionality on the basis of IEC 61499.
The IEC 61499 [4, 5] is intended to be the successor of IEC 61131-3 [6] with a special focus on distributed systems. By supporting a standardized communication protocol, which enables remote-configuration of devices, the IEC 61499 offers the possibility for building middleware systems and applications as described above. In order to enable the automatic recovery of faulty devices several guidelines and requirements have to be identified. Especially in heterogeneous networks a set of standardized interfaces is necessary. The
Proceedings of the IEEE Workshop on Distributed Intelligent Systems: Collective Intelligence and Its Applications (DIS06) 0-7695-2589-X/06 $20.00 2006
IEEE
following list presents a minimum set of interfaces [1, 9, 10, 11]:
Communication interface: This interface defines methods which ensure that all devices and even all applications are able to communicate with each other, especially within heterogeneous networks.
Application execution interface: This interface defines a kind of hardware abstraction layer, which enables a device-independent representation of applications. Especially on heterogeneous devices every application should be executable without recompiling (i.e. a middleware approach).
Application transfer interface: This interface defines methods of transferring applications between different devices and between devices and an engineering environment.
IEC 61499 does not define how to implement a recovery system, but it paves the way for new concepts and enables defined interfaces such as those introduced above [4]. The device-independent representation of automation hardware, applications and function blocks is fundamental to a heterogeneous concept for control application and device recovery. The mentioned interfaces and the capabilities of IEC 61499 open up many possibilities, but given the shortcomings of current fieldbus technologies, IEC 61499 can only provide the missing features if several extensions are made to the standard [1]. In particular, the configuration management interface has to be extended with commands for device operations, such as identification and storage of configuration data as well as of the operational states of control applications and function blocks. Assuming these changes are made, the communication scheme for organizing the automatic recovery of faulty devices is described in the next section.
2. Communication scheme
As stated in [2, 13], traditional recovery concepts, which are mostly used in storage networks for databases or even control systems, use a client/server architecture. Every client tries to store its valuable data on a central recovery server, either client-driven or server-driven. In case of failure the client device can restore the application from the server. For this type of system numerous replica propagation algorithms have been developed. The major disadvantage of this approach is its vulnerability to failures of the central server. Therefore the preferred approach is a multi-master concept, which allows more than one recovery server to be integrated in the network. Due to the changed set of features, the server is called recovery master (RM) or master-device (MD) in the following. Depending on the requested redundancy, the number of MDs can be increased arbitrarily. This requires a special kind of network communication and, of course, management rules to determine the execution behaviour. The approach described in this paper uses a multiple-access network structure in which every device can send messages to and receive messages from every other device (for example using Ethernet or CAN). Each of these network technologies supports a kind of multicast communication, which is used instead of, or in addition to, numerous direct point-to-point connections. In a software-based implementation the master-device and the slave-device (SD) can be represented by a master component and a slave component. The term master-device is therefore equivalent to the term device with master component, where the functionality is provided by an additional software component such as a master application. Likewise, the slave-device can also be called device with slave component, where the slave component is supposed to be part of each standard device. Furthermore, the whole set of devices is called recovery group (RG) (see Fig. 1), because this group of devices contains all participants necessary for a system recovery.
[Fig. 1: Recovery group. Labels recoverable from the figure: Master 1, Master 2 (Management, Other application 1 to n, Replication Master); Slave 1 (Redundancy: 2), Slave 2 (Redundancy: 3) (Management, Other application 1 to n, Replication Slave); Other device; Router.]
The RG approach can be divided into two main operational sequences:
Device registration process: The device registration process describes the communication between MDs and SDs in the network and how new SDs are assigned to one or more corresponding MDs, which are responsible in case of a recovery. The replica placement strategy used comes close to the well-known client-based replica or pull approach, whereas some aspects are similar to the push approach.
Application query and transfer process: The application process describes how a slave application is queried, stored and transferred back to the SD by one or more MDs.
and are used by several system processes. These processes and the linked algorithms are described within the following subsections.
until every master-ID is unique. For deterministic execution the master-list may be implemented statically, because otherwise the list might grow too large as too many devices are added. Therefore the size of the list has to be defined before start-up. If an MD receives a message with a master-ID equal to -1, the sending device is removed from the receiver's list.
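The master-list handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the function block implementation of the paper: the class `MasterList`, the function `pick_unique_id`, the ID range and the policy of ignoring newcomers once the list is full are all hypothetical. Only the fixed list size, the uniqueness of master-IDs and the removal on master-ID -1 are taken from the text.

```python
import random

REMOVE_ID = -1  # master-ID that asks receivers to delete the sender


class MasterList:
    """Statically sized master-list: device address -> master-ID."""

    def __init__(self, capacity):
        self.capacity = capacity  # fixed before start-up
        self.entries = {}

    def handle_register(self, address, master_id):
        """Apply one master register message; return True if the list changed."""
        if master_id == REMOVE_ID:
            # The sender withdraws: drop it from this receiver's list.
            return self.entries.pop(address, None) is not None
        if address not in self.entries and len(self.entries) >= self.capacity:
            # Assumption: a full list ignores additional masters.
            return False
        self.entries[address] = master_id
        return True


def pick_unique_id(taken, rng, upper=1000):
    """Draw random IDs in 1..upper until one is unused (assumed policy)."""
    used_in_range = sum(1 for t in set(taken) if 1 <= t <= upper)
    if used_in_range >= upper:
        raise ValueError("no free master-ID left in the assumed range")
    candidate = rng.randrange(1, upper + 1)
    while candidate in taken:
        candidate = rng.randrange(1, upper + 1)
    return candidate
```

A master that detects a collision with an ID already in its list could call `pick_unique_id` with the IDs it knows and publish the result in a new master register message.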
contains the same data in each MD, the slave-list contains only those devices which should be served by the specific MD. In case of a slave register request a hierarchy is derived from the master-ID of each MD. Depending on the position of each MD in this hierarchy and on the requested amount of redundancy, the information of an SD is only kept by those MDs which have the highest master-IDs. The algorithm (shown in Fig. 3) assumes that the RG contains enough MDs to cope with all redundancy requests and that the MDs have registered with each other. Furthermore, each MD has to listen and wait for incoming messages.
If a device wants to register for a recovery, it has to publish a slave register request message which contains its network address, its desired redundancy and an application identifier. Every MD receives this message and checks whether it is allowed to accept the SD registration or not. On the basis of the master-list and the master-IDs the MD can determine its own position within the hierarchy; if it lies between the highest position and the highest position minus the redundancy, it is allowed to accept the SD. The redundancy therefore determines the number of MDs accepting a new SD. After accepting an SD the master-ID is automatically lowered by a defined value or by a random number between the old ID and zero. The change of the ID is published to the RG using an extra master registration message (this process is similar to the master-component registration process described in section 3.1). If the MD receives a message from an SD which is already in its list, the stored information is deleted. Every slave register request therefore causes a new distribution of the SD information across the whole RG. All stored information about this SD is deleted to protect the system from version conflicts. After a successful registration of an SD the MD tries to retrieve the requested application either from the requesting device or from a local repository.
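The acceptance rule of the slave registration can be illustrated with a short sketch. It assumes that every MD holds the same master-list, that a higher master-ID means a higher position in the hierarchy, and that the ID is lowered by a fixed step that is clamped at zero. The names `SlaveRegisterRequest`, `MasterDevice`, `on_slave_register` and `step` are hypothetical, and publishing the lowered ID to the RG is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveRegisterRequest:
    address: str      # network address of the SD
    redundancy: int   # number of MDs that should keep the SD
    app_id: str       # application identifier


class MasterDevice:
    def __init__(self, address, master_list, step=10):
        self.address = address
        self.master_list = dict(master_list)  # address -> master-ID
        self.slave_list = {}                  # SD address -> request
        self.step = step                      # assumed fixed decrement

    def on_slave_register(self, request):
        """Return True if this MD accepts the SD."""
        # Stored information about a known SD is deleted first,
        # so every request triggers a fresh distribution.
        self.slave_list.pop(request.address, None)

        # Hierarchy: masters ordered by descending master-ID.
        ranking = sorted(self.master_list,
                         key=self.master_list.get, reverse=True)
        if self.address not in ranking[:request.redundancy]:
            return False

        self.slave_list[request.address] = request
        # Lower the own master-ID so later SDs are spread over other MDs.
        own_id = self.master_list[self.address]
        self.master_list[self.address] = max(own_id - self.step, 0)
        return True
```

With three masters holding the IDs 50, 30 and 10 and a requested redundancy of 2, only the two masters with the highest IDs accept the SD in this sketch.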
Following the registration and update process, the slave recovery request starts the prepared recovery process, as shown in Fig. 4. Every MD which has the requesting SD in its list starts to transfer the locally stored application to the requesting device. The multiple connections between the devices are handled on a first-come, first-served basis. Therefore no additional algorithm is necessary, and the number of MDs transferring an application to an SD can be increased without any need for further configuration. The way in which the application is transferred depends on the implementation and is not covered in further detail in this work.
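The first-come, first-served handling of a recovery can be sketched as below. This is only an illustration: the class `RecoveringSlave`, the function `recover` and the representation of each MD as a pair of address and stored applications are assumptions, and the actual transfer mechanism is, as stated above, implementation-specific.

```python
class RecoveringSlave:
    """An SD that accepts the first application transfer it is offered."""

    def __init__(self, address):
        self.address = address
        self.application = None
        self.served_by = None

    def offer_transfer(self, master_address, application):
        """Return True if this MD's transfer is accepted."""
        if self.served_by is not None:
            return False  # another MD was faster
        self.served_by = master_address
        self.application = application
        return True


def recover(slave, masters):
    """masters: (address, stored applications) pairs in arrival order."""
    accepted = []
    for address, store in masters:
        # Only MDs that hold the requesting SD in their list respond.
        if slave.address not in store:
            continue
        if slave.offer_transfer(address, store[slave.address]):
            accepted.append(address)
    return accepted
```

Because the SD simply ignores later offers, adding further MDs to the list passed to `recover` needs no extra coordination between them.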
[Fig. 4: Recovery process. Labels recoverable from the figure: master-list; Slave 2; Slave 3 (Redundancy: 2); request; accepted; not accepted; application transfer; recovery; "Slave has been replaced by a new one, with the same identification".]
component registration process. After this procedure the deleting MD generates a new random master-ID and sends another master register request message to tell the other devices that it is accepting messages again.
4. Implementation
The concepts described in sections 2 and 3 formed the basis for the development of a prototype IEC 61499 function block library. This library contains the function blocks for basic communication, such as Replication Master and Replication Slave (see Figs. 5 and 6), and furthermore several function blocks for transferring, storing and parsing a control application.
[Fig. 5: Replication Master function block. Event inputs: INIT, LISTEN, DEL, ID. Event outputs: INITO, RECOVER, DISCOVER, CNFID.]
[Fig. 6: ReplicationSlave function block. Data inputs: QI (BOOL), MGR_ID (WSTRING), REDUNDANCY (INT), APPNAME (WSTRING). Data outputs: QO (BOOL), STATUS (WSTRING).]
The library has been designed using the Function Block Development Kit (FBDK) and the Function Block Runtime (FBRT) [3], together with some additional Java code. For testing purposes a simple application has been designed which runs on two common Wintel computers; the faulty behaviour of a device is simulated simply by closing the running application. As depicted in Fig. 7, the user has to start the basic master and slave services on the devices. These services may be part of the firmware of an embedded device and started automatically after power-on. After ensuring that all services are running properly, the user can load the application onto the SD. When the user application is started, the SD automatically registers itself at the RG and one or more MDs accept the register request. Next, the MD retrieves the user application from the SD and stores it for a future recovery. If the SD malfunctions, it is replaced by a new, empty SD (empty means that only the basic services are included as firmware). The empty device registers itself at the RG and asks for a possible recovery. If one or more MDs hold an appropriate application, the recovery process is started: the application is transferred back to the new SD and started.
[Fig. 7: Sequence of operations. Participants: User, Development tool, Device (with master application), Device (with slave application), Device (with slave application).]
Acknowledgements
This work is supported by the FIT-IT: Embedded System program, an initiative of the Austrian Federal Ministry of Transport, Innovation and Technology (bm:vit), within the Crons project [1] under contract number FFG 808205. Further information is available at www.microns.org. PROFACTOR is a core member of the I*PROMS consortium (www.iproms.org).
References
[1] Micro Holons for Next Generation Distributed Embedded Automation and Control, web page, http://www.microns.org, Profactor GmbH, accessed December 05, 2005.
[2] Robert Spalding, Storage Networks: The Complete Reference, McGraw-Hill & Osborne, 2003.
[3] HOLOBLOC, Inc. - Resources for the new generation of automation and control, web page, http://www.holobloc.com, HoloBloc Inc., accessed December 05, 2005.
[4] Robert Lewis, Modelling control systems using IEC 61499, The Institution of Electrical Engineers, London, 2001.
[5] IEC 61499: Function blocks for industrial-process measurement and control systems, Publication, International Electrotechnical Commission IEC Standard (2005).
[6] IEC 61131-3: Programmable controllers - Part 3: Programming languages, Publication, International Electrotechnical Commission IEC Standard (2003).
[7] Kramer J., Magee J., Dynamic configuration for distributed systems, IEEE Transactions on Software Engineering, 1985.
[8] Wills L.M., Kannan S., Sander S., Guler M., Heck B.S., Prasad J.V.R., Schrage D., Vachtsevanos G.J., An Open Platform For Reconfigurable Control, IEEE Control Systems Magazine, 2001.
[9] Shelton C.P., Koopman P., Nace W., A Framework for Scalable Analysis and Design of System-wide Graceful Degradation in Distributed Embedded Systems, Proceedings of the 8. IEEE International Workshop on Object Oriented Real-Time Dependable Systems, 2003.
[10] Garcia H.E., Ray A., Edwards R.M., A reconfigurable hybrid supervisory system for process control, Proceedings of the 33rd Conference on Decision and Control, Lake Buena Vista, 1994.
[11] Guler M., Clements S., Wills L.M., Heck B.S., Vachtsevanos G.J., Transition Management for Reconfigurable Hybrid Control Systems, IEEE Control Systems Magazine, February 2003.
[12] Feiler P., Jun Li, Consistency in dynamic reconfiguration, Proceedings of the 4. International Conference on Configurable Distributed Systems, 1998.
[13] Tanenbaum A.S., van Steen M., Distributed Systems: Principles and Paradigms, Prentice Hall, New Jersey, ISBN 0-13-088893-1, 2002.
The concept above only works if the devices provide a unique identifier which determines whether they are appropriate for a recovery or not.