High Level Techniques For Self-Repairing Robotic Systems: Claus C. Aranha, Jacques Wainer, André Covic Bastos

High level techniques for self-repairing robotic systems
Claus C. Aranha , Jacques Wainer , André Covic Bastos
Instituto de Computação – Universidade Estadual de Campinas

Avenida Albert Einstein, 1251, Caixa Postal 6176 – 13083-970 Campinas, SP - Brasil
claus.aranha@ic.unicamp.br, wainer@ic.unicamp.br, andrebas@ic.unicamp.br
Abstract. Usually, robotic fault-tolerance techniques refer to methods to isolate
and treat faults individually. In this work we propose high level (planning)
techniques for a new approach, where multiple, possibly heterogeneous devices in
a robotic system cooperate to diagnose faults and to take over a faulty device's
functions in the system. This technique uses pre-defined "replacement plans" to
deal with faults, avoiding costly online replanning.
1. Introduction
Robots are usually employed to replace humans in hazardous tasks, or in tasks in hazardous
environments. These missions usually strain the robot's circuitry to its extremes, making
component failures more common than in robots that perform assembly line tasks.
Therefore, the ability to avoid and tolerate failure has become an important factor in
evaluating a robotic system's performance on a given mission. Another desirable feature of
such an autonomous system is the ability to recover and reconfigure itself after detecting
an error, so that it does not lose its functionality, even if at some performance cost. Both of
these should be done in a reasonable amount of time for a robot to survive in a dynamic
environment.
Fault tolerant and self reconfigurable systems are especially useful to extend a
robot's autonomy time, that is, how long the system is able to run without direct or
indirect support by humans. This is necessary for robots used in space exploration, where
the time lag between base and robot can be very long, or when the communication with
the base is not reliable. Even for earth-based robots, a large degree of autonomy is
desirable when they face missions where human support is not available, as in
underwater environments or emergency situations. Our aim, however, is not only to
make robots last "long enough" to be repaired by humans, but
eventually to make robots that are capable of repairing themselves.
While, in the field of robotics, the terms fault tolerance, self repair, and
autonomous robots have all been used to name quite different lines of research, in this work
we focus on a high level (planning level) approach, which we call the replacement process.
Given an action $a$ belonging to a plan $P$ that a robot needs in order to accomplish a
mission, it consists of finding a group of actions such that the preconditions and
postconditions of $a$ still hold (so that $P$ still accomplishes the mission).
Supported by FAPESP

While the obvious application of this technique (and the one we build this work
upon) is to use the replacement process to maintain the functionality of a robot that has
suffered faults in some of its devices, it does have other uses in robotics. For instance,
a robot that runs self diagnostic procedures online might need to make the system being
checked unavailable to the planner. Another possibility is that the robot needs to execute
two tasks at the same time, making some of its resources unavailable for a
time. Finally, we propose that this approach, while leading to a non-optimal plan when
compared with replanning techniques, will make the robot's response to faults faster.
The next section of this article discusses previous works that influenced our current
research. In the following section we present the technique we are currently working
on. A theoretical discussion is followed by a report on preliminary experiments,
performed on a simulated platform. Following that, proposals on where to continue the
research efforts are made. In the last section, we introduce the issues still left open in
our work and other interesting questions.
2. Related Works
One of the main trends in robotic fault tolerance is the use of low level techniques for
diagnosis and isolation of faults in robotic components [Visinsky et al., 1994]. While
the basic idea of comparing a robotic device's internal sensor feedback to its
expected values [Visinsky, 1991] still remains, current developments of this
approach include the use of neural networks to avoid mistaking data noise for faults
[Tinós and Terra, 2001, Terra et al., 2001] and the use of mathematical models
to obtain more information from relatively little sensory data, called Analytical
Redundancy [Leuschen et al., 2002, Visinsky et al., 1994]. These works, however, do little
more than say that "the planner will take on from here" after they lock out a faulty device
in a robotic system. That is where our work intends to pick up.
Another approach to the robotic fault tolerance problem comes from
self-configurable robotic systems [Kotay et al., 1998]. These are systems composed of many
very simple robots (like 1DoF robots), which try to behave like multicellular biological
systems. It is proposed [Ortega and Tyrrell, 2000, Tyrrell, 1999] that these systems'
ability to change their own configuration leads to high fault-recovery capabilities. In this
direction, evolutionary algorithms have also been employed to develop fault tolerant
hardware [Thompson, 1995].
Benso has proposed [Benso et al., 2001] a fault tolerance approach for
microprocessors on which we base our robotic proposal. In his work, replacement tables are used
to replace a faulty device within a processor with a new command set that reaches the same
results, exchanging performance for reliability. This is the idea we want to use to provide
fault tolerance abilities in robots in this work.
In [Parker, 1998], a system quite similar to the one we present in this paper is
proposed. The main difference is that our approach is initially focused on single, complex
robot systems, while being extensible to multi-robot domains. Also, we delve
a little deeper into the software functions that can be performed by a single robot. Still,
the two works are different approaches for handling the same problem.
3. Replacement System
Let us call a plan $P$ a set of actions which we expect to take us from an initial
state to a final state in which target variables, from the set of variables that
define a state in our world, have a desired target value.
For a given plan $P$, we define a replacement for $P$ on an action $a$
as a second plan $P'$, such that the initial and final states of $P'$ are the same as
those of $P$, but $P'$ does not contain $a$.
As an example, let us define a simple robot. It has three possible actions: go
forward, turn left and turn right ($f$, $l$, $r$). Let $P$ be a plan to make the robot take
one step left: $P = (l, f)$ (turn left, then go forward). A replacement plan for $P$ on $l$
would be $(r, r, r, f)$ (three turns right, then go forward). On the other hand, there is no
replacement plan for $P$ on $f$ with the available actions. That would be different if, for
instance, the robot had a "go backwards" action.
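The example can be checked mechanically. The following is a minimal sketch, assuming a grid world without obstacles and a pose of the form (x, y, heading); the names `step`, `run` and `is_replacement` are illustrative and do not come from the original work.

```python
# Hypothetical model of the three-action robot: a pose is (x, y, heading).
# Headings index into DIRS: 0 = north, 1 = east, 2 = south, 3 = west.
DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def step(pose, action):
    """Apply one action ('f', 'l' or 'r') to a pose and return the new pose."""
    x, y, h = pose
    if action == "f":
        dx, dy = DIRS[h]
        return (x + dx, y + dy, h)
    if action == "l":
        return (x, y, (h - 1) % 4)
    if action == "r":
        return (x, y, (h + 1) % 4)
    raise ValueError(f"unknown action: {action}")


def run(pose, plan):
    """Execute every action of a plan in order and return the final pose."""
    for action in plan:
        pose = step(pose, action)
    return pose


def is_replacement(candidate, plan, action, start=(0, 0, 0)):
    """True if `candidate` avoids `action` and ends in the same pose as `plan`."""
    return action not in candidate and run(start, candidate) == run(start, plan)


plan = ["l", "f"]
print(is_replacement(["r", "r", "r", "f"], plan, "l"))  # True
print(is_replacement(["l", "l", "l", "f"], plan, "l"))  # False
```

The check compares final poses from a single start pose only, which is enough in this obstacle-free model but would not account for walls.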
We call a Replacement System $S$ a set of plans such that for every action $a$, there is
at least one plan $P_a$ in $S$ such that $P_a$ is a replacement for the subplan composed of only
the action $a$. For instance, a very simple replacement system for the above robot would
look like the one described in Table 1.
Action        Replacement
Go forward    No replacement for this robot
Turn Right    Turn Left, Turn Left, Turn Left
Turn Left     Turn Right, Turn Right, Turn Right
Table 1: A Simple Sample Replacement System
We can use such a system to provide a robot with plan repair capabilities
during runtime. A robot running with a replacement system would begin its mission
by doing the usual planning. Then it would run the plan's instructions. Whenever the robot
detects an actuator failure, it would mark the corresponding actions as unavailable. If the
plan later required a marked action to be used, the robot would replace that action
with an equivalent plan from the Replacement System.
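The runtime procedure just described can be sketched as a table lookup. This is an illustrative sketch only: the dictionary mirrors Table 1, while the function name `repair_plan` and the choice to raise `LookupError` when no subplan fits are assumptions, standing in for the point where a real robot would fall back to replanning.

```python
# Replacement table mirroring Table 1; an empty list means "no replacement".
REPLACEMENTS = {
    "forward": [],
    "turn_right": [["turn_left", "turn_left", "turn_left"]],
    "turn_left": [["turn_right", "turn_right", "turn_right"]],
}


def repair_plan(plan, unavailable):
    """Return the plan with each unavailable action swapped for a subplan.

    Raises LookupError when no listed subplan avoids the unavailable
    actions (hypothetical stand-in for falling back to replanning).
    """
    repaired = []
    for action in plan:
        if action not in unavailable:
            repaired.append(action)
            continue
        for subplan in REPLACEMENTS.get(action, []):
            if not any(a in unavailable for a in subplan):
                repaired.extend(subplan)
                break
        else:
            # No subplan was usable for this action.
            raise LookupError(f"no replacement for {action!r}")
    return repaired


print(repair_plan(["turn_left", "forward"], {"turn_left"}))
```

For simplicity the sketch repairs the whole plan at once, whereas the text describes replacing an action only when the plan reaches it.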
It can easily be noted that this procedure can recurse. While the robot is executing
the subplan, it must check the subplan's actions for faults, so it can replace the
replacement actions themselves. If done without care, this could lead the replacement system
into a deadlock. Different safeguards can be used to avoid this situation. We could
make the replacement system directed, with higher level actions that are replaced by lower
level actions, as done in [Benso et al., 2001]. While this is simple, it reduces our
replacement ability (in our simple example above, the turn left and turn right actions need
to form a cycle, or one of them will lose its replacement). Another solution would
be to store which actions we have already replaced, and avoid using them again in the
replacement process.
Therefore, before using a replacement plan, we need to check whether it is valid. A
replacement subplan is considered valid when: (1) starting from the current state, all its
actions are valid (for instance, it will not try to do something the environment would usually
prevent, like going through a wall); and (2) it does not contain actions that are
already being replaced (as described in the previous paragraph).
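The second condition, together with the bookkeeping of actions already being replaced, can be sketched as a recursive expansion. This is a minimal illustration under stated assumptions: the function name `expand` and the table layout are hypothetical, and the first condition (validity of each action in the current state) is left out.

```python
def expand(action, table, faulty, in_progress=frozenset()):
    """Expand `action` into a list of working actions, or return None.

    `in_progress` holds the actions already being replaced higher up the
    recursion; subplans that use them are skipped, which prevents cycles.
    """
    if action not in faulty:
        return [action]
    blocked = in_progress | {action}
    for subplan in table.get(action, []):
        if any(a in blocked for a in subplan):
            continue  # would re-enter an action that is being replaced
        result = []
        for a in subplan:
            sub = expand(a, table, faulty, blocked)
            if sub is None:
                break  # this subplan cannot be completed; try the next one
            result.extend(sub)
        else:
            return result
    return None


table = {
    "turn_left": [["turn_right"] * 3],
    "turn_right": [["turn_left"] * 3],
}
print(expand("turn_left", table, {"turn_left"}))
print(expand("turn_left", table, {"turn_left", "turn_right"}))
```

With both turning actions faulty, the guard stops the left/right cycle and the function returns `None` instead of recursing forever.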
4. Experimental results
We propose that the use of a replacement system can reduce the processing time per action
when it is able to avoid or postpone the need for replanning in a faulty environment.
Although this also means that the substitute plan might have more actions than one
obtained by replanning, for a robot in a dynamic environment the response time is more
important than finding an optimal solution.
To validate our proposal, we have developed a simple robot simulator. It simulates
a four wheeled robot, where each wheel is independent from the others and capable of
going forward, stopping and going backwards. Therefore, the simulated robot is capable
of 3^4 = 81 different actions. This robot's environment is a simple maze which the robot must
navigate. During the simulation, we can inject faults in the robot's wheels, making each
of them unable to go forward, backwards, or to stop.
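The count of 81 actions follows from assigning one of three commands to each of four independent wheels. A minimal sketch, assuming a fault is represented as a (wheel index, command) pair; the names below are illustrative and not taken from the simulator.

```python
from itertools import product

WHEEL_COMMANDS = ("forward", "stop", "backward")

# One action assigns a command to each of the four independent wheels.
ACTIONS = list(product(WHEEL_COMMANDS, repeat=4))


def available_actions(faults):
    """Filter out actions that need a faulty (wheel, command) pair.

    `faults` is a set of (wheel_index, command) pairs.
    """
    return [
        action
        for action in ACTIONS
        if not any((wheel, cmd) in faults for wheel, cmd in enumerate(action))
    ]


print(len(ACTIONS))                               # 81
print(len(available_actions({(0, "forward")})))   # 54
```

A single fault removes the 27 actions that use that wheel command, which shows how quickly a small set of failures shrinks the action set the planner can rely on.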
The experiment consisted of running the robot on four different mazes, four times
on each maze (with different starting and ending points). This was to represent short and
long, simple and complex, narrow and wide paths for the robot to follow. Each
of the sixteen paths was run under twelve different sets of failures, each set containing up
to two failures. For this experiment, all failures happen in the simulation's first turn, are
readily detected by the robot, and are permanent (they last for the entire simulation).
We implemented a simple replacement system for the simulated robot. In this
system's replacement table, each action had 5 replacement plans which we defined
manually. The robot would check this table whenever it tried to perform a movement for which
one of the engines was damaged. If any of the subplans for that particular movement
was composed of non-damaged movements only, and the subplan was itself valid in that
state (it would not bump into a wall), the robot would replace the damaged movement with it.
Otherwise, it would replan the whole path with the available movements.
We compared this robot’s performance to that of a simulated robot without the
replacement system. This second robot would simply replan everything from scratch
whenever it tried to perform a faulty movement. We used a simple A* search for the
planning part.
The preliminary test runs revealed many issues concerning the implementation of
a replacement system. The first of them was the need to take the possibility of replacing
actions into account during the planning stage. The first A* planner used would find optimal
paths that bordered any walls between the starting and ending point. However, such an optimal
path would rule out many possible replacement actions, which would bump the robot into the
wall (as in Figure 1). We therefore slightly changed the search's heuristics so that the robot
would find a path that avoided being too near the walls whenever possible. After that, a
slight increase in successfully replaced actions could be observed.
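The paper does not state how the search was changed. One way to obtain the described behaviour, shown here purely as an assumption, is to add a penalty to the step cost of cells adjacent to a wall while keeping the Manhattan distance as the A* heuristic. The grid encoding (`#` for walls) and the names `near_wall`, `astar` and `wall_penalty` are hypothetical.

```python
import heapq


def near_wall(grid, cell):
    """True if any of the 8 neighbours of `cell` inside the grid is a wall."""
    r, c = cell
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                if grid[rr][cc] == "#":
                    return True
    return False


def astar(grid, start, goal, wall_penalty=0.0):
    """A* on a 4-connected grid; entering a cell near a wall costs extra."""
    def h(cell):
        # Manhattan distance stays admissible because every step costs >= 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            extra = wall_penalty if near_wall(grid, nxt) else 0.0
            new_cost = cost + 1.0 + extra
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(
                    frontier, (new_cost + h(nxt), new_cost, nxt, path + [nxt])
                )
    return None


grid = [".....", ".....", ".....", ".....", "#####"]
print(astar(grid, (3, 0), (3, 4)))
print(astar(grid, (3, 0), (3, 4), wall_penalty=2.0))
```

Without the penalty the path hugs the bottom wall; with it, the planner accepts a longer path one row away from the wall, leaving room for replacement subplans.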
Another important issue regarded the implementation of the algorithm which
would search the replacement system for a suitable subplan for a faulty movement. A
very simple approach, where a table simply listed the possible replacements, and each
would be checked for suitability and then used, could be executed in constant time.
However, its success rate would depend entirely on the cleverness of the table it was based on.
As suggested in the previous section, some recursive action could improve this.
However, we then open the question of how to implement this recursiveness without losing the
constant, small time needed to replace an action using the table directly.
Figure 1: When walking too close to the wall, movement constraints prevent
some of the replacement plans from working
In the experiments for this paper we tried two different implementations to solve
this issue. First we tried separating the actions into two kinds: those that didn't need
replacements with more than two different actions, and those which couldn't do without
them. The first type we called primitive actions, and the second, complex actions. When
testing a substitute plan for fitness, complex actions wouldn't be checked for errors
unless they were the first action in the plan, so that they could be recursively replaced when
they were used by the robot. The first action requirement was to guarantee the stop
condition. This solution didn't work very well in the experiments, mainly due to the simplified
replacement table that was used. The other way to solve the recursion problem was
described in the previous section, and consists of storing which actions are currently being
replaced, and not using subplans composed of those actions.
After those considerations, the simulated robot performed well, spending a
negligible amount of time per action whenever it could find a substitute for all faulty actions
in its plan, and a noticeably reduced amount of replanning time when it could substitute
most actions in its plan. After each testing round, adjusting the entries for the least
replaceable actions in the replacement table would yield better results, which indicated the
need to generate a broader, more comprehensive replacement table.
5. Conclusion and Future work

In this paper, we intended to present the basic idea of a replacement system and discuss
the main issues regarding it.
The first problem we need to address from now on is the automatic generation of a
replacement table. The experience with the current work showed us that human designed
tables, besides taking too long to make, are very prone to error and to missing key
replacement subplans. Having acknowledged the need for an automatically generated replacement
table, we face the problem of how to build it. Taylor [Taylor, 1992] proposes a method
to eliminate action sequences which lead to identical states in depth first search. This
method could be used to generate a replacement table by turning the redundant sequences
into replacement subplans. Other machine learning techniques, like reinforcement
learning, look promising with respect to finding a good replacement table for a given robotic
system.
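To illustrate the idea of turning state-equivalent action sequences into replacement subplans, the sketch below enumerates short sequences by brute force for the simple three-action robot of Section 3. This is not Taylor's pruning method; it is only a hedged illustration, and the pose model, the obstacle-free world and the name `build_table` are assumptions.

```python
from itertools import product

# Headings index into DIRS: 0 = north, 1 = east, 2 = south, 3 = west.
DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def run(plan, pose=(0, 0, 0)):
    """Final (x, y, heading) after executing `plan` in an obstacle-free world."""
    x, y, h = pose
    for action in plan:
        if action == "f":
            x, y = x + DIRS[h][0], y + DIRS[h][1]
        elif action == "l":
            h = (h - 1) % 4
        elif action == "r":
            h = (h + 1) % 4
        else:
            raise ValueError(f"unknown action: {action}")
    return (x, y, h)


def build_table(actions, max_len=3):
    """For each action, list sequences of the other actions with the same effect."""
    table = {a: [] for a in actions}
    for a in actions:
        target = run([a])
        others = [b for b in actions if b != a]
        for n in range(1, max_len + 1):
            for seq in product(others, repeat=n):
                if run(seq) == target:
                    table[a].append(list(seq))
    return table


print(build_table(["f", "l", "r"]))
```

For this robot the generated table agrees with Table 1: each turn is replaced by three opposite turns, and no replacement is found for going forward. The enumeration grows exponentially with `max_len`, which is one reason a pruning method would be needed in practice.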
However, further study will be needed to balance the replacement table's size
and scope against the cost of finding a suitable replacement in it. The use of special
algorithms for replacement finding could also play a part in this. Also, the idea of the
replacement process itself could be developed further. Instead of directly replacing
faulty instructions, the system could use a lookahead, where it would replace the
faulty action together with the actions before and/or after it, so that it could use a more
efficient replacement subplan.
Finally, we can extend the ideas presented here to a multi-robot system, where a
faulty action in one of the robots could be replaced by a subplan composed of actions of
different robots belonging to the system. For this, the replacement system idea should be
revised to work on higher level actions and replacement plans. This higher level
replacement system might then treat not only failures due to system faults, but also problems
related to a dynamic environment.
References
Benso, A., Chiusano, S., and Prinetto, P. (2001). A self-repairing execution unit for
microprogrammed processors. IEEE Micro, pages 16–21.
Kotay, K., Rus, D., Vona, M., and McGray, C. (1998). The self reconfiguring robotic
molecule. In Proceedings of IEEE International Conference on Robotics and Automation.
Leuschen, M. L., Walker, I. D., and Cavallaro, J. R. (2002). Robotic fault detection using
nonlinear analytical redundancy. In IEEE International Conference on Robotics and
Automation.
Ortega, C. and Tyrrell, A. (2000). Reliability analysis in self-repairing embryonic systems.
Parker, L. E. (1998). ALLIANCE: An architecture for fault tolerant multi-robot cooperation.
IEEE Transactions on Robotics and Automation, 14(2):220–240.
Taylor, L. A. (1992). Pruning duplicate nodes in depth-first search. Technical report,
University of California.
Terra, M. H., Bergerman, M., Tinós, R., and Siqueira, A. A. G. (2001). Controle tolerante
a falhas de robôs manipuladores. SBA controle e automação, 12(2):73–92.
Thompson, A. (1995). Evolving fault tolerant systems. In Proc. 1st IEE/IEEE Int. Conf.
on Genetic Algorithms in Engineering Systems: Innovations and Applications
(GALESIA'95), pages 524–529. IEE Conf. Publication No. 414.
Tinós, R. and Terra, M. H. (2001). Fault detection and isolation in robotic manipulators
using a multilayer perceptron and a RBF network trained by the Kohonen's
self-organizing map. Revista Controle & Automação, 12(1):11–18.
Tyrrell, A. (1999). Computer know thy self!: A biological way to look at fault tolerance.
Visinsky, M. L. (1991). Fault detection and fault tolerance methods for robotics. Master’s
thesis, Rice University.
Visinsky, M. L., Cavallaro, J. R., and Walker, I. D. (1994). Robotic fault detection and
fault tolerance: a survey. Reliability Eng. and System Safety, 46:139–158.
