
Robotics and Artificial Intelligence: a Perspective on Deliberation Functions

Felix Ingrand, Malik Ghallab
LAAS-CNRS, Université de Toulouse
7, Av. Colonel Roche, 31077 Toulouse, France
E-mail: {felix,malik}@laas.fr

AI Communications
ISSN 0921-7126, IOS Press. All rights reserved

Abstract: Despite a very strong synergy between Robotics and AI in their early days, the two fields progressed widely apart in the following decades. However, we are witnessing a revival of interest in the fertile domain of embodied machine intelligence, which is due in particular to the dissemination of more mature techniques from both areas and more accessible robot platforms with advanced sensory-motor capabilities, and to a better understanding of the scientific challenges of the AI-Robotics intersection.
The ambition of this paper is to contribute to this revival. It proposes an overview of problems and approaches to autonomous deliberate action in robotics. The paper advocates for a broad understanding of deliberation functions. It presents a synthetic perspective on planning, acting, perceiving, monitoring, goal reasoning and their integrative architectures, which is illustrated through several contributions that addressed deliberation from the AI-Robotics point of view.

1. Introduction

Robotics is an interdisciplinary integrative field, at the confluence of several areas, ranging from mechanical and electrical engineering to control theory and computer science, with recent extensions toward material physics, bioengineering or cognitive sciences. The AI-Robotics intersection is very rich. It covers issues such as:
- deliberate action, planning, acting, monitoring and goal reasoning,
- perceiving, modeling and understanding open environments,
- interacting with humans and other robots,
- learning models required by the above functions,
- integrating these functions in an adaptable and resilient architecture.

Robotics has always been a fertile inspiration paradigm for AI research, frequently referred to in its literature, in particular in the above topics. The early days of AI are rich in pioneering projects fostering a strong AI research agenda on robotics platforms. Typical examples are Shakey at SRI [85] and the Stanford Cart in the sixties, or Hilare at LAAS [36] and the CMU Rover [70] in the seventies. However, in the following decades the two fields developed in diverging directions; robotics expanded mostly outside of AI laboratories. Hopefully, a revival of the synergy between the two fields is currently being witnessed. This revival is due in particular to more mature techniques in robotics and AI, to the development of inexpensive robot platforms with more advanced sensing and control capabilities, to a number of popular competitions, and to a better understanding of the scientific challenges of machine intelligence, to which we would like to contribute here.

This revival is particularly strong in Europe, where a large number of groups are actively contributing to the AI-Robotics interactions. For example, out of the 260 members of the Euron network,¹ about a third investigate robotics decision and cognitive functions. A similar ratio holds for the robotics projects in FP6 and FP7 (around a hundred). Many other European groups not within Euron and projects outside of EU programs are equally relevant to the AI and Robotics synergy. This focused perspective on deliberative capabilities in robotics cannot pay a fair tribute to all European actors of this synergy. It illustrates however several contributions from a few groups throughout Europe.² Its ambition is not to provide a comprehensive survey of deliberation issues, and even less of the AI-Robotics intersection. In the limited scope of this special issue, we propose a synthetic view of deliberation functions. We discuss the main problems involved in their development and exemplify a few approaches that addressed these problems. This tour d'horizon allows us to advocate for a broad and integrative view of deliberation, where problems are beyond search in planning, and beyond the open-loop triggering of commands in acting. We hope through this perspective to strengthen the AI-Robotics synergies.

¹ http://www.euron.org/
² e.g., from Barcelona, Bremen, Freiburg, Grenoble, Karlsruhe, London, Lille, Linköping, Munich, Örebro, Osnabrück, Oxford, Rennes, Roma and Toulouse

The outline of the paper is the following: five deliberation functions are introduced in the next section; these are successively addressed through illustrative contributions; section 8 is devoted to architecture problems, followed by a conclusion.

2. Deliberation functions in robotics

Deliberation refers to purposeful, chosen or planned actions, carried out in order to achieve some objectives. Many robotics applications do not require deliberation capabilities, e.g., fixed robots in manufacturing and other well-modeled environments; vacuum cleaning and other devices limited to a single task; surgical and other tele-operated robots. Deliberation is a critical functionality for an autonomous robot facing a variety of environments and a diversity of tasks.

[Figure 1. Diagram labels: Goal Reasoning, Planning, Acting, Perceiving, Monitoring; Models, data, and knowledge bases; User; Robot's Platform; Environment.]
Fig. 1. Schematic view of deliberation functions.

Several functions can be required for acting deliberately. The frontiers between these functions may depend on specific implementations and architectures, but it is clarifying to distinguish the following five deliberation functions, schematically depicted in figure 1:
- Planning: combines prediction and search to synthesize a trajectory in an abstract action space, using predictive models of feasible actions and of the environment.
- Acting: implements on-line closed-loop feedback functions that process streams of sensor stimuli into actuator commands in order to refine and control the achievement of planned actions.
- Perceiving: extracts environment features to identify states, events, and situations relevant for the task. It combines bottom-up sensing, from sensors to meaningful data, with top-down focus mechanisms, sensing actions and planning for information gathering.
- Monitoring: compares and detects discrepancies between predictions and observations, performs diagnosis and triggers recovery actions.
- Goal reasoning: keeps current commitments and goals in perspective, assessing their relevance given observed evolutions, opportunities, constraints or failures, deciding about commitments to be abandoned, and goals to be updated.

These deliberation functions interact within a complex architecture (not depicted in Fig. 1) that will be discussed later. They are interfaced with the environment through the robot's platform functions, i.e., devices offering sensing and actuating capabilities, including signal processing and low-level control functions. The frontier between sensory-motor functions and deliberation functions depends on how variable the environments and the tasks are. For example, motion control along a predefined path is usually a platform function, but navigation to some destination requires one or several deliberation skills, integrating path planning, localization, collision avoidance, etc.

Learning capabilities change this frontier, e.g., in a familiar environment a navigation skill is compiled down into a low-level control with pre-cached parameters. A metareasoning function is also needed for trading off deliberation time for action time: critical tasks require careful deliberation, while less important or more urgent ones may not need, or allow for, more than fast approximate solutions, at least for a first reaction.³

³ Learning as well as metareasoning are certainly needed for deliberation; they are not covered here to keep the argument focused.
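
To make the Monitoring function of this section more concrete, here is a minimal Python sketch; it is not taken from any of the systems discussed in this paper. It assumes that a state is a dictionary of numeric state variables and that a fixed tolerance is an adequate discrepancy test. The names `Discrepancy`, `detect_discrepancies`, `monitor` and the `recover` callback are hypothetical, chosen for illustration only, and diagnosis is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumption: a state maps state-variable names to numeric values.
State = Dict[str, float]


@dataclass
class Discrepancy:
    """One state variable whose observation disagrees with the prediction."""
    variable: str
    predicted: float
    observed: Optional[float]  # None when the variable was not observed


def detect_discrepancies(predicted: State, observed: State,
                         tolerance: float = 0.1) -> List[Discrepancy]:
    """Compare predictions with observations, variable by variable."""
    found = []
    for name, expected in predicted.items():
        actual = observed.get(name)
        if actual is None or abs(actual - expected) > tolerance:
            found.append(Discrepancy(name, expected, actual))
    return found


def monitor(predicted: State, observed: State,
            recover: Callable[[List[Discrepancy]], None],
            tolerance: float = 0.1) -> List[Discrepancy]:
    """Trigger the (hypothetical) recovery callback when discrepancies exist."""
    discrepancies = detect_discrepancies(predicted, observed, tolerance)
    if discrepancies:
        recover(discrepancies)
    return discrepancies
```

In a real system the prediction would come from the planner's action models and the recovery callback would hand control back to acting or planning; both are reduced to plain Python values here.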

3. Planning

Over the past decades, the field of automated planning achieved tremendous progress such as a speed-up of a few orders of magnitude in the performance of Strips-like classical planning, as well as numerous extensions in representations and improvements in algorithms for probabilistic and other non-classical planning [35]. Robotics stresses particular issues in automated planning, such as handling time and resources, or dealing with uncertainty, partial knowledge and open domains. Robots facing a variety of tasks need domain-specific as well as domain-independent task planners, whose correct integration remains a challenging problem.

Motion and manipulation planning are key capabilities for a robot, requiring specific representations for geometry, kinematics and dynamics. Probabilistic Roadmaps and Rapidly-exploring Random Trees are well developed and mature techniques for motion planners that scale up efficiently and allow for numerous extensions [61]. The basic idea is to randomly sample the configuration space of the robot (i.e., the vector space of its kinematics parameters) into a graph where each vertex is a free configuration (away from obstacles) and each edge a direct link in the free space between two configurations. Initial and goal configurations are added to this graph, between which a path is computed. This path is then transformed into a smooth trajectory. Manipulation planning requires finding feasible sequences of grasping positions, each of which is a partial constraint on the robot configuration that changes its kinematics [87]. Many other open problems remain in motion and manipulation planning, such as dynamics and stability constraints, e.g. for a humanoid robot [46], or visibility constraints to allow for visual servoing [14].

Task planning and motion/manipulation planning have been brought together in several works. The Asymov planner [12] combines a state-space planner with a search in the motion configuration space. It defines places which are both states and sets of free configurations. Places define bridges between the two search spaces. The state-space search prunes a state whose corresponding set of free configurations does not meet current reachability conditions. Asymov has been extended to manipulation planning and to multi-robot planning of collaborative tasks, such as two robots assembling a table.

The integration of motion and task planning is also explored in [96] with Angelic Hierarchical Planning (AHP). AHP plans over sets of states with the notion of reachable set of states. These sets are not computed exactly, but bounded, e.g., by a subset and a superset, or by an upper and a lower bound cost function. A high-level action has several possible decompositions into primitives. A plan of high-level actions can be refined into the product of all feasible decompositions of its actions. A plan is acceptable if it has at least one feasible decomposition. Given such a plan, the robot chooses opportunistically a feasible decomposition for each high-level action (AHP refers to the angelic semantics of nondeterminism). The bounds used to characterize reachable sets of states are obtained by simulation of the primitives, including through motion and manipulation planning, for random values of the state variables.

A different coupling of a hierarchical task planner to fast geometric suggesters is developed in [45]. These suggesters are triggered when the search in the decomposition tree requires geometric information. They do not solve the geometric problem completely, but they provide information that allows the search to continue down to the leaves of the tree. The system alternates between planning phases and execution of primitives, including motion and manipulation actions. Online planning makes it possible to run motion or manipulation planners (not suggesters) in fully known states. The approach assumes that the geometric preconditions of the abstract actions can be computed quickly and efficiently by the suggesters, and that the sub-goals resulting from action decomposition are executed in sequence (no parallelism). The resulting system is not complete. Failed actions should be reversible at a reasonable cost. For problems where these assumptions are met, the system is able to quickly produce correct plans.

4. Acting

In contrast to planning that can easily be specified as an offline predictive function, decoupled from the intricacies of the executing platform, acting is more difficult to define as a deliberation function. The frequent reference to execution control is

often reductive: there is more to it than just triggering actions prescribed by a plan. Acting has to handle noisy sensors and imperfect models. It requires non-deterministic, partially observable and dynamic environment models, dealt with through closed-loop commands.

To integrate these requirements with those of predictive planning models, different forms of hierarchization are usually explored. For example (figure 2):
- planning deals with abstract preconditions-effects actions;
- acting refines opportunistically each action into skills and a skill further down into commands. This refinement mechanism may also use some planning techniques but with state and action spaces distinct from those of the planner.

[Figure 2. Diagram labels: Mission; Planning; Acting; Robot's platform; "..., action, ..."; "..., skill, ..."; "..., command, ..."; Planning techniques in action refinement.]
Fig. 2. Refining actions into skills.

The skill into which an action is refined may change during that action execution. For example, several navigation skills may offer different localization or motion control capabilities adapted to different environment features. A goto(room1) action can be refined into a sequence of different navigation skills.

This hierarchization scheme may rely on distinct knowledge representations, e.g. STRIPS operators combined with PRS [42] or RAP [27] procedures. In some cases a single representation is used for specifying planning and acting knowledge, e.g., Golog [63] or TAL [20] languages. Other approaches use a single representation seen at different levels of abstraction and refined appropriately, as in Hierarchical MDPs [38] for example.

Various computational techniques can be used to design a deliberate acting system. We propose to organize these approaches into five categories (see table 1) presented in the following subsections. Before discussing and illustrating these approaches, let us introduce the main functions needed to carry out a planned abstract action.

Refinement. In most systems, the plan steps produced by the Planning component are not directly executable as robot commands. The goto(room1) action requires sending commands to the robot to perceive the environment, plan the path, execute it avoiding dynamic obstacles, etc. The refinement process needs to be context dependent in order to select the most appropriate skills according to the online observation. It should be able to consider alternative refinements in case of failure.

Instantiation/Propagation. Acting skills are often applicable to a range of situations. Their models use parameters whose value can be chosen at execution time or observed in the environment. Chosen or observed values have to be propagated in the rest of the skills down to the lowest level to issue commands.

Time management/Coordination. Acting is performed in a closed loop taking into account the dynamics of the environment. Adequate responses must be given in a timely manner. Some systems reason about time, deadlines as well as durations. Other systems handle a more symbolic representation of time with concurrency, rendez-vous and synchronization.

Handling nondeterminism and uncertainty. Actions may have non-nominal effects. Furthermore, exogenous events in a dynamic environment are seldom predictable in a deterministic way. Finally, uncertainties in observations have to be taken into account.

Plan repair. Some Acting approaches can repair the plan being executed. This is often performed using part of the problem search space already developed and explored by the Planner (hence, with an overlap between acting and planning). The idea is to solve new flaws or new threats that appeared during execution by minimizing the changes in the remainder of the plan. Even if repair may not be more efficient than replanning, there are cases where part of the plan has already been shared with other agents (robots or humans) which expect the robot to commit to it.

Another important aspect of Acting is how the skills are acquired and used. Are they completely hand written or learned? Are they used directly as programs or as specification models from which further synthesis is performed? Finally, there is the consistency verification issue between the Acting knowledge and the Planning knowledge. We will see that some of the proposed formalisms for representing skills are more adapted to validation and verification, even if this function is not always performed.

4.1. Procedure-based approaches

In procedure-based approaches, action refinement is done with handwritten skills. In RAP [27], each procedure is in charge of satisfying a particular goal, corresponding to a planned action. Deliberation chooses the appropriate procedure for the current state. The system commits to a goal to achieve, trying a different procedure when one fails. The approach was later extended with AP [7], integrating the planner PRODIGY [92] producing RAP procedures.

PRS [42] is an action refinement and monitoring system. As in RAP, procedures specify skills to achieve goals or to react to particular events and observations. The system commits to goals and tries alternative skills when one fails. PRS relies on a database describing the world. It allows concurrent procedure execution and multi-threading. Some planning capabilities have been added to PRS [19] to anticipate paths leading to execution failure. PRS is used on various robotics platforms to trigger commands, e.g., through GenoM functional modules services [43].

Cypress [94] results from merging the planner Sipe with PRS. It uses a unified representation for planning operators and PRS skills, which was extended into the Continuous Planning and Execution Framework (CPEF) [75]. CPEF includes several components for managing and repairing plans. The system has been deployed for military mission planning and execution.

TCA [88] was initially developed to handle concurrent planning and execution. It provides a hierarchical task decomposition framework, with explicit temporal constraints to allow task synchronization. Planning is based on task decomposition. It is mostly focused on geometrical and motion planning (e.g., gait planning, footfall planning for the Ambler robot). The Task Definition Language (TDL) [89] extends TCA with a wide range of synchronization constructs between tasks. It focuses on task execution and relies on systems like Casper/Aspen for planning.

XFRM [4] illustrates another approach which uses transformation rules to modify hand written plans expressed in the Reactive Plan Language (RPL). Unlike the above systems, it explores a plan space, transforming the initial RPL relying on simulation and probabilities of possible outcomes. It replaces the currently executed plan on the fly if another one more adapted to the current situation is found. This approach evolved toward Structured Reactive Controllers (SRC) and Structure Reactive Plan (SRP) [3], but still retains the XFRM technique to perform planning using transformation rules on SRP. It has been deployed on a number of service robots at Technical University of Munich.

Most procedure-based approaches focus on the Refinement and Instantiation/Propagation functions of an acting system. XFRM proposes a form of plan repair in plan space taking into account the probabilities of outcomes, while TDL provides some synchronization mechanism between skills and commands. All skills used by these systems are hand written, sometimes in a formalism shared with the planner (e.g., in Cypress and TCA), but without consistency checking. The hand written skills map to the robot commands, except for XFRM where some can be transformed online.

4.2. Automata-based approaches

It seems quite natural to express an abstract action as a program whose I/O are the sensory-motor signals and commands. PLEXIL, a language for the execution of plans, illustrates such a representation where the user specifies nodes as computational abstractions [93]. It has been developed for space applications and used with several planners such as CASPER, but it remains fairly generic and flexible. A node can monitor events, execute commands or assign values to variables. It may refer hierarchically to a list of lower level nodes. Similarly to TDL, PLEXIL execution of nodes can be controlled by a number of constraints (start, end), guards (invariant) and conditions. PLEXIL remains very focused on the execution part of the generated plan, but it does not share knowledge with the planner.

SMACH, the ROS execution system, offers an automata-based approach [6]. The user provides a set of hierarchical automata whose nodes correspond to components of the robot and the particular state they are in. The global state of the robot corresponds to the joint state of all components. ROS actions, services and topics (i.e. monitoring of state variables) are associated with automata states, and according to their value, the execution proceeds to the next appropriate state. An interesting property of Automata-based approaches is that the Acting component knows exactly in which state the execution is, which eases the deployment of the monitoring function.

Automata-based approaches focus on the coordination function. They can also be used for refinement and instantiation/propagation. Models are hand-written. However, the underlying formalism possibly permits a validation function with automata checking tools.

4.3. Logic-based approaches

A few systems try to overcome the tedious engineering bottleneck of detailed hand specifications of skills by relying on logic inference mechanisms for extending high-level specifications. Typical examples are the Temporal Action Logic (TAL) approach (to which we'll come back in section 5) and the situation calculus approach. The latter is exemplified in GOLEX [37], an execution system for the GOLOG planner.

In GOLOG and GOLEX the user specifies respectively planning and acting knowledge in the situation calculus representation. GOLEX provides Prolog exec clauses which explicitly define the sequence of commands a robot has to execute. It also provides monitoring primitives to check the effects of executed actions. GOLEX executes the plan produced by GOLOG but even if the two systems rely on the same logic programming representation, they remain completely separated, limiting the planning/execution interleaving.

The Logic-based approaches provide refinement and instantiation/propagation functions. But their main focus is on the logical specification of the skills, and the possibility to validate and verify their models. TAL (see section 5) also offers time management.

4.4. CSP-based approaches

Most robotics applications require explicit time to handle durative actions, concurrency and synchronization with exogenous events and other robots, and with absolute time. Several approaches manage explicit time representations by extending state-based representations with durative actions (e.g., RPG, LPG, LAMA, TGP, VHPOP, Crickey). A few of them can manage concurrency and, in the case of COLIN [15], even linear continuous change. However, temporal planners that rely on time-lines, i.e., partially specified evolution of state variables over time, are more expressive and flexible in the integration of planning and acting than the standard extension of state-based planners. Their representation ingredients are:
- temporal primitives: points or intervals (tokens),
- state variables, possibly parametrized, e.g., position(object32), and rigid relations, e.g., connected(loc3, room5),
- persistence of the value of a state variable over time, and the discrete or continuous change of these values,
- temporal constraints: Allen interval relations or Simple Temporal Networks over time-points,
- atemporal constraints on the values and parameters of state-variables.

The initial values, expected events and goals are expressed in this representation as an unexplained trajectory, i.e., some required state-variable changes have to be accounted for by the planner through actions. These are instances of operators whose preconditions and effects are similarly expressed by time-lines and constraints.

Planning proceeds in the plan-space by detecting flaws, i.e., unexplained changes and possible inconsistencies, and repairing them through additional actions and constraints. It makes use of various heuristics, constraint propagation and backtrack mechanisms. It produces partially specified plans that have no remaining flaws but still contain non instantiated temporal and atemporal variables. This least commitment approach has several advantages, making it possible to adapt the acting system to the contingencies of the execution context.

The acting system proceeds like the planner by propagating execution constraints, including for observable but non controllable variables (e.g., ending time of actions). As for any CSP, consistency does not guarantee that all possible variable values are compatible. Hence the system keeps checking the consistency of the remaining plan, propagating new constraints and triggering a plan repair when needed. Special attention has to be paid to observed values of non controllable variables, depending on the requirements of strong, weak or dynamic controllability [72,16].

[Figure 3. Timeline chart. Mission Reactor timelines: Human, Robot_Nav, Robot_Manip, MicroWave, Refrigerator. Navigation Reactor timelines: Robot_Nav, Laser_model, Motion Planner, Locomotion. Manipulation Reactor timelines: Robot_Manip, Gripper, Right_Arm, Manipulation Planner.]
Fig. 3. Example of an IDEA/T-ReX Plan built within three reactors (Mission reactor, Navigation reactor, Manipulation reactor). Note the timelines Robot_Nav and Robot_Manip shared between reactors.

IxTeT [34] is an early temporal planner along this approach that was later extended with execution capabilities [62]. Planning and acting share the partial plan structure which is dynamically produced during the planning phase, and executed/repaired during the execution phase. An execution failure is seen as a new flaw in the partial plan space. The repair process minimizes as much as possible the consequences of a failure on the rest of the plan. Repair requires invalidating part of the current plan and/or relaxing some constraints. This is done by memorizing for each causal link the reason why it has been inserted (with or without a task). Causal links associated with tasks are not removed from the flawed plan. If the repair does not succeed in some allocated time, the current plan is discarded and the planning is restarted from the current situation. To fill the gap with robot commands and perceptions, PRS is used jointly with IxTeT for refinement and skills execution.

PS is another early timeline-based planning and acting system. As a component of the New Millennium Remote Agent [44,77,74], it controlled the Deep Space One (DS1) probe for a few days. Various components were used on DS1, PS and EXEC being the ones of interest for this section (FDIR will be presented in section 5). EXEC uses a procedure-based approach, as presented in section 4.1, without execution constraint propagation and reasoning on the current plan. In addition to its impressive application success in DS1, this system inspired the development of two interesting approaches: IDEA (then T-ReX) and RMPL.

IDEA [73] relies on a distributed approach where planning and acting use the same representation and differ only in their prediction horizon and allocated computing time to find a solution. The system is distributed into a hierarchy of reactors, each being in charge of a particular deliberation function: e.g. mission planning, robot navigation, robot manipulation, payload management, etc.; each has its own set of timelines, planning horizon, and a computing time quantum. Reactors use the Europa planner to perform the constraint-based temporal planning. Two reactors may share timelines, accessed and modified by both, possibly with priorities. The timelines sharing mechanism allows the propagation of the planning results down to commands and similarly the integration from percepts to observations. For example, in Figure 3 the timeline Robot_Nav in the mission reactor will specify on a shared timeline with the navigation reactor the sequence of locations the robot must reach. From the mission reactor, the timeline will be seen as an execution one, while for the navigation reactor it is a goal.

In principle the hierarchy of reactors should be able to express a continuum from planning operators down to commands. However, in practice, the shortest computing time quantum that could be achieved was in the order of a second, not fast enough for the command level. Hence, that system also had to resort to hand programmed skills. Furthermore, the specification and debugging of action models distributed over several reactors proved to be quite complex and tedious.

IDEA has been experimented with on several platforms such as the K9 and Gromit rovers [26]. It led to the development, along a similar approach, of a system called T-ReX, deployed at MBARI for controlling AUVs [81]. T-ReX simplifies some of IDEA's overly ambitious goals. For example in T-ReX, reactors are organized in such a way that constraint propagation is guaranteed to reach a fixed point (which was not the case in IDEA). T-ReX also tries to be planner independent and has been used jointly with APSI [30], yet most implementations use Europa [28].

[Figure 4. Left: hierarchical automata for the Robot Arm and Object Localization components, with transition probabilities. Right: control program PickUpObject() using the do-watching, parallel and when-donext operators.]
Fig. 4. Example of an RMPL automata (system model on the left) and program (control model on the right). Note the non nominal outcomes (with probabilities) of actions in the system model, and the coordination operators in the control model.

RMPL (for Reactive Model-based Programming Language) [41], another spinoff of the DS1 experiment, proposes a common representation for planning, acting and monitoring. It combines a system model with a control model (Figure 4). The former uses hierarchical automata to specify nominal as well as failure state transitions, together with their constraints. The latter uses reactive program-

formed for execution taking into account online temporal flexibility. Altogether, RMPL offers an interesting and original integration of state-based models, procedural control and temporal reasoning used in satellite control applications.

CSP approaches are very convenient for handling time. They provide refinement, instantiation and, for some of them, plan repair (IxTeT, IDEA, T-ReX and RMPL). They also rely on hand written models of skills which are handled by various CSP algorithms (STN, constraints filtering). RMPL also manages nondeterminism by modeling non nominal transitions of the system.

4.5. Stochastic-based approaches

The classical framework of Markov Decision Processes (MDPs) offers an appealing approach for integrating planning and acting. It naturally handles probabilistic effects and it provides policies, i.e., universal plans, defined everywhere. The execution of a policy is a very simple loop: (i) observe current state, then (ii) apply corresponding action. It can even be extended, in principle, to partially observable systems (POMDP), as illustrated in the Pearl system of [79]. That framework works fine as long as the state space, together with its cost and probability parameters, can be entirely acquired and made explicit, and, for POMDPs, remains of small size.⁴ However, most deliberation problems in robotics do not allow for an explicit enumeration of their state space, and hence cannot afford a universal plan. Fortunately, most of these problems are usually focused on reaching a goal from some initial state s0. Factorized and hierarchical representations of MDPs [9], together with heuristic search algorithms for Stochastic Shortest Path (SSP) problems [65], offer a promising perspective for using effectively stochastic representations in deliberate action.

SSP problems focus on partial policies, closed for the initial state s0 (i.e., defined on all states reachable from s0), terminating at goal states. They generalize to And/Or graphs classical path search
ming constructs (including primitives to address
problems. For very large implicit search spaces,
constraint-based monitoring, as in Esterel [18]).
based on sparse models (few applicable actions
Moreover, RMPL programs are transformed into
Temporal Plan Networks (TPN) [95]. The result 4 A POMDP is an MDP on the belief space, whose size is
of each RMPL program is a partial temporal plan exponential in that of the state space. The latter is already
which is analyzed by removing flaws and trans- of size kn , for a domain with n state variables.
9

Functions
Approaches Systems Refinement Instantiation Time handling Nondeterminism Repair
RAP X X
PRS X X
Procedure Cypress/CPEF X X
TCA/TDL X X X
XFRM/RPL/SRP X X X X
Automata PLEXIL X X X
Graph SMACH X X X
Logic Golex X X
IxTeT X X X X
RMPL X X X X X
CSP
IDEA/T-ReX X X X X
Casper X X X X
MCP X
Pearl X
Stochastic
Robel X
Ressac X

Table 1
Illustrative examples of acting approaches and functions.

per state, few nondeterministic effects per applicable action, including deterministic actions), a significant scaling up with respect to classical dynamic programming methods can be achieved with heuristics and sampling techniques [65].

Most heuristic search algorithms for SSPs are based on a general two-step Find&Revise framework: (i) Find an unsolved state s in the successors of s0 with the current policy, and (ii) Revise the estimated value of s along its current best action (with the so-called Bellman update). A state s is solved when the best (or a good) goal-reaching policy from s has already been found. That framework can be instantiated in different ways, e.g.,
- with a best-first search, as in AO*, LAO* and their extensions (ILAO*, BLAO*, RLAO*, etc.)
- with a depth-first iterative deepening search, as in LDFS
- with a random walk along the current greedy policy, as in RTDP, LRTDP and their extensions (BRTDP, FRTDP, SRTDP, etc.)

These algorithms assume an SSP problem with a proper policy closed for s0 (i.e., one that reaches a goal state from s0 with probability 1) where every improper policy has infinite cost. A generalization relaxes this last assumption and allows seeking a policy that maximizes the probability of reaching a goal state, a very useful and desirable criterion [52]. Other issues, such as dead-ends (states from which it is not possible to reach a goal), have to be taken care of, in particular in critical domains [50].

Heuristic search algorithms in SSPs are more scalable than dynamic programming techniques for MDP planning, but they still cannot address large domains, with hundreds of state variables, unless these domains are carefully engineered and decomposed. Even a solution policy for such problems can be of a size so large as to make its enumeration and memorization challenging to current techniques. However, such a solution contains many states of very low probability that would almost never be visited. Various sampling and approximation techniques offer promising alternatives to further scale up probabilistic planning.

Among these approaches, determinization techniques transform each nondeterministic action into several deterministic ones (the most likely or all possible ones), then plan deterministically with these actions, online and/or offline. For example, the RFF planner [90] generates an initial deterministic plan, then it considers a fringe state along a nondeterministic branch of that plan: if the probability of reaching that state is above a threshold, it extends the plan with a deterministic path to a goal or to an already solved state.

Similar ideas are developed in sampling approaches. Among their advantages is the capability to work without a priori estimates of the probability distributions of the domain, as long as the sampling is drawn from these same distributions. Bounds on the approximation quality and the complexity of the search have been obtained, with good

results on various extensions of algorithms such as LRTDP and UCT, e.g. [49,11,51].

Although MDPs are often used in robotics at the sensory-motor level, in particular within reinforcement learning approaches, SSP techniques are not as widely disseminated at the deliberative planning and acting level. Contributions are mostly on navigation problems, e.g., the RESSAC system [91]. On sparsely nondeterministic domains, where most actions are deterministic but a few are probabilistic, the approach called MCP [64] uses deterministic planning to reduce a large problem into a compressed MDP. It has been tested on a simulated multi-robot navigation problem.

Finally, let us mention promising heterogeneous approaches where task planning is deterministic and SSP techniques are used for the choice of the best skill refining an action, given the current context. An illustration is given by the ROBEL system [71] with a receding horizon control.

5. Monitoring

The monitoring function is in charge of (i) detecting discrepancies between predictions and observations, (ii) classifying these discrepancies, and (iii) recovering from them. At a minimum, monitoring has to check the planner's predictions supporting the current plan. It may also have to monitor predictions made when refining plan steps into skills and commands, as well as conditions relevant for the current mission that are left implicit in planning and refinement steps. The latter are, for example, how well calibrated the robot's sensors are, or how charged its batteries are.

Although monitoring functions are clearly distinct from action refinement and control functions, in many cases the two are implemented by the same process with a single representation. For example, the early Planex [25] performs a very simple monitoring through the iterated computation of the current active kernel of a triangle table. In most procedure-based systems there are PRS, RAP, ACT or TCA constructs that handle some monitoring functions. However, diagnosis and recovery functions in such systems are usually limited and ad hoc.

Diagnosis and recovery are critical in applications like the DS1 probe, for which FDIR, a comprehensive monitoring system, has been developed [74]. The spacecraft is modeled as a fine-grained collection of components, e.g., a thrust valve. Each component is described by a graph where nodes are the normal functioning states or failure states of that component. Edges are either commands or exogenous transition failures. The dynamics of each component is constrained such that at any time exactly one nominal transition is enabled but zero or more failure transitions are possible. Models of all components are compositionally assembled into a system allowing for concurrent transitions compatible with constraints and preconditions. The entire model is compiled into a temporal propositional logic formula which is queried through a solver. Two query modes are used: (i) diagnosis, i.e., find the most likely transitions consistent with the observations, and (ii) recovery, i.e., find minimal cost commands that restore the system into a nominal state. This approach has been demonstrated as being effective for a spacecraft. However, it is quite specific to cases where monitoring can be focused on the robot itself, not on the environment, and where reliability is a critical design issue addressed through redundant components permitting complex diagnosis and allowing for recovery actions. It can be qualified as a robust proprioceptive monitoring approach. It is unclear how it could handle environment discrepancies, e.g., a service robot failing to open a door.

Other robotics monitoring systems are surveyed in [78] and characterized into three classes: analytical approaches, data-driven approaches and knowledge-based approaches. The first rely on planning and acting models, such as those mentioned above, but also on control theory models and filtering techniques for low-level action monitoring. Data-driven approaches rely on statistical clustering methods for analyzing training data of normal and failure cases, and on pattern recognition techniques for diagnosis. Knowledge-based approaches exploit specific knowledge in different representations (rules, chronicles, neural nets, etc.), which is given or acquired for the purpose of monitoring and diagnosis. This classification of almost 90 different contributions to Monitoring in robotics is inspired from the field of industrial control, where Monitoring is a well studied issue. However, the relationship between Monitoring, Planning and Acting was not a major concern in the surveyed contributions.

That relationship is explored in [29] on the basis of plan invariants. Several authors have synthesized state-reachability conditions, called invariants, from the usual planning domain specifications. Invariants permit a focus and a speed-up of planning algorithms, e.g., [84,5]. Going further, [29] proposes extended planning problems, where the specifications of planning operators are augmented by logical formulas stating invariant conditions that have to hold during the execution of a plan. Indeed, planning operators and extended invariants are two distinct knowledge sources that have to be modeled and specified distinctly. These extended invariants are used to monitor the execution of a plan. They make it possible to detect infeasible actions earlier than their planned execution, or violated effects of actions after their successful achievement. Furthermore, extended invariants make it possible to monitor effects of exogenous events and other conditions not influenced by the robot. However, this approach assumes complete sensing and a perfect observation function. No empirical evaluation has been reported.

Along the same line, the approach of [24] has been tested on a simple office delivery robot. It relies on a logic-based representation of a dynamic environment using the fluent calculus [86]. Actions are described by normal and abnormal preconditions. The former are the usual preconditions. The latter are assumed away by the planner as default; they are used as a possible explanation by the monitor in case of failure. E.g., delivery of an object to a person may fail with abnormal preconditions of the object being lost or the person not being traceable. Similarly, abnormal effects are specified. Discrepancies between expectations and observations are handled by a prioritized non-monotonic default logic, which generates explanations ranked using relative likelihood. That system handles incomplete world models and observation updates performed either while acting or on demand from the monitoring system through specific sensory actions.

The idea of using extended logical specifications for Planning and Monitoring has been explored by several other authors in different settings. The interesting approach of [8] uses domain knowledge expressed in description logic to derive expectations of the effects of actions in a plan to be monitored during execution. An interesting variant is illustrated in [60] for a hybrid architecture, combining a behavior-based reactive control with model-based deliberation capabilities. At each cycle, concurrent active behaviors are combined into low-level controls. At a higher level, properties of the robot behaviors are modeled using Linear Temporal Logic (LTL). LTL formulas express correctness statements, execution progress conditions, as well as goals. A trace of the robot execution, observed or predicted at planning time, is incrementally checked for satisfied and violated LTL formulas. A delayed formula progression technique evaluates at each state the set of pending formulas. It returns the set of formulas that have to be satisfied by any remaining trace. The same technique is used both for Planning (with additional operator models and some search mechanism) and for Monitoring. The approach has been tested on indoor navigation tasks with robots running the Saphira architecture [54].

A very comprehensive and coherent integration of Monitoring to Planning and Acting is illustrated in the approach used in the Witas project [21]. That system demonstrates a complex Planning, Acting and Monitoring architecture embedded on autonomous UAVs. It has been demonstrated in surveillance and rescue missions. Planning relies on TALplanner [58], a forward chaining planner using the Temporal Action Logics (TAL) formalism for specifying planning operators and domain knowledge. Formal specifications of global constraints and dependencies, as well as of operator models and search recommendations, are used by the planner to control and prune the search. These specifications are also used to automatically generate monitoring formulas from the model of each operator, and from the complete plan, e.g., constraints on the persistence of causal links. This automated synthesis of monitoring formulas is not systematic but rather selective, on the basis of hand-programmed conditions of what needs to be monitored and what does not. In addition to the planning domain knowledge, extra monitoring formulas are also specified in the same highly expressive temporal logic formalism.

The TAL-based system produces plans with concurrent and durative actions together with conditions to be monitored during execution. These conditions are evaluated on-line, at the right moment, using formula progression techniques. When actions do not achieve their desired results, or when some other conditions fail, a recovery through a plan repair phase is triggered. Acting is performed by Task Procedures, which provide

some level of action refinement through classical concurrent procedural execution, down to elementary commands. Altogether, this system proposes a coherent continuum from Planning to Acting and Monitoring. The only component which does not seem to rely on formal specifications is the Acting function, which uses hand-written Task Procedures. However, the lack of flexible action refinement is compensated for by specifying planning operators (and hence plan steps) at a low level of granularity. For example, there are five different fly operators in the UAV domain corresponding to different contexts, specifying context-specific control and monitoring conditions, and being mapped to different Task Procedures.

6. Perceiving

Situated deliberation relies on data reflecting the current state of the world. Beyond sensing, perceiving combines bottom-up processes from sensors to interpreted data, with top-down focus of attention, search and planning for information gathering actions. Perceiving is performed at:
- the signal level, e.g., signals needed in control loops,
- the state level: features of the environment and the robot and their link to facts and relations characterizing the state of the world, and
- the history level, i.e., sequences or trajectories of events, actions and situations relevant for the robot's mission.

The signal level is usually dealt with through models and techniques of control theory. Visual servoing approaches [13] for tracking or handling objects and moving targets offer a good example of mature techniques that can be considered as tightly integrated into the basic robot functions. The same holds for simultaneous localization and mapping techniques, a very active and well advanced field in robotics, to which numerous publications have been devoted, e.g., [2,69]. These geometric and probabilistic techniques, enriched with topological and semantic data, as for example in [56,57,53], may involve deliberation and can be quite effective.

Beyond the above areas, however, methods for designing perceiving functions remain today a limiting factor in autonomous robotics, a hard and challenging issue to which surprisingly not enough efforts have been devoted. The building blocks for such a function can be taken from the fields of signal processing, pattern recognition and image analysis, which offer a long history of rich developments. However, the integration of these techniques within the requirements of autonomy and deliberation remains a bottleneck.

The anchoring problem provides an excellent illustration of the complexity of integrating pattern recognition methods with autonomous deliberate action. As defined in [17], anchoring is the problem of creating and maintaining over time a correspondence between symbols and sensor data that refer to the same physical object. Planning and other deliberation functions reason on objects through symbolic attributes. It is essential that the symbolic description and the sensing data agree about the objects they are referring to. Anchoring concerns specific physical objects. It can be seen as a particular case of the symbol grounding problem, which deals with broad categories, e.g., any "chair", as opposed to that particular "chair-2". Anchoring an object of interest can be achieved by establishing and keeping an internal link, called an anchor, between the perceptual system and the symbol system, together with a signature that gives an estimate of some of the attributes of the object it refers to. The anchor is based on a model that relates relations and attributes to perceptual features and their possible values.

Establishing an anchor corresponds to a pattern recognition problem, with the challenges of handling uncertainty in sensor data and ambiguity in models, dealt with for example through maintaining multiple hypotheses. Ambiguous anchors are handled in [47] as a planning problem in a space of belief states, where actions have causal effects that change object properties, and observation effects that partition a belief state into several new hypotheses. There is also the issue of which anchors to establish, when and how, in a bottom-up or a top-down process. Anchors in principle are needed for all objects relevant to the robot mission. These objects can only be defined by intension (not extensionally), in a context-dependent way. There is also the issue of tracking anchors, i.e., taking into account object properties that persist across time or evolve in a predictable way. Predictions are used to check that new observations are consistent with the anchor and that the updated anchor still satisfies the object properties. Finally, reacquiring an

anchor when an object is re-observed after some time is a mixture of finding and tracking; if the object moves it can be quite complex to account consistently for its behavior.

The dynamics of the environment is a strong source of complexity, e.g., as we just saw in the anchor tracking and re-acquiring problems. This dynamics is itself what needs to be interpreted for the history level: what an observed sequence of changes means, what can be predicted next from past evolutions. In many aspects, research at this history level is more recent. It relates to acting in and understanding environments with rich semantics, in particular involving human and man-robot interactions, e.g., in applications such as robot programming by demonstration [1] or video surveillance [40,31].

The survey of [55] covers an extensive list of contributions to action and plan recognition. These are focused on (i) human action recognition, (ii) general activity recognition, and (iii) plan recognition. The understanding is that the first two sets of processing provide input to the last. Most surveyed approaches draw from two sources of techniques:
- Signal processing: Kalman and other filtering techniques, Markov Chains, Hidden Markov Models. These techniques have been successfully used in particular for movement tracking and gesture recognition [97,67].
- Plan recognition: deterministic [48,83] or probabilistic [32] planning techniques, as well as parsing techniques [82].

Most plan recognition approaches assume that they get as input a sequence of symbolic actions. This assumption is reasonable for story understanding and document processing applications, but it does not hold in robotics. Usually actions are sensed only through their effects on the environment.

The Chronicle recognition techniques [22,33] are very relevant at the history level of the Perceiving function. A chronicle recognition system is able to survey a stream of observed events and recognize, on the fly, instances of modeled chronicles that match this stream. A chronicle is a model for a collection of possible scenarios. It describes patterns of observed events, i.e., changes in the value of state variables, persistence assertions, non-occurrence assertions and temporal constraints between these assertions. A ground instance of a chronicle can be formalized as a nondeterministic timed automaton. Chronicles are similar to temporal planning operators. The recognition is efficiently performed by incrementally maintaining a hypothesis tree for each partially recognized chronicle instance. These trees are updated or pruned as new events are observed or as time advances. Recent developments have added hierarchization and focus on rare events with extended performances [23].

Very few systems have been proposed for designing and implementing a complete Perceiving function, integrating the three levels mentioned earlier of signal, state and history views. DyKnow [39] stands as a clear exception, noteworthy for its comprehensive and coherent approach. This system addresses several requirements: the integration of different sources of information, of hybrid symbolic and numeric data, at different levels of abstraction, with bottom-up and top-down processing; it manages uncertainty, reasons on explicit models of its content and is able to dynamically reconfigure its functions.

These challenging requirements are addressed as a data-flow based publish-and-subscribe middleware architecture. DyKnow views the environment as consisting of objects described by a collection of features. A stream is a set of time-stamped samples representing observations or estimations of the value of a feature. It is associated with a formally specified policy giving requirements on its content such as: frequency of updates, delays and amplitude differences between two successive samples, or how to handle missing values.

A stream is generated by a process which may offer several stream generators synthesizing streams according to specific policies. Processes have streams as input and output. They are of different types, such as primitive processes, which are directly connected to sensors and databases; refinement processes, which subscribe to input streams and provide as output more combined features, e.g., a signal filter or a position estimator fusing several raw sensing sources and filtered data; or configuration processes, which make it possible to dynamically reconfigure the system by initiating and removing processes and streams, as required by the task and the context, e.g., to track a newly detected target.

DyKnow uses a specific Knowledge Processing Language (KPL) to specify processes, streams and corresponding policies. KPL makes it possible to refer to objects, features, streams, processes, and time, together with their domains, constraints and relationships in the processing network. Formal specifications in KPL define a symbol for each computational unit, but they do not define the actual function associated with this symbol. Their semantics is taken with respect to the interpretation of the processing functions used. They make it possible to describe and enforce stream policies. They also support a number of essential functions, e.g., synchronize states from separate unsynchronized streams; incrementally evaluate temporal logic formulas over states; recognize objects and build up anchors to classify and update interpretation as new information becomes available; or follow histories of spatio-temporal events and recognize occurrences of specified chronicle models.

DyKnow has been integrated with the TALplanner system [21] discussed earlier. This system is queried by planning, acting and Monitoring functions to acquire information about the current contextual state of the world. It provides appropriate and highly valuable focus of attention mechanisms, linking monitoring or control formulas to streams. It has been deployed within complex UAV rescue and traffic surveillance demonstrations.

7. Goal reasoning

Fig. 5. GDA Model with its different components, from [68].

Goal reasoning is mostly concerned with the management of high-level goals and the global mission. Its main role is to manage the set of objectives the system wants to achieve, maintain or supervise. It may react to new goals given by the user or to goal failures reported by acting and monitoring. In several implementations, this function is embedded in the planning or acting functions. It clearly shares similarities with the monitoring function. Still, Goal Reasoning is not akin to planning as it does not really produce plans, but merely establishes new goals and manages existing ones, which are then passed to the planner. Similarly to monitoring, it continuously checks for unexpected events or situations. These are analyzed to assess current goals and possibly establish new goals. Some systems have a dedicated component to perform this high-level function. For example, Goal Driven Autonomy (GDA) approaches model and reason on various and sometimes conflicting goals an agent may have to consider. GDA reasoning focuses on goal generation and management. In [68], the authors instantiate the GDA model in the ARTUE agent, which appropriately responds to unexpected events in complex simulations and game environments. As shown in Figure 5, their system includes a classical planner; when it executes a plan, it detects discrepancies (Discrepancy Detector), generates an explanation, may produce a new goal (Goal Formulator) and finally manages the goals currently under consideration by the system. The Goal Manager can use different approaches to decide which goal to keep (e.g., using decision theory to balance conflicting goals).

Similarly, in [80] the authors point out that planning should be considered from a broader point of view and not limited to the sole activity of generating an abstract plan with restrictive assumptions introduced to scope the field and make the problem more tractable. They propose the Plan Management Agent (PMA) which, beyond plan generation, provides extra plan reasoning capabilities. The resulting PMA system heavily relies on temporal and causal reasoning, and is able to plan with partial commitments, making it possible to further refine a plan when needed.

Goal reasoning has been deployed in a number of real experiments, notably in the DS1 New Millennium Remote Agent experiment [74] and in the CPEF framework [75]. Yet, overall, the goal reasoning function is not often developed. It is nevertheless needed for complex and large systems managing various long term objectives while dynamically taking into account new events which may trigger new goals.
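As a rough illustration of the GDA cycle described in this section (discrepancy detection, explanation, goal formulation, goal management), the following Python sketch runs one step of such a loop. It is a minimal sketch under stated assumptions, not the ARTUE implementation of [68]: all names (Goal, GoalManager, gda_step, and the rule and response tables) are hypothetical, and the priority-based selection merely stands in for the decision-theoretic arbitration mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    name: str
    priority: int  # hypothetical scalar used to arbitrate between goals


def detect_discrepancies(expected, observed):
    """Return the state variables whose observed value differs from the prediction."""
    return {var: (val, observed.get(var))
            for var, val in expected.items()
            if observed.get(var) != val}


def explain(discrepancies, rules):
    """Look up a hand-written explanation for each discrepant variable (assumed rule table)."""
    return [rules[var] for var in discrepancies if var in rules]


def formulate_goals(explanations, responses):
    """Map explanations to new goals; explanations without a response yield no goal."""
    return [responses[e] for e in explanations if e in responses]


class GoalManager:
    """Keeps the goals under consideration and picks the one to pursue."""

    def __init__(self, goals=()):
        self.pending = list(goals)

    def add(self, goal):
        if goal not in self.pending:
            self.pending.append(goal)

    def select(self):
        # Stand-in for a richer arbitration scheme: highest priority wins.
        return max(self.pending, key=lambda g: g.priority) if self.pending else None


def gda_step(expected, observed, rules, responses, manager):
    """One cycle: detect, explain, formulate, then let the manager choose a goal."""
    discrepancies = detect_discrepancies(expected, observed)
    for goal in formulate_goals(explain(discrepancies, rules), responses):
        manager.add(goal)
    return manager.select()


# Usage: a delivery robot whose battery turns out to be low.
manager = GoalManager([Goal("deliver-package", priority=1)])
expected = {"door": "open", "battery": "ok"}
observed = {"door": "open", "battery": "low"}
rules = {"battery": "battery-drained"}
responses = {"battery-drained": Goal("recharge", priority=5)}
current = gda_step(expected, observed, rules, responses, manager)
print(current.name)  # recharge
```

The selected goal would then be passed to the planner, in line with the remark above that goal reasoning establishes and manages goals rather than producing plans itself.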

8. Integration and Architectures

Beyond the integration of various devices (mechanical, electrical, electronic, etc.), robots are complex systems including multiple sensors, actuators and information processing modules. They embed online processing, with various real-time requirements, from low-level servo loops up to deliberation functions which confer the necessary autonomy and robustness for the robot to face the variability of tasks and environment. The software integration of all these components must rely on an architecture and supporting tools which specify how these components communicate, share resources and CPUs, and how they are implemented on the host computer(s) and operating systems.

Various architectures have been proposed to tackle this task, among which the following:
- Reactive architectures, e.g. the subsumption architecture [10], are composed of modules which close the loop between inputs (e.g. sensors) and outputs (e.g. effectors) with an internal automaton. These modules can be hierarchically organized and can inhibit other modules or weigh on their activity. They do not rely on any particular model of the world or plans to achieve and do not support any explicit deliberative activities. Nevertheless, there are a number of works, e.g. [59], which rely on them to implement deliberative functions.
- Hierarchical architectures are probably the most widely used in robotics [43,76,20]. They propose an organization of the software along layers (two or three) with different temporal requirements and abstraction levels. Often, there is a functional layer containing the low-level sensor-effector processing modules, and a decision layer containing some of the deliberation functions presented here (e.g. planning, acting, monitoring, etc.).

Beyond architecture paradigms, it is interesting to note that some robotics systems have achieved an impressive level of integration of numerous deliberation functions on real platforms. The Linköping UAV project [20] provides planning, acting, perception and monitoring, with formal representations all over these components. The NMRA on the DS1 probe [74] also proposed planning, acting, and FDIR onboard. IDEA and T-ReX, providing planning and acting, have been used respectively on a robot [26] and an AUV [66].

9. Conclusion

Autonomous robots facing a variety of open environments and a diversity of tasks cannot rely on the decision making capabilities of a human designer or teleoperator. To achieve their missions, they have to exhibit complex reasoning capabilities required to understand their environment and current context, and to act deliberately, in a purposeful, intentional manner. In this paper, we have referred to these reasoning capabilities as deliberation functions, closely interconnected within a complex architecture. We have presented an overview of the state of the art for some of them.

For the purpose of this overview, we found it clarifying to distinguish these functions with respect to their main role and computational requirements: the perceiving, goal reasoning, planning, acting and monitoring functions. But let us insist again: the border line between them is not crisp; the rationale for their implementation within an operational architecture has to take into account numerous requirements, in particular a hierarchy of closed loops, from the most dynamic inner loop, closest to the sensory-motor signals and commands, to the most offline outer loop.

Consider for example the relationship between
Teleo-reactive architectures [26,66] are more re- planning and acting. We argued that acting can-
cent. They propose an integrated planning not be reduced to execution control, that is the
acting paradigm which is implemented at dif- triggering of commands mapped to planned ac-
ferent levels, from deliberation down to reac- tions. There is a need for significant deliberation
tive functions, using different planningacting to take place between what is planned and the
horizons and time quantum. Each planneractor commands achieving it (Fig. 2). This acting de-
is responsible for ensuring the consistency of liberation may even rely on the same or on dif-
a constraint network (temporal and atempo- ferent planning techniques as those of the plan-
ral) whose state variables can be shared with ner, but it has to take into account different state
other plannersactors to provide a communica- spaces, action spaces and event spaces than those
tion mechanism. of the planner. However, if we insisted to distin-
16

guish these two levels, there is no reason to believe Acknowledgements


that just two levels is the right number. There can
be a hierarchy of planningacting levels, each re- We thank the editors of this special issue and
fining a task planned further up into more concrete the reviewers for their highly valuable feedback.
actions, adapted to the acting context and fore-
seen events. It would be convenient and elegant to
address this hierarchy within a homogeneous ap- References
proach, e.g., HTN or AHP. But we strongly sus-
pect that conflicting requirements, e.g., for han- [1] B. Argall, S. Chernova, M. Veloso, and B. Browning. A
dling uncertainty and domain specific representa- survey of robot learning from demonstration. Robotics
tions, favor a variety of representations and ap- and Autonomous Systems, 57(5):469483, 2009.

proaches. [2] T. Bailey and H. Durrant-Whyte. Simultaneous local-


ization and mapping (SLAM): part II. IEEE Robotics
Many other open issues, briefly referred to in and Automation Magazine, 13(3):108 117, 2006.
this paper, give rise to numerous scientific chal- [3] M. Beetz. Structured reactive controllers: controlling
lenges. The relationship from sensing and acting to robots that perform everyday activity. In Proceed-
perceiving is clearly one of these bottleneck prob- ings of the annual conference on Autonomous Agents,
lems to which more investigation efforts need to pages 228235. ACM, 1999.
be devoted. Acting in an open world requires go- [4] M. Beetz and D. McDermott. Improving Robot Plans
During Their Execution. In Proc. AIPS, 1994.
ing from anchoring to symbol grounding, from ob-
[5] S. Bernardini and D. Smith. Finding mutual exclusion
ject recognition to categorization. A development invariants in temporal planning domains. In Seventh
perspective is to make robots query when needed International Workshop on Planning and Scheduling
and benefit from the growing wealth of knowledge for Space (IWPSS), 2011.
available over the web, within ontologies of tex- [6] J. Bohren, R. Rusu, E. Jones, E. Marder-Eppstein,
tual and symbolic relations, as well as of images, C. Pantofaru, M. Wise, L. Mosenlechner, W. Meeussen,
and S. Holzer. Towards autonomous robotic butlers:
graphical and geometric knowledge. Lessons learned with the PR2. In Proc. ICRA, pages
Deliberation functions involve several other open 55685575, 2011.
issues that we have not discussed in this overview, [7] R. Bonasso, R. Firby, E. Gat, D. Kortenkamp,
among which the noteworthy problems of: D. Miller, and M. Slack. Experiences with an Archi-
metareasoning: trading off deliberation time for tecture for Intelligent, Reactive Agents. Journal of
Experimental and Theoretical Artificial Intelligence,
acting time, given how critical and/or urgent are 9(2/3):237256, April 1997.
the context and tasks at hand; [8] A. Bouguerra, L. Karlsson, and A. Saffiotti. Seman-
interaction and social behavior that impact all tic Knowledge-Based Execution Monitoring for Mobile
functions discussed here, from the perceiving re- Robots. In Proc. ICRA, pages 36933698, 2007.
quirements of a multi-modal dialogue, to the [9] C. Boutilier, T. Dean, and S. Hanks. Decision-
Theoretic Planning: Structural Assumptions and Com-
planning and acting at the levels of task shar-
putational Leverage. Journal of AI Research, 11:194,
ing and plan understanding for multi-robots and May 1999.
man-robot interaction; [10] R. Brooks. A robust layered control system for a mo-
learning which is the only hope for building the bile robot. IEEE Journal of Robotics and Automation,
models required by deliberation functions and 2:1423, 1986.
which has a strong impact on the architecture [11] L. Busoniu, R. Munos, B. De Schutter, and
R. Babuska. Optimistic planning for sparsely stochas-
that would permit to integrate these functions tic systems. IEEE Symposium on Adaptive Dynamic
and allow them to adapt to tasks and environ- Programming And Reinforcement Learning, pages 48
ments the robot is facing. 55, 2011.
We believe that the AIRobotics synergy is be- [12] S. Cambon, R. Alami, and F. Gravot. A hybrid
coming richer and more complex, and it remains approach to intricate motion, manipulation and task
planning. International Journal of Robotics Research,
today as fruitful for both fields as it used to be 28(1):104126, 2009.
in their early beginning. We do hope that this [13] F. Chaumette and S. Hutchinson. Visual servo con-
overview will attract more practitioners to the trol, part ii: Advanced approaches. IEEE Robotics and
challenging problems of their intersection. Automation Magazine, 14(1):109118, 2007.
[14] F. Chaumette and S. Hutchinson. Visual servoing and visual tracking. In B. Siciliano and O. Khatib, editors, Springer Handbook of Robotics, pages 563–583. Springer, 2008.
[15] A. J. Coles, A. Coles, M. Fox, and D. Long. COLIN: Planning with Continuous Linear Numeric Change. Journal of AI Research, 2012.
[16] P. Conrad, J. Shah, and B. Williams. Flexible execution of plans with choice. In Proceedings of ICAPS, 2009.
[17] S. Coradeschi and A. Saffiotti. An introduction to the anchoring problem. Robotics and Autonomous Systems, 43(2-3):85–96, 2003.
[18] E. Coste-Maniere, B. Espiau, and E. Rutten. A task-level robot programming language and its reactive execution. In Proc. ICRA, 1992.
[19] O. Despouys and F. Ingrand. Propice-Plan: Toward a Unified Framework for Planning and Execution. In European Workshop on Planning, 1999.
[20] P. Doherty, J. Kvarnström, and F. Heintz. A temporal logic-based planning and execution monitoring framework for unmanned aircraft systems. Autonomous Agents and Multi-Agent Systems, 19(3), 2009.
[21] P. Doherty, J. Kvarnström, and F. Heintz. A temporal logic-based planning and execution monitoring framework for unmanned aircraft systems. Autonomous Agents and Multi-Agent Systems, 19(3):332–377, 2009.
[22] C. Dousson, P. Gaborit, and M. Ghallab. Situation recognition: Representation and algorithms. Proc. IJCAI, 13:166–166, 1993.
[23] C. Dousson and P. Le Maigat. Chronicle recognition improvement using temporal focusing and hierarchization. In Proc. IJCAI, pages 324–329, 2007.
[24] M. Fichtner, A. Großmann, and M. Thielscher. Intelligent execution monitoring in dynamic environments. Fundamenta Informaticae, 57(2-4):371–392, 2003.
[25] R. Fikes. Monitored Execution of Robot Plans Produced by STRIPS. In IFIP Congress, Ljubljana, Yugoslavia, August 1971.
[26] A. Finzi, F. Ingrand, and N. Muscettola. Model-based executive control through reactive planning for autonomous rovers. In Proc. IROS, volume 1, pages 879–884, 2004.
[27] R. Firby. An Investigation into Reactive Planning in Complex Domains. In Proc. AAAI, pages 1–5, 1987.
[28] J. Frank and A. Jónsson. Constraint-based attribute and interval planning. Constraints, 8(4):339–364, 2003.
[29] G. Fraser, G. Steinbauer, and F. Wotawa. Plan execution in dynamic environments. In Innovations in Applied Artificial Intelligence, volume 3533 of LNCS, pages 208–217. Springer, 2005.
[30] S. Fratini, A. Cesta, R. De Benedictis, A. Orlandini, and R. Rasconi. Apsi-based deliberation in goal oriented autonomous controllers. In 11th Symposium on Advanced Space Technologies in Robotics and Automation (ASTRA), 2011.
[31] F. Fusier, V. Valentin, F. Brémond, M. Thonnat, M. Borg, D. Thirde, and J. Ferryman. Video understanding for complex activity recognition. Machine Vision and Applications, 18:167–188, 2007.
[32] C. Geib and R. Goldman. A probabilistic plan recognition algorithm based on plan tree grammars. Artificial Intelligence, 173:1101–1132, 2009.
[33] M. Ghallab. On chronicles: Representation, on-line recognition and learning. In International Conference on Knowledge Representation and Reasoning, pages 597–606, 1996.
[34] M. Ghallab and A. Mounir Alaoui. Managing efficiently temporal relations through indexed spanning trees. In Proc. IJCAI, pages 1297–1303, 1989.
[35] M. Ghallab, D. Nau, and P. Traverso. Automated Planning: Theory and Practice. Morgan Kaufmann, October 2004.
[36] G. Giralt, R. Sobek, and R. Chatila. A multi-level planning and navigation system for a mobile robot: a first approach to HILARE. In Proc. IJCAI, pages 335–337, 1979.
[37] D. Hahnel, W. Burgard, and G. Lakemeyer. GOLEX – bridging the gap between logic (GOLOG) and a real robot. In KI-98: Advances in Artificial Intelligence, pages 165–176. Springer, 1998.
[38] M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pages 220–229, 1998.
[39] F. Heintz, J. Kvarnström, and P. Doherty. Bridging the sense-reasoning gap: DyKnow-Stream-based middleware for knowledge processing. Advanced Engineering Informatics, 24(1):14–26, 2010.
[40] S. Hongeng, R. Nevatia, and F. Brémond. Video-based event recognition: activity representation and probabilistic recognition methods. Computer Vision and Image Understanding, 96(2):129–162, 2004.
[41] M. Ingham, R. Ragno, and B. Williams. A Reactive Model-based Programming Language for Robotic Space Explorers. In International Symposium on Artificial Intelligence, Robotics and Automation for Space, 2001.
[42] F. Ingrand, R. Chatila, R. Alami, and F. Robert. PRS: A High Level Supervision and Control Language for Autonomous Mobile Robots. In IEEE International Conference on Robotics and Automation, 1996.
[43] F. Ingrand, S. Lacroix, S. Lemai-Chenevier, and F. Py. Decisional Autonomy of Planetary Rovers. Journal of Field Robotics, 24(7):559–580, October 2007.
[44] A. Jónsson, P. Morris, N. Muscettola, K. Rajan, and B. Smith. Planning in Interplanetary Space: Theory and Practice. In International Conference on AI Planning Systems, 2000.
[45] L. Pack Kaelbling and T. Lozano-Perez. Hierarchical task and motion planning in the now. In Proc. ICRA, pages 1470–1477, 2011.
[46] O. Kanoun, J-P. Laumond, and E. Yoshida. Planning foot placements for a humanoid robot: A problem of inverse kinematics. International Journal of Robotics Research, 30(4):476–485, 2011.
[47] L. Karlsson, A. Bouguerra, M. Broxvall, S. Coradeschi, and A. Saffiotti. To secure an anchor – a recovery planning approach to ambiguity in perceptual anchoring. AI Communications, 21(1):1–14, 2008.
[48] H. Kautz and J. Allen. Generalized plan recognition. In Proc. AAAI, pages 32–37, 1986.
[49] M. Kearns, Y. Mansour, and A. Ng. A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learning, 49:193–208, 2002.
[50] A. Kolobov, Mausam, and D. Weld. SixthSense: Fast and Reliable Recognition of Dead Ends in MDPs. Proc. AAAI, April 2010.
[51] A. Kolobov, Mausam, and D. Weld. LRTDP vs. UCT for Online Probabilistic Planning. In Proc. AAAI, 2012.
[52] A. Kolobov, Mausam, D. Weld, and H. Geffner. Heuristic search for generalized stochastic shortest path MDPs. Proc. ICAPS, 2011.
[53] K. Konolige, E. Marder-Eppstein, and B. Marthi. Navigation in hybrid metric-topological maps. In Proc. ICRA, 2011.
[54] K. Konolige, K. Myers, E. Ruspini, and A. Saffiotti. The saphira architecture: A design for autonomy. Journal of Experimental and Theoretical Artificial Intelligence, 9:215–235, 1997.
[55] V. Krüger, D. Kragic, A. Ude, and C. Geib. The meaning of action: a review on action recognition and mapping. Advanced Robotics, 21(13):1473–1501, 2007.
[56] B. Kuipers and Y. Byun. A robot exploration and mapping strategy based on a semantic hierarchy of spatial representations. Robotics and Autonomous Systems, 8(1-2):47–63, 1991.
[57] B. Kuipers, J. Modayil, P. Beeson, M. MacMahon, and F. Savelli. Local metrical and global topological maps in the hybrid spatial semantic hierarchy. In Proc. ICRA, pages 4845–4851, 2004.
[58] J. Kvarnström and P. Doherty. TALplanner: A temporal logic based forward chaining planner. Annals of Mathematics and Artificial Intelligence, 30(1):119–169, 2000.
[59] K. Ben Lamine and F. Kabanza. History checking of temporal fuzzy logic formulas for monitoring behavior-based mobile robots. In 12th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2000), pages 312–319, 2000.
[60] K. Ben Lamine and F. Kabanza. Reasoning about robot actions: A model checking approach. In M. Beetz, J. Hertzberg, M. Ghallab, and M. Pollack, editors, Advances in Plan-Based Control of Robotic Agents, volume 2466 of LNCS, pages 123–139, 2002.
[61] S. LaValle. Planning Algorithms. Cambridge University Press, 2006.
[62] S. Lemai-Chenevier and F. Ingrand. Interleaving Temporal Planning and Execution in Robotics Domains. In Proc. AAAI, 2004.
[63] H. Levesque, R. Reiter, Y. Lesperance, F. Lin, and R. Scherl. Golog: A logic programming language for dynamic domains. Journal of Logic Programming, 31:59–84, 1997.
[64] M. Likhachev, G. Gordon, and S. Thrun. Planning for markov decision processes with sparse stochasticity. Advances in Neural Information Processing Systems (NIPS), 17, 2004.
[65] Mausam and A. Kolobov. Planning with Markov Decision Processes: An AI Perspective. Morgan & Claypool, July 2012.
[66] C. McGann, F. Py, K. Rajan, J. Ryan, and R. Henthorn. Adaptive Control for Autonomous Underwater Vehicles. In Proc. AAAI, page 6, April 2008.
[67] T. Moeslund, A. Hilton, and V. Krüger. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2-3):90–126, 2006.
[68] M. Molineaux, M. Klenk, and D. Aha. Goal-driven autonomy in a Navy strategy simulation. In Proc. AAAI, pages 1548–1554, 2010.
[69] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. Fastslam 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proc. IJCAI, pages 1151–1156, 2003.
[70] H. Moravec. The Stanford Cart and the CMU Rover. Technical report, CMU, February 1983.
[71] B. Morisset and M. Ghallab. Learning how to combine sensory-motor functions into a robust behavior. Artificial Intelligence, 172(4-5):392–412, March 2008.
[72] P. Morris, N. Muscettola, and T. Vidal. Dynamic control of plans with temporal uncertainty. In Proc. IJCAI, pages 494–502, 2001.
[73] N. Muscettola, G. Dorais, C. Fry, R. Levinson, and C. Plaunt. A Unified Approach to Model-Based Planning and Execution. In Proceedings of the International Conference on Intelligent Autonomous Systems, 2000.
[74] N. Muscettola, P. Nayak, B. Pell, and B. Williams. Remote Agent: to boldly go where no AI system has gone before. Artificial Intelligence, 103:5–47, 1998.
[75] K. Myers. CPEF: Continuous Planning and Execution Framework. AI Magazine, 20(4):63–69, 1999.
[76] I. Nesnas, A. Wright, M. Bajracharya, R. Simmons, and T. Estlin. CLARAty and Challenges of Developing Interoperable Robotic Software. In Proc. IROS, October 2003.
[77] B. Pell, E. Gat, R. Keesing, N. Muscettola, and B. Smith. Robust Periodic Planning and Execution for Autonomous Spacecraft. In Proc. IJCAI, 1997.
[78] O. Pettersson. Execution monitoring in robotics: a survey. Robotics and Autonomous Systems, 53:73–88, 2005.
[79] J. Pineau, M. Montemerlo, M. Pollack, N. Roy, and S. Thrun. Towards robotic assistants in nursing homes: Challenges and results. Robotics and Autonomous Systems, 42(3-4):271–281, March 2003.
[80] M. Pollack and J. Horty. There's more to life than making plans: plan management in dynamic, multiagent environments. AI Magazine, 20(4):71, 1999.
[81] F. Py, K. Rajan, and C. McGann. A systematic agent framework for situated autonomous systems. In Proc. AAMAS, pages 583–590, 2010.
[82] D. Pynadath and M. Wellman. Probabilistic state-dependent grammars for plan recognition. In Proc. Uncertainty in Artificial Intelligence, pages 507–514, 2000.
[83] M. Ramirez and H. Geffner. Plan recognition as planning. In Proc. IJCAI, pages 1778–1783, 2009.
[84] J. Rintanen. An iterative algorithm for synthesizing invariants. In Proc. AAAI, pages 806–811, 2000.
[85] C. Rosen and N. Nilsson. Application of intelligent automata to reconnaissance. Technical report, SRI, November 1966.
[86] E. Sandewall. Features and Fluents. Oxford University Press, 1995.
[87] T. Simeon, J-P. Laumond, J. Cortés, and A. Sahbani. Manipulation planning with probabilistic roadmaps. International Journal of Robotics Research, 23(7-8):729–746, 2004.
[88] R. Simmons. Structured control for autonomous robots. IEEE Transactions on Robotics and Automation, 10(1):34–43, 1994.
[89] R. Simmons and D. Apfelbaum. A task description language for robot control. In Proc. IROS, 1998.
[90] F. Teichteil-Königsbuch, U. Kuter, and G. Infantes. Incremental plan aggregation for generating policies in MDPs. In Proc. AAMAS, pages 1231–1238, 2010.
[91] F. Teichteil-Königsbuch, C. Lesire, and G. Infantes. A generic framework for anytime execution-driven planning in robotics. In Proc. ICRA, pages 299–304, 2011.
[92] M. Veloso and P. Rizzo. Mapping planning actions and partially-ordered plans into execution knowledge. In Workshop on Integrating Planning, Scheduling and Execution in Dynamic and Uncertain Environments, pages 94–97, 1998.
[93] V. Verma, T. Estlin, A. Jónsson, C. Pasareanu, R. Simmons, and K. Tso. Plan execution interchange language (PLEXIL) for executable plans and command sequences. In Proceedings of the 9th International Symposium on Artificial Intelligence, Robotics and Automation in Space, 2005.
[94] D. Wilkins and K. Myers. A common knowledge representation for plan generation and reactive execution. Journal of Logic and Computation, 5(6):731–761, 1995.
[95] B. Williams and M. Abramson. Executing reactive, model-based programs through graph-based temporal planning. In Proc. IJCAI, 2001.
[96] J. Wolfe, B. Marthi, and S. Russell. Combined task and motion planning for mobile manipulation. In Proc. ICAPS, 2010.
[97] Y. Wu and T. Huang. Vision-Based Gesture Recognition: A Review, volume 1739 of LNCS, pages 103–116. Springer-Verlag, 1999.
