
© 2003 Evolution Robotics, Inc. All rights reserved.

Evolution Robotics and the Evolution Robotics logo are trademarks of Evolution Robotics, Inc. All other trademarks are the property of their respective owners. Evolution Robotics Software Platform is a trademark of Evolution Robotics, Inc. Microsoft Windows is a registered trademark of Microsoft Corporation Inc. IBM Viavoice is a registered trademark of International Business Machines Corporation. WinVoice is a trademark of Microsoft Corporation Inc. Java Runtime Environment version 1.4 is a trademark of Sun Microsystems, Inc. 3D Studio Max is a trademark of Discreet. Linux is a registered trademark of Linus Torvalds. This product includes software developed by the Apache Software Foundation (http://www.apache.org/).

Part number MC6100. Last revised 6/16/03.

Table of Contents
Chapter 1 About ERSP SDK
The User Guide ..... 1-1
Introducing ERSP ..... 1-2
Why Use ERSP? ..... 1-2
Who Should Use ERSP? ..... 1-3
ERSP Structure and Organization ..... 1-4
Evolution Robotics Software Architecture (ERSA) ..... 1-5
ER Vision ..... 1-8
    Object Recognition ..... 1-8
    Motion Flow ..... 1-8
    Color Segmentation ..... 1-9
ER Navigation ..... 1-9
    Target Following ..... 1-9
    Obstacle Avoidance ..... 1-9
    Hazard Avoidance ..... 1-9
    Teleoperation ..... 1-10
ER Human-Robot Interaction ..... 1-10
    Speech Recognition and Text to Speech ..... 1-10
    Robot Emotions and Personality ..... 1-10
    Person Detection and Head Gestures ..... 1-11
Core Libraries ..... 1-11

Chapter 2 Vision
Object Recognition System ..... 2-2
ER Vision Demo ..... 2-3
Main Screen ..... 2-3
Rotate Buttons ..... 2-4
Live Video Screen ..... 2-4
Outline Screen ..... 2-4
Object Recognition Information Area ..... 2-5
Buttons ..... 2-5
    Vision ..... 2-5
    Capture ..... 2-5
    Learn ..... 2-6
    Recognize ..... 2-6
    Sound ..... 2-6
    Prefs ..... 2-6
    Timing ..... 2-8
Object Library ..... 2-9
    View Image ..... 2-9
    View Features ..... 2-9
    Properties ..... 2-9
    Delete ..... 2-10
    Add ..... 2-10
Command Line Tools ..... 2-11
Using the Command Line Tools ..... 2-11
    objrec_add ..... 2-11
    objrec_add_list ..... 2-12
    objrec_del ..... 2-13
    objrec_list ..... 2-14
    objrec_recognize ..... 2-14
Object Recognition API Overview ..... 2-15
    Example ..... 2-16
Object Recognition API ..... 2-17
    ObjRecDatabase ..... 2-18
    ObjRecImageHash ..... 2-18
    ObjRecQuery ..... 2-19

Chapter 3 Hardware Abstraction Layer


Resource Drivers and Interfaces ..... 3-1
Resource Categories ..... 3-2
    Device ..... 3-2
    Bus ..... 3-2
    Device Group ..... 3-2
The Resource Life Cycle and Resource Manager ..... 3-3
    Resource Manager ..... 3-3
Resource Configuration ..... 3-3
    Resource Configuration XML ..... 3-4
    Standard Resource Interfaces ..... 3-5
    Supported Drivers ..... 3-7
    Resource Type ..... 3-12
Creating a Resource Driver ..... 3-12
    Resource Interfaces ..... 3-13
    Driver Implementation ..... 3-13

Chapter 4 Behavior Execution Layer


The Behavior Life Cycle and Environment ..... 4-2
    Behavior Configuration ..... 4-2
    Data Types ..... 4-5
Implementing Behaviors ..... 4-7
    BehaviorImpl and the IBehavior Interface ..... 4-7
Input and Output Ports ..... 4-7
Reading from Ports ..... 4-8
Writing to Ports ..... 4-9
The Compute_output() Function ..... 4-10
XML Interface to Behaviors ..... 4-11
Aggregate Behaviors: Combining Behaviors ..... 4-12
    Relevant Header Files and Functions ..... 4-14
    Data Passing between Behaviors ..... 4-14
Input Data Interface ..... 4-15
Output Data Interface ..... 4-15

Chapter 5 Task Execution Layer


Tasks ..... 5-2
    Tasks and Task Functors ..... 5-2
    Task Arguments ..... 5-3
    Task Termination ..... 5-3
    Task Return Values ..... 5-3
    Task Success and Failure ..... 5-3
    TaskValue ..... 5-3
    TaskArg ..... 5-4
    TaskContext ..... 5-4
    TaskStatus ..... 5-5
    TaskRegistry ..... 5-5
Asynchronous Tasks ..... 5-6
    TaskManager::install_task ..... 5-6
    Waiting For Tasks ..... 5-7
    Parallel ..... 5-7
    Terminating Tasks ..... 5-8
    Events ..... 5-9
    Types of Tasks ..... 5-10
Primitives ..... 5-10
Example Tasks ..... 5-12
Using Python ..... 5-14
    In the Python Interpreter ..... 5-15
    Embedded Python ..... 5-16
Overview of Tasks ..... 5-18

Chapter 6 Behavior Libraries


Utility Behaviors ..... 6-1
    Buffer ..... 6-1
    Console ..... 6-2
    Constant ..... 6-2
    DecayBehavior ..... 6-2
    DelayedConstant ..... 6-2
    FunctionBehavior ..... 6-2
    ImageConverter ..... 6-3
    InputCollector ..... 6-3
    InputLogger ..... 6-3
    MalleableBehavior ..... 6-3
    PeriodicTrigger ..... 6-4
    PlaySoundBehavior ..... 6-4
    TCPServerBehavior ..... 6-4
    TransmitBehavior ..... 6-4
    TriggeredConstant ..... 6-4
Operator Behaviors ..... 6-4
    AbsoluteValue ..... 6-4
    Addition ..... 6-5
    Average ..... 6-5
    DoubleArrayJoiner ..... 6-5
    DoubleArraySplitter ..... 6-5
    Max ..... 6-5
    Min ..... 6-5
    Multiplication ..... 6-6
    Priority ..... 6-6
    Subtract ..... 6-6
    ThresholdBehavior ..... 6-6
    Transistor ..... 6-7
    TriangularDistributor ..... 6-7
    TriggerPassBehavior ..... 6-7
    WeightedSumOperator ..... 6-7
    Condensers ..... 6-8
Resource Behaviors ..... 6-8
    A/VClient ..... 6-8
    A/VServer ..... 6-9
    BatteryMeter ..... 6-9
    BumpSensorBehavior ..... 6-9
    Camera ..... 6-9
    CompressedAudioPlaybackBehavior ..... 6-9
    CompressedAudioRecorder ..... 6-9
    DriveSystem ..... 6-9
    FuzzyLRFParse ..... 6-10
    FuzzyRangeSensorRing ..... 6-10
    JoystickBehavior ..... 6-11
    JPEGServer ..... 6-11
    LRFDataBehavior ..... 6-11
    MotorQuery ..... 6-12
    Odometry ..... 6-12
    PanTiltControl ..... 6-12
    RangeSensor ..... 6-12
    SensorAggregate ..... 6-12
    Wireless Monitor ..... 6-13
Vision Behaviors ..... 6-13
    ClusterTargetBehavior ..... 6-13
    ColorTrainerServer ..... 6-13
    FlowDetectBehavior ..... 6-13
    ObjRecRecognize ..... 6-13
    PanTiltDriveTracker ..... 6-13
    SkinDetector ..... 6-14
    SkinFlow ..... 6-14
    StallDetector ..... 6-14
    StatisticalColorDetection ..... 6-14
Navigation Behaviors ..... 6-14
    AvoidanceAggregate ..... 6-14
    FaceObject ..... 6-14
    FuzzyAvoidance ..... 6-15
    FuzzyHeading ..... 6-15
    FuzzyVelocity ..... 6-15
    HazardAvoidance ..... 6-16
    LegDetectBehavior ..... 6-16
    OdometryCompare ..... 6-16
    PointAndGo ..... 6-16
    PointandGoBridge ..... 6-16
    SafeDriveSystem ..... 6-16
    StopMonitor ..... 6-17
    TargetMarkerBehavior ..... 6-17
    TargetToHeadingBehavior ..... 6-17
Speech Behaviors ..... 6-17
    ASR ..... 6-17
    TTS ..... 6-18
Emotion Behaviors ..... 6-18
    Emotion Aggregate ..... 6-18
    FaceDrive ..... 6-19
    Emotion Elicitors ..... 6-19

Chapter 7 Task Libraries


Defining New Tasks ..... 7-1
TaskContext ..... 7-2
Events ..... 7-2
Example Program ..... 7-2
Overview of Tasks ..... 7-4

Chapter 8 Core Libraries


Image Toolkit ..... 8-1
    Image Class ..... 8-2
    Formats ..... 8-2
    Initialization ..... 8-2
    Conversion ..... 8-2
    Transforms ..... 8-2
    Memory Management ..... 8-2
    I/O ..... 8-3
    Metadata ..... 8-3
    Drawing ..... 8-3
    Colorspace Utilities ..... 8-3
Math Library ..... 8-5
    Vector3 ..... 8-6
    Euler Angles ..... 8-8
    Point Sets ..... 8-8
    MultiValued Functions ..... 8-8
    N-Dimensional Vectors ..... 8-11
    Matrix ..... 8-11
    VectorField ..... 8-11
Platform Independent Utilities ..... 8-11
    SystemProperty and Related Methods ..... 8-11
Core Libraries Utilities ..... 8-12
Fuzzy Logic ..... 8-12
    Basic Fuzzy Data Structures ..... 8-13
    Basic Fuzzy Set Types ..... 8-13
    Basic Fuzzy Operators ..... 8-13
    Basic Fuzzy Rule Combination ..... 8-14
    Crisp Value Generation ..... 8-14
Logging Utilities ..... 8-14
    Where to use logging ..... 8-15
    Formatting of logging statements ..... 8-16
    Logging Levels ..... 8-16
    How to Set Logging Levels for Categories ..... 8-16
    More about Logging ..... 8-17

Chapter 9 Teleop and ColorTrainer GUI Applications


Starting the GUIs ..... 9-1
    Assigning the Address and Port ..... 9-2
    Using the Teleoperation Control Panel ..... 9-3
Using the ColorTrainer Application ..... 9-3
    Using the ColorTrainer ..... 9-5

Chapter 10 Behavior Composer GUI


Start the GUI ..... 10-1
GUI Description ..... 10-2
    Toolbar Menus ..... 10-2
    Toolbar Buttons ..... 10-3
    Behavior Palette ..... 10-3
    Network Editor ..... 10-3
Creating an Aggregate Behavior ..... 10-6
Working with Multiple Instances ..... 10-8
Behavior Composer GUI Example ..... 10-8

Chapter 1

About ERSP SDK

The User Guide


This User Guide is a detailed explanation of the concepts and functionality of the ERSP software. You should complete the ERSP Getting Started Guide before reading the User Guide, as basic concepts and procedures are explained in that document. Note that the APIs are explained in detail in the Doxygen documents in the Install_dir/doc/ERSP-API/html directory of the software installation.

This manual contains the following chapters:

About ERSP SDK - This chapter introduces the ERSP software.
Vision - The Vision chapter gives a detailed description of ERSP's object recognition capabilities.
Hardware Abstraction Layer - Describes how ERSP can be used to control hardware.
Behavior Execution Layer - Describes the behavior concept.
Task Execution Layer - Describes the task concept.
Behavior Libraries - Gives an overview of available behaviors.
Task Libraries - Explains Python libraries and how to use them.
Core Libraries - Core libraries are described in detail.
Teleop and ColorTrainer GUI Applications - Shows how to use the Teleop and ColorTrainer GUIs.
Behavior Composer GUI - Shows how to create behavior networks using the Behavior Composer GUI.
Grammar Information - Describes grammar file syntax.

Introducing ERSP
This introductory chapter is intended to provide the reader with an overview of ERSP's functionality and how it can be used to prototype and develop software for a wide range of robotic systems. This introduction also walks you through related resources that will enhance your ability to use ERSP and its Application Programmer's Interfaces (APIs) to maximum advantage.

ERSP is a software development kit for programming robots. At the lowest level, ERSP consists of several hundred thousand lines of C++ code, which gives application developers a big head start with their robotics projects. The code is organized as a number of core libraries that define the basis of application programs. The ERSP libraries consist of a large set of functions that are useful for a wide variety of robotic applications. The infrastructure can be partitioned into four major components:

- Software Control Architecture
- Computer Vision
- Robot Navigation
- Human-Robot Interaction (HRI)

Associated with each major component are tools that provide configuration management, programming languages, or graphical user interfaces.

Why Use ERSP?


ERSP enables developers to build powerful, rich robotics applications quickly and easily. ERSP supports this objective in several ways.

First, it provides tools for efficient software/hardware integration. Interfacing the software with sensors, actuators, and user interface components (LCDs, buttons, etc.) can be a tedious, time-consuming, and costly task. ERSP provides a powerful paradigm for software/hardware integration that makes these tasks easier. By taking advantage of the object-oriented mechanisms of C++, ERSP provides powerful tools for easily extending a user's robotic system to support new hardware components without the need to rebuild code from scratch. See the HAL chapter of the ERSP User Guide for more information.

Second, ERSP provides a system architecture which contains a rich set of mechanisms and algorithms for controlling the activities of a robot. This architecture consists of several layers that deal with control issues ranging from ones as simple as turning a single motor to complex issues such as making a robot follow a person while avoiding obstacles.


The system architecture is modular, with well-defined interfaces between its layers and interacting software modules. A developer can choose to use one or more layers of the architecture in a target system, allowing scalability of computational requirements. This makes the target application more computationally efficient. For maximum flexibility, ERSP provides easily accessible Application Programmer's Interfaces (APIs) so that developers can easily extend and modify them to fit the requirements of their target systems. The open APIs also make it very easy to integrate third-party software into ERSP. For instance, a company could use these APIs to integrate a proprietary face recognition technology into ERSP.

Third, ERSP puts a number of unique and very powerful technologies into the developer's hands. A partial list includes:

- Vision
- Object Recognition
- Voice Recognition
- Text-to-speech
- Emotion
- Navigation
- And more

In the area of computer vision, ERSP provides a very powerful object recognition system that can be trained to recognize an almost unlimited number of objects in its environment. Recognition can be used for many applications such as reading books to children, locating a charging station and docking into it, or localization and mapping. ERSP's voice recognition and text-to-speech modules can be used for enhanced voice interactivity between the user and the robot. A model of emotion is used to emulate and express robot emotions, which enhances the user interface for applications such as entertainment robots. In the area of navigation, ERSP provides modules for controlling the movement of the robot relative to its environment. For instance, a target following module can be used to track and follow a given target while, at the same time, obstacle avoidance can be used to assure safe movement around obstacles. These modules define a set of high-level components upon which an application can be developed.

Who Should Use ERSP?


ERSP is for companies, organizations, developers, and researchers who are working on robotic products or projects. Most robotic projects require a large subset of the modules and technologies that ERSP provides. Often, companies with robotics initiatives need to develop an entire system from the ground up, from drivers to common components to the final complex robot application. Evolution Robotics, with ERSP, provides companies with these common, yet critical, software components necessary to develop systems for any robotics application. These applications could be anything from cleaning, delivery, and factory automation tasks to entertainment.


ERSP frees companies from the mundane and resource-consuming task of developing common subsystems such as vision and navigation. With ERSP, companies can focus entirely on the value-added functionality of their particular robot applications. One of the additional benefits of this approach is that robotics applications developed using ERSP can be made portable to a wide range of hardware, enabling companies to extend valuable engineering resources. Using ERSP, customers can build robot applications faster, cheaper, and at lower risk.

The value that ERSP has for an organization depends on the company's existing software infrastructure. Companies with a new initiative in robotics often find ERSP valuable because it gives them a head start, whereas starting from scratch would require months or years of development time and cost. Companies that have had robotics initiatives for many years will have some legacy infrastructure. These companies typically find specific modules within ERSP, such as the visual object recognition, voice recognition, and obstacle avoidance, useful for integration with their own products. Some mature companies with several robotics initiatives may find that their existing software infrastructure is not being leveraged across projects; they end up building the same functions many times over, or finding that these functions from different projects do not talk to each other. These companies find ERSP valuable because it provides a cross-platform standard that encourages cross-project fertilization.

ERSP Structure and Organization


The collection of ERSP libraries provides APIs that can be divided into several important functional categories (see the figure below):

ER Software Architecture: The software architecture provides a set of APIs for integration of all the software components with each other and with the robot hardware. The infrastructure consists of APIs to deal with the hardware, for building task-achieving modules that can make decisions and control the robot, for orchestrating the coordination and execution of these modules, and for controlling access to system resources.

ER Vision: The Vision APIs provide access to very powerful computer vision algorithms for analyzing camera images and extracting information that can be used for various tasks such as recognizing an object, detecting motion, or detecting skin (for detection of people).

[Figure: ERSP functional categories - Vision, Navigation, and Human-Robot Interaction components built on ERSA (TEL, BEL, and HAL).]


ER Navigation: The Navigation APIs provide mechanisms for controlling movement of the robot. These APIs provide access to modules for teleoperation control, obstacle avoidance, and target following.

ER Human-Robot Interaction (HRI): The Human-Robot Interaction APIs support building user interfaces for applications with graphical user interfaces, voice recognition, and speech synthesis. Additionally, the HRI components include modules for robot emulation of emotions and personality to enhance the user's experience and improve human-robot interaction. These APIs also support modules for recognition of gestures that can be used to interact with the robot.

The software platform also provides developer tools, which consist of well-defined application programmer's interfaces in Python, C++, an XML scripting language, and visual programming tools. These tools provide a flexible environment for developing application programs without the need for in-depth knowledge of the intimate details of ERSP.

Evolution Robotics Software Architecture (ERSA)


ERSA consists of three main layers, where each layer provides infrastructure for dealing with a different aspect of application development.

The Hardware Abstraction Layer (HAL) provides abstraction of the hardware devices and low-level operating system (OS) dependencies. This assures portability of the architecture and application programs to other robots and computing environments. At the lowest level, the HAL interfaces with device drivers, which communicate with the hardware devices through a communication bus. The description of the resources, devices, busses, their specifications, and the corresponding drivers is managed through a number of configuration files. Configuration files employ a user-specified XML framework and syntax. The advantage of managing the resource specifications through configuration files is that it provides a high degree of flexibility. If you have two robots with significantly different devices, sensors, and motors, you only need to create a resource configuration file for each. That file describes the mapping between the software modules and the hardware for each robot. HAL reads the specifications from the configuration file and reconfigures the software to work transparently, without modifications, with the application software. The XML configuration files typically contain information about the geometry of the robot, the sensors, sensor placements, interfaces to hardware devices, and parameters for hardware devices.

The second layer, the Behavior Execution Layer (BEL), provides infrastructure for development of modular robot competencies, known as behaviors, for achieving tasks with a tight feedback loop such as finding a target, following a person, avoiding an object, etc. The behaviors become the basic building blocks on which software applications are built. The BEL also provides powerful techniques for coordinating the activities of behaviors for conflict resolution and resource scheduling. Each group of behaviors is typically organized in a behavior network which executes at a fixed rate. Behaviors are executed synchronously with an execution rate that can be set by the developer. The BEL also allows running several behavior networks simultaneously, with each executing at a different execution rate. The communication ports and protocols between behaviors can be defined and implemented by the user. The BEL defines a common and uniform interface for all behaviors and the protocols for interaction among the behaviors.
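
To make the behavior concept more concrete, the following is a minimal, self-contained sketch of the data-flow idea behind the BEL: a behavior reads values from its input ports, computes, and writes results to its output ports once per execution cycle. The class and function names here (Port, Behavior, AverageBehavior) are invented for this illustration and are not the actual ERSP BEL API; only the compute_output() idea is taken from the documentation, and the real interfaces are described in the Behavior Execution Layer chapter and the Doxygen documents.

#include <iostream>
#include <memory>
#include <utility>
#include <vector>

// Illustrative only: a port carries the most recent value written to it.
struct Port {
    double value = 0.0;
};

// Illustrative only: every behavior exposes the same uniform interface and
// is invoked once per cycle by a behavior manager.
class Behavior {
public:
    virtual ~Behavior() = default;
    virtual void compute_output() = 0;  // called at the network's fixed rate
};

// Example behavior: averages two range-sensor inputs onto one output port.
class AverageBehavior : public Behavior {
public:
    AverageBehavior(std::shared_ptr<Port> in_a,
                    std::shared_ptr<Port> in_b,
                    std::shared_ptr<Port> out)
        : in_a_(std::move(in_a)), in_b_(std::move(in_b)), out_(std::move(out)) {}

    void compute_output() override {
        out_->value = 0.5 * (in_a_->value + in_b_->value);
    }

private:
    std::shared_ptr<Port> in_a_, in_b_, out_;
};

int main() {
    auto left = std::make_shared<Port>();
    auto right = std::make_shared<Port>();
    auto heading = std::make_shared<Port>();

    AverageBehavior avg(left, right, heading);

    // A behavior manager would run every behavior in the network at a fixed
    // rate; here we simulate three cycles with changing sensor readings.
    std::vector<std::pair<double, double>> readings = {{1.0, 3.0}, {2.0, 2.0}, {0.5, 1.5}};
    for (const auto& r : readings) {
        left->value = r.first;
        right->value = r.second;
        avg.compute_output();
        std::cout << "cycle output: " << heading->value << "\n";
    }
    return 0;
}

In ERSP itself, the port wiring and the per-cycle scheduling are handled by the behavior infrastructure described below.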


In each cycle, a Behavior Manager executes all sensor behaviors to acquire fresh sensory data and then executes the network of behaviors to control the robot. The coordination of behaviors is transparent to the user.

An XML interface enables behaviors to interact with scripts written in XML. The XML interface provides a convenient and powerful approach to building application programs using XML scripts. XML files (known as schemas) can be used to define the characteristics of a behavior module, such as its parameters, input/output interface, etc. Schemas for behaviors are similar to classes in C++, whereas specific behaviors correspond to objects, which are instances of classes. A behavior network can be specified in an XML file that instantiates behaviors using the schema files, specifies values for optional parameters, and specifies the interconnections between behavior ports. A behavior network written in XML can then be executed using the behave command (see the Behave Command section of the ERSP Basics chapter of this Guide for details). The advantage of using XML for developing behavior networks is that it is very flexible and does not require recompilation of the code every time a small change is made to the network.

Setting up the connections between behaviors using the C++ APIs can be a tedious task. Therefore, to further improve the process of developing behavior networks, ERSP provides the Behavior Composer, a graphical user interface. Behavior networks are typically developed more conveniently using the Behavior Composer because it can be used to build application programs visually. With the Behavior Composer, you can use a mouse and keyboard to drag-and-drop behaviors and connect them together to form an application. This visual program is converted to an XML script that can then be executed by the ERSA.

The Task Execution Layer (TEL) provides infrastructure for developing goal-oriented tasks along with mechanisms for coordinating complex execution of tasks. Tasks can run in sequence or in parallel. Execution of tasks is triggered by user-defined events. (Events are conditions or predicates defined on values of variables within the Behavior Execution Layer or the Task Execution Layer.) Complex events can be defined by logical expressions of basic events. While behaviors are highly reactive and are appropriate for creating robust control loops, tasks are a way to express higher-level execution knowledge and coordinate the actions of behaviors. Tasks run asynchronously as events are triggered. Time-critical modules such as obstacle avoidance are typically implemented in the BEL, while tasks implement functionality that is not required to run at a fixed execution rate.

Tasks are developed hierarchically, starting with the primitive tasks, which are wrappers of behavior networks. At invocation, a primitive task loads and starts the execution of a behavior network. Tasks can monitor the execution of behavior networks and the values of the data flowing between behaviors to define certain events. Tasks can also manipulate behavior networks, for example by injecting values into a network, to cause a desired outcome. To change the context of execution based on the goals of the robot, the TEL can terminate one behavior network and load and execute another. Asynchronous events provide a flexible mechanism for inter-task communication as well as communication between the BEL and the TEL.

Tight feedback loops for controlling the actions of the robot according to perceptual stimuli (presence of obstacles, detection of a person, etc.) are typically implemented in the Behavior Execution Layer. Behaviors tend to be synchronous and highly data driven. The Task Execution Layer is more appropriate for dealing with complex control flow which depends on context and conditions that can arise asynchronously. Tasks tend to be asynchronous and highly event driven. The TEL provides an interface to Python, an interpreted scripting language. Prototyping in Python is convenient because it is a programming language at a higher abstraction level than C++, and it is interpreted. The design of the TEL makes it easy to interface it to other programming or scripting languages.

The figure below is a graphical representation of how the different layers of the software interact with each other and with the input XML files.

[Figure: Layer interaction - Python scripts and Tasks (including Primitive Tasks) in the TEL; Behaviors and behavior networks, defined by behavior network XML files or the Behavior Composer, in the BEL; and Resources and Drivers, defined by hardware configuration XML files, in the HAL.]

ERSA has been engineered to be highly flexible and reconfigurable to meet the requirements of numerous application programs. Any subset of the ERSA layers can be combined to embody a range of architectures with radically different characteristics. The possible embodiments of the architecture consist of using any one of the layers in isolation, any two of the layers in combination, or all three layers. For example, applications with limited requirements for high-level functionality may require only HAL, or HAL and BEL. The advantage of restricting the use to HAL is in saving computational resources (memory, CPU power, etc.). If hardware abstraction is not a concern for a project or product, then BEL can be used in isolation. Or, if only high-level, event-driven control flow is required, then TEL may be used.
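
As a concrete illustration of the event-driven task concept introduced above, the sketch below models a task that waits on an event raised by a behavior-level check and only then runs its asynchronous logic. The Event type and the thread-based "task" here are simplified stand-ins invented for this example; the actual TEL classes (TaskContext, TaskManager, and related types) are covered in the Task Execution Layer chapter.

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

// Illustrative only: an event that a task can wait on asynchronously.
class Event {
public:
    void raise() {
        { std::lock_guard<std::mutex> lock(m_); raised_ = true; }
        cv_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return raised_; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool raised_ = false;
};

int main() {
    Event person_detected;

    // The "task": runs asynchronously and only acts when the event triggers.
    std::thread greet_task([&] {
        person_detected.wait();
        std::cout << "task: person detected, starting greeting sequence\n";
    });

    // A behavior-level loop running at a fixed rate; when its condition is
    // met, it raises the event that drives the task layer.
    for (int cycle = 0; cycle < 5; ++cycle) {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        bool saw_person = (cycle == 3);  // stand-in for a perceptual check
        if (saw_person) person_detected.raise();
    }

    greet_task.join();
    return 0;
}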

ER Vision
ERSP provides powerful vision algorithms for object recognition, motion flow estimation, and color segmentation.

Object Recognition

The object recognition system is a vision-based module that can be trained to recognize objects using a single, low-cost camera. The main strengths of the object recognition module lie in its robustness in providing reliable recognition in realistic environments where, for example, lighting can change dramatically. Object recognition provides a fundamental building block for many useful tasks and applications for consumer robotic products, including object identification, visual servoing and navigation, docking, and hand-eye coordination. Other useful and interesting applications include entertainment and education.

The object recognition module is implemented in the objrec library (in the Core Libraries). The Behavior and Task libraries implement several useful behaviors and tasks that use the object recognition for tracking and following an object. To train the software, you need to capture one or more images of the object of interest, name them using a text string, and load them into a database known as the model set (using file extension .mdl). The software then analyzes the object's image and finds up to 1,000 unique, local features to build an internal model of the object. ERSP provides graphical and command line tools that help in creating and manipulating object model sets. (See the Vision chapter of the ERSP User Guide.)

To use the object recognition, the user employs the APIs to load a model set and executes the object recognition algorithm (using the library APIs, the behaviors, or tasks). Once the object is seen in the robot camera's field of view, it will be recognized. The recognition returns the name of the object, the pixel coordinates of where in the video image it was recognized, and a distance to the object. The object recognition can be trained on hundreds of objects and can recognize more than one simultaneously.

Motion Flow

While object recognition provides a key technology for building fundamental robot capabilities, it does not process movement in objects such as people and other robots. Motion Flow analyzes an image sequence rather than a single image at a time, making it possible to discern motion in the field of view. This fundamental capability can be used for a number of tasks, ranging from detection of motion at a gross scale (moving people) to analysis of motion at a very fine scale (moving pixels). The optical flow algorithm provides a robust analysis of motion in the field of view. This algorithm correlates blocks of pixels between two consecutive frames of a video to determine how much they have moved from one frame to the next.
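
The following sketch illustrates the training and recognition workflow described under Object Recognition above: labeled model images are collected into a model set, then a camera frame is queried and the result reports the object name, pixel location, and distance. It is a self-contained mock-up; the ModelSet and Recognizer types and their methods are invented for illustration and are not the ERSP objrec classes (ObjRecDatabase, ObjRecQuery, etc.), which are documented in the Vision chapter and the Doxygen documents.

#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Illustrative stand-ins for a trained model set and a recognition result.
struct RecognitionResult {
    std::string name;         // label given at training time
    int x = 0, y = 0;         // pixel coordinates of the match in the image
    double distance_m = 0.0;  // estimated distance to the object
};

class ModelSet {
public:
    void add_model(const std::string& name, const std::string& image_file) {
        // A real system would extract local features here and store them.
        models_.push_back(name);
        std::cout << "trained '" << name << "' from " << image_file << "\n";
    }
    const std::vector<std::string>& names() const { return models_; }
private:
    std::vector<std::string> models_;
};

class Recognizer {
public:
    explicit Recognizer(const ModelSet& models) : models_(models) {}

    // A real recognizer would match features of the frame against the model
    // set; this mock just "finds" the first trained object.
    std::optional<RecognitionResult> recognize(const std::string& frame_file) const {
        std::cout << "analyzing " << frame_file << "\n";
        if (models_.names().empty()) return std::nullopt;
        return RecognitionResult{models_.names().front(), 160, 120, 0.8};
    }
private:
    const ModelSet& models_;
};

int main() {
    ModelSet models;  // corresponds conceptually to a .mdl model set
    models.add_model("charging_station", "charging_station.jpg");

    Recognizer recognizer(models);
    if (auto result = recognizer.recognize("camera_frame.jpg")) {
        std::cout << "recognized " << result->name
                  << " at pixel (" << result->x << ", " << result->y << ")"
                  << ", distance " << result->distance_m << " m\n";
    }
    return 0;
}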


Color Segmentation

Color segmentation can be useful for finding objects of a specific color. For instance, looking for an object using color can be used for a number of human-robot interaction components. This algorithm can also be used to detect people by searching for skin color under various lighting conditions. The color segmentation algorithm provides reliable color segmentation based on a probabilistic model of the desired color. Using a mixture of Gaussian distributions, it can be trained to classify pixels into the desired color or background and allow for significant variation in pixel color caused by lighting changes or diversity of the object population. The color segmentation module builds models for a desired color based on a training set that contains a population of objects with the desired color. Once the model is learned by the module, it is able to classify objects based on the model.
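
The sketch below shows the basic probabilistic idea behind this kind of color segmentation: each pixel is scored against a mixture of Gaussians trained on the desired color, and pixels whose mixture likelihood exceeds a threshold are labeled as the target color. The component parameters and the threshold are made up for illustration; ERSP's color segmentation module encapsulates the actual training and classification.

#include <cmath>
#include <iostream>
#include <vector>

// One Gaussian component over RGB with a diagonal covariance (illustrative).
struct Gaussian {
    double weight;
    double mean[3];
    double var[3];
};

// Likelihood of an RGB pixel under a mixture of Gaussians.
double mixture_likelihood(const std::vector<Gaussian>& mixture, const double rgb[3]) {
    const double pi = 3.14159265358979323846;
    double total = 0.0;
    for (const auto& g : mixture) {
        double exponent = 0.0, norm = 1.0;
        for (int c = 0; c < 3; ++c) {
            double d = rgb[c] - g.mean[c];
            exponent += d * d / (2.0 * g.var[c]);
            norm *= std::sqrt(2.0 * pi * g.var[c]);
        }
        total += g.weight * std::exp(-exponent) / norm;
    }
    return total;
}

int main() {
    // Toy "skin color" model with two components (parameters are made up;
    // a real model would be learned from a training set of example pixels).
    std::vector<Gaussian> skin = {
        {0.6, {200.0, 150.0, 130.0}, {400.0, 400.0, 400.0}},
        {0.4, {170.0, 120.0, 100.0}, {600.0, 600.0, 600.0}},
    };
    const double threshold = 1e-7;  // tuned on training data in practice

    double pixels[][3] = {{205.0, 155.0, 128.0}, {40.0, 90.0, 200.0}};
    for (const auto& p : pixels) {
        bool is_target = mixture_likelihood(skin, p) > threshold;
        std::cout << "(" << p[0] << ", " << p[1] << ", " << p[2] << ") -> "
                  << (is_target ? "target color" : "background") << "\n";
    }
    return 0;
}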

ER Navigation
ERSP provides modules for safe navigation in realistic environments. The navigation modules consist of behaviors for following targets and for obstacle and hazard avoidance. In addition, ERSP provides facilities for teleoperation of robots remotely.

Target Following

Target following modules are available in the BEL as well as the TEL. These modules track and follow the position of a target. The input to these modules comes from a target detection module, which can be based on visual detection or on detection using odometry information.

Obstacle Avoidance

Using the obstacle avoidance algorithm, the robot generates corrective movements to avoid obstacles. The robot continuously detects obstacles using its sensors and rapidly controls its speed and heading to avoid them. Our obstacle avoidance algorithm uses a description of the robot's mechanics and sensor characteristics in order to generate optimally safe control commands. The description of the robot's mechanics and sensors is given in a generic configuration description language defined in XML, so that the obstacle avoidance algorithm can easily be integrated onto different types of robots. Porting of obstacle avoidance (and other modules, for that matter) to a new robot with different hardware just requires describing the new hardware in the configuration description language.

Hazard Avoidance

The hazard avoidance mechanisms provide a reflexive response to a hazardous situation in order to ensure the robot's safety and guarantee that it does not cause any damage to itself or the environment. Mechanisms for hazard avoidance include collision detection (using not one but a set of sensors and techniques). Collision detection provides a last resort for negotiating around obstacles in case obstacle avoidance fails to do so, which can be caused by moving objects or by software or hardware failures.


Stairs and other drop-off areas are handled by a cliff avoidance module. Cliff avoidance uses a set of redundant sensors to detect the hazard and ensures the robot's safety even in the case of faulty sensors. The robot immediately stops and moves away from a drop-off.

Teleoperation

ERSA provides infrastructure for cross-network operation of the robot. Applications of this capability include multi-robot systems, off-board processing, and teleoperation. For more information on the networking infrastructure, see the 03-teleop and 04-primitive tutorials, and the Doxygen documents pertaining to, for example, MalleableBehavior.

ER Human-Robot Interaction
Evolution Robotics provides a variety of component technologies for developing rich interfaces for engaging interactions between humans and robots. These components support a number of interfaces for command and control of a robot and allow the robot to provide feedback about its internal status. Furthermore, these components enable the robot to interact with a user in interesting and even entertaining ways. The core technologies provided for developing human-robot interfaces (HRIs) consist of:

- Speech recognition and text-to-speech (TTS) for verbal interaction
- Robot emotions and personality to create interesting and entertaining life-like robot characters
- Person detection and recognition of simple gestures

Speech Recognition and Text to Speech

Two speech engines are available for use in user applications: one for input that converts a speech waveform into text (Automatic Speech Recognition or ASR) and one for output that converts text into audio (Text-to-Speech or TTS). Both engines are third-party applications that are included with ERSP. The speech engines are resources available in HAL, similar to the resources for interacting with sensors and actuators such as IRs and motors. The speech modules can be integrated into behaviors, tasks, or both.

Robot Emotions and Personality

The robot emotion behaviors are used to describe the robot's internal and emotional states. For example, the emotional state defines whether the robot is sad or happy, angry or surprised. The emotion behaviors can also describe personality traits. For example, an optimistic robot would tend toward a happy state, whereas a pessimistic robot would tend toward a sad state. A graphical robot face is also available in ERSP. This face is capable of expressing emotion and having the appearance of forming words. This functionality allows the user to create a wide variety of emotions and responses triggered by user-specified stimuli. This greatly enhances the human-robot experience. See the Behaviors Library chapter of the ERSP User Guide and the Doxygen documents for details.
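
As a toy illustration of the idea that personality biases emotional state, the sketch below keeps a single happiness value that is pushed around by stimuli and drifts back toward a personality-dependent baseline. The names and the one-dimensional model are invented for this example and are far simpler than the ERSP emotion behaviors described in the Behavior Libraries chapter.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Illustrative one-dimensional emotional state in [-1, 1]:
// -1 = very sad, 0 = neutral, +1 = very happy.
class EmotionState {
public:
    // The baseline encodes personality: an optimistic robot has a positive
    // baseline, a pessimistic one a negative baseline.
    explicit EmotionState(double baseline) : baseline_(baseline), value_(baseline) {}

    void apply_stimulus(double delta) {
        value_ = std::clamp(value_ + delta, -1.0, 1.0);
    }

    // Each cycle the state decays back toward the personality baseline.
    void decay(double rate = 0.2) {
        value_ += rate * (baseline_ - value_);
    }

    std::string label() const {
        if (value_ > 0.3) return "happy";
        if (value_ < -0.3) return "sad";
        return "neutral";
    }

private:
    double baseline_;
    double value_;
};

int main() {
    EmotionState optimist(+0.4);
    EmotionState pessimist(-0.4);

    // The same stimuli (praise, then a bump into an obstacle, then quiet
    // cycles) affect both robots, but each drifts back toward its own
    // personality baseline.
    std::vector<double> stimuli = {+0.5, -0.8, 0.0, 0.0, 0.0};
    for (double s : stimuli) {
        optimist.apply_stimulus(s);  pessimist.apply_stimulus(s);
        optimist.decay();            pessimist.decay();
        std::cout << "optimist: " << optimist.label()
                  << "  pessimist: " << pessimist.label() << "\n";
    }
    return 0;
}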



Person Detection and Head Gestures
Person detection and tracking can enable very diverse human-robot interaction. For instance, being able to detect, approach, and follow a person provides very useful primitives for HRI. Evolution Robotics, Inc. has developed reliable person-tracking technology using vision, combining several of our technologies for object recognition, optical flow, and skin segmentation. Gesture recognition provides another powerful technology for enhanced human-robot interfaces. Using gestures to interact with a robot provides a natural and powerful interface for commanding a robot to perform tasks such as pick-and-place. Using our vision component technologies for motion analysis and skin segmentation (based on color segmentation), ERSP can detect gestures including head nodding and head shaking. This is done by tracking the motion of the user's head and hands, which are segmented using skin segmentation. These modules can be used to extend the system to recognize other gestures such as waving and pointing.

Core Libraries
The Core Libraries implement the basic functionality of ERSP upon which all other infrastructure is built. The core libraries can also be said to implement standards for later software modules. An application can build directly on any subset of the core libraries.
The Driver Libraries implement interfaces for specific hardware components such as controller boards, drive systems, positioning systems, graphics engines, sensors, audio devices, etc. These drivers build on the infrastructure implemented in the core libraries. Specific drivers such as the Robot Control Module driver are implemented as a C++ class that is derived from a driver class in the resource library. This modular scheme assures, for example, that all derived driver classes for motor controllers provide a uniform interface defined by the driver class in the resource library. Thus, one controller can easily be replaced with another without propagating the change throughout the modules/classes that use the driver for the controller.
The core libraries named Resource, Behavior, and Task implement the three layers of the software control architecture of the ERSA. While the core libraries implement the core functions of ERSA, the Behavior Libraries and Task Libraries provide higher-level functionality that builds on the core. For example, the navigation library in the Behavior Libraries provides modules for obstacle avoidance. A user can easily use this behavior without being concerned about how it is implemented using the core libraries.
Finally, the core libraries implement basic and powerful functionality for object recognition and other vision algorithms. These modules become basic building blocks for building higher-level modules in the BEL and TEL.


ERSP consists of the following set of libraries, which implement its core functionality. The libraries can be found in the Install_dir\lib directory.
Core Libraries
Driver Libraries (Hardware Abstraction Layer)
Behavior Libraries (Behavior Execution Layer)
Task Libraries (Task Execution Layer)
For details on these libraries, see the Core Libraries, Hardware Abstraction Layer, Behavior Execution Layer, and Task Execution Layer chapters of this Guide.


Chapter 2

Vision

Object recognition can be used for many tasks, including:
navigation
manipulation
human-robot interaction
The object recognition algorithm provides a fundamental building block for tasks and applications for consumer robotic and machine vision products, including object identification, visual servoing and navigation, docking, and hand-eye coordination. Other applications include entertainment and education. The object recognition system can be used to read children's books aloud or to automatically provide information about a painting or sculpture in a museum. The object recognition system is mainly focused on recognizing planar, textured objects. Three-dimensional objects composed of planar textured structures, or composed of slightly curved components, are also properly recognized by the system. However, three-dimensional deformable objects, such as a human face, cannot be handled.


Object Recognition System


The object recognition system can be trained to recognize objects using a single, low-cost camera. The main strength of the object recognition system lies in its ability to provide reliable recognition in realistic environments where lighting can change dramatically. The following are some of the characteristics that make object recognition a valuable component for many robotics and vision-based applications:
Invariant to rotation and affine transformations: The object recognition system recognizes objects even if they are rotated upside down (rotation invariance) or placed at an angle with respect to the optical axis (affine invariance).
Invariant to changes in scale: Objects can be recognized at different distances from the camera, depending on the size of the objects and the camera resolution. The recognition works reliably from distances of several meters.
Invariant to changes in lighting: The object recognition system can handle changes in illumination ranging from natural to artificial indoor lighting. The system is insensitive to artifacts caused by reflections or backlighting.
Invariant to occlusions of 50 to 90%, depending on object, distance, size, and camera: The object recognition system can reliably recognize objects that are partially blocked by other objects or by the hand of the person holding the object, as well as objects that are only partially in the camera's view.
Reliable recognition: The object recognition system has an 80-95% success rate in uncontrolled settings. A 95-100% recognition rate is achieved in controlled settings.
Training the object recognition system is simple. You capture one or more images of the object and load the images into the object recognition system's database. The object recognition system analyzes the images and finds up to 1,000 unique local features. The object recognition algorithm then builds an internal model for the object. The recognition speed is a logarithmic function of the number of objects in the database, log(N). The object library can scale up to hundreds or even thousands of objects without a significant increase in computational requirements. The recognition frame rate is proportional to CPU power and image resolution. For example, the recognition algorithm runs at 5 frames per second (fps) at an image resolution of 320x240 on an 850 MHz Pentium III processor and 3 fps at 80x66 on a 100 MHz 32-bit processor. Reducing the image resolution decreases the image quality and, ultimately, the recognition rate. However, the object recognition system degrades gracefully with decreasing image quality. Each object model requires about 40KB of memory.
The following are the three steps involved in using the object recognition algorithm:
Training: Training is accomplished by capturing one or more images of an object from several points of view. For planar objects, only the front and rear views are necessary. For 3-D objects, several views, covering all facets of the object, are necessary. For example, train a Coke can using 4 views.
Recognition: The algorithm extracts up to 1,000 local, unique features for each object. A small subset of those features and their interrelation identifies the object.



Estimation: Identification provides the name of the object and its full pose, with respect to the camera.

ER Vision Demo
The ER Vision Demo application allows you to access the object recognition algorithm through a graphical user interface. This program makes it easy to view your entire Object Library. Image properties can easily be added and manipulated. Features found by the object recognition algorithm can be visualized and analyzed. To initiate this GUI, type er_vision_demo on the command line for Linux (this is not functional in this release) or double click on er_vision_demo.exe in Install_dir/bin for Windows.

Main Screen
The screen shot below shows the ER Vision Demo application. It includes several areas that are described in the sections that follow.
(Screen shot of the ER Vision Demo window; callouts identify the Rotate Buttons, Live Video Screen, Main Screen, Object Recognition Information Area, Outline Screen, Buttons, and Object Library areas.)

Main Screen
The Main Screen area shows a grayscale version of the video feed. If Show Features (see Display Parameters on page 2-8) and Recognize (see Recognize on page 2-6) are set to On, a series of circles overlay this image. These are the features that the object recognition algorithm uses to uniquely identify an object or location.


Turning off this screen can reduce CPU usage. To turn off the Main Screen area, right-click the image and click Set invisible. To redisplay it, right-click the empty Main Screen area and click Set Visible.

Rotate Buttons
Use the buttons to rotate the Main Screen, Live Video Screen and Outline Screen.

Live Video Screen


This screen shows the camera input. You can adjust the location of the camera in order to ensure that objects are properly captured for recognition. Turning off this screen can reduce CPU usage. To turn off the Live Video Screen, right-click the image and click Set invisible. To redisplay it, right-click the empty Live Video Screen area and click Set Visible.

Outline Screen
This screen displays an outline of the camera input. This screen is for your reference and is not used by the vision algorithm. Turning off this display can reduce CPU usage. For more information about how the edges of an object are detected, see Prefs on page 2-6. To turn off the Outline Screen, right-click the image and click Set invisible. To redisplay it, right-click the Outline Screen and click Set Visible.



Object Recognition Information Area


This area shows the recognition results. Objects the software recognizes in the camera's view are identified by:
label
number of matched features out of the total number of features in the object
the coordinates of the center of the bounding box
distance from the camera
Important Note: You must specify the object's distance from the camera for the distance feature to work. See How to Capture an Image on page 2-5 for more information.

Buttons
The buttons control the operation of the application and set advanced parameters. Each button is described below.

Vision
The Vision button turns the screens On and Off.

Capture
The Capture button lets you capture a snapshot of an object, person, or scene to add to the Object Library.

How to Capture an Image


1. Place an object in front of the robot's or PC's camera. This object will become one of the learned objects in the Object Library.
2. Adjust the object so that its image fills most of the Live Video screen. To maximize the recognition performance of the system, place the object on a neutral, unpatterned background with good lighting and no other objects in view. For more robust recognition, save multiple images of the same object from slightly different perspectives and under variable lighting conditions. This is highly recommended for 3-D objects. Capture two-dimensional objects from the front and back.
3. When the video displays the image as you want it, click the Capture button. A dialog box appears:


In the Label text box, type a label for the object image. This is the name the ER Vision Demo software uses to refer to this object image.
In the Image file text box, a default name is automatically filled in. This image filename is the name of the file in which the captured image is saved. To see a thumbnail of the image, click View.
In the Distance and Units fields, type the number and units of the distance between the object and the camera at the time of capture. For the Units, use inches (in) or centimeters (cm). Valid values are 0 to 500. Specifying distance and units allows an estimation of actual distance when the object is recognized later. If you have a zoom lens on your camera and you change the zoom setting after you set an image distance, the ability to judge distance correctly can be compromised.
In the Sound file text box, specify a sound file to play when the object is recognized, if needed. To use a sound file, specify the path and filename. Use the Browse button to locate the file. You can also type text in the Text text box to be read aloud when the object is recognized. Refer to Sound Recorder in Microsoft Windows Help for information on recording Windows sound files.
4. Click OK. The image and associated sound file are added to the database.
5. Click Learn. The algorithm is trained on the image.

Learn
The Learn button trains the algorithm on your images. You can process the images one at a time or, to save processing time, in groups. Images are not recognized until they are trained.

Recognize
The Recognize button turns recognition On and Off. Object recognition can become very CPU-intensive when multiple objects are recognized or when an Object Library is very large. Use the Recognize button to recognize a few objects at a time to reduce CPU time.

Sound
The Sounds button turns the sound On and Off. This is useful when sound files are associated with objects that will be recognized often.

Prefs
Click the Prefs button to open the Object Recognition advanced parameters. To change a parameter, select its check box and change the value in the corresponding text box, or, for on/off parameters, simply select the check box. Repeat this process as necessary. When you are done, click OK.



Object Recognition Parameters


Feature Strength - Defines the required amplitude of a feature. Using a landscape metaphor, if you view the image as a landscape in which features appear as hills, the feature strength parameter defines the required minimum height of the hills. All hills with a lower height are ignored. The lower you set this value, the more features are found, but the reliability of these features decreases. For environments with many features, set this value close to 1.0. For relatively featureless environments, set it lower. Valid values are 0.05 to 1.0. Use the slider to adjust the value of this parameter. The default value is 1.0.
Feature Quality - Defines the required ratio between the amplitude of the closest, highest neighbor of a feature and the amplitude of the feature itself. Using the landscape metaphor, this parameter defines the required minimum ratio between the height of the highest hill in the neighborhood of a given hill and the height of that hill. This value defines the relative uniqueness of a feature with respect to its neighbors. Lowering this value provides fewer, but more reliable, features. Valid values are 0.05 to 1.0. Use the slider to adjust the value of this parameter. The default value is 0.85.
Use Upsampling - Performs an initial upsampling on the image before extracting features from it. Upsampling means doubling the resolution of the image. This allows the ER Vision Demo to acquire more features in an image, but makes recognition more computationally expensive. The quality of these features is dependent on how the Feature Quality and Feature Strength parameters are set. The system automatically re-trains the database of objects using the selected values of the parameters. The default value is On.

Edge Detection Parameters


Canny Sigma - Sets the width of the filter used to compute the edges in the Outline Screen image. To change the value of this parameter, use the slider and click OK. The default value is 1.25.


Canny tLow - Specifies the lower contrast threshold for the Outline Screen image. To change the value of this parameter, use the slider and click OK. The default value is 0.35.
Canny tHigh - Specifies the upper contrast threshold for the Outline Screen image. To change the value of this parameter, use the slider and click OK. The default value is 0.85.

Display Parameters
Show features - Shows the features that the object recognition algorithm uses to positively identify an object. These features are shown as circles in the Main Screen area. The features with blue circles match the features of an image in the Object Library. The default value is Off. Select the check box to turn it On.
Show match object bounding box - Shows any matches to the Object Recognition library by putting a red box around the object in the Main Screen. The name and the estimated statistics for the recognized object are shown in the Object Recognition Information Area. The default value is On. To turn this option Off, select the check box.
Show labels - Labels the identified object in the Main Screen. This parameter is set to On by default. To turn this option Off, select the check box.
Highlight most likely object - Highlights the object in the Object Library that best matches the recognized object in the live video feed. The object in the Object Library is highlighted with a green box.
Use only best matches - Shows only the best matches between the live video feed and the Object Library. This is set to On by default. To turn this option Off, select the check box. Turning this option Off shows all identified matches.
Save results to file - Saves the recognition results to a file. Type the path and filename in the text box to the right. You can also use the Browse button to browse to a folder. The path is shown in the text box.

Camera and Sound


Resolution - Sets the image resolution of the camera. To change this parameter, select from the list.
Repeat interval (secs) - The number of seconds the software waits between instances of vocalizing an object identification. Type the interval in the text box.

Info
Shows the ER Vision Demo version number and copyright information.

Timing
Click the Timing button to see the current time, average time, best time, and worst time for Object Learning and Object Recognition. The Learning row only lists values if learning occurred during the current session. If learning has not occurred since you started the application, you see a row of zeroes in the Learning row.

The Timing button is set to Off by default. Important Note: An Object Recognition Library must be loaded or established to use the Timing feature.

Object Library
The Object Library shows thumbnail images of the objects that the object recognition algorithm has been trained to identify. The images of the objects in the library, as well as the library file itself, are loaded from disk when the application starts.

Right-click an image to see a menu with several options: View Image, View Features, Properties, Delete and Add.

View Image
Click View Image to see a larger version of the image. When you are done, click the x in the top right corner to close the image box.

View Features
Click View Features to see a larger version of the image with an overlay of circles indicating the locations of features. These features are extracted during training and are used to identify objects.

Properties
You can customize each object's information.
Label - Specify the name of your object. Make sure to use a unique name in order to avoid naming conflicts as your Object Library grows.
Image File - Shows the internal ER Vision Demo name for your image. To see the image, click View.


Distance - Specify the distance from the camera to the object. Valid values are 0 to 500. The object recognition algorithm uses this value to estimate the distance between the camera and the object when the object is recognized. Changing the zoom setting on your web camera after capturing an image results in inaccurate distance calculations.
Units - From the list, specify units in centimeters (cm) or inches (in).
Sound File - Specify the path and filename of a sound file to be associated with the object. Every time the object is recognized, the sound file is played. Click Test to hear the sound file.
Text - Type a phrase to be read aloud when an object is recognized. Click Test to hear the phrase.

Delete
Click Delete to remove the object image from the Object Recognition Library. The system automatically re-trains on the remaining images in the Object Library after the deletion. Make sure you select the item you want to delete. After you delete it, you cannot undelete it. Important Note: Do not delete image files directly from the C:\Program Files\ER Vision Demo directory. ER Vision Demo may function in unexpected ways.

Add
Click Add to add an existing image to an empty box in your Object Library. Provide the following information.
Label - Specify a name for the image file.
Image File - Specify the path and filename of the graphic file to be added to the Object Library. If you are unsure of the path or filename, you can use the Browse button to find the file. After you find the file, you can view it by clicking the View button. ER Vision Demo accepts the .png, .bmp, and .jpg file types.
Distance - Specify the distance from the camera to the object. Valid values are 0 to 500. The object recognition algorithm uses this value to estimate the distance between the camera and the object when the object is recognized. Changing the zoom setting on your web camera after capturing an image results in inaccurate distance calculations.
Units - From the list, specify units as centimeters (cm) or inches (in).
Sound File - Specify the path and filename of a sound file to be associated with the object. Every time the object is recognized, the sound file is played. Click Test to hear the sound file.
Text - Type a phrase to be read aloud when an object is recognized. Click Test to hear the phrase.



Command Line Tools


You can use the command line tools to perform a more extensive evaluation of the object recognition system. You can generate a database of objects from a set of images. The following command-line tools are provided:
objrec_add
objrec_add_list
objrec_del
objrec_list
objrec_recognize
Use the tools to create databases of objects from training images, edit the databases, and perform recognition on new images.

Using the Command Line Tools


To use the command line tools, you must first have access to the executables.

How to find the executables


1. Windows users need to open a DOS prompt.
2. In Windows, change directories to the installation directory by typing:
cd C:\Program Files\ER Vision Demo

In Linux, add Install_dir/bin to your path. You are ready to use the commands. Remember that these functions can only be run from the installation directory.

objrec_add
Use this command to create a database of objects from training images.
objrec_add.exe [<--create>]? <database_file> <image_file> <object_label> <distance> <units>

--create - Pass the --create flag to force the creation of a database with a .mdl extension. If the database does not already exist, failing to include the --create flag results in the program exiting without doing anything.
database_file - Specifies either a precise path to a database file or the name of a database file stored in the current directory or in the default system repository. Failure to include the database file results in an error and program exit.
image_file - Specifies a precise path to an image file. Failure to find the image file results in the program exiting.


object_label - Name of the object associated with the given image. The database has a new object entry for the given image that is identified with the object_label.
distance - Distance between the physical object and the camera when the image was taken.
units - Units of distance.

Examples
objrec_add.exe --create database1.mdl image1.jpg object1 10 cm

Creates the database named database1.mdl containing a model for the object present in image1. The associated label is object1. If the database database1.mdl already exists, the program exits.
objrec_add.exe database1.mdl image2.jpg object2 20 cm

Checks for the existence of the file database1.mdl. The program exits if the database is not found in a valid form. If the database is valid, it adds a new model for the object present in image2.jpg. An image file can be added more than one time by using different object labels. Duplicate object labels are not permitted.

objrec_add_list
Use this command to create a database of objects from multiple training images listed in one or more list files.
objrec_add_list.exe <--create>? <database_file> <list_file>+

--create - Pass the --create flag to force the creation of a database with a .mdl extension. If the database does not already exist, failing to include the --create flag results in the program exiting without doing anything.
database_file - Specifies either a precise path to a database file or the name of a database file stored in the current directory or in the default system repository. Failure to include the database file results in an error and program exit.


list_file - At least one list file must be included on the command line. Any number of valid (existing) list files can be included. Each line of a list file encodes a single object to add to the database. The format is:
<complete image file path> <label for added object> <distance for added object> <units for distance of added object>
Only the image file path is required. Other arguments are completed with defaults if left unspecified. The label defaults to the name of the image, minus the file path. Distance defaults to 10; units of distance default to cm. Duplicate label entries are skipped. For example, a list file that reads:
img1.jpg image1
img1.jpg image1
img1.jpg image2
img1.jpg image3
results in a database that contains 3 objects: image1, image2, and image3. The program also checks that no label about to be added to the database already exists in the database. Any pre-existing label is dropped from the list of files to add.
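For instance, a list file whose entries use all four fields might look like the following sketch (the image paths, labels, and distances are hypothetical values chosen purely for illustration; the last entry relies on the defaults for distance and units):
C:\images\mug_front.jpg mug_front 25 cm
C:\images\mug_back.jpg mug_back 25 cm
C:\images\poster.jpg poster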

Examples
objrec_add_list.exe --create local_database.mdl listfile1 listfile2

Creates the database named local_database.mdl. If the database local_database.mdl already exists, the program exits. The tool parses the list files listfile1 and listfile2, checks for, and removes, duplicate labels and attempts to read each image. All the located images are added to the database with the appropriate labels, distances and units.
objrec_add_list.exe local_database.mdl listfile1

Checks for the existence of local_database.mdl. The program exits if the database is not found in a valid form. If the database is valid, it parses listfile1 and adds each entry. It then exits.

objrec_del
Use this command to delete a model from a database.
objrec_del.exe <database_file> <object_label>


database_file - Specifies either a precise path to a database file, or the name of a database file stored in the current directory or in the default system repository. Failure to include the database file results in an error and program exit.
object_label - Name of the object associated with the model to be deleted from the database.

Example
objrec_del.exe database.mdl object1

Deletes the model named object1 from the database called database.mdl. The new database, not including object1, is saved with the name database.mdl before the program exits.

objrec_list
Use this command to list the models present in a database.
objrec_list.exe <database_file>

database_file - Specifies either a precise path to a database file, or the name of a database file stored in the current directory or in the default system repository. Failure to include the database file results in an error and program exit.

Example
objrec_list.exe database.mdl

Lists all the names, distances and units of the current objects stored in the database.

objrec_recognize
Use this command to specify a set of images to be recognized among objects stored in the database.
objrec_recognize.exe <--verbose>? <database file> <image_file|list_file>+

--verbose - Output more detailed information to the console.
database_file - Specifies either a precise path to a database file, or the name of a database file stored in the current directory or in the default system repository. Failure to include the database file results in an error and program exit.



image_file|list_file - Specifies a list of arguments. Either an image file or a list file is acceptable at any position in the list, and the list can be as long as needed. Image files are opened and checked against the database. If a match is found in the database, a message on the screen shows the label of the match. Any invalid entry is skipped, and processing of the remaining entries continues. List files are formatted with no white space preceding a full image path, and one full image path is allowed per line. Each image in the list file is opened and checked against the database.

Examples
objrec_recognize.exe --verbose local_database.mdl img1.jpg

Checks to see if img1.jpg matches an image in the database. If a match exists, the program displays the image name and the label the image matches.
objrec_recognize.exe --verbose local_database.mdl img1.jpg listfile1

Parses the contents of listfile1 and then recognizes each image. Image names and labels for matches are displayed.

Object Recognition API Overview


The object recognition system contains a minimal number of objects and a clean paradigm for dealing with the actual computations. The object recognition system uses four objects. Three are defined by the subsystem and the last is the ERSP Image object.
ObjRecImageHash manipulates and gets image hashes.
ObjRecDatabase stores a group of ObjRecImageHashes.
ObjRecQuery determines if and how closely an ObjRecImageHash matches any ObjRecImageHashes in an ObjRecDatabase.
ObjRecImageHashes are the base of this system. Given an ObjRecImageHash object, you hash an Image by calling ObjRecImageHash::hash(Image * img). This function can only be called one time on any given ObjRecImageHash object. This guarantees that after an ObjRecImageHash contains valid data, it is immutable. This is a useful property, as ObjRecImageHashes are expensive to compute and copy, so storing them by reference count is the norm.
ObjRecDatabase holds a number of ObjRecImageHashes, and labels associated with them. The most fundamental operation on these objects is ObjRecDatabase::add_hash(const char* label, ObjRecImageHash * hash). This adds a hash to the database. Load and save methods for streams and filenames are also available.
ObjRecQuery does the actual object recognition. ObjRecQuery::query(ObjRecDatabase *database, ObjRecImageHash *hash) computes the recognition of hash on database, and stores all the resultant interesting information in the object that called query. Unlike ObjRecImageHash::hash, query can be called multiple times on the same ObjRecQuery object.
The basic method for performing object recognition is:
1. Load a database from disk or create an empty one.
2. Add some image hashes to the database, if needed. To do this:
Get an Image object from disk, autogeneration, or a camera.
Create an ObjRecImageHash object.
Hash the Image using ObjRecImageHash::hash(Image *image).
Call ObjRecDatabase::add_hash(const char * label, ObjRecImageHash * hash).
3. Get an Image object, as above.
4. Create an ObjRecImageHash and hash it on the Image object.
5. Create an ObjRecQuery object.
6. Call ObjRecQuery::query(ObjRecDatabase* database, ObjRecImageHash *hash).
7. Use the various accessor methods of ObjRecQuery to get the information.

Example
The following is annotated sample code for an example that loads an object database using get_by_filename, gets a camera, reads images off of the camera, and recognizes them against the object database. First, create an object database and load some data.
Result result = RESULT_SUCCESS;
ObjRecDatabase * database = NULL;
database = ObjRecDatabase::get_by_filename("modelset.mdl", false, &result);
if (result != RESULT_SUCCESS)
{
    return result; // fail, can't get modelset.
}

Next, get a camera.


ICamera * camera = NULL;
result = CameraUtil::get_camera_by_id("camera0", &camera);
if (result != RESULT_SUCCESS)
{
    return result; // failed to get camera, so fail.
}

Now set up a for loop to get images and query the database.
unsigned i = 0;
ObjRecImageHash *hash = NULL;
ObjRecQuery *query = NULL;
Image image;
for (i = 0; i < 200; i++)
{
    hash = new ObjRecImageHash();
    query = new ObjRecQuery();
    if (camera->get_image_copy(0, &image) != RESULT_SUCCESS)
    {
        std::cout << "WARN: Camera missed a frame.\n";
    }

This if statement does the hashing and querying, which is the main component of the algorithm. The hash function takes an image, extracts the salient features of that image, and stores them in the hash object as a unique hash of the given image. The final call checks to see if there are any valid matches.
    if (hash->hash(&image) == RESULT_SUCCESS &&
        query->query(database, hash) == RESULT_SUCCESS &&
        query->get_num_matches() > 0)
    {

Call a function to process the query and things you are using with the query.
        custom_processing_function(query, <other arguments>);
    }
    hash->remove_ref();  // don't delete, reference counted objects.
    query->remove_ref();
}

After the processing is done, clean up.


database->remove_ref();
if (CameraUtils::release_camera(camera) != RESULT_SUCCESS)
{
    std::cerr << "Failed to release camera properly.\n";
}

Object Recognition API


Important Note: The following are the functions with brief descriptions. For detailed descriptions of these functions, refer to the Doxygen documents located in Install_dir/doc/ERSP-API/html directory in both Linux and Windows.


ObjRecDatabase

Constructor & Destructor Documentation


ObjRecDatabase - The default constructor. It creates an anonymous database.
~ObjRecDatabase - The destructor.

Static Public Methods


get_by_filename - Gets a reference to a global database.

Member Function Documentation


add_hash - Adds a hash to the database with the corresponding name.
get_model_distance - Obtains the distance / units of the specified model.
get_model_hash_size - Returns the size of the hash table for a specified model.
get_model_label - Obtains the label of the specified model.
has_hash_by_name - Queries the database for the existence of a hash corresponding to a model_name.
load_from - Loads the current database from a file.
load_from - Loads the current database from a stream.
remove_hash - Removes an ObjRecImageHash from the database.
save_to - Saves the current database to file.
save_to - Saves the current database to stream.

ObjRecImageHash
ObjRecImageHash - Constructor.
~ObjRecImageHash - Destructor.

Member Function Documentation


Important Note: All functions in this section are in the Evolution namespace and the ObjRecImageHash class. Specify this by typing:
Evolution::ObjRecImageHash::

before the function name.


get_distance - Gets the distance from the camera to the object and the associated units.
get_distance - Gets the distance from the camera to the object.
get_distance_units - Gets the units of the distance from the camera to the object.
hash - Creates a new hash table of features for a given image.
set_distance - Specifies the distance from the camera to the object.


set_distance_units - Specifies the units of the distance from the camera to the object.
set_flag - Specifies the value of various parameters.
set_image - Sets the optional raw image.
get_timestamp - Obtains the timestamp for the hashed image. (If 0, it is not necessarily valid.)
set_timestamp - Sets the optional timestamp.
get_image_width - Returns the width of the image.
get_image_height - Returns the height of the image.
get_hash_size - Returns the size of the hash table.
peek_image - Obtains the optional raw image, which can be used for debugging and display purposes.

ObjRecQuery

Constructor and Destructor Documentation


ObjRecQuery - Constructor.
~ObjRecQuery - Destructor.

Member Function Documentation


Important Note: All functions in this section are in the Evolution namespace and the ObjRecQuery class. This is specified by typing:
Evolution::ObjRecQuery::

before the function name.


get_match_affine - Obtains the affine transform of a matching image. If a recognized image matches an image in the training set, the object recognition algorithm computes the affine transformation between the training set image and the recognized image. The affine transform is defined as follows. A pixel in the trained image is (x, y); the corresponding pixel in the recognized image is (x', y'):
x' = m3 * y + m4 * x + m6
y' = m1 * y + m2 * x + m5
After query() is called and get_num_matches() indicates that one or more images were matched in the training set, call this function with a match_index to determine the affine transform to one of the images matched. A short standalone sketch that applies this transform appears at the end of this function reference.


get_match_center - Calculates a rectangle bounding the match and returns the coordinates of the centroid of the bounding box. After query() is called and get_num_matches() indicates that one or more images were matched in the training set, call this function with a match_index to determine the parameters of the bounding box of the match.

get_match_distance - Obtains the distance/units of a matching image. After query() is called and get_num_matches() indicates that one or more images were matched in the training set, call this function with a match_index to determine the distance estimate from the camera to the object in the newly-recognized image. This information is computed based on the original distance/unit values specified when the object was originally trained upon.
get_match_hash_size - Obtains the feature match count of a matching image. The match count is a measure of how much correspondence is found with the training set image. The higher this number, the more corresponding features are found in the two images. A high match count coupled with a low residual indicates a strong probability that an object is correctly matched. After query() is called and get_num_matches() indicates that one or more images were matched in the training set, call this function with a match_index to determine the feature match count to one of the images matched.
get_match_label - Obtains the label of a matching object. After query() is called and get_num_matches() indicates that one or more objects were matched in the database, call this function with a match_index to determine the label of one of the objects matched.
get_match_navigational - Calculates azimuth and elevation. The field of view of the camera (in radians) is required. After query() is called and get_num_matches() indicates that one or more images are matched in the training set, call this function with a match_index to determine the navigation parameters of the match.
get_match_rectangle - Calculates a rectangle bounding the match and returns the coordinates of the vertices of the bounding box. After query() is called and get_num_matches() indicates that one or more images are matched in the training set, call this function with a match_index to determine the parameters of the bounding box of the match.


get_match_residual - Obtains the error residual of a matching image. The residual is a measure of how closely the training set image is matched. Use the residual to determine, within a given set of matched images, the relative degree of correspondence. A value of 0 for residual indicates a perfect match. After query() is called and get_num_matches() indicates that one or more images are matched in the training set, call this function with a match_index to determine the residual to one of the images matched.
query - Searches for possible matches to the specified hash table in the given database of models. Use the get_num_matches() method to determine how many matches were found to objects in the database.
set_flag - Specifies the value of various parameters.
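To make the affine transform concrete, the following minimal, self-contained C++ sketch applies the coefficients m1 through m6, exactly as defined under get_match_affine, to a single pixel. The function name, coefficient values, and pixel coordinates are hypothetical illustrations; in a real application the coefficients would come from the ObjRecQuery match results via get_match_affine (see the Doxygen documents for the exact accessor signature).

#include <iostream>

// Illustrative sketch only: maps a pixel (x, y) in the trained image to the
// corresponding pixel (xp, yp) in the recognized image, using the affine
// coefficients m1..m6 as defined in the get_match_affine description.
static void apply_match_affine(double m1, double m2, double m3,
                               double m4, double m5, double m6,
                               double x, double y,
                               double* xp, double* yp)
{
    *xp = m3 * y + m4 * x + m6;
    *yp = m1 * y + m2 * x + m5;
}

int main()
{
    double xp = 0.0, yp = 0.0;
    // Hypothetical coefficients describing a pure translation of (+10, +5).
    apply_match_affine(1.0, 0.0, 0.0, 1.0, 5.0, 10.0, 20.0, 30.0, &xp, &yp);
    std::cout << "Trained-image pixel (20, 30) maps to ("
              << xp << ", " << yp << ")\n";
    return 0;
}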


Chapter 3

Hardware Abstraction Layer

The Hardware Abstraction Layer (HAL) controls the robot's interactions with the physical world and with low-level operating system (OS) dependencies. The HAL receives physical input through sensors, such as cameras and range sensors. It interacts at a physical level through effectors that change the state of the robot's environment, such as by moving or picking up objects.

Resource Drivers and Interfaces


The HAL provides the connection between hardware resources and the resource drivers. A resource is a physical device, connection point, or any other means through which the software interacts with the external environment. Resources are sensors and actuators, network interfaces, microphones and speech recognition systems, or a battery. The software module that provides access to a resource is a resource driver. A driver implementation uses the appropriate operating system or other API function calls to interact with the underlying resource. A driver implements resource functions and is by definition dependent on operating system and device specifics.


To protect higher-level modules from these dependencies, drivers must provide one or more resource interfaces, such as a set of public, well-defined, C++ abstract classes. The infrared (IR) sensor driver provides an IRangeSensor interface with methods to determine the distance to an obstacle. An obstacle avoidance algorithm can request IRangeSensor to determine the position of obstacles to avoid. If you replace the IR sensor with a sonar sensor, the avoidance algorithm requires no changes, because it does not depend upon using an IR device. Using this system, diverse resources can be accessed through identical interfaces. A single resource driver can support multiple interfaces. This allows you to create a layered system with algorithms interacting with the underlying hardware through different interfaces.
Example: A navigation algorithm uses the IDriveSystem interface, providing the basic move and turn commands present in a common differential drive system. To extend the system for an omnidirectional drive, you can create a new IOmniDriveSystem interface and support it in addition to IDriveSystem. The drive can be used with existing components that depend on IDriveSystem, while supporting new omnidirectional algorithms.
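The following stripped-down C++ sketch illustrates this idea. The class and method names (IExampleRangeSensor, ExampleIRDriver, ExampleSonarDriver, get_distance_cm) are hypothetical and are not part of ERSP; the real interfaces, such as IRangeSensor, and their actual method signatures are documented in the Doxygen documents.

#include <iostream>

// Hypothetical resource interface: an abstract class that higher-level code
// programs against, similar in spirit to a range-sensor interface.
class IExampleRangeSensor
{
public:
    virtual ~IExampleRangeSensor() {}
    // Distance to the nearest obstacle, in centimeters.
    virtual double get_distance_cm() = 0;
};

// Hypothetical IR driver implementing the interface.
class ExampleIRDriver : public IExampleRangeSensor
{
public:
    double get_distance_cm() { return 42.0; }   // Would read the IR hardware.
};

// Hypothetical sonar driver implementing the same interface.
class ExampleSonarDriver : public IExampleRangeSensor
{
public:
    double get_distance_cm() { return 120.0; }  // Would read the sonar hardware.
};

// An avoidance routine depends only on the interface, so either driver can be
// substituted without changing this code.
void report_obstacle(IExampleRangeSensor* sensor)
{
    std::cout << "Nearest obstacle at " << sensor->get_distance_cm() << " cm\n";
}

int main()
{
    ExampleIRDriver ir;
    ExampleSonarDriver sonar;
    report_obstacle(&ir);     // IR-based driver.
    report_obstacle(&sonar);  // Sonar-based driver, same calling code.
    return 0;
}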

Resource Categories
There are three resource subtypes:
device
bus
device group
Drivers for all three categories implement IResourceDriver, but the functions they perform require different orders of loading and activation.

Device
Devices are the basic resource type representing a single physical device or other external entity. IR sensors, motors, and cameras are all examples of devices. Devices are the simplest resource drivers.

Bus
A bus is a transport layer used to access external devices. A bus requires software initialization, which must be performed before the initialization of any dependent device. Examples of a resource bus include a serial or USB port, an RS-485 network, or an Ethernet interface. A resource bus contains all devices to which it provides access, and it is guaranteed that the bus driver initializes and activates before any of its dependent device drivers.

Device Group
A device group allows a set of devices to be handled as a single resource. A device group can consist of multiple devices that are viewed and accessed as a single device, such as a drive system composed of multiple motors. A device group can also synchronize or serialize several devices, such as a polling group that accesses other devices regularly. A device group is not activated until all of its component devices are available, and device groups are the last of the three resource categories to be activated.

The Resource Life Cycle and Resource Manager


A resource must be located, instantiated, and activated before it becomes available to the system. After it is no longer needed or the system shuts down, the resource must be released and its memory freed. The Resource Manager is responsible for managing the system resources across their life cycle.

Resource Manager
The Resource Manager determines what resources are available in the current environment, ascertains the binary locations of the appropriate drivers, and creates C++ object instances of those drivers. Then it activates each bus and its devices and finally the resource groups. Activated devices are available for publishing. The Resource Manager proxies requests to resource interfaces through the appropriate drivers. External modules can access only those drivers provided by the Resource Manager. The interaction between resources and other system components occurs through the abstraction of IResourceContainer. The Resource Manager class is currently the only implementation of this interface.
IResourceContainer serves two purposes:

Gives access to the resource drivers from the external environment
Serves as the point of access for the drivers to that external environment
The IResourceContainer as implemented by ResourceManager does not restrict access in either direction. The container abstraction can easily be used to provide subsets of the available resources. It can also limit access to portions of the external environment for its contained resources. With resource drivers and other modules able to communicate, the most important phase of the resource life cycle starts. Eventually, the system shuts down and no further resources are needed. When an external module is done using a resource, it releases the interface through the same IResourceContainer. As driver interfaces are obtained, their reference counts increment. During system shutdown, these counts are decremented until only one reference remains. This last reference is held by the Resource Manager. At shutdown, it deactivates all resources in the reverse order of activation and releases its references to the drivers. This frees the allocated memory.

Resource Configuration
For the Resource Manager to load resources correctly, it must be told which resources to load and how. The resource configuration system provides this and other information about resources and the external environment. While arbitrary storage backends are supported by this framework, currently only an XML-based file backend is available. In this scheme, configuration files are located from a list of search paths, each of which specifies a configuration directory tree. The default search path includes Install_dir/config, for components installed with the ERSP, and, for components local to the system or organization, Install_dir/local/config. If additional directories must be searched, set the EVOLUTION_CONFIGPATH environment variable to include these directories.
The primary resource configuration file is resource-config.xml. The file Install_dir/config/resource-config.xml is in the default installation, with the standard configuration. This file lists the resources present on the system, gives parameters to the appropriate drivers, and provides additional information about the physical configuration of the system. For purposes of driver loading, the most important piece of information about a resource is the type attribute. The resource type refers to a specific resource driver and a set of parameters to the driver. Each type has an XML configuration file to describe it. The description file must have the same name as the type, with an .xml suffix. The configuration system expects it to be located in a resource directory under one of the configuration paths. If the resource type name contains a dot ('.'), the text before the dot is mapped to an additional subdirectory. For example, the description file for the resource driver with type Evolution.Diff2Drive is located at the following path:
resource/Evolution/Diff2Drive.xml

The resource description file specifies both the location and parameters for that driver. By creating separate resource type files, you can use one driver for multiple devices. You do not have to specify all the parameter values each time you use a device.

Resource Configuration XML


The following is a description of the tags and attributes that are accepted by the main resource file.
<Resources> Top-level tag for organization and XML compatibility
<Kinematics> The robot's kinematic configuration (optional)
<Chain> A kinematic chain
<Link> Link in the kinematic chain
<Parameter> Parameter specifying the chain
<Dimensions> Robot's physical dimensions (optional)
<Shape> Shape of the dimensions
<Parameter> Parameters specifying the shape
<Devices> Devices present (optional)
<DeviceBus> Resource bus providing access to devices

Attributes:
ID - ID of the bus
type - Bus type, indicating a type description file
<Device> A resource device


<Parameter> Parameter to the device driver

Attributes:
name - Name of parameter
value - Parameter value


<Groups> The resource groups (optional)
<Group> A resource group
<Parameter> A parameter to the resource group driver
<Member> Member of the group

Attributes:
member_id - ID of the member device
Each resource is described by a resource type description file, which is also called a resource specification file. This file is in XML format, and contains the ID of the resource, the library containing the implementation of the resource, and the name of the resource driver. The resource specification file can also optionally contain default parameter values. If a parameter's value is not specified in the resource-config.xml file, then the value specified in the resource type description file is used. The resource type description file supports the following tags and attributes:
<ResourceSpec> Specification of the resource type

Attributes:
ID - Type ID
library - Library where the driver is located
driver - Name of the driver
<Parameter> Parameter to the resource driver

Attributes:
name - Name of the parameter
value - Default value of the parameter
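As a concrete, but purely hypothetical, illustration of how the two files fit together, the fragment below sketches a made-up device entry in resource-config.xml with type Acme.RangeFinder and the matching type description file, which the configuration system would look for at resource/Acme/RangeFinder.xml. The type names, library, driver name, and parameter are invented for this example and are not drivers shipped with ERSP.

<!-- Fragment of resource-config.xml (hypothetical device) -->
<Devices>
  <DeviceBus id="bus0" type="Acme.SerialBus">
    <Device id="ranger0" type="Acme.RangeFinder">
      <Parameter name="port" value="0"/>
    </Device>
  </DeviceBus>
</Devices>

<!-- Hypothetical type description file: resource/Acme/RangeFinder.xml -->
<ResourceSpec id="Acme.RangeFinder" library="acme_drivers" driver="AcmeRangeFinderDriver">
  <Parameter name="port" value="1"/>
</ResourceSpec>

If the port parameter were omitted from resource-config.xml, the default value given in the type description file would be used instead.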

Standard Resource Interfaces


Standard Resource Interfaces define the functions that a resource driver must implement. There are resource interfaces to control and access sensors, such as ICamera and IRangeSensor. Robot motion is controlled through IDriveSystem, which allows motion to be specified in terms of linear and angular velocity. Feedback from robot motion can be obtained by using IOdometry. Individual motors can be controlled using the IMotorCommand and IMotorQuery interfaces. Asynchronous use of resources is facilitated by IResourceCallback. Human-robot interactions can be achieved by the use of IFace to create facial expressions and ISpeechRecognizer for speech recognition. Buttons, bump sensors, and other switch devices are controlled using the ISwitchDevice resource interface.
Important Note: All of the Standard Resource Interfaces are documented in detail in the Doxygen documents that are included in the ERSP installation. You can find these files in the Install_dir/doc/ERSP-API/html directory for Linux and Windows.

ICamera
When programming the camera interface, an image-capturing device can return a single, uncompressed frame (get_image()) or a frame compressed in JPEG format (get_jpeg_frame()). The driver implementing the ICamera interface returns a frame captured shortly after the invocation rather than a frame buffered before the invocation. To access a single image multiple times or to do image buffering, grab_new_frame() grabs a new frame and stores it in an internal buffer. get_image() returns the raw pixel data of the image.

IDriveSystem
IDriveSystem describes a set of devices, or motors, that provide the means to change the linear and angular velocity of the robot. This interface also provides methods for stopping motion. Using StopTypes(), an electric motor issues a high-current sharp stop, or uses built-in electronic braking to slow smoothly. Using the get_velocities() method, the drive system returns its current linear and angular velocities.

IFace
The public interface for the Facial Graphics Driver (FGD) is defined by the IFace interface class.

IMotorCommand
This is the first layer of abstraction below the IDriveSystem interface.

IMotorQuery
You can query motor encoders to return position information.

IOdometry
An odometer resource tracks the position of the robot. Accurate position data, distance traveled, and current velocities are necessary for navigation algorithms. A method to reset the odometer is also required.

IPollable
This class describes a device that can be polled. The Polling Group driver obtains the IPollable interface from its member devices and calls the poll() method to cause polling.


IRangeSensor
A range sensor is a device that returns a value representing the distance to a physical object.

IResourceCallback
Receives the results of a resource command asynchronously. Asynchronous methods on a resource interface use an IResourceCallback parameter, as well as a parameter to identify the callback specifically.

ISpeechRecognizer
Speech Interface and Functions
A common speech interface accommodates different Automatic Speech Recognition (ASR) and Text to Speech (TTS) engines. The implementation for a particular engine inherits from this abstract interface. There are two abstractions, one for the ASR engine and one for the TTS engine. Most of these functions require an input identifier for a particular instance of a class called ticket. The identifier is helpful where multiple instances of the same class run at the same time.

ASR Interface and Functions


The base interface class for an ASR engine includes functions to control the activation or deactivation of the engine, the loading or unloading of grammars, and the propagation of events to registered callbacks (see the file ISpeechRecognizer.hpp in Install_dir/include/evolution/core/resource). This class virtually implements a client for the recognizer engine. A speech-enabled application needs to instantiate an ISpeechRecognizer object to communicate with the recognition engine and receive the information asynchronously using the callback function set with the appropriate function.

TTS Interface and Functions


The base interface class for a TTS engine includes functions to control the activation or deactivation of the engine, to select the language and the voice used in speech synthesis, and to synthesize speech (see the file ISpeechTTS.hpp in Install_dir/include/evolution/core/resource).

ISwitchDevice
The switch device interface corresponds to an output device that can be turned on and off. An example of a switch device is an LED light connected to an output port of an I/O board.

Supported Drivers

Diff2Drive
Device group to control a two-wheel differential drive system comprising two devices that support the IMotor interface. This driver has no parameters.


Diff2Odometry
Device group to read odometry from a two-wheel differential drive system comprising two devices supporting the IMotor interface. Implements polling_interval(), the interval between polling of the motors.

FGD Driver
The Face Graphics Display (FGD) is a resource that can be selected just as the bump sensors, the motors, or the speech engines can be. To use this resource, you must include the following lines in the corresponding resource-config.xml file.
<DeviceBus id="FGD0" type="Evolution.FGD"> <Device id="robohead" type="Evolution.MorphedFace"> <Parameter name="model_name" value="bluebot"/> <Parameter name="full_screen" value="false"/> <Parameter name="use_keys" value="false"/> </Device> </DeviceBus>

There are three parameters: one to select the face model, another to select full screen display, and the last one to select keyboard interaction with the display. The FGD resource is loaded at the initialization of the programs. You see the display appearing on the screen. The parameter model_name points to one of the directories of Install_dir/data/faces or to one of the directories set by the configuration variable EVOLUTION_FACES_DIR. Inside the corresponding directory, there must be the three files that define the facial model. The FGD driver assumes that these three files are named using the name of the directory and have extensions .ase, .rh, and .rhm. In this example, the three files are called bluebot.ase, bluebot.rh, and bluebot.rhm. These files include the following information:
<name>.ase - The geometry and animation.
<name>.rhm - The morph channel structure and animation.
<name>.rh - Face configuration values.

File Formats
.rh Face Configuration File Format. These files contain the FGD configuration values for specific faces. Possible options are:
Syntax - fov angle yon
Description - Sets the camera's field of view (FOV). The angle is in degrees and represents the horizontal FOV for the camera. The yon value specifies the number of units after which the camera begins automatically culling.
Example - fov 45 1000

Syntax - translate x y z
Description - Moves the camera in the world to the specified point.
Example - translate 0 0 -500

Syntax - rotate ang x y z
Description - Rotates the world ang degrees around the specified axis. The axis can be normalized, but is not required to be.
Example - rotate -90 1 0 0

Syntax - light x y z w
Description - Creates a light at the specified position and orientation. See the OpenGL command glLightfv(GL_POSITION) for more details. The parameters contain four integer or floating-point values that specify the position of the light in homogeneous object coordinates. Both integer and floating-point values are mapped directly; neither is clamped. The position is transformed by the modelview matrix when glLight is called (just as if it were a point), and it is stored in eye coordinates. If the w component of the position is 0.0, the light is treated as a directional source. Diffuse and specular lighting calculations take the light's direction, but not its actual position, into account, and attenuation is disabled. Otherwise, diffuse and specular lighting calculations are based on the actual location of the light in eye coordinates, and attenuation is enabled. The default position is (0,0,1,0); the default light source is directional, parallel to, and in the direction of, the z axis.
Example - light 0 -250 0 0

Syntax - numActions n
Description - Number of actions in this file. This must be followed by the specified number of action commands.
Example - numActions 6

Syntax - action name start loop stop end looping
Description - Determines how an action is animated. The name is a string that is passed to the Robohead::Action() function. The start value is the animation frame to start from. The loop value is the animation frame to return to if looping is enabled. The stop frame is the animation frame at which playback returns to the loop frame if looping is enabled. The end frame is the last frame to be played before ending the animation; the end frame is only reached if looping is disabled.
Example - action walk 30 40 100 110 1

.rhm Morph Data File Format. These files contain all the morph channel data. They are ASCII text files that are exported automatically by using the wfExportMorphs.ms script in 3D Studio Max. Note that 3D Studio Max is not included with your RDK. You probably never need to change these values directly, but the file format is shown in the following section for reference:

Syntax - numMods n
Description - Number of models in the file.
Example - numMods 20

Syntax - numFrames n
Description - Number of animation frames in the file. Animation is played at 30 frames per second, if possible.
Example - numFrames 600

Syntax - morpher name n name0 name1 ...
Description - Creates a morph channel table for the specified model. Each morph target has a name, followed by an integer value representing the number of channels, followed by each channel name. Following that is a table of floating-point channel values from 0 to 1.
Example - morpher eyebrow 3 happy sad angry
.5 .5 0 0 1 0 .2 .7 .9

IBM ViaVoiceTM ASR and TTS Engines


Two speech engines are available for use in user applications: one for input that converts a speech waveform into text (Automatic Speech Recognition, or ASR) and one for output that converts text into audio (Text-to-Speech, or TTS). Both engines are third-party applications included in ERSP. The speech engines are resources that can be selected in a similar way to the bump sensors, the motors, or the Face Graphics Display. To use these resources, you must make sure the following lines in the corresponding resource-config.xml file are not commented out.
<DeviceBus id="ASR0" type="Evolution.ASR"> <Device id="viavoice_asr" type="Evolution.ViaVoiceRecognizer"> </Device> </DeviceBus> <DeviceBus id="TTS0" type="Evolution.TTS"> <Device id="eloquence_tts" type="Evolution.EloquenceTTS"> </Device> </DeviceBus>

You can include one or both engines. The engines must be loaded at program initialization. To use a different set of speech engines, you must write drivers for those engines, following the directives provided in the resource implementation tutorial.

WinVoiceTM ASR and TTS Engines


Two speech engines are available for use in user applications: one for input that converts a speech waveform into text (Automatic Speech Recognition, or ASR) and one for output that converts text into audio (Text-to-Speech, or TTS). Both engines are third-party applications included in ERSP. The speech engines are resources that can be selected in a similar way to the bump sensors, the motors, or the Face Graphics Display. To use these resources, you must make sure that the following lines in the corresponding resource-config.xml file are not commented out.
<DeviceBus id="ASR0" type="Evolution.ASR"> <Device id="vwinvoice_asr" <Device id="winvoice_audio_level" type="Evolution.WinVoiceAudioLevel"/> type="Evolution.WinVoiceRecognizer"> </Device> </DeviceBus> <DeviceBus id="TTS0" type="Evolution.TTS"> <Device id="win_tts" type="Evolution.WinTTS"> </Device> </DeviceBus>

You can include one or both engines. The engines must be loaded at program initialization. To use a different set of speech engines, you must write drivers for those engines, following the directives provided in the resource implementation tutorial.

IEEE1394DC
An IEEE 1394 (Firewire) digital camera driver. Note: this driver is for still cameras, not camcorders. Important Note: This driver is only valid for Linux systems.

RcmBumpSensor
This driver controls a bump sensor connected to a digital input on the Robot Control Module (RCM). The board() method gets the ID of the RCM, and input() represents the digital input on the board.

RcmIrSensor
This driver controls an IR sensor connected to an analog input on the Robot Control Module (RCM).

RcmMotor
A motor connected to the Robot Control Module (RCM). board() returns the ID of the RCM; position_factor() is the factor converting servo counts to the requested units.

RcmNetworkDriver
Device bus handling a Robot Control Module (RCM) network over a serial connection. port() specifies the serial port to which the RCM network is connected; baud() specifies the baud rate at which to run the serial port.

NullResource
An empty resource driver, mainly used for a device bus that requires no initialization or shutdown. This driver has no parameters.

PollingGroup
A resource group of devices supporting the IPollable interface, each of which is polled at a specified rate.

Resource Type
The three resource types - bus, device, and group - are implemented in basically the same way, but their loading order and intended use differ.

A bus driver usually provides initialization and shutdown for some communications bus on which other devices reside. Because those devices depend on the correct functioning of the underlying communications, a bus driver is activated before the devices on that bus. For a motor driver to function, the appropriate bus driver must already be activated to open the serial port and, in some cases, initialize communications on the device network. Some devices do not need bus initialization, for example, a USB camera. In this case, you can use the Evolution.NullResource driver. For an example, see the USB.xml specification file under Install_dir/config/resource/Evolution.

The majority of drivers are devices, like motors, cameras, and speech recognition systems. In general, device drivers simply implement their interface(s) and the basic requirements of a driver. The only complexity arises when a device driver needs some data structure (e.g., a file handle) from its bus driver. In these cases, it is recommended that the bus driver support a special interface providing access to that data structure. The device driver then obtains that interface from the resource container and makes the appropriate method call.

The resource group is the most complicated type of driver to write, and it is also the least common. A group is only required when several devices must be coordinated to function as a whole. Group drivers are activated last and deactivated first. In general, they obtain interfaces from member devices and use them to perform tasks as a group. The Evolution.Diff2Drive driver is a device group implementing a general two-wheel differential drive system. While the driver could be implemented as a single device that sends commands to both hardware motors directly, you would then have to write a new driver each time the motor communications changed. Using the group implementation, the drive system acquires the interfaces from the two motors and sends commands to those drivers, rather than to the hardware directly; only the simple motor driver needs to be rewritten.

Creating a Resource Driver


ERSP provides a number of resource drivers for various hardware devices that are useful in robotics. If you want to extend your robot with additional hardware not supported by the platform, you need to write your own resource driver. This section describes that process. Refer to the tutorial on how to write a driver under /sample code/driver/tutorial in the sample code. You must untar the sample code in order to see this directory.

Resource Interfaces
The Hardware Abstraction Layer (HAL) uses interfaces to separate higher levels of the platform from the details of specific hardware. A resource driver is, at heart, a software component that implements one or more well-known interfaces to control a physical device. When writing a resource driver, you must first decide which interface(s) the driver provides.

If your new hardware is a different model of a device already supported by ERSP (such as a new camera or motor), implement a standard interface from include/evolution/Resource.hpp. If, however, you have a different piece of hardware that is not currently supported, you need to define your own abstract interface class. See the interfaces in evolution/Resource.hpp for examples. If your hardware supports a superset of the functionality defined by an existing interface, derive the new interface from the standard ERSP interface and add methods to expose the additional functionality. For example, if you have a camera that supports instructions to adjust the focus, write a new IExtendedCamera interface that derives from ICamera and adds a new focus() method.

If your driver uses non-standard interfaces, other ERSP components will not be able to access them directly; you need to write one or more behavior components to interact with the driver. If you decide to add an interface, try to make it as general as possible and not tied to a specific device model, in case you switch to a different model later on.
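As an illustration of the IExtendedCamera idea mentioned above, a minimal sketch of such an extended interface is shown below. ICamera is the standard ERSP camera interface referred to in this section; the INTERFACE_ID string, the focus() signature, and its parameter are illustrative assumptions, not part of the shipped API.

class IExtendedCamera : public Evolution::ICamera
{
public: // Constants

    /// Hypothetical interface ID used when requesting this interface
    /// from a resource container (an assumption for this sketch).
    static const char* const INTERFACE_ID; // e.g. "MyNamespace.IExtendedCamera"

public: // Commands

    /// Adjusts the camera focus. The signature and units are assumptions.
    virtual Evolution::Result focus (Evolution::TicketId ticket,
                                     double focus_value) = 0;
}; // end class IExtendedCamera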

Driver Implementation
Resource drivers are implemented by deriving from the abstract class IResourceDriver, in addition to the appropriate interface class(es). IResourceDriver declares the basic methods through which other components of the ERSP interact with a driver. In general, it is easier to derive from ResourceDriverImpl, a helper subclass. A driver then redefines methods for initialization, activation, and the access to interfaces.

Initialization
Initialization in the constructor should be simple, because hardware access and parsing of configuration information generally occur in the activation phase. The constructor initializes the object's member variables to ready it for activation. The constructor, however, needs to have the following type signature:
MyConstructor (const ResourceConfig& resource_config,
               IResourceContainer& resource_container,
               TicketId ticket);

Important Note: Remember that these parameter types are in the Evolution namespace. Specify the namespace completely with each type, import the types with a using directive, or use typedefs in your namespace. The signature of the constructor is important for the IMPLEMENT_RESOURCE2 macro, discussed later.
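For example, a fully qualified constructor declaration might look like the following sketch (MyDriver is a hypothetical driver class used only for illustration):

// Compliant constructor declaration with fully qualified parameter types.
MyDriver (const Evolution::ResourceConfig& resource_config,
          Evolution::IResourceContainer& resource_container,
          Evolution::TicketId ticket);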

Activation
Three methods handle the driver's activation: activate(), deactivate(), and is_active(). The methods are responsible, respectively, for activation, deactivation, and indication of the activation state. Activation and deactivation can occur repeatedly (for example, a failsafe system can deactivate and reactivate a driver that stops working, attempting to reset the device), and a driver should handle such an occurrence correctly. After an activate() and deactivate() sequence, the driver and hardware should be in a state identical to that before activation.

All hardware, threading, and/or network or other communications initialization should occur not in the constructor, but in activate(). Reading configuration information should occur in activate() as well. All shutdown should occur in deactivate(). For a driver that requires threading, the creation and destruction of mutexes, condition variables, and so on can be placed in the constructor and destructor to be used, possibly, across multiple activation and deactivation cycles. The important thing is that a thread doing processing (and consuming system resources) should not be running when the driver is inactive.

The is_active() method should return true if activation completed successfully; otherwise it returns false. It is important that is_active() gives an accurate report of the driver's state. The resource container that manages the driver uses this information to determine whether the hardware is performing correctly and whether the driver is ready to accept requests for interface pointers. Additionally, the activate() method should check is_active() at the beginning of its execution and, if the driver is already active, return success immediately. The deactivate() method should do the converse, checking if the driver is not active. It is not an error to have repeated calls to activate() or deactivate(); the methods' semantics are effectively "activate if not already active".
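The following sketch illustrates this activation pattern for a hypothetical MyDriver. The _active member variable and the body comments are assumptions used only to show the structure described above.

Result MyDriver::activate ()
{
    // Repeated activation is not an error; return success immediately.
    if (is_active ())
        return RESULT_SUCCESS;

    // Read configuration, open the hardware, and start worker threads here.
    _active = true;    // hypothetical flag backing is_active()
    return RESULT_SUCCESS;
}

Result MyDriver::deactivate ()
{
    // Repeated deactivation is not an error either.
    if (!is_active ())
        return RESULT_SUCCESS;

    // Stop threads and shut down the hardware, returning the driver
    // and device to their pre-activation state.
    _active = false;
    return RESULT_SUCCESS;
}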

Obtaining Interfaces
For a driver to be useful, it must expose a well-known interface that other components can access. The obtain_interface() method performs that function. Resource drivers are protected inside a resource container, so when a component wants an interface to a certain driver, it must make the request through the container. The container's obtain_interface() method checks that the driver is active and then calls the driver's obtain_interface() method. The driver must determine whether the requested interface is supported and return a pointer to it. If the interface is not supported, the call returns RESULT_NOT_IMPLEMENTED. If the interface is returned, obtain_interface() calls add_ref(), because a client now has a reference to the driver and is responsible for releasing it through the resource container when done.

In addition to the interface pointer, the obtain_interface() method outputs a reservation count. This count indicates to the container how many clients can hold a reference to the interface simultaneously. If the container supports reservation, it can track references and deny an interface to a client if the reservation count is full. If an arbitrary number of simultaneous accesses is allowed, the reservation count should be zero. A driver can support both a reader and a writer interface, with the reader allowing unlimited accesses and the writer allowing only one. This is in fact the motivation behind having two different interfaces for a motor, IMotorQuery and IMotorCommand. The following is an example implementation of obtain_interface(), from the driver tutorial:
Result FileStreamDriver::obtain_interface (TicketId owning_token,
                                           const char* interface_name,
                                           IResource** resource_interface,
                                           unsigned& reservation_count)
{
    // Check for the supported interface.
    if (std::strcmp (interface_name, IStream::INTERFACE_ID) == 0)
    {
        // Assign this first to a pointer of the
        // interface type, so that the compiler will
        // handle it correctly.
        IStream* ptr = this;
        *resource_interface = (IResource*) ptr;
    }
    else
    {
        return Evolution::RESULT_NOT_IMPLEMENTED;
    }
    add_ref ();  // Don't forget to add the reference!
    return Evolution::RESULT_SUCCESS;
}

Driver Registration
A new driver or set of drivers is placed in a shared library, to be plugged in to the ERSP framework. After the library is loaded, instances of the driver class can be created. How is the resource container able to create instances of drivers in external libraries? As the library is mapped into memory, the driver class registers its type ID and a factory function with a centralized registry. When the resource container requests an instance of that type, the factory function is called to create the driver instance.

The above process is automated by the use of two macros and a compliant constructor. Simply follow these two steps:

1. Use the DECLARE_RESOURCE macro inside the declaration of your class, usually in a header file. The macro specifies the name of the driver class and a string driver type ID. Generally, the driver ID should have some namespace prefix to differentiate it from a possible driver with a similar name:
class MyDriver : public Evolution::ResourceDriverImpl,
                 public ... interface(s) ...
{
    ...
public:

    DECLARE_RESOURCE(MyDriver, "Namespace.MyDriver");
    ...
}; // end class MyDriver

2. Place the IMPLEMENT_RESOURCE2 macro in the class implementation file (.cpp), specifying the class name:
IMPLEMENT_RESOURCE2(MyDriver);

Now you can create instances of the driver, using the driver type ID string.

Advertising the Driver


You need to ensure that the driver is loaded. The driver type and its library must be advertised in some manner to enable dynamic loading. The XML configuration format serves this purpose, so every resource driver has an XML specification file. By default, ERSP searches the Install_dir/config/resources directory for resource specifications. For resource files not installed with the system, ERSP searches for a resource directory under the paths specified in the EVOLUTION_CONFIG_PATH environment variable. The .xml suffix is appended to the requested driver type ID (the same ID specified in DECLARE_RESOURCE) to locate the file, and a period (.) in the ID indicates a subdirectory. For example, the type ID Namespace.MyDriver is looked up as Namespace/MyDriver.xml under the resource specification directories. The specification file names the driver library that contains the driver, minus its platform-specific suffix. If the driver type has not registered itself already, ERSP attempts to load the library, using the usual OS method for loading shared libraries.

Chapter 4

Behavior Execution Layer

The Behavior Execution Layer (BEL) of the ERSP architecture is a framework for building autonomous robotics systems. Applications use the BEL to acquire sensory input, make decisions based on that input, and take the appropriate actions. The fundamental building block of the BEL is the behavior, defined as a computational unit that maps a set of inputs to a set of outputs. This definition generalizes the classical idea from behavior-based robotics, making all behaviors support an identical interface for consistency and maximal interoperability. Behaviors cover a wide range of the robot's functions, from driving sensors and actuators to mathematical operators, algorithms, and state machines.

Behavior inputs and outputs are called ports. Each output port can have connections to an arbitrary number of input ports. A port is characterized by its data type, data size, and semantic type together, indicating the structure and use of the data passing through that port. These port attributes determine the validity of connections, because the start and end ports must have matching types. For example, a text string output from a speech recognizer cannot connect to a port expecting an array of range sensor data.

Chains of connected behaviors form behavior networks. ERSP executes all behaviors in the network sequentially and at the same rate. In this sequential model, behavior execution occurs in two stages: the behavior first receives data from its input connections, and then it computes and pushes its output. For a behavior to operate on the most current input in a given network cycle, the behaviors from which it receives input must be executed first. Accordingly, ERSP uses a partial ordering such that, for a behavior A connected to another behavior B, A executes before B.

The Behavior Life Cycle and Environment


While behaviors and behavior networks are implemented as C++ objects that are available to any C++ program, the much easier and more common method of creation is through behavior XML files. The behavior XML environment provides a good model for examining the behavior life cycle. The behavior XML interpreter is the behave application. It instantiates behavior networks and runs them, providing an execution environment for the behaviors, according to the specifications of the XML files.

The behavior XML format has two main structures: the behavior network and behavior schema files, similar to the resource configuration and specification files in the Hardware Abstraction Layer. A behavior network file specifies one or more behavior networks, each with constituent behaviors and the connections between them. Each behavior in the network is an instance of a certain type, called a behavior schema, each represented by a schema file. The schema file indicates the location of the binary code implementing the behavior, as well as that behavior's input and output ports and runtime parameters.

From the behavior network file, the behave application creates each network instance, which is in turn responsible for instantiating its constituent behaviors. To create those behaviors, the behavior network examines the type of each instance and then finds its corresponding schema file. Using the binary location specified therein, the network is able to create the behavior object in memory. After all the behavior instances are created, the network connects them as specified by the XML file. The behave application runs the network with the speed and duration given by the XML file and by command line options. When the network execution is complete, the network and its behaviors are stopped and the memory is released.

Behavior Configuration

Behavior Network File Format


The following is the format of a behavior network file:
<Behaviors> - Top-level key for behavior configuration.

<BehaviorNetwork> - Contains the behaviors and connections for a behavior network. Specifies the behavior container type to be used.
Attributes:
ID: Unique identifier for the network
library: Optional. The name of the library (defaults to __internal) containing the behavior network implementation
type: The network type ID

<Behavior> - A behavior instance, specifying the schema (type) and containing parameter values for the instance.
Attributes:
ID: Unique identifier for the behavior instance
type: The type of the behavior schema, used to instantiate and check the instance
input_reps: For a behavior schema that has variable inputs, the number of input ports
output_reps: For a behavior schema that has variable outputs, the number of output ports

<Parameter> - Value of a single parameter.
Attributes:
name: The name of the parameter
value: The value of the parameter. If not specified, the default value from the schema is used

<Connection> - A connection between behavior instances.
Attributes:
source: The ID of the source (upstream) behavior from which the connection originates
source_port: The output port name from which the connection originates
target: The ID of the target (downstream) behavior at which the connection ends
target_port: The input port name at which the connection ends
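For reference, a small network file using these tags might look like the sketch below. The network type value, behavior type IDs, port names, and lowercase attribute spelling are illustrative assumptions chosen to match the resource-config examples earlier in this guide; they are not shipped schemas.

<Behaviors>
  <BehaviorNetwork id="example_network" type="Evolution.BehaviorNetwork">
    <Behavior id="averager" type="MyNamespace.MyBehavior">
      <Parameter name="parameter1" value="1.0"/>
    </Behavior>
    <Behavior id="logger" type="MyNamespace.LoggerBehavior"/>
    <Connection source="averager" source_port="output_data"
                target="logger" target_port="input_data"/>
  </BehaviorNetwork>
</Behaviors>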

Behavior Schema File Format


The following is the format of a behavior schema file:
<BehaviorSchema> - Defines the name and location of the behavior.
Attributes:
type: The type of the driver
library: (optional) The name of the library (defaults to __internal) containing the behavior driver
description: User-friendly description
display_name: A short display name

<Parameter> - (optional) 0 or more. Specification of a single parameter.
Attributes:
name: An identifier naming the parameter
type: The type of the parameter. Can be Boolean, enum, double, doublearray, or string
default: (optional) The default value of the parameter. If specified, the parameter is optional in the instance

<Inputs> - Defines the inputs to the behavior.
Attributes:
sizing: Optional. Must be fixed (the default) or variable. Specifies whether the number of input ports is fixed for all behavior instances of the schema, or whether different instances can have different numbers of inputs.
For variable inputs, zero or more ports can be defined explicitly. If the number of inputs exceeds the number of defined ports, the additional ports are assumed to be of the type of the last port. If no ports are defined, all ports are assumed to be of data type generic.

<Port> - Defines an input port.
Attributes:
name: The name of the port. Must be alphanumeric, '_', or '-', with the first character being a non-digit
data type: (optional) Must be generic (the default), Boolean, enum, double, string, doublearray, or multivalue (equivalent to the types in BehaviorData)
semantic_type: (optional) Specifies the type information for the containing port, acting as a compatibility filter in addition to data types
min_connections: (optional) The minimum number of connections to the port. Defaults to 0
max_connections: (optional) The maximum number of connections to the port. Defaults to infinity

<Repeat> - (optional) Specifies a repetition of one or more Port tags. If used, the Port tags are nested within.
Attributes:
min: (optional) The minimum count, defaults to 1
max: (optional) The maximum count, defaults to infinity

<Outputs> - Defines the output ports. All attributes and contained Port and Repeat tags are the same as with Inputs.
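A corresponding schema file for the hypothetical MyNamespace.MyBehavior used above might look like the following sketch; the library name, port names, and attribute spelling are assumptions for illustration only.

<BehaviorSchema type="MyNamespace.MyBehavior"
                library="mybehaviors"
                description="Averages its double inputs"
                display_name="Averager">
  <Parameter name="parameter1" type="double" default="1.0"/>
  <Inputs>
    <Port name="input_data" min_connections="0"/>
  </Inputs>
  <Outputs>
    <Port name="output_data"/>
  </Outputs>
</BehaviorSchema>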

Data Types

Data Type: Unknown


Uninitialized BehaviorData and BehaviorDataWriter objects have type BehaviorData::TYPE_UNKNOWN. This type has no get/set accessor functions. An object also has this type after the clear() method is called on the data.

Data Type: Boolean


This data type is a simple Boolean value. Use get_boolean() to return a Boolean data value from a BehaviorData. Use set_boolean(bool x) to set a Boolean value for a BehaviorDataWriter.

Data Type: Enumeration


Enumerators are long integers. Use get_enum() to return an enumeration data value from a BehaviorData. Use set_enum(long x) to set an enumeration value for a BehaviorDataWriter.

Data Type: Double


Standard floating point double data type. Use get_double() to return a double data value from a BehaviorData. Use set_double(double x) to set a double value for a BehaviorDataWriter.

Data Type: String


BehaviorData objects using this data type return a pointer to character (char*), not a std::string object. Use get_string() to return a string data value from a BehaviorData. The returned (char*) is constant. Use set_string(const char* x) to set a string value for a BehaviorDataWriter. Note that the contents of the string are copied into the BehaviorDataWriter.

Data Type: Multivalue


The Multivalue type represents a set of values of arbitrary dimension. It is similar to an array of one or more dimensions, specifically used to represent discrete values of a function. Each index of a Multivalue corresponds to the value of a function at some point. For example, suppose that f(x) = x^2 is represented by a multivalue on the interval [-1, 1] with a step size of 0.5. The multivalue has one dimension with size 5, for x = { -1, -0.5, 0, 0.5, 1 }, and the actual values in the multivalue are { 1, 0.25, 0, 0.25, 1 }. Use get_multivalue() and set_multivalue() to get and set the multivalue into the BehaviorData.

Data Type: Character Array


This data type uses the CharArray object, not a simple char* pointer (as used by TYPE_STRING). CharArray implements a standard character pointer array with boundary checking. Use get_char_array() to return a Character Array from a BehaviorData. The get() function does not copy the whole structure on access; instead, it returns a pointer to the array object. Use set_char_array(CharArray& x) to set a Character Array in a BehaviorDataWriter. The set function creates a copy of the input data object when x is passed as a reference. When x is passed as a pointer, only the CharArray pointer is copied. If the code calls set_char_array() in pointer mode, a call to get_char_array() on the other end receives a pointer to the same object; if set_char_array() is passed a reference, the reader receives a pointer to a different object.

Data Type: Double Array


This data mode is identical to TYPE_CHAR_ARRAY, except that it uses doubles instead of characters. Use get_double_array() to return a Double Array from a BehaviorData. Use set_double_array(DoubleArray& x) to set a Double Array in a BehaviorDataWriter.

Data Type: Image


The image type is defined in <evolution/core/resource/Image.hpp>. Use get_image() and set_image() to get and set images into the BehaviorData.

Data Type: Matrix Double


A matrix of double values, as defined in <evolution/core/math/Matrix.hpp>.

Data Type: Vector Field


A matrix of Vector2 values, as defined by <evolution/core/math/Matrix.hpp> and <evolution/core/math/Vector2.hpp>.

Data Type: Point Set


A set of three-dimensional points.

Data Type: IObject


An IObject pointer, as defined in <evolution/core/base/IObject.hpp>. The IObject mechanism allows arbitrary data types to be stored in the BehaviorData; the object only needs to derive from IObject (usually ObjectImpl) and implement the obtain_interface() to expose its methods.

Data Type: Pointer


A raw pointer. Used by the TaskValue type in the Task Layer, which derives from BehaviorData.

Implementing Behaviors
At the implementation level, a behavior is a C++ class that derives from the abstract class IBehavior and provides all of its declared methods. The IBehavior interface specifies a number of methods to control the behavior's activation, connections, computation, and data propagation. Most behaviors do not need to derive from IBehavior directly and implement all of its methods. Behaviors can inherit from the BehaviorImpl helper class, which implements all IBehavior methods and requires that its subclasses provide only a single method.

BehaviorImpl and the IBehavior Interface


Since BehaviorImpl is a reference implementation of the IBehavior interface, it serves as a model for those rare cases when a new implementation of IBehavior is needed. When implementing IBehavior directly, each behavior must maintain its own table of connections and push its output data directly to the behaviors on the receiving end of those connections. While all connections could be managed centrally by the network object, such a model would rely on an intermediary lookup for each data object passed on each connection every execution cycle. The direct connection model adds some complexity to the behavior implementation, but it is much more efficient, which matters for high-frequency execution. Also, because BehaviorImpl handles both the connection table and data propagation, it is no trouble for the behavior implementor.

The execution model is another crucial piece of IBehavior. Behavior execution occurs in two phases: the propagation of input data and computation. These phases correspond to the push_input() and invoke() methods, respectively. A behavior receives all of its input data through calls to push_input() before invoke() is called. The behavior first computes its output within invoke(), then pushes the output over its connections by calling push_input() on the behaviors at the end of those connections. Each time a behavior is invoked, data propagates across a single level of connections; when all behaviors in a network have been invoked, data has propagated through the entire network.

In practice, a behavior inheriting from BehaviorImpl provides neither push_input() nor invoke() methods, relying instead on the base class' implementations. BehaviorImpl::push_input() caches input data stored by input port and source. BehaviorImpl::invoke() calls compute_output(), which must be present in the subclass, and then pushes the resulting output data across the behavior's output connections.
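Putting the pieces together, a header for a behavior built on BehaviorImpl might look like the sketch below. The DECLARE_BEHAVIOR macro, PortId constants, include path, and constructor signature follow the aggregate behavior header shown later in this chapter; treating them as identical for a plain BehaviorImpl subclass is an assumption.

#include <evolution/core/behavior/BehaviorImpl.hpp>

namespace MyNamespace
{
    class MyBehavior : public Evolution::BehaviorImpl
    {
    public: // Constants

        // Input ports.
        static const Evolution::PortId INPUT_DATA = 0;
        static const Evolution::PortId INPUT_PORT_COUNT = 1;

        // Output ports.
        static const Evolution::PortId OUTPUT_DATA = 0;
        static const Evolution::PortId OUTPUT_PORT_COUNT = 1;

    public: // Structors

        DECLARE_BEHAVIOR(MyBehavior, "MyNamespace.MyBehavior");

        MyBehavior (Evolution::TicketId ticket,
                    Evolution::IBehaviorContainer& container);

    protected: // IBehavior implementation

        /// The single method a BehaviorImpl subclass must provide.
        Evolution::Result compute_output ();

    }; // end class MyBehavior

} // end namespace MyNamespace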

Input and Output Ports


A behavior reads data from input port(s), does some calculations, and writes data to output port(s). Each behavior has its own set of input ports and output ports to handle this data. Behaviors in the network are linked together using these ports. The output of one behavior links into the input of the next.

When a behavior does a calculation, it sends data through an output port that connects to another behavior's input port. The input port buffers the computed data. All data values in the port are encapsulated in the BehaviorData class. BehaviorData objects can contain integers, doubles, strings, and more complicated structures such as multi-valued functions. Each port can hold any number of BehaviorData objects, including zero if there is no valid data to send. In this case, the interface returns a null BehaviorData pointer instead of actual data.

Another important feature of data ports is that any number of output ports can link to the same input port and vice versa. When two or more output ports link to the same input port, the input port cannot determine which behavior sent what data. Behaviors assume that all input data in the same port is interchangeable, or commutative. When you need a behavior to distinguish input data from multiple other behaviors, you must give the behavior separate input ports. For example, consider the difference between an AddOperator and a PowerOperator:

Because (A + B) = (B + A), an AddOperator only needs one input port. The order in which inputs arrive is not well defined, but that does not matter for addition. The addition behavior just reads all data from a port, adds it together, and sends the result to the output port. For exponents, it is not true that (A ^ B) = (B ^ A). Consequently, you cannot read two inputs from the same port, because it is not clear which input is supposed to be the base and which is supposed to be the exponent. Instead, the data for a power operator must come in from two separate ports, presumably INPUT_BASE and INPUT_EXPONENT.

Reading from Ports


Ports are identified by constant integers, for example, INPUT_DATA instead of 0. Each input port contains a list of all data written into that port. To access the table of all data waiting in the INPUT_DATA port, use this code sample:
PortInputTable& inputs = get_port_input (INPUT_DATA);

The PortInputTable object contains all the data waiting inside a port. Normally each port contains data that the behavior interprets in different ways. For this reason, it is unlikely to see the following construct, which iterates over every port:
for (int port_num = 0; port_num < INPUT_PORTS; port_num++)
{
    PortInputTable& inputs = get_port_input (port_num);
    // Do some processing on input list
}

However, similar constructs are used for behaviors whose number of ports are determined at runtime. In this case, the BehaviorImpl constructor gets passed 0 for the input or output port count, and the actual number is determined from the configuration file. Because a PortInputTable is just a list of input data objects, you can use iterators to access each input it contains.
PortInputTable& inputs = get_port_input (INPUT_DATA);
for (PortInputIterator input = inputs.begin ();
     input != inputs.end (); ++input)
{
    BehaviorData* data = input.get_data();
    // Process data from port
}

If an input port has only one value, or you only care about one value, this code sample accesses the first value in the port.
PortInputTable& inputs = get_port_input (INPUT_DATA);
BehaviorData* data = BehaviorImplHelpers::get_first_data(inputs);

If there was no data, the BehaviorData pointer equals NULL. Assuming the behavior data actually exists (is non-null), you can check its type with the get_type() function. For example, the following code sample checks for double inputs:
if (data && data->get_type() == BehaviorData::TYPE_DOUBLE)
{
    // Do something
}

After the data type is known, simple accessor functions retrieve the data:
if (data && data->get_type() == BehaviorData::TYPE_DOUBLE)
{
    double x = data->get_double();
}

Writing to Ports
Writing data to output ports is similar to reading data from input ports. The BehaviorDataWriter class writes the output data. BehaviorData supports get_<TYPE> functions, but BehaviorDataWriter has set_<TYPE> functions:
double value;
// Compute value
BehaviorDataWriter* output = get_port_output(OUTPUT_DATA);
if (output == NULL)
    return RESULT_OUT_OF_MEMORY;
output->set_double(value);

To send a NULL object through a port (equivalent to writing nothing), do not access the output port using get_port_output(). The behavior automatically sends NULL data if the output port was not allocated. Alternatively, you can invalidate the output port manually:
mark_output_valid(OUTPUT_DATA, false);

This is necessary when the output data writer was accessed, but the code later decided to send a NULL.

The Compute_output() Function


Putting this all together, you can create a simple compute_output() function for a behavior. The compute_output() in this code sample computes the average of all double input values on the input port.
Result MyBehavior::compute_output ()
{
    double total = 0.0;
    int num_inputs = 0;

    // Grab list of all inputs in the INPUT_DATA port
    // and iterate over them
    PortInputTable& inputs = get_port_input (INPUT_DATA);
    for (PortInputIterator input = inputs.begin ();
         input != inputs.end (); ++input)
    {
        // Grab the next data object in this port
        BehaviorData* data = input.get_data();

        // Only process the data if it's a valid double
        if (data == NULL || data->get_type() != BehaviorData::TYPE_DOUBLE)
            continue;

        // Add data to running total
        total += data->get_double();
        num_inputs++;
    }

    // Check for no valid inputs
    if (num_inputs == 0)
    {
        // Note that this case is still a success.
        // Nothing went wrong in processing. The defined behavior
        // for "no valid inputs" is "no valid output".
        return RESULT_SUCCESS;
    }

    // Obtain the output writer
    BehaviorDataWriter* output = get_port_output(OUTPUT_DATA);
    if (output == NULL)
        return RESULT_OUT_OF_MEMORY;

    // Compute and set total
    output->set_double(total / num_inputs);

    // Completed computation
    return RESULT_SUCCESS;
}

The compute_output() function is called automatically when the behavior network is run. Running the network involves subsequently calling compute_output() on each behavior and passing the data into other behaviors in the network. Behaviors themselves do not need to call this function.

XML Interface to Behaviors


The XML behavior network file passes configuration information to behaviors through their parameters. To use parameters, place the following in the class declaration in the header file, in a public section.
DECLARE_BEHAVIOR_PARAMS;

Also, place the following in the implementation file.


BEGIN_BEHAVIOR_PARAMS(MyBehavior, MyBehaviorsParent); // Usually MyBehaviorsParent is BehaviorImpl
BEHAVIOR_PARAM(MyBehavior, "parameter1", parameter1);
BEHAVIOR_PARAM(MyBehavior, "parameter2", parameter2);
BEHAVIOR_PARAM(MyBehavior, "parameter3", parameter3);
END_BEHAVIOR_PARAMS (MyBehavior);

The call to the BEHAVIOR_PARAM macro has three arguments.

The first is the name of the behavior. The second is the name of the parameter in the XML files. The third is the tail end of the name of the function to set the parameter (that is, for an argument of parameter1, the function used by the macro is named set_parameter1()). After that is complete, the behavior needs to declare and implement the set functions so that the parameter loading routines can use them to set the behavior's internal data. The form of the set functions is as follows:
Result set_<parameter_name> (TicketId ticket, const char* value);

Result set_<parameter_name> (TicketId ticket, const char* value)
{
    // Parse the char* in value for the data to store in
    // member variables. Return RESULT_SUCCESS or RESULT_FAILURE
    // based on success or failure.
}

Here <parameter_name> is replaced with the variable name, such as set_parameter1. When all the macros are in place and all set functions are implemented, the behavior is ready to have its parameters set when it is instantiated, based on the parameters in the XML file containing it. Important Note: If a child behavior does not redefine a parent's parameter, then the parent's parameter is used. Multiple inheritance is not supported.
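As a concrete sketch, suppose MyBehavior has a single double parameter named gain. The class name, parameter name, _gain member, and use of atof here are illustrative assumptions.

// In the header, inside the class declaration (public section):
DECLARE_BEHAVIOR_PARAMS;

// Declaration of the set function used by the parameter macros.
Result set_gain (TicketId ticket, const char* value);

// In the implementation file:
BEGIN_BEHAVIOR_PARAMS(MyBehavior, BehaviorImpl);
BEHAVIOR_PARAM(MyBehavior, "gain", gain);
END_BEHAVIOR_PARAMS(MyBehavior);

Result MyBehavior::set_gain (TicketId ticket, const char* value)
{
    if (value == NULL)
        return RESULT_FAILURE;

    // Parse the string value into the hypothetical _gain member variable
    // (atof comes from <cstdlib>).
    _gain = atof (value);
    return RESULT_SUCCESS;
}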

Aggregate Behaviors: Combining Behaviors


Aggregate behaviors are used to make components: behavior networks that perform one function and are used as a single block rather than as a complex group of individual behaviors. You can place a behavior network inside an outer behavior, such that only the input and output ports of the aggregate are exposed. Parameter parsing is done for the behavior as a whole, which makes automatic or dynamic configuration of the behaviors easier. This makes large or complex behavior networks much easier to construct and debug by reducing the number of behaviors you must address at one time. It also reduces the complexity of behavior networks created with a visual tool. The end of this section shows a sample header to help with your first aggregate behavior. You can refer to it as you read the following explanation of how to create an aggregate behavior.

Important Note: The header file needs to include evolution/Behavior.hpp and the headers of whatever behaviors make up the aggregate.

The aggregate needs to define input/output ports like any other behavior. Access behaviors inside the aggregate by using an ID string and the IBehaviorNetwork interface. To refer to a static number of behaviors inside an aggregate, define a set of constant static strings or char*s, and use each string to refer to a given behavior. When using a variable number of behaviors, or making a general behavior aggregate that obtains some or all of its list of behaviors at runtime, pre-defining the behavior names does not work as well; use a systematic naming scheme to refer to dynamically generated names. It is useful to pass an INPUT/OUTPUT_PORT_COUNT variable directly to the structors. Just as in a normal behavior, you can pass 0 to the inputs or outputs and define how many ports there are during configuration or dynamically.

You can do internal behavior setup in the constructor only if there are no parameters to read in from XML network files. If you need to read in parameters, you need to override the initialize function defined in BehaviorImpl.hpp, and call BehaviorImpl::initialize on the first line of the overridden function. This ensures that parameters are loaded in and available to set up any internal behaviors. A sample function body:
Result NewAggregate::initialize (TicketId ticket, BehaviorConfig& config)
{
    Result result = RESULT_SUCCESS;
    BehaviorImpl::initialize(ticket);
    result |= create_internal_behaviors(ticket);
    result |= connect_internal_behaviors(ticket);
    return result;
}

Inside the create_internal_behaviors function, create each behavior dynamically and pass it to get_aggregate_network().add_behavior(). get_aggregate_network() is an accessor function in BehaviorAggregate that returns the container of the behaviors. The creation function and possibly the connection function need to access the configuration data that is associated with the container that is holding the NewAggregate.

get_resource_container(TicketId, IResourceContainer**) returns a Result and places an IResourceContainer* into the second argument if it succeeds.

Inside the connect_internal_behaviors function, determine which behaviors must be connected, and connect them using the connect_behaviors() function available on the IBehaviorNetwork class, using the IBehaviorNetwork instance _aggregate_network in BehaviorAggregate (the parent class of NewAggregate). The _aggregate_network reference is obtained by calling get_aggregate_network().

None of the previously described functions, except the structors, are required. However, it helps the clarity of the class to move the creation and connection of behaviors into their own functions. If you are using parameters, then you must inherit load_configuration to have access to the configuration information for creating behaviors. BehaviorAggregate is a child of BehaviorImpl, and the newly created aggregate must be a child of BehaviorAggregate, to ensure that load_configuration is inherited. There is no need to write a compute_output() for aggregates, because it is already implemented in the base class BehaviorAggregate. You must write only the functions that create and connect the component behaviors.
#include <evolution/core/behavior/BehaviorAggregate.hpp>

namespace Evolution
{
    class NewAggregate : public BehaviorAggregate
    {
    public: // Constants

        // Input ports.
        static const PortId INPUT_PORT_DATA1 = 0;
        static const PortId INPUT_PORT_DATA2 = 1;
        static const PortId INPUT_PORT_DATA3 = 2;
        static const PortId INPUT_PORT_COUNT = 3;

        // Output ports.
        static const PortId OUTPUT_PORT_DATA1 = 0;
        static const PortId OUTPUT_PORT_COUNT = 1;

        // Behavior IDs.
        static const char* const BEHAVIOR_ONE_ID = "beh_one";
        static const char* const BEHAVIOR_TWO_ID = "beh_two";

    public: // Structors

        DECLARE_BEHAVIOR(NewAggregate, "MyNamespace.NewAggregate");
        DECLARE_BEHAVIOR_PARAMS;

        /// Constructor.
        NewAggregate (TicketId ticket, IBehaviorContainer& container);

        /// Destructor.
        virtual ~NewAggregate();

    protected: // Helpers

        /**
         * Creates internal behaviors and adds them to the internal
         * behavior network.
         */
        Result create_internal_behaviors (TicketId ticket,
                                          BehaviorNetworkImpl& network);

        /**
         * Connects the behaviors on the internal behavior network.
         */
        Result connect_internal_behaviors (TicketId ticket);

        /**
         * Only required if parameters are being loaded from XML files.
         */
        Result load_configuration (TicketId ticket, BehaviorConfig& config);

    }; // end class NewAggregate

} // end namespace Evolution

Relevant Header Files and Functions


get_aggregate_network()
evolution/core/behavior/BehaviorImpl.hpp

Data Passing between Behaviors


To help move data through a behavior network, all transmitted data is encapsulated in a BehaviorData object. BehaviorData is a generic data class that wraps many possible data types. The data in a BehaviorData object can be as simple as a Boolean value (BehaviorData::TYPE_BOOLEAN) or as complicated as an image (BehaviorData::TYPE_IMAGE). The following data types are defined in BehaviorData:
TYPE_UNKNOWN
TYPE_BOOLEAN
TYPE_ENUM
TYPE_DOUBLE
TYPE_STRING
TYPE_MULTIVALUE
TYPE_MATRIX_DOUBLE
TYPE_VECTOR_FIELD
TYPE_CHAR_ARRAY
TYPE_DOUBLE_ARRAY
TYPE_IMAGE
TYPE_POINT_SET
TYPE_IOBJECT
TYPE_DATA_POINTER

Their specific interfaces are discussed below.

Input Data Interface


When reading data, the first step is to identify its type. If there is no real data, the BehaviorData pointer is null. Always make sure that data exists before running code that uses that data.
if (data) { // Do something }

Assuming you have acceptable behavior data, the next step is to check its type with the get_type() function. The following sample code checks for double inputs:
if (data && data->get_type() == BehaviorData::TYPE_DOUBLE)
{
    // Do something
}

The specific accessor functions are named get_<type>. This code sample gets a double:

if (data && data->get_type() == BehaviorData::TYPE_DOUBLE)
{
    double x = data->get_double();
}

Output Data Interface


The BehaviorDataWriter class complements the BehaviorData class. Whereas BehaviorData contains input read from a port, BehaviorDataWriter contains output to send to a port. You do not need to check the type for a BehaviorDataWriter - the code that accesses the writer sets the data itself, so the type is implicitly known. For each data type, there is a different set accessor function in the format set_<type>. For example:
BehaviorDataWriter* output = get_port_output(SOME_OUTPUT_PORT);
if (output == NULL)
    return RESULT_OUT_OF_MEMORY;
double value;
output->set_double(value);

If you do not call get_port_output, the port defaults to sending no data. Alternatively, if the BehaviorDataWriter has already been accessed, you can manually invalidate the output port like this:
mark_output_valid(SOME_OUTPUT_PORT, false);

Note that the get_<type>() function defined for BehaviorData can also be used by BehaviorDataWriters because BehaviorDataWriter inherits from BehaviorData. In practice, however, only BehaviorData objects receive data and only BehaviorDataWriter objects send data.

Chapter 5

Task Execution Layer

The Task Execution Layer (TEL) provides a high-level, task-oriented method of programming an autonomous robot. By sequencing and combining tasks using functional composition, you can create a flexible plan for a robot to execute while writing code in a traditional procedural style. TEL support for parallel execution, task communication, and synchronization allows for the development of plans that can be successfully executed in a dynamic environment.

The TEL also provides a high-level, task-oriented interface to the Behavior Execution Layer. While behaviors are highly reactive and are appropriate for creating robust control loops, tasks are a way to express higher-level execution knowledge and coordinate the actions of multiple behaviors. For example, an action that is best written as a behavior is a robot using vision to approach a recognized object, while an action that is more appropriate for a task is a robot navigating to the kitchen, finding a bottle of beer, and picking it up.

Features of the Task Execution Layer include:

Familiarity. Defining tasks is similar to writing standard C++ functions, not writing finite state machines.

Ease of Scripting. It is straightforward to create an interface between the Task Execution Layer and most scripting languages, including Python, allowing you to write tasks in a natural way.

Tasks
Tasks and Task Functors
Tasks are just like C++ functions. To define a task, you write code that performs the task. However, because there are some constraints and bookkeeping associated with task functions, the TEL uses the abstraction of a TaskFunctor to represent these functions. A TaskFunctor object behaves much like a function pointer, except that to call the function, you call the TaskFunctor's run method. This is similar to the function object abstraction in C++'s Standard Template Library. To make it easy to define tasks, the TEL provides two macros: ERSP_DECLARE_TASK and ERSP_IMPLEMENT_TASK. These macros simplify the process of defining task functions, creating TaskFunctors, and registering the functors so that they can be used in other tasks.
ERSP_DECLARE_TASK declares tasks. It takes two arguments: the name of the functor and the ID under which the functor will be registered in the task registry. For example:

ERSP_DECLARE_TASK(HelloWorld, Example.HelloWorld);

Tasks must be declared using this macro before they are defined. The ERSP_IMPLEMENT_TASK macro defines a task. It takes just one argument, the name of the functor, and must be followed by the body of code making up the function. For example,
ERSP_IMPLEMENT_TASK(HelloWorld)
{
    printf ("Hello, world!\n");
    return NULL;
}

The code examples above do three things:
1. Define HelloWorld to be a pointer to a TaskFunctor.
2. Register the functor under the ID Example.HelloWorld in the task registry.
3. Define a (hidden) C++ function that runs when the HelloWorld functor's run method is called.
The only thing you need to know about the hidden C++ function that gets defined is its signature. It takes a single argument, context, which is a pointer to a TaskContext. You can use the context argument in your code to access the task arguments (which are different from C++ function arguments), get a handle on the TaskManager, and generally interact with the rest of the task system. The function (your code) must return a pointer to a newly allocated TaskValue (or NULL). Task functors can be called directly in a way that is not much different from calling any other C++ function:

TaskValue *v;
TaskContext *c = TaskContext::task_args(0);
v = HelloWorld->run(c);

A TaskFunctor is essentially the definition of a task. When a given task is instantiated with particular arguments and possibly in its own thread, it is represented by a Task object. A Task object is an instance of a running task.

Task Arguments
Tasks have arguments; a task's arguments are accessed through the TaskContext object that is passed to the task function. Its get_arguments method returns a TaskArgVector*, which is a typedef'd std::vector containing TaskArgs.

Task Termination
Writing functions intended to be tasks can differ from writing normal functions. It should be possible for tasks to cleanly terminate any other task, even tasks running in a different thread. To make this possible, a task function should periodically check whether another task has requested that it be terminated, by calling the TaskContext's termination_requested method. For example, if a task wants to sleep for n seconds, instead of calling Platform::millisecond_sleep(n * 1000), it should loop, each time sleeping for a fraction of the full time (say 100 ms, though the actual time depends on the degree of responsiveness needed) and checking whether it has been terminated, until it has slept for the total required time.
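A sketch of that pattern, assuming a task function that wants to pause for n seconds (the 100 ms poll interval is an arbitrary choice):

// Sleep in 100 ms slices so a termination request is noticed promptly.
double slept_ms = 0;
while (slept_ms < n * 1000)
{
    if (context->termination_requested ())
        return NULL;   // stop cleanly; another task asked us to terminate
    Platform::millisecond_sleep (100);
    slept_ms += 100;
}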

Task Return Values


Tasks can return a single value by returning a pointer to a newly allocated TaskValue object from the task function. If no value should be returned, the task function should return NULL. After a task is complete, its return value can be retrieved by other tasks using the get_result method. For example,
TaskValue* v = t->get_result();

If a task returns a non-NULL TaskValue pointer, it must be deleted by the caller.


if (v) { delete v; }

Task Success and Failure


Distinct from the return value that a task returns is the task's success or failure. Tasks succeed when they complete, unless they have at any point called the TaskContext set_failed method:
context->set_failed();

TaskValue
TaskValue is a variant type, like BehaviorData. It is actually a subclass of BehaviorDataWriter. It can hold objects of type bool, long, and double, as well as strings (const char*), void pointers, and IObjects.

To create a TaskValue, use the provided constructors:


TaskValue val1(10);
TaskValue val2(3.1415926);
TaskValue val3(true);
TaskValue val4("reptile");
TaskValue val5((void*) &val4);

Or use the setter functions:


TaskValue val1;
val1.set_long(11);
val1.set_double(1.0);
val1.set_bool(false);
val1.set_string("hero");
val1.set_pointer((void*) &val1);

TaskValues may be copied and assigned safely:

TaskValue val1(4);
TaskValue val2 = val1;

TaskArg
TaskArgs are just like TaskValues, but they are read-only.

Arrays of TaskArgs can be created easily, which is convenient for passing arguments to tasks in some situations:
TaskArg args[] = { &x, 7, "approach" };
TaskContext *c = TaskContext::task_args(3, args);
MyTask->run(c);
delete c;

TaskContext
The TaskContext is the means by which task functions get a handle on the rest of the task system and the context in which they are executing. Tasks can get several important pieces of information from their context: the task arguments, returned by the get_arguments method as a pointer to a TaskArgVector object, which contains TaskArgs:
const TaskArgVector& args = *(context->get_arguments());
double speed = args[0].get_double();

Whether another task has requested the termination of this task:


if (context->termination_requested()) return NULL;


The Task object corresponding to the task that is executing the task function:
Task* task = context->get_task();

And there is one method that tasks can call to signal information about their execution: the set_failed method marks the task as having failed, which causes its status to be TASK_FAILURE when the task function returns. For example,
context->set_failed();

When task functions are being called directly, they need to be passed a new TaskContext that contains the arguments to be passed. To create a TaskContext, use the TaskContext::task_args function. task_args takes the number of arguments to be passed, an array of TaskArg containing the arguments, and an optional pointer to a TaskContext. The TaskContext argument allows the TEL to keep track of the proper parent-child relationships between tasks. For example,
TaskArg args[] = { v, w, STOP_SMOOTH };
TaskContext* c = TaskContext::task_args(3, args, context);
TaskValue* result = MyTask->run(c);
delete c;

If you don't need to pass any arguments, call it like this:


TaskContext* c = TaskContext::task_args(0, NULL, context);

The above code assumes that it is part of a task function that has been passed a TaskContext named context. When calling a task function directly from a non-task function, you may not have a parent context to pass to task_args. In that case, it is acceptable to call task_args like this instead:
TaskContext* c = TaskContext::task_args(3, args);

TaskStatus
TaskStatus is an enumeration of different task states:

TASK_READY - The task hasn't started running yet.
TASK_RUNNING - The task is running.
TASK_SUCCESS - The task has completed successfully.
TASK_FAILURE - The task has completed unsuccessfully.

Some utility functions are provided to help with checking task status. The function task_complete takes a Task* and returns a bool; it returns true if the task has completed, either successfully or unsuccessfully. The function task_not_complete takes a Task* and returns a bool; it is the complement of task_complete.
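As a sketch, these helpers can be used to poll a task started with install_task (described under Asynchronous Tasks below); manager is assumed to be the TaskManager, MyTask a registered task functor, and the 100 ms poll interval is illustrative:

Task* task = manager->install_task(MyTask);
while (task_not_complete(task))
{
    Platform::millisecond_sleep(100);   // give the task time to run
}
if (task->get_status() == TASK_SUCCESS)
{
    // use the result...
}
task->remove_ref();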

TaskRegistry
The TEL task registry maps task IDs to task functors. By keeping a task registry and allowing one to look up tasks at runtime, the TEL allows for the possibility of defining new tasks at runtime that may not be associated with a C++ symbol.
The TaskRegistry class has two important methods (both static): register_task and find_task.

To register a task, you must call register_task with the ID of the task to be registered and a pointer to its TaskFunctor. Usually when you write tasks in C++ the task is automatically registered, so you may never need to use this method. To look up a task, use the find_task method. It takes the ID of the task as an argument and returns a pointer to the corresponding TaskFunctor (or NULL if no task with that ID has been registered).
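As a minimal sketch, looking up the HelloWorld task registered earlier in this chapter by its ID looks like this (the NULL check is the important part):

TaskFunctor* functor = TaskRegistry::find_task("Example.HelloWorld");
if (functor == NULL)
{
    // No task has been registered under that ID.
}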

Asynchronous Tasks
You may want to run multiple tasks simultaneously. The TEL supports this in two ways: the install_task method of the TaskManager, and the Parallel class.

TaskManager::install_task
The TaskManager's install_task method starts a task running in its own new thread and returns immediately. Its arguments are the TaskFunctor to execute, the number of arguments, and an array containing the arguments.
TaskArg args[] = { "hello" };
TaskManager *manager = TaskManager::get_task_manager();
manager->install_task(MyTask, 1, args);

The install_task method returns a pointer to the new Task object, which can be used to wait for the task to complete, terminate the task, or access its return value.
Task objects are reference counted, and install_task returns a new reference to a Task, so it is the caller's responsibility to decrement the reference (using remove_ref) when it no longer needs the Task. (One reference remains as long as the Task is executing, so it is safe to call remove_ref immediately after install_task if you do not need to keep track of the Task for any reason.)

Here is a complete example showing the use of install_task, waiting for the asynchronous task to complete, accessing its return value, and properly maintaining the reference count:
TaskManager *manager = TaskManager::get_task_manager();
TaskArg args[] = { 1.0, 3.0 };

// Assume Add is a task that returns the sum of its arguments.
Task *task = manager->install_task(Add, 2, args);
manager->wait_for_task(NULL, task);
TaskValue sum = task->get_result();
task->remove_ref();

If all you want to do is run multiple tasks asynchronously and immediately wait for one or all of them to complete, use the Parallel class instead of install_task.


Waiting For Tasks


The TaskManager::wait_for_task method waits for an asynchronously started task to finish. It takes two arguments: the current task, and the task to wait for. If there is no current task, the first argument should be NULL. The reason you need to pass the current task to wait_for_task is in case the current task is terminated. In that case, the wait is interrupted and the method returns immediately, before the other task finishes. Here's an example of how to check whether a call to wait_for_task returned because the task actually finished, or because the current task was terminated:
Task *this_task = context->get_task();
TaskManager *manager = TaskManager::get_task_manager();
Task* new_task = manager->install_task(MyTask);
manager->wait_for_task(this_task, new_task);
if (this_task->termination_requested())
    printf("Wait was interrupted due to termination.\n");

Parallel
Because task functions are normal C++ functions with normal C++ execution semantics, tasks are usually executed in order. When parallel execution of multiple tasks is needed, use the Parallel class. Parallel allows a task to spawn threads in which to run subtasks, and then blocks until those tasks are complete. Tasks are added to the parallel construct with the add_task method, which takes as arguments a task function, the number of arguments to be passed to the task, and an array of TaskArg containing the arguments. add_task returns a pointer to the new Task.
Parallel p(context);
TaskArg drive_args[] = { 10.0, 4.0, 60 };
Task* t1 = p.add_task(DriveTo, 3, drive_args);
Task* t2 = p.add_task(PlayChess);

Adding a task does not execute it. To begin executing the parallel subtasks, one of three things must be done:
1. Nothing. The destructor for Parallel starts the subtasks in separate threads and then blocks until all subtasks are complete.
2. The wait_for_all_complete_tasks method can be called, which also starts the subtasks in separate threads and blocks until they are complete.
3. The wait_for_first_complete_task method can be called, which begins executing the subtasks and then blocks until one of them completes. It then terminates the remaining subtasks.
The wait_for_all_complete_tasks method takes no arguments and returns a Result, which indicates whether the underlying thread operations succeeded. The wait_for_all_complete_tasks and wait_for_first_complete_task methods may also unblock and return early if they are called by a task, and that task is terminated while the wait is still in effect. This means that any time one of these functions is called, it may return early, and the task that called it should check whether it was terminated if it wants to be a good citizen. However, in many cases it doesn't really matter. If the last thing a task does is wait for a Parallel block to finish, it may not care whether the block finished or it itself was terminated. If you are interested in the final status of or value returned by a subtask, save the pointer to the subtask returned by add_task, and then call the Task object's get_status or get_result methods after the subtask is complete. For example,
Parallel p(context);
Task* t1 = p.add_task(Task1);
Task* t2 = p.add_task(Task2);

p.wait_for_all_complete_tasks();
if (t1->get_status() == TASK_SUCCESS) {
    return new TaskValue(t1->get_result());
} else {
    context->set_failed();
    return NULL;
}

The wait_for_first_complete_task method takes a pointer to a Task*, in which it stores a pointer to the subtask that completed first.
Parallel p(context);
Task* t1 = p.add_task(Task1);
Task* t2 = p.add_task(Task2);

Task* complete_task;
p.wait_for_first_complete_task(&complete_task);
if (complete_task->get_status() == TASK_SUCCESS) {
    return complete_task->get_result();
} else {
    context->set_failed();
    return NULL;
}

The Parallel class requires a TaskContext* argument in its constructor; that context should be either the current task context if there is one, or one created with TaskContext::task_args if it is being used in a non-task function, like main.
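As a sketch of the non-task case (following the same pattern as the complete example later in this chapter, with Task1 and Task2 standing in for registered task functors), a fresh context can be created and cleaned up around the Parallel block:

TaskContext* c = TaskContext::task_args();   // no parent task context
Parallel p(c);
p.add_task(Task1);
p.add_task(Task2);
p.wait_for_all_complete_tasks();
delete c;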

Terminating Tasks
After a task is installed and running in its own thread, you can terminate it before it finishes naturally. The Task::terminate method sets a flag that tells the task it should terminate.
Task* task = manager->install_task(MyTask);
Platform::millisecond_sleep(150);


// Tell the task to terminate now, even if it's not done.
task->terminate();
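Note that terminate only sets the flag; the task actually stops the next time it checks termination_requested. A minimal sketch of a full shutdown sequence, using only the calls described above, might be:

task->terminate();                    // request cooperative termination
manager->wait_for_task(NULL, task);   // wait for the task to notice and exit
task->remove_ref();                   // release our reference to the Task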

Events
The TEL offers a publish/subscribe mechanism for tasks to send messages to one another. Events are broadcast from one task and may be received by multiple tasks. Events consist of a type, which is a string, and a set of properties, which are name/value pairs where the name is a string and the value is a TaskValue. Events are represented by the Event class. Here's an example of constructing an Event:
/* The string in the constructor is the type of event. */
Event e("Evolution.Vision.ObjectRecognized");
e.set_property("label", "soda can");
e.set_property("heading", 30.2);

To raise an event, a task calls the TaskManager's raise_event method:


context->get_task_manager()->raise_event(e);

To block and wait for an event, a task calls its wait_for_event method, specifying the type of event it is waiting for:

Task* task = context->get_task();
Event e = task->wait_for_event("Evolution.Vision");

The wait_for_event method actually accepts simple event type patterns. An asterisk (*) in the pattern works as a wildcard, matching any string. Asterisks may only appear as the last character in a pattern. For example, the event pattern Evolution.Vision* matches events of type
Evolution.Vision, Evolution.Vision.ObjectRecognized, Evolution.Vision.ColorRecognized, Evolution.Vision.ObjectRecognized.Soda, and so on.
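As a minimal sketch, waiting with a trailing-wildcard pattern looks the same as waiting for an exact type:

Event e = task->wait_for_event("Evolution.Vision*");   // any vision event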

It is also possible to supply an optional timeout value to wait_for_event, which is the maximum time in milliseconds to wait for an event. If no event of the specified type is raised within that time, wait_for_event unblocks and returns. To determine whether the call timed out or not, use the Event's is_empty method. After an event is received, its properties can be accessed with get_property. For example,
Task* t = context->get_task();
/* Wait for 2 seconds max. */
Event e = t->wait_for_event("Evolution.Vision", 2000.0);
if (e.is_empty()) {
    std::cerr << "Timed out" << std::endl;
} else {
    std::cerr << "Saw a " << e.get_property("label") << std::endl;
}


Events have a timestamp corresponding to the time that the event was raised. It can be accessed with the Event's get_timestamp method.

Finally, there are two more event-related methods that allow tasks to wait for events in a loop without missing any due to race conditions: enable_event and disable_event.

The wrong way to wait for events in a loop:


Task* t = context->get_task();
for (int i = 0; i < 10; i++) {
    Event e = t->wait_for_event("Evolution.Vision");
    std::cerr << "Got event " << e << std::endl;
}

The above code can miss events if they are raised while it is busy writing output to std::cerr. To avoid this, enable the event before starting the loop and disable it after the loop has terminated:
Task* t = context->get_task();
t->enable_event("Evolution.Vision");
for (int i = 0; i < 10; i++) {
    Event e = t->wait_for_event("Evolution.Vision");
    std::cerr << "Got event " << e << std::endl;
}
t->disable_event("Evolution.Vision");

Each call to enable_event must be matched by a call to disable_event with exactly the same event type pattern.
wait_for_event can return prematurely if another task has terminated the waiting task. To be safe, after calling wait_for_event, check whether the current task has been terminated. When wait_for_event returns prematurely, it returns an empty event. This can be tested for with the Event::is_empty method.
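A minimal sketch of that check inside a task function:

Event e = task->wait_for_event("Evolution.Vision");
if (e.is_empty() && context->termination_requested())
{
    return NULL;   // we were terminated while waiting
}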

Types of Tasks
It may be helpful to think of tasks as falling under two classifications: action tasks and monitor tasks. These are not formally defined categories but rather a way to talk about common practice. Action tasks are those tasks that perform an action and then terminate on their own. Examples of action tasks are a task to move a robot to specified coordinates, or a task to open a Gripper attachment. They can be used in either synchronous or asynchronous contexts. Monitor tasks, by contrast, watch for some condition and typically run until they are terminated, raising events when the condition is detected; DetectObject and the MonitorObstacles primitive described below are examples.

Primitives
Primitives provide the interface between tasks and behaviors by packaging behavior networks and making them look and act like tasks. Other tasks can then use the primitive exactly like they use any other task.


A primitive should specify a behavior network, provide inputs to the network, handle outputs from the network, and decide when to terminate. A complete example of how to write a primitive is included in the ERSP sample code at Samp_code_dir/task/primitive for both Windows and Linux platforms. The example primitive is called MonitorObstaclesPrim (defined in MonitorObstacles.cpp), and it monitors infrared range sensors and raises events when the sensors detect an obstacle within a minimum threshold distance. In this section, we give a short overview of how to write a primitive, including excerpts of code from the example, but refer to the complete example to gain a thorough understanding of primitives.

Primitives must be classes that have Primitive as their parent. They must additionally call the Primitive constructor with the name of the behavior network to be loaded, and define a get_status method that returns a TaskStatus value. The Primitive protocol is as follows:

1. The primitive must call the Primitive constructor with the name of the behavior network. For example, the MonitorObstaclesPrim primitive uses a behavior network named Examples.MonitorObstacles, which is also the ID that the primitive was registered under (using the ERSP_DECLARE_PRIMITIVE macro). The MonitorObstaclesPrim constructor uses the ERSP_TASK_ID macro to extract the task ID and pass that to the Evolution::Primitive constructor.
MonitorObstaclesPrim (Evolution::TaskContext* context,
                      const char* task_id = ERSP_TASK_ID (MonitorObstacles))
    : Evolution::Primitive (context, task_id)
{
    // ...
} // end MonitorObstaclesPrim()

2. While the primitive is being prepared for execution, its start method is called. At this point, the primitive can set up inputs to the behavior network and register callbacks to handle outputs of behaviors. Primitive has a member, _manager, that is a pointer to a TaskManager; the TaskManager method set_prim_input_value is used to provide inputs to behaviors in the network, while add_prim_value_callback is used to register callbacks. For example, here's the code in MonitorObstaclesPrim::start that registers a callback for the output of the SensorRing behavior that is in MonitorObstaclesPrim's network:
// Register for callbacks from a behavior output port (the
// "obstacles" port on the "sensor_ring" behavior) in the
// network. The Primitive base class provides the _manager
// and _network_id data members. handle_input() will be
// called once per cycle for each registered callback.


return (_manager->add_prim_value_callback (_network_id, "sensor_ring", "obstacles", this));

3. While the primitive is active and executing, any callback handlers that the primitive registered are called with behavior output values.

In the MonitorObstaclesPrim primitive, the handle_input method just looks at the output of the SensorRing behavior and checks each piece of range data against the minimum range threshold. If any of the indicated ranges are below the threshold, the handle_input method raises an event.

4. The primitive's get_status method is called by the Task Manager. If the method returns TASK_RUNNING, the primitive stays active, and we go back to step 3. Otherwise, the primitive has completed (successfully or not) and we go to step 5.
MonitorObstaclesPrim does not override the get_status method, so it runs forever, or until explicitly terminated.

5. After a primitive indicates it is complete, by returning something other than TASK_RUNNING from its get_status method, the primitive's finish method is called. The primitive can take this opportunity to do any necessary cleanup. Since many primitives do not need a finish method, a default is provided that does nothing. MonitorObstaclesPrim uses the default finish method.

Example Tasks
Here is a complete program that defines and uses a few simple tasks.
#include <iostream>
#include <evolution/Task.hpp>

using namespace Evolution;

/* Returns the sum of two numbers. */
ERSP_DECLARE_TASK(Add, "Example.Add");
ERSP_IMPLEMENT_TASK(Add)
{
    const TaskArgVector& args = *(context->get_arguments());
    double a = args[0].get_double();
    double b = args[1].get_double();
    return new TaskValue(a + b);
}

/* Waits for an event. */
ERSP_DECLARE_TASK(Wait, "Example.Wait");
ERSP_IMPLEMENT_TASK(Wait)
{
    Task* task = context->get_task();


    Event e = task->wait_for_event("Example.Event");
    std::cerr << "Got event: " << e << std::endl;
    return NULL;
}

/* Raises an event. */
ERSP_DECLARE_TASK(Raise, "Example.Raise");
ERSP_IMPLEMENT_TASK(Raise)
{
    TaskManager* mgr = context->get_task_manager();
    Event e("Example.Event");
    e.set_property("color", "red");
    std::cerr << "Raising event: " << e << std::endl;
    mgr->raise_event(e);
    return NULL;
}

int main(void)
{
    // Call the Add task directly.
    TaskArg add_args[] = { -1.0, 3.0 };
    TaskContext* c1 = TaskContext::task_args(2, add_args);
    TaskValue* v1 = Add->run(c1);
    delete c1;
    std::cerr << "-1.0 + 3.0 = " << *v1 << std::endl;
    delete v1;

    // Again, but using the registry.
    TaskFunctor* AddF;
    AddF = TaskRegistry::find_task("Example.Add");
    TaskArg add_args2[] = { -2.0, 5.0 };
    TaskContext* c2 = TaskContext::task_args(2, add_args2);
    TaskValue* v2 = AddF->run(c2);
    delete c2;
    std::cerr << "-2.0 + 5.0 = " << *v2 << std::endl;
    delete v2;

// Now run Wait and Raise in parallel.


    TaskContext* c3 = TaskContext::task_args();
    Parallel p(c3);
    p.add_task(Wait);
    p.add_task(Raise);
    p.wait_for_all_complete_tasks();
    delete c3;

    TaskManager::get_task_manager()->shutdown();
    return 0;
}

Here is the output from this program:


[user@machinename tests]$ ./a.out
-1.0 + 3.0 = 2.000000
-2.0 + 5.0 = 3.000000
Raising event: {Event Example.Event [time 0] (color: "red")}
Got event: {Event Example.Event [time 1.0311e+09] (color: "red")}

Using Python
ERSP offers two ways to use the Python scripting language together with the TEL:
1. In the Python interpreter
2. Embedded in an application program that uses ERSP
In either case, the TEL includes a Python module named ersp.task. The ersp.task module provides an API very similar to the C++ API for tasks. It includes the following classes: TaskManager, TaskContext, Task, TaskFunctor, Event, and Parallel. For convenience, many of the built-in Python data types are automatically converted to and from TaskArgs and TaskValues for you. The ersp.task module contains online documentation in the standard Python style. You can view documentation for the module itself, along with the classes and functions in the module, by printing their __doc__ attributes in the interactive Python interpreter:
>>> import ersp.task
>>> print ersp.task.TaskContext.__doc__
Represents a task context, including information about the task
parent, task arguments and task success or failure.

Additional modules are provided that let you use ERSP's predefined tasks from Python:

ersp.task.navigation
ersp.task.net
ersp.task.resource
ersp.task.speech
ersp.task.util


ersp.task.vision

To use the tasks defined in each module, you simply import the module after importing ersp.task. For example,
import ersp.task
import ersp.task.util
ersp.task.util.PlaySoundFile('silence.wav')

In the Python Interpreter


Here's an example of a simple Python program that can be executed in the Python interpreter. This is the Python equivalent of the C++ program seen previously.
import ersp.task

# Returns the sum of two numbers.
def AddFun(context):
    [a, b] = context.getArguments()
    return a + b

# Waits for an event.
def WaitFun(context):
    task = context.getTask()
    event = task.waitForEvent('Example.Event', 5000)
    if (event != None):
        print 'Got Event: %s' % (event,)
    else:
        print 'Something went wrong, never got the event.'

# Raises an event.
def RaiseFun(context):
    manager = context.getTaskManager()
    event = ersp.task.Event("Example.Event")
    event.setProperty("color", "red")
    print "Raising event: %s" % (event,)
    manager.raiseEvent(event)

# Register the tasks.
Add = ersp.task.registerTask("Example.Add", AddFun)
Wait = ersp.task.registerTask("Example.Wait", WaitFun)
Raise = ersp.task.registerTask("Example.Raise", RaiseFun)

# Call Add task directly.


context = ersp.task.TaskContext([-1.0, 3.0])
v = Add.run(context)
print "-1.0 + 3.0 = %s" % (v)

# Now run Wait and Raise in parallel.
p = ersp.task.Parallel()
p.addTask(Wait)
p.addTask(Raise)
p.waitForAllTasks()

And here's some output:


[user@machinename TaskModule]$ python example.py
-1.0 + 3.0 = 2.0
Raising event: {Event Example.Event [time 0] (color: "red")}
Got Event: {Event Example.Event [time 1.04673e+09] (color: "red")}

Embedded Python
The other way to use the TEL's Python integration is to embed Python within an application. The TEL makes this possible with just a single function: PythonTaskRegistry::register_task. It allows you to register a task that is written in Python. It takes three arguments: the ID of the task, a string containing Python code, and the name of a function defined in that code that is the definition of the task. Here's an example using the tasks defined in Python above from C++, in an embedded Python context:
#include <iostream>
#include <evolution/Task.hpp>
#include <evolution/core/task/PythonTaskRegistry.hpp>

const char* script =
    "import ersp.task\n"
    "\n"
    "# Returns the sum of two numbers.\n"
    "def AddFun(context):\n"
    "    [a, b] = context.getArguments()\n"
    "    return a + b\n"
    "\n"
    "# Waits for an event.\n"
    "def WaitFun(context):\n"
    "    task = context.getTask()\n"
    "    event = task.waitForEvent('Example.Event', 5000)\n"
    "    if (event != None):\n"
    "        print 'Got Event: %s' % (event,)\n"


    "    else:\n"
    "        print 'Something went wrong, never got the event.'\n"
    "\n"
    "# Raises an event.\n"
    "def RaiseFun(context):\n"
    "    manager = context.getTaskManager()\n"
    "    event = ersp.task.Event('Example.Event')\n"
    "    event.setProperty('color', 'red')\n"
    "    print 'Raising event: %s' % (event,)\n"
    "    manager.raiseEvent(event)\n";

using namespace Evolution;

int main(void)
{
    TaskFunctor *Add, *Wait, *Raise;
    TaskContext *c;

    // Register the tasks.
    PythonTaskRegistry::register_task("Example.Add", script, "AddFun");
    PythonTaskRegistry::register_task("Example.Wait", script, "WaitFun");
    PythonTaskRegistry::register_task("Example.Raise", script, "RaiseFun");

    // Look up the tasks.
    Add = TaskRegistry::find_task("Example.Add");
    Wait = TaskRegistry::find_task("Example.Wait");
    Raise = TaskRegistry::find_task("Example.Raise");
    if (Add == NULL || Wait == NULL || Raise == NULL) {
        std::cerr << "Something went wrong. Check your PYTHONPATH."
                  << std::endl;
        return 1;
    }

    // Call the Add task directly.


    TaskArg args[] = { -1.0, 3.0 };
    c = TaskContext::task_args(2, args);
    TaskValue* v = Add->run(c);
    std::cerr << "-1.0 + 3.0 = " << *v << std::endl;

    // Now run Wait and Raise in parallel.
    TaskContext *c2 = TaskContext::task_args();
    Parallel p(c2);
    p.add_task(Wait);
    p.add_task(Raise);
    p.wait_for_all_complete_tasks();
    return 0;
}

To compile this program, do something like the following, assuming you named the file containing the program source code example.cpp:
g++3 `evolution-config --cflags --debug` `evolution-config --libs` example.cpp -o example

Output from the program is identical to the output of the pure Python version:
[user@machinename TaskModule]$ ./example
-1.0 + 3.0 = 2.0
Raising event: {Event Example.Event [time 0] (color: "red")}

Overview of Tasks
The following are the tasks with brief descriptions. For detailed descriptions of these tasks, refer to the Doxygen documents located in Install_dir/doc/ERSP-API/html directory for Linux and Windows.
AdjustFlowDetectParams - Adjusts the parameters the DetectFlow task uses, without stopping DetectFlow (if it is running).

AudioVideoClient - Connects to an AudioVideoServer on the specified IP and port. After a connection is established, it starts sending/receiving A/V info over the connection. It uses both the Camera resource and the AudioRecord resource.
AudioVideoServer - Listens on a port for connections. After a connection is established, it starts sending Audio/Video information over the connection. It uses both the Camera resource and the AudioRecord resource.
CloseER2Gripper - A blocking call that closes the ER2 gripper fully or for a specified position delta.
CloseGripper - Closes the gripper.
DetectColor - Begins watching for a specified color and raises events when it is seen.


DetectFlow - Raises an event each time a new frame comes out of the flow detection behavior.
DetectGesture - Begins watching for gestures, and raises events when they are detected.
DetectMotion - Returns the amount of motion between consecutive frames from the camera.
DetectObject - Begins watching for a specified object and raises events when it is seen.
DetectSound - Returns the perceived volume level from the microphone.
DoAtATime - Performs a task at a specific time.
DoPeriodically - Performs a task periodically, every so often.
DoWhen - Waits for an event, and executes a task when the event is raised.
DriveMoveDelta - Moves the drive system to a position delta.
DriveStop - Sends a stop command to the drive system.
FaceObject - Rotates to search for and face a specified object.
GetImage - Returns an image from the camera.
GetPosition - Returns the robot's position in the global coordinate system (x, y, theta).
GetRangeData - Polls the range sensor and returns distance-timestamp pairs in one contiguous array.
GetVelocities - Returns the current linear and angular velocity of the robot as reported by odometry (v, w).
GotoObject - Rotates to search for and face a specified object.
Move - Moves at the given velocity.
MoveER2ArmDelta - A blocking call that moves the arm to a specified position.
MoveRelative - Moves a set of relative distances.
MoveTo - Moves to a set of absolute points.
OpenER2Gripper - A blocking call that opens the ER2 gripper fully or for a specified position delta. The full range of motion is 10,000 motor units.
OpenGripper - Opens the gripper.
PlaySoundFile - Plays a sound file.
PrintEvent - Prints out events of the specified type until terminated.
RangeSensorMove - Sets the drive system to move at the given velocities until a specified range sensor reads a value less than a specified threshold. The call will block until the range sensor threshold is read.


RangeSensorSweep - Sets the drive system to turn at the given angular velocity until a specified range sensor reads a value less than a specified threshold or the drive has turned a specified angular distance. The call will block until either the threshold is read or the distance is reached.
RecognizeSpeech - Recognizes speech according to the specified grammar, raising events when speech is detected.
SendMail - Sends mail and waits for its completion.
SenseObjectInGripper - A blocking call that checks if an object is sensed by the gripper. This only works if the gripper is in the fully open position.
SetDriveVelocity - Sets the drive system to move at a given velocity.
SetER2ArmVelocity - A non-blocking call that starts the ER2 arm moving at a specified velocity.
SetFaceEmotion - Sets a facial expression based on a vector.
SetTrackColorVelocity - Sets the linear velocity while color tracking.
Speak - "Speaks" the given text using TTS, waiting until completed.
SpeakFromFile - "Speaks" the specified file using TTS, waiting until completed.
Stop - Stops.
Teleop - Remote control your robot.
TrackColor - Tracks motion occurring within regions of the specified color.
TrackColorFlow - Tracks motion occurring within regions of the specified color.
TrackSkinFlow - Tracks people by tracking skin colored objects that are moving.
TrainObject - This task adds an object to a modelset.
Turn - Moves at the given angular velocity.
TurnRelative - Turns a relative distance.
TurnTo - Turns to an absolute heading.
Wait - Waits for the specified amount of time.
WaitUntil - Waits for an event.
Wander - Causes the robot to wander aimlessly with obstacle avoidance enabled. This task never returns, so if you want to stop wandering you must terminate this task.
WatchInbox - Waits for new mail and raises an event when it is available.


Chapter 6

Behavior Libraries

This section describes predefined behaviors included in the Evolution Robotics Software Platform. The behavior names used in this chapter appear as they do in the Behavior Composer GUI. To see the full type name (Evolution.BehaviorName), drag the behavior block into the Network Editor, and right-click the behavior block. You can also look in the Install_dir\behavior\config directory on Windows or Linux to see the full type names. Important Note: The following are the behaviors with brief descriptions. For detailed descriptions of these behaviors, refer to the Doxygen documents located in the Install_dir/doc/ERSP-API/html directory on both Linux and Windows.

Utility Behaviors
Utility behaviors perform various functions including remote connection and testing tasks.

Buffer
Buffers input values and outputs these values in later network executions. There is only one input port to the buffer: the next value to store in the buffer. If a NULL value is received, the old values stay in the buffer. If a real value of any type is sent, the oldest value in the buffer is overwritten with the new value. While there is only one input, there can be any number of outputs. The first output port contains the latest data input to the buffer. The second port contains the next latest received data, and so on. Each higher output port number contains older buffered data. For example, suppose a Buffer has 3 output ports. In its initial state, it outputs (NULL, NULL, NULL) on ports 1, 2, and 3. Now suppose the network sends 1.0 to the input port. The outputs are (1.0, NULL, NULL) until another input is received. After receiving the second input, a, the outputs are (a, 1.0, NULL). Now suppose the buffer receives input 'true'. It outputs (true, a, 1.0). If the buffer receives a fourth input value of 100, the outputs become (100, true, a). In this case, the original 1.0 data value is no longer recorded.

Console
This behavior reads strings that are input from the console of the machine running the behavior, and then puts them on the output port. This input is non-blocking, so the execution of the behavior network is unaffected. After a newline is entered, the string is assumed to be complete and is output one time. Then, the next newline-terminated string is read and output, and so forth.

Constant
This behavior is useful for testing and for keeping constant inputs to a given behavior input port. Specify this value in the parsed_value parameter list before the network is run.

DecayBehavior
This behavior implements an exponential decay function. After a value is provided in the input of the behavior, the output in subsequent invocations of the behavior follows the equation: o(t) = k * o(t-1), where k is a decay constant that can be specified as a parameter or provided in one of the input ports.

DelayedConstant
The DelayedConstant functions in an identical manner to Constant, but it waits a specified amount of time before emitting its constant value. After that delay, it continues emitting the specified constant value continuously. This behavior is probably only useful for testing.

FunctionBehavior
This behavior generates a few simple multivalues. The behavior has no inputs. Based on the input parameter, it generates a multivalue each time its output is computed. The parameter function defines what multivalue function is produced. The allowed options are square and identity which encode the functions f(x) = x*x and f(x) = x respectively. The output multivalue is one dimensional, ranging from values 0.0 to 5.0 in steps of 1.0. In other words, if you ask for square, you get the multivalue: 0.0, 1.0, 4.0, 9.0, 16.0, 25.0, and if you ask for identity you get 0.0, 1.0, 2.0, 3.0, 4.0, 5.0.


ImageConverter
ImageConverter converts image data from one color format (RGB24, RGB32, YUV, and so on) to another. The required color format is specified by input on the output_color_format port, or if there is no input on that port, the output_color_format parameter.

InputCollector
This behavior is essentially a data sink. It can output the data it receives to screen. It can also check the data against an expected value if the data is of a simple type (i.e. not an Image, Multivalue, or Pointset). This behavior only has one input port and no output port. All data values on that input port are processed. Processing might involve printing to screen (if the print_input parameter is true). NULL values are only printed if the print_null parameter is true. If the expected parameter is set, all input values are compared to this value. The behavior's execution fails if an input value does not match the expected value.

InputLogger
The InputLogger behavior logs collected inputs to a file that is written to disk. The location of all logged data is set by the log_prefix parameter. If log_prefix is set to /tmp/my_log, then all input data is written to /tmp/my_log.log. The name of the selected input file is written to the .log file. Note that the my_log.log file can be used as an index of image filenames the InputLogger created. The exception is an image. If an image is sent to the logger, it is written to a separate file in JPEG format. The file name uses the same prefix but has an index and .jpg suffix. So the first image that the /tmp/my_log InputLogger receives is saved to /tmp/my_log-0.jpg, the second to my_log-1.jpg and so on. The InputLogger behavior has an unlimited number of input ports. All inputs received in one computation frame are saved to the same log file line, separated by spaces (unless all inputs were NULL, in which case nothing is written). The first input port is the first entry on the line, the second port data is the second line entry, and so on. So for example, suppose the 3 input data ports contain: 1.0, NULL, <A picture of the Mona Lisa>, then the line written to the log file is:
1.0 NULL /tmp/my_log-0.jpg And /tmp/my_log-0.jpg contains a picture of the Mona Lisa.

MalleableBehavior
MalleableBehavior is a behavior used to implement a simple TCP server.

All of its input ports and output ports are specified in the behavior's XML configuration file. If you need to add a new port, you only need to change that XML file. After the MalleableBehavior has determined its ports from the XML file, it begins listening on the port specified by the parameter tcp_port. Any XML received within the tags given as the tag_name parameter is proxied to output ports within the behavior network. Symmetrically, any input data received in the behavior network on an input port is written to any clients connected to that port, with its contents inside the tags. In most behaviors, the name of the behavior's C++ class is the same as the name of the behavior type, but that is not the case for MalleableBehavior. In fact, you should never use the MalleableBehavior in the network directly. It must always be referred to by another XML file; otherwise you can only make one type of MalleableBehavior, which defeats the purpose.

PeriodicTrigger
The output of this behavior is usually NULL, but every delay seconds it outputs the value given by its value parameter for a single cycle of the behavior network. This behavior is probably only useful for testing.

PlaySoundBehavior
This is the behavior used to play sound files. The behavior takes the path to a sound file and plays it on the speakers. There is support for playing using the normal /dev/dsp device or the E-sound daemon.

TCPServerBehavior
This is a pure virtual class used to create behaviors that serve datasets to remote clients across a TCP network. It is currently used to implement ColorTrainerServerBehavior.

TransmitBehavior
A behavior that allows data to be sent to and received from a TCP/IP connection. Commands can be sent to read inputs and set outputs.

TriggeredConstant
This is a slight modification of the Constant behavior. It holds a constant value and outputs it when it receives any non-null input on its input port. If it receives null, then it outputs null.

Operator Behaviors
Operator behaviors perform system calculation tasks. There is an effective type promotion for all operators to determine output. Booleans are less than doubles which are less than multivalues. A set of Boolean inputs results in Boolean outputs. A set of doubles and Booleans as inputs results in double output. A set of Booleans, doubles, and multivalues as inputs results in multivalued outputs. For example, input .3 and [.4,.5,.6] to Min results in [.3,.3,.3] as output. Booleans are converted to 1.0 for true and 0.0 for false.

AbsoluteValue
The absolute value operator outputs the absolute value of input double or MultiValue. The input port must contain a single input, either a double or a multivalue. If the input is a double, the absolute value of the double is sent down the output port. If the input is a multivalue, absolute value is applied to each double in the multivalue, and that new multivalue is output.


Addition
The addition operator outputs the sum value of all inputted values. Input values can be Booleans, doubles, or Multivalues. Note that computing the sum of Boolean values is equivalent to logical OR. All inputs arrive on the same port. If two values of different types are added together, the less specific type is promoted to the more specific type. For example to add "true" and "3.14", the Boolean is converted to double value 1.0, and the resultant "4.14" is outputted. False is converted to 0.0. When adding a double and a multivalue, the double value is added to each element in the multivalue. If no inputs are provided, the behavior outputs NULL.

Average
The Average behavior computes the arithmetic mean of a set of input values. This interface is almost identical to the AdditionOperator interface. The only difference is that after all the values have been added, the resulting double or multivalue is divided by the total number of processed inputs.

DoubleArrayJoiner
The DoubleArrayJoiner behavior joins a set of input doubles into a double array. Each array value is input on a separate port. The double value on the first port is the first array entry, and so on. For example, suppose DoubleArrayJoiner has 3 input ports. If the first input was 1.0, the second was 2.0, and the third was 3.0, then the output is the double array (1.0, 2.0, 3.0).

DoubleArraySplitter
The DoubleArraySplitter behavior splits an input double array into a set of doubles. This is an inverse of the DoubleArrayJoiner. The splitter has only one input port (containing a double array) and any number of output ports. Each output port contains a subsequent array data value.

Max
The Max behavior outputs the maximum value of all input values. This interface is similar to the addition operator interface. There is one input port which can have any number of input Booleans, doubles, or multivalues. Each input value is compared to each other input value and the maximum value is outputted. Values of different types are promoted as described in the AdditionOperator interface section. The maximum of two multivalues is the maximum of each individual double contained in the multivalues. For example, if one multivalue is (0, 1) and another is (1, 0), then the maximum is (1, 1). Note that the maximum of two Booleans is equivalent to Boolean "or".

Min
The Min outputs the minimum value of all input values.
This interface is identical to the Max behavior, except that the minimum is computed in each situation.

Multiplication
The Multiplication behavior multiplies input data and outputs their product. This behavior's interface is almost identical to the AdditionOperator's interface. Instead of adding all input values together, they are multiplied. The product of two compatible Multivalues is defined as the product of their corresponding double entries. For example, the product of (1, 2, 3) and (1, 2, 3) is (1*1, 2*2, 3*3) = (1, 4, 9). Note that the product of two Boolean values is equivalent to the AND operator.

Priority
The priority operator is used to select between different possible inputs. The priority input has a variable number of input ports. Every two ports are paired together, where the first port is an input data value to consider and the second port is a priority for that data (in the form of a double). The priority operator outputs whichever data had the highest non-NULL priority. For example, suppose the inputs to the priority behavior were, in order: "a", 1.0, "b", 2.0, "c", 3.0. The string output "a" has priority 1.0, and so on. The behavior has two outputs: the data whose priority was highest and the winning priority. So in the above example, the Priority behavior outputs "c" on the first port and 3.0 on the second port. If an input priority is NULL, its input data is not considered. For example, this data: a, 1.0, b, 2.0, c, NULL will output b, 2.0. If all priorities are NULL (or if no inputs are received), the output value and priority are both NULL. So a, NULL, b, NULL will output NULL, NULL.

Subtract
This behavior takes an arbitrary number of inputs in its two input ports. All inputs are then added together, with those on the second port being multiplied by -1 first. This has the effect of subtracting the values on the second port from those on the first port. This behavior uses the type promotion convention common to almost all operators. If all inputs are Boolean, output will be Boolean. If there exists at least one double input, and no multivalue input, output will be double. If there are any multivalue inputs, the output will be a multivalue. A double subtracted from a multivalue subtracts the double value from each element of the multivalue. Booleans are treated as 1 and 0 for true and false.

ThresholdBehavior
This behavior operates as a gate that allows data to pass through only if the given numerical input is greater than the specified threshold value. One of the following operators can be used for comparison with the threshold: >=, >, ==, <, <=. If the pass through value is a number, it can be specified as a parameter. If the pass through value is not a number or needs to be dynamic, it can be passed in as an input.


Transistor
This behavior specifies that if the data on the trigger input port is non-null, the data on the data input port is written to the data output port. Otherwise, NULL is written to the data output port.

TriangularDistributor
This behavior takes an input double and turns it into a triangular-shaped function on the output port. The width of the function and the discrete interval of the function are defined as parameters.
[Figure: the TriangularDistributor converts a double input d, given an interval parameter [a : w : b] and a width sigma = x, into an output multivalue shaped as a triangle of width x centered at d.]

TriggerPassBehavior
TriggerPassBehavior is similar to a relay, allowing the output of one behavior to control whether the output of a second behavior is passed along for further processing. When it receives a non-null input on the trigger port, it copies the input data port to the output data port for a certain number of seconds controlled by the parameter window_seconds.

Before it has been triggered, or after the window has elapsed, it outputs a default value controlled by the parameter default_data.

WeightedSumOperator
This is a math behavior. It has a dynamic number of input ports, specified in pairs. The first port of each pair specifies the value to be operated upon, and the second port specifies the weight to multiply the operand by. Each value, after being weighted, is summed with the results of all the other weightings, and the resultant value is output on the output port. For example, inputs 2.0 with weight 0.5 and 4.0 with weight 0.25 produce 2.0*0.5 + 4.0*0.25 = 2.0.


Condensers

CondenserBase
The CondenserBase behavior takes a multivalued input and condenses it down to a single double output. All condenser behaviors are based on the abstract base class CondenserBase. Each multivalue is a range from x to z, with increments of y: [x:y:z]. Each value has a preference (from 0 to 1) associated with it. The condensers use this preference value to select which value should be output as a double.

CentroidCondenser
The CentroidCondenser behavior condenses an input Multivalue to a single double value by computing the Multivalue's centroid. The condenser has only one input port, which must contain only one input multivalue. The centroid of this multivalue is computed and is pushed on the single output port.

MedianCondenser
The MedianCondenser behavior condenses an input multivalue to a single double value by computing the multivalue's median value. The condenser has only one input port, which must contain only one input multivalue. The median of this multivalue is computed and is pushed on the single output port.

MaxCondenser
The MaxCondenser behavior condenses an input multivalue to a single double value by computing the multivalue's maximum value. The condenser has only one input port, which must contain only one input multivalue. The maximum of this multivalue is computed and is pushed on the single output port.

Resource Behaviors
Resource behaviors access and drive the system hardware.

A/VClient
The A/VClient is one half of a pair of behaviors which allow audio and image data to be streamed over a network, allowing for real-time teleconferencing. It connects to an A/VServer (usually on port 15000), and it sends and receives both audio and video data over one TCP connection. The A/VClient does not handle capture or playback of audio or video. It is simply a conduit through which audio and video packets are sent and received over the network. Currently images are sent as JPEGs over the network in rapid succession to simulate "video". Audio samples are interspersed between the images as they are received by the behavior. No attempt is made to synchronize the audio and video streams. No format information is explicitly sent over the network.


The behavior immediately tries to connect to the specified A/V server (host:port) on activation.

A/VServer
The A/VServer is one half of a pair of behaviors which allow audio and image data to be streamed over a network, allowing for real-time teleconferencing. It receives a connection from an A/VClient, usually on port 1500, and it sends and receives both audio and video data over one TCP connection. The A/VServer does not handle capture or playback of audio or video. It is simply a conduit through which audio and video packets are sent and received over the network. Currently images are sent as JPEGs over the network in rapid succession to simulate "video". Audio samples are interspersed between the images as they are received by the behavior. No attempt is made to synchronize the audio and video streams. No format information is explicitly sent over the network. The behavior immediately starts listening on the specified port on all interfaces (0.0.0.0:port) for a connection from an A/V client. Currently you cannot specify an interface on which to listen.

BatteryMeter
This behavior wraps a battery resource and outputs the battery level in raw voltage and as a percentage.

BumpSensorBehavior
The BumpSensorBehavior is a behavior associated with object contact detection in a bump sensor. It queries the bump sensor, and if the bump sensor is hit, passes out a true (1) value on its output port. If no bump is detected, then it outputs a 0 on its output port.

Camera
This behavior gets images from the camera and, if there are behaviors connected to it on the behavior network, pushes its data out either as an Image object or as a JPEG format image.

CompressedAudioPlaybackBehavior
The CompressedAudioPlaybackBehavior takes data packets compressed in the GSM610 format (single channel, 8000 samples/second) on the input pin and plays them on the sound card.

CompressedAudioRecorder
The CompressedAudioRecorder takes sound samples from the sound card and compresses them in the GSM610 format (single channel, 8000 samples/second) and puts them on the output pin.

DriveSystem
The DriveSystem converts input velocity descriptions into motion on the robot. This is a straightforward behavior, used when you want the robot to go somewhere. Its maximum and minimum acceleration, and maximum and minimum velocity, are specified in the behavior network XML file that contains DriveSystem as a part of its network.

FuzzyLRFParse
This behavior takes an extremely dense pointset from a laser range finder that scans the front 180 degrees and distills it down to three data points corresponding to forward, left, and right. This behavior has one input port and one output port. The input port contains a DoubleArray data object, presumably read from a laser range finder. Each array value is a different distance reading from that sensor, with the first value corresponding to a range reading 90 degrees to the right and the last value being 90 degrees to the left. The behavior outputs a double array of three values. The first value is the forward distance, the second is the left distance, and the third value is the right distance.

FuzzyRangeSensorRing
The FuzzyRangeSensorRing (FRSR) reads the resource-config.xml file, finds all the range sensor devices (by probing them) and configures them. It first calculates each sensor's true 3-space position, finds the robot's bounding box (from the resource config file), then intersects the ray in space describing the sensor's operation with the bounding box. This intersection yields a distance for the range sensor to be offset by to return readings relative to the bounding box of the robot instead of the sensor's true position. This makes adjusting the avoidance behavior of the robot easier, and often avoids problems with deadbands, if the sensors are mounted farther from the edge of the robot. The bounding box is defined in the dimensions section of the resource config, as the body of the robot. The FuzzyRangeSensorRing reads this information when it is automatically configuring the range sensors. Next it calculates how much each range sensor needs to be offset such that its distance readings are measured not from its actual position, but from where the line along which it is listed as facing intersects the bounding box. This lets the robot avoid obstacles more easily, and will avoid hitting the deadband on an IR sensor that is set back at least 10 cm from the robot's surface.

[Figure: distance readings a, b, c, d are reported from the robot's bounding box rather than from each sensor's actual position.]

The following Dimensions section appears in the Resources section. The Shape with an id equal to "body" defines the location of the bounding box of the robot.
<Dimensions>


  <Shape id="body" type="rectangular">
    <Parameter name="lz" value="50"/>
    <Parameter name="ly" value="30"/>
    <Parameter name="lx" value="30"/>
    <Parameter name="link" value="origin"/>
    <Parameter name="x" value="-15"/>
    <Parameter name="y" value="0"/>
    <Parameter name="z" value="25"/>
    <Parameter name="roll" value="0"/>
    <Parameter name="pitch" value="0"/>
    <Parameter name="yaw" value="0"/>
  </Shape>
</Dimensions>

JoystickBehavior
This behavior encapsulates a joystick resource and outputs readings from the joystick.

JPEGServer
This behavior serves JPEG frames to a remote client and can be used as a simple video server within a TCP/IP network with high bandwidth and reliability, for example an internal LAN. This behavior is typically used with the Camera, whose JPEG output connects to the JPEGServer's JPEG input. A video client written in Java is included with the ERSP distribution. To run the video client, run the command evolution_java.sh video-client.

LRFDataBehavior
The behavior reads distance and obstacle data from a laser range finder (LRF). Note that an LRF is not provided with your robot. This behavior has no inputs. It reads data from a laser range finder specified in the sensor_id parameter, which refers to an LRF device in the resource-config.xml file. The LRFDataBehavior has two separate output ports containing the same data in different formats. The first output port contains a list of all distance readings in a double array. As usual, these readings are in cm. If no obstacle is detected, however, the distance is DBL_MAX. A -1 is returned if an error occurred in the scan, for example, when scanning a light bulb or a highly reflective or glaring surface. The first value represents the rightmost reading and the last value represents the leftmost reading. The actual angular offset of this reading depends on how the LRF is configured, but the standard configuration is a 180 degree scan where the first entry is at angle -90 degrees (straight right) and the last entry is +90 degrees (straight left). The number of data points in the double array also depends on the configuration (i.e. requested number of scans), so any code using this array should check the array size. The second output port outputs a PointSet. Each point in the pointset corresponds to an obstacle that the LRF detected. Every value in the distance array (from port 1) gets converted into the appropriate (x,y,z) coordinates and saved in the pointset. Of course, DBL_MAX and -1 distances are not stored in the pointset since they correspond to 'saw nothing' and 'scan error'. For this reason, the number of points in the pointset might be less than the number of distance readings.

MotorQuery
This behavior queries a motor for raw odometry information. There are no inputs to this behavior. Each computation cycle, it queries the specified motor for the distance that motor has travelled in cm. Note that the distance value decreases as the motor moves backward and increases as the motor moves forward. The parameter motor_query_id refers to the motor to query, as named in the resource-config.xml file. Typical names are Drive_left and Drive_right, for the left and right wheel motors.

Odometry
The Odometry reads position and velocity information from the robot's motors. There are no input ports to this behavior. There are four different output ports. The first output port contains all of the robot's position information in a DoubleArray. The first value is the robot's x coordinate, and the second is the y coordinate. The third array value is the robot's heading angle in radians (not degrees). The last value is a timestamp (measured in seconds) for the odometry information. The second output port contains the robot's forward velocity in cm/sec. The third port contains angular velocity in radians/sec. The last output port contains raw data used for odometry calculations. This raw data is dependent on how the underlying odometer works. The only implemented odometer outputs four different values. The first two values are raw odometer tics as read from the motor. The second two values are the conversion factors used to convert tics to distance traveled in cm. Average users do not need to use this raw data. The odometer to use for odometry is stored in the odometry_id parameter. Generally this is set to odometry, but it refers to a specific named device in the resource-config.xml file.

PanTiltControl
The PanTiltControl behavior provides an interface for controlling a pan/tilt system for a camera. Currently, this is only implemented under Windows, using the TrackerCam pan/tilt module. Future support for other pan/tilt mechanisms is expected. Input ports are provided for either relative or absolute pan/tilt commands to be given.

RangeSensor
The RangeSensor is a behavior associated with a range sensor. It queries the range sensor whose ID has been specified, and passes the distance and timestamp returned by the sensor down the network.

SensorAggregate
This behavior determines the available sensors that can be used for obstacle avoidance and combines their output into a compact form. This behavior is commonly hooked into the AvoidanceAggregate.


Wireless Monitor
This behavior outputs the strength of a wireless connection by wrapping an IWireless resource.

Vision Behaviors
Vision behaviors are used to perform tasks based on visual input.

ClusterTargetBehavior
This behavior is closely tied to the StatisticalColorDetection. It takes an array of clusters, finds the best one, determines the bearing of the cluster, and outputs the bearing.

ColorTrainerServer
This behavior allows you to train the StatisticalColorDetection from a remote computer. It creates a TCP server that listens for color training parameters and sets them in a connected StatisticalColorDetection. It is designed to be used in conjunction with the StatisticalColorDetection behavior, the Camera behavior, and the JPEGServer behavior. An example behavior network using the ColorTrainerServer can be found in the following behavior network file: Samp_code_dir/behavior/vision/color_trainer_server.xml. To use this behavior network, run it using the behave program. On another computer, run the ColorTrainer GUI to connect to this behavior network. Now you can use the ColorTrainer GUI to train the robot to recognize particular colors.

FlowDetectBehavior
This behavior computes the optical flow from input images.

ObjRecRecognize
This behavior can do several things. It stores a set of objects in memory. It can train on an object, recognize an object, save out its current model set, delete a model from the model set, or not save the current model set. Most of these options can be set as parameters rather than passed into the behavior at runtime, to save CPU cycles.

PanTiltDriveTracker
The PanTiltDriveTracker is a behavior module that combines object tracking using both pan/tilt cameras and the robot's differential drive system into a single behavior. The tracker does image-based tracking, and requires the image coordinates of the target as an input. Any combination of camera pan/tilt, robot rotation, and robot forward motion can be selected through a mode selection input port. Alternatively, this behavior can select the type of tracking based on the currently available resources. In its current form, the pan/tilt control only works under Windows, using the TrackerCam pan/tilt module. Future support for other pan/tilt mechanisms is expected. Important Note: The drive mode is only partially implemented and so should be used with care.


SkinDetector
This behavior does a per-pixel analysis of input images. Each image pixel is checked for a yes/no match against a mixture of Gaussian models. After each pixel is assigned a true or false value, the matching pixels are clustered together. The resultant clusters are output, along with a debug image that shows what matches skin, and a binary matrix encoding the true/false values on a per-pixel basis.

SkinFlow
This behavior melds the computations of the SkinDetector behavior and the FlowDetection behavior. It looks for moving clusters, and outputs them, on the assumption that a moving cluster is more likely to be an interesting cluster.

StallDetector
This behavior is used to detect, using visual techniques, if the robot is currently in motion or stopped. It has a slight latency, due to video processing, but still reliably and quickly detects if the robot is stopped or moving.

StatisticalColorDetection
This behavior detects regions with a specified color in an image by using statistical methods on UV colorspace parameters. A set of color parameters determines what color the behavior is looking for in the image. The behavior then outputs all rectangular regions containing the color that matches the specified parameters. This behavior is typically used in conjunction with the Camera and the ClusterTargetBehavior to let the robot follow a color. It can also be used with the ColorTrainerServer to allow for remote specification of the color detection parameters. This remote color training behavior network gives visual feedback on the outputs of the StatisticalColorDetection behavior based on visual inputs and is the best way to test the operation of this behavior.

Navigation Behaviors
These behaviors perform tasks to control robot navigation.

AvoidanceAggregate
This behavior receives input about the locations of obstacles and hazards ("dangerous" obstacles) and controls the drive system accordingly. Generally, this input originates from the SensorAggregate. Because the AvoidanceAggregate contains a DriveSystemBehavior already, it should not be present in a network with the latter.

FaceObject
This behavior causes the robot to rotate in place to search for a specified target object. When it recognizes the target object, it turns to try to face the object directly. It accomplishes this by outputting an angular velocity that causes the robot to rotate toward the target object.

FuzzyAvoidance
Together with FuzzyVelocity, this behavior implements a simple obstacle avoidance algorithm using Fuzzy Logic. The simple algorithm for FuzzyAvoidance is as follows: If an obstacle is on the left, turn right. If it is on the right, turn left. If it is in front, turn either way.
FuzzyAvoidance takes as input a set of range sensor readings for the distance to the nearest obstacle in four fuzzy directions: front, left, back, and right. From these readings the algorithm computes an output multivalue, or fuzzy set, of preferences (0 to 1) over a range of angular velocities, indicating the best direction(s) in which to turn. The multivalue can be composed with the output of other behaviors and ultimately condensed to a single angular velocity value, suitable for input to the drive system.

FuzzyHeading
The FuzzyHeading takes a heading and outputs an acceptable angular_velocity multivalue. This behavior is used in fuzzy navigation systems, because single-valued headings do not work in these systems.

[Figure: angular velocity preference (0 to 1) plotted against heading.]

FuzzyVelocity
Together with FuzzyAvoidance, this behavior implements a simple obstacle avoidance algorithm using Fuzzy Logic. The simple algorithm for FuzzyVelocity is as follows: If an obstacle is on the left or right, slow down. If an obstacle is ahead, slow down or back up, depending on distance, unless there is also an obstacle behind. The FuzzyVelocity behavior takes a set of range sensor readings as input. These readings are for the distance to the nearest obstacle in four fuzzy directions: front, left, back, and right. From these readings, the algorithm computes an output multivalue, or fuzzy set, of preferences (0 to 1) over a range of (forward) velocities, indicating the best speed and direction at which to proceed. The multivalue can be composed with the output of other behaviors and ultimately condensed to a single velocity value, suitable for input to the drive system.

HazardAvoidance
This behavior avoids hazards, which are simply obstacles that are viewed as especially dangerous, such as stairs and physical bumps. HazardAvoidance sends an immediate stop to the drive system when a hazard is detected, and then prohibits movement in the direction of the hazard(s) until it is no longer present.

LegDetectBehavior
This behavior takes data from a laser range finder and outputs a heading and distance to the nearest set of legs. If no legs are found, or the legs found are invalid, the output angle is set to 0 and the target distance to 0 cm.

OdometryCompare
The OdometryComparator behavior compares two odometry positions to see if they are sufficiently different. The positions differ either when their (x, y) locations are further than a specified distance apart (20 cm) or when their angles differ by more than a specified angle (10 degrees). The idea is to check whether a camera would take a notably different picture at each odometry position. The behavior takes two odometry positions as input. It outputs a true value stored in a double if the positions differ, and a NULL value if the positions are sufficiently close.
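A minimal sketch of the comparison described above, assuming the 20 cm and 10 degree thresholds; the pose structure and function name are hypothetical, not the behavior's actual interface.

#include <cmath>

struct Pose { double x, y, theta; };   // theta in radians

// Returns true when two odometry poses are "sufficiently different".
bool poses_differ(const Pose& a, const Pose& b)
{
    const double dist_thresh  = 20.0;                 // cm
    const double angle_thresh = 10.0 * M_PI / 180.0;  // 10 degrees in radians

    double dx = a.x - b.x;
    double dy = a.y - b.y;
    double dtheta = fabs(a.theta - b.theta);
    if (dtheta > M_PI)                  // wrap the angle difference
        dtheta = 2.0 * M_PI - dtheta;

    return (sqrt(dx * dx + dy * dy) > dist_thresh) || (dtheta > angle_thresh);
}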

PointAndGo
The PointAndGo behavior takes image coordinates or the robot's relative coordinates, and odometry measurements. It outputs absolute coordinates, typically a target position, that other behaviors can use for navigation. In the case of image coordinates, the coordinate represents a point in the robot's environment to which the robot should go. Knowing the current odometry readings, camera location, and orientation, PointAndGo finds the absolute coordinate of that point and outputs it. In the simpler case, robot-relative coordinates are passed in, and PointAndGo uses odometry to find the absolute coordinates of that point. After PointAndGo is given a target, it continues to move toward that target until either the robot has reached it, or a stop command is sent. At any time, another target can be given to PointAndGo to change the robot's course.

PointandGoBridge
This behavior passes PointandGo data from the client to the robot.

SafeDriveSystem
This is a replacement for the DriveSystem behavior, but with obstacle avoidance. The behavior takes (v, w) velocities and stop input to control the drive system. SafeDriveSystem contains the SensorAggregate and AvoidanceAggregate behaviors; the former determines the types of sensors available on the system, while the latter performs obstacle avoidance using the sensor readings.

StopMonitor
The stop monitor determines if a robot is close enough to a target that it should stop. Two separate odometry inputs are passed into the behavior on two separate ports. One odometry represents the current position and the other represents the target. If the positions are within 30 cm, the behavior outputs a SMOOTH_STOP signal. If this signal is sent to the DriveSystemBehavior, the robot smoothly slows down until it has completely stopped. If the robot is not close enough to the target, nothing is output.

TargetMarkerBehavior
This behavior allows the tracking of targets that temporarily move out of sensing range. When a target is in range of some targeting behavior, it outputs a valid angle-to-target which the TargetMarkerBehavior, receiving the heading as input, leaves unchanged. If, however, the target passes out of range, the heading becomes NULL and TargetMarker takes over, outputting its own heading value.
TargetMarkerBehavior continuously stores the robot position and heading to target while the target is in range. When the target goes out of range, the behavior knows the position and heading at the last moment when the target was sensed. From this data, it computes a ray emanating from the last position in the direction of the last heading. The target must be at some location along this ray. TargetMarkerBehavior outputs headings to trace along the ray until the target is reacquired. If the target is still not found, it continues its search in a circular pattern.

Examples of uses for TargetMarkerBehavior's output include serving as input to FuzzyHeading, or feeding directly into a drive system controller.
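The sketch below illustrates the ray computation described above: given the last known robot position and heading to the target, it produces a heading that points the robot at a point a fixed distance along that ray. The structure and names are hypothetical, not the behavior's actual interface.

#include <cmath>

// Last observation of the target, recorded while it was still in range.
struct LastSighting {
    double x, y;       // robot position at that moment, in cm
    double heading;    // heading to the target at that moment, in radians
};

// Heading (radians) that points the robot from its current position toward
// a point 'range' cm along the remembered ray.
double heading_along_ray(const LastSighting& s,
                         double robot_x, double robot_y, double range)
{
    double px = s.x + range * cos(s.heading);  // point on the ray
    double py = s.y + range * sin(s.heading);
    return atan2(py - robot_y, px - robot_x);  // bearing to that point
}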

TargetToHeadingBehavior
This behavior takes an (x, y) target and odometry information and outputs the angular heading that the robot needs to turn to in order to reach that (x, y) target. It is used as an intermediary between behaviors that output (x, y) targets and those that take angular target inputs, such as TargetMarker.

Speech Behaviors
ASR
The ASR behavior has been designed to allow the development of command-and-control applications. This behavior consists of one input port and one output port. The input port receives the name of the grammar file of corresponding allowed commands. The output port provides the recognized text to down-stream behaviors. The behavior has only one parameter, the default grammar to be loaded when the behavior initializes. The default value for this parameter is the NULL (empty) grammar.


TTS
The TTS behavior provides text-to-speech services to behavior networks. The TTS behavior encapsulates the communication between the behavior network and the TTS driver, which in turn controls the TTS engine. The TTS behavior has several input ports to control the synthesized speech characteristics at run-time. These characteristics can also be specified with parameters. The TTS behavior has three output ports, one for the text to be synthesized and the other two for the voice and the language parameters. The behavior has two parameters: the voice and the language to be used for synthesizing audio. The supported values for the voice parameter are: male, female, child, old_male, and old_female. The supported languages depend on the third-party TTS engine.

Emotion Behaviors

Emotion Aggregate


This behavior performs the main computations of the emotion/personality model and controls the graphical facial responses of the robot. The inputs, Activity, Self-Confidence, and Friendliness, regulate the personality model by controlling the extent of the activation generated by the releasers. The other input collects the emotion elicitor outputs. This is shown in the diagram that follows.

The next diagram shows an overview of the components of the personality model. Because this model works using the behavior-based architecture, it corresponds to just one network of behaviors. Running synchronously, it computes the value of the emotion state in each cycle. An animated face shown on the display demonstrates the emotional state.


The model has four inputs, three of them corresponding to the personality model and the remaining one corresponding to the emotion releasers. This last input is generated by a set of behaviors called elicitors that convert sensor inputs into adequate values for the model. There are two output ports. One carries an array defining the current emotional state. The other outputs a list of labels identifying each entry in the array. The FaceDrive uses these outputs to animate a face graphic which conveys the robot's emotional state. While the examples in this section cover the FaceDrive, you can apply the personality model elsewhere. For example, you can use the emotional state to cause the robot to move by combining the personality model with movement behaviors.

FaceDrive
The FaceDrive behavior converts the emotion state into a facial animation display. The current implementation of the system has one facial expression per emotion state entry. The facial expression can be set as a dominant emotion or a weighted mixture of emotions. The FaceDrive behavior also has inputs for the forward velocity, the angular velocity, and the stop commands. These inputs mimic the inputs to the actual DriveSystem of the robot. These inputs trigger actions that are merged with the facial animations. For example, the displayed face turns to the left if the robot is turning left.

Emotion Elicitors
The emotion elicitors are mediator behaviors that transform a sensory input from the robot environment into a value that is understandable to the model. Given the enormous range of sensory inputs, it is plausible to have many different elicitor implementations, and even several implementations for the same type of elicitor. All elicitors are derived from a base class that defines the input and output ports and performs the basic behavior computation. This computation is based on a function (called elicitor_function()) that has to be implemented by the derived class. For example, in the case of the SigmoidElicitor, the elicitor_function() just implements a sigmoid function; in the case of a SpeechElicitor, the elicitor_function() implements a hash table in which the output value is obtained from the recognized text provided by the ASR behavior. All elicitor behaviors share a common set of input and output ports.

BumpElicitor
Behavior that transforms the Boolean signal provided by a bump sensor into a 0/1 output value.

ObstacleElicitor
This behavior transforms distance output from an IR sensor into a 0/1 output value. If the input is INFINITE, the output is zero, otherwise, the output is one.

ObstacleDistanceElicitor
Behavior that transforms distance output from an IR sensor into a distance within the range of the emotional model. If the input is outside the operation band of the IR sensor, the output is zero, otherwise, the output is a linear function of the input distance.

SigmoidElicitor
This behavior implements a sigmoid function.

SpeechElicitor
This behavior implements a hash table that is provided as a parameter, in which the value in the output port is defined by the input string.

TargetElicitor
Behavior that triggers an emotional response when a target is present.


Chapter 7

Task Libraries

The following sections explain the concepts and tasks you need to use the Task Execution Layer. These sections are:
Defining New Tasks
TaskContext
Events
Example Program
Overview of Tasks

Defining New Tasks


In addition to using the predefined tasks that are provided as part of ERSP, you can define new tasks of your own. It's as simple as defining a new function, plus one additional step of registering the function as a task. For example, here's a script that defines a new, very simple task and then calls the task:
from ersp import task

# First define the task function
def MyTaskFun(context):
    print "hello!"

# Then register the task
MyTask = task.registerTask("MyTask", MyTaskFun)

# Finally, call the task (not the function).
MyTask()

TaskContext
When writing a function you plan on registering as a task, the function must take a single argument. When the task is executed, that argument is bound to the TaskContext object, which represents the task context. One of its most important functions is packaging up the task arguments. To extract the task arguments, you can use the context's getArguments method, which returns a tuple containing the arguments. For example, here's a task that adds its two arguments together and returns the sum:
from ersp import task

def AddFun(context):
    # a will be set to the first argument, b to the second argument.
    (a, b) = context.getArguments()
    return a + b

Add = task.registerTask("Add", AddFun)
print Add(5, 4)

Events
Some tasks are designed to watch for particular events or situations, for example a loud noise, a recognizable object in front of the camera, or a spoken command. These tasks need a way to signal when the situation they have been monitoring for occurs, and the way they signal other tasks is by raising events. You can make your tasks wait for these events and take appropriate actions.

Example Program
Here is an example that waits for the speech recognition system to hear a spoken command:
from ersp import task
from ersp.task import speech

def WaitForCommandFun(context):
    task = context.getTask()
    event = task.waitForEvent(speech.EVENT_SPEECH_RECOGNIZED)
    text = event.getProperty("text")
    print "You said '%s'." % (text,)

WaitForCommand = task.registerTask("WaitForCommand", WaitForCommandFun)

p = task.Parallel()
p.addTask(WaitForCommand)
p.addTask(speech.RecognizeSpeech)
p.waitForFirstTask()

Here is an overview of what this program does. It starts up two tasks in parallel, the custom WaitForCommand task and RecognizeSpeech. RecognizeSpeech is a task that runs forever and raises events when the speech recognition system recognizes speech. WaitForCommand waits for one of the speech recognition events, prints the text that was recognized, then returns. Because Parallel's waitForFirstTask method is called, the program ends as soon as the WaitForCommand task returns.
WaitForCommand needs to get a handle on the Task object that represents the currently executing instance of the WaitForCommand task. This is required in order to call the Task's waitForEvent method.

The waitForEvent method waits for an event to occur and returns the event. You specify which events you're waiting for by passing the type as the first argument (EVENT_SPEECH_RECOGNIZED in the example). The method does not return until an event of that type is raised by another task. After a speech recognition event is raised and waitForEvent returns, look up the value of the "text" property, which for recognition events contains the text that was recognized. You can also write tasks that raise their own events. Raising an event involves:
1. Creating the event
2. Setting any properties
3. Raising the event
Here is an updated version of the Add task that, instead of returning the sum of its two arguments, raises an event that contains the sum. This is a fictional example of how to use events:
from ersp import task

def AddFun(context):
    (a, b) = context.getArguments()
    event = Event("Sum")
    event["sum"] = a + b
    manager = context.getTaskManager()
    manager.raiseEvent(event)


Overview of Tasks
The following are the tasks with brief descriptions. For detailed descriptions of these tasks, refer to the Doxygen documents located in the Install_dir/doc/ERSP-API/html directory for Linux and Windows.
AdjustFlowDetectParams - Adjusts the parameters the DetectFlow task uses, without stopping DetectFlow (if it is running).
AudioVideoClient - Connects to an AudioVideoServer on the specified IP and port. After a connection is established, it starts sending/receiving A/V information over the connection. It uses both the Camera resource and the AudioRecord resource.
AudioVideoServer - Listens on a port for connections. After a connection is established, it starts sending Audio/Video information over the connection. It uses both the Camera resource and the AudioRecord resource.
CloseER2Gripper - A blocking call that closes the ER2 gripper fully or by a specified position delta.
CloseGripper - Closes the gripper.
DetectColor - Begins watching for a specified color and raises events when it is seen.
DetectFlow - Raises an event each time a new frame comes out of the FlowDetect behavior.
DetectGesture - Begins watching for gestures, and raises events when they are detected.
DetectMotion - Returns the amount of motion between consecutive frames from the camera.
DetectObject - Begins watching for a specified object and raises events when it is seen.
DetectSound - Returns the perceived volume level from the microphone.
DoAtATime - Performs a task at a specific time.
DoPeriodically - Performs a task periodically.
DoWhen - Waits for an event, and executes a task when the event is raised.
DriveMoveDelta - Moves the drive system to a position delta.
DriveStop - Sends a stop command to the drive system.
FaceObject - Rotates to search for and face a specified object.
GetImage - Returns an image from the camera.
GetPosition - Returns the robot's position in the global coordinate system (x, y, theta).
GetRangeData - Polls the range sensor and returns distance-timestamp pairs in one contiguous array.


GetVelocities - Returns the current linear and angular velocity of the robot as reported by odometry (v, w).
GotoObject - Rotates to search for and face a specified object.
Move - Moves at the given velocity.
MoveER2ArmDelta - A blocking call that moves the arm to a specified position.
MoveRelative - Moves a set of relative distances.
MoveTo - Moves to a set of absolute points.
OpenER2Gripper - A blocking call that opens the ER2 gripper fully or by a specified position delta. The full range of motion is 10,000 motor units.
OpenGripper - Opens the gripper.
PlaySoundFile - Plays a sound file.
PrintEvent - Prints out events of the specified type until terminated.
RangeSensorMove - Sets the drive system to move at the given velocities until a specified range sensor reads a value less than a specified threshold. The call blocks until the range sensor threshold is read.
RangeSensorSweep - Sets the drive system to turn at the given angular velocity until a specified range sensor reads a value less than a specified threshold or the drive has turned a specified angular distance. The call blocks until either the threshold is read or the distance is reached.
RecognizeSpeech - Recognizes speech according to the specified grammar, raising events when speech is detected.
SendMail - Sends mail and waits for its completion.
SenseObjectInGripper - A blocking call that checks if an object is sensed by the gripper. This only works if the gripper is in the fully open position.
SetDriveVelocity - Sets the drive system to move at a given velocity.
SetER2ArmVelocity - A non-blocking call that starts the ER2 arm moving at a specified velocity.
SetFaceEmotion - Sets a facial expression based on a vector.
SetTrackColorVelocity - Sets the linear velocity while color tracking.
Speak - "Speaks" the given text using TTS, waiting until completed.
SpeakFromFile - "Speaks" the specified file using TTS, waiting until completed.
Stop - Stops.
Teleop - Remotely controls the robot.
TrackColor - Tracks motion occurring within regions of the specified color.
TrackColorFlow - Tracks motion occurring within regions of the specified color.


TrackSkinFlow - Tracks people by tracking skin-colored objects that are moving.
TrainObject - Adds an object to a modelset.
Turn - Turns at the given angular velocity.
TurnRelative - Turns a relative distance.
TurnTo - Turns to an absolute heading.
Wait - Waits for the specified amount of time.
WaitUntil - Waits for an event.
Wander - Causes the robot to wander aimlessly with obstacle avoidance enabled. This task never returns, so if you want to stop wandering you must terminate this task.
WatchInbox - Waits for new mail and raises an event when it is available.


Chapter 8

Core Libraries

The Evolution Robotics Software Platform (ERSP) includes a number of Core Libraries to ease various development tasks. These are described in this section.

Image Toolkit
The Imaging Toolkit provides facilities to acquire and process image data from cameras and other sources. Major features of the Toolkit include:
Generic Image class to represent raw data formats.
Colorspace and image format conversion utilities.
Basic I/O to and from various file formats, memory blocks, and network sockets.
Transformations such as up- and down-sampling.
Matrix conversion for computer vision processing.
A set of simple filters: mean, sum, covariance, and standard deviation.
The Imaging Toolkit is part of the Core Resource Library, and can be included with the <evolution/Resource.hpp> header file. A more detailed discussion of Toolkit components follows.


Image Class
The Image class is at the heart of the Imaging Toolkit, encapsulating a raw data image in standard formats, and performing general operations on the image such as conversion, transforms, memory management, and I/O.

Formats
Images can be stored in a variety of raw formats commonly supported by cameras, including Grayscale, RGB24, and numerous YUV formats, such as IYUV, UYVY, and the IEEE1394 standards IYU1 and IYU2. IYUV is the default and preferred format, because its planar layout allows access to grayscale information without computation, while still providing color information. For more information, see Colorspace Utilities on page 8-3.

Initialization
Image instances can be initialized with a memory buffer, with another Image object, or created (or copied) from a camera, using the ICamera interface. They can also be read from a raw or compressed image file (for example, PNG) or a Matrix object.

Conversion
The Image class supports conversion between the raw formats, either by the conversion of the image itself, or by the creation of an image copy in the destination format. The latter is much more efficient than copying the image and then converting the copy. As individual pixel values are accessed, their color format can be converted, though for large numbers of pixel access it is more efficient to convert the entire image and access the actual data pointer. The Image class uses the ColorspaceConverter utility class, which you can also use directly. Conversion to and from matrices in the Math Library is also supported for use in computer vision processing, as provided by the Evolution Robotics Vision SDK.

Transforms
While it is more appropriate to perform complex calculations on matrices, some simple transformations are supported on the Image class itself. An Image can be up-sampled or down-sampled. It can be cropped. Filters that apply convolutions for sum, mean, covariance, and standard deviation are also available.

Memory Management
Usually, the Image class manages all necessary memory allocations as data are loaded and converted. The direct manipulation of and access to the data buffer are also supported. The buffer can be resized manually without conversion, invalidating the image data, reformatted, or cleared. In addition, the Image class is reference-counted, suitable for safely passing data around in different contexts.


I/O
The Image class provides methods to read from and write to a raw or compressed image file. Specifically, raw, jpeg (.jpg), bitmap (.bmp) and portable network graphics (.png) file formats are supported. In addition, the facilities in ImageIO can be used for more fine-tuned control of file I/O, as well as accessing memory buffers and network sockets.

Metadata
Because an image created by a camera can be used in vision computations, it is important to have associated metadata stored with the image. Accordingly, the Image class retains the timestamp and the intrinsic and extrinsic parameters of its camera source at the time of creation.

Drawing
A number of simple drawing methods are available on the Image class to superimpose shapes on an image, such as lines, boxes, circles, and so on. These methods are especially useful for visualization of computer vision processing, for example, drawing arrows to indicate motion flow. These drawing routines are generally heavily optimized for drawing in the RGB colorspace.

Colorspace Utilities
Cameras support various raw color formats, such as Grayscale/Y800 and RGB24. The Imaging Toolkit makes a distinction between colorspaces and color formats, because there is often confusion when the two concepts are blurred.

Color Formats
Each color format represents color according to a colorspace; the supported colorspaces are: Grayscale, RGB, YCbCr, and YUV. For a given colorspace, there are often many different color formats. For example, the formats IYUV, UYVY, and IYU1 all use the YCbCr colorspace. The colorspace determines the color values of a single pixel and how they represent a specific color (or, in the case of Grayscale, a shade of gray). The color format determines how the pixels that compose an image are arrayed. One source of confusion is that the various "YUV" formats actually use the YCbCr colorspace. While YUV and YCbCr are fairly equivalent, YCbCr is scaled in a manner that eases A-to-D and D-to-A conversion and is preferred for imaging hardware. For clarity, the Imaging Toolkit defines the YCbCr and YUV colorspaces separately, and marks the "YUV" formats correctly as using the YCbCr colorspace. If a "pure" YUV format is needed for some reason, the Toolkit includes an artificial YUV color format (packed, 24-bit). If the preceding discussion impels you to abandon hope of using images, fear not. In general, you can convert between the various color formats using the Image class or ColorspaceConverter utilities.


Format Description
Because the large number of color formats can be a bit overwhelming, there are utility functions in the Colorspace.hpp header file to determine information about the characteristics of each color format, such as colorspace, bits per pixel, and sampling. The calculate_buffer_size() function is particularly important, calculating the required size of a memory buffer for an image with a specific color format and dimensions.
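The arithmetic behind that calculation is simple. The hypothetical helper below is only an illustration of it, not the library function itself:

// Illustrative only: bytes = width * height * bits_per_pixel / 8.
// For a 320x240 IYUV image (12 bits per pixel) this gives 115200 bytes;
// for RGB24 (24 bits per pixel) it gives 230400 bytes.
unsigned long image_buffer_bytes(unsigned width, unsigned height,
                                 unsigned bits_per_pixel)
{
    return (unsigned long) width * height * bits_per_pixel / 8;
}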

Colorspace Details
Following are some details on the various colorspaces:
RGB - This common and familiar colorspace represents color as a triplet of red, green, and blue values, each ranging from 0 to 255.
Grayscale - This monochrome colorspace simply represents the brightness of a pixel as a shade of gray from 0 to 255.
YUV - Because the human eye is more sensitive to luminance (brightness) than chrominance (color), this colorspace dedicates one value to luminance, leaving two values to represent chrominance. The Y is the luminance, while U and V are chrominance, with U equal to blue minus Y, and V equal to red minus Y. Because the chrominance is separated out and can be subsampled to save transmission bandwidth and storage space, YUV is a very common video output format. As U and V contain the red and blue components of the color, the green component can always be derived from YUV. Y ranges from 0 to 255, while U and V range from -127 to +127.
YCbCr - As noted above, this is the actual colorspace used by video color formats. For A-to-D and D-to-A conversion, the chrominance levels (U and V) are scaled to match the luminance levels, now calling them Cb (Chroma:Blue) and Cr (Chroma:Red). The luminance levels are also scaled a bit relative to YUV, so, yes, the Y in YCbCr is NOT equal to the Y in YUV! This Y ranges from 16 to 235, while Cb and Cr range from 16 to 240.
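A minimal sketch of the YUV relationships described above. The 0.299/0.587/0.114 luminance weights are the standard assumption, not something this Toolkit defines, and real YUV formats apply additional scaling to U and V to fit the ranges quoted above.

// Derive Y, U, V for one pixel from its RGB values (each in [0, 255]),
// following the definitions above: U = blue - Y, V = red - Y.
void rgb_to_yuv(double r, double g, double b,
                double* y, double* u, double* v)
{
    *y = 0.299 * r + 0.587 * g + 0.114 * b;  // luminance (standard weights)
    *u = b - *y;                             // blue chrominance (unscaled)
    *v = r - *y;                             // red chrominance (unscaled)
}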

Color Format Details


Aside from the colorspace, the color formats are distinguished by packed vs. planar layout. Packed layout means that the color values for a given pixel are contiguous. For example, in RGB24, we have the (R, G, B) values for the first pixel, then the values for the second pixel, and so on. Planar layout, on the other hand, groups the values of each color component for all pixels in the image; for example, in IYUV, all Y values for the image appear first, followed by all U values, followed by all V values. With packed formats, it is easier to access the values for a single pixel, and to iterate through blocks of pixels. With YUV planar formats, however, conversion to grayscale is trivial, because all Y values occur at the beginning of the image data, and Y values are equivalent to grayscale values. Following is a list of the more common color formats:


Grayscale/Mono/Y800 - Neither packed nor planar, because each pixel has only a luminance value.
RGB24 - Packed 24-bit RGB format.
RGB32 - Packed 24-bit RGB format, with a byte of padding for each pixel to align storage.
IYUV/I420 - Planar 12-bit YUV format, with Y first, then U and V, each sampled at every second pixel in the vertical and horizontal directions, so that the size of the U and V blocks is 1/4 that of the Y block.
UYVY - Packed 16-bit YUV format, with U and V sampled at every second pixel in the horizontal direction. The layout is grouped in 4-byte blocks, with a byte each for U and V and 2 bytes for the Y values of two pixels. We have: U0Y0V0Y1 U2Y2V2Y3...
IYU1 - Packed 12-bit YUV format, part of the IEEE1394 standard, with U and V sampled at every fourth pixel in the horizontal direction. The layout has a U value, then the Y values for two pixels, a V value for the first pixel, then Y values for two more pixels, and so on. We have: U0Y0Y1 V0Y2Y3 U4Y4Y5 V4Y6Y7...
IYU2 - Packed 24-bit YUV format, part of the IEEE1394 standard, with full sampling for Y, U, and V. The layout is: U0Y0V0 U1Y1V1...
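To make the planar layout concrete, the sketch below computes where each plane starts in an IYUV buffer, following the layout just described. The function is illustrative only, not part of the Toolkit API.

// For an IYUV (I420) image: the Y plane holds width*height bytes,
// followed by a U plane and a V plane of 1/4 that size each.
void iyuv_plane_offsets(unsigned width, unsigned height,
                        unsigned long* y_offset,
                        unsigned long* u_offset,
                        unsigned long* v_offset)
{
    unsigned long y_size  = (unsigned long) width * height;
    unsigned long uv_size = y_size / 4;   // each chroma plane is 1/4 the Y plane
    *y_offset = 0;
    *u_offset = y_size;
    *v_offset = y_size + uv_size;
}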

Further Reference
Some useful references on colorspaces and formats:
http://www.fourcc.org/
http://www.thedirks.org/v4l2/v4l2fmt.htm
http://elm-chan.org/works/yuv2rgb/report.html

Math Library
The core math library contains a basic set of mathematical algorithms and structures commonly used in robotics:
2-Dimensional Vectors (Vector2)
3-Dimensional Vectors (Vector3)
3-Dimensional Angles Array (EulerAngles)
Point Sets (PointSet)
Multivalued Functions (MultiValue)
Matrix template class (N-Dimensional)
Vector template class (N-Dimensional)
VectorField (Matrix of 3D vectors)
3-Dimensional Matrix Class (Matrix3)
2-Dimensional Matrix Class (Matrix2)


2D/3D geometric classes (Point / Line / Triangle / Box), and intersection testing methods.
Gaussian modeling utilities (Gaussian, GaussianMixture, ClassMixture)

Vector3
The Vector3 class is a collection of three doubles (a three-tuple). You can construct one from individual values, an array, or another vector:

Vector3 a(0, 1, 2);        // a = (0, 1, 2)
double x[3] = {4, 5, 6};
Vector3 b(x);              // b = (4, 5, 6)
Vector3 c(b);              // c = (4, 5, 6)

The Vector3 class includes basic mathematical operations for the 3-tuple.

Getting/Setting the Data

a[0] = a[1] - 1;   // a = (0, 1, 2)
b.set(5, 4, 3);    // b = (5, 4, 3)

Scaling
a *= 10;      // a = (0, 10, 20)
b = a * .5;   // b = (0, 5, 10)

Vector Addition
a += b;      // a = (0, 15, 30)
c = a + b;   // c = (0, 20, 40)

Casting into a Double Array


const double* arr = c; // arr points to c's internal point array

Vector and Array Assignment


a = arr;   // a = (0, 20, 40)
a *= 2;    // a = (0, 40, 80)
b = c;     // b = (0, 20, 40); this leaves c unchanged.

Basic Comparisons
if (a == b) {
    // This will not execute
}
if (a != c) {
    // This will execute
}


The Vectors have formatted output to cout


cout << a << endl;   // Prints "(0, 40, 80)\n"

In addition to the basic mathematical operands shown above, Vector3 defines certain functions for use when the vector is viewed as Cartesian coordinates (a position vector) in 3-space:

Vector3 a(0, 3, 4);   // a = (0, 3, 4)

Normalization
double old_len = a.normalize(); // old_len = 5.0, a = (0, .6, .8)

Existence
// True if a != (0, 0, 0)
if (a) {
    // This will execute
}

Length
Vector3 b(10, 20, 30);       // b = (10, 20, 30)
double len = b.length();     // len = sqrt(1400.0)
len = b.lengthsquared();     // len = 1400.0

Dot Product
b.set(.8, .6, 0);        // b = (.8, .6, 0) -- normalized
double dp = dot(a, b);   // dp = .36

Cross Product
Vector3 c = cross(a, b);   // c = (-.48, .64, -.48)

Angle Vector Casting

// Casts to the angle vector which points in the same direction
// as the original position vector.
a.set(1, 1, 0);
EulerAngles ang = EulerAngles(a);   // ang = (M_PI/4, 0, 0)  (yaw, pitch, roll)
                                    // Roll angle will be zero.


Euler Angles
Angle vectors define a point on the unit sphere in polar coordinates, plus a rotation around that point. These angles are:
yaw (rotation about the Z axis)
pitch (rotation about the new Y axis)
roll (rotation about the new X axis)

Existence

// Always true, because no angle vector encodes position (0, 0, 0)
if (phi) {
    // This will execute
}

Cast to unit position vector


Vector3 dir = Vector3(phi); // dir = (-1, 0, 0)

Compute axis vectors defined by angles


Vector3 forward, left;
phi.toAxis(&forward, &left, NULL);
// Forward = (-1, 0, 0)
// Left = (0, -1, 0)
// Up is not set because the pointer was NULL, but would be (0, 0, 1)

Bound angles in range [-PI, +PI]


phi.set(-2*M_PI, M_PI*1.5, 4*M_PI);
phi.bound_angles();   // phi = (0, -.5 * M_PI, 0)

Point Sets
The PointSet class is a memory managed list of Vector3 vectors:
PointSet set(4);       // Create a set of 4 Vector3 vectors
Vector3& v = set[0];   // Grab vector 0 from the point set
v.set(1, 2, 3);        // Do stuff to the vector
set.resize(3);         // Delete old positions and resize for 3 entries

MultiValued Functions
The MultiValue class is the discrete evaluation of a general mathematical function. Each multivalue has a set of intervals defining the domains for each input variable. For each interval intersection, there is a double value representing the evaluation of the function at that point.


Multivalue objects contain a lot of data. For example, if the input function is:
f(x) = x + .5

And the domain interval of the x variable is:


Domain(x) = [0, 1, 2, 3] (from 0 to 3 in steps of 1)

Then the multivalue consists of these evaluations:


0 => .5
1 => 1.5
2 => 2.5
3 => 3.5

For a more complicated function:


g(x,y) = x + y

With domains:
Domain(x) = [0, 1, 2, 3]
Domain(y) = [0, .1, .2]

The multivalue consists of these evaluations:


0, 0.0 => 0.0
0, 0.1 => 0.1
0, 0.2 => 0.2
1, 0.0 => 1.0
1, 0.1 => 1.1
1, 0.2 => 1.2
2, 0.0 => 2.0
2, 0.1 => 2.1
2, 0.2 => 2.2
3, 0.0 => 3.0
3, 0.1 => 3.1
3, 0.2 => 3.2

Multivalued functions can represent more complex mathematical functions than the examples given here. The MultiValue class provides access to change any evaluated point. This provides a method to extend the function as continuously or discontinuously as you want. To construct a MultiValue function, each domain interval is constructed manually ahead of time. Intervals are defined by a starting point, a step, and a stop point. For the g(x,y) function described above, that construction is:


Interval x_domain(0, 1, 3);    // Interval from 0 to 3 in steps of 1
Interval y_domain(0, .1, .2);  // Interval from 0 to .2 in steps of .1

The MultiValue object is constructed after the intervals are created. The MultiValue constructor takes as its first argument the dimensionality of the function - the number of variables. Each argument thereafter is a pointer to the interval describing that variable's domain. For example, to construct the g(x,y) multivalue:
MultiValue g(2, &x_domain, &y_domain);

The evaluation for each (x,y) coordinate in this multivalue is initialized to 0 and is set to its proper value after construction. To manually access interval values, do the following:
g.set_value(0, .1) = 0.1;   // Set g at x=0, y=.1 to value .1
g(0, .1) = 0.1;             // Same thing

To read values:
double val1 = g.get_value(0, .1);   // Get g at x=0, y=.1
double val2 = g(0, .1);             // Same thing

Iterating over the entire multivalue point by point can be cumbersome. The most convenient syntax uses the multivalue iterator:
for (MultiValue::Iterator point = g.begin(); point != g.end(); ++point) {
    double old_value = point.get_value();
    point.set_value(old_value * 2);
}

This example code does not refer to the multivalue coordinates for each point, but it can. The following code looks up the appropriate (x,y) coordinates and sets the result to x+y:
for (MultiValue::Iterator point = g.begin(); point != g.end(); ++point) {
    SizeType inputs[2];
    point.get_current_indicies(inputs);       // Get x, y, ...
    point.set_value(inputs[0] + inputs[1]);   // Compute x + y
}

Under constrained situations, well-defined operations exist between MultiValues to produce a new output multivalue. If two multivalues have the same number of input variables (dimensionality) and all input interval domains match perfectly, they are compatible. You can test for compatibility using the is_compatible() function:
if (f.is_compatible(g)) {
    // Do something if f and g are compatible
}


Some operations among compatible multivalues are already defined:


// Create multivalue "h" whose values are 10 times those of "f"
MultiValue h = f * 10;
// Create a multivalue whose values are the sum of f's and h's values
MultiValue h2 = f + h;

If two incompatible multivalues are summed together like this:


MultiValue h3 = f + g;

The assumption is that the second multivalue is the identity. Therefore, the multivalue output is identical to the first input.
if (f == h3) {
    // Do some work
}

N-Dimensional Vectors
The N-Dimensional Vector is a template class that supports the same vector math operations as the 3-Dimensional Vector. Cross is the only exception, and is not implemented for an N-Dimensional vector.

Matrix
The N-Dimensional Matrix is a template class datatype and supports the basic math operations. It also supports correlation, sum, mean, deviation, spoiling and despoiling, and has many methods for setting its contents (ones, zeros, identity, random, gradient, shift). The most efficient multiplication operations are obtained when using the "operate and assign" operators, as opposed to operator* and the like. Many methods for manipulating matrices and Images in the context of vision are found in Install_dir/include/core for both Windows and Linux. This is not a LINPACK-quality matrix library.
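The reason the operate-and-assign form is cheaper is general C++ behavior rather than anything specific to this library: operator* must build and return a new object, while operator*= updates an existing one in place. The sketch below shows the structural difference with a trivial stand-in type (an element-wise product), not the ERSP Matrix class itself.

// Stand-in type, for illustration only.
struct Tiny
{
    double v[4];

    Tiny& operator*=(const Tiny& o)          // in place: no temporary object
    {
        for (int i = 0; i < 4; ++i) v[i] *= o.v[i];
        return *this;
    }
};

inline Tiny operator*(Tiny a, const Tiny& b) // by value: creates a temporary
{
    a *= b;
    return a;
}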

VectorField
VectorField implements a matrix of 3D vectors. Use for modeling flows in a matrix, image, or two dimensional flow fields in general.

Platform Independent Utilities


ERSP provides a set of platform-independent tools and variables. This section mentions those that are most commonly used. Refer to the Doxygen documents for information on the remainder of the utilities.

SystemProperty and Related Methods


The ERSP stores system properties for easy use in the system. The function Platform::get_system_property(SystemProperty, String*) gets the specified system property (currently one of: INSTALL_PATH, CONFIG_PATH, FACES_PATH, or MODELSETS_PATH) and places it in the passed-in String.
Platform::find_system_file(SystemProperty, const char* filename, String* full_path) checks the directory used by the SystemProperty (CONFIG_PATH, most commonly) and returns the full path on that system to that file. This is especially useful for determining exactly which file your system is looking at, if you have multiple file copies installed on your system. A function named Platform::find_file determines the full path of a specified file, given a list of directories, if it exists in those directories.
Platform::file_exists returns a Result indicating whether a file with the given filename exists. This is quite useful to have and is cross-platform.


Platform::millisecond_sleep is a cross-platform method of sleeping for a specified number of milliseconds.
Timestamp Platform::get_timestamp() is fairly obvious. Timestamp is currently defined as a double that represents seconds, with precision down to microseconds.

A handful of functions deal with file and path delimiters, file formatting, making paths, and so on. These are useful when manipulating file paths. However, these are not very useful when dealing with mathematical manipulation. Try using Matrix instead.
Platform contains a set of functions to do platform-independent thread manipulation: create_thread, create_condition_variable, and create_mutex. Create_mutex creates a recursive mutex by default. The other functions are straightforward.
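The short sketch below strings several of the calls described above together, using the signatures quoted in this section. The header name, namespace handling, and the exact scoping of the SystemProperty constants are assumptions, so treat this as an illustration only.

#include <evolution/Resource.hpp>  // assumed umbrella header, as for the Imaging Toolkit
// Namespace qualification is omitted here; the namespace and the scoping of
// the SystemProperty constants are assumptions based on this section.

void show_config_file()
{
    String config_dir;
    // Fetch the configuration directory (the CONFIG_PATH system property).
    Platform::get_system_property(CONFIG_PATH, &config_dir);

    // Resolve which resource-config.xml the system would actually load.
    String full_path;
    Platform::find_system_file(CONFIG_PATH, "resource-config.xml", &full_path);

    Platform::millisecond_sleep(100);   // pause for 100 ms, cross-platform
}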

Core Libraries Utilities


Three useful utilities are provided by the core library's utility section. The first is a very standard MD5 generator named MD5. The second and third are very similar, both being bit depth tables. The tables map three inputs to a single bit of output. CompactCacheTable supports arbitrary x and y, with a [0,32] range for z. BinaryCacheTable supports arbitrary x, y, and z values. They both support basic functions for getting and setting any given bit. These are useful for caching any sort of complex and processor-intensive decision function. Refer to the Doxygen documents for further details.

Fuzzy Logic
The Fuzzy Logic Toolbox (FLT) offers a simple suite of tools for prototyping behaviors. It provides basic membership functions, an implication operator, an instruction set, a rule combiner, and a defuzzification method. Beyond this, several behaviors implement fuzzy logic set operations. The FLT core is implemented in the following files, located in the math directory:
FuzzyConstants.hpp
FuzzyMembership.hpp
FuzzyOperators.hpp


Defuzzy.hpp
Fuzzy logic is a superset of conventional logic that recognizes more than true and false values. With fuzzy logic, degrees of truthfulness and falsehood are possible. For example, the statement "he is tall" might be 100% true if a person is over 7 feet tall, 80% true if a person is 6 feet tall, 50% true if a person is 5 1/2 feet tall, and 0% true if a person is under 5 feet tall. The FLT allows you to write behaviors using easy-to-understand if/then constructs, rules which are analogous to linguistic terminology. For example:
IF obstacle is close on right THEN turn left.

Left, close, and right are all linguistic terms defined using fuzzy sets. The fuzzy membership functions are adjustable for system performance tuning.

Basic Fuzzy Data Structures


double - The basic fuzzy data element type.
vector<double> - A vector of doubles used to store the fuzzy set.
multivalue - Data structure used to contain the fuzzy set over discrete steps.
result - The error status of the aggregation function.

Basic Fuzzy Set Types


Simple membership functions are based on trapezoid functions. Default values are provided for t and width, which can be overridden.

double low (double x, double t, double width)
double med (double x, double t, double width)
double high (double x, double t, double width)
double triangular (double x, double t, double width)

Basic Fuzzy Operators


Simple fuzzy inferencing operators are also provided in the Fuzzy Logic Toolbox. Unless otherwise specified, FUZZY_TRUE = 1.0.

The fuzzy NOT operator
double fuzzy_not (double x)

Returns 1.0 - x

The fuzzy OR operator


double fuzzy_or (double x, double y)

Returns maximum of (x,y)

The fuzzy AND operator


double fuzzy_and (double x, double y)

Returns minimum of (x,y)


The fuzzy FALSE operator


double fuzzy_false ()

Returns NOT of (FUZZY_TRUE)

The fuzzy PRODUCT operator


double fuzzy_product (double x, double y)

Returns arithmetic product of (x,y)

The fuzzy IFTHEN operator


double fuzzy_ifthen(double condition, double consequence)

Returns minimum of (condition, consequence)

Basic Fuzzy Rule Combination


Aggregate and Combine take a rule set, r, and generate a combined implication set, returning the combined rule value set. These two functions do the same thing but take different types. The inference engine's default rule for aggregation is MAX.

double combine (double r, int num)

Aggregates using an array of doubles.

double aggregate (std::vector<double> &r)

Aggregates using a vector of doubles.

Crisp Value Generation


To generate crisp values, several defuzzification functions are provided: centroid, median, and max defuzzification functions.
Result defuzzify_centroid (const multivalue& mv, double& value)
Result defuzzify_median (const multivalue& mv, double& value)
Result defuzzify_max (const multivalue& mv, double& value)

The defuzzification functions take a one-dimensional multivalue as input, store the crisp result in the value parameter, and return RESULT_SUCCESS.
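A minimal sketch tying these pieces together for the rule "IF obstacle is close on right THEN turn left", using only functions listed in this section. The include paths and namespace are omitted and depend on your installation, the 50 cm threshold and 20 cm width are made-up tuning values, and the assumption that low() gives high membership for small distances follows the "low" membership shape.

// Illustrative only. Computes how strongly "turn left" is recommended,
// given the range reading (cm) on the robot's right side.
double turn_left_strength(double right_distance_cm)
{
    // Fuzzy truth of "obstacle is close on right".
    double close_on_right = low(right_distance_cm, 50.0, 20.0);

    // IF close on right THEN turn left (consequence fully true).
    return fuzzy_ifthen(close_on_right, 1.0 /* FUZZY_TRUE */);
}

The value returned could then be written into a multivalue of angular velocity preferences, combined with other rules via aggregate(), and reduced to a single command with defuzzify_centroid().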

Logging Utilities
ERSP provides a fairly complex set of logging utilities. These are most often used as a method of switchable logging. All logging statements have an associated priority; priorities are ordered and system-defined. Each debug object, typically one per file, has an associated category. Categories are user-defined and ordered by depth, in a tree hierarchy. For example:

Categories:


CodeClass
CodeClass.Movement
CodeClass.Movement.Mapping
CodeClass.Movement.Mapping.Pathing
CodeClass.Movement.Mapping.Landmarks

Priorities:

LOG_DEBUG
LOG_INFO
LOG_WARN

ERSP_LOG_CATEGORY("CodeClass.Sight.VisionProcessor") defines CodeClass.Sight.VisionProcessor as a category. ERSP_LOG_DEBUG(...) is a statement at the LOG_DEBUG priority. ERSP_LOG_WARN(...) is a statement at the LOG_WARN priority.

Where to use logging


ERSP_LOG_CATEGORY("<Category>") is used in a .cpp file, as it declares an actual instance of a logging object. The ERSP_LOG_xxx statements can then be used anywhere in that file after the ERSP_LOG_CATEGORY declaration.

If your class requires extensive use of inlined functions and/or is a templated class,
ERSP_LOG_CLASS_CATEGORY_DECLARE and ERSP_LOG_CLASS_CATEGORY_IMPLEMENT might be better choices. These allow the use of the ERSP_LOG_<DEBUG|etc> macros in the class declaration in header files. If ERSP_LOG_CATEGORY is used, ERSP_LOG statements are only valid in the implementation (.cpp) file. Using DECLARE/IMPLEMENT allows you to use ERSP_LOG statements anywhere in that class, including the header. ERSP_LOG_CLASS_CATEGORY_DECLARE(classname, <category string>) in a header file within a class declaration sets up logging so that it can be used in a header file.

You must place ERSP_LOG_CLASS_CATEGORY_IMPLEMENT(classname, <category string>) in the corresponding implementation file. Any uses in the implementation must be in functions belonging to the class for which the category is declared. These statements can be used inside the implementations of functions in the .cpp file, as well as in the header within inlined function implementations.
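A minimal sketch of the pattern just described. The class name and category string are hypothetical, and the exact placement and punctuation conventions for the macros are assumptions; the logging header include is omitted.

// Sensor.hpp
class Sensor
{
public:
    // Declares the logging category for this class, so ERSP_LOG_* can be
    // used even inside inlined member functions in this header.
    ERSP_LOG_CLASS_CATEGORY_DECLARE(Sensor, "CodeClass.Hardware.Sensor");

    void poll()
    {
        ERSP_LOG_DEBUG("Polling sensor");   // valid here because of DECLARE
    }
};

// Sensor.cpp
ERSP_LOG_CLASS_CATEGORY_IMPLEMENT(Sensor, "CodeClass.Hardware.Sensor");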


Formatting of logging statements


Each logging statement takes arguments in a standard printf way. Format strings and arguments afterwards are processed exactly the same as printf processes them.

Logging Levels
There are several ways to use the ability to switch which levels appear. The levels of logging are:
LOG_NOLOG
LOG_FATAL
LOG_ERROR
LOG_WARN
LOG_INFO
LOG_DEBUG
LOG_VERBOSE
LOG_NOTSET

If logging is switched to LOG_DEBUG, statements at LOG_DEBUG and at every less verbose level above it in the list (LOG_INFO, LOG_WARN, LOG_ERROR, LOG_FATAL) are output. Each statement that specifies a level must have a category to apply that level to. For example:
File one.cpp
ERSP_LOG_CATEGORY("Foo.Bar");              //line a
ERSP_LOG_DEBUG("Log statement: Foo.Bar");  //line b
ERSP_LOG_WARN("Log statement: Foo.Bar");   //line c
ERSP_LOG_ERROR("Log statement: Foo.Bar");  //line d

File two.cpp
ERSP_LOG_CATEGORY("Foo.Baz");              //line e
ERSP_LOG_DEBUG("Log statement: Foo.Baz");  //line f
ERSP_LOG_WARN("Log statement: Foo.Baz");   //line g

If Foo is set to display the LOG_WARN level, lines c, d, and g are output.


LOG_DEBUG is lower than LOG_WARN, so it is ignored.

If Foo.Baz is set to show the LOG_DEBUG level, lines f and g are output by virtue of that declaration. Lines c and d are also output, because all categories have LOG_INFO as their default level, and setting Foo.Baz does not affect the Foo default level, which Foo.Bar inherits unless specifically told to act otherwise.

How to Set Logging Levels for Categories


There are at least three common ways to specify the logging level for a category.


1. When using behave to run a network, behave takes a --debug="category" argument. This turns that category to LOG_DEBUG for that run of behave.
2. A program can, at any point and as often as needed, call ERSP_LOG_SET_PRIORITY("Category", level). This sets the given category to the specified level of output.
3. The third and most flexible option is to use environment variables. Setting the ERSP_LOG_PRIORITY variable allows an arbitrary number of categories to be set to differing log levels. The format for this is:
"Category[,Priority][;Category[,Priority]]*" (the quotes are needed to protect the value from the shell)
By setting ERSP_LOG_OUTPUT to a filename, you can also specify a file for logging output to be written to.

More about Logging


There are no ERSP_LOG_NOTSET or ERSP_LOG_NOLOG logging output functions. These levels define the unset level and the 'there will be no logging' level of logging. LOG_NOLOG is typically used on release builds to avoid having any sort of logging information displayed. The logging code is still there, however. To remove the logging code, there are some makefile options to remove it at build time, using the simple method of having all logging macros resolve to nothing. If you have variables and statements that aren't log statements, but want them removed when logging is removed, wrapping those statements in the ERSP_DEBUG_LOGGING() macro results in their removal if logging is made to vanish, as above. For example:
ERSP_DEBUG_LOGGING( static int cycles = 0; )
ERSP_LOG_DEBUG("X function calls have happened: %d", cycles++);

This snippet ensures that if logging is removed by the Makefile, there is no int left behind, doing nothing and taking up memory.


Chapter 9

Teleop and ColorTrainer GUI Applications

This chapter describes:
The Teleoperation GUI
The ColorTrainer GUI

Starting the GUIs


The Teleop and ColorTrainer GUIs allow you to control and train the robot remotely. Teleoperation is usually used in combination with the video client GUI, so that you can monitor the video feed from the robot's camera while driving the robot using the teleop control panel.


To Start the GUIs in Windows


1. In Windows, do the following: On the Start menu, click Start, Search, For Files and Folders. In the For Files and Folders text box, type Install_dir\java and click Search Now.
2. To start the Teleoperation GUI, double-click teleop-client.bat.
3. To start the ColorTrainer GUI, double-click video-client.bat.

To Start the GUIs in Linux


1. In Linux, change directories to Install_dir/sample_code/behavior/network.
2. To start the teleop-video server, at the terminal prompt type:
teleop-video.xml
3. Change directories to Install_dir/sample_code/behavior/vision and then type:
color_trainer_server.xml
4. Change directories to Install_dir/java.
5. To start the Teleoperation GUI and the ColorTrainer GUI, type:
behave teleop-video.jar and behave video-client.jar

The following screen appears.

Assigning the Address and Port


There are two text boxes: Address and Port. You must specify a value for each.
Address - If you are not controlling the robot remotely, this value is 127.0.0.1. This specifies that the robot's laptop looks at itself instead of trying to find another computer. If you are controlling the robot remotely, use the IP address of the robot's laptop.
Port - Specify the port you are using either on the robot's laptop or on the remote computer. To determine the port number, look in teleop_server.xml for the tcp_port value. This is the port number to use.

To Assign the Address and Port


1. In the Address text box, type 127.0.0.1.
2. In the Port text box, type the port value.
3. When both items are complete, click Connect.


Using the Teleoperation Control Panel


The teleoperation control panel controls the movement, linear velocity, and angular velocity of the robot. The Maximum Linear Velocity sets the speed of the robot as it is moving forward or backward. The Maximum Angular Velocity sets the speed of the robot as it turns.

To Use the Teleoperation Control Panel


1. Click and drag the dot in the direction that you want the robot to move.
2. To set the Maximum Linear Velocity, move the slider to the required speed.
3. To set the Maximum Angular Velocity, move the slider to the required speed.

Using the ColorTrainer Application


The ColorTrainer application lets you train the robot to distinguish a specific color or range of colors from a video stream. The video feed from the robot appears in the Video Display area. A number of parameters can be adjusted using this application. These are described in the following sections.


Parameters

Display Area - This area shows the live video feed from the robot.
Selected Area - This is the user-selected area that determines the color to select. The software calculates an average color value based on the values of the pixels within the selection.
Address - The Address parameter gives the IP address of the robot's laptop.
ColorTrainer Port - This parameter specifies the port for the ColorTrainerServer commands on the robot's laptop.
Video Port - This parameter specifies the port for the video feed.
Play - If this button is selected, the live video feed from the robot is shown.
Stop - If this button is selected, the video feed is disconnected.
Train - This button trains the robot to recognize your selected color.
Save - This button saves your selected color as an XML file.
Load - This button loads any color selection previously saved using the Save button.
SigmaThreshold - This slider adjusts the sensitivity of the color matching algorithm. Setting the slider to a larger value matches a greater range of colors. Setting the slider to a smaller value matches fewer colors.


Using the ColorTrainer

To Use the ColorTrainer


1. Start the ColorTrainer using the directions at the beginning of the chapter. The graphical interface appears.
2. In the Address text box, type the IP address or host name of your robot.
3. To connect the video stream from the robot's camera, click Play.
4. To disconnect the video, click Stop.
5. Locate an area of the video display that contains the color you want.
6. To select that area in the image, click inside the image and drag the pointer to select an area of the image.
7. To adjust the sensitivity of the color matching algorithm, use the Sigma Threshold control at the bottom of the window.
8. To train the software to recognize your color, click Train.
9. Adjust the profile again using the Sigma Threshold control. After adjusting the control or changing the color target, click Train again.
10. When you are satisfied with your color profile, click Save.
11. At the prompt, type a filename. The color profile is saved as an XML file.
12. To load a saved color profile, select the color profile and click Load.


Chapter 10

Behavior Composer GUI

The Behavior Composer GUI application allows you to create robot behavior scripts by dragging and linking boxes representing behaviors. You can use the Behavior Composer to create scripts using only the existing sample behaviors shipped with ERSP, or you can edit the sample behaviors to create your own behaviors.

Start the GUI


In Windows, start the Behavior Composer GUI by changing directories to Install_dir\java and double-clicking the composer.bat file. In Linux, type evolution_java.sh composer on the command line.


GUI Description

The main window consists of a Toolbar, a Palette, a Network Editor, a Property Panel, and a Status Bar.

Toolbar Menus
The Behavior Composer has three menus: File, Viewers and Refresh.
File
This menu has a single option:
Quit - Closes the Behavior Composer window.
Viewers
This menu opens a viewer to display the selected item. The available options are:
Behavior Network - Opens a new Network Editor subwindow.
Behavior Palette - Opens a new Behavior Palette subwindow.
Tree XML Editor - Opens a subwindow allowing you to edit the properties (input, output, and so on) of the file you specify from the subwindow. Right-clicking on a behavior box in the Network Editor also shows the XML Tree subwindow for that behavior.
Note: If the subwindow is already open in the screen, the command opens an additional copy of the subwindow. The new subwindow opens directly on top of the old subwindow, so it may appear that nothing has happened. Click the X in the top right corner of the window and the new subwindow is closed, showing the original, underlying subwindow.


Refresh
The available options are:
XML Schema - Updates the Behavior Schema after you make edits.
Semantic Types - Updates the semantic types after you make edits to the file.

Toolbar Buttons
The Behavior Composer contains a series of toolbar buttons at the top of the window:
Behavior Palette - Creates a new instance of the Behavior Palette. This palette contains a visual representation of all existing behaviors.
Network Editor - Creates a new instance of the Network Editor. Use the Network Editor to create a behavior network.
Properties Palette - Creates a new instance of the Properties Palette. This palette shows the properties of the currently selected object.
XML Tree Viewer - Opens the XML Tree Viewer palette. This window shows an XML document as a relational tree.

Behavior Palette
The palette on the left side of the Behavior Composer GUI is the Behavior Palette. This lists the sample behaviors that are used to create behavior scripts.

To Use the Behavior Palette


1. To view different categories of behaviors, such as movement, operators and target, click the tabs (movement, drivers, and so on).
2. Specify any required behavior parameters in the appropriate text box.

Network Editor
A Behavior Network is a series of behaviors that are wired together. The Network Editor allows you to edit behavior networks. In a typical network, sensors feed data into logic behaviors, which in turn feed data out to behaviors for interpretation. These behaviors send their output to the actuators and/or drive mechanisms. The Composer gives you a visual interface to construct a behavior network of this type.


In the Network Editor, a behavior is represented by a block. Its input and output ports are marked by green dots. Input ports appear on the left of the block, and output ports appear on the right.

You can create a network by connecting, or wiring, the output ports of behaviors to the inputs of others.

The Network Editor also allows you to edit the XML files associated with the behaviors in your network.

Network Editor Menus


File - Save/Open a file or create a new file.
Edit - Edit the behavior network.
Group - Group/Flatten a behavior. See the Creating an Aggregate Behavior section.

Edit Options Buttons


Load a saved behavior network.
Save the currently selected behavior network.
Open a new behavior network window.
Cut the current selection.
Copy the current selection.
Paste the current selection.

Modes
When the Behavior Composer is opened, it is in Edit Mode. To perform certain actions, you need to enter a different mode.
Wiring Mode - Use Wiring Mode to connect behavior ports together to form a behavior network. The Behavior Composer does not allow you to connect disparate data types. If you try to do so, an error message appears at the bottom of the window.
Edit Mode - Edit Mode allows you to add behaviors to the network and to edit behavior properties.
Insert Ports Mode - Insert Ports allows you to insert new input or output ports into a behavior network.
Delete Mode - Use Delete Mode to remove behaviors, wires and ports from the current behavior network.

Property Panel
The Property Panel shows the properties of the currently selected object. Use the Property Panel to edit the XML tag values. You can also edit a behavior's XML tags by right-clicking on a behavior that you have dragged to the Network Editor.


To Edit XML Files


1. Right-click a behavior in the Network Editor. The XML file associated with the selected behavior appears in the XML Tree Editor.
2. Click any line of data to edit it.

If any child classes are available, they are shown in the lower left quadrant of the screen.

To Edit Child Classes


1. Select a child class and click Add to add it to the XML file.
2. To delete selected data, click Delete.
3. When you are done editing the file, select File>Save.

Creating an Aggregate Behavior


Aggregate behaviors are built from a number of individual behaviors. They are treated like single behaviors by the system and can simplify working with large networks. Aggregate behaviors are created by grouping existing behaviors and then adding input and output ports.

To Create an Aggregate Behavior:


1. In the Behavior Composer, open an existing network or create a new behavior network. To wire behaviors together, click the Wiring Mode button.
2. Make sure that you are in Edit mode.


3. Select the behaviors you want to add to the aggregate by clicking and dragging until the bounding box surrounds them. The selected behaviors are highlighted in yellow.

4. Group the set by selecting Group>Group, or press Alt+G. Gray bars appear along the left and right sides of the Network Editor window and the Input Ports button becomes active.

The gray bar on the left holds input ports, while the one on the right holds output ports. Important Note: If you group a subset of behaviors in an existing network, the Network Editor automatically adds the necessary ports to maintain connections with the rest of the network.


5. To add ports to the new group, do the following:
Click the Insert Ports button.
Click the left-hand bar to add an input port. A blue-outlined dot appears on the bar.
Click the Wiring Mode button, then click a behavior input port and drag the dot to the newly created port on the bar. A line connects the two ports, showing the virtual wire that exists between these two ports.
Repeat for each port you want to add.
Important Note: Remember, output ports go on the right-hand bar.
6. Use the XML Tree Editor to edit the ID and data types of the newly created ports as necessary.
7. To save your work, click File>Save. Enter a name for the new aggregate behavior in the text box and click Save.

Working with Multiple Instances


You can open multiple instances of the Network Editor pane by clicking the Behavior Editor button. This is helpful when you need to re-create sections of an existing network.

To Create Multiple Instances


1. In Edit mode, select a group of behaviors and click Edit>Copy.
2. Click in the new network window and click Edit>Paste. Copies of the behaviors appear in the new network window, maintaining all customization and wiring set up in the previous network instance.

Behavior Composer GUI Example


Let's take a look at the sample fuzzy_avoidance.xml behavior. When this behavior is run on your robot, the robot moves around avoiding obstacles.
1. On your client machine, start the Behavior Composer by typing:
In Windows: Install_dir\java\composer.bat
In Linux: Install_dir/bin/evolution_java.sh composer
The Composer window appears.


2. In the Behavior Composer window, click File, Open.
3. Navigate to the directory:
In Windows: Install_dir\sample_code\behavior\vision
In Linux: Install_dir/sample_code/behavior/vision
4. Select and open the file object_recognition_from_camera.xml. The Network Editor portion of the Behavior Composer window shows the composer version of the behavior. Scroll to see the entire network, if needed.

5. To keep the original sample file intact, save the behavior under a new name by clicking File>Save and entering a new filename.


For example, rename object_recognition.xml to:


~username/my_files/my_object_recognition.xml

Use this sample to familiarize yourself with the elements of the Behavior Composer. Of course, if you make significant changes here, your behavior will not work in the same manner as the original.
6. When you are done with your behavior, click File>Save.
7. To stop the robot, you'll have to switch it off, or run the behave command with a timeout. For example, to make the robot stop after 30 seconds, type:
behave my_object_recognition.xml --duration=30


Appendix A

Grammar Information



Speech grammars are defined by listing a series of valid words and phrases. The ASR engine supports only Backus-Naur Form (BNF) compiled grammars. In general, speech grammars are formed as follows:
<rule> = sentences and phrases .

This is called a production rule. Every speech grammar has a production rule at the beginning and at the end. A production rule is broken into four parts:
<rule> - This is a required part of the rule. Any string can be used as long as it is bounded by the angle brackets and the string is unique.
= (assignment operator) - This is required and must be written as =.
sentences and phrases - This section defines any sentences or phrases that you define as valid.
. - This part of the rule is required. A period must be used. The period denotes the end of the rule.
A simple example of a command grammar is the following:
<sample> = go left | go right | go forward | stop | quit .

The commands are separated by a pipe character |. The root node of the grammar is called sample in this case and it is enclosed between <> brackets. The same grammar can be written to include more nodes, as follows:
<sample> = go <direction> | stop | quit . <direction> = left | right | forward .

where <sample> is still the root node for the grammar. The use of multiple nodes in the specification of the grammar provides a more compact way of writing grammars. The default grammar specified in the grammar parameter of the ASRBehavior is the following null grammar:
<sample> = ""
All grammars must be compiled from ASCII format to binary format before being passed to the engine. The utility program vtbnfc, included with the ASR engine, performs the compilation.


GLOSSARY
Absolute position - Position independent of a reference point.
API - Application Program Interface.
ASR Engine - ASR Engine stands for Automatic Speech Recognition Engine. An ASR Engine is a speaker-independent speech recognition engine that is compliant with the industrial standard MS SAPI 5.1. It comes with real time speech recognition, active vocabularies and high recognition performance. IBM's ViaVoice and Microsoft's WinVoice are examples of ASR engines.
AuxData object - Class used to represent image meta-data. Examples: camera ID, timestamp, camera position and orientation.
Behave - The behavioral XML Interpreter.
Behavior - Computational unit that maps a set of inputs to a set of outputs.
Behavior Networks - Chains of connected behaviors.
Behavior schema files - Indicate the location of the binary code implementing the behavior, as well as that behavior's input and output ports and runtime parameters.
BEL - Behavior Execution Layer.
Camera group - Two or more cameras that must return frames in synchrony.
Device - A basic resource type, representing a single physical device or other external entity.
Device group - Drivers of this category allow a set of devices to be handled as a single resource.
Drive system - A set of devices that provide means to change the linear and angular velocities of the robot.
Driver - Implementation of a resource.
E-Sound Daemon - Enlightened Sound Daemon (package: alsaplayer-esd 0.99.59-5). This is a PCM player designed for ALSA (ESD output module). Alsaplayer is a PCM player designed specifically for use with ALSA, but also works with OSS or EsounD. It is heavily threaded, which reduces skipping, and offers optional and even simultaneous visual scopes. It plays mp3, mp2, ogg, cdda, and audiofs format files.


Ethernet interface - Widely used local area network (LAN) access method, defined by the IEEE as the 802.3 standard.
ERSP - Evolution Robotics Software Platform.
FTDI - Future Technology Devices International Ltd.
Grammar - A file containing a list of valid words or phrases that is used by the ASR Engine. For more information, see the Grammar Information on page A-1.
GUI - Graphical User Interface.
HAL - Hardware Abstraction Layer.
Infrared sensor - Used for object avoidance.
Meta-data - Typically, data that describes an image collection event.
Odometer - Keeps track of the position of the robot.
Operators - Behavior-coordination behaviors.
Pixel - The smallest addressable unit on a display screen.
Port - Behavior inputs and outputs. Characterized by its data type, data size, and semantic type, together indicating the structure and use of the data passing through the port.
Range sensor - A device that returns a value representing the distance to a physical object.
Resource - A physical device, connection point or other means through which software interacts with the external environment: sensors and actuators, network interfaces, microphones, speech recognition systems, battery charge.
Resource bus - A transport layer through which external devices are accessed.
Resource driver - Software module providing access to a resource.
Resource interfaces - A set of public, well-defined C++ abstract classes.
Resource Manager - The software module responsible for handling tasks such as locating, instantiating, activating and making available of resources. Also handles freeing of resources, and proxying requests for resource interfaces through to the appropriate drivers.
Robot - A stand alone computer system that performs physical and computational activities.
RS-485 network - A standard for multipoint communications lines.
Serial port - Port for connecting serial devices.
Semantic type - Characterizes a set of data passing through a port.
Spatial sensor - A device that senses objects in three dimensional space.
TEL - Task Execution Layer.
Teleoperation - Remote operation of the robot.
TTS - This acronym stands for Text to Speech. An example of this functionality can be seen in the ER Vision Demo GUI in the Vision chapter.


USB - Universal Serial Bus. Hardware interface for low-speed peripherals such as the keyboard, mouse, and joystick. Also supports MPEG-1 and MPEG-2 digital video.
XML - eXtensible Markup Language.


INDEX
A
A/VClient 6-8 A/VServer 6-9 AbsoluteValue 6-4 Addition 6-5 AdjustFlowDetectParams 5-18, 7-4 aggregate behaviors, combining behaviors 4-12 API documentation 2-17, 3-6, 5-18, 6-1, 7-4 ASR 3-7, 3-10, 6-17 AudioVideoClient 5-18, 7-4 AudioVideoServer 5-18, 7-4 automatic speech recognition 3-7 AverageOperator 6-5 AvoidanceAggregate 6-14 edit options buttons 10-4 file menu 10-2 insert ports button 10-5 multiple instances working with 10-8 network editor panel 10-3 ports input and output 10-4 properties panel 10-6 refresh menu 10-3 behavior schema 10-3 semantic types 10-3 save behavior network button 10-4 toolbar buttons 10-3 toolbar menus 10-2 viewers menu 10-2, 10-3 behavior network 10-2 behavior palette 10-2 tree XML editor 10-2 wiring mode button 10-5 XML tag values editing 10-5 BehaviorImpl 4-7 behaviors aggregate 4-12 condensers 6-20 configuration 4-2 data passing between 4-14 environment 4-2

B
BatteryMeter 6-9 Behavior composer behavior network loading saved 10-4 opening a new 10-4 save selected 10-4 behavior palette 10-3 ColorTrainer application using 9-3 delete mode 10-5 edit mode button 10-5


implementing 4-7 life cycle 4-2 navigation 6-14 operator 6-4 resource 6-8 speech 6-17, 6-18 utility 6-1 vision 6-13 XML interface 4-11 best matches 2-8 bounding box 6-10 Buffer 6-1 BumpSensorBehavior 6-9

CondenserBase 6-8 MaxCondenser 6-8 MedianCondenser 6-8 CondenserBase 6-8 Condensers 6-8 Console 6-2 ConsoleReaderBehavior 6-2 Constant 6-2 ConstantBehavior 6-2 crisp value generation 8-14

D
data type 4-5 boolean 4-5 character array 4-6 double 4-5 double array 4-6 enumeration 4-5 image 4-6 multivalued function 4-5 string 4-5 unknown 4-5 data, input interface 4-15 Data, output interface 4-15 debug 4-12 DecayBehavior 6-2 defuzzification methods centroid 8-14 max 8-14 median 8-14 DelayedConstant 6-2 DetectColor 5-18, 7-4 DetectFlow 5-19, 7-4 DetectGesture 5-19, 7-4 DetectMotion 5-19, 7-4 DetectObject 5-19, 7-4 DetectSound 5-19, 7-4

C
Camera 6-9 Canny Sigma 2-7 Canny tHigh 2-8 Canny tLow 2-8 Capture button 2-5 capturing images, how to 2-5 CentroidCondenser 6-8 CloseER2Gripper 5-18, 7-4 CloseGripper 5-18, 7-4 ClusterTargetBehavior 6-13 ColorTrainer 6-13 command line tools objrec_add.exe 2-11 objrec_add_list.exe 2-12 objrec_del.exe 2-13 objrec_list.exe 2-14 objrec_recognize.exe 2-14 using 2-11 CompressedAudioPlaybackBehavior 6-9 CompressedAudioRecorder 6-9 compute_output() function 4-10 condenser behaviors 6-20 CentroidCondenser 6-8


Diff2Drive 3-7 Diff2Odometry 3-8 display parameters 2-8 distance 2-10 DoAtATime 5-19, 7-4 DoPeriodically 5-19, 7-4 DoubleArrayJoiner 6-5 DoubleArraySplitter 6-5 DoWhen 5-19, 7-4 Doxygen 2-17, 3-6, 5-18, 6-1, 7-4 DriveMoveDelta 5-19, 7-4 Drivers, supported 3-7 DriveStop 5-19, 7-4 DriveSystem 6-9

Vision button 2-5 Euler angles 8-8 events 7-2

F
FaceDrive 6-19 FaceObject 5-19, 6-14, 7-4 Facial Graphics Driver 3-6 feature quality 2-7 feature strength 2-7 FGD 3-6 file formats 3-8 .rh face configuration 3-8 .rhm morph data 3-9 FlowDetectBehavior 6-13 FunctionBehavior 6-2 fuzzy basic data structures 8-13 basic operators 8-13 basic rule combination 8-14 FuzzyAvoidance 6-15 FuzzyHeading 6-15 FuzzyLRFParse 6-15 FuzzyRangeSensorRing, managing the IR sensors 6-10 FuzzyTargetBehavior 6-15 FuzzyVelocity 6-15

E
emotion behaviors 6-18 emotion elicitors 6-19 BumpElicitor 6-20 ObstacleDistanceElicitor 6-20 ObstacleElicitor 6-20 SigmoidElicitor 6-20 SpeechElicitor 6-20 TargetElicitor 6-20 EmotionAggregate 6-18 FaceDrive 6-19 EmotionAggregate 6-18 ER Vision Demo 2-3 capture button 2-5 how to capture an image 2-5 learn button 2-6 live video screen 2-4 main screen 2-3 object recognition information area 2-5 outline screen 2-4 Recognize button 2-6 Rotate buttons 2-4 Sound button 2-6

G
GetImage 5-19, 7-4 GetPosition 5-19, 7-4 GetRangeData 5-19, 7-4 GetVelocities 5-19, 7-5 GotoObject 5-19, 7-5

H
Hazard avoidance 1-9 HazardAvoidance 6-16


I
IBehavior interface 4-7 IBumpSensor 3-6 ICamera 3-6 IDriveSystem 3-6 IFace 3-6 Image class 8-1 ImageConverter 6-3 images, how to capture 2-5 IMotorCommand 3-6 IMotorQuery 3-6 input data interface 4-15 input ports 4-7 InputCollector 6-3 InputLogger 6-3 instantiating behavior networks 4-2 IOdometry 3-6 IPollable 3-6 IRangeSensor 3-7 IResourceCallback 3-7 ISpatialSensor 3-7 ISwitchDevice 3-7

Match object bounding box 2-8 math library 8-5 Matrix 8-11 Max 6-5 MaxCondenser 6-8 MedianCondenser 6-8 Min 6-5 MotorQuery 6-12 MotorQueryBehavior 6-12 Move 5-19, 7-5 MoveER2ArmDelta 5-19, 7-5 MoveRelative 5-19, 7-5 MoveTo 5-19, 7-5 multiple inheritance 4-11 MultiplicationOperator 6-6 multivalued functions 8-8

N
Navigation behavior, LegDetectBehavior 6-11 Navigation behaviors FuzzyAvoidance 6-15 FuzzyHeading 6-15 FuzzyLRFParse 6-15 FuzzyTargetBehavior 6-15 FuzzyVelocity 6-15 LegDetectBehavior 6-16 OdometryCompare 6-16 PointAndGo 6-16 PointandGoBridge 6-16 SafeDriveSystem 6-16 StopMonitor 6-17 TargetMarkerBehavior 6-17 TargetToHeadingBehavior 6-17 WorldSearchBehavior 6-13 navigation behaviors 6-14 Navigation Behaviors, FaceObject 6-14 NullResource 3-12

J
JoystickBehavior 6-11 JPEGServer 6-11

L
Learn button 2-6 LegDetectBehavior 6-11, 6-16 Live video screen 2-4 LRFDataBehavior 6-11

M
Main screen 2-3 MalleableBehavior 6-3


O
Object library 2-9 Object recognition information area 2-5 ObjectRecRecognize 6-13 objrec_add.exe 2-11 obrec_add_list.exe 2-12 obrec_del.exe 2-13 obrec_list.exe 2-14 obrec_recognize.exe 2-14 Obstacle avoidance 1-9 ObstacleDistanceElicitor 6-20 ObstacleElicitor 6-20 Odometry 6-12 OdometryCompare 6-16 OpenER2Gripper 5-19, 7-5 OpenGripper 5-19, 7-5 operator behaviors 6-4 AbsoluteValue 6-4 Addition 6-5 AverageOperator 6-5 Condensers 6-8 DoubleArrayJoiner 6-5 DoubleArraySplitter 6-5 Max 6-5 Min 6-5 MultiplicationOperator 6-6 Priority 6-6 Subtraction 6-6 ThresholdBehavior 6-6 Transistor 6-7 TriangularDistributor 6-7 TriggeredConstant 6-4 TriggerPassBehavior 6-7 WeightedSumOperator 6-7 Outline screen 2-4 Output data interface 4-15 output ports 4-7

P
PanTiltControl 6-12, 6-13 PeriodicTrigger 6-4 PlaySoundBehavior 6-4 PlaySoundFile 5-19, 7-5 point sets 8-8 PointAndGo 6-16 PointandGoBridge 6-16 PollingGroup 3-12 ports reading from 4-8 writing to 4-9 PrintEvent 5-19, 7-5 Priority 6-6 Python tasks AdjustFlowDetectParams 5-18, 7-4 AudioVideoClient 5-18, 7-4 AudioVideoServer 5-18, 7-4 CloseER2Gripper 5-18, 7-4 CloseGripper 5-18, 7-4 DetectColor 5-18, 7-4 DetectFlow 5-19, 7-4 DetectGesture 5-19, 7-4 DetectMotion 5-19, 7-4 DetectObject 5-19, 7-4 DetectSound 5-19, 7-4 DoAtATime 5-19, 7-4 DoPeriodically 5-19, 7-4 DoWhen 5-19, 7-4 DriveMoveDelta 5-19, 7-4 DriveStop 5-19, 7-4 FaceObject 5-19, 7-4 GetImage 5-19, 7-4 GetPosition 5-19, 7-4 GetRangeData 5-19, 7-4 GetVelocities 5-19, 7-5 GotoObject 5-19, 7-5 Move 5-19, 7-5


MoveER2ArmDelta 5-19, 7-5 MoveRelative 5-19, 7-5 MoveTo 5-19, 7-5 OpenER2Gripper 5-19, 7-5 OpenGripper 5-19, 7-5 PlaySoundFile 5-19, 7-5 PrintEvent 5-19, 7-5 RangeSensorMove 5-19, 7-5 RangeSensorSweep 5-20, 7-5 RecognizeSpeech 5-20, 7-5 SendMail 5-20, 7-5 SenseObjectInGripper 5-20, 7-5 SetDriveVelocity 5-20, 7-5 SetER2ArmVelocity 5-20, 7-5 SetFaceEmotion 5-20, 7-5 SetTrackColorVelocity 5-20, 7-5 Speak 5-20, 7-5 SpeakFromFile 5-20, 7-5 Stop 5-20, 7-5 Teleop 5-20, 7-5 TrackColor 5-20, 7-5 TrackColorFlow 5-20, 7-5 TrackSkinFlow 5-20, 7-6 TrainObject 5-20, 7-6 Turn 5-20, 7-6 TurnRelative 5-20, 7-6 TurnTo 5-20, 7-6 Wait 5-20, 7-6 WaitUntil 5-20, 7-6 Wander 5-20, 7-6 WatchInbox 5-20, 7-6

RcmNetworkDriver 3-11 Recognize button 2-6 RecognizeSpeech 5-20, 7-5 repeat interval 2-8 resolution 2-8 resource categories bus 3-2 device 3-2 device group 3-2 configuration 3-3 XML 3-4 drivers and interfaces 3-1 life cycle 3-3 manager 3-3 standard interfaces 3-5 resource behavior, JoystickBehavior 6-11 resource behaviors 6-8 A/VClient 6-8 A/VServer 6-9 BatteryMeter 6-9 BumpSensorBehavior 6-9 Camera 6-9 DriveSystem 6-9 FuzzyRangeSensorRing 6-10 JPEGServer 6-11 LRFDataBehavior 6-11 MotorQueryBehavior 6-12 Odometry 6-12 PanTiltControl 6-12 RangeSensorBehavior 6-12 Wireless Monitor 6-13 resource categories bus 3-2 device 3-2 device group 3-2 Rotate buttons 2-4

R
RangeSensorBehavior 6-12 RangeSensorMove 5-19, 7-5 RangeSensorSweep 5-20, 7-5 RcmBumpSensor 3-11 RcmIrSensor 3-11 RcmMotor 3-11

S
SafeDriveSystem 6-16


save results 2-8 SendMail 5-20, 7-5 SenseOpenGripper 5-20, 7-5 SensorAggregate 6-12 SetDriveVelocity 5-20, 7-5 SetER2ArmVelocity 5-20, 7-5 SetFaceEmotion 5-20, 7-5 SetTrackColorVelocity 5-20, 7-5 SigmoidElicitor 6-20 SkinDetector 6-14 SkinFlow 6-14 Sound button 2-6 Sound file 2-10 Speak 5-20, 7-5 SpeakFromFile 5-20, 7-5 speech behaviors 6-17 ASR 6-17 TTSBehavior 6-18 Speech recognition 1-10 SpeechElicitor 6-20 StallDetector 6-14 StatisticalColorDetection 6-14 Stop 5-20, 7-5 StopMonitor 6-17 Subtraction 6-6

text to speech 3-7 ThresholdBehavior 6-6 Timing button 2-8 TrackColor 5-20, 7-5 TrackColorFlow 5-20, 7-5 TrackSkinFlow 5-20, 7-6 TrainObject 5-20, 7-6 Transistor 6-7 TransmitBehavior 6-4 TriangularDistributor 6-7 TriggeredConstant 6-4 TriggerPassBehavior 6-7 TTS 3-7, 3-10, 5-20, 7-5 TTSBehavior 6-18 Turn 5-20, 7-6 TurnRelative 5-20, 7-6 TurnTo 5-20, 7-6

U
Upsampling 2-7 utility behaviors 6-1 Buffer 6-1 Console 6-2 ConsoleReaderBehavior 6-2 Constant 6-2 ConstantBehavior 6-2 DecayBehavior 6-2 DelayedConstant 6-2 FunctionBehavior 6-2 ImageConverter 6-3 InputCollector 6-3 InputLogger 6-3 MalleableBehavior 6-3 PeriodicTrigger 6-4 PlaySoundBehavior 6-4 TCPServerBehavior 6-4 TransmitBehavior 6-4

T
TargetBehavior 6-17 TargetElicitor 6-20 TargetToHeadingBehavior 6-17 TaskContext 7-2 TCPServerBehavior 6-4 Teleop 5-20, 7-5 Teleoperation 1-10 Text to speech 1-10


V
Vector3 8-6, 8-7 ViaVoice 3-10 vision behaviors ClusterTargetBehavior 6-13 ColorTrainerServer 6-13 ObjectRecRecognize 6-13 PanTiltControl 6-13 SkinDetector 6-14 SkinFlow 6-14 StallDetector 6-14 StatisticalColorDetection 6-14 Vision button 2-5

W
Wait 5-20, 7-6 WaitUntil 5-20, 7-6 Wander 5-20, 7-6 WatchInbox 5-20, 7-6 WeightedSumOperator 6-7 WinVoice 3-10 Wireless Monitor 6-13 WorldSearchBehavior 6-13

X
XML, interface to behaviors 4-11
