
A Sponsored Supplement to Science

Brain-inspired
intelligent robotics:
The intersection
of robotics and
neuroscience

Produced by the Science/AAAS Custom Publishing Office
Sponsored by the publishing house of the Chinese Academy of Sciences

Be Among the First to Publish in Science Robotics

NOW ACCEPTING MANUSCRIPTS
ScienceRobotics.org

Science Robotics is a unique journal created to help advance the research and development of robotics for all environments. Science Robotics will provide a much-needed central forum to share the latest technological discoveries and to discuss the field's critical issues.

Join in the excitement for the debut issue, coming December 2016!
TABLE OF CONTENTS

Introductions

2 Realizing intelligent robotics
  Jackie Oberst, Ph.D.
  Sean Sanders, Ph.D.
  Science/AAAS

3 Innovating at the intersection of neuroscience and robotics
  Hong Qiao, Ph.D.
  Professor with the 100 Talents Program of the Chinese Academy of Sciences
  Group Director of Robotic Theory of Application, Institute of Automation, Chinese Academy of Sciences (CASIA)
  Deputy Director, Research Centre for Brain-Inspired Intelligence, CASIA
  Core Expert, CAS Center for Excellence in Brain Science and Intelligence Technology (CEBSIT)

Articles

4 Creating more intelligent robots through brain-inspired computing
  Bo Zhang, Luping Shi, Sen Song

9 Deep learning: Mathematics and neuroscience
  Tomaso Poggio

12 Collective robots: Architecture, cognitive behavior model, and robot operating system
  Xiaodong Yi, Yanzhen Wang, Xuejun Yang et al.

16 Toward robust visual cognition through brain-inspired computing
  Pengju Ren, Badong Chen, Zejian Yuan et al.

20 Brain-like control and adaptation for intelligent robots
  Yunyi Jia, Jianguo Zhao, Mustaffa Alfatlawi et al.

25 Neurorobotics: A strategic pillar of the Human Brain Project
  Alois Knoll and Marc-Oliver Gewaltig

35 Biologically inspired models for visual cognition and motion control: An exploration of brain-inspired intelligent robotics
  Hong Qiao, Wei Wu, Peijie Yin

39 Anthropomorphic action in robotics
  Jean-Paul Laumond

42 Actor–critic reinforcement learning for autonomous control of unmanned ground vehicles
  Xin Xu, Chuanqiang Lian, Jian Wang et al.

47 Compliant robotic manipulation: A neurobiologic strategy
  Hong Qiao, Chao Ma, Rui Li

50 Declarative and procedural knowledge modeling methodology for brain cognitive function analysis
  Bin Hu, Yun Su, Philip Moore et al.

About the cover: A robot hand reaches out for a mechanized brain, floating in a sea of magnified neurons. As we better understand the workings of the brain, we are able to create machines that can better mimic its ability to sense, learn, and react, with the aim of creating a new generation of more intelligent robots.

This supplement was produced by the Science/AAAS Custom Publishing Office and sponsored by the publishing house of the Chinese Academy of Sciences.

Editors: Sean Sanders, Ph.D.; Jackie Oberst, Ph.D.
Proofreader/Copyeditor: Bob French
Designer: Amy Hardcastle

Materials that appear in this supplement were not reviewed or assessed by the Science editorial staff. Articles can be cited using the following format: [AUTHOR NAME(S)], [CHAPTER TITLE], in Brain-inspired intelligent robotics: The intersection of robotics and neuroscience (Science/AAAS, Washington, DC, 2016), p. [xx-xx].

Yan Xiang, Ph.D.
Director, Global Collaboration and Publishing Services China (Asia), Custom Publishing
yxiang@aaas.org
+86-186-0082-9345

© 2016 by The American Association for the Advancement of Science. All rights reserved.

16 December 2016

Realizing intelligent robotics

Often invoking the future, various cultures throughout history have been obsessed with the idea of robots. Leonardo da Vinci detailed a humanoid robot in one of his notebooks. In third-century BC China, a mechanical figure was presented to King Mu of the Zhou Dynasty, as described in the book Liezi. There have even been tea-serving traditional Japanese puppets—known as karakuri—made by Hisashige Tanaka, described as "Japan's Edison."
Robotics is not just fodder for fiction, despite the word being coined
by Russian-born American sci-fi writer Isaac Asimov in his 1942 short
story “Runaround.” The field has become multidisciplinary, borrowing
from engineering, mathematics, computer science, and more recently,
neuroscience. Researchers from these fields are trying to build better
robots and also to better humanity through their use.
One prominent hub of robotics research is the Center for Excellence
in Brain Science and Intelligence Technology (CEBSIT) at the Chinese
Academy of Sciences (CAS). The mission of CEBSIT is “to tackle
fundamental problems in the areas of brain sciences and brain-inspired
intelligence technologies.” One way it seeks to accomplish this mission is
to develop brain-inspired hardware including intelligent devices, chips,
robotic systems, and brain-inspired computing systems. Important work
in this field is also being performed at the Institute of Automation at CAS,
particularly in the Laboratory of Research and Application for Robotic
Intelligence of “Hand-Eye-Brain” Interaction, which focuses on developing
intelligent technologies, including so-called “brain-inspired intelligence.”
In this supplement, researchers from across China and around the world discuss the latest challenges and advancements in their field. For many functions, such as grasping, vision, and memory, researchers are turning to neurobiology for inspiration. Building humanoid or anthropomorphic robots remains the ultimate goal, but since the robots' memory and learning algorithms are patterned after the human brain, there is still much to learn about neurobiology before that goal is attained.

The European Union-funded Human Brain Project's Neurorobotics Platform, of which the Robotics Laboratory is a participant, aims to further understand the brain and translate this knowledge into common products such as robots. Two schools of thought have emerged in robotics: biologically inspired robots comprising a body, sensors, and actuators, and brain-inspired computing robots.

Applications of the latest models from these two schools include unmanned ground vehicles and "collective robots"—multiple autonomous robots that perform tasks collaboratively. When applications such as these become commonplace, the Age of Robots may soon be a reality.

The Greek philosopher Aristotle wrote, "If every tool, when ordered, or even of its own accord, could do the work that befits it … then there would be no need either of apprentices for the master workers or of slaves for the lords." If toolmaking is what distinguishes humans from animals, then robots, even if fashioned in our likeness, could evolve us even further.

Jackie Oberst, Ph.D.


Sean Sanders, Ph.D.
Custom Publishing Office
Science/AAAS

Innovating at the intersection of neuroscience and robotics

We are pleased to introduce this special supplement, "Brain-inspired intelligent robotics: The intersection of robotics and neuroscience," which presents recent research advances in this interdisciplinary area and proposes future research directions.
Robots have found increasing applications in industry, service, and
medicine due in large part to advances achieved in robotics research over
the past decades, such as the ability to accomplish complex manipula-
tions that are essential for automated product assembly. Despite these
developments, robotics still has many technical bottlenecks to overcome.
Robots still lack truly flexible movement, have limited intellectual percep-
tion and control, and are not yet able to carry out natural interactions with
humans. These deficits are especially critical in service robots, where facile
human–robot interactions are essential. A critical concern of government,
academia, and industry is how to advance R&D for the key technologies
that can bring about the next generation of robots.
Developing robots with more flexible manipulation, improved learning ability, and increased intellectual perception will achieve the goal of making these machines more human-like. One possible pathway to success in building next-generation robots is through brain-inspired intelligent robotics, an interdisciplinary field that brings together researchers from robotics, neuroscience, informatics, and mechatronics, among other areas.

Brain-inspired intelligent robotics aims to endow robots with human-like intelligence that can be either application-oriented or mechanism-oriented. Application-oriented robotics focuses on mimicking human functions by using new models or algorithms borrowed from information science. However, such robots are usually designed for specific tasks, and their learning ability is poor compared with that of humans. Mechanism-oriented robotics attempts to improve robot performance by mimicking the structures, mechanisms, and underlying principles of human cognitive function and movement. It therefore requires the close collaboration of researchers from both neuroscience and robotics.

Several brain projects are underway in the United States, Europe, Japan, and other countries, serving to promote neuroscience research. The interaction between neuroscience and information science central to
these projects serves to advance research in brain-inspired intelligence,
generating breakthroughs in many related fields. Advances in neurosci-
ence research help to elucidate the mechanisms and neuronal circuitry
underlying different mental processes in the human brain, providing the
basis for new models in robotics research. Similarly, robotics research can
provide new ideas and technologies applicable to neuroscience.
In this supplement, we provide examples of cutting-edge achievements
in brain-inspired robotics, including both software and hardware devel-
opment. The authors offer current perspectives and future directions for
researchers interested in this interdisciplinary area, covering topics includ-
ing deep learning, brain-inspired computing models, anthropomorphic
action, compliant manipulation, collective intelligence, and neurorobotics.
Due to space limitations, we regret that some exciting achievements from
various frontiers in brain-inspired robotics could not be included.
We hope this supplement draws worldwide attention to this important
area and promotes broader communication and collaboration on robot-
ics, neuroscience, and intelligence science.

Hong Qiao, Ph.D.


Professor with the 100 Talents Program of the Chinese Academy of Sciences
Group Director of Robotic Theory of Application, Institute of Automation,
Chinese Academy of Sciences (CASIA)
Deputy Director, Research Centre for Brain-Inspired Intelligence, CASIA
Core Expert, CAS Center for Excellence in Brain Science and Intelligence
Technology (CEBSIT)

Creating more intelligent robots through brain-inspired computing

Bo Zhang1,4*, Luping Shi2,4, and Sen Song3,4

1 Department of Computer Science and Technology, Tsinghua University, Beijing, China
2 Optical Memory National Engineering Research Center, Department of Precision Instruments, Tsinghua University, Beijing, China
3 Department of Biomedical Engineering, Tsinghua University, Beijing, China
4 Center for Brain-Inspired Computing Research, Tsinghua University, Beijing, China
* Corresponding Author: dcszb@tsinghua.edu.cn

The great success achieved in building the digital universe can be attributed to the elegant and simple von Neumann architecture, of which the central processing unit (CPU) and memory are two crucial components. The scaling up of CPU and memory, which both follow Moore's law, has been the main driving force for computers for over half a century. Yet in March 2016, the semiconductor industry announced that it is abandoning its pursuit of Moore's law because device scaling is expected to soon reach its physical limit (1). Therefore, improvements in computers in the post-Moore's law era must be based on radically new technologies.

Brain-inspired computing (BIC) is one of the most promising of these technologies (1). By deriving inspiration from the architecture and working mechanisms of the brain, BIC systems (BICS) can potentially achieve low power consumption, high robustness, and self-adaptation, and at the same time handle multimodal data in complex environments, which will be conducive to the development of systems that function and learn autonomously. We use the term "BICS" here instead of "neuromorphic computing" because it is more inclusive and, in our opinion, future computer architecture will be a hybrid of neuromorphic and nonneuromorphic components. Here we review the why, what, and how of developing BICS.

The need for BICS
As summarized in Figure 1, certain capabilities of the computer far surpass those of the human brain, including computation speed and accuracy, and memory access speed, lifetime, capacity, and accuracy. Conversely, certain capabilities of the brain trump those of the computer, including adaptability, robustness, flexibility, and learning ability. Accordingly, computers and humans each have different advantages in their task performance. For example, a one-year-old child has a built-in ability to identify his or her parents in a viewpoint-invariant way in a complex environment, a task that would be challenging for a computer. Conversely, computers can accurately recall long lists of numbers and perform complex calculations, tasks that would be a marvel for humans. Thus, it stands to reason that a computer system incorporating the advantages of both computers and the brain would have capabilities far exceeding those of current computers, and perhaps the brain as well.

FIGURE 1. A comparison of the relative advantages of computers versus brains.

What are some of the problems with current von Neumann architecture that limit its ability in tasks at which humans excel? A critical issue is that in von Neumann architecture, the CPU and memory are separate. The CPU speed has grown at a faster pace than memory speed, creating a so-called "memory wall effect" caused by this speed mismatch, which greatly reduces the computer's efficiency. However, BICS can significantly enhance computing performance while at the same time greatly mitigating the influence of the memory wall effect and reducing energy consumption. BICS can also provide flexibility, robustness, low power, and real parallel computation capability, and facilitate the integration of new brain-like models and algorithms, making them highly suitable for the development of intelligent robots.

The current status of BICS
Recently the development of BIC technologies has become the focus of intensive efforts, with many possible solutions proposed (2–35). One example is the TrueNorth chip and Compass software system, developed by Modha et al. (2–6), which provides a scalable, efficient, and flexible non–von Neumann architecture based on a neuromorphic system. It consists of 4,096 neurosynaptic cores interconnected to form an intrachip network, which uses silicon technology to integrate 1 million programmable neurons (communicating through signal events in the form of "spikes," as occurs in biological neurons) and 256 million configurable synapses.

Another example is the SpiNNaker system, developed by Furber et al. (7, 8), which is built using standard von Neumann computing blocks and comprises a parallel 1,000-core computer interconnected with low-power Acorn RISC Machine (ARM; RISC, reduced instruction set computing) processor cores. It is suitable for modeling large-scale spiking neural networks with bioplausible real-time performance. The SpiNNaker platform is designed to deliver a broad capability that can support research exploring the information-processing principles at work in the brain.
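Chips such as TrueNorth and systems such as SpiNNaker are organized around spiking neurons of this general kind. As a concrete illustration of the basic unit being simulated, here is a minimal leaky integrate-and-fire neuron in Python; it is a generic textbook model, not the neuron circuit of any of the platforms above, and all parameter values are arbitrary choices for the sketch.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_reset=0.0, v_thresh=1.0):
    """Minimal leaky integrate-and-fire neuron.

    input_current: 1-D array of injected current per time step.
    Returns the membrane trace and the indices of emitted spikes.
    """
    v = v_rest
    trace, spikes = [], []
    for t, i_in in enumerate(input_current):
        # Leak toward the resting potential, then integrate the input.
        v += dt / tau * (v_rest - v) + i_in
        if v >= v_thresh:          # threshold crossing emits a spike
            spikes.append(t)
            v = v_reset            # reset after the spike
        trace.append(v)
    return np.array(trace), spikes

# Drive the neuron with noisy input and print its spike times.
rng = np.random.default_rng(0)
_, spike_times = simulate_lif(rng.uniform(0.0, 0.12, size=200))
print("spikes at steps:", spike_times)
```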

The NeuroGrid, developed by Boahen et al. (9, 10), is a neuromorphic system for simulating large-scale neural models in real time. It comprises mixed-signal analog/digital circuits capable of executing important synaptic and neuronal functions such as exponentiation, thresholding, integration, and temporal dynamics.

The BrainScaleS system, developed by Meier et al. (11, 12), is composed of wafer-scale arrays of multicore microprocessor systems that are used to build a custom mixed-signal analog/digital simulation engine, in which each 8-in. silicon wafer integrates some 50 × 10^6 plastic synapses and 200,000 biologically realistic neuronal circuits. These arrays can be programmed using computational neuroscience models or established software methods. They demonstrate good scalability and real-time operation.

In another approach, Indiveri et al. (13–15) developed a full-custom, mixed-signal, very-large-scale integration device with neuromorphic learning circuits for exploring the properties of computational neuroscience models and for building BICS. Other researchers have developed different types of neural network accelerators (16–21) and nanodevices such as memristor networks (which change their resistance based on their past history and can naturally emulate synapse plasticity) with the aim of building neuromorphic networks for the development of neuromorphic computing circuits and chips (22–35).
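Part of the appeal of memristive synapses is that history-dependent conductance maps naturally onto plasticity rules such as spike-timing-dependent plasticity (STDP). The following Python sketch shows a generic pair-based STDP update with the weight clipped to a bounded range, as a device conductance would be; the amplitudes and time constant are illustrative only and do not describe any particular device from the works cited above.

```python
import numpy as np

A_PLUS, A_MINUS = 0.01, 0.012   # potentiation/depression amplitudes (illustrative)
TAU = 20.0                      # STDP time constant in ms (illustrative)

def stdp_update(w, t_pre, t_post, w_min=0.0, w_max=1.0):
    """Pair-based STDP: pre-before-post potentiates, post-before-pre depresses."""
    dt = t_post - t_pre
    if dt > 0:
        w += A_PLUS * np.exp(-dt / TAU)
    else:
        w -= A_MINUS * np.exp(dt / TAU)
    return float(np.clip(w, w_min, w_max))  # conductance stays within device bounds

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)  # causal pairing -> strengthen
w = stdp_update(w, t_pre=30.0, t_post=22.0)  # anti-causal pairing -> weaken
print(f"weight after two pairings: {w:.4f}")
```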
Paths forward in developing BICS
Despite the many potential solutions discussed above, there is currently no consensus on the ideal technology (or combination thereof) with which to develop the ideal BICS. Going forward, there are three main types of hardware approaches: (1) brain-like computing to emulate the primary functions of the brain; (2) BICS to build a new computing architecture with guidance from known basic principles of the brain; and (3) accelerators of existing neural networks. The key questions for developing BICS are how to build such systems without a full understanding of the biological mechanisms of the brain and, from a fundamental perspective, what new features can be integrated into BICS that will make them different from current computers. We address these questions below.

The underlying principles and computation mechanisms of computers and brains have both similarities and differences. As put forth by David Marr (36) and depicted in Figure 2, they can be understood on three different levels: design principles, algorithms, and hardware implementation. Marr's model inspired us to identify fundamental principles from the brain to guide the design of BICS and to narrow the gap between these two systems.

FIGURE 2. Common features and principles in computers and brains. IC, integrated circuits; CMOS, complementary metal-oxide semiconductor.

FIGURE 3. A comparison of the computational characteristics of von Neumann architecture (temporal complexity) and the human brain (temporal, spatial, and spatiotemporal complexity).

We posit that the two major differences between computers and the brain are those of architecture and computation paradigm. In current computers based on von Neumann architecture, the CPU and memory unit are separate and data is centrally stored in the memory unit, as depicted in Figure 3. To deal with more complex problems, the primary strategy is to increase computational speed by increasing the clock rate, which governs the speed of information exchange between the CPU and memory unit so that pieces of data can be accessed sequentially in rapid succession. However, the brain uses a different mode of "computation" than computers; it appears to make full use of resources and spatial complexity by distributing information processing to a large number of neurons across many areas. In the neocortex, each neuron is connected to between 1,000 and 10,000 other neurons through synapses; in some regions of the brain, there are as many as 100,000 connections. This distribution of information-processing capacity might help to explain the brain's energy and computational efficiency.

An innovative yet practical way to develop BICS is to take advantage of computing paradigms inspired by the latest developments in neuroscience research. A full understanding of the mechanisms underlying brain function remains a distant goal. However, many of the brain's fundamental computational principles, or "algorithms," have been elucidated or soon will be, which makes possible the transfer of ideas from ongoing research in neuroscience to BICS design. As we discuss below, we have identified fundamental principles that may prove useful for developing BICS. We propose to embed them in a BICS design to create an efficient programming environment that can accommodate rapidly evolving algorithms (37).

To build a BICS, the first step is to design a new architecture. We propose that, as a general characteristic of neural computation, neural network systems exploit resources in the temporal, spatial, and spatiotemporal domains. Moreover, the brain exhibits power-law distributions, which are signatures of complexity in each of these domains (38, 39). Nonetheless, the great success of current computers is attributable in part to the von Neumann architecture, thus it should not be disregarded in future computer design. To continue to take advantage of this architecture, we have proposed a hybrid BIC architecture that accommodates and integrates the complexity of the temporal, spatial, and spatiotemporal domains into a single platform (Figure 4) (40). This architecture integrates a new device called a BIC unit (BICU), which handles mainly perception-related tasks using raw data. A BICU can be built based on any technology that provides spatial complexity as well as a combination of spatial and spatiotemporal complexity. Such technologies include neuromorphic, photonic, and quantum devices.

FIGURE 4. Proposed hybrid architecture of BICS. The red dotted line depicts components inherited from current computer architecture—I/O (input/output), a memory chip, and the central processing unit (CPU)—to which a brain-inspired computing unit (BICU) has been added.

The development of BICS presents challenges in basic theory as well as in the development of hardware systems, software environments, and system integration. To succeed in BICS development, research in the following four areas is needed:
(1) In terms of theory, understanding how to fuse von Neumann and neuromorphic architectures is essential, specifically, how to express, process, store, and access "knowledge" and data within BICS;
(2) Developing new neural network (NN) models for identifying fundamental brain principles suitable for guiding the design and building of models that accommodate functions such as top-down and recurrent connections and excitatory/inhibitory mechanisms, among others;
(3) Building basic network logic modules for BICU technology that will support the use of complexity in spatial and spatiotemporal domains for different NN and learning mechanisms; and
(4) Developing software for the control and management of hybrid structures that will efficiently allocate, interact with, and schedule knowledge and data.
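To make the division of labor behind Figure 4 concrete, the toy Python sketch below routes raw sensor data through a stand-in for a BICU and hands the resulting feature code to a conventional rule-based path, mimicking the hybrid arrangement described above. The class and method names are invented for illustration and do not correspond to any published BICS interface.

```python
import numpy as np

class ToyBICU:
    """Stand-in for a brain-inspired computing unit: maps raw sensor
    data to a compact feature code (here, a random linear projection)."""
    def __init__(self, n_inputs, n_features, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_features, n_inputs))

    def perceive(self, raw):
        return np.maximum(self.w @ raw, 0.0)  # rectified feature code

class ToySymbolicCPU:
    """Stand-in for the von Neumann side: rule-based reasoning over
    quantities derived from the BICU's feature code."""
    def decide(self, features):
        return "avoid-obstacle" if features.max() > 2.0 else "continue"

# Hybrid loop: perception on the BICU path, reasoning on the CPU path.
bicu, cpu = ToyBICU(n_inputs=64, n_features=16), ToySymbolicCPU()
sensor_frame = np.random.default_rng(1).normal(size=64)
action = cpu.decide(bicu.perceive(sensor_frame))
print("action:", action)
```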
Based on these considerations, we have developed a cross-paradigm neuromorphic chip called "Tianjic" (Figure 5). Tianjic can support analog neural networks (ANN) and spiking neural networks (SNN) with a high degree of scalability and flexibility, and can accommodate most current NN and learning methods. We have also developed accompanying software. Demonstrations have shown that when integrated into a self-driving bicycle, Tianjic performs multiple brain-like functions, including sensing, tracking, memory, adaptation, analysis, control, and decision-making (40). The bicycle was able to follow a running person and pass obstacles smoothly (Figure 5). The development of such chips paves the way for the practical construction of BICS, which will hopefully achieve a degree of high-level intelligence far beyond that of the current generation of computers.

FIGURE 5. Photographs of (A) the Tianjic chip, and (B) a self-driving bicycle controlled by the chip.

Using BICS to develop intelligent robots
Today the field of robotics is one of the most dynamic areas of technological development (41–45). However, the lack of intelligence in current industrial and service robots severely limits their applications. Therefore, the most important issue in advancing this field further is endowing robots with cognitive abilities. To this end, we view the two most pressing imperatives as endowing robots with perceptive capabilities and drastically reducing their power consumption. If robots lack perceptive capability in a complex and dynamic environment, it will be difficult for them to support other cognitive functions, let alone to learn and function autonomously.

Intelligent robots must be capable of carrying out the following functions:
(1) Modeling of complex environments from sensor data;
(2) Synergy and redundancy in control;
(3) Multisensory information processing and multimodal integration;
(4) Active, interactive, collaborative, and continuous learning; and
(5) Emotional understanding and the ability to interact with humans in a natural way.

BICS are highly suitable for the development of intelligent robots because they are better able to address the imperatives discussed above, namely power consumption, which is especially important for mobile robots, and the practical implementation of perceptive and cognitive capabilities, which currently often requires a large, nonmobile computer cluster.

BICS carry out perceptual tasks on raw data with a much higher degree of power efficiency than that of current computers. This reduced energy consumption is one of the most important driving forces behind the development of BICS, and can be attributed to the following: (1) memory and computing are integrated, which avoids the huge energy consumption associated with data transfer; (2) sparse spatiotemporal and spike-based coding minimizes communication overhead; (3) computation is only performed when necessary, due to an event-driven mode of operation; and (4) novel passive devices store synaptic weights and memory with almost no power consumption.
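Point (3) above can be seen in miniature: when activity is a sparse spike train, a layer only needs to touch the synapses of the neurons that actually fired. The Python sketch below contrasts a dense matrix-vector product with an event-driven accumulation; the operation counts stand in, very roughly, for energy, and the 2% firing rate is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 1000, 100
weights = rng.normal(size=(n_post, n_pre))
spikes = rng.random(n_pre) < 0.02          # ~2% of inputs fire this tick

# Dense path: every synapse participates, spiking or not.
dense_out = weights @ spikes.astype(float)
dense_ops = n_post * n_pre

# Event-driven path: only columns of neurons that spiked are accumulated.
event_out = weights[:, spikes].sum(axis=1)
event_ops = n_post * int(spikes.sum())

assert np.allclose(dense_out, event_out)   # same result, far fewer operations
print(f"dense MACs: {dense_ops}, event-driven adds: {event_ops}")
```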
For perceptive and cognitive applications in BICS, the von Neumann architecture is convenient for symbolic reasoning, whereas the BICU excels in perceptual recognition, intuitive judgment, and learning, using complex network representations that integrate information storage and computing capability. By combining the advantages of both von Neumann and neuromorphic architectures with low power consumption, BICS provide a highly suitable model for the development of intelligent robots (Figure 6).

Opportunity for development of BICS
At present there is no widely accepted paradigm for building BICS. However, as we discuss below, the development of BICS appears to be ripe for a breakthrough due to rapid advances in neuroscience, information technology, mathematics, and new devices, and great interest from the semiconductor industry.

Over the past decade, the field of neuroscience has accumulated vast amounts of data. With the development of the two-photon microscope and functional magnetic resonance imaging, it is now possible to make brain measurements at scales ranging from single neurons to large-scale brain networks. In addition, optogenetics can be used to precisely control brain activity at the level of single neurons. Thus the field is currently in a state of rapid development and approaching a threshold for achieving a much greater understanding of the mechanisms of brain function.

In terms of information technology, current supercomputing technologies can simulate a cortical column and large-scale, whole-brain models with simplified spiking neurons in exquisite detail. In addition, the development of Internet technologies and sensor technologies has brought about a big-data revolution in recent years. The resulting massive data trove has generated an intense demand for efficient artificial intelligence methods, and has made it possible to train large-scale, brain-inspired models such as deep convolutional neural networks and recurrent neural networks.

In parallel, a wide range of mathematical branches have started to cross-pollinate, laying the foundation for a comprehensive and rigorous theoretical framework for BICS. One promising avenue of research is Bayesian probability theory, which has been shown to model aspects of human learning and decision-making. The field of complex systems, which encompasses graph theory–based analysis of network topology and theories of nonlinear differential equation dynamics, also promises to radically advance BICS.

In addition, many recently developed devices can serve to enhance BICS. For example, as mentioned earlier, Strukov et al. (22) developed a memristor device for constructing a new kind of neuromorphic network by emulating characteristics of synapse plasticity, short-term memory, and long-term memory. It has low power consumption and is compatible with silicon technology.

In the semiconductor industry, the maximum number of integrated transistors on a chip is approaching 100 billion, which is higher than the number of neurons in the nervous systems of Drosophila and mice. Within the next few years, that number is expected to reach or exceed 100 billion, the number of neurons in the human brain. It is predicted that when technology allows the semiconductor device fabrication node to shrink below 10 nm in size, which is projected to happen in 2017, neuromorphic devices will potentially reach the same level of computational efficiency as biological neurons (46).

Advances in artificial intelligence
Inspiration obtained from studying the brain can play an influential role in the development of artificial agents with human-like general intelligence. As an example, AlphaGo—software trained to play the deceptively complex ancient Chinese game of Go—represents a recent breakthrough in the development of artificial intelligence (AI) (47). In March 2016, it beat Lee Sedol, one of the world's top Go players, winning the series by four games to one. Previously, the Deep Blue program, another example of AI, used its vast computing power in chess to quickly search through large decision trees, a strategy quite different from that afforded by human intelligence. In contrast, the AlphaGo team used deep neural networks to mimic an aspect of what top Go players might label intuition or perception. In addition, AlphaGo acquired skill both by analyzing the games of top players through supervised learning and by self-play through reinforcement-learning mechanisms, demonstrating human-like learning capabilities.
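AlphaGo's full training pipeline is far beyond a snippet, but the core idea behind learning from self-play can be shown with a generic policy-gradient (REINFORCE) update: raise the probability of moves that preceded a win and lower the probability of moves that preceded a loss. The Python sketch below applies this rule to a made-up two-move game; it illustrates the principle only and is not AlphaGo's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(2)          # preferences over two possible moves
LEARNING_RATE = 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for episode in range(2000):
    probs = softmax(logits)
    move = rng.choice(2, p=probs)
    # Made-up game: move 1 wins 70% of the time, move 0 wins 40%.
    reward = 1.0 if rng.random() < (0.7 if move == 1 else 0.4) else -1.0
    # REINFORCE: gradient of log-probability of the chosen move,
    # scaled by the game outcome.
    grad = -probs
    grad[move] += 1.0
    logits += LEARNING_RATE * reward * grad

print("move probabilities after training:", softmax(logits).round(3))
```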
Currently, AI largely entails carrying out specific tasks with discrete capabilities. This approach has yielded many interesting technologies. For example, deep learning has dramatically improved speech recognition as well as visual object recognition and detection. In light of these advances, it is an opportune time to focus on artificial general intelligence (AGI), an alternate form of AI (47, 48). The field of AGI views "general intelligence" as a fundamentally distinct property from task- or problem-specific capabilities, and focuses on understanding the universal properties of intelligence. At the core of AGI is the hypothesis that synthetic intelligence with sufficiently broad (e.g., human-level) scope and strong generalization capability is qualitatively different from synthetic forms of intelligence with significantly narrower scope and weaker generalization capability (49). The flexibility, robustness, low power consumption, and real parallel computation capabilities provided by BICS would facilitate the integration of new brain-like models and algorithms suitable for developing AGI. The general capability of AGI will ultimately be of key importance in the evolution of truly intelligent robots.

FIGURE 6. The development trend of the BICU, showing the relationship between energy/time and spatial complexity. The energy efficiency of the BICU (left dashed line) is better than that of von Neumann architecture and can be expected to gradually approach that of the human brain (right dashed line).

Conclusions
To be considered genuinely intelligent, robots need to combine the superior capability of computers in performing searches (reasoning) and simulating human deliberative behaviors with the advanced capability of neural networks in perception and learning. This approach points to a promising path forward for building AGI systems. Such advances are highly compatible with the proposed hybrid BICS architecture, which can integrate tasks at which current computers excel—such as high-speed symbolic processing—with tasks at which the human brain excels—such as perception, adaptation, learning, cognition, and even innovation and creativity. Furthermore, it will be important to create rich but controlled environments in which robots can interact with the world, as testbeds to develop their perceptual, cognitive, and motor abilities. In summary, BICS promise to greatly aid in the development of AGI, which will ultimately drive robots towards human-like intelligence.

References
1. M. M. Waldrop, Nature 530, 144–147 (2016).
2. P. A. Merolla et al., Science 345, 668–673 (2014).
3. F. Akopyan et al., IEEE T. Comput. Aid. D. 34, 1537–1557 (2015).
4. D. S. Modha et al., Commun. ACM 54, 62–71 (2011).
5. A. Amir, Proceedings of the 2013 International Joint Conference on Neural Networks (IEEE, Dallas, TX, August 2013), pp. 1–6.
6. J. Hsu, IEEE Spectr. 51, 17–19 (2014).
7. S. B. Furber, F. Galluppi, S. Temple, Proc. IEEE 102, 652–665 (2014).
8. S. B. Furber, IEEE Spectr. 49, 46–49 (2012).
9. B. V. Benjamin et al., Proc. IEEE 102, 699–716 (2014).
10. P. Gao, B. V. Benjamin, K. Boahen, IEEE Trans. Circuits Syst. I Regul. Pap. 59, 2383–2394 (2012).
11. K. Meier, 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.6.1–4.6.4.
12. J. Schemmel, J. Fieres, K. Meier, Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE, Hong Kong, June 2008), pp. 431–438.
13. G. Indiveri et al., 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.2.1–4.2.4.
14. G. Indiveri, S. C. Liu, Proc. IEEE 103, 1379–1397 (2015).
15. N. Qiao et al., Front. Neurosci. 9, article 141 (2015).
16. T. Chen et al., ACM SIGPLAN Notices (ACM, New York, NY, 2014), pp. 269–284.
17. D. F. Liu et al., Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ACM, Istanbul, Turkey, March 2015), pp. 369–381.
18. Z. Du et al., Proceedings of the 42nd Annual International Symposium on Computer Architecture (ACM, Portland, OR, June 2015), pp. 92–104.
19. D. Y. Kim, J. M. Kim, H. Jang, J. Jeong, J. W. Lee, IEEE Trans. Consum. Electr. 61, 555–563 (2015).
20. J. Tanabe et al., 2015 IEEE International Solid-State Circuits Conference (IEEE, San Francisco, CA, February 2015), pp. 1–3.
21. Y. H. Chen, N. Fong, B. Xu, C. Wang, 2016 IEEE International Solid-State Circuits Conference (IEEE, San Francisco, CA, January–February 2016), pp. 262–263.

22. D. B. Strukov, G. S. Snider, D. R. Stewart, R. S. Williams, Nature 453, 80–83 (2008).
23. J. J. Yang, D. B. Strukov, D. R. Stewart, Nat. Nanotechnol. 8, 13–24 (2013).
24. M. D. Pickett, G. Medeiros-Ribeiro, R. S. Williams, Nat. Mater. 12, 114–117 (2013).
25. L. Deng et al., Sci. Rep. 5, 10684 (2015).
26. Q. Xia et al., Nano Lett. 9, 3640–3645 (2009).
27. K.-H. Kim et al., Nano Lett. 12, 389–395 (2011).
28. S. Park et al., 2012 IEEE International Electron Devices Meeting (IEEE, San Francisco, CA, December 2012), pp. 10.2.1–10.2.4.
29. G. W. Burr et al., 2014 IEEE International Electron Devices Meeting (IEEE, San Francisco, CA, December 2014), pp. 29.5.1–29.5.4.
30. J. M. Cruz-Albrecht, T. Derosier, N. Srinivasa, Nanotechnology 24, 384011 (2013).
31. S. B. Eryilmaz et al., Front. Neurosci. 8, 205 (2014).
32. B. Li et al., International Symposium on Low Power Electronics and Design (IEEE, Beijing, China, September 2013), pp. 242–247.
33. T. Tang et al., 2015 Design, Automation and Test in Europe Conference and Exhibition (IEEE, Grenoble, France, March 2015), pp. 860–865.
34. R. C. Luo et al., Proc. IEEE 91, 371–382 (2003).
35. X. Lu et al., 2015 52nd ACM/EDAC/IEEE Design Automation Conference (IEEE, San Francisco, CA, June 2015), pp. 1–6.
36. G. Marcus, A. Marblestone, T. Dean, Science 346, 551–552 (2014).
37. J. Lisman, Neuron 86, 864–882 (2015).
38. D. S. Bassett, M. S. Gazzaniga, Trends Cogn. Sci. 15, 200–209 (2011).
39. E. Bullmore, O. Sporns, Nat. Rev. Neurosci. 10, 186–198 (2009).
40. L. P. Shi et al., 2015 IEEE International Electron Devices Meeting (IEEE, Washington, DC, December 2015), pp. 4.3.1–4.3.4.
41. E. Burdet, D. Franklin, T. E. Milner, Human Robotics: Neuromechanics and Motor Control (MIT Press, Cambridge, MA, 2013).
42. A. J. Ijspeert, A. Crespi, D. Ryczko, J.-M. Cabelguen, Science 315, 1416–1420 (2007).
43. X. Hu et al., PLOS ONE 9, e81813 (2014).
44. J. Hertzberg et al., Künstliche Intelligenz 28, 297–304 (2014).
45. N. Burgess, J. G. Donnett, J. O'Keefe, Connect. Sci. 10, 291–300 (1998).
46. J. Hasler, B. Marr, Front. Neurosci. 7, 118 (2013).
47. E. Gibney, Nature 531, 284–285 (2016).
48. B. Goertzel, Journal of Artificial General Intelligence 5, 1–48 (2014).
49. D. Kumaran, D. Hassabis, J. L. McClelland, Trends Cogn. Sci. 20, 512–534 (2016).

Acknowledgments
We are indebted to the following researchers for their helpful discussions: the developers of Tianjic, J. Pei, Y. H. Zhang, and D. Wang; and the developers of the self-driving bicycle, M. G. Zhao, Y. Wang, S. Wu, and N. Deng. This work was supported in part by the Study of Brain-Inspired Computing of Tsinghua University (20141080934), the Beijing Municipal Science and Technology Commission, and the National Natural Science Foundation of China (61327902).

Deep learning: Mathematics and neuroscience

Tomaso Poggio

Center for Brains, Minds, and Machines, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA
Corresponding Author: tp@ai.mit.edu

Understanding the nature of intelligence is one of the greatest challenges in science and technology today. Making significant progress toward this goal will require the interaction of several disciplines including neuroscience and cognitive science, as well as computer science, robotics, and machine learning. In this paper, I will discuss the implications of the recent empirical successes of a machine learning technique called "deep learning," which is based on multilayer or hierarchical neural networks, in many applications such as image categorization, face identification, localization, action recognition, depth estimation, and speech recognition. Such neural networks have become a central tool in machine learning.

Science and engineering of intelligence
The original idea of hierarchical neural networks came from basic neuroscience studies of the visual cortex conducted by Hubel and Wiesel in the 1960s (1). Recordings they made from simple and complex cells in the primary visual cortex suggested a hierarchy of layers of simple, complex, and hypercomplex neurons. These findings were later taken to reflect a series of layers of units comprising increasingly high degrees of complex features and invariance. Fukushima developed the first quantitative model of a neural network for visual pattern recognition, called the "neocognitron" (2). Its architecture was identical to today's multilayer neural networks, comprising convolution, max pooling, and nonlinear units. However, the training of that model differed from today's, since it did not rely on the supervised stochastic gradient descent (SGD) technique introduced by Hinton (3), which is used in current deep learning networks. Several similar quantitative models followed, some geared toward optimizing performance in the recognition of images (4), and others toward the original goal of modeling the visual cortex, such as a physiologically faithful model called "HMAX" (5, 6).

Whereas the roots of deep learning architectures are derived from the science of the brain, advances in the performance of these architectures on a variety of vision, speech, and text tasks over the last five years can be regarded as a feat of engineering, combined with the cumulative effect of greatly increased computing power, the availability of manually labeled large data sets, and a small number of incremental technical improvements.

It is telling that several of the algorithmic tricks touted as breakthroughs just a couple of years ago are now regarded as unnecessary. Some notable examples are "pretraining," "dropout," "max pooling," and "inception modules." As discussed below, other ideas, such as "data augmentation" [once called "virtual examples" (7)], "batch normalization," and "residual learning" (8), are more fundamental and likely more durable, though their exact form will likely change somewhat.

Deep learning: Scientific breakthroughs?
Deep learning appears to be more than a good engineering implementation of existing know-how: It may be the beginning of a breakthrough with the potential to open a new field of science. There are several reasons for this statement; below I discuss the two that are most noteworthy.

The first is related to neuroscience. Deep networks trained with ImageNet, a database of more than 1 million labeled images, seem to mimic not only the recognition performance but also some of the tuning properties of neurons in cortical areas of the visual cortex of monkeys (9). However, there are major caveats to this interpretation. One is that the variance explained by the models is only around 60%. The comparison is made between actual neurons and optimal linear combinations of units in a layer of the model, thus neglecting nonuniformities in cortical areas (such as cortical layers, different types of neurons, and face patches). However, for the sake of the argument, let me take an optimistic view of these claims, which leads me to conclude that in order to explain how the visual cortex works, it is sufficient to spell out a computational task, such as a high-variation object categorization task, then find parameter values by optimizing the recognition performance of the architecture on the millions of data points that are images and their labels. The constraints on the architecture are rather weak, consisting of the simple nonlinear units and generic shift-invariant connectivity used in both the neocognitron and HMAX networks. Since the constrained optimization problem described above yields properties similar to what neuroscience shows, we can conclude that an optimization problem with such weak constraints has a unique solution and, furthermore, that evolution has found the same solution. As mentioned in previous work (9), this seems to imply that the computational level of analysis of both Marr and myself (10) effectively determines the lower levels of the algorithms, and that the details of the implementation are irrelevant.

The second reason deep learning may portend a major breakthrough in science is related to the existing mathematical theory of machine learning and the need to extend it. In particular, it seems that trained multilayer networks (deep networks) are able to generalize to other tasks and other databases very different from the ones on which they were originally trained (3). This behavior is quite distinct from and superior to that of classical one-layer kernel machines (shallow networks), which typically suffer from overfitting issues. Deep convolutional networks trained to perform classification on ImageNet achieve good performance on different image datasets with little additional training. The same networks can be used to perform somewhat different tasks, such as detection or segmentation, again with relatively minor training of an additional output layer. These characteristics are consistent with the intriguing ability of deep learning networks to avoid overfitting and to show a testing error similar to the training error, at least when trained with very large data sets. A new focus in the mathematics of learning and in the field of function approximation is clearly desirable, in order to extend these findings and realize the potential of deep learning.
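Operationally, transferring a trained network with "relatively minor training of an additional output layer" amounts to freezing the learned representation and refitting only a linear readout on the new task. A minimal Python sketch of that recipe follows, with a fixed random projection standing in for the pretrained feature extractor; the data and dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pretrained" feature extractor (a random tanh projection here,
# standing in for convolutional features learned on another dataset).
W_frozen = rng.normal(size=(10, 32))

def features(x):
    return np.tanh(x @ W_frozen)

# New task: 200 labeled examples, two classes.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Train only the linear readout, by least squares on the frozen features.
F = features(X)
readout, *_ = np.linalg.lstsq(F, 2 * y - 1, rcond=None)

accuracy = ((F @ readout > 0) == (y == 1)).mean()
print(f"readout-only training accuracy: {accuracy:.2f}")
```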
Deep learning: Open scientific questions and some answers
The ability of deep learning networks to predict properties of the visual cortex seems a major breakthrough, which prompts the obvious question: What if the new technology makes it possible to develop some of the basic building blocks for brain-like intelligence? Of course, training with millions of labeled examples is biologically implausible and is not how children learn. However, it may be possible to largely eliminate the need for labels, which requires that a learning organism have the ability to group together (with the implicit label of "same identity") images of the same object (or, for classification, of the same object class). This ability of "implicit labeling" could then bypass the need for huge numbers of labeled data. The idea is that when explicit labels are not provided, contextual or other information may often allow implicit labeling. There are several plausible ways for biological organisms, particularly during development, to implicitly label images and other sensory stimuli such as sounds and words. For images, time continuity is a powerful and primitive cue for implicit labeling (11–13). As a back-of-the-envelope estimate, let us consider how many unlabeled images a baby robot could acquire during its first year of "life." One saccade (i.e., an eye movement that changes the direction of gaze) per second, 8 hours per day for 365 days, would provide the robot with over 10 million images (3,600 × 8 × 365 ≈ 10.5 million). If only a small fraction of these images, say 10%, were implicitly labeled, even this smaller number could provide a sufficiently rich training set, as suggested by empirical results on ImageNet. A single deep learning network architecture certainly cannot deal with the challenging cognitive tasks that humans perform constantly. However, it may be possible to "build a mind" with a set of different modules, several of which are deep learning networks.
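Time continuity as an implicit label can be stated very simply: frames that occur close together in a stream are treated as views of the same object, yielding "same identity" pairs without any human annotation. The Python sketch below generates such pairs from a synthetic frame stream and checks how often the implicit label is correct; it illustrates the cue discussed in (11–13) rather than the training procedure of any specific paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# A synthetic "video": each object is visible for a contiguous run of frames.
frames = [("obj_A", rng.normal(size=8)) for _ in range(5)] + \
         [("obj_B", rng.normal(size=8)) for _ in range(5)]

def implicit_pairs(n_frames, window=2):
    """Frames within `window` steps of one another get the implicit
    label 'same identity'; no human annotation is used."""
    pairs = []
    for i in range(n_frames):
        for j in range(i + 1, min(i + window + 1, n_frames)):
            pairs.append((i, j))
    return pairs

pairs = implicit_pairs(len(frames))
correct = sum(frames[i][0] == frames[j][0] for i, j in pairs)
print(f"{correct}/{len(pairs)} implicit 'same' pairs are actually same-object")
```

As in the back-of-the-envelope estimate above, most (though not all) temporally adjacent pairs carry the correct implicit label, which can be enough to train a representation.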
I now turn to the set of mathematical questions posed by the successful engineering of deep learning, in order to understand when and why deep networks are better than shallow ones. In answering this question, there are two further sets of questions about deep neural networks to consider. The first relates to the power of the architecture: Which classes of functions can it approximate well, and what is the role of convolution and pooling? The second set relates to learning the unknown coefficients from the data: Do multiple solutions exist? If so, how many? Why is the learning technique based on SGD so unreasonably efficient, at least in appearance? Are good local minima easier to find in deep rather than in shallow networks?

FIGURE 1. On the left is a shallow universal network with one node containing n units or channels. On the right is a binary-tree hierarchical network. Some of the best deep learning networks are quite similar to the binary-tree architecture (8).

Answers to some of these questions are just beginning to emerge. There is now a rather complete theory of convolutional and pooling layers that extends current neural network design to other classes of transformations beyond the translation group, including nongroup transformations, ensuring partial invariance under appropriate conditions. Some of the less obvious mathematical results are as follows (12, 14):
• Pooling—or locally averaging—a number of nonlinear units is sufficient for invariance and for maintaining uniqueness [see theorems in section 3 of (12)], provided there is no clutter within the pooling region. Thus, convolution and subsampling as performed in neural networks are sufficient for selective invariance, provided there are a sufficient number of units. The form of the nonlinearity used in pooling is rather flexible: Smooth rectifiers and threshold functions are among the simplest choices.
• The size of the pooling region does not affect uniqueness, apart from larger regions being more susceptible to clutter. Reducing clutter effects on selective invariance by restricting the pooling region is probably a main role of "attention."
• Max pooling is a form of pooling that provides invariant but not strictly selective information. There is no need for max pooling instead of local averaging, apart from computational considerations.
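The first bullet can be checked numerically: rectify template responses, average them within a pooling region, and the pooled value changes only slightly when the input pattern shifts inside that region. The following Python sketch is an informal illustration of this invariance property, not the formal construction of (12); the template and signal are arbitrary.

```python
import numpy as np

def pooled_response(signal, template, pool_width):
    """Rectified template matches, then local averaging over pooling regions."""
    responses = np.maximum(0.0, np.convolve(signal, template, mode="same"))
    # Average within non-overlapping pooling regions.
    n = len(responses) // pool_width * pool_width
    return responses[:n].reshape(-1, pool_width).mean(axis=1)

rng = np.random.default_rng(0)
template = rng.normal(size=3)
pattern = np.zeros(32)
pattern[8:11] = [1.0, 2.0, 1.0]        # a small "feature" in the input
shifted = np.roll(pattern, 2)          # same feature, shifted within the region

a = pooled_response(pattern, template, pool_width=8)
b = pooled_response(shifted, template, pool_width=8)
print("pooled responses differ by at most:", np.abs(a - b).max())
```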
The above results related to pooling do not by themselves explain why and when multiple layers are superior to a single hidden layer. Using the language of machine learning and function approximation, the following additional statements can be made (15, 17):
• Both shallow and deep networks are universal. That is, they can approximate arbitrarily well—and in principle learn—any continuous function of d variables (on a compact domain).
• Then what is the difference between shallow and deep networks? We conjectured that a deep network matching the structure of a compositional function (Figure 1) should be "better" at approximating such functions than a generic shallow network. We define "compositional functions" as functions that are hierarchically local; they are compositions of functions with a small, fixed number of variables.

To prove a formal result, one has to compare the complexity (measured in terms of number of units or parameters) of shallow and deep networks. Figure 1 (right) shows a "deep" hierarchical network with eight inputs and the architecture of a binary tree. The network architecture matches the structure of compositional functions of the form:

f(x1,…,x8) = g3(g2(g1(x1,x2), g1(x3,x4)), g2(g1(x5,x6), g1(x7,x8))).

• Each of the nodes in the deep network consists of a set of rectified linear units, so called because each unit implements a rectification of its scalar input (the output is zero if the input is negative, and equal to the input otherwise). Though not needed for the theorem, Figure 1 assumes shift invariance (equivalent to weight sharing and convolution). Thus, the nodes in each layer compute the same function (for instance, g1 in the first layer).

With these definitions, the following theorem, stated informally [for a formal statement see (15)], answers the question of why and when deep networks are superior to shallow networks:

Both shallow and deep networks can approximate a compositional function of d variables. For the same level of accuracy, the number of required parameters in the shallow network is exponential in d, whereas it is linear in d for the deep network.

Notice that the binary tree of Figure 1 (right) is a good model of the architecture of state-of-the-art neural networks, with their small kernel size and many layers (8). This figure is very similar to hierarchical models of the visual cortex, which show a doubling of receptive field sizes from one layer to the next.

The results outlined here are only the beginning of studies of multilayer function approximation techniques and related machine learning results. They are of broad interest because progress in machine learning needs a theory to build upon the empirical successes of deep learning achieved thus far.
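The flavor of the parameter-counting argument can be seen by building the compositional function above directly. The Python sketch below assembles the binary tree for d = 8 inputs from tiny two-input ReLU modules, one per internal node, so the parameter count grows linearly with the number of inputs. The exponential lower bound for shallow networks is the mathematical content of (15) and is of course not proved by a snippet; the module sizes here are arbitrary.

```python
import numpy as np

def make_g(seed):
    """A constituent function of two variables: a tiny one-layer ReLU net."""
    r = np.random.default_rng(seed)
    W, v = r.normal(size=(4, 2)), r.normal(size=4)
    return lambda x, y: float(v @ np.maximum(0.0, W @ np.array([x, y])))

# One two-input module per internal node of the binary tree
# (a tree on 8 leaves has 7 internal nodes: 4 + 2 + 1).
g = [make_g(s) for s in range(7)]

def f(x):
    """f(x1..x8) built level by level, as in the formula above."""
    level1 = [g[i](x[2 * i], x[2 * i + 1]) for i in range(4)]
    level2 = [g[4 + i](level1[2 * i], level1[2 * i + 1]) for i in range(2)]
    return g[6](level2[0], level2[1])

rng = np.random.default_rng(0)
print("f at a random point:", f(rng.normal(size=8)))

params_per_module = 4 * 2 + 4   # hidden weights + readout weights
print("deep-tree parameters:", 7 * params_per_module)   # grows like d
# A generic shallow approximator has no such factorization: its unit count,
# and hence its parameter count, can grow exponentially with d (15).
```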

Importance of the science of intelligence
In this article, I have sketched some scientific questions in both neuroscience and mathematics that will need to be answered. In fact, there may be a deep connection between the two fields:

"A comparison with real brains offers another, and probably related, challenge to learning theory. The 'learning algorithms' we have described in this paper correspond to one-layer architectures. Are hierarchical architectures with more layers justifiable in terms of learning theory? It seems that the learning theory of the type we have outlined does not offer any general argument in favor of hierarchical learning machines for regression or classification. This is somewhat of a puzzle since the organization of the cortex—for instance, the visual cortex—is strongly hierarchical. At the same time, hierarchical learning systems show superior performance in several engineering applications." (16)

Beyond the case of deep learning, I believe that a concentrated effort in the study of the basic science of intelligence—not only the engineering of intelligence—should be a high priority for our society. We need to understand how our brain works and how it generates the mind; this understanding will advance basic curiosity-driven research and contribute greatly to human knowledge. It will also be critically important for future advances in engineering, and for helping us to build robots and artificial intelligences in an ethical manner.

References
1. D. H. Hubel, T. N. Wiesel, J. Physiol. 160, 106–154 (1962).
2. K. Fukushima, Biol. Cybern. 36, 193–202 (1980).
3. Y. LeCun, Y. Bengio, G. Hinton, Nature 521, 436–444 (2015).
4. Y. LeCun et al., Neural Comput. 1, 541–551 (1989).
5. M. Riesenhuber, T. Poggio, Nat. Neurosci. 3, 1199–1204 (2000).
6. T. Serre et al., Prog. Brain Res. 165, 33–56 (2007).
7. P. Niyogi, T. Poggio, F. Girosi, Proc. IEEE 86, 2196–2209 (1998).
8. K. He, X. Zhang, S. Ren, J. Sun, arXiv:1512.03385v1 [cs.CV] (2015).
9. D. L. K. Yamins, J. J. DiCarlo, Nat. Neurosci. 19, 356–365 (2016).
10. T. Poggio, Perception 41, 1017–1023 (2012).
11. F. Anselmi et al., Center for Brains, Minds, and Machines (CBMM) Memo No. 001, DSpace@MIT (2014); available at http://cbmm.mit.edu/sites/default/files/publications/1311.4158v5_opt.pdf.
12. F. Anselmi et al., Theor. Comput. Sci. 633, 112–121 (2015).
13. X. Wang, A. Gupta, Proceedings of the IEEE International Conference on Computer Vision (IEEE, Santiago, Chile, December 2015), pp. 2794–2802.
14. F. Anselmi, T. Poggio, Visual Cortex and Deep Networks: Learning Invariant Representations (MIT Press, Cambridge, MA, 2016).
15. H. Mhaskar, Q. Liao, T. Poggio, arXiv:1603.00988 [cs.LG] (2016).
16. T. Poggio, S. Smale, Not. Am. Math. Soc. 50, 537–544 (2003).
17. T. Poggio, "Deep Learning: Mathematics and Neuroscience" (2016); available at http://cbmm.mit.edu/publications/deep-learning-mathematics-and-neuroscience.

Acknowledgments
The author is supported by the Center for Brains, Minds and Machines (CBMM), funded by National Science Foundation Science and Technology Centers award CCF-1231216.

Collective robots: Architecture, cognitive behavior model, and robot operating system

Xiaodong Yi, Yanzhen Wang, Xuejun Yang*, Shaowu Yang, Bo Zhang, Zhiyuan Wang, Yun Zhou, Wei Yi, and Xuefeng Peng

State Key Laboratory of High Performance Computing, School of Computer Science, National University of Defense Technology, Changsha, China
* Corresponding Author: xjyang@nudt.edu.cn

To carry out applications in the service of human society, such as exploration of unknown environments and disaster relief missions, robots must be able to autonomously adapt to dynamic and uncertain environments and accomplish complex tasks with flexibility. As a means of achieving these goals, what we call "collective robots"—a collection of multiple autonomous robots performing tasks collaboratively—are likely to become one of the most important types of robots in the future.

To improve their capacity to adapt to their environment and accomplish complex tasks, collective robots must be treated as a living system, an organism whose purpose is to survive in dynamic surroundings and successfully complete tasks. Therein lies the fundamental difference between collective robotics and swarm robotics: Swarm robotics projects, such as Kilobot (1) and swarm-bot (2), comprise multiple simple robots, while collective robots can comprise highly heterogeneous robots, each possessing strong sensing and movement capabilities.

Cooperation among collective robots is also essentially different from that among multirobot systems, as seen in hazardous-environment exploration (3), unmanned aerial vehicle formation flight (4), and robotic soccer (5). When facing complex tasks in unknown environments, cooperation among collective robots relies on distributed mechanisms that merge autonomous actions from multiple robots, such that a higher level of collective intelligence emerges. By contrast, multirobot cooperation often relies on a certain element of centralized control.

Here, we review the design of the collective structure and cognitive behavior model of collective robots, with micROS providing an example of how to implement morphable, intelligent, and truly collective robotic operating systems.

The collective structure

FIGURE 1. Left: Collective robots interacting in an outdoor urban environment. Right: The collective structure of the robots in an indoor domain, where the nodes in the black, dashed circles represent the robots, and the cloud represents different environments interacting with robots and humans.

The collective structure
Before they are capable of accomplishing complex tasks, collective robots must first be able to adapt to and operate within uncertain, dynamic, and unpredictable environments. According to Gödel's incompleteness theorem, Heisenberg's uncertainty principle, and the second law of thermodynamics, the character and nature of a system cannot be determined internally, and any efforts to do so will lead to disorder. Since only an open system is capable of fully adapting to changes, collective robots must have an open architecture and continually interact with open environments.

To guarantee their scalability, robustness, parallelism, and flexibility, collective robots should conform to a distributed organizational structure in which multiple nodes form an interconnected network and enable distributed control—this structure ensures that collective robots are decentralized, autonomous, and adaptive.

The collective structure exerts enormous effects on the behavior of collective robots. In general, the collective structure should be flexible and cellular: a networked structure formulated through design concepts, experience, mission objectives, and operational modes. This structure leads to abundant nondeterminacy that benefits the system's revolutionary, innovative, and active evolution. Classical organizational structures include hierarchies, holarchies, coalitions, teams, congregations, federations, markets, matrices, and societies (6). Collective robots should be capable of autonomously formulating a variety of organizational structures according to different environments and tasks. Moreover, they should also be able to dynamically evolve into new structures when environments and tasks change. The collective structure should also be capable of integrating external, heterogeneous robots when needed.

Figure 1 illustrates collective robots providing services for human users in two domains of the urban environment. In the outdoor domain, unmanned aerial vehicles and unmanned ground vehicles improve mobility through cooperative sensing, planning, and data sharing. Indoors, the robots collaboratively work with computers and intelligent terminals for improving the quality of services.

A model for cognitive behavior
Existing as parts of an organic whole, the individual components of collective robots must cooperate with each other to achieve harmonious behavior. The collective should be capable of adapting to dynamic environments proactively, not just passively. To achieve the characteristics of dynamism, speed, harmony, and initiative, collective robots must possess key attributes of cognition and behavior, as follows:

(1) Observation: As an open system, collective robots must continuously acquire information from the surrounding environment through proactive observation rather than passive sensing. To improve the observational capability and effectiveness of the entire collective, diversity of observation within the system must be respected. Collective observation activities and perceived content should be circulated in the system to obtain a shared and enriched understanding for the collective.

(2) Orientation: As the key to cognition and behavior, orientation is defined as the paired acts of hypothesizing and testing, analyzing and synthesizing, and destroying and creating. It is a dynamic and continuous internal interaction. The results of orientation are represented as images, views, or impressions of the world shaped by genetic heritage, cultural tradition, previous experiences, or unfolding circumstances. Orientation brings insight, focus, and direction to the act of observing. As the pivotal point of control, orientation shapes the way in which collective robots interact with the environment.

(3) Interaction: Autonomous individuals and the interactions among them form the basis of collective intelligence. Harmony, focus, and direction in the collective's operations are created by the bonds of implicit communication and trust that evolve as a consequence of the similar mental images or impressions each individual creates and commits to memory. These include common insights, orientation patterns, and previous experiences.

In this way, it is not only the subjective initiative of each individual that can be mobilized, but also the collective intent.

(4) Decision and action: Observations that match up with certain mental schema call for certain decisions and actions. A “decision” occurs when each individual of the robot collective decides among the action alternatives generated in the orientation phase. “Action” takes place when operations affecting the environment are carried out according to the decision. Previous research has shown that systems comprising more diverse and independent members make better choices (9).

(5) Feedback loop: Feedback is a key mechanism for an open system to implement self-regulation and self-organization. Besides enforcing the action-control component, feedback also allows a greater number of choices, new strategies, and decision rules to apply. It leads to the questioning of assumptions and alternative ways of seeing a situation. When one becomes exposed to a new perception or experience that challenges existing schemas, a process of reorganization and adaptation occurs, leading to new schemas. Establishment of a rapid feedback loop is essential to promote the adaptability of a system exposed to dynamic and uncertain environments and tasks.

The OODA model for cognitive behavior

FIGURE 2. John Boyd’s OODA loop [reproduced from (7)].

Based on the elements described above, the “observe-orient-decide-act” (OODA) model or loop can be constructed, as depicted in Figure 2. The remaining elements, namely interaction and the feedback loop, are implicitly included in this model. The OODA loop is a well-known decision cycle proposed by John Boyd for confrontational operations (7), which has since been successfully extended to modeling complex behaviors in commercial and social domains.

The OODA loop exhibits the evolving, open, and purposely unbalanced process of self-organizing and emerging collective intelligence demonstrated in collective robots. The observe module of the OODA loop manages the sensory data and performs information filtering, compression, and fusion. The orient module then conducts analysis and synthesis of the environmental input. The decide module performs the hypothesizing, selection, and planning of the robots’ actions according to the required tasks and environmental situation. Finally, the act module carries out the control process according to the plan, as well as any feedback received, and performs continuous reassessments based on the most current information.
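To make the four modules concrete, here is a minimal, single-robot sketch of one pass through the loop; all class and method names are invented for illustration and do not correspond to any published micROS interface.

import random

# A minimal, illustrative OODA cycle for one robot node. Each stub would
# be replaced by real perception, learning, planning, and control code.

class OODAAgent:
    def __init__(self):
        self.world_model = {}                      # internal "orientation" state

    def observe(self, raw_sensors):
        # Filter, compress, and fuse raw sensor data (drop failed channels).
        return {k: v for k, v in raw_sensors.items() if v is not None}

    def orient(self, percept):
        # Analyze and synthesize: update the internal world model.
        self.world_model.update(percept)
        return self.world_model

    def decide(self, model, task):
        # Choose among action alternatives generated from the model.
        candidates = [("hold", 0.0), ("advance", model.get("free_space", 0.0))]
        return max(candidates, key=lambda c: c[1])[0]

    def act(self, action):
        # Execute the plan; the environment's response feeds the next cycle.
        print("executing:", action)

    def run_once(self, raw_sensors, task="explore"):
        percept = self.observe(raw_sensors)
        model = self.orient(percept)
        self.act(self.decide(model, task))

agent = OODAAgent()
agent.run_once({"free_space": random.random(), "camera": "frame0", "broken": None})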
Collective intelligence
In an attempt to form collective intelligence by polymerizing the autonomous behaviors of collective robots through the OODA loop, the following four research topics have become the subject of intensive study (8):

(1) Autonomous observation and collective perception: Individual machines within collective robot systems may have different perception capabilities, and thus may assume different roles in perception. The perception–information fusion and the perception–behavior management processes will need to take these different roles into consideration.

(2) Autonomous orientation and collective cognition: Autonomous orientation of robots is achieved through machine learning technologies. A collective knowledge base is further utilized to obtain collective cognition.

(3) Autonomous decision and collective game behavior: Collective game behavior is based on cooperative/noncooperative game theory and is achieved by synthesizing the autonomous planning of individual robots.

(4) Autonomous action and collective dynamics: Collective robots achieve cooperative and harmonized motion using collective dynamic models, such as the Boids, leader–follower, and graph-based models (10).
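As a concrete instance of the graph-based family cited in (4), the sketch below runs a standard discrete-time consensus update in which each robot steps toward the mean position of its network neighbors; this is textbook collective dynamics, not the specific model of any system described here.

import numpy as np

# Discrete-time consensus on an undirected graph: each robot moves a
# small step toward the mean position of its neighbors. On a connected
# graph, all positions converge toward a common point.

adjacency = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)

positions = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0], [5.0, 5.0]])
step = 0.3  # gain; must be small enough for stability

for _ in range(50):
    neighbor_mean = adjacency @ positions / adjacency.sum(axis=1, keepdims=True)
    positions += step * (neighbor_mean - positions)

print(np.round(positions, 2))  # rows are nearly identical after convergence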
micROS
We have proposed a morphable, intelligent, and collective robot operating system (micROS) for the management, control, and development of collective robots (8). It provides a platform for standardization and modularity while achieving autonomous behavior and collective intelligence.

FIGURE 3. The distributed architecture of micROS for the city scenario example.

MicROS supports the construction of different collective structures. For example, Figure 3 illustrates an organizational structure derived from Figure 1. MicROS also provides a wireless-based, scalable, and distributed self-organized networking platform that implements distributed messaging-based communication mechanisms.

FIGURE 4. Layered structure of micROS nodes. APP, applications; API, application programming interface.

MicROS can be installed on each node of a collective robot to create a layered structure consisting of the core layer and the application programming interface (API) layer (Figure 4). The core layer is divided into resource-management and behavior-management sublayers. The former controls resource management in the physical, information, cognitive, and social domains. The latter, based on the OODA cognitive behavioral model, is composed of observation, orientation, decision, and action modules.
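Purely to fix ideas, the following sketch mirrors the layering just described with invented interfaces; it is not the actual micROS API.

# Illustrative layering of a single micROS-style node (all names invented).

class ResourceManagement:
    """Resource sublayer: physical, information, cognitive, social domains."""
    def allocate(self, domain, request):
        return {"domain": domain, "granted": request}

class BehaviorManagement:
    """Behavior sublayer structured after the OODA modules."""
    def __init__(self):
        self.modules = ["observe", "orient", "decide", "act"]
    def step(self, percept):
        return {"plan": "patrol", "based_on": percept}

class CoreLayer:
    def __init__(self):
        self.resources = ResourceManagement()
        self.behavior = BehaviorManagement()

class NodeAPI:
    """API layer exposed to applications running on the node."""
    def __init__(self):
        self.core = CoreLayer()
    def submit_task(self, task):
        grant = self.core.resources.allocate("physical", task["actuators"])
        plan = self.core.behavior.step(task["percept"])
        return grant, plan

node = NodeAPI()
print(node.submit_task({"actuators": 2, "percept": {"obstacle": False}}))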
Conclusions and future directions
In this paper we illustrated the collective structure and the cognitive behavior model of collective robots. We also proposed the design of a morphable, intelligent, and collective robot operating system (micROS) for the management, control, and development of collective robots. In the future, we plan to focus our research on autonomous behaviors and collective intelligence, as well as devoting attention to the further development of micROS.

References
1. M. Rubenstein, A. Cornejo, R. Nagpal, Science 345, 795–799 (2014).
2. R. Gross, M. Bonani, F. Mondada, M. Dorigo, IEEE Trans. Robot. 22, 1115–1130 (2006).
3. M. Eich et al., J. Field Robot. (Special Issue on Space Robotics, Part 2) 31, 35–74 (2014).
4. B. D. O. Anderson, B. Fidan, C. Yu, D. Van der Walle, in Recent Advances in Learning and Control, Vol. 371, V. D. Blondel, S. P. Boyd, H. Kimura, Eds. (Springer-Verlag, London, 2008), pp. 15–33.
5. L. Mota, L. P. Reis, N. Lau, Mechatronics 21, 434–444 (2011).
6. B. Horling, V. Lesser, Knowl. Eng. Rev. 19, 281–316 (2005).
7. F. P. B. Osinga, Science, Strategy and War: The Strategic Theory of John Boyd (Routledge, Abingdon, UK, 2007).
8. X. Yang et al., Robotics Biomim. 3, article 21 (2016).
9. L. Fisher, The Perfect Swarm: The Science of Complexity in Everyday Life (Basic Books, New York, NY, 2011).
10. X. Wang, Z. Zeng, Y. Cong, Neurocomputing 199, 204–218 (2016).
Toward robust visual cognition through brain-inspired computing

Pengju Ren†, Badong Chen†, Zejian Yuan, and Nanning Zheng*

Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, China
*Corresponding Author: nnzheng@mail.xjtu.edu.cn
†Contributed equally to this work.

Visual perception is one of the most important components of remote autonomous systems and intelligent robots, and is one of the most active areas of artificial intelligence research. Achieving a better understanding of the brain’s visual perception mechanisms will be critical for developing robust visual cognitive computing, building compact and energy-efficient intelligent autonomous systems, and establishing new computing architectures for visual perception and understanding of complex scenes. Inspired by the cognitive mechanisms of brain networks, associative memory, and human visual perception, here we review a computational framework of visual representation and “scene understanding” (1), and explore possible approaches to robust visual cognition.

Brain networks and cognitive plasticity
The brain is a complex network comprising interconnected tissues from different regions, with each region responsible for a different cognitive task. The functionality and tasks related to spontaneous brain activities are determined by a network topology that is hierarchical, multiscale, highly connected, and multicentric (2). In recent years, neuroscientists have gained insights into the cognitive mechanisms of the brain by investigating structural connections, as well as the aggregation and separation (convergence and divergence) of functional and effective connections (Figure 1). Structural connections comprise the physiological connections within the cerebral cortex; their high level of redundancy determines the versatility and fault-tolerant characteristics of the brain. Functional connections are task-specific statistical correlations based on neurophysiological phenomena; their degree of reconstruction determines the brain’s cognitive plasticity (3). “Effective connections” describes the causal connections and mutual influences between different functional areas, which determine the efficiency and flexibility of cognitive ability.

Brain function is consistent with its structure. In contrast to knowledge representations based on symbols and probabilities, the brain achieves informational organization, inference, and reasoning through the dynamic evolution of sophisticated spatiotemporal networks. Such plastic, dynamic, and nonlinear brain networks are the structural basis of the robust cognitive function of the brain, and are also a starting point in the development of brain-inspired intelligence.

Memory and knowledge migration
Memory, an important function of the nervous system, is the neural activity in the brain responsible for storage and reproduction of acquired information and experience. The brain’s memory system can be divided into three categories: instantaneous, short-term, and long-term. Instantaneous memory is temporary storage in sensory organs. Short-term memory maintains information, performs fine processing, and rehearses new information from sensory buffers. Long-term memory comprises the accumulation of individual experience and the development of cognitive ability.

The brain receives internal and external information collected from sensory organs and receptors, and then makes use of the knowledge memorized by the nervous system to complement, interpret, and judge the collected information. Due to unavoidable “noise” in the signals and incomplete observations, the brain has to correct and complement faulty or missing information based on its memories through different cognitive processing stages. In order to make adaptive behavior decisions, the nervous system must form an internal model that reflects the history of environmental changes and choose whether or not to modify or consolidate this model to form the long-term memory. Compared with the storage mechanisms of current computer systems, neural memory is characterized by the following features: a distributed structure for knowledge representation and storage, engram cell connectivity as a substrate for memory storage and retrieval, and tight coupling of memory and information processing (4, 5).

In the brain, memory is seamlessly integrated with the dynamic processes of its functional and effective connections throughout the entire cognitive information processing system, where relevant information is maintained, interpreted, judged, retrieved, revised, consolidated, and completed. “Knowledge migration” is the ability of humans to learn about new concepts by applying past knowledge to the new situation. More specifically, when the brain needs to learn a new model or respond to an unforeseen situation, it can mimic an existing, familiar memory, thereby executing different levels of knowledge transfer and achieving robust problem-solving ability.

FIGURE 1. Brain network connections. Left, structural connections; center, functional connections; right, effective connections.

Visual perception and multiscale spatiotemporal scene learning

Since about 80% of the external information perceived by the human brain is obtained through vision, visual perception mechanisms are of great interest in the study of brain cognition. Different properties of visual information are transferred and processed by various interacting pathways that make up the brain’s functional connectivity. There are three major properties of visual cognition that interest us here. The first is related to studies of visual cognition showing that neural coding mechanisms cannot be detected from a single cell. Moreover, single neurons are incapable of representing knowledge and associated memory independently. Instead, in visual cognition, mental concepts are represented by the joint activation of groups or assemblies of cells, with each cell participating in diverse assemblies at different times. This property greatly reduces the required scale of neuronal networks and improves the flexibility of neuronal representation. In neurobiology, this well-known concept is referred to as “population or assembly coding.”
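A toy numerical illustration of population coding: below, the direction of a stimulus is decoded from the joint activity of many cosine-tuned units, none of which is informative on its own; the tuning curves and noise model are invented for the example.

import numpy as np

# Population-vector decoding: each unit has a preferred direction and
# cosine tuning; the stimulus is read out from the joint activity of
# the whole population, not from any single cell.

rng = np.random.default_rng(0)
n_units = 64
preferred = np.linspace(0, 2 * np.pi, n_units, endpoint=False)

def population_response(stimulus_angle):
    rates = np.clip(np.cos(preferred - stimulus_angle), 0, None) * 20.0
    return rng.poisson(rates)  # noisy spike counts

def decode(counts):
    vec = counts @ np.column_stack([np.cos(preferred), np.sin(preferred)])
    return np.arctan2(vec[1], vec[0]) % (2 * np.pi)

true_angle = 1.2
counts = population_response(true_angle)
print("decoded:", round(float(decode(counts)), 2), "true:", true_angle)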
A second characteristic of visual cognition is its dynamic and selective nature, whereby it can selectively focus on a discrete aspect of information while ignoring other perceivable information. The third characteristic is the adaptability of visual cognition, whereby it can make functional changes based on interactions with the outside world.

In addition, visual cognition does not function independently of the environment, but interacts with it as part of a more complex cognitive behavior. In primary vision, while local perception components might compete (e.g., focusing on different visual items of interest), on a larger, more global scale they cooperate to bring about acute visual perception. This feature conforms to the “small world” properties of brain networks, and shows consistency between structure and function. The local–global interaction behavior in visual cognition necessitates that multiscale spatiotemporal features be integrated into visual information processing. Recently, these multiscale spatiotemporal features have been successfully applied to visual salience detection and collaborative tracking (6, 7). In these studies, a set of multiscale features was used to describe a salient object locally, regionally, and globally (Figure 2A), while different visual features were learned from sample sets with different timescales to track objects in challenging scenarios (Figure 2B).

Visual cognition is not merely a passive response to the environment, but an active behavior. Biological visual channels use a “bottom-up” process to form an initial recognition result for visual objects, and a reverse “top-down” process to control eye movement and attention to complete “forecast-verification” cognitive processes. Humans have the ability to use memory information to search for a specific target in complex environments and to process target information. Selective attention can be divided into two levels, (1) primary attention systems and (2) advanced attention systems, based on whether or not it is independent of semantics and content.

Framework and components of brain-inspired visual cognitive computing
A robust visual cognitive computing system should embrace the latest developments in neuroscience and the cognitive and information sciences (8). A framework for visual cognitive computing is presented (Figure 3), based on the following fundamental principles:

(1) Computation must be performed close to the sensors to enable configurable and programmable image data delivery that supports attention-driven information acquisition;

(2) A scalable and flexible network will mimic the structural connections of the human brain, using either static or dynamic task mapping and scheduling strategies, and associated routing schemes. These strategies will also mimic the brain’s functional connections, using adaptive routing path selection, dynamic link-bandwidth assignment, and priority-based resource allocation. This configuration allows functional and effective connections to be adjusted and modified according to the different requirements of applications and tasks;

(3) Content-based data retrieval schemes are used to manage and organize distributed memory units and to record the routes and orders of the intermediate nodes during neural activity transmission;

(4) Convergent–divergent connections in the functional and effective networks will accomplish population coding and associative memory; and

(5) Visual cognition tasks can be performed by designing a multiscale, spatiotemporal dynamic relational network.

FIGURE 2. (A) Multiscale salience detection in an image sequence. Modified from (6). (B) Multi-timescale collaborative tracking (7). CRF, conditional random field; SL, long-term target samples; SM, medium-term target samples; SS, short-term target samples; VL, long-term variables; VM, medium-term variables; VS, short-term variables; t, time.

The proposed framework contains the following major components:

(1) Event-/attention-driven information acquisition: Accomplishes the “bottom-up” process of passive information acquisition, such as changes of optical flow caused by motion, as well as the “top-down” process of active information retrieval, such as concept- or empirical knowledge-based object location (6, 7).

(2) Spatiotemporal coding: Allows rapid functional connection and reorganization to achieve contextual (or task-related) representation by applying temporal- or rate-coding for spiking signals. This enables visual information representation and management, and uses spatiotemporal nonlinear mapping to achieve spiking activity propagation, transformation, and migration (9).

(3) Networked dynamic information processing: Various features of visual objects, such as shape, color, motion, and stereo view, are transmitted and processed in parallel and at multiple scales using hierarchical and scalable network structures to emulate the structural, functional, and effective connections of the human brain. Resource allocation is carried out with specific learning (or tuning) rules that mimic the synaptic plasticity of biological neural systems. Resilience against unanticipated events and uninformative background input, as well as recovery from failures, is required throughout the entire system design (10).

FIGURE 3. Diagram of proposed brain-inspired visual cognitive computing. A modularized, flexible, scalable, and power-efficient computing architecture is achieved using the proposed model for the high-level synthesis of the computational requirements of desired functionalities and the essential communication between these functionalities in order to carry out different tasks, together with a deeper understanding of the effective connections being emulated. Other critical features include systematically mapping logical neural networks to the physical neurosynaptic cores, logically scheduling execution orders, adaptively applying intra- and interchip routing algorithms and flow control schemes, and deliberatively designing a reconfigurable microarchitecture neurosynaptic core. These characteristics are crucial to fulfilling the functionality of the system. EPSP, excitatory postsynaptic potential; GALS, globally asynchronous locally synchronous; SRAM, static random-access memory.

(4) Distributed associative memory: A distributed context-based memory management scheme that performs population coding for visual concept representations and associative retrieval of memories. This component autonomously facilitates information maintenance, interpretation, judgment, correction, reinforcement, and integration, using associative memory that is seamlessly integrated with the processing units (a simple illustration follows this list).

(5) Constraints and guidance of conditions: This component implements an intelligent control unit during cognition processing to complement information that is lost by the projection of a 3D scene onto a 2D image sensor. It also reduces uncertainty during information integration across multiple hierarchies and modules to generate a reasonable visual representation, which is accomplished by taking full advantage of conditions, prior knowledge, and hypotheses from the real world.
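The associative memory of component (4) can be illustrated with a classical Hopfield-style network, which stores patterns in a distributed weight matrix and retrieves a complete pattern from a corrupted cue; the sketch below is a generic textbook construction, not the architecture proposed here.

import numpy as np

# Hopfield-style associative memory: patterns are stored in a distributed
# weight matrix (Hebbian rule) and recalled from noisy or partial cues.

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 100))           # three stored memories

W = sum(np.outer(p, p) for p in patterns) / patterns.shape[1]
np.fill_diagonal(W, 0.0)                                 # no self-connections

def recall(cue, steps=10):
    state = cue.copy()
    for _ in range(steps):                               # synchronous updates
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

cue = patterns[0].copy()
flip = rng.choice(100, size=20, replace=False)           # corrupt 20% of bits
cue[flip] *= -1
print("overlap after recall:", recall(cue) @ patterns[0] / 100.0)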
In order to continually improve and comprehensively verify this framework, it will be necessary to deploy visual cognitive computing in a complex, real-world environment. A self-driving, unmanned vehicle is a typical scenario that would require simultaneous execution of various complicated visual cognition tasks in real time. These include unstructured lane detection; pedestrian, vehicle, and traffic sign identification; and detection and tracking of multiple obstacles. Self-driving vehicles therefore represent an ideal platform for the development and evaluation of visual cognition systems (11).

Conclusions and perspectives
Visual cognitive computing is playing an increasingly important role in the development of intelligent autonomous systems and human-level robots. Developing a robust computing architecture inspired by brain networks, associative memory, and human visual perception is therefore a crucial area of brain-inspired robotics. We have proposed a framework and components of such a system on the basis of previous research on multiscale spatiotemporal scene perception and understanding. Going forward, we will begin to investigate large-scale, compact, and reconfigurable visual cognitive computing architectures and their implementation.

References
1. P. Wei, Y. Zhao, N. Zheng, S.-C. Zhu, IEEE Trans. Pattern Anal. Mach. Intell. (2016), doi: 10.1109/TPAMI.2016.2574712.
2. H. J. Park, K. Friston, Science 342, 1238411-1–1238411-8 (2013).
3. M. W. Cole et al., Nat. Neurosci. 16, 1348–1355 (2013).
4. Y. Li, N. Zheng, Prog. Nat. Sci. 8, 228–236 (1998).
5. R. Semon, The Mneme (George Allen & Unwin, London, 1921).
6. T. Liu et al., IEEE Trans. Pattern Anal. Mach. Intell. 33, 353–367 (2011).
7. D. Chen, Z. Yuan, G. Hua, J. Wang, N. Zheng, IEEE Trans. Pattern Anal. Mach. Intell. (2016), doi: 10.1109/TPAMI.2016.2539956.
8. P. A. Merolla et al., Science 345, 668–673 (2014).
9. L. Li et al., IEEE Trans. Neural Syst. Rehabil. Eng. 21, 532–543 (2013).
10. P. Ren et al., IEEE Trans. Comput. 65, 353–366 (2016).
11. N. Zheng et al., IEEE Intell. Syst. 19, 8–11 (2004).

Acknowledgments
This work was supported by the National Natural Science Foundation (61231018) and the National Basic Research Program (973 Program) (2015CB351703) of China.

Brain-like control and adaptation for intelligent robots

Yunyi Jia1, Jianguo Zhao2, Mustaffa Alfatlawi3, and Ning Xi4*

1Department of Automotive Engineering, Clemson University, Greenville, South Carolina, USA
2Department of Mechanical Engineering, Colorado State University, Fort Collins, Colorado, USA
3Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, USA
4Emerging Technologies Institute and Department of Industrial and Manufacturing Systems Engineering, University of Hong Kong, Hong Kong
*Corresponding Author: xining@hku.hk

Traditional industrial robots have historically been caged in structured environments where they accomplish simple and repetitive tasks. In contrast, the current generation of robots is required to be more intelligent so as to handle more complicated and dexterous tasks, and to work with humans in unstructured environments. These requirements bring new challenges for robots, such as the need for a higher degree of control, sensing, and decision-making. In this review, we discuss brain-like robot control and adaptation approaches for moving toward truly intelligent robotics.

Intelligent robots require brain-like perception and control. One particular aspect of these features is the ability to employ vision as feedback to control a robot’s motion. Traditional vision-based control methods need to extract and track geometric features in an image (1), which is not a brain-like function. In fact, insects and animals directly employ image-intensity information in controlling their motion to perform various tasks such as docking, landing, or obstacle avoidance (2).

To achieve brain-like control, we propose a direct vision-based control approach that leverages all the intensities or luminance values in an image to form an image set. By considering an image as a set, the approach formulates the vision-based control problem in the space of sets (3). Then, we can obtain a controller to steer the current image set to the desired image set. This derivation is performed in the space of sets, where the linear structure of the vector space no longer holds. We therefore refer to this approach as “non-vector space control.”

Recently, we have validated the non-vector approach using images obtained by atomic force microscopy (4, 5). We have also shown that the approach works even when only partial image information is known (6, 7), which can significantly increase the computation speed. With such a controller, a tailed robot can direct its midair orientation once it falls from a certain height (8). Here, we will briefly illustrate an alternate function: how the non-vector method could be leveraged to control the motion of a robotic manipulator (9).

FIGURE 1. Experimental setup for non-vector based control.

Since intelligent robots are also required to work with humans, it is crucially important to consider the safety and efficiency of human–robot interaction. When humans work with each other, they can adapt to their coworkers to ensure cooperative performance. To imitate this function, we propose a brain-like robot adaptation approach and apply it to a specific human–robot interaction field: telerobotic operations.

Previous studies in telerobotic operations have focused mainly on two issues—stability and telepresence. The first issue concerns the stability of telerobotic systems in the presence of various system constraints such as communication delays, packet losses, and unstructured environments (10, 11). The second issue examines how teleoperators control robots in order to visually perceive the remote environment in an efficient way while presenting a sympathetic visage (12, 13). Most of these studies were based on the assumption that the human teleoperator is always fully able to control operations. This assumption, however, does not always hold true. For instance, the teleoperator may be suffering from drowsiness, deprivation, or even inebriation. Under such conditions, teleoperation safety and effectiveness may be seriously affected due to the operator’s decreased capacity. However, previous teleoperation studies have rarely considered these possibilities.

To enhance the performance of telerobotic operations, we first employed brain-like neural networks to identify the operational status of the teleoperator based on their online brain electroencephalogram (EEG) signals. We then designed a brain-like robot adaptation to adjust the robot behaviors in response to these signals. We proposed the “quality of teleoperator” (QoT) concept to represent the teleoperator’s operational status. Based on QoT, a robot adaptation mechanism can be designed to adjust the robot’s velocity and responsivity. Our experimental results, discussed below, have demonstrated that the efficiency and safety of telerobotic operations can be improved through such brain-like robot adaptation approaches.

Brain-like robot control for intelligent robotics
In our model, the non-vector space control approach considers the state of the system as a set instead of a vector. A system in the non-vector space can be described as φ(x(t), u(t))∈Ṡ(t), where Ṡ(t), which is different from traditional derivatives with respect to time, is the set of all bounded Lipschitz functions satisfying particular conditions (9). Here, u(t) = γ(S(t)) is the feedback mapping from the current feedback set S(t). Under this general framework, the stabilization problem in the non-vector space can be stated as follows: Given a controlled dynamics system φ(x(t), u(t))∈Ṡ(t), a constant desired set Ŝ, and an initial set S0 in the vicinity of Ŝ, a feedback controller u(t) = γ(S(t)) should be designed such that the feedback set S(t) will approach Ŝ asymptotically. To address this problem, we have devised the following theorem: for the system f(x)u∈Ṡ(t), with f(x)∈Rm×n, x∈Rm, u∈Rn, and S ⊂ Rm, a feedback controller built from a descent term D(S)∈Rn, where dŜ(x) denotes the projection of a point x onto the set Ŝ, can locally asymptotically stabilize the system at Ŝ; the detailed form of D(S), which depends on f and the projections dŜ(x), can be found in (9).
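The flavor of such a set-based law can be conveyed with a toy planar example: the current image is treated as a point set, each point is projected onto the desired set, and a common translation command descends the resulting error. The code below is a schematic stand-in for, not a reproduction of, the controller and D(S) defined in (9); note that the initial set is placed near the desired one, matching the local-stability statement.

import numpy as np

# Schematic set-based servoing in 2D (translation only). The "state" is a
# point set S; the error uses per-point projections onto the desired set
# S_hat, and the command u descends that error.

def nearest_points(S, S_hat):
    # Projection of each point of S onto the finite set S_hat.
    d = ((S[:, None, :] - S_hat[None, :, :]) ** 2).sum(-1)
    return S_hat[d.argmin(axis=1)]

S_hat = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # desired set
S = S_hat + np.array([0.2, -0.1])                         # start near S_hat

gain = 0.5
for _ in range(30):
    proj = nearest_points(S, S_hat)
    u = -gain * (S - proj).mean(axis=0)   # common translation command
    S = S + u                             # apply control to every point

print("residual error:", float(np.abs(S - S_hat).max()))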
The vision-based control problem can be modeled by the system in the above theorem, and the controller can be readily applied. In fact, for servoing with greyscale images, each pixel can be represented by a 3D vector x = [x1, x2, x3]T, where x1 and x2 are the pixel indices and x3 is the pixel intensity. For a general visual servoing problem, the control input is the camera’s spatial velocity. Therefore, the control input u(t) has three translational components and three rotational components, which can be represented by a vector u(t) = [vx, vy, vz, ωx, ωy, ωz]T.

The system in the theorem is determined by φ(x), which is in turn determined by the relationship between u(t) and x(t). The perspective-projection sensing model in computer vision can be used to derive this relationship. Under a constant lighting condition, x3 will be a constant for each pixel; therefore, ẋ3 = 0. With a unit camera focal length, a 3D point with coordinates P = [px, py, pz]T in the camera frame will be projected onto the image plane with coordinates x1 = px/pz, x2 = py/pz. Based on these equations, the relation between u(t) and x(t) can be obtained as ẋ(t) = L(x(t))u(t), where

L(x) = [ −1/pz    0       x1/pz    x1x2       −(1 + x1²)    x2
          0      −1/pz    x2/pz    1 + x2²    −x1x2         −x1
          0       0        0        0           0             0  ]

Note that the first two rows are the same as the interaction matrix in visual servoing, while the third row is zero because ẋ3 = 0. Since ẋ(t) = L(x(t))u(t) has the same form as the system in the theorem, the controller can be readily applied. To simplify the problem, however, we only allow the camera attached to the end effector to have planar motion (i.e., two translations plus a rotation about its optical axis).
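Since the first two rows of L(x) are the classical point-feature interaction matrix, a conventional discrete-time visual servoing step can be written directly; the sketch below stacks those rows for a few image points and commands u = −λ·pinv(L)·error, a generic textbook construction rather than the NSVS implementation itself (the depth pz and the gain are arbitrary).

import numpy as np

# One image-based visual-servoing step: stack the 2x6 interaction-matrix
# rows for each tracked image point and command u = -lam * pinv(L) @ error.

def interaction_rows(x1, x2, pz):
    return np.array([
        [-1.0 / pz, 0.0, x1 / pz, x1 * x2, -(1.0 + x1 ** 2), x2],
        [0.0, -1.0 / pz, x2 / pz, 1.0 + x2 ** 2, -x1 * x2, -x1],
    ])

current = np.array([[0.10, 0.05], [-0.20, 0.15], [0.05, -0.10]])
desired = np.array([[0.00, 0.00], [-0.25, 0.10], [0.10, -0.05]])
pz, lam = 1.5, 0.8                        # assumed constant depth and gain

L = np.vstack([interaction_rows(x1, x2, pz) for x1, x2 in current])
error = (current - desired).ravel()
u = -lam * np.linalg.pinv(L) @ error      # [vx, vy, vz, wx, wy, wz]
print("camera velocity command:", np.round(u, 3))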

The non-vector space control method was implemented on a redundant seven-degrees-of-freedom manipulator (the SCHUNK LWA3), which has seven revolute joints. The experimental setup is shown in Figure 1, where an ordinary complementary metal-oxide semiconductor (CMOS) camera is rigidly attached to the end effector. As in our preliminary experiment, a planar object with a rectangle, a circle, and a triangle printed on white paper is placed on the ground. First, the manipulator is moved to a desired position, where a desired image is acquired. The desired position is obtained using the manipulator’s forward kinematics. Then the manipulator is moved to some initial position, where the initial image is recorded. The manipulator will then start moving according to the non-vector space visual servoing (NSVS). The experimental results for this strategy are shown in Figure 2, where the initial and desired images are shown in panels A and B, respectively. The task-space errors for two translations and one rotation are shown in panel C; the rotation error is plotted in degrees. From the plot, there are some overshoots in the system response, but the controller is able to stabilize the system at the desired position.

FIGURE 2. Experimental results of non-vector based control. (A) Initial image. (B) Desired image. (C) Error in the task space.

Brain-like robot adaptation for intelligent robotics

Brain-like learning of QoT
QoT reflects the operational status of the teleoperator. It is impossible to use only one state of the teleoperator to reproduce this quality. Thus we have defined a concept we call the “QoT indicator.” Each QoT indicator is designed to represent a specific state of the teleoperator. These indicators may measure the internal mental states of the teleoperator, such as fatigue and excitement, as well as external physiological states such as blood pressure and heart rate. In our previous work (14), we proposed using the mental state of the teleoperator to identify QoT. Five mental QoT indicators were designed: long-term excitement, short-term excitement, boredom, meditation, and frustration. In our more recent research (15), we proposed the pleasure, arousal, and dominance (PAD) model to construct a more comprehensive representation of the internal mental states.

Since the relationship between QoT and the QoT indicators involves a complex web of in-depth knowledge in various fields such as psychology and physiology, it is difficult or impossible to use an analytic model to represent this relationship. Fortunately, brain-like neural networks provide an alternative way to approximate the relationship. In this case, there are three main steps in the identification model: EEG signal acquisition, QoT indicator generation, and QoT generation. EEG signal acquisition is realized by an Emotiv EPOC brain headset that has 14 saline sensors to collect the EEG signals from the channels, namely AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4. QoT indicator generation is realized using an affectivity software development kit provided by Emotiv, which can recognize the five QoT indicator values by analyzing the 14 EEG signals. The QoT generation is realized through a B-spline neural network. The detailed process of generating this model has been described previously (14, 16, 17).

In our recent research (15), we showed that the emotional state space, which is spanned by PAD states, can be used to identify QoT from a psychological perspective. The identification is achieved in two stages. In the first stage, the 14 EEG signals are used to estimate the PAD states. EEG-based fractal dimensions are extracted as complexity-based features, and a Gaussian process (GP) regression model is fitted to these features to estimate the PAD states, after learning the GP regression model’s covariance kernel using deep belief nets (DBNs). In the second stage, the estimated PAD states are used as features to identify QoT by fitting another GP regression model with a spherical Gaussian kernel.
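As an illustration of the second stage only, the snippet below fits a GP regression from three PAD values to a scalar QoT using scikit-learn's RBF ("spherical Gaussian") kernel; the training data are synthetic placeholders rather than values from the cited studies.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Illustrative stage 2 of a QoT pipeline: map estimated PAD states
# (pleasure, arousal, dominance) to a scalar QoT via GP regression.

rng = np.random.default_rng(2)
pad_train = rng.uniform(-1, 1, size=(40, 3))             # synthetic PAD states
qot_train = 0.5 + 0.3 * pad_train[:, 0] - 0.2 * np.abs(pad_train[:, 1])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
gp.fit(pad_train, qot_train)

pad_now = np.array([[0.4, -0.1, 0.2]])                   # current PAD estimate
qot, std = gp.predict(pad_now, return_std=True)
print("QoT estimate: %.2f +/- %.2f" % (qot[0], std[0]))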
Brain-like robot adaptation according to QoT
The framework delineating how robots adapt during teleoperation according to QoT is shown in Figure 3. The perceptive planning and control method and a QoT-based adjustment mechanism are designed to construct the teleoperation system. The perceptive planning and control method is applied to guarantee the stability of the system under network delays (11). However, the efficiency and safety of telerobotic operations may still be affected by the quality of the teleoperator. In particular, the effects of low QoT become more evident when the teleoperator is performing tasks that require high dexterity.

FIGURE 3. Robot adaptation according to QoT in teleoperation. EEG, electroencephalogram; QoT, quality of teleoperator; e(s), system errors; y, system output; s, perceptive reference; +, positive; –, negative.

In order to address this issue, we have proposed a QoT-based adjustment mechanism to imitate human adaptation. QoT is generated online and then sent to the robotic system together with the teleoperator’s commands. Based on QoT, the parameters of the robot planner and controller can be adjusted to assign an appropriate velocity and responsivity to the robotic system. Intuitively, with a high QoT, the robotic system needs to perform quickly and responsively so that the teleoperator can efficiently control the robot and, in turn, achieve the teleoperation goal as quickly as possible. On the other hand, if QoT is low, the robotic system needs to perform relatively slowly, with a decreased responsivity, so as to reduce or avoid the potential impact of insufficient or incorrect commands from the teleoperator.

In addition, making adjustments solely based on QoT does not always make for more efficient operations. For instance, when the teleoperator is completing simple tasks, such as driving a robot along a straight line at a low speed, a low QoT may be sufficient for the teleoperator to accomplish the task. To address this issue, in the QoT adaptive control design we have introduced a concept we call the “task dexterity index” (TDI) (16, 17) to represent the dexterity and complexity of the task. A collection of task indicators is generated online based on the teleoperator’s control commands and the sensory information from the robot. By using a fuzzy inference scheme, these task indicators are mapped into a normalized value representing the TDI.

In order to enhance teleoperation performance, robot performance can be adjusted according to both QoT and TDI, so that the teleoperator can safely and efficiently operate the robot to accomplish different tasks under different operation statuses. The adjustment to the robot comprises two parts: velocity adjustment and responsivity adjustment. For the robot velocity adjustment, we use the velocity gain ki,s, which maps the teleoperator’s commanded velocity to the robot velocity in the ith direction. The design adopts a linear mapping function described by vi,r = ki,s vi,tele, where vi,r and vi,tele are the robot velocity and the teleoperator’s commanded velocity in the ith direction, respectively. A larger velocity gain ki,s leads to a relatively larger robot velocity, and a smaller velocity gain leads to the inverse.

For the robot responsivity adjustment, the controller gains of the robotic system are used. For a robotic system where nonlinearities exist, by applying nonlinear and decoupling techniques such as nonlinear feedback control (18), the system models can be linearized and decoupled into several independent linear subsystems. A proportional-derivative (PD) controller can then be introduced to control the subsystems. For each controller there are two gains, Ki = [ki1, ki2]. These gains are proportional to the robot responsivity, which makes them applicable for the responsivity adjustment.

Initially, a set of nominal gains, K̃i, is predefined for both the velocity gains and the controller gains. The nominal velocity gains are chosen as 1, and the nominal controller gains are chosen to achieve the best possible responsivity of the robotic system based on experimental tests. Then, based on the QoT and TDI, the desired gains are computed from the nominal gains by a three-part rule in which γ, DH, and DL are threshold constants and λ is a weight that adjusts the influence of the QoT on the robot controller gains; these parameters are all defined based on the teleoperation requirements. The first part of the rule indicates that the robot should perform as rapidly as possible, with high responsivity, when the QoT is high enough or the task is simple enough. The second part indicates that the robot should perform suitably fast or slow, with suitable responsivity, according to both the QoT and TDI, when the QoT is low and the task is intermediately complicated. The third part indicates that the robot should perform with an appropriately low velocity and low responsivity, adapting only to QoT, when it is low and the task is very complicated.
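One plausible reading of this three-part rule is sketched below; the thresholds, blending weight, and scaling law are invented for illustration, since the exact published formula appears in (16).

import numpy as np

# Illustrative QoT/TDI gain scheduling: scale nominal gains K_nom by a
# factor in (0, 1]. Thresholds d_high/d_low and weight lam are made up.

def scheduled_gains(k_nom, qot, tdi, d_high=0.8, d_low=0.3, lam=0.6):
    if qot >= d_high or tdi <= d_low:
        scale = 1.0                                  # high QoT or simple task
    elif qot > d_low:
        scale = lam * qot + (1 - lam) * (1 - tdi)    # blend QoT and TDI
    else:
        scale = qot                                  # low QoT, complex task
    return np.asarray(k_nom) * max(scale, 0.1)

print(scheduled_gains([1.0, 0.5], qot=0.9, tdi=0.7))   # -> nominal gains
print(scheduled_gains([1.0, 0.5], qot=0.5, tdi=0.7))   # -> blended scaling
print(scheduled_gains([1.0, 0.5], qot=0.2, tdi=0.9))   # -> QoT-only scaling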
Once the desired controller gains are computed based on QoT and TDI, to ensure system performance they are then examined to determine whether they lie in the efficient region [defined in (16)]. If they lie in the region, the desired gains are set as the controller gains. Otherwise, the gains that lie in the region and are nearest to the desired gains in the Euclidean space of the ki1–ki2 responsivity plane are selected by solving a quadratic programming problem (16).

The designed approaches were implemented on an Internet-based mobile manipulator teleoperation system for three types of tasks: manipulation, navigation, and mobile manipulation (16). Our results demonstrated that when robot self-adaptation was implemented according to QoT, the time cost, collision count, and number of attempts for accomplishing the three types of tested tasks were all reduced, by 45%, 80%, and 40%, respectively. In addition, with this adaptation design, teleoperators found that the robot's operations became easier and smoother due to the active adaptation of the robot to the human's operation status.

Conclusions and future directions
In summary, brain-like robot control and adaptation approaches have been developed to advance intelligent robotics. In the brain-like, non-vector based control approach, control of the robot is achieved using sets instead of vectors. By applying this approach to visual servo control of robotic manipulators, it is possible to eliminate the requirement for image-feature extraction and tracking. In the brain-like adaptation approach, the robot is able to discern the operational status of humans and can then actively adjust its behaviors based on that status to enhance the safety and efficiency of telerobotic operations. Our experiments in applying these two approaches to real robotic systems illustrated their effectiveness and advantages in making robotic systems more intelligent.

For the successful application of robotics to intelligent manufacturing, several important issues will need to be addressed. First, robotic systems will need to be more flexible and reconfigurable. Existing fixed-base industrial robots will no longer satisfy the demands of modern markets. In the future, robotic systems will have more scalable workspaces, more flexibility for accomplishing various tasks, and more adaptability to various environments. Second, more advanced sensing systems will be required for intelligent robotics. Stable and reliable sensing systems such as vision, force, haptic, and tactile sensors will need to be integrated into robotic systems to allow them to perform various tasks in unstructured environments. Third, safety and efficiency concerns regarding human–robot interaction will need to be dealt with. In the future, robots will be mostly uncaged and thus will frequently work alongside humans. Therefore, safety considerations will be critical for protecting humans from robots, and vice versa. Developing efficient approaches, such as using natural language, gestures, physical contact, and even connectivity to the human mind, will be essential to give humans the capability to communicate with robots naturally and efficiently, and to enable robots to actively learn from humans.

References
1. S. Hutchinson, G. D. Hager, P. I. Corke, IEEE Trans. Rob. Autom. 12, 651–670 (1996).
2. É. Marchand, F. Chaumette, Rob. Auton. Syst. 52, 53–70 (2005).
3. L. Doyen, J. Math. Imaging Vis. 5, 99–109 (1995).
4. J. Zhao, B. Song, N. Xi, K. Wai, C. Lai, 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC) (IEEE, Orlando, FL, December 2011), pp. 5683–5688.
5. J. Zhao et al., Automatica 50, 1835–1842 (2014).
6. J. Zhao, N. Xi, L. Sun, B. Song, 2012 IEEE 51st Annual Conference on Decision and Control (CDC) (IEEE, Maui, HI, December 2012), pp. 5685–5690.
7. B. Song et al., IEEE Trans. Robot. 30, 103–114 (2014).
8. J. Zhao, H. Shen, N. Xi, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, Hamburg, Germany, September–October 2015), pp. 2154–2159.
9. J. Zhao et al., 2012 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (IEEE, Kaohsiung, Taiwan, July 2012), pp. 87–92.
10. R. J. Anderson, M. W. Spong, IEEE Trans. Automat. Contr. 34, 494–501 (1989).
11. N. Xi, T. J. Tarn, Rob. Auton. Syst. 32, 173–178 (2000).
12. I. Elhajj et al., Proc. IEEE 91, 396–421 (2003).
13. T. Itoh, K. Kosuge, T. Fukuda, IEEE Trans. Rob. Autom. 16, 505–516 (2000).
14. Y. Jia, N. Xi, Y. Wang, X. Li, 2012 IEEE International Conference on Robotics and Automation (ICRA) (IEEE, St. Paul, MN, May 2012), pp. 451–456.
15. M. Alfatlawi, Y. Jia, N. Xi, 2015 IEEE International Conference on Robotics and Biomimetics (ROBIO) (IEEE, Zhuhai, China, December 2015), pp. 470–475.
16. Y. Jia et al., Int. J. Robot. Res. 33, 1765–1781 (2014).
17. Y. Jia, N. Xi, F. Wang, Y. Wang, X. Li, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, San Francisco, CA, September 2011), pp. 184–189.
18. T. Tarn, A. K. Bejczy, A. Isidori, Y. Chen, The 23rd IEEE Conference on Decision and Control (IEEE, Las Vegas, NV, December 1984), pp. 736–751.

Acknowledgments
This research work was partially supported under U.S. Army Research Office Contract No. W911NF-11-D-0001 and by grants from the U.S. Army Research Office (W911NF-09-1-0321, W911NF-10-1-0358, and W911NF-16-1-0572).

Neurorobotics: A strategic pillar of the Human Brain Project

Alois Knoll1* and Marc-Oliver Gewaltig2

1Technical University of Munich, Department of Informatics, Chair of Robotics and Embedded Systems, Garching bei München, Germany
2Ecole Polytechnique Fédérale de Lausanne, Blue Brain Project, Campus Biotech, Geneva, Switzerland
*Corresponding Author: knoll@in.tum.de

Neurorobotics is an emerging science that studies the interaction of brain, body, and environment in closed perception–action loops, where a robot’s actions affect its future sensory input. At the core of this field are robots controlled by simulated nervous systems that model the structure and function of biological brains at varying levels of detail (1). In a typical neurorobotics experiment, a robot or agent will perceive its current environment through a set of sensors that will transmit their signals to a simulated brain. The brain model may then produce signals that will cause the robot to move, thereby changing the agent’s perception of the environment.

Observing how the robot then interacts with its environment, and how the robot’s actions influence its future sensory input, allows scientists to study how brain and body have to work together to produce the appropriate response to a given stimulus. Thus, neurorobotics links robotics and neuroscience, enabling a seamless exchange of knowledge between the two disciplines.

Here, we provide an introduction to neurorobotics and report on the current state of development of the European Union–funded Human Brain Project’s (HBP’s) Neurorobotics Platform (2, 3). HBP is Europe’s biggest project in information and communication technologies (ICT) to date (www.humanbrainproject.eu) and is one of two large-scale, long-term flagship research initiatives selected by the European Commission to promote disruptive scientific advances in future key technologies. It will have a duration of 10 years and deliver six open ICT platforms for future research in neuroscience, medicine, and computing, aimed at unifying the understanding of the human brain and translating this knowledge into commercial products.

History and current status of neurorobotics
One of the first researchers to apply the concepts of neurorobotics was Thomas Ross, who in 1933 devised a robot with a small electromechanical memory cell that could learn its way through a very simple maze via “conditioned reflex” (4). In his 1948 book Cybernetics, Norbert Wiener laid the theoretical foundation for robotics as “the study of control and communication in the animal and the machine” (5). Since then, however, robotics research has focused largely on industrial applications.

Recently, industry has again been advancing technology by building robotic bodies that are softer, more flexible, and more durable than before. However, despite numerous attempts, no robot today matches the cognitive or behavioral abilities of even the simplest animals, let alone humans. What is still missing after decades of effort? Robots still lack a brain that allows them to learn to adapt to and exploit the physics of their bodies.

Today, there are two main schools of thought in neurorobotics research. The first focuses on biologically inspired robots with bodies, sensors, and actuators that mimic structural and functional principles at work in the bodies and organs of living creatures. The second concentrates on brain-inspired control architectures.

Biologically inspired robots are adaptable and can display rich perceptual and behavioral capabilities. In contrast to industrial robots, they often use compliant materials that make their mechanics intrinsically flexible. Examples of advanced biology-inspired humanoid robots are the iCub, a humanoid robot “child” (6); Kojiro, a humanoid robot with about 100 artificial muscles (7); and ECCEROBOT (Embodied Cognition in a Compliantly Engineered Robot), a humanoid upper torso that attempts to replicate the inner structures and mechanisms of the human body in a detailed manner (8). More recently, engineers have created Roboy, a soft human body with a spine and a head that can display several emotional patterns (9); the ASIMO robot (10); and, most recently, Atlas, the latest agile anthropomorphic robot developed by Boston Dynamics (11). The last two are constructed from hard materials and implement research in machine walking. Such advances in multilegged robots have led to a renewed interest in applications for the military [BigDog, also from Boston Dynamics (12)]; disaster management [iRobot’s PackBot (13), a snake-like robot developed by Hitachi-GE Nuclear Energy and IRID (International Research Institute for Nuclear Decommissioning) (14), and a scorpion-like robot from Toshiba (15)]; and entertainment [the NAO and Pepper robots from Aldebaran Robotics (16)]. Researchers have also developed robots that can mimic the characteristics of animals, such as locomotion in snakes and spiders, whisking in rodents, or jumping in frogs.

Brain-inspired control architectures are robotic control systems that on some level reflect properties of animal nervous systems. In general, they are tailor-made for a specific set of tasks, often using a combination of artificial neural networks, computer vision/audition, and machine learning algorithms. Early work in this field was done by Miyamoto et al. (17), who developed an artificial neural network with a brain-inspired hierarchical architecture that could control an industrial manipulator by acquiring a dynamical model of the robot through the “feedback-error-learning” method. Later, Edelman et al. developed the “Darwin” series of recognition automata (18) to study different behavioral phenomena, ranging from pattern recognition to association to autonomous behavior and sensorimotor coordination. Cox and Krichmar derived a robot controller based on the neuromodulatory systems of mammalian brains (19). Burgess et al. used a mobile robot to test a model of the rat hippocampus (20).

FIGURE 1. A schematic of the Human Brain Project showing collaborations between the neurorobotics subproject and other subprojects shown in the list on the left.

Ijspeert’s group investigated low-level mechanisms of locomotion in vertebrates by constructing a salamander robot controlled by central pattern generators (21). Further work has included self-organizing synchronization in robotic swarms (22) and the learning of compositional structures from sensorimotor data (23).

Lund et al. used spiking-neuron models to derive a model for cricket phonotaxis that they evaluated on a mobile robot (24). Since then, spiking neural networks have become a standard tool in brain-based robotics and are used to control complex biomechanical systems such as the musculoskeletal model of the human body developed by Nakamura’s group (25). For example, Sreenivasa et al. published a study on modeling the human arm stretch reflex based on a spinal circuit built of spiking neurons (26). Kuniyoshi et al. developed a fetus model to simulate the self-organized emergence of body maps in a nervous system of spiking neurons (27, 28). Finally, Richter et al. controlled a biomimetic robotic arm with a model of the cerebellum that was executed in biological real-time on the SpiNNaker neuromorphic computing platform (29, 30).

A third line of research that is usually not mentioned along with neurorobotics—but with which it greatly overlaps—is neuroprosthetics, which can in fact be viewed as the translational branch of neurorobotics. To advance neurorobotics further, a close collaboration between neuroscientists, roboticists, and experts in appropriate computing hardware is clearly mandatory.

The Human Brain Project’s Neurorobotics Platform
The Neurorobotics Platform (NRP) is a novel tool that allows scientists to collaboratively design and conduct in silico experiments in cognitive neuroscience, using a neurorobotics-focused approach. The NRP thus plays an integrative role in HBP as it provides the tools for all scientists to study brain models in the context of their natural habitat—the body.

The overarching goal of HBP is to help unify our understanding of the human brain and to use this knowledge to develop new products and technologies. HBP features a huge and diverse spectrum of research and development activities ranging from human and mouse brain-data collection to the discussion of ethical issues arising from brain simulation and brain-based technologies. These activities are organized in a network of closely collaborating subprojects. As shown in Figure 1, the neurorobotics subproject collaborates directly with almost all other subprojects. Six of these subprojects are to develop ICT platforms that will make technology developed within HBP available to the public. The Neuroinformatics Platform will deliver searchable brain atlases to enable access to a vast amount of digitized brain data. The Medical Informatics Platform will gather and process clinical data for research on brain diseases. The Brain Simulation Platform will provide detailed large-scale brain simulations that will run on the supercomputer infrastructure of the High-Performance Analytics and Computing Platform and on the novel neuromorphic hardware of the Neuromorphic Computing Platform. The NRP will connect the brain simulation and computing platforms through an advanced closed-loop robot simulator for neurorobotics experiments whereby robots can be controlled by a simulated brain model.

All these platforms contribute to the virtualization of brain and robotics research. Virtual brain research, when supported by powerful and well-calibrated models, will enable researchers to perform experiments impossible in the real world, such as the simultaneous placement of 1 million probes in a mouse brain. By the same token, the advantage of virtual robotics research is the enormous acceleration of experiment and development speed, alleviating dependency on real hardware in real environments. The possibility for smooth “mode changes” between real-world and virtual design phases is certainly essential, and only when this step becomes simple will an iterative give-and-take of insights between both worlds be possible.

In the remainder of the article we will introduce the “neurorobotics workflow,” a principled step-by-step approach to designing neurorobotics experiments for both roboticists and neuroscientists.

The topics of the following section will cover details about the implementation of this workflow in the NRP, and the underlying tools and software architecture. We will then present a series of pilot experiments that were designed using the NRP to validate the workflow and to benchmark the performance and usability of its first public software release to the scientific community. Finally, we will summarize results achieved so far in the neurorobotics subproject and provide an outlook on the future development of the NRP.

Methodology
The approach for the NRP consists of providing a number of design applications for models (of environments, robot bodies, and brains) and simulation engines that are integrated into a web-based frontend. Using this web frontend, users at different locations can rapidly construct a robot model, its brain-based controller, an environment, and an execution plan. We call this ensemble a “neurorobotics experiment.” The NRP also allows reusing and sharing of previously defined experiments, thus opening a new area of collaborative neurorobotics research.

Only through integration of the appropriate simulation engines can we expect to achieve meaningful results, especially if we want to close the loop between the brain, the robot, and its environment such that the simulated robot perceives its simulated environment through its simulated sensors and interacts with it through its simulated body. The simulation technologies used must provide the highest available fidelity and the best possible match between an observation in the real world and its counterpart in the virtual world.

In brain modeling, there are currently two extremes. The first are models that focus on the functional properties of nervous systems. They define control architectures and neural network models, possibly trained by deep learning algorithms, with the aim of solving a particular set of tasks. Examples are the Spaun model (31) and control architectures commonly found in cognitive robotics. At the other extreme are digital reconstructions of neural circuits (32) or even entire brains of mice and rats, for example, based on experimental data. These models focus foremost on the structural and dynamical details of the reconstructed system and regard brain function as an emergent phenomenon of these brain reconstructions. While many researchers argue in favor of one or the other position, we propose that the most productive route is to combine the two approaches. For example, many theories exist for higher-level brain functions like visual perception, but not all of these theories can be true at the same time. Some may be appropriate for humans, whereas others may be applicable to cats or rodents. The only way to separate suitable theories from less suitable ones is to give researchers a tool that allows them to confront a given theory of brain function with the anatomical and physiological realities of a particular brain embedded in a concrete body, be it mouse, cat, or human. The NRP aims to be such a tool, following the time-tested approach of analysis by synthesis.

A typical neurorobotics experiment might involve the simulation of a rat as it navigates through a maze. In this case, the control architecture for the simulated rat could comprise sensory areas, a hippocampus, and a motor area to generate movements. Traditionally, these components are individually selected and specifically adapted to match the robot, the environment, and the task. The neurorobotics workflow departs from this approach: Rather than designing specific neural control architectures for each robot and each experiment, it provides elements for the rapid design and training of application-specific brains and for embedding these brains in appropriate robotic embodiments, with the theme of “modular brains for modular bodies.” It will incorporate both the experimenter’s own research as well as results from others.

The neurorobotics workflow starts with the choice of the system to be examined and ends with the execution of the experiment. It has four elementary steps, as described below and illustrated in Figure 2.

FIGURE 2. The neurorobotics workflow.

Step 1: Choose in silico experiment to run
First, the researcher needs to specify the system to be investigated by deciding on the robot body, the environment, the task for the robot to solve, and the brain (the neural controller) that will control the body. We will refer to the specific combination of body, environment, and neural system as a “neurorobotic system.”

The concrete decisions for building each neurorobotic system will influence all other steps of the workflow.
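To make the notion of a “neurorobotic system” concrete, the four design choices of step 1 can be thought of as a single record that the later steps consume. The following Python sketch is purely illustrative; the class and field names are our own invention, not part of the NRP:

```python
# A hypothetical record bundling the four choices of step 1.
# All names here are illustrative, not NRP API.
from dataclasses import dataclass

@dataclass
class NeuroroboticSystem:
    body: str         # e.g., "husky", "icub", "mouse"
    environment: str  # e.g., "virtual_room", "y_maze"
    task: str         # what the robot should accomplish
    brain: str        # the neural controller driving the body

experiment = NeuroroboticSystem(
    body="husky",
    environment="virtual_room",
    task="approach_red_screen",
    brain="braitenberg_network",
)
print(experiment)
```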

Step 2: Choose desired level and platform
In the second step, the implementation of the neurorobotic system chosen in step 1 needs to be specified. The simulated nervous systems can be realized at varying levels of detail ranging from conceptual abstract models to spiking-point neuron models (33) or highly detailed neural simulations (32). Similarly, the robotic body can be built of stiff material with standard actuators from industrial robotics or can be based on a more complex design of soft structures (34), including biomimetic musculoskeletal actuators with many degrees of freedom.
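The “spiking point-neuron” level of detail can be illustrated with a few lines of PyNEST, the Python interface of the NEST simulator that appears later in this article (see “Backend: Simulation”). The network below is a minimal sketch with arbitrary parameters, not a model shipped with the platform:

```python
# A minimal spiking point-neuron network in PyNEST (NEST 3-style calls).
# All parameter values are arbitrary illustrations.
import nest

nest.ResetKernel()

sensory = nest.Create("iaf_psc_alpha", 100)   # leaky integrate-and-fire cells
motor = nest.Create("iaf_psc_alpha", 50)

# Background drive standing in for sensory input
noise = nest.Create("poisson_generator", params={"rate": 8000.0})
nest.Connect(noise, sensory, syn_spec={"weight": 10.0})

# Sparse random projection from the sensory to the motor population
nest.Connect(sensory, motor,
             conn_spec={"rule": "fixed_indegree", "indegree": 10},
             syn_spec={"weight": 20.0, "delay": 1.5})

# Record motor spikes and advance the network by 100 ms of biological time
recorder = nest.Create("spike_recorder")      # "spike_detector" in NEST 2.x
nest.Connect(motor, recorder)
nest.Simulate(100.0)
print(nest.GetStatus(recorder, "n_events"))
```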
Every component may be realized physically or in simulation. But why use simulated robots when our ultimate goal is a brain model that can control a real robot in a physical environment? Real robots have the advantage that all real-world complexities are considered. The same applies to the environments in which the robot is meant to perform its task. Computer models of robots and environments are complex and still inaccurate. However, robot simulations can be easily accessed without the need for purchasing expensive hardware. They also enable quick changes of the robot design and do not suffer from mechanical failure. During the simulation, the system state of the robot can be fully inspected at any point in time. Specific setups like the parallel execution of hundreds of simulations or the simulation of biological muscles cannot be realized at all in a physical setup. In the end, the decision whether to use simulated or physical robots is not influenced by scientific considerations, but rather by our ability to efficiently implement the brain model into the robot.

Like the robot, brain models can be executed either as a pure simulation or as a physical model, with the difference being that simulations tend to be more accurate than any physical model we can build to date. Thus, if we are interested in brain models that capture many of the properties of real brains, we must resort to simulation. These simulations, however, run much more slowly than real-time, and this limitation forces us to also simulate the combination of robot and environment, because we cannot slow down the physics that govern them. To test and implement simplified and abstracted brain models, we must find physical implementations that will also allow us to use real robots and environments.

In HBP, physical brain models are developed through the neuromorphic computing subproject, either by emulating neural dynamics based on analog integrated circuits (35) or by SpiNNaker, a digital neuromorphic system built of standard smartphone microprocessors interconnected by an efficient communication mesh (30).

FIGURE 3. Connecting a virtual mouse model to the primary somatosensory cortex of a mouse brain simulation. The left and right sides of the figure depict the visualizations of the body and brain simulations, respectively. The mapping between brain regions and touch receptors of the mouse body is displayed in the center panels.

Step 3: Set instrumentation and alignment
After step 2 of the workflow, both the brain simulation and the robot are fully specified. Now, the researcher must define the connection between brain and body, specifically the mappings that translate the neural output of the brain into motor commands for the robot, and translate the sensor readings into input for the brain. An example of this mapping process is depicted in Figure 3, where touch receptors located on the skin and the whiskers of a mouse body are mapped to the corresponding cortical areas of a simulated mouse brain. Defining these mappings within the closed-loop integration of a brain simulation and a body model comprises the process stage that is the core of the workflow.

Step 4: Run the experiment
In the final step of the workflow, the researcher executes the neurorobotics experiment. In this phase, the brain model and the robot model run in parallel with the output of one system being the input of the other and vice versa. It is of vital importance that both simulations are synchronized and run on the same timescale, since only then can the behavior observed in the neurorobotics experiment be validated with data from neuroscience and biology. During the execution of the experiment, the researcher can monitor and control all states and parameters of the experiment (brain, body, and environment) including the option to pause and restart it, which is technically impossible to achieve in the laboratory.

In essence, the NRP consists of different model designers and simulation engines that allow rapid construction of neurorobotics experiments. It enables researchers to design virtual robot bodies, to connect these bodies to brain models, to embed them in rich, virtual environments, and, ultimately, to calibrate the brain models to match the specific characteristics of the robot’s sensors and “muscles.”

Collections of predefined models for brains, bodies, environments, sensors, and actuators will also allow nonexpert users to quickly set up new experiments. Researchers will further be able to use traditional techniques from neuroscience, such as lesion studies or manipulations of single neurons, to identify the right control architecture for the specific task. The resulting setups allow researchers to perform in silico experiments, initially replicating previous work, but ultimately breaking new ground. The individual steps from the collection of brain data to synthesizing a brain model based on this data and finally to connecting the brain simulation to a body model are illustrated in Figures 4–6.

FIGURE 4. Schematic of brain modeling based on multiscale integration of data from different sources.

FIGURE 5. Synthesizing a brain model from measurement data.

It goes without saying that for virtualized neurorobotics research to be meaningful, the models for all constituents (environment, robot bodies, and brains) must reflect and integrate current knowledge of all entities, including knowledge about all abstraction levels of the brain, from the cell level to information processing. The neurorobotics subproject will therefore openly collaborate with research groups worldwide—both within and outside HBP—and continuously integrate the state of the art in brain and robot research.

Even though the NRP is designed with a focus on simulation, it is important to realize that the virtual design and virtual execution tool chain implemented by the NRP can, at any step along the process, be translated to the real world; thus, real robots can be designed from their models in the virtual world. They will produce the same behavior as their virtual counterparts, provided the models are permanently calibrated. For industrial practice, this means there can be a very rapid design-test-revise cycle that produces robots with integrated behavior based on “brain-derived technologies.”

Implementation
The NRP consists of a frontend, called the “neurorobotics cockpit,” and a simulation backend. The cockpit gives access to a set of tools that implement the first three steps of the neurorobotics workflow. The tools allow users to select and configure robots and environments, configure the brain model, and set up neurorobotics experiments. Different visualization channels allow control and visualization of the simulation and interaction with the experiment while it is running. The backend orchestrates the different tools for the simulation of robot, environment, and brain model, and manages the signal flow between these tools and the frontend. An overview of the complete system architecture is depicted in Figure 7.

Frontend: Designers
The robot designer creates and adapts robot models, the environment designer sets up the virtual environment with which the robot will interact, the brain interfaces and body integrator (BIBI) specifies the mapping between the simulated brain and the sensors and actuators of the robot, and the experiment designer defines the neurorobotics experiment.

The robot designer implements all functionality necessary to design robotic bodies, adapt their appearance, set their dynamic parameters, and equip them with sensors and actuators. Since users in neuroscience will be less interested in designing robotic bodies, needing them simply as tools for conducting closed-loop experiments, the designer includes libraries with ready-to-use models of robots, sensors, and actuators. In the current version of the NRP, users can choose from a model of the iCub robot (6), the six-legged walking machine LAURON (37), or the mobile robot Husky (38). Most importantly, the designer also includes a virtual mouse model that allows one-to-one comparison to experimental neuroscience studies. Future releases of the NRP will incorporate further refinements and improvements to this model, such as the addition of a realistic simulation of the musculoskeletal system. For the advanced user, the NRP offers a robot designer plug-in for the popular open-source 3D modeling software Blender (36).

Similar to a real neuroscience experiment, a neurorobotics experiment is conducted in a specific environment. The environment designer is a tool for designing virtual environments tailored to the needs of the experiment. Moreover, users will be given the option to share their models with other scientists using the common HBP infrastructure.

The mapping between brain and body lies at the core of every neurorobotics experiment. In correspondence with step 3 of the neurorobotics workflow, BIBI provides tools for both spatial and representation mapping. Presently, there is only support for small brain models, which means that no spatial mapping is required. The representation mapping between the brain model and robotic sensing and actuation is accomplished by implementing so-called “transfer functions,” which translate the data exchanged between the brain simulation and the robot simulation. Beyond mere signal translation, transfer functions can also directly control the robot’s actuators in response to sensory input and thereby bypass further processing in the brain in a reflex-like manner.

We are now adding support for more realistic brain models in close collaboration with HBP’s brain simulation subproject, which is developing the brain builder and the brain atlas embedding module as part of its Brain Simulation Platform. Following the paradigm of predictive neuroscience, this platform builds upon and extends the bottom-up biological reconstruction process to brain modeling, which has recently enabled the digital reconstruction of the microcircuitry of the rat somatosensory cortex (32).

FIGURE 6. Defining a mapping between the brain model and the robot model.

The brain builder is at the core of this process. It allows researchers to synthesize brain models from a large-scale database delivered by the neuroinformatics subproject, which is designed to store the vast amount of available experimental knowledge about the brain. General principles of brain structure and function that are beyond the scope of experimental data are represented algorithmically and validated against biological knowledge. Depending on the intended application and the research question, the brain builder will support the generation of brain models at different levels of granularity ranging from the molecular level to microcircuits to whole brains. Both brain models and experimental data can be visualized and searched using the brain atlas embedding module.

After selecting a brain model from this component, BIBI will allow researchers to set up a spatial mapping by selecting neurons graphically using the brain atlas embedding module, and to map these neurons to the robot model. The neurorobotics subproject will develop an open standard for storing and sharing the brain–body mappings created with the BIBI tool.

The experiment designer combines the output of the other three designers in a virtual protocol for neurorobotics experiments. Future releases of the NRP will contain an intuitive graphical user interface for setting the number of runs, terminating conditions, and measurement values to be recorded during experiments.

Frontend: Visualization
The second group of frontend software components addresses the visualization of the neurorobotics experiment. Depending on the experimental setup, this visualization can be computed from live data in a currently running simulation or generated from a previously recorded experiment.

The NRP supports two different types of visualization. The first is the web-based experiment simulation viewer, which runs in a browser and can therefore be used on any standard personal computer or even on mobile devices, making the NRP easily accessible to interested users without the burden of buying dedicated hardware.

However, compared to real-world experiments, the interaction is less intuitive and the fidelity of the visualization is limited by the screen size and performance of the graphics processing unit.

The second type of visualization supported by the NRP is the high-fidelity experiment simulation viewer, which delivers a much more immersive experience by rendering life-size visualization on a display wall or in a cave automatic virtual environment (CAVE). While this approach requires complex installation of displays and dedicated hardware, it delivers the highest degree of realism and even opens up the possibility for mixed-reality setups, where persons or objects in front of the display wall or CAVE can become part of the experiment.

Backend: Simulation
The backend of the NRP processes the specifications of the experiment and coordinates the simulators. The world simulation engine computes and updates the states of the robot and the environment based on the system descriptions from the corresponding designers. It is based on the highly modular Gazebo simulator (39), which can be easily augmented with new simulation modules. The brain model is simulated by the well-established neural simulation tool (NEST) (40), which ensures that even brain models of the largest scale can be simulated using HBP’s extensive computer resources. In the future, the user will not only be able to choose the desired brain model but also whether the simulation will be computed on standard hardware using a point-neuron simulator, on one of the two neuromorphic hardware simulators that are provided by the neuromorphic computing platform, or on a supercomputer running a detailed simulator. Both simulations are coordinated and synchronized by the closed-loop engine that connects and manages the brain simulation and robot simulation according to the mapping defined in the BIBI component. In correspondence with the different levels of modeling supported by the brain builder, the brain simulation subproject offers different simulators for molecular-, cellular-, and network-level analyses. Whereas network-level simulations capture less detail about individual neurons and synapses, they are based on established point-neuron models that support large-scale models of whole brain regions or even complete brains.

The NRP is fully integrated into the HBP Collaboratory, a web portal providing unified access to the six ICT platforms developed in HBP. The portal implements a common set of services for storage or authentication that is automatically available to users of the NRP and allows an easy exchange of data. The integration of the novel workflows and data formats developed into a common portal makes new research results immediately available to all members and collaborators. The NRP therefore automatically benefits from the work of the entire neuroscientific community, making it not only the most advanced tool in the field, but one that sets a new standard for state-of-the-art neurorobotics research.

FIGURE 7. Components and architecture of the Human Brain Project’s Neurorobotics Platform (NRP).

Experimental results
We validated both the neurorobotics workflow and the NRP in four different experiments that are available in the first public release of the platform. Because the interface to the brain simulation platform is under intense development, all experiments currently rely on simplified neural controllers simulated by NEST (40).

Basic closed-loop integration
The first proof-of-concept experiment carried out on the NRP was a simple Braitenberg vehicle (41), in a version designed for the Husky wheeled robot and the LAURON hexapod. The setup is based on a model of the Husky robot that is equipped with a camera and capable of moving around in a virtual room included in the NRP. The camera output is processed and forwarded to the NEST simulation of a neural network implementing basic phototaxis. As shown in Figure 8, the room contains two displays located on opposite walls. During the simulation, the user can interactively set the color of each display.

To demonstrate the modularity of our approach, the Braitenberg experiment was also implemented for the walking machine LAURON (Figure 8).

Both the Husky and LAURON robots were directed by the Braitenberg brain to walk in the direction of the red stimulus.
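The following self-contained Python sketch mimics, in toy form, the closed loop just described: a camera reading is translated into input rates for a two-neuron “brain,” and the resulting spikes are translated back into wheel commands, with both sides advancing in lock-step. It illustrates the role of transfer functions and the closed-loop engine; it is not the NRP’s actual API:

```python
# A toy, self-contained version of the closed loop: camera redness drives a
# two-neuron "brain," whose spikes drive the wheels (Braitenberg wiring).
# The two translation functions play the role of NRP transfer functions;
# none of this is the platform's actual API.
import random

def camera_redness(world):
    """Stand-in sensor: fraction of red in the left and right half-image."""
    return world["red_left"], world["red_right"]

def sensor_to_brain(red_left, red_right):
    """Transfer function, robot -> brain: map redness to firing rates (Hz)."""
    return 100.0 * red_left, 100.0 * red_right

def brain_step(rates, dt):
    """Stand-in spiking brain: one Poisson neuron per eye."""
    return [1 if random.random() < rate * dt else 0 for rate in rates]

def brain_to_motor(spikes):
    """Transfer function, brain -> robot: crossed wiring steers to the light."""
    return 0.2 + 0.8 * spikes[1], 0.2 + 0.8 * spikes[0]  # (left, right) wheels

world = {"red_left": 0.8, "red_right": 0.1}   # red display on the robot's left
dt = 0.01                                     # both sides advance by 10 ms
for _ in range(100):                          # the closed-loop engine's job
    rates = sensor_to_brain(*camera_redness(world))
    spikes = brain_step(rates, dt)
    left_wheel, right_wheel = brain_to_motor(spikes)
    # a world simulator such as Gazebo would integrate the robot's pose here
print("last wheel command:", left_wheel, right_wheel)
```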

FIGURE 8. Live visualization of the LAURON Braitenberg experiment in the web-based experiment simulation viewer.

Simulation of a humanoid robot and a retina model
Studies of human brain models will require realistic humanoid embodiments. To this end, we integrated the iCub robot model delivered with the NRP. A sample experimental setup is depicted in Figure 9. The robot model is equipped with two cameras positioned at the eyes and is aimed toward one of the screens in the virtual room. Simple spiking neural networks similar to those used in the Braitenberg experiment control the eye motion. However, in this case, the network causes the robot’s eyes to track a moving visual stimulus displayed on the screen. As can be seen in the figure, the user can not only display the spike raster plot, but can also access the recordings of the two cameras placed in the head of the iCub model. We are currently integrating a much more realistic model from a computational framework for retina modeling (42).
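A retina model ultimately has to deliver spikes to the network. A common simplification is Poisson rate coding of pixel brightness, sketched below with NumPy; this illustrates the principle only and is not the framework of reference (42):

```python
# Illustrative rate coding of an image into spike trains, the kind of sensory
# front end a retina model provides to a spiking eye-tracking network.
import numpy as np

rng = np.random.default_rng(0)

def poisson_encode(image, max_rate=100.0, dt=0.001, steps=50):
    """Each pixel's brightness (0-1) sets a Poisson firing rate up to
    max_rate Hz. Returns a (steps, H, W) boolean array of spikes."""
    p_spike = image * max_rate * dt  # spike probability per time step
    return rng.random((steps,) + image.shape) < p_spike

frame = rng.random((32, 32))         # stand-in camera frame
spikes = poisson_encode(frame)
print("mean firing rate (Hz):", spikes.mean() / 0.001)
```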

FIGURE 9. Live visualization of the iCub eye-tracking experiment.

Mouse model and soft-body simulation
The virtual mouse model is an essential component of the NRP and a key to comparative studies bridging traditional neuroscience and virtual neurorobotics experiments. Compared to the other robot models considered thus far, the simulation of a mouse body is especially demanding due to its soft body structure. The mouse experiment included in the first release of the NRP therefore adopts the same simple protocol as the other prototype experiments and focuses on the soft-body simulation. The result is shown in Figure 10. The mouse model is placed at a junction point of a Y-shaped maze. Each direction leads to a dead end with a screen. As in the Braitenberg experiment, the neural controller directs the mouse to move its head toward the red stimulus. A more detailed, realistic-appearing mouse model has been completed, and a prototype that maps the sensory areas of this model to corresponding cortical areas of a detailed point-neuron model of the mouse brain has been successfully tested. The results will soon be available on the NRP.

Closed-loop neuromorphic control of a biomimetic robotic arm
To complement the purely virtual experiments running on the NRP, we also implemented an initial physical neurorobotics experiment (29). The robot was a single-joint biomimetic arm with two antagonistic tendon-driven artificial muscles (Figure 11). It was assembled from design primitives of the modular Myorobotics toolkit (43), which allows the easy assembly of biomimetic robotic structures with arbitrary complexity. A SpiNNaker system (30) was connected to the robot via a dedicated hardware interface that translated between the communication protocols of SpiNNaker and Myorobotics (44). The arm was controlled by a cerebellum model simulation adapted to run on the SpiNNaker architecture from the system described by Luque et al. (45). In particular, we augmented the SpiNNaker software framework to support the supervised learning rule defined by the model. Based on this rule, the system successfully learned to follow a desired trajectory.

Conclusions and outlook
Following the neurorobotics subproject’s overarching theme, “modular brains for modular bodies,” we have successfully integrated the most advanced tools and technologies from brain simulation, robotics, and computing in a unified and easy-to-use toolset that will enable researchers to study neural embodiment and brain-based robotics. This integrative approach combines and connects the results of all HBP subprojects, rendering the neurorobotics subproject a strategic pillar of HBP.

The NRP (neurorobotics.net) is the first integrated and Internet-accessible toolchain for connecting large-scale brain models to complex biomimetic robotic bodies.

FIGURE 10. Mouse model experiment with soft-body physics simulation and live visualization of spike trains.

Based completely on simulations, the platform enables the design and execution of neurorobotics experiments at an unprecedented speed, which is accelerating scientific progress both in neuroscience and robotics. The neurorobotics workflow guarantees that the results can be rapidly transferred to real robots.

A set of pilot experiments running on the first public release of the NRP has yielded positive results and helped to refine the development roadmap. In the upcoming releases, we will focus on the integration of more realistic brain models comprising a very large number of neurons. To this end, we are currently augmenting the neural simulation interface with support for distributed setups in which many instances of NEST are running in parallel.

Additionally, we are implementing an interface to the SpiNNaker platform to allow neuromorphic simulation setups; an initial prototype system is already available. We are also making a concerted effort to ensure that the NRP is as attractive as possible to users. One example is a domain-specific language for defining transfer functions (46), which will serve as the basis for a graphical transfer function editor. On the modeling side, we will provide further environments and robots.
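A domain-specific language of this kind is typically embedded in Python. The snippet below imitates the decorator style such a DSL might use; the decorator name and registry are invented for illustration, and (46) describes the actual language:

```python
# Hypothetical imitation of a decorator-based transfer-function DSL.
# Nothing here is the platform's real vocabulary.
REGISTRY = []

def transfer_function(func):
    """Register a function to be evaluated at every synchronization step."""
    REGISTRY.append(func)
    return func

@transfer_function
def neuron2motor(left_rate=12.0, right_rate=3.0):
    # Crossed excitation: the wheel opposite the stronger input turns faster,
    # steering the robot toward the stimulus.
    return {"left_wheel": 0.1 * right_rate, "right_wheel": 0.1 * left_rate}

# The closed-loop engine would call each registered transfer function once
# per step; a graphical editor could then be built on top of this registry.
for tf in REGISTRY:
    print(tf.__name__, "->", tf())
```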
In the next phase of HBP, we will expand our collaboration with both internal and external partners to ensure that the NRP continues to reflect the cutting edge of both robotics and neuroscience. In particular, we are investigating embodied learning techniques, which are an essential prerequisite to endowing simulated brains with desired behaviors. We are also researching biomimetic robots with human-like musculoskeletal actuation, since these models are an important requirement for transferring results from the simulation to the real world.

We cannot stress enough that the development of the NRP and the neurorobotics subproject are completely open to the entire scientific community. Interested researchers from both academia and industry are strongly encouraged to become involved and contribute. For the upcoming public release of the NRP, we are extending our hardware resources to accommodate as many users as possible. Regular meetings and workshops organized by our subproject are open to everybody, which will not only promote the NRP but also establish a community for neurorobotics research.

FIGURE 11. Closed-loop neuromorphic control of a single-joint robot assembled from design primitives of the Myorobotics toolkit (left). The robot is controlled by a cerebellum model that runs on a SpiNNaker system (right). © 2016 IEEE. Reprinted, with permission, from (29).

References
1. A. K. Seth, O. Sporns, J. L. Krichmar, Neuroinformatics 3, 167–170 (2005).
2. Human Brain Project, website (2016); available at https://www.humanbrainproject.eu.
3. Neurorobotics in the HBP, website (2016); available at http://www.neurorobotics.net.
4. T. Ross, Scientific American 148, 206–209 (1933).
5. N. Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine (Technology Press, Cambridge, MA; John Wiley & Sons, New York, NY; Hermann & Cie, Paris, 1948).
6. G. Metta, G. Sandini, D. Vernon, L. Natale, F. Nori, Proceedings of the 8th Workshop on Performance Metrics for Intelligent Systems (Permis ’08), R. Madhavan, E. Messina, Eds. (ACM, Gaithersburg, MD, August 2008), pp. 50–56.
7. I. Mizuuchi et al., Proceedings of the 2007 7th IEEE-RAS International Conference on Humanoid Robots (IEEE Robotics and Automation Society, Pittsburgh, PA, November–December 2007), pp. 294–299.
8. H. G. Marques et al., Proceedings of the 2010 10th IEEE-RAS International Conference on Humanoid Robots (IEEE Robotics and Automation Society, Nashville, TN, December 2010), pp. 391–396.
9. Roboy Anthropomimetic Robot, website (2016); available at http://www.roboy.org.
10. Honda Motor Company, “ASIMO: The World’s Most Advanced Humanoid Robot” (2016); available at http://asimo.honda.com.
11. E. Ackerman, E. Guizo, “The Next Generation of Boston Dynamics’ ATLAS Robot Is Quiet, Robust, and Tether Free” (2016); available at http://spectrum.ieee.org/automaton/robotics/humanoids/next-generation-of-boston-dynamics-atlas-robot.
12. M. Raibert, K. Blankespoor, G. Nelson, R. Playter, in Proceedings of the 17th IFAC World Congress, M. J. Chung, P. Misra, H. Shim, Eds. (International Federation of Automatic Control, Seoul, Korea, July 2008), pp. 10822–10825.
13. B. M. Yamauchi, Proc. SPIE 5422, 228–237 (2004).
14. E. Ackerman, “Unlucky Robot Gets Stranded Inside Fukushima Nuclear Reactor, Sends Back Critical Data” (2015); available at http://spectrum.ieee.org/automaton/robotics/industrial-robots/robot-stranded-inside-fukushima-nuclear-reactor.
15. E. Ackerman, “Another Robot to Enter Fukushima Reactor, and We Wish It Were Modular” (2015); available at http://spectrum.ieee.org/automaton/robotics/industrial-robots/heres-why-we-should-be-using-modular-robots-to-explore-fukushima.
16. Aldebaran Robotics, website (2016); available at https://www.aldebaran.com/en.
17. H. Miyamoto, M. Kawato, T. Setoyama, R. Suzuki, Neural Netw. 1, 251–265 (1988).
18. G. N. Reeke, O. Sporns, G. M. Edelman, Proc. IEEE 78, 1498–1530 (1990).
19. B. Cox, J. Krichmar, IEEE Robot. Autom. Mag. 16, 72–80 (2009).
20. N. Burgess, J. G. Donnett, J. O’Keefe, Connect. Sci. 10, 291–300 (1998).
21. A. Crespi, K. Karakasiliotis, A. Guignard, A. J. Ijspeert, IEEE Trans. Robot. 29, 308–320 (2013).
22. V. Trianni, S. Nolfi, IEEE T. Evolut. Comput. 13, 722–741 (2009).
23. Y. Sugita, J. Tani, M. V. Butz, Adapt. Behav. 19, 295–316 (2011).
24. H. H. Lund, B. Webb, J. Hallam, Artif. Life 4, 95–107 (1998).
25. Y. Nakamura, K. Yamane, Y. Fujita, I. Suzuki, IEEE Trans. Robot. 21, 58–66 (2005).
26. M. Sreenivasa, A. Murai, Y. Nakamura, Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE/Robotics Society of Japan, Tokyo, Japan, November 2013), pp. 329–334.
27. Y. Yamada et al., Sci. Rep. 6, 27893 (2016).
28. Y. Yamada, K. Fujii, Y. Kuniyoshi, in IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL) (IEEE, Osaka, Japan, August 2013), pp. 1–7.
29. C. Richter et al., IEEE Robot. Autom. Mag. (2016), doi: 10.1109/MRA.2016.2535081.
30. S. B. Furber, F. Galluppi, S. Temple, L. A. Plana, Proc. IEEE 102, 652–665 (2014).
31. C. Eliasmith et al., Science 338, 1202–1205 (2012).
32. H. Markram et al., Cell 163, 456–492 (2015).
33. F. Walter, F. Röhrbein, A. Knoll, Neural Process. Lett. 44, 103–124 (2016).
34. S. Kim, C. Laschi, B. Trimmer, Trends Biotechnol. 31, 287–294 (2013).
35. J. Schemmel et al., Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS 2010) (IEEE, Paris, France, May–June 2010), pp. 1947–1950.
36. Blender Foundation, website (2016); available at https://www.blender.org.
37. A. Roennau, G. Heppner, L. Pfotzer, R. Dillmann, in Nature-Inspired Mobile Robotics, K. J. Waldron, M. O. Tokhi, G. S. Virk, Eds. (World Scientific, Hackensack, NJ, 2013), pp. 563–570.
38. Clearpath Robotics, “Husky Unmanned Ground Vehicle” (2016); available at http://www.clearpathrobotics.com/husky-unmanned-ground-vehicle-robot.
39. Open Source Robotics Foundation, “Gazebo: Robot Simulation Made Easy” (2016); available at http://www.gazebosim.org.
40. M.-O. Gewaltig, M. Diesmann, Scholarpedia 2, 1430 (2007).
41. V. Braitenberg, Vehicles, Experiments in Synthetic Psychology (MIT Press, Cambridge, MA, 1986).
42. P. Martínez-Cañada, C. Morillas, B. Pino, F. Pelayo, in Artificial Computation in Biology and Medicine: International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2015, Elche, Spain, June 1–5, 2015, Proceedings, Part I, J. M. Ferrández Vicente, J. R. Álvarez-Sánchez, F. de la Paz López, F. J. Toledo-Moreo, H. Adeli, Eds. (Springer International Publishing, Cham, 2015), pp. 47–57.
43. Myorobotics, website (2016); available at http://myorobotics.eu.
44. C. Denk et al., in Artificial Neural Networks and Machine Learning—ICANN 2013: 23rd International Conference on Artificial Neural Networks, Sofia, Bulgaria, September 10–13, 2013, Proceedings, V. Mladenov et al., Eds. (Springer, Berlin, Heidelberg, 2013), pp. 467–474.
45. N. R. Luque, J. A. Garrido, R. R. Carrillo, O. J. M. D. Coenen, E. Ros, IEEE Trans. Neural Netw. 22, 1321–1328 (2011).
46. G. Hinkel et al., Proceedings of the 2015 Joint MORSE/VAO Workshop on Model-Driven Robot Software Engineering and View-Based Software-Engineering (ACM, L’Aquila, Italy, July 2015), pp. 9–15.

Acknowledgments
This paper summarizes the work of many people in HBP, in particular the team of SP10 Neurorobotics, whose great contribution we would like to acknowledge. We also thank Florian Walter for his support in preparing this manuscript. Finally, we thank Csaba Erö for kindly providing Figures 3, 5, and 6. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement 604102 (HBP).

Biologically inspired models for visual cognition and motion control: An exploration of brain-inspired intelligent robotics

Hong Qiao1, 2*, Wei Wu1, 3, and Peijie Yin4

1 State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2 Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology (CEBSIT), Shanghai, China
3 Huizhou Advanced Manufacturing Technology Research Center, Guangdong, China
4 Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
* Corresponding Author: hong.qiao@ia.ac.cn

The development of robots with human characteristics that can assist and even befriend humans has been a longstanding dream of many researchers in the field of robotics. However, whereas significant progress has been made in robotics in recent years, the flexibility, intelligence, and empathic ability of robots have not lived up to expectations. Here we review ways in which current robotics research is improving these human-like characteristics.

There are two main approaches for increasing human and even “friend-like” qualities in robots. The first is function- and performance-oriented. Based on observations and analyses of human behavior, models can be built to incorporate human-like cognition and decision-making into robotic systems. Up to this point, many biologically inspired models have been developed and applied to robotic vision, audition, and motion control. The second approach is mechanism- and structure-oriented. Based on recent findings in biology and brain science, robots have been developed to mimic neural mechanisms and biological structures. The integration of neuroscience and information science is a promising direction for both theoretical and practical research in robotics, and paves the way for the proposed brain-inspired intelligent robot.

FIGURE 1. Brain-inspired intelligent robotics is emerging from the exploration of human behavior and biology. Approach A, function- and performance-oriented approach; Approach B, mechanism- and structure-oriented approach.

These two main research directions are illustrated in Figure 1. In the function- and performance-oriented approach, researchers have focused on the input–output relationship in humans, treating it as a “black box.” In contrast, in the mechanism- and structure-oriented approach, brain-inspired intelligent robotics is developed to mimic, apply, and integrate internal mechanisms existing in humans. In this way, the black box can be clarified in a stepwise fashion. Our findings on neural mechanisms and biological structures in humans can be applied to robots to improve their perception, cognition, learning, and motion control. Furthermore, since brain-inspired intelligent robots implement internal neural mechanisms such as mood and emotion, they may have the potential to acquire empathy, so as to interact and cooperate with humans in a close and intimate manner (1). Such an advance in robotics could provide a substantial benefit in the national defense, manufacturing, and service sectors.

Through discussions with neuroscientists beginning in 2009, we have sought to improve the framework of brain-inspired intelligent robotics and build specific models inspired by neural mechanisms that include memory and association, attention modulation, generalized study, and fast movement response. Combining this work with long-standing robotic vision and motion research, we have built a series of brain-inspired models that have improved upon the performance of current models. By introducing neural mechanisms for memory and association, attention, and generalized learning into the models, we have developed robots that exhibit robust recognition in complex environments and the ability to generalize (1–3).

Visual models
Over the years a number of different biologically inspired visual computational models of cognition have been proposed (4–6). The design of these models is based on related biological mechanisms and structures, and provides new solutions for visual recognition tasks. Here, we summarize our previous work on biologically inspired visual models.

We first introduced memory and association in the HMAX visual computational model (Figure 2). Objects were divided into two groups depending on whether they had episodic or semantic features. Here, episodic features refer to special image patches that correspond to the detailed features of the object, such as the key components (eyes, ears, nose, mouth) of a human face. Semantic features are those parts of an object that have a clear physical meaning, for example, whether or not the eyes or mouth on a face are big or small. Similar features found on different objects were saved aggregately, which aided in association and recognition. Ultimately, the model achieved recognition with familiarity discrimination (recognizing episodic features) and recollective matching (retrieval of memory of features seen in the past and comparing them with features on the object being viewed).

FIGURE 2. Introducing memory and association into the recognition model. Our model introduces biologically inspired memory and association mechanisms into HMAX, which consists of five parts: (1) Objects are memorized through episodic and semantic features (Block 1). Oi represents the ith object (i = 1... n), aji represents the jth semantic attribute (j = 1... m) of the ith object, and sxi represents the xth special episodic patch of the ith object. In face-recognition processing, an example of a semantic description might be that the eyes or mouth are large. A special episodic patch could be the image of an eye if the eyes are prominent on the face. (2) Features of one object are saved in distributed memory and common features of various objects are stored aggregately (Block 2). A1 to Am represent various common semantic attributes of different objects. Sx is a library for special episodic patches of different objects. A common attribute of different objects is stored in the same regions, for example, a11−a1n is the first feature of different objects that are stored together, called A1. (3) Recognition of one candidate (Block 3). Tt represents a candidate. The semantic features a1t to amt and the episodic patch feature sxt are extracted before recognition. (4) Familiarity Discrimination (Block 4). Familiarity discrimination is achieved through the HMAX model. Briefly, S1 and S2 correspond to simple cells in the primate visual cortex that carry out the filtering process, while C1 and C2 are complex cells that are the pooling layers of the corresponding S layers. Both sxi and sxt are extracted in the C1 layer of the HMAX model. The saved object can be ordered by comparing the similarity between the candidate and the saved objects in the C2 layer of the HMAX model. (5) Recollective Matching (Block 5). Recollective matching is achieved through integration of the semantic and episodic feature similarity analysis. For more information, see (1).
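The alternation of filtering (S) and pooling (C) stages that this caption refers to can be condensed into a short NumPy sketch. The filter, patch, and parameter choices below are toy stand-ins, not the tuned HMAX components:

```python
# Toy S1-C1-S2-C2 pipeline: filtering alternates with max pooling, and
# familiarity is scored against a stored "episodic patch."
import numpy as np

def conv2d_valid(img, kernel):
    """Plain valid-mode 2D correlation (S-layer filtering)."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(resp, size=2):
    """Local max pooling (C-layer invariance)."""
    h, w = resp.shape[0] // size, resp.shape[1] // size
    return resp[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def s2_response(c1, prototype, sigma=1.0):
    """Radial-basis similarity of every C1 patch to a stored prototype."""
    ph, pw = prototype.shape
    h, w = c1.shape[0] - ph + 1, c1.shape[1] - pw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            d2 = np.sum((c1[i:i + ph, j:j + pw] - prototype) ** 2)
            out[i, j] = np.exp(-d2 / (2 * sigma ** 2))
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32))
bar = np.array([[1.0, -1.0], [1.0, -1.0]])  # toy oriented edge filter

s1 = np.abs(conv2d_valid(image, bar))   # S1: simple-cell filtering
c1 = max_pool(s1)                       # C1: complex-cell pooling
stored_patch = c1[4:8, 4:8].copy()      # an "episodic patch" kept in memory
s2 = s2_response(c1, stored_patch)      # S2: similarity to the stored patch
c2 = s2.max()                           # C2: position-invariant match score
print(round(float(c2), 3))              # 1.0 here, as the patch came from c1
```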
FIGURE 3. Introducing preliminary cognition and active attention into the recognition model. Unlike our previous work, here we propose a new formation of memory and association in Blocks 2 and 3. Specifically, the episodic and semantic features from the same object are saved together in Block 2. Features (fji) of the ith object include episodic features (pji, j = 1... m) and semantic features (sji, j = 1... m), and different features corresponding to the same component of an object are saved together. There are m distributed regions (Fj, j = 1... m) storing various common features of different objects. In Block 3, features are memorized in a distributed structure, and recognition can be achieved through loop discharge in the association process. Additionally, we mimic the structure and function of the inferior temporal cortex layer (Block 5) and add them to the HMAX model (Block 4) to achieve preliminary cognition and active attention adjustment. For more information, see (2).
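The “loop discharge” recall described in the caption, in which one remembered feature pair triggers the others, can be imitated with a classic Hopfield-style associative memory. The toy below is our own illustration of the principle, not the model of (2):

```python
# Pattern completion from a partial cue via recurrent associative recall.
import numpy as np

# Features of two stored objects, coded +1 / -1 (present / absent)
patterns = np.array([
    [1, 1, -1, -1, 1, -1],   # object 1: its episodic and semantic features
    [-1, 1, 1, -1, -1, 1],   # object 2
])
W = patterns.T @ patterns    # Hebbian outer-product weights

state = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # cue: two features seen
for _ in range(5):           # recurrent "loop discharge"
    state = np.sign(W @ state)
print(state.astype(int))     # -> [ 1  1 -1 -1  1 -1], object 1 recalled
```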

We have modified the model to provide better efficiency with a lower memory requirement and the ability to achieve recognition with faster speeds (1).

We further set out to improve the HMAX model so that it could achieve active cognition adjustment for occlusion and orientation of the object, as well as combine episodic and semantic features of the same object in a loop-discharge manner, meaning that one pair of episodic and semantic features could trigger other pairs of features from the same object, which were previously stored in the memory (Figure 3).

FIGURE 4. Introducing active and dynamic learning into the recognition model. The new model for the cognition of a special class of objects is given in panel A (framework for the learning process that corresponds to the training process) and B (framework for the classification process that corresponds to the testing process). The model consists of seven parts: (1) Block 1 includes the input images. (2) Block 2 represents a deep neural network (DNN) model that is used for unsupervised key components learning. DNN includes a five-layer (I, H(1), P(1), H(2), P(2)) convolutional deep belief network (CDBN) and an upper feature layer MO-P2, where I is the input image of CDBN, H(1) and H(2) are two hidden layers, and P(1) and P(2) are two pooling layers. The last layer, MO-P2, is computed using a max-out operation on P(2). (3) Block 3 locates key components. Based on the outputs of the MO-P2 layer in CDBN, the spatial relation template of the key components is computed, and a support vector machine (SVM) classifier is trained to detect their positions. (4) Block 4 carries out contour detection and defines semantic geometric features based on this functionality. (5) Block 5 computes average semantic values as general knowledge and assesses semantic similarity between new samples and general knowledge. (6) Block 6 achieves final classification by combining episodic and semantic cognition. The subclass semantic description can help in achieving a higher-level cognition and is meaningful in associative recognition. (7) Block 7 adjusts the subclass descriptions of each sample according to the updated features. In summary, Block 2 simulates the mechanisms and functions of a primate visual cortex, predominantly from the primary visual cortex (V1) to posterior inferior temporal cortex (PIT). Blocks 3, 4, and 5 introduce mechanisms from the anterior inferior temporal cortex (AIT) of primates, while Block 6 simulates the functions of the prefrontal cortex (PFC) in primates. For more information, see (3).
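As a brief aside on notation: one plausible reading of the “max-out” operation that forms MO-P2 is a maximum taken across feature maps at each spatial position, as in this toy NumPy fragment (random arrays stand in for CDBN activations):

```python
# Max-out across feature maps: keep the strongest response per position.
import numpy as np

rng = np.random.default_rng(1)
p2 = rng.random((4, 8, 8))   # 4 hypothetical P(2) feature maps, each 8x8
mo_p2 = p2.max(axis=0)       # max-out across the feature-map axis
print(mo_p2.shape)           # -> (8, 8)
```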

For example, seeing the eyes of one person could invoke retrieval of a memory of the whole face. These improvements were implemented into the model based on relevant biological evidence, such as primates’ ability to adjust the orientation of an object in the anterior inferior temporal cortex, and their multitask processing ability in the posterior inferior temporal cortex. By achieving robust recognition with orientation and occlusion of the object, the applicable scenarios of robotic cognition have been extended (2). Such improvements can provide the basis for developing personalized services performed by robots.

We also performed simulations of the spontaneous and dynamic cognition process that occurs in human babies. We combined unsupervised feature learning with convolutional deep belief networks (CDBNs) to enable key components to be learned automatically without any prior knowledge, form the general knowledge of a class of objects, and achieve higher-level cognition and dynamic updating (Figure 4). This capacity could improve the ability of robots to learn on their own and to perform inductive generalization.

In order to mimic the neural mechanisms of human visual processing and incorporate them into our model, we modified CDBNs to extract semantic features of the object, form an integrated concept, and reselect features when the input is ambiguous (3). These improvements promise to diminish uncertainties inherent in semantics and improve generalization, which will enable robots to achieve more robust recognition in a complex environment.

FIGURE 5. Biologically inspired human-like vision (A) and motion (B).

FIGURE 6. Ideal capabilities of brain-inspired intelligent robots.

Motion models
To improve the motion of robots, we introduced the concept of a “brain-cerebellum-spine-muscle” structure, together with a “central–peripheral” nervous system into the motion planning and movement control of robots. This addition allows us to better simulate human-like multi-input and multioutput movements and to build a neural encoding–decoding model for the generation of movement control signals. Through these advances we hope to improve the precision of movement and learning ability of robots, without sacrificing response speed (4).

In order to test these new concepts, we created biologically inspired human-like vision and motion platforms (Figure 5). The vision platform uses memory and association, active attention, and preliminary recognition based on human behavior to achieve robust robotic recognition of semantic features. The motion platform mimics structures and mechanisms in the central and peripheral nervous systems of humans to achieve fast responses, highly precise movements, and the ability to learn (4).

Conclusions and future directions
In conclusion, we have built sophisticated robotic cognition and control models based on neural mechanisms. These models have relied on investigations into the subtle differences and relationships between target tasks and related brain neural network mechanisms. What results from the implementation of these mechanisms is a highly abstracted version of their human counterparts. Our objective is to better understand the possible ways in which the human brain manages tasks, and to carefully select and abstract the appropriate mechanisms to make our models computationally realistic. The proposed models would then be consistent with actual biological mechanisms, but also applicable to real-world robotic environments.

In the future, we will continue to collaborate closely with neuroscientists to explore brain mechanisms that will further illuminate missing features of the black box model. As part of our interdisciplinary research in the Center for Excellence in Brain Science and Intelligence Technology at the Chinese Academy of Sciences (CAS), we have entered into close collaboration with neuroscientists from the CAS Institute of Neuroscience to integrate neuroscience and information science. Specifically, we are seeking to combine our current theoretical models with more sophisticated neural mechanisms of perception, cognition, learning, and motion.

For example, we plan to build a model inspired by human neural mechanisms that will mimic human perception, motion, decision-making, emotion, energy optimization, and interaction with other humans (Figure 6). The model will be developed in a systematic manner and will be based on the coordination between the central and peripheral nervous systems. Applying such novel models in brain-inspired intelligent robots will allow us to both improve the robustness and intelligence of robotic systems and verify their effectiveness in robotic design. In another, complementary research project, we are establishing brain-inspired intelligent robotic platforms for use in brain-science research.

References
1. H. Qiao, Y. L. Li, T. Tang, P. Wang, IEEE Trans. Cybern. 44, 1485–1496 (2014).
2. H. Qiao, X. Y. Xi, Y. L. Li, W. Wu, F. F. Li, IEEE Trans. Cybern. 45, 2612–2624 (2015).
3. H. Qiao, Y. L. Li, F. F. Li, X. Y. Xi, W. Wu, IEEE Trans. Cybern. 46, 2335–2347 (2016).
4. K. Fukushima, Neural Netw. 1, 119–130 (1988).
5. L. Itti, C. Koch, E. Niebur, IEEE Trans. Pattern Anal. Mach. Intell. 20, 1254–1259 (1998).
6. T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, T. Poggio, IEEE Trans. Pattern Anal. Mach. Intell. 29, 411–426 (2007).
7. H. Qiao, C. Li, P. J. Yin, W. Wu, Z.-Y. Liu, Assembly Autom. 36, 97–107 (2016).

Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (61210009) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02080003).

Anthropomorphic action in robotics

Jean-Paul Laumond

Laboratory for Analysis and Architecture of Systems–Centre National de la Recherche Scientifique (LAAS-CNRS), Toulouse University, Toulouse, France
Corresponding Author: jpl@laas.fr

The word "robot" was first coined in the early 20th century, and the seminal ideas of cybernetics first appeared during World War II. The birth of robotics is generally pinpointed to 1961, with the introduction of the first industrial robot on the General Motors assembly line: The Unimate robot was patented by George Devol and industrialized by Joseph Engelberger, who is recognized as the founding father of robotics. From its beginning in the 1960s to its broad application in the automotive industry by the end of the 1970s, the field of robotics has been viewed as a way to improve production in manufacturing by providing a manipulator integrated into a well-structured environment. Stimulated by programs such as space exploration, the 1980s saw the creation of field robotics, where a robot's environment was no longer closed. However, even then the robot remained in isolation, only interacting with a static world.
At the end of the 1990s, robotics began to be promoted within the service industry, which led to the development of simple wheeled mobile platforms capable of performing tasks such as cleaning or automated transportation. In the next stage of development, arms were added to the platforms, which allowed more expressive communication. This generation of robots, including most recently "Pepper the robot," is able to enter into dialogue with humans—to welcome and guide them into public areas such as supermarkets, train stations, or airports. Incorporating arms into this type of robotic design has also allowed for object manipulation and physical interaction. However, the limitations of these wheeled mobile robots have become quite evident, sparking a quest for more anthropomorphic robots that can maneuver within a human environment, performing actions such as using stairs and moving over small obstacles. These features would allow for their mobility on any terrain; moreover, they would incorporate the capacity to perform dexterous manipulation. Developing such "humanoid" robots has become a major challenge in the field of robotics and, if fully realized, these devices will become the paragons of robotics science. Here I review the current status of anthropomorphic robots and how the fields of robotics, neuroscience, and biomechanics are coalescing to drive robotic innovation forward.

Anthropomorphic action
Whereas the ultimate goal of roboticists is to provide humanoid robots with autonomy, life scientists are striving to gain an understanding of the foundations of human action, in domains ranging from medicine and rehabilitation to ergonomics. Nevertheless, neuroscience and its quest to understand the computational foundation of the brain provide a further entry point to robotics. And despite their different scientific cultures and backgrounds, the communities of life scientists and roboticists are pursuing converging objectives.
Human beings and humanoid robots share a common anthropomorphic shape. A key to understanding anthropomorphic action that can bridge robotics and life sciences is gaining insight into the fundamental mechanisms of the human body. As an example, consider the actions in Figure 1, performed by the humanoid robot HRP-2 at the Laboratory for Analysis and Architecture of Systems of the French National Center for Scientific Research (LAAS-CNRS). In the first scenario, the robot answers a single order: "Give me the purple ball" (1). To accomplish the assigned objective, HRP-2 decomposes its task into elementary subtasks (Scenario A in Figure 1). A dedicated software module addresses each subtask. For instance, to reach the ball, the robot has to walk to the ball. "Walking" can be regarded as an elementary action that is a resource to solve the problem, and is processed by a dedicated locomotion module. In the second scenario (Scenario B in Figure 1), HRP-2 has to grasp a ball that is located between its feet (2). To accomplish this objective, the robot has to first step away from the ball and then grasp it. In this scenario, the significance of "stepping away" becomes a vital issue. In this experiment, there is no dedicated module in charge of "stepping," which is a direct consequence of the requirement for "grasping." Thus, no stepping "symbol" appears as a resource for problem solving in Scenario B. The grasping action is embedded in the robot's body, allowing its legs to naturally contribute to the action. No deliberative reasoning is required for the robot to face complicated situations such as picking up a ball between the feet.
To design a robot capable of the embedded actions required to execute Scenario B, we must first imagine replacing the humanoid robot HRP-2 with a human being. Among all the possible motions required for grasping an object, we must consider the underlying principle for selection of a particular motion in humans. How do humans organize their behaviors to reach a given objective? Where within the brain does this reasoning take place? What are the relative contributions in Scenarios A and B of voluntary actions computed in the frontal cortex to reflexive actions computed by spinal reflexes? How and why are different actions computed by different mechanisms? What musculoskeletal synergies are required to simplify control of complex motions? Such questions lie at the core of current research in computational neuroscience and biomechanics. In the remainder of this review, I will briefly discuss three viewpoints on anthropomorphic action from a robotics, neuroscience, and biomechanics perspective. I will also comment on mathematical methods for anthropomorphic action modeling.

FIGURE 1. An introductory example of embodied intelligence. Top panels. Scenario (A): The global task “Give me the ball” is
decomposed into a sequence of subtasks— [locate the ball], [walk to the ball], [grasp the ball], [locate the operator], [walk to the
operator], and [give the ball]. The motions [walk to], [grasp], [give] appear as symbols of the decisional process that decomposes the
task into subtasks. Bottom panels. Scenario (B): To grasp the ball between its feet, the robot has to step away from the ball. In this
experiment, “stepping away” is not a software module, nor a symbol. It is an integral part of the embodied action “grasping.” The action
in Scenario A is well segmented. The action in Scenario B is not: Unlike the command “walk to,” “stepping away” does not constitute
a symbol.

A robotics perspective
In the quest for robot autonomy, R&D in robotics has been stimulated by competition between computer science and control theory, and between abstract symbol manipulation and physical signal processing, with the goal of embedding discrete data structures and continuous variables into a single architecture. This architecture is a way of decomposing complicated intelligent behavior into elementary modules, or "symbols," capable of executing a well-defined function. Designing robot architecture requires the well-designed "placing" of these symbols.
In the field of robotics, centralized architectures were first designed in manufacturing. In this hierarchical paradigm, the robot operates in a top-down fashion, combining predefined specialized functions for perception, decision, and control. Such architectures perform well in structured environments where a finite-state machine can describe the world of possible actions, as is the case in production engineering (3). Other architectures promote a bottom-up view, a seminal approach introduced by Rodney A. Brooks (4). Using the concept of subsumption, he proposed a reactive robot architecture organized by integrating low-level, sensory-motor loops into a hierarchical structure. In this paradigm, a behavior is decomposed into subbehaviors organized in a hierarchy of layers. Higher levels subsume lower levels according to the context. This research gave rise to the school of so-called "bioinspired" robotics, which emphasized mechanism design and control (5), and related schools in artificial intelligence, including multiagent systems (6), swarm robotics (7), and developmental robotics (8). Other types of architectures have tended to combine top-down and bottom-up views in a hybrid manner, integrating deliberative reasoning and reactive behaviors (9, 10).
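Brooks's subsumption idea can be pictured with a minimal sketch: independent sensory-motor layers each propose a command, and a higher-priority layer, when active, subsumes (suppresses) the layers beneath it. The layer set and the suppression rule below are illustrative assumptions for exposition, not code from any robot described in this article.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Layer:
    name: str
    behavior: Callable[[dict], Optional[str]]  # sensor readings -> command or None

# Layers are ordered from highest to lowest priority; a higher layer that
# produces a command subsumes (suppresses) every layer below it this cycle.
def subsumption_step(layers: list[Layer], sensors: dict) -> str:
    for layer in layers:
        command = layer.behavior(sensors)
        if command is not None:
            return command  # lower layers are suppressed
    return "idle"

# Example: obstacle avoidance subsumes wandering.
avoid = Layer("avoid", lambda s: "turn_left" if s["obstacle_ahead"] else None)
wander = Layer("wander", lambda s: "go_forward")

print(subsumption_step([avoid, wander], {"obstacle_ahead": True}))   # turn_left
print(subsumption_step([avoid, wander], {"obstacle_ahead": False}))  # go_forward
```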
The aim of all these approaches is to provide a generic solution for mobile robots as well as articulated mechanical systems, and to enable the robotic design to be independent of the mechanical dimensions of the system. Further developments in imposing anthropomorphic body considerations on humanoid robot architectures will involve the promotion of "morphological computation," with its emphasis on the role of the body in cognition (11).

A computational neuroscience perspective
How to represent action is a key issue today in human science research, a field that encompasses endeavors ranging from neuroscience to the philosophy of mind (12). The subject itself, which ponders such questions as whether there can in fact be "representation" of action, has long been controversial. However, the discovery of mirror neurons by Rizzolatti (13) provided physiological evidence to support the concept of action representation, which was promoted by philosopher Edmund Husserl at the end of the 19th century (14).
In terms of motor control, the pioneering work of Nikolai Bernstein in the 1960s revealed the existence of motor synergies (15). The work of Bizzi and colleagues then provided biological evidence of this concept (16). Since then, numerous researchers have pushed the borders of their disciplines to discover laws and principles underlying human motion, which has in turn established the fundamental building blocks of complex movements (17–19). More recently, Alain Berthoz introduced the word "simplexity" to synthesize all these works into a single concept: To face the complexity of having such high dimensions in motor control space, living beings have created laws that link motor control variables and hence reduce computations (20).

A biomechanics perspective
In the 19th century, Étienne-Jules Marey introduced chronophotography to scientifically investigate locomotion, and was the first scientist to correlate ground reaction forces with kinetics. The value of jointly considering biomechanics, modern mathematics, and robotics is illustrated by the famous "falling cat" case study: Why is it that a falling cat always lands on its feet? The answer comes from the law of conservation of angular momentum: The cat can be modeled as a nonholonomic system¹ whereby geometric control techniques perfectly explain the phenomenon (21). Thus, biomechanics provides models of motion generation (22),
which have subsequently been applied in ergonomics (23) and studies of athletic performance (24).

Mathematical methods for anthropomorphic action modeling
From a mechanistic point of view, the human (or humanoid) body is both a redundant system and an underactuated one. It is redundant because its number of degrees of freedom is usually much greater than the dimension of the tasks to be performed. It is underactuated because there is no direct actuator allowing the body to move from one place to another place: To do so, the human must use internal degrees of freedom and actuate all of the limbs following a periodic process, namely bipedal locomotion. Actions take place in the physical space, while they originate in the sensory-motor space. Thus geometry is the core abstraction linking three fundamental action spaces: the physical space where the action is expressed, the motor space, and the sensory space. The emergence of symbols can be understood through the geometric structure of the system configuration space. Such a structure depends on the role of the sensors in action generation and control. As an example, in a recent study, we highlighted the role of the gaze to explain the geometric shape of human locomotor trajectories (26).
Whereas an action such as "walk to" or "grasp" is defined in the real world, it originates in the control space. The relationship between "action in the real world" and "motion generation" in the motor control space is defined in terms of differential geometry, linear algebra, and optimality principles (18, 27). Optimal control is based on well-established mathematical machinery ranging from the analytical approaches initiated by Pontryagin (28) to the recent developments in numerical analysis (29). It allows for motion segmentation as well as motion generation. Inverse optimal control, on the other hand, is a way to model human motion in terms of controlled systems: Given an underlying hypothesis about a system, as well as a set of observed natural actions recorded from an experimental protocol performed on several participants, optimization acts to determine the cost function of the system. From a mathematical point of view, the inverse problem is much more challenging than the direct one. Recent studies have been published in this area utilizing numerical analysis (30), statistical analysis (31), and machine learning (32, 33).
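The direct and inverse problems contrasted above can be written compactly. The notation below (state x, control u, cost weights w) is generic and chosen for illustration; it is not taken from any specific study cited here.

```latex
% Direct optimal control: the cost c is known; solve for the motion.
\min_{u(\cdot)} \int_{0}^{T} c\bigl(x(t), u(t)\bigr)\, dt
\quad \text{s.t.} \quad \dot{x}(t) = f\bigl(x(t), u(t)\bigr)

% Inverse optimal control: trajectories x^{obs} are observed; assuming
% the cost is a weighted sum of candidate features, c_w = \sum_i w_i c_i,
% solve for the weights w that best explain the observations:
\min_{w \ge 0} \sum_{k} \bigl\| x^{obs}_k - x^{*}_{w,k} \bigr\|^2,
\qquad x^{*}_{w} = \arg\min_{x,u} \int_{0}^{T} c_w\bigl(x(t), u(t)\bigr)\, dt
```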
Movement is a distinctive attribute of living systems. It is the source of action. Robots are computer-controlled machines endowed with movement ability. Whereas living systems move to survive, robots move to perform actions defined by humans. Exploring the computational foundations of human action therefore provides a promising route to better engineering humanoid robots. As a movement science, geometry offers the suitable abstraction allowing for fruitful dialog and mutual understanding between roboticists and life scientists.

¹A nonholonomic system is a system whose reachable space dimensions are greater than the number of its motors.

References
1. E. Yoshida et al., Comput. Animat. Virtual Worlds 20, 511–522 (2009).
2. O. Kanoun, J.-P. Laumond, E. Yoshida, Int. J. Robot. Res. 30, 476–485 (2011).
3. A. Jones, C. McLean, J. Manuf. Syst. 5, 15–25 (1986).
4. R. Brooks, Artif. Intell. 47, 139–159 (1991).
5. S. Hirose, Biologically Inspired Robots: Snake-Like Locomotors and Manipulators (Oxford University Press, Oxford, 1993).
6. Y. Shoham, K. Leyton-Brown, Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations (Cambridge University Press, Cambridge, 2008).
7. E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems (Oxford University Press, Oxford, 1999).
8. R. Arkin, T. Balch, J. Exp. Theor. Artif. Intell. 9, 175–189 (1997).
9. R. Alami, R. Chatila, S. Fleury, M. Ghallab, F. Ingrand, Int. J. Robot. Res. 17, 315–337 (1998).
10. M. Asada et al., IEEE Trans. Auton. Ment. Dev. 1, 12–34 (2009).
11. R. Pfeifer, J. Bongard, How the Body Shapes the Way We Think: A New View of Intelligence (MIT Press, Cambridge, 2007).
12. M. Jeannerod, The Cognitive Neuroscience of Action (Wiley-Blackwell, Hoboken, NJ, 1997).
13. G. Rizzolatti et al., Cogn. Brain Res. 3, 131–141 (1996).
14. E. Husserl, Ideas: General Introduction to Pure Phenomenology (Macmillan, New York, 1931).
15. N. Bernstein, The Co-ordination and Regulation of Movements (Pergamon Press, Oxford, 1967).
16. E. Bizzi, F. A. Mussa-Ivaldi, S. Giszter, Science 253, 287–291 (1991).
17. M. J. E. Richardson, T. Flash, J. Neurosci. 22, 8201–8211 (2002).
18. E. Todorov, Nat. Neurosci. 7, 907–915 (2004).
19. K. Mombaur, A. Truong, J.-P. Laumond, Auton. Robot. 28, 369–383 (2010).
20. A. Berthoz, Simplexity: Simplifying Principles for a Complex World (Yale University Press, New Haven, 2012).
21. Z. Li, J. F. Canny, Eds., Nonholonomic Motion Planning (Kluwer Academic Publishers, Dordrecht, 1993).
22. A. Chapman, Biomechanical Analysis of Fundamental Human Movements (Human Kinetics, Champaign, IL, 2008).
23. S. Kumar, Ed., Biomechanics in Ergonomics (CRC Press, Taylor & Francis, London, 2008).
24. M. R. Yeadon, in Biomechanics in Sport: Performance Enhancement and Injury Prevention, V. M. Zatsiorsky, Ed., Encyclopaedia of Sports Medicine, Vol. IX (Blackwell Science Ltd., Oxford, 2000), pp. 273–283.
25. H. Poincaré, Revue de Métaphysique et de Morale 3, 631–646 (1895).
26. M. Sreenivasa, K. Mombaur, J.-P. Laumond, PLOS ONE 10, e0121714 (2015).
27. J.-P. Laumond, N. Mansard, J. B. Lasserre, Commun. ACM 58, 64–74 (2015).
28. L. Pontryagin et al., The Mathematical Theory of Optimal Processes (Wiley, Interscience Publishers, New York, 1962).
29. J. F. Bonnans, J. C. Gilbert, C. Lemaréchal, C. A. Sagastizábal, Numerical Optimization: Theoretical and Practical Aspects (Springer, Berlin, Heidelberg, 2006).
30. M. Diehl, K. Mombaur, Eds., Fast Motions in Biomechanics and Robotics, Lecture Notes in Control and Information Sciences 340 (Springer, Berlin, Heidelberg, 2006).
31. T. Inamura, Y. Nakamura, I. Toshima, Int. J. Robot. Res. 23, 363–377 (2004).
32. T. Mitchell, Machine Learning (McGraw Hill, New York, 1997).
33. J. Kober, J. Bagnell, J. Peters, Int. J. Robot. Res. 32, 1238–1274 (2013).

Actor–critic reinforcement learning for autonomous control of unmanned ground vehicles

Xin Xu*, Chuanqiang Lian, Jian Wang, Han-gen He, and Dewen Hu*

College of Mechatronics and Automation, National University of Defense Technology, Changsha, China
*Corresponding Authors: xinxu@nudt.edu.cn (X.X.), dwhu@nudt.edu.cn (D.H.)

Unmanned ground vehicles (UGVs) are receiving substantial attention for their potential utility in a variety of domains such as the automobile industry, transportation systems, and smart manufacturing (1–3). However, attaining autonomous control of UGVs in complex environments remains a significant challenge. In particular, self-learning abilities are needed in the skill and performance optimization of autonomous driving systems. In this article, we will introduce some recent work in an area of study called "reinforcement learning" for autonomous control of UGVs (4–6). These types of self-learning methodologies are inspired by the so-called "actor–critic" learning mechanism of the human brain, which can improve the effectiveness of actions by receiving delayed evaluative feedback from environmental interactions.
The UGV platform used for this study won the 2014 Challenge for Unmanned Ground Vehicles in China. Advanced sensing and information fusion theories and technologies have been developed for the UGV platform, including visual saliency detection, stereo vision for obstacle detection, and long-distance lane detection (7–9), among others. We introduce our research on developing the self-learning capabilities of the UGV (4, 5) below. In particular, this work involved analyzing the connections between the actor–critic learning algorithms and the activity pattern of dopamine neurons in the brain, and developing two actor–critic learning control algorithms for the self-learning motion control of UGVs.
We have modeled the learning control problem as a Markov decision process (MDP), and the learning objective is to find an optimal control policy that can optimize the long-term cumulative rewards. We designed the rewards according to the performance measure of the learning control problems being analyzed, such as tracking errors. The optimal policy is a deterministic mapping or function from input states to control actions. In practice, due to the large state and action space of the underlying MDP, a near-optimal or suboptimal policy is usually obtained. In our study, by using a batch-mode actor–critic learning controller founded on kernel-based least-squares policy iteration (KLSPI) (10), a near-optimal policy was learned for longitudinal control of the UGV. The near-optimal policy was obtained offline by the KLSPI algorithm using samples collected from the control system. Furthermore, we also designed an online actor–critic learning controller developed from kernel-based dual heuristic programming (KDHP) (11) to realize self-learning lateral control. We conducted a variety of field experiments to test the performance of the self-learning and self-optimization capabilities of the brain-like actor–critic learning controllers. The experimental results indicate that actor–critic learning control not only autonomously controls UGVs, but also improves their driving skills through self-learning mechanisms.

Reinforcement learning in robotics
As a class of wheeled mobile robots, UGVs have been widely studied for their potential to improve intelligent transportation systems, automobile safety, and space exploration. Yet despite advances in areas related to their development, the application of UGVs to real environmental situations has not been realized to its full potential. From the perspective of motion control, the controller design of UGVs has to deal with uncertainties in vehicle dynamics, varying road conditions, and constraints in kinematics such as the minimum turning radius of a vehicle. To overcome these difficulties, various advanced control approaches have been studied (12–15), with convergence and stability analyses among the most typical dynamic control methods. Yet even with these sophisticated methods, challenging problems remain for UGV control. For example, designing optimal or near-optimal motion controllers for UGVs where there are uncertainties and disturbances remains a serious challenge (6, 7). Thus, it will be critically important to develop UGVs capable of self-learning such that the driving skills of the autonomous control system continually improve during their interactions with uncertain environments.
As a major class of machine learning methods, reinforcement learning (RL) (16, 17) has been studied in the robotics field in both simulated and real-world scenarios (18, 19). RL is distinct from supervised learning and mathematical programming methods and has been shown to be an efficient way to solve sequential decision problems modeled as MDPs, where some modeling information is unknown. Thus, compared with dynamic programming, RL is better suited for solving sequential optimization and control problems with inherent uncertainties. However, one of the main problems in RL is the "curse of dimensionality," which means that computational and storage costs increase exponentially with the number of state dimensions. To solve this problem, approximate RL methods, also called "approximate dynamic programming" or "adaptive dynamic programming" (ADP), have received more and more attention in recent years (20, 21). A variety of approximate RL methods have been developed that fall mainly into three categories: value function approximation (VFA), policy search, and actor–critic methods (21, 22). VFA aims to estimate the value function of an MDP using function approximators such as neural networks. In the framework of VFA, successful results have been obtained by making use of deep neural networks, such as the AlphaGo program (23). Policy search involves directly performing a search process in the policy space, using either heuristic information or gradient rules. Actor–critic methods can be viewed as a hybrid of VFA and policy search, where the critic is used for VFA and the actor is used for policy search. Actor–critic learning control has been shown to be important for human skill learning based on experience and evaluative feedback, especially in the case of improving driving skills. However, to our knowledge, little work has been done on actor–critic learning control methods in real-time field testing of UGVs.

Actor–critic learning in the human brain
Research from fields such as machine learning, operations research, robotics, and cognitive science has contributed to a better understanding of the reinforcement learning mechanisms in the human brain (24–27). Various RL models have been established to describe the reinforcement-based cognitive process from the neurochemical level to higher system levels (28, 29).
In RL, the reward function plays an important role by indicating the objectives to be optimized. Many experimental results have demonstrated that reward-directed behaviors rely on the activity of dopamine (DA) neurons from the midbrain ventral tegmental area (VTA) and substantia nigra pars compacta (24, 25). It has been shown that DA not only implements synaptic transmission but also generates long-term effects, which are executed through DA's modulation of synaptic plasticity at target structures that include the basal ganglia, prefrontal cortex, and amygdala (24).

FIGURE 1. Schematic of the actor–critic learning control structure. The system consists of a critic that predicts value functions and an actor that learns the action policies for optimizing long-term rewards. xt and rt denote the state of the Markov decision process and the reward at time step t, respectively; at is the control output at time step t; V(xt) and λ(xt) are the estimated value function and value function derivative of state xt; x̂t+1 is the predicted state at time t + 1.

The actor–critic model has been widely studied to understand how optimized action policies are generated from reinforced behaviors (24). In an actor–critic structure, the critic implements a form of temporal-difference (TD) learning that is closely related to the DA neuron activity, whereas the actor learns an action policy that can optimize the long-term cumulative rewards (Figure 1). The learning problem is modeled as an MDP described as a tuple {S, A, P, R}, where S is the state space, A denotes the action space, P is the state transition probability, and R is the reward function. By making use of the TD learning algorithm (30–32), the critic estimates the state-value function V^π(s), which is the expected, discounted total reward when starting from s and following policy π thereafter, where the expectation is with respect to the state transition probability and the action policy:

    V^π(s) = E[ Σ_{t=0}^{∞} γ^t r_t | s_0 = s ]    (1)

where γ ∈ (0, 1) is the discount factor. The activity pattern of DA neurons not only estimates the state-value function but also estimates the reward prediction error or the TD error:

    δ_t = r_t + γV̂(s_{t+1}) − V̂(s_t)    (2)

where s_{t+1} is the successive state of s_t, V̂(s) denotes the estimate of the value function V(s), and r_t is the reward received after the state transition from s_t to s_{t+1}.
Based on the predicted value function in the critic, the actor updates the action policies either by using a so-called "greedy policy search" or by performing an update based on an approximated policy gradient. For the greedy policy search, the action of each state will be greedy with respect to the estimated value function, which means that the action with the minimum value function will be selected. For the approximated policy gradient, the action policy will be updated with an estimated gradient direction for optimizing the policy performance, which can be measured by the estimated value function.
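The critic/actor loop of Figure 1 condenses into a few lines of pseudo-real code. The sketch below is a generic tabular actor–critic with TD learning, not the authors' UGV controllers: the environment hooks, learning rates, and softmax action preferences are illustrative assumptions.

```python
import numpy as np

def actor_critic_episode(env, V, prefs, gamma=0.95, alpha_c=0.1, alpha_a=0.05):
    """One episode of generic tabular actor-critic learning.
    V[s]        -- critic's state-value estimates
    prefs[s, a] -- actor's action preferences (softmax policy)
    env         -- object with reset() -> s and step(a) -> (s_next, r, done)"""
    s = env.reset()
    done = False
    while not done:
        p = np.exp(prefs[s] - prefs[s].max())
        a = np.random.choice(len(p), p=p / p.sum())
        s_next, r, done = env.step(a)
        # TD error (Equation 2): the dopamine-like reward prediction error.
        delta = r + (0.0 if done else gamma * V[s_next]) - V[s]
        V[s] += alpha_c * delta          # critic: improve the value prediction
        prefs[s, a] += alpha_a * delta   # actor: reinforce the action by TD error
        s = s_next
    return V, prefs
```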
Designing an actor–critic learning controller for UGVs
In our recent work (5), a self-learning cruise controller was developed for longitudinal control of a UGV, as depicted in Figure 2. The self-learning controller integrates the KLSPI algorithm (10, 33) with a proportional–integral (PI) controller with adjustable PI parameters. The output of the actor determines an optimized selection of the adjustable PI parameters. In Figure 2, KP and KI are the coefficients of the PI controller, vc and ac are the current speed and acceleration of the vehicle, vd is the desired speed, Δv is the difference between vc and vd, and u denotes the throttle and brake input. Velocity and acceleration profiles vt and at were designed to specify the desired longitudinal motion property. After generating the learning target, a reward function is defined to minimize the velocity tracking error after collecting the observation data.

FIGURE 2. Block diagram of the self-learning cruise controller. See text for explanation of variables shown.

In the self-learning cruise controller, the state of the underlying MDP is defined as a vector containing vc, ac, and Δv, and the action is the selected KP–KI vector. In the critic, the value function outputs the state-action value for every KP–KI vector in its current state, and the least-squares TD (LSTD) learning algorithm (32, 34) is used to evaluate the value function under a given policy. In order to realize generalization in large or continuous state spaces, function approximators are usually employed. Under a stationary policy π_n at iteration n, when a state transition trajectory {s_1, a_1, r_1, s_2, a_2, ..., s_N, a_N, r_N} is observed, the following LSTD update rule (34) is used to estimate the approximated action value function:

    A_N = Σ_{t=1}^{N−1} φ_t (φ_t − γφ_{t+1})^T,   b_N = Σ_{t=1}^{N−1} φ_t r_t    (3)

    θ = A_N^{−1} b_N    (4)

where φ_t = φ(s_t, a_t) is the feature vector and θ is the weight vector of the linear approximation Q̂(s, a) = θ^T φ(s, a).
In the actor, for each observed state, the policy improvement process computes a greedy action with the largest state-action value, and the selected action determines a corresponding KP–KI vector to update the coefficients of the PI controller:

    π_{n+1}(s) = arg max_{a∈A} Q^{π_n}(s, a)    (5)

For a deterministic policy set, there exists an optimal policy π*, which maximizes the action-value function Q^π for all the state-action pairs:

    Q^{π*}(s, a) = max_π Q^π(s, a), for all (s, a) ∈ S × A    (6)

If the optimal action value function Q*(s, a) is computed, the optimal policy can be obtained by

    π*(s) = arg max_{a∈A} Q*(s, a)    (7)

In KLSPI and other kernel-based RL algorithms, a set of regularized or sparsified kernel-based features k is generated in a data-driven way and the value function is approximated in a linear form as Q̂(s, a) = α^T k(s, a) (see Equations 12 and 13). Then, the generalization ability of RL algorithms can be improved without much human intervention, as described previously in detail (10, 33).
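A batch policy-evaluation step of the kind used inside LSPI/KLSPI can be sketched as follows. This is the standard LSTD-Q recipe on a fixed feature map, following Equations 3 and 4; the kernel-feature construction and sparsification that distinguish KLSPI (10, 33) are abstracted into the `phi` hook, which is an assumption of this sketch.

```python
import numpy as np

def lstd_q(samples, phi, policy, gamma=0.95, reg=1e-6):
    """Estimate weights theta of Q(s, a) ~ theta . phi(s, a) for a fixed policy.
    samples -- list of (s, a, r, s_next) transitions collected beforehand
    phi     -- feature map phi(s, a) -> 1-D numpy array
    policy  -- policy(s) -> action, evaluated at the successor state"""
    d = len(phi(*samples[0][:2]))
    A = reg * np.eye(d)          # small ridge term keeps A invertible
    b = np.zeros(d)
    for s, a, r, s_next in samples:
        f, f_next = phi(s, a), phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)   # accumulate Equation 3
        b += f * r
    return np.linalg.solve(A, b)               # solve Equation 4

# Policy improvement then replaces policy(s) by argmax_a theta . phi(s, a)
# (Equation 5), and the evaluate/improve cycle repeats until convergence.
```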
In addition to the self-learning longitudinal controller, the lateral control problem was solved using another type of online actor–critic learning controller, namely KDHP (11). In contrast to KLSPI, KDHP is more suitable for MDPs with continuous action spaces. In KDHP, a critic network is used to approximate the derivative of the cost function, which has been shown to be more beneficial for reducing the estimation variance of the policy gradient in the actor.

FIGURE 3. Architecture of the KDHP-based lateral controller. See text for explanation of variables shown.

FIGURE 4. The UGV platform. The UGV is equipped with laser radars, cameras, and a GPS, as well as an inertial navigation system.

Figure 3 depicts the architecture of the KDHP-based lateral controller. s(k) and s(k + 1) are the observed states at time steps k and k + 1. The actor generates the control û(s(k)). The critic evaluates the control performance under the current control policy. Critic #1 is equivalent to critic #2, and together they are used to estimate the value function derivatives at different time steps. The value function derivative is defined as:

    λ(s) = ∂V(s)/∂s    (8)

The system states are defined as s = [e_x, e_y, e_θ, ρ, v]^T, where v is the speed of the UGV; e_x, e_y, and e_θ are the tracking errors in the x-direction, y-direction, and orientation, respectively; and ρ is the curvature at the path point (c_x, c_y), which can be calculated by

    ρ = (ċ_x c̈_y − ċ_y c̈_x) / (ċ_x² + ċ_y²)^{3/2}    (9)

The control signal is defined by u = tan δ / L − ρ and is subject to the following constraint:

    v² |tan δ| / L ≤ a_Lmax    (10)

where δ is the driving angle, L is the distance between the front and the rear axles, and a_Lmax is the maximum lateral acceleration. The reward function for evaluating the lateral control performance can then be defined by

    (11)

where P_i (i = 1, 2, 3) and R_j (j = 1, 2) are positive constants.
In the critic of KLSPI and KDHP, a data dictionary D_n with a reduced number of observed data is constructed. Here, α = [α_1, α_2, ..., α_T]^T is the coefficient vector for approximating λ(s), and k(x_t) = (k(x_1, x_t), k(x_2, x_t), ..., k(x_T, x_t))^T. The state-action value function and its derivative are approximated as:

    Q̂(s, a) = Σ_{j=1}^{d(n)} α_j k(q_j, q)    (12)

    λ̂(s) = Σ_{j=1}^{d(n)} α_j k(s_j, s)    (13)

where d(n) is the length of the dictionary D_n, q = (s, a), q_j = (s_j, a_j), and s_j (j = 1, 2, ..., d(n)) are the elements of the data dictionary. Based on the kernel-based features, the online learning rules in the critic of KDHP were designed by the recursive LSTD learning approach (32).
In the actor training of KDHP, the following online policy gradient method is used:

    (14)
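The exact actor update of Equation 14 is not reproduced here, but a dual-heuristic-programming-style update can be sketched in generic form: the actor's weights move against the gradient of the cost that the critic's derivative estimate λ̂ predicts. The model Jacobians and learning rate below are illustrative assumptions, not the precise rule of the KDHP algorithm (11).

```python
import numpy as np

def kdhp_actor_update(w, lam_next, dsnext_du, du_dw, beta=0.01):
    """One gradient step on the actor parameters w.
    lam_next  -- critic estimate of dV/ds at the successor state s(k+1)
    dsnext_du -- model Jacobian ds(k+1)/du(k)  (shape: dim_s x dim_u)
    du_dw     -- actor Jacobian du(k)/dw       (shape: dim_u x dim_w)
    Chain rule: the policy gradient dV/dw ~ lam_next . ds/du . du/dw."""
    grad = lam_next @ dsnext_du @ du_dw
    return w - beta * grad
```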
FIGURE 5. Results before and after learning through the LSPI and KLSPI algorithms. Controller i (i = 1, 2), in panels A and B, is the proportional–integral (PI) controller whose fixed proportional and integral coefficients together correspond to the ith action of the action set. Controllers 3 and 4 (panels C and D) are both learning controllers with the same architecture. The former uses the policy obtained by least-squares policy iteration (LSPI) (33), while the latter employs the policy obtained by the kernel-based LSPI (KLSPI) algorithm (5).

FIGURE 6. Spatial layout of the campus road. m, meters.

FIGURE 7. Lateral error curves of kernel-based dual heuristic programming (KDHP) and pure pursuit controllers (18). m, meters.

Experimental testing of a UGV using brain-like learning control
The UGV platform used for testing is shown in Figure 4. It weighs 2,550 kg and its length, width, and height are 5.08 m, 1.94 m, and 1.9 m, respectively. It is equipped with a 5-speed automatic transmission. A customized drive-by-wire system was designed for the low-level actuator system, which realizes communications based on the controller area network (CAN) bus. The controller is implemented on a Motorola PowerPC architecture, which has a 450-MHz CPU and 512-MB error correction code (ECC) SDRAM memory (4, 5).
As discussed previously, the KLSPI algorithm realizes the policy-learning process in a batch mode, and a sample collection process is designed before learning commences (10). The sample collection process is performed in an urban traffic environment with a sampling time step of 0.05 seconds. For each episode, the maximum number of state transition steps is 200, giving a total of 10 seconds. The sample set used for KLSPI has 500 episodes of state-transition data.
Figure 5 shows the performance comparisons among different longitudinal controllers. In Figure 5, panels A and B depict the velocity-tracking performance of two manually optimized PI controllers, which represents performance prior to learning. Panels C and D show the velocity-tracking performances of the LSPI-based learning controller and the KLSPI-based learning controller, respectively. For the LSPI-based learning controller, some manually designed features were used (5). The data shows that the KLSPI-based controller achieves superior performance compared to manually optimized PI controllers and the LSPI-based learning controller.

TABLE 1. Performance comparisons among the four controllers (17).

Controller                       PI-1     PI-2     LSPI     KLSPI
Average absolute error (km/h)    1.8571   1.3040   1.2877   1.0494
Standard deviation (km/h)        2.3876   1.8943   2.4447   2.0648

Table 1 shows the quantitative evaluation of the four controllers. The performance measure includes the average absolute error and the standard deviation of speed tracking. It is shown that the KLSPI-based self-learning controller has the smallest tracking error.
Making use of the KDHP algorithm, we implemented an online learning control process for lateral control of the UGV platform (6). Figure 6 shows the configuration of the campus road used for the lateral control experiments. The lateral error curves of KDHP and a popularly used lateral control method called "pure pursuit control" are shown in Figure 7. The data shows that the KDHP controller has a smaller lateral error, especially from 0 to 40 seconds.
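For reference, the pure pursuit baseline of Figure 7 and Table 2 steers toward a look-ahead point on the reference path; a minimal form is shown below. The bicycle-model geometry and the look-ahead relation are the usual textbook choices, assumed here rather than taken from the authors' implementation.

```python
import math

def pure_pursuit_steering(x, y, heading, goal, wheelbase=2.5):
    """Steering angle (rad) that drives a bicycle-model vehicle toward a
    look-ahead point `goal` = (gx, gy) on the reference path."""
    gx, gy = goal
    dx, dy = gx - x, gy - y
    # Lateral offset of the goal point expressed in the vehicle frame.
    y_veh = -math.sin(heading) * dx + math.cos(heading) * dy
    ld2 = dx * dx + dy * dy                  # squared look-ahead distance
    curvature = 2.0 * y_veh / ld2            # classic pure pursuit relation
    return math.atan(wheelbase * curvature)  # steering via tan(delta)/L = k
```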
TABLE 2. Performance comparisons between two control methods based on lateral control experiments on the campus road (18). ē is the average lateral error, S(e) is the standard deviation of the lateral error, ā is the average lateral acceleration, and a_M is the maximum lateral acceleration.

Controller                ē (m)    S(e) (m)   ā (m/s²)   a_M (m/s²)
KDHP controller           0.0660   0.0757     0.1671     0.8737
Pure pursuit controller   0.1046   0.0810     0.1660     0.9771

Table 2 shows the performance comparisons between these two controllers based on lateral control experiments on the campus road. The data indicates that, after learning, the KDHP controller has a smaller average lateral error and standard deviation. In addition, the KDHP controller has a smaller maximum lateral acceleration than pure pursuit control.
Based on the above results, we conclude that the actor–critic self-learning controller can effectively optimize the control performance of UGVs without much prior knowledge or human intervention. The proposed methodologies will be important and beneficial for improving the driving skills and performance of UGVs during their interaction with complex environments.

References
1. J. Ziegler et al., IEEE Trans. Intell. Transp. Syst. 6, 8–20 (2014).
2. S. Thrun, S. Strohband, G. Bradski, J. Field Robot. 9, 661–692 (2004).
3. V. Milanés, J. Villagrá, J. Godoy, C. González, IEEE Trans. Control Syst. Technol. 20, 770–778 (2012).
4. J. Wang et al., J. Field Robot. 30, 102–128 (2013).
5. J. Wang, X. Xu, D. Liu, Z. Sun, Q. Chen, IEEE Trans. Control Syst. Technol. 22, 1078–1087 (2014).
6. C. Lian, Optimal Control Methods Based on Approximate Dynamic Programming with Applications to Unmanned Ground Vehicles, Ph.D. thesis (National University of Defense Technology, Changsha, China, 2016).
7. J. Li, M. D. Levine, X. An, X. Xu, H. He, IEEE Trans. Pattern Anal. Mach. Intell. 35, 996–1010 (2013).
8. T. Hu, B. Qi, T. Wu, X. Xu, H. He, Comput. Vis. Image Und. 116, 908–921 (2012).
9. X. Liu, X. Xu, B. Dai, J. Cent. South Univ. 19, 1454–1465 (2012).
10. X. Xu, D. Hu, X. Lu, IEEE Trans. Neural Netw. 18, 973–992 (2007).
11. X. Xu, Z. Hou, C. Lian, H. He, IEEE Trans. Neural Netw. Learn. Syst. 24, 762–775 (2013).
12. R. Rajamani, H.-S. Tan, B. K. Law, W.-B. Zhang, IEEE Trans. Control Syst. Technol. 8, 695–708 (2000).
13. J.-J. Martinez, C. C. de Wit, IEEE Trans. Control Syst. Technol. 15, 246–258 (2007).
14. B. J. Patz, Y. Papelis, R. Pillat, G. Stein, D. Harper, J. Field Robot. 25, 528–566 (2008).
15. K. R. S. Kodagoda, W. S. Wijesoma, E. K. Teoh, IEEE Trans. Control Syst. Technol. 10, 112–120 (2002).
16. R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA, 1998).
17. C. Szepesvari, Algorithms for Reinforcement Learning (Morgan and Claypool, Williston, VT, 2010).
18. F. Lewis, D. Vrabie, IEEE Trans. Circuits Syst. 9, 32–50 (2009).
19. W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley, New York, 2007).
20. L. Busoniu, R. Babuska, B. D. Schutter, D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators (CRC Press, Boca Raton, FL, 2010).
21. D. Liu, Q. Wei, IEEE Trans. Neural Netw. Learn. Syst. 25, 621–634 (2014).
22. X. Xu, L. Zuo, Z. H. Huang, Inf. Sci. (NY) 261, 1–31 (2014).
23. D. Silver et al., Nature 529, 484–489 (2016).
24. R. D. Samson, M. J. Frank, J.-M. Fellous, Cogn. Neurodyn. 4, 91–105 (2010).
25. J. T. Arsenault, K. Nelissen, B. Jarraya, W. Vanduffel, Neuron 77, 1174–1186 (2013).
26. R. Chowdhury et al., Nat. Neurosci. 16, 648–653 (2013).
27. R. E. Suri, W. Schultz, Neuroscience 91, 871–890 (1999).
28. W. Schultz, J. Neurophysiol. 80, 1–27 (1998).
29. P. N. Tobler, C. D. Fiorillo, W. Schultz, Science 307, 1642–1645 (2005).
30. R. S. Sutton, Mach. Learn. 3, 9–44 (1988).
31. B. Seymour et al., Nature 429, 664–667 (2004).
32. X. Xu, H. G. He, D. W. Hu, J. Artif. Intell. Res. 16, 259–292 (2002).
33. M. G. Lagoudakis, R. Parr, J. Mach. Learn. Res. 4, 1107–1149 (2003).
34. J. Boyan, Mach. Learn. 2–3, 233–246 (2002).

Acknowledgments
The authors would like to thank our colleagues in the unmanned vehicle laboratory of the National University of Defense Technology for preparation of the experimental UGV platform. This work was supported by the National Natural Science Foundation of China (NSFC) under grant 91420302, the Innovation and Development Joint Foundation between NSFC and the Chinese Automobile Industry under grant U1564214, and the Pre-research Project of the National University of Defense Technology.

Compliant robotic manipulation: A neurobiologic strategy

Hong Qiao1,2,3*, Chao Ma4, and Rui Li1,3

1The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
2Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
3School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China
4School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
*Corresponding Author: hong.qiao@ia.ac.cn

FIGURE 1. An intuitive illustration of ARIE using the "bowl–bean" system in the physical space.

FIGURE 2. ARIE in the configuration space of a typical round-peg, round-hole insertion task. The color of the region varies from yellow to blue as the uncertainty of the system decreases.

Robotics will be key to the success of automation technologies of the future, particularly in the areas of advanced manufacturing and services. The development of compliant robots, which are characterized by flexibility, suppleness, and freedom of movement, has become an area of intensive research. Although the concept of compliance has not been strictly defined, it implies that robots are capable of dexterous manipulation and the ability to sense and respond to interactions with humans with exquisite precision.

Achieving compliant design
Compliant robotic manipulation can effectively reduce dependency on sensory information while improving relevant operational capabilities (1–3). Yet designing effective compliance strategies remains a challenging task. In recent years, researchers have begun to develop methods and devices for applying compliance by leveraging information within the robot's environment. In particular, the concept of "attractive region in environment" (ARIE) has achieved encouraging results.
ARIE is a type of constrained region in the configuration space of the robotic system, formed by its environment. In such regions, the state of the system will converge to a stable point with a state-independent input applied to the system. By way of example, imagine that there is a bowl and a bean in the physical space (Figure 1). If we drop the bean into the bowl, no matter the starting position of the bean, the effect of gravity and friction will act on the bean such that it will finally come to rest at the bottom of the bowl. ARIE is just like such a bowl in the configuration space; the state is the position of the bean, the state-independent input is gravity, which is a constant in this case, and the stable point is the bottom of the bowl. If we can find ARIE and the state-independent input in the configuration space of the robotic system, then the task can be achieved without the need for sensing information.
ARIE enables the uncertainty of the system state to be eliminated by a state-independent input (meaning that the input is not a function of the state of the system). It has been implemented in robotic grasping and localization tasks, and has become a common feature in the configuration space of robotic systems (4–7). Figure 2 shows an illustration of the performance of an ARIE system in a typical round-peg, round-hole insertion task. According to the relationship between the shape of the peg and the hole in the physical space and that of the constrained region in the configuration space, a series of compliant strategies to achieve peg insertion can be designed.
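The bowl-and-bean picture translates directly into a toy simulation: under a constant, state-independent input (gravity) plus friction, every initial state ends at the same stable point, so no sensing of the state is required. The bowl shape and friction constant below are arbitrary illustrative choices, not parameters from the cited work.

```python
def settle_in_bowl(x0, steps=2000, dt=0.01, g=9.8, damping=2.0):
    """Bead on a bowl-shaped constraint h(x) = x**2, driven only by the
    state-independent input (gravity) and friction. Any x0 converges to
    the attractive point x = 0 at the bottom, without state feedback."""
    x, v = x0, 0.0
    for _ in range(steps):
        slope = 2.0 * x                   # dh/dx for the bowl h(x) = x^2
        a = -g * slope - damping * v      # gravity along the surface + friction
        v += a * dt
        x += v * dt
    return x

print([round(settle_in_bowl(x0), 3) for x0 in (-1.0, -0.2, 0.7)])  # all ~0.0
```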
Although environmental constraints can be used to develop compliance, some a priori knowledge of the robot and the object to be manipulated is still necessary in the design of compliant robotic applications. Researchers are now focusing on integrating human-type features into robotics to achieve compliant functionality. Here we review how applying neurobiological paradigms can improve robotic compliance.

Adopting compliant functionality
A central goal of compliance research has been to introduce the functionality of human hands to robots. Methods for compliant control and motion have been proposed, and a large number of remote center compliance (RCC) devices have been developed. An RCC device is a mechanical device that moves the center of compliance in an automated assembly such that object jamming can be prevented. However, these devices remain deficient in comparison to the capabilities of human hands. How do we therefore impart features of human control to robots, and how can we integrate these features to improve compliance? Humans do not possess the most highly evolved sensory organs; nonetheless, they are capable of many sophisticated operations. In fact, the ability to deal with a complex real-world environment in a flexible manner is one of the most striking features of human beings. Even with limited sensing information, humans can plan and properly execute complex motions such as grasping. We posit that, by careful observations of how humans achieve these tasks, robots can be designed to execute a variety of complex movements in response to a range of object types.

FIGURE 3. An illustration of the constrained region in environment (CRIE) framework.

Based on these observations and the concept of ARIE, we have proposed the adoption of a framework called "constrained region in environment" (CRIE) to achieve compliant robotic manipulation. This system would integrate constraints formed by the environment and coarse-sensing information. As illustrated in Figure 3, one of the distinguishing features of CRIE is the combination of offline planning and online dynamic compliant adjustments to compensate for uncertainties inherent in the ideal model. Thus, by using manipulation states acquired by coarse-sensing information, a compliant robot could determine whether an appropriate adjustment should be implemented to asymptotically track the predesigned planning sequence. In robotics, the implementation of various sensors and related technologies has grown in recent years, and offers boundless prospects in a wide range of manipulation tasks. The CRIE framework has great potential in engineering applications, including the challenging task of capturing information not provided directly by sensors. CRIE has been successful in providing more flexibility and efficiency for robots in executing grasping and other manipulation tasks; however, more effort in control and planning strategies for robotic manipulation is needed to achieve a higher degree of compliance.
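The combination of offline planning with online, coarsely sensed adjustment that characterizes CRIE can be caricatured as a simple tracking loop. Everything here (the waypoint plan, the coarse state estimate, and the correction gain) is an illustrative assumption rather than the published framework.

```python
def crie_like_loop(plan, read_coarse_state, move,
                   tolerance=0.05, gain=0.5, max_steps=1000):
    """Follow a predesigned plan (list of waypoint states), using only a
    coarse state estimate to decide whether a compliant correction is needed."""
    for target in plan:                    # offline planning result
        for _ in range(max_steps):
            state = read_coarse_state()    # coarse-sensing information
            error = target - state
            if abs(error) < tolerance:     # close enough: go to next waypoint
                break
            move(gain * error)             # online compliant adjustment
```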
Integrating neurobiologically inspired mechanisms
Bridging the existing gaps in compliant manipulation between robots and humans is challenging. One way to solve this problem is to introduce a human-like manipulation mechanism to robots. Researchers are therefore turning to neurobiology for inspiration.
Humans are capable of executing a variety of complicated
tasks with high dexterity through coordination between their hands, eyes, and limbs. The combination of complex anatomical structure with dense, tactile sensory organs and the complex motion control that human hands possess is essential for achieving such a range of tasks. For example, the fingers carry the most densely innervated areas of the body and the richest source of tactile feedback, and possess the greatest positioning capability of the body. Thus, mimicking the structure, function, and control of human hands provides an excellent model for improving the compliance ability of robotic hands. Dexterous robotic hands have received increasing attention in recent years and have resulted in a number of human hand imitations, such as the Shadow Dexterous Hand, the Barrett Hand, Robotiq Grippers, and RightHand Labs' ReFlex Hand. Compared with simple grippers or other rudimentary mechanical hands, dexterous robot hands provide more flexibility and improve the user's ability to execute grasping and manipulation tasks. These capabilities are due to a variety of visual, force/torque, or tactile sensors, and related intelligent algorithms incorporated into the design (8–10).

FIGURE 4. An illustration of a task requiring a high level of compliance: inserting a key into a lock.

Although such robotic hands can achieve simple compliant manipulations, the human hand has a more intricate and complex structure. There are 27 bones in the human hand. Together with the forearm, more than 30 individual muscles work together to achieve flexibility, precise control, and gripping strength for various tasks. The human hand is controlled by two groups of muscles: extrinsic muscles (containing long flexors and extensors) located on the forearm, which control crude movement, and intrinsic muscles located within the hand, which perform fine motor functions. The most flexible parts of the hand are the fingers, which are mostly controlled by strong muscles in the forearm (11). The mechanisms underlying hand movement are closely connected to the motor nervous system. Understanding this connection may help in developing robotic systems with enhanced compliance abilities.
In recent years, researchers have gained inspiration from studying the makeup of the nervous system, and have begun to apply their ideas to robotic compliance design. One such idea is that a tendon-muscle–based structure could provide a compliant strategy to control movement of the robot hand and arm. In this scenario, neural signals to control the hand would interface with a design whereby the hand could turn dynamically, providing compliance in a variety of situations. The mechanical structure of the robotic hand can be designed in a compliant manner based on the anatomical structure of the human hand. It would also be necessary to provide a muscle-like driving structure that would function as the hardware of the motor nervous system. Figure 4 illustrates the idea of a tendon-muscle–based structure with a compliant strategy enabling a robot hand to insert a key into a lock, a task that would require a high degree of compliance.

Conclusions and perspectives
Developing robots with human-like tactile abilities has been a long-standing goal of robotics. The field of compliance for robotic manipulation has developed many remarkable strategies, ranging from combining compliant functionality with environmental constraints and coarse-sensing information, to integrating neurobiologically inspired mechanisms into robotic hand design—all paving the way for many promising robotic applications. Exploring neurobiologically inspired paradigms for compliance can also provide a better understanding of how the human body carries out manipulation tasks. We plan to design and test a neurobiologically inspired robotic hand with a muscle-like structure; this system will integrate a neurobiologically inspired model of the motor nervous system with compliance-based technology in the robotic hand, which could lead to a low-cost and fast robotic manipulation system with an inherent learning ability.

References
1. T. Lozano-Perez, M. T. Mason, R. H. Taylor, Int. J. Robot. Res. 3, 3–24 (1984).
2. M. R. Ahmed, Compliance Control of Robot Manipulator for Safe Physical Human Robot Interaction, Ph.D. thesis, Örebro University (2011).
3. E. Mattar, Rob. Auton. Syst. 61, 517–544 (2013).
4. H. Qiao, Int. J. Prod. Res. 40, 975–1002 (2002).
5. J. Su, H. Qiao, C. Liu, Z. Ou, Int. J. Adv. Manuf. Tech. 59, 1211–1225 (2012).
6. H. Qiao, M. Wang, J. Su, S. Jia, R. Li, IEEE/ASME Trans. Mechatron. 20, 2311–2327 (2015).
7. J. Su, H. Qiao, Z. Ou, Y. Zhang, Assembly Autom. 32, 86–99 (2012).
8. T. Iberall, Int. J. Robot. Res. 16, 285–299 (1997).
9. A. Bicchi, R. Sorrentino, Proc. IEEE International Conference on Robotics and Automation 1, 452–457 (1995).
10. A. Bicchi, IEEE Trans. Robot. Autom. 16, 652–662 (2000).
11. R. Tubiana, J. M. Thomine, E. Mackin, Examination of the Hand and Wrist, 2nd Ed. (Martin Dunitz, London, 1998).

Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (61210009), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02080003), the Beijing Municipal Science and Technology Commission (D16110400140000 and D161100001416001), and the Fundamental Research Funds for the Central Universities (FRF-TP-15-115A1).

Declarative and procedural knowledge modeling methodology for brain cognitive function analysis

Yun Su1,2, Xiaowei Zhang1, Philip Moore1, Jing Chen1, Xu Ma1, and Bin Hu1*

1School of Information Science and Engineering, Lanzhou University, Lanzhou, China

Driven by the rapid ongoing advances in neuroscience, cognitive science, and computer science, brain-inspired intelligence R&D is blossoming (1). Brain-inspired intelligence is a field that addresses the creation of machine intelligence through computational modeling and software–hardware coordination, drawing inspiration from the information-processing mechanisms of the human brain at various levels of granularity (2). Research in this field has led to the development of intelligence control architectures in humanoid robots. The area of humanoid robotics has gained traction as a concrete example of the accomplishments of brain-inspired intelligence research and as a research tool for neural simulation.
In this review, we discuss how ontology technologies and data mining algorithms are being used to model knowledge of multimodal psychophysiological data including electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI), with the aim of predicting the affective state and stage of mild cognitive impairment (MCI). Such methods hold the potential to model memory and enable effective management of inputs (sensory information), and to provide an effective basis for the conversion of the information to an invariant form that can be processed to infer and predict new knowledge.

Modeling memory
At the crux of intelligence is the ability to implement memory and predictions (3). The brain uses a vast amount of memory to create a model of the world and makes continuous predictions of future events. When we see, feel, hear, or experience external sensory stimulation, the cortex takes the detailed, highly specific input and converts it to an invariant form, which captures the essence of relationships within the observed world, independent of the details of the moment. When there are consistent patterns among the sensory inputs flowing into the brain, the cortex will process them to retrieve the invariant form from memories and combine patterns of the invariant structure with the most recent details to predict future events.
When considering humanoid robotics, one of the key challenges is how to model human memory, whether in designing hardware to deal with large-scale brain simulations or software with brain-inspired cognitive architectures. Declarative memory (i.e., the knowledge of "what") and procedural memory (i.e., the knowledge of "how") are the two important kinds of memory that represent declarative knowledge and procedural knowledge, respectively. They have been implemented as important components in many humanoid robots and underlying brain-inspired cognitive architectures, such as Edelman's neural automata (4), ACT–R (adaptive control of thought–rational) (5), and 4CAPS (cortical capacity-constrained concurrent activation-based production system) (6).
Memory has a complicated and hierarchical structure that stores information about important relationships between entities in the world. Similarly, ontological technology can provide linguistic, semantic, and quantitative meaning by showing relationships between concepts. This technology provides a suitable and effective model in a form readable by both computerized systems and humans. Moreover, it enables the presentation of the hierarchical declarative meaning of concepts along with modeling of procedural knowledge, and can use machine learning technologies and statistical methods to infer and predict new knowledge. Thus, ontology-based modeling can be applied to declarative and procedural knowledge modeling in the memory systems of humanoid robots.
By combining ontological techniques with data mining algorithms, we have conducted studies of declarative knowledge modeling of psychophysiological signals and procedural knowledge modeling of both emotion assessment and early-stage prediction of MCI (7–11).
impairment (MCI). Such methods hold the potential to assessment and early-stage prediction of MCI (7–11).
model memory and enable effective management of inputs
(sensory information), and to provide an effective basis for EEG-based emotion assessment
the conversion of the information to an invariant form that Emotion is considered a critical aspect of intelligent
can be processed to infer and predict new knowledge. human behavior. Current research has shown an increasing
interest in using emotion assessment to more successfully
Modeling memory apply brain-inspired intelligence toward improving
At the crux of intelligence is the ability to implement human-machine interactions. Based on the analysis of
memory and predictions (3). The brain uses a vast amount EEGs, which are derived from automatic nervous system
of memory to create a model of the world and makes responses, computers can assess user emotions and find
continuous predictions of future events. When we see, correlations between significant EEG features and the
feel, hear, or experience external sensory stimulation, human emotional state. With the advent of modern signal
the cortex takes the detailed, highly specific input and processing techniques, the evaluative power of EEG-based
converts it to an invariant form, which captures the essence analysis of human emotion has increased, due in part to
of relationships within the observed world, independent the large number of features typically extracted from EEG
of the details of the moment. When there are consistent signals. However, it is too complex a task for researchers
patterns among the sensory inputs flowing into the brain, to manually manage the expanded set of features in a
the cortex will process them to retrieve the invariant form structured way.
from memories and combine patterns of the invariant Management and distribution of the data present
structure with the most recent details to predict future significant challenges, which can only be addressed
events. through advanced tools for data representation, mining,
When considering humanoid robotics, one of the key and integration. We have therefore carried out studies to
develop tools for implementing declarative knowledge
1
School of Information Science and Engineering, Lanzhou University, Lanzhou,
and procedural knowledge models. To build declarative
China knowledge models, we extract knowledge from EEG data
2
College of Computer Science and Engineering, Northwest Normal University, and contextual information, then encode the knowledge
Lanzhou, China
*
Corresponding Author: bh@lzu.edu.cn and information to create the domain ontology. We address
D EC LARAT I VE AND PROC EDURAL KNOW L E DG E MO DE L ING ME T H O D O LO GY FO R BRAIN CO G NIT IV E FU NCT IO N AN A LY SI S   51

A
FIGURE 1. EEG-based
emotion assessment
system using ontology
and data mining
techniques. (A) EmotionO,
an ontology developed
for the EEG context (8). (B)
Framework of EmotionO+,
an ontology model used
for early-stage prediction
and intervention in
depression (9). (C) The
mapping process from
raw EEG data documents
to the Rawdata ontology
with EEG data (10). EEG,
electroencephalogram;
fNIRS, functional near-
infrared spectroscopy.

C
52  BR A IN -IN SPIR E D IN TELLI GENT ROBOT I C S: T HE I NT ERSECT IO N O F RO BOT IC S AND NE U RO S C IE NC E

procedural knowledge using


a rule set that has been A
trained by the data mining
algorithms, which can identify
the emotional state of the
subject from the declarative
knowledge.
Zhang et al. have developed an ontology called "EmotionO," which models declarative knowledge that includes contextual information and EEG-related knowledge, as well as the relationships between them. Figure 1A shows a representation of some key entities defined in this ontology. The key top-level elements consist of classes and properties that describe the #Emotion, #User, and #Situation concepts [Web Ontology Language (OWL) classes] with <hasEEGFeature> and <hasEmotion> properties.
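As a concrete illustration of this declarative layer, the sketch below builds a miniature ontology of the same shape using the rdflib Python library. It is only a sketch under stated assumptions: the example.org namespace, the EEGFeature class, and the alphaPower_F3 individual with its value are invented stand-ins, not the published EmotionO definitions.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

# Hypothetical IRI standing in for the EmotionO namespace.
EMO = Namespace("http://example.org/emotiono#")

g = Graph()
g.bind("emo", EMO)

# Declarative layer: top-level OWL classes and the two object properties.
for cls in (EMO.Emotion, EMO.User, EMO.Situation, EMO.EEGFeature):
    g.add((cls, RDF.type, OWL.Class))
for prop, rng in ((EMO.hasEEGFeature, EMO.EEGFeature),
                  (EMO.hasEmotion, EMO.Emotion)):
    g.add((prop, RDF.type, OWL.ObjectProperty))
    g.add((prop, RDFS.domain, EMO.User))
    g.add((prop, RDFS.range, rng))
g.add((EMO.hasValue, RDF.type, OWL.DatatypeProperty))

# Individuals: one user linked to one extracted EEG feature (invented values).
g.add((EMO.user01, RDF.type, EMO.User))
g.add((EMO.alphaPower_F3, RDF.type, EMO.EEGFeature))
g.add((EMO.alphaPower_F3, EMO.hasValue, Literal(0.42, datatype=XSD.double)))
g.add((EMO.user01, EMO.hasEEGFeature, EMO.alphaPower_F3))

print(g.serialize(format="turtle"))
```

Serializing the graph to Turtle keeps the declarative knowledge in a form readable by both computerized systems and humans, the property of ontologies emphasized above.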
In order to obtain the emotional state, the C4.5 algorithm was used to mine the main relations between the EEG features of a certain person and their affective state, with all of the paths from the root node to the leaf nodes of the constructed decision tree treated as procedural knowledge. Knowledge reasoning relied on an inference engine that concatenates declarative knowledge with the associated procedural knowledge, and was constructed to identify the current emotional state from the captured data. Experimental results were based on the eNTERFACE'06 (www.enterface.net/results) and DEAP (www.eecs.qmul.ac.uk/mmv/datasets/deap) databases, and showed that recognition of human emotion is highly accurate using this approach (7, 8).
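To see how such root-to-leaf paths become production rules, the sketch below trains an entropy-splitting CART tree (scikit-learn has no C4.5 implementation, so this is a stand-in) on synthetic band-power features with invented channel names, then walks the fitted tree to print each path as an IF-THEN rule:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic features and labels; the published experiments used real EEG
# recordings from the eNTERFACE'06 and DEAP databases.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] + 0.5 * X[:, 2] > 0.8).astype(int)   # 0 = calm, 1 = aroused
feature_names = ["alpha_F3", "beta_F4", "theta_Fz"]

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X, y)

def leaf_rules(tree, names):
    """Enumerate every root-to-leaf path as an IF-THEN production rule."""
    t = tree.tree_
    rules = []

    def walk(node, conds):
        if t.children_left[node] == -1:            # reached a leaf
            label = int(np.argmax(t.value[node]))
            rules.append("IF " + " AND ".join(conds or ["TRUE"])
                         + f" THEN emotion = {label}")
            return
        name, thr = names[t.feature[node]], t.threshold[node]
        walk(t.children_left[node], conds + [f"{name} <= {thr:.3f}"])
        walk(t.children_right[node], conds + [f"{name} > {thr:.3f}"])

    walk(0, [])
    return rules

for rule in leaf_rules(clf, feature_names):        # the procedural knowledge
    print(rule)
```

An inference engine can then match a subject's declarative facts (the <hasEEGFeature> values asserted in the ontology) against these rule conditions to assert a new <hasEmotion> relation.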
Based on the EmotionO ontology, Su et al. proposed an extended ontology model called "EmotionO+" that semantically represents EEG and functional near-infrared spectroscopy (fNIRS) data with contextual information. In addition, their ontology model incorporates emotion-reasoning rules, based on EEG data and obtained using the random forest algorithm. This model can be used in systems designed to enable early-stage prediction of and intervention in depressive disorders (Figure 1B) (9).
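A minimal sketch of that rule-mining step, assuming invented EEG and fNIRS feature columns (the published model was trained on real multimodal recordings):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Illustrative stand-ins: four EEG band powers plus two fNIRS channels,
# with binary labels such as at-risk versus control sessions.
rng = np.random.default_rng(1)
X = np.hstack([rng.random((150, 4)), rng.random((150, 2))])
y = rng.integers(0, 2, size=150)
names = ["eeg_alpha", "eeg_beta", "eeg_theta", "eeg_gamma",
         "fnirs_hbo", "fnirs_hbr"]

forest = RandomForestClassifier(n_estimators=200, random_state=1)
print("5-fold CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())

# Feature importances indicate which modalities drive the learned rules.
forest.fit(X, y)
for name, imp in sorted(zip(names, forest.feature_importances_),
                        key=lambda p: -p[1]):
    print(f"{name}: {imp:.3f}")
```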
On the basis of the work of Zhang, Chen et al. have proposed a mapping process that can convert unstructured raw EEG documents into an ontology called "Rawdata" with structured EEG data, using document
content identification, query, and information description mechanisms (Figure 1C) (10). Several OWL classes have been defined in the Rawdata ontology, but they have no individuals (i.e., the members of the classes) or explicit properties (i.e., the relationships between the OWL classes) [Figure 1C(a)]. Querying and searching are used in both the Rawdata ontology and the raw data documents in order to match keywords, and individuals are built under the corresponding classes [Figure 1C(b) and (c)]. Following these initial steps, the raw EEG data and the data-related content are represented in the Rawdata ontology, which includes explicit individuals, properties, and the relationships between individuals [Figure 1C(d)]. Finally, the structured description of the raw EEG data can be further processed using the quantitative EEG analysis mechanisms and ontology modeling.
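The sketch below mimics that mapping on two invented document headers: keyword patterns are queried against the raw text, and each hit becomes an individual with explicit properties under a hypothetical Rawdata namespace (the class and property names are illustrative, not those of the published ontology).

```python
import re
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

RAW = Namespace("http://example.org/rawdata#")   # hypothetical namespace
g = Graph()
g.bind("raw", RAW)

# Unstructured header text from raw EEG documents (invented examples).
documents = [
    "subject: S01  channel: F3  sampling rate: 128 Hz",
    "subject: S02  channel: F4  sampling rate: 256 Hz",
]

# Keyword query patterns; each hit becomes an individual built under the
# corresponding class, mirroring steps (b) and (c) of Figure 1C.
patterns = {
    RAW.hasSubject: re.compile(r"subject:\s*(\w+)"),
    RAW.hasChannel: re.compile(r"channel:\s*(\w+)"),
    RAW.hasSamplingRate: re.compile(r"sampling rate:\s*(\d+)"),
}

for i, doc in enumerate(documents):
    rec = RAW[f"recording{i:02d}"]
    g.add((rec, RDF.type, RAW.EEGRecording))
    for prop, pattern in patterns.items():
        match = pattern.search(doc)
        if match:
            value = match.group(1)
            literal = (Literal(int(value), datatype=XSD.integer)
                       if value.isdigit() else Literal(value))
            g.add((rec, prop, literal))

print(g.serialize(format="turtle"))
```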

MRI-based decision support for diagnosis of MCI
In the medical domain, ontology-based modeling has been used as a data structure, as it enables the definition of semantic descriptors, data integration, and management of domain knowledge in clinical health care. Given these attributes, we have studied ontology-driven decision support for the diagnosis of MCI based on MRI.

Brain structure changes associated with MCI have been widely researched. In structural MRI studies, for example, the cortical thickness of patients suffering from MCI has been shown to be significantly reduced due to gray matter atrophy (12). To improve diagnosis, we propose an objective method capable of diagnosing MCI automatically using the cortical thickness of each region of interest in the MRI scan. Managing MRI/MCI knowledge hierarchically using ontology-based modeling may provide significant diagnostic benefits.

Based on this hypothesis, Zhang et al. encoded specialized declarative knowledge, including MRI/MCI and environment-context knowledge, into an ontology (Figure 2A), and mined the procedural knowledge constructed by a rule set using the C4.5 machine learning algorithm (11). Following these steps, the two parts were applied in conjunction with a reasoning engine to automatically distinguish MCI patients from normal controls (Figure 2B). The rule set was trained using MRI data from a population of 187 MCI patients and 177 normal controls selected from the Alzheimer's Disease Neuroimaging Initiative database (adni.loni.usc.edu). Evaluation of the results supported the conclusion that the approach can assist physicians and health care professionals to effectively implement clinical diagnosis of MCI under real-world conditions.

FIGURE 2. MRI-based decision support system for the diagnosis of MCI using ontology and data mining techniques (11). (A) The structure of the mild cognitive impairment (MCI) domain ontology. (B) The framework of decision support for MCI diagnosis. ROI, region of interest; NMR, nuclear magnetic resonance; CDD, cognitive deterioration diagnosis; SVM, support vector machine; MRI, magnetic resonance imaging.
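As a rough sketch of the classification core of this pipeline (the ontology and reasoning-engine stages are omitted), the snippet below trains an entropy-based decision tree on synthetic per-ROI cortical-thickness values. The cohort sizes follow the study; the ROI count and thickness distributions are invented for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic cortical-thickness table (mm): MCI rows are thinned slightly
# to mimic gray matter atrophy. The published rule set was trained on
# real ADNI scans.
rng = np.random.default_rng(2)
n_mci, n_nc, n_roi = 187, 177, 6
X = np.vstack([rng.normal(2.5, 0.2, (n_nc, n_roi)),
               rng.normal(2.3, 0.2, (n_mci, n_roi))])
y = np.array([0] * n_nc + [1] * n_mci)   # 0 = normal control, 1 = MCI

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

# Entropy-based tree as a C4.5 stand-in; its root-to-leaf paths form the
# rule set handed to the reasoning engine alongside the domain ontology.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=2)
clf.fit(X_tr, y_tr)
print(f"Held-out accuracy: {clf.score(X_te, y_te):.2f}")
```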
Discussion and conclusions
In summary, our studies have shown that the proposed models can be used to characterize declarative knowledge and procedural knowledge. In the proposed approach, emotion-context, MCI-context, and physiological data can be modeled as declarative knowledge based on ontology components that include classes, individuals, and properties. The relationships between emotion, MCI, and the physiological data can then be constructed as procedural knowledge using data mining techniques. The emotional state or the stage of the disease can then be predicted from the captured data, based on knowledge reasoning supported by an inference engine that concatenates declarative knowledge with the associated procedural knowledge.

Our proposed model enables the management of structured or unstructured data, converts the data into an invariant form (i.e., knowledge), and predicts future events (i.e., new knowledge). Such methods have great potential for use in brain-inspired intelligence technologies that demand the modeling of the declarative content and procedural content recorded in memory.

References
1. H. de Garis, C. Shuo, B. Goertzel, L. Ruiting, Neurocomputing 74, 3–29 (2010).
2. Y. Zeng, C.-L. Liu, T.-N. Tan, Chin. J. Comput. 39, 212–222 (2016).
3. J. Hawkins, S. Blakeslee, On Intelligence (Macmillan, London, 2007), pp. 60–71.
4. G. M. Edelman, Science 318, 1103–1105 (2007).
5. J. R. Anderson, C. Lebiere, Behav. Brain Sci. 26, 587–601 (2003).
6. M. A. Just, S. Varma, Cogn. Affect. Behav. Neurosci. 7, 153–191 (2007).
7. X. Zhang, B. Hu, P. Moore, J. Chen, L. Zhou, in 18th International Conference on Neural Information Processing (ICONIP 2011), Part II (Springer, Berlin/Heidelberg, Germany, 2011), pp. 89–98.
8. X. Zhang, B. Hu, J. Chen, P. Moore, World Wide Web 16, 479–513 (2013).
9. Y. Su, B. Hu, L. Xu, H. Cai, P. Moore, Proceedings of the 2014 IEEE International Conference on Bioinformatics and Biomedicine (IEEE, Belfast, United Kingdom, November 2014), pp. 529–535.
10. J. Chen, B. Hu, P. Moore, X. Zhang, X. Ma, Appl. Soft Comput. 30, 663–674 (2015).
11. X. Zhang, B. Hu, X. Ma, P. Moore, J. Chen, Comput. Methods Programs Biomed. 113, 781–791 (2014).
12. Z. Yao, B. Hu, C. Liang, L. Zhao, M. Jackson, PLOS ONE 7, e48973 (2012).

Acknowledgments
This work was supported by the National Basic Research Program of China (973 Program) (2014CB744600), the National Natural Science Foundation of China (61210010 and 61402211), the Natural Science Foundation of the Education Department of Gansu Province, China (2015A-008), and the Northwest Normal University Foundation (NWNU-LKQN-14-5).
