Humanoid Robots: New Developments

Edited by
Armando Carlos de Pina Filho

I-Tech
Vienna
Austria
Abstracting and non-profit use of the material is permitted with credit to the source. Statements and
opinions expressed in the chapters are those of the individual contributors and not necessarily those of
the editors or publisher. No responsibility is accepted for the accuracy of information contained in the
published articles. The publisher assumes no responsibility or liability for any damage or injury to persons or
property arising out of the use of any materials, instructions, methods or ideas contained inside. After
this work has been published by the Advanced Robotic Systems International, authors have the right to
republish it, in whole or part, in any publication of which they are an author or editor, and to make
other personal use of the work.
A catalogue record for this book is available from the Austrian Library.
Humanoid Robots, New Developments, Edited by Armando Carlos de Pina Filho
p. cm.
ISBN 978-3-902613-00-4
1. Humanoid Robots. 2. Applications. I. Armando Carlos de Pina Filho
Preface
For many years, human beings have tried, in every way, to recreate the complex mechanisms
that form the human body. This task is extremely complicated and the results are not yet
fully satisfactory. However, with increasing technological advances based on theoretical and
experimental research, we have managed, to some extent, to copy or to imitate certain systems
of the human body.
This research is intended not only to create humanoid robots, a great part of them
constituting autonomous systems, but also, in some way, to offer a deeper knowledge of the
systems that form the human body, aiming at possible applications in the technology of
rehabilitation of human beings, and bringing together studies related not only to Robotics, but
also to Biomechanics, Biomimetics, Cybernetics, among other areas.
This book presents a series of studies inspired by this ideal, carried out by various
researchers worldwide, which analyze and discuss diverse subjects related to humanoid
robots. The contributions presented here explore aspects of robotic hands, learning,
language, vision and locomotion.
From the great amount of interesting information presented here, I believe that this book can
aid new research, as well as stimulate the interest of people in this area of study related to
humanoid robots.
Editor
Armando Carlos de Pina Filho
Contents
Preface V
5. Biped Gait Generation and Control Based on Mechanical Energy Constraint 069
Fumihiko Asano, Masaki Yamakita, Norihiro Kamamichi and Zhi-Wei Luo
10. An Incremental Fuzzy Algorithm for The Balance of Humanoid Robots 171
Erik Cuevas, Daniel Zaldivar, Ernesto Tapia and Raul Rojas
11. Spoken Language and Vision for Adaptive Human-Robot Cooperation 185
Peter Ford Dominey
17. Central Pattern Generators for Gait Generation in Bipedal Robots 285
Almir Heralic, Krister Wolff and Mattias Wahde
29. Drum Beating and a Martial Art Bojutsu Performed by a Humanoid Robot 521
Atsushi Konno, Takaaki Matsumoto, Yu Ishida, Daisuke Sato & Masaru Uchiyama
30. On Foveated Gaze Control and Combined Gaze and Locomotion Planning 531
Kolja Kühnlenz, Georgios Lidoris, Dirk Wollherr and Martin Buss
1. Introduction
The development of a humanoid robot in the collaborative research centre 588 has the
objective of creating a machine that closely cooperates with humans. The
collaborative research centre 588 (SFB588) “Humanoid Robots – learning and
cooperating multi-modal robots” was established by the German Research
Foundation (DFG) in Karlsruhe in May 2000. The SFB588 is a cooperation of the
University of Karlsruhe, the Forschungszentrum Karlsruhe (FZK), the Research
Center for Information Technologies (FZI) and the Fraunhofer Institute for
Information and Data Processing (IITB) in Karlsruhe.
In this project, scientists from different academic fields develop concepts, methods and
concrete mechatronic components and integrate them into a humanoid robot that can share
its working space with humans. The long-term target is the interactive cooperation of robots
and humans in complex environments and situations. For communication with the robot,
humans should be able to use natural communication channels like speech, touch or
gestures. The demonstration scenario chosen in this project is a household robot for various
tasks in the kitchen.
Humanoid robots are still a young technology with many research challenges. Only a few
humanoid robots are currently commercially available, often at high cost. Physical
prototypes of robots are needed to investigate the complex interactions between robots
and humans and to integrate and validate research results from the different research
fields involved in humanoid robotics. The development of a humanoid robot platform
tailored to a specific target system at the beginning of a research project is often
considered a time-consuming hindrance. In this article a process for the efficient design of
humanoid robot systems is presented. The goal of this process is to minimize the
development time for new humanoid robot platforms by including the experience and
knowledge gained in the development of humanoid robot components in the
collaborative research centre 588.
Weight and stiffness of robot components have a significant influence on energy
efficiency, operating time, safety for users and the dynamic behaviour of the system in
general. The finite element based method of topology optimization gives designers the
possibility to develop structural components efficiently according to specified loads and
boundary conditions without having to rely on coarse calculations, experience or
intuition. The design of the central support structure of the upper body of the humanoid
robot ARMAR III is an example of how topology optimization can be applied in
humanoid robotics. Finally, the design of the upper body of the humanoid ARMAR III is
presented in detail.
3.1 Requirements
The demands result from the actions that the humanoid robot is supposed to perform. The
robot designed in the SFB 588 will interact with humans in their homes, especially in the
kitchen. It will take over tasks from humans, for example loading a dishwasher. For this
task it is not necessary that the robot can walk on two legs, but it has to feature kinematics,
especially in the arms, that enable it to reach for objects in the human environment. In
addition, the robot needs the ability to move and to hold objects in its hand (Schulz, 2003).
cannot easily be surveyed without help and which complicates an efficient, target-oriented
development. Therefore it is helpful to file the components of the joints, actuators and
sensors as objects in an object-oriented classification. This enables requirement-specific access
to the objects and delivers information about possible combinations of components.
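As an illustration of such a catalogue, a minimal sketch in Python is shown below; the classes, attribute names and the example query are hypothetical and only indicate the idea of requirement-specific access to component objects.

```python
from dataclasses import dataclass

@dataclass
class Actuator:
    name: str
    rated_torque_nm: float   # continuous output torque
    mass_kg: float
    diameter_mm: float

@dataclass
class Sensor:
    name: str
    kind: str                # e.g. "encoder", "torque", "tactile"
    resolution: str

@dataclass
class JointModule:
    name: str
    actuator: Actuator
    sensors: list

# A small catalogue of components (all values are illustrative only).
catalogue = [
    JointModule("elbow_flex",
                Actuator("dc_motor_A", rated_torque_nm=8.0, mass_kg=0.45, diameter_mm=32),
                [Sensor("enc_motor", "encoder", "2048 cpr"),
                 Sensor("opt_joint", "encoder", "quasi-absolute")]),
    JointModule("wrist_rot",
                Actuator("dc_motor_B", rated_torque_nm=2.5, mass_kg=0.20, diameter_mm=22),
                [Sensor("enc_motor", "encoder", "1024 cpr")]),
]

def modules_meeting_torque(cat, min_torque_nm):
    """Requirement-specific access: joint modules whose actuator meets a torque demand."""
    return [m.name for m in cat if m.actuator.rated_torque_nm >= min_torque_nm]

print(modules_meeting_torque(catalogue, 5.0))   # -> ['elbow_flex']
```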
the components of the total system that are integrated at a later stage are already considered from
the start with regard to their general requirements for integration. Examples of this
are the sensors, which can then be assembled without problems, since it is ensured that the
already designed system offers the possibility to integrate them. During the next step the
neighbouring module is designed. Information about the required interface of the shoulder
and the mass of the arm and its distribution are given to the torso module.
4. Topology optimization
Topology optimization is used for the determination of the basic layout of a new design. It
involves the determination of features such as the number, location and shape of holes and
the connectivity of the domain. A new design is determined based upon the available design
space, the loads, the materials and other geometric constraints, e.g. bearing seats, of which
the component is to be composed.
Today topology optimization is very well theoretically studied (Bendsoe & Sigmund, 2003)
and applied in industrial design processes (Pedersen & Allinger, 2005). The designs
obtained using topology optimization are considered design proposals. These topology
optimized designs can often be rather different compared to designs obtained with a trial
and error design process or designs obtained upon improvements based on experience or
intuition as can be deduced from the motor carrier example in Fig. 3. Especially for complex
loads, which are typical for systems like humanoid robots, these methods of structural
optimization are helpful within the design process.
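For readers unfamiliar with the method, the following is a minimal, schematic sketch of the density-update step used in SIMP-type topology optimization with an optimality-criteria scheme; the element sensitivities, which would normally come from a finite element analysis, are simply assumed to be given here.

```python
import numpy as np

def oc_density_update(x, dc, volfrac, move=0.2, l1=1e-9, l2=1e9):
    """One optimality-criteria update of element densities x.

    x       : current element densities in [0, 1]
    dc      : compliance sensitivities d(compliance)/d(x) per element (negative values),
              assumed to be supplied by an FE analysis
    volfrac : prescribed volume fraction of the design space
    """
    while (l2 - l1) / (l1 + l2) > 1e-4:                  # bisection on the Lagrange multiplier
        lmid = 0.5 * (l1 + l2)
        x_new = x * np.sqrt(np.maximum(-dc, 0) / lmid)   # scaling rule
        x_new = np.clip(x_new, x - move, x + move)       # move limit
        x_new = np.clip(x_new, 0.001, 1.0)               # density bounds
        if x_new.mean() > volfrac:
            l1 = lmid
        else:
            l2 = lmid
    return x_new

# Illustrative call with fabricated sensitivities for 100 elements.
x0 = np.full(100, 0.4)
dc = -np.linspace(1.0, 0.01, 100)   # stand-in for FE-derived sensitivities
x1 = oc_density_update(x0, dc, volfrac=0.4)
print(round(x1.mean(), 3))          # stays close to the 40% volume fraction
```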
The left picture in Fig. 5 shows the initial FE model of the available design space including
the geometric boundary conditions like the mechanical interfaces for the adjoining modules
neck, arms and torso joint as well as the space reserved for important components like
computers and controllers. Together with a set of static loads, this was the input for the
optimization process. The bottom left picture shows the design as it was suggested by the
optimization module after the final optimization loop. This design was then manually
transferred into a 3D model, taking manufacturing restrictions into account. The picture on
the right in Fig. 5 shows the assembled support structure made from high-strength
aluminium plates. The result of the optimization is a stiff and lightweight structure with a
total mass of 2.7 kg.
Due to this concept, load transmission is implemented with wire ropes, which are led
from the torso through the shoulder to the elbow by pulleys and Bowden cables. In order to
realize independent control of both degrees of freedom, the wire ropes for turning the
forearm are led through the axis of rotation for bending the elbow. With altogether twelve
pulleys, this rope guide decouples the motion of bending the elbow from rotating the
forearm. In contrast to the previous version of the elbow, where the steel cables
were guided by Bowden cables, this solution leads to smaller and constant friction losses,
which is advantageous for the control of this system.
Similar to the shoulder, the angular measurement is realized by encoders attached directly
to the motors as well as optical sensors that are located directly at the joint for both degrees
of freedom. In order to measure the drive torque, load cells are integrated into the wire ropes
in the upper arm. As each degree of freedom in the elbow is driven by two wire ropes, the
force in the wire ropes can be measured differentially. Another possibility for
measuring forces is offered by the tactile sensor skin, which is integrated into the
cylindrical casing of the upper arm.
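A minimal sketch of how such a differential force measurement could be turned into a joint torque estimate; the pulley radius and the numeric values are assumptions for illustration only.

```python
def joint_torque_from_ropes(f_rope_1, f_rope_2, pulley_radius_m=0.02):
    """Estimate the joint torque from the two antagonistic wire-rope forces.

    f_rope_1, f_rope_2 : load-cell readings in N for the two ropes driving one DOF
    pulley_radius_m    : effective radius at which the ropes act on the joint (assumed)
    """
    return (f_rope_1 - f_rope_2) * pulley_radius_m   # differential measurement in N*m

print(joint_torque_from_ropes(120.0, 80.0))  # 0.8 N*m for a 40 N force difference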
By placing the drive units in the thorax, there is sufficient design space left in the arm which
can be used for electronic components that process sensor signals and which can be installed
in direct proximity to the sensors in the upper arm.
By arranging the motors close to the elbow joint, the centre of mass of the forearm is
shifted towards the body, which is an advantage for movements of the robot arm.
Angular measurement in the wrist is realized by encoders at the motors and with
quasi-absolute angular sensors directly at the joint. To measure the load on the hand, a
6-axis force and torque sensor is fitted between the wrist and the hand (Beck et al.,
2003) (not shown in Fig. 11). The casing of the forearm is also equipped with a tactile
sensor skin. The support structure of the forearm consists of a square aluminium
profile. This rigid lightweight structure offers the possibility of cable routing on the
inside and enough space for mounting electronic components on the flat surfaces of
the exterior.
For the conversion of torque and rotational speed, the drive train of each degree of
freedom consists of Harmonic Drive transmissions either as only transmission element
or, depending on the needed overall gear ratio, in combination with a toothed gear
belt.
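As a simple illustration of how the stages of such a drive train combine (the ratios and the efficiency below are placeholder values, not the actual ARMAR III data):

```python
def drive_train_output(motor_speed_rpm, motor_torque_nm,
                       harmonic_drive_ratio=100, belt_ratio=2.0, efficiency=0.8):
    """Convert motor speed and torque through a toothed belt stage and a Harmonic Drive."""
    overall_ratio = harmonic_drive_ratio * belt_ratio
    output_speed = motor_speed_rpm / overall_ratio
    output_torque = motor_torque_nm * overall_ratio * efficiency
    return output_speed, output_torque

print(drive_train_output(6000, 0.1))  # -> (30.0 rpm, 16.0 N*m)
```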
The drives for all degrees of freedom in the neck are practically free from backlash. The
motors of all degrees of freedoms are placed as close as possible to the rotational axis in
order to keep the moment of inertia small. The sensors for the angular position
measurement in the neck consist of a combination of incremental encoders, which are
attached directly to the motors, and quasi-absolute optical sensors, which are placed directly
at the rotational axis. The neck module as depicted above weighs 1.6 kg.
Goals for further development are to reduce the weight, to increase the energy efficiency
and the payload, and to design a closed casing for all robot joints.
7. Acknowledgement
The work presented in this chapter is funded by the German Research Foundation DFG in the
collaborative research centre 588 “Humanoid robots - learning and cooperating multi- modal robots”.
8. References
Asfour, T. (2003). Sensomotorische Bewegungskoordination zur Handlungsausführung
eines humanoiden Roboters, Dissertation Faculty for Computer Science, University
of Karlsruhe (TH)
Beck, S.; Lehmann, A.; Lotz, T.; Martin, J.; Keppler, R.; Mikut, R. (2003). Model-based
adaptive control of a fluidic actuated robotic hand, Proc., GMA-Congress 2003
Bendsoe, M.; Sigmund, O. (2003). Topology Optimization – Theory, Methods, Applications,
Springer Verlag
Browning, T. R. (2001). Applying the Design Structure Matrix to System Decomposition and
Integration Problems: A Review and New Directions, IEEE Transaction on
Engineering Management, Vol. 48, No. 3
Häussler, P.; Emmrich, D.; Müller, O.; Ilzhöfer, B.; Nowicki, L.; Albers, A. (2001).
Automated Topology Optimization of Flexible Components in Hybrid Finite
Element Multibody Systems using ADAMS/Flex and MSC.Construct, ADAMS
European User's Conference, Berchtesgaden, Germany
Häussler, P. (2005). Ein neuer Prozess zur parameterfreien Formoptimierung dynamisch
beanspruchter Bauteile in mechanischen Systemen auf Basis von Lebensdaueranalysen und
hybriden Mehrkörpersystemen, dissertation Faculty for Mechanical Engineering, research
reports of the Institute for Product Development, University of Karlsruhe, ISSN 1615-8113
Kerpa, O.; Weiss, K.; Wörn, H. (2003). Development of Flexible Tactile Sensor System for a
Humanoid Robot, Intelligent Robots and Systems IROS, Las Vegas USA
Minx, J.; Häussler, P.; Albers, A.; Emmrich D.; Allinger, P. (2004). Integration von FEM, MKS
und Strukturoptimierung zur ganzheitlichen, virtuellen Entwicklung von
dynamisch beanspruchten Bauteilen, NAFEMS seminar, analysis of multibody
systems with FEM and MBS, October, 27th -28th, Wiesbaden
Ortiz, J.; Bir, G. (2006). Verification of New MSC.ADAMS Linearization Capability For Wind
Turbine Applications, 44th AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada
Pedersen, C.B.W.; Allinger, P. (2005). Recent Developments in the Commercial
Implementation of Topology Optimization. TopoptSYMP2005 - IUTAM-
Symposium- Topological design optimization of structures, machines and material
– status and perspectives, Copenhagen, Denmark
Schäfer, C. (2000). Entwurf eines anthropomorphen Roboterarms: Kinematik,
Arbeitsraumanalyse, Softwaremodellierung, dissertation Faculty for Computer
Science, University of Karlsruhe (TH)
Schulz, S. (2003). Eine neue Adaptiv-Hand-Prothese auf der Basis flexibler Fluidaktoren,
Dissertation, Faculty for Mechanical Engineering, University of Karlsruhe (TH) 2003
SysML Partners, (2005). Systems Modeling Language (SysML) specification version 1.0
alpha, 14, www.sysml.org
VDI Gesellschaft Entwicklung Konstruktion Vertrieb (Editor) (2004), VDI-Guideline 2206:
Design methodology for mechatronic systems; Beuth Verlag GmbH, Berlin
2
1. Introduction
Studies on biped robots have attracted interest due to such problems as their inherently poor
stability and the coordination of a large number of degrees of freedom. Furthermore, recent advanced
technology, including hardware and software, allows these problems to be tackled,
accelerating the interest. Actually, many sophisticated biped robots have already been
developed that have successfully achieved such various motions as straight walking,
turning, climbing slopes, rising motion, and running (Aoi & Tsuchiya, 2005; Aoi et al., 2004;
Hirai et al., 1998; Kuniyoshi et al., 2004; Kuroki et al. 2003; Löffler et al., 2003; Nagasaki et al.,
2004).
Steady gait for a biped robot implies a stable limit cycle in its state space. Therefore,
different steady gait patterns have different limit cycles, and gait transition indicates that the
state of the robot moves from one limit cycle to another. Even if the robot achieves steady gait
patterns, completing a transition between them is not guaranteed. Thus, smooth transition
between gait patterns remains difficult. To overcome such difficulty, many studies have
concentrated on model-based approaches using inverse kinematics and kinetics. These
approaches basically generate robot motions based on such criteria as zero moment point
(Vukobratović et al., 1990) and manipulate robot joints using motors. However, they require
accurate modeling of both the robot and the environment as well as complicated
computations. The difficulty of achieving adaptability to various environments in the real
world is often pointed out, which means that in these approaches the robot is too rigid to
react appropriately to environmental changes. Therefore, the key issue in the control is to
establish a soft robot by adequately changing the structure and response based on
environmental changes.
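For reference, under the common linear inverted pendulum assumptions (constant centre-of-mass height and negligible angular momentum about the centre of mass), the sagittal zero moment point reduces to x_zmp = x_com − (z_com / g) · ẍ_com; a minimal sketch:

```python
def zmp_x(x_com, x_com_acc, z_com, g=9.81):
    """Sagittal ZMP under the linear inverted pendulum assumptions
    (constant CoM height, negligible angular momentum about the CoM)."""
    return x_com - (z_com / g) * x_com_acc

# CoM 5 cm ahead of the ankle, accelerating forward at 1 m/s^2, CoM height 0.8 m:
print(round(zmp_x(0.05, 1.0, 0.8), 3))   # ~ -0.032 m (ZMP moves behind the CoM)
```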
In contrast to robots, millions of animal species adapt themselves to various environments
by cooperatively manipulating their complicated and redundant musculoskeletal systems.
Many studies have elucidated the mechanisms in their motion generation and control. In
particular, neurophysiological studies have revealed that muscle tone control plays
important roles in generating adaptive motions (Mori, 1987; Rossignol, 1996; Takakusaki et
al., 2003), suggesting the importance of compliance in walking. Actually, many studies on
robotics have demonstrated the essential roles of compliance. Specifically, by appropriately
employing the mechanical compliance of robots, simple control systems have attained
highly adaptive and robust motions, especially in hexapod (Altendorfer et al., 2001; Cham et
al., 2004; Quinn et al., 2003; Saranli et al., 2001), quadruped (Fukuoka et al., 2003; Poulakakis
et al., 2005), and biped robots (Takuma & Hosoda, 2006; Wisse et al., 2005). However, note
that control systems using motors continue to have difficulty adequately manipulating
compliance in robot joints.
On the other hand, neurophysiological studies have also clarified that animal walking is
generated by central pattern generators (CPGs) that generate rhythmic signals to activate
their limbs (Grillner, 1981, 1985; Orlovsky et al., 1999). CPGs modulate signal generation in
response to sensory signals, resulting in adaptive motions. CPGs are widely modeled using
nonlinear oscillators (Taga et al., 1991; Taga, 1995a,b), and based on such CPG models many
walking robots and their control systems have been developed, in particular, for quadruped
robots (Fukuoka et al., 2003; Lewis & Bekey, 2002; Tsujita et al., 2001), multi-legged robots
(Akimoto et al., 1999; Inagaki et al., 2003), snake-like robots (Ijspeert et al., 2005; Inoue et al.,
2004), and biped robots (Aoi & Tsuchiya, 2005; Aoi et al., 2004; Lewis et al., 2003; Nakanishi
et al., 2004).
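A minimal sketch of such a CPG model using mutually coupled phase oscillators is given below; the coupling gain and the desired phase offsets are illustrative choices, not values taken from the cited works.

```python
import numpy as np

def cpg_step(phases, omega, dt, K=2.0, offsets=None):
    """Advance coupled phase oscillators by one time step.

    Each oscillator obeys  d(phi_i)/dt = omega + K * sum_j sin(phi_j - phi_i - delta_ij),
    so the network settles into the phase relationship encoded by delta_ij."""
    n = len(phases)
    if offsets is None:
        offsets = np.zeros((n, n))
    dphi = np.full(n, float(omega))
    for i in range(n):
        for j in range(n):
            if i != j:
                dphi[i] += K * np.sin(phases[j] - phases[i] - offsets[i, j])
    return (phases + dt * dphi) % (2 * np.pi)

# Two leg oscillators kept in anti-phase (desired offset of pi), as in alternating gaits.
offsets = np.array([[0.0, np.pi], [np.pi, 0.0]])
phases = np.array([0.0, 0.3])
for _ in range(2000):
    phases = cpg_step(phases, omega=2 * np.pi, dt=0.001, offsets=offsets)
print(round((phases[1] - phases[0]) % (2 * np.pi), 2))  # -> approximately pi
```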
This paper deals with the transition from quadrupedal to bipedal locomotion of a biped
robot while walking. These gait patterns originally have poor stability, and the transition
requires drastic changes in robot posture, which aggravates the difficulty of establishing the
transition without falling over. Our previous work developed a simple control system using
nonlinear oscillators by focusing on CPG characteristics that are used for both quadruped
and biped robots, revealing that they achieved steady and robust walking verified by
numerical simulations and hardware experiments (Aoi & Tsuchiya, 2005; Aoi et al., 2004;
Tsujita et al., 2001). In this paper, we use the same developed control system for both
quadrupedal and bipedal locomotion of a biped robot and attempt to establish smooth gait
transition. Specifically, we achieve stable limit cycles of these gait patterns and their
transitions by moving the robot state from one limit cycle to another by cooperatively
manipulating their physical kinematics through numerical simulations. This paper is
organized as follows. Section 2 introduces the biped robot model considered in this paper.
Section 3 explains the developed locomotion control system, and Section 4 addresses the
approach of gait transition and numerical results. Section 5 describes the discussion and
conclusion.
swing and stance phases, and a harmonious balance must be achieved between these
kinematical motions and dynamical interaction, which means that it is essential to
adequately switch from one phase to another. Therefore, our developed control system
focused on this point. Specifically, it modulated the signal generation of the oscillators and
appropriately changed the leg motions from the swing to the stance phase based on touch
sensors. Although we concisely describe the developed control system below, see our
previous work (Aoi & Tsuchiya, 2005) for further details.
trajectory consists of swing and stance phases (see Fig. 3). The former is composed of a
simple closed curve that includes anterior extreme position (AEP) and posterior extreme
position (PEP). This trajectory starts from point PEP and continues until the leg touches the
ground. On the other hand, the latter consists of a straight line from the foot landing
position (LP) to point PEP. Therefore, this trajectory depends on the timing of foot contact
with the ground in each step cycle. Both in the swing and stance phases, nominal foot
movement is designed to be parallel to the line that involves points AEP and PEP. The
height and forward bias from the center of points AEP and PEP to Joint 3 of the leg are
defined as parameters Δ_L and H_L, respectively. These two nominal foot trajectories provide
nominal trajectories θ̂_{Lij} (i = 1, 2; j = 3, 4, 5) of Joint j (hip, knee, and ankle pitch joints) of Leg i as
functions of the phase φ_{Li} of the Leg i oscillator, written θ̂_{Lij}(φ_{Li}), where φ_{Li} = 0 at
point PEP and φ_{Li} = φ̂_AEP at point AEP. Note that the nominal stride Ŝ is given by the distance
between points AEP and PEP, and the duty factor β̂ is given by the ratio of the nominal
stance phase duration to the step cycle duration.
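A schematic sketch of how such a phase-parametrized nominal foot trajectory could be generated; the half-ellipse swing curve and the numeric values are illustrative assumptions rather than the controller of the cited work.

```python
import numpy as np

def nominal_foot_position(phi, phi_aep, stride=0.06, clearance=0.02):
    """Nominal foot position (x forward, z up) as a function of oscillator phase phi.

    Swing phase  (0 <= phi < phi_aep): a simple closed curve from PEP to AEP with foot clearance.
    Stance phase (phi_aep <= phi < 2*pi): a straight line moving the foot back towards PEP."""
    x_pep, x_aep = -stride / 2.0, stride / 2.0
    if phi < phi_aep:                          # swing: half-ellipse above the ground
        s = phi / phi_aep                      # 0 at PEP, 1 at AEP
        x = x_pep + (x_aep - x_pep) * s
        z = clearance * np.sin(np.pi * s)
    else:                                      # stance: straight line on the ground
        s = (phi - phi_aep) / (2 * np.pi - phi_aep)
        x = x_aep + (x_pep - x_aep) * s
        z = 0.0
    return x, z

print(nominal_foot_position(0.0, phi_aep=np.pi * 0.8))          # at PEP, on the ground
print(nominal_foot_position(np.pi * 0.4, phi_aep=np.pi * 0.8))  # mid-swing, foot lifted
```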
φ̇_I = ω̂ + g_{1I}
φ̇_T = ω̂ + g_{1T}
φ̇_{Ai} = ω̂ + g_{1i}^A + g_{2i}^A,   i = 1, 2        (1)
φ̇_{Li} = ω̂ + g_{1i}^L + g_{2i}^L,   i = 1, 2
where g_{1I}, g_{1T}, g_{1i}^A, and g_{1i}^L (i = 1, 2) are functions for the nominal phase
relationship shown below, g_{2i}^A and g_{2i}^L (i = 1, 2) are functions arising from the sensory
signals given below, and ω̂ is the nominal angular velocity of each oscillator, obtained by
ω̂ = 2π (1 − β̂) / T̂_sw        (2)
The touch sensor signals not only modulate the walking rhythm but also switch the leg motions
from the swing to the stance phase, as described in Sec. 3.2.1.
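A minimal sketch of this kind of sensory modulation, in which the oscillator phase is advanced to the AEP value when the touch sensor detects ground contact; the specific reset rule used here is an illustrative assumption.

```python
import numpy as np

def leg_oscillator_step(phi, omega_hat, touch_down, phi_aep, dt):
    """One integration step of a leg oscillator with touch-sensor feedback.

    When the foot touches the ground during swing (phi < phi_aep), the phase is set to
    phi_aep, which switches the nominal trajectory from the swing to the stance phase."""
    if touch_down and phi < phi_aep:
        return phi_aep                      # sensory event: start the stance phase immediately
    return (phi + omega_hat * dt) % (2 * np.pi)

phi = 0.9 * np.pi * 0.8                     # late swing
print(round(leg_oscillator_step(phi, omega_hat=2 * np.pi, touch_down=True,
                                phi_aep=np.pi * 0.8, dt=0.005), 3))  # -> phi_aep
```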
4. Gait transition
This paper aims to achieve gait transition from quadrupedal to bipedal locomotion of a
biped robot. As mentioned above, even if the robot establishes steady quadrupedal and
bipedal locomotion, that is, each locomotion has a stable limit cycle in the state space of the
robot, there is no guarantee that the robot can accomplish a transition from one limit cycle of
quadrupedal locomotion to another of bipedal locomotion without falling over.
Furthermore, these gait patterns originally have poor stability, and the transition requires
drastic changes in robot posture, which aggravates the difficulty of preventing the robot
from falling over during the transition.
L_A = l_{TA} sin ψ_T + Δ_A        (5)
L_L = l_{TL} sin ψ_T + Δ_L
The aim is to use parameters ξ_1 and ξ_2 to change the distance between the foot and hand
trajectories and the posture of the trunk, respectively. Using this controller, gait transition is
achieved by simply changing the introduced parameters (ξ_1, ξ_2) from (0, 0) to (1, 1), as shown in Fig. 5.
Fig. 5. Trajectories in the ξ_1-ξ_2 plane for gait transition from quadrupedal to bipedal locomotion.
Note that parameter ξ_2 > 0 geometrically indicates that the robot becomes supported only by
its legs, that is, the appearance of bipedal locomotion.
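A minimal sketch of this transition strategy, in which the two parameters are used to blend the kinematic parameters of the two gaits; the parameter names, the linear blending rule and the example values are illustrative assumptions.

```python
def interpolate_gait(xi1, xi2, quad_params, biped_params):
    """Blend the kinematic parameters of the quadrupedal and bipedal gaits.

    xi1 scales the parameters tied to the hand/foot trajectories,
    xi2 scales the parameters tied to the trunk posture."""
    blended = {}
    for key, q_val in quad_params.items():
        b_val = biped_params[key]
        xi = xi2 if key.startswith("trunk") else xi1
        blended[key] = (1.0 - xi) * q_val + xi * b_val
    return blended

quad =  {"stride": 0.04, "foot_height": 0.02, "trunk_pitch": 0.9}
biped = {"stride": 0.06, "foot_height": 0.03, "trunk_pitch": 0.0}

# Quadrupedal walking, an intermediate posture, and bipedal walking:
for xi1, xi2 in [(0.0, 0.0), (1.0, 0.5), (1.0, 1.0)]:
    print(interpolate_gait(xi1, xi2, quad, biped))
```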
Fig. 9. Trajectory of the center of mass of the robot projected on the ground, and contact positions of
the hands and the centers of the feet on the ground. The times shown for the feet and hands indicate
when they land on the ground.
5. Discussion
Kinematical and dynamical studies on biped robots are important for robot control. As
described above, although model-based approaches using inverse kinematics and kinetics
have generally been used, the difficulty of establishing adaptability to various environments
as well as complicated computations has often been pointed out. In this paper, we employed
an internal structure composed of nonlinear oscillators that generated robot kinematics and
adequately responded based on environmental situations and achieved dynamically stable
quadrupedal and bipedal locomotion and their transition in a biped robot. Specifically, we
generated robot kinematical motions using rhythmic signals from internal oscillators. The
oscillators appropriately responded to sensory signals from touch sensors and modulated
28 Humanoid Robots, New Developments
the rhythmic signals and physical kinematics, resulting in dynamically stable walking of the
robot. This means that a robot driven by this control system established dynamically stable
and adaptive walking through the interaction between the dynamics of the mechanical
system, the oscillators, and the environment. Furthermore, this control system needed
neither accurate modeling of the robot and the environment nor complicated computations.
It just relied on the timing of the touch sensor signals: it is a simple control system.
Since biped robots generate various motions by manipulating many degrees of freedom, the
key issue in control remains how to design their coordination. In this paper, we expressed
two types of gait patterns using a set of several kinematical parameters and introduced two
independent parameters that parameterized the kinematical parameters. Based on the
introduced parameters, we changed the gait patterns and established gait transition. That is,
we did not individually design the kinematical motion of all robot joints, but imposed
kinematical restrictions on joint motions and contracted the degrees of freedom to achieve
cooperative motions during the transition. Furthermore, we used the same control system
between quadrupedal and bipedal locomotion except for the physical kinematics, which
facilitated the design of such cooperative motions and established a smooth gait transition.
As mentioned above, the analysis of EMG patterns in animal motions clarified that common
EMG patterns are embedded in the EMG patterns of different motions, despite generating
such motions using complicated and redundant musculoskeletal systems (d'Avella & Bizzi,
2005; Patla et al., 1985; Ivanenko et al., 2004, 2006), suggesting an important coordination
mechanism. In addition, kinematical studies revealed that covariation of the elevation
angles of thigh, shank, and foot during walking displayed in three-dimensional space is
approximately expressed on a plane (Lacquaniti et al., 1999), suggesting an important
kinematical restriction for establishing cooperative motions. In designing a control system,
adequate restrictions must be designed to achieve cooperative motions.
Physiological studies have investigated gait transition from quadrupedal to bipedal
locomotion to elucidate the origin of bipedal locomotion. Mori et al. (1996), Mori (2003), and
Nakajima et al. (2004) experimented on gait transition using monkeys and investigated the
physiological differences in the control system. Animals generate highly coordinated and
skillful motions as a result of the integration of nervous, sensory, and musculoskeletal
systems. Such motions of animals and robots are both governed by dynamics. Studies on
robotics are expected to contribute to the elucidation of the mechanisms of animals and their
motion generation and control from a dynamical viewpoint.
6. Acknowledgment
This paper is supported in part by Center of Excellence for Research and Education on
Complex Functional Mechanical Systems (COE program of the Ministry of Education,
Culture, Sports, Science and Technology, Japan) and by a Grant-in-Aid for Scientific
Research on Priority Areas “Emergence of Adaptive Motor Function through Interaction
between Body, Brain and Environment” from the Japanese Ministry of Education, Culture,
Sports, Science and Technology.
7. References
Akimoto, K.; Watanabe, S. & Yano, M. (1999). An insect robot controlled by emergence of
gait patterns. Artificial Life and Robotics, Vol. 3, 102–105.
Altendorfer, R.; Moore, N.; Komsuoglu, H.; Buehler, M.; Brown Jr., H.; McMordie, D.;
Saranli, U.; Full, R. & Koditschek, D. (2001). RHex: A biologically inspired hexapod
runner. Autonomous Robots, Vol. 11, No. 3, 207–213.
Aoi, S. & Tsuchiya, K. (2005). Locomotion control of a biped robot using nonlinear
oscillators. Autonomous Robots, Vol. 19, No. 3, 219–232.
Aoi, S.; Tsuchiya, K. & Tsujita, K. (2004). Turning control of a biped locomotion robot using
nonlinear oscillators. Proc. IEEE Int. Conf. on Robotics and Automation, pp. 3043-3048.
Cham, J.; Karpick, J. & Cutkosky, M. (2004). Stride period adaptation of a biomimetic
running hexapod. Int. J. Robotics Res., Vol. 23, No. 2, 141–153.
d’Avella, A. & Bizzi, E. (2005). Shared and specific muscle synergies in natural motor
behaviors. PNAS, Vol. 102, No. 8, 3076–3081.
Fukuoka, Y.; Kimura, H. & Cohen, A. (2003). Adaptive dynamic walking of a quadruped robot on
irregular terrain based on biological concepts. Int. J. Robotics Res., Vol. 22, No. 3-4, 187–202.
Grillner, S. (1981). Control of locomotion in bipeds, tetrapods and fish. Handbook of Physiology,
American Physiological Society, Bethesda, MD, pp. 1179–1236.
Grillner, S. (1985). Neurobiological bases of rhythmic motor acts in vertebrates. Science, Vol.
228, 143–149.
Hirai, K.; Hirose, M.; Haikawa, Y. & Takenaka, T. (1998). The development of the Honda
humanoid robot. Proc. IEEE Int. Conf. on Robotics and Automation, pp. 1321-1326.
Ijspeert, A.; Crespi, A. & Cabelguen, J. (2005). Simulation and robotics studies of salamander
locomotion. Applying neurobiological principles to the control of locomotion in
robots. Neuroinformatics, Vol. 3, No. 3, 171–196.
Inagaki, S.; Yuasa, H. & Arai, T. (2003). CPG model for autonomous decentralized multi-
legged robot system–generation and transition of oscillation patterns and dynamics
of oscillators. Robotics and Autonomous Systems, Vol. 44, No. 3-4, 171–179.
Inoue, K.; Ma, S. & Jin, C. (2004). Neural oscillator network-based controller for meandering
locomotion of snake-like robots. Proc. IEEE Int. Conf. on Robotics and Automation, pp.
5064–5069.
Ivanenko, Y.; Poppele, R. & Lacquaniti, F. (2004). Five basic muscle activation patterns
account for muscle activity during human locomotion. J. Physiol., Vol. 556, 267–282.
Ivanenko, Y.; Poppele, R. & Lacquaniti, F. (2006). Motor control programs and walking.
Neuroscientist, Vol. 12, No. 4, 339–348.
Kuniyoshi, Y.; Ohmura, Y.; Terada, K.; Nagakubo, A.; Eitokua, S. & Yamamoto, T. (2004).
Embodied basis of invariant features in execution and perception of whole-body
dynamic actions: knacks and focuses of Roll-and-Rise motion. Robotics and
Autonomous Systems, Vol. 48, No. 4, 189-201.
Kuroki, Y.; Fujita, M.; Ishida, T.; Nagasaka, K. & Yamaguchi, J. (2003). A small biped
entertainment robot exploring attractive applications. Proc. IEEE Int. Conf. on
Robotics and Automation, pp. 471-476.
Lacquaniti, F.; Grasso, R. & Zago, M. (1999). Motor patterns in walking. News Physiol. Sci.,
Vol. 14, 168–174.
Lewis, M. & Bekey, G. (2002). Gait adaptation in a quadruped robot. Autonomous Robots, Vol.
12, No. 3, 301–312.
Lewis, M.; Etienne-Cummings, R.; Hartmann, M.; Xu, Z. & Cohen, A. (2003). An in silico
central pattern generator: silicon oscillator, coupling, entrainment, and physical
computation. Biol. Cybern., Vol. 88, 137–151.
Löffler, K.; Gienger, M. & Pfeiffer, F. (2003). Sensors and control concept of walking
“Johnnie”. Int. J. Robotics Res., Vol. 22, No. 3-4, 229–239.
Mori, S. (1987). Integration of posture and locomotion in acute decerebrate cats and in
awake, free moving cats. Prog. Neurobiol., Vol. 28, 161–196.
Mori, S. (2003). Higher nervous control of quadrupedal vs bipedal locomotion in non-
human primates; Common and specific properties. Proc. 2nd Int. Symp. on Adaptive
Motion of Animals and Machines.
Mori, S.; Miyashita, E.; Nakajima, K. & Asanome, M. (1996). Quadrupedal locomotor
movements in monkeys (M. fuscata) on a treadmill: Kinematic analyses.
NeuroReport, Vol. 7, 2277–2285.
Nagasaki, T.; Kajita, S.; Kaneko, K.; Yokoi, K. & Tanie, K. (2004). A running experiment of
humanoid biped. Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pp. 136-141.
Nakajima, K.; Mori, F.; Takasu, C.; Mori, M.; Matsuyama, K. & Mori, S. (2004).
Biomechanical constraints in hindlimb joints during the quadrupedal versus
bipedal locomotion of M. fuscata. Prog. Brain Res., Vol. 143, 183–190.
Nakanishi, J.; Morimoto, J.; Endo, G.; Cheng, G.; Schaal, S. & Kawato, M. (2004). Learning
from demonstration and adaptation of biped locomotion. Robotics and Autonomous
Systems, Vol. 47, No. 2-3, 79–91.
Orlovsky, G.; Deliagina, T. & Grillner, S. (1999). Neuronal control of locomotion: from mollusc to
man. Oxford University Press.
Patla, A.; Calvert, T. & Stein, R. (1985). Model of a pattern generator for locomotion in
mammals. Am. J. Physiol., Vol. 248, 484–494.
Poulakakis, I.; Smith, J. & Buehler, M. (2005). Modeling and experiments of untethered
quadrupedal running with a bounding gait: The Scout II robot. Int. J. Robotics Res.,
Vol. 24, No. 4, 239–256.
Quinn, R.; Nelson, G.; Bachmann, R.; Kingsley, D.; Offi, J.; Allen, T. & Ritzmann, R. (2003).
Parallel complementary strategies for implementing biological principles into
mobile robots. Int. J. Robotics Res., Vol. 22, No. 3, 169–186.
Rossignol, S. (1996). Neural control of stereotypic limb movements. Oxford University Press.
Saranli, U.; Buehler, M. & Koditschek, D. (2001). RHex: A simple and highly mobile hexapod
robot. Int. J. Robotics Res., Vol. 20, No. 7, 616–631.
Taga, G. (1995a). A model of the neuro-musculo-skeletal system for human locomotion I.
Emergence of basic gait. Biol. Cybern., Vol. 73, 97–111.
Taga, G. (1995b). A model of the neuro-musculo-skeletal system for human locomotion II. -
Real-time adaptability under various constraints. Biol. Cybern., Vol. 73, 113–121.
Taga, G.; Yamaguchi, Y. & Shimizu, H. (1991). Self-organized control of bipedal locomotion
by neural oscillators in unpredictable environment. Biol. Cybern., Vol. 65, 147–159.
Takakusaki, K.; Habaguchi, T.; Ohtinata-Sugimoto, J.; Saitoh, K. & Sakamoto, T. (2003). Basal
ganglia efferents to the brainstem centers controlling postural muscle tone and
locomotion: A new concept for understanding motor disorders in basal ganglia
dysfunction. Neuroscience, Vol. 119, 293–308.
Takuma, T. & Hosoda, K. (2006). Controlling the walking period of a pneumatic muscle
walker. Int. J. Robotics Res., Vol. 25, No. 9, 861–866.
Tsujita, K.; Tsuchiya, K. & Onat, A. (2001). Adaptive gait pattern control of a quadruped locomotion
robot. Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pp. 2318–2325.
Vukobratović, M.; Borovac, B.; Surla, D. & Stokić, D. (1990). Biped locomotion: dynamics,
stability, control and application. Springer-Verlag.
Wisse, M.; Schwab, A.; van der Linde, R. & van der Helm, F. (2005). How to keep from
falling forward: elementary swing leg action for passive dynamic walkers. IEEE
Trans. Robotics, Vol. 21, No. 3, 393–401.
3
1. Introduction
One of the main objectives of current research on walking robots is to make their gaits more
fluid by introducing imbalance phases. For example, for walking robots without actuated
ankles, which are under actuated in single support, dynamically stable walking gaits have
been designed with success (Aoustin & Formal’sky 2003; Chevallereau et al. 2003; Zonfrilli
et al. 2002; Aoustin et al. 2006; Pratt et al. 2001; Westervelt et al. 2003). Both the design of
reference motions and trajectories and the control of the mechanism along these trajectories
usually require the knowledge of the whole state of the robot. This state contains internal
variables which are easy to measure using encoders for example, and also the absolute
orientation of the robot with respect to the horizontal plane. For robots with unilateral
constraints with the ground, it may not be possible to derive the latter information from
internal measurements, as in the case of the absolute orientation of a biped in single
support. Of course, technological solutions exist such as accelerometers, gyrometers, inertial
units… But the implementation of these solutions is expensive and difficult.
In order to overcome these difficulties, we propose to use a state observer which, based
on the measurements of the joint variables and on a dynamic model of the robot, provides
estimates of the absolute orientation of the walking robot during imbalance phases. This
strategy was first validated in simulation for a three link biped without feet, using
nonlinear high gain observers and a nonlinear observer based on sliding modes with a
finite time convergence (Lebastard et al. 2006a) and (Lebastard et al. 2006b), for walking
gaits composed of single support phases and impacts. The main drawback with this
family of observers is that, when only some of the degrees of freedom are measured, a
state coordinates transformation is necessary to design their canonical form (Gauthier &
Bornard 1981; Krener & Respondek 1985; Bornard & Hammouri 1991; Plestan &
Glumineau 1997).
In this chapter, the observer is an extended Kalman filter and it is applied to
SemiQuad, a prototype walking robot built at our institute. SemiQuad is a five link
mechanism with a platform and two double-link legs. It is designed to study quadruped
gaits where both front legs (resp. rear legs) have identical movements. Its unique front
leg (resp. rear leg) acts as the two front legs (resp. rear legs) of the quadruped, so that
SemiQuad can be considered as an implementation of a virtual quadruped, hence its
32 Humanoid Robots, New Developments
name. The legs have passive (uncontrolled) feet that extend in the frontal plane. Due to
this, the robot is stable in the frontal plane. Four motors located in the haunches and knees
drive the mechanism. The locomotion of the prototype is a planar curvet gait. In double
support, our prototype is statically stable and over actuated. In single support, it is
unstable and under actuated.
The reference walking gaits have been designed using an intuitive strategy such that
the absolute orientation is not required. Still, the gait contains imbalance phases during which
the observer can be used and its results evaluated. Since the estimation results are correct,
such an observer can be used in contexts where the absolute angle is necessary.
Furthermore, the idea can be useful for other types of walking robots, such as bipeds with
imbalance phases.
The organization of this chapter is the following. Section 2 is devoted to the model of
SemiQuad. It also contains its physical parameters. The intuitive gaits which were
designed for SemiQuad are presented in Section 3. The problem of estimating the
absolute orientation of SemiQuad is stated in Section 4. Simulation results
and experimental results are presented in Section 5. Section 6 presents our conclusions and
perspectives.
The maximum value of the torque at the output shaft of each motor gearbox is 40 N·m. This
saturation on the torques is taken into account in the design of the control law, both in simulation
and in experiments. The total mass of the quadruped is approximately 14 kg. The four actuated
joints of the robot are each equipped with one encoder to measure the angular position. The
angular velocities are calculated from the angular positions. Currently, the absolute
platform orientation angle α (see Figure 2) is not measured. As explained before, the
proposed walking gait does not require this measurement. Each encoder has 2000 pts/rev
and is attached directly to the motor shaft. The bandwidth of each joint of the robot is
determined by the transfer function of the mechanical power train (motors, gearboxes) and
the motor amplifiers that drive each motor. In the case of SemiQuad, the bandwidth is
approximately 16 Hz for the mechanical part of the joints and approximately 1.7 kHz for the
amplifiers. The control computations are performed with a sample period of 5 ms (200 Hz).
The software is developed in the C language.
Fig. 2. SemiQuad’s diagram: generalized coordinates, torques, forces applied to the leg tips,
locations of mass centers.
ẋ = [ q̇ ;  D⁻¹(−H q̇ − G + BΓ) ] = f(x) + g(q_rel) Γ        (3)
with x = [q^T q̇^T]^T and the joint angle vector q_rel = [θ_1 θ_2 θ_3 θ_4]^T. The state space is
defined as X = { x = [q^T q̇^T]^T ∈ R^10 : ‖q̇‖ < q̇_M < ∞, q ∈ ]−π, π]^5 }. From these definitions, it
is clear that all state coordinates are bounded.
With this method, a discrete model of (3) is calculated. If we add the equation
corresponding to the measurement of the joint angles, we obtain the following system:
x_{k+1} = f_s(x_k, Γ_k)
y_k = h(x_k) = (θ_1 θ_2 θ_3 θ_4)^T = C x_k        (7)
with C = [I_{4×4}  0_{4×6}]
before and just after impact, respectively. Furthermore, the velocity of the stance leg tip
before impact is null. Then we have:
θ_i^d(t) = a_0 + a_1 t + a_2 t² + a_3 t³     if t < T_p        (11)
θ_i^d(t) = θ_i^d(T_p)                        if T_p ≤ t ≤ T_ss
The coefficients of these polynomials are calculated from a choice of initial, intermediate and
final configurations and of the initial velocity for each joint. Just after impact, the next
half step begins and a similar strategy is applied (figure 4). The tracking of the reference
trajectories is achieved using a PD controller. The gains, which were adjusted using pole
placement, were tested in simulation and in experiments. Figure 3 shows the evolution of
the absolute orientation variable α(t) and its velocity α̇(t), obtained from the simulation of
SemiQuad for five steps. These graphs clearly show that the dynamics of the absolute
orientation cannot be neglected in the design of a control law based on state feedback. The
durations of the double support phase and the single support phase are 1.5 s and 0.4 s,
respectively.
Fig. 3. Evolution of α(t) and α̇(t) in simulation during the walking gait for five steps.
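As a sketch of how such cubic reference trajectories can be obtained, the coefficients in eq. (11) can be found by solving a small linear system for the imposed conditions; the boundary values below are placeholders, not SemiQuad's actual configurations.

```python
import numpy as np

def cubic_coefficients(theta0, dtheta0, theta_mid, theta_f, t_mid, t_f):
    """Coefficients a0..a3 of theta(t) = a0 + a1 t + a2 t^2 + a3 t^3 satisfying
    an initial configuration and velocity, an intermediate configuration at t_mid,
    and a final configuration at t_f."""
    A = np.array([[1.0, 0.0,   0.0,        0.0],
                  [0.0, 1.0,   0.0,        0.0],
                  [1.0, t_mid, t_mid**2,   t_mid**3],
                  [1.0, t_f,   t_f**2,     t_f**3]])
    b = np.array([theta0, dtheta0, theta_mid, theta_f])
    return np.linalg.solve(A, b)

a = cubic_coefficients(theta0=0.2, dtheta0=0.0, theta_mid=0.5, theta_f=0.4,
                       t_mid=0.2, t_f=0.4)
print(np.round(a, 2))   # reference joint trajectory for one half step
```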
O = ( dh_1 … dL_f^{k_1−1} h_1 … dh_p … dL_f^{k_p−1} h_p )        (12)
where dh is the gradient vector of function h (see system (7)) with respect to the state vector
x, and dLfh is the Lie derivative of h along the vector field f. We have also checked that the
equivalent property is satisfied by the discrete model. This means that, at each sampling
time tk, it is possible to find an observability matrix with 10 independent rows or columns.
Having checked system observability, we propose an extended Kalman filter to observe the
absolute orientation. The design of this extended Kalman filter is now described.
In practice, due to uncertainty in the determination of parameters and to angular
measurement errors, system (3), and of course system (7), are only approximations. A
standard solution is to represent modelling and measurement errors as additive noises
disturbing the system.
Let us consider the associated ``noisy’’ system:
x_{k+1} = f_s(x_k, Γ_k) + α_k
y_k = C x_k = (θ_1 θ_2 θ_3 θ_4)^T + β_k        (13)
In the case of a linear system, if α_k and β_k are white Gaussian noises, mutually independent
and independent of x, the Kalman filter is an optimal estimator. When the system is not linear,
it is possible to use the extended Kalman filter (EKF) by linearizing the evolution equation fs
and the observation equation (which is in our case linear) around the current state estimate.
Although convergence and optimality are no longer guaranteed, the interest and the
effectiveness of the extended Kalman filter have been proved in many experimental cases. The
extended Kalman filter is very often used as a state observer (Bonnifait & Garcia 1998).
Fig. 4. Half steps n and n+1 of SemiQuad as a sequence of stick figures:
(a) Double support (half step n): the projection of the platform center is halfway between the leg tips.
(b) Double support (half step n): the projection of the platform center is closer to the front leg tip.
(c) Double support (half step n): the front leg is unbent just before take-off (before the single support).
(d) Single support (half step n): just after the jump of the front leg, the front leg is bent.
(f) Double support (end of half step n and start of half step n+1): just after landing with an impact of the front leg. After half step n, the platform center has moved forward.
(g) Double support (half step n+1): the projection of the platform center is closer to the back leg tip.
(h) Double support (half step n+1): the back leg is unbent just before take-off (before the next single support phase).
(i) Single support (half step n+1): just after the jump of the back leg, the back leg is bent.
In the case of our system, the equations of the extended Kalman filter are:
• a priori estimation (uses data available before t_{k+1}):
x̂⁻_{k+1} = f_s(x̂_k, Γ_k)
P⁻_{k+1} = A_k P_k A_k^T + Q_α        (14)
• a posteriori estimation (uses data available at t_{k+1}):
x̂_{k+1} = x̂⁻_{k+1} + K_{k+1} ( y_{k+1} − C x̂⁻_{k+1} )
P_{k+1} = ( I_{10×10} − K_{k+1} C ) P⁻_{k+1}        (15)
with
A_k = (∂f_s/∂x)|_{x = x̂_k}    and    K_{k+1} = P⁻_{k+1} C^T ( C P⁻_{k+1} C^T + Q_β )⁻¹
Here y_{k+1} are the measured joint variables, which are the first four components of the state
vector, and C x̂⁻_{k+1} is the a priori estimate of these joint variables. Q_α and Q_β are the
covariance matrices of α_k and β_k, K is the Kalman gain and P is the covariance matrix of the
prediction (P⁻_k at t_k) and estimation (P_k at t_k) errors.
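A minimal numerical sketch of one prediction/correction cycle of such a filter is given below; the state transition used here is a generic placeholder rather than SemiQuad's dynamics, and the dimensions are reduced for readability.

```python
import numpy as np

def ekf_step(x_est, P, u, y, f_s, C, Q_alpha, Q_beta, eps=1e-6):
    """One predict/correct cycle of an extended Kalman filter.

    f_s     : discrete state transition x_{k+1} = f_s(x_k, u_k)
    C       : linear measurement matrix (the measured states are read out directly)
    Q_alpha : process noise covariance, Q_beta : measurement noise covariance
    """
    n = x_est.size
    # Linearize f_s around the current estimate by finite differences (A_k = d f_s / d x).
    A = np.zeros((n, n))
    fx = f_s(x_est, u)
    for i in range(n):
        dx = np.zeros(n); dx[i] = eps
        A[:, i] = (f_s(x_est + dx, u) - fx) / eps
    # A priori estimation.
    x_pred = fx
    P_pred = A @ P @ A.T + Q_alpha
    # A posteriori estimation.
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + Q_beta)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(n) - K @ C) @ P_pred
    return x_new, P_new

# Toy 2-state example: a measured angle and an unmeasured velocity-like state.
f_s = lambda x, u: np.array([x[0] + 0.005 * x[1], 0.99 * x[1] + 0.005 * u])
C = np.array([[1.0, 0.0]])
x_est, P = np.zeros(2), np.eye(2)
x_est, P = ekf_step(x_est, P, u=1.0, y=np.array([0.02]),
                    f_s=f_s, C=C, Q_alpha=1e-4 * np.eye(2), Q_beta=np.array([[1e-3]]))
print(np.round(x_est, 4))
```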
In simulation, the joint variables obtained from the direct dynamic model of SemiQuad and the control
torques Γ_i (i = 1, 2, 3, 4) are the inputs of the extended Kalman filter. For the experimental tests, the
joint variables θ_i (i = 1, 2, 3, 4) are measured and differentiated to obtain the velocities θ̇_i. These eight
variables, together with the four experimental torques Γ_i (i = 1, 2, 3, 4), are the inputs of the extended
Kalman filter. In both cases, the state vector [θ_1 θ_2 θ_3 θ_4 α θ̇_1 θ̇_2 θ̇_3 θ̇_4 α̇]^T is estimated.
Fig. 5. Block diagram of the estimation scheme: SemiQuad provides the joint angles θ_i, the joint
velocities θ̇_i and the torques Γ_i (i = 1, 2, 3, 4); the extended Kalman filter outputs the estimates of
α, α̇, θ_i and θ̇_i (i = 1, 2, 3, 4).
Figure 7 shows the estimated and real orientations of the platform of SemiQuad in the case
where viscous friction is added. The coefficient F_v equals 0.1 N·m·s/rad. Similarly,
Figure 8 shows the estimated and the real orientations of the platform of SemiQuad in
the case of Coulomb friction, with a coefficient F_s equal to 0.2 N·m. The last robustness
test (Figure 9) presents the estimated and real orientations of the platform of
SemiQuad with the inertia of the platform reduced by 5% in the simulator. In practice,
5% precision on inertia is feasible (see the identification results in (Lydoire & Poignet
2003)).
From these robustness tests, we can conclude that we have no finite time convergence.
However, the final relative errors of the estimated orientations of the platform of SemiQuad
are small. Since it will be possible to update the initial condition of the estimator during the
next double support phase, with the measurements of the encoders and the geometrical
relations, such errors are not a problem.
Fig. 6. Absolute orientation α(t) of the platform: real (solid) and estimated (dashed).
Fig. 7. Absolute orientation α(t) of the platform: real (solid) and estimated (dashed), with
supplementary viscous friction.
Fig. 8. Absolute orientation α(t) of the platform: real (solid) and estimated (dashed), with
supplementary Coulomb friction.
Fig. 9. Absolute orientation α(t) of the platform: real (solid) and estimated (dashed), with an
error on the inertia of the platform.
Figure 10 shows the estimation of the absolute orientation α(t) of the platform. The
duration of the single support phase is 15% smaller than in simulation. This can probably be
explained by the fact that our knowledge of the physical parameters of the robot is not
perfect, and that we neglected effects such as friction in the joints.
Currently, there are no experimental measurements of the evolution of α(t) in single
support, because SemiQuad is not equipped with a sensor such as a gyrometer or a
gyroscope. However, in double support, using the geometric and kinematic models it is
possible to calculate α(t) and α̇(t). In addition to providing initial conditions for the
observer, this also allows the orientation at the end of the single support phase, just after
impact, to be calculated. The corresponding value is displayed as a star on the next graph, and is
equal to 0.01 rad (0.57 degree). The difference between this value and the estimated value at
the same instant is almost 3 degrees.
Fig. 10. Estimation of the absolute orientation α(t) of the platform using the experimental data.
6. Conclusion
An application of the digital extended Kalman filter has been presented for the problem of
estimating the absolute orientation of SemiQuad, a 2D dynamically stable walking robot. There
are some differences between simulations and experiments, but the results still prove the
ability of our observer to yield a short term estimation of the orientation of the robot. The
precision is sufficient for the estimation to be useful for calculating the dynamic model and
monitoring the balance of the robot. This task is important for SemiQuad, and vital for bipeds,
to which the idea is also applicable (see Lebastard, Aoustin, & Plestan 2006 for a different type
of observer). Given the possibility to re-initialize the observer at each double support phase,
even if very short, as it can be for bipeds, the observer can provide the absolute orientation
over time without the use of any additional sensor, which was the goal of our work.
In the near future, we plan to equip SemiQuad with a gyrometer to fully evaluate the
performance of our estimator over time. Our perspective is to use the estimated orientation
in advanced feedback controllers.
7. References
Aoustin, Y. & Formal’sky, A. (2003). Control design for a biped: Reference trajectory based
on driven angles as function of the undriven angle, Journal of Computer and System
Sciences International, Vol. 42, No. 4, (July-August 2003) 159-176.
Chevallereau, C.; Abba, G.; Aoustin, Y.; Plestan, F. ; Westervelt, E. R.; Canudas de Wit, C. &
Grizzle, J. W. (2003). Rabbit: A testbed for advanced control theory, IEEE Control
Systems Magazine, Vol. 23, (October 2003) No. 5, 57-78.
Zonfrilli, F.; Oriolo, M. & Nardi, T. (2002). A biped locomotion strategy for the quadruped
robot Sony ers-210, Proceedings of IEEE Conference on Robotics and Automation, pp.
2768-2774.
Aoustin, Y.; Chevallereau, C. & Formal’sky, A. (2006). Numerical and experimental study of
the virtual quadruped-SemiQuad, Multibody System Dynamics, Vol. 16, 1-20.
Pratt, J.; Chew, C. M.; Torres, A.; Dilworth, P. & Pratt, G. (2001). Virtual model control: an
intuitive approach for bipedal locomotion, International Journal of Robotics Research,
Vol. 20, No. 2, 129-143.
Westervelt, E. R.; Grizzle, J. W. & Koditschek, D. E. (2003). Hybrid zero dynamics of planar biped
walkers, IEEE Transactions on Automatic Control, Vol. 48, No. 1, (February 2003) 42-56.
Lebastard, V.; Aoustin, Y. & Plestan, F. (2006). Observer-based control of a walking biped
robot without orientation measurement, Robotica, Vol. 24, 385-400.
Lebastard, V.; Aoustin, Y. & Plestan F. (2006) . Absolute orientation estimation for observer-
based control of a five-link biped robot, Springer Verlag Series Notes on Control and
Information Sciences, Editor: K. Kozlowski, Springer, vol. 335, Lecture Notes in
Control and Information Sciences, 2006.
Gauthier, J.P. & Bornard, G. (1981). Observability for any u(t) of a class of nonlinear system.
IEEE Transactions on Automatic Control, 26(4): 922-926.
Krener, A. & Respondek, W. (1985). Nonlinear observers with linearization error dynamics,
SIAM Journal Control Optimization, Vol. 2, 197-216.
Bornard, G. & Hammouri, H. (1991). A high gain observer for a class of uniformly
observable systems, Proceedings of the 30th IEEE Conference on Decision and Control,
pp. 1494-1496.
Plestan, F. & Glumineau, A. (1997). Linearisation by generalized input-output injection,
Systems and Control Letters, Vol. 31, 115-128.
Spong, M. & Vidyasagar M. (1991). Robot dynamics and control, John Wiley, New-York.
Formal’sky, A. M. (1982). Locomotion of Anthropomorphic Mechanisms, Nauka, Moscow [In
Russian], 368 pages.
Bonnifait, P. & Garcia, G. (1998). Design and experimental validation of an odometric and
goniometric localization system for outdoor robot vehicles, IEEE Transactions on
Robotics and Automation, Vol. 14, 541-548.
Lydoire, F. & Poignet, Ph. (2003). Experimental dynamic parameters identification of a seven
dof walking robot, Proceedings of the CLAWAR’03 Conference, pp. 477-484.
4
1. Introduction
Children love toys. Human caregivers often employ learning aids, such as books,
educational videos, drawing boards, and musical or textured toys, to teach a child. These social
interactions provide a wealth of information to a child, and hence they should be
extended to a humanoid robot as well (Arsenio, 2004d).
Inspired by infant development, we aim at developing a humanoid robot's perceptual
system through the use of learning aids, cognitive artifacts, and educational activities, so
that the robot learns about the world according to a child's developmental phases (Arsenio,
2004c). Of course, the human caregiver plays a very important role in the robot's learning
process (as is the case with children), performing educational and play activities with the robot
(such as drawing, painting or playing with a toy train on a railway), and facilitating the robot's
perception and learning. The goal is for the humanoid robot Cog (Figure 1) to see the world
through the caregiver's eyes.
1 Research work was developed while the author was at MIT. The author is currently working at
Siemens.
2004e), sound / face segmentation (Arsenio, 2004f), object / face / hand tracking (Arsenio,
2004e;f) or even robot actuation for active perception (Arsenio, 2003; Metta & Fitzpatrick,
2003) are described in the literature.
We do not treat children as machines, i.e., automatons. But this automaton view is still
widely employed in industry to build robots. Building robots indeed involves the hardware
setup of sensors, actuators, metal parts, cables, processing boards, as well as software
development. Such engineering might be viewed as the robot's genotype. But equally
important in a child is the developmental acquisition of information in a social and cultural
context (Vigotsky, 1962).
Therefore, for a humanoid robot to interact effectively in its surrounding world, it must be
able to learn. Section 3 presents learning strategies applied to a diverse set of problems, so
that the robot learns information about objects, scenes, people and actions. Training data for
the algorithms is generated on-line, in real-time, while the robot is in operation. We will
describe the development of these learning mechanisms, presenting statistics from a large
set of experimental results.
We aim at introducing robots into our society and treating them as we treat ourselves, using child
development as a metaphor for the developmental learning of a humanoid robot.
Fig. 4. Matching objects from books to real world objects and drawings.
Fig. 5. The image of a painting by Vincent Van Gogh, Road with Cypress and Star, 1890, is displayed on a computer screen. Paintings are contextually different from pictures or photos, since the painter's style changes the elements of the figure considerably. Van Gogh, a post-impressionist, painted with an aggressive use of brush strokes. But individual painting elements can still be grouped together by having a human actor tap on their representation on the computer screen.
Learning First Words
Auditory processing is also integrated with visual processing to extract the names and properties of objects. However, the visual hand trajectory properties and the sound properties might be independent: while tapping on books, it is not the interacting caregiver's hand that generates the sound, but the caregiver's vocal system pronouncing sounds such as the object's name. Therefore, cross-modal events are associated under a weak requirement: visual segmentations from periodic signals and sound segmentations are bound together if they occur temporally close (Fitzpatrick & Arsenio, 2004). This strategy is also well suited for sound patterns correlated with the hand's visual trajectory (such as playing musical tones by shaking a rattle).
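The binding rule just described can be made concrete with a small sketch; the event structure, the time window and the function names below are illustrative assumptions rather than the chapter's implementation:

```python
# Hypothetical sketch of the cross-modal binding rule: a visual segmentation
# event and a sound segmentation event are associated when their timestamps
# fall within a small window. Structures and window size are assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Event:
    t: float        # time of the segmentation event [s]
    label: str      # e.g. "visual" or "sound"
    data: object    # segmented template / acoustic signature

def bind_cross_modal(visual: List[Event], sound: List[Event],
                     window: float = 0.5) -> List[Tuple[Event, Event]]:
    """Pair visual and sound events that occur within `window` seconds."""
    pairs = []
    for v in visual:
        if not sound:
            continue
        # pick the sound event closest in time, if it is close enough
        s = min(sound, key=lambda e: abs(e.t - v.t))
        if abs(s.t - v.t) <= window:
            pairs.append((v, s))
    return pairs
```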
systems for children to interact with lifelike characters and play virtual instruments by
classifying optical flow measurements. Other classification techniques include state
machines, dynamic time warping or Hidden Markov Models (HMMs).
We follow a fundamentally different approach, in which periodic hand trajectories are mapped into geometric descriptions of objects in order to classify, for instance, simple circular or triangular movements (Arsenio, 2004a). Figure 6b reports an experiment in which a human repetitively draws a geometric shape on a sheet of paper with a pen. The robot learns what was drawn by matching one period of the hand gesture to a previously learned shape (the hand gesture is recognized as circular in the figure). Hence, the geometry of periodic hand trajectories is matched in real-time to the geometry of objects in an object database, instead of being mapped to a database of annotated gestures.
Fig. 6. Sample of experiments for object and shape recognition from hand gestures.
Object Recognition from Hand Gestures.
The problem of recognizing objects in a scene can be framed as the dual of the hand gesture recognition problem. Instead of using previously learned object geometries to recognize hand gestures, hand gesture trajectories are now applied to recover the geometric shape (a set of lines computed by applying the Hough transform) and appearance (given by an image template enclosing such lines) of a scene object, as seen by the robot. Visual geometries in a scene (such as circles) are recognized from hand gestures having the same geometry (as is the case for circular gestures). Figure 6a shows results for this task in an experiment in which an interacting human paints a circle. The robot learns what was painted (a circle) by matching the hand gesture to the shape defined by the ink on the paper. The algorithm is useful for identifying shapes produced by drawing, painting or other educational activities (Arsenio, 2004d).
Shape from Human Cues
A very similar framework is applied to extract object boundaries from human cues. Indeed, human manipulation provides the robot with extra perceptual information concerning objects, by actively describing (using human arm/hand/finger trajectories) object contours or the hollow parts of objects, such as a cup (see the experiment with a green cup in Figure 6c). Tactile perception of objects from robot grasping activities has been actively pursued; see for instance (Polana & Nelson, 1994; Rao et al., 1989). Although more precise, these techniques require hybrid position/force control of the robot's manipulator end-effector so as not to damage or break objects.
Functional Constraints
Not only hand gestures can be used to detect interesting geometric shapes in the world as
seen by the robot. For instance, certain toys, such as trains, move periodically on rail tracks,
with a functional constraint fixed both in time and space. Therefore, one might obtain
information concerning the rail tracks by observing the train's visual trajectory. To
accomplish such goal, objects are visually tracked by an attentional tracker which is
modulated by an attentional system (Arsenio, 2004d). The algorithm starts by masking the
input world image to regions inside the moving object's visual trajectory (or outside but on a
boundary neighborhood). Lines modeling the object's trajectory are then mapped into lines
fitting the scene edges. The output is the geometry of the stationary object which is
imposing the functional constraint on the moving object. Figure 6d shows as well an
experiment for the specific case of extracting templates for rail tracks from the train's motion
(which is constrained by the railway circular geometry).
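As a rough illustration of this trajectory-constrained line extraction (here using OpenCV's Canny and probabilistic Hough transform as stand-ins for the robot's own vision pipeline; all parameter values are assumptions):

```python
# Sketch: restrict attention to a neighborhood of the tracked object's
# trajectory, then fit lines to the scene edges inside that region.
import cv2
import numpy as np

def rail_geometry_from_trajectory(frame_gray, trajectory_xy, radius=15):
    """frame_gray: HxW uint8 image; trajectory_xy: list of (x, y) points."""
    mask = np.zeros_like(frame_gray)
    for x, y in trajectory_xy:
        cv2.circle(mask, (int(x), int(y)), radius, 255, thickness=-1)
    edges = cv2.Canny(frame_gray, 50, 150)
    edges_near_traj = cv2.bitwise_and(edges, mask)   # keep only edges near the path
    lines = cv2.HoughLinesP(edges_near_traj, rho=1, theta=np.pi / 180,
                            threshold=30, minLineLength=20, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]
```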
Fig. 7. Approach for segmenting and recognizing faces and objects. Training data for object/face recognition is extracted by keeping objects and faces in memory for a while, generating a collection of training samples consisting of multiple segmentations of objects and faces. (left) On-line experiment on Cog; (right) schematic organization: 1. Object segmentation, 2. Face detection and segmentation, 3. Multiple object tracking, 4. Object recognition, 5. Face recognition.
tracked (if the object appearance is view-dependent), in order to link them to the same
object. This way, a collection of object views becomes available as input.
Recognition works as follows. Quantization of each of the three color channels originates $8^3$ groups $G_i$ of similar colors. The number of image pixels $n_{G_i}$ indexed to a group is stored as a percentage of the total number of pixels. The first 20 color histograms of an object category are saved into memory and updated thereafter. New object templates are classified according to their similarity with object templates previously recognized for all object categories, by computing:
$$p = \sum_{i=1}^{8^3} \min\left(n_{G_i},\, n_{G'_i}\right)$$
If $p < th$ (with $th$ set to 0.7) for all of the 20 histograms in an object category, then the object does not belong to that category. If this happens for all categories, then it is a new object. If $p \ge th$, then a match occurs, and the object is assigned to the category with maximum $p$.
Whenever an object is recognized into a given category, the average color histogram which originated the best match is updated. Given an average histogram resulting from averaging $m$ color histograms, the update consists of computing the weighted average between this histogram (weight $m$) and the new color histogram (unit weight). This has the advantage that color histograms evolve as more samples are obtained, representing different views of an object.
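A minimal sketch of this histogram matching and update scheme might look as follows; the 8-level quantization and the 0.7 threshold follow the text, while the data structures and function names are assumptions:

```python
# Sketch of color-histogram matching (histogram intersection) and the
# weighted running-average update described above.
import numpy as np

BINS = 8  # per channel -> 8**3 color groups

def color_histogram(template_rgb):
    """Normalized 8x8x8 color histogram (fraction of pixels per group)."""
    q = (template_rgb // (256 // BINS)).reshape(-1, 3)
    hist, _ = np.histogramdd(q, bins=(BINS, BINS, BINS),
                             range=((0, BINS), (0, BINS), (0, BINS)))
    return hist.ravel() / hist.sum()

def intersection(h1, h2):
    """p = sum_i min(n_Gi, n_G'i)."""
    return np.minimum(h1, h2).sum()

def classify(new_hist, categories, th=0.7):
    """categories: {name: list of (avg_hist, count)}. Returns best category or None."""
    best_name, best_p = None, th
    for name, hists in categories.items():
        p = max(intersection(new_hist, h) for h, _ in hists)
        if p >= best_p:
            best_name, best_p = name, p
    return best_name  # None means a new object category

def update_average(avg_hist, count, new_hist):
    """Weighted average: old histogram weight `count`, new sample weight 1."""
    return (avg_hist * count + new_hist) / (count + 1), count + 1
```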
Fig. 8. (top) Recognition errors. Matches evaluated from a total of 11 scenes (objects are
segmented and recognized more than once per scene). (bottom) Sequence from an on-line
experiment of several minutes on the humanoid robot Cog: (1) The robot detects and
segments a new object – a sofa; (2) New object is correctly assigned to a new category; (3)
Object, not being tracked, is recognized from previous templates (as shown by the two sofa
templates mapped to it); (4-5-6) Same sequence for a different, smaller sofa.
Experimental Results for Template Matching
Figure 8 presents quantitative performance statistics. It shows a sample of the system running on the humanoid robot Cog while recognizing previously learned objects. Incorrect matches occurred due to color similarity among different objects (such as a big and a small sofa). Errors arising from labeling an object already in the database as a new object are chiefly due to drastic variations in the light sources. Qualitative results from an on-line experiment of several minutes for object segmentation, tracking and recognition of new objects on the humanoid robot are also shown.
Out of around 100 samples from on-line experiments, the average recognition accuracy was 95%. Several experiments have shown, however, that the algorithm is not capable of differentiating among people's faces, although it correctly differentiated between faces and other objects.
$$C = \sum_{i=1}^{M} \Gamma_i \Gamma_i^{T} = A A^{T} \qquad (1)$$

where $\Gamma_n = \phi_n - \Psi$ is the difference of each image $\phi_n$ from the mean $\Psi$, and $A = [\Gamma_1, \Gamma_2, \ldots, \Gamma_M]$.
Cropped faces are first rescaled to $128 \times 128$ images (size $S = 128^2$). Determining the eigenvectors and eigenvalues of the $S \times S$ covariance matrix $C$ is intractable. However, the rank of $C$ does not exceed $M - 1$: for $M < S$ there are only $M - 1$ eigenvectors associated with non-zero eigenvalues, rather than $S$. Let $v_i$ be the eigenvectors of the $M \times M$ matrix $A^{T}A$. The eigenfaces $\mu_i$ are given by:
$$\mu_i = \sum_{k=1}^{M} v_{ik}\, \Gamma_k, \qquad i = 1, \ldots, M \qquad (2)$$
Fig. 9. a) Face image samples are shown for each of three people out of a database of six; b)
Average face image for three people on the database, together with three eigenfaces for each
one; c) Confusion table with face recognition results.
The number of basis functions is further reduced from $M$ to $M'$ by selecting only the $M'$ most meaningful eigenvectors (those with the largest associated eigenvalues) and ignoring all the others. Classification of the image of an object $\phi$ consists of projecting it into the eigenobject components by correlating the eigenvectors with it, obtaining the coefficients $w_i = \mu_i^{T}(\phi - \Psi)$, $i = 1, \ldots, M'$. The weights $w_i$ form a vector $\Omega = \{w_1, w_2, \ldots, w_{M'}\}$. An object is then classified by selecting the minimum $L_2$ distance $\varepsilon_k = \lVert \Omega - \Omega_k \rVert$ to each object's coefficients in the database, where $k$ describes the $k$-th object class. If the minimum distance is above a threshold, the image corresponds to a new object.
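The eigenobject computation of Eqs. (1)-(2) and the minimum-distance rule can be sketched as follows, assuming flattened image rows and an illustrative novelty threshold (this is a generic PCA sketch, not the original code):

```python
# Eigenobjects via the small M x M matrix A^T A, plus nearest-class L2 rule.
import numpy as np

def eigenobjects(images, n_components):
    """images: M x S matrix (one flattened, rescaled image per row)."""
    mean = images.mean(axis=0)
    A = (images - mean).T                      # S x M, columns are Gamma_i
    evals, V = np.linalg.eigh(A.T @ A)         # eigenvectors of the M x M matrix
    order = np.argsort(evals)[::-1][:n_components]
    U = A @ V[:, order]                        # eigenobjects mu_i (eq. 2)
    U /= np.linalg.norm(U, axis=0)             # normalize each eigenobject
    return mean, U

def project(image, mean, U):
    return U.T @ (image - mean)                # coefficients w_i

def classify(image, mean, U, class_coeffs, new_threshold):
    """class_coeffs: {label: Omega_k}. Returns label or None for a new object."""
    w = project(image, mean, U)
    dists = {k: np.linalg.norm(w - omega) for k, omega in class_coeffs.items()}
    label = min(dists, key=dists.get)
    return label if dists[label] < new_threshold else None
```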
Eigenfaces: Experimental Results
The eigenvectors are now denominated eigenfaces (Turk & Pentland, 1991) because they are face-like in appearance (see Figure 9; the confusion table in this figure presents results for recognizing three different people, with an average recognition accuracy of 88.9%). The training data set contains a lot of variation. Validation data corresponds to a random 20% of all the data.
Eigensounds for Sound Recognition
A collection of annotated acoustic signatures for each object is used as input data (see Figure 10) for a sound recognition algorithm based on the eigenobjects method. A sound image is represented as a linear combination of base sound signatures (or eigensounds). Classification consists of projecting novel sounds onto this space, determining the coefficients of the projection, computing the $L_2$ distance to each object's coefficients in the database, and selecting the class corresponding to the minimum distance.
Cross-modal information aids the acquisition and learning of unimodal percepts and their consequent categorization in a child's early infancy. Similarly, visual data is employed here to guide the annotation of auditory data to implement a sound recognition algorithm.
Training samples for the sound recognition algorithm are classified into different categories by the visual object recognition system or by the visual object tracking system. This enables the system, after training, to classify the sounds of objects that are not visible.
The system was evaluated quantitatively by random selection of 10% of the segmented data for validation, with the remaining data used for training. This process was randomly repeated three times. It is worth noticing that even samples received within a short time of each other often do not look very similar, due to background acoustic noise, noise in the segmentation process, other objects' sounds during the experiments, and variability in how objects are moved and presented to the robot. For example, the car object is heard both alone and together with a rattle (either visible or hidden). The recognition rate for the three runs averaged 82% (86.7%, 80% and 80%). Recognition rates by object category were: 67% for the car, 91.7% for the cube rattle, 77.8% for the snake rattle and 83.3% for the hammer. Most errors arise from mismatches between the car and hammer sounds.
Fig. 10. Sound recognition. Acoustic signatures for four objects are shown along the rows.
(1) Seven sound segmentation samples are shown for each object, from a total of 28 (car), 49
(cube rattle), 23 (snake rattle) and 34 (hammer) samples. (2) Average acoustic signatures.
The vertical axis corresponds to the frequency bands and the horizontal axis to time
normalized by the period. (3) Eigensounds corresponding to the three highest eigenvalues.
The repetitive nature of the sound generated by an object under periodic motion can be
analyzed to extract an acoustic signature for that object. We search for repetition in a set of
frequency bands, collecting those whose energies oscillate together with a similar period
(Fitzpatrick & Arsenio, 2004).
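One hedged way to realize this band selection is sketched below: a spectrogram is computed, each band's dominant period is estimated by autocorrelation, and bands whose periods agree with the majority are kept. The STFT settings and tolerance are assumptions:

```python
# Sketch: find frequency bands whose energies oscillate with a common period.
import numpy as np
from scipy.signal import stft

def periodic_bands(signal, fs, tol=0.15):
    f, t, Z = stft(signal, fs=fs, nperseg=512)
    energy = np.abs(Z) ** 2                      # bands x frames
    periods = np.full(energy.shape[0], np.nan)
    for i, band in enumerate(energy):
        band = band - band.mean()
        ac = np.correlate(band, band, mode="full")[band.size - 1:]
        # first local maximum of the autocorrelation = dominant period (frames)
        peaks = np.flatnonzero((ac[1:-1] > ac[:-2]) & (ac[1:-1] > ac[2:])) + 1
        if peaks.size:
            periods[i] = peaks[0]
    ref = np.nanmedian(periods)
    keep = np.flatnonzero(np.abs(periods - ref) <= tol * ref)
    return keep, energy[keep]                    # band indices + their energy traces
```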
where $h(x, y)$ is a Gaussian window. Thus, $\vec{v}(x, y)$ has dimension 960. Similarly to other approaches (Torralba, 2003), the dimensionality problem is reduced to a tractable size by applying Principal Component Analysis (PCA). The image features $\vec{v}(\vec{p})$ are decomposed into the basis functions given by the PCA:

$$\vec{v}(\vec{p}) = \sum_{i=1}^{D} c_i\, \varphi_{ki}(\vec{p}), \qquad c_i = \sum_{\vec{p}, k} v_k(\vec{p})\, \varphi_{ki}(\vec{p}) \qquad (4)$$
where the functions $\varphi_{ki}(\vec{p})$ are the eigenfunctions of the covariance operator given by $v_k(\vec{p})$. These functions incorporate both spatial and spectral information. The decomposition coefficients are obtained by projecting the image features $v_k(\vec{p})$ onto the principal components $c_i$, used hereafter as input context features.
The vector $\vec{c} = \{c_i,\ i = 1, \ldots, D\}$ denotes the resulting D-dimensional input vector, with $D = E_m$ and $2 \le D \le Th_o$, where $m$ denotes a class, $Th_o$ an upper threshold, and $E_m$ the number of eigenvalues within 5% of the maximum eigenvalue. These features can be viewed as a holistic representation of the scene (Oliva & Torralba, 2001), since all the regions of the image contribute to all the coefficients and objects are not encoded individually. The effect of neglecting local features is reduced by mapping the foveal camera (which grabs data for the object recognition scheme based on local features) into the image from the peripheral-view camera, where the weight of the local features $\vec{v}_I$ is strongly attenuated. The vector $\vec{p}$ is thus given in wide-field-of-view retinal coordinates.
A collection of images is automatically annotated by the robot (Arsenio, 2004b;c) and used as training data. Mixture models are applied to find interesting places to put a bounded number of local kernels that can model large neighborhoods. In D dimensions a mixture model is given by a density factorization over multivariate Gaussians (spherical Gaussians were selected for faster processing times), for each object class $o_n$:

$$p(\vec{c} \mid o_n) = \sum_{m=1}^{M} b_m\, G(\vec{c}, \vec{\mu}_{m,n}, C_{m,n})$$

where $G_m$ refers to the $m$-th Gaussian with mean $\vec{\mu}_m$ and covariance matrix $C_m$, $M$ is the number of Gaussian clusters, and $b_m = p(G_m)$ are the weights of the local models. The parameters are estimated with the EM algorithm (Gershenfeld, 1999):
E-step for iteration $k$: from the observed data $\vec{c}$, compute the a-posteriori cluster probabilities $e_{m,n}^{k}(l)$:

$$e_{m,n}^{k}(l) = p(c_{m,n} \mid \vec{c}_l) = \frac{b_{m,n}^{k}\, G(\vec{c}_l, \vec{\mu}_{m,n}^{k}, C_{m,n}^{k})}{\sum_{m=1}^{M} b_{m,n}^{k}\, G(\vec{c}_l, \vec{\mu}_{m,n}^{k}, C_{m,n}^{k})} \qquad (5)$$

M-step: update the weights, means and covariances of the clusters:

$$b_{m,n}^{k+1} = \frac{1}{L}\sum_{l=1}^{L} e_{m,n}^{k}(l), \qquad \vec{\mu}_{m,n}^{k+1} = \frac{\sum_{l=1}^{L} e_{m,n}^{k}(l)\, \vec{c}_l}{\sum_{l=1}^{L} e_{m,n}^{k}(l)} \qquad (6)$$

$$C_{m,n}^{k+1} = \frac{\sum_{l=1}^{L} e_{m,n}^{k}(l)\, \left(\vec{c}_l - \vec{\mu}_{m,n}^{k+1}\right)\left(\vec{c}_l - \vec{\mu}_{m,n}^{k+1}\right)^{T}}{\sum_{l=1}^{L} e_{m,n}^{k}(l)} \qquad (7)$$
Fig. 11. Test images (wide field of view) organized with respect to $p(o_n \mid \vec{c})$. Top row: $o_n$ = scene1, $p(\mathrm{scene1} \mid \vec{c}) > 0.5$; bottom row: $o_n$ = scene2, $p(\mathrm{scene2} \mid \vec{c}) > 0.5$. The scene descriptions shown in the right column are built on-line, automatically (Arsenio, 2004b;c).
The EM algorithm converges as soon as the cost gradient is small enough or a maximum number of iterations is reached. The probability density function (PDF) for an object $o_n$ is then given by Bayes' rule, $p(o_n \mid \vec{c}) = p(\vec{c} \mid o_n)\, p(o_n) / p(\vec{c})$, where $p(\vec{c}) = p(\vec{c} \mid o_n)\, p(o_n) + p(\vec{c} \mid \neg o_n)\, p(\neg o_n)$. The same method applies for the out-of-class PDF $p(\vec{c} \mid \neg o_n)$, which represents the statistical feature distribution for the input data in which $o_n$ is not present.
Finally, it is necessary to select the number $M$ of Gaussian clusters. This number can be selected as the one that maximizes the joint likelihood of the data. An agglomerative clustering approach based on Rissanen's Minimum Description Length (MDL) order identification criterion (Rissanen, 1983) was implemented to estimate $M$ automatically (Figure 11 shows the algorithm's results for classifying two scenes).
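A simplified sketch of this per-class mixture modelling and Bayes-rule classification is given below; scikit-learn's EM-fitted spherical Gaussian mixtures stand in for the hand-derived EM and MDL selection, and the out-of-class model is approximated by pooling the other classes (an assumption):

```python
# Sketch: one spherical-Gaussian mixture per class, combined with Bayes' rule.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_models(features_by_class, n_components=3):
    """features_by_class: {name: array of shape (num_samples, D)}."""
    return {name: GaussianMixture(n_components=n_components,
                                  covariance_type="spherical").fit(X)
            for name, X in features_by_class.items()}

def posterior(c, models, in_class, prior=0.5):
    """Approximate p(o_n | c) with a pooled out-of-class likelihood."""
    log_in = models[in_class].score_samples(c.reshape(1, -1))[0]   # log p(c | o_n)
    others = [m.score_samples(c.reshape(1, -1))[0]
              for name, m in models.items() if name != in_class]
    log_out = np.logaddexp.reduce(others) - np.log(len(others))    # log p(c | not o_n)
    p_c = prior * np.exp(log_in) + (1 - prior) * np.exp(log_out)
    return prior * np.exp(log_in) / p_c
```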
build an object recognition system (Fitzpatrick & Arsenio, 2004). The feature space for
recognition consists of:
Sound/Visual period ratios – the sound energy of a hammer peaks once per visual
period, while the sound energy of a car peaks twice.
Visual/Sound peak energy ratios – the hammer upon impact creates high peaks of
sound energy relative to the amplitude of the visual trajectory.
Dynamic programming is applied to match the sound energy to the visual trajectory signal. Formally, let $S = (S_1, \ldots, S_n)$ and $V = (V_1, \ldots, V_m)$ be sequences of sound and visual trajectory energies segmented from $n$ and $m$ periods of the sound and visual trajectory signals, respectively. Due to noise, $n$ may differ from $m$. If the estimated sound period is half the visual one, then $V$ corresponds to energies segmented with $2m$ half periods (given by the distance between maximum and minimum peaks). A matching path $P = (P_1, \ldots, P_l)$ defines an alignment between $S$ and $V$, where $\max(m, n) \le l \le m + n - 1$, and $P_k = (i, j)$ is a match $k$ between sound cluster $j$ and visual cluster $i$. The matching constraints are:
The boundary conditions: $P_1 = (1, 1)$ and $P_l = (m, n)$.
Temporal continuity: $P_{k+1} \in \{(i+1, j+1), (i+1, j), (i, j+1)\}$; steps are adjacent elements of $P$.
The cost function $c_{i,j}$ is given by the square difference between the $V_i$ and $S_j$ periods. The best matching path $W$ can be found efficiently using dynamic programming, by incrementally building an $m \times n$ table that caches the optimum cost at each cell together with the link corresponding to that optimum. The binding $W$ then results from tracing back through these links, as in the Viterbi algorithm.
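A compact dynamic-programming sketch of this alignment, with a squared-difference cost and the stated continuity constraints, is shown below (an illustration, not the original implementation):

```python
# Sketch: DTW-style alignment of visual and sound per-period energies.
import numpy as np

def align(V, S):
    """V, S: 1-D arrays of per-period visual / sound energies."""
    m, n = len(V), len(S)
    D = np.full((m, n), np.inf)
    D[0, 0] = (V[0] - S[0]) ** 2
    for i in range(m):
        for j in range(n):
            if i == j == 0:
                continue
            prev = min(D[i - 1, j - 1] if i and j else np.inf,
                       D[i - 1, j] if i else np.inf,
                       D[i, j - 1] if j else np.inf)
            D[i, j] = prev + (V[i] - S[j]) ** 2
    # backtrack from (m-1, n-1) to (0, 0) through the cheapest predecessors
    path, i, j = [(m - 1, n - 1)], m - 1, n - 1
    while (i, j) != (0, 0):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min((s for s in steps if s[0] >= 0 and s[1] >= 0),
                   key=lambda s: D[s])
        path.append((i, j))
    return D[m - 1, n - 1], path[::-1]
```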
Fig. 12. Object recognition from cross-modal clues. The feature space consists of period and
peak energy ratios. The confusion matrix for a four-class recognition experiment is shown.
The period ratio is enough to separate the cluster of the car object from all the others.
Similarly, the snake rattle is very distinct, since it requires large visual trajectories for
producing soft sounds. Errors for categorizing a hammer originated exclusively from
erroneous matches with the cube rattle, because hammering is characterized by high energy
ratios, and very soft bangs are hard to identify correctly. The cube rattle generates higher
energy ratios than the snake rattle. False cube recognitions resulted mostly from samples
with low energy ratios being mistaken for the snake.
Experimental Results: Figure 12 shows cross-modal features for a set of four objects. It would be hard to automatically cluster such data into groups for classification. But, as in the sound recognition algorithm, training data is automatically annotated by visual recognition and tracking. After training, objects can be categorized from cross-modal cues alone. The system was evaluated by randomly selecting 10% of the data for validation and the remaining data for training. This process was randomly repeated 15 times. The recognition rates averaged over all these runs were, by object category: 86.7% for the cube rattle, 100% for both the car and the snake rattle, and 83% for the hammer. The overall recognition rate was 92.1%. Such results demonstrate the potential for recognition using cross-modal cues.
$$p(\vec{x}, \vec{c} \mid o_n) = \sum_{m=1}^{M} b_{m,n}\, G(\vec{x}, \vec{\eta}_{m,n}, X_{m,n})\, G(\vec{c}, \vec{\mu}_{m,n}, C_{m,n}) \qquad (8)$$

The mean of the new Gaussian $G(\vec{x}, \vec{\eta}_{m,n}, X_{m,n})$ is now a function $\vec{\eta} = f(\vec{c}, \beta_{m,n})$ that depends on $\vec{c}$ and on a set of parameters $\beta_{m,n}$. A locally affine model was chosen for $f$, with $\beta_{m,n} = (a_{m,n}, A_{m,n})$: $\vec{\eta}_{m,n} = a_{m,n} + A_{m,n}^{T}\vec{c}$. The learning equations now become (Gershenfeld, 1999):

E-step for iteration $k$: from the observed data $\vec{c}$ and $\vec{x}$, compute the a-posteriori probabilities of the clusters:

$$e_{m,n}^{k}(l) = \frac{b_{m,n}^{k}\, G(\vec{x}_l, \vec{\eta}_{m,n}^{k}, X_{m,n}^{k})\, G(\vec{c}_l, \vec{\mu}_{m,n}^{k}, C_{m,n}^{k})}{\sum_{m=1}^{M} b_{m,n}^{k}\, G(\vec{x}_l, \vec{\eta}_{m,n}^{k}, X_{m,n}^{k})\, G(\vec{c}_l, \vec{\mu}_{m,n}^{k}, C_{m,n}^{k})}$$
Fig. 13. Localizing and recognizing objects from contextual cues. (top) Samples of scene images are shown in the first column. The next five columns show probable locations, based on context, for finding a door, the smaller sofa, the bigger sofa, the table and the chair, respectively. Even if an object is not visible or present, the system estimates the places at which there is a high probability of finding it; two such examples are shown for the chair. Occlusion by humans does not change the context significantly. (bottom) Results on another day, with different lighting conditions.
Activities (identified as categories that group objects capable of similar motions) and an object's function within an activity can then be learned by classifying 12 × 12 image patterns.
One possibility would be the use of eigenobjects for classification (as described in this
chapter for face and sound recognition). Eigenactivities would then be the corresponding
eigenvectors. We opted instead for neural networks as the learning mechanism to recognize
activities.
Target desired values, provided by the multiple object tracking algorithm, are used for the annotation of the training samples: all the training data is automatically generated and annotated, instead of the standard manual, off-line annotation. An input feature vector is recognized as a category if the corresponding category output is higher than 0.5 (corresponding to a probability p > 0.5). Whenever this criterion fails for all categories, no match is assigned to the activity feature vector; since the activity is estimated as not yet in the database, it is labeled as a new activity.
We will consider the role of several objects in experiments for six different activities. Five of these activities involve periodic motion: cleaning the ground with a swiping brush; hammering a nail-like object with a hammer; sawing a piece of metal; moving a van toy; and playing with a swinging fish. Since more information is generated from periodic activities, they are used to generate both training and testing data. The remaining activity, poking a lego, is detected from the lego's discontinuous motion after poking.
Figure 14 shows trajectories extracted for the positions of four objects from their
sequences of images.
A three-layer neural network is first randomly initialized. The input layer has 144 perceptron units (one for each input), the hidden layer has six units, and the output layer has one perceptron unit per category to be trained (hence, five output units). Experiments are run with a set of (15, 12, 15, 6, 3, 1) feature vectors (the elements of the normalized activity images) for the swiping brush, hammer, saw, van toy, swinging fish and lego, respectively. A first group of experiments consists of randomly selecting 30% of these vectors as validation data and the remainder as training data. The procedure is repeated six times, so that different sets of validation data are considered. The other two groups of experiments repeat this process for random selections of 20% and 5% of the feature vectors as validation data. The corresponding quantitative results are presented in Figure 15. A sketch of such a classifier is given below.
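The sketch uses scikit-learn in place of the original implementation and applies the 0.5-probability rule for new activities; the training settings are assumptions:

```python
# Sketch: 144-input, 6-hidden-unit activity classifier with a "new activity" rule.
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_activity_net(X, y):
    """X: (num_samples, 144) activity feature vectors, y: category labels."""
    net = MLPClassifier(hidden_layer_sizes=(6,), max_iter=2000)
    net.fit(X, y)
    return net

def recognize(net, x):
    """Return the matched category, or None if no output exceeds 0.5."""
    proba = net.predict_proba(x.reshape(1, -1))[0]
    best = int(np.argmax(proba))
    return net.classes_[best] if proba[best] > 0.5 else None
```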
Fig. 14. a) Signals corresponding to one-period segments of the objects' trajectories, normalized to temporal lengths of 12 points. From top to bottom: image sequence for a swiping brush, a hammer, a van toy and a swinging fish. b) Normalized centroid positions are shown in the left column, while the right column shows the (normalized and scaled) elements of the affine matrix $R_i$ (where the indexes represent the position of the element in this matrix).
The lego activity, not represented in the training set, was correctly assigned as a new activity in 67% of the cases. The swinging fish was correctly recognized in just 17% of the cases, with the percentage of no-matches equal to 57%. We believe that this poor result was due to the lack of a representative training set; this assumption is corroborated by the large number of times that the activity of swinging a fish was recognized as a new activity. The swiping brush was wrongly recognized in 3.7% of the total number of trials. The false recognitions occurred in the experiments corresponding to 30% validation data; no recognition error was reported for the smaller validation sets. All the other activities were correctly recognized in all trials.
Fig. 15. Experimental results for activity recognition (and the associated recognition of object function). Each experiment was run six times from random initial conditions. Top graph: from
left to right columns: 30%, 20% and 5% of the total set of 516 feature vectors are used as
validation data. The total number of training and validation points, for each of the six trials
(and for each of the 3 groups of experiments), is (15, 12, 15, 6, 3, 1) for the swiping brush,
hammer, saw, van toy, swinging fish and lego, respectively. The three groups of columns
show recognition, error and missed-match rates (as ratios over the total number of
validation features). The bar on top of each column shows the standard deviation. Bottom
table: Recognition results (as ratios over the total number of validation features). Row i and
column j in the table show the rate at which object i was matched to object j (or to known, if j
is the last column). Bold numbers indicate rates of correct recognitions.
Sound Recognition
An artificial neural network is applied off-line to the same data collected as before for sound
recognition. The 32 × 32 sound images correspond to input vectors of dimension 1024.
Hence, the neural network input layer contains 1024 perceptron units. The number of units
in the hidden layer was set to six, while the output layer has four units corresponding to the
four categories to be classified. The system is evaluated quantitatively by randomly selecting
40%, 30% and 5% of the segmented data for validation, and the remaining data for training.
This process was randomly repeated six times. This approach achieves higher recognition rates when compared to the eigensounds method. The overall recognition rate is 96.5%, corresponding to a significant improvement in performance.
Fig. 16. Overview of the cognitive architecture developed for the humanoid robot Cog.
5. Conclusions
We proposed in this chapter the application of a collection of learning algorithms to solve a
broad scope of problems. Several learning tools, such as Weighted-cluster modeling,
Artificial Neural Networks, Nearest Neighbor, Hybrid Markov Chains, Geometric Hashing,
Receptive Field Linear Networks and Principal Component Analysis, were extensively
applied to acquire categorical information about actions, scenes, objects and people.
This is a new, multi-faceted approach to object recognition. Objects might have various meanings in different contexts: a rod is labeled as a pendulum if it oscillates with a fixed endpoint. From a visual image, a large piece of fabric on the floor is most often labeled as a tapestry, while it is most likely a bed sheet if found on a bed. But if a person is able to feel the fabric's material or texture, or the sound that it makes (or not) when grasped together with other materials, then (s)he might easily determine the fabric's true function. Object recognition draws on many sensory modalities and on the object's behavior, which inspired our approach.
6. References
Arsenio, A. (2003). Embodied vision - perceiving objects from actions. Proceedings of the IEEE International Workshop on Human-Robot Interactive Communication, San Francisco, 2003.
5. Biped Gait Generation and Control Based on Mechanical Energy Constraint
1. Introduction
Realization of natural and energy-efficient dynamic walking has come to be one of the
main subjects in the research area of robotic biped locomotion. Recently, many
approaches considering the efficiency of gait have been proposed, and McGeer's passive dynamic walking (McGeer, 1990) has attracted attention as a clue to elucidating the mechanism of efficient dynamic walking. Passive dynamic walkers can walk down a gentle slope
without any external actuation. Although the robot's mechanical energy is dissipated by
heel-strike at the stance-leg exchange instant, the gravity potential automatically restores
it during the single-support phase in the case of passive dynamic walking on a slope and
thus the dynamic walking continues. If we regard passive dynamic walking as active walking on level ground, we find that the robot is propelled by the small gravity component in the walking direction and that the mechanical energy is monotonically restored by virtual control inputs representing this small gravity effect. Restoration of the mechanical energy dissipated by heel-strike is, from the mathematical point of view, a necessary condition common to dynamic gait generation, and efficient active dynamic walking should be realized by reproducing this mechanism on level ground. Mechanical systems satisfy a relation between the control inputs and the mechanical energy: the power input to the system is equal to the time derivative of the mechanical energy. We introduce a constraint condition so that the time-change rate of the mechanical energy is kept at a positive constant.
The dynamic gait generation is then specified by a simple redundant equation including
the control inputs as the indeterminate variables and yields a problem of how to solve
the equation in real-time. The ankle and hip joint torques are determined according to the phases of the cycle, based on a pre-planned priority. The zero moment point (Vukobratović & Stepanenko, 1972) can easily be manipulated by adjusting the ankle-joint torque, and the hip-joint torque is then determined to satisfy the desired energy constraint condition together with the pre-determined ankle-joint torque. Several
solutions considering the zero moment point condition are proposed, and it is shown
that a stable dynamic gait is easily generated without using any pre-designed desired
trajectories. The typical gait is analyzed by numerical simulations, and an experimental
case study using a simple machine is performed to show the validity of the proposed
method.
$$Q(\alpha) = \begin{bmatrix} (m_H l^2 + 2mal)\cos 2\alpha - mab & -mab \\ -mab & 0 \end{bmatrix},$$

and $\alpha$ [rad] is the half inter-leg angle at the heel-strike instant, given by

$$\alpha = \frac{\theta_1^{-} - \theta_2^{-}}{2} = \frac{\theta_2^{+} - \theta_1^{+}}{2} > 0. \qquad (6)$$
For further details of the derivation, the reader should refer to the technical report by Goswami et al. This simplest walking model can walk down a gentle slope with suitable choices of physical parameters and initial conditions. Goswami et al. discovered that this model exhibits period-doubling bifurcations and chaotic motion when the slope angle increases (Goswami et al., 1996). The nonlinear dynamics of passive walkers are very attractive, but their mechanism has not been clarified yet.
Fig. 1. Experimental walking machine and its foot mechanism (left), and its ideal model (right).
m_H = 3.0 kg
m = 0.4 kg
l = a + b = 0.680 m
a = 0.215 m
b = 0.465 m
Table 1. Physical parameters of the experimental machine.
Let us define the virtual total mechanical energy $E_\phi$ under the gravity condition of Fig. 2 as follows:

$$E_\phi(\theta, \dot{\theta}, \phi) = \frac{1}{2}\dot{\theta}^{T} M(\theta)\dot{\theta} + P(\theta, \phi) \qquad (7)$$
where the virtual potential energy is given by

$$P(\theta, \phi) = \left\{ (m_H l + ma + ml)\cos(\theta_1 + \phi) - mb\cos(\theta_2 + \phi) \right\} g\cos\phi. \qquad (8)$$
In the case of passive dynamic walking on a slope, the total mechanical energy is kept constant during the single-support phase, whereas $E_\phi$ does not exhibit such behaviour. Fig. 3 shows the simulation results of passive dynamic walking on a gentle slope whose magnitude is 0.01 [rad]. Panels (c) and (d) show the evolutions of the equivalent transformed torques and of the virtual energy $E_\phi$, respectively. From (c), we can see that both $u_1$ and $u_2$ are almost constant, and thus the ZMP should be kept within a narrow range. This property is effective in virtual passive dynamic walking from the viewpoint of the stability of the foot posture (Asano & Yamakita, 2001). It is apparent from (d) that the mechanical energy is dissipated at the transition instant and monotonically restored during the swing phase. Such energy behaviour can be considered an indicator of efficient dynamic walking.
Fig. 2. Gravity acceleration mechanism of passive dynamic walking.
E1) The walking pattern is generated automatically, including impulsive transitions, and
converges to a steady limit cycle.
E2) The total mechanical energy is restored during the single-support phase monotonically,
and is dissipated at every transition instant impulsively by heel-strike with the floor.
E2 is considered to be an important characteristic for dynamic gait generation, and is the
basic concept of our method. We will propose a simple method imitating the property in the
next section.
Fig. 3. Simulation results of passive dynamic walking on a slope where $\phi$ = 0.01 [rad].
where $P$ is the potential energy. The power input to the system is the time-change rate of the total energy, that is,

$$\dot{E} = \dot{\theta}^{T}\tau = \dot{\theta}^{T} S u. \qquad (10)$$

Suppose now that we use a simple control law imitating the characteristic CH1, monotonic energy restoration. Let $\lambda > 0$ be a positive constant and consider the following condition:

$$\dot{E} = \lambda. \qquad (11)$$

This means that the robot's mechanical energy increases monotonically at a constant rate $\lambda$. We call this control or gait generation method "Energy Constraint Control (ECC)". In this method, the walking speed becomes faster as $\lambda$ increases; in other words, the magnitude of $\lambda$ corresponds to the slope angle of virtual passive dynamic walking. Here let us consider the following output function:

$$H(\tau) = \dot{E} - \lambda = \dot{\theta}^{T}\tau - \lambda, \qquad (12)$$

so that the target constraint condition of Eq. (11) can be rewritten as $H(\tau) = 0$. Therefore, the ECC can in this sense be regarded as an output zeroing control.
Following Eqs. (10) and (11), the detailed target energy constraint condition is expressed as

$$\dot{E} = \dot{\theta}^{T} S u = \dot{\theta}_1 u_1 + (\dot{\theta}_1 - \dot{\theta}_2) u_2 = \lambda, \qquad (13)$$

which is a redundant equation in the control inputs. The dynamic gait generation then becomes a problem of how to solve this redundant equation for the control inputs $u_1$ and $u_2$ in real-time. A property of the ECC strategy is that the control inputs can be easily determined by adjusting the feed-forward parameter $\lambda$, which can be chosen by considering the magnitude of $\dot{E}$ in virtual passive dynamic walking.
Under these assumptions, we can calculate the ZMP in the coordinates shown in Fig. 4 (left) as

$$x_{ZMP} = -\frac{u_1}{R_n} \qquad (14)$$

where $u_1$ [Nm] is the ankle torque acting not on the foot link but on the leg link, and $R_n$ [N] is the vertical element of the reaction force.
From Fig. 4 it is obvious that the ZMP is always shifted behind the ankle joint when driving the stance leg forward; however, at the transition instant the robot is critically affected by the reaction moment from the floor, as shown in Fig. 4 (right). Considering the reaction moment effect, we can rewrite the ZMP equation for the simplest model as follows:

$$x_{ZMP} = -\frac{u_1 + u_{rm}}{R_n} \qquad (15)$$

where $u_{rm} > 0$ represents the equivalent torque of the reaction moment, which shifts the ZMP backward even further. $u_{rm}$ acts as a disturbance for the transition. Since actual walking machines generally have feet with toe and heel, this problem arises. From the aforementioned point of view, we conclude that the ZMP should be shifted forward of the ankle joint just after the transition instant, to cancel the reaction moment. Based on this observation, in the following we consider an intuitive ZMP manipulation algorithm utilizing the freedom of the redundant equation (13).
Fig. 4. Foot mechanism and reaction moment at the heel-strike instant.
$$u_1 = \begin{cases} u^{-} < 0, & s \le T \\ u^{+} > 0, & \text{otherwise} \end{cases} \qquad (16)$$

where $s$ is a virtual time that is reset at every transition instant and $\dot{s} = 1.0$. This comes from the fact that $u_1$ must be negative to shift the ZMP forward of the ankle joint, whereas if $u_1 > 0$ the ZMP moves behind the ankle joint. In this case, $u_2$ is obtained after $u_1$ as follows:

$$u_2 = \frac{\lambda - \dot{\theta}_1 u_1}{\dot{\theta}_1 - \dot{\theta}_2}. \qquad (17)$$
Note that $u_2$ has a singularity at $\dot{\theta}_1 - \dot{\theta}_2 = 0$, which was mentioned before as CH3. This condition must be taken into account, and we propose a switching control law described later. Before that, we consider a more reasonable switching algorithm from $u^{-}$ to $u^{+}$. In general, for most of a cycle from the beginning, the condition $\dot{\theta}_1 - \dot{\theta}_2 > 0$ holds (see Fig. 3 (b)), and thus the sign of $u_2$ in Eq. (17) is identical to that of $\lambda - \dot{\theta}_1 u_1$. If $u_1 = u^{-}$, this sign is positive because $\lambda, \dot{\theta}_1 > 0$ and $u^{-} < 0$. At the beginning of a cycle, $\lambda - \dot{\theta}_1 u^{+}$ increases monotonically because $\ddot{\theta}_1 < 0$ (see Fig. 3 (b)). Therefore, in general the condition

$$\frac{d}{dt}\left(\lambda - \dot{\theta}_1 u^{+}\right) = -\ddot{\theta}_1 u^{+} > 0 \qquad (18)$$

holds regardless of the choice of system parameters. Therefore, if $\lambda \le \dot{\theta}_1 u^{+}$ at the beginning, it is reasonable to switch when

$$\lambda - \dot{\theta}_1 u^{+} = 0 \qquad (19)$$

so as to keep $u_2$ of Eq. (17) always positive under the condition $\dot{\theta}_1 - \dot{\theta}_2 > 0$. In addition, with this approach the hip-joint torque always contributes to the mechanical energy restoration. The switching algorithm for $u_1$ is summarized as follows:

$$u_1 = \begin{cases} u^{-} < 0, & \lambda \le \dot{\theta}_1 u^{+} \\ u^{+} > 0, & \text{otherwise.} \end{cases} \qquad (20)$$
The value of $u^{+}$ must be determined empirically based on the simulation results of virtual passive dynamic walking, whereas $u^{-}$ should be determined carefully so as not to destroy the original limit cycle or disturb the forward acceleration. Choosing a suitable combination of $\lambda$ and $u^{-}$ is the most important factor for generating a stable limit cycle.

It is therefore found that this hip-joint torque $u_2$ also contributes to the mechanical energy restoration.
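Under the symbol conventions reconstructed above ($u^{-}$, $u^{+}$ and the switching test of Eq. (20) are reconstructions, so the sketch below is only indicative), the ECC torque computation per control cycle reduces to a few lines:

```python
# Sketch of the ECC torque computation: the ankle torque u1 switches between
# the constants u_minus < 0 and u_plus > 0, and the hip torque u2 is solved
# from the energy-rate constraint. Symbols and the switching test follow this
# chapter's reconstruction and are assumptions; the guard value is illustrative.
def ecc_torques(dth1, dth2, lam, u_minus, u_plus, eps=1e-3):
    """dth1, dth2: joint angular velocities; lam: desired dE/dt > 0."""
    # eq. (20): keep u1 = u_minus while lambda <= dth1 * u_plus, then switch
    u1 = u_minus if lam <= dth1 * u_plus else u_plus
    denom = dth1 - dth2
    if abs(denom) < eps:                 # singularity at dth1 = dth2 (see text)
        return u1, 0.0
    u2 = (lam - dth1 * u1) / denom       # eq. (17): enforce dE/dt = lambda
    return u1, u2
```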
4.6 Discussion
Here, we compare our method with the "energy tracking control" approach proposed by Goswami et al. (Goswami et al., 1997). Their approach is formulated as

$$\dot{E} = \lambda_{etc}\left(E^{*} - E\right), \qquad (26)$$

where $E^{*}$ [J] (constant) is the reference energy and the positive scalar $\lambda_{etc}$ is the feedback gain. A solution of Eq. (13) with a constant torque ratio $\mu > 0$, which gives the condition $u_1 = \mu u_2$, is obtained as

$$\tau = \frac{\lambda_{etc}\left(E^{*} - E\right)}{(\mu + 1)\dot{\theta}_1 - \dot{\theta}_2}\begin{bmatrix} \mu \\ 1 \end{bmatrix}, \qquad (27)$$

and in our case a solution with a constant torque ratio is given by

$$\tau = \frac{\lambda}{(\mu + 1)\dot{\theta}_1 - \dot{\theta}_2}\begin{bmatrix} \mu \\ 1 \end{bmatrix}. \qquad (28)$$
Figs. 6 and 7 respectively show the simulation results of active dynamic walking on level ground using the torques of Eqs. (27) and (28), without actively manipulating the ZMP. The two cases are equal in walking speed. From the simulation results we can see that, in our approach, the maximum ankle-joint torque is about 3 times smaller than in Goswami's approach, which yields a better ZMP condition. In this sense, we conclude that the mechanical energy must be restored efficiently, but that its time-change rate should be chosen carefully to guarantee the ZMP condition.
Fig. 6. Simulation results of dynamic walking by energy tracking control where $\lambda_{etc}$ = 10.0, $\mu$ = 10.0 and $E^{*}$ = 22.0 [J].
Fig. 7. Simulation results of dynamic walking by ECC where $\lambda$ = 0.34 [J/s] and $\mu$ = 10.0.
5. Experiments
In order to confirm the validity of the proposed method, we carried out an actual walking experiment using the machine introduced in Fig. 1. All encoders of the servomotors are interfaced to a computer (Pentium III, 1.0 GHz) running Windows 98. To implement the control law, we used RtMaTX (Koga, 2000) for real-time computation with a sampling period of 1.0 [ms].
Since the proposed methods are so-called model-matching controls, they are not robust against uncertainty. In this research, we use model-following control of the motion generated by a VIM (Virtual Internal Model), which is a reference model in the computer. Every post-impact condition of the VIM is reset to that of the actual machine. By using the VIM, the uncertainties of identification, which are a crucial factor in model-matching control, can be compensated. The dynamics of the VIM are given by

$$\hat{M}(\theta_d)\ddot{\theta}_d + \hat{C}(\theta_d, \dot{\theta}_d)\dot{\theta}_d + \hat{g}(\theta_d) = \tau_d, \qquad (29)$$
where $\tau_d$ is the control input driving the VIM and is determined by $\theta_d$ and $\dot{\theta}_d$. The control input for the actual robot is given by

$$\tau = \hat{M}(\theta_d)\,u + \hat{C}(\theta_d, \dot{\theta}_d)\dot{\theta}_d + \hat{g}(\theta_d) \qquad (30)$$
$$u = \ddot{\theta}_d + K_D\left(\dot{\theta}_d - \dot{\theta}\right) + K_P\left(\theta_d - \theta\right) + K_I \int \left(\theta_d - \theta\right) dt$$
The virtual internal model started walking from the following initial condition:
$$\theta(0) = \begin{bmatrix} 0.68 \\ 0.62 \end{bmatrix}, \qquad \dot{\theta}(0) = \begin{bmatrix} 0.14 \\ 0.14 \end{bmatrix},$$
and its state was calculated and updated in real-time. At every transition instant, the angular positions of the VIM were reset to those of the actual machine. A PID controller drives the ankle joint of the swing leg during the single-support phase so that the foot keeps a horizontal posture.
The experimental results are shown in Fig. 8. The adjustment parameters were chosen empirically as $\lambda$ = 0.075 [J/s], $u^{-}$ = -0.15 [Nm], $u^{+}$ = 0.05 [Nm] and $\psi$ = 0.05 [rad]. Fig. 8 (a) and (b) show the evolution of the angular positions and velocities of the actual machine, respectively. The actual angular velocities are calculated by differentiation through a filter whose transfer function is $70/(s + 70)$. A stable dynamic walking is experimentally realized based on ECC via model-following control.
$$x_{k+1} = F(x_k) \qquad (35)$$

$$x_k = \begin{bmatrix} \theta_2[k] - \theta_1[k] \\ \dot{\theta}_1[k] \\ \dot{\theta}_2[k] \end{bmatrix}, \qquad (36)$$

that is, the relative hip joint angle and the angular velocities just after the $k$-th impact. The function $F$ is determined based on Eqs. (1) and (3), but cannot be expressed analytically. Therefore, we must compute $F$ by numerical simulation following an approximation algorithm. In the case of steady walking, the relation $F(x^{*}) = x^{*}$ holds and $x^{*}$ is the equilibrium point of the state just after the transition instant. For a small perturbation $\delta x_k$ around the limit cycle, the mapping function $F$ can be expressed in terms of a Taylor series expansion as

$$F(x_k) = F(x^{*} + \delta x_k) \approx x^{*} + \nabla F\, \delta x_k \qquad (37)$$
where

$$\nabla F = \left.\frac{\partial F(x)}{\partial x}\right|_{x = x^{*}} \qquad (38)$$

is calculated approximately. All eigenvalues of $\nabla F$ lie inside the unit circle (the detailed results are omitted). Although the robustness of the walking system is difficult to evaluate mathematically, the maximum singular value of $\nabla F$ indicates the convergence speed of the gait: the smaller the value, the faster the convergence to the steady gait. Fig. 9 shows the analysis of the maximum singular value of $\nabla F$ with respect to $\zeta$ for the case of Fig. 7 with energy feedback control, where $E_0$ = 21.8575 [J] and $\zeta$ = 10.0. The maximum singular value decreases monotonically as $\zeta$ increases, confirming the improvement of the gait robustness by feedback control. Although applying this method destroys the autonomy of the walking system, it improves the robustness.
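The gradient of the Poincaré map can be estimated numerically, for example by central finite differences around the fixed point; the sketch below assumes a user-supplied step_map that simulates one walking step and is therefore only illustrative:

```python
# Sketch: finite-difference Jacobian of the Poincare return map and its
# eigenvalues / maximum singular value for gait stability analysis.
import numpy as np

def poincare_jacobian(step_map, x_star, h=1e-5):
    """step_map: R^3 -> R^3, state just after impact -> state after next impact."""
    n = x_star.size
    J = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        J[:, i] = (step_map(x_star + e) - step_map(x_star - e)) / (2 * h)
    return J

def gait_stability(step_map, x_star):
    J = poincare_jacobian(step_map, x_star)
    eigvals = np.linalg.eigvals(J)
    # stable limit cycle if all eigenvalues are inside the unit circle
    return np.abs(eigvals).max() < 1.0, np.linalg.svd(J, compute_uv=False).max()
```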
Fig. 10. Model of a planar underactuated biped robot.
The ECC then yields the problem of how to solve the following redundant equation:

$$\dot{E} = \dot{\theta}_1 u_1 + (\dot{\theta}_1 - \dot{\theta}_2) u_2 = \lambda \qquad (40)$$

for the control inputs in real-time. Since the knee joint is free, we can give the control input in the form of Eq. (28) as

$$Su = \begin{bmatrix} \dfrac{\lambda}{(\mu + 1)\dot{\theta}_1 - \dot{\theta}_2}\begin{bmatrix} \mu \\ 1 \end{bmatrix} \\ 0 \end{bmatrix}. \qquad (40)$$
On the other hand, a kneed biped has the property of obstacle avoidance, in other words, of guaranteeing foot clearance by knee-bending. To exploit this advantage, we introduce in the following the active knee-lock algorithm proposed in our previous work (Asano & Yamakita, 2001). The passive knee-strike occurs when $\theta_2 = \theta_3$ during the single-support phase, and its inelastic collision model is given by

$$M(\theta)\dot{\theta}^{+} = M(\theta)\dot{\theta}^{-} + J_I^{T}\lambda_I \qquad (41)$$

where $J_I = \begin{bmatrix} 0 & 1 & -1 \end{bmatrix}$ and $\lambda_I$ is the vector of Lagrange's indeterminate multipliers, representing the impact force. We introduce an active knee-lock algorithm before the impact and mechanically lock the knee joint at a suitable timing. Let us then consider the mechanical energy dissipated at this instant. Define the dissipated energy $\Delta E_{ks}$ as

$$\Delta E_{ks} = \frac{1}{2}\left(\dot{\theta}^{+}\right)^{T} M(\theta)\dot{\theta}^{+} - \frac{1}{2}\left(\dot{\theta}^{-}\right)^{T} M(\theta)\dot{\theta}^{-} \le 0. \qquad (42)$$
This can be rearranged by solving Eq. (41) as

$$\Delta E_{ks} = -\frac{1}{2}\,\frac{\left(J_I\dot{\theta}^{-}\right)^{2}}{J_I M(\theta)^{-1} J_I^{T}} = -\frac{1}{2}\,\frac{\left(\dot{\theta}_2^{-} - \dot{\theta}_3^{-}\right)^{2}}{J_I M(\theta)^{-1} J_I^{T}}. \qquad (43)$$

This shows that the condition minimizing the energy dissipation is $\dot{\theta}_2 = \dot{\theta}_3$, which leads to $\Delta E_{ks} = 0$. In general, this timing exists in the kneed gait. After locking the knee joint on, we should lock it off, and this timing should be chosen empirically following a certain trigger. In this section, we take the trigger as $X_g = 0$ [m], where $X_g$ is the X-position of the robot's center of mass. Fig. 11 shows the phase sequence of a cycle with the knee-lock algorithm, which consists of the following phases.
1. Start
2. 3-link phase I
3. Active knee-lock on
4. Virtual compass phase (2-link mode)
5. Active knee-lock off
6. 3-link phase II
7. Passive knee-strike
8. Compass phase (2-link mode)
9. Heel-strike
Fig. 12 shows the simulation results of dynamic walking by ECC where $\lambda$ = 5.0 [J/s] and $\mu$ = 4.0. The physical parameters are chosen as in Table 2. From Fig. 12 (b) and (d), it is confirmed that the passive knee joint is suitably locked on without energy loss, and that active lock-off and the passive knee-strike then occur. Fig. 13 shows the stick diagram for one step. We can see that a stable dynamic bipedal gait is generated by ECC.
Fig. 11. Phase sequence of dynamic walking by ECC with active lock of free knee-joint.
Fig. 12. Simulation results of dynamic walking of a kneed biped by ECC where $\lambda$ = 5.0 [J/s] and $\mu$ = 4.0.
m_1 = m_2 + m_3 = 5.0 kg
m_2 = 3.0 kg
m_3 = 2.0 kg
m_H = 10.0 kg
I = m_2 m_3 (a_2 + b_3)^2 / m_1 = 0.243 kg m^2
a_1 = (m_2 (l_3 + a_2) + m_3 a_3) / m_1 = 0.52 m
b_1 = 0.48 m
a_2 = 0.20 m
b_2 = 0.30 m
a_3 = 0.25 m
b_3 = 0.25 m
l_1 = a_1 + b_1 = 1.00 m
l_2 = a_2 + b_2 = 0.50 m
l_3 = a_3 + b_3 = 0.50 m
Table 2. Parameters of the planar kneed biped.
Fig. 16. Stick diagram of dynamic walking with free knee-joint by ECC
8. References
Asano, F. & Yamakita, M. (2001). Virtual gravity and coupling control for robotic gait
synthesis, IEEE Trans. on Systems, Man and Cybernetics Part A, Vol. 31, No. 6, pp.
737-745, Nov. 2001.
Goswami, A.; Thuilot, B. & Espiau, B. (1996). Compass-like biped robot Part I: Stability and
bifurcations of passive gaits, Research Report INRIA 2613, 1996.
Goswami, A.; Espiau, B. & Keramane, A. (1997). Limit cycles in a passive compass gait biped
and passivity-mimicking control laws, Autonomous Robots, Vol. 4, No. 3, pp. 273-286,
Sept. 1997.
Koga, M. (2000). Numerical Computation with MaTX (In Japanese), Tokyo Denki Univ.
Press, ISBN4-501-53110-X, 2000.
McGeer, T. (1990). Passive dynamic walking, Int. J. of Robotics Research, Vol. 9, No. 2, pp. 62-
82, April 1990.
Vukobratović, M. & Stepanenko, Y. (1972). On the stability of anthropomorphic systems, Mathematical Biosciences, Vol. 15, pp. 1-37, 1972.
6. Dynamic Simulation of Single and Combined Trajectory Path Generation and Control of A Seven Link Biped Robot
1. Introduction
Recently, numerous research efforts have focused on biped robot walking patterns that trace desired paths and perform required tasks. The current chapter focuses on the mathematical simulation of a seven-link biped robot for two kinds of zero moment point (ZMP): the Fixed and the Moving ZMP. In this method, after determining the breakpoints of the robot and fitting a polynomial over them, the trajectory paths of the robot are generated and calculated. After calculation of the trajectory paths, the kinematic and dynamic parameters of the robot are obtained in the Matlab environment, making use of its mathematical functions. The simulation process of the robot is included in the control process of the system. The control process uses an adaptive method for known systems. The detailed relations and definitions can be found in the authors' published article [Musavi and Bagheri, 2007]. The simulation process helps to analyze the effects of the robot's most influential parameters on stability and on the optimum generation of the joint actuator torques.
- T_m: the time at which the ankle joint reaches its maximum height during the walking cycle.
- k: step number.
- H_ao: ankle joint maximum height.
- L_ao: the horizontal distance traveled between the ankle joint and the start point when the ankle joint has reached its maximum height.
- D_s: step length.
- q_b, q_f: foot lift angle and contact angle with the level ground.
- λ: surface slope.
- h_s: stair level height.
- H_st: foot maximum height from the stair level.
- x_ed: the horizontal distance between the hip joint and the support foot (fixed coordinate system) at the start of the double support phase.
- x_sd: the horizontal distance between the hip joint and the support foot (fixed coordinate system) at the end of the double support phase.
- F.C.S: the fixed coordinate system assumed on the support foot in each step.
- M.C: the mass centers of the links.
- Sagittal plane: the plane that divides the body into right and left sections.
- Frontal plane: the plane parallel to the long axis of the body and perpendicular to the sagittal plane.
The sagittal and frontal planes of the human body are shown in figure (1.1); the transverse plane schematic and definition are omitted, as they fall outside our calculation domain.
The important parameters of the robot listed above are shown in figures (1.2) and (1.3).
Fig. (1.2). The robot's important parameters for calculation of the trajectory path of a seven link biped robot.
2.3) The Displacements of Horizontal and Vertical Foot Traveling Over a Horizontal Surface or Stair
With respect to figures (1.2) and (1.3), the horizontal and vertical components of the ankle joint can be written as below:

$$x_{a,hor}(t) = \begin{cases} kD_s, & t = kT_c \\ kD_s + l_{an}\sin q_b + l_{af}(1 - \cos q_b), & t = kT_c + T_d \\ kD_s + L_{ao}, & t = kT_c + T_m \\ (k+2)D_s - l_{an}\sin q_f - l_{ab}(1 - \cos q_f), & t = (k+1)T_c \\ (k+2)D_s, & t = (k+1)T_c + T_d \end{cases} \qquad (3)$$
The corresponding vertical displacement of the ankle joint over a stair terminates with the breakpoints

$$z_a(t) = \cdots + l_{an}\cos q_f, \qquad z_a(t) = (k+1)h_{st} + l_{an}, \quad t = (k+1)T_c + T_d.$$
2.4) The Displacements of Horizontal and Vertical Foot Traveling Over a Declined Surface

$$x_{a,dec}(t) = \begin{cases} kD_s\cos\lambda + l_{an}\sin\lambda, & t = kT_c \\ (kD_s + l_{af})\cos\lambda + l_{an}\sin(q_b + \lambda) - l_{af}\cos(q_b + \lambda), & t = kT_c + T_d \\ (kD_s + L_{ao})\cos\lambda, & t = kT_c + T_m \\ ((k+2)D_s - l_{ab})\cos\lambda + \cdots, & t = (k+1)T_c \end{cases} \qquad (6)$$
Assuming the breakpoints expressed above and applying the following boundary conditions of the robot during the walking cycle, the ankle joint trajectory path can be generated. The boundary conditions of the system are determined with respect to its physical and geometrical specifications during movement. As can be seen from figures (1.2) and (1.3), the linear and angular velocities of the foot at the start and at the end of the double support phase are equal to zero:

$$\dot{\theta}_a(kT_c) = 0, \qquad \dot{\theta}_a\left((k+1)T_c + T_d\right) = 0 \qquad (7)$$

$$\dot{x}_a(kT_c) = 0, \quad \dot{x}_a\left((k+1)T_c + T_d\right) = 0; \qquad \dot{z}_a(kT_c) = 0, \quad \dot{z}_a\left((k+1)T_c + T_d\right) = 0 \qquad (8)$$
The best approach for generating the path trajectories is mathematical interpolation. There are several ways of obtaining the paths depending on the conditions of the movement, such as the number of breakpoints and the boundary conditions of the system. Regarding the mentioned conditions of a seven-link biped robot, the Spline and Vandermonde matrix methods seem more suitable than the other interpolation schemes. The Vandermonde case is the simplest with respect to the calculation process, but it introduces calculation errors as the number of breakpoints increases. This defect does not appear in the Spline method, which fits the optimum curve over the breakpoints regardless of the number of points and boundary conditions. Given the low number of breakpoints and boundary conditions of a seven-link biped robot, there is no considerable difference between the calculation processes of the Vandermonde and Spline methods. As an example, choosing one of the stated methods for relations (7) and (8), a sixth-order polynomial or a third-order spline can be fitted to generate the vertical movement of the ankle joint, as sketched below.
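For instance, a clamped cubic spline (zero end velocities, matching relation (8)) through illustrative breakpoint values can be built with SciPy as follows; the numbers are placeholders, not the chapter's parameters:

```python
# Sketch: vertical ankle trajectory as a clamped cubic spline through breakpoints.
import numpy as np
from scipy.interpolate import CubicSpline

# example breakpoints for one step, k = 0 (placeholder values)
Tc, Td, Tm = 0.9, 0.18, 0.4
t_knots = np.array([0.0, Td, Tm, Tc, Tc + Td])
z_knots = np.array([0.05, 0.07, 0.12, 0.06, 0.05])   # z_a at the breakpoints [m]

# clamped spline: the first derivative (foot velocity) is zero at both ends
z_a = CubicSpline(t_knots, z_knots, bc_type="clamped")

t = np.linspace(0.0, Tc + Td, 200)
z_vals, z_dot = z_a(t), z_a(t, 1)                    # trajectory and its velocity
```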
2.5) Hip Trajectory Interpolation for the Level Ground [Huang et al., 2001] and Declined Surfaces
From figures (1.2) and (1.3), the vertical and horizontal displacements of the hip joint can be written as below:

$$x_{h,Hor,stair}(t) = \begin{cases} kD_s + x_{ed}, & t = kT_c \\ (k+1)D_s - x_{sd}, & t = kT_c + T_d \\ (k+1)D_s + x_{ed}, & t = (k+1)T_c \end{cases} \qquad (9)$$
$$z_{h,Hor}(t) = \begin{cases} H_{h,min}, & t = kT_c \\ H_{h,max}, & t = kT_c + 0.5(T_c + T_d) \\ H_{h,min}, & t = (k+1)T_c \end{cases} \qquad (11)$$

$$z_{h,Dec}(t) = \begin{cases} H_{h,min}\cos\lambda - (kD_s + x_{ed})\sin\lambda, & t = kT_c \\ H_{h,max}\cos\lambda - (kD_s + x_{sd})\sin\lambda, & t = kT_c + 0.5(T_c + T_d) \\ H_{h,min}\cos\lambda - ((k+1)D_s + x_{ed})\sin\lambda, & t = (k+1)T_c \end{cases} \qquad (12)$$
where, in the above relations, $H_{h,min}$ and $H_{h,max}$ indicate the minimum and maximum heights of the hip joint from the fixed coordinate system. Clearly, with respect to figure (1.2), the ankle and hip joint parameters $x_a$, $z_a$ and $x_h$, $z_h$ play the main role in the optimum generation of the robot's trajectory paths. Using relations (1)-(13) and the mathematical interpolation process, the trajectory paths of the robot are completed.
Fig. (1.4). The links' angles and configurations.
Regarding figure (1.4) and the generation of the robot's trajectory paths based on the four important parameters of the system ($x_a$, $z_a$ and $x_h$, $z_h$), the first kinematic parameters of the robot can be obtained easily. The link angles of the robot are then calculated with inverse kinematics from the governing nonlinear equations. As can be seen from figure (1.4), these equations can be written as:

$$l_1\cos(\pi - \theta_1) + l_2\cos(\pi - \theta_2) = a$$
$$l_1\sin(\pi - \theta_1) + l_2\sin(\pi - \theta_2) = b \qquad (14)$$
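Numerically, the two equations of (14) can be solved for θ₁ and θ₂ with a generic root finder; the sketch below follows the sign conventions of the reconstruction above and uses illustrative initial guesses:

```python
# Sketch: solve the two-link inverse kinematics of eq. (14) numerically.
import numpy as np
from scipy.optimize import fsolve

def leg_angles(a, b, l1, l2, guess=(2.0, 2.5)):
    """a, b: relative hip position components; l1, l2: link lengths."""
    def equations(th):
        th1, th2 = th
        return [l1 * np.cos(np.pi - th1) + l2 * np.cos(np.pi - th2) - a,
                l1 * np.sin(np.pi - th1) + l2 * np.sin(np.pi - th2) - b]
    th1, th2 = fsolve(equations, guess)
    return th1, th2
```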
Fig. (1.5). The assumed unit vectors used to obtain the position vectors.
$$\vec{r}_0 = l_{m.c}\left(\cos(\beta_{m.c} + q_f)\, I + \sin(\beta_{m.c} + q_f)\, K\right) \qquad (16)$$

$$\vec{r}_1 = \left(l_0\cos(\beta_{tot} + q_f) + l_{c1}\cos(\theta_1 + \lambda)\right) I + \left(l_{c1}\sin(\theta_1 + \lambda) + l_0\sin(\beta_{tot} + q_f)\right) K \qquad (17)$$

$$\vec{r}_2 = \left(l_0\cos(\beta_{tot} + q_f) + l_1\cos(\theta_1 + \lambda) + l_{c2}\cos(\pi - \theta_2 + \lambda)\right) I + \left(l_1\sin(\theta_1 + \lambda) + l_0\sin(\beta_{tot} + q_f) + l_{c2}\sin(\pi - \theta_2 + \lambda)\right) K \qquad (18)$$

$$\vec{r}_3 = \left(l_0\cos(\beta_{tot} + q_f) + l_1\cos(\theta_1 + \lambda) + l_2\cos(\pi - \theta_2 + \lambda) + l_{c3}\cos(\theta_3 + \lambda)\right) I + \left(l_1\sin(\theta_1 + \lambda) + l_0\sin(\beta_{tot} + q_f) + l_2\sin(\pi - \theta_2 + \lambda) + l_{c3}\sin(\theta_3 + \lambda)\right) K \qquad (19)$$

$$\vec{r}_4 = \left(l_0\cos(\beta_{tot} + q_f) + l_1\cos(\theta_1 + \lambda) + l_2\cos(\pi - \theta_2 + \lambda) + l_3\cos(\theta_3 + \lambda) + l_{c4}\cos(\theta_4 + \lambda)\right) I + \left(l_1\sin(\theta_1 + \lambda) + l_0\sin(\beta_{tot} + q_f) + l_2\sin(\pi - \theta_2 + \lambda) + l_3\sin(\theta_3 + \lambda) + l_{c4}\sin(\theta_4 + \lambda)\right) K \qquad (20)$$

$$\vec{r}_5 = \left(l_0\cos(\beta_{tot} + q_f) + l_1\cos(\theta_1 + \lambda) + l_2\cos(\pi - \theta_2 + \lambda) + l_3\cos(\theta_3 + \lambda) + l_4\cos(\theta_4 + \lambda) + l_{5,m.c}\cos(\pi/2 + \lambda + \beta_{5,m.c} + q_b)\right) I + \cdots \qquad (21)$$
The denominator of the ZMP expression is $\sum_{i} m_i\left(g\cos\lambda + \ddot{z}_i\right)$, where $\ddot{x}_i$ and $\ddot{z}_i$ are the horizontal and vertical accelerations of the links' mass centers with respect to the F.C.S. and $\ddot{\theta}_i$ is the angular acceleration of the links, calculated from the interpolation process. On the other hand, the stability of the robot is determined according to the position of the ZMP. If the ZMP lies within the convex hull of the robot, stable movement is obtained and there are no interruptions in the kinematic parameters (the link velocities). The convex hull can be imagined as the projection of a pyramid with its vertices on the support and swing feet and on the hip joint. Generally, the ZMP can be classified into the following cases:
1) Moving ZMP
2) Fixed ZMP
The moving type corresponds to a robot walk similar to the human gait. In the fixed type, the
ZMP position is restricted to the support-foot area or to areas selected by the user;
consequently, a significant modification of the torso motion is required for stable walking of
the robot. For this purpose, the program has been designed to find the target angle
of the torso that provides the fixed ZMP position automatically. In the designed
program, q_torso denotes the deflection angle of the torso, either specified by the user or
calculated by the auto-detector mode of the program. In the auto-detector mode,
the torso motion needed to obtain the mentioned fixed ZMP is extracted with
respect to the desired ranges, which consist of the support-foot area defined
by the user or automatically by the designed program. Note that the parameters with the
greatest effect on the robot's stable walking are the hip height and position. By
varying the parameters x_ed and x_sd with an iterative method [Huang et al., 2001] and
choosing the optimum hip height, the robot control process can be performed with respect to
the modified torso angles and the mentioned parameters. To obtain the joint
actuator torques, the Lagrangian formulation [Craig, 1989] has been used at the single
support phase as below:
\tau_i = H(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q)      (31)

where i = 0, \ldots, 6, and H, C and G are the mass-inertia, Coriolis and gravitational matrices of the
system, which can be written as follows:
H(q) = \begin{bmatrix} h_{11} & h_{12} & h_{13} & h_{14} & h_{15} & h_{16} & h_{17} \\ h_{21} & h_{22} & h_{23} & h_{24} & h_{25} & h_{26} & h_{27} \\ h_{31} & h_{32} & h_{33} & h_{34} & h_{35} & h_{36} & h_{37} \\ h_{41} & h_{42} & h_{43} & h_{44} & h_{45} & h_{46} & h_{47} \\ h_{51} & h_{52} & h_{53} & h_{54} & h_{55} & h_{56} & h_{57} \\ h_{61} & h_{62} & h_{63} & h_{64} & h_{65} & h_{66} & h_{67} \end{bmatrix}, \quad C(q,\dot{q}) = \begin{bmatrix} c_{11} & \cdots & c_{17} \\ c_{21} & \cdots & c_{27} \\ c_{31} & \cdots & c_{37} \\ c_{41} & \cdots & c_{47} \\ c_{51} & \cdots & c_{57} \\ c_{61} & \cdots & c_{67} \end{bmatrix}, \quad G(q) = \begin{bmatrix} G_1 \\ G_2 \\ G_3 \\ G_4 \\ G_5 \\ G_{tor} \end{bmatrix}
The matrices written above correspond to the double support phase of the robot's movement,
and they are also employed for the single support phase. Relation (31) itself is applied during the
single support phase of the robot; within the double support phase, due to the impact occurring
between the swing leg and the ground, a modified form of relation (31) is used that accounts for
the effects of the ground reaction forces [Lum et al., 1999; Westervelt, 2003; Hon et al., 1978].
For the explained process, and in order to obtain the single support phase equations of the
robot, the value of q_0 (as can be seen in figure (1.4)) must be set equal to zero. The calculation of
the above-mentioned matrix components involves bulky mathematical relations; to avoid them,
only the simplified relations are presented here:
h11 [m1 (lc21 lc1le cos(q1 M))] [m2 (l12 lc22 l1le cos(q1 M) lc2le cos(q2 M)
2l1lc2 cos(q2 q1 ))] [m3 (l12 l22 lc23 l1le cos(q1 M) l2le cos(q2 M) lc3le cos(q3 M)
2l1l2 cos(q2 q1 ) 2l1lc3 cos(q1 q3 ) 2l2lc3 cos(q2 q3 ))] [m4 (l12 l22 l32 lc24
l1le cos(q1 M) l2le cos(q2 M) l3le cos(q3 M) lc4le cos(q4 M) 2l1l2 cos(q2 q1 )
2l1l3 cos(q3 q1 ) 2l1lc4 cos(q1 q4 ) 2l2l3 cos(q3 q2 ) 2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l12 l22 l32 l42 lcfswing
2
l1le cos(q1 M) l2le cos(q2 M) l3le cos(q3 M)
l4le cos(q4 M) lcfswingle cos(M (S / 2) q fswing E fswing) 2l1l2 cos(q2 q1 ) 2l1l3 cos(q3 q1 )
2l1l4 cos(q4 q1 ) 2l1lcfswingcos(q1 (S / 2) q fswing E fswing) 2l2l3 cos(q3 q2 )
2l2l4 cos(q4 q2 ) 2l2lcfswingcos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswingcos(q4 (S / 2) q fswing E fswing) 2l3lcfswingcos(q3 (S / 2) q fswing E fswing))]
[mtor (l12 l22 lctorso
2
l1le cos(q1 M) l2le cos(q2 M) lelctorsocos(qtorso M (S / 2))
2l1l2 cos(q2 q1 ) 2l1lctorsocos(qtorso q1 (S / 2)) 2l2lctorsocos(qtorso q2 (S / 2))] I1 I 2
I3 I 4 I5 Itorso
h12 [m1 (lc21 )] [m2 (l12 lc22 2l1lc2 cos(q2 q1 ))] [m3 (l12 l22 lc23 2l1l2 cos(q2 q1 )
2l1lc3 cos(q1 q3 ) 2l2lc3 cos(q2 q3 ))] [m4 (l12 l22 l32 lc24 2l1l2 cos(q2 q1 )
2l1l3 cos(q3 q1 ) 2l1lc4 cos(q1 q4 ) 2l2l3 cos(q3 q2 ) 2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l12 l22 l32 l42 lcfswing
2
2l1l2 cos(q2 q1 ) 2l1l3 cos(q3 q1 ) 2l1l4 cos(q4 q1 )
2l1lcfswing cos(q1 (S / 2) q fswing E fswing) 2l2l3 cos(q3 q2 ) 2l2l4 cos(q4 q2 )
2l2lcfswing cos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswing cos(q4 (S / 2) q fswing E fswing) 2l3lcfswing cos(q3 (S / 2) q fswing E fswing))]
[mtor (l12 l22 lctorso
2
2l1l2 cos(q2 q1 ) 2l1lctorso cos(qtorso q1 (S / 2))
2l2lctorso cos(qtorso q2 (S / 2))] I1 I 2 I 3 I 4 I 5 I torso
h13 [m2 (lc22 l1lc2 cos(q2 q1 ))] [m3 (l22 lc23 l1l2 cos(q2 q1 ) l1lc3 cos(q1 q3 )
2l2lc3 cos(q2 q3 ))] [m4 (l22 l32 lc24 l1l2 cos(q2 q1 ) l1l3 cos(q3 q1 )
l1lc4 cos(q1 q4 ) 2l2l3 cos(q3 q2 ) 2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l22 l32 l42 lcfswing
2
l1l2 cos(q2 q1 ) l1l3 cos(q3 q1 ) l1l4 cos(q4 q1 )
l1lcfswing cos(q1 (S / 2) q fswing E fswing) 2l2l3 cos(q3 q2 ) 2l2l4 cos(q4 q2 )
2l2lcfswing cos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswing cos(q4 (S / 2) q fswing E fswing) 2l3lcfswing cos(q3 (S / 2) q fswing E fswing))]
[mtor (l22 lctorso
2
l1l2 cos(q2 q1 ) l1lctorso cos(qtorso q1 (S / 2)) 2l2lctorso cos(qtorso q2 (S / 2))]
I 2 I 3 I 4 I 5 I torso
h14 [m3 (lc23 l1lc3 cos(q1 q3 ) l2lc3 cos(q2 q3 ))] [m4 (l32 lc24 l1l3 cos(q3 q1 )
l1lc4 cos(q1 q4 ) l2l3 cos(q3 q2 ) l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))] [m5 (l32 l42
2
lcfswing l1l3 cos(q3 q1 ) l1l4 cos(q4 q1 ) l1lcfswingcos(q1 (S / 2) q fswing E fswing)
l2l3 cos(q3 q2 ) l2l4 cos(q4 q2 ) l2lcfswingcos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswingcos(q4 (S / 2) q fswing E fswing) 2l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I3 I 4 I 5
2
h16 [m5 (lcfswing l1lcfswing cos(q1 (S / 2) q fswing E fswing) l2lcfswing cos(q2 (S / 2) q fswing E fswing)
l4lcfswing cos(q4 (S / 2) q fswing E fswing) l3lcfswing cos(q3 (S / 2) q fswing E fswing))] I 5
2
h 17 [mtorso (l ctorso l1l ctorso cos( qtorso (S / 2) q1 ) l 2 l ctorso cos(qtorso (S / 2) q 2 ))] I toroso
h 21 [m2 (lc22 lc2le cos(q2 M) l1lc2 cos(q2 q1 ))] [m3 (l22 lc23 l2le cos(q2 M)
lc3le cos(q3 M) l1l2 cos(q2 q1 ) l1lc3 cos(q1 q3 ) 2l2lc3 cos(q2 q3 ))] [m4 (l22 l32 lc24
l2le cos(q2 M) l3le cos(q3 M) lc4le cos(q4 M) l1l2 cos(q2 q1 ) l1l3 cos(q3 q1 )
l1lc4 cos(q1 q4 ) 2l2l3 cos(q3 q2 ) 2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l22 l32 l42 lcfswing
2
l2le cos(q2 M) l3le cos(q3 M) l4le cos(q4 M)
lcfswingle cos(M (S / 2) q fswing E fswing ) l1l2 cos(q2 q1 ) l1l3 cos(q3 q1 )
l1l4 cos(q4 q1 ) l1lcfswing cos(q1 (S / 2) q fswing E fswing ) 2l2l3 cos(q3 q2 )
2l2l4 cos(q4 q2 ) 2l2lcfswing cos(q2 (S / 2) q fswing E fswing ) 2l3l4 cos(q4 q3 )
2l4lcfswing cos(q4 (S / 2) q fswing E fswing ) 2l3lcfswing cos(q3 (S / 2) q fswing E fswing ))]
[mtor (l22 lctorso
2
l2le cos(q2 M) lelctorso cos(qtorso M (S / 2)) l1l2 cos(q2 q1 )
l1lctorso cos(qtorso q1 (S / 2)) 2l2lctorso cos(qtorso q2 (S / 2))] I 2 I 3 I 4 I 5 I torso
h22 [m2 (lc22 l1lc2 cos(q2 q1))][m3(l22 lc23 l1l2 cos(q2 q1) l1lc3 cos(q1 q3)
2l2lc3 cos(q2 q3))][m4 (l22 l32 lc24 l1l2 cos(q2 q1) l1l3 cos(q3 q1) l1lc4 cos(q1 q4 )
2l2l3 cos(q3 q2 ) 2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))][m5 (l22 l32 l42 lcfswing
2
l1l2 cos(q2 q1) l1l3 cos(q3 q1) l1l4 cos(q4 q1) l1lcfswingcos(q1 (S / 2) qfswing E fswing) 2l2l3 cos(q3 q2 )
2l2l4 cos(q4 q2 ) 2l2lcfswingcos(q2 (S / 2) qfswing E fswing) 2l3l4 cos(q4 q3)
2l4lcfswingcos(q4 (S / 2) qfswing E fswing) 2l3lcfswingcos(q3 (S / 2) qfswing E fswing))]
[mtor(l22 lctorso
2
l1l2 cos(q2 q1) l1lctorsocos(qtorso q1 (S / 2)) 2l2lctorsocos(qtorso q2 (S / 2))]
I2 I3 I4 I5 Itorso
h23 [m2 (lc22 )][m3 (l22 lc23 2l2lc3 cos(q2 q3 ))][m4 (l22 l32 lc24 2l2l3 cos(q3 q2 )
2l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))][m5 (l22 l32 l42 lcfswing
2
2l2l3 cos(q3 q2 ) 2l2l4 cos(q4 q2 )
2l2lcfswingcos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) q fswing E fswing)
2l3lcfswingcos(q3 (S / 2) q fswing E fswing))][mtor (l22 lctorso
2
2l2lctorsocos(qtorso q2 (S / 2))]
I 2 I3 I 4 I5 Itorso
h24 [m3 (lc23 l2lc3 cos(q2 q3 ))][m4 (l32 lc24 l2l3 cos(q3 q2 ) l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l32 l42 lcfswing
2
l2l3 cos(q3 q2 ) l2l4 cos(q4 q2 ) l2lcfswingcos(q2 (S / 2) q fswing E fswing)
2l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) q fswing E fswing) 2l3lcfswingcos(q3 (S / 2) q fswing E fswing))]
I3 I 4 I5
h25 [m4 (lc24 l2lc4 cos(q2 q4 ) lc4l3 cos(q3 q4 ))][m5 (l42 lcfswing
2
l2l4 cos(q4 q2 )
l2lcfswingcos(q2 (S / 2) q fswing E fswing) l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) q fswing E fswing)
l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I 4 I5
2
h26 [m5 (lcfswing l2lcfswingcos(q2 (S / 2) q fswing E fswing) l4lcfswingcos(q4 (S / 2) q fswing E fswing)
l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I5
2
h 27 [mtorso (l ctorso l 2 l ctorso cos(qtorso (S / 2) q 2 )] I torso
h31 [m3 (lc23 lc3le cos(q3 M) l1lc3 cos(q1 q3 ) l2lc3 cos(q2 q3 ))] [m4 (l32 lc24
l3le cos(q3 M) lc4le cos(q4 M) l1l3 cos(q3 q1 ) l1lc4 cos(q1 q4 ) l2l3 cos(q3 q2 )
l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))] [m5 (l32 l42 lcfswing
2
l3le cos(q3 M)
l4le cos(q4 M) lcfswingle cos(M (S / 2) q fswing E fswing) l1l3 cos(q3 q1 )
l1l4 cos(q4 q1 ) l1lcfswing cos(q1 (S / 2) q fswing E fswing) l2l3 cos(q3 q2 )
l2l4 cos(q4 q2 ) l2lcfswing cos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswing cos(q4 (S / 2) q fswing E fswing) 2l3lcfswing cos(q3 (S / 2) q fswing E fswing))] I 3 I 4 I 5
h32 [m3 (lc23 l1lc3 cos(q1 q3 ) l2lc3 cos(q2 q3 ))] [m4 (l32 lc24 l1l3 cos(q3 q1 )
l1lc4 cos(q1 q4 ) l2l3 cos(q3 q2 ) l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))] [m5 (l32 l42
2
lcfswing l1l3 cos(q3 q1 ) l1l4 cos(q4 q1 ) l1lcfswingcos(q1 (S / 2) q fswing E fswing)
l2l3 cos(q3 q2 ) l2l4 cos(q4 q2 ) l2lcfswingcos(q2 (S / 2) q fswing E fswing) 2l3l4 cos(q4 q3 )
2l4lcfswingcos(q4 (S / 2) q fswing E fswing) 2l3lcfswingcos(q3 (S / 2) q fswing E fswing))]
I3 I 4 I5
h33 [m3 (lc23 l2lc3 cos(q2 q3 ))][m4 (l32 lc24 l2l3 cos(q3 q2 ) l2lc4 cos(q2 q4 ) 2lc4l3 cos(q3 q4 ))]
[m5 (l32 l42 lcfswing
2
l2l3 cos(q3 q2 ) l2l4 cos(q4 q2 ) l2lcfswingcos(q2 (S / 2) qfswingE fswing)
2l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) qfswing E fswing) 2l3lcfswingcos(q3 (S / 2) qfswingE fswing))]
I3 I 4 I5
h 34 [m3 (lc23 )] [m4 (l32 lc24 2lc4l3 cos(q3 q4 ))] [m5 (l32 l42 lcfswing
2
2l3l4 cos(q4 q3 )
2l4lcfswing cos(q4 (S / 2) q fswing E fswing ) 2l3lcfswing cos(q3 (S / 2) q fswing E fswing ))]
I3 I 4 I5
2
h36 [m5 (lcfswing l4lcfswingcos(q4 (S / 2) qfswing E fswing) l3lcfswingcos(q3 (S / 2) qfswing E fswing))] I5
h 37 0
h41 [m4 (lc24 lc4le cos(q4 M) l1lc4 cos(q1 q4 ) l2lc4 cos(q2 q4 ) lc4l3 cos(q3 q4 ))]
[m5 (l42 lcfswing
2
l4le cos(q4 M) lcfswingle cos(M (S / 2) qfswing E fswing) l1l4 cos(q4 q1)
l1lcfswingcos(q1 (S / 2) qfswing E fswing) l2l4 cos(q4 q2 ) l2lcfswingcos(q2 (S / 2) qfswing E fswing)
l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) qfswing E fswing) l3lcfswingcos(q3 (S / 2) qfswing E fswing))]
I 4 I5
h42 [m4 (lc24 l1lc4 cos(q1 q4 ) l2lc4 cos(q2 q4 ) lc4l3 cos(q3 q4 ))][m5 (l42 lcfswing
2
l1l4 cos(q4 q1 ) l1lcfswingcos(q1 (S / 2) qfswing E fswing) l2l4 cos(q4 q2 )
l2lcfswingcos(q2 (S / 2) qfswing E fswing) l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) q fswing E fswing)
l3lcfswingcos(q3 (S / 2) qfswing E fswing))] I4 I5
h43 [m4 (lc24 l2lc4 cos(q2 q4 ) lc4l3 cos(q3 q4 ))][m5 (l42 lcfswing
2
l2l4 cos(q4 q2 )
l2lcfswingcos(q2 (S / 2) q fswing E fswing) l3l4 cos(q4 q3 ) 2l4lcfswingcos(q4 (S / 2) q fswing E fswing)
l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I4 I5
2
h 46 [m5 (l cfswing l 4 l cfswing cos(q 4 (S / 2) q fswing E fswing ))] I 5
h 47 0
2
h51 [m5 (lcfswing lcfswingle cos(M (S / 2) q fswing E fswing) l1lcfswingcos(q1 (S / 2) q fswing E fswing)
l2lcfswingcos(q2 (S / 2) q fswing E fswing) l4lcfswingcos(q4 (S / 2) q fswing E fswing)
l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I5
2
h52 [m5 (lcfswing l1lcfswingcos(q1 (S / 2) q fswing E fswing) l2lcfswingcos(q2 (S / 2) q fswing E fswing)
l4lcfswingcos(q4 (S / 2) q fswing E fswing) l3lcfswingcos(q3 (S / 2) q fswing E fswing))] I5
2
h 53 [m5 (l cfswing l 2 lcfswing cos(q2 (S / 2) q fswing E fswing )
l 4 lcfswing cos(q4 (S / 2) q fswing E fswing ) l3lcfswing cos(q3 (S / 2) q fswing E fswing ))] I 5
2
h54 [m5 (lcfswingl4lcfswingcos(q4 (S / 2) qfswing E fswing) l3lcfswingcos(q3 (S / 2) qfswing E fswing))] I5
2
h 55 [m5 (l cfswing l 4 l cfswing cos(q 4 (S / 2) q fswing E fswing ))] I 5
2
h 56 [m5 (l cfswing )] I 5
h 57 0
2
h61 [masstorso(ltor le ltor cos(qtorso (S / 2) M ) l1ltor cos(qtorso (S / 2) q1 )
l 2 ltor cos(qtorso (S / 2) q 2 ))] I torso
2
h62 [masstorso(ltor l1ltor cos(qtorso (S / 2) q1 ) l 2 ltor cos(qtorso (S / 2) q2 ))] I torso
2
h63 [masstorso(l tor l 2 l tor cos(qtorso (S / 2) q 2 ))] I torso
2
h67 [masstorso(ltor )] I torso , h64, 65, 66 0
In all of the above-mentioned components the following parameters have been used; they can
be seen in figure (1.6):

\psi = \beta_{m.c} + q_0 , \qquad \varphi = \lambda + q_0 + \beta_{m.c}      (32)
Fig. (1.6). (a) The foot's and (b) the support link's geometrical configurations.
where l_fswing and β_fswing refer to the indexes of the swing leg with respect to the geometrical
configuration of its mass centre, as can be deduced from figure (1.6).
The Coriolis and gravitational components of relation (30) can be calculated easily.
After calculation of the kinematic and dynamic parameters of the robot, the control process
of the system can be addressed. In summary, an adaptive control procedure has been
used for a known seven link biped robot; further details and the related definitions,
such as the distinction between known and unknown systems from the control point of view,
can be found in [Mousavi and Bagheri, 2007; Mousavi, 2006; Bagheri, Felezi et al., 2006].
For the simulation of the robot, the obtained parameters and relations are
inserted into the designed program in the Matlab environment. As can be seen in
the simulation results section, the main concern is the stability of the robot with respect
to the parameters that most affect its movement, namely the ankle
and hip joint parameters [Bagheri, Najafi et al., 2006]. As the simulation figures show,
the hip height and horizontal position have considerable effects
on the position of the ZMP and, consequently, on the stability of the robot. To
minimize the joint actuator torques, especially at the knee joint, the hip height measured
from the F.C.S. plays a decisive role.
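The ZMP bookkeeping behind these stability checks can be sketched in a few lines (Python). The arrays below are placeholder values for a single time instant, and the formula follows the slope form of the ZMP relation recalled earlier in this chapter; this is only an illustration, not the routine of the designed Matlab program:

import numpy as np

def zmp_x(m, x, z, ddx, ddz, I, ddth, g=9.81, lam=0.0):
    """Sagittal ZMP of an n-link chain on a surface declined by angle lam (rad)."""
    den = np.sum(m * (ddz + g*np.cos(lam)))
    num = (np.sum(m * (ddz + g*np.cos(lam)) * x)
           - np.sum(m * (ddx + g*np.sin(lam)) * z)
           - np.sum(I * ddth))
    return num / den

# Placeholder data for a 7-link model at one instant (masses, positions, accelerations)
m   = np.array([5.0, 1.0, 3.0, 3.0, 1.0, 0.5, 20.0])
x   = np.array([0.05, 0.05, 0.10, 0.20, 0.30, 0.35, 0.15])
z   = np.array([0.05, 0.20, 0.45, 0.55, 0.30, 0.10, 0.90])
ddx = np.zeros(7); ddz = np.zeros(7)
I   = np.full(7, 0.02); ddth = np.zeros(7)

x_zmp = zmp_x(m, x, z, ddx, ddz, I, ddth, lam=np.deg2rad(5))
x_heel, x_toe = -0.10, 0.20          # support-foot boundaries along x [m]
print(x_zmp, x_heel <= x_zmp <= x_toe)   # stable if the ZMP stays inside the foot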
4. Simulation Results
In the designed program, the mentioned simulation processes for the two types of
ZMP have been used for both the nominal and un-nominal gaits. For the un-nominal
walking of the robot, the hip parameter (hip height) has been changed to consider
the effects of the un-nominal motion upon the joint actuator torques. The results are
presented in figures (1.8) to (1.15), while the robot walks over declined surfaces for the
single phase of the walking; figure (1.15) shows the combined path of the robot. The
specifications used in the simulation of the robot are listed in table No. 1. Figures (1.8),
(1.10) and (1.12) display the moving type of ZMP with the nominal walking of the
robot. Figures (1.9), (1.11) and (1.13) show the same type of ZMP for the un-nominal
walking of the robot (with the changed hip height from the fixed coordinate system).
Figure (1.14) shows the fixed ZMP on a descending surface. As can be seen from the
table, the swing and support legs have the same geometrical and inertial values,
whereas in the designed program the user can choose different specifications.
Note that the impact between the swing leg and the ground has been taken into account in the
designed program as given in [Lum et al., 1999; Westervelt, 2003; Hon et al., 1978]. In what
follows, the sagittal movement and stability analysis of the seven link biped robot is
considered, whereas frontal-plane considerations are neglected. For convenience, 3D
simulations of the biped robot are presented. In table No. 1, l_an, l_ab and l_af define the foot
profile displayed in figure (1.7). The program enables the user to compare the results
presented in the figures, where the paths concern the single phase walking of the robot.
In the program, with the aid of the given break points, either a third-order spline or the
Vandermonde matrix has been used to provide the different trajectory paths. With the aid
of the designed program, the kinematic, dynamic and control parameters have been
evaluated, and the two types of ZMP have been investigated. The presented simulations
indicate the effect of the hip height on the joint actuator torques, with a view to minimizing
energy consumption and, especially, obtaining a fine stability margin. As can be seen in
figures (9.h), (11.h) and (13.h), for the un-nominal walking of the robot with the lower hip
height, the knee actuator torque values are larger than those obtained for the nominal gait
with the higher hip height, shown in figures (8.h), (10.h) and (12.h). This is because the
robot needs to bend its knee joint more at a low hip position, so a large knee joint torque is
required to support the robot. Hence, to reduce the load on the knee joint and thus achieve
minimum energy consumption, it is essential to keep the hip at a high position. Finally, the
trajectory path generation needs more precision with respect to the obtained kinematic
relations in order to avoid discontinuities in the link velocities. The presented results show
acceptable consistency with a typical robot.
Table 1. Specifications used in the simulation of the robot (parameters g, g_s, g_f, H_min, H_max, h_s, H_s and the foot-profile dimensions l_an, l_ab, l_af).
5. References
Peiman Naseradin Mousavi, Ahmad Bagheri, “Mathematical Simulation of a Seven Link Biped Robot
and ZMP Considerations”, Applied Mathematical Modelling, Elsevier, 2007, Vol. 31/1.
Q. Huang, K. Yokoi, S. Kajita, K. Kaneko, H. Arai, N. Koyachi, K. Tanie, “Planning Walking
Patterns For A Biped Robot”, IEEE Transactions on Robotics and Automation, VOL
17, No. 3, June 2001.
Craig, John J., “Introduction to Robotics: Mechanics and Control”, Addison-Wesley, 1989.
H. K. Lum, M. Zribi, Y. C. Soh, “Planning and Control of a Biped Robot”, International
Journal of Engineering Science 37(1999), pp. 1319-1349
Eric R. Westervelt, “Toward a Coherent Framework for the Control of Planar Biped
Locomotion”, A dissertation submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy, (Electrical Engineering Systems), In the
University of Michigan, 2003.
H. Hon, T. Kim, and T.Park, “Tolerance Analysis of a Spur Gear Train,” in Proc. 3rd DADS
Korean user’s Conf. 1978, pp. 61-81
Peiman Naseradin. Mousavi, “Adaptive Control of 5 DOF Biped Robot Moving On A
Declined Surface”, M.S Thesis, Guilan University, 2006.
A. Bagheri, M.E. Felezi, P. N. Mousavi, “Adaptive Control and Simulation of a Seven Link
Biped Robot for the Combined Trajectory Motion and Stability Investigations”,
WSEAS Transactions on Systems, Issue 5, Vol. 5, May 2006, pp: 1214-1222
A. Bagheri. F. Najafi, R. Farrokhi, R. Y. Moghaddam, and M. E. Felezi, “Design, Dynamic
Modification, and Adaptive Control of a New Biped Walking Robot”, International
Journal of Humanoid Robotics, Vol. 3, Num.1, March 2006, pp 105-126
7
Analytical Criterions for the Generation of Highly Dynamic Gaits for Humanoid Robots: Dynamic Propulsion Criterion and Dynamic Propulsion Potential
Anthony David
Université d’Orléans, Ecole Nationale Supérieure d’Ingénieurs de Bourges,
Laboratoire Vision et Robotique (LVR, EA 2078)
France
1. Introduction
Many studies have been made to develop walking anthropomorphic robots able to move in
environments well-adapted to human beings and able to cooperate with them. Among the
more advanced projects of humanoid robots, one can mention : the Honda robots P2, P3
(Hirai, 1997) (Hirai et al., 1998) and Asimo (Sakagami et al., 2002), the HRP series developed
by AIST (Kaneko et al., 1998) (Kajita et al., 2005) (Kaneko et al., 2004) (Morisawa et al., 2005),
the small robot QRIO proposed by Sony (Nagasaka et al., 2004), the KHR series developed
by KAIST (Kim et al., 2004) (Kim et al., 2005), the last robot of Waseda University having
seven degrees of freedom per leg (Ogura et al., 2004), Johnnie (Lohmeier et al., 2004) and H7
(Kagami et al., 2005). These robots are notably able to climb stairs and to carry their own
power supply during stable walking. The problem of dynamic locomotion and gait
generation for biped robots has been studied theoretically and experimentally with quite
different approaches. However, when researchers study the behavior or the design of
dynamic walking robots, they inevitably meet a number of intrinsic difficulties related to
these kinds of systems: a large number of parameters have to be optimized during the
design process or have to be controlled during the locomotion task; the intrinsic stability of
walking machines with dynamic behaviors is not robust; and the coordination of the legs is a
complex task. When a human walks, he actively uses his own dynamic effects to ensure his
propulsion. Today, some studies exploit the dynamic effects to generate walking gaits of
robots. In this research field, four kinds of approaches are used. The first one uses pragmatic
rules based on qualitative studies of human walking gaits (Pratt et al., 2001) (Sabourin et al.,
2004). The second one focuses on the mechanical design of the robot in order to obtain
natural passive dynamic gaits (Collins et al., 2005). The third one deals with theoretical
studies of limit cycles (Westervelt et al., 2004) (Canudas-de-Wit et al., 2002). The fourth one
creates various dynamic walking gaits with reference trajectories (Bruneau et al., 2001)
(Chevallereau et al., 2003). However, none of these approaches is really universal, and they do
not allow online dynamic adaptation of the robot motions as a function of the environment
and of the real capabilities of the system.
Our objective is to develop a more adaptive and universal approach, based on the
dynamic equations, in order to generate walking gaits with a highly dynamic behavior.
Moreover, we do not wish to impose exact temporal desired trajectories. On the contrary,
the capabilities of the robot, in terms of intrinsic dynamics and actuator torques, are
constantly evaluated and exploited as well as possible with the aim of reaching a desired
average walking rate of the system. To do this, we propose a strategy based on two
dynamic criterions which take into account, at every moment, the intrinsic dynamics of the
system and the capabilities of the actuator torques in order to modify the dynamics of the
robot directly. These criterions, called the “dynamic propulsion criterion” and the “dynamic
propulsion potential”, make it possible to produce analytically the motions of the legs during all
phases and all transition phases without specifying the events to perform. Even though this
method is universal, we illustrate it for a planar under-actuated robot without feet: RABBIT
(Buche, 2006). The particularity of this robot is that it cannot be statically stable, because the
contact area between the legs and the ground is very small. Thus, a control strategy based
on dynamic considerations is required.
The organization of this chapter is as follows. In the second part, the robot Rabbit and its
characteristics are presented. In part three, the dynamics of the biped robot Rabbit are
introduced. Then, in part four, the two dynamic criterions are expressed: the “dynamic
propulsion criterion” and the “dynamic propulsion potential”. In part five, we describe the
control strategy, based on these criterions, used to generate highly dynamic walking gaits.
The sixth part gives the simulation results obtained with this analytical approach.
where a_p = [\ddot{x}, \ddot{y}, \ddot{q}_p]^T is the trunk acceleration vector at R_p, a_k = [\ddot{q}_1, \ddot{q}_2, \ddot{q}_3, \ddot{q}_4]^T the joint
acceleration vector, F_{cj} (j = 1, 2) the contact forces applied to the feet and \Gamma the joint
torque vector. The index “p” refers to the trunk and the index “k” to the legs. The
subsystem (1), coming from the first three motion equations, is related to the trunk; this
subsystem is said to be passive. The subsystem (2), coming from the last four motion equations, is
related to the legs; this subsystem is said to be active.

M_p = {}^pM_p + {}^kM_p, \quad C_p = {}^pC_p + {}^kC_p, \quad G_p = {}^pG_p + {}^kG_p, \quad \text{with } {}^pC_p = 0      (3)
The following equation system is also obtained:

{}^pM_p\,a_p + {}^pC_p + {}^pG_p = -\left[{}^kM_p - H_k M_k^{-1} H_k^T\right] a_p + H_k M_k^{-1}\left[C_k + G_k - \Gamma\right] - {}^kC_p - {}^kG_p + \left[\sum_{i=1}^{2} D_{ip}^T F_{ci} - H_k M_k^{-1}\left(D_{1k}^T F_{c1} + D_{2k}^T F_{c2}\right)\right]      (4)

The left-hand term of (4) represents the trunk dynamics, whereas the right-hand term is the external
generalized force F_r applied to the trunk at R_p.
swing leg and the ground and landing without impact). In the second case, the actuator
torques will be a function of the joint trajectories. The method used to calculate these desired
actuator torques is the computed torque method with nonlinear decoupling of the
dynamics. For that, we express the actuator torques \Gamma as a function of the joint accelerations
a_k by isolating a_p in (1) and injecting it into (2):

\Gamma = \left[M_k - H_k^T M_p^{-1} H_k\right] a_k + \left[C_k - H_k^T M_p^{-1} C_p\right] + \left[G_k - H_k^T M_p^{-1} G_p\right] - \left[D_{1k}^T F_{c1} + D_{2k}^T F_{c2} - H_k^T M_p^{-1}\left(D_{1p}^T F_{c1} + D_{2p}^T F_{c2}\right)\right]      (5)

Then, we separate (5) into two subsystems: a subsystem, named A, composed of the
equations of the actuator torques used to generate a desired joint trajectory, and a subsystem,
named B, composed of the other equations. In subsystem B, we isolate the joint
accelerations associated with the actuator torques that are not used to impose a desired joint
trajectory. If we inject their expressions into subsystem A, subsystem A then
expresses the actuator torques used to impose the joint trajectories as a function of the desired joint
accelerations and of the other actuator torques. So, knowing the configuration of the robot
and the desired joint accelerations, given by the desired joint trajectories, we can calculate
the actuator torques.
4. Dynamic criterions
Based on the previous equations, two dynamic criterions are proposed : the dynamic
propulsion criterion and the dynamic propulsion potential.
that we have to minimize. Knowing the configuration of the robot, F_i and the expression of
the real generalized forces (right-hand term of eq. 4), we can calculate the desired joint torques
\Gamma^d needed to generate the desired trajectory:

F_i = -\left[{}^kM_p - H_k M_k^{-1} H_k^T\right] a_p + H_k M_k^{-1}\left[C_k + G_k - \Gamma^d\right] - {}^kC_p - {}^kG_p + \left[\sum_{i=1}^{2} D_{ip}^T F_{ci} - H_k M_k^{-1}\left(D_{1k}^T F_{c1} + D_{2k}^T F_{c2}\right)\right]      (10)
To calculate F_i, we use the expression of the trunk dynamics given by the left-hand term of (4),
where we replace the real values x, \dot{x}, \ddot{x}, y, \dot{y}, \ddot{y}, q_p, \dot{q}_p, \ddot{q}_p of the trajectory by their
desired values x_d, \dot{x}_d, \ddot{x}_d, y_d, \dot{y}_d, \ddot{y}_d, q_p^d, \dot{q}_p^d, \ddot{q}_p^d (in paragraph 5.1, we will see how
to obtain these desired values). The desired trajectory being defined in acceleration terms as
well as in velocity and position terms, we replace:

a_p^d = \left[\ddot{x}_d \;\; \ddot{y}_d \;\; \ddot{q}_p^d\right]^T      (11)

by:

a_p^d = \begin{bmatrix} \ddot{x}_d + K_v(\dot{x}_d - \dot{x}) + K_p(x_d - x) \\ \ddot{y}_d + K_v(\dot{y}_d - \dot{y}) + K_p(y_d - y) \\ \ddot{q}_p^d + K_v(\dot{q}_p^d - \dot{q}_p) + K_p(q_p^d - q_p) \end{bmatrix}      (12)

which gives us the following expression of F_i:

F_i = {}^pM_p\,a_p^d + {}^pC_p^d + {}^pG_p^d      (13)
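A minimal sketch of relations (12)-(13) (Python). The gains and trajectory values are illustrative, and the trunk terms pMp, pCp_d and pGp_d are placeholder quantities rather than Rabbit's real model:

import numpy as np

Kv, Kp = 20.0, 100.0                 # illustrative velocity / position gains

# Current trunk state (x, y, q_p) and its time derivative
s, sd = np.array([0.02, 0.80, 0.01]), np.array([0.30, 0.00, 0.00])
# Desired trunk trajectory at the same instant (position, velocity, acceleration)
s_des  = np.array([0.05, 0.80, 0.00])
sd_des = np.array([0.35, 0.00, 0.00])
sdd_des = np.zeros(3)

# Relation (12): desired acceleration corrected by a PD term on the tracking error
ap_d = sdd_des + Kv*(sd_des - sd) + Kp*(s_des - s)

# Relation (13): desired generalized force on the trunk (placeholder pMp, pCp, pGp)
pMp = np.diag([32.0, 32.0, 2.0])     # trunk mass/inertia block
pCp_d, pGp_d = np.zeros(3), np.array([0.0, 32.0*9.81, 0.0])
Fi = pMp @ ap_d + pCp_d + pGp_d
print(ap_d, Fi)

The PD correction in (12) is what keeps the real trunk trajectory converging towards the desired one even when the model terms are imperfect.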
For a planar system (Oxy plane), \tilde{P} is written as follows:

\tilde{P} = \begin{pmatrix} P_{Fx} \\ P_{Fy} \\ P_{Mz} \end{pmatrix} = \begin{pmatrix} \left[F_{rx}^{min}\,;\,F_{rx}^{max}\right] \\ \left[F_{ry}^{min}\,;\,F_{ry}^{max}\right] \\ \left[M_{rz}^{min}\,;\,M_{rz}^{max}\right] \end{pmatrix}      (14)

This criterion integrates the technology used as well as the constraints related to the
system’s design. In the case of Rabbit, we define three criterions, one for each leg (\tilde{P}_1, \tilde{P}_2)
and one for the robot (\tilde{P}).
The expressions of \tilde{P}_1 and \tilde{P}_2 are based on the expression of the external generalized force
generated by each leg and on the field of variation of the actuator torques. The expression of the
external generalized force performed by each leg is obtained from the decomposition of (4):

F_{rj} = -\left[{}^{kj}M_p - H_{kj} M_{kj}^{-1} H_{kj}^T\right] a_p + H_{kj} M_{kj}^{-1}\left[C_{kj} + G_{kj} - \begin{pmatrix}\Gamma_{hj} \\ \Gamma_{gj}\end{pmatrix}\right] - {}^{kj}C_p - {}^{kj}G_p + \left[D_{jp}^T - H_{kj} M_{kj}^{-1} D_{jk}^T\right] F_{cj}      (15)

where \Gamma_{hj} and \Gamma_{gj} are respectively the hip and knee actuator torques of leg j.
Then, we have to determine the field of variation of the actuator torques. The idea is
to limit the field of variation of the actuator torques, given by the actuator technology, such
that the joint actuators have the capability to stop the motion before reaching the joint limits.
To determine the limits of the field of variation of each actuator torque, we use three
relations. The first relation (16) integrates the actuator technology: the joint velocity and the
actuator torque give the maximum and the minimum actuator torque that the joint actuator
can generate for the next sample time:

(\dot{q}_i, \Gamma_i) \;\rightarrow\; (\Gamma_i^{max}, \Gamma_i^{min})      (16)

The second relation (17), based on (8), gives the expression of the joint acceleration as a
function of the actuator torque:

\Gamma_i \;\rightarrow\; \ddot{q}_i      (17)

The third relation (18) gives the new joint velocity \dot{q}_i^{\,t+\delta t} and position q_i^{\,t+\delta t}, knowing the
new joint acceleration \ddot{q}_i^{\,t+\delta t} and the old angular velocity \dot{q}_i^{\,t} and position q_i^{\,t}:

\dot{q}_i^{\,t+\delta t} = \dot{q}_i^{\,t} + \ddot{q}_i^{\,t+\delta t}\,\delta t, \qquad q_i^{\,t+\delta t} = q_i^{\,t} + \dot{q}_i^{\,t}\,\delta t + \ddot{q}_i^{\,t+\delta t}\,\frac{\delta t^2}{2}      (18)

where \delta t is the sample time.
With these three relations and the maximum and minimum joint limits (q_i^{min}, q_i^{max}), we limit
the field of variation of the actuator torque. The method to determine the maximum torque
that the joint actuator can generate is explained here; the principle is the same for the
minimum torque. This method is summarized in fig. 3.
With (16), we determine the maximum torque \Gamma_i^{max} that the technology of the joint actuator
can generate. Then, we check whether the joint actuator has the capability to stop the joint
motion before reaching the joint limit q_i^{max} if we apply \Gamma_i^{max}. To do so, we
calculate, with (17) and (18), the new joint configuration that would be obtained if \Gamma_i^{max}
were applied. Then, in order to stop the joint motion, we apply the minimum torque
that the joint actuator can generate in this new joint configuration, and we continue this loop as
long as the joint velocity is positive, the minimum torque being recalculated at each time step as a
function of the new joint configuration with (16). Finally, we compare the joint
position obtained with the joint limit q_i^{max}. If it is lower than q_i^{max}, \Gamma_i^{max} can be applied,
because the joint actuator has the capability to stop the joint motion before reaching the joint
limit. Otherwise, \Gamma_i^{max} is reduced until the maximum value allowing the joint motion to stop
before the joint limit is obtained.
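The loop just described can be sketched as follows (Python). The torque-bound and acceleration maps are simple stand-ins for relations (16) and (17), which in practice come from the actuator technology and the robot dynamics, and all numerical values are hypothetical:

import numpy as np

def torque_bounds(qd):
    """Stand-in for relation (16): torque limits shrinking with joint speed (illustrative)."""
    return 40.0 - 2.0*abs(qd), -40.0 + 2.0*abs(qd)

def accel_from_torque(tau, inertia=0.5):
    """Stand-in for relation (17)."""
    return tau / inertia

def can_stop_before_limit(q, qd, tau_test, q_max, dt=0.005):
    """Apply tau_test for one sample, then brake with the minimum torque until the
    velocity vanishes; return True if the joint stays below q_max (cf. fig. 3)."""
    qdd = accel_from_torque(tau_test)
    q  = q + qd*dt + qdd*dt*dt/2.0          # relation (18)
    qd = qd + qdd*dt
    while qd > 0.0:
        tau_min = torque_bounds(qd)[1]      # braking torque, recomputed each step
        qdd = accel_from_torque(tau_min)
        q  = q + qd*dt + qdd*dt*dt/2.0
        qd = qd + qdd*dt
    return q <= q_max

q, qd, q_max = 0.30, 1.0, 0.60
tau_max = torque_bounds(qd)[0]
while tau_max > 0.0 and not can_stop_before_limit(q, qd, tau_max, q_max):
    tau_max -= 1.0                          # reduce the candidate torque and try again
print(tau_max)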
So, knowing the field of variation of the torque that the joint actuators can generate (19) and
using (15), the dynamic propulsion potential of each leg and the dynamic propulsion
potential of the robot are computed.
Related to the design of Rabbit, one joint actuator per leg has an important influence on the
propulsion along the moving direction. It is the knee joint for the stance leg and the hip joint
for the swing leg. So, we decide to consider only the field of variation of one actuator torque
per leg to calculate its dynamic propulsion potential. Consequently, the dynamic propulsion
potential \tilde{P}_i of the leg i is:

\tilde{P}_i = \begin{pmatrix}
\left[\min\{F_{rx}(\Gamma_{hi}^{min},\Gamma_{gi}),\,F_{rx}(\Gamma_{hi}^{max},\Gamma_{gi})\}\,;\,\max\{F_{rx}(\Gamma_{hi}^{min},\Gamma_{gi}),\,F_{rx}(\Gamma_{hi}^{max},\Gamma_{gi})\}\right] \\
\left[\min\{F_{ry}(\Gamma_{hi}^{min},\Gamma_{gi}),\,F_{ry}(\Gamma_{hi}^{max},\Gamma_{gi})\}\,;\,\max\{F_{ry}(\Gamma_{hi}^{min},\Gamma_{gi}),\,F_{ry}(\Gamma_{hi}^{max},\Gamma_{gi})\}\right] \\
\left[\min\{M_{rz}(\Gamma_{hi}^{min},\Gamma_{gi}),\,M_{rz}(\Gamma_{hi}^{max},\Gamma_{gi})\}\,;\,\max\{M_{rz}(\Gamma_{hi}^{min},\Gamma_{gi}),\,M_{rz}(\Gamma_{hi}^{max},\Gamma_{gi})\}\right]
\end{pmatrix}      (20)

if the leg i is the swing leg, and:

\tilde{P}_i = \begin{pmatrix}
\left[\min\{F_{rx}(\Gamma_{hi},\Gamma_{gi}^{min}),\,F_{rx}(\Gamma_{hi},\Gamma_{gi}^{max})\}\,;\,\max\{F_{rx}(\Gamma_{hi},\Gamma_{gi}^{min}),\,F_{rx}(\Gamma_{hi},\Gamma_{gi}^{max})\}\right] \\
\left[\min\{F_{ry}(\Gamma_{hi},\Gamma_{gi}^{min}),\,F_{ry}(\Gamma_{hi},\Gamma_{gi}^{max})\}\,;\,\max\{F_{ry}(\Gamma_{hi},\Gamma_{gi}^{min}),\,F_{ry}(\Gamma_{hi},\Gamma_{gi}^{max})\}\right] \\
\left[\min\{M_{rz}(\Gamma_{hi},\Gamma_{gi}^{min}),\,M_{rz}(\Gamma_{hi},\Gamma_{gi}^{max})\}\,;\,\max\{M_{rz}(\Gamma_{hi},\Gamma_{gi}^{min}),\,M_{rz}(\Gamma_{hi},\Gamma_{gi}^{max})\}\right]
\end{pmatrix}      (21)

if the leg i is the stance leg.
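A compact sketch of how such a potential can be evaluated numerically (Python). Here F_r is a placeholder linear map standing in for relation (15) rather than Rabbit's actual dynamics, and the torque ranges are illustrative:

import numpy as np

def Fr(t_hip, t_knee):
    """Stand-in for relation (15): external generalized force (Frx, Fry, Mrz)
    produced by one leg for a given pair of hip/knee torques."""
    A = np.array([[0.9, 0.3],
                  [0.1, 0.8],
                  [0.4, -0.2]])
    return A @ np.array([t_hip, t_knee])

def propulsion_potential(bounds, fixed_torque, swing=True):
    """Relations (20)/(21): interval spanned by Fr when the relevant torque sweeps
    its admissible range while the other torque is held fixed."""
    if swing:   # swing leg: the hip torque is the one that matters
        samples = [Fr(t, fixed_torque) for t in bounds]
    else:       # stance leg: sweep the knee torque instead
        samples = [Fr(fixed_torque, t) for t in bounds]
    samples = np.array(samples)
    return np.c_[samples.min(axis=0), samples.max(axis=0)]   # [min, max] per component

P1 = propulsion_potential((-30.0, 30.0), fixed_torque=5.0, swing=True)
P2 = propulsion_potential((-25.0, 25.0), fixed_torque=4.0, swing=False)
print(P1 + P2)    # relation (22): potential of the robot during single contact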
Knowing the potential of each leg, we calculate the dynamic propulsion potential \tilde{P} of the
robot. During the single contact phase, with no closed-loop constraint to respect, we have
directly:

\tilde{P} = \tilde{P}_1 + \tilde{P}_2      (22)

During the double contact phase, we have to respect the closed-loop constraint along the y
direction. So, we limit again the field of variation of the actuator torques at the knee joints by
taking this constraint into account. Since equation (9) establishes a direct relation between the
two actuator torques, we recalculate the minimum and maximum torque of each joint
actuator compatible with the minimum and maximum torque of the other joint.
This gives the new field of variation of the torque in (19) in the case of the double contact
phase. Then, we recalculate with (21) the dynamic propulsion potential of each leg and, in
the same way as during the single contact phase, the potential of the robot with (22).
5. Control Strategy
Based on the dynamic propulsion potentials of the two legs (\tilde{P}_1, \tilde{P}_2) and of the robot (\tilde{P}), we are able
to produce, for a desired average walking speed v_a, the motion during the different
locomotion phases and to detect the necessity to perform a transition from one phase to
another. To do this, we have to define on one hand the trajectory of R_p as a function of v_a
and, on the other hand, the control strategy used to generate this desired
trajectory.
if the leg 1 is the stance leg and the leg 2 the swing leg. So, as we have three equations ((26),
(27) and (25)), we can calculate the last three desired actuator torques directly.
3) Single Contact Phase: During this phase, we use the same strategy as during the
transition phase. The only difference is that we use the knee actuator of the swing leg to
maintain the desired height of the tip of the swing leg rather than to reach it.
4) Transition from the Single to the Double Contact Phase: In order not to exceed the
capability of the legs, we initiate the return to the double contact phase when the dynamic
propulsion potential of one leg is weak.
During this transition, we have to ensure the landing of the swing leg while avoiding a
strong impact. So, we must control the trajectory of the tip of the swing leg along the x and
y directions. To carry this out, we use the same strategy as the one used to control
the height of the swing leg’s tip during the single contact phase. However, since we have to
control the trajectory along two directions, we decide to use the two joint actuators of
the swing leg.
With the two other joint actuators we perform the dynamic propulsion. As we do not use
the swing leg to ensure the dynamic propulsion, we compare the force and the torque that
the stance leg has to produce to generate F_{ix} and M_{iz}, respectively, with its dynamic
propulsion potential. If the stance leg does not have the capability to generate them, we
limit them. So, as we have two equations ((24) and (25)), we can calculate the last two
desired actuator torques directly.
6. Simulation results
With these criterions, we develop a strategy to control RABBIT and we validate it with
the simulation of the first step of a walking gait. To start, we define the desired average
velocity at R_p. As an illustration, the linear velocity \dot{x} is imposed and the desired trunk
angle q_p is equal to zero; the evolution of the height y is left free. So, we ensure the
dynamic propulsion along the x direction and around the z direction only. The next
idea is to modify, via the joint torques, the dynamic effects generated by the legs in
order to make F_r converge towards F_i. With the equations along the x direction and
around the z direction, we can calculate \Gamma^d directly. However, in this case, we do not
take into account the constraints related to the robot’s design and to the generation of
walking gaits. These constraints are introduced in the next section as functions of the
locomotion phases.
Fig. 5. Dynamic propulsion potential and real force of the back leg along the x direction at R_p.
Fig. 7. Dynamic propulsion potential and real force of the robot along the x direction at R_p.
8. References
Bruneau, O.; Ouezdou, F.-B. & Fontaine, J.-G. (2001). Dynamic Walk of a Bipedal Robot
Having Flexible Feet, Proceedings of IEEE International Conference On Intelligent
Robots and Systems, 2001.
Buche, G. (2006). ROBEA Home Page, http://robot-rabbit.lag.ensieg.inpg.fr/English/, 2006.
Canudas-de-Wit, C.; Espiau, B. & Urrea, C. (2002). Orbital Stabilization of Underactuated
Mechanical System, Proceedings of IFAC, 2002.
Chevallereau, C. & Djoudi, D. (2003). Underactuated Planar Robot Controlled via a Set of
Reference Trajectories, Proceedings of International Conference On Climbing and
Walking Robots, September 2003.
Collins, S. H. & Ruina, A. (2005). A Bipedal Walking Robot with Efficient and Human-Like
Gait, Proceedings of IEEE International Conference On Robotics and Automation,
Barcelona, April 2005, Spain.
Hirai, K. (1997). Current and Future Perspective of Honda Humanoid Robot, Proceedings of
IEEE International Conference On Intelligent Robots and Systems, PP. 500-508, 1997.
Hirai, K.; Hirose, M. ; Haikawa, Y. & Takenaka, T. (1998). The Development of Honda
Humanoid Robot, Proceedings of IEEE International Conference On Robotics and
Automation, PP. 1321-1326, 1998.
Kagami, S.; Nishiwaki, K.; Kuffner Jr, J.-J.; Kuniyoshi, Y.; Inaba, M. & Inoue, H. (2005).
Online 3D Vision, Motion Planning and Biped Locomotion Control Coupling
System of Humanoid Robot : H7, Proceedings of IEEE International Conference On
Intelligent Robots and Systems, PP. 2557-2562, 2005.
Kajita, S.; Nagasaki, T.; Kaneko, K.; Yokoi, K. & Tanie, K. (2005). A Running Controller of
Humanoid Biped HRP-2LR, Proceedings of IEEE International Conference On Robotics
and Automation, PP. 618-624, Barcelona, April 2005, Spain.
Kaneko, K.; Kanehiro, F.; Kajita, S.; Yokoyama, K.; Akachi, K.; Kawasaki, T.; Ota, S. &
Isozumi, T. (1998). Design of Prototype Humanoid Robotics Platform for HRP,
Proceedings of IEEE International Conference On Intelligent Robots and Systems, PP.
2431-2436, 1998.
Kaneko, K.; Kanehiro, F. & Kajita, S. (2004). Humanoid Robot HRP-2, Proceedings of IEEE
International Conference On Robotics and Automation, PP. 1083-1090, 2004.
Kim, J.-Y.; Park, I.-W.; Lee, J.; Kim, M.-S.; Cho, B.-K. & Oh, J.-H. (2005). System Design and
Dynamic Walking of Humanoid Robot KHR-2, Proceedings of IEEE International
Conference On Robotics and Automation, PP. 1443-1448, Barcelona, April 2005, Spain.
Kim, J.-Y. & Oh, J.-H. (2004). Walking Control of the Humanoid Platform KHR-1 based on
Torque Feedback Control, Proceedings of IEEE International Conference On Robotics
and Automation, PP. 623-628, 2004.
Lohmeier, S.; Löffler, K. ; Gienger, M.; Ulbrich, H. & Pfeiffer, F. (2004). Computer System
and Control of Biped “Johnnie”, Proceedings of IEEE International Conference On
Robotics and Automation, PP 4222-4227, 2004.
Morisawa, M.; Kajita, S.; Kaneko, K.; Harada, K.; Kanehiro, F.; Fujiwara, K. & H. Hirukawa,
H. (2005). Pattern Generation of Biped Walking Constrained on Parametric Surface,
Proceedings of IEEE International Conference On Robotics and Automation, PP. 2416-
2421, Barcelona, April 2005, Spain.
Nagasaka, K.; Kuroki, Y.; Suzuki, S.; Itoh,Y. and Yamaguchi, J. (2004). Integrated Motion
Control for Walking, Jumping and Running on a Small Bipedal Entertainment
Robot, Proceedings of IEEE International Conference On Intelligent Robots and Systems,
PP. 3189-3194, 2004.
Pratt, J. E.; Chew, C.-M.; Torres, A.; Dilworth, P. & Pratt, G. (2001). Virtual Model Control :
an Intuitive Approach for Bipedal Locomotion, International Journal of Robotics
Research, vol. 20, pp. 129-143, 2001.
Sabourin, C.; Bruneau, O. & Fontaine, J.-G. (2004). Pragmatic Rules for Real-Time Control of
the Dynamic Walking of an Under-Actuated Biped Robot, Proceedings of IEEE
International Conference On Robotics and Automation, April 2004.
Sabourin, C.; Bruneau, O. & Buche, G. (2006). Control strategy for the robust dynamic
walk of a biped robot, International Journal of Robotics Research, Vol.25, N°9, pp.
843--860.
Sakagami,Y.; Watanabe,R.; Aoyama, C.; Matsunaga, S.; Higaki, N. & Fujimura, K. (2002).
The Intelligent ASIMO: System Overview and Integration, Proceedings of IEEE
International Conference On Intelligent Robots and Systems, PP. 2478-2483, 2002.
Design of a Humanoid Robot Eye
1. Introduction
This chapter addresses the design of a robot eye featuring the mechanics and motion
characteristics of a human one. In particular, the goal is to provide guidelines for the
implementation of a tendon driven robot capable of emulating saccadic motions.
In the first part of this chapter the physiological and mechanical characteristics of the eye-
plant1 in humans and primates will be reviewed. Then, the fundamental motion strategies
used by humans during saccadic motions will be discussed, and the mathematical
formulation of the relevant Listing’s Law and Half-Angle Rule, which specify the geometric
and kinematic characteristics of ocular saccadic motions, will be introduced.
From this standpoint a simple model of the eye-plant will be described. In particular it will
be shown that this model is a good candidate for the implementation of Listing’s Law on a
purely mechanical basis, as many physiologists believe happens in humans. Therefore, the
proposed eye-plant model can be used as a reference for the implementation of a robot
emulating the actual mechanics and actuation characteristics of the human eye.
The second part of this chapter will focus on the description of a first prototype of fully
embedded robot eye designed following the guidelines provided by the eye-plant model.
Many eye-head robots have been proposed in the past few years, and several of these
systems have been designed to support and rotate one or more cameras about independent
or coupled pan-tilt axes. However, little attention has been paid to emulate the actual
mechanics of the eye, although theoretical investigations in the area of modeling and control
of human-like eye movements have been presented in the literature (Lockwood et al., 1999;
Polpitiya & Ghosh, 2002; Polpitiya & Ghosh, 2003; Polpitiya et al., 2004).
Recent works have focused on the design of embedded mechatronic robot eye systems (Gu
et al., 2000; Albers et al., 2003; Pongas et al., 2004). In (Gu et al., 2000), a prosthetic
implantable robot eye concept has been proposed, featuring a single degree-of-freedom.
Pongas et al., (Pongas et al., 2004) have developed a mechanism which actuates a CMOS
micro-camera embedded in a spherical support. The system has a single degree-of-freedom,
and the spherical shape of the eye is a purely aesthetic detail; however, the mechatronic
approach adopted has addressed many important engineering issues and led to a very
interesting system.
1 By eye-plant we mean the eye-ball and all the mechanical structure required for its motion.
In the prototype developed by Albers et al. (Albers et al., 2003) the
design is more humanoid. The robot consists of a sphere supported by slide bearings and
moved by a stud constrained by two gimbals. The relevance of this design is that it actually
exploits the spherical shape of the eye; however, the types of ocular motions which could be
generated using this system have not been discussed.
In the following sections the basic mechanics of the eye-plant in humans will be described
and a quantitative geometric model introduced. Then, a first prototype of a tendon driven
robot formed by a sphere held by a low friction support will be discussed. The second part
of the chapter will describe some of the relevant issues faced during the robot design.
Fig. 1. Frontal and side view of the eye: qualitative placement of recti muscles.
Recent anatomical and physiological studies have suggested that the four recti have an
important role in the implementation of saccadic motions, which obey the so-called
Listing’s Law. In fact, it has been found that the path of the recti muscles within the orbit is
constrained by soft connective tissue (Koornneef, 1974; Miller, 1989; Demer et al., 1995;
Clark et al., 2000; Demer et al., 2000), named soft-pulleys. The role of the soft-pulleys in
generating ocular motions compatible with Listing’s Law in humans and primates is still
debated (Hepp, 1994; Raphan, 1998; Porrill et al., 2000; Wong et al., 2002; Koene & Erkelens
2004; Angelaki, 2004); however, analytical and simulation studies suggest that the
implementation of Listing’s Law on a mechanical basis is feasible (Polpitiya, 2002; Polpitiya,
2003; Cannata et al., 2006; Cannata & Maggiali, 2006).
suggested that the mechanics of the eye plant could have a significant role in the
implementation of Half Angle Rule and Listing’s Law (Quaia & Optican, 1998; Raphan 1998;
Porril et al., 2000; Koene & Erkelens, 2004), although counterexamples have been presented
in the literature (Hepp, 1994; Wong et al., 2002).
Fig. 2. Geometry of Listing-compatible rotations. The finite rotation of the eye-fixed frame
<e>, with respect to <h>, is described by a vector v always orthogonal to h3.
Fig. 3. Half Angle Rule geometry. The eye’s angular velocity must belong to the plane P_ω
passing through axis v.
4. Eye Model
The eye in humans has an almost spherical shape and is actuated by six extra-ocular
muscles. Each extra-ocular muscle has an insertion point on the sclera, and is connected
with the bottom of the orbit at the other end. According to the rationale proposed in
(Haslwanter, 2002; Koene & Erkelens, 2004), only the four recti extra-ocular muscles play a
significant role during saccadic movements. In (Lockwood et al., 1989), a complete 3D model
of the eye plant including a nonlinear dynamic description of the extra-ocular muscles has
been proposed. This model has been extended in (Polpitiya & Ghosh, 2002; Polpitiya &
Ghosh, 2003), including also a description of the soft pulleys as elastic suspensions (springs).
However, this model requires that the elastic suspensions perform particular movements in
order to ensure that Listing’s Law is fulfilled. The model proposed in (Cannata et al., 2006;
Cannata & Maggiali, 2006), and described in this section, is slightly simpler than the
previous ones. In fact, it does not include the dynamics of the extra-ocular muscles, since it can
be shown that they have no role in implementing Listing’s Law, and it models the soft pulleys
as fixed pointwise pulleys. As will be shown in the following, the proposed model, thanks to its
simplicity, can also be used as a guideline for the design of humanoid tendon driven robot
eyes.
Fig. 4. The relative position of pulleys and insertion points when the eye is in the primary position.
From expression (2), it is clear that |pi| does not affect the direction or the magnitude of mi
so we can assume in the following that |pi| = |ci|. Instead, the orientation of the vectors pi,
called principal directions, are extremely important. In fact, it is assumed that pi and ci are
symmetric with respect to the plane L; this condition implies:
( v ⋅ ci ) = ( v ⋅ pi ) ∀ i = 1…4 , ∀ v ∈ L (3)
Finally, it is assumed that insertion points are symmetric with respect to the fixation axis:
(h3 ⋅ ci ) = (h3 ⋅ c j ) ∀i, j = 1…4 (4)
and
(c3 - c1 ) ⋅ (c4 - c2 ) = 0 (5)
The problem now is to show, according to formula (2), when arbitrary torques m_i ∈ P_ω can
be generated using only pulling forces. Theorem 2 and theorem 3 imply that the m_i are all
orthogonal to the vector n_ω, normal to plane P_ω. Therefore, formula (2) can be rewritten as:

\tau = -n_\omega \times \left(\sum_{i=1}^{4} \gamma_i\, r_i\right)      (6)

where the coefficients

\gamma_i = \frac{\tau_i}{\left| n_\omega \times r_i \right|} \geq 0 \quad \forall\, i = 1 \ldots 4      (7)

take into account the actual pulling forces generated by the extra-ocular muscles. From
formula (6), it is clear that \tau is orthogonal to a convex linear combination of the vectors r_i. Then,
it is possible to generate any torque vector lying on plane P_\omega, as long as n_\omega belongs to the
convex hull of the vectors r_i, as shown in Fig. 5.
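A small numerical sketch of this feasibility test (Python). The insertion-point directions r_i and the plane normal n_ω below are arbitrary illustrative vectors; the test checks whether n_ω lies in the cone of nonnegative combinations of the r_i, i.e. a scaled version of the convex-hull condition stated above, which is the condition under which pulling forces alone suffice:

import numpy as np
from scipy.optimize import nnls

# Illustrative unit-like vectors r_1..r_4 (columns) and a candidate plane normal n_w
r = np.array([[ 0.8,  0.5,  0.3],
              [ 0.8, -0.5,  0.3],
              [ 0.8,  0.5, -0.3],
              [ 0.8, -0.5, -0.3]]).T
n_w = np.array([1.0, 0.1, 0.0])

# Solve min ||r @ g - n_w|| subject to g >= 0 (non-negative least squares)
g, residual = nnls(r, n_w)
inside_cone = residual < 1e-9
print(g, inside_cone)   # nonnegative weights and feasibility flag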
Fig. 5. When vector n_ω belongs to the convex hull of the vectors r_i, the recti extra-ocular
muscles can generate any admissible torque on P_ω.
Remark 2: The discussion above shows that the placement of the insertion points affects the
range of admissible motions. According to the previous discussion when the eye is in its
primary position any torque belonging to plane L can be assigned. The angle β formed by
the insertion points with the optical axis determines the actual eye workspace. For an angle
β = 55 deg, the eye can rotate by about 45 deg in all directions with respect to the direction of
fixation at the primary position.
Assume now that, under the assumptions made in section 3, a simplified dynamic model of
the eye can be expressed as:

I\,\dot{\omega} = \tau      (8)

where I is a scalar describing the moment of inertia of the eye-ball, while \dot{\omega} is the angular
acceleration of the eye. Let us assume that at time 0 the eye is in the primary position, with
zero angular velocity (zero state). Then, the extra-ocular muscles can generate a resulting
torque of the form:

\tau = v\,u(t)      (9)

where v ∈ L is a constant vector and u(t) a scalar control signal. Therefore, \dot{\omega} and \omega are
parallel to v; then, it is possible to reach any Listing-compatible orientation and, during
the rotation, the Half Angle Rule is also satisfied. Similar reasoning can be applied to control the
eye orientation back to the primary position starting from any Listing-compatible orientation and
zero angular velocity.
The above analysis proves that saccadic motions from the primary position to arbitrary
secondary positions can be implemented on a mechanical basis. Moreover, simulation
examples discussed in (Cannata & Maggiali, 2006) show that generic saccadic motions
can also be implemented by adopting the proposed model. Theoretical investigations of the model
properties are currently ongoing to obtain a formal proof of the evidence provided by the
simulation tests.
mm, in the current implementation, and less than 20 mm for an eye of human size), and
limited pulling force. In fact, a pulling force of 2.5 N would generate a nominal angular
acceleration of about 6250 rad s^-2 for an eye-ball with a mass of 50 g and a radius of 20 mm,
and about 58000 rad s^-2 in the case of an eye of human size with a mass of 9 g and a radius of
12 mm. The actuators used in the current design are standard miniature DC servo motors
with integrated optical encoders; however, various alternative candidate solutions have been
taken into account, including shape memory alloys and artificial muscles. According to
recent advances (Carpi et al., 2005; Cho & Asada, 2005), these technologies seem very
promising as alternative solutions to DC motors, mostly in terms of size and mass (currently
the mass of the motors is about 160 g, i.e. over 50% of the total mass of the system, without
including electronics). However, presently both shape memory alloys and artificial muscles
require significant engineering to achieve operational devices, and therefore they have not been
adopted for the first prototype implementation.
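The angular-acceleration figures quoted above can be reproduced with a few lines (Python), modelling the eye-ball as a homogeneous sphere driven by a tendon acting tangentially at its surface. This is only a back-of-the-envelope check, not the sizing procedure actually used for the prototype:

# Nominal angular acceleration of a homogeneous spherical eye-ball pulled by one tendon
def eye_angular_accel(force_N, mass_kg, radius_m):
    inertia = 0.4 * mass_kg * radius_m**2      # solid sphere: I = (2/5) m r^2
    torque = force_N * radius_m                # tendon assumed tangent at the surface
    return torque / inertia

print(eye_angular_accel(2.5, 0.050, 0.020))    # prototype-sized ball: ~6250 rad/s^2
print(eye_angular_accel(2.5, 0.009, 0.012))    # human-sized eye:      ~58000 rad/s^2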
In the following the major components and subsystems developed are reviewed.
6. The Eye-Ball
The eye ball is a precision PTFE sphere with a diameter of 38.1 mm (1.5 in). The sphere has
been CNC machined to host a commercial CMOS camera and a suspension spring, and to route
the power supply and video signal cables to the external electronics. A frontal flange is used
to allow the connection of the tendons at the specified insertion points, and to support the
miniature screws required to calibrate the position of the camera within the eye ball. Finally, a
spherical cover is placed on the flange, purely for aesthetic reasons. Fig. 7 and Fig.
8 show the exploded view and the actual eye-ball.
Fig. 8. The machined eye-ball (left), and the assembled eye-ball (camera cables shown in
background).
The insertion points form an angle of 55 deg with respect to the (geometric) optical axis of
the eye; therefore, the eye-ball can rotate by about 45 deg in all directions. The tendons used
are monofiber nylon-coated wires with a nominal diameter of 0.25 mm, which approximate
well the geometric model proposed for the extra-ocular muscles.
7. Supporting Structure
The structure designed to support the eye ball is formed by two distinct parts: a low friction
support, designed to hold the eye ball (Fig. 9), and a rigid flange used to implement the
pointwise pulleys and to provide the appropriate routing of the actuation tendons required to
ensure the correct mechanical implementation of Listing’s Law (Fig. 13).
Fig. 9. CAD model of the eye-ball support (first concept).
Let us assume the eye in a Listing compatible position A, then we may assume that a generic
tendon is routed as sketched in Fig. 12. The pulley shown in the figure is tangent to the
principal direction at point pi, and it allows the tendon to pass through pi. Assume now to
rotate the eye to another Listing compatible position B; if the pulley could tilt about the
principal axis, during the eye rotation, the tangential point pi would remain the same so that
the tendon is still routed through point pi. Therefore, the idea of the pulley tilting (about the
principal axis) and possibly rotating (about its center), fully meets the specifications of the
pointwise pulleys as defined for the eye model.
Fig. 10. The eye-ball support is formed by two PTFE parts mated together (final design).
Fig. 11. During Listing compatible eye motions, the insertion points move within the region
internal to the blue curve. The red marker represents the position of the insertion point at the
primary position, while the green markers are the positions of (two of) the frontal contact
points on the eye-ball support. The fixation axis at the primary position, h3, points upward.
Fig. 12. Sketch of the tendon’s paths, showing the tilting of the routing pulley when the eye
is rotated from position A to position B (tendons and pulleys diameters not to scale).
Fig. 13. Detail of the flange implementing the pointwise pulleys. The tendon slides along a
section of a toroidal surface.
A component featuring these characteristics could be implemented, but its miniaturization
and integration have been considered too complex, so we decided to implement a virtual
pulley, intended as the surface formed by the envelope of all the tilting pulleys over all
the admissible eye orientations. Since the pulley tilts about the principal axis at point p_i, the
envelope is a section of a torus with inner diameter equal to the radius of the tendon and
external radius equal to the radius selected for the pulley. The implementation of
the virtual pulley has then been obtained by machining a section of a torus on the supporting
flange, as shown in Fig. 13.
The assembly of the eye-ball and its supporting structure is shown in Fig. 14.
Fig. 15. Four DC motors actuate the tendons; pulleys route the tendons towards the eye-ball.
The servo-motors are equipped with optical encoders providing the main feedback for the
control of the ocular movements. A second set of sensors for measuring the mechanical
tension of the tendons is integrated in the robot. In fact, as the tendons can only apply
pulling forces, control of the ocular movements can be properly obtained only if slackness of
the tendons or their excessive loading is avoided. The tension sensors are custom made and
integrated within the supporting structure of the eye, Fig. 17.
Each sensor is formed by an infrared LED/photodiode pair separated by a mobile shutter,
preloaded with a phosphor-bronze spring. As shown in Fig. 18, the tension of the tendon
counter-balances the pre-load force, thus varying the amount of IR radiation received. The
sensor output is the current generated by the photodiode according to the following
equation:

I_p = k_p \, \gamma(f) \, E_0     (10)

where I_p is the current generated by the photodiode, k_p is the characteristic parameter of the
photodiode, E_0 is the IR radiation emitted by the LED, and γ(f) is a monotonic non-linear
function of the tendon's tension that depends on the system geometry. Each sensor is
calibrated, and a look-up table is used to map its current-to-tension characteristic.
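As a concrete illustration of that last step, the following minimal sketch (with made-up calibration pairs, not the actual device tables) inverts the monotonic current-to-tension characteristic of one sensor by linear interpolation of its look-up table:

```python
# Minimal sketch of the per-sensor look-up table; the calibration pairs below
# are hypothetical placeholders, not measured values.
import numpy as np

CAL_CURRENT = np.array([0.10, 0.25, 0.45, 0.70, 0.95])   # photodiode current [mA]
CAL_TENSION = np.array([0.0, 0.5, 1.5, 3.0, 5.0])        # tendon tension [N]

def tension_from_current(i_p):
    """Invert Eq. (10) numerically by interpolating the calibrated characteristic."""
    return float(np.interp(i_p, CAL_CURRENT, CAL_TENSION))

print(tension_from_current(0.5))   # about 1.8 N with the data above
```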
Fig. 16. The body of the MAC-EYE robot. All the motors are integrated within the body;
furthermore, all the cables from and to the eye-ball and the embedded optical tension
sensors are routed inside the structure.
Fig. 17. Implementation of the embedded sensors for measuring the mechanical tension of
the tendons. The picture shows the shutter and its preloading spring, cables and tendons.
Fig. 19. Robot eye control scheme. The picture shows the control loops related to a single
actuator.
Assume that a reference trajectory for the eye is given, expressed by a rotation matrix R*(t)
and an angular velocity ω*(t). Then an algebraic and a differential mapping, relating the eye
orientation to the displacement of the tendons, can be easily computed. In fact, as shown in
Fig. 20, for a given eye orientation the tendon starting from the insertion point r_i remains in
contact with the eye-ball up to a point t_i (point of tangency). From t_i the tendon reaches the
pointwise pulley and then runs towards the motor which provides the actuation force. For a
given position of the eye there exists a set of displacements of the free ends of the tendons
corresponding to the amount of rotation to be commanded to the motors. According to Fig. 20,
for each tendon this amount can be computed (with respect to a reference eye position, e.g.
the primary position) using the formula:
x_i = R\,(\varphi_i - \varphi_{0i})     (11)
Fig. 20. The plane of tendon i for a generic eye orientation, showing the insertion point r_i, the
tangency point t_i, the pointwise pulley point p_i and the angles φ_i and α_i. Angle α_i is constant
for any eye orientation.
where x_i is the displacement of the free end of the tendon, while φ_i and φ_0i are the angles
(with positive sign) formed by the vectors r_i and t_i at a generic eye orientation and at the
primary position, respectively. Signs are chosen so that if x_i < 0 the corresponding tendon
must be pulled in order to orient the eye as specified by the matrix R*(t). In order to compute
x_i, the angle φ_i can be determined as follows. According to Fig. 20, the angle α_i must be
constant for any eye orientation and can be expressed as:
\alpha_i = \cos^{-1}\!\left(\frac{R}{d_i}\right)     (12)
If the eye orientation is known with respect to frame <h>, then ri is known, hence:
R\,d_i \cos(\alpha_i + \varphi_i) = r_i \cdot p_i     (13)
and finally, from equations (12) and (13), we obtain:
\varphi_i = \cos^{-1}\!\left(\frac{r_i \cdot p_i}{R\,d_i}\right) - \cos^{-1}\!\left(\frac{R}{d_i}\right)     (14)
The time derivative of φ_i can be computed by observing that \dot{r}_i = ω × r_i, where ω is the
angular velocity of the eye; then the time derivative of equation (13) can be written as:

\frac{d}{dt}(r_i \cdot p_i) = -R\,d_i \sin(\alpha_i + \varphi_i)\,\dot{\varphi}_i     (15)

Therefore, we obtain the following equality:

(\omega \times r_i)\cdot p_i = -R\,d_i \sin(\alpha_i + \varphi_i)\,\dot{\varphi}_i     (16)
Then, by observing that (\omega \times r_i)\cdot p_i = (r_i \times p_i)\cdot \omega, we have:

\dot{\varphi}_i = -\frac{1}{\left\| r_i \times p_i \right\|}\,(r_i \times p_i)\cdot \omega     (17)
Then, if R*(t) and ω*(t) are the desired trajectory and angular velocity of the eye, the
reference motor angles and velocities can be computed using formulas (11), (14) and (17) as:
q_i^{*}(t) = \frac{R}{R_m}\left[\cos^{-1}\!\left(\frac{r_i^{*}\cdot p_i}{R\,d_i}\right) - \cos^{-1}\!\left(\frac{R}{d_i}\right) - \varphi_{0i}\right]

\dot{q}_i^{*}(t) = -\frac{R}{R_m}\,\frac{1}{\left\| r_i^{*} \times p_i \right\|}\left(r_i^{*} \times p_i\right)\cdot \omega^{*}     (18)
where Rm is the radius of the motor pulley.
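The following sketch (my own notation, not the authors' code) turns Eqs. (11), (14), (17) and (18) into a routine that maps a desired eye orientation R*(t) and angular velocity ω*(t) to reference motor angles and velocities; the insertion points at the primary position, the pulley points, the eye radius R and the motor pulley radius Rm are assumed to be known from the geometry described above.

```python
# Sketch of the algebraic/differential tendon mapping, Eqs. (11), (14), (17), (18).
# Assumptions: r0[i] are the insertion points at the primary position (|r0[i]| = R),
# p[i] are the fixed pointwise-pulley points, R is the eye radius, Rm the motor
# pulley radius; R_star is the desired rotation matrix, w_star the angular velocity.
import numpy as np

def tendon_angle(r_i, p_i, R):
    """phi_i of Eq. (14): wrap angle between the insertion point and the tangency point."""
    d_i = np.linalg.norm(p_i)
    return np.arccos(np.dot(r_i, p_i) / (R * d_i)) - np.arccos(R / d_i)

def motor_references(R_star, w_star, r0, p, R, Rm):
    """Reference motor angles q* and velocities qd* for all tendons, Eq. (18)."""
    q, qd = [], []
    for r0_i, p_i in zip(r0, p):
        phi0 = tendon_angle(r0_i, p_i, R)          # offset at the primary position
        r_i = R_star @ r0_i                        # insertion point for the desired R*(t)
        phi = tendon_angle(r_i, p_i, R)
        q.append((R / Rm) * (phi - phi0))          # Eq. (11) scaled by the pulley ratio
        c = np.cross(r_i, p_i)
        qd.append(-(R / Rm) * np.dot(c, w_star) / np.linalg.norm(c))   # Eq. (17)
    return np.array(q), np.array(qd)
```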
Fig. 21. The prototype custom embedded real-time controller. The board has dimensions of
69 × 85 mm².
(Control architecture: a host computer on a CAN bus drives four identical motor channels; each channel provides PWM drive and encoder feedback for its motor, plus an A/D input reading the corresponding tendon tension sensor.)
10. Conclusions
This chapter has shown the feasibility of a tendon driven robot eye featuring the
implementation of Listing's Law on a mechanical basis. The design of the robot is based on a
model that is strongly inspired by assumptions derived from physiological evidence.
The achievements discussed in this chapter represent the starting point for the development
of a more complex humanoid robot eye. From the mechanical point of view, the most
important advancement is represented by the possibility of extending the actuation to six
tendons in order to enable the implementation of more complex ocular motions such as the
vestibulo-ocular reflex. From a theoretical point of view, a more complete analysis of the
properties of the proposed eye model could provide further understanding of the dynamics of
saccadic motions. Furthermore, other issues such as eye miniaturization and the embedding of
image processing modules for direct visual feedback represent important challenges for the
development of fully operational devices.
11. References
Albers, A. ; Brudniok, S. & Burger, W. (2003). The Mechanics of a Humanoid, Proceedings
of Humanoids 2003, ISBN 3-00-012047-5; Karlsruhe, September 2003, IEEE,
Germany.
Angelaki, D. E. & Hess, B. J. M. (2004). Control of eye orientation: where does the brain's
role end and the muscle's begin?, European Journal of Neuroscience, vol. 19, pp. 1-10,
2004, ISSN 0270-6474.
Becker, W. (1991). Eye Movements, Carpenter, R.H.S. ed., Macmillan, pp. 95-137,1991, ISSN
0301-4738.
Biamino, D. & Piazza, A. (2005). Studio Progetto e Realizzazione di una Coppia di Occhi
Robotici con Sistema di Controllo Embedded, Master Degree Thesis, Faculty of
Engineering, University of Genova, 2005, no ISBN or ISSN.
Cannata, G. ; D’Andrea, M.; Monti, F. & Maggiali, M. (2006). Implementation of Listing’s
Law for a Robot Eye, Proceedings of Sy.Ro.Co 2006, Bologna (Italy), Sept. 6-8, no
ISBN or ISSN, 2006.
Cannata, G. & Maggiali, M. (2006). Implementation of Listing’s Law for a Tendon Driven
Robot Eye, Proceedings of. IEEE Conf. on Intelligent Robots and Systems, IROS 2006,
pp. 3940-3945, ISBN 1-4244-0259-X, Beijing, Oct. 9-15, 2006, IEEE, China.
Clark, R. A.; Miller, J.M. & Demer, J. L. (2000). Three-dimensional Location of Human Rectus
Pulleys by Path Inflection in Secondary Gaze Positions, Investigative Ophthalmology
and Visual Science, vol. 41, pp. 3787-3797, 2000, ISSN 0146-0404.
Carpi, F.; Migliore, A.; Serra, G. & De Rossi, D. (2005). Helical Dielectric Elastomer Actuators,
Smart Materials and Structures, vol. 14, pp. 1210-1216, 2005, ISSN 0964-1726.
Cho, K. & Asada, H. H. (2005). Segmentation theory for design of a multi-axis actuator array
using segmented binary control. Proceedings of the American Control Conference, vol.
3, pp. 1969-1974, ISBN 0-7803-649, ACC, 2005.
Demer, J. L.; Miller J. M.; Poukens V.; Vinters, H. V. & Glasgow B.J. (1995). Evidence for
fibromuscular pulleys of the recti extraocular muscles, Investigative Ophthalmology
and Visual Science, vol. 36, pp. 1125-1136, 1995, ISSN 0030-3747.
Demer, J. L.; Ho, S. Y. & Pokens, V. (2000). Evidence for Active Control of Rectus Extraocular
Muscle Pulleys, Invest. Ophtalmol. Visual Sci., vol. 41, pp. 1280-1290, 2000, ISSN
0042-6989.
Furman, J. M. & Schor, R. (2003). Orientation of Listing’s plane during static tilt in young
and older human subjects, Vision Res., vol. 43, pp. 67-76, 2003, ISSN 0042-6989.
Gu, J.; Meng, M., Cook, A. & Faulkner, M. G. (2000). A study of natural movement of
artificial eye plant. Robotics and Autonomous System, vol. 32, pp. 153-161 , 2000, ISSN
0921-8890.
Haslwanter, T. (1995). Mathematics of Three-dimensional Eye Rotations, Vision Res., vol. 35,
pp. 1727-1739, 1995, ISSN 0042-6989.
Haslwanter, T. (2002). Mechanics of Eye Movements Implications of the Orbital Revolution,
Ann. N. Y. Acad. Sci., vol. 956, pp. 33-41, ,2002, ISSN 0077-8923.
Hepp, K. (1994). Oculomotor control: Listing’s law and all that, Current Opinion in
Neurobiology, vol. 4, pp. 862-868, 1994 ISSN 0959-4388.
Koene, A. R. & Erkelens, C.J. (2004). Properties of 3D rotations and their relation to eye
movement control, Biol. Cybern., vol. 90, pp. 410-417, Jul. 2004, ISSN 0340-1200.
Koornneef, L. (1974). The first results of a new anatomical method of approach to the human
orbit following a clinical enquiry, Acta Morphol Neerl Scand, vol. 12, n. 4, pp. 259-
282, 1974, ISSN 0001-6225.
Lockwood-Cooke, P.; Martin, C. F. & Schovanec L. (1999). A Dynamic 3-d Model of Ocular
Motion, Proceedings of 38th Conference of Decision and Control, ISBN 0-7803-5253-X,
Phoenix, December,1999.
Miller, J. M. & Robinson, David A. (1984). A Model of the Mechanics of Binocular
Alignment. Computer and Biomedical Research, vol.17, pp. 436-470, 1984, ISSN 0010-
4809.
Miller, J. M. (1989). Functional anatomy of normal human rectus muscles, Vision Res., pp.
223-240, vol. 29, 1989, ISSN 0042-6989.
1. Introduction
Humanoid robots, because of their structural similarity to humans, are expected to operate in
hazardous and emergency environments. In order to operate in such environments, the
humanoid robot must be highly autonomous, have a long operation time and take decisions
based on the environment conditions. Therefore, algorithms for generating the humanoid
robot gait in real time are central to the development of humanoid robots.
In early works, the humanoid robot gait was generated based on data taken from human
motion (Vukobratovic et al. 1990). Most of the recent works (Roussel 1998, Silva & Machado
1998, Channon 1996) consider minimum consumed energy as a criterion for humanoid robot
gait generation. Roussel (1998) considered minimum consumed energy gait synthesis during
walking, with the body mass concentrated at the hip of the biped robot. Silva & Machado
(1998) considered the body link restricted to the vertical position and the body forward
velocity to be constant. The consumed energy, related to the walking velocity and step length,
is analyzed by Channon (1996), where the distribution functions of input torque are obtained
by minimizing the joint torques.
In our previous works, we considered humanoid robot gait generation during walking and
going up-stairs (Capi et al. 2001) and real-time gait generation (Capi et al. 2003). In addition
to the minimum consumed energy (MCE) criterion, minimum torque change (MTC) (Uno et al.
1989, Nakano et al. 1999) was also considered. The results showed that MCE and MTC gaits
have different advantages. The humanoid robot motion generated based on the MCE criterion
was very similar to that of humans. Another advantage of the MCE criterion is the long
operation time when the robot is powered by a battery. On the other hand, the MTC humanoid
robot motion was more stable due to the smooth change of torque and link accelerations.
Motivated by these observations, it is advantageous to generate the humanoid robot motion
such that different criteria are satisfied. This is a multiobjective optimization problem. In a
multiobjective optimization problem there may not exist one solution that is best with respect
to all objectives. Usually, the aim is to determine the tradeoff surface, which is a set of
nondominated solution points, known as Pareto-optimal or noninferior solutions.
The multiobjective problem is almost always solved by combining the multiple objectives into
one scalar objective using weighting coefficients. Therefore, to combine different objectives
in a single fitness function, an a-priori decision is needed about the relative importance of
the objectives, emphasizing a particular type of solution. These techniques often require some
problem-specific information, such as the total range each objective covers. In complex
problems, such as humanoid robot gait generation, this information is rarely known in advance,
making the selection of single-objective weighting parameters difficult. In addition, there is
no rational basis for determining adequate weights, and the objective function so formed may
lose significance due to combining non-commensurable objectives.
To avoid this difficulty, the ε-constraint method for multiobjective optimization was presented
(Becerra & Coello, 2006). This method is based on optimizing the most preferred objective and
considering the other objectives as constraints bounded by some allowable levels. These levels
are then altered to generate the entire Pareto-optimal set. The most obvious weaknesses of
this approach are that it is time-consuming and tends to find weakly nondominated solutions.
In this paper, we present a multiobjective evolutionary algorithm (MOEA) (Coello 1999,
Herrera et al. 1998) technique for humanoid robot gait synthesis. The main advantage of the
proposed algorithm is that, in a single run of the evolutionary algorithm, humanoid robot gaits
with completely different characteristics are generated. Therefore, the humanoid robot can
switch between different gaits based on the environment conditions. In our method, the basic
idea is to encode the humanoid robot gait parameters in the genome and to carry the
parameters of the non-dominated optimal gaits into the next generation. The specific questions
we ask in this study are: 1) whether MOEA can successfully generate humanoid robot gaits that
satisfy different objective functions to a certain degree, and 2) whether the humanoid robot
gait optimized by MOEA in simulation can indeed be helpful in hardware implementation.
In order to answer these questions, we considered the MCE and MTC cost functions as criteria
for "Bonten-Maru" humanoid robot gait synthesis. We employed a real-number MOEA.
Simulation and experimental results show a good performance of the proposed method. The
non-dominated Pareto-optimal solutions have a good distribution, and the humanoid robot gait
varies from satisfying each of the two considered objectives to satisfying both of them.
Therefore, a specific contribution of the proposed method is that in a single run of MOEA
humanoid robot gaits with completely different characteristics are generated, making it
possible to select the appropriate gait based on our preferences. In order to further verify how
the optimized gait performs on real hardware, we implemented the optimal gait on the
"Bonten-Maru" humanoid robot. The results show that, in addition to the reduced energy
consumption, the optimized gait was stable and had a small impact due to the smooth change
of the joint torques.
Once the nondominated fronts are defined, the population is reproduced according to the
dummy fitness values.
Fig. 1. Flowchart of NSGA.
Fitness Sharing: In genetic algorithms, sharing techniques aim at encouraging the formation
and maintenance of stable subpopulations or niches (Zitzler et al. 2000). This is achieved by
degrading the fitness value of points belonging to the same niche in some space.
Consequently, points that are very close to each other with respect to that space (the decision
space X in this paper) will have their dummy fitness values degraded more. The fitness
degradation of nearby individuals can be computed using (3) and (4), where d_ij is the distance
between two individuals i and j, and σ_shared is the maximum distance allowed between any
two individuals for them to become members of the same niche. In addition, df_i is the dummy
fitness value assigned to individual i in the current front and df'_i is its corresponding shared
value. N_pop is the number of individuals in the population. The sharing function Sh measures
the similarity level between two individuals. The effect of this scheme is to encourage search
in unexplored regions. For details about niching techniques, see Sareni et al. (1998).
Sh(d_{ij}) = \begin{cases} 1 - \left(\dfrac{d_{ij}}{\sigma_{shared}}\right)^{2}, & \text{if } d_{ij} < \sigma_{shared}\\ 0, & \text{if } d_{ij} \ge \sigma_{shared} \end{cases}     (3)

df'_i = df_i \left[\sum_{j=1}^{N_{pop}} Sh(d_{ij})\right]^{-1}     (4)
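A compact sketch of these two equations (not the authors' implementation) is given below; it degrades the dummy fitness of individuals that crowd the same niche in decision space:

```python
# Sketch of the sharing function (Eq. 3) and shared dummy fitness (Eq. 4)
# for one nondominated front; X holds one decision vector per row.
import numpy as np

def sharing(d, sigma_shared):
    """Sh(d_ij) of Eq. (3)."""
    return np.where(d < sigma_shared, 1.0 - (d / sigma_shared) ** 2, 0.0)

def shared_fitness(X, dummy_fitness, sigma_shared):
    """df'_i = df_i divided by the niche count, Eq. (4)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    niche_count = sharing(d, sigma_shared).sum(axis=1)          # includes Sh(0) = 1
    return dummy_fitness / niche_count

# toy front: three individuals, two of them close together
X = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0]])
print(shared_fitness(X, dummy_fitness=np.full(3, 10.0), sigma_shared=0.5))
```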
x_{ZMP} = \frac{\sum_i m_i(\ddot{z}_i + g)(x_i + x_w) - \sum_i m_i\,\ddot{x}_i\,(z_i + z_w)}{\sum_i m_i(\ddot{z}_i + g)}
where m_i is the mass of particle i, x_w and z_w are the coordinates of the waist with respect to
the coordinate system at the ankle joint of the supporting leg, x_i and z_i are the coordinates
of mass particle i with respect to the O1X1Z1 coordinate system, and ẍ_i and z̈_i are the
accelerations of mass particle i with respect to the O1X1Z1 coordinate system.
Based on this formula, if the positions x_i, z_i and accelerations ẍ_i, z̈_i of the leg links
(i = 1, 2, 4, 5), the body angle θ_3 and the body angular velocity θ̇_3 are known, then, because
ẍ_3 and z̈_3 are functions of l_3, θ_3, θ̇_3 and θ̈_3, it is easy to calculate the body angular
acceleration from the ZMP position. Let (0) and (f) be the indexes at the beginning and at the
end of the step, respectively. At the beginning of the step, θ̈_30 places the ZMP at the position
ZMP_jump. At the end of the step, the angular acceleration θ̈_3f is calculated in order to have
the ZMP at the position ZMP_f, so that the difference between θ̈_3f and θ̈_30 is minimal.
Therefore, the torque necessary to change the acceleration of the body link will also be
minimal.
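As a rough illustration of this step, the sketch below solves the ZMP equation for the body angular acceleration with a scalar root finder. The zmp_x() model is a deliberately simplified stand-in (a single point-mass body at distance l3 from the hip, with the leg-link contributions lumped into two precomputed constants), so the numbers are illustrative only; the defaults for m3 and l3 are taken from Table 1.

```python
# Sketch: find the body angular acceleration theta3_dd that places the ZMP at a
# desired point.  zmp_x() is a simplified stand-in model, not the chapter's full
# five-link dynamics; num_legs / den_legs are assumed precomputed leg-link terms.
import numpy as np
from scipy.optimize import brentq

G = 9.81

def zmp_x(theta3_dd, theta3, theta3_d, hip, m3=12.0, l3=0.3,
          num_legs=0.0, den_legs=0.0):
    """x-coordinate of the ZMP as a function of the body angular acceleration."""
    x_hip, z_hip = hip
    x3 = x_hip + l3 * np.sin(theta3)                 # body COM position
    z3 = z_hip + l3 * np.cos(theta3)
    x3_dd = l3 * (theta3_dd * np.cos(theta3) - theta3_d**2 * np.sin(theta3))
    z3_dd = -l3 * (theta3_dd * np.sin(theta3) + theta3_d**2 * np.cos(theta3))
    num = num_legs + m3 * ((z3_dd + G) * x3 - x3_dd * z3)
    den = den_legs + m3 * (z3_dd + G)
    return num / den

def body_acc_for_zmp(x_zmp_target, theta3, theta3_d, hip, **kw):
    """Solve zmp_x(theta3_dd) = x_zmp_target for theta3_dd."""
    f = lambda a: zmp_x(a, theta3, theta3_d, hip, **kw) - x_zmp_target
    return brentq(f, -200.0, 200.0)                  # generous bracket

# example: body slightly tilted, hip 0.6 m above the ankle
print(body_acc_for_zmp(0.02, theta3=0.05, theta3_d=0.0, hip=(0.0, 0.6)))
```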
Fig. 2. Five-link humanoid robot.
The cost function J, which is a quantity proportional to the energy required for the motion, is
defined as follows:

J_{torque\,change} = \frac{1}{2}\int_0^{t_f}\left(\frac{d\tau}{dt}\right)^{T}\!\left(\frac{d\tau}{dt}\right)dt + \frac{1}{2}\int_0^{t_f}\left(\frac{\Delta\tau}{\Delta t}\right)^{T}\! C \left(\frac{\Delta\tau}{\Delta t}\right)dt     (8)
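Assuming the reconstructed form of Eq. (8), a discretized evaluation of the torque-change cost for a sampled torque trajectory might look as follows (a sketch, not the authors' code):

```python
# Sketch: discretized minimum-torque-change cost for a sampled torque trajectory.
# tau is an (N, n_joints) array; delta_tau is an optional torque jump at support
# exchange, weighted by the (assumed) matrix C.
import numpy as np

def mtc_cost(tau, dt, delta_tau=None, C=None):
    dtau = np.diff(tau, axis=0) / dt                  # d(tau)/dt by finite differences
    J = 0.5 * np.sum(dtau * dtau) * dt                # integral of |d(tau)/dt|^2
    if delta_tau is not None:
        C = np.eye(len(delta_tau)) if C is None else C
        rate = delta_tau / dt
        J += 0.5 * float(rate @ C @ rate) * dt        # penalise the jump term
    return J

# toy example: a smooth profile costs far less than an abrupt one
t = np.linspace(0.0, 1.2, 121)
smooth = np.sin(np.pi * t / 1.2)[:, None]
abrupt = np.where(t < 0.6, 0.0, 1.0)[:, None]
print(mtc_cost(smooth, dt=0.01), mtc_cost(abrupt, dt=0.01))
```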
In this way, during the instantaneous double support phase, we do not need to apply an extra
torque to change the angular acceleration of the links. To find the upper body angle
trajectory, an intermediate angle θ_3p and its passing time t_3 are considered as GA variables.
To determine the angle trajectories of the swing leg, the coordinates of an intermediate point
P(x_p, z_p) and their passing time t_p are also considered as GA variables. The searching area
for this point is shown in Fig. 2. Based on the number of constraints, the degrees of the time
polynomials for θ_1, θ_2, θ_3, θ_4 and θ_5 are 3, 3, 7, 6 and 6, respectively.
Body Lower leg Upper leg Lower leg + foot
Mass [kg] 12 2.93 3.89 4.09
Inertia [kg m2] 0.19 0.014 0.002 0.017
Length [m] 0.3 0.2 0.204 0.284
CoM dist.[m] 0.3 0.09 0.1 0.136
Table 1. “Bonten-Maru” humanoid robot link parameters.
5. Results
5.1 “Bonten-Maru” Humanoid Robot
In the simulations and experiments, we use the "Bonten-Maru" humanoid robot (Nasu et al.
2002, Takeda et al. 2001). The parameter values are presented in Table 1 and the robot is
shown in Fig. 3(a). The "Bonten-Maru I" humanoid robot is 1.2 m tall and weighs 32 kg,
similar to an 8-year-old child. The "Bonten-Maru I" is a research prototype, and as such has
undergone some refinement as different research directions have been considered. During the
design process, some predefined degree of stiffness, accuracy, repeatability, and other design
factors were taken into consideration. The link dimensions were determined so as to mimic
human proportions as closely as possible. In the "Bonten-Maru" humanoid robot, a DC motor
actuates each joint. The rotational motion is transmitted by a timing belt and a harmonic drive
reduction system. Under each foot are four force sensors, two at the toe and two at the heel.
These provide a good indication of both contact with the ground and the ZMP position. The
head unit has two CCD cameras (542 × 492 pixels, monochrome), which are connected to the
PC by a video capture board. A Celeron-based microcomputer (PC/AT compatible) is used to
control the system.
The dof are presented in Fig. 3(b). The high number of dof gives the "Bonten-Maru I"
humanoid robot the possibility to realize complex motions. The hip is a ball-joint permitting
three dof; the knee joint has one dof; the ankle is a double-axis design permitting two dof.
The shoulder has two dof, the elbow and wrist one dof each. DC servomotors actuate the three
joints of the head, where the eye system is mounted, enabling a total of three dof. The
distribution of dof is similar to that of the human limbs.
Fig. 3. "Bonten-Maru" humanoid robot: (a) overall view; (b) distribution of dof (yawing, rolling and pitching axes).
Fig. 4. Pareto front of the two objectives, TC [(Nm)²/s] versus CE [J²s], for generations 1, 50 and 100.
Based on the parameters of the "Bonten-Maru" humanoid robot, the step length used in the
simulations varies from 0.2 m to 0.55 m. The bounds within which the solution is sought
change according to the step length and step time. In the following, we present the results for
a step length of 0.42 m and a step time of 1.2 s.
Fig. 4 shows the Pareto optimal front for generations 1, 50 and 100. During the first 50
generations there is a great improvement on the quality and distribution of Pareto optimal
solutions. From this figure, it can be deduced that the MOEA is equally capable of finding
the best solution for each objective when two conflicting objectives are considered
simultaneously.
(Joint torque [Nm] versus time [s] profiles for the optimal gaits; see Fig. 6.)
Without neglecting the smoothness of the torque change, the results shown in Boxes 2 and 3
are the most important. The results in Box 2 show that, for a small increase in the energy
consumption (2.2%), we can decrease the MTC fitness function by around 12.1%. Also, the
energy can be reduced by 14.5% for a small increase in the MTC cost function (Box 4).
Fig. 7. ZMP trajectory (x_ZMP [m] versus time [s]).
The torque vectors τ_i and the optimal gaits for different Pareto front solutions are shown in
Fig. 6. The robot posture is straighter, similar to humans, for the MCE cost function
(Fig. 6(a)). The torque values are low for the MCE gait and the torques change smoothly for
the MTC gait (Fig. 6(b)). The optimal gait generated by the Box 3 solution satisfies both
objective functions: the energy consumption is increased by 9%, but on the other hand the
value of the MTC cost function is decreased by 19.2%.
The ZMP trajectory is presented in Fig. 7 for the humanoid robot gait generated by the Box 3
result. The ZMP is always between the dotted lines, which represent the length of the foot.
At the end of the step, the ZMP is at the position ZMP_f, as shown in Fig. 2. At the beginning
of the step, the ZMP is not exactly at the position ZMP_jump because of the foot's mass. It
should be noted that the mass of the lower leg is different depending on whether it acts as the
supporting leg or the swing leg.
In order to investigate how the gaits optimized in simulation perform on real hardware, we
transferred the optimal gaits that satisfy both objective functions to the "Bonten-Maru"
humanoid robot (Fig. 8). The experimental results show that, in addition to the reduction in
energy consumption, the humanoid robot gait generated by the Box 3 solution was stable. The
impact of the foot with the ground was small.
6. Conclusion
This paper proposed a new method for humanoid robot gait generation based on several
objective functions. The proposed method is based on a multiobjective evolutionary
algorithm. In our work, we considered two competing objective functions: MCE and MTC.
Based on simulation and experimental results, we conclude:
• Multiobjective evolution is efficient because optimal humanoid robot gaits with completely
different characteristics can be found in one simulation run.
• The nondominated solutions in the obtained Pareto-optimal set are well distributed and have
satisfactory diversity characteristics.
• The optimal gaits generated by simulation gave good performance when they were tested on
the real hardware of the "Bonten-Maru" humanoid robot.
• The optimal gait reduces the energy consumption and increases the stability during the robot
motion.
In the future, it will be interesting to investigate whether the robot can learn in real time to
switch between different gaits based on the environment conditions. In uneven terrains, MTC
gaits will be more suitable.
7. References
Becerra, L. B., and Coello, C. A. (2006). Solving Hard Multiobjective Optimization Problems
Using e-Constraint with Cultured Differential Evolution, in Thomas Philip
Runarsson, Hans-Georg Beyer, Edmund Burke, Juan J. Merelo-Guervós, L. Darrell
Whitley and Xin Yao (editors), Parallel Problem Solving from Nature - PPSN IX,
9th International Conference, pp. 543--552, Springer. Lecture Notes in Computer
Science Vol. 4193.
Capi, G., Nasu, Y., Barolli, L., Mitobe, K., and Takeda, K. (2001). Application of
genetic algorithms for biped robot gait synthesis optimization during
walking and going up-stairs, Advanced Robotics Journal, Vol. 15, No. 6, 675-
695.
Capi, G., Nasu, Y., Barolli, L., Mitobe, K., Yamano, M., and Takeda, K. (2002). A new gait
optimization approach based on genetic algorithm for walking biped robots and a
neural network implementation, Information Processing Society of Japan (IPSJ), Vol.
43, No. 4, 1039-1049.
Capi, G., Nasu, Y., Barolli, L., and Mitobe, K. (2003). Real time gait generation for
autonomous humanoid robots: a case study for walking, Robotics and Autonomous
Systems, Vol. 42, No. 2, 107-116.
Channon, P. H, Pham, D. T., and Hopkins, S. H. (1996). A variational approach to the
optimization of gait for a bipedal robot, Journal of Mechanical Engineering Science,
Vol. 210, 177-186.
Coello, C. A. C. (1999) A comprehensive survey of evolutionary based multiobjective
optimization techniques, Knowledge and Information Systems, Vol. 1, No. 3, pp. 269-
308.
Dias, A. H. F. & De Vasconcelos, J. A. (2002). Multiobjective genetic algorithms applied to
solve optimization problems, IEEE Transactions on Magnetics, Vol. 38, No. 2, 1133-
1136.
Herrera, F., Lozano, M. and Verdegay, J. L. (1998). Tackling real-coded genetic algorithms:
operators and tools for behavioral analysis, Artificial Intelligence Review, Vol. 12, No.
4, 265-319.
Nakano, E., Imamizu, H., Osu, R., Uno, Y. Gomi, H., Yoshioka, T., and Kawato, M. (1999)
Quantitative examinations of internal representations for arm trajectory planning:
minimum commanded torque change model, The Journal of Neurophysiology, Vol.
81, No. 5, 2140-2155.
Nasu, Y., Capi, G., Yamano, M. (2002). “Bonten-Maru I”: Development and Perspectives of a
Humanoid Robot Project, Proc. of Pacific Conference on Manufacturing (PCM2002), pp.
240-245.
Roussel, L., Canudas-de-Wit, C., and Goswami, A. (1998). Generation of energy optimal
complete gait cycles for biped robots, Proc. IEEE Int. Conf. on Robotics and
Automation, pp. 2036-2041.
Sareni, B., Krähenbühl, L. and Nicolas, A. (1998). Niching genetic algorithms for
optimization in electromagnetics-I Fundamentals, IEEE Transactions on Magnetics,
Vol. 34, 2984–2987.
Silva, F. M. and Machado, J. A. T. (1999). Energy analysis during biped walking, Proc. IEEE
Int. Conf. On Robotics and Automation, pp. 59-64.
Srinivas, N. & Deb, K. (1995). Multiobjective optimization using non-dominated sorting in
genetic algorithms, Evolutionary Computation, Vol. 2, No. 3, 279-285.
Takeda, K., Nasu, Y., Capi, G., Yamano, M., Barolli, L., Mitobe, K. (2001). A CORBA-Based
approach for humanoid robot control, Industrial Robot-an International Journal,
Vol.28, No.3, 242-250.
Uno, Y., Kawato, M., and Suzuki, R. (1989). Formulation and control of optimal trajectory in
human multijoint arm movement, Biol. Cybernet. Vol. 61, pp. 89-101.
Vukobratovic, M., Borovac, B., Surla, D. and D. Stokic. (1990). Biped Locomotion, Dynamics,
Stability, Control and Application. Springer-Verlag Berlin.
Zitzler, E., Deb, K., and Thiele, L. (2000). Comparison of multiobjective
evolutionary algorithms: empirical results, Evolutionary Computation, Vol. 8,
No. 2, 173-195.
10. An Incremental Fuzzy Algorithm for The Balance of Humanoid Robots
1. Introduction
Humanoid robots base their appearance on the human body (Goddard et al., 1992; Kanehira
et al., 2002; Konno et al., 2000). Minimalist constructions have at least a torso with a head,
arms or legs, while more elaborate ones include devices that resemble, for example, human
face parts, such as eyes, mouth, and nose, or even include materials similar to skin. Humanoid
robots are systems with a very high complexity, because they aim to look like humans and to
behave as they do.
Mechanical control, sensing, and adaptive behaviour are the constituent logical parts of the
robot that allow it to "behave" like a human being. Normally, researchers study these
components by modelling only a mechanical part of the humanoid robot. For example,
artificial intelligence and cognitive science researchers consider the robot from the waist up,
because its visual sensing is located in its head, and its gestural behaviour normally uses its
face or arms. Some engineers are mostly interested in the autonomy of the robot and consider
it from the waist down. They develop mathematical models that control the balance of the
robot and the movement of its legs (Miller, 1994; Yamaguchi et al., 1999; Taga et al., 1991),
allowing the robot to walk, one of the fundamental behaviours that characterize human
beings.
Examples of such mathematical models are static and dynamic walking. The static walking
model controls the robot to maintain its center of gravity (COG) inside a stable support
region, while the dynamic walking model maintains the zero moment point (ZMP) inside the
support region. Kajita et al. (1992) designed and developed an almost ideal 2-D model of a
biped robot. They supposed, for simplicity, that the robot's COG moves horizontally, and they
developed a control law for initiation, continuation and termination of the walking process.
Zheng and Shen (1990) proposed a scheme to enable the robot to climb sloped surfaces. Force
sensors placed in the robot's feet detect transitions of the terrain type, and motor movements
correspondingly compensate for the inclination of the robot.
The models mentioned above can, however, be computationally very expensive and prohibitive
to implement on microcontrollers. Control algorithms for stable walking must be sufficiently
robust and smooth to accomplish a balance correction without putting the mechanical stability
of the robot at risk. This could be resolved by using a controller that modifies its parameters
according to a
2. Robot Structure
The humanoid robot "Dany Walker" used in our research was built only from the waist down.
It consists of 10 low-density aluminium links, which are rotational about the pitch axis at the
hip, knee and ankle. Each link consists of a modular structure. The links form a biped robot
with 10 degrees of freedom, see Fig. 1.
The robot structure and its mass distribution directly affect the dynamics of the humanoid
(Cuevas et al., 2004); therefore, the movement of the Center of Mass (COM) has a significant
influence on the robot's stability. In order to achieve static stability, we placed the COM as
low as possible. For this purpose, our design uses short legs, see Fig. 2.
To compensate for disturbances during walking, our construction enables lateral movements
of the robot. Thus, it was possible to control the lateral balance of the robot by swaying the
waist using four motors in the lateral plane: two at the waist and two at the ankles, see Fig. 3.
(Figs. 2 and 3: placement of the COM and of the four lateral-plane motors, two at the waist and two at the ankles, that produce the lateral movement.)
x_{ZMP} = \frac{\sum_i m_i(\ddot{z}_i + g)\,x_i - \sum_i m_i\,\ddot{x}_i\,z_i - \sum_i I_{iy}\,\ddot{\theta}_{iy}}{\sum_i m_i(\ddot{z}_i + g)}     (1)

y_{ZMP} = \frac{\sum_i m_i(\ddot{z}_i + g)\,y_i - \sum_i m_i\,\ddot{y}_i\,z_i - \sum_i I_{ix}\,\ddot{\theta}_{ix}}{\sum_i m_i(\ddot{z}_i + g)}     (2)
where (x_ZMP, y_ZMP, 0) are the ZMP coordinates, (x_i, y_i, z_i) is the mass center of link i in
the coordinate system, m_i is the mass of link i, and g is the gravitational acceleration. I_ix
and I_iy are the inertia moment components, and θ̈_iy and θ̈_ix are the angular accelerations
about the y and x axes, taken at the mass center of link i. The force sensor values are used
directly to calculate the ZMP. For the lateral control, it is only necessary to know the ZMP
value for one axis; thus, the ZMP calculation is simplified using the formula
P_{ZMP} = \frac{\sum_{i=1}^{3} f_i\, r_i}{\sum_{i=1}^{3} f_i}     (3)
where f_i represents the force at sensor i, and r_i represents the distance between the
coordinate origin and the point where the sensor is located. Figure 4 shows the distribution of
the sensors (marked with three circles) used for each of the robot's feet.
The total ZMP is obtained as the difference between the ZMPs of the two feet:

Total\_P_{ZMP} = P_{ZMP1} - P_{ZMP2}     (4)

where P_ZMP1 is the ZMP for one foot and P_ZMP2 is the ZMP for the other.
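A minimal sketch of Eqs. (3) and (4) (not the robot's firmware; the sensor layout below is hypothetical) is:

```python
# Sketch: per-foot ZMP from the three force sensors (Eq. 3) and the total
# lateral ZMP used by the balance controller (Eq. 4).
import numpy as np

def foot_zmp(forces, positions):
    """P_ZMP = sum(f_i * r_i) / sum(f_i); positions are signed distances [cm]."""
    forces = np.asarray(forces, dtype=float)
    total = forces.sum()
    if total <= 0.0:                 # foot not in contact
        return 0.0
    return float(np.dot(forces, positions) / total)

def total_zmp(forces_left, forces_right, positions):
    """Total_P_ZMP = P_ZMP(foot 1) - P_ZMP(foot 2), Eq. (4)."""
    return foot_zmp(forces_left, positions) - foot_zmp(forces_right, positions)

# hypothetical sensor distances [cm] and force readings [N]
r = np.array([-3.0, 0.0, 3.0])
print(total_zmp([5.0, 8.0, 2.0], [1.0, 1.0, 1.0], r))
```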
Figure 5 shows the ZMP (black point) for two standing cases of the robot, one before taking a
step (left) and the other after taking a step (right). The dotted line represents the support
polygon.
Fig. 4. Sensor distribution at the robot's foot (forces f1, f2, f3 at distances r1, r2, r3 from the origin).
Fig. 5. The black point represents the robot's ZMP before taking a step (left) and after taking a step (right).
(Fig. 6: block diagram of the fuzzy controller — the set-point error and its rate, scaled by the gains Ge and Gr, are fuzzified, processed by the control rules and defuzzified; the output gain Gu, or the incremental gain, is applied to u* before the control signal u drives the process.)
u = \begin{cases} G_u\,u^{*}, & \text{if } |e| \le T \quad (\text{for } t = 0,\; G_{inc} = 0)\\ (G_{inc} + Inc)\,u^{*}, & \text{if } |e| > T \end{cases}     (5)

where e is the scaled error (error·Ge), T is an error boundary selected by tuning, and G_inc is
the incremental gain obtained by adding the increment Inc.
Figure 7 shows the flow diagram for the incremental gain of u.
(Fig. 7: flow diagram of the incremental gain — if Abs(error) > T, the incremental gain is applied, u = G_inc · u*; otherwise u = G_u · u*.)
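A small sketch of this incremental output-gain logic, under the reconstructed reading of Eq. (5) and Fig. 7 (in particular, the reset of G_inc to G_u when the error is small is an assumption), is:

```python
# Sketch of the incremental output gain applied around the fuzzy PD output u*.
# The reset of Ginc to Gu when |error| <= T is assumed from Fig. 7.
class IncrementalGain:
    def __init__(self, Gu, Inc, T):
        self.Gu, self.Inc, self.T = Gu, Inc, T
        self.Ginc = Gu                      # assumed initial value (Fig. 7: Ginc = Gu)

    def output(self, u_star, error):
        if abs(error) > self.T:             # large error: grow the gain step by step
            self.Ginc += self.Inc
            return self.Ginc * u_star
        self.Ginc = self.Gu                 # small error: back to the nominal gain
        return self.Gu * u_star

# usage: gain = IncrementalGain(Gu=1.0, Inc=0.1, T=0.5); u = gain.output(u_star, error)
```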
Figure 8 shows the area where the absolute error is evaluated and the controller output is
incremental (u = (G_inc + Inc)·u*).
4.1 Fuzzification
As shown in Figure 9, there are two inputs to the controller: error and rate. The error is
defined as:
error = setpoint − y     (6)
The rate is defined as follows:
rate = (ce − pe) / sp     (7)
where ce is the current error, pe is the previous error and sp is the sampling period. The
current and previous errors refer to the error without gain. The fuzzy controller has a single
incremental output, which is used to control the process.
Fig. 8. Fuzzy PD incremental absolute error area.
The input and output membership functions of the fuzzy controller are shown in Fig. 9 and
Fig. 10, respectively. Fig. 9 shows that each input has two linguistic terms. For the error input
they are Ge·negative error (en) and Ge·positive error (ep), and for the rate input they are
Gr·negative rate (rn) and Gr·positive rate (rp). The output fuzzy terms, shown in Fig. 10, are
negative output (on), zero output (oz) and positive output (op).
As shown in Fig. 9, the same function is applied for error and rate but with different scaling
factors, Ge and Gr respectively.
H and L are two positive constants to be determined. For convenience we take H = L to reduce
the number of control parameters to be determined. The membership functions for the input
variables, error and rate, are defined by:
\mu_{ep} = \frac{L + (Ge \cdot error)}{2L}, \qquad \mu_{en} = \frac{L - (Ge \cdot error)}{2L}

\mu_{rp} = \frac{L + (Gr \cdot rate)}{2L}, \qquad \mu_{rn} = \frac{L - (Gr \cdot rate)}{2L}     (8)
Fig. 9. Input membership functions for error and rate (Ge·negative/positive error and Gr·negative/positive rate, over the universe [−L, L]).
Fig. 10. Output membership functions (negative, zero and positive output, over [−H, H]).
Fig. 11. Determination of the different rules based on the system response (see text): R1 (error > 0, rate > 0 → op), R2 (error > 0, rate < 0 → oz), R3 (error < 0, rate > 0 → oz), R4 (error < 0, rate < 0 → on).
For the proposed fuzzy condensed controller, the ranges of the input error and rate values can
be represented by 20 input regions (IC), as shown in Fig. 12. If the membership functions, the
four control rules, the simplification H = L and the defuzzification are evaluated for each of
the 20 input combinations, then 9 equations can be obtained (Sanchez et al., 1998) that
determine the control signal u to be applied, depending on the region in which the inputs are
located. In other words, to implement the fuzzy condensed controller it is only necessary to
know the region in which the input variables are located and then evaluate the corresponding
equation for that region. For example, the first equation acts in regions IC1, IC2, IC5 and IC6.
Figure 13 shows the control surface of the fuzzy condensed controller considering H = L = 1.
Finally, the defuzzification method used is the center of gravity, which in this case is
represented by:

u = \frac{-H\,\mu_{R4}(x) + 0\,(\mu_{R2}(x) + \mu_{R3}(x)) + H\,\mu_{R1}(x)}{\mu_{R4}(x) + \mu_{R2}(x) + \mu_{R3}(x) + \mu_{R1}(x)}     (9)
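Putting Eqs. (6)-(9) together, a sketch of the condensed controller evaluation (product inference is my choice here, and the rule-to-output assignment follows the reconstructed Fig. 11; it is not the authors' implementation) is:

```python
# Sketch of the condensed fuzzy PD evaluation: membership functions of Eq. (8),
# the four rules of Fig. 11 and the centre-of-gravity defuzzification of Eq. (9).
import numpy as np

def fuzzy_pd_output(error, rate, Ge, Gr, L, H=None):
    H = L if H is None else H
    e = np.clip(Ge * error, -L, L)        # saturate inputs to the universe [-L, L]
    r = np.clip(Gr * rate, -L, L)
    mu_ep, mu_en = (L + e) / (2 * L), (L - e) / (2 * L)   # Eq. (8)
    mu_rp, mu_rn = (L + r) / (2 * L), (L - r) / (2 * L)
    # rule strengths (product inference, an assumption); outputs per Fig. 11:
    # R1 -> positive, R2 and R3 -> zero, R4 -> negative
    mu_R1 = mu_ep * mu_rp
    mu_R2 = mu_ep * mu_rn
    mu_R3 = mu_en * mu_rp
    mu_R4 = mu_en * mu_rn
    num = H * mu_R1 + 0.0 * (mu_R2 + mu_R3) - H * mu_R4   # Eq. (9)
    return num / (mu_R1 + mu_R2 + mu_R3 + mu_R4)

print(fuzzy_pd_output(error=0.3, rate=-0.1, Ge=2.0, Gr=2.0, L=1.0))
```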
Fig. 13. Control surface of the fuzzy condensed controller considering H=L=1.
Fig. 14. ZMP value evolution achieved by the controller for an x-direction impulse.
Finally, a walking balance test was applied to the robot. The robot took some steps on a flat
surface, while the balance controller compensated the ZMP value. Figure 16 shows the
evolution of the ZMP value (in centimeters) achieved by the controller during 15 seconds.
Figure 17 shows the error and rate evolution during the robot's walk. The incremental fuzzy
PD controller's parameters were Ge = 2, Gr = 2, and Gu = 1.
Fig. 16. ZMP value evolution achieved by the controller during the robot walk.
Fig. 17. Error and rate evolution during the robot walk.
The motor position output computed by the controller during the walk is shown in Figure 18.
This position value is shown for one waist motor only.
6. Conclusion
Figure 14 shows the controller's performance and response for an x-direction impulse. The
response was fast, approximately 2 seconds, until the controller reached a ZMP value near
zero. This feature allows the biped robot to gain stability even during walking (Figure 16),
maintaining the ZMP always inside the support polygon.
Our experiments with the incremental fuzzy PD controller algorithm demonstrated that it is
computationally economic, running entirely on a PIC microcontroller, and appropriate for
balance control. The algorithm was successfully used in real time with the biped robot
"Dany Walker" (videos available at http://www.inf.fu-berlin.de/~zaldivar).
The algorithm proposed in this chapter could also be used in other robot structures with
different dynamics, and even with other degrees of freedom. It would only be necessary to
adjust the controller's gain parameters to the particular structure.
We plan to use the information given by an inclinometer along with the ZMP value. In this
case, the goal of the bipedal robot balance control will be to achieve an inclination value of
zero and to maintain the ZMP at the center of, or inside, the support polygon.
The bipedal robot used in this work is part of a project that is being developed at the Freie
Universität Berlin.
7. References
Goddard, R.; Zheng, Y. & Hemami, H. (1992). Control of the heel-off to toe-off motion of a
dynamic biped gait, IEEE Trans. Syst. Man Cybern., 1992, vol. 22, no. 1
Kanehira, N.; Kawasaki, T.; Ohta, S.; Isozumi, T.; Kawada, T.; Kanehiro, F.; Kajita, S. &
Kaneko, K. (2002). Design and experiments of advanced module (HRPL-2L) for
humanoid robot (HRP-2) development, Proc. 2002, IEEE-RSJ Int. Conf. Intell. Rob.
Sys. EPFL, Lausanne, Switzerland, 2002, 2455-2460.
Konno, A.; Kato, N.; Shirata, S.; Furuta, T. & Uchiyama, M. (2000). Development of a light-
weight biped humanoid robot in Proc. 2000 IEEE-RSJ Int. Con. Intell. Rob. Sys., 2000,
1565-1570.
Miller, W. (1994). Real-time neural network control of a biped walking robot, IEEE contr.
Syst., vol. 14, 1994, 41-48.
Yamaguchi, J.; Soga, E.; Inoue, S. & Takanishi, A. (1999). Development of a biped robot-
control method of a whole body cooperative dynamic biped walking, in Proc. 1999
IEEE Int. Conf. Robot. And Aut., Detroit, Michigan, 1999, 368-374.
Taga, G.; Yamaguchi Y. & Shimizu, H. (1991). Self-organized control of bipedal locomotion
by neural oscillators in unpredictable environment, Biol. Cyb. 65, 1991, 147-165.
Kajita, S.; Yamaura, T. & Kobayashi, A. (1992). Dynamic walking control of a biped robot
along a potential energy conserving orbit, IEEE Trans. Robot Automat. Vol. 8, no. 4,
1992. pp 437-438.
Zheng, Y. & Shen, J. (1990). Gait synthesis for the SD-2 biped robot to climb sloped surface,
IEEE, Trans. Robot Automat. Vol 6, no 1. , 1990, 86-96.
Cuevas, E.; Zaldívar, D. & Rojas, R. (2004). Bipedal robot description, Technical Report B-03-
19, Freie Universität Berlin, Fachbereich Mathematik und Informatik, Berlin, Germany,
2004.
Vukobratovic, M. & Juricic, D. (1969). Contribution to the synthesis of biped gait, IEEE
Trans. Bio-Med. Eng., vol. BME-16, no. 1 , 1969, 1-6.
1. Introduction
In order for humans and robots to cooperate in an effective manner, it must be possible for
them to communicate. Spoken language is an obvious candidate for providing a means of
communication. In previous research, we developed an integrated platform that combined
visual scene interpretation with speech processing to provide input to a language learning
model. The system was demonstrated to learn a rich set of sentence-meaning mappings that
could allow it to construct the appropriate meanings for new sentences in a generalization
task. We subsequently extended the system not only to understand what it hears, but also to
describe what it sees and to interact with the human user. This is a natural extension of the
knowledge of sentence-to-meaning mappings that is now applied in the inverse scene-to-
sentence sense (Dominey & Boucher 2005). The current chapter extends this work to analyse
how spoken language can be used by human users to communicate with a Khepera
navigator, a Lynxmotion 6DOF manipulator arm, and the Kawada Industries HRP-2
Humanoid, to program the robots’ behavior in cooperative tasks, such as working together
to perform an object transportation task, or to assemble a piece of furniture. In this
framework, a system for Spoken Language Programming (SLP) is presented. The objectives
of the system are to 1. Allow the human to impart knowledge of how to accomplish a
cooperative task to the robot, in the form of a sensory-motor action plan. 2. To allow the user
to test and modify the learned plans. 3. To do this in a semi-natural and real-time manner
using spoken language and visual observation/demonstration. 4. When possible, to exploit
knowledge from studies of cognitive development in making implementation choices. With
respect to cognitive development, in addition to the construction grammar model, we also
exploit the concept of “shared intentions” from developmental cognition as goal-directed
action plans that will be shared by the human and robot during cooperative activities.
Results from several experiments with the SLP system employed on the different platforms
are presented. The SLP is evaluated in terms of the changes in efficiency as revealed by task
completion time and number of command operations required to accomplish the tasks.
Finally, in addition to language, we investigate how vision can be used by the robot as well,
to observe human activity in order to be able to take part in the observed activities. At the
interface of cognitive development and robotics, the results are interesting in that they (1)
provide concrete demonstration of how cognitive science can contribute to human-robot
interaction fidelity, and (2) they suggest how robots might be used to experiment with
theories on the implementation of cognition in the developing human.
In this context, (Lauria et al. 2002) asked naïve subjects to provide verbal instructions to a
robot in a miniature-town navigation task. Based on an analysis of the resulting speech
corpora, they identified a set of verbal action chunks that could map onto robot control
primitives. More recently, they demonstrated the effectiveness of navigation instructions
translated into these primitive procedures for actual robot navigation (Kyriacou et al. 2005).
This research indicates the importance of implementing the mapping between language and
behavioural primitives for high-level natural language instruction or programming. The
current study extends these results by introducing a conditional (e.g. if-then) component to
the programming.
Fig. 1. Khepera robot and object transportation scenario. A. Physical set up. B. Labeled
schematic representation.
Figure 1 illustrates a given scenario for HRI. In Figure 1A we see the Khepera robot on a
table top, and in Figure 1B a schematic of the robot (represented as the gray disc) in a two
arm maze. Consider a task in which User1 sends the Khepera robot to User2 who gives it an
object to transport back to User1. Getting to User2 requires a conditional choice between two
paths based on the location of an obstacle that is not known in advance, at the locations
indicated by the dotted lines. Once the robot determines the location of the obstacle, it
should choose the path that is not blocked and make its way to the end of the arm. There it
should turn around and wait for User2 to place an object on its carrying surface. When User2
has done this, User1 will say "continue", and the robot should then return to the beginning of
the maze arm, where User1 will take the transported block; the process then continues.
On-Line Commanding: The simplest solution for controlling the robot in this task, which
involves no learning, is for User1 simply to tell the robot what to do, step by step.
Depending on whether the obstacle is at the left or right arm position, the human User1
would decide whether to tell the robot to take the right or left pathway. Likewise, once at
the end point, User1 would wait until User2 placed the object before commanding the
robot to turn around and come back.
Programming a Pre-Specified Problem: Again, in many cases, the problem is known to the
user in advance (possibly because the user has now “walked” the robot through the
problem as just described) and can be specified prior to execution.
The objective of SLP is to provide a framework in which non-specialist humans can
explain such a task to a robot, using simple spoken language. While the task
described above remains relatively simple, explaining it to the robot already places
several interesting requirements on the system, and meeting these requirements will
provide a fairly general and robust capability for SLP. In particular, this task requires
the following: (1) The user should be able to provide a sequence of primitive
commands that will be executed in a particular order. (2) The user should be able to
specify conditional execution, based on the values of robot sensors. (3) The user
should be able to tell the robot that at a certain point it should wait for some sensory
event to occur before proceeding with its sequence execution, as demonstrated below.
In addition, for tasks that become increasingly long and complex, the user may make
mistakes in his/her initial specification of the task, so the SLP system should allow the user
to test, “debug” and modify programs. That is, once the user has specified the program, he
should be able to execute it and – if there is a problem – modify it in a simple and
straightforward manner.
On-line Commanding with Repetitive Sub-Tasks: On-line commanding allows the user to be
responsive to new situations, and to learn him/herself by taking the robot through a given
task or tasks. On the other hand, for tasks that are well defined, the user can program the
robot as defined above. In between these two conditions there may arise situations in which,
during the course of solving a cooperative problem with the robot, the user comes to see that,
despite the "open-endedness" of a given problem set, there may be repetitive subtasks that
occur in a larger context. In this type of situation, the human user may want to teach the
robot about the repetitive part so that it can be executed as an autonomous subroutine or
"macro", while the user still remains in the execution loop for the components that require
his/her decision.
Fig. 2. Human-Robot cooperation in a furniture construction task. A. Robot takes a table leg
from User2. B. Robot hands the leg to User1. C. Robot holds the table steady while User1
attaches the leg.
Figure 2 illustrates such an HRI scenario that involves two humans and the HRP-2
cooperating in the construction of a small table. The construction task will involve fixing the
legs to the surface of the table with wood screws. User1 on the left interacts with the robot
and with User2 on the right via spoken language.
User1 will command the robot to prepare to receive one of the table legs that will be passed
to it by User2. The robot waits until it receives a “continue” signal from User1, and will then
pass the leg to User1 who will take the leg, and then ask the robot to hold the table top
steady allowing User1 to attach the leg to the table. User1 then tells the robot to release the
table. At this point, the first leg has been fixed to the table, and the “get the leg and attach it”
sequence can be repeated. This task thus has a repeating subsequence that can be applied to
each of the four legs of the table. Experiments below address this task. These experiments
identify the utility of a more complex command structure based on grammatical
constructions.
Fig. 3. Cooperative manipulation task scenario. User asks for screws from locations
identified by color. While robot holds the plastic “table” (as illustrated) the user can insert
the screws into the table (see inset).
Influence of Robot Command Structure on Language: Part of the central principle of
construction grammar (and the related domain of cognitive linguistics) is that human
language is made up of a structured inventory of grammatical constructions (Goldberg
2003). These constructions reflect the meaning of prototypical events that are basic to
human experience. Thus for example the “ditransitive” construction involves one agent
transferring an object to another, as in “John gave Mary the ball.” From a robotics
perspective, the action predicate-argument Move(Object, Location) can thus be considered
as a template for a motor program that allows the use of constructions of the form “Give
the X to Y”. This robot PA involves localizing X and Y, and then grasping X and
transporting the effector to Y to transfer X to Y. To the extent that such an action PA
(APA) is built into the repertoire of the robot (or to the extent that it can be built up from
primitives like localize(X), transport-to(X) etc.) rich grammatical constructions can be
used to generate and interpret sentences like “give me the red block,” or “put the blue
block next to the red block” for robot control.
Figure 3 illustrates a simplified version of the table construction scenario that uses the
Lynx6 arm for cooperative manipulation in a construction task. Experiments below
examine cooperative construction with the Lynx6 based on lessons learned from the
HRP-2 experiments, including more complex commands based on grammatical
constructions.
(Fig. 5: SLP dialogue state machine with the Start, Mode Select, Direct and Plan states.)
In Figure 5, when the system starts in the “Mode Select” state it asks the user “Shall we learn
by planning or by demonstration?”
Learning by Demonstration: If the user replies “demonstration” (note: In all dialogs,
human responses will be indicated by italics) then the system transitions to the
“Direct” state, and interrogates the subject with “Now what?” The user can reply with
one of the robot-specific motor commands, which the robot will execute and then
again prompt “Now what?” In the course of this direct interaction, if the user
Indeed this is the first concrete example of the use of a multi-word construction. When the
"if condition" construction is issued, the user specifies the "if" and the corresponding
condition in the same utterance (e.g. "if left clear"), and the "if" and the "left clear"
condition are paired together in the stored sequence plan. The user then continues to
specify the succession of commands that should logically follow in case the “if” condition
is satisfied. When the user terminates that logical section, he indicates this with the
conditional command “otherwise”. The subsequent succession of commands corresponds
to those that should be executed in case the condition of the “if” fails. During sequence
execution in the Run state, when the system encounters an “if”, it tests the corresponding
condition. When the condition is true, the system executes up to the “otherwise”
statement. When false, it skips to the otherwise, and then resumes execution of the
subsequent commands.
The behavioral scenarios above also identified the need for a conditional wait, in which
the execution of a stored sequence waits for a sensory condition to become true. Like
the “if”, the “wait condition” construction uses the identifier “wait” followed by a
condition, which is paired with the wait command. Then, during execution of the plan
in the Run state, when the system encounters a “wait” action, it tests the corresponding
condition, and waits until that condition is true before proceeding. The “continue”
condition (indicated by the user saying “continue”) is the single robot-independent
condition.
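To make the execution semantics concrete, the sketch below (a hypothetical plan representation, not the SLP system's code) interprets a stored plan containing "if <condition>", "otherwise" and "wait <condition>" entries; treating everything after "otherwise" as the alternative branch is a simplified reading of the description above.

```python
# Sketch of executing a stored spoken-language plan against robot primitives.
import time

def run_plan(plan, do_action, check_condition, poll=0.2):
    """plan: list of (verb, argument) pairs in the order they were spoken."""
    skipping = False
    for verb, arg in plan:
        if verb == "if":
            skipping = not check_condition(arg)    # skip the branch if the test fails
        elif verb == "otherwise":
            skipping = not skipping                # switch to the alternative branch
        elif skipping:
            continue
        elif verb == "wait":
            while not check_condition(arg):        # block until the sensory event occurs
                time.sleep(poll)
        else:
            do_action(verb)                        # robot-specific motor command

# toy usage with Khepera-style primitives from Table 2
plan = [("explore", None),
        ("if", "left clear"), ("left", None), ("explore", None), ("wait", "continue"),
        ("otherwise", None), ("right", None), ("explore", None), ("wait", "continue")]
run_plan(plan, do_action=print, check_condition=lambda c: True)
```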
Robot Specific Command: Given the robot-independent component of the SLP architecture,
we now explain how this is adapted to the specific robots. The final behavioral result of a
robot-specific action command that is issued either directly or as part of a learned plan is the
execution of the corresponding action on the robot. We now specify the specific commands
for each of the three robot platforms we used.
K-Team Khepera: The Khepera (K-Team) is a small two-wheeled robot equipped with
proximity sensors that has been extensively explored in sensory-motor robotics. Based on
the requirements for the cooperative task described above, a set of primitive actions and
sensor states was identified for the Khepera. These robot-specific sensory-motor functions are
identified in Table 2.
The Khepera is connected to the SLP system PC via the RS232 serial port. Commands are
issued to the Khepera via this connection, and the values of the forward, left and right
position sensors are read from this connection as well.
Motor Commands Resulting Actions
Explore Move forward until obstacle is encountered
Left Turn 90° left
Right Turn 90° right
Specifiable Conditions Corresponding check
Left clear Check left proximity sensor
Right clear Check right proximity sensor
Obstacle removed Check forward proximity sensor
Table 2. Khepera Action Commands and Sensory Conditions.
Kawada Industries HRP-2 Humanoid: Based on the preliminary analysis of the table-
building scenario above, a set of primitive actions was identified for the Kawada
Industries HRP-2 humanoid robot under the control of the OpenHRP controller (Dominey
et al. 2007). The HRP-2 has 30 controlled degrees of freedom, 8 of which are used in this
study. Each of the functions in Table 3 corresponds to a particular posture that is specified
as the angles for a subset of the 30 DOFs. These actions have been implemented by hand-
written Python scripts that specify final, hardcoded joint angles and motion durations for
the given postures. The Python script execution is triggered remotely by the CSLU toolkit,
and communicates directly with the low-level OpenHRP framework. The motion is
achieved by linearly interpolating joint angles between the starting and final
configurations, for each specific action. We have chosen these simple actions in order to
demonstrate the feasibility of the overall approach in the table-building scenario, and
more complex functions are currently under development.
the test condition failed, and execution skipped to the “otherwise” section of the plan. In
execution of this plan segment, an error in the plan was detected. After the robot started to
return to User1 (after the “continue” command had been issued) the plan incorrectly
specified that the robot turn left (into the wall) rather than right (towards the center of the
maze).
That is, during the specification of a plan of this length, it is possible that the user makes an
error in a command, particularly in left vs. right distinctions when the robot is moving
towards the human, in which case the human and robot reference frames are reversed. The
editing capability described above thus allowed the user to listen to the program and see it
on the screen and then modify the errors by indicating the number of the action in the
program, and what it should be changed to.
Dialog: Learn. You said learn? Yes. I’ll start learning now. I am ready. Prepare. You said prepare?
Yes. Preparing to grasp....(same sequence as for the first leg) ... Right open. You said right open?
Yes. Opening right hand. <User1 takes the table>. I am ready. OK. You said OK? Yes. OK we will
store this plan. I am ready.
Execution of Stored Program Macro
Now the stored program can be used via the “macro” command.
Dialog: Macro. You said macro? Yes. Running the macro. Preparing to grasp. Waiting for your
signal. Continue. Closing left hand. <Robot takes leg from User2> Giving you the object.
Opening left hand. <User1 takes leg>. Getting ready to hold. Waiting for your signal. <User1
places the table in robot's right hand> Continue. Closing right hand. Waiting for your signal. <User1
attaches the table leg>. Continue. Opening right hand. In line macro finished. I am ready. <Third
leg is now attached. User1 takes the table. Same procedure with macro for attaching fourth
leg.>
Performance Analysis: During the assembly of the first two legs, User1 issued 10 distinct
commands for each of the two legs. Installing legs 1 and 2 took 3min 25sec and 3min 24sec.
Once the program was learned, for legs 3 and 4, a single command initiated the program,
and the user was only required to issue 3 “continue” commands in order to indicate to the
robot that he was ready to proceed. Execution time was reduced to 2:11 and 2:33
respectively. We performed an analysis of variance (ANOVA) on individual
action completion times, examining the effects of Repetition (i.e. first and second leg in either
the CMD or PRG mode), and Programming condition (i.e. CMD vs PRG). Only the
Programming condition had a significant effect on the completion times (ANOVA,
Programming Effect: F(1,6) = 109, p < 0.0001).
We performed a second experiment in which the same primitives were used, with
absolutely no change to the software, to disassemble the table. The use of the
programming capability for the third and fourth leg (executed in 2:51 and 2:51
respectively) yielded significant reductions in execution time as compared with the first
two legs (executed in 3:57 and 4:11 respectively). To compare performance in the two
experiments we performed a 3 way ANOVA with the factors Experiment (assembly,
disassembly), Programming (with vs. without, i.e. PRG vs CMD), and Repetition (First vs.
second repetition in each condition). The completion times were elevated for the CMD
vs. PRG conditions, i.e., action execution was slower when programming was not used.
The ANOVA revealed that only the Programming effect was significant (F(1,6) = 277, p <
0.0001).
6 robot, as a prototype platform before subsequent implementation on the HRP-2. The table
construction task with the Lynx employed a plastic table surface and four 3cm threaded
bolts with nuts (see insert Fig.3). These served as legs, and the robot was used to fetch the
legs and hold the table while the user fixed the legs to the table, as in the previous
experiment. As illustrated in Fig. 3 we designated 4 areas indicated by color for the four legs
to be placed initially, and four numbered locations where these legs could be placed using a
construction of the form: Put the X leg at location Y, where X was one of the four color
designations and Y one of the four numbered designations (see Fig 3). We also implemented
a construction of the form: Give me the X leg, which commanded the robot to fetch the leg
at the location of color X, and place it at location 3, which was just in front of the user.
Dialog:
Command an action. Give me the red screw. You want the red screw? Yes. Giving the red screw
<robot takes screw at red location> to you <and passes it to the user>. Command an action.
Start learning. <this initiates memory of the subsequent command sequence> Command an
action. Hold. Preparing to hold. Command an action. Close gripper. Closing gripper. Command an
action. Wait. Waiting for your signal. <User inserts and tightens screw while robot holds the
grey plastic “table”. See Fig 3> Continue. Command an action. Open. Opening gripper.
Command an action. Save the macro. You said save the macro? Yes.
Command an action. Give me the blue screw. You want the blue screw? Yes. Giving the blue
screw… to you. Command an action. Run the macro. You said run the macro? Yes. Preparing to
hold. Closing gripper. Waiting for your signal. <User attaches screw while robot holds the table
steady>. Continue. Opening gripper.
Command an action. Give me the green screw. You want the green screw? Yes. Giving the green
screw… to you. Command an action. Run the macro. You said run the macro? Yes. Preparing to
hold. Closing gripper. Waiting for your signal. Continue. Opening gripper.
Command an action. Give me the yellow screw. You want the yellow screw? Yes. Giving the
yellow screw… to you. Command an action. Run the macro. You said run the macro? Yes.
Preparing to hold. Closing gripper. Waiting for your signal. Continue. Opening gripper.
The total execution time for the assembly was less than five minutes. With the macro use,
the user required five spoken commands per leg, and three of these were confirmations of
commands and the continue signal. The two main commands are “Give me..” and “run the
macro.” With respect to the HRP-2, the availability of the “Give me the X screw” PA
command eliminated the need to sequence a number of lower level postural commands into
the macro, which instead was limited to implementing the “hold the table” function.
One of the recurrent findings across studies of human imitation is that in the context of goal
directed action, it is the goal itself that tends to take precedence in defining what is to be
imitated, rather than the means (Bekkering et al. 2000, Tomasello et al. 2005). Of course in
some situations it is the details (e.g. kinematics) of the movement itself that are to be
imitated (see discussion in Carpenter and Call 2007; Cuijpers et al. 2006), but the current
research focuses on goal based imitation. This body of research helped to formulate
questions concerning what could be the neurophysiological substrates for goal based
imitation. In 1992, di Pellegrino et al. in the Rizzolatti lab published the first results on
"mirror" neurons, whose action potentials reflected both the production of specific goal-
directed action and the perception of the same action being carried out by the experimenter.
Since then, the premotor and parietal mirror system has been studied in detail in monkey
(by single-unit recording) and in man (by PET and fMRI), as reviewed in Rizzolatti & Craighero
(2004).
In the context of understanding imitation, the discovery of the mirror system had an
immense theoretical impact, as it provided justification for a common code for action
production and perception. In recent years, significant research activity has used
simulation and robotic platforms to attempt to link imitation behavior to the underlying
neurophysiology at different levels of detail (see Oztop et al. (2006) for a recent and
thorough review, the edited volume of Nehaniv & Dautenhahn (2007), and a dedicated special
issue of Neural Networks (Billard & Schaal 2006)). Such research must directly address the
question of how to determine what to imitate. Carpenter and Call (2007) distinguish three
aspects of the demonstration to copy: the physical action, the resulting change in physical
state, and the inferred goal – the internal representation of the desired state. Here we
concentrate on imitation of the goal, with the advantage of eliminating the difficulties of
mapping detailed movement trajectories across the actor and imitator (Cuijpers et al. 2006).
Part of the novelty of the current research is that it will explore imitation in the context of
cooperative activity in which two agents act in a form of turn-taking sequence, with the
actions of each one folding into an interleaved and coordinated intentional action plan. With
respect to constraints derived from behavioral studies, we choose to examine child
development studies, because such studies provide well-specified protocols that test
behavior that is both relatively simple, and pertinent. The expectation is that a system that
can account for this behavior should extend readily to more complex behavior, as
demonstrated below.
Looking to the developmental data, Warneken, Chen and Tomasello (2006) engaged 18-24-
month-old children in goal-oriented tasks and social games that required cooperation. In one of
the social games, the experiment began with a demonstration where one participant sent a
wooden block sliding down an inclined tube and the other participant caught the block in a
tin cup that made a rattling sound. This can be considered more generally as a task in which
one participant moves an object so that the second participant can then in turn manipulate
the object. This represents a minimal case of a coordinated action sequence. After the
demonstration, in Trials 1 and 2 the experimenter sent the block down one of the tubes three
times, and then switched to the other, and the child was required to choose the same tube as
the partner. In Trials 3 and 4 during the game, the experimenter interrupted the behavior for
15 seconds and then resumed.
Behaviorally, children successfully participated in the game in Trials 1 and 2. In the
interruption Trials 3 and 4 they displayed two particularly interesting types of response that
were (a) to attempt to perform the role of the experimenter themselves, and/or (b) to
reengage the experimenter with a communicative act. This indicates that the children had a
clear awareness both of their role and that of the adult in the shared coordinated activity.
This research thus identifies a set of behavioral objectives for robot behavior in the
perception and execution of cooperative intentional action. Such behavior could, however,
be achieved in a number of possible architectures.
Fig. 6. Cooperation System. In a shared work-space, human and robot manipulate objects
(green, yellow, red and blue circles corresponding to dog, horse, pig and duck), placing
them next to the fixed landmarks (light, turtle, hammer, etc.). Action: Spoken commands
interpreted as individual words or grammatical constructions, and the command and
possible arguments are extracted using grammatical constructions in Language Proc. The
resulting Action(Agent, Object, Recipient) representation is the Current Action. This is
converted into robot command primitives (Motor Command) and joint angles (Motor
Control) for the robot. Perception: Vision provides object location input, allowing action to be
perceived as changes in World State (State Comparator). Resulting Current Action used for
action description, imitation, and cooperative action sequences. Imitation: The user
performed action is perceived and encoded in Current Action, which is then used to control
the robot under the supervision of Executive Control. Cooperative Games. During
observations, individual actions are perceived, and attributed to the agent or the other
player (Me or You). The action sequence is stored in the We Intention structure, that can
then be used to separately represent self vs. other actions.
screw, which provides an easy grasping target both for the robot and for humans. In
addition there are 6 images that are fixed to the table and serve as landmarks for placing the
moveable objects, and correspond to a light, a turtle, a hammer, a rose, a lock and a lion, as
partially illustrated in Figures 6 & 7. In the interactions, human and robot are required to
place objects in zones next to the different landmarks, so that the robot can more easily
determine where objects are, and where to grasp them. Figure 6 provides an overview of the
architecture, and Figure 7, which corresponds to Experiment 6 provides an overview of how
the system operates.
Representation: The structure of the internal representations is a central factor determining
how the system will function, and how it will generalize to new conditions. Based on the
neurophysiology reviewed above, we use a common representation of action for both
perception and production. Actions are identified by the agent, the object, and the target
location to move that object to. As illustrated in Figure 6, by taking the short loop from
vision, via Current Action Representation, to Motor Command, the system is thus
configured for a form of goal-centered action imitation. This will be expanded upon below.
A central feature of the system is the World Model that represents the physical state of the
world, and can be accessed and updated by vision, motor control, and language, similar to
the Grounded Situation Model of Mavridis & Roy (2006). The World Model encodes the
physical locations of objects and is updated by vision and proprioception (i.e., a robot action
updates the World Model with the new object location). Changes in the World Model, in terms of an
object being moved, allow the system to detect actions in terms of these object movements.
Actions are represented in terms of the agent, the object and the goal of the action, in the
form MOVE(object, goal location, agent). These representations can be used for
commanding action, for describing recognized action, and thus for action imitation and
narration, as seen below. In order to allow for more elaborate cooperative activity, the
system must be able to store and retrieve actions in a sequential structure.
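To make this concrete, the sketch below illustrates how an action can be detected as the difference between two successive World Model snapshots, using a shared representation for perception and production. The Move dataclass, the dictionary-based World Model and the detect_action function are names introduced here purely for illustration; they are not the system's actual data structures.

# Minimal sketch (assumed names) of the shared action representation and the
# State Comparator: an action is the difference between two World Model snapshots.
from dataclasses import dataclass

@dataclass
class Move:
    obj: str      # moveable object, e.g. "dog"
    goal: str     # landmark the object was placed next to, e.g. "rose"
    agent: str    # "robot" or "user"

def detect_action(previous_world, current_world, agent):
    """World models are assumed to be dicts mapping object name -> nearest landmark."""
    for obj, landmark in current_world.items():
        if previous_world.get(obj) != landmark:
            return Move(obj, landmark, agent)   # an object was displaced
    return None                                 # no change detected yet

# Example: the user moved the dog from the lock to the rose.
before = {"dog": "lock", "horse": "lion"}
after  = {"dog": "rose", "horse": "lion"}
print(detect_action(before, after, agent="user"))  # Move(obj='dog', goal='rose', agent='user')

The same Move(...) structure can then serve for commanding an action, describing a recognized action, and imitating it, as described above.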
Fig. 7. Cooperative task of Exp 5-6. Robot arm, with 6 landmarks (Light, turtle, hammer,
rose, lock and lion from top to bottom). Moveable objects include Dog and Horse. In A-D,
human demonstrates a “horse chase the dog” game, and successively moves the Dog then
Horse, indicating that in the game, the user then the robot are agents, respectively. After
demonstration, human and robot “play the game”. In each of E – F user moves Dog, and
robot follows with Horse. In G robot moves horse, then in H robot detects that the user is
having trouble and so “helps” the user with the final move of the dog. See Exp 5 & 6.
Figure 8. I. Vision processing. Above: A. – D. Three templates each for the Dog, Duck, Horse
and Pig objects at three different orientations. Below, encompassing circles indicate template
recognition for the four different objects near different fixed landmarks, as seen from the
camera over the robot workspace. II. Dialog flow of Control
In a calibration phase, a target point is marked next to each of the 6 fixed landmark
locations, such that they are all on an arc that is equidistant to the center of rotation of the
robot arm base. For each, the rotation angle of Joint 0 (the rotating shoulder base) necessary
to align the arm with that point is then determined, along with a common set of joint angles
for Joints 1 – 5 that position the gripper to seize any of the objects. Angles for Joint 6 that
controls the closing and opening of the gripper to grasp and release an object were then
identified. Finally a neutral position to which the arm could be returned in between
movements was defined. The system was thus equipped with a set of primitives that could
be combined to position the robot at any of the 6 grasping locations, grasp the
corresponding object, move to a new position, and place the object there.
Cooperation Control Architecture: The spoken language control architecture illustrated in
Fig 8.II is implemented with the CSLU Rapid Application Development toolkit
(http://cslu.cse.ogi.edu/toolkit/). This system provides a state-based dialog management
system that allows interaction with the robot (via the serial port controller) and with the
vision processing system (via file i/o). It also provides the spoken language interface that
allows the user to determine what mode of operation he and the robot will work in, and to
manage the interaction via spoken words and sentences.
Figure 8.II illustrates the flow of control of the interaction management. In the Start state the
system first visually observes where all of the objects are currently located. From the start
state, the system allows the user to specify if he wants to ask the robot to perform actions
(Act), to imitate the user, or to play (Imitate/Play). In the Act state, the user can specify
actions of the form “Put the dog next to the rose” and a grammatical construction template
is used to extract the action that the robot then performs. In the Imitate state, the robot first
verifies the current state (Update World) and then invites the user to demonstrate an action
(Invite Action). The user shows the robot one action. The robot re-observes the world and
detects the action based on changes detected (Detect Action). This action is then saved and
transmitted (via Play the Plan with Robot as Agent) to execution (Execute action). A
predicate(argument) representation of the form Move(object, landmark) is used both for
action observation and execution.
Imitation is thus a minimal case of Playing in which the “game” is a single action executed
by the robot. In the more general case, the user can demonstrate multiple successive actions,
and indicate the agent (by saying “You/I do this”) for each action. The resulting intentional
plan specifies what is to be done by whom. When the user specifies that the plan is finished,
the system moves to the Save Plan, and then to the Play Plan states. For each action, the
system recalls whether it is to be executed by the robot or the user. Robot execution takes the
standard Execute Action pathway. User execution performs a check (based on user
response) concerning whether the action was correctly performed or not. If the user action is
not performed, then the robot communicates with the user, and performs the action itself.
Thus, “helping” was implemented by combining an evaluation of the user action, with the
existing capability to perform a stored action representation.
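The play-and-help control flow can be sketched as follows. The tuple encoding of the We Intention and the helper functions (execute_move, ask_user_ok, say) are hypothetical stand-ins for the corresponding CSLU dialog states, introduced only to illustrate how a failed user action triggers helping.

# Sketch (assumed names) of playing back a stored intentional plan.
def play_plan(we_intention, execute_move, ask_user_ok, say):
    """we_intention: list of (obj, goal, agent) tuples in demonstration order."""
    for obj, goal, agent in we_intention:
        if agent == "robot":
            say("Now robot puts the %s by the %s" % (obj, goal))
            execute_move(obj, goal)          # standard Execute Action pathway
        else:
            say("Now user puts the %s by the %s" % (obj, goal))
            if not ask_user_ok():            # user reports a problem
                say("Let me help you")
                execute_move(obj, goal)      # robot performs the user's action instead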
World Model with the position of all visible objects. It then informs the user of the locations
of the different objects, for example “The dog is next to the lock, the horse is next to the
lion.” It then asks the user “Do you want me to act, imitate, play or look again?”, and the
user responds with one of the action-related options, or with "look again" if the scene is not
described correctly.
Validation of Sensorimotor Control: In this experiment, the user says that he wants the
“Act” state (Fig 8.II), and then uses spoken commands such as “Put the horse next to the
hammer”. Recall that the horse is among the moveable objects, and hammer is among the
fixed landmarks. The robot requests confirmation and then extracts the predicate-argument
representation - Move(X to Y) - of the sentence based on grammatical construction templates.
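A minimal sketch of such a construction template is given below; the regular expression, the returned dictionary, and the function name are assumptions introduced for illustration, since the actual system relies on CSLU toolkit grammars rather than this code.

# Illustrative construction template: "Put the X next to the Y" -> Move(X, Y).
import re

PUT_CONSTRUCTION = re.compile(r"put the (\w+) next to the (\w+)")

def parse_command(sentence):
    match = PUT_CONSTRUCTION.match(sentence.lower())
    if match:
        obj, landmark = match.groups()
        return {"action": "move", "object": obj, "goal": landmark, "agent": "robot"}
    return None

print(parse_command("Put the horse next to the hammer"))
# {'action': 'move', 'object': 'horse', 'goal': 'hammer', 'agent': 'robot'}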
In the Execute Action state, the action Move(X to Y) is decomposed into two components of
Get(X), and Place-At(Y). Get(X) queries the World Model in order to localize X with respect
to the different landmarks, and then performs a grasp at the corresponding landmark target
location. Likewise, Place-At(Y) simply performs a transport to target location Y and releases
the object. Decomposing the get and place functions allows the composition of all possible
combinations in the Move(X to Y) space. Ten trials were performed moving the four objects to
and from different landmark locations. Experiment 1 thus demonstrates (1) the ability to
transform a spoken sentence into a Move(X to Y) command, (2) the ability to perform visual
localization of the target object, and (3) the sensory-motor ability to grasp the object and put
it at the specified location. In ten experimental runs, the system performed correctly.
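The decomposition can be sketched as follows; the motion-primitive callables (goto_landmark, close_gripper, open_gripper) are hypothetical stand-ins for the calibrated per-landmark postures described earlier, not the system's actual interface.

# Sketch of decomposing Move(X, Y) into Get(X) and Place-At(Y).
def get(obj, world_model, goto_landmark, close_gripper):
    landmark = world_model[obj]      # localize X via the World Model
    goto_landmark(landmark)          # rotate the base and reach to that landmark
    close_gripper()                  # grasp the object

def place_at(landmark, goto_landmark, open_gripper):
    goto_landmark(landmark)          # transport to target location Y
    open_gripper()                   # release the object

def move(obj, landmark, world_model, goto_landmark, close_gripper, open_gripper):
    get(obj, world_model, goto_landmark, close_gripper)
    place_at(landmark, goto_landmark, open_gripper)
    world_model[obj] = landmark      # proprioceptive update of the World Model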
Imitation: In this experiment the user chooses the “imitate” state. As stated above, imitation
is centered on the achieved ends – in terms of observed changes in state – rather than the
means towards these ends. Before the user performs the demonstration of the action to be
imitated, the robot queries the vision system, and updates the World Model (Update World
in Fig 8.II) and then invites the user to demonstrate an action. The robot pauses, and then
again queries the vision system and continues to query until it detects a difference between
the currently perceived world state and the previously stored World Model (in State
Comparator of Fig 6, and Detect Action in Fig 8.II), corresponding to an object displacement.
Extracting the identity of the displaced object and its new location (with respect to the
nearest landmark) allows the formation of a Move(object, location) action representation.
Before imitating, the robot operates on this representation with a meaning-to-sentence
construction in order to verify the action to the user, as in “Did you put the dog next to the
rose?” It then asks the user to put things back as they were so that it can perform the
imitation. At this point, the action is executed (Execute Action in Fig 8.II). In ten
experimental runs the system performed correctly. This demonstrates (1) the ability of the
system to detect the goals of user-generated actions based on visually perceived state
changes, and (2) the utility of a common representation of action for perception, description
and execution.
A Cooperative Game: The cooperative game is similar to imitation, except that there is a
sequence of actions (rather than just one), and the actions can be effected by either the user
or the robot in a cooperative manner. In this experiment, the user responds to the system
request and enters the “play” state. In what corresponds to the demonstration in Warneken
et al. (2006) the robot invites the user to start showing how the game works. The user then
begins to perform a sequence of actions. For each action, the user specifies who does the
action, i.e. either “you do this” or “I do this”. The intentional plan is thus stored as a
sequence of action-agent pairs, where each action is the movement of an object to a
particular target location. In Fig 6, the resulting interleaved sequence is stored as the “We
intention”, i.e. an action sequence in which there are different agents for different actions.
When the user is finished he says “play the game”. The robot then begins to execute the
stored intentional plan. During the execution, the “We intention” is decomposed into the
components for the robot (Me Intention) and the human (You intention).
In one run, during the demonstration, the user said “I do this” and moved the horse from
the lock location to the rose location. He then said “you do this” and moved the horse back
to the lock location. After each move, the robot asks “Another move, or shall we play the
game?”. When the user is finished demonstrating the game, he replies “Play the game.”
During the playing of this game, the robot announced “Now user puts the horse by the
rose”. The user then performed this movement. The robot then asked the user “Is it OK?” to
which the user replied “Yes”. The robot then announced “Now robot puts the horse by the
lock” and performed the action. In two experimental runs of different demonstrations, and 5
runs each of the two demonstrated games, the system performed correctly. This
demonstrates that the system can learn a simple intentional plan as a stored action sequence
in which the human and the robot are agents in the respective actions.
Interrupting a Cooperative Game: In this experiment, everything proceeds as in the
previous experiment, except that after one correct repetition of the game, in the next
repetition, when the robot announced “Now user puts the horse by the rose” the user did
nothing. The robot asked "Is it OK?" and, during a 15-second delay, the user replied "no".
The robot then said “Let me help you” and executed the move of the horse to the rose. Play
then continued for the remaining move of the robot. This illustrates how the robot’s stored
representation of the action that was to be performed by the user allowed the robot to
“help” the user.
A More Complex Game: In order to more explicitly test the intentional sequencing
capability of the system, this experiment replicates the Cooperative Game experiment but
with a more complex task, illustrated in Figure 7. In this game (Table 5), the user starts by
moving the dog, and after each move the robot "chases" the dog with the horse, until they
both return to their starting places.
As in the simplified cooperative game, the successive actions are visually recognized and
stored in the shared “We Intention” representation. Once the user says “Play the game”, the
final sequence is stored, and then during the execution, the shared sequence is decomposed
into the robot and user components based on the agent associated with each action. When
the user is the agent, the system invites the user to make the next move, and verifies (by
asking) if the move was ok. When the system is the agent, the robot executes the movement.
After each move the World Model is updated. As before two different complex games were
learned, and each one “played” 5 times. This illustrates the learning by demonstration
(Zollner et al. 2004) of a complex intentional plan in which the human and the robot are
agents in a coordinated and cooperative activity.
Interrupting the Complex Game: As in Experiment 4, the objective was to verify that the
robot would take over if the human had a problem. In the current experiment this capability
is verified in a more complex setting. Thus, when the user is making the final movement of
the dog back to the “lock” location, he fails to perform correctly, and indicates this to the
robot. When the robot detects failure, it reengages the user with spoken language, and then
offers to fill in for the user. This is illustrated in Figure 7H. This demonstrates the
generalized ability to help that can occur whenever the robot detects the user is in trouble.
These results were presented in Dominey (2007).
9. Discussion
The beginning of the 21st century marks a period where humanoid robot mechatronics and
the study of human and artificial cognitive systems have come, in parallel, to a level of maturity
sufficient for significant progress to be made in making these robots more human-like in
their interactions. In this context, two domains of interaction that humans exploit with great
fidelity are spoken language, and the visual ability to observe and understand intentional
action. A good deal of research effort has been dedicated to the specification and
implementation of spoken language systems for human-robot interaction (Crangle &
Suppes 1994, Lauria et al. 2002, Severinson-Eklund 2003, Kyriacou et al. 2005, Mavrides &
Roy 2006). The research described in the current chapter extends these approaches with a
Spoken Language Programming system that allows a more detailed specification of
conditional execution, and by using language as a complement to vision-based action
perception as a mechanism for indicating how things are to be done, in the context of
cooperative, turn-taking behavior.
The abilities to observe an action, determine its goal and attribute this to another agent are
all clearly important aspects of the human ability to cooperate with others. Recent research
in robot imitation (Oztop et al. 2006, Nehaniv & Dautenhahn 2007, Billard & Schaal 2006)
and programming by demonstration (Zollner et al. 2004) begins to address these issues.
Such research must directly address the question of how to determine what to imitate
(Carpenter & Call 2007). The current research demonstrates how these capabilities can
contribute to the “social” behavior of learning to play a cooperative game, playing the game,
and helping another player who has gotten stuck in the game, as displayed in 18-24 month
children (Warneken et al. 2006, Warneken & Tomasello 2006). While the primitive bases of
such behavior are visible in chimps, its full expression is uniquely human. As such, it can be
considered a crucial component of human-like behavior for robots (Carpenter & Call 2007).
The current research is part of an ongoing effort to understand aspects of human social
cognition by bridging the gap between cognitive neuroscience, simulation and robotics
(Dominey 2003, 2005; Dominey et al. 2004, 2006, 2007; Dominey & Boucher 2005), with a focus on the
role of language. The experiments presented here indicate that functional requirements
derived from human child behavior and neurophysiological constraints can be used to
define a system that displays some interesting capabilities for cooperative behavior in the
context of spoken language and imitation. Likewise, they indicate that evaluation of
another’s progress, combined with a representation of his/her failed goal provides the basis
for the human characteristic of “helping.” This may be of interest to developmental
scientists, and the potential collaboration between these two fields of cognitive robotics and
human cognitive development is promising. The developmental cognition literature lays out
a virtual roadmap for robot cognitive development (Dominey 2005, Warneken et al. 2006). In
this context, we are currently investigating the development of hierarchical means-end
action sequences. At each step, the objective will be to identify the characteristic behavior
and to implement it in the most economical manner in this continuously developing system
for human-robot cooperation.
At least two natural extensions to the current system can be considered. The first involves
the possibility for changes in perspective. In the experiments of Warneken et al. the child
watched two adults perform a coordinated task (one adult launching the block down the
tube, and the other catching the block). At 24 months, the child can thus observe the two
roles being played out, and then step into either role. This indicates a “bird’s eye view”
representation of the cooperation, in which rather than assigning “me” and “other” agent
roles from the outset, the child represents the two distinct agents A and B for each action in
the cooperative sequence. Then, once the perspective shift is established (by the adult taking
one of the roles, or letting the child choose one) the roles A and B are assigned to me and
you (or vice versa) as appropriate.
This actually represents a minimal change to our current system. First, rather than assigning
the “you” “me” roles in the We Intention at the outset, these should be assigned as A and B.
Then, once the decision is made as to the mapping of A and B onto robot and user, these
agent values will then be assigned accordingly. Second, rather than having the user tell the
robot “you do this” and “I do this” the vision system can be modified to recognize different
agents who can be identified by saying their name as they act, or via visually identified cues
on their acting hands.
The second issue has to do with inferring intentions. The current research addresses one
cooperative activity at a time, but nothing prevents the system from storing multiple such
intentional plans in a repertory (IntRep in Fig 6). In this case, as the user begins to perform a
sequence of actions involving himself and the robot, the robot can compare this ongoing
sequence to the initial subsequences of all stored sequences in the IntRep. In case of a match,
the robot can retrieve the matching sequence, and infer that it is this that the user wants to
perform. This can be confirmed with the user and thus provides the basis for a potentially
useful form of learning for cooperative activity.
In conclusion, the current research has attempted to build and test a robotic system for
interaction with humans, based on behavioral and neurophysiological requirements derived
from the respective literatures. The interaction involves spoken language and the
performance and observation of actions in the context of cooperative action. The
experimental results demonstrate a rich set of capabilities for robot perception and
subsequent use of cooperative action plans in the context of human-robot cooperation. This
work thus extends the imitation paradigm into that of sequential behavior, in which the
learned intentional action sequences are made up of interlaced action sequences performed
in cooperative alternation by the human and robot. While many technical aspects of robotics
(including visuomotor coordination and vision) have been simplified, it is hoped that the
contribution to the study of imitation and cooperative activity is of some value.
Acknowledgements: I thank Jean-Paul Laumond, Eiichi Yoshida and Anthony Mallet from
the LAAS Toulouse for cooperation with the HRP-2 as part of the French-Japanese Joint
10. References
Bekkering H, Wohlschlager A, Gattis M (2000) Imitation of Gestures in Children is Goal-
directed, The Quarterly Journal of Experimental Psychology: Section A, 53, 153-
164
Billard A, Schaal S (2006) Special Issue: The Brain Mechanisms of Imitation Learning, Neural
Networks, 19(1) 251-338
Boucher J-D, Dominey PF (2006) Programming by Cooperation: Perceptual-Motor Sequence
Learning via Human-Robot Interaction, Proc. Simulation of Adaptive Behavior,
Rome 2006.
Calinon S, Guenter F, Billard A (2006) On Learning the Statistical Representation of a Task
and Generalizing it to Various Contexts. Proc IEEE/ICRA 2006.
Carpenter M, Call J (2007) The question of 'what to imitate': inferring goals and
intentions from demonstrations, in Chrystopher L. Nehaniv and Kerstin
Dautenhahn (Eds.), Imitation and Social Learning in Robots, Humans and Animals,
Cambridge University Press, Cambridge.
Crangle C. & Suppes P. (1994) Language and Learning for Robots, CSLI lecture notes: no. 41,
Stanford.
Cuijpers RH, van Schie HT, Koppen M, Erlhagen W, Bekkering H (2006) Goals and means in
action observation: A computational approach, Neural Networks 19, 311-322,
di Pellegrino G, Fadiga L, Fogassi L, Gallese V, Rizzolatti G (1992) Understanding motor
events: a neurophysiological study. Exp Brain Res.;91(1):176-80.
Dominey PF (2005) Toward a construction-based account of shared intentions in social
cognition. Comment on Tomasello et al. 2005, Beh Brain Sci. 28:5, p. 696.
Dominey PF, (2003) Learning grammatical constructions from narrated video events for
human–robot interaction. Proceedings IEEE Humanoid Robotics Conference, Karlsruhe,
Germany
Dominey PF, Alvarez M, Gao B, Jeambrun M, Weitzenfeld A, Medrano A (2005) Robot
Command, Interrogation and Teaching via Social Interaction, Proc. IEEE Conf. On
Humanoid Robotics 2005.
Dominey PF, Boucher JD (2005) Learning To Talk About Events From Narrated Video in the
Construction Grammar Framework, Artificial Intelligence, 167 (2005) 31–61
Dominey PF, Boucher, J. D., & Inui, T. (2004). Building an adaptive spoken language
interface for perceptually grounded human–robot interaction. In Proceedings of the
IEEE-RAS/RSJ international conference on humanoid robots.
Dominey PF, Hoen M, Inui T. (2006) A neurolinguistic model of grammatical construction
processing. Journal of Cognitive Neuroscience.18(12):2088-107.
Dominey PF, Mallet A, Yoshida E (2007) Progress in Spoken Language Programming of the
HRP-2 Humanoid, Proc. ICRA 2007, Rome
Dominey PF (2007) Sharing Intentional Plans for Imitation and Cooperation: Integrating
Clues from Child Developments and Neurophysiology into Robotics, Proceedings
of the AISB 2007 Workshop on Imitation.
Fong T, Nourbakhsh I, Dautenhahn K (2003) A survey of socially interactive robots. Robotics
and Autonomous Systems, 42 3-4, 143-166.
Goga, I., Billard, A. (2005), Development of goal-directed imitation, object manipulation and
language in humans and robots. In M. A. Arbib (ed.), Action to Language via the
Mirror Neuron System, Cambridge University Press (in press).
Goldberg A. Constructions: A new theoretical approach to language. Trends in Cognitive
Sciences 2003; 7: 219–24.
Kozima H., Yano H. (2001) A robot that learns to communicate with human caregivers, in:
Proceedings of the International Workshop on Epigenetic Robotics,.
Kyriacou T, Bugmann G, Lauria S (2005) Vision-based urban navigation procedures for
verbally instructed robots. Robotics and Autonomous Systems, (51) 69-80
Lauria S, Buggmann G, Kyriacou T, Klein E (2002) Mobile robot programming using natural
language. Robotics and Autonomous Systems 38(3-4) 171-181
Mavridis N, Roy D (2006). Grounded Situation Models for Robots: Where Words and
Percepts Meet. Proceedings of the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS)
Nehaniv CL, Dautenhahn K (Eds.) (2007) Imitation and Social Learning in Robots, Humans and
Animals, Cambridge University Press, Cambridge.
Nicolescu MN, Mataric MJ (2001) Learning and Interacting in Human-Robot Domains, IEEE
Trans. Sys. Man Cybernetics B, 31(5) 419-430.
Oztop E, Kawato M, Arbib M (2006) Mirror neurons and imitation: A computationally
guided review. Neural Networks, (19) 254-271
Pickering MJ, Garrod S (2004) Toward a mechanistic psychology of dialogue. Behav Brain
Sci. 27(2):169-90
Rizzolatti G, Craighero L (2004) The Mirror-Neuron system, Annu. Rev. Neuroscience (27)
169-192
Severinson-Eklund K., Green A., Hüttenrauch H., Social and collaborative aspects of
interaction with a service robot, Robotics and Autonomous Systems 42 (2003) 223–234
Sommerville A, Woodward AL (2005) Pulling out the intentional structure of action: the
relation between action processing and action production in infancy. Cognition, 95,
1-30.
Tomasello M, Carpenter M, Call J, Behne T, Moll H (2005) Understanding and sharing
intentions: The origins of cultural cognition, Beh. Brain Sci. 28, 675-735.
Torrey C, Powers A, Marge M, Fussell SR, Kiesler S (2005) Effects of Adaptive Robot
Dialogue on Information Exchange and Social Relations, Proceedings HRI 2005.
Warneken F, Chen F, Tomasello M (2006) Cooperative Activities in Young Children and
Chimpanzees, Child Development, 77(3) 640-663.
Warneken F, Tomasello M (2006) Altruistic helping in human infants and young
chimpanzees, Science, 311, 1301-1303
Zöllner R, Asfour T, Dillmann R: Programming by Demonstration: Dual-Arm Manipulation Tasks
for Humanoid Robots. Proc IEEE/RSJ Intern. Conf on Intelligent Robots and systems
(IROS 2004).
12
Collision-Free Humanoid Reaching: Past, Present, and Future
1. Abstract
Most recent humanoid research has focused on balance and locomotion. This concentration is
certainly important, but one of the great promises of humanoid robots is their potential for
effective interaction with human environments through manipulation. Such interaction has
received comparatively little attention, in part because of the difficulty of this task. One of the
greatest obstacles to autonomous manipulation by humanoids is the lack of efficient collision-
free methods for reaching. Though the problem of reaching and its relative, pick-and-place,
have been discussed frequently in the manipulator robotics literature (e.g., Lozano-Pérez et
al., 1989; Alami et al., 1989; Burridge et al., 1995), researchers in humanoid robotics have
made few forays into these domains. Numerous subproblems must be successfully addressed
to yield significant progress in humanoid reaching. In particular, there exist several open
problems in the areas of algorithms, perception for modeling, and control and execution. This
chapter discusses these problems, presents recent progress, and examines future prospects.
2. Introduction
Reaching is one of the most important tasks for humanoid robots, endowing them with
the ability to manipulate objects in their environment. Unfortunately, getting humanoids to
reach efficiently and safely, without collision, is a complex problem that requires solving
open subproblems in the areas of algorithms, perception for modeling, and control and
execution. The algorithmic problem requires the synthesis of collision-free joint-space
trajectories in the presence of moving obstacles. The perceptual problem, with respect to
modeling, consists of acquiring sufficiently accurate information for constructing a
geometric model of the environment. Problems of control and execution are concerned with
correcting deviation from reference trajectories and dynamically modifying these
trajectories during execution to avoid unexpected obstacles. This chapter delves into the
relevant subproblems above in detail, describes the progress that has been made in solving
them, and outlines the work remaining to be done in order to enable humanoids to perform
safe reaching in dynamic environments.
3. Problem statement
The problem of reaching is formally cast as follows. Given:
1. a world W = R^3;
2. the current time t0; T is then defined as the interval [t0, tf];
3. a robot A;
4. a smooth manifold X, called the state space of A; let k : X → C be a function that
maps state-space to the robot's configuration space C;
5. the state transition equation dx/dt = f(x, u), where x ∈ X and u(.) generates a
vector of control inputs (u(t)) as a function of time;
6. a nonstationary obstacle region O(t) ⊂ W; X_obs(t) is then the
projection of obstacles in the robot's configuration space into state-space (i.e.,
X_obs(t) = k^-1(C_obs(t)) and X_free(t) = X \ X_obs(t));
7. R ⊆ W, the reachable workspace1 of A;
8. a direct kinematics function F : X → R that transforms robot states to operational-space
configurations of one of the robot's end effectors;
9. a set of feasible operational-space goal functions of time, G, such that for all g ∈ G, g : T → R;
10. a feasible state-space Boolean function Γ : T × G × X → {0, 1}, where g ∈ G;
11. x0 ∈ X_free(t0), the state of the robot at t0;
generate the control vector function u(.) from time t > t0 such that x_t ∈ X_free(t), where x_t =
x0 + ∫ f(x, u) dt for t > t0, and there exists a time tj for which ||F(x_ti) − g(ti)|| < H and
Γ(ti, g, x_ti) = 1, for some g ∈ G and all ti > tj, or correctly report that such a function u(.) does not
exist.
Informally, the above states that to solve the reaching problem, the commands sent to the
robot must cause it to remain collision-free and, at some point in the future, cause both the
operational space distance from the end-effector to one of the goals to remain below a given
threshold H and the state-space of the robot to remain in an admissible region.
The implications of the above formal definition are:
• The state transition function f(.) should accurately reflect the dynamics of the robot.
Unfortunately, due to limitations in mechanical modeling and the inherent
uncertainty of how the environment might affect the robot, f(.) will only
approximate the true dynamics. Section 4.3 discusses the ramifications of this
approximation.
• The robot must have an accurate model of its environment. This assumption will
only be true if the environment is instrumented or stationary. The environments in
which humanoids are expected to operate are dynamic (see #6 above), and this
chapter will assume that the environment is not instrumented. Constructing an
accurate model of the environment will be discussed in Section 4.2.
• The goals toward which the robot is reaching may change over time; for example,
the robot may refine its target as the robot moves nearer to it. Thus, even if the
target itself is stationary, the goals may change given additional information. It is
also possible that the target is moving (e.g., a part moving on an assembly line).
The issue of changing targets will be addressed in Section 4.1.
• Manipulation is not explicitly considered. It is assumed that a separate process can
grasp or release an object, given the operational-space target for the hand and the
desired configuration for the fingers (the Boolean function G(.) is used to ensure
that this latter condition is satisfied). This assumption is discussed further in the
next section.
1 The reachable workspace is defined by Sciavicco & Siciliano (2000) to be the region of operational-space
that the robot can reach with at least one orientation.
3 Related work
A considerable body of work relates to the problem defined in the previous section yet does
not solve this problem. In some cases, researchers have investigated similar problems, such
as developing models of human reaching. In other cases, researchers have attempted to
address both reaching and manipulation. This section provides an overview of these
alternate lines of research, though exhaustive surveys of these areas are outside of the scope
of this chapter. Humanoids have yet to autonomously reach via locomotion to arbitrary
objects in known, static environments, much less reach to objects without collision in
dynamic environments. However, significant progress has been made toward solving these
problems recently. This section concludes with a brief survey of methods that are directly
applicable toward solving the reaching problem.
tasks than those used in pick-and-place; for example, both pointing and touching can be
considered as instances of reaching.
A recent development by Stilman & Kuffner, Jr. (2007) addresses manipulation planning
amongst movable obstacles. The reaching problem as defined in Section 2 permits the
obstacles to be movable by the humanoid. Many of the issues described in this chapter (e.g.,
constructing a model of the environment, monitoring execution, etc.) need to be resolved to
fully explore this avenue, but the ability to move obstacles as necessary will certainly admit
new solutions to some instances of the reaching problem.
4. Outstanding issues
This section discusses the issues that remain to solve the reaching problem, as follows:
1. Constructing an accurate model of the environment
2. Planning collision-free motions in dynamic environments
3. Correcting deviation from the desired trajectory due to imperfect control
4. Avoiding both fixed and moving obstacles during trajectory execution
The first item has received the least research attention to date and therefore includes the
majority of open problems in collision-free humanoid reaching. Section 4.2 discusses
progress and prospects in this area. Planning collision-free motions, at least in static
environments, has received considerable attention; Section 4.1 discusses why this problem is
challenging from an algorithmic standpoint and addresses extensions to dynamic
environments. Finally, correcting deviation from the planned trajectory and avoiding
obstacles during trajectory execution are key to reaching the target in a safe manner. Section 4.3
discusses these two issues.
2 In the context of this work, the two-point boundary problem can be considered to be the problem of
getting from a given state to a desired state under nonholonomic constraints. For example, the problem
of parallel parking a car can be considered a two-point boundary problem: a series of actions is required
to move the car into a parking space, even if only a simple translation is required (e.g., the car initially is
aligned laterally with the parking space).
3 We assume that all available degrees-of-freedom are used for reaching, rather than for achieving
secondary tasks like singularity avoidance, e.g., (Tanner & Kyriakopoulos, 2000).
increasing sensory data, leading to new estimates of the target’s position and orientation.
Additionally, the target itself may be moving, perhaps with a predictable trajectory.
The first issue is readily solvable using existing methods. If already in the process of
planning, the goal can be replaced without ill effects: the results from sample-based
planning methods will not be invalidated. If a plan has been determined, replanning can be
performed using the results already determined (i.e., the roadmap, tree, etc.) to speed
computation. Alternatively, inverse kinematics can be utilized to modify the generated plan
slightly, somewhat in the manner used for motion retargeting by Choi & Ko (2000).
Moving targets are far more difficult to manage. Current sample-based planning methods
have not focused on this problem, and the complexity of adapting existing methods to this
purpose is unknown. Again, it is imaginable that existing plans could be modified using
inverse kinematics, though replanning may be required if the target is moving quickly.
Alternatively, Brock’s method (discussed in Section 4.3) could potentially be utilized with
some modifications toward this purpose.
Regardless of the method employed, a significant concern is that the process of planning can
itself cause possible solutions to disappear, because planning occurs in real-time. An
additional significant concern is the requirement of many methods that the trajectories of
the obstacles be known a priori; filtering techniques, e.g., Kalman filters (Kalman & Bucy,
1961), might permit short-term predictions, but long-term predictions will be problematic
unless the obstacles follow ballistic trajectories (which will prove difficult for the robot to
avoid anyhow).
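As an illustration of such short-term prediction, the sketch below applies a constant-velocity Kalman filter to a tracked obstacle position. The matrices, time step and noise levels are arbitrary assumptions for illustration rather than values from any cited work.

# Constant-velocity Kalman filter sketch for short-horizon obstacle prediction.
import numpy as np

dt = 0.1
F = np.array([[1, 0, dt, 0],   # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]])
H = np.array([[1, 0, 0, 0],    # only position is observed
              [0, 1, 0, 0]])
Q = 0.01 * np.eye(4)           # assumed process noise
R = 0.05 * np.eye(2)           # assumed measurement noise

def kalman_step(x, P, z):
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update with position measurement z
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

def predict_ahead(x, steps):
    """Short-term extrapolation only; reliability degrades quickly with horizon."""
    return np.linalg.matrix_power(F, steps) @ x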
4.1.9 Summary
Substantial progress has been made in motion planning in the past decade, leading to
tractable solutions of high-dimensional planning problems. Indeed, in the areas of planning
to nonstationary goals and planning in dynamic environments, researchers are on the cusp
of solutions that are viable for humanoid reaching. However, considerable work still
remains in the area of planning under uncertainty.
4 Although the goal functions for the reaching problem are given in the global coordinate frame, it is
quite natural to use ego-centric frames instead. As a result, localization is likely not required to perform
reaching, except for effectively constructing the environment model.
Marching Cubes algorithm (Lorensen & Cline, 1987) or stitching (Turk & Levoy, 1994) and
aligning and registering (Mayer & Bajcsy 1993); (Pito, 1996). In addition to the requirements
listed above, the ability to readily extract high-level features from the representation for use
with localization, described in Section 4.2.2, would be advantageous.
Drumwright et al. (2006) used an octree, which may be considered as an extension of
occupancy grids, for modeling environments for reaching. The octree representation results
in efficient storage, permits fast updates to the representation from range data, and is
capable of performing fast intersection testing with the robot model. Drumwright et al.
(2006) assumed perfect localization, and we are unaware of work that extracts features from
octrees (or their two-dimensional equivalent, quadtrees) for the purpose of localization.
However, there is precedent for feature extraction from octrees, e.g., (Sung et al., 2002).
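The appeal of the octree can be seen in a minimal sketch such as the one below. This is an assumed, simplified interface and not the implementation of Drumwright et al. (2006): insertion of a range-data point touches at most max_depth nodes, and empty regions are never subdivided, which keeps both updates from range data and occupancy queries cheap.

# Minimal octree sketch (assumed interface) for an environment occupancy model.
class Octree:
    def __init__(self, center, half_size, max_depth=6):
        self.center, self.half_size, self.max_depth = center, half_size, max_depth
        self.children = None        # eight sub-octants, created lazily
        self.occupied = False

    def _octant(self, p):
        cx, cy, cz = self.center
        return (p[0] > cx) | ((p[1] > cy) << 1) | ((p[2] > cz) << 2)

    def insert(self, p, depth=0):
        self.occupied = True
        if depth == self.max_depth:
            return
        if self.children is None:
            h = self.half_size / 2.0
            cx, cy, cz = self.center
            self.children = [Octree((cx + h * (1 if i & 1 else -1),
                                     cy + h * (1 if i & 2 else -1),
                                     cz + h * (1 if i & 4 else -1)), h,
                                    self.max_depth) for i in range(8)]
        self.children[self._octant(p)].insert(p, depth + 1)

    def is_occupied(self, p, depth=0):
        if not self.occupied or depth == self.max_depth or self.children is None:
            return self.occupied
        return self.children[self._octant(p)].is_occupied(p, depth + 1)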
An alternative to the octree representation would use high-level features (e.g., landmarks,
objects, or shapes) as the base representation. Such features would serve well for input to
one of the popular methods for simultaneous localization and mapping (SLAM) such as
(Smith & Cheeseman, 1985); (Smith et al., 1990) or (Fox et al., 1999). Recognition of objects in
the context of mapping and localization has been limited generally to those objects which
are of interest to mobile robots, including doors (Avots et al., 2002), furniture (Hähnel et al.,
2003), and walls (Martin & Thrun, 2002). Additionally, representations used typically
involve geometric primitives, which fail to realize the potential benefit of using identified
objects to infer occluded features. The difficulty of performing object recognition in
unstructured environments makes the near-term realization of this benefit unlikely.
4.2.2 Localization
The humanoid must be able to localize itself (i.e., know its planar position and orientation)
with respect to the environment, if not globally. Recent work by Fox et al. (1999) indicates
that localization can be performed successfully with range data even in highly dynamic
environments. This work, as well as other probabilistic approaches, typically can provide
measures of certainty of their localization estimates. The estimated variance can be used to
filter updates of the environment model; the environment model will be updated only if the
certainty of the prediction is above a given threshold. Alternatively, localization accuracy
can be quite high, e.g., on the order of centimeters Yamamoto et al. (2005), if the
environment is modestly instrumented.
given feature belongs to a dynamic obstacle; dynamic obstacles are not incorporated into the
constructed map. Finally, Biswas et al. (2002) attempt to model dynamic obstacles as
movable rigid bodies using an occupancy grid and an expectation-maximization based
algorithm. Unfortunately, the first two approaches described above are ill-suited for
movable obstacles, while the approach of Biswas et al. is unable to handle dynamic obstacles.
4.2.4 Exploration
Exploration of the environment to facilitate reaching requires both directed locomotion and
directed/active sensing. Both exploration and active sensing have been studied extensively
in the context of mobile robot mapping, but it is unlikely that this research is fully applicable
toward our problem. Exploration in the mapping context is conducted in a greedy manner:
movement is directed to build a complete map of the environment with respect to some
criterion, e.g., minimum time (Yamauchi, 1997), minimum energy expenditure (Mei et al.,
2006), maximum safety (González-Baños & Latombe, 2002), etc. In contrast, the reaching
problem requires a balance between exploration and exploitation: the environment must be
sufficiently explored, but not at the expense of finding a valid collision-free path. The
balance between exploration and exploitation precludes the use of active sensing
frameworks, such as that described by Mihaylova et al. (2002), to guide exploration. Finally,
moving obstacles must be avoided during exploration, perhaps by using reactive
approaches like VFH (Borenstein & Koren, 1989) or potential fields (Khatib, 1986).
Choset & Burdick (2000) refer to environmental exploration in the context of motion
planning as sensor-based motion planning, and introduced a new data structure, the hierarchical
generalized Voronoi graph, to solve this problem. Unfortunately, their work was targeted
toward mobile robots, and it appears highly unlikely to scale to humanoid robots with high-
dimensional configuration spaces.
Researchers have yet to propose methods to perform directed exploration with respect to
motion planning for humanoid robots. One possibility is to adapt the concepts employed for
mobile robot exploration to humanoids. For example, frontier-based exploration (Yamauchi, 1997),
which directs the robot to move to the area between open space and uncharted territory (i.e.,
the frontier), could be combined with a heuristic to select frontiers nearest the target.
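A sketch of such a target-biased frontier heuristic is given below. The occupancy-grid conventions (0 = free, 1 = occupied, -1 = unknown) and the function names are assumptions introduced for illustration.

# Frontier selection biased toward the reaching target (assumed grid conventions).
import numpy as np

def find_frontiers(grid):
    """Frontier cells: known-free cells adjacent to at least one unknown cell."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:
                continue
            neighbors = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if (neighbors == -1).any():
                frontiers.append((r, c))
    return frontiers

def select_frontier(grid, target):
    """Choose the frontier closest to the (projected) reaching target."""
    frontiers = find_frontiers(grid)
    if not frontiers:
        return None
    return min(frontiers, key=lambda f: np.hypot(f[0] - target[0], f[1] - target[1]))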
4.2.5 Summary
This section covered the relevant issues necessary to construct a model of the environment
for reaching. The issues of localization and selecting a proper representation for modeling
the environment seem manageable through existing research. However, considerable work
remains in the areas of modeling nonstationary environments and exploration for motion
planning.
5 As in Section 4.1.3, we assume that the trajectory has been rescaled to respect dynamic constraints.
4.3.4 Summary
Control and locomotion are both significant problems in humanoid robotics, and assessing
future prospects in these areas is difficult. However, the assumption of a planar controller
for locomotion seems reasonable; it would prove difficult to have the roboticist juggle
balance, locomotion, and task performance simultaneously. Indeed, many current
humanoid designs prohibit direct access to actuators.
For execution monitoring, only the elastic strips framework has proven successful for real-time obstacle avoidance on a mobile manipulator that followed a given trajectory. Depending on how
seriously inaccuracies in the joint-space inertia matrix affect elastic strips, this method might
prove extendible to humanoids. Viable alternatives include dynamic replanning and the
offline-online probabilistic roadmap based methods of Jaillet & Siméon (2004) and van den
Berg & Overmars (2005) described in Section 4.1, using dense roadmaps of configuration
space. Regardless of the method chosen for execution monitoring, dynamic obstacles still
require representation in the robot’s perceptual model of the environment, as described in
Section 4.2.
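Whichever method is chosen, the overall execution-monitoring loop might look like the following sketch; every interface here (robot, planner, world_model) is a hypothetical placeholder rather than an API from the works cited above:

def monitor_and_execute(robot, planner, world_model, goal):
    # Follow the current trajectory and replan whenever newly sensed
    # obstacles invalidate it (a sketch of dynamic replanning).
    trajectory = planner.plan(robot.configuration(), goal, world_model)
    while not robot.at(goal):
        world_model.update(robot.sense())              # fold in new range data
        if not world_model.is_collision_free(trajectory):
            # Reuse as much of the previous plan as possible when replanning.
            trajectory = planner.replan(robot.configuration(), goal,
                                        world_model, previous=trajectory)
        robot.execute_step(trajectory)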
5 Conclusion
The past decade has seen remarkable progress in the types of motion planning problems
that have been solved. However, many of these successes have been in application domains
far removed from robotics. Additionally, humanoids have yet to perform tasks
autonomously in human environments. This chapter formally defined one of the most important problems for humanoids: efficient, safe reaching in dynamic, unknown environments. Progress in the three areas critical to solving this problem (algorithms, perception for modeling, and execution) was surveyed, and future prospects were examined.
Table 1 summarizes the key issues in these areas as well as progress and prospects.
Planning under uncertainty: Difficult for humanoids; a possible solution is dynamic replanning.
Incompleteness resulting from multiple IK solutions: Solved by (Bertram et al., 2006) and (Drumwright & Ng-Thow-Hing, 2006).
Planning to nonstationary goals: Has received little focus; a possible solution is modification of existing plans using inverse kinematics.
Planning in dynamic environments: Dynamic replanning and offline-online probabilistic roadmaps seem to be promising directions; effectiveness toward humanoid reaching needs to be assessed.
Representation of environment model: Octree approach works well; high-level features might work better in the future.
Localization: Highly active area of research; (Fox et al., 1999) indicates good localization is feasible even in nonstationary environments.
Modeling nonstationary environments: Considerable work remaining; current work fails to address movable obstacles and dynamic obstacles simultaneously.
Exploration: Considerable work remaining; exploration for planning humanoid motions has yet to be addressed.
Controller deviation: Better controllers will address this problem; can be mitigated by assuming a differential locomotion controller.
Execution monitoring: (Brock & Khatib, 2002) is one possible solution; dynamic replanning is another.
Table 1. The core issues with respect to humanoid reaching discussed in this chapter, with brief summaries of the findings.
With respect to algorithms, control, and execution monitoring, dynamic replanning seems
to be one of the best solutions for humanoid reaching. As computing power continues to
grow, dynamic replanning becomes increasingly viable for planning reaching in
nonstationary environments. For modeling the robot’s environment, computational power
seems to be less of an issue. Instead, new algorithms are required to balance the tradeoff
between exploration and exploitation and to perceive and classify fully dynamic and semi
dynamic obstacles. These three areas critical to humanoid reaching are currently the focus of
intensive research, and results applicable toward humanoid reaching are on the horizon.
6. References
Alami, R., Laumond, J.-P., & Siméon, T. (1997). Two manipulation planning algorithms. In J.-
P. Laumond & M. Overmars (Eds.), Algorithms for robot motion and manipulation.
Wellesley MA: A.K. Peters.
Alami, R., Siméon, T., & Laumond, J.-P. (1989). A geometric approach to planning
manipulation tasks. In Proc. intl. symp. on robotics research (ISRR) (pp. 113-119).
Arbib, M. A., Iberall, T., & Lyons, D. (1985). Coordinated control programs for movements
of the hand. Experimental Brain Research, 111-129.
Avots, D., Lim, E., Thibaux, R., & Thrun, S. (2002). A probabilistic technique for
simultaneous localization and door state estimation with mobile robots in dynamic
environments. In Proc. of the IEEE conf. on intelligent robots and systems (IROS).
Lausanne, Switzerland.
Barraquand, J., & Latombe, J.-C. (1991, Dec). Robot motion planning: a distributed
representation approach. Intl. J. of Robotics Research, 10(6), 628-649.
Baumann, A. (2001). Robot motion planning in time-varying environments. Unpublished
doctoral dissertation.
Bellman, R. E.(1957). Dynamic programming. Dover Publications.
Bertram, D., Kuffner, Jr., J. J., Dillmann, R., & Asfour, T. (2006, May). An integrated
approach to inverse kinematics and path planning for redundant manipulators. In
Proc. of the IEEE intl. conf. on robotics and automation (ICRA). Orlando, FL, USA.
Biswas, R., Limketkai, B., Sanner, S., & Thrun, S. (2002). Towards object mapping in
dynamic environments with mobile robots. In Proc. of the IEEE conf. on intelligent
robots and systems (IROS). Lausanne, Switzerland.
Borenstein, J., & Koren, Y. (1989, Sept-Oct). Real-time obstacle avoidance for fast mobile
robots. IEEE Trans. on Systems, Man, and Cybernetics, 19(5), 1179-1187.
Brock, O. (2000). Generating robot motion: the integration of planning and execution.
Unpublished doctoral dissertation, Stanford University.
Brock, O., & Khatib, O. (2002, Dec). Elastic strips: A framework for motion generation in
human environments. Intl. J. of Robotics Research, 21(12), 1031-1052.
Bullock, D., Grossberg, S., & Guenther, F.(1993). A self-organizing neural model of motor
equivalent reaching and tool use by a multijoint arm. J. of Cognitive Neuroscience, 5,
408-435.
Burridge, R. R., Rizzi, A. A., & Koditschek, D. E. (1995, Aug). Toward a dynamical pick and
place. In Proc. of the 1995 IEEE/RSJ intl. conf. on intelligent robots and systems (Vol. 2,
pp. 292-297).
Canny, J. (1993). Computing roadmaps of general semi-algebraic sets. The Computer Journal,
36(5), 504-514.
Choi, K.-J., & Ko, H.-S. (2000). On-line motion retargeting. Journal of Visualization and
Computer Animation, 11, 223-235.
Choset, H., & Burdick, J. (2000). Sensor based motion planning: The hierarchical generalized
voronoi graph. Intl. J. of Robotics Research, 19(2), 96-125.
Crowe, A., Porrill, J., & Prescott, T. (1998, Sept). Kinematic coordination of reach and
balance. Journal of Motor Behavior, 30(3), 217-233.
Dempster, A. P., Laird, A. N., & Rubin, D. B. (1977). Maximum likelihood from incomplete
data via the EM algorithm. J. of the Royal Statistical Society, Series B, 39(1), 1-38.
Drumwright, E., & Ng-Thow-Hing, V. (2006, Oct). Toward interactive reaching in static environments for humanoid robots. In Proc. of IEEE/RSJ intl. conf. on intelligent
robots and systems (IROS). Beijing.
Drumwright, E., Ng-Thow-Hing, V., & Matarić, M. (2006, Dec). The task matrix framework
for platform independent humanoid programming. In Proc. of IEEE intl. conf. on
humanoid robotics (humanoids). Genova, Italy.
Elfes, A. (1989, June). Using occupancy grids for mobile robot perception and navigation.
Computer, 22(6), 46-57.
Ferguson, D., & Stentz, A. (2006, May). Replanning with RRTs. In Proc. of the IEEE intl. conf.
on robotics and automation (ICRA). Orlando, FL.
Flanagan, J. R., Feldman, A. G., & Ostry, D. J.(1993). Control of trajectory modifications in
target-directed reaching. Journal of Motor Behavior, 25(3), 140-152.
Flash, T., & Hogan, N. (1985, July). The coordination of arm movements: An experimentally
confirmed mathematical model. The Journal of Neuroscience, 5(7), 1688-1703.
Fox, D., Burgard, W., & Thrun, S. (1999). Markov localization for mobile robots in dynamic
environments. Journal of Artificial Intelligence Research, 11, 391-427.
González-Baños, H., & Latombe, J.-C. (2002, Oct-Nov). Navigation strategies for exploring
indoor environments. Intl. J. of Robotics Research, 22(10-11), 829-848.
Gupta, K., Ahuactzin, J., & Mazer, E. (1998). Manipulation planning for redundant robots: A
practical approach. Intl. J. of Robotics Research, 17(7).
Hähnel, D., Schulz, D., & Burgard, W. (2002, Sept-Oct). Mapping with mobile robots in
populated environments. In Proc. of the IEEE/RSJ intl. conf. on intelligent robots and
systems (IROS). Lausanne, Switzerland.
Hähnel, D., Triebel, R., Burgard, W., & Thrun, S. (2003, Sept). Map building with mobile
robots in dynamic environments. In Proc. of the IEEE intl. conf. on robotics and
automation. Taipei, Taiwan.
Jaillet, L., & Siméon, T.(2004, Sept-Oct). A PRM-based motion planner for dynamically
changing environments. In Proc. of the IEEE conf. on intelligent robots and systems
(IROS) (Vol. 2, pp. 1606-1611). Sendai, Japan.
Kagami, S., Kuffner, Jr., J. J., Nishiwaki, K., Inaba, M., & Inoue, H. (2003, April). Humanoid
arm motion planning using stereo vision and RRT search. Journal of Robotics and
Mechatronics, 15(2), 200-207.
Kallmann, M., Aubel, A., Abaci, T., & Thalmann, D. (2003). Planning collision-free reaching
motions for interactive object manipulation and grasping. In Proc. of euro-graphics
(p. 313-322). Grenada, Spain.
Kallmann, M., & Matarić, M. (2004, April). Motion planning using dynamic roadmaps. In Proc.
of IEEE intl. conf. on robotics and automation (ICRA) (p. 4399-4404). New Orleans, LA.
Kalman, R., & Bucy, R. (1961). New results in linear filtering and prediction theory. Trans. of the ASME, Journal of Basic Engineering, 83, 95-107.
Kavraki, L. E., Svestka, P., Latombe, J.-C., & Overmars, M. (1996). Probabilistic roadmaps for path planning in high dimensional configuration spaces. IEEE Trans. on Robotics and Automation, 12(4), 566-580.
Khatib, O. (1986). Real-time obstacle avoidance for manipulators and mobile robots. The Intl.
J. of Robotics Research, 5(1), 90-98.
Kuffner, Jr., J. J. (1998). Goal-directed navigation for animated characters using realtime path
planning and control. Lecture Notes in Computer Science, 1537,171-186.
Kuipers, B., Froom, R., Lee, W. K., & Pierce, D.(1993). The semantic hierarchy in robot
learning. In J. Connell & S. Mahadevan (Eds.), Robot learning. Boston, MA: Kluwer
Academic Publishers.
LaValle, S. (1998, October). Rapidly-exploring random trees: a new tool for path planning (Tech.
Rep. No. TR 98-11). Computer Science Dept., Iowa State University.
LaValle, S.(2006). Planning algorithms. Cambridge University Press.
LaValle, S., & Hutchinson, S. (1998, Jan). An objective-based framework for motion planning
under sensing and control uncertainties. Intl. J. of Robotics Research, 17(1), 19-42.
Lazanas, A., & Latombe, J. C. (1995). Motion planning with uncertainty: A landmark
approach. Artificial Intelligence, 76(1-2), 285-317.
Liu, Y., & Badler, N. I. (2003, May). Real-time reach planning for animated characters using
hardware acceleration. In Proceedings of computer animation and social agents (p. 86-
93). New Brunswick, NJ: IEEE Computer Society.
Lorensen, W. E., & Cline, H. E. (1987). Marching cubes: a high resolution 3D surface
construction algorithm. Computer Graphics, 22(4), 163-169.
Lozano-Pérez, T., Jones, J. L., Mazer, E., & O’Donnell, P. A. (1989). Task-level planning of
pick-and-place robot motions. Computer, 22(3), 21-29.
Maciejewski, A. A., & Klein, C. A. (1985). Obstacle avoidance for kinematically redundant
manipulators in dynamically varying environments. Intl. J. of Robotics Research, 4(3),
109-117.
Martin, C., & Thrun, S. (2002). Real-time acquisition of compact volumetric maps with mobile
robots. In IEEE intl. conf. on robotics and automation (ICRA). Washington, DC.
Mason, M. T.(2001). Mechanics of robotic manipulation. Cambridge, MA: MIT Press.
Maver, J., & Bajcsy, R. (1993, May). Occlusions as a guide for planning the next view. IEEE Trans. on Pattern Analysis and Machine Intelligence, 15(5), 417-433.
Mei, Y., Lu, Y.-H., Hu, Y. C, & Lee, C. S. G. (2006). Energy-efficient mobile robot exploration.
In Proc. of the IEEE intl. conf. on robotics and automation (ICRA). Orlando, FL.
Mihaylova, L., Lefebvre, T., Bruyninckx, H., Gadeyne, K., & De Schutter, J. (2002). Active sensing for robotics: a survey. In Intl. conf. on numerical methods and applications.
Moravec, H., & Elfes, A. (1985, March). High-resolution maps from wide-angle sonar. In
Proc. of IEEE intl. conf. on robotics and automation (ICRA). St. Louis, MO.
Okada, K., Haneda, A., Nakai, H., Inaba, M., & Inoue, H. (2004, Sept-Oct). Environment
manipulation planner for humanoid robots using task graph that generates action
sequence. In Proc. of the IEEE/RSJ intl. conf. on intelligent robots and systems (IROS).
Sendai, Japan.
Pito, R. (1996). A sensor based solution to the next best view problem. In Proc. IEEE intl. conf. on pattern recognition (Vol. 1, pp. 941-945).
Reif, J. H. (1979). Complexity of the mover’s problem and generalizations. In IEEE
symposium on foundations of computer science (pp. 421-427). San Juan, Puerto Rico.
Reif, J. H., & Sharir, M. (1994). Motion planning in the presence of moving obstacles. Journal
of the Association of Computing Machinery, 41, 764-790.
Sciavicco, L., & Siciliano, B. (2000). Modeling and control of robot manipulators, 2nd ed. London:
Springer-Verlag.
Smith, R., & Cheeseman, P. (1985). On the representation and estimation of spatial uncertainty
(Tech. Rep. Nos. TR 4760, TR 7239). SRI.
Smith, R., Self, M., & Cheeseman, P. (1990). Estimating uncertain spatial relationships in
robotics. In I. Cox & G. Wilfong (Eds.), Autonomous robot vehicles (pp. 167-193).
Springer-Verlag.
Stilman, M., & Kuffner, Jr., J. J. (2007, April). Manipulation planning among movable
obstacles. In To appear in proc. of the IEEE intl. conf. on robotics and automation (ICRA).
Rome.
Sung, R., Corney, J., & Clark, D.(2002). Octree based assembly sequence generation. J. of
Computing and Information Science in Engineering.
Tanner, H. G., & Kyriakopoulos, K. J. (2000). Nonholonomic motion planning for mobile
manipulators. In Proc. of the IEEE intl. conf. on robotics and automation (ICRA) (pp.
1233-1238). San Francisco, CA.
Teller, S. (1998). Automated urban model acquisition: Project rationale and status. In Proc. of
the image understanding workshop (pp. 455-462).
Thoroughman, K. A., & Shadmehr, R.(2000, Oct). Learning of action through adaptive
combination of motor primitives. Nature, 407, 742-747.
Thrun, S., Hähnel, D., Ferguson, D., Montemerlo, M., Triebel, R., Burgard, W., et al. (2003).
A system for volumetric robotic mapping of abandoned mines. In Proc. of the IEEE
intl. conf. on robotics and automation (ICRA).
Turk, G., & Levoy M. (1994, July). Zippered polygon meshes from range images. Computer
Graphics (Proc. of ACM SIGGRAPH), 311-318.
van den Berg, J. P., & Overmars, M. H. (2005, Oct). Roadmap-based motion planning in dynamic environments. IEEE Trans. on Robotics, 21(5).
Wang, C.-C., & Thorpe, C. (2002, May). Simultaneous localization and mapping with
detection and tracking of moving objects. In Proc. of the IEEE intl. conf. on robotics
and automation (ICRA). Washington, D.C.
Yamamoto, Y., Pirjanian, P., Munich, M., DiBernardo, E., Goncalves, L., Ostrowski, J., et al.
(2005). Optical sensing for robot perception and localization. In Proc. of IEEE
workshop on advanced robotics and its social impacts (pp. 14-17).
Yamauchi, B. (1997, July). A frontier based approach for autonomous exploration. In Proc. of
IEEE intl. symp. on computational intelligence in robotics and automation (CIRA).
Monterey, CA.
Yang, X. S., & LaValle, S. (2000, April). A framework for planning feedback motion
strategies based on a random neighborhood graph. In Proc. of the IEEE intl. conf. on
robotics and automation (ICRA) (Vol. 1, pp. 544-549). San Francisco, CA.
Zucker, M., Kuffner, Jr., J. J., & Branicky, M. (2007, April). Multipartite RRTs for rapid
replanning in dynamic environments. In Proc. of the IEEE intl. conf. on robotics and
automation (ICRA). Rome, Italy.
13. Minimum Energy Trajectory Planning for Biped Robots
1. Introduction
Biped robots have the potential to improve the mobility of robotic systems, and they have attracted wide attention in the last decade. Due to their form, it is easy to introduce them into our living space without preparing special infrastructure. There are many theoretical and
experimental studies on the biped robots. Recently, several autonomous biped humanoid
robots have been developed. In 1996, the first autonomous bipedal humanoid robot with
battery power supply was developed by Honda Motor Co., Ltd. (Hirai, et al., 1998).
Takanishi and his co-workers at Waseda University built a human-size bipedal humanoid
robot and proposed a basic control method of whole body cooperative dynamic biped
walking (Yamaguchi et al., 1999). The same group also developed a biped robot having two Stewart platforms as legs (Sugahara et al., 2003). Kaneko and his colleagues at the National
Institute of Advanced Industrial Science and Technology (AIST) also developed a bipedal
humanoid robot of 1.54m height and 58kg weight mainly for the purpose of investigating
fundamental techniques and applications (Kaneko et al., 2004). The same group also proposed a method of biped walking pattern generation by using preview control of the zero-moment point (ZMP) (Kajita et al., 2003). The ZMP and its extension (FRI, the foot-rotation indicator) are basic stability criteria for biped robots (Vukobratovic et al., 2001; Goswami, 1999). Nishiwaki and his co-researchers at the University of Tokyo studied a humanoid walking
control system that generates body trajectories to follow a given desired motion online
(Nishiwaki, et al. 2003). Loffler and his colleagues at Technical University of Munich also
designed a biped robot to achieve a dynamically stable gait pattern (Loffler, et al., 2003).
Generally, careful design is required for development of a bipedal robot. Selection of gears
and actuators, taking their rate, power, torque, and weight into account, is especially
important. Developments of power conversion technology based on semiconductor
switching devices and rare-earth permanent magnet materials Nd-Fe-B in combination with
optimal design of the electromagnetic field by the finite element method enable
improvement of power/weight ratio of actuators. They contribute to the realization of such
autonomous biped robots. However, they are still underpowered to achieve fast walking and running motions in the same way a human does. There have been several studies on running control of biped models (Raibert, 1986; Hodgins, 1996; Kajita et al., 2002). The first human-size running robot was developed at AIST (Nagasaki et al., 2004). In December 2005, Honda Motor Co., Ltd. announced their new bipedal humanoid robot that could run at 6 km/h. Kajita et al. proposed a method to generate a running pattern and reported that it requires at least 28 to 56 times more powerful actuators than those of their actual humanoid robot HRP1, and that the consumption power is estimated to be ten times that of a human runner (Kajita et al., 2002).
In this chapter, a method to generate a trajectory of a running motion with minimum energy consumption is introduced. It is useful to know the lower bound of the consumption energy when we design the biped robot and select actuators. The generation of low-energy trajectories for biped robots has remained an open problem. Usually, it is formulated as an optimal control problem. Since the symbolic expression of the motion equations of robots becomes extremely complicated as the number of links increases, past works investigated only specific, simple robot structures and made simplifying assumptions such as ignoring the effects of centripetal forces (Roussel, et al., 1998). In this chapter, an exact and general formulation of optimal control for biped robots based on a recursive representation of the motion equations is introduced. As an illustrative numerical example, the method is applied to a five-link planar biped robot of 1.2 m height and 60 kg weight. With the generated minimum-energy running pattern at a speed of 1 m/s, the robot consumes only 45 W on average. It is found that large peak power and torque are required at the knee joints, but their consumption power is small and negative, and the main work is done by the hip joints.
The rest of this chapter is organized as follows. In Section 2, the problem definition is introduced and the formulation of the biped running robot is given. The minimization of the consumption energy is explained in Section 3, and the computational scheme is proposed in Section 4. The numerical study of a five-link planar biped robot is provided in Section 5, and conclusions are outlined in Section 6.
where $x_s = (\theta_s, \phi_s)$ and $u_s = (n_s, 0)$. The subscript $s$ represents variables during the support phase. The state equation is given by

$\dot{w}_s = \begin{bmatrix} \dot{x}_s \\ H_s^{-1}(u_s - C_s \dot{x}_s - g_s) \end{bmatrix}$   (6)

where $w_s = (x_s, \dot{x}_s) = (\theta_s, \phi_s, \dot{\theta}_s, \dot{\phi}_s) \in \mathbb{R}^{2N+6}$ is a state vector.
where $T$ is the period of one step, $S$ is the stride, and $K$ is a coordinate conversion matrix,

$K = \begin{bmatrix} I_b & 0 & 0 \\ 0 & 0 & I_\ell \\ 0 & I_\ell & 0 \end{bmatrix}$   (11)

where $I_b$ and $I_\ell$ are identity matrices whose dimensions equal the number of joints in the body and in one leg, respectively.
Since the structure of the dynamics varies depending on the phase, as shown in the previous section, a reflection of the time axis is introduced. A new time axis is given by

$\tau = \begin{cases} \dfrac{t}{\alpha T} & \text{for } 0 \le t \le \alpha T \\[2mm] \dfrac{T - t}{(1-\alpha)T} & \text{for } \alpha T \le t \le T \end{cases}$   (12)

The timing chart of the events is shown in Fig. 2. All variables in the rest of this section are functions of $\tau$. The consumption energy over one step then becomes

$\int_0^1 \left[ \alpha\, \dot{\theta}_s^T n_s + (1-\alpha)\, \dot{\theta}_f^T n_f \right] T \, d\tau$
The state equations (6) for $0 \le t \le \alpha T$ and (8) for $\alpha T \le t \le T$ are transformed onto the $\tau$-axis as follows:

$\dfrac{dw_s}{d\tau} = \alpha T \begin{bmatrix} \dot{x}_s \\ H_s^{-1}(u_s - C_s \dot{x}_s - g_s) \end{bmatrix}$   (14)

$\dfrac{dw_f}{d\tau} = (\alpha - 1) T \begin{bmatrix} \dot{x}_f \\ H_f^{-1}(u_f - C_f \dot{x}_f - g_f) \end{bmatrix}$   (15)
The state variables should include the support phase ratio $\alpha$ in order to find its optimal value as well. The support phase ratio is constant. Then, the following differential equation is introduced:

$\dfrac{d\alpha}{d\tau} = 0$   (16)
Finally, the problem (9), (10) is transformed into a Bolza problem:

minimize

$\tilde{E} = g(z_1) + \int_0^1 \left[ f_0(z(\tau), v(\tau)) + \lambda(\tau)^T \left( f(z(\tau), v(\tau)) - \dfrac{dz(\tau)}{d\tau} \right) \right] d\tau$   (17)

where $z = (w_s, w_f, \alpha) = (x_s, \dot{x}_s, x_f, \dot{x}_f, \alpha) = (\theta_s, \phi_s, \dot{\theta}_s, \dot{\phi}_s, \theta_f, \phi_f, p_f, \dot{\theta}_f, \dot{\phi}_f, \dot{p}_f, \alpha) \in \mathbb{R}^{4N+19}$ is a state vector, $v = (n_s, n_f) \in \mathbb{R}^{2N}$ is an input vector, $\lambda$ is a Lagrange multiplier, $z_1 = z(1)$ is the state at the terminal period, and

$f_0(z, v) = \left[ \alpha\, \dot{\theta}_s^T n_s + (1-\alpha)\, \dot{\theta}_f^T n_f \right] T$   (18)

$f(z, v) = \begin{bmatrix} \alpha T \dot{x}_s \\ \alpha T H_s^{-1}(u_s - C_s \dot{x}_s - g_s) \\ (\alpha - 1) T \dot{x}_f \\ (\alpha - 1) T H_f^{-1}(u_f - C_f \dot{x}_f - g_f) \\ 0 \end{bmatrix}$   (19)

The function $g(z_1)$ represents a penalty for the terminal condition that is introduced to guarantee continuity of the state variables at the instant of taking off:

$g(z_1) = W \Big[ \|\theta_s(1) - \theta_f(1)\|^2 + \|\dot{\theta}_s(1) - \dot{\theta}_f(1)\|^2 + \|\phi_s(1) - \phi_f(1)\|^2 + \|\dot{\phi}_s(1) - \dot{\phi}_f(1)\|^2 + \|p_s(1) - p_f(1)\|^2 + \|\dot{p}_s(1) - \dot{p}_f(1)\|^2 \Big]$   (20)
where $W$ is a weight coefficient. The variables $p_s$ and $\dot{p}_s$ are implicitly defined, and are represented by functions of $w_s$. The variation of the extended objective function $\tilde{E}$ is given by

$\delta\tilde{E} = \lambda_0^T \dfrac{\partial z_0}{\partial z_0'^T} \delta z_0' + \left( \dfrac{\partial g}{\partial z_1} - \lambda_1 \right)^T \delta z_1 + \int_0^1 \left[ \left( \dfrac{\partial f_0}{\partial z} + \dfrac{\partial f^T}{\partial z} \lambda + \dfrac{d\lambda}{d\tau} \right)^T \delta z + \left( \dfrac{\partial f_0}{\partial v} + \dfrac{\partial f^T}{\partial v} \lambda \right)^T \delta v + \left( f - \dfrac{dz}{d\tau} \right)^T \delta\lambda \right] d\tau$   (21)
where $\lambda_0 = \lambda(0)$ and $\lambda_1 = \lambda(1)$. In this equation, we assume that the initial state $z_0$ is a function of certain variables which form a partial set of the state; namely, a part of the initial state is independent and the rest depends on it. Let the independent initial state variables be $z_0' = (\theta_s(0), \phi_s(0), \dot{\theta}_f(0), \dot{\phi}_f(0), \dot{p}_f(0), \alpha(0))$. The rest of the initial state is determined by

$\theta_f(0) = K\, \theta_s(0)$   (22)

$\phi_f(0) = \phi_s(0)$   (23)

$p_f(0) = h(\theta_s(0), \phi_s(0)) + S$   (24)

$\begin{bmatrix} \dot{\theta}_s(0) \\ \dot{\phi}_s(0) \end{bmatrix} = \begin{bmatrix} K & 0 & 0 \\ 0 & I & 0 \end{bmatrix} \left( I - H_f^{-1} \tilde{J}_f^T \big( \tilde{J}_f H_f^{-1} \tilde{J}_f^T \big)^{-1} \tilde{J}_f \right) \begin{bmatrix} \dot{\theta}_f(0) \\ \dot{\phi}_f(0) \\ \dot{p}_f(0) \end{bmatrix}$   (25)
The first three equations are the coordinate conversion at the instant of landing, and the last is the condition of a perfectly inelastic collision at the instant of landing. Let the impulsive external force at the foot of the support leg be $\delta f$. The impact force $\delta f$ is applied at the instant of landing, and the generalized velocity changes discontinuously. From (1), the generalized momentum after the collision is given by

$H \dot{x}^+ = H \dot{x}^- + \tilde{J}^T \delta f$   (26)

where $\dot{x}^+$ and $\dot{x}^-$ denote the generalized velocities after and before the collision, respectively. $\tilde{J} = [J, R, I]$ is an extended Jacobian. Since the phase after the collision is the support phase, the condition (3) holds, namely,

$\tilde{J} \dot{x}^+ = 0$   (27)

Writing (26) and (27) in terms of $\dot{x}^+$ and $\delta f$, the following equation is obtained:

$\begin{bmatrix} H & -\tilde{J}^T \\ \tilde{J} & 0 \end{bmatrix} \begin{bmatrix} \dot{x}^+ \\ \delta f \end{bmatrix} = \begin{bmatrix} H \dot{x}^- \\ 0 \end{bmatrix}$   (28)

Eliminating $\delta f$ from (28), we have

$\dot{x}^+ = \left( I - H^{-1} \tilde{J}^T \big( \tilde{J} H^{-1} \tilde{J}^T \big)^{-1} \tilde{J} \right) \dot{x}^-$   (29)

Here, $\dot{x}^+$ corresponds to $\dot{x}_s(0)$, the generalized velocity of the support phase at $\tau = 0$, and $\dot{x}^-$ corresponds to $\dot{x}_f(0)$, the generalized velocity of the flight phase at $\tau = 0$. Taking into account the coordinate conversion between the left and right legs, (29) is transformed into the form of (25).
From (21), the following conditions are obtained:

$\lambda_1 = \dfrac{\partial g}{\partial z_1}$   (30)

$\dfrac{d\lambda}{d\tau} = -\dfrac{\partial f_0}{\partial z} - \dfrac{\partial f^T}{\partial z} \lambda$   (31)

$\dfrac{dz}{d\tau} = f$   (32)
$\dfrac{\partial f}{\partial z^T} = T \begin{bmatrix} 0 & \alpha I & 0 & 0 & \dot{x}_s \\ \alpha \dfrac{\partial f_s}{\partial x_s^T} & \alpha \dfrac{\partial f_s}{\partial \dot{x}_s^T} & 0 & 0 & f_s \\ 0 & 0 & 0 & (\alpha - 1) I & \dot{x}_f \\ 0 & 0 & (\alpha - 1) \dfrac{\partial f_f}{\partial x_f^T} & (\alpha - 1) \dfrac{\partial f_f}{\partial \dot{x}_f^T} & f_f \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$   (35)

where

$f_s = H_s^{-1}(u_s - C_s \dot{x}_s - g_s)$   (36)

$f_f = H_f^{-1}(u_f - C_f \dot{x}_f - g_f)$   (37)

And then,

$\dfrac{\partial f_s}{\partial x_{si}} = -H_s^{-1}\left( \dfrac{\partial C_s}{\partial x_{si}} \dot{x}_s + \dfrac{\partial g_s}{\partial x_{si}} \right) - H_s^{-1} \dfrac{\partial H_s}{\partial x_{si}} f_s$   (38)

$\dfrac{\partial f_f}{\partial x_{fi}} = -H_f^{-1}\left( \dfrac{\partial C_f}{\partial x_{fi}} \dot{x}_f + \dfrac{\partial g_f}{\partial x_{fi}} \right) - H_f^{-1} \dfrac{\partial H_f}{\partial x_{fi}} f_f$   (39)

$\dfrac{\partial f}{\partial v^T} = \begin{bmatrix} 0 & 0 \\ \alpha T H_s^{-1} P_s & 0 \\ 0 & 0 \\ 0 & (\alpha - 1) T H_f^{-1} P_f \\ 0 & 0 \end{bmatrix}$   (40)

where $P_s = [I, 0]^T \in \mathbb{R}^{(N+3) \times N}$ and $P_f = [I, 0, 0]^T \in \mathbb{R}^{(N+6) \times N}$ are selection matrices. And also,
$\dfrac{\partial f_0}{\partial z} = \begin{bmatrix} 0 \\ 0 \\ \alpha T n_s \\ 0 \\ 0 \\ 0 \\ 0 \\ (1-\alpha) T n_f \\ 0 \\ 0 \\ \left( \dot{\theta}_s^T n_s - \dot{\theta}_f^T n_f \right) T \end{bmatrix}$   (41)

$\dfrac{\partial f_0}{\partial v} = \begin{bmatrix} \alpha T \dot{\theta}_s \\ (1-\alpha) T \dot{\theta}_f \end{bmatrix}$   (42)
$\dfrac{\partial z_0}{\partial z_0'^T} = \begin{bmatrix} I & 0 & 0 \\ \Gamma & \Lambda & 0 \\ \tilde{K} & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & 1 \end{bmatrix}$   (43)
$\dfrac{\partial g}{\partial z_1} = 2W \begin{bmatrix} (\theta_{s1} - \theta_{f1}) + J_s^T (h_s - p_{f1}) + \left[ \dot{\theta}_{s1}^T \dfrac{\partial J_s^T}{\partial \theta_{s1i}} (J_s \dot{\theta}_{s1} - \dot{p}_{f1}) \right]_i \\ (\phi_{s1} - \phi_{f1}) + R_s^T (h_s - p_{f1}) + \left[ \dot{\phi}_{s1}^T \dfrac{\partial R_s^T}{\partial \phi_{s1i}} (R_s \dot{\phi}_{s1} - \dot{p}_{f1}) \right]_i \\ (\dot{\theta}_{s1} - \dot{\theta}_{f1}) + J_s^T (J_s \dot{\theta}_{s1} - \dot{p}_{f1}) \\ (\dot{\phi}_{s1} - \dot{\phi}_{f1}) + R_s^T (R_s \dot{\phi}_{s1} - \dot{p}_{f1}) \\ -(\theta_{s1} - \theta_{f1}) \\ -(\phi_{s1} - \phi_{f1}) \\ -(h_s - p_{f1}) \\ -(\dot{\theta}_{s1} - \dot{\theta}_{f1}) \\ -(\dot{\phi}_{s1} - \dot{\phi}_{f1}) \\ -(J_s \dot{\theta}_{s1} - \dot{p}_{f1}) \\ 0 \end{bmatrix}$   (44)
where the subscript 1 corresponds to the value at $\tau = 1$, and

$\tilde{K} = \begin{bmatrix} K & 0 \\ 0 & I \\ J_s & R_s \end{bmatrix}$   (45)

$\Lambda = \tilde{K}_l \left( I - H_f^{-1} \tilde{J}_f^T \big( \tilde{J}_f H_f^{-1} \tilde{J}_f^T \big)^{-1} \tilde{J}_f \right)$   (46)

$\tilde{K}_l = \begin{bmatrix} K & 0 & 0 \\ 0 & I & 0 \end{bmatrix}$   (47)

$\Gamma = \left[ \dfrac{\partial \Lambda}{\partial x_{si}} \dot{x}_f \right]$   (48)

$\dfrac{\partial \Lambda}{\partial x_{si}} = \tilde{K}_l \left[ H_f^{-1} \dfrac{\partial H_f}{\partial x_{fi}} H_f^{-1} \tilde{J}_f^T \Omega \tilde{J}_f - H_f^{-1} \dfrac{\partial \tilde{J}_f^T}{\partial x_{fi}} \Omega \tilde{J}_f - H_f^{-1} \tilde{J}_f^T \Omega \dfrac{\partial \tilde{J}_f}{\partial x_{fi}} + H_f^{-1} \tilde{J}_f^T \Omega \left( \dfrac{\partial \tilde{J}_f}{\partial x_{fi}} H_f^{-1} \tilde{J}_f^T - \tilde{J}_f H_f^{-1} \dfrac{\partial H_f}{\partial x_{fi}} H_f^{-1} \tilde{J}_f^T + \tilde{J}_f H_f^{-1} \dfrac{\partial \tilde{J}_f^T}{\partial x_{fi}} \right) \Omega \tilde{J}_f \right] \tilde{K}_r$   (49)

$\Omega = \big( \tilde{J}_f H_f^{-1} \tilde{J}_f^T \big)^{-1}$   (50)

$\tilde{K}_r = \begin{bmatrix} K & 0 \\ 0 & I \\ 0 & 0 \end{bmatrix}$   (51)
The partial derivatives that appear in (38), (39), (44), (48), and (49) are computed by using modified Newton-Euler formulations.
Quantity                         hip     knee
Peak angular velocity [rad/s]    14.5    11.6
Peak torque [N.m]                48.2    117.6
Peak power (positive) [W]        355     1265
Peak power (negative) [W]        -699    -849
Consumption power [W]            27.9    -5.37
Total consumption power [W]      45.1
Table 2. Actuator requirements.
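The total in the last row is consistent with summing the average consumption power over the four actuated joints (two hips and two knees), assuming the per-joint figures refer to a single hip and a single knee:

$2 \times 27.9\,\mathrm{W} + 2 \times (-5.37\,\mathrm{W}) \approx 45.1\,\mathrm{W}$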
Table 2 shows the requirements for the actuators based on this result. It is found that a very large peak power is required for the knee joints. However, their total consumption power has a small negative value. Therefore, the main work is done by the hip joints. Since the negative power is also large, for real robots the introduction of energy regeneration mechanisms, such as elastic actuators or a combination of highly back-drivable actuators and bidirectional power converters, is effective in reducing the total consumption power.
Fig. 3. Joint angles: (a) hip joint, (b) knee joint (solid line: right leg, dashed line: left leg).
Fig. 4. Angular velocities of joints: (a) hip joint, (b) knee joint (solid line: right leg, dashed line: left leg).
Fig. 5. Joint torques: (a) hip joint, (b) knee joint (solid line: right leg, dashed line: left leg).
Fig. 6. Joint powers: (a) hip joint, (b) knee joint (solid line: right leg, dashed line: left leg).
6. Conclusion
In this chapter, a method to generate a trajectory of a running motion with minimum energy consumption has been proposed. It is useful to know the lower bound of the consumption energy when we design a bipedal robot and select actuators. An exact and general formulation of optimal control for biped robots, based on a recursive representation of the motion equations, is proposed to solve exactly for the minimum energy consumption trajectories. Through the numerical study of a five-link planar biped robot, it is found that large peak power and torque are required at the knee joints, but their consumption power is small, and the main work is done by the hip joints.
8. References
Fujimoto, Y. & Kawamura, A. (1995). Three Dimensional Digital Simulation and
Autonomous Walking Control for Eight-axis Biped Robot, Proceedings of IEEE
14. Real-time Vision Based Mouth Tracking and Parameterization for a Humanoid Imitation Task
1. Introduction
Robust real-time stereo facial feature tracking is one of the important research topics for a variety of multimodal human-computer and human-robot interface applications, including telepresence, face recognition, multimodal voice recognition, and perceptual user interfaces (Moghaddam et al., 1996; Moghaddam et al., 1998; Yehia et al., 1998). Since the motion of a person's facial features and the direction of the gaze are largely related to the person's intention and attention, detection of such motions with their real 3D measurement values can be
utilized as a natural way of communication for human robot interaction. For example,
addition of visual speech information to robot's speech recognizer unit clearly meets at least
two practicable criteria: It mimics human visual perception of speech recognition, and it
may contain information that is not always present in the acoustic domain (Gurbuz et al.,
2001). Another application example is enhancing the social interaction between humans and
humanoid agents with robots learning human-like mouth movements from human trainers
during speech (Gurbuz et al., 2004; Gurbuz et al., 2005).
The motivation of this research is to develop an algorithm to track the facial features using
stereo vision system in real world conditions without using prior training data. We also
demonstrate the stereo tracking system through a human to humanoid robot mouth
mimicking task. Videre stereo vision hardware and SVS software system are used for
implementing the algorithm.
This work is organized as follows. In Section 2, related earlier works are described. Section 3 discusses face ROI localization. Section 4 presents the 2D lip contour tracking and its extension to 3D. Experimental results and discussions are presented in Section 5. Conclusions are given in Section 6. Finally, a future extension is described in Section 7.
2. Related Work
Most previous approaches to facial feature tracking utilize skin tone based segmentation
from a single camera exclusively (Yang & Waibel, 1996; Wu et al., 1999; Hsu et al., 2002;
Terrillon & Akamatsu, 1999; Chai & Ngan, 1999). However, color information is very
sensitive to lighting conditions, and it is very difficult to adapt the skin tone model to a
dynamically changing environment in real-time.
Kawato and Tetsutani (2004) proposed a monocular eye tracking technique based on a six-segmented rectangular (SSR) filter, which operates on integral images (Viola & Jones, 2001).
Support vector machine (SVM) classification is employed to verify pattern between the eyes
passed from the SSR filter. This approach is very attractive and fast. However, it doesn't
benefit from stereo depth information. Also SVM verification fails when the eyebrows are
covered by the hair or when the lighting conditions are significantly different than the SVM
training conditions.
Newman et al. (2000) and Matsumoto et al. (1999) proposed to use a 3D model fitting technique based on virtual springs for 3D facial feature tracking. In the 3D feature tracking
stage each facial feature is assumed to have a small motion between the current frame and
the previous one, and the 2D position in the previous frame is utilized to determine the
search area in the current frame. The feature images stored in the 3D facial model are used
as templates, and the right image is used as a search area first. Then this matched image from 2D feature tracking is used as a template in the left image. As a result, the 3D coordinates of each facial feature are calculated. This approach requires a 3D facial model beforehand. For example, an error in the selection of a 3D facial model for the user may cause inaccurate tracking results.
Russakoff and Herman (2000) proposed to use stereo vision system for foreground and
background segmentation for head tracking. Then, they fit a torso model to the segmented
foreground data at each image frame. In this approach, background needs to be modeled
first, and then the algorithm selects the largest connected component in the foreground for
head tracking.
Although all approaches reported success under broad conditions, the prior knowledge about the user model or the requirement of modeling the background creates a disadvantage for many practical uses. The proposed work extends these efforts to a universal 3D facial feature tracking system by adopting the six-segmented filter approach of Kawato and Tetsutani (2004) for locating the eye candidates in the left image and utilizing the stereo information for verification. The 3D measurement data from the stereo system allow verifying universal properties of the facial features, such as the convex curvature shape of the nose, explicitly, while such information is not directly present in the 2D image data. Thus,
stereo tracking not only makes tracking possible in 3D, but also makes tracking more robust.
We will also describe an online lip color learning algorithm which doesn't require prior
knowledge about the user for mouth outer contour tracking in 3D.
selection of the mouth ROI. In the current implementation, the system tracks only the person closest to the camera, but it can easily be extended to a multiple face tracking algorithm.
where Si denotes the integral value of the intensity of a segment in the maximum filter shown
in Fig. 2, and j is the center location of the filter in the current integral intensity profile. The
filter is convolved with the integral intensity profile of every row segment. A row segment
typically extends over 5 to 10 rows of the face ROI image, and a face ROI image typically
contains 20 row segments. Integral intensity profiles of row segments are processed to find
their hull points (see Fig. 1) using Equation 1 until either the end of the face ROI is reached or until Eqn. 1 is no longer satisfied. For the refinement process, we found that the first derivative of the 3D surface data as well as the first derivative of the intensity at the nose tip are maximum, and the second derivative is zero at the nostril level (Gurbuz et al., 2004a).
Fig. 1. Nose bridge line using its convex hull points from integral intensity projections.
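A minimal sketch of computing the integral intensity profile of one row segment of the face ROI is given below; the segment height and the array layout are assumptions for illustration, and the actual filter of Equation 1 and Fig. 2 is not reproduced here:

import numpy as np

def row_segment_profile(gray_roi, row_start, rows_per_segment=8):
    # Column-wise sums of one row segment, accumulated from left to right,
    # so the integral over any horizontal span is a difference of two values.
    segment = gray_roi[row_start:row_start + rows_per_segment, :]
    column_sums = segment.sum(axis=0)
    return np.concatenate(([0.0], np.cumsum(column_sums)))

def span_intensity(profile, left, right):
    # Integral of the segment intensity over columns [left, right).
    return profile[right] - profile[left]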
4. Lip Tracking
The nose tip location is then utilized for the initial mouth ROI selection. The human mouth has dynamic behavior and even dynamic colors, as well as the presence or absence of tongue and teeth. Therefore, at this stage, maximum-likelihood estimates of the class conditional densities for subsets of the lip (w1) and non-lip (w2) classes are formed in real time for the Bayes decision rule from the left camera image. That is, multivariate class conditional Gaussian density parameters are estimated for every image frame using an unsupervised maximum-likelihood estimation method.
4.1 Online Learning and Extraction of Lip and Non-lip Data Samples
In order to alleviate the influence of ambient lighting on the sample class data,
chromatic color transformation is adopted for color representation (Chiang et al., 2003;
Yang et al., 1998). It was pointed out (Yang et al., 1998) that human skin colors are less
variant in the chromatic color space than the RGB color space. Although in general the
skin-color distribution of each individual may be modeled by a multivariate normal
distribution, the parameters of the distribution for different people and different
lighting conditions are significantly different. Therefore, online learning and sample
data extraction are important keys for handling different skin-tone colors and lighting
changes. To solve these two issues, an adaptation approach was proposed to transform the previously developed color model into the new environment by a combination of known parameters from the previous frames. This approach has two drawbacks in general. First, it requires an initial model to start, and second, it may fail in the case where a different user with a completely different skin-tone color starts using the system.
We propose an online learning approach to extract sample data for lip and non-lip classes to
estimate their distribution in real time. Chiang et al. (2003) provide hints for this approach in their work. They pointed out that lip colors are distributed at the lower range of the green channel in the (r,g) plane. Fig. 4 shows an example distribution of lip and non-lip colors in the normalized (r,g) space.
Utilizing the nose tip, time-dependent (r,g) spaces for lip and non-lip are estimated for every frame by allowing a fixed percentage (typically 10%) of the non-lip points to stay within the lip (r,g) space, as shown in Fig. 4. Then, using the obtained (r,g) space information in the initial classification, the pixels below the nostril line that fall within the lip space are considered lip pixels, and the other pixels are considered non-lip pixels in the sample data set extraction process; the RGB color values of the pixels are stored as class attributes, respectively.
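A small sketch of this sample extraction step, assuming the nostril line and the per-frame green-channel threshold have already been estimated (the function and parameter names are illustrative):

import numpy as np

def chromatic_rg(image_rgb):
    # Chromatic (r, g) coordinates: intensity-normalized red and green.
    # Assumes an H x W x 3 array in RGB channel order.
    rgb = image_rgb.astype(float)
    total = rgb.sum(axis=2) + 1e-9          # avoid division by zero
    return rgb[..., 0] / total, rgb[..., 1] / total

def split_lip_samples(image_rgb, below_nostril_mask, g_threshold):
    # Label pixels below the nostril line as lip / non-lip candidates using
    # the per-frame green-channel threshold in (r, g) space; lip colors lie
    # in the lower range of the g channel, as noted in the text.
    _, g = chromatic_rg(image_rgb)
    lip_mask = below_nostril_mask & (g < g_threshold)
    nonlip_mask = below_nostril_mask & ~lip_mask
    return image_rgb[lip_mask], image_rgb[nonlip_mask]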
Fig. 3. Left image: result of the Bayes decision rule, its vertical projection (bottom) and
integral projection of intensity plane between nose and chin (right). Middle image:
estimated outer lip contour using the result of the Bayes rule. Right image: a parameterized
outer lip contour.
(2)
Using the same concept in Eqn. 2, we also separate the non-lip data samples into subsets
according to intensity average of the non-lip class. Fig. 5 depicts simplified conditional
density plots in 1D for the subsets of an assumed non-lip class.
$p(x \mid w_i) = \dfrac{1}{(2\pi)^{n/2}\, \lvert \Sigma_i \rvert^{1/2}} \exp\!\left( -\dfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) \right)$   (3)

where $i$ may be $w_1$, or $w_2$, or a subset of a class. $\mu_i = E[x]$ is the mean value of the $i$-th class. $\Sigma_i$ is the n x n (in this work, n is the number of color attributes, so n = 3) covariance matrix defined as

$\Sigma_i = E\!\left[ (x - \mu_i)(x - \mu_i)^T \right]$   (4)

where $\lvert \cdot \rvert$ represents the determinant operation, and $E[\cdot]$ represents the expected value of a random variable. Unbiased estimates of the parameters $\mu_i$ and $\Sigma_i$ are obtained by using the sample mean and sample covariance matrix.
$x \in w_1 \ \text{if} \ p(w_1 \mid x) > p(w_2 \mid x), \quad \text{otherwise} \ x \in w_2$   (5)

where $p(w_i \mid x)$ is the a posteriori probability of $w_i$ given $x$. Equation (5) shows that if the probability of $w_1$ given $x$ is larger than the probability of $w_2$, then $x$ is declared as belonging to $w_1$, and vice versa. Since direct calculation of $p(w_i \mid x)$ is not practical, we can re-write the
a posteriori probability of $w_i$ using Bayes' Theorem in terms of the a priori probability and the conditional density function $p(x \mid w_i)$, as

$p(w_i \mid x) = \dfrac{p(x \mid w_i)\, p(w_i)}{p(x)}$   (6)

where $p(x)$ is the density function and is a positive constant for all classes. Then, re-arranging both sides, we obtain
(7)
where L(x) is called the likelihood ratio, and p(w1)/p(w2) is called the threshold value of the likelihood ratio for the decision. Because of the exponential form of the densities involved in Equation (7), it is preferable to work with the monotonic discriminant functions obtained by taking the logarithm, as follows.
(8)
thus, by re-arranging Equation (8), we get
(9)
where the remaining term is a constant for this image frame. In general, Equation (9) has only a nonlinear quadratic form and a summation, and using this equation, the Bayes rule can be implemented for real-time lip tracking as follows.
(10)
where i ∈ {w1, w2}, referring to Fig. 5. The threshold value of the likelihood ratio, as shown in Eqn. (7), is based on the a priori class probabilities. In our implementation, equally likely a priori class probabilities are assumed.
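A compact sketch of the resulting per-frame classifier under equal priors is given below; it is illustrative only, estimating the two Gaussian class-conditional densities from the extracted samples and labelling a pixel color by comparing log-likelihoods:

import numpy as np

def fit_gaussian(samples):
    # Sample mean and covariance of one class (color samples, shape (n, 3)).
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def log_likelihood(x, mu, sigma):
    # Log of the multivariate normal density, up to the common constant.
    diff = x - mu
    inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (diff @ inv @ diff + logdet)

def is_lip(x, lip_params, nonlip_params):
    # Bayes rule with equal priors: pick the class with the larger
    # class-conditional likelihood.
    return log_likelihood(x, *lip_params) > log_likelihood(x, *nonlip_params)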
(10)
The basic form used in the elliptical parameter estimation in matrix notation is Ma = 0, where a = [a1 ... a6]^T. The dimensionality of M is the number of points, N, in the segment multiplied by 6 (that is, N x 6). Each row of M corresponds to one point in the segment. The parameters of each contour are then solved using the least-squares method to find the ai, where i = 1, 2, ..., 6.
Using the estimated parameters, parametric lip contour data can be re-generated for each image frame. Five points are sufficient to represent a general elliptical shape, leading to a significant data reduction and representation.
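One way to realize the least-squares solution of Ma = 0 is through the singular value decomposition; the particular choice of the six conic coefficients below is an assumption for illustration, since the chapter does not spell out the columns of M:

import numpy as np

def fit_conic(points):
    # Least-squares solution of M a = 0 for a general conic
    # a1*x^2 + a2*x*y + a3*y^2 + a4*x + a5*y + a6 = 0.
    # The minimizer with ||a|| = 1 is the right singular vector
    # associated with the smallest singular value of M.
    x, y = np.asarray(points, dtype=float).T
    M = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(M)
    return vt[-1]          # a = [a1, ..., a6]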
Fig. 6. Screen capture of tracked outer lip contours for various skin tone colors and different
lighting conditions.
Fig. 7. Screen capture of the left and right camera images, and their disparity map around
the face region of interest.
Fig. 8. Screen capture of the left and right camera images, and OpenGL plot of texture
mapped 3D face reconstructed by the stereo algorithm.
6. Conclusions
A new method for stereo facial feature tracking of individuals in real world conditions is
described. Specifically, stereo face verification and an unsupervised online parameter estimation algorithm for the Bayes rule are proposed. That is, a speaker's lip colors are learned from the current image frame using the nose tip as a reference point. Vertical and horizontal integral projections are utilized to guide the algorithm in picking out the correct lip contour. In the final stage, the estimated outer contour data for every image frame of the left camera are parameterized as a generalized ellipse. Then, utilizing the contour pixel locations in the left camera image and their disparities with respect to the right camera image, we calculate their 3D
(X,Y,Z) coordinates with respect to the camera coordinate system. Future work for vision
includes extraction of 3D coordinates of other facial points such as eyebrows, chin and
cheek, and further extending the work to the multiple face tracking algorithm.
Fig. 9. Future extension to a TTS based speech articulation system for Humanoids.
Fig. 10. Infanoid robot utilized for human to humanoid robot mouth imitation task.
8. Acknowledgment
This research was conducted as part of “Research on Human Communication“ with
funding from the National Institute of Information and Communications Technology
(NiCT), Japan. Thanks are due to Dr. Hideki Kozima for the use of his Infanoid robot and
Shinjiro Kawato for the original eye tracking work extended in this paper.
9. References
Black, A., & Taylor, P. (1997). The Festival Speech Synthesis System. University of Edinburgh.
Chai, D. & Ngan, K. N., (1999). Face segmentation using skin-color map in videophone
applications. IEEE Trans. on Circuits and Systems for Video Technology 9 (4),
551-564.
Chang, S., Shari, L. & Greenberg, S. (2000). Automatic phonetic transcription
of spontaneous speech (American English), International Conference on Spoken Language
Processing, Beijing, China.
Chiang, C. C., Tai, W. K., Yang, M. T., Huang, Y. T. & Huang, C. J., (2003). A novel method
for detecting lips, eyes and faces in real-time. Real-Time Imaging 9, 277-287.
Gurbuz, S., Shimizu, T. & Cheng, G. (2005). Real-time stereo facial feature tracking:
Mimicking human mouth movement on a humanoid robot head. IEEE-RAS/RSJ
International Conference on Humanoid Robots (Humanoids 2005).
Gurbuz, S., Kinoshita, K., & Kawato, S., (2004a). Real-time human nose bridge tracking in
presence of geometry and illumination changes. Second International Workshop
on Man-Machine Symbiotic Systems, Kyoto, Japan.
Gurbuz, S., Kinoshita, K., Riley, M. & Yano, S., (2004b). Biologically valid jaw movements
for talking humanoid robots. IEEE-RAS/RSJ International Conference on
Humanoid Robots (Humanoids 2004), Los Angeles, CA, USA.
Gurbuz, S., Tufekci, Z., Patterson, E. & Gowdy, J., (2001). Application of affine invariant
Fourier descriptors to lipreading for audio-visual speech recognition. Proceedings
of ICASSP.
Hsu, R. L., Abdel-Mottaleb, M. & Jain, A. K., (2002). Face detection in color images. IEEE
Trans. on PAMI 24 (5), 696-706.
Kawato, S. & Tetsutani, N., (2004). Scale adaptive face detection and tracking in real time
with SSR filter and support vector machine. Proc. of ACCV, vol. 1.
Kozima, H., (2000). NICT infanoid: An experimental tool for developmental psycho-
robotics. International Workshop on Developmental Study, Tokyo.
Matsumoto, Y. & Zelinsky, A., (1999). Real-time face tracking system for human robot
interaction. Proceedings of 1999 IEEE International Conference on Systems, Man,
and Cybernetics (SMC'99). pp. 830-835.
Moghaddam, B., Nastar, C. & Pentland, A., (1996). Bayesian face recognition using
deformable intensity surfaces. In: IEEE Conf. on Computer Vision and Pattern
Recognition.
Moghaddam, B., Wahid, W. & Pentland, A., (1998). Beyond eigenfaces: Probabilistic
matching for face recognition. In: International Conference on Automatic Face and
Gesture Recognition.
Newman, R., Matsumoto, Y., Rougeaux, S. & Zelinsky, A., (2000). Real-time stereo tracking
for head pose and gaze estimation. Proceedings. Fourth IEEE International
Conference on Automatic Face and Gesture Recognition.
NICT-Japan, Infanoid project. http://www2.nict.go.jp/jt/a134/infanoid/robot-eng.html.
Russakoff, D. & Herman, M., (2000). Head tracking using stereo. Fifth IEEE Workshop on
Applications of Computer Vision.
Sethy, A. & Narayanan, S. (2002). Refined speech segmentation for concatenative speech synthesis, International Conference on Spoken Language Processing, Denver, Colorado.
Terrillon, J. C. & Akamatsu, S., (1999). Comparative performance of different chrominance
spaces for color segmentation and detection of human faces in complex scene
images. Proc. 12th Conf. on Vision Interface. pp. 180-187.
Viola, P. & Jones, M., (2001). Robust real-time object detection. Second International
Workshop on Statistical and Computational Theories of Vision-Modeling,
Learning, Computing, and Sampling, Vancouver, Canada,.
Wu, H., Cheng, Q. & Yachida, M., (1999). Face detection from color images using a fuzzy
pattern matching method. IEEE Trans. on PAMI 21 (6), 557-563.
Yang, J., Stiefelhagen, R., Meier, U. & Waibel, A., (1998). Visual tracking for multimodal
human computer interaction. Proceedings of the SIGCHI conference on Human
factors in computing systems.
Yang, J. & Waibel, A., (1996). A real-time face tracker. In: Proc. 3rd IEEE Workshop on
Application of Computer Vision. pp. 142-147.
Yehia, H., Rubin, P.E., & Vatikiotis-Bateson, E. (1998). “Quantitative association of vocal
tract and facial behavior,” Speech Communication, no. 26, pp. 23–44.
15. Clustered Regression Control of a Biped Robot Model
1. Introduction
Control of a biped walking mechanism is a very challenging multivariable problem, the
system being highly nonlinear, high-dimensional, and inherently unstable. In almost any
realistic case the exact dynamic equations of a walking robot are too complicated to be
utilized in the control solutions, or even impossible to write in closed form.
Data-based modelling methods try to form a model of the system using only observation
data collected from the system inputs and outputs. Traditionally, the data oriented methods
are used to construct a global black-box model of the system explaining the whole sample
data within one single function structure. Feedforward neural networks, as presented in
(Haykin, 1999), for example, typically map the input to the output with very complicated
and multilayered grid of neurons and the analysis of the whole net is hardly possible. Local
learning methods (Atkeson et al., 1997), on the other hand, offer a more structured approach
to the problem. The overall mapping is formed using several local models, which have a
simple internal structure but are alone valid only in small regions of the input-output space.
Typically, the local models used are linear, which ensures the scalability of the model
structure: Simple systems can be modelled, as well as more complex ones, using the same
structure, only the number of the local models varies.
In robotics, local modelling has been used quite successfully to form inverse dynamics
or kinematic mappings that have then been applied as a part of the actual controller
(Vijayakumar et al., 2002). However, when trying to cover the whole high-dimensional
input-output space, the number of local models increases rapidly. Additionally,
external reference signals are needed for the controller to get the system function as
desired.
To evaluate the assumption of simple local models, a feedback structure based on linear
local models, clustered regression, is used here to implement the gait of a biped walking
robot model. The local models are based on principal component analysis (see Basilevsky,
1994) of the local data. Instead of mapping the complete inverse dynamics of the biped,
only one gait trajectory is considered here. This means that the walking behaviour is
stored in the model structure. Given the current state of the system, the model output
estimate is directly used as the next control signal value and no additional control
solutions or reference signals are needed. The walking cycle can become automated, so that
no higher-level control is needed.
This text summarizes and extends the presentation in (Haavisto & Hyötyniemi, 2005).
2. Biped model
2.1 Structure of the mechanism
The biped model used in the simulations is a two-dimensional, five-link system which has a
torso and two identical legs with knees. To ensure the possibility of mathematical
simulation, the biped model was chosen to be quite simple. It can, however, describe the
walking motion rather well and is therefore applied in slightly different forms by many
researchers (see e.g. Chevallereau et al., 2003; Hardt et al., 1999; Juang, 2000).
Fig. 1 shows the biped and the coordinates used. Each link of the robot is assumed to be a
rigid body with uniformly distributed mass so that the centre of mass is located in the
middle of the link. The interaction with the walking surface is modelled by adding external
forces F to the leg tips. As the leg touches the ground, the corresponding forces are switched
on to support the leg. The actual control of the biped gait is carried out using the joint
moments M, which are applied to both thighs and knees.
Fig. 1. The coordinates used (left) and the external forces (right).
As such, the system has seven degrees of freedom, and seven coordinates must be chosen to
describe the configuration of the biped. The coordinate vector now is
$q = \begin{bmatrix} x_0 & y_0 & \alpha & \beta_L & \beta_R & \gamma_L & \gamma_R \end{bmatrix}^T$.   (1)
Here the subindices L and R refer to the “left” and “right” leg of the biped, respectively. The
dynamic equations of the system can be derived using Lagrangian mechanics (Wells, 1967)
and the result has the following form:
A(q) q b(q, q , M , F ) , (2)
where the “dotted variables” denote time derivatives of the coordinate vector. The exact
formulas of the inertia matrix A(q) and the right hand side vector are quite complex and
therefore omitted here (see Haavisto & Hyötyniemi, 2004). The minimum dimension n of
the state vector representing the system state is 14 (the 7 coordinates and their time
derivatives); however, because of the efficient compression carried out by the proposed
data-based modelling technique, the dimension of the “extended state” could be higher,
including measurements that are not necessarily of any use in modelling (redundancy
among the measurements being utilized for elimination of noise).
2.2 Simulator
To simulate the biped walking on a surface, a Matlab/Simulink (MathWorks, 1994-2007)
block was developed. It handles internally the calculation of the support forces F using
separate PD controllers for each force component: The force opposing the movement is
the stronger the farther the foot tip has gone beneath the ground surface (P term), and the
faster the foot tip is moving downward (D term). By adjusting these controller
parameters, different kinds of ground surface properties can be simulated (very hard
surfaces, however, resulting in high force peaks, so that short simulation step sizes have
to be applied). Also the knee angles are automatically limited to a range defined by the
user. To obtain this, the simulation block adds an additional moment to the knee joints if
the angle is exceeding or going under the given limits. This limitation can be used for
example to prevent the knee turning to the “wrong” direction, that is, to keep the angles
JL and JR positive.
The input of the simulation block is the four dimensional moment vector
T
M M L1 M R1 M L2 M R2 , (3)
and the output the 14-dimensional state of the system augmented with leg tip touch sensor
values sL and sR is
T
q T
q T sL sR . (4)
If the leg is touching the ground, the sensor value equals to 1, otherwise 0.
The biped dynamics, ground support forces and knee angle limiters are simulated in
continuous time, but the input and output of the block are discretized using zero order hold.
This allows the use of discrete-time control methods. A more detailed description of the
used biped model and simulation tool is presented in the documentation (Haavisto &
Hyötyniemi, 2004).
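A minimal sketch of the vertical component of such a PD ground model is given below; the gains and the contact convention are illustrative, not the values used in the simulation tool:

def support_force(foot_y, foot_vy, k_p=2.0e4, k_d=2.0e2):
    # Vertical ground reaction for one leg tip: the deeper the tip is below
    # the surface (P term) and the faster it moves downward (D term), the
    # larger the opposing force.
    if foot_y >= 0.0:                     # tip above ground: no contact
        return 0.0
    penetration = -foot_y                 # positive depth below the surface
    downward_speed = max(-foot_vy, 0.0)
    return k_p * penetration + k_d * downward_speed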
Fig. 2. The sample data were produced by simulating the PD controlled biped
The PD controller was able to control the biped so that a cyclic gait movement was attained.
To reach more varying data, independent white noise was added to each moment signal
component. The input and output data of the biped system were collected of a 40 seconds
walk, during which the biped took over 40 steps. The sampling interval was equal to the
discretization period h = 0.005 sec.
The data were mean-zeroed and scaled to unit variance in order to make the different
components equally significant. Also the x0 coordinate of the state vector was omitted
because it is monotonically increasing during the walk instead of being cyclic. As a result,
the modelling data had 8000 samples of the 15 dimensional (reduced) state vector and the
corresponding four dimensional moment vectors. The points were located around a
nominal trajectory in the state and moment space.
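A minimal sketch of this preprocessing step is shown below. The assumption that x0 occupies the first column of the state array is purely illustrative.

```python
import numpy as np

def preprocess(states, moments):
    """Zero-mean, unit-variance scaling of the teaching data (sketch).

    states  : (N, 16) array of raw samples [q, dq, sL, sR]
    moments : (N, 4) array of joint moment samples
    The monotonically increasing x0 coordinate (assumed to be column 0) is
    dropped before normalisation, since it is not cyclic.
    """
    reduced = np.delete(states, 0, axis=1)
    u = (reduced - reduced.mean(axis=0)) / reduced.std(axis=0)
    y = (moments - moments.mean(axis=0)) / moments.std(axis=0)
    return u, y
```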
In the following, the state vector at time kh is denoted by u(k) and the moment vector by y(k)
since these are the input and the output of the controller. Additionally, the state vector
without the touch sensor values is denoted by \bar{u}(k).
3. Clustered Regression
The data-based model structure, clustered regression, that is applied here is formed purely by
the statistical properties of the sample data. The main idea is to divide the data into clusters
and set an operating point in the centre of each cluster. Every operating point, additionally,
has its own local linear regression model which determines the local input-output mapping
in the cluster region. A scheme of the model structure is presented in Fig. 3.
Fig. 3. The local models cover the whole data along the trajectory
The local models are based on principal component regression (PCR): the data are described in terms of their principal components, of which the most significant capture the essential variation while the least significant ones can be omitted as noise. The idea of the PCR is first to map the input data to the lower dimensional principal component subspace and then use multivariable linear regression to get the output estimate.
Let us denote the input signal for the local model p at the time instant kh as up(k), and the
corresponding output as yp(k). When the input data are mapped onto the principal
components, one gets the latent variable signal xp(k). To simplify the regression structure the
latent variable data are locally scaled to unit variance before the regression mapping.
Now the whole regression structure of the local model p can be stored in the following
statistics, which are calculated using the data in the corresponding cluster:
C_u^p : Expectation value of the input vector u_p.
C_y^p : Expectation value of the output vector y_p.
P_xx^p : Inverse of the covariance matrix of the latent variable data x_p.
R_xu^p : Cross-covariance of the latent and reduced input data.
R_yx^p : Cross-covariance of the output and latent variable data.
R_uu^p : Covariance of the input data.
Note that in the following the statistics are calculated using the collected data, which gives
merely the estimates of the real quantities. It is assumed, however, that the estimates are
accurate enough and no difference is made between them and the real values.
The regression calculation itself is based on a Hebbian and anti-Hebbian learning structure
(see 6.1). Assuming that the statistics correspond to the real properties of the data, the
output estimate of the model p given an arbitrary input vector can be expressed in
mathematical terms as
\hat{y}_p(k) = R_{yx}^p \left(P_{xx}^p\right)^{2} R_{xu}^p \left(\bar{u}(k) - C_u^p\right) + C_y^p .    (5)

Here C_u^p is the expectation value of the reduced state vector \bar{u}. In (5) the input is first
transformed to the operating point p centred coordinates by removing the input mean value;
then the estimate is calculated and the result is shifted back to the original coordinates in the
output space by adding the output mean vector.
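The following sketch evaluates one local model according to the reconstruction of (5) above; the double application of P_xx mirrors the unit-variance scaling of the latent variables, and the dictionary layout of the stored statistics is only one possible implementation.

```python
import numpy as np

def local_estimate(u_bar, model):
    """Output estimate of a single local model (cf. Eq. (5), as reconstructed).

    u_bar : reduced state vector at time kh (touch sensor values removed)
    model : dict with the cluster statistics Cu, Cy, Pxx, Rxu, Ryx
    """
    centred = u_bar - model["Cu"]                 # move to cluster-centred coordinates
    latent = model["Pxx"] @ model["Pxx"] @ model["Rxu"] @ centred
    return model["Ryx"] @ latent + model["Cy"]    # map to output and shift back
```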
A cost value for the estimate made in the unit p should be evaluated to measure the error of
the result. The cost can depend for example on the distance between the cluster centre and
the estimation point:
J_p\bigl(\bar{u}(k)\bigr) = \frac{1}{2}\,\bigl(\bar{u}(k) - C_u^p\bigr)^T H_p \bigl(\bar{u}(k) - C_u^p\bigr) ,    (6)

and the combined output is formed as a weighted average of the local model estimates,

\hat{y}(k) = \frac{\sum_{p=1}^{N} K_p(k)\, \hat{y}_p(k)}{\sum_{p=1}^{N} K_p(k)} .    (7)
The weights should naturally be chosen so that the currently best matching local models affect
the final results the most, whereas the further ones are practically neglected. Let us choose
H_p = \frac{1}{V}\,\bigl(R_{uu}^p\bigr)^{-1} ,    (8)

that is, the weighting matrix for each local model cost calculation equals the scaled inverse of the input data covariance matrix, V being the scaling factor; assuming that the data in each local model are normally distributed, the maximum likelihood estimate for the combined output value is obtained when the weights in (7) are chosen as

K_p(k) = \frac{1}{\sqrt{(2\pi)^n \,\bigl|H_p^{-1}\bigr|}} \exp\!\bigl(-J_p(\bar{u}(k))\bigr) ,    (9)
and V = 1. The simulations, however, showed that a more robust walking behaviour is
reached with larger scaling parameter values. This increases the effect of averaging in the
combination of the local model estimates.
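Putting (6)-(9) together, the controller output could be computed along the following lines. This is a sketch only: `local_estimate` is the helper given above, each model is assumed to additionally store the inverse input covariance as `Ruu_inv`, and the common (2π)^{n/2} factor of (9) is dropped because it cancels in the ratio (7).

```python
import numpy as np

def controller_output(u_bar, models, V=1.0):
    """Weighted combination of the local model estimates (cf. Eqs. (6)-(9)).

    models : list of dicts, each also holding Ruu_inv (inverse input covariance)
    V      : scaling factor of the weighting matrix; larger values increase averaging
    """
    weighted_sum, weight_total = 0.0, 0.0
    for m in models:
        d = u_bar - m["Cu"]
        H = m["Ruu_inv"] / V                          # weighting matrix, Eq. (8)
        J = 0.5 * d @ H @ d                           # quadratic cost, Eq. (6)
        K = np.sqrt(np.linalg.det(H)) * np.exp(-J)    # Gaussian weight, Eq. (9)
        weighted_sum = weighted_sum + K * local_estimate(u_bar, m)
        weight_total += K
    return weighted_sum / weight_total                # weighted average, Eq. (7)
```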
Fig. 4 shows part of the clustered teaching data projected to three input variables. The operating
point centres are shown as large circles with the same shade as the data belonging to the
corresponding clusters. Also the state of the system in the operating point centres is drawn.
Clearly the start of a new double support phase (DSP) is located in the right part of the figure. Following the nominal trajectory, the state of the biped then changes to a single support phase (SSP) after some four operating points. Finally, the step ends just before the beginning of a new DSP in the lower left corner of the figure, where the state of the system is almost identical to the original state, except that the roles of the legs are interchanged.
Fig. 5. The learned gait was qualitatively quite similar to the original one
Fig. 6. The clustered regression controlled system (CRC) is functioning a bit slower than the
PD controlled one.
The biggest difference between the two gaits is that the clustered regression controlled one
is proceeding a little slower. When comparing the coordinate value changes, the lag of the
clustered regression controlled system is clearly seen (Fig. 6). The variations in the PD
controlled signals are due to the added noise in the teaching data generation. However, the
clustered regression controller can reproduce the original behaviour very accurately.
5. Optimization
5.1 Dynamic programming
It was shown that the clustered regression controller is able to repeat the unoptimized
behaviour used in the teaching. It would be very beneficial, however, to reach a more
optimal control scheme so that the biped could, e.g., walk faster with smaller energy
consumption; this would also be strong evidence for the validity of the selected model
structure when explaining biological systems. It would be interesting to see if the biped
would learn new modes of moving: For example, if the speed were emphasized, would it
finally learn to run?
The optimization procedure of the clustered regression structure could be compared with
dynamic programming (Bellman, 1957) that is based on the principle of optimality. In this case
this idea can be formulated as follows:
An optimal control sequence has the property that whatever the initial state and initial
control are, the remaining controls must constitute an optimal control sequence with
regard to the state resulting from the first control.
In general, a dynamic optimization problem can be solved calculating the control from the
end to the beginning. Starting from the final state, the optimal control leading to that state is
determined. Then the same procedure is performed in the previous state and so on. Finally,
when the initial state is reached, the whole optimal control sequence is attained.
This means that one can form the global optimized behaviour using local optimization. When
compared with the clustered regression structure, the local optimization should now be
done inside every cluster or operating point. When the system enters the operating point
region, the controller is assumedly able to guide it optimally to the next region.
Another fact from the theory of optimal control states that for a quadratic (infinite-time)
optimization problem for a linear (affine) system, the optimal control law is also a linear
(affine) function of the state. This all means that, assuming that the model is locally
linearizable, a globally (sub)optimal control can be implemented in the proposed clustered
regression framework.
When a stable and cyclic motion is obtained the control is slightly varied at one time instant,
and the new data point is adapted to the clustered regression model. If the new model gives
a better control, that is, the cost is now smaller, it is accepted. On the other hand, if the cost
is higher, the new model is rejected. After that, a new change can be made and similarly
evaluated. Repeating these steps, the controller can hopefully learn a more optimal behaviour, although the process may be quite slow and nothing prevents it from ending up in a local minimum of the cost function. This random search is an example of reinforcement learning;
now, because of the mathematically simple underlying model structure, the adaptation of
the model can still be relatively efficient.
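A schematic of one accept/reject iteration of this random search could look as follows. `simulate_walk` and `perturb` are hypothetical helpers, not part of the described system: they stand for one simulated walking cycle returning a scalar cost and for a small random change of the control at one time instant.

```python
def random_search_step(controller, simulate_walk, perturb, rng):
    """One accept/reject step of the random-search optimisation (sketch).

    simulate_walk(controller) -> scalar cost of one walking cycle (hypothetical)
    perturb(controller, rng)  -> slightly modified copy of the controller (hypothetical)
    """
    current_cost = simulate_walk(controller)
    candidate = perturb(controller, rng)
    candidate_cost = simulate_walk(candidate)
    if candidate_cost < current_cost:       # keep the change only if the cost decreased
        return candidate, candidate_cost
    return controller, current_cost         # otherwise reject it and try again later
```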
6. Discussion
The presented scheme was just a mathematical model applying engineering intuition for
reaching good control, and the main goal when constructing the model was simplicity.
However, it seems that there is a connection to real neuronal systems.
It has been observed (Haykin, 1999; Hyötyniemi, 2004) that simple Hebbian/anti-Hebbian
learning in neuronal grids results in a principal subspace model of the input data. This
Hebbian learning is the principle that is implemented by the simple neurons (Hebb, 1949).
The principal subspace captures the principal components, and it is a mathematically valid
basis for implementations of principal component regressions.
In (Hyötyniemi, 2006) the Hebbian learning principle is extended by applying feedback
through environment. It turns out that when the nonideality is taken into account – exploiting
signals also exhausts them – there is a negative feedback in the system; one can reach
stabilization of the Hebbian adaptation without additional nonlinearities, and emergence of
the principal subspace without complicated hierarchies among neurons. There can exist self-
regulation and self-organization in the neuronal system, meaning that the adaptation of the
global model can be based solely on the local interactions between individual neurons.
But as the biped structure is highly nonlinear, one needs to extend the linear model; here
this was accomplished by introducing the clustered structure with submodels. How could
such extension be motivated in a neuronal system, as it now seems that some kind of central
coordination is necessary to select among the submodels and to master their adaptation?
Again, as studied in (Hyötyniemi, 2006), there emerges sparse coding in a system with the
Hebbian feedback learning. It can be claimed that in sparse coding the basis vectors are
rotated to better match the physically relevant features in data – such behaviour has been
detected, for example, in visual cortex (Földiák, 2002). In Hebbian feedback learning the
correlating structures become separated, and they compete for activation; without any
centralized control structures, the signals become distributed among the best matching
substructures. As seen from outside, the net effect is to have “virtual” clusters, with smooth
interpolation between them.
7. Conclusions
In this work, a clustered regression structure was used to model and control a walking
biped robot model. It was shown that the purely data-based model is accurate enough to
control the biped. The control structures can also be motivated from the physiological point
of view.
The main problem is that to successfully reproduce the walking gait, the clustered
regression controller should learn to keep the system well near the nominal trajectory. If the
state of the system drifts too far from the learned behaviour, the validity of the local models
strongly weakens and the system collapses. As a result, the robustness of the controller is
dependent on the amount of variation in the source data, obtained through the added noise. However, the clustered regression structure was not able to control the biped when noise at the level used in the PD controlled simulations was present. This complicates the iterative optimization
process.
It was also shown that the management of no less than eight principal components is
necessary; the “visibility ratio” between these principal components, or the ratio between
variances, is over three decades. This also dictates the necessary signal-to-noise ratio. It
seems that such accuracy cannot be achieved in biological systems; to construct a
biologically plausible control scheme, the model structure has to be modified.
There are various directions to go. Principal components are always oriented towards the
input data only, neglecting the output, or the outcome of the actions. This problem should
somehow be attacked. One possibility would be to explicitly control the adaptation based on
the observed outputs, so that rather than doing standard principal component analysis,
some kind of partial least squares (PLS) (Hyötyniemi, 2001) approach would be implemented.
Also, the data could be rescaled to emphasize the less visible PC directions.
It is also important to point out that the physical model of the biped robot is only a model,
and some phenomena present in the data may be caused by the simulator. Especially the leg
contacts with the ground may introduce some oscillations which probably would not
appear in data collected from a real robot. This means that some of the less important
principal component directions may as well describe these irrelevant effects thus making
the modelling problem harder. In the future work, it would be interesting to analyze the
principal component directions in each local model one by one and try to find out which of
them are really connected to the actual walking motion.
8. References
Atkeson, C. G.; Moore, A. W. & Schaal, S. (1997). Locally Weighted Learning. Artificial
Intelligence Review, vol. 11, pp. 11-73.
Basilevsky, A. (1994). Statistical Factor Analysis and Related Methods. John Wiley & Sons, New
York.
Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, New
Jersey.
Chevallereau, C. & et al. (2003). RABBIT: A Testbed for Advanced Control Theory. IEEE
Control Systems Magazine, vol. 23, no. 5, pp. 57-79.
Földiák, P. (2002). Sparse coding in the primate cortex. In Arbib M.A. (ed.). The Handbook of
Brain Theory and Neural Networks. MIT Press, Cambridge, Massachusetts (second
edition).
Haavisto, O. & Hyötyniemi, H. (2004). Simulation Tool of a Biped Walking Robot Model.
Technical Report 138, Helsinki University of Technology, Control Engineering
Laboratory.
Haavisto, O. & Hyötyniemi, H. (2005). Data-based modeling and control of a biped
robot. Proceedings of The 6th IEEE International Symposium on Computational
Intelligence in Robotics and Automation - CIRA2005, Espoo, Finland, June 27-30,
pp. 427-432.
Hardt, M.; Kreutz-Delgado, K. & Helton, J. W. (1999). Optimal Biped Walking with a
Complete Dynamic Model. Proceedings of The 38th Conference on Decision and Control,
Phoenix, Arizona, pp. 2999-3004.
Haykin, S. (1999). Neural Networks, a Comprehensive Foundation. Prentice Hall, Upper Saddle
River, New Jersey (second edition).
Hebb, D.O. (1949). The Organization of Behavior: A Neuropsychological Theory. John Wiley &
Sons, New York.
Hyötyniemi, H. (2001). Multivariate Regression: Techniques and Tools. Technical Report 125,
Helsinki University of Technology, Control Engineering Laboratory.
Hyötyniemi, H. (2002). Life-like Control. Proceedings of STeP 2002 - Intelligence, the Art of
Natural and Artificial, Oulu, Finland, December 15-17, pp. 124-139.
Hyötyniemi, H. (2004). Hebbian Neuron Grids: System Theoretic Approach. Technical Report
144, Helsinki University of Technology, Control Engineering Laboratory.
Hyötyniemi, H. (2006). Neocybernetics in biological systems. Technical Report 151, Helsinki
University of Technology, Control Engineering Laboratory.
Juang, J. (2000). Fuzzy Neural Network Approaches for Robotic Gait Synthesis. IEEE
Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 30, no. 4, pp.
594-601.
Wells, D. A. (1967). Theory and Problems of Lagrangian Dynamics with a Treatment of Euler's
Equations of Motion, Hamilton's Equations and Hamilton's Principle. Schaum
publishing, New York.
The MathWorks, Inc., Matlab - The Language of Technical Computing. See the WWW pages at
www.mathworks.com.
Vijayakumar, S.; Shibata, T.; Conradt, J. & Schaal, S. (2002). Statistical Learning for
Humanoid Robots. Autonomous Robots, vol. 12, no. 1, pp. 55-69.
16
Sticky Hands
Joshua G. Hale¹ & Frank E. Pollick²
¹JST ICORP Computational Brain Project, Japan
²Dept. of Psychology, University of Glasgow, Scotland
1. Introduction
Sticky Hands is a unique physically-cooperative exercise that was implemented with a full-
size humanoid robot. This involved the development of a novel biologically-inspired
learning algorithm capable of generalizing observed motions and responding intuitively.
Since Sticky Hands has thus far been performed as an exercise for the mutual development
of graceful relaxed motion and comfortable physical intimacy between two humans,
performing the exercise with a humanoid robot represents a conceptual advance in the role
of humanoid robots to that of partners for human self-development.
Engendering a sense of comfortable physical intimacy between human and robot is a
valuable achievement: humans must be able to interact naturally with humanoid robots,
and appreciate their physical capabilities and requirements in given situations in order to
cooperate physically. Humans are attuned to performing such assessments with regard to
other humans based on numerous physical and social cues that it will be essential to
comprehend and replicate in order to facilitate intuitive interaction.
The chapter expands these issues in more detail, drawing attention to the relevance of
established research in motion control for computer graphics and robotics, human motion
performance, and perceptual psychology. In particular, attention is paid to the relevance of
theories of human motion production, which inspire biomimetic motion performance
algorithms, and experimental results shedding light on the acuity of human perception of
motion and motion features. We address the following questions: How do we interact
naturally, instinctively, intuitively, and effectively with the most adaptable of multipurpose
machines, i.e., humanoid robots? How do we make interactions natural, graceful, and
aesthetically pleasing? How do we encourage human attribution in the perception of
humanoid robots? How can we expand the concept of humanoid robots through novel
applications? How can we draw on knowledge already existing in other fields to inspire
developments in humanoid robotics?
It is with this perspective, and by way of reply to these inquiries, that we begin this chapter
by describing the implementation of the Sticky Hands system, its intuitive learning and
generalization system, and hardware. We illustrate how a study of biological processes
inspired various aspects of the system, such as the plastic memory of its learning system.
We present experimental results and evaluations of the Sticky Hands system.
We conclude the chapter by returning to the main philosophical themes presented in the
paper, and describe a fruitful future direction for humanoid robotics research: we conclude
that there is a synergistic relationship between neuroscience and humanoid robotics with
the following four benefits: (i) neuroscience inspires engineering solutions e.g., as we have
illustrated, through human motor production and motion perception research (although the
breadth of relevant knowledge is much wider than the scope of these two examples); (ii)
engineering implementations yield empirical evaluations of neuroscientific theories of
human capabilities; (iii) engineering implementations inspire neuroscientific hypotheses;
(iv) engineering solutions facilitate new neuroscientific experimental paradigms, e.g.,
through the recording, analysis and replication of motion for psychovisual experiments.
We believe that exploiting this synergy will yield rapid advances in both fields, and
therefore advocate parallel neuroscientific experimentation and engineering development in
humanoid robotics. Regarding research building on the Sticky Hands system, experiments
involving physical interaction will thus further the development of humanoids capable of
interacting with humans, and indeed humans capable of interacting with humanoids.
Sticky Hands, being a novel interaction, presents several interesting challenges for the
design of an intelligent system, including complex motion planning that mimics the
ineffable quality of human intuition. As a basis the robot must be capable of moving while
remaining compliant to contact forces from a human, and refining this behaviour, be able to
learn from and mimic the motion patterns of humans playing the game. By imitating the
types of pattern produced by humans, the robot may reflect creativity and encourage people
to explore their own range of motion. To accomplish this while simultaneously experiencing
new motion patterns that may develop unpredictably it is necessary to provide a learning
algorithm capable of generalising motion it has seen to new conditions, maintaining a
suitably evolving internal state.
This work was motivated by a desire to explore physical interaction between humans and
robots. This desire grew from the observation that a central problem in the design of appropriate intelligent systems is how to communicate and cooperate with people (Takeda et al., 1997). Using the principle that physical interaction is a familiar and reliable method for
most humans, Sticky Hands broaches the area of physical cooperation and the subtle but
significant issue of communication (Coppin et al. 2000) through physical movement (Adams
et al. 2000; Hikiji 2000). Our work therefore considers human imitation (Scassellati 2000),
which we consider to be of particular relevance for the future: as computer science and robotics develop, it becomes clear that humans and robots will cooperate on a wide range
of tasks. Moreover, we explored the use of a humanoid robot as a playmate facilitating a
human’s self-development. In this context the robot assumes a new social role involving a
physically intimate and cooperative interaction. We can also hope that through the
interaction, people will be encouraged to attribute an anthropomorphic identity to the robot
rather than considering it as a mechanical entity. Such a shift of perspective heralds ways of
making human and robot interactions more natural.
Humanoid robotics embodies a certain fascination with creating a mechanical entity
analogous to our human selves. There are of course many other valid motivations for creating
humanoids (Bergener et al. 1997), not least among which is the nature of our environment: being highly adapted to human sensory and motor capabilities, it begs for artificial agents with
analogous capabilities that can usefully coexist in our own living and working spaces. Having
an anthropomorphic shape and organic motion also makes working with such robots
aesthetically and emotionally more pleasing. The production of human-like and emotionally expressive styles of movement is of particular relevance. The Sticky Hands interaction embodies human-like motion and autonomy. The target of maintaining minimal contact force is tangible; the creative, anticipatory aspect, however, is enhanced by initiative. The challenges
posed by these problems motivated the development of a highly generalized learning
algorithm, and a theoretical investigation of expressive motion styles.
We continue this section with a system overview describing the relationship between robot
control and learning in the Sticky Hands system. We then outline the robot control issues,
describing additional sensing technology and explaining how we achieved hand placement
compliant to external forces. We then discuss the learning algorithm which observes
trajectories of the hand throughout interaction with a human and predicts the development
of a current trajectory.
The low-level motor controller follows a planned trajectory supplied by the learning algorithm. It also estimates the contact force
between the human’s and robot’s hands, adjusting the trajectory plan to compensate for
contact force discrepancies. The actual hand position data was smoothed to remove noise,
and sent back to the learning algorithm.
Fig. 3. DB kinematics.
The robot was required to balance a force applied by the human player against its hand.
Trajectories were performed while incorporating any adjustments needed to balance the
contact force, thus making the hand actively compliant to changes of the contact force. The
simplest method of determining the contact force is direct measurement, i.e., by means of a
force transducer between the robot’s and human’s hands. Fig. 4 shows the attachment of a
force transducer between the robot’s hand and a polystyrene hemisphere intended to
facilitate an ergonomic surface for the human to contact. The transducer measured forces in
the X, Y and Z directions. The target position of the hand was translated to compensate if
the contact force exceeded a 5N threshold. In order to establish a balance point against the
force applied by the human, we subtracted a small quantity in the Z (forward-backward)
direction from the measured force prior to thresholding. We also implemented a method of
responding to the contact force using only the sensors internal to the SARCOS robot, i.e.,
joint load and angle. In both cases we assumed that the interactions were not so forceful as
to necessitate more than a constant adjustment in hand position, i.e., continuous pressure
would result in a yielding movement with a constant velocity.
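The text specifies only the 5 N threshold and that a small quantity is subtracted from the Z component; the bias magnitude and the gain below are therefore assumed values in this minimal sketch of the transducer-based compliance.

```python
import numpy as np

def transducer_compliance(force_xyz, z_bias=1.0, threshold=5.0, gain=1e-3):
    """Hand-target translation from the measured contact force (sketch).

    force_xyz : [Fx, Fy, Fz] from the hand-mounted transducer, in newtons
    z_bias    : small value subtracted from Fz to create a balance point (assumed magnitude)
    threshold : 5 N compliance threshold mentioned in the text
    gain      : metres of target translation per newton of excess force (assumed)
    """
    f = np.array(force_xyz, dtype=float)
    f[2] -= z_bias                              # bias the forward/backward component
    magnitude = np.linalg.norm(f)
    if magnitude <= threshold:
        return np.zeros(3)                      # below threshold: leave the target alone
    return gain * (magnitude - threshold) * f / magnitude
```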
The joint-load method compares the measured joint torques with those predicted by an inverse dynamics model, attributing any discrepancy to torques due to external loads other than gravity, because gravity is the only external force accounted for in the inverse dynamics calculation.
accurate inverse dynamics model, and unfortunately inaccuracies in the dynamic model and
load sensors necessitated a large contact force to facilitate suitable thresholding of the
results.
We therefore measured the positional offset between target and actual hand positions
instead, assuming any large positional discrepancy was caused by external non-gravity
forces. We were able to threshold by 2cm (yielding an effective force threshold of about
12N). When the low-level controller was required to follow a suggested trajectory, the
hand was directed to 5cm further forward from the suggested position. The human was
assumed to supply a force sufficient to maintain the hand in the actual target position
specified by the suggested trajectory. This method did not require readings from the joint
load sensors to respond to external forces because physical interactions were registered
purely through the position of the robot’s hand -which requires only joint angle readings
to compute. If the human reduced or increased the contact force the robot’s hand would
move beyond the 2cm threshold and the suggested trajectory consequently adjusted to
compensate. While the contact force was entirely in the Z direction, perturbations in the X
and Y as well as Z directions were monitored, and accommodated by translating the
target position when the 2cm threshold was reached. This indirect kinematic method
facilitated the use of a significantly lighter contact force than with inverse dynamics based
force estimation.
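The positional-offset method can be sketched as follows; the choice of Z as the forward axis and the way the correction is returned are assumptions for illustration.

```python
import numpy as np

def kinematic_compliance(suggested, actual, threshold=0.02, lead=0.05):
    """Positional-offset compliance using only joint angle readings (sketch).

    suggested : hand position suggested by the learning algorithm (metres)
    actual    : hand position computed from the joint angles (metres)
    threshold : 2 cm deadband (roughly a 12 N effective force threshold)
    lead      : commanded target pushed 5 cm further forward (Z forward assumed)
    Returns (commanded_target, trajectory_correction).
    """
    suggested = np.asarray(suggested, dtype=float)
    commanded = suggested.copy()
    commanded[2] += lead                        # rely on the human to hold the hand back
    error = np.asarray(actual, dtype=float) - suggested
    if np.linalg.norm(error) <= threshold:
        return commanded, np.zeros(3)           # contact force in balance
    return commanded, error                     # shift the suggested trajectory accordingly
```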
Any reasonable positioning and compliance strategy is in fact compatible with the high-
level Sticky Hands control system. Ideally, the force threshold is minimal, since the
threshold determines the minimum contact force. The larger the contact force the less
relaxed the human’s motion may be. The way the redundancy in robot hand placement is
resolved does not significantly affect the contact force but on the other hand may have an
effect on the appearance of the robot’s motion. We discuss this point later. We also compare
this kinematic hand positioning technique with the force transducer method and present
traces of the forces measured during interaction with a human.
The trajectories supplied by the learning algorithm were described using piecewise linear
splines. The robot controller ran at 500Hz and the learning algorithm ran at 10Hz. The
sequence of predictions output by the learning algorithm was interpreted as the advancing
end-point of a spline. The knots of this spline were translated to compensate for any
discrepancy in the contact force detected by the motor controller. This translation was
accomplished smoothly, in order to prevent the hand from jerking in response to contact
forces: after translating the knots, a negative translation vector was initialized to bring the
knots back to their original position. This negative translation was gradually decayed to
zero during each cycle of the motor controller. The sum of the translated knots and the
negative translation vector thus interpolated from the original knot positions and the
translated knot positions.
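The smoothed knot translation could be organised roughly as below; the per-cycle decay factor is an assumed value, since the chapter only states that the negative translation is gradually decayed to zero.

```python
import numpy as np

class CompliantSplineKnots:
    """Smoothed translation of the spline knots (sketch of the scheme above).

    The knots are shifted immediately, while an opposite offset that decays a
    little on every 500 Hz controller cycle prevents the hand from jerking.
    """
    def __init__(self, knots, decay=0.98):
        self.knots = np.array(knots, dtype=float)
        self.negative_offset = np.zeros(3)
        self.decay = decay                      # assumed per-cycle decay factor

    def translate(self, shift):
        self.knots += shift                     # compensate the contact-force discrepancy
        self.negative_offset -= shift           # start from the original knot positions

    def effective_knots(self):
        self.negative_offset *= self.decay      # let the correction fade in gradually
        return self.knots + self.negative_offset
```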
P_i = \left( p_i,\; v_i,\; a_i,\; T_i \right)    (2)

v_i = p_{i+1} - p_i \in \mathbb{R}^3    (3)

a_i = \frac{|p_{i+1} - p_i|}{|p_i - p_{i-1}|} \in \mathbb{R}    (4)

T_i = \left( \cos\!\left(\frac{\theta_i}{2}\right),\; \sin\!\left(\frac{\theta_i}{2}\right) (p_{i+1} - p_i) \times (p_i - p_{i-1}) \right)    (5)

\theta_i = \cos^{-1}\!\left( \frac{(p_i - p_{i-1}) \cdot (p_{i+1} - p_i)}{|p_i - p_{i-1}|\,|p_{i+1} - p_i|} \right) \in \mathbb{R}    (6)
v_i is the velocity out of p_i (scaled by an arbitrary factor) and a_i is a scalar indicating the magnitude of the acceleration. The direction of the acceleration is deducible from T_i, which is a quaternion describing the change in direction between v_{i-1} and v_i as a rotation through their mutually orthogonal axis.
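The prototype quantities can be computed from three consecutive hand positions roughly as follows. Normalising the rotation axis (left implicit in the reconstructed Eq. (5)) and the handling of the degenerate straight-line case are assumptions of this sketch.

```python
import numpy as np

def make_prototype(p_prev, p, p_next):
    """Build (p, v, a, T) from three consecutive positions (cf. Eqs. (2)-(6))."""
    p_prev, p, p_next = (np.asarray(x, dtype=float) for x in (p_prev, p, p_next))
    v_in, v_out = p - p_prev, p_next - p
    a = np.linalg.norm(v_out) / np.linalg.norm(v_in)            # acceleration magnitude
    cos_theta = np.dot(v_in, v_out) / (np.linalg.norm(v_in) * np.linalg.norm(v_out))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))            # change of direction
    axis = np.cross(v_out, v_in)
    norm = np.linalg.norm(axis)
    axis = axis / norm if norm > 1e-12 else np.zeros(3)         # straight motion: no rotation
    T = np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * axis))
    return p, v_out, a, T
```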
Fig. 5. Datapath in the learning algorithm (arrows) and execution sequence (numbers).
If a stored prototype P_i corresponds best to p'_j, then p'_j + a_i T_i (p'_j - p'_{j-1}) is an estimate for p'_{j+1}.
We also linearly blend the position of p_i into the prediction, and the magnitude of the velocity, so that p'_{j+1} combines the actual position and velocity of p_i with the prediction; the magnitude of the predicted step is

s_j = (1 - g_v)\, a_i\, |p'_j - p'_{j-1}| + g_v\, |v_i| .    (8)
g_p and g_v are blending ratios used to manage the extent to which predictions are entirely general or repeat previously observed trajectories, i.e., how much the robot wants to repeat what it has observed. We chose values of g_p and g_v in the range [0.001, 0.1] through experimentation.
\theta' = \begin{cases} \dfrac{\pi\,\theta}{2 M_a} & \theta \in [-M_a, M_a] \\[4pt] \pi & \theta \notin [-M_a, M_a] \end{cases}    (10)

\theta = \cos^{-1}\!\left( \frac{v_i \cdot v_j}{|v_i|\,|v_j|} \right)    (11)
M_a and M_p define the maximum angular and positional differences such that d(P_i, P_j) may be one or less. Prototypes within this bound are considered similar enough to form a basis for a prediction, i.e., if d(P_i, P_j) is greater than 1 for all i then no suitably similar prototype exists. The metric compares the position of two prototypes and the direction of their velocities. Two prototypes are closest if they describe a trajectory traveling in the same direction, in the same place. In practice, the values of 15 cm and π/4 radians for M_p and M_a respectively were found to be appropriate. A trajectory with exactly the same direction as the developing trajectory constitutes a match up to a displacement of 15 cm, a trajectory with no displacement constitutes a match up to an angular discrepancy of π/4 radians, and
within those thresholds there is some leeway between the two characteristics. The threshold
values must be large enough to permit some generalisation of observed trajectories, but not
so large that totally unrelated motions are considered suitable for prediction when
extrapolation would be more appropriate.
The absolute velocity, and bending characteristics are not compared in the metric.
Predictions are therefore general with respect to the path leading a trajectory to a certain
position with a certain direction and velocity, so branching points are not problematic. Also
the speed at which an observed trajectory was performed does not affect the way it can be
generalised to new trajectories. This applies equally to the current trajectory and previously
observed trajectories.
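A sketch of such a similarity measure is given below. The exact definition of d(P_i, P_j) is not reproduced in this chapter, so combining the positional and angular terms as a simple sum of ratios is an assumption; M_p and M_a take the values quoted above.

```python
import numpy as np

def prototype_distance(p_i, v_i, p_q, v_q, M_p=0.15, M_a=np.pi / 4):
    """Similarity between a stored prototype and the developing trajectory (sketch).

    Positions are compared against the 15 cm bound M_p and velocity directions
    against the pi/4 rad bound M_a; a result of 1 or less counts as a match.
    The sum-of-ratios combination is an assumption.
    """
    d_pos = np.linalg.norm(np.asarray(p_i) - np.asarray(p_q)) / M_p
    cos_t = np.dot(v_i, v_q) / (np.linalg.norm(v_i) * np.linalg.norm(v_q))
    d_ang = np.arccos(np.clip(cos_t, -1.0, 1.0)) / M_a
    return d_pos + d_ang
```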
When seeking a prototype we might naïvely compare all recorded prototypes with P'_j to find the closest. If none exists within a distance of 1 we use P'_j itself to extrapolate as above. Needless to say, however, it would be computationally over-burdensome to compare P'_j with all the recorded prototypes. To optimise this search procedure we defined a voxel array to store the prototypes. The array encompassed a cuboid enclosing the reachable space of the robot, partitioning it into a 50×50×50 array of cuboid voxels indexed by three integer coordinates. The storage requirement of the empty array was 0.5 MB. New prototypes were placed in a list attached to the voxel containing their positional component p_i. Given P'_j we only needed to consider prototypes stored in voxels within a distance of M_p from p'_j, since prototypes in any other voxels would
definitely exceed the maximum distance according to the metric. Besides limiting the total
number of candidate prototypes, the voxel array also facilitated an optimal ordering for
considering sets of prototypes. The voxels were considered in an expanding sphere about
p'_j. A list of integer-triple voxel index offsets was presorted and used to quickly identify voxels close to a given centre voxel, ordered by minimum distance to the centre voxel. The list contained voxels up to a minimum distance of M_p. This ensures an optimal search of
the voxel array since the search may terminate as soon as we encounter a voxel that is too
far away to contain a prototype with a closer minimum distance than any already found.
It also permits the search to be cut short if time is unavailable. In this case the search
terminates optimally since the voxels most likely to contain a match are considered first.
This facilitates the parameterisable time bound since the prototype search is by far the
dominant time expense of the learning algorithm.
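The voxel store could be organised along the following lines. This sketch replaces the presorted offset list and expanding-sphere ordering with a simple fixed-radius scan around the query voxel, so it only illustrates the bucketing idea, not the optimal search order.

```python
import numpy as np
from collections import defaultdict

class VoxelIndex:
    """50x50x50 voxel array for storing and retrieving prototypes (sketch)."""

    def __init__(self, lower, upper, divisions=50):
        self.lower = np.asarray(lower, dtype=float)
        self.cell = (np.asarray(upper, dtype=float) - self.lower) / divisions
        self.divisions = divisions
        self.buckets = defaultdict(list)

    def _key(self, position):
        idx = np.floor((np.asarray(position) - self.lower) / self.cell).astype(int)
        return tuple(np.clip(idx, 0, self.divisions - 1))

    def insert(self, prototype, position):
        self.buckets[self._key(position)].append(prototype)

    def candidates(self, position, radius=2):
        """Prototypes in all voxels within `radius` cells of the query voxel."""
        cx, cy, cz = self._key(position)
        found = []
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                for dz in range(-radius, radius + 1):
                    found.extend(self.buckets.get((cx + dx, cy + dy, cz + dz), []))
        return found
```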
A new prototype P_b is blended with a sufficiently similar stored prototype P_a using the weighted average

P_a' = \frac{D(n) - 1}{D(n)}\, P_a + \frac{1}{D(n)}\, P_b    (12)

D(n) = 1 + A_M - \frac{A_M}{1 + n A_G}    (13)

A_M / (1 + A_M) defines the maximum weight for the old values, and A_G determines how
quickly it is reached. Values of 10 and 0.1 for AM and AG respectively were found to be
suitable. This makes the averaging process linear as usual for small values but ensures the
contribution of the new prototype is worth at least 1/11th.
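In code, merging a new prototype value into a stored one with this reinforcement-limited averaging could look as follows, using the values A_M = 10 and A_G = 0.1 quoted above and the reconstruction of (12)-(13).

```python
def blend(old_value, new_value, n, A_M=10.0, A_G=0.1):
    """Reinforcement-limited incremental averaging (cf. Eqs. (12)-(13)).

    n is the number of samples already absorbed by the stored prototype.
    For small n this behaves like an ordinary running mean; for large n the
    new sample still contributes at least 1/(1 + A_M) = 1/11 of the result.
    """
    D = 1.0 + A_M - A_M / (1.0 + n * A_G)               # Eq. (13)
    return (D - 1.0) / D * old_value + new_value / D    # Eq. (12)
```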
We facilitated an upper bound on the storage requirements using a deletion indexing
strategy for removing certain prototypes. An integer clock was maintained, and
incremented every time a sample was processed. New prototypes were stamped with a
deletion index set in the future. A list of the currently stored prototypes sorted by deletion
index was maintained, and if the storage bounds were reached the first element of the list
was removed and the corresponding prototype deleted. The list was stored as a heap
(Cormen et al.) since this data structure permits fast O (log(num elements)) insertion,
deletion and repositioning. We manipulated the deletion indices to mirror the
reinforcement aspect of human memory. A function R(n) defined the period for which a prototype would remain stored before deletion.
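The deletion-index bookkeeping could be sketched as below. The heap is used lazily here (stale entries are simply skipped), which is a simplification of the repositioning mentioned above, and the lifetime argument stands in for R(n).

```python
import heapq

class PlasticMemory:
    """Bounded prototype store with future-dated deletion indices (sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.clock = 0
        self.heap = []        # (deletion_index, prototype_id) pairs
        self.store = {}       # prototype_id -> prototype

    def add(self, proto_id, prototype, lifetime):
        """Store a prototype whose deletion index lies `lifetime` ticks ahead."""
        self.clock += 1
        heapq.heappush(self.heap, (self.clock + lifetime, proto_id))
        self.store[proto_id] = prototype
        while len(self.store) > self.capacity and self.heap:
            _, victim = heapq.heappop(self.heap)
            self.store.pop(victim, None)    # skip entries already removed or renewed
```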
3. Results
The initial state and state after playing Sticky Hands with a human partner are shown in Fig.
7. Each prototype is plotted according to its position data. The two data sets are each viewed
from two directions and the units (in this and subsequent figures) are millimeters. The X, Y
& Z axes are positive in the robot’s left, up and forward directions respectively. The point
(0,0,0) corresponds to the robot’s sacrum. The robot icons are intended to illustrate
orientation only, and not scale. Each point represents a unique prototype stored in the
motion predictor’s memory, although as discussed each prototype may represent an
amalgamation of several trajectory samples. The trajectory of the hand loosely corresponds
to the spacing of prototypes but not exactly because sometimes new prototypes are blended
with old prototypes according to the similarities between their respective position and velocity
vectors.
The initial state was loaded as a default. It was originally built by teaching the robot to
perform an approximate circle 10cm in radius and centred in front of the left elbow joint
(when the arm is relaxed) in the frontal plane about 30cm in front of the robot. The
prototype positions were measured at the robot’s left hand, which was used to play the
game and was in contact with the human’s right hand throughout the interaction. The
changes in the trajectory mostly occur gradually as human and robot slowly and
cooperatively develop cycling motions. Once learned, the robot can switch between any
of its previously performed trajectories, and generalise them to interpret new
trajectories.
The compliant positioning system, and its compatibility with motions planned by the
prediction algorithm was assessed by comparing the Sticky Hands controller with a
‘positionable hand’ controller that simply maintains a fixed target for the hand in a
compliant manner so that a person may reposition the hand.
Fig. 8 shows a force/position trace where the width of the line is linearly proportional to
the magnitude of the force vector (measured in all 3 dimensions), and Table 1 shows
corresponding statistics. Force measurements were averaged over a one minute period of
interaction, but also presented are ‘complied forces’, averaging the force measurements
over only the periods when the measured forces exceeded the compliance threshold.
From these results it is clear that using the force transducer yielded significantly softer
compliance in all cases. Likewise the ‘positionable hand’ task yielded slightly softer
Sticky Hands 277
compliance because the robot did not attempt to blend its own trajectory goals with those
imposed by the human.
Fig. 8. Force measured during ‘positionable hand’ and Sticky Hands tasks.
4. Discussion
We proposed the ‘Sticky Hands’ game as a novel interaction between human and robot. The
game was implemented by combining a robot controller process and a learning algorithm with a
novel internal representation. The learning algorithm handles branching trajectories implicitly
without the need for segmentation analysis because the approach is not pattern based. It is
possible to bound the response time and memory consumption of the learning algorithm
arbitrarily within the capabilities of the host architecture. This may be achieved trivially by
restricting the number of prototypes examined or stored. The ethos of our motion system may be
contrasted with the work of Williamson (1996) who produced motion controllers based on
positional primitives. A small number of postures were interpolated to produce target joint
angles and hence joint torques according to proportional gains. Williamson’s work advocated the
concept of “behaviours or skills as coarsely parameterised atoms by which more complex tasks can be successfully performed”. Corresponding approaches have also been proposed in the
computer animation literature, such as the motion verbs and adverbs of Rose et al. (1998).
Williamson’s system is elegant, providing a neatly bounded workspace, but unfortunately it was
not suitable for our needs due to the requirements of a continuous interaction incorporating
more precise positioning of the robot’s hand.
By implementing Sticky Hands, we were able to facilitate physically intimate interactions
with the humanoid robot. This enables the robot to assume the role of playmate and
partner assisting in a human’s self-development. Only minimal sensor input was required
for the low-level motor controller: torque and joint position sensors were sufficient,
and these may be expected as standard on most humanoid robots. With the addition of a
hand mounted force transducer the force results were also obtained. Our work may be
viewed as a novel communication mechanism that accords with the idea that an
autonomous humanoid robot should accept command input and maintain behavioral
goals at the same level as sensory input (Bergener et al. 1997). Regarding the issue of
human instruction however, the system demonstrates that the blending of internal goals
with sensed input can yield complex behaviors that demonstrate a degree of initiative.
Other contrasting approaches (Scassellati 1999) have achieved robust behaviors that
emphasize the utility of human instruction in the design of reinforcement functions or
progress estimators.
The design ethos of the Sticky Hands system reflects a faith in the synergistic relationship
between humanoid robotics and neuroscience. The project embodies the benefits of cross-
fertilized research in several ways. With reference to the introduction, it may be seen that (i)
neuroscientific and biological processes have informed and inspired the development of the
system, e.g., through the plastic memory component of the learning algorithm, and the
control system’s “intuitive” behaviour which blends experience with immediate sensory
information as discussed further below; (ii) by implementing a system that incorporates
motion based social cues, the relevance of such cues has been revealed in terms of human
reactions to the robot. Also, by demonstrating that a dispersed representation of motion is
sufficient to yield motion learning and generalization, the effectiveness of solutions that do
not attempt to analyze nor segment observed motion has been confirmed; (iii) technology
developed in order to implement Sticky Hands has revealed processes that could plausibly
be used by the brain for solving motion tasks, e.g., the effectiveness of the system for
blending motion targets with external forces to yield a compromise between the motion
modeled internally and external influences suggests that humans might be capable of
performing learned motion patterns according to a consistent underlying model subject to
forceful external influences that might significantly alter the final motion; (iv) the Sticky
Hands system is in itself a valuable tool for research since it provides an engaging
cooperative interaction between a human and a humanoid robot. The robot’s behaviour
may be modulated in various ways to investigate for example the effect of less compliant
motion, different physical cues, or path planning according to one of various theories of
human motion production.
The relationship between the engineering and computational aspect of Sticky Hands and the
neuroscientific aspect is thus profound. This discussion is continued in the following
sections which consider Sticky Hands in the context of relevant neuroscientific fields:
human motion production, perception, and the attribution of characteristics such as
naturalness and affect. The discussion is focused on interaction with humans, human
motion, and lastly style and affect.
The system also generated robot head motion to fixate visually on its own hand, which by coincidence was where most
human partners were also looking, leading to a form of mutual gaze between human and
robot. This gestural interaction yielded variable reports from the human players as either a
sign of confusion or disapproval by the robot.
This effect is illustrative of the larger significance of subtle cues embodied by human motion that
may be replicated by humanoid robots. Such actions or characteristics of motion may have
important consequences for the interpretation of the movements by humans. The breadth of
knowledge regarding these factors further underlines their value. There is much research
describing how humans produce and perceive movements and many techniques for producing
convincing motion in the literature of computer animation. For example, there is a strong duality
between dynamics based computer animation and robotics (Yamane & Nakamura 2000).
Computer animation provides a rich source of techniques for generating (Witkin & Kass 1988;
Cohen 1992; Ngo & Marks 1993; Li et al. 1994; Rose et al. 1996; Gleicher 1997) and manipulating
(Hodgins & Pollard 1997) dynamically correct motion, simulating biomechanical properties of
the human body (Komura & Shinagawa 1997) and adjusting motions to display affect or achieve
new goals (Bruderlin & Williams 1995; Yamane & Nakamura 2000).
It has been reported that for planar segments velocity is inversely proportional to curvature raised to
the 1/3rd power, and that for non-planar segments the velocity is inversely proportional to the
1/3rd power of curvature multiplied by 1/6th power of torsion (Lacquaniti et al. 1983; Viviani &
Stucchi 1992; Pollick & Sapiro 1996; Pollick et al. 1997; Handzel & Flash, 1999). Extensive
psychological experiments of the paths negotiated by human-humanoid dyads could inform
which principles of human motor control are appropriate for describing human-humanoid
cooperative behaviours.
5. Conclusion
Having described the Sticky Hands project: its origin, hardware and software implementation, biological inspiration, empirical evaluation, theoretical considerations and implications, and having broadened the latter issues with a comprehensive discussion, we
now return to the enquiries set forth in the introduction.
The Sticky Hands project itself demonstrates a natural interaction which has been
accomplished effectively – the fact that the objectives of the interaction are in some respects
open-ended creates leeway in the range of acceptable behaviours but also imposes complex
high-level planning requirements. Again, while these may be regarded as peculiar to the
Sticky Hands game they also reflect the breadth of problems that must be tackled for
advanced interactions with humans. The system demonstrates, through analysis of human motion and through cooperation, how motion can be rendered natural, graceful and aesthetically pleasing.
These characteristics are both key objectives in Sticky Hands interaction, and as we have
indicated in the discussion also have broader implications for the interpretation, quality and
effectiveness of interactions with humans in general, for which the attribution of human qualities such as emotion engenders an expectation of the natural social cues that improve
the effectiveness of cooperative behaviour through implicit communication.
We have drawn considerable knowledge and inspiration from the fields of computer
graphics, motion perception and human motion performance. The benefit that the latter two
fields offer for humanoid robotics reveals an aspect of a larger relationship between humanoid
robotics and neuroscience. There is a synergistic relationship between the two fields that
offers mutual inspiration, experimental validation, and the development of new
experimental paradigms to both fields. We conclude that exploring the depth of this
relationship is a fruitful direction for future research in humanoid robotics.
8. References
Adams, B.; Breazeal, C.; Brooks, R.A. & Scassellati, B. (2000). Humanoid Robots: A New
Kind of Tool. IEEE Intelligent Systems, 25-31, July/August
Atkeson, C.G.; Hale, J.G.; Kawato, M.; Kotosaka, S.; Pollick, F.E.; Riley, M.; Schaal, S.;
Shibata, T.; Tevatia, G.; Ude A. & Vijayakumar, S. (2000). Using humanoid robots
to study human behavior. IEEE Intelligent Systems, 15, pp46-56
Bergener, T.; Bruckhoff, C.; Dahm, P.; Janben, H.; Joublin, F. & Menzner, R. (1997). Arnold: An
Anthropomorphic Autonomous Robot for Human Environments. Proc. Selbstorganisation
von Adaptivem Verhalten (SOAVE 97), 23-24 Sept., Technische Universität Ilmenau
Bruderlin, A. & Williams, L. (1995). Motion Signal Processing. Proc. SIGGRAPH 95, Computer
Graphics Proceedings, Annual Conference Series, pp97-104
Cohen, M.F. (1992). Interactive Spacetime Control for Animation. Proc. SIGGRAPH 92,
Computer Graphics Proceedings, Annual Conference Series, pp293-302
Coppin P.; Pell, R.; Wagner, M.D.; Hayes, J.R.; Li, J.; Hall, L. ; Fischer, K.D.; Hirschfield &
Whittaker, W.L. (2000). EventScope: Amplifying Human Knowledge and
Experience via Intelligent Robotic Systems and Information Interaction. IEEE
International Workshop on Robot-Human Interaction, Osaka, Japan
Cormen, T.H.; Leiserson, C.E. & Rivest, R.L. Introduction To Algorithms. McGraw-Hill,
ISBN 0-07-013143-0
Flash, T. & Hogan, N. (1985). The coordination of arm movements: An experimentally
confirmed mathematical model. Journal of Neuroscience, 5, pp1688-1703
Giese, M.A. & Poggio, T. (2000). Morphable models for the analysis and synthesis of
complex motion patterns. International Journal of Computer Vision, 38, 1, pp59-73
Giese, M.A. & Lappe, M. (2002). Perception of generalization fields for the recognition of
biological motion. Vision Research, 42, pp1847-1858
Gleicher, M. (1997). Motion Editing with Spacetime Constraints. Proc. 1997 Symposium on
Interactive 3D Graphics
Handzel, A. A. & Flash, T. (1999). Geometric methods in the study of human motor control.
Cognitive Studies 6, 309-321
Harris, C.M. & Wolpert, D.M. (1998). Signal-dependent noise determines motor planning.
Nature 394, pp780-784
Hikiji, H. (2000). Hand-Shaped Force Interface for Human-Cooperative Mobile Robot.
Proceedings of the 2000 IEICE General Conference, A-15-22, pp300
Hill, H. & Pollick, F.E. (2000). Exaggerating temporal differences enhances recognition of
individuals from point light displays. Psychological Science, 11, 3, pp223-228
Hodgins, J.K. & Pollard, N.S. (1997). Adapting Simulated Behaviors For New Characters. Proc.
SIGGRAPH 97, Computer Graphics Proceedings, Annual Conference Series, pp153-162
Kawato, M. (1992). Optimization and learning in neural networks for formation and control
of coordinated movement. In: Attention and performance, Meyer, D. and Kornblum,
S. (Eds.), XIV, MIT Press, Cambridge, MA, pp821-849
Komura, T. & Shinagawa, Y. (1997). A Muscle-based Feed-forward controller for the Human
Body. Computer Graphics forum 16(3), pp165-176
Lacquaniti, F.; Terzuolo, C.A. & Viviani, P. (1983). The law relating the kinematic and figural
aspects of drawing movements. Acta Psychologica, 54, pp115-130
Li, Z.; Gortler, S.J. & Cohen, M.F. (1994). Hierarchical Spacetime Control. Proc. SIGGRAPH
94, Computer Graphics Proceedings, Annual Conference Series, pp35-42
Ngo, J.T. & Marks, J. (1993). Spacetime Constraints Revisited. Proc. SIGGRAPH 93, Computer
Graphics Proceedings, Annual Conference Series, pp343-350
Pollick, F.E. & Sapiro, G. (1996). Constant affine velocity predicts the 1/3 power law of
planar motion perception and generation. Vision Research, 37, pp347-353
Pollick, F.E.; Flash, T.; Giblin, P.J. & Sapiro, G. (1997). Three-dimensional movements at
constant affine velocity. Society for Neuroscience Abstracts, 23, 2, pp2237
Pollick, F.E.; Fidopiastis, C.M. & Braden, V. (2001a). Recognizing the style of spatially
exaggerated tennis serves. Perception, 30, pp323-338
Pollick, F.E.; Paterson, H.; Bruderlin, A. & Sanford, A.J. (2001b). Perceiving affect from arm
movement. Cognition, 82, B51-B61
Rose, C.; Guenter, B.; Bodenheimer, B. & Cohen, M.F. (1996). Efficient Generation of Motion
Transitions using Spacetime Constraints. Proc. SIGGRAPH 96, Computer Graphics
Proceedings, Annual Conference Series, pp147-154
Rose, C.; Bodenheimer, B. & Cohen, M.F. (1998). Verbs and Adverbs: Multidimensional
Motion Interpolation. IEEE Computer Graphics & Applications, 18(5)
Scassellati, B. (1999). Knowing What to Imitate and Knowing When You Succeed. Proc. of
AISB Symposium on Imitation in Animals and Artifacts, Edinburgh, Scotland
Scassellati, B. (2000). Investigating models of social development using a humanoid robot.
In: Biorobotics, Webb, B. and Consi, T. (Eds.), MIT Press, Cambridge, MA
Soechting, J.F. & Terzuolo, C.A. (1987a). Organization of arm movements. Motion is
segmented. Neuroscience, 23, pp39-51
Soechting, J.F. & Terzuolo, C.A. (1987b). Organization of arm movements in three-
dimensional space. Wrist motion is piecewise planar. Neuroscience, 23, pp53-61
Stokes, V.P.; Lanshammar, H. & Thorstensson, A. (1999). Dominant Pattern Extraction from
3-D Kinematic Data. IEEE Transactions on Biomedical Engineering 46(1)
Takeda H.; Kobayashi N.; Matsubara Y. & Nishida, T. (1997). Towards Ubiquitous Human-
Robot Interaction. Proc. of IJCAI Workshop on Intelligent Multimodal Systems, Nagoya
Congress Centre, Nagoya, Japan
Tevatia, G. & Schaal, S. (2000). Inverse kinematics for humanoid robots. IEEE International
Conference on Robotics and Automation, San Francisco, CA
Uno, Y.; Kawato, M. & Suzuki, R. (1989). Formation and control of optimal trajectory in
human multijoint arm movement. Biological Cybernetics, 61, pp89-101
Viviani, P. & Stucchi, N. (1992). Biological movements look uniform: Evidence of motor-
perceptual interactions. Journal of Experimental Psychology: Human Perception and
Performance, 18, pp602-623
Williamson, M.M. (1996). Postural primitives: Interactive Behavior for a Humanoid Robot
Arm. Proc. of SAB 96, Cape Cod, MA, USA
Witkin, A. & Kass, M. (1988). Spacetime Constraints. Proc. SIGGRAPH 88, Computer Graphics
Proceedings, Annual Conference Series, pp159-168
Yamane, K. & Nakamura, Y. (2000). Dynamics Filter: Towards Real-Time and Interactive
Motion Generator for Human Figures. Proc. WIRE 2000, 27-34, Carnegie Mellon
University, Pittsburgh, PA
17
1. Introduction
An obvious problem confronting humanoid robotics is the generation of stable and
efficient gaits. Whereas wheeled robots normally are statically balanced and remain
upright regardless of the torques applied to the wheels, a bipedal robot must be actively
balanced, particularly if it is to execute a human-like, dynamic gait. The success of gait
generation methods based on classical control theory, such as the zero-moment point
(ZMP) method (Takanishi et al., 1985), relies on the calculation of reference trajectories for
the robot to follow. In the ZMP method, control torques are generated in order to keep the
zero-moment point within the convex hull of the support area defined by the feet. When
the robot is moving in a well-known environment, the ZMP method certainly works well.
However, when the robot finds itself in a dynamically changing real-world environment,
it will encounter unexpected situations that cannot be accounted for in advance. Hence,
reference trajectories can rarely be specified under such circumstances. In order to address
this problem, alternative, biologically inspired control methods have been proposed,
which do not require the specification of reference trajectories. The aim of this chapter is
to describe one such method, based on central pattern generators (CPGs), for control of
bipedal robots.
Clearly, walking is a rhythmic phenomenon, and many biological organisms are indeed
equipped with CPGs, i.e. neural circuits capable of producing oscillatory output given tonic
(non-oscillating) activation (Grillner, 1996). There exists biological evidence for the presence
of central pattern generators in both lower and higher animals. The lamprey, which is one of
the earliest and simplest vertebrate animals, swims by propagating an undulation along its
body. The wave-like motion is produced by an alternating activation of motor neurons on
the left and right sides of the segments along the body. The lamprey has a brain stem and
spinal cord with all basic vertebrate features, but with orders of magnitude fewer nerve cells
of each type than higher vertebrates. Therefore, it has served as a prototype organism for the
detailed analysis of the nervous system, including CPGs, in neurophysiological studies
(Grillner, 1991; Grillner, 1995). In some early experiments by Brown (Brown, 1911, Brown,
1912), it was shown that cats with transected spinal cord and with cut dorsal roots still
showed rhythmic alternating contractions in ankle flexors and extensors. This was the basis
of the concept of a spinal locomotor center, which Brown termed the half-center model
(Brown, 1914). Further biological support for the existence of a spinal CPG structure in
vertebrates is presented in (Duysens & Van de Crommert, 1998).
However, there is only evidence by inference of the existence of human CPGs. The strongest
evidence comes from studies of newborns, in which descending supraspinal control is not
yet fully developed, see e.g. (Zehr & Duysens, 2004) and references therein. Furthermore,
advances made in the rehabilitation of patients with spinal cord lesions support the notion
of human CPGs: Treadmill training is considered by many to rely on the adequate afferent
activation of CPGs (Duysens & Van de Crommert, 1998). In view of the results of the many
extensive studies on the subject, it seems likely that primates in general, and humans in
particular, would have a CPG-like structure.
In view of their ability to generate rhythmic output patterns, CPGs are well suited as the
basis for bipedal locomotion. Moreover, CPGs exhibit certain properties of adaptation to the
environment: Both the nervous system, composed of coupled neural oscillators, and the
musculo-skeletal system have their own nonlinear oscillatory dynamics, and it has been
demonstrated that, during locomotion, some recursive dynamics occurs between these two
systems. This phenomenon, termed mutual entrainment, emerges spontaneously from the
cooperation among the systems’ components in a self-organized way (Taga et al., 1991).
That is, natural periodic motion, set close to the natural (resonant) frequency of the
mechanical body, is achieved by the entrainment of the CPGs to a mechanical resonance by
sensory feedback. The feedback is non-essential for the rhythmic pattern generation itself,
but rather modifies the oscillations in order to achieve adaptation to environmental changes.
In the remainder of this chapter, the use of CPGs in connection with bipedal robot control
will be discussed, with particular emphasis on CPG network optimization aimed at
achieving the concerted activity needed for bipedal locomotion. However, first, a brief
introduction to various CPG models will be given.
In Brown's half-center model, mutually inhibiting connections between the two half-centers of interneurons ensure that when one half-center is active, the other is being
suppressed. It was hypothesized that, as activity in the first half-center progressed, a process
of fatigue would build up in the inhibitory connections between the two half-centers,
thereby switching activity from one half-center to the other (Brown, 1914). Since then,
support for the half-center model has been found in experiments with cats (Duysens & Van
de Crommert, 1998).
Fig. 1. The Matsuoka oscillator unit. The nodes (1) and (2) are referred to as neurons, or cells.
Excitatory connections are indicated by open circles, and inhibitory connections are
indicated by filled disks.
However, in this work the CPG model formulated in mathematical terms by Matsuoka
(Matsuoka, 1987) has been used for the development of CPG networks for bipedal walking.
The Matsuoka model is a mathematical description of the half-center model. In its simplest
form, a Matsuoka CPG (or oscillator unit) consists of two neurons arranged in mutual
inhibition, as depicted in Fig. 1. The neurons in the half-center model are described by the
following differential equations (Taga, 1991):
\tau_u \dot{u}_i = -u_i - \beta v_i - \sum_{j=1}^{N} w_{ij} y_j + u_0    (1)
\tau_v \dot{v}_i = -v_i + y_i    (2)
y_i = \max(0, u_i)    (3)
where u_i is the inner state of neuron i, v_i is an auxiliary variable measuring the degree of self-
inhibition (modulated by the parameter β) of neuron i, τ_u and τ_v are time constants, u_0 is an
external tonic (non-oscillating) input, w_ij are the weights connecting neuron j to neuron i,
and, finally, y_i is the output of neuron i. Two such neurons arranged in a network of mutual
inhibition (a half-center model), form an oscillator, in which the amplitude of the oscillation
is proportional to the tonic input u_0. The frequency of the oscillator can be controlled by
changing the values of the two time constants τ_u and τ_v. If an external oscillatory input is
applied to the input of a Matsuoka oscillator, the CPG can lock onto its frequency. Then,
when the external input is removed, the CPG smoothly returns to its original oscillation
frequency. This property, referred to as entrainment, is highly relevant for the application of
the Matsuoka oscillator in adaptive locomotion (Taga, 1991).
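To make the model concrete, the following is a minimal sketch of a two-neuron Matsuoka half-center oscillator integrated with a simple Euler scheme; the parameter values are illustrative assumptions chosen only to produce oscillations, not the values used later in this chapter.

import numpy as np

def matsuoka_step(u, v, w, beta, tau_u, tau_v, u0, dt, f=0.0):
    """One Euler step of a Matsuoka half-center oscillator (Eqs. (1)-(3); f is the feedback term of Eq. (5))."""
    y = np.maximum(0.0, u)                          # Eq. (3): neuron outputs
    du = (-u - beta * v - w @ y + u0 + f) / tau_u   # Eq. (1)
    dv = (-v + y) / tau_v                           # Eq. (2): self-inhibition dynamics
    return u + dt * du, v + dt * dv

# Illustrative parameters (assumed values)
beta, tau_u, tau_v, u0, dt = 2.5, 0.12, 0.24, 1.0, 0.005
w = np.array([[0.0, 2.0],                           # mutual inhibition between the
              [2.0, 0.0]])                          # flexor and extensor neurons
u, v = np.array([0.1, 0.0]), np.zeros(2)

flexor_output = []
for _ in range(2000):
    u, v = matsuoka_step(u, v, w, beta, tau_u, tau_v, u0, dt)
    flexor_output.append(max(0.0, u[0]))            # oscillatory output y_1 over time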
of walking speed and step length. By continuously decreasing the speed, the robot could be
brought to a halt in a controlled way. It was also possible to generate backwards walking by
simply inverting the sign of all frequency parameters in the CPG controller.
In the simulation experiments presented here, a fully three-dimensional bipedal robot with
14 DOFs, shown in the leftmost panel of Fig. 2, was used. The simulated robot weighs 7 kg
and its height is 0.75 m. The distance between the ground and the hips is 0.45 m. The waist
has 2 DOFs, each hip joint 3 DOFs, the knee joints 1 DOF each, and the ankle joints 2 DOFs
each, as illustrated in Fig. 2 (second panel from the left).
The simulations were carried out using the EvoDyn simulation library (Pettersson, 2003),
developed at Chalmers University of Technology. Implemented in object-oriented Pascal,
EvoDyn runs both on Windows and Linux platforms and is capable of simulating tree-
structured rigid-body systems. Its dynamics engine is based on a recursively formulated
algorithm that scales linearly with the number of rigid bodies in the system (Featherstone,
1987). For numerical integration of the state derivatives of the simulated system, a fourth
order Runge-Kutta method is used. Visualization is achieved using the OpenGL library.
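For reference, a generic fourth-order Runge-Kutta step of the kind used for integrating the state derivatives can be sketched as follows; this is a textbook sketch operating on a numpy state vector, not EvoDyn's actual interface.

def rk4_step(f, t, state, dt):
    """Advance d(state)/dt = f(t, state) by one fourth-order Runge-Kutta step."""
    k1 = f(t, state)
    k2 = f(t + 0.5 * dt, state + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, state + 0.5 * dt * k2)
    k4 = f(t + dt, state + dt * k3)
    return state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)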
Fig. 2. The two panels to the left show the simulated robot and its kinematics structure with
14 DOFs, used in this work. The panels on the right show the dimensions of the robotic
body.
In the simulation experiments, the output of the CPG network was either interpreted as
motor torques, which were applied directly to the joint actuators, or as desired joint angles
for the robot’s joints to follow. In the latter case, a standard PD controller was implemented
in order to generate the adequate torques to the joint actuators.
starting from randomly generated initial populations (the members of which are referred to
as individuals, following standard GA terminology) of such networks. Thus, for experiments
of this kind, the first challenge is to choose a suitable fitness measure that will favor CPG
networks capable of executing an upright bipedal gait.
Fig. 3. The posture-support structure used in this work for helping the robot to balance as it
starts to walk.
Fig. 4. Left panel: the structure of the CPG network used in the simulations, with the hip1
joint CPG marked with a darker shade of gray. The connections between the individual
oscillators are represented, in a slightly simplified way, by the arrows in the figure; see the
text for a more detailed explanation. Right panel: The robot’s kinematics structure with joint
names indicated.
The CPG network is further constrained such that only the hip1 CPG, responsible for rotation
in the sagittal plane, can be connected to all the other ipsilateral (same-side) joint CPGs, the
corresponding contralateral (opposite-side) hip1 CPG, and the waist CPGs. Note, however, that the hip1 CPG
on a given side, can only receive connections from the corresponding contralateral hip1 CPG,
see Fig. 4 for more details. The remaining joint oscillators do not transmit any signals at all.
The introduction of these constraints was motivated by several preliminary experiments, the
results of which indicated that the evolutionary process was considerably less efficient when
a fully connected CPG network was used, yet without improving the final results.
For simplicity, in Fig. 4, the connections between CPG units are depicted with a single
arrow. In fact, each such arrow represents a maximum of four unique connections between
two joint oscillators, as indicated in Fig. 5. This figure depicts, as an example, the possible
connections between the hip1 CPG and the knee CPG. Thus, given the symmetry and
connectivity constraints presented above, a total of 8x4 = 32 connections are to be
determined by the GA. Note that it is possible that the GA may disable all connections, w1 –
w4, between two CPG units. If this happens, one connection will be forced between the two
CPGs. This is motivated by the fact that, without any interconnection, no synchronization
between the two joints will be possible. In this setup, the forced connection will be added
between the flexor neuron in the transmitting CPG, and the flexor neuron in the receiving
CPG. There is, however, no particular reason for preferring this connection; choosing any
other connection would not make any fundamental difference.
Fig. 5. All possible connections between the hip1 CPG and the knee CPG.
For optimization of the CPG network, a standard genetic algorithm, (Holland, 1975), was
chosen. In the GA, the CPG network of a given individual was represented by two
chromosomes; one binary-valued chromosome determining the presence or absence of each
of the 32 evolvable connections, and one real-valued chromosome determining the
parameter values for those connections that are actually used in a given individual. In
addition to these two chromosomes, a third (real-valued) chromosome was used for
determining the sign and strength of the different feedback paths, as described in the next
subsection.
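One plausible way of organizing such a three-chromosome representation is sketched below; the array sizes follow the counts given in the text (32 evolvable connections and, as described later, 20 feedback parameters), but the exact encoding used in the study may differ.

import numpy as np
from dataclasses import dataclass, field

@dataclass
class CPGGenome:
    # Chromosome 1 (binary): presence or absence of each of the 32 evolvable connections
    connection_mask: np.ndarray = field(
        default_factory=lambda: np.random.randint(0, 2, size=32))
    # Chromosome 2 (real-valued): weights of the connections that are switched on
    connection_weights: np.ndarray = field(
        default_factory=lambda: np.random.uniform(-1.0, 1.0, size=32))
    # Chromosome 3 (real-valued): sign and strength of the feedback paths
    feedback_gains: np.ndarray = field(
        default_factory=lambda: np.random.uniform(-1.0, 1.0, size=20))

    def active_weights(self):
        """Weights actually used in the CPG network (masked by chromosome 1)."""
        return self.connection_weights * self.connection_mask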
In early experiments, the internal parameters (see Eqs. (1) and (2)) of the individual half-
center CPGs were also evolved by the GA. However, with the supporting structure present,
evolution always promoted parameter values that produced unnaturally large steps, so
that the support could be exploited for reaching higher fitness values. In such cases, the
robot became highly dependent on the supporting structure for balance, making it even
more difficult to find an appropriate removal strategy for the support. For this reason, the
internal parameters of the individual half-center CPGs were set to fixed values, generating a
motion frequency of the robot’s legs approximating that of normal walking. The chosen
parameters are shown in Fig. 6 along with the resulting output from the half-center
oscillator. These parameters were applied to all CPGs, except for the knee joint CPGs and the
waist1 CPG. In analogy with human gait, the knee joint oscillator and the waist1 oscillator
were tuned to generate a rhythmic pattern with double frequency compared to the other
CPGs. Thus, for these joints’ CPGs, the τ_u and τ_v values were set to half of the values used for
the other CPGs.
In each run, a population of 180 individuals was used. Selection of individuals (for
reproduction), was carried out using tournament selection, with a tournament size of 8
individuals. In a given tournament, the individual with the highest fitness value was
selected with a probability of 0.75, and a random individual (among the 8) with 0.25
probability. After each selection step, the mutation operator was applied, randomly
changing a gene’s value with probability 10/N, where N denotes the total number of genes
in the three chromosomes of the individual.
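A rough sketch of this selection and mutation scheme, with the three chromosomes flattened into a single gene list for brevity, could look as follows; the resampling range used in the mutation is an assumption made for the example.

import random

def tournament_select(population, fitnesses, tournament_size=8, p_best=0.75):
    """Return one parent: the fittest tournament entrant with probability 0.75, otherwise a random entrant."""
    entrants = random.sample(range(len(population)), tournament_size)
    if random.random() < p_best:
        winner = max(entrants, key=lambda i: fitnesses[i])
    else:
        winner = random.choice(entrants)
    return population[winner]

def mutate(genes, expected_mutations=10):
    """Change each gene with probability 10/N, N being the total number of genes."""
    n = len(genes)
    return [random.uniform(-1.0, 1.0) if random.random() < expected_mutations / n else g
            for g in genes]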
Fig. 6. Left panel: The half-center oscillator assigned to each joint. Right panel: The
parameter values used, along with a typical output signal. Note that the knee and waist1 joint
oscillators used different values of the time constants (see the main text for details).
The fitness measure was taken as the distance walked by the robot in the initial forward
direction, x, decreased by the sideways deviation y. The fitness function F, for a given
individual i, can thus be expressed as:
F(i) = x - y    (4)
In addition to the fitness function, several indirect punishments were introduced during
evaluation. For example, since the two-point supporting structure provided no support in
the sagittal plane, the robot could still fall to the ground during evaluation, leading to the
emergence of motion patterns such as crawling. Avoiding this was important, particularly
in the beginning of the evolutionary process, since these individuals could gain better
fitness than the very first individuals that walked with an upright body posture. Hence,
even though the support was used, the resulting gaits could develop towards gait
patterns other than upright walking. Thus, a rule was introduced such that if a robot’s
hips collided with the ground, the evaluation of that particular individual was
terminated. Other punishments or modifications that were also introduced are described
in the experiment section.
The touch sensor was used both as tactile feedback, i.e. to indicate foot-to-ground contact,
and to enable, or prohibit, other feedback signals to be connected to certain CPGs
during a specific phase of the gait; see Fig. 7 for an example.
In this section, only some illustrative examples of the feedback network structure are given.
For a detailed description of the feedback paths see (Wolff et al., 2006). The feedback signals
to an individual CPG are introduced by adding an extra term to Eq. (1), which then
becomes:
\tau_u \dot{u}_i = -u_i - \beta v_i - \sum_{j=1}^{N} w_{ij} y_j + u_0 + f    (5)
where f denotes the sum of all feedback signals connected to the particular CPG.
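As an illustration of how the sum f might be assembled for, say, the left hip1 CPG from the sensor signals described below, consider the following sketch; the sensor names are placeholders, and the gains k_touch and k_incline stand for the GA-determined signs and strengths of the feedback paths.

def left_hip1_feedback(right_foot_contact, left_upper_leg_angle,
                       right_upper_leg_angle, k_touch, k_incline):
    """Sum of feedback signals f fed into the left hip1 CPG (illustrative example only)."""
    # Touch-sensor path: triggers the swing of the left leg once the right foot lands.
    f_touch = k_touch * (1.0 if right_foot_contact else 0.0)
    # Inclination path: promotes anti-phase motion of the left and right upper legs.
    f_incline = k_incline * (left_upper_leg_angle - right_upper_leg_angle)
    return f_touch + f_incline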
When designing the feedback network, the main idea was to trigger the different phases
in a gait cycle. For example, for the left hip1 CPG, the feedback structure was designed in
the following way: The flexor and the extensor neuron receive two feedback signals, as
shown in the left panel of Fig. 7: One signal from the right-foot touch sensor, and one
signal measuring the inclination angles of the left and right upper leg, scaled by the
strength of the feedback connections. The feedback signal from the touch sensor is
intended to trigger the swing motion of the left leg, as soon as the right foot touches the
ground. Feedback from the inclination angles of the upper legs is intended to promote
repetitive leg motion in such a way that, when the left leg is moving forward, the right leg
should move backwards, and vice versa. The feedback paths connected to the hip1 CPG
are depicted in the left panel of Fig. 7. Note that the types of connections shown in the
figure were chosen to more easily demonstrate how the feedback signals were intended to affect
the motion of the hip1 joint. However, since the dynamical system of a single CPG unit
becomes rather complex in an interaction with other CPGs, the robot, and the
environment, it is very hard to predict if the chosen feedback configuration will work as
intended. For this reason, both the sign and the strength of each connection were
determined by the GA (but not the structure of the feedback paths). Fig. 7 shows an
example in which the feedback connection to the flexor neuron of a certain CPG happened
to be of opposite sign compared with the feedback connection to the extensor neuron of
the same CPG.
In the right panel of Fig. 7, the feedback paths connected to the right hip2 CPG are
illustrated. In the same way as above, the illustrated types of connections in the figure are intended
only to demonstrate how the structure of the feedback for this joint was planned to affect the
motion of the hip2 joint. It does not represent the best connection configuration (in terms of
the sign and the strength) chosen by the GA. In the case of hip2, only the feedback signal
measuring the inclination angle of the right upper leg is connected to the oscillator.
However, the feedback is only transmitted to the CPG unit if the touch sensor in the right
foot is activated, i.e. when the right foot is on the ground. This configuration was chosen
with the intention of ensuring that the hip2 CPG was able to generate more torque in the
stance phase, i.e. when the right leg is moving backwards, during which the entire body of
the robot is resting on the right leg.
One should notice that by choosing an adequate feedback structure, the feedback signals
may indirectly force synchronization between different joints, even if the connection weights
between the joints’ CPG units may not necessarily ensure this by themselves. In this work, a
total of 20 parameters, determining the sign and the strength of the different feedback paths
were evolved by the GA; see (Wolff et al., 2006) for details.
Fig. 7. The simulated robot with the hip1 and hip2 joint CPGs and the corresponding
feedback, with specific connection types (i.e. signs) chosen to illustrate how the feedback
was intended to affect the corresponding joint’s motion (see the main text for details). The
hip1 flexor neuron rotates the leg in the counter-clockwise direction in the sagittal plane2,
while the hip2 flexor neuron rotates the leg in the clockwise direction in the frontal plane
(seen from behind the robot).
2 The sagittal plane is a vertical plane running from front to back through the body, while the
frontal plane is a vertical plane running from side to side through the body, perpendicular to the
sagittal plane.
way to balance the robot in the frontal plane as well. Before the second step of evolution was
started, the hip2 and ankle joints were unlocked, and the corresponding genes were set
randomly for each individual in the population. Since the waist joints also affect the frontal
plane balance, the corresponding genes were randomly initiated as well. The remaining
genes (which take identical values for all individuals) ensured sagittal plane balance and
were therefore left unchanged in the second step of evolution. In this approach the hip3 joint
remained locked during the entire procedure. The fitness measure used in both steps was
the distance walked by the robot in the initial forward direction, decreased by the sideways
deviation, as formulated in Eq. (4).
Since the two-point supporting structure provides no support in the sagittal plane, the
evolutionary procedure found several easily accessible solutions for maintaining sagittal
balance. In some cases, the evolved individuals used the torso as a third leg, or the knees, for
maintaining balance. Other examples include forward motion using somersaults or walking
with unnaturally large steps. In order to prevent the individuals from developing other
means of locomotion than upright, human-like gaits, several constraints had to be
introduced. In Method 1, the following rules and modifications were added:
a) The contact points in the knee joints and the torso, used for detecting ground
collisions, were removed.
b) If the robot’s hip collided with the ground, the evaluation run of that individual
was aborted.
c) If the robot’s hips were located less than 0.15 m above the ground, the supporting
structure was immediately removed, and was not further used during the
evaluation of the individual.
The removal of the robot’s contact points in the torso and the knees eliminated the
possibility of misusing these body parts for support. Ending a run as soon as the robot’s hip
collided with the ground, efficiently removed all crawling solutions. Finally, punishing the
individuals for having the hips too close to the ground successfully removed those
individuals trying to take unnaturally large steps (for improved sagittal plane balance). If
the step length is large and the support is removed in this way, the robot will most likely be
unable to maintain the frontal plane balance. Thus it will fall to the ground, terminating its
evaluation. Using these rules, the evolutionary procedure was strongly biased towards
individuals walking with an upright posture.
Initially, the posture support was intended to be present during the entire evaluation time of
the first step, and then completely removed in the second step. However, it turned out that
such an approach gave no obvious way of deciding at what point to interrupt the first step;
this simply had to be judged by the experimenter in an ad hoc manner. Evolving for too
long, for example, led to individuals that were indeed walking rapidly but, at the same time,
over-exploiting the posture support, making the produced gait unsuitable for the second
step, where the support was removed. For this reason, the time during which the support
was present in the first step was changed from the entire evaluation period to only the first
two seconds of the evaluation. This arrangement is motivated by the assumption that it is
during the starting sequence, before entering the gait cycle, that the individuals are most
vulnerable in terms of frontal plane balance. For this reason, in these experiments, the
posture support was also present during the initial two seconds in the next step. The details
regarding the experiments carried out using Method 1 are shown in Table 1. The results
from the simulations are given in the next subsection.
T_i = k_p (\theta_r - \theta_i) - k_d \dot{\theta}_i    (6)
where T_i is the torque applied to joint i, k_p is the proportional constant, k_d is the derivative
constant, θ_r is the desired angle, and θ_i is the current angle of the ith joint. In the experiment
described in the 2nd row of Table 2, k_p and k_d were chosen as 20.0 and 4.0, respectively.
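In code, the PD law of Eq. (6) with the gains quoted above amounts to the following sketch; the joint-state accessors in the usage comment are assumed names.

def pd_torque(theta_desired, theta, theta_dot, kp=20.0, kd=4.0):
    """PD controller of Eq. (6): T_i = kp * (theta_r - theta_i) - kd * theta_dot_i."""
    return kp * (theta_desired - theta) - kd * theta_dot

# Example usage (hypothetical joint interface):
# torque = pd_torque(cpg_output_angle, joint.angle, joint.angular_velocity)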
In both cases, the fitness measure was the distance walked by the robot in the initial forward
direction, decreased by the sideways deviation, as formulated in Eq. (4).
4.4 Results
Method 1:
In Method 1, the generation of upright bipedal locomotion was carried out in a two-step
evolutionary procedure. In the first step, during which the hip2, hip3 and ankle joints were
locked, an individual walking with a speed of 0.58 m/s was evolved after 400 generations.
The results for this individual are shown in Table 3, case 1.1. As Table 3 implies, the
individual evolved here was fairly unstable and able to maintain balance only for one
minute. Note that the stability in the frontal plane was only ensured by the waist joint, since
the hip2 and ankle joints were locked. Even though slower, but stable, individuals were found
earlier in evolution, i.e. individuals that maintained balance during the testing time of 20
minutes, they were not suitable as a starting condition for the second step because of the
way these individuals were walking: They were heavily leaning the torso forward, or
backwards, keeping it motionless to create a sort of counterweight for balance. By contrast,
the fastest individual maintained an active upright posture during the entire walking
sequence.
Case Method F v DOFs Resulting gait
1.1 1, step1 19.54 0.56 8 slow, stable for 60 sec.
1.2 1, step2 23.09 0.58 12 slow, stable for 12 min.
2 2, torque 35.56 0.90 12 fast, stable for 42 sec.
3 2, angle 52.46 1.31 12 fast, stable for >20 min.
Table 3. Results from the trials made using Method 1 and Method 2. In the column labeled
Method, the method and the current step (or type of output) is given. F is the obtained fitness
[m], and v denotes the average locomotion speed [m/s] of the robot during the evaluation
period. DOFs is the number of active joints, and the last column gives a short description of
the resulting gaits.
Fig. 8. The best evolved gait pattern, using Method 1, with 12 active DOFs. The details of the
corresponding individual are shown on the 2nd row of Table 3.
In the second step, the hip2 and ankle joints were unlocked and the best individual, found
within 100 generations, was able to walk with a speed of 0.58 m/s. However, when the
individual was tested for longer evaluation times, it fell after 12 minutes, see case 1.2 in
Table 3. The resulting gait is depicted in Fig. 8.
Method 2, torque-generating CPGs:
While the CPG networks in Method 1 were evolved in two steps, with the hip2, hip3 and ankle
joints locked at first, in Method 2, all joints, except the hip3 joint, were evolved in one step.
When Method 2 was tested with a torque-generating CPG network, the main difference
compared to Method 1 was the difficulty to evolve stable individuals. While the average
speed of the best individual was significantly improved, the balance capabilities were at the
same time reduced, and it was only after 800 generations that the best individual was able to
balance during the entire evaluation time of 40 seconds. However, this result was not
satisfactory since the individual did not walk more than 42 seconds when the evaluation
time was expanded; see Table 3, case 2 for details.
The resulting motion pattern resembled the one obtained in Method 1, shown in Fig. 8, with
the exception of the foot joint, which was now more active during the lift-off phase.
Nevertheless, the motion of the hip2 and ankle joints appeared rather nervous, suggesting
that more restraining feedback to these joints’ CPGs is necessary, preferably from their own
joint angles, something that was not included in the current feedback network.
Method 2, angle-generating CPGs:
When Method 2 was tested in connection with angle-generating CPGs, the results were
significantly improved, compared to the previous results, both in terms of locomotion speed
and stability of the best individual, but also in terms of evolution time. The best individual
found with this approach reached a distance of 52.46 m, having an average locomotion
speed of 1.31 m/s (see Table 3, case 3). The corresponding individual, in the case of torque-
generating CPGs, reached a distance of 35.56 m walking at an average locomotion speed of
0.90 m/s. Furthermore, stable individuals capable of walking during the entire evaluation
time of 40 seconds emerged quite early, around generation 25, compared to the case of
torque-generating CPGs, where such results were obtained only after 800 generations. While
the best individual using torque-generating CPGs was unable to maintain balance for more
than 42 seconds, the best individual equipped with the angle-generating CPGs was able to
walk without falling during a testing time of 20 minutes. The resulting gait is depicted in
Fig. 9.
Fig. 9. The best evolved gait pattern obtained using Method 2, with 12 active DOFs and with
angle-generating CPGs. The details of the corresponding individual are shown in the 4th row of
Table 3.
4.5 Discussion
The results of the study described above demonstrate the feasibility of using artificial
evolution for designing CPG networks, as well as optimizing the parameters of such
networks, in order to achieve efficient and stable bipedal walking for a simulated robot.
Both methods introduced here solved the problem of generating robust bipedal gaits.
However, the first method showed some limitations in generating gaits with long-term
stability. This drawback was a result of splitting up the evolutionary process into two parts.
Since the hip2 and ankle joints were locked during the first phase of Method 1, stability in the
frontal plane was only ensured by the torso. As a result, the gaits had to be less dynamic in
order for the robot to maintain balance in the frontal plane. Thus, these gaits were less
suitable for generalization to full 3D, i.e. with all joints unlocked. Yet, the gaits evolved with
this method were more stable than the solutions found in Method 2, when torque-
generating CPGs were used in the latter method. The frontal plane balance seems to be the
most critical point when trying to maintain an upright body posture. Thus, more feedback,
based on the inclination of the body, should preferably be introduced to restrain the motion
of some critical joints like the hip2 and the ankle.
The best results were obtained with angle-generating CPGs. The major contribution to the
improved results was the better motion control of the hip2 and ankle joints, which were now
easily restrained using the PD controller. However, a problem with this approach is that the
individuals start to take larger and larger steps, as evolution progresses. In order to prevent
the occurrence of such large steps, small perturbations should perhaps be introduced during
the evaluation of the individuals, preferably in the frontal plane. This should force the robot
to walk in a more cautious manner. The need for the supporting structure during the initial
seconds indicates that the CPG network handling the gait cannot also fully handle the start-up
of the walking cycle. Thus, an extra controller, based on CPGs or other approaches,
should be used for the start-up sequence. It should then be tuned to enter the walking cycle
and hand over control to a CPG network in a smooth way.
possible. Results from preliminary experiments of that kind indicate similar performance
of the evolved gaits (in terms of walking speed and stability) as that obtained in Case 3 of
Method 2. However, an interesting difference in the evolution of the gaits was observed.
In the experiments described in Section 4.4, the first stable gaits that emerged had
basically the same walking pattern as the final gait evolved in the same run. The major
contribution to the fitness increase was the increase in walking speed, rather than step
length. However, in the approaches where the individual CPG parameters are also
evolved, the first stable gaits that emerge have considerably smaller step length than the
final gait evolved in that run.
Another topic that requires further development concerns gait transition from, say, a
walking gait to a running gait. Two preliminary experiments have recently been carried out.
In the first case, a stop-and-go routine was accomplished by manually changing the bias
values and time constants appropriately. In the second case, a complete and smooth gait
transition was realized by abruptly switching control from one CPG network to another.
However, those maneuvers required quite some manual tuning of the CPG parameters in
order to work fully. In a robot intended for real-world applications, the transitions must be
carried out in a more automatic fashion.
Ultimately, the CPG-based controllers should, of course, be implemented in real, physical
robots. Preliminary experiments with a small humanoid robot with 17 DOFs have recently
been carried out (Wolff et al., 2007). In these experiments the robot's gait was optimized
using an evolutionary approach, including structural modifications of the gait control
program.
6. References
Brown, T. G. (1911). The intrinsic factors in the act of progression in the mammal,
Proceedings of the Royal Society of London Series B, Vol. 84, pp. 308–319
Brown, T. G. (1912). The factors in rhythmic activity of the nervous system, Proceedings of the
Royal Society of London Series B, Vol. 85, pp. 278–289
Brown, T. G. (1914). On the nature of the fundamental activity of the nervous centers:
Together with an analysis of the conditioning of rhythmic activity in progression,
and a theory of the evolution of function in the nervous system, Journal of
Physiology, Vol. 48, pp. 18–46
Duysens, J. & Van de Crommert, H. W. A. A. (1998). Neural control of locomotion, Part 1:
The central pattern generator from cats to humans, Gait and Posture, Vol. 7, No. 2,
pp. 131–141
Ekeberg, Ö; Wallén, P.; Lansner, A.; Tråvén, H.; Brodin, L. & Grillner, S. (1991). A computer
based model for realistic simulations of neural networks. I: The single neuron and
synaptic interaction, Biological Cybernetics, Vol. 65, No. 2, pp. 81–90
Ekeberg, Ö. (1993). A combined neuronal and mechanical model of fish swimming,
Biological Cybernetics, Vol. 69, No. 5-6, pp. 363–374
Endo, G.; Morimoto, J.; Nakanishi, J. & Cheng, G. (2004). An empirical exploration of a
neural oscillator for biped locomotion control. In Proceedings of the 2004 IEEE
International Conference on Robotics and Automation (ICRA’04).
Featherstone, R. (1987). Robot Dynamics Algorithms. Kluwer Academic Publishers.
Grillner, S. (1996). Neural networks for vertebrate locomotion, Scientific American, Vol. 274,
pp. 64-69
Grillner, S.; Wallén, P.; Brodin, L. & Lansner, A. (1991). Neuronal network generating
locomotor behavior in lamprey: Circuitry, transmitters, membrane properties, and
simulation, Annual Review of Neuroscience, Vol. 14, pp. 169–199
Grillner, S.; Deliagina, T.; Ekeberg, Ö.; El Manira, A.; Hill, R.; Lansner, A.; Orlovsky, G. &
Wallen, P. (1995). Neural networks that coordinate locomotion and body
orientation in lamprey, Trends in Neurosciences, Vol. 18, No. 6, pp. 270–279
Holland, J. (1975). Adaptation in Natural and Artificial Systems. The University of Michigan
Press, Ann Arbor, MI, USA.
Kling, U. & Székely, G. (1968). Simulation of rhythmic nervous activities. I. function of
networks with cyclic inhibitions. Kybernetik, Vol. 5, No. 3, pp. 89–103
Lewis, M.; Tenore, F. & Etienne-Cummings, R. (2005). CPG design using inhibitory networks.
In Proc Int Conf on Robotics and Automation (ICRA’05), IEEE-RAS. Barcelona, Spain.
Matsuoka, K. (1987). Mechanisms of frequency and pattern control in the neural rhythm
generators. Biological Cybernetics, Vol. 47, No. 2–3, pp. 345–353
Ogihara, N. & Yamazaki, N. (2001). Generation of human bipedal locomotion by a bio-
mimetic neuro-musculo-skeletal model. Biological Cybernetics, Vol.84, No.1, pp. 1–11
Ogino, M.; Katoh, Y.; Aono, M.; Asada, M. & Hosoda, K. (2004). Reinforcement learning of
humanoid rhythmic walking parameters based on visual information. Advanced
Robotics, Vol. 18, No. 7, pp. 677-697
Ott, E. (1993). Chaos in Dynamical Systems, Cambridge University Press, Cambridge, UK.
Pettersson, J. (2003). EvoDyn: A simulation library for behavior-based robotics, Department
of Machine and vehicle systems, Chalmers University of Technology, Göteborg,
Technical Report
Reil, T. & Husbands, P. (2002). Evolution of central pattern generators for bipedal walking in
a real-time physics environment. IEEE Transactions in Evolutionary Computation, Vol.
6, No. 2, pp. 159-168
Righetti, L. & Ijspeert, A. J. (2006). Programmable central pattern generators: an application
to biped locomotion control. In Proceedings of the 2006 IEEE International Conference
on Robotics and Automation.
Shepherd, G. M. (1994). Neurobiology, chapter 20, pages 435-451. Oxford University Press, 3rd
edition, New York.
Taga, G.; Yamaguchi, Y. & Shimizu, H. (1991). Self-organized control of bipedal locomotion
by neural oscillators in unpredictable environment, Biological Cybernetics, Vol. 65,
pp. 147–159
Taga, G. (2000). Nonlinear dynamics of the human motor control – real-time and
anticipatory adaptation of locomotion and development of movements. In Proc 1st
Int Symp on Adaptive Motion of Animals and Machines (AMAM’00).
Takanishi, A.; Ishida, M.; Yamazaki, Y. & Kato, I. (1985). The realization of dynamic walking
by the biped walking robot WL-10RD, Proceedings of the International Conference on
Advanced Robotics (ICAR'85), pp. 459-466
Van Wezel, B. M. H.; Ottenhoff, F. A. M. & Duysens, J. (1997) Dynamic control of location-
specific information in tactile cutaneous reflexes from the foot during human
walking. The Journal of Neuroscience, Vol. 17, No. 10, pp. 3804-3814
Wolff, K.; Pettersson, J.; Heralić, A. & Wahde, M. (2006). Structural evolution of central
pattern generators for bipedal walking in 3D simulation. In Proc of the 2006 IEEE
International Conference on Systems, Man, and Cybernetics, pp. 227-234
Wolff, K.; Sandberg, D. & Wahde, M. (2007). Evolutionary optimization of a bipedal gait in a
physical robot. Submitted to: IEEE Transactions on Robotics.
Zehr, E. P. & Duysens, J. (2004). Regulation of arm and leg movement during human
locomotion, Neuroscientist, Vol. 10, No. 4, pp. 347–361
18. Copycat Hand - Robot Hand Generating Imitative Behaviour at High Speed and with High Accuracy
1. Introduction
In recent years, robots demonstrating an aping or imitating function have been proposed.
Such robots estimate the actions of a human being by using a non-contact method
and reproduce the same actions. However, very few systems can imitate the behaviour of
the hand or the fingers (Bernardin et al., 2005). On the other hand, reports in Neuroscience
state that mirror neurons (Rizzolatti et al., 1996; Gallese et al., 1996), which participate in the
actions imitated by chimpanzees, are activated only for actions of the hand and fingers,
such as cracking peanut shells, placing a sheet of paper over an object, tearing the sheet into
pieces, etc. Moreover, since human beings perform intelligent actions using their hands, it is
important to attempt to artificially imitate the actions of hands and fingers in order to
understand the dexterity and intelligence of human hands from an engineering viewpoint.
The object actions in the case where one “looks at an action performed by others and
imitates it” include not only grasping or manipulating objects but also the actions involving
“imitating shapes and postures of hands and fingers” of others such as sign language and
dancing motions. The latter actions are often more complicated and require more dexterity
than the former actions.
In the action of imitating “the shape of hands and fingers” such as sign language, it is
essential to estimate the shape of the hands and fingers. Furthermore, as compared to the
imitation of the actions of the lower limbs, estimating the shape of the hands is significantly
more important and difficult. In particular, human hands, which have a multi-joint structure,
change their shapes in a complicated manner and often perform actions with self-occlusion
in which the portion of one’s own body renders other portions invisible. In the case of an
artificial system, we can utilize a multi-camera system that records a human hand for
imitative behaviours by surrounding it with a large number of cameras. However, all the
animals that mimic the motions of others have only two eyes. To realize the reproduction of
the actions of hands and fingers by imitating their behaviour, it is desirable to adopt a
single-eye or double-eye system construction.
To roughly classify conventional hand posture estimation systems, the following two types
of approaches can be used. The first approach is a 3D-model-based approach (Rehg &
Kanade, 1994; Kameda & Minoh, 1996; Lu et al., 2003) that consists of extracting the local
characteristics, or silhouette, in an image recorded using a camera and fitting a 3D hand
model, which has been constructed in advance in a computer, to it. The second approach is a
2D-appearance-based approach (Athitos & Scarloff, 2002; Hoshino & Tanimoto, 2005) that
consists of directly comparing the input image with the appearance of the image stored in a
database. The former is capable of high-accuracy estimations of the shape, but it is weak
against self-occlusion and also requires a long processing time. The latter can reduce the
computation time; however, in order to handle 3D changes in the appearance of hands,
which also include the motions of the wrist and the forearm, a large-scale reference database
is required, and it becomes difficult to control the robot hand by means of imitation.
However, if the fundamental difficulty in the estimation of the hand posture lies in the
complexity of the hand shape and self-occlusion, the high-accuracy estimation of the shape
will become theoretically possible; this requires the preparation of an extensive database
that includes hand images of all the possible appearances with complexity and self-
occlusion. The feasibility of this approach depends on the search algorithm used for rapidly
finding similar images from an extensive database.
Therefore, in this study, we aim at realizing a hand-shape estimation system that is fast and
accurate enough for a robot hand to reproduce the same actions as those of human hands and
fingers. With regard to
the processing speed, we aimed to achieve short-time processing of a level enabling the
proportional derivative (PD) control of a robot. With regard to the estimation accuracy, we
aimed at achieving a level of estimation error considered almost equivalent to that by the
visual inspection of a human being, namely, limiting the estimation error in the joint angle
to within several degrees. In addition to those two factors, dispersion in the estimation time
due to the actions of human hands and fingers causes difficulties in the control of a robot.
Therefore, we also aimed to achieve uniformity in the estimation time. In particular, in order
to conduct high-speed searches of similar data, we constructed a large-scale database using
simple techniques and divided it into an approximately uniform number of classes and data
by adopting the multistage self-organizing map (SOM) process (Kohonen, 1988), including
self-multiplication and self-extinction, and realized the high-speed and high-accuracy
estimation of the shape of a finger within a uniform processing time. We finally integrated
the hand posture estimation system (Hoshino & Tanimoto, 2005; 2006) with the humanoid
robot hand (Hoshino & Kawabuchi, 2005; 2006) designed by our research group; our final
system is referred to as the “copycat hand”.
2. System construction
2.1 Construction of a large-scale database
First, we conducted measurements of the articular angle data (hereafter referred to as the
“key angle data”). In the present study, we determined the articular angle at 22 points per
hand in the form of Euler angles by using a data glove (Cyberglove, Virtual Technologies).
For the articular data in the shape of a hand that requires a high detecting accuracy,
especially in the search for similar images, a large number of measurements were made.
Furthermore, we generated CG images of the hand on the basis of the key articular angle
data. The CG editing software Poser 5 (Curious Labs Incorporated) was used for the
generation of images.
Second, from two key angle data, we interpolated a plurality of articular angle data
in arbitrary proportions. The interpolation of the articular data is linear. Moreover, we also
generated the corresponding CG images of the hand on the basis of the interpolated data.
This operation enables an experimenter equipped with the data glove to obtain CG images
of the hand in various shapes with the desired fineness without having to measure all the
shapes of the hand. Fig.1 shows a schematic chart of the interpolation of the articular angle
data and the CG images of the hand. Furthermore, Fig.2 shows an example of the
interpolated CG images of the hand. This figure represents an example of a case where the
articular angle was measured at three different points in time for the actions of changing
from ‘rock’ to ‘scissors’ in the rock-paper-scissors game, and the direct generation of CG and
the generation of CG using interpolation were made from two adjoining data. In both these
figures, the three images surrounded by a square represent the former, while the other
images represent the latter.
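A minimal sketch of this linear interpolation between two key angle vectors (22 joint angles per hand) is given below; the function name and the choice of evenly spaced interpolation fractions are illustrative assumptions.

import numpy as np

def interpolate_key_angles(key_a, key_b, num_intermediate):
    """Linearly interpolate between two 22-element key angle vectors.

    Returns num_intermediate poses strictly between key_a and key_b; each pose
    can then be rendered as a CG hand image with the same software pipeline.
    """
    key_a = np.asarray(key_a, dtype=float)
    key_b = np.asarray(key_b, dtype=float)
    fractions = np.linspace(0.0, 1.0, num_intermediate + 2)[1:-1]
    return [(1.0 - t) * key_a + t * key_b for t in fractions]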
Fig. 1. Interpolation of the articular angle data and CG images of the hand.
Moreover, differences are likely to appear in (3) the manner of opening of the index and
the middle finger and (4) the standing angle of the reference finger in the ‘scissors’
shape, and also in (5) the manner of opening and (6) the manner of warping, etc. of the
thumb in the ‘paper’ shape. In order to express such differences among individuals in
the form of the CG hand, we need to adjust the parameters of the length of the finger
bone and the movable articular angle; therefore, we generated the CG images of hands
having differences among individuals on the basis of the articular angle data obtained
by the procedure described above. Fig.4 indicates an example of the additional
generation of the CG hand in different shapes. In the figure, the X axis shows CG hands
arranged in the order starting from those with larger projections of the thumb coxa,
while the Y axis represents those with larger curvature formed by the coxa of the four
fingers other than the thumb, respectively.
(Figure panels: (a) curvature of the coxa position of the four fingers other than the thumb; (b) manner of opening and difference in the reference angle; (c) manner of opening and manner of warping.)
By performing the first to third steps mentioned above, we generated a total of 15,000 CG
hand images using this system.
Next, the resolution was reduced. The CG images generated in this manner had a resolution of
320 x 240 pixels, but estimating the posture and applying the various image processing
techniques at this resolution requires a substantial calculation time. In the present study, a
reduced resolution of 64 x 64 pixels was therefore used. The pixel value after the resolution
change is given by the following expression:
g_r(i, j) = \frac{1}{r} \sum_{k} \sum_{l} g_o\left(i \cdot \frac{320}{64} + k,\; j \cdot \frac{320}{64} + l\right)    (1)
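In code, this block-averaging reduction can be sketched as follows, assuming the source image has been cropped or padded so that 64 five-pixel blocks (320/64 = 5) fit in both directions; here r of Eq. (1) corresponds to the number of pixels averaged per block.

import numpy as np

def downsample_block_average(image, out_size=64, block=5):
    """Reduce a grey-level image to out_size x out_size by averaging block x block neighbourhoods (cf. Eq. (1))."""
    reduced = np.zeros((out_size, out_size))
    r = block * block                          # number of pixels summed per block
    for i in range(out_size):
        for j in range(out_size):
            patch = image[i * block:(i + 1) * block, j * block:(j + 1) * block]
            reduced[i, j] = patch.sum() / r
    return reduced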
2.2 Characterization
In the present study, we used the higher-order local autocorrelational function (Otsu &
Kurita, 1998). The characteristics defined using the following expression were calculated
with respect to the reference point and its vicinity:
x_N(a_1, a_2, \ldots, a_N) = \int f(\mathbf{r}) f(\mathbf{r} + a_1) \cdots f(\mathbf{r} + a_N)\, d\mathbf{r}    (2)
Here, xN is the correlational function in the vicinity of the point r in dimension N. Since the
pixels around the object point are important when a recorded image is generally used as the
processing object, the order N was limited to two in the present study.
When excluding the equivalent terms due to parallel translation, x_N can be expressed
using 25 types of characteristic quantities, as shown in Fig.5. However, patterns M1 through
M5 should be normalized since they have a smaller scale than the characteristic quantities of
patterns M6 and thereafter. By further multiplying the pixel values of the reference point for
patterns M2 through M5 and by multiplying the square of the pixel value of the reference
point for pattern M1, a good agreement with the other characteristic quantities was obtained.
In the present study, an image was divided into 64 sections in total – 8 x 8 each in the
vertical and lateral directions - and the respective divided images were represented by 25
types of characteristic quantities using the higher-order local autocorrelational function.
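A rough sketch of computing such local autocorrelation features over 3 x 3 neighbourhoods is shown below. Only a few of the 25 displacement masks are listed; the full mask set, the normalization of patterns M1–M5 and the 8 x 8 subdivision of the image are omitted for brevity, and the masks given are standard HLAC patterns rather than a reproduction of the authors' exact set.

import numpy as np

def hlac_feature(image, displacements):
    """One local autocorrelation feature (cf. Eq. (2)):
    sum over reference pixels r of the product of image values at r + a_k, with a_0 = (0, 0)."""
    h, w = image.shape
    acc = np.ones((h - 2, w - 2))                        # valid region for 3 x 3 neighbourhoods
    for dy, dx in displacements:
        acc = acc * image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
    return float(acc.sum())

# A few example masks (displacements relative to the reference pixel):
example_masks = [
    [(0, 0)],                                            # zeroth order: the pixel itself
    [(0, 0), (0, 1)],                                    # first order: right neighbour
    [(0, 0), (-1, 0), (1, 0)],                           # second order: vertical line
    [(0, 0), (-1, -1), (1, 1)],                          # second order: diagonal line
]

def hlac_vector(image, masks=example_masks):
    """Feature vector obtained by applying each mask to the (sub)image."""
    return np.array([hlac_feature(image, m) for m in masks])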
Clustering was performed not only in a single layer but also in multiple layers. Fig. 6 shows the schematic
structure of the multiple layers. The class obtained with the aforementioned processing is
defined as the second-layer class and is considered as data. A third-layer class is prepared by
clustering the second-layer classes as the data. The third-layer class is prepared by following
the same procedure as that used in the preparation of the second-layer class. Further, a fourth-
layer class is prepared by clustering the third-layer classes. The smaller the amount of data in
one class (or the number of classes in the lower layers), the higher the layer up to which clustering
can be performed. However, to absorb the dispersion of data, etc., it is preferable to prepare
classes having an amount of data with a certain volume. Table 2 lists the results of the
preliminary experiment in which clustering was performed by setting the amount of data in a
class at 5, 10 and 20. Although the search time is reduced if the clustering is performed with a
small amount of data, the estimation accuracy also decreases accordingly; therefore, as a
trade-off between the two, we set the amount of data per class to 10 in the present study.
The clustered database obtained using the abovementioned operation was termed a
tertiary database. This tertiary database will hereafter be simply referred to as the database.
In this system, we finally constructed a database comprising 5, 10 and 10 classes in order
from the upper layers, where each class has approximately 10 data items.
E_r = \sum_{i=1}^{25 \cdot n} (x_{ri} - x_{ti})^2    (4)
Here, x_ri and x_ti denote the i-th characteristic quantity (higher-order local autocorrelational
feature) of the class r and of the input image at time t, respectively. The class that minimizes E_r was selected as
the most vicinal class at time t. With respect to the affiliated data of the most vicinal class and all
the vicinal classes of the most vicinal class, the distances from the characteristic quantities
obtained from the image were calculated using expression (4). At each instance, the angle of the
data with the shortest distance was regarded as the estimated angle. From the second search, the
distance was not calculated by using the characteristic quantity for all the classes in the database.
Instead, only the vicinal classes of the most vicinal class and the affiliated data were selected as
the candidates for the search according to the histories at t-1, as shown in Fig.7.
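The distance computation of Eq. (4) and the history-based narrowing of the candidate classes can be sketched as follows; the class objects, with a representative feature vector, member data and a list of vicinal classes, are assumptions made for the example.

import numpy as np

def squared_distance(x_a, x_b):
    """E_r of Eq. (4): sum of squared differences between two feature vectors."""
    return float(np.sum((np.asarray(x_a) - np.asarray(x_b)) ** 2))

def estimate_pose(features, classes, previous_best=None):
    """Return (estimated joint angles, index of the most vicinal class).

    `classes` is a list of objects with attributes `features` (representative
    feature vector), `members` (each with `features` and `angles`) and
    `neighbours` (indices of vicinal classes). On the first frame all classes
    are candidates; from the second search only the previous most vicinal
    class and its neighbours are considered.
    """
    if previous_best is None:
        candidates = range(len(classes))
    else:
        candidates = [previous_best] + list(classes[previous_best].neighbours)

    best_class = min(candidates,
                     key=lambda c: squared_distance(features, classes[c].features))

    # Compare against the member data of the best class and of its vicinal classes.
    search_ids = [best_class] + list(classes[best_class].neighbours)
    members = [m for c in search_ids for m in classes[c].members]
    best = min(members, key=lambda m: squared_distance(features, m.features))
    return best.angles, best_class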
(a) at first search: all classes are candidates for the search.
(b) from second search, the vicinal classes of the most vicinal class are candidates.
Fig. 7. Differences in the search spaces between the first search and the succeeding searches.
Fig. 8. Captured hand images and the results of the hand posture estimation.
For the purpose of a quantitative assessment of the system, the measured and estimated
values have to be compared. However, in an ordinary environment using this system, it is
impossible to acquire the measured values of the joint angle information from the human
hand and fingers moving in front of the camera. Consequently, we performed the
estimation experiment by wearing the data glove and a white glove over it. The results
are shown in Fig.9, which reveals the angular data measured using the data glove and the
estimated results. Fig.9(a) shows the interphalangeal (IP) joint of the thumb; Fig.9(b), the
abduction between the middle and ring fingers; and Fig.9(c), the proximal interphalangeal
(PIP) joint of the middle finger. The state where the joint is unfolded was set as 180
degrees. The system at this time operates at more than 150 fps and thus enables realtime
estimation.
(Fig. 9 panels: joint angle [degree] plotted against time [s], showing the estimated angle and the angle measured with the data glove.)
Fig. 9. Examples of the joint angle data measured using the data glove and the estimated
results.
As evident from the figure, the standard deviation of the errors in the estimated angles was
4.51 degrees when we avoided the fluorescent light and used the angular data obtained by
means of the data glove as the actual values; the results obtained did not have highly precise
numerical values. We observed a trend of poor estimations, particularly for parts with little
variation in the image (for example, the shape of the rock in the rock-paper-scissors game)
against the angular variation. This may be expected when one considers how a human would
perform the same estimation: visually, we can hardly observe any difference when each finger
joint differs by about 10 degrees. Therefore,
the errors in this system, which conducts estimation on the basis of the camera image, may
be considered as being within the allowable range. On the contrary, it can be observed from
this figure that highly precise estimations are made in the region where visual differences
are observed, namely, where the image changes significantly with the angular variations
and where it is located in between the flexion and the extension.
Next, the comparative experiments were conducted. The difference between the previous
experiment and these comparative experiments is that the hand position agrees with or
closely resembles the database image since the object for estimation is set by selecting the
CG hand image from the database. Consequently, we can determine the expected
improvement in the estimating precision when the processing for positioning the input
image is integrated into this system. When the object for estimation was selected as a CG
image from the database, the standard deviation of the errors was 2.86 degrees, thus
allowing very high-precision estimation. It is expected that the estimation error can be
reduced to this extent in the future by integrating the processing for correcting the position
into this system. Moreover, the processing time for the search, except for the image
processing, is 0.69 ms per image. From the viewpoint of precision and processing speed, the
effectiveness of the multi-step search using the self-organized database has been proved.
As mentioned above, the estimation error for unknown input images had a standard
deviation of 4.51 degrees. Since this is an image processing system, small variations in the
finger joints in the rock state of the rock-paper-scissors game will definitely exhibit a
minimal difference in the appearance; these differences will numerically appear as a large
error in the estimation. However, this error possibly contains calibration errors arising from
the use of the data glove, as well as the errors caused by slight differences in the thickness,
colour, or texture of the data glove covered with the white glove. Therefore, the output of
the data glove or the actual value of the quantitative assessment requires calibration
between the strain gauge output and the finger joint value whenever the glove is worn since
the joint angle is calculated from a strain gauge worn on the glove. No such calibration
standards exist, particularly for the state in which the finger is extended; therefore, the
measured angle can be easily different from the indicated value. Even when the estimation
is newly calibrated, it is possible that the state of calibration may be different in each
experiment. On the other hand, it is not necessary to apply calibration to the second
experiment that selects the CG hand image from the database. It is highly possible that this
influences the standard deviation value of 4.51 degrees; therefore, it is possible to consider
that the standard deviation of the errors lies between 4.51 and 2.86 degrees even if the
system has not been subjected to corrective processing for the hand position.
The scheme of the present study allows new data to be added even without understanding
the system. Another advantage is that the addition of new data does not require a long time
since it is unnecessary to reorganize the database even when several new data items are
added; this is because the database can sequentially self-organize itself by using the
algorithm for self-multiplication and self-extinction of database classes. Furthermore, it is
possible to search the neighbouring classes having angular similarities since each class
possesses information about the vicinal classes in this system. This also makes the approach
well suited to estimating the posture of a physical object whose joint angles vary continuously
over time, such as the human hand. We attempted to
carry out the hand posture estimation when the hand is rotated, although the number of
trials was inadequate. Fig.10 shows an example of the result, which suggests that our system
functions when a subject is in front of the camera and is rotating his/her hand. A subject can
also swing the forearm, and our system can effectively estimate the shape of the fingers, as
shown in Fig.11.
The image information and the joint angle information are paired in the database in our
system. Once we output the results of the hand posture estimation to a robot hand, the robot
can reproduce the same motions as those of the fingers of a human being and mimic them.
Fig.12 shows a dexterous robot hand (Hoshino & Kawabuchi, 2005) imitating the human
hand motions without any sensors attached to it. We refer to this integrated system as the
“copycat hand”. This system can generate imitative behaviours of the hand because the
hand posture estimation system performs calculations at high speeds and with high
accuracy.
Fig. 10. An example of hand posture estimation using the rotating motion of the wrist.
Fig. 11. Examples of the estimation when a subject is swinging his/her forearm.
Fig. 12. The copycat hand can ape and imitate human hand motions at high speeds and with
high accuracy.
4. Conclusion
To realize a robot hand capable of instantly imitating human actions, high speed, high
accuracy and uniform processing time in the hand posture estimation are essential.
Therefore, in the present study, we have developed a method that enables the searching of
similar images at high speeds and with high accuracy and the search involves uniform
processing time, even in the case where a large-scale database is used. This is achieved by
(1) clustering the database into classes having approximately uniform amounts of data using self-
organization, including self-multiplication and self-extinction, and (2) collating the input
images with the data in the database by means of the low-order image characteristics, while
narrowing the search space in accordance with the past history.
In the preliminary construction of the database, we generated CG images of the hand by
measuring the joint angles using a data glove and interpolating them; furthermore, we
extracted the contours, image characteristics and the characteristics that change only in the
hand shape, irrespective of the environmental light or skin colour. The image was divided
into several images and was converted into a number of characteristics by using the high-
order local autocorrelation function; the image was then saved in the database in a state
paired with the joint angle data obtained from a data glove. By clustering this database
using self-organization depending on the number of characteristics and by the self-
organization of classes in multiple stages, a multistage search was enabled using the
representative numbers of classes in several layers. Moreover, by incorporating self-
multiplication and self-extinction algorithms, we achieved a unification of the amount of
data belonging to each class as well as the number of classes in the lower layers to avoid the
dispersion of the search time in the classes.
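The class-balancing behaviour described above can be pictured with a small sketch. The following Python fragment is purely illustrative (the function name, the thresholds max_size and min_size, and the splitting rule are our assumptions, not the authors' implementation): classes that grow beyond a size limit are split in two, and nearly empty classes are removed, so that the amount of data per class stays roughly uniform.

```python
import numpy as np

def self_organize(features, n_init=8, max_size=200, min_size=5, n_iter=20):
    rng = np.random.default_rng(0)
    centers = features[rng.choice(len(features), n_init, replace=False)]
    for _ in range(n_iter):
        # assign every feature vector to its nearest class centre
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = []
        for k in range(len(centers)):
            members = features[labels == k]
            if len(members) < min_size:          # self-extinction: drop the class
                continue
            mean = members.mean(axis=0)
            if len(members) > max_size:          # self-multiplication: split in two
                jitter = 1e-3 * rng.standard_normal(features.shape[1])
                new_centers.extend([mean + jitter, mean - jitter])
            else:
                new_centers.append(mean)
        if not new_centers:
            break
        centers = np.array(new_centers)
    # final assignment with the resulting class centres
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return centers, d.argmin(axis=1)
```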
The input image at the time of an actual estimation of the hand finger shape was subjected
to various types of image processing techniques in the same manner as that at the time of
construction of the database, and it was converted into a number of characteristics. The
distance between the characteristics obtained from the picture and the representative
characteristics of each class was then calculated. Classes at close distances were selected as
candidate classes for the estimated angle, and a similar distance calculation was also
performed in the classes in each layer belonging to a candidate class for the estimated angle.
Among the respective data belonging to the candidate classes for the estimated angle in the
lowest class, the angle data of the data with the closest distance between the number of
characteristics was considered as the estimation result. Furthermore, for the selection of a
candidate class, we attempted to reduce the search space by using the previous estimation
results and the neighbour information.
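The multistage search summarized above can be sketched as follows. The code is illustrative only (the data layout, names, and the way the previous estimate narrows the candidate set are assumptions): it descends the class hierarchy layer by layer, keeps the closest candidate classes, and finally returns the joint angles paired with the closest stored sample.

```python
import numpy as np

def estimate_angles(x, layers, leaf_features, leaf_angles,
                    n_candidates=3, prev_candidates=None):
    """layers: list of (representatives, children) pairs, top layer first;
    children[c] lists the indices of class c's members in the next layer.
    leaf_features / leaf_angles are paired one-to-one in the lowest layer."""
    # optionally narrow the search space using classes near the previous estimate
    candidates = (prev_candidates if prev_candidates is not None
                  else list(range(len(layers[0][0]))))
    for reps, children in layers:
        d = [np.linalg.norm(x - reps[c]) for c in candidates]
        best = [candidates[i] for i in np.argsort(d)[:n_candidates]]
        # descend: the candidates of the next layer are the children of the best classes
        candidates = [idx for b in best for idx in children[b]]
    # in the lowest layer, return the joint angles of the closest stored sample
    d = [np.linalg.norm(x - leaf_features[i]) for i in candidates]
    best_leaf = candidates[int(np.argmin(d))]
    return leaf_angles[best_leaf], best_leaf
```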
By estimating the sequential images of the finger shape by using this method, we
successfully realized a process involving a joint angle estimation error within several
degrees, a processing rate of 150 - 160 fps, and an operating time without dispersion, by
using a PC having a CPU clock frequency of 2.8 GHz and a memory capacity of 512 MB.
Since the image information and the joint angle information are paired in the database, the
system could reproduce the same actions as those of the fingers of a human being by means
of a robot without any time delay by outputting the estimation results to the robot hand.
5. Acknowledgement
This work is partly supported by Proposal-Oriented Research Promotion Program
(PRESTO) of Japan Science and Technology Agency (JST) and Solution-Oriented Research
for Science and Technology (SORST) project of JST.
6. References
Athitos, V. & Scarloff, S. (2002). An appearance-based framework for 3D hand shape
classification and camera viewpoint estimation, Proc. Automatic Face and Gesture
Recognition, pp.40-45
Bernardin, K.; Ogawara, K.; Ikeuchi, K. & Dillmann, R. (2005). A sensor fusion approach for
recognizing continuous human grasping sequences using Hidden Markov Models,
IEEE Transactions on Robotics, Vol.21, No.1, pp.47-57
Gallese, V.; Fadiga, L.; Fogassi, L. & Rizzolatti, G. (1996). Action recognition in the premotor
cortex, Brain, Vol.119, pp.593-609
Hoshino, K. & Tanimoto, T. (2005). Real time search for similar hand images from database
for robotic hand control, IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences, Vol.E88-A, No.10, pp.2514-2520
Hoshino, K. & Tanimoto, T. (2006). Method for driving robot, United Kingdom Patent
Application No.0611135.5, (PCT/JP2004/016968)
Hoshino, K. & Kawabuchi, I. (2005). Pinching at finger tips for humanoid robot hand, Journal
of Robotics and Mechatronics, Vol.17, No.6, pp.655-663
Hoshino, K. & Kawabuchi, I. (2006). Robot hand, U.S.A. Patent Application No.10/599510,
(PCT/JP2005/6403)
Kameda, Y. & Minoh, M. (1996). A human motion estimation method using 3-successive
video frames, Proc. Virtual Systems and Multimedia, pp.135-140
Kohonen, T. (1988). The neural phonetic typewriter, IEEE computer, Vol.21, No.3, pp.11-22
Lu, S.; Metaxas, D.; Samaras, D. & Oliensis, J. (2003). Using multiple cues for hand tracking
and model refinement, Proc. CVPR2003, Vol.2, pp.443-450
Otsu, N. & Kurita, T. (1998). A new scheme for practical, flexible and intelligent vision
systems, Proc. IAPR. Workshop on Computer Vision, pp.431-435
Rehg, J. M. & Kanade, T. (1994). Visual tracking of high DOF articulated structures: an
application to human hand tracking, Proc. European Conf. Computer Vision, pp.35-46
Rizzolatti, G.; Fadiga, L.; Gallese, V. & Fogassi, L. (1996). Premotor cortex and the
recognition of motor actions, Cognitive Brain Research, Vol.3, pp.131-141
securely grasp a heavy object. Therefore, we designed a second prototype focusing on the
terminal joint of the fingers and the structure of the thumb.
Fig. 14. Snapshots of the robot hand handling a business card using two or three fingers.
19
1. Introduction
Recently, many studies aiming at the realization of dynamic biped walking have been carried
out. Honda's ASIMO is a representative result of humanoid robot research. ASIMO has many
joint degrees of freedom, close to those of a human being, together with high environmental
adaptability and robustness, and it can carry out various performances. However, it needs
many sensors and complicated control, and it walks with bent knees to keep the position of
its centre of gravity constant. As a result, it walks unnaturally and consumes much energy.
On the other hand, McGeer studied passive dynamic walking from the viewpoint that it is a
natural motion in a gravitational field (McGeer, T., 1990). His robot, which could walk down
an incline using only potential energy, realized energy-efficient walking. However, it needs
an incline, and its range of application is small because it has no actuators. Therefore,
studies aiming at energy-efficient biped walking on level ground have been performed.
S. H. Collins developed a robot with actuators only at the ankles (Collins, S. H. & Ruina A.,
2005). M. Wisse developed a robot driven by pneumatic actuators (Wisse, M. &
Frankenhuyzen, J. van, 2003). Ono developed a self-excitation drive type robot with an
actuator only at the hip joint (Ono, K. et al, 2001); (Ono, K. et al, 2004); (Kaneko, Y. & Ono K.,
2006). Osuka and Asano then realized level-ground walking by supplying to the joints
mechanical energy equivalent to that consumed in passive dynamic walking (Osuka, K. et al,
2004); (Asano, F. et al, 2004). These studies applied the technique of passive dynamic
walking, which makes positive use of inertia and gravity, by decreasing the number of
actuators as much as possible. However, in order to adapt to unknown ground, a biped robot
needs actuators to improve its environmental adaptability and robustness.
Here, Ono proposed an optimal trajectory planning method based on a function
approximation method to realize energy-efficient walking of a biped robot with actuators,
similar to passive dynamic walking (Imadu, A. & Ono, K. 1998); (Ono, K. & Liu, R., 2001);
(Ono, K. & Liu, R., 2002); (Peng, C. & Ono K., 2003). Furthermore, Huang and Hase verified
the optimal trajectory planning method for energy-efficient biped walking by experiment,
and proposed an inequality state constraint to obtain a better solution with a desirable
posture at the intermediate time of the walking period (Hase, T. & Huang, Q., 2005); (Huang,
Q. & Hase, T., 2006); (Hase, T., et al., 2006).
In this chapter, we introduce the latest research on energy-efficient walking of biped robots
on level ground from two viewpoints: one is semi-passive dynamic walking with only a hip
actuator using a self-excited mechanism, and the other is active walking with actuators using
optimal trajectory planning.
The chapter is organized as follows. In section 2, the self-excited walking of a four-link biped
robot and the self-excitation control algorithm that enables the robot to walk on level ground
are introduced through numerical simulation and experiment. In section 3, we aim at
realizing energy-efficient walking of the four-link biped robot with actuators, similar to
passive dynamic walking. An optimal trajectory planning method based on a function
approximation method applied to a biped walking robot is presented. We then apply the
inequality state constraint at the intermediate time and restrict the range of joint angles; in
this way, a better solution with a desirable posture at the intermediate time can be obtained.
Furthermore, in section 4, using the “Specific Cost”, we show that biped walking with the
above two methods is more energy efficient than other methods that use geared motors.
Finally, the conclusions are presented in section 5.
should not be bent by the internal force of the knee stopper. (5) The synchronized motion
between the inverted pendulum motion of the support leg and the two-DOF pendulum
motion of the swing leg, as well as the balance of the input and output energy, should have
stable characteristics against small deviations from the synchronized motion.
Fig.1. Three-degree-of-freedom walking mechanism on a sagittal plane (Ono, K. et al, 2001)
First we pay attention to the swing leg and try to generate a swing leg motion that can
satisfy the necessary conditions (2) and (3) by applying the self-excitation control to the
swing leg motion. Ono and Okada (Ono, K. & Okada, T., 1994) have already investigated
two kinds of self-excitation control of two-DOF vibration systems and showed that the Van
der Pol-type self-excitation can evoke natural modes of the original passive system, while
the asymmetrical stiffness matrix type can excite the anti-resonance mode that has a phase
shift of about 90 degrees between input and output positions.
The two-DOF pendulum of the swing leg has a first-order mode with in-phase motion at the
two joints and a second-order mode with out-of-phase motion. Thus, it will be
impossible to generate a walking gait by applying the Van der Pol-type self-excitation. In
contrast, by means of negative feedback from the shank joint angle θ3 to the input torque T2
at the thigh joint, the system’s stiffness matrix becomes asymmetrical. Thus, the swing
motion changes so that the shank motion is delayed by about 90 degrees from the thigh
motion. Through this feedback, it is also expected that the kinetic energy of the swing leg
increases and that the reaction torque of T2 will make the support leg rotate in the forward
direction in the region where θ3 > 0. The self-excitation of the swing leg based on the
asymmetrical stiffness matrix is explained in detail below.
$$T_2 = -k\,\theta_3 . \qquad (1)$$
We note from Eq. (2) that the stiffness matrix of the system becomes asymmetrical because
of the feedback gain k.
To examine the value of k needed to excite the swing leg motion autonomously, as well as the
natural frequency and mode at the threshold, Eq. (2) is linearized about θ2 = θ3 = 0. The
linearized equation of motion becomes
$$\begin{bmatrix} M_{11} & M_{12} & M_{13} \\ & M_{22} & M_{23} \\ \mathrm{sym.} & & M_{33} \end{bmatrix}\begin{bmatrix} \ddot{\theta}_1 \\ \ddot{\theta}_2 \\ \ddot{\theta}_3 \end{bmatrix} + \cdots \qquad (6)$$
where the elements Mij, Cij, and Ki of the matrices are shown in Appendix A, and T2 is the
feedback input torque given by Eq. (1).
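The threshold of self-excitation can be examined numerically along the lines described above. The sketch below is not taken from the chapter; the 2-DOF matrices are placeholder values, and the feedback of Eq. (1) is represented simply as an extra asymmetric entry in the stiffness matrix. The onset of self-excitation corresponds to an eigenvalue of the linearized state matrix crossing into the right half plane.

```python
import numpy as np

def unstable(M, C, K):
    """Return True if M x'' + C x' + K x = 0 has a growing (self-excited) mode."""
    n = M.shape[0]
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.linalg.solve(M, K), -np.linalg.solve(M, C)]])
    return np.max(np.linalg.eigvals(A).real) > 0.0

# placeholder 2-DOF swing-leg model (thigh and shank); values are illustrative
M = np.array([[0.30, 0.10], [0.10, 0.05]])
C = np.array([[0.02, 0.00], [0.00, 0.01]])
K0 = np.array([[4.0, 1.0], [1.0, 1.5]])      # symmetric passive stiffness

for k in np.arange(0.0, 10.0, 0.5):
    K = K0.copy()
    K[0, 1] += k    # feedback T2 = -k*theta3 moved to the left side: asymmetry
    if unstable(M, C, K):
        print(f"self-excitation sets in at approximately k = {k}")
        break
```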
$$\begin{bmatrix} \ddot{\theta}_1 \\ \ddot{\theta}_2 \\ \ddot{\theta}_3 \end{bmatrix} = [M]^{-1}\begin{bmatrix} f_1(\theta_1,\dot{\theta}_1) \\ f_2(\theta_2,\dot{\theta}_2) + \tau \\ f_3(\theta_3,\dot{\theta}_3) - \tau \end{bmatrix}, \qquad (7)$$
where the elements of the matrix [M] are the same as Mij in Eq. (6), f1, f2, and f3 are
presented in Appendix B, and τ is the moment impulse at the knee.
During the second phase, the biped system can be regarded as a two-DOF link system. Thus,
the basic equation becomes
$$\begin{bmatrix} I_1 + m_1 a_1^2 + m_2 l_1^2 & m_2 a_2 l_1 \cos(\theta_2 - \theta_1) \\ m_2 a_2 l_1 \cos(\theta_2 - \theta_1) & I_2 + m_2 a_2^2 \end{bmatrix}\begin{bmatrix} \ddot{\theta}_1 \\ \ddot{\theta}_2 \end{bmatrix} + \begin{bmatrix} 0 & -m_2 a_2 l_1 \sin(\theta_2 - \theta_1)\,\dot{\theta}_2 \\ m_2 a_2 l_1 \sin(\theta_2 - \theta_1)\,\dot{\theta}_1 & 0 \end{bmatrix}\begin{bmatrix} \dot{\theta}_1 \\ \dot{\theta}_2 \end{bmatrix} + \begin{bmatrix} -(m_1 a_1 + m_2 l_1)\,g\,\dfrac{\sin\theta_1}{\theta_1} & 0 \\ 0 & -m_2 a_2\,g\,\dfrac{\sin\theta_2}{\theta_2} \end{bmatrix}\begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} = 0 . \qquad (8)$$
We assume that the collision of the swing leg with the ground occurs plastically (the
translational velocity at the tip of the swing leg becomes zero) and that there is a sufficient
friction between the tip and the ground surface to prevent slippage. The angular velocities
of the links after the collision are derived from conservation laws of momentum and angular
momentum. They are calculated from Eq. (7) by putting τ = 0.
To investigate the self-excited biped locomotion, the set of basic equations (6), (7), and (8)
was numerically solved using the fourth-order Runge-Kutta method. The standard values of
the link parameters used in the calculation are shown in Table 1. The values of mass and
moment of inertia were calculated assuming a uniform mass distribution along the link.
Since the magnitude of the damping coefficient scarcely influences the biped gait,
γ3 = 0.15 Nms/rad was used in the calculation.
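As a reminder of the integration scheme, a generic fourth-order Runge-Kutta step is sketched below; the dynamics function here is only a placeholder, whereas in the chapter it would be assembled from Eqs. (6)-(8) together with the collision rules.

```python
import numpy as np

def rk4_step(f, t, x, dt):
    # one fourth-order Runge-Kutta step for x' = f(t, x)
    k1 = f(t, x)
    k2 = f(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def swing_dynamics(t, x):
    # placeholder: a damped pendulum standing in for the biped equations
    theta, omega = x
    return np.array([omega, -9.81 * np.sin(theta) - 0.15 * omega])

x = np.array([0.3, 0.0])
for step in range(1000):
    x = rk4_step(swing_dynamics, step * 0.001, x, 0.001)
```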
$$P = \frac{1}{t_4 - t_0}\int_{t_0}^{t_4} T_2\,\dot{\theta}_2\;dt . \qquad (9)$$
Fig.4. Motion of legs and internal torque at knee during one step (Ono, K. et al, 2001).
Fig.5. Effect of feedback gain on walking period, velocity, and efficiency (Ono, K. et al,
2001).
We note from this figure that the walking period shows an almost constant value of 1.3 s
at any k value. This is because the self-excited biped locomotion is a natural motion
inherent in the biped system. On the other hand, the average velocity increases gradually
because of the increase in step length. However, the efficiency decreases with an increase
in the feedback gain k because the average input power increases more rapidly with an
increase in k. It is worth noting that the average input power to generate biped
locomotion with a velocity of 0.28 m/s is only about 4 W (=0.28/0.07) at k = 6.0 Nm/rad.
If the additional movement of 0.6 m by feet during one period of 1.3 s is taken into
account, the natural walking velocity becomes 2.7 km/h. The natural walking period of an
adult with a 0.9 m leg height is 1.3 s (Yang 1994). Thus, it can be said that the calculated
results of the walking period and velocity are comparable with the natural walking of an
adult with a 0.8 m leg height.
Validity of the self-excited walking stated above was confirmed experimentally (Ono, K.,
2001). Extensions of the self-excited walking to a biped with feet and bent knee walking
mode are described in (Ono, K., 2004) and (Kaneko, Y. & Ono K., 2006), respectively.
Symbols  Description
mi       i-th link mass
Ii       i-th link moment of inertia
li       i-th link length
ai       Length between i-th joint and i-th link COG
bi       Length between (i+1)-th joint and i-th link COG
md       Upper body mass
d        Length between upper body mass and hip joint
θi       i-th link absolute angle
αi       Offset angle between i-th link and i-th link COG from joint i
βi       Offset angle between i-th link and i-th link COG from joint (i+1)
xi       x-coordinate of i-th link COG
yi       y-coordinate of i-th link COG
g        Acceleration of gravity
ui       i-th joint input torque
J        Performance function
ζi       Dummy variable used for inequality constraint
δi       Clearance between toe and ground
pi       i-th coefficient vector of basis function
hi       i-th basis function
T        Period of one step
S        Length of step
Table 2. Symbols used in this section (Hase T. & Huang, Q., 2005)
Fig.6. 3 link model with upper body mass (Hase T. & Huang, Q., 2005).
Fig.7. Process of one step walking (Hase T. & Huang, Q., 2005).
Fig.8. Impulse for i-th link (Hase T. & Huang, Q., 2005).
3.1.2 Dynamic Equations and Analysis of Collision between Toe and Ground
From the analytical model shown in Fig.6, the dynamic equations are derived as follows. We
assumed that there is no friction in any joint.
When the collision between the toe and the ground occurs, it is assumed that the momentum
and angular momentum of each link are conserved. The legs are exchanged as shown in Fig.7.
Fig.8 shows the impulse for the i-th link. "vix" and "viy" are the x and y components of the
translational velocity of the center of gravity of each link. "+" indicates the state just after the
swing leg collides with the ground, and "-" indicates the state just before that.
Using the conservation laws of momentum and angular momentum, the following equations
are obtained.
$$\begin{cases} m_i\,(v_{ix}^{+} - v_{ix}^{-}) = P_{ix} - P_{(i+1)x} \\ m_i\,(v_{iy}^{+} - v_{iy}^{-}) = P_{iy} - P_{(i+1)y} \\ I_i\,(\dot{\theta}_i^{+} - \dot{\theta}_i^{-}) = a_i \times P_i - (l_i - a_i) \times P_{i+1} \end{cases} \qquad (11)$$
When Eq. (10) is rearranged, the angular velocities just before and just after the collision are
related as follows:
$$M(\theta)\,\dot{\theta}^{+} = H(\theta)\,\dot{\theta}^{-} \qquad (12)$$
The matrices in Eqs. (10) and (12) are written in (Ono, K. & Liu, R., 2002).
$$\frac{\partial^2 J}{\partial q^2} = B^{T}\,\frac{\partial^2 J}{\partial p^2}\,B \qquad (19)$$
Then, we obtain p from q and obtain the optimal solution.
$$C(p) \geq 0 \;\Rightarrow\; C(p) = \zeta^{2}, \qquad C(p) \leq 0 \;\Rightarrow\; C(p) = -\zeta^{2} \qquad (20)$$
Then, we use the dummy variables as optimization parameters.
For example, if we restrict the state variable of the joint angle at the boundary time ti as
follows:
$$\theta(t_i) \geq 0 , \qquad (21)$$
then the dummy variable is introduced as follows:
$$\theta(t_i) = \zeta^{2} . \qquad (22)$$
The state variables at the boundary time ti are expressed by the j-th optimization parameter
pj as follows:
$$\theta(t_i) = p_j . \qquad (23)$$
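The dummy-variable idea can be sketched in a few lines. The example below is not the chapter's implementation (the trajectory model, the cost, and the SciPy optimizer are illustrative stand-ins): the inequality θ(ti) ≥ 0 is imposed by adding the equality θ(ti) = ζ² and letting ζ be one more optimization parameter.

```python
import numpy as np
from scipy.optimize import minimize

def theta(t, p):
    # placeholder trajectory model: a short basis-function expansion
    return p[0] * np.sin(np.pi * t) + p[1] * np.sin(2 * np.pi * t) + p[2] * t

def cost(x):
    p = x[:3]                               # trajectory parameters
    ts = np.linspace(0.0, 1.0, 50)
    u = np.gradient(np.gradient(theta(ts, p), ts), ts)  # stand-in for joint torque
    return float(np.mean(u**2))             # stand-in for the energy-like cost J

def equality(x):
    p, zeta = x[:3], x[3]
    t_i = 0.5                               # intermediate boundary time
    return theta(t_i, p) - zeta**2          # theta(t_i) >= 0 enforced via zeta

x0 = np.array([0.1, 0.1, 0.1, 0.1])
res = minimize(cost, x0, constraints=[{"type": "eq", "fun": equality}])
```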
3.3.2 The Results of Using Equality Constraint at Beginning and Ending Time
First, the period T of one step and the step length S are determined. The beginning and
ending posture is one in which both legs are straight; this posture is fixed once the step
length is determined.
Fig.10 and Fig.11 show the stick figure of the trajectory, the joint input torques, the angular
velocities, and the clearance between the toe and the ground at T=0.60[s], S=0.15[m].
Fig.10. The result of using equality constraint at beginning and ending time (S=0.150[m]
T=0.60[s]) (Hase T. & Huang, Q., 2005).
Fig.11. The result of using equality constraint at beginning and ending time (u1 = 0,
S=0.150[m], T=0.60[s]) (Hase T. & Huang, Q., 2005).
Fig.12. The result of using inequality constraint at intermediate time (δ = 20[mm], S=0.150[m],
T=0.55[s]) (Hase T. & Huang, Q., 2005).
Fig.13. The result of using inequality constraint at intermediate time (u1 = 0, δ = 20[mm],
S=0.150[m], T=0.55[s]) (Hase T. & Huang, Q., 2005).
Fig.16. The influence of installation position of the upper body mass (δ = 20[mm], T=0.55[s])
(Hase T. & Huang, Q., 2005).
             Inner              Outer
             Thigh    Shank     Thigh    Shank
m [kg]       3.02     1.22      2.92     1.20
I [kg·m²]    0.027    0.015     0.031    0.017
l [m]        0.30     0.30      0.30     0.30
a [m]        0.10     0.12      0.13     0.13
Table 4. Link parameters (Hase T. & Huang, Q., 2005).
3.4.2 Software
We performed the optimization calculation offline beforehand and saved the data of the
relative joint angle trajectories. Ideally, the robot would be driven by feedforward control of
the optimal torque, but it is difficult to measure the parameters of the experimental robot and
the collision between the toe and the ground precisely. Therefore, we used PD feedback
control for each joint and let the experimental trajectory follow the optimal trajectory, as
sketched below.
In the experiment, the timing of landing does not agree with the analysis, and a delay occurs
with respect to the optimal trajectory. We therefore attached a touch sensor to each sole, and
the input of the next period starts when the sensor turns on. The robot keeps the ending
posture until the toe lands.
The analysis model has only three links and assumes the support leg to be one link. We
attached stoppers to the knee joints and fixed each knee by pushing it against the stopper
with the motor when the leg becomes the support leg.
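A rough sketch of this control scheme is given below. All names, gains, and the robot interface (read_joint_state, apply_torques, sole_touch_sensor_on, wait) are hypothetical; the point is only the structure: PD tracking of the stored optimal trajectory, followed by holding the final posture until the sole touch sensor turns on.

```python
import numpy as np

KP, KD = 20.0, 0.5          # illustrative PD gains
DT = 0.005                  # control interval [s]

def pd_torque(q_ref, dq_ref, q, dq):
    return KP * (q_ref - q) + KD * (dq_ref - dq)

def run_one_step(optimal_q, optimal_dq, robot):
    """optimal_q, optimal_dq: arrays of shape (n_samples, n_joints)."""
    for q_ref, dq_ref in zip(optimal_q, optimal_dq):
        q, dq = robot.read_joint_state()
        robot.apply_torques(pd_torque(q_ref, dq_ref, q, dq))
        robot.wait(DT)
    # hold the ending posture until the swing toe actually lands
    while not robot.sole_touch_sensor_on():
        q, dq = robot.read_joint_state()
        robot.apply_torques(pd_torque(optimal_q[-1], np.zeros_like(dq), q, dq))
        robot.wait(DT)
```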
4. Energy Efficiency
Energy efficiency of walking can be evaluated with the Specific Cost "C" (the energy
consumed per unit weight and per unit travel distance) as follows:
$$C = \frac{P}{mgv} \qquad (29)$$
Here, P is the energy consumed per unit time [W], m is the mass [kg], g is the acceleration of
gravity, and v is the transport velocity [m/s].
In the case of the self-excited mechanism, there is one type of Specific Cost, Cmt, which is
calculated from the simulation results in section 2.3. In this case, Cmt is obtained with the
mechanical work energy (4 W) and the walking velocity (0.28 m/s) for the 8 kg weight of the
robot model. Then, Cmt becomes about 0.18.
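The figure quoted above can be checked directly from Eq. (29), assuming g = 9.8 m/s²:

```python
# Quick check of the Specific Cost value quoted above (g = 9.8 m/s^2 assumed).
P, m, g, v = 4.0, 8.0, 9.8, 0.28     # [W], [kg], [m/s^2], [m/s]
C_mt = P / (m * g * v)
print(round(C_mt, 2))                # -> 0.18
```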
As our robots use gears with large reduction ratios, the energy consumption due to friction
becomes large, and the Specific Cost becomes larger than that of other robots based on the
mechanism of passive dynamic walking. However, it is much smaller than that of other types
of robots that use gears with large reduction ratios.
5. Conclusion
In this chapter, we introduced research on energy-efficient walking of biped robots on level
ground from two viewpoints: one is semi-passive dynamic walking with only a hip actuator
using a self-excited mechanism, and the other is active walking with actuators using optimal
trajectory planning. The conclusions are as follows.
8. References
McGeer, T. (1990). Passive Dynamic Walking, International Journal of Robotics Research, Vol.9,
pp.62-82
Collins, S. H. & Ruina A. (2005). A Bipedal Walking robot with efficient and human-like gait,
Proceedings of IEEE International Conference of Robotics and Automation, Barcelona,
Spain, Biped Locomotion I, CD-ROM No.1935
Wisse, M. & Frankenhuyzen, J. van (2003). Design and construction of MIKE; a 2d
autonomous biped based on passive dynamic walking, Proceedings of Conference of
Adaptive Motion of Animals and Machines, Kyoto, Japan, Analysis & Control of
Bipedal Locomotion, CD-ROM No.WeP-I-1
Ono, K. & Okada, T. (1994). Self-excited vibratory actuator (1st report: Analysis of two-
degree-of-freedom self-excited systems), Transactions of the JSME(C), Vol.60, No.577,
pp.92-99 (in Japanese)
Ono, K.; Takahashi, R.; Imadu, A. & Shimada, T. (2001). Self-excited walking of a biped
mechanism, International Journal of Robotics Research, Vol.20, No.12, pp.953-966
Ono, K.; Furuichi, T. & Takahashi, R. (2004). Self-Excited Walking with Feet, International
Journal of Robotics Research, Vol.23, No.1, pp.55-68
Kaneko, Y. & Ono K. (2006). Study of Self-Excited Biped Mechanism with Bending Knee,
Transactions of the Japan Society of Mechanical Engineers, Series C, Vol.72, No.714,
pp.416-434 (in Japanese)
342 Humanoid Robots, New Developments
Osuka, K.; Sugimoto Y. & Sugie T. (2004). Stabilization of Semi-Passive Dynamic Walking
based on Delayed Feedback Control, Journal of the Robotics Society of Japan, Vol.22,
No.1, pp.130-139 (in Japanese)
Asano, F.; Luo, Z.-W. & Yamakita, M. (2004). Gait Generation and Control for Biped Robots
Based on Passive Dynamic Walking, Journal of the Robotics Society of Japan, Vol.22,
No.1, pp.130-139
Imadu, A. & Ono, K. (1998). Optimum Trajectory Planning Method for a System that
Includes Passive Joints (1st Report, Proposal of a Function Approximation Method),
Transactions of the Japan Society of Mechanical Engineers, Series C, Vol.64, No.618,
pp.136-142 (in Japanese)
Ono, K. & Liu, R. (2001). An Optimal Walking Trajectory of Biped Mechanism (1st Report,
Optimal Trajectory Planning Method and Gait Solutions Under Full-Actuated
Condition), Transactions of the Japan Society of Mechanical Engineers, Series C, Vol.67,
No.660, pp.207-214 (in Japanese)
Liu, R & Ono, K. (2001). An Optimal Trajectory of Biped Walking Mechanism (2nd Report,
Effect of Under-Actuated Condition, No Knee Collision and Stride Length),
Transactions of the Japan Society of Mechanical Engineers, Series C, Vol.67, No.661,
pp.141-148 (in Japanese)
Ono, K. & Liu, R. (2002). Optimal Biped Walking Locomotion Solved by Trajectory Planning
Method, Transactions of the ASME, Journal of Dynamic Systems, Measurement and
Control, Vol.124, pp.554-565
Peng, C. & Ono K. (2003). Numerical Analysis of Energy-Efficient Walking Gait with Flexed
Knee for a Four-DOF Planar Biped Model, JSME International Journal, Series C,
Vol.46, No.4, pp.1346-1355
Hase T. & Huang, Q. (2005). Optimal Trajectory Planning Method for Biped Walking Robot
based on Inequality State Constraint, Proceedings. of 36th International Symposium on
Robotics, Biomechanical Robots, CD-ROM, WE413, Tokyo, Japan
Hase, T.; Huang, Q. & Ono, K. (2006). An Optimal Walking Trajectory of Biped Mechanism
(3rd Report, Analysis of Upper Body Mass Model under Inequality State Constraint
and Experimental Verification), Transactions of the Japan Society of Mechanical
Engineers, Series C, Vol.72, No.721, pp.2845-2852 (in Japanese)
Huang, Q. & Hase, T. (2006). Energy-Efficient Trajectory Planning for Biped walking Robot,
Proceedings. of the 2006 IEEE International Conference on Robotics and Biomimetics,
pp.648-653, Kunming, China
20
1. Introduction
Why are people attracted to humanoid robots and androids? The answer is simple: because
human beings are attuned to understand or interpret human expressions and behaviors,
especially those that exist in their surroundings. As they grow, infants, who are supposedly
born with the ability to discriminate various types of stimuli, gradually adapt and fine-tune
their interpretations of detailed social clues from other voices, languages, facial expressions,
or behaviors (Pascalis et al., 2002). Perhaps due to this functionality of nature and nurture,
people have a strong tendency to anthropomorphize nearly everything they encounter. This
is also true for computers or robots. In other words, when we see PCs or robots, some
automatic process starts running inside us that tries to interpret them as human. The media
equation theory (Reeves & Nass, 1996) first explicitly articulated this tendency within us.
Since then, researchers have been pursuing the key element to make people feel more
comfortable with computers or creating an easier and more intuitive interface to various
information devices. This pursuit has also begun spreading in the field of robotics. Recently,
researchers’ interests in robotics are shifting from traditional studies on navigation and
manipulation to human-robot interaction. A number of researches have investigated how
people respond to robot behaviors and how robots should behave so that people can easily
understand them (Fong et al., 2003; Breazeal, 2004; Kanda et al., 2004). Many insights from
developmental or cognitive psychologies have been implemented and examined to see how
they affect the human response or whether they help robots produce smooth and natural
communication with humans.
However, human-robot interaction studies have been neglecting one issue: the "appearance
versus behavior problem." We empirically know that appearance, one of the most significant
elements in communication, is a crucial factor in the evaluation of interaction (See Figure 1).
The interactive robots developed so far have had very mechanical appearances that make them look like
“robots.” Researchers tried to make such interactive robots “humanoid” by equipping them
with heads, eyes, or hands so that their appearance more closely resembled human beings
and to enable them to make such analogous human movements or gestures as staring,
pointing, and so on. Functionality was considered the primary concern in improving
communication with humans. In this manner, many studies have compared robots with
different behaviors. Thus far, scant attention has been paid to robot appearances. Although
344 Humanoid Robots, New Developments
there are many empirical discussions on such very simple static robots as dolls, the design of
a robot’s appearance, particularly to increase its human likeness, has always been the role of
industrial designers; it has seldom been a field of study. This is a serious problem for
developing and evaluating interactive robots. Recent neuroimaging studies show that
certain brain activation does not occur when the observed actions are performed by non-
human agents (Perani et al., 2001; Han et al., 2005). Appearance and behavior are tightly
coupled, and concern is high that evaluation results might be affected by appearance.
Fig. 1. Three categories of humanlike robots: humanoid robot Robovie II (left: developed by
ATR Intelligent Robotics and Communication Laboratories), android Repliee Q2 (middle:
developed by Osaka University and Kokoro corporation), geminoid HI-1 and its human
source (right: developed by ATR Intelligent Robotics and Communication Laboratories).
In this chapter, we introduce android science, a cross-interdisciplinary research framework
that combines two approaches, one in robotics for constructing very humanlike robots and
androids, and another in cognitive science that uses androids to explore human nature. Here
androids serve as a platform to directly exchange insights from the two domains. To
proceed with this new framework, several androids have been developed so far, and many
researches have been done. At that time, however, we encountered serious issues that
sparked the development of a new category of robot called geminoid. Its concept and the
development of the first prototype are described. Preliminary findings to date and future
directions with geminoids are also discussed.
2. Android Science
Current robotics research uses various findings from the field of cognitive science, especially
in the human-robot interaction area, trying to adopt findings from human-human
interactions with robots to make robots that people can easily communicate with. At the
same time, cognitive science researchers have also begun to utilize robots. As research fields
extend to more complex, higher-level human functions such as seeking the neural basis of
social skills (Blakemore, 2004), expectations will rise for robots to function as easily
controlled apparatuses with communicative ability. However, the contribution from
robotics to cognitive science has not been adequate because the appearance and behavior of
current robots cannot be separately handled. Since traditional robots look quite mechanical
and very different from human beings, the effect of their appearance may be too strong to
ignore. As a result, researchers cannot clarify whether a specific finding reflects the robot’s
appearance, its movement, or a combination of both.
We expect to solve this problem using an android whose appearance and behavior closely
resembles humans. The same thing is also an issue in robotics research, since it is difficult to
clearly distinguish whether the cues pertain solely to robot behaviors. An objective,
quantitative means of measuring the effect of appearance is required.
Androids are robots whose behavior and appearance are highly anthropomorphized.
Developing androids requires contributions from both robotics and cognitive science.
To realize a more humanlike android, knowledge from human sciences is also
necessary. At the same time, cognitive science researchers can exploit androids for
verifying hypotheses in understanding human nature. This new, bi-directional, cross-
interdisciplinary research framework is called android science (Ishiguro, 2005). Under
this framework, androids enable us to directly share knowledge between the
development of androids in engineering and the understanding of humans in cognitive
science (Figure 2).
Fig. 2. The framework of android science, linking the analysis and understanding of humans
(neuroscience, cognitive science, psychology) with the development of androids.
the autonomous program installed in the android can make smooth, natural interactions
with people near it.
Even though these androids enabled us to conduct a variety of cognitive experiments, they
are still quite limited. The bottleneck in interaction with humans is their lack of ability to
perform long-term conversation. Unfortunately, since current AI technology for developing
humanlike brains is limited, we cannot expect humanlike conversation with robots. When
meeting humanoid robots, people usually expect humanlike conversation with them.
However, the technology greatly lags behind this expectation. AI progress takes time, and
such AI that can make humanlike conversation is our final goal in robotics. To arrive at this
final goal, we need to use currently available technologies and understand deeply what a
human is. Our solution for this problem is to integrate android and teleoperation
technologies.
3. Geminoid
individual to offer meaningful insights into the experiments, which are especially important
at the very first stage of a new field of study when beginning from established research
methodologies.
Teleoperation interface
Figure 5 shows the teleoperation interface prototype. Two monitors show the controlled
robot and its surroundings, and microphones and a headphone are used to capture and
transmit utterances. The captured sounds are encoded and transmitted to the geminoid
server by IP links from the interface to the robot and vice versa. The operator’s lip corner
positions are measured by an infrared motion capturing system in real time, converted to
motion commands, and sent to the geminoid server by the network. This enables the
operator to implicitly generate suitable lip movement on the robot while speaking. However,
compared to the large number of human facial muscles used for speech, the current robot
only has a limited number of actuators on its face. Also, response speed is much slower,
partially due to the nature of the pneumatic actuators. Thus, simple transmission and
playback of the operator’s lip movement would not result in sufficient, natural robot motion.
To overcome this issue, measured lip movements are currently transformed into control
commands using heuristics obtained through observation of the original person’s actual lip
movement.
Geminoid server
The geminoid server receives robot control commands and sound data from the remote
controlling interface, adjusts and merges the inputs, and sends and receives primitive control
commands to and from the robot hardware. Figure 6 shows the data flow in the
geminoid system. The geminoid server also maintains the state of human-robot interaction
and generates autonomous or unconscious movements for the robot. As described above, as
the robot’s features become more humanlike, its behavior should also become suitably
Geminoid: Teleoperated Android of an Existing Person 349
sophisticated to retain a “natural” look (Minato et al., 2006). One thing that can be seen in
every human being, and that most robots lack, is the slight body movements caused by an
autonomous system, such as breathing or blinking. To increase the robot’s naturalness, the
geminoid server emulates the human autonomous system and automatically generates these
micro-movements, depending on the state of interaction each time. When the robot is
“speaking,” it shows different micro-movements than when “listening” to others. Such
automatic robot motions, generated without operator’s explicit orders, are merged and
adjusted with conscious operation commands from the teleoperation interface (Figure 6).
In addition, the geminoid server applies a specific delay to the transmitted sounds, taking
into account the transmission delay/jitter and the start-up delay of the pneumatic actuators.
This adjustment serves to synchronize lip movements and speech, thus enhancing the
naturalness of the geminoid's movement.
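A highly simplified sketch of this server-side behaviour is given below. Class and method names are assumptions, not the actual geminoid software; it merely illustrates merging operator commands with generated micro-movements and buffering speech with a fixed delay so that it stays in step with the slower pneumatic actuators.

```python
import collections, random, time

class GeminoidServer:
    def __init__(self, actuator_startup_delay=0.25, network_jitter=0.05):
        self.audio_delay = actuator_startup_delay + network_jitter
        self.audio_queue = collections.deque()          # (release_time, chunk)

    def micro_movements(self, state):
        # small breathing/blinking offsets; slightly larger while "speaking"
        amp = 0.02 if state == "speaking" else 0.01
        return {"chest": amp * random.uniform(-1, 1),
                "eyelid": 1.0 if random.random() < 0.003 else 0.0}

    def merge(self, operator_cmd, state):
        cmd = dict(self.micro_movements(state))
        cmd.update(operator_cmd)        # conscious operation overrides
        return cmd

    def push_audio(self, chunk):
        self.audio_queue.append((time.time() + self.audio_delay, chunk))

    def pop_due_audio(self):
        out = []
        while self.audio_queue and self.audio_queue[0][0] <= time.time():
            out.append(self.audio_queue.popleft()[1])
        return out
```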
- In less than 5 minutes both the visitors and I can quickly adapt to conversation
through the geminoid. The visitors recognize and accept the geminoid as me while
talking to each other.
- When a visitor pokes HI-1, especially around its face, I get a strong feeling of
being poked myself. This is strange, as the system currently provides no tactile
feedback. Just by watching the monitors and interacting with visitors, I get this
feeling.
We also asked the visitors how they felt when interacting through the geminoid. Most said
that when they saw HI-1 for the very first time, they thought that somebody (or Dr. Ishiguro,
if familiar with him) was waiting there. After taking a closer look, they soon realized that
HI-1 was a robot and began to have some weird and nervous feelings. But shortly after
having a conversation through the geminoid, they found themselves concentrating on the
interaction, and soon the strange feelings vanished. Most of the visitors were non-
researchers unfamiliar with robots of any kind.
Does this mean that the geminoid has overcome the “uncanny valley”? Before talking
through the geminoid, the initial response of the visitors seemingly resembles the reactions
seen with previous androids: even though at the very first moment they could not recognize
the androids as artificial, they nevertheless soon become nervous while being with the
androids. Are intelligence or long-term interaction crucial factors in overcoming the valley
and arriving at an area of natural humanness?
We certainly need objective means to measure how people feel about geminoids and other
types of robots. In a previous android study, Minato et al. found that gaze fixation revealed
criteria about the naturalness of robots (Minato et al., 2006). Recent studies have shown
different human responses and reactions to natural or artificial stimuli of the same nature.
Perani et al. showed that different brain regions are activated while watching human or
computer graphic arms movements (Perani et al., 2001). Kilner et al. showed that body
movement entrainment occurs when watching human motions, but not with robot motions
(Kilner et al., 2003). By examining these findings with geminoids, we may be able to find
some concrete measurements of human likeness and approach the “appearance versus
behavior” issue.
Perhaps HI-1 was recognized as a sort of communication device, similar to a telephone or a
TV-phone. Recent studies have suggested a distinction in the brain process that
discriminates between people appearing in videos and existing persons appearing live
(Kuhl et al., 2003). While attending TV conferences or talking by cellular phones, however,
we often experience the feeling that something is missing from a face-to-face meeting. What
is missing here? Is there an objective means to measure and capture this element? Can we
ever implement this on robots?
Synchronization between speech utterances sent by the teleoperation system and body
movements
The most important technology for the teleoperation system is synchronization between
speech utterances and lip movements. We are investigating how to produce natural
behaviors during speech utterances. This problem is extended to other modalities, such as
head and arm movements. Further, we are studying the effects on non-verbal
communication by investigating not only synchronization of speech and lip movements but
also facial expressions, head, and even whole body movements.
Application
Although being developed as research apparatus, the nature of geminoids can allow us
to extend the use of robots in the real world. The teleoperated, semi-autonomous
facility of geminoids allows them to be used as substitutes for clerks, for example, that
can be controlled by human operators only when non-typical responses are required.
Since in most cases an autonomous AI response will be sufficient, a few operators will
be able to control hundreds of geminoids. Also because their appearance and behavior
closely resembles humans, in the next age geminoids should be the ultimate interface
device.
5. Acknowledgement
This work was supported in part by the Ministry of Internal Affairs and Communications of
Japan.
6. References
Blakemore, S. J. & Frith, U. (2004). How does the brain deal with the social world?
Neuroreport, 15, 119-128
Breazeal, C. (2004). Social Interactions in HRI: The Robot View, IEEE Transactions on Man,
Cybernetics and Systems: Part C, 34, 181-186
Fong, T., Nourbakhsh, I., & Dautenhahn, K. (2003). A survey of socially interactive robots,
Robotics and Autonomous Systems, 42, 143–166
352 Humanoid Robots, New Developments
Han, S., Jiang, Y., Humphreys, G. W., Zhou, T. & Cai, P. (2005). Distinct neural substrates
for the perception of real and virtual visual worlds, NeuroImage, 24, 928–935
Ishiguro, H. (2005). Android Science: Toward a New Cross-Disciplinary Framework.
Proceedings of Toward Social Mechanisms of Android Science: A CogSci 2005 Workshop,
1–6
Kanda, T., Ishiguro, H., Imai, M., & Ono, T. (2004). Development and Evaluation of
Interactive Humanoid Robots, Proceedings of the IEEE, 1839-1850
Kilner, J. M., Paulignan, Y., & Blakemore, S. J. (2003). An interference effect of observed
biological movement on action, Current Biology, 13, 522-525
Kuhl, P. K., Tsao. F. M., & Liu, H. M. (2003). Foreign-language experience in infancy: Effects
of short-term exposure and social interaction on phonetic learning. Proceedings of the
National Academy of Sciences, 100, 9096-9101
Minato, T., Shimada, M., Itakura, S., Lee, K., & Ishiguro, H. (2006). Evaluating the human
likeness of an android by comparing gaze behaviors elicited by the android and a
person, Advanced Robotics, 20, 1147-1163
Pascalis, O., Haan, M., and Nelson, C. A. (2002). Is Face Processing Species-Specific During
the First Year of Life?, Science, 296, 1321-1323
Perani, D., Fazio, F., Borghese, N. A., Tettamanti, M., Ferrari, S., Decety, J., & Gilardi, M.C.
(2001). Different brain correlates for watching real and virtual hand actions,
NeuroImage, 14, 749-758
Reeves, B. & Nass, C. (1996). The Media Equation, CSLI/Cambridge University Press
21
1. Introduction
Demand for robots is shifting from their use in industrial applications to their use in
domestic situations, where they “live” and interact with humans. Such robots require
sophisticated body designs and interfaces to do this. Humanoid robots that have multi-
degrees-of-freedom (MDOF) have been developed, and they are capable of working with
humans using a body design similar to humans. However, it is very difficult to intricately
control robots with human generated, preprogrammed, learned behavior. Learned behavior
should be acquired by the robots themselves in a human-like way, not programmed
manually. Humans learn actions by trial and error or by emulating someone else’s actions.
We therefore apply reinforcement learning for the control of humanoid robots because this
process resembles a human’s trial and error learning process.
Many existing methods of reinforcement learning for control tasks involve discretizing state
space using BOXES (Michie & Chambers, 1968; Sutton & Barto, 1998) or CMAC
(Albus, 1981) to approximate a value function that specifies what is advantageous in the
long run. However, these methods are not effective at generalization and can cause
perceptual aliasing. Other methods use basis function networks for treating continuous state
space and actions.
Networks with sigmoid functions have the problem of catastrophic interference. They
are suitable for off-line learning, but are not adequate for on-line learning such as that
needed for learning motion (Boyan & Moore, 1995; Schaal & Atkeson, 1996). On the
contrary, networks with radial basis functions are suitable for on-line learning. However,
learning using these functions requires a large number of units in the hidden layer,
because they cannot ensure sufficient generalization. To avoid this problem, methods of
incremental allocation of basis functions and adaptive state space formation were
proposed (Morimoto & Doya, 1998; Samejima & Omori, 1998; Takahashi et al., 1996;
Moore & Atkeson, 1995).
In this chapter, we propose a dynamic allocation method of basis functions called
Allocation/Elimination Gaussian Softmax Basis Function Network (AE-GSBFN), that is
used in reinforcement learning to treat continuous high-dimensional state spaces. AE-
GSBFN is a kind of actor-critic method that uses basis functions and it has allocation and
elimination processes. In this method, if a basis function is required for learning, it is
allocated dynamically. On the other hand, if an allocated basis function becomes redundant,
the function is eliminated. This method can treat continuous high-dimensional state spaces
354 Humanoid Robots, New Developments
because the allocation and elimination processes reduce the number of basis functions
required for evaluation of the state space.
2. Actor-Critic Method
In this section, we describe an actor-critic method using basis functions, and we apply it to
our method.
Actor-critic methods are temporal difference (TD) methods that have a separate memory
structure to explicitly represent the policy independent of the value function (Sutton & Barto,
1998). Actor-critic methods are constructed by an actor and a critic, as depicted in Figure 1. The
policy structure is known as the actor because it is used to select actions, and the estimated
value function is known as the critic because it criticizes the actions made by the actor.
The actor and the critic each have a basis function network for learning of continuous state
spaces. Basis function networks have a three-layer structure as shown in Figure 2, and basis
functions are placed in middle-layer units. Repeating the following procedure, in an actor-
critic method using basis function networks, the critic correctly estimates the value function
V(s), and then the actor acquires actions that maximize V(s).
1. When state s(t) is observed in the environment, the actor calculates the j-th value uj(t)
of the action u(t) as follows (Gullapalli, 1990):
$$u_j(t) = u_j^{\max}\; g\!\left(\sum_{i}^{N} \omega_{ij}\, b_i(s(t)) + n_j(t)\right), \qquad (1)$$
where ujmax is a maximal control value, N is the number of basis functions, bi(s(t)) is a
basis function, ωij is a weight, nj(t) is a noise function, and g() is a logistic sigmoid
activation function whose outputs lie in the range (−1, 1). The output value of the actions is
saturated into ujmax by g().
2. The critic receives the reward r(t), and then observes the resulting next state s(t+1). The
critic provides the TD-error δ(t) as follows:
$$\delta(t) = r(t) + \gamma V(s(t+1)) - V(s(t)), \qquad (2)$$
where γ is a discount factor, and V(s) is an estimated value function. Here, V(s(t)) is
calculated as follows:
$$V(s(t)) = \sum_{i}^{N} v_i\, b_i(s(t)), \qquad (3)$$
where vi is a weight.
3. The actor updates weight ωij using the TD-error:
$$\omega_{ij} \leftarrow \omega_{ij} + \beta\, \delta(t)\, n_j(t)\, b_i(s(t)), \qquad (4)$$
where β is a learning rate.
4. The critic updates weight vi:
$$v_i \leftarrow v_i + \alpha\, \delta(t)\, e_i, \qquad (5)$$
where α is a learning rate, and ei is an eligibility trace. Here, ei is calculated as follows:
$$e_i \leftarrow \gamma \lambda\, e_i + b_i(s(t)), \qquad (6)$$
where λ is a trace-decay parameter.
5. Time is updated:
$$t \leftarrow t + \Delta t . \qquad (7)$$
Note that Δt is 1 in general, but we used the notation Δt for the control interval of the
humanoid robots.
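One complete actor-critic update following Eqs. (1)-(7) can be sketched as follows. This is not the authors' code: tanh stands in for the squashing function g(), and the basis function b(s) is left abstract (a Gaussian softmax version is sketched in the next section).

```python
import numpy as np

def act(s, b, w, u_max, sigma=0.1):
    """Eq. (1): compute the action for state s; returns the action and the noise."""
    n = sigma * np.random.randn(w.shape[1])        # exploration noise n_j(t)
    u = u_max * np.tanh(b(s) @ w + n)              # tanh used as the squashing g()
    return u, n

def learn(s, s_next, r, n, b, w, v, e,
          beta=0.1, alpha=0.02, gamma=0.9, lam=0.6):
    """Eqs. (2)-(6): update actor and critic after the environment transition."""
    phi, phi_next = b(s), b(s_next)                # basis activations b_i(s(t))
    delta = r + gamma * (v @ phi_next) - v @ phi   # Eq. (2), V(s) from Eq. (3)
    w += beta * delta * np.outer(phi, n)           # Eq. (4)
    e[:] = gamma * lam * e + phi                   # Eq. (6)
    v += alpha * delta * e                         # Eq. (5)
    return delta
```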
3.1 A-GSBFN
Networks with sigmoid functions have the problem of catastrophic interference. They are
suitable for off-line learning, but not adequate for on-line learning. In contrast, networks
with radial basis functions (Figure 3) are suitable for on-line learning, but learning using
these functions requires a large number of units, because they cannot ensure sufficient
generalization. The Gaussian softmax functions (Figure 4) have the features of both sigmoid
356 Humanoid Robots, New Developments
functions and radial basis functions. Networks with the Gaussian softmax functions can
therefore assess state space locally and globally, and enable learning motions of humanoid
robots.
Fig. 3. Shape of radial basis functions. Four radial basis functions are visible here, but it is
clear that the amount of generalization done is insufficient.
Fig. 4. Shape of Gaussian softmax basis functions. Similar to Figure 3, there are four basis
functions. Using Gaussian softmax basis functions, global generalization is achieved, as with
sigmoid functions.
The Gaussian softmax basis function is used in A-GSBFN and is given by the following
equation:
$$b_i(s(t)) = \frac{a_i(s(t))}{\sum_{k}^{N} a_k(s(t))}, \qquad (8)$$
where ai(s(t)) is a radial basis function, and N is the number of radial basis functions. Radial
basis function ai(s(t)) in the i-th unit is calculated by the following equation:
$$a_i(s(t)) = \exp\!\left(-\frac{1}{2}\,\bigl\| M\,(s(t) - c_i) \bigr\|^{2}\right), \qquad (9)$$
where ci is the center of the i-th basis function, and M is a matrix that determines the shape
of the basis function.
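A direct transcription of Eqs. (8) and (9) is sketched below; the example centres and shape matrix are illustrative only, not values from the chapter.

```python
import numpy as np

def gsbf(s, centers, M):
    """s: state vector, centers: (N, dim) array of c_i, M: shape matrix."""
    diff = (np.atleast_2d(centers) - s) @ M.T          # M (s - c_i) for every unit
    a = np.exp(-0.5 * np.sum(diff**2, axis=1))         # radial basis, Eq. (9)
    return a / np.sum(a)                               # softmax normalisation, Eq. (8)

# example: four units on a 2-D state space
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
M = np.diag([1.0, 0.57])
b = gsbf(np.array([0.2, 0.8]), centers, M)
```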
In A-GSBFN, a new unit is allocated if the error is larger than the threshold δmax and the
activation of all existing units is smaller than the threshold amin:
$$h(t) > \delta_{\max} \quad \text{and} \quad \max_i\, a_i(s(t)) < a_{\min}, \qquad (10)$$
where h(t) is defined as h(t) = δ(t)nj(t) at the actor and as h(t) = δ(t) at the critic. The new
unit is initialized with ci = s and ωi = 0.
Definition 2 - Elimination
The basis function bi(s(t)) is eliminated if the following condition is satisfied in the actor or
critic networks:
$$\varepsilon_i > \varepsilon_{\max} \quad \text{and} \quad \tau_i > T_{erase}, \qquad (12)$$
where εmax and Terase are thresholds.
The trace εi of the activation of radial basis functions is updated at each step in the
following manner:
$$\varepsilon_i \leftarrow \kappa\, \varepsilon_i + a_i(s(t)), \qquad (13)$$
where κ is a discount rate. Using εi, the learning agent can sense states that it has recently
taken. The value of εi takes a high value if the agent stays in almost the same state. This
situation is assumed when the learning falls into a local minimum. Using the value of εi, we
consider how to avoid the local minimum. Moreover, using τi, we consider how to inhibit a
basis function from immediate elimination after it is allocated. We therefore defined the
condition of elimination using εi and τi.
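The allocation and elimination rules of Eqs. (10), (12) and (13) can be sketched for one network as follows. The data layout and function name are our assumptions; the threshold values shown are those quoted later in this chapter for the standing-up experiment.

```python
import numpy as np

def update_units(net, s, h, a, dt,
                 d_max=0.5, a_min=0.4, kappa=0.9, eps_max=5.0, t_erase=3.0):
    """net holds arrays: centers, weights, eps (traces), age (time since birth).
    a is the vector of current radial-basis activations a_i(s)."""
    # trace and age update, Eq. (13)
    net["eps"] = kappa * net["eps"] + a
    net["age"] = net["age"] + dt
    # allocation, Eq. (10): large error and no unit sufficiently active
    if h > d_max and a.max() < a_min:
        net["centers"] = np.vstack([net["centers"], s])
        net["weights"] = np.append(net["weights"], 0.0)
        net["eps"] = np.append(net["eps"], 0.0)
        net["age"] = np.append(net["age"], 0.0)
    # elimination, Eq. (12): an over-active, sufficiently old unit is removed
    keep = ~((net["eps"] > eps_max) & (net["age"] > t_erase))
    for key in ("weights", "eps", "age"):
        net[key] = net[key][keep]
    net["centers"] = net["centers"][keep]
    return net
```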
4. Experiments
4.1 Standing-up motion learning
In this section, as an example of learning of continuous high-dimensional state spaces, AE-
GSBFN is applied to a humanoid robot learning to stand up from a chair (Figure 5). The
learning was simulated using the virtual body of the humanoid robot HOAP-1 made by
Fujitsu Automation Ltd. Figure 6 shows HOAP-1. The robot is 48 centimeters tall, weighs 6
kilograms, has 20 DOFs, and has 4 pressure sensors each on the soles of its feet. Additionally,
angular rate and acceleration sensors are mounted in its chest. To simulate learning, we
used the Open Dynamics Engine (Smith).
One trial ended when the robot fell down or time exceeded ttotal = 10 [s]. Rewards r(t) were
determined by height y [cm] of the robot’s chest:
$$r(t) = \begin{cases} -20\,\dfrac{l_{stand} - y}{l_{stand} - l_{down}} & (\text{during trial}) \\ -20\,\dfrac{t_{total}}{t} & (\text{on failure}) \end{cases} \qquad (16)$$
where lstand = 35 [cm] is the position of the robot’s chest in an upright posture, and ldown = 20
[cm] is its center in a falling-down posture. We used ujmax = π/36, γ = 0.9, β = 0.1,
α = 0.02, λ = 0.6, and Δt = 0.01 [s] for the parameters in Section 2, M = (1.0, 0.57, 1.0, 0.57, 1.0,
0.57, 1.0, 0.57), δmax = 0.5, and amin = 0.4 in Section 3.1, and Tadd = 1 [s], κ = 0.9, εmax = 5.0,
and Terase = 3 [s] in Section 3.2.
Figure 7 shows the learning results. First, the robot learned to fall down backward, as shown
in i). Second, the robot intended to stand up from a chair, but fell forward, as shown in ii),
because it could not yet fully control its balance. Finally, the robot stood up while
maintaining its balance, as shown in iii). The number of basis functions in the 2922nd trial
was 72 in both actor and critic networks. Figure 8 shows the experimental result with the
humanoid robot HOAP-1. The result shows that HOAP-1 was able to stand up from a chair,
as in the simulation.
We then compared the number of basis functions in AE-GSBFN with the number of basis
functions in A-GSBFN. Figure 9 shows the number of basis functions of the actor, averaged
over 20 repetitions. In these experiments, motion learning with both AE-GSBFN and A-
GSBFN was successful, but the figure indicates that the number of basis functions required
by AE-GSBFN was fewer than that by A-GSBFN. That is, high dimensional learning is
possible using AE-GSBFN. Finally, we plotted the height of the robot’s chest in successful
experiments in Figures 10 and 11. In the figures, circles denote a successful stand-up. The
results show that motion learning with both AE-GSBFN and A-GSBFN was successful.
i) 300th trial
Fig. 9. Number of basis functions in the actor network (averaged over 20 repetitions).
Fig. 10. Height of robot’s chest with AE-GSBFN. Circles denote successful stand-ups.
where ks is a scalar function that takes a positive value near singular points and becomes 0
otherwise:
$$k_s = \begin{cases} k_0\left(1 - \dfrac{w}{w_0}\right)^{2} & (w < w_0) \\ 0 & (\text{otherwise}) \end{cases} \qquad (19)$$
where k0 is a positive parameter, w0 is a threshold that separates the neighbourhood of
singular points from the rest, and w is given by $w = \sqrt{\det\!\left(J(\theta)\,J^{T}(\theta)\right)}$.
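A sketch of how such a damping factor is typically used is given below. The pseudo-inverse form J^T (J J^T + ks I)^-1 is the standard singularity-robust inverse of Nakamura & Hanafusa (1984) and is our assumption about how ks enters; it is not reproduced from the chapter.

```python
import numpy as np

def sr_inverse_step(J, dx, k0=0.01, w0=0.003):
    """Joint velocity for task-space velocity dx using the singularity-robust inverse."""
    w = np.sqrt(np.linalg.det(J @ J.T))                 # manipulability measure
    ks = k0 * (1.0 - w / w0) ** 2 if w < w0 else 0.0    # damping factor, Eq. (19)
    J_sr = J.T @ np.linalg.inv(J @ J.T + ks * np.eye(J.shape[0]))
    return J_sr @ dx
```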
In this experiment, the coordinates of the ends of the legs are given by inverse kinematics
(i.e., the up-down motion of the legs is given), and the motion of the waist in the horizontal
direction is learned by AE-GSBFN. The coordinate values were acquired from the locomotion data of
HOAP-1. Concretely, motion is generated by solving inverse kinematics from pw to the
idling leg, and from the supporting leg to pw (Figure 12 (a)). The change of supporting and
idling legs is also acquired from HOAP-1’s locomotion data.
Fig. 11. Height of robot’s chest with A-GSBFN. Circles denote successful stand-ups.
362 Humanoid Robots, New Developments
We could use the difference between the supporting leg and p(t) as the reward, but such a
reward would presuppose an ideal position of p(t) because of the use of inverse kinematics.
Therefore, we used the above equation. With equation (22), the closer p(t) is to p(0), the more
the reward increases. Intuitively, this seems unsuitable as a reward for learning a stamping
motion, but in fact only the acquisition of a stamping motion brings more reward, because
the up-down motion of the leg is given forcibly by inverse kinematics, and p(t) must change
considerably for the robot to stay upright without falling down.
We used ujmax = 1.0 × 10⁻⁴, γ = 0.9, β = 0.1, α = 0.02, λ = 0.6, and Δt = 0.01 [s] for the
parameters in Section 2, M = diag(2.0, 1.1, 2.0, 1.1, 2.0, 1.1, 2.0, 1.1, 2.0, 1.1, 2.0, 1.1, 2.0, 1.1, 2.0,
1.1, 2.0, 1.1, 2.0, 1.1), δmax = 0.5, and amin = 0.135 in Section 3.1, and Tadd = 1.0 [s], κ = 0.9,
εmax = 5.0, and Terase = 2.0 [s] in Section 3.2. We also used k0 = 0.01 and w0 = 0.003 for the
parameters of inverse kinematics.
Figure 13 shows the learning results. The robot can control its balance by moving its
waist right and left. Figure 14 plots the amount of time taken to fall down. You can
see that the time increases as the learning progresses. Figure 15 shows the value of
p(t ) in the 986th trial. It is clear that p(t ) changes periodically. These results indicate
that a stamping motion was acquired, but the robot’s idling leg does not rise perfectly
when we look at the photos in Figure 13. We assume that the first reason for these
results is that it is difficult to control the angle of the ankle using inverse kinematics
(since inverse kinematics cannot control θAF(R) and θAF(L) so as to be parallel to the ground).
The second reason is that we only used the y-coordinate value of the waist for learning, and
the third is that we used equation (22) for the rewards. To address the second issue, we can
also use the z-coordinate value. Because of the third reason, the use of equation (22), only a
small periodic motion is obtained (Figure 16). To solve this problem, we should consider
another reward function for this experiment. We will explore these areas in our future
research.
Fig. 14. Time during trial (averaged over 100 repetitions). A large value on the vertical axis
means that the stamping motion continued for a long time.
Fig. 16. Ideal angle and output angle of θWF(R) with AE-GSBFN in the 986th trial. The dotted
line indicates the ideal motion and the solid line indicates the acquired motion with AE-GSBFN.
It is clear that the acquired motion consists of small periodic motions compared with the ideal motion.
5. Conclusion
In this chapter, we proposed AE-GSBFN, a dynamic allocation method of basis functions for reinforcement learning. Through its allocation and elimination processes, AE-GSBFN overcomes the curse of dimensionality and avoids falling into local minima. To confirm the effectiveness of AE-GSBFN, we applied it to the motion control of a humanoid robot. We demonstrated that AE-GSBFN is capable of providing better performance than A-GSBFN, and we succeeded in learning motion control of the robot.
The future objective of this study is to carry out general comparisons of our method with other dynamic neural networks, for example, Fritzke's "Growing Neural Gas" (Fritzke, 1996) and Marsland's "Grow When Required" nets (Marsland et al., 2002). An analysis of the necessity of the hierarchical reinforcement learning methods proposed by Morimoto and Doya (Morimoto & Doya, 2000) in relation to the standing-up simulation is also an important issue for future study.
6. References
Albus, J. S. (1981). Brains, Behavior, and Robotics, Byte Books
Boyan, J. A. & Moore, A. W. (1995). Generalization in reinforcement learning: Safely
approximating the value function, Advances in Neural Information Processing Systems,
Vol. 7, 369–376
Fritzke, B. (1996). Growing self-organizing networks -- why?, European Symposium on
Artificial Neural Networks, 61–72
Gullapalli, V. (1990). A stochastic reinforcement learning algorithm for learning real valued
functions, Neural Networks, Vol. 3, 671–692
Marsland, S. ; Shapiro, J. & Nehmzow, U. (2002). A self-organizing network that grows
when required, Neural Networks, Vol. 15, 1041–1058
Michie, D. & Chambers, R. A. (1968). BOXES: An Experiment in Adaptive Control, In:
Machine Intelligence 2, E. Dale and D. Michie (Ed.), pp. 137-152, Edinburgh
Moore, A. W. & Atkeson, C. G. (1995). The parti-game algorithm for variable resolution
reinforcement learning in multidimensional state space, Machine Learning, Vol. 21,
199–234
Mori, T.; Nakamura, Y., Sato, M., & Ishii, S. (2004). Reinforcement Learning for CPG-driven
Biped Robot, Nineteenth National Conference on Artificial Intelligence (AAAI2004), pp.
623-630
Morimoto, J. & Doya, K. (1998). Reinforcement learning of dynamic motor sequence:
Learning to stand up, IEEE/RSJ International Conference on Intelligent Robots and
Systems, pp. 1721–1726
Morimoto, J. & Doya, K. (1999). Learning dynamic motor sequence in high-dimensional
state space by reinforcement learning -- learning to stand up -- , IEICE Transactions
on Information and Systems, Vol. J82-D2, No. 11, 2118–2131
Morimoto, J. & Doya, K. (2000). Acquisition of stand-up behavior by a real robot using
hierarchical reinforcement learning, International Conference on Machine Learning, pp.
623–630
Nakamura, K. & Hanafusa, H. (1984). Singularity Low-Sensitive Motion Resolution of
Articulated Robot Arms, Transactions of the Society of Instrument and Control
Engineers, Vol. 20, No. 5, pp. 453–459 (in Japanese)
Samejima, K. & Omori, T. (1998). Adaptive state space formation method for reinforcement
learning, International Conference on Neural Information Processing, pp. 251–255
Schaal, S. & Atkeson, C. C. (1996). From isolation to cooperation: An alternative view of a
system of experts, Advances in Neural Information Processing System, Vol. 8, 605–611
Smith, R. Open Dynamics Engine, http://opende.sourceforge.net/ode.html
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction, MIT Press
Takahashi, Y. ; Asada, M. & Hosoda, K. (1996). Reasonable performance in less learning time
by real robot based on incremental state space segmentation, IEEE/RSJ International
Conference on Intelligent Robots and Systems, pp. 1518–1524
22
1. Introduction
Many aspects of modern life involve the use of intelligent machines capable of operating
under dynamic interaction with their environment. In view of this, the field of biped
locomotion is of special interest when human-like robots are concerned. Humanoid robots as
anthropomorphic walking machines have been in operation for more than twenty years.
Currently, research on the design and control of humanoid robots is one of the most exciting and challenging topics in the field of robotics, and its potential applications in the medium and long term are substantial. Humanoid robots are expected to serve as servants and maintenance machines whose main task is to assist human activities in daily life and to replace humans in hazardous operations. Anthropomorphic biped robots are, obviously and interestingly, potentially capable of moving effectively in the unstructured environments in which humans move. There is also a strong expectation that robots for personal use will coexist with humans and provide support such as assistance with housework and care of the elderly and the physically handicapped. Consequently, humanoid robots have been treated as subjects of robotics research: as a research tool for human science, as an entertainment or mental-commit robot, or as an assistant/agent for humans in the human living environment.
Humanoid robots are autonomous systems capable of extracting information from their environments and of using knowledge about the world and about their duties to govern their own behaviour. Intelligent humanoid robots should be autonomous enough to move safely in a meaningful and purposive manner, i.e. to accept high-level descriptions of tasks (specifying what the user wants done rather than how to do it) and to execute them without further human intervention. Future humanoid robots are likely to have greater sensory capabilities, more intelligence for valid reasoning and decision making, higher levels of manual dexterity, and mobility comparable to that of humans. Naturally,
the first approach to making humanoid robots more intelligent was the integration of
sophisticated sensor systems as computer vision, tactile sensing, ultrasonic and sonar
sensors, laser scanners and other smart sensors. However, today’s sensor products are still
very limited in interactivity and adaptability to changing environments. As the technology
and algorithms for real-time 3D vision and tactile sensing improve, humanoid robots will be
able to perform tasks that involve complex interaction with the environment (e.g. grasping
and manipulating the objects). A major reason is that uncertainty and dynamic changes
make the development of reliable artificial systems particularly challenging. On the other
hand, to design robots and systems that best adapt to their environment, the necessary
research includes investigations in the field of mechanical robot design (intelligent
mechanics), environment perception systems and embedded intelligent control that ought to
cope with the task complexity, multi-objective decision making, large volume of perception
data and substantial amount of heuristic information. Also, in the case when the robot
performs in an unknown environment, the knowledge may not be sufficient. Hence, the
robot has to adapt to the environment and to be capable of acquiring new knowledge
through the process of learning. The robot learning is essentially concerned with equipping
robots with the capacity of improving their behavior over time, based on their incoming
experiences.
Although a large number of control methods have been used to solve the problem of humanoid robot walking, it is difficult to detect a specific trend. Classical robotics, and also the more recent wave of humanoid and service robots, still relies heavily on teleoperation or fixed behavior-based control, with very little autonomous ability to react to the environment.
environment. Among the key missing elements is the ability to create control systems that
can deal with a large movement repertoire, variable speeds, constraints and most
importantly, uncertainty in the real-world environment in a fast, reactive manner. There are
several intelligent paradigms that are capable of solving intelligent control problems in
humanoid robotics. Connectionist theory (NN - neural networks), fuzzy logic (FL), and
theory of evolutionary computation (GA - genetic algorithms), are of great importance in
the development of intelligent humanoid robot control algorithms (Katić & Vukobratović, 2003a; Katić & Vukobratović, 2003b; Katić & Vukobratović, 2005). Due to their strong
learning and cognitive abilities and good tolerance of uncertainty and imprecision,
intelligent techniques have found wide applications in the area of advanced control of
humanoid robots. Also, of great importance in the development of efficient algorithms are
the hybrid techniques based on the integration of particular techniques such as neuro-fuzzy
networks, neuro-genetic algorithms and fuzzy-genetic algorithms.
One approach of departing from teleoperation and manual ‘hard coding’ of behaviors is by
learning from experience and creating appropriate adaptive control systems. A rather
general approach to learning control is the framework of Reinforcement Learning, described in
this chapter. Reinforcement learning offers one of the most general frameworks for taking traditional robotics towards true autonomy and versatility.
Reinforcement learning typically requires an unambiguous representation of states and
actions and the existence of a scalar reward function. For a given state, the most traditional of
these implementations would take an action, observe a reward, update the value function, and
select as the new control output the action with the highest expected value in each state (for a
greedy policy evaluation). Updating of value function and controls is repeated until
convergence of the value function and/or the policy. This procedure is usually summarized
under “value update – policy improvement” iterations. The reinforcement learning paradigm
described above has been successfully implemented for many well-defined, low-dimensional and discrete problems (Sutton & Barto, 1998; Bertsekas & Tsitsiklis, 1996) and has also yielded a variety of impressive applications in rather complex domains over the last 20 years. Reinforcement learning is well suited to training mobile robots, in particular to teaching a robot a new behavior (e.g. obstacle avoidance) from scalar feedback. Robotics is, however, a very challenging domain for reinforcement learning, and various pitfalls have been encountered when trying to scale these methods up to high-dimensional, continuous control problems, as typically faced in the domain of humanoid robotics.
categories. The first category represents static walkers, whose motion is very slow so that the system's stability is completely described by the normal projection of the centre of gravity, which depends only on the joint positions. The second category represents dynamic walkers, biped robots with feet and actuated ankles. The postural stability of dynamic walkers depends on the joint velocities and accelerations too. These walkers are potentially able to move in a static way provided they have large enough feet and the motion is slow. The third category represents purely dynamic walkers, robots without feet. In this case the support polygon during the single-support phase is reduced to a point, so that static walking is not possible.
In the walk with dynamic balance, the projected centre of mass is allowed outside of the
area inscribed by the feet, and the walker may essentially fall during parts of the walking
gait. The control problems of dynamic walking are more complicated than in walking with
static balance, but dynamic walking patterns provide higher walking speed and greater
efficiency, along with more versatile walking structures.
The rotational equilibrium of the foot is the major factor of postural instability with legged
robots. The question has motivated the definition of several dynamic-based criteria for the
evaluation and control of balance in biped locomotion. The most common criteria are the centre of pressure (CoP) and the zero-moment point (ZMP) concept; the latter has gained the widest acceptance and has played a crucial role in solving biped robot stability and periodic walking pattern synthesis (Vukobratović and Juričić, 1969). The ZMP is defined as the point on the ground about which the sum of all the moments of the active forces equals zero. If the ZMP is within the convex hull of all contact points between the foot and the ground, the biped robot can walk.
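The ZMP criterion above can be checked directly in software. The following minimal sketch (illustrative names and foot dimensions, not taken from the chapter) tests whether a ZMP point lies inside a convex support polygon given in counter-clockwise order:

def zmp_inside_support_polygon(zmp, polygon):
    """Return True if the 2-D ZMP point lies inside the convex support polygon.

    zmp     : (x, y) coordinates of the zero-moment point on the ground plane
    polygon : list of (x, y) vertices of the convex hull of contact points, CCW order
    """
    x, y = zmp
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The cross product tells on which side of edge (p1 -> p2) the ZMP lies.
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        if cross < 0.0:          # point is to the right of an edge -> outside
            return False
    return True

# Example: a 12 cm x 24 cm single-support foot print centred at the origin.
foot = [(-0.06, -0.12), (0.06, -0.12), (0.06, 0.12), (-0.06, 0.12)]
print(zmp_inside_support_polygon((0.01, 0.05), foot))   # True  -> balanced
print(zmp_inside_support_polygon((0.10, 0.00), foot))   # False -> balance lost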
For a legged robot walking on complex terrain, such as a ground consisting of soft and hard
uneven parts, a statically stable walking manner is recommended. However, in the cases of
soft terrain, up and down slopes or unknown environment, the walking machine may lose
its stability because of the position planning errors and unbalanced foot forces. Hence,
position control alone is not sufficient for practical walking, position/force control being
thus necessary. Foot force control can overcome these problems, so that foot force control is
one of the ways to improve the terrain adaptability of walking robots. For example, in the
direction normal to the ground, foot force has to be controlled to ensure firm foot support
and uniform foot force distribution among all supporting legs; foot force in the tangential
direction has to be monitored to avoid slippage.
A practical biped needs to be more like a human - capable of switching between different
known gaits on familiar terrain and learning new gaits when presented with unknown
terrain. In this sense, it seems essential to combine force control techniques with more
advanced algorithms such as adaptive and learning strategies. Inherent walking patterns
must be acquired through the development and refinement by repeated learning and
practice as one of important properties of intelligent control of humanoid robots. Learning
enables the robot to adapt to the changing conditions and is critical to achieving
autonomous behaviour of the robot.
Many studies have given weight to biped walking which is based only on stability of the
robot: steady-state walking, high-speed dynamic walking, jumping, and so on. A humanoid
robot is, however, a kind of integrated machine: a two-arm and two-leg mechanism. Hence, we must not only focus on the locomotion function but also on the arms' function of this kind of machine, manipulation and handling being the major functions of a robot's arms.
When the ground conditions and stability constraint are satisfied, it is desirable to select a
walking pattern that requires small torque and velocity of the joint actuators. Humanoid
robots are inevitably restricted to a limited amount of energy supply. It would therefore be
advantageous to consider the minimum energy consumption, when cyclic movements like
walking are involved. With this in mind, an important approach in research is to optimise
simultaneously both the humanoid robot morphology and control, so that the walking
behaviour is optimised instead of optimising walking behaviour for the given structure of
humanoid robot. Optimum structures can be designed when the suitable components and
locomotion for the robot are selected appropriately through evolution. It is well known that
real-time generation of control algorithms based on highly-complex nonlinear model of
humanoid robot commonly suffers from a large amount of computation. Hence, new time-
efficient control methods need to be discovered to control humanoid robots in real time, to
overcome the mentioned difficulty.
In summary, conventional control algorithms are based on kinematic and dynamic modeling of the mechanism structure (Vukobratović et al., 1990). This implies precise identification of the intrinsic parameters of the biped robot mechanism, requires highly precise measurement of the humanoid state variables, and needs precise evaluation of the interaction forces between foot and ground. Moreover, these methods require a large amount of computation and suffer from problems related to mathematical tractability, optimisation, limited extendability and limited biological plausibility. The second approach, based on intelligent control techniques, has the potential to overcome these constraints. In this case, it is not necessary to know the parameters and characteristics of the humanoid mechanism perfectly. These methods also take advantage of off-line and on-line learning capabilities. This last point is very important because the learning ability generally allows the autonomy of the biped robot to be increased.
field of reinforcement learning. Using prior knowledge about the desired motion can greatly simplify controller synthesis. Imitation-based learning, or learning from demonstration, allows the policy search to focus only on the areas of the search space that pertain to the task at hand. Both model-based and model-free approaches exist for finding optimal policies when agents are allowed to act for unlimited time. For physical agents, such as humanoid robots acting in the real world, it is much more difficult to gain experience; the exhaustive exploration of high-dimensional state and action spaces is not feasible. For a physical robot, it is essential to learn from few trials in order to have some time left for exploitation.
The robot learning is essentially concerned with equipping robots with the capacity of
improving their behavior over time, based on their incoming experiences. For instance, it
could be advantageous to learn dynamics models, kinematic models, impact models, for
model-based control techniques. Imitation learning could be employed for the teaching of
gaits patterns, and reinforcement learning could help tuning parameters of the control
policies in order to improve the performance with respect to given cost functions.
Dynamic bipedal walking is difficult to learn for a number of reasons. First, biped robots
typically have many degrees of freedom, which can cause a combinatorial explosion for
learning systems that attempt to optimize performance in every possible configuration of
the robot. Second, details of the robot dynamics, such as uncertainties in the ground contact and nonlinear friction in the joints, can only be validated experimentally. Since it is only practical to run a small number of learning trials on the real robot, the learning algorithms must perform well after obtaining a very limited amount of data. Finally, learning algorithms for dynamic walking must deal with dynamic discontinuities caused by collisions with the ground and with the problem of delayed reward: torques applied at one time may have an effect on the performance many steps into the future.
In the area of humanoid robotics, there are several reinforcement learning approaches (Benbrahim & Franklin, 1997; Chew & Pratt, 2002; Li et al., 1992; Mori et al., 2004; Morimoto et al., 2004; Nagasaka et al., 1999; Nakamura et al., 2003; Salatian et al., 1997; Zhou & Meng, 2000), with additional demands and requirements because of the high dimensionality of the control problem.
In the paper (Benbrahim & Franklin, 1997), it is shown how reinforcement learning is used
within a modular control architecture to enable a biped robot to walk. The controller structure
consists of central (CPG) and peripheral controllers. The learning architecture succeeded in
dealing with the problems of large numbers of inputs, knowledge integration and task
definition. The central controller controls the robot in nominal situations, and the peripheral
controllers intervene only when they consider that the central controller’s action contradicts their
individual control policies (Figure 1). The action is generated by computing the average of the
outputs of all controllers that intervene including the central controller. Each peripheral
controller’s role is to correct the central controller’s mistakes and issue an evaluation of the
general behaviour. The central controller then uses the average of all evaluations to learn a
control policy that accommodates the requirements of as many peripheral controllers as possible.
The central controller as well as some of the peripheral controllers in this study use
adaptive CMAC neural networks. Because of its modular nature, it is possible to use several
neural networks with small numbers of inputs instead of one large neural network. This
dramatically increases the learning speed and reduces the demand on memory and
computing power. The architecture also allows easy incorporation of any knowledge by
adding a peripheral controller that represents that knowledge. The CPG uses reinforcement
learning in order to learn an optimal policy. The CMAC weights are updated using the
reinforcement signals received from the peripheral controllers. Reinforcement learning is
well suited for this kind of application. The system can try random actions and choose
those that yield good reinforcement. The reinforcement learning algorithm uses the actor-
critic configuration. It searches the action space using a Stochastic Real Valued (SRV) unit
at the output. The reinforcement signal is generated using TD Learning. The CMAC neural
networks used in the biped’s learning are pre-trained using a biped walking robot simulator
and predefined simple walking gaits. The size of the search domain is determined by the
standard deviation of the Gaussian unit. If the standard deviation is too small, the system
will have a very small search domain. This decreases the learning speed and increases the
system’s vulnerability to the local minima problem. If the factor is too large, the system’s
performance will not reach its maximum because there will always be some randomness even if the system has learned an optimal solution. It is in general safer to use a large factor than a small one. Even though this learning algorithm and architecture have successfully solved the problem of dynamic biped walking, there are many improvements that could be added to increase learning speed, robustness, and versatility. The performance must also be improved by dynamically setting the PID gains to deal with each specific situation.
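For readers unfamiliar with CMAC approximators, the following minimal sketch shows the idea behind the adaptive CMAC networks mentioned above: several shifted tilings of the input space, one weight per active cell, with the prediction being the sum of the active weights. All sizes and names are illustrative assumptions and do not reproduce the original controllers.

import numpy as np

class TinyCMAC:
    """Minimal CMAC function approximator: several overlapping tilings of the
    input space; the output is the sum of one weight per tiling."""

    def __init__(self, n_tilings=8, n_bins=16, low=-1.0, high=1.0, n_inputs=2):
        self.n_tilings, self.n_bins = n_tilings, n_bins
        self.low, self.high, self.n_inputs = low, high, n_inputs
        self.weights = np.zeros((n_tilings, n_bins ** n_inputs))

    def _active_cells(self, x):
        span = self.high - self.low
        cells = []
        for t in range(self.n_tilings):
            offset = span * t / (self.n_bins * self.n_tilings)  # shifted tiling
            idx = 0
            for xi in x:
                b = int((xi - self.low + offset) / span * self.n_bins)
                b = min(max(b, 0), self.n_bins - 1)
                idx = idx * self.n_bins + b
            cells.append(idx)
        return cells

    def predict(self, x):
        return sum(self.weights[t, c] for t, c in enumerate(self._active_cells(x)))

    def update(self, x, target, lr=0.1):
        # Distribute the prediction error evenly over the active cells.
        err = target - self.predict(x)
        for t, c in enumerate(self._active_cells(x)):
            self.weights[t, c] += lr * err / self.n_tilings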
Fig. 1. Modular control architecture with a central controller and peripheral controllers; the block labels include obstacle, body posture, height, joint angles and central controller.
initial conditions. There were, however, many limitations (limited step length, slow walking,
no adaptation for left-right balance, no possibility of walking on sloping surfaces). A new dynamically balanced scheme for handling variable-speed gait was then proposed, based on preplanned but adaptive motion sequences in combination with closed-loop reactive control.
This allows the algorithm to improve the walking performance over consecutive steps using
adaptation, and to react to small errors and disturbances using reactive control. New sensors
(piezoresistive accelerometers and two solid-state rate gyroscopes) are mounted on the new
UNH biped (Fig. 2).
goals more efficiently. On the other hand, it has been reported that a model-based approach to reinforcement learning is able to accomplish given tasks much faster than approaches that do not use knowledge of the environment.
The problem of biped gait synthesis using reinforcement learning with fuzzy evaluative feedback is considered in the paper (Zhou & Meng, 2000). First, an initial gait is generated from fuzzy rules using a human intuitive balancing scheme. Simulation studies showed that the fuzzy gait synthesizer can only roughly track the desired trajectory. A disadvantage of the proposed method is the lack of practical training data. In this case there is no numerical teaching signal; only an evaluative feedback signal exists (failure or success), namely when the biped robot falls (or almost falls) down. Hence, it is a typical reinforcement learning problem. The dynamic balance knowledge is accumulated through reinforcement learning, constantly improving the gait during walking. More precisely, it is fuzzy reinforcement learning that uses a fuzzy critic signal. For human biped walking, it is typical to use linguistic critic signals such as "near-fall-down", "almost-success", "slower", "faster", etc. In this case, the gait synthesizer with reinforcement learning is based on a modified GARIC (Generalized Approximate Reasoning for Intelligent Control) method. This architecture of the gait synthesizer consists of three components: an action selection network (ASN), an action evaluation network (AEN), and a stochastic action modifier (SAM) (Fig. 3). The ASN maps a state vector into a recommended action using fuzzy inference. The training of the ASN is achieved, as with standard neural networks, using the error signal of the external reinforcement. The AEN maps a state vector and a failure signal into a scalar score which indicates the goodness of the state; it is also used to produce the internal reinforcement. The SAM uses both the recommended action and the internal reinforcement to produce a desired gait for the biped. The reinforcement signal is generated based on the difference between the desired ZMP and the real ZMP in the x-y plane. In all cases, this control structure includes on-line adaptation of the gait synthesizer and local PID regulators. The approach is verified using simulation experiments. In the simulation studies, only even terrain for biped walking is considered; hence the approach should also be verified for irregular and sloped terrain.
Fig. 3. Reinforcement-learning-based gait synthesizer (modified GARIC): for each ZMP coordinate (X_zmp, Y_zmp), a failure signal drives an action evaluation network (AEN) that produces an internal reinforcement, an action selection network (ASN) recommends an action, and a stochastic action modifier (SAM) outputs the commanded value, where X_zmp and Y_zmp are the ZMP coordinates and θ^d are the desired joint angles of the biped gait.
There are some research works that include the application of reinforcement learning
control algorithms for passive or semi-passive biped walkers (Fig.4) (Tedrake et al., 2004;
Morimoto et al., 2005; Schuitema et al., 2005).
a promising route for the development of reinforcement learning for truly high-dimensional, continuous state-action systems. The paper (Tedrake et al., 2004) presents a learning system which is able to quickly and reliably acquire a robust feedback control policy for 3D dynamic walking from a blank slate, using only trials implemented on the physical robot. The robot begins walking within a minute and learning converges in approximately 20
minutes. This success can be attributed to the mechanics of our robot, which are modelled
after a passive dynamic walker, and to a dramatic reduction in the dimensionality of the
learning problem. The reduction of the dimensionality was realized by designing a robot
with only 6 internal degrees of freedom and 4 actuators, by decomposing the control
system in the frontal and sagittal planes, and by formulating the learning problem on the
discrete return map dynamics. A stochastic policy gradient algorithm was applied to this reduced problem, with the variance of the update decreased using a state-based estimate of the expected cost. This optimized learning system works quickly enough that
the robot is able to continually adapt to the terrain as it walks. The learning on robot is
performed by a policy gradient reinforcement learning algorithm (Baxter & Bartlett, 2001;
Kimura & Kobayashi, 1998; Sutton et al., 2000).
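The following sketch illustrates the flavour of such a stochastic policy-gradient update on the return map: the feedback gains are perturbed with Gaussian noise, the step cost is measured, and the gains are moved against the noise-weighted advantage. It is a schematic illustration under assumed names (sample_return_map, sigma, baseline), not the exact algorithm of Tedrake et al.

import numpy as np

def policy_gradient_step(theta, sample_return_map, sigma=0.05, lr=1e-2, baseline=0.0):
    """One update of a stochastic policy-gradient learner on return-map dynamics.

    theta             : current vector of policy parameters (e.g. feedback gains)
    sample_return_map : callback returning the cost of one walking step for given gains
    """
    noise = sigma * np.random.randn(*theta.shape)    # Gaussian exploration around theta
    cost = sample_return_map(theta + noise)          # cost measured on the discrete return map
    advantage = cost - baseline                      # a state/step-based baseline reduces variance
    # Gradient of log N(noise; 0, sigma^2) with respect to theta is noise / sigma**2.
    theta_new = theta - lr * advantage * noise / sigma**2   # descend the estimated cost gradient
    return theta_new, cost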
Some researchers (Kamio & Iba, 2005) have efficiently applied a hybrid version of reinforcement learning structures, integrating genetic programming and the Q-learning method on a real humanoid robot.
the Q-value. It may be considered as a compact version of the AHC, and is simpler in
implementation. Some Q-learning based reinforcement learning structures have also been
proposed (Glorennec & Jouffe, 1997; Jouffe, 1998; Berenji, 1996). In (Berenji & Jouffe, 1997), a
dynamic fuzzy Q-learning is proposed for fuzzy inference system design. In this method,
the consequent parts of fuzzy rules are randomly generated and the best rule set is selected
based on its corresponding Q-value. The problem in this approach is that if the optimal
solution is not present in the randomly generated set, then the performance may be poor. In
(Jouffe, 1998), fuzzy Q-learning is applied to select the consequent action values of a fuzzy
inference system. For these methods, the consequent value is selected from a predefined
value set which is kept unchanged during learning, and if an improper value set is assigned,
then the algorithm may fail. In (Berenji, 1996), a GARIC-Q method is proposed. This method
works at two levels, the local and the top levels. At the local level, a society of agents (fuzzy networks) is created, with each learning and operating based on GARIC. At the top level, fuzzy Q-learning is used to select the best agent at each particular time. In contrast to the aforementioned fuzzy Q-learning methods, in GARIC-Q the consequent parts of each fuzzy network are tunable and are based on the AHC algorithm. Since the learning is based on a gradient descent algorithm, it may be slow and may suffer from the local optimum problem.
projections of the contours of the right (RF) and left (LF) robot foot on the ground surface,
whereby the shaded areas represent the zones of direct contact with the ground surface.
ΔM_x^(zmp) = (l_3 / 2) [F_2 + F_4 - (F_20 + F_40)] - l_23 [F_1 + F_3 - (F_10 + F_30)]
ΔM_y^(zmp) = l_2 [F_3 + F_4 - (F_30 + F_40)] - l_1 [F_1 + F_2 - (F_10 + F_20)]
F_r^(z) = Σ_{i=1}^{4} F_i ,   Δx^(zmp) = ΔM_y^(zmp) / F_r^(z) ,   Δy^(zmp) = ΔM_x^(zmp) / F_r^(z)
where F_i and F_i0, i = 1,...,4, are the measured and nominal values of the ground reaction force; ΔM_x^(zmp) and ΔM_y^(zmp) are the deviations of the moments of the ground reaction forces around the axes passing through the nominal ZMP point 0_zmp; F_r^(z) is the resultant force of the ground reaction in the vertical z-direction, while Δx^(zmp) and Δy^(zmp) are the displacements of the ZMP position from its nominal point 0_zmp. The deviations Δx^(zmp) and Δy^(zmp) of the ZMP position from its nominal position in the x- and y-directions are calculated from the previous relations. The instantaneous position of the ZMP is the best indicator of the dynamic balance of the robot mechanism. Fig. 6b illustrates certain areas (Z_0, Z_1 and Z_2), the so-called safe zones of dynamic balance of the locomotion mechanism. A ZMP position inside these "safety areas" ensures a dynamically balanced gait of the mechanism, whereas its position outside these zones indicates the state of losing the balance of the overall mechanism and the possibility of its overturning. The quality of robot balance control can be measured by the success in keeping the ZMP trajectory within the mechanism support polygon (Fig. 6b).
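As an illustration of how such ZMP deviations can be obtained from the four foot force sensors, the sketch below computes the ZMP as the force-weighted mean of the sensor positions and subtracts the nominal value. It is a simplified stand-in for the exact expressions above, with illustrative names.

import numpy as np

def zmp_deviation(forces, forces_nominal, sensor_xy):
    """Deviation of the ZMP from its nominal position from four foot force sensors.

    forces, forces_nominal : length-4 arrays of measured / nominal vertical forces F_i, F_i0
    sensor_xy              : 4x2 array of sensor coordinates in the foot frame
    """
    F = np.asarray(forces, dtype=float)
    F0 = np.asarray(forces_nominal, dtype=float)
    P = np.asarray(sensor_xy, dtype=float)
    Fr = F.sum()                                       # resultant vertical reaction F_r^(z)
    zmp = (P * F[:, None]).sum(axis=0) / Fr            # measured ZMP position
    zmp0 = (P * F0[:, None]).sum(axis=0) / F0.sum()    # nominal ZMP position
    dx, dy = zmp - zmp0                                # deviations Delta x^(zmp), Delta y^(zmp)
    return dx, dy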
Based on the above assumptions, a block diagram of the intelligent controller for the biped locomotion mechanism is proposed in Fig. 7. It involves two feedback loops: (i) a basic dynamic controller for trajectory tracking, and (ii) an intelligent reaction feedback at the ZMP based on an AHC reinforcement learning structure. The synthesized dynamic controller was designed on the basis of the centralized model. The vector of driving moments P̂ is the sum of the driving moments P̂_1 and P̂_2. The torques P̂_1 are determined so as to ensure precise tracking of the robot's position and velocity in the space of joint coordinates. The driving torques P̂_2 are calculated with the aim of correcting the current ZMP position with respect to its nominal position. The vector P̂ of driving torques is the output control vector.
v = Σ_i B_i ΔM_i^(zmp) + Σ_j C_j f( Σ_i A_ij ΔM_i^(zmp) )
R̂(t+1) = 0                               (beginning state)
R̂(t+1) = R(t) - v(t)                     (failure state)
R̂(t+1) = R(t) + γ v(t+1) - v(t)          (otherwise)
where γ is a discount coefficient between 0 and 1 (in this case γ is set to 0.9).
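A compact sketch of the evaluation and internal-reinforcement computation described above might look as follows; the sigmoid activation and the variable names are assumptions, not taken verbatim from the chapter.

import numpy as np

def aen_value(dM, A, B, C):
    """Evaluation v of the current ZMP-moment deviation dM by the action evaluation
    network (AEN): a linear term plus one hidden sigmoidal layer, as in the relation above."""
    f = lambda s: 1.0 / (1.0 + np.exp(-s))    # sigmoid activation (assumed form of f)
    hidden = f(A @ dM)                        # hidden-unit outputs f(sum_i A_ij dM_i)
    return B @ dM + C @ hidden

def internal_reinforcement(R, v_prev, v_now, failed, first_step, gamma=0.9):
    """Internal reinforcement R_hat(t+1) of the actor-critic (AHC) scheme."""
    if first_step:
        return 0.0
    if failed:
        return R - v_prev                     # failure state
    return R + gamma * v_now - v_prev         # temporal-difference signal otherwise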
The ASN (action selection network) maps the deviation of the dynamic reactions ΔM^(zmp) ∈ R^{2×1} into a recommended control torque. The structure of the ASN is an ANFIS, a Sugeno-type adaptive neuro-fuzzy inference system. There are five layers: an input layer, an antecedent part with fuzzification, a rule layer, a consequent layer, and an output layer with defuzzification. The system is based on a fuzzy rule base generated from expert knowledge, with 25 rules. The partition of the input variables (deviations of the dynamic reactions) is defined by 5 linguistic variables: NEGATIVE BIG, NEGATIVE SMALL, ZERO, POSITIVE SMALL and POSITIVE BIG. The membership functions are chosen to be triangular.
The SAM (stochastic action modifier) uses the recommended control torque from the ASN and the internal reinforcement signal to produce the final commanded control torque P_dr. It is defined by a Gaussian random function whose mean is the recommended control torque, while the standard deviation is defined by the following equation:
σ(R̂(t+1)) = 1 - exp( -|R̂(t+1)| )        (9)
Once the system has learned an optimal policy, the standard deviation of the Gaussian
converges toward zero, thus eliminating the randomness of the output.
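The stochastic action modifier can be written in a few lines; the sketch below assumes equation (9) in the reconstructed form σ = 1 - exp(-|R̂|) and uses illustrative names.

import numpy as np

def stochastic_action_modifier(recommended_torque, r_hat, rng=np.random):
    """SAM: draw the commanded torque from a Gaussian centred on the ASN
    recommendation, with a standard deviation that shrinks to zero as the
    internal reinforcement approaches zero (eq. (9))."""
    sigma = 1.0 - np.exp(-abs(r_hat))            # sigma(R_hat) = 1 - exp(-|R_hat|)
    return rng.normal(recommended_torque, sigma)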
The learning process for the AEN (tuning of the three sets of weighting factors A, B, C) is accomplished by step changes calculated as products of the internal reinforcement, the learning constant and the appropriate input values from the previous layers, i.e. according to the following equations:
B_i(t+1) = B_i(t) + β R̂(t+1) ΔM_i^(zmp)(t)
C_j(t+1) = C_j(t) + β R̂(t+1) f( Σ_i A_ij(t) ΔM_i^(zmp)(t) )
A_ij(t+1) = A_ij(t) + β R̂(t+1) f( Σ_i A_ij(t) ΔM_i^(zmp)(t) ) (1 - f( Σ_i A_ij(t) ΔM_i^(zmp)(t) )) sgn( C_j ΔM_j^(zmp)(t) )        (12)
where β is the learning constant. The learning process for the ASN (tuning of the antecedent and consequent layers of the ANFIS) is accomplished by gradient step changes (back-propagation algorithms) defined by the scalar output values of the AEN, the internal reinforcement signal, the learning constants and the current recommended control torques.
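A minimal sketch of one AEN learning step along the lines of eqs. (10)-(12) is given below; the sigmoid hidden layer and the exact placement of the sign term are assumptions based on the standard AHC update rather than the chapter's exact formulation.

import numpy as np

def aen_update(A, B, C, dM, r_hat, beta=0.05):
    """One AHC learning step for the AEN weights (in the spirit of eqs. (10)-(12)):
    each weight changes by the product of the internal reinforcement, the learning
    constant beta and the corresponding input/hidden activity."""
    f = lambda s: 1.0 / (1.0 + np.exp(-s))
    y = f(A @ dM)                                     # hidden activations
    B = B + beta * r_hat * dM                         # cf. eq. (10)
    C = C + beta * r_hat * y                          # cf. eq. (11)
    # cf. eq. (12): gradient-like step through the sigmoid, signed by the output weight
    A = A + beta * r_hat * (y * (1.0 - y) * np.sign(C))[:, None] * dM[None, :]
    return A, B, C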
In our research, the precondition part of the ANFIS is constructed online by a special clustering approach. General grid-type partition algorithms work either with training data collected in advance or with the number of clusters assigned a priori. In reinforcement learning problems, the data are generated only while online learning is performed. For this reason, a new clustering algorithm based on the Euclidean distance measure, with the abilities of online learning and automatic generation of the number of rules, is used.
parts of an ANFIS system. In application, this method enables us to deal with continuous state and action spaces. It helps to solve the curse of dimensionality encountered in high-dimensional continuous state spaces and provides smooth control actions. Q-learning is a widely used reinforcement learning method for an agent to acquire an optimal policy. In this learning, an agent tries an action a(t) at a particular state x(t), and then evaluates its consequences in terms of the immediate reward R(t). To estimate the discounted cumulative reinforcement for taking actions from given states, an evaluation function, the Q-function, is used. The Q-function is a mapping from state-action pairs to predicted return, and its output for state x and action a is denoted by the Q-value, Q(x, a). Based on this Q-value, at time t the agent selects an action a(t). The action is applied to the environment, causing a state transition from x(t) to x(t+1), and a reward R(t) is received. Then, the Q-function is learned through incremental dynamic programming. The Q-value of each state/action pair is updated by
Q(x(t), a(t)) ← Q(x(t), a(t)) + α ( R(t) + γ Q*(x(t+1)) - Q(x(t), a(t)) )
Q*(x(t+1)) = max_{b ∈ A(x(t+1))} Q(x(t+1), b)        (13)
where A(x(t+1)) is the set of possible actions in state x(t+1), α is the learning rate and γ is the discount rate.
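In code, one step of the tabular form of eq. (13) can be sketched as follows (dictionary-based Q-table; the default values of α and γ are illustrative only):

def q_update(Q, x, a, reward, x_next, actions_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step following eq. (13): move Q(x, a) toward the reward
    plus the discounted best value achievable from the next state."""
    best_next = max(Q.get((x_next, b), 0.0) for b in actions_next)   # Q*(x(t+1))
    td_error = reward + gamma * best_next - Q.get((x, a), 0.0)
    Q[(x, a)] = Q.get((x, a), 0.0) + alpha * td_error
    return Q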
Based on the above facts, a block diagram of the intelligent controller for the biped locomotion mechanism is proposed in Fig. 8. It involves two feedback loops: (i) a basic dynamic controller for trajectory tracking, and (ii) an intelligent reaction feedback at the ZMP based on a Q-learning reinforcement structure.
Fig. 8. Block diagram of the hybrid intelligent controller with fuzzy Q-learning: the ANFIS precondition part (built by online clustering) and a population of consequent-part individuals with associated q-values; the selected action is added to the output of the basic dynamic controller acting on the humanoid robot, while a critic uses the reinforcement signal R to update the q-values.
Rule i: If x_1(t) is A_i1 and ... and x_n(t) is A_in, then a(t) is a_i(t)        (14)
where x(t) is the input value, a(t) is the output action value, A_ij is a fuzzy set, and the recommended action a_i(t) is a fuzzy singleton. If we use Gaussian membership functions for the fuzzy sets, then for a given input x = (x_1, x_2, ..., x_n) the firing strength Φ_i(x) of rule i is calculated by
Φ_i(x) = exp{ - Σ_{j=1}^{n} ( (x_j - m_ij) / σ_ij )^2 }        (15)
where m_ij and σ_ij denote the mean and width of the fuzzy set.
Suppose the fuzzy system consists of L rules. By the weighted-average defuzzification method, the output of the system is calculated by
a = Σ_{i=1}^{L} Φ_i(x) a_i / Σ_{i=1}^{L} Φ_i(x)        (16)
The Q-value of this final output action should be the weighted average of the Q-values corresponding to the selected actions a_1(t), ..., a_L(t), i.e.,
Q(x(t), a(t)) = Σ_{i=1}^{L} Φ_i(x(t)) q_i(t) / Σ_{i=1}^{L} Φ_i(x(t)) = q_î(t)        (18)
From this equation, we see that the Q-value of the system output is simply equal to q_î(t), the Q-value of the selected individual î. This means that q_î simultaneously reveals both the performance of the individual and that of the corresponding system output action. In contrast to traditional Q-learning, where the Q-values are usually stored in a look-up table and only discrete state/action pairs can be handled, here both the input state and the output action are continuous. This avoids the impractical memory requirement for large state-action spaces. The aforementioned selecting, acting and updating process is repeatedly executed until the end of a trial.
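Equations (15), (16) and (18) translate directly into a few lines of code; the sketch below uses NumPy arrays of rule means, widths, consequent actions and q-values, all with illustrative names.

import numpy as np

def firing_strengths(x, means, widths):
    """Gaussian firing strength of each rule, eq. (15):
    Phi_i(x) = exp(-sum_j ((x_j - m_ij) / sigma_ij)^2)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-(((x - means) / widths) ** 2).sum(axis=1))

def fuzzy_output(x, means, widths, actions, q_values):
    """Weighted-average defuzzification (eq. (16)) and the matching Q-value (eq. (18))."""
    phi = firing_strengths(x, means, widths)
    w = phi / phi.sum()
    a = w @ actions       # continuous output action
    q = w @ q_values      # Q-value associated with the composite action
    return a, q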
Every time the fuzzy system applies an action a(t) to the environment and a reinforcement R(t) is received, learning of the Q-values is performed. We then update q_î(t) based on the immediate reward R(t) and the estimated rewards from subsequent states. Based on the Q-learning rule, we can update q_î as
e_i(t) = λ e_i(t-1),          if i ≠ î
e_i(t) = λ e_i(t-1) + 1,      if i = î
λ is a trace-decay parameter. The value e_i(t) can be regarded as an accumulating trace for each individual i, since it accumulates whenever an individual is selected and then decays gradually when the individual is not selected. It indicates the degree to which each individual is eligible for undergoing learning changes. With eligibility traces, (20) is changed to
q_i(t) = q_i(t) + α Δq_î(t) e_i(t)        (21)
for all i=1,…,N. Upon receiving a reinforcement signal, the Q-values of all individuals are
updated by (21).
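The eligibility-trace bookkeeping of eqs. (19)-(21) can be sketched as follows; applying the update to all individuals' q-values in proportion to their traces is an interpretation of the text above, with illustrative parameter values.

import numpy as np

def q_trace_update(q, e, selected, delta_q, alpha=0.1, lam=0.9):
    """Eligibility-trace update of the individuals' q-values: traces decay by lambda,
    the selected individual's trace is incremented, and every q_i moves in proportion
    to its own trace (eq. (21))."""
    e = lam * e                      # decay all traces
    e[selected] += 1.0               # accumulate for the chosen individual i_hat
    q = q + alpha * delta_q * e      # eq. (21), applied to all individuals
    return q, e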
The most important part of the algorithm is the choice of the reward function, i.e. the external reinforcement. It is possible to use a scalar critic signal (Katić & Vukobratović, 2007), but as one solution the reinforcement signal was considered as a fuzzy number R(t). We also assume that R(t) is the fuzzy signal available at time step t and caused by the input and action chosen at time step t-1, or even affected by earlier inputs and actions. For more effective learning, an error signal that gives more detailed balancing information should be given, instead of a simple "go / no-go" scalar feedback signal. As an example in this paper, the following fuzzy rules can be used to evaluate the biped balancing, according to the table below.
Δx^(zmp):            SMALL       MEDIUM    HUGE
Δy^(zmp) = SMALL     EXCELLENT   GOOD      BAD
Δy^(zmp) = MEDIUM    GOOD        GOOD      BAD
Δy^(zmp) = HUGE      BAD         BAD       BAD
Fuzzy rules for external reinforcement.
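In its simplest crisp form, the rule table above amounts to a lookup; the sketch below ignores the fuzzy membership functions of Fig. 9 and maps already-classified linguistic levels of the ZMP deviations to a linguistic reinforcement.

def external_reinforcement(dx_level, dy_level):
    """Crisp evaluation of biped balance from the linguistic levels of the ZMP
    deviations, following the rule table above (levels: 'SMALL', 'MEDIUM', 'HUGE')."""
    table = {
        ('SMALL', 'SMALL'): 'EXCELLENT', ('MEDIUM', 'SMALL'): 'GOOD',  ('HUGE', 'SMALL'): 'BAD',
        ('SMALL', 'MEDIUM'): 'GOOD',     ('MEDIUM', 'MEDIUM'): 'GOOD', ('HUGE', 'MEDIUM'): 'BAD',
        ('SMALL', 'HUGE'): 'BAD',        ('MEDIUM', 'HUGE'): 'BAD',    ('HUGE', 'HUGE'): 'BAD',
    }
    return table[(dx_level, dy_level)]

# Example: a small x-deviation but a huge y-deviation is evaluated as 'BAD'.
print(external_reinforcement('SMALL', 'HUGE'))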
The linguistic variables for the ZMP deviations Δx^(zmp) and Δy^(zmp) and for the external reinforcement R are defined using the membership functions shown in Fig. 9.
Fig. 9. Membership functions for ZMP deviations and external reinforcement.
the appropriate laboratory conditions. Characteristic laboratory details are shown in Fig. 10. The VICON motion capture studio equipment was used with the corresponding software package for processing the measurement data. To detect the current positions of the body links, special markers were placed at the characteristic points of the body/limbs (Figs. 10a and 10b). Continual monitoring of the marker positions during the motion was performed using six VICON high-accuracy infra-red cameras with a recording frequency of 200 [Hz] (Fig. 10c). Reaction forces of the foot impact/contact with the ground were measured on the force platform (Fig. 10d) with a recording frequency of 1.0 [GHz]. To mimic a rigid foot-ground contact, a 5 [mm] thick wooden plate was added to each foot (Fig. 10b).
Fig. 10. Experimental capture motion studio in the Laboratory of Biomechanics (Univ. of La
Reunion, CURAPS, Le Tampon, France): a) Measurements of human motion using
fluorescent markers attached to human body; b) Wooden plates as feet-sole used in
locomotion experiments; c) Vicon infra-red camera used to capture the human motion; d) 6-
DOFs force sensing platform -sensor distribution at the back side of the plate.
A moderately fast walk (v =1.25 [m/s]) was considered as a typical example of task which
encompasses all the elements of the phenomenon of walking. Having in mind the
experimental measurements on the biological system and, based on them further theoretical
considerations, we assumed that it is possible to design a bipedal locomotion mechanism
(humanoid robot) of a similar anthropomorphic structure and with defined (geometric and
dynamic) parameters. In this sense, we have started from the assumption that the system
parameters presented in Tables 1 and 2 were determined with relatively high accuracy and
that they reflect faithfully characteristics of the considered system. Bearing in mind
mechanical complexity of the structure of the human body, with its numerous DOFs, we
adopted the corresponding kinematic structure (scheme) of the biped locomotion
mechanism (Fig. 11) to be used in the simulation examples. We believe that the mechanism
(humanoid) of the complexity shown in Fig. 11 would be capable of reproducing with a
relatively high accuracy any anthropomorphic motion -rectilinear and curvilinear walk,
running, climbing/descending the staircase, jumping, etc. The adopted structure has three
active mechanical DOFs at each of the joints -the hip, waist, shoulders and neck; two at the
ankle and wrist, and one at the knee, elbow and toe. The fact is that not all available
mechanical DOFs are needed in different anthropomorphic movements. In the example
considered in this work we defined the nominal motion of the joints of the legs and of the
trunk. At the same time, the joints of the arms, neck and toes remained immobilized. On the
basis of the measured values of positions (coordinates) of special markers in the course of
motion (Figs. 10a, 10b) it was possible to identify angular trajectories of the particular joints
of the bipedal locomotion system. These joint trajectories represent the nominal, i.e. the reference, trajectories of the system. The graphs of these identified/reference trajectories are shown in Figs. 12 and 13. The nominal trajectories of the system's joints were differentiated with respect to time, with a sampling period of Δt = 0.001 [ms]. In this way, the
corresponding vectors of angular joint velocities and angular joint accelerations of the
system illustrated in Fig. 11 were determined. Animation of the biped gait of the considered
locomotion system, for the given joint trajectories (Figs. 12 and 13), is presented in Fig. 14
through several characteristic positions. The motion simulation shown in Fig. 14 was
determined using kinematic model of the system. The biped starts from the state of rest and
then makes four half-steps stepping with the right foot once on the platform for force
measurement. Simulation of the kinematic and dynamic models was performed using
Matlab/Simulink R13 and Robotics toolbox for Matlab/Simulink. Mechanism feet track
their own trajectories (Figs. 12 and 13) by passing from the state of contact with the ground
(having zero position) to free motion state.
Fig. 11. Kinematic scheme of a 38-DOFs biped locomotion system used in simulation as the
kinematic model of the human body referred to in the experiments.
Fig. 12. Nominal trajectories of the basic link: x-longitudinal, y-lateral, z-vertical, φ-roll, θ-pitch, ψ-yaw; nominal waist joint angles: q7-roll, q8-yaw, q9-pitch.
Some special simulation experiments were performed in order to validate the proposed reinforcement learning control approach. Initial conditions of the simulation examples (initial deviations of the joint angles) were imposed. Simulation results were analyzed over a time interval of 0.1 [s]. In the simulation example, two control algorithms were analyzed: (i) the basic dynamic controller described by the computed torque method (without learning) and (ii) the hybrid reinforcement learning control algorithm (with learning). The results obtained by applying controllers (i) (without learning) and (ii) (with learning) are shown in Figs. 15 and 16. It is evident that better results were achieved using the reinforcement learning control structure.
The corresponding position and velocity tracking errors in the case of applying the reinforcement learning structure are presented in Figs. 17 and 18. The tracking errors converge to zero within the given time interval. This means that the controller ensures good tracking of the desired trajectory. The application of the reinforcement learning structure also ensures the dynamic balance of the locomotion mechanism.
Fig. 19 presents the value of the internal reinforcement through the process of walking. It is clear that the task of walking within the desired ZMP tracking error limits is achieved in a good fashion.
Fig. 13. Nominal joint angles of the right and left leg: q13, q17, q20, q24-roll, q14, q21, q16,
q18, q23, q25-pitch, q15, q22-yaw.
Fig.14. Model-based animation of biped locomotion in several characteristic instants for the
experimentally determined joint trajectories.
Figs. 15-19 (plots versus time [ms]): ZMP error in the x-direction [m], ZMP error in the y-direction [m], position errors [rad], velocity errors [rad/s], and the reinforcement signal.
6. Conclusions
This study has considered optimal solutions for the application of reinforcement learning in humanoid robotics. Humanoid robotics is a very challenging domain for reinforcement learning, and reinforcement learning control algorithms represent a general framework for taking traditional robotics towards true autonomy and versatility. The reinforcement learning paradigm described above has been successfully implemented for some special types of humanoid robots in the last 10 years. Reinforcement learning is well suited to training biped walking, in particular to teaching a robot a new behavior from scalar or fuzzy feedback. The general goal in the synthesis of reinforcement learning control algorithms is the development of methods which scale to the dimensionality of humanoid robots and can generate actions for bipeds with many degrees of freedom. In this study, the control of walking of active and passive dynamic walkers using reinforcement learning was analyzed.
Various straightforward and hybrid intelligent control algorithms based on RL for active and passive biped locomotion have been presented. The proposed RL algorithms use learning elements that consist of various types of neural networks, fuzzy logic nets or neuro-fuzzy networks, with a focus on fast convergence properties and a small number of learning trials. A special part of the study is the synthesis of hybrid intelligent controllers for biped walking. The hybrid aspect concerns the application of model-based and model-free approaches as well as the combination of different paradigms of computational intelligence. These algorithms combine a dynamic controller based on the dynamic model with special compensators based on reinforcement structures. Two different reinforcement learning structures were proposed, based on the actor-critic approach and on Q-learning. The algorithms are based on fuzzy evaluative feedback obtained from human intuitive balancing knowledge. Reinforcement learning with fuzzy evaluative feedback is much closer to human evaluation of biped walking than the original scheme with scalar feedback.
The proposed hybrid intelligent control scheme fulfills the preset control criteria. Its application ensures the desired precision of the robot's motion while maintaining the dynamic balance of the locomotion mechanism during motion.
The developed hybrid intelligent dynamic controller can be potentially applied in
combination with robotic vision, to control biped locomotion mechanisms in the course of
fast walking, running, and even in the phases of jumping, as it possesses both the
conventional position-velocity feedback and dynamic reaction feedback. Performance of the
control system was analyzed in a number of simulation experiments in the presence of
different types of external and internal perturbations acting on the system. In this paper, we
only consider the flat terrain for biped walking. Because the real terrain is usually very
complex, more studies need to be conducted on the proposed gait synthesis method for
irregular and sloped terrain.
Dynamic bipedal walking is difficult to learn because of the combinatorial explosion involved in optimizing performance in every possible configuration of the robot, because of uncertainties of the robot dynamics that can only be validated experimentally, and because of the need to cope with dynamic discontinuities caused by collisions with the ground and with the problem of delayed reward: torques applied at one time may have an effect on the performance many steps into the future. Hence, for a physical robot, it is essential to learn from few trials in order to have some time left for exploitation. It is thus necessary to speed the learning up by using different methods (hierarchical learning, subtask decomposition, imitation, ...).
7. Acknowledgments
The work described in this chapter was conducted within the national research project "Dynamic and Control of High-Performance Humanoid Robots: Theory and Application" and was funded by the Ministry of Science and Environmental Protection of the Republic of Serbia. The authors thank Dr. Ing. Aleksandar Rodić for the generation of experimental data and for the realization of the humanoid robot modeling and trajectory generation software.
8. References
Barto, A.G., Sutton, R.S & Anderson, C.W., (1983), Neuron like adaptive elements that can
solve difficult learning control problems, IEEE Transactions on Systems, Man,and
Cybernetics, SMC-13, 5, September 1983, 834–846.
Baxter, J. and Bartlett, P., (2001). Infinite-horizon policy-gradient estimation, Journal of Artificial Intelligence Research, 319-350.
Benbrahim, H. & Franklin, J.A. (1997), Biped Dynamic Walking using Reinforcement
Learning, Robotics and Autonomous Systems, 22, 283-302.
Berenji, H.R. & Khedkar, P., (1992), Learning and Tuning Fuzzy Logic Controllers through Reinforcements, IEEE Transactions on Neural Networks, 724-740.
Berenji, H.R., (1996), Fuzzy Q-learning for generalization of reinforcement, in Proc. IEEE Int. Conf. Fuzzy Systems, 2208–2214.
Bertsekas,D.P. and Tsitsiklis,J.N. (1996), Neuro-Dynamic Programming, Athena Scientific,
Belmont, USA.
Chew, C. & Pratt, G.A., (2002), Dynamic bipedal walking assisted by learning, Robotica, 477-491
Doya, K., (2000), Reinforcement Learning in Continuous Time and Space, Neural
Computation, 219-245.
Sutton, R.S. and Barto,A.G. (1998) Reinforcement Learning: An Introduction, The MIT Press,
Cambridge, USA.
Sutton, R.S., McAllester, D., & Singh, S., (2000), Policy Gradient Methods for Reinforcement
learning with Function Approximation, in Advances in Neural Information Processing
Systems, 12, MIT Press, Cambrdige, USA, 1057-1063.
Tedrake, R.,Zhang, T.W. & Seung, H.S., (2004), Stochastic policy gradient reinforcement
learning on a simple 3d biped, in Proceedings of the 2004 IEEE/RSJ International
Conference on Intelligent Robots and Systems.
Vukobratović, M. & Juričić, D. (1969), Contribution to the Synthesis of Biped Gait, IEEE
Transactions on Biomedical Engineering, BME-16, 1, 1-6.
Vukobratović, M., Borovac, B., Surla, D., & Stokić, D., (1990), Biped Locomotion - Dynamics,
Stability, Control and Application, Springer Verlag, Berlin, Germany.
Watkins, C.J.C.H. & Dayan, P., (1992), Q-Learning, Machine Learning, 279-292.
Zatsiorsky, V., Seluyanov, V. & Chugunova, L. (1990) ,Methods of Determining Mass –
Inertial Characteristics of Human Body Segments, Contemporary Problems of
Biomechanics, 272-291, CRC Press.
Zhou, C. & Meng, D. (2000), Reinforcement Learning with Fuzzy Evaluative Feedback for a
Biped Robot. In: Proceedings of the IEEE International Conference on Robotics and Automation, San Francisco, USA, 3829-3834.
23
1. Introduction
Whether out of serious scientific interest or out of an amusing desire to be a creator like Pygmalion, human beings have long been fascinated with creating something that replicates ourselves, as shown in lifelike statues and the imaginative descriptions of fairy tales since ancient times. Today, these creations are finally emerging as humanoid robots, and their brilliant futures are forecast as follows. 1) Humanoid robots will take over boring, repetitive jobs and dangerous tasks in which everyday tools and environments designed and optimised for human use must be exploited without significant modification. 2) Efforts in developing humanoid robot systems and components will lead to excellent inventions in engineering, products and services. 3) The humanoid robot will itself be a research tool for the simulation, implementation and examination of the human algorithms of motion, behaviour and cognition with corporeality.
At the same time, I cannot help having some doubts about the future of the humanoid robot as an extension of the present development style. Our biological constitution has evolved to be made of bio-materials and actuated by muscles; present humanoid robots, on the contrary, are bound to be designed with conventional mechanical and electric elements prepared for industrial use, such as electric motors, devices, and metal and plastic parts. Such elements are vastly different in their characteristics from the biological ones and are inferior in some significant properties: power-to-weight ratio, minuteness, flexibility, robustness with self-repair, energy and economic efficiency, and so on. So "eternal differences" in function and appearance will remain between humans and humanoid robots.
I suppose there are chiefly two ways worth considering for developing a preferable humanoid robot body. One is to first promote the development of an artificial muscle that closely matches the biological one. It may be obvious that an ideal humanoid robot body would be constructed by taking a skeletal model core and attaching such perfect artificial muscles to it (e.g. Weghel et al., 2004). The other is to establish practical and realistic design paradigms within existing technology, in consideration of the limited performance of mechanical and electric elements. In this approach, the key point is to make full use of flexible ideas for trimming and restructuring the functions of a target: for example, making up an alternative function by integrating several simple methods when the target function is too complex to be realized as a unitary package. Since it seems it will take a long time until complete artificial muscles appear, I regard the latter way as a good prospect for the near future rather than just a compromise.
In the search for appropriate design paradigms for humanoid robots, we may still be at the stage of digging up and gathering many diverse and unique studies, styles and philosophies of design. After their practicability, reasonability and inevitability have been examined through the eyes of many people over a long time, the optimised design paradigms will be worked out. I believe the most outstanding ingenuities in mechanical design are brought out in robot hands, as they are the most complex mechanical systems on a humanoid robot body. Fortunately, I have had some experience of designing them in both endoskeleton and exoskeleton styles. To contribute to such an evolutionary process as a mechanical engineer, I present here some of my ideas on designing robot hands.
It is a very important feature of a robot hand that it has smooth back-drivability at each joint for stable control of the contacting force at the fingertip. I think a slight play in the gears, as a consequence of high back-drivability, is more acceptable than a large frictional resistance in the gears as a consequence of eliminating the play, because a general robot hand is required to exert only relatively small forces and low absolute accuracy in finger motions. So the actual fingers of my robot hand are manufactured with slight play to obtain the greatest back-drivability; each finger can be moved passively by a small force on the fingertip of at most about 0.3 N.
The interlocking mechanism among the three joints Jn,2, Jn,3 and Jn,4 consists of wire-pulley mechanisms with pulleys carved on the sidewall (Fig. 4(a)). Since these are thin enough, the finger segments retain some internal space for the motor, sensor and electric components. Although an identical transmission could be constructed with a link mechanism, it would tend to become larger because of the difficulty of providing smooth transmission over a large rotational angle of nearly 90 degrees. For a reasonable emulation of human motion, the transmission ratios are set to 7/10 from Jn,3 to Jn,2 and 5/7 from Jn,3 to Jn,4, respectively (Fig. 4(b)).
(a) Location of the reduction gears and the linkage (b) Embedded PCB
Fig. 6. Mechanism and electric equipment in the palm.
After the above considerations, I focused attention on the joint Jn,4 (J1,5 in the case of the thumb), as it can be free from generating the large forces mentioned above; it can therefore concentrate on generating the minute, fine forces suitable for delicate pinching. I then introduced a special independent DOF at the joint Jn,4 (J1,5) so that it can control the force and motion at the fingertip independently and finely. This is admittedly an unusual assignment of DOF, since a general robot hand interlocks the two distal joints Jn,3 and Jn,4 (J1,4 and J1,5); however, I think it maintains the basic design conditions and the general harmony of a humanoid robot hand, as the human finger also has a latent ability to move its terminal joint independently.
In addition, I propose a new handling strategy that combines powerful grasping with delicate pinching, in consideration of the fact that the small additional actuator at the joint Jn,4 (J1,5) cannot sustain a large contacting force at the fingertip.
In power grasping (Fig. 7(a)), the powerful grasping forces generated by the main actuator are transmitted to an object through the surfaces of the finger segments other than the fingertip. At the same time, the delicate force generated at the joint Jn,4 (J1,5) keeps the fingertip compliant with the object and enhances slip-free grasping. The movable range of the joint Jn,4 (J1,5) is broadened well into the backside, so that the fingertip can function over a wide range of postures. The surface around the terminal joint is shaped to suit direct contact with the object.
In delicate pinching (Fig. 7(b)), the contacting forces on the object are generated only by the joint Jn,4 (J1,5). When pinching forces larger than the joint Jn,4 (J1,5) can sustain are necessary, the joint is rotated to the end of its movable range, where a mechanical stopper sustains the large force generated by the main actuator; strong pinching is thus executed.
This strategy can be categorized as a general two-stage mechanism that combines separate rough and fine control methods to enlarge the total dynamic range.
When the motor current of the joint Jn,4 exceeded a certain threshold, the target was shifted gradually to a temporary target by adding/subtracting a small variation proportional to the difference between the threshold and the present motor current, so that the excess motor current was reduced. When the motor current fell below the threshold while the temporary target still differed from the original one, the temporary target was restored gradually toward the original one by adding/subtracting a small variation proportional to the difference between the latest temporary target and the original one. By repeating this control at high frequency, the active spring effect was created stably.
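As an illustration only, the current-threshold compliance loop just described can be sketched in Python as follows; the hardware accessors, gain values and names are hypothetical placeholders assumed for this sketch, not the actual implementation.

# A minimal sketch of the current-threshold compliance loop described above.
# read_motor_current(), set_joint_target() and the gains are hypothetical
# placeholders; the sign of the release step depends on the pressing direction.
def compliance_step(i_motor, i_threshold, temp_target, original_target,
                    k_release=0.002, k_restore=0.002):
    """One high-frequency control cycle for the joint Jn,4: returns the updated
    temporary position target."""
    if i_motor > i_threshold:
        # Excess current: back the temporary target off in proportion to the excess,
        # so that the contacting force (and hence the current) is reduced.
        temp_target -= k_release * (i_motor - i_threshold)
    elif temp_target != original_target:
        # Current below the threshold: creep back toward the original target.
        temp_target += k_restore * (original_target - temp_target)
    return temp_target

# Repeated at high frequency, e.g.:
#   target = ORIGINAL_TARGET
#   while controlling:
#       target = compliance_step(read_motor_current(), I_LIMIT, target, ORIGINAL_TARGET)
#       set_joint_target(target)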
In the experiment, the external pressure for examining the compliance was provided by the other part of the finger. Concretely, the finger part between the joints Jn,2 and Jn,4 was moved in a repeated slight sine-wave motion by position control of the main motor and kept pressing the fingertip against a fixed object. The contacting force between the fingertip and the object was measured using a film-like force sensor (Nitta, FlexiForce) placed on the object; of course, the observed force was not used in the motor control.
Fig. 9 shows the experimental result, with the transitions of the contacting force and the angular displacement θn,4. The threshold of the motor current was set to two values as examples, providing contacting-force limits of about 0.9 N and 1.4 N, respectively. The effectiveness of the smooth fingertip compliance is confirmed by analysing each transition as follows. When the contacting force just exceeded the limit, the joint Jn,4 started to rotate to release the excess, so that the contacting force stayed at the limit. The joint Jn,4 started to return to its original position when the contacting force fell below the limit. These movements are exactly the characteristics of the desired compliance. Moreover, the stable constancy of the maximum contacting force at the limit reveals the high efficiency of force transmission through the reducer and proves the adequacy of estimating the contacting force from the motor current.
This smooth compliance cannot be produced by a simple spring mechanism, which cannot change its output force freely; this practical function, achieved with a minimum of additional parts, is therefore one of the distinctive features of my robot hand.
Fig. 9. Experimental result of the contacting force and the angular displacement of a fingertip.
(a) Worm gear mechanism to drive the pulley (b) Actual embedded situation
Fig. 11. Differential mechanism for the independent root joint motion.
(a) Change of wiring path due to the finger flexion (b) Flexible PCB
Fig. 12. Design of the wiring around the joint that contains the large reducer.
To confirm the dexterity of the robot hand, experiments on representative, practical handling motions were conducted; this paper presents two handling types: pinching a business card and holding a pen (Fig. 14). The key evaluation items in these experiments were the two distinctive functions: the smooth compliance of the fingertip and the twisting of the thumb. All fingertip forces were generated by the simple open-loop torque control method explained in section 2.7, without force sensors. Note that the smart wiring style explained in section 2.10 is installed only in the latest model, so the robot hand used in these experiments unfortunately did not have it.
In the experiment of pinching a business card, the robot hand switched several times between two pairs of pinching fingers: the thumb and the index finger, and the thumb and the middle finger (Fig. 15). In the intermediate phase, when all three fingers were in contact with the card, the thumb slid its fingertip under the card from the position opposing one fingertip to the position opposing the other. In the experiment of holding a pen, the robot hand moved the pen nib up and down and slid the thumb fingertip along the side of the pen (Fig. 16). In both experiments the objects, the card and the pen, were held stably, and these achievements prove that contacting forces appropriate in both strength and direction could be generated at each fingertip.
At SIGGRAPH 2006, I had the opportunity to join the exhibit of the Hoshino laboratory of the University of Tsukuba, which introduced my humanoid robot hand for the first time. The robot hand was demonstrated on a humanoid robot arm that is actuated by pneumatic power and has a 7-DOF wide movable range, a slender structure, and dimensions like the endoskeleton of a human arm (Fig. 17). Although its power is low and the practical payload at the wrist joint is about 1 kg, it could move the robot hand smoothly.
The conclusive advantage of the robot hand is that many complex functions are condensed into a humanlike size, weight and appearance, realising sophisticated dexterity. As the robot hand is well suited to delicate robot arms, after further refinement it could be developed into a good prosthetic hand in the near future.
Fig. 21 shows the definition of the kinematical symbols of the parts and parameters; for example, point V is the virtual axis. The specifications that determine the shape of the rack and sector gear are only the pitch circle radius r of the sector gear and the standard offset p between the centre-lines of Segment A and Bone A. Since the standard offset p is set to 10 mm for convenience of the practical mechanism design, only the radius r is subject to optimisation. The point V moves on the Y-axis as θ changes, and its behaviour is divided into three types according to the size of r (Fig. 22). Considering the trajectory closest to the point C, the preferable range of r is presumed to be 0.5p ≤ r ≤ (2/π)p.
The evaluation item for the optimisation is a deviation d, which expresses the deformation of the kinematical relationship between the two datum points A and B shown in Fig. 21; the optimal radius r should minimise it.
Fig. 23 shows curves of the deviation d versus θ for several settings of the radius r within the presumed range. To generalise the optimisation, each parameter is made dimensionless by dividing it by the offset p. Screening the curves and seeking the one whose peak of d over the movable range of θ is smallest, the optimal r is found as the value that produces this curve. For example, when the movable range is 0 ≤ θ ≤ π/2 the optimal radius r is 0.593p and the peak deviation d is 0.095p, and when the movable range is 0 ≤ θ ≤ π/3 the optimal radius r is 0.537p and the peak deviation d is 0.029p. As the offset p is set to 10 mm, the peak of d stays below an acceptable 1 mm; the mechanism with a rectilinear rack and circular gearwheels therefore has sufficient practicability.
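The screening described above is essentially a minimax search over the candidate radii. The following Python sketch illustrates that procedure only; the deviation formula itself is not reproduced in this section, so the function passed as `deviation` is a placeholder for it, and the sampling resolution is an arbitrary assumption.

import numpy as np

def optimal_radius(deviation, theta_max, p=1.0, samples=200):
    """Screen candidate radii in the presumed range 0.5*p <= r <= (2/pi)*p and return
    the one whose peak deviation d over 0 <= theta <= theta_max is smallest.
    deviation(theta, r, p) stands for the deviation d defined in the chapter."""
    thetas = np.linspace(0.0, theta_max, samples)
    radii = np.linspace(0.5 * p, (2.0 / np.pi) * p, samples)
    peaks = [max(deviation(t, r, p) for t in thetas) for r in radii]
    best = int(np.argmin(peaks))
    return radii[best], peaks[best]

# With the actual deviation formula supplied, optimal_radius(d, np.pi / 2) should
# reproduce r of about 0.593p with a peak deviation of about 0.095p, and
# optimal_radius(d, np.pi / 3) should give r of about 0.537p with a peak of about 0.029p.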
Fig. 23 Variation of curves of the deviation d.
(a) Path of the wire rope (b) Pushing spring (c) Tandem connection
Fig.24 Driving method of the circuitous joint.
The definition of the statical symbols is shown in Fig. 25, and the formulas of the inverse statics, which calculate the input (manipulated) variables w and F from the output (controlled) variables θ and τ, are derived as follows:

w = (2r + r_p)θ    (2)

F = τ/(2r + r_p) − (2k·r²)/(2(2r + r_p))·θ + (2F_s′·r)/(2r + r_p)    (3)
As these formulas show a simple and linear relationship between the input and output variables, this driving method promises the further advantage that the algorithm for controlling both position and force is fairly simple. When the spring effect is negligible, the second and third terms on the right side of formula (3) can be dropped, and the output torque τ can be controlled by using only the motor torque as the manipulated variable.
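To illustrate how direct this control becomes, the sketch below simply evaluates the reconstructed formulas (2) and (3). Reading w and F as the wire displacement and wire force, and r_p, k and F_s′ as a second radius, the pushing-spring constant and the spring preload, are assumptions taken from the reconstruction above, not definitions quoted from Fig. 25.

def inverse_statics(theta, tau, r, r_p, k, f_s):
    """Inverse statics of the circuitous joint following Eqs. (2) and (3):
    returns (w, F) for a joint angle theta [rad] and joint torque tau [Nm].
    Parameter meanings follow the reconstruction above and are assumptions."""
    denom = 2.0 * r + r_p
    w = denom * theta                                              # Eq. (2)
    f = tau / denom - (2.0 * k * r ** 2) / (2.0 * denom) * theta \
        + (2.0 * f_s * r) / denom                                  # Eq. (3)
    return w, f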
Fig. 31 Experimental result of transferring the contacting force.
Fig. 31 shows an experimental result; θ1 + θ2 + θ3 denotes the sum of the three joint angular displacements of MAF. Two vital features are demonstrated: prompt switching between the contact and non-contact phases, and transfer of the contacting force from SLF to MAF. The contacting force at the fingertip of SLF was applied by an assistant pushing on it; for example, the two contact phases around 7 s and 10 s were caused by the assistant's tapping. Although the phase-switching algorithm was a primitive bang-bang control, no oscillation between contact and non-contact occurred. I presume the reason is that, since the gap between the fingertips is kept small during the non-contact phase, the impact at the encounter that would trigger such an oscillation is not serious; moreover, the human fingertip provides effective damping to absorb it.
As shown by the curves after 13 s, the operator's finger could be moved by the assistant's force; the master-slave (bilateral) control with force feedback was thus verified.
In conclusion, this experiment shows that MSF has sufficient performance as the principal part of a master hand for Telexistence/Telepresence.
4. Conclusion
To contribute, as a mechanical engineer, to the evolutionary process of searching for appropriate design paradigms, I have presented in this paper some concrete ideas about robot hand design. While the designs of my robot hands may seem to some people to be filled with eccentric, whimsical and serendipitous ideas, I believe they are practical outcomes of flexible ingenuity in mechanical design: they take on pre-programmed but robust actuating roles that help the programmable but limited actuators, and realise a higher total balance in mechatronics. At the same time, to examine their practicability, reasonability and inevitability through the eyes of many people, it will be necessary to establish standard definitions and evaluation items in kinematics, dynamics, control algorithms and so on that can subsume almost all humanoid robots. Concretely, a standard format could be prepared so that any robot system can be sorted and identified by filling it in. Figs. 1 and 2 show my small trial of such a comprehensive comparison under a standard definition of robot hand kinematics. I also hope for worldwide collaboration, which would promote the development of many sophisticated mechanical and electric elements that are easy to use for engineers like me, who want help so that they can concentrate on their own special fields.
5. References
CyberGrasp: Immersion Co.
http://www.immersion.com/3d/products/cyber_grasp.php
DLR Hand: German Aerospace Center
http://www.dlr.de/rm/en/desktopdefault.aspx/tabid-398/
Gifu Hand: Dainichi Co., Ltd.
http://www.kk-dainichi.co.jp/e/gifuhand.html
Harada Hand: Harada Electric Industry Inc.
http://www.h-e-i.co.jp/Products/e_m_g/ph_sh_2_004.html
Jacobsen, S.C. et al. (1984). The UTAH/M.I.T Dextrous Hand: Works in Progress, Int. J. of
Robotics Research, Vol.3, No.4 (1984), pp.21-50
Lovchik, C.S. & Diftler, M.A.(1999). The Robonaut Hand: A Dexterous Robot Hand For
Space, Proc. of IEEE Int. Conf. on Robots & Automation, Detroit, MI, May 1999
McNeely, W.A. (1993). Robotic Graphics: A New Approach to Force Feedback for Virtual
Reality, Proc. of IEEE Virtual Reality Annual Int. Symp., pp.336–341, Seattle, Sep 1993
NASA Hands: National Aeronautics and Space Administration
http://robonaut.jsc.nasa.gov/hands.htm
Shadow Hand: Shadow Robot Company Ltd.
http://www.shadowrobot.com/hand/
Tachi, S. et al. (1994). A Construction Method of Virtual Haptic Space, Proc. of the 4th Int.
Conf. on Artificial Reality and Telexistence (ICAT '94), pp. 131-138, Tokyo, Jul 1994
Telexistence: Tachi, S. et al., Tele-existence (I): Design and evaluation of a visual display with
sensation of presence, Proc. of the RoManSy ‘84, pp. 245-254, Udine, Italy, Jun 1984
Telepresence: Minsky, M., TELEPRESENCE, OMNI, pp. 44-52, Jun 1980
Weghel, M.V. et al. (2004). The ACT Hand : Design of the Skeletal Structure, Proc. of IEEE Int.
Conf. on Robots & Automation, pp. 3375-3379, New Orleans, LA, Apr 2004
24. Assessment of the Impressions of Robot Bodily Expressions using Electroencephalogram Measurement of Brain Activity
1. Introduction
Recently, robotics research has focused on issues surrounding the interaction modalities
with robots, how these robots should look like and how their behavior should adapt while
interacting with humans. It is believed that in the near future robots will be more prevalent
around us. Thus it is important to understand accurately our reactions and dispositions
toward robots in different circumstances (Nomura et al., 2006). Moreover, the robot’s correct
production and perception of social cues is also important. Humans have developed
advanced skills in interpreting the intentions and the bodily expressions of other human
beings. If similar skills can be acquired by robots, it would allow them to generate behaviors
that are familiar to us and thus increase their chances of being accepted as partners in our
daily lives.
The expressiveness of a gesture is of great importance during an interaction process. We are
often required to give special attention to these signs in order to keep track of the interaction.
Humans have learned to adapt their behavior and to react to positive and negative bodily
expressions (Bartenieff & Lewis, 1980). Although there has been remarkable work on the
design issues of sociable robots (Breazeal, 2002) and affective autonomous machines
(Norman et al., 2003), there has not been much work on investigating the real impact of
robot bodily expressions on the human user in the context of human-robot interaction.
Knowing the effect of a generated gesture, a robot can select more accurately the most
appropriate action to take in a given situation. Besides, computer-animated characters have
been used to evaluate human perception of the significance of gestures. However, animated
characters and embodied ones should be treated differently since the latter are tangible
entities (Shinozawa et al., 2005).
In this article we report a study on the relation between bodily expressions and their impact on the observer. We also attempt to understand the effect that these expressions have on the observer's brain activity; since the brain is the source of every cognitive and emotional effort, its sensitivity to bodily expressions can be exploited during an interaction task.
1 Corresponding author: ABDELAZIZ KHIAT, Robotics laboratory, Graduate School of Information Science, Nara Institute of
Science and Technology, Keihanna Science City, 630-0192 JAPAN. Email: [email protected]
2 MASATAKA TOYOTA is currently with Canon Corporation.
Fig. 1. Considered scenario for robot bodily expressions and its perceived impression.
In this work, we conducted an experimental study in which several users were asked to observe different robot bodily expressions while their brain activity was recorded. The results suggest the existence of a relation between the type of bodily expression and the change in the level of the low-alpha band of brain activity. This result helped in selecting the features used to recognize the type of bodily expression an observer is watching at a given time. The recognition rate was about 80% for both robot bodily expressions and human bodily expressions. Potential applications include customized interface adaptation to the user, interface evaluation, and simple user monitoring.
Fig. 2. The subset of Shaver's classification of emotions used in the categorization of Bodily
Expressions.
• BE4: The robot lowers both arms and its head, then moves backward at low speed
for some distance, the goal is to show an expression of sadness.
• BE5: The robot raises both arms gradually while advancing before stopping, then it
lowers and raises its arms progressively for two cycles; the goal is to show an
expression of happiness.
• BE6: The robot advances quickly, then goes back and raises its right arm while
turning its head a bit to the right. It then lowers its arm and returns its head to the
original position; the goal is to show an expression of anger.
Fig. 3. Overview of the receptionist robot ASKA and its joints-link model.
The duration of each of these BEs was about 14 s. Their appropriateness and expressiveness were tested experimentally using questionnaires (see section 3.1).
r_b(n) = Σ_{k=0}^{p} a(k) s(n + k)    (2)

where the a(k) are the AR parameters and p is the order of the model. The order p is chosen on a "goodness of fit" basis, using the relative error variance (REV) criterion (Schlögl et al., 2000).
Fig. 4. An example illustrating the calculation of the power of low-alpha band for a
2[sec] data segment taken from electrode placement F3. The graph to the left shows the
raw EEG signal for the baseline period and the observation period. The graph to the
right shows the power spectra of the EEG signals, where low-alpha frequency band is
highlighted.
We apply (1) and (2) to calculate an approximate estimate of the power spectrum PS(f) of the signal s as follows:

PS(f) = V_p T / |1 + Σ_{k=0}^{p} a(k) e^(−j2πfkT)|²    (4)

V_p = (1/2) Σ_{n=0}^{N+p−1} ([r_f(n)]² + [r_b(n)]²)    (5)

where V_p is the averaged sum of the forward and backward prediction error energies and T is the sampling period.
Research in cognitive neuroscience has shown that the power of low-alpha frequency band
is the most reactive band to social cues such as movements (Allison et al., 2000; Cochin et al.,
1998). We suppose that this frequency band reacts in a similar way to robot bodily
expressions (Khiat et al., 2006). The next step in assessing the impression is to observe the
amount of change in the power of low-alpha frequency band compared to the whole
spectrum. The power L of a band between frequencies a and b is defined by:
L(a, b) = ∫_a^b PS(f) df / ∫_0^∞ PS(f) df    (6)
Using (6), we calculate the power of the low-alpha band, L_b, for the data taken during the baseline period and L_m for the data taken during the execution of a bodily expression. An example illustrating this calculation is shown in Fig. 4, where raw 2-second EEG signals collected during the baseline period and the observation period are shown on the left, their power spectra are shown on the right, and the low-alpha frequency band is highlighted. A comparison between L_b and L_m indicates the effect of a particular bodily expression on the user; this is used as the main evaluation criterion for the impression.
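For concreteness, the following Python sketch evaluates Eqs. (4) to (6) numerically on a frequency grid. It assumes that the AR coefficients a(k) and the prediction-error energy V_p of each 2 s segment are obtained elsewhere (for instance by Burg's method); the grid resolution and the variable names are assumptions of this sketch.

import numpy as np

def ar_power_spectrum(a, v_p, T, freqs):
    """AR power spectrum of Eq. (4): a = [a(0), ..., a(p)] are the AR parameters
    and v_p the averaged prediction error energy of Eq. (5)."""
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    denom = np.array([abs(1.0 + np.sum(a * np.exp(-2j * np.pi * f * k * T))) ** 2
                      for f in freqs])
    return v_p * T / denom

def relative_band_power(ps, freqs, f_lo=8.0, f_hi=11.0):
    """Relative power of the [f_lo, f_hi] band, approximating the integrals of
    Eq. (6) by sums over a uniform frequency grid."""
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return ps[band].sum() / ps.sum()

# Usage sketch (AR fits of the baseline and observation segments assumed given):
#   freqs = np.linspace(0.5, 50.0, 500)
#   L_b = relative_band_power(ar_power_spectrum(a_base, vp_base, 1 / 200.0, freqs), freqs)
#   L_m = relative_band_power(ar_power_spectrum(a_obs, vp_obs, 1 / 200.0, freqs), freqs)
# Comparing L_m with L_b then indicates the reaction to the observed expression.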
3. Experimental study
3.1 Expressiveness of robot bodily expressions
The goal of this experiment is to evaluate the expressiveness of every generated robot bodily
expression. Since this quality is highly subjective, the self-reporting approach is used.
Subjects. Seventeen (17) participants (two females and fifteen males aged between 20 and 50
years old) volunteered to take part in this experiment. They were either students or faculty
members at the Graduate School of Information Science. They were all familiar with robots
and had previous experiences of dealing with robots similar to the one used in the experiment.
Procedure. Every subject was shown a total of six bodily expressions, which were described
in section 2.2. The execution of each of the bodily expressions by the humanoid robot ASKA
lasted 14 seconds. After observing each robot bodily expression, enough time was given to
the subject to answer two questions about the expressiveness of that robot bodily expression,
and one more question about their impression after the observation. These answers were
then summarized as explained in section 2.3 to assess their expressiveness.
Fig. 5. The experimental setup where brain activity was measured according to the 10-20
international standard (Jasper, 1958).
The subjects were asked to relax as much as possible and think of nothing in particular
when recording the baseline period, which lasted for 14[sec]. They were also told that they
would be asked about the robot’s movements and that they had to watch carefully when the
robot was moving. This was important because we needed to make sure that the subjects
attended to the task. After the observation of each bodily expression, the subjects described
their impression in their own words. Having no constraints to express themselves, the
subjects gave more details about their impressions. These answers were used in categorizing
the impressions into pleased or unpleased based on Shaver’s original classification of
emotions (Shaver et al., 1987).
Results. Table 2 shows the self-reported impressions of the subjects after observing each robot bodily expression. There is a strong correlation between these results and the expressiveness results reported previously in section 3.1, with a coincidence level of 71%. For example, the impression of BE4 was considered unpleasant by up to 81% of the subjects, and its expressiveness was considered unpleasant by 94%. The same holds for BE1, whose impression was pleasant for 65% and whose expressiveness was pleasant for 100%, and for BE3, with a pleasant impression of 68% and a pleasant expressiveness of 94%. The case of BE6 is different: while its expressiveness was considered unpleasant by 82%, its impression was unpleasant for only 53% and pleasant for 47%. It is still inclined to the unpleasant side, but its partly pleasant effect is hard to explain, given that this bodily expression was created to express anger. The last case, BE2, shows a large difference between its 59% neutral expressiveness and its 70% unpleasant impression.
Subject  Category     Fp1  Fp2  F3  F4  T3  T4  P3  P4  O1  O2
1        Pleasant      –    –   –   +   –   –   –   –   +   –
1        Unpleasant    –    +   –   +   +   –   –   –   +   –
2        Pleasant      –    –   +   –   –   –   –   –   –   –
2        Unpleasant    –    –   –   –   –   –   +   +   –   –
3        Pleasant      –    –   –   –   +   +   –   +   –   –
3        Unpleasant    –    –   –   –   –   –   –   –   –   –
4        Pleasant      –    –   –   –   –   +   +   +   –   –
4        Unpleasant    –    –   –   –   –   –   +   +   +   –
5        Pleasant      –    –   –   –   –   +   –   –   –   –
5        Unpleasant    –    –   –   –   +   +   +   +   –   –
6        Pleasant      +    +   –   –   +   +   –   –   –   –
6        Unpleasant    –    –   –   –   +   +   –   –   –   –
7        Pleasant      –    –   +   –   +   +   –   –   –   +
7        Unpleasant    –    –   +   –   –   –   –   +   –   –
Table 3. Significant change in low-alpha power according to observed motion categories at
every electrode and for each subject. (+: significant change p<.05; –: no significant change).
This suggests that bodily expressions with a neutral expression can be perceived negatively
and can generate an unpleasant impression. The analysis of EEG data using the method
described in section 2.3 allowed the calculation of the power Lm of low-alpha frequency
band in each electrode channel and for each bodily expression. It also allowed the
calculation of the power Lb of the same frequency band for the baseline period. Comparing
Lm and Lb revealed the effect of observing a bodily expression in the change in the power of
low-alpha frequency band for each electrode channel. Table 3 summarizes the results of this
change in power, where only statistically significant change is indicated with the symbol +.
It can be seen that significant effect is mostly present at locations T3 and T4, then at P3 and
P4, and finally at F3 and F4. Knowing that these positions are located above the superior
temporal sulcus (STS) and above some specific parts of the prefrontal cortex (PFC) confirms
previous research findings about the activation of STS in the perception of social cues
(Cochin et al., 1998; Allison et al., 2000), and the activation of the mirror neurons located in
the PFC during learning and imitation tasks (Rizolatti & Craighero, 2004). Some reaction can
also be seen at other locations, for instance O1 and Fp2 for subject 1, O2 for subject 7, Fp1
and Fp2 for subject 6. The reactions at locations Fp1 and Fp2 are thought to be the result of blinking activity during the recording process, since these electrode positions are the closest to the eyes. It is important to note that no preprocessing was performed to remove eye-blink artifacts from the data. This approach was adopted because the goal is to conduct the investigation in natural conditions, where blinking is possible and should be taken into account. The reaction at locations O1 and O2 can be explained by the fact that the visual cortex is activated during vision, and this activation is usually captured at locations O1 and O2.
Nevertheless, the reactive locations were not always the same among different observers,
suggesting high individual differences. A generalization cannot be derived at this point
about the reaction of brain locations according to the category of the bodily expression that
is being observed. However, the presence of a reaction is confirmed and another approach is
necessary to achieve a more comprehensive result. On the other hand, there is a need to
assess the repeatability of similar reactions from the same observer when he/she is shown
the same bodily expression.
Fig. 6. Mean alpha power calculated over the ten trials for electrodes C3 and C4 (∗: p < .05).
Results. Analysis of the collected data, using the method described in section 2.3, identified the electrode channels at placements C3 and C4 as the most reactive for this subject. Figure 6 shows the mean power of the low-alpha frequency band calculated from the 10 trials for the electrode placements C3 and C4: C3 reacted significantly to the pleasant bodily expression and C4 reacted significantly to the unpleasant one.
Fig. 7 shows the overall result for the two bodily expressions, obtained by averaging the power change over all electrodes and the 10 trials. The difference in means between BE1 and BE4 is significant. Since BE1 is representative of pleasant bodily expressions and BE4 of unpleasant ones, this result suggests an overall significant decrease in the power of the low-alpha frequency band for pleasant motions and a significant increase for unpleasant ones. This confirms that the change in low-alpha power occurs every time the observer watches a bodily expression, and that its direction depends on the category of the observed bodily expression.
Fig. 7. Change in alpha power from the baseline using all electrodes, for each of the considered unpleasant and pleasant bodily expressions (∗: p < .05).
3.4 Discussion
The results presented in Table 1 confirmed the appropriateness of the expressiveness of the generated bodily expressions used in the experimental study: the unpleasant bodily expressions were classified as unpleasant and the pleasant ones as pleasant. During every experiment, the order in which the bodily expressions were shown to the observer was randomized so that the nature of the next bodily expression could not be predicted. Although the subjects were told nothing about the bodily expressions, their answers agreed with the hypothesis. This implies that people tend to interpret bodily expressions in similar ways, which facilitates their interpretation and use in interactions. There exists a shared basic knowledge that allows humans to properly interpret similar expressions, although this knowledge is strongly affected by environmental factors such as culture and local customs. The bodily expressions were treated by the observers as if they were performed by a human, even though it was the robot ASKA that actually performed them. It would be interesting to compare the interpretation of the same bodily expressions executed by humans and by robots in order to evaluate whether interpretation differences exist.
Table 2, on the other hand, correlates to a high extent with the results of Table 1. From this we can infer that observing a pleasant bodily expression will leave a pleasant impression on the observer, and vice versa. This means that the observer is affected by what he or she sees even though the performer is just a robot. The results of section 3.2 show that this effect is present in the observer's brain activity. Although a generalization could not be drawn from the obtained results, the presence of a reaction in brain activity was demonstrated.
It is important to acknowledge that the most reactive electrode positions were F3, F4,
T3, T4, P3 and P4, which are located above the STS and PFC. This supports the claims that
STS and mirror neurons get activated during the perception of social cues and the
observation of movements (Allison et al., 2000; Cochin et al., 1998), and that this can be used
effectively in the implementation of Brain-Machine Interfaces (Nicolelis, 2001; Wessberg et
al., 2000).
Finally, it was necessary to confirm the repeatability or the reproducibility of the same
reaction in similar conditions. The results showed that the power level of low-alpha
frequency band over all brain activity was inversely proportional to the category of the
observed bodily expression. Particularly the most significant reaction was present at
electrode positions C3 and C4 for the considered subject. These positions are close to the
premotor and motor cortices. Due to the low spatial resolution of EEG, it is difficult to assert
precisely which part of the brain is responsible for these reactions. However, current
research findings confirm that the STS has an important role in the interpretation of social
cues (Allison et al., 2000), and that mirror neurons are important during learning and
imitation tasks (Rizolatti & Craighero, 2004).
D²(X, M) = Σ_{j=1}^{d} w_j (x_j − m_j)²    (7)
where wj is the weighting factor which can be used to give preference to certain features
over others. This proved helpful and important in the semi-assisted learning of the data
structure that was necessary for our EEG data. The usual approach in using SOM starts by
preprocessing the selected data. Then, a feature extraction method is specified and used.
After that, the map is calculated using competitive learning. Finally, the resulting map is
labeled and used for recognition. It is important to note that in practical applications the
selection and preprocessing of data is of extreme importance, because unsupervised
methods only illustrate the structure in the data set, and the structure is highly influenced
by the features chosen to represent data items (Kaski, 1997). In the following we will show
how we used SOM in the recognition of bodily expressions executed by a humanoid robot
and of similar bodily expressions executed by a human.
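As an illustration of the approach just outlined, the following Python sketch implements a minimal SOM whose winner search uses the weighted distance of Eq. (7). The map size, learning rate and neighbourhood schedule are illustrative assumptions, not the settings actually used in this study.

import numpy as np

class WeightedSOM:
    """Minimal self-organizing map using the weighted distance of Eq. (7)."""

    def __init__(self, rows, cols, dim, feature_weights=None, seed=0):
        rng = np.random.default_rng(seed)
        self.codebook = rng.standard_normal((rows, cols, dim))
        # Grid coordinates of every map unit, used by the neighbourhood function.
        self.grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                         indexing="ij"), axis=-1)
        self.w = np.ones(dim) if feature_weights is None else np.asarray(feature_weights, float)

    def winner(self, x):
        # Weighted squared distance of Eq. (7) between x and every codebook vector.
        d2 = np.sum(self.w * (self.codebook - x) ** 2, axis=-1)
        return np.unravel_index(np.argmin(d2), d2.shape)

    def train(self, data, epochs=20, lr0=0.5, radius0=3.0):
        for e in range(epochs):
            lr = lr0 * (1.0 - e / epochs)
            radius = max(1.0, radius0 * (1.0 - e / epochs))
            for x in data:
                win = np.array(self.winner(x))
                dist2 = np.sum((self.grid - win) ** 2, axis=-1)
                h = np.exp(-dist2 / (2.0 * radius ** 2))[..., None]
                self.codebook += lr * h * (x - self.codebook)

# After training, each unit is labelled with the category (pleasant / unpleasant)
# of the feature vectors it wins most often; a new vector is then classified by
# the label of its winning unit.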
Fig. 8. Preprocessing EEG data for features extractions is done by calculating a moving
average of overlapping windows of predefined length (3[sec]).
♦ Delta : 00∼04[Hz]
♦ Theta : 04∼08[Hz]
♦ Low-Alpha : 08∼11[Hz]
♦ High-Alpha: 11∼13[Hz]
♦ Beta : 13∼30[Hz]
♦ Gamma : 30∼50[Hz]
Notice that the alpha frequency band is divided into low and high. This allows us to give a different importance coefficient to each frequency band, according to its contribution to recognizing the effect of bodily expressions. The resulting time series of EEG power spectrum features consists of a vector of 10×6 = 60 features for every 2 s (400-point) time interval.
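The sketch below illustrates how such a feature vector can be assembled from one window of multichannel EEG data: the power in each of the six bands listed above is computed for every channel and concatenated. It uses a plain periodogram as a stand-in for the AR spectrum used in this chapter, and the channel count and sampling rate are the assumed experimental values.

import numpy as np

# The six frequency bands listed above, in Hz.
BANDS = {"delta": (0, 4), "theta": (4, 8), "low_alpha": (8, 11),
         "high_alpha": (11, 13), "beta": (13, 30), "gamma": (30, 50)}

def band_powers(segment, fs=200.0):
    """Relative power in each of the six bands for one channel segment.
    A plain periodogram is used here as a stand-in for the AR spectrum of Eq. (4)."""
    ps = np.abs(np.fft.rfft(segment)) ** 2
    freqs = np.fft.rfftfreq(len(segment), 1.0 / fs)
    total = ps.sum()
    return [ps[(freqs >= lo) & (freqs < hi)].sum() / total for lo, hi in BANDS.values()]

def feature_vector(window):
    """window: array of shape (n_channels, n_samples), e.g. (10, 400) for a 2 s
    interval at 200 Hz; returns the concatenated band powers (10 x 6 = 60 features)."""
    return np.concatenate([band_powers(ch) for ch in window])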
Fig. 9. Recognition rates when using data from different sources (subjects), when observing
robot’s bodily expressions.
Fig. 10. Recognition rates when using data from different sources (subjects), when observing
a human performer.
Procedure. The human bodily expressions to be shown to the subjects were prepared beforehand. A volunteer in a black tight suit performed them, with a black cover over his head so that no facial expressions were visible to the observer. These bodily expressions were a reproduction of those described in section 2.2 and were recorded on HDTV7 video tapes. During the experiment, the bodily expressions were projected on a display large enough to ensure that the projected image of the human performer was as close as possible to life size. As in the experiment in section 3.2, the recording was
as close as possible to real size. Similar to the experiment in section 3.2, the recording was
obtained from 12 electrode locations, namely: F3, F4, F7, F8, C3, C4, P3, P4, T3, T4, T5, T6,
using the Polymate AP1132 EEG recording device. The sampling frequency was 200[Hz] and
the impedance was kept below 5[kΩ]. The subjects were shown a total of six bodily
expressions lasting 10[sec] each. During the recording of the baseline period, the subjects were
asked to relax as much as possible and to think of nothing in particular. To help them achieve
this state of mind, they were shown for 10[sec] the empty space of the room where the
recording of the bodily expressions was performed. In addition, to confirm the subjects
attended to the task they were told that they would be asked about the bodily expressions and
that they had to observe carefully. After each observation, they were asked to explain the
difference between the bodily expressions they just observed and the previous ones.
4.3 Discussion
Regardless of the performer, whether a robot or a human, the recognition rates of the
category of the observed bodily expression was about 80% when using subject’s data
7 HDTV stands for High Definition Television, a.k.a. High-Vision, which allows the recording of a high-resolution video stream.
However, this rate decreased significantly when additional data from other subjects was included in the training process. To cope with this problem, it would be interesting to train one SOM for each data source and then combine the resulting SOMs into a larger structure for the recognition task. This approach could keep a high recognition rate while taking all the collected data into consideration, so as not to lose the generality of the solution.
It is interesting to note that the average difference between the recognition rates for the robot and human cases is relatively small, as summarized in Table 4. This indicates that SOMs are suitable for generalizing the effect present in the input EEG signals regardless of whether the effect is generated by a human or a robot. Even if differences appear clearly when analyzing the raw EEG data, the SOMs succeed in eliminating them and keeping only the important information. SOMs also succeed in separating noise and artifacts from the signals reflecting brain activity. This is another powerful characteristic that helps in the online processing of EEG signals for applications related to Brain-Machine Interfaces (BMI).
styles (Hsu et al., 2005) would be interesting to provide more details about the bodily
expressions. There is an extensive work on generating emotional motions (Amaya et al.
1996; Lim et al. 2004) that could be incorporated in this research. Reaching this goal would
enable us to build an adaptive interface that makes the robot judge the effect of its own
gestures on the users and allows it to change its behavior accordingly.
6. Acknowledgements
This work was partly supported by the Ministry of Education, Culture, Sports, Science and
Technology (MEXT) and the NAIST 21st Century COE program ”Ubiquitous Networked
Media Computing”. The authors would like to thank Mr. Albert Causo for his comments
and suggestions to improve the manuscript.
7. References
T. Allison, A. Puce & G. McCarthy (2000). Social perception from visual cues: Role of the STS
region. Trends in Cognitive Sciences, 4(7):251–291.
K. Amaya, A. Bruderlin & T. Calvert (1996). Emotion from motion. In Graphics Interface ’96,
pages 222–229.
I. Bartenieff & D. Lewis (1980). Body Movement: Coping with the Environment. Gordon and
Breach Science Publishers, USA.
C. Breazeal (2002). Designing Sociable Robots. The MIT Press, USA.
S. Cochin, C. Barthlemy, B. Lejeune, S. Roux & J. Martineau (1998). Perception of motion and
qEEG activity in human adults. Electroencephalography and Clinical
Neurophysiology, 107:287–295.
E. Hsu, K. Pulli & J. Popovic (2005). Style translation for human motion. ACM Transactions
on Graphics, 24(3):1082–1089.
J. Ido, K. Takemura, Y. Matsumoto & T. Ogasawara (2002). Robotic receptionist ASKA: a
research platform for human-robot interaction. In IEEE Workshop of Robot and
Human Interactive Communication, pages 306–311.
M. Ito & J. Tani (2004). On-line imitative interaction with a humanoid robot using a
dynamic neural network model of a mirror system. Adaptive Behavior, 12(2):93–
115.
H. H. Jasper (1958). The ten-twenty electrode system of the international federation.
Electroencephalography and Clinical Neurophysiology, 10:371–375.
G. Johansson (1973). Visual perception of biological motion and a model for its analysis.
Perception and Psychophysics, 14(2):201–211.
S.-L. Joutsiniemi, S. Kaski & T. A. Larsen (1995). Self-organizing map in recognition of
topographic patterns of EEG spectra. IEEE Transactions on Biomedical Engineering,
42(11):1062–1068.
S. Kaski (1997). Data Exploration using Self-Organizing Maps. PhD thesis, Helsinki
University of Technology.
A. Khiat, M. Toyota, Y. Matsumoto & T. Ogasawara (2006). Brain activity in the evaluation
of the impression of robot bodily expressions. In IEEE/RSJ International
Conference on Intelligent Robots and Systems, pages 5504–5508.
T. Kohonen (1982). Self-organized formation of topologically correct feature maps.
Biological Cybernetics, 43:59–69.
446 Humanoid Robots, New Developments
H. Kozima (2002). Infanoid: A babybot that explores the social environment, pages 157–164.
Socially Intelligent Agents: Creating Relationships with Computers and Robots.
Kluwer Academic Publishers, The Netherlands.
H. Lim, A. Ishii & A. Takanishi (2004). Emotion-base biped walking. Robotica, 22:577–586.
M. L. Nicolelis (2001). Actions from thoughts. Nature, 409(18):403–407.
T. Nomura, T. Kanda & T. Suzuki (2006). Experimental investigation into influence of
negative attitudes toward robots on human-robot interaction. AI & Society,
20(2):138–150.
D. A. Norman, A. Ortony, and D. M. Russell (2003). Affect and machine design: Lessons for
the development of autonomous machines. IBM Systems Journal, 42(1):38–44.
F. E. Pollick, H. M. Paterson, A. Bruderlin & A. J. Sanford (2001). Perceiving affect from arm
movement. Cognition, 82:B51–B61.
G. Rizolatti & L. Craighero (2004). The mirror-neuron system. Annual Reviews of
Neuroscience, 27:169–192.
A. Schlögl, S. J. Roberts & G. Pfurtscheller (2000). A criterion for adaptive autoregressive
models. In IEEE International Conference of Engineering in Medicine & Biology
Society, pages 1581–1582.
S. Schaal (1999). Is imitation learning the route to humanoid robots? Trends in Cognitive
Science, 3(6):233–242.
P. Shaver, J. Schwartz, D. Kirson & C. O’Connor (1987). Emotion knowledge: Further
exploration of a prototype approach. Journal of Personality and Social Psychology,
52(6):1061–1086.
K. Shinozawa, F. Naya & K. Kogure (2005). Differences in effect of robot and screen agent
recommendations on human decision-making. International Journal of Human
Computer Study, 62(2):267–279.
T. Tanaka, T. Mori & T. Sato (2001). Quantitative analysis of impression of robot bodily
expression based on laban movement theory. Journal of Robotics Society of Japan,
19(2):252–259. (in Japanese)
J. Wessberg, C. R. Stambaugh, J. D. Kralik, P. D. Beck, M. Laubach, J. K. Chapin, J. Kim, S. J.
Biggs, M. A. Srinivasan & M. L. Nicolelis (2000). Real-time prediction of hand
trajectory by ensembles of cortical neurons in primates. Nature, 408:361–365.
C. M. Whissel (1989). The Dictionary of Affect in Language, volume 4 of Emotion: Theory,
research and experience. The measurement of emotions. Academic Press, USA.
Fig. 11. Diagram of table plane superposed on a top view of the robot ASKA.
The mathematical definition of Laban features (Shape and Effort) using the robot’s
kinematic and dynamic information is given such that larger values describe fighting
movement forms and smaller values describe indulging ones (Tanaka et al., 2001). Bartenieff
and Lewis stated in (Bartenieff & Lewis, 1980) that the Shape feature describes the
geometrical aspect of the movement using three parameters: table plane, door plane, and
wheel plane. They also reported that the Effort feature describes the dynamic aspect of the
movement using three parameters: weight, space, and time. The robot's link information used in the feature definitions is given in Fig. 3. In order to simplify the mathematical description, a limited number of joint parameters are considered, namely: the left arm θ_l1, the right arm θ_r1, the neck δ_1, the face δ_2, the left wheel ω_l, and the right wheel ω_r. The remaining parameters were fixed to default values during movement execution.
Using the diagram shown in Fig. 11, the table parameter of the Shape feature represents the spread of the silhouette as seen from above. It is defined as the scaled reciprocal of the sum of the mutual distances between the tips of the left and right hands and a focus point, as shown in (8):

Shape_table = s / (T_LF + T_RF + T_LR)    (8)

where T_LF, T_RF and T_LR denote the distances between the left hand tip and the focus, the right hand tip and the focus, and the two hand tips, respectively. The point of focus is set at the fixed distance L_F = 44 cm along the gaze line of the robot's head, Sh = 33 cm is the distance between the shoulders, L_A = 44 cm is the arm's length during movement execution, and s is a scaling factor.
The door parameter of the Shape feature represents the spread of the silhouette as seen from the front. It is defined as the weighted sum of the elevation angles of both arms and the head, as shown in (9); the sine reflects how far upward or downward each joint angle points. The weights d_l, d_r, d_n were fixed empirically to 1.

Shape_door = d_l sin θ_l1 + d_r sin θ_r1 + d_n sin δ_n1    (9)
The wheel parameter of the Shape feature represents the lengthwise shift of the silhouette in the sagittal plane. It is defined as the weighted sum of the velocity of the robot and the velocities of the arm extremities, as shown in (10). The weights were set empirically to −8 for w_t and to −1 for w_l and w_r.

Shape_wheel = w_t v_tr + w_l d/dt(L_A cos θ_l1) + w_r d/dt(L_A cos θ_r1)    (10)
The weight parameter of the Effort feature represents the strength of the movement. It is defined in (11) as the weighted sum of the energies exhibited per unit time by each part of the body during the movement. The weights were adjusted with respect to the saliency of the body parts: relatively large weights e_rt = e_tr = 5 were given to the movement of the trunk, and smaller weights to the movements of the arms, e_l = e_r = 2, and the neck, e_n = 1.
The time parameter of the Effort feature represents the briskness of the movement execution and covers the entire span from sudden to sustained movements. It is defined in (13) as the number of generated commands per unit time:

Effort_time = (number of generated commands) / (time span)    (13)
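The following Python sketch gathers the Laban parameters of Eqs. (8) to (10) and (13) as reconstructed above. The default weights and lengths mirror the values quoted in the text, while the analytic differentiation of the arm terms in Eq. (10) and the argument names are assumptions of this sketch.

import numpy as np

def shape_table(t_lf, t_rf, t_lr, s=1.0):
    """Eq. (8): scaled reciprocal of the mutual distances between hand tips and focus."""
    return s / (t_lf + t_rf + t_lr)

def shape_door(theta_l1, theta_r1, delta_n1, d_l=1.0, d_r=1.0, d_n=1.0):
    """Eq. (9): weighted sum of the elevation angles of both arms and the head."""
    return d_l * np.sin(theta_l1) + d_r * np.sin(theta_r1) + d_n * np.sin(delta_n1)

def shape_wheel(v_tr, theta_l1, theta_r1, dtheta_l1, dtheta_r1,
                l_a=0.44, w_t=-8.0, w_l=-1.0, w_r=-1.0):
    """Eq. (10): weighted sum of the trunk velocity and the time derivatives of the
    arm-tip terms L_A*cos(theta); dtheta_* are the joint angular velocities."""
    d_left = -l_a * np.sin(theta_l1) * dtheta_l1
    d_right = -l_a * np.sin(theta_r1) * dtheta_r1
    return w_t * v_tr + w_l * d_left + w_r * d_right

def effort_time(n_commands, time_span):
    """Eq. (13): number of generated commands per unit time."""
    return n_commands / time_span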
25. Advanced Humanoid Robot Based on the Evolutionary Inductive Self-organizing Network
1. Introduction
The bipedal structure is one of the most versatile choices for walking robots. A biped robot has almost the same mechanism as a human and is therefore suitable for moving in the environments where humans live, which contain stairs, obstacles, etc. However, the dynamics involved are highly nonlinear, complex and unstable, so it is difficult to generate human-like walking motion. To realize human-shaped and human-like walking robots, which we call humanoid robots, many studies on biped walking robots have been reported [1-4]. In contrast to industrial robot manipulators, the interaction between a walking robot and the ground is complex. The concept of the zero moment point (ZMP) [2] is known to give good results in controlling this interaction. As an important criterion for the stability of the walk, the trajectory of the ZMP beneath the robot foot during the walk has been investigated [1-7]. Through the ZMP, we can synthesize the walking patterns of a humanoid robot and demonstrate walking motion with real robots. Thus the ZMP is indispensable for ensuring the dynamic stability of a biped robot. The ZMP represents the point at which the ground reaction force is applied. The location of the ZMP can be obtained computationally using a model of the robot, but a large error may exist between the actual ZMP and the calculated one, due to deviations of the physical parameters between the mathematical model and the real machine. Thus, the actual ZMP should be measured to realize stable walking with a control method that makes use of it.
In this chapter, actual ZMP data throughout the whole walking phase are obtained from a practical humanoid robot, and the evolutionary inductive self-organizing network (EISON) [8-9] is applied. In this way we obtained natural walking motions on a flat floor, on slopes, and on an uneven floor.
Fig. 2. Example of PD whose various pieces of required information are obtained from its
chromosome.
(Figure: flowchart of the evolutionary procedure. One-point crossover is applied at a crossover site, e.g. A: 00000001111 and B: 11000110011 become A′: 00000000011 and B′: 11000111111; invert mutation is applied at a mutation site, e.g. A′: 00000000011 becomes 00010000011. The loop is repeated until a termination condition is reached, after which the chromosomes with good fitness values are selected as the new input variables of the next layer.)
This procedure is repeated until a termination condition is reached. After the evolution process, the final generation of the population consists of highly fit bits that provide optimal solutions. Once the termination condition is satisfied, the chromosome (PD) with the best performance in the final generation is selected as the output PD. All other chromosomes are discarded, and all nodes in the previous layers that have no influence on this output PD are also removed. By doing this, the EISON model is obtained.
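As a concrete illustration of the chromosome operations shown in the figure above, the following Python sketch implements one-point crossover and bitwise invert mutation on bit-string chromosomes. The default rates mirror the crossover and mutation rates listed in Table 3; the function names and the per-bit mutation scheme are assumptions of this sketch.

import random

def one_point_crossover(a, b, rate=0.85):
    """Exchange the tails of two bit-string chromosomes at a random crossover site."""
    if random.random() < rate and len(a) > 1:
        site = random.randrange(1, len(a))
        return a[:site] + b[site:], b[:site] + a[site:]
    return a, b

def invert_mutation(chrom, rate=0.05):
    """Flip each bit independently with the given mutation probability."""
    return "".join(str(1 - int(bit)) if random.random() < rate else bit for bit in chrom)

# Example with the bit strings from the figure: with the crossover site at index 7,
# one_point_crossover("00000001111", "11000110011") yields
# ("00000000011", "11000111111").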
In the performance index, m is the number of data pairs and y_i and ŷ_i are the i-th actual output and model output, respectively.
The design parameters of EISON in each layer are shown in Table 3, and the simulation results are summarized in Table 4. The overall lowest values of the performance indices, PI = 0.188 and EPI = 1.087, are obtained at the third layer when the weighting factor is 0.25.
                        1st layer   2nd layer   3rd layer
Maximum generations     40          60          80
Population size (w)     20 (15)     60 (50)     80
String length           8           20          55
Crossover rate (Pc)     0.85
Mutation rate (Pm)      0.05
Weighting factor        0.1~0.9
Type (order)            1~3
Table 3. Design parameters of EISON for modeling.
w: the number of chosen nodes whose outputs are used as inputs to the next layer
Fig. 5. Performance index values (PI and EPI) versus generations over the 1st, 2nd and 3rd layers.
Fig. 5 illustrates the trend of the performance index values produced in successive generations of the evolutionary algorithm when the weighting factor is 0.25. Fig. 6 shows the values of the error function and the fitness function over successive generations for the same weighting factor. Fig. 7 depicts the proposed EISON model with 3 layers for this weighting factor; the structure of EISON is very simple and shows good performance.
Fig. 6. Error function and fitness function versus generations over the 1st, 2nd and 3rd layers.
                              APE
Model                         PI (%)    EPI (%)
GMDH model [12]               4.7       5.7
Fuzzy model [11], Model 1     1.5       2.1
Fuzzy model [11], Model 2     0.59      3.4
FNN [14], Type 1              0.84      1.22
FNN [14], Type 2              0.73      1.28
FNN [14], Type 3              0.63      1.25
GD-FNN [13]                   2.11      1.54
Fig. 8. Identification performance of the EISON model with 3 layers and its errors: (a) actual versus model output for the training data set; (b) errors of (a); (c) actual versus model output for the testing data set; (d) errors of (c).
(Figure: joint angle definitions θ_r1 to θ_r6 and θ_l1 to θ_l6 of the right and left legs, with the x, y, z coordinate axes.)
Fig. 13. Implemented control board and its electric wiring schematic.
(Figure: joint angle definitions θ1 to θ10.)
The foot pressure is obtained by summing the force signals. Using the force sensor data, it is easy to calculate the actual ZMP. In the foot-support phase, the ZMP in the local foot coordinate frame is computed by equation (6):

P = Σ_{i=1}^{8} f_i r_i / Σ_{i=1}^{8} f_i    (6)

where f_i is the force applied at the i-th sensor of the right or left foot and r_i is the position vector of that sensor.
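For illustration, equation (6) can be evaluated directly as in the Python sketch below; the assumption that each sensor position is given as planar [x, y] coordinates in the foot frame, and the handling of the unloaded case, are choices of this sketch rather than details quoted from the chapter.

import numpy as np

def zmp_local(forces, positions):
    """Eq. (6): ZMP P in the local foot frame as the force-weighted average of the
    eight sensor positions. forces: shape (8,), positions: shape (8, 2) giving the
    [x, y] position of each sensor in the foot frame. Returns None when unloaded."""
    forces = np.asarray(forces, dtype=float)
    positions = np.asarray(positions, dtype=float)
    total = forces.sum()
    if total <= 0.0:
        return None
    return (forces[:, None] * positions).sum(axis=0) / total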
Fig. 16. Side view of the biped humanoid robot on a flat floor
Fig. 18. Side view of the biped humanoid robot on an uneven floor.
Experiments using EISON were conducted, and the results are summarized in the tables and figures below. The design parameters of the evolutionary inductive self-organizing network in each layer are shown in Table 7. The results of EISON for the humanoid robot walking on the flat floor are summarized in Table 8. The overall lowest values of the performance indices, 6.865 and 10.377, are obtained at the third layer when the weighting factor is 1. In addition, the generated ZMP positions and the corresponding ZMP trajectory are shown in Fig. 19. Table 9 gives the conditions and results for the actual ZMP positions of our humanoid walking robot on an ascending floor; the corresponding ZMP trajectories are shown in Fig. 20.
                        1st layer   2nd layer   3rd layer
Maximum generations     40          60          80
Population size (w)     40 (30)     100 (80)    160
String length           13          35          85
Crossover rate (Pc)     0.85
Mutation rate (Pm)      0.05
Weighting factor        1
Type (order)            1~3
Table 7. Design parameters of evolutionary inductive self-organizing network.
w: the number of chosen nodes whose outputs are used as inputs to the next layer
Condition: flat floor (slope 0°)
Layer   MSE
1       9.676    18.566
2       7.641    13.397
3       6.865    10.377
Table 8. Condition and the corresponding MSE values for the actual ZMP position during a four-step motion of our humanoid robot.
5. Concluding remarks
This chapter has dealt with an advanced humanoid robot based on the evolutionary inductive self-organizing network. The humanoid robot is among the most versatile robots; it has almost the same mechanism as a human and is suited to moving in human environments. However, the dynamics involved are highly nonlinear, complex and unstable, so it is difficult to generate human-like walking motion. In this chapter we have employed the zero moment point as an important criterion for the balance of walking robots. In addition, the evolutionary inductive self-organizing network has been utilized to establish empirical relationships between the walking humanoid robot and the ground and to explain empirical laws by incorporating them into the humanoid robot. From the natural walking motions obtained, we conclude that EISON can be applied effectively to the walking robot and that there is a synergy between the humanoid robot and the evolutionary inductive self-organizing network.
6. Acknowledgements
The authors gratefully acknowledge the financial support of the Korea Science & Engineering Foundation. This work was supported by grant No. R01-2005-000-11044-0 from the Basic Research Program of the Korea Science & Engineering Foundation.
7. References
[1] Erbatur, K.; Okazaki, A.; Obiya, K.; Takahashi, T. & Kawamura, A. (2002) A study
on the zero moment point measurement for biped walking robots, 7th International
Workshop on Advanced Motion Control, pp. 431-436
[2] Vukobratovic, M.; Borovac, B.; Surla, D.; Stokic, D. (1990). Biped Locomotion, Springer
Verlag
[3] Takanishi, A.; Ishida, M.; Yamazaki, Y.; Kato,I. (1985). The realization of dynamic
walking robot WL-10RD, Proc. Int. Conf. on Advanced Robotics, pp. 459-466
[4] Hirai, K.; Hirose, M.; Haikawa, Y.; Takenaka, T. (1998). The development of Honda
humanoid robot, Proc. IEEE Int. Conf. on Robotics and Automation, pp. 1321-1326
[5] Park, J. H.; Rhee, Y. K. (1998). ZMP Trajectory Generation for Reduced Trunk
Motions of Biped Robots. Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems,
IROS ’98, pp. 90-95
[6] Park, J. H.; Cho, H. C. (2000). An On-line Trajectory Modifier for the Base Link of
Biped Robots to Enhance Locomotion stability, Proc. IEEE Int. Conf. on Robotics and
Automation, pp. 3353-3358
[7] Kim, D.; Kim, N.H.; Seo, S.J.; Park, G.T. (2005). Fuzzy Modeling of Zero Moment
Point Trajectory for a Biped Walking Robot, Lect. Notes Artif. Int., vol. 3214, pp. 716-
722 (Best Paper Award)
[8] Kim, Dongwon; Park, Gwi-Tae. (2006). Evolution of Inductive Self-organizing
Networks, Studies in Computational Intelligence Series: Volume 2: Engineering
Evolutionary Intelligent Systems, Springer
[9] Kim, D. W.; and Park, G. T. (2003). A Novel Design of Self-Organizing
Approximator Technique: An Evolutionary Approach, IEEE Intl. Conf. Syst., Man
Cybern. 2003, pp. 4643-4648 (Best Student Paper Competition Finalist)
[10] Takagi, H.; Hayashi, I. (1991). NN-driven fuzzy reasoning, Int. J. Approx. Reasoning,
Vol. 5, No. 3, pp. 191-212
[11] Sugeno, M.; Kang, G. T. (1988). Structure identification of fuzzy model, Fuzzy Sets
Syst., Vol. 28, pp. 15-33
[12] Kondo, T. (1986). Revised GMDH algorithm estimating degree of the complete
polynomial, Tran. Soc. Instrum. Control Eng., Vol. 22, No. 9, pp. 928-934 (in
Japanese)
[13] Wu, S.; Er, M.J.; Gao, Y. (2001). A Fast Approach for Automatic Generation of
Fuzzy Rules by Generalized Dynamic Fuzzy Neural Networks, IEEE Trans. Fuzzy
Syst., Vol. 9, No. 4, pp. 578-594
[14] Horikawa, S. I.; Furuhashi, T.; Uchikawa, Y. (1992). On Fuzzy modeling Using
Fuzzy Neural Networks with the Back-Propagation Algorithm, IEEE Trans. Neural
Netw., Vol. 3, No. 5, pp. 801-806
26
Balance-Keeping Control of Upright Standing in Biped Human Beings and its Application for Stability Assessment
1. Abstract
One of the most important tasks for a biped robot is balance-keeping control. A natural question is how human beings, the only habitually biped-walking primates, keep their balance in upright standing, an ability that took several million years of evolution to achieve. Studies on the human balance-keeping mechanism are not only the work of physiologists but also a task for robot engineers, since the bio-mimetic approach is a shortcut for developing humanoid robots. This chapter introduces some research progress on balance-keeping control in upright standing. We first introduce the physical characteristics of the human body, then model the physical system of the body and establish a balance-keeping control model, and finally apply the balance-keeping ability assessment to fall-risk prediction. We hope these studies make contributions to robotics.
2. Introduction
Scientists' interest and fascination in balance control and movement has a long history. Leonardo da Vinci emphasized the importance of mechanics in the understanding of movement and balance, writing that "Mechanical science is the noblest and above all the most useful, seeing that by means of it all animated bodies which have movement perform all their actions"[1]. In 1685, by using a balance board, the Italian mathematician Johannes Alphonsus Borelli, in his "De motu animalium", determined the position of the centre of gravity as being 57% of the height of the body taken from the feet in the erect position. Those early studies on human body mechanics, determining the inertial properties of different body segments, serve an important and necessary role in modern biomechanics.
Unlike the static upright stance of a biped robot, upright standing in humans is a high-skill task that needs precise control. In 1862 Vierordt, and later Mosso (1884), demonstrated that normal standing is not a static posture but rather a continuous movement. Kelso and Hellebrandt in 1937 introduced a balance platform to obtain graphic recordings of the involuntary postural sway of man in the upright stance and to locate the centre of weight with respect to the feet as a function of time. Using a force analysis platform and an accelerometer,
Whitney[2] (1958) concluded that the movement of the centre of foot pressure must exaggerate the accompanying movement of the centre of gravity of the body mass. Based on those studies, postural sway is regarded as being present at the hip level, which suggests that the trunk rotates relative to the limbs during standing.
On the other hand, the clinical significance of postural sway was first observed by Babinski in 1899. He noted that a patient with the disorder termed "asynergie cerebelleuse" could not extend his trunk from a standing position without falling backwards. He concluded that the ankle must perform a compensatory movement to prevent the subject from falling backwards. Thus, Babinski was one of the first to recognize the presence of an active postural control of muscular tone and its importance in balance control and in voluntary movements.
Modern theory on postural control was established on the work of Magnus[3] and De Kleijn[4]; they proposed that each animal species has a reference posture or stance, which is genetically determined. According to this view, postural control and its adaptation to the environment are based on the background postural tone and on the postural reflexes or reactions, which are considered to originate from inputs from the visual[5], vestibular[6] and proprioceptive[7] systems. The gravity vector is considered to serve as a reference frame. The central nervous system needs to accomplish two tasks related to the force of gravity. First, it must activate extensor muscles to act against the gravity vector, creating postural tone. Second, it must stabilize the body's centre of gravity with respect to the ground.
Many studies[8-10] on balance-keeping control have been published in the last two decades, and many theoretical control models have been proposed to elucidate the body sway phenomenon during upright standing. Among them, the inverted pendulum is the most popular model for body sway analysis. However, those studies have seldom considered an individual's physical condition, the one-link inverted pendulum dominates, and a practicable control model for assessing humans' balance-keeping ability is still unavailable. For this reason, we proposed a PID model of upright standing, with which a realistic body sway was simulated. This chapter presents some recent research results of our laboratory on balance-keeping control and emphasizes its application to fall-risk prediction in fall-prone subjects.
Fig. 1. Scheme of the body sway-measuring device. The device consists of three high-resolution CCD cameras (Cam1, Cam2 and Cam3) and a whiteboard. A white ball with a diameter of 1.0 cm was placed on the whiteboard at the subject's eye level. During the measurement, subjects were asked to fix their eyes on the ball. Postural sway analysis was executed on one desktop PC, and an EMG-recording PC was also installed.
In this study, body sway in the lateral direction was recorded and the body was regarded as a two-link inverted pendulum. The first link rotates about the ankle and the second link about the lumbosacral joint. The angular sway of the first link was measured as the average of the two legs, and the angular sway of the second link was calculated indirectly as shown in Fig. 2, where point O is the ankle joint and point A is the lumbosacral joint. P1 represents the marker on the legs and P2 the marker on the subject's back. We have

x_1 = l_1 sin θ_1,    (1)
x_2 = L_1 sin θ_1 + l_2 sin(θ_1 + θ_2).    (2)

Because the angular deviation range is relatively small (usually < 2.0 degrees), we approximate sin θ_1 = θ_1 and sin(θ_1 + θ_2) = θ_1 + θ_2, and θ_2 can be calculated as equation (3),

θ_2 = ( x_2/(L_1 + l_2) − x_1/l_1 ) (1 + L_1/l_2).    (3)

Here, the height of the lumbosacral joint, L_1, is defined as the distance from the floor to the 5th lumbar vertebra. The entire calculation can be carried out online. Subjects kept a static upright stance on the force plate with eyes open or eyes closed. To minimize psychological influence, subjects were asked to count silently while performing the standing task.
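The following Python sketch (our own illustration, with assumed variable names) implements the small-angle reconstruction of equations (1)-(3):

def sway_angles(x1, x2, l1, l2, L1):
    """Recover ankle and lumbosacral sway angles from lateral marker
    displacements, using the small-angle approximation (sway < ~2 deg).

    x1, x2 : lateral displacements of the leg marker P1 and back marker P2 (m)
    l1     : distance from the ankle O to marker P1 (m)
    L1     : height of the lumbosacral joint A above the floor (m)
    l2     : distance from A to marker P2 (m)
    """
    theta1 = x1 / l1                                      # from Eq. (1)
    theta2 = (x2 / (L1 + l2) - x1 / l1) * (1 + L1 / l2)   # Eq. (3)
    return theta1, theta2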
Fig. 2. Relation of the sway angles during static upright stance. Points P1 and P2 are the positions of the markers; their coordinates are obtained from the CCD video camera measuring system. Points O and A represent the ankle joint and the lumbosacral joint, respectively. θ_1 is the angular sway of the ankle joint and θ_2 is the angular sway of the lumbosacral joint. The postural sway is considered only in the coronal direction.
A study of eight healthy young subjects[13] showed that the body sway ranges were 0.94±0.36 degrees (eyes open) and 1.35±0.52 degrees (eyes closed) for the ankle, and 0.99±0.41 degrees (eyes open) and 1.27±0.72 degrees (eyes closed) for the lumbosacral joint, respectively. No significant difference existed between the ankle and the lumbosacral joint: the two sway by approximately the same amount during static upright stance. Furthermore, Fourier transforms of the sway time series showed that the phase difference between ankle and lumbosacral sway was approximately equal to π, i.e. the ankle sways in the opposite direction to the lumbosacral joint. These results help in establishing a structural inverted pendulum model.
We also analyzed the relationship between trunk sway and the deviation of the COP[12]. Setting y(t) as the deviation of the COP and u(t) as the trunk sway, an approximate equation can be obtained, y(t) ≈ k u(t) + h ü(t), where k and h are constants. The result proves that the COP deviation is different from the trunk sway, and that the body's inertia does add to the COP deviation, as Whitney pointed out[2].
V_com = ( Σ_i m_i g V_mi ) / ( Σ_i m_i g ).    (4)

When V_com > d, upright standing is impossible and a fall may happen. In other words, upright stance is possible only when

V_com ≤ d.    (5)

Because the angular sway range at the ankle is approximately equal to that at the lumbosacral joint, and their phase difference is π, the relationship between the sway angles of the ankle and the lumbosacral joint can be written as (Fig. 3c)

θ_1 = −θ_2.    (6)

Equation (6) also indicates that the trunk of the body always remains perpendicular to the horizon. Based on this fact, the structural model of the body can be simplified as shown in Fig. 4.
In this model, PM and GM are the actuators that keep the inverted pendulum balanced. Intuitively, the right GM and the left PM should activate simultaneously, acting as agonists against the left GM and the right PM. The dynamics can be derived with the Lagrangian equation as equation (7).
Fig. 3. The pelvic anatomical structure and the corresponding structural model during static upright stance. The pelvis is formed by four fused bones, the sacrum, ilium, pubis and ischium, which are linked to the 5th lumbar vertebra and the femur through the lumbosacral and hip joints (a). The femur includes a head that articulates with the ilium, and a greater trochanter and a lesser trochanter where the GM and PM terminate (b). The structural model of the pelvis is shown schematically in (c). f_1 and f_2 are the forces produced by the PM, and f_3 and f_4 are produced by the GM. m_1, m_2 and m_3 represent the masses of the lower leg, the pelvis and the upper body, respectively. θ_1 is the sway angle of the leg and θ_2 is the sway angle of the upper trunk.
Fig. 4. A simplified pelvic structural model during static upright stance. A and B are the masses of the legs, C is the mass of the pelvis and D is the mass of the upper trunk. Because the lumbosacral joint always sways in the opposite direction to the ankle joint with the same value of θ, the upper trunk is kept perpendicular to the horizon.
Fig. 5. COP deviation and surface EMG recordings from TA, PN and TS when standing on one foot. The COPx (coronal section) is closely related to TA and PN contraction.
Fig. 6. The model of COP control when standing on one foot. The scheme shows how the force under the three points is controlled by muscular contraction. Supposing the forces acting on the three points are f_a, f_b and f_c, then the muscles bearing the corresponding responsibility are PN, TA and TS, respectively. The three muscles are innervated by neurons located at the L4-S2 level.
Nervous commands from the cerebral cortex reach the alpha motor neurons of the spinal anterior column to control the muscles around the ankle joint and keep the body balanced. This kind of balance-keeping control is the ankle strategy of balance control. In contrast, when nervous commands from the cerebrum control the muscles around the pelvis joints to keep the body balanced, we define this as the pelvis strategy. It seems that the balance-keeping strategy is transformed from ankle to pelvis when the standing posture changes from one-foot to two-feet upright standing.
Generally, the contact area between the feet and the ground plays a key role in static upright standing: the larger the contact area of the feet, the more stable the posture will be[17]. The contact surface is a polygon whose vertices are the contact points that receive the ground reaction force; the area of this polygon is defined as the effective contact area (Fig. 7). What matters is the physical stability of the inverted body, which is determined by the height of the COM and the effective contact area of the polygon. Defining the effective contact area as s, the body mass as m and the height of the COM as h, C = mgh/s is a constraint value for upright standing ability. In human beings, when standing on one foot, the value of s is approximately 3000 cm2, so the C value reaches about 10,000 Newton, which is far larger than that of any current biped robot.
Fig. 7. Scheme showing the effective contact area of support when standing in a static upright stance. The area changes with the inter-foot distance (IFD).
Physiologically, at least three sensory organs contribute to balance-keeping, i.e. the eyes[5], the vestibular organ[6] and the proprioceptors[7] of the muscles. These sensors detect the gravity vector or the body's kinetic state and send information to the central nervous system (CNS). The actuators are the muscles attached to the various body parts, which produce torques. As mentioned above, the two pairs of pelvic muscles, GM and PM, are especially important in balance-keeping control (Fig. 3a). The CNS works as a controller which drives the muscles of the whole body to produce the torque needed for balance-keeping. Up to now, the mechanism of body balance control has not been clearly clarified.
The body sway can be viewed as an output of the brain's body-control process, which provides an interesting clue for investigating the inner balance-keeping control mechanism. Using a classical control-theoretic approach, a multi-link model of the human body in the coronal section, as shown in Figure 3, has been constructed. A PID controller is used to model the body's balance-keeping controller in upright standing. The dynamics of the standing body are expressed by equation (7); the controller is

e(t) = r(t) − θ(t)   (r(t) = 0),    (8)

τ(t) = K_P e(t − t_d) + K_D de(t − t_d)/dt + K_I ∫_0^t e(t − t_d) dt,    (9)

where t_d is the time lag and K_P, K_D and K_I are the unknown gains of the PID control. Combining Eq. (7) and Eq. (9) and setting sin θ(t) = θ(t) yields

[2m_1 l_g^2 + (m_2 + m_3) l_1^2] θ̈(t) − [2m_1 l_g + (m_2 + m_3) l_1] g θ(t)
    = K_P e(t − t_d) + K_D de(t − t_d)/dt + K_I ∫_0^t e(t − t_d) dt.    (10)
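To illustrate how Eq. (10) can be simulated, the following Python sketch integrates the delayed PID loop forward in time. The coefficient names I and G (the two bracketed terms of Eq. (10)), the additive torque noise and the explicit Euler scheme are our own assumptions, not part of the chapter.

import numpy as np

def simulate_sway(Kp, Kd, Ki, td, I, G, T=60.0, dt=0.005, noise_std=0.3):
    """Forward-Euler simulation of Eqs. (8)-(10):
    I*theta_dd(t) - G*g*theta(t) = tau(t), with tau from the delayed PID law.

    I, G      : the two bracketed coefficients of Eq. (10)
    noise_std : std of an additive torque noise (Nm), used here only to excite
                a spontaneous-sway-like response.
    """
    g = 9.81
    n = int(T / dt)
    delay = max(int(round(td / dt)), 1)
    theta = np.zeros(n)
    omega, integ = 0.0, 0.0
    rng = np.random.default_rng(0)
    for k in range(1, n):
        e_now = -theta[max(k - delay, 0)]            # e(t - td), with r(t) = 0
        e_prev = -theta[max(k - delay - 1, 0)]
        integ += e_now * dt
        tau = Kp * e_now + Kd * (e_now - e_prev) / dt + Ki * integ
        tau += rng.normal(0.0, noise_std)
        theta_dd = (G * g * theta[k - 1] + tau) / I  # rearranged Eq. (10)
        omega += theta_dd * dt
        theta[k] = theta[k - 1] + omega * dt
    return theta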
Fig. 8. A block diagram of humans' static upright stance control. Because the aim of standing is to keep the body upright, the reference value of the postural sway angle is set to zero. The central nervous system detects the error signal and sends an output signal to the muscles to keep the body upright, in a state of equilibrium.
One task is to estimate the parameters of the PID model from the measurement of body sway. This is a typical closed-loop identification problem in which controller parameters, rather than plant parameters, are estimated. Here we take a modified least-squares method for PID parameter estimation, since a time-lag parameter t_d is included in Eq. (10). The parameters K_P, K_D, K_I and t_d can be identified based on the observation of θ. We use a linear regression scheme by rewriting Eq. (10) as

y(t) = K_P u_1(t − t_d) + K_D u_2(t − t_d) + K_I u_3(t − t_d),    (11)

where
y(t) = [2m_1 l_g^2 + (m_2 + m_3) l_1^2] θ̈(t) − [2m_1 l_g + (m_2 + m_3) l_1] g θ(t),
u_1(t − t_d) = e(t − t_d),
u_2(t − t_d) = de(t − t_d)/dt,
u_3(t − t_d) = ∫_0^t e(t − t_d) dt.
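A minimal Python sketch of this closed-loop identification is given below. The combination of ordinary least squares with a grid search over the time lag t_d, and all variable names, are our own assumptions rather than the chapter's exact procedure.

import numpy as np

def identify_pid_gains(theta, dt, I, G, td_grid):
    """Estimate (Kp, Kd, Ki, td) from a measured sway time series via Eq. (11).

    theta   : measured sway angle time series (rad)
    dt      : sampling interval (s)
    I, G    : the bracketed inertia and gravity coefficients of Eq. (10)
    td_grid : candidate time lags (s) to search over
    """
    e = -theta                                     # e = r - theta, with r = 0
    theta_dd = np.gradient(np.gradient(theta, dt), dt)
    y = I * theta_dd - G * 9.81 * theta            # left-hand side of Eq. (11)

    best = None
    for td in td_grid:
        k = int(round(td / dt))                    # delay in samples
        if k >= len(theta) - 2:
            continue
        e_d = e[:len(e) - k]                       # e(t - td), aligned with y[k:]
        A = np.column_stack([e_d, np.gradient(e_d, dt), np.cumsum(e_d) * dt])
        b = y[k:]
        gains, *_ = np.linalg.lstsq(A, b, rcond=None)
        cost = np.sum((A @ gains - b) ** 2)
        if best is None or cost < best[0]:
            best = (cost, gains, td)
    _, (Kp, Kd, Ki), td = best
    return Kp, Kd, Ki, td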
The frequency response can then be calculated; the gain T(jω) becomes a function of ω and t_d, as shown in Figure 11. The peaks of the gain that appear under both the eyes-open and the eyes-closed conditions are consistent with the experimental recordings. The results suggest that the PID model is reasonable and useful for balance-keeping control analysis.
Fig. 9. Simulated and experimentally recorded body sway with eyes open: time record (a) and frequency response (b). The subject is a 37-year-old male; the body height is 1.65 m and the body weight is 65 kg. K_P, K_D, K_I and t_d estimated from the experimental data are 519.0 Nm/rad, 72.3 Nms/rad, 3.0 Nm/(rad·s) and 0.11 s, respectively. The frequency response of the simulation is similar to the experimental recording.
Fig. 10. When the sensory noise u = sin ωt is considered as the input, the PID model can be simplified into three processes, i.e. a time-lag process, the controller G1 and the dynamics G2.
Fig. 11. The frequency response of the balance-keeping control system simulated for the same subject. The gain is a function of ω and t_d. The peaks of the gain with eyes open (0.06 Hz) and eyes closed (0.16 Hz) are different, which is roughly consistent with the experimental data, namely a peak of 0.02 Hz with eyes open and of 0.12 Hz with eyes closed.
result of the COP examination, are still not under specific consideration, even though many models have been proposed[28, 29]. We introduced an index of averaged angular velocity (TSS) for stability estimation[30], and TSS was proven to be a useful index. Based on a study of 68 subjects, we found that increased body weight and height tended to decrease body sway, whereas aging increased it. The reciprocal of TSS, defined as the trunk sway time (TST), kept a close correspondence with the COP velocity[30]. To reduce the influence of physical characteristics, especially height, on the measuring results, an empirical mathematical model was introduced as

TST_s = k × (m × l^a) / (θ + B),    (18)

where m is body weight, l is height, k, θ and a are constants, and B is defined as (y/21 − 1)^2, with y the age in years. This model fits the recorded data quite well.
Another interesting result was obtained from a study of the aging effect on the PID controller. Parameter identification showed that K_D decreases with aging[31] (Fig. 12). This result suggests a promising method for assessing an individual's balance-keeping ability in the future.
Fig. 12. K_D values taken from 102 healthy elderly subjects (average age 64.1±16.3 years). K_D is normalized as K_dI = K_D/(mgh) (m: body weight, g: gravitational acceleration, h: body height). The result shows that K_D decreases with aging, just like the equilibrium ability.
The consequences of falls may be serious at any age, but for the elderly they have a significance beyond that in younger people. These consequences can be direct physical effects such as injury or death, or less direct effects of increased dependency and impaired self-care, leading to demands on carers, or profound psychological effects. It is known that postural stability is related to the incidence of falls in the elderly: individuals with poor postural stability show a high fall incidence. Defining the measured trunk sway time as TST_r, the value R_f = TST_r / TST_s represents the individual's postural stability. In a study of 51 older healthy subjects[32], the relationship between an individual's R_f value and the incidence of falls was investigated. Regression analyses showed that the value of R_f and the incidence of falls have an inverse relationship; when R_f ≤ 1.12, the fall incidence was significantly higher than when R_f > 1.12. The results suggest that falls are predictable by R_f assessment, and this procedure appears to be a useful means of fall prevention, especially in elderly people.
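To make the assessment procedure concrete, here is a small Python sketch combining Eq. (18) with the R_f index. The constants k, θ and a are placeholders, not the fitted values of the study; only the 1.12 threshold is taken from the 51-subject result above.

def fall_risk_index(tst_measured, mass, height, age, k=1.0, theta=1.0, a=1.0):
    """Hypothetical sketch of the R_f fall-risk assessment.

    tst_measured : measured trunk sway time TST_r of the subject (s)
    mass, height, age : the subject's physical characteristics used in Eq. (18)
    k, theta, a  : empirical constants of Eq. (18) (placeholder values here)
    """
    B = (age / 21.0 - 1.0) ** 2
    tst_standard = k * (mass * height ** a) / (theta + B)   # Eq. (18)
    r_f = tst_measured / tst_standard
    return r_f, (r_f <= 1.12)   # True flags an elevated fall risk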
Fig. 13. A shows a simplified biomechanical structural model of the body during static upright stance: two of the indicated masses are the masses of the legs, one is the mass of the pelvis and one is the mass of the upper trunk. Because the lumbosacral joint always sways in the opposite direction to the ankle joint with the same value of θ, the upper trunk is kept perpendicular to the horizon. B is the block diagram of humans' static upright stance control. Because the aim of static upright standing is to keep the body upright, the reference value of the postural sway angle is set to zero. The central nervous system detects the error signal and sends an output signal to the muscles to keep the body upright, in a state of equilibrium. C shows a simplified block diagram of upright stance control. Here, the sensory noise (sin(ωt)) is regarded as an input, and the output y is the deviation of the body's centre of mass. G1 is the transfer function of the PID controller and G2 is the transfer function of the inverted pendulum model.
The results showed that the gain of the PID parameter K_D decreases significantly with eyes closed (131.5±37.6 Nms/rad with eyes open and 90.4±26.0 Nms/rad with eyes closed, p<0.001, Fig. 14), whereas K_P and K_I are unchanged. Simulation results also showed that when the gain K_D is decreased, the simulated locus is more like the spontaneous body sway measured with eyes closed. These results suggest that the environmental visual cue is important for balance-keeping control, and that this effect is pattern-dependent. Of course, the angular velocity also increases when the eyes are closed (Fig. 15).
Two hypotheses have been proposed for the mechanism of the visual effect on balance-keeping. One regards the information coming from the proprioception of the extra-ocular muscles as important for balance-keeping control[34]. The other, the "retinal slip" theory, holds that the slip of images on the retina is used as a cue for balance-keeping control[35]. Our present studies agree with the retinal slip hypothesis.
Fig. 15. Averaged values of the TSS under different visual stimulations increase from A to D; however, there is no significant difference between A and B. ***: p < 0.001.
9. Summary
Upright standing is a simple and basic posture of human beings. However, the relatively large mass of the upper body and its elevated position in relation to the area of support during standing accentuate the importance of an accurate control of trunk movements for the maintenance of equilibrium. The kinematics and the control strategy of the central nervous system have been studied over recent decades, which led to a PID control algorithm for modelling balance-keeping in upright standing and successfully interpreted the phenomenon of spontaneous body sway. Modelling the human body as a two-link inverted pendulum system, we successfully identified an individual's PID parameters and made this model analyzable and practicable. The simulation results, both for the body sway and for the spectral response, are quite consistent with the experimental data. This shows that the PID model is a reasonable and useful method, as is the measurement of the averaged angular velocity. Both methods help in fall prediction and are promising tools for fall prevention.
Many authors have argued that complex architectures including feedforward/feedback are necessary for the maintenance of upright stance; however, our studies, together with some other recent studies, have shown that a model based primarily on a simple feedback mechanism with a 120-ms to 150-ms time delay can account for postural control during a broad variety of disturbances[36]. Another interesting result is that K_D is a key parameter related to individual balance-keeping ability, since K_D is not only influenced by visual cues but also sensitive to aging. It seems that human balance-keeping ability is mainly determined by the regulation of the K_D gain, and there is still much work to be done in the future.
10. References
1) Williams, M. and Lissner, H.R. Biomechanics of human motion. W.B. Saunders
Company, Philadelphia, 1977.
2) Whitney, R.J. The strength of the lifting action in man. Ergonomics 1, 101-128, 1958.
3) Magnus, R. Korperstellung. Berlin: Springer, 1924.
4) De Kleijn, A., Experimental physiology of the labyrinth. J Laryngol. & Otol., 38,
646-663, 1923.
5) Redfern, M.S., Yardley, L. and Bronstein, A.M.. Visual influence on balance, J
Anxiety Disord., 15, 81-94, 2001.
6) Fitzpatrick, R.C. and Day, B.L.. Probing the human vestibular system with galvanic
stimulation. J Appl Physiol., 96, 2301-2306, 2004.
7) Maurer, C. and Peterka, R. J.. Multisensory control of human upright stance. Exp
Brain Res., 171, 231-250, 2005.
8) Geurts, A.C. Haart, M. van Nes I.J. and Duysens, J.. A review of standing balance
recovery from stroke. Gait Posture, 22, 267-281, 2004.
9) Morasso, P.G. and Schieppati, M. Can muscle stiffness alone stabilize upright
standing? J Neurophysiol. 82, 1622-1626, 1999.
10) Bloem, B.R., Van Dijk, J.G., and Beckley, D.J., et al. Altered postural reflexes in
Parkinson's disease: a reverse hypothesis. Med Hypotheses 39, 243-247, 1992.
11) Ji, Z., Findley, T., Chaudhry H. and Bukiet, B.. Computational method to evaluate ankle
postural stiffness with ground reaction forces. J Rehabil Res Dev. 41, 207-214, 2004.
12) Jiang, Y., Nagasaki, S. and Kimura, H.. The Relation Between Trunk Sway and the
Motion of Centre of Pressure During Quiet Stance. Jpn. J. Phys. Fit. Sport Med., 52,
533-542, 2003.
13) Jiang, Y., Nagasaki, S., Matsuura Y. and Zhou, J.. Dynamic studies on human body
sway by using a simple model with special concerns on the pelvic and muscle
Roles. Asian J. Control, 8, 297-306, 2006.
14) Massion, J., Postural Control Systems in Developmental Perspective. Neurosci.
Biobehav. Rev., 22, 465-472, 1998.
15) Krishnamoorthy, V., Latash, M.L., Scholz, J.P. and Zatsiorsky, V.M.. Muscle
Synergies During Shifts of the Center of Pressure by Standing Persons. Exp. Brain
Res., 152, 281-292, 2003.
16) Jiang, Y., Nagasaki, S. and Matsuura Y. et al.. The role of ankle joint in the control
of postural stability during upright standing on one-foot. Jpn. J Educ. Health. Sci.,49,
277-284, 2004.
17) Cote K. P., Brunet M. E., Gansneder B.M. and Shultz S.J.. Effects of Pronated and
Supinated Foot Postures on Static and Dynamic Postural Stability. J Athl Train. 40,
41-46, 2005.
18) Masani K., Vette A.H., Popovic M.R.. Controlling balance during quiet
standing: proportional and derivative controller generates preceding motor
command to body sway position observed in experiments. Gait Posture. 23,
164-172, 2006.
19) Mergner T., Maurer C. and Peterka R.J.. A multisensory posture control model of
human upright stance. Prog. Brain Res. 142, 189-201, 2003.
20) Dickin D.C., Brown L.A. and Doan J.B.. Age-dependent differences in the time
course of postural control during sensory perturbations. Aging Clin. Exp. Res. 18,
94-99, 2006.
21) Imaoka K., Murase H. and Fukuhara M., Collection of data for healthy subjects in
stabilometry. Equilibrium Res. Suppl., 12, 1-84, 1997.
22) Chaudhry H., Findley T., Quigley K.S. et al.. Postural stability index is a more
valid measure of stability than equilibrium score. J Rehabil. Res. Dev. 42, 547-556,
2005.
23) Tinetti M.E., Williams T. F. and Mayewski R., Fall risk index for elderly patients
based on number of chronic disabilities. Am. J. Med., 80, 429-434, 1986.
24) Paper S.A. and Soames R. W.. The influence of stationary auditory fields on
postural sway behaviour in man. Eur. J. Appl. Physiol. Occup. Physiol., 63, 363-367,
1991.
25) Horstmann G. A and Dietz V.. A basic posture control mechanism: the
stabilization of the centre of gravity. Electroencephalogr. Clin. Neurophysiol.,
76, 165-176, 1990.
26) Berger W., Trippel M., Discher M. and Dietz V., Influence of subjects’ height on the
stabilization of posture. Acta Otolaryngol., 112, 22-30, 1992.
27) Kimura M., Okuno T., Okayama Y. and Tanaka Y.. Characteristics of the ability of
the elderly to maintain posture. Rep. Res. Cent. Phys. Ed., 26, 103-114, 1998.
28) Giese M.A., Dijkstra T.M., Schoner G. and Gielen C.C.. Identification of the
nonlinear state-space dynamics of the action-perception cycle for visually induced
postural sway. Biol. Cybern. 74, 427-437, 1996.
29) Jeka J., Oie K., Schoner G., Dijkstra T. and Henson E. Position and velocity
coupling of postural sway to somatosensory drive. J Neurophysiol. 79, 1661-1674,
1998.
30) Jiang, Y., Nagasaki, S., Matsuura Y. et al.. Postural sway depends on aging and
physique during upright standing in normals. Jpn. J Educ. Health. Sci.,48, 233-238,
2002.
31) Kimura Hidenori, Yifa Jiang, A PID model of human balance keeping. IEEE Control
System Magazine, 26(6), 18-23, 2006.
32) Nagasaki, S., Jiang, Y., Yoshinori F. et al. Falls risk prediction in old women:
evaluated by trunk sway tests in static upright stance. Jpn. J Educ. Health. Sci., 48,
353-358, 2003.
33) Qin S., Nagasaki S., Jiang Y., et al. Body sway control and visual influence during
quiet upright standing. Jpn. J Physic. Fitness & Sports Medicine, 55, 469-476,
2006.
34) Baron J.B., Ushio N. and Tangapregassom M.J. Orthostatic postural activity
disorders recorded by statokinesimeter in post-concussional syndromes:
oculomotor aspect. Clin. Otolaryngol. Allied Sci., 4, 169-174, 1979.
35) Glasauer S., Schneider E., Jahn K., Strupp M., and Brandt T. How the eyes move the
body. Neurology. 65, 1291-1293, 2005.
36) Maurer C. and Peterka R.J. A new interpretation of spontaneous sway measures
based on a simple model of human postural control. J Neurophysiol., 93, 189-200,
2005.
27
Experiments on Embodied Cognition: A Bio-Inspired Approach for Robust Biped Locomotion
1. Introduction
Recently, the psychological point of view that grants the body a more significant role in
cognition has also gained attention in artificial intelligence. Proponents of this approach
would claim that instead of a ‘mind that works on abstract problems’ we have to deal with
and understand ‘a body that needs a mind to make it function’ (Wilson, 2002). These ideas
differ quite radically from the traditional approach that describes a cognitive process as an
abstract information processing task where the real physical connections to the outside
world are of only sub-critical importance, sometimes discarded as mere ‘informational
encapsulated plug-ins’ (Fodor, 1983). Thus most theories in cognitive psychology have tried
to describe the process of human thinking in terms of propositional knowledge. At the same
time, artificial intelligence research has been dominated by methods of abstract symbolic
processing, even if researchers often used robotic systems to implement them (Nilsson,
1984).
Ignoring sensor-motor influences on cognitive ability is in sharp contrast to research by
William James (James, 1890) and others (see (Prinz, 1987) for a review) that describe theories
of cognition based on motor acts, or a theory of cognitive function emerging from seminal
research on sensor-motor abilities by Jean Piaget (Wilson, 2002) and the theory of
affordances by (Gibson, 1977). In the 1980s the linguist Lakoff and the philosopher Johnson
(Lakoff & Johnson, 1980) put forward the idea of abstract concepts based on metaphors for
bodily, physical concepts; around the same time, Brooks (Brooks, 1986) made a major
impact on artificial intelligence research by his concepts of behavior based robotics and
interaction with the environment without internal representation instead of the sense-
reason-act cycle. This approach has gained wide attention ever since and there appears to be
a growing sense of commitment to the idea that cognitive ability in a system (natural or
artificial) has to be studied in the context of its relation to a ‘kinematically competent’
physical body.
Among the most competent (in a multi functional sense) physical bodies around are
certainly humans, so the study of humanoid robots appears to be a promising field for
resources to achieve a complete human-like looking robot, pushing forward the research in many areas. The current research model is 130 cm tall, weighs 54 kg and is able to run at 6 km/h (December 2005). The research began in 1986, achieving a first ASIMO prototype in the year 2000. Nowadays, ASIMO is the only robot that is able to autonomously walk around and climb stairs and slopes. Furthermore, it is able to understand some human gestures and interact with people using its speech recognition system and some pre-programmed messages. ASIMO can also push a cart, keeping a fixed distance to it while moving and still maintaining the capability to change direction or speed of movement, walk hand-in-hand with a person or carry a tray.
Fig. 1. Honda ASIMO (left picture), Sony QRIO (middle picture), and ROBONAUT (right
picture).
The HRP-2P (Kaneko, 2003), a robot specified by AIST (National Institute of Advanced Industrial Science and Technology, Japan) and whose hardware was manufactured by Kawada Industries (a company that also worked with Honda and the University of Tokyo in the development of the ASIMO and the H6-H7 robots), is one of the most advanced humanoids nowadays. It differs from ASIMO in that it is a research prototype whose software is open to any roboticist. Moreover, it was designed to walk on uneven terrain and to recover from fallen positions, features not yet possible for ASIMO. It weighs 58 kg and is 154 cm tall.
Probably the third robot in importance in Japan is the H7 (Kagami, 2001) from the University of Tokyo. However, not much information is available apart from videos showing its capabilities walking on flat terrain. As mentioned above, Kawada Inc. was responsible for the hardware development. It weighs 55 kg, is 147 cm tall and has 30 DoF.
Sony entered the humanoid world in 1997 with the SDR-1X series, achieving the SDR-4X version in 2003, named QRIO (Ishida, 2003) (cf. Fig. 1), as it was intended to become commercially available. In 2006 Sony announced the decision to stop further development of the robot. QRIO is comparable to ASIMO in its walking capabilities although, since it was designed as an entertainment robot, its size is substantially smaller than ASIMO's: it weighs 7 kg and is 58 cm tall. Its main features include the ability to adapt its walking to the most difficult situations: from walking on irregular or tiled terrain to reacting to shocks and possible falling conditions. But given its origins as an entertainment robot, the most remarkable features are those that enhance its interaction capabilities with people: the robot is able to recognise faces, use memory to remember a previously seen person or his/her words, detect the person who is speaking, and incorporates a vocabulary of more than 20,000 words that enables it to maintain simple dialogues with humans.
In Fujitsu's HOAP robots (Murase, 2001), oscillators are used to create rhythmic motions that generate the appropriate gait. The major claimed advantage is the adaptation to the environment and to new terrain configurations, together with the minimal computational effort needed to control the locomotion: no modelling of kinematics or dynamics and no generation of stable trajectories using complex criteria are required. It was intended to be used in research labs and universities as an educational tool for testing different algorithms, and for that reason it provides open-source software; it weighs only 6 kg and is 48 cm tall. It can walk at up to 2 km/h and is sold at about 50,000€. In 2004, HOAP-2 received the Technical Innovation Award from the Robotics Society of Japan.
Toyota also presented a series of partner robots (2005), one of them walking on two legs and finding application in elderly care and rehabilitation. As a curious feature, Toyota included artificial lips with human finesse which, together with their hands, enable them to play trumpets in a way similar to a human.
WABIAN-RIII and WENDY (Ogura, 2004) are the latest developments from Waseda University, as already mentioned the pioneers in the humanoid field with the first full-scale humanoid robot, a project that began in 1970 and finished three years later with Wabot-1. WABIAN-RII continues the research in dynamic walking, plus load carrying and the addition of emotional gestures while performing tasks. Likewise, WENDY incorporates emotional gestures into the manipulation task being carried out. WABIAN-RII weighs 130 kg and is 188 cm tall, while WENDY is 150 cm tall and weighs 170 kg.
Johnnie (Löffler, 2000) is probably the best-known and most advanced humanoid robot in Europe. It was developed at the University of Munich with the aim of realising human-like walking, in this case based on the well-established Zero Moment Point (ZMP) approach introduced by Honda in the ASIMO robot, but with the aid of a vision system. It is able to walk on different terrains and climb over some stairs. It is 180 cm tall and weighs 45 kg.
Robonaut (Ambrose, 2001) (cf. figure 1) is a humanoid robot developed by NASA with the aim of replacing a human astronaut in EVA tasks (outside the vehicle). The main feature of the robot is a human-like dexterous manipulation capability that enables it to perform the same tasks an astronaut would perform, and with the same dexterity. The robot is not autonomous but is tele-operated from inside the vehicle. Since legs have no utility in space, the robot is composed of two arms and a torso that is attached to a mechanical link enabling the positioning of the robot in any required position/orientation. Because of the bulky suits the astronauts have to wear to protect themselves against radiation, their manipulation capabilities are greatly reduced, and the handles, tools and interfaces they use are designed to be handled with their special gloves. A robot, even though needing some protection against radiation, would not require such a bulky suit, thus recovering to a certain extent human dexterity. Moreover, risks for astronauts on these missions outside the space vehicle are avoided. It has the size of a human torso and arms, with 54 DoF in total: 14 for each hand, 7 for each arm and the link to the vehicle, 2 in the neck and 3 in the waist.
In the field of human-robot interaction, the robot Cog (Brooks, 1998), from the MIT AI Lab, is the best example. Cog is composed of a torso, two arms and a head. The main focus of this project is to create a platform to test the ideas put forward by Rodney Brooks, who claims that the appearance of human-like intelligence requires a human-like body that interacts with the world in the same way a human does. Besides, for a robot to gain experience in interacting with people, it needs to interact with them in the same way people do. One underlying hope in Brooks' theory is that, by having a human-like shape, the robot will more easily be perceived by humans as one of them, and they will interact with it in the same way as with other people, providing the robot with valuable information for its interaction-learning process. These ideas have been taken to the next level with Kismet (Breazeal, 1998), also a robot from the same lab, in this case composed of a human-like head. The research focus is on natural communication with humans using, among others, facial expressions, body postures, gestures, gaze directions and voice. The robot interacts in an expressive face-to-face way with another person, showing its emotional state and learning from humans in a kind of parent-child relation.
Working in the same direction, the Intelligent Robotics Lab of the University of Osaka (Japan) developed in 2005 the most human-looking robot so far, a female robot named Repliee Q1Expo (Ishiguro, 2005). Its skin is made of silicone, which gives it a more natural appearance, and it integrates a vast number of sensors and actuators to be able to interact with people in a very natural and friendly way. Even the chest moves rhythmically to create the illusion that the robot is breathing.
However, as promising as some of these developments seem to be at first glance, one has to
carefully evaluate what exactly can be learned for the field of ‘embodied cognition’ from the
study of more or less isolated features of human behavior, whether that be in the field of
complex locomotion, manipulation or interaction. In her paper, Wilson (2002) identifies six viewpoints for the new so-called 'embodied cognition' approach:
1) Cognition is situated: all cognitive activity takes place in the context of a real-world
environment.
2) Cognition is time pressured: how does cognition work under the pressures of real-time
interaction with the environment?
3) Off-loading of cognitive work to the environment: the limits of our information-processing
capabilities demand off-loading.
4) The environment is part of the cognitive system: because of the dense and continuous
information flow between the mind and the environment, it is not meaningful to
study just the mind.
5) Cognition is for action: cognitive mechanisms (perception, memory, etc.) must be
understood in terms of their ultimate contribution to situation-appropriate behavior.
6) Off-line cognition is body-based: even when uncoupled from the environment, the
activity of the mind is grounded in mechanisms that evolved for interaction with
the environment.
We have cited all six viewpoints here, as they represent an interesting perspective on the state of the art in embodied cognition. In the experimental work presented here, we focus our attention on viewpoint 2, which has a crucial instantiation especially in humanoid robots, which have to find a way to effectively and efficiently counteract the effects of gravity while walking. In fact, in the light of viewpoints 3 and 4, counteracting appears to be wrong from the beginning; instead, having gravity work for the system appears to be a better way of achieving robust locomotion in a technical two-legged system.
solely on two simple biological mechanisms that are integrated in an architecture for low
level locomotion control.
Whether or not complex motion control can be achieved via reflex systems alone is subject to further discussion; however, the concept of a set of hard-wired reactions to sensory stimuli is of high interest to roboticists who aim to gain stability in a system's locomotion.
Fig. 3. The low-level control architecture. On the global level (light gray area) we have implemented Locomotion Behaviors (LBs), typically forward, backward and lateral locomotion. These global behaviors are connected to all local leg controllers and activate the local single-leg motion behaviors. The local level (dark gray area) implements Rhythmic Motion Behaviors (RMBs) and Postural Motion Behaviors (PMBs). These behaviors simultaneously influence the amplitude and frequency parameters of, in this case, three oscillating networks (OST, OSB and OSD). The oscillators are connected to a common clock which is used for local and global synchronization purposes. The oscillators' output is a rhythmic stimulation signal alternating between flexor and extensor (see callout box), which is translated into PWM signals via the motoric end path. In line with the output of the motoric end path is a set of perturbation-specific reflexes, which override the signals on the end path with precompiled activation signals if the sensor information from the physical joints meets a set of defined criteria.
The design of the control architecture described here was thus driven by these two concepts. The CPG approach appeared to be interesting for generating rhythmic walking patterns that can be implemented in a computationally efficient way, while the reflex-driven approach seemed to provide a simple way to stabilize these walking patterns by providing: 1) a set of fixed
4. Implementation Issues
Our implementation of the low-level control architecture, shown in figure 5, consists of a combination of drivers and behaviors, which are connected through special functions (merge functions). Our concept for locomotion is a combination of balance control (see figure 4) and posture behaviors, which should keep the robot balanced while walking or during external interference. Central Pattern Generator (CPG) behaviors are used to produce rhythmic motions for walking; the speed at which the rhythmic motions are performed is defined by a global clock.
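As an illustration of the clock-driven CPG idea (not the robot's actual controller code; the period, amplitude and merge behaviour below are assumptions), a minimal Python sketch of the rhythmic hip pattern of Fig. 6 could look as follows:

import numpy as np

def cpg_hip_angles(t_ms, pulse_ms=2000.0, amplitude_deg=15.0, offset_deg=0.0):
    """Rhythmic reference angles for the two hip forward joints.

    A global clock defines the pulse (period); the left and right legs are
    shifted by half a period (swing vs. stance), and a posture offset can be
    merged additively onto the pattern (cf. the 'add merge' used for Fig. 6).
    """
    phase = 2.0 * np.pi * t_ms / pulse_ms
    right = amplitude_deg * np.sin(phase) + offset_deg
    left = amplitude_deg * np.sin(phase + np.pi) + offset_deg  # half-period shift
    return right, left

# Example: a 10-degree offset between 5000 ms and 10000 ms reproduces the
# more ducked posture described for Fig. 6.
t = np.arange(0, 15000, 20)                                   # 20 ms control cycle
offset = np.where((t >= 5000) & (t < 10000), 10.0, 0.0)
right, left = cpg_hip_angles(t, offset_deg=offset)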
A higher-level behavior 'walking' has the task of implementing directional walking, and another high-level behavior 'ball-following' uses the sensory information of the CMUCam2 to track a ball by giving instructions to the walking behavior.
5. Experimental Results
To demonstrate a possible result of the activation pattern of a joint while overlaying different CPGs and modifying the posture, we first let the robot walk hanging in the air without any balancing behavior in order to obtain even curves. We could not let the robot walk on the ground without balance behaviors, because the system then braces and topples over.
Figure 6 shows the desired angles for both hip forward joints at a pulse of 2000 milliseconds. The curves of the right and left legs are shifted by half the period, because one leg is in the swing phase while the other is in the stance phase.
The rhythm from 0 ms to 15000 ms is generated only by the forward CPG; however, the offset value is set to 10 degrees from 5000 ms to 10000 ms and combined via an add merge, resulting in a more ducked posture while walking.
From 15000 ms to 21000 ms the forward and turn-left CPGs are active and mixed together with an average merge, which has the effect that the robot takes a moderate left curve.
In the time segment from 21000 ms to 27000 ms only the CPG for turning left is active, and after that there is only a basic posture value.
Fig. 6. Reference angles for the right and left hip forward joints while walking hanging in the air without balance behaviors.
As can be seen, the output of the PID controller shown in figure 8 is less noisy than the tilt sensor values and appears more rhythmic. The pattern repeats every 2000 milliseconds, which shows that the frequency of the disturbances and of the corrective action depends on the pulse.
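For illustration, a minimal incremental PID controller of the kind used by the balance behavior might look as follows in Python; the gains, the sampling time and the way the correction is merged onto the joint references are assumptions, not the values used on the real robot.

class BalancePID:
    """Drives a tilt-sensor reading towards zero; the returned correction is
    intended to be merged onto the hip/ankle reference angles."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, tilt, target=0.0):
        error = target - tilt
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example (placeholder gains): a 20 ms control cycle correcting rear-front tilt.
# pid = BalancePID(kp=0.8, ki=0.1, kd=0.05, dt=0.02)
# correction = pid.step(tilt=rear_front_tilt)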
Fig. 7. Values for rear-front and right-left tilt, while walking on the ground with active
balance behavior.
Fig. 8. Calculated error from the balance behavior’s PID-Controller while walking on the
ground with active balance behavior.
Figure 12 shows the desired and real angles of the left arm and leg joints resulting from the reaction of the balance behavior, averaged over the 5 recurrences. Figure 13 shows the perception of the hit by the tilt sensor, whose values are used as input for the balance behavior's PID controller, shown in figure 14.
Fig. 9. Desired and real angles (degrees) of the left leg joints while walking on the ground without active balance behavior over 20000 ms. The robot was prevented from toppling over by hand.
Fig. 10. Desired and real angles (degrees) of the left leg joints while walking on the ground with active balance behavior over 20000 ms.
Fig. 12. Desired and real angles (degrees) of the left arm and leg joints over 4000 ms as an average over 5 recurrences.
Fig. 13. Tilt values for rear-front and right-left pitch as an average over 5 recurrences.
Fig. 14. Calculated error from the balance behavior’s PID-controller as an average over 5
recurrences.
Fig. 15. An architecture integrating learning, representation and robust low-level control.
8. References
Albrecht, M., Backhaus, T., Planthaber, S., Stoeppeler, H., Spenneberg, D., & Kirchner, F.
(2005). AIMEE: A Four Legged Robot for RoboCup Rescue. In: Proceedings of
CLAWAR 2005. Springer.
Ambrose, Robert et al "The development of the Robonaut System for Space Operations",
ICAR 2001, Invited Session on Space Robotics, Budapest, August 20, 2001.
Breazeal, C. and Velasquez, J. “Toward Teaching a Robot ‘Infant’ using Emotive
Communication Acts” In Proceedings of 1998 Simulation of Adaptive Behavior,
workshop on Socially Situated Intelligence, Zurich Switzerland. 25-40.
Brooks, R. et al “The Cog Project: Building a Humanoid Robot” In Computation for
Metaphors, Analogy and Agents, Vol. 1562 of Springer Lecture Notes in Artificial
Intelligence, Springer-Verlag, 1998.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of
Robotics and Automation, 2
Fodor, J.A. (1983). The modularity of mind, Cambridge, MIT Press
Hirai, K. et al “The development of Honda Humanoid Robot” In Proc. of the 1998 IEEE Int.
Conf. on Robotics & Automation, p.1321-1326 (1998).
Ishida, T. "A small biped entertainment robot SDR-4X II" In Proceedings of the 2003 IEEE
International Symposium on Computational Intelligence in Robotics and
Automation, pp. 1046-1051, vol. 3, 2003
II-Woo Park "Mechanical Design of Humanoid Robot Platform KHR-3 (KAIST Humanoid
Robot-3: HUBO)", in Humanoids 2005, Japan
Ishiguro, H. and T. Minato, “Development of androids for studying on human-robot
interaction” In Proceedings of 36th International Symposium on Robotics, TH3H1,
Dec. 2005.
James, W. (1890). The Principles of Psychology. Henry Holt, New York.
Kirchner, F., Spenneberg, D., & Linnemann, R. (2002). A biologically inspired approach
towards robust real world locomotion in an 8-legged robot. In: J. Ayers, J. Davis, &
A. Rudolph (Eds.), Neurotechnology for Biomimetic Robots. MIT-Press, MA, USA.
Kaneko, K. et al“Humanoid Robot HRP-2” In Proceedings of the 2004 IEEE International
Conference on Robotics& Automation, 2004
Kagami, S. et al“Design and Implementation of Software Research Platform for Humanoid
Robotics: H6" In Proc. of International Conference on Robotics and Automation
(ICRA'01), pp. 2431--2436 , 2001
Lakoff, G., & Johnson, M. (1980). Metaphors we Live by, University of Chicago Press.
Löffler, K. Gienger, M. “Control of a Biped Jogging Robot” In Proceedings of the 6th
International Workshop on Advanced Motion Control, Nagoya, Japan, 307-323, 2000.
Murase, Y. et al “Design of a Compact Humanoid Robot as a Platform” In 19th Conf. of
Robotics Society of Japan, p.789-790 (2001)
Nilsson, N.J. (1984). Shakey the robot, SRI Technical Note, no. 323, SRI, Menlo Park, CA, USA.
Ogura, Y. et al“Development of a Human-like Walking Robot Having Two 7-DOF Legs and
a 2-DOF Waist” In Proceedings of the 2004 IEEE International Conference on
Robotics and Automation, pp134-139, 2004
Prinz, W. (1987). Ideo-motor action. In Heuer, H., & Sanders, A.F. (Eds.) Perspectives on
perception and action, p. 47-76, Lawrence Erlbaum Associates, Hillsdale.
Shan, J. Fumio Nagashima (2002). Neural Locomotion Controller Design and
Implementation for Humanoid Robot HOAP-1. In: Proceedings of The 20th Annual
Conference of the Robotics Society of Japan 2002.
Spenneberg, D. (2005). A Hybrid Locomotion Control Approach. In: Proceedings of the
CLAWAR 2005 Conference.
Spenneberg, D., Albrecht, M., & Backhaus, T. (2005). M.O.N.S.T.E.R.: A new Behavior Based
Microkernel for Mobile Robots. In: Proceedings of the ECMR 2005.
Spenneberg, D., Hilljegerdes, J., Strack, A., Zschenker, H., Albrecht, M., Backhaus, T., &
Kirchner, F. (2005). ARAMIES: A Four-Legged Climbing and Walking Robot. In:
Proceedings of 8th International Symposium iSAIRAS.
Tevatia, G. Schaal, S (2000). Inverse kinematics for humanoid robots. In: Proceedings of
IEEE International Conference on Robotics and Automation (ICRA 2000)
Toyota Partner Robot. http://www.toyota.co.jp/en/special/robot.
Wilson, M. (2002). Six views of embodied cognition, University of California, Santa Cruz, In
Psychonomic Bulletin & Review, 9, p. 625-636.
28
A Human Body Model for Articulated 3D Pose Tracking
1. Introduction
Within the last decade, robotic research has turned more and more towards flexible
assistance and service applications. Especially when cooperating with untrained persons at
small distances in the same workspace, it is essential for the robot to have a deep
understanding and a reliable hypothesis of the intentions, activities and movements of the
human interaction partner.
With growing computational capacities and new emerging sensor technologies, methods
for tracking of articulated motion have become a hot topic of research. Tracking of the
human body pose (often also referred to as Human Motion Capture) without invasive
measurement techniques, such as attaching markers, accelerometers or gyroscopes, demands (1) algorithms that maximally exploit the sensor data to resolve the ambiguities that necessarily arise when tracking a high-degree-of-freedom system, and (2) strong models of the tracked body that constrain the search space enough to enable fast and online tracking.
This chapter proposes a 3D model for tracking of the human body, along with an iterative
tracking approach. The body model is composed of rigid geometric limb models, and joint
models based on an elastic band approach. The joint model allows for different joint types
with different numbers of degrees of freedom. Stiffness and adhesion can be controlled via
joint parameters.
Effectiveness and efficiency of these models are demonstrated by applying them within an
Iterative Closest Point (ICP) approach for tracking of the human body pose. Used sensors
include a Time-of-Flight camera (depth camera), a mono colour camera as well as a laser
range finder. Model and sensor information are integrated within the same tracking step for
optimal pose estimation, and the resulting fusion process is explained, along with the used
sensor model. The presented tracking system runs online at 20-25 frames per second on a
standard PC.
We first describe related work and approaches, which partially form the basis for
the presented models and methods. Then, a brief introduction into the ICP is given.
The model for body limbs and joints is explained in detail, followed by a
description of the full tracking algorithm. Experiments, examples and different
evaluations are given. The chapter closes with a discussion of the achieved results
and a conclusion.
2. Related work
Tracking of human body motion is a highly active field in current research. Depending on
the target application, many different sensors and models have been used. This includes
invasive sensors like magnetic field trackers (Ehrenmann et al., 2003; Calinon & Billard,
2005) that are fixed to the human body. Within the context of human robot interaction in
every-day life, this approach is not feasible; non-invasive tracking approaches must be
applied. Most of these are based on vision systems, or on multi-sensor fusion (Fritsch et al.,
2003). Systems which rely on distributed sensors (Deutscher et al., 2000) are not practicable
in the given domain; the tracking system must be able to rely only on sensors mounted on
the robot.
Several surveys exist on the area of tracking humans (Aggarwal & Cai, 1999; Gavrila, 1999;
Moeslund & Granum, 2001; Wang et al., 2003). Possible applications range from the
mentioned human-robot interaction to surveillance and security domains. Hence, there is a
big variety of methods ranging from simple 2d approaches such as skin colour segmentation
(Fritsch et al., 2002) or background subtraction techniques (Bobick & Davis, 2001) up to
complex reconstructions of the human body pose. (Ramanan & Forsyth, 2003) shows how to
learn the appearance of a human using texture and colour.
Sidenbladh (Sidenbladh, 2001) used a particle filter to estimate the 3d pose in monocular
images. Each particle represents a specific configuration of the pose which is projected into
the image and compared with the extracted features. (Cheung et al., 2003) use a shape-from-
silhouette approach to estimate the human’s pose.
A similar particle filtering approach is used in (Azad et al., 2004). The whole body is tracked
based on edge detection, with only one camera. The input video stream is captured with
60Hz, which implies only small changes of the configuration between two consecutive
frames. As it is a 2d approach, ambiguities of the 3d posture can hardly be resolved.
An ICP-based approach for pose estimation is shown in (Demirdjian & Darrell, 2002). The authors use cylinders to model each body part. In (Demirdjian, 2003) the same authors show how they model joint constraints for their tracking process. However, the effect of the ICP is partially removed when the constraints are enforced. Nevertheless, parts of the work described in this chapter are based on the work of Demirdjian.
3.1 Sensors
3D point clouds are acquired by a Time-of-Flight camera. This depth camera called
Swissranger (CSEM, 2006) has a resolution of 160 x 124 pixels and a depth range of 0.5 to 7.5
meters. Fig. 1 shows the depth image of an example scene. Alternatively, point clouds from
reconstruction of stereo images can be used.
A standard single FireWire camera is used to obtain colour images. These are processed by a
standard skin-colour based algorithm to track head and hands in the image. All image
regions which are candidates for skin regions are provided to the tracking algorithm.
A SICK laser range finder which is mounted for navigation purposes delivers 3D points in
its measurement plane. These points are also provided to the tracking.
Fig. 1 Depth image, retrieved with the Time-of-Flight sensor. The point cloud is visualized
in 3D, and depth is additionally encoded in colours. Green is equivalent to near, and blue to
distant measurements.
This leads to

f(R, t) = (1/N) [ Σ_{i=1..N} ||R(x_i′) − p_i′||² − 2 t′ · Σ_{i=1..N} (R(x_i′) − p_i′) + N ||t′||² ]    (3)

In equation (3), the first term is independent of t′, and the second term vanishes because the centred point sets have zero mean. Therefore, the function becomes minimal if t′ = 0. Transformation yields

t = μ_p − R(μ_x)    (4)

Having the optimal translation (giving t′ = 0), equation (2) becomes:

f(R, t) = (1/N) Σ_{i=1..N} ||R(x_i′) − p_i′||²    (5)

Considering ||R(x_i′)|| = ||x_i′||, the equation can be rewritten as

f(R, t) = (1/N) [ Σ_{i=1..N} ||x_i′||² − 2 Σ_{i=1..N} R(x_i′) · p_i′ + Σ_{i=1..N} ||p_i′||² ]    (6)

Maximizing

Σ_{i=1..N} R(x_i′) · p_i′    (7)

is therefore equivalent to minimizing (6); this maximization can be solved in closed form, e.g. with unit quaternions (Horn, 1987).
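The rotation maximizing (7) and the translation (4) can be computed in a few lines. The following NumPy sketch is an illustration added here, not the authors' implementation: it uses the SVD-based solution, which yields the same optimum as the quaternion formulation, and accepts optional per-correspondence weights of the kind discussed later for the joint and sensor models.

```python
import numpy as np

def estimate_rigid_transform(model_pts, data_pts, weights=None):
    """Least-squares rigid transform (R, t) mapping model_pts onto data_pts.

    model_pts, data_pts: (N, 3) arrays of corresponding points.
    weights: optional (N,) per-correspondence weights (e.g. joint or sensor
             weights); uniform weights reproduce equations (3)-(7).
    """
    x = np.asarray(model_pts, dtype=float)
    p = np.asarray(data_pts, dtype=float)
    w = np.ones(len(x)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Centroids and centred point sets (the primed quantities in the text).
    mu_x, mu_p = w @ x, w @ p
    xc, pc = x - mu_x, p - mu_p

    # Maximizing sum_i R(x_i') . p_i' (eq. (7)) via the SVD of the
    # cross-covariance matrix; Horn's quaternion method gives the same result.
    H = (xc * w[:, None]).T @ pc
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T

    # Optimal translation, equation (4): t = mu_p - R(mu_x).
    t = mu_p - R @ mu_x
    return R, t
```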
Two issues arise in this context: (i) data points must be associated with body parts before they can be used for tracking, because each body part is transformed separately by the ICP; and (ii) a mechanism must be established which introduces the joint constraints of the model into the tracking process, to avoid drift of the limbs due to the separate tracking of each part.
The first issue is solved by checking each measurement point against all body part models and assigning it to the geometrically closest body part. In addition, a threshold is used to discard measurements whose distance to the model is too large.
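As an illustration of this assignment step, the sketch below approximates each limb by a cylinder (axis segment plus radius); this representation and the function names are assumptions made for the example, not the original code.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment from a to b."""
    ab = b - a
    s = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + s * ab))

def assign_points_to_limbs(points, limbs, max_dist=0.15):
    """Assign each 3d measurement to the geometrically closest limb.

    limbs: list of (axis_start, axis_end, radius) tuples approximating each
           limb as a cylinder (an assumed representation for this sketch).
    Points farther than max_dist from the surface of every limb are discarded.
    """
    assignment = {i: [] for i in range(len(limbs))}
    for p in np.asarray(points, dtype=float):
        dists = [abs(point_segment_distance(p, np.asarray(a, float), np.asarray(b, float)) - r)
                 for a, b, r in limbs]
        best = int(np.argmin(dists))
        if dists[best] <= max_dist:        # threshold to discard outliers
            assignment[best].append(p)
    return assignment
```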
To prevent the body parts from drifting apart, we introduce a novel joint model. The complete body model together with the joint model will be explained in the next section.
Fig. 2. Hierarchical body model, consisting of 10 cylinders and 9 joints. The torso describes
the root limb.
If the fusion algorithm also incorporates data from feature trackers (like some vision based
algorithms, or magnetic field trackers that are fixed on the human body), it is required to
identify certain feature points on the human body. This is done following the H|Anim
Specification (H|Anim, 2003).
For each junction of model parts, a set of elastic bands is defined (see Fig. 3). These relations
set up corresponding points on both model parts. The corresponding points can then be
used within the model fitting process to adjust the model configuration according to any
sensor data input and to the defined constraints.
Fig. 3. Different joint types. Universal joint with 3 degrees of freedom (a), hinge joint with
one full and 2 restricted degrees of freedom (b), and (c) elliptic joint with 3 restricted degrees
of freedom.
Major axes: The model type (universal, hinge, elliptic) and the valid range of each degree of
freedom control the choice for the major axes sizes and ratio. Universal joints are modelled
with both ellipse axes set to zero. For hinge and elliptic joints, the axis direction defines the
rotation axis, and the axis length defines the stiffness of the other two rotational degrees of
freedom.
In Fig. 3 b), rotational flexibility around the z-axis (perpendicular to the image plane) and
around the symmetric axis of the cylinders is very limited due to the modelled joint.
Position and orientation: Position and orientation of the point-to-point, hinge or elliptic
joint model with respect to both body parts define the connection between both parts.
Weight: When the joint model is used within a larger tracking framework, the elastic bands can be used as correspondences which are included as tracking constraints. The use of measured correspondences together with artificial ones creates the need for correct weighting strategies between input and model constraints.
To incorporate this, each joint model can be weighted with respect to the measured input. This parameter is then used within the model-fitting algorithm to balance between measured input and joint constraints. The weight parameter is defined in relation to the number of 'natural' correspondences to keep the ratio between measurements and constraints constant.
Increasing the weight of a joint model tightens the coupling between the two model parts; decreasing it makes the coupling looser. For hinge and elliptic joints, a higher weight also increases the stiffness of the kinematic chain. This makes sense especially for joints like the human neck or wrist, which show a very tight coupling and a very limited angular range.
The proposed joint model provides a ‘‘soft’’ way to restrict the degrees of freedom for the
model parts. It additionally provides means to control the ‘‘degree of restrictiveness’’ for
each DoF in a joint. Applying elastic strips is tantamount to introducing a set of forces which
hold the model parts together. The connected points and magnitude define the joint
behaviour.
The soft joint model can, e.g., be applied to the joint between the human pelvis and thigh: while the movement in the forward/backward direction is almost unrestricted, the left/right movement is highly restricted. Still, a small movement is possible, so a 1-DoF model would not be sufficient.
Nevertheless, it is possible to model plain bending joints with a hinge joint model with one
large axis.
• Each generated point pair represents one point on the model and the associated artificial data point. Each pair therefore has to be added to one body part as a model-to-data relation and to the other as a data-to-model relation to obtain the desired forces (from the elastic bands) on both model parts. These forces then try to establish the modelled connections.
• The artificial correspondences are recalculated in each ICP step.
• The chosen weight of each joint depends on the desired stiffness of the model. To
always achieve the same stiffness during tracking, the ratio between measured and
artificial point relations has to be constant. This means that the number of
generated artificial points for one body part in each step depends on the quantity of
measurements for this part of the model. The generated relations are linearly scaled
with the number of measurements.
• From our experience, the ratio r between measured and artificial points should be chosen as approximately 0.4 < r < 0.7. This gives enough cohesion within the model without imposing overly hard and static relationships.
It is important to note that the introduction of multiple identical correspondences within the
ICP does not increase computation time with the order of point counts (like a set of different
measured points would). The only additional effort consists of one multiplication of the
resulting point matrix (4x4) with the scalar weight W.
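A possible way to generate such artificial correspondences is sketched below; the elastic-band anchor points and the interpretation of the ratio r as the number of artificial points per measured point are assumptions of this illustration.

```python
import numpy as np

def joint_correspondences(band_pairs, n_measured, ratio=0.5):
    """Artificial point pairs for one junction (elastic bands).

    band_pairs: list of (point_on_limb_a, point_on_limb_b) anchor points of the
                elastic bands of this joint (assumed input of this sketch).
    n_measured: number of measured correspondences of the limb in this ICP step.
    ratio:      assumed here to be the number of artificial points per measured
                point; the chapter suggests roughly 0.4 < r < 0.7.
    Returns the pairs for limb a (model -> data) and limb b (data -> model),
    repeated so that their number scales linearly with n_measured.
    """
    n_per_band = max(1, int(round(ratio * n_measured / max(1, len(band_pairs)))))
    pairs_a, pairs_b = [], []
    for pa, pb in band_pairs:
        pa, pb = np.asarray(pa, float), np.asarray(pb, float)
        for _ in range(n_per_band):
            pairs_a.append((pa, pb))   # limb a: its model point pulled towards pb
            pairs_b.append((pb, pa))   # limb b: its model point pulled towards pa
    return pairs_a, pairs_b
```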
• 3d points on the human body that are e.g. generated by a stereo vision system that
tracks a person in image space and generates the corresponding 3d points by stereo
reconstruction.
• 3d points assigned to a single body part may also be generated by a stereo vision
system tracking special body parts like the face or the hands.
• 3d point-to-point relations are 3d points that can be assigned to a given point on the
tracked human body. Thus, tracking of special features or points (e.g. with
markers, or magnetic field trackers attached to the human body) can be integrated.
• 2d point-to-line relations can e.g. be derived from a 2d image space based tracker.
The pixel in the image plane together with the focal point define a ray in 3d, which
corresponds to the point on the human body that has been detected in the image.
Fig. 4. The complete tracking algorithm, containing all input data streams. “BB” denotes
Bounding Box.
This data can originate from any sensor that gives data in the described format. Obviously,
all input data has to be transformed into the tracker coordinate system before it is used
within the system. In our setup, we use the 3d point clouds from laser range finder and
depth camera as free 3d points. Extracted features (hands, head) from the colour image are
projected onto 2d point-to-line relations.
5.2 Processing
For the ICP matching algorithm, a list of corresponding point pairs has to be set up for each
limb. Therefore, all “free” 3d points have to be analyzed in order to decide whether they
correspond to points on the tracked model. Otherwise, they are discarded. Additionally, all
given correspondences from other tracking procedures and the background knowledge on
joint constraints have to be added to the correspondences list. Then, the optimal resulting
model configuration has to be computed. These steps are performed iteratively until an
optimum of the configuration is reached.
Before the input data of one time step is processed, it is possible to adjust internal model
parameters. This can be e.g. the model scale factor, or particular cylinder sizes. Even limbs
can be added to or removed from the model.
The processing steps are now described in detail.
Prefiltering free 3d points: The whole point cloud of free 3d points from used depth sensors is
processed in order to remove all points that are not contained within the bounding box of
the body model (see Fig. 4, step BB Check whole body). This is done on the assumption that
the body configuration changes only locally between two time frames. A parameter defines
an additional enlargement of the bounding box prior to this filtering step. The resulting
point list is concatenated with any sensor data input that has already assigned its measured 3d points to the tracked body. The result is a list of 3d points which are close to the body model and thus are candidates for measurements of the tracked body.
Assigning points to limb models: The point list is now processed in order to assign measured
points to dedicated limb models based on the bounding box of each limb model (see Fig. 4,
step BB Check body parts). Again, the bounding boxes can be enlarged by a parameter to take
the maximum possible displacement into account. Points that do not fall in any bounding
box are again removed. Several behaviours can be selected for points that belong to more
than one bounding box (overlap): These points are either shared between limb models,
exclusively assigned to one limb or shared only in case of adjacent limbs. This last method
avoids collisions between limbs that are not directly connected. The resulting point list can
be joined with any sensor data input that has already assigned its measured 3d points with
dedicated limbs of the tracked body. The resulting point list contains candidates for
measurements of each limb.
Point Number reduction: The resulting point list can be downsampled before the calculation
of the closest points to reduce the overall number of points (see Fig. 4, step Downsampling).
This step is controlled by three parameters: the sampling factor, and minimum and
maximum number of points per limb. Thus, it is possible to reduce the number of points for
limbs with many measurements, but maintain all points for limbs which have been
measured with only a few points.
Closest point computation: The closest point calculation is the most time-consuming step in the
whole loop. For each remaining data point, the corresponding model point on the assigned
limb model has to be computed for the ICP matching step (see Fig. 4, step Closest Point). This
involves several geometric operations. Depending on the resulting distance between data
and model point, all points within a given maximum distance are kept and the
correspondence pair is stored in the output list. All other points are deleted.
3d point-to-point relations from input data can now be added to the resulting list, which now holds corresponding point pairs between the data set and the model.
Addition of 2d measurements: Each 2d measure (e.g. tracked features in 2d image plane of a
camera) of a feature on the human body defines a ray in 3d which contains the tracked
feature. This fact is used to add the 2d tracking information to the 3d point correspondences
(see Fig. 4, step Closest point on line): For each reference point on the body model, the closest
point on the straight line is computed and added to the list.
Joint model integration: The joint model for each junction is added as artificial point
correspondences for each limb, depending on the limb type (see Fig. 4, step Joint model).
According to sec. 4.1, the correspondences can be interpreted as elastic bands which apply
dedicated forces to the limbs to maintain the model constraints. Thus, artificial
correspondences will keep up the joint constraints in the fitting step.
Model fitting: When the complete list of corresponding point pairs has been set up, the
optimal transformation between model and data point set can be computed according to
sec. 3.2 (Fig. 4, step Least squares). The transformation is computed separately for each
limb.
When all transformations have been computed, they can be applied to the model. The quality measure defined in sec. 3.2 is used for the fitting. All steps are repeated until the quality measure is below a given threshold or a maximum number of steps has been performed.
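Putting the steps together, one frame of the tracking loop can be outlined as follows. This is a structural sketch only: the callables and the limb interface stand for the processing steps described above and are not the original implementation; externally given 3d point-to-point pairs and 2d point-to-line relations would simply be appended to the same correspondence lists.

```python
def track_frame(limbs, measurements, assign_points, closest_point,
                joint_pairs, fit_transform, quality, max_steps=6, eps=1e-4):
    """One frame of the ICP-based body tracking loop (structural sketch).

    The callables stand for the processing steps described above; their exact
    signatures are assumptions of this sketch:
      assign_points(points, limbs)       -> {limb_index: [3d points]}
      closest_point(limb, point)         -> corresponding point on the limb model
      joint_pairs(limbs, i, n_measured)  -> artificial (model, data) pairs for limb i
      fit_transform(model_pts, data_pts) -> (R, t) least-squares transform
      quality(limbs, points)             -> scalar fitting error
    Each limb object is assumed to offer apply(R, t) to move the limb.
    """
    for _ in range(max_steps):
        per_limb = assign_points(measurements, limbs)   # bounding-box checks + closest limb
        for i, limb in enumerate(limbs):
            # Measured correspondences (optionally downsampled) ...
            pairs = [(closest_point(limb, p), p) for p in per_limb.get(i, [])]
            # ... plus artificial correspondences from the joint model.
            pairs += joint_pairs(limbs, i, len(pairs))
            if pairs:
                model_pts, data_pts = zip(*pairs)
                limb.apply(*fit_transform(model_pts, data_pts))
        if quality(limbs, measurements) < eps:           # converged
            break
    return limbs
```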
6. Sensor model
Each used data source has its own stochastic parameters which have to be taken into
account. The described approach offers a very simple method for this: each input datum is
weighted with a measure that describes its accuracy. The ICP algorithm then incorporates
these weights in the model-fitting step. Thus it is possible to weight a 2d face tracker much
higher than a single 3d point from a Time-of-Flight camera, or to weight 3d points from a
Time-of-Flight-camera slightly higher than points from the stereo reconstruction due to the
measuring principle and the sensor accuracy.
It is important to note that an increased weight for a single point does not affect the time
needed for the computation. This is very important and is due to the fact that in the
presented approach, each measurement is projected into model space. This is different to
e.g. particle filtering approaches, where each particle is projected into each sensor’s
measurement space to compute the likelihood. In consequence, adding a sensor source to
the tracking framework increases computation time only with the number of different
measurements from the sensor.
An example configuration can be seen in Fig. 5. The model consists of two cylinders,
connected by a linear joint. The measurements contain a 3d point cloud, and a 3d
measurement of one end point. This configuration can e.g. result from a stereo depth image
of a human arm and a colour based hand tracker.
Fig. 5. Different weights for measurements from different sensors, projected into 3D model
space. The depicted point sizes correspond to the sensor data weight, the lines indicate the
closest-point relations. These pictures motivated the system name VooDoo.
Fig. 6 shows the tracking results with 3d data only (row 2) and 2d data only (row 3), where the 3d data has been
acquired with the Time-of-Flight camera and the 2d data is derived from skin colour
segmentation in one image of the stereo camera. The rays in 3d defined by the skin
colour features can be seen here. Row 4 shows the tracking result with both inputs
used.
For the shown results, the following weights for the input data have been used: 3d data
points w = 1.0, face tracker w = 30.0, hand tracker w = 20.0.
Fig. 6. Experiments with different sensor inputs, taken from a sequence containing a "bow" and a "wave" movement; the rows show, from top to bottom, the scene, tracking with 3d data only, tracking with 2d data only, and the fusion of both inputs. The frame number is displayed on the top. The used 2d and 3d correspondences have been added to the resulting model images.
Fig. 8. Time consumption per ICP step vs. number of ICP steps.
The computational effort for one frame depends first of all on the number of ICP steps
needed. The number of iterations again depends on the body displacement between two
consecutive frames. Fig. 7 shows the number of required ICP steps during a typical tracking
sequence for a human body model. During phases without large movements, one iteration
is enough to approximate the body pose (frame 500 to 570). Extensive movements are
compensated by more ICP iteration steps per frame (650 to 800).
The required time per frame obviously increases with the number of needed ICP steps. This
relation is shown in Fig. 8. A maximum number of 6 ICP steps has turned out to be a good
trade-off between time consumption per frame and tracking accuracy. This leads to a frame
period of 20 - 70ms, which corresponds to a frame-rate of 14.2 to 50Hz. The maximum
frame-rate in our framework is only constrained by the camera frame-rate, which is 30Hz.
can be modelled easily by distributing the elastic bands along two axes in the joint. The joint
constraints are incorporated in the ICP as artificial measurements, so measurements and
model knowledge are processed identically. The model can also be refined by adding
cylindrical primitives for hands, fingers and feet. This is reasonable if the accuracy and
resolution of the available sensors are high enough to resolve e.g. the hand posture, which is
not the case in our approach due to the large distance between human and robot and the
low measurement resolution.
The idea of introducing artificial correspondences into the fitting step can be exploited even further. Current work includes a further restriction of the joints in angular space by adding angular limits to certain degrees of freedom, which are kept valid by artificial point correspondences. These will be generated and weighted depending on the current body configuration.
Our implementation of the described tracking framework has been released under the GPL
license, and is available online at wwwiaim.ira.uka.de/users/knoop/VooDoo/doc/html/,
along with sample sequences of raw sensor data and resulting model sequences.
9. References
Aggarwal, J. K.; Cai, Q. (1999) Human motion analysis: A review, Computer Vision and Image
Understanding: CVIU, vol. 73, no. 3, pp. 428–440.
Azad, P.; Ude, A.; Dillmann, R.; Cheng, G. (2004) A full body human motion capture system
using particle filtering and on-the-fly edge detection, in Proceedings of the IEEE-
RAS/RSJ International Conference on Humanoid Robots. Santa Monica, USA.
Besl, P. J.; McKay, N. D. (1992) A method for registration of 3-d shapes, IEEE Transactions on
pattern analysis and machine intelligence, vol. 14, no. 2, pp. 239–256, February.
Bobick, A. F.; Davis, J. W. (2001) The recognition of human movement using temporal templates,
IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 3, pp. 257–
267.
Calinon, S.; Billard, A. (2005), Recognition and reproduction of gestures using a probabilistic
framework combining PCA, ICA and HMM, in Proceedings of the International
Conference on Machine Learning (ICML), Bonn, Germany
Cheung, G. K. M.; Baker, S.; Kanade, T. (2003) Shape-from-silhouette of articulated objects and its
use for human body kinematics estimation and motion capture, in Computer Vision and
Pattern Recognition.
CSEM (2006) Swissranger website. http://www.swissranger.ch
Demirdjian, D.; Darrell, T. (2002) 3-d articulated pose tracking to untethered deictic references, in
Multimodel Interfaces, pp. 267–272.
Demirdjian, D. (2003) Enforcing constraints for human body tracking, in Conference on
Computer Vision and Pattern Recognition, Workshop Vol. 9, Madison, Wisconsin,
USA, pp. 102–109.
Deutscher, J.; Blake, A.; Reid, I. (2000), Articulated body motion capture by annealed particle
filtering, in Computer Vision and Pattern Recognition (CVPR), Hilton Head, USA,
pp. 2126–2133.
Ehrenmann, M.; Zöllner, R.; Rogalla, O.; Vacek, S.; Dillmann, R. (2003). Observation in
programming by demonstration: Training and execution environment, in Proceedings of
Third IEEE International Conference on Humanoid Robots, Karlsruhe and Munich,
Germany.
Fritsch, J.; Lang, S.; Kleinehagenbrock, M.; Fink, G. A.; Sagerer, G. (2002) Improving adaptive
skin color segmentation by incorporating results from face detection, in Proc. IEEE Int.
Workshop on Robot and Human Interactive Communication (ROMAN). Berlin,
Germany
Fritsch, J.; Kleinehagenbrock, M.; Lang, S.; Plötz, T.; Fink, G.A.; Sagerer, G. (2003), Multi-
modal anchoring for human-robot-interaction, Robotics and Autonomous Systems,
Special issue on Anchoring Symbols to Sensor Data in Single and Multiple Robot
Systems, vol. 43, no. 2–3, pp. 133–147.
Gavrila, D. M. (1999) The visual analysis of human movement: A survey, Computer Vision and
Image Understanding, vol. 73, no. 1, pp. 82–98.
H|Anim (2003), Information technology — Computer graphics and image processing — Humanoid
animation (H-Anim), Annex B, ISO/IEC FCD 19774, Humanoid Animation Working
Group, Specification.
Horn, B. K. P. (1987) Closed-form solution of absolute orientation using unit quaternions, Optical
Society of America Journal A, vol. 4, pp. 629–642, Apr. 1987.
Knoop, S.; Vacek, S. & Dillmann, R. (2005). Modelling Joint Constraints for an Articulated 3D
Human Body Model with Artificial Correspondences in ICP, Proceedings of the
International Conference on Humanoid Robots (Humanoids 2005), Tsukuba, Japan,
December 2005, IEEE-RAS
Knoop, S.; Vacek, S. & Dillmann, R. (2006). Sensor Fusion for 3D Human Body Tracking with an
Articulated 3D Body Model. Proceedings of the IEEE International Conference on
Robotics and Automation (ICRA), Orlando, Florida, May 2006
Knoop, S.; Vacek, S. & Dillmann, R. (2006). Sensor fusion for model based 3D tracking.
Proceedings of the IEEE International Conference on Multisensor Fusion and
Integration for Intelligent Systems (MFI), Heidelberg, Germany, September 2006
Moeslund, T. B.; Granum, E. (2001) A survey of computer vision-based human motion capture,
Computer Vision and Image Understanding, vol. 81, no. 3, pp. 231–268.
Ramanan, D.; Forsyth, D. A. (2003) Finding and tracking people from the bottom up, in
Computer Vision and Pattern Recognition, vol. 2, 18-20 June, pp. II–467–II–474.
Sidenbladh, H. (2001) Probabilistic tracking and reconstruction of 3d human motion in monocular
video sequences, Ph.D. dissertation, KTH, Stockholm, Sweden.
Wang, L.; Hu, W.; Tan, T. (2003) Recent developments in human motion analysis, Pattern Recognition, vol. 36, no. 3, pp. 585–601.
29. Drum Beating and a Martial Art Bojutsu Performed by a Humanoid Robot
1. Introduction
Over the past few decades a considerable number of studies have been made on impact
dynamics. Zheng and Hemami discussed a mathematical model of a robot that collides with an
environment (Zheng & Hemami, 1985). When a robot arm fixed on the ground collides with a
hard environment, the transition from free space to constrained space may introduce instability into the control system. Therefore, the impact between robots and environments has been a subject of controversy. Asada and Ogawa analyzed the dynamics of a robot arm interacting with an environment using inverse inertia matrices (Asada & Ogawa, 1987). In the early 1990s, the optimum approach velocity for force-controlled contact was studied intensively (Nagata et al., 1990; Kitagaki & Uchiyama, 1992). Volpe and Khosla proposed an impact
control scheme for stable hard-on-hard contact of a robot arm with an environment (Volpe &
Khosla, 1993). Mills and Lokhorst proposed a discontinuous control approach for the tasks that
require robot arms to make a transition from non-contact motion to contact motion, and from
contact motion to non-contact motion (Mills & Lokhorst, 1993). Walker proposed measures
named the dynamic impact measure and the generalized impact measure to evaluate the effects
of impact on robot arms (Walker, 1994). Mandal and Payandeh discussed a unified control
strategy capable of achieving stable contact with both hard and soft environments (Mandal
& Payandeh, 1995). Tarn et al. proposed a sensor-referenced control method using positive
acceleration feedback and switching control strategy for robot impact control (Tarn et al., 1996).
Space robots do not have fixed bases; therefore, an impact with other free-floating objects may be catastrophic for them. In order to minimize the impulsive reaction force or the attitude disturbance at the base of a space robot, collision strategies using the reaction null-space have been proposed (Yoshida & Nenchev, 1995; Nenchev & Yoshida, 1998).
Most of this research has aimed at overcoming the problems introduced by impacts between robots and environments. Some researchers have tried to exploit the advantages of impacts. When a robot applies a force statically on an environment, the magnitude of the force is limited by the maximum torque of the actuators. In order to exert a large force on the environment beyond this limitation, the application of impulsive forces has been studied by a few researchers. Uchiyama performed a nailing task with a 3-DOF robotic manipulator (Uchiyama, 1975). Takase et al. developed a two-arm robotic manipulator named Robot Carpenter and performed sawing of a wooden plate and nailing (Takase, 1990). Izumi and Kitaka proposed the use of a flexible link manipulator for nailing tasks, because the flexible link has an advantage in absorbing an impact (Izumi & Kitaka, 1993).
However, the works mentioned above were done using robotic manipulators fixed on the ground (except for the space robots), and thus there was no need to worry about losing balance. Humanoid robots are expected to work on humans' behalf. If a humanoid robot can do heavy work utilizing impulsive forces as well as a human does, humanoid robots will be widely used in various application fields such as construction, civil engineering, and rescue activities.
The first attempt on an impact motion by a humanoid robot was reported in (Hwang et al.,
2003). Matsumoto et al. performed a Karate-chop using a small humanoid robot and broke
wooden plates (Matsumoto et al., 2004). In order for a legged robot to effectively exert a large force on an environment without losing balance, the working posture is important.
Tagawa et al. proposed a firm standing of a quadruped for mobile manipulation (Tagawa et
al., 2003). Konno et al. discussed an appropriate working posture of a humanoid robot
(Konno et al., 2005).
This chapter addresses an impact motion performed by the humanoid robot HRP-2. Drum beating is taken as a case study, because it is a typical task that requires large impulsive forces. The drum beating motion is carefully designed to synchronize with music. The drum beating and the Japanese martial art Bojutsu were performed by the humanoid robot HRP-2 in the Prototype Robot Exhibition at the Aichi Exposition 2005.
(Two plots of the wrist force components X, Y, and Z in N versus time in s, for cases (a) and (b).)
Fig. 3. Force generated at the wrist. (a) When the humanoid robot exerts a quasi-static force
on a wall. (b) When the humanoid robot exerts an impulsive force on a wall.
Fig. 4. Control system software of the HRP-2 with OpenHRP (the figure is quoted from
http://www.generalrobotix.com/product/openhrp/products_en.htm).
Foundational plugins such as Kalman Filter, Sequential Playback, Walk Stabilizer, Pattern Generator, Dynamics, Logger, and ZMPSensor are also included in the control system software; in addition, users can develop their own functions as plugins to enrich the humanoid robot's motions. Please see the official web page http://www.generalrobotix.com/product/openhrp/products_en.htm for more details of the control software.
4. Drum Beating
4.1 Primitive poses and motions
In order to generate drum beating motions of the humanoid robot HRP-2, the motion is
decomposed into four primitive poses or motions: (a) initial pose, (b) swing, (c) impact, and
(d) withdrawing, as shown in Fig. 5. Among the four primitive motions, impact and
withdrawing are important to exert an impulsive force.
As presented in Fig. 6, three different swing patterns, (a) small swing, (b) middle swing and
(c) big swing, are generated sharing the poses for the impact and withdrawing.
For these swing patterns, three different initial poses are given, and the poses to pass through during the swing motion are designed. Cubic splines are used to interpolate the given poses.
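As an illustration of this interpolation, the sketch below uses SciPy's cubic spline (the key times, joint values, and sampling period are made-up numbers, not the original pose data) to interpolate each joint angle through the given key poses and sample the resulting trajectory.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def interpolate_swing(key_times, key_poses, dt=0.005):
    """Cubic-spline interpolation of joint key poses.

    key_times: strictly increasing times [s] of the key poses (initial pose,
               intermediate swing poses, impact, withdrawing).
    key_poses: (K, J) array with one row of J joint angles per key pose.
    dt:        sampling period of the generated joint trajectory [s] (assumed).
    """
    key_times = np.asarray(key_times, dtype=float)
    key_poses = np.asarray(key_poses, dtype=float)
    spline = CubicSpline(key_times, key_poses, axis=0)
    times = np.arange(key_times[0], key_times[-1] + dt, dt)
    return times, spline(times)

# Made-up example: two joints, last two intervals fixed to 0.1 s
# (previous pose -> impact -> withdrawing).
key_times = [0.0, 0.2, 0.4, 0.6, 0.7, 0.8]
key_poses = [[-60, -35], [-55, -40], [-48, -47], [-38, -58], [-30, -66], [-42, -52]]
times, trajectory = interpolate_swing(key_times, key_poses)
```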
0.500 RS
1.270 LM
1.270 RM
0.635 LS
0.500 END
The numbers listed in the first column indicate the interval (s) to the next beating. The
symbols listed in the second column indicate the way of beating. The first character ’R’ or ’L’
indicates the arm to move (Right or Left), while the second character ’S’, ’M’, ’B’, or ’E’
indicates the kinds of swing (Small swing, Middle swing, Big swing, or Edge beating, see
Fig. 6).
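A small parser for this script format could look as follows; this is a hypothetical sketch, not the interpreter actually running on the robot (cf. Fig. 7).

```python
def parse_beat_script(text):
    """Parse a beat timing script into (interval_s, arm, swing) tuples.

    Each line holds an interval in seconds and a beat symbol, e.g. "1.270 RM":
    the first character of the symbol selects the arm (R/L) and the second the
    swing type (S = small, M = middle, B = big, E = edge beating); the symbol
    END terminates the script.
    """
    arms = {"R": "right", "L": "left"}
    swings = {"S": "small", "M": "middle", "B": "big", "E": "edge"}
    beats = []
    for line in text.strip().splitlines():
        interval, symbol = line.split()
        if symbol == "END":
            break
        beats.append((float(interval), arms[symbol[0]], swings[symbol[1]]))
    return beats

# Example using the script fragment listed above.
script = "0.500 RS\n1.270 LM\n1.270 RM\n0.635 LS\n0.500 END"
print(parse_beat_script(script))
```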
For example, the third line of the script "1.270 RM" indicates "beat the drum after 1.270 s using the middle swing of the right arm." The period between the impact and the previous pose is fixed to 0.1 s to achieve the maximum speed at the impact. As shown in Fig. 6 (b), seven intermediate poses are designed for the middle swing between the initial pose and the impact; therefore, if the duration is specified as 1.270 s, each period ΔTM between the poses is calculated from the remaining duration, as illustrated in Fig. 6 (b).
(Timing diagrams for the three swing patterns: each swing consists of the initial pose, intermediate swing poses separated by ΔTS, ΔTM (seven intervals for the middle swing), or ΔTB (eleven intervals for the big swing), the impact, and the withdrawing; the periods before the impact and before the withdrawing are fixed to 0.1 s, and the remaining periods are derived from the duration indicated in the beat timing script.)
Fig. 6. Three swing patterns. The periods between impact and the previous pose, and
between withdrawing and impact are fixed to 0.1 [s]. Other periods denoted by ΔTS, ΔTM,
ΔTB, are computed from the duration indicated in the beat timing script. (a) Small swing. (b)
Middle swing. (c) Big swing.
(Diagram contents: the music file TokyoOndo.wav and the hand-made beat timing script TokyoOndo.txt are processed by TimeKeeper.exe with TimingGenerator3_3.so, which interprets the beat timing script and adjusts the timings to pass through the given poses; the given poses and timings are passed via CORBA calls to OpenHRP, where Spliner.o generates a sequence of joint motions by cubic spline interpolation of the given postures; further components are the music player with SpeakServer and speak.so, a Jython script, a queue, and the robot with seqplay.so and ArmPluginCS.so.)
Fig. 7. A software diagram. The components marked with red boundary boxes are
developed in this work.
Fig. 8. (Plot of joint angle trajectories over time from 0 to 0.7 s during a swing motion.)
As can be seen in Fig. 8, during the last 0.1 [s] before the impact (approximately from 0.5 to 0.6 [s]), the gradients of the joint trajectories are steep compared with the other periods. Since the period between the impact and the previous pose is set to 0.1 [s], the maximum joint speed is almost achieved.
Fig. 9. The Japanese martial art Bojutsu motion patterns. (a) Thrusting a staff weapon
rightward. (b) Thrusting a staff weapon leftward. (c) Banging down a staff weapon.
The Prototype Robot Exhibition was held for 11 days from June 9 to 19, at the Morizo and
Kiccoro Exhibition Center, a convention venue in the Aichi Expo site. The Prototype Robot
Exhibition was organized by the Japan Association for the 2005 World Exposition and the
New Energy and Industrial Technology Development Organization (NEDO). 63 prototypes
performed demonstrations during the period.
The drum beating and Bojutsu demonstration was performed twice a day in the Prototype
Robot Exhibition (Fig. 10).
(a) (b)
Fig. 10. Demonstrations at Aichi Exposition 2005. (a) Drum beating performance. (b) A
Japanese martial art Bojutsu performance.
7. Conclusion
This chapter proposed to utilize impulsive forces for humanoid robots to exert a large force beyond the torque limitations of the actuators. The problems of impact tasks to be solved in future work were brought up in Section 2.
Drum beating was taken as a case study, because it is a typical task that requires large impulsive forces. The details of the drum beating and the Japanese martial art Bojutsu performed by the humanoid robot HRP-2 at the Aichi Exposition were presented in this chapter.
8. Acknowledgement
The authors would like to express special thanks to the staff of Kawada Industries, Inc. and General Robotix, Inc. for their kind and sincere support in this project. The authors would also like to thank all the staff involved in the Prototype Robot Exhibition.
9. References
Asada, H. & Ogawa, K. (1987). On the Dynamic Analysis of a Manipulator and its End Effector Interacting with the Environment, Proceedings of the IEEE Int. Conf. on Robotics and Automation, pp. 751–756
Hirukawa, H.; Kanehiro, F. & Kajita, S. (2003). Open HRP: Open Architecture Humanoid
Robotics Platform, Robotics Research, STAR 6, Springer-Verlag, Jarvis, R. A. &
Zelinsky, A. Eds., pp. 99–112
Hwang, Y.; Konno, A. & Uchiyama, M. (2003). Whole Body Cooperative Tasks and Static
Stability Evaluations for a Humanoid Robot, Proceedings of IEEE/RSJ Int. Conf. on
Intelligent Robots and Systems, pp. 1901–1906
Izumi, T. & Kitaka, Y. (1993). Control of a Hitting Velocity and Direction for a Hammering
Robot Using a Flexible Link, Journal of the RSJ, Vol. 11, No. 3, pp. 436–443, (In
Japanese).
30. On Foveated Gaze Control and Combined Gaze and Locomotion Planning
1. Introduction
This chapter presents recent research results of our laboratory in the area of vision and
locomotion coordination with an emphasis on foveated multi-camera vision. A novel active
vision planning concept is presented which coordinates the individual devices of a foveated
multi-camera system. Gaze direction control is combined with trajectory planning based on
information theoretic criteria to provide vision-based autonomous exploring robots with
accurate models of their environment.
With the help of velocity and yaw angle sensors, mobile robots can update the internal
knowledge about their current position and orientation from a previous time step; this
process is commonly referred to as dead-reckoning. Due to measurement errors and
slippage these estimations are erroneous and position accuracy degrades over time causing
a drift of the estimated robot pose. To overcome the drift problem it is common to take
absolute measurements evaluating visual information, which are fused dynamically with
the odometry data by applying Kalman-filter or other techniques, e.g. (Dissanayake et al.,
2001). The use of active vision systems for navigation is state-of-the-art providing a
situation-related selective allocation of vision sensor resources, e.g. (Davison & Murray,
2002; Seara et al., 2003; Vidal-Calleja et al., 2006). Active vision systems comprising only one
type of vision sensor face a trade-off between field of view and measurement accuracy due
to limitations of sensor size and resolution, and of computational resources. In order to
overcome this drawback, the combined use of several vision devices with different fields of view and measurement accuracies is known, which is called foveated, multi-resolution, or multi-focal vision, e.g. cf. (Dickmanns, 2003; Kühnlenz et al., 2006; Ude et al., 2006). Thereby,
the individual vision devices can be independently controlled according to the current
situation and task requirements. The use of foveated active vision for humanoid robot
navigation is considered novel.
Active vision is also frequently utilized in the context of robotic exploration. Yet, gaze control
and locomotion planning are generally decoupled in state-of-the-art approaches to simultaneous
localization and mapping (SLAM). An integrated locomotion planning and gaze direction
control concept maximizing the collected amount of information is presented in the second part
of this chapter. This strategy results in more accurate autonomously acquired environment
representations and robot position estimates compared to state-of-the-art approaches.
The chapter is organized as follows: In Section 2 vision-based localization and mapping in
the context of humanoid robots is surveyed; Section 3 is concerned with foveated multi-camera coordination; novel concepts of gaze control and path planning coordination are
presented in Section 4; evaluation studies comparing the novel concepts to conventional
planning approaches and vision systems are presented in Section 5; conclusions are given in
Section 6.
x_{k+1} = x_k (1 − γ_k) + f_s(x_k, u_k, d_{x,k}) γ_k ,    (1)
where state-vector x contains the robot foot pose and the landmark positions, d represents
system noise capturing dead-reckoning uncertainties, and γ∈{0; 1} is a binary variable
indicating a change of the supporting foot when γ=1. The commanded step u is expressed
by
u_k = [ ^SF x_{s,k}   ^SF y_{s,k}   ^SF θ_{s,k} ]^T ,    (2)
including the commanded step position [xs ys]T and orientation θs with respect to the current
supporting foot frame SF. Figure 1 schematically shows a typical SLAM situation of a
humanoid robot with the reference frame currently placed in the left foot.
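A minimal sketch of the state propagation in equation (1) is given below; the planar step function f_s used here is a simplification assumed for illustration, not the full model of the chapter.

```python
import numpy as np

def predict_state(x, u, gamma, noise=None):
    """State propagation according to equation (1) (simplified planar sketch).

    x:     state vector [x_f, y_f, theta_f, m1x, m1y, ...] containing the
           supporting foot pose followed by the landmark positions.
    u:     commanded step [x_s, y_s, theta_s] expressed in the current
           supporting foot frame, see equation (2).
    gamma: 1 if the supporting foot changes at this time step, 0 otherwise.
    noise: optional additive dead-reckoning noise on the foot pose.
    Landmarks are static; only the foot pose changes when a step is made.
    """
    x = np.asarray(x, dtype=float).copy()
    if gamma:
        xf, yf, th = x[:3]
        xs, ys, ths = u
        c, s = np.cos(th), np.sin(th)
        # f_s: express the commanded step, given in the supporting foot frame,
        # in world coordinates and update the foot pose.
        x[:3] = [xf + c * xs - s * ys, yf + s * xs + c * ys, th + ths]
        if noise is not None:
            x[:3] += noise
    return x
```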
In vision-based SLAM field of view restrictions of the vision device strongly limit the
number of landmarks to be observed simultaneously. Yet, a larger field of view can only be
realized accepting a lower measurement accuracy of the vision device mainly due to
limitations of sensor size and resolution. Therefore, we propose the use of several vision
devices which provide different fields of view and accuracies and a novel gaze control
concept for coordinating the individual vision devices in order to provide both a large field of view and high measurement accuracy simultaneously. These foveated active vision
concepts for robot navigation are discussed in the following section.
the most relevant structure in the current situation. Yet, in a variety of scenarios this
approach is unsuitable or even unrealizable. In at least partially unknown environments
and in exploration scenarios a sufficient map is not available and thus has to be created
online. However, due to the strongly limited field of view the detection of new objects of
potential interest is hardly possible. Another aspect is the presence of potentially relevant or even dangerous objects or activities in the local surroundings of the robot, which cannot be detected.
In order to overcome the common drawback of trading field of view versus measurement
accuracy, the combination of wide-angle and telephoto vision devices has been suggested.
Such systems provide, at the same time, both an observation of a large part of the environment and a selective examination with high accuracy. In common literature these
systems are referred to as foveated, multi-resolution or multi-focal systems. The individual
vision devices may be fixed with respect to each other or may be independently motion
controllable in one or more degrees of freedom. Most common embodiments of foveated
systems are used in state-of-the-art humanoid robots comprising two different cameras
combined in each eye which are aligned in parallel, e.g. (Brooks et al., 1999; Ude et al., 2006;
Vijayakumar et al., 2004). Systems for ground vehicles, e.g. (Apostoloff & Zelinsky, 2002;
Maurer et al., 1996; Dickmanns, 2003) are another prominent class. An emerging area is that of surveillance systems, which strongly benefit from the combination of a large scene overview
and selective observation with high accuracy, e.g. (Bodor et al., 2004; Davis & Chen, 2003;
Elder et al., 2004; Jankovic & Naish, 2005; Horaud et al., 2006). An embodiment with
independent motion control of three vision devices with a total of 6 degrees-of-freedom
(DoF) is the camera head of the humanoid robot LOLA developed at our laboratory which is
shown in Figure 2 providing more flexibility and, due to directly driven gimbals, faster
camera motions than other known systems, cf. e.g. (Kühnlenz et al., 2006).
where counter i covers the considered components of the foot pose and ei are the
eigenvalues of the predicted foot pose covariance matrix Puu which is a submatrix of the
predicted covariance matrix of a possible target state as estimated by the Kalman-filter, e.g.
cf. (Dissanayake et al., 2001)
P^i_{k|k} = ( P^i_{uu}  P^i_{um} ; P^i_{mu}  P^i_{mm} ) ,    (4)
where Piuu is the error covariance matrix of the robot state estimate, Pimm is the map
covariance matrix of the landmark state estimates and Pium is a cross-covariance matrix
between robot and landmark states. Low values of the defined measure (3), thus, indicate a
high certainty of the robot pose estimation and, therefore, good task performance for the
locomotion task. Additional measures to assess the performance of secondary tasks have
been proposed which also may have an indirect impact on the performance of the primary
(locomotion) task, e.g. field of view limitations, presence of activities, etc., (Kühnlenz, 2006).
These measures are all extensions of the central gaze control concept and are, therefore, outside the scope of this chapter.
Given such measures to assess the task performance of the humanoid robot the next task is
to derive appropriate view directions for the individual vision devices in the following time
step in order to achieve a particular desired task performance. This gaze control concept is
topic of the following section.
planning of gaze direction and locomotion path in order to provide the mobile robot with
capabilities of exploring unknown environments.
the robot and the sensory system simultaneously so that the best state estimates possible are
acquired and as much new terrain as possible is explored.
gain I[x,z]. The gain of information between any two distributions can be computed as the
change in entropy. In our case this change is the difference between the entropies of the
state estimates before and after making an observation which are both multivariate
Gaussians with covariances Pk+1|k and Pk+1|k+1. Therefore, the information gain evaluates to
I[x, z] = H(x) − H(x|z) = (1/2) log(|P_{k+1|k}|) − (1/2) log(|P_{k+1|k+1}|)    (7)
This information gain can be calculated as a function of the state covariance matrix. From (7)
it is obvious that information gain I[x,z] becomes maximal, if the determinant of Pk+1|k+1 is
minimized. Starting from the current state estimate the covariances of the states that can be
observed next by the vision sensors are predicted. The equations for the prediction step of
the classical SLAM algorithm based on an extended Kalman-filter (Dissanayake et al., 2001)
are used. After all covariances are predicted the most informative state can be calculated by
minimizing |P_{k+1|k+1}|. The new optimal gaze direction Ω of the active vision system
corresponding to this state is then computed.
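This selection step can be sketched as follows, assuming an EKF helper that predicts the posterior covariance for a hypothetical observation from a candidate gaze direction; the helper name and signature are assumptions of this sketch, not part of the original code.

```python
import numpy as np

def select_gaze_direction(candidate_directions, P_pred, predict_update_covariance):
    """Pick the gaze direction maximizing the information gain of equation (7).

    candidate_directions: iterable of candidate gaze directions Omega.
    P_pred: predicted state covariance P_{k+1|k}.
    predict_update_covariance(omega, P_pred) -> P_{k+1|k+1} for the landmarks
        visible from omega (assumed EKF helper).
    """
    logdet_prior = np.linalg.slogdet(P_pred)[1]
    best_dir, best_gain = None, -np.inf
    for omega in candidate_directions:
        P_post = predict_update_covariance(omega, P_pred)
        gain = 0.5 * (logdet_prior - np.linalg.slogdet(P_post)[1])  # I[x, z]
        if gain > best_gain:
            best_dir, best_gain = omega, gain
    return best_dir, best_gain
```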
Fig. 5. Region covered while planning over a horizon of N steps. Highlighted grid cells
show which cells are taken into account for gaze direction control.
The first step for choosing the next destination for the robot is to estimate the states and
covariances of all possible positions that can be reached over its planning horizon. A
discretized grid environment is used, where each grid represents a position that can be
reached by the robot over future time steps. Therefore, the size of the grid cells depends on
the physical properties of the robot. Based on this discretized environment the states and
their covariances are computed. While the robot moves, observations are made and used to
update the state estimation. This way, all available information is being used. More
specifically, based on an initial state estimate and covariance matrix, we calculate all
possible robot states and their covariances after N time steps and choose to move to the one
that is most informative, namely the one that minimizes relative entropy as described in the
previous section. A mathematical description of the algorithm used to produce the multi-
step predictions, can be found in (Lidoris et al., 2007). The estimation procedure evolves in a
square-like manner, as shown in Figure 5. Starting from the currently estimated state the
first eight neighboring states and covariances are computed. At each step, the estimated
state and covariances of the neighboring states are used to infer the next ones until step N.
By always using the nearest neighbor in the estimation process, the estimation error is kept
minimum. Over each time step k, 8k new states are calculated. The control signal u_{ji,k} required to drive the robot from a state j to a state i is chosen as indicated by the arrows in Figure 5.
Using the predicted covariance matrix (4) and the concept of relative entropy mentioned
previously, each possible future position of the robot can be evaluated to choose the
appropriate target position for the robot. The destination that maximizes the function
V_i = (1/2) log( |P^i_{uu}| / |P^0_{uu}| ) − (γ/2) log( |P^i_{mm}| / |P^0_{mm}| )    (8)
is chosen as a target for the robot. The first part of the function is a measure of the position
uncertainty the robot will encounter in the future position and the second part is a measure
of the map quality. The constant γ can be used to adjust the behavior of the robotic explorer. Setting γ to values smaller than one results in a conservative exploration policy, since the robot stays near well-localized features, giving more attention to localization. Large values of γ increase the proactivity of the explorer in the sense that it moves to unknown areas, neglecting the lower localization accuracy.
maximizes (8), the robot moves making observations which are used to update the
estimated state and covariance. Each time a new state estimate is available, a recalculation of
the gaze direction is made. This way we use all new information that becomes available
during robot motion. Replanning takes place after N time steps when the target position is
reached.
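The destination choice of equation (8) can then be sketched as follows, assuming that the predicted covariance blocks of all reachable grid cells are already available from the multi-step prediction.

```python
import numpy as np

def select_target(cells, P0_uu, P0_mm, gamma=1.0):
    """Choose the target grid cell maximizing V_i from equation (8).

    cells:  list of (cell_id, P_uu_i, P_mm_i) with the predicted robot-pose and
            map covariance blocks for each reachable grid cell.
    P0_uu, P0_mm: current robot-pose and map covariance blocks.
    gamma:  trade-off constant; values below one favour localization, larger
            values favour exploring unknown areas.
    """
    def logdet(M):
        return np.linalg.slogdet(M)[1]

    best_cell, best_value = None, -np.inf
    for cell_id, P_uu, P_mm in cells:
        v = 0.5 * (logdet(P_uu) - logdet(P0_uu)) \
            - 0.5 * gamma * (logdet(P_mm) - logdet(P0_mm))
        if v > best_value:
            best_cell, best_value = cell_id, v
    return best_cell, best_value
```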
Figure 6 depicts a typical robot navigation scenario where a humanoid robot fixates two landmarks with two vision devices of its foveated multi-camera vision system in order to localize itself in the world.
In order to demonstrate the benefits of foveated multi-camera view direction planning the
proposed gaze control approach is now evaluated in a structured humanoid robot navigation
scenario. Several vision system configurations are evaluated by comparison of the achieved
navigation performances. The basic scenario is shown in Figure 6. Four landmarks are
distributed within a rectangular environment. The mission of the robot is to follow the
planned path in x-direction. In order to complete the mission successfully the robot has to
localize itself within the environment evaluating available visual information on the positions
of the identified landmarks. The robot pose is estimated dynamically using the Kalman-filter
approach described in Section 2. In order to maximize the information gain optimal view
directions of the individual vision devices are selected dynamically based on the proposed
approach in Section 3.4. Neither the positions nor the number of landmarks are known a priori. The configurations of the vision system in the considered scenario to be
compared are: a) conventional single stereo-camera, focal-lengths 20mm, aperture angles 30°,
stereo-base 25cm; b) foveated stereo-camera with two cameras per eye aligned in parallel,
focal-lengths 2mm and 40mm, respectively, aperture angles 60° and 10°, respectively, stereo-
bases 25cm; c) two independent stereo-cameras, focal-lengths 2mm and 40mm, respectively,
aperture angles 60° and 10°, respectively, stereo-bases 25cm. All cameras are ideal, based on
the pinhole camera model neglecting lens distortion and quantization effects. Gaussian vision
sensor noise with a standard deviation of 1 pixel is considered. Dead-reckoning errors are
taken from experiments with the humanoid robot JOHNNIE.
The navigation performance is rated assessing the localization accuracy. Therefore, the
covariance matrix of the robot position is evaluated computing the areas of the 90%-
confidence ellipses of the footstep position uncertainties of the humanoid robot. Figure 7
contains a cut-out of Figure 6 showing the planned and real paths, the path estimated by the
Kalman-filter, the foot step positions, and their covariance ellipses. It is noted that due to
dead-reckoning errors the real path deviates increasingly from the planned path as
locomotion control is open loop. The estimated path follows the real path well.
Figure 8 and Figure 9 show the resulting view directions for each step of the robot for the
individual vision systems. The propagations of the areas of the confidence ellipses are
shown in Figure 10. Table 1 shows a comparison of the means of the confidence ellipse areas
and the average number of landmarks visible for the vision systems for all scenarios. These
results are discussed in the following.
considers the lower landmarks. Most of the time only one landmark is visible and used
for localization.
The foveated vision systems provide much more flexibility. Due to the large field of view of
the wide-angle device more landmarks are detected in the initial position to be considered
by the planner. So, at each step two or more landmarks are used for localization whereas at
least one landmark is focused with telephoto cameras resulting in a significantly higher
certainty of the estimated footstep positions of the humanoid robot as shown in Figure 10
and Table 1.
Fig. 9. Pan-angles of foveated vision system with two independent stereo cameras; a)
telephoto and b) wide-angle stereo-camera.
Fig. 10. Comparison of the areas A90 of the 90%-confidence ellipses of the footstep position
covariance matrix using a conventional and a foveated vision system.
The map accuracy is illustrated in Figure 12 by the error ellipsoids for each observed feature
for the final map. It is obvious that map accuracy grows as the planning horizon becomes
larger. Also more features are observed if gaze direction control is used. From the final map
acquired in the case of a three-step planning horizon with gaze direction control, it becomes
clear that the proposed approach balances well by observing a large number of features and
also building an accurate map.
Fig. 12. Map accuracy illustrated by the error ellipsoids of each observed feature for the final
map; a) one-step planning horizon without gaze direction control, b) one-step, and c) three-
step planning horizons with gaze control. The estimated robot trajectory is illustrated by the
black lines, while the red triangle on-top of the robot represents the active head and its gaze
direction.
Figure 13 shows the reduction of the entropy over time. Each time a new feature is observed the entropy is reduced; for that reason the curve is step-shaped. The greedy approaches need more time to reduce the entropy, and the larger the planning horizon is, the more the entropy is reduced. Furthermore, when the planning horizon is small, more time is needed to observe the same number of features. Without gaze direction control the entropy is not satisfactorily reduced. This results from the fact that the gaze direction control module mostly directs the sensor system towards already observed and more certain features when the environment is known. Therefore, the localization error and the feature position uncertainties are kept at a minimum.
The results show that the proposed approach combining gaze direction control and motion
planning based on information theoretic concepts for the exploration task, gives superior
results in comparison to greedy approaches and approaches that neglect active sensor
control.
Fig. 13. Entropy measure log(|Pk+1|k+1|) over time for a one-step planning horizon without
gaze direction control, one-step and three-step planning horizons with gaze direction
control.
6. Conclusions
This chapter presents a foveated active vision planning concept for robot navigation in
order to overcome drawbacks of conventional active vision when trading field of view
versus measurement accuracy. This is the first approach of task-related control of foveated
active vision in the context of humanoid robots as well as localization and mapping. In a
typical robot navigation scenario the benefits of foveated active vision have been
demonstrated: an improved localization accuracy combined with an extended visible field
compared to conventional active vision. Since a generic information maximization principle has been used, the gaze control strategy is generalizable to other scenarios, depending on the definition of the task-related information measures, thereby allocating vision sensor resources optimally.
The second part of this chapter has presented novel concepts of coordinating gaze direction
and locomotion planning. Through this planning procedure, all possible trajectories and
viewing angles over a specific time horizon are anticipated. This results in significantly
improved performance compared to known approaches, which neglect the limitations of the
sensors. It is demonstrated that a very accurate environmental model is autonomously
produced even if very inaccurate robot and sensor models are used. Such a scheme is ideal
for vision-based humanoid robots.
The proposed approaches assume time-invariant scenarios. Path planning and gaze control methods for dynamically changing environments are not yet covered and are the subject of ongoing research.
7. Acknowledgments
The authors would like to gratefully thank Dr. Javier Seara and Klaus Strobl-Diestro for reference
simulation code for performance comparison. This work has been supported in part by
the German Research Foundation (DFG) grant BU-1043/5-1 and the DFG excellence
initiative research cluster Cognition for Technical Systems - CoTeSys, see also
www.cotesys.org.
8. References
Apostoloff, N.; Zelinsky, A. (2002). Vision in and out of vehicles: Integrated driver and road
scene monitoring, Proceedings of the 8th International Symposium on Experimental
Robotics (ISER), 2002, Sant Angelo d’Iscia, Italy
Beis, J. F. & Lowe, D. G. (1997). Shape Indexing Using Approximate Nearest-Neighbour
Search in High-Dimensional Spaces, Proceedings of the 1997 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 1997
Bodor, R.; Morlok, R. & Papanikolopoulos, N. (2004). Dual-camera system for multi-level
activity recognition, Proceedings of the 2004 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), 2004, pp.643-648, Sendai, Japan
Bourgault, F.; Makarenko, A. A.; Williams, S. B.; Grocholsky, B. & Durrant-Whyte, H. (2002).
Information Based Adaptive Robotic Exploration, Proceedings of the IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS), Lausanne,
Switzerland
Bryson, M. & Sukkarieh, S. (2005). An Information-Theoretic Approach to Autonomous
Navigation and Guidance of an Uninhabited Aerial Vehicle in Unknown
Environments, Proceedings of the IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), Edmonton, Canada
Brooks, R. A.; Breazeal, C.; Marjanovic, M.; Scasselati, B. & Williamson, M. M. (1999). The Cog
Project: Building a Humanoid Robot, In: Computation for Metaphors, Analogy, and
Agents, C. Nehaniv, (Ed.), pp. 52-87, Springer, Heidelberg, Berlin, Germany
Darrell, T. (1997). Reinforcement learning of active recognition behaviors, Interval Research
Technical Report 1997-045,http://www.interval.com/papers/1997-045, 1997
Davis, J. & Chen, X. (2003). Foveated observation of shape and motion. Proceedings of the
2003 IEEE International Conference on Robotics and Automation (ICRA), pp. 1001-
1005, 2003, Taipei, Taiwan
Davison, A. J. & Murray, D. W. (2002). Simultaneous Localization and Map-Building Using
Active Vision, In IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 24, No.7, pp. 865-880, 2002
Davison, A. J. (2003). Real-time simultaneous localization and Mapping with a single
camera, Proceedings of the International Conference on Computer Vision, 2003, pp.
1403-1410, Nice
Dickmanns, E. D. (2003). An advanced vision system for ground vehicles. Proceedings of the
International Workshop on In-Vehicle Cognitive Computer Vision Systems
(IVC2VS), 2003, Graz, Austria
Dissanayake, G.; Newman, P.; Clark, S.; Durrant-Whyte, H. F. & Csorba, M. (2001). A
Solution to the Simultaneous Localization and Map Building Problem, IEEE
Transactions on Robotics and Automation, 2001, pp. 229-241
Doucet, A.; de Freitas, N.; Murphy, K. & Russell, S. (2000). Rao-blackwellised particle
filtering for dynamic bayesian networks, Proceedings of the 16th Conference on
Uncertainty in Artificial Intelligence, 2000
Elder, J. H.; Dornaika, F.; Hou, B. & Goldstein, R. (2004). Attentive wide-field sensing for
visual telepresence and surveillance, In: Neurobiology of Attention, L. Itti, G. Rees
& J. Tsotsos, (Eds.), 2004, Academic Press, Elsevier
Eustice, R.; Pizarro, O. & Singh, H. (2004). Visually augmented navigation in an
unstructured environment using a delayed state history, Proceedings of the IEEE
International Conference on Robotics and Automation, 2004, pp. 25-32, New
Orleans, USA
Eustice, R.; Walter, M. & Leonard, J. (2005). Sparse extended information filters: insights into
sparsification, Proceedings of the IEEE/RSJ International Conference On Intelligent
Robots and Systems, 2005, pp. 641-648, Sendai, Japan
Feder, H. J. S.; Leonard J. J. & Smith, C. M. (1999). Adaptive Mobile Robot Navigation and
Mapping, International Journal of Robotics and Research, Vol.18, No.7, pp. 650-668,
1999
Seara, J. F.; Strobl, K. H.; Martin E. & Schmidt, G. (2003). Task-oriented and situation-
dependent gaze control for vision guided autonomous walking, Proceedings of the
IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2003,
München and Karlsruhe, Germany
Grabowski, R.; Khosla, P. & Choset, H. (2003). Autonomous exploration via regions of
interest, Proceedings of the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), 2003, Las Vegas, USA
Gutmann, J.-S. & Konolige, K. (1999). Incremental Mapping of Large Cyclic Environments,
Proceedings of International Symposium on Computational Intelligence in
Robotics and Automation (CIRA), 1999, Monterey, CA, USA
Horaud, R.; Knossow, D. & Michaelis, M. (2006). Camera cooperation for achieving visual
attention, Machine Vision and Applications, Vol. 15, No. 6, 2006, pp. 331-342
Huang, S.; Kwok, N. M.; Dissanayake, G.; Ha, Q. P. & Fang, G. (2005). Multi-Step Look-
Ahead Trajectory Planning in SLAM: Possibility and Necessity, Proceedings of the
International Conference on Robotics and Automation (ICRA), 2005, pp. 1103-1108,
Barcelona, Spain
Itti, L. & Koch, C. (2001). Computational modeling of visual attention, Nature Reviews
Neuroscience, Vol. 2, No. 3, 2001
Jankovic, N. D.; Naish, M. D. (2005). Developing a modular spherical vision system,
Proceedings of the 2005 IEEE International Conference on Robotics and
Automation (ICRA), pp. 1246-1251, 2005, Barcelona, Spain
Jung, I. & Lacroix, S. (2003). High resolution terrain mapping using low altitude aerial stereo
imagery, Proceedings of the IEEE International Conference on Computer Vision,
2003, pp. 946-951
Karlsson, N.; Goncalves, L.; Munich M. E. & Pirjanian, P. (2005). The vSLAM Algorithm for
Robust Localization and Mapping, Proceedings of the IEEE International
Conference on Robotics and Automation (ICRA), 2005, Barcelona, Spain
Kim J. & Sukkarieh S. (2003). Airborne Simultaneous Localisation and Map Building,
Proceedings of the IEEE International Conference on Robotics and Automation
(ICRA), 2003, Taipei, Taiwan
Koch, C. & Ullman, S. (1984). Selecting one among the many: A simple network
implementing shifts in visual attention. MIT AI Memo No. 770, 1984
Koenig, S. & Tovey, C. (2003). Improved analysis of greedy mapping, Proceedings of the
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2003,
Las Vegas, USA
Kühnlenz, K.; Bachmayer, M. & Buss, M. (2006). A multi-focal high-performance vision
system, Proceedings of the 2006 IEEE International Conference on Robotics and
Automation (ICRA), pp. 150-155, 2006, Orlando, FL, USA
Kühnlenz, K. (2006). Aspects of multi-focal vision, Ph.D. Thesis, Institute of Automatic
Control Engineering, Technische Universität München, 2006, Munich, Germany
Kwok, N. M. & Dissanayake, G. (2004). An efficient multiple hypothesis filter for bearing-
only SLAM, Proceedings of the IEEE/RSJ International Conference on Intelligent
Robots and Systems, 2004, pp. 736-741
Leonard, J. J.; Feder H. J. S. (2000). A Computationally Efficient Method for Large-Scale
Concurrent Mapping and Localization, Proceedings of the 9th International
Symposium on Robotics Research, 2000, pp. 169-176, Salt Lake City, Utah
Leonard, J. J.; Rikoski, R. J.; Newman, P. M. & Bosse M. (2002) The International Journal of
Robotics Research, Vol. 21, No. 10-11, 2002, pp. 943-975
Lidoris, G.; Kühnlenz, K.; Wollherr, D. & Buss, M. (2006). Information-Based Gaze Direction
Planning Algorithm for SLAM, Proceedings of IEEE-RAS International Conference
on Humanoid Robots (HUMANOIDS), 2006, Genova, Italy
Lidoris, G.; Kühnlenz, K.; Wollherr, D. & Buss, M. (2007). Combined trajectory planning and
gaze direction control for robotic exploration, Proceedings of the International
Conference of Robotics and Automation (ICRA), 2007, Rome, Italy
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints, International
Journal of Computer Vision, Vol. 60, No. 2, 2004, pp. 91-110
Lu, F. & Milios, E. (1997). Globally Consistent Range Scan Alignment for Environment
Mapping, Autonomous Robots, Vol. 4, 1997, pp. 333-349
Maurer, M.; Behringer, R.; Furst, S.; Thomanek, F. & Dickmanns, E. D. (1996). A compact
vision system for road vehicle guidance, Proceedings of the 13th International
Conference on Pattern Recognition (ICPR), 1996
Montemerlo, M.; Thrun, S.; Koller, D. & Wegbreit, B. (2002). FastSLAM: A Factored Solution
to Simultaneous Localization and Mapping, Proceedings of the National
Conference on Artificial Intelligence (AAAI), Edmonton, Canada
Moorehead, S. J.; Simmons, R. & Whittaker, W. L. (2001). Autonomous exploration using
multiple sources of information, Proceedings of IEEE International Conference on
Robotics and Automation (ICRA), 2001, Seoul, Korea
Ozawa, R.; Takaoka, Y.; Kida, K.; Nishiwaki, J.; Chestnutt, J.; Kuffner, J.; Kagami, S.;
Mizoguchi, H. & Inoue, H. (2005). Using visual odometry to create 3D maps for
online footstep planning, Proceedings of IEEE International Conference on
Systems, Man and Cybernetics (SMC), 2005, pp. 2643-2648, Hawaii, USA
Pellkofer, M. & Dickmanns, E. D. (2000). EMS-Vision: Gaze control in Autonomous vehicles,
Proceedings of the IEEE Intelligent Vehicles Symposium, pp. 296-301, 2000,
Dearborn, MI, USA
Sabe, K.; Fukuchi, M.; Gutmann, J. S.; Ohashi, T; Kawamoto, K. & Yoshigahara, T. (2004).
Obstacle Avoidance and path planning for humanoid robots using stereo vision,
Vertical Jump: Biomechanical Analysis and Simulation Study
1. Introduction
Vertical jumping is a complex task requiring quick and harmonized coordination of the jumper's body segments, first for the push-off, then for the flight and lastly for the landing. The prime criterion for vertical jump efficiency is the height of the jump, which depends on the speed of the jumper's center of gravity (COG) at the moment the feet leave the ground. Besides maintaining balance, the task of the muscles during the push-off phase of the jump is to accelerate the body's COG vertically upward to the extended body position. During the push-off phase of the jump, the jumper's center of gravity must remain above the supporting polygon formed by the feet (Babič et al., 2001). In contrast to humans, today's humanoid robots are mostly unable to perform fast movements such as the vertical jump. They can mostly perform only slow and statically stable movements that do not imitate human motion. Moreover, these slow and statically stable movements are energy inefficient. An examination of the anatomy and biomechanics of the human body shows that, apart from their shape, most of today's humanoid robots and the human body have few properties in common. To achieve a better imitation of human motion and the ability to perform fast movements such as the vertical jump or running, properties and particularities beyond the shape of the body should be considered in the design of a humanoid robot.
The lower extremities of today's humanoid robots are mostly serial mechanisms with simple rotational joints that are driven directly or indirectly by electrical servo drives. Such a mechanism design allows only rotational motion in the joints. This means that translations of the robot's center of gravity result solely from the transformation of joint rotations into translations of the center of gravity. This transformation is a handicap especially in ballistic movements such as fast running or jumping, where the robot's center of gravity must be accelerated from low or zero velocity to a velocity as high as possible. The transfer of the angular motion of the lower extremity segments to the desired translational motion of the robot's center of gravity becomes less effective the more the joints are extended. When a joint is fully extended, its effect on the translational motion of the robot's center of gravity in a certain direction equals zero. Moreover, the motion of the segments should decelerate to zero prior to full extension to prevent possible damaging hyperextension. Where relatively large segments that may contain considerable amounts of rotational energy are involved, high power is necessary to decelerate the angular motion.
The servo drives that actuate the joints of a humanoid robot model the group of monoarticular muscles, which span a single joint and thus produce rotational motion only in that joint. Besides the monoarticular muscles that span only one joint, the human lower extremity contains another group of muscles that span two joints. These muscles are the so-called biarticular muscles. Their functions during human movement have been studied extensively by many researchers. One such function is the transportation of mechanical energy from proximal to distal joints. It is believed that this transportation enables an effective transformation of the rotational motion of body segments into translation of the body centre of gravity (Schenau, 1989). The gastrocnemius is a biarticular muscle that spans the knee and the ankle joints and acts as a knee flexor and ankle extensor (see Fig. 1).
Fig. 1. Biarticular muscle gastrocnemius passing the knee and the ankle joints. It acts as a
knee flexor and ankle extensor. Gastrocnemius muscle is connected to the foot by an elastic
tendon.
In jumping, the activation of the biarticular gastrocnemius muscle prior to the end of the
push-off enables the transportation of the power generated by the knee extensors from the
knee to the ankle joint. This transfer of mechanical energy by gastrocnemius can be
explained using the following example. During the push-off phase of the jump, the knee
joint is rapidly extended as a result of the positive work done by the knee extensor muscles.
If the biarticular gastrocnemius muscle contracts isometrically (its length does not change),
additional mechanical work is done at the ankle joint through the gastrocnemius muscle, even though the muscle itself contributes no mechanical work. A part of the energy generated by
the knee extensors appears as mechanical work at the ankle joint and the height of the jump
is significantly increased. This is because, as the jump proceeds and the knee straightens, the
angular position changes of the knee have progressively less effect on vertical velocity of the
jumper’s centre of gravity. By gastrocnemius muscle activation, a rapid extension of the foot
is produced. This extension has a greater effect on the vertical velocity than the extension of
the almost straightened knee. The energy is more effectively translated into vertical velocity
and a greater height of the jump is achieved. However, the timing of the gastrocnemius
muscle activation is critical to obtain a maximum effect. This was demonstrated by an
articulated physical model of the vertical jump by Bobbert et al. (1986).
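The power-transporting role described above can be illustrated numerically. The following sketch uses hypothetical force and moment-arm values, and a simplified sign convention, to show that an isometric biarticular link delivers at the ankle exactly the power it absorbs at the knee, doing no net work itself; it is an illustration, not the chapter's model:

```python
# Hypothetical values; the chapter's model parameters are not reproduced here.
F = 1500.0        # tensile force in the biarticular link [N]
r_knee = 0.02     # moment arm of the link at the knee [m]
r_ankle = 0.05    # moment arm of the link at the ankle [m]
w_knee = 8.0      # knee extension rate during push-off [rad/s]

# Isometric link: its length does not change, so the joint rates are coupled
# (in magnitude): r_knee * w_knee = r_ankle * w_ankle
w_ankle = r_knee / r_ankle * w_knee

p_knee = F * r_knee * w_knee      # power the link absorbs from knee extension [W]
p_ankle = F * r_ankle * w_ankle   # power the link delivers to ankle extension [W]
print(p_knee, p_ankle)            # identical: energy is transported, not generated
```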
Besides biarticularity, the gastrocnemius muscle has one more interesting feature. It is
connected to the foot by an elastic tendon (see Fig. 1). The elasticity in the muscle fibers and
tendons plays an important role in enhancing the effectiveness and the efficiency of human
performance. An enhanced performance of human motion has been most effectively
demonstrated for jumping and running (Cavagna, 1970; Bobbert et al., 1996; Shorten, 1985).
An important feature of elastic tissues is the ability to store elastic energy when stretched
and to release this energy afterwards as mechanical work (Asmussen and Bonde-Petersen, 1974). Beside this feature, oscillatory movements performed at the natural frequency of the muscle-tendon complex could maximize performance. A countermovement vertical
jump can be treated as one period of oscillatory movement and from this point of view the
natural frequency, as well as parameters that define the natural frequency, can be
determined.
The purpose of our research was twofold. First, to analyse the human vertical jump and to
show that for each and every subject there exists an optimal triceps surae muscle-tendon
complex stiffness that ensures the maximal possible height of the vertical jump. We defined
the influence of the m. gastrocnemius activation timing and the m. gastrocnemius and
Achilles tendon stiffness on the height of the vertical jump and established the methodology
for analysis and evaluation of the vertical jump. We monitored kinematics, dynamics and m.
gastrocnemius electrical activity during the maximum height countermovement jump of
human subjects and measured viscoelastic properties of the m. gastrocnemius and Achilles
tendon using the free-vibration technique. Based on the findings of the biomechanical study
of the human vertical jump we performed a simulation study of the humanoid robot vertical
jump. As a result of our research we propose a new human inspired structure of the lower
extremity mechanism by which a humanoid robot would be able to efficiently perform fast
movements such as running and jumping.
Fig. 2 shows the planar model of the musculoskeletal system, composed of four rigid bodies
that represent the foot, shank, thigh and trunk together with the head and both upper
extremities. The origin of the base coordinate system is in the center of the virtual joint that
connects the foot with the ground.
on the ground from the origin of the base coordinate system in time t.
2. The motion controller assures the desired vertical movement of the body's COG relative to the ankle, $y_{TA}^d(t)$. By controlling the movement of the body's COG relative to the ankle, we excluded the influence of the biarticular link on the motion $y_{TA}^d(t)$. Therefore the parameters of the biarticular link can be varied and optimized for a certain desired motion of the body's COG relative to the ankle.
3. The motion controller assures a constant angle of the foot relative to the ground q1 before the biarticular link activation occurs. Thus the number of degrees of freedom of the model remains constant during the push-off phase of the vertical jump.
4. At the moment the biarticular link activates, the motion controller sets the torque in the ankle joint Q2 to zero and thus enables free motion of the foot relative to the ground. By setting the torque in the ankle joint to zero, the motion in the ankle joint becomes a function only of the motion in the knee joint, which is transmitted to the ankle by the biarticular link.
The motion controller that considers requirement (5) and enables the desired vertical motion of the COG relative to the ankle $y_{TA}^d(t)$ in the base coordinate system is

$\ddot{x}_T^c = \begin{bmatrix} 0 \\ \ddot{y}_{TA}^d + \ddot{y}_A \end{bmatrix} + k_p \left( \begin{bmatrix} 0 \\ y_{TA}^d + y_A \end{bmatrix} - x_T \right) + k_d \left( \begin{bmatrix} 0 \\ \dot{y}_{TA}^d + \dot{y}_A \end{bmatrix} - \dot{x}_T \right)$,   (6)
$\dot{x}_T^c = J_T \dot{q}^c$,   (7)
where $J_T$ is the Jacobian matrix of the COG speed in the base coordinate system. Equation (7) represents an underdetermined system of equations. From the requirement that the motion controller assures a constant angle of the foot relative to the ground q1 before the biarticular link activation occurs follows the condition

$\dot{q}_1^c = 0$.   (8)
An additional condition that removes the underdetermination of (7) is the relationship between the knee and hip joint angles

$\dot{q}_4^c = n \cdot \dot{q}_3^c$,   (9)
where n is the coefficient that describes the relationship.
By substitution of (8) and (9) into (7) we get a relation between the vector of the control speed of the COG in the base coordinate system $\dot{x}_T^c$ and the vector of the control angular velocities in the ankle and knee joints $\dot{q}'^c$

$\dot{x}_T^c = J'_T \dot{q}'^c$,   (10)
where $J'_T$ is a new Jacobian matrix of the center of gravity speed in the base coordinate system. Differentiation of (10) with respect to time yields

$\ddot{q}'^c = \begin{bmatrix} \ddot{q}_2^c \\ \ddot{q}_3^c \end{bmatrix} = J_T'^{-1} \left( \ddot{x}_T^c - \dot{J}'_T \dot{q}' \right)$,   (11)

where

$\dot{q}' = \begin{bmatrix} \dot{q}_2 \\ \dot{q}_3 \end{bmatrix}$.   (12)
On the basis of conditions (8) and (9) and relation (11) we define control angular
accelerations in all four joints
$\ddot{q}^c = \begin{bmatrix} 0 \\ \ddot{q}_2^c \\ \ddot{q}_3^c \\ n\,\ddot{q}_3^c \end{bmatrix}$.   (13)
By substitution of (13) into a system of dynamic equations of motion (4) we get control
torques in joints Qmov that we need to control the model of the musculoskeletal system
$Q_{mov} = H(q)\ddot{q}^c + h(q, \dot{q}) + G(q)$.   (14)
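A compact way to read equations (6)-(14) is as a task-space PD law resolved into joint accelerations and fed through the inverse dynamics. The sketch below assumes the model quantities (the reduced Jacobian $J'_T$ and its time derivative, and the dynamics terms H, h, G) are available as callables; it illustrates the order of computation and is not the authors' implementation:

```python
import numpy as np

def control_torques(q, qd, x_T, xd_T,
                    y_TA_d, yd_TA_d, ydd_TA_d,   # desired COG motion relative to the ankle
                    y_A, yd_A, ydd_A,            # ankle height and its derivatives
                    J_T_prime, Jdot_T_prime,     # reduced Jacobian and its time derivative (2x2)
                    H, h, G, kp, kd, n):
    """Sketch of the motion controller, eqs. (6)-(14). Joint order q = [q1, q2, q3, q4]."""
    # Eq. (6): commanded COG acceleration from a task-space PD law
    xdd_T_c = (np.array([0.0, ydd_TA_d + ydd_A])
               + kp * (np.array([0.0, y_TA_d + y_A]) - x_T)
               + kd * (np.array([0.0, yd_TA_d + yd_A]) - xd_T))
    # Eqs. (11)-(12): resolve into ankle and knee acceleration commands
    qd_prime = qd[1:3]
    qdd_prime_c = np.linalg.solve(J_T_prime, xdd_T_c - Jdot_T_prime @ qd_prime)
    # Eq. (13): full joint-acceleration command (q1 held fixed, q4 = n*q3)
    qdd_c = np.array([0.0, qdd_prime_c[0], qdd_prime_c[1], n * qdd_prime_c[1]])
    # Eq. (14): inverse dynamics yields the control torques
    return H(q) @ qdd_c + h(q, qd) + G(q)
```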
The direct dynamic model of the musculoskeletal system together with the motion controller composes the biorobotic model of the vertical jump. The simulation diagram for the simulation of the vertical jump is shown in Fig. 3. Inputs into the simulation diagram are the desired trajectory of the COG relative to the ankle $y_{TA}^d$ and a signal for biarticular link activation a. Output from the simulation diagram is the vector of the body's COG position xT.
Fig. 3. Simulation diagram for the simulation of the vertical jump. Inputs into the simulation
diagram are the desired trajectory of the COG relative to the ankle and a signal for
biarticular link activation. Output from the simulation diagram is the vector of jumper’s
COG position.
toes and in the erected position. During each jump, the positions of anatomical landmarks over the epicondylus lateralis and the fifth metatarsophalangeal joint were monitored, ground reaction forces were measured, and the electromyogram of the m. gastrocnemius was recorded. After the
jumping, we determined viscoelastic properties of the triceps surae muscle tendon complex
of all subjects. Details on methods and procedures are provided in the following sections.
where m is the body mass of the subject and $y_T^m(0)$ is the initial height of the body's COG relative to the ground. To determine the vertical position of the body's COG relative to the ankle at time t, $y_{TA}^m(t)$, we measured the motion of the ankle during the vertical jump by means of the contactless motion capture system (eMotion Smart). The vertical position of the body's COG relative to the ankle is therefore

$y_{TA}^m(t) = y_T^m(t) - y_A^m(t)$,   (16)

where $y_A^m(t)$ is the vertical position of the ankle at time t.
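Equation (15) itself is not reproduced in this excerpt; the quantities mentioned around it (body mass m, the initial COG height, and the measured ground reaction forces) suggest the standard force-plate procedure of double-integrating the net vertical acceleration. The sketch below follows that standard procedure purely as an illustration and then applies eq. (16); it is not necessarily the authors' exact formulation:

```python
import numpy as np

def cog_relative_to_ankle(F_z, y_A, m, y_T0, dt, g=9.81):
    """Illustrative reconstruction of y_TA^m(t) via eq. (16).

    F_z  : sampled vertical ground reaction force [N]
    y_A  : sampled vertical ankle position from motion capture [m]
    m    : body mass [kg];  y_T0 : initial COG height [m];  dt : sample time [s]

    The COG height y_T^m(t) is obtained here by double integration of the net
    vertical acceleration (an assumed, standard force-plate method).
    """
    a = F_z / m - g                   # net vertical COG acceleration
    v = np.cumsum(a) * dt             # COG velocity (subject starts at rest)
    y_T = y_T0 + np.cumsum(v) * dt    # COG height above the ground
    return y_T - y_A                  # eq. (16): COG position relative to the ankle
```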
3.4 Electromyography
The activity of the m. gastrocnemius was recorded using a pair of surface electrodes placed over the medial head of the m. gastrocnemius. Analog EMG signals were amplified and filtered with a band-pass filter with cut-off frequencies of 1 Hz and 200 Hz. The signals were then digitized at a 1000 Hz sampling frequency and full-wave rectified. To reduce the
variability of sampled EMG signal, the signal was then smoothed with a digital low-pass
Butterworth filter. Finally the EMG signal was normalized with respect to the maximum
value attained during the vertical jump.
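The processing chain described above maps directly onto a few standard signal-processing calls. A minimal sketch (here the band-pass stage is realised digitally, and the low-pass cut-off and filter orders are assumptions, since the chapter does not state them):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def process_emg(raw, fs=1000.0, lowpass_hz=6.0):
    """Band-pass 1-200 Hz, full-wave rectify, smooth, and normalise an EMG record."""
    nyq = fs / 2.0
    b_bp, a_bp = butter(2, [1.0 / nyq, 200.0 / nyq], btype='band')
    bandpassed = filtfilt(b_bp, a_bp, raw)        # digital stand-in for the analog band-pass
    rectified = np.abs(bandpassed)                # full-wave rectification
    b_lp, a_lp = butter(2, lowpass_hz / nyq, btype='low')
    envelope = filtfilt(b_lp, a_lp, rectified)    # smoothing (linear envelope)
    return envelope / np.max(envelope)            # normalise to the maximum during the jump
```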
3.7 Results
To determine the optimal timing of the biarticular link activation that results in the highest
vertical jump, a series of countermovement vertical jump simulations has been performed for each subject. Each simulation was performed with a different timing of the
biarticular link activation.
All subjects activated their m. gastrocnemius slightly before the optimal moment determined by means of simulations. On average, the difference between the optimal and measured knee angle at the moment the m. gastrocnemius was activated was 6.4 ± 2.22˚. Because the dynamic model of the musculoskeletal system does not include the monoarticular muscle soleus, the measured heights of the jumps were higher than the jump heights determined with the simulations by 4.3 ± 1.12 % on average. The primary motive for omitting the m. soleus from the modelling is that we wanted to control the motion of the body's COG relative to the ankle so that the parameters of the biarticular link could be varied and optimized for a certain measured motion of the subject's COG relative to the ankle. If the dynamic model of the musculoskeletal system included the m. soleus, the motion of the ankle would be fully determined by the force of the m. soleus and we would not be able to control it with regard to the desired motion of the body's COG relative to the ankle. Moreover, if the dynamic model of the musculoskeletal system included the m. soleus, the force produced by the m. soleus would be another input into the biomechanical model of the vertical jump and we would have to accurately measure the force produced by the m. soleus
of subjects performing the vertical jump. An additional cause for the differences
between the measured heights of the jumps and the jump heights determined by
means of simulations can be the simplified model of the foot that we modeled as one
rigid body. The arch of the foot is linked up by elastic ligaments that can store elastic
energy when deformed and later reutilize it as the mechanical work (Alexander,
1988). Ker et al. (1987) measured the deformation of the foot during running and
determined the amount of the energy stored during the deformation. They showed
that the elasticity of the foot contributes significantly to the efficiency of human
movement. To be able to compare the measurements and the simulation results, we
corrected the simulation results by adding the contribution of the m. soleus to the
height of the vertical jump. Such corrected heights of the vertical jumps at the optimal
moments of m. gastrocnemius activation are insignificantly larger than the measured
heights of the vertical jumps for all subjects. On average the height difference is only
1.6 ± 0.74 cm.
To determine the optimal stiffness of the Achilles tendon with regard to the height of the vertical jump, a series of countermovement vertical jump simulations has been
performed, each with a different stiffness of the Achilles tendon. Optimal timing of the
biarticular link has been determined for each stiffness of the Achilles tendon as described in
the previous paragraph. The measured values of the Achilles tendon stiffness for all subjects
were always higher than the optimal values determined by means of simulations. By
considering the elastic properties of the arch of the foot, we can assume that the optimal
values of the Achilles tendon stiffness would increase and therefore the differences between
the optimal and measured values would decrease.
Results of the measurements, simulations and optimizations of the human vertical jumps
are presented in Fig. 4. Subjects are arranged with regard to the ratio between the optimal
stiffness of the Achilles tendon determined by means of simulations and the measured
stiffness of the Achilles tendon. If we want to evaluate the contribution of the viscoelastic
properties to the height of the jump, ranking of the subjects with regard to the height of the
jump is not appropriate because the main parameter that influences the height of the vertical
jump is the power generated by muscles during the push-off phase of the jump. The
elasticity of the Achilles tendon has the secondary influence on the height of the jump.
Results show that for the same power generated by an individual subject during the push-
off phase of the jump, the height of the vertical jump varies with the Achilles tendon
stiffness. Therefore the appropriate criterion for ranking the subjects has to consider the
elasticity of the Achilles tendon.
Fig. 4. Results of the measurements, simulations and optimizations of the vertical jumps.
Whole bars represent the maximal heights of the vertical jumps achieved with the optimal
values of the Achilles tendon stiffness and determined by means of simulations. The dark
shaded parts of the bars represent the measured heights of the vertical jumps while the light
shaded tops represent the differences between the maximal and measured heights of the
vertical jumps. The differences are also shown as percentages relative to the measured
heights. Bold line represents the ratio between the optimal stiffness of the Achilles tendon
determined by means of simulations and the measured stiffness of the Achilles tendon for
each subject.
Fig. 5. An example trajectory of the robot center of gravity during the countermovement
vertical jump. Time interval between 0s and t2 represents the push-off phase of the jump
while the time interval between t2 and t3 represents the flight phase of the jump. t1 is the
time when robot center of gravity starts to decelerate in the downward direction.
The height of the vertical jump can be calculated with the following equation

$y_{T\max} = y_{TA}(t_2) + y_A(t_2) + \dfrac{\left( \dot{y}_{TA}(t_2) + \dot{y}_A(t_2) \right)^2}{2g}$.   (17)
As evident from (17), there are four parameters whose values in time t2 influence the height
of the vertical jump:
the height of the robot center of gravity relative to the ankle joint $y_{TA}(t_2)$,
the height of the ankle joint relative to the ground $y_A(t_2)$,
the velocity of the robot center of gravity relative to the ankle joint $\dot{y}_{TA}(t_2)$, and
the velocity of the ankle joint relative to the ground in the vertical direction $\dot{y}_A(t_2)$.
The height and the velocity of the robot center of gravity relative to the ankle joint in time t2
depend only on the power of the actuators in the hip and knee joints while the height and
the velocity of the ankle joint relative to the ground in time t2 is the consequence of the feet
motion dictated by the biarticular link. Therefore the height and the velocity of the ankle
joint relative to the ground in time t2 depend on the biomechanical parameters of the
biarticular link and its activity during the push-off.
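Equation (17), as reconstructed above, can be evaluated directly once the four take-off quantities are known. A small numeric sketch with hypothetical take-off values:

```python
def jump_height(y_TA_t2, y_A_t2, yd_TA_t2, yd_A_t2, g=9.81):
    """Eq. (17): maximal COG height reached after take-off at time t2."""
    return y_TA_t2 + y_A_t2 + (yd_TA_t2 + yd_A_t2) ** 2 / (2.0 * g)

# Hypothetical take-off state: COG 1.00 m above the ankle rising at 2.6 m/s,
# ankle 0.12 m above the ground rising at 0.6 m/s.
print(jump_height(1.00, 0.12, 2.6, 0.6))  # about 1.64 m peak COG height
```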
Fig. 6. Jump height as a function of the knee angle when the biarticular link is activated.
Biarticular link activation moment is presented with the knee angle when the biarticular link
is activated. The highest jump has been achieved when the biarticular link was activated in
the moment when the knee angle was approximately -84°. With this optimal timing, the
robot jumped almost twice a high as when the biarticular link was not actuated or was
robot jumped almost twice as high as when the biarticular link was not actuated or was actuated when the knee was almost fully extended. If the biarticular link activates too early in the push-off phase, the robot loses ground contact too early, while its center of gravity
phase, the rotational energy of the thigh and shank increases uselessly. The optimal timing
represents a compromise between these two cases.
Fig. 8. Changes of vertical jump height in relation with variations of biarticular link stiffness
k, biarticular link damping b and ratio between the moment arms of the biarticular link rC/rB.
The multiplier represents relative changes of the parameter values.
As the sensitivity analysis refers to the model with optimal biarticular link stiffness, both
positive and negative change of the biarticular link stiffness k cause a decrease of the jump
height. An increase of the ratio between the moment arms of the biarticular link rC/rB causes
an increase of the jump height. A higher ratio between the moment arms means a higher angular speed in the ankle joint as a result of the biarticular link activation. Besides the increase of the angular speed in the ankle joint, the force in the biarticular link also increases. Variations of the biarticular link damping b have only a negligible effect on the
height of the vertical jump.
5. Discussion
By building an efficient biorobotic model which includes an elastic model of the biarticular
muscle gastrocnemius and by simulation of the vertical jump we have demonstrated that
biarticular links contribute a great deal to the performance of the vertical jump. Besides, we
have shown that timing of the biarticular link activation and stiffness of the biarticular link
considerably influence the height of the jump.
Methodology and results of our study offer an effective tool for the design of the humanoid
robot capable of performing vertical jumps. We propose a new human inspired structure of
the lower extremity mechanism that includes an elastic biarticular link and by which a
humanoid robot would be able to efficiently perform fast movements such as running and
jumping. However, it has to be considered that this study deals only with one biarticular
link. Although the biarticular link we included in our study has the most distinctive elastic
properties among all biarticular muscles of the human leg, other biarticular muscles such as
the rectus femoris and the hamstrings should be included in humanoid robot design. A
special challenge would be to design a humanoid lower extremity that includes all
biarticular muscles of the human lower extremity and to demonstrate their joint effect on the
vertical jump performance.
6. Acknowledgement
This investigation was supported by the Slovenian Ministry of Education, Science and Sport.
7. References
Alexander, R.McN. (1988). Elastic mechanisms in animal movement, Cambridge University
Press, Cambridge
Asada, H. & Slotine, J.J.E. (1986). Robot analysis and control, John Wiley and sons, New York
Asmussen, E. & Bonde-Petersen, F. (1974). Storage of elastic energy in skeletal muscles in
man. Acta Physiologica Scandinavica, 91, 385-392
Audu, M.L. & Davy, D.T. (1985). The influence of muscle model complexity in
musculoskeletal motion modeling. Journal of Biomechanical Engineering, 107, 147-157
Babič, J. & Lenarčič, J. (2004). In vivo determination of triceps surae muscle-tendon complex
viscoelastic properties. European Journal of Applied Physiology, 92, 477-484
Babič, J.; Karčnik, T. & Bajd, T. (2001). Stability analysis of four-point walking, Gait &
Posture, 14, 56-60
Bobbert, M.F.; Gerritsen, K.G.M.; Litjens, M.C.A. & Soest, A.J. van (1996). Why is
countermovement jump height greater than squat jump height? Medicine & Science
in Sports & Exercise, 28, 1402-1412
Bobbert, M.F.; Hoed, E.; Schenau, G.J. van.; Sargeant, A.J. & Schreurs, A.W. (1986). A model
to demonstrate the power transporting role of bi-articular muscles, Journal of
Physiology, 387 24
Bogert, A.J. van den; Hartman, W.; Schamhardt, H.C. & Sauren, A.A. (1989). In vivo
relationship between force, EMG and length change in the deep digital flexor
muscle in the horse, In Biomechanics XI-A, G. de Groot; A.P. Hollander; P.A.
Huijing; G.J. van Ingen Schenau (Eds.), pp. 68-74, Free University Press,
Amsterdam
Brand, R.A.; Crowninshield, R.D.; Wittstock, C.E.; Pederson, D.R.; Clark, C.R. & Krieken,
F.M. van (1982). A model of lower extremity muscular anatomy. Journal of
Biomechanics, 104, 304-310
Cavagna, G.A. (1970). Elastic bounce of the body, Journal of Applied Physiology, 29, 279-282
Davy, D.T. & Audu, M.L. (1987). A dynamic optimization technique for predicting muscle
forces in the swing phase of gait. Journal of Biomechanics, 20, 187-201
Delp, S.L. (1990). Surgery simulation: A computer graphics system to analyze and design
musculoskeletal reconstructions of the lower limb, Doctoral Dissertation, Stanford
University, Palo Alto
Ker, R.F.; Bennett, M.B.; Bibby, S.R.; Kester, R.C. & Alexander, R.McN. (1987). The spring in
the arch of the human foot. Nature, 325, 147-149
Legreneur, P.; Morlon, B. & Hoecke, J. van (1997). Joined effects of pennation angle and
tendon compliance on fibre length in isometric contractions: A simulation study.
Archives of Physiology and Biochemistry, 105, 450-455
Leva, P. de (1996). Adjustments to Zatsiorsky-Seluyanov’s segment inertia parameters.
Journal of Biomechanics, 29, 1223-1230
Schenau, G.J. van. (1989). From rotation to translation: constraints of multi-joint
movements and the unique action of bi-articular muscles, Human Movement
Science, 8, 301-337
Shorten, M.R. (1985). Mechanical energy changes and elastic energy storage during
treadmill running, In: Biomechanics IXB, D.A. Winter; R.W. Norman; R. Wells; K.C.
Hayes; A. Patla (Eds.), pp. 65-70, Human Kinetics Publ, Champaign
Zatsiorsky, V. & Seluyanov, V. (1983). The mass and inertia characteristics of the main
segments of the human body, In Biomechanics VIII-B, H. Matsui; K. Kobayashi
(Eds.), pp. 1152-1159, Human Kinetics, Illinois
32
Planning Versatile Motions for Humanoid in a Complex Environment
1. Introduction
The potential market of service and entertainment humanoid robots has attracted a great amount of research interest in recent years. Several models of humanoid robots have
been designed in research projects. However, it remains a great challenge to make
humanoid robots move autonomously. Motion planning is one of the key capabilities that
an autonomous robot should have. An autonomous robot should be able to accept high-
level commands and move in a real-life environment without colliding with environmental
obstacles. A high-level command is something like “Move to location A on the second floor”
while the robot is currently at some location B on the first floor, for example. It is an
interesting problem for humanoid robots since the locomotion capability possessed by a humanoid robot is usually much better than that of a mobile robot. Like a human, a humanoid
robot should be able to step upstairs or downstairs and stride over small obstacles or a
narrow deep gap in a complex environment such as the one shown in Fig. 1. Similar needs
for the planning capability also arise in the domain of computer animation in generating
motions for autonomous characters.
The motion for a humanoid robot to achieve a given goal is typically very complex because
of the degrees of freedom involved and the contact constraint that needs to be maintained.
Therefore, it is common to take a two-level planning approach to solve this problem. In our
previous work (Li et al, 2003), we have been able to plan efficient humanoid walking
motions in a layered environment. The approach used in the planner decomposed the
planning problem into subproblems of global and local planning, each of which is easier to
solve. The global planner assumes some basic properties of a humanoid and uses an
approximated geometric shape to define the path planning problem. The path generated by
the global planner is then passed to the local planner to realize the path with appropriate
walk motions. However, in previous work, the locomotion of a humanoid is usually limited
to forward walking only, and a humanoid cannot pass a deep gap even if it can stride over
it. In this chapter, we will describe a motion planning system adopting the two-level
approach and extending the work to overcome the above two limitations. We will present
an efficient implementation of the planner in terms of space and time such that it can be
used in an on-line manner.
The rest of the chapter will be organized as follows. In the next section, we will review the
researches pertaining to our work. In the third section, we will give a formal description of
the problem we consider in this chapter. We will then propose the planning algorithm in the
fourth section and present some implementation details and experimental results in the fifth
section. In the last section, we will conclude the chapter with some future directions.
Fig.1. Example of a layered virtual environment with deep narrow gaps (colored columns
between two large platforms)
2. Related Work
Motion planning problems have been studied for more than three decades. An overview of
motion planning algorithms can be found in Latombe’s book (Latombe, 1991). The
researches pertaining to humanoid motion planning or simulation can be found in the fields
of robotics and computer animation (Kuffner et al., 2001; Pollard et al., 2002; Sun & Dimitris, 2001). These researches differ mainly in the way they perform global path planning and
how locomotion is taken into account. For example, early work in computer animation
focused on generating human walking motion to achieve a high-level goal (Bruderlin &
Calvert, 1989). However, no planning was done to generate the global path automatically.
Since the application was mainly for computer graphics, how to simulate human walking
with a realistic appearance was the main concern (Sun & Dimitris, 2001). A dynamic filtering
algorithm was proposed in (Yamane & Nakamura, 2000) to ensure that the generated
motions could be transformed into dynamically feasible ones.
In an early paper by Kuffner (1998), a gross motion planner utilizing graphics hardware was
proposed to generate humanoid body motions on a flat ground in real time. Captured
locomotion was used in this case to move the humanoid along the generated global path. In
(Kalisiak & Panne, 2001), a stochastic search approach with versatile locomotion was
proposed to generate humanoid motions in an unstructured environment where a set of
predefined grasp points served as contact constraints. In (Choi et al, 2003), sequences of
valid footprints were searched through augmented probabilistic roadmap with a posture
transition graph, and then each pairs of footprints were substituted by corresponding
motion clips. Since the motion clips were captured in advance, the locomotion may not be
flexibly adapted to uneven terrain to avoid collisions. In (Shiller et al, 2002), a multi-layer
grid was used to represent the configuration space for a humanoid with different types of
locomotion such as walking and crawling, and the humanoid may change its posture along
the global path.
In (Pettre et al., 2003), a digital actor was modelled with active (all degrees of freedom
attached to the legs) and reactive (attached to the upper parts of the body) degrees of
freedom. A collision-free trajectory was computed for the active part of the digital actor.
Then the motion warping technique (Witkin & Popovic, 1995) was applied to the motion
capture data along the path when the reactive part of the digital actor collided with
obstacles. The planner proposed in (Chestnutt et al., 2003) evaluated footstep locations for
viability using a collection of heuristic metrics of relative safety, effort required, and overall
motion complexity. At each iteration of the search loop, a feasible footstep was selected from
the footstep transition sets. The global path of this approach was a sequence of footstep
locations toward a given goal state.
In (Kuffner et al., 2001a; Kuffner et al., 2001b), a humanoid robot with real-time vision and
collision detection abilities was presented. The robot could plan its footsteps amongst
obstacles but could not step onto them. Considering locomotion directly in global path
planning may generate more complete result but, on the other hand, it limits the flexibility
of locomotion. In our previous work (Li et al., 2003), we were able to plan humanoid
motions in real time with a given general description of the objects in the workspace. A
more detailed description of this approach will be given in the next two sections.
3. Problem Description
3.1 Problem Overview
According to motion granularity, the motion-planning problem usually can be classified
into global (gross) motion planning and local (fine) motion planning. In the global motion-
planning problem we are concerned with working out body logistics, such as planning a collision-free path amongst trees in a forest to reach some destination. In contrast, in the local motion-planning problem we focus more on limb logistics, such as planning a hand motion to grab an awkwardly placed object. For the problem of a humanoid robot walking in a layered complex environment, both types of planning need to be considered in order to ensure that the desired task can be accomplished with feasible motion plans.
The inputs to a typical motion planner for a humanoid robot include the initial and goal
configurations of the robot, the kinematics description and locomotion abilities of the robot,
and a geometric description of objects in the environment. In our approach, the planning
problem is decomposed into two subproblems: the global motion planning for moving the
body trunk of a humanoid and the local motion planning for realizing the global motion
plan with the chosen locomotion. The planners at the two levels can be linked together to
solve the problem in sequence as well as to feed back failure situations for further replanning, as shown in Fig. 2. The global motion planning will be the main concern of this chapter.
Fig. 2. Two-level planning framework (block diagram): the environment and humanoid setting, reachability and motion patterns, and collision model and abilities are the inputs; the initial and goal configurations qinit and qgoal enter the Global Motion Planner, whose global path is passed to the Local Motion Planner; success returns the complete motion, while failure is fed back for replanning.
Fig.3. Frontal and lateral walking model for the humanoid (figure labels: stable stance, bounding cylinder, moving direction).
The layered representation approach is appropriate for the gross motion-planning problem
of a humanoid robot since the humanoid needs to stay on the ground all the time, even
when climbing up or stepping down stairs. Therefore, a configuration q in workspace can be
represented as (x, y, l), which means that the humanoid is standing at position (x, y) and on
the “ground” of layer l.
humanoid height H can fit into the clearance above the cell i′ at the layer l or l′ (that is, $(d_{l^+} - d_l - c_{i'}^{l}) > H$ or $(d_{l'^+} - d_{l'} - c_{i'}^{l'}) > H$, where $l^+$ denotes the layer above l).
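The clearance test above can be phrased as a simple per-cell predicate. The sketch below uses hypothetical map structures (ground height and ceiling height per layer and cell); the field names are illustrative, not the chapter's data structures:

```python
def fits_in_clearance(ground_height, ceiling_height, cell, layer, H):
    """True if a humanoid of height H fits above `cell` on `layer`.

    ground_height[layer][cell]  : height of the cell's supporting surface
    ceiling_height[layer][cell] : height of the layer above it (float('inf') if open)
    """
    clearance = ceiling_height[layer][cell] - ground_height[layer][cell]
    return clearance > H
```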
Fig.5. (a) height map for the environment in Fig. 1., (b) map in (a) after applying dilation, (c)
map in (b) after applying erosion. After the closing operator, the columns in the
discontinuous region are connected. (d) instability map
STABLE_BFP()
1  install qi in T;
2  INSERT(qi, OPEN); mark qi visited;
3  SUCCESS ← false;
4  while not EMPTY(OPEN) and not SUCCESS do
5    q ← FIRST(OPEN);
6    for every neighbor q' of q in the grid do
7      if q' is stable then
8        mark q' visited;
9        if LEGAL(q', q) then
10         install q' in T with a pointer toward q;
11         INSERT(q', OPEN);
12         if q' = qg then SUCCESS ← true;
13 if SUCCESS then
14   return the backtracked feasible path
15 else return failure;
Fig.6. The STABLE_BFP algorithm
and temporarily stable. It is temporarily stable if and only if the humanoid has not entered
the unstable regions or the gap regions for longer than some threshold. This duration is kept
as an instability counter (q’.cnt) in each cell in the unstable region and step counter (q’.step)
in the gap region when we propagate nodes into it. Note that the validity of a configuration
in the unstable or gap regions depends on the counter of its parent configuration. If there is more than one possible parent configuration, we cannot exclude any of them. Therefore, in
the STABLE_BFP algorithm, we do not mark a configuration visited if it is in the unstable or
gap region. A configuration in these regions can be visited multiple times as long as the
counter does not exceed the maximal bound. In a gap region, we need to check the gap
width with the gait size. We keep the start point of a gap (gap_begin) when we first enter the
gap region and the end point of a gap (gap_end) when a stable region is reached. These two
points are used to compute the gap width, and the procedure is to ensure that the gap does
not exceed the gait size and the humanoid can stride from gap_begin to gap_end.
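For reference, the basic search of Fig. 6, without the instability and step counters discussed above, can be written as an ordinary breadth-first search with parent pointers. A runnable sketch, assuming the planner supplies the `neighbors`, `is_stable` and `is_legal` callbacks:

```python
from collections import deque

def stable_bfp(q_init, q_goal, neighbors, is_stable, is_legal):
    """Breadth-first search over stable grid configurations (Fig. 6, simplified)."""
    parent = {q_init: None}          # tree T with back-pointers
    open_list = deque([q_init])      # OPEN queue
    visited = {q_init}
    while open_list:
        q = open_list.popleft()      # FIRST(OPEN)
        for q_next in neighbors(q):
            if q_next in visited or not is_stable(q_next):
                continue
            visited.add(q_next)      # mark q' visited
            if is_legal(q_next, q):
                parent[q_next] = q   # install q' in T with a pointer toward q
                open_list.append(q_next)
                if q_next == q_goal: # goal reached: backtrack the feasible path
                    path = [q_next]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return list(reversed(path))
    return None                      # failure
```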
STABLE (q’ ,q)
if q’ is unstable then
q’.cnt= q.cnt+1;
else if q’ is gap then
if q.ingap then
q’.step= q.step+1;
else
q’.step=1;
q’.ingap=true;
gap_begin= q;
else if q’ is stable and q.ingap then
gap_end= q’;
of these criteria, whose weights are specified by the user. In general, unstable regions and
gap regions have lower preference, and staying in these regions is not preferred. Therefore,
we use instability (q.cnt) or step counter (q.step) as a penalty measurement to avoid motions
over difficult areas whenever possible.
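Although the list of criteria is truncated in this excerpt, the weighting scheme it describes amounts to a user-tunable cost in which the instability and step counters act as penalties. A purely illustrative sketch (the criterion names and the linear form are assumptions):

```python
def configuration_cost(q, weights):
    """Hypothetical weighted cost of a grid configuration q.

    q.dist_to_goal : distance-to-goal / potential-field term (assumed attribute)
    q.cnt          : instability counter (penalises unstable regions)
    q.step         : step counter (penalises gap regions)
    """
    return (weights.get("goal", 1.0) * q.dist_to_goal
            + weights.get("instability", 1.0) * q.cnt
            + weights.get("gap", 1.0) * q.step)
```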
5.2 Example
In Fig. 9, we use an example to illustrate the features of this planning system. The
environment, which is the same as the one in Fig. 1, consists of two layers of objects with
various sizes and heights scattered on the two main platforms connected by several columns
of various heights. The C-obstacles (configuration space obstacles) with different bounding
cylinders for forward walking and side walking are shown in Fig. 9 (a) and (b), respectively.
Note that the narrow passage exists only when side-walk locomotion (Fig. 9. (b)) is used.
The global path found by the planner is shown in Fig. 9 (c) and (d). In this example, the
initial configuration is on the ground, and the goal is at some location on the second layer
that is reachable only through the following passages: climbing upstairs to the left platform,
changing locomotion to side walk to pass the narrow passage, crossing the columns to the
right platform, and climbing upstairs again to the second layer on the right platform. The
planning time for a typical run consists of two parts: preprocessing and search. In this
example, the total preprocessing time is 251ms (constructing layer map takes 10ms,
computing the reachable region takes 35ms, computing the potential field takes 74ms, and
computing the instability map takes 132ms) and the search time is 15ms.
Fig. 8. Merging sparse layers. (a) and (b) are the 3D workspace with spiral stairs from
different views (c) and (d) are the merged results for layer 1 and layer 2, respectively.
Fig. 9. (a) the C-obstacles for forward walking and (b) side walking, (c) the search result of the global path planner on layer 1 and (d) layer 2, (e) passing the narrow passage with side-walk motion, (f) feasible local motions generated to walk across the columns.
In Fig. 10, we use an outdoor environment to illustrate the use of several types of
locomotion. The environment, as shown in Fig. 10(a), is partitioned into three blocks by
the river. Upper and lower blocks are connected through the lily pads floating on the
river. The broken bridge between the left and the right block is the shortest path
between them. However, the distance is too large for the humanoid to stride over. In
this case, the humanoid can only use jumping to move over the broken bridge. In
addition, the passage to the house is too narrow for the humanoid to walk through
without switching to the side-walk locomotion because the narrow passage exists only
when side-walk locomotion is used. The global path found by the planner is shown in
Fig. 10 (b), the red portion of the path is frontal walking, yellow portion is jumping, and
green portion is lateral walking. In this example, the initial configuration is on the left
bottom of the lower block, and the goal is in the yard of the house at the center of the
right block that is reachable only through the following passages: climbing upstairs to
the platform, detouring round the woods in the forest (Fig. 10(c)), striding over the lily
pads (Fig. 10(d)), jumping over the broken bridge between the left and the right block
(Fig. 10(e)), changing locomotion to side walk to pass the narrow fence (Fig. 10(f)), and
finally reaching the goal. The total planning time is 312ms (281ms for preprocessing and
31ms for path searching).
Fig. 10. An outdoor environment requiring the humanoid to use several motion skills to
reach the goal.
A real human can usually avoid obstacles with various kinds of locomotion and body motions. In the future, we would also like to equip the humanoid robot with more locomotion abilities for moving to the goal. For example, a human can crawl or stoop to avoid upper-layer obstacles. In addition, in a complex environment there could exist movable objects that can be moved away by pulling or pushing so that the humanoid robot can make way for itself to reach the goal. We will also consider this kind of manipulation and reasoning capability in the future.
7. References
Bruderlin, A. & Calvert, T.W. (1989). Goal-Directed, Dynamic Animation of Human
Walking, Proceedings of ACM SIGGRAPH 1989
Barraquand, J. & Latombe, J.C. (1991). Robot Motion Planning: A Distributed Representation
Approach, Intl J. of Robotics Research, 10:628-649
Chestnutt, J.; Kuffner, J.; Nishiwaki, K. & Kagami, S. (2003) Planning Biped Navigation
Strategies in Complex Environments, Proceedings of IEEE International Conference on
Humanoid Robotics
Choi, M.G.; Lee, J. & Shin, S.Y. (2003). Planning Biped Locomotion Using Motion Capture
Data and Probabilistic Roadmaps, ACM Transactions on Graphics, Vol. 22, No. 2, pp.
182–203
Gonzalez, R.C. & Woods, R.E. (2002). Digital Image Processing, Second Edition, Prentice Hall
Kalisiak, M. & Panne, M. (2001). A Grasp-Based Motion Planning Algorithm for Character
Animation, Journal of Visual Computer and Animation, pp. 117-129
Kuffner, J. (1998). Goal-Directed Navigation for Animated Characters Using Real-time Path
Planning and Control, Proceedings of CAPTECH’98 Workshop on Modeling and Motion
capture Techniques for Virtual Environments, Springer-Verlag
Kuffner, J.J.; Nishiwaki, K.; Kagami, S. Inaba, M. & Inoue, H. (2001). Footstep Planning
Among Obstacles for Biped Robots, Proceedings of 2001 IEEE International Conference
on Intelligent Robots and Systems (IROS 2001)
Kuffner, J.J.; Nishiwaki, K.; Kagami, S.; Inaba, M. & Inoue, H. (2001). Motion Planning for
Humanoid Robots under Obstacle and Dynamic Balance Constraints, Proceedings of
2001 IEEE International Conference on Robotics and Automation
Latombe, J.C. (1991). Robot Motion Planning, Kluwer, Boston, MA
Li, T.Y.; Chen, P.F. & Huang, P.Z. (2003). Motion Planning for Humanoid Walking in a
Layered Environment, Proceedings of the 2003 International Conference on Robotics and
Automation
Pettre, J.; Laumond, J.P. & Simeon, T. (2003). A 2-Stage Locomotion Planner for Digital
Actors, Proceedings of Eurographics/SIGGRAPH Symposium on Computer
Animation
Pollard, N.S.; Hodgins, J.K.; Riley, M.J. & Atkeson, C.G. (2002). Adapting Human Motion for
the Control of a Humanoid Robot, Proceedings of 2002 IEEE International Conference
on Robotics and Automation, pp. 2265-2270
Shiller, Z.; Yamane, K. & Nakamura, Y. Planning Motion Patterns of Human Figures Using a
Multi-Layered Grid and the Dynamics Filter, Proceedings of IEEE International
Conference on Robotics and Automation, pp.1-8
Sun, H.C. & Dimitris, N.M. (2001). Automating Gait Generation, Proceedings of ACM
SIGGRAPH 2001