Learning Stable Robotic Skills on Riemannian Manifolds

Robotics and Autonomous Systems 169 (2023) 104510

Article history: Received 31 August 2022; Received in revised form 11 July 2023; Accepted 14 August 2023; Available online 19 August 2023.

Keywords: Learning from Demonstration; Learning stable dynamical systems; Riemannian manifold learning

Abstract

In this paper, we propose an approach to learn stable dynamical systems that evolve on Riemannian manifolds. Our approach leverages a data-efficient procedure to learn a diffeomorphic transformation, enabling the mapping of simple stable dynamical systems onto complex robotic skills. By harnessing mathematical techniques derived from differential geometry, our method guarantees that the learned skills fulfill the geometric constraints imposed by the underlying manifolds, such as unit quaternions (UQ) for orientation and symmetric positive definite (SPD) matrices for impedance. Additionally, the method preserves convergence towards a given target. Initially, the proposed methodology is evaluated through simulation on a widely recognized benchmark, which involves projecting Cartesian data onto UQ and SPD manifolds. The performance of our proposed approach is then compared with existing methodologies. In addition, a series of experiments was performed to evaluate the proposed approach in real-world scenarios. These experiments involved a physical robot tasked with bottle stacking under various conditions and a drilling task performed in collaboration with a human operator. The evaluation results demonstrate encouraging outcomes in terms of learning accuracy and the ability to adapt to different situations.

© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction

Robots that successfully operate in smart manufacturing have to be capable of precisely controlling their behavior in terms of movements and physical interactions [1]. In this regard, modern industrial and service robots need flexible representations of intended behaviors in terms of motion, impedance, and force skills (see Fig. 1). The development of such representations is a key aspect in speeding up the integration of robotic solutions in social and industrial environments.

Learning approaches have the possibility to unlock the full potential of smart robotic solutions. Among others, the Learning from Demonstration (LfD) paradigm [2] aims at developing learning solutions that allow the robot to enrich its skills via human guidance. Among the several existing approaches [3,4], the idea of encoding robotic skills into stable Dynamical Systems (DSs) has gained interest in the LfD community [5–12]. In this context, a robotic skill is any robot motion, either with a given start and goal (point-to-point or discrete motion) or periodic. A linear point-to-point motion is an example of a simple robotic skill that can be generated, for instance, with a linear DS. Point-to-point motions with an arbitrarily complex path connecting start and goal points are examples of complex robotic skills that a nonlinear DS can effectively generate. To learn complex robotic skills, DS-based approaches for LfD assume that the demonstrations of a robotic skill are observations of the time evolution of a DS. DSs are flexible motion generators that allow, for example, encoding periodic and discrete motions [13], reactively avoiding possible collisions [14–16], or updating the underlying dynamics in an incremental fashion [17,18].

Although DS-based representations have several interesting properties, most works assume that the data are observations in Euclidean space. However, this is not always the case in robotics, where data can belong to a non-Euclidean space. Typical examples of non-Euclidean data are orientation trajectories represented as rotation matrices and Unit Quaternions (UQs), and Symmetric Positive Definite (SPD) matrices for quantities such as inertia, impedance gains, and manipulability, which all belong to Riemannian manifolds. Applying standard arithmetic tools from Euclidean geometry on Riemannian manifolds leads to inaccuracies and incorrect representations, which can be avoided if the proper mathematical tools developed for Riemannian manifolds are used [19].

In this paper, we propose the following.
https://doi.org/10.1016/j.robot.2023.104510
0921-8890/© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
M. Saveriano, F.J. Abu-Dakka and V. Kyrki Robotics and Autonomous Systems 169 (2023) 104510
Table 1
Comparison among state-of-the-art approaches for LfD on Riemannian manifolds and SDS-RM.

Approach          End-Point  Multiple demos  Multimodal  Time-independent  Data-efficiency  Accuracy   Training time
DMP [20,21]       ✓          –               –           –                 high             medium     low
R-GMM [22]        ✓          ✓               –           –                 high             medium     low
FDM^a [9]         ✓          –               –           ✓                 high             medium     low
E-FLOW [23]       ✓          ✓               ✓           ✓                 low              medium^b   high
I-FLOW [24–26]    ✓          ✓               ✓           ✓                 low              medium^b   high
SDS-RM (ours)     ✓          ✓               ✓           ✓                 high             high       low

^a FDM is not designed to work on Riemannian manifolds. However, our formulation allows us to use it to learn a diffeomorphism in the TS.
^b E-FLOW and I-FLOW need a hyperparameter search to reach high accuracy. However, performing a hyperparameter search requires a GPU cluster and is beyond the scope of this paper. With trial and error, we found a hyperparameter configuration that gives medium accuracy.
2. Related works
Table 2
Re-interpretation of basic standard operations in a Riemannian manifold [53]. Log and Exp denote the logarithmic and exponential maps, and u_{ma} denotes the tangent vector at m pointing towards a.

Operation       Euclidean space                Riemannian manifold
Subtraction     u_{ma} = a − m                 u_{ma} = Log_m(a)
Addition        a = m + u_{ma}                 a = Exp_m(u_{ma})
Distance        dist(m, a) = ∥a − m∥           dist(m, a) = ∥u_{ma}∥_m
Interpolation   m(t) = m_1 + t u_{m_1 m_2}     m(t) = Exp_{m_1}(t u_{m_1 m_2})
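To make Table 2 concrete, the following minimal sketch (our own illustration, not code from the paper) implements the logarithmic and exponential maps on the unit sphere, one of the manifolds used throughout the paper; the function names are ours.

```python
import numpy as np

def sphere_log(m, a):
    """Log_m(a): tangent vector at m pointing towards a, with length dist(m, a)."""
    dot = np.clip(np.dot(m, a), -1.0, 1.0)
    u = a - dot * m                       # component of a orthogonal to m
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:                    # a == m (the map is undefined at the antipode)
        return np.zeros_like(m)
    return np.arccos(dot) * u / norm_u

def sphere_exp(m, v):
    """Exp_m(v): point reached by following the geodesic from m along v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return np.array(m, dtype=float)
    return np.cos(norm_v) * m + np.sin(norm_v) * v / norm_v

# Riemannian "subtraction" followed by "addition" recovers the point, as in Table 2.
m = np.array([0.0, 0.0, 1.0])             # base point (north pole)
a = np.array([1.0, 0.0, 0.0])
v = sphere_log(m, a)                      # Log_m(a)
a_rec = sphere_exp(m, v)                  # Exp_m(Log_m(a))
dist = np.linalg.norm(v)                  # dist(m, a) = ||Log_m(a)||, here pi/2
```

Note how the geodesic distance falls out of the logarithmic map for free, mirroring the Distance row of the table.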
4.1. Base DS on Riemannian manifolds

In Euclidean space, a linear DS in the form

ẋ = −k_x (x − g)    (7)

is often used as base² DS. Indeed, if the gain k_x > 0, the linear DS in (7) is globally exponentially stable [55]. This implies that

To prove the stability of (9), one can define the Lyapunov candidate function [56]

V(a) = ⟨a, a⟩_g = ⟨Log_g(a), Log_g(a)⟩_g    (10)

and follow the same arguments outlined in [42, Section 4.2.2] for SPD matrices. □

Remark 1. The assumption made in Theorem 1 that Log_g(a) = −T_{g→a} Log_a(g) holds for any Riemannian manifold [57, Theorem 6].

² Note that some authors assume, without loss of generality, that g = 0.
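A minimal numerical sketch of the base DS (7) (our illustration, not the paper's implementation): a forward-Euler rollout with k_x > 0 converges to the goal g from any initial condition.

```python
import numpy as np

def rollout_linear_ds(x0, g, kx=2.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of the base DS (7): x_dot = -kx * (x - g)."""
    x = np.asarray(x0, dtype=float).copy()
    g = np.asarray(g, dtype=float)
    for _ in range(steps):
        x += dt * (-kx * (x - g))   # error decays roughly as exp(-kx * t)
    return x

# Any initial condition converges to the goal when kx > 0.
x_final = rollout_linear_ds(x0=[1.0, -2.0], g=[0.5, 0.5])
```

The gains, step size, and horizon here are arbitrary choices for illustration.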
Remark 2. The results of Theorem 1 hold where the logarithmic map is uniquely defined, i.e., in a region that does not contain points conjugate to g [56]. For SPD matrices, this holds everywhere [53]. Hence, Theorem 1 is globally valid on the manifold of SPD matrices. For the unit m-sphere (including UQ), instead, the logarithmic map Log_a(·) is defined everywhere apart from the antipodal point −a [58].

Even if it is not strictly required in our approach, it is interesting to show that, like the linear DS in (7), the DS in (8) is exponentially stable.

Remark 3. Under the assumptions of Theorem 1, the DS in (8) has a globally (in its domain of definition) exponentially stable equilibrium at g [42, Section 4.2.2].

4.2. Diffeomorphic DS

A diffeomorphic map, or diffeomorphism, ψ is a bijective and continuous change of coordinates with continuous inverse ψ⁻¹. In this work, we assume that ψ : T_a M → T_a M, i.e., the diffeomorphism transforms a global coordinate a ∈ T_a M into another global coordinate b = ψ(a) ∈ T_a M. Further, the diffeomorphism ψ is assumed to be bounded, i.e., it maps bounded vectors into bounded vectors.

In order to match a set of demonstrations, we apply ψ to the base DS in (8). More in detail, let us assume that a = Log_a(g), and that the dynamics of the global coordinate ȧ is described by (8). By taking the time derivative of b, we obtain the DS [23]

ḃ = (∂ψ/∂a) ȧ = k_a J_ψ(ψ⁻¹(b)) ψ⁻¹(b),    (11)

where J_ψ(·) is the Jacobian matrix of ψ evaluated at a particular point and the inverse mapping a = ψ⁻¹(b) is used to remove the dependency on a. Having assumed that ψ is a bounded diffeomorphism, the right side of (11) satisfies the Lipschitz condition and, therefore, the DS in (11) admits a unique solution. The stability of the diffeomorphic dynamics in (11) is stated by the following theorem.

Theorem 2. The diffeomorphic DS in (11) inherits the stability properties of the base DS in (8). That is, if the base DS is globally (in its domain of definition) asymptotically stable, so is the diffeomorphic DS.

Proof. From the proof of Theorem 1, it holds that V(a) defined in (10) is a Lyapunov function for the base DS in (8). As shown in [23, Section 3.2], the function V_ψ(b) = ⟨ψ⁻¹(b), ψ⁻¹(b)⟩_{g′}, where g′ is the point where ψ⁻¹(Log_a(g)) = 0, is a valid Lyapunov function for the diffeomorphic DS in (11). □

Remark 4. Theorem 2 states the convergence of the DS (11) to the equilibrium g′. This point may differ from the equilibrium g of the base DS (8). However, in LfD, we are interested in converging to a given goal, say g for simplicity. Assuming that the inverse mapping ψ⁻¹(·) is the identity at the goal, i.e., ψ⁻¹(Log_a(g)) = Log_a(g) = 0, it is straightforward to show from Theorem 2 that the DS (11) also converges to g.

Given the global coordinate b, we compute the corresponding manifold point b through the exponential map as

b = Exp_a(b).    (12)

Recalling that the exponential map is a local diffeomorphism, the composite mapping Exp_a(b) = Exp_a ∘ ψ(a) can be considered a diffeomorphism between manifolds.

5. Learning stable skills via diffeomorphisms

The stability theorems provided in Section 4 give solid theoretical foundations to our learning approach. In this section, we describe how to generate training data suitable to learn a diffeomorphic map on manifolds. The approach, as well as the derivations in Section 4, is quite general and allows the use of different approaches to find the sought diffeomorphism. We then describe how GMM/GMR, a consolidated and data-efficient approach for LfD, can be extended to learn such a diffeomorphism. Finally, we discuss how to apply our approach to two popular Riemannian manifolds, namely the unit quaternions and the SPD matrices.

5.1. Data pre-processing

We aim at learning a diffeomorphism ψ(·) that maps the trajectory a(t), solution of the base DS in (8), into an arbitrarily complex demonstration. To this end, let us assume the user provides a set of D ≥ 1 demonstrations, each containing L points on a Riemannian manifold. Demonstrations are organized in the set B = {b_l^d}_{l=1,d=1}^{L,D}, where each b_l^d ∈ M. We also assume that the demonstrations converge to the same goal (b_L^1 = ⋯ = b_L^D = g) and that a sampling time δt is known. When collecting demonstrations using kinesthetic teaching, it is possible to observe some variations in the final point. In this case, we re-scale the trajectories to converge to the same goal, which is defined by the user (e.g., as the average of the end points). It is worth mentioning that, when orientation trajectories are collected from demonstrations with a real robot, it is necessary to extract UQs from rotation matrices. This is because the robot's forward kinematics is typically expressed as a homogeneous transformation matrix [59]. While numerically extracting UQs from a sequence of rotation matrices, it can happen that the approach returns a quaternion at time t and its antipodal at t + 1. This is because antipodal UQs represent the same rotation. To prevent this discontinuity, one can check that the dot product q_t · q_{t+1} > 0 and, otherwise, replace q_{t+1} with −q_{t+1}.

Given the set of demonstrations B, we generate a set of D base trajectories by projecting (8) on the manifold. More in detail, we set the initial condition a_1^d = b_1^d and project the tangent space velocity on the manifold using the exponential map as

a_{l+1}^d = Exp_{a_l^d}(δt ȧ_l^d),  ∀ l, d.    (13)

The time derivative ȧ_l^d is defined as in (8), and the exponential/logarithmic maps for UQ and SPD manifolds are defined as in Section 3.

The D base trajectories are organized in a set A = {a_l^d}_{l=1,d=1}^{L,D}. In order to transform the datasets A and B into suitable training data, we proceed as follows. We use the logarithmic map a_l^d = Log_{a_l^d}(g), ∀ l, d, to project the goal g in each TS placed at a_l^d. We use the logarithmic map b_l^d = Log_{a_l^d}(b_l^d), ∀ l, d, to project each point in B in the TS placed at a_l^d. As a result, we obtain the sets A = {a_l^d}_{l=1,d=1}^{L,D} and B = {b_l^d}_{l=1,d=1}^{L,D}. In other words, we have in A the points from the base DS (8) that exponentially converge towards g and in B their demonstrated values. Note that each a_l^d and b_l^d is expressed in the same TS to make them comparable.

After this procedure, the learning problem becomes how to fit a mapping between A and B while preserving the stability. Exploiting the theoretical results in Theorem 2 and Remark 4, this learning problem is solved by fitting a diffeomorphism between A and B. The resulting approach is presented in the rest of this section.
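The double-cover fix for quaternion trajectories described above can be sketched as follows (our illustration; the function name and sample values are ours):

```python
import numpy as np

def enforce_quaternion_continuity(quats):
    """Resolve the UQ double cover: flip q_{t+1} whenever q_t . q_{t+1} < 0,
    since antipodal unit quaternions represent the same rotation."""
    out = [np.asarray(quats[0], dtype=float)]
    for q in quats[1:]:
        q = np.asarray(q, dtype=float)
        if np.dot(out[-1], q) < 0.0:   # antipodal jump detected
            q = -q
        out.append(q)
    return np.stack(out)

# Third sample comes back antipodal from the rotation-matrix extraction.
raw = [[1.0, 0.0, 0.0, 0.0], [0.999, 0.04, 0.0, 0.0], [-0.998, -0.06, 0.0, 0.0]]
fixed = enforce_quaternion_continuity(raw)
```

After the pass, consecutive quaternions always have a positive dot product, so the trajectory is continuous on S³.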
Algorithm 1: SDS-RM on UQ and SPD manifolds

1: Pre-process data
   – Collect demonstrations B = {b_l^d}_{l=1,d=1}^{L,D}
   – Define the sampling time δt
   – Compute base trajectories A = {a_l^d}_{l=1,d=1}^{L,D} using (13)
   – Project to TS using Log_a(g) ((2) or (4)) to obtain A = {a_l^d}_{l=1,d=1}^{L,D}, B = {b_l^d}_{l=1,d=1}^{L,D}. (For SPD profiles, vectorize data using Mandel's notation.)
2: Learn a diffeomorphism represented as a GMM
   – Place a small Gaussian component at the origin of the TS (π_{K+1} = π, N(a, b|0, k_N I))
   – Place a small Gaussian component at the initial point (π_0 = π, N(a, b|a, k_N I))
3: Generate Riemannian motion via GMR
   – Compute ψ⁻¹ from (17), J_ψ from (18), and the velocity from (11).
   – Project on the manifold using (13).

5.4. Learning in current and fixed TS

In this example, we show the benefits of learning manifold motions in the tangent space placed at the current point, called the current TS in this work, in contrast to projecting the entire trajectory in a unique (fixed) TS. We use the ''N'' shape on S² provided in [22] and shown in Fig. 5 (black dots). The trajectory is designed to span both the northern and southern hemispheres, where the Lie algebra is known to introduce a non-negligible approximation [22].

We follow the steps in Algorithm 1 using in one case the current TS and in the other the Lie algebra (i.e., the TS at the north pole). Qualitative results in Fig. 5(a) confirm that, using the current TS, SDS-RM can effectively encode the trajectory. The same result can be obtained using a TS at the goal and parallel transporting the data at each step (see the assumption in Theorem 1). However, this choice would increase the computational complexity due to the need for parallel transport. As expected, using the Lie algebra results in severe distortions (Fig. 5(b)).

Fig. 5. Results obtained with SDS-RM on the ''N'' shape.

6. Validation

In this section, we validate the proposed approach on a public benchmark – properly modified to represent trajectories evolving on Riemannian manifolds – and compare the results against FDM [9], R-GMM [22], and E-FLOW [23]. It is worth mentioning that FDM has not been designed to work on Riemannian manifolds. However, the procedure described in Section 4 allows exploiting different approaches to fit a diffeomorphism between TSs.

6.1. The Riemannian LASA dataset

In the LfD literature, there is no available dataset to test DS-based algorithms on Riemannian manifolds. Therefore, we have created a new one by modifying a popular benchmark – the LASA handwriting dataset [6] – to generate manifold (namely UQ and SPD) motions. The LASA handwriting dataset contains 30 classes of 2D Euclidean motions starting from different initial points and converging to the same goal [0, 0]⊤. Each motion is demonstrated 7 times. A demonstration has exactly 1000 samples and includes position, velocity, and acceleration profiles.

The key idea to generate Riemannian data from Euclidean points is to consider each demonstration as an observation of a motion in the TS of a given Riemannian manifold. This allows us to use the exponential map to project the motion onto the manifold. As discussed in Section 3, the TS is computed wrt a point on the manifold. For converging motions, as the ones generated by SDS-RM, the TS can be conveniently placed at the goal. We defined the goal as q_g = 1 + [0, 0, 0]⊤ for UQs and as G = diag([100, 100]) for SPD matrices, but other choices are possible. It is worth noticing that the described procedure is rather general and can be applied to Euclidean benchmarks different from the LASA dataset.

Data in the original LASA dataset are 2D (xy-plane), but the TSs of UQs and 2 × 2 SPD matrices are³ 3D. To add the third dimension, we consider the 7 demonstrations of each motion class as a matrix C_i, for i = 1, …, 30, with 14 rows and 1000 columns. Out of each C_i, we extract the 4 matrices C_{1,i} = C_i[0:2, :], C_{2,i} = C_i[4:6, :], C_{3,i} = C_i[8:10, :], and C_{4,i} = C_i[[12, 13, 0], :]. As a result, we obtain 4 demonstrations for each motion class, with the third component sampled from the demonstration along the x-axis. In this way, the third component is similar across different demonstrations of the same motion – as in a typical LfD setting – and contains sufficient complexity. Finally, the 3D trajectories contained in the matrices C_{1,i} to C_{4,i} are projected to the corresponding manifold using the exponential map. For UQ, we scale the data between [−1, 1] before projecting them onto the unit sphere.

6.2. Evaluation procedure

We use the Riemannian LASA dataset described in the previous section to compare SDS-RM against two baselines and three state-of-the-art approaches. The baselines are built considering as base DS the Euclidean dynamics in (7) and a GMM-based diffeomorphism in Euclidean space. The baseline for UQs, named DS+Normalize, performs an extra normalization step to fulfill manifold constraints. The baseline for SPD matrices, named DS+Cholesky, exploits the Cholesky decomposition and Mandel's notation to vectorize the matrix for training and to re-build an SPD matrix from the generated vector. The other approaches included in the comparison are FDM, R-GMM, and E-FLOW. The Riemannian LASA dataset contains 30 classes. In this experiment, we consider a subset of 26 individual motions that neglects the 4 multi-modal motions. The multi-modal motions, obtained by combining 2 or 3 individual motions, contain different patterns

³ The TS of an SPD matrix is a symmetric matrix which can be vectorized. For 2 × 2 SPD matrices the TS has 3 independent components.
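The slicing that builds the four 3D demonstrations from each 14 × 1000 matrix C_i can be sketched in NumPy (our illustration with random stand-in data; we read the paper's index ranges as inclusive, so its [0:2] becomes Python's [0:3]):

```python
import numpy as np

# Stand-in for one motion class: 7 demos x 2 rows (x, y) = 14 rows, 1000 samples.
rng = np.random.default_rng(0)
C_i = rng.standard_normal((14, 1000))

# The paper's inclusive ranges [0:2], [4:6], [8:10] become Python's [0:3], etc.
C1 = C_i[0:3, :]             # rows 0, 1, 2
C2 = C_i[4:7, :]             # rows 4, 5, 6
C3 = C_i[8:11, :]            # rows 8, 9, 10
C4 = C_i[[12, 13, 0], :]     # rows 12, 13, plus row 0 (x-axis of the first demo)
demos_3d = [C1, C2, C3, C4]  # four 3D demonstrations per motion class
```

Note that `C4` re-uses row 0, so its third component is shared with `C1`, which keeps the third dimension similar across demonstrations of the same class.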
Fig. 6. Qualitative results obtained on the Riemannian LASA dataset. Reproduced trajectories (brown solid lines) are obtained by applying the diffeomorphism learned
with SDS-RM on the TS demonstrations (black dashed lines). It is worth noticing that TS data are in 3D, but we choose a view angle that makes the plot similar to
the original 2D data. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Table 3
Results for different learning approaches applied to the Riemannian LASA dataset. We report mean and standard deviation for RMSE and TT.

M         Approach         k     RMSE             TT [s]
S³        SDS-RM (ours)    10    0.029 ± 0.019    0.755 ± 0.241
          DS+Normalize     10    0.126 ± 0.113    1.621 ± 0.707
          FDM [9]          250   0.043 ± 0.030    0.201 ± 0.053
          R-GMM [22]       10    0.036 ± 0.016    0.387 ± 0.047
          E-FLOW [23]      1000  0.141 ± 0.128    112.25 ± 0.570
S²₊₊      SDS-RM (ours)    10    0.029 ± 0.019    1.514 ± 0.726
          DS+Cholesky      10    0.121 ± 0.043    1.899 ± 0.490
          FDM [9]          250   0.042 ± 0.029    0.221 ± 0.023
          R-GMM [22]       10    0.037 ± 0.017    14.514 ± 1.560
          E-FLOW [23]      1000  0.140 ± 0.126    127.55 ± 8.434
The authors would like to thank N. Perrin and P. Schlehuber-Caissier for providing the source code of the FDM approach in [9].
Fig. 8. Results for the data-efficiency test. SDS-RM is trained by sub-sampling each demonstration to 100 points. (a) SDS-RM reproduces accurately the sub-sampled demonstrations. (b) Adjusting the sampling time, SDS-RM reproduces accurately the original demonstrations (1000 points) without re-training. (c) E-FLOW is inaccurate when learning from sub-sampled demonstrations. (d) E-FLOW achieves the SDS-RM accuracy when learning from the original demonstrations (1000 points).

is an expected result as FDM learns from a single demonstration, obtained in this case by averaging the 4 demonstrations in each class. We expect a similar result by applying DMP-based approaches [20,21]. Regarding the training time, SDS-RM learns a UQ motion (4D) on average in 0.755 s, while FDM takes only 0.201 s. For SPD data (2 × 2 matrices), SDS-RM needs on average 1.514 s to learn a motion, while FDM takes only 0.221 s. This is also expected as FDM uses only 1 demonstration for training, resulting in 4 times fewer data than SDS-RM. To summarize, FDM learns very quickly and is recommended in applications where a training time below 0.5 s is needed. However, most applications do not have such a strict training time requirement but need accurate motions. In this case, SDS-RM offers high accuracy with a reasonable training time.

R-GMM learns from multiple demonstrations, but it outputs the same trajectory irrespective of the initial pose. This results in a loss of accuracy, with SDS-RM being about 25% more accurate than R-GMM. Regarding the training time, R-GMM learns faster

7. Robot experiments

This section presents experiments⁴ with a 7 Degree of Freedom (DoF) manipulator (Franka Emika Panda). The robot's behavior is governed by the Cartesian impedance controller

F_p = K_p (p_d − p) + D_p (ṗ_d − ṗ),
F_o = K_o Log_q(q_d) + D_o (ω_d − ω),    (20)

where the subscript p stands for position, o for orientation, and d for desired. p_d and q_d are the desired position and orientation (expressed as a UQ) of the robot end-effector. p and q indicate the measured position and orientation of the end-effector. Desired and measured linear (angular) velocities are indicated as ṗ_d and ṗ (ω_d and ω). K_{p/o} and D_{p/o} are the robot stiffness and damping matrices expressed in the robot base frame. Given the stiffness matrices K_{p/o} – computed as detailed later in this section – the damping matrices D_{p/o} are obtained by the double diagonalization design [61]. The Cartesian forces defined in (20) are mapped into joint torques using the transpose of the manipulator Jacobian (J⊤),

τ_d = J⊤ [F_p; F_o],    (21)

⁴ A video of the experiments is available as supplementary material.
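A minimal sketch of the translational part of the impedance law (20) and the Jacobian-transpose mapping (21) (our illustration with toy gains and a toy Jacobian, not the controller used on the robot; the damping values are hypothetical and the double diagonalization design is not shown):

```python
import numpy as np

def impedance_force(p, pd, v, vd, Kp, Dp):
    """Translational part of (20): F_p = Kp (pd - p) + Dp (pd_dot - p_dot)."""
    return Kp @ (pd - p) + Dp @ (vd - v)

def joint_torques(J, wrench):
    """Eq. (21): map the stacked Cartesian wrench [F_p; F_o] to joint torques via J^T."""
    return J.T @ wrench

Kp = np.diag([1000.0, 1000.0, 1000.0])   # N/m, as in the stacking experiment
Dp = np.diag([60.0, 60.0, 60.0])         # hypothetical damping gains
F_p = impedance_force(p=np.array([0.4, 0.0, 0.3]), pd=np.array([0.5, 0.0, 0.3]),
                      v=np.zeros(3), vd=np.zeros(3), Kp=Kp, Dp=Dp)

J = np.ones((6, 7))                      # toy 6x7 Jacobian of a 7-DoF arm
tau = joint_torques(J, np.concatenate([F_p, np.zeros(3)]))  # F_o set to zero here
```

A 0.1 m error along x with a 1000 N/m stiffness yields a 100 N restoring force, which the transpose of the (toy) Jacobian distributes over the seven joints.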
The task of stacking bottles in a rack requires continuous adjustment of position and orientation (see Fig. 10). Apart from reaching a desired (stacking) pose, the robot should follow accurately the demonstrated trajectory to prevent hitting the rack. We provide 3 kinesthetic demonstrations (dashed lines in Fig. 9) starting at different locations and converging to the stacking pose defined by p_g = [0.474, 0.185, 0.155]⊤ m and q_g = −0.520 + [0.542, 0.442, 0.491]⊤. The demonstrations are of the form {{p_{l,d}^{demo}, q_{l,d}^{demo}}_{l=1}^{L}}_{d=1}^{D}, where L is the total length of the demonstrations and D = 3 is the number of demonstrations. Demonstrated positions and orientations are encoded into two stable DSs using SDS-RM. We empirically use 10 Gaussian components for each system. It is worth mentioning that, in order to fit position trajectories, we simply replace the logarithmic and exponential maps in Algorithm 1 with an identity map. Results of the learning process are shown in Fig. 9 (solid lines). The robot is controlled with the Cartesian impedance controller (20), where p_d and q_d are generated with SDS-RM. The stiffness matrices are kept to constant high values (K_p = diag([1000, 1000, 1000]) N/m and K_o = diag([150, 150, 150]) N m/rad) in this task. With the selected impedance gains, the robot is able to follow the desired pose trajectories, as shown in the top row of Fig. 10.

One of the interesting properties of DS-based trajectory generators is the possibility to converge to different goals. Changing the goal is possible also on Riemannian manifolds by following the approach we have presented in Section 5.3. To demonstrate the robustness of SDS-RM to goal switching, we repeated the stacking task in different conditions. In each case, the shifted goal pose is obtained by manually placing the robot in a suitable release configuration (from which the bottle was released inside the rack) without correcting for small differences in the z-axis. Fig. 10 (middle) shows the robot successfully stacking the bottle at a different position (p_g = [0.385, 0.143, 0.172]⊤ m). Fig. 10 (bottom) shows the robot successfully performing the stacking task with a rotated rack, which implies a significant change in the stacking orientation (q_g = −0.58 + [0.37, 0.63, 0.35]⊤) and a less pronounced change in the goal position (p_g = [0.469, 0.200, 0.165]⊤ m).

The results of this experiment show that SDS-RM accurately encodes full pose trajectories while ensuring convergence to a given target (even if not explicitly demonstrated) and fulfilling the underlying geometric constraints (the deviation of the norm from 1 is about 1 × 10⁻¹⁶) in variable orientation data. For comparison, the baseline DS+Normalize fails to accurately reach the goal and, as a result, the robot stacks the bottle in the wrong configuration (see the accompanying video). A similar lack of accuracy also affects E-FLOW, as experimentally shown in Section 6.3. R-GMM generates the same motion irrespective of the initial/goal point, which means it would fail in stacking the bottle in a different location.

the demonstrations and D = 3 is the number of demonstrations. As in the previous experiment, demonstrated positions and orientations are encoded into two stable DSs using SDS-RM. We use 10 Gaussian components for each system. The desired variable stiffness profiles are generated using the variability in the demonstrations, as suggested in several previous works [62–64]. More in detail, we first notice that the Cartesian impedance controller (20) assumes that position and orientation are decoupled. In other words, it assumes that positional errors only affect the force, while rotational errors only affect the torque. This allows us to learn independent linear and angular stiffness profiles. The idea is to compute the desired stiffness profile from the inverse of the covariance matrix.

For the linear stiffness matrix K_p, we first compute the covariance matrix of the D demonstrated positions for each time step l = 1, …, L as

Σ_{p,l} = (1/D) Σ_{i=1}^{D} (p_{l,i}^{demo} − µ_{p,l})(p_{l,i}^{demo} − µ_{p,l})⊤,    (22)

where the mean µ_{p,l} is computed as

µ_{p,l} = (1/D) Σ_{i=1}^{D} p_{l,i}^{demo}.    (23)

Then, we compute the eigenvalue decomposition of each Σ_{p,l} = V_l Λ_l V_l⊤, where V_l is an orthogonal matrix and Λ_l = diag([λ_{1,l}, λ_{2,l}, λ_{3,l}]). Since all the demonstrations end up in the same position, we know that the eigenvalues of Σ_{p,L} vanish, i.e., Λ_L = 0 and V_L = I. Moreover, we want the stiffness to be maximum at L. Therefore, we compute the desired stiffness profile as

K_{p,l}^{demo} = V_l diag([1/(λ_{1,l} + k̄_p⁻¹), 1/(λ_{2,l} + k̄_p⁻¹), 1/(λ_{3,l} + k̄_p⁻¹)]) V_l⊤,    (24)

where the maximum linear stiffness gain is set to k̄_p = 1000 N/m. As shown in Fig. 11(b), the stiffness profile in (24) converges to K_{p,L} = diag([k̄_p, k̄_p, k̄_p]) N/m and varies according to the variability in the demonstrations. Note that existing approaches also impose a minimum value for the stiffness. This is straightforward to implement, but it was not needed in the performed experiment as the minimum value of the stiffness computed by means of (24) was already enough to track the desired trajectory.
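The stiffness computation of Eqs. (22)–(24) can be sketched as follows (our illustration; the regularizer 1/k̄_p in the denominator is the form consistent with the stated convergence to K_{p,L} = k̄_p I when the variability vanishes):

```python
import numpy as np

def stiffness_from_demos(P_l, k_bar=1000.0):
    """Variable stiffness at time step l from D demonstrated positions, Eqs. (22)-(24).

    P_l: (D, 3) array of demonstrated positions at step l.
    High variability across demos -> low stiffness; zero variability -> k_bar * I.
    """
    mu = P_l.mean(axis=0)                                  # Eq. (23)
    E = P_l - mu
    Sigma = E.T @ E / P_l.shape[0]                         # Eq. (22)
    lam, V = np.linalg.eigh(Sigma)                         # Sigma = V diag(lam) V^T
    return V @ np.diag(1.0 / (lam + 1.0 / k_bar)) @ V.T    # Eq. (24)

# At the goal all demonstrations coincide, so the stiffness is maximal (k_bar * I).
K_goal = stiffness_from_demos(np.tile([0.474, 0.185, 0.155], (3, 1)))

# Demos spread along x give a softer behavior along x than along y and z.
K_var = stiffness_from_demos(np.array([[0.1, 0.0, 0.0],
                                       [-0.1, 0.0, 0.0],
                                       [0.0, 0.0, 0.0]]))
```

The eigendecomposition keeps the result symmetric positive definite, i.e., a valid SPD stiffness profile.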
Fig. 10. Results of the bottle stacking experiment collected from three runs, each starting from the same initial pose and converging to a different goal. Left panels show snapshots of the robot execution; right panels show desired and executed motion trajectories. (Top) The robot reproduces the trajectory generated by SDS-RM from known (demonstrated) starting and goal poses. (Middle) The robot stacks the bottle at a different position (p_g = [0.385, 0.143, 0.172]⊤ m). (Bottom) The rack is rotated and the robot stacks the bottle at a different pose (p_g = [0.469, 0.200, 0.165]⊤ m and q_g = −0.58 + [0.37, 0.63, 0.35]⊤).
the wooden plate from a (blue) container and reaches the drill pose. During the motion, the robot is compliant, which allows a safer response to possible external perturbations. For instance, a high initial stiffness combined with (small) inaccuracies in the generated trajectory could have generated a jerky motion of the container while the robot lifts the wooden plate. The goal pose, instead, is reached with maximum stiffness. As shown in Fig. 11(c), during the drilling task the maximum position deviation along the drilling direction (x-axis) is 3.1 cm (generating a reaction force of 31 N), while the maximum orientation deviation about the z-axis (where the robot perceives the highest momentum) is 1.8 deg (generating a reaction torque of 4.7 N m). This shows that the robot is capable of keeping the goal pose, letting the human co-worker drill the wooden plate.

8. Conclusions

In this paper, we presented Stable Dynamical Systems on Riemannian Manifolds (SDS-RM), an approach to learn stable DSs evolving on Riemannian manifolds. SDS-RM builds on theoretical stability results, derived for dynamics evolving on Riemannian manifolds, to learn stable and accurate DS representations of Riemannian data. Similar to its Euclidean counterparts, SDS-RM learns a diffeomorphic transformation between a simple stable system and a set of complex demonstrations. The key difference wrt Euclidean approaches is that SDS-RM uses tools from differential geometry to correctly represent complex manifold data, such as orientations and stiffness matrices, with their underlying geometric constraints, e.g., unit norm for unit quaternion orientations and symmetry and positive definiteness for stiffness matrices. The proposed approach is first evaluated in simulation and compared with an existing approach, modified to deal with Riemannian data. Due to the lack of publicly available Riemannian datasets, we developed a procedure to augment a popular – and potentially any other – Euclidean benchmark with UQ and SPD profiles. Finally, in order to perform a thorough evaluation, we also conducted a set of experiments with a real robot performing bottle stacking and cooperative (with a human operator) drilling. Overall, the conducted evaluation shows that SDS-RM represents a good compromise between accuracy and training time and that it can be effectively adopted to generate complex robotic skills on manifolds.

In our evaluation, we show the adaptation capabilities of SDS-RM by changing initial and/or goal points. However, in more general settings, it is necessary to incorporate task-dependent parameters to adapt the execution to different domains. Augmenting SDS-RM with task-dependent parameters while maintaining its stability properties is an interesting research direction that we intend to explore in future work.

Moreover, SDS-RM has been evaluated on orientation (UQ) and stiffness (SPD) profiles, but it may be extended to other Riemannian manifolds. Therefore, our future research will also focus on investigating the possibility to learn stable DSs on diverse manifolds like Grassmann or hyperbolic manifolds. Grassmann manifolds elegantly encode orthogonal projections, while hyperbolic manifolds represent a continuous embedding of discrete structures with possible applications to task and motion planning. These manifolds are widely unexploited in robotics and can potentially

Data availability

The link to the data is shared in the manuscript. Code will be shared on request.

Acknowledgments

Part of the research presented in this work was conducted when M. Saveriano was at the Department of Computer Science, University of Innsbruck, Innsbruck, Austria, and F. Abu-Dakka was at the Intelligent Robotics Group, Department of Electrical Engineering and Automation, Aalto University, Finland.

This work has been partially supported by the Austrian Research Foundation (Euregio IPN 86-N30, OLIVER), by the CHIST-ERA project IPALM (Academy of Finland decision 326304), by the European Union under the NextGenerationEU project iNest (ECS 00000043), and by the euROBIN project under grant agreement No. 101070596.

Appendix A. Jacobian of the mean of a GMR

Recall that

ψ(a) = Σ_{k=1}^{K} h_k(a) ψ̂_k(a),
ψ̂_k(a) = µ_k^b + Σ_k^{ba} (Σ_k^{aa})⁻¹ (a − µ_k^a).    (27)

Using the chain rule and (27), J_ψ(a) writes as

J_ψ(a) = ∂ψ(a)/∂a = Σ_{k=1}^{K} [ (∂h_k(a)/∂a) ψ̂_k(a)⊤ + h_k(a) (∂ψ̂_k(a)/∂a) ].    (28)

Let us compute the two partial derivatives on the right side of (28) separately. Considering the expression of ψ̂_k(a) in (27), and applying the chain rule, it is easy to verify that

∂ψ̂_k(a)/∂a = Σ_k^{ba} (Σ_k^{aa})⁻¹.    (29)

Using the quotient rule, and setting N̂_k = N(a|µ_k^a, Σ_k^{aa}), the expression of ∂h_k(a)/∂a writes as

∂h_k(a)/∂a = [ π_k (∂N̂_k/∂a) Σ_{i=1}^{K} π_i N̂_i − π_k N̂_k Σ_{i=1}^{K} π_i (∂N̂_i/∂a) ] / ( Σ_{i=1}^{K} π_i N̂_i )².    (30)

Recall that the derivative of a multivariate Gaussian distribution N̂ wrt the input is given by [65]

∂N̂/∂a = −N̂ Σ⁻¹ (a − µ).    (31)

Using (31) to compute the derivatives in (30), we obtain

∂h_k(a)/∂a = −π_k N̂_k (Σ_k^{aa})⁻¹ (a − µ_k^a) Σ_{i=1}^{K} π_i N̂_i
(∑ )2
K
unlock new applications. i=1 πi N̂i
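The derivation in Appendix A can be verified numerically. The sketch below (a hypothetical toy setup, not the authors' implementation; the names `gauss` and `gmr_mean_and_jacobian` and all dimensions are ours) computes the GMR conditional mean \psi(a) and its Jacobian following (27)–(31), using the denominator-layout convention J[i, j] = \partial \psi_j / \partial a_i, and can be checked against finite differences:

```python
import numpy as np

# Toy GMM over the joint variable (a, b); weights, means, and covariances
# are illustrative random values, not learned from data.
rng = np.random.default_rng(0)
K, da, db = 3, 2, 2
pi = np.array([0.5, 0.3, 0.2])             # mixture weights pi_k
mu = rng.normal(size=(K, da + db))         # joint means [mu_a; mu_b]
Sigma = np.empty((K, da + db, da + db))
for k in range(K):
    A = rng.normal(size=(da + db, da + db))
    Sigma[k] = A @ A.T + np.eye(da + db)   # symmetric positive definite

def gauss(a, m, S):
    """Density N(a | m, S) of a da-dimensional Gaussian."""
    d = a - m
    return float(np.exp(-0.5 * d @ np.linalg.solve(S, d))
                 / np.sqrt((2 * np.pi) ** da * np.linalg.det(S)))

def gmr_mean_and_jacobian(a):
    """GMR mean psi(a), Eq. (27), and its Jacobian, Eqs. (28)-(31)."""
    Saa = Sigma[:, :da, :da]
    Nk = np.array([gauss(a, mu[k, :da], Saa[k]) for k in range(K)])
    w = pi * Nk
    s = w.sum()
    h = w / s                              # responsibilities h_k(a)
    # psi_hat_k(a) = mu_b + Sigma_ba inv(Sigma_aa) (a - mu_a)        (27)
    psi_hat = np.array([mu[k, da:] + Sigma[k, da:, :da]
                        @ np.linalg.solve(Saa[k], a - mu[k, :da])
                        for k in range(K)])
    psi = h @ psi_hat
    # dN_k/da = -N_k inv(Sigma_aa) (a - mu_a)                        (31)
    dN = np.array([-Nk[k] * np.linalg.solve(Saa[k], a - mu[k, :da])
                   for k in range(K)])
    # quotient rule for dh_k/da                                       (30)
    dh = (pi[:, None] * dN) / s \
         - np.outer(w, (pi[:, None] * dN).sum(axis=0)) / s ** 2
    # J_psi = sum_k dh_k psi_hat_k^T + h_k inv(Sigma_aa) Sigma_ab    (28)-(29)
    J = sum(np.outer(dh[k], psi_hat[k])
            + h[k] * np.linalg.solve(Saa[k], Sigma[k, :da, da:])
            for k in range(K))
    return psi, J                          # J[i, j] = d psi_j / d a_i
```

Comparing `J` against a central finite-difference approximation of \partial \psi / \partial a at a test point confirms that the analytic expression matches the numerical Jacobian.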
M. Saveriano, F.J. Abu-Dakka and V. Kyrki Robotics and Autonomous Systems 169 (2023) 104510
[51] M. Mukadam, C.-A. Cheng, D. Fox, B. Boots, N. Ratliff, Riemannian motion policy fusion through learnable Lyapunov function reshaping, in: Conference on Robot Learning, PMLR, 2020, pp. 204–219.
[52] J. Sola, J. Deray, D. Atchuthan, A micro Lie theory for state estimation in robotics, 2018, arXiv preprint arXiv:1812.01537.
[53] X. Pennec, P. Fillard, N. Ayache, A Riemannian framework for tensor computing, Int. J. Comput. Vis. 66 (1) (2006) 41–66.
[54] S. Sra, R. Hosseini, Conic geometric optimization on the manifold of positive definite matrices, SIAM J. Optim. 25 (1) (2015) 713–739.
[55] J. Slotine, W. Li, Applied Nonlinear Control, Prentice-Hall, Englewood Cliffs, 1991.
[56] F. Pait, D. Colón, Some properties of the Riemannian distance function and the position vector X, with applications to the construction of Lyapunov functions, in: IEEE Conference on Decision and Control, 2010, pp. 6277–6280.
[57] S. Fiori, Manifold calculus in system theory and control – fundamentals and first-order systems, Symmetry 13 (11) (2021).
[58] X. Pennec, Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements, J. Math. Imaging Vision 25 (1) (2006) 127–154.
[59] B. Siciliano, L. Sciavicco, L. Villani, G. Oriolo, Robotics: Modelling, Planning and Control, Springer, 2009.
[60] S. Calinon, Robot Programming by Demonstration: A Probabilistic Approach, EPFL/CRC Press, 2009.
[61] A. Albu-Schaffer, C. Ott, U. Frese, G. Hirzinger, Cartesian impedance control of redundant robots: Recent results with the DLR-light-weight-arms, in: IEEE International Conference on Robotics and Automation, 2003, pp. 3704–3709.
[62] S. Calinon, I. Sardellitti, D.G. Caldwell, Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010, pp. 249–254.
[63] J. Silvério, Y. Huang, F.J. Abu-Dakka, L. Rozo, D.G. Caldwell, Uncertainty-aware imitation learning using kernelized movement primitives, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 2019, pp. 90–97.
[64] K. Kronander, A. Billard, Learning compliant manipulation through kinesthetic and tactile human-robot interaction, IEEE Trans. Haptics 7 (3) (2013) 367–380.
[65] K.B. Petersen, M.S. Pedersen, The Matrix Cookbook, Technical University of Denmark, 2012.

Matteo Saveriano received his B.Sc. and M.Sc. degrees in automatic control engineering from the University of Naples, Italy, in 2008 and 2011, respectively. He received his Ph.D. from the Technical University of Munich in 2017. Currently, he is an assistant professor at the Department of Industrial Engineering (DII), University of Trento, Italy. Previously, he was an assistant professor at the University of Innsbruck and a post-doctoral researcher at the German Aerospace Center (DLR). He is an Associate Editor for RA-L. His research activities include robot learning, human–robot interaction, and understanding and interpreting human activities.

Fares J. Abu-Dakka received his B.Sc. degree in Mechanical Engineering from Birzeit University, Palestine, in 2003 and his DEA and Ph.D. degrees in robotics and motion planning from the Polytechnic University of Valencia, Spain, in 2006 and 2011, respectively. Between 2013 and 2016, he held a visiting professor position at ISA of the Carlos III University of Madrid, Spain. Between 2016 and 2019, he was a Postdoc at Istituto Italiano di Tecnologia. During 2019–2022, he was a Research Fellow at Aalto University. Since 2022, he has been a Senior Scientist leading the Robot Learning Group at MIRMI, Technical University of Munich, Germany. His research lies at the intersection of control theory, differential geometry, and machine learning, aiming to enhance robot manipulation performance and safety. He is an Associate Editor for IEEE ICRA, IEEE IROS, and IEEE RA-L.

Ville Kyrki is Associate Professor in automation technology at the Department of Electrical Engineering and Automation, Aalto University, Finland. During 2009–2012 he was a professor in computer science with specialization in intelligent robotics at the Lappeenranta University of Technology, Finland. He received the M.Sc. and Ph.D. degrees in computer science from Lappeenranta University of Technology in 1999 and 2002, respectively. He conducts research mainly in intelligent robotic systems, with emphasis on methods and systems that cope with imperfect knowledge and uncertain senses.