Robotics 2
Trajectory Tracking Control
Prof. Alessandro De Luca
Inverse dynamics control
given the robot dynamic model with 𝑁 joints
$M(q)\,\ddot{q} + n(q,\dot{q}) = u$,   with $n(q,\dot{q}) = c(q,\dot{q}) + g(q) + \text{friction terms}$
and a twice-differentiable desired trajectory for 𝑡 ∈ [0, 𝑇]
$q_d(t) \;\rightarrow\; \dot{q}_d(t),\ \ddot{q}_d(t)$
applying the feedforward torque in nominal conditions
$u_d = M(q_d)\,\ddot{q}_d + n(q_d,\dot{q}_d)$
yields exact reproduction of the desired motion, provided that
$q(0) = q_d(0)$, $\dot{q}(0) = \dot{q}_d(0)$   (matched initial state)
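As a concrete illustration (added here, not on the original slide), a minimal sketch of this nominal feedforward for a single link rotating in a vertical plane; the inertia, mass, CoM distance, friction coefficient and trajectory below are illustrative assumptions:

```python
# nominal feedforward u_d = M(q_d) qdd_d + n(q_d, qd_d) for a 1-dof link under gravity
import numpy as np

I, m, d, b, g0 = 0.5, 2.0, 0.3, 0.1, 9.81    # assumed link parameters

def u_ffw(qd, qd_dot, qd_ddot):
    """Feedforward torque along the desired trajectory (exact only in nominal, matched conditions)."""
    return I * qd_ddot + b * qd_dot + m * g0 * d * np.sin(qd)

# smooth rest-to-rest desired motion of pi/2 rad in T = 2 s
t = np.linspace(0.0, 2.0, 5)
qd = (np.pi / 4) * (1 - np.cos(np.pi * t / 2.0))
qd_dot = (np.pi**2 / 8) * np.sin(np.pi * t / 2.0)
qd_ddot = (np.pi**3 / 16) * np.cos(np.pi * t / 2.0)
print(u_ffw(qd, qd_dot, qd_ddot))
```

Applying this torque to the nominal model from a matched initial state reproduces $q_d(t)$ exactly; any of the mismatches listed on the next slide makes feedback necessary.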
In practice ...
several differences from the nominal conditions arise:
- the initial state is "not matched" to the desired trajectory $q_d(t)$
- disturbances on the actuators, from unexpected collisions, truncation errors on data, ...
- inaccurate knowledge of the robot dynamic parameters $\pi \rightarrow \hat{\pi}$ (link masses, inertias, center of mass positions)
- unknown value of the carried payload
- presence of unmodeled dynamics (complex friction phenomena, transmission elasticity, ...)
⟹ these require the use of feedback information
Introducing feedback
$\hat{u}_d = \hat{M}(q_d)\,\ddot{q}_d + \hat{n}(q_d,\dot{q}_d)$,   with $\hat{M}$, $\hat{n}$ estimates of the terms (or coefficients) in the dynamic model
note: $\hat{u}_d$ can be computed off line [e.g., by $\widehat{NE}_{\alpha}(q_d,\dot{q}_d,\ddot{q}_d)$]
feedback is introduced to make the control scheme more robust
different possible implementations, depending on how the computational load is shared
- OFF LINE (open loop)
- ON LINE (closed loop)
two-step control design:
1. compensation (feedforward) or cancellation (feedback) of nonlinearities
2. synthesis of a linear control law stabilizing the trajectory error to zero
A series of trajectory controllers
(assuming the nominal case: $\hat{M} = M$, $\hat{n} = n$)

1. inverse dynamics compensation (FFW) + PD local stabilization of the trajectory error $e(t) = q_d(t) - q(t)$
   $u = \hat{u}_d + K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q})$

2. inverse dynamics compensation (FFW) + variable PD   (global, under additional conditions on $K_P$ and $K_D$)
   $u = \hat{u}_d + \hat{M}(q_d)\,\big[K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q})\big]$

3. feedback linearization (FBL) + [PD+FFW] = "COMPUTED TORQUE"   (a numerical sketch follows below)
   $u = \hat{M}(q)\,\big[\ddot{q}_d + K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q})\big] + \hat{n}(q,\dot{q})$

4. feedback linearization (FBL) + [PID+FFW]
   $u = \hat{M}(q)\,\Big[\ddot{q}_d + K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q}) + K_I \int (q_d - q)\,dt\Big] + \hat{n}(q,\dot{q})$
global stabilization for any $K_P > 0$, $K_D > 0$ (and not too large $K_I > 0$)
more robust to small uncertainties/disturbances, even if more complex to implement in real time
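Below is a minimal sketch (not part of the slides) of controller 3, the computed torque law $u = \hat{M}(q)\,[\ddot{q}_d + K_P(q_d - q) + K_D(\dot{q}_d - \dot{q})] + \hat{n}(q,\dot{q})$; the model functions `M_hat`, `n_hat`, the toy 2-dof example values and the gains are illustrative assumptions:

```python
import numpy as np

def computed_torque(q, dq, q_d, dq_d, ddq_d, M_hat, n_hat, Kp, Kd):
    """FBL + [PD+FFW]: commanded acceleration a, then torque u = M_hat(q) a + n_hat(q, dq)."""
    a = ddq_d + Kp @ (q_d - q) + Kd @ (dq_d - dq)
    return M_hat(q) @ a + n_hat(q, dq)

# toy 2-dof "estimated model": diagonal inertia, gravity on the first joint, viscous friction (assumed values)
M_hat = lambda q: np.diag([2.0, 0.5])
n_hat = lambda q, dq: np.array([9.81 * 1.5 * np.sin(q[0]), 0.0]) + 0.1 * dq
Kp, Kd = np.diag([100.0, 100.0]), np.diag([20.0, 20.0])

q, dq = np.array([0.1, -0.2]), np.zeros(2)                          # current state
q_d, dq_d, ddq_d = np.array([0.5, 0.3]), np.zeros(2), np.zeros(2)   # desired trajectory sample
print(computed_torque(q, dq, q_d, dq_d, ddq_d, M_hat, n_hat, Kp, Kd))
```

In the nominal case this law reduces each joint to a double integrator driven by $a$, which is exactly the interpretation developed in the next slides.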
Feedback linearization control
[block diagram: the nonlinear control law $u = \hat{M}(q)\,a + \hat{n}(q,\dot{q})$, with $a = \ddot{q}_d + K_D\,(\dot{q}_d - \dot{q}) + K_P\,(q_d - q)$, closed around the ROBOT (or its DYNAMIC MODEL) $M(q)\,\ddot{q} + n(q,\dot{q}) = u$]

$K_P$, $K_D$ symmetric and positive definite matrices

in nominal conditions ($\hat{M} = M$, $\hat{n} = n$):
$M(q)\,\ddot{q} + n(q,\dot{q}) = u = M(q)\,a + n(q,\dot{q}) \;\Rightarrow\; \ddot{q} = a$
the nonlinear control law, wrapped around the nonlinear robot dynamics, produces a linear and decoupled system, with global asymptotic stabilization of the tracking error
Interpretation in the linear domain
[block diagram: reference $\ddot{q}_d + K_D\,\dot{q}_d + K_P\,q_d$ driving a double integrator, with diagonal feedback gains $K_P > 0$, $K_D > 0$ on $q$ and $\dot{q}$]

under feedback linearization control, the robot has a dynamic behavior that is invariant, linear and decoupled in its whole state space $\forall (q,\dot{q})$:
a unitary mass ($m = 1$) in the joint space !!
linearity
error transients $e_i = q_{d,i} - q_i \rightarrow 0$ exponentially, with rates prescribed by the choice of $K_{P_i}$, $K_{D_i}$
decoupling
each joint coordinate $q_i$ evolves independently of the others, forced only by $a_i$
$\ddot{e} + K_D\,\dot{e} + K_P\,e = 0 \;\Longleftrightarrow\; \ddot{e}_i + K_{D_i}\,\dot{e}_i + K_{P_i}\,e_i = 0, \quad i = 1,\dots,N$
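As a worked illustration of the gain choice (added here, not on the slide), placing both poles of each decoupled error equation at $s = -\lambda_i$ gives a critically damped transient:

$s^2 + K_{D_i}\,s + K_{P_i} = (s + \lambda_i)^2 \;\Rightarrow\; K_{P_i} = \lambda_i^2,\quad K_{D_i} = 2\lambda_i$

$e_i(t) = \big[e_i(0) + \big(\dot{e}_i(0) + \lambda_i\,e_i(0)\big)\,t\big]\,e^{-\lambda_i t}$

so a larger $\lambda_i$ means faster error convergence, at the price of larger torque peaks (see the later remark on motor saturations).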
Addition of an integral term: PID
whiteboard…
[block diagram: as in the previous scheme, with an additional integral action $K_I \int e\,dt$ on the position error $e = q_d - q$]

⇒ $K_{P_i} > 0$, $K_{D_i} > 0$, $0 < K_{I_i} < K_{P_i} K_{D_i}$   (diagonal gain matrices)
$\ddot{q} = a = \ddot{q}_d + K_D\,(\dot{q}_d - \dot{q}) + K_P\,(q_d - q) + K_I \int_0^t (q_d - q)\,d\tau$,   with $e = q_d - q$

⇒ for each joint error $e_i = q_{d,i} - q_i$ ($i = 1,\dots,N$):   $\ddot{e}_i + K_{D_i}\,\dot{e}_i + K_{P_i}\,e_i + K_{I_i} \int_0^t e_i\,d\tau = 0$

taking the Laplace transform $\mathcal{L}[e_i(t)]$ ⇒ $\left(s^2 + K_{D_i}\,s + K_{P_i} + \dfrac{K_{I_i}}{s}\right) e_i(s) = 0$

multiplying by $s$ ⇒ $\left(s^3 + K_{D_i}\,s^2 + K_{P_i}\,s + K_{I_i}\right) e_i(s) = 0$

Routh array:
$s^3$:   $1$,   $K_{P_i}$
$s^2$:   $K_{D_i}$,   $K_{I_i}$
$s^1$:   $(K_{D_i} K_{P_i} - K_{I_i})/K_{D_i}$
$s^0$:   $K_{I_i}$

⇒ exponential stability conditions by the Routh criterion
Remarks
- the desired joint trajectory can be generated from Cartesian data $\ddot{p}_d(t)$, $\dot{p}_d(0)$, $p_d(0)$ (a numerical sketch is given at the end of this slide):
  $q_d(0) = f^{-1}(p_d(0))$
  $\dot{q}_d(0) = J^{-1}(q_d(0))\,\dot{p}_d(0)$
  $\ddot{q}_d(t) = J^{-1}(q_d(t))\,\big[\ddot{p}_d(t) - \dot{J}(q_d(t))\,\dot{q}_d(t)\big]$
  with $\dot{q}_d(t)$ and $q_d(t)$ obtained by time integration
- real-time computation by the Newton-Euler algorithm: $u_{FBL} = \widehat{NE}_{\alpha}(q, \dot{q}, a)$
- simulation of feedback linearization control
  [block diagram: $q_d(t)$, $\dot{q}_d(t)$, $\ddot{q}_d(t)$ → feedback linearization (estimated parameters $\hat{\pi}$) → robot (true parameters $\pi$) → $q$, $\dot{q}$ fed back]
  Hint: there is no use in simulating this control law in the ideal case ($\hat{\pi} = \pi$); the robot behavior will be identical to the linear and decoupled case of stabilized double integrators!!
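As announced in the first bullet above, here is a minimal sketch (not from the slides) of generating the joint-space references from Cartesian data by inverting the Jacobian and integrating $\ddot{q}_d$; the planar 2R arm, its link lengths and the Cartesian acceleration profile are illustrative assumptions:

```python
import numpy as np

l1, l2 = 0.5, 0.4   # assumed link lengths of a planar 2R arm

def J(q):
    """Analytic Jacobian of the planar 2R arm."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1*s1 - l2*s12, -l2*s12],
                     [ l1*c1 + l2*c12,  l2*c12]])

def Jdot(q, dq):
    """Time derivative of the Jacobian along (q, dq)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    d1, d12 = dq[0], dq[0] + dq[1]
    return np.array([[-l1*c1*d1 - l2*c12*d12, -l2*c12*d12],
                     [-l1*s1*d1 - l2*s12*d12, -l2*s12*d12]])

dt, T = 1e-3, 2.0
q_d = np.array([0.3, 0.8])        # q_d(0) = f^{-1}(p_d(0)), assumed already solved
dq_d = np.zeros(2)                # matched initial Cartesian velocity
for k in range(int(T / dt)):
    t = k * dt
    pdd_d = np.array([0.05 * np.sin(2*np.pi*t), 0.05 * np.cos(2*np.pi*t)])  # desired Cartesian acceleration
    ddq_d = np.linalg.solve(J(q_d), pdd_d - Jdot(q_d, dq_d) @ dq_d)         # joint acceleration reference
    dq_d += ddq_d * dt                                                      # numerical integration
    q_d += dq_d * dt
print(q_d, dq_d)
```

Near singularities $J(q_d)$ becomes ill-conditioned, so in practice a damped least-squares inverse would replace the plain `np.linalg.solve`.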
Further comments
- choice of the diagonal elements of $K_P$, $K_D$ (and $K_I$)
  - shaping the error transients, with an eye also to motor saturations...
  [plot: error transient $e(t) = q_d(t) - q(t)$ decaying from $e(0)$, with a critically damped transient]
- parametric identification
  - to be done in advance, using the property of linearity in the dynamic coefficients of the robot dynamic model
- choice of the sampling time of a digital implementation
  - compromise between computational time and tracking accuracy, typically $T_c = 0.5 \div 10$ ms
- exact linearization by (state) feedback is a general technique of nonlinear control theory
  - can be used for robots with elastic joints, wheeled mobile robots, ...
  - non-robotics applications: satellites, induction motors, helicopters, ...
Another example of feedback linearization design
- dynamic model of robots with elastic joints
  - $q$ = link position, $\theta$ = motor position (after the reduction gears): $2N$ generalized coordinates $(q, \theta)$
  - $B_m$ = diagonal matrix ($> 0$) of the inertia of the (balanced) motors
  - $K$ = diagonal matrix ($> 0$) of the (finite) stiffness of the joints
  - $4N$ state variables $x = (q, \theta, \dot{q}, \dot{\theta})$

  $M(q)\,\ddot{q} + c(q,\dot{q}) + g(q) + K\,(q - \theta) = 0$   (1)
  $B_m\,\ddot{\theta} + K\,(\theta - q) = u$   (2)

- is there a control law that achieves exact linearization via feedback?
  $u = \alpha(q,\theta,\dot{q},\dot{\theta}) + \beta(q,\theta,\dot{q},\dot{\theta})\,a$
  YES, and it yields the linear and decoupled system $\dfrac{d^4 q_i}{dt^4} = a_i$, $i = 1,\dots,N$: $N$ chains of 4 integrators (to be stabilized by a linear control design)
Hint: differentiate (1) w.r.t. time until motor acceleration 𝜃̈ appears;
substitute this from (2); choose 𝑢 so as to cancel all nonlinearities …
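A compact sketch of the hinted derivation (added here; the slide leaves it as an exercise), with $B_m$ denoting the motor inertia as above. Differentiating (1) twice with respect to time gives

$M(q)\,q^{[4]} + 2\dot{M}(q)\,q^{[3]} + \ddot{M}(q)\,\ddot{q} + \dfrac{d^2}{dt^2}\big[c(q,\dot{q}) + g(q)\big] + K\,(\ddot{q} - \ddot{\theta}) = 0$

Substituting $\ddot{\theta} = B_m^{-1}\big[u - K(\theta - q)\big]$ from (2), the input $u$ appears; choosing

$u = K\,(\theta - q) + B_m K^{-1}\Big[M(q)\,a + 2\dot{M}\,q^{[3]} + \ddot{M}\,\ddot{q} + \dfrac{d^2}{dt^2}\big(c + g\big) + K\,\ddot{q}\Big]$

cancels all nonlinearities and yields $q^{[4]} = a$, i.e., $\beta(x) = B_m K^{-1} M(q)$; note that $\ddot{q}$ and $q^{[3]}$ are functions of the state $x = (q,\theta,\dot{q},\dot{\theta})$ through (1) and its first time derivative.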
Alternative global trajectory controller
$u = M(q)\,\ddot{q}_d + S(q,\dot{q})\,\dot{q}_d + g(q) + F_v\,\dot{q}_d + K_P\,e + K_D\,\dot{e}$
with a SPECIAL factorization $S(q,\dot{q})$ such that $\dot{M} - 2S$ is skew-symmetric, and symmetric, positive definite gain matrices $K_P$, $K_D$
- global asymptotic stability of $(e, \dot{e}) = (0, 0)$ (trajectory tracking)
  - proven by Lyapunov + Barbalat (time-varying system) + LaSalle arguments
- does not produce a complete cancellation of nonlinearities
  - the variables $\dot{q}$ and $\ddot{q}$ that appear linearly in the model are evaluated on the desired trajectory
- does not induce a linear and decoupled behavior of the trajectory error $e(t) = q_d(t) - q(t)$ in the closed-loop system
  - however, it lends itself more easily to an adaptive version
- computation: by 4× the standard or 1× a modified NE algorithm
Analysis of asymptotic stability
of the trajectory error - 1
$M(q)\,\ddot{q} + S(q,\dot{q})\,\dot{q} + g(q) + F_v\,\dot{q} = u$   robot dynamics (including friction)
control law   $u = M(q)\,\ddot{q}_d + S(q,\dot{q})\,\dot{q}_d + g(q) + F_v\,\dot{q}_d + K_P\,e + K_D\,\dot{e}$
- Lyapunov candidate and its time derivative (with $e = q_d - q$)
  $V = \dfrac{1}{2}\,\dot{e}^T M(q)\,\dot{e} + \dfrac{1}{2}\,e^T K_P\,e \;\ge\; 0 \;\;\Rightarrow\;\; \dot{V} = \dfrac{1}{2}\,\dot{e}^T \dot{M}(q)\,\dot{e} + \dot{e}^T M(q)\,\ddot{e} + e^T K_P\,\dot{e}$
- the closed-loop system equations yield
  $M(q)\,\ddot{e} = -S(q,\dot{q})\,\dot{e} - (K_D + F_v)\,\dot{e} - K_P\,e$
- substituting and using the skew-symmetry of $\dot{M} - 2S$
  $\dot{V} = -\dot{e}^T (K_D + F_v)\,\dot{e} \;\le\; 0$,   with $\dot{V} = 0 \;\Leftrightarrow\; \dot{e} = 0$
- since the system is time-varying (due to $q_d(t)$), direct application of the LaSalle theorem is NOT allowed ⇒ use the Barbalat lemma... (⇒ go to slide 10 in block 8)
  $q = q_d(t) - e$,  $\dot{q} = \dot{q}_d(t) - \dot{e}$  ⇒  $V = V(e, \dot{e}, t) = V(x, t)$,   with error state $x$
Analysis of asymptotic stability
of the trajectory error - 2
- since i) $V$ is lower bounded and ii) $\dot{V} \le 0$, we only have to check condition iii) in order to apply the Barbalat lemma:
  $\ddot{V} = -2\,\dot{e}^T (K_D + F_v)\,\ddot{e}$   ... is this bounded?
- from i) + ii), $V$ is bounded ⇒ $e$ and $\dot{e}$ are bounded
- assume that the desired trajectory has bounded velocity $\dot{q}_d$ ⇒ $\dot{q}$ is bounded
- using the following two properties of the dynamic model terms
  $0 < \alpha_m \le \|M^{-1}(q)\| \le \alpha_M < \infty$,   $\|S(q,\dot{q})\| \le \alpha_S\,\|\dot{q}\|$
  then also $\ddot{e}$ will be bounded (in norm), since all terms on the right-hand side of
  $\ddot{e} = -M^{-1}(q)\,\big[S(q,\dot{q})\,\dot{e} + K_P\,e + (K_D + F_v)\,\dot{e}\big]$
  are bounded   ⇒   $\lim_{t \to \infty} \dot{V}(t) = 0$
Analysis of asymptotic stability
of the trajectory error – end of proof
- we can conclude by proceeding as in the LaSalle theorem
  $\dot{V} = 0 \;\Leftrightarrow\; \dot{e} = 0$
- the closed-loop dynamics in this situation is
  $M(q)\,\ddot{e} = -K_P\,e \;\Longrightarrow\; \ddot{e} = 0 \Leftrightarrow e = 0$
  so $(e, \dot{e}) = (0, 0)$ is the largest invariant set in $\dot{V} = 0$
⇒ (global) asymptotic tracking will be achieved
Regulation as a special case
- what happens to the control laws designed for trajectory tracking when $q_d$ is constant? are there simplifications?
- feedback linearization
  $u = M(q)\,\big[K_P\,(q_d - q) - K_D\,\dot{q}\big] + c(q,\dot{q}) + g(q)$
  - no special simplifications
  - however, this is a solution to the regulation problem with exponential stability (and decoupled transients at each joint!)
- alternative global controller
  $u = K_P\,(q_d - q) - K_D\,\dot{q} + g(q)$
  - we recover the simpler PD + gravity cancellation control law!!
Trajectory execution without a model
- is it possible to accurately reproduce a desired smooth joint-space reference trajectory with reduced or no information on the robot dynamic model?
- this is feasible (and possibly simple) in the case of repetitive motion tasks over a finite interval of time
- trials are performed iteratively, storing the trajectory error information of the current execution [$k$-th iteration] and processing it off line before the next trial [$(k+1)$-th iteration] starts
- the robot should be reinitialized in the same initial state at the beginning of each trial (typically, with $\dot{q} = 0$)
- the control law is made of a non-model-based part (often, a decentralized PD law) + a time-varying feedforward which is updated before every trial
- this scheme is called iterative trajectory learning
Scheme of iterative trajectory learning
- the control design can be illustrated on a SISO linear system in the Laplace domain

[block diagram: reference $y_d$ → error $e$ → controller $C(s)$ → $u^{fb}$, summed with the "plug-in" signal $v$ ($v_1(t) \equiv 0$ at the first, $k = 1$, iteration) → plant $P(s)$ → output $y$, fed back]

$W(s) = \dfrac{y(s)}{y_d(s)} = \dfrac{P(s)\,C(s)}{1 + P(s)\,C(s)}$   closed-loop system without learning ($C(s)$ is, e.g., a PD control law)

$u_k(s) = u_k^{fb}(s) + v_k(s) = C(s)\,e_k(s) + v_k(s)$   control law at iteration $k$

$y_k(s) = W(s)\,y_d(s) + \dfrac{P(s)}{1 + P(s)\,C(s)}\,v_k(s)$   system output at iteration $k$
Background math on feedback loops
whiteboard…
- algebraic manipulations on the block diagram signals in the Laplace domain:
  $x(s) = \mathcal{L}[x(t)]$, for $x = y_d, y, u^{fb}, v, e$ (⇒ $y_d, y_k, u_k^{fb}, v_k, e_k$ at iteration $k$), with transfer functions

  $y(s) = P(s)\,u(s) = P(s)\,\big[v(s) + u^{fb}(s)\big]$
  $\qquad = P(s)\,v(s) + P(s)\,C(s)\,e(s)$
  $\qquad = P(s)\,v(s) + P(s)\,C(s)\,\big[y_d(s) - y(s)\big]$
  $\Rightarrow\; \big(1 + P(s)\,C(s)\big)\,y(s) = P(s)\,v(s) + P(s)\,C(s)\,y_d(s)$
  $\Rightarrow\; y(s) = \dfrac{P(s)\,C(s)}{1 + P(s)\,C(s)}\,y_d(s) + \dfrac{P(s)}{1 + P(s)\,C(s)}\,v(s) = W(s)\,y_d(s) + W_v(s)\,v(s)$

- feedback control law at iteration $k$
  $u_k^{fb}(s) = C(s)\,\big[y_d(s) - y_k(s)\big] = C(s)\,y_d(s) - P(s)\,C(s)\,\big[v_k(s) + u_k^{fb}(s)\big]$
  $\Rightarrow\; u_k^{fb}(s) = \dfrac{C(s)}{1 + P(s)\,C(s)}\,y_d(s) - \dfrac{P(s)\,C(s)}{1 + P(s)\,C(s)}\,v_k(s) = W_u(s)\,y_d(s) - W(s)\,v_k(s)$

- error at iteration $k$
  $e_k(s) = y_d(s) - y_k(s) = y_d(s) - \big[W(s)\,y_d(s) + W_v(s)\,v_k(s)\big] = \big(1 - W(s)\big)\,y_d(s) - W_v(s)\,v_k(s)$
with $W_e(s) = 1 - W(s) = 1/\big(1 + P(s)\,C(s)\big)$
Learning update law
- the update of the feedforward term is designed as
  $v_{k+1}(s) = \alpha(s)\,u_k^{fb}(s) + \beta(s)\,v_k(s)$   with $\alpha$ and $\beta$ suitable filters (also non-causal, of the FIR type)

  recursive expression of the feedforward term:   $v_{k+1}(s) = \dfrac{\alpha(s)\,C(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,v_k(s)$

  recursive expression of the error $e = y_d - y$:   $e_{k+1}(s) = \dfrac{1 - \beta(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,e_k(s)$

- if a contraction condition can be enforced
  $\big|\beta(s) - \alpha(s)\,W(s)\big| < 1$   (for all $s = j\omega$ frequencies such that ...)
  then convergence is obtained for $k \to \infty$
  $v_\infty(s) = \dfrac{y_d(s)}{P(s)}\,\dfrac{\alpha(s)\,W(s)}{1 - \beta(s) + \alpha(s)\,W(s)}$,   $e_\infty(s) = \dfrac{y_d(s)}{1 + P(s)\,C(s)}\,\dfrac{1 - \beta(s)}{1 - \beta(s) + \alpha(s)\,W(s)}$
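A minimal numerical sketch (added here, not part of the slides) of the update law in the time domain, with the simplest filter choice $\alpha(s) = \beta(s) = 1$ so that $v_{k+1} = u_k^{fb} + v_k$; the 1-dof plant (a unit mass with viscous friction), the PD gains and the reference trajectory are illustrative assumptions:

```python
import numpy as np

T, dt = 2.0, 1e-3
t = np.arange(0.0, T, dt)
qd = 0.5 * (1 - np.cos(np.pi * t / T))       # smooth rest-to-rest reference
qd_dot = np.gradient(qd, dt)

m, b = 1.0, 2.0                               # plant: m*qdd + b*qdot = u (unknown to the controller)
Kp, Kd = 100.0, 20.0                          # PD feedback gains

def run_trial(v):
    """Simulate one trial with feedforward v(t); return tracking error and feedback command."""
    q, dq = 0.0, 0.0                          # re-initialized at the same state every trial
    e, u_fb = np.zeros_like(t), np.zeros_like(t)
    for i in range(len(t)):
        e[i] = qd[i] - q
        u_fb[i] = Kp * e[i] + Kd * (qd_dot[i] - dq)
        u = u_fb[i] + v[i]
        ddq = (u - b * dq) / m                # plant dynamics
        dq += ddq * dt
        q += dq * dt
    return e, u_fb

v = np.zeros_like(t)                          # v_1(t) = 0 at the first iteration
for k in range(1, 9):
    e, u_fb = run_trial(v)
    print(f"trial {k}: max |e| = {np.max(np.abs(e)):.4f}")
    v = u_fb + v                              # learning update with alpha(s) = beta(s) = 1
```

For this plant and these gains $|1 - W(j\omega)| < 1$ at all frequencies, so the printed error bounds shrink over the trials, mimicking the experimental behavior reported a few slides ahead.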
Proof of recursive updates
whiteboard…
- recursive expression for the feedforward $v_k$
  $v_{k+1}(s) = \alpha(s)\,u_k^{fb}(s) + \beta(s)\,v_k(s) = \alpha(s)\,C(s)\,e_k(s) + \beta(s)\,v_k(s)$
  $\qquad = \alpha(s)\,C(s)\,\big[W_e(s)\,y_d(s) - W_v(s)\,v_k(s)\big] + \beta(s)\,v_k(s)$
  $\qquad = \dfrac{\alpha(s)\,C(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,v_k(s)$

- recursive expression for the error $e_k$
  $e_k(s) = y_d(s) - y_k(s) = y_d(s) - P(s)\,\big(v_k(s) + u_k^{fb}(s)\big)$
  $\Rightarrow\; v_k(s) = \dfrac{1}{P(s)}\,\big(y_d(s) - e_k(s)\big) - u_k^{fb}(s)$

  $y_{k+1}(s) = P(s)\,\big(v_{k+1}(s) + u_{k+1}^{fb}(s)\big) = P(s)\,\big(\alpha(s)\,u_k^{fb}(s) + \beta(s)\,v_k(s) + u_{k+1}^{fb}(s)\big)$
  $\qquad = P(s)\,\Big[\alpha(s)\,C(s)\,e_k(s) + \beta(s)\,\dfrac{1}{P(s)}\,\big(y_d(s) - e_k(s)\big) - \beta(s)\,C(s)\,e_k(s) + C(s)\,e_{k+1}(s)\Big]$

  $e_{k+1}(s) = y_d(s) - y_{k+1}(s)$
  $\qquad = \big(1 - \beta(s)\big)\,y_d(s) - \big[\big(\alpha(s) - \beta(s)\big)\,P(s)\,C(s) - \beta(s)\big]\,e_k(s) - P(s)\,C(s)\,e_{k+1}(s)$
  $\Rightarrow\; e_{k+1}(s) = \dfrac{1 - \beta(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,e_k(s)$
Proof of convergence
whiteboard…
from the recursive expressions
  $v_{k+1}(s) = \dfrac{\alpha(s)\,C(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,v_k(s)$
  $e_{k+1}(s) = \dfrac{1 - \beta(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,e_k(s)$

compute the variations from $k$ to $k+1$ (the repetitive term in the trajectory $y_d$ vanishes!)
  $\Delta v_{k+1}(s) = v_{k+1}(s) - v_k(s) = \big(\beta(s) - \alpha(s)\,W(s)\big)\,\Delta v_k(s)$
  $\Delta e_{k+1}(s) = e_{k+1}(s) - e_k(s) = \big(\beta(s) - \alpha(s)\,W(s)\big)\,\Delta e_k(s)$

by the contraction mapping condition $\big|\beta(s) - \alpha(s)\,W(s)\big| < 1$ ⇒ $v_k \to v_\infty$, $e_k \to e_\infty$ with
  $v_\infty(s) = \dfrac{\alpha(s)\,C(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,v_\infty(s)$
  $e_\infty(s) = \dfrac{1 - \beta(s)}{1 + P(s)\,C(s)}\,y_d(s) + \big(\beta(s) - \alpha(s)\,W(s)\big)\,e_\infty(s)$

  $\Rightarrow\; v_\infty(s) = \dfrac{y_d(s)}{P(s)}\,\dfrac{\alpha(s)\,W(s)}{1 - \beta(s) + \alpha(s)\,W(s)}$,   $e_\infty(s) = \dfrac{y_d(s)}{1 + P(s)\,C(s)}\,\dfrac{1 - \beta(s)}{1 - \beta(s) + \alpha(s)\,W(s)}$
Comments on convergence
- if the choice $\beta = 1$ allows the contraction condition to be satisfied, then convergence to zero tracking error is obtained
  $e_\infty(s) = 0$
  and the inverse dynamics command has been learned
  $v_\infty(s) = \dfrac{y_d(s)}{P(s)}$
- in particular, for $\alpha(s) = 1/W(s)$ convergence would occur in 1 iteration only!
- the use of a filter $\beta(s) \neq 1$ allows convergence to be obtained (but with a residual tracking error) even in the presence of unmodeled high-frequency dynamics
- the two filters can be designed from very poor information on the system dynamics, using classical tools (e.g., Nyquist plots)
Application to robots
- for $N$-dof robots modeled as
  $\big[B_m + M(q)\big]\,\ddot{q} + \big[F_v + S(q,\dot{q})\big]\,\dot{q} + g(q) = u$
  we choose as the (initial = pre-learning) control law
  $u = u^{fb} = K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q}) + \hat{g}(q)$
  and design the learning filters (at each joint) using the linear approximation
  $W_i(s) = \dfrac{q_i(s)}{q_{d,i}(s)} = \dfrac{K_{D_i}\,s + K_{P_i}}{\hat{B}_{m_i}\,s^2 + \big(\hat{F}_{v_i} + K_{D_i}\big)\,s + K_{P_i}}$,   $i = 1, \dots, N$
- initialization of the feedforward uses the best available estimates
  $v_1 = \big[\hat{B}_m + \hat{M}(q_d)\big]\,\ddot{q}_d + \big[\hat{F}_v + \hat{S}(q_d, \dot{q}_d)\big]\,\dot{q}_d + \hat{g}(q_d)$
  or simply $v_1 = 0$ (in the worst case) at the first trial $k = 1$
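A small numerical check (added here, not from the slides): given the per-joint linear approximation $W_i(s)$ above, the contraction condition $|\beta(j\omega) - \alpha(j\omega)\,W_i(j\omega)| < 1$ can be verified on a frequency grid before running the trials; the parameter values and the simplest filter choice $\alpha = \beta = 1$ are illustrative assumptions:

```python
import numpy as np

B, Fv = 1.2, 0.8          # estimated drive inertia and viscous friction of joint i (assumed)
Kp, Kd = 400.0, 40.0      # PD gains of the pre-learning controller (assumed)

w = np.logspace(-1, 3, 500)          # frequency grid [rad/s]
s = 1j * w
W = (Kd * s + Kp) / (B * s**2 + (Fv + Kd) * s + Kp)   # linear approximation W_i(jw)

alpha = np.ones_like(s)              # simplest filter choice alpha(s) = 1
beta = np.ones_like(s)               # beta(s) = 1  ->  zero residual error if contraction holds
rho = np.abs(beta - alpha * W)

print(f"max |beta - alpha*W| over the grid: {rho.max():.4f}")
print("contraction satisfied on the grid" if rho.max() < 1 else "contraction violated: redesign the filters")
```

If the condition fails at high frequency, $\beta(s)$ can be chosen as a low-pass filter to recover contraction there, at the price of a residual tracking error (previous slide).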
Experimental set-up
- joints 2 and 3 of the 6R CRF robot prototype @DIS (a MIMO case)
- ≈ 90% gravity balanced through springs; high level of dry friction
- Harmonic Drive transmissions with ratio 160:1; DC motors with current amplifiers
- DSP control with $T_c$ = 400 µs; D/A = 12 bit; resolvers (R/D = 16 bit/$2\pi$) and tachometers (A/D = 11 bit/(rad/s))
[plot: desired position/velocity profiles for both joints, with an excursion of 160° and a peak velocity of 50°/s]
De Luca, Paesano, Ulivi: IEEE Trans. on Industrial Electronics, 1992
Experimental results
[experimental plots for joint 2 and joint 3]
- the tracking error $e_k$ reduces for $k$ = 1, 3, 6, 12
- the feedforward $v_k$ increases for $k$ = 3, 6, 12 (it is zero at $k$ = 1)
- the feedback $u_k^{fb}$ decreases for $k$ = 1, 3, 6, 12
On-line learning control
- a revisitation of the learning idea, so as to acquire the missing dynamic information in model-based trajectory control
- on-line learning approach
  - the robot improves its tracking performance already while executing the task in feedback mode
  - uses only position measurements from encoders
  - no need for joint torque sensors
- machine learning techniques are used for
  - data collection and organization
  - regressor construction for estimating the model perturbations
- fast convergence
- starting with a reasonably good robot model
- extensions to underactuated robots or to robots with flexible components
Control with approximate FBL
- dynamic model, its nominal part and the (unstructured) uncertainty
  $M(q)\,\ddot{q} + n(q,\dot{q}) = \tau$,   with $M = \hat{M} + \Delta M$,  $n = \hat{n} + \Delta n$
- model-based (approximate) feedback linearization
  $\tau_{FBL} = \hat{M}(q)\,a + \hat{n}(q,\dot{q})$
- resulting closed-loop dynamics with a perturbation term
  $\ddot{q} = a + \delta(q,\dot{q},a)$,   $\delta = \big(M^{-1}\hat{M} - I\big)\,a + M^{-1}\,\big(\hat{n} - n\big)$
- the control law for tracking $q_d(t)$ is completed by using (at $t = t_k$) a linear design (PD with feedforward) and a learning regressor $\varepsilon_k$
  $a = a_k = u_k + \varepsilon_k = \ddot{q}_{d,k} + K_P\,(q_{d,k} - q_k) + K_D\,(\dot{q}_{d,k} - \dot{q}_k) + \varepsilon_k$
On-line learning scheme
[block diagram: the desired trajectory $q_d(t)$ enters the Linear Control for Tracking block, producing $u_k$; the GP correction $\varepsilon_k$ is added to give $a_k$, which the Feedback Linearization block converts into the torque $\tau_{FBL,k}$ applied to the robot; the measured $q_k$, $\dot{q}_k$ and the command $u_k$ feed a Data Set Collection Procedure that builds $(X_k, Y_k)$ for the Gaussian Process Regression block producing $\varepsilon_k$]
On-line regressor
- Gaussian Process (GP) regression is used to estimate the perturbation $\delta$
- from input-output observations that are noisy, with $\omega \sim \mathcal{N}(0, \Sigma_\omega)$, the data points generated at the $k$-th control step are
  $X_k = (q_k, \dot{q}_k, u_k)$,   $Y_k = \ddot{q}_k - u_k$
- assuming that the ensemble of $n_t$ observations has a joint Gaussian distribution, with a kernel $\kappa$ to be chosen,
  $\begin{pmatrix} Y_{1:n_t-1} \\ Y_{n_t} \end{pmatrix} \sim \mathcal{N}\left(0, \begin{pmatrix} K & k \\ k^T & \kappa(X_{n_t}, X_{n_t}) \end{pmatrix}\right)$
- the predictive distribution that approximates $\delta(\bar{X})$ for a generic query $\bar{X}$ is
  $\varepsilon(\bar{X}) \sim \mathcal{N}\big(\mu(\bar{X}), \sigma^2(\bar{X})\big)$,   with
  $\mu(\bar{X}) = k^T(\bar{X})\,\big(K + \Sigma_\omega\big)^{-1}\,Y$   ⇒   $\varepsilon_k = \varepsilon(X_k)$
  $\sigma^2(\bar{X}) = \kappa(\bar{X}, \bar{X}) - k^T(\bar{X})\,\big(K + \Sigma_\omega\big)^{-1}\,k(\bar{X})$
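A minimal numerical sketch (added here, not from the slides) of these predictive formulas, with a squared-exponential kernel, a scalar output and synthetic data standing in for the perturbation observations; all values are illustrative assumptions:

```python
import numpy as np

def kappa(x1, x2, ell=0.5, sf=1.0):
    """Squared-exponential kernel between two batches of inputs."""
    d2 = np.sum((x1[:, None, :] - x2[None, :, :])**2, axis=-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

# synthetic noisy observations Y of an unknown perturbation (a toy function here)
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 1))
Y = np.sin(2 * X[:, 0]) + 0.05 * rng.standard_normal(30)
sigma_w = 0.05                                 # assumed measurement-noise standard deviation

K = kappa(X, X) + sigma_w**2 * np.eye(len(X))  # K + Sigma_w
alpha = np.linalg.solve(K, Y)                  # (K + Sigma_w)^{-1} Y

def predict(Xq):
    """Predictive mean and variance at the query points Xq."""
    kq = kappa(X, Xq)                          # n_t x n_q cross-covariances k(X*)
    mu = kq.T @ alpha                          # mu(X*) = k^T (K + Sigma_w)^{-1} Y
    v = np.linalg.solve(K, kq)
    var = np.diag(kappa(Xq, Xq)) - np.sum(kq * v, axis=0)   # sigma^2(X*)
    return mu, var

Xq = np.linspace(-2, 2, 5).reshape(-1, 1)
mu, var = predict(Xq)
print(np.c_[Xq, mu, var])
```

In the on-line scheme, one such regressor per joint is updated as new pairs $(X_k, Y_k)$ arrive, and its mean $\mu(X_k)$ provides the correction $\varepsilon_k$.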
Simulation results
- Kuka LWR iiwa, 7-dof robot
- model perturbations: dynamic parameters with ±20% variation, uncompensated joint friction
- 7 separate GPs (one for each joint), each with 21 inputs at every $t = t_k$
- sinusoidal trajectories for each joint

[plots: norm of the joint errors; position components in the Cartesian space]
... at the first and only iteration!
Simulation results
[video (slowed down)]
Proc. of Machine Learning Research, vol. 100 (2020)
Extension to underactuated robots
$\begin{pmatrix} M_{aa}(q) & M_{au}(q) \\ M_{au}^T(q) & M_{uu}(q) \end{pmatrix} \begin{pmatrix} \ddot{q}_a \\ \ddot{q}_u \end{pmatrix} + \begin{pmatrix} n_a(q,\dot{q}) \\ n_u(q,\dot{q}) \end{pmatrix} = \begin{pmatrix} \tau \\ 0 \end{pmatrix}$
(subscripts $a$ = actuated and $u$ = unactuated/passive joints)
- the planner optimizes the motion of the passive joints (at every iteration)
- the controller for the active joints uses partial feedback linearization
- two regressors (on/off-line) for learning the required acceleration corrections for the active and the passive joints
Experiments on the Pendubot
- Pendubot, a 2-dof robot with a passive second joint
- swing-up maneuvers from the down-down configuration to a new equilibrium state
  ⇒ up-up
  ⇒ down-up
Experimental results
convergence in 2 iterations!
IEEE Robotics and Automation Letters, vol. 7(1), 2022 [video]
latest video with more simulations & experiments on YouTube: https://youtu.be/1aKG__8gfvk