electronics

Article
Design, Implementation, and Control of a Wheel-Based
Inverted Pendulum
Dominik Zaborniak 1,† , Krzysztof Patan 2,† and Marcin Witczak 2, *

1 Faculty of Computer, Electrical and Control Engineering, University of Zielona Góra, 65-516 Zielona Góra, Poland; [email protected]
2 Institute of Control and Computation Engineering, University of Zielona Góra, 65-516 Zielona Góra, Poland
* Correspondence: [email protected]
† These authors contributed equally to this work.

Abstract: Control of an inverted pendulum is a classical example of the stabilisation problem pertaining to systems that are unstable by nature. The reaction wheel and the motor act as actuators, generating the torque needed to stabilise the system and counteract inevitable disturbances. This paper begins by describing the design and physical implementation of a wheel-based inverted pendulum. Subsequently, proportional–integral–derivative (PID) and unknown input Kalman-filter-based linear quadratic regulator (LQR) controllers are designed and tested. In particular, the design and pre-validation were carried out in the Matlab/Simulink environment. The final validation step was realised using a constructed physical pendulum, with a digital controller implemented on an STM32 board. Finally, a set of physical disturbances was introduced to the system to show the high reliability and superiority of the proposed Kalman-filter-based LQR strategy.

Keywords: inverted pendulum; reaction wheel; Simulink; LQR controller; PID controller; state
observer; Kalman filter

Citation: Zaborniak, D.; Patan, K.; Witczak, M. Design, Implementation, and Control of a Wheel-Based Inverted Pendulum. Electronics 2024, 13, 514. https://doi.org/10.3390/electronics13030514

Academic Editors: Shuncong Zhong and Len Gelman

Received: 30 December 2023
Revised: 22 January 2024
Accepted: 23 January 2024
Published: 26 January 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Electronics 2024, 13, 514. https://doi.org/10.3390/electronics13030514 https://www.mdpi.com/journal/electronics

1. Introduction
Irrespective of the system under investigation, the main aim of any control strategy is to stabilise it while satisfying a set of predefined performance indices. This simply means that the system should be driven to the so-called equilibrium state while those indices are met. There is, of course, a wide set of control strategies that may cope with the above-defined tasks, and they have been proven useful in various applications. Usually, their usefulness is validated using challenging benchmark systems. Undoubtedly, the inverted pendulum belongs to this group. At first glance, it seems to be a relatively simple system. Indeed, it is simply a suspended pendulum that one must force to stand in a vertical position. Unfortunately, this system raises a number of very important challenges. These include suitable system modelling, coping with non-linearity, non-minimum-phase behaviour, and under-actuation. Moreover, the goal is to keep the pendulum at an unstable equilibrium point. To make this possible, the controller must continuously and appropriately balance the system's centre of gravity above the axis of rotation. Additionally, the control system must be fast enough to counteract attempts to destabilise it. Apart from the control-oriented problems, the design of the inverted pendulum system raises several issues. These involve, but are not limited to, proper selection of electrical and actuator components, control device programming, sensor fusion, and data filtering.
Taking into account these preliminary discussions, several authors have attacked the above-defined problem from different angles. In [1], the authors focus on the stabilisation problem under the presence of a constant unknown bias in the pendulum angle measurements. Another study [2] proves that even in the presence of a time delay in the system, a wheeled inverted pendulum can be well stabilised with data taken from only one accelerometer. The authors of [3] made a comparison between different control strategies for a
rotary inverted pendulum. In [4], the authors performed an investigation devoted to robust
generalised dynamic inversion. Another interesting study concerning robust control was
proposed in [5], along with H∞ analysis. A neural network-based control approach was
introduced in [6]. Yet another method that involves fuzzy controllers was proposed in [7].
Similarly, the authors of [8] developed a fuzzy controller and investigated its performance
by comparing it with an LQR controller. An interesting study concerning an extended
platform that includes one additional reaction wheel is described in [9]. Furthermore, [10]
can be considered as a great guide to the optimal mechanical design of a pendulum and
reaction wheel for a given electric motor, as well as a wheel diameter that maximises the
recovery angle. A mathematical model of inverted pendulum systems was derived in [11].
Special attention has also been paid to the system modelling problem [12]. Finally, since not
all state variables are available, the importance of an accurate state estimation is discussed
in [13]. Inverted pendulums with a higher degree of freedom have also been discussed
by several authors. For example, [14] describes a dual-axis, self-balancing, reaction-wheel-
based inverted pendulum system. Another impressive work [15] presents a novel 3D
inverted pendulum that can be balanced using only a single reaction wheel. Another
very interesting system that involves a cube that contains three reaction wheels and a
nonlinear control algorithm was described in [16]. The inverted pendulum model was also
investigated in a study concerning walking robots [17]. The above-presented review of the
latest state-of-the-art methods clearly indicates the problems facing the current research,
which can be summarised as follows:
• The need for a simple and cost-efficient hardware design (mechanics and electronics).
• The necessity of developing a representative mathematical model, along with a strat-
egy that allows its efficient parameter estimation.
• The need to develop efficient control and estimation strategies that allow for the
desired and reliable performance of inverted pendulums.
Taking into account the above discussion and literature review, the contributions of
this paper can be summarised as follows:
1. Determination of the pendulum structure and its nonlinear model (Section 2).
2. A proposal for a cost-efficient hardware architecture (Section 2.2).
3. Application of the small-angle approach for the determination of the linear state-space
model of an inverted pendulum (Section 2).
4. Modelling and data-based identification of an inverted pendulum (Section 3).
5. Validation of the identified model (Section 3).
6. Matlab/Simulink-based preliminary validation of the model (Section 3).
7. Design and analysis of dedicated cascade PID and Kalman-filter-based LQR con-
trollers (Section 4).
8. Experimental validation of the proposed design and control strategies (Section 5).
Finally, the last section summarises the paper and proposes future research directions.

2. Inverted Pendulum Model and Design


2.1. Mathematical Model
In order to analyse the system and design the controller, a mathematical model of a
reaction wheel inverted pendulum (RWIP) was first derived. Figure 1 shows the scheme of
the RWIP system. It consists of a fixed base, a rotating arm, and a rotating reaction wheel.
The frame ( x0 , y0 , z0 ) is an inertial one and it is related to the pendulum’s base, where z0
represents the axis overlapping the pendulum arm’s axis of rotation. The frame ( x1 , y1 , z1 )
is a non-inertial reference frame attached to the end of the moving pendulum's arm,
where z1 is the axis overlapping the wheel's axis of rotation. The force FG is the gravitational
force acting at the system's centre of mass. The force FGT is the component of the force FG
perpendicular to the arm of the pendulum. The angle θ denotes the pendulum's roll and is
measured in the frame (x0, y0, z0) between the vertical, that is, y0, and the current position of
the pendulum's arm.

Figure 1. Scheme of an inverted pendulum.

Its acceleration is represented as follows:

$$\ddot{\theta} = \frac{M_p d_M g}{J}\sin(\theta) - \frac{b_\theta}{J}\dot{\theta} + \frac{b_\alpha}{J}\dot{\alpha} + \frac{1}{J}\tau_d - \frac{1}{J}\tau_w, \tag{1}$$
where M p is the mass of the pendulum’s arm and reaction wheel combined together;
d M stands for the distance between the axis z0 and the centre of mass; g is the gravity
acceleration; J is the pendulum’s total moment of inertia (including the arm, motor, and
reaction wheel) with respect to the z0 axis; bθ and bα are the coefficients of friction associated
with angles θ and α, respectively; θ̇ and α̇ are first derivatives (angular velocities) of θ
and α, respectively; τd stands for the torque caused by disturbing forces acting on the
pendulum's arm; and τw denotes the torque generated by the reaction wheel. Angle α,
measured in the frame (x1, y1, z1), describes how much the reaction wheel is rotated around
the z1 axis. Its acceleration, expressed in the frame (x0, y0, z0), is given by the
following equation:

$$\ddot{\alpha} = -\frac{M_p g d_M}{J}\sin(\theta) + \frac{b_\theta}{J}\dot{\theta} - \frac{(J + J_w)\,b_\alpha}{J J_w}\dot{\alpha} - \frac{1}{J}\tau_d + \frac{J + J_w}{J J_w}\tau_w, \tag{2}$$
where Jw is the wheel's moment of inertia with respect to the z1 axis.
The reaction wheel is driven by a DC motor, which is described in the following set
of equations:
$$\begin{cases} \tau_w = k_m i_a, \\ R_a i_a + L_a \dfrac{d i_a}{dt} = u - k_e \dot{\alpha}, \end{cases} \tag{3}$$
where u is the motor’s supply voltage, with a maximum value of Udd = 12 V; i a is the
armature circuit current; R a is the armature circuit resistance; L a is the armature circuit
inductance; k m is the motor’s mechanical constant; and k e is the motor’s back electromotive
force (EMF) constant.
It was assumed that the time constant of the armature’s circuit is much smaller than
the mechanical time constants of the rest of the system. Therefore, L a = 0 was assumed. By
reformulating Equation (3), the following simplified motor model can be achieved:

$$\tau_w = \frac{k_m}{R_a}u - \frac{k_m k_e}{R_a}\dot{\alpha}. \tag{4}$$

Introducing (4) into (1) and (2) gives the following combined model of the
inverted pendulum:
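$$\begin{cases} \ddot{\theta} = \dfrac{M_p d_M g}{J}\sin(\theta) - \dfrac{b_\theta}{J}\dot{\theta} + \dfrac{1}{J}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right)\dot{\alpha} + \dfrac{1}{J}\tau_d - \dfrac{1}{J}\dfrac{k_m}{R_a}u, \\[2mm] \ddot{\alpha} = -\dfrac{M_p g d_M}{J}\sin(\theta) + \dfrac{b_\theta}{J}\dot{\theta} - \dfrac{J + J_w}{J J_w}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right)\dot{\alpha} - \dfrac{1}{J}\tau_d + \dfrac{J + J_w}{J J_w}\dfrac{k_m}{R_a}u. \end{cases} \tag{5}$$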

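Model (5) is nonlinear. For small values of θ, sin(θ) ≈ θ. Using this approximation, the nonlinear model (5) can be rewritten in a simplified linear form, as follows:

$$\begin{cases} \ddot{\theta} = \dfrac{M_p d_M g}{J}\theta - \dfrac{b_\theta}{J}\dot{\theta} + \dfrac{1}{J}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right)\dot{\alpha} + \dfrac{1}{J}\tau_d - \dfrac{1}{J}\dfrac{k_m}{R_a}u, \\[2mm] \ddot{\alpha} = -\dfrac{M_p g d_M}{J}\theta + \dfrac{b_\theta}{J}\dot{\theta} - \dfrac{J + J_w}{J J_w}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right)\dot{\alpha} - \dfrac{1}{J}\tau_d + \dfrac{J + J_w}{J J_w}\dfrac{k_m}{R_a}u. \end{cases} \tag{6}$$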
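
To give a feeling for the range over which the linearisation is reasonable, the following minimal Python sketch evaluates the relative error of replacing sin(θ) by θ. The function name and the list of angles are illustrative choices and are not taken from the paper.

```python
import math


def small_angle_relative_error(theta: float) -> float:
    """Relative error of approximating sin(theta) by theta (theta in rad, non-zero)."""
    return abs(theta - math.sin(theta)) / abs(math.sin(theta))


# Illustrative angles only; the paper does not state an admissible range.
for deg in (5, 10, 20, 30):
    theta = math.radians(deg)
    print(f"{deg:2d} deg: relative error = {small_angle_relative_error(theta):.3%}")
```

The error grows roughly with θ²/6, so the linear model is best trusted close to the upright position.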

Model (6) can be rewritten in the following equivalent state-space form:

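$$\dot{x} = Ax + Bu, \qquad y = Cx, \tag{7}$$

with:

$$x = [\theta \;\; \dot{\theta} \;\; \alpha \;\; \dot{\alpha}]^T, \qquad u = [u \;\; \tau_d]^T,$$

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \dfrac{M_p d_M g}{J} & -\dfrac{b_\theta}{J} & 0 & \dfrac{1}{J}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right) \\ 0 & 0 & 0 & 1 \\ -\dfrac{M_p g d_M}{J} & \dfrac{b_\theta}{J} & 0 & -\dfrac{J + J_w}{J J_w}\left(b_\alpha + \dfrac{k_m k_e}{R_a}\right) \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 \\ -\dfrac{1}{J}\dfrac{k_m}{R_a} & \dfrac{1}{J} \\ 0 & 0 \\ \dfrac{J + J_w}{J J_w}\dfrac{k_m}{R_a} & -\dfrac{1}{J} \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$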

Variable α does not appear on the right-hand side of Equations (5) and (6); therefore, it does not directly
affect the dynamics of the system. This is reflected in the matrix A, whose third column is zero.
Moreover, as can be seen in the matrix C, our physical realisation of the pendulum does not
have a sensor that measures the angular velocity of the wheel α̇. Thus, its value should be
estimated based on measurements of α.
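
As a rough numerical check of the linear model, the open-loop unstable pole can be located without any matrix library. Because α does not feed back into the dynamics, the characteristic polynomial of A factors into s times a cubic belonging to the (θ, θ̇, α̇) subsystem. The cubic used below was obtained by expanding the determinant by hand, so it should be treated as a sketch to be re-checked rather than as a formula from the paper. The numerical values are those listed in Table 1.

```python
# Parameter values as listed in Table 1 of the paper.
M_P, D_M, G = 0.4753, 0.131, 9.81   # kg, m, m/s^2
J, J_W = 0.0103, 0.0013             # kg m^2
B_THETA = 5.6799e-6                 # kg m^2 / s
C_ALPHA = 0.0131                    # lumped term b_alpha + k_m*k_e/R_a, kg m^2 / s


def char_poly(s: float) -> float:
    """Cubic factor of det(sI - A) for the (theta, theta_dot, alpha_dot) subsystem.

    Hand-derived (assumption):
    s^3 + (d + beta) s^2 + (beta (d - gamma) - a) s - a (d - gamma)
    """
    a = M_P * D_M * G / J
    beta = B_THETA / J
    gamma = C_ALPHA / J
    d = (J + J_W) / (J * J_W) * C_ALPHA
    return s**3 + (d + beta) * s**2 + (beta * (d - gamma) - a) * s - a * (d - gamma)


def positive_root(lo: float = 0.0, hi: float = 50.0, iterations: int = 100) -> float:
    """Bisection on [lo, hi]; the cubic is negative at lo and positive at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if char_poly(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


unstable_pole = positive_root()
print(f"open-loop unstable pole: s = {unstable_pole:.3f}")
```

With the rounded Table 1 values, the root should land close to the unstable pole quoted in Section 4.1; a small difference is to be expected from rounding.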

2.2. Physical System Design and Implementation


The pendulum was designed using Fusion 360 software. The final version of the
simulation model can be seen in Figure 2a. All of the components were created based on
this model. Most of them were cut from wood, but a couple of parts were manufactured
using a 3D printer. The complete laboratory stand can be seen in Figure 2b.
The electrical design centres on an STM32 Nucleo-F103RB board that acts as a digital
controller. It is responsible for collecting data from the
following sensors:
• A rotary encoder (E38S6-C-(600)B5-26G2) to measure θ.
• An encoder inside the gear motor (Pololu 4752) to measure α.
• An inertial measurement unit (IMU), LSM6DS33, to record θ̇.

Figure 2. Angled view of the inverted pendulum with reaction wheel. Graphical design stage: a
model that was realised in Fusion 360 software (a). Implementation stage: the final form of the
constructed laboratory stand (b).

The controller stabilises the pendulum by calculating the motor control signal u. It
sends a corresponding pulse-width modulation (PWM) signal to the H-bridge module
L298N, which feeds the motor. For the sake of simplification, the H-bridge is
assumed to be an ideal, lossless component whose output voltage magnitude varies linearly
from 0 to Udd as the duty cycle of the PWM signal changes from 0 to 1. The microcontroller unit can
operate without the need for external computing systems. It is programmed in
C++ and uses an internal clock to ensure a constant sampling time. The conceptual
component connection diagram is shown in Figure 3a. Moreover, the implementation
phases are portrayed in Figure 3b.

Figure 3. Conceptual connection diagram (a) and implementation flow chart (b).
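
The idealised H-bridge assumption can be written down in a few lines. The sketch below is in Python for readability (the firmware itself is written in C++), and the function name, the clamping, and the direction flag are illustrative assumptions rather than details taken from the paper.

```python
U_DD = 12.0  # supply voltage in volts, as stated in Section 2.1


def voltage_to_pwm(u: float, u_dd: float = U_DD) -> tuple[float, bool]:
    """Map a signed voltage command to (duty cycle in [0, 1], forward flag).

    Assumes the ideal, lossless, linear H-bridge described in the text.
    Commands beyond +/- u_dd are clamped (hypothetical behaviour).
    """
    u_clamped = max(-u_dd, min(u_dd, u))
    return abs(u_clamped) / u_dd, u_clamped >= 0.0


print(voltage_to_pwm(6.0))    # half of the supply voltage, forward
print(voltage_to_pwm(-24.0))  # clamped to full duty cycle, reverse
```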

3. Parameter Estimation
The following parameters were taken directly from the physical model: d M = 0.131 m,
M p = 0.4753 kg, and Mw = 0.2003 kg. Other parameters should be derived based on
physical laws or should be properly estimated. This process is portrayed in the follow-
ing sections.

3.1. The Reaction Wheel’s Moment of Inertia


The reaction wheel consists of a wooden disk and weights made of nuts and bolts.
The disk's mass is md = 0.0670 kg, and the mass of each of the four weights is mw = 0.0333 kg. The disk
was modelled as a uniform annulus with an inner radius ri = 0.06 m and an outer radius
ro = 0.10 m. Its moment of inertia Jd is described by the following formula:

$$J_d = \frac{1}{2} m_d (r_i^2 + r_o^2) = 4.56 \cdot 10^{-4}\ \text{kg·m}^2. \tag{8}$$
The weights are located at a distance rw = 0.08 m from the centre of the annulus. The
moment of inertia of the individual weight Jiw is approximated by the moment of inertia of
a point mass:

$$J_{iw} = m_w r_w^2 = 2.13 \cdot 10^{-4}\ \text{kg·m}^2. \tag{9}$$
In this way, the moment of inertia of the entire wheel is equal to the following:

$$J_w = J_d + 4 J_{iw} = 1.3 \cdot 10^{-3}\ \text{kg·m}^2. \tag{10}$$
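
The arithmetic of Equations (8)–(10) can be reproduced with a short Python sketch. The variable names are illustrative; the numerical values are those given in the text.

```python
M_D = 0.0670             # disk mass [kg]
M_W = 0.0333             # mass of a single weight [kg]
R_I, R_O = 0.06, 0.10    # inner and outer radius of the annulus [m]
R_W = 0.08               # distance of each weight from the centre [m]
N_WEIGHTS = 4            # number of weights, as used in Eq. (10)

j_disk = 0.5 * M_D * (R_I**2 + R_O**2)      # Eq. (8): uniform annulus
j_weight = M_W * R_W**2                     # Eq. (9): point mass
j_wheel = j_disk + N_WEIGHTS * j_weight     # Eq. (10)

print(f"J_d  = {j_disk:.3e} kg m^2")
print(f"J_iw = {j_weight:.3e} kg m^2")
print(f"J_w  = {j_wheel:.3e} kg m^2")
```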

3.2. The Pendulum’s Moment of Inertia


The moment of inertia J was determined based on the model of a physical
pendulum. Knowing the oscillation period T, the moment of inertia can be calculated
as follows:

$$J = \frac{T^2}{4\pi^2} M_p g d_M. \tag{11}$$
The reaction wheel was locked, making the pendulum behave like a rigid body,
oscillating freely, with a small amplitude. The period of oscillation T was determined by
analysing the pendulum's response, as presented in Figure 4:

$$T = \frac{19.32 - 0.53}{23} = 0.8170\ \text{s}. \tag{12}$$
Using (11), one can calculate the moment of inertia J as follows:

$$J = 0.0103\ \text{kg·m}^2. \tag{13}$$
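
The same calculation is sketched below in Python. Equation (12) is read here as 23 full oscillation periods between the time stamps 0.53 s and 19.32 s; this reading, like the variable names, is an assumption.

```python
import math

M_P = 0.4753   # combined mass of arm and wheel [kg]
D_M = 0.131    # distance from the pivot to the centre of mass [m]
G = 9.81       # gravitational acceleration [m/s^2]

T_START, T_END, N_PERIODS = 0.53, 19.32, 23   # values appearing in Eq. (12)

period = (T_END - T_START) / N_PERIODS                      # Eq. (12)
inertia = period**2 / (4.0 * math.pi**2) * M_P * G * D_M    # Eq. (11)

print(f"T = {period:.4f} s")
print(f"J = {inertia:.4f} kg m^2")
```

Averaging over many periods reduces the influence of the time-stamp reading error on T, which matters because J depends on T squared.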


Figure 4. Free oscillations of the pendulum with the reaction wheel locked. θdown = θ − π indicates
that the pendulum is pointing downwards.

3.3. Friction Coefficient


The reaction wheel was kept locked in order to estimate the coefficient of friction bθ .
To do this, the pendulum needs to swing faster (with larger θ̇) and thus with a greater
amplitude. In this case, the motion of the pendulum, with motion resistance and nonlinear
gravity, is described by the following equation:

$$J\ddot{\theta} - M_p d_M g \sin(\theta) + b_\theta \dot{\theta} = 0. \tag{14}$$

Based on (14), a scheme for a nonlinear dynamic model was built in Simulink
(see Figure 5). Again, the laboratory stand was forced to swing and measurements of
θ were collected. The obtained measurements were compared with the outputs of the
simulation model and the parameter bθ was adjusted accordingly. The process was
repeated until the lowest value of the mean square error (MSE) was obtained. In this way,
the value of bθ = 5.6799 · 10⁻⁶ kg m² s⁻¹ was estimated.

Figure 5. Simulink system model used to determine the value of the coefficient bθ .
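
The fitting loop described above can be illustrated outside Simulink. The following Python sketch integrates Equation (14) with a classical fourth-order Runge–Kutta scheme and picks, from a grid of candidates, the friction coefficient that gives the lowest MSE. Everything specific here is an assumption made for illustration: the "measurement" is synthetic, the friction value used to generate it is hypothetical (deliberately larger than the identified one so that the decay is visible in a short record), and a grid search stands in for the manual adjustment used by the authors.

```python
import math
import random

M_P, D_M, G, J = 0.4753, 0.131, 9.81, 0.0103   # values from the paper


def simulate(b_theta, theta0, dt=0.002, t_end=8.0):
    """Integrate J*th'' - M_p*d_M*g*sin(th) + b_theta*th' = 0 (Eq. 14) with RK4."""
    def f(th, om):
        return om, (M_P * D_M * G * math.sin(th) - b_theta * om) / J

    th, om, trace = theta0, 0.0, []
    for _ in range(int(round(t_end / dt))):
        trace.append(th)
        k1 = f(th, om)
        k2 = f(th + 0.5 * dt * k1[0], om + 0.5 * dt * k1[1])
        k3 = f(th + 0.5 * dt * k2[0], om + 0.5 * dt * k2[1])
        k4 = f(th + dt * k3[0], om + dt * k3[1])
        th += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        om += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return trace


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic "measurement": hypothetical friction value plus Gaussian sensor noise.
random.seed(0)
B_TRUE = 2.0e-3
THETA0 = math.pi - 1.0   # released 1 rad away from the hanging position
measured = [th + random.gauss(0.0, 0.01) for th in simulate(B_TRUE, THETA0)]

# Grid search over candidate friction coefficients (0.5e-3 ... 4.0e-3).
candidates = [k * 5.0e-4 for k in range(1, 9)]
b_hat = min(candidates, key=lambda b: mse(measured, simulate(b, THETA0)))
print(f"estimated b_theta = {b_hat:.1e} kg m^2/s")
```

For real data, a finer grid or a scalar optimiser would be the natural next step, with the measured θ replacing the synthetic trace.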

3.4. The Motor’s Parameters


In order to determine the parameters of the motor, the pendulum’s arm was kept fixed
(θ = const). In this case, the reaction wheel accelerates according to the following equation:
 
$$J_w \ddot{\alpha} + \left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} = \frac{k_m}{R_a}u. \tag{15}$$

Let $T_w = J_w \left(b_\alpha + \frac{k_m k_e}{R_a}\right)^{-1}$ and $K_w = \left(b_\alpha + \frac{k_m k_e}{R_a}\right)^{-1}\frac{k_m}{R_a}$. Then, (15) can be rewritten into
the following form:

$$T_w \ddot{\alpha} + \dot{\alpha} = K_w u. \tag{16}$$
The time constant Tw and gain Kw were estimated based on the step response of the
motor, as illustrated in Figure 6.

Figure 6. Step response of α with the pendulum’s arm locked.

From the step response shown in Figure 6, the value of Tw was found to be equal to
0.1 s, while Kw was derived based on the slope of the step response; its value is equal to
3.4114 V⁻¹ s⁻¹. The rest of the parameters were derived from (15) as follows:

$$\begin{aligned} b_\alpha + \frac{k_m k_e}{R_a} &= \frac{J_w}{T_w} = 0.0131, \\ \frac{k_m}{R_a} &= K_w\left(b_\alpha + \frac{k_m k_e}{R_a}\right) = 0.0447. \end{aligned} \tag{17}$$

3.5. Summary of the Model’s Parameters


All the necessary parameters concerning successful system modelling are summarised
in Table 1. A final comparison between the derived model and the laboratory stand can be
seen in Figure 7. This comparison presents the rectangular pulse response of the considered
pendulum system.

Table 1. List of all the parameters needed for system modelling.

Parameter              Value             Unit
dM                     0.131             m
Mp                     0.4753            kg
Mw                     0.2003            kg
J                      0.0103            kg m²
Jw                     0.0013            kg m²
bθ                     5.6799 · 10⁻⁶     kg m² s⁻¹
bα + km ke / Ra        0.0131            kg m² s⁻¹
km / Ra                0.0447            kg m² s⁻² V⁻¹
g                      9.81              m s⁻²

Figure 7. Open-loop rectangular pulse response: laboratory stand measurements (blue); system
model outputs (orange).

4. Control of Inverted Pendulum


In order to control the RWIP, two closed-loop methods were examined and compared:
a cascade control system with two PID controllers and a linear–quadratic–Gaussian
(LQG) control system.

4.1. Cascade PID Control


As an input, the PID controller is fed with an error signal, which is the difference
between the reference and the measured output, i.e., e = yref − y. Based on this error, a
control signal is generated that affects the system in such a way as to minimise it. For
an ideal PID controller, the control signal is the sum of three terms: one proportional to the
current error, Kp (P term); one proportional to the integral of the past error values, Ki/s (I term);
and one proportional to the error's rate of change, Kd s (D term). In practice, the differentiating
component amplifies high-frequency signals, including noise and oscillations caused by the step
changes of the digital signal. To avoid this, a low-pass filter is added, resulting in a filtered
version of the PID controller, as follows:

$$G_{PIDF}(s) = K_p + \frac{K_i}{s} + \frac{K_d s}{T_f s + 1}, \tag{18}$$
where T f is the time constant of the low-pass filter. This form was implemented within the
microcontroller. However, for design purposes, a representation that exhibits the positions
of the controller’s zeros and poles was employed, as follows:

$$G_{PIDF}(s) = K_m \frac{(s - s_{z1})(s - s_{z2})}{s(s - s_p)}, \tag{19}$$

where sz1 and sz2 represent zeros, s p denotes the pole, and Km is the overall gain of the
controller. Subsequently, it is possible to transition from notation (19) to (18) using the
following substitutions:
$$\begin{aligned} K_p &= K_m\left[(s_{z1} + s_{z2})s_p - s_{z1}s_{z2}\right]s_p^{-2}, \\ K_i &= -K_m s_{z1} s_{z2} s_p^{-1}, \\ K_d &= -K_m s_p^{-1} + K_m\left[(s_{z1} + s_{z2})s_p - s_{z1}s_{z2}\right]s_p^{-3}, \\ T_f &= -s_p^{-1}. \end{aligned} \tag{20}$$
For the remainder of the paper, the name PID is used to denote the filtered version.
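
The substitutions (20) can be checked numerically against the inner-loop design of Section 4.1 (Km = −4, sz1 = −3, sz2 = −2, sp = −1). In the Python sketch below, the bracketed term is taken as (sz1 + sz2)sp − sz1 sz2, which is the form that reproduces the gains quoted with Equation (21). The function name is illustrative.

```python
def zpk_to_parallel(k_m, s_z1, s_z2, s_p):
    """Convert the zero-pole-gain form (19) to the parallel gains of (18)."""
    bracket = (s_z1 + s_z2) * s_p - s_z1 * s_z2
    k_p = k_m * bracket / s_p**2
    k_i = -k_m * s_z1 * s_z2 / s_p
    k_d = -k_m / s_p + k_m * bracket / s_p**3
    t_f = -1.0 / s_p
    return k_p, k_i, k_d, t_f


k_p, k_i, k_d, t_f = zpk_to_parallel(-4.0, -3.0, -2.0, -1.0)
print(k_p, k_i, k_d, t_f)

# Both forms should give the same frequency response; compare at s = 2j.
s = 2j
g_zpk = -4.0 * (s + 3.0) * (s + 2.0) / (s * (s + 1.0))
g_parallel = k_p + k_i / s + k_d * s / (t_f * s + 1.0)
print(abs(g_zpk - g_parallel))
```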
The cascade structure of the control system, shown in Figure 8, was used to stabilise
the pendulum. It consists of an inner loop with a controller, represented by the transfer
function GPIDin (s), and an outer loop with the controller GPIDout (s). The transfer function
of the pendulum is denoted by GRW IP (s). The task of the inner control loop is to quickly
stabilise θ. The outer loop stabilises the variable α by providing the reference signal θre f for
the inner control loop. Because of the derivative term in GPIDout , if the velocity α̇ increases,
then θre f decreases. This forces the pendulum to tilt slightly in the opposite direction. To
maintain stability, the wheel begins to accelerate in the opposite direction. As a result, the
velocity α̇ decreases, and θre f and θ go back to zero.

Figure 8. Control diagram using cascade PID controller.
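
A minimal sketch of how such a cascade could be executed in discrete time is given below. It is written in Python for readability, whereas the actual firmware is in C++. The backward-Euler discretisation, the class and function names, and the wiring of the two loops are assumptions based on the description above; the gains are those quoted with Equations (21) and (22), and the 12 ms sampling period is the one stated in Section 4.2.

```python
class FilteredPID:
    """Backward-Euler discretisation of Kp + Ki/s + Kd*s/(Tf*s + 1)."""

    def __init__(self, kp, ki, kd, tf, ts):
        self.kp, self.ki, self.kd, self.tf, self.ts = kp, ki, kd, tf, ts
        self.integral = 0.0     # running integral of the error
        self.deriv = 0.0        # filtered derivative term (already scaled by Kd)
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.ts
        # Tf*d' + d = Kd*e'  ->  d[k] = (Tf*d[k-1] + Kd*(e[k] - e[k-1])) / (Tf + Ts)
        self.deriv = (self.tf * self.deriv
                      + self.kd * (error - self.prev_error)) / (self.tf + self.ts)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.deriv


TS = 0.012  # sampling period [s]
inner = FilteredPID(kp=4.0, ki=-24.0, kd=-8.0, tf=1.0, ts=TS)
outer = FilteredPID(kp=0.0, ki=0.0, kd=0.000667, tf=0.0667, ts=TS)


def cascade_step(theta, alpha, alpha_ref=0.0):
    """One control period: the outer loop sets theta_ref, the inner loop sets u."""
    theta_ref = outer.update(alpha_ref - alpha)
    return inner.update(theta_ref - theta)


print(cascade_step(theta=0.01, alpha=0.0))
```

In a real implementation the output u would additionally be limited to the available supply voltage before being converted to a PWM duty cycle.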

Note that both controllers were designed based on the root locus method. As shown
in Figure 9, using state-space representation (7), the pole–zero maps of the pendulum were
derived with respect to the input signal u.

Figure 9. Pole–zero maps of the pendulum.



The system has an unstable pole at s = 7.439. In addition, in the transfer function
representing the output α, there is a right-half-plane zero at s = 7.25, which creates a
non-minimum-phase system. It should be also mentioned that by taking into account the
θ output, the system has one zero at the origin. Moreover, by taking θ̇ into account, it is
evident that there are two zeros at the origin.
Let us analyse the root locus of the inner loop. Using the proportional term of GPIDin
alone, the unstable pole tends towards zero at the origin of the complex plane. This feature
is illustrated in Figure 10a. As a result, the P controller itself is not able to stabilise the
system. An additional pole is therefore added at the origin. The system is still unstable;
however, the unstable pole is now no longer attracted to the origin (see Figure 10b). Finally,
additional zeros and a pole were added at sz1 = −3, sz2 = −2, and sp = −1. As portrayed
in Figure 10c, the unstable pole is now attracted towards sz2.
The gain was chosen so that the unstable pole is moved to the left half-plane and the
dominant poles provide an overshoot of around 12% for Km = −4. The negative gain is
due to the opposite directions in which the reaction wheel and the pendulum arm rotate.
This can be seen in matrix B: from input u to θ̇, there is a negative coefficient, while the
one to α̇ is positive. The purpose of the inner loop is to control θ and, hence, a
negative gain is needed. By applying the pole and zero locations to (19) and using (20),
the transfer function of the inner controller in the parallel form (18) is obtained as follows:

$$G_{PIDin} = -4\,\frac{(s + 3)(s + 2)}{s(s + 1)} = 4 - \frac{24}{s} - \frac{8s}{s + 1}, \tag{21}$$

where Kp = 4, Ki = −24, Kd = −8, and Tf = 1. Such settings make it possible to stabilise θ
to the required reference level. Figure 11 shows the root locus for the outer loop, which
is based on the inner loop. A pole at the origin indicates the presence of an integral
component. For that reason, an additional integral component would make the regulation
worse. Therefore, the structure of the outer controller was simplified to include only the
proportional and derivative components. The root locus plot gain was set to Km = 0.01 and
the transfer function of GPIDout was chosen as follows:

$$G_{PDout} = 0.01\,\frac{s}{s + 15} = \frac{0.000667s}{0.0667s + 1}, \tag{22}$$
with Kp = 0, Ki = 0, Kd = 0.000667, and Tf = 0.0667.
Figure 12 shows the simulated step responses of the pendulum with respect to u and
τd . The system is stable in the bounded-input, bounded-output (BIBO) sense, taking into
account the step response for signal u (the left part of Figure 12). However, it is not fully
stable for the input τd (the right part of Figure 12). This can be clearly seen in the response
of signal α. It was not considered to be a major problem because signal α is not the control
target. However, it introduces a non-zero constant velocity α̇, which, in practice—with a
large τd —may saturate the motor and destabilise the pendulum.
Figure 10. Root locus for the inner open loop: the P controller (a), the PI controller (b), and the PID
controller (c).

Figure 11. Root locus of open outer loop.


Figure 12. Simulated step response for the cascade PID control system.

4.2. Linear–Quadratic–Gaussian Controller


Alternatively, a state-feedback-based controller was designed. This control scheme
is based on full information about the system state x. If it is not fully available, a state
observer is created to provide its estimate x̂. The state feedback makes it possible to shift
the poles of the closed-loop system to any position in the complex plane [18]. Such a design
method is very desirable and is suitable for dealing with multiple-input and multiple-
output (MIMO) systems. One important characteristic is that it is able to handle dynamic
coupling between state variables. The LQG approach combines an LQR controller, i.e., an optimal
state-feedback controller, with a Kalman filter as a state estimator. According to the separation
principle [19], the controller and the state estimator can be designed separately without exerting
a negative influence on the control performance.
The controller was implemented on a microcontroller, which is a digital device.
Therefore, the state-space model (7) was discretised. For that purpose, the zero-order
hold method was employed. Moreover, the sampling period of the microcontroller was set
to 12 ms. Finally, the discrete-time state-space model of the open-loop system has the
following form:
$$x[k+1] = A_d x[k] + B_d u[k], \tag{23}$$
$$y[k] = C_d x[k], \tag{24}$$

where

$$A_d = \begin{bmatrix} 1.0042 & 0.0120 & 0 & 0.0001 \\ 0.7056 & 1.0042 & 0 & 0.0142 \\ -0.0041 & -0.0000 & 1.0000 & 0.0112 \\ -0.6649 & -0.0041 & 0 & 0.8735 \end{bmatrix}, \quad B_d = \begin{bmatrix} -0.0036 & 0.0069 \\ -0.5832 & 1.1552 \\ 0.0318 & -0.0067 \\ 5.1780 & -1.0884 \end{bmatrix}, \quad C_d = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
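
The state-transition part of a zero-order-hold discretisation is the matrix exponential of A·Ts. The Python sketch below evaluates it with a truncated Taylor series, which is adequate here because the entries of A·Ts are small; in practice, a library routine would normally be used instead. The continuous-time matrix is built from the rounded Table 1 values, so agreement with the matrix Ad above can only be expected to a few thousandths. The input matrix is not recomputed, and all helper names are illustrative.

```python
# Continuous-time A from Eq. (7), built from the rounded Table 1 values.
M_P, D_M, G = 0.4753, 0.131, 9.81
J, J_W = 0.0103, 0.0013
B_THETA, C_ALPHA = 5.6799e-6, 0.0131   # C_ALPHA = b_alpha + k_m*k_e/R_a
TS = 0.012                             # sampling period [s]

a = M_P * D_M * G / J
k = (J + J_W) / (J * J_W)
A = [
    [0.0, 1.0, 0.0, 0.0],
    [a, -B_THETA / J, 0.0, C_ALPHA / J],
    [0.0, 0.0, 0.0, 1.0],
    [-a, B_THETA / J, 0.0, -k * C_ALPHA],
]


def matmul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][m] * y[m][j] for m in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def expm_taylor(m, terms=25):
    """exp(m) as a truncated Taylor series; fine for small-norm matrices."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[v / order for v in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


A_d = expm_taylor([[v * TS for v in row] for row in A])
for row in A_d:
    print(["%.4f" % v for v in row])
```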
It should be kept in mind that the second column of the matrix Bd is related to the disturbance
input τd and cannot be used to control the system. This means that its effect should be
decoupled while estimating an unknown state x[k]. To settle this problem, a Kalman
filter with an unknown input is employed [20]. In particular, the matrix Bd is split into
two components, Bdc and Bø, which correspond to the first and second columns of Bd, respectively.
As indicated in [20], the primary design condition underlying the unknown input Kalman
filter is

$$\operatorname{rank}(C_d B_{ø}) = \dim \tau_d = 1, \tag{25}$$
which is clearly satisfied. Finally, the unknown input Kalman filter is given by the following:

$$\hat{x}[k/k-1] = A_d \hat{x}[k-1/k-1] + B_{dc} u_{1,k-1}, \tag{26}$$
$$\hat{\tau}_{d,k-1} = M\left(y[k] - C_d \hat{x}[k/k-1]\right), \tag{27}$$
$$\hat{x}^{*}[k/k] = \hat{x}[k/k-1] + B_{ø} \hat{\tau}_{d,k-1}, \tag{28}$$
$$\hat{x}[k/k] = \hat{x}^{*}[k/k] + K_k\left(y[k] - C_d \hat{x}^{*}[k/k]\right), \tag{29}$$

where the matrix M = (Cd Bø)⁺, with (·)⁺ being a pseudo-inverse of its argument. Finally,
the Kalman filter gain matrix is calculated according to the strategy proposed in [20].
The scheme of the closed-loop system with state feedback and the Kalman filter is shown in
Figure 13, with Klqr denoting the optimal feedback gain matrix.

Figure 13. Control system with the LQG controller.
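
The disturbance-estimation step (27) can be illustrated in isolation. Since Cd Bø is a single column v, its pseudo-inverse is simply vᵀ/(vᵀv). The Python sketch below uses the second column of Bd and the matrix Cd as printed above; the predicted state and the disturbance value are made-up numbers used only for a consistency check, and the Kalman gain computation of [20] is not reproduced.

```python
# Second column of B_d (disturbance channel) and C_d, as printed in the paper.
B_DIST = [0.0069, 1.1552, -0.0067, -1.0884]
C_D = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# C_d * B_dist is a single column v, so its pseudo-inverse is v^T / (v^T v).
cb = matvec(C_D, B_DIST)
norm_sq = sum(c * c for c in cb)
if norm_sq == 0.0:
    raise ValueError("rank condition (25) is violated")
M = [c / norm_sq for c in cb]


def estimate_disturbance(y, x_pred):
    """Eq. (27): tau_hat = M (y - C_d x_pred)."""
    innovation = [yi - ci for yi, ci in zip(y, matvec(C_D, x_pred))]
    return sum(mi * ei for mi, ei in zip(M, innovation))


# Consistency check with made-up numbers: if the true state differs from the
# prediction only through the disturbance, the disturbance is recovered.
x_pred = [0.01, 0.0, 0.2, 1.0]
tau_true = 0.05
x_true = [xp + b * tau_true for xp, b in zip(x_pred, B_DIST)]
tau_hat = estimate_disturbance(matvec(C_D, x_true), x_pred)
print(f"tau_hat = {tau_hat:.4f}")
```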

4.2.1. Feedback Gain


When designing a control system, it should be ensured that the system is controllable.
The rank of the controllability matrix T is equal to the dimension of the state vector,

$$\operatorname{rank}(T) = \operatorname{rank}\left(\left[B_{dc} \;\; A_d B_{dc} \;\; A_d^2 B_{dc} \;\; A_d^3 B_{dc}\right]\right) = 4,$$

which means that the pendulum is controllable.


In order to calculate the feedback gain Klqr, the state-cost weighting matrix Qlqr and
the input-cost weighting matrix Rlqr were defined. The values of the Qlqr matrix were
adjusted to prioritise the stabilisation of angle θ over that of α. Then, based on the simulations
carried out, the value of Rlqr was selected so that u remained between −Udd and Udd for the
desired degree of stabilisation:

$$Q_{lqr} = \begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 0.1 & 0 \\ 0 & 0 & 0 & 0.1 \end{bmatrix}, \tag{30}$$

$$R_{lqr} = 2. \tag{31}$$
Finally, the optimal gain Klqr minimises the quadratic cost function Jd, defined
as follows:

$$J_d(u) = \sum_{k=1}^{\infty}\left(x[k]^T Q_{lqr}\, x[k] + u[k]^T R_{lqr}\, u[k]\right), \tag{32}$$

with the control law u[k] = −Klqr x[k]. This yields the following:

$$K_{lqr} = [-18.2432 \;\; -2.6777 \;\; -0.0966 \;\; -0.1529]. \tag{33}$$


After closing the feedback loop, the discrete closed-loop system is described by the
following equation:

$$x[k+1] = (A_d - B_{dc}K_{lqr})\,x[k] + B_d u[k], \tag{34}$$

$$y[k] = C_d x[k]. \tag{35}$$

The closed-loop poles are located at [−0.1167, 0.9370 + 0.0272i, 0.9370 − 0.0272i, 0.9881],
with the following magnitudes: |poles| = [0.1167, 0.9374, 0.9374, 0.9881].
Clearly, all poles are located inside the unit circle, which means that state feedback successfully
stabilises the model. The system's step responses are shown in Figure 14. In contrast to the
cascade PID controller, this time, the velocity α̇ converges to zero. For this reason, the motor is
not saturated in a steady state.

Figure 14. Simulated step response of closed system with LQR controller.

4.2.2. State Observer


The state estimator aims at reconstructing the current state of the system based on the
measurements and a mathematical model of the system. In this case, this mainly refers to
the estimation of α̇. However, a Kalman filter is used as a full-state observer; thus, other
state variable estimates were also obtained.

First, it was verified whether the system is observable or not. The rank of the observability
matrix O is equal to the dimension of the state vector,

$$\operatorname{rank}(O) = \operatorname{rank}\left(\begin{bmatrix} C_d \\ C_d A_d \\ C_d A_d^2 \\ C_d A_d^3 \end{bmatrix}\right) = 4,$$

which means that the system is observable.


The estimator is based on a linear model (6), and it is itself a dynamic system, which
can be described by the following equation:

$$\begin{aligned} \hat{x}[k+1] &= (A_d - B_{dc}K_{lqr})\,\hat{x}[k] + B_d u[k] + K_{kf}\left(y[k] - \hat{y}[k]\right), \\ \hat{y}[k] &= C_d \hat{x}[k], \end{aligned} \tag{36}$$

where x̂[k] is the estimate of the system’s state at time k, while ŷ[k] is the estimate of the
system’s output at time k.
The covariance of the process noise Qkf and the measurement noise Rkf was assumed
to be constant and have the following form:

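$$Q_{kf} = \begin{bmatrix} 10^{-1} & 0 \\ 0 & 10^{0} \end{bmatrix}, \qquad R_{kf} = \begin{bmatrix} 2 \cdot 10^{-3} & 0 & 0 & 0 \\ 0 & 5 \cdot 10^{1} & 0 & 0 \\ 0 & 0 & 1 \cdot 10^{-1} & 0 \\ 0 & 0 & 0 & 10^{10} \end{bmatrix}. \tag{37}$$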

It was assumed that both the measurement and the process noise are uncorrelated.
This assumption leads to the diagonal form of the covariance matrices. The covariance of the
measurement noise was selected based on an analysis of the recorded data. In turn, the
covariance of the process noise was selected by trial and error. With these settings,
the unknown input Kalman filter performs well over the whole operating
range of the inverted pendulum.

4.3. Comparative Evaluation


In order to compare the control systems based on PID and LQR controllers, four
performance indices were considered:
• The maximum absolute value of the controlled signal, x_max.
• The settling time t_r, calculated as the time from the moment the system is excited
until the signal error e_x(t) = x(t) − x_final reaches and remains within the
tolerance zone ±0.05 · e_max, with e_max being the maximum error.
• The percentage overshoot PO_0, calculated for signals with a zero steady-state value
as the ratio of two adjacent peak amplitudes: PO_0 = x_peak1 / x_peak2 · 100%.
• The percentage overshoot PO_n, calculated for signals with a non-zero steady-state
value: PO_n = (x_peak1 − x_final) / x_final · 100%.
The quantitative control indices are listed in Table 2.
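
The indices defined above can be computed from a sampled response along the following lines. This is a minimal sketch under stated assumptions: the last sample is taken as the steady-state value, the first peak is taken as the maximum of the signal (appropriate for a positive step), and the sample data are synthetic. None of this is taken from the authors' Matlab scripts.

```python
def performance_indices(t, x, band=0.05):
    """Return (x_max, settling_time, overshoot_percent) for a sampled signal.

    Assumptions: x[-1] approximates the steady-state value, and the first
    peak equals max(x). overshoot_percent is None when x_final is zero.
    """
    x_final = x[-1]
    errors = [xi - x_final for xi in x]
    e_max = max(abs(e) for e in errors)
    tolerance = band * e_max

    # Samples that are still outside the tolerance zone.
    outside = [i for i, e in enumerate(errors) if abs(e) > tolerance]
    # The signal is settled from the first sample after the last violation.
    settle_index = outside[-1] + 1 if outside else 0
    settling_time = t[settle_index] - t[0]

    x_max = max(abs(xi) for xi in x)

    overshoot = None
    if x_final != 0:
        overshoot = (max(x) - x_final) / x_final * 100.0
    return x_max, settling_time, overshoot


# Synthetic response, excited at t = 0 s (placeholder data).
t_demo = [0, 1, 2, 3, 4, 5]
x_demo = [0.0, 1.2, 0.9, 1.04, 1.0, 1.0]
x_max_demo, t_r_demo, po_n_demo = performance_indices(t_demo, x_demo)
print(x_max_demo, t_r_demo, po_n_demo)
```

Because the tolerance zone is defined relative to the maximum error rather than to the final value, the same routine applies to signals that settle at zero.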
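Table 2. Performance indices for cascade PID and LQR controllers.

| Controller  | Index    | Input u: θ | Input u: α | Input u: α̇ | Disturbance τd: θ | Disturbance τd: α | Disturbance τd: α̇ |
|-------------|----------|------------|------------|------------|-------------------|-------------------|-------------------|
| Cascade PID | x_max    | 0.0073 rad | 1 rad      | 1.14 rad/s | 0.033 rad         | –                 | 40 rad/s          |
| Cascade PID | t_r [s]  | 1.72       | 5.52       | 3.95       | 6.3               | –                 | 7                 |
| Cascade PID | PO_0 [%] | 28         | –          | –          | –                 | –                 | –                 |
| Cascade PID | PO_n [%] | –          | 0          | –          | 0                 | –                 | 0                 |
| LQR         | x_max    | 0.033 rad  | 7.6 rad    | 5.55 rad/s | 1.93 rad          | 302 rad           | 215 rad/s         |
| LQR         | t_r [s]  | 2.6        | 3.28       | 3.6        | 2.1               | 3.2               | 3.6               |
| LQR         | PO_0 [%] | 22         | –          | –          | –                 | –                 | –                 |
| LQR         | PO_n [%] | –          | 0          | –          | 18.3              | 0                 | –                 |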

Clearly, the cascade PID controller achieves smaller maximum signal values than the LQR
controller. For the control input u, the settling time of θ is shorter for the cascade
PID controller (1.72 s versus 2.6 s), but the overshoot is slightly lower for the LQR
controller (22% versus 28%). For the disturbance τd, the LQR controller settles faster than
the PID controller (2.1 s versus 6.3 s for θ). The cascade PID controller does not control α
properly under the disturbance, so the quality indices for α could not be determined in that
case. Taking into account all performance indices as well as the disturbance
compensation abilities, it can be concluded that the LQR controller performs better than
the cascade PID controller.

5. Experimental Results
The designed and investigated control systems were tested in two ways: (i) under
undisturbed operating conditions and (ii) with a constant disturbance force caused by
additional weights. Video recordings of the experiments and the collected data are
presented in the following subsections.

5.1. Impulse Disturbance Torque


Both control systems were tested under normal operating conditions. Additionally,
in the steady state, the pendulum was lightly pushed with a wooden bar.
The cascade PID controller was able to stabilise the system. Interested readers can find
a video of the controller's performance on the following webpage:
https://www.youtube.com/shorts/TG2SWD9OdJk. However, this method results in a non-zero
reaction wheel speed (α̇ ≠ 0), as can be seen in Figure 15. It can also be observed that the
control signal u is very noisy.
The LQG controller also successfully stabilised the pendulum. For a video of the
controller's performance, please visit the following webpage:
https://youtube.com/shorts/Kn4MKzpV6JI. The system balances around its equilibrium point
with the speed of the wheel α̇ oscillating around 0 rad/s. The control signal u is much less
noisy than in the case of the cascade controller.
Both systems were able to function properly following a light push of the bar, but
the LQG controller was much more robust. The results presented in Figure 15 include
the response of the pendulum to pushes occurring at the time instants t = 63.6 s and
t = 64.2 s, while Figure 16 presents the response of the pendulum when the bar was pushed
at t = 146.5 s, t = 149.4 s, and t = 153.2 s.

Figure 15. Time waveforms of the implemented closed-loop system with the cascade PID controller:
measured data (blue line); θref value (orange line).

Figure 16. Time waveforms of the implemented system with the LQG controller: measured data
(blue line); estimated data (orange line).

5.2. Constant Disturbance Torque


As part of this experiment, an arm was attached to the pendulum. Weights of 35 g each
were added to the pendulum in a stepwise manner. The system recognises these weights
as step changes in the disturbance torque signal τd. A video showing the performance
of the pendulum in the case of the cascade controller can be found at
https://www.youtube.com/shorts/pgWTVsNW0Pg; a video showing the LQR controller can be found at
https://youtu.be/yshctkoC1pk. A portion of the measured data for the cascade structure is
shown in Figure 17; data for the LQR controller can be found in Figure 18. The closed-loop
system with the LQR controller can handle additional weights very well. On the other
hand, adding weights to the cascade design increases the average wheel speed α̇ during
steady-state operation. It saturates the motor very quickly, which causes the pendulum to
lose stability. It should be noted that the control signal u in the case of the LQR controller is,
again, smoother than that in the cascade control system.

Figure 17. Step response for τd of the cascade PID controller: measured data (blue line); θref value
(orange line).

Figure 18. Step response for τd of the LQG controller: collected data (blue line); data estimated by the
Kalman filter (orange line).

A constant, non-zero value of the torque disturbance τd contributes a constant static error
of θ. This is true for both the cascade and LQR controllers. This appears to degrade the
performance of the control system. However, it can be a desirable property. A constant τd
shifts the equilibrium point of the pendulum. The LQR controller brings θ to such a position
that the torque created by the gravity force neutralises the influence of the constant τd .
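
This equilibrium shift can be made explicit with a simple static torque balance. The sketch below assumes a lumped pendulum of total mass $m$ whose centre of mass lies at a distance $l$ from the pivot, with $g$ the gravitational acceleration. These symbols are introduced here for illustration and may differ from the notation of the model derived earlier in the paper. It further assumes that both torques are counted as positive in the same direction and that the average motor torque is zero in the steady state, so that gravity alone balances the disturbance:

$$m g l \sin\theta_{ss} + \tau_d = 0 \quad\Rightarrow\quad \theta_{ss} = -\arcsin\!\left(\frac{\tau_d}{m g l}\right) \approx -\frac{\tau_d}{m g l} \quad \text{for small } \theta_{ss}.$$

Under these assumptions, the static offset of θ scales approximately linearly with τd, and the wheel does not need to accelerate continuously to hold the pendulum in place.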

6. Conclusions
In this study, the development of a mathematical model of a pendulum with a reaction
wheel was described. Based on this, a physical inverted pendulum system was designed
and successfully implemented. For this purpose, the appropriate sensors, motor, and
microcontroller were selected. The linearised pendulum model was implemented in the
Matlab environment. Based on this, cascade PID and LQG controllers were developed.
Finally, they were implemented in the physical pendulum system. In order to collect the
measurements, a connection was established between the microcontroller and a personal
computer. The estimation of the model parameters was based on the knowledge acquired
from the basic physical relations. This proved to be sufficient for developing a good
mathematical model. Finally, the stabilisation of the RWIP was also successful. Both the
cascade PID and LQG controllers were able to stabilise the pendulum. Using the cascade
controller, the pendulum balanced within approximately ±0.01 rad of the equilibrium
point, while with the LQG controller it balanced within approximately ±0.025 rad. However,
it was evident that the LQG controller was able to deal much better with noise and disturbances.
Moreover, the design process of the LQG controller was much easier than that of the cascade PID
controller. The application of state feedback made it straightforward to move the poles of
the system. Because of the coupling between θ and α, designing a cascade controller was
more difficult.
Note that the main limitation of this approach was eliminated by applying an unknown
input Kalman filter, which served as a torque disturbance estimator, thereby enabling
greater robustness to external disturbances.
Motivated by these promising results with the LQG approach, further research will be
oriented towards:
• Investigating the selection of the Kalman filter covariance matrices in order to better
deal with both changing noise and disturbances.
• Relaxing the small-angle assumption and deriving a linear parameter-varying (LPV)
model of the pendulum.
• Designing LPV controllers and state observers.

Author Contributions: Conceptualisation, M.W. and K.P.; methodology, D.Z.; software, D.Z.; vali-
dation, D.Z., K.P. and M.W.; formal analysis, K.P.; investigation, D.Z.; data curation, D.Z.; writing—
original draft preparation, D.Z.; writing—review and editing, K.P. and M.W. All authors have read
and agreed to the published version of the manuscript.
Funding: This research received no funding.
Data Availability Statement: The datasets presented in this article are not readily available because
the data are part of an ongoing study.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Aranovskiy, S.; Biryuk, A.; Nikulchev, E.V.; Ryadchikov, I.; Sokolov, D. Observer design for an inverted pendulum with biased
position sensors. J. Comput. Syst. Sci. Int. 2019, 58, 297–304. [CrossRef]
2. Xu, Q.; Stepan, G.; Wang, Z. Balancing a wheeled inverted pendulum with a single accelerometer in the presence of time delay. J.
Vib. Control 2017, 23, 604–614. [CrossRef]
3. Hamza, M.F.; Yap, H.J.; Choudhury, I.A.; Isa, A.I.; Zimit, A.Y.; Kumbasar, T. Current development on using Rotary Inverted
Pendulum as a benchmark for testing linear and nonlinear control algorithms. Mech. Syst. Signal Process. 2019, 116, 347–369.
[CrossRef]
4. Mehedi, I.M.; Ansari, U.; Al-Saggaf, U.M. Three degrees of freedom rotary double inverted pendulum stabilization by using
robust generalized dynamic inversion control: Design and experiments. J. Vib. Control. 2020, 26, 2174–2184. [CrossRef]
Electronics 2024, 13, 514 21 of 21

5. Baimukashev, D.; Sandibay, N.; Rakhim, B.; Varol, H.A.; Rubagotti, M. Deep learning-based approximate optimal control of a
reaction-wheel-actuated spherical inverted pendulum. In Proceedings of the 2020 IEEE/ASME International Conference on
Advanced Intelligent Mechatronics (AIM), Boston, MA, USA, 6–9 July 2020; pp. 1322–1328.
6. Du, D.; Zhang, C.; Song, Y.; Zhou, H.; Li, X.; Fei, M.; Li, W. Real-time Hinf control of networked inverted pendulum visual servo
systems. IEEE Trans. Cybern. 2019, 50, 5113–5126. [CrossRef] [PubMed]
7. Bezci, Y.E.; Aghaei, V.T.; Akbulut, B.E.; Tan, D.; Allahviranloo, T.; Fernandez-Gamiz, U.; Noeiaghdam, S. Classical and intelligent
methods in model extraction and stabilization of a dual-axis reaction wheel pendulum: A comparative study. Results Eng. 2022,
16, 100685. [CrossRef]
8. Nguyen, B.H.; Cu, M.P.; Nguyen, M.T.; Tran, M.S.; Tran, H.C. Lqr and fuzzy control for reaction wheel inverted pendulum model.
Robot. Manag. 2019, 24.
9. Trentin, J.F.S.; Da Silva, S.; Ribeiro, J.M.D.S.; Schaub, H. Inverted pendulum nonlinear controllers using two reaction wheels:
design and implementation. IEEE Access 2020, 8, 74922–74932. [CrossRef]
10. Belascuen, G.; Aguilar, N. Design, modeling and control of a reaction wheel balanced inverted pendulum. In Proceedings of the
2018 IEEE Biennial Congress of Argentina (ARGENCON), San Miguel de Tucuman, Argentina, 6–8 June 2018; pp. 1–9.
11. Önen, Ü.; Çakan, A. Multibody modeling and balance control of a reaction wheel inverted pendulum using lqr controller. Int. J.
Robot. Control Syst. 2021, 1, 84–89. [CrossRef]
12. Chinelato, C.I.G.; Neves, G.P.D.; Angélico, B.A. Safe control of a reaction wheel pendulum using control barrier function. IEEE
Access 2020, 8, 160315–160324. [CrossRef]
13. Ding, H.; Zhou, Z.; Dang, H.; Zhao, Z. Control Wheel Rotation Inverted Pendulum Control Based on Unscented Kalman Filter.
In Proceedings of the 2019 IEEE 2nd International Conference on Information Communication and Signal Processing (ICICSP),
Weihai, China, 28–30 September 2019; pp. 81–86.
14. Türkmen, A.; Korkut, M.Y.; Erdem, M.; Gönül, Ö.; Sezer, V. Design, implementation and control of dual axis self balancing
inverted pendulum using reaction wheels. In Proceedings of the 2017 10th International Conference on Electrical and Electronics
Engineering (ELECO), Chengdu, China, 10–17 August 2017; pp. 717–721.
15. Hofer, M.; Muehlebach, M.; D’Andrea, R. The One-Wheel Cubli: A 3D inverted pendulum that can balance with a single reaction
wheel. Mechatronics 2023, 91, 102965. [CrossRef]
16. Kim, Y.; Park, J.; Han, S. Balancing the Cubli Frame with LQR-controlled Reaction Wheel. J. Sens. Sci. Technol. 2018, 27, 165–169.
17. Ryadchikov, I.; Sokolov, D.; Biryuk, A.; Sechenev, S.; Svidlov, A.; Volkodav, P.; Mamelin, Y.; Popko, K.; Nikulchev, E. Stabilization
of a hopper with three reaction wheels. In Proceedings of the ISR 2018, 50th International Symposium on Robotics, VDE, Munich,
Germany, 20–21 June 2018; pp. 1–4.
18. Sontag, E.D. Mathematical Control Theory: Deterministic Finite Dimensional Systems; Springer Science & Business Media:
Berlin/Heidelberg, Germany, 2013; Volume 6.
19. Georgiou, T.T.; Lindquist, A. The Separation Principle in Stochastic Control, Redux. IEEE Trans. Autom. Control 2013, 58, 2481–2494.
[CrossRef]
20. Gillijns, S.; De Moor, B. Unbiased minimum-variance input and state estimation for linear discrete-time systems. Automatica 2007,
43, 111–116. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
