
State and Dynamics Estimation in an Outdoor

Multi-Drone Slung Load System

by
Harvey Merton
B.E.(Hons) Mechatronics Engineering, University of Auckland, 2022

Submitted to the Department of Mechanical Engineering


in partial fulfillment of the requirements for the degree of
MASTER OF SCIENCE IN MECHANICAL ENGINEERING
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2024

© 2024 Harvey Merton. This work is licensed under a CC BY-NC-ND 4.0 license.
The author hereby grants to MIT a nonexclusive, worldwide, irrevocable, royalty-free
license to exercise any and all rights under copyright, including to reproduce, preserve,
distribute and publicly display copies of the thesis, or release the thesis under an
open-access license.

Authored by: Harvey Merton


Department of Mechanical Engineering
August 23, 2024

Certified by: Ian W. Hunter


Hatsopoulos Professor of Mechanical Engineering, Thesis Supervisor

Accepted by: Nicolas G. Hadjiconstantinou


Professor of Mechanical Engineering
Chair, Department Committee on Graduate Theses
State and Dynamics Estimation in an Outdoor Multi-Drone Slung
Load System
by
Harvey Merton
Submitted to the Department of Mechanical Engineering
on August 23, 2024 in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE IN MECHANICAL ENGINEERING

ABSTRACT
Over the past decade, aerial drones have been used to address problems in areas such as
sensing and measurement, inspection, delivery, security, and defense. Attaching a load to
one or more drones using a flexible cable can significantly enhance the capabilities of these
platforms.
This work aims to develop a multi-drone platform, built on open-source tools such as
PX4 and ROS2, that can be used to lift a general slung load in an outdoor environment.
Simulators of varying fidelity, including a pseudo-photo-realistic Gazebo simulator, are
developed alongside a functional real-world platform for testing load pose estimation methods.
A novel cable-based testing apparatus that enables drone translation is used to facilitate
stability testing of a quasi-static formation control method for lifting a slung load.
This work aims to be the first to use visual feedback to estimate a load’s pose in a
multi-drone slung load system operating without external motion capture devices. In
simulation, perspective-n-point-based visual estimation achieves position errors of 0.1 m and
geodesic-distance attitude errors near 0°. Real-world testing shows errors of 0.2 m and 5°
respectively. Applying extended Kalman filter and unscented Kalman filter formulations,
simulated position estimates average an error near 0 m, while the error noise magnitude,
at 0.06 m, is only 6% of the cable length.
Achieving accurate load pose estimates without an inertial measurement unit mounted to
the load requires a good cable dynamics model. This work concludes by presenting a novel
model for the effect of cables in a drone-slung-load system. A method based on universal
differential equations shows promising early results.

Thesis supervisor: Ian W. Hunter


Title: Hatsopoulos Professor of Mechanical Engineering

Acknowledgments

Firstly, I would like to extend my heartfelt gratitude to Professor Ian Hunter, my thesis
supervisor. Your curiosity and excitement about all things science and engineering have
cultivated an extremely diverse and inspiring lab culture. Your unwavering support and
excellent sense of humor have helped carry me through difficult stages of the research
process (including expensive drone crashes!). Through every interaction, it is clear that you
truly prioritise fostering the exploration and development of your students. For this, I am
immensely grateful.
Second, I extend my thanks to the dedicated members of the MIT BioInstrumentation
Laboratory. Your knowledge, enthusiasm, and wonderful personalities have created an
outstanding environment in which to grow and explore. I am privileged and proud to be
able to call each of you my friend.
I extend special gratitude to Kristan Hilby-Papalia as a senior graduate student also
working on aerial vehicle research. Your assistance with difficult problems and mentorship
on all things academia have been extremely valuable to me. Bowie’s company was also
always a good morale boost, even if he did often eat my swimming shorts or pass judgmental
side-eye.
Thanks to Ryan Poon and Kevin Becker for your shenanigans and tomfoolery. Oh, and
for your help developing the Drake simulator too, I suppose. Although friendship always
requires some exchange (looking at you, Ryan), your hard work and robotics prowess were
always provided (somewhat) willingly.
Thank you also to my two wonderful undergraduate research assistants: Lingyi Qiu and
Cole Firlie. Your work on the drone testing rig and the cable dynamics estimator,
respectively, was invaluable.
Last, but certainly not least, I would like to thank the huge community of family and
friends surrounding me. To my family and friends in New Zealand: thank you for your
continued unconditional support through calls and visits. If not for you, I would not have
arrived at MIT in the first place. To my friends in the United States: I know you can only
understand half of the words I speak, and the other half don’t make sense anyway. Despite
this, I have felt overwhelmingly supported by you all; my experience here would not be
anywhere near as enjoyable or expansive without you.
This work was supported in part by Fonterra Co-operative Group Limited. Thank you
for allowing me to do what I love for the last two years.
Alright, enough with the soppy stuff. Let’s crack on with some robotics!

Contents

Title page 1

Abstract 3

Acknowledgments 5

List of Figures 9

List of Tables 15

Nomenclature 17

1 Introduction 19
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2 Background and Literature Review 23


2.1 Multi-agent systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2 Multi-agent slung load carrying . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.1 System architecture and simulation . . . . . . . . . . . . . . . . . . . 26
2.2.2 Sensing and state estimation . . . . . . . . . . . . . . . . . . . . . . . 29
2.2.3 Trajectory planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.4 Dynamics and control . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 Key contributions to literature . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3 System Architecture 37
3.1 Dynamics modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2 Higher-level system architecture . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Software architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3.1 Back-seat driver architecture . . . . . . . . . . . . . . . . . . . . . . . 42
3.3.2 Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3.3 Companion computer code . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.4 Multi-agent coordination . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4.1 Low-fidelity custom Julia simulator . . . . . . . . . . . . . . . . . . . 48
3.4.2 Medium-fidelity Drake simulator . . . . . . . . . . . . . . . . . . . . . 50
3.4.3 High-fidelity software-in-the-loop Gazebo simulator . . . . . . . . . . 59
3.5 Real-world platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.5.1 Hardware system architecture . . . . . . . . . . . . . . . . . . . . . . 63
3.5.2 Testing apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.7 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4 Slung Load Pose Estimation 69


4.1 Higher-level estimation architecture . . . . . . . . . . . . . . . . . . . . . . . 69
4.2 Visual pose measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2.1 Camera projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.2 Perspective-n-point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2.3 Load pose measurement results . . . . . . . . . . . . . . . . . . . . . 75
4.3 Load pose estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.3.1 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.3.2 Load pose estimation results . . . . . . . . . . . . . . . . . . . . . . . 85
4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

5 Cable Dynamics Modelling 91


5.1 The massless cable assumption . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 Cable dynamics modelling with universal differential equations . . . . . . . . 92
5.2.1 Universal differential equations . . . . . . . . . . . . . . . . . . . . . 93
5.2.2 In-system cable dynamics modelling . . . . . . . . . . . . . . . . . . . 94
5.3 Implementation in Julia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.3.1 UDE system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.3.2 Solver and loss functions . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.3.3 Training data generation . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.4 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.4.1 Training the tension approximation function embedded in the UDE
system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.4.2 Directly training the tension approximation function . . . . . . . . . 103
5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.6 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

References 107

A Code Repositories 119

B Frame Transform Full Tree 121

C Testing Videos 123

D Onshape CAD Documents 125

E Bill of Materials 127

List of Figures

1.1 Example sensing scenarios for a multi-drone, slung load system. Clockwise
from top left: a herd of cattle, a generic row crop, a landfill, and natural gas
pipeline leaks. Copyright © Kristan Hilby, reproduced with permission. . . . 20

2.1 General ‘sense-think-act’ robotic paradigm. Sense includes all sensors (includ-
ing cameras and computer vision) and related filtering techniques. Think in-
cludes estimation, localization and mapping (often performed simultaneously),
decision making (such as what task to perform next), planning (including path
and motion planning), and other task-specific artificial intelligence. Act in-
cludes using knowledge of dynamics and control to move actuators and create
robotic motion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Higher-level classifications of multi-drone systems as presented by Maza et al.
[39]: a) physical coupling, b) flying in formation, c) swarms, d) fleets. Gray
arrows represent drones and gray crosses represent trajectory waypoints on a
black arrow trajectory. Figure from Maza et al. [39]. . . . . . . . . . . . . . 25
2.3 Holybro’s Pixhawk 6X wiring example showcasing the large ecosystem of Pix-
hawk peripherals. Figure from Holybro [63]. . . . . . . . . . . . . . . . . . . 28
2.4 Trajectory planning submodules. Green dot represents start state, red dot
represents end state, grey dots represent path waypoints while the orange line
represents a dynamically feasible optimal trajectory. . . . . . . . . . . . . . . 32

3.1 Three quadrotors carrying a rigid body slung load. Figure from Sreenath and
Kumar [53]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Higher-level system architecture showing key hardware components and soft-
ware modules (indicated by boxes with colored borders). The hardware com-
ponents are discussed in detail in Section 3.5.1, while the software components
are discussed in Section 3.3. The box outline colors correspond to software
module classifications discussed in Section 3.3. . . . . . . . . . . . . . . . . . 42
3.3 PX4’s offboard mode system architecture. Lower level control is handled by
PX4 autopilot running on a Pixhawk flight controller (Holybro’s Pixhawk
6X here), while higher-level functionality runs on the companion computer
(Raspberry Pi 5 here). This is an example of the back-seat driver architecture.
Figure adapted from PX4 documentation [85]. . . . . . . . . . . . . . . . . . 43
3.4 How the ‘extremely resource constrained environment (XRCE)’ distribution
of micro DDS connects PX4’s uORB and ROS2. This allows ROS2 nodes
running on a companion computer (Raspberry Pi 5) to communicate with
PX4 (on the Pixhawk 6X) directly. Figure adapted from PX4 documentation
[85]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 PX4’s multicopter control system architecture. Figure from PX4 documenta-
tion [85]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.6 Modules in the full system architecture linked together by ROS2. Red indi-
cates custom modules written in Python, green indicates C++, blue indicates
Julia, while grey boxes indicate non-custom systems. Modules below the dot-
ted line have multiple instances (one for each drone) which are designed to
run onboard each drone in the real world system. . . . . . . . . . . . . . . . 46
3.7 Arrangement of a 3-drone system in a virtual cone structure around the load.
The locations of the drone body-fixed coordinate systems (T1W – T3W ) and the
load coordinate system (TLW ) are ultimately defined relative to the fixed (in-
ertial) world coordinate system (W ). Note that all drone coordinate systems
have their x-axes fixed towards the center of the virtual cone’s base. . . . . . 48
3.8 A top-down view of drones arranged in a virtual cone around a load. In this
case, the drones are the same height above the load, so the virtual struc-
ture becomes a circle. Drones are evenly spaced around the circumference
of the virtual circle. Note that the drone coordinate systems (T1L – T3L – TnL )
are defined relative to the load coordinate system (TLW ) to make the virtual
structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.9 Trajectory of the load used to generate data in the Julia simulator: a circle
of 2 m radius with a fixed load orientation. W represents the fixed ‘world’
coordinate system while L0 – Ln represent the load’s coordinate system over
timesteps 0 – n. Note that the drone orientations stay fixed relative to the
load’s orientation as defined in Figure 3.7. . . . . . . . . . . . . . . . . . . . 50
3.10 Dynamically feasible data points for a three-drone rigid-body load system,
generated by the low-fidelity Julia simulator. The load travels anti-clockwise
in a circle without vertical motion. The drones remain arranged in a way such
that neither the load nor the drones rotate over the trajectory. The first ‘x’
in the labels represents state while L, 1, 2 & 3 index the load and the drones.
Data points are split into x, y and z components relative to the inertial frame
as identified by the trailing ‘x’, ‘y’ or ‘z’. Data points for cables that appear
to be missing are directly under the corresponding data points for another cable. 51
3.10 Continued figures from previous page. . . . . . . . . . . . . . . . . . . . . . . 52
3.11 Block diagram illustrating the main components of state regulation and tra-
jectory stabilization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.12 Kinematic tree of the three drones, tethers, and load. Note the parent link is
the load, followed by the three tethers and then the drones. . . . . . . . . . . 54
3.13 Load with three tethers and three drones at a stabilizable fixed point. . . . . 54
3.14 Visualizations and plots of position over time for a single quadrotor using a)
an infinite-horizon LQR controller and b) a time-varying LQR controller. In
b), the controller switches to an infinite-horizon LQR controller at the end of the
trajectory. Note that the slight overshoots seen in a) aren’t seen in b). . . . . 57
3.15 Visualizations and plots of position over time for three quadrotors using a)
an infinite-horizon LQR controller and b) a time-varying LQR controller. In
b), the controller switches to an infinite-horizon LQR controller at the end of the
trajectory. Note that the slight overshoots seen in a) aren’t seen in b). . . . . 58
3.16 Multi-drone slung load system in Gazebo simulation. The box covered in
ArUco markers has a simulated mass of 1.6 kg, while the drones are realistic
simulations of Holybro X500 quadcopters. . . . . . . . . . . . . . . . . . . . 60
3.17 Discretized cable at its initialization point in Gazebo. Note the 10 discrete
rigid links that make up the cable are clearly visible. . . . . . . . . . . . . . 61
3.18 Image from a simulated camera mounted to a drone in the slung load system.
The overlayed axes (red = x, green = y, blue = z) represent the estimated
marker axes using OpenCV’s IPPE Square PnP algorithm. . . . . . . . . . . 62
3.19 The real world multi-drone slung load system. . . . . . . . . . . . . . . . . . 63
3.20 Key features of the Holybro X500 drone platform. . . . . . . . . . . . . . . . 64
3.21 Key components on the custom 1.62 kg load. The load is made of clear acrylic
for ease of prototyping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.22 A ‘gimbal-style’ drone testing apparatus. Note that a drone’s roll, pitch and
yaw can vary by rotating a combination of the red, green and blue rig com-
ponents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.23 Novel cable-based drone testing apparatus in small and large configurations.
Here both configurations are pictured holding a Holybro X500 drone. . . . . 66

4.1 Architecture of the multi-drone slung load state estimator. x̃dLi and θ̃Ldi repre-
sent the measured (indicated by ˜·) position and orientation of the load relative
to drone i ∈ {1..N }, while all other symbols are defined in Table 3.1. Note
that the dotted box contains the components that run on each drone i, and
the dotted arrow indicates that the ‘slung load Kalman filter’ does not yet
produce θL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2 Pinhole camera model showing a projection of a point on an object from the
3D world frame (TpW ) into the image frame (u, v) through the camera frame
(C) adapted from the OpenCV documentation [90]. . . . . . . . . . . . . . . 72
4.3 Perspective-3-point visualized. 3 points on an object P1 , P2 , P3 are projected
on to a 2D image plane P1′ , P2′ , P3′ . A tetrahedron is formed by the camera co-
ordinate system origin (peak), object (base) and the projection lines s1 , s2 , s3 .
This tetrahedron provides the geometric constraints necessary to find the pose
of the object relative to the camera. . . . . . . . . . . . . . . . . . . . . . . . 74
4.4 A visualization of the pose ambiguity when using the IPPE algorithm to detect
a square marker. The two solutions correspond to the marker being flipped
around a plane parallel to the image plane. Note that this ambiguity only
affects the marker’s predicted orientation, not its position. . . . . . . . . . . 75
4.5 Trajectory of the load used to generate data for visual measurement: a circle
of 2 m radius changing load orientation. The rate of load orientation change
is synchronised with the load’s circular motion such that the same face of the
load is always facing the center of the circle. W represents the fixed ‘world’
coordinate system while L0 – Ln represent the load’s coordinate system over
timesteps 0 – n. Note that the drone orientations stay fixed relative to the
load’s orientation as defined in Figure 3.7. . . . . . . . . . . . . . . . . . . . 76
4.6 Image stream from a simulated camera mounted to a drone in the slung load
system. The overlayed axes (red = x, green = y, blue = z) represent the
estimated marker axes using OpenCV’s IPPE square PnP algorithm. This is
a repetition of Figure 3.18 for convenience. . . . . . . . . . . . . . . . . . . . 77
4.7 Estimated pose of the slung load using the IPPE square algorithm. These
results were captured in real time using an image stream from a camera on-
board a single drone. All drones were simultaneously running this algorithm
and achieved comparable results. . . . . . . . . . . . . . . . . . . . . . . . . 78
4.8 Estimated pose of the slung load using the IPPE square algorithm - error by
component. These results were captured in real time using an image stream
from a camera onboard a single drone. All drones were simultaneously running
this algorithm and achieved comparable results. . . . . . . . . . . . . . . . . 79
4.9 Estimated pose of the slung load using the IPPE square algorithm - combined
error. These results were captured in real time using an image stream from a
camera onboard a single drone. All drones were simultaneously running this
algorithm and achieved comparable results. . . . . . . . . . . . . . . . . . . . 80
4.10 Trajectory of the load used to generate data in the Julia simulator: a circle
of 2 m radius with a fixed load orientation. W represents the fixed ‘world’
coordinate system while L0 – Ln represent the load’s coordinate system over
timesteps 0 – n. Note that the drone orientations stay fixed relative to the
load’s orientation as defined in Figure 3.7. This is a repetition of Figure 3.9
for convenience. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.11 EKF and UKF predicted load trajectory vs the unfiltered (UNFI) load trajec-
tory obtained without measurement correction. The reference load trajectory
is a circle of 2 m radius, flying at 1.5 m/s. Measurement noise is modelled
as ynoise ∼ N (0, 0.01) while process noise is modelled with control input
uncertainty utranslation noise ∼ N (0, 1e−4 ), urotation noise ∼ N (0, 9e−6 ). . . . 86
4.12 Predicted position error components (x, y, z) and magnitude (r) over the
trajectory seen in Figure 4.11. Blue lines correspond to the mean while the
red dotted lines correspond to 3 standard deviations from the mean (in the
filter cases). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.12 Predicted position error components and magnitude continued. . . . . . . . . 88

5.1 Tension vectors at either end of a cable hanging in a catenary shape (tload , tdrone ∈
R3 ). The cable in question is connected to drone i at xi and to the load at
connection point xLi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.2 Training data (points) and post-training predictions (lines) for a three-drone
rigid-body load system. The load travels anti-clockwise in a circle without
vertical motion. The drones remain arranged in a way such that neither the
load nor the drones rotate over the trajectory. The first ‘x’ in the labels
represents state while L, 1, 2 & 3 index the load and the drones. Data is split
into x, y and z components relative to the inertial frame as identified by the
trailing ‘x’, ‘y’ or ‘z’. Data points for cables that appear to be missing are
directly under the corresponding data points for another cable. . . . . . . . . 98
5.2 Continued figures from previous page. . . . . . . . . . . . . . . . . . . . . . . 99
5.3 Cable tension and relative cable attachment point motion data and predictions
from the same trajectory as Figure 5.2. Predictions come from training the
neural network embedded in the UDE. . . . . . . . . . . . . . . . . . . . . . 100
5.3 Continued figures from previous page. . . . . . . . . . . . . . . . . . . . . . . 101
5.4 Loss history from training the neural network embedded in the UDE. . . . . 102
5.5 Loss history while training the neural network directly. . . . . . . . . . . . . 103
5.6 Cable tension and relative cable attachment point motion data and predic-
tions from the same trajectory as Figure 5.2. Trained predictions come from
training the neural network directly. . . . . . . . . . . . . . . . . . . . . . . . 104

B.1 A full TF tree generated by ROS2’s tf2_tools package. . . . . . . . . . . . . 122

D.1 Screenshot of the slung load CAD model in Onshape, roughly 0.3 m × 0.3 m ×
0.2 m. The lighter variant removes the bottom tier, the additional weights,
and the u-bolts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
D.2 Screenshot of the testing rig frame CAD model in Onshape in its 1 m3 config-
uration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
D.3 Screenshot of the 170 mm long Intel Realsense camera mount CAD model in
Onshape. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

List of Tables

2.1 Categorization of key literature addressing slung load estimation in a single-


drone slung load system. Rows represent what is being measured, while the
columns represent how that quantity is being measured. . . . . . . . . . . . . 30

3.1 Symbols used in modelling a multi-quadcopter slung load system’s dynamics.


Note that scalars are represented with Roman style letters such as ‘x’, vectors
use bold lowercase letters like ‘x’, and matrices use bold uppercase letters
like ‘X’. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

E.1 Bill of materials with approximate costs for the drone testing apparatus. All
$ are in USD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
E.2 Bill of materials with approximate costs for the full multi-drone slung load
system. All $ are in USD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

Nomenclature

Load
mL ∈ R Mass of a slung load.
JL ∈ R3×3 Inertia matrix of a slung load with respect to the load’s body-fixed
frame.
RL ∈ SO(3) The rotation matrix of the load from the load’s body-fixed frame to
the inertial frame.
θL ∈ R3 Orientation of the load as roll-pitch-yaw (RPY) angles in the inertial
frame. This is RL converted to RPY Euler angles.
xL ∈ R3 Position vector of the center of mass of the load in the inertial frame.
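As an illustrative aside (not part of the thesis), the relationship between RL and the roll-pitch-yaw angles θL can be sketched numerically. The helper below assumes the common Z-Y-X (yaw, then pitch, then roll) Euler convention and is valid away from the pitch = ±90° singularity:

```python
import numpy as np

def rotation_to_rpy(R):
    """Convert a rotation matrix in SO(3) to roll-pitch-yaw angles,
    assuming the Z-Y-X Euler convention (yaw about z, then pitch
    about y, then roll about x). Returns angles in radians."""
    pitch = -np.arcsin(R[2, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return np.array([roll, pitch, yaw])

# A pure 90-degree yaw rotation about the inertial z-axis:
yaw90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
rpy = rotation_to_rpy(yaw90)  # recovers roll = 0, pitch = 0, yaw = pi/2
```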

Drones
mi ∈ R Mass of the ith drone.
Ji ∈ R3×3 Inertia matrix of the ith drone with respect to that drone’s body-fixed
frame.
Ri ∈ SO(3) The rotation matrix of the ith drone from that drone’s body-fixed
frame to the inertial frame.
θi ∈ R3 Orientation of the ith drone as roll-pitch-yaw (RPY) angles in the
inertial frame. This is Ri converted to RPY Euler angles.
xi ∈ R3 Position vector of the center of mass of the ith drone in the inertial
frame.
τi ∈ R3 Moment produced by the ith drone.
fi ∈ R Thrust produced by the ith drone.
ψi ∈ R Yaw angle of the ith drone.
xi/Li ∈ R3 Relative position of the ith drone to its attachment point on the load
(xLi ) in the body-fixed frame of the load. Equivalent to −li qi .

Cables and Tensions


qi ∈ S2 Unit vector from the ith drone to its attachment point on the load in
the body-fixed frame of the load.
ti ∈ R Tension in the cable between the ith drone and the load. This symbol
is used only when the massless cable assumption is in effect.
tdrone,i ∈ R3 Tension vector of the ith cable at its attachment point to the ith drone.
tload,i ∈ R3 Tension vector of the ith cable at its attachment point to the load.
li ∈ R Length of the cable between the ith drone and the load.
ri ∈ R3 Vector from the center of mass of the load to the attachment point of
the ith drone to the load.
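To make the cable symbols concrete, the following sketch (illustrative values only, not from the thesis) relates a drone position and its load attachment point to qi, li, and the identity xi/Li = −li qi, assuming a taut cable and all quantities expressed in the load's body-fixed frame:

```python
import numpy as np

# Hypothetical example values: attachment point x_Li on the load and
# the i-th drone's position, both in the load's body-fixed frame.
x_attach = np.array([0.2, 0.0, 0.0])   # x_Li: cable attachment on load
x_drone = np.array([0.2, 0.0, 2.0])    # drone hovering 2 m above it

cable = x_attach - x_drone             # vector from drone to attachment
l_i = np.linalg.norm(cable)            # cable length l_i (taut cable)
q_i = cable / l_i                      # unit vector q_i in S^2
x_rel = -l_i * q_i                     # x_{i/Li} = -l_i * q_i

# x_rel recovers the drone's position relative to the attachment point.
```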

General
ᵃTcᵇ ∈ SE(3) A homogeneous transformation matrix defining the transform from
coordinate system c into coordinate system b as defined in coordinate
system a. This is also the pose of coordinate system c relative to
coordinate system b as defined in coordinate system a. If the ‘a’ is
omitted, the pose can be assumed to be defined in the inertial world
frame W .
ᵃRcᵇ ∈ SO(3) A rotation matrix defining the rotational transformation from
coordinate system c into coordinate system b as defined in coordinate
system a. This is also the rotation of coordinate system c relative to
coordinate system b as defined in coordinate system a. If the ‘a’ is
omitted, the rotation can be assumed to be defined in the inertial
world frame W .
ᵃtcᵇ ∈ R3 A translation vector defining the translation from point b to point c as
defined in coordinate system a.
ê1 , ê2 , ê3 ∈ R3 Standard unit vectors along the x, y, z axes in the inertial frame.
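The transform composition implied by this notation can be sketched as follows (hypothetical poses, not from the thesis): chaining the load's pose in the world frame, TLW, with drone 1's pose in the load frame, T1L, yields drone 1's pose in the world frame, T1W:

```python
import numpy as np

def make_T(R, t):
    """Assemble a homogeneous transform in SE(3) from a rotation
    matrix R in SO(3) and a translation vector t in R^3."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical poses: the load 10 m up and 5 m along x in the world
# frame, and drone 1 offset 1 m in y and 2 m in z from the load.
T_WL = make_T(np.eye(3), np.array([5.0, 0.0, 10.0]))  # T_L^W
T_L1 = make_T(np.eye(3), np.array([0.0, 1.0, 2.0]))   # T_1^L

# Chaining transforms: T_1^W = T_L^W @ T_1^L gives drone 1's pose
# directly in the world frame.
T_W1 = T_WL @ T_L1
```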

Modifiers
ˆ· The value is estimated (with the exception of the unit vectors defined
above).
˜· The value is measured and should be distinguished from the real value.

Chapter 1

Introduction

Unmanned aerial vehicles (UAVs), also known as drones, are a class of robotic platform
defined by their ability to fly without an onboard human pilot. They represent a fusion of
mechanical and aeronautical design; power electronics and computing; wireless
communication; energy storage technology; mechatronics; controls; robotics software
development; and artificial intelligence. Advancements in these technologies have made
controlled unmanned flight possible, a feat that manned vehicles have been capable of for
over a century [1].
Drones are usually classified by their size (e.g., micro, small, medium, and large), their
method of take-off (e.g., vertical or horizontal take-off and landing: VTOL or HTOL), or
how they produce lift (e.g., rotary wing, fixed wing, flapping wing, and airship) [2]. Some
vehicles even combine the advantages of different classes (e.g., hybrid VTOL tiltrotors), or
have unconventional designs that often imitate nature [3]. One particularly common UAV
platform is known as the quadcopter or quadrotor. Quadcopters are a subclass of rotary wing
aircraft that have four rotors, and represent the simplest form of multirotor aircraft [4]. Each
drone morphology has associated advantages and disadvantages in performance metrics such
as flight time, range, maneuverability, payload capacity, ease of flight, cost, takeoff space, and
portability. As such, drones should be selected carefully based on the desired application.
Drones have already been used successfully for a wide variety of purposes. Some applica-
tions include: disaster relief [5]–[7], delivery and logistics [8]–[11], medical aid delivery [11]–
[13] (particularly in rural communities [14]), defense [15]–[17], surveillance [18], mapping nat-
ural landscapes [19]–[22], agriculture [23]–[26] and infrastructure inspection [27]–[29]. One
particularly pertinent use case of drones is in measuring gaseous (particularly greenhouse
gas) emissions [30]–[32]. Detection and quantification of such emissions is crucial to meeting
our international obligations under the Paris Agreement [33].

1.1 Motivation
In 2019, an estimated 22% of global greenhouse gas emissions were due to agriculture, forestry
and other land use [34]. A further 24% comes from the burning of fossil fuels at industry
sites to produce energy [34]. Aerial monitoring solutions, such as those enabled by drones,
are flexible enough to be deployed over all of these domains. Unfortunately, some gas sensing

devices have an extremely limited range, such as the short range infra-red camera used by
Huang [35]. In order to get good data from such sensors, the drone carrying the sensor must
either fly close to the object of interest, or the sensor must hang below the drone. Using a
flexible cable makes it easy to adjust the distance between the drone and sensor, increasing
the variety of situations in which the aerial measurement system would provide utility.
When a load is hung below a drone using a flexible cable, it can be referred to as a
‘slung load’ [36]. A single load can also be hung below multiple drones simultaneously, in a
group described as a ‘multi-drone slung load system’. Figure 1.1 shows how a multi-drone
slung load system could be used to monitor a variety of agricultural and industrial emissions
sources.

Figure 1.1: Example sensing scenarios for a multi-drone, slung load system. Clockwise from
top left: a herd of cattle, a generic row crop, a landfill, and natural gas pipeline leaks.
Copyright © Kristan Hilby, reproduced with permission.

Using a multi-drone system to carry a sensing load provides several advantages over a
single-drone slung load system, including:

• Increased lifting capacity and/or flight time: using more drones means heavier
loads can be carried with the same model of drone. This can also reduce the load on
any one drone in the multi-drone system. As the flight time of a drone increases with

decreasing load, this can extend the flight time of the whole multi-drone slung load
system when compared to a single drone lifting the same load.
• More stable load orientation control: controlling the orientation of a load slung
from a single drone requires that an actuator be placed on the load, or that multiple
cables are used to connect the drone and the load. In the latter case, unless the
drone is large enough to space the cables far apart, the torque-generating stabilization
capability of the cables will be low. Using multiple drones allows the tension on each
cable to be controlled individually without additional actuators, and creates a larger
adjustable baseline between the cable attachment points on the drones. This can easily
create a greater horizontal component of cable tension to control the load’s yaw, which
is essential for orienting direction-dependent sensors.
• Shifting propeller downwash: the air forced downwards by drone propellers as
they generate lift is known as downwash. The speed of this downwash depends on the
distance below a drone's propellers, the type of propellers used, the radial distance from the
center of a propeller, and how fast the propeller is spinning [37]. Downwash can disrupt
gases that a drone-sensor system might be attempting to measure, thus rendering the
measurements inaccurate. Using a multi-drone slung load system allows this downwash
to be shifted away from directly over a sensing device, thus clearing a gas source from
the effects of downwash and allowing the sensing device to function as intended.
The primary motivation of this work is to enable a variety of environmental sensors to
be used effectively over a range of domains. For the reasons discussed above, a multi-drone
slung load system is selected as the platform to allow this. It is hoped that the platform
developed will also be useful in other aerial lifting applications.

1.2 Objectives
The overarching goal of this work is to develop a platform, built on state-of-the-art open-
source tools, that can be used to lift a general slung load in an outdoor environment. The
primary objectives within this goal are:
• Build a platform: design a generic, higher-level method of drone coordination that
will allow any number of vehicles to cooperatively carry a slung load. Appropriate
simulation and real-world implementations should be developed, based on open-source
tools, to test this method. The work towards this objective is discussed in Chapter 3.
• Get an estimate of load pose: develop a method to estimate the slung load’s
position and orientation (pose) relative to the drones. Progress towards this objective
is discussed in Chapter 4.
• Build a model of cable dynamics: propose a novel method for modelling flexible
cables in the multi-drone slung load system. This objective is discussed in Chapter 5.
To understand the key contributions of this work, we first survey the existing literature
on multi-drone slung load carrying in Chapter 2. This Chapter also derives a dynamics
model that is used throughout the remainder of the study.

Chapter 2

Background and Literature Review

Robotics and autonomous systems (including drones) are built from a variety of different
subsystems. One commonly accepted platform-independent approach for classifying these
subsystems is known as the ‘sense-think-act’ paradigm. Figure 2.1 describes this paradigm,
and draws rough boundaries around the subsystems. It is important to note that robotics
techniques may lie anywhere in this cycle and draw different boundaries depending on the
exact problem to be solved.

Figure 2.1: General ‘sense-think-act’ robotic paradigm. Sense includes all sensors (including
cameras and computer vision) and related filtering techniques. Think includes estimation,
localization and mapping (often performed simultaneously), decision making (such as what
task to perform next), planning (including path and motion planning), and other task-
specific artificial intelligence. Act includes using knowledge of dynamics and control to move
actuators and create robotic motion.

The platform that we are interested in is a system of multiple drones carrying a single load
with flexible cables, known as a multi-drone slung load system. This Chapter first describes
characteristics of general multi-agent systems (Section 2.1) before focusing specifically on
physically coupled multi-drone systems (Section 2.2). The survey of work on physically
coupled multi-drone systems is roughly organised to follow the sense-think-act paradigm
presented in Figure 2.1. This Chapter concludes with a summary of the key gaps in existing
literature addressed by this work.

2.1 Multi-agent systems


The term ‘agent’ refers to a generalized unit that can take information in, make decisions,
and act on an environment [38]. In the specific case of robotics and drones, each individual
robot or drone is known as an ‘agent’. Multi-agent drone systems (or ‘multi-drone sys-
tems’) are therefore systems where many drones interact in a way that is either cooperative,
collaborative or competitive.
Maza et al. [39] present four main classifications of multi-drone systems, pictured in
Figure 2.2:

• Physically coupled systems: drones are physically connected through flexible or
rigid structures.

• Formations: drones are not physically connected, but maintain some form of virtual
structure where the position of each drone relative to its neighbours is controlled.

• Swarms: a large number of drones move based on simple rules relative to their neigh-
bours, as inspired by biological systems [40]. The emergent global behavior of the
swarm can be much more complex than the behaviors of individual agents.

• Fleets: drones individually execute a set of tasks required to achieve some higher-level
goal. This usually requires coordinated deliberative planning.

Some other important descriptors common in literature are:

• Homogeneous and heterogeneous: whether all agents in the multi-drone system
are the same (homogeneous) or different (heterogeneous).

• Centralized and decentralized: whether a coordination algorithm runs on one agent
(drone or central control computer) and sends commands to other agents in the sys-
tem (centralized), or computations are performed on all units equally (decentralized).
These terms can also be applied to communication methods: centralized communica-
tions run through one single agent or router, while decentralized communications can
pass directly from agent to agent. In a decentralized system, any one unit can be
removed without affecting the functionality of the system. Conversely, removing the
central node in a centralized system will prevent the system from functioning.

Figure 2.2: Higher-level classifications of multi-drone systems as presented by Maza et al.
[39]: a) physical coupling, b) flying in formation, c) swarms, d) fleets. Gray arrows represent
drones and gray crosses represent trajectory waypoints on a black arrow trajectory. Figure
from Maza et al. [39].

• Synchronous and asynchronous: in a synchronous regime, agents are time-synced
and make decisions simultaneously. In contrast, asynchronous techniques allow agents
to make decisions at any time and adjust the decisions when communication is possible.
• Offline and online: in offline algorithms, the result of a computation is not used to
affect decision making during a mission. These algorithms are generally run before or
after a mission is performed. On the other hand, the results of online algorithms are
computed during the mission and may be used to inform actions. Real-time comput-
ing represents a special case of online computing, where program responses must be
guaranteed to lie within a specific time frame.
• Explicit and implicit communication: explicit communication schemes rely on
agents sending information directly to each other. In implicit communication schemes,
agents will infer how other agents are acting and make decisions based on this. Tsiamis
et al. provide an example of a leader-follower implicit communication scheme where
one robotic arm (the leader) knows a plan, and the other arm (follower) must infer
how to move cooperatively based on force and torque feedback [41].
An additional important feature that applies specifically to physically coupled systems
and formations is formation feedback. Formation feedback is when individual agents modify
their behavior based on knowledge of how the formation is moving. It was described by
Ren and Beard for modifying the speed of spacecraft based on how well the crafts are
tracking their desired positions in a virtual structure [42]. Ren later described an approach
to decentralize formation control via consensus strategies [43].

2.2 Multi-agent slung load carrying


A multi-agent slung load system is an example of a physically coupled multi-agent system.
The first publicly-available feasibility study into such a system was performed by the Vertol
Aircraft Corporation in 1957 using two manned helicopters and a slung load [44]. It was
not until the 1970s, however, that the first flight test demonstrations were conducted by PLM
(now PDG) helicopters [45]. Further studies into the dynamics, control and stability of
the two-helicopter slung load problem continued in the early 1990s, investigating feedback
linearization [45], [46] and H∞ [47] control. These studies suggested both
methodologies were stable in simulation, but did not progress to real-world experimental results.
This was likely due to the extreme costs and safety risks associated with testing on full-size
manned vehicles available at the time.
The first multi-UAV slung load system flight was performed in Berlin, December 2007
[48], [49]. As UAVs do not carry humans by definition, and the systems are much smaller,
the risks associated with real world testing are much lower. The following Sections discuss
key advances in the different areas of multi-UAV slung load carrying.

2.2.1 System architecture and simulation


The first real world multi-UAV slung load system flight was performed outdoors using three
12.5 kg helicopters [48], [49]. Each helicopter was powered by a 1.8 kW two-stroke engine,

had an onboard real-time computer for autonomous flight, and GPS and IMU units for
helicopter state estimation. The real-time computer ran all flight control, load state estimation
(discussed in Section 2.2.2), task scheduling, operator interfacing, wireless connectivity
and inter-process communication. As no widely available UAV flight controllers
existed at this time, the majority of these modules were implemented specifically for this
system, and are not easily transferable to other platforms (including quadcopters).
From 2010 to 2013, the GRASP Lab at the University of Pennsylvania published a series
of papers demonstrating that multiple-quadcopter systems can successfully carry a cable-
suspended payload [50]–[53]. These papers all used relatively simple MATLAB simulations
for trajectory generation (see Section 2.2.3) and control development (see Section 2.2.4) and
tested their final results indoors, with a motion capture system to track drone and load poses.
Reference [54] builds on this work by using the load position and orientation as feedback to a
geometric controller. Using indoor motion capture systems is the most common method for
experimental validation [55], [56], with outdoor experiments appearing sparsely in literature
[49], [57].

Open-Source Autopilots
In recent years, there has been a resurgence of interest in quadrotor slung-load-carrying
[58]–[61], driven by the maturity of open-source flight control software such as PX4 and
Ardupilot [62]. Both PX4 and Ardupilot perform lower-level UAV automation tasks such as
flight control, sensor integration and safety checks for a range of different UAV types (e.g.,
quadcopters, hexacopters, VTOLs and fixed-wing vehicles). Further, both software stacks
can run on a variety of flight management units (FMUs), including the Pixhawk open-source
hardware (see Figure 2.3 for an example hardware system). The standardization and open-
sourcing of both software and hardware platforms for UAVs means hobbyists, researchers
and corporations can focus on building higher-level task-specific autonomy rather than each
having to build their own flight controller from scratch.
One of the few multi-UAV slung load works that has taken advantage of open-source
systems for outdoor flight is that of Bulka et al. [57]. This experimental setup used three
off-the-shelf quadcopters and 8.5 m long cables. With a custom 3 kg load, it was able to
travel at 1 m s−1. It is interesting to note that, in this setup, the cables from all three drones
come to a single point before attaching to the load. This gives better roll and pitch stability,
but less control over yaw than attaching the cables at the corners of the load as done by Sreenath
[53]. An important feature of Bulka et al.'s setup is that it is built on the open-source PX4
autopilot for lower-level flight control [57]. The autopilot is running in offboard mode with
MAVROS and Raspberry Pi computers performing on-drone computations (see Chapter 3
for more detail); a much more generic setup than that proposed by Maza et al. [49]. A
special feature of this setup is the use of ethernet cables for inter-drone communication,
passed through a network switch on the load. It is noted that these cables severely reduced
the communication latency seen when using a Wi-Fi network. This additional communication
robustness is essential to the leader-follower control strategy employed by Bulka et al. [57].

Figure 2.3: Holybro’s Pixhawk 6X wiring example showcasing the large ecosystem of Pixhawk
peripherals. Figure from Holybro [63].

Simulation
As noted earlier, many works on multi-quadcopter slung load carrying perform simulations
in relatively simple MATLAB environments. Unfortunately, most of these works do
not open-source their simulation and experimental environments, leaving large engineering
effort required to recreate the results. RotorTM is a recent simulator developed to address
this [56]. It has implementations of full system dynamics, planning and control algorithms
and is available in both MATLAB and Python/ROS. An important novel contribution is
the inclusion of hybrid dynamics to capture the case when the cables become slack; most
works prior assumed that the cable tensions remained positive. Despite its many advances,
RotorTM still uses ROS1 rather than the newer ROS2, does not build on the open-source PX4 or
Ardupilot flight controllers, and does not model the effect of cable mass (see Section 3.1 for
the effect of this). Not building on the available open-source flight controllers means not
taking advantage of the many hours of community testing efforts; a significant limitation
for future-proofing platforms. Real world validations of RotorTM are performed in a Vicon
motion capture room.
Other approaches to building UAV simulators involve building on available open-source
simulation tools such as Gazebo [64], [65]. These tools tend to be more general and so easier
to substitute different UAVs in different scenes. Although work has been done to address
the limitations of some of these general robotics simulators [65], limitations still exist that
inspire custom simulations to be built for specific purposes. One key limitation of Gazebo,
for example, is its inability to simulate flexible bodies such as the cables in a UAV slung
load system [56].

2.2.2 Sensing and state estimation


Sensing and state estimation fall under the ‘sense’ and ‘localize’ stages of the sense-think-act
cycle seen in Figure 2.1. When experiments are not performed with an external motion-
capture system (sensor), an alternative method must be used to estimate the position and
orientation (pose) of the drones and slung load (localization). The Global Navigation Satel-
lite System (GNSS) is commonly used to obtain ground-truth drone positions. Many different
GNSS constellations exist such as GPS (owned by the government of the United States of
America), GLONASS (owned by the Russian government) and Galileo (owned by the Eu-
ropean Union) which provide global or regional coverage. Pose estimation of a UAV in a
GNSS-enabled environment is a mature problem; systems using extended Kalman filters to
fuse GNSS and inertial measurement unit (IMU) data have proven to be sufficient in indus-
try settings [66]. Methods for pose estimation in GNSS-denied environments, such as visual
inertial odometry (VIO), are outside the scope of this work and so not discussed here.
Experiments on load pose feedback in a multi-drone slung load system are extremely
sparse. Current outdoor multi-drone slung load systems mostly use sensors (such as an
IMU or GPS) mounted directly to the load [57], or use no load feedback in a quasi-static
approach. Quasi-static approaches simply make the assumption that the load is perfectly
positioned ‘statically’ at the center of a carrying formation [50]. This will clearly not produce
an accurate estimate of load pose when using flexible cables. Michael et al. propose an
optimization-based approach to improve on this quasi-static method, but base this purely

on a dynamics model rather than including some measurements [51]. As the literature on
multi-drone slung pose estimation is limited, the remaining articles mentioned in this Section
address the single-drone slung load problem.
There are two key categories for UAV slung load pose estimation:

• What is being measured: divided into system parameters (such as cable length and
load mass) and slung load pose.

• How it is being measured: divided into vision-based and non-vision-based techniques.

These categories are used to divide key relevant literature in Table 2.1.

                    Vision          Non-Vision
System parameters   N/A             [36], [46], [67], [68]
Load pose           [69]–[72]       [48], [67]

Table 2.1: Categorization of key literature addressing slung load estimation in a single-drone
slung load system. Rows represent what is being measured, while the columns represent how
that quantity is being measured.

As shown by Table 2.1, much of the current literature is focused on estimating system
parameters such as payload mass [46], [68] and cable length [36], [67]. Much of this work is
performed through the lens of adaptive control [46] using methods such as backstepping
[68]. Prkačin et al. take a different approach, modelling the slung load as a simple pendulum.
By isolating low frequency oscillations from the drone’s onboard IMU using a fast Fourier
transform (FFT), they can find the pendulum’s period (T), and thus the length (l) from

T = 2π √(l/g),        (2.1)
where g is acceleration due to gravity. Note that all of these methods estimate system
parameters using non-visual methods; visual methods for system parameter estimation are
yet to be explored in literature.
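As a rough illustration of this period-based idea (a minimal sketch under the simple-pendulum assumption, not Prkačin et al.’s implementation), the dominant FFT peak of an IMU trace can be inverted through Equation 2.1 to recover the cable length:

```python
import numpy as np

def estimate_cable_length(accel, fs, g=9.81):
    """Estimate cable length from the dominant low-frequency oscillation
    in an IMU acceleration trace, assuming simple-pendulum behavior."""
    spectrum = np.abs(np.fft.rfft(accel - np.mean(accel)))
    freqs = np.fft.rfftfreq(len(accel), d=1.0 / fs)
    f_peak = freqs[1:][np.argmax(spectrum[1:])]  # skip the DC bin
    T = 1.0 / f_peak                             # pendulum period
    return g * (T / (2.0 * np.pi)) ** 2          # invert Equation 2.1

# Synthetic check: a 1 m pendulum gives T = 2*pi*sqrt(1/9.81) ≈ 2.0 s
fs, l_true = 100.0, 1.0
t = np.arange(0.0, 60.0, 1.0 / fs)
f_pend = 1.0 / (2.0 * np.pi * np.sqrt(l_true / 9.81))
accel = 0.2 * np.sin(2.0 * np.pi * f_pend * t)
print(estimate_cable_length(accel, fs))  # ≈ 1.0 m (within FFT resolution)
```

The accuracy of such an estimate is limited by the FFT frequency resolution, so a longer IMU record gives a finer length estimate.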
The other key quantity to be measured is the pose of the slung load. If the objective
of carrying a slung load is to precisely position the load (e.g., in measurement or physical
interaction scenarios), it is important that this measurement is used in control feedback.
Without a good estimate of the slung load pose, however, any control method that uses the
slung load’s pose in feedback will be rendered useless.
Table 2.1 shows that some effort has been put into non-vision-based methods for load
pose estimation. Both of these use observers based on a simple 2 degree of freedom (DOF)
pendulum model [48], [67]. In one case, the control inputs that compensate for tension
forces are used to estimate cable tension angles [67], whereas in the other, two encoders
measure the 2-DOF pendulum angles directly [48]. Bernard and Kondak note that a good
measurement of the load movement could not be achieved from the encoders directly, as
waves propagating along the cable cannot be accounted for [48]. Bernard and Kondak

use the pendulum-model-based observer to address this, but do not discuss experimental
results to prove its effectiveness. It is noted however that the observer is highly sensitive to
parameters; the observer fails to converge with cable length discrepancies of 0.1 m and mass
discrepancies of 0.1 kg.
Table 2.1 shows that vision-based methods for slung load pose estimation are more com-
mon in literature than non-vision based methods. Such a vision-based system was used to
provide measurements of a slung load relative to a single carrying helicopter in [69], and
a single carrying quadcopter in [70]. These measurements were integrated with known 2-
DOF pendulum dynamics in an unscented Kalman filter (UKF) and Extended Kalman filter
(EKF) respectively. Results from both papers show that Kalman filters can successfully
estimate the slung load angular position to within 5° accuracy [69], [70], even when the slung
load moves momentarily out of the camera’s field of view (FOV) [69]. Neither of these works
provide an estimate of the slung load’s orientation which is unfixed around the cable axis.
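The structure such estimators share can be illustrated with a minimal linear sketch (this is not the UKF of [69] or the EKF of [70], which fuse the full nonlinear pendulum dynamics with vision; the noise levels and gains here are invented for the demonstration). A Kalman filter fuses a small-angle pendulum model with noisy camera angle measurements:

```python
import numpy as np

def kalman_pendulum(z_meas, dt, l, g=9.81, q=1e-4, r=np.deg2rad(2.0) ** 2):
    """Minimal linear Kalman filter for a small-angle pendulum.

    State x = [theta, theta_dot]; z_meas is a sequence of noisy angle
    measurements (e.g., from a downward-facing camera)."""
    A = np.array([[1.0, dt], [-(g / l) * dt, 1.0]])   # Euler-discretized dynamics
    H = np.array([[1.0, 0.0]])                        # only the angle is measured
    Q, R = q * np.eye(2), np.array([[r]])
    x, P = np.zeros(2), np.eye(2)
    estimates = []
    for z in z_meas:
        x, P = A @ x, A @ P @ A.T + Q                 # predict with the model
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + K @ (np.array([z]) - H @ x)           # update with the measurement
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x[0])
    return np.array(estimates)

# Synthetic test: a 1 m pendulum observed at 50 Hz with 2 deg measurement noise
dt, l = 0.02, 1.0
t = np.arange(0.0, 10.0, dt)
theta_true = 0.2 * np.cos(np.sqrt(9.81 / l) * t)
rng = np.random.default_rng(0)
z = theta_true + rng.normal(0.0, np.deg2rad(2.0), t.size)
theta_hat = kalman_pendulum(z, dt, l)
```

Blending the pendulum model with measurements is also what lets the filters in [69] ride out short dropouts when the load leaves the camera FOV: the predict step keeps running while updates are skipped.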
Moreno also modelled a slung load below a quadcopter as a 2 degree of freedom (DOF)
pendulum [72]. He also used a model-based computer vision (CV) method for estimating
the angles in this model. Further, he tested a novel neural-network based slung load posi-
tion estimator with three different training methodologies: Back-propagation Through Time
(BPTT), Real-Time Recurrent Learning (RTRL) and Echo State Network (ESN). Offline ex-
perimental results show RTRL performs the best when estimating slung load position giving
a testing mean squared error (MSE) of 0.00542 m. Although accuracy reduces when per-
forming on-board real-time estimation to 0.0231 m in the X-axis and 0.0091 m in the Y-axis,
these estimates are still accurate enough to be useful (accurate to 2 %). No quantitative
performance metrics were provided for the model-based CV method to compare results.
All articles previously discussed focus on estimating the position of a slung load while
neglecting its orientation. Naundrup et al. formulate the 2-DOF pendulum model in a
similar way to works previously discussed. However, they define one of their angles such
that the load’s pitch is tied to it, so the pitch is determined once the load’s position
is determined [71]. Like other papers, they use computer vision
techniques to measure the load (including Gaussian filtering, HSV filtering and Fitzgibbon
ellipse detection), and achieve a position (and pitch) measurement accuracy of 2° to 3°
in outdoor tests. This method, however, neglects load roll and yaw, which are essential to
estimate when moving in general trajectories rather than the straight lines of [71].

2.2.3 Trajectory planning


Trajectory planning is part of the ‘plan’ stage in the sense-think-act cycle seen in Figure
2.1. It addresses the problem: given a starting pose, an ending pose and an environment with
obstacles, how should a vehicle move through the environment in a dynamically feasible and collision-
free way? General trajectory planning is often solved directly with optimal control methods.
It is also often broken into two subcategories (path planning and trajectory
generation) and solved using a decoupled approach (see Figure 2.4). Path planning repre-
sents the first half of trajectory planning: generate a set of collision-free waypoints through
an environment that gives the shortest path from the start to the end pose. Trajectory
generation represents the second half of the problem: generating a set of states and control
inputs that maximize some defined performance (perhaps minimizing navigation time) while

resulting in a dynamically feasible trajectory between waypoints. Trajectory optimization is
a common approach to solving this problem, and does so via mathematical programming.

Figure 2.4: Trajectory planning submodules. Green dot represents start state, red dot
represents end state, grey dots represent path waypoints while the orange line represents a
dynamically feasible optimal trajectory.
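The trajectory generation step can be sketched with a minimal example (illustrative only; practical planners typically optimize higher-order polynomials, such as minimum-snap trajectories, across many waypoints). A single cubic segment between two waypoints with zero boundary velocities follows from solving the four boundary conditions:

```python
import numpy as np

def cubic_segment(p0, p1, T):
    """Coefficients of p(t) = a0 + a1*t + a2*t^2 + a3*t^3 joining p0 to p1
    in time T with zero velocity at both ends."""
    # Boundary conditions: p(0) = p0, p'(0) = 0, p(T) = p1, p'(T) = 0
    M = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [1.0, T, T**2, T**3],
                  [0.0, 1.0, 2.0 * T, 3.0 * T**2]])
    return np.linalg.solve(M, np.array([p0, 0.0, p1, 0.0]))

def evaluate(a, t):
    return a[0] + a[1] * t + a[2] * t**2 + a[3] * t**3

# Move the load reference 2 m along one axis over 4 s, starting and ending at rest
a = cubic_segment(0.0, 2.0, 4.0)
print(evaluate(a, 2.0))  # ≈ 1.0 (midpoint of a symmetric rest-to-rest segment)
```

Stacking one such solve per waypoint pair, per axis, yields a smooth reference that a downstream controller can track; swing-minimizing planners add the load dynamics to this optimization.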

In the context of slung load trajectory planning, the objective is often to navigate to
the target while reducing load swing to a minimum [53], [73]–[75]. Sreenath et al. produce
some of the earliest work that demonstrates the successful generation of payload-dynamics-
aware trajectories [53], [73]. They show that their method of using a differentially flat
hybrid system results in a load tracking error 300-400% lower than that generated by the
quasi-static models that preceded it [50]. The two key mathematical tools introduced to the
problem in these papers are:

• Differentially flat systems: systems where the states of, and inputs to, the system
can be written in terms of a set of flat outputs and a finite number of their higher
order derivatives [76]. These flat outputs are simply some functions of the full system
state and inputs. This allows the number of variables that we optimize over to be
reduced, as the full desired state and inputs can be determined once these flat outputs
are decided.

• Hybrid systems: systems that contain both continuous and discrete dynamics. In the
context of a drone-slung-load system, the dynamics are modelled continuously when
the cables are taut and slack respectively, but the hard transition from slack to taut
can introduce a discrete impulse.
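The differential-flatness idea can be illustrated with the standard point-mass quadrotor result: once a desired position trajectory (a flat output) is chosen, its second derivative determines the collective thrust and the body z-axis. A minimal sketch (the mass value is arbitrary, and yaw is omitted for brevity):

```python
import numpy as np

def thrust_from_flat_output(accel_des, mass, g=9.81):
    """Map a desired acceleration (second derivative of the position flat
    output) to collective thrust and the desired body z-axis."""
    f_vec = mass * (accel_des + np.array([0.0, 0.0, g]))  # required world-frame force
    f = np.linalg.norm(f_vec)  # thrust magnitude (control input)
    b3 = f_vec / f             # body z-axis direction (fixes roll and pitch)
    return f, b3

# Hover (zero desired acceleration): thrust m*g, body z-axis pointing up
f, b3 = thrust_from_flat_output(np.zeros(3), mass=1.5)
print(f, b3)  # ≈ 14.715, [0, 0, 1]
```

Because every state and input can be recovered this way from the flat outputs and their derivatives, a planner only needs to optimize over the flat-output trajectory, which is exactly the dimensionality reduction exploited in [53], [73].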

Although Sreenath et al. discuss hybrid system dynamics for slack cables, they do not
present results that demonstrate trajectories containing cables that go slack. Similarly, they
present multiple formulations of possible differentially flat outputs, but only present the
results for one case [53]. Dhiman et al. apply differential flatness to the load directly
[77]. Here, a virtual cone structure is used
in a formation control approach to generate the quadrotor trajectories required to achieve
the desired load trajectory.

Advances in machine learning (ML) have resulted in many papers discussing machine
learning applied to robotics problems. Reinforcement learning (RL) is one of the major
categories of ML (alongside supervised and unsupervised learning) that is particularly useful
in trajectory planning and control. It is often used instead of optimal control methods and
has been shown to outperform these methods in some cases [78].
The application of RL to trajectory planning in UAV-slung-load carrying is relatively new.
It was first proposed to generate a swing-free trajectory from a sample-based motion plan in
[74]. Simulation results show that in a single-drone slung load system, a load displacement
of under 10° can be achieved when following an RL-generated trajectory. Although this
RL approach continues to be studied [75], no satisfying real world results have yet been
presented in this context.

2.2.4 Dynamics and control


The boundary between trajectory generation and control is blurred; often methods (such
as model predictive control) will solve both in a single step. Both generally require an
understanding of the system dynamics, which is usually expressed with a mathematical
model. The three main methods of obtaining a mathematical model are:

• Physics-based (model-based) approaches. These rely on building a model up from
known physical laws such as Newton’s F = ma. These approaches are often easier
to understand intuitively, but may make limiting simplifying assumptions which are
difficult to expand beyond when generalizing.

• Data-based approaches. These rely on fitting empirical data with a generalized
function approximator (e.g., Taylor series, Fourier series or neural network). Such
approaches can capture complex relationships that are difficult to model from first
principles. Function approximators that make fewer assumptions about the data’s
structure (e.g., neural networks) require large quantities of quality data to tune which
can often be difficult to obtain.

• Hybrid approaches. These methods attempt to use physics-based modelling as much
as possible, and use data-based approaches to capture errors resulting from limiting
assumptions. They can also be used to model the part of the Equation that would usu-
ally require assumptions to be made, thus preventing the need for some assumptions.
Chapter 5 discusses one such hybrid method.
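The hybrid idea can be shown with a toy example (entirely illustrative; the pendulum prior, the drag form and all constants are invented for the demonstration, and Chapter 5’s method is different). A frictionless pendulum serves as the physics prior, and a simple data-based term is fit to the residual between measured and predicted accelerations:

```python
import numpy as np

# Physics prior: frictionless simple-pendulum angular acceleration
def physics_model(theta, l=1.0, g=9.81):
    return -(g / l) * np.sin(theta)

# Synthetic "measurements" from a system with unmodelled viscous drag
rng = np.random.default_rng(1)
theta = rng.uniform(-0.5, 0.5, 200)
theta_dot = rng.uniform(-1.0, 1.0, 200)
c_true = 0.3  # drag coefficient the physics model does not know about
measured = physics_model(theta) - c_true * theta_dot

# Data-based correction: least-squares fit of the residual on theta_dot
residual = measured - physics_model(theta)
c_fit = np.linalg.lstsq(theta_dot[:, None], residual, rcond=None)[0][0]
print(-c_fit)  # recovers the unmodelled drag coefficient, ≈ 0.3
```

Because the physics prior already explains most of the signal, the data-based part only has to fit a small residual, which is why hybrid models typically need far less data than a purely data-based function approximator.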

Physics-based methods can very successfully model and control a single quadrotor’s dy-
namics. This is usually done in a cascaded fashion, where the first controller module deals
with position tracking, while the second module deals with orientation tracking (see Section
3.3.2). Earlier methods achieve this cascaded control by controlling roll, pitch and yaw (RPY
angles) separately in near-hovering linearizations [66].
The most commonly accepted modern method for quadcopter control is geometric control
[52]. Geometric control utilizes error functions defined directly on the SO(3) manifold (see
Lie theory for an explanation of manifolds [79], [80]). This avoids complexities such as
singularities which arise when working in the tangent space (such as done when operating

with RPY angles), and avoids quaternion ambiguities which arise when using quaternions
for attitude tracking. Geometric control exhibits almost-global exponential attractiveness to
zero tracking error [52].
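The attitude error at the core of geometric control can be sketched directly (this is the standard SO(3) error function of [52]; the example rotation below is illustrative):

```python
import numpy as np

def vee(S):
    """Inverse hat map: skew-symmetric matrix -> vector."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def attitude_error(R, R_d):
    """Geometric-control attitude error on SO(3):
    e_R = 0.5 * vee(R_d^T R - R^T R_d)."""
    return 0.5 * vee(R_d.T @ R - R.T @ R_d)

# Example: attitude off from the target by a 0.1 rad rotation about z
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
e = attitude_error(R, np.eye(3))  # ≈ [0, 0, sin(0.1)]
```

Because the error is computed from the rotation matrices themselves, no Euler-angle extraction is ever needed, which is how the singularities and quaternion double cover are avoided.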
Physics-based methods generally take one of three approaches to deal with additional
slung loads [72]:

• Ignore the load and treat cable tensions as disturbances. This approach is usually
used in conjunction with control methods that do not require an understanding of
a system’s dynamical Equations (such as PID control). In this case, the individual
drones’ flight controllers in a (multi) drone-slung-load system simply correct for cable
tension forces as though they are disturbances to formation control (see Figure 2.2).
This method is the easiest to implement if a drone’s controller is robust enough to
handle the ‘cable disturbances’.

• Treat the additional mass as uncertainty. This approach is used when a math-
ematical model of the system is required for control, and contains drone mass as a
parameter. This is usually used when applying adaptive control [36], [46], [68], adap-
tive robust control [58], and passivity-based control [60], [61]. Although these methods
show stability under a range of conditions, the lack of a complete explicit model for
the load makes it harder to optimize for a minimum swing trajectory.

• Adapt the underlying mathematical model. This approach is addressed by
Sreenath et al. [53], [73]. As mentioned earlier, [53] shows that their trajectory
generation method that considers payload dynamics results in a load tracking error 300-400%
lower than that generated by the quasi-static models that preceded it [50], [51]. They
note that tracking error can be reduced even further using feedforward moments
calculated from the known dynamics. Neither of these improvements would be possible
without knowledge of the system dynamics model.

A major problem with adapting the underlying mathematical model is the complexity
required to capture general slung load dynamics. These include the case of zero cable tension
(which [53] captures with hybrid system switching dynamics), non-rigid-body masses, cables
with non-negligible mass, elastic cables, loads and cables where aerodynamic effects are non-
negligible and cables not attached at a drone’s center of mass. Ignoring cable dynamics (by
assuming a massless cable) is particularly common due to the large variety and complexity of
cable modelling techniques [81]. This simplification quickly becomes limiting when working
with larger and heavier loads that require longer and stronger cables.
To address some of the difficulties in mathematical modelling, reinforcement learning has
been applied in recent years [74], [75], [82], [83]. Although the results show some promise ([75]
still notes significant instability under particular conditions), reinforcement learning poses
its own suite of challenges, such as gathering a sufficient quantity of good quality training
data, designing reward functions and tuning training parameters. Further, such approaches
completely disregard the earlier modelling work proven to be successful for a range of cases.
One approach to fuse the advantages of ML and physics-based modelling methods is
to use physics-informed neural networks (PINNs). PINNs are simply neural networks used
to approximate the solution to a partial differential Equation (PDE). They are
‘physics-informed’ because part of the loss function used in training the PINN is defined by the
original PDE. In this way, differences from the original PDE’s prediction can be penalized,
resulting in the PINN’s solution being driven towards capturing the PDE’s output during
training. Serrano et al. [84] use a PINN in a single-quadrotor slung load system to predict
the quadrotor’s pose, quadrotor’s linear velocity, and the load’s position. The intention
is to expand a discretized dynamical model to also capture unmodelled disturbances such
as aerodynamic drag and downwash on the slung load from the drones. They show that
their PINN achieves a root mean squared error (RMSE) of 0.508 combined (vehicle pose,
velocity and payload position) for a test set of 4,252 real world data points, each over a 3 s
time horizon. Although this is better than the results presented for the uninformed neural
network (0.561), and significantly better than the pure physics model (1.879) [84], it still
represents a payload position error of 0.166 m, does not give a load orientation estimate,
and is only presented over a short 3 s time horizon. Further, it relies on knowledge of the
load’s pose, and provides no alternatives to motion capture to obtain this information (only
available in dedicated indoor arenas).
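The loss structure behind a PINN can be illustrated with a deliberately tiny example (free fall standing in for the slung-load dynamics, a one-parameter candidate function standing in for the network, and finite differences in place of automatic differentiation; none of this is Serrano et al.'s implementation):

```python
import math

G = 9.81

# Ground-truth free-fall samples playing the role of flight data.
ts = [0.1 * k for k in range(11)]
xs = [10.0 - 0.5 * G * t * t for t in ts]

def candidate(t, g_hat):
    # Parameterized "network output": a closed form with one parameter
    # standing in for the neural network's weights.
    return 10.0 - 0.5 * g_hat * t * t

def pinn_style_loss(g_hat, lam=1.0, h=1e-3):
    # Data term: mean squared error against measured states.
    data = sum((candidate(t, g_hat) - x) ** 2 for t, x in zip(ts, xs)) / len(ts)
    # Physics term: penalize the governing equation's residual (xdd + g),
    # with xdd approximated by central finite differences.
    phys = 0.0
    for t in ts:
        xdd = (candidate(t + h, g_hat) - 2 * candidate(t, g_hat)
               + candidate(t - h, g_hat)) / (h * h)
        phys += (xdd + G) ** 2
    return data + lam * phys / len(ts)

# The loss is minimized when the candidate matches both data and physics.
assert pinn_style_loss(G) < pinn_style_loss(0.8 * G)
```

Training a real PINN minimizes exactly this kind of composite objective over the network's weights; the physics term is what drives the solution toward the PDE even where data is sparse.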

2.3 Key contributions to literature


This work aims to be the first to use visual feedback to achieve a load pose estimate suitable
for a multi-drone slung load system operating outdoors. The key contributions to existing
literature are categorized by the objectives described in Section 1.2:

• Build a platform: few multi-drone slung load simulation platforms are available
open-source. RotorTM is one platform that is available and, like other such platforms, is
limited to exploring the multi-drone slung load problem in isolation [56]. Chapter
3 describes the design of an open-source platform built on modern, open-source tools
(ROS2 and PX4) that is easily extensible to general robotics scenarios. Even fewer
works present experimental tests, and those that do mostly do so indoors with a motion
capture system [53], [73]. Of the experimental results presented outdoors, some systems
are built on outdated custom platforms that are time-consuming to rebuild [36], [69],
and/or do not release their code open-source [36], [57], [69]. The system described
in Chapter 3 functions well outdoors with no motion capture system, and is easy to
implement in the real world.

• Get an estimate of load pose: existing literature discusses non-visual [48], [67] and
visual [69]–[72] approaches to slung load pose estimation in single-drone slung load
systems. To the author’s knowledge, no available literature covers pose estimation
of a slung load in a multi-drone slung load system. Chapter 4 discusses a method of
visual pose estimation with a novel Kalman Filter formulation to address this literature
deficit.

• Build a model of cable dynamics: some attempts have been made at capturing
dynamics that are difficult to model with physics-based methods. Most of the literature
in this area comprises RL-based methods that completely disregard previous
successes in physics-based methods [74], [75], [82], [83]. The one available approach
that attempts to fuse ML and physics-based approaches uses a PINN in a single-drone
slung load system [84]. Chapter 5 discusses the use of a universal differential equation
(UDE) to address this problem; an entirely novel approach to dynamics modelling in
a multi-drone slung load system.

Chapter 3

System Architecture

Although the field of robotics has received much research attention in the past few decades,
the continued lack of open-source tools renders testing new approaches difficult. As a result,
a common pattern robotics research follows is to spend well over 50 % of the allocated
project time building an (often novel) testing platform, and the small remainder of the time
testing new methods applicable to that platform. In order to resolve this issue, it is essential
to recognise that the lack of available tools is a large hurdle holding back the field of robotics.
This will continue to be true for many years given the wide range of applications robotics is
attempting to address, and the variety of technologies that the field depends on.
This Chapter discusses the development of a novel, open-source multi-drone slung load
robotic ‘system’ to help address the lack of robotics platforms available to researchers.
Currently, few platforms that address this problem are discussed in literature. Those that are
discussed either focus on the problem in isolation [56] (i.e. do not provide a simulator that can
function in complex multi-obstacle environments), run indoors with a motion capture system
[53], [73], are built on outdated custom platforms that are time-consuming to rebuild [36],
[69], and/or do not release their code open-source [36], [53], [57], [69], [73]. Further, most
of the systems available in literature focus on the multi-drone slung load controls problem.
Many of these advanced controls solutions require load pose feedback in order to function,
which is often provided by a motion capture system (see Chapter 2). In most cases, to make
a multi-drone slung load system useful, it must be flown independently of a motion capture
system; such systems are expensive, difficult to set up and only available in specific controlled
indoor environments. The system discussed in this Chapter is specifically designed to test
novel load pose estimation methods (discussed in Chapter 4) and thus allow the lack of such
methods to be rectified. The key contributions of this Chapter to existing literature are
therefore:

• An open-source system, proven to be functional in simulation and the real world,
specifically for testing multi-drone slung load pose estimation methods.

• An open-source design for a novel drone position control testing apparatus with
minimum inertia.

This Chapter’s discussion begins with a presentation of the governing system dynamics
Equations in Section 3.1. The higher-level system architecture is then presented in Section

3.2, followed by a description of the software architecture in Section 3.3. Various fidelity
simulators are then discussed (Section 3.4), before finally presenting the full real world
testing platform in Section 3.5.

3.1 Dynamics modelling


Many advanced estimation, planning, and control methods require a dynamics model of the
plant (system to be controlled) in order to function. A complete dynamics model of the
n-drone (where n is the number of drones) rigid-body slung load system can easily be
developed using the Newton-Euler method [53]. Using this method, the system of Equations:

m_i \ddot{x}_i = f_i R_i \hat{e}_3 - m_i g \hat{e}_3 + R_L t_i q_i,    (3.1)

J_i \ddot{\theta}_i + \dot{\theta}_i \times J_i \dot{\theta}_i = \tau_i,    (3.2)

m_L \ddot{x}_L = -\sum_{i=1}^{n} R_L t_i q_i - m_L g \hat{e}_3,    (3.3)

J_L \ddot{\theta}_L + \dot{\theta}_L \times J_L \dot{\theta}_L = \sum_{i=1}^{n} r_i \times (-t_i q_i),    (3.4)

is derived for the case when planar quadcopters are used (see Table 3.1 or the Nomenclature
Section for the symbol definitions). Equations 3.1 and 3.2 describe the dynamics of drone
i (for i ∈ {1, . . . , n}). Similarly, Equations 3.3 and 3.4 describe the load
dynamics. Equations 3.1 and 3.3 simply represent the sum of forces acting on the drone and
load respectively, whilst Equations 3.2 and 3.4 represent the sum of moments on the same.
Figure 3.1 illustrates an example of a three-quadcopter slung load plant, which is modelled
by this system of Equations when n = 3.
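A quick numeric sanity check of Equations 3.1 and 3.3 at hover can be hand-rolled (the masses and cable cone angle below are illustrative, and each drone is assumed level so its thrust is vertical): the vertical cable-tension components must sum to the load's weight, while the horizontal components cancel by symmetry.

```python
import math

G = 9.81                  # gravity (m/s^2)
M_LOAD = 1.5              # load mass (kg) -- illustrative value
M_DRONE = 1.2             # drone mass (kg) -- illustrative value
PHI = math.radians(30.0)  # cable angle from vertical in the symmetric cone

# At hover all accelerations vanish, so Equation 3.3 reduces to the cable
# tension vectors balancing the load's weight. By symmetry, each of the
# three cables carries an equal share of the vertical load.
t_i = M_LOAD * G / (3.0 * math.cos(PHI))

# Horizontal tension components cancel around the circle (drones at
# 0, 120 and 240 degrees), leaving no net lateral force on the load.
horiz = sum(t_i * math.sin(PHI) * math.cos(2.0 * math.pi * k / 3.0)
            for k in range(3))

# Equation 3.1 at hover (level drone, thrust vertical): thrust balances the
# drone's weight plus the downward pull of its cable.
f_i = M_DRONE * G + t_i * math.cos(PHI)

print(f"tension {t_i:.3f} N, thrust {f_i:.3f} N, net lateral {horiz:.1e} N")
```

Note that a steeper cone (larger PHI) raises the tension for the same vertical load share, which is one reason the virtual-structure geometry matters in practice.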
Although this work primarily uses quadcopters, the methods discussed are largely inde-
pendent of the vehicle type. Any vehicle with hovering capability could be substituted with
minor modifications to Equations 3.1 and 3.2.
Several assumptions (also discussed in [53]) are made in deriving this system. These
include:

1. the cables are massless;

2. the cables are constant length;

3. the cables are attached to the quadrotors at their centres of mass;

4. aerodynamic drag acting on the quadrotors, load, and cables, is negligible; and

5. the cables are always taut.

The assumption that presents the most modelling difficulties is assumption 1, known as
the ‘massless cable’ assumption. This assumption is discussed in detail in Chapter 5, along
with a novel approach to address it.

mL ∈ R Mass of the slung load.
JL ∈ R3×3 Inertia matrix of the slung load with respect to the load’s body-fixed
frame.
RL ∈ SO(3) The rotation matrix of the load from the load’s body-fixed frame to the
inertial frame.
θL ∈ R3 Orientation of the load as roll-pitch-yaw (RPY) angles in the inertial
frame. This is RL converted to RPY Euler angles.
xL ∈ R3 Position vector of the center of mass of the load in the inertial frame.
mi ∈ R Mass of the ith drone.
Ji ∈ R3×3 Inertia matrix of the ith drone with respect to that drone’s body-fixed
frame.
Ri ∈ SO(3) The rotation matrix of the ith drone from that drone’s body-fixed frame
to the inertial frame.
θi ∈ R3 Orientation of the ith drone as roll-pitch-yaw (RPY) angles in the inertial
frame. This is Ri converted to RPY Euler angles.
xi ∈ R3 Position vector of the center of mass of the ith drone in the inertial frame.
τi ∈ R3 Moment produced by the ith drone.
fi ∈ R Thrust produced by the ith drone.
ψi ∈ R Yaw angle of the ith drone.
qi ∈ S2 Unit vector from the ith drone to its attachment point on the load in the
body-fixed frame of the load.
ti ∈ R Tension in the cable between the ith drone and the load.
li ∈ R Length of the cable between the ith drone and the load.
ri ∈ R3 Vector from the center of mass of the load to the attachment point of the
ith drone to the load.
ê1 , ê2 , ê3 ∈ R3 Standard unit vectors along x, y, z axes in the inertial frame.

Table 3.1: Symbols used in modelling a multi-quadcopter slung load system’s dynamics.
Note that scalars are represented with Roman style letters such as ‘x’, vectors use bold
lowercase letters like ‘x’, and matrices use bold uppercase letters like ‘X’.

Figure 3.1: Three quadrotors carrying a rigid body slung load. Figure from Sreenath and
Kumar [53].

Assumption 2 can be addressed by employing system identification and adaptive control
techniques discussed in Section 2.2.2. Assumption 3 can be addressed by adding a term
to Equation 3.2 that captures the moment effects of an offset cable (like in Equation 3.4).
Similarly, assumption 4 can be addressed by adding terms that describe aerodynamic drag
on each of the non-negligible components. For more aggressive flight trajectories, the case in
which cables become slack (assumption 5) can be handled with a hybrid discrete-continuous
dynamics model as discussed in [53]. Each of these extensions complicates the dynamics
model. They should be used in cases where the assumptions are no longer valid; as the
assumptions hold for the case being studied, we mostly use Equations 3.1 - 3.4 as they are
presented here.
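For instance, relaxing assumption 4 for the drones alone would add a drag term to Equation 3.1. Under a common quadratic-drag model (with C_{d,i} a drag coefficient that is not part of the model above, introduced here purely for illustration), Equation 3.1 would become:

```latex
m_i \ddot{x}_i = f_i R_i \hat{e}_3 - m_i g \hat{e}_3 + R_L t_i q_i - C_{d,i} \lVert \dot{x}_i \rVert \dot{x}_i .
```

Each such term introduces parameters that must be identified, which is part of the complexity cost noted above.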

3.2 Higher-level system architecture


Building a robotic system that is governed by the dynamics Equations in Section 3.1 is far
more complex than simply formulating the dynamics Equations. It involves developing a
full software stack to control and coordinate multi-drone systems (Section 3.3); building a
representative simulator to reduce development time and cost (Section 3.4); and building
a physical hardware platform to test on (Section 3.5). This Section provides a higher-level
overview of these components before the following Sections decompose each in greater detail.
Figure 3.2 provides an overview of the key subsystems involved in running the multi-drone
slung load platform. These subsystems are:

• Drones: the key load lifting elements. On each drone, PX4 autopilot runs on the
onboard Pixhawk flight controller (FC), while the ‘drone’, ‘visual measurement’ and
‘cable dynamics estimation’ software modules run on onboard computers (see Section
3.3).

• Load: the element to be lifted. Figure 3.2 shows a computing device on the load run-
ning the ‘load’ and ‘Kalman filter’ software modules. If no computing device is available
on the load, these modules can instead be run on the drones’ onboard computers.
• Ethernet cables: set up a hardwired local area network (LAN) between the drones
and load. This reduces latency and drop-out issues for safety-critical components seen
over the Wi-Fi network.
• Ground station laptop: allows a pilot to send commands to the system and receive
live telemetry information through an instance of the QGroundControl software. Runs
the ‘ground control station (GCS)’ and ‘GCS user interaction nodes’. These can instead
be run directly on the drones and interacted with through a remote controller (RC)
to improve reliability. Live debugging information is obtained by running a script to
SSH into the drones and load over the Wi-Fi LAN.
• Router: sets up a Wi-Fi LAN useful for debugging and automatically pulling results
off the drone system.
• RTK GNSS base station: obtains real time kinematic (RTK) GNSS correction data
to provide more accurate GNSS data (see Section 3.5.1).
• Telemetry radios: connect the GCS to the drones’ Pixhawk FCs directly to a) send
RTK correction data, and b) receive a telemetry stream back from the drones. Four
radios are shown as the load can have a Pixhawk FC mounted directly to it to provide
ground truth positioning (see Section 3.5.1).
Figure 3.2 shows three drones, but the system can easily scale to an arbitrary number
N , where N ≥ 3. A minimum of three drones is used as this provides enough points to
fully constrain the load’s pose (cables are attached at different points as seen in Section 3.1).
The system is hardware agnostic so any UAV capable of sustained hovering can be used
(including heterogeneous combinations).

3.3 Software architecture


This Section presents a higher-level overview of the full robotic software stack. It starts with
a discussion of the back-seat driver architecture (Section 3.3.1) core to the software stack.
Discussion then briefly touches on the lower-level control strategy (Section 3.3.2), before
delving into the higher-level autonomy stack in more detail (Section 3.3.3).
The code written to implement each module is linked in the corresponding Section of this
thesis. For ease of reference, all GitHub repository links (which contain detailed installation
instructions) are compiled in Appendix A. The key design features of this code base are as
follows:
• Hierarchical: lower-level control is performed on a real-time FC while higher-level
computations (including agent coordination) are performed on an onboard computer
(see Section 3.3.1). This allows significantly easier development and hardware-agnostic
autonomy.

Figure 3.2: Higher-level system architecture showing key hardware components and software
modules (indicated by boxes with colored borders). The hardware components are discussed
in detail in Section 3.5.1, while the software components are discussed in Section 3.3. The
box outline colors correspond to software module classifications discussed in Section 3.3.

• Modular: code is split into a variety of independent modules (see Section 3.3.3).
This facilitates easy decentralization and reconfiguration. Code is also designed in an
object-oriented fashion, maximizing encapsulation and code reusability.
• Easily reconfigurable: launch files and scripts are written to provide a simple and
flexible way to select module combinations. Configuration files provide a single source
of reconfiguration and facilitate the transition from simulation to the real world.
• Safe: robotics systems are inherently asynchronous and nondeterministic. A higher-
level Moore finite state machine (FSM) is used to isolate nondeterminism in discrete
steps, facilitate debugging and allow safety checks.
• Semi-decentralized: computations are performed onboard each vehicle as much as
possible to reduce dependencies between vehicles (extremely useful in network-limited
environments). Some centralization is kept to enable ease of user interaction and assist
debugging.
These design features will become apparent when browsing the code base, and in the
following Sections.
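The safety role of the Moore FSM can be pictured with a minimal sketch (the state names, events and outputs below are illustrative, not the actual ones used in this code base):

```python
# Minimal Moore FSM: outputs depend only on the current state, and every
# event-driven transition is an explicit, auditable entry in a table.
TRANSITIONS = {
    ("IDLE", "arm_cmd"): "ARMED",
    ("ARMED", "takeoff_cmd"): "TAKEOFF",
    ("TAKEOFF", "at_altitude"): "MISSION",
    ("MISSION", "land_cmd"): "LAND",
    ("LAND", "on_ground"): "IDLE",
    # Safety: a failure event from any airborne state forces a landing.
    ("TAKEOFF", "failsafe"): "LAND",
    ("MISSION", "failsafe"): "LAND",
}

OUTPUTS = {  # Moore outputs: one setpoint source per state.
    "IDLE": "motors_off",
    "ARMED": "hold_position",
    "TAKEOFF": "climb_setpoint",
    "MISSION": "formation_setpoint",
    "LAND": "descend_setpoint",
}

def step(state, event):
    # Unknown events are ignored: nondeterministic inputs cannot produce
    # undefined behavior, only a stay-in-state.
    return TRANSITIONS.get((state, event), state)

state = "IDLE"
for event in ["arm_cmd", "takeoff_cmd", "failsafe", "on_ground"]:
    state = step(state, event)
print(state, OUTPUTS[state])  # -> IDLE motors_off
```

Because outputs depend only on the current state, the setpoint source active at any instant can be audited from the state alone, independently of event timing.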

3.3.1 Back-seat driver architecture


The philosophy behind the back-seat driver architecture is simple: let a front-seat controller
perform lower-level, vehicle-specific controls, while a back-seat computer performs higher-
level autonomy functions. This separates vehicle control (often vehicle-specific, real-time
and safety critical) from higher-level autonomy, thus allowing autonomy be implemented in
a vehicle-agnostic manor.

In UAVs, lower-level flight control is often performed by PX4 autopilot; an open-source
code base for autonomous vehicle control (see Section 3.3.2) [66]. In the hardware system,
this front-seat code is run on a dedicated Pixhawk FC board (discussed more in Section
3.5.1), while back-seat code is run on an onboard companion computer. This is referred to
as ‘offboard mode’ in the PX4 ecosystem, and is visualized in Figure 3.3.

Figure 3.3: PX4’s offboard mode system architecture. Lower level control is handled by
PX4 autopilot running on a Pixhawk flight controller (Holybro’s Pixhawk 6X here), while
higher-level functionality runs on the companion computer (Raspberry Pi 5 here). This is
an example of the back-seat driver architecture. Figure adapted from PX4 documentation
[85].

The PX4 autopilot runs on its own internal publish-subscribe messaging system called
uORB. Newer versions of PX4 allow robot operating system (ROS2) nodes (described in
Section 3.3.3) to publish/subscribe to these messages directly thanks to the data distribution
service (DDS) (summarized in Figure 3.4). DDS is a messaging system upon which ROS2’s
messaging is built. Although it is possible to use MAVLink (a lightweight communication
protocol designed for UAVs) to perform this communication, using DDS allows lower-latency
and lower-level access to PX4’s internal flight control. This is desirable to enable greater
customizability and thus the direct ROS2-uORB connection is used here in preference over
MAVLink.

Figure 3.4: How the ‘extremely resource constrained environment (XRCE)’ distribution
of micro DDS connects PX4’s uORB and ROS2. This allows ROS2 nodes running on a
companion computer (Raspberry Pi 5) to communicate with PX4 (on the Pixhawk 6X)
directly. Figure adapted from PX4 documentation [85].

3.3.2 Controls
As previously mentioned, lower-level drone flight control is often performed by PX4 autopi-
lot. In the multi-drone slung load system discussed here, each drone has its own Pixhawk
FC with an instance of a slightly modified PX4 autopilot (https://github.com/hmer101/
PX4-Autopilot) running. The functionality of PX4 is extensive, including key components
such as sensor and actuator communication, an extended Kalman filter (EKF) state
estimator, various controllers for different vehicle architectures, and safety-critical functions (e.g.,
low battery actions and kill-switch interruption). Thanks to the back-seat driver architecture
discussed in Section 3.3.1, we can use the majority of these functions without considering
PX4’s implementation details (which are well documented in [85]).
One module particularly relevant to this project is the multicopter control system, pic-
tured in Figure 3.5. This module takes desired position Xsp and yaw ψsp setpoints and
converts them to a thrust setpoint Tsp for each of the multicopter’s rotors (control actions).
Figure 3.5 shows that this controller follows the ‘cascaded PID’ architecture. In this
example of cascaded PID, the position setpoint is first converted to a velocity setpoint Vsp using
a P position controller. This velocity setpoint acts as an input to the following PID velocity
controller, which outputs an acceleration setpoint in the inertial frame Asp . The acceleration
and yaw setpoints are then converted to an orientation setpoint as a quaternion (qsp ). This
orientation setpoint is passed through cascaded P angle, and PID angular rate controllers
in the body frame (producing an intermediate angular rate setpoint Ωsp ), which outputs
roll, pitch and yaw control actions (δAsp , δEsp , δRsp respectively) [86]. These orientation
control actions, combined with the scalar thrust control action from position control (Tsp ),
are combined in a configuration-specific mixer to give the desired thrust setpoint. Each of
the submodule controllers operates in a closed loop given current state estimates from PX4’s
EKF state estimator.
As the main focus of this work is not controls, this control architecture is left untouched.
Instead, the higher-level back-seat code discussed in Section 3.3.3 is used to generate position

Figure 3.5: PX4’s multicopter control system architecture. Figure from PX4 documentation
[85].

and yaw setpoints for this controller to follow. How these setpoints are decided is the subject
of Section 3.3.4. This way, coordinated load lifting is achieved without the need to develop a
custom controller. Section 3.5 shows that this cascaded-PID controller provides satisfactory
results in the real world, suitable for testing load pose estimation algorithms.
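A stripped-down, single-axis analogue of this cascade can clarify the structure (a pure-Python toy: the gains and double-integrator plant are illustrative, and PX4's real controllers add limits, feed-forwards and filtering not shown here):

```python
# Toy single-axis analogue of the cascaded position/velocity loops:
# an outer P position controller feeds a PID velocity controller, which
# commands acceleration on a double-integrator plant.
KP_POS = 1.0                           # illustrative gains, tuned for this toy
KP_VEL, KI_VEL, KD_VEL = 2.0, 2.0, 0.0

def simulate(x_sp, t_end=10.0, dt=0.01):
    x, v, integ, prev_err = 0.0, 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        v_sp = KP_POS * (x_sp - x)                   # outer P position loop
        err = v_sp - v
        integ += err * dt
        a_cmd = (KP_VEL * err + KI_VEL * integ
                 + KD_VEL * (err - prev_err) / dt)   # inner PID velocity loop
        prev_err = err
        v += a_cmd * dt                              # double-integrator plant
        x += v * dt
    return x

print(round(simulate(5.0), 3))  # settles near the 5.0 m setpoint
```

The same pattern repeats on the attitude side (P angle loop feeding a PID angular-rate loop), which is why the full architecture in Figure 3.5 nests four loops.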

3.3.3 Companion computer code


To maintain modularity and reduce unnecessary FC firmware rebuilds, all custom code is
designed to be run on an onboard (companion) computer. The core of this system is built on
the ROS2 middleware. This middleware simply allows multiple independent, asynchronous,
and distributed processes (ROS2 nodes) to communicate through a publish-subscribe frame-
work. Thanks to ROS2, nodes can be run on any device in a network and, given the network
is set up correctly (see Section 3.5.1), communicate as though they are on the same device.
Figure 3.6 summarizes the key ROS2 nodes running in the multi-drone slung load system.
It is tailored to the simulation case, but requires minimal adaptation to move to the real
world.
The purpose of each node is as follows:
Simulation
• Dynamics simulator: provides realistic physics simulation to allow quick code tests
to be performed independently of real hardware. Section 3.4 discusses the simulators
used for this project. In the real world, this block is replaced by real hardware (Section
3.5.1) and real life physics.
• PX4 autopilot: an open source flight control software for drones and other unmanned
vehicles [66]. Provides the lower-level flight control (discussed in Section 3.3.2) and
safety features required to fly a quadcopter stably. In simulation, this component
runs ‘software in the loop’ (SITL), where flight controller code runs on the computing
device that is also running the simulation (see Section 3.4.3). In the real world, this
component runs on an embedded flight controller called a Pixhawk.

Figure 3.6: Modules in the full system architecture linked together by ROS2. Red indicates
custom modules written in python, green indicates C++, blue indicates Julia while grey
boxes indicate non-custom systems. Modules below the dotted line have multiple instances
(one for each drone) which are designed to run onboard each drone in the real world system.

Offboard control

• Ground control station (GCS): sends load waypoint references to the drone-load
system to follow.

• GCS user interaction: allows the user to interact with the drone-load system (e.g.,
send a command to take off) from a GCS.

• Load: manages the storage and updating of information regarding the load’s pose.

• Drone: manages the storage and updating of information regarding each drone’s pose.
Also manages the transition of each drone between flight phases essential for safe
operation.

Estimation

• Cable dynamics estimation: a method for estimating the dynamics of flexible cables
using universal differential equations. Discussed in detail in Chapter 5.

• Estimation - Kalman filter: runs a custom Kalman filter to estimate the pose of
the slung load. Discussed in detail in Chapter 4.

• Visual measurement - PnP: runs the perspective-n-point (PnP) algorithm to visually
measure the pose of the slung load for the Kalman filter. Discussed in detail in Chapter 4.

When the system is run in simulation, all of the nodes run on a single computing device.
It is possible to distribute the computations across multiple devices in a network if necessary.
However, the full system is capable of running well on a Dell XPS 15 7590 laptop with an
Intel(R) Core(TM) i9-9980HK CPU and 32 GB of RAM. When the system is run in the real
world however, the nodes are distributed across computing devices as shown in Section 3.2.

3.3.4 Multi-agent coordination


According to the multi-agent system categories defined at the beginning of Chapter 2, the
multi-drone slung load system studied here is a ‘physically coupled system’. However, for
ease of coordination, we can assume that the drones are not physically connected to anything
and simply command the drones to fly in a formation (a formation-based method ). If we
assume that the load remains reasonably stable (quasi-static) in the middle of the formation,
we are employing a quasi-static formation-based coordination method. Again, as the focus of
this work is not controls, this is perfectly sufficient and provides a good platform for testing
load pose estimation algorithms.
Two key components are required to successfully fly in a formation:

• a method for each vehicle to follow desired setpoints (see the controller described
in Section 3.3.2), and

• a method to coordinate the setpoints to make some formation.

Setpoint coordination is performed by the ‘offboard control’ software modules described
in Section 3.3.3, and found on GitHub here: https://github.com/hmer101/multi_drone_
slung_load. To get setpoints, a desired (subscript d) load pose in the inertial world frame
W is set (T^W_{Ld}). Further, a desired pose of each drone i relative to the load’s desired
pose, T^{Ld}_{id}, is set, defining a virtual structure (see Figure 3.7). Each drone node then
performs the coordinate transform:

T^W_{id} = T^W_{Ld} T^{Ld}_{id},    (3.5)

to get the desired pose of drone i in the world frame (T^W_{id}). This transformation process
can be visualized in the 3-drone case (i ∈ {1..N } where N = 3) by comparing Figures 3.8a
and 3.7. Figure 3.8a shows that the drone poses are initially defined in the load frame to
form the virtual structure, while Figure 3.7 shows the drone poses transformed to the world
coordinate system as they must be to use the controller described in Section 3.3.2.
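Equation 3.5 can be made concrete with 4×4 homogeneous transforms (a plain-Python sketch with illustrative poses; the actual implementation delegates this to ROS2's TF machinery):

```python
import math

def make_T(yaw, x, y, z):
    # Homogeneous transform: rotation about z plus a translation.
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s,  c, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def compose(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Desired load pose in the world frame: 2 m along x, 1 m up, facing +90 deg.
T_W_Ld = make_T(math.pi / 2, 2.0, 0.0, 1.0)

# Drone 1's desired pose relative to the load (part of the virtual structure):
# 1 m out along the load's x-axis, 1.5 m above it, facing back toward the load.
T_Ld_1d = make_T(math.pi, 1.0, 0.0, 1.5)

# Equation 3.5: desired pose of drone 1 in the world frame.
T_W_1d = compose(T_W_Ld, T_Ld_1d)

print([round(v, 6) for v in (T_W_1d[0][3], T_W_1d[1][3], T_W_1d[2][3])])
# -> [2.0, 1.0, 2.5]
```

Moving the whole formation therefore only requires updating the single load pose T^W_{Ld}; the per-drone offsets defining the virtual structure stay fixed.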
The implementation of this coordination process requires many more intermediate frames.
The full ROS2 TF tree can be seen in Appendix B, and visualized in this video: https:
//youtu.be/L-jx-SenBfA.

3.4 Simulation
Simulation, in the context of the slung load system, is the act of replicating real world
phenomena computationally. Different simulators simulate different phenomena, provide

Figure 3.7: Arrangement of a 3-drone system in a virtual cone structure around the load.
The locations of the drone body-fixed coordinate systems (T^W_1 – T^W_3) and the load coordinate
system (T^W_L) are ultimately defined relative to the fixed (inertial) world coordinate system
(W ). Note that all drone coordinate systems have their x-axes fixed towards the center of
the virtual cone’s base.

different levels of realism, use varying amounts of computational resource, and are tailored
for different purposes. One way of classifying simulators is by real-life fidelity (how close
simulation results are to real world observations). Lower fidelity simulators generally require
much fewer computational resources, and so are often used to provide quick estimates of
real world behavior, or to simulate one specific component of a system. Conversely, higher-
fidelity simulators provide more accurate estimates of real world behavior in one or more
aspects. One common dimension of higher fidelity is more accurate dynamics simulation,
while another is higher photorealism.
Three simulators were developed for this project: a low-fidelity dynamics simulator in
Julia, a moderate-fidelity dynamics and controls simulator in Drake, and a high-fidelity visual
pose estimation simulator in Gazebo. These simulators, and some results produced by them,
are discussed in Sections 3.4.1 - 3.4.3 below.

3.4.1 Low-fidelity custom Julia simulator


One of the lowest fidelity dynamics simulators possible is one that generates a dynamically
feasible trajectory for all of the components in the system. In the context of the slung load
system, the data is generated using a prescribed load trajectory along with the property
of differential flatness discussed by Sreenath and Kumar [53] (mentioned in Section 2.2.3).
Using these, and the system of dynamics Equations 3.1 - 3.4, the corresponding drone tra-
jectories can be solved. This gives a full trajectory for the drones required to produce the

(a) A 3-drone system arrangement on a virtual (b) An n-drone system arrangement on a virtual
circle. circle.

Figure 3.8: A top-down view of drones arranged in a virtual cone around a load. In this
case, the drones are the same height above the load, so the virtual structure becomes a circle.
Drones are evenly spaced around the circumference of the virtual circle. Note that the drone
coordinate systems (T1L – T3L – TnL ) are defined relative to the load coordinate system (TLW )
to make the virtual structure.
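The virtual-structure geometry in Figures 3.7 and 3.8 can be sketched in a few lines. This is an illustrative computation rather than the thesis code; the function name is invented here, and the 1.25 m radius matches the value used for the real platform in Section 3.5.

```python
import numpy as np

def formation_positions(n, radius, height):
    """Drone positions in the load frame, evenly spaced on a virtual circle
    of the given radius, all at the same height above the load."""
    angles = 2.0 * np.pi * np.arange(n) / n
    return np.stack([radius * np.cos(angles),
                     radius * np.sin(angles),
                     np.full(n, height)], axis=1)

# Three drones on a 1.25 m circle, 2 m above the load.
pts = formation_positions(3, radius=1.25, height=2.0)
```

Holding these offsets fixed relative to the load frame is what makes the formation a rigid virtual structure under the quasi-static assumption.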

prescribed load trajectory, and thus gives a dynamically feasible time evolution of the full system
states. The low-fidelity dynamics simulator is built in the Julia programming language,
and is available here: https://github.com/hmer101/18.337_ude_cables.
An example output of the simulator is shown in Figure 3.10b, where the data is generated
by following the load trajectory in Figure 3.9. This trajectory is selected as it is easy to
validate the results by eye, and it provides the simplest possible case for building the cable
dynamics models discussed in Chapter 5.
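To make the flatness-based generation concrete, the sketch below differentiates the circular flat output analytically and applies a Newton force balance to recover the net force the cables must exert on the load. The angular rate is an assumed example value (the 1.6 kg load mass is taken from Section 3.4.3); the thesis simulator solves the full system of Equations 3.1 - 3.4 rather than this simplified balance.

```python
import numpy as np

m_load, r, omega, g = 1.6, 2.0, 0.5, 9.81   # mass (kg), radius (m), rate (rad/s)

def load_flat_output(t):
    """Position and acceleration of a load flying a horizontal 2 m circle."""
    p = np.array([r * np.cos(omega * t), r * np.sin(omega * t), 2.0])
    a = np.array([-r * omega**2 * np.cos(omega * t),
                  -r * omega**2 * np.sin(omega * t), 0.0])  # centripetal
    return p, a

p, a = load_flat_output(1.0)
# Newton balance m*a = F_cables + m*g_vec gives the net cable force:
F_cables = m_load * (a - np.array([0.0, 0.0, -g]))
```

The net cable force carries the load's weight plus the centripetal component; distributing it among the individual cables (and hence placing the drones) is what the full dynamics equations resolve.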

Figure 3.9: Trajectory of the load used to generate data in the Julia simulator: a circle of
2 m radius with a fixed load orientation. W represents the fixed ‘world’ coordinate system
while L0 – Ln represent the load’s coordinate system over timesteps 0 – n. Note that the
drone orientations stay fixed relative to the load’s orientation as defined in Figure 3.7.

3.4.2 Medium-fidelity Drake simulator


The low-fidelity dynamics simulator discussed in Section 3.4.1 provides useful dynamically-
feasible trajectories. However, it lacks the ability to simulate more advanced forces such as
propeller aerodynamics and cable tensions. These forces are crucial to testing the drones’
control systems and ensuring that the multi-drone slung load system will be stable under the
quasi-static assumption (described in Section 3.3.4). Although this represents a deviation

(a) Linear trajectory data and predictions - load.

Figure 3.10: Dynamically feasible data points for a three-drone rigid-body load system,
generated by the low-fidelity Julia simulator. The load travels anti-clockwise in a circle
without vertical motion. The drones remain arranged in a way such that neither the load
nor the drones rotate over the trajectory. The first ‘x’ in the labels represents state while L,
1, 2 & 3 index the load and the drones. Data points are split into x, y and z components
relative to the inertial frame as identified by the trailing ‘x’, ‘y’ or ‘z’. Data points for cables
that appear to be missing are directly under the corresponding data points for another cable.

(b) Linear trajectory data - drones (continued).

Figure 3.10: Continued figures from previous page.

from the primary objective of building a slung load pose estimator, it is a useful attempt at
building a simulator to verify the testing platform’s stability with various drone controllers.
This Section discusses some of the challenges with trying to verify control system sta-
bility in the multi-drone slung load system using Drake. Drake was selected as it is a
popular open-source toolbox that focuses on contact mechanics simulation and provides
many pre-implemented controllers. A video overview of the key results can be found here:
https://youtu.be/NR5nPiXb5wU, while the corresponding code can be found here:
https://github.com/hmer101/6.8210_drake_slung_load. Note that the video presents three clear
components of the project: differential flatness, state regulation, and trajectory stabilization.
Only the state regulation and trajectory stabilization components (shown in Figure 3.11) are
discussed here, as the differential flatness module was primarily developed by collaborators.

Figure 3.11: Block diagram illustrating the main components of state regulation and trajec-
tory stabilization.

Plant modelling
The Drake toolbox has three core components:

• Modelling dynamical systems: central functionality that allows system models
to be built from a huge array of dynamical components, including modules such as
sensors, actuators, estimators and controllers.

• Multibody kinematics and dynamics: interprets various kinematic model formats,
and handles multibody kinematics simulation and contact modelling.

• Solving mathematical programs: a unified interface to a number of mathematical
program solvers.

Plant modelling is performed predominantly with a simulation description format (SDF)
file structured as in Figure 3.12.

Load (parent)
├─ ball joint ─ Tether 1 ─ ball joint ─ X500 Drone 1
├─ ball joint ─ Tether 2 ─ ball joint ─ X500 Drone 2
└─ ball joint ─ Tether 3 ─ ball joint ─ X500 Drone 3

Figure 3.12: Kinematic tree of the three drones, tethers, and load. Note the parent link is
the load, followed by the three tethers and then the drones.

This file is interpreted by Drake’s multibody kinematics component, to give the system
seen in Figure 3.13. This system is then augmented with some modules from the dynamical
systems component including propellers and an LQR controller (discussed in Section 3.4.2).

Figure 3.13: Load with three tethers and three drones at a stabilizable fixed point.

The SDF model features three tethers, each modelled as a single rigid cylindrical rod attached
with ball joints at either end (discussed in Section 3.4.2). Each tether is generated
from an XACRO file to allow the length, mass and number of links to be easily changed. The
load is modelled as a sphere with the tether attachment points symmetrically distributed
along the center plane to simulate the rigid-body aspect of the load. The tethers are connected
to the drones at their centers of mass.

LQR control
Dynamic programming (DP) is an algorithmic technique where a problem is broken into
smaller sub-problems. The solutions to those problems are then saved and optimized to give
an overall optimum solution. In the case where each state and the control inputs in the
problem are continuous, this is known as continuous dynamic programming. The Hamilton-Jacobi-Bellman
(HJB) Equation is often used to formulate these problems.
A special case arises when the dynamics of a system can be expressed in a linear time-
invariant (LTI) form:

ẋ = Ax + Bu, (3.6)
and the infinite-horizon cost function is given by:
J = ∫₀^∞ [x⊺ Qx + u⊺ Ru] dt,  Q = Q⊺ ⪰ 0,  R = R⊺ ≻ 0,  (3.7)
where:
• A is the system matrix,

• B is the input matrix,

• Q is the state cost matrix, and

• R is the control cost matrix.


In this special case, the optimal control output (u∗ ) that minimizes the cost function 3.7 can
be found:

u∗ (t) = −Kx(t). (3.8)


The controller that selects this value is called a linear quadratic regulator (LQR). An
LQR controller is therefore an optimal controller that can be applied to a system with
LTI dynamics and a quadratic cost function. It is called a regulator because it stabilizes
the system to be controlled at the origin (technically linear quadratic optimal tracking if
stabilizing at any stabilizable fixed point). Figure 3.13 shows this fixed point for the multi-drone
slung load system. Note that the tethers are modelled as single cylinders because the
fixed point is calculated with static analysis (∑F = ∑M = 0), which is easier with a single
rigid link.
The optimal input 3.8 is found by differentiating the cost function 3.7 and solving for the
u that minimizes said cost function. Here, K is the optimal gain matrix given by:

K = R⁻¹B⊺P.  (3.9)
To determine K, find the positive-definite solution for P (if the system is controllable)
to the continuous-time algebraic Riccati Equation (CARE):

A⊺P + PA − PBR⁻¹B⊺P + Q = 0.  (3.10)
This is the complete process undertaken by the ‘infinite-horizon LQR controller’ block in
Figure 3.11.
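As a concrete illustration of Equations 3.8 - 3.10, the snippet below solves the CARE for a double integrator via the stable invariant subspace of the Hamiltonian matrix, reusing the 10/1/0.1 weight pattern applied in the experiments later in this Section. This is a hedged numerical sketch, not the Drake implementation used in this work.

```python
import numpy as np

def solve_care(A, B, Q, R):
    """Stabilizing solution of A'P + PA - P B R^-1 B' P + Q = 0,
    taken from the stable invariant subspace of the Hamiltonian matrix."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T],
                  [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]                 # n stable eigenvectors
    P = stable[n:, :] @ np.linalg.inv(stable[:n, :])
    return np.real((P + P.conj().T) / 2)      # symmetrize, drop imaginary dust

A = np.array([[0.0, 1.0], [0.0, 0.0]])        # double integrator
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])                      # position weight 10, velocity 1
R = np.array([[0.1]])
P = solve_care(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P                # optimal gain, Equation 3.9
```

The resulting gain places the closed-loop poles of A − BK strictly in the left half-plane, which is exactly the property the 'infinite-horizon LQR controller' block relies on.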
There are many variations and extensions of LQR which are well discussed in Tedrake’s
‘underactuated robotics’ textbook [87]. One important extension is finite-horizon LQR which
finishes optimization at a final time (tf ) rather than an infinite time. As a result, it uses a
cost function defined by:
J = x(tf)⊺Qf x(tf) + ∫₀^tf [x⊺ Qx + u⊺ Ru] dt,  (3.11)

Qf = Qf⊺ ⪰ 0,  Q = Q⊺ ⪰ 0,  R = R⊺ ≻ 0,
where Qf is the final state cost matrix.
Another important extension of LQR is time-varying LQR, which operates on a linear
time-varying (LTV) system defined by:

ẋ = A(t)x + B(t)u. (3.12)


Time-varying LQR allows us to define a nominal trajectory (x0 (t), u0 (t)) to linearize
around for a fixed time period t ∈ [t1, t2]. We then run finite-horizon LQR on discretized
segments of this trajectory (using direct shooting, direct transcription, or direct collocation)
to find the optimal time-varying control input (u∗ (t)) to keep the system on this trajectory.
As time-varying LQR uses the “real” next state in each optimization step, and has mathe-
matically provable guarantees that it will reach this next state, it is considered closed loop.
Time-varying LQR therefore provides a closed-loop method to stabilize a system around a
given trajectory. As such, it is a common method for trajectory stabilization. This is the
complete process undertaken by the ‘time-varying LQR controller’ block in Figure 3.11. To
generalize this formulation to deal with constraints, model predictive control (MPC) can be
used, but is outside of the scope of what is discussed here.
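The finite-horizon step inside time-varying LQR can be sketched with the standard backward Riccati recursion. For clarity, this sketch assumes a single fixed (discretized) linearization rather than the time-varying A(t), B(t) of Equation 3.12; the system and values are illustrative.

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Qf, N):
    """Backward Riccati recursion; returns the time-ordered gains K_0..K_{N-1}."""
    S = Qf
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
        S = Q + A.T @ S @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

dt = 0.01
Ad = np.array([[1.0, dt], [0.0, 1.0]])        # discretized double integrator
Bd = np.array([[0.0], [dt]])
gains = finite_horizon_lqr(Ad, Bd, np.diag([10.0, 1.0]), np.array([[0.1]]),
                           np.diag([10.0, 1.0]), N=500)

# Roll the closed loop forward from an offset initial state.
x = np.array([[1.0], [0.0]])
for K in gains:
    x = (Ad - Bd @ K) @ x
```

Sweeping this recursion along each linearization of the nominal trajectory, rather than a single fixed (A, B), yields the time-varying gains used for trajectory stabilization.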

LQR control results


This Section describes the ability of an infinite-horizon LQR controller and a time-varying
LQR controller to perform quadcopter state regulation and trajectory stabilization. Unfortunately,
due to numerous modelling setbacks in Drake, the stability of the full multi-drone
slung load system could not be tested. The key issues are:

• Difficulty finding a stabilizable fixed point. As discussed earlier, LQR control
algorithms require the selection of a stabilizable fixed point to linearize around. In
Drake, this fixed point must be specified as an exact state, without even small errors.
This is difficult to do for complex multibody systems such as the multi-drone slung
load system.

• A convoluted API and documentation that is difficult to read. There are many
alternative methods for implementing the same systems in Drake. Some of these are
comparable, while some are superior for particular use cases; selecting which methods
to use is not straightforward. The online Drake documentation and tutorials are useful
for getting started, but often do not explain the trade-offs of different approaches.
Further, the documentation is missing discussion of many functions; the Python API
documentation is particularly poor.

• Unhelpful error messages. Many error messages produced by Drake are extremely
broad. Combined with the convoluted documentation, this makes specific issues very
difficult to debug.

Despite these setbacks, it was possible to simulate LQR on drones not attached to the
load. For these experiments, Q and Qf matrices are set as diagonal matrices with the first
half of the diagonal elements (corresponding to the position variables) set to 10 and the
other half (corresponding to the velocity variables) set to 1. The R matrix is set to a
diagonal matrix with all diagonal entries equal to 0.1.
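For reference, the weight matrices described above can be written out as follows. This assumes a 12-dimensional quadrotor state with six pose and six velocity variables, and four inputs; the exact state ordering and input count in the Drake model may differ.

```python
import numpy as np

n_pose, n_vel, n_inputs = 6, 6, 4
Q = np.diag([10.0] * n_pose + [1.0] * n_vel)   # penalize pose error most heavily
Qf = Q.copy()                                  # same weights at the horizon end
R = 0.1 * np.eye(n_inputs)                     # modest control-effort penalty
```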


Figure 3.14: Visualizations and plots of position over time for a single quadrotor using a) an
infinite-horizon LQR controller and b) a time-varying LQR controller. In b), the controller
switches to an infinite-horizon LQR controller at the end of the trajectory. Note that the
slight overshoots seen in a) are not seen in b).


Figure 3.15: Visualizations and plots of position over time for three quadrotors using a) an
infinite-horizon LQR controller and b) a time-varying LQR controller. In b), the controller
switches to an infinite-horizon LQR controller at the end of the trajectory. Note that the
slight overshoots seen in a) are not seen in b).

The infinite-horizon LQR controller is successful at stabilizing both a single drone (Figure
3.14a) and a 3-drone system (Figure 3.15a) around a fixed point. However, the drones
often overshoot their steady state position, especially if the final state (about which the
linearization is performed) is far from the initial state. This is expected as the infinite-horizon
LQR controller isn’t following a strict trajectory defined by an optimizer with constraints
and costs. This also means that a drone’s orientation often takes on impractical values over
its trajectory. For instance, the drone might spin upside-down while flying up towards the
desired final state.
A time-varying LQR controller is successful at stabilizing both a single drone (Figure
3.14b) and a 3-drone system (Figure 3.15b) around a nominal trajectory. In this case, the
overshoot seen with the infinite-horizon LQR controller is no longer present. This is again
expected because the time and effort costs enforced on the direct collocation solver caused
the resulting trajectory to be more gradual in its approach to the final state. It is important
to note that because the time-varying LQR controller uses a finite-horizon LQR controller
to solve for u∗ , the drones must switch to the infinite-horizon LQR controller to stabilize at
the final state.

3.4.3 High-fidelity software-in-the-loop Gazebo simulator


While the previous simulators discussed have their use cases, neither is capable of easily
simulating the multi-drone slung load dynamics and control in a realistic fashion. Some
simulators are capable of this, but are usually built specifically for this purpose (see RotorTM
[56]). In order to accurately test visual load pose estimation algorithms, a simulator that
is capable of generating visual data is necessary. Further, it is important to generate this
data in the context of a realistic dynamics and control scenario, such that the visual data
generated is close to what will be observed in the real world. This has the added benefit
of ensuring that a full system is stable and safe in simulation before it is tested in the real
world.
This Section describes the capabilities of a generic multi-drone slung load simulation
platform built using open-source tools. The platform is built specifically with load state
estimation algorithms in mind, and with a focus on enabling easy transfer to the real outdoor
world. The custom models built to modify PX4’s Gazebo simulation module to this end can
be found here: https://github.com/hmer101/drone_misc/tree/main/PX4-changed/Tools/
simulation/gz, while video results of a simulated flight can be found here: https://youtu.be/
L-jx-SenBfA. For the reader’s convenience, Appendix C compiles all video testing results
referenced in the remainder of this work.
Gazebo is a general-purpose, open-source simulation framework. It connects rigid body
physics engines (such as the open dynamics engine (ODE) used in this work) with a graphical
user interface (GUI) and a huge variety of plugins for simulating various robotics components
(e.g., electric motors, propellers subject to aerodynamic forces, contact mechanics, solid body
mechanics, sensor noise, and cameras). Similar to Drake, it can take in multibody robots
defined in SDF files (Figure 3.16) and perform dynamic interactions with these.

Figure 3.16: Multi-drone slung load system in Gazebo simulation. The box covered in ArUco
markers has a simulated mass of 1.6 kg, while the drones are realistic simulations of Holybro
X500 quadcopters.

Cable modelling
Unfortunately, Gazebo, like many other robotics simulators, is only capable of simulating
rigid bodies. This means that flexible cable dynamics must be modelled with a multi-link
rigid body (see Figure 3.17). Five key parameters are available to influence the accuracy of
this cable model:

• Number of links: how many rigid cylinders are included in a single cable. More
cylinders result in a model closer to a real cable, but require more computational
resources. Empirical observation suggests 10 links provide sufficient granularity without
overloading a standard Dell XPS 15 7590 laptop with an Intel(R) Core(TM) i9-9980HK
CPU and 32 GB of RAM.

• Joint type: what class of joint is selected (e.g., ball joint, sliding joint or universal
joint). Since continuous cables have no rigid joints, the best option is the joint with the
fewest constraints: the ball joint. Many parameter iterations show that the ball joint’s
damping coefficient does not alter the joint’s behavior. As the damping coefficient is an
essential parameter in approximating the cable’s dynamics, the next least constrained
joint with a functional damping coefficient is selected: the universal joint.

• Joint damping coefficient: a parameter that describes the rate of energy loss during
joint motion. A large damping coefficient results in less energy transmission down a
cable. Conversely, a damping coefficient that is too small can result in continuous
oscillations that grow to instability. It is found that a damping coefficient of 0.01 N s m⁻¹
(the minimum allowable non-zero damping) is optimal. Any lower than this (i.e., 0
damping) and the cable oscillates to instability, while any higher and motion is not
transferred down the cable as expected.

• Joint friction coefficient: a parameter that describes a force which must be overcome
to move a joint. As flexible cables do not have joints, adding any friction coefficient
results in a cable model that is far too stiff. Because of this, a friction coefficient of 0
is selected.

• Collision geometry: where on each element collisions should be detected and modelled.
Due to the way that the drones are arranged, and the quasi-static method of
operation, cable collisions with other world elements never occur. The one exception
to this is the ground plane before flight. As contact modelling is computationally
expensive, and cable contact never occurs in any significant way, cable collision
geometry is turned off.
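The joint choices above might look like the following fragment of a tether SDF. This is a hedged sketch: element names follow the SDF joint specification and the damping/friction values come from the text, but the link names and axis directions are illustrative, and the actual model.sdf linked below may differ.

```xml
<joint name="link_3_to_link_4" type="universal">
  <parent>link_3</parent>
  <child>link_4</child>
  <axis>
    <xyz>1 0 0</xyz>
    <dynamics>
      <damping>0.01</damping>   <!-- minimum non-zero damping -->
      <friction>0</friction>    <!-- flexible cables have no joint friction -->
    </dynamics>
  </axis>
  <axis2>
    <xyz>0 1 0</xyz>
    <dynamics>
      <damping>0.01</damping>
      <friction>0</friction>
    </dynamics>
  </axis2>
</joint>
```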

Figure 3.17 shows how the discrete links must be arranged during initialization to position
the end points in the correct locations. This shape is determined with a simple parabolic
model. The final SDF defining the most realistic rigid-body cable which starts in a parabolic
shape can be found on GitHub here: https://github.com/hmer101/drone_misc/blob/main/
PX4-changed/Tools/simulation/gz/models/tether/model.sdf.
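A minimal sketch of that parabolic initialization is shown below. This is illustrative, not the SDF-generation code itself; the sag value is an assumed parameter.

```python
import numpy as np

def parabolic_nodes(p0, p1, sag, n_links):
    """Node positions of a cable of n_links straight segments whose joints
    lie on a parabola sagging `sag` metres below the chord at its midpoint."""
    t = np.linspace(0.0, 1.0, n_links + 1)
    nodes = p0 + t[:, None] * (p1 - p0)              # points along the chord
    nodes[:, 2] -= 4.0 * sag * t * (1.0 - t)         # parabolic vertical dip
    return nodes

# 10-link cable between a drone attachment point and a load attachment point.
nodes = parabolic_nodes(np.array([0.0, 0.0, 2.0]),
                        np.array([1.0, 0.0, 2.0]), sag=0.1, n_links=10)
```

Each consecutive pair of nodes defines the endpoints of one rigid cylinder in the discretized cable, so the end points land exactly on the two attachment points.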

Figure 3.17: Discretized cable at its initialization point in Gazebo. Note the 10 discrete rigid
links that make up the cable are clearly visible.

Despite the discrete cable model being sufficiently realistic for this simulator, it is not a
perfect representation of a real flexible cable. This highlights the need for an accurate,
computationally cheap, and easy-to-use method for cable dynamics modelling. Such a
methodology is the subject of Chapter 5.

PX4 SITL
A software in the loop (SITL) simulation is one where compiled embedded code is run on a
different device to the hardware it will be deployed on. This device is usually the development
computer also running a broader simulation that the compiled code interacts with. In the
Gazebo simulation context, the development computer runs the Gazebo robotics simulation.
This simulation replicates the behavior of all sensors, actuators and multibody dynamics in
the multi-drone slung load system. The compiled PX4 autopilot code then runs on the same

computer, accepting simulated sensor inputs and sending commands to simulated actuators.
This provides a highly realistic representation of how the code logic will act when deployed
in the final real system.

Cameras
One of the advantages of the Gazebo simulator is that it is capable of producing pseudo-
photorealistic camera simulations (see Figure 3.18). This is an essential component when
building a vision-based pose estimation platform. How the cameras are modelled, how visual
load pose measurement is performed, and how this is combined with other information to
give enhanced load pose estimates is the subject of Chapter 4.

Figure 3.18: Image from a simulated camera mounted to a drone in the slung load system.
The overlayed axes (red = x, green = y, blue = z) represent the estimated marker axes using
OpenCV’s IPPE Square PnP algorithm.
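The measurement that a PnP algorithm such as IPPE Square inverts is the pinhole projection of the marker's four corners. Below is a hedged sketch of that forward model with assumed intrinsics and marker size (the simulated camera's real parameters differ).

```python
import numpy as np

fx = fy = 600.0                 # assumed focal lengths (pixels)
cx, cy = 320.0, 240.0           # assumed principal point
half = 0.25                     # half-side of an assumed 0.5 m square marker

# Marker corners in the camera frame: camera 1.5 m away, facing the marker.
corners = np.array([[-half, -half, 1.5], [half, -half, 1.5],
                    [half,  half, 1.5], [-half,  half, 1.5]])

# Ideal pinhole projection u = fx*X/Z + cx, v = fy*Y/Z + cy (no distortion).
pixels = np.stack([fx * corners[:, 0] / corners[:, 2] + cx,
                   fy * corners[:, 1] / corners[:, 2] + cy], axis=1)
```

A PnP solver does the inverse: given the detected `pixels`, the camera intrinsics, and the known marker size, it recovers the marker's rotation and translation relative to the camera.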

3.5 Real-world platform


Although modern simulators can be very realistic, it is extremely difficult to capture all
real world uncertainties faithfully. Consequently, to ensure sufficient robustness, all robotics
algorithms tested in simulation must also be tested in the real world. This Section first com-
prehensively describes the hardware required to build a suitable real world testing platform
(Section 3.5.1), before presenting a novel testing apparatus to reduce hardware testing risks
(Section 3.5.2). The reader is encouraged to first understand Section 3.2 before browsing
this Section, as it provides a systems-level overview essential to understanding the context
of each hardware element. The CAD models for all custom-designed hardware elements are
available in Appendix D.

3.5.1 Hardware system architecture
The full multi-drone slung load system can be seen in Figure 3.19a prior to flight, and
in Figure 3.19b during flight. A full video of the system during flight can be found here:
https://youtu.be/4FRBUPgz-X4, and a complete bill of materials (BOM) can be found in
Appendix E.
A key feature seen in Figure 3.19b is the Ethernet cables which set up a hardwired local
area network (LAN) between the drones and load. Prior to this hardware connection, inter-
drone communication through the local router would often suffer latency and dropout issues.
This type of unreliability can cause drones to lag slightly or fall out of formation
when one drone receives an updated load reference point while others do not. With the drones
being so tightly coupled, this can lead to the drones pulling the load in different directions,
potentially leading to a loss in upward drone lifting force and the load falling out of flight.
As the system construction requires physical cables to be run between the drones and the
load, adding physical Ethernet cables pragmatically solves the communication issue at no
disadvantage.

(a) System pre-flight. (b) System mid-flight.

Figure 3.19: The real world multi-drone slung load system.

Drones
As mentioned in Section 3.2, the system could be constructed with any N ≥ 3 combination
of real-world drones capable of continuous hovering. For convenience, a homogeneous group
of 3 Holybro X500 quadcopters was selected to perform the load lifting operation. Figure
3.20a shows the key components on each X500 drone. Of particular note is the custom Intel
RealSense camera mount showcased in Figure 3.20b. This mount (the CAD model for which

can be found here: https://bit.ly/3T1c4SU) is designed to make changing the load-facing-
camera angle easy. This is essential for testing the optimal load observation angle, which is
found to be 30◦ up from straight down (Figure 3.20b) when the drones are arranged in a
1.25 m radius circle around the load (see Figure 3.8a).
Each drone’s onboard Raspberry Pi (RPi) runs balenaOS (a Linux Yocto variant) which
runs Docker containerized code deployed through balenaCloud. The full containerized code
deployed to the balenaCloud fleet can be found here: https://github.com/hmer101/drone_
offboard_computer. This is exemplary of the back-seat driver architecture discussed in
Section 3.3.1: PX4 runs on the Pixhawk 6X providing the lower-level vehicle control (front-
seat) while the higher-level autonomy is performed on the RPi with custom code (back-seat).

(a) Key components mounted on the Holybro (b) Intel Realsense camera mounts of various
X500 drones. camera angles on the X500 drones.

Figure 3.20: Key features of the Holybro X500 drone platform.

Load
The load for this system can be any generic load, such as the infra-red camera mentioned
in Section . For the purposes of prototyping, a cheap custom load seen in Figure 3.21a
(and available as a CAD model here: https://bit.ly/471B8il) is used. This load features a
Pixhawk 6X (configured as a ‘balloon’ aircraft) and an F9P RTK GNSS for tracking the
ground truth position of the load. Data from the RTK GNSS is suitable for acting as a
ground truth reference as it can produce sub-centimeter level accuracy [88]. Ground truth
data is required to evaluate load pose estimation algorithms, but would not be required
once these algorithms are proven to perform at an acceptable accuracy. The ArUco marker
seen in Figure 3.21b (generated with OpenCV’s Aruco module: https://docs.opencv.org/4.
x/d5/dae/tutorial_aruco_detection.html) is important to the visual load pose estimation
algorithm discussed in Chapter 4.
The RPi 5 on the load runs the same code as the drones (see Section 3.5.1) also deployed
through balenaCloud. The only difference is what nodes are run by the launch file. It is

noted in Section 3.2 that it is possible to run the system with no computing device on the
load; the drones’ launch files will simply redistribute where the load’s nodes are run.

(a) Load without ArUco marker mounted. (b) Load with ArUco marker mounted.

Figure 3.21: Key components on the custom 1.62 kg load. The load is made of clear acrylic
for ease of prototyping.

3.5.2 Testing apparatus


Even if a drone functions flawlessly in a very high-fidelity simulation, there is no guarantee
that it will function as intended in the real world. There are numerous reasons for this,
such as imperfect simulators, unexpected real-world conditions, hardware failures, or incorrect
hardware setup. Real world failures can be unsafe, expensive, and time-intensive to rectify.
Therefore, it is important that as many post-simulation hardware failures as possible are
caught before full flight occurs.
Traditionally, real world testing can be safely performed on an apparatus similar to the
one presented in Figure 3.22. Such a ‘gimbal-style’ rig is useful for testing attitude controllers;
ensuring that a drone can hover in a stable position and respond to RC inputs as expected.
However, rigs of this style have two key limitations:

• High yaw inertia: for a drone to perform a yawing motion, it must produce enough
thrust to yaw itself, and the three moving components of the gyroscope (red, green and
blue). As rotational inertia (I) scales with radius of gyration (k) squared (I = mk 2 ),
even if the mass (m) of the rig is low, the inertia can still be non-negligibly high.
Tuning a yaw control system in this rig often results in gains that do not function as
well when the drone is flown unconstrained.

• No translation testing: the gimbal-style apparatus does not allow the drone to translate.
While this might be desirable for attitude control testing, it renders safely testing
any code requiring drone movement impossible. This is an increasingly limiting
feature as more advanced algorithms, generally run on companion computers,
often send only position commands to an FC.
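The inertia penalty in the first limitation follows directly from I = mk². With assumed example numbers (not measurements of the pictured rig or drone):

```python
# Yaw inertia I = m * k^2 for a drone alone vs. a gimbal rig's moving rings.
m_drone, k_drone = 2.0, 0.15     # kg, m: assumed drone mass / radius of gyration
m_rig, k_rig = 0.5, 0.45         # light rings, but a much larger radius

I_drone = m_drone * k_drone ** 2         # 0.045 kg m^2
I_rig = m_rig * k_rig ** 2               # 0.10125 kg m^2
# Despite weighing a quarter as much as the drone, the rings more than double
# the total inertia the drone's yaw controller must move.
```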

Figure 3.22: A ‘gimbal-style’ drone testing apparatus. Note that a drone’s roll, pitch and
yaw can vary by rotating a combination of the red, green and blue rig components.

To address these key limitations, a new ‘cable-based’ drone testing apparatus is presented.
This apparatus is pictured in Figure 3.23.

(a) Drone testing apparatus in its smallest con- (b) Drone testing apparatus in its largest config-
figuration (1 m3 ). uration (2 m3 ).

Figure 3.23: Novel cable-based drone testing apparatus in small and large configurations.
Here both configurations are pictured holding a Holybro X500 drone.

The key features of this apparatus are:

• Minimal rotational inertia: no solid structures are used to fix the drone in place
and so no additional mass must be moved to reorient the drone.

• Translation enabling: the adjustable cables allow the drone to move, thus enabling
position control and other autonomous functions to be tested.

• Adjustable size: the frame can be reconfigured to have anywhere between 1 m3 (see
Figure 3.23a) and 2 m3 (see Figure 3.23b) of usable flight space. This allows the rig to
be stored in more compact spaces, but still allows testing of larger drones and bigger
movement patterns if required.

• Adjustable range of translation: two retractable cables are used to limit the drone’s
range of vertical and horizontal motion; they stop a malfunctioning drone from flying
away, hitting the testing structure or hitting the ground. These cables have length
hard stops which can be adjusted to allow different motion and to fit different drone
and structure sizes.

• Mobile: features lockable caster wheels for easy relocation. This is helpful for enabling
outdoor testing if GPS is required.

• Extremely rapid mounting: the retractable cables are simply clipped onto a harness
that clips around a drone. This makes it effortless to mount a drone on, and
remove it from, the apparatus.

This design does not allow acrobatic testing as the risk of the cables colliding with the
propellers prevents full flips. However, it does allow testing trajectories that involve drone
orientations up to 45◦ . This is more than sufficient for testing any load-carrying code as
drones will never perform such acrobatic maneuvers while attached to a load. Video results
of the drone in the testing apparatus can be found here: https://bit.ly/3Xrp9rz. A complete
BOM required to make the apparatus can be found in Appendix E, while the CAD model
of the frame can be found here: https://bit.ly/3z2u6gN.

3.6 Conclusions
This Chapter presents a novel multi-drone slung load platform built using open-source tools.
This platform is built specifically to enable the testing of load pose estimation algorithms
outdoors in the real world.
Several key conclusions can be drawn from this Chapter:

• A quasi-static formation control structure is capable of stably lifting a slung load.
Using PX4 running on a Pixhawk with a back-seat Raspberry Pi 5 companion computer
is a simple way of achieving this.

• A custom Julia program can be used to generate dynamically feasible trajectories
defined by the governing Euler-Newton system of Equations.

• The Drake toolbox is extremely difficult to use for modelling complex systems; the
documentation is incomplete, and the errors are often unhelpful. The mathematical
programming interface, however, is useful and user-friendly.

• A Gazebo-based simulation using PX4-SITL is capable of creating an environment
suitable for developing vision-based load pose estimation algorithms. Cables can be
modelled as 10 links connected using universal joints with a damping coefficient of
0.01 N s m⁻¹ and a friction coefficient of 0.

• Using Ethernet cables to connect the drones enables a safe, lag-free real-world
multi-drone slung load system to function.

• An F9P RTK GNSS system provides sufficient localization accuracy to enable physically
connected drones to collaboratively carry a slung load.

• A retractable cable-based testing rig enables drone position-control testing; an essential
step when utilizing the back-seat driver architecture. The rig has, by design,
minimal yaw inertia.

3.7 Future work


Key items for future work include:

• Developing a less computationally intensive and more accurate method to model flexible
cables in Gazebo.

• Adding rollers to the center sections of the testing rig. This is intended to
reduce cable angles and increase the possible orientation testing range of a drone.

Chapter 4

Slung Load Pose Estimation

Until this point, we have assumed that the load in the multi-drone slung load system is
quasi-static. This means that the load is expected to stay reasonably stable in the center
of the carrying drone formation, and with an orientation locked to that of the formation.
Chapter 3 discusses how this is sufficient for lifting the load outdoors using a formation-based
strategy. However, when the connections from the drones to the slung load are flexible, the
pose of the slung load is constrained by, but not entirely determined by, the pose of the
drones. Because of this, the quasi-static assumption used until this point does not produce
an accurate estimate of load pose.
There are two key reasons why we might want a better estimate of the load’s pose:

• If we are using the load to measure (as discussed in Chapter 1) or interact with some
region of interest. In these cases, it is often important to know exactly where the load
is such that it can be accurately placed to perform its intended function.

• If the controller we are using to control the multi-drone slung load system requires
load pose feedback. This is common in more advanced control systems (discussed
in Chapter 2) useful for agile flight and precise load positioning.

Chapter 2 discusses how most load pose estimation is performed indoors with a motion
capture system. Further, it covers a range of literature that uses both visual and non-visual
methods suitable for slung load pose estimation in an outdoor single-drone slung load system.
This Chapter aims to be the first to present a load pose estimation algorithm in an outdoor
multi-drone slung load system.
Discussion begins with a higher-level overview of the full-system estimation pipeline (Sec-
tion 4.1). Section 4.2 then discusses visual pose estimation in more detail, followed by Section
4.3 which presents a Kalman filter-based pose estimator.

4.1 Higher-level estimation architecture


The key components of the multi-drone slung load system state estimator can be seen in
Figure 4.1. This Figure shows how each drone i has its own extended Kalman filter (EKF).
These EKFs fuse GNSS and IMU (including a magnetometer) data to produce pose (x_i, θ_i)
and velocity (ẋ_i, θ̇_i) estimates for each drone. Each drone also runs a visual measurement
module (discussed in Section 4.2), which produces estimates of the load’s pose relative to
itself (x̃^{d_i}_L and θ̃^{d_i}_L). The ‘slung load Kalman filter’ module (discussed in Section 4.3) uses
these measurements, and measurements of the load’s linear and angular acceleration from
an on-load IMU, to produce a load position estimate xL . A load orientation estimate θL is
yet to be implemented.

Figure 4.1: Architecture of the multi-drone slung load state estimator. x̃^{d_i}_L and θ̃^{d_i}_L represent
the measured (indicated by ˜·) position and orientation of the load relative to drone i ∈
{1..N}, while all other symbols are defined in Table 3.1. Note that the dotted box contains
the components that run on each drone i, and the dotted arrow indicates that the ‘slung
load Kalman filter’ does not yet produce θL .

It is possible to replace the on-load IMU by using the cable dynamics estimation method
discussed in Chapter 5. This has the advantage of not requiring an IMU to be mounted on
the load. However, as the cable dynamics estimation method is yet to produce functional
results, the on-load IMU is used for results presented in this Chapter.
It is also possible to formulate this state estimation problem differently: each drone
could run its own load pose estimator. However, this would mean that each drone’s estima-
tor would only use the measurement information directly available to it, rather than using
the measurement information available to the full system. Rectifying this would require
sending load measurement information between drones, running an additional estimator to
fuse estimates from each drone, or using a consensus algorithm to fuse the pose estimates
available from each drone. As there are currently no multi-drone slung load pose estimation
methods available in literature, the centralized approach is taken to provide a simple and
functional baseline.

4.2 Visual pose measurement


As previously mentioned, Chapter 2 discusses some visual and non-visual methods that could
be useful for load pose estimation. Visual methods involve using one or multiple cameras

and one of a few possible algorithms (described below) to estimate the load’s pose from
a 2D image. Non-visual methods could involve a range of sensors. One technique that is
straightforward to conceptualize is measuring cable tensions with load cells and cable angles
with potentiometers or Hall effect sensors. Visual methods are employed here for a few
reasons:

• They provide a direct measurement of the load’s pose independent of cable
dynamics. Non-visual methods, measuring cable tension vectors on the drone side, do
not take into account any cable waves or complex non-straight cable shapes. As
visual methods measure the load directly, they circumvent the need to understand cable
dynamics entirely; a good method to start with.
• They can still estimate the load’s pose when the cables are slack. Cable-tension-
based approaches are unable to do this as no cable tension results in no load dynamics
information.
• They have the added benefit of providing an estimate of cable length if the size of the
marker or load is known a priori, or stereo methods are used. Conversely, cable-tension-
based methods can provide a straightforward estimate of the load’s mass, although this
is also possible with the adaptive control techniques discussed in Chapter 2.

This Section discusses how a 2D image can be used to construct an estimate of a slung
load’s pose in 3D. It begins with a brief description of how 2D images are formed (Section
4.2.1), before discussing the perspective-n-point algorithm (Section 4.2.2) fundamental to the
proposed method. This Section then concludes with a presentation of the results of visual
load pose measurement in Section 4.2.3.

4.2.1 Camera projections


The pinhole camera model is a simple model for a camera that describes where a point on a
3D object will appear in an image frame (see Figure 4.2). The Equation
\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \Pi_0 T^C_W \begin{bmatrix} t^W_p \\ 1 \end{bmatrix} \tag{4.1}
\]
presents this model mathematically (see Carlone’s ‘Visual Navigation for Autonomous
Vehicles’ notes for a detailed derivation [89]) where:

• t^W_p is the position (in the world frame) of a point on an object,

• (u, v) are the corresponding 2D pixel coordinates of the 3D object point,


• K is the intrinsic calibration matrix of the camera defined by Equation 4.2,

• Π_0 is the canonical projection defined by Π_0 = [I_3 0_3], and

• T^C_W is the homogeneous transformation matrix from the world frame (W) into the
camera frame (C).

Here, the intrinsic calibration matrix K is defined as:
\[
K = \begin{bmatrix} s_x f & s_\theta f & c_x \\ 0 & s_y f & c_y \\ 0 & 0 & 1 \end{bmatrix}, \tag{4.2}
\]
where:
• s_x f is the focal length in horizontal pixels (s_x is the number of horizontal pixels per
meter),

• s_y f is the focal length in vertical pixels (s_y is the number of vertical pixels per meter),

• s_θ f is the skew of the pixel, and

• (c_x, c_y) is the principal point in pixels.

Figure 4.2: Pinhole camera model showing a projection of a point on an object from the 3D
world frame (t^W_p) into the image frame (u, v) through the camera frame (C), adapted from
the OpenCV documentation [90].

Using Equation 4.1, if the location of a point in a world coordinate system (t^W_p) is known,
and the location of a camera (with known characteristics K) relative to this same coordinate
system is also known ((T^W_C)^{-1} = T^C_W), the location of the point in a 2D image generated by
the camera can be found (u, v).
By rearranging Equation 4.1, it is possible to find the location of points in 3D given that
their corresponding points in the 2D image are known. However, when the Equation is used
like this, there are multiple possible solutions which are scalar multiples (λ) of each other:
λ[t^W_p; 1]. This is the problem known as scale ambiguity; all 3D points along the red
projection line in Figure 4.2 produce the same 2D projection. This means that it is impossible
to reconstruct the size of the object given only a 2D image, i.e. the object pictured could be
large and far away from the image plane, or small and close. Reconstructing the scale of
the object requires either triangulation (as in a stereo pair) or known priors of the object’s
size (used in Section 4.2.2).
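As a sketch of these two Equations (the focal length, principal point, and test point below are illustrative values, not the platform’s calibration):

```python
import numpy as np

def make_K(fx, fy, cx, cy, skew=0.0):
    """Eq. 4.2: intrinsic calibration matrix (focal lengths already in pixels)."""
    return np.array([[fx, skew, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def project(K, T_C_W, p_world):
    """Eq. 4.1: project a 3D world-frame point t^W_p to (u, v) pixel coordinates.

    T_C_W is the 4x4 homogeneous transform from the world frame into the
    camera frame.
    """
    Pi0 = np.hstack([np.eye(3), np.zeros((3, 1))])  # canonical projection
    uvw = K @ Pi0 @ T_C_W @ np.append(p_world, 1.0)
    return uvw[:2] / uvw[2]  # perspective divide
```

For example, with the camera frame coincident with the world frame (T^C_W = I), a point at (1, 0, 2) m projects to (570, 240) px for f = 500 px and principal point (320, 240).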

4.2.2 Perspective-n-point
In the context of the multi-drone slung load system, Equation 4.1 can be rewritten to use
the load’s coordinate system (L) rather than the world’s:
\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \Pi_0 T^C_L \begin{bmatrix} t^L_p \\ 1 \end{bmatrix}. \tag{4.3}
\]
This modification means that 3D points of interest are defined relative to the load rather
than the world (t^L_p replaces t^W_p), and the transform of interest now becomes the load in the
camera frame (T^C_L) rather than the world in the camera frame (T^C_W). Note that as multiple
points on the load are given as inputs, the relative positions of these points in the load frame
are known. This gives a prior reference of the load’s scale, which allows us to resolve the
scale ambiguity discussed in Section 4.2.1.
The perspective-n-point (PnP) algorithm is designed to solve the pose estimation problem
[91], where n represents the number of points used to do so. Mathematically (see Equation
4.3): find the transform T^C_L given n load points in 2D pixel coordinates (u, v), corresponding
3D load point coordinates (t^L_p), the camera intrinsic matrix K, and the canonical projection Π_0.
The solution to this problem is a rotation and translation for an object that minimizes the
reprojection error given a set of 2D-3D point correspondences.
There are many variants of the PnP pose estimation problem, each of which uses a different
solution algorithm. Some common PnP variants are:

• P3P: a minimal solver requiring only 3 points (n = 3), but returning up to four real
solutions [92]. Using a fourth input point (P4P) helps resolve ambiguity, but points
that are close in 2D can still be difficult to resolve.

• Infinitesimal plane-based pose estimation (IPPE): fast and robust solutions when
n ≥ 4 input object points are coplanar [93]. Two solutions are possible when using
IPPE as discussed in Section 4.2.2.

• IPPE square: a variation of IPPE useful for ArUco marker/AprilTag detection where
coplanar points are arranged in a square [93].

• Iterative: uses Levenberg-Marquardt optimization [94] to refine a pose estimate after
an initial guess.

The solution to the P3P problem is shown graphically in Figure 4.3, and is used as
an example PnP solving algorithm. This Figure shows that 3 points on an object form a
tetrahedron with the camera coordinate system. Using the cosine rule with this tetrahedron,
a system of 3 Equations can be derived:

\[
l_1^2 = s_2^2 + s_3^2 - 2 s_2 s_3 \cos\theta_{23}, \tag{4.4}
\]
\[
l_2^2 = s_1^2 + s_3^2 - 2 s_1 s_3 \cos\theta_{13}, \tag{4.5}
\]
\[
l_3^2 = s_1^2 + s_2^2 - 2 s_1 s_2 \cos\theta_{12}. \tag{4.6}
\]
Using algebra, s_1, s_2, and s_3 can be solved (l_1, l_2, l_3, θ_13, θ_12, and θ_23 are known), thus locating
the object.

Figure 4.3: Perspective-3-point visualized. 3 points on an object P_1, P_2, P_3 are projected
onto a 2D image plane at P_1′, P_2′, P_3′. A tetrahedron is formed by the camera coordinate
system origin (peak), object (base) and the projection lines s_1, s_2, s_3. This tetrahedron
provides the geometric constraints necessary to find the pose of the object relative to the
camera.

As described in Section 3.5.1, the approach taken in this work is to attach an ArUco
marker to the top face of the slung load. This gives a clear set of 4 planar reference points
(corners of the marker) which can be used with the IPPE square PnP algorithm.

IPPE ambiguity
As previously mentioned, the IPPE algorithm returns two solutions. When the projection of
the object is close to affine, it becomes difficult to distinguish these two solutions [93]. This
happens when the object is far away from the camera, or small, which is often the case in
a slung load carrying situation. The ambiguity that results in this case can be visualized in
Figure 4.4.
As this is a fundamental limitation of many PnP algorithms, no approach exists to
distinguish the correct solution in the general case [93]. One common workaround is to
temporally filter solutions: select the solution that is closest to the previous solution to
prevent a pose estimate ‘jumping’ between solutions. The key limitation of this approach is
that it can be difficult to select which solution should be used in the first instance.

Addressing IPPE ambiguity


In the case of the multi-drone slung load system, it is possible to use expected pose priors to
determine the correct solution in the first instance. As discussed in Section 3.3.4, the drones

Figure 4.4: A visualization of the pose ambiguity when using the IPPE algorithm to detect
a square marker. The two solutions correspond to the marker being flipped around a plane
parallel to the image plane. Note that this ambiguity only affects the marker’s predicted
orientation, not its position.

fly in a pre-determined formation around the load. During takeoff, there is deliberately a
phase in which the drones fly in this formation but have not yet engaged cable tension. If
the drones and the load are arranged in a particular way prior to takeoff, the drones will
now be in roughly known locations relative to the static load (as the cable tensions have not
yet engaged). Using

\[
T^{c_i}_m = (T^{d_i}_{c_i})^{-1} (T^L_{d_i})^{-1} T^L_m, \tag{4.7}
\]


the ith drone’s known position relative to the load (T^L_{d_i}) can be used (along with the
known static transforms of the marker relative to the load, T^L_m, and the ith camera relative to
the ith drone, T^{d_i}_{c_i}) to determine the expected transform from the ith camera to the marker
(T^{c_i}_m). This can then be compared with the solutions from IPPE (T̃^{c_i}_m); the solution with the
lesser distance to the expected solution can be taken as the correct solution.
The same logic can be applied to the rotation in isolation (rather than the full
transformation matrix) to give:
\[
R^{c_i}_m = (R^{d_i}_{c_i})^{-1} (R^L_{d_i})^{-1} R^L_m. \tag{4.8}
\]
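This prior-based disambiguation can be sketched as follows (function and variable names are illustrative; the candidate transforms would come from the IPPE solver, and the chained transforms from calibration and the takeoff formation):

```python
import numpy as np

def geodesic_dist(R_a, R_b):
    """Rotation angle (radians) between two rotation matrices."""
    c = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return abs(np.arccos(np.clip(c, -1.0, 1.0)))

def expected_marker_in_camera(T_cam_in_drone, T_drone_in_load, T_marker_in_load):
    """Eq. 4.7: expected transform of the marker in camera i's frame."""
    return (np.linalg.inv(T_cam_in_drone)
            @ np.linalg.inv(T_drone_in_load)
            @ T_marker_in_load)

def select_solution(candidates, T_expected):
    """Pick the IPPE candidate whose rotation is closest to the prior (Eq. 4.8 logic)."""
    R_exp = T_expected[:3, :3]
    return min(candidates, key=lambda T: geodesic_dist(T[:3, :3], R_exp))
```

Because the IPPE ambiguity only affects orientation, comparing the candidate rotations against the expected rotation is sufficient to pick the correct solution in the first instance.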

4.2.3 Load pose measurement results


The visual load pose measurement approach is tested using the Gazebo simulator and the
hardware platform discussed in Chapter 3. On both of these platforms, the trajectory seen in
Figure 4.5 is used for testing as it is simple and produces both load translation and rotation.

Figure 4.5: Trajectory of the load used to generate data for visual measurement: a circle of
2 m radius with changing load orientation. The rate of load orientation change is synchronised
with the load’s circular motion such that the same face of the load is always facing the center
of the circle. W represents the fixed ‘world’ coordinate system while L0 – Ln represent the
load’s coordinate system over timesteps 0 – n. Note that the drone orientations stay fixed
relative to the load’s orientation as defined in Figure 3.7.

The results of the PnP load pose estimation algorithm at a snapshot in time from a single
drone’s estimator can be seen in Figure 4.6.

Figure 4.6: Image stream from a simulated camera mounted to a drone in the slung load
system. The overlayed axes (red = x, green = y, blue = z) represent the estimated marker
axes using OpenCV’s IPPE square PnP algorithm. This is a repetition of Figure 3.18 for
convenience.

During these tests, ground truth (gt) in simulation is taken to be the exact pose of the
load model in the Gazebo simulator (retrieved using the ros_gz_bridge package). In the
real world, ground truth (gt) is taken to be the pose of the load as measured by the EKF
state estimator running on the onboard Pixhawk (see Section 3.5.1). The aggregate position
error magnitude is defined by:

\[
|t^C_{L,\,error}| = |t^C_{L,\,est} - t^C_{L,\,gt}|, \tag{4.9}
\]
while the orientation error magnitude is defined by the geodesic distance:
\[
dist_\theta(R^W_{L,\,est}, R^W_{L,\,gt}) = \left| \arccos\left( \frac{tr((R^W_{L,\,est})^T R^W_{L,\,gt}) - 1}{2} \right) \right|. \tag{4.10}
\]
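These two metrics translate directly into code; as a sketch (this helper returns radians, while the plotted results report degrees):

```python
import numpy as np

def position_error_mag(t_est, t_gt):
    """Eq. 4.9: aggregate position error magnitude."""
    return np.linalg.norm(t_est - t_gt)

def geodesic_attitude_error(R_est, R_gt):
    """Eq. 4.10: geodesic distance between estimated and ground-truth rotations."""
    c = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return abs(np.arccos(np.clip(c, -1.0, 1.0)))  # radians; clip guards round-off
```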

Figure 4.7 shows the predicted load’s pose using the IPPE square algorithm compared
to the ground truth load pose. This Figure separates the estimation into xyz and rpy (roll,
pitch and yaw) components of pose. Figure 4.8 uses the same data but instead plots the
error by component (‘measured minus ground truth’), while Figure 4.9 plots the aggregate
error magnitudes.

(a) IPPE square results in simulation.

(b) IPPE square results in the real world.

Figure 4.7: Estimated pose of the slung load using the IPPE square algorithm. These results
were captured in real time using an image stream from a camera onboard a single drone. All
drones were simultaneously running this algorithm and achieved comparable results.

(a) IPPE square results in simulation - error by component.

(b) IPPE square results in the real world - error by component.

Figure 4.8: Estimated pose of the slung load using the IPPE square algorithm - error by
component. These results were captured in real time using an image stream from a camera
onboard a single drone. All drones were simultaneously running this algorithm and achieved
comparable results.

(a) IPPE square results in simulation - combined error.

(b) IPPE square results in the real world - combined error.

Figure 4.9: Estimated pose of the slung load using the IPPE square algorithm - combined
error. These results were captured in real time using an image stream from a camera on-
board a single drone. All drones were simultaneously running this algorithm and achieved
comparable results.

Comparing aggregate error magnitudes in simulation (Figure 4.9a) with those from the
real world (Figure 4.9b), it can be seen that translation error averages around 0.1 m in
simulation, but sits higher around 0.2 m in the real world. Following a similar trend, geodesic
distance attitude error sits around 0◦ in simulation, but closer to 5◦ in the real world (when
close to the solution). The geodesic distance attitude error is extremely noisy however, and
can be seen bouncing to an error of around 60◦ in both simulation and the real world. The
translation error shows a similar pattern of high noise.
Figure 4.7 provides some insight into the cause of the perceived error noise. Both Figure
4.7a and Figure 4.7b show high fluctuation in the roll (and to a lesser extent, pitch)
component’s visual estimation, suggesting a systemic algorithmic issue rather than an
implementation one. This fluctuation can be seen visually in the Gazebo video results (https:
//youtu.be/L-jx-SenBfA) when the load’s estimated coordinate system (similar to that
shown in Figure 4.6) flicks back and forth between what looks like a reasonable solution,
and one where the z-axis’ orientation is incorrect. The cause of this issue is the solution
ambiguity discussed in Section 4.2.2. As the load’s marker is small in the camera’s image,
the two solutions returned from IPPE both appear reasonable to the algorithm. As a result,
the solution with the lower reprojection error (and thus the selected solution) constantly
changes. This results in the high solution ‘noise’ seen. One potential solution to this issue
is discussed in Section 4.2.2, but this is yet to be tested.
Figures 4.7b, 4.8b, and 4.9b show several long periods without load pose measurement on
the real world platform (e.g., a period of approx. 60 s from 50 s to 110 s experiment time,
and a period of approx. 30 s from 170 s to 200 s experiment time). The periods without
load marker measurement are much smaller in simulation, as seen in Figures 4.7a, 4.8a, and
4.9a (maximum of approx. 5 s from 98 s to 103 s experiment time). This difference reflects
known issues with vision in the real world, including marker occlusion and inconsistent marker
illumination. The sources of these occlusions are extreme load angles, the cables, and the
GNSS unit shown in Section 3.5.1.

4.3 Load pose estimation


Section 4.2.3 shows how cameras on board individual drones can go long periods of time
without measuring the load marker. As some of the inability to measure the load is caused
by a drone’s own cable obstructing its camera’s view (which only occurs for each drone at
particular load angles), it is unlikely that all three drones (see Chapter 3) will fail to observe the
load’s marker at the same time. Consequently, it would be advantageous to use measurements
from all of the drones to estimate the load’s pose. Further, as the load measurements are
discrete, but its movement is continuous, a method of estimating the load’s pose between
measurements would also be of value.
Several higher-level tools could be used to address these problems:

• Factor graphs: graphical representations of estimation problems that can be solved
by various Bayesian inference algorithms.
• Bundle adjustment: a specific problem that takes a set of 2D correspondences (e.g.,
marker corners) over a number of frames as input. Using these 2D correspondences, the
objective is to estimate the poses of the 3D landmarks corresponding to the keypoints
(e.g., the slung load), as well as the poses of the cameras that took the images. Bundle
adjustment problems are often formulated using factor graphs.

• Kalman filters: an approach to maximum likelihood estimation in the Bayesian state
estimation framework. Commonly used to fuse a series of measurements over time with
some known dynamics model in order to reduce estimation uncertainty.

• Particle filters: an evolution of Kalman filter methods that relies on sequential Monte
Carlo predictions rather than linear predictions like the basic Kalman filter.

More generic structures and algorithms such as factor graphs and bundle adjustment tend
to use more computationally-intensive optimization methods to solve estimation problems.
As the objective is to estimate the pose of a slung load in real time on board resource-
constrained drones, algorithm efficiency is crucial. This favors Kalman filter and particle
filter based methods. As the problem is low-dimensional and simple (only attempting to
estimate the pose of the load), the Kalman filter approach is attempted first to reduce
computational and implementation complexity. This Section therefore discusses how variants
of the Kalman filter can be used to achieve a more accurate estimate of load pose. The code
required to implement this methodology can be found here: https://github.com/hmer101/
slung_pose_estimation.

4.3.1 Kalman filter


Kalman filters obtain optimal state estimates (x̂k ) by fusing a series of measurements over
time with a known dynamics model. The key benefit is that the fused optimal state estimate
will be a better estimate of the actual state than when relying on any one source of informa-
tion. For example, IMU data (often used in state update Equations) is susceptible to drift
while GNSS data can be susceptible to noise. Combining these two sources of information
gives a smoother result than when relying on GNSS alone, and a more accurate result later
in time than when using an IMU alone.
The most basic Kalman filter assumes a linear system model:

xk = Axk−1 + Buk + wk , (4.11)

with process noise wk and covariance Q defined by wk ∼ N (0, Q), and a linear measurement
model:

yk = Cxk + vk , (4.12)

with measurement noise vk and covariance R defined by vk ∼ N (0, R). The symbols used
are defined as:

• xk : true state at time k,

• A: state transition matrix,

• B: control input matrix,

• C: observation matrix,

• uk : control input at time k,

• yk : observation at time k.

The Kalman filter is a recursive estimator that functions in two key steps:
Prediction

\[
\hat{x}^-_k = A \hat{x}_{k-1} + B u_k \tag{4.13}
\]
\[
P^-_k = A P_{k-1} A^T + Q \tag{4.14}
\]

Update

\[
K_k = P^-_k C^T (C P^-_k C^T + R)^{-1} \tag{4.15}
\]
\[
\hat{x}_k = \hat{x}^-_k + K_k (y_k - C \hat{x}^-_k) \tag{4.16}
\]
\[
P_k = (I - K_k C) P^-_k \tag{4.17}
\]

During the prediction step of timestep k, the state update Equation uses the state
estimate (x̂_{k−1}) and state covariance (P_{k−1}) from the previous timestep to make an
a priori estimate of the state and covariance at this timestep (x̂^-_k and P^-_k). These
estimates are then used in conjunction with observations (y_k) in the update step to
calculate the a posteriori state estimate (x̂_k) and the a posteriori state covariance
matrix (P_k). Note that I is the identity matrix.
An important part of the update step is the calculation of the Kalman gain (Kk ) in
Equation 4.15. This gain defines how much emphasis to put on the a priori prediction
relative to the measurement and is decided by the relative size of Q and R. A larger R
relative to Q will result in a smaller Kk . Equation 4.16 shows how a smaller Kk results in
a smaller measurement weighting and vice versa. As a consequence, it is possible to tune
which source of information has a greater influence on x̂k and Pk by tuning the relative
magnitudes of Q and R.
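The two steps above can be sketched directly in code (a minimal sketch; the system, input, and observation matrices are placeholders for whatever models are chosen):

```python
import numpy as np

def kf_predict(x, P, A, B, u, Q):
    """Prediction step: Eqs. 4.13 and 4.14."""
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, y, C, R):
    """Update step: Eqs. 4.15-4.17."""
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)  # Kalman gain
    x = x_pred + K @ (y - C @ x_pred)
    P = (np.eye(P_pred.shape[0]) - K @ C) @ P_pred
    return x, P
```

Consistent with the tuning discussion above, inflating R relative to Q shrinks K, so the a posteriori estimate leans more heavily on the prediction than on the measurement.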
Two key variants of the Kalman filter include:

• Extended Kalman filter (EKF): works with nonlinear system models by linearizing
about an estimate of the current state and covariance before performing the prediction
step.

• Unscented Kalman filter (UKF): also works on nonlinear system models. Rather
than working on a linearization of the system model, the UKF uses an unscented
transformation to pick sample points around the mean. It then propagates those
points through the full nonlinear models which gives a more accurate estimation of the
mean and covariance than the linearization used in the EKF [95].

Kalman filter in the multi-drone slung load system
A Lie group is a mathematical group that is also a differentiable manifold. The special
Euclidean group (SE(d)) used to define homogeneous transformation matrices, and the
special orthogonal group (SO(d)) used to define rotations, are both examples of Lie groups.
For a basic introduction to Lie groups for robotics, see Sola et al. [80].
Lie theory is expansive and complex, but is useful here to define the subset of spaces
that valid robot poses can exist within. This provides useful constraints when solving pose
estimation problems which can be exploited by algorithms such as the Kalman filter. This
is the approach taken in the kalmanif library (found here: https://github.com/hmer101/
kalmanif) that implements a structure for using Kalman filters on manifolds (such as Lie
groups). The kalmanif library is used in this work as it provides a functional way to interface
with complex Kalman filter optimizations for Lie groups.
When applying a Kalman filter to the multi-drone slung load system, the key task is
the definition of the measurement Equation 4.12 and the state update Equation 4.13 (by
defining the system dynamics in Equation 4.11).
The measurement Equation is given by:
\[
\begin{bmatrix} y^L_{d_i} \\ 1 \end{bmatrix} = T^W_L \begin{bmatrix} t^W_{d_i} \\ 1 \end{bmatrix}, \tag{4.18}
\]
where T^W_L is defined as:
\[
T^W_L = \begin{bmatrix} R^W_L & t^W_L \\ 0_{1\times 3} & 1 \end{bmatrix}^{-1}. \tag{4.19}
\]
This Equation defines the position of drone i relative to the load, y^L_{d_i}, based on drone i’s
position t^W_{d_i}, which is obtained from the RTK GNSS system. This is the inverse of what is
measured by PnP (with static transform offsets for the camera and marker) as it is easier to
formulate in this direction using the kalmanif library.
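As a sketch of Equations 4.18 and 4.19 (the values in the usage are illustrative; on the platform t^W_{d_i} comes from each drone’s RTK GNSS EKF):

```python
import numpy as np

def measure_drone_in_load_frame(R_L, t_L, t_drone_world):
    """Eqs. 4.18-4.19: drone i's world position expressed in the load frame."""
    T = np.eye(4)
    T[:3, :3] = R_L          # load orientation in the world frame
    T[:3, 3] = t_L           # load position in the world frame
    y = np.linalg.inv(T) @ np.append(t_drone_world, 1.0)  # Eq. 4.19 inverse
    return y[:3]
```

For example, with the load at (1, 0, 0) and an unrotated load frame, a drone at (3, 0, 2) in the world frame is measured at (2, 0, 2) in the load frame.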
The state update Equation is implemented in a linear form:
\[
X^W_L = X^W_L + U, \tag{4.20}
\]
where:
\[
X^W_L = \begin{bmatrix} t^W_L \\ RPY_{ZYX}(R^W_L) \end{bmatrix} = \begin{bmatrix} x_L \\ \theta_L \end{bmatrix}, \tag{4.21}
\]
as it is easier to formulate it this way when using the kalmanif library. Note that RPY_{ZYX}
maps a rotation matrix into its roll, pitch and yaw Euler angles in the ZYX convention.
Capturing the state update Equation like this means that the nonlinear load dynamics
presented in Section 3.1 must be captured in the control input term:
\[
U = dX_L = \begin{bmatrix} \dot{x}_L \\ \dot{\theta}_L \end{bmatrix} dt + \frac{1}{2} \begin{bmatrix} \ddot{x}_L \\ \ddot{\theta}_L \end{bmatrix} dt^2. \tag{4.22}
\]
This Equation is derived using kinematic Equations with constant acceleration. The load
acceleration (ẍL , θ̈L ) information can either be obtained by an IMU mounted directly to the
load, or through rearranged Equations 3.3:
\[
\ddot{x}_L = \frac{1}{m_L} \left[ -\sum R^W_L t_i q_i \right] - g e_3, \tag{4.23}
\]
and 3.4:
\[
\ddot{\theta}_L = J_L^{-1} \left[ \sum r_i \times (-t_i q_i) - \dot{\theta}_L \times J_L \dot{\theta}_L \right], \tag{4.24}
\]
where the symbols used are defined in the Nomenclature Section. The rate information
required in Equation 4.22 (ẋL , θ̇L ) can be obtained via numerical integration regardless of
the source of load acceleration information. As discussed in Section 4.1, not using an IMU
mounted on the load requires a good cable dynamics estimator, which is still in development
(see Chapter 5). As such, the on-load IMU method is used to produce the following results.
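Under the constant-acceleration assumption, the control input of Equation 4.22 and the numerical integration of the IMU accelerations can be sketched as follows (a simplified sketch; in practice the IMU data also needs bias handling and frame alignment):

```python
import numpy as np

def control_input(rates, accels, dt):
    """Eq. 4.22: dX_L from rates [ẋ_L; θ̇_L] and accelerations [ẍ_L; θ̈_L] (6-vectors)."""
    return rates * dt + 0.5 * accels * dt**2

def integrate_rates(rates_prev, accels, dt):
    """Forward-Euler integration of load accelerations to obtain the rates above."""
    return rates_prev + accels * dt
```

For example, with a linear rate of 1 m/s, a linear acceleration of 2 m/s², and dt = 0.1 s, the first component of U is 0.1 + 0.5·2·0.01 = 0.11 m.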

4.3.2 Load pose estimation results


The low fidelity Julia dynamics simulator discussed in Section 3.4.1 is used to generate the
testing data for the Kalman filters. This simulator generates deterministic, dynamically-
feasible data quickly and so is ideal for early testing.
The trajectory used to generate the testing data in Julia is shown in Figure 4.10. Note
that the load orientation doesn’t change throughout the trajectory as the current state
estimator only estimates the load’s position and not its orientation (see Section 4.1).

Figure 4.10: Trajectory of the load used to generate data in the Julia simulator: a circle of
2 m radius with a fixed load orientation. W represents the fixed ‘world’ coordinate system
while L0 – Ln represent the load’s coordinate system over timesteps 0 – n. Note that the
drone orientations stay fixed relative to the load’s orientation as defined in Figure 3.7. This
is a repetition of Figure 3.9 for convenience.

Two filters are tested for estimating the load’s position: an EKF and a UKF. Figure 4.11
presents the trajectories predicted by the filters while Figure 4.12 presents the corresponding
position error components. Both Figures show that the unfiltered trajectory, without
measurement corrections, veers away from the ground truth over time, while the EKF and UKF
average around 0 error for the whole trajectory. It is interesting to note that the trajectories
predicted by the EKF and UKF are currently identical. This is likely due to the system
dynamics being implemented in a linearized way in Equation 4.20. As such, the nonlinear
dynamics are being lost, and the EKF and UKF degrade to handling the linearized case in
the same way. Alternatively, this could be a result of both filters following observations too
closely, as they both add observations on every 0.1 s time step. This could also be part of
the cause of the ‘high noise’ seen in Figures 4.12b and 4.12c; the over-reliance on
measurement data means the update Equation has a reduced smoothing effect. However,
a 1 m cable was used to generate the data. As both the EKF and UKF show a maximum
position error magnitude fluctuation of 0.06 m, this represents an error of only 6% of the
cable length; the noise observed is within an acceptable magnitude.

Figure 4.11: EKF and UKF predicted load trajectory vs the unfiltered (UNFI) load trajectory
obtained without measurement correction. The reference load trajectory is a circle of 2 m
radius, flying at 1.5 m/s. Measurement noise is modelled as y_noise ∼ N(0, 0.01), while
process noise is modelled with control input uncertainty u_translation noise ∼ N(0, 1e−4),
u_rotation noise ∼ N(0, 9e−6).

Although the results are early, they demonstrate that both EKFs and UKFs can be used to
provide drift-free load pose estimates. Further, they show that load pose estimates can be
obtained even between load pose measurements. These two results were previously
impossible with IMU or camera data alone.

(a) Position errors - unfiltered.

Figure 4.12: Predicted position error components (x, y, z) and magnitude (r) over the
trajectory seen in Figure 4.11. Blue lines correspond to the mean while the red dotted lines
correspond to 3 standard deviations from the mean (in the filter cases).

(b) Position errors - EKF.

(c) Position errors - UKF.

Figure 4.12: Predicted position error components and magnitude continued.

4.4 Conclusions
This Chapter presents a novel vision-based slung load pose estimation methodology for a
multi-drone slung load system. Several key conclusions are drawn:

• Visual slung load measurement can be performed successfully using the IPPE square
algorithm for ArUco marker detection. In simulation, translation error averages around
0.1 m while geodesic distance attitude error averages around 0◦ . In the real world, these
errors are higher at 0.2 m and 5◦ respectively.

• Perceived ‘high noise’ in the visual pose measurement is caused by IPPE square solution
ambiguity when observing a distant marker. Temporal filtering combined with the use
of known pose priors could be used to address this issue.

• Two key Kalman filter architectures exist to improve load pose estimates: use inde-
pendent filters on each drone, or run one filter that fuses information from all drones.
The single, centralized filter is used here to achieve a simple baseline position estimate
of the load.

• Both extended Kalman filter (EKF) and unscented Kalman filter (UKF) formulations
  can provide drift-free load position estimates. The position error averages around 0 m,
  while the error noise magnitude is only 6% of the cable length. Both methods have so
  far been tested only on simulation data.

4.5 Future work


For the load pose measurement system, immediate future work involves testing temporal
filtering with known priors to reduce the effect of PnP solution ambiguity.
A number of items exist for future work on the Kalman-filter-based pose estimator:

• Extend the current Kalman filter formulations to also estimate load orientation.

• Tune process and measurement noise covariances to obtain a better weighted reliance
on system dynamics and measurement information.

• Attempt a reformulation of the state update Equation that captures nonlinearities
  directly, rather than in the input term.

• Try the second Kalman filter architecture where one filter runs on each drone. This
could involve developing a method to fuse state estimates across multiple drones, such
as a consensus algorithm or an additional filter.

• Test the Kalman filters in real-time on the real world testing platform discussed in
Chapter 3.

• Attempt the removal of the IMU from the load, and use cable dynamics estimation for
estimating the load’s linear and angular accelerations.

• Ultimately, the best method for pose estimation will be a combination of both visual
and non-visual methods. Future work can focus on developing the non-vision-based
estimation methodology, and combining that with the vision-based methods presented
in this Chapter. Potential approaches include using a strain gauge on each cable to
measure tension combined with potentiometers or Hall effect sensors to measure cable
angle.

Chapter 5

Cable Dynamics Modelling

Section 3.1 presents a simple dynamics model for multi-drone slung load systems. The model
is reproduced here for the reader’s convenience:

$$m_i \ddot{x}_i = f_i R_i \hat{e}_3 - m_i g \hat{e}_3 + R_L t_i q_i, \quad (3.1)$$
$$J_i \ddot{\theta}_i + \dot{\theta}_i \times J_i \dot{\theta}_i = \tau_i, \quad (3.2)$$
$$m_L \ddot{x}_L = -\sum_{i=1}^{n} R_L t_i q_i - m_L g \hat{e}_3, \quad (3.3)$$
$$J_L \ddot{\theta}_L + \dot{\theta}_L \times J_L \dot{\theta}_L = \sum_{i=1}^{n} r_i \times (-t_i q_i), \quad (3.4)$$

where the symbols used are introduced in Table 3.1, and are available for reference in the
Nomenclature Section. It is noted in Section 3.1 that this model makes the critical assumption
that the cables in the system are massless. Until this point, this assumption, like the
others, has been used without much investigation. The aim of this Chapter
is to discuss the use of universal differential equations (UDEs) to address the effects of the
massless cable assumption.
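As a concrete reading of the translational Equations 3.1 and 3.3, the NumPy sketch below evaluates the drone and load accelerations for a hovering three-drone system. The function names, parameter values, and the convention that each unit cable direction q_i points from the drone toward the load are assumptions made for this illustration, not taken from the thesis code.

```python
import numpy as np

e3 = np.array([0.0, 0.0, 1.0])   # inertial-frame vertical unit vector
g = 9.81

def drone_accel(m_i, f_i, R_i, R_L, t_i, q_i):
    """Equation 3.1: translational acceleration of drone i under thrust,
    gravity, and its cable tension (massless-cable model)."""
    return (f_i * (R_i @ e3) - m_i * g * e3 + R_L @ (t_i * q_i)) / m_i

def load_accel(m_L, R_L, tensions, directions):
    """Equation 3.3: the load feels the equal-and-opposite tension of every cable."""
    cable_force = -sum(R_L @ (t * q) for t, q in zip(tensions, directions))
    return (cable_force - m_L * g * e3) / m_L

# Hover check: three vertical cables, each carrying a third of a 0.6 kg load.
q_down = np.array([0.0, 0.0, -1.0])   # assumed: q_i points from drone toward load
m_L = 0.6
tensions = [m_L * g / 3] * 3
a_L = load_accel(m_L, np.eye(3), tensions, [q_down] * 3)

# A 1.5 kg drone hovering: thrust balances its own weight plus the cable pull.
m_i = 1.5
a_i = drone_accel(m_i, m_i * g + tensions[0], np.eye(3), np.eye(3), tensions[0], q_down)
```

In static equilibrium both accelerations evaluate to zero, which is a quick consistency check on the signs in Equations 3.1 and 3.3.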
This Chapter first discusses the massless cable assumption and its implications (Section
5.1). UDEs are then presented, and an approach to apply them to the multi-drone slung load
problem is discussed (Section 5.2). Finally, a novel implementation to test this approach and
the results of said implementation are discussed in Sections 5.3 and 5.4 respectively.

5.1 The massless cable assumption


A massless cable, when under positive tension, will draw a straight line between its two
connection points. This means that the tension vectors acting on the connection points can
be assumed to be equal and opposite. The effect of this assumption is seen in Equations 3.1,
3.3 and 3.4: the tension vector ti qi has equal magnitude but acts in the opposite direction
on drone i (Equation 3.1) and the load (Equations 3.3 and 3.4).
In reality, a cable will have mass and thus be subject to the effects of gravity. This
results in the cable hanging in an approximately catenary shape. Figure 5.1 illustrates that

the tension vectors acting on the drone and load connection points are no longer equal
and opposite. Further, a real cable with mass is capable of carrying waves. Bernard and
Kondak have shown experimentally that angle encoders connecting the drones to the cables
are incapable of correctly predicting the load’s attitude due to these waves [48]. This is
again due to the cable tension vectors acting on the drone and load no longer being equal
and opposite.

Figure 5.1: Tension vectors at either end of a cable hanging in a catenary shape (t_load,
t_drone ∈ ℝ³). The cable in question is connected to drone i at x_i and to the load at
connection point x_Li.
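The asymmetry of the end tensions can be checked numerically. The sketch below evaluates both end forces of a catenary y(x) = a·cosh(x/a) in a vertical plane (all parameter values are assumed for illustration): the horizontal components match, but the vertical components differ by exactly the cable weight, so t_load ≠ −t_drone.

```python
import numpy as np

# Catenary y(x) = a*cosh(x/a) between the load (lower) and drone (upper) attachment
# points. Parameter values are illustrative, not taken from the thesis hardware.
a = 2.0                       # catenary parameter [m]
w = 0.5                       # cable weight per unit length [N/m]
x_load, x_drone = 0.2, 1.5    # attachment x-coordinates (taut, monotonic cable)

H = w * a                     # horizontal tension component, constant along the cable

def end_force(x, into_cable):
    """2D force the cable exerts on the endpoint at x ('into_cable' = +1 toward +x)."""
    return into_cable * np.array([H, H * np.sinh(x / a)])

t_load = end_force(x_load, +1)    # pulls the load up and toward the drone
t_drone = end_force(x_drone, -1)  # pulls the drone down and toward the load

cable_len = a * (np.sinh(x_drone / a) - np.sinh(x_load / a))  # arc length
# The two end tensions are NOT equal and opposite: their sum supports the cable weight,
# i.e. t_load + t_drone = (0, -w * cable_len).
```

Only in the limit w → 0 do the two end forces become antiparallel, recovering the massless-cable assumption used in Equations 3.1, 3.3 and 3.4.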

When sensing with a slung load, the load will likely be slung far below the drones. This
can increase the area that the sensor can reach, and reduce the propeller noise and
downwash experienced by the sensor. As the length of the cables increases, the cable mass
(which affects both the cable's hanging shape and its waves) can become significant. The
cable's mass can therefore be non-negligible in the multi-drone slung load sensing case that
is the main motivation for this project. As such, a method to address the massless cable
assumption is presented below.

5.2 Cable dynamics modelling with universal differential equations

In this Section, a hybrid physics-based and data-based method for modelling slung load
cable dynamics is presented. This method relies on UDEs, which are first presented in a

general sense in Section 5.2.1. Discussion then moves to how UDEs can be used to augment
the system of Equations 3.1 - 3.4 to better capture cable dynamics in the drone-slung-load
system (Section 5.2.2).

5.2.1 Universal differential equations


Neural ordinary differential equations (NODEs), such as:

$$\dot{u} = NN_\theta(u, t), \quad (5.1)$$


are simply ordinary differential equations (ODEs) where a derivative is defined by a neural
network. Universal differential equations (UDEs) are extensions of NODEs where differential
equations are defined in full or part by a function approximator [96]. A function approximator
is simply a generic mathematical structure that can be used to model some pattern in data.
Function approximators that are able to capture more common and complex nonlinear effects
are the most powerful of these (e.g., Taylor series, Volterra series, Fourier series, or neural
network). The Lotka-Volterra Equations:

$$\dot{x} = \alpha x - \beta x y, \qquad \dot{y} = -\gamma y + \delta x y, \quad (5.2)$$
model the populations of interacting prey and predator species, and provide a simple use-case
of a UDE [96]. Here, x and y represent the populations of a prey and predator species
respectively. Similarly, α represents the prey’s per capita growth rate, while γ represents
the predator’s per capita death rate. The nonlinear interaction terms xy in each Equation
attempt to model how the predator’s population affects the prey’s population (modified by
a scalar β), and vice versa (modified by a scalar δ). These Equations make a large range
of assumptions, particularly in the nonlinear xy term. This term suggests that the rate at
which individuals of each species meet is proportional to the product of their population
sizes, and that the effect of the populations on each other can be modelled entirely by this
meeting rate. Rather than making this potent assumption, we can instead use function
approximators as in:
approximators as in:

$$\dot{x} = \alpha x + U_1(\theta_1, x, y), \qquad \dot{y} = -\gamma y + U_2(\theta_2, x, y), \quad (5.3)$$
where U_1 and U_2 represent generic function approximators with parameter vectors θ_1 and
θ_2. These parameter vectors contain the parameters needed to fit the function approximators
to some provided data (e.g., weights and biases in the case of a neural network).
This allows known physics models to be easily expanded to account for effects that are
difficult to model without assumptions, whilst requiring far less training data than a
traditional neural network (or alternative function approximator).
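The structure of Equation 5.3 can be sketched as follows, in Python as a simplified stand-in for the thesis's Julia implementation. The network sizes, physics constants, fixed-step Euler solver and the "measured" data (generated from the classical Lotka-Volterra model) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, gamma = 1.1, 0.4                 # known physics terms kept from Equation 5.3

def init_mlp():
    """Tiny MLP standing in for a function approximator U(theta, x, y)."""
    return {"W1": 0.1 * rng.standard_normal((8, 2)), "b1": np.zeros(8),
            "W2": 0.1 * rng.standard_normal((1, 8)), "b2": np.zeros(1)}

def mlp(theta, u):
    h = np.tanh(theta["W1"] @ u + theta["b1"])
    return (theta["W2"] @ h + theta["b2"])[0]

theta1, theta2 = init_mlp(), init_mlp()

def ude_rhs(u, t):
    """Equation 5.3: known linear physics plus learned interaction terms."""
    x, y = u
    return np.array([alpha * x + mlp(theta1, u), -gamma * y + mlp(theta2, u)])

def true_rhs(u, t):
    """Classical Lotka-Volterra (Equation 5.2) used to generate 'measured' data."""
    x, y = u
    return np.array([alpha * x - 0.4 * x * y, -gamma * y + 0.1 * x * y])

def solve_euler(rhs, u0, ts):
    """Fixed-step Euler solve; a crude stand-in for a proper ODE solver."""
    us = [np.array(u0, dtype=float)]
    for t0, t1 in zip(ts[:-1], ts[1:]):
        us.append(us[-1] + (t1 - t0) * rhs(us[-1], t0))
    return np.array(us)

ts = np.linspace(0.0, 1.0, 101)
data = solve_euler(true_rhs, [1.0, 1.0], ts)   # stands in for measured trajectories
pred = solve_euler(ude_rhs, [1.0, 1.0], ts)
loss = np.mean((pred - data) ** 2)             # what training would minimize over theta
```

Training would adjust theta1 and theta2 to drive this trajectory-level MSE toward zero, so the networks only have to learn the unknown interaction terms, not the full dynamics.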
UDEs have been applied to huge success across a range of different fields [96], but have
not yet become a cornerstone tool in mechanical engineering. This will likely change in the
coming years as researchers attempt to bridge the gap between physics-based and data-based
dynamics and control techniques.

5.2.2 In-system cable dynamics modelling
Section 2.2.4 discusses the use of data-based techniques to overcome limiting physics-based
assumptions in the multi-quadrotor slung load system. The only notable effort put into
combining physics and data-based methods is Serrano et al.’s use of PINNs [84]. This
Section provides an example of using UDEs as an alternative hybrid approach to overcome
such limiting assumptions.
As discussed in Section 5.1, the effect of the cables on the drones and load are shown
by the tension vectors ti qi in the system of Equations 3.1 - 3.4. These terms are affected
by the massless cable assumption, and represent where the majority of modelling errors are
introduced into the system. Rather than making this assumption, we can instead use a neural
network to capture the cable tension vectors acting on the drone and load respectively. If
we have enough quality data, this should theoretically be able to capture the effects of the
cable on the system beyond the massless cable case.
To test the hypothesis that a UDE can accurately model the multi-quadcopter slung load
system, we start with the simple massless cable case. In this case, it is hypothesized that
the model:

$$t_{drone_i} = NN_p(x_{i/L_i}, \dot{x}_{i/L_i}, \ddot{x}_{i/L_i}), \quad (5.4)$$
$$t_{load_i} = -t_{drone_i}, \quad (5.5)$$
can capture the cable tension vectors. Equation 5.4 suggests that a neural network with
parameters ‘p’ (referred to throughout as the ‘tension-predicting neural network’ or simply
‘the neural network’) can capture the tension vector acting on the drone. It takes only the
relative motion of the two cable attachment points as inputs to make the prediction: x_{i/L_i}
is the position of the attachment point of a cable on drone i relative to its corresponding
attachment point on the load L_i, and similarly for the relative velocity ẋ_{i/L_i} and
acceleration ẍ_{i/L_i}.
This gives a total of 9 inputs to the neural network (each input is a 3 × 1 vector), and results
in a slightly modified system dynamics model defined by Equations:

$$m_i \ddot{x}_i = f_i R_i \hat{e}_3 - m_i g \hat{e}_3 + t_{drone_i}, \quad (5.6)$$
$$J_i \ddot{\theta}_i + \dot{\theta}_i \times J_i \dot{\theta}_i = \tau_i, \quad (5.7)$$
$$m_L \ddot{x}_L = -\sum_{i=1}^{n} t_{load_i} - m_L g \hat{e}_3, \quad (5.8)$$
$$J_L \ddot{\theta}_L + \dot{\theta}_L \times J_L \dot{\theta}_L = \sum_{i=1}^{n} r_i \times t_{load_i}. \quad (5.9)$$

The relative motion of the cable attachment points is used to make the tension predictions
independent of where the slung load system is relative to the inertial frame. This has
a similar effect to data normalization. As the massless cable assumption is used, the tension
vector acting on the load is equal and opposite to that acting on the drone (Equation 5.5).
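A minimal sketch of the tension predictor of Equations 5.4 and 5.5 (a Python stand-in for the thesis's Julia network; the layer sizes and weight initialization shown are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Tension-predicting network of Equation 5.4: 9 relative-motion inputs -> 3D tension.
# One hidden layer of 20 nodes matches the size discussed in the text; the weight
# scale is an illustrative choice.
p = {"W1": 0.1 * rng.standard_normal((20, 9)), "b1": np.zeros(20),
     "W2": 0.1 * rng.standard_normal((3, 20)), "b2": np.zeros(3)}

def predict_tensions(p, x_rel, v_rel, a_rel):
    """Return (t_drone_i, t_load_i) from the relative attachment-point motion."""
    inp = np.concatenate([x_rel, v_rel, a_rel])   # 9 inputs: three 3x1 vectors
    h = np.tanh(p["W1"] @ inp + p["b1"])
    t_drone = p["W2"] @ h + p["b2"]
    return t_drone, -t_drone                      # Equation 5.5: equal and opposite

x_rel = np.array([0.0, 0.0, 1.0])                 # drone 1 m above its load attachment
t_drone, t_load = predict_tensions(p, x_rel, np.zeros(3), np.zeros(3))
```

By construction the load-side tension is the negation of the drone-side tension; extending beyond the massless case would replace that negation with a second network, as discussed below.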
To extend beyond the simple massless cable case, it is hypothesized that a separate neural
network will be required to model the tension on the drone and the load respectively. To

allow generalization between cables of different mass, the cable mass would also likely have
to be an input. To capture the effects of varying other cable parameters such as elasticity,
damping and aerodynamic effects (among others), the key quantities defining these effects
(perhaps cable stiffness, damping ratio, radius and coefficient of drag), would also have to
be inputs. As a proof of concept however, this Chapter only deals with trying to replicate
the simple massless case. As such, we proceed with the formulation in Equations 5.4 and
5.5.
It may be noted that it is possible to train the neural network proposed in Equation 5.4
without embedding it in a UDE; this is discussed in Section 5.4. The problem with doing so,
however, is that it is difficult to measure the exact orientation of the tension vectors t_drone
and t_load on a physical system. Embedding the tension-approximating neural network in a
UDE means we can train directly on drone and load trajectory data, which is much easier
to obtain from a physical system or generic simulator than tension data.

5.3 Implementation in Julia


All code used to test the UDE system is implemented in the Julia programming language,
and is available here: https://github.com/hmer101/18.337_ude_cables (also in Appendix
A). Julia is used due to its support for fast scientific machine learning tools that are ex-
tremely useful for training and solving UDE systems. All experiments involving timing were
conducted on a Dell XPS 15 containing an Intel(R) Core(TM) i9-9980HK CPU clocked at
2.40GHz.

5.3.1 UDE system


The core of this Chapter is the UDE system that implements the system of Equations 5.6 -
5.9. These Equations are implemented as a system of first order UDEs with the state space
‘u’ defined by:
$$u = \begin{bmatrix} \xi_1 & \dots & \xi_n & x_L & \dot{x}_L & \theta_L & \dot{\theta}_L \end{bmatrix}^T. \quad (5.10)$$

Here, $\xi_i$, defined as:

$$\xi_i = \begin{bmatrix} x_i & \dot{x}_i & \theta_i & \dot{\theta}_i \end{bmatrix}^T, \quad (5.11)$$

represents the state space of drone i, where i ∈ {1, 2, ..., n} and n = 3. All other terms are
defined in the Nomenclature Section.
Julia’s DifferentialEquations.jl package requires a function of the form f1!(du, u, p, t) to
pass into a differential Equation solver. The mutating, in-place form ‘f1!’ is deliberately
used over the out-of-place form so that du need not be re-allocated on every iteration of the
solver. As many solver iterations are required to solve the system over a given time-span,
this significantly reduces the number of allocations required during solving. This in turn
significantly reduces the solve time, making training tractable (many full solves over
the entire timespan are required during training).

Equations:
$$du = \begin{bmatrix} \dot{\xi}_1 & \dots & \dot{\xi}_n & \dot{x}_L & \ddot{x}_L & \dot{\theta}_L & \ddot{\theta}_L \end{bmatrix}^T, \quad (5.12)$$

and:

$$\dot{\xi}_i = \begin{bmatrix} \dot{x}_i & \ddot{x}_i & \dot{\theta}_i & \ddot{\theta}_i \end{bmatrix}^T, \quad (5.13)$$
show where the drone and load acceleration terms appear in the du calculation. From
Equations 5.6, 5.8 and 5.9, we can see that the tension vector estimates (t_{drone_i} and
t_{load_i}) appear when calculating ẍ_i, ẍ_L and θ̈_L. These estimates are found using
Equations 5.4 and 5.5 defined earlier, and thus represent the embedded universal
approximator part of the UDE system.
Again, Julia’s DifferentialEquations solver limits the UDE system function to the form
f1!(du, u, p, t). As we only wish to find the sensitivity of the system to changes in the
parameters of the tension-approximating neural network, only the flattened parameters of
the neural network are passed into f1!’s ‘p’. This saves computation time when finding the
gradient of f1!, which is very relevant when training the neural network; there is no need
to find the sensitivity of the system to additional parameters we are not optimizing. All
additional parameters required in the UDE system computation are defined in a mutable
structure ‘DroneSwarmParams’ which wraps f1!. This results in the UDE system being
defined as a callable structure: (params::DroneSwarmParams)(du, u, p, t). A callable
structure is used rather than a closure as it allows the UDE system to be called more
easily in a variety of testing code blocks.
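The callable-structure pattern can be sketched in Python (the thesis implementation is in Julia; the toy single-body dynamics, field names, and single trainable thrust parameter here are assumptions for illustration only):

```python
import numpy as np

class DroneSwarmParams:
    """Callable structure holding fixed system parameters. Only the trainable
    parameters travel through the solver's `p` argument, so sensitivities are
    computed w.r.t. `p` alone. (Python sketch of the pattern used in Julia.)"""

    def __init__(self, m_load, g=9.81):
        self.m_load = m_load                   # fixed parameter held by the structure
        self.g = g
        self.e3 = np.array([0.0, 0.0, 1.0])    # fixed data, allocated once

    def __call__(self, du, u, p, t):
        # Toy dynamics: u = (load position, load velocity); p[0] is a trainable
        # vertical thrust magnitude standing in for the network parameters.
        pos, vel = u[:3], u[3:]
        du[:3] = vel                                        # in-place: du is reused
        du[3:] = (p[0] / self.m_load - self.g) * self.e3    # vertical force balance
        return du

params = DroneSwarmParams(m_load=0.6)
u = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
du = np.zeros(6)
params(du, u, np.array([0.6 * 9.81]), 0.0)   # thrust exactly balances gravity
```

Writing into the caller-supplied `du` buffer mirrors the in-place f1! form: the derivative array is allocated once and reused across every solver iteration.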
The UDE system’s state space comprises 16 3 × 1 vectors in the three-drone case
(Equations 5.10 and 5.11). Julia’s differential Equation solver requires ‘u’ and ‘du’ to be
in flattened form. Rather than flattening the 16 × 3 matrix, array partitions are used. This
allows the individual components to be accessed without having to unflatten parts of the
array. This is particularly useful when rotation matrices, rather than roll-pitch-yaw Euler
angles, are used to define the orientations of the drones and load.

5.3.2 Solver and loss functions


The UDE is solved using DifferentialEquations.jl’s Tsit5 solver. With the adaptive solver,
it was found that timesteps are taken on the order of 0.1 ms, which is orders of magnitude
smaller than the timescale this system is expected to operate on (no less than 1 ms). This
resulted in infeasibly long solve times of around 6 hours for a single 2.5 s trajectory.
Switching to a fixed-timestep solver with a step of 0.1 s gave similar trajectory results
(matching to 2 decimal places) with a solve time of 5 s, so the fixed time-step solver was
used. It is of future interest to compare Julia’s different solvers to verify the correctness
of the solutions and potentially achieve faster solve times in this system.
The loss function for the full UDE is defined as the mean squared error (MSE) between
the full state space’s trajectory data and the full state space predicted by the UDE solver.
Similarly, MSE between the tension data and predicted tension over a trajectory is used
when training the tension-predicting neural network directly.

5.3.3 Training data generation
Section 3.4.1 discusses a low fidelity Julia dynamics simulator. This simulator is used here
to give a full time-history of the state space required to train the neural network embedded
in the full UDE. Further, cable tensions are saved at the same datapoints as the
trajectories. This generates a dataset that allows the neural network to be trained directly,
rather than only when embedded in the UDE.
The load trajectory is selected to be the load moving in a circle of constant 2 m radius
with a period of 2.5 s. The load does not rotate relative to the inertial frame during motion,
thus giving a linear acceleration of constant magnitude directed towards the center of the
revolution, and no angular velocity or acceleration. This is the simplest case containing some varying cable tensions
throughout the trajectory. The load and drones’ linear motions can be found as datapoints
in Figures 5.2a and 5.2b respectively. Similarly, the relative motion of the cable attachment
points and the corresponding cable tensions can be found in Figures 5.3a and 5.3b. All
angular motions (orientation, angular velocity and angular acceleration) of the load and
drones remain at zero for the full extent of the trajectory. This information fully describes
the slung load and drone system trajectory.
As only the drone positions are directly constrained by the cables, data for the drones’
linear velocities and accelerations has to be generated by a backwards finite difference
(otherwise the dynamics could be over-constrained). This gives poor estimates of drone
velocity for the first timestep, and poor estimates of drone acceleration for the first two
timesteps. Rather than hard-coding the initial values for these quantities, the training data
is simply trimmed to exclude the first three timesteps.
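The backward-difference generation and trimming step can be sketched as follows (a Python stand-in; the timestep and sample circular trajectory are illustrative):

```python
import numpy as np

# Backward finite differences for the drone velocities and accelerations, then
# trimming of the unreliable initial samples, as described above.
dt = 0.1
t = np.arange(0.0, 2.5, dt)
pos = np.stack([np.cos(t), np.sin(t), np.ones_like(t)], axis=1)  # toy circular path

vel = np.zeros_like(pos)
vel[1:] = (pos[1:] - pos[:-1]) / dt   # first entry has no history: poor estimate
acc = np.zeros_like(pos)
acc[1:] = (vel[1:] - vel[:-1]) / dt   # first accelerations inherit that error

# Trim the first three timesteps rather than hard-coding initial values.
pos_t, vel_t, acc_t = pos[3:], vel[3:], acc[3:]
```

After trimming, all three arrays align sample-for-sample, and the remaining backward-difference velocities carry only the usual O(dt) truncation error.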

5.4 Results and discussion


Initially, for the reasons discussed in Section 5.2.2, the neural network was trained whilst
embedded in the UDE system. When this did not converge after extensive debugging,
the cable tension-predicting neural network was instead trained directly on the tension data
and the corresponding relative motion of the cable attachment points. The results of
both approaches are discussed below.

5.4.1 Training the tension approximation function embedded in the UDE system
Julia’s Optimization.jl package is used to train the system due to the relative ease of switching
between gradient back ends. This is useful because it was initially intended to use
automatic differentiation (AD) to find the gradients required for training. However, it was
discovered too late that ForwardDiff, ReverseDiff, Zygote and Tracker all have difficulty
with the pre-allocated caches used in a function solved by a UDE solver. As pre-allocated
caches are included in the callable structure that surrounds the UDE system, AD could not
be used without the PreallocationTools.jl package. Implementing this when the issue was
discovered would have taken too long, so it is left for future work.

(a) Linear trajectory data and predictions - load.

Figure 5.2: Training data (points) and post-training predictions (lines) for a three-drone
rigid-body load system. The load travels anti-clockwise in a circle without vertical motion.
The drones remain arranged in a way such that neither the load nor the drones rotate over
the trajectory. The first ‘x’ in the labels represents state while L, 1, 2 & 3 index the load
and the drones. Data is split into x, y and z components relative to the inertial frame as
identified by the trailing ‘x’, ‘y’ or ‘z’. Data points for cables that appear to be missing are
directly under the corresponding data points for another cable.

(b) Linear trajectory data and predictions - drones (continued). Legend corresponds to all graphs
- excluding the dot for the first graph, and adding a dot for the last graph.

Figure 5.2: Continued figures from previous page.

(a) Relative motion of cable attachment points.

Figure 5.3: Cable tension and relative cable attachment point motion data and predictions
from the same trajectory as Figure 5.2. Predictions come from training the neural network
embedded in the UDE.

(b) Cable tensions throughout trajectory.

Figure 5.3: Continued figures from previous page.

Instead, FiniteDiff.jl was used to find the gradient of the loss function (with respect to
neural network parameter changes), using finite differences. This is significantly slower and
less accurate than the ideal source-to-source AD from Zygote.jl.
The tension-predicting neural network is modelled with 1 hidden layer containing 20
nodes. When this is trained using the ADAM optimizer, a learning rate of 0.01 and a
maximum of 1000 iterations, convergence is not reached (see Figure 5.4). Observing the
plateau at the end of the loss history, it was hypothesised that the training stabilizes and
simply needs more iterations to converge; increasing the iteration cap to 10,000, however,
does not help. Increasing the learning rate to 0.02 leads to an even more unstable loss
history, while decreasing it leads to a similarly shallow training gradient to that seen at the
end of Figure 5.4. Increasing the number of hidden layers and hidden nodes does not help
either, and given the results in Section 5.4.2, it is not expected to.

Figure 5.4: Loss history from training the neural network embedded in the UDE.

Comparing the gradient found with the FiniteDiff.jl package against that from the
FiniteDifferences.jl package, a significant discrepancy is found (on the order of 1 – 10%
for different parameters). This is concerning, and suggests that the finite difference methods
may have significant error build-up that prevents accurate gradients. This would
be the leading theory for the lack of convergence if not for the discussion at the end of
Section 5.4.2. It will, however, be addressed when implementing the PreallocationTools fix
discussed above, as AD avoids finite differencing’s error accumulation.
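The kind of step-size-dependent truncation error that can separate two finite-difference implementations is easy to demonstrate on a toy function (a Python sketch, unrelated to the actual UDE loss):

```python
import numpy as np

# Forward- vs central-difference gradients compared against the exact gradient.
# The forward scheme carries O(h) truncation error; central is O(h^2), so two
# finite-difference implementations can legitimately disagree at the percent level.
def f(x):
    return np.sum(np.sin(x) ** 2)

def grad_exact(x):
    return 2.0 * np.sin(x) * np.cos(x)

def grad_fd(x, h, central=False):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h) if central else (f(x + e) - f(x)) / h
    return g

x = np.array([0.3, 1.1, 2.0])
err_fwd = np.max(np.abs(grad_fd(x, 1e-3) - grad_exact(x)))
err_cen = np.max(np.abs(grad_fd(x, 1e-3, central=True) - grad_exact(x)))
# err_fwd exceeds err_cen by orders of magnitude at this step size.
```

In a long UDE solve these per-evaluation errors compound, which is why reverse-mode AD (once the caching issue is fixed) is the preferred route for training gradients.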
Training takes on the order of 10 minutes on the CPU; GPU training was not attempted
due to the lack of initial convergence. Comprehensive benchmarking will be completed once
convergence is achieved and reverse-mode AD can be performed on the UDE (training is
then expected to be significantly faster).
The trajectory predictions using the non-converged neural network tension estimator are
shown in Figure 5.2.

5.4.2 Directly training the tension approximation function

Figure 5.5: Loss history while training the neural network directly.

It could be hypothesised that the tension-approximating neural network is unable to
capture the tension data with only knowledge of the relative motion of the cable
attachment points. To test this hypothesis, the tension-predicting neural network is trained
directly on the cached tension and relative cable attachment point motion data. The neural
network remains the same as in Section 5.4.1 (1 hidden layer with 20 nodes), as do the
hyperparameters. Without the callable structure’s caching causing issues, source-to-source
reverse-mode AD can be used to find the gradients for training. This, combined
with no longer needing to solve the UDE system, results in a training time of under 5 s.
Convergence occurs in < 200 iterations, as shown in Figure 5.5.
Figure 5.6 shows the tension prediction results before and after training, demonstrating
that the neural network can successfully predict cable tensions given knowledge of only the
relative motion of the cable connection points. To cement this proof, the trained neural
network would need to be tested on tension data generated from another trajectory. This
has not yet been performed, as the main purpose of training the neural network directly was
to debug the UDE system; it will, however, be completed in the future.
Even when using this trained neural network, the full UDE is unable to correctly predict
drone and load trajectories. This suggests that the error lies not in the neural network
training, but in the UDE system itself or in the solver. Perhaps the solver is poorly suited
to the task, or the system is chaotic; further investigation into both is required.

It is interesting to note that when warm-starting the UDE system training with the
already-trained neural network parameters, convergence is again not achieved. This is
expected, as it has been determined that an issue exists in the UDE solve which affects the
training process (the UDE is solved when calculating the loss, which is required to train).

(a) Cable tensions throughout trajectory - pre-training.

(b) Cable tensions throughout trajectory - post-training.

Figure 5.6: Cable tension and relative cable attachment point motion data and predictions
from the same trajectory as Figure 5.2. Trained predictions come from training the neural
network directly.

5.5 Conclusions
Universal differential equations (UDEs) provide a new avenue for removing limiting assumptions
in model-based dynamics and controls applications. This Chapter presents a method to
address the massless cable assumption in a multi-quadrotor slung load system. It is shown
that a simple neural network (1 hidden layer with 20 nodes) is capable of predicting cable
tensions in the massless cable case, given only the relative position, velocity and acceleration
of the cable's attachment points. When this neural network is embedded in a full
UDE system to predict load and drone trajectories, however, the estimates diverge wildly.
The methodology nonetheless shows promise and, with more work on selecting UDE
solvers and debugging the UDE implementation, may be able to provide good trajectory
estimations. This work is intended to be continued; should it be completed, it will provide
a practical full-scale example of how UDEs can be successfully used in mechanical systems.

5.6 Future work


Future work includes:

• Debugging the UDE system to achieve accurate trajectory predictions with the trained
neural network. This will involve investigating if the system is chaotic, if the correct
differential Equation solver is being used (including adaptive solver and step size pa-
rameters), and if the rest of the implementation is performing as expected.

• Use Julia’s PreallocationTools.jl to allow automatic differentiation to be used to find
  gradients of the UDE system. This will give faster and more accurate gradients for
  training.

• Investigate if the trained neural network can generalize predictions to unseen trajecto-
ries.

• Train across varied cable masses to capture the non-massless cable case.

• Streamline the implementation to make extensions and maintenance easier.

References

[1] G. D. Padfield and B. Lawrence, “The birth of flight control: An engineering analysis of
the wright brothers’ 1902 glider,” The Aeronautical Journal, vol. 107, no. 1078, pp. 697–
718, Dec. 2003, issn: 0001-9240, 2059-6464. doi: 10.1017/S0001924000013464. url:
https : / / www . cambridge . org / core / product / identifier / S0001924000013464 / type /
journal_article (visited on 06/19/2024).
[2] M. Hassanalian and A. Abdelkefi, “Classifications, applications, and design challenges
of drones: A review,” Progress in Aerospace Sciences, vol. 91, pp. 99–131, May 2017,
issn: 03760421. doi: 10.1016/j.paerosci.2017.04.003. url: https://linkinghub.elsevier.
com/retrieve/pii/S0376042116301348 (visited on 06/19/2024).
[3] J. Han, Z. Hui, F. Tian, and G. Chen, “Review on bio-inspired flight systems and
bionic aerodynamics,” Chinese Journal of Aeronautics, vol. 34, no. 7, pp. 170–186, Jul.
2021, issn: 10009361. doi: 10.1016/j.cja.2020.03.036. url: https://linkinghub.elsevier.
com/retrieve/pii/S1000936120302466 (visited on 06/19/2024).
[4] R. Niemiec and F. Gandhi, “A comparison between quadrotor flight configurations,”
[5] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, “Drones
for disaster response and relief operations: A continuous approximation model,” In-
ternational Journal of Production Economics, vol. 188, pp. 167–184, Jun. 2017, issn:
09255273. doi: 10.1016/j.ijpe.2017.03.024. url: https://linkinghub.elsevier.com/
retrieve/pii/S0925527317301172 (visited on 06/21/2024).
[6] M. A. R. Estrada and A. Ndoma, “The uses of unmanned aerial vehicles –UAV’s-
(or drones) in social logistic: Natural disasters response and humanitarian relief aid,”
Procedia Computer Science, vol. 149, pp. 375–383, 2019, issn: 18770509. doi: 10 .
1016 / j . procs . 2019 . 01 . 151. url: https : / / linkinghub . elsevier . com / retrieve / pii /
S1877050919301589 (visited on 06/21/2024).
[7] C. Wankmüller, M. Kunovjanek, and S. Mayrgündter, “Drones in emergency response
– evidence from cross-border, multi-disciplinary usability tests,” International Journal
of Disaster Risk Reduction, vol. 65, p. 102 567, 2021, issn: 2212-4209. doi: https :
//doi.org/10.1016/j.ijdrr.2021.102567. url: https://www.sciencedirect.com/science/
article/pii/S2212420921005288.
[8] L. D. P. Pugliese, F. Guerriero, and G. Macrina, “Using drones for parcels delivery
process,” Procedia Manufacturing, vol. 42, pp. 488–497, 2020, issn: 23519789. doi:
10.1016/j.promfg.2020.02.043. url: https://linkinghub.elsevier.com/retrieve/pii/
S2351978920305928 (visited on 06/21/2024).

107
[9] T. Benarbia and K. Kyamakya, “A literature review of drone-based package delivery
logistics systems and their implementation feasibility,” Sustainability, vol. 14, no. 1,
p. 360, Dec. 29, 2021, issn: 2071-1050. doi: 10.3390/su14010360. url: https://www.
mdpi.com/2071-1050/14/1/360 (visited on 06/21/2024).
[10] C. Lemardelé, M. Estrada, L. Pagès, and M. Bachofner, “Potentialities of drones and
ground autonomous delivery devices for last-mile logistics,” Transportation Research
Part E: Logistics and Transportation Review, vol. 149, p. 102 325, 2021, issn: 1366-
5545. doi: https://doi.org/10.1016/j.tre.2021.102325. url: https://www.sciencedirect.
com/science/article/pii/S1366554521000995.
[11] S. J. S. Abderahman Rejeb Karim Rejeb and H. Treiblmaier, “Drones for supply chain
management and logistics: A review and research agenda,” International Journal of
Logistics Research and Applications, vol. 26, no. 6, pp. 708–731, 2023, Publisher: Tay-
lor & Francis _eprint: https://doi.org/10.1080/13675567.2021.1981273. doi: 10.1080/
13675567.2021.1981273. url: https://doi.org/10.1080/13675567.2021.1981273.
[12] K. Flemons, B. Baylis, A. Z. Khan, et al., “The use of drones for the delivery of
diagnostic test kits and medical supplies to remote First Nations communities during
COVID-19,” American Journal of Infection Control, vol. 50, no. 8, pp. 849–856, Aug.
2022, issn: 01966553. doi: 10.1016/j.ajic.2022.03.004. url: https://linkinghub.
elsevier.com/retrieve/pii/S0196655322001420 (visited on 06/21/2024).
[13] M. Balasingam, “Drones in medicine-the rise of the machines,” International Journal
of Clinical Practice, vol. 71, no. 9, e12989, Sep. 2017, issn: 13685031. doi: 10.1111/
ijcp.12989. url: https://onlinelibrary.wiley.com/doi/10.1111/ijcp.12989 (visited on
06/21/2024).
[14] A. A. Nyaaba and M. Ayamga, “Intricacies of medical drones in healthcare delivery:
Implications for africa,” Technology in Society, vol. 66, p. 101 624, Aug. 2021, issn:
0160791X. doi: 10.1016/j.techsoc.2021.101624. url: https://linkinghub.elsevier.com/
retrieve/pii/S0160791X21000993 (visited on 06/21/2024).
[15] P. Mahadevan, “The military utility of drones,” in collab. with D. Möckli, ETH
Zurich, Jul. 2010. doi: 10.3929/ETHZ-A-006253833. url: http://hdl.handle.net/20.
500.11850/26193 (visited on 06/21/2024).
[16] N. Sharkey, “The automation and proliferation of military drones and the protection
of civilians,” Law, Innovation and Technology, vol. 3, no. 2, pp. 229–240, 2011. doi:
10.5235/175799611798204914. url: https://doi.org/10.5235/175799611798204914.
[17] P. J. Springer, Military robots and drones: a reference handbook. Bloomsbury Publish-
ing USA, 2013.
[18] A. Gohari, A. B. Ahmad, R. B. A. Rahim, A. S. M. Supa’at, S. Abd Razak, and
M. S. M. Gismalla, “Involvement of surveillance drones in smart cities: A systematic
review,” IEEE Access, vol. 10, pp. 56 611–56 628, 2022, issn: 2169-3536. doi: 10.1109/
ACCESS.2022.3177904. url: https://ieeexplore.ieee.org/document/9781426/ (visited
on 06/21/2024).

[19] S. A. Wich and L. P. Koh, Conservation drones: mapping and monitoring biodiversity.
Oxford University Press, 2018, isbn: 0-19-878761-8.
[20] M. Ruwaimana, B. Satyanarayana, V. Otero, A. M. Muslim, M. Syafiq A., S. Ibrahim,
D. Raymaekers, N. Koedam, and F. Dahdouh-Guebas, “The advantages of using drones
over space-borne imagery in the mapping of mangrove forests,” PLOS ONE, vol. 13,
no. 7, S. Joseph, Ed., e0200288, Jul. 18, 2018, issn: 1932-6203. doi: 10.1371/journal.
pone.0200288. url: https://dx.plos.org/10.1371/journal.pone.0200288 (visited on
06/21/2024).
[21] E. Casella, A. Collin, D. Harris, S. Ferse, S. Bejarano, V. Parravicini, J. L. Hench,
and A. Rovere, “Mapping coral reefs using consumer-grade drones and structure from
motion photogrammetry techniques,” Coral Reefs, vol. 36, no. 1, pp. 269–275, Mar.
2017, issn: 0722-4028, 1432-0975. doi: 10.1007/s00338-016-1522-0. url: http://link.
springer.com/10.1007/s00338-016-1522-0 (visited on 06/21/2024).
[22] J. Paneque-Gálvez, N. Vargas-Ramírez, B. Napoletano, and A. Cummings, “Grassroots
innovation using drones for indigenous mapping and monitoring,” Land, vol. 6, no. 4,
p. 86, Dec. 7, 2017, issn: 2073-445X. doi: 10.3390/land6040086. url: https://www.
mdpi.com/2073-445X/6/4/86 (visited on 06/21/2024).
[23] M. Kulbacki, J. Segen, W. Kniec, R. Klempous, K. Kluwak, J. Nikodem, J. Kulbacka,
and A. Serester, “Survey of drones for agriculture automation from planting to har-
vest,” in 2018 IEEE 22nd International Conference on Intelligent Engineering Systems
(INES), Las Palmas de Gran Canaria: IEEE, Jun. 2018, pp. 000 353–000 358, isbn: 978-
1-5386-1122-7. doi: 10.1109/INES.2018.8523943. url: https://ieeexplore.ieee.org/
document/8523943/ (visited on 06/21/2024).
[24] J. J. Cancela, X. P. González, M. Vilanova, and J. M. Mirás-Avalos, “Water manage-
ment using drones and satellites in agriculture,” Water, vol. 11, no. 5, p. 874, Apr. 26,
2019, issn: 2073-4441. doi: 10.3390/w11050874. url: https://www.mdpi.com/2073-
4441/11/5/874 (visited on 06/21/2024).
[25] A. Rejeb, A. Abdollahi, K. Rejeb, and H. Treiblmaier, “Drones in agriculture: A review
and bibliometric analysis,” Computers and Electronics in Agriculture, vol. 198,
p. 107017, Jul. 2022, issn: 01681699. doi: 10.1016/j.compag.2022.107017. url: https:
//linkinghub.elsevier.com/retrieve/pii/S0168169922003349 (visited on 06/21/2024).
[26] S. Meivel and S. Maheswari, “Remote sensing analysis of agricultural drone,” Journal
of the Indian Society of Remote Sensing, vol. 49, no. 3, pp. 689–701, Mar. 2021, issn:
0255-660X, 0974-3006. doi: 10.1007/s12524-020-01244-y. url: http://link.springer.
com/10.1007/s12524-020-01244-y (visited on 06/21/2024).
[27] J. F. Falorca, J. P. N. D. Miraldes, and J. C. G. Lanzinha, “New trends in visual
inspection of buildings and structures: Study for the use of drones,” Open Engineering,
vol. 11, no. 1, pp. 734–743, 2021. doi: 10.1515/eng-2021-0071. url: https:
//doi.org/10.1515/eng-2021-0071 (visited on 06/21/2024).

[28] T. Umar, “Applications of drones for safety inspection in the gulf cooperation council
construction,” Engineering, Construction and Architectural Management, vol. 28, no. 9,
pp. 2337–2360, Jan. 1, 2021, Publisher: Emerald Publishing Limited, issn: 0969-9988.
doi: 10.1108/ECAM-05-2020-0369. url: https://doi.org/10.1108/ECAM-05-2020-
0369 (visited on 06/21/2024).
[29] H. Liang, S.-C. Lee, W. Bae, J. Kim, and S. Seo, “Towards UAVs in construction: Ad-
vancements, challenges, and future directions for monitoring and inspection,” Drones,
vol. 7, no. 3, p. 202, Mar. 15, 2023, issn: 2504-446X. doi: 10.3390/drones7030202.
url: https://www.mdpi.com/2504-446X/7/3/202 (visited on 06/21/2024).
[30] M. Rossi, D. Brunelli, A. Adami, L. Lorenzelli, F. Menna, and F. Remondino, “Gas-
drone: Portable gas sensing system on UAVs for gas leakage localization,” in IEEE
SENSORS 2014 Proceedings, Valencia, Spain: IEEE, Nov. 2014, pp. 1431–1434, isbn:
978-1-4799-0162-3. doi: 10.1109/ICSENS.2014.6985282. url: https://ieeexplore.ieee.
org/document/6985282 (visited on 06/21/2024).
[31] M. Rossi and D. Brunelli, “Gas sensing on unmanned vehicles: Challenges and op-
portunities,” in 2017 New Generation of CAS (NGCAS), Genova, Italy: IEEE, Sep.
2017, pp. 117–120, isbn: 978-1-5090-6447-2. doi: 10.1109/NGCAS.2017.58. url:
http://ieeexplore.ieee.org/document/8052283/ (visited on 06/21/2024).
[32] P. Tosato, D. Facinelli, M. Prada, L. Gemma, M. Rossi, and D. Brunelli, “An au-
tonomous swarm of drones for industrial gas sensing applications,” in 2019 IEEE 20th
International Symposium on "A World of Wireless, Mobile and Multimedia Networks"
(WoWMoM), Washington, DC, USA: IEEE, Jun. 2019, pp. 1–6, isbn: 978-1-72810-
270-2. doi: 10.1109/WoWMoM.2019.8793043. url: https://ieeexplore.ieee.org/
document/8793043/ (visited on 06/21/2024).
[33] United Nations Framework Convention on Climate Change, Paris agreement, Dec. 12,
2015. url: https://unfccc.int/sites/default/files/english_paris_agreement.pdf
(visited on 06/21/2024).
[34] United States Environmental Protection Agency. “Global greenhouse gas overview.”
(Apr. 11, 2024), url: https://www.epa.gov/ghgemissions/global-greenhouse-gas-
overview (visited on 06/21/2024).
[35] Z. Q. Huang, “Assessing bovine methane emissions: Respiratory simulation and optical
gas imaging methods,” Sep. 2023.
[36] M. Bisgaard, A. La Cour-Harbo, and J. Dimon Bendtsen, “Adaptive control system for
autonomous helicopter slung load operations,” Control Engineering Practice, vol. 18,
no. 7, pp. 800–811, Jul. 2010, issn: 09670661. doi: 10.1016/j.conengprac.2010.01.017.
url: https://linkinghub.elsevier.com/retrieve/pii/S0967066110000341 (visited on
11/08/2023).
[37] Tan Feng, Lian Qi, Liu Chang-liang, and Jin Bing-kun, “Measurement of downwash
velocity generated by rotors of agriculture drones,” INMATEH - Agricultural
Engineering, vol. 55, no. 2, pp. 141–150, May 2018, issn: 20684215. url: https://search.
ebscohost.com/login.aspx?direct=true&db=a9h&AN=131602871&site=ehost-
live&scope=site&authtype=shib&custid=s8978330.
[38] S. Franklin and A. Graesser, “Is it an agent, or just a program?: A taxonomy for
autonomous agents,” in Intelligent Agents III Agent Theories, Architectures, and Lan-
guages, J. P. Müller, M. J. Wooldridge, and N. R. Jennings, Eds., red. by J. G. Car-
bonell, J. Siekmann, G. Goos, J. Hartmanis, and J. Van Leeuwen, vol. 1193, Series
Title: Lecture Notes in Computer Science, Berlin, Heidelberg: Springer Berlin Hei-
delberg, 1997, pp. 21–35, isbn: 978-3-540-62507-0 978-3-540-68057-4. doi: 10.1007/
BFb0013570. url: http : / / link . springer . com / 10 . 1007 / BFb0013570 (visited on
06/25/2024).
[39] I. Maza, A. Ollero, E. Casado, and D. Scarlatti, “Classification of multi-UAV ar-
chitectures,” in Handbook of Unmanned Aerial Vehicles, K. P. Valavanis and G. J.
Vachtsevanos, Eds., Dordrecht: Springer Netherlands, 2015, pp. 953–975, isbn: 978-
90-481-9706-4 978-90-481-9707-1. doi: 10.1007/978-90-481-9707-1_119. url: https:
//link.springer.com/10.1007/978-90-481-9707-1_119 (visited on 06/08/2023).
[40] A. J. C. Sharkey, “Robots, insects and swarm intelligence,” Artificial Intelligence Re-
view, vol. 26, no. 4, pp. 255–268, Dec. 2006, issn: 0269-2821, 1573-7462. doi: 10.1007/
s10462-007-9057-y. url: http://link.springer.com/10.1007/s10462-007-9057-y (visited
on 06/25/2024).
[41] A. Tsiamis, C. K. Verginis, C. P. Bechlioulis, and K. J. Kyriakopoulos, “Coopera-
tive manipulation exploiting only implicit communication,” in 2015 IEEE/RSJ In-
ternational Conference on Intelligent Robots and Systems (IROS), Hamburg, Ger-
many: IEEE, Sep. 2015, pp. 864–869, isbn: 978-1-4799-9994-1. doi: 10.1109/IROS.
2015 . 7353473. url: https : / / ieeexplore . ieee . org / document / 7353473/ (visited on
06/27/2024).
[42] W. Ren and R. Beard, “Virtual structure based spacecraft formation control with for-
mation feedback,” in AIAA Guidance, Navigation, and Control Conference and Exhibit,
Monterey, California: American Institute of Aeronautics and Astronautics, Aug. 5,
2002, isbn: 978-1-62410-108-3. doi: 10.2514/6.2002-4963. url: https://arc.aiaa.org/
doi/10.2514/6.2002-4963 (visited on 06/07/2023).
[43] W. Ren, “Decentralization of virtual structures in formation control of multiple vehicle
systems via consensus strategies,” European Journal of Control, vol. 14, no. 2, pp. 93–
103, Jan. 2008, issn: 09473580. doi: 10.3166/ejc.14.93-103. url: https://linkinghub.
elsevier.com/retrieve/pii/S0947358008707507 (visited on 06/07/2023).
[44] P. F. Sheridan, “Feasibility study for multiple helicopter heavy lift systems,” Vertol
Aircraft Corporation, Vertol Report R-136, Oct. 1957.
[45] M. Mittal, J. V. R. Prasad, and D. P. Schrage, “Comparison of stability and control
characteristics of two twin-lift helicopter configurations,” Nonlinear Dynamics, vol. 3,
no. 3, pp. 199–223, 1992, issn: 0924-090X, 1573-269X. doi: 10.1007/BF00122302. url:
http://link.springer.com/10.1007/BF00122302 (visited on 06/27/2024).

[46] M. Mittal, J. V. R. Prasad, and D. P. Schrage, “Nonlinear adaptive control of a twin
lift helicopter system,” in 1990 American Control Conference, San Diego, CA, USA:
IEEE, May 1990, pp. 743–748. doi: 10 . 23919 / ACC . 1990 . 4790832. url: https : / /
ieeexplore.ieee.org/document/4790832/ (visited on 06/27/2024).
[47] H. Reynolds and A. Rodriguez, “H/sup infinity / control of a twin lift helicopter
system,” in [1992] Proceedings of the 31st IEEE Conference on Decision and Control,
Tucson, AZ, USA: IEEE, 1992, pp. 2442–2447, isbn: 978-0-7803-0872-5. doi: 10.1109/
CDC.1992.371352. url: http://ieeexplore.ieee.org/document/371352/ (visited on
06/27/2024).
[48] M. Bernard and K. Kondak, “Generic slung load transportation system using small
size helicopters,” in 2009 IEEE International Conference on Robotics and Automa-
tion, Kobe: IEEE, May 2009, pp. 3258–3264, isbn: 978-1-4244-2788-8. doi: 10.1109/
ROBOT.2009.5152382. url: http://ieeexplore.ieee.org/document/5152382/ (visited
on 06/27/2024).
[49] I. Maza, K. Kondak, M. Bernard, and A. Ollero, “Multi-UAV cooperation and control
for load transportation and deployment,” Journal of Intelligent and Robotic Systems,
vol. 57, no. 1, pp. 417–449, Jan. 2010, issn: 0921-0296, 1573-0409. doi: 10.1007/s10846-
009-9352-8. url: http://link.springer.com/10.1007/s10846-009-9352-8 (visited on
09/23/2022).
[50] J. Fink, N. Michael, S. Kim, and V. Kumar, “Planning and control for cooperative
manipulation and transportation with aerial robots,” The International Journal of
Robotics Research, vol. 30, no. 3, pp. 324–334, Mar. 2011, issn: 0278-3649, 1741-3176.
doi: 10.1177/0278364910382803. url: http://journals.sagepub.com/doi/10.1177/
0278364910382803 (visited on 09/28/2022).
[51] N. Michael, J. Fink, and V. Kumar, “Cooperative manipulation and transportation
with aerial robots,” Autonomous Robots, vol. 30, no. 1, pp. 73–86, Jan. 2011, issn:
0929-5593, 1573-7527. doi: 10.1007/s10514-010-9205-0. url: http://link.springer.
com/10.1007/s10514-010-9205-0 (visited on 09/28/2022).
[52] T. Lee, K. Sreenath, and V. Kumar, “Geometric control of cooperating multiple
quadrotor UAVs with a suspended payload,” in 52nd IEEE Conference on Decision and
Control, Firenze: IEEE, Dec. 2013, pp. 5510–5515, isbn: 978-1-4673-5717-3 978-1-4673-
5714-2 978-1-4799-1381-7. doi: 10.1109/CDC.2013.6760757. url: http://ieeexplore.
ieee.org/document/6760757/ (visited on 06/06/2023).
[53] K. Sreenath and V. Kumar, “Dynamics, control and planning for cooperative manip-
ulation of payloads suspended by cables from multiple quadrotor robots,” in Robotics:
Science and Systems IX, Robotics: Science and Systems Foundation, Jun. 23, 2013,
isbn: 978-981-07-3937-9. doi: 10 . 15607 / RSS . 2013 . IX . 011. url: http : / / www .
roboticsproceedings.org/rss09/p11.pdf (visited on 09/27/2022).
[54] G. Wu and K. Sreenath, “Geometric control of multiple quadrotors transporting a
rigid-body load,” in 53rd IEEE Conference on Decision and Control, Los Angeles,
CA, USA: IEEE, Dec. 2014, pp. 6141–6148, isbn: 978-1-4673-6090-6 978-1-4799-7746-

8 978-1-4799-7745-1. doi: 10.1109/CDC.2014.7040351. url: http://ieeexplore.ieee.
org/document/7040351/ (visited on 04/12/2023).
[55] S. Tang, K. Sreenath, and V. Kumar, “Multi-robot trajectory generation for an aerial
payload transport system,” in Robotics Research, N. M. Amato, G. Hager, S. Thomas,
and M. Torres-Torriti, Eds., vol. 10, Series Title: Springer Proceedings in Advanced
Robotics, Cham: Springer International Publishing, 2020, pp. 1055–1071, isbn: 978-
3-030-28618-7 978-3-030-28619-4. doi: 10.1007/978-3-030-28619-4_70. url: http:
//link.springer.com/10.1007/978-3-030-28619-4_70 (visited on 09/28/2022).
[56] G. Li, X. Liu, and G. Loianno, “RotorTM: A flexible simulator for aerial transportation
and manipulation,” IEEE Transactions on Robotics, vol. 40, pp. 831–850, 2024, issn:
1552-3098, 1941-0468. doi: 10.1109/TRO.2023.3336320. url: https://ieeexplore.ieee.
org/document/10328685/ (visited on 06/28/2024).
[57] E. Bulka, C. He, J. Wehbeh, and I. Sharf, “Experiments on collaborative transport
of cable-suspended payload with quadrotor UAVs,” in 2022 International Conference
on Unmanned Aircraft Systems (ICUAS), Dubrovnik, Croatia: IEEE, Jun. 21, 2022,
pp. 1465–1473, isbn: 978-1-66540-593-5. doi: 10 . 1109 / ICUAS54217 . 2022 . 9836163.
url: https://ieeexplore.ieee.org/document/9836163/ (visited on 09/22/2022).
[58] N. Ullah, I. Sami, W. Shaoping, H. Mukhtar, X. Wang, M. Shahariar Chowdhury,
and K. Techato, “A computationally efficient adaptive robust control scheme for a
quad-rotor transporting cable-suspended payloads,” Proceedings of the Institution of
Mechanical Engineers, Part G: Journal of Aerospace Engineering, vol. 236, no. 2,
pp. 379–395, Feb. 2022, issn: 0954-4100, 2041-3025. doi: 10.1177/09544100211013617.
url: http://journals.sagepub.com/doi/10.1177/09544100211013617 (visited on
09/22/2022).
[59] U. C. Kaya and K. Subbarao, “Momentum preserving simulation of cooperative mul-
tirotors with flexible-cable suspended payload,” Journal of Dynamic Systems, Mea-
surement, and Control, vol. 144, no. 4, p. 041007, Apr. 1, 2022, issn: 0022-0434,
1528-9028. doi: 10.1115/1.4053343. url: https://asmedigitalcollection.asme.org/
dynamicsystems/article/144/4/041007/1130980/Momentum-Preserving-Simulation-
of-Cooperative (visited on 09/22/2022).
[60] K. Mohammadi, S. Sirouspour, and A. Grivani, “Control of multiple quad-copters with
a cable-suspended payload subject to disturbances,” IEEE/ASME Transactions on
Mechatronics, vol. 25, no. 4, pp. 1709–1718, Aug. 2020, issn: 1083-4435, 1941-014X.
doi: 10.1109/TMECH.2020.2995138. url: https://ieeexplore.ieee.org/document/
9094322/ (visited on 09/22/2022).
[61] K. Mohammadi, S. Sirouspour, and A. Grivani, “Passivity-based control of multiple
quadrotors carrying a cable-suspended payload,” IEEE/ASME Transactions on Mecha-
tronics, vol. 27, no. 4, pp. 2390–2400, Aug. 2022, issn: 1083-4435, 1941-014X. doi:
10.1109/TMECH.2021.3102522. url: https://ieeexplore.ieee.org/document/9508879/
(visited on 09/22/2022).

[62] H. M. Omar, R. Akram, S. M. Mukras, and A. A. Mahvouz, “Recent advances and
challenges in controlling quadrotors with suspended loads,” Alexandria Engineering
Journal, S1110016822005245, Aug. 2022, issn: 11100168. doi: 10.1016/j.aej.2022.08.
001. url: https://linkinghub.elsevier.com/retrieve/pii/S1110016822005245 (visited on
09/22/2022).
[63] Holybro. “Sample wiring diagram,” Holybro Docs. (Jan. 2024), url: https://docs.
holybro.com/autopilot/pixhawk-6x/sample-wiring-diagram.
[64] F. Furrer, M. Burri, M. Achtelik, and R. Siegwart, “RotorS—a modular gazebo MAV
simulator framework,” in Robot Operating System (ROS), A. Koubaa, Ed., vol. 625,
Series Title: Studies in Computational Intelligence, Cham: Springer International Pub-
lishing, 2016, pp. 595–625, isbn: 978-3-319-26052-5 978-3-319-26054-9. doi: 10.1007/
978-3-319-26054-9_23. url: http://link.springer.com/10.1007/978-3-319-26054-9_23
(visited on 06/28/2024).
[65] F. Okoli, Y. Lang, O. Kermorgant, and S. Caro, “Cable-driven parallel robot simulation
using gazebo and ROS,” in ROMANSY 22 – Robot Design, Dynamics and Control,
V. Arakelian and P. Wenger, Eds., vol. 584, Series Title: CISM International Centre
for Mechanical Sciences, Cham: Springer International Publishing, 2019, pp. 288–295,
isbn: 978-3-319-78962-0 978-3-319-78963-7. doi: 10.1007/978-3-319-78963-7_37. url:
http://link.springer.com/10.1007/978-3-319-78963-7_37 (visited on 06/28/2024).
[66] PX4-autopilot, version 1.14, May 24, 2024. url: https://github.com/PX4/PX4-
Autopilot.
[67] V. Prkacin, I. Palunko, and I. Petrovic, “State and parameter estimation of suspended
load using quadrotor onboard sensors,” in 2020 International Conference on Unmanned
Aircraft Systems (ICUAS), Athens, Greece: IEEE, Sep. 2020, pp. 958–967, isbn: 978-
1-72814-278-4. doi: 10.1109/ICUAS48674.2020.9213840. url: https://ieeexplore.ieee.
org/document/9213840/ (visited on 11/08/2023).
[68] G. Yu, W. Xie, D. Cabecinhas, R. Cunha, and C. Silvestre, “Adaptive control with un-
known mass estimation for a quadrotor-slung-load system,” ISA Transactions, vol. 133,
pp. 412–423, Feb. 2023, issn: 00190578. doi: 10.1016/j.isatra.2022.06.036. url: https:
//linkinghub.elsevier.com/retrieve/pii/S0019057822003421 (visited on 03/23/2023).
[69] M. Bisgaard, A. La Cour-Harbo, E. N. Johnson, and J. D. Bendtsen, “Vision aided
state estimator for helicopter slung load system,” IFAC Proceedings Volumes, vol. 40,
no. 7, pp. 425–430, 2007, issn: 14746670. doi: 10.3182/20070625-5-FR-2916.00073.
url: https://linkinghub.elsevier.com/retrieve/pii/S1474667015332791 (visited on
11/08/2023).
[70] S. Tang, V. Wuest, and V. Kumar, “Aggressive flight with suspended payloads using
vision-based control,” IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 1152–
1159, Apr. 2018, issn: 2377-3766, 2377-3774. doi: 10.1109/LRA.2018.2793305. url:
http://ieeexplore.ieee.org/document/8258883/ (visited on 09/28/2022).

[71] J. J. Naundrup, J. D. Bendtsen, and A. La Cour-Harbo, “Vision-based parameter
estimation of a slung load,” in 2024 International Conference on Unmanned Aircraft
Systems (ICUAS), Chania - Crete, Greece: IEEE, Jun. 4, 2024, pp. 1216–1223, isbn:
9798350357882. doi: 10.1109/ICUAS60882.2024.10556976. url: https://ieeexplore.
ieee.org/document/10556976/ (visited on 07/01/2024).
[72] A. E. V. Moreno, “Machine learning techniques to estimate the dynamics of a slung
load multirotor UAV system,” 2017.
[73] K. Sreenath, N. Michael, and V. Kumar, “Trajectory generation and control of a
quadrotor with a cable-suspended load - a differentially-flat hybrid system,” in 2013
IEEE International Conference on Robotics and Automation, Karlsruhe, Germany:
IEEE, May 2013, pp. 4888–4895, isbn: 978-1-4673-5643-5 978-1-4673-5641-1. doi:
10.1109/ICRA.2013.6631275. url: http://ieeexplore.ieee.org/document/6631275/
(visited on 09/27/2022).
[74] A. Faust, I. Palunko, P. Cruz, R. Fierro, and L. Tapia, “Automated aerial suspended
cargo delivery through reinforcement learning,” Artificial Intelligence, vol. 247, pp. 381–
398, Jun. 2017, issn: 00043702. doi: 10.1016/j.artint.2014.11.009. url: https:
//linkinghub.elsevier.com/retrieve/pii/S0004370214001416 (visited on 03/23/2023).
[75] X. Li, J. Zhang, and J. Han, “Trajectory planning of load transportation with multi-
quadrotors based on reinforcement learning algorithm,” Aerospace Science and Tech-
nology, vol. 116, p. 106 887, Sep. 2021, issn: 12709638. doi: 10.1016/j.ast.2021.106887.
url: https : / / linkinghub . elsevier . com / retrieve / pii / S1270963821003977 (visited on
03/23/2023).
[76] M. Fliess, J. Lévine, P. Martin, and P. Rouchon, “Flatness and defect
of non-linear systems: Introductory theory and examples,” International Journal of
Control, vol. 61, no. 6, pp. 1327–1361, 1995. doi: 10.1080/00207179508921959. url:
https://doi.org/10.1080/00207179508921959.
[77] K. K. Dhiman, M. Kothari, and A. Abhishek, “Autonomous load control and trans-
portation using multiple quadrotors,” Journal of Aerospace Information Systems, vol. 17,
no. 8, pp. 417–435, Aug. 2020, issn: 2327-3097. doi: 10.2514/1.I010787. url: https:
//arc.aiaa.org/doi/10.2514/1.I010787 (visited on 09/22/2022).
[78] Y. Song, A. Romero, M. Müller, V. Koltun, and D. Scaramuzza, “Reaching the limit in
autonomous racing: Optimal control versus reinforcement learning,” Science Robotics,
vol. 8, no. 82, eadg1462, Sep. 13, 2023, issn: 2470-9476. doi: 10 . 1126 / scirobotics .
adg1462. url: https://www.science.org/doi/10.1126/scirobotics.adg1462 (visited on
07/03/2024).
[79] J. Stillwell, Naive Lie Theory (Undergraduate Texts in Mathematics). New York, NY:
Springer New York, 2008, isbn: 978-0-387-78214-0 978-0-387-78215-7. doi: 10.1007/
978-0-387-78214-0. url: https://link.springer.com/10.1007/978-0-387-78214-0
(visited on 05/02/2023).

[80] J. Solà, J. Deray, and D. Atchuthan, A micro lie theory for state estimation in robotics,
Dec. 8, 2021. arXiv: 1812.01537[cs]. url: http://arxiv.org/abs/1812.01537 (visited on
12/15/2023).
[81] N. Lv, J. Liu, H. Xia, J. Ma, and X. Yang, “A review of techniques for modeling flexible
cables,” Computer-Aided Design, vol. 122, p. 102826, May 2020, issn: 00104485. doi:
10.1016/j.cad.2020.102826. url: https://linkinghub.elsevier.com/retrieve/pii/
S0010448520300191 (visited on 05/14/2023).
[82] S. Li and D. Zanotto, “Multi-UAV cooperative transportation using dynamic control
allocation and a reinforcement learning compensator,” in Volume 9: 17th International
Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC), Vir-
tual, Online: American Society of Mechanical Engineers, Aug. 17, 2021, V009T09A034,
isbn: 978-0-7918-8546-8. doi: 10.1115/DETC2021-71797. url: https://asmedigitalcollection.
asme.org/IDETC-CIE/proceedings/IDETC-CIE2021/85468/V009T09A034/1128222
(visited on 03/23/2023).
[83] F. Panetsos, G. C. Karras, and K. J. Kyriakopoulos, “A deep reinforcement learning
motion control strategy of a multi-rotor UAV for payload transportation with minimum
swing,” in 2022 30th Mediterranean Conference on Control and Automation (MED),
Vouliagmeni, Greece: IEEE, Jun. 28, 2022, pp. 368–374, isbn: 978-1-66540-673-4. doi:
10.1109/MED54222.2022.9837220. url: https://ieeexplore.ieee.org/document/
9837220/ (visited on 03/23/2023).
[84] G. Serrano, M. Jacinto, J. Ribeiro-Gomes, J. Pinto, B. J. Guerreiro, A. Bernardino, and
R. Cunha, Physics-informed neural network for multirotor slung load systems modeling,
May 15, 2024. arXiv: 2405.09428[cs]. url: http://arxiv.org/abs/2405.09428 (visited
on 07/05/2024).
[85] PX4 autopilot user guide. url: https://docs.px4.io/main/en/ (visited on 08/07/2024).
[86] D. Brescianini, M. Hehn, and R. D’Andrea, “Nonlinear quadrocopter attitude control:
Technical report,” ETH Zurich, 2013. doi: 10.3929/ETHZ-A-009970340. url:
http://hdl.handle.net/20.500.11850/154099 (visited on 07/05/2024).
[87] R. Tedrake, “Underactuated robotics,” Course Notes for MIT 6.832, 2023. url:
https://underactuated.csail.mit.edu.
[88] D. Janos and P. Kuras, “Evaluation of low-cost GNSS receiver under demanding con-
ditions in RTK network mode,” Sensors, vol. 21, no. 16, p. 5552, Aug. 18, 2021, issn:
1424-8220. doi: 10.3390/s21165552. url: https://www.mdpi.com/1424-8220/21/16/
5552 (visited on 08/17/2024).
[89] Luca Carlone. “Image formation.” (2021), url: https://vnav.mit.edu/lectures.html
(visited on 08/20/2024).
[90] Itseez. “The OpenCV reference manual.” (Apr. 2014), (visited on 08/08/2024).

[91] E. Marchand, H. Uchiyama, and F. Spindler, “Pose estimation for augmented reality:
A hands-on survey,” IEEE Transactions on Visualization and Computer Graphics,
vol. 22, no. 12, pp. 2633–2651, Dec. 1, 2016, issn: 1077-2626. doi: 10.1109/TVCG.2015.
2513408. url: http://ieeexplore.ieee.org/document/7368948/ (visited on 12/15/2023).
[92] Xiao-Shan Gao, Xiao-Rong Hou, Jianliang Tang, and Hang-Fei Cheng, “Complete
solution classification for the perspective-three-point problem,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 25, no. 8, pp. 930–943, Aug. 2003,
issn: 0162-8828. doi: 10.1109/TPAMI.2003.1217599. url: http://ieeexplore.ieee.org/
document/1217599/ (visited on 12/15/2023).
[93] T. Collins and A. Bartoli, “Infinitesimal plane-based pose estimation,” International
Journal of Computer Vision, vol. 109, no. 3, pp. 252–286, Sep. 2014, issn: 0920-5691,
1573-1405. doi: 10.1007/s11263-014-0725-5. url: http://link.springer.com/10.1007/
s11263-014-0725-5 (visited on 12/15/2023).
[94] E. Eade, “Gauss-newton / levenberg-marquardt optimization,” 2013.
[95] S. J. Julier and J. K. Uhlmann, “A new extension of the kalman filter to nonlinear
systems,” 1997.
[96] C. Rackauckas, Y. Ma, J. Martensen, C. Warner, K. Zubov, R. Supekar, D. Skinner,
A. Ramadhan, and A. Edelman, Universal differential equations for scientific machine
learning, Nov. 2, 2021. arXiv: 2001.04385[cs,math,q-bio,stat]. url: http://arxiv.org/
abs/2001.04385 (visited on 03/24/2023).

Appendix A

Code Repositories

• Multi-drone slung load carrying core software: https://github.com/hmer101/multi_drone_slung_load

• Multi-drone slung load carrying service and message definitions: https://github.com/hmer101/multi_drone_slung_load_interfaces/tree/main

• PX4 autopilot fork: https://github.com/hmer101/PX4-Autopilot

• Real-world on-drone code: https://github.com/hmer101/drone_offboard_computer

• Miscellaneous setup files for the FMU, RC, simulator etc.: https://github.com/hmer101/drone_misc

• Trajectory generation and LQR control in Drake: https://github.com/hmer101/6.8210_drake_slung_load

• Kalman-filter-based slung load pose estimator: https://github.com/hmer101/slung_pose_estimation

• Julia simulator and cable UDE system: https://github.com/hmer101/18.337_ude_cables

Appendix B

Frame Transform Full Tree

Figure B.1 shows the full TF tree used in a high-fidelity Gazebo simulation. All body-
fixed frames are defined relative to initial (‘init’) frames with an orientation fixed in the
east-north-up (ENU) convention.
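For intuition, the ENU convention used for these frames differs from the north-east-down (NED) convention that PX4 uses for its local frame only by an axis swap and a sign flip. The following minimal sketch illustrates the mapping; it is not taken from the thesis code, and the function names are hypothetical:

```python
# Hypothetical helpers illustrating the ENU <-> NED relationship between
# ROS 2 / tf2 frames (east-north-up) and PX4's local frame (north-east-down).
# Illustrative only -- not the thesis implementation.

def enu_to_ned(v_enu):
    """Map (east, north, up) to (north, east, down): swap x/y, negate z."""
    e, n, u = v_enu
    return (n, e, -u)

def ned_to_enu(v_ned):
    """Inverse mapping; the axis permutation is its own inverse."""
    n, e, d = v_ned
    return (e, n, -d)

# A point 1 m east, 2 m north, 3 m up:
print(enu_to_ned((1.0, 2.0, 3.0)))  # (2.0, 1.0, -3.0)
```

The same swap-and-negate pattern applies to velocities and accelerations; orientations additionally require the corresponding frame rotation.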

Figure B.1: A full TF tree generated by ROS2’s tf2_tools package.
Appendix C

Testing Videos

• Drake dynamics and control simulation: https://youtu.be/NR5nPiXb5wU

• Gazebo simulation: https://youtu.be/L-jx-SenBfA

• Real world flight: https://youtu.be/4FRBUPgz-X4

• Testing rig functioning: https://bit.ly/3Xrp9rz

Appendix D

Onshape CAD Documents

• Slung load: https://bit.ly/471B8il

Figure D.1: Screenshot of the slung load CAD model in Onshape, roughly 0.3 m × 0.3 m ×
0.2 m. The lighter variant removes the bottom tier, the additional weights, and the u-bolts.

• Testing rig: https://bit.ly/3z2u6gN

Figure D.2: Screenshot of the testing rig frame CAD model in Onshape in its 1 m3 configu-
ration.

• On-drone camera mount: https://bit.ly/3T1c4SU

Figure D.3: Screenshot of the 170 mm long Intel Realsense camera mount CAD model in
Onshape.

Appendix E

Bill of Materials

Item                                                Cost ($)  Quantity  Total cost ($)
8020 aluminium t-slot profile extrusion: 30-3030      400.82         1          400.82
Frame connecting plates                                 1.69        48           81.28
8020 frame angle pieces: 30-4332                        4.23        24          101.44
8020 t-nuts: 13024                                      2.30       180          414.00
Cap screw, M6 × 1.00 mm, 15 mm long, 10pk               7.30        13           94.90
8020 40-to-40 series roller wheel: 40-2290             23.59         8          188.72
Aluminium rolling wheel holder - custom 3D print        0.82         2            1.64
4" caster wheels - 4pk                                 25.99         1           25.99
Dial-a-distance retractable dog leash, 0–15 ft         34.99         2           69.98
Rabbitgoo dog harness, small                           19.98         1           19.98
Leash holder - custom 3D print                          2.09         2            4.18
Total                                                                          1402.93

Table E.1: Bill of materials with approximate costs for the drone testing apparatus. All
costs are in USD.

Item                                                 Cost ($)  Quantity  Total cost ($)
Drones
Holybro X500 v2 ARF kit                                260.99         3          782.97
Holybro Pixhawk 6X v3 set                              317.99         4         1271.96
Holybro H-RTK F9P GNSS - rover                         296.99         4         1187.96
Holybro power distribution board                         8.59         4           34.36
14.8 V 4S 5500 mAh 70C LiPo battery                     57.49         4          229.96
Radiomaster R81 RC receiver                             17.99         3           53.97
SiK telemetry radio v3 - 915 MHz, 100 mW                58.99         4          235.96
RJ45 Ethernet splitter, 1-to-2, 100 Mbps                 9.99         4           39.96
Intel RealSense D435                                   297.13         3          891.39
RealSense camera mounts - 3D printed                     1.20         3            3.60
Drones + load
Raspberry Pi (RPi) 5                                    80.00         4          320.00
RPi 5 fans                                               5.00         4           20.00
MicroSD card, 128 GB, class 10, 200/90 MB/s             18.45         4           73.80
UBEC SMPS 5 V/6 V, 5 A continuous, 7.5 A burst           7.99         4           31.96
4-outlet, USB-powered Ethernet hub                      31.99         1           31.99
Ethernet cables - Cat 6, 10 Gbps, 1 ft, 5pk              9.99         1            9.99
Ethernet cables - Cat 6, 10 Gbps, 14 ft                  9.49         3           28.47
Load
Kevlar rope with nylon sheath - 8 m                     39.99         1           39.99
2 mm clear acrylic sheets - 2pk                         31.99         1           31.99
6-pack tri-glide plastic buckles (3/4" combo set)        9.59         1            9.59
100 mm bolts + nuts + washers + nylon spacers          125.39         1          125.39
Auxiliary
Holybro H-RTK F9P RTK GNSS - base                      325.99         1          325.99
Radiomaster TX16S II RC                                209.99         3          629.97
TX16S battery, 5000 mAh 2S 7.4 V                        24.99         3           74.97
Total                                                                           6486.19

Table E.2: Bill of materials with approximate costs for the full multi-drone slung load system.
All costs are in USD.
