Imaging and Sensing For Unmanned Aircraft Systems Control and Performance (Control, Robotics and Sensors) by Vania V. Estrela (Editor), Jude Hemanth (Editor), Osamu Saotome (Editor), George Nikolakopoulos (Editor) and Roberto Sabatini (Editor)
The use of sensors has increased dramatically in all industries. They are fundamental in a wide
range of applications from communication to monitoring, remote operation, process control,
precision and safety, and robotics and automation. These developments have brought new
challenges such as demands for robustness and reliability in networks, security in the
communications interface, and close management of energy consumption. This Book Series
covers the research and applications of sensor technologies in the fields of ICTs, security,
tracking, detection, monitoring, control and automation, robotics, machine learning, smart
technologies, production and manufacturing, photonics, environment, energy, and transport.
Book Series Editorial Board
Edited by
Vania V. Estrela, Jude Hemanth, Osamu Saotome,
George Nikolakopoulos and Roberto Sabatini
This publication is copyright under the Berne Convention and the Universal Copyright
Convention. All rights reserved. Apart from any fair dealing for the purposes of research
or private study, or criticism or review, as permitted under the Copyright, Designs and
Patents Act 1988, this publication may be reproduced, stored or transmitted, in any
form or by any means, only with the prior permission in writing of the publishers, or in
the case of reprographic reproduction in accordance with the terms of licences issued
by the Copyright Licensing Agency. Enquiries concerning reproduction outside those
terms should be sent to the publisher at the undermentioned address:
While the authors and publisher believe that the information and guidance given in this
work are correct, all parties must rely upon their own skill and judgement when making
use of them. Neither the authors nor publisher assumes any liability to anyone for any
loss or damage caused by any error or omission in the work, whether such an error or
omission is the result of negligence or any other cause. Any and all such liability is
disclaimed.
The moral rights of the authors to be identified as authors of this work have been
asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
About the editors
UAV avionics aim to boost flight visibility and situational awareness. This
book intends to be a reference for vision and sensor integration. It presents
fundamental aspects, ongoing research efforts, accomplishments, and challenges
faced in the deployment of imaging capabilities and sensor integration in UAVs.
Vania V. Estrela
Jude Hemanth
Osamu Saotome
George Nikolakopoulos
Roberto Sabatini
Chapter 1
Introduction to advances in UAV avionics for
imaging and sensing
Vania V. Estrela1, Jude Hemanth2, Osamu Saotome3,
George Nikolakopoulos4 and Roberto Sabatini5
1 Universidade Federal Fluminense, RJ, Brazil
2 Karunya University, Coimbatore, India
3 DCTA-ITA-IEEA, São José dos Campos, SP, Brazil
4 Luleå University of Technology, Sweden
5 RMIT University, Bundoora, Australia
Issues such as localisation and mapping arise from using sensors to localise
objects and avoid obstacles along the way.
Size helps categorise airborne platforms. The smaller the airframe, the narrower
the range of appropriate sensors and other hardware that can be used. The main
restriction is the smaller payload, itself limited by the available onboard power,
where payload denotes the transport capacity of an aircraft, usually expressed in
terms of weight.
Some important UAV categories are listed below (refer to [7,8]):
(i) High-altitude long-endurance (HALE): They carry out extremely long-range
(trans-global) reconnaissance and surveillance, being capable of flying for more
than 24 h at altitudes above 15,000 m.
(ii) Medium-altitude long-endurance (MALE): Similar to HALEs, save that they
generally operate at shorter ranges, though still beyond 500 km, at altitudes from
5,000 to 15,000 m, with roughly 24 h endurance, and from fixed bases.
(iii) Tactical UAV or medium-range UAV: It has a range between 100 km and
300 km and is smaller and operated with simpler systems than a HALE or MALE.
(iv) Vertical takeoff and landing: This type of UAV can take off, hover, and
land vertically.
(v) Close-range UAV: It usually works at ranges up to 100 km, and it has
possibly the most significant number of uses, including diverse tasks like
reconnaissance, target designation, surveillance, crop-spraying, power-line
inspection, and traffic monitoring, to name a few.
(vi) Mini-UAV: It can be hand-launched and operate at ranges of up to 30 km.
(vii) Micro-UAV (MAV): It is widespread in urban locations and within buildings.
Besides flying slowly, it can stop and sit on a wall or post (hover-and-perch
mode). MAVs are, in general, manually launched, and for that reason, winged
versions have very low wing loadings, which make them vulnerable to atmo-
spheric turbulence and precipitation [9].
(viii) Nano-air vehicle: It is an ultra-light UAV intended to operate in swarms for
purposes such as radar confusion or, conceivably, provided camera, pro-
pulsion, and control sub-systems can be made small enough, for ultra-short-
range surveillance.
Another important UAV taxonomy is the aerodynamic classification into
fixed-wing and rotary-wing UAVs [10–13].
Fixed-wing (FW) UAVs need to maintain a certain minimum translational
velocity to stay in the air. They have a much simpler structure than rotary-wing
UAVs, which allows for less complicated maintenance and repair. Hence,
they offer more operational time at a lower cost, greater aerodynamic efficiency,
and longer flight durations at higher altitudes, which enables larger survey areas per
flight. These aircraft can transport greater payloads for longer distances on less power,
which gives room to carry larger and more expensive sensors as well as
twin sensor configurations.
Rotative or rotary-wing (RW) UAVs can have different numbers of rotors:
1 (helicopter), 3 (tricopter), 4 (quadcopter), 6 (hexacopter), and 8 (octocopter),
as well as more atypical arrangements with 12 and 16 rotors. When properly con-
trolled, they can hover in place. Each arrangement has its unique
advantages and disadvantages. Control of rotary UAVs derives from
the variation in thrust and torque of their rotors. Their most significant advantage is
the ability to take off and land vertically, which permits operation within a
smaller neighbourhood without a large fixed landing/take-off area. The RW capacity
to hover and perform quick manoeuvring makes them well suited to applications
such as inspections, where precision manoeuvring and the ability to monitor a single
target for extended periods are vital. RW UAVs' disadvantages are the greater
mechanical and electronic complexities that result in more complicated maintenance
and repair processes, shorter operational time, and higher operational costs than
FW UAVs.
The faster dynamics of flying platforms impose strict requirements on the
timely execution of tasks and decision-making. Time dependency is paramount,
especially for indoor UAVs because they operate at small distances from obstacles.
Moreover, the use of airborne vehicles calls for strict safety requirements, high
dependability, and robustness.
There are numerous reasons for focusing on imaging sensors. Drones are
usually equipped with cameras, so no additional hardware is needed. Furthermore,
they are lightweight, low-cost and have low power requirements. For an outdoor
UAV, the level of autonomy, as well as the system situational awareness increases
with colour and thermal cameras.
Stimuli have four characteristics: location, modality, intensity, and duration. A
stimulus modality (sensor modality) is a type of physical phenomenon that can be
detected like temperature, taste, sound, and pressure [14–17]. A sensor is an entity
intended to perceive events or changes in its surroundings and then present an
equivalent output. The kind of sensor activated by a stimulus has a vital role in
processing its modality.
Multisensory or multi-modal integration (MMI) studies how information
from different sensory modalities, such as vision, sound, touch, taste, motion, and
smell, may be combined by the control system. A coherent representation of
objects combining modalities results in meaningful and adaptive knowledge.
MMI also deals with how different types of sensors interact and modify each
other's processing.
A UAV system comprises three main elements:
● the aircraft itself;
● the command, control, communication, and computer system (C4 system) also
known as Ground Control Station; and
● the operator.
Figure 1.1 shows the basic entities comprising avionics architectures according
to [18,19].
Payloads can include high- and low-resolution cameras, video cameras, night-and-
day reconnaissance gear, high-power radar, gyro-stabilised electro-optical sensors,
meteorological, chemical, and biological sensors, relays (communications and
navigation signals), cargo (leaflets and supplies), and generally any equipment
required for the UAV mission.
A satellite navigation system (SNS) determines position using satellite signals.
Such a system can handle positioning, navigation, and tracking the location of
objects, and can operate independently of any telephony or Internet connection or
concomitantly with these technologies for better positioning information. A global
navigation satellite system (GNSS) is an SNS that affords global coverage.
Synthetic aperture radar (SAR) can be extremely helpful to aid UAV navi-
gation when the INS measurements are not precise enough to get rid of drifts from
a planned route [21]. This setback can affect MALE UAVs, which can carry the
significant and spacious payloads required by SAR but accumulate large drifts.
The fundamental idea is to figure out the position and attitude of an aerial platform
by inspecting both the amplitude and the phase of SAR images obtained onboard.
For the amplitude-based method, matching the actual ground coordinates of
landmarks with those automatically determined from the SAR image is a way of
implementing the needed navigation corrections. When SAR amplitude analysis is
impracticable, the phase data can be analysed via SAR interferometry and using a
reference digital terrain model (DTM). A feasibility study can obtain adequate
system requirements by exploring both radiometric and geometric parameters of
the acquisition framework. For MALE UAVs relying on particular commercial
navigation sensors and SAR systems, with typical landmark position accuracies and
classes, and on existing DTMs, available UAV technology allows a SAR-based
backup navigation structure.
Indoor and outdoor vehicles pose different challenges. While some techniques
work in both cases, some issues have to be solved independently. The biggest
challenge for an indoor UAS is the lack of a ubiquitous positioning system such as
the global positioning system (GPS). Even though its availability is never assured,
GPS is an essential part of outdoor UAVs. Several commonly used sensors do not
make sense or are not useful indoors, as is the case of the altitude sensor. Hence,
autonomous UAV indoor operation is an open issue, and the detection of people
can become challenging [22,23]. Because imaging and other sensor types
facilitate the autonomy of a UAV operating in both outdoor and indoor environ-
ments, one may consider tailoring the sensor suite to the application. In the case of the former, the
input from cameras increases the number of tasks a UAV can perform autono-
mously. For the latter, a camera sensor can offer basic functionality, which is the
ability to maintain flight without human participation [24,25].
An embedded electronic system (sometimes called an onboard computer)
controls the flight stability, the navigation, and the communication of a UAV. In
the case of small UAVs’ hardware, it is often called the flight controller, flight
controller board, or autopilot, and it controls the communication with the base
station, and keeps the UAV airborne and navigating while performing the
following actions:
1. Task distribution amongst a group of agents, with time and equipment
constraints;
2. UAV path planning while meeting certain objectives and restrictions, such as
obstacle avoidance [26];
1.3 Communications
Autonomy means that aerial robots can perform a wider range of tasks by inter-
acting and operating safely in different kinds of environments, with self-sufficiency
and without relying on external help via wireless transmission [30,31].
1.4 Sensors
Analogue sensors, e.g., potentiometers and force-sensing resistors, are still widely
used. Applications include manufacturing and machinery, aeroplanes and aero-
space, cars, medicine, and robotics.
Sensor sensitivity indicates how much a sensor's output changes when the input
quantity being measured changes. Furthermore, some sensors disturb the phe-
nomenon or characteristic they measure; for example, a room-temperature reading
taken onboard an indoor UAV can be affected by the vehicle itself. Hence, sensors
must have a small effect on measurements. Often, making the sensor smaller improves
this and may introduce other advantages [33–36].
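As an illustration that is not part of the original text, sensitivity can be estimated from calibration data as the slope of the output-versus-input curve. The sketch below assumes a hypothetical linear temperature sensor described by a few invented calibration points.

```python
import numpy as np

def estimate_sensitivity(inputs, outputs):
    """Estimate sensor sensitivity as the slope (output change per unit input)
    of a least-squares line fitted to calibration data."""
    slope, _intercept = np.polyfit(np.asarray(inputs, dtype=float),
                                   np.asarray(outputs, dtype=float), deg=1)
    return slope

# Hypothetical calibration of a temperature sensor: degrees Celsius -> volts.
temps_c = [0.0, 10.0, 20.0, 30.0, 40.0]
volts = [0.50, 0.60, 0.71, 0.79, 0.90]
print(f"Sensitivity ~ {estimate_sensitivity(temps_c, volts):.3f} V/degC")
```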
Sensors provide basic functionality and aid in maintaining flight without
human input; radar, cameras, Infra-Red (IR) scanners, or Electronic Intelligence
technology are the most common. Other types may include a (laser) target designator
to guide stand-off weapons. Requirements for sensing payloads on UAVs extend to
intelligence, data collection, reconnaissance/surveillance, tracking, and payload
delivery, since they assist in the detection and identification of objects.
Sensors can represent one of the most costly items in an uncrewed aircraft and
are necessary for navigation and mission achievement. Processing units allow UAVs
to fly complete missions autonomously with little or no human intervention [37].
The fusion of multi-modal images can help investigate human body detection
and geolocation for potential survivors in a search and rescue mission. The scan-
ning result is a map of victim locations that can be used by first responders or a
robotic system to, for example, deliver food or medical supplies [38,39]. The UAV
tracking functionality can provide streams of low-level events such as geographical
coordinates of targets on the ground, which allows for qualitative reasoning about
the surroundings and assessment of vehicles overtaking, entering, and leaving
disaster sites, besides generally improving situational awareness.
Position and movement sensors inform about the aircraft state. Exteroceptive
sensors obtain data from the environment like distances to objects and deal with
outside information. Proprioceptive sensors measure values internal to the UAV
(such as speed and heading), while exproprioceptive sensing establishes a correlation
between internal and external states.
Non-cooperative sensors can detect targets autonomously, so they are used for
separation assurance and collision avoidance.
The term degrees of freedom refers to both the number and the quality of the
sensors onboard.
The critical communication technologies’ issues are flexibility, adaptability,
safety, robustness, and cognitive controllability of the bandwidth, frequency, and
information/data flows [37]. A UAV data link usually consists of an RF transmitter
and a receiver, an antenna, and modems to link these parts with the sensor systems.
For UAS, data links serve three essential functions:
1. Uplinks from the ground station (GS) and/or a satellite to send control data to
the UAV;
2. Downlinks to transmit records from the UAV onboard sensor and telemetry
systems to the GS; and
due to linearity, numerical robustness, and low computational burden. The resulting
LC subsystem comes from a full UAV dynamic model. The UAV dynamics are
equivalent to inner and outer systems combined when using nonlinear control
theory. Control schemes may rely on the differential flatness of the outer system. If
the overall closed-loop system exponential stability is guaranteed, then the con-
troller can be tightly coupled with the vision-based state estimation and the aux-
iliary sensors such as accelerometers [41].
In general, a UAV vision system is a cheap way to augment the sensor suite
of the aerial vehicle, including a GPS. Vision systems provide position data
with respect to the inertial frame, complementing INSs [44]. As a passive and
information-rich sensor, CV is gaining more and more importance in a UAV
sensor set, prompting growing interest in control design around vision sensors.
Stereo vision systems expand a UAV multi-sensor suite, including laser
range-finders for landing [45], and can estimate the location/orientation of the
landing pad [46] and the pose of the camera with respect to the landing pad.
Furthermore, the use of CV-based UAV control is more challenging than the
classical approaches [47] since UAVs are underactuated nonlinear dynamical sys-
tems. To guarantee performance with stability for the overall closed-loop system, a
comprehensive depiction of the UAV dynamics is necessary.
In various UAV imaging applications, multisensory super-resolution (SR)
has become a persistent challenge and attracted escalating attention.
Multi-sensor SR algorithms exploit multispectral low-resolution (LR) pictures
to create a high-resolution (HR) image and thus improve the performance of the UAV
image processing system. To restore image details without noise amplification or
unnatural post-processing artefacts, regularised SR algorithms combining
directionally adaptive constraints with a multi-scale non-local means filter have been
proposed. Thus, the physical limitation of multispectral sensors can be overcome by
estimating the colour HR image from a set of multispectral LR images using
intensity-hue-saturation image fusion [48].
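The text above does not spell out an algorithm, so the following sketch only illustrates the intensity-hue-saturation (IHS) fusion step it mentions, assuming a low-resolution RGB frame and a co-registered high-resolution intensity (panchromatic) band, both as NumPy arrays in [0, 1]. The directionally adaptive regularisation and non-local means filtering of [48] are omitted; the data and the scale factor are invented.

```python
import numpy as np

def fast_ihs_fusion(lr_rgb, hr_intensity, scale):
    """Fuse a low-resolution RGB image with a high-resolution intensity band
    using the additive fast-IHS scheme: upsample the RGB image, then add the
    difference between the HR intensity and the upsampled intensity to every
    channel."""
    # Nearest-neighbour upsampling by an integer factor (illustrative only).
    up = np.repeat(np.repeat(lr_rgb, scale, axis=0), scale, axis=1)
    intensity_up = up.mean(axis=2)            # I component of the IHS triplet
    detail = hr_intensity - intensity_up      # HR spatial detail to inject
    fused = up + detail[..., None]            # add the detail to each channel
    return np.clip(fused, 0.0, 1.0)

# Hypothetical data: a 64x64 RGB frame and a 256x256 intensity band.
rng = np.random.default_rng(0)
lr = rng.random((64, 64, 3))
pan = rng.random((256, 256))
hr = fast_ihs_fusion(lr, pan, scale=4)
print(hr.shape)  # (256, 256, 3)
```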
UAVs can also be controlled based on 3D landmark observations and the
landmark positions’ predictions even under significant external perturbations [49].
Since incoming light to a UAV platform can be scattered by atmospheric haze
and dust, the acquired subject image loses the original colour and brightness. The
enhancement of haze-affected images is an imperative undertaking to improve the
visibility of various UAV images. Spatially adaptive de-hazing algorithms can rely
on merging colour histograms with consideration of the wavelength-dependent
atmospheric turbidity. Based on the wavelength-adaptive hazy image acquisition
model, a de-hazing procedure consists of three steps: (i) image segmentation;
(ii) creation of a context-adaptive transmission map; and (iii) intensity trans-
formation to enhance a hazy image. The degradation model must consider the
wavelength of light sources, and the transmission map has to provide a theoretical
basis to discriminate visually relevant regions from others using the turbidity and
merged classification results [50].
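As a hedged illustration of the acquisition model behind such methods, and not the exact algorithm of [50], the sketch below inverts the standard atmospheric scattering equation I = J*t + A*(1 - t) per colour channel, with a hypothetical wavelength-dependent scaling of a single transmission map; segmentation and the merged-histogram classification are omitted, and all inputs are invented.

```python
import numpy as np

def wavelength_weighted_transmission(base_t, weights=(0.95, 1.0, 1.05)):
    """Scale one transmission map per channel to mimic wavelength-dependent
    turbidity (the weights are purely illustrative)."""
    return np.clip(base_t[..., None] * np.asarray(weights), 0.0, 1.0)

def dehaze(image, airlight, transmission, t_min=0.1):
    """Recover scene radiance J from a hazy image I via I = J*t + A*(1 - t),
    using a per-pixel, per-channel transmission map."""
    t = np.clip(transmission, t_min, 1.0)     # avoid division blow-up
    return np.clip((image - airlight) / t + airlight, 0.0, 1.0)

# Hypothetical inputs: a hazy RGB frame, a global airlight, a rough transmission map.
rng = np.random.default_rng(1)
hazy = rng.random((120, 160, 3))
airlight = np.array([0.8, 0.8, 0.85])
t_rgb = wavelength_weighted_transmission(rng.uniform(0.3, 0.9, size=(120, 160)))
restored = dehaze(hazy, airlight, t_rgb)
```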
Augmented reality (AR) comprises the technologies and methods to combine
real images with virtually rendered scenes to improve the application of the UAV.
The high viewpoint and the combination of (multimodal) real and virtual scenes
provide engineers with augmented views of the observed/analysed site. AR
increases the possibility of uncovering problems. Important features for a suc-
cessful UAV-AR platform are (1) stabilised image acquisition, (2) image/video
transmission, (3) reliable communication, and (4) viewing interface. AR can lead to
the (re)construction of scenarios. The UAV-AR solution can be used in construc-
tion, planning, inspection, and remediation [51,52].
Spatial information plays a decisive role in remote sensing (RS) and mapping
applications like environment surveying and disaster monitoring [53,54]. A UAV-
borne mobile mapping system (MMS) can achieve fast spatial data acquisition
under limited conditions with better mobility and flexibility than other means. The
accuracy of a direct geo-referencing (DG) system can be considerably improved by
flying at a lower altitude using such hardware [55].
Recently, significant research effort has been directed towards increasing opera-
tional aircraft safety through numerous approaches to fault-tolerant flight control
schemes. Extensive research programs led by NASA have focused on designing
and validating specific approaches for the design of some fault-tolerant flight
control schemes [7,8], primarily focused on malfunctions of the actuators of the
control surfaces or failures of the propulsion system. However, issues related to
sensor failures have been considered of lower priority. Historically, triple or
quadruple physical redundancy in the sensor suite, along with the implementation
of built-in testing (BIT) and/or voting schemes, has been a reliable approach to
cope with failures for the majority of the sensors of the flight control system [3,4].
On the other hand, physical redundancy has clear limitations for light and inex-
pensive aircraft such as small UAVs. Nevertheless, recent events have shown that
physical redundancy is not infallible in air data systems providing airspeed mea-
surements since all the redundant sensors are subjected to the same environmental
conditions in flight.
An alternative approach to physical redundancy is analytical redundancy. In
general, an analytical redundancy-based sensor failure detection, identification, and
accommodation scheme can be divided into two distinct and sequential tasks: the
sensor failure detection and identification (SFDI) task and the sensor failure
accommodation (SFA) task. The SFDI task is designed to monitor the sensor systems
without physical redundancy to detect signs of incipient failures, followed by the
identification and isolation of such failures; the SFA task, instead, is responsible for
providing an alternative estimate in lieu of the measurements from the failed sen-
sor. Several approaches implement analytical redundancy for the detection, identifi-
cation, and accommodation of failures of the rate gyros onboard an aircraft.
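A minimal sketch of the idea, not any specific published scheme: compare each sensor reading with an analytically derived estimate (here simply passed in), declare a failure when the residual stays above a threshold for several consecutive samples (SFDI), and substitute the estimate for the failed reading (SFA). The threshold, window length, and sample values are hypothetical.

```python
from collections import deque

class AnalyticalRedundancyMonitor:
    """Residual-based sensor failure detection, identification, and accommodation."""

    def __init__(self, threshold, window=5):
        self.threshold = threshold          # residual magnitude that counts as suspect
        self.window = window                # consecutive suspect samples before declaring failure
        self.flags = deque(maxlen=window)
        self.failed = False

    def step(self, measured, estimated):
        """Return the value to feed the flight controller for this sample."""
        self.flags.append(abs(measured - estimated) > self.threshold)
        # SFDI: declare failure after `window` consecutive out-of-bound residuals.
        if len(self.flags) == self.window and all(self.flags):
            self.failed = True
        # SFA: once failed, accommodate by substituting the analytical estimate.
        return estimated if self.failed else measured

# Hypothetical use with a pitch-rate gyro and an estimate derived from other sensors.
monitor = AnalyticalRedundancyMonitor(threshold=0.2, window=5)
for gyro, estimate in [(0.01, 0.0), (0.02, 0.0), (0.90, 0.0), (0.95, 0.0),
                       (0.92, 0.0), (0.91, 0.0), (0.93, 0.0)]:
    used = monitor.step(gyro, estimate)
```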
A flight termination system (FTS) is highly advantageous to complement all
UAV hardware/software redundancies and independent functionalities, besides
warranting the overall safety and predictability of the whole UAV cyber-physical
system.
1.7 Applications
Most UAVs are primarily used for intelligence, surveillance, and reconnaissance
(ISR), which can include chemical, biological, radiological, and nuclear detection, or
merely those tasks considered too dangerous or politically challenging for human-
crewed aircraft to undertake. UAVs are preferred over human-crewed aircraft not
only because they reduce risk and increase confidence in mission success while
avoiding the human cost of losing lives if the mission is unsuccessful, but also
because UAVs sustain better vigilance than humans during monotonous operations.
Moreover, many other technological, economic, and political factors have
encouraged the development and operation of UAVs.
Operations that involve missions of 30 or 40 h or more are best carried out
using UAVs. Automation keeps the operator workload low, and low-intensity tasks
are well suited to unmanned aircraft.
UAVs are a perfect choice for operations in hostile environments; for example, some
fire brigades use small drones to reconnoitre fires in inaccessible locations or where
smoke and flame would make human presence dangerous.
Operations like reconnaissance over unfriendly territory may result in loss of
human lives. Thus, in such a scenario, multiple, cheap unmanned aircraft can be
used to handle detection, command, and control systems.
A complete sensory platform for greenhouse monitoring (SPGM) consists of a
sensory system onboard a small UAV to measure temperature, humidity, lumin-
osity, and CO2 concentration and to plot maps of these variables. These SPGM
features allow for climate control, crop monitoring, or vulnerability detection
according to the climate and plant growth models as well as the requirements for
their integration onboard [56,57].
Additional possible applications of UAVs include:
● reconnaissance, surveillance, and target acquisition;
● surveillance for SAR;
● support for maritime manoeuvres;
● meteorology missions;
● route and landing reconnaissance support;
● indirect fire adjustment and prevention;
● radio and data relay;
● nuclear cloud surveillance;
● relaying radio signals;
● contamination monitoring;
● long-range, high-altitude surveillance;
● radar system jamming and destruction;
● security;
● damage assessment;
● policing duties;
● traffic spotting;
● fisheries protection;
● pipeline survey;
● sports events film coverage;
● agricultural operations;
● powerline survey;
● aerial photography;
● border patrol;
● surveillance of coastal borders, road traffic, etc.;
● disaster and crisis management search and rescue;
● environmental monitoring;
● agriculture and forestry;
● public health;
● firefighting;
● communications relay and RS;
● aerial mapping and meteorology;
● research by university laboratories;
● law enforcement; and
● many other applications.
Chapter 2 covers computer vision and data storage applications specific to a
UAV CPS while explaining the general architecture of the cloud-based UAV-CPS
[58,59], its challenges, its necessities, its design goals, memory usage, specific
requirements, limitations of onboard storage, general solutions, UAV DL, and
protocol standardisation with examples.
Chapter 3 investigates the role of OF in visual situation awareness, detection,
and obstacle avoidance systems, which interact with the environment through
sensors and actuators. The use of cameras allows the integration of CV algorithms
with the INS. The movement of image features, fused with the dynamics of the
UAV, improves remote sensing, obstacle avoidance, and the estimation of the UAV
position and velocity. In the literature, there are various algorithms to locate
feature points between two consecutive images. This chapter shows how to
integrate the movement of the pixel textures in the image with the INS, compares
the different algorithms used to determine matching points between consecutive
images, and implements an algorithm for this purpose that is computationally
cheaper and consumes less memory. A case study about using an FPGA as part of
the visual servoing is discussed showing how to integrate results into the CV
hardware system of a UAV and addressing the need to handle issues such as multi-
resolution.
Chapter 4 explores intelligent vision-based sensor systems, future prerequisites
for UAV performance assessment in unsegregated airspace for a given navigation
sensor, and the quantitative assessment of visual sensors implemented in standalone
and integrated frameworks, particularly INSs.
Chapter 5 addresses the need for modelling and simulation of UAV systems,
reviews historical developments in the field, current techniques used, and future
developments. The modelling of a system’s dynamics and performing simulation
are commonly used techniques and almost critical components in the modern
development of manned and unmanned systems alike.
Chapter 6 provides an introduction to the integration of visual sensors in multi-
sensor systems, along with a review of significant research efforts in the domain.
The research area has matured owing to the need to operate UAS in GNSS-denied
environments.
Chapter 7 describes the main visual-based pose estimation algorithms and
discusses where they best apply and when each fails. Results of state-of-the-art are
also presented to stimulate the development of new strategies that will overcome
the remaining challenges of this research field.
As the use of UAVs increased, studies regarding their autonomous flight
became an academic field of great interest for researchers. Until recently, most
studies based their developments on an IMU and a GNSS as the main sensors to
calculate and estimate the UAV's pose. These sensors, on the other hand, have
several limitations, which can affect navigation and, therefore, the fully autonomous
aspect of the system. Images captured during flight, computer vision algorithms,
and photogrammetry concepts have become a core source of data to estimate the
UAV's pose in real time, thereby composing new alternative or redundant navi-
gation systems. Several algorithms have been proposed in the scientific
community, each one working better in specific situations, and using different
kinds of imaging sensors (active and passive sensors).
MAVs are a class of aircraft investigated in Chapter 8. MAVs have a large
range of applications from commercial, research, government to military purposes.
Recently, bio-inspired MAVs have found applications in ecological, geological,
meteorological, hydrological, and human-induced disaster scenarios. Indeed,
animals utilise different strategies of movement in varied environments, which
allows them to adapt in order to get food, run away from a threat, etc., with
the least energy dissipation. Hence, designing and modelling UAVs inspired by
animals not only can decrease the energy consumption of the robot for long-duration
applications but also can provide robots that can be utilised in dangerous outdoor
environments that even regular UAVs cannot access in disaster research. UAVs also
have some useful
advantages like safety, flexibility, relatively low cost of ownership, and ease of
operation. Bio-inspired robots can be designed to both fly and move on land, either
by jumping or by walking. One of the most critical parts of such vehicles (MAVs)
is the vision (visual perception) system.
Chapter 9 discusses the robot operating system (ROS) with examples of its use
in UAV-deployed CV tasks, with some references to ROS 2.0. It also discusses
the relationship between ROS and available software suites.
Chapter 10 extends some topics from Chapter 8 to the realm of security ana-
lysis. This chapter first presents typical, published, and on-going research
works about security flaws in the UAV domain and possible scenarios that can
arise from them. Then, security strategies for resilient operation are pre-
sented to support new UAV designs based on typical security approaches (e.g.,
authentication and cryptography), current research works and insights derived from
aircraft design guidelines which relate intentional unauthorised interaction (cyber-
threat) and its effects on safety. Finally, deployment of a robot operating system
(ROS) on a consumer UAV (Parrot AR.Drone 2) is performed, and a cybersecurity
assessment is presented containing its findings, mitigations, and proposals to
strengthen its operation (resiliency).
Chapter 11 explores how the type of environment impacts the CV techni-
ques, algorithms, and specific hardware to be used. Indoor environments, also
known as controlled environments, generally rely on solutions based on beacons,
proximity sensors, and image processing for data acquisition. In this case, as the
environment is controlled, the illuminance of the scene is adjusted and sensors are
previously positioned, which facilitates the development and execution of these
systems. Outdoor environments, generally known for uncontrolled environmental
variables, frequently require solutions based on image processing techniques to
provide the data acquisition. In this environment, the non-constant variation of the
illuminance of the scene and the great variation of the background of images are
important complicating factors for the operation of the image processing algo-
rithms. In addition, constructions and buildings block the signal of sensors and
global positioning systems making it even harder to treat the exceptions caused by
these factors. Each exception being treated in a CV system has a computational cost
Small UAV (sUAV) technology, able to carry sophisticated optics payloads and to take
aerial images from strategic viewpoints, has become indispensable in today's
battlespace, contributing to advancing ISR capabilities. The constrained
onboard processing power in addition to the strict limit in the flying time of sUAV
are amongst the serious challenges that have to be overcome to enable cost-
effective persistent surveillance based on sUAV platforms. All previous examples
show that tailoring the sensors to match the platforms' environment is a challenging
endeavour; therefore, architects have shifted their design methodology to be
based on hardware and software open architectures as the centrepiece of their
approach to building cost-effective surveillance solutions. This chapter is a
brief introduction to hardware and software building blocks for developing per-
sistent surveillance systems. In our context, the focus is, in particular, on electro-
optic (visual spectrum) and IR integrated solutions leveraging computer vision
techniques for surveillance missions.
Chapter 14 closes the book with some important conclusions and more infor-
mation on new research directions.
References
[1] Razmjooy, N., Mousavi, B. S., Soleymani, F., and Khotbesara, M. H. 2013.
A computer-aided diagnosis system for malignant melanomas, Neural
Computing and Applications, Vol. 23(7–8), 2059–2071.
[2] Moallem, P., Razmjooy, N., and Ashourian, M. 2013. Computer vision-
based potato defect detection using neural networks and support vector
machine. International Journal of Robotics and Automation, Vol. 28(2),
137–145.
[3] Mousavi BS, Sargolzaei P, Razmjooy N, Hosseinabadi V, and Soleymani F.
2011. Digital image segmentation using rule-base classifier. American
Journal of Scientific Research, Vol. 35, 17–23.
[4] Hemanth, D.J., and Estrela, V.V. Deep learning for image processing
applications. Adv. Par. Comp. IOS Press. ISBN978-1-61499-821-1 (print)
978-1-61499-822-8 (online) 2017.
[5] Jesus de M.A., Estrela V.V., Saotome O., and Stutz D. Super-resolution via
particle swarm optimization variants. In: Biologically Rationalized Computing
Techniques for Image Processing Applications, pp. 317–337, 2018.
[6] Estrela, V.V., Saotome, O., Loschi, H.J., et al. 2018. Emergency response
cyber-physical framework for landslide avoidance with sustainable electro-
nics, Technologies, 6, 42. doi: 10.3390/technologies6020042.
[7] Petricca, L., Ohlckers, P., and Grinde, C. 2011. Micro- and nano-air vehi-
cles: state of the art. International Journal of Aerospace Engineering.
Vol. 2011, Article ID 214549, 17 pages. http://dx.doi.org/10.1155/2011/
214549
[8] Ezequiel, C. A. F., Cua, M., Libatique, N. C., et al., UAV aerial imaging
applications for post-disaster assessment, environmental management and
[22] Zhu, Q., Avidan, S., Yeh, M.-C., and Cheng, K.-T. 2006. Fast human
detection using a cascade of histograms of oriented gradients. In
Proceedings of the 2006 Computer Vision and Pattern Recognition
(CVPR06), 1491–1498.
[23] Zhou, J., and Hoang, J. 2005. Real time robust human detection and tracking
system. In Proceedings of 2005 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (CVPR05), 149.
[24] Vanegas, F., and Gonzalez, F. 2016. Enabling UAV navigation with sensor
and environmental uncertainty in cluttered and GPS-denied environments.
Sensors, 16, 666.
[25] Rudol, P., and Doherty, P. 2008. Human body detection and geolocalization
for UAV search and rescue missions using color and thermal imagery.
Proceedings of the IEEE Aerospace Conference. doi: 10.1109/AERO.
2008.4526559.
[26] Razmjooy N., Ramezani M., and Estrela V. V. A solution for Dubins path
problem with uncertainties using world cup optimization and Chebyshev
polynomials. In: Iano Y., Arthur R., Saotome O., Vieira Estrela V., and Loschi
H. (eds), Proc. BTSym 2018. Smart Innovation, Systems and Technologies,
vol 140, 2019. Springer, Cham doi: 10.1007/978-3-030-16053-1_5, 2019.
[27] DeGarmo, M., and Nelson. G. 2004. Prospective unmanned aerial vehicle
operations in the future national airspace system, In Proceedings of the
AIAA 4th Aviation Technology, Integration and Operations (ATIO) Forum,
Aviation Technology, Integration, and Operations (ATIO) Conferences,
Chicago, USA doi: 10.2514/6.2004-6243.
[28] França, R.P., Peluso, M., Monteiro, A.C.B., Iano, Y., Arthur, R., and Estrela,
V.V. Development of a kernel: a deeper look at the architecture of an
operating system. In: Iano Y., Arthur R., Saotome O., Estrela V.V., and
Loschi H. (eds) Proc. BTSym 2018. Smart Innovation, Systems and
Technologies, vol 140. Springer, Cham, 2019.
[29] Park, C., Cho, N., Lee, K., and Kim, Y. 2015. Formation flight of multiple
UAVs via onboard sensor information sharing. Sensors, 15, 17397–17419.
[30] Jain, R., Templin, F., and Yin, K.-S. 2011. Wireless datalink for unmanned
aircraft systems: Requirements, challenges and design ideas. In Proceedings
of the Infotech@Aerospace 2011, St. Louis, Missouri, American Institute of
Aeronautics and Astronautics, doi: 10.2514/6.2011-1426.
[31] Sayyed, A., de Araújo, G.M., Bodanese, J.P., and Becker, L.B. 2015. Dual-
stack single-radio communication architecture for UAV acting as a mobile
node to collect data in WSNs. Sensors, 15, 23376–23401.
[32] Zhang, T., Li, W., Achtelik, M., Kuhnlenz, K., and Buss, M. 2009.
Multisensory motion estimation and control of a mini-quadrotor in an air-
ground multi-robot system. In Proceedings of the 2009 IEEE International
Conference on Robotics and Biomimetics (ROBIO 2009), 45–50.
[33] Kretschmar, M., and Welsby, S. 2005. Capacitive and inductive displace-
ment sensors, in Sensor Technology Handbook, J. Wilson (Ed.), Newnes:
Burlington, MA.
[47] Espiau, B., Chaumette, F., and Rives, P. 1992. A new approach to visual
servoing in robotics. IEEE Transactions on Robotics and Automation,
Vol. 8, No. 3, 313–326.
[48] Kang, W., Yu, S., Ko, S., and Paik, J. 2015. Multisensor super resolution
using directionally-adaptive regularization for UAV images. Sensors, 15,
12053–12079.
[49] Karpenko, S., Konovalenko, I., Miller, A., Miller, B., and Nikolaev, D. 2015.
UAV control on the basis of 3D landmark bearing-only observations.
Sensors, 15, 29802–29820.
[50] Yoon, I., Jeong, S., Jeong, J., Seo, D., and Paik, J. 2015. Wavelength-
adaptive dehazing using histogram merging-based classification for UAV
images. Sensors, 15, 6633–6651.
[51] Wen, M., and Kang, S. 2014. Augmented reality and unmanned aerial
vehicle assist in construction management. Computing in Civil and Building
Engineering. 1570–1577. doi: 10.1061/9780784413616.195.
http://ascelibrary.org/doi/abs/10.1061/9780784413616.195.
[52] Li, H., Zhang, A., and Hu, S. A multispectral image creating method for a
new airborne four-camera system with different bandpass filters. Sensors
2015, 15, 17453–17469.
[53] Aroma R. J., and Raimond K. A novel two-tier paradigm for labeling water
bodies in supervised satellite image classification. 2017 International
Conference on Signal Processing and Communication (ICSPC), 384–388,
2017.
[54] Aroma R. J., and Raimond K. A review on availability of remote sensing
data. 2015 IEEE Technological Innovation in ICT for Agriculture and Rural
Development (TIAR), 150–155, 2015.
[55] Chiang, K.-W., Tsai, M.-L., Naser, E.-S., Habib, A., and Chu, C.-H. 2015.
New calibration method using low cost MEM IMUs to verify the perfor-
mance of UAV-borne MMS payloads. Sensors, 15, 6560–6585.
[56] Roldan, J. J., Joossen, G., Sanz, D., del Cerro, J., and Barrientos, A. 2015.
Mini-UAV based sensory system for measuring environmental variables in
greenhouses. Sensors, 15, 3334–3350.
[57] Gonzalez, L. F., Montes, G. A., Puig, E., Johnson, S., Mengersen, K., and
Gaston, K. J. 2016. Unmanned aerial vehicles (UAVs) and artificial intelli-
gence revolutionizing wildlife monitoring and conservation. Sensors, 16, 97.
[58] Estrela, V. V., Monteiro, A. C. B., França, R. P., Iano, Y., Khelassi, A., and
Razmjooy, N. Health 4.0: Applications, management, technologies and
review. Medical Technologies Journal, 2(4) 262–276, 2019. doi: 10.26415/
2572-004x-vol2iss1p262-27.
[59] Kehoe, B., Patil S., Abbeel P., and Goldberg, K. A survey of research on
cloud robotics and automation, IEEE Transactions on Automation Science
and Engineering, 2014.
Chapter 2
Computer vision and data storage in UAVs
Vania V. Estrela1, Jude Hemanth2, Hermes J. Loschi3,
Douglas A. Nascimento3, Yuzo Iano3 and Navid Razmjooy4

1 Universidade Federal Fluminense, RJ, Brazil
2 Karunya University, Coimbatore, India
3 LCV, FEEC, UNICAMP, Campinas, SP, Brazil
4 Tafresh University, Tafresh, Iran

2.1 Introduction

The choice of onboard components takes into consideration the lowest possible
weight, small size, low sensitivity to vibration (for high recorded-data quality and
reliability), computational abilities, compatibility, power consumption, and interfa-
cing ability. Computer vision (CV) algorithms tend to be computationally expen-
sive, and an unmanned aerial vehicle (UAV) project requires a compromise
between computational resources and the other hardware requirements for a given
maximum payload that can be better understood using the cyber-physical system
(CPS) paradigm [1–8]. Here, the term payload refers to the restricted physical space
available for transportation purposes in an aircraft.
The restricted computational processing and low energy resources of UAVs
pose a noteworthy challenge for real-time information processing, networking, and
decision-making. To address the challenges imposed by the total bulk of acquired
data, principally video, by the irregular and limited network means, and by the
limited UAV resources, a cloud-supported UAV application framework can be
implemented. This structure can integrate video acquisition, file scheduling,
information offloading/processing, and network state assessment to provide an
efficient and scalable system. The framework includes a client-side set of
mechanisms hosted on the UAV CPS, which selectively offloads the gathered
data to a cloud-based server. This server then arranges real-time processing and
data feedback services for the associated ground station or control centre and client
device/operator.
This work discusses the storage and software requirements for image acqui-
sition, analysis, compression, and knowledge retrieval in a UAV-CPS. Some issues
worth discussion are:
(a) The system safely running from memory with the help of a boot method and
remotely updated without difficulty;
(b) the operating system (OS) kernel;
(c) the file system with minimum utility programs for running and remotely
accessing the UAV;
(d) an implementation methodology for the critical UAV software;
(e) a thorough review of all data logging methods; and
(f) the development of software simulators to model and test the hardware
components and physics of the vehicles in a safe situation.
Developing UAV software can be an enormous task that involves constructing,
incorporating, writing, simulating, and testing all the indispensable components for
a functioning UAV platform. Various problems are anticipated while developing
the full software stack from top to bottom for autonomous vehicles, which makes a
complete, practically applicable software framework for these vehicles all the more
valuable. Designing a new product with minimum hardware is possible using
simulation software. This reduction of hardware reduces costs and time. Further, it
eases use of the product by making it effortlessly upgradable, or even wholly
modifiable, with a simple software update. All of this adds value to the avionics
product. The heart and brain of UAV-CPS software appear in Figure 2.1,
which depicts its role along with the supporting hardware modules that are required
to interface with the UAV ecosystem. The software block corresponds to software
on either onboard or offboard units.
Nearly all UAV categories follow the above architecture paradigm. Sensors
perceive the environment and send their signals as inputs to the control hardware
and software. Actuators implement alterations in the environment.
Figure 2.1 (a) The role of software in an autonomous vehicle and (b) typical
data logger
2.1.1 Requirements
UAVs are planned for high-level applications that cannot be translated into
source code immediately. First, the development phase needs to state the opera-
tional requirements of the vehicles while keeping interface support and platform
independence in mind.
Since sensors and actuators exist with a huge variety of interfaces such as
USB, Ethernet, RS232, I2C, RS422, RS485, SPI, Bluetooth, PWM, and so on,
communications hardware will have to address these interfaces to keep the whole
system working satisfactorily while handling data logging. The selection of
UAV-CPS hardware and software platforms should support all currently and
eventually necessary interfaces for a particular application.
Furthermore, the software and hardware platforms should be independent.
Software portability allows exchanging computer hardware while using the same
existing software without writing too many code amendments or, ideally, any more
code. Software portability is still under work due to the multidisciplinary character
of avionics and the existence of different civilian and military demands.
A free UNIX-like operating system kernel such as Linux meets these requirements,
since it has drivers for all the compulsory interfaces and works on multiple
hardware platforms, unlike a specific hardware platform tied to a non-portable
software environment with partial interface support. Each UAV has a list of
specific requirements, and with the assistance of the interface support and the
portable software, these requirements can be met by writing high-level code
rather than drivers or assembly.
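To make the portability argument concrete, here is a small, hypothetical sketch (not taken from the text) of how high-level UAV code can be written against an abstract sensor interface while interface-specific drivers (I2C, UART, SPI, etc.) are swapped underneath; the class and method names are illustrative only.

```python
from abc import ABC, abstractmethod

class RangeSensor(ABC):
    """Hardware-independent view of any range sensor, regardless of bus."""

    @abstractmethod
    def read_distance_m(self) -> float:
        """Return the measured distance in metres."""

class SimulatedLidar(RangeSensor):
    """Stand-in driver used for desktop simulation; a real driver would talk to the bus."""
    def read_distance_m(self) -> float:
        return 12.3

def too_close_to_obstacle(sensor: RangeSensor, safety_margin_m: float = 2.0) -> bool:
    """High-level logic: written once, reused with any RangeSensor implementation."""
    return sensor.read_distance_m() < safety_margin_m

print(too_close_to_obstacle(SimulatedLidar()))  # False
```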
[Figure 2.2 blocks: cloud, video capture, pre-processing, context collector]
meaningful data, the server offers valuable services like powerful processing cap-
abilities for object identification and scene comprehension, which will result in
decisions that must be sent to further processing services in the ground station or
another control facility [10–12].
The deployment of UAVs for collecting critical data requires innovation, and it
is a challenging task comprising some of the following problems [11,12]:
(a) The observed areas can change or mess up the UAV operation.
(b) UAV batteries have limitations that considerably affect flying time,
communication structure, and processing capacity.
(c) The UAV may connect over a DIL network environment where the network
state sentience must be handled by an algorithm that decides when the
information can be reassigned to the cloud server.
(d) The overall UAV sensing has severe requirements for correctness. For
instance, in search and rescue scenarios, image or video evidence collected
has to be processed appropriately.
(e) The UAV mobility pattern directly impacts the information acquisition and
the execution of image processing processes. Particularly, the collected data
quality may fluctuate during the flying time. Extra power and bandwidth will
be consumed if all records go to the ground station.
(f) Object detection and other applications need a high-quality image and video
data, which leads to the need to adjust UAV flight configurations to optimise
the resolution, dimensionality, and quality of visual data.
(g) Intelligent video capture and intelligent video processing tools significantly
improve the quality of decision-making in disaster scenarios.
Since UAVs come across the same challenges as mobile communication net-
works, these systems use cloud computing and virtualisation. However, it is not
always preferable to link to a cloud because the deployment of UAVs has con-
siderably higher complexity than the use of smartphones in mobile networks. Other
factors, such as time, task management, flight path, and information processing,
may require specific ground-breaking solutions.
The design purposes of a UAV-CPS cloud framework rely on the following
assumptions:
1. Efficient, cost-effective onboard video pre-processing procedures can be
performed by the framework to improve performance.
2. Frame-based video segmentation and communication with a convenient sam-
pling rate can reduce the data volume offloaded to the cloud.
3. Sufficient UAV onboard storage to save all data captured during the entire
deployment time with the assistance of current solid-state technologies.
4. Adequate wireless communications between the UAV, the ground station
(GS), the cloud, and other control/management centres demand pre-processing
programs on the UAV that can filter out superfluous frames; the remaining
critical frames must be exchanged reliably among the cloud and the UAVs.
5. Cloud computing helps to deal with issues like the scalability and performance
of UAV-CPSs with reasonable effort.
Figure 2.2 offers a higher-level view of the UAV-CPS client and server fra-
meworks from the design perspective. The UAV-CPS, as the client, consists of a
context collector, a context-aware video scheduler, a camera for capturing video,
and a video pre-processing unit. The cloud server stores and retrieves video,
detects objects, and performs other types of data mining. Wireless networks
connect the client and the server. The control centre accesses the processed data
through the server, assisting automatic and human decision-making.
The client video recording and pre-processing units collect video data and store
them in an onboard repository (e.g., a hard disk), in addition to running simple pre-
processing. Further, the client picks a frame at a pre-defined time instant and sends
it to the pre-processing unit to obtain material for the context-aware scheduler,
which receives video data and information from the context collector.
The context collector instructs the onboard sensors and other systems to collect
the UAV battery power level, real-time mobility patterns, control centre informa-
tion, spatial coordinates, temporal data, and network state metrics concurrently.
The context-aware scheduler processes the acquired material and the output image
developed from the image pre-processing stage to later decide about the frame. For
a more rigorous analysis, the frame is transmitted to the cloud.
A cloud supports virtualisation and hosts the server components; its scalable
resource allotment and easy configuration allow post-processing to satisfy
power-demanding and computationally intensive requirements. The server
contains two focal modules: video
retrieval units and data storage modules. The data storage systems support an
assortment of data mining services and signal processing. The server retrieves
video data with results and client information for wide-ranging processing (e.g.,
detected potential targets), which are directed to the control centre used for
assessment. First, the scheduling process inspects the UAV status to be sure there is
enough power to transfer data. If this holds, then it checks the UAV motion pattern
to ensure satisfactory quality of the acquired data and that they comprise objects of
interest. Finally, a more sophisticated task, for instance, compares the histogram
distribution of the frame undergoing processing with those obtained from
previously acquired frames. If a significant change happens, then the network
state is appraised to determine whether the frame can be delivered within a time
proper for real-time evaluation.
If the network accessibility and capacity are sufficient, then the cloud will
receive the frame for additional processing.
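A hedged sketch of that decision chain follows; the field names, thresholds, and the histogram metric are assumptions for illustration, not the framework's actual API: check battery, check motion and quality, compare the frame histogram against a reference, and only then check the network before offloading.

```python
import numpy as np

def should_offload(frame, context, ref_hist,
                   min_battery=0.2, max_speed=8.0, hist_change=0.25, min_bw_mbps=1.0):
    """Decide whether a captured frame should be offloaded to the cloud.

    `frame` is a greyscale image in [0, 1]; `context` holds battery level (0-1),
    ground speed (m/s) and available uplink bandwidth (Mbit/s); `ref_hist` is the
    normalised histogram of previously sent frames."""
    # 1. Enough power to transmit?
    if context["battery"] < min_battery:
        return False
    # 2. Motion pattern compatible with usable image quality?
    if context["speed_mps"] > max_speed:
        return False
    # 3. Has the scene changed significantly (total-variation histogram distance)?
    hist, _ = np.histogram(frame, bins=32, range=(0.0, 1.0))
    hist = hist / hist.sum()
    if 0.5 * np.abs(hist - ref_hist).sum() < hist_change:
        return False
    # 4. Is the network good enough for timely delivery?
    return context["bandwidth_mbps"] >= min_bw_mbps

# Hypothetical call:
rng = np.random.default_rng(2)
frame = rng.random((240, 320))
ref = np.full(32, 1.0 / 32)
ctx = {"battery": 0.8, "speed_mps": 3.0, "bandwidth_mbps": 4.0}
print(should_offload(frame, ctx, ref))
```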
A virtual machine can run in a cloud which also hosts the server components.
The video retrieval unit continually listens for arriving data, reconstructs the video
data, and then post-processes it. The server also saves incoming video data for
further treatment by other control services when extra knowledge is indispensable
to support choices. The server is also responsible for a control centre interface [13].
Wireless and wired networks provide the basis for the communications among the
parts of the UAV-CPS: client, server, and ground station. A wireless connection
links the UAVs and the ground station, which in turn is connected to the cloud by
wired and wireless connections for exchanging data. In
each case, data may be transmitted over networks exhibiting DIL environment
characteristics. The choice of communications model may suit a service-oriented
environment where the communication reliability is achieved through a web
services middleware.
There is a strong connection between network losses and the types of data
storage/processing because these issues are essential for maximising the performance
of the whole system. Decision-making relies significantly on network state metrics.
If the wireless channels connecting the UAVs and ground station have high data
rates, high definition videos, and different frame rates (depending on use and cir-
cumstances), then more records can be directed to the cloud. If the channel quality
is poor, only critical frames and messages should flow in the network. Both pre-
and post-processing algorithms hosted in the UAV and the cloud should be
optimised for efficiency, scalability, and detection performance.
of power. Moreover, compressed image and video data use wireless transmission,
which requires sufficient bandwidth that increases pressure on power system
resources. Thus, image and video data arriving at the GS are generally noisy and
delayed [14]. After ground computer processing, telecommands are sent back to
UAVs via TO, but this can only assure near real-time operation at best. Likewise,
human participation in TO is a double-edged factor that can both help and introduce
some autonomy risks.
In contrast to TO, the OVP model encourages onboard processing, as seen in
Figure 2.4. This model safeguards autonomy as well as real-time operation
with scarce onboard computational resources. This model is only valid for simple
image processing tasks. Even for simple CV operations, guarding system power
consumption within reasonable bounds is a non-trivial undertaking. This chapter
emphasises the implementation bottlenecks of onboard CV processing and possible
solutions, because autonomy is paramount in avionics research.
or through external tools and sensors. As a rule, they are small, portable, battery-
powered, and have at least a microprocessor, in-house memory for file storage, and
sensors. Some DLs interface with other computers and use software that initiates
the DL so that visualisation and analyses of the collected data can be done onboard
or remotely.
DLs range from multipurpose types for a variety of measurements to very
particular instruments for gauging only one environment or application kind.
General-purpose DLs are generally programmable. Although electronic DLs have
replaced some chart recorders, there are still static machines with a restricted
number of unchangeable parameters.
DLs afford automatic information collection on a 24-hour basis, which is a
chief benefit. DLs are characteristically set up and left unattended to sense and
record material during the monitoring period upon activation. This fosters a com-
prehensive, accurate depiction of the environmental conditions under sur-
veillance, e.g., relative humidity and air temperature.
Standardisation of protocols and data formats posed a big problem, but standards
are now emerging in the industry. XML, JSON, and YAML, for instance, are little
by little being embraced for data exchange. The increasing growth of the Semantic
Web together with the Internet of Things will likely quicken this trend. One
standard is the smart protocol SDI-12, which allows the
connection of some instruments to a variety of DLs and also supports
multi-drop instruments. Some DL companies also offer the MODBUS standard,
traditionally used for industrial control, and countless industrial devices
support this communication standard. One more widely used multi-drop protocol
relies on the CAN bus (ISO 11898). Some DLs employ a flexible scripting
scheme to adapt to various non-standard protocols.
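As an illustration of the data-exchange trend above, the short Python sketch below encodes a hypothetical DL sample as a JSON document; the logger identifier and channel names are illustrative assumptions, not part of any standard schema.

import json
from datetime import datetime, timezone

# One logger sample expressed as a JSON document for exchange with other systems.
record = {
    "logger_id": "dl-001",                                  # illustrative identifier
    "timestamp": datetime.now(timezone.utc).isoformat(),    # acquisition time (UTC)
    "channels": {"air_temperature_c": 18.4, "relative_humidity_pct": 72.1},
}
payload = json.dumps(record)    # ready to transmit or store
print(payload)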
DLs typically have slow sample rates, and they are implicitly standalone
devices, whereas characteristic data acquisition systems get data from connected compu-
ters. This standalone aspect of DLs implies onboard memory to save the acquired
data. Ideally, this memory should be large enough to hold data from an extended recording period.
Given the lengthy recording times of DLs, they normally feature a mechanism
to associate a timestamp to guarantee that each recorded datum has the date and
time of acquisition to create a sequence of events. By themselves, DLs usually
employ integrated real-time clocks whose drift can be a paramount consideration.
DLs vary between simple single-channel inputs and complex multi-channel
instruments. Overall, the simplest devices offer the highest programming flex-
ibility. Some more refined instruments tolerate cross-channel computations and
alarms, given some predetermined conditions. The newest DLs can serve web
pages, allowing numerous people to observe a system remotely.
The automatic and remote nature of various DL applications requires them to
work with a battery supplemented by solar power. These limitations have made the
devices extremely power efficient since, in many cases, they operate in severe envir-
onmental circumstances where high reliability becomes an additional pre-requisite.
DLs must be exceptionally reliable because they work with little or no human
involvement for long nonstop periods and are often installed in harsh or inaccessible
locations. Hence, they should not fail to log data for any cause, pro-
vided they have power. As such, DLs are almost entirely immune to the problems
affecting a general-purpose computer when a program crashes or the OS becomes
unstable. Some examples of DLs follow:
(a) Specific data about the aircraft performance is acquired by the flight data
recorder.
(b) A device called event data recorder is installed by a manufacturer to gather
and save various data during the time immediately before as well as after
a crash.
(c) In embedded systems and digital electronics design, a specialised high-speed
digital DL helps to bypass the restrictions of conventional instruments such as a
logic analyser. A DL can record very long flights to aid in the repair of
functional bugs that happen occasionally.
(d) Micro-UAVs (MAVs) can bring considerable improvements to the healthcare
sector, and they require some particular concerns when it comes to DLs. An
open-source electronic health DL can be quickly built based on the Raspberry Pi
[28–30]. A Holter monitor, for instance, is useful for continuously monitoring
the electrical activity of the cardiovascular system and can offer
more flexibility to inpatient and outpatient care.
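In the spirit of item (d), the minimal sketch below shows a Holter-style health data logger that could run on a Raspberry Pi, appending timestamped readings to a CSV file. The sensor read-out is a simulated stand-in; file and column names are assumptions, and a real build would read an ADC or heart-rate front end over I2C/SPI.

import csv
import random
import time
from datetime import datetime, timezone

def read_heart_rate() -> float:
    # Simulated reading; a real build would query an ADC or heart-rate front end over I2C/SPI.
    return 60.0 + random.uniform(-5.0, 5.0)

with open("health_log.csv", "a", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["utc_timestamp", "heart_rate_bpm"])     # header for this session
    for _ in range(60):                                      # one minute at 1 Hz as an example
        writer.writerow([datetime.now(timezone.utc).isoformat(),
                         f"{read_heart_rate():.1f}"])
        fh.flush()                                           # keep data safe against power loss
        time.sleep(1.0)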
The specifics of the airborne platform status are vital to accomplishing a safe
flight. Ordinarily, the UAV receives control commands only from the operator and
from control algorithms, while a secondary radio channel can send the
necessary telemetry information.
This method works at a relatively low frequency, mostly monitoring online
status and allowing high-level commands to be transmitted. Consequently, problems
such as the range and quality of wireless connections arise. During a flight, the UAV
can move beyond the radio signal coverage, which results in corruption or loss of the
received data. So, the system needs an additional onboard logging unit to back up
crucial status variables in case of an accident. This feature (aka black box) does
not depend on the radio communication quality, and it permits investigation of the
causes of system breakdowns. More sophisticated estimation and control algo-
rithms entail a considerably higher data refresh rate and access to internal vari-
ables of the system. The verification process of algorithms monitors standard
working conditions. A possible solution is to log this material on the platform
throughout the flight and process it offline. This subsystem should not obstruct the
black-box scheme to preserve high reliability. As a final point, scientific platforms
require data logging with low- and high-frequency rates. Both solutions demand
extra memory storage for the main avionics elements. Figure 2.5 depicts the block
diagram corresponding to the types of storage in a UAV-CPS.
Figure 2.5 Block diagram for data logging procedures and storage forms (low- and high-frequency logging)
Generally speaking, an avionics system contains three main modules: the
mainboard (MB), the attitude and heading reference system (AHRS), and the power
board [28,29]. The essential component is the MB since it executes programs and
fuses all the information from the whole system. Thus, it should be the best location
for extra logging memory. The MB can use a secure digital (SD) card, a
non-volatile memory interfaced via direct memory access (DMA) and a serial
peripheral interface (SPI).
The file system simplifies information uploading and downloading. During the
initialisation phase, flight parameters come from the configuration file. At that
point, the secondary file with a distinctive name is generated, which will store
status variables. The writing process is sequential and corresponds to another line
for each timestamp. The non-optimal amount of written data and the need to convert
floating-point variables to text limit the maximum write speed. It is worth noticing
that the data logging task has the lowest priority onboard. Altering the
software procedure can enhance this subsystem's performance: with the help of
bigger buffers, data can be sent in sequential multi-block writes using SD-mode
communication, relying on the secure digital input-output standard. As stated in
[31], this data transfer works better than SPI with a memory card, which led the
authors to try an innovative communication approach (SD mode) built on
planned hardware upgrades. This chapter concentrates on modifications of
current black-box subsystems and on developing solutions for a secondary log-
ging method – the high-frequency logger. There are several open-source or cor-
porate avionics modules on the market [32] equipped with specific procedures for
data logging, such as the Ardupilot [33].
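The sketch below is a minimal, hedged illustration (not the chapter's firmware) of the sequential, timestamped logging scheme described above: flight parameters are read from a configuration file at initialisation, a distinctively named log file is created, one comma-separated line of status variables is appended per timestamp, and several lines are buffered before each write to mimic the multi-block optimisation. File names, variable names and rates are assumptions.

import json
import time
from datetime import datetime

with open("flight_config.json") as fh:            # flight parameters read at initialisation
    config = json.load(fh)
sample_period = config.get("sample_period_s", 0.01)

log_name = f"log_{datetime.now():%Y%m%d_%H%M%S}.csv"   # distinctively named secondary file
buffer, BUFFER_LINES = [], 32                           # bigger buffer, fewer but larger writes

def read_status() -> dict:
    # Placeholder for the status variables logged at each timestamp (attitude, battery, etc.).
    return {"roll": 0.0, "pitch": 0.0, "yaw": 0.0, "battery_v": 11.1}

with open(log_name, "a") as log:
    for _ in range(1000):                               # sequential writing, one line per timestamp
        status = read_status()
        buffer.append(f"{time.time():.3f}," + ",".join(f"{v:.4f}" for v in status.values()))
        if len(buffer) >= BUFFER_LINES:
            log.write("\n".join(buffer) + "\n")         # multi-line (multi-block style) write
            buffer.clear()
        time.sleep(sample_period)
    if buffer:
        log.write("\n".join(buffer) + "\n")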
Analysing some known avionics systems reveals two options for data logging:
(i) tlogs, where the telemetry log is sent to the ground control station (GCS) and
saved by specialised software; and (ii) data flash logs relying on extra onboard flash
memory. Online records can provide the necessary parameters.
Various commercial makers, such as Mikrokopter [34], distribute flight con-
trollers and wide-ranging aerial systems, which are not very useful for high-frequency
data logging because the onboard SD card limits their capabilities. The Pixhawk [35]
manner to save these data to non-volatile memory. All records will disappear after
the power goes off because of the RAM characteristics; thus, a new file is created
on the SD card with a copy of the data right after the measuring phase ends.
Regrettably, the standard frequency of the memory-card logging causes a
considerable increase in time, which is a noteworthy disadvantage of the described
methodology, because the write frequency cannot be raised during the
flight without jeopardising flight control tasks. Hence, a simple improvement is to raise the
frequency only after the descent, with the motors disarmed, which improved the
highest data acquisition rate fivefold. Regrettably, the limitation on the number of
variables persists.
computing frameworks can offer solutions for BD, but these technologies habitually
fail, either from a technical or a cost-effectiveness standpoint, when a large information
volume is treated. Thus, these data processing bottlenecks must be addressed for
superior use of BD.
2.6.4.2 Complexity
Leveraging the power of BD causes problems linked to data complexity and the
inherent difficulties of untreated records (raw data). Habitually, data are saved
using different formats, comprising unstructured databases and discrete text files
[38]. Moreover, the data volume increases on a daily basis, which makes
handling data from several sources and in different formats even more puzzling [39].
The specific managerial and technological aspects that may suffer from problems
caused by data complexity are discussed below.
Most organisations are not ready to solve the problem of a high rate of
data growth at a reasonable cost. Many groups merely choose to delete
old data instead of trying to cope with data growth.
The obstacle of merging different data modalities arises because the distinct data
sources lead to semantic conflicts between the numerous types
of information sources. For example, terms like assets, earnings, and health may vary
from one data source to another. Information is frequently stored in different formats,
with both (i) unstructured data that include text documents, SMS, emails, images,
videos, audio files, and transaction files; and (ii) structured data, typically
organised in relational databases. The complication of managing and analysing dif-
fering data formats is frequently beyond the capability of many organisations.
2.6.4.3 Privacy
Privacy concerns often hinder the adoption and use of BD analytics within com-
panies. Oftentimes, BD analytics employs people's data collected for an utterly
different purpose [38]. People's data, along with other data sources, can raise
legal and ethical problems, such as leaking private information about the per-
son (e.g., fiscal situation, medical records, domestic relationships, and awkward
behaviour). What makes the impasse even more onerous is that many companies
are not upfront about the use of their customers' data [40].
organisational vision about BD, ensuring top management's commitment to this vision and
handling the drivers impelling the executive culture rather than altering the culture
itself [30]. A clear vision of how BD fits into the whole mission strategy should
hasten and solidify the acceptance of BD within the organisation. The framed idea has
to be transformed into specific business processes and significant initiatives that
rely on BD to improve organisational performance.
The original standalone DL model is shifting to a scheme that brings together
data while also establishing wireless communications to set alarms for events, to
report data automatically and to permit remote control. DLs can serve web pages
with the readings in progress, e-mail their alarms, FTP their results
into databases or send them directly to the users. A popular trend is to use free and open-
source hardware and software approaches in lieu of brand-name products. The
small Raspberry Pi computer is a widespread single-board platform to host real-
time Linux or pre-emptive Linux kernel operating systems with many interfaces,
e.g., SPI, I2C, or UART, permitting the direct connection of sensors and computers,
and unlimited configurations to display measurements over the Internet in real
time, process records, plot charts/diagrams, and so on [46–50].
There are open-source initiatives with increasingly community-developed
projects for data acquisition and DL.
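As a small, hedged sketch of such a networked DL, the example below serves its latest reading as JSON over HTTP using only the Python standard library; the port number and payload fields are assumptions, and the reading itself is a simulated stand-in for the most recent logged sample.

import json
import random
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

def latest_reading() -> dict:
    # Stand-in for the most recently logged sample.
    return {"timestamp": time.time(), "temperature_c": 21.0 + random.uniform(-0.5, 0.5)}

class ReadingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(latest_reading()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ReadingHandler).serve_forever()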
2.7 Conclusions
time. This pool can be used to implement new algorithms or to raise the data recording
sampling frequency. All the mentioned approaches for high-frequency logging present
advantages and disadvantages; hence, it is difficult to compare them directly.
The first solution applies to any avionics system using a memory card as well
as a microcontroller with plenty of RAM. This concept is straightforward to implement
and helps researchers validate their algorithms and sensor outputs [1–8,
46–48] using both onboard and cloud resources.
The second method involves software modifications besides hardware mod-
ernisation of the avionics. Even when there are external SDRAM memory units, other
existing storage, and enough measurement time, it prolongs the transmission time
to the SD card. Regrettably, at times there are no conditions to run the platform for a
long time.
The third solution relies on external flash memory, and it also requires substantial
firmware and hardware adjustments. Extra PC software is equally necessary, and it
makes room for the prospect of a reconfigurable logging framework. It
also has the best performance. This tactic is better suited for fully enclosed
avionics modules using output data buses, and it abolishes the mechanical interfacing
with the avionics (inserting a memory card). Following the current development plans, future
work must investigate different procedures for buffering the data
before saving the records to the memory card.
Adopting larger buffers should enhance the achieved data rate in addition to
optimising the processing time. Moreover, memory cards with higher speeds can be
tested.
In summary, the DLs present the following advantages:
(i) A DL helps to monitor typical failures like intermittent failures;
(ii) It helps in analysing the cause of accidents;
(iii) It helps to perceive human failures;
(iv) DLs allow detection of a driver passing a signal at danger (SPAD);
(v) They record operational mistakes made by operating staff and Signal and Telecom inter-
ferences in safety circuits;
(vi) A DL can spot interferences and/or failures;
(vii) A DL can be a diagnostic tool for preventive maintenance with signalling
mechanisms;
(viii) DLs can be connected to networks to observe the PI/RRI/EI
remotely; and
(ix) Networked DLs can produce failure reports remotely to track events online
and offline whenever possible.
References
[1] Razmjooy N, Mousavi BS, Khalilpour M, and Hosseini H. Automatic
selection and fusion of color spaces for image thresholding. Signal, Image
and Video Processing, 2014;8(4):603–614.
[2] Hemanth DJ, and Estrela VV. Deep learning for image processing applica-
tions. Adv. Par. Comp. IOS Press. ISBN978-1-61499-821-1 (print) 978-1-
61499-822-8 (online) 2017.
[3] Mousavi BS, Soleymani F, and Razmjooy N. Color image segmentation
using neuro-fuzzy system in a novel optimized color space. Neural
Computing and Applications, 2013;23(5):1513–1520.
[4] Estrela VV, Magalhaes HA, and Saotome O. Total variation applications in
computer vision. In Handbook of Research on Emerging Perspectives in
Intelligent Pattern Recognition, Analysis, and Image Processing, pp. 41–64.
IGI Global, 2016.
[5] Moallem P, Razmjooy N, and Mousavi BS. Robust potato color image
segmentation using adaptive fuzzy inference system. Iranian Journal of
Fuzzy Systems, 2014;11(6):47–65, 2014.
[6] Mousavi B, Somayeh F, and Soleymani F. Semantic image classification by
genetic algorithm using optimised fuzzy system based on Zernike moments.
Signal, Image and Video Processing, 2014;8(5):831–842.
[7] Razmjooy N, Estrela VV, and Loschi HJ. A survey of potatoes image seg-
mentation based on machine vision. In: Applications of Image Processing
and Soft Computing Systems in Agriculture, 2019:1–38.
[8] Estrela VV, and Coelho AM. State-of-the-art motion estimation in the con-
text of 3D TV. In: Multimedia Networking and Coding. IGI Global, 2013:
148–173. doi:10.4018/978-1-4666-2660-7.ch006.
[9] Bernasconi A, Goodall E, Shea J, et al. Implementation guide for IBM
spectrum virtualize for public cloud. IBM Technical Report. 2017. http://
www.redbooks.ibm.com/redpapers/pdfs/redp5466.pdf.
[10] Luo C, Nightingale J, Asemota E, and Grecos C. A UAV-cloud system for
disaster sensing applications. In IEEE 81st Vehicular Technology
Conference (VTC Spring), 2015:1–5, 2015. doi: 10.1109/VTCSpring.2015.
7145656.
[11] Kosta S., Aucinas A., Hui P., Mortier R., and Zhang X., ThinkAir: Dynamic
resource allocation and parallel execution in the cloud for mobile code off-
loading. In Proceedings of IEEE INFOCOM, 2012:945–953.
[12] Namboodiri V, and Ghose T. To cloud or not to cloud: A mobile device
perspective on energy consumption of applications. In Proceedings of the
2012 IEEE International Symposium on a World of Wireless, Mobile and
Multimedia Networks (WoWMoM).
[13] Muthurajkumar S, Vijayalakshmi M, and Kannan A. Resource allocation
between temporal cloud database and user using access control. In
Proceedings of the International Conference on Informatics and Analytics
(ICIA-16). ACM, New York, NY, 2016. doi: https://doi.org/10.1145/
2980258.2980338.
[14] Ehsan S, and McDonald-Maier, KD. On-board vision processing for small
UAVs: Time to rethink strategy. In Proceedings NASA/ESA Conference on
Adaptive Hardware and Systems, 2009:75–81.
[15] Sun Y., Ng DWK, Xu D, Dai L, and Schober R. Resource allocation for solar
powered UAV communication systems, arXiv preprint arXiv: 1801.07188, 2018.
[16] Sheet NAF. Beamed laser power for UAVs, NASA–2014.http://www.nasa.
gov/centers/armstrong/news/FactSheets/FS087-DFRC.html.
[17] Bay H, Tuytelaars T, and Van Gool L. SURF: Speeded up robust
features. Proceedings of ECCV 2006, 2006.
[18] Saeed A, Neishaboori A, Mohamed A, and Harras KA. Up and Away: A
visually-controlled easy-to-deploy wireless UAV cyber-physical testbed. In
Proceedings of the IEEE 10th International Conference on Wireless and Mobile
Computing, Networking and Communications (WiMob). 2014:578–584.
[19] Fok C, Petz A, Stovall D, Paine N, Julien C, and Vishwanath S. Pharos: A
testbed for mobile cyber-physical systems. Univ. of Texas at Austin, Tech.
Rep. TR-ARiSE-2011-001, 2011.
[20] Jung D, Levy E, Zhou D, et al. “Design and development of a low-cost test-
bed for undergraduate education in UAVs. In Proceedings of 44th IEEE
Conference on Decision and Control 2005 and 2005 European Control
Conference, CDC-ECC’05. 2005:2739–2744.
[21] Lupashin S, Schollig A, Sherback M, and D’Andrea R. A simple learning
strategy for high-speed quadrocopter multi-flips. In Proceedings of 2010
IEEE International Conference on Robotics and Automation (ICRA). IEEE,
2010:1642–1648.
[22] Michael N, Mellinger D, Lindsey Q, and Kumar V. The grasp multiple micro-
UAV testbed. IEEE Robotics & Automation Magazine, 2010:17(3)56–65,
2010.
[23] Brown TX, Doshi S, Jadhav S, and Himmelstein J. Testbed for a wireless
network on small UAVs. Proceedings of AIAA 3rd Unmanned Unlimited
Technical Conference, Chicago, IL, 2004:20–23.
[24] Engel J, Sturm J, and Cremers D. Camera-based navigation of a low-cost
quadrocopter. Proceedings of 2012 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS). 2012:2815–2821.
[25] Crenshaw TL, and Beyer S. Upbot: A testbed for cyber-physical systems. In
Proceedings of 3rd International Conference on Cybersecurity experi-
mentation and test. USENIX Association. 2010:1–8.
[26] Fowers SG. Stabilization and Control of a Quad-Rotor Micro-UAV using
Vision Sensors, Master of Science Thesis, Brigham Young University, USA,
2008.
[27] Edwards B, Archibald J, Fife W., and Lee DJ. A vision system for precision
MAV targeted landing, Proceedings of 2007 IEEE International Symposium.
on Computational Intelligence in Robotics and Automation, Jacksonville,
FL, USA, 2007.
[28] Bondyra A, Gardecki S, and Gąsior P. Distributed control system for multirotor
aerial platforms. Measurement Automation Monitoring, 2015:61(7):343–346.
[29] Bondyra A, Gardecki S, Gąsior P, and Kasiński A. Falcon: A compact
multirotor flying platform with high load capability. Advances in Intelligent
Systems and Computing, 2015;351:35–44.
Optical flow (OF) plays a decisive role in visual situation awareness, detection and
obstacle avoidance systems for unmanned aerial vehicles (UAVs), which are cyber-
physical systems (CPSs) that interact with the environment through sensors and
actuators. The use of cameras allows the integration of computer vision (CV)
algorithms with inertial navigation systems (INS). The movement of image
features, fused with the dynamics of the UAV, improves remote sensing, obstacle
avoidance, and the estimation of the position and velocity of the UAV. In the
literature, there are various algorithms to locate characteristic points between two
consecutive images; however, computation time and the consumption of physical
resources such as memory are critical in embedded systems. This chapter (i) shows how
to integrate the movement of pixel textures (OF) in the image with INS data, (ii) compares
different algorithms to match points between consecutive images, (iii) implements a
process to find points between consecutive images and (iv) implements an algorithm
that is computationally less expensive and consumes less memory. A case study on using a field-
programmable gate array (FPGA) as part of the visual servoing is discussed,
showing how to integrate results into the CV hardware system of a UAV and
addressing the need to handle issues such as multi-resolution.
3.1 Introduction
Unmanned aerial vehicles (UAVs), also known as drones or remotely piloted aircraft
(ARP), are systems employed in operations such as supervision, intelligence and
reconnaissance, mapping, and search and rescue, among others. Since early in
the XX century, however, ARPs have been used mainly in military operations. From 1979
on, they were incorporated into aerial photogrammetry [1] by Przybilla and
Wester-Ebbinghaus. Since then, they have been improved due to the appearance and
incorporation of microelectronic systems such as the inertial measurement unit (IMU),
composed mainly of accelerometers and gyroscopes. Nowadays, other sensors can be
integrated on the ARP, such as LIDAR, SAR, and optical and acoustic sensors, among others.
Cameras are light sensors that capture the waves reflected by objects in the visible
band of the spectrum; they passively provide information about their environment. Their
low price and reduced size have little impact on the payload of any mission, in a variety of
civil and military applications, and they give the ARP a degree of autonomy by incorporating
characteristics such as the ability to detect and avoid obstacles, which are essential to the
integration of ARPs into the air traffic space. Additionally, the integration of
image acquisition (i.e., the camera) with the navigation system has permitted more
confident estimation of the position and velocity of the UAV together with the basic
navigation system (INS)/global navigation satellite system (GNSS).
CV algorithms and the navigation system augment the UAV's perception
of its environment, making it more autonomous and improving its situa-
tional awareness so that it can make decisions and carry out the mission without the
operator's supervision. Although cameras and other sensors
can nowadays provide the ARP with some autonomy, unfortunately
the regulations of the aeronautical authorities do not yet permit ARPs to share the
airspace with commercial aircraft. The challenge now is to implement
inside the ARP a reliable system to detect and avoid any collision,
improving its perception and comprehension of the objects in its environ-
ment. Recent investigations address the possibility of using cameras to imple-
ment this capability [2,3].
Currently, navigation consists of the fusion of INS, IMU and GNSS
data. However, there are situations where the signal is corrupted or unavailable
because the ARP is located in an urban or forest area, or the signal has been altered
by a spoofing or hacking attack [4,5]. CV algorithms are good candidates
to maintain the estimate of position and velocity when the ARP loses the GNSS
signal. To fuse the data from the CV algorithms and the navigation system, Bonin-
Font et al. [3] use two strategies: (i) mapping previously established locations with
respect to the vehicle's localisation and (ii) map-free navigation using only the
characteristics of the environment.
The utilisation of CV algorithms with INS data based on a pre-estab-
lished map to localise the vehicle's position requires sufficient memory to store the maps.
Another possibility is that the ground control station stores the images of the maps
online, assuming a robust connection and significant bandwidth to send them. Some
works [6–13] are related to the integration of geo-referenced maps. Others, for
example [14], study a navigation system that recognises marks on the
ground and extracts the longitude and latitude from the geo-referenced image, using
artificial neural networks to identify marks of interest (landmarks) in the drone
image. For this last case, a similarity measure is established that indicates the geographic
coordinates of the ARP. For night flight, Da Silva [15] has taken thermal images
from the UAV and compared them with a previously established geo-referenced map; if the
images are very similar, the geographic coordinates of the map are taken as the
geographic coordinates of the ARP (Figure 3.1). On the other hand, navigation
systems without maps rely on environmental characteristics for self-
localisation, for example, SLAM [16]. Such a navigation system detects and avoids
obstacles in addition to tracking a target.
Aerial photogrammetry is a sub-area of remote sensing that uses ARPs for
mapping terrain, and the output information can be used for interpretation and
classification of images, orthophotos, digital elevation maps, digital terrain maps,
digital surface maps, and 3D modelling of buildings and terrain.
There is a variety of applications relying on a camera embedded in the pay-
load. Although the navigation system is an essential part of the ARP, it is not
the only one. Figure 3.1 shows the different systems inside the ARP and their
integration.
The planning system sends new coordinates to the guidance system depending
on the information perceived from the surroundings using an optical sensor and on
the type of mission to achieve, for example, tracking a target or sending
information about a possible obstacle. It is also possible to use the optical
sensor to provide redundant information and thus estimate the velocity and position
more confidently.
where f(x, y, t) is the intensity of the image at instant t, and Δx and Δy are the changes
in the pixel position at instant t + Δt. Applying the Taylor series yields:

f(x + Δx, y + Δy, t + Δt) = f(x, y, t) + (∂f/∂x)Δx + (∂f/∂y)Δy + (∂f/∂t)Δt + O²   (3.2)

Removing the higher order terms O² and inserting (3.1) into (3.2), the result is
an equation that describes the condition of the OF as follows:

Ix Δx + Iy Δy + It Δt = 0   (3.3)
calculate the movement of the whole image or localising restrictions inside of the
image to find its projection in the next image.
where ū and v̄ are the local averages of u and v (from the Laplacian approximation), so the solution for u and v is:

(α² + Ix² + Iy²) u = (α² + Iy²) ū − Ix Iy v̄ − Ix It   and
(α² + Ix² + Iy²) v = (α² + Ix²) v̄ − Ix Iy ū − Iy It   (3.7)
The LK method is not invariant to scale; Bouguet [18] proposes
a method based on pyramids so that the OF calculation becomes invariant
to scale.
The HS method calculates the movement of the pixels over the whole image;
it is computationally expensive and consumes much memory, but it produces a
dense OF over the whole image. The LK method, in contrast, produces a sparse OF based on
characteristic features of the image and is computationally cheaper.
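As a concrete illustration of the sparse approach, the minimal Python/OpenCV sketch below detects corner-like points and tracks them with the pyramidal LK algorithm; the frame file names and parameter values are illustrative assumptions, not taken from the chapter.

import cv2

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # frame at time t
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)   # frame at time t + dt

# Corner-like feature points (Shi-Tomasi variant of the Harris criterion).
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

# Pyramidal LK tracking: the image pyramid makes the sparse flow robust to scale.
p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None,
                                            winSize=(21, 21), maxLevel=3)

good_old = p0[status.flatten() == 1].reshape(-1, 2)   # keep only successfully tracked points
good_new = p1[status.flatten() == 1].reshape(-1, 2)
flow = good_new - good_old                            # per-point displacement (dx, dy)
print("mean displacement:", flow.mean(axis=0))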
3.2.1.2 Feature extractor algorithm
Features are regions that present elements belonging to one object; such an element can
be a corner, an edge, a blob, a colour change or another type of feature. The extraction of
feature points relies on convolution filters. The LK method uses
corners inside the image to determine which points to track. Some algorithms used to
extract feature points are SIFT, SURF, ORB and Harris.
Scale-invariant feature transform (SIFT)
SIFT is an algorithm patented by Lowe [19] that localises and matches key-points
in two steps: the first step detects the key-points, and the second step
describes and matches them with the next image. For this chapter, only the first step is
discussed.
To detect key-points, SIFT applies Gaussian filters over the whole image at dif-
ferent scales, which results in smoother images. In Figure 3.2, the smoothing of
the images is represented by the red box. The determination of the key-points
employs the differences of Gaussians (DoGs); a maximum or minimum of the
DoGs is considered a key-point, represented in Figure 3.2 by the green box.
Speeded-up robust features
The speeded-up robust features (SURF) is an algorithm patented by [20], which is
very similar to SIFT. It has two phases, detection and description, but in this
chapter only the detection stage is explained.
The use of integral images and box filters to determine
feature points makes SURF less expensive. An integral image is a window where
each cell is the result of summing the previous pixels, and the box filters are
approximations of the Hessian filters. Figure 3.3 shows the Hessian
filter and its corresponding box filters.
The determination of feature points is based on the determinant of the Hessian matrix:

Det(Hessian) = Dxx Dyy − (w Dxy)²   (3.14)

where Dxx, Dyy and Dxy are the box-filter approximations of the Hessian
filters in the x, y and xy directions, and w is a weighting factor to correct the
approximation between the Gaussian kernels.
SURF uses a pyramid to determine feature points on different scales, and
in each level of the pyramid, SURF calculates the Hessian determinant over the
image, with the Hessian matrix given by

H = | Lxx(x,σ)  Lxy(x,σ) |
    | Lxy(x,σ)  Lyy(x,σ) |
Figure 3.5 Relation between the eigenvalues of the Harris matrix with the
equation alternative (3.17)
If one of the eigenvalues is greater than the other, the point lies on a line. If both
eigenvalues are small, then it is a blob. If both eigenvalues are greater than a certain
threshold, then it is a corner. Therefore, a feature point can be classified as a corner
without calculating the eigenvalues by applying the equivalent criterion:

R = Det(Harris) − a · trace²(Harris) > th   (3.17)

If R is less than the threshold th, the point is an outline or blob; if R is greater than th, it
is a corner and, therefore, it is considered a feature point, as shown in Figure 3.5.
This method is invariant to rotation but not to scale.
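A minimal sketch of criterion (3.17) with OpenCV is given below; the image name, window sizes, and the constants a = 0.04 and th are illustrative choices, not values from the chapter.

import cv2
import numpy as np

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Per-pixel response R = det(M) - a * trace(M)^2, with a = 0.04 (the k argument).
R = cv2.cornerHarris(gray, blockSize=3, ksize=3, k=0.04)

th = 0.01 * R.max()                      # threshold th chosen relative to the strongest response
corners = np.argwhere(R > th)            # (row, col) positions classified as corners
print(f"{len(corners)} corner candidates above the threshold")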
Remote sensing (RS) is the process of detecting and monitoring the physical
characteristics of an area by measuring its reflected and emitted radiation at a
distance from the targeted area. Special cameras collect remotely sensed images of
the Earth, which help researchers sense things about the Earth [22]. This
information can be gathered by means of satellites or by cameras embedded in
aerial vehicles; aerial photogrammetry is a sub-area of remote sensing.
Aerial photogrammetry is the science of extracting and interpreting infor-
mation about objects and their environment using images and patterns registered
across the electromagnetic spectrum. This science is divided into two areas [23]: inter-
pretative and metric photogrammetry. Interpretative photogrammetry has as its main goal
the identification and reconnaissance of objects over a given area based on
qualities such as size, pattern, texture, contours, brightness and contrast. On the other
hand, metric photogrammetry extracts exact measurements from the image based on
the sensor information, measuring the relative positions among points on the
image, such as distances, angles, volumes and sizes; it is widely used in pla-
nimetry and altimetry. The information extracted or interpreted from the images can
be applied to topography, urban planning, archaeology and other sciences.
Figure 3.6 Projection of a tie point into two consecutive images acquired at camera positions C1 and C2 (image-plane coordinates r, s; scene coordinates X, Y, Z)
where XP, YP and ZP are the geographic coordinates of a point in the scene, λ is a
scale factor, R is the rotation matrix formed from the IMU measurements, r and
s are the coordinates in the image plane, f is the focal length, and Xc, Yc and Zc are
the geographic coordinates from the GPS.
Now that the tie point is known in geographic coordinates, its projection into the
coordinates of the next image, at instant t + Δt, is calculated by means of the
following equation:

[ r2 − r0 ]          [ XP − Xc1 ]
[ s2 − s0 ] = λ2 R2  [ YP − Yc1 ]   (3.19)
[   −f    ]          [ ZP − Zc1 ]
From the previous statements, one can see that the aerial photogrammetry uses
the collinearity equation with the information provided by the GNSS and the INS,
as implemented in [26] based on the tracking of tie points using the movements of
feature points (OF).
The tracking of tie points in the image uses the KLT method: the tie
point represented as P1 = (r, s) in Figure 3.6 is found at point P2 = (r + dx, s + dy)
in the next image, where dx and dy are calculated using the LK algorithm (3.3).
The integration of the tie point in the image according to (3.3) is:

f(x, y, Δt) = f(r2 + Δx, s2 + Δy)   (3.20)

where r2 and s2 are the tie point coordinates calculated from the collinearity equation (3.19).
So, to enhance the tracking of tie points [26], the integration of the OF (3.3)
with the calculation of the tie points based on (3.19) is proposed according to:

f(r, s, Δt) = f(x + (r2 − x) + Δx, y + (s2 − y) + Δy)   (3.21)
● Mission:
The type of mission that will be developed, for example, mapping, tracking,
recreational, search and rescue.
According to the above, there is a high degree of mutual dependence between the UAV
and the operator. The ARP constantly sends information to the operator to monitor its
systems and the environment around it. The accumulation of tasks and information can
cause stress and fatigue in the operator, and these factors could affect the success of the mission.
However, there are various tasks that the ARP can perform by itself without
overloading the operator, for example, detecting and avoiding obstacles or
tracking targets, so that the operator can concentrate on more complex decisions.
Situation awareness is defined as "the perception of the elements in the
environment within a volume of time and space, the comprehension of their
meaning and the projection of their status in the near future" [28]. In this definition,
we can find three main aspects that can be applied to unmanned systems.
Perception is the knowledge of all objects situated around the vehicle.
Comprehension is the capacity to establish when an object becomes a
possible threat.
Projection is the information that should be considered to evade the threat.
Situational awareness permits one to know what is happening in the sur-
roundings, why it is happening, what may happen next and what can be done
to make decisions and maintain control according to the situation.
3.4.1.1 Perception
The monitoring of the elements around the ARP is made by means of infor-
mation from sensors such as LIDAR, acoustic sensors, and optical and infra-red sensors.
However, optical sensors have many advantages, such as low cost, small size and
low weight.
In the perception stage it is necessary to know some characteristics of the
objects being filmed; these characteristics can be lines, contours, corners or
blob areas. To determine lines there are algorithms such as the Hough transform;
to determine corners we can use the SIFT, SURF, ORB or Harris algorithms. With
these feature points it is possible to determine the OF and, with it, the objects
inside the image can be isolated.
3.4.1.2 Comprehension
Of all the objects represented in the image, some may be threats and others
not. Determining the direction of movement of the objects in the scene
relative to the ARP is necessary to establish which of them are possible threats. The
determination of the direction of movement is known as the focus of
expansion (FOE).
Focus of expansion
The focus of expansion (FOE) is the point of congruence of the vectors that form
the movement of the objects in the whole image, and it lets one determine the
direction of the movement of the ARP with respect to each of them; the movement can
be rotational, translational or a combination of both (Figure 3.7).
The FOE can be calculated through the OF over the whole
image or on feature points; [30] proposes a solution based on the movement of
feature points, where V = ∂Z/∂t is the velocity of the ARP in the direction of the FOE and t is
the TOC.
3.4.1.3 Projection
The projection stage seeks to determine whether the obstacle ahead really
is a threat; if it is, a new path is generated to avoid the
obstacle.
The estimation of movement based on the fusion of different sensors with a
vision system is made through the projection of the movement of the objects in the
scene, which is recorded in the OF [35]. The process of calculating the movement of
the ARP in the image plane is known as egomotion, and the estimation of this
movement is conditioned by the camera model; in this chapter, we
apply the pinhole model. Figure 3.8 presents the camera model.
In Figure 3.8, C is the camera placed at the origin of the camera coordinate
system xyz, S is the scene that is being filmed, and f is the focal distance from C
until the origin of plane I (P lies on I). Since the camera provides information about
the structures being recorded, it is necessary to map the captured objects onto the
two-dimensional image plane, and the relation between u and Y, and the one between
v and X, become, respectively,

u/f = r/Z ⇒ u = f r/Z   and   v/f = s/Z ⇒ v = f s/Z   (3.26)
3.5.1 Egomotion
Egomotion is the estimation of the camera movement relative to the scene being
filmed. Using feature points of the scene represented in the image and
transforming from the image plane to the camera origin, we can find the
system movement with respect to a point on the ground. To estimate the move-
ment, it is necessary to consider that the Z axis of the camera is aligned with the Z
axis of the ARP's body and that the localisation of the camera coincides with the
IMU's position; additionally, the earth is considered flat and the altitude
and attitude are known.
The following equations were derived based on the pinhole camera model.
The velocity of the camera with respect to a point is determined by:

Ṙc = −Ṙe − ωce × R   (3.28)

where R = [X Y Z]ᵀ contains the geographic coordinates of a point in the scene. Defining
Ṙe = V_CAM = [Vx Vy Vz]ᵀ as the camera's velocity and ωce = ω_CAM = [wx wy wz]ᵀ as the
measurement of the IMU sensor, it follows that:

[ Ẋ ]     [ Vx ]   [ wy Z − wz Y ]
[ Ẏ ] = − [ Vy ] − [ wz X − wx Z ]   (3.29)
[ Ż ]     [ Vz ]   [ wx Y − wy X ]

Deriving the equations of the pinhole model (3.26) with respect to time
yields the system:

Ẋ = (Z/f) ṙ + (r/f) Ż   and   Ẏ = (Z/f) ṡ + (s/f) Ż   (3.30)

Combining (3.29) with (3.30) gives:

−(Vx + wy Z − wz Y) = (Z/f) ṙ − (r/f)(Vz + wx Y − wy X)   and
−(Vy + wz X − wx Z) = (Z/f) ṡ − (s/f)(Vz + wx Y − wy X)   (3.31)

Rearranging the above equations, the movement of the pixels in the coordinates
of the image becomes:

ṙ = −(f/Z)(Vx + wy Z − wz Y) + (r/Z)(Vz + wx Y − wy X)   and
ṡ = −(f/Z)(Vy + wz X − wx Z) + (s/Z)(Vz + wx Y − wy X)   (3.32)
For one feature point, this can be written in matrix form as:

[ ṙ ]   [ −f    0    r    rs/f        −(f + r²/f)    s  ]
[ ṡ ] = [  0   −f    s    f + s²/f    −rs/f         −r  ] [ Vx/Z  Vy/Z  Vz/Z  wx  wy  wz ]ᵀ   (3.33)

Now, defining

M(f, rn, sn) = [ −f    0    r    rs/f        −(f + r²/f)    s  ]
               [  0   −f    s    f + s²/f    −rs/f         −r  ]   (3.34)
According to the sections above, the determination of the OF is the concept that joins
the computer vision algorithms with the navigation systems. In this case study,
an architecture to fuse the OF with the INS/GPS navigation system is proposed.
The computation of the vision algorithms is costly, and it con-
sumes much memory to process and store the images; the process has real-time
characteristics so that it can respond rapidly to stimuli from the environment or the operator
[36]. In contrast to processors, FPGAs are devices whose circuits are designed by
the system designer. The developer decides which task handles the input, how the processing
stages are interconnected, and which tasks can be executed in parallel. This
characteristic is essential for computer vision work.
The initial step to merge the OF with the INS/GNSS is to detect the image
feature points that need tracking, via various extraction algorithms, e.g., SIFT,
SURF, ORB and Harris (please refer to Table 3.1 for computing times).
Table 3.1 Computation time (in seconds) of feature point detection for different image sizes

Resolution      SIFT     SURF     ORB      LK
320 × 420       0.59     0.69     0.048    0.039
1,024 × 768     11.315   12.274   0.2419   0.15
According to Table 3.1, SIFT and SURF determine the feature points in very similar
times: SIFT and SURF need 0.59 and 0.69 s, respectively, to determine feature points
on an image of 320 × 420 pixels. The computation time to determine the feature points
on an image of 1,024 × 768 pixels is about 11.315 s for SIFT and 12.274 s for SURF, so,
in general, the time increases exponentially when the size of the image increases. This
increase in time is due to the large number of feature points processed, as observed in
the figure.
In contrast, LK and ORB have very similar computation times, and both
are much smaller than the times for the SIFT and SURF techniques (Figure 3.9). For the
calculation of the feature points on an image of 320 × 420 pixels, ORB needs 0.048 s and
LK 0.039 s. If the resolution of the image increases to 1,024 × 768, the computation time
becomes 0.2419 s for ORB and 0.15 s for LK. The quantity of feature points
is minimal when compared to SIFT and SURF; hence, with the quantity of feature
points from ORB and LK, the OF can be computed in less time.
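A short, hedged sketch of such a timing comparison with OpenCV is given below; it is not the authors' benchmark, the test image name is an assumption, SIFT requires a recent OpenCV build, and SURF is omitted because it is patented and only available in opencv-contrib builds.

import time
import cv2

img = cv2.imread("test_frame.png", cv2.IMREAD_GRAYSCALE)

detectors = {
    "SIFT": cv2.SIFT_create(),              # needs OpenCV >= 4.4
    "ORB": cv2.ORB_create(nfeatures=500),
}
for name, det in detectors.items():
    t0 = time.perf_counter()
    keypoints = det.detect(img, None)
    print(f"{name}: {len(keypoints)} key-points in {time.perf_counter() - t0:.3f} s")

# Shi-Tomasi corners, the detector typically paired with LK tracking.
t0 = time.perf_counter()
corners = cv2.goodFeaturesToTrack(img, maxCorners=500, qualityLevel=0.01, minDistance=7)
print(f"corner detection for LK: {len(corners)} corners in {time.perf_counter() - t0:.3f} s")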
Embedded systems are physical units with few resources, and real-time char-
acteristics are necessary; for this reason, two candidates to implement the
algorithm in the FPGA are ORB and Harris (Figure 3.10). However, the
ORB is not invariant to rotation, so the Harris algorithm is the candidate.
(Block diagrams of the FPGA-based architecture: RAM/SDRAM interface, UART link (Rx/Tx) to MATLAB, control unit, Harris corner detector computing K = det(M) − 0.04·trace(M)² from the 3×3 derivative products Ixx, Iyy and Ixy summed over 5×5 windows, and a feature-tracking block producing the displacements dx and dy.)
where 24 states were defined to implement the Kalman filter, as described below:

X_NAV = [δr_N, δr_E, δr_D, δV_N, δV_E, δV_D, δψ_N, δψ_E, δψ_D]ᵀ
X_Acc = [∇_bx, ∇_by, ∇_bz, ∇_fx, ∇_fy, ∇_fz]ᵀ
X_Gyro = [ε_bx, ε_by, ε_bz]ᵀ
X_Grav = [δg_bx, δg_by, δg_bz]ᵀ
X_Ant = [δL_bx, δL_by, δL_bz]ᵀ

where δV, δr and δψ are the velocity, position and attitude errors, respectively, w_ie is the
earth rate vector, w_in is the angular rate vector of the true coordinate system with
respect to the inertial frame, w_en is the angular rate vector of the true coordinate
system with respect to the earth, ∇ is the accelerometer error vector, and δg is the error in
the computed gravity vector, as described by Ho et al. [37].
So, the observation vector is the difference between the position vector
calculated by the INS and the position provided by the GPS, as in the
observation equation:

Z_k = P_INS,k − P_GPS,k   (3.40)
To add the equations of the OF to the Kalman filter, [37] assumes that the
ground is flat, that the OF is the sum of the translational and angular velocity contributions,
that the ARP is flying at a constant height, and that there are no high dynamics. Therefore,
the simplified OF equation is:

Qx = Vx/H + wy   and   Qy = Vy/H + wx   (3.41)
where Qx and Qy are the OF components, Vx and Vy are the velocity components along
the camera body x and y axes, wx and wy are the rotational angular rates of the UAV
camera, and H is the ground height, measured from the camera to the
ground surface.
When the OF measurements are perturbed, the error model can be written as:

δQx = δVx/H + δwy − (Vx/H²) δH   and   δQy = δVy/H + δwx − (Vy/H²) δH   (3.42)
As the camera and the IMU are fixed to the body, the longitudinal and lateral
velocities of the camera are defined using the velocity in the NED coordinate
system as:

[ Vx ]   [  cos ψ   sin ψ ] [ V_N ]
[ Vy ] = [ −sin ψ   cos ψ ] [ V_E ]   (3.43)
Now, the height is defined as:

H = H0 + Hs + Hd   (3.44)

where H0 is the initial ground height, Hs is the ground height of the ground surface
during the UAV flight and Hd is the height variation due to UAV dynamics; differen-
tiating (3.44):

Ḣ = Ḣs − Vd   (3.45)
where Vd is the vertical velocity component in the NED coordinate system.
Now, substituting (3.44) and (3.43) into the perturbed model equation, we have
the state augmented for the Kalman filter (3.46), so the observation vector of the
Kalman filter according to [37] is:

Z_k = [ P_INS,k − P_GPS,k ]
      [          Z        ]   (3.46)
image pixels, similar to visible light cameras. It acquires high-resolution range data
fast [37,38].
As the PMD range camera delivers sufficient information about the obstacles, it
can estimate the trajectories of moving obstacles. These TOF cameras provide a 3D
point cloud consisting of a set of surface points in a 3D coordinate system (X, Y, Z)
for all entities in the field of view of the camera.
3D OF quantifies the motion of each volume voxel between adjacent volumes.
The moving patterns correspond to temporal variations of the image intensity.
Generally, determining the OF employs the smoothness constraint equation. The
spatiotemporal derivatives of the image intensity are used in differential techniques
to obtain the OF.
Differential methods can be local or global. Local techniques encompass the
optimisation of a local functional, like in the LK scheme. The global procedures
find the flow vector through minimisation of a global functional, as in Horn and
Schunck. Local methods are robust to noise but do not deliver dense OF. Global
techniques produce extremely dense OF fields, with much larger noise sensitivity.
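As a concrete illustration of the dense family of methods, the short sketch below computes a per-pixel flow field with OpenCV's Farneback algorithm, used here as a readily available dense alternative rather than the Horn and Schunck scheme itself; the frame names and parameter values are illustrative assumptions.

import cv2
import numpy as np

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Dense flow: one (dx, dy) vector per pixel. Positional arguments are
# pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)

magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print("mean flow magnitude (pixels):", float(np.mean(magnitude)))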
Any OF is the scene flow projection onto the image plane of a camera. In a
completely non-rigid world, the motions of the scene points may altogether be
independent of each other. Hence, the 3D scene motion corresponds to a dense 3D
vector field defined for all points on every single surface in the scene. This type of
OF requires the ability to handle a heavy computational load in a distributed cyber-
physical system [8–13,35,36,39–41].
composition materials. Hence, some features visible in one band may turn out to
be invisible in another band.
The second reason is that the point of focus varies due to wavelength because
the focal length of a lens is associated with its refractive index, which declines with
the rise of wavelength. Hence, most available MSI and HSI cameras can only be
well focussed on several bands and become unfocussed in other bands. So, different
feature points are extracted from each band, which produces different band-wise
3D models with varying scales and point sets preventing direct registration to form
a complete 3D model. Each 3D model from a set of MSI and HSI images can be the
result of merging all key-points from several bands for 3D reconstruction.
Nevertheless, the fine structural and spectral data from the band images are lost.
The use of MSI and HSI images demands a lot from the CPS concerning
distributed computing [38,42–45] and fast communication links [45,46]. Since
dimensionality is quite a burden, computational intelligence [47–50] is called for to
handle the computational load via software, hardware and hybrid frameworks.
A complete MSI and HSI 3D model combines all band-level 3D models into a
single one with some structural descriptor developed to characterise the spatial
relationship among points belonging to the same set, which can then match points
between two 3D models created at different bands.
Complete 3D models obtained from MSI and HSI images help a UAV detect
and avoid obstacles by generating band-wise 3D models with excellent structural and
spectral object data integrated into the fully reconstructed model. This exceeds
traditional 3D registration and reconstruction from RGB images because it brings
in accurate and efficient band-wise model merging to aid navigation, which
facilitates the analysis of how structural data from different band models
affect the whole scenery.
3.8 Conclusion
Throughout the chapter, the importance of CV algorithms to estimate the move-
ment of pixels in a pair of consecutive images, and thus obtain several cues about the UAV's
outer environment, was analysed. The OF can be integrated into the navigation
system to help improve the orientation and the tracking of tie points. We also
saw how the OF can help prevent collisions. Finally, a case study based on
the calculation of the OF through a hardware architecture relying on an FPGA was
examined to show how OF can be integrated into the navigation system using
an extended Kalman filter.
The camera is a fundamental piece in the development of any mis-
sion, and cameras have also taken a significant step forward in recent
times. Today's cameras have high resolution, and in the long term they are going to
be fully integrated into the navigation system with even better imaging character-
istics. These and other future advancements will require systems with faster pro-
cessing units and enough memory to store the imagery and execute more
cutting-edge algorithms compatible with them.
This chapter discussed a few algorithms to determine the OF and gave a
comprehensive literature survey. Nonetheless, many other algorithms can detect
feature points even faster than ORB or Harris, such as SUSAN, AKAZE and others,
which can be integrated into the OF calculation. To improve real-time perfor-
mance, it is crucial to know which of them are adequate to the application in mind
and, if necessary, better suited for an FPGA implementation.
It should be pointed out that information fusion from multimodality imaging
sources may require extensive distributed processing through the cloud, with the need
to balance communication and onboard and offboard processing [8–13,35,36,39–41].
References
[1] Eisenbeiss H, et al. A mini unmanned aerial vehicle (UAV): system over-
view and image acquisition. International Archives of Photogrammetry
Remote Sensing and Spatial Information Sciences. 2004;36(5/W1):1–7.
[2] Cigla C, Brockers R, and Matthies L. Image-based visual perception and
representation for collision avoidance. In: IEEE International Conference on
Computer Vision and Pattern Recognition, Embedded Vision Workshop; 2017.
[3] Bonin-Font F, Ortiz A, and Oliver G. Visual navigation for mobile robots: a
survey. Journal of Intelligent & Robotic Systems. 2008 Nov; 53(3):263–296.
Available from: http://dx.doi.org/10.1007/s10846-008-9235-4.
[4] Schmidt GT. Navigation sensors and systems in GNSS degraded and denied
environments. Chinese Journal of Aeronautics. 2015;28(1):1–10.
[5] Kerns AJ, Shepard DP, Bhatti JA, et al. Unmanned aircraft capture and
control via GPS spoofing. Journal of Field Robotics. 2014;31(4):617–636.
[6] Mota RLM, Ramos AC, and Shiguemori EH. Application of self-organizing
maps at change detection in Amazon Forest. In: 2014 11th International
Conference on Information Technology: New Generations (ITNG). IEEE; 2014.
p. 371–376.
[7] Martins MP, Medeiros FL, Monteiro MV, et al. Navegacao Aerea Autonoma
por Imagens. In: VI WAI – Workshop Anual de Pesquisa e Desenvolvimento
do Instituto de Estudos Avancados, Sao Jose dos Campos; 2006.
[8] Razmjooy N, Mousavi BS, Khalilpour M, and Hosseini H, Automatic
selection and fusion of color spaces for image thresholding. Signal, Image
and Video Processing. 2014;8(4):603–614.
[9] Mousavi BS, Soleymani F, and Razmjooy N. Color image segmentation
using neuro-fuzzy system in a novel optimized color space. Neural
Computing and Applications. 2013;23(5):1513–1520.
[10] Estrela VV, Magalhaes HA, and Saotome O. Total variation applications in
computer vision. In Handbook of Research on Emerging Perspectives in
Intelligent Pattern Recognition, Analysis, and Image Processing, pp. 41–64.
IGI Global, 2016.
[11] Mousavi B, Somayeh F, Razmjooy, and Soleymani F. Semantic image
classification by genetic algorithm using optimised fuzzy system based on
Zernike moments. Signal, Image and Video Processing. 2014;8(5):831–842.
[12] Razmjooy N, Estrela VV, and Loschi HJ. A survey of potatoes image seg-
mentation based on machine vision. In Applications of Image Processing and
Soft Computing Systems in Agriculture, pp. 1–38. IGI Global, 2019.
[13] Estrela VV, and Coelho AM. State-of-the art motion estimation in the con-
text of 3D TV. In: Multimedia Networking and Coding. IGI Global, 2013.
148–173. doi:10.4018/978-1-4666-2660-7.ch006.
[14] Silva Filho P. Automatic landmark recognition in aerial images for the
autonomous navigation system of unmanned aerial vehicles [dissertation].
ITA. Sao Jose dos Campos, (SP); 2016.
[15] Da Silva W. Navegacao autonoma de vant em perıodo noturno com images
infravermelho termal [dissertation]. INPE. Sao Jose dos Campos, (SP);
2016.
[16] Fabresse F, Caballero F, Merino L, et al. Active perception for 3D range-
only simultaneous localization and mapping with UAVs. In: Unmanned
Aircraft Systems (ICUAS), 2016 International Conference on. IEEE; 2016.
p. 394–398.
[17] Barron JL, Fleet DJ, and Beauchemin SS. Performance of optical flow
techniques. International Journal of Computer Vision. 1994;12(1):43–77.
[18] Yves Bouguet J. Pyramidal implementation of the Lucas-Kanade feature
tracker. Intel Corporation, Microprocessor Research Labs. 2000.
[19] Lowe DG, inventor; University of British Columbia, assignee. Method and
apparatus for identifying scale-invariant features in an image and use of
same for locating an object in an image, 1999-03-08.
[20] Funayamam R, inventor; Katholieke Universiteit Leuven, assignee. Method
and apparatus for identifying scale invariant features in an image and use of
same for locating an object in an image; 2006-04-28.
[21] Rublee E, Rabaud V, Konolige K, et al. ORB: An efficient alternative to
SIFT or SURF. In: 2011 IEEE international conference on Computer Vision
(ICCV). IEEE; 2011. p. 2564–2571.
[22] Science for a changing world [homepage on the Internet]. The Association;
c1995-2002. AMA Office of Group Practice Liaison. Available from: https://
www.usgs.gov/faqs/what-remote-sensing-and-what-it-used-0?qt-news_science_products=7#qt-news_science_products.
[23] Zomrawi N, Hussien MA, and Mohamed H. Accuracy evaluation of digital
aerial triangulation. International Journal of Engineering and Innovative
Technology. 2011;2(10):7–11.
[24] Cheng L, Chen SX, Liu X, et al. Registration of laser scanning point clouds:
A review. Sensors. 2018;18:1641.
[25] Munoz PM. Apoyo aereo cinematico y aerotriangulacion digital frente a los
sistemas de navegacion inercial; D.Sc. dissertation, 2004.
[26] Tanathong S, and Lee I. Using GPS/INS data to enhance image matching for
real-time aerial triangulation. Computers & Geosciences. 2014;72:244–254.
[27] Hartmann K, and Steup C. The vulnerability of UAVs to cyber attacks – an
approach to the risk assessment. In: 2013 5th International Conference on
Cyber Conflict (CyCon). IEEE; 2013. p. 1–23.
[28] Endsley MR. Design and evaluation for situation awareness enhancement.
In: Proceedings of the Human Factors Society Annual Meeting, vol. 32.
SAGE: Los Angeles, CA; 1988. p. 97–101.
[29] Chand BN, Mahalakshmi P, and Naidu V. Sense and avoid technology in
unmanned aerial vehicles: a review. In: 2017 International Conference
on Electrical, Electronics, Communication, Computer, and Optimization
Techniques (ICEECCOT), IEEE; 2017. p. 512–517.
[30] Sabatini AM. (2006). Quaternion-based extended Kalman filter for deter-
mining orientation by inertial and magnetic sensing. IEEE Transactions on
Biomedical Engineering, 53, 1346–1356.
[31] Zsedrovits T, Zarándy Á, Vanek B, Peni T, Bokor J, and Roska T. (2011).
Collision avoidance for UAV using visual detection. 2011 IEEE
International Symposium of Circuits and Systems (ISCAS), 2173–2176.
[32] Chao H, Gu Y, and Napolitano MR. (2013). A survey of optical flow tech-
niques for UAV navigation applications. 2013 International Conference on
Unmanned Aircraft Systems (ICUAS), 710–716.
[33] Gageik N, Benz P, and Montenegro S. (2015). Obstacle detection and col-
lision avoidance for a UAV with complementary low-cost sensors. IEEE
Access, 3, 599–609.
[34] Wang C, Ji T, Nguyen T, and Xie L. (2018). Correlation flow: robust optical
flow using kernel cross-correlators. 2018 IEEE International Conference on
Robotics and Automation (ICRA), 836–841.
[35] Estrela VV, Monteiro ACB, França RP, Iano Y, Khelassi A, and Razmjooy
N Health 4.0: Applications, Management, Technologies and Review. Med
Tech J, 2019;2(4):262–276, http://medtech.ichsmt.org/index.php/MTJ/article/
view/205.
[36] Gupta S, Girshick RB, Arbeláez PA, and Malik J Learning rich features from
RGB-D images for object detection and segmentation. Proceedings of 2014
ECCV. 2014.
[37] Ho HW, De Wagter C, Remes BDW, and de Croon GCHE, Optical-flow
based self-supervised learning of obstacle appearance applied to MAV
landing, Robotics and Autonomous Systems, Vol. 100, 2018, pp. 78–94,
ISSN 0921-8890, https://doi.org/10.1016/j.robot.2017.10.004.
[38] Kaldestad KB, Hovland G, and Anisi DA, 3D sensor-based obstacle detec-
tion comparing octrees and point clouds using CUDA, Modeling,
Identification and Control, Vol. 33, No. 4, 2012, pp. 123–130, ISSN
1890–1328 doi:10.4173/mic.2012.4.1
[39] Zhou Y, Li H, and Kneip L Canny-VO: visual odometry with RGB-D
cameras based on geometric 3-D–2-D edge alignment. IEEE Transactions on
Robotics, 2019;35:184–199.
[40] Peng HX, Liang L, Shen X, and Li GY Vehicular communications: a net-
work layer perspective. IEEE Transactions on Vehicular Technology, 68,
2018, 1064–1078.
[41] Brahmbhatt S, Amor HB, and Christensen HI Occlusion-aware object loca-
lization, segmentation and pose estimation. Proceedings 2015 BMVC, 2015.
[42] Aroma RJ, and Raimond K. (2019) Intelligent land cover detection in multi-
sensor satellite images. In: Ane B, Cakravastia A, and Diawati L (eds)
Proceedings of the 18th Online World Conference on Soft Computing in
Industrial Applications (WSC18). WSC 2014. Advances in Intelligent
Systems and Computing, vol. 864. Springer, Cham.
[43] Li C, Yang SX, Yang Y, et al. Hyperspectral remote sensing image classi-
fication based on maximum overlap pooling convolutional neural network.
Sensors. 2018;18:3587
[44] Zhao H, Wang Z, Jia G, Li X, and Zhang Y. Field imaging system for
hyperspectral data, 3D structural data and panchromatic image data mea-
surement based on acousto-optic tunable filter. Optics Express.
2018;26(13):17717–17730.
[45] Estrela VV, Saotome O, Loschi HJ, et al. Emergency response cyber-
physical framework for landslide avoidance with sustainable electronics.
Technologies. 2018;6:42. doi:10.3390/technologies6020042.
[46] Coppola M, McGuire KN, Scheper KY, and Croon GC. On-board commu-
nication-based relative localization for collision avoidance in micro air
vehicle teams. Autonomous Robots. 2018;42:1787–1805.
[47] Hemanth DJ, and Estrela VV. (2017). Deep learning for image processing
applications. Advances in Parallel Computing Series, vol. 31, IOS Press,
ISBN 978-1-61499-821-1 (print), ISBN 978-1-61499-822-8 (online).
[48] de Jesus MA, Estrela VV, Saotome O, and Stutz D. (2018) Super-resolution
via particle swarm optimization variants. In: Hemanth J, and Balas V (eds)
Biologically Rationalized Computing Techniques For Image Processing
Applications. Lecture Notes in Computational Vision and Biomechanics,
vol. 25. Springer, Cham.
[49] Jia Z, and Sahmoudi M. A type of collective detection scheme with
improved pigeon-inspired optimization. International Journal of Intelligent
Computing and Cybernetics. 2016;9:105–123.
[50] Duan H, and Luo Q. New progresses in swarm intelligence-based compu-
tation. IjBIC. 2015;7:26–35.
Chapter 4
Introduction to navigation and intelligence
for UAVs relying on computer vision
Suraj Bijjahalli1 and Roberto Sabatini1
1 RMIT University, Bundoora, Australia
4.1 Introduction
An overall reduction in the manufacturing cost of cameras and in the form
factor of processors has led to the advent of a large market for commercial-off-the-
shelf (COTS) VBN systems for UAS operations. A wide range of VBSs is
increasingly employed in UAS operations for sensing the surrounding environment,
localising the host platform and tracking nearby intruder platforms for functions
such as separation assurance and collision avoidance. Scanning LIDARs are an
alternative sensing modality capable of perceiving the surrounding environment at
a high level of detail and are typically employed in surveying and mapping appli-
cations. Although advances have been made in lowering the form factor of LIDAR
systems to support implementation on small UAS platforms, VBSs are better suited
to this application owing to their relatively low size, weight and power, the passive
nature of their operation and their low cost. A typical UAS system architecture is
shown in Figure 4.1. The UAS comprises both the unmanned aerial vehicle (UAV)
and the ground control station (GCS). Outputs from the navigation system includ-
ing vehicle state estimates and/or low-level cues from sensors are used to drive
guidance strategies and control. The navigation system outputs are also transmitted
to the GCS via a telemetry downlink. Uplink from the GCS facilitates transmission
of manual control commands by the remote pilot.
Within the navigation system, VBN sub-systems are typically employed in
tandem with a suite of sensors. Common sensor systems are the global navigation
satellite system (GNSS) and the inertial navigation system (INS) to enhance UAS
state estimation and to increase navigation system redundancy in the event of
individual sensor failure or loss. For example, VBSs are used for vehicle state
estimation in environments such as urban canyons where GNSS is completely or
intermittently unavailable.
Raw visual data is acquired and processed to compute vehicle states in a stan-
dalone configuration. However, a more common practice is to integrate the visual
sub-system with other sensors via a suitable data-fusion algorithm. Depending on the
functionality demanded of the visual sub-system, processing modules of varying
complexity can be implemented, ranging from simple visual servoing (VS) routines
to computationally intensive visual odometry (VO) techniques. For example, attitude
estimation during the landing phase of a fixed-wing UAV can be implemented using
a simple image processing module that employs edge detection methods and optical
flow (OF) techniques to extract the horizon and calculate attitude based on the
apparent motion observed [75, 76].
Image processing is a branch of computer science concerned with processing
digital signals that represent images acquired by a digital camera or a scanner [1–9].
[Figure 4.1 Typical UAS architecture: the UAV (navigation, guidance and control systems and aircraft dynamics) connected via a communication link to the ground control station (human–machine interface and mission management system).]
In contrast, VO provides the full navigation state vector but requires more
computation owing to the requirement to extract visual features and track them
across consecutive frames.
The following section presents a brief overview of commonly employed
methods for VBN and introduces basic terminology associated with them.
In pose estimation, the task is to estimate the vehicle pose R relative to a global coordinate system, given a priori locations s of environmental features and online observations r of those features.
Visual odometry is a visual dead-reckoning technique in which the vehicle states R_k are estimated incrementally at each discrete time-step, given the vehicle states R_{k-1} in the previous time step and the apparent motion of the observed scene. VO
is a subset of the larger structure-from-motion (SFM) problem and is an inter-
mediate step in any visual-simultaneous localisation and mapping (V-SLAM)
algorithm (Figure 4.4).
Each of these methods will be expanded in greater detail, followed by an
exploration of likely future work in this field of study.
[Figures: position-based visual servoing, in which feature extraction and 3D pose reconstruction feed a Cartesian control law driven by the error between the reference and observed pose; image-based visual servoing, in which the control law is computed in the image plane from the error between reference and observed features; and a key-frame-based visual route from the start to the end of the route.]
Equation (4.1) expresses the assumption that the intensity I does not change over incremental shifts in pixel coordinates and over time:
I(x + \Delta x,\, y + \Delta y,\, t + \Delta t) = I(x, y, t) \quad (4.1)
A Taylor series expansion applied to the right-hand side of (4.1) yields:
I(x + \Delta x,\, y + \Delta y,\, t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t \quad (4.2)
By comparison of (4.1) and (4.2), (4.3) is written as:
\frac{\partial I}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial I}{\partial y}\frac{\Delta y}{\Delta t} + \frac{\partial I}{\partial t}\frac{\Delta t}{\Delta t} = 0 \quad (4.3)
Or, more concisely,
\frac{\partial I}{\partial x}v_x + \frac{\partial I}{\partial y}v_y + \frac{\partial I}{\partial t} = 0 \quad (4.4)
Equation (4.4) is a single equation in two unknowns, which gives rise to the aperture problem, and additional constraints must be applied in order to solve it. Different methods impose different constraints, including differential methods, region-based matching methods and energy-based methods, to mention a few. A description of each method is beyond the scope of this book; a review of OF methods for robotic navigation is found in [13]. The oldest known (and hence most widely implemented) algorithms are the classical differential techniques such as the Lucas–Kanade method. Differential methods compute the spatial and temporal derivatives of image intensity to obtain OF. The method is generally recognised as being more robust to image noise than another classical differential technique, the Horn–Schunck method, which optimises energy globally to obtain a dense flow field. A relevant method for the target application in this review is the image interpolation method intended for small UAVs [14], which has more recently been shown to be capable of a maximum rate of 20,000 OF vectors per second [15]. A bio-inspired approach based on the centring response of bees has been implemented in [16,17],
approach based on the centring response of bees has been implemented in [16,17],
wherein OF derived from images captured by side-facing cameras and an omni-
directional camera was used to determine the relative distance between the aerial
platform and the buildings in an urban canyon. The phenomenon is analogous to
observing the environment through the window of a moving car – nearby features
are perceived as moving quickly, whereas distant features are perceived as moving
more slowly. The flow field is, therefore, a function of the distance to the observed
object. The concept is illustrated in Figure 4.9.
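As an illustration only (not a listing from the works cited above), the following Python sketch shows how sparse pyramidal Lucas–Kanade OF between two consecutive frames could be computed; it assumes OpenCV and NumPy are available, and the corner-detection and window parameters are arbitrary choices.

```python
# Minimal sketch: sparse pyramidal Lucas-Kanade optical flow between two frames.
import cv2
import numpy as np

def sparse_optical_flow(prev_gray, curr_gray, max_corners=200):
    # Detect good features to track in the previous frame (Shi-Tomasi corners).
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:
        return np.empty((0, 2)), np.empty((0, 2))
    # Track the corners into the current frame with pyramidal Lucas-Kanade.
    p1, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, p0, None,
        winSize=(21, 21), maxLevel=3,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
    good = status.ravel() == 1
    return p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)

# Example usage: flow vectors (dx, dy) for each successfully tracked feature.
# prev_pts, curr_pts = sparse_optical_flow(frame_k_minus_1, frame_k)
# flow = curr_pts - prev_pts
```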
The estimated lateral distance was used as an input to the heading control loop
to produce yaw rate commands and centre the vehicle in the canyon. A similar
approach can be used for fixed-wing UAS landing applications wherein the OF will
increase with proximity to the ground. A limitation with this approach is that
absolute range measurements are not provided through OF alone. The emphasis, in
this case, is in guiding the vehicle relative to objects in the environment, rather than
obtaining an absolute position estimate. However, the problem can be addressed by
jointly implementing a range sensing modality along with OF estimation in a single platform.
Figure 4.9 Use of OF to centre a UAV between two planar walls. The measured optic flow is a function of the lateral distance to each wall
For example, the UAS architecture implemented in [18] combines esti-
mation and stereo-vision in a single platform to safely traverse urban canyon sce-
narios. One drawback with OF is its reliance on the translational motion of the
camera to generate flow fields. Additionally, unless the motion of the UAV is
precisely known, this technique cannot produce absolute range measurements. It is
therefore not suitable for situations in which a rotorcraft UAV is required to move
slowly or hover in place near obstacles, such as for surveillance or structure
inspections. A key problem observed in the research in [18] occurred when negotiating 90° bends or T-junctions in urban canyons. Under these conditions, the planar vertical sidewall assumption no longer applied, leading to an inability to control using OF alone. The perception and control algorithms were designed assuming perfectly planar canyon walls; real-world operations are likely to violate this assumption, and no tests in such conditions have been published. Other pro-
minent examples of OF used to drive aircraft control directly include terrain fol-
lowing and landing operations in static and dynamic environments [19,20]. Apart
from VS applications which may or may not explicitly output UAS PVA, state
estimation using OF has been investigated in the literature. The performance of
OF-based motion estimation (specifically platform rotation rate) was compared
against integrated GPS-INS systems in [21]. Besides OF, vision-based control of
the aircraft relative to a perceived object can also be accomplished through feature
extraction and tracking over an image sequence. The error signal for velocity
Few implementations of real-time UAS SLAM are found in the literature owing to computational limitations. The rest of this section focuses on reviewing VO implementations for UAS
navigation, beginning with the fundamentals of the algorithm. The formulation of
the VO problem is illustrated in Figure 4.10.
For simplicity of explanation, the camera coordinate frame C is also assumed
to be the body reference frame of the aerial vehicle. Images I_{k-1}, I_k and I_{k+1} are
captured at time-steps k-1, k and k+1, respectively, as the UAS moves in the
environment. Features in the 3D environment (edges, corners, etc.) are projected
onto the image plane of the camera and must be tracked across the successive
frames. Consider time-steps k-1 and k: given the presence of features that are
tracked across the two successive frames, VO is essentially the computation of the
rigid body transformation T_{k-1,k}, comprising translation and rotation, between
the poses C_{k-1} and C_k. Once the transformation is computed, it is concatenated
with C_{k-1} to output the current pose C_k. Without loss of generality, the same
explanation describes VO when a stereo-camera is employed to acquire images
rather than a monocular camera. In this case, image-pairs are acquired at each
time-step instead of a single image. Additionally, the depth to the observed
features can be estimated with each captured image-pair, which is a significant
advantage over monocular vision, where scale cannot be directly estimated.
Regardless of whether monocular or stereo-vision is employed, the steps in each
cycle are shown in Figure 4.11.
Feature detection is performed on each set of images captured at each given
time-step, followed by matching the same set of features over successive frames
(correspondence). Optimization routines estimate motion between frames with the
tracked features forming constraints. Each type of optimisation routine depends on
the type of feature correspondences performed between frames and has distinct
advantages and limitations. The speed and accuracy of feature correspondences are
typically boosted through the fusion of visual and inertial sensor measurements in
order to exploit their complementary nature. In [33], the tight coupling of the two
[Figure 4.10 Formulation of the VO problem: camera poses C_k, C_{k+1} and C_{k+2} at successive time-steps, related by the transformations T_{k,k+1} and T_{k+1,k+2}. Figure 4.11 VO pipeline: acquired frames, feature detection, feature tracking over frames, motion estimation and bundle adjustment.]
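The following Python sketch illustrates, under simplifying assumptions, one monocular VO iteration of the kind described above: ORB features are detected and matched between I_{k-1} and I_k, the essential matrix is estimated with RANSAC, and the relative rotation and (up-to-scale) translation are recovered. It assumes OpenCV and a known camera intrinsic matrix K, and it is not the implementation of any specific work cited here.

```python
# Minimal sketch of one monocular VO iteration with ORB features and RANSAC.
import cv2
import numpy as np

def vo_step(img_prev, img_curr, K):
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # Essential matrix with RANSAC to reject outlier correspondences.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    # Relative pose T_{k-1,k}: rotation R and unit-norm translation t.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t

# Pose concatenation: C_k is obtained by composing C_{k-1} with T_{k-1,k} built from (R, t).
```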
Z = Tf/d \quad (4.6)
where T is the distance, or baseline, between the left and right cameras, f is the focal length of each camera and d is the disparity between the pixel coordinates of the feature in the two images. From (4.6), it can be seen that depth perception in stereo-vision depends on two intrinsic parameters of the stereo-vision camera: the focal length and the baseline.
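A minimal sketch of depth recovery from disparity using (4.6) is given below; it assumes OpenCV block matching, and the baseline and focal-length values are placeholders rather than figures from the text.

```python
# Sketch of depth-from-disparity using Z = T*f/d on rectified grayscale images.
import cv2
import numpy as np

def stereo_depth(left_gray, right_gray, baseline_m=0.12, focal_px=700.0):
    # Block-matching disparity (in pixels); parameters are illustrative only.
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = baseline_m * focal_px / disparity[valid]  # Z = T f / d
    return depth
```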
Stereo-vision has been employed for altitude estimation and subsequent regula-
tion in autonomous helicopters [39].
Figure 4.13 Stereo-vision concept. The same scene is observed from two cameras. Features that are common to both cameras have different pixel coordinates in either image – the disparity between the coordinates is a function of the depth of the observed feature
A similar approach was utilised for altitude estimation in [40], although the results could not be experimentally validated in real-
time. Maintaining an accurate scale, therefore, becomes increasingly difficult as the
range to the perceived object grows, effectively reducing utility to that of a monocular
setup. Additionally, vibration due to platform dynamics can cause small deformations
in the stereo-rig, potentially resulting in large triangulation errors. Some research
aimed at addressing these gaps in stereo-VO has begun to emerge recently [41,42].
Figure 4.16 Cross-correlation between the captured image and the stored image
to lead to cost-effective MFIR sensors for UAS navigation systems in the future
[59–61]. Similarly, the development of cost-effective hardware is likely to lead to
greater uptake of multi- and hyper-spectral imaging for UAS applications [61–65]
in the future. The methods presented in this chapter can be extended to aid multi-
UAV coordination and control scenarios, which remains an area of significant
research interest [66–71].
4.4 Conclusions
This chapter has identified and presented several visual cues and methods com-
monly applied in visual navigation for UAS. Methods employed in the literature
have been reviewed and concisely presented to introduce the reader to the field of
study. VBSs present several advantages for UAS navigation, primarily by the
amount of information they can capture and their small size, weight and power
requirements. Despite this, however, several limitations exist that suggest a
potential direction for future research – sensitivity to environmental conditions
such as occlusion and variation of ambient light can hamper the availability and
accuracy of this sensing modality. Artificial intelligence-based methods are likely
to be beneficial by supporting the development of visual systems that learn and
adapt to a number of operating conditions. One of the primary future requirements
for UAS access to unsegregated airspace is the need to certify that a given navi-
gation sensor can meet a stipulated level of performance with the desired prob-
ability. This will require rigorous quantitative evaluation of visual sensors implemented in standalone systems and, even more so, in integrated navigation systems, along with the development of detailed error models that cover the various failure modes and, more importantly, the sensitivity of data-fusion algorithms to these errors.
References
[1] C. Hwang (2015). Image processing and applications based on visualizing
navigation service. ICDIP.
[2] N. Razmjooy, B.S. Mousavi, F. Soleymani, and M.H. Khotbesara (2013). A
computer-aided diagnosis system for malignant melanomas. Neural
Computing and Applications, 23(7–8), 2059–2071.
[3] V.V. Estrela, and A.E. Herrmann (2016). Content-based image retrieval
(CBIR) in remote clinical diagnosis and healthcare. In Encyclopedia of
E-Health and Telemedicine (pp. 495–520). IGI Global.
[4] N. Razmjooy, M. Ramezani, and N. Ghadimi (2017). Imperialist competitive
algorithm-based optimization of neuro-fuzzy system parameters for automatic
red-eye removal. International Journal of Fuzzy Systems, 19(4), 1144–1156.
[5] P. Moallem, and N. Razmjooy (2012). Optimal threshold computing in
automatic image thresholding using adaptive particle swarm optimization.
Journal of Applied Research and Technology, 10(5), 703–712.
[64] C. Li, S.X. Yang, Y. Yang, et al. Hyperspectral remote sensing image
classification based on maximum overlap pooling convolutional neural net-
work. Sensors 2018, 18, 3587.
[65] H. Zhao, Z. Wang, G. Jia, X. Li, and Y. Zhang (2018). Field imaging system
for hyperspectral data, 3D structural data and panchromatic image data
measurement based on acousto-optic tunable filter. Optics Express, 26, 13,
17717–17730.
[66] A. Lioulemes, G. Galatas, V. Metsis, G.L. Mariottini, and F. Makedon
(2014). Safety challenges in using AR. Drone to collaborate with humans in
indoor environments. PETRA.
[67] P. Schmuck (2017). “Multi-UAV collaborative monocular SLAM,” in 2017
IEEE International Conference on Robotics and Automation (ICRA),
pp. 3863–3870.
[68] X. Li, and T. Chen (2017). “Collaborative visual navigation for UAVs in
blurry environment,” in 2017 IEEE International Conference on Signal
Processing, Communications and Computing (ICSPCC), pp. 1–6.
[69] S. Zhao, Z. Li, R. Cui, Y. Kang, F. Sun, and R. Song (2017). Brain–machine
interfacing-based teleoperation of multiple coordinated mobile robots. IEEE
Transactions on Industrial Electronics, 64, 5161–5170.
[70] S.M. Mahi (2018). Multi-Modal Multi sensor Interaction between Human
and Heterogeneous Multi-Robot System. ICMI.
[71] C.T. Recchiuto, A. Sgorbissa, and R. Zaccaria (2016). Visual feedback with
multiple cameras in a UAVs human-swarm interface. Robotics and
Autonomous Systems, 80, 43–54.
[72] A. I. Mourikis and S. I. Roumeliotis, “A multi-state constraint Kalman filter
for vision-aided inertial navigation,” in Proc. 2007 IEEE International
Conference on Robotics and automation, 2007, pp. 3565–3572.
[73] D. D. Diel, P. DeBitetto, and S. Teller, “Epipolar constraints for vision-aided
inertial navigation,” in 2005 Seventh IEEE Workshops on Application of
Computer Vision. WACV/MOTIONS’05, 2005;1:221–228.
[74] J. W. Langelaan, “State estimation for autonomous flight in cluttered envir-
onments,” Journal of Guidance Control and Dynamics, 30, 1414, 2007.
[75] M. B. Rhudy, Y. Gu, H. Chao, and J. N. Gross, “Unmanned aerial vehicle
navigation using wide-field optical flow and inertial sensors,” Journal of
Robotics, 2015, 1, 2015.
[76] F. Kendoul, I. Fantoni, and K. Nonami, “Optic flow-based vision system for
autonomous 3D localization and control of small aerial vehicles,” Robotics
and Autonomous Systems, 57, 591–602, 2009.
[77] M. E. Antone and S. Teller, “Automatic recovery of relative camera rota-
tions for urban scenes,” in 2000 IEEE Conference on Computer Vision and
Pattern Recognition, 2000, 282–289.
[78] C. Demonceaux, P. Vasseur, and C. Pégard, “UAV attitude computation by
omnidirectional vision in urban environment,” in 2007 IEEE International
Conference on Robotics and Automation, 2007, pp. 2017–2022.
Chapter 5
Modelling and simulation of UAV systems
Narendran Muraleedharan1 and Daniel S. Cohen2
1 Aptus Engineering, Inc., Scottsdale, AZ 85260, USA
2 Embry-Riddle Aeronautical University, Prescott, AZ 86301, USA
and test control systems [1–3]. The use of simulation systems allows for developers
and engineers to tune control system gains, parameters and even experiment with
different control strategies in very short periods.
the entry of UAVs into service and how simulation moulded the design and testing
of modern drones.
frames, generalised equations of motion and specific models that are commonly
used to model both fixed-wing and rotary-wing UAV dynamics.
\frac{d^n y}{dt^n} + a_{n-1}\frac{d^{n-1} y}{dt^{n-1}} + \cdots + a_1\frac{dy}{dt} + a_0 y = b_0 u \quad (5.1)
where the left-hand terms describe the physical rates of the system being modelled, such as d^{n-1}y/dt^{n-1} multiplied by a constant a_{n-1}, and the right-hand term characterises an input to the system.
Producing accurate differential equations is the first of many critical steps in system modelling, enabling designers to suggest potential improvements.
\dot{x} = Ax + Bu \quad (5.2)
y = Cx + Du \quad (5.3)
where x is the state, y is the output vector, u is the input or control vector, A is the
system matrix, B is the input matrix, C is the output matrix and D is the feed-
forward matrix.
State-space modelling aims to define “phase variables”, in which each subsequent state variable is defined as the derivative of the previous state variable. Using the state-space equations, second-order differential equations can be rewritten as coupled first-order differential equations.
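As a simple illustration (with assumed coefficient values, not values from the text), the sketch below writes a second-order system in the state-space form of (5.2) and (5.3) using the phase variables x1 = y and x2 = ẏ.

```python
# Sketch: y'' + a1*y' + a0*y = b0*u rewritten as x_dot = A x + B u, y = C x + D u.
import numpy as np

a0, a1, b0 = 4.0, 0.8, 1.0          # example coefficients (assumed values)
A = np.array([[0.0, 1.0],
              [-a0, -a1]])          # system matrix
B = np.array([[0.0], [b0]])         # input matrix
C = np.array([[1.0, 0.0]])          # output matrix (output is y = x1)
D = np.array([[0.0]])               # feed-forward matrix

def state_derivative(x, u):
    # x is a 2x1 column vector [y, y_dot]; u is the scalar input.
    return A @ x + B @ np.atleast_2d(u)

# Output at any instant: y = C @ x + D @ np.atleast_2d(u)
```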
Figure 5.1 ECI (i), ECEF (e) and Navigation (n) reference frames [33]
Figure 5.1 depicts the navigation (n) reference frame at longitude λ and latitude φ. The transformation from the ECEF frame to the n frame is R_z(λ)R_x(φ). Rotation matrices are discussed in more detail in Section 5.3.3.2.
[Figure: aircraft body axes, with roll about the x-axis, pitch about the y-axis and yaw about the z-axis.]
Traditionally, φ, θ and ψ are not restricted to denoting rotation about a single axis; rather, they are a compact way of indicating the first, second and third rotation in a sequence.
To describe the orientation of a frame attached to a rigid body, modellers compute sequential rotations by way of rotation matrices, using sequences such as ZYZ or ZYX. The elementary rotations about the z-, y- and x-axes are given in (5.6)–(5.8), respectively [36]:
R_z(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad (5.6)
R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix} \quad (5.7)
R_x(\psi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\psi & -\sin\psi \\ 0 & \sin\psi & \cos\psi \end{bmatrix} \quad (5.8)
In many dynamic modelling fields, computing composite rotation matrices by successive multiplication of ZYZ matrices is sufficient. Aeronautical modellers, however, often prefer to use ψ, θ and φ for the Z, Y and X rotations, respectively. When these are multiplied in order, they produce the final orientation of the aircraft:
R^F = R_z(\psi)\,R_y(\theta)\,R_x(\phi) \quad (5.9)
where F denotes the rotation relative to the fixed frame connected to the centre of mass of the aircraft [36]. Equation (5.9) shows the relatively simple nature of rotation matrices. However, Euler angle-based rotation matrices can misrepresent the orientation of the aircraft if a gimbal lock occurs.
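A short sketch of the elementary rotations (5.6)–(5.8) and their ZYX composition is given below; NumPy is assumed and the function names are illustrative.

```python
# Sketch of elementary rotation matrices and the aeronautical Z-Y-X composition.
import numpy as np

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def Ry(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def Rx(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def body_to_fixed(phi, theta, psi):
    # Yaw-pitch-roll (Z-Y-X) sequence commonly used for aircraft attitude.
    return Rz(psi) @ Ry(theta) @ Rx(phi)
```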
5.3.3.3 Quaternions
Although Euler angles are widely used along with rotation matrices, there are many
times they are not optimal due to inherent problems with Euler angles. The most
prominent issues are related to the gimbal lock condition. In a gimbal lock
condition, at least two of the three axes of rotation are aligned. When inertial
measurement units (IMUs) and mechanical systems are unable to handle the
alignment logically, the gimbal lock takes effect [37]. Once in effect, attitude
reference systems no longer report usable data.
The use of a four-parameter model helps to overcome these flaws. A model based on the unit quaternion can describe the orientation of a body without gimbal alignment issues. The rotation can initially be defined by a single angle θ and a unit vector of rotation r = [r_x, r_y, r_z]^T with respect to an origin frame. Equation (5.11) shows the relationship between θ, r and the unit quaternion:
Q = \{\eta, \epsilon\} \quad (5.10)
\eta = \cos(\theta/2), \qquad \epsilon = r\,\sin(\theta/2) \quad (5.11)
so that \eta^2 + \epsilon_x^2 + \epsilon_y^2 + \epsilon_z^2 = 1. This unit relationship provides the namesake of the unit quaternion and shows that unit quaternions are invariant with respect to the rotation matrix and its transpose.
Many times modellers like to view the rotation matrix of a given quaternion. This can be seen in the following equation [36]:
R(\eta, \epsilon) = \begin{bmatrix} 2(\eta^2 + \epsilon_x^2) - 1 & 2(\epsilon_x\epsilon_y - \eta\epsilon_z) & 2(\epsilon_x\epsilon_z + \eta\epsilon_y) \\ 2(\epsilon_x\epsilon_y + \eta\epsilon_z) & 2(\eta^2 + \epsilon_y^2) - 1 & 2(\epsilon_y\epsilon_z - \eta\epsilon_x) \\ 2(\epsilon_x\epsilon_z - \eta\epsilon_y) & 2(\epsilon_y\epsilon_z + \eta\epsilon_x) & 2(\eta^2 + \epsilon_z^2) - 1 \end{bmatrix} \quad (5.14)
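The following sketch builds a unit quaternion from an angle–axis pair, as in (5.10) and (5.11), and converts it to the rotation matrix of (5.14); NumPy is assumed.

```python
# Sketch of the unit-quaternion representation and its rotation matrix.
import numpy as np

def quat_from_axis_angle(axis, theta):
    r = np.asarray(axis, dtype=float)
    r = r / np.linalg.norm(r)                              # unit rotation vector
    return np.cos(theta / 2.0), r * np.sin(theta / 2.0)    # (eta, eps)

def rotation_from_quat(eta, eps):
    ex, ey, ez = eps
    return np.array([
        [2*(eta**2 + ex**2) - 1, 2*(ex*ey - eta*ez),     2*(ex*ez + eta*ey)],
        [2*(ex*ey + eta*ez),     2*(eta**2 + ey**2) - 1, 2*(ey*ez - eta*ex)],
        [2*(ex*ez - eta*ey),     2*(ey*ez + eta*ex),     2*(eta**2 + ez**2) - 1]])

# Example: 90-degree rotation about the body z-axis.
# eta, eps = quat_from_axis_angle([0, 0, 1], np.pi / 2)
# R = rotation_from_quat(eta, eps)
```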
Certain assumptions are made in order to simplify the modelling process [31].
Assumption 1: The UAV is assumed to be a rigid body, i.e., the distance between
any two arbitrary points on the UAV is fixed and time-invariant.
m\,\frac{dV_P}{dt} = mg + F_A + F_T \quad (5.19)
where V_P is the velocity of the body, g is the acceleration due to gravity, F_A is the aerodynamic force on the body and F_T is the thrust force on the body [31].
Conservation of linear momentum
Equation (5.19) is expressed with respect to the ECI frame and is more useful when expressed with respect to the UAV body frame; this transformation can be made in order to derive the conservation of linear momentum equations in the body frame. The analogous conservation of angular momentum equations in the body frame take the form
J\dot{\omega} + \omega \times (J\omega) = \begin{bmatrix} L_A + L_T \\ M_A + M_T \\ N_A + N_T \end{bmatrix}
where J is the inertia tensor of the aircraft; L, M and N are the moments about the x, y and z directions of the body frame, respectively; sub-script A indicates moments due to aerodynamic effects and sub-script T indicates moments due to thrust [31].
The equations of motion can also be derived using the Euler–Lagrange formulation,
\frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot{g}}\right) - \frac{\partial L}{\partial g} = f
where L is the Lagrangian of the system, defined as the difference between the total kinetic energy K and the potential energy U, and f is a vector of generalised forces acting along the generalised coordinates g [37].
The kinetic energy of a system can be computed with reference to an inertial
reference frame using the equation:
K = \sum_i \left( \frac{1}{2} m_i\, \dot{r}_i^T \dot{r}_i + m_i\, \dot{r}_i^T\, {}^I\dot{T}_i\, r_{i,cm} + \frac{1}{2}\,\mathrm{trace}\!\left({}^I\dot{T}_i\, \hat{J}_i\, {}^I\dot{T}_i^T\right) \right) \quad (5.27)
where r_i is a vector from the origin of the inertial frame of reference to the frame of body i, m_i is the mass of body i, {}^I T_i is the transformation matrix from the body frame i to the inertial reference frame, r_{i,cm} is the vector from the origin of the body frame i to the centre of mass of body i measured with respect to the body frame and \hat{J}_i is the pseudo-inertia tensor of body i, which is defined as follows:
\hat{J}_i = \frac{1}{2}\,\mathrm{trace}(J_i)\, I - J_i \quad (5.28)
where Ji is the true inertia tensor, and I is a three-dimensional identity matrix [37].
The potential energy of the system can be computed with the equation:
U = -\sum_i m_i\, g^T \left( r_i + {}^I T_i\, r_{i,cm} \right) \quad (5.29)
Substituting the kinetic and potential energies into the Euler–Lagrange equations yields the standard manipulator form
H(g)\,\ddot{g} + d(g, \dot{g}) + g(g) = f \quad (5.30)
where H(g) is the system mass matrix written as a function of the generalised coordinates, d(g, \dot{g}) is the vector of Coriolis and centripetal forces, and g(g) is the vector of gravitational forces on the system [37]. Equation (5.30) can be rearranged to solve for the generalised accelerations \ddot{g} as follows:
\ddot{g} = H(g)^{-1}\left[ f - d(g, \dot{g}) - g(g) \right] \quad (5.31)
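A minimal sketch of solving the rearranged dynamics for the generalised accelerations is shown below; H, d and grav are placeholders for a specific vehicle model, and NumPy is assumed.

```python
# Sketch: g_ddot = H(g)^-1 * (f - d(g, g_dot) - grav(g)), solved without an explicit inverse.
import numpy as np

def generalised_accelerations(H, d, grav, g, g_dot, f):
    # H, d and grav are user-supplied callables for a specific vehicle model.
    rhs = f - d(g, g_dot) - grav(g)
    return np.linalg.solve(H(g), rhs)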
using data collected from flight to fit the models. NASA has used empirical model-matching techniques since the 1950s, and researchers at the NASA Dryden centre started using system identification techniques to approximate flight models. Maximum likelihood estimation was the first system identification method used, and such techniques have also aided in modelling the dynamics and developing simulations for the space shuttle [31].
These modelling and approximation methods, originally developed for manned aircraft, can be and are to date being applied to UAVs of similar configuration.
5.3.5.2 Multi-rotors and VTOL
Multi-rotors and Vertical Take-Off and Landing (VTOL) UAVs are fledgelings in
the industry. Therefore, many new approaches have not been explicitly developed
to model their aerodynamics. The assumption of low-speed flight, however, allows
us to develop an approximate flight dynamics model. Due to the physical nature of
multi-rotors and the multiple rigid bodies, a robotics approach can be taken to
modelling the dynamics.
A robotics approach
On setting up the coordinate frames for multi-rotors and VTOL aircraft, the system
kinetic and potential energy equations can be derived using (5.27) and (5.28). The
Euler–Lagrange method can then be used to derive the equations of motion as done
for classical robotic systems.
Modelling of forces and moments
Forces and moments of multi-rotors and VTOL aircraft can be modelled using
simple propeller equations, free-body diagrams and simplified aerodynamic drag
equations. For instance, propeller thrust and torque can be mapped to motor
revolutions per minute (RPM) via a simple quadratic relation.
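A sketch of the quadratic propeller map mentioned above follows; the coefficients kT and kQ are illustrative values that would, in practice, be identified from bench tests.

```python
# Sketch of a quadratic map from rotor speed to propeller thrust and torque.
def propeller_forces(rpm, kT=1.2e-7, kQ=2.0e-9):
    omega_sq = rpm ** 2
    thrust = kT * omega_sq   # thrust grows quadratically with rotor speed
    torque = kQ * omega_sq   # reaction torque follows the same quadratic law
    return thrust, torque
```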
In either case, once the equations of motion are solved, the solutions can be used by modellers for simulation purposes. Many use these simulations to experiment by testing new inputs and variables to determine how the model responds. Moreover, simulations are lower-cost programmes to finance than designing and building full systems or deploying code on an experimental flying vehicle.
The following subsections will outline techniques used in the industry to
generate solutions to dynamic models.
5.4.1.1 Euler method
Prior to modern computational methods, differential equations were integrated analytically with pen and paper, and the explicit Euler method was likewise applied by hand before early computers. To be solved by modern numerical methods, the equations must be discretised and then evaluated at successive time steps. Classically, this method was used to solve the initial value problem and the associated differential equations [40].
For a differential equation such as (5.36), a solution can be formed by rearranging the definition of the derivative found in (5.37):
\frac{dy}{dt} = \dot{y} = f(t, y(t)), \qquad y(t_0) = y_0 \quad (5.36)
\dot{y}(t) = \lim_{\delta \to 0} \frac{y(t + \delta) - y(t)}{\delta} \quad (5.37)
To simplify (5.37), \delta can be defined as a small positive time step, so that the approximation (5.38) can be used:
\dot{y} \approx \frac{y(t + \delta) - y(t)}{\delta} \quad (5.38)
Therefore, after rearranging (5.38) into (5.39), and substituting t_{n+1} = t_n + \delta and y_n = y(t_n), the solution y_{n+1} can be found in (5.40) [41] as follows:
y(t + \delta) - y(t) \approx \dot{y}\,\delta \quad (5.39)
y_{n+1} \approx \dot{y}_n\,\delta + y_n \quad (5.40)
By evaluating (5.40) over a series of time steps, the integral of the original differential equation is obtained. Applying the method a second time, starting from a second-order differential equation in a dynamic model, yields the position of the system.
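The explicit Euler update of (5.40) can be sketched as follows; the example ODE and step size are assumptions made purely for illustration.

```python
# Sketch of the explicit Euler scheme: y_{n+1} = y_n + delta * f(t_n, y_n).
def euler_integrate(f, y0, t0, delta, steps):
    t, y = t0, y0
    history = [(t, y)]
    for _ in range(steps):
        y = y + delta * f(t, y)   # forward Euler update
        t = t + delta
        history.append((t, y))
    return history

# Example: y_dot = -2*y, y(0) = 1, integrated with a 0.01 s step.
# trajectory = euler_integrate(lambda t, y: -2.0 * y, 1.0, 0.0, 0.01, 500)
```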
Starting with the initial value problem (5.36), an estimate of the derivative of (5.36) can be found. Assuming the initial conditions are known at time t_0, we obtain the following equation:
k_1 = f(t_0, y(t_0)) \quad (5.41)
where y_1(t) is the approximation of y at time t = t_0 + \delta (endpoint method) or t = t_0 + \delta/2 (midpoint method). The next step is to use k_1 to estimate k_2; the following equations for the RK-2 algorithm outline the use of the midpoint method [42]:
y_1\!\left(t_0 + \frac{\delta}{2}\right) = y(t_0) + \frac{\delta}{2}\,k_1 \quad (5.42)
Then, estimating the derivative at the midpoint produces k_2 as follows:
k_2 = f\!\left(t_0 + \frac{\delta}{2},\; y_1\!\left(t_0 + \frac{\delta}{2}\right)\right) \quad (5.43)
Using k_1 and k_2, the estimate for the full time step can be found:
y(t_0 + \delta) = y(t_0) + k_2\,\delta \quad (5.44)
To fully realise the RK-2 algorithm, k_1, y_1, k_2 and y must be recursively calculated throughout the simulation [41].
The more prolific RK method is RK-4, which incorporates higher-order terms from the Taylor series expansion about t_0 by computing k_3 and k_4 in addition to k_1 and k_2.
Once again, the method starts with the initial value problem (5.36). Assuming the initial conditions are known at time t_0, over one time step of length \delta:
k_1 = f(t_0, y(t_0)) \quad (5.45)
k_2 = f\!\left(t_0 + \frac{\delta}{2},\; y(t_0) + k_1\frac{\delta}{2}\right) \quad (5.46)
k_3 = f\!\left(t_0 + \frac{\delta}{2},\; y(t_0) + k_2\frac{\delta}{2}\right) \quad (5.47)
k_4 = f(t_0 + \delta,\; y(t_0) + k_3\,\delta) \quad (5.48)
Lastly, the estimate of y(t_0 + \delta) is found using the following equation [41]:
y(t_0 + \delta) = y(t_0) + \frac{k_1 + 2k_2 + 2k_3 + k_4}{6}\,\delta \quad (5.49)
Using the RK-4 integration sequentially on second-order differential equations
derived from dynamic models allows for simulations of physical bodies imple-
mented in more heuristic terms for modellers and engineers.
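A sketch of one RK-4 step, following (5.45)–(5.49), is shown below; applying it in cascade to the second-order equations of a dynamic model yields velocity and position.

```python
# Sketch of a single RK-4 step for y_dot = f(t, y).
def rk4_step(f, t, y, delta):
    k1 = f(t, y)
    k2 = f(t + delta / 2.0, y + k1 * delta / 2.0)
    k3 = f(t + delta / 2.0, y + k2 * delta / 2.0)
    k4 = f(t + delta, y + k3 * delta)
    return y + delta * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Example: propagate for N steps.
# y, t = y0, t0
# for _ in range(N):
#     y = rk4_step(f, t, y, delta)
#     t += delta
```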
5.5 Conclusion
This chapter reviewed the need for modelling and simulation in the unmanned aircraft
industry, its development, how it is used today and the fundamental concepts that
allow its use. Furthermore, technological advancements in computing power bring
about many more use cases and improvements in the simulation. Dynamic modelling
and simulation has become a critical component of any UAV design and development
project and continues to provide many cost and time benefits.
References
[1] Ivaldi S, Padois V, and Nori F. Tools for dynamics simulation of robots:
A survey based on user feedback. Paris, France: Sorbonne Universite; 2014.
Report No.: arXiv:1402.7050vl. Supported by the EU Project CODYCO
(FP7-ICT-2011-9, No. 600716).
[2] Wu X, Figueroa H, and Monti A. Testing of digital controllers using real-
time hardware in the loop simulation. In: 2004 IEEE 35th Annual Power
Electronics Specialists Conference (IEEE Cat. No.04CH37551). vol. 5;
2004. p. 3622–3627.
[3] Muraleedharan N, Isenberg DR, and Gentilini I. Recreating planar free-
floating environment via model-free force-feedback control. In: 2016 IEEE
Aerospace Conference; 2016. p. 1–12.
[4] Carter JR. A business case for modeling and Simulation. U.S. Army
Aviation and Missile Research, Development, and Engineering Center;
2011. Report No.: RD-AS-01-02.
[5] Pappu MR. Online Model Predictive Control of a Robotic System by
Combining Simulation and Optimization [Master of Science in Technology];
2015.
[6] Wu J, Wang W, Zhang J, et al. Research of a kind of new UAV training
simulator based on equipment simulation. In: Proceedings of 2011 International
Conference on Electronic Mechanical Engineering and Information Technology.
vol. 9; 2011. p. 4812–4815.
[7] Insitu. Insitu – ICOMC2 – University Training Program; 2018. Available
from: https://www.insitu.com/information.
[8] Simlat. Simlat has delivered an advanced UAS Training and Simulation
Classroom to Embry-Riddle Aeronautical University campus at Prescott,
AZ. 2017.
[9] Simlat. Drexel University to use Simlat C-STAR Simulation System in
ASSURE human factors research. Drexel University at Philadelphia, PA. 2017.
[10] Simlat. Simlat delivered advanced simulation-based research lab to
Macquarie University in Sydney, Australia. 2017.
[11] Ma O, Flores-Abad A, and Boge T. Use of industrial robots for hardware-in-
the-loop simulation of satellite rendezvous and docking. Acta Astronautica.
2012;81(1):335–347.
[12] Ackroyd JAD. Sir George Cayley: The invention of the aeroplane near
Scarborough at the time of Trafalgar. Journal of Aeronautical History.
2011;1(06):130–136.
[13] Bilstein RE. Flight in America. The Johns Hopkins University Press; 1994.
[14] Air SN, Museum S. The NACA/NASA Full Scale Wind Tunnel; 2015.
Available from: https://www.airandspace.si.edu/stories/editorial/nacanasa-
full-scale-wind-tunnel.com.
[15] Chambers J. The Role of Dynamically Scaled Free-Flight Models in Support
of NASA’s Aerospace Programs. Library of Congress; 2009.
[16] Day DA. Computers in Aviation; 2002. Available from: https://www.cen-
tennialofflight.net/essay/Evolution_of_Technology/Computers/Tech37.htm.
[17] Bochannek A. Why Analog Computers? 2013. Available from: https://www.
computerhistory.org/atchm/why-analog-computer/.com.
[18] Wallmann R. Analog Computer; 2007. Available from: https://www.smecc.
org/analog_computers.htm.
[19] Small JS. The Analogue Alternative: The Electronic Analogue Computer In
Britain and USA, 1930–1975. Routledge; 2001.
[20] Copper NR. X-15 Analog Flight Simulation Program; 1961. Available from:
https://www.omputer.org/csdl/proceedings/afips/1961/5058/00/50580623.pdf.
[21] Stengel RF. Flight Dynamics. Princeton University Press; 2004.
[22] Stamp J. World War 1:100 Years later; 2013. Available from: https://www.
smithsonianmag.com/arts-culture/uumanned-drones-have-been-around-since-
world-war-i16055939/.
[23] Miller R. The First Drones, Used in World War 1; 2003. Available from:
https://io9.gizmodo.com/the-first-drones-used-in-world-war-i-453365075.
[24] Jeffries Z. Charles Franklin Kettering 1876–1958. National Academy of
Sciences; 1960.
[25] Turcker FJR. The Pioneers Aviation and Aeromodelling-Interdependent
Evolution and Histories; 2005. Available from: www.cite.monash.edu.au/
hargrave/righter6_part1.html.
[26] Diaz M. The Father of Inertial Guidance; 2005. Available from: http://www.
nmspacemuseum.org/halloffame/index.php?type¼search.
[27] Harris-Para B. 100 Years of Sustained Power Flight History; 2003. Available
from: https://www.faa.gov/education/educators/curriculum/k12/media/K-12_
One_Hundred_Years_of_Sustained_Flight.pdf.
[28] Jung D, and Tsiotras P. Modeling and Hardware-in-the-Loop Simulation for
a Small Unmanned Aerial Vehicle. In: AIAA Infotech Aerospace 2007
Conference and Exhibit; 2007.
[29] Fields D, editor, and Galilean Relativity. University of New Mexico; 2015.
[30] Landau LD, and Lifshitz EM. Mechanics. Pergamon Press; 1960.
[31] Napolitano MR. Aircraft Dynamics: From Modeling to Simulation. John
Wiley and Sons, Inc.; 2012. Physics 262-01.
[32] Colasurdo G. Astrodynamics. Politecnico di Torino, Space Exploration and
Development Systems; 2006.
6.1 Introduction
A large number of commercial use-cases for small size, weight, power and cost
(SWaP-C) unmanned aircraft systems (UAS) have emerged over the last five years.
A number of these applications, including package delivery and surveillance, typically involve flight over densely populated areas, thereby incurring the risk of injury and/or liability in the event of degraded navigation system performance. Most UAS rely
on the global navigation satellite system (GNSS) to support drift-free absolute
positioning and can be examined using the cyber-physical system paradigm [1].
However, GNSS-based positioning is subject to several errors and failure modes that
compromise its accuracy, integrity and availability. In particular, operations in dense
urban environments are vulnerable to outright loss of service due to low satellite
visibility and misleading information due to signal multipath, consequently posing a
higher risk of unstable guidance. Over the past decade, visual sensors have emerged
as desirable candidates for inclusion in UAS sensor suites. Highly miniaturised colour
CMOS cameras are quickly becoming low-cost, high-performance commodities with
low weight and form-factor, allowing their deployment on small aerial platforms. A
large body of knowledge has developed on the implementation of visual navigation
sensors on unmanned aircraft either as standalone systems or more frequently as part
of integrated multi-sensor systems. The typical strategy is to augment conventional
sensor suites that comprise GNSS, inertial sensors, magnetometer and barometric
altimeter in order to improve state observability and robustness to individual sensor
failures. This chapter commences by presenting the fundamentals of data-fusion
algorithms most commonly employed in unmanned aircraft such as the extended
Kalman filter (EKF), and to a lesser extent, the unscented Kalman filter (UKF).
Different sensor integration architectures are presented, primarily focusing on the
fusion of visual sensors. A review of the literature in this domain was conducted to capture seminal work and significant outcomes, highlighting the current state of the art and directions for future research.
models. The EKF equations are now described in state-space notation by a set of
first-order non-linear differential equations:
\dot{x} = f(x, w) \quad (6.1)
where x \in \mathbb{R}^n denotes the system state vector, f(x) is the non-linear function of the states and w \in \mathbb{R}^n represents a zero-mean random process. The matrix of the process noise, Q \in \mathbb{R}^{n \times n}, is given by:
Q = E\!\left[ w w^T \right] \quad (6.2)
z = h(x, v) \quad (6.3)
z_k = h(x_k, v_k) \quad (6.5)
F = \left.\frac{\partial f(x)}{\partial x}\right|_{x = \hat{x}} \quad (6.6)
H = \left.\frac{\partial h(x)}{\partial x}\right|_{x = \hat{x}} \quad (6.7)
where \hat{x} represents the mean value. The fundamental matrix is approximated by the Taylor series expansion as follows:
\Phi_k = I + F T_s + \frac{F^2 T_s^2}{2!} + \frac{F^3 T_s^3}{3!} + \cdots \quad (6.8)
where T_s is the sampling time and I is the identity matrix. The Taylor series is often truncated to first order:
\Phi_k \approx I + F T_s \quad (6.9)
The overview of the steps in an iteration of the EKF is illustrated in Figure 6.1.
[Figure 6.1 Overview of one EKF iteration: prediction of the state \hat{x}_t^- = \Phi\hat{x}_{t-1}^+ and its covariance P_t^- = \Phi P_{t-1}^+\Phi^T + Q, computation of the Kalman gain K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}, and correction of the state and covariance using the measurement z_t.]
The prediction of the navigation state x and its covariance P at time step k from
the previous time step through the state-transition matrix \Phi_k is given by:
\hat{x}_k^- = \Phi_k\, \hat{x}_{k-1}^+ \quad (6.10)
P_k^- = \Phi_k\, P_{k-1}^+\, \Phi_k^T + Q \quad (6.11)
where the (-) and (+) superscripts denote the process before and after incorporating the measurement, respectively, and the (\hat{\cdot}) symbol denotes an estimate of the state. Q is the covariance of the process noise. The computation of the Kalman gain K_k is given by:
K_k = P_k^- H^T \left( H P_k^- H^T + R \right)^{-1} \quad (6.12)
which is a function of the state covariance, the measurement noise covariance R and the measurement matrix H. The measurement is incorporated in the correction step to obtain the optimal estimate of the state \hat{x}_k^+ and its covariance P_k^+:
\hat{x}_k^+ = \hat{x}_k^- + K_k \left[ z_k - h(\hat{x}_k^-) \right] \quad (6.13)
P_k^+ = (I - K_k H)\, P_k^- \quad (6.14)
\hat{x}_k^+ and P_k^+ are then used recursively in the next iteration of the filter. Despite its widespread use in aircraft navigation, the EKF is subject to limitations, prompting the development of alternative implementations of the KF. One of these methods is the UKF.
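A minimal sketch of one EKF iteration implementing (6.10)–(6.14) is given below; it assumes NumPy and user-supplied state-transition matrix Φ, measurement function h and measurement Jacobian H, and is intended only to make the recursion concrete.

```python
# Sketch of one EKF iteration: prediction, Kalman gain and correction.
import numpy as np

def ekf_iteration(x_prev, P_prev, Phi, Q, z, h, H, R):
    # Prediction of the state and its covariance, (6.10)-(6.11).
    x_pred = Phi @ x_prev
    P_pred = Phi @ P_prev @ Phi.T + Q
    # Kalman gain, (6.12).
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    # Correction with the measurement z, (6.13)-(6.14).
    x_upd = x_pred + K @ (z - h(x_pred))
    P_upd = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return x_upd, P_upd
```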
\hat{x}_{k|k-1} = \sum_{i=1}^{s} w_i\, \chi_k^i
P_{k|k-1} = Q_k + \sum_{i=1}^{s} w_i \left( \chi_k^i - \hat{x}_{k|k-1} \right)\left( \chi_k^i - \hat{x}_{k|k-1} \right)^T
● Determine the sigma points \chi_k^1, \ldots, \chi_k^s and weights w_1, \ldots, w_s to match the mean \hat{x}_{k|k-1} and covariance matrix P_{k|k-1}.
● Compute the transformed sigma points y_k^i = h(\chi_k^i), \; i = 1, \ldots, s
[Figure: UKF processing flow – initialisation of the state vector and covariance, followed by calculation of the weights and generation of the sigma points.]
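The sketch below illustrates sigma-point generation and the unscented transform that underpin the UKF; the scaled parameterisation (alpha, beta, kappa) is a common choice and not a value taken from the text.

```python
# Sketch of scaled sigma-point generation and the unscented transform.
import numpy as np

def sigma_points(x, P, alpha=1e-3, beta=2.0, kappa=0.0):
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)       # matrix square root
    chi = np.vstack([x, x + S.T, x - S.T])      # 2n + 1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    return chi, wm, wc

def unscented_transform(chi, wm, wc, func, noise_cov):
    # Propagate each sigma point through func and recombine mean and covariance.
    y = np.array([func(c) for c in chi])
    y_mean = wm @ y
    diff = y - y_mean
    y_cov = noise_cov + (wc[:, None] * diff).T @ diff
    return y_mean, y_cov
```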
[Figures: centralised multi-sensor integration, in which sensors 1 to n feed navigation processors and a single integration filter that outputs the integrated navigation solution and sensor corrections; and federated integration, in which a reference sub-system and local navigation processors feed an integration filter that outputs the integrated navigation solution and resets the local solutions.]
[Figures: loosely coupled visual–inertial integration, in which feature extraction and tracking feed camera motion and 3D structure estimation that are combined with the inertial navigation (IMU mechanisation) PVA; and tightly coupled integration, in which extracted features and IMU measurements enter a single data-fusion filter that outputs PVA and predicts feature positions.]
The saving in computational expense is due to the relatively small number of ele-
ments in the estimated state vector. This comparison is illustrated in Figure 6.8, which
compares the typically estimated states between the two architectures. The inclusion
of visual features in the tightly coupled architecture renders the architecture optimal
but ultimately difficult to implement due to the computational overhead.
In [8], a hybrid (centralised/federated) loosely coupled architecture was
developed which employed a vision-based navigation sub-system to augment
attitude estimation of a fixed-wing UAV during the approach and landing phase.
The top-level architectural pipeline is illustrated in Figure 6.9. The architecture
comprises the following sensors: visual-inertial-GNSS-aircraft dynamics model
(VIGA).
An aircraft dynamics model (ADM) is essentially a virtual sensor (i.e.
knowledge-based module) that augments the state vector by computing the UAV
trajectory and attitude motion. Both three degrees-of-freedom (3-DoF) and six
degrees-of-freedom (6-DoF) ADM can be employed in the virtual sensor design.
Two integration schemes were developed differing primarily in terms of the inte-
gration filters, either an EKF or a UKF.
The conceptual architecture of the visual sub-system is shown in Figure 6.10.
The sub-system comprises basic image-processing routines to extract dominant
edges from the observed scene, namely, the horizon and the runway markers. Pitch
and roll are estimated relative to the horizon by comparing the images captured
online with priori-stored keyframe images of the runway. Lateral position is
computed as a deviation from the runway centre-line (Figure 6.11).
OF extraction from the captured images was used to estimate the aircraft body
rates. A formal definition of OF was provided in [30–32] as the 2-D velocity field
describing the apparent motion or the relative 3-D motion between the camera and
the environment. The OF generated from movement is the result of translational
and rotational motions. Tracking the changes or shifts in the images using these two
components can give information about the motion of the camera and the shape of
the environment it is viewing.
[Figure 6.8 States estimated in the loosely and tightly coupled architectures: inertial sensor biases, visual odometry scale and the visual features f_1, ..., f_n.]
Figure 6.9 Multi-sensor navigation system architecture for a small UAV (adopted from [8])
Figure 6.11 Attitude and relative position estimation based on edge extraction and image comparison [8]
For a feature at depth z observed by a camera translating with velocity \dot{r} and rotating with body rates w, the OF is given by
\begin{bmatrix} \dot{u} \\ \dot{v} \end{bmatrix} = \frac{1}{z}\begin{bmatrix} -f & 0 & u \\ 0 & -f & v \end{bmatrix}\dot{r} + \begin{bmatrix} \dfrac{uv}{f} & -\left(f + \dfrac{u^2}{f}\right) & v \\ f + \dfrac{v^2}{f} & -\dfrac{uv}{f} & -u \end{bmatrix} w \quad (6.15)
where u and v are the pixel coordinates of an observed 3D point, f is the focal length of the camera and w and r are the vectors of body rates and position, respectively. If the observed point is on the horizon, z is large, and the first term on the right-hand side of (6.15) tends to zero, leading to a simplified form:
\begin{bmatrix} \dot{u} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} \dfrac{uv}{f} & -\left(f + \dfrac{u^2}{f}\right) & v \\ f + \dfrac{v^2}{f} & -\dfrac{uv}{f} & -u \end{bmatrix} \begin{bmatrix} w_x \\ w_y \\ w_z \end{bmatrix} \quad (6.16)
The sub-system was used to provide measurement updates to augment an EKF-
based estimator.
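As an illustration, the sketch below recovers the body rates from measured horizon-point OF by stacking the rows of (6.16) for all tracked points and solving in a least-squares sense; NumPy is assumed and the function is hypothetical rather than the authors' implementation.

```python
# Sketch: least-squares body-rate recovery from horizon-point optical flow.
import numpy as np

def body_rates_from_flow(uv, flow, f):
    # uv: (N, 2) pixel coordinates; flow: (N, 2) measured (u_dot, v_dot); f: focal length (px).
    rows, rhs = [], []
    for (u, v), (du, dv) in zip(uv, flow):
        rows.append([u * v / f, -(f + u**2 / f),  v])   # u_dot row of (6.16)
        rows.append([f + v**2 / f, -u * v / f,   -u])   # v_dot row of (6.16)
        rhs.extend([du, dv])
    A = np.asarray(rows)
    b = np.asarray(rhs)
    omega, *_ = np.linalg.lstsq(A, b, rcond=None)
    return omega   # [wx, wy, wz]
```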
The measurement model in the EKF incorporating direct measurements of VBS-based roll and pitch and measurements of optical flow is given by:
\begin{bmatrix} \phi \\ \theta \\ \dot{u}_1 \\ \dot{v}_1 \\ \dot{u}_2 \\ \dot{v}_2 \\ \vdots \\ \dot{u}_i \\ \dot{v}_i \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & \dfrac{u_1 v_1}{f} & -\left(f + \dfrac{u_1^2}{f}\right) & v_1 \\ 0 & 0 & f + \dfrac{v_1^2}{f} & -\dfrac{u_1 v_1}{f} & -u_1 \\ 0 & 0 & \dfrac{u_2 v_2}{f} & -\left(f + \dfrac{u_2^2}{f}\right) & v_2 \\ 0 & 0 & f + \dfrac{v_2^2}{f} & -\dfrac{u_2 v_2}{f} & -u_2 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & \dfrac{u_i v_i}{f} & -\left(f + \dfrac{u_i^2}{f}\right) & v_i \\ 0 & 0 & f + \dfrac{v_i^2}{f} & -\dfrac{u_i v_i}{f} & -u_i \end{bmatrix} \begin{bmatrix} \phi \\ \theta \\ w_x \\ w_y \\ w_z \end{bmatrix} \quad (6.17)
[Figure: vision-based attitude correction – the vision-based navigation sensor and processor combine the VBN attitude with the IMU attitude to produce the corrected attitude.]
Figure 6.13 Error time series of VIGA attitude estimation during the approach and landing phase: (a) roll error; (b) pitch error; and (c) yaw error
[Figure: two-stage EKF visual–inertial fusion – OF prediction and measurement updates from the camera (EKF-1) are combined with inertial measurements for egomotion and structure estimation (EKF-2), whose outputs feed guidance and control.]
Figure 6.15 Search area prediction using inertial measurements to reduce block matching computation
of view. The standard method for model estimation from a set of data affected by
outliers is a RANdom SAmple Consensus (RANSAC). It consists of randomly
selecting a set of data points, computing the corresponding model hypothesis and
verifying this hypothesis on all the other data points. The solution is the hypothesis
with the highest consensus. The number of iterations (N) necessary to guarantee a
robust outlier removal is:
N = \frac{\log(1 - p)}{\log\left[1 - (1 - e)^s\right]} \quad (6.18)
where s is the number of data points from which the model can be computed, e is
the percentage of outliers in the dataset and p is the probability of success
requested. Figure 6.16 shows the number of iterations (N) with respect to the
number of points (s) necessary to estimate the model. The values in the figure are computed for a probability of success of 99 per cent and an outlier percentage of 50 per cent.
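A short sketch of (6.18) follows; the example values mirror the figure (99 per cent success probability, 50 per cent outliers).

```python
# Sketch of the RANSAC iteration count N from equation (6.18).
import math

def ransac_iterations(p=0.99, e=0.5, s=8):
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - e) ** s))

# Example: required iterations for s = 1..8 model estimation points.
# for s in range(1, 9):
#     print(s, ransac_iterations(0.99, 0.5, s))
```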
References
[24] Y. Song, S. Nuske, and S. Scherer, “A multi-sensor fusion MAV state esti-
mation from long-range stereo, IMU, GPS and barometric sensors,” Sensors,
vol. 17, p. 11, 2016.
[25] N. Razmjooy, B. S. Mousavi, M. Khalilpour, and H. Hosseini. Automatic
selection and fusion of color spaces for image thresholding. Signal, Image
and Video Processing, vol. 8(4), pp. 603–614, 2014.
[26] D. J. Hemanth, and V. V. Estrela. Deep learning for image processing
applications, Adv in Parallel Computing Series, vol. 31, IOS Press, 2017.
ISBN 978-1-61499-821-1 (print), ISBN 978-1-61499-822-8 (online).
[27] V. V. Estrela, H. A. Magalhaes, and O. Saotome. “Total variation applica-
tions in computer vision.” In Handbook of Research on Emerging
Perspectives in Intelligent Pattern Recognition, Analysis, and Image
Processing, pp. 41–64. IGI Global, 2016.
[28] B. Mousavi, F. Somayeh, N. Razmjooy, and F. Soleymani. Semantic image
classification by genetic algorithm using optimised fuzzy system based on
Zernike moments. Signal Image and Video Processing, vol. 8, no. 5, pp. 831–
842, 2014.
[29] N. Razmjooy, V. V. Estrela, and H. J. Loschi. “A survey of potatoes image
segmentation based on machine vision.” In Applications of Image Processing
and Soft Computing Systems in Agriculture, pp. 1–38. IGI Global, 2019.
[30] M. A. de Jesus, and V. V. Estrela. Optical flow estimation using total least
squares variants. Oriental Journal of Computer Science and Technology,
2017;10(3):563–579. http://dx.doi.org/10.13005/ojcst/10.03.03
[31] V. V. Estrela, and A. M. Coelho. State-of-the-art motion estimation in the
context of 3D TV. In R. Farrugia, and C. Debono (Eds.), Multimedia
Networking and Coding (pp. 148–173). Hershey, PA: IGI Global, 2013.
http://dx.doi.org/10.4018/978-1-4666-2660-7.ch006
[32] H. R. Marins, and V. V. Estrela. On the use of motion vectors for 2D and 3D
error concealment in H.264/AVC Video. In Feature Detectors and Motion
Detection in Video Processing (pp. 164–186). Hershey, PA: IGI Global,
2017. http://dx.doi.org/10.4018/978-1-5225-1025-3.ch008
Chapter 7
Vision-based UAV pose estimation
Paulo Silva Filho1, Elcio Hideiti Shiguemori2,
Osamu Saotome1, and Jairo Panetta1
1 DCTA-ITA-IEEA, Sao Jose dos Campos, SP, Brazil
2 IEAv – Institute of Advanced Studies, S. J. Campos, SP, Brazil
As the use of unmanned aerial vehicles increased, studies regarding their autono-
mous flight became an academic field of great interest for researchers. Until
recently, most studies based their developments using an inertial measurement unit
(IMU) and a global navigation satellite system (GNSS) as the main sensors to
calculate and estimate the UAV's pose. These sensors, however, have several limitations that can affect navigation and, therefore, the fully autonomous operation of the system. Images captured during flight, computer vision algorithms, and photogrammetry concepts have become a core source of data to estimate the UAV's pose in real time, thereby composing new alternative or redundant navigation systems. Several algorithms have been proposed in the scientific commu-
nity, each one working better in specific situations and using different kinds of
imaging sensors (active and passive sensors). This chapter describes the main
visual-based pose estimation algorithms and discusses where they best apply and
when each fails. Fresh results depict the development of new strategies that will
overcome the remaining challenges of this research field.
7.1 Introduction
Unmanned aerial vehicles (UAVs) became a highly popular technology in recent
years. Their applicability has expanded to several civil applications such as
monitoring [1], agriculture [2], search and rescue missions [3], remote sensing [4],
photogrammetry [5], construction inspections [6], entertainment [7], and many others.
UAVs thus became a significant and strategic technology, drawing the attention of the scientific community to studies on how to make these platforms more stable, intelligent, reliable, and, above all, autonomous.
In terms of stability and reliability, the navigation system (NS) of these vehi-
cles plays an important part, and it is essential in order to develop a fully autono-
mous UAV. A good NS must provide to the control system of the UAV precise
1
DCTA-ITA-IEEA, Sao Jose dos Campos, SP, Brazil
2
IEAv – Institute of Advanced Studies, S. J. Campos, SP, Brazil
state information and interpret this information without human intervention [8].
It has to estimate the UAV's state (position, velocity, attitude, airspeed, angle of attack, sideslip angle, and rotation – pitch, roll, and yaw – rates) and give this
information to a control and guidance system, which will allow it to manoeuvre in a
way consistent with its planned route. Obtaining this state information is not a
simple task. One can deduce then that the basis for the autonomous navigation is
the state estimation of the aircraft.
Sensors embedded in the aircraft are used to estimate its state for the control system. Nowadays, the main sensors used are gyroscopes and accelerometers, which compose the inertial measurement unit (IMU) or inertial navigation system (INS), and the global navigation satellite system (GNSS). It is possible to operate a UAV autonomously in most situations by combining these sensors. There are, though, several situations in which those sensors fail or are unable to determine an accurate state. Because of the situations in which the IMU + GNSS combination does not work well, new and different navigation systems (NSs) are needed. The most promising techniques rely on computer vision (CV).
This chapter presents and discusses the central CV NSs being developed. It will explain the main drawbacks of the IMU + GNSS navigation systems and the most challenging aspects of any vision-based system. After that, it will discuss the different vision models.
INS and GNSS revolutionised the development of autopilots for many types of
vehicles and artefacts that have to follow a precise trajectory, with little or no
human interaction. This is the basis for any autonomous system. It is essential first
to understand how the INS and GNSS work in order to be aware of when these
sensors may fail and how to overcome those drawbacks.
[Figure 7.1: planned route versus flight using only the IMU readings]
This loss in position accuracy is known as the INS drift. Each INS has a different
degrading rate. Typically, the smaller the IMU in the INS, the more significant the
decrease in accuracy [10]. The IMUs used on UAVs usually lose precision in a matter
of seconds, because they have to be small and light. Figure 7.1 compares the flight of
an aircraft with only the IMU with a planned route.
The dependency on external signals for position estimation is the main vulnerability of the GNSS. Accuracy depends on the number of satellites in clear, direct view and on the geometry they form: poor geometry or a small number of satellites means an imprecise position. For example, in environments such as indoors or dense urban areas, the GNSS signal will be weak and may also be blocked or reflected by buildings, causing signal multipath [10]. The UAV then operates in a highly hazardous situation, running the risk of getting lost or colliding with obstacles [14]. This is one of the main concerns for UAV operations in urban areas.
Figure 7.3 Equatorial plasma bubbles blocking radio signals. Source: [16]
Besides the geometry and multipath issues already mentioned, if for any reason the sensor does not receive the satellite signal, it will not be able to estimate the position. Even with a clear view of the satellites, the GNSS signal can be blocked. A natural signal blocker is the equatorial plasma bubble, formed in the ionosphere [15]. These bubbles tend to delay radio waves and degrade the performance of GNSS readings. Figure 7.3 shows how plasma bubbles block signals.
GNSS signals can also suffer malicious attacks, such as jamming and spoofing, which degrade the signal readings and affect the navigation. Jamming interferes with a radio signal by lowering its signal-to-noise ratio until the signal is effectively blocked; to the UAV's GNSS receiver, it seems that there are no GNSS satellites in the area affected by the jamming device. Spoofing is similar to jamming, but instead of blocking the signal, it deceives the receiver by falsifying the data, so that the estimated position is completely wrong.
Attacks of this nature have already happened in recent years, such as the hijacking of a Lockheed RQ-170 drone [17]. The more popular UAVs become, the more such attacks can be expected. An NS depending only on external signals is not suitable when safety, stability, and reliability are required.
Computer vision has thus become an important and viable strategy for new NSs. Images captured during flight can be compared with previously stored ground information to estimate the UAV pose [18], which can then be used to eliminate the INS error in the same way GNSS readings do.
Using images as an alternative to the GNSS, or as a complementary NS, has several advantages: (i) the system does not depend on an external signal [19]; and (ii) the technology involved is inexpensive, since CV algorithms mostly use embedded passive sensors, which are cheaper than maintaining a GNSS [20].
On the other hand, several challenging aspects complicate the development of a visual NS. First, the conditions in which a UAV may fly are so diverse and complex that it is impossible to develop a single flexible solution for all flight conditions [21]. Depending on the time of the flight, the weather, and the scene, the sensor used to obtain the image for navigation must change and, therefore, the algorithm must also be adapted [19,21]. Table 7.1 maps some of the variables of a flight that would affect the visual navigation system.
Each environmental setting may require a different image sensor. For example, optical sensors in the visible spectrum may not be adequate if the aircraft route is over a lake or if the flight is at night. Other sensors with responses in different bands of the electromagnetic spectrum, or active sensors, would be more suitable. Different images would be produced and, then, different approaches would be necessary for the navigation system.
Any solution proposed for autonomous navigation using images must take into consideration how the image is acquired by each sensor it is going to use: the sensor's shutter speed, the luminosity of the scene, the environment of the scene, the scale, rotation, and projective factors, and many other physical aspects of the sensor. Besides that, aspects related to the algorithm itself, such as its complexity, the onboard computer that will process it, and the payload of the aircraft, also restrict which method will be most effective [8].
Most visual navigation systems developed until now try to determine the specific condition they will operate in, in order to choose which sensor they are going to use. They usually use passive sensors, such as RGB, thermal, multispectral, or hyperspectral cameras, mostly because these sensor types are more accessible, with size, weight, and energy consumption compatible with the UAVs' limitations.
Passive image sensors are electronic devices that capture and convert the
electromagnetic radiation reflected by the objects in the scene (or emitted by the
objects, in the case of the thermal radiation) into electronic signals, which will form
the image. Their pixels register the amount of electromagnetic radiation that falls
on them, converting this amount into a corresponding number of electrons [22].
The main difference between passive sensors is the electromagnetic band they
capture.
The visual or RGB sensor works on radiation with wavelengths from 390 to
700 nm. The two main technologies for optic sensors are the charge-coupled device
[23] and the complementary metal-oxide semiconductor [24,25]. Most works
develop the visual NS using these sensors [20,26–32] since they are the most
common sensor and capture exactly what one sees.
Multispectral and hyperspectral sensors are sensitive to a broader range of wavelengths than the visible spectrum. They are used when details of the scene are needed, such as materials, types of plants, and other more specific information. The difference between these sensors is that hyperspectral sensors capture contiguous spectral bands, covering a wide range of wavelengths at a fine resolution, whereas multispectral sensors measure spaced spectral bands [33].
A thermal imaging camera is a sensor that forms an image using infra-red radiation, similar to a common visual sensor, but it is sensitive to wavelengths as long as 14,000 nm [34]. An environment where these sensors would be used for the image NS would be flights at night, for example, when optical (visual, multispectral, and hyperspectral) sensors would not respond properly. In [35], thermal information is used by the NS to estimate the position of the UAV. The thermal image captured during flight is compared with optical images previously stored in the UAV.
It is known, though, that not all flight conditions can be tackled with passive sensors. As mentioned before, at night most passive sensors do not work correctly, because they only capture radiation reflected by the objects: without a radiation source to illuminate the objects in the scene and reflect that radiation to the sensor, no image is formed. The only exception among passive sensors is the thermal sensor, because all objects emit heat and thus act as sources in this spectral band. However, in cloudy or rainy weather, even the thermal sensor would not work suitably, because the clouds would act as a barrier to image formation. Also, if the flight were over a lake or the sea, all passive sensors would respond with a similar, plain image, which could not be used to estimate the aircraft's pose.
Active sensors are a possible solution to gather visual information in flight conditions where there is no radiation source. They are devices with two complementary parts: a transmitter that sends out a specific signal, wavelength, or electrons to be reflected by a target; and a data-gathering sensor that registers the signal reflected by the target. In other words, they carry both the radiation source and the sensor for that specific radiation. The two sensors most used in current visual navigation research are LiDAR and synthetic aperture radar (SAR).
LiDAR [36] is an acronym for light detection and ranging. It refers to a remote
sensing technology that emits intense, focussed beams of light and measures the
time it takes for the reflections to be detected by the sensor. This information is
used to compute ranges, or distances, to objects. The three-dimensional coordinates
of the target objects and the intensity of the returned pulse are computed from the
time difference between the laser pulse being emitted and returned, the angle at
which the pulse was emitted and the absolute location of the sensor on or above the
surface of the Earth [36]. Works have used LiDAR for navigation and remote
sensing [37] when the UAV is flying over shallow waters [38,39] and forests [40].
SAR [41] is a form of radar which is used to create images (2D or 3D images) of
a scene using the motion of the radar antenna over a targeted region to provide finer
spatial resolution than is possible with conventional beam-scanning radars. To create
a SAR image, successive pulses of radio waves are transmitted to illuminate a target
scene, and the echo of each pulse is received and recorded. The pulses are transmitted,
and the echoes received using a single beamforming antenna, with wavelengths of a
metre down to several millimetres. As the SAR device onboard the aircraft moves, the
antenna location relative to the target changes with time. Signal processing of the
successively recorded radar echoes allows the combination of the recordings from
these multiple antenna positions and allows the creation of higher resolution images
than would otherwise be possible with a given physical antenna. The multiple antenna
positions are what is called a synthetic antenna aperture. In [42], for example, the proposed algorithm uses SAR images for visual navigation.
The equation can be rewritten in the following linear model as in (7.3), pre-
sented by [48]:
$$\begin{cases} x'(a_0,a_1,a_2,b_0,b_1,b_2) = a_0 + a_1 x + a_2 y \\ y'(a_0,a_1,a_2,b_0,b_1,b_2) = b_0 + b_1 x + b_2 y \end{cases} \qquad (7.3)$$
[Figure: collinearity geometry relating the perspective centre C (focal length f), the image point p with vector r, and the object point P with vector R, in image coordinates (x, y) and object coordinates (X, Y, Z)]
and
$$\vec{R} = \begin{Bmatrix} X_P - X_C \\ Y_P - Y_C \\ Z_P - Z_C \end{Bmatrix} \qquad (7.7)$$
The collinearity condition extracts the following relation between $\vec{r}$ and $\vec{R}$, since P, p, and C must be on the same line:
$$\vec{R} = \frac{1}{s}\, M^{T}\, \vec{r}. \qquad (7.8)$$
When we substitute the values in both equations:
$$\begin{cases} x_p - x_c = s\,[\,m_{11}(X_P - X_C) + m_{12}(Y_P - Y_C) + m_{13}(Z_P - Z_C)\,] \\ y_p - y_c = s\,[\,m_{21}(X_P - X_C) + m_{22}(Y_P - Y_C) + m_{23}(Z_P - Z_C)\,] \\ f = s\,[\,m_{31}(X_P - X_C) + m_{32}(Y_P - Y_C) + m_{33}(Z_P - Z_C)\,] \end{cases} \qquad (7.9)$$
$$\begin{cases} x_p - x_c = f\,\dfrac{m_{11}(X_P - X_C) + m_{12}(Y_P - Y_C) + m_{13}(Z_P - Z_C)}{m_{31}(X_P - X_C) + m_{32}(Y_P - Y_C) + m_{33}(Z_P - Z_C)} \\[2ex] y_p - y_c = f\,\dfrac{m_{21}(X_P - X_C) + m_{22}(Y_P - Y_C) + m_{23}(Z_P - Z_C)}{m_{31}(X_P - X_C) + m_{32}(Y_P - Y_C) + m_{33}(Z_P - Z_C)} \end{cases} \qquad (7.10)$$
Rewriting in matrix format:
$$\begin{cases} x' = f\,\dfrac{M_1 X}{M_3 X} \\[1.5ex] y' = f\,\dfrac{M_2 X}{M_3 X} \end{cases} \qquad (7.11)$$
where $M_1$, $M_2$, and $M_3$ are the rows of the rigid-body rotation matrix M, and $X = (X_P - X_C,\; Y_P - Y_C,\; Z_P - Z_C)^{T}$. Considering the planar case ($Z_P - Z_C = 0$), the projective transformation can be written in the linear form of the following equation:
$$\begin{cases} x'(a_0,a_1,a_2,b_0,b_1,b_2,c_0,c_1) = x\,a_0 + y\,a_1 + a_2 - x'x\,c_0 - x'y\,c_1 \\ y'(a_0,a_1,a_2,b_0,b_1,b_2,c_0,c_1) = x\,b_0 + y\,b_1 + b_2 - y'x\,c_0 - y'y\,c_1 \end{cases} \qquad (7.12)$$
The challenge in template matching then is how to find the captured image in the
stored image, since they can be quite different from each other. They are not neces-
sarily from the same sensor, and it is not guaranteed that they will have the same
orientation, scale, brightness level, and many other characteristics [20]. This poses a
critical image-processing problem to be tackled. Different methods have been pro-
posed for this problem, and most of them are described in image registration theory.
Image registration has been an active research field for many years, and it is the
process of overlaying two or more images of the same scene taken at different times,
from different viewpoints and by different sensors and geometrically aligning these
two images. In other words, we have to find a geometric transformation so that the images share the same viewpoint and similar characteristics.
One such method is the use of feature points, such as SIFT, SURF, LATCH,
ORB, and AKAZE [49–53]. These techniques work in three main steps: the
selection of points in each image, description of the point’s neighbourhood, and
matching of corresponding points in both images. Using the corresponding points, it is possible to estimate a projective transformation and then use it to obtain the UAV pose, since the stored image is georeferenced.
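A minimal sketch of this three-step pipeline (detection, description, matching, then estimation of the projective transformation) is given below, assuming OpenCV in Python; the file names are hypothetical placeholders, and this is not the authors' implementation.

```python
import cv2
import numpy as np

# Hypothetical file names; any georeferenced stored image and an in-flight
# capture of the same area could be used instead.
stored = cv2.imread("stored_georeferenced.png", cv2.IMREAD_GRAYSCALE)
captured = cv2.imread("captured_inflight.png", cv2.IMREAD_GRAYSCALE)

# 1) Detect feature points and describe their neighbourhoods (ORB here).
orb = cv2.ORB_create(nfeatures=2000)
kp_s, des_s = orb.detectAndCompute(stored, None)
kp_c, des_c = orb.detectAndCompute(captured, None)

# 2) Match corresponding points between the two images.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_c, des_s), key=lambda m: m.distance)

# 3) Estimate the projective transformation (homography) with RANSAC.
src = np.float32([kp_c[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_s[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
K_hom, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Mapping the captured-image centre into stored-image coordinates; applying the
# stored image's georeferencing afterwards yields a latitude/longitude estimate.
h, w = captured.shape
centre = np.float32([[[w / 2.0, h / 2.0]]])
centre_in_stored = cv2.perspectiveTransform(centre, K_hom)
print(centre_in_stored)
```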
Considering $F: T(x, y) \mapsto G(lat, long)$ the georeferencing relation from the stored image T to the object space G, and $K: T(x, y) \mapsto Q(X, Y)$ the geometric transformation that maps the stored image T into the captured image Q, it is possible to build the georeferencing transformation H from the query image Q, in which:
$$H: Q(X, Y) \xrightarrow{\;K^{-1}\;} T(x, y) \xrightarrow{\;F\;} G(lat, long) \qquad (7.13)$$
That is, a latitude and longitude can be assigned to each pixel of the captured image, thus obtaining the UAV location. Associating the parameters of the projective transformation with the non-linear equation in (7.5) gives the pose.
Even though this is a possible and direct method, it is not yet computationally efficient. First, the large size of the stored image of a flight area may demand a long time to obtain its feature points and match them to the captured image's points, thus exceeding the time budget of the navigation system. Likewise, the method is restricted to cases where both images (stored and captured) come from the same type of sensor and are captured at a similar time of day, in order to eliminate luminosity differences.
Another method has been proposed to lower the execution time and mitigate these restrictions. It uses complementary sensors, such as a height sensor (a laser altimeter or sonar) and the INS, together with the calibrated optical sensor's geometric parameters (focal distance, sensor size, and radial distortions) [54]. As the UAV flies, images are taken by the embedded optical sensor at every instant. At the instant when the INS is about to lose precision, the proposed template matching method starts operating. It uses the INS attitude information, the height information, and the sensor parameters to build a geometric transformation that compensates for the differences in scale, rotation, and displacement and rectifies the captured image [13]. Now, the captured image has the same orientation and pixel size as the stored image. If the images are from the same sensor type, there is a high chance the match can be found through a correlation method.
Correlation can measure the similarity of matching images in the spatial domain or in a transform domain [8], using the intensity of the image pixels. Mutual information [55] and cross-correlation [56] are other metrics for matching images; only correlation is described here for simplicity, without loss of generality.
Methods based on correlation can be implemented very efficiently and are suitable for real-time applications. They can also be applied in areas with no distinct landmarks. However, they are typically sensitive to differences between the sensed and the reference image [20] (Figure 7.7). They are applied as a sliding mask that performs a window operation as in (7.14) (squared difference correlation), (7.15) (classic correlation), or (7.16) (Pearson correlation), where T is the stored image, I is the transformed captured image, and w and h are the width (number of columns) and height (number of rows) of the image I, respectively:
$$R_1(x,y) = \frac{\sum_{x',y'}\left(T(x',y') - I(x+x',\,y+y')\right)^2}{\sqrt{\left(\sum_{x',y'} T(x',y')^2\right)\left(\sum_{x',y'} I(x+x',\,y+y')^2\right)}} \qquad (7.14)$$

$$R_2(x,y) = \frac{\sum_{x',y'} T(x',y')\,I(x+x',\,y+y')}{\sqrt{\left(\sum_{x',y'} T(x',y')^2\right)\left(\sum_{x',y'} I(x+x',\,y+y')^2\right)}} \qquad (7.15)$$

$$R_3(x,y) = \frac{\sum_{x',y'} T'(x',y')\,I'(x+x',\,y+y')}{\sqrt{\left(\sum_{x',y'} T'(x',y')^2\right)\left(\sum_{x',y'} I'(x+x',\,y+y')^2\right)}} \qquad (7.16)$$

where
$$T'(x',y') = T(x',y') - \frac{1}{h\,w}\sum_{x'',y''} T(x'',y'')$$
and
$$I'(x+x',\,y+y') = I(x+x',\,y+y') - \frac{1}{h\,w}\sum_{x'',y''} I(x+x'',\,y+y'')$$

Figure 7.7 The use of the correlation metric to compare RGB images from the same sensor [57]
Figure 7.8 Correlation and the comparison of image borders based on [57]
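The three metrics map closely onto OpenCV's normalised template-matching modes. The hedged sketch below uses hypothetical file names and treats T as the (smaller) template and I as the searched image, mirroring the roles they play in (7.14)–(7.16).

```python
import cv2

# Hypothetical inputs: T is the stored reference patch being sought inside the
# rectified captured image I, both greyscale.
T = cv2.imread("stored_patch.png", cv2.IMREAD_GRAYSCALE)
I = cv2.imread("captured_rectified.png", cv2.IMREAD_GRAYSCALE)

# Normalised template-matching modes corresponding closely to (7.14)-(7.16):
# squared difference, classic correlation, and Pearson correlation.
methods = {
    "R1 (7.14)": cv2.TM_SQDIFF_NORMED,
    "R2 (7.15)": cv2.TM_CCORR_NORMED,
    "R3 (7.16)": cv2.TM_CCOEFF_NORMED,
}

for name, method in methods.items():
    response = cv2.matchTemplate(I, T, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(response)
    # For the squared-difference metric the best match is the minimum,
    # otherwise it is the maximum of the response map.
    best = min_loc if method == cv2.TM_SQDIFF_NORMED else max_loc
    print(name, "best match at", best)
```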
This landmark-based strategy makes the use of the feature point algorithms described before feasible, as the reduction of stored information makes them suitable for real-time applications, depending on the number of landmarks to be tested. To reduce that number, it is recommended to use the INS position estimate, considering its error, to limit the landmarks tested per flight region.
Once the landmarks are recognised, and their pixels are associated with the
pixels from the captured image, it is possible to obtain the geometric transformation
to georeference the captured image (7.13) and obtain the UAV position, consider-
ing it as the position of the centre of the image. An example appears in [26]. More
sophisticated methods for pose estimation can be obtained the same way as
described before, using (7.5).
The main disadvantages of this situation are related to both the limitations of
the feature method used and the need for a recent image of the landmarks. If the
landmark is removed or modified, the method will not be able to recognise it.
The main problem with this approach is that the system is not able to distinguish landmarks of the same class. For example, if there are two different bridges on the route, the system will recognise both as bridges but will not be able to tell one from the other, leading to probable errors in position estimation. This aspect has to be tackled during flight planning. The safest arrangement is for each class of landmark to appear only once, or for other techniques or other sensors' information to be employed. It is even possible to have the same class of landmark on the same route, but the instances need to be far apart, and the planned route must follow a strict sequence of landmarks.
There are few works on aerial landmark recognition navigation systems in the
literature yet. It is a promising field of study that suits the need for a reliable and
precise method for the auto-localisation of the UAV, since it needs less information
about the route than the template matching, and also because each estimation is
independent of previous estimations.
Figure 7.9 Scenario depicting how a UAV obtains images at t and t + Δt [74]
and to obtain both $R_k$ and $t_k$, we need to determine the fundamental matrix F of each consecutive pair of images $(I_{k-1}, I_k)$ [75]. The fundamental matrix is a $3\times 3$ matrix that represents the epipolar geometry between two images of the same scene, and it depends only on the parameters of the sensor and on the positions from which each image was taken. Its most important property is that every corresponding pair of points $x_{k-1}$ and $x_k$ in the two images satisfies:
$$x_k^{T} F_k\, x_{k-1} = 0 \qquad (7.19)$$
$$F_k = C^{-T} R_k [t_k]_{\times} C^{-1} \qquad (7.20)$$
F can also be expressed as in (7.21), where E is the essential matrix, defined as in (7.22) [75]:
$$F_k = C^{-T} E_k\, C^{-1} \qquad (7.21)$$
$$E_k = R_k [t_k]_{\times} \qquad (7.22)$$
E can be decomposed into singular values (SVD) as in (7.23). This decomposition is vital because, as stated by [75], E and the matrix $[t_k]_{\times}$ have the same singular values, since they differ only by a rotation. From that statement, both $[t_k]_{\times}$ and $R_k$ are obtained by (7.24) and (7.25), where U and $V^{T}$ are the unitary matrices of the SVD of E, S is the diagonal matrix holding the singular values, and W is the 90° rotation matrix about the z-axis. These entities are related as follows:
$$\mathrm{SVD}(E) = U S V^{T} \qquad (7.23)$$
$$R_k = U W^{T} V^{T} \qquad (7.24)$$
and
$$[t_k]_{\times} = U W S U^{T} \qquad (7.25)$$
$$W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (7.26)$$
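A minimal numerical sketch of (7.23)–(7.25) follows, assuming NumPy; the toy essential matrix and the sign handling are illustrative, and the well-known four-fold pose ambiguity still has to be resolved by a cheirality check on triangulated points in a real VO pipeline.

```python
import numpy as np

def decompose_essential(E):
    """One closed-form decomposition of an essential matrix E into a rotation R
    and the skew-symmetric translation matrix [t]_x, following (7.23)-(7.25)."""
    U, S, Vt = np.linalg.svd(E)                      # (7.23)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])                 # 90 deg rotation about z, (7.26)
    R = U @ W.T @ Vt                                 # (7.24)
    if np.linalg.det(R) < 0:                         # enforce a proper rotation
        R = -R
    t_x = U @ W @ np.diag(S) @ U.T                   # (7.25), skew-symmetric
    return R, t_x

# Toy usage: build E from a known rotation and translation (E = R [t]_x as in (7.22)).
R_true = np.eye(3)
t = np.array([1.0, 0.0, 0.0])
t_x_true = np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])
E = R_true @ t_x_true
R_est, t_x_est = decompose_essential(E)
print(R_est)
print(t_x_est)
```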
This strategy can have a high repetition rate, so it can correct the INS frequently [8]. On the other hand, VO tends to accumulate errors in the direction of movement; therefore, the position is not very accurate in long flights [8]. This happens because, apart from the first estimation, which uses the known initial position, every estimation uses the previous one (and thus its implicit error) as input, accumulating error in a way similar to the INS. Some examples of VO systems are in [27,76–78] (Figure 7.10).
Visual navigation systems nowadays tend to combine strategies in a single method. In this context, data fusion algorithms play an essential part, mixing the different estimations and their reliabilities to give a better pose for navigation. Systems built this way have proven to be more stable and reliable in most situations.
A first example is the work developed in [20,54]; both use the template matching strategy combined with VO. In [20], the position estimations and the INS information are fused using a Bayesian estimation framework consisting of two parts: a standard Kalman filter and a point-mass filter [20]. Braga [54], on the other hand, uses a particle filter algorithm to estimate the new position.
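As a hedged illustration of this kind of fusion (and not the actual filters used in [20] or [54]), the sketch below runs a one-dimensional Kalman predict/update cycle in which the prediction plays the INS role and the measurement is a vision-based position fix; all numerical values are assumptions.

```python
import numpy as np

# Minimal 1-D constant-velocity Kalman filter fusing an INS-style prediction
# with a vision-based position fix; illustrative only.
x = np.array([0.0, 10.0])                 # state: position (m), velocity (m/s)
P = np.diag([1.0, 1.0])                   # state covariance
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])     # constant-velocity propagation
Q = np.diag([0.05, 0.05])                 # assumed process noise
H = np.array([[1.0, 0.0]])                # the vision fix measures position only
R = np.array([[4.0]])                     # assumed vision fix variance (m^2)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    return x + (K @ y).ravel(), (np.eye(2) - K @ H) @ P

x, P = predict(x, P)
x, P = update(x, P, np.array([1.3]))       # hypothetical vision position fix
print(x, np.diag(P))
```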
Simultaneous localisation and mapping (SLAM) is another visual navigation approach that can be considered a combination of strategies. It combines VO with landmark recognition, and its goal is to localise a robot in the environment while mapping it at the same time [79]. SLAM is nowadays one of the most studied and developed methods because of the high interest in using it in disaster areas, for example, where GNSS signals are unavailable and the area has changed completely. In such a scenario, it is necessary to know how the environment looks in order to plan how to act and rescue survivors as fast as possible.
In SLAM approaches, an internal representation of the world is built on-line
mostly as a mosaic image, and a landmarks database is also formed, which is then
used for localisation purposes. For indoor robotic applications, SLAM is already a
standard [80]. More challenging is the use of such techniques in vast outdoor
environments. It is one of the most promising techniques being developed for UAVs
and aquatic UAVs [80].
There are two types of vision-based SLAM systems: the monocular, which uses only one sensor, and the binocular, which uses more than one sensor and fuses their measurements for a stereo view of the scene [20]. The binocular SLAM has less dependence on the hardware and is more flexible than the monocular SLAM. The binocular approach uses information from the sensors to form the stereovision pair and build the map of the scene [81]; the stereovision is used to obtain depth information of the environment. The monocular SLAM, on the other hand, also uses the sequence of images to obtain the depth information, although it is limited to the overlapping area between the two images.
In both SLAM types, the navigation is performed by VO [81] at first. While navigating, the system also maps the region, automatically collecting data about the landmarks encountered [82]. These data are stored in a database and, as the reliability of the VO drops, the aircraft returns to previously mapped areas to recognise the landmarks and eliminate the accumulated error of the visual odometry [81]. This need to revisit already mapped places to identify some landmarks before continuing the route is the most significant disadvantage of SLAM, as it impacts the autonomy of the vehicle for some missions [21].
Indoor navigation systems are more studied and more developed than outdoor ones. This happens because GNSS failures in outdoor aerial navigation are less frequent, whereas indoor navigation is always under a severe vulnerability (no satellite visibility) and offers a more controlled environment for development. This research preference may change as UAVs become more popular and the demand for safer and more reliable outdoor navigation increases.
Another possible future development relates to the sensor used for navigation. Most methods proposed until now have used optical RGB sensors, which is quite limiting considering the vast range of possible flight scenes. New sensors and new adaptations of the strategies need to be studied and developed in order to pursue a more reliable and broader visual NS. This also leads to studies on adaptive systems, which need first to understand their environment before deciding on the best algorithm to produce the most reliable pose estimation.
At the same time, most methods developed so far have a high computational cost, which means that achieving real-time operation is always a challenge to be taken into consideration. One must not forget that the computation must be embedded, and the computer on board the aircraft has limitations, such as energy consumption and weight, which affect the UAV's autonomy. Cloud computing and better communication links could be thought of as a solution at first [66,83–85]. On the other hand, they introduce vulnerabilities into the system due to data transmission, despite the use of cryptography and other security measures. A few works [54] nowadays use hybrid computing to accelerate their methods: field-programmable gate arrays, multiple central processing units, and graphics processing units are combined in order to obtain the most appropriate hardware architecture for the method under analysis.
Even though navigation is discussed in this chapter as a task for individual UAVs, multi-UAV collaboration can also provide information about the swarm and its location. This approach is not always recommended, though, since the communication between the UAVs in the swarm would introduce vulnerabilities into the system, as cloud computing would.
7.6 Conclusion
The high number of UAV applications has pushed the development of more reliable navigation systems in order to surpass the limitations and problems related to the GNSS + INS. Frequent signal loss, susceptibility to jamming and spoofing, and the inability to fly indoors became a high barrier to several important applications [83,85]. The scientific community has therefore been developing several visual NSs: landmark recognition, template matching, VO, and combinations of these strategies [84–87]. These strategies have proven to be quite reliable so far, with results similar to the GNSS. There are several improvements yet to be made, which makes this a vast and rich research field to be explored.
References
[1] Mersheeva V, and Friedrich G. Multi-UAV monitoring with priorities and
limited energy resources. In: ICAPS; 2015. p. 347–356.
[2] Costa FG, Ueyama J, Braun T, et al. The use of unmanned aerial vehicles and
wireless sensor network in agricultural applications. In: 2012 IEEE Int’l
Geoscience and Remote Sensing Symp. (IGARSS). IEEE; 2012. p. 5045–5048.
[3] Doherty P, and Rudol P. A UAV search and rescue scenario with human
body detection and geolocalization. In: Australasian Joint Conf. on Artificial
Intelligence. Springer; 2007. p. 1–13.
[4] Everaerts J, et al. The use of unmanned aerial vehicles (UAVs) for remote
sensing and mapping. In: The Int’l Archives of the Photogrammetry, Remote
Sensing and Spatial Inf. Sciences (IAPRSSIS). 2008;37. p. 1187–1192.
[5] Remondino F, Barazzetti L, Nex F, et al. UAV photogrammetry for mapping
and 3D modeling–current status and future perspectives. In: IAPRSSIS.
2011;38(1). p. C22.
[6] Mader D, Blaskow R, Westfeld P, et al. Potential of UAV-based laser
scanner and multispectral camera data in building inspection. In: IAPRSSIS.
2016. p. 41.
[7] Valavanis KP, and Vachtsevanos GJ. UAV applications: Introduction. In:
Handbook of Unmanned Aerial Vehicles. Springer; 2015. p. 2639–2641.
[8] Silva CAO. Evaluation of Template Matching for UAV Location. UFMG;
2015.
[9] Woodman OJ. An introduction to inertial navigation. University of
Cambridge, Computer Laboratory; 2007.
[10] LeMieux J. Alternative UAV Navigation Systems; 2012. Available from:
http://electronicdesign.com/embedded/alternative-uav-navigation-systems
[cited 12/05.2016].
[11] Parkinson BW, Enge P, Axelrad P, et al. Global positioning system: Theory
and applications, Volume II. Am. Inst. of Aeronautics and Astronautics; 1996.
[12] Taiwan National Space Organization. NSPO Satellite Database.
[13] Goltz GAM. Artificial Neural Networks on Images for UAV Pose
Estimation. Instituto Nacional de Pesquisas Espaciais; 2011.
[14] de Babo Martins F, Teixeira LF, and Nobrega R. Visual-inertial based
autonomous navigation. In: Robot 2015: 2nd Ib. Rob. Conf. Springer; 2016.
p. 561–572.
[15] Kintner PM, Kil H, Beach TL, et al. Fading timescales associated with GPS
signals and potential consequences. Radio Sci. 2001;36(4):731–743.
[16] Allen PD. Modeling global positioning system effects in the Tlc/Nlc Model,
1994.
[17] Ruegamer A, and Kowalewski D. Jamming and spoofing of GNSS signals – An
underestimated risk?! Proc. Wisdom Ages Chall. Mod. World. 2015; p. 17–21.
[18] Zhang C, Chen J, Song C, et al. An UAV navigation aided with computer
vision. In: The 26th Chinese Control and Decision Conf. (2014 CCDC); 2014.
[19] Shiguemori EH, Martins MP, and Monteiro MVT. Landmarks recognition
for autonomous aerial navigation by neural networks and Gabor transform.
In: SPIE 2007 Electr. Imaging. SPIE; 2007. p. 64970R–64970R.
[20] Conte G, and Doherty P. Vision-based unmanned aerial vehicle navigation
using geo-referenced information. In: EURASIP J. Adv. in Signal Proc.
2009;2009.
[21] Mathe K, and Busoniu L. Vision and control for UAVs: A survey of general
methods and of inexpensive platforms for infrastructure inspection. Sensors.
2015;15(7):14887–14916.
[22] Gamal AE, and Eltoukhy H. CMOS image sensors. Circ. Dev. Mag. 2005;
21(3):6–20.
[23] Lees, AW, and Ryan WD. A simple model of a buried channel charge
coupled device. Google Patents; 1974. US Patent 3,792,322.
[24] Cardoza S, Alexsander J, et al. Digital image sensors: CCD AND CMOS. In:
VII CONNEPI-Congresso Norte Nordeste de Pesquisa e Inovacao; 2012.
[25] Sohn IY. CMOS active pixel sensor. Google Patents; 2002. US Patent
6,365,950. Available from: https://www.google.com/patents/US6365950.
[26] Silva Filho PFF. Automatic Landmark Recognition in Aerial Images for the
Autonomous Navigation System of Unmanned Aerial Vehicles. Instituto
Tecnologico de Aeronautica (ITA). Sao Jose dos Campos-SP; 2016.
[27] Roos DR. Machine learning applied to visual odometry for UAV pose esti-
mation. UNIFESP. Sao Jose dos Campos; 2018.
[28] Rebecq H, Horstschaefer T, and Scaramuzza D. Real-time visual inertial
odometry for event cameras using keyframe-based nonlinear optimization.
In: British Machine Vis. Conf. (BMVC). vol. 3; 2017.
[29] Depaola R, Chimento C, Anderson ML, et al. UAV navigation with
computer vision–flight testing a novel visual odometry technique. In:
2018 AIAA Guidance, Navigation, and Control Conference; 2018. p. 2102.
[30] Mansur S, Habib M, Pratama GNP, et al. Real time monocular visual
odometry using Optical Flow: Study on navigation of quadrotors UAV.
In: 2017 3rd Int’l Conf. Sc. Technology-Computer (ICST). IEEE; 2017.
p. 122–126.
[31] Estrela VV, Magalhaes HA, and Saotome O. Total variation applications in
computer vision. In Handbook of Research on Emerging Perspectives in Int.
Pattern Rec., Analysis, and Image Proc. IGI Global; 2016, pp. 41–64.
[32] Estrela V., Rivera LA, Beggio PC, and Lopes RT. Regularized pel-recursive
motion estimation using generalized cross-validation and spatial adaptation.
In: Proc. SIBGRAPI; 2003. DOI: 10.1109/SIBGRA.2003.1241027
[33] Hagen NA, and Kudenov MW. Review of snapshot spectral imaging tech-
nologies. Optical Eng. 2013;52(9):090901.
[34] Gade R, and Moeslund TB. Thermal cameras and applications: A survey.
Machine Vis. Appl. 2014;25(1):245–262.
[35] da Silva W, Shiguemori EH, Vijaykumar NL, et al. Estimation of UAV
position with use of thermal infrared images. In: 2015 9th Int’l Conf.
Sensing Technology (ICST). IEEE; 2015. p. 828–833.
[86] Coelho AM, de Assis JT, and Estrela VV. Error concealment by means of
clustered blockwise PCA. In: Proc. 2009 IEEE PCS. https://doi.org/10.1109/
PCS.2009.5167442
[87] Razmjooy N, Ramezani M, Estrela VV, Loschi HJ, and do Nascimento DA.
Stability analysis of the interval systems based on linear matrix inequalities.
In: Y. Iano et al. (eds), Proc. BTSym’18. Springer; 2019. https://doi.org/10.
1007/978-3030-16053-1_36
Chapter 8
Vision in micro-aerial vehicles
Navid Razmjooy1, Vania V. Estrela2
and Roberto Sabatini3
1
Department of Electrical Engineering, Tafresh University, Tafresh, Iran
2
Department of Telecommunications, Federal Fluminense University, RJ, Brazil
3
School of Engineering, RMIT University, Bundoora, Australia
In this chapter, different technologies for designing the vision systems for bio-inspired MAVs will be reviewed.
8.1 Introduction
Micro-aerial vehicles (MAVs) have the potential to accomplish a wide range of limited-duration missions. Their small size facilitates their transport to a launch site and allows them to be flown remotely by a single operator. This makes them ideal for reconnoitring dangerous situations (e.g., gas discharges, chemical spills, and so forth). Their size also makes them hard to detect visually, which makes them useful for critical surveillance missions. Therefore, MAVs provide a convenient and safe way to conduct missions that are presently impractical with any other existing device. When it is difficult for bigger unmanned air vehicles (UAVs) to operate, a MAV can be deployed in both confined and open-air spaces [3].
On the other hand, MAVs' limitations become more visible at smaller sizes. Smaller wings decrease the dimensionless Reynolds number, which describes, in fluid mechanics, how turbulent or steady a fluid flow is when passing a body or flowing in a duct. The Reynolds number (Re) represents the ratio between inertial forces and viscous forces and indicates how laminar or turbulent the flow is, as follows:
$$Re = \frac{\rho\, v\, l}{\mu} = \frac{v\, l}{\nu}, \qquad (8.1)$$
where v is the fluid velocity in m/s, l is the characteristic length (or the chord width of an aerofoil) in m, ρ is the fluid density, μ is the dynamic viscosity, and ν is the kinematic viscosity in m²/s. Systems that operate at the same Re have similar flow characteristics even if the fluid, speed, and characteristic length differ.
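A small helper illustrating (8.1); the default air properties are standard sea-level values assumed here, not taken from the chapter.

```python
def reynolds_number(velocity, length, density=1.225, dynamic_viscosity=1.81e-5):
    """Re = rho * v * l / mu, as in (8.1); defaults are sea-level air properties."""
    return density * velocity * length / dynamic_viscosity

# Example: a wing with a 5 cm chord flying at 10 m/s (illustrative numbers).
print(reynolds_number(velocity=10.0, length=0.05))   # roughly 3.4e4, a low-Re regime
```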
MAVs differ in their means of producing propulsive thrust and lift: fixed-wing MAVs (FMAVs) [4], rotary-wing MAVs (RMAVs), flapping-wing MAVs (also called biomimetic MAVs or BMAVs), and hybrid MAVs, as described in the following sections [5,6] (Figure 8.1).
Figure 8.1 Different types of MAVs: (a) fixed-wing, (b) rotary-wing, (c) flapping-wing, and (d) hybrid MAVs
have also been used. Lift is generated by airflow over non-moving wings that have airfoil cross-sections, which generally limits the direction of flight to forward motion [7–9].
FMAVs often consist of a rigid wing, fuselage, and tail, use a motor and propeller as their propulsion system, and can cover a wide range of operating environments, including jungle, desert, urban, maritime, mountain, and arctic environments [10]. Because of their small dimensions and low power requirements compared with larger UAVs, FMAVs have a low radar cross-section and are very difficult to detect. Flying in environments such as urban or forested areas requires short wings with low aspect ratios, since aircraft with longer wings are quite delicate and likely to collide with obstacles [7,8].
High endurance and extensive range are critical in MAV applications such as data gathering or surveillance. Usually, FMAVs with higher lift-to-drag ratios perform better than those with lower values. Furthermore, FMAVs have a longer range and endurance and can fly at a higher altitude than BMAVs and RMAVs, which usually perform indoor missions at slower flight speeds. The wing planform is the wing silhouette when seen from above or beneath; different planforms are possible: rectangular, elliptical, tapered wings with swept leading edges, Zimmerman, and inverse Zimmerman [10]. Some aircraft may have variable planform geometry during flight.
For the mathematical modelling of FMAVs, some assumptions can be made: the FMAV can be modelled as a rigid body, the curvature of the earth can be neglected, and FMAVs fly short distances [11]. Under these assumptions, the model of an FMAV can be obtained by applying Newton's laws of motion.
Figure 8.2 Dynamic motion modelling for (a) pure pitching motion, (b) pure yawing motion, and (c) pure rolling motion [13]
where $\delta_r$ is the rudder deflection, $\psi$ is the yaw angle, and r is the yaw rate with respect to the centre of gravity of the MAV. $N_r$ and $N_{\delta r}$ are the stability derivatives for the yaw motion. The following equations model the roll-angle dynamics:
$$\begin{cases} \dot{\phi} = p \\ \dot{p} = L_p\, p + L_{\delta a}\, \delta_a \end{cases} \qquad (8.4)$$
where p is the roll rate, $\phi$ is the roll angle, and $\delta_a$ is the aileron deflection. $L_p$ and $L_{\delta a}$ are the stability derivatives of the roll motion. Figure 8.2(c) shows the roll motion variables.
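As an illustration of (8.4), the sketch below integrates the roll dynamics with a simple Euler scheme; the stability derivatives and the aileron step are purely assumed values.

```python
import numpy as np

# Euler integration of the roll dynamics in (8.4): phi_dot = p,
# p_dot = L_p * p + L_da * delta_a. Numerical values are illustrative only.
L_p, L_da = -8.0, 30.0          # assumed stability derivatives (1/s, 1/s^2)
dt, T = 0.001, 2.0
phi, p = 0.0, 0.0               # roll angle (rad), roll rate (rad/s)
delta_a = np.deg2rad(2.0)       # 2-degree aileron step input

for _ in range(int(T / dt)):
    phi_dot = p
    p_dot = L_p * p + L_da * delta_a
    phi += phi_dot * dt
    p += p_dot * dt

print("final roll angle (deg):", np.rad2deg(phi))
```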
[Figure 8.3: (a) RMAV geometry with rotor forces and torques; (b) kinematics and reference frames (vehicle, Frenet, and inertial frames); (c) dynamics description (motor, centre of mass, and associated forces)]
Total inertia of the vehicle due to rotors and electronics can be calculated as
follows:
$$J_{cm,T} = \frac{2}{5} M_1 r^2 + M_2\left(\frac{a^2 + b^2}{12} + s_{mf2,cm}^2\right) + m\left(\frac{r_m^2}{2} + s_{oi,cm}^2\right) \qquad (8.6)$$
For the RMAV, a spherical shape has been adopted for the main electronics, with mass $M_1$ and radius r, whereas a rectangular shape with mass $M_2$ and a cylindrical shape (m, $r_m$) are assumed for the other electronics (battery + IMU) and the four rotors, respectively. The terms (a, b) refer to the side lengths of the rectangular shape assumed for the second electronics (see Figure 8.3(a)).
$$\dot{V}_{cm} = \begin{bmatrix} J_{cm,T} - \tilde{s}_{oi,cm}\, m\, \tilde{s}_{oi,cm} & m\,\tilde{s}_{oi,cm} \\ m\,\tilde{s}_{oi,cm} & m\,U \end{bmatrix}^{-1} \left( \begin{bmatrix} \tau_{cm,T} \\ f_{cm,T} \end{bmatrix} - \begin{bmatrix} z_T \\ 0 \end{bmatrix} \right), \qquad (8.7)$$
where $\tilde{s}_{oi,cm} \in \mathbb{R}^{3\times 3}$ is the skew-symmetric matrix corresponding to the vector cross-product operator of $s_{oi,cm}$ (see Figure 8.3 for the description of the operators), and $U \in \mathbb{R}^{3\times 3}$ is the identity operator.
BMAVs employ flexible and light wings, as observed in birds and insects, which indicates that the flexibility and weight of wings are essential for aerodynamics and flight stability. Research on natural and human-made flapping wings has shown that these types of air vehicles are more complex than FMAVs and RMAVs, mainly because of the complexity of their aerodynamics [3].
Biologic inspiration indicates that flying with flapping wings presents unique
manoeuvrability advantages. There are fundamental challenges for tiny fixed and
rotary wings to fly reliably. When the wing area is reduced, a flow transition to low
Reynolds number occurs, which reduces the aerodynamic wing efficiency [26].
FMAVs and RMAVs that operate at low Reynolds numbers are prone to flow
separation resulting in a drag increase and loss of efficiency.
A flapping wing can combine the advantages of the other UAV types. The hovering ability of insects, coupled with their ability to transition quickly to forward flight, provides an ideal drone for search and rescue and other applications [27]. BMAVs can be designed and fabricated in three configurations, namely, monoplane, biplane, and tandem [3].
UAVs inspired by the flapping wings of birds or insects, or by the flight model of bats, have been developed. Lift and thrust are achieved by flapping the wings, which generates a larger force per unit of wing surface area. Therefore, a BMAV has the potential to be much smaller than either an FMAV or an RMAV. The flapping frequency depends on the wing surface area: a dragonfly-inspired aircraft, for example, must flap at a lower frequency than a bee-inspired MAV, since bees have much smaller wing-to-body ratios.
Some methods to design BMAVs use empirical formulae [3]. These formulae originate from allometric data extracted from avian flight [10] and relate design parameters of flapping wings, such as wing area, weight, and wing loading, to the flapping frequency, flight speed, required flight power, wing geometry (area and wingspan), and aircraft weight. Other methods based on statistical and experimental sizing and testing have also been applied.
Most of the sizing methods for MAVs depend on (allometric) expressions extracted from natural birds and insects to size artificial flapping wings directly, without considering the impact of other parameters such as the materials used for the wing membranes. Non-optimised MAVs result from the use of empirical formulae derived from the observation of natural birds and insects, and these mathematical models need revision and, probably, some correction factors [26].
After sizing and during the design process of BMAVs, different aerodynamic
and structural analyses can be performed on them. Usually, in natural and human-
made flapping wings, their aerodynamics, structure, and flight dynamics intersect
with some fascinating problems like unsteady three-dimensional (3D) separation,
transition in boundary layers and shear layers, uncertain flight environment,
aeroelasticity, and anisotropic wing structure, and nonlinear and adaptive control
[28]. Different theories are used to model the aerodynamic forces of the natural
and human-made flapping wings, such as quasi-steady, strip theory, unsteady,
and Navier–Stokes methods. It should be mentioned that the type of analysis is
dependent on the type of flapping wing, its configuration, and flight modes. For
The flight path angle γ for horizontal flight is zero. Furthermore, the relationship 0.75·c·q models the relative air effect of the pitching rate q, which manifests at 75% of the chord length [33].
From the above equations, the relative velocity, the relative angle ψ between the two velocity components, and the effective AOA can be obtained:
$$V_{rel} = \sqrt{V_x^2 + V_y^2}, \qquad \psi = \tan^{-1}\!\left(\frac{V_x}{V_y}\right), \qquad \alpha_{eff} = \psi + \theta \qquad (8.11)$$
By assuming the Kutta–Joukowski condition, the section lift due to circulation
can be calculated as:
$$dL_c = \frac{1}{2}\rho\, V_{rel}^{2}\, C_{lc}\, c\, dr,$$
where c and dr are the chord length and the width of the wing element under consideration, respectively, and
$$C_{lc} = 2\pi\, C(k)\sin(\alpha_{eff}). \qquad (8.12)$$
The apparent mass effect (momentum transferred by accelerating the air to the
wing) for the section is perpendicular to the wing and acts at mid-chord. Profile
drag dDp and induced drag dDi forces can be calculated as follows:
$$dD_p = \frac{1}{2}\rho\, V_{rel}^{2}\, C_{dp}\, c\, dr, \qquad dD_i = \frac{1}{2}\rho\, V_{rel}^{2}\, C_{di}\, c\, dr \qquad (8.13)$$
Moreover, the total section drag is equal to:
$$dD_d = dD_p + dD_i. \qquad (8.14)$$
The horizontal and vertical components of the forces can be defined as follows:
$$\begin{aligned} dF_{ver} &= dL_c\cos(\psi)\cos(\delta) + dN_{nc}\cos(\theta)\cos(\beta)\cos(\delta) \\ dF_{hor} &= dL_c\sin(\psi)\cos(\delta) + dN_{nc}\sin(\theta)\cos(\delta) - dD_d\cos(\psi)\cos(\delta) \end{aligned} \qquad (8.15)$$
By summing all the vertical and horizontal force components along the wing, the lift and thrust of the BMAV can be obtained. The non-circulatory (apparent mass) normal force for the section, mentioned above, is given by:
$$dN_{nc} = \frac{\rho\pi c^{2}}{4}\left(\dot{\theta}\,U + r\,\ddot{\beta}\cos(\theta) - 0.5\,\ddot{\theta}\right)dr. \qquad (8.16)$$
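A heavily hedged quasi-steady sketch of the section forces in (8.11)–(8.15) follows; every numerical value is an assumption, and the non-circulatory term dN_nc is neglected for brevity.

```python
import numpy as np

# Quasi-steady blade-element sketch of the section forces in (8.11)-(8.15).
# All numerical values (coefficients, angles, Theodorsen-like factor C_k)
# are illustrative assumptions, not taken from the chapter.
rho = 1.225               # air density (kg/m^3)
c, dr = 0.05, 0.01        # chord length and element width (m)
C_k = 0.8                 # assumed lift-deficiency factor C(k)
C_dp, C_di = 0.02, 0.05   # assumed profile and induced drag coefficients

Vx, Vy = 3.0, 6.0                      # velocity components at the element (m/s)
V_rel = np.hypot(Vx, Vy)               # relative velocity, (8.11)
psi = np.arctan2(Vx, Vy)               # relative flow angle
theta = np.deg2rad(5.0)                # assumed section pitch angle
alpha_eff = psi + theta

dLc = 0.5 * rho * V_rel**2 * (2 * np.pi * C_k * np.sin(alpha_eff)) * c * dr  # (8.12)
dDp = 0.5 * rho * V_rel**2 * C_dp * c * dr                                   # (8.13)
dDi = 0.5 * rho * V_rel**2 * C_di * c * dr
dDd = dDp + dDi                                                              # (8.14)

delta = beta = 0.0        # assumed zero dihedral/flap angles; dN_nc neglected
dF_ver = dLc * np.cos(psi) * np.cos(delta)                                   # (8.15)
dF_hor = dLc * np.sin(psi) * np.cos(delta) - dDd * np.cos(psi) * np.cos(delta)
print(dF_ver, dF_hor)
```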
Table 8.1 presents a brief overview of the advantages and disadvantages of the
types of MAVs.
Hybrid MAVs are designs that use fixed wings for lift and flapping wings for propulsion. In this type of MAV, the drone usually consists of a low-aspect-ratio fixed wing with a trailing pair of higher-aspect-ratio flapping wings, which flap in counter-phase. The flapping-wing part increases efficiency, provides a mechanically and aerodynamically balanced planform, and quenches stall over the fixed wing by entraining flow [34]. This configuration can also be seen in dragonflies with tandem wings, which use two pairs of wings to increase the lift and thrust forces.
The rest of this chapter is organised as follows. Section 2 presents computer vision (CV) and its application in bio-inspired MAVs. Section 3 describes the role of sensing in MAVs based on different sensors. Section 4 presents the effects of illumination on the MAVs' vision. Section 5 presents the concepts of navigation, pathfinding, and orientation for MAVs. Section 6 describes CCD cameras and their applications in machine vision. Section 7 presents future trends for MAVs. Finally, Section 8 concludes the chapter.
However, the demand for drones, such as airborne robots and bombers that can be used in emergencies and industrial applications, has only recently come into being. Recently, MAVs, among the full range of drones, have been developed to fly in challenging conditions and have become popular UAVs.
MAVs can carry a variety of sensors; however, the small-scale types have some limitations in carrying heavier payloads. Most types of MAV can carry a small digital camera to take aerial images for the preset goals. This carrying capability allows MAVs to be piloted remotely, but the process requires a great deal of pilot training. The problem can be addressed with an autopilot system that keeps the vehicle at a definite spot.
For outdoor applications, the Global Positioning System (GPS) can be used; indoor applications, however, cannot rely on GPS and require an alternative sensor for the autopilot system. A good choice for the autopilot is the use of on-board cameras, which have been employed successfully in such systems. CV algorithms evaluate the ego-motion information obtained from the camera, which is then fused with inertial measurement unit (IMU) data and used in the MAV's hover control loop.
Image processing and CV algorithms can be employed to extract environment information for better control of MAVs. In some applications, this information is combined with IMU measurements (Figure 8.4).
To reach this purpose, there are some challenges to overcome:
● The limited capacity of the on-board processors, particularly for image processing, which demands high computational capacity.
● Precise control requires a high frame rate.
● MAVs must operate robustly in demanding conditions.
● The metric scale cannot be measured by ego-motion estimation from a single camera.
Fusing camera-based measurements with the IMU yields an efficient and robust ego-motion estimation algorithm [9,23–25,49–53]. Furthermore, the camera images can be employed for interpretation and environment sensing. Using the connected cameras, MAVs can compute a 3D map of the environment, which can then be utilised for autopilot applications [54]. Next, some applications of image processing and CV methods for control, autopilot, and 3D mapping of MAVs are introduced.
Yang et al. [55] introduced an autonomous search method for landing MAVs on a pre-defined landing place using real-time monocular vision, with only one single reference image of the landing place, of unknown size, available before initiating the task. Other works in this area can be found in [56–58]. Different from these systems, systems relying solely on on-board sensors must use different and harder methods to achieve the same performance parameters [59,60].
There are several animals with different ways of locomotion in the environment. Among them, birds and insects move efficiently with their wings, which gives them more dynamics, speed, and manoeuvrability. Bio-inspired robots can move both in the air and on land, the latter either by jumping or by walking [61]. The ability of BMAVs to cope with wind turbulence, and their behaviour in the face of the surrounding environmental conditions, makes them fascinating.
For instance, consider a dragonfly-inspired robot. This robot has the following features:
– The presence of four wings gives it the maximum ability to lift extra payloads such as sensors and more processing power.
– The four wings provide high manoeuvrability while maintaining robust and stable flight performance against environmental disturbances.
– One actuator per wing makes the system control simpler.
– This type of wing allows the MAV to sustain long flight times with minuscule energy requirements.
The dynamics and kinematics of MAVs can be modelled using CAD software tools. A mathematical model of the system can then be verified with simulation software. The design of the autopilot system can be performed in parallel with the mechanical testing and construction. The aerodynamic analysis finally guides the design and fabrication of the final wing for the bio-inspired MAV (Figure 8.5).
[Figure: (a) force and geometry diagram; (b) drag force (mN) versus wind speed (m s⁻¹)]
and effective algorithms, can lead to substantial pose estimate enhancement in addi-
tion to bounding accumulated errors.
The utility of optical sensors for pose estimation is critical for the automatic
guidance of aerial vehicles that require some degree of decision-making autonomy
[62]. Through a bio-inspired image-processing approach for visual guidance known as optical flow (OF), translation and rotation in a series of consecutive images can be detected.
OF can be described as the apparent visual motion in a scene as seen by a moving
observer. Advancements in solid-state sensors and digital signal processors have
helped to implement OF in MAVs.
There have also been recent attempts to utilise multiple sensors for OF per-
formance improvement [63,64]. A major limitation to OF and all other electro-
optical (EO) approaches is that their performance is highly dependent on the visual
scene. Lighting conditions, lack of discernible or distinct features, and weather
conditions (e.g., rain or fog) can degrade their performance. This limits a MAV's utility in certain conditions and reduces its operational spectrum. Despite sig-
nificant advances in computer processing hardware, OF approaches are computa-
tionally intensive, and this may introduce undesirable latencies [49–53]. The
additional computational hardware required to process optical measurement-based
approaches means that they have higher relative power consumption than many
other sensing options. OF may also be constrained by the bandwidth of the visual
data input. The video sampling rate must account for aliasing because the update
rate of the computed pose estimate must at least match the perturbation rate.
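As a concrete example of how OF can be computed between consecutive frames, the sketch below uses the pyramidal Lucas–Kanade tracker available in OpenCV; the frame file names are hypothetical and the mean-flow summary is only a crude proxy for ego-motion.

```python
import cv2
import numpy as np

# Sparse optical-flow sketch (Lucas-Kanade) between two consecutive frames.
prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Select distinctive corners in the first frame, then track them in the second.
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=300, qualityLevel=0.01, minDistance=8)
p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)

good_old = p0[status.ravel() == 1].reshape(-1, 2)
good_new = p1[status.ravel() == 1].reshape(-1, 2)
flow = good_new - good_old

# The mean flow vector gives a crude estimate of apparent image translation;
# rotation can be inferred from how the flow field varies across the image.
print("mean flow (px):", flow.mean(axis=0))
```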
The fusion of flow sensor outputs with the INU to improve pose estimation is essential [8]. Flow sensors capable of sensing flow disturbances and control
systems capable of counteracting them before they induce a perturbation are
desired. Such systems have been shown to improve pose estimation and thus to
enhance attitude stability [65]. The minimal processing required to produce an
output makes this sensory technique ideal for enhancing the pose estimates of
INUs. This emerging area of research may provide the sensory needs for MAVs to
fly in severe turbulence.
In sonar transducers, piezoelectric materials are commonly used, as they vibrate when a voltage is applied across the material and vice versa. The operational bands of piezo-based sonars are limited by the oscillation frequency of the piezoelectric material. The operational frequencies of MEMS transducers, in contrast, can be tailored, and their structures better matched to the acoustic impedance of air, to offer higher performance.
Sonars are suited for MAV operations because they are resilient to routine
operational impacts and are relatively affordable. Their power and range are
directly proportional and entirely hardware-dependent.
They are not sensitive to the colour or reflectivity of the detected object.
Nonetheless, they are generally suited for sensing obstacles normal to the direction
of pulse propagation. Angled, smooth objects may lead to the redirection of the
beam resulting in false readings of an incident object. The reflected signal can also
be particularly sensitive to the target area, relative object attitude, and the ability of
the material to absorb sound energy (such as foam). The reliance on acoustic propagation is itself a fundamental limit on this sensing modality.
Ultrasonic sensors are generally immune to background sounds. This is not the
case for the high-frequency noise that may be induced, for instance, from propeller
motion. Thus, sensor placement may also present an issue. Environmental changes
such as pressure, altitude, and temperature variations will alter the speed of sound
and, consequently, the readings. Additionally, operating in turbulent or particle
dense (scattering) atmospheres will also influence the sonar readings. Sonars have
high latency compared with other range sensors. The first cause is that the trans-
ducers must be stopped before receiving a signal to avoid interference, which is
known as the blanking time. This is in addition to the reliance on sound, which is
physically limited compared to electromagnetic waves. The combination of limited
range and high latency of sonars may render them unreliable for the safe navigation
of the MAV.
These sensors are often used for obstacle detection, where the distance of a
measured obstacle can be inferred from the ToF measurement. A sensor array or a
rotating sensor for a 360° scan can help to map the nearby obstacles. Scan patterns
detect features like corners and specific geometries. It is possible to use a dual
sonar receiver structure for indoor Simultaneous Localisation and Mapping
(SLAM) [67].
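A minimal sketch of the ToF calculation behind such a reading, assuming the usual first-order temperature correction of the speed of sound (331.3 + 0.606·T m/s); the example values are illustrative.

def sonar_range(echo_time_s, temperature_c=20.0):
    """Obstacle distance from a sonar time-of-flight measurement."""
    c = 331.3 + 0.606 * temperature_c   # speed of sound in air (m/s)
    return c * echo_time_s / 2.0        # the pulse travels out and back

# a 5.8 ms round trip at 20 degrees C corresponds to roughly 1 m
print(sonar_range(0.0058))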
As the detection distance grows, the signal-to-noise ratio (SNR) decreases, and the
readings become less trustworthy. Their outdoor use is restricted to low-light conditions, owing to the interference of solar IR radiation. This, in turn, limits the
use of increasingly popular RGB-depth imaging sensors, which have been used for
indoor-mapping and navigation [68]. They are also highly sensitive to the reflec-
tivity and direction of motion of detected objects.
IR proximity sensors are more oriented towards obstacle detection. Sensor
arrays could be arranged around a MAV structure due to their beneficial SWaP
characteristics. They are often used for obstacle avoidance and tasks such as GNSS-
denied altitude estimation. For MAVs, they are better suited for tasks in which
visibility is limited, such as in the presence of smoke and low-lighting/night
operation.
8.3.6 LIDAR
LIDAR was initially developed for atmospheric sensing and topographic mapping but is now commonly integrated with GNSS and INU. LIDAR sensors used for range measurements characteristically operate like IR-range and sonar sensors, i.e., as ToF sensors. Unlike IR proximity sensors, however, the beam diverges little from the source, which enables ToF sensing based on the phase shift between the emitted and returned signals.
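For reference, a sketch of the generic phase-shift ranging relation behind such measurements (the modulation frequency and phase value are illustrative, and range ambiguity every half wavelength is ignored):

import math

def phase_shift_range(delta_phi_rad, mod_freq_hz):
    """Range from the phase shift between the emitted and returned signals."""
    c = 299792458.0                     # speed of light (m/s)
    # the wave covers twice the range, hence d = c * delta_phi / (4 * pi * f)
    return c * delta_phi_rad / (4.0 * math.pi * mod_freq_hz)

# a 90 degree shift at a 10 MHz modulation frequency is about 3.75 m
print(phase_shift_range(math.pi / 2.0, 10e6))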
A large number of pulses are emitted per unit time for 2D range scans, and the sensor captures the reflected waves. In many cases, one or more sources are used to accomplish each scan. For 3D scanning, compound sources, angled rotating mirrors, and rotating housings are used. LIDAR produces large data sets and a very high computational load. The intensity of the returned wave depends on the reflectivity of the detected object and may be affected by ambient illumination. Nevertheless, LIDARs are active devices, less sensitive than EO sensors to environmental factors, and they can also deliver depth information, whereas EO sensors rely on passively sensing ambient light. LIDAR ranging requires more power, particularly as the measurement range and scanning frequency increase. Another aspect
that must be considered with the use of lasers is safety, as MAVs will operate in urban environments. Regulations define which laser classes are eye-safe when humans are exposed [69]. Low-SWaP LIDARs are suited for MAV operations [70,71], although they compromise other operational requirements such as cost, robustness, accuracy, sensitivity, and range, and they suffer from solar interference in outdoor operations.
8.3.7 Cameras
CV is an essential aspect of MAVs since images captured by one or more onboard
cameras can provide both EA and pose estimation [7,69]. These techniques are
used for image segmentation, target detection, target tracking, feature extraction,
and mapping. Motion estimation entails the comparison of frames adjacent in time
to detect the displacement between objects [72–77]. Visual SLAM algorithms
perform feature extraction for landmark detection and must deal with all types of
uncertainties and cooperative mapping. CV is appropriate for MAVs since visual
sensors are compact and consume less power than active ranging sensors do.
Nonetheless, there are associated challenges that must be addressed with CV solutions. First, the computational load of image processing and map building strains limited on-board resources. As with any computationally intensive task, this is often handled by relying on powerful off-board computers, which introduces data latency and increases power consumption. MAVs must follow the Johnson
criteria (JC) for the quality of video transmission [69,78]. JC establishes the reso-
lution thresholds for object detection, orientation, recognition, and identification.
Compared with LIDAR scanners, cameras have a limited FoV, which limits the use
of solutions relying solely on vision, principally in active urban environments.
Combining multiple EO sensors and curved lenses to improve the FoV can increase
the role of CV-based systems.
Optical sensing can suffer from sensitivity to external changes in environmental conditions and disturbances. A single camera is not sufficient for MAV environmental-awareness applications, such as mapping, because instantaneous depth data are not captured. Depth information can be obtained with a complementary sensor, for instance by using two cameras (stereovision) [69] or a monocular camera combined with an INU and/or range sensors [79]. Kinect and other RGB-D sensors provide single-camera images with depth information, but they are limited in range, and their use is compromised by strong sunlight. The use of a single camera with an INU is more practical for MAVs because most onboard flight control systems are already equipped with INUs, which reduces payload mass and energy consumption. However, it has been shown that INUs alone are typically ill-suited for MAV operations because of their relatively high levels of uncertainty.
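For reference, the standard rectified-stereo relation that links disparity to depth can be sketched as follows (the focal length and baseline values are purely illustrative):

def stereo_depth(disparity_px, focal_length_px, baseline_m):
    """Depth of a point seen by two rectified cameras: Z = f * B / d."""
    return focal_length_px * baseline_m / disparity_px

# a 20-pixel disparity, a 700-pixel focal length and a 10 cm baseline give 3.5 m
print(stereo_depth(20.0, 700.0, 0.10))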
8.4 Illumination
Illumination is one of the most significant aspects of configuring the hardware. To achieve the most efficient system, the light source should be chosen according to the application, considering the required stand-off distances, the environmental constraints, and the amount of available light [80].
The main requirement for proper monitoring is to provide adequate contrast between the object and the background regions. The illumination angle and the source intensity should also be adjusted to emphasise the region of interest. Usually, lasers, incandescent lamps, and sometimes natural light are used as light sources for condition monitoring.
In addition to appropriate light source selection, it is also essential to consider
proper techniques to give the optimum results. There are three main techniques which
are used extensively for different machine vision applications [81] (Figure 8.6):
1. Backlighting: This method illuminates the subject from behind and provides strong contrast, but it captures only silhouette information.
2. Front lighting: This method provides direct illumination of the object; this
method has been extensively used in image processing applications.
3. Structured lighting: This method projects a known pattern (often grids or
horizontal bars) on to a scene. This method is usually utilised to estimate the
depth of the region.
hive [82]. Salmon use a similar ability to orient themselves within underwater light fields [83]. Another example is light reflection on the water surface, which helps insects find their path and orientation [84].
Some research studies have focused on Guidance, Navigation, and Control
(GNC) for UAVs, resulting in various methods and systems [85]. There are three
main categories: control, navigation, and guidance. For each category, methods are
grouped at the highest level based on the autonomy level they provide, and then
according to the algorithmic approach used, which in most cases depends on the
type of sensors used [85].
UAV GNC is traditionally carried out through three methods, namely radio
control, video-based, and autopilot [86].
The use of UAVs in an indoor environment is a challenging task and imposes stricter requirements than outdoor operation. GPS is of little help for collision avoidance indoors: an indoor MAV usually operates in a GPS-denied environment, and the use of RF signals is further inhibited because they are reflected and degraded by indoor obstacles and walls.
Stationary landmarks can aid localisation in GPS-disrupted or GPS-denied environments. In particular, given the vehicle's path, a landmark-placement problem can be formulated, with algorithms that place the minimum number of landmarks while satisfying the localisation, sensing, and collision-avoidance constraints.
In a radio-control system, the UAV navigation system uses a transmitter and a
receiver [86]. Remote control (RC) equipment consists of a radio transmitter that
includes several radio channels. The pilot transmits instructions to the UAV using
any of these channels [3]. In RC systems, the transmitter range varies and usually covers about five kilometres. A UAV radio transmitter must have at least four to six channels to command the different flight controls. Additional channels can be used for camera control, where the receiver
transmits instructions to the servomotors and speed controller [86] (Figure 8.7).
In video-based navigation systems, a camera installed on the MAV takes videos and photos of the regions it passes over and sends them to the ground station through a video transmitter. Small size, low weight, and high visibility and clarity are considered the essential features of a video system [86]. The images sent by the video transmitter and received by the antenna are displayed on a screen at the ground station. Analysis of the received signal can be used to evaluate the antennas, and in some cases amplifiers are combined with the antenna, which makes it much easier to receive images. Currently, ultrasonic sensors and colour, thermal, or IR cameras obtain information about the environment. MAVs often use colour cameras, which are useful only in the daytime and cannot provide scale and depth information about the observed environment. In a video-based navigation system, CV plays a vital role in MAV automation because it extracts from the images the data required for navigation, stabilisation, and additional data collection.
Ordinarily, video transmitters can broadcast signals only over a limited distance, which is not suitable for long-distance flights. When the drone moves out of range it enters a dead zone, which reduces the usable flight radius [86]. The best type of GNC for MAVs is the autopilot, which is a set of software and hardware tools that enables their flight
Figure 8.7 Pathfinding robot with an obstacle in the (a) bottom and (b) top of the
avoidance path [13]
negligible lag and high bandwidth with minimal losses, while control over thou-
sands of miles results in severe control lag, low bandwidth, and substantial losses.
Hence, UAVs for long-distance and endurance flights typically have augmentation
autopilots to stabilise flights if there is a loss of the command and control link.
Likewise, the control system is structured differently for the same reason.
The positioning system is responsible for the central part of the MAV navi-
gation. There are different methods for UAV positioning, such as GPS and inertial
navigation system (INS) [3].
In drones, GPS is usually used to determine position, velocity, and altitude. To provide an accurate position for the drone, the GPS receiver should be in contact with at least four satellites simultaneously [86]. Since GPS signals are easily disturbed by external noise or interference [18], MAVs equipped only with GPS may lose their connection for some time. Such a situation forces drones to land and abort their mission due to safety concerns. To avoid this problem, the INS is an appropriate method for estimating the location of the drone when it temporarily loses its GPS connection [3].
The INS includes gyroscopes and accelerometers, which are used to calculate the position and orientation of the UAV. Nowadays, GPS and INS are combined to reduce positioning errors and to produce accurate navigation information. The Kalman filter is a typical fusion algorithm, and the extended Kalman filter (EKF) can estimate the MAV's location when the GPS information is momentarily lost [3]. MAV navigation and guidance can also use wireless networks and the Internet to exchange instructions [91–93].
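As a deliberately simplified illustration of such GPS/INS fusion, the sketch below runs a one-dimensional linear Kalman filter rather than the EKF of [3]; all noise values, rates, and the simulated GPS fix are invented.

import random

x, p = 0.0, 1.0        # position estimate (m) and its variance
q, r = 0.05, 4.0       # process (INS drift) and GPS measurement noise variances

def predict(v_ins, dt):
    """Dead-reckoning step driven by the INS velocity."""
    global x, p
    x += v_ins * dt
    p += q

def update(z_gps):
    """Correction step applied whenever a GPS fix is available."""
    global x, p
    k = p / (p + r)            # Kalman gain
    x += k * (z_gps - x)
    p *= (1.0 - k)

for step in range(10):
    predict(v_ins=1.0, dt=0.1)              # 10 Hz INS updates
    if step % 5 == 4:                       # a (noisy) GPS fix every 0.5 s
        update(0.1 * (step + 1) + random.gauss(0.0, 2.0))
print("fused position estimate (m):", x)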
Heading information, which can be obtained from sky polarisation patterns, can be used as an external compass. Skylight polarisation patterns are affected by the sun's location [98]. The main advantage of using the skylight polarisation pattern rather than the sun directly is that small patches of sky are sufficient for orientation tasks. Photodiode-based sensors are traditionally used to read the polarisation data. Moreover, as camera technology becomes cheaper and more advanced, new methods using CCD or CMOS sensors and/or fisheye lenses have been introduced to acquire the polarisation information.
[Figure: skylight polarisation sensing — photodiodes at orientations ϕ1–ϕ4 read by a microcontroller]
1. In the scanning model, the sky is scanned actively by swinging the sensor around the vertical body axis. When the output of each POL-OP unit (or of a combination of several POL-OP units) reaches its maximum value, the robot has swung itself into the direction of the solar meridian.
2. The method uses the current information about the solar meridian to find the heading direction.
3. By comparing the robot's current orientation and the POL-OP units' current output values with a lookup table, the compass direction can be obtained.
4. The lookup table should be recorded before the experiments by performing a single 360° rotation of the robot.
In contrast, the simultaneous model does not require continuous scanning movements to determine the heading direction. In [101], a simultaneous model is employed that does not need the lookup table; instead, an analytical procedure is used to derive the compass information from the POL-OP unit values. The authors also account for the changes in the polarisation pattern during the day, either by normalising the outputs of the POL-OP units or by regularly updating the lookup table, under the following considerations.
– A sigmoid function is applied to the POL-OP signals to de-logarithmise them.
– Two candidate orientations can then be derived analytically [101].
[Figure: a POL-OP unit — photodiodes with polarisers and blue filters feeding a log-ratio amplifier (LOG100); the panels show the normalised responses of the POL sensors and of the POL-OP unit versus the e-vector orientation]
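For illustration, the e-vector orientation can be estimated from three analysers in a textbook way; this generic Stokes-parameter sketch is not the POL-OP log-ratio circuit of [101], and the 180° ambiguity of the e-vector remains.

import math

def e_vector_orientation(i_0, i_45, i_90):
    """E-vector angle (degrees) from intensities behind linear polarisers
    oriented at 0, 45 and 90 degrees."""
    s1 = i_0 - i_90                  # linear Stokes components
    s2 = 2.0 * i_45 - i_0 - i_90
    return 0.5 * math.degrees(math.atan2(s2, s1))

# equal 0/90-degree readings with a strong 45-degree reading -> about 45 degrees
print(e_vector_orientation(0.5, 1.0, 0.5))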
Several studies, such as [99,103–105], have shown that directional information obtained from polarisation is highly accurate, with the residual error becoming apparent over the distance travelled. Schmolke et al. [106] compared the performance of a polarisation compass with that of path integration without an external reference; their results showed that the e-vector compass has the highest accuracy.
Chahl and Mizutani [107] improved two bio-mimetic sensors and then performed flight tests for the orientation and stabilisation of an aircraft using the polarisation, spatial, and spectral distribution of the ambient light. Their main aim was to imitate the head of a particular kind of dragonfly.
[Figure: parallax between two cameras, A and B, viewing points C and D — what camera B sees]
The design of MAVs involves conceptual design, preliminary design, and detailed
design [143]. Each step requires increasingly sophisticated sizing, aerodynamic,
aeroelastic, structural, propulsion, stability, control, electronic, and fabrication
analysis [11,26,27].
Sizing determines the optimum values of the dimensions and weights, and it usually comprises five steps: (1) defining the mission, (2) setting the flight mode based on the type, (3) determining the wing shape (planform) and aspect ratio, (4) constraint analysis, and (5) weight estimation [11,26,27]. The route is analysed during the mission definition, which helps to determine the flight time, cruise speed, turning speed, flight modes, and the wing shape and aspect ratio for the type of mission. Then, to determine the appropriate UAV wing and thrust loadings, a constraint analysis is carried out to simulate the kinematic and dynamic equations of the flight. Along with the steps above, different weight-estimation methods can be employed, resulting in the determination of the UAV geometry and dimensions as well as the calculation of some aerodynamic parameters for each type [27].
MAVs are more than scaled-down versions of larger aircraft packed into a small volume [10]. The challenges and complexity in the design and fabrication of insect-sized drones grow significantly because of the correspondingly low Reynolds numbers that result from their low speed and small size [144]. The difficulties of flight in this flow regime have motivated the study of insect flight [145].
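As a rough numerical illustration of this flow regime, the chord-based Reynolds number Re = ρvL/μ can be evaluated with standard sea-level air properties (the airspeed and chord below are only indicative of an insect-scale MAV):

def reynolds_number(airspeed_m_s, chord_m, rho=1.225, mu=1.81e-5):
    """Chord-based Reynolds number for sea-level air."""
    return rho * airspeed_m_s * chord_m / mu

# 3 m/s with a 3 cm chord gives Re of roughly 6,000, far below the millions
# typical of full-scale aircraft
print(reynolds_number(3.0, 0.03))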
Next, some challenges for designing some types of MAVs are discussed.
the harvested power. The piezoelectric energy harvested from the morphing
of wings can also power many sensors and cameras using wasted mechanical
energy [75].
8.10 Conclusion
This chapter surveys bio-inspired UAV applications for imagery acquisition in different environments, together with design challenges and possible solutions. Moreover, the manufacturing methods and their challenges, propulsion systems and actuators, power supply and endurance, and the control and navigation of drones are reviewed, and new ideas are proposed to overcome the existing limitations. The importance of swarm flight and of the separation of drones is also discussed. Using only one drone for a given mission can be risky in view of technical or other problems; hence, various missions can be performed more efficiently with multiple UAVs or drone swarms. A swarm flight also has the advantage that, if one drone of the swarm strays in flight, the rest can carry out the mission. These air vehicles push the flight limitations in the low-Reynolds-number regime, requiring new research into design optimisation procedures, lightweight structures and
References
[138] R. He, A. Bachrach, and N. Roy, “Efficient planning under uncertainty for
a target-tracking micro-aerial vehicle,” in 2010 IEEE International
Conference on Robotics and Automation (ICRA), pp. 1–8, 2010.
[139] M. W. Achtelik, S. Weiss, M. Chli, and R. Siegwart, “Path planning for motion
dependent state estimation on micro aerial vehicles,” in 2013 IEEE International
Conference on Robotics and Automation (ICRA), pp. 3926–3932, 2013.
[140] C. Forster, M. Faessler, F. Fontana, M. Werlberger, and D. Scaramuzza,
“Continuous on-board monocular-vision-based elevation mapping applied
to autonomous landing of micro aerial vehicles,” in 2015 IEEE International
Conference on Robotics and Automation (ICRA), pp. 111–118, 2015.
[141] N. Razmjooy, M. Ramezani, and V. V. Estrela, “A solution for Dubins
path problem with uncertainties using world cup optimization and
Chebyshev polynomials,” in Y. Iano et al. (eds), Proceedings of the 4th
Brazilian Technology Symposium (BTSym’18), Smart Innovation,
Systems and Technologies, vol. 140. Springer, Zurich, Switzerland, 2019.
DOI: 10.1007/978-3-030-16053-1_5.
[142] N. Razmjooy, M. Ramezani, V. V. Estrela, H. J. Loschi, and D. A. do
Nascimento, “Stability analysis of the interval systems based on linear
matrix inequalities,” in Y. Iano et al. (eds), Proceedings of the 16th
Brazilian Symposium on Computer Graphics and Image Processing
(BTSym’18), Smart Innovation, Systems and Technologies, vol. 140,
Springer, 2019. DOI: 10.1007/978-3-030-16053-1_36.
[143] M. Sadraey, “A systems engineering approach to unmanned aerial vehicle
design,” in 10th AIAA Aviation Technology, Integration, and Operations
(ATIO) Conference, p. 9302, 2010.
[144] W. Shyy, Y. Lian, J. Tang, et al., “Computational aerodynamics of low
Reynolds number plunging, pitching and flexible wings for MAV appli-
cations,” Acta Mechanica Sinica, vol. 24, pp. 351–373, 2008.
[145] T. Nguyen, D. S. Sundar, K. S. Yeo, and T. T. Lim, “Modeling and analysis
of insect-like flexible wings at low Reynolds number,” Journal of Fluids
and Structures, vol. 62, pp. 294–317, 2016.
[146] I. M. Al-Qadi and A. M. Al-Bahi, “Micro aerial vehicles design challenges:
State of the art review,” in SSAS UAV Scientific Meeting & Exhibition,
2006, pp. 29–45.
[147] N. Razmjooy, M. Khalilpour, V. V. Estrela, and H. J. Loschi, “World Cup optimization algorithm: An application for optimal control of pitch angle in hybrid renewable PV/wind energy system,” in M. Quiroz, A. Lara, Y. Maldonado, L. Trujillo, and O. Schuetze (eds), Proceedings of the NEO 2018: Numerical and Evolutionary Optimization, 2019.
[148] P. A. T.P., R. Pandiarajan, and P. Raju, “Wireless power transmission to
UAV using LASER beaming,” International Journal of Mechanical
Engineering and Research, vol. 5, no.1, 137–142, 2015.
[149] M. C. Achtelik, J. Stumpf, D. Gurdan, and K.-M. Doth, “Design of a
flexible high performance quadcopter platform breaking the MAV endur-
ance record with laser power beaming,” in Proceedings of 2011 IEEE/RSJ
This chapter presents the concepts, advantages, and practical examples of computer vision on the robot operating system (ROS), applied explicitly to the unmanned aerial vehicle (UAV) domain. ROS was built around the concept of abstraction (using message interfaces) to allow the re-use of software. This concept allows already-built computer vision source code to be integrated into the ROS environment.
In conjunction with the available simulators for ROS, it is possible to design
UAV solutions in a model-based concept, anticipating many operational and
technical constraints prior to operational release.
9.1 Introduction
The robot operating system (ROS) is not a real operating system (OS) but a framework and set of tools that offer the functionality of an OS on a heterogeneous computer arrangement or cyber–physical system (CPS). The ROS ecosystem consists of robotics middleware, that is to say, an assortment of software tools for developing robots: hardware abstraction, low-level device control, implementation of commonly used functionality, message passing between processes, and package management. Running sets of ROS-based processes are represented in a graph
architecture where processing takes place in nodes that may receive, post and mul-
tiplex sensor, control, state, planning, actuator, and other messages.
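As a minimal sketch of this node/topic architecture (the node and topic names below are invented for illustration), a Python node that periodically posts a message on a topic could look as follows:

#!/usr/bin/env python
import rospy
from std_msgs.msg import String

def main():
    rospy.init_node("status_publisher")                    # illustrative node name
    pub = rospy.Publisher("/uav/status", String, queue_size=1)
    rate = rospy.Rate(1)                                   # publish at 1 Hz
    while not rospy.is_shutdown():
        pub.publish(String(data="airborne"))               # message posted on the topic
        rate.sleep()

if __name__ == '__main__':
    main()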
Computer vision (CV) is a field of engineering that enables computers to comprehend what is shown in an image, identifying its attributes and representing it symbolically. It has synergy with artificial intelligence in applications such as object detection and recognition, navigation, tracking, and image-based decision systems, allowing better and more efficient system deployment. It faces many challenges, such as sensor (e.g., camera) quality, the environment in which it is applied, sometimes with light restrictions or obstacles, target dynamics, and so on.
1
DCTA-ITA-IEEA, Sao Jose dos Campos, SP, Brazil
2
Universidade Federal Fluminense (UFF), RJ, Brazil
These challenges directly relate to real-time processing and typical usage for CV in
robotics or industrial processes.
9.3 Applications
9.3.1 OpenCV in ROS
As previously stated, the smooth integration of ROS with other packages allows
new functionalities to be available on ROS deployments. OpenCV, which is a
widely used open-source CV library [23], will be used for the following examples.
OpenCV has algorithms for image manipulations such as translation, rotation,
crop, arithmetic, and bitwise operations, convolution, blurring, de-noising, shar-
pening, threshold operation, dilatation, erosion, edge detection, and segmentation
techniques like contours.
For these manipulations, OpenCV translates images into a matrix representation in which each pixel is a matrix cell containing colour-space values. In this text, the colour space will be RGB (red–green–blue), which OpenCV stores as BGR. It is important to mention that other colour spaces, such as HSV (hue–saturation–value) and CMYK (cyan–magenta–yellow–key), are available but out of scope. Based on this matrix representation, the algorithms access, manipulate, and analyse the images (Figures 9.1–9.3).
Figure 9.3 Grayscale pixel values between 0 and 255 from the cropped image
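A short sketch of a few of the manipulations listed above, using OpenCV's Python bindings; the image path, crop window, and thresholds are placeholders.

import cv2

img = cv2.imread("uav_frame.png")                       # BGR matrix (placeholder path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)            # colour-space conversion
crop = gray[100:200, 150:250]                           # cropping is matrix slicing
blur = cv2.GaussianBlur(gray, (5, 5), 0)                # blurring / de-noising
_, binary = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY)   # threshold operation
edges = cv2.Canny(blur, 50, 150)                        # edge detection

h, w = gray.shape
M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), 45, 1.0)       # 45-degree rotation
rotated = cv2.warpAffine(gray, M, (w, h))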
Moving forward, and considering Ubuntu Linux as the ROS host, the corresponding Ubuntu Linux packages must first be installed.
A typical Python code for ROS node that consumes and processes ROS images
requires the specific libraries:
● rospy – the ROS client library for Python
● cv2 – the OpenCV library
● cv_bridge – converts between ROS images and OpenCV images in Python
The following example is a node that subscribes to the UAV camera data topic and then converts the ROS image data type into the OpenCV image data type through a library called cv_bridge (Figures 9.4 and 9.5). This Python ROS node is as follows:
Listing 9.2: Python code for image conversion between ROS and OpenCV

#!/usr/bin/env python

import rospy
import cv2
from cv_bridge import CvBridge

# to handle the ROS camera image
from sensor_msgs.msg import Image

bridge = CvBridge()

def callback(data):
    # convert the incoming ROS image message into an OpenCV (BGR) image
    cv_image = bridge.imgmsg_to_cv2(data, "bgr8")
    cv2.imshow("Image from ROS", cv_image)
    cv2.waitKey(1)   # short wait so the window refreshes without blocking the callback

def main():
    rospy.init_node("ROSImgToCV")
    image_subscribe = rospy.Subscriber("/ardrone/image_raw", Image, callback)
    rospy.spin()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
The Python code presented creates a ROS node that subscribes to the /ardrone/image_raw topic (Figure 9.6), which outputs the UAV camera image. Once inside the ROSImgToCV node, this image is converted using the cv_bridge library. After this conversion, any OpenCV processing can be conducted with the available algorithms or reused code, and its outputs can be published as commands and/or feedback for the rest of the ROS system.
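A small variation on Listing 9.2, sketching how a processed frame could be converted back with cv_bridge and published for the rest of the ROS system; the /cv/edges topic name and the Canny thresholds are illustrative.

#!/usr/bin/env python
import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def callback(data):
    cv_image = bridge.imgmsg_to_cv2(data, "bgr8")              # ROS -> OpenCV
    gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                           # any OpenCV processing
    edges_pub.publish(bridge.cv2_to_imgmsg(edges, "mono8"))    # OpenCV -> ROS

if __name__ == '__main__':
    rospy.init_node("EdgesCV")
    edges_pub = rospy.Publisher("/cv/edges", Image, queue_size=1)
    rospy.Subscriber("/ardrone/image_raw", Image, callback)
    rospy.spin()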
#!/usr/bin/env python

# imports and node initialisation needed by the commands below
import time
import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import Empty

rospy.init_node("simple_mission")   # node name is illustrative

# publishers creation
cmd_vel = rospy.Publisher('/cmd_vel', Twist, queue_size=1)
takeoff = rospy.Publisher('/ardrone/takeoff', Empty, queue_size=1)
land = rospy.Publisher('/ardrone/land', Empty, queue_size=1)

# message type definition
move_msg = Twist()
takeoff_msg = Empty()
land_msg = Empty()

# it is required to pay attention to the timing between ROS commands;
# to overcome any issue, timers are being used
time.sleep(5)

# example of takeoff
takeoff.publish(takeoff_msg)
time.sleep(1)

# example of stopping
move_msg.linear.x = 0.0
move_msg.angular.z = 0.0
cmd_vel.publish(move_msg)
time.sleep(1)

# example of turning 90 degrees
move_msg.linear.x = 0.0
move_msg.angular.z = 1.0
cmd_vel.publish(move_msg)
time.sleep(1)

# example of moving forward
move_msg.linear.x = 1.0
move_msg.angular.z = 0.0
cmd_vel.publish(move_msg)
time.sleep(1)

# example of landing
land.publish(land_msg)
time.sleep(1)
[Figure: ROS computation graph with the /ROSimgToCV node subscribed to /ardrone/image_raw, alongside the Gazebo simulation nodes and the other /ardrone topics]
OpenCV has a built-in algorithm to detect people: the HOG with SVM approach [24]. HOG stands for Histogram of Oriented Gradients and is a 'feature descriptor' used to generalise objects. SVM stands for Support Vector Machine [25], a machine-learning algorithm for classification.
The arguments of hog.detectMultiScale have the following effects on the HOG+SVM algorithm [26]:
● winStride
Determines the step size in both x and y for the sliding window. A smaller winStride requires more evaluations by the algorithm.
● padding
Indicates the number of pixels in both x and y by which the sliding window is padded.
● scale
Controls the factor by which the image is resized at each layer of the image pyramid.
● useMeanshiftGrouping
Boolean value indicating whether mean-shift grouping will be performed to handle potentially overlapping bounding boxes.
The algorithm output is illustrated in Figure 9.9.
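A minimal sketch of how these parameters might be passed to OpenCV's default people detector; the input file name is a placeholder, and the parameter values are merely typical rather than those used in the chapter.

import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread("frame.jpg")                          # placeholder input frame
rects, weights = hog.detectMultiScale(image,
                                      winStride=(8, 8),  # sliding-window step
                                      padding=(16, 16),  # padding around the window
                                      scale=1.05,        # image-pyramid scale factor
                                      useMeanshiftGrouping=False)

for (x, y, w, h) in rects:                               # draw the detections
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", image)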
Finally, an integrated algorithm that takes advantage of the easy integration of ROS with OpenCV is presented. This algorithm consumes UAV camera images through the ROS node and topic infrastructure and, according to the CV evaluation, commands the UAV to land if a person is detected in the image.
import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from std_msgs.msg import Empty

# detector, bridge and message set-up used by the callback below
bridge = CvBridge()
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
land_msg = Empty()

def callback(data):
    cv_image = bridge.imgmsg_to_cv2(data, "bgr8")
    rects, _ = hog.detectMultiScale(cv_image)
    if len(rects):                      # at least one person detected
        land.publish(land_msg)

def main():
    global land
    rospy.init_node("LandingCV")
    land = rospy.Publisher("/ardrone/land", Empty, queue_size=1)
    rospy.Subscriber("/ardrone/image_raw", Image, callback)
    rospy.spin()

if __name__ == '__main__':
    main()
Moreover, the modified ROS graph including the new /LandingCV node is presented in Figure 9.10, which confirms that this node subscribes to the /ardrone/image_raw topic and publishes on the /ardrone/land topic.
[Figure 9.10: ROS graph with the /LandingCV node subscribed to /ardrone/image_raw and publishing on /ardrone/land, alongside the Gazebo simulation nodes and topics]
Figure 9.11 ROS graph for tum ardrone deployment (nodes and topics)
In the top-right corner of the GUI (the Node Communication Status), information related to the communication between the nodes and the UAV is provided, such as the navigation data (Navdata) frequency, the Control frequency, the Pose Estimation frequency, and the motor statuses. For reference, Navdata should be above 100 Hz and the Pose Estimates at about 33 Hz to indicate correct operation. An 'Autopilot Status' field shows the status of the Autopilot node, which allows UAV navigation through waypoints sent via the camera feed screen.
Figure 9.12 tum_ardrone windows
setMaxControl 1
setInitialReachDist 0.2
setStayWithinDist 0.5
setStayTime 3

# rectangle
goto 0 0 1 0
goto 0 0 0 0
goto -1.5 0 0 0
goto -1.5 0 1 0
goto 1.5 0 1 0
goto 1.5 0 0 0
goto 0 0 0 0

# house
goto -1 0 0 0
goto 1 0 0 0
goto -1 -2 0 0
goto 1 -2 0 0
goto 0 -3 0 0
goto -1 -2 0 0
goto -1 0 0 0
goto 1 -2 0 0
goto 1 0 0 0
goto 0 0 0 0

land
ROS integration with other frameworks, libraries, and already built code creates an
excellent platform for robotics systems development using CV tasks through an
agile development workflow from prototype to production.
ROS and CV applications have been used in research areas of self-driving cars,
industrial robots (manufacturing), smart home robots, surgical robotics, flying
robots, space robots, underwater vehicles, inspection robots, mining, military
applications, among others.
These integration aspects are closely tied to new technology trends such as the digital twin and the Internet of Things (IoT), allowing tasks such as augmented reality, navigation, motion planning, and CV-based perception to be simulated and evolved throughout all system life-cycle phases.
The release of ROS 2.0 brought more focus on real-time control and distributed processing. This enables high-throughput applications such as deep learning, thanks to distributed processing, as well as embedded applications with real-time requirements, since ROS 2.0 provides a real-time operating system (RTOS) interface. Compliance with strict real-time response requirements, in turn, opens the way to edge embedded applications.
9.5 Conclusion
ROS is a very useful tool for robot development and simulation [19,20]. This chapter showed that deploying CV tasks on ROS is straightforward, mainly thanks to its integration with available libraries and development tools, such as the Gazebo simulator, ready-to-integrate ROS packages (tum_ardrone), and the OpenCV libraries.
It was demonstrated that with simulation, it is possible to train and test models
producing a more mature solution before physical or real-world deployment,
anticipating issues and reducing costs.
Despite the importance of reactivity and low latency in UAV control, ROS itself is not an RTOS, although it allows some real-time code to be used.
The lack of real-time support is being addressed by the ROS 2.0 including
security aspects and distributed processing [32,33].
CPSs have restricted computational and storage facilities because they are generally small and are embedded into, or interact with, larger systems.
With the advent of cloud computing and the IoT [34,35], UAV-CPSs encounter several new prospects for extending their capabilities via cloud assets in different forms. Efforts are needed towards tools that facilitate the integration of UAV-CPSs with cloud computing and that classify and extend them to areas
like (1) remote intelligence [34,36–39], (2) big data handling [40], (3) virtualisation
[17,41], (4) mobile robots [32,42], (5) wireless sensor networks [32,42], and
(6) vehicular networks [32,42,43].
References
[1] Koubaa A Robot Operating System (ROS) The Complete Reference
(Volume 2). Studies in Computational Intelligence book series (SCI, volume
707), Springer Cham, Zurich, Switzerland, 2017.
[2] Sinapov J, Schenck C, and Stoytchev A. Learning relational object categories
using behavioral exploration and multimodal perception. 2014 IEEE International
Conference on Robotics and Automation (ICRA). 2014, p. 5691–5698.
[3] Gupta S, Girshick RB, Arbeláez PA, and Malik J. Learning rich features
from RGB-D images for object detection and segmentation. Proceedings of
2014 ECCV. Zurich, Switzerland, 2014.
[4] Razmjooy N, Mousavi BS, and Soleymani F. A hybrid neural network
imperialist competitive algorithm for skin color segmentation. Mathematical
and Computer Modelling, Vol. 57, 2013, p. 848–856.
[5] Khalilpour M, and Razmjooy N. A hybrid method for gesture recognition.
Journal of World’s Electrical Engineering and Technology, Vol. 2, no. 3,
2013, p. 47–53.
[6] Firoze A, and Deb T. Face recognition time reduction based on partitioned
faces without compromising accuracy and a review of state-of-the-art face
recognition approaches. Proceedings of ICIGP 2018, Hong Kong, China.
10.1 Introduction
The unmanned aerial vehicle (UAV) is a technology that improves typical tasks such as photography, delivery, agriculture, inspection, rescue, and catastrophe response, among others. Initially used in military applications, UAVs are nowadays increasing their presence as commercial platforms and consumer electronics, offering advantages such as low maintenance costs, stable navigation, reliability, and compact size. Despite these benefits and the evolving technology, UAVs have traits such as complex architectures and mission-critical applications that require a rigorous development process to guarantee operational success throughout the whole life cycle.
1
Instituto Tecnológico de Aeronáutica (ITA)
[Figure: the sensing (e.g., accelerometer) – processing (e.g., computer) – acting (e.g., motor) loop between the UAV and its environment]
There are still challenges regarding UAVs, such as limited payload, computational power and sensing, battery life, fast dynamics, and electronic control and safety, all of which are tightly related to, and can be impaired or triggered by, security issues. The UAV system consists of three main components: sensors, such as cameras, accelerometers, and gyroscopes; actuators, such as motors, gimbals, and grippers; and the control unit/software implemented on computers or embedded boards. These components are responsible for a state machine consisting of sensing the environment, processing the sensed data, and acting on the environment (Figure 10.2).
[Figure: timeline of ROS distributions (Groovy, Hydro, Indigo, Jade, Kinetic, Lunar, Melodic) and their support windows, 2012–2023]
service finishes. When one calls an action, by contrast, the caller does not necessarily have to wait for the action to complete. A navigation system is typically treated as an action: ROS calls it to perform some movement and, during this movement, other tasks can be performed (asynchronous behaviour, like a thread).
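A sketch of this asynchronous pattern using actionlib; it assumes a running move_base action server from the ROS navigation stack, whereas a UAV autopilot would normally expose a different action type and goal.

#!/usr/bin/env python
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

rospy.init_node("action_example")
client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
client.wait_for_server()

goal = MoveBaseGoal()
goal.target_pose.header.frame_id = "map"
goal.target_pose.pose.position.x = 1.0          # illustrative waypoint
goal.target_pose.pose.orientation.w = 1.0

client.send_goal(goal)                          # returns immediately (asynchronous)
rospy.loginfo("goal sent; other tasks can run while the motion executes")
client.wait_for_result()                        # block only when the result is needed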
cyber warfare [8]. News reports and publications related to sensor-reading spoofing (e.g., GPS), jamming, physical access, and data-management flaws lead to the conclusion that critical applications and platforms, both commercial and military, can be vulnerable to security risks.
A parallel with the aircraft industry shows that security concerns are rising and that the guidelines are still too young to be reflected in currently available aircraft; this is the case of RTCA DO-356, Airworthiness Security Methods and Considerations, released in 2014. Currently, certification agencies such as the Federal Aviation Administration (FAA) have no specific regulations addressing cybersecurity; they handle these gaps during design and certification through special conditions, in order to guarantee a level of safety equivalent to the existing airworthiness standards for aircraft designs.
A search of the FAA Special Conditions database shows the history of special conditions issued for both the transport category (commercial aircraft) and general aviation (business jets).
Multi-UAV operations have been used to provide network infrastructure in remote areas or during temporary network disruptions in specific areas; in these applications, the UAVs are the nodes of the network (aerial nodes) [9,10]. Networks built from multi-UAV fleets are susceptible to attacks such as unauthorised access, signal jamming, signal spoofing, and data injection. This type of UAV operation is an emerging area of research [11]. Attacks such as the Coagulation Attack, which affects the physical configuration and manoeuvres of the UAVs, have already been studied on UAV networks; their impacts include network failures, crashes, and UAV hijacking [12].
In the information systems domain, it is normal practice to take a service down when an attack arises. This approach is not feasible for CPS applications, and more specifically for UAV operations, which, besides their information aspects, must account for their physical interaction with the environment, for example during a flight; a mitigation or design decision is then required to guarantee safe operation. The security scenarios consider the published ROS security flaws [17] and their possible impacts on the consumer UAV under test with ROS:
● Data Capturing and Interception (sniffing)
● Data Interruption aka Denial of Service (DoS)
● Data Fabrication (spoofing – non-repudiation aspect)
For UAV applications, several threat scenarios can be devised. Some interesting examples are:
● A malicious agent forcing a landing procedure during flight to hijack the UAV through fake topic publishing (injection);
● False sensor readings through node corruption, causing misbehaviour or a UAV crash;
● A fake topic publisher flooding a subscriber with a large volume of data to cause denial of service through excessive processing;
● Data breaching to capture the image transmission, compromising privacy.
[Figure: ROS graph with the /move_square node subscribed to the UAV camera]
The /cmd_vel topic is responsible for the linear and angular movements of the UAV; owing to the lack of any authentication check, it is possible to create an unauthorised node that interferes with the proper operation of /cmd_vel. A simple piece of malicious code is presented next (Figure 10.12).
import rospy
from geometry_msgs.msg import Twist

# the node registers itself as /unauthorized (set-up needed by the loop below)
rospy.init_node('unauthorized')

pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)

rate = rospy.Rate(1)

while True:
    pub.publish()   # an empty Twist message overrides the legitimate commands
    rate.sleep()
This code creates a node named /unauthorized, which publishes empty messages on the /cmd_vel topic and thereby interferes with the publications of the legitimate /move_square node.
The behaviour evidenced in simulation is that the UAV freezes: the last command published on /cmd_vel, whether by /move_square or by /unauthorized, is the one that commands the UAV, and since the /unauthorized node publishes at a higher frequency, it takes control of /cmd_vel away from /move_square.
This malicious node appears in the architecture as another interface with /cmd_vel, as highlighted in the graph (Figure 10.13).
10.9 Conclusion
Security is a clear concern for UAVs and a key enabler of their wide adoption. With ROS as the supporting middleware for UAVs, these security aspects must be considered within ROS itself. The conclusions drawn here for the UAV domain can be extended to all CPSs, since ROS is expected to be adopted in a wide range of domains [23–26]. The situation can be improved by the continuity of initiatives such as SROS (Secure ROS) and by the mechanisms provided by ROS 2.
[Figure 10.13: ROS graph showing the /unauthorized node publishing on /cmd_vel alongside /move_square and the /ardrone topics]
References
[1] Vattapparamban E, Guvenç I, Yurekli AI, et al. Drones for Smart Cities:
Issues in Cybersecurity, Privacy, and Public Safety. In: Proc. 2016 Int’l Wir.
Comm. and Mob. Comp. Conf. (IWCMC). IEEE, 2016, Paphos, Cyprus,
p. 216–221.
[2] Estrela VV, Saotome O, Loschi HJ, et al. Emergency Response Cyber-
Physical Framework for Landslide Avoidance with Sustainable Electronics.
Technologies, 6, 2018, p. 42.
[3] Khalilpour M, and Razmjooy N. A Hybrid Method for Gesture Recognition.
Journal of World’s Electrical Engineering and Technology, 2, 3, 2013, p. 47–53.
[4] Hooper M, Tian Y, Zhou R, et al. Securing Commercial WiFi-based UAVs from
Common Security Attacks. In: Military Communications Conference,
MILCOM 2016-2016 IEEE. IEEE; Baltimore, MD, USA, 2016, p. 1213–1218.
[5] Bisio I, Garibotto C, Lavagetto F, Sciarrone A, and Zappatore S. Blind
Detection: Advanced Techniques for WiFi-Based Drone Surveillance. IEEE
Transactions on Vehicular Technology, 68, 2019, p. 938–946.
[6] Lykou G, Anagnostopoulou A, and Gritzalis D. Smart airport cybersecurity:
Threat mitigation and cyber resilience controls. Sensors. 2018;19(1). doi:
10.3390/s19010019.
[7] Estrela VV, and Coelho AM. State-of-the art Motion estimation in the con-
text of 3D TV. In: Multimedia Networking and Coding. IGI Global,
Hershey, PA, USA, 2013, p. 148–173. DOI:10.4018/978-1-4666-2660-7.
ch006.
[8] Faria LD, Silvestre CA, Correia MA, and Roso NA. Susceptibility of GPS
dependent complex systems to spoofing. Journal of Aerospace Technology
and Management. 2018;10:10.
[9] Singh K, and Verma AK. Flying adhoc networks concept and challenges. In
M. Khosrow-Pour, D.B.A. (Ed.), Encyclopedia of Information Science and
Technology, Fourth Edition (pp. 6106–6113), 2018. Hershey, PA: IGI Global.
doi:10.4018/978-1-5225-2255-3.ch530.
[10] Kaleem Z, and Rehmani MH. Amateur Drone Monitoring: State-of-the-Art
Architectures, Key Enabling Technologies, and Future Research Directions.
IEEE Wireless Communications, 25, 2018, p. 150–159.
[11] Singh K, and Verma AK. Threat Modeling for Multi-UAV Adhoc Networks.
In: IEEE Region 10 Conf., TENCON 2017-2017. IEEE; Penang, Malaysia,
2017, p. 1544–1549.
[12] Sharma V, Kumar R, Srinivasan K, et al. Coagulation Attacks over
Networked UAVs: Concept, Challenges, and Research Aspects. International
Journal of Engineering & Technology, 7, 3.13, 2018, p. 183–187.
[13] McClean J, Stull C, Farrar C, et al. A preliminary cyber-physical security
assessment of the robot operating system (ROS). In: Unmanned Systems
Technology XV. Vol. 8741. International Society for Optics and Photonics;
2013, p. 874110.
This chapter explores how the type of environment impacts the CV techniques, algorithms, and specific hardware to be used. Indoor environments, also known as controlled environments, generally rely on solutions based on beacons, proximity sensors, and image processing for data acquisition. In this case, as the environment is controlled, the illuminance of the scene can be adjusted and the sensors positioned in advance, which facilitates the development and execution of these systems. Outdoor environments, generally characterised by uncontrolled environmental variables, frequently require solutions based on image-processing techniques for data acquisition. In such environments, the non-constant variation of the scene illuminance and the great variation of the image background are important complicating factors for the operation of image-processing algorithms. In addition, constructions and buildings block the signals of sensors and global positioning systems, making it even harder to treat the exceptions caused by these factors. Each exception treated in a CV system has a computational cost that can be high; when this is considered in applications using embedded hardware, some projects simply become infeasible. Researchers therefore put great effort into optimising the software for high performance and better use of the hardware resources, so that less processing power is demanded, which also positively impacts energy savings. This chapter presents a review of the main CV techniques currently used in the development of mission-control software for indoor and outdoor environments, providing autonomous navigation and interaction for these aerial robots.
1 Department of Electrical Engineering, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
which involves intense programming so that it fits the activities and requirements of the system. The scope of CV encompasses several areas of knowledge, as can be seen in Figure 11.1 [2]. CV differs from image processing in that it also handles signals obtained from sensors and other sources to analyse, understand, and provide the interaction of the system with the environment in which it is inserted.
From the engineering point of view, CV is a valuable tool for building autonomous systems that can accomplish some of the tasks the human visual system performs and, in many cases, overcome human capabilities [1]. Currently, vision systems are widely integrated with robotic systems. Commonly, visual perception and manipulation are combined in two steps, 'look' and then 'move' [3]. The accuracy of the results depends directly on the accuracy of the visual sensor. An alternative for increasing the overall system accuracy is to use a visual feedback control loop, decreasing the resulting error.
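As a toy illustration of such a feedback loop (the gain and pixel coordinates below are arbitrary), the pixel error between the desired and the detected feature position can be mapped to a velocity correction:

def visual_feedback_velocity(target_px, detected_px, gain=0.002):
    """Proportional 'look-then-move' correction: pixel error -> velocity command."""
    err_x = target_px[0] - detected_px[0]
    err_y = target_px[1] - detected_px[1]
    return gain * err_x, gain * err_y      # the gain (m/s per pixel) is illustrative

# feature detected 50 px right of and 20 px below the image centre (320, 240)
print(visual_feedback_velocity((320, 240), (370, 260)))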
In this context, there is increasing interest in using CV techniques in unmanned aerial vehicles (UAVs), also known as drones. Such UAV systems can be better modelled if they are treated as cyber–physical systems (CPSs) [4]. In their case, these techniques are applied to position the UAV in autonomous flight mode or to perform aerial surveillance and detection of regions of interest (ROIs) or points of interest (POIs) [5]. This trend is driven by several facts: the miniaturisation of electronic components, including sensors (driven by other technologies such as smartphones) [6]; the increase in computational power of the on-board CPUs; and the cost reduction of other components of this type of robotic platform [7]. With today's technology, the variety and complexity of the mission tasks demanded of modern UAVs are growing, and they require higher levels of autonomy. The central part of a standalone UAV is the navigation control system and its supporting subsystems, integrated with the ground control stations [8].
[Figure 11.1: computer vision and its overlaps with signal processing, image processing, pattern recognition, artificial intelligence, mathematics, and physics]
The ground control stations are responsible for the remote monitoring of the operations executed by the
UAVs. The autonomous navigation system uses information from several sub-
systems to perform three main tasks: estimating the location and orientation of the
UAV (location), identifying obstacles in the environment (obstacle detection), and
then making decisions (decision-making). These decisions are critical to main-
taining the control loop and to provide navigation in an unknown environment [9].
CV is an indispensable tool to achieve these goals.
The CV systems installed in and integrated with the architecture of the UAVs
generally have similar architecture and take three steps in its operation, which are
data acquisition in numerical form, data processing and data analysis. Cameras,
proximity sensors, and ultrasonic sensors typically perform the data acquisition
step. After data acquisition, the embedded computer performs data processing by
performing algorithms with measurement techniques (variables, index, and coef-
ficients), detection (patterns, objects or ROI/POI) or monitoring (people, vehicles
or animals).
The resulting processed data are analysed and then converted into decision
commands that serve as input to the autonomous robotic system [10]. Autonomy is
characterised as the ability to move and act in the environment in which it is
inserted, to perceive the environment through sensors, to adapt or to change the
environment, to learn from past experiences, to construct representations of the
environment, and to develop a process of interacting with the environment.
However, the performance of a CV system has limiting and determinant factors.
For example, in a visual computing system that uses image processing, factors such as occlusion, motion blur, rapid pose variation, cluttered background environments, and onboard mechanical vibration, among others, can affect image acquisition [11].
In addition, the type of environment is a determining factor for the techniques,
algorithms and specific hardware to be used. Internal environments can be con-
trolled and most often rely on solutions based on beacons, proximity sensors and
image processing for data acquisition. In this case, as the environment is controlled,
the scene illumination can be adjusted, and the sensors can be pre-positioned,
which facilitates the development and execution of these systems. On the other
hand, these environments usually have more obstacles, and the space of navigation
is much more restricted, which can mean that much of the processing is spent on obstacle checking. However, these conditions can be further hampered when the internal
environments are unknown by the stand-alone system, as in the case of rescue
operations [12].
Outdoor environments, often characterised by variable and uncontrolled environmental factors, frequently require solutions based on image-processing techniques for data acquisition. Moreover, in outdoor operations, most navigation systems are based on the Global Positioning System (GPS) [13]. In this setting, the constant variation of the scene luminance and the large variation of the background in the acquired images are important complicating factors for the operation of the image-processing algorithms. Environmental noise is also
something that prevents the proper functioning of the sensors that use this form for
data acquisition.
Moreover, constructions and buildings block the signal from sensors and GPS receivers, making it even harder to handle the exceptions caused by these factors. Each exception treated in a CV system has a computational cost that can be high, depending on the timing requirements to which the system is constrained. In applications using embedded hardware, some projects become unfeasible. For this reason, research in this area focuses on optimising the software for high performance and better use of hardware resources, so that less processing power is required, which also positively impacts the energy consumption of these systems, usually powered by batteries.
To understand the current state of research on visual computing in UAVs, the following sections present a study of the computational vision techniques used in drones and their relation to the development of mission-control software for indoor and outdoor environments. The text highlights the importance of this research area in supporting autonomous navigation and the interaction of these airborne robots with the environment in which they are inserted.
The system proposed in [15] combines monocular-camera and laser SLAM with an inertial navigation system. The monocular visual SLAM subsystem had fully correlated characteristics and modelled the vehicle states. The laser SLAM subsystem is based on Monte Carlo scan matching against a map and leverages the visual data to reduce ambiguities in the vehicle pose estimation. The system was validated in a 6-degree-of-freedom simulation and in a real flight test. According to the authors, one of the main features of the work is
that the system is validated with a controller in the navigation loop. Figure 11.2
shows the trajectory traversed by the UAV in the simulation of an exploration
mission using the proposed techniques.
Although the laser scanner can provide accurate depth information, it is very
expensive and cumbersome. With the use of an RGB-D camera, which can provide
RGB and depth images, a simultaneous location and mapping method based on this
type of camera is proposed in [16]. An RGB-D SLAM algorithm was used to locate
the camera and construct the 3D map of the used test environment. The developed
system can recover the pose and the trajectory of the camera. In addition, graph optimisation and loop closure are adopted to eliminate cumulative errors. However,
in this solution, when the camera moves too fast, loss of frames occurs, and few
points are detected. According to the author, the performance of the algorithm and
its accuracy need to be improved.
In vision-based navigation systems, the UAV path must be known a priori.
Data from the surrounding environment of the UAV’s flight path is taken and
analysed to identify the basic features of that path before the flight mission begins.
Figure 11.2 Result of the SLAM map and trajectory of the UAV while exploring
a simulated indoor environment [15]
Then the real-time data taken from the on-board sensors during the flight mission is
compared to the vehicle’s visual memory to identify the known characteristics and
estimate the UAV’s motion. In this context, the work proposed by the authors in
[17] describes an indoor, inexpensive and simple navigation system using three
laser beams fixed to the UAV and pointed to the ground.
A proposed algorithm uses the camera to capture the laser points on the ground
and to determine their coordinates. The positions of the laser points are used to
obtain complete information on the position and orientation of the UAV. The
proposed navigation system can be classified as a vision-based navigation system.
However, it does not depend strongly on the quality of the video taken by the camera and does not require an image processing algorithm with high computational cost. An illustrative simulation study is conducted to demonstrate
the validity of the proposed navigation system. According to the author’s results,
the proposed system is more efficient over a range of angles between the laser
beams because it is possible to obtain the laser beam length without the need
for extra sensors or estimation algorithms. This work manages to provide a location
for navigation through a simple and easy method, but it is not possible to predict
collisions with obstacles along the way.
Collision with obstacles is another well-known problem in indoor environ-
ments, so an indoor location assistance system is proposed by [18]. This work
demonstrates the development of an indoor navigation system, specifically for
industrial applications that require customised detection technologies to aid navi-
gation. A custom sensing array, with ultrasonic transceivers, was developed to
locate the drone’s position in a known closed environment and to provide feedback
to the navigation system. Six people were recruited to pilot the drone with and
without the navigation system in an enclosed room for a predefined target at a
known location. Two approaches were used, the first when the UAV was at the line
of sight of the pilot, and the second with no line of sight. The duration of the flight,
the number of collisions and the distance from the target were recorded and used to
measure the performance of the proposed application. Using the proposed naviga-
tion system, it was possible to reduce the flight duration by an average of 19.7%
during an obstructed line of sight. Moreover, UAV preventative collision detection
and navigation has been improved. The navigation system provides a detection
radius of 7.65 m and a position accuracy of 1 cm.
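The reference does not detail how the ultrasonic ranges are converted into a drone position, so the following Python sketch is only a hypothetical illustration of one common approach, least-squares multilateration, assuming the transceiver (anchor) coordinates in the room are known; the anchor layout and room dimensions in the example are invented for illustration.

import numpy as np

def multilaterate(anchors, ranges):
    """Estimate a 2D position from distances to known anchor points.

    anchors: (N, 2) array of transceiver coordinates in metres.
    ranges:  (N,) array of measured distances in metres.
    Linearises the range equations against the last anchor and solves
    the resulting system in the least-squares sense.
    """
    anchors = np.asarray(anchors, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    x_n, r_n = anchors[-1], ranges[-1]
    A = 2.0 * (anchors[:-1] - x_n)
    b = (r_n ** 2 - ranges[:-1] ** 2
         + np.sum(anchors[:-1] ** 2, axis=1) - np.sum(x_n ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Example: four wall-mounted transceivers in a hypothetical 10 m x 8 m room.
anchors = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
true_pos = np.array([4.0, 3.0])
ranges = [np.linalg.norm(true_pos - np.array(a)) for a in anchors]
print(multilaterate(anchors, ranges))   # ~ [4. 3.]

In a real deployment the measured ranges are noisy, so the least-squares residual can also serve as a rough quality indicator for the feedback given to the navigation system.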
A localisation technique known as Air-SSLAM was presented in [19]. This
technique uses a stereo camera setup, although two or more cameras can be used.
Figure 11.3 illustrates the operation of the proposed system.
Stereo images are captured, and then the image features are extracted. A
mapping between the features of the images is performed through a descriptor, and
an initial map is generated with the depth estimate of each pair of features. This
process is then repeated using the initial map as a reference. The maintenance and
updating of the long-term map are continuously carried out by analysing the quality
of each correspondence, as well as inserting new features in unexplored areas of the
environment. According to the author, the work presents three main contributions
that are the development of a new method to combine the features efficiently, the
[Figure 11.3: block diagram of the Air-SSLAM method: acquisition of stereo image pairs (Tara camera on a Phantom 3 Pro), tracking with GFTT/rBRIEF features, drone pose estimation, map creation, and map quality analysis with addition/removal of points]
Figure 11.4 Images of the navigation system running inside a corridor [20]
During the tests with the proposed solution, it was identified that these algo-
rithms require a significantly lower computational power, allowing the UAV to
react quickly and navigate through several internal environments.
Until then, techniques have been described that are generally used in uncon-
trolled or unknown environments. The studies reported in [21] demonstrate the
development of a UAV using visual computing techniques to provide vehicle
navigation in a controlled environment. The indoor environment used in the tests
had coloured tracks fixed to the ground. The visual computing system detected
these tracks and decoded them into instructions used to feed the vehicle's navigation system. Important data such as attitude, speed, and acceleration of the
UAV along with the real-time video, are sent as feedback to the ground station
through communication links to give commands and also for monitoring purposes.
The algorithms of processing and control by CV were evaluated and obtained good
performance.
Using a scenario controlled with markers, the system from [22] is a completely autonomous solution developed for the IMAV 2013 competition, which earned the authors the first prize. The proposed
solution is a system composed of multiple UAVs without centralised coordination,
whose UAV agents share their positional estimates. The ability to navigate and
detect collisions is a consequence of the behaviour of each member participating in
the group of UAVs. Figure 11.5 demonstrates the system execution scenario.
All processing takes place at a base station outside the vehicles. For each
vehicle, there is running in the ground station an instance of the proposed archi-
tecture that communicates through Wi-Fi with the UAV. Visual markers are used to
detect and map obstacles and to improve pose estimation. In the executed tests,
Figure 11.5 The architecture and the scenario application presented in [20]
each UAV presented the ability to navigate avoiding obstacles and even collision
with other members of the system.
In the case of indoor environments, what can be observed from the use of
visual computing techniques in UAVs is that most of these works are designed to
develop systems with the ability to navigate and/or collision detection/avoidance in
objects or obstacles. Efforts are concentrated on the development of aerial vehicles that explore and interact with the environment, performing missions and optimising their flight time. Another relevant point is that research focuses on developing solutions for indoor environments that are generally uncontrolled and unknown, which can increase the level of autonomy but also the difficulty of implementing and integrating the architecture of these systems. Another
important direction of studies that must be highlighted is on platforms that consider
the interaction of several vehicles with the environment (and among themselves),
being collaborative or independent.
[Figure: outdoor application scenario with a satellite, a smartphone, a UAV carrying a camera for fire detection, and a ground station]
self-stabilisation. The proposed software uses the forward-facing camera images and some of the IMU data to calculate the references for the control loops. The reported results show that the system was able to perform visual detection of targets of variable size and to follow a person moving at approximately 2.5 m/s for approximately 45 seconds.
A new method for detecting, locating and recognising trees with a UAV
equipped with a monocular camera is presented in [32]. Trees are detected on a
frame basis using state-of-the-art convolutional neural networks, inspired by the recent rapid advances shown in the literature. The trees are detected, and their GPS position is recorded. Localised trees are segmented, characterised by feature descriptors, and stored in a database that contains their GPS coordinates. The trees detected on later flights are compared with the data in the database to avoid duplication of information. The proposed method is also able to identify whether trees are absent from the expected GPS-marked locations, allowing the authorities concerned to be alerted immediately about possible illegal deforestation.
In addition to counting trees, other vision applications in UAVs such as
counting other types of commercial plants help to generate production statistics in
precision agriculture. In this context, the work presented in [33] demonstrates an
economical and attractive technology for the automatic detection of palm trees to
be used in UAVs. From the in-flight image, it first extracts a set of key-points using a feature-detection algorithm. These key-points are then analysed with a classifier trained on a labelled data set, and the palm trees are identified. To capture the shape of each tree, the key-points are merged with an active contour method. Finally, the texture of the obtained regions is analysed with local binary patterns to distinguish palms from other vegetation. The experimental tests were carried out on two different farms and confirm the promising capabilities of the proposed framework for palm detection.
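Reference [33] does not specify here which detector, classifier or parameters were used, so the following Python/OpenCV and scikit-image sketch is only a schematic illustration of the kind of pipeline described above: detect candidate key-points (ORB is used purely as a stand-in for the unnamed feature detector) and describe a candidate region's texture with a local-binary-pattern histogram that a trained classifier could consume.

import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def candidate_keypoints(image_bgr, max_kp=500):
    """Detect key-points that a trained classifier would later filter."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=max_kp)   # stand-in feature detector
    return orb.detect(gray, None)

def lbp_histogram(patch_gray, points=8, radius=1):
    """Texture signature used to separate palms from other vegetation."""
    lbp = local_binary_pattern(patch_gray, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2),
                           density=True)
    return hist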
The work presented in [34] demonstrates a rigorous study on the use of
embedded hardware coupled to a UAV to perform real-time image processing. A
POI detection algorithm was proposed to perform the tests on the hardware. The article shows that it is feasible to use a low-cost processing board in a mini UAV for image processing. A practical application is demonstrated: preliminary results of power-line detection intended to guide the UAV autonomously along the lines for cable inspection, for example. Figure 11.7 shows an image illustrating the detection of three electrical power lines. Notice that the system successfully performs the detection, even with a noisy background environment.
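The article's exact detector is not reproduced here; as a rough, hedged illustration of how thin, nearly straight power lines can be extracted from a cluttered frame, a classical Canny-plus-probabilistic-Hough sketch in Python/OpenCV might look as follows (all thresholds are arbitrary placeholders that would need tuning).

import cv2
import numpy as np

def detect_line_segments(frame_bgr):
    """Return candidate straight segments that may belong to power lines."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)          # suppress background noise
    edges = cv2.Canny(gray, 50, 150)
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                               threshold=80, minLineLength=120, maxLineGap=10)
    return [] if segments is None else segments[:, 0]  # rows of (x1, y1, x2, y2)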
Another emerging technology is the monitoring and tracking of ice for use in
marine ice management applications. This technique was reported in [35], and it is
aimed to be used to detect and track the movement of icebergs and other ice sheets
in an arctic environment autonomously. An occupancy grid map algorithm and
locations of interest generator are used, coupled with a mission control system.
According to the developers of the project, one of the contributions of their work is
the interface of the algorithm with a vision-based object detection module to
generate an occupancy grid map of a predefined search area in real time using on-board processing. A generator of places of interest has also been developed,
generating locations where the UAV should investigate based on the occupation
grid map. The results were obtained from test flights, and it was possible to verify that the system successfully created an occupancy grid map based on the automatic on-board, real-time segmentation of thermal images into ice regions and ice-free regions. However, in the tests carried out, some communication delays were observed, and the mission controller was implemented in a ground base station.
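As a purely schematic sketch of the occupancy-grid idea (the actual on-board segmentation and georeferencing in [35] are more elaborate), a grid can be filled by thresholding each thermal frame and marking the ground cells hit by ice pixels; the threshold and the pixel-to-cell mapping below are assumptions.

import numpy as np

def update_occupancy_grid(grid, thermal_frame, ice_threshold, cell_of_pixel):
    """Mark grid cells as ice (1) from one thermal frame.

    grid:           2D float array holding per-cell ice evidence.
    thermal_frame:  2D array of radiometric values.
    ice_threshold:  value below which a pixel is assumed to be ice (assumption).
    cell_of_pixel:  function (row, col) -> (grid_i, grid_j), i.e. the
                    georeferencing step that maps image pixels to ground cells.
    """
    ice_mask = thermal_frame < ice_threshold
    rows, cols = np.nonzero(ice_mask)
    for r, c in zip(rows, cols):
        i, j = cell_of_pixel(r, c)
        grid[i, j] = 1.0
    return grid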
As can be observed, the visual computing applications used in outdoor envir-
onments have a great diversity due to the number of possible applications.
Advances in technology increasingly allow UAVs to be used in a variety of tasks
and they are thus changing the way some processes are conducted, such as in the
case of electrical transmission line inspections or even in the inspection of photo-
voltaic panels. However, this huge number of applications requires systems with
increasingly complex and computationally costly architecture. This is one of the
most significant problems in CV outdoor applications: several algorithms have to run in an application loop, competing for processing time on a low-cost processor. To distribute tasks and processing, studies are now turning to applications shared among multiple execution agents.
computational cost because they have to deal with the disturbances and disparities
in the data collected by the sensors due to the great variation of the scene and the
environmental conditions. However, they can meet a more significant number of
requirements for a particular mission, both indoors and outdoors.
11.3 Conclusion
This chapter examines the impact and necessities of indoor and outdoor environ-
ments on CV procedures, algorithms and hardware.
Indoor or controlled environments generally depend on solutions relying on beacons, proximity sensors and intensive CV algorithms for data acquisition and
decision-making. In this case, as the environment is controlled, the illuminance of
the scene is adjusted and sensors are previously positioned, which simplifies the
development and execution of these systems.
Outdoor environments, generally known for uncontrolled environmental variables, frequently require solutions relying on CV techniques and Geographic Information Systems (GISs) for data acquisition. In this setting, the continual variation of the scene illuminance and the great background dissimilarity between images are important complicating factors for the operation of the CV algorithms. In
addition, construction sites and buildings block the signal of sensors and GPSs making
it even tougher to handle the exceptions produced by these factors. In a CV system, a
high computational cost can result from treating each exception. If an application uses
embedded hardware, such as an FPGA, then some projects turn out to be infeasible.
Investigators put great effort into optimising the software for higher performance and better usage of the hardware resources, in order to decrease the necessary computational power and to boost energy savings. These advances build on a comprehensive review of the key CV techniques now used in mission control software for indoor and outdoor cases to provide autonomy, superior navigation and interaction for these aerial robots.
Scenarios involving disaster response and people in distress benefit from a high level of indoor–outdoor integration of geospatial information with a network data model. Nevertheless, indoor geospatial information acquisition is time-consuming.
Numerous research works employed the Architecture, Engineering, and Construction
(AEC) paradigm to create indoor network strategies. These schemes hinge on the
types of input data, and the interior characteristics of buildings and their associated
objects are typically incomplete. Consequently, combining Building Information
Modelling (BIM) with GIS can improve indoor–outdoor combined applications. An
open standard for BIM entitled Industry Foundation Classes (IFC) is maintained by buildingSMART to accomplish data interoperability. Reference [42] proposed a Multi-Purpose Geometric Network Model (MGNM) based on BIM that uses both indoor and outdoor network connections. To attain the objectives, the IFC-to-MGNM conversion comprises the following stages: (i) obtaining building data from IFC, (ii) isolating MGNM information from the aforementioned building data, and (iii) constructing and accumulating the topological associations of the
References
The aeronautical structures used in unmanned aerial vehicles (UAVs) have become
more flexible and lightweight to meet the need to operate at high altitudes and for
long periods in the air, such as those required in the NASA Helios project. These
flexible aeroplanes increase the similarity between these 'iron birds' and natural birds. The
increased structural flexibility of these aircraft reignited concerns related to aero-
elastic instabilities, such as flutter. Improving the techniques and methods used in
aircraft certification flights is an important concern of the aeronautical community
because current standards and procedures do not provide recommendations and
guidelines for aircraft with a high degree of flexibility. The techniques traditionally
used in commercial aeroplanes cannot be used in UAVs, because the new UAVs
have a high degree of non-linearity in their flight dynamics. Contemporary research
studies indicate an increasing awareness of the importance of vision in the mon-
itoring of UAV structural health. This chapter addresses algorithm-based solutions
to real-time infrastructure damage recognition and damage assessment for UAV
systems. Damage detection and identification of effects are quantified for reme-
diation. A vision-based UAV system may obtain visual evidence from a damaged
surface of interest to detect faults, eliminate the irrelevant areas from image data,
spot damage, measure the resulting consequences, record information, identify the
type of fault and point to the most noticeable kind of problem present. This chapter
also discusses new approaches to sense and acquire vibration data and to pre-
process these data on aeroelastic certification test flights. These new approaches
aim to reduce the time to identify the aeroelastic phenomenon and to reduce the
size of the hardware that must be carried on board the aircraft, thus minimising the risks
and costs of the vibration tests. The advance of computer vision systems enables
the use of cameras as a motion tracker sensor with millimetre precision and accu-
racy. Non-contact sensors are suited to flutter analysis because they do not interfere
with the dynamics of the aircraft. With the aid of Computer Vision algorithms, they
allow the acquisition of 2D and/or 3D data, instead of one-dimensional data
acquired by traditional vibration sensors, such as piezoelectric accelerometers.
1
DCTA-ITA-IEEA, Sao Jose dos Campos, SP, Brazil
12.1 Introduction
[Figure: (a) interaction of flight mechanics, structural dynamics and dynamic aeroelasticity (inertial forces, flutter and more); (b) flutter-monitoring chain: (1) sensors (analogue data), (2) acquisition, (3) Fourier transform, (4) extraction of the dynamic characteristics of the system, (5) flutter occurrence speed]
where
L and M are the synthetic data of lift and moment, respectively;
ḣ and θ̇ are the bending and torsion rates;
L_h, L_ḣ, L_θ and L_θ̇ are the aerodynamic lift derivatives;
M_h, M_ḣ, M_θ and M_θ̇ are the aerodynamic moment derivatives.
Figure 12.5 Simplified model of a typical wing section with two degrees
of freedom [36]
Figure 12.6 Model excitation and responses for 55% and 95% of flutter:
(a) excitation signal; (b) response to 55% of flutter speed;
(c) response to 95% of flutter speed
12.4.1.2 Pre-processing
The test data analysis helps to obtain the aircraft flutter characteristics [38–40].
Since the measured signal, in general, tends to have low quality, pre-processing is crucial to improve the accuracy of the subsequent flutter data analysis. Several types of filtering can be introduced to improve the processing of the flutter data. For instance, the effectiveness of a morphological filter depends on the operation types performed and the structuring elements used. After this pre-processing, the usual Flutter Boundary Prediction (FBP) technique helps to validate and confirm the feasibility of the pre-processing stage through numerical simulations and tests on flight flutter data. The results can be examined with the Signal-to-Noise Ratio (SNR) of the structural response to assess the improvement of the prediction accuracy in actual applications [41,42]. Among pre-processing algorithms, there are basic ones, like
least-squares variants, maximum likelihood variations, spectral analysis, Kalman
filtering, and more contemporary methodologies such as subspace methods [43], and
computational intelligence algorithms [44–47].
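As an illustrative sketch only, and not the specific filters of the works cited above, a Savitzky–Golay smoother is one simple pre-processing choice for a sampled structural response, and an SNR figure can be used to judge whether the filtering actually helps the subsequent flutter analysis; window length and polynomial order below are assumed values.

import numpy as np
from scipy.signal import savgol_filter

def preprocess_response(signal, window=31, polyorder=3):
    """Smooth a measured structural response before flutter analysis."""
    return savgol_filter(signal, window_length=window, polyorder=polyorder)

def snr_db(clean_reference, measured):
    """Rough SNR estimate used to judge the benefit of pre-processing."""
    noise = measured - clean_reference
    return 10.0 * np.log10(np.sum(clean_reference ** 2) / np.sum(noise ** 2))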
12.4.1.3 Extraction of dynamic characteristics
The literature contains several methods to extract dynamic characteristics from the measured signal, including the peak-amplitude method, the inverse or line-fit method, and residual-based methods [48]. In this work, the circle-fit method was investigated because its calculation methodology facilitates embedded implementation.
From the frequency domain transformation, two dynamic characteristics of this
system are estimated using the peak extraction method: natural frequency and
damping of each vibration mode. This method has two phases: the first locates the peaks and estimates the natural frequency of each vibration mode; the second estimates the damping of each vibration mode. This methodology is performed in
step 4 of the proposed solution and was also implemented in MATLAB for theo-
retical validation, based on the references [37,49–51].
The implementation of the algorithm based on the peak extraction method
followed the steps below:
Steps 1–2: Synthetic torsion and flexural data in the time domain of experiment 1;
Step 3: Transform the synthetic torsion data and bending of the time domain to
the frequency domain using the Fourier transform;
Step 4: Extract the dynamic characteristics of the system. Calculate the frequency-domain response H(ω). In practice, the frequency response function (FRF) can be calculated as:
H(ω) = X(ω) / F(ω),   (12.5)
where X(ω) is the Fourier transform of the output (response), and F(ω) is the Fourier transform of the input (excitation). From digital signal processing considerations, we can write that:
S_XX(ω) = |H(ω)|² S_ff(ω),   (12.6)
where S_XX(ω) is the auto-spectrum of the output, and S_ff(ω) is the auto-spectrum of the input. Next, plot the Nyquist diagram of H(ω) and locate the point of the arc farthest from the origin, which corresponds to the resonance frequency ω_R. Determine the adjacent half-power frequencies ω_a and ω_b, and calculate the angles between these adjacent frequencies and the resonance point (Figure 12.7).
For each mode of vibration, estimate the corresponding damping as follows:
ζ = (ω_b² − ω_a²) / {2 ω_R [ω_a tan(α_a/2) + ω_b tan(α_b/2)]}   (12.7)
Step 5: Monitor the damping ζ of each mode of vibration, estimate the flutter speed and monitor the flutter onset.
Figure 12.8 illustrates the flowchart for implementing these steps. The results
of implementing this algorithm are presented in Section 12.4.2.1.
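The sketch below, in Python, follows steps 3–5 in spirit but replaces the Nyquist circle-fit of (12.7) with the simpler half-power bandwidth approximation, so it should be read as an illustration of the flow (FFT, FRF, peak picking, damping estimate), not as the chapter's exact algorithm; the sampling rate and the excitation/response signals are assumed to be available.

import numpy as np

def frf(response, excitation, fs):
    """Steps 3-4: frequency-response function H(w) = X(w) / F(w)."""
    freqs = np.fft.rfftfreq(len(response), d=1.0 / fs)
    F = np.fft.rfft(excitation)
    H = np.fft.rfft(response) / (F + 1e-12)   # small term avoids division by zero
    return freqs, H

def natural_freq_and_damping(freqs, H):
    """Pick a single mode and estimate damping by the half-power bandwidth.

    The two frequencies where |H| drops to peak/sqrt(2) play the role of
    w_a and w_b in (12.7); this is an approximation, not the circle fit.
    """
    mag = np.abs(H)
    k = int(np.argmax(mag))
    wn = freqs[k]
    half = mag[k] / np.sqrt(2.0)
    lo = k
    while lo > 0 and mag[lo] > half:
        lo -= 1
    hi = k
    while hi < len(mag) - 1 and mag[hi] > half:
        hi += 1
    wa, wb = freqs[lo], freqs[hi]
    zeta = (wb ** 2 - wa ** 2) / (4.0 * wn ** 2)   # half-power approximation
    return wn, zeta

Monitoring how the estimated damping of each mode decreases with flow velocity then supports the flutter-speed extrapolation of step 5.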
[Figure 12.7: Nyquist diagram (Im(α) versus Re(α)) of the FRF around a resonance, indicating the adjacent frequencies ω_a and ω_b and their half-angles]
[Flowchart of Figure 12.8: calculation of H(ω) (frequency response function); extraction of peaks (natural frequency of each mode of vibration); location of the nearest half-energy frequencies; damping monitoring; flutter speed estimation]
Figure 12.8 Flowchart of implementation of the steps for the calculation and
monitoring of flutter via the Nyquist diagram method
[Figure: estimated damping versus flow velocity (m/s)]
[Figure: Nyquist plot of the synthetic data with the estimated circle, showing the origin, the circle centre, the resonance frequency and the adjacent frequencies a and b]
The aeroelastic wing was mounted in a low-speed wind tunnel operating at low
subsonic speed, with a maximum velocity of 33 m/s, whose measurements were
performed with two degrees of freedom, bending, and twisting. For this, we used:
● Three ICP accelerometers M352C68 at the root of the aeroelastic wing,
according to the figure;
● Two differential pressure meters connected to a Pitot tube to measure the air-
flow velocity. The meters used were a testo512 model manometer and the
MPXV7002DP differential pressure sensor;
● A PCB 086C03 impact hammer for excitation (force application) on the
aeroelastic wing;
● Lynx ADS1800-SV acquisition system.
For this test, the flow velocity was manually increased until the flutter speed was reached. For each speed, an excitation was applied to the wing, and during this process the flow velocity and the measured accelerations were monitored. Another parameter that
influences the flutter speed of this wing is the position of the ballast relative to the
elastic axis. This test evaluated two conditions; in the first configuration, the centre
of the ballast was positioned 5 mm in front of the elastic axis of the wing, then
the distance from the centre of the ballast was modified to 10 mm in front of the
elastic axis of the wing. Figure 12.12 illustrates the ballast positions for this test. As expected, a change in the flutter speed was observed when the ballast position was changed: when the centre of the ballast is positioned 10 mm in front of the elastic axis of the wing, the speed required to reach flutter increases. Figure 12.12 also shows the positions for wing excitation.
Figure 12.12 Ballast positioning and excitation positions in the aeroelastic wing
[Figure: measured accelerometer signals (V) versus time (s), panels (a) and (b)]
12.5 Discussion
The experiments carried out proved the technical feasibility of a real-time monitoring
system for structural health. The developed software was able to estimate the damping
for each flight condition. However, the implementation of this concept in real systems still requires further technological development, since the cost of instrumentation and the real-time processing of these measurements are still impediments.
[Figure: measured accelerometer signals (V) versus time (s), panels (a) and (b)]
[Figure: estimated damping versus flow velocity (m/s), with a third-order polynomial approximation of the estimated damping]
[Figure: video camera viewing tracked target points on the wing for bending and torsion measurement, with the corresponding calibration images in pixel coordinates]
angle with the abscissa axis, or the normal inclination of the wing, characterises a rotation. Knowing the distances between the targets and the camera, the relation between pixels and real distance is established. The targets were made in the colours red, green, and blue (RGB) because an image recognised by the software is formed by the sum of the contributions of these primary colours (RGB image), thus avoiding secondary colours. Once each target is identified, the contributions of the blue and red colour matrices can be subtracted to obtain only the green colour and, correspondingly, to identify each target, as shown
in Figure 12.19. This software will track the markings throughout the video as
shown in Figure 12.20.
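A minimal Python/NumPy/OpenCV sketch of the colour-separation idea described above (the dominance threshold is an assumption and would have to be tuned to the test lighting) isolates a green target and returns its pixel centroid for tracking:

import cv2
import numpy as np

def green_target_mask(frame_bgr, margin=60):
    """Isolate a green target by subtracting the red and blue contributions.

    'margin' is an arbitrary dominance threshold (an assumption here).
    Returns the binary mask and, if found, the target centroid in pixels.
    """
    frame = frame_bgr.astype(np.int16)
    b, g, r = frame[:, :, 0], frame[:, :, 1], frame[:, :, 2]
    dominance = g - np.maximum(b, r)            # green minus the other channels
    mask = (dominance > margin).astype(np.uint8) * 255
    m = cv2.moments(mask)
    centroid = None
    if m["m00"] > 0:
        centroid = (m["m10"] / m["m00"], m["m01"] / m["m00"])
    return mask, centroid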
[Figure 12.20: tracked target positions in the image, Pixel Y versus Pixel X]
These measurements are then converted into bending (Figure 12.21) and torsional (Figure 12.22) displacements. After the conversion of the pixel measurements to displacements, they are processed by the algorithms developed in the previous phases of this work.
To verify whether the proposed system was able to capture the vibration frequency of the bending movement, the FFT (Figure 12.23) of the signal was taken with the DC average removed, to check whether the peak was at the expected position. The oscillation period was also checked in the time signal, confirming that the vibration mode was at the correct frequency.
This chapter discusses new approaches to sense and acquire vibration data and to
pre-process these data on aeroelastic certification test flights. These new approa-
ches aim to reduce the time to identify the aeroelastic phenomenon and to reduce
the size of the hardware that must be carried on board the aircraft, thus minimising the risks
and costs of the vibration tests.
The presented experiments pave the way for the development of a non-contact measurement system for flight vibration tests in the aircraft certification process. These experiments have shown that the techniques used today for in-flight trials will become obsolete in the near future, as aeronautical structures are becoming lighter every day and no longer admit additional instrumentation mass for in-flight trials.
[Figures 12.21–12.23: bending displacement of the tracked target in pixels versus frame number; torsional displacement versus frame number; and the amplitude spectrum |P1(f)| of the bending signal versus frequency f (Hz)]
The proposed computer vision experiment is based on works that have already
proven this type of measurement system can accurately determine natural fre-
quencies and structural damping, thus opening the way for investigation of natural
frequencies and damping resulting from aeroelastic phenomena.
The main advantages of computer vision in this case are that it does not interfere with the structure, is simple to replicate, has a low instrumentation cost, and is easy to use on structures with difficult access (wings). Works have already
been obtaining excellent results with commercial cameras and suggest that the
results can be further improved by using cameras with higher resolutions and
acquisition frequencies [22].
In addition, this work proposes modifications in the acquisition and pre-processing procedures of the signals traditionally used in flutter certification analysis (for example, modal analysis), making them suitable for an embedded system and near-real-time processing.
References
[1] Noll TE, Ishmael SD, Henwood B, et al. Technical findings, lessons learned,
and recommendations resulting from the helios prototype vehicle mishap.
Security. Technical Report RTO-MP-AVT-145. NASA Langley Research
Center Hampton, VA 23681 USA 2007:1–17. Available from: http://stinet.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA478771.
[2] Tsushima N, and Su W. Flutter suppression for highly flexible wings using
passive and active piezoelectric effects. Aerospace Science and Technology.
2017;65:78–89. Available from: http://dx.doi.org/10.1016/j.ast.2017.02.013.
[3] Kayran A. Flight flutter testing and aeroelastic stability of aircraft. Aircraft
Engineering and Aerospace Technology. 2007;79(5):494–506. Available from:
http://www.emeraldinsight.com/doi/abs/10.1108/00022660710732707.
[4] Saeed S, and Salman S. Flutter analysis of hybrid metal-composite low
aspect ratio trapezoidal wings in supersonic flow. Chinese Journal of
Aeronautics. 2017;30(1):196–203. Available from: http://linkinghub.elsevier.
com/retrieve/pii/S1000936116302357.
[5] Sinske J, Jelicic G, Buchbach R, et al. Flight testing using fast online aero-
elastic identification techniques with DLR research aircraft. In Proc. 17th
International Forum on Aeroelasticity and Structural Dynamics (IFASD 2017),
Como, Italy 2017;1:1–19.
[6] Estrela VV, Saotome O, Loschi HJ, et al. Emergency response cyber-
physical framework for landslide avoidance with sustainable electronics.
Technologies, 2018;6:42.
[7] Names B. 5 things you should know about flutter—LinkedIn; 2016.
Available from: https://www.linkedin.com/pulse/5-things-you-should-know-
flutter-ben-names/.
[8] AIRBOYD. Ressonancia Aeroelastica - Efeito Flutter—YouTube. Available
from: https://www.youtube.com/watch?v=3CMlXyV2XnE.
[9] Sez S. Tacoma narrows bridge collapse 1940—Symon Sez; 2010. Available
from: https://symonsez.wordpress.com/tag/tacoma-narrows-bridge-collapse-
1940/.
[10] Ko JM, and Ni YQ. Technology developments in structural health monitoring of
large-scale bridges. Engineering Structures. 2005;27(12 SPEC. ISS.):1715–1725.
[11] Cawley P. Structural health monitoring: Closing the gap between research
and industrial deployment. Structural Health Monitoring: An International
Journal. 2018;17:1225–1244, 147592171775004. Available from: http://
journals.sagepub.com/doi/10.1177/1475921717750047.
[12] de Oliveira M, Araujo N, da Silva R, et al. Use of Savitzky Golay filter for
performances improvement of SHM systems based on neural networks and
distributed PZT sensors. Sensors. 2018;18(1):152. Available from: http://
www.mdpi.com/1424-8220/18/1/152.
[13] Pozo F, and Vidal Y. Damage and fault detection of structures using prin-
cipal component analysis and hypothesis testing; In Advances in Principal
Component Analysis, Springer, Berlin, Germany, 2017, 137–191.
[14] Flynn EB. A Bayesian experimental design approach to structural health
monitoring with application to ultrasonic guided waves. University of
California; 2010. Available from: https://escholarship.org/uc/item/9m15r6ph.
[15] Coelho AM, de Assis JT, and Estrela VV. Error concealment by means
of clustered blockwise PCA, in Proceedings of 2009 IEEE Picture
Coding Symposium, Chicago, IL, USA, 2009. DOI: 10.1109/PCS.2009.
5167442
1
Tawazun Technology and Innovation (TTI), Abu Dhabi, United Arab Emirates
13.1 Introduction
Currently, in our connected digital world, there are many ways to track, watch and
collect data on individuals. Such a thing is termed ‘surveillance’, and one method
of surveying targets is by using cameras, which is a type of visual surveillance [1].
While there are many other types of surveillance, visual surveillance is heavily
used in different areas in the world. From roads to workplaces, crowds to indivi-
duals, cameras that image in the large spectrum of light with varying resolutions
and fields of view are used daily to collect data about multiple targets. Surveillance
systems can be as simple as a single fixed camera or complex like a network of
connected cameras over a huge area [1].
In our current times, using aerial systems fitted with cameras to acquire
imagery is becoming more and more prevalent, and one specific application of
these imaging systems is the wide-area surveillance system (WASS) [2], which is
concerned about covering a wide area, either by imagery or by signals, and
detecting multiple targets in this broad area. The added benefit of wide-area
coverage is that it eliminates the ‘soda straw’ view that most narrow-field sur-
veillance systems have when embedded on aerial platforms. WASS provides the
user with the opportunity to track multiple targets while not losing any data in the
area of interest. However, tracking moving objects in a video sequence is a chal-
lenging task in WASS applications, and it poses severe problems for unmanned aerial vehicles (UAVs) due to their cyber-physical nature [2–4], especially for small UAVs (sUAVs). In addition, WASS typically requires big and heavy imaging and
processing components that make their integration into the smaller aerial vehicle a
challenging task.
In general, sUAVs are small, agile and flexible, facilitating the design of easy,
fast, and cost-efficient solutions for a variety of day-to-day problems. Recently,
sUAV with visual sensors attained proper attention for a broad range of applica-
tions such as visual surveillance, border control, search and rescue missions, wildlife
monitoring, crowd detection and monitoring [5]. Leveraging sUAV for persistent
surveillance is by itself an evolutionary solution for smarter aerial surveillance.
Developments in sUAV technologies caused a rapid growth of their application areas
such as disaster management, tactical military area surveillance and major events
surveillance. Although there are still lots of possible sUAV applications, most of the
existing applications suffer from limited flight time endurance (mostly less than 3 h)
[6,7]. Hence, while there are numerous ongoing studies for improving sUAV flying
time [8,9], they appear to be unsuitable platforms for persistent surveillance which
requires more extended operating hours.
In [10] an approach to tether sUAV to a ground power source was promoted as
a feasible solution. This is due to the recent market availability of commercial
tethered aerial platforms that have received very satisfactory feedback in terms of flight time and operational consistency [11–14]. This approach proposes that the
multi-rotor type sUAV continuously receive power during a flight from a ground
power source at the cost of limiting their freedom of movement. For some appli-
cations such as persistent surveillance, the duration of the operation is more
Small UAV: persistent surveillance made possible 311
important than the freedom of the movement. In this chapter, we will baseline our
hardware and software recommendations based on an exemplary sUAV persistent
surveillance that leverage computer vision techniques to achieve its surveillance
missions.
[Figure: architecture of the tethered sUAV surveillance system: the aerial platform carries an onboard processor with a serial link to the gimbal controller (yaw, pitch and roll motors), the onboard camera system, the power unit and the tether station aerial module; the datalink and wireless communication, together with the tether cable (data and power), connect it to the ground system with HMI, intelligence, terrain and assets]
(no datalink support) to the drone and therefore, the data link component selection
becomes a crucial factor in the feasibility of our design.
In summary, the chosen drone technology should be able to carry the imaging system and to fly with a tether to achieve the height and duration needed for capturing 'quality' images, while being able to downlink them in real time. The drone
presents the advantage of flying in closed spaces and between city buildings
without obstructions. The tether helps in supplying the drone with the power to
keep it operational for a longer period. Nevertheless, since it is tethered, the length
of the tether limits the height performance of the drone, and the payload weight budget has to account for the pull exerted by the tether.
Imaging system: Generically, we refer to the imaging system as the compo-
nent that includes at the same time the gimbal device with all its attached optics and
sensors. The gimbal hosts the cameras and is responsible for stabilising the
acquired images. The gimbal should be small enough to be attached to the selected
drone, but big enough to accommodate the imaging equipment. The gimbal has to
withstand the effects of altitude, humidity, vibration and temperatures’ variation.
To ensure the tracking capability of moving targets, the gimbal design should offer
at least 2-axes of movement. The main advantages of the gimbal reside in providing
the observation and targeting capabilities in addition to offering the stabilisation
and control functions. However, since all the parts attached to the drone compete for the same payload weight budget, all the gimbal's sub-components need to be as small as possible and might have their functionalities downgraded to fit the size and weight budget.
Among the most critical sub-components of the gimbal are the cameras. Various cameras sensitive to different wavelengths can be considered. As such,
electro-optic (EO, visual spectrum), IR and hyperspectral (HS) are all options that
are suitable for sUAV surveillance. Moreover, the multi-cameras configuration
can be considered in order to achieve wider fields of view or to gain multiple
spectrums of imagery. However, the size and weight of these cameras have to be
selected with great attention, aligned with the aerial platform’s overall weight and
performance.
Data link: The data link allows in parallel the control of the aerial platform
navigation, and the downlink of the images of interest to the ground operator.
Depending on where the PEU is deployed (on the ground or on-board), the data link
system is exercised differently. With a PEU on the ground, the data link has to
handle real-time streaming of the images to avoid any side effects to the image
processing pipeline running in the PEU. With a PEU on board, the data link is freed
up from the image streaming. Having a datalink with bigger bandwidth allows the
use of higher quality imagery from a bigger coverage area and thus resulting in
better system performance. However, there is always a limited bandwidth no matter
which datalink is chosen. Therefore, both the frame rate and resolution of the
images transmitted have to be selected carefully.
PEU: The PEU is the component responsible for receiving and processing the
images obtained by the imaging system, and sent by the data link. The size, pla-
cement, and performance of the PEU are solely dependent on the requirements of
the user. As such, multiple placement and design options can be adopted for
the PEU.
endeavour, due to conflicting camera settings that affect the quality of the photos,
and hence might impact the image processing pipeline taking place in the PEU. As
such, we recommend that the imaging system should:
● have an optimised weight to fit within the allocated gimbal weight budget;
● be compatible with the data link and PEU components in order to transfer and
consume the images successfully;
● have a configurable optimised frame rate (FPS) compatible with the PEU’s
expectations (e.g., track people and vehicles);
● have a resolution that achieves a Ground Sampling Distance (GSD) compatible
with the PEU’s expectation at the maximum flying altitude; and
● have a field of view (FOV) big enough to cover the surveyed area.
Of particular interest to the tracking capability of the sUAV surveillance is the GSD
and FOV criteria. A higher GSD will result in pixilation of the object and might
result in losing its track. The FOV dictates the limit of the surveyed geographic
area. Governed by the following equations and based on typical camera specifi-
cations, the sUAV surveillance system designer can easily deduce the expected
performance of the selected imaging system. The equations are describing the
imaging area and quality of a single-camera gimbal system.
Imaging system variables:
– x_sensor: width of the sensor, in mm
– y_sensor: height of the sensor, in mm
– focal_len: focal length of the lens, in mm
– altitude: flying height, in m
– x_gimbal: x-axis gimbal tilt angle, in degrees
– y_gimbal: y-axis gimbal tilt angle, in degrees
– x_res: x-axis sensor resolution, in pixels
– y_res: y-axis sensor resolution, in pixels
Equations:
Field of view (wide) = 2 tan⁻¹[x_sensor / (2 · focal_len)], and   (13.1)
Field of view (tall) = 2 tan⁻¹[y_sensor / (2 · focal_len)].   (13.2)
Equations (13.1) and (13.2) provide the camera’s horizontal and vertical Fields of
View (FOV), which helps in determining the coverage area generated on the
ground.
GSD = (altitude · x_sensor) / (focal_len · x_res),   (13.9)
where GSD denotes how much distance on the ground is 1 pixel in the image.
Under no-tilt conditions, the x-axis GSD and the y-axis GSD are the same.
However, when the camera is tilted, a different GSD is obtained for each tilt angle,
and it is as follows:
GSD_tilt_x = (width of photo footprint) / x_res,   (13.10)
GSD_tilt_y = (height of photo footprint) / y_res.   (13.11)
The sUAV surveillance system designer should be mindful about the GSD values
while designing the system, as a GSD higher than the wanted amount will result in
pixilation of the target which could affect the tracking process.
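Equations (13.1), (13.2) and (13.9) translate directly into a few lines of Python; the sensor and lens numbers in the example are assumed values chosen only to show the order of magnitude of the result.

import math

def fov_deg(sensor_dim_mm, focal_len_mm):
    """Field of view, (13.1)/(13.2): 2*atan(sensor / (2*focal length))."""
    return math.degrees(2.0 * math.atan(sensor_dim_mm / (2.0 * focal_len_mm)))

def gsd_m_per_px(altitude_m, sensor_width_mm, focal_len_mm, x_res_px):
    """Nadir ground sampling distance, (13.9)."""
    return (altitude_m * sensor_width_mm) / (focal_len_mm * x_res_px)

# Assumed example: 6.17 mm wide sensor, 4000 px across, 3.6 mm lens, 80 m altitude.
print(fov_deg(6.17, 3.6))                    # horizontal FOV, roughly 81 degrees
print(gsd_m_per_px(80.0, 6.17, 3.6, 4000))   # roughly 0.034 m per pixel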
Datalink: Datalinks are used to transfer imagery, commands and various data
back and forth between the ground segment and the platform. The recommenda-
tions below are specifically for data links that are used to transfer imagery only:
● Have an optimised weight to fit within the allocated payload weight budget;
● Be compatible with the aerial system in terms of power and operational system;
Small UAV: persistent surveillance made possible 317
● Satisfy the transmission frequency regulations of the country where the solu-
tion is deployed;
● Have the transmission range greater than the maximum flying distance;
● Embed standard encryption mechanism; and
● Have a bandwidth and throughputs aligned with the PEU’s expectation.
PEU: The PEU is a central piece of the sUAV surveillance system. The PEU pro-
vides the computing power required to ingest the images from the sensors and build in
real-time any useful intelligence to the user. Therefore, the main mission of the PEU
is to run the image processing and computer vision algorithms [17–20] efficiently.
Image processing is a branch of computer science concerned with processing digital signals that represent images taken with a digital camera or acquired with a scanner [15,16,21,22].
The specification of the PEU is dictated by various requirements such as the
resolutions of the images, the frame rate, the numbers of sensors, the flying platform,
etc. Therefore, the PEU solution should pay particular attention to the following:
● Scalability: achieved by leveraging distributed architecture, multithreading
and GPU programming.
● Performance: achieved by leveraging, for example, special-purpose parallel
computing libraries such as the ones mentioned below.
* Open Multi-Processing (OpenMP) can be used for distributing the work-
load over multiple CPUs and CPU cores;
* Message Passing Interface (MPI) can be used for inter-workstation and
inter-process communication;
* Compute Unified Device Architecture (CUDA) can be considered for General-Purpose computing on Graphics Processing Units (GPGPU).
[Figure: image-processing pipeline of the PEU: camera calibration, stitching, stabilisation, background subtraction and tracking]
entropy and mutual information [38]. Feature-based matching techniques have also
been employed, such as the Scale-Invariant Feature Transform (SIFT) [39–41],
Harris points [42] and Speeded Up Robust Features (SURF) [43–45]. The designer
selecting the appropriate stitching technique for sUAV surveillance system should
consider surveying the area of interest, accounting for the frequency of image
acquisition, and the required percentage of the overlapping area [46].
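As a hedged sketch of feature-based stitching (ORB features are used here instead of SIFT/SURF simply because they are freely available in OpenCV; the match threshold and canvas size are arbitrary), two overlapping frames can be registered with a RANSAC homography and composited:

import cv2
import numpy as np

def stitch_pair(img_a, img_b, min_matches=20):
    """Warp img_b onto img_a's frame using feature matches and a homography."""
    orb = cv2.ORB_create(4000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_b, des_a), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None
    src = np.float32([kp_b[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img_a.shape[:2]
    canvas = cv2.warpPerspective(img_b, H, (w * 2, h))   # naive canvas size
    canvas[0:h, 0:w] = img_a                             # overlay the reference frame
    return canvas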
13.3.3 Stabilization
With a moving camera, it will not be possible to detect and track moving objects
based on processing motion imagery. Stabilisation is the process of transforming the
mosaicked (stitched) motion images into a new one where the camera motion has
been cancelled out. Moreover, sUAV with streaming video cameras suffer from the
undesired shaking motion due to atmospheric turbulence, jittery flight control
emphasising the need for more and more efficient stabilisation implementation tech-
niques. These implementations should take into consideration the following points:
● Feature detection and description is an expensive (w.r.t. time) operation.
Therefore image scaling needs to be supported in this phase;
● Consecutive images do not contain the view of the identical region on the
ground plane:
* The angle of view/perspective changes due to the aircraft movement, and
* Even when the aircraft movement and gimbal movement is optimised to view
a fixed point on the ground, the visible areas change among different frames.
There is a limited number of works related to the stabilisation of videos taken from
sUAV [47–51]. Generally, UAV video stabilisation algorithms include the following
three main steps: (1) motion estimation, (2) motion compensation and (3) image
composition. Many studies attempted to find the 2D motion model (such as homo-
graphy) for estimating the global motion trajectory and then applied a low-pass filter
to the trajectory for removing the high-frequency jitters. Finally, the low-frequency motion parameters are applied to the frames via warping techniques, which is more effective for UAV scenes with very little dynamic movement. In [47], a video stabili-
sation algorithm using a circular block to search and match key places is proposed. In
[48], a smoothing method utilises the Lucas–Kanade tracker [49] to detect interest
points. In [50], a three-step video stabilisation method is proposed for UAVs.
Initially, a Features from Accelerated Segment Test (FAST) corner detector is used
to locate the feature points in the frames, and then the matched key-points, after rejection of false matches, are used to estimate the affine transform. Finally, motion estimation based on the affine model and the corresponding compensation are performed.
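The following Python/OpenCV sketch illustrates the generic three-step scheme (motion estimation, trajectory smoothing, compensation) rather than the FAST-based method of [50]; parameters such as the number of corners and the smoothing radius are assumptions.

import cv2
import numpy as np

def estimate_frame_motion(prev_gray, curr_gray):
    """Step 1: estimate inter-frame translation and rotation from tracked corners."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300,
                                  qualityLevel=0.01, minDistance=20)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good = status.ravel() == 1
    M, _ = cv2.estimateAffinePartial2D(pts[good], nxt[good])
    dx, dy = M[0, 2], M[1, 2]
    da = np.arctan2(M[1, 0], M[0, 0])
    return dx, dy, da

def smooth_trajectory(transforms, radius=15):
    """Step 2: low-pass the cumulative trajectory with a moving average."""
    traj = np.cumsum(np.asarray(transforms), axis=0)
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smoothed = np.column_stack(
        [np.convolve(traj[:, i], kernel, mode="same") for i in range(3)])
    return np.asarray(transforms) + (smoothed - traj)   # corrected increments

Step 3 then rebuilds a 2×3 warp matrix from each corrected (dx, dy, da) triple and applies it to the corresponding frame with cv2.warpAffine.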
φ_A, φ_B   Latitude of points A and B, where A is the drone point and B is the user-defined point. Assume north is positive.
L_A, L_B   Longitude of points A and B, where A is the drone point and B is the user-defined point. Assume east is positive.
β   Bearing angle
D   Distance between two points on a flat plane
R   Radius of the Earth in a spherical model
σ   Drone heading, clockwise from north
x   Horizontal distance from the centre of the drone to the centre of the camera
z   Vertical distance from the centre of the drone to the centre of the camera
H   Drone height from the ground
γ   Camera yaw angle; clockwise is positive
α   Camera pitch angle from the horizontal position; downward is positive
If there are two points A and B on the surface of Earth (assumed spherical with
radius R), and the two points are close together (within a few kilometres of each
other), then the following formulas can be used to calculate the distance between
the two points [60].
[Figure: inputs to the gimbal pointing computation: gimbal inputs (IMUs, magnetometer) and drone inputs (GPS, altitude)]
For the bearing, it will be measured eastward from the north. The bearing of point
B, as seen from point A, can be determined by first calculating two quantities S and
C as follows:
S = cos(φ_B) sin(ΔL),   (13.14)
C = cos(φ_A) sin(φ_B) − sin(φ_A) cos(φ_B) cos(ΔL),   (13.15)
where ΔL is the longitude difference L_B − L_A. Then, the bearing angle β can be calculated by:
β = tan⁻¹(S / C).
However, this cannot be solved if C is zero, thus the atan2 function can be used in
the code or any equivalent function.
The distance between the drone and the point of interest and the bearing angle
can be calculated from the above equations. Referring to Figure 13.4, the yaw angle
of the camera can be calculated as follows:
γ′ = tan⁻¹[(D sin β − x sin σ) / (D cos β − x cos σ)], and
γ = γ′ − σ.
Then, referring to Figure 13.2, the pitch angle can be calculated as follows:
α = tan⁻¹[(H − z) / (D − x)],
which can be replaced by the atan2 function to prevent any errors from occurring in
the angle calculations (Figure 13.5).
With regards to the roll angle, it is used in the system as a stabiliser angle,
which shall prevent the gimbal from rolling, which can lead to image instability. Of
course, the value can be set manually if needed, but remaining at a horizontal level
is better for stabilization.
If the used gimbal has a limited range of movement, it is better to specify it in
the program to prevent the gimbal from getting damaged by huge ranges of motion.
As an example, the current ranges for yaw, pitch and roll angles are specified.
−90° ≤ γ ≤ 90°;
0° ≤ α ≤ 180°; and
−90° ≤ roll ≤ 90°.
Figure 13.4 Top view of the drone and the point of interest
[Figure: side view of the drone, camera and point of interest, showing the distance D, the offsets X and Z, and the pitch angle α]
Of course, such values can be changed or modified based on the gimbal system used and
its capabilities. The last thing to do would be to check the bit rate of the gimbal controller.
As most gimbals’ motors work in encoder counts, the designer should find the number of
counts corresponding to the angle value. This is to allow the algorithm to output the angle
in encoder counts in order to be sent to the gimbal controller. Of course, if the gimbal can
read normal angles immediately, then there is no need to use such a transformation.
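Putting the bearing equations (13.14)–(13.15) and the yaw/pitch expressions together, a hedged Python sketch of the pointing computation could look as follows; the equirectangular ground-distance approximation is an assumption introduced here, since the chapter's distance formula is not reproduced above, and it is only adequate for targets within a few kilometres.

import math

R_EARTH = 6371000.0   # metres, spherical Earth model

def pointing_angles(lat_a, lon_a, lat_b, lon_b, heading, h, x, z):
    """Yaw and pitch commands (degrees) to point the camera at point B.

    lat/lon in degrees; heading = drone heading sigma, clockwise from north;
    h = drone height above ground; x, z = camera offsets from the drone centre.
    """
    pa, pb = math.radians(lat_a), math.radians(lat_b)
    dl = math.radians(lon_b - lon_a)
    # Bearing, (13.14)-(13.15)
    s = math.cos(pb) * math.sin(dl)
    c = math.cos(pa) * math.sin(pb) - math.sin(pa) * math.cos(pb) * math.cos(dl)
    beta = math.atan2(s, c)
    # Ground distance (equirectangular approximation, an assumption here)
    d = R_EARTH * math.hypot(pb - pa, dl * math.cos((pa + pb) / 2.0))
    # Yaw: gamma' measured from north, then referenced to the drone heading
    sig = math.radians(heading)
    gamma_p = math.atan2(d * math.sin(beta) - x * math.sin(sig),
                         d * math.cos(beta) - x * math.cos(sig))
    gamma = gamma_p - sig
    # Pitch from the side-view geometry
    alpha = math.atan2(h - z, d - x)
    return math.degrees(gamma), math.degrees(alpha)

The resulting angles would finally be clipped to the gimbal's allowed ranges and converted to encoder counts using the controller's counts-per-degree factor.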
13.5 Conclusion
In this chapter, we presented the system view of a referential sUAV persistent sur-
veillance system. Such a system view enables an open architecture based on COTS
hardware components. Within this architecture, there are two types of supported
payloads; EO payload and IR payload which allow the system to operate day and
night. The communication between on-board and ground stations is done through
datalink systems. The gimbal controller controls the gimbal IMU sensors and motors
in order to stabilise the gimbal. A key component in this system view is the PEU,
which is in charge of running the image-processing pipeline required for the tracking
capability. The PEU ingests the imagery and schedules the required processing such as stitching, stabilisation, background subtraction, and tracking. Seeking scalability, ease of deployment and extensibility, we recommend that the PEU adhere to a distributed/multithreaded architecture running, whenever deemed possible, on embedded hardware with GPU parallelisation in mind, thus promoting software substitutability with minimum development and integration effort.
References
[1] Ren J., Xu M., Smith J., Zhao H., and Zhang R. ‘Multi-view visual sur-
veillance and phantom removal for effective pedestrian detection’.
Multimedia Tools and Applications. 2017;77(14):18801–18826.
[2] Sommer L.W., Teutsch M., Schuchert T., and Beyerer J. ‘A survey on
moving object detection for wide area motion imagery’. Proceedings of the
IEEE Winter Conference on Applications of Computer Vision; Lake Placid,
NY, USA, 2016. pp. 1–9.
[3] Palaniappan K., Rao R., and Seetharaman G. ‘Wide-area persistent
airborne video: Architecture and challenges’. Distributed Video Sensor
Networks: Research Challenges and Future Directions, Springer, 2010;
pp. 349–371.
[4] Blasch E., Seetharaman G., Suddarth S., et al. ‘Summary of methods in
wide-area motion imagery (WAMI)’. Proceedings of SPIE 9089; Baltimore,
Maryland, USA, 2014. p. 90890C.
[5] Cai G., Dias J., and Seneviratne L. ‘A survey of small-scale unmanned aerial
vehicles: Recent advances and future development trends’. Unmanned
Systems. 2014;2(02):175–199.
[6] Batista da Silva L.C., Bernardo R.M., de Oliveira H.A., and Rosa P.F.F.
‘Unmanned aircraft system coordination for persistent surveillance with
different priorities’. Proceedings of the IEEE 26th International Symposium
on Industrial Electronics; Edinburgh, UK, 2017. pp. 1153–1158.
[7] Khandani H., Moradi H., and Yazdan P.J. ‘A real-time coverage and track-
ing algorithm for UAVs based on potential field’. Proceedings of the 2nd
RSI/ISM International Conference on Robotics and Mechatronics; Tehran,
Iran, 2014. pp. 700–705.
[8] Logan M.J., Chu J., Motter M.A., Carter D.L., Ol M., and Zeune C. ‘Small
UAV research and evolution in long endurance electric powered vehicles’.
Proceedings of the AIAA Infotech Aerospace Conference and Exhibition;
Rohnert Park, California, USA, 2007. pp. 1–7.
[9] Zeng Y., Zhang R., and Lim T.J. ‘Wireless communications with unmanned
aerial vehicles: Opportunities and challenges’. IEEE Communications
Magazine, 2016;54(5):36–42.
[10] Ohood A.A., Omar A., Abdulrahman A., Abdulla A.A., Slim S., and
Issacniwas S, ‘Persistent surveillance with small unmanned aerial vehicles
(SUAVs): A feasibility study’. Proceedings of SPIE 10796; Electro-Optical
Remote Sensing XII; Berlin, Germany, 2018. p. 107960K.
[11] https://www.ntpdrone.com/ [Accessed 28 July 2018].
[12] http://tethered.solutions/ [Accessed 01 Feb 2018].
[13] http://www.unmannedsystemstechnology.com/2017/05/mmc-announces-new-
tethered-drone-power-supply-system/ [Accessed 28 Jan 2018].
[14] https://hoverflytech.com/defense/ [Accessed 05 Feb 2018].
[15] Estrela V.V., and Galatsanos N.P. ‘Spatially-adaptive regularized pel-recursive
motion estimation based on cross-validation’. In Proceedings 1998
International Conference on Image Processing. ICIP98 (Cat. No. 98CB36269);
Chicago, IL, USA, 1998 October; Vol. 2. pp. 200–203. IEEE.
[16] Coelho A.M., and Estrela V.V. (2016). EM-based mixture models applied to
video event detection. arXiv preprint arXiv:1610.02923.
328 Imaging and sensing for unmanned aircraft systems, volume 1
[17] Razmjooy N., Mousavi B.S., Soleymani F., and Khotbesara M.H. ‘A
computer-aided diagnosis system for malignant melanomas’. Neural
Computing and Applications. 2013;23(7–8):2059–2071.
[18] Estrela V.V., and Herrmann A.E. ‘Content-based image retrieval (CBIR) in
remote clinical diagnosis and healthcare’. In Encyclopedia of E-Health and
Telemedicine, IGI Global, 2016, pp. 495–520.
[19] Razmjooy N., Ramezani M., and Ghadimi N. ‘Imperialist competitive
algorithm-based optimization of neuro-fuzzy system parameters for auto-
matic red-eye removal’. International Journal of Fuzzy Systems. 2017;19(4):
1144–1156.
[20] de Jesus M.A., Estrela V.V., Saotome O., and Stutz D. ‘Super-resolution via
particle swarm optimization variants.’ Biologically Rationalized Computing
Techniques For Image Processing Applications. Lecture Notes in
Computational Vision and Biomechanics. Springer, Zurich, Switzerland,
2018;25: 317–337.
[21] Somayeh Mousavi B., and Soleymani F. ‘Semantic image classification
by genetic algorithm using optimised fuzzy system based on Zernike
moments’. Signal, Image and Video Processing. 2014;8(5):831–842.
[22] Razmjooy N., Mousavi B.S., Khalilpour M., and Hosseini H. ‘Automatic
selection and fusion of color spaces for image thresholding’. Signal, Image
and Video Processing. 2014;8(4):603–614.
[23] Emilia G.D., and Gasbarro D.D. ‘Review of techniques for 2D camera
calibration suitable for industrial vision systems’. Journal of Physics:
Conference Series. 2017;841:012030.
[24] Yusuf A.R., Ariff M.F.M., Khairulnizam M.I., Zulkepli M., and Albert K.C.
‘Camera calibration accuracy at different UAV flying heights’. The
International Archives of the Photogrammetry, Remote Sensing and Spatial
Information Sciences, vol. XLII-2/W3, 3D Virtual Reconstruction and
Visualization of Complex Architectures. 2017; pp. 1–3.
[25] Tsai R.Y. ‘A versatile camera calibration technique for high-accuracy 3D
machine vision metrology using off the-shelf TV cameras and lenses’. IEEE
Transactions on Robotics and Automation. 1987;3(4):323–344.
[26] Zhang Z.Y. ‘A flexible new technique for camera calibration’. IEEE
Transactions on Pattern Analysis. 2000;22(11):1330–1334.
[27] Zhang Z. ‘Flexible camera calibration by viewing a plane from unknown
orientations’. Proceedings of the 7th IEEE International Conference on
Computer Vision; Kerkyra, Greece, 1999. pp. 666–673.
[28] Pérez M., Agüera F., and Carvajal F. ‘Digital camera calibration using
images taken from an unmanned aerial vehicle’. The International Archives
of the Photogrammetry, Remote Sensing and Spatial Information Sciences,
Zurich, Switzerland, Vol. XXXVIII, Part 1/C22, 2011; pp. 167–171.
[29] Chiang K.-W., Tsai M.-L., and Chu C.-H. ‘The development of an UAV
borne direct georeferenced photogrammetric platform for ground control
point free applications’. Sensors, 2012;12:9161–9180.
Small UAV: persistent surveillance made possible 329
[30] Sauerbier M., and Eisenbeiss H. ‘Investigation of UAV systems and flight
modes for photogrammetric applications’. The Photogrammetric Record,
2011;26(136):400–421.
[31] Deng D.W., and Li B.A. ‘Large Unmanned Aerial Vehicle Ground Testing
System’. In Applied Mechanics and Materials, 2015;719:1244–1247.
[32] Jimenez P.L., and Agudelo D. ‘Validation and calibration of a high-
resolution sensor in unmanned aerial vehicles for producing images in the IR
range utilizable in precision agriculture’. American Institute of Aeronautics
and Astronautics (AIAA) SciTech, 2015.
[33] Tahar K.N. ‘Aerial terrain mapping using unmanned aerial vehicle
approach’. International Archives of the Photogrammetry, Remote Sensing
and Spatial Information Sciences, Vol. XXXIX-B7, XXII ISPRS Congress,
Melbourne, Australia, 2012. pp. 493–498.
[34] Mohamed M.R.M., and Schwarz K.P. ‘An autonomous system for
aerial image acquisition and georeferencing’. American Society of
Photogrammetry and Remote Sensing Annual Meeting, 1999;pp. 17–21.
[35] Liu P., Xi C., and Yang L. ‘An approach of system calibration for UAV
photogrammetry’. Proceedings of SPIE, 2011;p. 8200.
[36] Xie H., Hicks N., Keller G.R., Huang H., and Kreinovich V. ‘An IDL/ENVI
implementation of the FFT-based algorithm for automatic image registra-
tion’. Computers & Geosciences. 2003;29:1045–1055.
[37] Zhao F., Huang Q., and Gao W. ‘Image matching by normalized cross-cor-
relation’. Proceedings of the IEEE International Conference on Acoustics,
Speech and Signal Processing, 2006.pp. II.
[38] Brown M., and Lowe D.G. ‘Automatic panoramic image stitching
using invariant features’. International Journal of Computer Vision.
2007;74:59–73.
[39] De Césare C., Rendas M.-J., Allais A.-G., and Perrier M. ‘Low overlap
image registration based on both entropy and mutual information measures’.
OCEANS 2008. IEEE, 2008. pp. 1–9.
[40] Jia Y., Su Z., Zhang Q., Zhang Y., Gu Y., and Chen Z. ‘Research on UAV
remote sensing image mosaic method based on SIFT’. International Journal of
Signal Processing, Image Processing and Pattern Recognition, 2015;8
(11):365–374.
[41] Cruz B.F., de Assis J.T., Estrela V.V., and Khelassi A. ‘A compact SIFT-
based strategy for visual information retrieval in large image databases.’
Medical Technologies Journal, 2019;3(2):402–412, doi:10.26415/2572-
004X-vol3iss2p402-412.
[42] Zagrouba E., Barhoumi W., and Amri S. ‘An efficient image-mosaicing
method based on multifeature matching’. Machine Vision and Applications.
2009;20:139–162.
[43] Wang J., and Watada J. Panoramic image mosaic based on SURF algorithm
using OpenCV, 9th International Symposium on Intelligent Signal
Processing, IEEE, Siena, Italy, 2015; pp. 1–6.
330 Imaging and sensing for unmanned aircraft systems, volume 1
[44] Rong W., Chen H., Liu J., Xu Y., and Haeusler R. Mosaicing of microscope
images based on SURF, 24th International Conference on Image and Vision
Computing New Zealand, IEEE, Wellington, New Zealand, 2009; pp. 271–275.
[45] Geng N., He D., and Song Y. ‘Camera image mosaicing based on an opti-
mized SURF algorithm’. TELKOMNIKA Indonesian Journal of Electrical
Engineering. 2012;10:2183–2193.
[46] Zhang Y., Xiong J., and Hao L. ‘Photogrammetric processing of low-altitude
images acquired by unpiloted aerial vehicles’. The Photogrammetric Record,
2011;26:190–211.
[47] Shen H., Pan Q., Cheng Y., and Yu Y. Fast video stabilization algorithm for
UAV. In IEEE International Conference on Intelligent Computing and
Intelligent Systems, 2009. ICIS 2009, Shanghai, China, 2009, vol. 4,
pp. 542–546.
[48] Vazquez M., and Chang C. ‘Real-time video smoothing for small RC heli-
copters’. Proc. IEEE International Conference on Systems, Man and
Cybernetics, San Antonio, Texas, USA, 2009;pp. 4019–4024.
[49] Lucas B.D., and Kanade T., ‘An iterative image registration technique with
an application to stereo vision’. Proceeding of the 7th International
Joint Conference on Artificial Intelligence. San Francisco, USA, 1981;
81:674–679.
[50] Wang Y., Hou Z., Leman K., and Chang R. ‘Real-time video stabilization for
unmanned aerial vehicles’. Proc. IAPR Conference on Machine Vision
Applications, 2011, Nara, JAPAN, 2011:336–339.
[51] Issacniwas S., Slim S., and Maya A.H. ‘Parallax rectification and stabiliza-
tion technique for multiple objects tracking in wide area surveillance system’.
Proc. SPIE 10649, Pattern Recognition and Tracking XXIX, Orlando, Florida,
USA, 2018;p. 1064914.
[52] Srinivasan K., Porkumaran K., and Sainarayanan G. ‘Improved background
subtraction techniques for security in video applications’. Proc. IEEE 3rd
International Conference on Anti-counterfeiting, Security and Identification
in Communication, Hong Kong, China, 2009.
[53] Oliver N.M., Rosario B., and Pentland A.P. ‘A Bayesian computer vision
system for modeling human interactions’. IEEE Transactions on Pattern
Analysis and Machine Intelligence. 2000;22(8):831–843.
[54] Stauffer C., and Grimson W. ‘Adaptive background mixture models for real-
time tracking’. Proc. IEEE Computer Society Conference on Computer
Vision and Pattern Recognition. Fort Collins, CO, USA, 1999.
[55] McFarlane N.J.B., and Schofield C.P., ‘Segmentation and tracking of piglets
in images’. Machine Vision and Applications, 1995;8:187–193.
[56] Wei L., Jianhu W., and Qin L. ‘Study on moving object tracking algorithm
in video images’. Proc. 8th IEEE International Conference on Electronic
Measurement and Instruments, Xian, China, 2007;pp. 810–816.
[57] Shai A. ‘Support vector tracking’. IEEE Transactions on Pattern Analysis
and Machine Intelligence. 2004;26(8):1064–1072.
Small UAV: persistent surveillance made possible 331
[58] Hu W., Tan T., Wang L., and Steve M. ‘A survey on visual surveillance of
object motion and behaviors’. IEEE Transactions on Systems, Man, and
Cybernetics Applications and Reviews. 2004;34(3):334–352.
[59] Javed O., and Shah M. ‘Tracking and object classification for automated
surveillance’. Proceedings of the 7th European Conference on Computer
Vision-Part IV, Copenhagen, Denmark, 2002;pp. 343–357.
[60] Karasulu B. ‘Review and evaluation of well-known methods for moving
object detection and tracking in videos’. Journal of Aeronautics and Space
Technologies. 2010;4(4):11–22.
[61] Jeong B., Ha J., and Choi H. ‘MDP-based mission planning for multi-UAV
persistent surveillance’. 14th International Conference on Control,
Automation and Systems. Seoul, South Korea, 2014;pp. 831–834.
[62] Nigam N., and Kroo I. ‘Persistent surveillance using multiple unmanned air
vehicles’. 2008 IEEE Aerospace Conference, Big Sky, MT, USA, 2008;
pp. 1–14.
[63] Hartuv E., Agmon N., and Kraus S. Scheduling spare drones for persistent
task performance under energy constraints. 2008 International Conference
on Autonomous Agents and Multiagent Systems AAMAS, Stockholm,
Sweden, 2018.
[64] Mitchell D., Corah M., Chakraborty N., Sycara K.P., and Michael N. ‘Multi-
robot long-term persistent coverage with fuel constrained robots’. Proc.
2015 IEEE International Conference on Robotics and Automation (ICRA),
Seattle, WA, USA, 2015;pp. 1093–1099.
[65] Peters, J.R., Wang, S.J., Surana, A., and Bullo, F. ‘Cloud-supported coverage
control for persistent surveillance missions’. Journal of Dynamic Systems
Measurement and Control. 2017;139(8):1–12.
[66] Razmjooy N., Estrela V.V., and Loschi H.J. ‘A survey of potatoes image
segmentation based on machine vision.’ Applications of Image Processing
and Soft Computing Systems in Agriculture. IGI Global, Hershey, PA, USA,
2019;1–38. doi:10.4018/978-1-5225-8027-0.ch001
[67] Nigam N. ‘The multiple unmanned air vehicle persistent surveillance pro-
blem: A review’. Machines. 2014;2(1):13–72.
[68] Mahmood S., Afsharchi M., and Weindling A.M. ‘A Markovian decision
process analysis of experienced agents joining ad-hoc teams’. 2018
21st Euromicro Conference on Digital System Design (DSD), Prague,
Czech Republic, 2018; 691–698.
This page intentionally left blank
Chapter 14
Conclusions
Vania V. Estrela1, Jude Hemanth2, Osamu Saotome3,
George Nikolakopoulos4 and Roberto Sabatini5
The current interest in UAVs has prompted not only military applications but also civilian uses. Aerial vehicle requirements aim to guarantee a level of safety comparable to the see-and-avoid conditions expected of piloted aeroplanes. The process of detecting obstacles in the path of a vehicle, determining whether they pose a threat, and taking measures to avoid them, known as see-and-avoid or sense-and-avoid, involves a great deal of decision-making. Other decision-making tasks can also be accomplished using computer vision and sensor integration, since these have great potential to improve the performance of UAVs.
Macroscopically, Unmanned Aerial Systems (UASs) are cyber-physical systems (CPSs) that can benefit from all types of sensing frameworks, despite severe design constraints related to precision, reliable communication, distributed processing capabilities, and data management.
The first chapter overviews the main concepts and calls attention to several issues still under discussion in UAV-CPSs. Several trends and needs are discussed to encourage readers to approach the subsequent chapters critically and to provide food for thought.
Among the significant advantages of UAV-CPSs is their capability to take on the duties and objectives of human beings while accomplishing tasks for them. At the same time, they also take some decisions and execute some actions independently; therefore, people and machines must collaborate. Even though these capabilities offer noteworthy benefits, a tremendous amount of effort is still necessary to fully master suitable ways of supporting human–machine interaction.
Using low-cost, open-source components together with multiple sensors and
actuators is quite a challenge in terms of effort and cost. Hence, it is desirable to
employ Open Source Software (OSS) and Hardware (OSH) in UAV-CPSs.
1 Telecommunications Department, Universidade Federal Fluminense, RJ, Brazil
2 ECE Department, Karunya University, Coimbatore, India
3 DCTA-ITA-IEEA, Sao Jose dos Campos, SP, Brazil
4 Department of Computer Science, Luleå University of Technology, Luleå, Sweden
5 Department of Aerospace Engineering and Aviation, RMIT University, Bundoora, Australia
Whenever possible, available OSH and OSS should be used in designs, independently of the underlying hardware framework and operating system.
The performance of an Orthogonal Frequency Division Multiplexing (OFDM) UAV-CPS can be enhanced by adding channel coding (i.e., error-correcting codes) to detect and correct the errors that occur during data transmission. The massive bandwidth of Ultra-Wide Band (UWB) channels can give rise to effects not captured by conventional wireless channel models in several MAV applications. UWB technologies based on IEEE 802.15.4a are useful wherever highly accurate localisation is needed alongside various sensors for stabilisation and navigation. Still, absolute indoor positioning remains a challenge for UAVs.
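As a toy illustration of why channel coding helps, the snippet below uses a simple rate-1/3 repetition code that can correct one flipped bit per transmitted symbol; practical OFDM links rely on much stronger codes (e.g., convolutional or LDPC), so this is only a sketch of the principle.

```python
# Minimal illustration of channel coding: a rate-1/3 repetition code that can
# correct any single bit error within each group of three transmitted bits.

def encode(bits):
    # Repeat every information bit three times.
    return [b for bit in bits for b in (bit, bit, bit)]

def decode(coded):
    # Majority vote over each group of three received bits.
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]

tx = encode([1, 0, 1, 1])
tx[4] ^= 1                      # flip one bit to simulate a channel error
assert decode(tx) == [1, 0, 1, 1]
```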
Flying a UAV in unstructured settings with changing conditions is challenging. To support the development of better algorithms, a multipurpose dataset of low-altitude UAV flights in a Brazilian environment is proposed as a benchmark for positioning and other avionics tasks. It allows computer vision procedures to be assessed in terms of robustness and generalisation and provides a baseline for depth estimation with and without landmarks. This stage of development can also pave the way for future integration with remote-sensing (RS) modules that will bring more spectral information to the analyses.
UAV-CPSs involve a great deal of knowledge on networking, more specifically on Flying Ad-hoc Networks (FANETs). The fact that the traffic of high-dimensional multimedia data streams through UAV-CPSs tends to grow exponentially raises several issues and points towards future research directions.
Texture is an important feature for recognising objects or regions of interest (ROIs) in images, and it has been commonly used for image classification, for instance to assess biomass from satellite images. UAV imagery benefits from ultra-high spatial resolution, which makes texture a paramount source of knowledge as well. Nevertheless, texture in UAV imagery has seldom been used in surveillance. Moreover, merging ground hyperspectral data could compensate for the limited bands of UAV sensors and increase the precision of the resulting estimates. Consequently, the goals of the corresponding chapter are (i) to explore UAV-based multispectral imagery and (ii) to improve several types of estimation accuracy through hyperspectral information.
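For readers unfamiliar with texture descriptors, the hedged sketch below computes a few grey-level co-occurrence matrix (GLCM) statistics for an image patch, one common way of turning texture into classification features; it assumes scikit-image 0.19 or later and uses a random patch as a stand-in for a real UAV ROI.

```python
# Sketch of GLCM texture features for ROI classification; the patch is a
# synthetic 8-bit stand-in for one band of a UAV multispectral tile.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

patch = (np.random.rand(64, 64) * 255).astype(np.uint8)   # stand-in ROI

glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
features = {prop: graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy", "correlation")}
```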
The application of camera-equipped UAVs for visually observing ecological balance, the construction and operation of water bodies, buildings, bridges, forest reserves, and other types of infrastructure has grown exponentially. These UAV-CPSs can frequently examine all sorts of sites, monitor work in progress, generate safety documents and reports, and scrutinise existing structures, chiefly in hard-to-reach regions.
Regardless of advances in on-board sensors, the Cloud, RS, computational intelligence, and communications, super-resolution (SR) will be in demand for quite a long time. This will continue to be the case in contexts where acquiring imagery is expensive and troublesome, such as healthcare, astronomy, and disaster relief.
Quality of Service (QoS) and Quality of Experience (QoE), besides other qualitative performance metrics, will play a pivotal role in driving further improvements in all stages of a UAV-CPS.
Stereo vision enhances UAV navigation by providing depth perception, which is governed by two calibration parameters: the focal length and the stereo baseline. This allows altitude and the scale of observed features to be estimated, a significant advantage over monocular vision, which cannot estimate scale directly. Such depth perception is crucial for navigation tasks such as altitude regulation and obstacle detection.
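The relation behind this is the standard triangulation formula, depth = focal length × baseline / disparity; the short sketch below plugs in illustrative numbers (not taken from any particular stereo rig) to show how the two parameters set the recoverable depth.

```python
# Hedged sketch of the stereo depth relation: with focal length f (pixels) and
# baseline B (metres), a feature matched with disparity d (pixels) between the
# two views lies at depth Z = f * B / d. All values are illustrative.
f_px = 800.0         # focal length in pixels
baseline_m = 0.12    # distance between the two camera centres
disparity_px = 16.0  # horizontal pixel offset of the matched feature

depth_m = f_px * baseline_m / disparity_px   # -> 6.0 m
```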
Cloud-supported frameworks can enhance UAV systems by allowing video acquisition, file scheduling, information offloading/processing, and network state assessment to be managed efficiently. This integration improves scalability and real-time processing capabilities by selectively offloading data to a cloud-based server, which then processes the data and provides feedback to the ground station.
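One way to picture the "selective" part is a small offloading policy that weighs onboard load against the current link state; the thresholds and inputs below are purely hypothetical and only illustrate the kind of decision such a framework has to make.

```python
# Hypothetical selective-offloading policy: process on board when the embedded
# processor has headroom, push to the cloud when the link can meet the deadline,
# and degrade gracefully otherwise.
def choose_target(cpu_load: float, link_bandwidth_mbps: float,
                  frame_size_mb: float, deadline_s: float) -> str:
    uplink_time = frame_size_mb * 8.0 / max(link_bandwidth_mbps, 1e-6)
    if cpu_load < 0.7:                 # onboard processor still has headroom
        return "onboard"
    if uplink_time < deadline_s:       # cloud round trip fits the deadline
        return "cloud"
    return "drop_or_downsample"

print(choose_target(cpu_load=0.9, link_bandwidth_mbps=20.0,
                    frame_size_mb=2.0, deadline_s=0.5))
```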
Key future trends for data storage and processing in unmanned aerial systems include the increased use of data logging methods that provide high-frequency recording and analysis capabilities. Additionally, storage virtualisation and software-defined storage (SDS) enable greater flexibility and scalability, allowing integration with existing storage infrastructures, internal disks, or cloud platforms. Furthermore, improved data compression technologies are vital to manage the large volumes of data generated by UAV sensors, ensuring efficient storage and transmission to ground stations or the cloud. UAVs are increasingly integrating distributed processing frameworks that allow collaboration between multiple units for complex tasks such as high-resolution image processing. The adoption of cloud computing in UAV systems supports offloading computational tasks, thus compensating for onboard resource constraints and enhancing processing and storage capabilities. Moreover, addressing big data challenges, such as infrastructure readiness, privacy concerns, and decision-making processes, remains critical for the effective integration of data analytics in UAV operations.
Environmental variables such as changing lighting conditions and platform dynamics can heavily affect stereo vision's accuracy. Variations in illumination can lead to inaccurate depth perception, while platform vibrations can cause deformations in the stereo rig, resulting in large triangulation errors. Addressing these inaccuracies often requires robust algorithmic or hardware solutions specifically tailored to these conditions.
Hybrid solutions for optimising UAV computational resources involve combining algorithmic, software, and hardware improvements. Such configurations might include hardware–software co-design or dedicated hardware solutions paired with algorithmic optimisations that reduce the amount of computation. FPGAs, for example, offer benefits such as low weight, reduced size, and low power, which are crucial for UAV-CPS real-time performance. These solutions allow greater flexibility and efficiency in deploying computational resources.
Balancing computational capabilities with power constraints in UAV systems is a critical challenge affecting real-time application performance. UAVs require lightweight, low-power computing solutions because restrictive size, weight, and power (SWaP) budgets limit their onboard computational resources. To achieve real-time performance, UAVs must use low-power computer vision (CV) processing architectures, often operating at reduced clock frequencies. This limits the complexity of computations that can be performed on board, necessitating innovations in both algorithmic efficiency and hardware capabilities such as field-programmable gate arrays (FPGAs). Power constraints also mean that raising the clock frequency to boost performance is not feasible, compelling the use of alternative energy solutions such as solar panels or laser power beaming for sustained operations. Despite these innovations, real-time processing of high-quality data, such as high-resolution video, remains challenging and often requires offloading computations to ground stations or the cloud to conserve energy and maintain performance. This offloading, however, introduces delays and potential data transmission issues, affecting immediate responsiveness and autonomy. Therefore, achieving an optimal balance between processing capability and power consumption is vital for enhancing the real-time performance of UAV applications.
Feature detection in visual odometry (VO) for UAVs is crucial, as it involves the identification and correspondence of distinct visual landmarks across successive frames, enabling the estimation of motion. Visual-inertial fusion enhances the performance of VO by leveraging complementary data, where inertial sensors help constrain the visual feature search space, thereby increasing the robustness and accuracy of feature matching and tracking. Visual-inertial sensor fusion can be implemented in a loosely coupled or tightly coupled manner. Loosely coupled systems use standalone solutions from the visual and inertial systems to aid each other, predicting visual feature motion from inertial estimates to enhance feature correspondence. In tightly coupled systems, raw sensor data is processed by a central module to improve state estimation of the UAV, which is optimal but computationally demanding. Both fusion techniques improve the robustness of VO against environmental challenges such as changes in illumination, making UAV navigation more reliable in GNSS-denied environments.
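As a rough sketch of the loosely coupled idea, the snippet below uses an inertially predicted camera motion to guess where a tracked feature should reappear, so the matcher only needs to search a small window around that guess; the pinhole intrinsics, feature depth, and motion values are all assumptions made for illustration.

```python
# Sketch of loosely coupled visual-inertial aiding: reproject a feature after
# the IMU-predicted camera motion (R, t) to constrain the visual search window.
import numpy as np

def predict_feature(pixel, depth_m, K, R_delta, t_delta):
    """Reproject a feature after the predicted inter-frame camera motion."""
    K_inv = np.linalg.inv(K)
    ray = K_inv @ np.array([pixel[0], pixel[1], 1.0])   # back-project to a ray
    p_cam = ray * depth_m                                # 3D point in old frame
    p_new = R_delta @ p_cam + t_delta                    # apply predicted motion
    uvw = K @ p_new
    return uvw[:2] / uvw[2]                              # predicted pixel location

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed intrinsics
pred = predict_feature((350, 250), depth_m=10.0,
                       K=K, R_delta=np.eye(3), t_delta=np.array([0.1, 0, 0]))
search_window = (pred - 12, pred + 12)    # restrict matching to +/- 12 px
```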
The Internet of Drones (IoD) aims to improve UAV functionality by enabling a networked environment in which drones communicate and collaborate, perform complex tasks, and share data for efficient mission execution. Challenges facing the IoD include maintaining reliable network connectivity among multiple UAVs, handling the exponential growth of multimedia data streams, and ensuring real-time data processing. Additionally, limited battery life and power consumption are significant hurdles. Furthermore, centralised control systems can lead to bottlenecks and single points of failure, necessitating distributed control strategies to ensure fault tolerance and effective resource utilisation.
The integration of visual processing systems in UAV navigation involves the use of vision-based sensors (VBSs) to complement traditional systems such as GPS and INS, particularly in environments where GNSS signals may be unavailable, such as urban canyons. Core components include visual-inertial fusion, where visual and inertial sensors aid each other for improved estimation of vehicle states, with techniques such as optical flow measurements used to enhance velocity estimation. Visual odometry (VO) provides a full navigation state vector by tracking visual features across frames, allowing incremental estimation of vehicle states, and is often a subset of visual-inertial aiding. Techniques such as visual servoing use real-time visual data for feedback control, navigating the UAV relative to environmental features. Also, visual simultaneous localisation and mapping (V-SLAM) algorithms are employed for environment mapping and navigation. These integrations are increasingly deployed in UAVs due to their low cost, passive operation, and capability to provide detailed environmental perception.
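To give one concrete flavour of such optical-flow aiding, the sketch below converts an image-plane flow rate into a ground-speed estimate for a downward-looking camera under a flat-ground, pure-translation assumption; the flow, altitude, and focal-length values are illustrative only.

```python
# Hedged sketch of optical-flow-aided velocity estimation for a downward-looking
# camera: under a flat-ground, pure-translation assumption, image flow u (px/s)
# at altitude Z relates to ground speed as V ~= u * Z / f.
flow_px_per_s = 40.0    # mean optical-flow magnitude along track
altitude_m = 25.0       # from a barometer or rangefinder
focal_px = 800.0        # camera focal length in pixels

ground_speed_mps = flow_px_per_s * altitude_m / focal_px   # -> 1.25 m/s
```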