Special Issue Reprint
Recognition Robotics
Edited by
José María Martínez-Otzeta
mdpi.com/journal/sensors
Recognition Robotics
Editor
José María Martínez-Otzeta
Editorial Office
MDPI
St. Alban-Anlage 66
4052 Basel, Switzerland
This is a reprint of articles from the Special Issue published online in the open access journal
Sensors (ISSN 1424-8220) (available at: https://www.mdpi.com/journal/sensors/special issues/
Recognition Robotics).
For citation purposes, cite each article independently as indicated on the article page online and as
indicated below:
Lastname, A.A.; Lastname, B.B. Article Title. Journal Name Year, Volume Number, Page Range.
© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative
Commons Attribution (CC BY) license. The book as a whole is distributed by MDPI under the terms
and conditions of the Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
license.
Contents
Luca Marchionna, Giulio Pugliese, Mauro Martini, Simone Angarano, Francesco Salvetti and
Marcello Chiaberge
Deep Instance Segmentation and Visual Servoing to Play Jenga with a Cost-Effective Robotic
System
Reprinted from: Sensors 2023, 23, 752, doi:10.3390/s23020752
Jaekwang Lee, Kangmin Lim and Jeongho Cho
Improved Monitoring of Wildlife Invasion through Data Augmentation by Extract–Append of
a Segmented Entity
Reprinted from: Sensors 2022, 22, 7383, doi:10.3390/s22197383
Dominykas Strazdas, Jan Hintz, Aly Khalifa, Ahmed Abdelrahman, Thorsten Hempel and
Ayoub Al-Hamadi
Robot System Assistant (RoSA): Towards Intuitive Multi-Modal and Multi-Device
Human-Robot Interaction
Reprinted from: Sensors 2022, 22, 923, doi:10.3390/s22030923
José María Martínez-Otzeta, Itsaso Rodríguez-Moreno, Iñigo Mendialdua and Basilio Sierra
RANSAC for Robotic Applications: A Survey
Reprinted from: Sensors 2023, 23, 327, doi:10.3390/s23010327
Editorial
Editorial for the Special Issue Recognition Robotics
José María Martínez-Otzeta
Department of Computer Science and Artificial Intelligence, University of the Basque Country,
20018 Donostia-San Sebastián, Spain; [email protected]
Perception of the environment is an essential skill for robotic applications that interact
with their surroundings. Alongside perception often comes the ability to recognize objects,
people, or dynamic situations. This skill is of paramount importance in many use cases,
from industrial to social robotics. Robots that can accurately perceive and understand their
environment are critical for tasks like manufacturing, delivery, healthcare, and assisting
humans in homes or public spaces. Object recognition enables robots to identify items,
tools, and obstacles in their vicinity. This allows industrial robots to select the right parts
or manipulators, logistics robots to handle packages, and autonomous vehicles to avoid
collisions. Activity recognition allows robots to interpret human motions and behavior.
This facilitates safe and intuitive collaboration in shared workspaces. It also permits
service robots to determine user intents and respond appropriately. Person recognition
provides robots the means to identify individuals. This capability supports applications
like personalized assistance, healthcare monitoring, and security surveillance. Altogether,
these skills comprise the fundamental building blocks for robots to operate adaptively in
the real world.
This Special Issue “Recognition Robotics” of Sensors seeks to explore new research
proposals on this increasingly important topic. The fifteen accepted papers in this issue
cover human–robot collaboration [1], person re-identification [2,3], human–robot interac-
tions [4,5], visual servoing [6], cooperative mapping [7], semantic segmentation [8,9], object
classification [10], multi-object tracking [11], robot path planning [12], embedded deep
learning [13], activity recognition [14], and robust model fitting [15]. These works present
novel techniques using tools such as fuzzy logic, deep learning, computer vision, ultrasonic
sensing, spline optimization, and more to advance robot capabilities in real-world condi-
tions. The research aims to overcome challenges in uncertainty, limited data, computational
constraints, and complexity across various application domains. In summary, this Special
Issue provides a sampling of the latest innovations and progress in enabling robots that can
effectively perceive, learn, plan, manipulate, and collaborate in unstructured environments
through advances in recognition capabilities.
Citation: Martínez-Otzeta, J.M. Editorial for the Special Issue Recognition Robotics. Sensors 2023,
23, 8515. https://doi.org/10.3390/s23208515
Received: 22 September 2023
Accepted: 7 October 2023
Published: 17 October 2023
Copyright: © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
In [1], Yalçinkaya et al. introduce a Fuzzy State-Long Short-Term Memory (FS-LSTM)
approach for human–robot collaboration in dynamic fields like agriculture and construction.
These tasks are time-consuming and risky for humans, making robotic assistance
valuable. The method handles the ambiguity in human behavior by fuzzifying sensory data
and employing a combined activity recognition system using state machines and LSTM.
Experimental validation showed that FS-LSTM outperforms traditional LSTM in accuracy
and computational efficiency.
In [2], Casao et al. introduce an unsupervised method for person re-identification, capable
of automatically adding new identities to an adaptive gallery in open-world settings.
The system compares current models to new unlabeled data and uses information theory to
keep compact representative models. Experimental results, including comparisons to other
unsupervised and semi-supervised methods, validate the effectiveness of their approach.
The authors of [3] propose a lightweight deep metric learning technique for reliable
person re-identification, aimed at robot tracking. This method addresses challenges like
clothing and pose changes by employing a novel attention mechanism. This focuses on
specific body parts, retains global context, and enables cross-representations for robust
identification. The experimental results show up to 80.73% and 64.44% top-rank accuracy,
outperforming existing methods. The authors suggest that integrating this metric improves
tracking reliability in dynamic environments.
In [4], Błażejowska et al. explore the impact of emotional feedback from the Miro-E
robot on high school students during a programming education session. The robot moni-
tored students’ emotions via facial expression analysis and provided affective feedback like
verbal praise and tail wagging. Compared to a control group with neutral robot responses,
the emotional feedback positively impacted engagement, particularly for students with
little prior programming experience. However, it also slightly reduced the robot’s likeabil-
ity, hinting at an uncanny valley effect. Due to a small sample size, the study focused on
qualitative insights.
In [5], Marques-Villarroya et al. introduce a robotic perception architecture that
employs bio-inspired endogenous attention to improve human–robot interactions. The ar-
chitecture uses multisensory inputs and ranks stimuli based on their relevance to the robot’s
tasks, particularly emphasizing human presence and actions. By doing so, it optimizes
the robot’s focus and behavior, leading to more efficient interactions. Implemented on the
Robot Operating System (ROS), the architecture demonstrates strong real-time performance
and extensibility. The authors argue that this bio-inspired approach enhances the robot’s
responsiveness while reducing complexity.
In [6], Marchionna et al. demonstrate how a low-cost, six-axis robotic arm, e.Do,
can play Jenga using instance segmentation and visual servoing. The system employs an
affordable RGB-D camera and force sensor. A customized deep learning model is trained to
identify each Jenga block, enabling precise visual tracking during manipulation. The force
sensor helps decide if a block can be safely removed. Testing shows up to 14 consecutive
successful block extractions before the tower collapses. The authors note that Jenga serves
as a complex benchmark, driving advancements in multi-step reasoning, integrated sensory
perception, and high-precision control.
The authors of [7] propose a decentralized framework for collaborative 3D mapping
using mobile robots with LiDAR sensors in large-scale outdoor settings such as agriculture
and disaster response. The real-time method allows robots to share and merge locally
scanned submaps into a global map, even with limited communication bandwidth. A
conditional peer-to-peer strategy is used for sharing map data over different distances.
Experiments in a real-world solar power plant confirm the approach’s efficiency and
reliability for multi-robot mapping of extensive outdoor areas.
In [8], Pinkovich et al. address the challenge of autonomously selecting safe landing
sites for delivery drones in dense urban areas. Their multi-resolution technique captures
visual data at varying altitudes, enabling both wide exploration and high-resolution sensing.
A semantic segmentation deep neural network processes this data, updating probability
distributions for each ground patch’s landing suitability. When a location’s confidence
exceeds a threshold, it is selected as viable. The authors find the method effectively balances
the trade-off between exploration and resolution in constrained urban environments.
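The per-patch probability update summarized above can be illustrated with a recursive Bayes rule. The following is a minimal sketch under stated assumptions, not the implementation of [8]: the function names, the classifier reliability values (`p_hit`, `p_false`), and the 0.95 confidence threshold are all hypothetical and chosen only for the example.

```python
def update_patch(prior, observed_safe, p_hit=0.8, p_false=0.3):
    """Bayes update of P(patch is a safe landing site).

    p_hit   = P(segmentation says "safe" | patch is safe)      (assumed value)
    p_false = P(segmentation says "safe" | patch is not safe)  (assumed value)
    """
    if observed_safe:
        like_safe, like_unsafe = p_hit, p_false
    else:
        like_safe, like_unsafe = 1.0 - p_hit, 1.0 - p_false
    evidence = like_safe * prior + like_unsafe * (1.0 - prior)
    return like_safe * prior / evidence


def first_confident_step(observations, prior=0.5, threshold=0.95):
    """Return (step, belief) once the belief exceeds the threshold.

    If the threshold is never exceeded, the step is None.
    """
    belief = prior
    for step, obs in enumerate(observations, start=1):
        belief = update_patch(belief, obs)
        if belief > threshold:
            return step, belief
    return None, belief
```

With these assumed values, each "safe" observation multiplies the odds by 8/3, so a patch that starts at 0.5 crosses the 0.95 threshold on the fourth consecutive "safe" observation.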
Lee et al. introduce in [9] an “Extract-Append” data augmentation technique to
boost the accuracy of models detecting wild animals in agricultural fields. The method
uses semantic segmentation to isolate animal shapes from sample images and combines
them with new backgrounds to enrich the training dataset. Testing shows at least a 2.2%
improvement in mean Average Precision over traditional methods, and the technique
enables ongoing flexible data augmentation.
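The core idea of pasting a segmented entity onto a new background can be sketched in a few lines. This is an illustrative sketch only, not the pipeline of [9]: images are represented as plain 2D lists of pixel values, the mask is assumed to come from some segmentation model, and the function name and arguments are hypothetical.

```python
def extract_append(source, mask, background, top, left):
    """Copy the masked pixels of `source` onto a copy of `background`.

    source, mask: equally sized 2D lists; mask holds 1 where the animal is.
    background:   2D list that receives the extracted pixels.
    (top, left):  where the source's upper-left corner lands on the background.
    """
    out = [row[:] for row in background]  # leave the original background untouched
    for r, (src_row, mask_row) in enumerate(zip(source, mask)):
        for c, (pixel, keep) in enumerate(zip(src_row, mask_row)):
            rr, cc = top + r, left + c
            # Skip unmasked pixels and anything that would fall outside the background.
            if keep and 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                out[rr][cc] = pixel
    return out
```

Calling the function repeatedly with different backgrounds and offsets yields new training images from a single segmented sample.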
The authors of [10] present vision-based methods for automated recycling of used
electronic components such as capacitors and voltage regulators. Using a custom object
detection algorithm, they identify key areas in cluttered workspaces and compare three
classification techniques: SNNs, SVMs, and CNNs. After hyperparameter tuning, CNNs
prove to be the most accurate with a 98.1% success rate, making them the preferred method
for reliable automated recycling.
Paper [11] proposes the use of deep neural networks to detect illegal garbage dumping
in urban areas. They combine OpenPose for human pose estimation, YOLO for garbage
bag classification, and DeepSORT for object tracking. The system measures the distance
between a person’s wrist and the garbage bag to determine illegal dumping. Experimental
results show their method offers higher accuracy and lower false alarms compared to other
approaches, making it effective for automated monitoring against unlawful waste disposal.
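The wrist-to-bag distance cue can be sketched as follows. The sketch assumes that a pose estimator supplies a wrist keypoint as `(x, y)` pixels and that a detector supplies the bag as a box `(x_min, y_min, x_max, y_max)`. The decision rule (bag first close to the wrist, later far from it) and both thresholds are hypothetical illustrations, not the rule used in [11].

```python
import math


def bbox_center(box):
    """Center of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def wrist_bag_distance(wrist, bag_box):
    """Euclidean distance in pixels between a wrist keypoint and the bag center."""
    cx, cy = bbox_center(bag_box)
    return math.hypot(wrist[0] - cx, wrist[1] - cy)


def looks_like_dumping(distances, held_below=30.0, released_above=80.0):
    """Flag a track in which the bag is first held, then left behind.

    Both thresholds are assumed values in pixels.
    """
    was_held = False
    for d in distances:
        if d < held_below:
            was_held = True
        elif was_held and d > released_above:
            return True
    return False
```

In practice the per-frame distances would be computed along a track produced by the object tracker, so that the same person and bag are compared over time.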
In the research presented by Rykała et al. [12], a path-planning method is developed
for an unmanned ground vehicle (UGV) to follow a human guide using ultra-wideband
(UWB) technology. They use smoothing splines to reconstruct the guide’s path from
periodic distance measurements. The approach is computationally efficient and can handle
missing data, making it suitable for real-time applications.
The authors of [13] provide a comprehensive evaluation of how well state-of-the-art
deep learning object detection models perform on embedded electronics. They assess
multiple architectures and quantization techniques to make the models more efficient for
embedded and robotics applications. The paper outlines the entire process from model
conversion to deployment and performance measurement on embedded devices. It offers
guidelines for choosing the right hardware and optimization strategies, and discusses the
various factors that influence performance in real-time robotics systems.
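As a reminder of what quantization does to model weights, the following sketch applies affine (asymmetric) quantization to a list of floats and maps them back. It is a generic textbook illustration, not one of the toolchains evaluated in [13]; the function names and the 8-bit default are assumptions.

```python
def quantize(values, num_bits=8):
    """Affine quantization of floats to unsigned integers in [0, 2**num_bits - 1]."""
    q_max = 2 ** num_bits - 1
    # The representable range is forced to contain 0 so that 0.0 maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / q_max
    if scale == 0.0:
        scale = 1.0  # all inputs are zero; any positive scale works
    zero_point = round(-lo / scale)
    q = [min(q_max, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate floats."""
    return [(qi - zero_point) * scale for qi in q]
```

The round trip loses at most about one quantization step per value, which is the accuracy cost traded for smaller and faster models on embedded hardware.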
In [14], Strazdas et al. introduce RoSA, a framework that facilitates human–robot
interactions using speech and gestures. Running on ROS, the system incorporates speech
recognition, face identification, and pose estimation. A user study revealed that RoSA’s
usability was on par with a human-controlled setup, suggesting it offers a natural inter-
action experience. The authors highlight the value of multi-sensory integration for more
human-like and flexible robot interactions.
The authors of [15] review the RANSAC algorithm’s applications in robotics, focusing
on shape detection and feature matching. They explore various enhancements to RANSAC
that improve its speed, accuracy, and robustness. The survey also discusses trade-offs
between computational cost and performance, highlights recent robotics applications, and
provides a list of open-source RANSAC libraries. The survey offers robotics researchers
and developers an extensive reference on the state of the art in RANSAC techniques.
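For readers unfamiliar with the algorithm, the basic RANSAC loop can be sketched for the simplest shape, a 2D line. This is a minimal sketch of the textbook procedure (sample a minimal set, fit, count inliers, keep the best), not code from [15]; the iteration count, tolerance, and seed are arbitrary assumptions.

```python
import random


def fit_line(p, q):
    """Line a*x + b*y + c = 0 through two points, normalized so a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: the two points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points, n_iters=200, tol=0.1, seed=0):
    """Return (model, inliers) for the line supported by the most points."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        model = fit_line(*rng.sample(points, 2))  # minimal sample: two points
        if model is None:
            continue
        a, b, c = model
        # With a normalized line, |a*x + b*y + c| is the point-to-line distance.
        inliers = [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Many of the enhancements discussed in the survey modify one of these steps, for example how samples are drawn or how candidate models are scored.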
In summary, the fifteen papers in this Special Issue on “Recognition Robotics” demon-
strate the tremendous progress being made in enabling robots to effectively perceive,
understand, plan, and interact in the real world. However, significant challenges remain
before these innovative techniques can be reliably deployed in unconstrained environments.
Testing novel algorithms in controlled simulations or lab settings with simplified assump-
tions can be deceptively promising, because applying recognition capabilities on physical
robotic platforms in complex dynamic scenarios reveals many subtleties. Interactive testing
is critical to expose limitations around uncertainty, variability, and computational con-
straints. Moving innovations out of the lab or controlled scenarios requires addressing edge
cases and graceful failure modes, and therefore there is still substantial effort needed to
robustly handle the diversity and unpredictability of the real world. Nevertheless, the field
continues steadily on an exciting path towards enabling robot assistants and coworkers
that can perceive, learn, reason, and collaborate at a human level. These capabilities will
lead to transformative applications, and the works presented in this Special Issue provide
an inspiring snapshot of the road ahead.
References
1. Yalçinkaya, B.; Couceiro, M.S.; Soares, S.P.; Valente, A. Human-Aware Collaborative Robots in the Wild: Coping with Uncertainty
in Activity Recognition. Sensors 2023, 23, 3388.
2. Casao, S.; Azagra, P.; Murillo, A.C.; Montijano, E. A Self-Adaptive Gallery Construction Method for Open-World Person
Re-Identification. Sensors 2023, 23, 2662.
3. Syed, M.A.; Ou, Y.; Li, T.; Jiang, G. Lightweight Multimodal Domain Generic Person Reidentification Metric for Person-Following
Robots. Sensors 2023, 23, 813.
4. Błażejowska, G.; Gruba, Ł.; Indurkhya, B.; Gunia, A. A Study on the Role of Affective Feedback in Robot-Assisted Learning.
Sensors 2023, 23, 1181.
5. Marques-Villarroya, S.; Castillo, J.C.; Gamboa-Montero, J.J.; Sevilla-Salcedo, J.; Salichs, M.A. A Bio-Inspired Endogenous
Attention-Based Architecture for a Social Robot. Sensors 2022, 22, 5248.
6. Marchionna, L.; Pugliese, G.; Martini, M.; Angarano, S.; Salvetti, F.; Chiaberge, M. Deep Instance Segmentation and Visual
Servoing to Play Jenga with a Cost-Effective Robotic System. Sensors 2023, 23, 752.
7. Lewis, J.; Lima, P.U.; Basiri, M. Collaborative 3D Scene Reconstruction in Large Outdoor Environments Using a Fleet of Mobile
Ground Robots. Sensors 2023, 23, 375.
8. Pinkovich, B.; Matalon, B.; Rivlin, E.; Rotstein, H. Finding a Landing Site in an Urban Area: A Multi-Resolution Probabilistic
Approach. Sensors 2022, 22, 9807.
9. Lee, J.; Lim, K.; Cho, J. Improved Monitoring of Wildlife Invasion through Data Augmentation by Extract–Append of a
Segmented Entity. Sensors 2022, 22, 7383.
10. Chand, P.; Lal, S. Vision-Based Detection and Classification of Used Electronic Parts. Sensors 2022, 22, 9079.
11. Kim, Y.; Cho, J. AIDM-Strat: Augmented Illegal Dumping Monitoring Strategy through Deep Neural Network-Based Spatial
Separation Attention of Garbage. Sensors 2022, 22, 8819.
12. Rykała, Ł.; Typiak, A.; Typiak, R.; Rykała, M. Application of Smoothing Spline in Determining the Unmanned Ground Vehicles
Route Based on Ultra-Wideband Distance Measurements. Sensors 2022, 22, 8334.
13. Cantero, D.; Esnaola-Gonzalez, I.; Miguel-Alonso, J.; Jauregi, E. Benchmarking Object Detection Deep Learning Models in
Embedded Devices. Sensors 2022, 22, 4205.
14. Strazdas, D.; Hintz, J.; Khalifa, A.; Abdelrahman, A.A.; Hempel, T.; Al-Hamadi, A. Robot System Assistant (RoSA): Towards
Intuitive Multi-Modal and Multi-Device Human-Robot Interaction. Sensors 2022, 22, 923.
15. Martínez-Otzeta, J.M.; Rodríguez-Moreno, I.; Mendialdua, I.; Sierra, B. RANSAC for Robotic Applications: A Survey. Sensors
2023, 23, 327.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
Human-Aware Collaborative Robots in the Wild: Coping with
Uncertainty in Activity Recognition
Beril Yalçinkaya 1,2, *, Micael S. Couceiro 1 , Salviano Pinto Soares 2,3,4 and Antonio Valente 2,5
1 Ingeniarius, Ltd., R. Nossa Sra. Conceição 146, 4445-147 Alfena, Portugal; [email protected]
2 Engineering Department, School of Sciences and Technology, University of Trás-os-Montes and Alto Douro
(UTAD), Quinta de Prados, 5000-801 Vila Real, Portugal; [email protected] (S.P.S.); [email protected] (A.V.)
3 Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro,
3810-193 Aveiro, Portugal
4 Intelligent Systems Associate Laboratory (LASI), University of Aveiro, 3810-193 Aveiro, Portugal
5 INESC TEC, Campus da Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias,
4200-464 Porto, Portugal
* Correspondence: [email protected]
Abstract: This study presents a novel approach to cope with the human behaviour uncertainty during
Human-Robot Collaboration (HRC) in dynamic and unstructured environments, such as agriculture,
forestry, and construction. These challenging tasks, which often require excessive time and labour
and are hazardous for humans, provide ample room for improvement through collaboration with
robots. However, the integration of humans in-the-loop raises open challenges due to the uncertainty
that comes with the ambiguous nature of human behaviour. Such uncertainty makes it difficult
to represent high-level human behaviour based on low-level sensory input data. The proposed
Fuzzy State-Long Short-Term Memory (FS-LSTM) approach addresses this challenge by fuzzifying
ambiguous sensory data and developing a combined activity recognition and sequence modelling
system using state machines and the LSTM deep learning method. The evaluation process compares
the traditional LSTM approach with raw sensory data inputs, a Fuzzy-LSTM approach with fuzzified
inputs, and the proposed FS-LSTM approach. The results show that the use of fuzzified inputs
significantly improves accuracy compared to traditional LSTM, and, while the fuzzy state machine
approach provides similar results to the fuzzy one, it offers the added benefits of ensuring feasible
transitions between activities with improved computational efficiency.
Keywords: human activity recognition and modelling; deep learning; human-robot collaboration;
fuzzy logic; finite state machine; long short-term memory
Citation: Yalçinkaya, B.; Couceiro, M.S.; Soares, S.P.; Valente, A. Human-Aware Collaborative Robots
in the Wild: Coping with Uncertainty in Activity Recognition. Sensors 2023, 23, 3388.
https://doi.org/10.3390/s23073388
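The fuzzification of sensory data mentioned in the abstract can be illustrated with triangular membership functions, a common choice in fuzzy logic. This is a minimal sketch under stated assumptions: the linguistic terms, their breakpoints, and the choice of a speed reading in m/s are hypothetical and are not taken from the article.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for a speed reading in m/s: (left, peak, right).
SPEED_TERMS = {
    "slow": (-1.0, 0.0, 1.5),
    "medium": (0.5, 1.5, 2.5),
    "fast": (1.5, 3.0, 4.5),
}


def fuzzify(x, terms=SPEED_TERMS):
    """Map a crisp reading to a degree of membership in each linguistic term."""
    return {name: triangular(x, *abc) for name, abc in terms.items()}
```

A reading of 1.0 m/s, for example, is partly "slow" and partly "medium" rather than belonging to a single class, which is how ambiguous sensor values can be passed to a downstream model as graded inputs.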
interaction and collaboration between humans and robots are crucial factors in achieving
ergonomic systems and enhancing the quality and efficiency of the production process [3].
Collaborative robots, also known as “co-bots”, have become increasingly prevalent
in industrial settings in recent years. However, they have also been utilized in a variety
of other domains. For example, in the healthcare field, researchers are developing robotic
walkers [4], wheelchairs [5], and elderly care robots [6]. Collaborative robots have the
potential to assist humans in heavy and dangerous tasks as well, such as construction and
search-and-rescue [7]. They can also be used in a range of industries and even in smart
home applications. These robots can come in various forms, such as manipulators [8] and
fully humanoid robots [2].
However, incorporating humans into the process presents many challenges, primarily
due to the unpredictable nature of human behaviour. This can lead to difficulties with
robots' adaptability and robustness in changing and uncertain situations and environments.
In HRC systems, robots are expected to understand human activities and intentions and, at
times, even predict future human behaviour in order to efficiently achieve the shared goal.
This can be a difficult task due to the inherent uncertainty of human behaviour.
Significant research has been dedicated to understanding human behaviour patterns
through Human Activity Recognition (HAR), which involves analysing various sensor data
to identify and detect simple and complex human activities. HAR has been applied not only
to domains related to human daily life, such as healthcare, smart home applications, and
elderly assistance [9], but also in robotics solutions where HRC is foreseen, being critical
for the robot to have awareness of human actions. Traditional machine learning methods,
such as Bayesian networks [10], random forest [11] and support vector machines [12],
have been used to understand human behaviours. In addition to understanding human
behaviour, some researchers have focused on predicting the most likely sequence of human
actions. Probabilistic methods, such as Hidden Markov Models (HMM) [13], have been
proposed to understand and predict human activities. Finite State Machines (FSM) have
also been used as a tool to model dynamic changes over time and, when combined with
fuzzy logic, to even handle uncertainty from sensor data through the use of linguistic
variables [14]. Recently, deep learning has emerged as a new trend, as it has the ability to
learn and identify complex patterns among large datasets. The major difference between
deep learning and the previously described approaches is that it offers multiple hidden
layers that are capable of feature extraction and transformation, thus significantly reducing
the workload of human designers and developers. As a result, deep learning has been used
in various other domains as well, such as image classification [15], speech recognition [16]
and so on, and several deep learning algorithms, such as convolutional neural networks
(CNNs) [17] and recurrent neural networks (RNNs) [18], have been key to improving the
accuracy and robustness of HAR systems.
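To make the idea of predicting the most likely sequence of activities with an HMM concrete, the following sketch implements standard Viterbi decoding. It is a generic illustration, not the method of any cited work: the two activities, the two observation symbols, and all probabilities in the usage example are invented for the purpose.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a sequence of observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical two-activity model driven by a coarse motion-energy reading.
STATES = ["walking", "standing"]
START = {"walking": 0.5, "standing": 0.5}
TRANS = {
    "walking": {"walking": 0.8, "standing": 0.2},
    "standing": {"walking": 0.2, "standing": 0.8},
}
EMIT = {
    "walking": {"high": 0.9, "low": 0.1},
    "standing": {"high": 0.2, "low": 0.8},
}
```

For the observation sequence `["high", "high", "low", "low"]`, `viterbi` returns `["walking", "walking", "standing", "standing"]` under these assumed probabilities.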
While these methods have shown promise, dealing with human uncertainty remains a
challenge. One of the main difficulties is the high variability of human behaviour across
different contexts, as well as the noise in the sensor data, which makes it difficult to
generalize from training data. This uncertainty problem has a negative impact on trust and
safety, which are critical measurements for any HRC system [19]: if the robot is unable to
understand or anticipate human intention, this may lead it to make wrong decisions and
even cause accidents and injuries, which will affect both acceptability and trustworthiness.
Several authors point to a panoply of solutions that reduce uncertainty by extracting
more information and processing it rapidly. However, no clear approach has been established
for constrained computing systems, which robots and other facilitators of HRC (e.g., wearable
technologies) often are. Given the complexity of, and the challenges associated with,
uncertainty in human behaviours, further research is still required to fully understand and
address this problem.
2. Literature Review
HAR has gained significant attention for its ability to detect and identify human
activities from sensor data [21]. The importance of HAR lies in its ability to handle the
uncertainty that arises from the variability in human behaviour and ambiguity in activities.
In robotics, understanding human behaviour and adapting accordingly is crucial for natural
and safe interaction and collaboration with humans [22]. This literature review will examine
the uncertainty problem in HAR and the methodologies used to address it, as well as the
challenges and open research questions in the field.
Human uncertainty poses a significant challenge in HRC from various perspectives.
One major aspect is the complex sequential decision-making required in dynamic envi-
ronments during collaborative tasks, as discussed by Osman in her study [23] on complex
dynamic control tasks. These tasks often require multiple decisions that have to accommodate
many elements of the system to achieve a desired goal, which implies that there is a
high degree of uncertainty introduced by humans regarding how they will behave in these
changing conditions and environments. This makes it a difficult task for robots to predict
and adapt. In addition to the dynamic nature of the environment and the complexity of
collaborative tasks, the variability of human physical and cognitive abilities also contributes
to their uncertainty. Human factors, such as fatigue, learning ability, and attentiveness can
significantly impact a worker’s efficiency and accuracy and even cause errors or safety
issues in HRC systems [24]. This is aligned with the work of Vuckovic et al. [25], who
highlighted the importance of human subjectivity in creating uncertainty in human behaviours.
According to the authors, individuals judge a stimulus and adapt their decisions
according to their judgments. This implies that human subjectivity has an important role
in introducing uncertainty in human behaviours, as it leads them to perceive and react to
situations differently based on their own experiences. Another important aspect in which
uncertainty plays a role is in building trust between humans and robots in collaborative
tasks. Trust is a crucial element in HRC, as it allows humans to rely on robots to safely
perform tasks together. According to Law and Scheutz [26], understanding human needs
and intentions, and effectively responding to them, is key to building trust.
Other researchers have emphasized the significant impact of human uncertainty on
proactive planning for HRC. According to Kwon et al. [27], proactive planning involves a
robot’s ability to adapt to a dynamic environment by handling uncertainty. The authors
note that the nature of the dynamic environment is not only affected by the robot’s actions
but also by human activities, which have complex temporal relationships. The uncertainty
in these activities must be considered during planning as they are not easily predictable
due to the robot’s limited observation of the environment and the humans. Therefore,
understanding and addressing uncertainty in collaborative tasks is essential for efficient
planning in HRC.
Based on the literature reviewed, it is well understood that uncertainty introduced by
humans poses a significant challenge in Human-Robot Collaboration (HRC). Therefore, a
significant amount of research has been conducted in this field with the aim of mitigating
the negative effects of uncertainty. These solutions mostly focus on the efficient and effective
inference of human behaviour as a means of addressing uncertainty within the HRC context.
One approach is the use of multimodal systems that combine different sensor types, such
as video cameras, wearables, and even ambient sensors, such as infrared motion detectors.
Video cameras are popular for HRC tasks, but they raise privacy concerns [28]. On the
other hand, wearable sensors, such as inertial measurement units (IMUs), are widely used
to cope with privacy and security concerns, but they also come with many challenges,
such as limited representativeness of similar activities. Despite these challenges, wearable
sensors are the most commonly used set of sensors in human activity monitoring. In [29],
the authors proposed to use wearable sensors, such as accelerometers and gyroscopes
worn at different positions on the human body, to capture activity data that are sampled
at regular intervals to be used in HAR. Another study designed methodologies that use
data from individual waist-worn accelerometers to identify basic daily activities, such as
running, walking and lying down [30]. These
works reported acceptable accuracy results for basic daily activities. However, they could
not show good accuracy for more complex activities, such as transitions, e.g., standing up
or sitting down. As said before, a way of improving such results would generally imply
using a larger combination of sensors, although attaching many sensors to the human body
is unfeasible and inconvenient for people’s daily activities.
HAR is often treated as a pattern recognition problem, and many works have initially
adopted machine learning techniques to recognize activities. Support Vector Machine
(SVM) [31] and Hidden Markov Model (HMM) [32] classifiers are among the most com-
monly used methods for activity recognition. For example, Azim et al. [33] used an
SVM classifier with trajectory features for activity classification and achieved an overall
accuracy of 94.90% for the KTH online database and 95.36% for the Weizmann dataset
Sensors 2023, 23, 3388
(http://www.wisdom.weizmann.ac.il/~vision/SpaceTimeActions.html (accessed on 2
February 2023)). Kellokumpu et al. [34] used HMM and affine invariant descriptors, achiev-
ing an overall accuracy of 83.00%. While these works rely on offline data, Yamato et al. [35]
used real-time sequential images and mesh features along with HMM, achieving a 90%
accuracy. However, these traditional machine learning methods often rely on carefully
designed and heuristic feature extraction methods, such as time-frequency transformation,
statistical approaches, and symbolic representation. They lack a universal or systematic
approach for effectively distinguishing human activities, and they are prone to overfitting
and may perform poorly on unseen data [36].
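As an illustration of the hand-crafted statistical features mentioned above, the following is a minimal sketch, not taken from any of the cited works. The function name `window_features` and the chosen feature set (mean, standard deviation, minimum, maximum, mean energy) are assumptions made for this example.

```python
import math


def window_features(samples):
    """Compute simple statistical features for one window of a 1-D sensor signal.

    The feature set is illustrative only; real HAR pipelines choose features
    based on domain knowledge, which is the limitation discussed in the text.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n  # population variance
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "min": min(samples),
        "max": max(samples),
        "energy": sum(x * x for x in samples) / n,  # mean squared amplitude
    }


# Hypothetical accelerometer readings for a single window.
features = window_features([1.0, 2.0, 3.0, 4.0])
```

Each window of raw data is reduced to a small fixed-length vector, which is then passed to a classifier such as an SVM. The quality of the result depends on how well the chosen features separate the activities.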
To overcome these drawbacks, ensemble classifiers have been proposed, which involve
training multiple models and combining their predictions to make a final decision. The
aim of ensemble classifiers is to improve the performance of the model by combining the
strengths of multiple models and mitigating their weaknesses [37]. Random forest is a
popular ensemble classifier that is computationally efficient and commonly used in various
domains, such as text and image classification. Random forest works by training multiple
decision trees and combining their predictions through a voting procedure. This method
is effective in addressing overfitting issues and has been shown to enhance accuracy by
combining the outcome of each different classifier [38].
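The voting procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions: the "trees" are hypothetical hand-written decision stumps with made-up thresholds, and the training stage of a random forest (bootstrap sampling and random feature selection) is omitted entirely.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of models."""
    return Counter(predictions).most_common(1)[0][0]


def stump(feature, threshold, below, above):
    """Build a one-split decision tree (hypothetical, thresholds are illustrative)."""
    def predict(sample):
        return below if sample[feature] < threshold else above
    return predict


# A toy "forest" of three stumps, each looking at a different feature.
forest = [
    stump("std", 0.5, "lying", "walking"),
    stump("energy", 4.0, "walking", "running"),
    stump("mean", 1.0, "lying", "walking"),
]

sample = {"mean": 1.2, "std": 0.8, "energy": 5.0}
votes = [tree(sample) for tree in forest]  # one prediction per tree
label = majority_vote(votes)               # the final decision is the most frequent vote
```

Here one stump votes for a different label than the other two, and the majority decision overrides it. This is the mechanism by which an ensemble can reduce the effect of individual model errors.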
Both traditional machine learning and ensemble classifier methods rely heavily on
human experience and domain knowledge for feature extraction in HAR. However,
such features may not be effective in more general environments, which lowers the chance
of building an efficient recognition system. Additionally, the features learned by these
methods are shallow, such as statistical information, and can only be used for low-level
activity identification, such as walking or running, making it hard to detect high-level or
context-aware activities, such as cooking. Moreover, in real-life scenarios, activity data
arrive as a stream and require robust online learning, whereas many of these traditional
methods learn from static data [39]. Deep learning methods, on the other hand, have
been successful in learning complex activities due to their ability to learn features directly
from the raw data hierarchically by performing nonlinear transformations. The layer-by-
layer structure of deep models allows learning from simple to abstract features. Advances
in computer resources have made it possible to use deep models to learn features from
complex data from single or multimodal sensory systems. It is worth highlighting that deep
neural networks can be detached and flexibly composed into a unified network, allowing
for the integration of various deep learning techniques, such as deep transfer learning, deep
active learning, and deep attention mechanism. This enables the integration of various
effective solutions that can improve the performance of the recognition system [36].
Popular deep learning techniques include deep neural networks (DNN), convolutional
neural networks (CNN), recurrent neural networks (RNN), and long short-term memory
(LSTM) networks [28]. DNN are a type of Artificial Neural Network (ANN) that are char-
acterized by a larger number of hidden layers. In contrast to traditional ANN, which often
have only a few hidden layers, DNN can learn from large datasets more effectively. Ham-
merla et al. [40] adopted a five-hidden-layer DNN to perform automatic feature learning
and classification. Vepakomma et al. [41] fed extracted hand-engineered features obtained
from the sensors into a DNN model. CNNs are a type of neural network that exploit
three key concepts: sparse interactions, parameter sharing, and equivariant representations.
CNNs have achieved successful results in HAR applications by exploiting local dependency,
that is, the tendency of nearby signals in a time series to be correlated. CNNs have also
shown the ability to handle variations in pace or frequency [39]. Several studies, such
as [42,43], have applied one-dimensional (1D) CNNs to the individual univariate time-series
signals for temporal feature extraction. Conventional 1D CNNs have a fixed kernel size,
which limits their ability to discover signal fluctuations over different temporal ranges. To
address this, Lee et al. [17] combined multiple CNN structures of different kernel sizes to
obtain the temporal features from different time scales. Nevertheless, this approach would
demand more computational resources as well. Various deep learning methods have been
applied to temporal information including RNN. While traditional RNN cells suffer from
vanishing gradient problems, LSTM, as a specific type of RNN, overcomes this issue. A
sliding window is generally used to divide the raw data into individual pieces, which
are then used to feed LSTM. In a typical LSTM-based temporal feature extraction, it is
essential to carefully tune the hyper-parameters, such as the length and moving step of the
sliding window. Some researchers adopted the bidirectional LSTM (Bi-LSTM) structure for
extracting temporal dynamics from both forward and backward directions in HAR [44]. On
the other hand, Guan and Plötz have combined multiple LSTM networks in an ensemble
approach and obtained superior results [45].
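The sliding-window segmentation mentioned above, with its two hyper-parameters (window length and moving step), can be sketched as follows. The function name `sliding_windows` and the example values are assumptions for illustration. Incomplete trailing windows are dropped here, which is one common choice among several.

```python
def sliding_windows(signal, length, step):
    """Split a sequence into fixed-length windows that advance by `step` samples.

    A step smaller than the length yields overlapping windows.
    Trailing samples that do not fill a whole window are discarded.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(signal) - length
    return [signal[i:i + length] for i in range(0, last_start + 1, step)]


# Ten hypothetical samples, windows of 4 samples with 50% overlap.
windows = sliding_windows(list(range(10)), length=4, step=2)
```

Each resulting window would be one input sequence for a model such as an LSTM. A longer window gives the model more context but delays the decision, while a smaller step produces more, and more strongly overlapping, training examples.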
Another trend in HAR is combining different deep learning approaches by developing
hybrid models to exploit their different aspects. For instance, Ordóñez and Roggen have
combined CNN and LSTM for both local and global temporal feature extraction [46]. The
idea is to exploit CNN’s ability to capture the spatial relationship, while LSTM can extract
the temporal relationship. According to the reported results, CNN combined with LSTM
outperforms CNN combined with dense layers. In contrast, the authors of [47] presented a
hybrid model for HAR that first uses random forest to classify the abstract activity
as either static or moving. For static activities the authors used SVM, while
for moving activities they adopted a 1D CNN. Even though the overall accuracy of the
system was 97.71%, it was evaluated only on a dataset and has not been tested in
real environments or at runtime.
Despite these models having shown significant accuracy in HAR, the uncertainty
of the activities remains a challenge due to several reasons, such as noise in sensors and
human factors. Several studies adopted different methodologies to investigate the degree of
certainty, or uncertainty, of a given performed activity. One of the methods adopted was a
dynamic Bayesian mixture model (DBMM), which is a type of ensemble probabilistic model
that combines the likelihood of multiple classifiers into a single form by attaching different
weights to each classifier. DBMM uses an uncertainty measure, such as the posterior
probability, as a confidence level, which is updated during the online classification [48].
Therefore, the classifier with the highest confidence level is the outcome of the classification
process. In [49], the authors presented an architecture that recognises seven different
actions performed by athletes using single-channel electromyography (EMG) combined
with positional data, and benchmarked ANN, LSTM and DBMM classifiers. According to
the results, ANN and LSTM models were not the most reliable choice to identify these
actions due to the low number of trials in the dataset. On the other hand, DBMM led to
better results, with 96.47% accuracy and 80.54% F1-score. Similarly, in [50], human daily
activities were recognized by using DBMM. The authors proposed a set of spatio-temporal
features, including geometrical, energy-based and frequency-domain features, to represent
the different daily activities which were then fed into DBMM. The overall classification
performance for DBMM and LSTM, in terms of precision and recall, was 86.63% and
85.01%, respectively.
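To make the idea of a weighted probabilistic ensemble with an online-updated confidence more concrete, the following is a minimal sketch in the spirit of DBMM. It is not the formulation of [48]. Two things are assumptions of this example: the weights are fixed (in the literature they are typically derived from a per-classifier uncertainty measure), and the previous posterior is used directly as the prior for the next time step.

```python
def mixture_step(prior, classifier_probs, weights):
    """One update of a weighted mixture of classifiers.

    prior            -- belief over classes from the previous time step
    classifier_probs -- one probability vector over classes per base classifier
    weights          -- one non-negative weight per classifier (assumed to sum to 1)
    """
    n_classes = len(prior)
    # Weighted sum of the classifiers' outputs for each class.
    mixture = [
        sum(w * probs[c] for w, probs in zip(weights, classifier_probs))
        for c in range(n_classes)
    ]
    # Combine with the prior and normalise so the result sums to 1.
    unnormalised = [p * m for p, m in zip(prior, mixture)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("prior and mixture have no overlapping support")
    return [u / total for u in unnormalised]


# Two classes, two hypothetical base classifiers with illustrative outputs.
prior = [0.5, 0.5]
classifier_probs = [[0.8, 0.2], [0.6, 0.4]]
weights = [0.75, 0.25]

posterior = mixture_step(prior, classifier_probs, weights)
# Feeding the posterior back in as the next prior models the online update.
posterior_next = mixture_step(posterior, classifier_probs, weights)
```

With consistent evidence across time steps, the belief in the favoured class grows from one step to the next. This is how repeated observations can increase confidence during online classification.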
Other studies have explored fuzzy-based architectures in HAR, which allow for the
incorporation of uncertainty in the decision-making process. While traditional probabilistic
models represent the likelihood of an event using crisp values, fuzzy-based models use
fuzzy membership values to represent the degree of partial truth by providing semantic
expressiveness through the use of linguistic variables to handle uncertain data. Karthigasri
and Sornam [20] fuzzified the input features to be used in a fuzzy FSM (FFSM), which
is a methodology used to model dynamic sequences of events. The reported results of
the approach outperformed decision trees, K-nearest neighbors, SVM, Gaussian naïve
Bayes and quadratic discriminant analysis. Mohmed et al. [14] proposed a HAR architec-
ture using data obtained from low-level sensory devices by enhancing FFSM with deep
learning methods, namely LSTM and CNN. Although both models achieved high
accuracy, the CNN-FFSM model showed more robust and reliable performance when
applied to a larger dataset, whereas LSTM-FFSM outperformed CNN-FFSM in simple
scenarios with a dataset covering a short period. Despite the paper presenting promising results for
HAR, the methodology is presented at a high level, lacking relevant technical and
scientific details, which makes it impossible for the reader to understand it and fully assess
its reproducibility.
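The fuzzification of input features with linguistic variables, as discussed in this paragraph, can be illustrated with a short sketch. It does not reproduce the method of [14] or [20]. The triangular membership shape, the linguistic terms ("low", "medium", "high") and their breakpoints are all assumptions chosen for this example.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


def fuzzify(value, terms):
    """Map a crisp value to a degree of membership for each linguistic term."""
    return {name: triangular(value, *abc) for name, abc in terms.items()}


# Hypothetical linguistic terms for a normalised sensor feature.
terms = {
    "low": (-1.0, 0.0, 1.0),
    "medium": (0.0, 1.0, 2.0),
    "high": (1.0, 2.0, 3.0),
}

memberships = fuzzify(0.5, terms)
```

A crisp reading can belong partially to several terms at once, which is the partial truth that fuzzy models use to represent uncertain data. A fuzzy finite state machine would then use such membership degrees when evaluating its state transitions.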
In conclusion, the literature reviewed in this study highlights the importance for robots
to understand human activities and cope with uncertainty in HRC applications. To this
end, a variety of studies have been conducted in this field to understand human behaviour
by exploring HAR architectures. However, it is clear that there is still a need for further
research in this area, not only to measure human uncertainty during collaborative
tasks with robots at runtime, but also to use such knowledge to adapt accordingly.