Autonomous Navigation Algorithm for Planetary Rovers Based on Multimodality
Lal Priya P S
College of Engineering Trivandrum
APJ Abdul Kalam Technological University
Thiruvananthapuram, Kerala, India
[email protected]
Abstract—Planetary rovers are essential robotic exploration devices that are vital for the investigation of extraterrestrial environments, where they analyze the atmospheric conditions and terrain. These rovers must endure intense acceleration, withstand severe environmental conditions, and maintain their functionality for a prolonged period. These robotic explorers operate in distant and hostile environments where human intervention is not feasible. To surmount these challenges, they are required to function autonomously and rely on state-of-the-art technologies. This article proposes two multimodal architectures, one fusing object detection and semantic segmentation, and the other fusing monocular depth estimation and semantic segmentation, to drive the rover autonomously. The proposed multimodal architectures are robust and enable the rover to navigate autonomously in highly uncertain and unpredictable environments. The results show that implementing the proposed architectures can successfully achieve the rover's autonomous navigation.

Index Terms—Multimodal, semantic segmentation, monocular depth estimation, object detection, stereo, depth slicing, Hadamard product.

I. INTRODUCTION

A rover is a device for exploring planetary surfaces, specifically crafted to traverse the challenging terrains of planets, moons, and other celestial bodies within our solar system. The inception of rover missions is attributable to the quest to expand our exploration beyond the Earth and unravel the mysteries of the universe. Rovers are crafted with precise engineering and are equipped with a plethora of instruments. They play a crucial role in planetary exploration by providing valuable insights into distant worlds. These intrepid machines have found a home on planets such as Mars, where NASA's trailblazing rovers Spirit, Opportunity, Curiosity, and Perseverance have etched their names into the archives of space exploration [1].

Since the rovers are deployed using spacecraft, they must possess a compact design. Moreover, following deployment, these rovers encounter many challenges, including uneven terrains, extreme temperatures, high levels of acceleration, vast pressure variations, dust exposure, corrosion, and cosmic ray exposure. To traverse challenging terrains, the rovers are equipped with wheels specially designed for the purpose [2]. The onboard computing system of the rover must likewise be capable of withstanding these conditions. In addition, real-time control of these rovers from the Earth is challenging because of the large communication delays. Hence, the rover requires decision-making capabilities with minimal interaction from the Earth, especially for tasks such as navigation and data acquisition [3].

Autonomous decision-making is the process by which the rover makes decisions independently, without any human intervention [4]. By making the rover autonomous, its productivity and lifespan can be increased, as it can navigate by itself in unforeseen circumstances [5]. To overcome these challenges, two multimodality-based architectures that use state-of-the-art technologies to establish autonomy are proposed in this work. The first architecture combines depth estimation, semantic segmentation, and object detection models to accurately determine the next phase of the rover's trajectory. Although more complex, it provides a more precise maneuver for the rover. The second architecture is much simpler for the onboard system to process and uses more advanced state-of-the-art deep learning methods.

The contributions of this paper encompass the development of the following:

• Multimodal deep learning-based architectures designed for autonomous navigation of rovers.
• Novel algorithms for the autonomous navigation of rovers.
• Architectures that offer a completely autonomous navigation system, developed by incorporating state-of-the-art techniques in depth estimation, semantic segmentation, and object detection to effectively handle the navigation.

This article is organized as follows. Section II focuses on past developments and related works. Section III discusses the
2) Semantic Segmentation Module: Segmentation plays a crucial role in identifying the boundaries and shapes of obstacles, craters, and various other features on the lunar or Martian surface. The module essentially equips the rover with a sophisticated set of eyes that can analyze its surroundings in detail. Although the rover is already provided with an obstacle detection system, segmentation on top of obstacle detection enables much more precise maneuvers when the situation demands. This mapped information becomes invaluable for the rover to plan its route effectively, ensuring a safe and smooth journey even in challenging landscapes.
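As an illustration only (the paper does not prescribe a specific post-processing step), a per-pixel class mask from any off-the-shelf segmentation network can be reduced to obstacle boundary polygons with OpenCV; the class index OBSTACLE below is a hypothetical label for rocks and craters.

import cv2
import numpy as np

OBSTACLE = 1  # hypothetical class index for rocks/craters in the mask

def obstacle_contours(mask: np.ndarray):
    """Extract obstacle boundary polygons from a per-pixel class mask."""
    binary = (mask == OBSTACLE).astype(np.uint8)   # 1 where obstacle pixels lie
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours                                # one polygon per obstacle

Each returned polygon traces the boundary of one obstacle, which is exactly the shape information the path planner consumes.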
3) Object Detection Module: The proposed architecture employs a state-of-the-art object detection algorithm that detects obstacles and valleys in the lunar or Martian terrain through bounding boxes. Although the boxes provide only a rough estimate of the specifics of the environment, this bounding box information, along with the stereo depth estimator, is used by the navigation algorithm to estimate the positions of the features. This enables the navigation algorithm to accurately determine an optimal path for navigation.
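A minimal sketch of how a detection box can be fused with a dense depth map to localize a feature, assuming depth_map holds per-pixel metric depth (e.g., from stereo triangulation); taking the median inside the box suppresses background pixels near the box edges.

import numpy as np

def feature_distance(depth_map: np.ndarray, box) -> float:
    """Estimate the distance to a detected feature from its bounding box.

    box = (x_i, y_i, x_j, y_j) in pixel coordinates, as output by the detector.
    """
    x_i, y_i, x_j, y_j = box
    patch = depth_map[y_i:y_j, x_i:x_j]   # depth values inside the box
    return float(np.median(patch))        # median is robust to edge pixels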
4) Path Planning Algorithm: The proposed path planning algorithm uses a multimodal approach, taking in information from the segmentation, depth estimation, and object detection modules. The object detection module, together with the depth estimation module, enables the rover to identify the distances of various features on the planetary surface, while the segmentation module enables precise identification of the boundaries of those features. Based on the identified features, and also accounting for the rover dimensions, the algorithm identifies the best next step for navigating without encountering any obstacles. Furthermore, for intricate maneuvers, the system takes into account the information from the segmented image of the terrain to identify a safe and optimal maneuver.

Algorithm 1 Learning-based navigation algorithm I
1: Input: Bounding box coordinates (x_i, y_i, x_j, y_j) and classes cls_i, segmented image, rover dimensions, threshold limit for obstacle avoidance distance D_th, clearance Clr
2: for each cls_i do
3:   Calculate center point (X_i, Y_i)
4:   Calculate depth from rover to (X_i, Y_i) using stereo vision
5: end for
6: if distance of obstacle in front D_f ≤ D_th then
7:   for each cls_i do
8:     Calculate distance D_bt(i) between obstacles using bounding boxes and stereo vision
9:     if D_bt(i) > rover dimension + Clr then
10:      Flag Traversable_i
11:    end if
12:  end for
13:  for each Traversable_i do
14:    TrPath_i = Path(distance, angle of rotation)
15:  end for
16:  OptimizedPath = Min(TrPath)
17:  Traverse OptimizedPath
18: end if
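The sketch below is a condensed Python rendering of Algorithm 1 under simplifying assumptions: boxes are axis-aligned detector outputs, depth_map is metric, the focal length F_PX is an assumed camera constant used to convert pixel gaps to meters, and the path cost is reduced to lateral deviation from the current heading.

import numpy as np

F_PX = 700.0  # assumed focal length in pixels (camera-specific)

def lateral_gap_m(box_a, box_b, depth_m):
    """Metric width of the free gap between two obstacle boxes at depth depth_m."""
    gap_px = box_b[0] - box_a[2]               # left edge of B minus right edge of A
    return max(gap_px, 0) * depth_m / F_PX

def navigate_step(boxes, depth_map, rover_width, d_th, clearance, img_cx):
    """One decision step in the spirit of Algorithm 1 (illustrative only)."""
    if not boxes:
        return None
    # Steps 2-5: depth from the rover to each obstacle's center point
    boxes = sorted(boxes, key=lambda b: b[0])  # order obstacles left to right
    depths = [float(depth_map[(b[1] + b[3]) // 2, (b[0] + b[2]) // 2])
              for b in boxes]
    # Step 6: react only when the nearest obstacle is inside the threshold
    if min(depths) > d_th:
        return None                            # keep the current heading
    # Steps 7-12: flag gaps wide enough for the rover plus clearance
    candidates = []
    for (b_a, d_a), (b_b, d_b) in zip(zip(boxes, depths),
                                      zip(boxes[1:], depths[1:])):
        gap = lateral_gap_m(b_a, b_b, (d_a + d_b) / 2.0)
        if gap > rover_width + clearance:
            mid_px = (b_a[2] + b_b[0]) // 2    # image column of the gap midpoint
            candidates.append((mid_px, (d_a + d_b) / 2.0))
    if not candidates:
        return None                            # no traversable gap found
    # Steps 13-17: cheapest path = least lateral deviation from the heading
    costs = [abs(mid - img_cx) for mid, _ in candidates]
    return candidates[int(np.argmin(costs))]

Returning None signals either that no avoidance is needed yet or that no gap is traversable, in which case a separate recovery behavior would take over.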
3) Depth Slicing Module: To obtain a traversable region estimate, the Hadamard product [14] of the depth map and the semantic segmentation mask is first taken; this yields a matrix in which each element is the product of the corresponding elements of the input matrices. Specifically, if A represents the depth map matrix and B represents the semantic segmentation mask matrix, then the Hadamard product C is calculated element-wise as in (1).

C_ij = A_ij × B_ij    (1)
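Equation (1) is a single element-wise multiplication in NumPy; the short sketch below also shows the subsequent slicing of the masked depth map into discrete range bands, with the band edges chosen arbitrarily for illustration rather than taken from this work.

import numpy as np

def traversable_depth(depth: np.ndarray, seg_mask: np.ndarray) -> np.ndarray:
    """Eq. (1): Hadamard product C_ij = A_ij * B_ij.

    seg_mask is 1 on traversable pixels and 0 elsewhere, so the product
    retains depth only where the terrain is drivable.
    """
    return depth * seg_mask                  # NumPy's * is already element-wise

# Depth slicing: bucket the masked depth into discrete range bands
bands = [0.0, 0.25, 0.5, 0.75, 1.0]          # illustrative relative-depth edges
C = traversable_depth(np.random.rand(480, 640),
                      (np.random.rand(480, 640) > 0.5).astype(np.float32))
slices = [(lo < C) & (C <= hi) for lo, hi in zip(bands, bands[1:])]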
1) Mixing Datasets for Zero-shot Cross-dataset Transfer: The MiDaS model was used to estimate the depth of the objects in the lunar dataset, as shown in Fig. 3. As the distance of the obstacle varies, the model produces a corresponding inverse relative depth map, also shown in Fig. 3. Since the MiDaS model outputs inverse relative depth values, fine calibration and curve fitting are required to calculate the exact metric depth. Since only a relative scale is available, a close approximation threshold is set for the navigation system.
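One way to realize this calibration, sketched under the assumption that a handful of reference distances are available: fit a scale s and shift t so that metric depth is approximated by 1 / (s * d_rel + t). The calibration pairs below are made up for illustration.

import numpy as np
from scipy.optimize import curve_fit

def inv_depth_model(d_rel, s, t):
    """Map MiDaS inverse relative depth to metric depth: 1 / (s*d_rel + t)."""
    return 1.0 / (s * d_rel + t)

# Hypothetical calibration pairs: MiDaS output vs. measured distance (m)
d_rel = np.array([0.9, 0.6, 0.4, 0.2])
d_metric = np.array([1.1, 1.8, 2.9, 6.0])

(s, t), _ = curve_fit(inv_depth_model, d_rel, d_metric, p0=(1.0, 0.0))
estimate = inv_depth_model(0.5, s, t)  # approximate metric depth for a new pixel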
1) Object Detection on Lunar Surface: The proposed object detection architecture utilizes a YOLOv8 model trained extensively over multiple epochs to accurately identify craters, rocks, and other artifacts on the lunar surface under various lighting conditions, ranging from well-lit scenarios to darker environments. The training was conducted using a combination of real and synthetically generated lunar images. The results displayed in Fig. 5 demonstrate that the module effectively detects and marks the craters, rocks, and artifacts found on the lunar surface using bounding boxes.
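For reference, the Ultralytics API reduces such a training loop to a few lines; the dataset YAML, epoch count, and test image below are placeholders rather than the exact configuration used in this work.

from ultralytics import YOLO

model = YOLO("yolov8n.pt")                              # pretrained starting weights
model.train(data="lunar.yaml", epochs=100, imgsz=640)   # hypothetical dataset config
results = model.predict("lunar_scene.png", conf=0.25)   # detect craters/rocks
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)                  # class id, confidence, corners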
Fig. 7. Semantic segmentation on lunar images
Fig. 9. Disparity map

Fig. 10. Depth map

... signal. This design ensures robust and optimal control for applications in planetary exploration devices.
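Disparity and depth maps like those in Figs. 9 and 10 are commonly computed from a rectified stereo pair with OpenCV's semi-global matcher, with depth following from Z = f·B/d; the focal length and baseline below are assumed values, not the rover's calibration.

import cv2
import numpy as np

F_PX, BASELINE_M = 700.0, 0.12   # assumed focal length (px) and baseline (m)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM is 16x fixed point

depth = np.where(disparity > 0, F_PX * BASELINE_M / disparity, 0.0)  # Z = f*B/d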
V. CONCLUSIONS AND FUTURE SCOPE

This article introduces two autonomous navigation architectures designed for planetary rovers. One approach integrates stereo depth estimation with advanced object detection and segmentation models to navigate unfamiliar terrains, while the other combines monocular depth estimation with semantic segmentation for near-optimal navigation across extraterrestrial landscapes. These multimodal architectures leverage cutting-edge deep learning algorithms, potentially advancing autonomous navigation and control for planetary rovers. Future endeavors will involve constructing a prototype rover hardware setup to validate the proposed architectures experimentally.

REFERENCES

[1] B. Rothrock, R. Kennedy, C. Cunningham, J. Papon, M. Heverly, and M. Ono, "SPOC: Deep learning-based terrain classification for Mars rover missions," in Proceedings of AIAA SPACE 2016, 2016, pp. 1–12.
[2] Y. Kuroda, T. Teshima, Y. Sato, and T. Kubota, "Mobility performance evaluation of planetary rover with similarity model experiment," in Proceedings of the 2004 IEEE International Conference on Robotics and Automation (ICRA '04), vol. 2, 2004, pp. 2098–2103.
[3] L. Tai, S. Li, and M. Liu, "A deep-network solution towards model-less obstacle avoidance," in Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 2759–2764.
[4] J. Choi, K. Park, M. Kim, and S. Seok, "Deep reinforcement learning of navigation in a complex and crowded environment with a limited field of view," in Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 5993–6000.
[5] N. Gadkar, S. Das, S. Chakraborty, and S. K. Mishra, "Static obstacle avoidance for rover vehicles using model predictive controller," in Proceedings of the 2022 International Conference on IoT and Blockchain Technology (ICIBT), 2022, pp. 1–6.
[6] M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena, "From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots," in Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 1527–1533.
[7] A. Cumani and A. Guiducci, "Fast stereo-based visual odometry for rover navigation," WSEAS Transactions on Circuits and Systems, vol. 7, no. 7, 2008.
[8] D. M. Helmick, Y. Cheng, D. S. Clouse, L. H. Matthies, and S. I. Roumeliotis, "Path following using visual odometry for a Mars rover in high-slip environments," in Proceedings of the 2004 IEEE Aerospace Conference, 2004, pp. 772–789.
[9] H. Rizk, A. Chaibet, and A. Kribèche, "Model-based control and model-free control techniques for autonomous vehicles: A technical survey," Applied Sciences, vol. 13, no. 11, 2023.
[10] S. Guo, J. Guo, and C. Bai, "Semi-direct visual odometry based on monocular depth estimation," in Proceedings of the 2019 IEEE International Conference on Unmanned Systems (ICUS), 2019, pp. 720–724.
[11] L. Gao, J. Ding, W. Liu, H. Piao, Y. Wang, X. Yang, and B. Yin, "A vision-based irregular obstacle avoidance framework via deep reinforcement learning," in Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 9262–9269.
[12] X. Tian and H. Ju, "Modeling and simulation for lunar rover based on terramechanics and multibody dynamics," in Proceedings of the 32nd Chinese Control Conference, 2013, pp. 8687–8692.
[13] Y. F. Chen, M. Everett, M. Liu, and J. P. How, "Socially aware motion planning with deep reinforcement learning," in Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 1343–1350.
[14] W. Feng, L. Ding, R. Zhou, C. Xu, H. Yang, H. Gao, G. Liu, and Z. Deng, "Learning-based end-to-end navigation for planetary rovers considering non-geometric hazards," IEEE Robotics and Automation Letters, 2023.
[15] "Artificial lunar landscape dataset," https://www.kaggle.com/datasets/romainpessia/artificial-lunar-rocky-landscape-dataset, accessed: 2023-09-30.
[16] NASA Ames Research Center, "Polar stereo dataset," https://ti.arc.nasa.gov/dataset/IRG-PolarDB, Nov. 1, 2017.