Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation

Tan, Jiaqi; Zheng, Xu; Liu, Yang

arXiv:2505.12861 (cs)

[Submitted on 19 May 2025]

Abstract: Multi-modal semantic segmentation (MMSS) faces significant challenges in real-world scenarios due to dynamic environments, sensor failures, and noise interference, creating a gap between theoretical models and practical performance. To address this, we propose a two-stage framework called RobustSeg, which enhances multi-modal robustness through two key components: the Hybrid Prototype Distillation Module (HPDM) and the Representation Regularization Module (RRM). In the first stage, RobustSeg pre-trains a multi-modal teacher model using complete modalities. In the second stage, a student model is trained with random modality dropout while learning from the teacher via HPDM and RRM. HPDM transforms features into compact prototypes, enabling cross-modal hybrid knowledge distillation and mitigating bias from missing modalities. RRM reduces representation discrepancies between the teacher and student by optimizing functional entropy through the log-Sobolev inequality. Extensive experiments on three public benchmarks demonstrate that RobustSeg outperforms previous state-of-the-art methods, achieving improvements of +2.76%, +4.56%, and +0.98%, respectively. Code is available at: this https URL.
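
The second-stage idea described in the abstract (random modality dropout for the student, plus distillation on compact class prototypes) can be illustrated with a minimal, hedged sketch. This is not the authors' implementation: the function names, the `p_drop` parameter, the use of masked average pooling to form prototypes, and the mean-squared-error distillation loss are all assumptions made for illustration. The "hybrid" cross-modal pairing of HPDM and the RRM term are not reproduced here.

```python
import random


def drop_modalities(inputs, rng, p_drop=0.5):
    """Randomly replace modalities with None, always keeping at least one.

    `inputs` maps a (hypothetical) modality name to its data.
    """
    names = list(inputs)
    kept = [n for n in names if rng.random() >= p_drop]
    if not kept:
        kept = [rng.choice(names)]
    return {n: (inputs[n] if n in kept else None) for n in names}


def class_prototypes(features, labels, num_classes):
    """Average the per-pixel feature vectors of each class (assumed pooling).

    Returns one prototype per class, or None for classes with no pixels.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    return [[s / c for s in row] if c else None for row, c in zip(sums, counts)]


def prototype_distill_loss(teacher_protos, student_protos):
    """Mean squared error over classes present in both sets (assumed loss)."""
    total, n = 0.0, 0
    for t, s in zip(teacher_protos, student_protos):
        if t is None or s is None:
            continue
        total += sum((a - b) ** 2 for a, b in zip(t, s)) / len(t)
        n += 1
    return total / n if n else 0.0


# Toy usage: three "pixels" with 2-D features and two classes.
rng = random.Random(0)
student_inputs = drop_modalities({"rgb": "rgb_data", "depth": "depth_data"}, rng)
labels = [0, 0, 1]
teacher = class_prototypes([[1.0, 0.0], [3.0, 2.0], [0.0, 1.0]], labels, 2)
student = class_prototypes([[1.0, 1.0], [1.0, 1.0], [0.0, 1.0]], labels, 2)
loss = prototype_distill_loss(teacher, student)
```

Working on per-class prototypes rather than full feature maps keeps the distillation target small and independent of spatial resolution, which is one plausible reading of "compact prototypes" in the abstract.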

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2505.12861 [cs.CV]
	(or arXiv:2505.12861v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2505.12861

Submission history

From: Xu Zheng
[v1] Mon, 19 May 2025 08:46:03 UTC (5,626 KB)
