Per-Pixel Classification is Not All You Need for Semantic Segmentation

Cheng, Bowen; Schwing, Alexander G.; Kirillov, Alexander

arXiv:2107.06278 (cs)

[Submitted on 13 Jul 2021 (v1), last revised 31 Oct 2021 (this version, v2)]

Abstract: Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
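To make the idea of "a set of binary masks, each associated with a single global class label prediction" concrete, the sketch below shows one plausible way to collapse such predictions into an ordinary semantic map: weight every mask by its class probabilities, sum over masks, and take the per-pixel argmax. The function name `semantic_map`, the input layout (N masks, K classes plus a trailing "no object" column), and the toy numbers are assumptions made for illustration and are not taken from the abstract; this is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def semantic_map(class_logits, mask_logits):
    """Combine N (class, mask) predictions into an H x W map of class indices.

    class_logits: N x (K + 1) nested list; the last column is assumed to be
                  a "no object" score and is dropped before scoring.
    mask_logits:  N x H x W nested list of per-pixel mask logits.
    """
    num_classes = len(class_logits[0]) - 1
    class_probs = [softmax(row)[:num_classes] for row in class_logits]
    height, width = len(mask_logits[0]), len(mask_logits[0][0])

    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum over masks of
            # P(mask has class c) * P(pixel belongs to mask).
            scores = [
                sum(p[c] * sigmoid(m[y][x])
                    for p, m in zip(class_probs, mask_logits))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        labels.append(row)
    return labels


# Toy input: two masks, two classes, a 1 x 2 image.
toy_class_logits = [[4.0, 0.0, 0.0],   # mask 0 is most likely class 0
                    [0.0, 4.0, 0.0]]   # mask 1 is most likely class 1
toy_mask_logits = [[[5.0, -5.0]],      # mask 0 covers the left pixel
                   [[-5.0, 5.0]]]      # mask 1 covers the right pixel
toy_result = semantic_map(toy_class_logits, toy_mask_logits)
```

Because the class label is attached to a whole mask rather than to each pixel, the same pair of outputs could also be read as instance or panoptic segments by keeping the masks separate instead of summing them.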

Comments:	NeurIPS 2021, Spotlight.
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2107.06278 [cs.CV]
	(or arXiv:2107.06278v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2107.06278

Submission history

From: Bowen Cheng
[v1] Tue, 13 Jul 2021 17:59:50 UTC (1,246 KB)
[v2] Sun, 31 Oct 2021 17:41:44 UTC (1,266 KB)
