A Multimodal Memes Classification: A Survey and Open Research Issues

Afridi, Tariq Habib; Alam, Aftab; Khan, Muhammad Numan; Khan, Jawad; Lee, Young-Koo

doi:10.1007/978-3-030-66840-2_109

arXiv:2009.08395 (cs)

[Submitted on 17 Sep 2020]

Abstract: Memes combine graphics and overlaid text so that together they convey a concept that becomes unclear if either one is absent. They spread mostly on social media platforms, in the form of jokes, sarcasm, motivational messages, and so on. After the success of BERT in Natural Language Processing (NLP), researchers turned to Visual-Linguistic (VL) multimodal problems such as memes classification, image captioning, Visual Question Answering (VQA), and many more. Unfortunately, many memes are uploaded to social media platforms each day that need automatic censoring to curb misinformation and hate. Recently, this issue has attracted the attention of researchers and practitioners. State-of-the-art methods that perform well on other VL datasets tend to fail on memes classification. In this context, this work aims to conduct a comprehensive study of memes classification and, more generally, of VL multimodal problems and cutting-edge solutions. We propose a generalized framework for VL problems. We cover the early and next-generation works on VL problems. Finally, we identify and articulate several open research issues and challenges. To the best of our knowledge, this is the first study that presents a generalized view of advanced classification techniques for memes classification. We believe this study presents a clear road-map for the Machine Learning (ML) research community to implement and enhance memes classification techniques.

Comments:	This is a survey paper on recent state-of-the-art VL models that can be used for memes classification. It has 15 pages and 2 figures.
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
Cite as:	arXiv:2009.08395 [cs.CV]
	(or arXiv:2009.08395v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2009.08395
Journal reference:	SCA 2020
Related DOI:	https://doi.org/10.1007/978-3-030-66840-2_109

Submission history

From: Tariq Habib Afridi Mr.
[v1] Thu, 17 Sep 2020 16:13:21 UTC (480 KB)
