FVQ: A Large-Scale Dataset and A LMM-based Method for Face Video Quality Assessment

Wu, Sijing; Li, Yunhao; Xu, Ziwen; Gao, Yixuan; Duan, Huiyu; Sun, Wei; Zhai, Guangtao

arXiv:2504.09255 (cs)

[Submitted on 12 Apr 2025]

Abstract: Face video quality assessment (FVQA) deserves to be explored in addition to general video quality assessment (VQA), as face videos are the primary content on social media platforms and the human visual system (HVS) is particularly sensitive to human faces. However, FVQA is rarely explored due to the lack of large-scale FVQA datasets. To fill this gap, we present the first large-scale in-the-wild FVQA dataset, FVQ-20K, which contains 20,000 in-the-wild face videos together with corresponding mean opinion score (MOS) annotations. Along with the FVQ-20K dataset, we further propose a specialized FVQA method named FVQ-Rater to achieve human-like rating and scoring for face videos, which is the first attempt to explore the potential of large multimodal models (LMMs) for the FVQA task. Concretely, we extract multi-dimensional features, including spatial features, temporal features, and face-specific features (i.e., portrait features and face embeddings), to provide comprehensive visual information, and use LoRA-based instruction tuning to achieve quality-specific fine-tuning. FVQ-Rater shows superior performance on both the FVQ-20K and CFVQA datasets. Extensive experiments and comprehensive analysis demonstrate the significant potential of the FVQ-20K dataset and the FVQ-Rater method in promoting the development of FVQA.
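
For readers unfamiliar with LoRA, the fine-tuning technique named in the abstract, the following is a minimal sketch of the general idea only: a frozen weight matrix W is augmented with a trainable low-rank update, giving an effective weight W + (alpha / r) * B A. It is not the authors' implementation. The function names, shapes, and values (`matmul`, `lora_linear`, `alpha`) are illustrative assumptions, and nothing here reflects how FVQ-Rater applies LoRA inside its LMM.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_linear(x, w, a, b, alpha):
    """Apply a frozen weight plus a scaled low-rank update to input rows.

    Shapes (all hypothetical): x is (n, d_in), w is (d_out, d_in),
    a is (r, d_in), b is (d_out, r). Only a and b would be trained;
    w stays frozen.
    """
    r = len(a)                      # rank of the low-rank update
    scale = alpha / r               # standard LoRA scaling factor
    delta = matmul(b, a)            # (d_out, d_in) low-rank update B @ A
    w_eff = [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]
    w_t = [list(col) for col in zip(*w_eff)]  # transpose to (d_in, d_out)
    return matmul(x, w_t)           # x @ w_eff^T


# Toy example with made-up numbers: identity base weight, rank-1 update.
x = [[1.0, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]
b = [[1.0], [0.0]]
print(lora_linear(x, w, a, b, alpha=2.0))  # prints [[5.0, 1.0]]
```

With B initialized to zeros, as is conventional for LoRA, the layer reproduces the frozen model's output exactly, so fine-tuning starts from the pretrained behavior.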

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2504.09255 [cs.CV]
	(or arXiv:2504.09255v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2504.09255

Submission history

From: Sijing Wu
[v1] Sat, 12 Apr 2025 15:26:02 UTC (5,008 KB)
