Multi-Style Facial Sketch Synthesis through Masked Generative Modeling

Sun, Bowen; Lu, Guo; Zheng, Shibao

arXiv:2408.12400 (cs)

[Submitted on 22 Aug 2024]

Abstract: Facial sketch synthesis (FSS) models, which generate sketch portraits from facial photographs, have applications in many domains, including cross-modal face recognition, entertainment, art, and media. Producing high-quality sketches nevertheless remains difficult, primarily because of three factors: (1) the scarcity of artist-drawn data, (2) the limited range of available style types, and (3) deficiencies in how existing models process input information. To address these difficulties, we propose a lightweight end-to-end synthesis model that efficiently converts images into corresponding multi-style sketches without requiring any supplementary inputs (e.g., 3D geometry). We overcome the data insufficiency by incorporating semi-supervised learning into the training process. Additionally, we employ a feature extraction module and style embeddings to guide the generative transformer during the iterative prediction of masked image tokens, thus achieving continuous stylized output that accurately retains facial features in the sketches. Extensive experiments demonstrate that our method consistently outperforms previous algorithms across multiple benchmarks by a discernible margin.
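
The iterative prediction of masked image tokens mentioned in the abstract can be illustrated with a toy decoding loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the cosine unmasking schedule, the confidence-based commit rule, and the names `MASK`, `toy_predictor`, and `iterative_decode` are all hypothetical, and the predictor is a random stand-in for a transformer that would in practice be conditioned on photo features and a style embedding.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a still-masked token position


def toy_predictor(tokens, style_id, vocab_size, rng):
    """Random stand-in for the generative transformer (an assumption).

    Returns one (token, confidence) guess per position. A real model would
    use the visible tokens, photo features, and the style embedding.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def iterative_decode(num_tokens=16, vocab_size=8, steps=4, style_id=0, seed=0):
    """Fill a fully masked token grid over a fixed number of steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, steps + 1):
        guesses = toy_predictor(tokens, style_id, vocab_size, rng)
        # Assumed cosine schedule: how many tokens stay masked after this step.
        ratio = math.cos(math.pi / 2 * step / steps)
        keep_masked = max(0, int(num_tokens * ratio))
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Commit the most confident guesses among the masked positions.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        num_to_fill = max(0, len(masked) - keep_masked)
        for i in masked[:num_to_fill]:
            tokens[i] = guesses[i][0]
    return tokens


if __name__ == "__main__":
    print(iterative_decode())
```

In this sketch only a few tokens are committed in the early steps and progressively more in later ones, so by the final step no position remains masked. Changing `style_id` has no effect here because the predictor is random; it is kept in the signature only to show where style conditioning would enter.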

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2408.12400 [cs.CV]
	(or arXiv:2408.12400v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2408.12400

Submission history

From: Bowen Sun
[v1] Thu, 22 Aug 2024 13:45:04 UTC (1,344 KB)
