InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Li, Senmao; Wang, Kai; van de Weijer, Joost; Khan, Fahad Shahbaz; Guo, Chun-Le; Yang, Shiqi; Wang, Yaxing; Yang, Jian; Cheng, Ming-Ming

Computer Science > Computer Vision and Pattern Recognition

arXiv:2502.02215 (cs)

[Submitted on 4 Feb 2025 (v1), last revised 21 Mar 2025 (this version, v2)]

Title:InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Authors:Senmao Li, Kai Wang, Joost van de Weijer, Fahad Shahbaz Khan, Chun-Le Guo, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng

View PDF

Abstract:Diffusion priors have been used for blind face restoration (BFR) by fine-tuning diffusion models (DMs) on restoration datasets to recover low-quality images. However, the naive application of DMs presents several key limitations. (i) The diffusion prior has inferior semantic consistency (e.g., ID, structure and color.), increasing the difficulty of optimizing the BFR model; (ii) reliance on hundreds of denoising iterations, preventing the effective cooperation with perceptual losses, which is crucial for faithful restoration. Observing that the latent consistency model (LCM) learns consistency noise-to-data mappings on the ODE-trajectory and therefore shows more semantic consistency in the subject identity, structural information and color preservation, we propose InterLCM to leverage the LCM for its superior semantic consistency and efficiency to counter the above issues. Treating low-quality images as the intermediate state of LCM, InterLCM achieves a balance between fidelity and quality by starting from earlier LCM steps. LCM also allows the integration of perceptual loss during training, leading to improved restoration quality, particularly in real-world scenarios. To mitigate structural and semantic uncertainties, InterLCM incorporates a Visual Module to extract visual features and a Spatial Encoder to capture spatial details, enhancing the fidelity of restored images. Extensive experiments demonstrate that InterLCM outperforms existing approaches in both synthetic and real-world datasets while also achieving faster inference speed.

Comments:	Accepted at ICLR2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2502.02215 [cs.CV]
	(or arXiv:2502.02215v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2502.02215

Submission history

From: Senmao Li [view email]
[v1] Tue, 4 Feb 2025 10:51:20 UTC (22,931 KB)
[v2] Fri, 21 Mar 2025 18:51:58 UTC (22,928 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators