Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation

Zhang, Shengyuan; Yang, Ling; Li, Zejian; Zhao, An; Meng, Chenye; Yang, Changyuan; Yang, Guang; Yang, Zhiyuan; Sun, Lingyun

arXiv:2408.15991 (cs)

[Submitted on 28 Aug 2024 (v1), last revised 17 Apr 2025 (this version, v3)]

Abstract: Accelerating the sampling speed of diffusion models remains a significant challenge. Recent score distillation methods distill a heavy teacher model into a student generator to achieve one-step generation; the student is optimized by calculating the difference between the two score functions on samples generated by the student model. However, there is a score mismatch issue in the early stage of the distillation process, because existing methods mainly focus on using the endpoint of pre-trained diffusion models as teacher models, overlooking the importance of the convergence trajectory between the student generator and the teacher model. To address this issue, we extend the score distillation process by introducing the entire convergence trajectory of teacher models and propose Distribution Backtracking Distillation (DisBack). DisBack is composed of two stages: Degradation Recording and Distribution Backtracking. Degradation Recording is designed to obtain the convergence trajectory of the teacher model; it records the degradation path from the trained teacher model to the untrained initial student generator. The degradation path implicitly represents the teacher model's intermediate distributions, and its reverse can be viewed as the convergence trajectory from the student generator to the teacher model. Distribution Backtracking then trains a student generator to backtrack the intermediate distributions along the path to approximate the convergence trajectory of teacher models. Extensive experiments show that DisBack achieves faster and better convergence than the existing distillation method and accomplishes comparable generation performance, with an FID score of 1.38 on the ImageNet 64x64 dataset. Notably, DisBack is easy to implement and can be generalized to existing distillation methods to boost performance. Our code is publicly available on this https URL.
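
The two stages described in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: it assumes every distribution is a unit-variance Gaussian whose score is known in closed form, whereas the paper works with diffusion models and learned score networks. The function names (`score`, `degradation_recording`, `distribution_backtracking`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def score(x, mean):
    """Score (gradient of the log-density) of a unit-variance Gaussian N(mean, 1)."""
    return -(x - mean)


def degradation_recording(teacher_mean, student_init_mean,
                          steps=50, lr=0.1, save_every=10):
    """Stage 1 (toy assumption): drift a copy of the teacher toward the
    initial student distribution and save checkpoints along the way."""
    mean = teacher_mean
    path = [mean]  # path[0] is the original teacher
    for step in range(1, steps + 1):
        # Gradient step on 0.5 * (mean - student_init_mean) ** 2.
        mean -= lr * (mean - student_init_mean)
        if step % save_every == 0:
            path.append(mean)
    return path  # path[-1] lies close to the initial student


def distribution_backtracking(student_mean, path,
                              steps_per_stage=50, lr=0.1, batch=32, seed=0):
    """Stage 2 (toy assumption): walk the recorded path in reverse, using each
    checkpoint in turn as the target of a score-distillation update."""
    rng = random.Random(seed)
    theta = student_mean
    for target_mean in reversed(path):
        for _ in range(steps_per_stage):
            # Samples from the current student generator.
            xs = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]
            # Difference between student and target scores on those samples.
            grad = sum(score(x, theta) - score(x, target_mean) for x in xs) / batch
            theta -= lr * grad
    return theta


path = degradation_recording(teacher_mean=5.0, student_init_mean=0.0)
final_mean = distribution_backtracking(student_mean=0.0, path=path)
print(path, final_mean)
```

In this toy setting each intermediate target sits close to the student's current distribution, which is the intuition behind following the trajectory rather than distilling from the endpoint directly. With equal-variance Gaussians the score difference is constant across samples, so the sketch shows only the control flow of the two stages, not the score mismatch the paper addresses.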

Comments:	Our code is publicly available on this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2408.15991 [cs.CV]
	(or arXiv:2408.15991v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2408.15991

Submission history

From: Shengyuan Zhang [view email]
[v1] Wed, 28 Aug 2024 17:58:17 UTC (48,535 KB)
[v2] Wed, 25 Sep 2024 03:05:05 UTC (48,515 KB)
[v3] Thu, 17 Apr 2025 03:58:19 UTC (41,240 KB)
