Self-Evolution Knowledge Distillation for LLM-based Machine Translation

Song, Yuncheng; Ding, Liang; Zan, Changtong; Huang, Shujian

arXiv:2412.15303 (cs)

[Submitted on 19 Dec 2024]

Abstract: Knowledge distillation (KD) has shown great promise in transferring knowledge from larger teacher models to smaller student models. However, existing KD strategies for large language models often minimize the divergence between the student and teacher output distributions indiscriminately for every token. This overlooks the imbalanced nature of tokens and their varying transfer difficulties. In response, we propose a distillation strategy called Self-Evolution KD. The core of this approach is to dynamically integrate the teacher distribution and the one-hot distribution of the ground truth into the student distribution as prior knowledge, which promotes the distillation process. It adjusts the ratio of prior knowledge according to each token's learning difficulty, fully leveraging the teacher model's potential. Experimental results show that our method brings an average improvement of approximately 1.4 SacreBLEU points across four translation directions on the WMT22 test sets. Further analysis indicates that the improvement comes from better knowledge transfer from teachers, confirming our hypothesis.
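The abstract gives no formulas, so the following is only a rough sketch of one possible reading of the idea, not the paper's actual formulation. It assumes that token difficulty is measured by KL(teacher || student), that the prior ratio is `1 - exp(-difficulty / tau)`, and that the prior is a fixed blend of the teacher distribution and the one-hot ground truth. The function `self_evolved_target` and the parameters `beta` and `tau` are hypothetical names introduced for illustration.

```python
import math


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def one_hot(index, size):
    """One-hot distribution with all mass on `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def mix(a, b, w):
    """Convex combination (1 - w) * a + w * b."""
    return [(1.0 - w) * ai + w * bi for ai, bi in zip(a, b)]


def self_evolved_target(student, teacher, gold_index, beta=0.5, tau=1.0):
    """Build a per-token training target (illustrative assumption only).

    student, teacher: probability lists over the vocabulary for one token.
    gold_index: index of the ground-truth token.
    beta: assumed weight of the teacher inside the prior.
    tau: assumed scale that maps difficulty to a ratio.
    """
    # Assumed difficulty score: how far the student is from the teacher.
    difficulty = kl(teacher, student)
    # Assumed mapping from difficulty to a prior ratio in [0, 1).
    ratio = 1.0 - math.exp(-difficulty / tau)
    # Prior knowledge: blend of one-hot ground truth and teacher distribution.
    prior = mix(one_hot(gold_index, len(student)), teacher, beta)
    # Inject the prior into the student's own distribution.
    return mix(student, prior, ratio), ratio


if __name__ == "__main__":
    teacher = [0.7, 0.2, 0.1]
    for student in ([0.7, 0.2, 0.1], [0.1, 0.2, 0.7]):
        target, ratio = self_evolved_target(student, teacher, gold_index=0)
        print(student, "ratio:", round(ratio, 3), "target:", target)
```

Under these assumptions, a token the student already models like the teacher receives a ratio of zero and its target stays at the student's own distribution, while a token where the student is far from the teacher receives a target pulled strongly toward the teacher and the ground truth.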

Comments:	COLING 2025
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.15303 [cs.CL]
	(or arXiv:2412.15303v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.15303

Submission history

From: Yuncheng Song
[v1] Thu, 19 Dec 2024 12:24:15 UTC (1,352 KB)
