
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Content

  • 🚀 News
  • ✏️ Todo
  • ✨ Introduction
  • 🖥️ Environment
  • 🤗 Inference Demo
  • 🤗 Finetuning Demo
  • ✏️ Evaluation
  • 🎲 Results
  • 💾 Download
  • 📌 Citation
  • 🔖 License

 

Links

  • Project Page
  • Paper

🚀 News

  • [2025.05.15] This repository page was created.
 
 
 

✏️ Todo

  • Upload the code.
  • Upload the models.
 
 
 

✨ Introduction

Contributions

  • Novel finetuning framework: We propose PT-MoE, integrating matrix decomposition with MoE for prompt tuning. Our framework achieves state-of-the-art performance with fewer parameters while outperforming either method alone, demonstrating their complementary benefits.
  • Design dynamics: We thoroughly analyze the key variables influencing the performance of PT-MoE, including prompt length, expert count, trainable parameters, routing mechanisms, and model size. These findings provide design guidelines for future parameter-efficient tuning approaches.
  • Comprehensive analysis: We provide detailed empirical studies across diverse tasks, including QA and mathematical problem solving, establishing a basis for future work in efficient finetuning methods.

Performance overview

Performance comparison of PEFT methods on 12 QA datasets in the MRQA benchmark (upper) and 5 math datasets (lower). ↑ indicates higher is better; ↓ indicates lower is better:

Framework

Framework of PT-MoE. Each soft prompt is decomposed into an input-specific matrix $A_i$ and a shared matrix $B$, with a router adaptively selecting and combining prompt components based on the input. The resulting soft prompt is prepended to the input of the frozen LLM:
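
As a concrete illustration of this decomposition-plus-routing design, the snippet below sketches a PT-MoE-style prompt module in PyTorch. The dimensions, the mean-pooled router input, and the dense softmax routing are illustrative assumptions for readability; the released code may implement these details differently.

```python
# Illustrative sketch of the PT-MoE prompt module described above.
# Dimensions, mean-pooling for the router input, and dense softmax routing
# are assumptions for clarity, not the repository's exact implementation.
import torch
import torch.nn as nn


class PTMoEPrompt(nn.Module):
    """Builds a soft prompt from expert-specific low-rank matrices A_i and a
    shared matrix B, mixed by a router conditioned on the input."""

    def __init__(self, hidden_size=2048, prompt_len=20, rank=8, num_experts=4):
        super().__init__()
        # Expert-specific factors A_i: (num_experts, prompt_len, rank)
        self.A = nn.Parameter(torch.randn(num_experts, prompt_len, rank) * 0.02)
        # Shared projection matrix B: (rank, hidden_size)
        self.B = nn.Parameter(torch.randn(rank, hidden_size) * 0.02)
        # Router scores the experts from a pooled input representation.
        self.router = nn.Linear(hidden_size, num_experts)

    def forward(self, input_embeds):                   # (batch, seq, hidden)
        pooled = input_embeds.mean(dim=1)              # (batch, hidden)
        weights = self.router(pooled).softmax(dim=-1)  # (batch, num_experts)
        # Combine the expert factors per input, then project through shared B.
        mixed_A = torch.einsum("be,elr->blr", weights, self.A)  # (batch, L, rank)
        prompt = mixed_A @ self.B                      # (batch, L, hidden)
        # Prepend the resulting soft prompt to the frozen LLM's input embeddings.
        return torch.cat([prompt, input_embeds], dim=1)
```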

 
 
 

🖥️ Environment

Please use the following environment:

python==3.11.5
torch==2.3.1+cu118
transformers==4.46.0
datasets==2.18.0
huggingface_hub==0.24.2
deepspeed==0.15.3
wandb==0.14.2
numpy==1.23.5
tqdm==4.66.4
 
 
 

🤗 Inference Demo

QA

Math
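
The task-specific QA and math demo scripts are still listed under Todo. Until they are uploaded, the snippet below is a minimal, generic sketch of inference with a frozen causal LM and a trained soft prompt, using the standard Hugging Face transformers API. The base model name and the `soft_prompt.pt` file are hypothetical placeholders, not the repository's actual interface.

```python
# Minimal illustrative sketch: prepend trained soft-prompt embeddings to the
# input embeddings of a frozen causal LM. The model name and the
# "soft_prompt.pt" checkpoint are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"            # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# A trained soft prompt of shape (prompt_length, hidden_size), e.g. produced
# by a PT-MoE-style prompt encoder; loaded here from a placeholder file.
soft_prompt = torch.load("soft_prompt.pt")         # (L, H)

question = "Who wrote Pride and Prejudice?"
enc = tokenizer(question, return_tensors="pt")
token_embeds = model.get_input_embeddings()(enc["input_ids"])        # (1, T, H)
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, soft_prompt.size(0), dtype=torch.long), enc["attention_mask"]],
    dim=1,
)

with torch.no_grad():
    out = model.generate(inputs_embeds=inputs_embeds,
                         attention_mask=attention_mask,
                         max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```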

 
 
 

🤗 Finetuning Demo

QA

Math
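
The finetuning scripts are likewise pending upload. The sketch below shows the general setup of prompt-based finetuning with a frozen base LLM; for brevity it trains a plain soft prompt rather than the full PT-MoE module, and the model name and toy training example are placeholders.

```python
# Minimal sketch of the finetuning setup: the base LLM is frozen and only the
# soft-prompt parameters receive gradients. For brevity this uses a plain
# trainable prompt (not the full PT-MoE module); names are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"             # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
for p in model.parameters():                       # freeze the LLM
    p.requires_grad_(False)

hidden = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(20, hidden) * 0.02)
optimizer = torch.optim.AdamW([soft_prompt], lr=3e-3)

batch = tokenizer(["Q: 2 + 3 = ? A: 5"], return_tensors="pt")  # toy example
labels = batch["input_ids"].clone()
token_embeds = model.get_input_embeddings()(batch["input_ids"])
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, soft_prompt.size(0), dtype=torch.long), batch["attention_mask"]],
    dim=1,
)
# Mask out the loss on prompt positions by padding labels with -100.
labels = torch.cat([torch.full((1, soft_prompt.size(0)), -100), labels], dim=1)

out = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask, labels=labels)
out.loss.backward()
optimizer.step()
```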

 
 
 

✏️ Evaluation

QA

Math

 
 
 

🎲 Results

QA

Evaluation results (F1 scores) for various PEFT methods on MRQA datasets. SQ: SQuAD; News: NewsQA; Tri: TriviaQA; Srch: SearchQA; HP: HotpotQA; NQ: NaturalQuestions; BSQ: BioASQ; DP: DROP; DRC: DuoRC; RC: RACE; RE: RelationExtraction; TB: TextbookQA. The bold values indicate the best performance among prompt tuning-based methods:

Evaluation results (Exact Match) for MRQA datasets:

Math

Accuracy (%) on mathematical problem-solving tasks with the number of trainable parameters shown in the second column. The first four out-of-domain datasets are from the SVAMP dataset. MP500 denotes the first 500 questions from MATH_PROBLEMS:

 
 
 

💾 Download

 
 
 

📌 Citation

@misc{li2025ptmoeefficientfinetuningframework,
      title={PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning}, 
      author={Zongqian Li and Yixuan Su and Nigel Collier},
      year={2025},
      eprint={2505.09519},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09519}, 
}
 
 
 

🔖 License

