
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Content

  • 🚀 News
  • ✏️ Todo
  • ✨ Introduction
  • 🖥️ Environment
  • 🤗 Inference Demo
  • 🤗 Finetuning Demo
  • ✏️ Evaluation
  • 🎲 Results
  • 💾 Download
  • 📌 Citation
  • 🔖 License

 

Links

  • Project Page
  • Paper

🚀 News

  • [2025.05.15] This repository page was created.
 
 
 

✏️ Todo

  • Upload the code.
  • Upload the models.
 
 
 

✨ Introduction

Contributions

  • Novel finetuning framework: We propose PT-MoE, integrating matrix decomposition with MoE for prompt tuning. Our framework achieves state-of-the-art performance with fewer parameters while outperforming either method alone, demonstrating their complementary benefits.
  • Design dynamics: We thoroughly analyze the key variables influencing the performance of PT-MoE, including prompt length, expert count, trainable parameters, routing mechanisms, and model size. These findings provide design guidelines for future parameter-efficient tuning approaches.
  • Comprehensive analysis: We provide detailed empirical studies across diverse tasks, including QA and mathematical problem solving, establishing a basis for future work in efficient finetuning methods.

Performance overview

Performance comparison of PEFT methods on 12 QA datasets in the MRQA benchmark (upper) and 5 math datasets (lower). ↑ indicates higher is better; ↓ indicates lower is better:

Framework

Framework of PT-MoE. Each soft prompt is decomposed into an input-specific matrix $A_i$ and a shared matrix $B$, with a router adaptively selecting and combining prompt components based on the input. The resulting soft prompt is prepended to the input of the frozen LLM:
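
As a concrete illustration of this decomposition-plus-routing design, the snippet below sketches a PT-MoE-style prompt module in PyTorch. The dimensions, the mean-pooled router input, and the dense softmax routing are illustrative assumptions for readability; the released code may implement these details differently.

```python
# Illustrative sketch of the PT-MoE prompt module described above.
# Dimensions, mean-pooling for the router input, and dense softmax routing
# are assumptions for clarity, not the repository's exact implementation.
import torch
import torch.nn as nn


class PTMoEPrompt(nn.Module):
    """Builds a soft prompt from expert-specific low-rank matrices A_i and a
    shared matrix B, mixed by a router conditioned on the input."""

    def __init__(self, hidden_size=2048, prompt_len=20, rank=8, num_experts=4):
        super().__init__()
        # Expert-specific factors A_i: (num_experts, prompt_len, rank)
        self.A = nn.Parameter(torch.randn(num_experts, prompt_len, rank) * 0.02)
        # Shared projection matrix B: (rank, hidden_size)
        self.B = nn.Parameter(torch.randn(rank, hidden_size) * 0.02)
        # Router scores the experts from a pooled input representation.
        self.router = nn.Linear(hidden_size, num_experts)

    def forward(self, input_embeds):                   # (batch, seq, hidden)
        pooled = input_embeds.mean(dim=1)              # (batch, hidden)
        weights = self.router(pooled).softmax(dim=-1)  # (batch, num_experts)
        # Combine the expert factors per input, then project through shared B.
        mixed_A = torch.einsum("be,elr->blr", weights, self.A)  # (batch, L, rank)
        prompt = mixed_A @ self.B                      # (batch, L, hidden)
        # Prepend the resulting soft prompt to the frozen LLM's input embeddings.
        return torch.cat([prompt, input_embeds], dim=1)
```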

 
 
 

🖥️ Environment

Please use the following environment:

python==3.11.5
torch==2.3.1+cu118
transformers==4.46.0
datasets==2.18.0
huggingface_hub==0.24.2
deepspeed==0.15.3
wandb==0.14.2
numpy==1.23.5
tqdm==4.66.4
 
 
 

🤗 Inference Demo

QA

Math
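
The task-specific QA and math demo scripts are still listed under Todo. Until they are uploaded, the snippet below is a minimal, generic sketch of inference with a frozen causal LM and a trained soft prompt, using the standard Hugging Face transformers API. The base model name and the `soft_prompt.pt` file are hypothetical placeholders, not the repository's actual interface.

```python
# Minimal illustrative sketch: prepend trained soft-prompt embeddings to the
# input embeddings of a frozen causal LM. The model name and the
# "soft_prompt.pt" checkpoint are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"            # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# A trained soft prompt of shape (prompt_length, hidden_size), e.g. produced
# by a PT-MoE-style prompt encoder; loaded here from a placeholder file.
soft_prompt = torch.load("soft_prompt.pt")         # (L, H)

question = "Who wrote Pride and Prejudice?"
enc = tokenizer(question, return_tensors="pt")
token_embeds = model.get_input_embeddings()(enc["input_ids"])        # (1, T, H)
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, soft_prompt.size(0), dtype=torch.long), enc["attention_mask"]],
    dim=1,
)

with torch.no_grad():
    out = model.generate(inputs_embeds=inputs_embeds,
                         attention_mask=attention_mask,
                         max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```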

 
 
 

🤗 Finetuning Demo

QA

Math
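
The finetuning scripts are likewise pending upload. The sketch below shows the general setup of prompt-based finetuning with a frozen base LLM; for brevity it trains a plain soft prompt rather than the full PT-MoE module, and the model name and toy training example are placeholders.

```python
# Minimal sketch of the finetuning setup: the base LLM is frozen and only the
# soft-prompt parameters receive gradients. For brevity this uses a plain
# trainable prompt (not the full PT-MoE module); names are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"             # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
for p in model.parameters():                       # freeze the LLM
    p.requires_grad_(False)

hidden = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(20, hidden) * 0.02)
optimizer = torch.optim.AdamW([soft_prompt], lr=3e-3)

batch = tokenizer(["Q: 2 + 3 = ? A: 5"], return_tensors="pt")  # toy example
labels = batch["input_ids"].clone()
token_embeds = model.get_input_embeddings()(batch["input_ids"])
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, soft_prompt.size(0), dtype=torch.long), batch["attention_mask"]],
    dim=1,
)
# Mask out the loss on prompt positions by padding labels with -100.
labels = torch.cat([torch.full((1, soft_prompt.size(0)), -100), labels], dim=1)

out = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask, labels=labels)
out.loss.backward()
optimizer.step()
```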

 
 
 

✏️ Evaluation

QA

Math

 
 
 

🎲 Results

QA

Evaluation results (F1 scores) for various PEFT methods on MRQA datasets. SQ: SQuAD; News: NewsQA; Tri: TriviaQA; Srch: SearchQA; HP: HotpotQA; NQ: NaturalQuestions; BSQ: BioASQ; DP: DROP; DRC: DuoRC; RC: RACE; RE: RelationExtraction; TB: TextbookQA. The bold values indicate the best performance among prompt tuning-based methods:

Evaluation results (Exact Match) for MRQA datasets:

Math

Accuracy (%) on mathematical problem-solving tasks with the number of trainable parameters shown in the second column. The first four out-of-domain datasets are from the SVAMP dataset. MP500 denotes the first 500 questions from MATH_PROBLEMS:

 
 
 

💾 Download

 
 
 

📌 Citation

@misc{li2025ptmoeefficientfinetuningframework,
      title={PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning}, 
      author={Zongqian Li and Yixuan Su and Nigel Collier},
      year={2025},
      eprint={2505.09519},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09519}, 
}
 
 
 

🔖 License

