Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Nguyen, Tuc; Le, Thai

Computer Science > Computation and Language

arXiv:2402.10639 (cs)

[Submitted on 16 Feb 2024 (v1), last revised 6 Jun 2024 (this version, v2)]

Title:Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Authors:Tuc Nguyen, Thai Le

View PDF HTML (experimental)

Abstract:Several parameter-efficient fine-tuning methods based on adapters have been proposed as a streamlined approach to incorporate not only a single specialized knowledge into existing Pre-Trained Language Models (PLMs) but also multiple of them at once. Recent works such as AdapterSoup propose to mix not all but only a selective sub-set of domain-specific adapters during inference via model weight averaging to optimize performance on novel, unseen domains with excellent computational efficiency. However, the essential generalizability of this emerging weight-space adapter mixing mechanism on \textit{unseen, in-domain examples} remains unexplored. Thus, in this study, we conduct a comprehensive analysis to elucidate the generalizability of domain-specific adapter mixtures in in-domain evaluation. We also provide investigations into the inner workings of the mixture of domain-specific adapters by analyzing their weight signs, yielding critical analysis on the negative correlation between their fraction of weight sign difference and their mixtures' generalizability.

Comments:	ACL Main 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2402.10639 [cs.CL]
	(or arXiv:2402.10639v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.10639

Submission history

From: Tuc Nguyen Van [view email]
[v1] Fri, 16 Feb 2024 12:39:10 UTC (1,814 KB)
[v2] Thu, 6 Jun 2024 04:52:09 UTC (1,805 KB)

Computer Science > Computation and Language

Title:Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators