Dataset Distillation-based Hybrid Federated Learning on Non-IID Data

Shi, Xiufang; Zhang, Wei; Wu, Mincheng; Liu, Guangyi; Wen, Zhenyu; He, Shibo; Shah, Tejal; Ranjan, Rajiv

arXiv:2409.17517 (cs)

[Submitted on 26 Sep 2024]

Abstract: In federated learning, the heterogeneity of client data strongly affects the performance of model training. Many heterogeneity issues in this process arise from non-independent and identically distributed (Non-IID) data. This study focuses on the issue of label distribution skew. To address it, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distillation to generate approximately independent and identically distributed (IID) data, thereby improving the performance of model training. Specifically, we partition the clients into heterogeneous clusters, where the data labels among different clients within a cluster are unbalanced while the data labels among different clusters are balanced. The cluster headers collect distilled data from the corresponding cluster members and conduct model training in collaboration with the server. This training process resembles traditional federated learning on IID data and hence effectively alleviates the impact of Non-IID data on model training. Furthermore, we compare our proposed method with typical baseline methods on public datasets. Experimental results demonstrate that when the data labels are severely imbalanced, the proposed HFLDD outperforms the baseline methods in terms of both test accuracy and communication cost.
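The abstract does not spell out how clients are partitioned, so the following is only a minimal sketch of the clustering idea, under the assumption that each client's label histogram is known. It greedily assigns label-skewed clients to clusters so that the labels pooled inside each cluster come out close to uniform. The function names `cluster_imbalance` and `build_heterogeneous_clusters` are hypothetical and are not taken from the paper, and the greedy rule is a stand-in, not the HFLDD procedure.

```python
from collections import Counter


def cluster_imbalance(hist, num_labels):
    """Total variation distance between a label histogram and the uniform distribution."""
    total = sum(hist.values())
    if total == 0:
        return 1.0  # treat an empty cluster as maximally imbalanced
    return 0.5 * sum(
        abs(hist.get(label, 0) / total - 1.0 / num_labels)
        for label in range(num_labels)
    )


def build_heterogeneous_clusters(client_labels, num_clusters, num_labels):
    """Greedily group clients so that each cluster's pooled labels are near-uniform.

    client_labels maps a client id to the list of integer labels it holds.
    This greedy rule is an illustrative assumption, not the paper's algorithm.
    """
    clusters = [{"members": [], "hist": Counter()} for _ in range(num_clusters)]
    # Place clients with more samples first; break ties by client id.
    order = sorted(client_labels, key=lambda cid: (-len(client_labels[cid]), cid))
    for cid in order:
        hist = Counter(client_labels[cid])
        # Pick the cluster whose pooled histogram would be most balanced
        # after adding this client; prefer smaller clusters on ties.
        best = min(
            range(num_clusters),
            key=lambda k: (
                cluster_imbalance(clusters[k]["hist"] + hist, num_labels),
                len(clusters[k]["members"]),
            ),
        )
        clusters[best]["members"].append(cid)
        clusters[best]["hist"] += hist
    return clusters


# Four clients, each holding a single label (severe label distribution skew).
client_labels = {0: [0] * 10, 1: [0] * 10, 2: [1] * 10, 3: [1] * 10}
clusters = build_heterogeneous_clusters(client_labels, num_clusters=2, num_labels=2)
for index, cluster in enumerate(clusters):
    print(index, cluster["members"], dict(cluster["hist"]))
```

In this toy setting each cluster ends up pairing a label-0 client with a label-1 client, so members within a cluster are unbalanced while the clusters themselves are balanced. In the framework described above, one member per cluster would then act as the header that gathers distilled data from the others.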

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2409.17517 [cs.LG]
(or arXiv:2409.17517v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2409.17517

Submission history

From: Wei Zhang
[v1] Thu, 26 Sep 2024 03:52:41 UTC (4,939 KB)
