Supplement data in federated learning with a generator transparent to clients
Information Sciences (IF 8.1) Pub Date: 2024-03-07, DOI: 10.1016/j.ins.2024.120437
Xiaoya Wang, Tianqing Zhu, Wanlei Zhou

Federated learning is a decentralized learning approach that shows promise for preserving users' privacy by avoiding the sharing of local data. However, heterogeneous data limits its application in wider scopes: data heterogeneity across diverse clients leads to weight divergence between local models and degrades the global performance of federated learning. Supplementing the training data has been explored as a way to mitigate data heterogeneity and proven effective, but traditional data-supplementation methods raise privacy concerns and increase learning costs. In this paper, we propose to supplement training data with a generative model that is transparent to local clients. Both the training of the generative model and the storage of its supplementary data are kept on the server side. This approach avoids collecting auxiliary data directly from local clients, reducing their privacy exposure and preventing their costs from rising. To avoid loose learning on the real and synthetic samples, we constrain the optimization of the global model with a distance between the global model under training and the distribution of the aggregated global model. Extensive experiments verify that the synthetic data from the generative model improves the performance of federated learning, especially in heterogeneous environments.
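To make the server-side procedure in the abstract concrete, below is a minimal PyTorch-style sketch of one communication round. It assumes FedAvg-style aggregation, a server-held generator with a hypothetical sample() API, and a simple L2 weight-distance penalty standing in for the paper's distribution-based constraint; all names and hyperparameters (Generator, mu, n_synth, etc.) are illustrative assumptions, not the paper's actual implementation.

import copy
import torch
import torch.nn.functional as F

def fedavg(client_states, client_sizes):
    # Standard FedAvg: size-weighted average of client state_dicts.
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total)
                       for s, n in zip(client_states, client_sizes))
    return avg

def l2_to_anchor(model, anchor_state):
    # Squared L2 distance between the model being trained and the
    # aggregated ("anchor") weights; a simple stand-in for the
    # distribution-based distance described in the abstract.
    dist = 0.0
    for name, p in model.named_parameters():
        dist = dist + ((p - anchor_state[name]) ** 2).sum()
    return dist

def server_round(global_model, generator, client_states, client_sizes,
                 n_synth=256, mu=0.01, lr=1e-3):
    # 1) Aggregate the local models as usual; clients never see the generator.
    agg_state = fedavg(client_states, client_sizes)
    global_model.load_state_dict(agg_state)
    anchor = {k: v.detach().clone() for k, v in agg_state.items()}

    # 2) Fine-tune on server-side synthetic data; the distance term keeps
    #    the tuned model close to the aggregate so the real and synthetic
    #    samples are not learned "loosely" (i.e., inconsistently).
    opt = torch.optim.SGD(global_model.parameters(), lr=lr)
    z = torch.randn(n_synth, generator.latent_dim)
    x_synth, y_synth = generator.sample(z)  # hypothetical API: returns inputs and integer labels
    loss = F.cross_entropy(global_model(x_synth), y_synth)
    loss = loss + mu * l2_to_anchor(global_model, anchor)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return global_model.state_dict()

In this sketch, mu trades off how much the synthetic-data update may pull the global model away from the client aggregate: mu = 0 recovers plain server-side fine-tuning, while a large mu effectively freezes the model at the FedAvg result.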

Last updated: 2024-03-07