Byzantine-robust decentralized stochastic optimization with stochastic gradient noise-independent learning error
Signal Processing (IF 4.4), Pub Date: 2024-02-03, DOI: 10.1016/j.sigpro.2024.109419
Jie Peng, Weiyu Li, Qing Ling

This paper studies Byzantine-robust stochastic optimization over a decentralized network, where every agent periodically communicates with its neighbors to exchange local models and then updates its own local model with one or a mini-batch of local samples. The performance of such a method is affected by an unknown number of Byzantine agents, which behave adversarially during the optimization process. To the best of our knowledge, no existing work simultaneously achieves a linear convergence speed and a small learning error. We observe that this unsatisfactory trade-off between convergence speed and learning error is due to the intrinsic stochastic gradient noise. Motivated by this observation, we introduce two variance reduction methods, the stochastic average gradient algorithm (SAGA) and the loopless stochastic variance-reduced gradient (LSVRG) method, into Byzantine-robust decentralized stochastic optimization to eliminate the negative effect of the stochastic gradient noise. The two resulting methods, BRAVO-SAGA and BRAVO-LSVRG, enjoy both linear convergence speeds and stochastic gradient noise-independent learning errors. Such learning errors are optimal for a class of methods based on total variation (TV)-norm regularization and stochastic subgradient updates. We conduct extensive numerical experiments to demonstrate their effectiveness under various Byzantine attacks.
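For concreteness, the following is a minimal, hypothetical sketch of the kind of per-agent update the abstract describes: a SAGA-style variance-reduced stochastic gradient combined with a subgradient of a TV-norm penalty pulling the local model toward neighbor models. The toy least-squares local objective, the fixed neighbor models, and the step-size/penalty parameters are all illustrative assumptions; this is not the authors' BRAVO-SAGA implementation.

```python
# Hypothetical sketch of a SAGA-style variance-reduced, TV-norm-penalized local update.
# NOT the authors' code: the objective, parameters, and fixed neighbor models are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_samples, dim = 50, 5                       # local samples and model dimension (toy values)
A = rng.normal(size=(n_samples, dim))
b = A @ rng.normal(size=dim) + 0.1 * rng.normal(size=n_samples)

def sample_grad(x, i):
    """Stochastic gradient of the i-th local least-squares loss at x."""
    return (A[i] @ x - b[i]) * A[i]

# SAGA table: one stored gradient per local sample, plus its running average.
x = np.zeros(dim)
grad_table = np.stack([sample_grad(x, i) for i in range(n_samples)])
table_avg = grad_table.mean(axis=0)

neighbor_models = [rng.normal(size=dim) for _ in range(3)]  # models received from neighbors (held fixed here)
eta, lam = 0.01, 0.1                         # step size and TV-penalty weight (assumed)

for t in range(200):
    i = rng.integers(n_samples)
    g = sample_grad(x, i)
    # SAGA variance-reduced estimator: fresh gradient minus the stored gradient
    # for the same sample, plus the average of the stored table.
    v = g - grad_table[i] + table_avg
    # Update the table average in O(dim), then overwrite the stored entry.
    table_avg += (g - grad_table[i]) / n_samples
    grad_table[i] = g
    # Subgradient of the TV-norm penalty sum_j ||x - x_j||_1 over neighbor models.
    tv_subgrad = sum(np.sign(x - xj) for xj in neighbor_models)
    x -= eta * (v + lam * tv_subgrad)

print("final local model:", x)
```

In an LSVRG-style variant, the per-sample gradient table would be replaced by a single reference point whose full local gradient is recomputed with some small probability at each iteration, trading the memory of the table for occasional full-gradient passes.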

Updated: 2024-02-03