Qi Yincheng1,2, Huo Yalin1, Wang Ning1, Hou Yu3 (1. Department of Electronic and Communication Engineering, North China Electric Power University; 2. Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University; 3. Wuhan Branch, State Grid Hubei Comprehensive Energy Service Co., Ltd.)

Abstract
Objective In federated learning, inconsistent data distributions across clients cause large deviations among the clients' local objectives and drive the globally averaged model away from the global optimum, degrading both the convergence speed and the accuracy of model training. To address the slow global convergence and low model accuracy caused by non-independent and identically distributed (non-IID) data, this paper proposes a federated learning algorithm with joint dynamic correction (FedJDC), which optimizes both the client side and the server side. Method To reduce the impact of local model update drift, a cumulative offset is defined to measure each participating client's degree of non-IID data, and a dynamic constraint term is introduced into the local loss function. The strength of the constraint term is adjusted dynamically according to the cumulative offset, so the algorithm adapts automatically to different degrees of non-IID data and reduces the inconsistency of local update directions, thereby improving model accuracy and communication efficiency. To counter global model aggregation drift, the cumulative offsets uploaded by the participating clients serve as the aggregation weights of the global model, dynamically updating the global model and substantially reducing the number of communication rounds. Result Experiments on three real datasets show that, compared with four different federated learning algorithms under various degrees of non-IID data, FedJDC reduces communication rounds by 62.29%, 20.90%, 24.93%, and 20.47% on average, and improves model accuracy by 5.48%, 1.62%, 2.10%, and 2.28% on average. Conclusion The proposed joint dynamic correction algorithm for local and global drifts in federated learning improves both local model updating and global model aggregation, reducing the number of communication rounds, raising accuracy, and achieving good convergence.
Joint dynamic correction algorithms for local and global drifts in federated learning

(1. Department of Electronic and Communication Engineering, North China Electric Power University; 2. Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University; 3. Wuhan Branch, State Grid Hubei Comprehensive Energy Service Co., Ltd.)

Objective Federated learning enables multiple parties to collaboratively train a machine learning model without sharing their local data. In practical applications, the data across nodes are usually non-independent and identically distributed (non-IID). During local updates, each client model is optimized toward its local optimum (i.e., fitting its own feature distribution) rather than the global objective, which causes client update drift. Meanwhile, the global update that aggregates these diverged local models is further distracted by the set of mismatched local optima, which subsequently leads to a global drift at the server model. To address the slow global convergence and the increased number of training communication rounds caused by non-IID data, this paper proposes a joint dynamic correction federated learning algorithm (FedJDC), which optimizes both the client and the server. Method To reduce the influence of non-IID data on federated learning, FedJDC performs joint optimization of the local model update and the global model update. We use the cosine similarity between the local update direction and the global update direction to measure the offset of each participating client. Because each client exhibits a different degree of non-IID data, determining the model offset solely from the cosine similarity computed in the current round could destabilize the model update. The algorithm therefore defines a cumulative offset and introduces an attenuation coefficient ρ: when computing the cumulative offset, both the current offset and the historical cumulative offset are taken into account.
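The cumulative offset described above can be sketched as an exponentially weighted blend of the historical offset and the current round's misalignment. This is a minimal illustration, not the paper's exact formulation: the function names, the use of 1 − cosine similarity as the per-round offset, and the precise role of ρ are our assumptions.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two flattened model-update vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def update_cumulative_offset(prev_offset, local_update, global_update, rho=0.9):
    """Blend the historical cumulative offset with the current round's drift.

    The per-round offset is taken as 1 - cos(local, global), so aligned
    updates give ~0 and opposed updates give ~2 (assumed form). The
    attenuation coefficient rho controls how much the history dominates.
    """
    current_offset = 1.0 - cosine_similarity(local_update, global_update)
    return rho * prev_offset + (1.0 - rho) * current_offset
```

With a larger ρ, the current round contributes less, which matches the stabilizing intent described in the text: a single noisy round cannot swing the client's measured degree of drift.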
In addition, by adjusting ρ to reduce the proportion contributed by the current round's cumulative offset, the influence of the current round on the final result can be limited. For the local model update offset, this paper proposes a dynamic adjustment strategy for the constraint term: the constraint term of the local loss function is adjusted dynamically according to the computed cumulative offset of the local model, so the algorithm automatically adapts to various non-IID settings without careful hyperparameter selection, which makes it more flexible. To dynamically change the weight of global model aggregation in each round and thereby improve convergence speed and model accuracy, we also design a dynamic weighted aggregation strategy, which uses the cumulative offsets uploaded by all clients as the aggregation weights of the global model in each round of communication. Result The method is evaluated with different deep learning models on three datasets: LeNet5 on MNIST, VGG16 on FMNIST, and ResNet18 on CIFAR10. Four experiments are designed to demonstrate the effectiveness of the algorithm. To verify the model accuracy of FedJDC under different degrees of non-IID data, we vary the hyperparameter β of the Dirichlet distribution and compare the performance of the algorithms. The experimental results show that FedJDC improves model accuracy by 5.48%, 1.62%, 2.10%, and 2.28% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. To evaluate the communication efficiency of FedJDC, we count the number of communication rounds needed to reach a target accuracy and compare it with the other four methods.
The experimental results show that under different degrees of non-IID data, FedJDC reduces communication rounds by 62.29%, 20.90%, 24.93%, and 20.47% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. We also study the effect of the number of local epochs on the final model accuracy. The results show that FedJDC outperforms the other four methods under different numbers of local epochs and is more robust to the larger offsets caused by more local update epochs. Finally, ablation experiments show that each optimization strategy performs well on all datasets, and FedJDC, which combines the two strategies, achieves the best overall performance. Conclusion In this paper, we correct the local model offset and the global model offset jointly and propose a joint dynamic correction algorithm for local and global offsets in federated learning. A cumulative offset is defined, and an attenuation coefficient is introduced into its computation; by considering both historical and current offset information, the cumulative offset is adjusted dynamically to keep the training parameter updates stable. The dynamic constraint strategy uses the cumulative offset computed by each client in each round as the constraint parameter of the client model. The dynamic weighted aggregation strategy changes the weight of each local model during global aggregation according to the cumulative offset of each participating client, dynamically updating the global model in each round. Combining the two optimization strategies yields good results, effectively alleviating the performance degradation of federated learning models caused by non-IID data and laying a solid foundation for the further deployment of federated learning in this field.
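The two strategies summarized in the conclusion, a constraint term scaled by the cumulative offset and offset-driven aggregation weights, could look roughly like the sketch below. All names are hypothetical, and the inverse-offset weighting (less-drifted clients get larger weight) is our assumption about how the uploaded offsets enter the aggregation; the paper's exact formulas may differ.

```python
import numpy as np

def constrained_local_loss(task_loss, local_weights, global_weights, mu):
    """Local objective with a proximal-style constraint term scaled by mu.

    mu would be set each round from the client's cumulative offset, so a
    client that drifts more is pulled more strongly toward the global model
    (assumed mapping; FedProx-style squared-distance penalty).
    """
    penalty = 0.5 * mu * sum(
        np.sum((w - g) ** 2) for w, g in zip(local_weights, global_weights)
    )
    return task_loss + penalty

def dynamic_weighted_aggregate(client_models, cumulative_offsets):
    """Aggregate per-layer client weights using offset-derived weights.

    Here weights are proportional to the inverse cumulative offset, so
    clients with smaller drift dominate the global model (assumed form).
    """
    offsets = np.asarray(cumulative_offsets, dtype=float)
    raw = 1.0 / (offsets + 1e-12)
    weights = raw / raw.sum()
    return [
        sum(w * layer for w, layer in zip(weights, layers))
        for layers in zip(*client_models)
    ]
```

For example, two clients with equal cumulative offsets receive equal aggregation weights, recovering plain FedAvg-style averaging, while a client with a much larger offset contributes proportionally less to the global model.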