Data-Importance-Aware Time-Efficient Federated Learning Algorithm

    Abstract: Federated learning (FL) trains machine learning models on users' local data without compromising their privacy. In practical synchronous FL, however, training on all local data samples incurs a long computation latency, even though some samples contribute little to model accuracy. Moreover, letting too many devices participate, together with an unreasonable bandwidth-allocation strategy, further lengthens the communication latency. If an FL system adopts sensible bandwidth-allocation and device-selection strategies and prunes the training dataset, it can accelerate training while preserving model accuracy. This paper therefore formulates a data-importance-based joint problem of data selection, bandwidth allocation, and device selection, which minimizes the FL latency under per-device energy constraints while deliberately selecting the data samples that improve model accuracy the most. To solve this problem, an algorithm is designed based on Lagrangian dual decomposition and the subgradient method. Since the dual of the original problem is convex, the subgradient method iteratively updates the Lagrange multipliers to solve the dual problem, and the resulting dual solution serves as a suboptimal solution to the original problem. To evaluate the dual function, the problem is split into a bandwidth-allocation subproblem and a joint device-and-data-selection subproblem, which are solved by alternating iteration: the former via the Karush-Kuhn-Tucker (KKT) conditions, the latter via exhaustive search. Experimental results show that, across different numbers of devices, device energy thresholds, and other settings, the proposed algorithm reduces latency by at least 8.32% and by up to 51.07% compared with baseline algorithms, while incurring no loss in model accuracy and even exceeding some baselines by roughly 1 to 3 percentage points.
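The solution pipeline sketched in the abstract, projected subgradient ascent on the Lagrange multiplier of the energy constraint, with the dual function evaluated by exhaustive search over device/data selections and a KKT-based bandwidth allocation, can be illustrated on a toy instance. All numbers below (latencies, energies, importance scores) and the sum-latency-minus-importance objective are illustrative assumptions, not the paper's actual system model; here the bandwidth subproblem even has a closed form, so the paper's alternating iteration collapses to one pass per selection.

```python
# Hedged sketch: dual decomposition + subgradient method for a toy
# device/data-selection and bandwidth-allocation problem under an
# energy budget. Every parameter is illustrative, not from the paper.
import itertools
import math

comp_latency  = [2.0, 3.0, 1.5]   # local computation latency if selected (s)
energy_cost   = [1.0, 2.0, 0.8]   # energy drawn if selected (J)
data_size     = [4.0, 6.0, 3.0]   # upload size (Mb)
importance    = [5.0, 8.0, 4.0]   # data-importance score of each local dataset
bandwidth     = 10.0              # total uplink bandwidth (Mb/s)
energy_budget = 2.5               # shared energy budget (illustrative)
N = len(comp_latency)

def kkt_bandwidth(sel):
    """Minimizing sum_i d_i/(b_i*B) subject to sum_i b_i = 1 gives
    b_i proportional to sqrt(d_i) (stationarity in the KKT conditions)."""
    roots = [math.sqrt(data_size[i]) if sel[i] else 0.0 for i in range(N)]
    total = sum(roots)
    return [r / total for r in roots]

def objective(sel, b):
    """Total latency minus a reward for selecting important data."""
    lat = sum(comp_latency[i] + data_size[i] / (b[i] * bandwidth)
              for i in range(N) if sel[i])
    return lat - sum(importance[i] for i in range(N) if sel[i])

def energy(sel):
    return sum(energy_cost[i] for i in range(N) if sel[i])

def dual_minimizer(lam):
    """Evaluate the dual function at lam: exhaustive search over
    selections, with the bandwidth subproblem solved in closed form."""
    best_val, best_sel = math.inf, None
    for sel in itertools.product([0, 1], repeat=N):
        if not any(sel):
            continue
        val = objective(sel, kkt_bandwidth(sel)) \
              + lam * (energy(sel) - energy_budget)
        if val < best_val:
            best_val, best_sel = val, sel
    return best_sel

# Projected subgradient ascent on lam >= 0, keeping the best feasible
# selection seen along the way as the (suboptimal) primal answer.
lam, step, best_feasible = 0.0, 1.0, None
for k in range(300):
    sel = dual_minimizer(lam)
    if energy(sel) <= energy_budget:
        if best_feasible is None or \
           objective(sel, kkt_bandwidth(sel)) < \
           objective(best_feasible, kkt_bandwidth(best_feasible)):
            best_feasible = sel
    subgrad = energy(sel) - energy_budget           # subgradient of dual at lam
    lam = max(0.0, lam + step / (k + 1) * subgrad)  # diminishing step, projected

print("multiplier:", round(lam, 3), "selection:", best_feasible)
```

On this instance the multiplier climbs until the energy constraint prices out the expensive devices, and the recovered feasible selection keeps the two devices whose joint energy fits the budget, mirroring how the paper's dual solution is used as a suboptimal solution to the original problem.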

       
