Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (04): 61-69. doi: 10.12052/gdutxb.240005

• Information and Communication Engineering •

Adaptive Resource Optimization for Federated Learning in UAV Digital Twin Edge Networks

Xie Zheng-hao1,2, Lai Jian-xin1,2, Zhuang Xiao-chong1,3, Jiang Li1,2

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China;
  2. Guangdong Key Laboratory of IoT Information Technology, Guangdong University of Technology, Guangzhou 510006, China;
  3. Key Laboratory of Intelligent Detection and the Internet of Things in Manufacturing, Ministry of Education, Guangzhou 510006, China
  • Received: 2024-01-29 Online: 2024-07-25 Published: 2024-08-13
  • Corresponding author: Jiang Li (1986–), female, associate professor; research interests include 6G networks and endogenous network security. E-mail: jiangli@gdut.edu.cn
  • About the author: Xie Zheng-hao (1998–), male, master's student; research interests include UAV networks, federated learning, and digital twins. E-mail: redteaice@foxmail.com
  • Funding:
    National Key Research and Development Program of China (2020YFB1807801); National Natural Science Foundation of China, General Program (62371142, 62273107)

Abstract: To address the performance optimization problem of federated learning in unmanned aerial vehicle (UAV) digital twin edge networks, this paper proposes a resource scheduling strategy based on deep reinforcement learning. Considering the dynamic, time-varying environment of UAV digital twin edge networks, a twin network model is built that comprises a ground base station (BS), ground intelligent terminals, aerial UAVs, and wireless transmission channels. An adaptive resource optimization model is then formulated that jointly optimizes UAV flying distance, flying angle, and wireless spectrum resource allocation in order to minimize the latency of federated learning. A multi-agent deep deterministic policy gradient (MA-DDPG) algorithm is designed for the UAV digital twin edge network environment to solve this model. Training follows the centralized training, decentralized execution paradigm: each UAV agent considers the states and actions of the other agents when evaluating action values, but decides its action based only on its own local observations during execution. Training is carried out in the digital twin environment, and the algorithm is applied to the real world only after it converges, which minimizes the resource overhead of the physical entities. Simulation results show that the proposed algorithm significantly reduces the service latency of federated learning while preserving its advantage in training loss and accuracy.

Key words: unmanned aerial vehicle networks, digital twin, federated learning, multi-agent deep deterministic policy gradient

CLC number: TN929.5
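
The centralized training, decentralized execution scheme described in the abstract can be illustrated with a minimal data-flow sketch. This is not the authors' implementation: the observation size, the linear actor and critic, the random weights, and all class and function names are assumptions made for illustration, and no gradient updates are performed. The only detail taken from the abstract is that each UAV action has three components (flying distance, flying angle, spectrum allocation).

```python
import math
import random

OBS_DIM = 4   # assumed size of one UAV's local observation
ACT_DIM = 3   # flying distance, flying angle, spectrum share (per the abstract)


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Actor:
    """Decentralized policy: maps one UAV's local observation to its action."""

    def __init__(self, rng):
        self.weights = make_matrix(ACT_DIM, OBS_DIM, rng)

    def act(self, local_obs):
        # tanh keeps each action component in [-1, 1]; scaling to physical
        # ranges (distance, angle, bandwidth share) is left out of this sketch.
        return [math.tanh(z) for z in matvec(self.weights, local_obs)]


class CentralCritic:
    """Centralized Q-function: scores the joint observations and joint actions."""

    def __init__(self, n_agents, rng):
        size = n_agents * (OBS_DIM + ACT_DIM)
        self.weights = [rng.uniform(-0.5, 0.5) for _ in range(size)]

    def q_value(self, all_obs, all_actions):
        joint = [x for obs in all_obs for x in obs]
        joint += [a for act in all_actions for a in act]
        return sum(w * x for w, x in zip(self.weights, joint))


rng = random.Random(0)
n_agents = 3
actors = [Actor(rng) for _ in range(n_agents)]
critics = [CentralCritic(n_agents, rng) for _ in range(n_agents)]  # one per agent

observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(n_agents)]

# Execution: each UAV decides from its own local observation only.
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Training: agent 0's critic sees every agent's observation and action.
q0 = critics[0].q_value(observations, actions)
print(actions[0], q0)
```

In a full MA-DDPG setup the actors and critics would be neural networks updated from a replay buffer with target networks; the sketch only shows which inputs each component receives, which is the property that lets the critics be discarded once training in the digital twin has converged.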