Journal of Guangdong University of Technology ›› 2021, Vol. 38 ›› Issue (04): 59-64. doi: 10.12052/gdutxb.200113


Reinforcement Learning-Based Online Trajectory Optimization for Secure UAV Communications

Zheng Si-yuan, Cui Miao, Zhang Guang-chi

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2020-09-07 Online: 2021-07-10 Published: 2021-05-25
  • Corresponding author: Cui Miao (born 1978), female, lecturer, Ph.D.; main research interests: electronic information and wireless communication technology. E-mail: cuimiao@gdut.edu.cn
  • About the author: Zheng Si-yuan (born 1994), male, master's student; main research interests: reinforcement learning and secure wireless communication
  • Funding:
    Science and Technology Plan Project of Guangdong Province (2017B090909006, 2018A050506015, 2019B010119001, 2020A050515010, 2020A0505100012, 2021A0505030015); Guangdong Special Support Program (2019TQ05X409)

Abstract: Physical-layer security is an effective means of securing wireless communication. When an unmanned aerial vehicle (UAV) base station transmits confidential information to a legitimate ground user in the presence of a ground eavesdropper, existing works usually optimize the UAV trajectory offline. The resulting trajectory is fixed, so the secure transmission cannot adapt to changes in the communication environment. To address this issue, this paper studies online trajectory optimization for the UAV base station, which keeps the confidential information secure while maximizing the average secrecy rate. Unlike existing works, a more realistic hybrid line-of-sight/non-line-of-sight channel model is adopted for the air-to-ground communication links, and an online trajectory optimization method based on the Q-learning algorithm is proposed to ensure the physical-layer security of the UAV communication system. Simulation results show that the proposed algorithm effectively improves the average secrecy rate of UAV communication compared with two benchmark algorithms.

Key words: physical-layer security, UAV base station, online optimization, reinforcement learning

CLC number:

  • TN929.5
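
The abstract names two ingredients, a secrecy-rate objective and a Q-learning update, without giving formulas. The following is a minimal illustrative sketch of how they could fit together, not the authors' implementation. Everything specific here is an assumption: the 1-D grid of UAV positions, the function names (`secrecy_rate`, `los_snr`, `q_update`, `train`), the parameter values, and in particular the pure line-of-sight path-loss model, which stands in for the hybrid line-of-sight/non-line-of-sight model used in the paper.

```python
import math
import random


def secrecy_rate(snr_user, snr_eve):
    """Secrecy rate in bit/s/Hz: [log2(1 + snr_user) - log2(1 + snr_eve)]^+."""
    return max(0.0, math.log2(1.0 + snr_user) - math.log2(1.0 + snr_eve))


def los_snr(uav_x, ground_x, height=100.0, gamma0=1e6):
    """Toy free-space SNR (assumption): reference SNR divided by squared distance."""
    dist_sq = (uav_x - ground_x) ** 2 + height ** 2
    return gamma0 / dist_sq


def q_update(q, s, a, reward, s_next, alpha=0.1, discount=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (r + discount * max Q(s',.) - Q(s,a))."""
    td_target = reward + discount * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q[s][a]


def train(n_cells=11, cell=50.0, user_cell=2, eve_cell=8,
          episodes=300, steps=20, eps=0.2, seed=0):
    """Epsilon-greedy Q-learning on a hypothetical 1-D line of UAV positions."""
    rng = random.Random(seed)
    moves = (-1, 0, 1)  # fly left, hover, fly right
    q = [[0.0] * len(moves) for _ in range(n_cells)]
    for _ in range(episodes):
        s = rng.randrange(n_cells)  # random start cell for each episode
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(len(moves))  # explore
            else:
                a = max(range(len(moves)), key=lambda i: q[s][i])  # exploit
            s_next = min(n_cells - 1, max(0, s + moves[a]))  # stay inside the grid
            x = s_next * cell
            # Reward: secrecy rate achieved at the new UAV position.
            r = secrecy_rate(los_snr(x, user_cell * cell), los_snr(x, eve_cell * cell))
            q_update(q, s, a, r, s_next)
            s = s_next
    return q
```

Calling `train()` returns the learned Q-table; taking the arg-max action in each row gives a greedy flight policy for this toy setting. Because the update uses only the reward observed at each step, the same loop can keep running while the environment changes, which is the sense in which such a method is online.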