Journal of Guangdong University of Technology, 2021, Vol. 38, Issue (04): 59-64. doi: 10.12052/gdutxb.200113

Reinforcement Learning-Based Online Trajectory Optimization for Secure UAV Communications

Zheng Si-yuan, Cui Miao, Zhang Guang-chi   

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2020-09-07 Online: 2021-07-10 Published: 2021-05-25

Abstract: Physical-layer security is an effective technique for securing wireless communications. This paper investigates physical-layer security in an unmanned aerial vehicle (UAV) communication system in which a UAV base station transmits confidential information to a legitimate receiver in the presence of a potential eavesdropper. Existing works usually assume that the flight environment of the UAV is stable and optimize the trajectory offline, so the trajectory cannot be adjusted as the environment changes. To address this issue, online trajectory optimization for the UAV base station is studied, with the aim of securing the information transmission and maximizing the communication secrecy rate. Unlike existing works, a more accurate hybrid line-of-sight/non-line-of-sight channel model is adopted for the air-to-ground communication links, and an online trajectory optimization method based on the Q-learning algorithm is proposed to ensure the physical-layer security of the UAV communication. Numerical results show that the proposed algorithm significantly improves the secrecy rate of the UAV communication system compared with two benchmark schemes.

Key words: physical-layer security, UAV base station, online optimization, reinforcement learning

CLC Number: TN929.5
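
The abstract names three ingredients: a secrecy-rate objective, a hybrid line-of-sight/non-line-of-sight channel, and Q-learning for online trajectory selection. The following is a minimal sketch of how these might fit together, not the authors' implementation. The abstract does not give the state, action, or reward definitions, so everything here is an illustrative assumption: the grid world, the five-move action set, the sigmoid LoS probability with parameters `a` and `b`, the path-loss exponents, the reference SNR `snr_ref`, and the node positions `bob` and `eve`. Only the secrecy-rate formula (the positive part of the rate difference) and the tabular Q-learning update are standard.

```python
import math
import random

def secrecy_rate(snr_legit, snr_eve):
    """Secrecy rate in bit/s/Hz: positive part of the rate difference."""
    return max(0.0, math.log2(1.0 + snr_legit) - math.log2(1.0 + snr_eve))

def los_probability(horizontal_dist, altitude, a=10.0, b=0.2):
    """Sigmoid LoS probability in the elevation angle (degrees).
    The shape parameters a and b are assumed, not taken from the paper."""
    theta = math.degrees(math.atan2(altitude, horizontal_dist))
    return 1.0 / (1.0 + a * math.exp(-b * (theta - a)))

def mean_snr(uav_xy, node_xy, altitude=100.0, snr_ref=1e6,
             alpha_los=2.0, alpha_nlos=3.0):
    """Average SNR under a LoS/NLoS mixture (all constants are assumptions)."""
    d_h = math.dist(uav_xy, node_xy)
    d = math.hypot(d_h, altitude)
    p = los_probability(d_h, altitude)
    return snr_ref * (p * d ** -alpha_los + (1.0 - p) * d ** -alpha_nlos)

def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state])
    q[state][action] += lr * (reward + gamma * best_next - q[state][action])

# Hypothetical action set: hover, or move one cell along x or y.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def train(grid=8, cell=50.0, bob=(350.0, 350.0), eve=(50.0, 350.0),
          episodes=200, steps=30, eps=0.1, seed=0):
    """Learn a Q-table over grid cells with the secrecy rate as reward."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS)
         for x in range(grid) for y in range(grid)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            dx, dy = ACTIONS[action]
            # Clamp the move so the UAV stays inside the grid.
            nxt = (min(max(state[0] + dx, 0), grid - 1),
                   min(max(state[1] + dy, 0), grid - 1))
            pos = (nxt[0] * cell, nxt[1] * cell)
            reward = secrecy_rate(mean_snr(pos, bob), mean_snr(pos, eve))
            q_update(q, state, action, reward, nxt)
            state = nxt
    return q
```

Because the reward is evaluated at each step from the current positions, the same update loop could in principle keep running while the environment changes, which is the sense in which such a method is online rather than offline.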