Journal of Guangdong University of Technology ›› 2023, Vol. 40 ›› Issue (04): 85-93. DOI: 10.12052/gdutxb.220159

• Computer Science and Technology •

An Improved Double Deep Q Network for Multi-user Dynamic Spectrum Access

He Yi-shan, Wang Yong-hua, Wan Pin, Wang Lei, Wu Wen-tao   

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-10-19  Online: 2023-07-25  Published: 2023-08-02

Abstract: With the rapid development of mobile communication technology, the tension between limited spectrum resources and the growing demand for spectrum communication is becoming increasingly acute, and new intelligent methods are needed to improve spectrum utilization. A multi-user dynamic spectrum access method based on a distributed prioritized experience pool and a double deep Q network is proposed. The method enables secondary users in a dynamic environment to learn continuously from the environment information they sense and to select idle channels for spectrum access, thereby improving spectrum utilization. A distributed reinforcement learning framework is adopted in which each secondary user is treated as an agent, and each agent learns with a standard single-agent reinforcement learning method, which reduces the underlying computational overhead. In addition, prioritized sampling is introduced into the training of the neural network, which improves training efficiency and helps secondary users select the optimal access strategy. Simulation results show that the method improves the access success rate, reduces the collision rate, and increases the communication rate.
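The method described above combines two standard reinforcement-learning components on a per-agent basis: a prioritized experience pool and a double deep Q network. The following Python sketch illustrates these two pieces in isolation, a proportional prioritized replay buffer and the double-DQN target computation. It is not the authors' implementation; the class and function names, the simplified buffer (no sum-tree), the hyper-parameters, and the toy usage at the end are assumptions made for illustration only.

```python
# Minimal sketch (assumptions, not the paper's code): proportional prioritized
# experience replay plus the double-DQN target, per secondary-user agent.
import numpy as np

class PrioritizedReplay:
    """Proportional prioritized experience pool (simplified, no sum-tree)."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.buffer, self.priorities = [], []

    def add(self, transition, priority=1.0):
        if len(self.buffer) >= self.capacity:      # drop the oldest when full
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, beta=0.4):
        p = np.array(self.priorities) ** self.alpha
        p /= p.sum()                               # sampling probabilities
        idx = np.random.choice(len(self.buffer), batch_size, p=p)
        weights = (len(self.buffer) * p[idx]) ** (-beta)
        weights /= weights.max()                   # importance-sampling weights
        return idx, [self.buffer[i] for i in idx], weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(e) + eps      # priority = |TD error|

def double_dqn_targets(q_online, q_target, rewards, dones, gamma=0.99):
    """Double-DQN target: action chosen by the online net, valued by the target net.
    q_online, q_target: (batch, n_channels) Q-values for the next state."""
    greedy_a = q_online.argmax(axis=1)                       # selection (online)
    next_q = q_target[np.arange(len(greedy_a)), greedy_a]    # evaluation (target)
    return rewards + gamma * (1.0 - dones) * next_q

# Toy usage: one secondary user (agent) choosing among 4 candidate channels.
rng = np.random.default_rng(0)
pool = PrioritizedReplay(capacity=100)
for _ in range(32):
    pool.add(("s", rng.integers(4), rng.random(), "s_next", False))
idx, batch, w = pool.sample(8)
rewards = np.array([t[2] for t in batch])
dones = np.zeros(8)
y = double_dqn_targets(rng.random((8, 4)), rng.random((8, 4)), rewards, dones)
pool.update_priorities(idx, td_errors=y - rng.random(8))     # refresh priorities
```

In a distributed deployment, each secondary user would maintain its own experience pool and pair of Q networks and train independently, which is what keeps the per-agent computation close to standard single-agent deep Q learning.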

Key words: dynamic spectrum access, distributed reinforcement learning, prioritized experience pool, deep reinforcement learning

CLC Number: TN929.5