Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (05): 39-47,71. doi: 10.12052/gdutxb.230203

• Electrical Engineering •

Reinforcement Learning Model for Automatic Inspection Route Based on Multi-agent Attention Mechanism

Ou Jia-jun, Zeng Wei-liang, Li Yu-feng, Fan Jing-min

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2023-12-15 Online: 2024-09-25 Published: 2024-10-08
  • Corresponding author: Zeng Wei-liang (b. 1986), male, associate professor, Ph.D.; main research interest: path planning for large-scale road networks. E-mail: weiliangzeng@gdut.edu.cn
  • About the author: Ou Jia-jun (b. 2000), male, master's student; main research interests: path planning and interdisciplinary applications of AI. E-mail: kingsely@mail2.gdut.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62273102, 62073084); Guangdong Basic and Applied Basic Research Foundation (2024A1515010629)

Abstract: Reasonable task allocation and inspection route planning are crucial for robots to efficiently replace engineers in inspecting dangerous areas of substations. However, most existing studies are limited to planning fixed shortest inspection paths for substation equipment and rarely consider differences in equipment inspection times and inspection levels. To further improve the effectiveness and flexibility of substation inspection, this study builds a dynamic inspection path planning model that accounts for differences in inspection time, equipment inspection level, and the number of devices to be inspected. Because the proposed model is NP-hard, a solution strategy based on reinforcement learning and a multi-agent attention mechanism is proposed, which first generates inspection paths using an encoder-decoder framework with an attention layer and then trains the framework with an unsupervised neural network. Finally, a substation of China Southern Power Grid is used as the experimental site to validate the model. Compared with the Genetic Algorithm (GA), the Hierarchical Variable Neighborhood Search algorithm (HVNS), and the Adaptive Parallel Memetic Multi-Elite Ant System algorithm (APMMEAS), the proposed algorithm shortens the path distance by 3.31%, 1.24%, and 1.73%, respectively; shortens the planning time by 17.06%, 16.22%, and 13.89%, respectively; and lowers the cost of a single inspection by 21.22%, 6.86%, and 9.14%, respectively.

Key words: multi-agent, power substation, path planning, reinforcement learning, attention mechanism
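The abstract names the technique (an attention-based encoder-decoder that builds routes step by step) but gives no details, so the following is only an illustrative sketch and not the authors' model. It assumes node embeddings are already available (random stand-ins here for encoder outputs), that node 0 is a depot, and that robots pick their next device in a simple round-robin order. The function names `attention_scores`, `softmax`, and `decode_routes`, the tanh clipping constant, and the turn order are all hypothetical choices made for illustration.

```python
import math
import random


def attention_scores(query, keys, mask, clip=10.0):
    """Scaled dot-product compatibility between the query and every node.

    Nodes flagged in `mask` (already visited) get -inf so they can never
    be selected. The tanh clipping is an assumed, commonly used trick.
    """
    d = len(query)
    scores = []
    for key, blocked in zip(keys, mask):
        if blocked:
            scores.append(-math.inf)
        else:
            dot = sum(q * k for q, k in zip(query, key))
            scores.append(clip * math.tanh(dot / math.sqrt(d)))
    return scores


def softmax(scores):
    """Softmax over scores; masked (-inf) entries receive probability 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_routes(embeddings, n_agents, greedy=True, rng=None):
    """Build one route per agent by repeated masked-attention selection.

    Returns the routes (each starting and ending at depot 0) and the summed
    log-probability of the choices, which a policy-gradient learner could use.
    """
    rng = rng or random.Random(0)
    n = len(embeddings)
    visited = [False] * n
    visited[0] = True  # the depot is never an inspection target
    routes = [[0] for _ in range(n_agents)]
    log_prob = 0.0
    agent = 0
    while not all(visited):
        # Assumption: the query is simply the embedding of the agent's current node.
        query = embeddings[routes[agent][-1]]
        probs = softmax(attention_scores(query, embeddings, visited))
        if greedy:
            nxt = max(range(n), key=probs.__getitem__)
        else:
            nxt = rng.choices(range(n), weights=probs)[0]
        log_prob += math.log(probs[nxt])
        visited[nxt] = True
        routes[agent].append(nxt)
        agent = (agent + 1) % n_agents  # assumed round-robin turn order
    for route in routes:
        route.append(0)  # every robot returns to the depot
    return routes, log_prob


# Usage with made-up embeddings: 1 depot + 6 devices, 2 robots.
demo_rng = random.Random(42)
demo_embeddings = [[demo_rng.gauss(0, 1) for _ in range(4)] for _ in range(7)]
demo_routes, demo_log_prob = decode_routes(demo_embeddings, n_agents=2)
print(demo_routes, demo_log_prob)
```

In a reinforcement learning setting, the returned log-probability would typically be weighted by the difference between the route cost and a baseline (as in a REINFORCE-style policy gradient); whether the paper uses exactly this update is not stated in the abstract.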

CLC Number: 

  • TM732