Reinforcement Learning Model for Automatic Inspection Route Based on Multi-agent Attention Mechanism

doi:10.12052/gdutxb.230203

Abstract

Abstract: Reasonable task allocation and inspection route planning are key to enabling robots to efficiently replace engineers in inspecting hazardous areas of substations. However, most existing studies focus solely on planning a fixed shortest inspection path for substation equipment and rarely consider differences in equipment inspection time and inspection level. To further improve the effectiveness and flexibility of substation inspection, this study builds a dynamic inspection path planning model that accounts for differences in inspection time, equipment inspection level, and the number of devices to be inspected. Because the resulting model is NP-hard, a solution strategy based on reinforcement learning and a multi-agent attention mechanism is proposed: inspection paths are first generated by an encoder-decoder framework with an attention layer, and the framework is then trained and optimized with an unsupervised neural network. Finally, a substation of China Southern Power Grid is used as the experimental site to validate the model. Compared with the genetic algorithm (GA), the hierarchical variable neighborhood search algorithm (HVNS), and the adaptive parallel memetic multi-elite ant system algorithm (APMMEAS), the proposed algorithm shortens the path distance by 3.31%, 1.24%, and 1.73%, respectively; shortens the planning time by 17.06%, 16.22%, and 13.89%, respectively; and lowers the cost of a single inspection by 21.22%, 6.86%, and 9.14%, respectively, demonstrating a clear advantage.

Key words: multi-agent, power substation, path planning, reinforcement learning, attention mechanism
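The abstract says that inspection paths are generated by an encoder-decoder framework with an attention layer. As a rough illustration of what a single attention-based decoding step can look like, the sketch below scores unvisited devices by scaled dot-product attention and masks visited ones. It is a minimal single-agent sketch and not the authors' model: the function names `masked_attention_probs` and `greedy_route` are hypothetical, the embeddings are hand-made rather than learned by an encoder, and inspection time, inspection level, and multi-agent coordination are not modelled.

```python
import math


def masked_attention_probs(query, keys, visited):
    """Softmax over scaled dot-product scores; visited nodes get probability 0."""
    d = len(query)
    scores = []
    for key, seen in zip(keys, visited):
        if seen:
            scores.append(-math.inf)  # mask: a visited node cannot be selected again
        else:
            dot = sum(q * k for q, k in zip(query, key))
            scores.append(dot / math.sqrt(d))
    top = max(scores)
    if top == -math.inf:
        raise ValueError("all nodes have already been visited")
    # Subtract the maximum for numerical stability; exp(-inf) evaluates to 0.0.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_route(embeddings, start=0):
    """Visit every node once, always moving to the highest-probability node.

    Assumption: the embedding of the current node serves as the query.
    A trained decoder would build a richer context vector instead.
    """
    n = len(embeddings)
    visited = [False] * n
    visited[start] = True
    route = [start]
    while len(route) < n:
        probs = masked_attention_probs(embeddings[route[-1]], embeddings, visited)
        nxt = max(range(n), key=probs.__getitem__)
        visited[nxt] = True
        route.append(nxt)
    return route
```

In a reinforcement learning setting, the next device would typically be sampled from these probabilities during training rather than chosen greedily, so that a policy-gradient update can reinforce low-cost routes.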

CLC number: TM732

Ou Jia-jun, Zeng Wei-liang, Li Yu-feng, Fan Jing-min. Reinforcement Learning Model for Automatic Inspection Route Based on Multi-agent Attention Mechanism[J]. Journal of Guangdong University of Technology, 2024, 41(05): 39-47,71. doi: 10.12052/gdutxb.230203

