Xie Guang-qiang (谢光强), Ning Kai-xin (宁凯鑫), Li Yang (李杨)
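Abstract: In mixed traffic where connected autonomous vehicles (CAVs) coexist with human-driven vehicles (HDVs), merging at highway on-ramps is a challenging problem. Vehicles of different types compete for road space through maneuvers such as merging and lane changing, and this competition disturbs traffic flow. HDV behavior is difficult for CAVs to predict accurately and adapt to, which raises the risk of merging, lowers traffic efficiency, and causes congestion. Traditional reinforcement learning algorithms struggle to search for good policies in complex environments and tend to become trapped in local optima, so they cope poorly with complex traffic conditions and produce imprecise merging decisions. To address these problems, the ESACD (Evolutionary Soft Actor-critic for Discrete Action Settings) algorithm is proposed, in which CAVs cooperate and adapt to the strategies of HDVs to maximize traffic throughput. First, a parent selection and crossover method based on rank selection is proposed to model the interacting population. Second, an elastic training population built on multiple populations is designed to improve the adaptability of CAVs to dynamically changing traffic flow. Finally, a secondary assessment mechanism based on fitness evaluation is proposed. Simulation experiments under two different traffic densities show that, compared with the conventional Soft Actor-critic (SAC) algorithm, the proposed algorithm completes the on-ramp merging task for connected vehicles more efficiently, with a notable overall improvement. This verifies that the algorithm can improve training efficiency and increase traffic throughput.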
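The abstract names rank-based parent selection followed by crossover but gives no further detail. The sketch below is one illustrative reading, not the ESACD procedure itself. It assumes linear ranking (selection probability proportional to rank), single-point crossover, and policies represented as flat parameter lists; the function names `rank_selection_probs`, `select_parents`, and `single_point_crossover` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def rank_selection_probs(fitness: Sequence[float]) -> List[float]:
    """Linear ranking (an assumption): worst individual gets rank 1,
    best gets rank n, and probability = rank / sum of all ranks."""
    n = len(fitness)
    order = sorted(range(n), key=lambda i: fitness[i])  # ascending fitness
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = n * (n + 1) // 2
    return [r / total for r in ranks]


def select_parents(fitness: Sequence[float], rng: random.Random) -> Tuple[int, int]:
    """Draw two distinct parent indices using rank-based probabilities.
    Requires a population of at least two individuals."""
    probs = rank_selection_probs(fitness)
    indices = range(len(fitness))
    first = rng.choices(indices, weights=probs, k=1)[0]
    # Zero out the first parent's weight so the second parent differs.
    weights = [0.0 if i == first else p for i, p in enumerate(probs)]
    second = rng.choices(indices, weights=weights, k=1)[0]
    return first, second


def single_point_crossover(
    a: List[float], b: List[float], rng: random.Random
) -> Tuple[List[float], List[float]]:
    """Swap the tails of two equal-length parameter lists at a random cut.
    Requires at least two parameters per parent."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


if __name__ == "__main__":
    rng = random.Random(0)
    fitness = [10.0, 30.0, 20.0]  # hypothetical episode returns
    population = [[0.1, 0.2, 0.3, 0.4], [1.1, 1.2, 1.3, 1.4], [2.1, 2.2, 2.3, 2.4]]
    i, j = select_parents(fitness, rng)
    child_a, child_b = single_point_crossover(population[i], population[j], rng)
    print(i, j, child_a, child_b)
```

Ranking on order rather than raw fitness keeps selection pressure stable when returns differ by orders of magnitude between individuals, which is a common reason to prefer it over fitness-proportional selection.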
CLC number: