A Behavior Decision Method for Autonomous Vehicles Based on Spatiotemporal Uncertainty Modeling

    • Abstract: Autonomous vehicles crossing intersections face substantial uncertainty in behavior decision-making, because multi-vehicle interactions are complex and environmental information is incomplete. To address this problem, this paper proposes a deep reinforcement learning-based behavior decision method for autonomous vehicles, built around the Graph Attention and GRU enhanced Deep Deterministic Policy Gradient (GAG-DDPG) algorithm. First, the epistemic and aleatoric uncertainties in autonomous driving scenarios are modeled and quantified to support traffic risk assessment. Second, to capture the dynamic interactions among vehicles and their temporal dependencies, a Graph Attention Network (GAT) and a Gated Recurrent Unit (GRU) are integrated into the policy network. The GAT uses a multi-head attention mechanism to model inter-vehicle interactions and adaptively focus on critical vehicles, while the GRU captures the temporal evolution of the traffic scene. Together they produce a spatiotemporal interaction-aware state representation that provides a reliable feature basis for uncertainty-aware risk modeling. Finally, Conditional Value at Risk (CVaR) is introduced to build a risk-sensitive decision-making mechanism, allowing the agent to select more robust actions in uncertain environments and to balance safety against exploration. Experimental results show that the proposed GAG-DDPG method improves both the passing success rate and the driving smoothness of vehicles in intersection left-turn scenarios, and thereby improves behavior decision-making performance in complex and uncertain traffic environments.
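    The abstract does not state how CVaR is estimated, so the following is only a minimal sketch of the general idea rather than the paper's formulation. It assumes that a set of sampled returns is available for each candidate action and scores an action by the mean of its worst alpha-fraction of returns (lower-tail CVaR). The function names, the discrete action labels, and the sample values are all hypothetical; DDPG itself operates on continuous actions.

```python
import math


def cvar_lower_tail(returns, alpha=0.1):
    """Empirical lower-tail CVaR: mean of the worst ceil(alpha * n) returns."""
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)  # ascending, so the worst returns come first
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / k


def pick_risk_sensitive_action(samples_by_action, alpha=0.1):
    """Choose the action whose worst-case tail average is highest."""
    return max(
        samples_by_action,
        key=lambda action: cvar_lower_tail(samples_by_action[action], alpha),
    )


# Hypothetical return samples for two candidate maneuvers (illustrative only).
samples = {
    "go": [10.0, 10.0, 10.0, 10.0, -20.0],   # higher mean, rare severe outcome
    "yield": [3.0, 3.0, 3.0, 3.0, 3.0],      # lower mean, no bad tail
}

risk_neutral = max(samples, key=lambda a: sum(samples[a]) / len(samples[a]))
risk_sensitive = pick_risk_sensitive_action(samples, alpha=0.2)
```

    In this toy setting a mean-based criterion prefers "go", whereas the CVaR criterion prefers "yield" because it penalizes the rare severe outcome. Setting alpha to 1.0 recovers the ordinary mean.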

       
