Abstract:
To address the slow Q-value updates of the Q-learning algorithm in complex unknown environments and its susceptibility to the curse of dimensionality, a virtual-subtarget-based Q-learning path planning algorithm for mobile robots in unknown environments is proposed. Based on the state trajectory explored by the mobile robot, two state chains are built, one recording state-action pairs and the other recording state-reverse-action pairs. In each chain, the Q-value of the current state is fed back to the Q-value of the previous state in turn, until the update reaches the head of the chain. In addition, the curse of dimensionality that Q-learning encounters in large-scale environments is addressed by searching for the optimal virtual subtarget within the local detection domain. Experimental results show that the proposed algorithm effectively accelerates learning convergence, improves learning efficiency, and completes the robot navigation task with a better path in complex unknown environments.
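
The backward feedback along a state chain can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: it assumes the standard tabular Q-learning update rule, a single forward chain of (state, action, reward, next state) records, and hypothetical names and values (backward_chain_update, ALPHA, GAMMA, the toy states s0 to s2). The second chain of state-reverse-action pairs and the virtual subtarget search are not modeled.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value, not taken from the paper)
GAMMA = 0.9  # discount factor (assumed value, not taken from the paper)


def backward_chain_update(q, chain, actions, alpha=ALPHA, gamma=GAMMA):
    """Replay a recorded chain from tail to head with the standard Q-learning rule.

    q       : mapping (state, action) -> Q-value, defaulting to 0.0
    chain   : list of (state, action, reward, next_state) in visiting order
    actions : all actions available in every state (simplifying assumption)
    """
    for state, action, reward, next_state in reversed(chain):
        # Best value reachable from the successor state under the current table.
        best_next = max(q[(next_state, a)] for a in actions)
        td_target = reward + gamma * best_next
        q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q


# Toy trajectory: three moves to the right, reward only on reaching the goal.
actions = ["left", "right"]
q = defaultdict(float)
chain = [
    ("s0", "right", 0.0, "s1"),
    ("s1", "right", 0.0, "s2"),
    ("s2", "right", 1.0, "goal"),
]
backward_chain_update(q, chain, actions)
```

In this toy run, one backward pass gives all three visited state-action pairs a nonzero Q-value, whereas updating the same records in visiting order would change only the pair adjacent to the goal. This is the intuition behind feeding each Q-value back toward the head of the chain.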