Journal of Guangdong University of Technology, 2020, Vol. 37, Issue (05): 46-50. doi: 10.12052/gdutxb.200009
Ye Wei-jie, Gao Jun-li, Jiang Feng, Guo Jing
[1] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search [J]. Nature, 2016, 529(7587): 484.
[2] SUTTON R S, BARTO A G. Introduction to reinforcement learning [M]. Cambridge: MIT Press, 1998: 4-6.
[3] IRPAN A. Deep reinforcement learning doesn't work yet [EB/OL]. (2018-02-14)[2018-02-14]. https://www.alexirpan.com/2018/02/14/rl-hard.html.
[4] LEVINE S, PASTOR P, KRIZHEVSKY A, et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection [J]. The International Journal of Robotics Research, 2018, 37(4-5): 421-436.
[5] HESSEL M, MODAYIL J, VAN HASSELT H, et al. Rainbow: combining improvements in deep reinforcement learning [C]//Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, Louisiana: AAAI, 2018.
[6] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms [J]. arXiv preprint arXiv:1707.06347, 2017.
[7] WANG J, HU J, MIN G, et al. Computation offloading in multi-access edge computing using a deep sequential model based on reinforcement learning [J]. IEEE Communications Magazine, 2019, 57(5): 64-69.
[8] WU Y X, ZENG B. Trajectory tracking and dynamic obstacle avoidance of mobile robot based on deep reinforcement learning [J]. Journal of Guangdong University of Technology, 2018, 36(1): 42-50. (in Chinese)
[9] STOOKE A, ABBEEL P. Accelerated methods for deep reinforcement learning [J]. arXiv preprint arXiv:1803.02811, 2018.
[10] HENDERSON P, CHANG W D, SHKURTI F, et al. Benchmark environments for multitask learning in continuous domains [J]. arXiv preprint arXiv:1708.04352, 2017.
[11] TAI L, PAOLO G, LIU M. Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation [C]//2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). [S.l.]: IEEE, 2017: 31-36.
[12] FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods [J]. arXiv preprint arXiv:1802.09477, 2018.
[13] BROCKMAN G, CHEUNG V, PETTERSSON L, et al. OpenAI Gym [J]. arXiv preprint arXiv:1606.01540, 2016.
[14] TAKAYA K, ASAI T, KROUMOV V, et al. Simulation environment for mobile robots testing using ROS and Gazebo [C]//2016 20th International Conference on System Theory, Control and Computing (ICSTCC). Sinaia: IEEE, 2016: 96-101.
[15] BUŞONIU L, BABUŠKA R, DE SCHUTTER B. Multi-agent reinforcement learning: an overview [M]. Berlin, Heidelberg: Springer, 2010: 183-221.