Journal of Guangdong University of Technology ›› 2021, Vol. 38 ›› Issue (06): 29-34. doi: 10.12052/gdutxb.210105


A Data-Driven Prescribed Convergence Rate Design for Robust Tracking of Discrete-Time Systems

Chen Ci1,2, Xie Li-hua3   

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China;
    2. Guangdong Key Laboratory of IoT Information Technology, Guangzhou 510006, China;
    3. School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
  Received: 2021-07-08    Online: 2021-11-10    Published: 2021-11-09

Abstract: A robust tracking control problem with a prescribed convergence rate is considered for linear discrete-time systems. The tracking problem is formulated via robust output regulation and is then solved by reinforcement learning into which the prescribed convergence rate is integrated. The learned controller guarantees that the tracking error converges to zero asymptotically while remaining robust to uncertain system dynamics. The proposed convergence rate design is data-driven in the sense that it relies neither on the elapsed time of the system evolution nor on an accurate system model.
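The abstract points to a value-iteration-based, data-driven design. As a rough illustration of the general idea only, and not the algorithm of this paper, the sketch below runs model-free value iteration on a quadratic Q-function for a discrete-time linear-quadratic regulation problem, with a discount factor tied to a prescribed decay rate 1/alpha. The plant matrices, cost weights, and rate are illustrative assumptions; the simulated plant is used only to generate data and is never accessed by the learner.

import numpy as np

np.random.seed(0)

# Illustrative second-order plant (assumption; not the example from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
n, m = A.shape[0], B.shape[1]

Qx = np.eye(n)          # state weight
Ru = 0.1 * np.eye(m)    # input weight
alpha = 1.05            # prescribed decay rate (assumption)
gamma = 1.0 / alpha**2  # discount factor tied to the prescribed rate

def quad_features(z):
    """Independent entries of z z^T (upper triangle), for least squares."""
    zz = np.outer(z, z)
    idx = np.triu_indices(len(z))
    w = np.where(idx[0] == idx[1], 1.0, 2.0)  # off-diagonal terms count twice
    return w * zz[idx]

def unpack_H(theta, dim):
    """Rebuild the symmetric Q-function matrix from its upper-triangle vector."""
    H = np.zeros((dim, dim))
    H[np.triu_indices(dim)] = theta
    return H + H.T - np.diag(np.diag(H))

# Collect exploratory data (x_k, u_k, x_{k+1}) with a noisy behavior policy.
N = 200
X, U, Xn = [], [], []
x = np.array([1.0, -1.0])
for k in range(N):
    u = 0.5 * np.random.randn(m)
    xn = A @ x + B @ u
    X.append(x); U.append(u); Xn.append(xn)
    x = xn if np.linalg.norm(xn) < 10 else np.random.randn(n)

# Model-free value iteration on Q(x, u) = [x; u]^T H [x; u].
H = np.zeros((n + m, n + m))
for it in range(200):
    Huu = H[n:, n:] + 1e-8 * np.eye(m)   # regularize the all-zero initialization
    Hux = H[n:, :n]
    K = np.linalg.solve(Huu, Hux)        # greedy policy u = -K x
    P = H[:n, :n] - Hux.T @ K            # cost-to-go implied by the current H

    # Least-squares Bellman update built from data only.
    Phi, y = [], []
    for xk, uk, xk1 in zip(X, U, Xn):
        z = np.concatenate([xk, uk])
        target = xk @ Qx @ xk + uk @ Ru @ uk + gamma * (xk1 @ P @ xk1)
        Phi.append(quad_features(z)); y.append(target)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H_new = unpack_H(theta, n + m)
    if np.max(np.abs(H_new - H)) < 1e-8:
        break
    H = H_new

print("learned feedback gain K =", K)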

Key words: reinforcement learning, prescribed convergence rate, data-driven, value iteration, tracking control

CLC Number: TP273