Journal of Guangdong University of Technology ›› 2021, Vol. 38 ›› Issue (06): 29-34. doi: 10.12052/gdutxb.210105
Chen Ci 1,2, Xie Li-hua 3