基于自适应动态规划的能量管理系统研究综述

doi:10.12052/gdutxb.220029

Abstract

Abstract: Adaptive Dynamic Programming (ADP) algorithm, as a research focus in the field of optimal control, has been widely applied in the field of Energy Management System (EMS). ADP is an effective tool for solving optimal control problems in complex nonlinear systems, which can adaptively adjust the control strategy through the input and output data of the system. The research progress of ADP algorithm and its application in EMS are introduced. Then the research status and algorithm implementation of ADP algorithm in discrete EMS and continuous EMS are analyzed. And at last the Real-time Adaptive Dynamic Programming (RT-ADP) algorithm and its feasibility are introduced.

Key words: adaptive dynamic programming, energy management system, real-time control

CLC Number:

TP183

Yuan Jun, Zhang Yun, Zhang Gui-dong, Li Zhong, Chen Zhe, Yu Sheng-long. A Survey of Energy Management System Based on Adaptive Dynamic Programming[J].Journal of Guangdong University of Technology, 2022, 39(05): 21-28.

References

[1] 康重庆. 能源互联网促进实现“双碳”目标[J]. 全球能源互联网, 2021, 4(3): 205-206.
KANG C Q. Energy internet promotes "dual carbon" goal [J]. Global Energy Internet, 2021, 4(3): 205-206.
[2] 白建华, 辛颂旭, 刘俊, 等. 中国实现髙比例可再生能源发展路径研究[J]. 中国电机工程学报, 2015, 35(14): 284-287.
BAI J H, XIN S X, LIU J, et al. Research on the development path of renewable energy in China [J]. China Electric Machinery Journal of engineering, 2015, 35(14): 284-287.
[3] MARIANO-HERNáNDEZ D, HERNáNDEZ-CALLEJO L, et al. A review of strategies for building energy management system: model predictive control, demand side management, optimization, and fault detect diagnosis [J]. Journal of Building Engineering, 2021, 33(1): 352-371.
[4] WERBOS P J. Computational intelligence for the smart grid-history, challenges, and opportunities [J]. IEEE Computational Intelligence Magazine, 2011, 6(3): 14-21.
[5] LIU D, XUE S, ZHAO B, LUO B, et al. Adaptive dynamic programming for control: a survey and recent advances [J]. IEEE Transactions on Systems, Man, and Cybernetics:Systems, 2021, 51(1): 142-160.
[6] 陈辞, 谢立华. 具有指定收敛速度的离散系统鲁棒跟踪数据驱动设计[J]. 广东工业大学学报, 2021, 38(6): 29-34.
CHEN C, XIE L H. A data-driven prescribed convergence rate design for robust tracking of discrete-time systems [J]. Journal of Guangdong University of Technology, 2021, 38(6): 29-34.
[7] BOARO M, FUSELLI D, ANGELIS F D, et al. Adaptive dynamic programming algorithm for renewable energy scheduling and battery management [J]. Cognitive Computation, 2013, 5(2): 264-277.
[8] WEI Q, LIU D, SHI G. A novel dual iterative Q-learning method for optimal battery management in smart residential environments [J]. IEEE Transactions on Industrial Electronics, 2015, 62(4): 2509-2518.
[9] WANG D, LIU D, WEI Q L, et al. Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming [J]. Automatica, 2012, 48(8): 1825-1832.
[10] LU Z X, LI H B, QIAO Y. Power system flexibility planning and challenges with high proportion of renewable energy [J]. Power System automation, 2016, 40(13): 147-158.
[11] KONG L H, HE W, YANG C G, et al. Robust neuro-optimal control for a robot via adaptive dynamic programming [J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 32(6): 2584-2594.
[12] FERRARI S, SARANGAPANI J, LEWIS F L. Special issue on approximate dynamic programming and reinforcement learning [J]. Journal of Control Theory and Applications, 2011, 9(3): 309.
[13] WANG D, LIU D R. Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique [J]. Neurocomputing, 2013, 121(1): 218-225.
[14] NI Z, HE H B, WEN J Y, et al. Goal representation heuristic dynamic programming on maze navigation [J]. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(12): 2038-2050.
[15] LIU D R, WANG D, ZHAO D B, et al. Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming [J]. IEEE Transactions on Automation Science and Engineering, 2012, 9(3): 628-634.
[16] HE H B, NI Z, FU J. A three-network architecture for on-line learning and optimization based on adaptive dynamic programming [J]. Neurocomputing, 2012, 78(1): 3-13.
[17] XU X, HOU Z S, LIAN C Q, et al. Online learning control using adaptive critic designs with sparse kernel machines [J]. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(5): 762-775.
[18] 宋浩楠, 赵刚, 孙若莹. 基于深度强化学习的知识推理研究进展综述[J]. 计算机工程与应用, 2022, 58(1): 12-25.
SONG H N, ZHAO G, SUN R Y. Developments of knowledge reasoning based on deep reinforcement learning [J]. Computer Engineering and Applications, 2022, 58(1): 12-25.
[19] XU X, HU D, LU X. Kernel-Based least squares policy iteration for reinforcement learning [J]. IEEE Transactions on Neural Networks, 2007, 18(4): 973-992.
[20] XU X, YANG H, et al. Self-Learning control using dual heuristic programming with global laplacian eigenmaps [J]. IEEE Transactions on Industrial Electronics, 2017, 64(12): 9517-9526.
[21] 刘毅, 章云. 一种基于自适应动态规划的协同优化算法[J]. 广东工业大学学报, 2017, 34(6): 15-19.
LIU Y, ZHANG Y. A cooperative optimization algorithm based on adaptive dynamic programming [J]. Journal of Guangdong University of Technology, 2017, 34(6): 15-19.
[22] MURRAY J J, COXC C J, LENDARIS G G, et al. Adaptive dynamic programming [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2002, 32(2): 140-153.
[23] LEE J Y, PARK J B, CHOI Y H. On integral generalized policy iteration for continuous-time linear quadratic regulations [J]. Automatica, 2014, 50(2): 475-489.
[24] LEWIS F L, WEI Q L. Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero-sum games [J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, 28(3): 704-713.
[25] LIU D R, WEI Q L, WANG D, et al. Adaptive dynamic programming with applications in optimal control[M]. Cham, Switzerland: Springer, 2017.
[26] LIU D R, WEI Q L. Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems [J]. IEEE Transactions on Neural Networks and Learning Systems, 2013, 25(3): 621-634.
[27] WEI Q L, LIU D R, LIN Q, et al. Adaptive dynamic programming for discrete-time zero-sum games [J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 29(4): 957-969.
[28] 刘毅, 章云. 基于值迭代的自适应动态规划的收敛条件[J]. 广东工业大学学报, 2017, 34(5): 10-14.
LIU Y, ZHANG Y. Convergence condition of value-iteration based adaptive dynamic programming [J]. Journal of Guangdong University of Technology, 2017, 34(5): 10-14.
[29] AL-TAMIMI A, LEWIS F L, ABU-KHALAF M. Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2008, 38(4): 943-949.
[30] SUN T, SUN X M. An adaptive dynamic programming scheme for nonlinear optimal control with unknown dynamics and its application to turbofan engines [J]. IEEE Transactions on Industrial Informatics, 2020, 17(1): 367-376.
[31] WU N, WANG H L. Deep learning adaptive dynamic programming for real time energy management and control strategy of micro-grid [J]. Journal of Cleaner Production, 2018, 204(1): 1169-1177.
[32] DAVARI M, GAO W N, JIANG Z P, et al. An optimal primary frequency control based on adaptive dynamic programming for islanded modernized microgrids [J]. IEEE Transactions on Automation Science and Engineering, 2020, 18(3): 1109-1121.
[33] LI L, YAN B J, YANG C, et al. Application-oriented stochastic energy management for plug-in hybrid electric bus with AMT [J]. IEEE Transactions on Vehicular Technology, 2015, 65(6): 4459-4470.
[34] LIANG J K, TANG W Y. Optimal trading strategies in continuous double auctions for transactive energy[C]//2019 North American Power Symposium (NAPS). Wichita, KS, USA : IEEE, 2019.
[35] 林小峰, 丁强. 基于评价网络近似误差的自适应动态规划优化控制[J]. 控制与决策, 2015, 30(3): 495-499.
LIN X F, DING Q. Adaptive dynamic programming optimal control based on approximation error of critic network [J]. Control and Decision., 2015, 30(3): 495-499.
[36] MARINONI M, BUTTAZZO G. Adaptive DVS management through elastic scheduling[C]//2005 IEEE Conference on Emerging Technologies and Factory Automation. Catania, Italy: IEEE, 2005.
[37] WEI Q L, LIU D R, LEWIS F L, et al. Mixed iterative adaptive dynamic programming for optimal battery energy control in smart residential microgrids [J]. IEEE Transactions on Industrial Electronics, 2017, 64(5): 4110-4120.
[38] ZHU Y, ZHAO D, LI X. Iterative adaptive dynamic programming for solving unknown nonlinear zero-sum game based on online data [J]. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3): 714-725.
[39] ZHU J, HOU Y J, LI H. Optimal control of nonlinear systems with time delays: an online ADP perspective [J]. IEEE Access, 2019, 7(1): 145574-145581.
[40] HUANG M, JIANG Z P, et al. Learning-Based adaptive optimal control for connected vehicles in mixed traffic: robustness to driver reaction time [J]. IEEE Transactions on Cybernetics, 2022, 52(6): 267-5277.
[41] LEWIS F L, SONG R Z, WEI Q L, et al. Multiple actor-critic structures for continuous-time optimal control using input-output data [J]. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(4): 851-865.
[42] KWON S, XU Y J, GAUTAM N. Meeting inelastic demand in systems with storage and renewable sources [J]. IEEE Transactions on Smart Grid, 2015, 8(4): 1619-1629.
[43] ZHU J Q, CHEN J J, ZHUO Y L, et al. Stochastic energy management of active distribution network based on improved approximate dynamic programming [J]. IEEE Transactions on Smart Grid, 2021, 13(1): 406-416.
[44] SHUAI H, FANG J K, AI X M, et al. Optimal real-time operation strategy for microgrid: an ADP-based stochastic nonlinear optimization approach [J]. IEEE Transactions on Sustainable Energy, 2018, 10(2): 931-942.
[45] LI W M, XU G Q, XU Y S. Online learning control for hybrid electric vehicle [J]. Chinese Journal of Mechanical Engineering, 2012, 25(1): 98-106.
[46] DE KEYSER A, VANSOMPEL H, CREVECOEUR G. Real-time energy-efficient actuation of induction motor drives using approximate dynamic programming [J]. IEEE Transactions on Industrial Electronics, 2020, 68(12): 11837-11846.
[47] LI Z M, WU L, XU Y, et al. Multi-stage real-time operation of a multi-energy microgrid with electrical and thermal energy storage assets: a data-driven mpc-adp approach [J]. IEEE Transactions on Smart Grid, 2021, 13(1): 213-226.
[48] XU X, LIAN C Q, ZUO L, et al. Kernel-based approximate dynamic programming for real-time online learning control: an experimental study [J]. IEEE Transactions on Control Systems Technology, 2013, 22(1): 146-156.
[49] WANG D, LIU D R, LI H L. Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems [J]. IEEE Transactions on Automation Science and Engineering, 2014, 11(2): 627-632.
[50] LIU J, HUANG Z, XU X, et al. Multi-kernel online reinforcement learning for path tracking control of intelligent vehicles [J]. IEEE Transactions on Systems, Man, and Cybernetics:Systems, 2021, 51(11): 6962-6975.
[51] WANG Z, LIU X P, LIU K F, et al. Backstepping-based lyapunov function construction using approximate dynamic programming and sum of square techniques [J]. IEEE Transactions on Cybernetics, 2016, 47(10): 3393-3403.
[52] LEWIS F L, VRABIE D, VAMVOUDAKIS K G. Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers [J]. IEEE Control Systems Magazine, 2012, 32(6): 76-105.
[53] YUAN J, SAMSON S Y, ZHANG G D, et al. Design and HIL realization of an online adaptive dynamic programming approach for real-time economic operations of household energy systems [J]. IEEE Transactions on Smart Grid, 2021, 13(1): 330-341.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

A Survey of Energy Management System Based on Adaptive Dynamic Programming

HTML

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 2

Metrics

Comments

Recommended 0

[1]	Liu Yi, Zhang Yun. A Cooperative Optimization Algorithm Based on Adaptive Dynamic Programming [J]. Journal of Guangdong University of Technology, 2017, 34(06): 15-19.
[2]	Liu Yi, Zhang Yun. Convergence Condition of Value-iteration Based Adaptive Dynamic Programming [J]. Journal of Guangdong University of Technology, 2017, 34(05): 10-14.