Liu / Wei / Wang Adaptive Dynamic Programming with Applications in Optimal Control

1st edition 2017
ISBN: 978-3-319-50815-3
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark

E-book, English, 609 pages

Series: Advances in Industrial Control

Adaptive Dynamic Programming with Applications in Optimal Control
1st edition 2017, 978-3-319-50813-9, book

234.33 € (incl. VAT)

Free shipping
Available immediately

This book covers the most recent developments in adaptive dynamic programming (ADP). The text begins with a thorough background review of ADP, making sure that readers are sufficiently familiar with the fundamentals. In the core of the book, the authors address first discrete- and then continuous-time systems. Coverage of discrete-time systems starts with a more general form of value iteration to demonstrate its convergence, optimality, and stability with complete and thorough theoretical analysis. A more realistic form of value iteration is then studied, in which value function approximations are assumed to have finite errors. Adaptive Dynamic Programming also details another avenue of the ADP approach: policy iteration. Both basic and generalized forms of policy-iteration-based ADP are studied with complete and thorough theoretical analysis in terms of convergence, optimality, stability, and error bounds. Among continuous-time systems, the control of affine and nonaffine nonlinear systems is studied using the ADP approach, which is then extended to other branches of control theory including decentralized control, robust and guaranteed cost control, and game theory. In the last part of the book, the real-world significance of ADP theory is presented, focusing on three application examples developed from the authors’ work:

• renewable energy scheduling for smart power grids;
• coal gasification processes; and
• water–gas shift reactions.

Researchers studying intelligent control methods and practitioners looking to apply them in the chemical-process and power-supply industries will find much to interest them in this thorough treatment of an advanced approach to control.

Target audience

Research

Authors/Editors

Liu, Derong

Wei, Qinglai

Wang, Ding

Yang, Xiong

Li, Hongliang

Further information & material

Table of contents

Foreword
Series Editors’ Foreword
  References
Preface
Acknowledgements
Contents
Abbreviations
Symbols
1 Overview of Adaptive Dynamic Programming
  1.1 Introduction
  1.2 Reinforcement Learning
  1.3 Adaptive Dynamic Programming
    1.3.1 Basic Forms of Adaptive Dynamic Programming
    1.3.2 Iterative Adaptive Dynamic Programming
    1.3.3 ADP for Continuous-Time Systems
    1.3.4 Remarks
  1.4 Related Books
  1.5 About This Book
  References
Part I Discrete-Time Systems
2 Value Iteration ADP for Discrete-Time Nonlinear Systems
  2.1 Introduction
  2.2 Optimal Control of Nonlinear Systems Using General Value Iteration
    2.2.1 Convergence Analysis
    2.2.2 Neural Network Implementation
    2.2.3 Generalization to Optimal Tracking Control
    2.2.4 Optimal Control of Systems with Constrained Inputs
    2.2.5 Simulation Studies
  2.3 Iterative θ-Adaptive Dynamic Programming Algorithm for Nonlinear Systems
    2.3.1 Convergence Analysis
    2.3.2 Optimality Analysis
    2.3.3 Summary of Iterative θ-ADP Algorithm
    2.3.4 Simulation Studies
  2.4 Conclusions
  References
3 Finite Approximation Error-Based Value Iteration ADP
  3.1 Introduction
  3.2 Iterative θ-ADP Algorithm with Finite Approximation Errors
    3.2.1 Properties of the Iterative ADP Algorithm with Finite Approximation Errors
    3.2.2 Neural Network Implementation
    3.2.3 Simulation Study
  3.3 Numerical Iterative θ-Adaptive Dynamic Programming
    3.3.1 Derivation of the Numerical Iterative θ-ADP Algorithm
    3.3.2 Properties of the Numerical Iterative θ-ADP Algorithm
    3.3.3 Summary of the Numerical Iterative θ-ADP Algorithm
    3.3.4 Simulation Study
  3.4 General Value Iteration ADP Algorithm with Finite Approximation Errors
    3.4.1 Derivation and Properties of the GVI Algorithm with Finite Approximation Errors
    3.4.2 Designs of Convergence Criteria with Finite Approximation Errors
    3.4.3 Simulation Study
  3.5 Conclusions
  References
4 Policy Iteration for Optimal Control of Discrete-Time Nonlinear Systems
  4.1 Introduction
  4.2 Policy Iteration Algorithm
    4.2.1 Derivation of Policy Iteration Algorithm
    4.2.2 Properties of Policy Iteration Algorithm
    4.2.3 Initial Admissible Control Law
    4.2.4 Summary of Policy Iteration ADP Algorithm
  4.3 Numerical Simulation and Analysis
  4.4 Conclusions
  References
5 Generalized Policy Iteration ADP for Discrete-Time Nonlinear Systems
  5.1 Introduction
  5.2 Generalized Policy Iteration-Based Adaptive Dynamic Programming Algorithm
    5.2.1 Derivation and Properties of the GPI Algorithm
    5.2.2 GPI Algorithm and Relaxation of Initial Conditions
    5.2.3 Simulation Studies
  5.3 Discrete-Time GPI with General Initial Value Functions
    5.3.1 Derivation and Properties of the GPI Algorithm
    5.3.2 Relaxations of the Convergence Criterion and Summary of the GPI Algorithm
    5.3.3 Simulation Studies
  5.4 Conclusions
  References
6 Error Bounds of Adaptive Dynamic Programming Algorithms
  6.1 Introduction
  6.2 Error Bounds of ADP Algorithms for Undiscounted Optimal Control Problems
    6.2.1 Problem Formulation
    6.2.2 Approximate Value Iteration
    6.2.3 Approximate Policy Iteration
    6.2.4 Approximate Optimistic Policy Iteration
    6.2.5 Neural Network Implementation
    6.2.6 Simulation Study
  6.3 Error Bounds of Q-Function for Discounted Optimal Control Problems
    6.3.1 Problem Formulation
    6.3.2 Policy Iteration Under Ideal Conditions
    6.3.3 Error Bound for Approximate Policy Iteration
    6.3.4 Neural Network Implementation
    6.3.5 Simulation Study
  6.4 Conclusions
  References
Part II Continuous-Time Systems
7 Online Optimal Control of Continuous-Time Affine Nonlinear Systems
  7.1 Introduction
  7.2 Online Optimal Control of Partially Unknown Affine Nonlinear Systems
    7.2.1 Identifier–Critic Architecture for Solving HJB Equation
    7.2.2 Stability Analysis of Closed-Loop System
    7.2.3 Simulation Study
  7.3 Online Optimal Control of Affine Nonlinear Systems with Constrained Inputs
    7.3.1 Solving HJB Equation via Critic Architecture
    7.3.2 Stability Analysis of Closed-Loop System with Constrained Inputs
    7.3.3 Simulation Study
  7.4 Conclusions
  References
8 Optimal Control of Unknown Continuous-Time Nonaffine Nonlinear Systems
  8.1 Introduction
  8.2 Optimal Control of Unknown Nonaffine Nonlinear Systems with Constrained Inputs
    8.2.1 Identifier Design via Dynamic Neural Networks
    8.2.2 Actor–Critic Architecture for Solving HJB Equation
    8.2.3 Stability Analysis of Closed-Loop System
    8.2.4 Simulation Study
  8.3 Optimal Output Regulation of Unknown Nonaffine Nonlinear Systems
    8.3.1 Neural Network Observer
    8.3.2 Observer-Based Optimal Control Scheme Using Critic Network
    8.3.3 Stability Analysis of Closed-Loop System
    8.3.4 Simulation Study
  8.4 Conclusions
  References
9 Robust and Optimal Guaranteed Cost Control of Continuous-Time Nonlinear Systems
  9.1 Introduction
  9.2 Robust Control of Uncertain Nonlinear Systems
    9.2.1 Equivalence Analysis and Problem Transformation
    9.2.2 Online Algorithm and Neural Network Implementation
    9.2.3 Stability Analysis of Closed-Loop System
    9.2.4 Simulation Study
  9.3 Optimal Guaranteed Cost Control of Uncertain Nonlinear Systems
    9.3.1 Optimal Guaranteed Cost Controller Design
    9.3.2 Online Solution of Transformed Optimal Control Problem
    9.3.3 Stability Analysis of Closed-Loop System
    9.3.4 Simulation Studies
  9.4 Conclusions
  References
10 Decentralized Control of Continuous-Time Interconnected Nonlinear Systems
  10.1 Introduction
  10.2 Decentralized Control of Interconnected Nonlinear Systems
    10.2.1 Decentralized Stabilization via Optimal Control Approach
    10.2.2 Optimal Controller Design of Isolated Subsystems
    10.2.3 Generalization to Model-Free Decentralized Control
    10.2.4 Simulation Studies
  10.3 Conclusions
  References
11 Learning Algorithms for Differential Games of Continuous-Time Systems
  11.1 Introduction
  11.2 Integral Policy Iteration for Two-Player Zero-Sum Games
    11.2.1 Derivation of Integral Policy Iteration
    11.2.2 Convergence Analysis
    11.2.3 Neural Network Implementation
    11.2.4 Simulation Studies
  11.3 Iterative Adaptive Dynamic Programming for Multi-player Zero-Sum Games
    11.3.1 Derivation of the Iterative ADP Algorithm
    11.3.2 Properties
    11.3.3 Neural Network Implementation
    11.3.4 Simulation Studies
  11.4 Synchronous Approximate Optimal Learning for Multi-player Nonzero-Sum Games
    11.4.1 Derivation and Convergence Analysis
    11.4.2 Neural Network Implementation
    11.4.3 Simulation Study
  11.5 Conclusions
  References
Part III Applications
12 Adaptive Dynamic Programming for Optimal Residential Energy Management
  12.1 Introduction
  12.2 A Self-learning Scheme for Residential Energy System Control and Management
    12.2.1 The ADHDP Method
    12.2.2 A Self-learning Scheme for Residential Energy System
    12.2.3 Simulation Study
  12.3 A Novel Dual Iterative Q-Learning Method for Optimal Battery Management
    12.3.1 Problem Formulation
    12.3.2 Dual Iterative Q-Learning Algorithm
    12.3.3 Neural Network Implementation
    12.3.4 Numerical Analysis
  12.4 Multi-battery Optimal Coordination Control for Residential Energy Systems
    12.4.1 Distributed Iterative ADP Algorithm
    12.4.2 Numerical Analysis
  12.5 Conclusions
  References
13 Adaptive Dynamic Programming for Optimal Control of Coal Gasification Process
  13.1 Introduction
  13.2 Data-Based Modeling and Properties
    13.2.1 Description of Coal Gasification Process and Control Systems
    13.2.2 Data-Based Process Modeling and Properties
  13.3 Design and Implementation of Optimal Tracking Control
    13.3.1 Optimal Tracking Controller Design by Iterative ADP Algorithm Under System and Iteration Errors
    13.3.2 Neural Network Implementation
  13.4 Numerical Analysis
  13.5 Conclusions
  References
14 Data-Based Neuro-Optimal Temperature Control of Water Gas Shift Reaction
  14.1 Introduction
  14.2 System Description and Data-Based Modeling
    14.2.1 Water Gas Shift Reaction
    14.2.2 Data-Based Modeling and Properties
  14.3 Design of Neuro-Optimal Temperature Controller
    14.3.1 System Transformation
    14.3.2 Derivation of Stable Iterative ADP Algorithm
    14.3.3 Properties of Stable Iterative ADP Algorithm with Approximation Errors and Disturbances
  14.4 Neural Network Implementation for the Optimal Tracking Control Scheme
  14.5 Numerical Analysis
  14.6 Conclusions
  References
Index

About the authors

Derong Liu received the Ph.D. degree in electrical engineering from the University of Notre Dame, Indiana, USA, in 1994. Dr. Liu was a Staff Fellow with General Motors Research and Development Center from 1993 to 1995. He was an Assistant Professor with the Department of Electrical and Computer Engineering, Stevens Institute of Technology, from 1995 to 1999. He joined the University of Illinois at Chicago in 1999, and became a Full Professor of Electrical and Computer Engineering and of Computer Science in 2006. He was selected for the “100 Talents Program” by the Chinese Academy of Sciences in 2008. He has published 16 books. Dr. Liu was the Editor-in-Chief of the IEEE Transactions on Neural Networks and Learning Systems from 2010 to 2015. Currently, he is an elected AdCom member of the IEEE Computational Intelligence Society, the Editor-in-Chief of Artificial Intelligence Review, and the Vice President of the Asia-Pacific Neural Network Society. He was the General Chair of the 2014 IEEE World Congress on Computational Intelligence and of the 2016 World Congress on Intelligent Control and Automation. He received the Faculty Early Career Development Award from the National Science Foundation in 1999, the University Scholar Award from the University of Illinois from 2006 to 2009, the Overseas Outstanding Young Scholar Award from the National Natural Science Foundation of China in 2008, and the Outstanding Achievement Award from the Asia Pacific Neural Network Assembly in 2014. He is a Fellow of the IEEE and a Fellow of the International Neural Network Society.

Qinglai Wei received the Ph.D. degree in control theory and control engineering from Northeastern University, Shenyang, China, in 2009. From 2009 to 2011, he was a postdoctoral fellow with The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. He is currently a Professor of the institute. Prof. Wei is an Associate Editor of IEEE Transactions on Systems, Man, and Cybernetics: Systems, Information Sciences, Neurocomputing, Optimal Control Applications and Methods, and Acta Automatica Sinica, and was an Associate Editor of IEEE Transactions on Neural Networks and Learning Systems during 2014–2015. He has served as an organizing committee member of several international conferences. He was a recipient of the Asia Pacific Neural Networks Society (APNNS) Young Researcher Award in 2016, the Outstanding Paper Award of Acta Automatica Sinica in 2011, and the Zhang Siying Outstanding Paper Award of the Chinese Control and Decision Conference (CCDC) in 2015.

Ding Wang received the Ph.D. degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2012. He is currently an Associate Professor with The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. He has been a Visiting Scholar with the Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA, since 2015. His research interests include adaptive and learning systems, intelligent control, and neural networks. He has published over 70 journal and conference papers, and coauthored two monographs. He has served as an organizing committee member of several international conferences. He was a recipient of the Excellent Doctoral Dissertation Award of Chinese Academy of Sciences in 2013. He serves as an Associate Editor of IEEE Transactions on Neural Networks and Learning Systems and Neurocomputing. He is a member of IEEE, Asia-Pacific Neural Network Society (APNNS), and CAA.

Xiong Yang received the Ph.D. degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2014. Dr. Yang was a recipient of the Excellent Award of Presidential Scholarship of Chinese Academy of Sciences in 2014. He was an Assistant Professor with The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, from 2014 to 2016. He is currently an Associate Professor with the School of Electrical Engineering and Automation, Tianjin University.

Hongliang Li received the Ph.D. degree in control theory and control engineering from the University of Chinese Academy of Sciences in 2015. Dr. Li was a Research Scientist with IBM Research - China, Beijing, from 2015 to 2016. He joined Tencent Inc., Shenzhen, China, in 2016. He has published more than 10 journal papers on adaptive dynamic programming and reinforcement learning.

