E-book, English, 609 pages
Liu / Wei / Wang: Adaptive Dynamic Programming with Applications in Optimal Control
1st edition, 2017
ISBN: 978-3-319-50815-3
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF watermark
Series: Advances in Industrial Control
• renewable energy scheduling for smart power grids;
• coal gasification processes; and
• water–gas shift reactions.
Researchers studying intelligent control methods and practitioners looking to apply them in the chemical-process and power-supply industries will find much to interest them in this thorough treatment of an advanced approach to control.
Target audience
Research
Authors/Editors
Further information & material
1;Foreword;6
2;Series Editors’ Foreword;8
2.1;References;10
3;Preface;11
4;Acknowledgements;16
5;Contents;17
6;Abbreviations;24
7;Symbols;25
8;1 Overview of Adaptive Dynamic Programming;27
8.1;1.1 Introduction;27
8.2;1.2 Reinforcement Learning;29
8.3;1.3 Adaptive Dynamic Programming;33
8.3.1;1.3.1 Basic Forms of Adaptive Dynamic Programming;36
8.3.2;1.3.2 Iterative Adaptive Dynamic Programming;41
8.3.3;1.3.3 ADP for Continuous-Time Systems;44
8.3.4;1.3.4 Remarks;47
8.4;1.4 Related Books;48
8.5;1.5 About This Book;52
8.6;References;53
9;Part I Discrete-Time Systems;60
10;2 Value Iteration ADP for Discrete-Time Nonlinear Systems;61
10.1;2.1 Introduction;61
10.2;2.2 Optimal Control of Nonlinear Systems Using General Value Iteration;62
10.2.1;2.2.1 Convergence Analysis;64
10.2.2;2.2.2 Neural Network Implementation;72
10.2.3;2.2.3 Generalization to Optimal Tracking Control;76
10.2.4;2.2.4 Optimal Control of Systems with Constrained Inputs;80
10.2.5;2.2.5 Simulation Studies;83
10.3;2.3 Iterative θ-Adaptive Dynamic Programming Algorithm for Nonlinear Systems;91
10.3.1;2.3.1 Convergence Analysis;93
10.3.2;2.3.2 Optimality Analysis;101
10.3.3;2.3.3 Summary of Iterative θ-ADP Algorithm;104
10.3.4;2.3.4 Simulation Studies;107
10.4;2.4 Conclusions;111
10.5;References;111
11;3 Finite Approximation Error-Based Value Iteration ADP;115
11.1;3.1 Introduction;115
11.2;3.2 Iterative θ-ADP Algorithm with Finite Approximation Errors;116
11.2.1;3.2.1 Properties of the Iterative ADP Algorithm with Finite Approximation Errors;117
11.2.2;3.2.2 Neural Network Implementation;124
11.2.3;3.2.3 Simulation Study;128
11.3;3.3 Numerical Iterative θ-Adaptive Dynamic Programming;131
11.3.1;3.3.1 Derivation of the Numerical Iterative θ-ADP Algorithm;131
11.3.2;3.3.2 Properties of the Numerical Iterative θ-ADP Algorithm;135
11.3.3;3.3.3 Summary of the Numerical Iterative θ-ADP Algorithm;144
11.3.4;3.3.4 Simulation Study;145
11.4;3.4 General Value Iteration ADP Algorithm with Finite Approximation Errors;149
11.4.1;3.4.1 Derivation and Properties of the GVI Algorithm with Finite Approximation Errors;149
11.4.2;3.4.2 Designs of Convergence Criteria with Finite Approximation Errors;157
11.4.3;3.4.3 Simulation Study;164
11.5;3.5 Conclusions;171
11.6;References;171
12;4 Policy Iteration for Optimal Control of Discrete-Time Nonlinear Systems;174
12.1;4.1 Introduction;174
12.2;4.2 Policy Iteration Algorithm;175
12.2.1;4.2.1 Derivation of Policy Iteration Algorithm;176
12.2.2;4.2.2 Properties of Policy Iteration Algorithm;177
12.2.3;4.2.3 Initial Admissible Control Law;183
12.2.4;4.2.4 Summary of Policy Iteration ADP Algorithm;185
12.3;4.3 Numerical Simulation and Analysis;185
12.4;4.4 Conclusions;196
12.5;References;197
13;5 Generalized Policy Iteration ADP for Discrete-Time Nonlinear Systems;199
13.1;5.1 Introduction;199
13.2;5.2 Generalized Policy Iteration-Based Adaptive Dynamic Programming Algorithm;199
13.2.1;5.2.1 Derivation and Properties of the GPI Algorithm;201
13.2.2;5.2.2 GPI Algorithm and Relaxation of Initial Conditions;210
13.2.3;5.2.3 Simulation Studies;214
13.3;5.3 Discrete-Time GPI with General Initial Value Functions;221
13.3.1;5.3.1 Derivation and Properties of the GPI Algorithm;221
13.3.2;5.3.2 Relaxations of the Convergence Criterion and Summary of the GPI Algorithm;233
13.3.3;5.3.3 Simulation Studies;237
13.4;5.4 Conclusions;243
13.5;References;243
14;6 Error Bounds of Adaptive Dynamic Programming Algorithms;244
14.1;6.1 Introduction;244
14.2;6.2 Error Bounds of ADP Algorithms for Undiscounted Optimal Control Problems;245
14.2.1;6.2.1 Problem Formulation;245
14.2.2;6.2.2 Approximate Value Iteration;247
14.2.3;6.2.3 Approximate Policy Iteration;252
14.2.4;6.2.4 Approximate Optimistic Policy Iteration;258
14.2.5;6.2.5 Neural Network Implementation;262
14.2.6;6.2.6 Simulation Study;264
14.3;6.3 Error Bounds of Q-Function for Discounted Optimal Control Problems;268
14.3.1;6.3.1 Problem Formulation;268
14.3.2;6.3.2 Policy Iteration Under Ideal Conditions;270
14.3.3;6.3.3 Error Bound for Approximate Policy Iteration;275
14.3.4;6.3.4 Neural Network Implementation;278
14.3.5;6.3.5 Simulation Study;280
14.4;6.4 Conclusions;283
14.5;References;284
15;Part II Continuous-Time Systems;286
16;7 Online Optimal Control of Continuous-Time Affine Nonlinear Systems;287
16.1;7.1 Introduction;287
16.2;7.2 Online Optimal Control of Partially Unknown Affine Nonlinear Systems;287
16.2.1;7.2.1 Identifier–Critic Architecture for Solving HJB Equation;289
16.2.2;7.2.2 Stability Analysis of Closed-Loop System;301
16.2.3;7.2.3 Simulation Study;306
16.3;7.3 Online Optimal Control of Affine Nonlinear Systems with Constrained Inputs;311
16.3.1;7.3.1 Solving HJB Equation via Critic Architecture;314
16.3.2;7.3.2 Stability Analysis of Closed-Loop System with Constrained Inputs;318
16.3.3;7.3.3 Simulation Study;322
16.4;7.4 Conclusions;325
16.5;References;326
17;8 Optimal Control of Unknown Continuous-Time Nonaffine Nonlinear Systems;328
17.1;8.1 Introduction;328
17.2;8.2 Optimal Control of Unknown Nonaffine Nonlinear Systems with Constrained Inputs;329
17.2.1;8.2.1 Identifier Design via Dynamic Neural Networks;330
17.2.2;8.2.2 Actor–Critic Architecture for Solving HJB Equation;335
17.2.3;8.2.3 Stability Analysis of Closed-Loop System;337
17.2.4;8.2.4 Simulation Study;342
17.3;8.3 Optimal Output Regulation of Unknown Nonaffine Nonlinear Systems;346
17.3.1;8.3.1 Neural Network Observer;347
17.3.2;8.3.2 Observer-Based Optimal Control Scheme Using Critic Network;352
17.3.3;8.3.3 Stability Analysis of Closed-Loop System;356
17.3.4;8.3.4 Simulation Study;359
17.4;8.4 Conclusions;362
17.5;References;362
18;9 Robust and Optimal Guaranteed Cost Control of Continuous-Time Nonlinear Systems;364
18.1;9.1 Introduction;364
18.2;9.2 Robust Control of Uncertain Nonlinear Systems;365
18.2.1;9.2.1 Equivalence Analysis and Problem Transformation;367
18.2.2;9.2.2 Online Algorithm and Neural Network Implementation;369
18.2.3;9.2.3 Stability Analysis of Closed-Loop System;372
18.2.4;9.2.4 Simulation Study;375
18.3;9.3 Optimal Guaranteed Cost Control of Uncertain Nonlinear Systems;379
18.3.1;9.3.1 Optimal Guaranteed Cost Controller Design;381
18.3.2;9.3.2 Online Solution of Transformed Optimal Control Problem;387
18.3.3;9.3.3 Stability Analysis of Closed-Loop System;392
18.3.4;9.3.4 Simulation Studies;397
18.4;9.4 Conclusions;402
18.5;References;403
19;10 Decentralized Control of Continuous-Time Interconnected Nonlinear Systems;406
19.1;10.1 Introduction;406
19.2;10.2 Decentralized Control of Interconnected Nonlinear Systems;407
19.2.1;10.2.1 Decentralized Stabilization via Optimal Control Approach;408
19.2.2;10.2.2 Optimal Controller Design of Isolated Subsystems;413
19.2.3;10.2.3 Generalization to Model-Free Decentralized Control;419
19.2.4;10.2.4 Simulation Studies;423
19.3;10.3 Conclusions;433
19.4;References;433
20;11 Learning Algorithms for Differential Games of Continuous-Time Systems;435
20.1;11.1 Introduction;435
20.2;11.2 Integral Policy Iteration for Two-Player Zero-Sum Games;436
20.2.1;11.2.1 Derivation of Integral Policy Iteration;438
20.2.2;11.2.2 Convergence Analysis;441
20.2.3;11.2.3 Neural Network Implementation;443
20.2.4;11.2.4 Simulation Studies;446
20.3;11.3 Iterative Adaptive Dynamic Programming for Multi-player Zero-Sum Games;449
20.3.1;11.3.1 Derivation of the Iterative ADP Algorithm;451
20.3.2;11.3.2 Properties;456
20.3.3;11.3.3 Neural Network Implementation;462
20.3.4;11.3.4 Simulation Studies;469
20.4;11.4 Synchronous Approximate Optimal Learning for Multi-player Nonzero-Sum Games;477
20.4.1;11.4.1 Derivation and Convergence Analysis;478
20.4.2;11.4.2 Neural Network Implementation;482
20.4.3;11.4.3 Simulation Study;491
20.5;11.5 Conclusions;496
20.6;References;496
21;Part III Applications;499
22;12 Adaptive Dynamic Programming for Optimal Residential Energy Management;500
22.1;12.1 Introduction;500
22.2;12.2 A Self-learning Scheme for Residential Energy System Control and Management;501
22.2.1;12.2.1 The ADHDP Method;505
22.2.2;12.2.2 A Self-learning Scheme for Residential Energy System;506
22.2.3;12.2.3 Simulation Study;509
22.3;12.3 A Novel Dual Iterative Q-Learning Method for Optimal Battery Management;513
22.3.1;12.3.1 Problem Formulation;513
22.3.2;12.3.2 Dual Iterative Q-Learning Algorithm;514
22.3.3;12.3.3 Neural Network Implementation;520
22.3.4;12.3.4 Numerical Analysis;523
22.4;12.4 Multi-battery Optimal Coordination Control for Residential Energy Systems;530
22.4.1;12.4.1 Distributed Iterative ADP Algorithm;532
22.4.2;12.4.2 Numerical Analysis;544
22.5;12.5 Conclusions;550
22.6;References;550
23;13 Adaptive Dynamic Programming for Optimal Control of Coal Gasification Process;553
23.1;13.1 Introduction;553
23.2;13.2 Data-Based Modeling and Properties;554
23.2.1;13.2.1 Description of Coal Gasification Process and Control Systems;554
23.2.2;13.2.2 Data-Based Process Modeling and Properties;556
23.3;13.3 Design and Implementation of Optimal Tracking Control;562
23.3.1;13.3.1 Optimal Tracking Controller Design by Iterative ADP Algorithm Under System and Iteration Errors;562
23.3.2;13.3.2 Neural Network Implementation;570
23.4;13.4 Numerical Analysis;573
23.5;13.5 Conclusions;584
23.6;References;585
24;14 Data-Based Neuro-Optimal Temperature Control of Water Gas Shift Reaction;586
24.1;14.1 Introduction;586
24.2;14.2 System Description and Data-Based Modeling;587
24.2.1;14.2.1 Water Gas Shift Reaction;587
24.2.2;14.2.2 Data-Based Modeling and Properties;588
24.3;14.3 Design of Neuro-Optimal Temperature Controller;590
24.3.1;14.3.1 System Transformation;590
24.3.2;14.3.2 Derivation of Stable Iterative ADP Algorithm;591
24.3.3;14.3.3 Properties of Stable Iterative ADP Algorithm with Approximation Errors and Disturbances;593
24.4;14.4 Neural Network Implementation for the Optimal Tracking Control Scheme;597
24.5;14.5 Numerical Analysis;600
24.6;14.6 Conclusions;604
24.7;References;604
25;Index;606




