E-book, English, 716 pages
Lapan, Deep Reinforcement Learning Hands-On
1st edition, 2024
ISBN: 978-1-83588-271-9
Publisher: Packt Publishing
Format: EPUB
Copy protection: 0 - No protection
A practical and easy-to-follow guide to RL from Q-learning and DQNs to PPO and RLHF
Start your journey into reinforcement learning (RL) and reward yourself with the third edition of Deep Reinforcement Learning Hands-On. This book takes you through the basics of RL to more advanced concepts with the help of various applications, including game playing, discrete optimization, stock trading, and web browser navigation. By walking you through landmark research papers in the field, this deep RL book will equip you with practical knowledge of RL and the theoretical foundation to understand and implement most modern RL papers.
The book retains its approach of providing concise and easy-to-follow explanations from the previous editions. You'll work through practical and diverse examples, from grid environments and games to stock trading and RL agents in web environments, to give you a well-rounded understanding of RL, its capabilities, and its use cases. You'll learn about key topics, such as deep Q-networks (DQNs), policy gradient methods, continuous control problems, and highly scalable, non-gradient methods.
If you want to learn about RL through a practical approach using OpenAI Gym and PyTorch, concise explanations, and the incremental development of topics, then Deep Reinforcement Learning Hands-On, Third Edition, is your ideal companion.
Further Information & Material
Preface
This book is on reinforcement learning (RL), which is a subfield of machine learning (ML); it focuses on the general and challenging problem of learning optimal behavior in complex environments. The learning process is driven only by the reward value and observations obtained from the environment. This model is very general and can be applied to many practical situations, from playing games to optimizing complex manufacturing processes. We largely focus on deep RL in this book, which is RL that leverages deep learning (DL) methods.
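To make this model concrete, here is a minimal sketch of the reward-and-observation loop using the Gymnasium library that the book relies on; the CartPole-v1 environment and the random action choice are illustrative placeholders rather than examples taken from the book.

```python
# Minimal sketch of the agent-environment loop: the agent receives an
# observation, chooses an action, and the environment returns a reward
# and the next observation. The random policy is only a placeholder.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset()
total_reward = 0.0

for _ in range(200):
    action = env.action_space.sample()              # placeholder policy: random action
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward                          # the reward signal drives learning
    if terminated or truncated:                     # episode ended: start a new one
        obs, info = env.reset()

env.close()
print("Total reward collected:", total_reward)
```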
Due to its flexibility and generality, the field of RL is developing very quickly and attracting lots of attention, both from researchers who are trying to improve existing methods or create new methods and from practitioners interested in solving their problems in the most efficient way.
Why I wrote this book
There is a lot of ongoing research activity in the RL field all around the world. New research papers are being published almost every day, and a large number of DL conferences, such as Neural Information Processing Systems (NeurIPS) or the International Conference on Learning Representations (ICLR), are dedicated to RL methods. There are also several large research groups focusing on the application of RL methods to robotics, medicine, multi-agent systems, and other domains.
However, although information about the recent research is widely available, it is too specialized and abstract to be easily understandable. Even worse is the situation surrounding the practical aspect of RL, as it is not always obvious how to make the step from an abstract method described in its mathematics-heavy form in a research paper to a working implementation solving an actual problem.
This makes it hard for somebody interested in the field to get a clear understanding of the methods and ideas behind papers and conference talks. There are some very good blog posts about various aspects of RL that are illustrated with working examples, but the limited format of a blog post allows authors to describe only one or two methods, without building a complete structured picture and showing how different methods are related to each other in a systematic way. This book was written as an attempt to fill this obvious gap in practical and structured information about RL methods and approaches.
The approach
A key aspect of the book is its orientation to practice. Every method is implemented for various environments, from the very trivial to the quite complex. I’ve tried to make the examples clean and easy to understand, which was made possible by the expressiveness and power of PyTorch. At the same time, the complexity and requirements of the examples are aimed at RL hobbyists without access to very large computational resources, such as clusters of graphics processing units (GPUs) or very powerful workstations. This, I believe, will make the fun-filled and exciting RL domain accessible to a much wider audience than just research groups or large artificial intelligence companies. That said, this is still deep RL, so access to a GPU is highly recommended, as the resulting speed-up will make experimentation much more convenient (waiting several weeks for a single optimization to complete is not much fun). Approximately half of the examples in the book will benefit from being run on a GPU.
In addition to traditional medium-sized RL environments, such as Atari games or continuous control problems, this book includes several chapters (10, 13, 14, 19, 20, and 21) with larger projects, illustrating how RL methods can be applied to more complicated environments and tasks. These examples are still not full-sized, real-life projects (those would occupy a separate book on their own), but rather larger problems illustrating how the RL paradigm can be applied to domains beyond the well-established benchmarks.
Another thing to note about the examples in Parts 1, 2, and 3 of the book is that I’ve tried to make them self-contained, with the source code shown in full. Sometimes this has led to the repetition of code pieces (for example, the training loop is very similar in most of the methods), but I believe that giving you the freedom to jump directly into the method you want to learn is more important than avoiding a few repetitions. All examples in the book are available on GitHub at https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On-3E/, and you’re welcome to fork them, experiment, and contribute.
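On that repeated training loop: its overall shape is roughly the same across methods, and the following schematic sketch shows that common structure. It is not code from the book's repository, and the select_action and compute_loss helpers are hypothetical stand-ins for the method-specific parts.

```python
# Schematic of the common training-loop shape: gather experience from the
# environment, compute a method-specific loss, and update the network.
# `select_action` and `compute_loss` are hypothetical placeholders.
import torch

def training_loop(env, net, optimizer, select_action, compute_loss, n_steps=10_000):
    obs, _ = env.reset()
    for step in range(n_steps):
        action = select_action(net, torch.as_tensor(obs, dtype=torch.float32))
        next_obs, reward, terminated, truncated, _ = env.step(action)

        loss = compute_loss(net, obs, action, reward, next_obs)  # method-specific part
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        obs = next_obs
        if terminated or truncated:
            obs, _ = env.reset()
```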
Besides the source code, several chapters (15, 16, 19, and 22) are accompanied by video recordings of the trained model. All these recordings are available in the following YouTube playlist: https://youtube.com/playlist?list=PLMVwuZENsfJmjPlBuFy5u7c3uStMTJYz7.
Who this book is for
This book is ideal for machine learning engineers, software engineers, and data scientists looking to learn and apply deep RL in practice. It assumes familiarity with Python, calculus, and ML concepts. With practical examples and high-level overviews, it’s also suitable for experienced professionals looking to deepen their understanding of advanced deep RL methods and apply them across industries, such as gaming and finance.
What this book covers
Chapter 1, What Is Reinforcement Learning?, contains an introduction to RL ideas and the main formal models.
Chapter 2, OpenAI Gym API and Gymnasium, introduces the practical aspects of RL, using the open source library Gym and its descendant, Gymnasium.
Chapter 3, Deep Learning with PyTorch, gives you a quick overview of the PyTorch library.
Chapter 4, The Cross-Entropy Method, introduces one of the simplest methods in RL to give you an impression of RL methods and problems.
Chapter 5, Tabular Learning and the Bellman Equation, opens Part 2 of the book, which is devoted to the value-based family of methods.
Chapter 6, Deep Q-Networks, describes deep Q-networks (DQNs), an extension of the basic value-based methods, allowing us to solve complicated environments.
Chapter 7, Higher-Level RL Libraries, describes the library PTAN, which we will use in the book to simplify the implementations of RL methods.
Chapter 8, DQN Extensions, gives a detailed overview of modern extensions to the DQN method that improve its stability and convergence in complex environments.
Chapter 9, Ways to Speed up RL Methods, provides an overview of ways to make the execution of RL code faster.
Chapter 10, Stocks Trading Using RL, is the first practical project and focuses on applying the DQN method to stock trading.
Chapter 11, Policy Gradients, opens Part 3 of the book and introduces another family of RL methods that is based on direct policy optimization.
Chapter 12, The Actor-Critic Method: A2C and A3C, describes one of the most widely used policy-based methods in RL.
Chapter 13, The TextWorld Environment, covers the application of RL methods to interactive fiction games.
Chapter 14, Web Navigation, is another long project that applies RL to web page navigation using the MiniWoB++ environment.
Chapter 15, Continuous Action Space, opens the advanced RL part of the book and describes the specifics of environments with continuous action spaces, which are widely used in robotics, and the methods for working with them.
Chapter 16, Trust Regions, continues the discussion of continuous action spaces, describing the trust region set of methods: PPO, TRPO, ACKTR, and SAC.
Chapter 17, Black-Box Optimization in RL, shows another set of methods that don’t use gradients in their explicit form.
Chapter 18, Advanced Exploration, covers different approaches that can be used for better exploration of the environment — a very important aspect of RL.
Chapter 19, Reinforcement Learning with Human Feedback, introduces and implements a recent approach to guiding the learning process using human feedback. This method is widely used in training large language models (LLMs). In this chapter, we’ll implement an RLHF pipeline from scratch and check its efficiency.
Chapter 20, AlphaGo Zero and MuZero, describes the AlphaGo Zero method and its evolution into MuZero, and applies both these methods to the game Connect 4.
Chapter 21, RL in Discrete Optimization, describes the application of RL methods to the domain of discrete optimization, using the Rubik’s cube as an environment.
Chapter 22, Multi-Agent RL, introduces a relatively new direction of RL methods for situations with multiple agents.
To get the most out of this book
To run the examples in this book, you will need a machine with at least 32 GB of RAM. A GPU is not strictly required, but an NVIDIA GPU is highly recommended. The code has been tested on Linux and macOS. For more details on the hardware and software requirements, refer to Chapter 2.
All the...




