E-Book, Englisch, 462 Seiten
McMahon Machine Learning Engineering with Python
2. Auflage 2023
ISBN: 978-1-83763-435-4
Verlag: De Gruyter
Format: EPUB
Kopierschutz: 0 - No protection
Manage the lifecycle of machine learning models using MLOps with practical examples
E-Book, Englisch, 462 Seiten
ISBN: 978-1-83763-435-4
Verlag: De Gruyter
Format: EPUB
Kopierschutz: 0 - No protection
No detailed description available for "Machine Learning Engineering with Python".
Fachgebiete
- Mathematik | Informatik EDV | Informatik Informatik Mensch-Maschine-Interaktion Informationsarchitektur
- Mathematik | Informatik EDV | Informatik Informatik Theoretische Informatik
- Mathematik | Informatik EDV | Informatik Daten / Datenbanken Datenbankdesign & Datenbanktheorie
- Mathematik | Informatik EDV | Informatik Informatik Künstliche Intelligenz Neuronale Netzwerke
Weitere Infos & Material
Table of Contents - Introduction to ML Engineering
- The Machine Learning Development Process
- From Model to Model Factory
- Packaging Up
- Deployment Patterns and Tools
- Scaling Up
- Deep Learning, Generative AI, and LLMOps
- Building an Example ML Microservice
- Building an Extract, Transform, Machine Learning Use Case
Preface
”Software is eating the world, but AI is going to eat software.” — Jensen Huang, CEO of Nvidia Machine learning (ML), part of the wider field of Artificial Intelligence (AI), is rightfully recognized as one of the most powerful tools available for organizations to extract value from their data. As the capabilities of ML algorithms have grown over the years, it has become increasingly obvious that implementing such algorithms in a scalable, fault-tolerant, and automated way requires the creation of new disciplines. These disciplines, ML Engineering (MLE) and ML Operations (MLOps), are the focus of this book. The book covers a wide variety of topics in order to help you understand the tools, techniques, and processes you can apply to engineer your ML solutions, with an emphasis on introducing the key concepts so that you can build on them in your future work. The aim is to develop fundamentals and a broad understanding that will stand the test of time, rather than just provide a series of introductions to the latest tools, although we do cover a lot of the latest tools! All of the code examples are given in Python, the most popular programming language in the world (at the time of writing) and the lingua franca for data applications. Python is a high-level and object-oriented language with a rich ecosystem of tools focused on data science and ML. For example, packages such as scikit-learn and pandas have have become part of the standard lexicon for data science teams across the world. The central tenet of this book is that knowing how to use packages like these is not enough. In this book, we will use these tools, and many, many more, but focus on how to wrap them up in production-grade pipelines and deploy them using appropriate cloud and open-source tools. We will cover everything from how to organize your ML team, to software development methodologies and best practices, to automating model building through to packaging your ML code, how to deploy your ML pipelines to a variety of different targets and then on to how to scale your workloads for large batch runs. We will also discuss, in an entirely new chapter for this second edition, the exciting world of applying ML engineering and MLOps to deep learning and generative AI, including how to start building solutions using Large Language Models (LLMs) and the new field of LLM Operations (LLMOps). The second edition of Machine Learning Engineering with Python goes into a lot more depth than the first edition in almost every chapter, with updated examples and more discussion of core concepts. There is also a far wider selection of tools covered and a lot more focus on open-source tools and development. The ethos of focusing on core concepts remains, but it is my hope that this wider view means that the second edition will be an excellent resource for those looking to gain practical knowledge of machine learning engineering. Although a far greater emphasis has been placed on using open-source tooling, many examples will also leverage services and solutions from Amazon Web Services (AWS). I believe that the accompanying explanations and discussions will, however, mean that you can still apply everything you learn here to any cloud provider or even in an on-premises setting. Machine Learning Engineering with Python, Second Edition will help you to navigate the challenges of taking ML to production and give you the confidence to start applying MLOps in your projects. I hope you enjoy it! Who this book is for
This book is for machine learning engineers, data scientists, and software developers who want to build robust software solutions with ML components. It is also relevant to anyone who manages or wants to understand the production life cycle of these systems. The book assumes intermediate-level knowledge of Python and some basic exposure to the concepts of machine learning. Some basic knowledge of AWS and the use of Unix tools like bash or zsh will also be beneficial. What this book covers
Chapter 1, Introduction to ML Engineering, explains the core concepts of machine learning engineering and machine learning operations. Roles within ML teams are discussed in detail and the challenges of ML engineering and MLOps are laid out. Chapter 2, The Machine Learning Development Process, explores how to organize and successfully execute an ML engineering project. This includes a discussion of development methodologies like Agile, Scrum and CRISP-DM, before sharing a project methodology developed by the author that is referenced to throughout the book. This chapter also introduces continuous integration/continouous deployment (CI/CD) and developer tools. Chapter 3, From Model to Model Factory, shows how to standardize, systematize and automate the process of training and deploying machine learning models. This is done using the author’s concept of the model factory, a methodology for repeatable model creation and validation. This chapter also discusses key theoretical concepts important for understanding machine learning models and covers different types of drift detection and model retrain triggering criteria. Chapter 4, Packaging Up, discusses best practices for coding in Python and how this relates to building your own packages, libraries and components for reuse in multiple projects. This chapter covers fundamental Python programming concepts before moving onto more advanced concepts, and then discusses package and environment management, testing, logging and error handling and security. Chapter 5, Deployment Patterns and Tools, teaches you some of the standard ways you can design your ML system and then get it into production. This chapter focusses on architectures, system design and deployment patterns first before moving onto using more advanced tools to deploy microservices, including containerization and AWS Lambda. The popular ZenML and Kubeflow pipelining and deployment platforms are then reviewed in detail with examples. Chapter 6, Scaling Up, is all about developing with large datasets in mind. For this, the Apache Spark and Ray frameworks are discussed in detail with worked examples. The focus for this chapter is on scaling up batch workloads where massive compute is required. Chapter 7, Deep Learning, Generative AI and LLMOps, covers the latest concepts and techniques for training and deploying deep learning models for production use cases. This chapter includes material discussing the new wave of generative models, with a particular focus on Large Language Models (LLMs) and the challenges for ML engineers looking to productionize these models. This leads us onto define the core elements of LLM Operations (LLMOps). Chapter 8, Building an Example ML Microservice, walks through the building of a machine learning microservice that serves a forecasting solution using FastAPI, Docker and Kubernetes. This pulls together many of the previous concepts developed throughout the book. Chapter 9, Building an Extract, Transform, Machine Learning Use Case, builds out an example of a batch processing ML system that leverages standard ML algorithms and augments these with the use of LLMs. This shows a concrete application of LLMs and LLMOps, as well as providing a more advanced discussion of Airflow DAGs. To get the most out of this book
In this book, some previous exposure to Python development is assumed. Many introductory concepts are covered for completeness but in general it will be easier to get through the examples if you have already written at least some Python programs before. The book also assumes some exposure to the main concepts from machine learning, such as what a model is, what training and inference refer to and an understanding of similar concepts. Several of these are recapped in the text but again it will be a smoother ride if you have previously been acquainted with the main ideas behind building a machine learning model, even at a rudimentary level. On the technical side, to get the most out of the examples in the book, you will need access to a computer or server where you have privileges to install and run Python and other software packages and applications. For many of the examples, access to a UNIX type terminal, such as bash or zsh, is assumed. The examples in this book were written and tested on both a Linux machine running Ubuntu LTS and an M2 Macbook Pro running macOS. If you use a different setup, for example Windows, the examples may require some translation in order to work for your system. Note that the use of the M2 Macbook Pro means several examples show some additional information to get the examples working on Apple Silicon devices. These sections can comfortably be skipped if your system does not require this extra setup. Many of the Cloud based examples leverage Amazon Web Services (AWS) and so require an AWS account with billing setup. Most of the examples will use the free-tier services available from AWS but this is not always possible. Caution is advised in order to avoid large bills. If in doubt, it is recommended you consult the AWS documentation for more...