E-book, English, 388 pages
Sukup, John: scikit-learn Cookbook
3rd edition, 2025
ISBN: 978-1-83664-444-6
Publisher: De Gruyter
Format: EPUB
Copy protection: 0 - No protection
Over 80 recipes for machine learning in Python with scikit-learn
Trusted by data scientists, ML engineers, and software developers alike, scikit-learn offers a versatile, user-friendly framework for implementing a wide range of ML algorithms, enabling the efficient development and deployment of predictive models in real-world applications. This third edition of scikit-learn Cookbook will help you master ML with real-world examples and scikit-learn 1.5 features.
This updated edition takes you on a journey from understanding the fundamentals of ML and data preprocessing, through implementing advanced algorithms and techniques, to deploying and optimizing ML models in production. Along the way, you'll explore practical, step-by-step recipes that cover everything from feature engineering and model selection to hyperparameter tuning and model evaluation, all using scikit-learn.
By the end of this book, you'll have gained the knowledge and skills needed to confidently build, evaluate, and deploy sophisticated ML models using scikit-learn, ready to tackle a wide range of data-driven challenges.
Further information and material
Preface
Although the technology world today is abuzz about artificial intelligence (AI) and the large language models (LLMs) that power it, machine learning (ML) still provides value to businesses through predictive modeling and prescriptive analytics. So many systems are powered by ML on the backend that most people would be surprised to learn how often businesses use it to refine their marketing strategy, upsell, improve product placement, and customize user experiences, among other applications.
While countless tools and software packages exist today to enable ML applications, one tool has become the backbone for hobbyists and enterprises alike: scikit-learn. It’s hard to believe that scikit-learn v0.1 debuted over 15 years ago in January 2010, yet even after all that time and all the changes and advancements in ML and AI, it still holds its place as one of the foremost Python libraries for AI and ML.
scikit-learn is a powerful, open source ML library for Python that provides simple and efficient tools for data mining and data analysis, built on top of NumPy, SciPy, and Matplotlib. It offers a versatile, user-friendly framework for implementing a wide range of ML algorithms, enabling efficient development and deployment of predictive models in real-world applications.
This book is devoted to scikit-learn v1.5. It takes you on a journey from understanding the fundamentals of ML and data preprocessing, through implementing advanced algorithms and techniques, to deploying and optimizing ML models in production. Along the way, you will explore practical, step-by-step recipes that cover everything from feature engineering and model selection to hyperparameter tuning and model evaluation, all using scikit-learn 1.5.
Finally, every chapter contains examples designed to give you an opportunity to apply the chapter’s learning through coding exercises.
Who this book is for
This book is for data scientists and ML professionals looking to deepen their understanding of advanced ML techniques. Additionally, software engineers and developers who want to implement sophisticated ML models in their applications can benefit equally.
What this book covers
, , covers the standard conventions and core API elements of scikit-learn, including the design principles behind estimators, transformers, and pipelines, as well as common methods such as fit(), predict(), and transform().
, , covers preprocessing tools and techniques, including enhanced data transformers and feature engineering methods.
, , includes updated approaches for dimensionality reduction with new algorithms and improvements in scikit-learn.
, , includes updates on the latest developments in distance metric-based models.
, , covers the linear models and regularization techniques that are now available.
, , explores the latest advancements in logistic regression and its extensions.
, , covers features and optimizations in SVMs and kernel methods.
, , includes the latest improvements and new ensemble techniques.
, , covers new text vectorization methods and multiclass classification strategies.
, , explores unsupervised learning techniques for finding naturally occurring groupings of similar data points.
, , covers techniques for finding inlier and outlier data points in training datasets.
, , covers cross-validation strategies, scoring methods, and model evaluation tools.
, , includes tools and best practices for deploying scikit-learn models in production environments, with a focus on scalability and maintainability.
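The chapter summaries above refer to estimators, transformers, pipelines, the fit(), predict(), and transform() methods, and cross-validation. As a minimal sketch of how these pieces usually fit together (not an excerpt from the book; the iris dataset, the StandardScaler and LogisticRegression choices, and all variable names are assumptions made for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, chosen here only for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Transformer: fit() learns per-column means and scales, transform() applies them.
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)

# Estimator: fit() learns from the training data, predict() returns class labels.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train_scaled, y_train)
predictions = clf.predict(scaler.transform(X_test))

# Pipeline: chains the transformer and the estimator behind one fit()/predict().
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(X_train, y_train)
test_accuracy = pipeline.score(X_test, y_test)

# Cross-validation: the pipeline is refit on each fold, giving one score per fold.
cv_scores = cross_val_score(pipeline, X, y, cv=5)

print(f"Held-out accuracy: {test_accuracy:.3f}")
print(f"Mean 5-fold accuracy: {cv_scores.mean():.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling statistics are learned only from the training portion of each fold, which is the main reason pipelines are usually preferred over scaling the whole dataset up front.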
To get the most out of this book
This book is designed to provide basic examples of the most important features of scikit-learn v1.5. To get the most from it, complete the exercises in each chapter, then try your own examples and explore function arguments beyond those presented. Combining what you learn in different chapters is also an effective way to consolidate your understanding of scikit-learn.
| Software/hardware covered in the book | OS requirements |
| scikit-learn v1.5 or greater | Windows, macOS, and Linux (any) |
| Git >=2.46.x | |
| Python >=3.9.x | |
Each chapter reminds you of the GitHub repository where example code is stored and how to install it locally.
Installing Python libraries in virtual environments with requirements.txt
Installing Python packages from a requirements.txt file is a common practice for managing project dependencies. Here’s a step-by-step guide:
- Navigate to your project directory. Open your Terminal or Command Prompt and navigate to the root directory of your Python project, where the requirements.txt file is located: cd /path/to/your/project
- Create a virtual environment. A virtual environment isolates your project’s dependencies from other Python projects on your system, preventing conflicts: python -m venv venv_name
  (Replace venv_name with your desired name for the virtual environment, e.g., venv or scikitlearncookbook.)
- Activate the virtual environment:
  - On macOS/Linux, use the following: source venv_name/bin/activate
  - On Windows, use this: venv_name\Scripts\activate
Installing the packages
With your virtual environment activated (if you created one), use pip to install the packages listed in requirements.txt:
pip install -r requirements.txt
If you are not using a virtual environment or need to specify a particular Python executable, you might use pip3 instead of pip.
Verifying installation (optional)
You can verify that the packages are installed by running the following:
pip list
This command will list all the installed packages in your current environment, including those from requirements.txt.
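The same check can also be done from inside Python. The following is a minimal sketch using only the standard library; the helper names installed_version, major_minor, and meets_minimum are hypothetical, and the thresholds (Python 3.9, scikit-learn 1.5) are taken from the requirements table above:

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def major_minor(version_string):
    """Extract (major, minor) as integers from a string such as '1.5.2' or '1.6rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_string, minimum):
    """Check whether a version string is at least the given (major, minor) tuple."""
    return major_minor(version_string) >= minimum


# Check the interpreter that is running this script.
python_ok = sys.version_info[:2] >= (3, 9)
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
      f"{'OK' if python_ok else 'older than 3.9'}")

# Check the scikit-learn distribution installed in the active environment.
sklearn_version = installed_version("scikit-learn")
if sklearn_version is None:
    print("scikit-learn is not installed in this environment")
else:
    status = "OK" if meets_minimum(sklearn_version, (1, 5)) else "older than 1.5"
    print(f"scikit-learn {sklearn_version}: {status}")
```

Run it with the virtual environment activated so that it reports on that environment rather than the system-wide Python installation.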
When you are finished working on the project, you can deactivate the virtual environment:
deactivate
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Download the example code files
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/scikit-learn-Cookbook-Third-Edition.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file...




