Models Demystified
A Practical Guide from t-tests to Deep Learning
Preface
Hello and welcome! This book is your companion to exploring the realm of modeling in data science. It is designed to provide you with something useful whether you’re a beginner looking to learn some fundamentals or an experienced practitioner seeking a fresh perspective. Our goal is to equip you with a better understanding of how models work and how to use them, covering both basic and more advanced techniques, from linear regression to deep learning. We’ll also show how different models relate to one another, to better empower you to apply them successfully in your own data-driven projects. We aim to provide an overview of how to use both machine learning and traditional statistical modeling in a practical fashion, with a balanced emphasis on interpretability and predictive power. Join us on this exciting journey as we explore the world of models in data science!
This is still a work in progress, with more to come and plenty of things still to clean up. We hope to have the print version out with CRC Press by the end of 2024. We welcome any feedback in the meantime as it develops, so please feel free to create an issue. If you would like to contribute, please see the contributing page for more information. Thanks for reading!
What Will You Get Out of This Book?
We’re hoping for a couple of things for you as you read through this book. In particular, if you’re starting your journey into data science, we hope you’ll leave with:
- A firm understanding of modeling basics from a practical perspective
- A toolset of models and related ideas that you can instantly apply for competent modeling
If you’re already familiar with modeling, we hope you’ll leave with:
- Additional context for the models you already know
- Some introduction to models you don’t know
- Additional understanding of how to choose the right model for the job and what to focus on
For anyone reading this book, we especially hope you get a sense of the commonalities between different models and a good sense of how they work.
Brief Prerequisites
You’ll definitely want to have some familiarity with R or Python (both are used for examples), and some very basic knowledge of statistics will be helpful. We’ll try to explain things as we go, but we won’t be able to cover everything. If you’re looking for a good introduction, we recommend R for Data Science for R and Python for Data Analysis for Python. Beyond that, we’ll try to provide the context you need so that you can be comfortable trying things out.
Also, if you happen to be reading this book in print, you can find the book in web form at https://m-clark.github.io/book-of-models. There you’ll find all the code, figures, and other content that you can interact with more easily, as well as the most up-to-date content, fixes, etc. The web version will be updated with some regularity and have additional content as well.
Data & Code
All the data and code used in this book are available on the book’s GitHub repository. See the data descriptions in the data section for more information on each of the datasets used. Notebooks with chapter code are also available there (if applicable).