Introduction
Overview
Mixed models are an extremely useful modeling tool for situations in which the observations in the data are not independent of one another, with the correlation typically arising because the observations are clustered in some way. For example, it is quite common to have repeated measurements for the units of observation, or to have units of observation that are otherwise grouped together (e.g. students within schools, cities within geographic regions). While there are different ways to approach such a situation, mixed models are a very common and powerful choice. In addition, they have ties to other statistical approaches that further expand their applicability.
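To make the idea of clustered observations concrete, here is a minimal sketch. It assumes the lme4 package is installed and uses its bundled sleepstudy data, which is an illustration chosen here and not one of this document's data sets. Each subject is measured on several days, so rows from the same subject are not independent. The object names fit_lm and fit_mixed are arbitrary.

```r
library(lme4)

# sleepstudy ships with lme4: reaction times for 18 subjects,
# each measured on 10 days (an illustrative data set, not one from this document)
data("sleepstudy", package = "lme4")

# Each subject contributes several rows, so the observations are clustered
obs_per_subject <- table(sleepstudy$Subject)
print(obs_per_subject)

# A standard regression treats all 180 rows as independent
fit_lm <- lm(Reaction ~ Days, data = sleepstudy)

# A mixed model adds a subject-specific (random) intercept
# to account for the repeated measurements
fit_mixed <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

print(summary(fit_mixed))
```

The `(1 | Subject)` term is what distinguishes the second model from the first: it lets each subject have their own baseline while still estimating a single overall effect of Days.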
Goals
The goal here is primarily to provide a sense of when one would use mixed models, along with a variety of standard techniques for implementing them. While it can be read as a standalone treatment, the document originally served as the basis for a workshop, and exercises from that workshop are available to practice your skills.
Prerequisites
The document is for the most part very applied in nature, and only assumes a basic understanding of standard regression models. Use of R for regression modeling is also assumed, though there will be some review. Demonstrations will be done almost entirely with the lme4 package.
Note the following color coding used in this document:
- emphasis
- package
- function
- object/class
- link
Data and Exercises
All data used can be downloaded here.
If you are interested in doing the examples and exercises, you can follow these steps.
- Download the zip file at GitHub. Be mindful of where you put it.
- Unzip it. Be mindful of where you put the resulting folder.
- Open RStudio.
- File/Open Project and click on the blue icon (mixed-models-with-r-workshop-2019.Rproj) in the folder you just created.
- File/Open. Click on the ReadMe file and do what it says.
Otherwise just download the data files from GitHub.
Key packages
To run the code in this document you’ll really only need the following:
- lme4
- tidyverse: for data processing
- merTools: optional
- glmmTMB: optional
- brms: optional
- modelr: optional
- nlme: ships with R as a recommended package, no need to install
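One way to check for the packages listed above and install any that are missing is sketched below. The split into required and optional follows the list, the helper name not_installed is made up for illustration, and the sketch assumes you install from CRAN. Installation only runs in an interactive session, where R can prompt for a CRAN mirror if none is set.

```r
# Packages named in the list above; nlme is omitted because it
# comes with standard R installations
required <- c("lme4", "tidyverse")
optional <- c("merTools", "glmmTMB", "brms", "modelr")

# Hypothetical helper: return the packages in `pkgs` that are not yet installed
not_installed <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

missing_required <- not_installed(required)
missing_optional <- not_installed(optional)

if (length(missing_required) > 0) {
  message("Missing required packages: ",
          paste(missing_required, collapse = ", "))
  # Only install when a person is at the console, since install.packages()
  # needs a CRAN mirror and may ask for one
  if (interactive()) {
    install.packages(missing_required)
  }
}

if (length(missing_optional) > 0) {
  message("Optional packages not installed: ",
          paste(missing_optional, collapse = ", "))
}
```

To install the optional packages as well, pass missing_optional to install.packages() in the same way.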
I also use a custom package called mixedup that provides more usable and consistent output for mixed models from lme4, brms, mgcv, etc. Much of the output you see will come from that.