Here I will go into a bit of detail regarding rstanarm and brms. For standard models, these should be your first choice rather than using Stan directly. Why? For one, the underlying code will be more optimized and efficient than what you would come up with, and that code has had multiple people developing it and hundreds actually using it. Furthermore, you can still, and probably should, set your priors as you wish.
The nice thing about both is that you use the same syntax that you do for R modeling in general. Here is a basic GLM in both.
```r
stan_glm(y ~ x + z, data = d)

brm(y ~ x + z, data = d)
```
And here are a couple of complexities thrown in to show some minor differences. For example, the priors are specified a bit differently, and you may have options for one that you won’t have in the other, but both will allow passing standard arguments, like cores, chains, etc. to rstan.
```r
stan_glmer(
  y ~ x + z + (1 | g),
  data = d,
  family = binomial(link = "logit"),
  prior = normal(0, 1),
  prior_covariance = decov(regularization = 1, concentration = .5),
  QR = TRUE,
  chains = 2,
  cores = 2,
  iter = 2000
)

brm(
  y ~ x + z + (1 | g),
  data = d,
  family = bernoulli(link = "logit"),
  prior = prior(normal(0, 1), class = b) +
    prior(cauchy(0, 5), class = sd),
  chains = 2,
  cores = 2,
  iter = 2000
)
```
So the syntax is easy to use for both of them, and to a point identical to standard R modeling syntax, and both have the same rstan arguments. However, you’ll need to know what’s available to tweak and how to do so specifically for each package.
Standard Regression and GLM
A good starting point for getting more comfortable with Bayesian analysis is to use it on what you’re already more comfortable with, e.g. the standard linear or generalized linear model, and rstanarm and brms both will do this for you. In general, for these models I would suggest rstanarm, as it will run much faster and is optimized for them.
It’s not ideal that for the most common linear models R has multiple functions, some requiring additional packages. So we have the following for standard linear, GLM, and categorical models:
- aov: ANOVA
- lm: standard regression (linear model)
- glm: generalized linear model
- MASS::glm.nb: negative binomial for count data
- MASS::polr: ordinal regression model
- nnet::multinom: multinomial regression model
- biglm::biglm: big data lm
Unfortunately, rstanarm keeps this nomenclature, and it currently doesn’t offer anything for multinomial models. Thus we have:
- stan_aov: ANOVA
- stan_lm: standard regression (linear model)
- stan_glm: generalized linear model
- stan_glm.nb: negative binomial for count data (or use the neg_binomial_2 family with stan_glm)
- stan_polr: ordinal regression model
- stan_biglm: big data lm
Contrast this with brms, which only requires the brm function and an appropriate family, e.g. ‘poisson’ or ‘categorical’, and which can also do multinomial models.
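For instance, a count or nominal outcome in brms is just a change of family (a sketch, assuming a data frame d with outcome y and predictors x and z):

```r
library(brms)

# switching outcome types only requires swapping the family argument
brm(y ~ x + z, data = d, family = poisson())      # count outcome
brm(y ~ x + z, data = d, family = categorical())  # nominal outcome with > 2 levels
```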
However, if you want to do a standard linear regression, I would probably not recommend using stan_lm, as it requires a prior for the \(R^2\), which is unfamiliar and only explained in technical ways that are likely going to be lost on those less comfortable with, or new to, statistical or Bayesian analysis54. The good news is that you can simply run stan_glm instead, and work with the prior on the regression coefficients as we have discussed, and you can use bayes_R2 to get the \(R^2\).
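A sketch of this alternative (bayes_R2 is part of rstanarm; d, y, x, and z are placeholders):

```r
library(rstanarm)

# standard linear model via stan_glm, with a familiar prior on the coefficients
fit <- stan_glm(y ~ x + z, data = d, prior = normal(0, 1))

bayes_R2(fit)  # posterior distribution of R^2 for the fitted model
```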
You can certainly use brms for a GLM, but it has to compile its model code first, and so will always be relatively, though not prohibitively, slower. For linear models with interactions, or GLMs generally, you may prefer it for the marginal effects plots and other additional functionality.
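For example (a sketch; in recent brms versions the marginal_effects function was renamed conditional_effects):

```r
library(brms)

# model with an interaction, then plots of the (conditional) effects
fit <- brm(y ~ x * z, data = d)
conditional_effects(fit)  # formerly marginal_effects()
```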
If you’re just doing a standard logistic regression, I’d suggest stan_glm, again for the speed. rstanarm also has a specific function for conditional logistic regression (stan_clogit). Beyond that, I’d recommend brms. For ordinal regression, stan_polr goes back to requiring a prior for the \(R^2\), which is now the \(R^2\) for the underlying latent variable of the ordinal outcome55. Furthermore, brms has some ordinal-specific plots, as well as other types of ordinal regression (e.g. adjacent category) that allow the proportional odds assumption to be relaxed. It can also do multi-category models.
```r
brm(y ~ x, family = 'categorical')  # nominal outcome with > 2 levels
brm(y ~ cs(x), family = 'acat')     # ordinal model with category-specific effects for x
```
brms families for categorical:
- bernoulli: binary target
- categorical: nominal target
- cumulative, sratio, cratio, and acat: ordinal outcomes (cumulative, stopping ratio, continuation ratio, adjacent category)
Extended Count Models
For going beyond the binomial, poisson, and negative binomial distributions for count data, brms offers many common extensions to those models, and more. It also has zero-altered counterparts for continuous distributions (e.g. hurdle_gamma).
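For example (a sketch; these family functions are part of brms, and d, y, and x are placeholders):

```r
library(brms)

brm(y ~ x, data = d, family = zero_inflated_poisson())  # excess zeros mixed into a count model
brm(y ~ x, data = d, family = hurdle_negbinomial())     # zeros modeled separately from positive counts
brm(y ~ x, data = d, family = hurdle_gamma())           # zero-altered continuous outcome
```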
As mentioned previously, there is currently no direct way to do multinomial count models, though one can get there indirectly via the poisson distribution.
Mixed Models

The Bayesian approach really shines for mixed models in my opinion. The random effects are estimated like any other parameter, so complicated structures are notably easier to deal with, and extending such models to other distribution families is straightforward. For the usual speed boost, you can use rstanarm:
- stan_lmer: standard lme4 style mixed model
- stan_glmer: glmm
- stan_glmer.nb: for negative binomial
- stan_nlmer: nlme (but see stan_gamm4)
- stan_mvmer: multivariate outcome
- stan_gamm4: generalized additive mixed model in lme4 style
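A couple of these in action (a sketch, with a placeholder data frame d and grouping variable g; the formula syntax mirrors lme4):

```r
library(rstanarm)

stan_lmer(y ~ x + (1 + x | g), data = d)                 # random intercepts and slopes for x
stan_glmer(y ~ x + (1 | g), data = d, family = poisson)  # mixed model for a count outcome
```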
I would probably recommend rstanarm only for stan_lmer and stan_glmer, as brms has more flexibility beyond those, and would even be preferred for the standard models if you want to estimate a residual (co-)variance structure, e.g. autocorrelation. It will also do multivariate models, and one can use mgcv::s for smooth terms in any brms model.
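For instance (a sketch; ar and mvbind are part of the brms formula syntax, and t, g, y1, and y2 are placeholders):

```r
library(brms)

# residual AR(1) autocorrelation over time t within group g
brm(y ~ x + ar(time = t, gr = g), data = d)

# multivariate outcome with an mgcv-style smooth term for x
brm(mvbind(y1, y2) ~ s(x) + (1 | g), data = d)
```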
Even More Packages
I’ve focused on the two widely-used general-purpose packages, but nothing can stop Stan at this point. As of the latest update to this document, there are now 144 other packages depending on rstan in some way. There are also interfaces for Python, Julia, and more. Odds are good you’ll find one to suit your needs.
The developers note in their vignette for ANOVA:
‘but it is reasonable to expect a researcher to have a plausible guess for R2 before conducting an ANOVA.’
Actually, I’m not sure how reasonable this is. In consulting I’ve seen many, many researchers of varying levels of expertise, and I don’t think even a large minority of them would be able to hazard much of a guess about \(R^2\) before running a model, unless they’re essentially duplicating previous work. I also haven’t come across an explanation in the documentation (which is otherwise great) of how to specify it that would be very helpful to people just starting out with Bayesian analysis or even statistics in general. If the result is that one then has to try a bunch of different priors, then that becomes the focus of the analytical effort, which likely won’t appeal to people just wanting to run a standard regression model.↩︎
If someone tells me they know what the prior should be for that, I probably would not believe them.↩︎