Easy Bayes with rstanarm and brms

rstanarm: GLM

rstanarm uses the same nomenclature and general approach as base R: stan_glm takes the same formula, data, and family arguments as glm.

library(rstanarm)
attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance, 
                            family = poisson)
summary(attendance_bglm, digits = 2, probs = c(.025, .5, .975))


Model Info:

 function:     stan_glm
 family:       poisson [log]
 formula:      daysabs ~ math + gender + prog
 algorithm:    sampling
 priors:       see help('prior_summary')
 sample:       4000 (posterior sample size)
 observations: 314
 predictors:   5

Estimates:
                mean     sd       50%      2.5%     97.5% 
(Intercept)       1.49     0.08     1.49     1.33     1.65
math             -0.01     0.00    -0.01    -0.01    -0.01
genderMale       -0.24     0.05    -0.24    -0.33    -0.15
progGeneral       1.27     0.08     1.27     1.12     1.42
progAcademic      0.84     0.07     0.84     0.71     0.98
mean_PPD          5.95     0.20     5.96     5.57     6.33
log-posterior -1324.70     1.57 -1324.39 -1328.55 -1322.59

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.00 1.00 1862 
math          0.00 1.00 3255 
genderMale    0.00 1.00 3474 
progGeneral   0.00 1.00 1845 
progAcademic  0.00 1.00 1758 
mean_PPD      0.00 1.00 3914 
log-posterior 0.04 1.00 1994 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
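
Because the family is poisson with a log link, the estimates above are on the log scale of the expected number of days absent. Exponentiating them gives multiplicative rate ratios, which are often easier to read. A minimal sketch, using the posterior means copied by hand from the Estimates table (the vector name `post_means` is only illustrative):

```r
# Posterior means copied from the Estimates table (log scale)
post_means <- c(math = -0.01, genderMale = -0.24,
                progGeneral = 1.27, progAcademic = 0.84)

# Exponentiate to get multiplicative effects on the expected count
rate_ratios <- exp(post_means)
round(rate_ratios, 2)
#         math   genderMale  progGeneral progAcademic
#         0.99         0.79         3.56         2.32
```

With the fitted object at hand, it is usually better to exponentiate the posterior draws themselves (for example via as.matrix(attendance_bglm)) and then summarize, rather than transforming the summaries.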

Summary Info:

These play the same role as the coefficient table in any other regression model:

mean: the posterior mean, used as the point estimate for the parameter
sd: the posterior standard deviation, the Bayesian analogue of the standard error
quantiles: whichever you request; here they give the median and the bounds of the 95% uncertainty interval
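
Each of these summaries is just a statistic of the posterior draws for one parameter. A rough sketch in base R, where simulated normal draws stand in for one column of as.matrix(attendance_bglm) (the simulated values are an assumption for illustration, not output from the model):

```r
set.seed(1234)

# Stand-in for 4000 posterior draws of a single coefficient
draws <- rnorm(4000, mean = -0.24, sd = 0.05)

# The same columns the summary table reports
post_summary <- c(mean = mean(draws),
                  sd   = sd(draws),
                  quantile(draws, probs = c(0.025, 0.5, 0.975)))
round(post_summary, 2)
```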

Additional:

mean_PPD: mean of the posterior predictive distribution of the outcome; it should be close to the observed mean of the target variable (daysabs)
log-posterior: the log of the posterior density, up to a constant; the Bayesian counterpart to the log-likelihood from maximum likelihood
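
To make the mean_PPD idea concrete, here is a small sketch that does not use the attendance data or MCMC at all. It assumes a hypothetical Poisson outcome and an intercept-only model with a conjugate Gamma prior, so posterior draws can be simulated directly; the prior values and variable names are illustrative assumptions. With a real rstanarm fit, posterior_predict(attendance_bglm) would supply the replicated data sets instead.

```r
set.seed(1234)
n <- 314                      # same number of observations as the example
y <- rpois(n, lambda = 6)     # hypothetical outcome, NOT the real daysabs

# Conjugate Gamma posterior for the Poisson rate (stand-in for MCMC draws);
# the Gamma(0.001, 0.001) prior is an assumption for this sketch
lambda_draws <- rgamma(4000, shape = 0.001 + sum(y), rate = 0.001 + n)

# One replicated data set per posterior draw; keep the mean of each one
mean_ppd <- vapply(lambda_draws, function(l) mean(rpois(n, l)), numeric(1))

# The two numbers should be close if the model captures the outcome's mean
c(observed_mean = mean(y), mean_PPD = mean(mean_ppd))
```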

Diagnostics for quick eyeball inspection:

Monte Carlo Standard Error (mcse): the standard error of the mean of the posterior draws. Want mcse less than 10% of the posterior standard deviation.
\(n_{eff}\): an estimate of the effective number of independent draws from the posterior distribution of the estimand of interest. Because draws within a chain are not independent when there is autocorrelation, the effective sample size will usually be smaller than the total number of draws. Should be greater than 10% of the total posterior sample size (4000 here).
\(\hat{R}\): compares the variance of the pooled draws across chains to the average variance of the draws within each chain; if all chains are at equilibrium, these will be the same and \(\hat{R}\) will be one. Want less than 1.1.
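
The following sketch shows how such a convergence check can be computed by hand. It uses the basic (non-split) potential scale reduction factor on simulated chains, whereas rstanarm reports the split-chain version, so treat it as an illustration of the idea rather than a reproduction of the table. The function name `rhat_basic` and the simulated chains are assumptions.

```r
set.seed(1234)

# Basic (non-split) potential scale reduction factor.
# `chains` is an iterations x chains matrix of draws for one parameter.
rhat_basic <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))      # average within-chain variance
  B <- n * var(colMeans(chains))        # between-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)
}

# Four simulated chains that agree, and a copy where one chain sits elsewhere
good <- matrix(rnorm(4000), ncol = 4)
bad  <- good
bad[, 4] <- bad[, 4] + 3

rhat_good <- rhat_basic(good)   # close to 1
rhat_bad  <- rhat_basic(bad)    # well above 1.1

# Rules of thumb applied to the log-posterior row of the Diagnostics table
mcse_ok  <- 0.04 / 1.57 < 0.10  # mcse relative to posterior sd
n_eff_ok <- 1994 / 4000 > 0.10  # n_eff relative to total draws

c(rhat_good = rhat_good, rhat_bad = rhat_bad)
```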

Adding more options

A typical configuration sets priors as well as MCMC options such as the number of iterations, warm-up, and thinning.

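attendance_bglm <- stan_glm(daysabs ~ math + gender + prog,
                            data = attendance,
                            family = poisson,
                            prior = student_t(df = 7),            # Student-t prior on the coefficients
                            prior_intercept = student_t(df = 7),  # Student-t prior on the intercept
                            iter = 5000,                          # iterations per chain, including warm-up
                            warmup = 2000,                        # warm-up iterations, discarded
                            thin = 10,                            # keep every 10th draw
                            cores = 4,                            # run chains in parallel
                            seed = 1234)                          # for reproducibility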
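
These settings change how many draws end up in the posterior sample. A quick back-of-the-envelope calculation, assuming the default of 4 chains (the call above does not set `chains`):

```r
chains <- 4      # rstanarm default; not set explicitly in the call above
iter   <- 5000   # iterations per chain, including warm-up
warmup <- 2000   # discarded warm-up iterations per chain
thin   <- 10     # keep every 10th draw

draws_per_chain <- (iter - warmup) / thin
total_draws     <- chains * draws_per_chain
total_draws
# 1200
```

So this configuration keeps 1200 draws, compared with the 4000 of the default run. After fitting, prior_summary(attendance_bglm) shows the priors that were actually used.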