rstanarm: GLM

rstanarm uses the same nomenclature and general approach as base R: stan_glm takes the formula, family, and data arguments familiar from glm.
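
As a rough sketch of that parallel, the block below fits the same formula with glm and, if rstanarm is installed, with stan_glm. The data frame `attendance`, its factor levels, and the coefficients used to simulate it are hypothetical stand-ins, since the original data set is not shown here, so the numbers it prints will differ from the output in this section.

```r
# Hypothetical stand-in data; column names follow the formula used in this section.
set.seed(1)
n <- 314
attendance <- data.frame(
  math   = round(runif(n, 1, 99)),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE),
                  levels = c("Female", "Male")),
  prog   = factor(sample(c("Vocational", "General", "Academic"), n, replace = TRUE),
                  levels = c("Vocational", "General", "Academic"))
)

# Simulate counts from a Poisson model with made-up coefficients.
eta <- 1.5 - 0.01 * attendance$math -
  0.25 * (attendance$gender == "Male") +
  1.2  * (attendance$prog == "General") +
  0.8  * (attendance$prog == "Academic")
attendance$daysabs <- rpois(n, exp(eta))

# Base R: maximum likelihood fit.
fit_glm <- glm(daysabs ~ math + gender + prog,
               family = poisson, data = attendance)

# rstanarm: same formula, family, and data; only the function name changes.
# Guarded so the script still runs when rstanarm is not installed.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  fit_stan <- rstanarm::stan_glm(daysabs ~ math + gender + prog,
                                 family = poisson, data = attendance,
                                 refresh = 0)  # refresh = 0 silences sampler progress
  print(summary(fit_stan))
}
```

With the default settings (4 chains, 2000 iterations each, half of them warmup) the fit keeps 4000 posterior draws, which matches the sample size reported below.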


Model Info:

 function:     stan_glm
 family:       poisson [log]
 formula:      daysabs ~ math + gender + prog
 algorithm:    sampling
 priors:       see help('prior_summary')
 sample:       4000 (posterior sample size)
 observations: 314
 predictors:   5

Estimates:
                mean     sd       50%      2.5%     97.5% 
(Intercept)       1.49     0.08     1.49     1.33     1.65
math             -0.01     0.00    -0.01    -0.01    -0.01
genderMale       -0.24     0.05    -0.24    -0.33    -0.15
progGeneral       1.27     0.08     1.27     1.12     1.42
progAcademic      0.84     0.07     0.84     0.71     0.98
mean_PPD          5.95     0.20     5.96     5.57     6.33
log-posterior -1324.70     1.57 -1324.39 -1328.55 -1322.59

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.00 1.00 1862 
math          0.00 1.00 3255 
genderMale    0.00 1.00 3474 
progGeneral   0.00 1.00 1845 
progAcademic  0.00 1.00 1758 
mean_PPD      0.00 1.00 3914 
log-posterior 0.04 1.00 1994 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

Summary Info:

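These play the same role as the estimates in any other regression summary:

  • mean: the posterior mean, used as the point estimate for the parameter
  • sd: the posterior standard deviation, the Bayesian counterpart of the standard error of the point estimate
  • quantiles: any quantiles you request; here they give the median (50%) and the bounds of the 95% uncertainty interval (2.5% and 97.5%)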
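
To make the columns concrete, here is a minimal sketch of how such a row could be computed from posterior draws. The draws are simulated stand-ins (an assumption, chosen to resemble the genderMale row); with a fitted model they would be one column of as.matrix(fit).

```r
set.seed(42)
# Stand-in for one parameter's 4000 posterior draws (made-up values).
draws <- rnorm(4000, mean = -0.24, sd = 0.05)

est <- c(mean = mean(draws),                        # point estimate
         sd   = sd(draws),                          # posterior standard deviation
         quantile(draws, probs = c(0.5, 0.025, 0.975)))  # median and 95% interval
print(round(est, 2))
```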

Additional:

  • mean_PPD: mean of the posterior predictive distribution of the outcome; it should be close to the observed mean of the target variable (daysabs)
  • log-posterior: the log of the posterior density, up to a constant; similar to the log-likelihood from maximum likelihood, but for the Bayesian case
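
The idea behind mean_PPD can be sketched in base R: simulate one replicated data set per posterior draw, take its mean, and compare the average of those means with the observed mean. Everything here is an illustrative assumption, including the made-up outcome and the intercept-only model; it is not how rstanarm computes the value internally.

```r
set.seed(7)
n_obs   <- 314
n_draws <- 4000

y <- rpois(n_obs, lambda = 6)   # made-up observed outcome

# Made-up posterior draws of a single log-rate (intercept-only Poisson model).
log_rate <- rnorm(n_draws, mean = log(mean(y)), sd = 0.02)

# One replicated data set per draw; keep only its sample mean.
ppd_means <- vapply(log_rate,
                    function(b) mean(rpois(n_obs, exp(b))),
                    numeric(1))

# The two numbers should be close if the model reproduces the data's average.
print(c(mean_PPD = mean(ppd_means), observed = mean(y)))
```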

Diagnostics for quick eyeball inspection:

  • Monte Carlo Standard Error (mcse): the standard error of the posterior mean that comes from estimating it with a finite number of draws. It should be less than 10% of the posterior standard deviation.
  • \(n_{eff}\): an estimate of the effective number of independent draws from the posterior distribution of the estimand of interest. Because draws within a chain are autocorrelated rather than independent, the effective sample size is usually smaller than the total number of draws. It should be greater than 10% of the total posterior sample size (4000 here).
  • \(\hat{R}\): compares the average variance of draws within each chain to the variance of the pooled draws across chains; if all chains are at equilibrium, these will be the same and \(\hat{R}\) will be one. It should be less than 1.1.
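
These rules of thumb can be checked mechanically. The sketch below types in the Rhat and n_eff columns from the Diagnostics table above. Because the printed mcse and sd are rounded to two decimals, it assumes the usual relation mcse ≈ sd / sqrt(n_eff), so that mcse / sd ≈ 1 / sqrt(n_eff), rather than dividing the rounded values.

```r
n_draws <- 4000  # posterior sample size reported under "Model Info"

diag <- data.frame(
  parameter = c("(Intercept)", "math", "genderMale", "progGeneral",
                "progAcademic", "mean_PPD", "log-posterior"),
  Rhat  = c(1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00),
  n_eff = c(1862, 3255, 3474, 1845, 1758, 3914, 1994)
)

# Assumption: mcse is roughly sd / sqrt(n_eff), so this approximates mcse / sd.
diag$mcse_over_sd <- 1 / sqrt(diag$n_eff)
diag$n_eff_ratio  <- diag$n_eff / n_draws

# Apply the three thresholds listed above.
diag$ok <- diag$mcse_over_sd < 0.10 &
  diag$n_eff_ratio > 0.10 &
  diag$Rhat < 1.1

print(diag)
```

Under that assumption, the mcse rule amounts to requiring an effective sample size above 100, which every parameter here clears comfortably.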