The following serves as a practical and applied introduction to Bayesian estimation methods for the uninitiated. The goal is to provide just enough information in a brief format to allow one to feel comfortable exploring Bayesian data analysis for themselves, assuming they have the requisite context to begin with. The idea is to cover a similar amount of material as one would in part of a standard statistics sequence in various applied disciplines where statistics is being introduced in general.
After a conceptual introduction, a fully visible by-hand example is provided using the binomial distribution. After that, the document proceeds to introduce fully Bayesian analysis with the standard linear regression model, as that is the basis for most applied statistics courses and is assumed to be most familiar to the reader. Model diagnostics, model enhancements, and additional modeling issues are then explored. Supplemental content in the appendix provides more technical detail if desired, and includes a maximum likelihood refresher, an overview of programming options in Bayesian analysis, the same regression model using BUGS and JAGS, and ‘by-hand’ code for the model using the Metropolis-Hastings and Hamiltonian Monte Carlo algorithms.
Prerequisites include a basic statistical exposure such as what would be covered in a typical (probably graduate) applied science statistics course. At least some familiarity with R is necessary to follow the code, but that itself is not necessary, and one may go through any number of introductions on the web to acquire enough knowledge in that respect. However, note that for the examples here, at least part of the code will employ some Bayesian-specific programming language (e.g. Stan primarily, BUGS and JAGS in the appendix). No attempt is made to teach those languages though, as it would be difficult to do so efficiently in this more conceptually oriented setting. As such, it is suggested that one follow the code as best they can, and investigate the respective manuals, relevant texts, etc. further on their own. Between the text and comments within the code it is hoped that what the code is accomplishing will be fairly clear. However, I also provide a set of notes that can serve as an overview of Stan here.
This document relies heavily on Gelman et al. (2013), which I highly recommend, if one is ready for it. Other sources used, or particularly pertinent to the material for this document, can be found in the references section at the end. Some are more introductory, and which might be more suitable depending on the context you bring.
This document focuses more on concepts and teaching the Bayesian approach to modeling, while using Stan more as a practical vehicle. However, I do want to say something about the development of Stan and the document itself. Since this document was first put together in Summer of 2014, Stan and associated packages have undergone vast and continued development, and tools for publishing documents with R have as well. Packages such as rstanarm and brms didn’t exist then, but now make even some fairly complicated models easy to pull off with just a line or two of standard R code, and without ever having to use the Stan language directly. Further, newer visualization tools such as bayesplot and shinystan make model exploration even easier than before. While the document now more reflects these developments, it still leans on conducting analysis more explicitly so that things are not so much of a black box. However, it’s worth noting that the example regression model within and associated diagnostics and visuals would now only take a few lines of code, and fully within R. In short, the combination of R and Stan make Bayesian analysis easier than ever before. Once armed with the basic concepts one should feel able to dive as easily as they would with any other R modeling tool.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd ed.