Backflip Picture

Reversible-Jump (RJ)


Aspect
Description
Acceptance Rate The optimal acceptance rate is unknown (not for reversible-jump in general, but for the CHARM algorithm within this version of reversible-jump). As such, the observed acceptance rate may be suitable in the interval [10%,90%].
Applications This version of reversible-jump is intended for variable selection and Bayesian Model Averaging (BMA).
Difficulty This algorithm is difficult for a beginner. There are several algorithm specifications. Convergence will be difficult to assess.
Final Algorithm? Yes.
Proposal Componentwise.

Reversible-jump Markov chain Monte Carlo (RJMCMC) was introduced by Green (1995) as an extension to MCMC in which the dimension of the model is uncertain and to be estimated. Reversible-jump belongs to a family of trans-dimensional algorithms, and it has many applications, including variable selection, model selection, mixture component selection, and more. The RJ algorithm, here, is one of the simplest possible implementations and is intended for variable selection and Bayesian Model Averaging (BMA).

The Componentwise Hit-And-Run Metropolis (CHARM) algorithm was selected in LaplacesDemon to be extended with reversible-jump. CHARM was selected because it does not have tuning parameters, it is not adaptive (which simplifies things with RJ), and it performs well. Even though it is a componentwise algorithm, it does not evaluate all potential predictors each iteration, but only those that are included. This novel combination is Reversible-Jump (RJ) with Componentwise Hit-And-Run Metropolis (CHARM). The optimal acceptance rate, and a good suggested acceptance rate range, are unknowns.

RJ proceeds by making two proposals during each iteration. First, a within-model move is proposed. This means that the model dimension does not change, and the algorithm proceeds like a traditional CHARM algorithm. Next, a between-models move is proposed, where a selectable parameter is sampled, and its status in the model is changed. If this selected parameter is in the current model, then RJ proposes a model that excludes it. If this selected parameter is not in the current model, then RJ proposes a model that includes it. RJ also includes a user-specified binomial prior distribution for the expected size of the model (the number of included, selectable parameters), as well as user-specified prior probabilities for the inclusion of each of the selectable parameters.

Behind the scenes, RJ keeps track of the most recent non-zero value for each selectable parameter. If a parameter is currently excluded, then its value is currently set to zero. When this parameter is proposed to be included, the most recent non-zero value is used as the basis of the proposal, rather than zero. In this way, an excluded parameter does not have to travel back toward its previously included density, which may be far from zero. However, if RJ begins updating after a previous run had ended, then it will not begin again with this most recent non-zero value. Please keep this in mind with this implementation of RJ.

RJ has five algorithm specifications. bin.n is the scalar size parameter of the binomial prior distribution for model size, and is the maximum size that RJ will explore. bin.p is the scalar probability parameter of the binomial prior distribution for model size, and the mean or median expected model size is bin.n x bin.p. parm.p is a vector of parameter-inclusion probabilities that must be equal in length to the number of initial values. selectable is a vector of indicators (0 or 1) that indicate whether or not a parameter is selectable by reversible-jump. When an element is zero, it is always in the model. Finally, the selected vector contains indicators of whether or not each parameter is in the model when RJ begins to update. All of these specifications must receive an argument with exactly that name (such as bin.n=bin.n, for the Consort function to recognize it, with the exception of the selected specification.

RJ provides a sampler-based alternative to variable selection, compared with the Bayesian Adaptive Lasso (BAL) or Stochastic Search Variable Selection (SSVS), as two of many other approaches. Examples of BAL and SSVS are in the Examples vignette. Advantages of RJ are that it is easier for the user to apply to a given model than writing the variable-selection code into the model, and RJ requires fewer parameters, because variable-inclusion is handled by the specialized sampler, rather than the model specification function. A disdvantage of RJ is that other methods allow the user to freely change to other MCMC algorithms, if the current algorithm is unsatisfactory.

References

  • Green P (1995). "Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination." Biometrika, 82, 711-732.

Bookmark this page