Ordinal Regression

The following demonstrates a standard cumulative link ordinal regression model via maximum likelihood. Default is with probit link function. Alternatively you can compare it with a logit link, which will result in values roughly 1.7*parameters estimates from the probit.

Data

This data generation is from the probit perspective, where the underlying continuous latent variable is normally distributed.

library(tidyverse)

set.seed(808)

N = 1000                                       # Sample size
x = cbind(x1 = rnorm(N), x2 = rnorm(N))        # predictor variables
beta   = c(1,-1)                               # coefficients
y_star = rnorm(N, mean = x %*% beta)           # the underlying latent variable
y_1 = y_star > -1.5                            # -1.50 first cutpoint
y_2 = y_star > .75                             #  0.75 second cutpoint
y_3 = y_star > 1.75                            #  1.75 third cutpoint
y   = y_1 + y_2 + y_3 + 1                      # target

table(y)
y
  1   2   3   4 
175 495 182 148 
d = data.frame(x, y = factor(y))

Function

ordinal_ll <- function(par, X, y, probit = TRUE) {
  K = length(unique(y))                        # number of classes K
  ncuts = K-1                                  # number of cutpoints/thresholds
  cuts  = par[(1:ncuts)]                       # cutpoints
  beta  = par[-(1:ncuts)]                      # regression coefficients
  lp    = X %*% beta                           # linear predictor
  ll    = rep(0, length(y))                    # log likelihood
  pfun  = ifelse(probit, pnorm, plogis)        # which link to use
  
  for(k in 1:K){
    if (k==1) {
      ll[y==k] = pfun((cuts[k] - lp[y==k]), log = TRUE)
    }
    else if (k < K) {
      ll[y==k] = log(pfun(cuts[k] - lp[y==k]) - pfun(cuts[k-1] - lp[y==k]))
    }
    else {
      ll[y==k] = log(1 - pfun(cuts[k-1] - lp[y==k])) 
    }
  }
  
  -sum(ll)
}

Estimation

init = c(-1, 1, 2, 0, 0)         # initial values
fit_probit = optim(
  init,
  ordinal_ll,
  y = y,
  X = x,
  probit  = TRUE,
  control = list(reltol = 1e-10)
)

fit_logit  = optim(
  init,
  ordinal_ll,
  y = y,
  X = x,
  probit  = FALSE,
  control = list(reltol = 1e-10)
)

Comparison

We can compare our results with the ordinal package.

library(ordinal)
fit_ordpack_probit = clm(y ~ x1 + x2, data = d, link = 'probit')
fit_ordpack_logit  = clm(y ~ x1 + x2, data = d, link = 'logit')
method cut_1 cut_2 cut_3 beta1 beta2
probit ordinal_ll -1.609 0.731 1.798 1.019 -1.051
probit ordpack -1.609 0.731 1.798 1.019 -1.051
logit ordinal_ll -2.872 1.284 3.168 1.806 -1.877
logit ordpack -2.872 1.284 3.168 1.806 -1.877