Quantile Regression

Data Setup

We’ll use the quantreg package for comparison, and the classic data set on Belgian household income and food expenditure. Scale income if you want a meaningful ‘centercept’.

library(tidyverse)

library(quantreg)

data(engel)          
# engel$income = scale(engel$income)

X = cbind(1, engel$income)
colnames(X) = c('Intercept', 'income')

Function

Loss function. It really is this simple.

qreg <- function(par, X, y, tau) {
  lp = X%*%par
  res = y - lp
  loss = ifelse(res < 0 , -(1 - tau)*res, tau*res)
  sum(loss)
}

Estimation

We’ll estimate the median to start. Compare optim output with quantreg package.

optim(
  par = c(intercept = 0, income = 0),
  fn  = qreg,
  X   = X,
  y   = engel$foodexp,
  tau = .5
)$par
 intercept     income 
81.4853550  0.5601706 
rq(foodexp ~ income, tau = .5, data = engel)
Call:
rq(formula = foodexp ~ income, tau = 0.5, data = engel)

Coefficients:
(Intercept)      income 
 81.4822474   0.5601806 

Degrees of freedom: 235 total; 233 residual

Other quantiles

Now we will add additional quantiles to estimate.

# quantiles
qs = c(.05, .1, .25, .5, .75, .9, .95)

fit_rq = coef(rq(foodexp ~ income, tau = qs, data = engel))

fit_qreg = map_df(qs, function(tau)
  data.frame(t(
    optim(
      par = c(intercept = 0, income = 0),
      fn  = qreg,
      X   = X,
      y   = engel$foodexp,
      tau = tau
    )$par
  )))

Comparison

Compare results.

coef tau= 0.05 tau= 0.10 tau= 0.25 tau= 0.50 tau= 0.75 tau= 0.90 tau= 0.95
fit_rq X.Intercept. 124.880 110.142 95.484 81.482 62.397 67.351 64.104
fit_rq income 0.343 0.402 0.474 0.560 0.644 0.686 0.709
fit_qreg intercept 124.881 110.142 95.484 81.485 62.400 67.332 64.143
fit_qreg income.1 0.343 0.402 0.474 0.560 0.644 0.686 0.709

Visualize

Let’s visualize the results.

Python

The above is available as a Python demo in the supplemental section.