Quantile Regression
Data Setup
We’ll use the quantreg package for comparison, and the classic data set on Belgian household income and food expenditure. Scale income if you want a meaningful ‘centercept’.
library(tidyverse)
library(quantreg)
data(engel)
# engel$income = scale(engel$income)
X = cbind(1, engel$income)
colnames(X) = c('Intercept', 'income')Function
Loss function. It really is this simple.
qreg <- function(par, X, y, tau) {
lp = X%*%par
res = y - lp
loss = ifelse(res < 0 , -(1 - tau)*res, tau*res)
sum(loss)
}Estimation
We’ll estimate the median to start. Compare optim output with quantreg package.
optim(
par = c(intercept = 0, income = 0),
fn = qreg,
X = X,
y = engel$foodexp,
tau = .5
)$par intercept income
81.4853550 0.5601706
rq(foodexp ~ income, tau = .5, data = engel)Call:
rq(formula = foodexp ~ income, tau = 0.5, data = engel)
Coefficients:
(Intercept) income
81.4822474 0.5601806
Degrees of freedom: 235 total; 233 residual
Other quantiles
Now we will add additional quantiles to estimate.
# quantiles
qs = c(.05, .1, .25, .5, .75, .9, .95)
fit_rq = coef(rq(foodexp ~ income, tau = qs, data = engel))
fit_qreg = map_df(qs, function(tau)
data.frame(t(
optim(
par = c(intercept = 0, income = 0),
fn = qreg,
X = X,
y = engel$foodexp,
tau = tau
)$par
)))Comparison
Compare results.
| coef | tau= 0.05 | tau= 0.10 | tau= 0.25 | tau= 0.50 | tau= 0.75 | tau= 0.90 | tau= 0.95 | |
|---|---|---|---|---|---|---|---|---|
| fit_rq | X.Intercept. | 124.880 | 110.142 | 95.484 | 81.482 | 62.397 | 67.351 | 64.104 |
| fit_rq | income | 0.343 | 0.402 | 0.474 | 0.560 | 0.644 | 0.686 | 0.709 |
| fit_qreg | intercept | 124.881 | 110.142 | 95.484 | 81.485 | 62.400 | 67.332 | 64.143 |
| fit_qreg | income.1 | 0.343 | 0.402 | 0.474 | 0.560 | 0.644 | 0.686 | 0.709 |
Visualize
Let’s visualize the results.
Python
The above is available as a Python demo in the supplemental section.
Source
Original code available at https://github.com/m-clark/Miscellaneous-R-Code/blob/master/ModelFitting/quantile_regression.Rmd