Quantile Regression
Data Setup
We’ll use the quantreg package for comparison, and the classic data set on Belgian household income and food expenditure. Scale income if you want a meaningful ‘centercept’.
library(tidyverse)
library(quantreg)
data(engel)
# engel$income = scale(engel$income)
= cbind(1, engel$income)
X colnames(X) = c('Intercept', 'income')
Function
Loss function. It really is this simple.
<- function(par, X, y, tau) {
qreg = X%*%par
lp = y - lp
res = ifelse(res < 0 , -(1 - tau)*res, tau*res)
loss sum(loss)
}
Estimation
We’ll estimate the median to start. Compare optim
output with quantreg package.
optim(
par = c(intercept = 0, income = 0),
fn = qreg,
X = X,
y = engel$foodexp,
tau = .5
$par )
intercept income
81.4853550 0.5601706
rq(foodexp ~ income, tau = .5, data = engel)
Call:
rq(formula = foodexp ~ income, tau = 0.5, data = engel)
Coefficients:
(Intercept) income
81.4822474 0.5601806
Degrees of freedom: 235 total; 233 residual
Other quantiles
Now we will add additional quantiles to estimate.
# quantiles
= c(.05, .1, .25, .5, .75, .9, .95)
qs
= coef(rq(foodexp ~ income, tau = qs, data = engel))
fit_rq
= map_df(qs, function(tau)
fit_qreg data.frame(t(
optim(
par = c(intercept = 0, income = 0),
fn = qreg,
X = X,
y = engel$foodexp,
tau = tau
$par
) )))
Comparison
Compare results.
coef | tau= 0.05 | tau= 0.10 | tau= 0.25 | tau= 0.50 | tau= 0.75 | tau= 0.90 | tau= 0.95 | |
---|---|---|---|---|---|---|---|---|
fit_rq | X.Intercept. | 124.880 | 110.142 | 95.484 | 81.482 | 62.397 | 67.351 | 64.104 |
fit_rq | income | 0.343 | 0.402 | 0.474 | 0.560 | 0.644 | 0.686 | 0.709 |
fit_qreg | intercept | 124.881 | 110.142 | 95.484 | 81.485 | 62.400 | 67.332 | 64.143 |
fit_qreg | income.1 | 0.343 | 0.402 | 0.474 | 0.560 | 0.644 | 0.686 | 0.709 |
Visualize
Let’s visualize the results.
Python
The above is available as a Python demo in the supplemental section.
Source
Original code available at https://github.com/m-clark/Miscellaneous-R-Code/blob/master/ModelFitting/quantile_regression.Rmd