Logistic Regression

The following demo estimates a standard logistic regression model via maximum likelihood or exponential loss. It can serve as an entry point to the wider world of computational statistics for those just starting out, as maximum likelihood is the fundamental estimation approach in most of applied statistics and is also a key aspect of the Bayesian approach. Exponential loss is not confined to the standard GLM setting, but is widely used in more predictive/‘algorithmic’ approaches, e.g. in machine learning and elsewhere.

This follows the approach of the linear regression model demo.

Data Setup

Predictors and target. This follows the same approach as the linear regression example, but now draws the target variable from the binomial distribution with size = 1 (i.e. a Bernoulli draw).

library(tidyverse)

set.seed(1235)  # ensures replication

N = 10000 # sample size
k = 2     # number of desired predictors
X = matrix(rnorm(N * k), ncol = k)

# the linear predictor
lp = -.5 + .2*X[, 1] + .5*X[, 2] # increasing N will get estimated values closer to these

y = rbinom(N, size = 1, prob = plogis(lp))

dfXy = data.frame(X, y)
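
As a quick check of the generated data, the observed proportion of ones should be near the average of the true probabilities:

mean(y)            # observed proportion of 1s
mean(plogis(lp))   # average of the true probabilities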

Functions

A maximum likelihood approach.

logreg_ml <- function(par, X, y) {
  # Arguments
  # par: parameters to be estimated
  # X: predictor matrix with intercept column
  # y: target
  
  # setup
  beta = par                                # coefficients
  N = nrow(X)
  
  # linear predictor
  LP = X %*% beta                           # linear predictor
  mu = plogis(LP)                           # mean via inverse logit link
  
  # calculate likelihood
  L = dbinom(y, size = 1, prob = mu, log = TRUE)         # log likelihood
  #   L =  y*log(mu) + (1 - y)*log(1-mu)    # alternate log likelihood form
  
  -sum(L)                                   # optim by default is minimization, and we want to maximize the likelihood 
  # (see also fnscale in optim.control)
}
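
For reference, the log likelihood being maximized can be written as

$$\ell(\beta) = \sum_{i=1}^{N}\Big[y_i \log \mu_i + (1 - y_i)\log(1 - \mu_i)\Big], \qquad \mu_i = \frac{1}{1 + e^{-x_i^\top \beta}}$$

which is the alternate form noted in the comments above.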

Another approach, using an exponential loss function.

logreg_exp <- function(par, X, y) {
  # Arguments
  # par: parameters to be estimated
  # X: predictor matrix with intercept column
  # y: target
  
  # setup
  beta = par                                   # coefficients
  
  # linear predictor
  LP = X %*% beta                              # linear predictor

  # calculate exponential loss (convert y from 0/1 to -1/1)
  sum(exp(-ifelse(y == 1, 1, -1) * .5 * LP))   # return the loss for optim to minimize
}
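
For reference, with $\tilde{y}_i = 2y_i - 1 \in \{-1, 1\}$, the loss being minimized is

$$L(\beta) = \sum_{i=1}^{N}\exp\!\big(-\tfrac{1}{2}\,\tilde{y}_i\, x_i^\top \beta\big)$$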

Estimation

Setup for use with optim.

X = cbind(1, X)   # add the intercept column

# initial values

init = rep(0, ncol(X))
names(init) = c('intercept', 'b1', 'b2')

fit_ml = optim(
  par = init,
  fn  = logreg_ml,
  X   = X,
  y   = y,
  control = list(reltol = 1e-8)
)

fit_exp = optim(
  par = init,
  fn  = logreg_exp,
  X   = X,
  y   = y, 
  control = list(reltol = 1e-15)
)

pars_ml  = fit_ml$par
pars_exp = fit_exp$par
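
If standard errors are of interest, optim can also return the Hessian of the negative log likelihood, whose inverse gives an approximate covariance matrix for the estimates. A minimal sketch using the maximum likelihood function above (the object names here are arbitrary):

fit_ml_hess = optim(
  par     = init,
  fn      = logreg_ml,
  X       = X,
  y       = y,
  method  = 'BFGS',
  hessian = TRUE
)

se_ml = sqrt(diag(solve(fit_ml_hess$hessian)))   # approximate standard errors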

Comparison

Compare to glm.

fit_glm = glm(y ~ ., dfXy, family = binomial)
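
One way to collect the estimates for comparison (a sketch; coef extracts the glm coefficients):

round(
  rbind(
    fit_ml  = pars_ml,
    fit_exp = pars_exp,
    fit_glm = coef(fit_glm)
  ),
  3
)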
          intercept     b1     b2
fit_ml       -0.504  0.248  0.503
fit_exp      -0.503  0.245  0.501
fit_glm      -0.504  0.248  0.503

Python

The above is available as a Python demo in the supplemental section.