# Logistic Regression

The following demo regards a standard logistic regression model via maximum likelihood or exponential loss. This can serve as an entry point for those starting out to the wider world of computational statistics as maximum likelihood is the fundamental approach used in most applied statistics, but which is also a key aspect of the Bayesian approach. Exponential loss is not confined to the standard GLM setting, but is widely used in more predictive/‘algorithmic’ approaches e.g. in machine learning and elsewhere.

This follows the linear regression model approach.

## Data Setup

Predictors and target. This follows the same approach as the linear regression example, but now draws the target variable from the binomial distribution with size = 1.

library(tidyverse)

set.seed(1235)  # ensures replication

N = 10000 # sample size
k = 2     # number of desired predictors
X = matrix(rnorm(N * k), ncol = k)

# the linear predictor
lp = -.5 + .2*X[, 1] + .5*X[, 2] # increasing N will get estimated values closer to these

y = rbinom(N, size = 1, prob = plogis(lp))

dfXy = data.frame(X, y)

## Functions

A maximum likelihood approach.

logreg_ml <- function(par, X, y) {
# Arguments
# par: parameters to be estimated
# X: predictor matrix with intercept column
# y: target

# setup
beta = par                                # coefficients
N = nrow(X)

# linear predictor
LP = X %*% beta                           # linear predictor
mu = plogis(LP)                           # logit link

# calculate likelihood
L = dbinom(y, size = 1, prob = mu, log = TRUE)         # log likelihood
#   L =  y*log(mu) + (1 - y)*log(1-mu)    # alternate log likelihood form

-sum(L)                                   # optim by default is minimization, and we want to maximize the likelihood
# (see also fnscale in optim.control)
}

Another approach via exponential loss function.

logreg_exp <- function(par, X, y) {
# Arguments
# par: parameters to be estimated
# X: predictor matrix with intercept column
# y: target

# setup
beta = par                                   # coefficients

# linear predictor
LP = X %*% beta                              # linear predictor

# calculate exponential loss function (convert y to -1:1 from 0:1)
L = sum(exp(-ifelse(y, 1, -1) * .5 * LP))
}

## Estimation

Setup for use with optim.

X = cbind(1, X)

# initial values

init = rep(0, ncol(X))
names(init) = c('intercept', 'b1', 'b2')

fit_ml = optim(
par = init,
fn  = logreg_ml,
X   = X,
y   = y,
control = list(reltol = 1e-8)
)

fit_exp = optim(
par = init,
fn  = logreg_exp,
X   = X,
y   = y,
control = list(reltol = 1e-15)
)

pars_ml  = fit_ml$par pars_exp = fit_exp$par

## Comparison

Compare to glm.

fit_glm = glm(y ~ ., dfXy, family = binomial)
intercept b1 b2
fit_ml -0.504 0.248 0.503
fit_exp -0.503 0.245 0.501
fit_glm -0.504 0.248 0.503

## Python

The above is available as a Python demo in the supplemental section.