# Marginal Structural Model

This is a demonstration of a simple marginal structural model for estimation of so-called ‘causal’ effects using inverse probability weighting.

Example data is from, and comparison made to, the ipw package. See more here.

## Data Setup

This example is from the helpfile at ?ipwpoint.

library(tidyverse)
library(ipw)

set.seed(16)

n = 1000
simdat = data.frame(l = rnorm(n, 10, 5))
a_lin  = simdat$l - 10 pa = plogis(a_lin) simdat = simdat %>% mutate( a = rbinom(n, 1, prob = pa), y = 10 * a + 0.5 * l + rnorm(n, -10, 5) ) ipw_result = ipwpoint( exposure = a, family = "binomial", link = "logit", numerator = ~ 1, denominator = ~ l, data = simdat ) summary(ipw_result$ipw.weights)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.4810  0.5127  0.5285  0.9095  0.6318 74.6994 
ipwplot(ipw_result$ipw.weights) We create the weights as follows using the probabilities from a logistic regression. ps_num = fitted(glm(a ~ 1, data = simdat, family = 'binomial')) ps_num[simdat$a == 0] = 1 - ps_num[simdat$a == 0] ps_den = fitted(glm(a ~ l, data = simdat, family = 'binomial')) ps_den[simdat$a == 0] = 1 - ps_den[simdat$a == 0] wts = ps_num / ps_den Compare the weights. rbind(summary(wts), summary(ipw_result$ipw.weights))
      Min.   1st Qu.    Median      Mean  3rd Qu.    Max.
[1,] 0.481 0.5127181 0.5284768 0.9094652 0.631794 74.6994
[2,] 0.481 0.5127181 0.5284768 0.9094652 0.631794 74.6994

Add inverse probability weights to the data if desired.

simdat = simdat %>%
mutate(sw = ipw_result$ipw.weights) ## Function Create the likelihood function for using the weights. msm_ll <- function( par, # parameters to be estimated; first is taken to be sigma X, # model matrix y, # target variable wts # estimated weights ) { beta = par[-1] lp = X %*% beta sigma = exp(par[1]) # exponentiated value to stay positive ll = dnorm(y, mean = lp, sd = sigma, log = TRUE) -sum(ll * wts) # weighted likelihood # same as # ll = dnorm(y, mean = lp, sd = sigma)^wts # -sum(log(ll)) } ## Estimation We want to estimate the marginal structural model for the causal effect of a on y corrected for confounding by l, using inverse probability weighting with robust standard error from the survey package. Create the matrices for estimation, estimate the model, and extract results. X = cbind(1, simdat$a)
y = simdat$y fit = optim( par = c(sigma = 0, intercept = 0, b = 0), fn = msm_ll, X = X, y = y, wts = wts, hessian = TRUE, method = 'BFGS', control = list(abstol = 1e-12) ) dispersion = exp(fit$par[1])^2
beta = fit$par[-1] Now we compute the standard errors. The following uses the survey package raw version to get the appropriate standard errors, which the ipw approach uses. glm_basic = glm(y ~ a, data = simdat, weights = wts) # to get unscaled cov res = resid(glm_basic, type = 'working') # residuals glm_vcov_unsc = summary(glm_basic)$cov.unscaled          # weighted vcov unscaled by dispersion solve(crossprod(qr(X)))
estfun = X * res * wts
x = estfun %*% glm_vcov_unsc 

## Comparison

library("survey")

fit_msm = svyglm(
y ~ a,
design = svydesign(~ 1, weights = ~ sw, data = simdat)
)

summary(fit_msm)

Call:
svyglm(formula = y ~ a, design = svydesign(~1, weights = ~sw,
data = simdat))

Survey design:
svydesign(~1, weights = ~sw, data = simdat)

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   -4.375      1.142  -3.832 0.000135 ***
a             10.647      1.190   8.948  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 29.58889)

Number of Fisher Scoring iterations: 2

Now get the standard errors.

se = sqrt(diag(crossprod(x) * n/(n-1)))                  # a robust standard error
se_robust = sqrt(diag(sandwich::sandwich(glm_basic)))    # an easier way to get it
se_msm    = sqrt(diag(vcov(fit_msm)))                        # extract from msm model

Compare standard errors.

tibble(se, se_robust, se_msm)
# A tibble: 2 x 3
se se_robust se_msm
<dbl>     <dbl>  <dbl>
1  1.14      1.14   1.14
2  1.19      1.19   1.19

Inspect the general fit and compare with the other.

(#tab:msm-comparison- show)msm_ll
Estimate init_se se_robust t p dispersion
-4.378 0.247 1.141 -3.834 0 29.563
10.649 0.361 1.189 8.950 0 29.563
(#tab:msm-comparison- show)svyglm
term estimate std.error statistic p.value
(Intercept) -4.375 1.142 -3.832 0
a 10.647 1.190 8.948 0

## Source

Original code available at https://github.com/m-clark/Miscellaneous-R-Code/blob/master/ModelFitting/ipw.R