Marginal Structural Model
This is a demonstration of a simple marginal structural model for estimation of so-called ‘causal’ effects using inverse probability weighting.
Example data is from, and comparison made to, the ipw package. See more here.
Data Setup
This example is from the helpfile at ?ipwpoint.
library(tidyverse)
library(ipw)
set.seed(16)
n = 1000
simdat = data.frame(l = rnorm(n, 10, 5))
a_lin = simdat$l - 10
pa = plogis(a_lin)
simdat = simdat %>%
mutate(
a = rbinom(n, 1, prob = pa),
y = 10 * a + 0.5 * l + rnorm(n, -10, 5)
)
ipw_result = ipwpoint(
exposure = a,
family = "binomial",
link = "logit",
numerator = ~ 1,
denominator = ~ l,
data = simdat
)
summary(ipw_result$ipw.weights) Min. 1st Qu. Median Mean 3rd Qu. Max.
0.4810 0.5127 0.5285 0.9095 0.6318 74.6994
ipwplot(ipw_result$ipw.weights)We create the weights as follows using the probabilities from a logistic regression.
ps_num = fitted(glm(a ~ 1, data = simdat, family = 'binomial'))
ps_num[simdat$a == 0] = 1 - ps_num[simdat$a == 0]
ps_den = fitted(glm(a ~ l, data = simdat, family = 'binomial'))
ps_den[simdat$a == 0] = 1 - ps_den[simdat$a == 0]
wts = ps_num / ps_denCompare the weights.
rbind(summary(wts), summary(ipw_result$ipw.weights)) Min. 1st Qu. Median Mean 3rd Qu. Max.
[1,] 0.481 0.5127181 0.5284768 0.9094652 0.631794 74.6994
[2,] 0.481 0.5127181 0.5284768 0.9094652 0.631794 74.6994
Add inverse probability weights to the data if desired.
simdat = simdat %>%
mutate(sw = ipw_result$ipw.weights)Function
Create the likelihood function for using the weights.
msm_ll <- function(
par, # parameters to be estimated; first is taken to be sigma
X, # model matrix
y, # target variable
wts # estimated weights
) {
beta = par[-1]
lp = X %*% beta
sigma = exp(par[1]) # exponentiated value to stay positive
ll = dnorm(y, mean = lp, sd = sigma, log = TRUE)
-sum(ll * wts) # weighted likelihood
# same as
# ll = dnorm(y, mean = lp, sd = sigma)^wts
# -sum(log(ll))
}Estimation
We want to estimate the marginal structural model for the causal effect of a on y corrected for confounding by l, using inverse probability weighting with robust standard error from the survey package. Create the matrices for estimation, estimate the model, and extract results.
X = cbind(1, simdat$a)
y = simdat$y
fit = optim(
par = c(sigma = 0, intercept = 0, b = 0),
fn = msm_ll,
X = X,
y = y,
wts = wts,
hessian = TRUE,
method = 'BFGS',
control = list(abstol = 1e-12)
)
dispersion = exp(fit$par[1])^2
beta = fit$par[-1]Now we compute the standard errors. The following uses the survey package raw version to get the appropriate standard errors, which the ipw approach uses.
glm_basic = glm(y ~ a, data = simdat, weights = wts) # to get unscaled cov
res = resid(glm_basic, type = 'working') # residuals
glm_vcov_unsc = summary(glm_basic)$cov.unscaled # weighted vcov unscaled by dispersion solve(crossprod(qr(X)))
estfun = X * res * wts
x = estfun %*% glm_vcov_unsc Comparison
library("survey")
fit_msm = svyglm(
y ~ a,
design = svydesign(~ 1, weights = ~ sw, data = simdat)
)
summary(fit_msm)
Call:
svyglm(formula = y ~ a, design = svydesign(~1, weights = ~sw,
data = simdat))
Survey design:
svydesign(~1, weights = ~sw, data = simdat)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.375 1.142 -3.832 0.000135 ***
a 10.647 1.190 8.948 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 29.58889)
Number of Fisher Scoring iterations: 2
Now get the standard errors.
se = sqrt(diag(crossprod(x) * n/(n-1))) # a robust standard error
se_robust = sqrt(diag(sandwich::sandwich(glm_basic))) # an easier way to get it
se_msm = sqrt(diag(vcov(fit_msm))) # extract from msm modelCompare standard errors.
tibble(se, se_robust, se_msm)# A tibble: 2 x 3
se se_robust se_msm
<dbl> <dbl> <dbl>
1 1.14 1.14 1.14
2 1.19 1.19 1.19
Inspect the general fit and compare with the other.
| Estimate | init_se | se_robust | t | p | dispersion |
|---|---|---|---|---|---|
| -4.378 | 0.247 | 1.141 | -3.834 | 0 | 29.563 |
| 10.649 | 0.361 | 1.189 | 8.950 | 0 | 29.563 |
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | -4.375 | 1.142 | -3.832 | 0 |
| a | 10.647 | 1.190 | 8.948 | 0 |
Source
Original code available at https://github.com/m-clark/Miscellaneous-R-Code/blob/master/ModelFitting/ipw.R