12  Until Next Time…

PLACEHOLDER

12.1 More Models

Simplified Linear Models

  • correlation
  • t-test and ANOVA
  • chi-square
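To make the idea concrete, here is a minimal sketch (standard-library Python only) of how a two-group t-test can be written as a linear regression on a 0/1 group indicator. The data and the helper names `ols_simple` and `pooled_t` are made up for illustration, and the comparison assumes the classic equal-variance (pooled) t-test.

```python
import math
from statistics import mean


def ols_simple(x, y):
    """Least-squares intercept, slope, and slope t-statistic for one feature."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(resid_ss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


def pooled_t(a, b):
    """Classic two-sample t-statistic assuming equal variances."""
    na, nb = len(a), len(b)
    ss_a = sum((v - mean(a)) ** 2 for v in a)
    ss_b = sum((v - mean(b)) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean(b) - mean(a)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical data: two small groups of measurements.
group_a = [4.1, 5.0, 3.8, 4.6, 5.2]
group_b = [5.9, 6.3, 5.1, 6.8, 6.0]

# Stack the groups and code membership as 0 (group a) or 1 (group b).
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

intercept, slope, t_slope = ols_simple(x, y)
t_classic = pooled_t(group_a, group_b)

print(f"intercept (mean of group a): {intercept:.3f}")
print(f"slope (difference in means): {slope:.3f}")
print(f"t from regression: {t_slope:.3f}, t from t-test: {t_classic:.3f}")
```

The intercept is the mean of the reference group, the slope is the difference in group means, and the two t-statistics agree. ANOVA extends the same idea to more than two groups by using more indicator columns.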

Generalized Linear Models and Related

  • True GLMs, e.g. logistic, Poisson
  • Other distributions: beta regression, Tweedie, t (so-called robust), truncated
  • Penalized regression: ridge, lasso, elastic net
  • Censored outcomes: Survival models, tobit
  • Modeling other parameters (e.g. heteroscedastic models that also predict scale in Gaussian linear regression)

Multivariate/multiclass/multipart

  • Multivariate regression (multiple targets)
  • Multinomial/Categorical/Ordinal regression (>2 classes)
  • MANOVA/Linear Discriminant Analysis
  • Zero-inflated (or inflated at some other value), hurdle, and zero-altered models
  • Mixture models and Cluster analysis

Random Effects

  • Mixed effects models (random intercepts/coefficients)
  • Generalized additive models (GAMs, GAMMs)
  • Gaussian process regression
  • Spatial models (CAR)
  • Time series models (ARIMA)
  • Factor analysis

Latent Linear Models

  • PCA, Factor Analysis
  • Mixture models
  • Structural Equation Modeling, Graphical models generally

All of these are explicitly linear models or can be framed as such, and most are either identical in description to what you’ve already seen or require only a tweak or two, e.g. a different distribution, a different link function, or a penalty on the coefficients. In other cases, we can move from one to another. For example, we can reshape a multivariate outcome to be amenable to a mixed model approach and get the exact same results. We can potentially add a random effect to any model, and that random effect can be based on time, space, or other considerations. The important thing to know is that the linear model is a very flexible tool that expands easily and allows you to model most of the types of outcomes we’re interested in. As such, it’s a very powerful approach to modeling.
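As one illustration of how small such a tweak can be, the sketch below adds a ridge penalty to a one-feature linear regression. It assumes the intercept is left unpenalized, in which case the penalized slope has the closed form Sxy / (Sxx + penalty). The data and the function name `ridge_slope` are hypothetical.

```python
from statistics import mean


def ridge_slope(x, y, penalty):
    """Slope of a one-feature regression with a ridge penalty on the slope only."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # penalty = 0 recovers ordinary least squares; larger values shrink toward 0.
    return sxy / (sxx + penalty)


# Hypothetical data with a roughly linear trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

slopes = {lam: ridge_slope(x, y, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, b in slopes.items():
    print(f"penalty={lam:>5}: slope={b:.3f}")
```

The only change from ordinary least squares is the extra term in the denominator, which pulls the coefficient toward zero as the penalty grows. With several features the same idea applies, though the features are typically standardized first so the penalty treats them comparably.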

Multi-class, rank, ordered

12.2 Other ML

Historical models like a single decision tree, k-nearest-neighbors regression, SVM, naive Bayes, etc. Most of these are not used much anymore but may still be interesting.

Miscellaneous combinations of models like ensembles/stacking, meta-learners, random forests and bagging, and more (we demonstrated boosting).

Recommender systems, graphical models, dimension reduction

Unsupervised/semi-supervised learning

12.3 Other DL

Could mention specific/historical models here, like ResNet, BERT, GANs, LSTM, etc. Talk about DL applied to tabular data, autoencoders, VAEs, etc. Misc.: extreme learning machines, etc.

12.4 Toolbox

  • GAM (including mixed models)
  • Penalized regression
  • Boosting
  • A Basic MLP, or better, one that incorporates embeddings for categorical features

For Numeric & Binary/multiclass

Awareness: time series, dimension reduction (e.g. PCA, embeddings, time-based)

Periphery: ordinal, survival, ranks, spatial, etc. dive in as needed

Metrics: RMSE, likelihood/log loss, AUC, precision/recall, F1, AUPRC, Brier score
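For reference, several of these metrics are only a few lines each. The sketch below uses standard-library Python and made-up labels and predicted probabilities; the function names are illustrative, and the 0.5 classification threshold is an assumption rather than a recommendation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error for numeric targets."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def log_loss(y_true, p_hat, eps=1e-15):
    """Average negative Bernoulli log likelihood for binary targets."""
    total = 0.0
    for t, p in zip(y_true, p_hat):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)


def brier(y_true, p_hat):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, p_hat)) / len(y_true)


def precision_recall_f1(y_true, p_hat, threshold=0.5):
    """Precision, recall, and F1 after thresholding predicted probabilities."""
    pred = [1 if p >= threshold else 0 for p in p_hat]
    tp = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 1)
    fp = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical binary labels and predicted probabilities.
labels = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]

print("RMSE:", rmse([3.0, 5.0], [2.0, 7.0]))
print("log loss:", log_loss(labels, probs))
print("Brier:", brier(labels, probs))
print("precision, recall, F1:", precision_recall_f1(labels, probs))
```

RMSE, log loss, and the Brier score work directly on predicted values or probabilities, while precision, recall, and F1 depend on the chosen threshold. AUC and AUPRC summarize performance across all thresholds and are usually taken from a library rather than written by hand.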

12.5 How to Choose?

People love to say that ‘all models are wrong, but some are useful’¹. We prefer to think of this a bit differently. There is no wrong model to use to answer your question, and there’s no guarantee that you would come to a different conclusion from using a simple correlation than you would from a complex neural network.

12.6 Key Steps


  1. George Box, a famous statistician, said this in 1976.