12 Until Next Time…

PLACEHOLDER

12.1 More Models

Simplified Linear Models

correlation
t-test and ANOVA
chi-square

Generalize Linear Models and related

True GLM e.g. logistic, poisson
Other distributions: beta regression, tweedie, t (so-called robust), truncated
Penalized regression: ridge, lasso, elastic net
Censored outcomes: Survival models, tobit
Modeling other parameters (e.g. heteroscedastic models that also predict scale in Gaussian linear regression)

Multivariate/multiclass/multipart

Multivariate regression (multiple targets)
Multinomial/Categorical/Ordinal regression (>2 classes)
MANOVA/Linear Discriminant Analysis
Zero (or some number) -inflated/hurdle/altered
Mixture models and Cluster analysis

Random Effects

Mixed effects models (random intercepts/coefficients)
Generalized additive models (GAMMs)
Gaussian process regression
Spatial models (CAR)
Time series models (ARIMA)
Factor analysis

Latent Linear Models

PCA, Factor Analysis
Mixture models
Structural Equation Modeling, Graphical models generally

All of these are explicitly linear models or can be framed as such, and most are either identical in description to what you’ve already seen or require only a tweak or two - e.g. a different distribution, a different link function, penalizing the coefficients, etc. In other cases, we can bounce from one to the another. For example we can reshape our multivariate outcome to be amenable to a mixed model approach, and get the exact same results. We can potentially add a random effect to any model, and that random effect can be based on time, spatial or other considerations. The important thing to know is that the linear model is a very flexible tool that expands easily, and allows you to model most of the types of outcomes were interested in. As such, it’s a very powerful approach to modeling.

Multi-class, rank, ordered

12.2 Other ML

Historical models like single decision tree, knn-regression, svm, naive bayes, etc. Most of these are note used so much anymore but may be interesting.

Miscellaneous combinations of models like ensembles/stacking, meta-learners, random forests and bagging, and more. (we demo’d boosting)

Recommender systems, graphical models, dimension reduction

Unsupervised/semi-supervised learning

12.3 Other DL

could mention specific/historical models here, like resnet, bert, GANs, LSTM, etc. talk about DL applied to tabular autoencoders, VAEs, etc. misc - extreme learning machines, etc.

12.4 Toolbox

GAM (including mixed models)
Penalized regression
Boosting
A Basic MLP, or better, one that incorporates embeddings for categorical features

For Numeric & Binary/multiclass

Awareness: time series, dimension reduction (e.g. PCA, embeddings, time-based)

Periphery: ordinal, survival, ranks, spatial, etc. dive in as needed

Metrics: RMSE, likelihood/log loss, AUC, Prec/Recall, F1, AUPRC, brier

12.5 How to Choose?

People love to say that ‘all models are wrong, but some are useful’¹. We prefer to think of this a bit differently. There is no wrong model to use to answer your question, and there’s no guarantee that you would come to a different conclusion from using a simple correlation than you would from a complex neural network.

12.6 Key Steps

George Box, a famous statistician, said this in 1976.↩︎