12 Until Next Time…
PLACEHOLDER
12.1 More Models
Simplified Linear Models
- correlation
- t-test and ANOVA
- chi-square
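As a minimal sketch of why these count as simplified linear models, the snippet below fits a one-feature least-squares regression by hand. It checks that the slope on standardized data matches the Pearson correlation, and that the slope on a 0/1 group indicator matches the difference in group means that a t-test compares. The data and helper names (`ols_slope_intercept`, `standardize`, `pearson_r`) are made up for illustration.

```python
# Pure-Python sketch: correlation and the two-group comparison as simple regressions.
# All data and function names here are illustrative assumptions.

def mean(values):
    return sum(values) / len(values)

def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def standardize(values):
    """Center and scale to unit (population) standard deviation."""
    m = mean(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - m) / sd for v in values]

def pearson_r(x, y):
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Correlation: the slope of a regression on standardized variables equals r.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9]
slope_std, _ = ols_slope_intercept(standardize(x), standardize(y))
r = pearson_r(x, y)

# Two-group comparison: regress the outcome on a 0/1 group indicator.
group = [0, 0, 0, 1, 1, 1]
outcome = [4.0, 5.0, 6.0, 7.0, 9.0, 8.0]
slope_group, intercept_group = ols_slope_intercept(group, outcome)
mean_0 = mean([o for g, o in zip(group, outcome) if g == 0])
mean_1 = mean([o for g, o in zip(group, outcome) if g == 1])

print("standardized slope:", slope_std, "pearson r:", r)
print("group slope:", slope_group, "difference in means:", mean_1 - mean_0)
```

The intercept in the second fit is the mean of the reference group, so the regression carries the same information as the two group means.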
Generalized Linear Models and related
- True GLMs, e.g. logistic, Poisson
- Other distributions: beta regression, Tweedie, t (so-called robust), truncated
- Penalized regression: ridge, lasso, elastic net
- Censored outcomes: Survival models, tobit
- Modeling other parameters (e.g. heteroscedastic models that also predict scale in Gaussian linear regression)
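To illustrate how small the step from a standard linear regression to a GLM can be, here is a rough sketch in which the same linear predictor is passed through different inverse link functions. The family names, coefficient values, and the `predict_mean` helper are assumptions for illustration, and only the usual canonical links are shown.

```python
import math

def linear_predictor(intercept, slope, x):
    """The part shared by every model here: a weighted sum of the features."""
    return intercept + slope * x

# Inverse link functions map the linear predictor onto the scale of the mean.
def identity(eta):
    return eta  # Gaussian linear regression

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))  # logistic regression: a probability

def inv_log(eta):
    return math.exp(eta)  # Poisson regression: a positive expected count

INVERSE_LINKS = {"gaussian": identity, "binomial": inv_logit, "poisson": inv_log}

def predict_mean(family, intercept, slope, x):
    """Predicted mean for one observation under the chosen family (hypothetical helper)."""
    eta = linear_predictor(intercept, slope, x)
    return INVERSE_LINKS[family](eta)

for family in INVERSE_LINKS:
    print(family, predict_mean(family, intercept=-1.0, slope=0.5, x=2.0))
```

Only the last step changes between the three models. Estimating the coefficients also depends on the assumed distribution, which this sketch leaves out.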
Multivariate/multiclass/multipart
- Multivariate regression (multiple targets)
- Multinomial/Categorical/Ordinal regression (>2 classes)
- MANOVA/Linear Discriminant Analysis
- Zero-inflated (or inflated at some other value)/hurdle/altered
- Mixture models and Cluster analysis
Random Effects
- Mixed effects models (random intercepts/coefficients)
- Generalized additive models (GAMs/GAMMs)
- Gaussian process regression
- Spatial models (CAR)
- Time series models (ARIMA)
- Factor analysis
Latent Linear Models
- PCA, Factor Analysis
- Mixture models
- Structural Equation Modeling, Graphical models generally
All of these are explicitly linear models or can be framed as such, and most are either identical in description to what you’ve already seen or require only a tweak or two - e.g. a different distribution, a different link function, penalizing the coefficients, etc. In other cases, we can move from one to another. For example, we can reshape a multivariate outcome to be amenable to a mixed model approach and get the exact same results. We can potentially add a random effect to any model, and that random effect can be based on time, space, or other considerations. The important thing to know is that the linear model is a very flexible tool that expands easily and lets you model most of the types of outcomes we’re interested in. As such, it’s a very powerful approach to modeling.
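As one concrete case of "a tweak or two", the sketch below adds an L2 (ridge) penalty to a one-feature least-squares fit. For a single centered feature, minimizing the squared error plus `penalty * slope**2` gives the closed form `Sxy / (Sxx + penalty)`, so a penalty of zero recovers ordinary least squares. The data and the `penalized_slope` helper are illustrative assumptions, not a full ridge implementation.

```python
# Sketch: ridge regression as a small modification of ordinary least squares.
# Single feature, centered data, intercept left unpenalized. Data are made up.

def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def penalized_slope(x, y, penalty=0.0):
    """Slope minimizing sum((y - b*x)^2) + penalty * b^2 on centered x and y."""
    xc, yc = center(x), center(y)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    return sxy / (sxx + penalty)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

ols = penalized_slope(x, y)                  # penalty = 0 is plain least squares
ridge_light = penalized_slope(x, y, 1.0)     # mild shrinkage toward zero
ridge_heavy = penalized_slope(x, y, 10.0)    # stronger shrinkage toward zero

print("OLS:", ols, "ridge(1):", ridge_light, "ridge(10):", ridge_heavy)
```

The model's form is unchanged. Only the objective being minimized differs, which is why penalized regression sits so naturally alongside the other linear models.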
Multi-class, rank, ordered
12.2 Other ML
Historical models like a single decision tree, kNN regression, SVM, naive Bayes, etc. Most of these are not used much anymore, but they may still be interesting.
Miscellaneous combinations of models like ensembles/stacking, meta-learners, random forests and bagging, and more. (We demonstrated boosting.)
Recommender systems, graphical models, dimension reduction
Unsupervised/semi-supervised learning
12.3 Other DL
Could mention specific/historical models here, like ResNet, BERT, GANs, LSTM, etc. Talk about DL applied to tabular data, autoencoders, VAEs, etc. Misc: extreme learning machines, etc.
12.4 Toolbox
- GAM (including mixed models)
- Penalized regression
- Boosting
- A Basic MLP, or better, one that incorporates embeddings for categorical features
For Numeric & Binary/multiclass
Awareness: time series, dimension reduction (e.g. PCA, embeddings, time-based)
Periphery: ordinal, survival, ranks, spatial, etc. dive in as needed
Metrics: RMSE, likelihood/log loss, AUC, precision/recall, F1, AUPRC, Brier score
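For reference, most of the metrics listed can be written in a few lines each. The following is a plain-Python sketch under simple assumptions: binary labels coded 0/1, predicted probabilities for the positive class, and a 0.5 classification threshold. The labels and probabilities are made up, AUPRC is omitted, and in practice you would likely rely on a library implementation.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error for numeric targets."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def log_loss(y_true, prob, eps=1e-15):
    """Average negative Bernoulli log likelihood; probabilities clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, prob):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)

def brier_score(y_true, prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, prob)) / len(y_true)

def precision_recall_f1(y_true, prob, threshold=0.5):
    pred = [1 if p >= threshold else 0 for p in prob]
    tp = sum(1 for t, d in zip(y_true, pred) if t == 1 and d == 1)
    fp = sum(1 for t, d in zip(y_true, pred) if t == 0 and d == 1)
    fn = sum(1 for t, d in zip(y_true, pred) if t == 1 and d == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def auc(y_true, prob):
    """Share of positive/negative pairs ranked correctly, counting ties as half."""
    pos = [p for t, p in zip(y_true, prob) if t == 1]
    neg = [p for t, p in zip(y_true, prob) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Illustrative labels and predicted probabilities
labels = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]

precision, recall, f1 = precision_recall_f1(labels, probs)
print("log loss:", log_loss(labels, probs))
print("Brier:", brier_score(labels, probs))
print("precision, recall, F1:", precision, recall, f1)
print("AUC:", auc(labels, probs))
print("RMSE:", rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Log loss, the Brier score, and AUC use the predicted probabilities directly. Precision, recall, and F1 depend on the chosen threshold.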
12.5 How to Choose?
People love to say that ‘all models are wrong, but some are useful’1. We prefer to think of this a bit differently. There is no wrong model to use to answer your question, and there’s no guarantee that you would come to a different conclusion from using a simple correlation than you would from a complex neural network.
12.6 Key Steps
1. George Box, a famous statistician, said this in 1976.