Appendix B — More Models
This is a list of models that includes those we’ve seen, some we’ve just mentioned, and some we’ve not explored at all.
B.1 Linear Models
Many of these were listed in Section 3.8, but we provide a few more to think about here.
Simplified Linear Models
- correlation
- t-test and ANOVA
- chi-square
Generalized Linear Models and related
- True GLMs, e.g., gamma regression
- Other distributions: beta regression, tweedie, t (so-called ‘robust’), truncated
- Censored outcomes: Survival models, tobit
- Nonlinear regression
- Modeling other parameters (e.g. heteroscedastic models)
- Modeling beyond the mean (quantile regression)
- Mixed models
- Generalized Additive Models
- Media mix models
Other Random Effects
- Gaussian process regression
- Spatial models (CAR, SAR, etc.)
- Time series models (ARIMA and related, e.g. state space)
- Factor analysis
Multivariate/multiclass/multipart
- Multivariate regression (multiple targets)
- Multinomial/Categorical/Ordinal regression (>2 classes)
- MANOVA/Linear Discriminant Analysis (these are identical, and can handle multiple outputs or >=2 classes)
- Zero-inflated (or inflated at some other value), hurdle, and related 'altered' models
- Mixture models and Cluster analysis (e.g. K-means, Hierarchical clustering)
- Two-stage least squares, instrumental variables
- SEM, simultaneous equations, and graphical models more generally
- PCA, Factor Analysis
All of these are explicitly linear models or can be framed as such. Compared to what you’ve already seen, they typically require only a tweak or two: a different distribution, a different link function, penalized coefficients, and so on. In other cases, we can move from one model to another. For example, we can reshape our multivariate regression to be amenable to a mixed model approach and get exactly the same results. We can potentially add a random effect to any model, and that random effect can be based on temporal, spatial, or other structure. The important thing to know is that the linear model is a very flexible tool that expands easily, and allows you to model most of the types of outcomes we are interested in. As such, it’s a very powerful approach to modeling.
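To illustrate how small these tweaks can be, here is a sketch (not from the text, purely illustrative) of fitting a Poisson GLM with a log link via iteratively reweighted least squares, using only numpy. Swapping the variance and link functions in this loop is essentially all it takes to move to another member of the GLM family.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one predictor
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))                 # Poisson outcome, log link

# Iteratively reweighted least squares for the Poisson/log-link case
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)            # inverse link: mean on the response scale
    W = mu                           # Poisson variance equals the mean
    z = X @ beta + (y - mu) / mu     # working response
    WX = X * W[:, None]
    beta = np.linalg.solve(X.T @ WX, X.T @ (W * z))
```

The recovered coefficients land close to the values used to simulate the data; a different distribution simply changes `W` and the (inverse) link changes `mu`.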
B.2 Other Machine Learning Models
As we’ve mentioned, machine learning is more of a modeling approach, and any model could be used for machine learning. That said, there are some models you’ll typically only see in a machine learning context, as they often do not provide specific parameters of interest like coefficients or variance components, are not easily interpretable, and lack straightforward ways to estimate uncertainty. Although the focus of these models was more on prediction, many of the models in this list are no longer performant compared to the tools used today. Still, they can be interesting historically or conceptually, may still be employed in some domains, are in some cases special cases of more widely used techniques, and can still serve as baseline models.
Standard Regression/Classification
- Linear Discriminant Analysis
- k-Nearest neighbors regression
- Naive Bayes
- Support Vector Machines, Boltzmann Machines
- Projection pursuit regression, MARS
- (Hidden) Markov Models
- Undirected graphs, Markov Random Fields, Network analysis
- Single Decision trees, CART, C4.5, etc.
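Several of these remain useful as quick baselines precisely because they are so simple. As a sketch, here is a minimal k-nearest neighbors classifier in plain numpy (the function and data are illustrative, not from the text):

```python
import numpy as np

def knn_predict(X_train, y_train, X_new, k=3):
    # pairwise Euclidean distances from each new point to all training points
    d = np.linalg.norm(X_train[None, :, :] - X_new[:, None, :], axis=2)
    # indices of the k closest training points for each new point
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    # majority vote among the k neighbors (labels assumed to be 0, 1, ...)
    return np.array([np.bincount(v).argmax() for v in votes])

# two well-separated classes as a toy check
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
preds = knn_predict(X_train, y_train, np.array([[0.2, 0.2], [5.5, 5.5]]))
```

A handful of lines like this can serve as a sanity-check baseline before reaching for anything heavier.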
Ensemble Models
- Stacking
- Bayesian Model Averaging
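Stacking, for instance, just fits a meta-model to the predictions of base models. A minimal numpy sketch follows; the base models here are arbitrary polynomial fits chosen for illustration, and in practice you would use held-out predictions for the meta-model to avoid overfitting.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-2, 2, size=n)
y = np.sin(x) + rng.normal(scale=0.1, size=n)

# Base model 1: simple linear fit
A1 = np.column_stack([np.ones(n), x])
b1 = np.linalg.lstsq(A1, y, rcond=None)[0]

# Base model 2: cubic polynomial fit
A2 = np.column_stack([x**p for p in range(4)])
b2 = np.linalg.lstsq(A2, y, rcond=None)[0]

# Meta-model: regress the outcome on the base models' predictions
P = np.column_stack([np.ones(n), A1 @ b1, A2 @ b2])
w = np.linalg.lstsq(P, y, rcond=None)[0]

mse_lin = np.mean((A1 @ b1 - y) ** 2)
mse_stack = np.mean((P @ w - y) ** 2)
```

The stacked predictions can do no worse in-sample than either base model alone, since the meta-model can always just reproduce one of them.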
Unsupervised Techniques
Latent Models
- PCA, probabilistic PCA, ICA, SVD
- Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA)
- (Non-negative) Matrix Factorization (NMF)
- Dirichlet process
Clustering
- t-SNE
- (H)DBSCAN
- Spectral clustering
- Self-organizing maps
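Many clustering methods share the same assign-then-update skeleton. As a point of reference, here is a minimal k-means sketch in numpy (k-means itself was mentioned earlier alongside mixture models; the implementation details here are illustrative):

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # initialize centers at k randomly chosen data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest center
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # move each center to the mean of its assigned points (keep old center if empty)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# toy data: two tight, well-separated clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, size=(40, 2)), rng.normal(5, 0.1, size=(40, 2))])
labels, centers = kmeans(X, 2)
```

Methods like DBSCAN replace the fixed-k assignment step with density-based neighborhood rules, but the flavor of iterating between assignment and summary is common across the family.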
B.3 Other Deep Learning Models
We haven’t delved as deeply into the world of deep learning, largely because there hasn’t yet been a ‘foundational’ model for tabular data of the sort we’ve focused on. However, most of the models that make headlines today are built upon simpler models, going all the way back to the basic multilayer perceptron, which we did introduce. Here are some of the models you might see in the wild:
- Convolutional Neural Networks
- Recurrent Neural Networks
- Long Short-Term Memory Networks
- Transformers/Attention Mechanisms
- Autoencoders
- Generative Adversarial Networks
- Graph Neural Networks
- Reinforcement Learning
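All of these architectures ultimately stack the same building block the multilayer perceptron introduced: a linear transformation followed by a nonlinearity. A minimal forward-pass sketch in numpy (the layer sizes and random weights are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # a common nonlinearity: zero out negative values
    return np.maximum(z, 0.0)

# weights for a 4 -> 8 -> 1 network (sizes chosen arbitrarily)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def mlp_forward(X):
    # each layer: linear transformation, then nonlinearity
    h = relu(X @ W1 + b1)
    return h @ W2 + b2

X = rng.normal(size=(5, 4))
preds = mlp_forward(X)  # one prediction per row of X
```

CNNs, RNNs, and transformers all elaborate on this pattern with weight sharing, recurrence, or attention, but the layered linear-plus-nonlinearity core is the same.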
Convolutional neural networks as currently implemented go back to LeNet in the late 1990s, and took off several years later with AlexNet and VGG. ResNet (residual networks) and DenseNet are relatively more recent examples of CNNs, though even they have been around for several years at this point. Even so, several of these still serve as baseline models for image classification and object detection, either in practice or as a reference point for current model performance.
NLP, and language processing more generally, can be seen as evolving from matrix factorization and LDA to neural network models such as word2vec and GloVe. In addition, the sequential nature of text suggested time-based models, including more traditional statistical models like hidden Markov models back in the day. In the neural network domain, standard recurrent networks, then LSTMs, GRUs, Seq2Seq, and more continued the theme. Now the field is dominated by attention-based transformers, of which BERT variants were popular early on. OpenAI’s GPT is among the most famous examples of modern large language models, but many others have been developed in the last few years, offered by Meta, Google, Anthropic, and others.
Michael has surveyed some of the developments in deep learning for tabular data (Clark 2022), and though he hasn’t seen anything as of this writing to change the general conclusion, he hopes to revisit the topic in earnest in the future, so stay tuned.