Appendix C — More Models

This appendix lists models we’ve covered, models we’ve only mentioned, and models we haven’t explored at all.

C.1 Linear Models

Many of these were listed in Section 3.8, but we provide a few more to think about here.

Simplified Linear Models

  • correlation
  • t-test and ANOVA
  • chi-square

Generalized Linear Models and related

  • True GLM, e.g. gamma
  • Other GLM-type distributions: beta regression, Tweedie, t (so-called ‘robust’), truncated
  • Censored outcomes: Survival models, tobit, Heckman
  • Nonlinear regression
  • Modeling other parameters (e.g. heteroscedastic models)
  • Modeling beyond the mean (quantile regression)
  • Mixed models
  • Generalized Additive Models
  • Media mix models

Other Random Effects

  • Gaussian process regression
  • Spatial models (CAR, SAR, etc.)
  • Time series models (ARIMA and related, e.g. state space, dynamic linear models)
  • Factor analysis

Multivariate/multiclass/multipart

  • Multivariate regression (multiple targets)
  • Multinomial/Categorical/Ordinal regression (>2 classes)
  • MANOVA/Linear Discriminant Analysis (these are identical, and can handle multiple outputs or >=2 classes)
  • Zero-inflated (or inflated at some other value), hurdle, and zero-altered models
  • Mixture models and Cluster analysis (e.g. K-means, Hierarchical clustering)
  • Two-stage least squares, instrumental variables, simultaneous equations
  • PCA, Factor Analysis
  • Item Response Theory
  • Structural Equation Modeling (SEM), Graphical models generally

All of these are explicitly linear models or can be framed as such. Compared to what you’ve already seen, they typically require only a tweak or two, e.g. a different distribution, a different link function, or a penalty on the coefficients. In other cases, we can move from one to another, or there is heavy overlap. For example, we can reshape the data for a multivariate regression or an IRT model to make it amenable to a mixed model approach, and get the exact same results. We can potentially add a random effect to any model, and that random effect can be based on time, space, or other considerations. The important thing to know is that the linear model is a very flexible tool that extends easily and lets you model most of the types of outcomes we are interested in. As such, it’s a very powerful approach to modeling.
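
To make the ‘different distribution, different link function’ idea concrete, here is a minimal sketch in plain Python. It is not code from the book. It assumes a single feature plus an intercept and canonical links only, and the names `FAMILIES` and `fit_glm` are hypothetical, introduced purely for illustration. The same Newton-Raphson loop fits a Gaussian, Poisson, or Bernoulli model; only the inverse link and the variance function change.

```python
import math

# Each family is an inverse link plus the variance function it implies.
# (Hypothetical lookup table for illustration; canonical links only.)
FAMILIES = {
    "gaussian": (lambda eta: eta, lambda mu: 1.0),
    "poisson": (lambda eta: math.exp(eta), lambda mu: mu),
    "bernoulli": (
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
        lambda mu: mu * (1.0 - mu),
    ),
}


def fit_glm(x, y, family, n_iter=25):
    """Fit an intercept and one slope by Newton-Raphson.

    With a canonical link the score is sum((y - mu) * x) for every
    family, so the loop below never changes; only `inv_link` and
    `variance` do.
    """
    inv_link, variance = FAMILIES[family]
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = inv_link(b0 + b1 * xi)  # predicted mean
            w = variance(mu)             # weight for the Hessian
            r = yi - mu                  # raw residual
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system H @ step = g by hand.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Same function, three different models.
x = [0, 1, 2, 3, 4, 5]
print(fit_glm(x, [1, 3, 5, 7, 9, 11], "gaussian"))
print(fit_glm(x, [1, 1, 3, 4, 8, 12], "poisson"))
print(fit_glm(x, [0, 0, 1, 0, 1, 1], "bernoulli"))
```

In practice you would reach for an established GLM implementation rather than a hand-rolled loop; the point here is only that switching the family is a one-line change.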

C.2 Other Machine Learning Models

As we’ve mentioned, machine learning is more of a modeling approach, and any model could be used for machine learning. That said, there are some models you’ll typically only see in a machine learning context, as they often do not provide specific parameters of interest like coefficients or variance components, are not easily interpretable, and lack straightforward ways to estimate uncertainty. While the focus of these models was more on prediction, many in this list no longer perform as well as the tools used today. Still, they can be interesting historically or conceptually, and may still be employed in some domains; some are special cases of more widely used techniques, and some can still serve as baseline models.

Standard Regression/Classification

  • Linear Discriminant Analysis (identical to MANOVA)
  • k-Nearest neighbors regression
  • Naive Bayes
  • Support Vector Machines, Boltzmann Machines
  • Projection pursuit regression, MARS
  • (Hidden) Markov Models
  • Undirected graphs, Markov Random Fields, Network analysis
  • Single decision trees, CART, C4.5, etc.
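
As an example of how simple a baseline from this list can be, here is a minimal sketch of k-nearest neighbors regression in plain Python. The function name `knn_predict` and the toy data are hypothetical, and the sketch assumes numeric features, Euclidean distance, and an unweighted average of the neighbors’ targets.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Pair each training target with its squared Euclidean distance
    # to the query, then sort so the closest points come first.
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), target)
        for row, target in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Toy data: one feature, target equal to the feature.
train_x = [[0.0], [1.0], [2.0], [10.0]]
train_y = [0.0, 1.0, 2.0, 10.0]
print(knn_predict(train_x, train_y, [1.1], k=3))
```

There are no coefficients to inspect here and no built-in uncertainty estimate, which is typical of the models in this section. Features on different scales would normally be standardized before computing distances.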

Ensemble Models

  • Stacking
  • Bayesian Model Averaging

Unsupervised Techniques

Latent Models

  • PCA, probabilistic PCA, ICA, SVD
  • Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA)
  • (Non-negative) Matrix Factorization (NMF)
  • Dirichlet process
  • Recommender systems and collaborative filtering

Clustering

  • t-SNE
  • (H)DBSCAN
  • k-means, k-medoids
  • Hierarchical clustering
  • Fuzzy clustering
  • Spectral clustering
  • Self-organizing maps

C.3 Other Deep Learning Models

We haven’t delved into the world of deep learning as much, as there hasn’t yet been a ‘foundational’ model for tabular data of the sort we’ve focused on. However, most of the models that make headlines today are built upon simpler models, even going back to the basic multilayer perceptron, which we did introduce (Section 10.7). Here are some of the models you might see in the wild:

  • Convolutional Neural Networks
  • Recurrent Neural Networks
  • Long Short-Term Memory Networks
  • Transformers/Attention Mechanisms
  • Autoencoders
  • Generative Adversarial Networks
  • Graph Neural Networks
  • Reinforcement Learning

Convolutional neural networks as currently implemented can be traced back to LeNet in the late 1990s, and took off in the 2010s with AlexNet and VGG. ResNet (residual networks), DenseNet, and YOLO are more recent examples of CNNs, though even they have been around for several years at this point. Even so, several of these still serve as baseline models for image classification and object detection, either in practice or as a reference point for current model performance. In general, you’ll have specific models for certain types of tasks, such as segmentation, object tracking, etc.

Natural language processing (NLP) can be seen as evolving from matrix factorization and LDA to neural network models such as word2vec and GloVe. In addition, the temporal nature of text suggested time-based models, including more traditional statistical models like hidden Markov models back in the day. In the neural network domain, we have standard recurrent networks, then LSTMs, GRUs, Seq2Seq, and more that continued the theme. Now the field is dominated by attention-based transformers, of which BERT variants were popular early on, and OpenAI’s GPT is among the most famous examples of modern large language models. Many others have been developed in the last few years by Meta, Google, Anthropic, and others. You can see some recent performance rankings, and note that no single model is best at every task.
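
For readers curious about what ‘attention’ computes, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration rather than anything covered in the book: the function names are hypothetical, and the sketch assumes a single head with no masking and no learned projection matrices. Each query is compared to every key, the scores are turned into weights with a softmax, and the output is the weighted average of the values.

```python
import math


def softmax(scores):
    """Convert raw scores to weights that sum to one."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    d = len(keys[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors, one dimension at a time.
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return output


# Two tokens with identical keys receive equal weight, so the
# output is the plain average of the two value vectors.
print(attention([[1.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]]))
```

Real transformer layers add learned projections for the queries, keys, and values, multiple heads, and masking, but the weighted-average core is the same.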

You’ll also find deep learning approaches to some of the models in the ML section, such as recommender systems, clustering, graphs and more. Michael has surveyed some of the developments in deep learning for tabular data (Clark (2022)), and though he hasn’t seen anything as of this writing to change the general conclusion there, he hopes to revisit the topic in earnest again in the future, so stay tuned.