Michael J. Clark


General Statement

From t-tests to deep learning, I’ve covered a lot of ground in modeling, visualizing, and understanding data. I can provide inference for models on millions of observations, classify biomedical images to determine pathology, and analyze text to explore political sentiment. What’s more, I can help others understand the results and take appropriate action regarding them.

I enjoy empowering people and helping them discover the secrets hidden within their data, and I am passionate about doing quality work that answers the questions at hand. What drew me to the world of data science and keeps my interest is that it frees me to engage in whatever domain I choose, and provides a great many tools with which to learn more about the things we humans are most interested in.

Experience

I’ve held senior-level positions in academia and industry, leading faculty and industry clients on various journeys of data discovery, helping them gain actionable insights that meet their needs, and allowing them to achieve their goals.

I’m currently Senior Machine Learning Engineer @ Strong Analytics and part of the Predictive Analytics Team. I engage with several client projects across multiple industries such as real estate, entertainment, education, and health care, helping them get the most from their data to maximize customer satisfaction and outreach, increase profitability, and explore new data-enabled territory.

As lead statistician @ University of Michigan, I took charge of analysis for dozens of faculty and graduate student projects, consulted on many more, and through my workshops I helped hundreds of faculty, students, and staff enhance their skills. This work spanned numerous disciplines across the social sciences, business, medical research, education, and more.

As the first statistical consultant hired at the Center for Social Science Research @ Notre Dame, I helped faculty and students in multiple disciplines engage in better research methods to increase research productivity. The groundwork my colleagues and I did provided a stable foundation on which the CSSR continues to help those in the social sciences today.

Skills

Analysis

Analytically speaking, I have fairly wide exposure to modeling techniques, from traditional models to their extensions in clustered, temporal, spatial, and other data contexts (so-called tabular data). I have employed latent variable models, various machine learning and deep learning techniques, and Bayesian approaches, worked with unstructured data (e.g. natural language processing), and more. Moreover, I have taught, written about, or conducted workshops on these topics, helping others gain the expertise to become self-sufficient in their own efforts.

However, it is important to remember that analyzing data is a means to knowledge; the model is not the goal. I continue to hone my tool set to get good results quickly, but those results then need to be communicated to others, and this is often the most challenging aspect of the data endeavor. Without an interpretable result, the effort falls short. Ultimately we need to move beyond the data to take action in the world in which the data lives.

Programming



I’ve programmed in various languages and spend most of my time in R and Python. I’m a longtime R user, and can easily shift from simulations and matrix programming in base R, to tidyverse-style data processing, to using various packages to analyze large data sets of millions of observations with complex models. I can eke out extra utility from popular R packages, and master a new one in minimal time. With Python, I’m well-versed in common packages for data wrangling and optimizing machine learning models, and have used various deep learning tools for analyzing text, images, and more.

Simply put, you have to use the best tool to get the job done. Sometimes there are growing pains using something new or less familiar, but you have to be flexible to get the most from your data.

Spreading the word…

It’s important to me to share my knowledge, and I have many GitHub repositories, some with 100+ stars. My website/blog has entries that readers engage with hundreds of times a day, and it appeals to a wide audience, from beginners to those well-versed in practical data science who are looking to enhance their skills further.

A Philosophical Slant

My academic background has provided me with critical thinking skills that help me understand a problem quickly and iterate through a multitude of solutions efficiently.

  • Ph.D. in Experimental Psychology, concentration in statistics (UNT), B.S. Psychology & Philosophy (TCU)
  • Personal documents and posts exploring analysis and programming techniques that garner 1000s of clicks per month
  • Blog postings and documents have been cited in Nature, British Medical Journal, PLOS ONE, and other high-profile journals
  • 30+ peer-reviewed papers across various disciplines, including psychology, medicine, biology, chemistry, and more
  • 1000+ citations of peer-reviewed papers
  • *-index values that would earn many academics tenure

Coda

I like answering difficult questions with thoughtful approaches to analyzing data, and also just having fun with an interesting challenge. Feel free to contact me if you have any questions about the things I’m up to!



Last updated: 2023 Jan