Michael J. Clark
General Statement
Throughout my career in data science, I have navigated a vast landscape of modeling, visualizing, and understanding data. I have conducted causal inference for marketing campaigns, classified biomedical images to detect pathology, analyzed text to uncover political sentiment, and explored baboon survival rates based on their social status. My experience spans dozens of industries and academic disciplines, helping clients and researchers unlock the full potential of their data with statistical analysis, machine learning, and AI.
I enjoy empowering people and helping them discover the secrets hidden within their data, and I am passionate about doing quality work that answers the questions at hand. What drew me to the world of data science and keeps my interest is that it frees me to engage in whatever domain I choose, and provides a great many tools with which to learn more about the things we humans are most interested in.
Experience
I’ve held senior-level positions in academia and industry, leading faculty and industry clients on various journeys of data discovery, helping them gain actionable insights that meet their needs, and allowing them to achieve their goals.
I’m currently a Senior Machine Learning Scientist @ OneSix (Strong Analytics pre-merger) and part of the Predictive Analytics Team. I work on client projects across multiple industries, such as real estate, entertainment, education, and health care, helping clients get the most from their data to maximize customer satisfaction and outreach, increase profitability, and explore new data-enabled territory.
As lead statistician @ University of Michigan, I took charge of analysis for dozens of faculty and graduate student projects, consulted on many more, and through my workshops I helped hundreds of faculty, students, and staff enhance their skills. This work spanned numerous disciplines across the social sciences, business, medical research, education, and more.
As the first statistical consultant hired at the Center for Social Science Research @ Notre Dame, I helped faculty and students in multiple disciplines engage in better research methods to increase research productivity. The groundwork my colleagues and I did provided a stable foundation on which the CSSR continues to help those in the social sciences today.
Skills
Analysis
I have extensive experience with a wide array of modeling techniques, including statistical analysis, machine learning, deep learning, and AI. My statistical expertise spans the range from traditional models to advanced methods in clustered, temporal, and spatial data contexts, including Bayesian approaches. I have applied various machine learning approaches and deep learning architectures, including in production, for companies large and small. I have also taught, written about, and conducted workshops on these topics, empowering others to become proficient in their own projects.
However, it is important to remember that analyzing data is a means to knowledge; the model is not the goal. I continue to hone my tool set to get good results quickly. But those results then need to be communicated to others, and this is often the most challenging aspect of the data endeavor. Without an interpretable result, the effort falls short. Ultimately we need to move beyond the data to take action in the world in which the data lives.
Programming
I’ve programmed in various languages and spend most of my time in Python and R. Presently I use Python for most of the client projects I work on. I’m well-versed in general programming and in common packages for data wrangling and optimizing machine learning models, and have used various deep learning tools for time series analysis, natural language processing, computer vision, and more. However, I’m a longtime R user, and can easily shift from simulations and matrix programming in base R, to tidyverse-style data processing, to using various packages to analyze large data sets of millions of observations with complex models. I can eke out extra utility from popular R packages, and master a new one in minimal time.
Simply put, you have to use the best tool to get the job done. Sometimes there are growing pains using something new or less familiar, but you have to be flexible to get the most from your data.
Spreading the word…
It’s important to me to share my knowledge, and I have many GitHub repositories, some with 100+ stars. My website/blog has entries that are viewed hundreds of times a day, appealing to many, from beginners to those well-versed in practical data science looking to enhance their skills further.
A Philosophical Slant
My academic background has provided me with critical thinking skills that help me understand a problem quickly and iterate through a multitude of solutions efficiently.
- Ph.D. in Experimental Psychology, concentration in statistics (UNT)
- B.S. Psychology & Philosophy (TCU)
- GitHub repositories with 1000+ stars, and 1000+ followers
- Personal documents and blog posts exploring analysis and programming techniques that garner 1000s of clicks per month
- Those posts and documents have been cited in Nature, British Medical Journal, PLOS ONE, and other high-profile journals
- 30+ peer-reviewed papers across various disciplines, including psychology, medicine, biology, chemistry, and more
- 1000+ citations of peer-reviewed papers
- Publication/citation index values that would get many faculty tenure
While I haven’t been in academia for a while now, I’m proud of what I was able to accomplish, and I continue to maintain an academic mindset when applicable to my work. I’m always looking for ways to improve my skills and to help others improve theirs.
Coda
I like answering difficult questions with thoughtful approaches to analyzing data, and also just having fun with an interesting challenge. Feel free to contact me if you have any questions about the things I’m up to!
Last updated: 2024 Dec