This document is the basis for multiple workshops whose common goal is to provide tools, tips, and packages that make data processing, programming, modeling, visualization, and presentation in R easier. It is oriented toward those who have had some exposure to R for applied data analysis, but it should also be useful to someone coming to R from another programming language. It is not an introduction to R. The primary goal is to instill awareness of tools that will make your data exploration, modeling, and visualization easier, and to explain some of the reasoning behind those tools so that you can apply them more effectively. It is meant to fill in some of the gaps that applied users of R commonly have.
Part 1: Information Processing
Understanding Basic R Approaches to Gathering and Processing Data
- Overview of Data Structures
Getting Acquainted with Other Approaches to Data Processing
- Pipes, and how to use them
Part 2: Programming Basics
Using R more fully
- Dealing with objects
- Iterative programming
- Writing functions
- Code style
- Regular expressions
Part 3: Modeling
- Key concepts
- Understanding and fitting models
- Overview of extensions
- Model Assessment
- Model Comparison
- Demonstration of techniques
Part 4: Visualization
- Visualizing Information
- and more…
- Package demos
Part 5: Presentation
Possible future addition. See this for now.
To follow along with the examples, clone or download the related section repos. Any one of them includes an R project and the associated data, so the code from any section should run.
(in progress as document is being revamped and extended)
Color coding in text:
Some key packages used in the following demonstrations and exercises:
tidyverse (several packages), data.table, tidymodels
The related Python notebooks may be found here.
Many other packages are also used for data or minor demonstrations, so feel free to install them as we come across them. Here are a few.
ggplot2movies, nycflights13, DT, highcharter, magrittr, maps, mgcv (a recommended package that ships with standard R installations), plotly, quantmod, readr, visNetwork, emmeans, ggeffects
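One way to check which of the key packages still need installing is sketched below. This is only an illustration and not part of the workshop materials: the helper name missing_packages is hypothetical, and the package vector simply repeats the key packages listed above. Add any of the other packages to the vector as needed.

```r
# Key packages named in the text above
key_pkgs <- c("tidyverse", "data.table", "tidymodels")

# Hypothetical helper: return the packages in `pkgs` that are not
# found in the current library paths
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(key_pkgs)
print(to_install)

# Install only what is missing. The interactive() guard is a cautious
# choice made for this sketch so that scripted runs do not trigger downloads.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages that are already present, which matters for tidyverse and tidymodels because each pulls in many dependencies.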