This document is the basis for multiple workshops whose common goal is to provide tools, tips, and packages that make data processing, programming, modeling, visualization, and presentation in R easier. It is oriented toward those who have had some exposure to R for applied data analysis, but it would also be useful to someone coming to R from another programming language. It is not an introduction to R. The primary goal is to instill awareness of tools that will make your data exploration, modeling, and visualization easier, and to explain some of the why behind those tools so that you can implement them better. It is meant to fill in some of the gaps that applied users of R typically have.
Part 1: Data Processing
Understanding Base R Approaches to Data Processing
- Overview of Data Structures
Getting Acquainted with Other Approaches to Data Processing
- Pipes, and how to use them
Part 2: Programming Basics
Using R more fully
- Dealing with objects
- Iterative programming
- Writing functions
- Regular expressions
Part 3: Modeling
Part 4: Visualization
- Visualizing Information
- and more…
- Package demos
Part 5: Presentation
Possible future addition. See this for now.
To follow along with the examples, clone or download the related section repos. Each one contains an R project and the associated data, so downloading any one of them should be enough to run the code from any section.
(in progress as document is being revamped and extended)
Color coding in text:
Some key packages used in the following demonstrations and exercises:
tidyverse (several packages), data.table, ggplot2movies
Python notebooks for the data processing and visualization sections may be found here.
Many other packages are also used, so feel free to install them as we come across them. Here are a few.
nycflights13, DT, highcharter, magrittr, maps, mgcv (a recommended package that already ships with standard R installations), plotly, quantmod, readr, visNetwork
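If you would rather install the packages listed above up front than one at a time, something like the following minimal sketch may help. It assumes every package named above is available from CRAN under that exact name, and `find_missing()` is a hypothetical helper written for this illustration, not part of any package.

```r
# Packages named in the lists above (assumed to be available on CRAN)
key_pkgs   <- c("tidyverse", "data.table", "ggplot2movies")
other_pkgs <- c("nycflights13", "DT", "highcharter", "magrittr", "maps",
                "mgcv", "plotly", "quantmod", "readr", "visNetwork")

# Hypothetical helper: return the packages in `pkgs` that are not yet installed
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

missing_pkgs <- find_missing(c(key_pkgs, other_pkgs))

# Install only what is missing, and only when running interactively
# (e.g. in the console or RStudio), so that a batch run never
# triggers downloads on its own
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Checking against `installed.packages()` first avoids reinstalling packages you already have, such as mgcv. Note that tidyverse pulls in a number of other packages, so the first install can take a while.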