Advanced R
Objectives
An Introduction to data.table
An Introduction to dplyr
A Brief Overview of Parallel Computing with R, and some Big Data considerations
Gaining a basic understanding of single-threaded vs. multithreaded computing, parallelism, and benchmarking
Topics
data.table
The [i, j, by] idiom
fread and fwrite
dcast and melt
examples / case study in appendix
dplyr
filter, select, mutate
pipe (native |> since R 4.1.0)
examples / case study
Parallel Computing with R, which is single-threaded
Its math libraries may not be
parallel package as a good starting point: mclapply, parLapply
simple benchmarking
big data / external memory / bigmemory
Efficient R Programming (Gillespie/Lovelace book)
Chapter 3: Efficient Programming
Chapter 5: Efficient I/O
Chapter 6: Efficient Data Carpentry
Chapter 7: Efficient Optimization
Efficient data wrangling
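As a small illustration of the parallel computing and simple benchmarking topics listed above, here is a minimal sketch using only the parallel package that ships with R. The function slow_square, the input size, and the choice of two workers are hypothetical and chosen only for illustration; they are not taken from the lecture material.

```r
library(parallel)

# Hypothetical workload: sleep briefly so the serial cost is visible.
slow_square <- function(x) {
  Sys.sleep(0.05)
  x^2
}

inputs <- 1:8

# Serial baseline with lapply, timed with system.time().
serial_time <- system.time(serial <- lapply(inputs, slow_square))

# Socket cluster with two workers (an assumption; works on all platforms).
cl <- makeCluster(2)
parallel_time <- system.time(par_res <- parLapply(cl, inputs, slow_square))
stopCluster(cl)

# Both approaches should produce the same results.
stopifnot(identical(serial, par_res))

# Compare elapsed wall-clock time in seconds.
print(c(serial = serial_time[["elapsed"]],
        parallel = parallel_time[["elapsed"]]))
```

On Unix-like systems, mclapply(inputs, slow_square, mc.cores = 2) is a fork-based alternative that needs no explicit cluster; on Windows it only runs with mc.cores = 1, which is why this sketch uses parLapply.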
Core Material
Lecture Slides
Lecture Videos
Video 23 (2025): Week 8 (Box), also Video 23 (2025, captioned) (ClassTranscribe)
Video 24 (2021): data.table (Box, captioned), also Video 24 (2021): data.table (uncaptioned)
Video 25 (2021): dplyr (Box, captioned), also Video 25 (2021): dplyr (uncaptioned)
Video 26 (2025): Week 9 (Box), also Video 26 (2025, captioned) (ClassTranscribe)
Video 27 (2021): Parallel R (Box, captioned), also Video 27 (2021): Parallel R (uncaptioned)
Video 28 (2021): Efficient R (Box, captioned), also Video 28 (2021): Efficient R (uncaptioned)
Additional Resources