- An Introduction to `data.table`
- An Introduction to `dplyr`
- A Brief Overview of Parallel Computing with R, and some Big Data considerations
- Getting a basic understanding of single- vs. multithreaded computing, parallelism, and benchmarking

`data.table`

- The `[i, j, by]` idiom
- `fread` and `fwrite`
- `dcast` and `melt`
- examples / case study in appendix
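
As a rough illustration of the three topics listed above, here is a minimal sketch using the `data.table` package. The toy table `dt` and its column names are invented for this example and are not taken from the lecture material.

```r
library(data.table)

# Hypothetical toy data, invented for this sketch
dt <- data.table(
  group = c("a", "a", "b", "b"),
  year  = c(2020L, 2021L, 2020L, 2021L),
  value = c(1, 2, 3, 4)
)

# [i, j, by]: subset rows in i, compute in j, group with by
res <- dt[year >= 2020L, .(total = sum(value)), by = group]

# fwrite / fread: round trip through a temporary CSV file
tmp <- tempfile(fileext = ".csv")
fwrite(dt, tmp)
dt2 <- fread(tmp)

# dcast: long to wide (one column per year); melt: wide back to long
wide <- dcast(dt, group ~ year, value.var = "value")
long <- melt(wide, id.vars = "group",
             variable.name = "year", value.name = "value")
```

Note that `fread` guesses column types from the file contents, so `dt2` need not have exactly the same column classes as `dt`.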

- The `dplyr` verbs `filter`, `select`, `mutate`
- pipe (native `|>` since R 4.1.0)
- examples / case study
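
The verbs and the pipe listed above might be combined as in the following small sketch. The data frame `df` is a made-up example, and the native pipe `|>` assumes R 4.1.0 or later.

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 2, 3, 4)
)

out <- df |>
  filter(value > 1) |>          # keep rows where value exceeds 1
  select(group, value) |>       # keep only these two columns
  mutate(double = 2 * value)    # add a derived column
```

The same chain also works with the `%>%` pipe that `dplyr` re-exports from `magrittr`, which is useful on R versions before 4.1.0.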

- Parallel computing with R, which is single-threaded
- Its math libraries may not be single-threaded
- The `parallel` package as a perfect start: `mclapply`, `parLapply`
- simple benchmarking
- big data / external memory / bigmemory
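
A minimal sketch of `mclapply`, `parLapply`, and simple benchmarking with `system.time` follows. The function `slow_square`, the input vector, and the choice of two workers are assumptions made for illustration only. Timings depend on the machine, so none are shown.

```r
library(parallel)

# Hypothetical slow task, invented for this sketch
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}
inputs <- 1:8

# Baseline: ordinary serial lapply
t_serial <- system.time(serial <- lapply(inputs, slow_square))

# mclapply forks the R process; forking is unavailable on Windows,
# where mc.cores must be 1
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
t_fork <- system.time(
  forked <- mclapply(inputs, slow_square, mc.cores = n_cores)
)

# parLapply uses a cluster of worker processes and works on all platforms
cl <- makeCluster(2L)
t_sock <- system.time(socket <- parLapply(cl, inputs, slow_square))
stopCluster(cl)

# Compare elapsed wall-clock times in seconds
print(c(serial = t_serial[["elapsed"]],
        fork   = t_fork[["elapsed"]],
        socket = t_sock[["elapsed"]]))
```

For tasks this small, the overhead of starting workers can outweigh the gain, which is one reason benchmarking before parallelizing is worthwhile.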

- Efficient R Programming (Gillespie/Lovelace book)
- Chapter 3: Efficient Programming
- Chapter 5: Efficient I/O
- Chapter 6: Efficient Data Carpentry
- Chapter 7: Efficient Optimization
- Efficient data wrangling

- Lecture 15 (2024): data.table and Lecture 15 R Code
- Lecture 16 (2024): dplyr and Lecture 16 R Code
- Lecture 17 (2024): Parallel R and Lecture 17 R Code
- Lecture 18 (2024): Efficient R and Lecture 18 R Code

- Video 23 (2024): Week 8 (Box)
- Video 24 (2021): data.table (Box, captioned), also Video 24 (2021): data.table (uncaptioned)
- Video 25 (2021): dplyr (Box, captioned), also Video 25 (2021): dplyr (uncaptioned)
- Video 26 (2024): Week 9 (Box)
- Video 27 (2021): Parallel R (Box, captioned), also Video 27 (2021): Parallel R (uncaptioned)
- Video 28 (2021): Efficient R (Box, captioned), also Video 28 (2021): Efficient R (uncaptioned)

- data.table cheatsheet
- data.table wiki
- Chapter 10: Relational Data with dplyr in Wickham and Grolemund, *R for Data Science*, 2017
- Chapter 12: Faster Group Manipulation with dplyr in Lander, *R for Everyone*, 2017
- Vignette of R package ‘parallel’ (also included in every R installation)
- Chapters 3, 5, 6 and 7 in *Efficient R Programming*, Gillespie and Lovelace, 2016

- *Getting Started with R - Tinyverse Edition*: introduction to R and data manipulation with a focus on data.table: site, eight-page pdf
- *Life in the Fast Lane: data.table Intro and Best Practices*: Nice one-hour data.table talk by Bill Gold at NY HackR Meetup: YouTube video and code and slides GitHub repo
- Shorter parallel computing with R tutorial by Matt Jones
- Longer comprehensive tutorial by Jonathan Dursi (with sources)
- Textbook Parallel Computing for Data Science book by Norm Matloff