Advanced R

Objectives

  • An Introduction to data.table
  • An Introduction to dplyr
  • A Brief Overview of Parallel Computing with R, and some Big Data considerations
  • A basic understanding of single-threaded vs. multithreaded computing, parallelism, and benchmarking

Topics

  • data.table
      • The [i, j, by] idiom
      • fread and fwrite
      • dcast and melt
      • examples / case study in appendix
  • dplyr
      • filter, select, mutate
      • pipe (native |> since R 4.1.0)
      • examples / case study
  • Parallel computing with R
      • R itself is single-threaded
      • Its math libraries may be multithreaded
      • The parallel package as a good starting point: mclapply, parLapply
      • simple benchmarking
      • big data / external memory / bigmemory
  • Efficient R Programming (Gillespie/Lovelace book)
      • Chapter 3: Efficient Programming
      • Chapter 5: Efficient I/O
      • Chapter 6: Efficient Data Carpentry
      • Chapter 7: Efficient Optimization
  • Efficient data wrangling

Core Material

Lecture Slides

Lecture Videos

Extras

Additional Resources