NTRES 6100

Cornell University, Spring 2021

This website complements the class Canvas site.


Instructor: Assistant Professor Nina Overgaard Therkildsen
TA: PhD Student Nicolas Lou


Meeting times

Mondays and Wednesdays 2:45pm - 4:00pm
(February 8 - April 23, 2021)

Optional lab sessions: Fridays 12:25pm - 2:20pm
Hands-on practice sessions in groups and with TA support


Course description

As datasets grow larger and more complex across all areas of science, computational skills are in increasingly high demand. This course introduces a series of practical tools that help researchers spend less time wrestling with software or repeating error-prone manual data processing, and more time getting research done in efficient and transparent ways that facilitate collaboration and reproducibility. We will work in R/RStudio, primarily with the tidyverse packages and with Git and GitHub integration. The course emphasizes practical skill development and will be structured around hands-on-the-keyboard learning.

By the end of this course, students will be able to:

  • Describe strategies for ensuring that their data analysis is reproducible
  • Demonstrate best practices for coding and project-oriented workflows in RStudio
  • Import and clean messy data files using a variety of packages and functions in R
  • Subset, reorganize, and merge diverse datasets in R
  • Effectively explore and visualize patterns in complex datasets with ggplot2 in R
  • Write simple functions/programs and data analysis pipelines in R
  • Automate repeated analysis tasks in R
  • Track the history of file changes (version control) and collaborate effectively with others on scripts using Git and GitHub
  • Use R Markdown to combine text, equations, code, tables, and figures into reports, websites, and presentations