
Git and GitHub for Public Health



Full course description


Version control, the practice of tracking and managing changes to statistical code, is essential for reducing errors in a statistical analysis. However, many analysts are not trained in version control. In this training, we will provide an introduction to git and GitHub, to equip analysts with version control tools that also meet ethical standards.


We will give an overview of key concepts in version control and walk through the installation and setup of git. Course enrollees will create a GitHub repository and implement a version control workflow for projects that they work on alone. We will cover concepts such as branching, pulling, committing, and pushing. We will provide sample data and code that attendees will update, committing the changes locally and pushing them to the main GitHub repository.


We will then expand the workflow to projects done within a team. The two course instructors will demonstrate three ways of working together in a team. Special focus will be given to challenges that arise when team members work on the same files simultaneously, such as merge conflicts, and how to resolve them.


We will cover general tips on using git, best practices for storing data inside versus outside a repository, and an overview of how GitHub workflows can comply with IRB requirements.


Intended audience

This course is targeted to individuals who write statistical code as part of their daily job. The course uses R in the examples, but individuals who write code in other languages such as SAS will also benefit from this training.


Time commitment

This 5-hour course is lecture-based. Course registrants are encouraged to follow along from their computer to get the most out of this training.



Upon completion of all videos, quizzes and activities, participants will be awarded a certificate of completion.  


About the course facilitators

Lauren Wilner is a second-year Epidemiology PhD student at the University of Washington. She is immersed in projects at the nexus of the built environment, climate change, and public health. Lauren previously earned an MPH in epidemiology and biostatistics from Tufts University. During that time, she served as a data analyst at the Friedman School of Nutrition, contributing to numerous nutritional epidemiology projects. Lauren’s extensive experience includes a fellowship at the CDC, where she implemented and taught Field Epidemiology Training Programs across Francophone West Africa. She then took a role as a Research Scientist at the Institute for Health Metrics and Evaluation (IHME), where she was the statistical modeler for the team investigating the global burden of congenital birth defects. As a teaching assistant, Lauren has instructed numerous courses at Tufts and the University of Washington in biostatistics and data science. Lauren has used GitHub for six years, beginning when she worked at IHME. She has trained fellow research assistants on git and GitHub, and has used both to organize her research projects during graduate school.


Corinne Riddell, PhD, MSc, is an Assistant Adjunct Professor at the University of California, Berkeley, School of Public Health in the Divisions of Biostatistics and Epidemiology. She leads a body of research in pediatric/perinatal epidemiology, social epidemiology, and epidemiologic methods. Corinne has used R since 2010, RStudio since 2012, and git and GitHub since 2017. She has taught introductory biostatistics and epidemiologic analysis at UC Berkeley since she began her position in Fall 2018.

