R for Public Health



Full course description


This course is designed to equip public health professionals with the principles and methods underpinning the use of R, with a strong emphasis on its application in multi-disciplinary, collaborative, and real-world public health settings. Participants will learn practical data management techniques, such as working with data frames, importing and exporting data, troubleshooting common issues, and preparing messy data for analysis or presentation. Through instructional videos and interactive activities, participants will gain confidence in applying R in a public health context, enhancing their data analysis capabilities and contributing to informed public health decision-making.

Certificates and Time Commitment

This course offers four certificates. Enrollees can earn multiple certificates, based on their training needs.

Data Management Certificate

Time commitment: 15 hours

No Previous Coding Experience Required

Participants who complete the Data Management Certificate will learn the basics of data management and manipulation using RStudio. They will learn how R is applied in public health, including how to assign values to objects, follow best practices for naming, and understand the five basic data types in R. The program also covers essential operations such as using functions to create data objects, handling vectors and lists, and managing data frames for efficient data storage. It introduces the concept of packages in R, demonstrating how to install and load them and how to navigate their documentation. The program emphasizes practical skills such as importing and exporting data, managing working directories, and creating reproducible examples for sharing code or seeking help. Additionally, participants will learn how to use various packages and functions from the tidyverse to prepare data for presentation or analysis. This will include tools for subsetting tables, creating new columns, aggregating and restructuring data, combining multiple tables, handling unexpected or missing data, and more.

Visualization Certificate

Time commitment: 5 hours

R Programming Experience Required

Participants completing the Visualization Certificate will learn to create tables using kable and gt and enhance them with kableExtra and gtExtras, customizing details by rows and columns. They will also learn to construct graphs using ggplot2, including basics like changing axis labels and modifying visual elements. The certificate introduces interactive elements, showing how to make HTML tables with DT/datatable and interactive graphs with plotly, including title and label adjustments.

Dissemination Certificate

Time commitment: 1.5 hours

R Programming Experience Required

Building upon data manipulation and data visualization skills, participants aiming for this certificate will advance to creating interactive HTML RMarkdown documents and developing interactive webpages and dashboards using the Shiny package.

Functions Certificate

Time commitment: 1.5 hours

R Programming Experience Required

Participants seeking this certificate will learn how to write functions, understanding the benefits, basic structure, and optimal usage times in R, enhancing their coding efficiency and error-handling capabilities.

Instructional Team

Lauren Nelson (MPH) is an Informatics Supervisor within the Informatics Branch in the Center for Infectious Disease, California Department of Public Health. She has worked at CDPH for the last 10 years in epidemiology and informatics roles within the Environmental Health Investigation Branch, STD Control Branch, Office of AIDS, and COVID and MPOX responses.

She uses R and RStudio to support projects related to the integration and centralization of HIV and STD data, creates tools for the dissemination of public health data to state and local health departments, and wrangles and combines data from a variety of surveillance, administrative, and laboratory sources.

Lauren earned her MPH in epidemiology from Emory University in Atlanta, Georgia, and a Bachelor of Science in Mathematics from Point Loma Nazarene University in San Diego, California.

Will Wheeler (PhD, MPH) has been the Director of Public Health Informatics and Data Operations at the County of Santa Clara Public Health Department since November 2023. Previously, he worked in a number of informatics roles at the California Department of Public Health, first in the Office of AIDS, Surveillance Branch, and later on the COVID and MPOX responses.

Over the past 18 years of serving in public health at various federal, state, and local organizations, he has focused on designing, building, and improving data systems and infrastructure to efficiently translate collected data into public health action. He has been an R and RStudio/Posit evangelist for the past 15 years, is experienced in SAS and SQL, and has dabbled in Python. Will has an MPH from Emory University and a PhD in epidemiology from Georgia State University, where he focused on using geospatial methods to analyze relationships between social/community-related factors. He earned a BA in Psychology from St. Olaf College in Northfield, Minnesota.

Madeline (Maddy) Adee is a Health Policy PhD student with a specialization in Population Health and Data Science. Her research focuses on improving health for people involved in the United States carceral system through decarceration efforts, attention to human rights concerns, and improved medical care. She was a Computational Social Science Training Program fellow last year at UC Berkeley, and uses R and Python extensively in her work. Prior to starting her PhD, she worked as a programmer analyst on cost-effectiveness and simulation modeling studies focused on hepatitis C elimination and overdose reduction. Maddy holds an MPH in Health Policy from the Rollins School of Public Health at Emory University and a BS in Anthropology from Portland State University.


Sign up for this course today!