R for Data Science

Social Determinants of Health

Course Directors


La’Marcus T Wingate, PharmD, PhD
La’Marcus Wingate is an Associate Professor in Social and Administrative Pharmacy Sciences at the Howard University College of Pharmacy where he also serves as Director of Assessment. He obtained his PharmD and PhD degrees from the University of Tennessee Health Science Center and afterward completed a postdoctoral fellowship at the Centers for Disease Control and Prevention where he was trained in the conduct of economic evaluations for public health programs.  His current research focuses on the scholarship of teaching and learning and the impact of social determinants of health on cardiometabolic outcomes.


Salome Bwayo Weaver, PharmD, BCGP, FASCP
Salome Weaver is a full professor with tenure at the Howard University College of Pharmacy, where she focuses on clinical practice with a specialty in geriatrics, hematology, and oncology. She completed her Doctor of Pharmacy degree at the Howard University College of Pharmacy and afterward completed a hematology/oncology fellowship. Her current research focuses on optimizing the treatment of sickle cell disease and the evaluation of health disparities in the treatment of cancer in minority patients.

Course Overview

This is an introductory-level, self-paced course. In the "Welcome and Getting Started" Module, participants learn how to install R, Rtools, and RStudio. Learners will have access to two national databases, which they use to develop a framework for the statistical analyses carried out in later Modules. The two databases are:

  • The All of Us NIH Research Hub which can be accessed through the All of Us Researcher Workbench
  • The Medical Expenditure Panel Survey (MEPS), accessed through the Agency for Healthcare Research and Quality (AHRQ) website

The goal of Module 1 is to help learners formulate research questions pertaining to both cardiovascular and cancer health using the social determinants of health. It contains five lectures that explore the association of obesity and cancer, social determinants of health in the development and treatment of cancer, and social determinants of health in the development and treatment of cardiovascular disease. At the end of the module, the learner will complete a comprehensive quiz before advancing to Module 2.

Module 2 covers data visualization in R and includes an Introduction to ggplot. Learners will gain familiarity with example graphs such as histograms, bar plots, violin plots, and scatter plots, each accompanied by an R script showing how to produce it. At the end of this module, learners complete the first lab activity by uploading an example graph, the corresponding code, and a sample database.
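
To give a sense of what the plot types named in Module 2 might look like in code, here is a minimal sketch. It assumes the ggplot2 package is installed and uses R's built-in mtcars dataset as a stand-in; it is not the course's sample data or its attached scripts, and the object names (p_hist, p_bar, p_violin, p_scatter) are purely illustrative.

```r
# Illustrative sketch only: mtcars is a built-in stand-in dataset,
# not the course's sample database.
library(ggplot2)

# Histogram: distribution of one continuous variable
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  labs(x = "Miles per gallon", y = "Count")

# Bar plot: number of observations in each category
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(x = "Cylinders", y = "Count")

# Violin plot: distribution of a continuous variable within each category
p_violin <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_violin() +
  labs(x = "Cylinders", y = "Miles per gallon")

# Scatter plot: relationship between two continuous variables
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

Calling print(p_hist), or typing the object name at the console, renders the plot. Each graph follows the same pattern: a dataset, an aes() mapping of variables to axes, and a geom layer that sets the plot type.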

Modules 3, 4, and 5 focus on descriptive statistics, linear regression, and logistic regression, respectively. Lab exercises require learners to interpret the results produced from the provided sample data and the R scripts attached for each analysis. Learners who successfully complete all five modules receive a certificate of course completion.
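
The three kinds of analysis named for Modules 3 through 5 can be sketched in base R as follows. This is an assumption-laden illustration using the built-in mtcars dataset rather than the course's sample data or scripts, and the variable choices and object names (desc, fit_lm, fit_glm, odds_ratios) are hypothetical.

```r
# Illustrative sketch only: mtcars is a built-in stand-in dataset,
# not the course's sample data.

# Descriptive statistics for a continuous variable
desc <- c(mean = mean(mtcars$mpg),
          sd = sd(mtcars$mpg),
          median = median(mtcars$mpg))
print(desc)

# Linear regression: continuous outcome (mpg) on one predictor (wt)
fit_lm <- lm(mpg ~ wt, data = mtcars)
print(summary(fit_lm))

# Logistic regression: binary outcome (am, coded 0/1) on one predictor (wt)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)
print(summary(fit_glm))

# Exponentiating the logistic coefficients expresses them as odds ratios
odds_ratios <- exp(coef(fit_glm))
print(odds_ratios)
```

In the linear model, the coefficient on wt is the estimated change in mpg per one-unit increase in weight. In the logistic model, the coefficients are on the log-odds scale, which is why they are exponentiated before being read as odds ratios.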

Course Objectives 
  1. Understand how to properly prepare datasets for use in computing workspaces
  2. Describe the principal models used in machine learning/data science and the benefits and limitations of these approaches
  3. Gain hands-on experience using data gathered from a national clinical informatics dataset
  4. Evaluate the role that social determinants of health and/or precision medicine play in disparities in cancer and cardiovascular-related prevalence and treatment outcomes
  5. Recognize and articulate a research problem