
Data science research project under Professor Abba Greenleaf


keertikoya/Greenleaf-Research

 
 

Repository files navigation

Data Science Research Project

Between October 1, 2019 and May 15, 2021, New York State reported 162,679 positive flu cases. This study asks how age, poverty level, and healthcare access affected whether a New York City (NYC) resident received the flu vaccine in 2020. After data cleaning and preparation, the final dataset contained 8,071 observations of 17 variables. Chi-squared tests were run between each of the other variables and "fluvaccineshot," which served as the response variable. Five machine learning models were trained and tested on the data, and each model's accuracy was computed from its confusion matrix. The accuracies of logistic regression, gradient-boosted trees, random forest, KNN, and SVM were 67.21%, 65.95%, 64.46%, 62.98%, and 63.08%, respectively. The study concluded that age, poverty level, and healthcare access are not, on their own, good indicators of whether an NYC resident received the flu vaccine in 2020; broader factors must be considered for more accurate predictions. Once those factors are identified, public health programs can be adjusted to vaccinate more residents and reduce the number of annual flu cases in NYC.
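The two calculations named above, a chi-squared test statistic for a predictor against the response and a model's accuracy read off its confusion matrix, can be sketched as follows. This is a minimal illustration and not the project's code: the repository's analysis is in R, Python is used here only for a self-contained example, the function names are hypothetical, and the counts are made up rather than taken from the study's dataset. The sketch computes the uncorrected Pearson statistic only and does not derive a p-value.

```python
from typing import Sequence


def chi_squared_statistic(table: Sequence[Sequence[float]]) -> float:
    """Uncorrected Pearson chi-squared statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def accuracy_from_confusion_matrix(matrix: Sequence[Sequence[float]]) -> float:
    """Share of correct predictions: diagonal total divided by overall total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical counts, NOT taken from the study's data.
# Rows: age group (under 65, 65 and over); columns: vaccinated (no, yes).
age_by_vaccine = [[30, 20], [10, 40]]
# Rows: actual class (no, yes); columns: predicted class (no, yes).
confusion = [[50, 20], [15, 15]]

chi2 = chi_squared_statistic(age_by_vaccine)
acc = accuracy_from_confusion_matrix(confusion)
print(f"chi-squared statistic: {chi2:.3f}")
print(f"accuracy: {acc:.2%}")
```

A larger statistic indicates a stronger departure from independence between the predictor and the response. Accuracy counts only the diagonal of the confusion matrix, so it says nothing about how the errors split between false positives and false negatives.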

Languages

  • HTML 99.0%
  • R 1.0%