Between October 1, 2019 and May 15, 2021, New York State reported 162,679 positive flu cases. This study asks how age, poverty level, and healthcare access affected whether a New York City (NYC) resident received the flu vaccine in 2020. After data cleaning and preparation, the final dataset contained 8,071 observations of 17 variables. Chi-squared tests were run between each variable and "fluvaccineshot," the response variable. Five machine learning models were trained and tested on the data, and the accuracy of each was computed from its confusion matrix. The models (logistic regression, gradient-boosted trees, random forest, KNN, and SVM) reached accuracies of 67.21%, 65.95%, 64.46%, 62.98%, and 63.08%, respectively. The study concluded that age, poverty level, and healthcare access alone are not good indicators of whether an NYC resident received the flu vaccine in 2020; broader factors must be considered for more accurate predictions. Once those factors are identified, public health programs can be adjusted to vaccinate more residents and reduce the number of annual flu cases in NYC.
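The two calculations named in the abstract, a chi-squared test against the response variable and accuracy read off a confusion matrix, can be sketched as follows. This is a minimal illustration in plain Python, not the project's own code (the repository's analysis is written in R). The function names and all counts are hypothetical and are not taken from the study's data.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for an r x c contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def accuracy(confusion):
    """Share of correct predictions: diagonal of the confusion matrix over the total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts: rows = two levels of a predictor,
# columns = fluvaccineshot (yes, no).
example_table = [[30, 20], [20, 30]]
print(chi_squared_statistic(example_table))  # 4.0

# Hypothetical confusion matrix: rows = actual class, columns = predicted class.
example_confusion = [[50, 10], [20, 20]]
print(accuracy(example_confusion))  # 0.7
```

For a 2 x 2 table the statistic has one degree of freedom, so a value of 4.0 exceeds the 3.841 critical value at the 5% level. Accuracy alone does not show how errors are split between the two classes, which is why the full confusion matrix is worth inspecting.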
forked from the-codingschool/DSRP-2023-Greenleaf
keertikoya/Greenleaf-Research
About
Data science research project under Professor Abba Greenleaf
Languages
- HTML 99.0%
- R 1.0%