# Methods
Analyses were conducted in R. All observations from 2010-2013 were excluded because of changes in how the calculated scores were reported. An additional 33 records with missing values were excluded from the overall analysis. Summary statistics were reviewed for every value of each calculated score: number of records, number of readmissions, readmission rate, odds of readmission, and log odds.
Certain values at the tails of each calculated score had insufficient sample size. Because these values can be considered clinical extremes, they were aggregated as follows to ensure $N \ge 20$ across the range of values considered:
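
For reference, the per-value summary quantities follow the standard definitions. The symbols are introduced here for illustration only: for a score value $s$, let $n_s$ be the number of records and $r_s$ the number of readmissions.

$$
\hat{p}_s = \frac{r_s}{n_s}, \qquad
\widehat{\mathrm{odds}}_s = \frac{\hat{p}_s}{1 - \hat{p}_s} = \frac{r_s}{n_s - r_s}, \qquad
\mathrm{logit}(\hat{p}_s) = \log\left(\frac{r_s}{n_s - r_s}\right)
$$

The log odds are undefined when $r_s = 0$ or $r_s = n_s$, which is one reason sparse score values are unstable to summarize.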
+ $CCI \ge 14$ were combined into one group, $N=28$
+ $LACE \le 4$ were combined into one group, $N=55$
+ $LACE \ge 18$ were combined into one group, $N=20$
+ $HOSPITAL \ge 11$ were combined into one group, $N=22$
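
The aggregation above amounts to clamping each score at its cut-off before tabulating. The following is a minimal sketch in base R, not the code used for the analysis: it runs on a small made-up data frame, the names `toy`, `LACE_grp`, and `summ` are hypothetical, and the toy group sizes are far below the $N \ge 20$ threshold.

```r
# Hypothetical toy records (not the study data)
toy <- data.frame(
  LACE    = c(2, 3, 4, 4, 5, 5, 18, 19, 20, 20),
  Readmit = c("N", "Y", "N", "N", "Y", "N", "Y", "N", "Y", "Y")
)

# Pool the tails: values <= 4 are grouped at 4, values >= 18 at 18
toy$LACE_grp <- pmin(pmax(toy$LACE, 4), 18)

# Count records and readmissions for each (grouped) score value
n        <- tapply(toy$Readmit, toy$LACE_grp, length)
readmits <- tapply(toy$Readmit == "Y", toy$LACE_grp, sum)

summ <- data.frame(
  LACE_grp = as.numeric(names(n)),
  n        = as.vector(n),
  readmits = as.vector(readmits)
)

# Readmission rate, odds, and log odds per score value
summ$rate     <- summ$readmits / summ$n
summ$odds     <- summ$rate / (1 - summ$rate)
summ$log_odds <- log(summ$odds)

print(summ)
```

Clamping with `pmin()` and `pmax()` keeps the grouped score numeric, so it can still enter a model as a numeric or ordered predictor.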
Covariates were reviewed for sufficient sample size within each group to include in model-fitting. Exploratory data analysis also included visualizations of empirical log odds (logits) for each calculated score across all potential covariates.
Model fitting began with logistic regression of readmission risk on acuity score. A separate model was fitted for each acuity score and expanded to include covariates as appropriate. Additional observations were dropped or aggregated where sample size required it; these changes are noted in each case. Individual models were assessed by reviewing deviance residuals and conducting a $\chi^2$ goodness-of-fit test. Individual predictors were assessed with confidence intervals. Models for each score were compared using Akaike's Information Criterion (AIC).
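
The workflow described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the fitted models themselves: the grouped counts in `grp` are invented, the object names are hypothetical, the goodness-of-fit test is taken to be the residual-deviance test on data grouped by score value, and the intervals shown are Wald intervals from `confint.default()`.

```r
# Hypothetical grouped data: one row per score value (not the study data)
grp <- data.frame(
  score    = 4:10,
  n        = c(60, 80, 90, 85, 70, 50, 30),
  readmits = c(6, 10, 14, 16, 16, 14, 10)
)

# Logistic regression of readmission on score, using binomial counts
fit_lin <- glm(cbind(readmits, n - readmits) ~ score,
               family = binomial, data = grp)

# Deviance residuals, one per score value, for model checking
dev_res <- residuals(fit_lin, type = "deviance")

# Chi-square goodness-of-fit test on the residual deviance
gof_p <- pchisq(deviance(fit_lin), df.residual(fit_lin), lower.tail = FALSE)

# Wald 95% confidence intervals for the coefficients (log-odds scale)
ci <- confint.default(fit_lin)

# Compare against an intercept-only model with AIC (lower is better)
fit_null <- glm(cbind(readmits, n - readmits) ~ 1,
                family = binomial, data = grp)
aic_tab <- AIC(fit_null, fit_lin)

print(round(dev_res, 2))
print(gof_p)
print(ci)
print(aic_tab)
```

A large goodness-of-fit p-value gives no evidence of lack of fit for grouped data; for ungrouped binary outcomes the residual deviance does not follow a $\chi^2$ distribution, so the test would need a different form.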
*Note: add a MC simulation if there's enough time!*
```{r include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
```
```{r load_libs}
# -- Load Libraries
library(readxl) # read in Excel files
library(car) # Type III Sums of Squares Anova
library(emmeans) # multiple comparisons
library(tidyverse) # data processing, visualization
library(lubridate) # process dates
library(ggsci) # colors
library(rstatix) # pipe-friendly statistical tests
library(gridExtra) # make and arrange tables
library(gtable) # table grob layout
library(kableExtra) # styled kable tables
library(DescTools) # categorical correlation
library(nnet) # multinomial model fitting
library(lme4) # mixed-level model fitting
library(corrr) # tidy correlations
```
```{r load_data}
snf_data <- read_xlsx("data/SNF_CODED_CALCULATED_SCORES_04042022.xlsx",
sheet = "CODED DATA AND CALCULATIONS")
readmits <- read_xlsx("data/Stern 30-day Readmissions 2014 - 2019.xlsx")
```
```{r format_var}
snf_data <- snf_data %>%
transmute(SternClientID = `Stern Client Id`,
MRN = MRN,
Year = WorkSheet,
CurrentAdmit = as.Date(`Current Admit`, format = "%m/%d/%y"),
DischargeDate = as.Date(`Discharge Date`, format = "%m/%d/%y"),
Gender = factor(Gender),
Race = Race,
Ethnicity = factor(Ethnicity),
InsurancePlanName = (InsurancePlanName),
ED6moPrior = factor(`LACE Num_of_ED_visits_prior 6months`),
PatientAge = PatientAge,
AdmissionHospital = factor(AdmissionHospital),
CCIAgeScore = ordered(`CCI AGE SCORE`),
LACELOSDaysScore = ordered(`LACE LOSDays SCORE`),
LOSDays = LOSDays,
LOS5Days = factor(`LOS>=5days`),
HOSPITALLOS5Days = factor(`HOSPITAL LOS>=5days`),
VisitType = factor(VisitType),
AdmitDtm = as.Date(AdmitDtm, format = "%m/%d/%y"),
DischargeDtm = as.Date(DischargeDtm, format = "%m/%d/%y"),
NumAdmitPastYear = `NumberOfAdmissionInPastYear`,
PriorAdmin = ordered(`HOSPITAL # OF admission prior yr`),
Cancer = factor(`HOSPITAL SCORE D/C ONC SERVICE`),
AcuteAdmin = factor(`LACE ACUTE ADMISSION`),
AdminType = factor(`HOSPITAL SCORE INDEX ADMISSION TYPE`),
Hemoglobin = `Hemoglobin at discharge`,
HemoglobinLevel = factor(`HOSPITAL SCORE Hemoglobin at discharge`),
Sodium = `Sodium, Serum at discharge`,
SodiumLevel = factor(`HOSPITAL SCORE Sodium`),
LACECCI = `LACE CCI SCORE`,
MyoInfarc = factor(`Myocardial infarction`),
CongHrtFail = factor(`Congestive heart failure`),
PeriphVascDis = factor(`Peripheral vascular disease`),
CerebroVascDis = factor(`Cerebrovascular disease`),
Dementia = factor(Dementia),
ChronPulmoDis = factor(`Chronic pulmonary disease`),
RheuConTisDis = factor(`Rheumatic disease=connective tissue disease`),
PepUlcDis = factor(`Peptic ulcer disease`),
MildLiverDis = factor(`MILD liver disease`), # combine into one liver disease var
ModSevLiverDis = factor(`MODERATE or SEVERE liver disease`),
DiabetesWOChrCom = factor(`Diabetes WITHOUT chronic complication`), # combine into one diabetes var
DiabetesWChrCom = factor(`Diabetes WITH chronic complication`),
HemiParaPlegia = factor(`Hemiplegia or paraplegia`),
RenalDisCKD = factor(`Renal disease = mod to severe CKD`),
Malignancy = factor(`Any Malignancy`),
MetaSolidTumor = factor(`Metastatic solid tumour`),
AidsHIV = factor(`AIDS/HIV`),
CCI = `CALCULATED CCI SCORE`,
LACE = `CALCULATED LACE SCORE`,
HOSPITAL = `CALCULATED HOSPITAL SCORE`,
Readmit30 = `READMISSION <30DAYS`) %>%
mutate(StayDurationDays = as.numeric((DischargeDate - CurrentAdmit)),
DtmDuration = DischargeDtm - AdmitDtm)
```
```{r standardize}
# Exclude records before 2014 due to reporting changes
snf_data <- snf_data %>% filter(Year > 2013)
# Recode Readmit30 ("N/A" marks an original admission, coded "O") and rename it to Readmit
snf_data <- snf_data %>%
mutate(Readmit30 = ifelse(Readmit30 == "N/A", "O", Readmit30)) %>%
rename(Readmit = Readmit30)
# Standardize values of Race, Insurance, Liver Disease, Diabetes
snf_data <- snf_data %>%
mutate(Race = as.factor(ifelse(Race == "White", "Caucasian/White",
ifelse((Race == "Declined" | Race == "Not Specified" | Race == "Unknown"), "Unavailable/Unknown",
ifelse(Race == "Multiracial", "Other/Multiracial", Race)))),
InsurancePlanCleaned = ifelse(grepl("MEDICARE|MCARE|MEDICAID|MCAID", InsurancePlanName),
"MCARE_MCAID", "NO_MCARE_MCAID"),
LiverDis = ordered(ifelse(ModSevLiverDis == 3, 3,
ifelse(MildLiverDis == 1, 1, 0))),
Diabetes = ordered(ifelse(DiabetesWChrCom == 2, 2,
ifelse(DiabetesWOChrCom == 1, 1, 0))))
# relevel Readmit factor, combine original admissions with readmissions > 30 days
snf_data <- snf_data %>%
mutate(Readmit = ifelse(Readmit == "O", "N", Readmit)) %>%
mutate(Readmit = as.factor(Readmit))
```
```{r, nas}
# Summarize missing values by variable
nas <- map(snf_data, ~sum(is.na(.)))
# nas %>% as_tibble(nas)
# Look at records with missing values
NAlook <- function(x) rowSums(x) > 0
NAs <- snf_data %>% filter(NAlook( across( .cols = everything(), .fns = ~ is.na(.x))))
# remove small number of records with NA values
snf_data <- snf_data %>% na.omit()
```
```{r data_prep}
# Aggregate Race/Ethnicity
snf_data <- snf_data %>%
mutate(Race_Eth = factor(ifelse(Ethnicity == "Hispanic or Latino", "Hispanic/Latino", as.character(Race))))
snf_data_a <- snf_data %>%
filter(Gender != "Unspecified") %>%
mutate(Race_Eth = fct_recode(Race_Eth, "Other/Multiracial" = "Native Amer/Alaskan"),
Insurance = as.factor(InsurancePlanCleaned))
snf_data_a$Gender <- droplevels(snf_data_a$Gender)
```
```{r join_los}
# Join SNF length of stay for readmissions to the main data
readmits <- readmits %>%
mutate(SternClientID = as.numeric(`Stern ID #`)) %>%
rename(LOS_snf = LOS) %>%
select(c(SternClientID, LOS_snf))
snf_data <- snf_data_a %>%
left_join(readmits, by = "SternClientID")
```