practice midterm 2
jmledford3115 committed Feb 22, 2024
1 parent 8654aad commit 38f613d
Showing 14 changed files with 32,229 additions and 0 deletions.
Binary file modified .DS_Store
Binary file not shown.
Binary file added practice midterm2/.DS_Store
Binary file not shown.
Binary file added practice midterm2/data/.DS_Store
Binary file not shown.
32,002 changes: 32,002 additions & 0 deletions practice midterm2/data/surgery.csv


57 changes: 57 additions & 0 deletions practice midterm2/midterm_2.Rmd
@@ -0,0 +1,57 @@
---
title: "BIS 15L Midterm 2"
output:
  html_document:
    theme: spacelab
    keep_md: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Instructions
Answer the following questions and complete the exercises in RMarkdown. Please embed all of your code and push your final work to your repository. Your code should be organized, clean, and run free from errors. Remember, you must remove the `#` for any included code chunks to run. Be sure to add your name to the author header above.

Make sure to use the formatting conventions of RMarkdown to make your report neat and clean! Use the tidyverse and pipes unless otherwise indicated. To receive full credit, all plots must have clearly labeled axes, a title, and consistent aesthetics. This exam is worth a total of 35 points.

Please load the following libraries.
```{r message=FALSE, warning=FALSE}
library("tidyverse")
library("janitor")
library("naniar")
```

## Data
These data are from a study on surgical residents. The study was originally published by Sessler et al., “Operation Timing and 30-Day Mortality After Elective General Surgery”, Anesth Analg 2011; 113: 1423-8. The data were cleaned for instructional use by Amy S. Nowacki, “Surgery Timing Dataset”, TSHS Resources Portal (2016). Available at https://www.causeweb.org/tshs/surgery-timing/.

Descriptions of the variables and the study are included as PDFs in the data folder.

Please run the following chunk to import the data.
```{r message=FALSE, warning=FALSE}
surgery <- read_csv("data/surgery.csv")
```

1. Use the summary function(s) of your choice to explore the data and get an idea of its structure. Please also check for NA's.

2. Let's explore the participants in the study. Show a count of participants by race AND make a plot that visually represents your output.

3. What is the mean age of participants by gender? (hint: please provide a number for each) Since only three participants do not have gender indicated, remove these participants from the data.

4. Make a plot that shows the range of age associated with gender.

5. How healthy are the participants? The variable `asa_status` is an evaluation of patient physical status prior to surgery. Lower numbers indicate fewer comorbidities (presence of two or more diseases or medical conditions in a patient). Make a plot that compares the number of `asa_status` I-II, III, and IV-V.

6. Create a plot that displays the distribution of body mass index for each `asa_status` as a probability distribution, not a histogram. (hint: use faceting!)

The variable `ccsmort30rate` is a measure of the overall 30-day mortality rate associated with each type of operation. The variable `ccscomplicationrate` is a measure of the 30-day in-hospital complication rate. The variable `ahrq_ccs` lists each type of operation.

7. What are the 5 procedures associated with highest risk of 30-day mortality AND how do they compare with the 5 procedures with highest risk of complication? (hint: no need for a plot here)

8. Make a plot that compares the `ccsmort30rate` for all listed `ahrq_ccs` procedures.

9. When is the best month to have surgery? Make a chart that shows the 30-day mortality and complications for the patients by month. `mort30` is the variable that shows whether or not a patient survived 30 days post-operation.

10. Make a plot that visualizes the chart from question #9. Make sure that the months are on the x-axis. Do a search online and figure out how to order the months Jan-Dec.

Please be 100% sure your exam is saved, knitted, and pushed to your GitHub repository. No need to submit a link on Canvas; we will find your exam in your repository.
170 changes: 170 additions & 0 deletions practice midterm2/midterm_2_key.Rmd
@@ -0,0 +1,170 @@
---
title: "BIS 15L Midterm 2"
output:
  html_document:
    theme: spacelab
    keep_md: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Instructions
Answer the following questions and complete the exercises in RMarkdown. Please embed all of your code and push your final work to your repository. Your code should be organized, clean, and run free from errors. Remember, you must remove the `#` for any included code chunks to run. Be sure to add your name to the author header above.

Make sure to use the formatting conventions of RMarkdown to make your report neat and clean! Use the tidyverse and pipes unless otherwise indicated. To receive full credit, all plots must have clearly labeled axes, a title, and consistent aesthetics. This exam is worth a total of 35 points.

Please load the following libraries.
```{r message=FALSE, warning=FALSE}
library("tidyverse")
library("janitor")
library("naniar")
```

## Data
These data are from a study on surgical residents. The study was originally published by Sessler et al., “Operation Timing and 30-Day Mortality After Elective General Surgery”, Anesth Analg 2011; 113: 1423-8. The data were cleaned for instructional use by Amy S. Nowacki, “Surgery Timing Dataset”, TSHS Resources Portal (2016). Available at https://www.causeweb.org/tshs/surgery-timing/.

Descriptions of the variables and the study are included as PDFs in the data folder.

Please run the following chunk to import the data.
```{r message=FALSE, warning=FALSE}
surgery <- read_csv("data/surgery.csv")
```

1. Use the summary function(s) of your choice to explore the data and get an idea of its structure. This would also be a good time to check for NA's.
```{r}
glimpse(surgery)
```

```{r}
surgery %>% naniar::miss_var_summary()
```

2. Let's explore the participants in the study. Show a count of participants by race AND make a plot that visually represents your output.
```{r}
surgery %>% count(race)
```

```{r}
surgery %>%
  ggplot(aes(x=race, fill=race))+
  geom_bar(alpha=0.6, color="black")+
  labs(title="Participants by Race",
       x=NULL,
       y="# participants")+
  theme_light()
```

3. What is the mean age of participants by gender? (hint: please provide a number for each) Since only three participants do not have gender indicated, remove these participants from the data.
```{r}
surgery %>%
  filter(!is.na(gender)) %>% # drop the participants with no gender recorded
  group_by(gender) %>%
  summarize(mean_age=mean(age, na.rm=TRUE))
```
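
A note on removing missing values: comparing a column against the string `"NA"` and testing it with `is.na()` are not the same operation, even though both drop true missing values inside `filter()`. The sketch below illustrates the difference on a small made-up data frame (`toy`, with hypothetical values that are not taken from `surgery.csv`); it assumes only that `dplyr` is installed.

```r
library(dplyr)

# Hypothetical toy data: one true missing value and one literal "NA" string
toy <- data.frame(gender = c("F", "M", NA, "NA"),
                  age    = c(50, 60, 70, 80))

# Comparing with the string "NA": a true NA yields NA (not TRUE), so
# filter() drops that row, and the literal "NA" string is dropped as well
by_string <- toy %>% filter(gender != "NA")

# Testing with is.na(): only the true missing value is dropped
by_isna <- toy %>% filter(!is.na(gender))

nrow(by_string)
nrow(by_isna)
```

Because `read_csv()` converts empty cells and the text `NA` to true missing values by default, either form removes the same rows in this exam, but `!is.na()` states the intent directly.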

4. Make a plot that shows the range of age associated with gender.
```{r}
surgery %>%
  filter(!is.na(gender)) %>% # drop the participants with no gender recorded
  ggplot(aes(x=gender, y=age, fill=gender))+
  geom_boxplot(na.rm=TRUE, alpha=0.6)+
  labs(title="Range of Participants' Age by Gender",
       x=NULL,
       y="Age (years)")+
  theme_light()
```

5. How healthy are the participants? The variable `asa_status` is an evaluation of patient physical status prior to surgery. Lower numbers indicate fewer comorbidities (presence of two or more diseases or medical conditions in a patient). Make a plot that compares the number of `asa_status` I-II, III, and IV-V.
```{r}
surgery %>%
  filter(!is.na(asa_status)) %>% # drop participants with no ASA status recorded
  ggplot(aes(x=asa_status, fill=asa_status))+
  geom_bar(color="black", alpha=0.6)+
  labs(title="ASA Status of Participants", x="ASA", y="# Participants")+
  theme_light()
```

6. Create a plot that displays the distribution of body mass index for each `asa_status` as a probability distribution, not a histogram. (hint: use faceting!)
```{r}
surgery %>%
  ggplot(aes(x=bmi))+
  geom_density(fill="steelblue", color="black", alpha=0.6, na.rm=TRUE)+
  facet_wrap(~asa_status)+
  labs(title="Distribution of BMI", x="BMI", y=NULL)+
  theme_light()
```

The variable `ccsmort30rate` is a measure of the overall 30-day mortality rate associated with each type of operation. The variable `ccscomplicationrate` is a measure of the 30-day in-hospital complication rate. The variable `ahrq_ccs` lists each type of operation.

7. What are the 5 procedures associated with highest risk of 30-day mortality AND how do they compare with the 5 procedures with highest risk of complication? (hint: no need for a plot here)
```{r}
surgery %>%
  group_by(ahrq_ccs) %>%
  summarise(ccsmort30rate=mean(ccsmort30rate)) %>%
  slice_max(ccsmort30rate, n=5) # slice_max() already returns rows in descending order
```

```{r}
surgery %>%
  group_by(ahrq_ccs) %>%
  summarise(ccscomplicationrate=mean(ccscomplicationrate)) %>%
  slice_max(ccscomplicationrate, n=5) # slice_max() already returns rows in descending order
```

```{r}
surgery %>%
  select(ahrq_ccs, ccsmort30rate, ccscomplicationrate) %>%
  filter(ahrq_ccs %in% c("Colorectal resection", "Small bowel resection",
                         "Gastrectomy; partial and total",
                         "Endoscopy and endoscopic biopsy of the urinary tract",
                         "Spinal fusion", "Nephrectomy; partial or complete")) %>%
  group_by(ahrq_ccs) %>%
  summarise(ccsmort30rate=mean(ccsmort30rate),
            ccscomplicationrate=mean(ccscomplicationrate))
```

8. Make a plot that compares the `ccsmort30rate` for all listed `ahrq_ccs` procedures.
```{r}
surgery %>%
  group_by(ahrq_ccs) %>%
  summarise(ccsmort30rate=mean(ccsmort30rate)) %>% # one row (one bar) per procedure
  ggplot(aes(x=ahrq_ccs, y=ccsmort30rate))+
  geom_col(fill="steelblue", alpha=0.6)+
  coord_flip()+
  labs(title="30-Day Mortality by Procedure",
       x=NULL,
       y="Rate")+
  theme_light()
```

9. When is the best month to have surgery? Make a chart that shows the 30-day mortality and complications for the patients by month. `mort30` is the variable that shows whether or not a patient survived 30 days post-operation.
```{r}
surgery_new <- surgery %>%
  mutate(mort30_n = ifelse(mort30 == "Yes", 1, 0),
         complication_n = ifelse(complication == "Yes", 1, 0)) %>%
  group_by(month) %>%
  summarise(n_deaths=sum(mort30_n),
            n_comp=sum(complication_n))
surgery_new
```

```{r}
surgery %>% tabyl(month, mort30)
```

```{r}
surgery %>% tabyl(month, complication)
```
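
The counts above depend on how many patients had surgery in each month, so a per-month proportion can be easier to compare across months. The following is a minimal sketch of that idea, not part of the original key: it uses a small made-up data frame (`toy`, hypothetical values) with the same `"Yes"`/`"No"` coding used for `mort30` and `complication` above, and relies on the fact that the mean of a logical vector is the proportion of `TRUE` values.

```r
library(dplyr)

# Hypothetical toy data with the same coding as the surgery data
toy <- data.frame(
  month        = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb"),
  mort30       = c("Yes", "No", "No", "No", "Yes", "No"),
  complication = c("Yes", "Yes", "No", "No", "No", "No")
)

# mean() of a logical vector gives the proportion of TRUE values
monthly_rates <- toy %>%
  group_by(month) %>%
  summarise(n_patients        = n(),
            mort30_rate       = mean(mort30 == "Yes", na.rm = TRUE),
            complication_rate = mean(complication == "Yes", na.rm = TRUE))

monthly_rates
```

Applied to `surgery`, the same `summarise()` call would return one row per month with both rates side by side.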

10. Make a plot that visualizes the chart from question #9. Make sure that the months are on the x-axis. Do a search online and figure out how to order the months Jan-Dec.
```{r}
surgery_new %>%
  ggplot(aes(x=month, y=n_deaths))+
  geom_col(alpha=0.8, fill="steelblue", color="black")+
  labs(title="30-Day Mortality by Month",
       x=NULL,
       y="# deaths")+ # n_deaths is a count, not a rate
  scale_x_discrete(limits=c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))+
  theme_light()
```
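
Another common way to order the months, sketched here as an alternative, is to convert `month` to a factor whose levels follow the calendar. This assumes the months are stored as three-letter abbreviations such as `"Jan"`, which match base R's built-in `month.abb` constant. The data frame `monthly_toy` and its values are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for a monthly summary table (values are made up)
monthly_toy <- data.frame(
  month    = c("Mar", "Jan", "Dec", "Feb"),
  n_deaths = c(3, 5, 2, 4)
)

# month.abb is base R's c("Jan", "Feb", ..., "Dec"); using it as the factor
# levels makes sorting and plotting follow calendar order, not alphabetical
monthly_ordered <- monthly_toy %>%
  mutate(month = factor(month, levels = month.abb))

# Rows now sort in calendar order
arrange(monthly_ordered, month)

# ggplot2 places factor levels on the x-axis in level order
p <- ggplot(monthly_ordered, aes(x = month, y = n_deaths)) +
  geom_col(fill = "steelblue", color = "black", alpha = 0.8) +
  labs(title = "Deaths by Month (toy data)", x = NULL, y = "# deaths") +
  theme_light()
```

Setting the order on the data rather than on the scale means any later table or plot built from the same object keeps the calendar order.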

Please be 100% sure your exam is saved, knitted, and pushed to your GitHub repository. No need to submit a link on Canvas; we will find your exam in your repository.
