---
title: "Solutions for Guided Project: Exploratory Visualization of Forest Fire Data"
author: "Rose Martin"
date: "December 4, 2018"
output: html_document
---
Load the packages we will need for the exercise:
```{r}
library(readr)
library(dplyr)
library(ggplot2)
library(purrr)
```
Import the data file. Save it as a data frame.
```{r}
forest_fires <- read_csv("forestfires.csv")
```
Create a bar chart showing the number of forest fires occurring during each month:
```{r}
fires_by_month <- forest_fires %>%
  group_by(month) %>%
  summarize(total_fires = n())
ggplot(data = fires_by_month) +
  aes(x = month, y = total_fires) +
  geom_bar(stat = "identity") +
  theme(panel.background = element_rect(fill = "white"),
        axis.line = element_line(size = 0.25,
                                 colour = "black"))
```
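The same counts can also be produced with `count()` and drawn with `geom_col()`, which is shorthand for `geom_bar(stat = "identity")`. The sketch below is only an illustration on a small made-up tibble; `toy_fires`, `toy_by_month`, and `toy_month_plot` are hypothetical names, not part of the data set or the solution:

```{r}
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for forest_fires
toy_fires <- tibble(month = c("aug", "aug", "sep", "mar", "sep", "aug"))

# count() is equivalent to group_by() followed by summarize(n())
toy_by_month <- toy_fires %>%
  count(month, name = "total_fires")

# geom_col() plots the y values as given, like geom_bar(stat = "identity")
toy_month_plot <- ggplot(data = toy_by_month) +
  aes(x = month, y = total_fires) +
  geom_col()

toy_month_plot
```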
Create a bar chart showing the number of forest fires occurring on each day of the week:
```{r}
fires_by_DOW <- forest_fires %>%
  group_by(day) %>%
  summarize(total_fires = n())
ggplot(data = fires_by_DOW) +
  aes(x = day, y = total_fires) +
  geom_bar(stat = "identity") +
  theme(panel.background = element_rect(fill = "white"),
        axis.line = element_line(size = 0.25,
                                 colour = "black"))
```
Change the data type of month and day to factor, and specify the order of the months and of the days of the week:
```{r}
forest_fires <- forest_fires %>%
  mutate(month = factor(month, levels = c("jan", "feb", "mar", "apr", "may", "jun",
                                          "jul", "aug", "sep", "oct", "nov", "dec")),
         day = factor(day, levels = c("sun", "mon", "tue", "wed", "thu", "fri", "sat")))
## Once the months and days of the week have been reordered, re-run the bar chart
## code above to create new bar graphs with the bars in calendar order
```
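The conversion matters because a character column is plotted in alphabetical order, while a factor is plotted in the order of its levels. A minimal sketch on made-up values (`toy_months` and `toy_factor` are hypothetical names used only here):

```{r}
# Hypothetical toy vector of month abbreviations
toy_months <- c("sep", "aug", "mar", "aug")

# Character vectors sort alphabetically
sort(unique(toy_months))

# Factors sort by the order given in `levels`
toy_factor <- factor(toy_months, levels = c("mar", "aug", "sep"))
levels(toy_factor)
sort(toy_factor)
```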
Write a function to create a boxplot for visualizing variable distributions by month and day of the week:
```{r}
## Write the function
create_boxplots <- function(x, y) {
  ggplot(data = forest_fires) +
    aes_string(x = x, y = y) +
    geom_boxplot() +
    theme(panel.background = element_rect(fill = "white"))
}
## Assign x and y variable names
x_var_month <- names(forest_fires)[3] ## month
x_var_day <- names(forest_fires)[4] ## day
y_var <- names(forest_fires)[5:12]
## Use the map2() function to apply the function to the variables of interest;
## the single x variable name is recycled across all of the y variable names
month_box <- map2(x_var_month, y_var, create_boxplots) ## visualize variables by month
day_box <- map2(x_var_day, y_var, create_boxplots) ## visualize variables by day
```
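`aes_string()` still works but is soft-deprecated in recent ggplot2 releases, where the `.data` pronoun is the suggested replacement. The following is a sketch of that alternative, not a change to the solution. It also picks columns by name instead of by position and takes the data frame as an argument. `toy_weather`, `create_boxplots_tidy`, and the column subset are assumptions made for illustration:

```{r}
library(ggplot2)
library(purrr)

# Hypothetical toy data with a few made-up values
toy_weather <- data.frame(
  month = factor(c("aug", "aug", "sep", "sep"), levels = c("aug", "sep")),
  temp = c(22.1, 25.3, 18.4, 19.9),
  wind = c(2.7, 4.0, 1.8, 5.4)
)

# Same idea as create_boxplots(), but using .data[[ ]] instead of aes_string()
create_boxplots_tidy <- function(data, x, y) {
  ggplot(data = data) +
    aes(x = .data[[x]], y = .data[[y]]) +
    geom_boxplot() +
    labs(x = x, y = y) + # set axis labels explicitly from the column names
    theme(panel.background = element_rect(fill = "white"))
}

# Column names chosen by name rather than by position
toy_y_vars <- c("temp", "wind")

# map() over the y variables, holding the data and the x variable fixed
toy_month_box <- map(toy_y_vars, function(y) {
  create_boxplots_tidy(toy_weather, "month", y)
})
```

Selecting by name avoids silently plotting the wrong columns if the column order of the input file ever changes.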
Create scatter plots to see which variables may affect forest fire size:
```{r}
## Write the function
create_scatterplots <- function(x, y) {
  ggplot(data = forest_fires) +
    aes_string(x = x, y = y) +
    geom_point() +
    theme(panel.background = element_rect(fill = "white"))
}
## Assign x and y variable names
x_var_scatter <- names(forest_fires)[5:12]
y_var_scatter <- names(forest_fires)[13] ## area
## Use the map2() function to apply the function to the variables of interest
scatters <- map2(x_var_scatter, y_var_scatter, create_scatterplots)
```
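`map2()` returns a list of plot objects, and assigning that list to a name does not draw anything. One way to render the plots is to name the list elements and print each one. The sketch below uses made-up data; `toy_df`, `toy_x_vars`, and `toy_scatters` are hypothetical names:

```{r}
library(ggplot2)
library(purrr)

# Hypothetical toy data with made-up values
toy_df <- data.frame(
  temp = c(22.1, 25.3, 18.4, 19.9),
  wind = c(2.7, 4.0, 1.8, 5.4),
  area = c(0.0, 1.2, 0.4, 10.7)
)

toy_x_vars <- c("temp", "wind")

# set_names() names each element after its x variable, and map() keeps those names
toy_scatters <- map(set_names(toy_x_vars), function(x) {
  ggplot(data = toy_df) +
    aes(x = .data[[x]], y = .data[["area"]]) +
    geom_point() +
    labs(x = x, y = "area")
})

# Print every plot in the list; a single plot can be shown with toy_scatters$temp
walk(toy_scatters, print)
```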