---
title: 'Statistics 218: Labs Outline'
output:
  pdf_document:
    df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Lab 1: What is data?
## Topics:
* Describing variables in a dataset
* Interpreting observations in real-world terms
* Calculating numerical summaries of samples
* Deciding which variables to collect or analyze
* Discussing evidence for specific research questions
## Dataset: Titanic (abbreviated version)
```{r titanic}
```
## R Skills:
* Getting used to R Markdown
* Reading a dataset into R
* Looking at, and interpreting, the `summary()` of the dataset
* Looking at, and interpreting, individual rows of the dataset: `head()`, `[1,]`
* Checking for obvious errors and missing data: `na.omit()`
* Filtering to look at specific observations and variables: `filter()`, `select()`
* Mutating variables by factoring, combining, etc: `mutate()`, `mutate_at()`, `factor()`
* Calculating summaries of samples, including by group: `mean()`, etc; `summarize_at()`, `group_by()`
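
A minimal sketch of the skills listed above. It assumes the built-in `datasets::Titanic` table, converted to a data frame, as a stand-in for the lab's abbreviated Titanic file; the real file and its column names may differ.

```r
library(dplyr)

# Assumption: the built-in Titanic table stands in for the lab's dataset
titanic <- as.data.frame(Titanic)   # columns: Class, Sex, Age, Survived, Freq

summary(titanic)    # overview of every variable
head(titanic)       # first six rows
titanic[1, ]        # a single observation (row 1)
na.omit(titanic)    # drops any rows that contain missing values

# Keep adults only, then keep the variables of interest
adults <- titanic %>%
  filter(Age == "Adult") %>%
  select(Class, Sex, Survived, Freq)

# Proportion of adults who survived, by class
survival_by_class <- adults %>%
  group_by(Class) %>%
  summarize(prop_survived = sum(Freq[Survived == "Yes"]) / sum(Freq))

survival_by_class
```

Because this stand-in stores counts in `Freq` rather than one row per person, the grouped summary weights by `Freq`; with one row per person, `mean(Survived == "Yes")` would do the same job.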
## Assignment: Full Titanic dataset: who lived and died?
## Additional Resources:
* DataCamp
* RStudio Cheatsheets: dplyr
* Swirl
# Lab 2: Visualization
## Topics:
Reading, interpreting, comparing, and knowing when to use...
* Histograms
* Dot plots
* Box plots
* Bar graphs
* [Pie Charts - omitted in R section]
* Side-by-Side box plots
* Side-by-Side bar graphs
* Stacked bar graphs
* Scatter plots
* Line plots
## Dataset: Titanic (abbreviated)
## R Skills:
* Creating all the above in ggplot
* Grouping and faceting
* Optional bells and whistles for plotting
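
A rough sketch of two of the plot types above in `ggplot2`, again assuming the built-in `Titanic` table as a stand-in for the lab's dataset. The object names (`counts`, `p_stacked`, `p_side`) are illustrative only.

```r
library(ggplot2)

# Assumption: the built-in Titanic table stands in for the lab's dataset
titanic <- as.data.frame(Titanic)

# Total counts for each Class x Sex x Survived combination (summed over Age)
counts <- aggregate(Freq ~ Class + Sex + Survived, data = titanic, FUN = sum)

# Stacked bar graph of survival by class, faceted by sex
p_stacked <- ggplot(counts, aes(x = Class, y = Freq, fill = Survived)) +
  geom_col() +
  facet_wrap(~ Sex) +
  labs(y = "Number of people")

# Side-by-side bar graph of the same counts
p_side <- ggplot(counts, aes(x = Class, y = Freq, fill = Survived)) +
  geom_col(position = "dodge") +
  facet_wrap(~ Sex) +
  labs(y = "Number of people")

print(p_stacked)
print(p_side)
```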
## Assignment: Titanic final report
## Additional Resources:
* RStudio Cheatsheets: ggplot2
* Colors pdf
* Link some tutorials?
# Lab 3: Random Variables, Categorical/discrete distributions
## Topics:
* Random variables: samples and populations, statistics and parameters
* Frequency tables
* Probability
* The Binomial Distribution
* Quantifying evidence: baby chi-square?
## Dataset: Election data?
## Skills:
* [review] Counting up categoricals
* Making a frequency table from data: `reshape` stuff
* Using `pbinom()` and `qbinom()`
* Simulation? `rbinom()`
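
One possible illustration of the binomial functions above. The scenario (20 voters, each supporting a candidate with probability 0.5) is hypothetical and not taken from any lab dataset.

```r
# Hypothetical setup: 20 voters, each supports the candidate with probability 0.5
n <- 20
p <- 0.5

pbinom(12, size = n, prob = p)        # P(X <= 12)
1 - pbinom(12, size = n, prob = p)    # P(X > 12), i.e. P(X >= 13)
qbinom(0.95, size = n, prob = p)      # smallest x with P(X <= x) >= 0.95

# Simulation: 10,000 samples of size n
set.seed(218)
sims <- rbinom(10000, size = n, prob = p)
table(sims)        # frequency table of the simulated counts
mean(sims > 12)    # simulated estimate of P(X > 12)
```

Comparing `mean(sims > 12)` with `1 - pbinom(12, n, p)` shows students that the simulated and exact probabilities agree closely.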
## Assignment: Hmmmm.....
## Additional Resources
* Applets!
# Lab 4: Hypothesis Testing
## Topics:
* Hypothesis test general principles
* 1-sample prop test
* very gentle intro to confidence interval for $p$
* Chi-Square Test
## Dataset: same as lab 3 ideally. Election data?
## R Skills:
* [review] making frequency tables
* [review] `pbinom()`, `qbinom()`
* `pchisq()`, `qchisq()`, maybe? probably.
* `prop.test()`
* `chisq.test()`
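
A small sketch of the two tests above, using made-up counts rather than the lab's data.

```r
# Hypothetical counts: 58 of 100 sampled voters favor a candidate
prop_result <- prop.test(x = 58, n = 100, p = 0.5)   # H0: p = 0.5
prop_result$p.value
prop_result$conf.int    # approximate 95% confidence interval for p

# Chi-square goodness-of-fit test with made-up counts for three candidates
observed <- c(A = 45, B = 35, C = 20)
gof <- chisq.test(observed, p = c(1/3, 1/3, 1/3))    # H0: equal support
gof$statistic
gof$p.value

# The same p-value computed directly from the chi-square distribution
pchisq(gof$statistic, df = gof$parameter, lower.tail = FALSE)
```

Computing the p-value by hand with `pchisq()` connects the test output back to the distribution functions reviewed earlier.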
## Assignment: Final report on [biodata].
## Additional Resources:
* Applets
* Real-world chi-square study abstracts and such
# Lab 5: Densities, the CLT
## Topics:
* Density curves
* The Uniform distribution
* The Normal distribution
* CLT
* Normal approximation to Binomial
## Dataset: basketball??? hmmmmm I don't love it....
## R Skills:
* `runif()`, `punif()`, `qunif()`
* `rnorm()`, `dnorm()`, `qnorm()`, `pnorm()`
* Curves in `ggplot`
* simulation....?
* qq plots
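
A sketch of a CLT simulation along these lines, assuming means of 30 Uniform(0, 1) draws as the example. The sample size and number of repetitions are arbitrary choices, and base graphics are used here instead of `ggplot` to keep the example short.

```r
set.seed(218)

# Means of 30 Uniform(0, 1) draws, repeated 5,000 times
sample_means <- replicate(5000, mean(runif(30)))

# CLT: the means are approximately Normal(0.5, sqrt(1/12) / sqrt(30))
clt_sd <- sqrt(1 / 12) / sqrt(30)
mean(sample_means)   # should be close to 0.5
sd(sample_means)     # should be close to clt_sd

pnorm(0.6, mean = 0.5, sd = clt_sd)   # approximate P(sample mean <= 0.6)
qnorm(0.975)                          # 97.5th percentile of the standard Normal

# Normal Q-Q plot of the simulated means
qqnorm(sample_means)
qqline(sample_means)
```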
## Assignment: Not bball. Make them do a different dataset.
# Lab 6: t-tests and confidence intervals
## Topics:
* one-sample t-test
* two-sample t-test
* Confidence intervals for $\mu$ and $\mu_1 - \mu_2$
## Dataset: wine?
## R Skills:
* `t.test()`
* Maybe conf int calculations?
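
A minimal sketch of `t.test()` and a hand-computed confidence interval. The ratings are simulated, made-up numbers, not the wine dataset.

```r
set.seed(218)

# Simulated (made-up) ratings for two groups of wines
red   <- rnorm(25, mean = 6.0, sd = 1)
white <- rnorm(25, mean = 5.5, sd = 1)

one_sample <- t.test(red, mu = 5.5)   # H0: mu = 5.5
two_sample <- t.test(red, white)      # Welch two-sample test, H0: mu_1 - mu_2 = 0

one_sample$conf.int    # 95% CI for mu
two_sample$conf.int    # 95% CI for mu_1 - mu_2

# One-sample 95% CI by hand: xbar +/- t* x s / sqrt(n)
n <- length(red)
manual_ci <- mean(red) + c(-1, 1) * qt(0.975, df = n - 1) * sd(red) / sqrt(n)
manual_ci
```

The hand calculation reproduces the interval reported by `t.test()`, which may help students see where the output comes from.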
## Assignment: wine? options?
## Additional Resources:
* Real world abstracts
# Lab 7: ANOVA
## Topics:
* ANOVA
* Multiple testing, Tukey HSD
## Dataset: bodwins
## R Skills:
* [review] Side-by-side boxplots
* `anova()`, `lm()`, `aov()`
* `TukeyHSD()`
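
A short sketch of the ANOVA workflow above, assuming the built-in `PlantGrowth` data (plant weight by treatment group) as a stand-in for the lab's dataset.

```r
# Assumption: built-in PlantGrowth data as a stand-in for the lab's dataset
boxplot(weight ~ group, data = PlantGrowth)   # side-by-side boxplots

fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)                                  # ANOVA table

# The same F test via lm() and anova()
lm_table <- anova(lm(weight ~ group, data = PlantGrowth))
lm_table

# Pairwise comparisons with Tukey's HSD (note the capital T in the function name)
tukey <- TukeyHSD(fit)
tukey
```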
## Assignment: your family or similar
## Additional Resources:
* Real world abstracts
# Lab 8: Regression
## Topics:
* Linear models
* Least squares and residuals
* Residual plots
* Transformation of variables?
## Dataset: kellys
## R Skills:
* `lm()`
* [review] scatterplots
* Adding model to scatterplot
* Prediction
* Plotting residuals (hmmm why does it suck in ggplot. maybe worth building by hand)
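
One way to build these plots by hand in `ggplot2`, assuming the built-in `cars` data (`speed`, `dist`) as a stand-in for the lab's dataset. The helper data frame `resid_df` is an illustrative name only.

```r
library(ggplot2)

# Assumption: built-in cars data as a stand-in for the lab's dataset
fit <- lm(dist ~ speed, data = cars)
coef(fit)                                        # intercept and slope

# Predicted stopping distance at a speed of 15
predict(fit, newdata = data.frame(speed = 15))

# Scatterplot with the fitted least-squares line
p_fit <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Residual plot built by hand from fitted values and residuals
resid_df <- data.frame(fitted = fitted(fit), residual = resid(fit))
p_resid <- ggplot(resid_df, aes(x = fitted, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed")

print(p_fit)
print(p_resid)
```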
## Assignment: your choice of names stuff
## Additional Resources
# Lab 9: Final project
<!--chapter:end:Labs_Outline.Rmd-->
---
title: "Shiny Load Test Notes"
author: "Kelly Bodwin"
date: "September 5, 2018"
output: html_document
---
```{r setup, include=FALSE}
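library(tidyverse)
library(devtools)
# Install shinyloadtest from GitHub only if it is not already installed
if (!requireNamespace("shinyloadtest", quietly = TRUE)) {
  devtools::install_github("rstudio/shinyloadtest")
}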
```
<!--chapter:end:Shiny_LoadTest_Notes.Rmd-->