forked from datacamp/courses-introduction-to-r
---
title_meta: Chapter 4
title: Factors
description: >-
Data often falls into a limited number of categories. For example, human hair
color can be categorized as black, brown, blond, red, grey, or white—and
perhaps a few more options for people who color their hair. In R, categorical
data is stored in factors. Factors are very important in data analysis, so
start learning how to create, subset, and compare them now.
---
## What's a factor and why would you use it?
```yaml
type: NormalExercise
key: 05273321916d99bb9c0deadf75c6834d25a47244
xp: 100
skills:
- 1
```
In this chapter you dive into the wonderful world of **factors**.
The term factor refers to a statistical data type used to store categorical variables. The difference between a categorical variable and a continuous variable is that a categorical variable can take only a **limited number of categories** as its values. A continuous variable, on the other hand, can take any of an infinite number of values.
It is important that R knows whether it is dealing with a continuous or a categorical variable, as the statistical models you will develop in the future treat both types differently. (You will see later why this is the case.)
A good example of a categorical variable is sex. In many circumstances you can limit the sex categories to "Male" or "Female". (Sometimes you may need different categories. For example, you may need to consider chromosomal variation, hermaphroditic animals, or different cultural norms, but you will always have a finite number of categories.)
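As a rough illustration of the difference, the sketch below contrasts a categorical and a continuous variable stored as plain vectors. The names `hair_color` and `height_cm` and their values are made up for this example.

```r
# Hypothetical data: hair color is categorical, height (in cm) is continuous
hair_color <- c("brown", "black", "brown", "red", "black", "brown")
height_cm <- c(172.4, 168.9, 181.2, 175.0, 169.3, 177.8)

# A categorical variable keeps repeating the same small set of values
unique(hair_color)
n_categories <- length(unique(hair_color))

# A continuous variable can take any value in a range,
# so a statistic such as the mean is meaningful
mean_height <- mean(height_cm)
```

Counting distinct values makes sense for `hair_color`, while averaging only makes sense for `height_cm`.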
`@instructions`
Assign to variable `theory` the value `"factors"`.
`@hint`
Simply assign a variable (`<-`); make sure to capitalize correctly.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Assign to the variable theory what this chapter is about!
```
`@solution`
```{r}
# Assign to the variable theory what this chapter is about!
theory <- "factors"
```
`@sct`
```{r}
msg_undef <- "It looks like you haven't defined the variable `theory`."
msg_incor <- "The value of `theory` looks incorrect. Make sure to assign it the character string `\"factors\"`. Remember that R is case sensitive."
msg_err <- "Make sure that you defined `theory` correctly, using `<-` for assignment."
# If there is an error and theory is undefined, point out the error
ex() %>% check_or(check_error(.,msg_err), check_object(.,"theory") %>% check_equal(eval = FALSE))
ex() %>% check_object("theory", undefined_msg = msg_undef) %>% check_equal(incorrect_msg = msg_incor)
success_msg("Good job! Ready to start? Continue to the next exercise!")
```
---
## What's a factor and why would you use it? (2)
```yaml
type: NormalExercise
key: 6cc21c842b075347926bb1b244782213df32e370
xp: 100
skills:
- 1
```
To create factors in R, you make use of the function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor). The first thing you have to do is create a vector that contains all the observations that belong to a limited number of categories. For example, `sex_vector` contains the sex of 5 different individuals:
```
sex_vector <- c("Male","Female","Female","Male","Male")
```
It is clear that there are two categories, or in R-terms **'factor levels'**, at work here: "Male" and "Female".
The function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor) will encode the vector as a factor:
```
factor_sex_vector <- factor(sex_vector)
```
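To get a feeling for what "encoding" means here, the following sketch inspects the factor after it has been created. It only uses base R functions (`levels()`, `nlevels()`, `as.integer()`); the helper names `level_names` and `level_codes` are chosen just for this illustration.

```r
sex_vector <- c("Male", "Female", "Female", "Male", "Male")
factor_sex_vector <- factor(sex_vector)

# The distinct categories, sorted alphabetically by default
level_names <- levels(factor_sex_vector)

# Internally, each observation is stored as the position of its level
level_codes <- as.integer(factor_sex_vector)

level_names                  # "Female" "Male"
level_codes                  # 2 1 1 2 2
nlevels(factor_sex_vector)   # 2
```

Each element is stored as an integer code that points to one of the levels, which is why R can treat the vector as categorical data rather than as free text.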
`@instructions`
- Convert the character vector `sex_vector` to a factor with `factor()` and assign the result to `factor_sex_vector`.
- Print out `factor_sex_vector` and verify that R prints out the factor levels below the actual values.
`@hint`
Simply use the function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor) on `sex_vector`. Have a look at the assignment; the answer is already there somewhere...
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Sex vector
sex_vector <- c("Male", "Female", "Female", "Male", "Male")
# Convert sex_vector to a factor
factor_sex_vector <-
# Print out factor_sex_vector
```
`@solution`
```{r}
# Sex vector
sex_vector <- c("Male", "Female", "Female", "Male", "Male")
# Convert sex_vector to a factor
factor_sex_vector <- factor(sex_vector)
# Print out factor_sex_vector
factor_sex_vector
```
`@sct`
```{r}
ex() %>% check_object("factor_sex_vector") %>% check_equal(incorrect_msg = "Did you assign the factor of `sex_vector` to `factor_sex_vector`?")
ex() %>% check_output_expr("factor_sex_vector", missing_msg = "Don't forget to print out `factor_sex_vector`!")
success_msg("Great! If you want to find out more about the `factor()` function, do not hesitate to type `?factor` in the console. This will open up a help page. Continue to the next exercise.")
```
---
## What's a factor and why would you use it? (3)
```yaml
type: NormalExercise
key: 5bd4f50afc2c2dbc881e16b8ca94ca56960dff42
xp: 100
skills:
- 1
```
There are two types of categorical variables: a **nominal categorical variable** and an **ordinal categorical variable**.
A nominal variable is a categorical variable without an implied order. This means that it is impossible to say that 'one is worth more than the other'. For example, think of the categorical variable `animals_vector` with the categories `"Elephant"`, `"Giraffe"`, `"Donkey"` and `"Horse"`. Here, it is impossible to say that one stands above or below the other. (Note that some of you might disagree ;-) ).
In contrast, ordinal variables do have a natural ordering. Consider for example the categorical variable `temperature_vector` with the categories: `"Low"`, `"Medium"` and `"High"`. Here it is obvious that `"Medium"` stands above `"Low"`, and `"High"` stands above `"Medium"`.
`@instructions`
Submit the answer to check how R constructs and prints nominal and ordinal variables. Do not worry if you do not understand all the code just yet; we will get to that.
`@hint`
Just submit the answer and look at the console. Notice how R indicates the ordering of the factor levels for ordinal categorical variables.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Animals
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
factor_animals_vector
# Temperature
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE, levels = c("Low", "Medium", "High"))
factor_temperature_vector
```
`@solution`
```{r}
# Animals
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
factor_animals_vector
# Temperature
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE, levels = c("Low", "Medium", "High"))
factor_temperature_vector
```
`@sct`
```{r}
msg <- "Do not change anything about the sample code. Simply submit the answer and inspect the solution!"
ex() %>% check_object("animals_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("temperature_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("factor_animals_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_output_expr("factor_animals_vector", missing_msg = msg)
ex() %>% check_object("factor_temperature_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_output_expr("factor_temperature_vector", missing_msg = msg)
success_msg("Can you already tell what's happening in this exercise? Awesome! Continue to the next exercise and get into the details of factor levels.")
```
---
## Factor levels
```yaml
type: NormalExercise
key: 1aa698978d32d1a0befa4700d7da85a648e1d69e
xp: 100
skills:
- 1
```
When you first get a data set, you will often notice that it contains factors with specific factor levels. However, sometimes you will want to change the names of these levels for clarity or other reasons. R allows you to do this with the function [`levels()`](http://www.rdocumentation.org/packages/base/functions/levels):
```
levels(factor_vector) <- c("name1", "name2",...)
```
A good illustration is the raw data that is provided to you by a survey. A common question for every questionnaire is the sex of the respondent. Here, for simplicity, just two categories were recorded, `"M"` and `"F"`. (You usually need more categories for survey data; either way, you use a factor to store the categorical data.)
```
survey_vector <- c("M", "F", "F", "M", "M")
```
Recording the sex with the abbreviations `"M"` and `"F"` can be convenient if you are collecting data with pen and paper, but it can introduce confusion when analyzing the data. At that point, you will often want to change the factor levels to `"Male"` and `"Female"` instead of `"M"` and `"F"` for clarity.
**Watch out:** the order in which you assign the levels is important. If you type `levels(factor_survey_vector)`, you'll see that it outputs `[1] "F" "M"`. If you don't specify the levels of the factor when creating it, R will automatically sort them alphabetically. To correctly map `"F"` to `"Female"` and `"M"` to `"Male"`, the levels should be set to `c("Female", "Male")`, in this order.
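The pitfall described above can be sketched as follows. The variable names `wrong` and `explicit` are hypothetical, and the second approach (passing both `levels` and `labels` to `factor()`) is an alternative that this exercise does not require.

```r
survey_vector <- c("M", "F", "F", "M", "M")

# Wrong order: the existing levels are "F", "M", so "F" silently becomes "Male"
wrong <- factor(survey_vector)
levels(wrong) <- c("Male", "Female")

# Alternative: state the mapping explicitly when creating the factor
explicit <- factor(survey_vector,
                   levels = c("F", "M"),
                   labels = c("Female", "Male"))

as.character(wrong)      # "Female" "Male" "Male" "Female" "Female"
as.character(explicit)   # "Male" "Female" "Female" "Male" "Male"
```

R raises no error or warning in the first case, so a swapped order can easily go unnoticed.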
`@instructions`
- Check out the code that builds a factor vector from `survey_vector`. You should use `factor_survey_vector` in the next instruction.
- Change the factor levels of `factor_survey_vector` to `c("Female", "Male")`. Mind the order of the vector elements here.
`@hint`
Mind the order in which you have to type in the factor levels. Hint: look at the order in which the levels are printed when typing `levels(factor_survey_vector)`.
`@pre_exercise_code`
```{r}
# Define the vectors used in this exercise
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
```
`@sample_code`
```{r}
# Code to build factor_survey_vector
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
# Specify the levels of factor_survey_vector
levels(factor_survey_vector) <-
factor_survey_vector
```
`@solution`
```{r}
# Code to build factor_survey_vector
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
# Specify the levels of factor_survey_vector
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector
```
`@sct`
```{r}
msg = "Do not change the definition of `survey_vector`!"
test_object("survey_vector", undefined_msg = msg, incorrect_msg = msg)
msg = "Do not change or remove the code to create the factor vector."
ex() %>% check_function("factor", not_called_msg = msg) %>% check_arg('x') %>% check_equal(incorrect_msg = msg)
# MC-note: ideally would want to test assign operator `<-`, and have it highlight whole line.
# MC-note: or negate this test_student_typed, to highlight where they type this incorrect phrase
# test_student_typed('c("Male", "Female")')
ex() %>% check_object("factor_survey_vector") %>% check_equal(eq_condition = "equal", incorrect_msg = paste("Did you assign the correct factor levels to `factor_survey_vector`? Use `levels(factor_survey_vector) <- c(\"Female\", \"Male\")`. Remember that R is case sensitive!"))
success_msg("Wonderful! Proceed to the next exercise.")
```
---
## Summarizing a factor
```yaml
type: NormalExercise
key: a549f13c0644ccc89cd39a10aa48706754637ed0
xp: 100
skills:
- 1
```
After finishing this course, one of your favorite functions in R will be [`summary()`](http://www.rdocumentation.org/packages/base/functions/summary). This will give you a quick overview of the contents of a variable:
```
summary(my_var)
```
Going back to our survey, you would like to know how many `"Male"` responses you have in your study, and how many `"Female"` responses. The [`summary()`](http://www.rdocumentation.org/packages/base/functions/summary) function gives you the answer to this question.
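As a small sketch of how `summary()` adapts to its input, the example below uses a made-up vector `answers`. The names `answers`, `factor_answers` and `answer_counts` are assumptions for illustration only.

```r
# Hypothetical questionnaire answers
answers <- c("yes", "no", "yes", "yes", "maybe")
factor_answers <- factor(answers)

# For a character vector, summary() only describes the vector itself
summary(answers)

# For a factor, summary() counts the observations per level
answer_counts <- summary(factor_answers)
answer_counts
```

The result for the factor is a named vector, so a single count can be looked up with, for example, `answer_counts["yes"]`.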
`@instructions`
Ask for a [`summary()`](http://www.rdocumentation.org/packages/base/functions/summary) of `survey_vector` and `factor_survey_vector`. Interpret the results of both vectors. Are they both equally useful in this case?
`@hint`
Call the [`summary()`](http://www.rdocumentation.org/packages/base/functions/summary) function on both `survey_vector` and `factor_survey_vector`. It's as simple as that!
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector
# Generate summary for survey_vector
# Generate summary for factor_survey_vector
```
`@solution`
```{r}
# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector
# Generate summary for survey_vector
summary(survey_vector)
# Generate summary for factor_survey_vector
summary(factor_survey_vector)
```
`@sct`
```{r}
msg = "Do not change anything about the first few lines that define `survey_vector` and `factor_survey_vector`."
ex() %>% check_object("survey_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("factor_survey_vector", undefined_msg = msg) %>% check_equal(eq_condition = "equal",incorrect_msg = msg)
msg <- "Have you correctly used `summary()` to generate a summary for `%s`?"
ex() %>% check_output_expr("summary(survey_vector)", missing_msg = sprintf(msg, "survey_vector"))
ex() %>% check_output_expr("summary(factor_survey_vector)", missing_msg = sprintf(msg, "factor_survey_vector"))
success_msg("Nice! Have a look at the output. The fact that you identified `\"Male\"` and `\"Female\"` as factor levels in `factor_survey_vector` enables R to show the number of elements for each category.")
```
---
## Battle of the sexes
```yaml
type: NormalExercise
key: 90ecc160d1ebf2f75bf53f9c3843fc1632bdd0a5
xp: 100
skills:
- 1
```
You might wonder what happens when you try to compare elements of a factor. In `factor_survey_vector` you have a factor with two levels: `"Male"` and `"Female"`. But how does R value these relative to each other?
`@instructions`
Read the code in the editor and submit the answer to test if `male` is greater than (`>`) `female`.
`@hint`
Just submit the answer and have a look at the output that gets printed to the console.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
# Male
male <- factor_survey_vector[1]
# Female
female <- factor_survey_vector[2]
# Battle of the sexes: Male 'larger' than female?
male > female
```
`@solution`
```{r}
# Build factor_survey_vector with clean levels
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
# Male
male <- factor_survey_vector[1]
# Female
female <- factor_survey_vector[2]
# Battle of the sexes: Male 'larger' than female?
male > female
```
`@sct`
```{r}
msg = "Do not change anything about the code; simply submit the answer and look at the result."
ex() %>% check_object("survey_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("factor_survey_vector", undefined_msg = msg) %>% check_equal(eq_condition = "equal", incorrect_msg = msg)
ex() %>% check_object("male", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("female", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_output_expr("male > female", missing_msg = msg)
success_msg("How interesting! By default, R returns `NA` when you try to compare values in a factor, since the idea doesn't make sense. Next you'll learn about ordered factors, where more meaningful comparisons are possible.")
```
---
## Ordered factors
```yaml
type: NormalExercise
key: 9ab0928916bf84ab225713a9a1ce40d9e322c6a0
xp: 100
skills:
- 1
```
Since `"Male"` and `"Female"` are unordered (or nominal) factor levels, R returns a warning message, telling you that the greater than operator is not meaningful. As seen before, R attaches an equal value to the levels for such factors.
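The behaviour described above can be reproduced with any unordered factor. The sketch below uses a made-up factor `colors`; `result` and `warning_text` are hypothetical names, and `tryCatch()` is used here only to capture the warning as text.

```r
# Hypothetical unordered (nominal) factor
colors <- factor(c("red", "blue"))

# The comparison is not meaningful, so the result is NA
result <- suppressWarnings(colors[1] > colors[2])
is.na(result)   # TRUE

# Capture the warning message instead of letting it print
warning_text <- tryCatch(colors[1] > colors[2],
                         warning = function(w) conditionMessage(w))
warning_text
```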
But this is not always the case! Sometimes you will also deal with factors that do have a natural ordering between their categories. If this is the case, we have to make sure that we pass this information to R...
Let us say that you are leading a research team of five data analysts and that you want to evaluate their performance. To do this, you track their speed, evaluate each analyst as `"slow"`, `"medium"` or `"fast"`, and save the results in `speed_vector`.
`@instructions`
As a first step, assign `speed_vector` a vector with 5 entries, one for each analyst. Each entry should be either `"slow"`, `"medium"`, or `"fast"`. Use the list below:
- Analyst 1 is medium,
- Analyst 2 is slow,
- Analyst 3 is slow,
- Analyst 4 is medium and
- Analyst 5 is fast.
No need to specify these are factors yet.
`@hint`
Assign to `speed_vector` a vector containing the character strings `"slow"`, `"medium"`, or `"fast"`.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Create speed_vector
speed_vector <-
```
`@solution`
```{r}
# Create speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
```
`@sct`
```{r}
ex() %>% check_object("speed_vector") %>% check_equal(incorrect_msg = "`speed_vector` should be a vector with 5 entries, one for each analyst's speed rating. Don't use capital letters; R is case sensitive!")
success_msg("A job well done! Continue to the next exercise.")
```
---
## Ordered factors (2)
```yaml
type: NormalExercise
key: 279077d10248ce03d5f972939ef8576430a16683
xp: 100
skills:
- 1
```
`speed_vector` should be converted to an ordinal factor since its categories have a natural ordering. By default, the function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor) transforms `speed_vector` into an unordered factor. To create an ordered factor, you have to add two additional arguments: `ordered` and `levels`.
```
factor(some_vector,
ordered = TRUE,
levels = c("lev1", "lev2" ...))
```
By setting the argument `ordered` to `TRUE` in the function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor), you indicate that the factor is ordered. With the argument `levels` you give the values of the factor in the correct order.
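To see why the `levels` argument matters, here is a minimal sketch with a made-up vector `size_vector` (all names in it are hypothetical). Without `levels`, R falls back to alphabetical order, which is rarely the natural one.

```r
# Hypothetical sizes
size_vector <- c("large", "small", "medium", "small")

# Ordered factor with the levels given from smallest to largest
factor_size_vector <- factor(size_vector,
                             ordered = TRUE,
                             levels = c("small", "medium", "large"))
levels(factor_size_vector)       # "small" "medium" "large"
is.ordered(factor_size_vector)   # TRUE

# Leaving out levels gives an alphabetical, and here misleading, order
alphabetical <- factor(size_vector, ordered = TRUE)
levels(alphabetical)             # "large" "medium" "small"
```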
`@instructions`
From `speed_vector`, create an ordered factor vector: `factor_speed_vector`. Set `ordered` to `TRUE`, and set `levels` to `c("slow", "medium", "fast")`.
`@hint`
Use the function [`factor()`](http://www.rdocumentation.org/packages/base/functions/factor) to create `factor_speed_vector` based on `speed_vector`. The argument `ordered` should be set to `TRUE` since there is a natural ordering. Also, set `levels = c("slow", "medium", "fast")`.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Create speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
# Convert speed_vector to ordered factor vector
factor_speed_vector <-
# Print factor_speed_vector
factor_speed_vector
summary(factor_speed_vector)
```
`@solution`
```{r}
# Create speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
# Convert speed_vector to ordered factor vector
factor_speed_vector <- factor(speed_vector, ordered = TRUE, levels = c("slow", "medium", "fast"))
# Print factor_speed_vector
factor_speed_vector
summary(factor_speed_vector)
```
`@sct`
```{r}
msg = "Do not change anything about the command that specifies the `speed_vector` variable."
ex() %>% check_object("speed_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_function("factor") %>% {
check_arg(., 'x') %>% check_equal(incorrect_msg="The first argument you pass to `factor()` should be `speed_vector`.")
check_arg(., 'ordered') %>% check_equal(incorrect_msg="Make sure to set `ordered = TRUE` inside your call of `factor()`.")
check_arg(., 'levels') %>% check_equal(incorrect_msg="Make sure to set `levels = c(\"slow\", \"medium\", \"fast\")` inside your call of `factor()`.")
}
ex() %>% check_object("factor_speed_vector") %>% check_equal(eq_condition = "equal", incorrect_msg="There's still something wrong with `factor_speed_vector`; make sure to only pass `speed_vector`, `ordered = TRUE` and `levels = c(\"slow\", \"medium\", \"fast\")` inside your call of `factor()`.")
success_msg("Great! Have a look at the console. It now shows that the levels have an order, indicated by the `<` sign. Continue to the next exercise.")
```
---
## Comparing ordered factors
```yaml
type: NormalExercise
key: db16e69805625bcfde227743a8cbc985f8482a37
xp: 100
skills:
- 1
```
Having a bad day at work, 'data analyst number two' enters your office and starts complaining that 'data analyst number five' is slowing down the entire project. Since you know that 'data analyst number two' has the reputation of being a smarty-pants, you first decide to check if their statement is true.
The fact that `factor_speed_vector` is now ordered enables us to compare different elements (the data analysts in this case). You can simply do this by using the well-known operators.
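As a sketch of what those comparisons look like, the example below uses a made-up ordered factor `sizes` rather than the exercise data. All names in it are hypothetical.

```r
# Hypothetical ordered factor
sizes <- factor(c("large", "small", "medium", "small"),
                ordered = TRUE,
                levels = c("small", "medium", "large"))

# Compare two elements: "large" versus "small"
first_bigger <- sizes[1] > sizes[2]
first_bigger    # TRUE

# Compare every element against a single level
below_medium <- sizes < "medium"
below_medium    # FALSE TRUE FALSE TRUE

# Functions such as max() also respect the order
largest <- max(sizes)
as.character(largest)   # "large"
```

The comparisons follow the order given in `levels`, not the alphabetical order of the labels.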
`@instructions`
- Use `[2]` to select from `factor_speed_vector` the factor value for the second data analyst. Store it as `da2`.
- Use `[5]` to select the `factor_speed_vector` factor value for the fifth data analyst. Store it as `da5`.
- Check if `da2` is greater than `da5`; simply print out the result. Remember that you can use the `>` operator to check whether one element is larger than the other.
`@hint`
- To select the factor value for the third data analyst, you'd need `factor_speed_vector[3]`.
- To compare two values, you can use `>`. For example: `da3 > da4`.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Create factor_speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
factor_speed_vector <- factor(speed_vector, ordered = TRUE, levels = c("slow", "medium", "fast"))
# Factor value for second data analyst
da2 <-
# Factor value for fifth data analyst
da5 <-
# Is data analyst 2 faster than data analyst 5?
```
`@solution`
```{r}
# Create factor_speed_vector
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
factor_speed_vector <- factor(speed_vector, ordered = TRUE, levels = c("slow", "medium", "fast"))
# Factor value for second data analyst
da2 <- factor_speed_vector[2]
# Factor value for fifth data analyst
da5 <- factor_speed_vector[5]
# Is data analyst 2 faster than data analyst 5?
da2 > da5
```
`@sct`
```{r}
msg = "Do not change anything about the commands that define `speed_vector` and `factor_speed_vector`!"
ex() %>% check_object("speed_vector", undefined_msg = msg) %>% check_equal(incorrect_msg = msg)
ex() %>% check_object("factor_speed_vector", undefined_msg = msg) %>% check_equal(eq_condition = "equal", incorrect_msg = msg)
msg <- "Have you correctly selected the factor value for the %s data analyst? You can use `factor_speed_vector[%s]`."
ex() %>% check_object("da2") %>% check_equal(eq_condition = "equal", incorrect_msg = sprintf(msg,"second", "2"))
ex() %>% check_object("da5") %>% check_equal(eq_condition = "equal", incorrect_msg = sprintf(msg, "fifth", "5"))
ex() %>% check_output_expr("da2 > da5", missing_msg = "Have you correctly compared `da2` and `da5`? You can use the `>`. Simply print out the result.")
success_msg("Bellissimo! What does the result tell you? Data analyst two is complaining about data analyst five while in fact they are the one slowing everything down! This concludes the chapter on factors. With a solid basis in vectors, matrices and factors, you're ready to dive into the wonderful world of data frames, a very important data structure in R!")
```