---
title: "Advanced R Markdown"
author: "BBL"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    df_print: paged
    toc: true
    toc_float: true
    code_folding: hide
---
```{r arm-setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
theme_set(theme_bw())
```
## Topics and goals
<------- Topics are to the left
**This workshop assumes (but ask questions as needed!) you are:**
* familiar with basic data structures and control structure of R: `data.frame`, `for`, function calls, parameters, etc.
* comfortable with the idea, and basic mechanics, of R Markdown documents: how to make them, chunks and chunk options, inline code
* aware of terms like [HTML](https://en.wikipedia.org/wiki/HTML)
**Goal: exposure to a variety of more advanced R Markdown techniques and tricks.**
Note that this is NOT intended to be a comprehensive survey of the possibilities with R Markdown.
## Under the hood {.tabset}
Disclaimer: I'm not an expert, and this quickly gets _really_ complex.
* `rmarkdown` is an R package for converting R Markdown documents into a variety of output formats
* Its `render()` function processes R Markdown input, creating a [Markdown](https://en.wikipedia.org/wiki/Markdown) (`*.md`) file
* This uses `knitr`, an R package for dynamic report generation with R.
* This is then transformed into HTML by [pandoc](https://pandoc.org/)
* R Markdown files have a [YAML](https://en.wikipedia.org/wiki/YAML) header giving configuration options that can apply to many stages of this pipeline
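
The two stages above can also be run by hand, which sometimes helps when debugging. The following is a minimal sketch, not a description of how `render()` is implemented internally. The temporary file and its contents are made up for illustration, and it assumes that `knitr`, `rmarkdown`, and pandoc are installed.

```r
# Hypothetical input: a tiny R Markdown file written to a temporary location
fence <- strrep("`", 3)  # three backticks, built indirectly to keep this example readable
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c("# Demo", "", paste0(fence, "{r}"), "6 * 7", fence), rmd_file)

# Stage 1: knitr runs the R chunks and writes a plain Markdown file
md_file <- knitr::knit(rmd_file, output = sub("\\.Rmd$", ".md", rmd_file), quiet = TRUE)

# Stage 2: pandoc converts the Markdown file into HTML
html_file <- sub("\\.md$", ".html", md_file)
if (rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(md_file, to = "html", output = html_file)
}
```

Opening the intermediate `.md` file is often the quickest way to see whether a problem comes from the R code or from the conversion step.
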
### R Markdown workflow
<img src="images-rmarkdown/workflow.png" width = "100%">
Original graphic from [The R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/rmarkdown-process.html)
### Areas we'll be discussing
<img src="images-rmarkdown/workflow-annotated.png" width = "100%">
Original graphic from [The R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/rmarkdown-process.html)
## HTML goodies
### TOC and code folding
Here's the YAML header for this presentation:
```
---
title: "Advanced R Markdown"
author: "BBL"
date: "`r "\u0060r format(Sys.time(), '%d %B %Y')\u0060"`"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: hide
---
```
**Things to notice:**
* The `date` field has inline R code to dynamically insert the current date
* The `html_document` setting for `output:` has three sub-settings:
+ `toc: true` generates a table of contents (based on header lines such as `#` and `##`; the depth is controlled by `toc_depth`)
+ `toc_float: true` makes it 'floating'
+ `code_folding: hide` turns on code folding with a default of hidden code
```{r code-folding, fig.height=3}
ggplot(diamonds, aes(carat, fill = cut)) +
  geom_density(position = "stack")
```
### Tabs {.tabset}
You can use tabs to organize your content:
```
## Tabs {.tabset}
### Tab 1 name
(content)
### Tab 2 name
(content)
```
#### Cars
```{r cars}
plot(cars$speed, cars$dist)
```
#### Iris
```{r iris}
pairs(iris)
```
#### Volcano
```{r volcano}
image(volcano)
```
### Printing data frames
For HTML output only, you can add the `df_print: paged` parameter to your YAML header to
have printed data frames rendered as HTML tables.
```
output:
  html_document:
    df_print: paged
```
```{r mtcars}
mtcars
```
### Equations
Equations are (mostly) straightforward and based on [LaTeX mathematical typesetting](https://www.latex-project.org/):
R Markdown | Final document
----------------------- | ------------------
`$x^{n}$` | $x^{n}$
`$\frac{a}{b}$` | $\frac{a}{b}$
`$\sum_{n=1}^{10} n^2$` | $\sum_{n=1}^{10} n^2$
`$\sigma \Sigma$` | $\sigma \Sigma$
A handy summary is [here](https://www.calvin.edu/~rpruim/courses/s341/S17/from-class/MathinRmd.html).
_Extremely_ usefully, the RStudio editor has an equation preview feature.
<img src="images-rmarkdown/editor-eq-preview.png" width = "75%">
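
The examples in the table are all inline math. For a displayed equation on its own line, the dollar signs are doubled. The formula here (the sample mean) is only an illustration:

```
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
```

which renders as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
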
### Static image files
These are inserted with a bit of HTML, e.g. for the image above:
`<img src="images-rmarkdown/editor-eq-preview.png" width = "75%">`
There are lots of options that can be applied here, including size, whether the image floats, its justification, etc. See the `img` tag [documentation](https://www.w3schools.com/tags/tag_img.asp).
### Themes {.tabset}
This quickly gets confusing (to me anyway).
#### Bootswatch themes
These are built into `rmarkdown`, so they are easy to use; themes are from the Bootswatch theme [library](https://bootswatch.com/3/). Just insert lines into your YAML header:
```
output:
  html_document:
    theme: sandstone
    highlight: tango
```
<img src="images-rmarkdown/theme-default.png" width = "50%">
<img src="images-rmarkdown/theme-sandstone.png" width = "50%">
#### rmdformats
When the [rmdformats](https://github.com/juba/rmdformats) package is installed, it allows us to create R Markdown documents using _very_ different themes.
```
output:
  rmdformats::readthedown:
    highlight: kate
```
<img src="images-rmarkdown/theme-reaedthedown.png" width = "100%">
There's also the [prettydoc](https://prettydoc.statr.me/) package.
#### Custom CSS
You can use a custom [Cascading Style Sheet (CSS)](https://en.wikipedia.org/wiki/Cascading_Style_Sheets) file. You're on your own here :)
[![](images-rmarkdown/black-hole.jpg)](https://en.wikipedia.org/wiki/Black_hole)
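
As a starting point, a stylesheet is attached through the `css` option of `html_document`. A minimal sketch, in which `styles.css` is a hypothetical file sitting next to the R Markdown document:

```
output:
  html_document:
    css: styles.css
```
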
## knitr tricks
<img src="images-rmarkdown/grandma.jpg" width = "25%" style="float: right;">
### combine_words
The first 10 letters are `` `r "\u0060r knitr::combine_words(LETTERS[1:10])\u0060"` ``.
The first 10 letters are `r knitr::combine_words(LETTERS[1:10])`.
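
`combine_words()` also takes a few optional arguments. A short sketch, using arbitrary example vectors:

```r
fruits <- c("apples", "bananas", "cherries")

# Default behaviour: comma-separated, with "and" before the last item
knitr::combine_words(fruits)

# Use a different conjunction
knitr::combine_words(fruits, and = " or ")

# Wrap each item in backticks, handy when listing column names in prose
knitr::combine_words(names(cars), before = "`")
```
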
### Chunk defaults
Most R Markdown documents (including this one) have a first chunk that, among other things, sets the _default chunk options_:
```
knitr::opts_chunk$set(echo = TRUE)
```
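
Several defaults can be set in one call, and the current values can be read back. The particular values below are only an example, not a recommendation:

```r
# Set several defaults at once; individual chunks can still override them
knitr::opts_chunk$set(
  echo = FALSE,      # hide code by default
  message = FALSE,   # suppress messages such as package startup notes
  warning = FALSE,   # suppress warnings
  fig.width = 6,
  fig.height = 4
)

# Inspect a current default
knitr::opts_chunk$get("fig.width")
```
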
### Computable chunk options
Chunk options can take non-constant values; in fact, they can take values _from arbitrary R expressions_:
````markdown
`r ''````{r}
# Define a global figure width value
my_fig_width <- 7
```
`r ''````{r, fig.width = my_fig_width}
plot(cars)
```
````
An example of R code in a chunk option setting:
````markdown
`r ''````{r}
width_small <- 4
width_large <- 7
small_figs <- TRUE
```
`r ''````{r, fig.width = if(small_figs) width_small else width_large}
plot(cars)
```
````
Here's a chunk that only executes when a particular package is available:
````markdown
`r ''````{r, eval = require("ggplot2")}
ggplot2::ggplot(cars, aes(speed, dist)) + geom_point()
```
````
More information [here](https://bookdown.org/yihui/rmarkdown-cookbook/chunk-variable.html).
### Child documents
R Markdown documents may be split, with a primary document incorporating others via a [child document](https://bookdown.org/yihui/rmarkdown-cookbook/child-document.html) mechanism.
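
A minimal sketch of the mechanism: an otherwise empty chunk whose `child` option names the file to pull in. Here `methods.Rmd` is a hypothetical file name.

````markdown
`r ''````{r, child = "methods.Rmd"}
```
````
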
### Caching
Don't forget about the `cache=TRUE` chunk option, which is critical for keeping the build time of longer, complex documents under control.
[![](images-rmarkdown/two-hard-things.png)](https://twitter.com/codinghorror/status/506010907021828096?s=20)
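
A sketch of a cached chunk follows; the chunk label and the `Sys.sleep()` call are stand-ins for a genuinely slow computation. On later builds the stored results are reused, and the chunk is re-run only if its code or options change.

````markdown
`r ''````{r slow-model, cache = TRUE}
# Stand-in for an expensive computation
Sys.sleep(5)
fit <- lm(dist ~ speed, data = cars)
```
````
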
### Line breaks
_Two trailing spaces_ are used to force a line break:
This line **does not** have two spaces at the end.
The following line.
This line has two spaces at the end.
The following line.
(This is actually part of the [Markdown spec](https://daringfireball.net/projects/markdown/syntax).)
## Programmatic reports
What if I want to run the same analysis, and/or generate the same report, for different datasets or conditions?
<img src="images-rmarkdown/dataset_reports.png" width = "100%">
This offers the possibility of tremendously extending the utility of `rmarkdown`!
### Parameters
R Markdown documents can take parameters. These are specified in the YAML header as a name followed by a default value:
```
params:
  cut: NULL
  min_price: 0
```
and can then be accessed by code in the document, via a read-only list called `params`:
```
print(params$min_price)
```
_Let's go make an R Markdown document_ that takes one or more parameters, for example to produce a report on some part of the `diamonds` dataset.
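
One possible shape for such a document, saved as `diamonds-report.Rmd` (the file name used in the `render()` call in the next section). This is a sketch only: the title, the default of `"Ideal"` for `cut`, and the plot are all assumptions made for illustration.

````markdown
---
title: "Diamonds report"
output: html_document
params:
  cut: "Ideal"
  min_price: 0
---

`r ''````{r}
library(ggplot2)
# Keep only the rows matching the parameters
d <- subset(diamonds, cut == params$cut & price >= params$min_price)
ggplot(d, aes(carat, price)) + geom_point()
```
````
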
### render
So far so good, but how do we _use_ this capability programmatically?
The `rmarkdown::render()` function converts an input file to an output format, usually calling `knitr::knit()` and pandoc along the way.
```
rmarkdown::render("diamonds-report.Rmd",
                  params = list(cut = "Ideal"),
                  output_file = "Ideal.html")
```
_Let's go make a driver script_ that generates an output file for each diamond cut in the dataset.
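
A driver script along these lines might look as follows. It is a sketch under a few assumptions: `diamonds-report.Rmd` sits in the working directory and accepts a `cut` parameter, and replacing spaces in the output file names is just a convenience.

```r
library(ggplot2)  # provides the diamonds dataset

report <- "diamonds-report.Rmd"  # assumed to sit in the working directory
cuts <- levels(diamonds$cut)

# One output file per cut; spaces are replaced to keep the file names tidy
output_files <- paste0(gsub(" ", "_", cuts), ".html")

# Render only if the report exists, so the script fails gracefully otherwise
if (file.exists(report)) {
  for (i in seq_along(cuts)) {
    rmarkdown::render(report,
                      params = list(cut = cuts[i]),
                      output_file = output_files[i],
                      quiet = TRUE)
  }
}
```
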
### Working directory issues
Because R Markdown files are parsed in a _separate_ R instance, the [working directory](https://bookdown.org/ndphillips/YaRrr/the-working-directory.html) is the _location of your R Markdown file_.
Don't mess with it via `setwd()`.
**Don't mess with it via `setwd()`.**
<img src="images-rmarkdown/setwd-is-bad.png" width = "100%">
>If the first line of your #rstats script is setwd("C:\Users\jenny\path\that\only\I\have"), I will come into your lab and SET YOUR COMPUTER ON FIRE. [Source](https://twitter.com/hadleywickham/status/940021008764846080)
**It's almost _always_ much better to use relative paths.** Absolute paths aren't robust and break reproducibility and transportability.
Note that `render` has an `output_dir` parameter.
Finally, check out the [here](https://github.com/jennybc/here_here) package, which
tries to figure out the top level of your current project using some sane heuristics.
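
As a sketch of the difference, assuming a hypothetical project layout with a `data/diamonds.csv` file: a relative path depends on the current working directory, whereas `here::here()` builds the path from the project root.

```r
# Relative path: portable, but resolved against the current working directory
csv_path <- file.path("data", "diamonds.csv")

# With the here package installed, build the path from the project root instead
if (requireNamespace("here", quietly = TRUE)) {
  csv_path <- here::here("data", "diamonds.csv")
}

csv_path
```
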
## Neat R packages
### plotly
Interactive graphics.
```{r plotly, message=FALSE}
library(plotly)
p <- ggplot(mtcars, aes(hp, mpg, size = cyl, color = disp)) + geom_point()
ggplotly(p)
```
### DT
Handy if you want to sort or filter your table data.
```{r DT, message=FALSE}
library(DT)
datatable(mtcars, rownames = TRUE, filter = "top",
          options = list(pageLength = 5, scrollX = TRUE))
```
Example based on [this post](https://holtzy.github.io/Pimp-my-rmd/#use_dt_for_tables).
### reactable
I haven't used the `reactable` package but it can make cool tables, and link those tables to data visualizations:
```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(sparkline)
library(reactable)
data <- chickwts %>%
  group_by(feed) %>%
  summarise(weight = list(weight)) %>%
  mutate(boxplot = NA, sparkline = NA)
reactable(data, columns = list(
  weight = colDef(cell = function(values) {
    sparkline(values, type = "bar", chartRangeMin = 0, chartRangeMax = max(chickwts$weight))
  }),
  boxplot = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]], type = "box")
  }),
  sparkline = colDef(cell = function(value, index) {
    sparkline(data$weight[[index]])
  })
))
```
More information [here](https://glin.github.io/reactable/).
### leaflet
I really like the simplicity of the `leaflet` package.
```{r leaflet, out.width='100%'}
library(leaflet)
leaflet() %>%
  addTiles() %>%
  setView(-76.9219, 38.9709, zoom = 17) %>%
  addPopups(-76.9219, 38.9709,
            "Here is the <b>Joint Global Change Research Institute</b>")
```
## Citations and references
We might want to include citations. This is surprisingly easy; the source
```
In a subsequent paper [@Bond-Lamberty2009-py], we used the
same model outputs to examine the _hydrological_ implications
of these wildfire regime shifts [@Nolan2014-us].
Nolan et al. [-@Nolan2014-us] found that...
```
becomes:
>In a subsequent paper (Bond-Lamberty et al. 2009), we used the same model outputs to examine the _hydrological_ implications of these wildfire regime shifts (Nolan et al. 2014). Nolan et al. (2014) found that...
>
>**References**
>
>Bond-Lamberty, Ben, Scott D Peckham, Stith T Gower, and Brent E Ewers. 2009. “Effects of Fire on Regional Evapotranspiration in the Central Canadian Boreal Forest.” Glob. Chang. Biol. 15 (5): 1242–54.
>
>Nolan, Rachael H, Patrick N J Lane, Richard G Benyon, Ross A Bradstock, and Patrick J Mitchell. 2014. “Changes in Evapotranspiration Following Wildfire in Resprouting Eucalypt Forests.” Ecohydrol. 6 (January). Wiley Online Library.
To do this we include a new field in (of course) the YAML header, for example:
```
---
bibliography: bibliography.json
---
```
While `*.json` is preferred, a wide variety of file formats can be accommodated:
Format | File extension
----------- | -------
CSL-JSON | .json
MODS | .mods
BibLaTeX | .bib
BibTeX | .bibtex
RIS | .ris
EndNote | .enl
EndNote XML | .xml
ISI | .wos
MEDLINE | .medline
Copac | .copac
More details can be found [here](https://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html).
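
The citation style can be changed in the same place. A sketch, in which `ecology.csl` is a hypothetical Citation Style Language file placed next to the document:

```
---
bibliography: bibliography.json
csl: ecology.csl
link-citations: true
---
```

Here `csl` selects the formatting of citations and the reference list, and `link-citations` turns each in-text citation into a link to its entry.
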
## Bookdown
Larger projects can become difficult to manage in a single R Markdown file (or even
one with child files).
The [bookdown](https://bookdown.org/yihui/rmarkdown/books.html) package (by the same [author](https://yihui.org/en/) as `rmarkdown`) offers several key improvements:
* Books and reports can be built from multiple R Markdown files
* Documents can easily be exported in a range of formats suitable for publishing, including PDF, e-books and HTML websites
* Additional formatting features are added, such as cross-referencing, and numbering of figures, equations, and tables
The last of these is so useful that it's available in R Markdown as well:
```
output: bookdown::html_document2
```
````markdown
`r ''````{r cars-plot, fig.cap = "An amazing plot"}
plot(cars)
```
````
````markdown
`r ''````{r mtcars-plot, fig.cap = "Another amazing plot"}
plot(mpg ~ hp, mtcars)
```
````
```
See Figure \@ref(fig:cars-plot).
```
>See Figure 1.
Theorems, equations, and tables can also be cross-referenced; see [the documentation](https://bookdown.org/yihui/bookdown/cross-references.html).
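
For tables the pattern is the same, with a `tab:` prefix in place of `fig:`. A minimal sketch, assuming the `bookdown::html_document2` output format shown above; the chunk label and caption are made up for illustration.

````markdown
`r ''````{r mtcars-table}
knitr::kable(head(mtcars), caption = "The first rows of mtcars")
```

See Table \@ref(tab:mtcars-table).
````
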
## Resources
Good resources:
* The [R Markdown Cheat Sheet](https://rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)
* The [R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/)
* [15 Tips on Making Better Use of R Markdown](https://slides.yihui.org/2019-dahshu-rmarkdown#1)
* [How to Make Beautiful Tables in R](https://rfortherestofus.com/2019/11/how-to-make-beautiful-tables-in-r/)
* [Bookdown](https://bookdown.org/)
## The End
Thanks for attending this workshop on Advanced R Markdown! I hope it was useful.
This presentation was made using R Markdown version `r packageVersion("rmarkdown")` running under `r R.version.string`. It is available at https://rpubs.com/bpbond/630335. The code is [here](https://github.com/JGCRI/Rworkshops/blob/master/Advanced_RMarkdown.Rmd).
```{r sessionInfo}
sessionInfo()
```