Remove whitespace from code chunks, change highlight theme
sophie-a-lee committed Jul 1, 2024
1 parent cc25d47 commit 6f5e406
Showing 12 changed files with 36 additions and 26 deletions.
Binary file added data/data_description.pdf
Binary file not shown.
Binary file modified exercise-solutions.pdf
Binary file not shown.
Binary file added images/github_qr.png
23 changes: 11 additions & 12 deletions session1_notes.Rmd
@@ -5,6 +5,7 @@ author: Sophie Lee
fontfamily: lmodern
output:
pdf_document:
highlight: tango
toc: TRUE
latex_engine: xelatex
fig_height: 4
@@ -21,7 +22,7 @@ url_color: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE)
```

\newpage
@@ -42,7 +43,7 @@ The screenshot below shows the RStudio interface which comprises of four windows
![RStudio interface](images/rstudio_ide.png)
\newpage

#### Window A: R script files
**Window A: R script files**

All analysis and actions in R are carried out using the R syntax language. R script files allow you to write and edit code before running it in the console window.

@@ -55,31 +56,30 @@ The main advantage of using the script file rather than entering the code direct
Past script files can be opened using *File -> Open File…* from the drop-down menu or by clicking the ![open icon](images/open_shortcut.png) icon and selecting a `.R` file. The keyboard shortcut to open an existing script file is `Ctrl + o` on Windows, and `Command + o` on Macs.


#### Window B: The R console
**Window B: The R console**

The R console window is where all commands run from the script file, results (other than plots), and messages, such as errors, are displayed. Commands can be written directly into the R console after the `>` symbol and executed using `Enter` on the keyboard. It is not recommended to write code directly into the console as it cannot be saved or replicated.

Every time a new R session is opened, details about version and citations of R will be given by default. To clear text from the console window, use the keyboard shortcut `control + l` (this is the same for both Windows and Mac users). Be aware that this clears all text from the console, including any results. Before running this command, check that any results can be replicated within the script file.

\newpage

#### Window C: Environment and history
**Window C: Environment and history**

This window lists all data and objects currently loaded into R. More details on the types of objects and how to use the Environment window are given in later sections.

#### Window D: Files, plots, packages and help
**Window D: Files, plots, packages and help**

This window has many potential uses: graphics are displayed and can be saved from here, and R help files will appear here. This window is only available in the RStudio interface and not in the basic R package.


## Exercise 1

1. Open a new script file if you have not already done so.
2. Save this script file into an appropriate location.

\newpage

# Chapter 2: R syntax

All analyses within R are carried out using **syntax**, the R programming language. It is important to note that R is case-sensitive, so always ensure that you use the correct combination of upper and lower case letters when running functions or calling objects.
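
As a small illustration of case sensitivity, the sketch below uses the built-in `sqrt` function (chosen here only as an example; it is not taken from the course material). The correctly spelled call runs, while the capitalised version would stop with an error:

```r
# sqrt() is a built-in function and is spelled entirely in lower case
sqrt(16)

# Changing the case gives a different, undefined name. Running the line
# below (without the leading #) would stop with an error, so it is left
# commented out
# Sqrt(16)
```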

Any text written in the R console or script file can be treated the same as text from other documents or programmes: text can be highlighted, copied and pasted to make coding more efficient.
@@ -109,14 +109,13 @@ The choice of brackets in R coding is particularly important as they all have di
All standard notation for mathematical calculations (`+`, `-`, `*`, `/`, `^`, etc.) is compatible with R. At its simplest level, R is just a very powerful calculator!
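
For instance, each of the operators above can be typed into a script file and run one line at a time. The numbers below are arbitrary illustrations rather than values from the course data:

```r
7 + 3         # addition
7 - 3         # subtraction
7 * 3         # multiplication
7 / 2         # division
2 ^ 5         # exponentiation (2 to the power of 5)
(7 + 3) * 2   # round brackets control the order of operations
```

Text after the `#` symbol is a comment, so R ignores it when the line is run.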

## Exercise 2

1. Add your name and the date to the top of your script file (hint: comment this out so R does not try to run it)
2. Use R to perform the following calculations. Add the result to the same line of the script file in a way that ensures there are no errors in the code.
a. $64^2$
b. $3432 \div 8$
c. $96 \times 72$

When you have finished this exercise, select the entire script file (using `Ctrl + a` on Windows or `Command + a` on Mac) and run it to ensure there are no errors in the code.
When you have finished this exercise, select the entire script file (using `ctrl + a` on Windows or `Command + a` on Mac) and run it to ensure there are no errors in the code.

\newpage

@@ -125,17 +124,17 @@ When you have finished this exercise, select the entire script file (using `Ctrl
## 3.1 Objects
One of the main advantages to using R over other software packages such as SPSS is that more than one dataset can be accessed at the same time. A collection of data stored in any format within the R session is known as an **object**. Objects can include single numbers, single variables, entire datasets, lists of datasets, or even tables and graphs.

Objects are defined in R using the `<-` symbol or `=`. For example,
Objects are defined in R using the `<-` symbol. For example,

```{r assign qone}
object_1 <- 81
```

Creates an object in the environment named `object_1`, which takes the value `81`. This will appear in the environment window of the console (window C from the interface shown in the first chapter).
Creates an object in the environment named `object_1`, which takes the value `81`. This will appear in the environment window of the console (window C from the interface shown earlier).

To retrieve an object, type its name into the script or console and run it. This object can then be included in functions or operations in place of the value assigned to it:

```{r qone}
```{r qone, collapse = TRUE}
object_1
object_1 * 2
Binary file modified session1_notes.pdf
Binary file not shown.
15 changes: 9 additions & 6 deletions session2_notes.Rmd
@@ -5,6 +5,7 @@ author: "Sophie Lee"
fontfamily: lmodern
output:
pdf_document:
highlight: tango
toc: TRUE
latex_engine: xelatex
fig_caption: FALSE
@@ -19,7 +20,7 @@ url_color: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE, message = FALSE)
library(tidyverse)
```
@@ -32,7 +33,9 @@ Up to this point, we have not thought about the style of R coding we will be usi

The choice of R 'dialect' depends on personal preference. Some prefer to use the 'base R' approach that does not rely on any packages that may need updating, making it a more stable approach. However, base R can be difficult to read for those not comfortable with coding.

![boyfriend tidyverse meme](images/r_meme.png)
```{r boyfriend meme, echo=FALSE, out.width="75%"}
knitr::include_graphics("images/r_meme.png")
```

The alternative approach that we will be adopting in this course is the 'tidyverse' approach. Tidyverse is a set of packages that have been designed to make R coding more readable and efficient. They have been designed with reproducibility in mind, which means there is a wealth of online (mostly free), well-written resources available to help use these packages.
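
To make the difference between the two dialects concrete, here is a minimal sketch that carries out the same task both ways. It uses the built-in `mtcars` dataset rather than the course data, and the choice of columns is purely illustrative. The details of each function do not matter at this stage; the point is how the two styles read:

```r
library(dplyr)

# Base R: keep cars with 4 cylinders and only the mpg and hp columns
base_result <- mtcars[mtcars$cyl == 4, c("mpg", "hp")]

# Tidyverse: the same steps written as a pipeline of named verbs
tidy_result <- mtcars %>%
  filter(cyl == 4) %>%
  select(mpg, hp)

# Both approaches return the same number of rows and columns
dim(base_result)
dim(tidy_result)
```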

@@ -145,13 +148,13 @@ select(csp_2020, ons_code:region)

The `select` function can also be combined with a number of 'selection helper' functions that help us select variables based on naming conventions:

`starts_with("xyz")` returns all variables with names beginning `xyz`
`ends_with("xyz")` returns all variables with names ending `xyz`
`contains("xyz")` returns all variables that have `xyz` within their name
- `starts_with("xyz")` returns all variables with names beginning `xyz`
- `ends_with("xyz")` returns all variables with names ending `xyz`
- `contains("xyz")` returns all variables that have `xyz` within their name

Or based on whether they match a condition:

`where(is.numeric)` returns all variables that are classed as numeric
- `where(is.numeric)` returns all variables that are classed as numeric

For a full list of these selection helpers, access the helpfile using `?tidyr_tidy_select`.
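
The helpers listed above can be tried on a small made-up dataset. In the sketch below, `toy_data` and its column names are hypothetical and are not part of the course data:

```r
library(dplyr)

# A small made-up dataset; the object and column names are hypothetical
toy_data <- tibble(
  area_code  = c("E01", "E02"),
  area_name  = c("North", "South"),
  spend_2019 = c(10.5, 20.1),
  spend_2020 = c(11.2, 19.8)
)

select(toy_data, starts_with("area"))   # area_code and area_name
select(toy_data, ends_with("2020"))     # spend_2020 only
select(toy_data, contains("_20"))       # spend_2019 and spend_2020
select(toy_data, where(is.numeric))     # the two numeric spend columns
```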

Binary file modified session2_notes.pdf
Binary file not shown.
13 changes: 9 additions & 4 deletions session3_notes.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ author: "Sophie Lee"
fontfamily: lmodern
output:
pdf_document:
highlight: tango
toc: TRUE
latex_engine: xelatex
fig_height: 4
@@ -21,7 +22,7 @@ url_color: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(echo = TRUE, collapse = FALSE, message = FALSE)
library(tidyverse)
```
\newpage
@@ -96,7 +97,7 @@ csp_201520 <- list.files(path = "data", pattern = "CSP_20") %>%

The dataset containing core spending power in England between 2015 and 2020 is currently in what is known as **wide format**. This means there is a variable per measure per year, making the object very wide.

Some analyses and visualisations, particularly those used for temporal data, require a time variable in the dataset (for example, year). This requires the data to be in a different format, known as**long format**. Long format is where each row contains an observation per year (making the data much longer and narrower).
Some analyses and visualisations, particularly those used for temporal data, require a time variable in the dataset (for example, year). This requires the data to be in a different format, known as **long format**. Long format is where each row contains an observation per year (making the data much longer and narrower).

To convert data between wide and long formats, we can use the tidyverse functions `pivot_longer` and `pivot_wider`.
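
Before applying these functions to the full dataset, it may help to see them on a toy example. The sketch below assumes a made-up data frame, `toy_wide`, whose column names follow the same `measure_year` pattern as the course data; all values are invented:

```r
library(tidyr)

# Made-up wide data: one column per measure per year
toy_wide <- data.frame(
  authority = c("A", "B"),
  sfa_2015  = c(10, 20),
  sfa_2016  = c(11, 21),
  ct_2015   = c(5, 6),
  ct_2016   = c(7, 8)
)

# Wide to long: one row per authority per year, one column per measure
toy_long <- pivot_longer(toy_wide,
                         cols = sfa_2015:ct_2016,
                         names_to = c(".value", "year"),
                         names_pattern = "(.*)_(....)")
toy_long

# Long back to wide: the year is pasted back onto each measure name
toy_wide_again <- pivot_wider(toy_long,
                              names_from = year,
                              values_from = c(sfa, ct))
toy_wide_again
```

Here `toy_long` has four rows (two authorities by two years) and the columns `authority`, `year`, `sfa` and `ct`, with `year` stored as text because it was extracted from the column names.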

@@ -113,9 +114,13 @@ Using a combination of the helpfile (`?pivot_longer`) and vignette, the argument
csp_long <- pivot_longer(csp_201520,
# Pivot columns sfa_2015 up to and including rsdg_2020
cols = sfa_2015:rsdg_2020,
# Separate the old variable names in two, keep the prefix as it was, and put the suffix into a new variable, year
# Separate the old variable names in two,
# keep the prefix as it was, and put the suffix
# into a new variable, year
names_to = c(".value", "year"),
# The name prefix and suffix were separated by an _, the prefix can take different lengths, the suffix is always the final 4 characters
# The name prefix and suffix were separated by an _,
# the prefix can take different lengths, the suffix
# is always the final 4 characters
names_pattern = "(.*)_(....)")
# Check the new, long dataset's structure
Binary file modified session3_notes.pdf
Binary file not shown.
11 changes: 7 additions & 4 deletions session4_notes.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ author: "Sophie Lee"
fontfamily: lmodern
output:
pdf_document:
highlight: tango
toc: TRUE
latex_engine: xelatex
fig_height: 4
@@ -24,7 +25,7 @@ url_color: blue
```{r setup, include=FALSE}
pacman::p_load(flextable, tidyverse)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE, message = FALSE)
csp_2020 <- read_csv("data/CSP_2020.csv")
csp_long2 <- read_csv("data/CSP_long_201520.csv")
@@ -40,7 +41,7 @@ Data visualisation is a powerful tool with multiple important uses. First, visua
The most appropriate choice of visualisation will depend on the type of variable(s) we wish to display, the number of variables and the message we are trying to disseminate. Common plots used to display combinations of different types of data are given in the following table:


```{r Visualisation table, echo = FALSE, message = FALSE}
```{r Visualisation table, include = FALSE}
vis_tab <- data.frame(n_vars = c(rep("One variable", 5), rep("Two variables", 5),
rep("> 2 variables", 2)),
type_vars = c(rep("Categorical", 2), "Numerical", "Spatial",
@@ -121,7 +122,8 @@ This outlier is the Greater London Authority which is a combination of local aut
```{r Scatter without London}
# Take the csp_2020 data, and then
csp_2020 %>%
# Return all rows where authority is not equal to Greater London Authority, and then
# Return all rows where authority is not equal to Greater London Authority,
# and then
filter(authority != "Greater London Authority") %>%
# Generate a plot
ggplot( ) +
@@ -190,7 +192,8 @@ Aesthetic properties of the geom object may also be set manually, outside of the
```{r Manually setting aesthetics}
ggplot(csp_nolon_2020_new) +
geom_point(aes(x = sfa_2020, y = ct_total_2020),
# Adding the colour outside of the aes wrapper as it is not from the data
# Adding the colour outside of the aes wrapper as it is not
# from the data
colour = "blue")
```
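
To see why the position of `colour` matters, the following sketch contrasts a fixed colour set outside `aes` with a colour mapped to a variable inside `aes`. It uses the built-in `mtcars` dataset as a stand-in for the course data, and the object names `p_fixed` and `p_mapped` are illustrative only:

```r
library(ggplot2)

# Colour set outside aes(): every point is drawn in the same fixed colour
p_fixed <- ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg), colour = "blue")

# Colour mapped inside aes(): the colour depends on a variable in the data
# and a legend is added automatically
p_mapped <- ggplot(mtcars) +
  geom_point(aes(x = wt, y = mpg, colour = factor(cyl)))

p_fixed
p_mapped
```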

Binary file modified session4_notes.pdf
Binary file not shown.
Binary file modified session5_notes.pdf
Binary file not shown.