forked from datacarpentry/R-genomics
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path00-before-we-start.Rmd
101 lines (78 loc) · 4.4 KB
/
00-before-we-start.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
---
layout: topic
title: Before we start
author: Data Carpentry contributors
minutes: 15
---
```{r, echo=FALSE, purl=FALSE}
knitr::opts_chunk$set(results='hide', fig.path='img/r-lesson-')
```
------------
> ## Learning Objectives
>
> * Describe the purpose of RStudio’s script, console, environment, and file/plot/help windows.
> * Create an R project.
> * Organize files and directories for a set of analyses as an R Project.
> * Use the built-in RStudio help interface to search for more information on R functions.
------------
# Presentation of RStudio
Start RStudio -- Let's start by learning about our tool.
* Console, Scripts, Environments, Plots
* Code and workflow are more reproducible if we can document everything that we
do.
* Our end goal is not just to "do stuff" but to do it in a way that anyone can
easily and exactly replicate our workflow and results.
# Before we get started
* Under the `File` menu, click on `New project`, choose `New directory`, then
`Empty project`
* Enter a name for this new folder, and choose a convenient location for
it. This will be your **working directory** for the rest of the day
(e.g., `~/data-carpentry`)
* Confirm that the folder named in the `Create project as a sub-directory of` box is where you want the working directory created. Use the `Browse` button to navigate folders if changes are needed.
* Click on "Create project"
* Under the `Files` tab on the right of the screen, click on `New Folder` and
create a folder named `data` within your newly created working directory.
(e.g., `~/data-carpentry/data`)
* Create a new R script (File > New File > R script) and save it in your working
directory (e.g. `data-carpentry-script.R`)
Your working directory should now look like this:
![How it should look like at the beginning of this lesson](img/r_starting_how_it_should_like.png)
# Interacting with R
There are two main ways of interacting with R: using the console or by using
script files (plain text files that contain your code).
The console window (in RStudio, the bottom left panel) is the place where R is
waiting for you to tell it what to do, and where it will show the results of a
command. You can type commands directly into the console, but they will be
forgotten when you close the session. It is better to enter the commands in the
script editor, and save the script. This way, you have a complete record of what
you did, you can easily show others how you did it and you can do it again later
on if needed. You can copy-paste into the R console, but the Rstudio script
editor allows you to 'send' the current line or the currently selected text to
the R console using the `Ctrl-Enter` shortcut.
If R is ready to accept commands, the R console shows a `>` prompt. If it
receives a command (by typing, copy-pasting or sent from the script editor using
<kbd>Ctrl</kbd>-<kbd>Enter</kbd>), R will try to execute it, and when ready, show the results and
come back with a new `>` prompt to wait for new commands.
If R is still waiting for you to enter more data because it isn't complete yet,
the console will show a `+` prompt. It means that you haven't finished entering
a complete command. This is because you have not 'closed' a parenthesis or
quotation. If you're in Rstudio and this happens, click inside the console
window and press <kbd>Esc</kbd>; this should help you out of trouble.
# Basics of R
R is a versatile, open source programming/scripting language that's useful both
for statistics but also data science. Inspired by the programming language S.
* Open source software under GPL.
* Superior (if not just comparable) to commercial alternatives. R has over 7,000
user contributed packages at this time. It's widely used both in academia and
industry.
* Available on all platforms.
* Not just for statistics, but also general purpose programming.
* For people who have experience in programmming: R is both an object-oriented
and a so-called [functional language](http://adv-r.had.co.nz/Functional-programming.html)
* Large and growing community of peers.
## Organizing your working directory
You should separate the original data (raw data) from intermediate datasets that
you may create for the need of a particular analysis. For instance, you may want
to create a `data/` directory within your working directory that stores the raw
data, and have a `data_output/` directory for intermediate datasets and a
`figure_output/` directory for the plots you will generate.