---
title: "Course outline"
output:
  html_document:
    toc: true
    toc_depth: 3
    toc_float:
      collapsed: false
      smooth_scroll: true
---
# Data science
### Introduction to data science
- Introduction
    - Define data science
    - List common tools used in data science
***
# Command line
### Introduction to Unix
- Introduction
    - Define the command line
    - Describe several advantages of using the command line
- Download instructions
    - Provides instructions for downloading and installing Unix terminals for Mac, Linux, and Windows
- Unix navigation tutorial and practice
    - Define parts of the terminal
    - Use Unix commands to navigate your computer, including `pwd`, `ls`, `man`/`help`, and `cd`
- Unix manipulation tutorial and practice
    - Use Unix commands to manipulate files, including `mkdir`, `cp`, `mv`, and `rm`
    - Apply equivalent file paths in Unix commands
    - Define best practices for directory and file names
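
As a rough illustration of the navigation and manipulation commands listed above, the following sketch works inside a throwaway temporary directory. The directory and file names (`project_data`, `samples.txt`, `samples_v1.txt`) are hypothetical and are not part of the course materials.

```bash
# Sketch only: all directory and file names are hypothetical examples.
set -eu

# Work in a temporary directory so nothing else on your computer is changed.
workdir="$(mktemp -d)"
cd "$workdir"
pwd                                   # print the current working directory

# Create directories, using underscores instead of spaces in names.
mkdir -p project_data/raw project_data/archive

# Make a small text file to work with, then list the directory contents.
printf 'sample_1\nsample_2\n' > project_data/raw/samples.txt
ls project_data/raw

# Copy the file, then rename (move) the copy.
cp project_data/raw/samples.txt project_data/archive/samples_copy.txt
mv project_data/archive/samples_copy.txt project_data/archive/samples_v1.txt

# Equivalent paths: a relative path and an absolute path to the same file.
ls project_data/archive/samples_v1.txt
ls "$workdir/project_data/archive/samples_v1.txt"

# Move into a directory, then back up one level (into project_data).
cd project_data/raw
cd ..

# Remove the original file; rm deletes permanently, so use it carefully.
rm raw/samples.txt
```

Running `man ls` (or `ls --help` where available) shows the options for a command; it is left out of the sketch because the output depends on the system.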
### Applications of command line
- BLAST tutorial and practice
    - Complete a nucleotide BLAST search of a large sequencing dataset using command-line tools
- Git tutorial and practice
    - Apply version control to a text file using Git command-line tools
- GitHub tutorial and practice
    - Share and modify a version-controlled file using GitHub
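
The Git objective above might look something like the following minimal sketch, which assumes Git is installed. The file name `notes.txt`, the commit messages, and the name and email are placeholders chosen for illustration, not values from the course.

```bash
# Sketch only: the repository location, file name, and identity are hypothetical.
set -eu

repo="$(mktemp -d)"
cd "$repo"

# Start tracking this directory with Git.
git init --quiet

# Set a name and email for this repository only (placeholder values).
git config user.name "Example Student"
git config user.email "student@example.com"

# Create a text file, stage it, and record the first version.
printf 'First draft\n' > notes.txt
git add notes.txt
git commit --quiet -m "Add notes file"

# Edit the file, review the change, and record a second version.
printf 'Second line\n' >> notes.txt
git --no-pager diff
git add notes.txt
git commit --quiet -m "Add second line to notes"

# Show the history, one commit per line.
git --no-pager log --oneline
```

Sharing the file through GitHub would additionally involve `git remote add` and `git push` with a repository URL, which is omitted here because it depends on your own account.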
***
# R/RStudio
### Introduction to R
- Introduction
    - Describe general uses for R
    - List several advantages of using R and RStudio
- Download instructions
    - Provides instructions for downloading and installing R and RStudio
- RStudio tutorial
    - Navigate the RStudio software, including key shortcuts, projects, packages, and help
    - All of our R tutorials and practice are implemented in RStudio, so we **strongly recommend** that this tutorial be included with all R curriculum
- Base R tutorial and practice
    - Execute commands in base R to:
        - Load tabular data
        - Access columns and rows within a data frame
        - Perform basic calculations on tabular data
        - Subset a data frame
### Data manipulation in R
- Data manipulation tutorial and practice
    - Load tabular data using the [tidyverse](https://www.tidyverse.org/)
    - Subset and clean data in `dplyr` (`filter`, `select`, `rename`, `arrange`, `mutate`)
    - Summarize data in `dplyr` (`group_by`, `summarize`)
    - Transform data frames using `tidyr` (`gather`, `spread`) and `dplyr` (`*_join`)
    - Link multiple tidyverse functions using pipes `%>%`
### Data visualization in R
- Data visualization tutorial and practice
    - Define the grammar of graphics
    - Create scatterplots using the `ggplot2` package
    - Customize plot color, shape, axes, scales, and other attributes
    - Represent subsets of data using facets
- *Recommend first completing 'Data manipulation in R'*
***
# Statistics *Under development*
### Introduction to statistics
- Introduction
    - Identify and distinguish between a population and a sample, and between parameters and statistics
    - Define "p-value" and interpret its meaning
    - Identify factors that influence statistical test selection
### Statistics in R/RStudio
- Download instructions
    - Provides instructions for downloading and installing R and RStudio
- RStudio tutorial
    - Navigate the RStudio software, including key shortcuts, projects, packages, and help
    - All of our statistics tutorials and practice are implemented in RStudio, so we **strongly recommend** that this tutorial be included with all R curriculum
- *t*-tests
-
- Analysis of Variance (ANOVA)
-
- Linear regression
-
***
# Capstone projects *Under development*
### Metagenomics analysis team project
- Focuses on pipeline construction and biological interpretation of metagenomic sequence data from microbiomes
### Microbiome analysis team project
- Focuses on biological interpretation of amplicon sequence data from microbiomes