Version 0.1.0 complete
ismayc committed Jan 7, 2017
1 parent 5b2f343 commit 4247c5b
Showing 117 changed files with 18,796 additions and 12,869 deletions.
8 changes: 4 additions & 4 deletions 02-intro.Rmd
@@ -48,7 +48,7 @@ This can be summarized in a graphic that is commonly used by Hadley Wickham:
knitr::include_graphics("images/tidy1.png")
```

-We will begin with a discussion on what is meant by tidy data and then dig into the gray **Understand** portion of the cycle and conclude by talking about interpretting and discussing the results of our models via **Communication**. These steps are vital to any statistical analysis. But why should you care about statistics? "Why did they make me take this class?"
+We will begin with a discussion on what is meant by tidy data and then dig into the gray **Understand** portion of the cycle and conclude by talking about interpreting and discussing the results of our models via **Communication**. These steps are vital to any statistical analysis. But why should you care about statistics? "Why did they make me take this class?"

There's a reason so many fields require a statistics course. Scientific knowledge grows through an understanding of statistical significance and data analysis. You needn't be intimidated by statistics. It's not the beast that it used to be and paired with computation you'll see how reproducible research in the sciences particularly increases scientific knowledge.

@@ -60,14 +60,14 @@ Another large goal of this book is to help readers understand the importance of

Copying and pasting is not the way that efficient and effective scientific research is conducted. It's much more important for time to be spent on data collection and data analysis and not on copying and pasting plots back and forth across a variety of programs.

-In a traditional analyses if an error was made with the original data, we'd need to step through the entire process again: recreate the plots and copy and paste all of the new plots and our statistical analsis into your document. This is error prone and a frustrating use of time. We'll see how to use R Markdown to get away from this tedious activity so that we can spend more time doing science.
+In a traditional analyses if an error was made with the original data, we'd need to step through the entire process again: recreate the plots and copy and paste all of the new plots and our statistical analysis into your document. This is error prone and a frustrating use of time. We'll see how to use R Markdown to get away from this tedious activity so that we can spend more time doing science.

> "We are talking about _computational_ reproducibility." - Yihui Xie
-Reproducibility means a lot of things in terms of different scientific fields. Are experiments conducted in a way that another researcher could follow the steps and get similar results? In this book, we will focus on what is known as **computational reproducibility**. This refers to being able to pass all of one's data analysis and conclusions to someone else and have them get exactly the same results on their machine. This allows for time to be spent doing actual science and interpretting of results and assumptions instead of the more error prone way of starting from scratch or follow a list of steps that may be different from machine to machine.
+Reproducibility means a lot of things in terms of different scientific fields. Are experiments conducted in a way that another researcher could follow the steps and get similar results? In this book, we will focus on what is known as **computational reproducibility**. This refers to being able to pass all of one's data analysis and conclusions to someone else and have them get exactly the same results on their machine. This allows for time to be spent doing actual science and interpreting of results and assumptions instead of the more error prone way of starting from scratch or follow a list of steps that may be different from machine to machine.

## Who is this book for?

-This book is targetted at students taking a traditional intro stats class in a small college environment using RStudio and preferably RStudio Server. We assume no prerequisites: no calculus and no prior programming experience. This is intended to be a gentle and nice introduction to the practice of statistics in terms of how data scientists, statisticians, and other scientists analyze data and write stories about data. We have intentionally avoided the use of throwing formulas at you and instead have focused on developing statistical concepts via data visualization and statistical computing. We hope this is a more intuitive experience than the way statistics has traditionally been taught in the past (and how it is commonly perceived from the outside). We additionally hope that you see the value of reproducible research via R as you continue in your studies. We understand that there will initially be growing pains in learning to program but we are here to help you and you should know that there is a huge community of R users that are always happy to help newbies along.
+This book is targeted at students taking a traditional intro stats class in a small college environment using RStudio and preferably RStudio Server. We assume no prerequisites: no calculus and no prior programming experience. This is intended to be a gentle and nice introduction to the practice of statistics in terms of how data scientists, statisticians, and other scientists analyze data and write stories about data. We have intentionally avoided the use of throwing formulas at you and instead have focused on developing statistical concepts via data visualization and statistical computing. We hope this is a more intuitive experience than the way statistics has traditionally been taught in the past (and how it is commonly perceived from the outside). We additionally hope that you see the value of reproducible research via R as you continue in your studies. We understand that there will initially be growing pains in learning to program but we are here to help you and you should know that there is a huge community of R users that are always happy to help newbies along.

Now let's get into learning about how to create good stories about and with data!
2 changes: 1 addition & 1 deletion 03-tidy.Rmd
@@ -17,7 +17,7 @@ lc <- 0
rq <- 0
# **`r paste0("(LC", chap, ".", (lc <- lc + 1), ")")`**
# **`r paste0("(RQ", chap, ".", (rq <- rq + 1), ")")`**
-knitr::opts_chunk$set(tidy = FALSE, out.width='\\textwidth')#, fig.align = "center")
+knitr::opts_chunk$set(tidy = FALSE, out.width='\\textwidth')
```

## What is tidy data?