diff --git a/.Rbuildignore b/.Rbuildignore
index 0725922..eb3c68e 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -17,3 +17,5 @@
 ^codemeta\.json$
 ^pkgdown$
 ^.lintr$
+^CRAN-SUBMISSION$
+^LICENSE\.md$
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
deleted file mode 100644
index 10ac61c..0000000
--- a/.github/CONTRIBUTING.md
+++ /dev/null
@@ -1,47 +0,0 @@
-# Contributing to datasauRus
-
-This outlines how to propose a change to datasauRus.
-For more detailed info about contributing to this, and other tidyverse packages, please see the
-[**development contributing guide**](https://rstd.io/tidy-contrib).
-
-## Fixing typos
-
-You can fix typos, spelling mistakes, or grammatical errors in the documentation directly using the GitHub web interface, as long as the changes are made in the _source_ file.
-This generally means you'll need to edit [roxygen2 comments](https://roxygen2.r-lib.org/articles/roxygen2.html) in an `.R`, not a `.Rd` file.
-You can find the `.R` file that generates the `.Rd` by reading the comment in the first line.
-
-## Bigger changes
-
-If you want to make a bigger change, it's a good idea to first file an issue and make sure someone from the team agrees that it’s needed.
-If you’ve found a bug, please file an issue that illustrates the bug with a minimal
-[reprex](https://www.tidyverse.org/help/#reprex) (this will also help you write a unit test, if needed).
-
-### Pull request process
-
-* Fork the package and clone onto your computer. If you haven't done this before, we recommend using `usethis::create_from_github("jumpingrivers/datasauRus", fork = TRUE)`.
-
-* Install all development dependencies with `devtools::install_dev_deps()`, and then make sure the package passes R CMD check by running `devtools::check()`.
- If R CMD check doesn't pass cleanly, it's a good idea to ask for help before continuing.
-* Create a Git branch for your pull request (PR). We recommend using `usethis::pr_init("brief-description-of-change")`.
-
-* Make your changes, commit to git, and then create a PR by running `usethis::pr_push()`, and following the prompts in your browser.
- The title of your PR should briefly describe the change.
- The body of your PR should contain `Fixes #issue-number`.
-
-* For user-facing changes, add a bullet to the top of `NEWS.md` (i.e. just below the first header). Follow the style described in .
-
-### Code style
-
-* New code should follow the tidyverse [style guide](https://style.tidyverse.org).
- You can use the [styler](https://CRAN.R-project.org/package=styler) package to apply these styles, but please don't restyle code that has nothing to do with your PR.
-
-* We use [roxygen2](https://cran.r-project.org/package=roxygen2), with [Markdown syntax](https://cran.r-project.org/web/packages/roxygen2/vignettes/rd-formatting.html), for documentation.
-
-* We use [testthat](https://cran.r-project.org/package=testthat) for unit tests.
- Contributions with test cases included are easier to accept.
-
-## Code of Conduct
-
-Please note that the datasauRus project is released with a
-[Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this
-project you agree to abide by its terms.
diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION
new file mode 100644
index 0000000..66be463
--- /dev/null
+++ b/CRAN-SUBMISSION
@@ -0,0 +1,3 @@
+Version: 0.1.6
+Date: 2022-05-03 20:18:55 UTC
+SHA: 29e97d78dfa8d28b6db13ef9477d2a8c954d92ed
diff --git a/DESCRIPTION b/DESCRIPTION
index 633f45a..88812bd 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -27,7 +27,7 @@ Description: The Datasaurus Dozen is a set of datasets with the same
 Statistics through Simulated Annealing" .
 License: MIT + file LICENSE
 URL: https://github.com/jumpingrivers/datasauRus,
- https://jumpingrivers.github.io/datasauRus
+ https://jumpingrivers.github.io/datasauRus/
 BugReports: https://github.com/jumpingrivers/datasauRus/issues
 Depends:
 R (>= 3.0.0)
diff --git a/LICENSE b/LICENSE
index eb1d070..960c937 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,2 +1,2 @@
-YEAR: 2017
-COPYRIGHT HOLDER: Stephanie Locke
+YEAR: 2022
+COPYRIGHT HOLDER: datasauRus authors
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 0000000..304fcc4
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+# MIT License
+
+Copyright (c) 2022 datasauRus authors
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/R/datasaurus-package.R b/R/datasaurus-package.R
index 58bf85c..3dd5c6f 100644
--- a/R/datasaurus-package.R
+++ b/R/datasaurus-package.R
@@ -15,7 +15,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/box_plots.R
 "box_plots"
@@ -57,7 +57,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/datasaurus_dozen_wide.R
 "datasaurus_dozen_wide"
@@ -76,7 +76,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/datasaurus_dozen.R
 "datasaurus_dozen"
@@ -98,7 +98,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats).#nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs).#nolint
 #' @example inst/examples/simpsons_paradox_wide.R
 "simpsons_paradox_wide"
@@ -119,7 +119,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/simpsons_paradox.R
 "simpsons_paradox"
@@ -138,7 +138,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/twelve_from_slant_alternate_long.R
 "twelve_from_slant_alternate_long"
@@ -178,7 +178,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/twelve_from_slant_alternate_wide.R
 "twelve_from_slant_alternate_wide"
@@ -197,7 +197,7 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/twelve_from_slant_long.R
 "twelve_from_slant_long"
@@ -237,6 +237,6 @@
 #' Varied Appearance and Identical Statistics through Simulated
 #' Annealing. _CHI 2017 Conference proceedings: ACM SIGCHI
 #' Conference on Human Factors in Computing Systems._
-#' Retrieved from [https://www.autodeskresearch.com/publications/samestats](https://www.autodeskresearch.com/publications/samestats). #nolint
+#' Retrieved from [https://www.autodesk.com/research/publications/same-stats-different-graphs](https://www.autodesk.com/research/publications/same-stats-different-graphs). #nolint
 #' @example inst/examples/twelve_from_slant_wide.R
 "twelve_from_slant_wide"
diff --git a/README.Rmd b/README.Rmd
index 1ac3359..f82d729 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -15,26 +15,24 @@ knitr::opts_chunk$set(
 # datasauRus
-[![CRAN version](http://www.r-pkg.org/badges/version/datasauRus)](https://cran.r-project.org/package=datasauRus) [![Downloads](http://cranlogs.r-pkg.org/badges/datasauRus)](http://cran.rstudio.com/web/packages/datasauRus/index.html)
+[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
+[![CRAN status](https://www.r-pkg.org/badges/version/datasauRus)](https://CRAN.R-project.org/package=datasauRus)
 [![R-CMD-check](https://github.com/jumpingrivers/datasauRus/workflows/R-CMD-check/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions)
-[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
-
+
 This package wraps the awesome Datasaurus Dozen datasets. The Datasaurus Dozen show us why visualisation is important -- summary statistics can be the same but distributions can be very different. In short, this package gives a fun alternative to [Anscombe's Quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet), available in R as `anscombe`.
-The original Datasaurus was created by Alberto Cairo in this great [blog post](http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html).
-
-The other Dozen were generated using simulated annealing and the process
+The original Datasaurus was created by Alberto Cairo. The other Dozen were generated using simulated annealing and the process
 is described in the paper "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" by Justin
-Matejka and George Fitzmaurice ([open access materials including manuscript and code](https://www.autodeskresearch.com/publications/samestats), [official paper](https://doi.org/10.1145/3025453.3025912)).
+Matejka and George Fitzmaurice ([open access materials including manuscript and code](https://www.autodesk.com/research/publications/same-stats-different-graphs), [official paper](https://doi.org/10.1145/3025453.3025912)).
 In the paper, Justin and George simulate a variety of datasets that the same summary statistics to the Datasaurus but have very different distributions.
 ```{r, out.width="600px", fig.alt="Sequential dinosaur gif", echo = FALSE}
-knitr::include_graphics("man/figures/DinoSequential.gif")
+knitr::include_graphics("https://damassets.autodesk.net/content/dam/autodesk/research/publications-assets/gifs/same-stats-different-graphs/DinoSequentialSmaller.gif")
 ```
 ## Install
@@ -64,10 +62,7 @@ ggplot(datasaurus_dozen, aes(x = x, y = y, colour = dataset))+
 facet_wrap(~dataset, ncol = 3)
 ```
-## Contributing to the package
-
-Want to report a bug or suggest a feature? Great stuff! For more information on how to contribute check out [our contributing guide](.github/CONTRIBUTING.md).
- ## Code of Conduct
+## Code of Conduct
- Please note that the datasauRus project is released with a [Contributor Code of Conduct](https://jumpingrivers.github.io/datasauRus/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms
+Please note that the datasauRus project is released with a [Contributor Code of Conduct](https://jumpingrivers.github.io/datasauRus/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms
diff --git a/README.md b/README.md
index 122d7c3..1994e09 100644
--- a/README.md
+++ b/README.md
@@ -5,13 +5,11 @@
+[![Lifecycle:
+stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
 [![CRAN
-version](http://www.r-pkg.org/badges/version/datasauRus)](https://cran.r-project.org/package=datasauRus)
-[![Downloads](http://cranlogs.r-pkg.org/badges/datasauRus)](http://cran.rstudio.com/web/packages/datasauRus/index.html)
+status](https://www.r-pkg.org/badges/version/datasauRus)](https://CRAN.R-project.org/package=datasauRus)
 [![R-CMD-check](https://github.com/jumpingrivers/datasauRus/workflows/R-CMD-check/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions)
-[![Project Status: Active – The project has reached a stable, usable
-state and is being actively
-developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
 This package wraps the awesome Datasaurus Dozen datasets. The Datasaurus
@@ -21,22 +19,20 @@ gives a fun alternative to [Anscombe’s
 Quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet), available
 in R as `anscombe`.
-The original Datasaurus was created by Alberto Cairo in this great [blog
-post](http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html).
-
-The other Dozen were generated using simulated annealing and the process
-is described in the paper “Same Stats, Different Graphs: Generating
-Datasets with Varied Appearance and Identical Statistics through
-Simulated Annealing” by Justin Matejka and George Fitzmaurice ([open
-access materials including manuscript and
-code](https://www.autodeskresearch.com/publications/samestats),
+The original Datasaurus was created by Alberto Cairo. The other Dozen
+were generated using simulated annealing and the process is described in
+the paper “Same Stats, Different Graphs: Generating Datasets with Varied
+Appearance and Identical Statistics through Simulated Annealing” by
+Justin Matejka and George Fitzmaurice ([open access materials including
+manuscript and
+code](https://www.autodesk.com/research/publications/same-stats-different-graphs),
 [official paper](https://doi.org/10.1145/3025453.3025912)). In the paper, Justin and George simulate a variety of datasets that the same summary statistics to the Datasaurus but have very different distributions.
-Sequential dinosaur gif
+Sequential dinosaur gif
 ## Install
@@ -69,12 +65,9 @@ ggplot(datasaurus_dozen, aes(x = x, y = y, colour = dataset))+
 ![](man/figures/datasets-1.png)
-## Contributing to the package
-
-Want to report a bug or suggest a feature? Great stuff! For more
-information on how to contribute check out [our contributing
-guide](.github/CONTRIBUTING.md).
+## Code of Conduct
-Please note that this R package is released with a [Contributor Code of
-Conduct](CODE_OF_CONDUCT.md). By participating in this package project
-you agree to abide by its terms.
+Please note that the datasauRus project is released with a [Contributor
+Code of
+Conduct](https://jumpingrivers.github.io/datasauRus/CODE_OF_CONDUCT.html).
+By contributing to this project, you agree to abide by its terms
diff --git a/man/box_plots.Rd b/man/box_plots.Rd
index e419eb0..fd0b065 100644
--- a/man/box_plots.Rd
+++ b/man/box_plots.Rd
@@ -66,6 +66,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/datasaurus_dozen.Rd b/man/datasaurus_dozen.Rd
index c3e915d..2d32b09 100644
--- a/man/datasaurus_dozen.Rd
+++ b/man/datasaurus_dozen.Rd
@@ -52,6 +52,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/datasaurus_dozen_wide.Rd b/man/datasaurus_dozen_wide.Rd
index de31713..8d2df7b 100644
--- a/man/datasaurus_dozen_wide.Rd
+++ b/man/datasaurus_dozen_wide.Rd
@@ -67,6 +67,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/figures/DinoSequential.gif b/man/figures/DinoSequential.gif
deleted file mode 100644
index 966728e..0000000
Binary files a/man/figures/DinoSequential.gif and /dev/null differ
diff --git a/man/simpsons_paradox.Rd b/man/simpsons_paradox.Rd
index aeeafdb..313f225 100644
--- a/man/simpsons_paradox.Rd
+++ b/man/simpsons_paradox.Rd
@@ -51,6 +51,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/simpsons_paradox_wide.Rd b/man/simpsons_paradox_wide.Rd
index cf5c09c..9a0e386 100644
--- a/man/simpsons_paradox_wide.Rd
+++ b/man/simpsons_paradox_wide.Rd
@@ -46,6 +46,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}.#nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}.#nolint
 }
 \keyword{datasets}
diff --git a/man/twelve_from_slant_alternate_long.Rd b/man/twelve_from_slant_alternate_long.Rd
index 28e3f76..52c571b 100644
--- a/man/twelve_from_slant_alternate_long.Rd
+++ b/man/twelve_from_slant_alternate_long.Rd
@@ -50,6 +50,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/twelve_from_slant_alternate_wide.Rd b/man/twelve_from_slant_alternate_wide.Rd
index c951a3d..3e8e5db 100644
--- a/man/twelve_from_slant_alternate_wide.Rd
+++ b/man/twelve_from_slant_alternate_wide.Rd
@@ -65,6 +65,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/twelve_from_slant_long.Rd b/man/twelve_from_slant_long.Rd
index 3ac1c2b..cc60376 100644
--- a/man/twelve_from_slant_long.Rd
+++ b/man/twelve_from_slant_long.Rd
@@ -50,6 +50,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/man/twelve_from_slant_wide.Rd b/man/twelve_from_slant_wide.Rd
index df016a3..c7e8596 100644
--- a/man/twelve_from_slant_wide.Rd
+++ b/man/twelve_from_slant_wide.Rd
@@ -65,6 +65,6 @@ Same Stats, Different Graphs: Generating Datasets with
 Varied Appearance and Identical Statistics through Simulated
 Annealing. \emph{CHI 2017 Conference proceedings: ACM SIGCHI
 Conference on Human Factors in Computing Systems.}
-Retrieved from \url{https://www.autodeskresearch.com/publications/samestats}. #nolint
+Retrieved from \url{https://www.autodesk.com/research/publications/same-stats-different-graphs}. #nolint
 }
 \keyword{datasets}
diff --git a/vignettes/Datasaurus.Rmd b/vignettes/Datasaurus.Rmd
index a7496d1..8fa9e99 100644
--- a/vignettes/Datasaurus.Rmd
+++ b/vignettes/Datasaurus.Rmd
@@ -8,13 +8,11 @@ vignette: >
 %\VignetteEncoding{UTF-8}
 ---
-This package wraps the awesome Datasaurus Dozen dataset, which contains [13](http://www.phrases.org.uk/meanings/Bakers-dozen.html) sets of x-y data. Each sub-dataset has five statistics that are (almost) the same in each case. (These are the mean of x, mean of y, standard deviation of x, standard deviation of y, and Pearson correlation between x and y). However, scatter plots reveal that each sub-dataset looks very different. The dataset is intended to be used to teach students that it is important to plot their own datasets, rather than relying only on statistics.
+This package wraps the awesome Datasaurus Dozen dataset, which contains [13](https://www.phrases.org.uk/meanings/Bakers-dozen.html) sets of x-y data. Each sub-dataset has five statistics that are (almost) the same in each case. (These are the mean of x, mean of y, standard deviation of x, standard deviation of y, and Pearson correlation between x and y). However, scatter plots reveal that each sub-dataset looks very different. The dataset is intended to be used to teach students that it is important to plot their own datasets, rather than relying only on statistics.
-The Datasaurus was created by Alberto Cairo in this great [blog post](http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html).
+The Datasaurus was created by Alberto Cairo. Datasaurus shows us why visualisation is important, not just summary statistics.
-Datasaurus shows us why visualisation is important, not just summary statistics.
-
-He's been subsequently made even more famous in the paper [Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.autodeskresearch.com/publications/samestats) by Justin Matejka and George Fitzmaurice.
+He's been subsequently made even more famous in the paper [Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.autodesk.com/research/publications/same-stats-different-graphs) by Justin Matejka and George Fitzmaurice.
 In the paper, Justin and George simulate a variety of datasets that the same summary statistics to the Datasaurus but have very different distributions.