Showing 16 changed files with 5,551 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
--- | ||
title: "Quarto exercise -- BIOS259" | ||
format: html | ||
editor: visual | ||
author: | ||
- name: Aziz Khan | ||
affiliation: Stanford University, CA, USA | ||
- name: Your Name | ||
affiliation: Your University Name | ||
date: "`r format(Sys.time(), '%d %B %Y')`" | ||
abstract: "This is a hands-on exercise for BIOS 259: The Art of Reproducible Science | ||
– a Stanford Biosciences mini-course on computational reproducibility\n" | ||
tags: | ||
- reproducibility | ||
- notebook | ||
- iris | ||
--- | ||
|
||
## Introduction | ||
|
||
In this **Quarto document**, we'll explore a dataset that ships with base R and create visualizations to analyze the data. The goal is to demonstrate how Quarto combines code and narrative text to produce *reproducible research*. | ||
|
||
## Load Data | ||
|
||
We'll start by loading the `iris` dataset, which contains measurements of iris flowers. | ||
|
||
```{r} | ||
# Load the iris dataset | ||
data(iris) | ||
head(iris) | ||
``` | ||
|
||
The iris dataset contains measurements of sepal length, sepal width, petal length, and petal width for **`r length(unique(iris$Species))` species** of iris flowers: setosa, versicolor, and virginica. | ||
|
||
## Summary Statistics | ||
|
||
Let's explore the structure of the iris dataset and summary statistics for each variable. | ||
|
||
```{r} | ||
# Explore dataset structure | ||
str(iris) | ||
# Summary statistics | ||
summary(iris) | ||
``` | ||
|
||
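`summary(iris)` pools all 150 flowers, so differences between species are not visible in it. As a minimal sketch, assuming only base R, the per-species means of the four measurements can be tabulated with `aggregate()`. The variable name `species_means` is introduced here for illustration and is not part of the original exercise.

```r
# Mean of every numeric column, computed separately for each species
data(iris)
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```

The result has one row per species, which makes it easy to compare groups before plotting them.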
## Data Visualization | ||
|
||
### Scatter Plot | ||
|
||
We'll create a scatter plot to visualize the relationship between sepal length and sepal width for each species of iris flowers. | ||
|
||
```{r, fig.width=8, fig.height=5} | ||
# Scatter plot of sepal length vs. sepal width | ||
library(ggplot2) | ||
iris_scatter_plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + | ||
geom_point() + | ||
labs(title = "Scatter Plot", | ||
x = "Sepal Length", y = "Sepal Width") | ||
# print the plot | ||
iris_scatter_plot | ||
``` | ||
|
||
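The scatter plot colors points by species because the relationship can differ within and across groups. A small sketch in base R, using the illustrative names `overall_cor` and `by_species_cor` (not from the original exercise), puts numbers on what the plot shows:

```r
data(iris)

# Pearson correlation across all flowers, ignoring species
overall_cor <- cor(iris$Sepal.Length, iris$Sepal.Width)

# The same correlation computed within each species
by_species_cor <- sapply(
  split(iris, iris$Species),
  function(d) cor(d$Sepal.Length, d$Sepal.Width)
)

overall_cor
by_species_cor
```

Comparing the two is a quick check on whether pooling the species hides a within-species trend.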
### Make the plot publication ready | ||
|
||
```{r iris-figure1a, fig.width=8, fig.height=5} | ||
# Load the cowplot package (library() stops with an error if it is missing) | ||
library(cowplot) | ||
# Use the theme from cowplot | ||
iris_scatter_plot <- iris_scatter_plot + theme_cowplot(12) | ||
iris_scatter_plot | ||
``` | ||
|
||
### Boxplot | ||
|
||
Next, we'll create a boxplot to compare the distribution of petal lengths for each species of iris flowers. | ||
|
||
```{r iris-figure1b, fig.width=4, fig.height=5} | ||
# Boxplot of petal length by species | ||
iris_box_plot <- ggplot(iris, aes(x = Species, y = Petal.Length, fill = Species)) + | ||
geom_boxplot() + | ||
labs(title = "Boxplot", | ||
x = "Species", y = "Petal Length") + theme_cowplot(12) | ||
iris_box_plot | ||
``` | ||
|
||
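The center line in each box is the group median. As a hedged sketch using only base R, the same medians can be computed directly with `tapply()`. The name `petal_medians` is illustrative and not part of the exercise.

```r
data(iris)

# Median petal length per species (the center line of each box)
petal_medians <- tapply(iris$Petal.Length, iris$Species, median)
petal_medians
```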
### Density Plot | ||
|
||
Finally, let's add a density plot to visualize the distribution of sepal lengths for each species of iris flowers. | ||
|
||
```{r iris-figure1c} | ||
# Density plot of sepal length by species | ||
iris_density_plot <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) + | ||
geom_density(alpha = 0.5) + | ||
labs(title = "Density Plot", | ||
x = "Sepal Length", y = "Density") + theme_cowplot(12) | ||
iris_density_plot | ||
``` | ||
|
||
## Conclusion | ||
|
||
In this Quarto document, we explored the `iris` dataset that ships with base R and created visualizations to analyze it. By combining code and narrative text in Quarto, we produced an analysis that others can easily share and reproduce. | ||
|
||
The `cowplot` package provides the function `plot_grid()` to arrange plots into a grid and label them. | ||
|
||
```{r iris-figure1, fig.width=14, fig.height=4} | ||
#Arranging plots into a grid | ||
plot_grid(iris_scatter_plot, iris_box_plot, iris_density_plot, labels = c('A', 'B','C'), | ||
ncol=3, rel_widths = c(2,1.5,2)) | ||
``` | ||
|
||
Feel free to modify the code, explore other datasets, or add additional visualizations to further analyze the data! | ||
|
||
> **Note:** To generate publication-ready figures, you can also try [ggpubr](https://rpkgs.datanovia.com/ggpubr/index.html). | ||
## R session | ||
|
||
It is good practice to print the R session information to record the versions of the packages used. | ||
|
||
```{r} | ||
devtools::session_info() | ||
``` |
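If `devtools` is not installed, base R offers a similar record. The sketch below assumes nothing beyond a standard R installation; `info` is an illustrative variable name.

```r
# Base-R session record: R version, platform, and attached packages
info <- sessionInfo()
print(info)

# Version of R itself as a single string, convenient for a methods section
R.version.string
```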
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Literate programming exercises | ||
> Literate programming using R Markdown/Notebook, Jupyter Notebook, and Quarto | ||
This repository contains three exercise files for practicing data analysis and visualization: | ||
|
||
1. `R_notebook_bios259.Rmd`: R Notebook Markdown file. | ||
2. `Quarto_excercise_bios259.qmd`: Quarto document. | ||
3. `Jupyter-notebook-bios259.ipynb`: Jupyter Notebook. | ||
|
||
Follow the instructions below to run each exercise: | ||
|
||
## R Notebook Exercise | ||
|
||
1. Open RStudio. | ||
2. Open the `R_notebook_bios259.Rmd` file. | ||
3. Install the R packages required by the document using `install.packages(c('cowplot','ggplot2'))`. | ||
4. Preview the R Notebook document to produce the HTML report. | ||
|
||
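Step 3 above reinstalls both packages every time it is run. As a minimal sketch, assuming you would rather install only what is missing, the hypothetical helper `missing_packages()` below (not part of this repository) checks the local library first:

```r
# Return the subset of `pkgs` that is not yet installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(c("cowplot", "ggplot2"))

# Only install during an interactive session, and only if something is missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

The `interactive()` guard is a design choice for this sketch: it keeps a scripted or rendered run from downloading packages as a side effect.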
## Quarto Exercise | ||
|
||
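1. Install Quarto if you haven't already. Note that `install.packages("quarto")` installs only the R interface package; the Quarto command-line tool itself must also be available (recent RStudio releases bundle it). | ||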
2. Open the `Quarto_excercise_bios259.qmd` file in RStudio or a text editor. | ||
3. Render the Quarto document to produce the HTML output. | ||
|
||
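Rendering can also be scripted from R. The sketch below assumes the `quarto` R package is installed and can locate the Quarto command-line tool; it is guarded so that it only prints a message when either is missing or when the file is not in the working directory. The names `qmd_file` and `can_render` are illustrative.

```r
qmd_file <- "Quarto_excercise_bios259.qmd"

# Render only if the R package, the Quarto CLI, and the file are all present
can_render <- requireNamespace("quarto", quietly = TRUE) &&
  !is.null(quarto::quarto_path()) &&
  file.exists(qmd_file)

if (can_render) {
  # Renders to HTML, the format declared in the document's YAML header
  quarto::quarto_render(qmd_file)
} else {
  message("Quarto or ", qmd_file, " not found; nothing rendered.")
}
```

From a terminal, `quarto render Quarto_excercise_bios259.qmd` should give the same result.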
## Jupyter Notebook Exercise | ||
|
||
1. Make sure Jupyter Notebook and the `seaborn` Python package are installed. If they are not, create an environment from the `environment.yaml` provided in the repo. | ||
2. Navigate to the directory containing `Jupyter-notebook-bios259.ipynb` in your terminal. | ||
3. Run `jupyter notebook` to start the Jupyter Notebook server. | ||
4. Open `Jupyter-notebook-bios259.ipynb` in the Jupyter interface and execute the code cells. | ||
|
||
Feel free to explore and modify the exercises to practice your data analysis skills! | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
--- | ||
title: "R Notebook exercise -- BIOS259" | ||
author: | ||
- name: Aziz Khan | ||
affiliation: Stanford University, CA, USA | ||
- name: Your Name | ||
affiliation: Your University Name | ||
date: "`r format(Sys.time(), '%d %B %Y')`" | ||
output: | ||
html_notebook: default | ||
abstract: "This is a hands-on exercise for BIOS 259: The Art of Reproducible Science | ||
– a Stanford Biosciences mini-course on computational reproducibility\n" | ||
tags: | ||
- reproducibility | ||
- notebook | ||
- iris | ||
--- | ||
|
||
## Introduction | ||
|
||
In this **R Notebook**, we'll explore a dataset that ships with base R and create visualizations to analyze the data. The goal is to demonstrate how R Markdown combines code and narrative text to produce *reproducible research*. | ||
|
||
## Load Data | ||
|
||
We'll start by loading the `iris` dataset, which contains measurements of iris flowers. | ||
|
||
```{r} | ||
# Load the iris dataset | ||
data(iris) | ||
head(iris) | ||
``` | ||
|
||
The iris dataset contains measurements of sepal length, sepal width, petal length, and petal width for **`r length(unique(iris$Species))` species** of iris flowers: setosa, versicolor, and virginica. | ||
|
||
## Summary Statistics | ||
|
||
Let's explore the structure of the iris dataset and summary statistics for each variable. | ||
|
||
```{r} | ||
# Explore dataset structure | ||
str(iris) | ||
# Summary statistics | ||
summary(iris) | ||
``` | ||
|
||
## Data Visualization | ||
|
||
### Scatter Plot | ||
|
||
We'll create a scatter plot to visualize the relationship between sepal length and sepal width for each species of iris flowers. | ||
|
||
```{r, fig.width=8, fig.height=5} | ||
# Scatter plot of sepal length vs. sepal width | ||
library(ggplot2) | ||
iris_scatter_plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + | ||
geom_point() + | ||
labs(title = "Scatter Plot", | ||
x = "Sepal Length", y = "Sepal Width") | ||
# print the plot | ||
iris_scatter_plot | ||
``` | ||
|
||
### Make the plot publication ready | ||
|
||
```{r, fig.width=8, fig.height=5} | ||
# Load the cowplot package (library() stops with an error if it is missing) | ||
library(cowplot) | ||
# Use the theme from cowplot | ||
iris_scatter_plot <- iris_scatter_plot + theme_cowplot(12) | ||
iris_scatter_plot | ||
``` | ||
|
||
### Boxplot | ||
|
||
Next, we'll create a boxplot to compare the distribution of petal lengths for each species of iris flowers. | ||
|
||
```{r, fig.width=4, fig.height=5} | ||
# Boxplot of petal length by species | ||
iris_box_plot <- ggplot(iris, aes(x = Species, y = Petal.Length, fill = Species)) + | ||
geom_boxplot() + | ||
labs(title = "Boxplot", | ||
x = "Species", y = "Petal Length") + theme_cowplot(12) | ||
iris_box_plot | ||
``` | ||
|
||
### Density Plot | ||
|
||
Finally, let's add a density plot to visualize the distribution of sepal lengths for each species of iris flowers. | ||
|
||
```{r} | ||
# Density plot of sepal length by species | ||
iris_density_plot <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) + | ||
geom_density(alpha = 0.5) + | ||
labs(title = "Density Plot", | ||
x = "Sepal Length", y = "Density") + theme_cowplot(12) | ||
iris_density_plot | ||
``` | ||
|
||
## Conclusion | ||
|
||
In this R Markdown document, we explored the `iris` dataset that ships with base R and created visualizations to analyze it. By combining code and narrative text in R Markdown, we produced an analysis that others can easily share and reproduce. | ||
|
||
The `cowplot` package provides the function `plot_grid()` to arrange plots into a grid and label them. | ||
|
||
```{r, fig.width=14, fig.height=4} | ||
#Arranging plots into a grid | ||
plot_grid(iris_scatter_plot, iris_box_plot, iris_density_plot, labels = c('A', 'B','C'), | ||
ncol=3, rel_widths = c(2,1.5,2)) | ||
``` | ||
|
||
Feel free to modify the code, explore other datasets, or add additional visualizations to further analyze the data! | ||
|
||
> **Note:** To generate publication-ready figures, you can also try [ggpubr](https://rpkgs.datanovia.com/ggpubr/index.html). | ||
## R session | ||
|
||
It is good practice to print the R session information to record the versions of the packages used. | ||
|
||
```{r} | ||
devtools::session_info() | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
--- | ||
title: "The Art of Reproducibility" | ||
author: "Aziz Khan" | ||
format: revealjs | ||
--- | ||
|
||
## Getting up | ||
|
||
- Turn off alarm | ||
- Get out of bed | ||
- | ||
|
||
## Going to sleep | ||
|
||
- Get in bed | ||
- Count sheep | ||
|
||
## My Quarto Demo Document | ||
|
||
## Introduction | ||
|
||
Welcome to my Quarto demo document! In this document, we will learn the basics of Quarto and how to create beautiful and interactive documents. | ||
|
||
## Getting Started | ||
|
||
To get started with Quarto, you will need to install the Quarto CLI. You can do this by running the following command: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
*.log | ||
.git/ | ||
.pixi | ||
pixi.lock |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Base image with Pixi preinstalled (an earlier unused stage would be discarded by this FROM) | ||
FROM ghcr.io/prefix-dev/pixi:latest | ||
|
||
# Set working directory | ||
WORKDIR /project | ||
|
||
# Copy project files | ||
COPY . . | ||
|
||
# Install dependencies (WORKDIR already points at /project, so no cd is needed) | ||
RUN pixi install | ||
|
||
# Default command | ||
CMD ["pixi", "run", "preprocess"] | ||
|