
# Prerequisites {#prerequisites}

The analysis presented in this book requires a basic understanding of the 
`R` programming language. An introduction to `R` can be found [here](https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf) and
in the book [R for Data Science](https://r4ds.had.co.nz/index.html).

Furthermore, it is beneficial to be familiar with single-cell data analysis
using the [Bioconductor](https://www.bioconductor.org/) framework. The 
[Orchestrating Single-Cell Analysis with Bioconductor](https://bioconductor.org/books/release/OSCA/) 
book gives an excellent overview of the data containers and basic analysis steps
used here.

An overview of IMC as a technology and of the necessary image processing steps
can be found on the [IMC workflow website](https://bodenmillergroup.github.io/IMCWorkflow/). 

Before we get started on IMC data analysis, we will need to make sure that
software dependencies are installed and the needed example data is downloaded.

## Software requirements

To install all R packages needed for the analysis, please run:

```{r install-packages, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install(c("pheatmap", "viridis",
                       "zoo", "devtools", "tiff",
                       "distill", "openxlsx", "ggrepel", "patchwork", "mclust",
                       "RColorBrewer", "uwot", "Rtsne", "harmony", "Seurat", 
                       "SeuratObject", "cowplot", "kohonen", "caret", 
                       "randomForest", "ggridges", "gridGraphics",
                       "scales", "CATALYST", "scuttle", "scater", 
                       "dittoSeq", "tidyverse", "batchelor", 
                       "bluster", "scran", "lisaClust", "spicyR"))

# GitHub dependencies
devtools::install_github(c("BodenmillerGroup/imcRtools", 
                           "BodenmillerGroup/cytomapper", 
                           "i-cyto/Rphenograph"))
```
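After installation, it can be useful to confirm that the packages are actually available before starting the analysis. The following is a minimal sketch of such a check. The helper name `find_missing_packages` is hypothetical and not part of any package, and the `required` vector is only a subset of the packages listed above.

```r
# Hypothetical helper (not part of any package): return the names of
# packages whose namespace cannot be loaded from the current library paths.
find_missing_packages <- function(pkgs) {
    installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!installed]
}

# Subset of the packages used throughout the workflow
required <- c("CATALYST", "SpatialExperiment", "imcRtools", "cytomapper",
              "dittoSeq", "tidyverse")

missing_pkgs <- find_missing_packages(required)
if (length(missing_pkgs) > 0) {
    message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

An empty result indicates that all listed packages can be loaded.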

```{r load-libraries, echo = FALSE, message = FALSE}
options(timeout=10000)
library(CATALYST)
library(SpatialExperiment)
library(SingleCellExperiment)
library(scuttle)
library(scater)
library(imcRtools)
library(cytomapper)
library(dittoSeq)
library(tidyverse)
library(bluster)
library(scran)
library(lisaClust)
library(caret)
```

Throughout the analysis, we rely on different R software packages.
This section lists the most commonly used packages in this workflow.

Data containers:

* [SpatialExperiment](https://bioconductor.org/packages/release/bioc/html/SpatialExperiment.html) version `r packageVersion("SpatialExperiment")`
* [SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) version `r packageVersion("SingleCellExperiment")`

Data analysis:

* [CATALYST](https://bioconductor.org/packages/release/bioc/html/CATALYST.html) version `r packageVersion("CATALYST")`
* [imcRtools](https://github.com/BodenmillerGroup/imcRtools) version `r packageVersion("imcRtools")` from [GitHub](https://github.com/BodenmillerGroup/imcRtools)
* [scuttle](https://bioconductor.org/packages/release/bioc/html/scuttle.html) version `r packageVersion("scuttle")`
* [scater](https://bioconductor.org/packages/release/bioc/html/scater.html) version `r packageVersion("scater")`
* [batchelor](https://www.bioconductor.org/packages/release/bioc/html/batchelor.html) version `r packageVersion("batchelor")`
* [bluster](https://www.bioconductor.org/packages/release/bioc/html/bluster.html) version `r packageVersion("bluster")`
* [scran](https://www.bioconductor.org/packages/release/bioc/html/scran.html) version `r packageVersion("scran")`
* [harmony](https://github.com/immunogenomics/harmony) version `r packageVersion("harmony")`
* [Seurat](https://satijalab.org/seurat/index.html) version `r packageVersion("Seurat")`
* [lisaClust](https://www.bioconductor.org/packages/release/bioc/html/lisaClust.html) version `r packageVersion("lisaClust")`
* [caret](https://topepo.github.io/caret/) version `r packageVersion("caret")`

Data visualization:

* [cytomapper](https://github.com/BodenmillerGroup/cytomapper) version `r packageVersion("cytomapper")` from [GitHub](https://github.com/BodenmillerGroup/cytomapper)
* [dittoSeq](https://bioconductor.org/packages/release/bioc/html/dittoSeq.html) version `r packageVersion("dittoSeq")`

Tidy R:

* [tidyverse](https://www.tidyverse.org/) version `r packageVersion("tidyverse")`

## Image processing {#image-processing}

The analysis presented here fully relies on packages written in the programming
language `R` and primarily focuses on analysis approaches downstream of image
processing. The example data available at
[https://zenodo.org/record/5949116](https://zenodo.org/record/5949116) were
processed (file type conversion, image segmentation, feature extraction as
explained in Section \@ref(processing)) using the
[steinbock](https://bodenmillergroup.github.io/steinbock/latest/) framework. The
exact command line interface calls to process the raw data are shown below:

```{r, echo = FALSE, message = FALSE}
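# Create the output folders (and the parent "data" folder if needed)
dir.create("data/steinbock", recursive = TRUE, showWarnings = FALSE)
dir.create("data/ImcSegmentationPipeline", recursive = TRUE, showWarnings = FALSE)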
# Pre-download steinbock file
download.file("https://zenodo.org/record/6642699/files/steinbock.sh", 
              "data/steinbock/steinbock.sh")
```

```{bash, file="data/steinbock/steinbock.sh", eval=FALSE}

```

## Download example data {#download-data}

Throughout this tutorial, we will access a number of different data types. 
To declutter the analysis scripts, we will download all required data here.

To highlight the basic steps of IMC data analysis, we provide example data that
were acquired as part of the **I**ntegrated i**MMU**noprofiling of large adaptive
**CAN**cer patient cohorts project ([immucan.eu](https://immucan.eu/)). The
raw data of 4 patients can be accessed online at 
[zenodo.org/record/5949116](https://zenodo.org/record/5949116). The
sample/patient metadata can be downloaded as follows:

```{r download-sample-data}
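# mode = "wb" requests a binary transfer so the .xlsx file is not corrupted on Windows
download.file("https://zenodo.org/record/5949116/files/sample_metadata.xlsx", 
              destfile = "data/sample_metadata.xlsx", mode = "wb")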
```
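When running the download steps of this section interactively, two practical points are worth keeping in mind: large archives can exceed R's default download timeout of 60 seconds, and files that already exist do not need to be fetched again. The following is a minimal sketch addressing both. The helper `download_if_missing` and its `fetch` argument are hypothetical and not part of any package.

```r
# Raise the download timeout (in seconds); R's default is 60 seconds.
options(timeout = 10000)

# Hypothetical helper: download a file only if it is not already present.
# `fetch` defaults to utils::download.file and can be swapped out for testing.
download_if_missing <- function(url, destfile, fetch = utils::download.file) {
    if (file.exists(destfile)) {
        return(invisible(FALSE))   # nothing was downloaded
    }
    # Create the target directory if it does not exist yet
    dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" requests a binary transfer, which matters on Windows
    fetch(url, destfile, mode = "wb")
    invisible(TRUE)                # the file was downloaded
}
```

For example, `download_if_missing("https://zenodo.org/record/5949116/files/sample_metadata.xlsx", "data/sample_metadata.xlsx")` would fetch the metadata file on the first call and skip the download on later calls.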

### Processed multiplexed imaging data

The IMC raw data were processed using either the 
[steinbock](https://github.com/BodenmillerGroup/steinbock) framework or the
[IMC Segmentation Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline).
Image processing included file type conversion, cell segmentation and feature
extraction. 

**steinbock output**

The output of the `steinbock` framework required for the analysis presented
here includes the single-cell mean intensity files, the single-cell
morphological features and spatial locations, spatial object graphs in the form
of edge lists indicating cells in close proximity, hot pixel filtered
multi-channel images, segmentation masks, image metadata and channel metadata.
All these files will be downloaded here for later use. The commands used to
generate these data can be found in `data/steinbock/steinbock.sh`.

```{r steinbock-results}
# download intensities
url <- "https://zenodo.org/record/6642699/files/intensities.zip"
destfile <- "data/steinbock/intensities.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
unlink(destfile)

# download regionprops
url <- "https://zenodo.org/record/6642699/files/regionprops.zip"
destfile <- "data/steinbock/regionprops.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
unlink(destfile)


# download neighbors
url <- "https://zenodo.org/record/6642699/files/neighbors.zip"
destfile <- "data/steinbock/neighbors.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
unlink(destfile)

# download images
url <- "https://zenodo.org/record/6642699/files/img.zip"
destfile <- "data/steinbock/img.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
unlink(destfile)

# download masks
url <- "https://zenodo.org/record/6642699/files/masks_deepcell.zip"
destfile <- "data/steinbock/masks_deepcell.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/steinbock", overwrite=TRUE)
unlink(destfile)

# download individual files
download.file("https://zenodo.org/record/6642699/files/panel.csv", 
              "data/steinbock/panel.csv")
download.file("https://zenodo.org/record/6642699/files/images.csv", 
              "data/steinbock/images.csv")
download.file("https://zenodo.org/record/6642699/files/steinbock.sh", 
              "data/steinbock/steinbock.sh")
```
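Once the downloads have finished, a quick check can confirm that the expected folders and files are in place. The sketch below is a minimal example. The helper `check_steinbock_output` is hypothetical, and the folder names are an assumption: each archive is assumed to unpack into a folder named after the archive itself.

```r
# Hypothetical helper: report which expected steinbock outputs are missing.
# Folder names are assumed to match the names of the downloaded archives.
check_steinbock_output <- function(path) {
    expected <- c("intensities", "regionprops", "neighbors", "img",
                  "masks_deepcell", "panel.csv", "images.csv", "steinbock.sh")
    # file.exists() returns TRUE for both files and directories
    expected[!file.exists(file.path(path, expected))]
}

missing_files <- check_steinbock_output("data/steinbock")
if (length(missing_files) > 0) {
    message("Missing steinbock output: ",
            paste(missing_files, collapse = ", "))
}
```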

**IMC Segmentation Pipeline output**

The example data were also processed using the 
[IMC Segmentation Pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline) (version 3). 
To highlight the use of the reader function for this type of output, we will need
to download the `cpout` folder which is part of the `analysis` folder. The `cpout`
folder stores all relevant output files of the pipeline. For a full description
of the pipeline, please refer to the [docs](https://bodenmillergroup.github.io/ImcSegmentationPipeline/).

```{r imcsegpipe-results}
# download analysis folder
url <- "https://zenodo.org/record/6449127/files/analysis.zip"
destfile <- "data/ImcSegmentationPipeline/analysis.zip"
download.file(url, destfile)
unzip(destfile, exdir="data/ImcSegmentationPipeline", overwrite=TRUE)
unlink(destfile)

unlink("data/ImcSegmentationPipeline/analysis/cpinp/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/crops/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/histocat/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/ilastik/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/ometiff/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/cpout/images/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/cpout/probabilities/", recursive=TRUE)
unlink("data/ImcSegmentationPipeline/analysis/cpout/masks/", recursive=TRUE)
```

### Files for spillover matrix estimation

To highlight the estimation and correction of channel spillover as described by
[@Chevrier2017], we can access an example spillover acquisition from:

```{r download-spillover-data}
download.file("https://zenodo.org/record/5949116/files/compensation.zip",
              "data/compensation.zip")
unzip("data/compensation.zip", exdir="data", overwrite=TRUE)
unlink("data/compensation.zip")
```

### Gated cells

In Section \@ref(classification), we present a cell type classification approach
that relies on previously gated cells. These ground truth data are available
online at [zenodo.org/record/7079294](https://zenodo.org/record/7079294) and
will be downloaded here for later use:

```{r download-gated-cells}
download.file("https://zenodo.org/record/7079294/files/gated_cells.zip",
              "data/gated_cells.zip")
unzip("data/gated_cells.zip", exdir="data", overwrite=TRUE)
unlink("data/gated_cells.zip")
```

## Software versions {#sessionInfo}

<details>
   <summary>SessionInfo</summary>
   
```{r, echo = FALSE, message = FALSE}
sessionInfo()
```
</details>