This repository provides a basic analysis workflow for flow cytometry data, intended for members of our research lab and for anyone else who needs:
- anomalous event detection and removal
- batch effect correction
- clustering
The main packages used in the current workflow are:
- tidyverse
- flowCore for handling flow cytometry data, link for paper
- flowAI for quality control, link for paper
- cyCombine for batch effect correction, link for paper
- FlowSOM for clustering, link for paper
Documentation, vignettes and further information on these packages are available via the links above. Some functions also rely on other packages, such as umap, patchwork and Biobase.
To learn how to handle, clean, compensate, gate, transform and plot flow cytometry data in R, I recommend this tutorial from Christopher Hall and the following page.
Marker expression data must be transformed before batch effect correction; this workflow applies the asinh (inverse hyperbolic sine) transformation. Choosing an appropriate cofactor for the transformation is a crucial step. The following sources may help in choosing a cofactor: 1. 2. 3.
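To make the role of the cofactor concrete, here is a minimal base R sketch of the transformation, `asinh(x / cofactor)`. The helper `asinh_transform`, the intensity values and the two cofactors are hypothetical and chosen only for illustration; they are not taken from this repository or recommended as defaults.

```r
# Hypothetical raw intensities for one marker (not real data).
raw <- c(-50, 0, 5, 150, 5000)

# Hypothetical helper: scale by the cofactor, then apply base R's asinh().
asinh_transform <- function(x, cofactor) {
  asinh(x / cofactor)
}

# Illustrative cofactors only; the right value depends on instrument and panel.
low_cofactor  <- asinh_transform(raw, cofactor = 5)
high_cofactor <- asinh_transform(raw, cofactor = 150)

# sinh() inverts the transformation, so the raw scale can be recovered.
recovered <- sinh(high_cofactor) * 150

# Compare the two transformed scales side by side.
print(round(rbind(raw, low_cofactor, high_cofactor), 2))
```

Because asinh is approximately linear for values much smaller than the cofactor and approximately logarithmic for values much larger, a larger cofactor widens the near-linear region around zero. Plotting the transformed values for a few candidate cofactors is a simple way to compare them.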
cyCombine works on data stored as a tibble, so .fcs files or flowSets have to be converted first. The conversion needs two additional metadata files (metadata_cC.xlsx and panel_cC.xlsx). Examples of how to prepare these files are shown in metadata_preparation_for_cyCombine.pdf. For more information, please see the reference manual of cyCombine.
The save_as_ff function converts the output of batch effect correction into a flowFrame, and the save_as_fcs function writes it to a .fcs file; both build on code from Yann Abraham: link to the original. Alternatively, the output can be saved as a .csv file and then converted into a .fcs file, for which this might be a useful tool.
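For orientation, the general idea of turning a table of expression values into a flowFrame and a .fcs file can be sketched with flowCore alone. This is not the implementation of save_as_ff or save_as_fcs. The `corrected` table, its marker names and the output file name are hypothetical, and the sketch assumes that non-numeric columns (for example sample or batch labels) have already been dropped.

```r
library(flowCore)

# Hypothetical corrected expression table: one row per cell, one column per marker.
corrected <- data.frame(
  CD3 = c(0.1, 1.2, 2.3),
  CD4 = c(0.5, 0.7, 1.9)
)

# flowFrame() expects a numeric matrix with column names.
ff <- flowFrame(exprs = as.matrix(corrected))

# Write the flowFrame to a temporary .fcs file (hypothetical file name).
out_file <- file.path(tempdir(), "corrected_example.fcs")
write.FCS(ff, filename = out_file)
```

A flowFrame built this way carries only minimal parameter metadata, so marker descriptions and other keywords from the original files have to be added separately if downstream tools need them.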