Many single-cell and single-nucleus datasets are publicly available. Some of these datasets are used repeatedly to develop new methods, demonstrate new tools, and illustrate tutorials. For example, 10x Genomics provides various publicly available datasets for free download on their website (https://www.10xgenomics.com/resources/datasets), and many of these are commonly used in single-cell tutorials and vignettes.
To avoid reinventing the wheel, we introduce an alevin-fry workflow written in Nextflow (https://github.com/COMBINE-lab/10x-requant) that can quantify an arbitrary number of single-cell RNA-sequencing projects with one command. The workflow takes as input three required spreadsheets under the input_files directory:
- sample_sheet.tsv: This spreadsheet records the detailed information of the datasets to be processed. Please refer to the provided sheet for examples.
- ref_sheet.tsv: This spreadsheet lists the references from which the alevin-fry splici references are built. Currently, each reference must be a pre-built Cell Ranger reference, for example, human2020A and mm10-2020A. Every reference specified in the reference column of sample_sheet.tsv must be included in ref_sheet.tsv; otherwise, the workflow cannot determine which reference the reads should be mapped against.
- pl_sheet.tsv: This spreadsheet records the permit lists that alevin-fry uses for detecting and correcting cellular barcodes, for example, the permit lists for the 10x Chromium V2 and V3 chemistries. Every chemistry specified in the chemistry column of sample_sheet.tsv must be included in pl_sheet.tsv.
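The consistency requirements between the three spreadsheets can be checked before launching the workflow. The following is a minimal sketch and not part of the workflow itself. It assumes the sheets are tab-separated with a header row, and that the key columns of ref_sheet.tsv and pl_sheet.tsv are named reference and chemistry, respectively. Those two column names, and the function names, are assumptions made for illustration, so adjust them to match the provided sheets.

```python
import csv
from pathlib import Path


def read_column(tsv_path, column):
    """Return the set of non-empty values in one column of a TSV file."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {tsv_path}")
        values = ((row[column] or "").strip() for row in reader)
        return {value for value in values if value}


def check_input_sheets(input_dir="input_files",
                       ref_key="reference",   # assumed key column of ref_sheet.tsv
                       pl_key="chemistry"):   # assumed key column of pl_sheet.tsv
    """Report sample_sheet.tsv entries that the other two sheets do not cover.

    Returns a dict with the missing references and the missing chemistries,
    each as a sorted list. Two empty lists mean the sheets are consistent.
    """
    input_dir = Path(input_dir)
    sample_sheet = input_dir / "sample_sheet.tsv"

    used_refs = read_column(sample_sheet, "reference")
    used_chems = read_column(sample_sheet, "chemistry")
    known_refs = read_column(input_dir / "ref_sheet.tsv", ref_key)
    known_chems = read_column(input_dir / "pl_sheet.tsv", pl_key)

    return {
        "reference": sorted(used_refs - known_refs),
        "chemistry": sorted(used_chems - known_chems),
    }
```

Calling check_input_sheets() from the directory that contains input_files returns the references and chemistries that still need to be added to ref_sheet.tsv and pl_sheet.tsv.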
With the three required spreadsheets in place, the workflow downloads the references, builds the splici references and indices, and runs the alevin-fry pipeline for each dataset listed in sample_sheet.tsv.
The outputs of this workflow are written to the nf_pipeline/alevin_fry folder, which contains the quantification results of all processed datasets. For simplicity, each quantification folder is named by the MD5 sum of the dataset. The name and URL of the dataset can be found in the dataset_description.txt file in each quantification result folder.
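Because the folder names are MD5 sums, it can help to build a small index from folder name to dataset description. The sketch below only assumes the layout described above, that is, one subfolder per dataset under nf_pipeline/alevin_fry, each holding a dataset_description.txt file. The internal format of that file is not assumed, so its text is returned unparsed. The function name index_quant_results is hypothetical.

```python
from pathlib import Path


def index_quant_results(result_dir="nf_pipeline/alevin_fry"):
    """Map each quantification folder name to the text of its description file.

    Folders without a dataset_description.txt file are skipped, and a missing
    result directory yields an empty index.
    """
    result_dir = Path(result_dir)
    index = {}
    if not result_dir.is_dir():
        return index
    for folder in sorted(result_dir.iterdir()):
        description = folder / "dataset_description.txt"
        if folder.is_dir() and description.is_file():
            index[folder.name] = description.read_text().strip()
    return index
```

Iterating over index_quant_results().items() and printing each pair gives a quick overview of which MD5-named folder corresponds to which dataset.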
Using this workflow, we have collected and processed a number of datasets from the 10x Genomics website. For more information, please see the quantaf webpage. The links to the quantification results generated by alevin-fry are provided below:
- 500 Human PBMCs, 3' LT v3.1, Chromium Controller: link to the quant result
- 500 Human PBMCs, 3' LT v3.1, Chromium X: link to the quant result
- 1k PBMCs from a Healthy Donor (v3 chemistry): link to the quant result
- 10k PBMCs from a Healthy Donor (v3 chemistry): link to the quant result
- 10k Human PBMCs, 3' v3.1, Chromium X: link to the quant result
- 10k Human PBMCs, 3' v3.1, Chromium Controller: link to the quant result
- 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Single Indexed: link to the quant result
- 10k Peripheral blood mononuclear cells (PBMCs) from a healthy donor, Dual Indexed: link to the quant result
- 20k Human PBMCs, 3' HT v3.1, Chromium X: link to the quant result
- PBMCs from EDTA-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
- PBMCs from Heparin-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
- PBMCs from ACD-A Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
- PBMCs from Citrate-Treated Blood Collection Tubes Isolated via SepMate-Ficoll Gradient (3' v3.1 Chemistry): link to the quant result
- PBMCs from Citrate-Treated Cell Preparation Tubes (3' v3.1 Chemistry): link to the quant result
- PBMCs from a Healthy Donor: Whole Transcriptome Analysis: link to the quant result
- Whole Blood RBC Lysis for PBMCs and Neutrophils, Granulocytes, 3': link to the quant result
- Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 5): link to the quant result
- Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Manual (channel 1): link to the quant result
- Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 5): link to the quant result
- Peripheral blood mononuclear cells (PBMCs) from a healthy donor - Chromium Connect (channel 1): link to the quant result
- Hodgkin's Lymphoma, Dissociated Tumor: Whole Transcriptome Analysis: link to the quant result
- 200 Sorted Cells from Human Glioblastoma Multiforme, 3’ LT v3.1: link to the quant result
- 750 Sorted Cells from Human Invasive Ductal Carcinoma, 3’ LT v3.1: link to the quant result
- 2k Sorted Cells from Human Glioblastoma Multiforme, 3’ v3.1: link to the quant result
- 7.5k Sorted Cells from Human Invasive Ductal Carcinoma, 3’ v3.1: link to the quant result
- Human Glioblastoma Multiforme: 3’v3 Whole Transcriptome Analysis: link to the quant result
- 1k Brain Cells from an E18 Mouse (v3 chemistry): link to the quant result
- 10k Brain Cells from an E18 Mouse (v3 chemistry): link to the quant result
- 1k Heart Cells from an E18 mouse (v3 chemistry): link to the quant result
- 10k Heart Cells from an E18 mouse (v3 chemistry): link to the quant result
- 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Single Indexed: link to the quant result
- 10k Mouse E18 Combined Cortex, Hippocampus and Subventricular Zone Cells, Dual Indexed: link to the quant result
- 1k PBMCs from a Healthy Donor (v2 chemistry): link to the quant result
- 1k Brain Cells from an E18 Mouse (v2 chemistry): link to the quant result
- 1k Heart Cells from an E18 mouse (v2 chemistry): link to the quant result