update info text + minor help changes (#10)
* update info text + minor help changes
* remove extra space
Showing 5 changed files with 65 additions and 7 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
include kb_python/info.txt | ||
include kb_python/whitelists/* | ||
recursive-include kb_python/bins * |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
INFO_FILENAME = 'info.txt' | ||
|
||
# Default filenames | ||
CDNA_FILENAME = 'cdna.fa' | ||
INTRON_FILENAME = 'introns.fa' | ||
|
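The new `INFO_FILENAME` constant, together with the `include kb_python/info.txt` manifest entry, suggests the help text is shipped as a package data file and read at runtime. A minimal sketch of that pattern follows; the `read_info` helper and its `package_dir` parameter are hypothetical names used for illustration, not code taken from this commit.

```python
import os

# Mirrors the constant added in this diff.
INFO_FILENAME = 'info.txt'


def read_info(package_dir, filename=INFO_FILENAME):
    """Return the text of the info file stored under package_dir.

    Hypothetical helper: the real package may locate and read the
    file differently.
    """
    path = os.path.join(package_dir, filename)
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()
```

Inside a package, `package_dir` would typically be `os.path.dirname(os.path.abspath(__file__))`, which is why the file has to be listed in the manifest to survive a source distribution.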
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
kb is a Python package for rapidly pre-processing single-cell RNA-seq data. It is a wrapper for the methods described in: | ||
|
||
Melsted, Booeshaghi et al., Modular and efficient pre-processing of single-cell RNA-seq, bioRxiv, 2019 | ||
|
||
The goal of the wrapper is to simplify downloading and running the kallisto [1] and bustools [2] programs. It was inspired by Sten Linnarsson’s loompy fromfq command (http://linnarssonlab.org/loompy/kallisto/index.html). | ||
|
||
The kb program consists of two parts: | ||
|
||
The `kb ref` command builds or downloads a species-specific index for pseudoalignment of reads. It must be run before `kb count`, and it runs `kallisto index` [1]. | ||
|
||
The `kb count` command runs the kallisto [1] and bustools [2] programs. It can be used to pre-process data from a variety of single-cell RNA-seq technologies, and supports several workflows (e.g. production of gene count matrices, RNA velocity analyses). The output can be saved in a variety of formats, including mtx and loom. Examples are provided below. | ||
|
||
|
||
Examples | ||
======== | ||
(1) kb ref -i transcriptome.idx -g transcripts_to_genes.txt -f1 cdna.fa Mus_musculus.GRCm38.dna.primary_assembly.fa Mus_musculus.GRCm38.98.gtf | ||
(2) kb count -i transcriptome.idx -g transcripts_to_genes.txt -x 10xv2 -o output --loom Reads1.fastq.gz Reads2.fastq.gz | ||
Build a kallisto index and transcripts-to-genes mapping for the mouse transcriptome, which is generated from the provided genomic FASTA and GTF. Then, generate count matrices with the built index and transcripts-to-genes mapping. Convert the final count matrix to a .loom file. | ||
|
||
(1) kb ref -i transcriptome.idx -g transcripts_to_genes.txt -f1 cdna.fa Mus_musculus.GRCm38.dna.primary_assembly.fa Mus_musculus.GRCm38.98.gtf | ||
(2) kb count -i transcriptome.idx -g transcripts_to_genes.txt -x 10xv2 -w 10xv2_whitelist -o output --h5ad Reads1.fastq.gz Reads2.fastq.gz | ||
Build a kallisto index and transcripts-to-genes mapping for the mouse transcriptome, which is generated from the provided genomic FASTA and GTF. Then, generate count matrices with the built index, transcripts-to-genes mapping, and provided whitelist. Convert the final count matrix to a .h5ad file. | ||
|
||
(1) kb ref -i transcriptome.idx -g transcripts_to_genes.txt -d mouse | ||
(2) kb count -i transcriptome.idx -g transcripts_to_genes.txt -x 10xv2 -o output Reads1.fastq.gz Reads2.fastq.gz | ||
Instead of building a kallisto index locally, download a pre-built index and transcripts-to-genes mapping. Then, generate count matrices with the downloaded index and mapping. | ||
|
||
(1) kb ref -i transcriptome.idx -g transcripts_to_genes.txt -f1 cdna.fa -f2 introns.fa -c1 cdna_transcripts_to_capture.txt -c2 intron_transcripts_to_capture.txt --lamanno Mus_musculus.GRCm38.dna.primary_assembly.fa Mus_musculus.GRCm38.98.gtf | ||
(2) kb count -i transcriptome.idx -g transcripts_to_genes.txt -x 10xv2 -o output -c1 cdna_transcripts_to_capture.txt -c2 intron_transcripts_to_capture.txt --lamanno Reads1.fastq.gz Reads2.fastq.gz | ||
Prepare files (kallisto index, transcripts-to-genes mapping, cDNA transcripts to capture, intron transcripts to capture) for RNA velocity based on the logic of La Manno et al. 2018. Then, calculate RNA velocity using the prepared files. | ||
|
||
|
||
References | ||
========== | ||
[1] Bray, N. L., Pimentel, H., Melsted, P., & Pachter, L. (2016). Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology, 34(5), 525. | ||
[2] Melsted, P., Booeshaghi, A. S., Gao, F., da Veiga Beltrame, E., Lu, L., Hjorleifsson, K. E., ... & Pachter, L. (2019). Modular and efficient pre-processing of single-cell RNA-seq. bioRxiv, 673285. |
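The `kb count` examples in the info text all share the same shape: index, transcripts-to-genes mapping, technology, an optional whitelist, an output directory, an optional output format, then the read files. As a rough illustration, the sketch below assembles such a command line in Python. The `build_kb_count` function and its parameter names are hypothetical and not part of kb; only flags that appear in the examples above (`-i`, `-g`, `-x`, `-w`, `-o`, `--loom`, `--h5ad`) are used.

```python
import shlex


def build_kb_count(index, t2g, technology, out_dir, fastqs,
                   whitelist=None, output_format=None):
    """Assemble a `kb count` argument list (hypothetical helper).

    output_format may be None, 'loom' or 'h5ad', matching the flags
    shown in the info text examples.
    """
    cmd = ['kb', 'count', '-i', index, '-g', t2g, '-x', technology]
    if whitelist is not None:
        cmd += ['-w', whitelist]
    cmd += ['-o', out_dir]
    if output_format in ('loom', 'h5ad'):
        cmd.append('--' + output_format)
    elif output_format is not None:
        raise ValueError('unsupported output format: %r' % output_format)
    cmd += list(fastqs)
    return cmd


# Example: the first `kb count` invocation from the info text.
example = build_kb_count(
    'transcriptome.idx', 'transcripts_to_genes.txt', '10xv2', 'output',
    ['Reads1.fastq.gz', 'Reads2.fastq.gz'], output_format='loom')
# Quote each argument so the string is safe to paste into a shell.
example_line = ' '.join(shlex.quote(arg) for arg in example)
```

Returning a list rather than a single string keeps the result usable with `subprocess.run(cmd, check=True)` without shell quoting concerns.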