How to use --read-geometry when specifying the start and end of both cDNA reads in a pair? #960

Open
pchaturvedi-takara opened this issue Sep 11, 2024 · 0 comments

Comments

@pchaturvedi-takara

Hi Salmon team,
First of all, thank you so much for building the fantastic lightweight alignment and quantification tools "Salmon & AlevinFry".

I have a question regarding the use of the --read-geometry parameter in salmon alevin:

I have paired-end scRNA-seq data in which the barcode [16] and UMI [10] are attached to cDNA read 1. I am trying to use salmon, but I am confused about how to specify the read geometry parameter.
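To make the layout concrete, here is a minimal sketch that computes how many bases each geometry piece covers. It assumes positions are 1-based and inclusive; `parse_piece` is a hypothetical helper written for this illustration and is not part of salmon. The geometry strings are the ones used in the commands below.

```python
import re

# Hypothetical helper (not part of salmon): parses one piece such as "1[17-27]".
PIECE = re.compile(r"^(\d)\[(\d+)-(\d+|end)\]$")

def parse_piece(piece):
    """Return (read number, start, stop, length) for one geometry piece."""
    m = PIECE.match(piece)
    if m is None:
        raise ValueError(f"unrecognized geometry piece: {piece!r}")
    read, start, stop = int(m.group(1)), int(m.group(2)), m.group(3)
    # Assumption: positions are 1-based and inclusive, so length = stop - start + 1.
    # An open-ended piece ("end") has no fixed length, reported as None.
    length = None if stop == "end" else int(stop) - start + 1
    return read, start, stop, length

for flag, piece in [("--bc-geometry", "1[1-16]"),
                    ("--umi-geometry", "1[17-27]"),
                    ("--read-geometry", "1[28-end]")]:
    read, start, stop, length = parse_piece(piece)
    print(flag, piece, "-> read", read, "length", length)
```

Under that assumption, 1[1-16] covers 16 bases, matching the barcode, while 1[17-27] covers 11 bases even though the UMI is described as 10 bases long, so the UMI range may be worth double-checking.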

I tried two ways and got different statistics each time, so a clarification from you would be very helpful.

Run1 command: salmon alevin -i hg38_splici_idx_RL_75/ -p 16 -l A --sketch -1 Read1.fastq.gz -2 Read2.fastq.gz -o output --tgMap transcriptome_splici_fl70_t2g.tsv --noDedup --bc-geometry 1[1-16] --umi-geometry 1[17-27] --read-geometry 1[28-end],2[1-end]

Output:
[2024-09-11 16:17:20.192] [jointLog] [info] Number uniquely mapped : 4856777
[2024-09-11 16:17:20.390] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2024-09-11 16:17:20.390] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2024-09-11 16:17:20.390] [jointLog] [info] Selectively-aligned 10466950 total fragments out of 159997960
[2024-09-11 16:17:20.390] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 0

Run2 command: salmon alevin -i hg38_splici_idx_RL_75/ -p 16 -l A --sketch -1 Read1.fastq.gz -2 Read2.fastq.gz -o output --tgMap transcriptome_splici_fl70_t2g.tsv --noDedup --bc-geometry 1[1-16] --umi-geometry 1[17-27] --read-geometry 1[28-end] 2[1-end]

Output:
[2024-09-11 16:39:31.848] [jointLog] [info] Number uniquely mapped : 53335563
[2024-09-11 16:39:31.964] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2024-09-11 16:39:31.964] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2024-09-11 16:39:31.965] [jointLog] [info] Selectively-aligned 134356089 total fragments out of 159997960
[2024-09-11 16:39:31.965] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 0
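
For a like-for-like comparison of the two runs, the fraction of fragments aligned can be read off the "Selectively-aligned" log lines. The following is a small sketch; the log lines are copied from the output above, and `mapping_rate` is a hypothetical helper, not a salmon utility.

```python
import re

# Log lines copied verbatim from the two runs above.
LOG_LINES = {
    "Run1": "[jointLog] [info] Selectively-aligned 10466950 total fragments out of 159997960",
    "Run2": "[jointLog] [info] Selectively-aligned 134356089 total fragments out of 159997960",
}

PATTERN = re.compile(r"Selectively-aligned (\d+) total fragments out of (\d+)")

def mapping_rate(line):
    """Return aligned / total fragments parsed from one log line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("no 'Selectively-aligned' summary found in line")
    aligned, total = (int(g) for g in m.groups())
    return aligned / total

for run, line in LOG_LINES.items():
    print(f"{run}: {mapping_rate(line):.2%}")
```

This works out to roughly 6.54% of fragments aligned in Run1 and 83.97% in Run2.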

Can you please help me understand which method is correct, i.e. which one makes Salmon read both reads of the pair with the specified start and end? In Run2 the uniquely mapped reads are ~53 million, versus ~4.9 million in Run1. Also, I get "Counted 0 total reads in the equivalence classes" in both cases; is that normal?
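
One observable difference between the two commands is how the --read-geometry value is split into arguments before the program sees it. The sketch below uses Python's `shlex.split` as a stand-in for whitespace tokenization by a POSIX shell. It does not model glob expansion of the bracket characters, and how salmon treats the extra token in Run2 is not established here.

```python
import shlex

# The tail of each command, copied from Run1 and Run2 above.
run1_tail = "--read-geometry 1[28-end],2[1-end]"
run2_tail = "--read-geometry 1[28-end] 2[1-end]"

run1_tokens = shlex.split(run1_tail)
run2_tokens = shlex.split(run2_tail)

# Run1: the flag receives one value containing both pieces.
print(len(run1_tokens), run1_tokens)
# Run2: the space makes "2[1-end]" a separate token after the flag's value.
print(len(run2_tokens), run2_tokens)
```

With the comma, the flag is followed by a single value, "1[28-end],2[1-end]". With the space, the flag is followed by "1[28-end]" and "2[1-end]" arrives as a separate argument.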

Thank you!
