Differing read count number in simplex and duplex basecalling #474

Open
dpaudel-tb opened this issue Nov 15, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@dpaudel-tb

Hello,
I ran simplex and duplex basecalling on the same dataset (dorado-0.4.1-linux-x64 with [email protected]). I expected the simplex run to yield the same number of reads as the duplex run filtered to its simplex reads (dx:i:0 and dx:i:-1). However, the reported read counts differ. Is this expected, and which simplex reads should be trusted: those from direct simplex basecalling, or those filtered out of the duplex basecalling output?
Thanks

| File | Read count | Tags included |
| --- | --- | --- |
| simplex.bam | 11,130,442 | |
| duplex.bam | 13,130,264 | |
| filtered_duplex_only.bam | 1,966,966 | dx:i:1 |
| filtered_simplex_only.bam | 11,163,298 | dx:i:0; dx:i:-1 |
| filtered_simplex_NoDuplex_i0.bam | 7,980,822 | dx:i:0 |
| filtered_simplex_WithDuplex_i-1.bam | 3,182,476 | dx:i:-1 |
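
As a quick sanity check, the counts in the table above can be added up to confirm that the three dx categories account for every read in duplex.bam, and to size the gap against the direct simplex run. This is only arithmetic on the reported numbers; the dictionary keys and variable names are illustrative.

```python
# Read counts copied from the table in this issue.
counts = {
    "simplex.bam": 11_130_442,
    "duplex.bam": 13_130_264,
    "dx:i:1": 1_966_966,    # duplex reads
    "dx:i:0": 7_980_822,    # simplex reads without a duplex pairing
    "dx:i:-1": 3_182_476,   # simplex reads that went into a duplex pairing
}

# Simplex reads recovered from the duplex run.
simplex_from_duplex = counts["dx:i:0"] + counts["dx:i:-1"]

# All three dx categories together should equal the duplex.bam total.
total_from_tags = simplex_from_duplex + counts["dx:i:1"]

# Extra simplex reads in the duplex run relative to the direct simplex run.
extra_simplex = simplex_from_duplex - counts["simplex.bam"]

print(simplex_from_duplex)                      # 11163298
print(total_from_tags == counts["duplex.bam"])  # True
print(extra_simplex)                            # 32856
```

So the filtered files are internally consistent, and the duplex run reports 32,856 more simplex reads than the direct simplex run (about 0.3%).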
@tijyojwad
Collaborator

Hi @dpaudel-tb - we have slightly different read splitting configurations for simplex vs duplex basecalling, so a different number of reads can end up being split in each case. That's most likely the root cause of this count discrepancy. I would suggest you go with the simplex reads from the duplex run, i.e. dx:i:0 plus dx:i:-1.

We'll look at harmonizing the options between the 2 cases.
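
For anyone wanting to reproduce the per-tag counts, here is a minimal sketch that tallies the `dx` tag from SAM-formatted text lines. It assumes the records are available as plain text (for example from `samtools view duplex.bam`) and that `dx` is written as an integer tag, as in the table above. The function name `count_dx_tags` and the example records are hypothetical, not part of dorado.

```python
from collections import Counter
from typing import Iterable


def count_dx_tags(sam_lines: Iterable[str]) -> Counter:
    """Count SAM records per dx:i value; records without the tag count as 'missing'."""
    counts: Counter = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        dx = "missing"
        # Optional tags start after the 11 mandatory SAM columns.
        for tag in fields[11:]:
            if tag.startswith("dx:i:"):
                dx = tag[len("dx:i:"):]
                break
        counts[dx] += 1
    return counts


# Made-up unaligned records, for illustration only.
example = [
    "@HD\tVN:1.6\tSO:unknown",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tdx:i:0",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tdx:i:-1",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tdx:i:1",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tdx:i:0",
]

tag_counts = count_dx_tags(example)
simplex_total = tag_counts["0"] + tag_counts["-1"]
print(dict(tag_counts))
print(simplex_total)  # 3
```

On real data the same function could be fed `sys.stdin` with the output of `samtools view duplex.bam` piped in; summing the `0` and `-1` entries then gives the simplex read count from the duplex run.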

@tijyojwad tijyojwad added the bug Something isn't working label Nov 16, 2023
@dpaudel-tb
Author

Thank you @tijyojwad!
