Very few duplex reads with short amplicons #268
Comments
A couple of questions:
|
Hi @vellamike
|
Hi @HenkvdMeulen, Dorado has a known issue whereby duplex basecalling of short reads (<1kb) results in low yields. This is a result of the duplex algorithm, which has been optimised for longer reads. We are working on a fix and will update this issue when we release it, but at present what you are trying to do will unfortunately not work very well. Mike |
Hi @vellamike, Best regards, |
Hi @vellamike, Best regards, |
Hi @vellamike, I'm also trying to perform duplex basecalling on reads < 1kb. It is tagging most reads with dx:i:0 (simplex without duplex offspring) and some reads with dx:i:-1 (simplex reads which have duplex offspring), but no reads actually get tagged with dx:i:1 (duplex). Shouldn't it produce some duplex reads tagged with dx:i:1 if there are dx:i:-1 simplex reads present? Thanks |
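To check how the three dx values described above are distributed in a file, a tally over the SAM text is enough. The following is a minimal sketch, not part of Dorado: `dx_tally` is a hypothetical helper name, and it assumes the `dx` tag is written as an integer tag (`dx:i:...`) in the optional fields, as quoted in this thread.

```shell
# Hypothetical helper (not a Dorado or samtools command).
# Reads SAM text on stdin and tallies reads by dx tag value.
# Header lines start with "@"; optional tags start at column 12.
dx_tally() {
  awk -F '\t' '
    /^@/ { next }
    {
      tag = "missing"
      for (i = 12; i <= NF; i++) {
        # Match only a complete dx integer tag field, e.g. dx:i:-1
        if ($i ~ /^dx:i:-?[0-9]+$/) { tag = substr($i, 6); break }
      }
      count[tag]++
    }
    END {
      printf "duplex(1)=%d simplex_no_offspring(0)=%d simplex_with_offspring(-1)=%d missing=%d\n", count["1"], count["0"], count["-1"], count["missing"]
    }'
}
```

It could be used as `samtools view calls.bam | dx_tally`, where `calls.bam` stands in for the duplex output file. If the tally shows dx:i:-1 reads but no dx:i:1 reads, that matches the symptom described in this comment.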
Hi @quintonwessells, which version of dorado are you using? Your issue sounds like a bug we fixed in 0.3.4. We haven't made any explicit improvements for short read duplex yet. We made some updates to long read duplex yield in the most recent release, and we are in the process of discussing short read related changes. |
Hi @tijyojwad, I was using 0.3.3. I just tried it with 0.3.4 and now I'm not getting any dx:i:-1 or dx:i:1 reads, so it seems that is sort of fixed, even though I can tell some of the reads marked as simplex are actually duplex. Thanks for keeping us posted on the status of short read related changes! |
what do you mean by this? |
I think I might be running into the same length limitation. I was playing around with forcing dorado to duplex call reads I knew were complementary to each other by passing the --pairs flag, but it appears that the reads are filtered. Below is a partial output of my experimental run. Is dorado filtering by length, or is there another criterion I'm missing that it's using to reject this pair? [2023-09-07 11:26:51.844] [info] > Starting Stereo Duplex pipeline |
Unless the min-qscore option was specified in the cmdline, the only other filter enabled by default is the min sequence length, which is |
Thanks for the reply! According to dorado's help file for duplex, the min-qscore default is 0 so I never altered it; does that mean the filter is off or is it behaving in some other way? I also don't see a minimum sequence length flag in the help documentation, is this an actual option I can set? |
Correct, min qscore 0 means no filtering happens based on the read's mean qscore.
Unfortunately not. But that minimum length is 5, which we thought was a good threshold; anything smaller than that is most likely garbage anyway. From your log I'd expect the duplex read to be ~200 bp or longer... |
I see, so it sounds like it is the same problem as the OP: the duplex basecaller is rejecting these shorter alignments. Thanks for your insight! |
@vellamike Do you have any info on when duplex pairing for short amplicons might be ready? Thanks! |
Apologies for the delay in replying. Duplex pairing for short amplicons is not something which we will deliver soon. It is currently considered a research problem. |
Somehow I'm ending up with a very low number of duplex reads after using Dorado duplex basecalling with the following command (used the ligation sequencing kit V14 SQK-LSK114):
dorado duplex [email protected] pod5/directory/ > duplex.bam
I extracted the duplex reads from this file using
samtools view duplexBarcoded/barcode01.bam | grep "dx:i:1"| samtools view -O BAM -o duplexOnly.bam
However, only 615 out of nearly 2M reads turn out to be duplex reads. What could be the issue here? Does anyone have suggestions as to what might be going wrong and how I could solve it? Thanks in advance!
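Two details of the extraction command above can affect the result independently of the short-read yield issue: `samtools view` without `-h` drops the header before the stream reaches `grep`, and `grep "dx:i:1"` matches that string anywhere on the line rather than as a complete tag field. The sketch below is one way to filter on the exact tag while keeping the header. `dx_filter` is a hypothetical helper name, and the file names in the usage line are placeholders.

```shell
# Hypothetical helper (not a Dorado or samtools command).
# Keeps SAM header lines plus records whose dx tag equals the requested value.
# $1 is the wanted dx value, e.g. 1, 0 or -1.
dx_filter() {
  awk -F '\t' -v want="dx:i:$1" '
    /^@/ { print; next }
    {
      # Optional tags start at column 12; require an exact field match
      for (i = 12; i <= NF; i++) {
        if ($i == want) { print; break }
      }
    }'
}
```

A possible invocation is `samtools view -h duplexBarcoded/barcode01.bam | dx_filter 1 | samtools view -b -o duplexOnly.bam -`. Recent samtools releases also have a tag filter built into `samtools view` (the `-d`/`--tag` option), which may make the helper unnecessary if your version supports it. Neither approach changes how many duplex reads Dorado produces; it only makes the count of dx:i:1 reads more reliable.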