dorado demux outputs a large portion of unclassified reads #1185

Zhenlisme · 2024-12-18T13:08:12Z

Hello,

I used dorado demux to demultiplex reads and trim barcodes, but more than 95% of reads remained unclassified, and the barcode sequences that should have been trimmed were still present. Here is an example of a read from the unclassified output, with the barcode flanking sequences highlighted in black.

The kit we used is SQK-RBK114-24, because the barcode flanking sequence is: 5' - ATCGCCTACCGTGAC - barcode - CGTTTTTCGTGCGCCGCTTC - 3'.

Dorado version: 0.8.3+98456f7
Dorado command: dorado demux reads.fastq.gz -o dorado_demux --kit-name SQK-RBK114-24 -t 20 --emit-fastq --emit-summary

malton-ont · 2024-12-18T14:11:19Z

Hi @Zhenlisme,

Those flanking regions are not correct for SQK-RBK114-24 - see here. From those sequences I would expect SQK-RPB114-24, SQK-RPB004 or SQK-RLB001. Are you certain that SQK-RBK114-24 is the barcoding kit you used to prep the data?
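One way to check which flanks are actually present in a sample of reads before choosing a --kit-name is to count exact matches near the read ends. The sketch below is not how dorado detects barcodes: it uses plain exact string matching, so reads with basecalling errors in a flank are missed and the counts are lower bounds. The flank sequences are the ones quoted in the original post; the helper names and the 175-base end window are assumptions made for illustration.

```python
import gzip
from collections import Counter

# Flanks quoted in the original post (5' -> 3').
FRONT_FLANK = "ATCGCCTACCGTGAC"
REAR_FLANK = "CGTTTTTCGTGCGCCGCTTC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_sequences(path):
    """Yield sequence lines from a 4-line-per-record FASTQ (plain or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip().upper()


def count_flank_hits(seqs, window=175):
    """Count reads with an exact flank match (either strand) near either end.

    `window` is an assumed number of bases to inspect at each end.
    """
    counts = Counter()
    for seq in seqs:
        counts["reads"] += 1
        # The "|" separator prevents a match spanning the two end slices.
        ends = seq[:window] + "|" + seq[-window:]
        for name, flank in (("front", FRONT_FLANK), ("rear", REAR_FLANK)):
            if flank in ends or revcomp(flank) in ends:
                counts[name] += 1
    return counts
```

For example, `count_flank_hits(fastq_sequences("reads.fastq.gz"))` returns a Counter with the total number of reads and how many of them carry each flank near an end.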

Zhenlisme · 2024-12-18T15:08:19Z

Hi,
Thank you for your reply. I did not do the sequencing experiment myself. I deduced the kit name from this document, where the flanks appeared to match the kit "Rapid PCR Barcoding Kit 24 V14". I am sorry I did not use the correct kit before (they look so similar...).

Yet, I still have some questions:

1. I agree with you that the correct kit should be either SQK-RPB114-24 or SQK-RPB004, but I don't know which one was actually used for sequencing. Is there any way to deduce this from the reads, or should I try both?
2. I reran demux with each of the two kits. I still got many unclassified reads with both, and I found that many of these reads contain internal barcodes. Do you have any suggestions?

Thank you again for your time.
Zhen

malton-ont · 2024-12-18T15:19:15Z

@Zhenlisme,

From the document you linked to you can see that Rapid PCR Barcoding Kit 24 V14 is SQK-RPB114-24.

If you have the .pod5 files, you can run:

pod5 inspect debug <file.pod5> | grep sequencing_kit

and this should output the kit used.
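The same field can also be read programmatically. The following is a minimal sketch that assumes the third-party `pod5` Python package is installed and that each read record exposes `run_info.sequencing_kit`, as I understand its API; please check this against the version you have. The function names are hypothetical.

```python
from collections import Counter


def count_kits(reads):
    """Tally the sequencing_kit value stored in each read's run info."""
    return Counter(read.run_info.sequencing_kit for read in reads)


def kits_in_pod5(path):
    """Open a .pod5 file and tally the kits recorded for its reads."""
    # Imported here so the tallying helper above works without pod5 installed.
    import pod5  # third-party: pip install pod5

    with pod5.Reader(path) as reader:
        return count_kits(reader.reads())
```

Calling `kits_in_pod5("file.pod5")` (a placeholder path) should return one entry per distinct kit string; more than one entry would suggest the file mixes reads from different runs.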

Regarding your remaining unclassified reads, it's difficult to say without more information - how many is "many"? What proportion of your reads remain unclassified?

My first suggestion would be to run the original basecall command with --no-trim to prevent adapter trimming from adversely affecting the flank sequences. I would then suggest demuxing a small sample of the unclassified reads with the -vv flag and looking for lines in the output log like "Found midstrand barcode flanks": reads that contain barcode flanks beyond the expected barcode window (~175 bases from either end) are marked as unclassified, as these are typically concatemers.
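To make the mid-strand idea concrete, here is a rough sketch that reports flank hits lying outside both end windows of a read. It is only an illustration of the concept, not dorado's implementation: it uses exact string matching rather than an error-tolerant search, the flank sequences are the ones quoted earlier in this thread, and the function names and the fixed 175-base window are assumptions.

```python
# Flanks quoted earlier in the thread (5' -> 3').
FLANKS = ("ATCGCCTACCGTGAC", "CGTTTTTCGTGCGCCGCTTC")

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq, pattern):
    """Yield every start index of `pattern` in `seq`, including overlaps."""
    start = seq.find(pattern)
    while start != -1:
        yield start
        start = seq.find(pattern, start + 1)


def midstrand_flank_positions(seq, flanks=FLANKS, window=175):
    """Return sorted start positions of exact flank hits outside both end windows."""
    hits = set()
    for flank in flanks:
        for pattern in (flank, revcomp(flank)):
            for pos in find_all(seq, pattern):
                # Mid-strand: starts after the front window and
                # ends before the rear window begins.
                if pos >= window and pos + len(pattern) <= len(seq) - window:
                    hits.add(pos)
    return sorted(hits)
```

A read for which `midstrand_flank_positions(seq)` is non-empty is a candidate concatemer under these assumptions; an empty list does not rule one out, because inexact flank copies are not detected.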

Zhenlisme · 2024-12-18T15:40:10Z

Hi,
Thanks for the quick reply.
I tried the -vv flag. Indeed, I found "Found midstrand barcode flanks" lines in the output log, meaning there are concatemers among the unclassified reads. Should I just drop the concatemers, or are there other methods to deal with them?

malton-ont · 2024-12-18T15:42:08Z

dorado doesn't provide any other mechanism to deal with these reads beyond marking them as unclassified. Any further analysis of them is left to the user.

Zhenlisme · 2024-12-18T15:43:30Z

Thanks a lot

Zhenlisme · 2024-12-18T17:55:24Z

Hello again,

I found adapters even in the classified reads, meaning that dorado demux did not trim all reads accordingly. In addition, some of the classified reads are still concatemers. Is this normal? I am worried that such reads would affect the quality of the genome assembly if concatemers are prevalent in my data.

Thank you for your time.
Zhen

malton-ont · 2024-12-19T09:26:41Z

dorado demux does not trim adapters directly, only barcodes - since adapters are outboard of barcodes, classified reads should also have the adapter removed. dorado attempts to identify mid-strand barcodes (and mark the read unclassified), but it does not search for mid-strand adapters.

malton-ont added the barcode label (Issues related to barcoding) Dec 18, 2024

Zhenlisme closed this as completed Dec 18, 2024

Zhenlisme reopened this Dec 18, 2024
