Dorado Demultiplexing Significantly Increased Unclassified Reads Compared to Guppy #1178
Comments
Hi @ohickl, A full description of the classification algorithm can be found in the docs here. This also contains information on creating a custom barcode arrangement in which you can override various scoring parameters. My first suggestion would be to turn on the verbose log for a small subset of the unclassified reads - I suspect you'll see messages like:
Either of these two conditions will cause dorado to mark reads as unclassified regardless of its final barcode scores. These were options available in guppy, but are always on in dorado. Guppy controlled these with the parameters
Hi, thanks for the swift reply!
I assumed you meant to rerun
and interestingly got 105985 classified reads. Running again on this run's unclassified reads yielded no further classifications.
Apologies, I should have been more specific - I meant the "very verbose" logging.

Interesting. One of the required checks for dorado barcoding is that the barcode begins/ends within 75 bases of the start/end of the read. If additional bases push the barcode beyond this limit, the read is left unclassified. The read would then be trimmed by the adapter trimming step, which possibly brought the barcode back within the required proximity.
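To make the proximity check described above concrete, here is a minimal Python sketch. It is not dorado's implementation: the function name, the 0-based half-open coordinates, and the inclusive comparison are assumptions made for illustration. Only the 75-base limit comes from the comment above.

```python
# Illustrative sketch only, not dorado's actual code.
# Assumption: barcode coordinates are 0-based and half-open, [bc_start, bc_end).
# Assumption: "within 75 bases" is treated as inclusive (<= 75).
END_PROXIMITY = 75  # limit quoted in the comment above


def passes_end_proximity(read_len, bc_start, bc_end, is_front, limit=END_PROXIMITY):
    """Return True if the barcode sits close enough to its end of the read."""
    if is_front:
        # A front barcode must begin within `limit` bases of the read start.
        return bc_start <= limit
    # A rear barcode must end within `limit` bases of the read end.
    return (read_len - bc_end) <= limit


# Hypothetical read: 100 extra bases precede the front barcode, so the check fails.
before_trim = passes_end_proximity(read_len=1000, bc_start=100, bc_end=124, is_front=True)

# Same read after 60 leading bases are trimmed: the barcode now starts at base 40.
after_trim = passes_end_proximity(read_len=940, bc_start=40, bc_end=64, is_front=True)
```

In this made-up example `before_trim` is `False` and `after_trim` is `True`, which matches the suggested explanation for why a second pass on trimmed reads could classify more of them.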
hmm, not sure if I'm doing something wrong but
I also checked the arrangement file options, is it possible to just specify parts of it, e.g.:
Log output from dorado goes to
Should output lines like:
Yes it's possible to only provide a subset of scoring options, but this must be done as part of a full custom arrangement, specifying the masks and sequences:
Ok, thanks! I'll also try the new 0.9.0 to see if it natively results in more identifications.
Issue Report
Please describe the issue:
Significant increase in unclassified reads when using Dorado (version 0.8.2) for demultiplexing compared to using Guppy (version 6.5.7).
Steps to reproduce the issue:
Run environment:
Logs
Here is a section of the barcoding_summary.txt file showing examples of unclassified reads with high barcode scores (front or rear >=90):
barcode_summary_sub.tsv.gz
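For anyone wanting to reproduce this selection, a rough Python sketch of the filter follows. The column names `read_id`, `barcode_arrangement`, `barcode_front_score` and `barcode_rear_score` are assumptions about the summary file's header and may need adjusting. The threshold of 90 is the one stated above.

```python
import csv
import gzip


def high_score_unclassified(rows, threshold=90.0):
    """Return read IDs that are unclassified despite a high front or rear score.

    `rows` is any iterable of dicts keyed by column name. The column names
    used here are assumed, not confirmed against the attached file.
    """
    selected = []
    for row in rows:
        if row["barcode_arrangement"] != "unclassified":
            continue
        front = float(row["barcode_front_score"])
        rear = float(row["barcode_rear_score"])
        if front >= threshold or rear >= threshold:
            selected.append(row["read_id"])
    return selected


def read_summary(path):
    """Load a gzipped, tab-separated summary file into a list of dicts."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Usage would look like `high_score_unclassified(read_summary("barcode_summary_sub.tsv.gz"))`, assuming the attachment is tab-separated with a header row.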
Here are the basecalling and (part of) the demux logs:
bc:
demux:
Additionally:
I would greatly appreciate guidance on the following:
Best
Oskar