Low Qscore after Dorado basecalling - P2 solo - ultra long reads #1239

jefferalexdurfue · 2025-01-31T20:10:43Z

Hi everyone

I ran a P2 Solo sequencing run with an ultra-long sequencing library (SQK-ULK114). Three separate loading events (24 h each) were performed, as recommended by the kit protocol. The run metrics were: estimated bases (11.75 Gb), reads generated (171.61K), and estimated N50 (596.82 kb)! Although the output in Gb was low, we were excited about the read N50.

But after Dorado basecalling, we saw a very low Q-score, as summarized by pycoQC and NanoPlot.

Basecall duplex script, after organizing with pod5 (split_by_channel):

#SBATCH --time=7-00:00:00
#SBATCH --nodes=1
#SBATCH --gpus-per-node=4
#SBATCH --cpus-per-task=32
#SBATCH --mem=740G
#SBATCH --job-name=02-basecalling_duplex
#SBATCH -o ~HOME/Nanopore/log/02-basecalling_duplex.out
#SBATCH -e ~HOME/Nanopore/log/02-basecalling_duplex.err

Dir_POD5="~HOME/Nanopore/01-POD5/split_by_channel"
Dir_basecaller="~HOME/Nanopore/02-basecaller"
DORADO="~HOME/software/dorado-0.8.3-linux-x64/bin/dorado"
model="~HOME/software/dorado-0.8.3-linux-x64/model"
Dir_fastq="~HOME/Nanopore/04-fastq"

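     # Duplex basecalling of all split-by-channel POD5 files in a single job
     ${DORADO} duplex --device 'cuda:all' \
            ${model}/[email protected] \
            ${Dir_POD5}/ > ${Dir_basecaller}/call_duplex.bam

     # Convert the BAM written above to FASTQ (file name must match the dorado output)
     samtools fastq ${Dir_basecaller}/call_duplex.bam \
            > ${Dir_fastq}/call_duplex.fastq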

pycoQC report (all reads), median read quality 2.92:

Report_Nanopore.pdf
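
For reference, a Phred-scaled quality Q corresponds to an error probability p = 10^(-Q/10). Taking the reported median at face value, a Q-score near 3 means roughly one error in every two bases. The helper below is only an illustrative sketch: the function name `qscore_to_error` is hypothetical, and it assumes `awk` is available on the system.

```shell
# Convert a Phred Q-score to the implied per-base error probability.
# qscore_to_error is a hypothetical helper name used for illustration only.
qscore_to_error() {
    # p = 10^(-Q/10), printed with two decimals
    awk -v q="$1" 'BEGIN { printf "%.2f\n", 10 ^ (-q / 10) }'
}

qscore_to_error 2.92   # prints 0.51
qscore_to_error 10     # prints 0.10
qscore_to_error 20     # prints 0.01
```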

Conclusion:
We ended up with 7 Gb (already low for a P2), an N50 of 380 kb (!!!), and a median Q-score around 3.
If we filter out the bad reads, only ~148 Mb of data remain, with an N50 of 23 kb.
That is a poor result for an ultra-long library on a P2 flow cell.
Is it a bad flow cell? Any advice on how to improve this?

Thanks in advance.
Best regards

HalfPhoton · 2025-02-05T17:17:48Z

Hi @jefferalexdurfue,
I think this question is best asked on the Nanopore Community Forum, as it doesn't appear to be a Dorado issue but a sequencing one.

Taking a look at your script, however:

Dir_POD5="~HOME/Nanopore/01-POD5/split_by_channel"

It looks like this is basecalling all the split-by-channel pod5 files in one job, which is not what we suggest for good performance.
You should run multiple smaller jobs, each covering a collection of channels, to get a performance improvement in duplex.

Best regards,
Rich
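
One way to act on the suggestion above is to group the split-by-channel pod5 files into small batch directories and basecall each batch as its own job. The following is a minimal sketch, not a recommended configuration: the function names `make_batches` and `print_duplex_jobs`, the `batch_NNN` directory layout, and the batch size are all hypothetical. It also assumes that symlinked pod5 files are acceptable as dorado input; if they are not, move or copy the files instead. The second function only prints the commands so that they can be reviewed or submitted as separate SLURM jobs.

```shell
#!/usr/bin/env bash
# Sketch: split a directory of per-channel POD5 files into small batches,
# then print one `dorado duplex` command per batch.
# All names below (make_batches, print_duplex_jobs, batch_NNN) are hypothetical.

# make_batches SRC_DIR DEST_DIR FILES_PER_BATCH
# Symlinks every *.pod5 in SRC_DIR into DEST_DIR/batch_000, batch_001, ...
# and echoes the number of files placed.
make_batches() {
    local src="$1" dest="$2" size="$3"
    local i=0 batch_dir f abs
    for f in "$src"/*.pod5; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
        batch_dir=$(printf '%s/batch_%03d' "$dest" $(( i / size )))
        mkdir -p "$batch_dir"
        # Use an absolute target so the symlink resolves from any directory.
        abs="$(cd "$(dirname "$f")" && pwd)/$(basename "$f")"
        ln -sf "$abs" "$batch_dir/"
        i=$(( i + 1 ))
    done
    echo "$i"
}

# print_duplex_jobs DEST_DIR DORADO_BIN MODEL_PATH OUT_DIR
# Prints (does not run) one duplex command per batch directory.
print_duplex_jobs() {
    local dest="$1" dorado="$2" model="$3" out="$4" d
    for d in "$dest"/batch_*/; do
        [ -d "$d" ] || continue
        d="${d%/}"
        printf '%s duplex --device cuda:all %s %s/ > %s/%s.bam\n' \
            "$dorado" "$model" "$d" "$out" "$(basename "$d")"
    done
}

# Example (illustrative values only):
#   make_batches "$Dir_POD5" "$Dir_POD5/../batches" 100
#   print_duplex_jobs "$Dir_POD5/../batches" "$DORADO" "$model/<model-name>" "$Dir_basecaller"
```

The per-batch BAM files would still need to be combined afterwards, for example with `samtools cat` or `samtools merge`, before converting to FASTQ.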
