48 fix fast5 errors with guppy and dorado #49

Merged: jonperdomo merged 6 commits into main from 48-fix-fast5-errors-with-guppy-and-dorado on Apr 30, 2024


jonperdomo
Contributor

Fix errors when running dorado-generated FAST5 files and large guppy FAST5 inputs.

@jonperdomo jonperdomo added the bug Something isn't working label Apr 29, 2024
@jonperdomo jonperdomo self-assigned this Apr 29, 2024
@jonperdomo jonperdomo linked an issue Apr 29, 2024 that may be closed by this pull request
@jonperdomo
Contributor Author

jonperdomo commented Apr 29, 2024

The error with dorado-generated FAST5 files was caused by not handling the case where a multi-read FAST5 file lacks FASTQ sequence data. Although the test file contains only a single read, it is in the multi-read FAST5 format (https://github.com/nanoporetech/dorado/blob/master/tests/data/fast5/single_read.fast5). Attempting to access the 'ReadID/Analyses' group raised an H5 error that was not handled. This is now fixed: signal plots are generated even when no sequence data is available (the same behavior as for a single-read FAST5 file).
[Attached image: newplot (21)]
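The fallback described in this comment can be illustrated with a minimal Python sketch. It is not the code from this PR: nested dicts stand in for HDF5 groups, the helper names (`lookup`, `summarize_read`) are hypothetical, and the group paths below `Analyses` and `Raw` are assumptions about the multi-read layout. The idea is only that a missing `Analyses` group yields "no sequence" instead of an unhandled error, so the signal can still be used.

```python
from typing import Any, Mapping, Optional, Tuple

# Assumed (hypothetical) locations inside one read group of a multi-read file.
FASTQ_PATH: Tuple[str, ...] = ("Analyses", "Basecall_1D_000", "BaseCalled_template", "Fastq")
SIGNAL_PATH: Tuple[str, ...] = ("Raw", "Signal")


def lookup(group: Mapping[str, Any], path: Tuple[str, ...]) -> Optional[Any]:
    """Walk nested groups; return None instead of raising if any part is missing."""
    node: Any = group
    for part in path:
        try:
            node = node[part]
        except (KeyError, TypeError):
            return None
    return node


def summarize_read(read_group: Mapping[str, Any]) -> dict:
    """Collect signal info, and sequence info only when FASTQ data exists."""
    signal = lookup(read_group, SIGNAL_PATH)
    fastq = lookup(read_group, FASTQ_PATH)

    sequence = None
    if fastq:
        lines = fastq.splitlines()
        # A FASTQ record stores the bases on its second line.
        sequence = lines[1] if len(lines) > 1 else None

    return {
        "signal_length": len(signal) if signal is not None else 0,
        "has_sequence": sequence is not None,
        "sequence": sequence,
    }


# Stand-in for a read without basecalls (no 'Analyses' group).
dorado_like = {"read_0001": {"Raw": {"Signal": [512, 498, 505, 520]}}}

# Stand-in for a read that carries FASTQ data.
guppy_like = {
    "read_0002": {
        "Raw": {"Signal": [500, 501]},
        "Analyses": {
            "Basecall_1D_000": {
                "BaseCalled_template": {"Fastq": "@read_0002\nACGT\n+\n!!!!\n"}
            }
        },
    }
}

for name, contents in {**dorado_like, **guppy_like}.items():
    print(name, summarize_read(contents))
```

Returning `None` from the lookup keeps the "no sequence data" case on the normal code path, so signal plotting does not depend on basecalls being present.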

@jonperdomo
Contributor Author

I tested with large guppy-generated inputs from both ONT and HPRC data and encountered no issues, so all errors appear to be resolved. Input datasets and commands, each run with 100G of memory and 12 CPUs:

# ONT test
guppy_fast5_dir="/.../ONT_official/gm24385_mod_2021.09/multi_fast5/20210510_1127_X4_FAQ32498_b90eaed8"
longreadsum f5 -P "${guppy_fast5_dir}/*.fast5" -o "$output_dir" -t 12

# HPRC test
guppy_fast5_dir="/.../HPRC/basecalls/GM24149_1/workspace/GM24149_1/20190129_0227_2-A1-D1_PAD29702_c7e7f995/fast5"
longreadsum f5 -P "${guppy_fast5_dir}/*.fast5" -o "$output_dir" -t 12

@jonperdomo jonperdomo marked this pull request as ready for review April 30, 2024 15:30
@jonperdomo jonperdomo merged commit 1a6a6c7 into main Apr 30, 2024
1 check passed
@jonperdomo jonperdomo deleted the 48-fix-fast5-errors-with-guppy-and-dorado branch April 30, 2024 15:45
Labels
bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Fix FAST5 errors with Guppy and Dorado
1 participant