This repository has been archived by the owner on Jan 13, 2022. It is now read-only.
I noticed that the signal lengths for each nucleotide are only multiples of 12 after re-squiggling with prepare_mapped_reads.py.
So the differences between adjacent values in Ref_to_signal are, e.g.:
[ 24., 12., 12., 132., 108., 24., 12., ...]
Why is that?
That seems like an unnecessarily coarse resolution.
Does this correspond to the Move table in any way?
Possibly unrelated:
I just realized that my reads have both old Albacore basecalls (Events table etc.) and Guppy basecalls (Move table etc.). Will prepare_mapped_reads.py only use the newest basecalls in the fast5?
Cheers
RaverJay changed the title from "Why are signal lengths of nucleotides multiples of 12?" to "Why are signal lengths per nucleotide always multiples of 12?" on Feb 24, 2020.
Hello. It's to do with the resolution ("stride") of the model used to map the data, and a trade-off between speed and precision when remapping. The mapping can be coarse since training only uses it to select signal-sequence pairs; the training criterion itself does not use this information and considers all possible ways of aligning the signal to the sequence.
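To illustrate why a stride-limited mapping can only produce dwell times that are multiples of the stride, here is a minimal sketch. It is not Taiyaki's actual code: the helper `ref_to_signal_from_blocks`, the toy `block_to_ref` path, and the stride value of 12 (picked to match the numbers reported above) are all assumptions for illustration. The idea is that the model emits one output block per `stride` raw samples, the remapping assigns each block to a reference position, and base boundaries can therefore only fall on block edges.

```python
import numpy as np

# Assumed stride, chosen to match the multiples of 12 reported in this thread.
STRIDE = 12


def ref_to_signal_from_blocks(block_to_ref, stride):
    """Hypothetical helper, not part of Taiyaki.

    block_to_ref[i] is the reference position assigned to model output
    block i (non-decreasing, every position visited at least once).
    Returns the raw-sample index at which each reference position starts,
    plus one final entry marking the end of the mapped signal.
    """
    block_to_ref = np.asarray(block_to_ref)
    n_ref = int(block_to_ref.max()) + 1
    # Index of the first block assigned to each reference position.
    first_block = np.searchsorted(block_to_ref, np.arange(n_ref), side="left")
    boundaries = np.append(first_block, len(block_to_ref))
    # Each block covers `stride` raw samples, so boundaries land on block edges.
    return boundaries * stride


# Toy block-level path: 8 blocks mapped onto 5 reference positions.
block_to_ref = [0, 0, 1, 2, 3, 3, 3, 4]
ref_to_signal = ref_to_signal_from_blocks(block_to_ref, STRIDE)
dwell = np.diff(ref_to_signal)
print(ref_to_signal)
print(dwell)
```

Under these assumptions every entry of `dwell` is a whole number of blocks times the stride, which is the pattern seen in the differences of Ref_to_signal.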
Does that mean that Taiyaki's prepare_mapped_reads.py is not really best suited as a resquiggler, e.g. to extract signal means for individual bases for other downstream tasks?
Are you aware of anything more accurate?
EDIT: current Guppy models also have stride 10, while the default for Taiyaki-trained models is 2. A pretrained stride 1 or 2 model for RNA could be useful.
If you are looking at mapping to sample resolution, you could try scrappie mappy --model squiggle_r94_rna ref.fa read.fast5 (from https://github.com/nanoporetech/scrappie ). Another option would be the tools from the Nanopolish or Tombo packages.