Before I start, I just wanted to thank you for this library; it's a staple of the software I'm writing.
I had a question about the fetch() method of the IndexedReader, for use with indexed .bam files:
I noticed that running fetch on a specified range of coordinates across a reference can take quite a while with larger alignment files (e.g. 9GB), even without iterating across the reads in the fetched region. Importantly, I noticed that fetching a range of coordinates with 0 mapped reads takes a similarly long time.
I'm assuming that this is because, for every call of fetch(), the line pointer in the BAM file must be moved to the appropriate line each time, from the beginning? If this is the case, is there any way to reduce the overhead of repeatedly calling fetch(), or some way to get the line pointer from the previous fetch call and simply navigate to the next position from there?
For context, I'm processing alignment files iteratively in groups of coordinate windows; the code below gets all reads from a given reference in a given window size (say 100bp, from coordinates 0 - 100):
for window in window_chunk {
    let start = window[0];
    let end = window[1];
    reader
        .fetch((tid, start, end))
        .expect("Error: invalid window value supplied!");
}
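To make the access pattern concrete, here is a minimal sketch of how such windows could be generated. `make_windows` is a hypothetical helper name, not part of the library, and treating each window as half-open `[start, end)` is an assumption:

```rust
/// Hypothetical helper: split a reference of length `ref_len` into
/// consecutive windows of at most `window_size` bp.
/// Each window is [start, end), assumed half-open.
fn make_windows(ref_len: i64, window_size: i64) -> Vec<[i64; 2]> {
    assert!(window_size > 0, "window size must be positive");
    let mut windows = Vec::new();
    let mut start = 0;
    while start < ref_len {
        // Clamp the last window to the end of the reference.
        let end = (start + window_size).min(ref_len);
        windows.push([start, end]);
        start = end;
    }
    windows
}

fn main() {
    // A 250 bp reference with 100 bp windows yields three windows,
    // the last one shorter than the others.
    for window in make_windows(250, 100) {
        println!("{} - {}", window[0], window[1]);
    }
}
```

With 100 bp windows, every window on the reference results in a separate fetch() call, so the number of calls grows with the reference length.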