-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic support for '--targets/-t' #18
Conversation
This is great, thanks! I think skipping fully masked out chunks is good enough in the first instance perf wise - otherwise I don't think this will be practical for looking at bits of large datasets. |
This addresses most of the things discussed above (skipping fully masked out chunks, adding numba, region start/end, multiple regions). I also introduced pyranges and found that it made things simpler even for the Still need to add support for the ^ exclude syntax, and more tests for edge cases, but I think these can be added later. (I'm away for the first half of next week BTW.) |
import pyranges | ||
|
||
|
||
def parse_region(region: str) -> tuple[str, Optional[int], Optional[int]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this would return a Region
object, like the one in bio2zarr, but I'm not sure which way the dependency is between the two projects (if indeed there is one). So I left it as a tuple for now.
Thanks @tomwhite, this is great. I'd be pretty slow to add external dependencies here, and I'm sceptical about whether we're really getting much value out of pyranges and pandas here for what we're currently doing. But, I'd vote we merge this and use it as a data point, and see if doing things with a simpler approach can achieve as good or better performance. |
So, I'm happy to merge this now, if you are? |
Yes, please go ahead! (Sorry for slow response, I was away for a few days.) |
No worries, I knew you were away |
Fixes #16
This is just a quick draft, but I wanted to share early for visibility.
-t
syntax - still needs to handle missing start/end, multiple targets, and the ^ syntax (for exclude).