-
Notifications
You must be signed in to change notification settings - Fork 4
Conversation
Following the example of the sgkit repo.
The naming looks fine to me, although |
Sure, I can see the advantage there. I'm happy to follow the consensus on this one. |
# format used. We probably want to have the option of | ||
# writing to a ZipFile anyway, so that we don't have | ||
# gazillions of files. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, here's an example that sets custom numcodec filters for Zarr: https://github.com/related-sciences/gwas-analysis/blob/master/notebooks/platform/xarray/lib/io/core.py#L60. I can see that becoming another thing that we do often between read_*
and ds.to_zarr
in addition to using options for zipping files and maybe rechunking.
|
||
|
||
def get_sgkit_plink_parser(): | ||
top_parser = argparse.ArgumentParser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any strong opinions on using argparse vs Click or Fire?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, haven't got strong opinions. I haven't tried either of these. How about I get something basic up and running with argparse, and we can weigh up the pros and cons of external packages based on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also tiangolo/typer which is not likely to be our solution but is worth knowing about. I like it because it's from the same author as FastAPI so you get a consistent experience across CLI and web Interfaces.
Looks great to me @jeromekelleher. I think this will be a good place to build up some of the necessary options or hard-coded behaviors that should live between calls to create the readers (like |
Thanks for the quick work on this @jeromekelleher! Before we land a CLI tool in this repo, I just wanted to step back and think about the CLI interaction we want for I tend to prefer tools that have a single point of entry for CLI interactions such as |
Yes, good call, let's think this through. For the core
So, how could we do I think I like this approach better, good idea @hammer! It also makes the requirements for one of the file format plugins quite clear, with a well defined interface. Hopefully that keeps the format plugins relatively simple, and concerned only with reading and writing the format in question. What do others think? I think I should probably finish the initial version of this CLI just to get something working, and so we have a good idea of what the format specific arguments are going to look like. It'll be easy enough to pull it out afterwards. |
Yes! I love IO libraries that infer file formats and their properties for me. For example, inferring the presence of a header and delimiter used in a CSV file. I can also imagine this interface is where most users will discover they need a separate package to read/write a particular file format, and we could guide them through the installation here. |
Yes, I like this plugin idea a lot actually, it seems to be a good compromise between supporting lots of formats but not letting the main sgkit package get all sprawly and horrible. Any thoughts on the CLI structure here @tomwhite, @eric-czech, @ravwojdyla? |
It's a minor comment, but from a plugin perspective, I think it's going to be easier for the commands to be something like I've filed https://github.com/pystatgen/sgkit/issues/53 to track the design and implementation of a unified CLI that can accommodate plugins. |
I like the subcommand structure, implemented using interfaces and plugins. |
OK, thanks. Rather than spend more time here then, I might just make a start on what this would look like in the main repo. There'll be some rough edges, but it's good to make a start. I'll leave this one open for now anyway. |
Closing in favour of pystatgen/sgkit#57 |
Closes #7
This is an initial outline of the conversion CLI.
I haven't written any testing code yet, as I wanted to see if people agreed with the basic organisation first (and ran out of time!).
This adds a
plink2sg
CLI that can be run either viapython -m sgkit_plink plink2sg
or viaplink2sg
, if the package has been installed and the user's PATH set up accordingly. There's a bit of indirection needed to do this, but I think it's worth it as it's always useful to be able to use thepython -m
version. I haven't madepython -m sgkit_plink
callplink2sg
directly because I think we'll inevitably need to have another method,sg2plink
, which does the conversion in the other direction. There's some extra stuff in there that will facilitate testing, also, and having it broken up into little pieces makes things easier to mock out.What do we think of this organisation? Do we like the names?