-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New argument parser #4254
Comments
Interesting :)
|
I agree that this is something to watch out for. In my opinion the problems with clap have been stacking up too much and I think I found a solution that will simplify the code by a lot in some utilities. I'm trying to avoid reinventing the wheel as much as possible by using If you want to see how bad clap can get for our use case, you only have to look at this function in Lines 431 to 939 in c2e4da9
My assertion is that we can get rid of almost this entire function with the new library. I'll try to port
I have asked about a few things in the past:
However, most of the things I tried to fix here are more fundamental. clap seems to assume that every option maps to a single value, which is a great simplification if you have the luxury of defining new API's, but it doesn't work well for coreutils. This is too much of a difference that they won't be able to bridge. clap is also (rightfully) concerned about backwards compatibility, which will limit some changes. The reason that I'm so confident that this works well, is that the generated code is very similar to GNU. In GNU, they have an iterator over arguments and change the configuration based on the encountered values. clap first collects all arguments and provides an API for the final values. This is fundamentally why we need to inspect indices and do complicated tricks with some arguments to match the behaviour. This new library does basically the same as GNU just with a derive API and is therefore a great fit. I've listed most of the problems I found here: https://github.com/tertsdiepraam/uutils-args/blob/main/design/problems_with_clap.md, which shows how complicated these are to change in clap.
I wish we could, but I don't know what that would look like. For example, I wouldn't know how to do that |
@tertsdiepraam , I'd love to have improvements in command line arguments support. Your contributions have always been great and I'm happy to support your further experiments. I would suggest that, if you decide to embark on this project, it should be built and developed in another repo and as it's own crate. We can set it up as a repo under the uutils banner. I can understand @sylvestre's concern about duplication of effort and reinvention, but I empathize strongly with your position. I've contributed to clap a few times, and it's become a bit of a beast. I quite understand the "replace it" feeling that you are experiencing. 😄 |
yeah, we should probably do it :) |
This sounds like a reasonable change to me, I've run into quite a few issues with argument parsing. So if we can make it easier to deal with silliness like
or
without too much effort, that would be great. |
What's the silliness you're referring to? Is it about parsing the permissions and non-ambiguous prefixes of long arguments? I have the second one implemented, but the first one will probably not be part of this library. You can define a type that implements a parser that will work though. |
Yes, I was referring to parsing the permissions and non-ambiguous prefixes of long arguments. They were just two things that came to mind when I tried to recall trouble I had had with arguments in the past. |
10 months after opening this issue, I've resumed my work on this :). I think the next step is determining when we deem it feature-complete and battle-tested enough to be included here. Here's what I have right now:
What's missing is:
The problem is that there's only so much I can do in a separate repo. We have to start properly integrating this if we want to find the rest of the actual problems. There's also the issue that while we are migrating, the docs and manpages will probably be broken anyway, because we will temporarily have both The markdown format for help is also missing at this point. I cut that out to actually get to something done. We can add it back in later. If you want to see the library in action, I recommend looking at the tests that implement several coreutils. |
EDIT: Yes, it is possible, I was just a bit too daft. |
That might be where I definitively draw the line of compatibility 😄 Holy cow, that's cursed. |
I think it's pretty doable with uutil-args. If I get it to work, would you be interested, or is that too cursed? More cursed stuff …$ echo foo > a; echo bar > b
$ shuf a b
shuf: extra operand 'b'
Try 'shuf --help' for more information.
[$? = 1]
$ shuf a b -e # A *later* argv string changes the interpretation of an *earlier* argv!
b
a
$ |
Actually, I think this is because Sort of related, I think we should also put some guidance in CONTRIBUTING.md about filing issues on our behalf. You obviously mean well and it's appreciated, but I'd like to minimize the time other maintainers have to spend on our problems and explore solutions on our side first. Especially with clap, we've had people file issues a few times without communicating with the uutils maintainers and I don't want Ed Page to get annoyed with us 1 😅 Footnotes
|
According to the GNU documentation, I tried to keep the clap issue reasonably general, and in retrospect I agree that perhaps it would have been better to coordinate with you. Thankfully, nothing bad happened this time. :) |
Where do you see that
with
and in the docs it says
So I think it's just a boolean. But, depending on the flag, the maximum number of arguments changes, which is supported by the new parser! |
Here's another behavior that is impossible to replicate using $ date -Ilolwut -R -R # parses "lolwut" before either "-R", and raises that error
date: invalid argument 'lolwut' for '--iso-8601'
Valid arguments are:
- 'hours'
- 'minutes'
- 'date'
- 'seconds'
- 'ns'
Try 'date --help' for more information.
[$? = 1]
$ date -R -R -Ilolwut # parses the second "-R" before "-Ilolwut", and raises that error
date: multiple output formats specified
[$? = 1]
$ date -R -Ilolwut -R # In fact, it first tries to parse the value of "-i" before noticing duplicity
date: invalid argument 'lolwut' for '--iso-8601'
Valid arguments are:
- 'hours'
- 'minutes'
- 'date'
- 'seconds'
- 'ns'
Try 'date --help' for more information.
[$? = 1]
$ date -R -Idate -R # … so if the value to "-I" is valid, it notices the duplicity only afterwards.
date: multiple output formats specified
[$? = 1] Doing this in uutils-args is actually also difficult, because |
From your examples, it seems like the parsing is just handled sequentially in order, terminating at the first error. Is that right? |
@jfinkels: Yes, hence my proposal to implement it like this: uutils/uutils-args#113 |
I talked about this a bit on Discord, but I think it's time to actually start talking about a new library I've been working on. So here goes: introducing the very creatively named
uutils-args
, a derive-based argument parsing library.Why a custom argument parser
For a while now, we've been using
clap
as our argument parser, but over time I've been become increasingly frustrated with it. Not because it's a bad library, but because it doesn't fit the very specific needs of the coreutils and we constantly have to work around clap with a lot of verbose code. I looked into basically every Rust argument parser out there and none really fit our needs. So about a month ago, I started working on a parser based onlexopt
specifically foruutils
.The whole design is too complex to explain here, so I wrote some documents in the repo. It has three subpages:
clap
uutils-args
works.The library also integrates very nicely with some of the plans I had earlier for the
--help
pages (#3181, #2816), because the help text is read as markdown from a file in this library.Current Status
The library supports many utils already, but a few important features are still missing. Some low-hanging fruit:
head
,tail
anduniq
,Hidden arguments,Setting exit codes.And more complicated:
There are also rough edges that can be smoothed out like improving the error messages and the output of
--help
.It is also very much a product of me working on it alone, which means that most of the naming conventions make sense to me but maybe not to others. I would very much welcome feedback on any part of the API.
If you want to see the library in action, I suggest to take a look at some of the coreutils for which I have created tests (more to come in the future):
basename
cat
mktemp
ls
A full list is here: https://github.com/tertsdiepraam/uutils-args/tree/main/tests/coreutils. I found that most are relatively easy to port.
Downsides
Porting to this new library will have a few downsides.
uudoc
.clap
has.Proposed roadmap
Of course, I'm aware that this is gonna be a huge effort to port and we should carefully how (and even if) we want to do that. For now, I've come up with the following steps:
uudoc
to use the info from the new library.How to help out
For now, I need lots of feedback on this library, so I can get it ready, feel free to open an issue with whatever feedback you have or comment below. You can also help by picking up any of the open issues.
I'd be happy to answer any questions about this library :)
cc @sylvestre, @rivy, @jfinkels
Footnotes
I think this is possible by having clap and this new library coexist in the repo for a while, though it will break completion and the online docs. ↩
The text was updated successfully, but these errors were encountered: