Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New argument parser #4254

Open
7 tasks
tertsdiepraam opened this issue Jan 1, 2023 · 17 comments
Open
7 tasks

New argument parser #4254

tertsdiepraam opened this issue Jan 1, 2023 · 17 comments

Comments

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Jan 1, 2023

I talked about this a bit on Discord, but I think it's time to actually start talking about a new library I've been working on. So here goes: introducing the very creatively named uutils-args, a derive-based argument parsing library.

Why a custom argument parser

For a while now, we've been using clap as our argument parser, but over time I've been become increasingly frustrated with it. Not because it's a bad library, but because it doesn't fit the very specific needs of the coreutils and we constantly have to work around clap with a lot of verbose code. I looked into basically every Rust argument parser out there and none really fit our needs. So about a month ago, I started working on a parser based on lexopt specifically for uutils.

The whole design is too complex to explain here, so I wrote some documents in the repo. It has three subpages:

The library also integrates very nicely with some of the plans I had earlier for the --help pages (#3181, #2816), because the help text is read as markdown from a file in this library.

Current Status

The library supports many utils already, but a few important features are still missing. Some low-hanging fruit:

  • The deprecated arguments for head, tail and uniq,
  • Hidden arguments,
  • Setting exit codes.

And more complicated:

  • better and more complete markdown rendering,
  • argument completion.

There are also rough edges that can be smoothed out like improving the error messages and the output of --help.

It is also very much a product of me working on it alone, which means that most of the naming conventions make sense to me but maybe not to others. I would very much welcome feedback on any part of the API.

If you want to see the library in action, I suggest to take a look at some of the coreutils for which I have created tests (more to come in the future):

A full list is here: https://github.com/tertsdiepraam/uutils-args/tree/main/tests/coreutils. I found that most are relatively easy to port.

Downsides

Porting to this new library will have a few downsides.

  1. New contributors won't be familiar with this library.
  2. We won't have argument completion (until we implement ourselves in this library)
  3. We'll have to update uudoc.
  4. We don't get all the features and years of iterative improvement that clap has.

Proposed roadmap

Of course, I'm aware that this is gonna be a huge effort to port and we should carefully how (and even if) we want to do that. For now, I've come up with the following steps:

  • Figure out whether the migration is going to be worth it.
  • Collect feedback and accept PRs to get the library ready for use in uutils. This includes
    • Reevaluating the names for crates, traits and functions.
    • Implementing missing features.
    • Adding more documentation.
    • Stabilize the API.
    • Add more coreutils APIs as tests to prove the feasibility of the library.
  • Fix up administration
    • Move the repo to the uutils organisation
    • Publish the crates in the repo
  • Port utils one by one1
  • Change uudoc to use the info from the new library.

How to help out

For now, I need lots of feedback on this library, so I can get it ready, feel free to open an issue with whatever feedback you have or comment below. You can also help by picking up any of the open issues.

I'd be happy to answer any questions about this library :)

cc @sylvestre, @rivy, @jfinkels

Footnotes

  1. I think this is possible by having clap and this new library coexist in the repo for a while, though it will break completion and the online docs.

@tertsdiepraam tertsdiepraam changed the title New year, new argument parser New year, new argument parser! Jan 1, 2023
@sylvestre
Copy link
Contributor

Interesting :)
but I am a bit worried that we end up reinventing the wheel.
Did you try to:

  • ask clap developer if they would be willing to take some patches?
  • if we could write a clap extension for this?

@tertsdiepraam
Copy link
Member Author

but I am a bit worried that we end up reinventing the wheel.

I agree that this is something to watch out for. In my opinion the problems with clap have been stacking up too much and I think I found a solution that will simplify the code by a lot in some utilities. I'm trying to avoid reinventing the wheel as much as possible by using lexopt to deal with all the parsing complexity. We therefore get a lot of correct behaviour for free.

If you want to see how bad clap can get for our use case, you only have to look at this function in ls, which is 500 lines of workarounds for clap and was really complicated to get right:

pub fn from(options: &clap::ArgMatches) -> UResult<Self> {
let context = options.get_flag(options::CONTEXT);
let (mut format, opt) = if let Some(format_) = options.get_one::<String>(options::FORMAT) {
(
match format_.as_str() {
"long" | "verbose" => Format::Long,
"single-column" => Format::OneLine,
"columns" | "vertical" => Format::Columns,
"across" | "horizontal" => Format::Across,
"commas" => Format::Commas,
// below should never happen as clap already restricts the values.
_ => unreachable!("Invalid field for --format"),
},
Some(options::FORMAT),
)
} else if options.get_flag(options::format::LONG) {
(Format::Long, Some(options::format::LONG))
} else if options.get_flag(options::format::ACROSS) {
(Format::Across, Some(options::format::ACROSS))
} else if options.get_flag(options::format::COMMAS) {
(Format::Commas, Some(options::format::COMMAS))
} else if options.get_flag(options::format::COLUMNS) {
(Format::Columns, Some(options::format::COLUMNS))
} else if atty::is(atty::Stream::Stdout) {
(Format::Columns, None)
} else {
(Format::OneLine, None)
};
// The -o, -n and -g options are tricky. They cannot override with each
// other because it's possible to combine them. For example, the option
// -og should hide both owner and group. Furthermore, they are not
// reset if -l or --format=long is used. So these should just show the
// group: -gl or "-g --format=long". Finally, they are also not reset
// when switching to a different format option in-between like this:
// -ogCl or "-og --format=vertical --format=long".
//
// -1 has a similar issue: it does nothing if the format is long. This
// actually makes it distinct from the --format=singe-column option,
// which always applies.
//
// The idea here is to not let these options override with the other
// options, but manually whether they have an index that's greater than
// the other format options. If so, we set the appropriate format.
if format != Format::Long {
let idx = opt
.and_then(|opt| options.indices_of(opt).map(|x| x.max().unwrap()))
.unwrap_or(0);
if [
options::format::LONG_NO_OWNER,
options::format::LONG_NO_GROUP,
options::format::LONG_NUMERIC_UID_GID,
options::FULL_TIME,
]
.iter()
.flat_map(|opt| {
if options.value_source(opt) == Some(clap::parser::ValueSource::CommandLine) {
options.indices_of(opt)
} else {
None
}
})
.flatten()
.any(|i| i >= idx)
{
format = Format::Long;
} else if let Some(mut indices) = options.indices_of(options::format::ONE_LINE) {
if options.value_source(options::format::ONE_LINE)
== Some(clap::parser::ValueSource::CommandLine)
&& indices.any(|i| i > idx)
{
format = Format::OneLine;
}
}
}
let files = if options.get_flag(options::files::ALL) {
Files::All
} else if options.get_flag(options::files::ALMOST_ALL) {
Files::AlmostAll
} else {
Files::Normal
};
let sort = if let Some(field) = options.get_one::<String>(options::SORT) {
match field.as_str() {
"none" => Sort::None,
"name" => Sort::Name,
"time" => Sort::Time,
"size" => Sort::Size,
"version" => Sort::Version,
"extension" => Sort::Extension,
// below should never happen as clap already restricts the values.
_ => unreachable!("Invalid field for --sort"),
}
} else if options.get_flag(options::sort::TIME) {
Sort::Time
} else if options.get_flag(options::sort::SIZE) {
Sort::Size
} else if options.get_flag(options::sort::NONE) {
Sort::None
} else if options.get_flag(options::sort::VERSION) {
Sort::Version
} else if options.get_flag(options::sort::EXTENSION) {
Sort::Extension
} else {
Sort::Name
};
let time = if let Some(field) = options.get_one::<String>(options::TIME) {
match field.as_str() {
"ctime" | "status" => Time::Change,
"access" | "atime" | "use" => Time::Access,
"birth" | "creation" => Time::Birth,
// below should never happen as clap already restricts the values.
_ => unreachable!("Invalid field for --time"),
}
} else if options.get_flag(options::time::ACCESS) {
Time::Access
} else if options.get_flag(options::time::CHANGE) {
Time::Change
} else {
Time::Modification
};
let mut needs_color = match options.get_one::<String>(options::COLOR) {
None => options.contains_id(options::COLOR),
Some(val) => match val.as_str() {
"" | "always" | "yes" | "force" => true,
"auto" | "tty" | "if-tty" => atty::is(atty::Stream::Stdout),
/* "never" | "no" | "none" | */ _ => false,
},
};
let cmd_line_bs = options.get_one::<String>(options::size::BLOCK_SIZE);
let opt_si = cmd_line_bs.is_some()
&& options
.get_one::<String>(options::size::BLOCK_SIZE)
.unwrap()
.eq("si")
|| options.get_flag(options::size::SI);
let opt_hr = (cmd_line_bs.is_some()
&& options
.get_one::<String>(options::size::BLOCK_SIZE)
.unwrap()
.eq("human-readable"))
|| options.get_flag(options::size::HUMAN_READABLE);
let opt_kb = options.get_flag(options::size::KIBIBYTES);
let bs_env_var = std::env::var_os("BLOCK_SIZE");
let ls_bs_env_var = std::env::var_os("LS_BLOCK_SIZE");
let pc_env_var = std::env::var_os("POSIXLY_CORRECT");
let size_format = if opt_si {
SizeFormat::Decimal
} else if opt_hr {
SizeFormat::Binary
} else {
SizeFormat::Bytes
};
let raw_bs = if let Some(cmd_line_bs) = cmd_line_bs {
OsString::from(cmd_line_bs)
} else if !opt_kb {
if let Some(ls_bs_env_var) = ls_bs_env_var {
ls_bs_env_var
} else if let Some(bs_env_var) = bs_env_var {
bs_env_var
} else {
OsString::from("")
}
} else {
OsString::from("")
};
let block_size: Option<u64> = if !opt_si && !opt_hr && !raw_bs.is_empty() {
match parse_size(&raw_bs.to_string_lossy()) {
Ok(size) => Some(size),
Err(_) => {
show!(LsError::BlockSizeParseError(
cmd_line_bs.unwrap().to_owned()
));
None
}
}
} else if let Some(pc) = pc_env_var {
if pc.as_os_str() == OsStr::new("true") || pc == OsStr::new("1") {
Some(POSIXLY_CORRECT_BLOCK_SIZE)
} else {
None
}
} else {
None
};
let long = {
let author = options.get_flag(options::AUTHOR);
let group = !options.get_flag(options::NO_GROUP)
&& !options.get_flag(options::format::LONG_NO_GROUP);
let owner = !options.get_flag(options::format::LONG_NO_OWNER);
#[cfg(unix)]
let numeric_uid_gid = options.get_flag(options::format::LONG_NUMERIC_UID_GID);
LongFormat {
author,
group,
owner,
#[cfg(unix)]
numeric_uid_gid,
}
};
let width = match options.get_one::<String>(options::WIDTH) {
Some(x) => {
if x.starts_with('0') && x.len() > 1 {
// Read number as octal
match u16::from_str_radix(x, 8) {
Ok(v) => v,
Err(_) => return Err(LsError::InvalidLineWidth(x.into()).into()),
}
} else {
match x.parse::<u16>() {
Ok(u) => u,
Err(_) => return Err(LsError::InvalidLineWidth(x.into()).into()),
}
}
}
None => match terminal_size::terminal_size() {
Some((width, _)) => width.0,
None => match std::env::var_os("COLUMNS") {
Some(columns) => match columns.to_str().and_then(|s| s.parse().ok()) {
Some(columns) => columns,
None => {
show_error!(
"ignoring invalid width in environment variable COLUMNS: {}",
columns.quote()
);
DEFAULT_TERM_WIDTH
}
},
None => DEFAULT_TERM_WIDTH,
},
},
};
#[allow(clippy::needless_bool)]
let mut show_control = if options.get_flag(options::HIDE_CONTROL_CHARS) {
false
} else if options.get_flag(options::SHOW_CONTROL_CHARS) {
true
} else {
!atty::is(atty::Stream::Stdout)
};
let opt_quoting_style = options
.get_one::<String>(options::QUOTING_STYLE)
.map(|cmd_line_qs| cmd_line_qs.to_owned());
let mut quoting_style = if let Some(style) = opt_quoting_style {
match style.as_str() {
"literal" => QuotingStyle::Literal { show_control },
"shell" => QuotingStyle::Shell {
escape: false,
always_quote: false,
show_control,
},
"shell-always" => QuotingStyle::Shell {
escape: false,
always_quote: true,
show_control,
},
"shell-escape" => QuotingStyle::Shell {
escape: true,
always_quote: false,
show_control,
},
"shell-escape-always" => QuotingStyle::Shell {
escape: true,
always_quote: true,
show_control,
},
"c" => QuotingStyle::C {
quotes: quoting_style::Quotes::Double,
},
"escape" => QuotingStyle::C {
quotes: quoting_style::Quotes::None,
},
_ => unreachable!("Should have been caught by Clap"),
}
} else if options.get_flag(options::quoting::LITERAL) {
QuotingStyle::Literal { show_control }
} else if options.get_flag(options::quoting::ESCAPE) {
QuotingStyle::C {
quotes: quoting_style::Quotes::None,
}
} else if options.get_flag(options::quoting::C) {
QuotingStyle::C {
quotes: quoting_style::Quotes::Double,
}
} else {
// TODO: use environment variable if available
QuotingStyle::Shell {
escape: true,
always_quote: false,
show_control,
}
};
let indicator_style = if let Some(field) =
options.get_one::<String>(options::INDICATOR_STYLE)
{
match field.as_str() {
"none" => IndicatorStyle::None,
"file-type" => IndicatorStyle::FileType,
"classify" => IndicatorStyle::Classify,
"slash" => IndicatorStyle::Slash,
&_ => IndicatorStyle::None,
}
} else if let Some(field) = options.get_one::<String>(options::indicator_style::CLASSIFY) {
match field.as_str() {
"never" | "no" | "none" => IndicatorStyle::None,
"always" | "yes" | "force" => IndicatorStyle::Classify,
"auto" | "tty" | "if-tty" => {
if atty::is(atty::Stream::Stdout) {
IndicatorStyle::Classify
} else {
IndicatorStyle::None
}
}
&_ => IndicatorStyle::None,
}
} else if options.get_flag(options::indicator_style::SLASH) {
IndicatorStyle::Slash
} else if options.get_flag(options::indicator_style::FILE_TYPE) {
IndicatorStyle::FileType
} else {
IndicatorStyle::None
};
let time_style = parse_time_style(options)?;
let mut ignore_patterns: Vec<Pattern> = Vec::new();
if options.get_flag(options::IGNORE_BACKUPS) {
ignore_patterns.push(Pattern::new("*~").unwrap());
ignore_patterns.push(Pattern::new(".*~").unwrap());
}
for pattern in options
.get_many::<String>(options::IGNORE)
.into_iter()
.flatten()
{
match parse_glob::from_str(pattern) {
Ok(p) => {
ignore_patterns.push(p);
}
Err(_) => show_warning!("Invalid pattern for ignore: {}", pattern.quote()),
}
}
if files == Files::Normal {
for pattern in options
.get_many::<String>(options::HIDE)
.into_iter()
.flatten()
{
match parse_glob::from_str(pattern) {
Ok(p) => {
ignore_patterns.push(p);
}
Err(_) => show_warning!("Invalid pattern for hide: {}", pattern.quote()),
}
}
}
// According to ls info page, `--zero` implies the following flags:
// - `--show-control-chars`
// - `--format=single-column`
// - `--color=none`
// - `--quoting-style=literal`
// Current GNU ls implementation allows `--zero` Behavior to be
// overridden by later flags.
let zero_formats_opts = [
options::format::ACROSS,
options::format::COLUMNS,
options::format::COMMAS,
options::format::LONG,
options::format::LONG_NO_GROUP,
options::format::LONG_NO_OWNER,
options::format::LONG_NUMERIC_UID_GID,
options::format::ONE_LINE,
options::FORMAT,
];
let zero_colors_opts = [options::COLOR];
let zero_show_control_opts = [options::HIDE_CONTROL_CHARS, options::SHOW_CONTROL_CHARS];
let zero_quoting_style_opts = [
options::QUOTING_STYLE,
options::quoting::C,
options::quoting::ESCAPE,
options::quoting::LITERAL,
];
let get_last = |flag: &str| -> usize {
if options.value_source(flag) == Some(clap::parser::ValueSource::CommandLine) {
options.index_of(flag).unwrap_or(0)
} else {
0
}
};
if get_last(options::ZERO)
> zero_formats_opts
.into_iter()
.map(get_last)
.max()
.unwrap_or(0)
{
format = if format == Format::Long {
format
} else {
Format::OneLine
};
}
if get_last(options::ZERO)
> zero_colors_opts
.into_iter()
.map(get_last)
.max()
.unwrap_or(0)
{
needs_color = false;
}
if get_last(options::ZERO)
> zero_show_control_opts
.into_iter()
.map(get_last)
.max()
.unwrap_or(0)
{
show_control = true;
}
if get_last(options::ZERO)
> zero_quoting_style_opts
.into_iter()
.map(get_last)
.max()
.unwrap_or(0)
{
quoting_style = QuotingStyle::Literal { show_control };
}
let color = if needs_color {
Some(LsColors::from_env().unwrap_or_default())
} else {
None
};
let dereference = if options.get_flag(options::dereference::ALL) {
Dereference::All
} else if options.get_flag(options::dereference::ARGS) {
Dereference::Args
} else if options.get_flag(options::dereference::DIR_ARGS) {
Dereference::DirArgs
} else if options.get_flag(options::DIRECTORY)
|| indicator_style == IndicatorStyle::Classify
|| format == Format::Long
{
Dereference::None
} else {
Dereference::DirArgs
};
Ok(Self {
format,
files,
sort,
recursive: options.get_flag(options::RECURSIVE),
reverse: options.get_flag(options::REVERSE),
dereference,
ignore_patterns,
size_format,
directory: options.get_flag(options::DIRECTORY),
time,
color,
#[cfg(unix)]
inode: options.get_flag(options::INODE),
long,
alloc_size: options.get_flag(options::size::ALLOCATION_SIZE),
block_size,
width,
quoting_style,
indicator_style,
time_style,
context,
selinux_supported: {
#[cfg(feature = "selinux")]
{
selinux::kernel_support() != selinux::KernelSupport::Unsupported
}
#[cfg(not(feature = "selinux"))]
{
false
}
},
group_directories_first: options.get_flag(options::GROUP_DIRECTORIES_FIRST),
eol: if options.get_flag(options::ZERO) {
'\0'
} else {
'\n'
},
})
}

My assertion is that we can get rid of almost this entire function with the new library. I'll try to port ls as a test soon, so I can prove that assertion :). There are more utils with similar behaviour that we are not handling correctly right now.

  • ask clap developer if they would be willing to take some patches?

I have asked about a few things in the past:

However, most of the things I tried to fix here are more fundamental. clap seems to assume that every option maps to a single value, which is a great simplification if you have the luxury of defining new API's, but it doesn't work well for coreutils. This is too much of a difference that they won't be able to bridge. clap is also (rightfully) concerned about backwards compatibility, which will limit some changes.

The reason that I'm so confident that this works well, is that the generated code is very similar to GNU. In GNU, they have an iterator over arguments and change the configuration based on the encountered values. clap first collects all arguments and provides an API for the final values. This is fundamentally why we need to inspect indices and do complicated tricks with some arguments to match the behaviour. This new library does basically the same as GNU just with a derive API and is therefore a great fit.

I've listed most of the problems I found here: https://github.com/tertsdiepraam/uutils-args/blob/main/design/problems_with_clap.md, which shows how complicated these are to change in clap.

  • if we could write a clap extension for this?

I wish we could, but I don't know what that would look like. For example, I wouldn't know how to do that ls function above as an extension.

@rivy
Copy link
Member

rivy commented Jan 6, 2023

@tertsdiepraam , I'd love to have improvements in command line arguments support.

Your contributions have always been great and I'm happy to support your further experiments.

I would suggest that, if you decide to embark on this project, it should be built and developed in another repo and as it's own crate. We can set it up as a repo under the uutils banner.

I can understand @sylvestre's concern about duplication of effort and reinvention, but I empathize strongly with your position. I've contributed to clap a few times, and it's become a bit of a beast. I quite understand the "replace it" feeling that you are experiencing. 😄

@sylvestre
Copy link
Contributor

yeah, we should probably do it :)
I just want to make sure that we have really discussed about it

@jfinkels
Copy link
Collaborator

jfinkels commented Feb 5, 2023

This sounds like a reasonable change to me, I've run into quite a few issues with argument parsing. So if we can make it easier to deal with silliness like

chmod u=rrrr+-+-+-www f

or

df --o

without too much effort, that would be great.

@tertsdiepraam
Copy link
Member Author

What's the silliness you're referring to? Is it about parsing the permissions and non-ambiguous prefixes of long arguments? I have the second one implemented, but the first one will probably not be part of this library. You can define a type that implements a parser that will work though.

@jfinkels
Copy link
Collaborator

jfinkels commented Feb 6, 2023

Yes, I was referring to parsing the permissions and non-ambiguous prefixes of long arguments. They were just two things that came to mind when I tried to recall trouble I had had with arguments in the past.

@tertsdiepraam tertsdiepraam changed the title New year, new argument parser! New argument parser Sep 14, 2023
@tertsdiepraam
Copy link
Member Author

tertsdiepraam commented Nov 1, 2023

10 months after opening this issue, I've resumed my work on this :). I think the next step is determining when we deem it feature-complete and battle-tested enough to be included here.

Here's what I have right now:

  • Short args (with no, optional and required arguments)
  • Long args (with no, optional and required arguments)
  • Positional arguments
  • --help and --version
  • Help text based on an external file
  • Help text based on docstrings
  • dd-style args
  • A special parsing mode for echo
  • An escape hatch for arguments like -10/+10 like fold
  • Error messages on invalid arguments
  • Parsing of values with a derive macro
  • Control over the exit code on errors

What's missing is:

The problem is that there's only so much I can do in a separate repo. We have to start properly integrating this if we want to find the rest of the actual problems.

There's also the issue that while we are migrating, the docs and manpages will probably be broken anyway, because we will temporarily have both clap and this parser in the repo. Unless we only work on a separate branch, but that seems suboptimal too. Happy to hear any suggestions for how we should deal with this. My solution is that we generate the manpages and docs, store them and then remove the code that generates it. That will give people a snapshot that will be pretty accurate for a while.

The markdown format for help is also missing at this point. I cut that out to actually get to something done. We can add it back in later.

If you want to see the library in action, I recommend looking at the tests that implement several coreutils.

@BenWiederhake
Copy link
Collaborator

BenWiederhake commented Feb 19, 2024

Looks like I accidentally stumbled into another clap-impossibility: unattached multi-valued short-args, which clap would prefer not to fix, and I honestly agree that it's a silly feature that should have never been invented in the first place.

However, with clap it is impossible to implement shuf -ez hello world correctly.

EDIT: Yes, it is possible, I was just a bit too daft.

@tertsdiepraam
Copy link
Member Author

That might be where I definitively draw the line of compatibility 😄 Holy cow, that's cursed.

@BenWiederhake
Copy link
Collaborator

I think it's pretty doable with uutil-args. If I get it to work, would you be interested, or is that too cursed?

More cursed stuff …
$ echo foo > a; echo bar > b
$ shuf a b
shuf: extra operand 'b'
Try 'shuf --help' for more information.
[$? = 1]
$ shuf a b -e # A *later* argv string changes the interpretation of an *earlier* argv!
b
a
$

@tertsdiepraam
Copy link
Member Author

tertsdiepraam commented Feb 19, 2024

Actually, I think this is because --echo does not take any arguments. So, it should be possible in both clap and the new parser.

Sort of related, I think we should also put some guidance in CONTRIBUTING.md about filing issues on our behalf. You obviously mean well and it's appreciated, but I'd like to minimize the time other maintainers have to spend on our problems and explore solutions on our side first. Especially with clap, we've had people file issues a few times without communicating with the uutils maintainers and I don't want Ed Page to get annoyed with us 1 😅

Footnotes

  1. Of course, he's been awesome about this and always provided friendly and detailed responses.

@BenWiederhake
Copy link
Collaborator

According to the GNU documentation, --echo does take arguments, and shuf a b proves that, kinda. But yes, I would implement it as a boolean flag, and then later error out if more than one "file" was supplied. Sorry, I'm getting off-topic.

I tried to keep the clap issue reasonably general, and in retrospect I agree that perhaps it would have been better to coordinate with you. Thankfully, nothing bad happened this time. :)

@tertsdiepraam
Copy link
Member Author

tertsdiepraam commented Feb 19, 2024

Where do you see that --echotakes arguments? In the manpage it says

treat each ARG as an input line

with ARG referring to this line:

shuf -e [OPTION]... [ARG]...

and in the docs it says

Treat each command-line operand as an input line.

So I think it's just a boolean.

But, depending on the flag, the maximum number of arguments changes, which is supported by the new parser!

@BenWiederhake
Copy link
Collaborator

BenWiederhake commented Mar 29, 2024

Here's another behavior that is impossible to replicate using clap:

$ date -Ilolwut -R -R  # parses "lolwut" before either "-R", and raises that error
date: invalid argument 'lolwut' for '--iso-8601'
Valid arguments are:
  - 'hours'
  - 'minutes'
  - 'date'
  - 'seconds'
  - 'ns'
Try 'date --help' for more information.
[$? = 1]
$ date -R -R -Ilolwut  # parses the second "-R" before "-Ilolwut", and raises that error
date: multiple output formats specified
[$? = 1]
$ date -R -Ilolwut -R  # In fact, it first tries to parse the value of "-i" before noticing duplicity
date: invalid argument 'lolwut' for '--iso-8601'
Valid arguments are:
  - 'hours'
  - 'minutes'
  - 'date'
  - 'seconds'
  - 'ns'
Try 'date --help' for more information.
[$? = 1]
$ date -R -Idate -R  # … so if the value to "-I" is valid, it notices the duplicity only afterwards.
date: multiple output formats specified
[$? = 1]

Doing this in uutils-args is actually also difficult, because Settings::apply does not (yet?) permit raising an error. See uutils/uutils-args#112 for my suggestion how to change that.

@jfinkels
Copy link
Collaborator

From your examples, it seems like the parsing is just handled sequentially in order, terminating at the first error. Is that right?

@BenWiederhake
Copy link
Collaborator

@jfinkels: Yes, hence my proposal to implement it like this: uutils/uutils-args#113

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants