--output overwrites previous commands and runs #529

Closed
StripedMonkey opened this issue Jul 14, 2022 · 12 comments · Fixed by #775

Comments

@StripedMonkey commented Jul 14, 2022

Currently I'm working on a set of benchmarks where I want to save the output of (at least) one run from each. --output should arguably be allowed N times, where N is the number of commands being benchmarked.

However, I would also argue that keeping each run's output is extremely useful, specifically for debugging/figuring out why statistical outliers exist.

As an arbitrary example:

$ hyperfine --output ./example.log "ls" "pwd"

will leave example.log containing the output of only a single pwd run, since each run overwrites the file. Passing multiple --output options does not work, as hyperfine expects only one.

As a proposal for how this could change there are two things that would be nice to see:

  1. Allow multiple outputs, the same way --prepare is allowed per command
    $ hyperfine --output ./ls.log "ls" --output ./pwd.log "pwd"
    
  2. Allow some parameter, like N, for the run number
    $ hyperfine --output "./ls-{N}.log" "ls"
    

For short-running commands like ls, option 2 could be an issue: the command may run somewhere in the neighborhood of 5,000 times, and creating a new logfile for every run may itself be a problem. Longer-running commands obviously do not have this issue.

@sharkdp (Owner) commented Jul 17, 2022

Thank you for your feedback!

The --output option was mainly introduced to control where output is being redirected (terminal, /dev/null, a pipe, a file). We didn't really think about your use case, to be honest (FYI @tavianator).

Note that it's rather easy to get the desired functionality if you simply redirect the output yourself:

hyperfine "ls > ls.log" "pwd > pwd.log"

As for the per-iteration output, you could use >> to get the output of each and every benchmark run. You could use --setup to make sure that the file is empty before starting the benchmark. And you could use --prepare to print a separator like --- before each new output:

hyperfine \
  --setup "rm -f ls.log" \
  --prepare "echo --- >> ls.log" \
  "ls >> ls.log"

If you want a new logfile for every run, I'm sure there would be workarounds as well, but I guess they won't be as concise as the commands suggested above.
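For instance, one could keep a run counter in a file: bump it in --prepare (so the bookkeeping is not timed) and read it in the command itself. A rough, untested sketch, where the iter file name is arbitrary:

hyperfine \
  --setup 'echo 0 > iter' \
  --prepare 'echo $(( $(cat iter) + 1 )) > iter' \
  'ls > ls-$(cat iter).log'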

As for your feature requests:

> 1. Allow multiple outputs, the same way --prepare is allowed per command

I can see how there might be cases where one would like to use --output=pipe for the first command and --output=null for the second, but I guess that's rather the exception. Given that this can be worked around using normal shell redirection, I think I would vote for allowing just one --output option.

> 2. Allow some parameter, like N, for the run number

This might be helpful in general, so we could think about setting e.g. {iteration} or {#} by default (similar to how other variables can be set via --parameter-scan/--parameter-list). This way, you could do something like

hyperfine "ls > ls-{#}.log"

Another option would be to inject an environment variable, similar to how we already set HYPERFINE_RANDOMIZED_ENVIRONMENT_OFFSET. It could be used in such a way (note the single quotes, so that the variable is expanded by the shell running the benchmarked command rather than by your interactive shell):

hyperfine 'ls > ls-${HYPERFINE_ITERATION}.log'

@StripedMonkey (Author)

> Note that it's rather easy to get the desired functionality if you simply redirect the output yourself:

Indeed, this is what I have been doing so far. The only issue is that output from multiple iterations is either overwritten or appended, and appended output can be pretty annoying to parse apart for very long outputs, even if you mark the boundaries.

An injected HYPERFINE_ITERATION environment variable would be, in my opinion, the best solution.

I am fine with the idea that --output is only allowed once, but I would argue that presenting it as a way to write output to a file becomes rather misleading, as it does not do what I expect with respect to multiple commands. It's surprising that when I redirect the output to a file for later viewing, I only see the output of the very last command; I'd expect the output to be appended rather than overwritten. That's not to say it's useless: it does help in debugging a failed run, since hyperfine bails out. Perhaps the documentation should be clarified on that point?

I appreciate the consideration, and thank you for this awesome tool

@sharkdp (Owner) commented Jul 17, 2022

> An injected HYPERFINE_ITERATION environment variable would be, in my opinion, the best solution.

Ok. I believe that should be easy to implement.

> I am fine with the idea that --output is only allowed once, but I would argue that presenting it as a way to write output to a file becomes rather misleading, as it does not do what I expect with respect to multiple commands. It's surprising that when I redirect the output to a file for later viewing, I only see the output of the very last command.

This is of course a very valid point. Sorry for brushing it aside above.

> I'd expect the output to be appended rather than overwritten.

Hm. That's also not the best user experience, right? I think we should go with your "multiple --output options allowed" idea after all 👍

Another option, which might be easier to implement but has arguably worse UX, would be to show a (nice) error message when --output=path/to/file is used with multiple benchmarks, suggesting shell redirection instead.

@tavianator (Contributor)

I think we should keep a couple of things in mind:

  • Overwriting a file is likely to be a more consistent filesystem workload than appending to a file. I wouldn't want to append by default.
  • If we allow --output multiple times, what about --output=null find --output=./file fd? That's not a fair benchmark.

Are there compelling use cases other than debugging? Maybe a --save-all-outputs option would be nice, generating something like

+-- command_01_find/
|   +-- 0001.stdout
|   +-- 0001.stderr
|   +-- 0002.stdout
|   +-- ...
+-- command_02_fd/
|   +-- 0001.stdout
|   +-- ...
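
In the meantime, a layout like this can be approximated with plain shell redirection and a counter file, along the lines of the workaround sketched earlier in the thread (directory and file names are illustrative, and this is an untested sketch):

hyperfine \
  --setup 'mkdir -p command_01_find && echo 0 > iter' \
  --prepare 'echo $(( $(cat iter) + 1 )) > iter' \
  'find . > command_01_find/$(printf %04d.stdout $(cat iter)) 2> command_01_find/$(printf %04d.stderr $(cat iter))'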

@StripedMonkey (Author) commented Jul 18, 2022

> • Overwriting a file is likely to be a more consistent filesystem workload than appending to a file. I wouldn't want to append by default.

Is overwriting vs. appending such a large difference that it matters more than a reasonable/expected UX? How is overwriting the output every time useful? From my perspective, the only reason it has any use at all at the moment is that a failed test run will kill hyperfine and leave the failed run's log behind.

Logging only the last command of a series of runs, as it currently does, is practically useless for anything other than a run failure. If that's the only use case for writing output to a file, the option should be renamed to --record-errors and spit out a file per nonzero exit code (assuming failures are being ignored). Certainly redirecting/piping output is useful, but overwriting file contents for the sake of consistency?

If consistency were a big deal to someone, using a not-insignificant write buffer and flushing between runs would do more for it than anything else.

> • If we allow --output multiple times, what about --output=null find --output=./file fd? That's not a fair benchmark.

Personally, I'm apathetic to such an argument; I think that's user error. It might be detectable/something you can warn against, but similar results can already be achieved with "find >> /dev/null" "fd >> ./file". That's not a fair benchmark either, but that's the test that's being run. Edit: actually, that'd be a good argument for handling it in hyperfine, so you can warn about this kind of thing.

> Are there compelling use cases other than debugging?

My use case is examining statistical anomalies in the debug output. Being able to go back and see that a particular run bailed out of a hot path, or retried, is useful to me even beyond "debugging" what went wrong. Knowing why a set of runs took significantly longer might not be "debugging" at all, but normal operation of the program.

Another potential use case is determining whether a particular run succeeded even when the exit code is 0. I know of a few obtuse applications that can fail without reporting it via the exit code. Right now I don't know of a way to determine that with hyperfine other than by writing the output.

@tavianator (Contributor)

> Is overwriting vs. appending such a large difference that it matters more than a reasonable/expected UX?

Maybe not. I just did some local benchmarking and >> was typically 2-3% slower than >, but the variance didn't seem any different. I was mainly worried about the variance.
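
For reference, something along these lines makes the comparison easy to reproduce (a sketch, not necessarily the exact benchmark run above): > truncates out.log on every run, while >> keeps growing it across runs.

hyperfine --warmup 3 'seq 1 100000 > out.log' 'seq 1 100000 >> out.log'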

> How is overwriting the output every time useful?

For me, the purpose of --output=./file is “I want to know how fast my program is when outputting to a file,” not “I want to look at the output.”

> > That's not a fair benchmark
>
> Personally, I'm apathetic to such an argument

Me too. Actually, I've kind of changed my mind on this one, because

$ hyperfine --output=null fd --output=./file fd

is an interesting benchmark.

> > Are there compelling use cases other than debugging?
>
> My use case is examining statistical anomalies in the debug output.

This makes a lot of sense to me actually. I usually use hyperfine for short-running programs that generate lots of (deterministic) output, and I want to remove the I/O overhead as much as possible. But if you're doing things where I/O isn't a bottleneck, preserving logs seems very helpful.


Anyway, here's what I think now. Supporting --output multiple times scares me because the command line becomes order-sensitive. What should the behaviour of these commands be?

$ hyperfine --output=./file find fd  # Both to ./file right now
$ hyperfine find --output=./file fd  # Does --output apply to fd or both?
$ hyperfine find fd --output=./file  # Does --output apply to anything?

I agree we should clarify in the documentation that --output=./file is designed for timing stdout-is-a-file execution paths, not examining the output.

For examining the output, I think we should recommend using shell redirections, and provide $HYPERFINE_ITERATION to help with that.

I also think something like --save-all-outputs is a good idea. It seems to apply to your non-debugging use cases as well. We should probably add some metadata about each run to it, so you can find which runs were outliers, etc.

@StripedMonkey (Author)

> Supporting --output multiple times scares me because the command line becomes order-sensitive. What should the behaviour of these commands be?

I was simply going to assume the same semantics as --prepare, to be honest: you have to pass either a single --prepare or N of them, one per command.
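
For reference, the existing --prepare semantics that this would mirror look like the following (the make commands are placeholders): one --prepare applies to every benchmark, or exactly one --prepare is given per benchmark.

$ hyperfine --prepare 'make clean' 'make -j4' 'make -j8'
$ hyperfine --prepare 'make clean' 'make -j4' --prepare 'true' 'make -j8'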

@sharkdp (Owner) commented Aug 15, 2022

Ok, I think we're all on the same page.

To summarize:

  • We introduce a new environment variable HYPERFINE_ITERATION that will be set for each benchmarking run (see similar code here). This can be used to log the output of each and every benchmark run with something like hyperfine 'cmd > cmd-${HYPERFINE_ITERATION}.log'.
  • We allow --output to appear exactly once or exactly N times (same as --prepare and --command-name). This will enable several new and useful use cases like hyperfine --output=cmd1.log cmd1 --output=cmd2.log cmd2 or hyperfine --output=null cmd --output=./file cmd.
  • In the --help text for --output, we mention the HYPERFINE_ITERATION environment variable for users who want to log each and every iteration.
  • We should probably also allow {…} parameters to be used in the --output argument. This way, we can also support parametrized benchmarks: hyperfine -P threads 1 10 --output='make-{threads}.log' 'make -j {threads}'. This could be left for later, but we should check for possible interference between --output and --parameter-{scan,list}.

@rcoacci commented Aug 29, 2022

My 2 cents after finding out that --output overwrites the file: make this explicit in --help and the manual. I was also surprised that the file wasn't at least appended to on each iteration. I'd be fine with any of the proposed solutions.

@sharkdp (Owner) commented Sep 2, 2022

To be clear, I would like us to implement all four points listed in my previous comment. That includes an improved --help text for --output (third point).

@ramblingenzyme commented Oct 31, 2024

I just ran into this myself, and everything here makes sense, though I do have an alternative suggestion (especially since this issue hasn't been updated in just over 2 years).

How would we feel about adding an extra flag that switches the file output behaviour, valid only when --output isn't 'pipe', 'null', or 'inherit'? I'm thinking --file-output-mode=overwrite|append, with the default being overwrite so it's not a breaking change (and a much smaller change than what @sharkdp has written up).

To provide some context: what I'm currently benchmarking is a Vitest (https://vitest.dev/) configuration, and the output includes an internal breakdown of how much CPU time different phases of the runner took.

 Test Files  328 passed (328)
      Tests  2493 passed | 6 skipped (2499)
   Start at  16:40:07
   Duration  74.07s (transform 23.80s, setup 31.73s, collect 212.98s, tests 170.28s, environment 64.49s, prepare 25.31s)

And this is the current fish shell abomination I've kludged together to get what I want... 😂

begin
    set BENCHMARK_NAME 'projects-vs-unified'
    set BENCHMARK_OUTPUT "benchmarks/$BENCHMARK_NAME-output.txt"
    rm $BENCHMARK_OUTPUT || true
    hyperfine \
        --setup 'rm node_modules/.vite/vitest/vitest/results.json || true' \
        --cleanup 'rm node_modules/.vite/vitest/vitest/results.json || true' \
        --warmup 2 --runs 5 \
        --export-markdown "benchmarks/$BENCHMARK_NAME-results.md" \
        "yarn test --silent {config} 2>&1 >> $BENCHMARK_OUTPUT" \
        -L config '-c vitest.unified.config.ts,--workspace vitest.projects.workspace.ts' | tee -a "$BENCHMARK_OUTPUT"
end
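
If the proposed HYPERFINE_ITERATION variable lands, the log-splitting part of the above could shrink to something like this (hypothetical until implemented; the single quotes keep the variable from being expanded by the interactive shell, since hyperfine runs commands through an intermediate sh):

hyperfine \
    --warmup 2 --runs 5 \
    --export-markdown benchmarks/results.md \
    'yarn test --silent {config} > benchmarks/run-${HYPERFINE_ITERATION}.log 2>&1' \
    -L config '-c vitest.unified.config.ts,--workspace vitest.projects.workspace.ts'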

@sharkdp (Owner) commented Nov 10, 2024

Task one from my list is now completed (see #772).
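
With that merged, per-iteration logging reduces to a one-liner (a sketch; note the single quotes, so the variable is expanded by the intermediate shell that hyperfine spawns rather than by your interactive shell):

hyperfine 'ls > ls-${HYPERFINE_ITERATION}.log'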
