--output overwrites previous commands and runs #529

Closed
StripedMonkey opened this issue Jul 14, 2022 · 12 comments · Fixed by #775

Comments

@StripedMonkey commented Jul 14, 2022

Currently I'm working on a set of benchmarks where I want to save the output of (at least) one run from each. --output should arguably be allowed N times, where N is the number of commands being benchmarked.

However, I would also argue that keeping each run's output is extremely useful, specifically for debugging/figuring out why statistical outliers exist.

As an arbitrary example:

$ hyperfine --output ./example.log "ls" "pwd"

will leave example.log containing the output of only a single pwd run, since each run overwrites the file. Passing multiple --output options does not work, as hyperfine expects only one.

As a proposal for how this could change there are two things that would be nice to see:

  1. Allow multiple outputs, the same way --prepare is allowed per command
    $ hyperfine --output ./ls.log "ls" --output ./pwd.log "pwd"
    
  2. Allow some parameter, like N, for the run number
    $ hyperfine --output "./ls-{N}.log" "ls"
    

For short-running commands like ls, option 2 could be an issue: the command may run somewhere in the neighborhood of 5,000 times, and creating a new logfile for every run may itself be a problem. Longer-running commands obviously do not have this issue.

@sharkdp (Owner) commented Jul 17, 2022

Thank you for your feedback!

The --output option was mainly introduced to control where output is being redirected (terminal, /dev/null, a pipe, a file). We didn't really think about your use case, to be honest (FYI @tavianator).

Note that it's rather easy to get the desired functionality if you simply redirect the output yourself:

hyperfine "ls > ls.log" "pwd > pwd.log"

As for the per-iteration output, you could use >> to get the output of each and every benchmark run. You could use --setup to make sure that the file is empty before starting the benchmark. And you could use --prepare to print a separator like --- before each new output:

hyperfine \
  --setup "rm -f ls.log" \
  --prepare "echo --- >> ls.log" \
  "ls >> ls.log"

If you want a new logfile for every run, I'm sure there would be workarounds as well, but I guess they won't be as concise as the commands suggested above.
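For instance, one could keep a run counter in a file: bump it in --prepare (so the bookkeeping is not timed) and read it in the command itself. A rough, untested sketch, where the iter file name is arbitrary:

hyperfine \
  --setup 'echo 0 > iter' \
  --prepare 'echo $(( $(cat iter) + 1 )) > iter' \
  'ls > ls-$(cat iter).log'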

As for your feature requests:

> 1. Allow multiple outputs, the same way --prepare is allowed per command

I can see how there might be cases where one would like to use --output=pipe for the first command and --output=null for the second, but I guess that's rather the exception. Given that this can be worked around using normal shell redirection, I think I would vote for allowing just one --output option.

> 2. Allow some parameter, like N, for the run number

This might be helpful in general, so we could think about setting e.g. {iteration} or {#} by default (similar to how other variables can be set via --parameter-scan/--parameter-list). This way, you could do something like

hyperfine "ls > ls-{#}.log"

Another option would be to inject an environment variable, similar to how we already set HYPERFINE_RANDOMIZED_ENVIRONMENT_OFFSET. It could be used in such a way (note the single quotes, so that the variable is expanded by the shell running the benchmarked command rather than by your interactive shell):

hyperfine 'ls > ls-${HYPERFINE_ITERATION}.log'

@StripedMonkey (Author)

> Note that it's rather easy to get the desired functionality if you simply redirect the output yourself:

Indeed, this is what I have been doing so far. The only issue is that output from multiple iterations is either overwritten or appended, and appended output can be pretty annoying to parse apart for very long outputs, even if you mark the boundaries.

An injected HYPERFINE_ITERATION environment variable would be, in my opinion, the best solution.

I am fine with the idea that --output is only allowed once, but I would argue that presenting it as a way to write output to a file becomes rather misleading, as it does not do what I expect with respect to multiple commands. It's surprising that when I redirect the output to a file for later viewing, I only see the output of the very last command; I'd expect the output to be appended rather than overwritten. That's not to say it's useless: it does help in debugging a failed run, since hyperfine bails out. Perhaps the documentation should be clarified on that point?

I appreciate the consideration, and thank you for this awesome tool

@sharkdp (Owner) commented Jul 17, 2022

> An injected HYPERFINE_ITERATION environment variable would be, in my opinion, the best solution.

Ok. I believe that should be easy to implement.

> I am fine with the idea that --output is only allowed once, but I would argue that presenting it as a way to write output to a file becomes rather misleading, as it does not do what I expect with respect to multiple commands. It's surprising that when I redirect the output to a file for later viewing, I only see the output of the very last command.

This is of course a very valid point. Sorry for brushing it aside above.

> I'd expect the output to be appended rather than overwritten.

Hm. That's also not the best user experience, right? I think we should go with your "multiple --output options allowed" idea after all 👍

Another option, which might be easier to implement but has arguably worse UX, would be to show a (nice) error message when --output=path/to/file is used with multiple benchmarks, suggesting shell redirection instead.

@tavianator (Contributor)

I think we should keep a couple of things in mind:

  • Overwriting a file is likely to be a more consistent filesystem workload than appending to a file. I wouldn't want to append by default.
  • If we allow --output multiple times, what about --output=null find --output=./file fd? That's not a fair benchmark.

Are there compelling use cases other than debugging? Maybe a --save-all-outputs option would be nice, generating something like

+-- command_01_find/
|   +-- 0001.stdout
|   +-- 0001.stderr
|   +-- 0002.stdout
|   +-- ...
+-- command_02_fd/
|   +-- 0001.stdout
|   +-- ...
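
In the meantime, a layout like this can be approximated with plain shell redirection and a counter file, along the lines of the workaround sketched earlier in the thread (directory and file names are illustrative, and this is an untested sketch):

hyperfine \
  --setup 'mkdir -p command_01_find && echo 0 > iter' \
  --prepare 'echo $(( $(cat iter) + 1 )) > iter' \
  'find . > command_01_find/$(printf %04d.stdout $(cat iter)) 2> command_01_find/$(printf %04d.stderr $(cat iter))'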

@StripedMonkey (Author) commented Jul 18, 2022

> • Overwriting a file is likely to be a more consistent filesystem workload than appending to a file. I wouldn't want to append by default.

Is overwriting vs. appending such a large difference that it matters more than a reasonable/expected UX? How is overwriting the output every time useful? From my perspective, the only reason it has any use at all at the moment is that a failed test run will kill hyperfine and leave the failed run's log behind.

Logging only the last command of a series of runs, as it currently does, is practically useless for anything other than a run failure. If that's the only use case for writing output to a file, the option should be renamed to --record-errors and spit out a file per nonzero exit code (assuming failures are being ignored). Certainly redirecting/piping output is useful, but overwriting file contents for the sake of consistency?

If consistency were a big deal to someone, using a not-insignificant write buffer and flushing between runs would do more for it than anything else.

> • If we allow --output multiple times, what about --output=null find --output=./file fd? That's not a fair benchmark.

Personally, I'm apathetic to such an argument; I think that's user error. It might be detectable/something you can warn against, but similar results can already be achieved with "find >> /dev/null" "fd >> ./file". That's not a fair benchmark either, but that's the test that's being run. Edit: actually, that'd be a good argument for handling it in hyperfine, so you can warn about this kind of thing.

> Are there compelling use cases other than debugging?

My use case is examining statistical anomalies in the debug output. Being able to go back and see that a particular run bailed out of a hot path, or retried, is useful to me even beyond "debugging" what went wrong. Knowing why a set of runs took significantly longer might not be "debugging" at all, but normal operation of the program.

Another potential use case is determining whether a particular run succeeded even when the exit code is 0. I know of a few obtuse applications that can fail without reporting it via the exit code. Right now I don't know of a way to determine that with hyperfine other than by writing the output.

@tavianator (Contributor)

> Is overwriting vs. appending such a large difference that it matters more than a reasonable/expected UX?

Maybe not. I just did some local benchmarking and >> was typically 2-3% slower than >, but the variance didn't seem any different. I was mainly worried about the variance.
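
For reference, something along these lines makes the comparison easy to reproduce (a sketch, not necessarily the exact benchmark run above): > truncates out.log on every run, while >> keeps growing it across runs.

hyperfine --warmup 3 'seq 1 100000 > out.log' 'seq 1 100000 >> out.log'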

> How is overwriting the output every time useful?

For me, the purpose of --output=./file is “I want to know how fast my program is when outputting to a file,” not “I want to look at the output.”

> > That's not a fair benchmark
>
> Personally, I'm apathetic to such an argument

Me too. Actually, I've kind of changed my mind on this one, because

$ hyperfine --output=null fd --output=./file fd

is an interesting benchmark.

> > Are there compelling use cases other than debugging?
>
> My use case is examining statistical anomalies in the debug output.

This makes a lot of sense to me actually. I usually use hyperfine for short-running programs that generate lots of (deterministic) output, and I want to remove the I/O overhead as much as possible. But if you're doing things where I/O isn't a bottleneck, preserving logs seems very helpful.


Anyway, here's what I think now. Supporting --output multiple times scares me because the command line becomes order-sensitive. What should the behaviour of these commands be?

$ hyperfine --output=./file find fd  # Both to ./file right now
$ hyperfine find --output=./file fd  # Does --output apply to fd or both?
$ hyperfine find fd --output=./file  # Does --output apply to anything?

I agree we should clarify in the documentation that --output=./file is designed for timing stdout-is-a-file execution paths, not examining the output.

For examining the output, I think we should recommend using shell redirections, and provide $HYPERFINE_ITERATION to help with that.

I also think something like --save-all-outputs is a good idea. It seems to apply to your non-debugging use cases as well. We should probably add some metadata about each run to it, so you can find which runs were outliers, etc.

@StripedMonkey (Author)

> Supporting --output multiple times scares me because the command line becomes order-sensitive. What should the behaviour of these commands be?

I was simply going to assume the same semantics as --prepare, to be honest: you have to pass either a single --prepare or N of them, one per command.
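
For reference, the existing --prepare semantics that this would mirror look like the following (the make commands are placeholders): one --prepare applies to every benchmark, or exactly one --prepare is given per benchmark.

$ hyperfine --prepare 'make clean' 'make -j4' 'make -j8'
$ hyperfine --prepare 'make clean' 'make -j4' --prepare 'true' 'make -j8'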

@sharkdp (Owner) commented Aug 15, 2022

Ok, I think we're all on the same page.

To summarize:

  • We introduce a new environment variable HYPERFINE_ITERATION that will be set for each benchmarking run (see similar code here). This can be used to log the output of each and every benchmark run with something like hyperfine 'cmd > cmd-${HYPERFINE_ITERATION}.log'.
  • We allow --output to appear exactly once or exactly N times (same as --prepare and --command-name). This will enable several new and useful use cases like hyperfine --output=cmd1.log cmd1 --output=cmd2.log cmd2 or hyperfine --output=null cmd --output=./file cmd.
  • In the --help text for --output, we mention the HYPERFINE_ITERATION environment variable for users who want to log each and every iteration.
  • We should probably also allow {…} parameters to be used in the --output argument. This way, we can also support parametrized benchmarks: hyperfine -P threads 1 10 --output='make-{threads}.log' 'make -j {threads}'. This could be left for later, but we should check for possible interference between --output and --parameter-{scan,list}.

@rcoacci commented Aug 29, 2022

My 2 cents after finding out that --output overwrites the file: make this explicit in --help and the manual. I was also surprised that the file wasn't at least appended to on each iteration. I'd be fine with any of the proposed solutions.

@sharkdp (Owner) commented Sep 2, 2022

To be clear, I would like us to implement all four points listed in my previous comment. That includes an improved --help text for --output (third point).

@ramblingenzyme commented Oct 31, 2024

I just ran into this myself, and everything here makes sense, though I do have an alternative suggestion (especially since this issue hasn't been updated in just over 2 years).

How would we feel about adding an extra flag that switches the file output behaviour, valid only when --output isn't 'pipe', 'null', or 'inherit'? I'm thinking --file-output-mode=overwrite|append, with the default being overwrite so it's not a breaking change (and a much smaller change than what @sharkdp has written up).

To provide some context: what I'm currently benchmarking is a Vitest (https://vitest.dev/) configuration, and the output includes an internal breakdown of how much CPU time different phases of the runner took.

 Test Files  328 passed (328)
      Tests  2493 passed | 6 skipped (2499)
   Start at  16:40:07
   Duration  74.07s (transform 23.80s, setup 31.73s, collect 212.98s, tests 170.28s, environment 64.49s, prepare 25.31s)

And this is the current fish shell abomination I've kludged together to get what I want... 😂

begin
    set BENCHMARK_NAME 'projects-vs-unified'
    set BENCHMARK_OUTPUT "benchmarks/$BENCHMARK_NAME-output.txt"
    rm $BENCHMARK_OUTPUT || true
    hyperfine \
        --setup 'rm node_modules/.vite/vitest/vitest/results.json || true' \
        --cleanup 'rm node_modules/.vite/vitest/vitest/results.json || true' \
        --warmup 2 --runs 5 \
        --export-markdown "benchmarks/$BENCHMARK_NAME-results.md" \
        "yarn test --silent {config} 2>&1 >> $BENCHMARK_OUTPUT" \
        -L config '-c vitest.unified.config.ts,--workspace vitest.projects.workspace.ts' | tee -a "$BENCHMARK_OUTPUT"
end
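
If the proposed HYPERFINE_ITERATION variable lands, the log-splitting part of the above could shrink to something like this (hypothetical until implemented; the single quotes keep the variable from being expanded by the interactive shell, since hyperfine runs commands through an intermediate sh):

hyperfine \
    --warmup 2 --runs 5 \
    --export-markdown benchmarks/results.md \
    'yarn test --silent {config} > benchmarks/run-${HYPERFINE_ITERATION}.log 2>&1' \
    -L config '-c vitest.unified.config.ts,--workspace vitest.projects.workspace.ts'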

@sharkdp (Owner) commented Nov 10, 2024

Task one from my list is now completed (see #772).
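
With that merged, per-iteration logging reduces to a one-liner (a sketch; note the single quotes, so the variable is expanded by the intermediate shell that hyperfine spawns rather than by your interactive shell):

hyperfine 'ls > ls-${HYPERFINE_ITERATION}.log'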
