Using --prepare increases benchmark results #715
This is an interesting observation that I haven't made before. I can reproduce this. Let's first rule out the shell as a possible source for this, which you also did by using `--shell=none`. So we are going to compare launching the benchmarked command directly, with and without `--prepare`.
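A minimal sketch of that comparison, assuming `echo test` as the benchmarked command and `sleep 1` as the preparation command (both taken from the report further down):

```bash
# baseline: launch the command directly, without an intermediate shell
hyperfine --shell=none 'echo test'

# same benchmark, but with a preparation command before every timed run
hyperfine --shell=none --prepare 'sleep 1' 'echo test'
```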
Interestingly, if I use […]
This seems to be a real effect which is not related to hyperfine. Another thing I did was to look at the distribution of runtimes for the run with […]. We can see that the main peak is around 3 ms, but there is a clear second peak at the low (~500 µs) runtime that we saw in the benchmark without `--prepare`. I don't know the real reason for this, but here are two wild guesses:

1. The preparation command lets the CPU idle (or drop into a lower frequency/power state), so the benchmarked command first runs on a core that still has to ramp back up.
2. Some caching effect: the preparation command evicts the benchmarked program's code or data from the caches, so it starts cold.
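One way to obtain the runtime distribution mentioned above is to export the individual measurements and plot them, e.g. with the `plot_histogram.py` helper from the hyperfine repository's `scripts/` directory; whether exactly this was used here is an assumption:

```bash
# record many runs and export the per-run timings to JSON
hyperfine --shell=none --runs 1000 --prepare 'sleep 1' --export-json with_prepare.json 'echo test'

# plot the exported timings as a histogram
python scripts/plot_histogram.py with_prepare.json
```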
Some evidence for hypothesis 1 comes from the following experiment, where I pin […]
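If pinning here refers to pinning the processes to a single CPU core, `taskset` is one way to do it; fixing the frequency scaling governor would be another reading. Either way, the commands below are assumptions about the actual experiment:

```bash
# pin hyperfine (and the commands it spawns) to CPU core 2
taskset -c 2 hyperfine --shell=none --prepare 'sleep 1' 'echo test'

# alternatively, keep the cores from clocking down between runs
sudo cpupower frequency-set --governor performance
```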
I did not know that shells use builtin commands instead of the actual binary, but when I think of it, it is kinda obvious that they would do it for performance reasons. Is there a more convenient way to check if a built-in is used, other than checking if a new process is created?

Hypothesis 1:

If that is the case, a spinlock should not show the same behavior, because it never signals the OS that it is idling. To test this I created a simple loop with the `asm!` macro to prevent compiler optimizations:

```rust
use std::arch::asm;
const COUNT: usize = 100_000_000;

fn main() {
    unsafe {
        asm!(
            // load the iteration count and busy-loop it down to zero
            "mov rax, {0:r}",
            "2:",
            "dec rax",
            "jnz 2b",
            in(reg) COUNT,
            // rax is overwritten by the loop, so declare it as clobbered
            out("rax") _,
        );
    }
}
```

Benchmarking this with hyperfine reported around 23 ms of runtime on my system, and htop shows 100% usage of the CPU core. Using this spinlock, the effect is still present in the echo benchmark, even with pinning.
However, in the `head`-from-`/dev/urandom` benchmark it is not present anymore.
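Assuming the compiled spinlock binary (called `./spin` here, a made-up name) was substituted for the sleeping preparation command, the two runs could have looked roughly like this; the exact `/dev/urandom` read is also an assumption:

```bash
# echo benchmark, with the busy-looping binary as the preparation command
taskset -c 2 hyperfine --shell=none --prepare './spin' 'echo test'

# slower benchmark: read 3000000 bytes from /dev/urandom
taskset -c 2 hyperfine --shell=none --prepare './spin' 'head -c 3000000 /dev/urandom'
```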
Afaik there should be no caching possible when reading from `/dev/urandom`, therefore I think the first hypothesis can be correct.

Hypothesis 2:

We can use the same command for `--prepare` as for the benchmark itself.
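A sketch of such a run, using the benchmarked command itself as the preparation command (the concrete invocation is an assumption):

```bash
# warm everything up with the exact command that is being measured
hyperfine --shell=none --prepare 'echo test' 'echo test'
```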
And now it actually looks to be equal. So maybe both of your hypotheses are correct.

Confirming that it is not tool specific:

It is probably a good idea to try the same benchmarking with a different tool and see if it behaves the same. For that we would need to find a tool that offers these capabilities; I can't think of one off the top of my head.

Adding a warning / notice about this behavior

I think it would be a good idea to document this behavior somewhere to make people aware of it, even though it might not be specific to this tool but rather to the system architecture.

Possible solutions

Maybe it is possible to gain exclusive access to a single core? Not sure if Linux supports this. This would prevent the scheduler from running other processes on the core while the benchmarked process is waiting for I/O or otherwise idling.
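Linux does support reserving a core; one possible sketch uses the `isolcpus` kernel parameter together with `taskset` (core 3 is chosen arbitrarily, and the benchmarked command is just an example):

```bash
# boot with isolcpus=3 on the kernel command line, so the scheduler keeps
# ordinary tasks off core 3, then place the benchmark on that core explicitly
taskset -c 3 hyperfine --shell=none 'echo test'
```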
When using hyperfine I noticed that it always reported worse performance when using the `--prepare` option to execute a command in between runs, even if that command should not have any effect on the system or on the benchmarked command. I tested `--prepare` with commands like `sleep 1` and `curl <some address>`. As benchmark command I used `echo test`. This is a very short command and hyperfine warns about it and suggests using `--shell=none`, but it happens even with that. Here is a recording.

To make sure it's not only with very short commands, I also tested it with a slower command: reading 3000000 bytes from `/dev/urandom`. It showed the same effect, but the impact was smaller, probably because it is a constant slowdown or one that grows much more slowly than the overall execution time. Recording.

Is this expected behavior for some reason? Or is it a bug?
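Concrete forms of these runs might have looked as follows; `<some address>` is kept as the placeholder from the report, and the exact `/dev/urandom` read command is a stand-in:

```bash
# prepare with a command that should not influence the benchmark itself
hyperfine --shell=none --prepare 'curl <some address>' 'echo test'

# the slower benchmark: read 3000000 bytes from /dev/urandom
hyperfine --prepare 'sleep 1' 'head -c 3000000 /dev/urandom'
```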
Also another question: why is using `--shell=none` so much slower in the `echo test` example? Is it because of `stdin` and `stdout` connections?

Some system information:
Arch Linux with kernel 6.6.9-arch1-1 (btw)
CPU: Ryzen 5900X
Desktop: i3wm
Executed in: Alacritty -> Fish Shell -> Zellij