Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mtest: fix rebuilding correct target list before running tests #10413

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

eli-schwartz
Copy link
Member

@eli-schwartz eli-schwartz commented May 20, 2022

Regression in the original implementation of commit 79e2c52.

If an explicit list of targets is passed on the CLI, then that is passed to rebuild_deps. If not, we pass every loaded test to rebuild_deps instead. This means we cannot distinguish between "trying to run all tests" and "trying to run specific tests". We then load all the deps for all tests, and try to build them all as explicit arguments to the underlying ninja.

This is usually papered over by people running ninja test, which depends on "all" anyway as a prerequisite for invoking the underlying meson test --no-rebuild --print-errorlogs, however, instead running meson setup builddir && meson test -C builddir with under-specified test deps would actually build a handful of deps and not anything else, then fail.

The reverse problem occurs when manually specifying a target on the CLI, which does not have any test deps. Since we calculated that there are no deps to rebuild, we run ninja without any targets, and this invokes the default "all" rule and maybe builds a few thousand targets that this specific test does not need.

@eli-schwartz
Copy link
Member Author

The first and second commits fix fundamentally very different things, and I'm not actually 100% sure about the first one at all.

In @bonzini's original implementation of #7902 it may or may not have been intentional to have a bare meson test only rebuild the targets directly depended upon by correctly wired dependencies. The ninja test flow anyways will cover your back by rebuilding all and the ninja target is often considered "run the meson tool with sensible defaults", so that may or may not be a rational approach to take here.

@codecov
Copy link

codecov bot commented May 20, 2022

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 68.43%. Comparing base (5d0538d) to head (45f3478).
Report is 2282 commits behind head on master.

Files Patch % Lines
mesonbuild/mtest.py 75.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10413      +/-   ##
==========================================
- Coverage   68.75%   68.43%   -0.33%     
==========================================
  Files         406      406              
  Lines       87646    87652       +6     
  Branches    19486    19488       +2     
==========================================
- Hits        60260    59982     -278     
- Misses      22848    23080     +232     
- Partials     4538     4590      +52     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jpakkane
Copy link
Member

Have not gone through the implementation yet, but for the record the conceptual ideas I had were the following:

  • ninja test and plain meson test should do the same thing
  • ninja test depends on the all target and then invokes meson test with --no-rebuild
  • a plain meson test invokes ninja all (or equivalent) before running tests
  • meson test testname only rebuilds things that test explicitly depends on

Basically no matter what the user does, Meson would take care of ensuring that all dependencies are up to date. If said dependencies are set up incorrectly then that is a bug in the build files and the user's responsibility to fix. Similarly using --no-rebuild means that the user takes full responsibility of ensuring things are up to date.

@eli-schwartz
Copy link
Member Author

eli-schwartz commented May 22, 2022

Meson would take care of ensuring that all dependencies are up to date.

Yeah, the question is, what constitutes "all dependencies"?

If said dependencies are set up incorrectly then that is a bug in the build files and the user's responsibility to fix.

In theory, per the description above, this only counts when running individual tests. There is no way to run the full testsuite while assuming that dependencies are set up correctly and only rebuilding whatever is necessary for "all the tests".

In reality, the referenced PR actually enabled that workflow. Run ninja test and all targets, including the ones you didn't want for just tests, get rebuilt. Run meson test, and only the targets you need for testing, get rebuilt.

This PR removes that capability by making meson test change behavior to behave exactly like ninja test.

@jpakkane
Copy link
Member

Just to be sure, here is the same thing said in a different way:

If you set up a fresh build dir and run either ninja test or meson test, the outcome should be exactly the same as first running ninja and then ninja test or meson test.

@jpakkane
Copy link
Member

This PR removes that capability by making ninja test behave exactly like meson test.

Does that mean that it no longer builds all? If yes, then it should probably be the other way around where meson test is changed to do the same as ninja test.

@eli-schwartz
Copy link
Member Author

eli-schwartz commented May 22, 2022

Does that mean that it no longer builds all?

Sorry, that was bad wording, tweaked. This PR changes the behavior of meson test, so that ninja test and meson test behave the same.

Currently meson test does not rebuild all (unless all tests fail to specify any deps). This PR makes it do so.


runner = find_program('runner.py')
exe1 = executable('main1', 'main.c')
exe2 = executable('main2', 'main.c')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness one of these two should have build_by_default: false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not really what I'm testing here, I'm testing that meson test needs to build "all". I'm not testing what is included in "all".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is relevant for this. Rebuilding needs to work even if the end user has tagged a test executable as build_by_default: false so we should test it.

@bonzini
Copy link
Contributor

bonzini commented May 23, 2022

it may or may not have been intentional to have a bare meson test only rebuild the targets directly depended upon by correctly wired dependencies

It was intentional. Not changing the behavior of ninja test was not intentional, however.

a plain meson test invokes ninja all (or equivalent) before running tests

I'm not sure why that would be the case. If you want to compile everything and run the tests, that would be a meson compile task. meson test on the other hand should do what it says on the tin, i.e. just run tests.

I don't really care what ninja test should do (I have never noticed the difference since #7902, in fact, and I think it's telling that it took 18 months for someone to notice).

My suggestion is to add meson compile --test as a portable, and in fact superior, replacement for ninja test. It would run the compilation as usual, and then use introspection info to run tests that depend on the targets that have been built. meson compile --test with no arguments would indeed run all tests, since no targets are specified.

@eli-schwartz
Copy link
Member Author

My suggestion is to add meson compile --test [...]

My suggestion is to remove mcompile.py without a replacement, I'm not fond of any suggestion that involves making it even more confusingly inconsistent and badly designed than the current disaster.

It was intentional. Not changing the behavior of ninja test was not intentional, however.

Well, that's a problem. Not because I don't like the idea -- in fact I agree with you and think @jpakkane is completely wrong. But because quite plainly no one understood the change, and it wasn't documented behavior.

I don't really care what ninja test should do (I have never noticed the difference since #7902, in fact, and I think it's telling that it took 18 months for someone to notice).

This sentence basically feels to me like the underlying attitude is "I managed to pull the wool over your eyes for 18 months before you noticed".

I don't think there's anything telling about the fact that it took 18 months for anyone to notice the intended purpose of that change. It also took 18 months to notice the unintended bug that built potentially 5k unwanted translation units for a test that legitimately has no dependencies.

All it really says is that no one is using the functionality you added, for better or for worse.

What I'd actually like to see as the outcome of this PR, is a discussion about what mtest should really do, and document the conclusion so that people know what behavior to expect.

@bonzini
Copy link
Contributor

bonzini commented May 23, 2022

But because quite plainly no one understood the change, and it wasn't documented behavior.

I agree it's a problem that ninja test behavior wasn't understood properly.

The fact that ninja test behavior isn't documented is a preexisting issue. The fact that meson test and ninja test diverged could actually be seen as a feature, not a bug; ninja test is more of a user-that-compiles-the-program thing, while meson test is for developers. In any case should be documented (again, my preferred solution would actually involve improving meson compile; but let's agree to disagree there).

However, I don't see what you mean by "no one understood the change". The issue that you point out ("meson test -C builddir with under-specified test deps would actually build a handful of deps and not anything else, then fail") was both expected and mentioned in the release notes: "this change could cause failures when upgrading to 0.57, if the dependencies are not specified correctly".

The reason why it is much better than what you make it sound like, is that —see create_test_serialisation and commit fa5c236—test dependencies always include test executable and arguments, if these are build targets, and therefore most common cases work. For example:

$ cd test\ cases/common/1\ trivial/
$ meson setup build
$ meson test
ninja: Entering directory `/home/pbonzini/work/upstream/meson/test cases/common/1 trivial/build'
...
Ok:             1
...

And ninja was invoked with arguments, i.e. requesting to only build the test program:

$ strace -ff -e execve meson test
...
[pid 323089] execve("/usr/bin/ninja", ["/usr/bin/ninja", "-C", "/home/pbonzini/work/upstream/mes"..., "trivialprog"], 0x7ffde71a8a88 /* 75 vars */) = 0

The broken case will happen when you have a test script that takes care of running the executable, and the test script makes assumptions on the path to the executable. Then indeed you would have a test with no dependencies, and that triggers the incorrect build of the default target. But it's a corner case and not the common one.

This sentence basically feels to me like the underlying attitude is "I managed to pull the wool over your eyes for 18 months before you noticed".

Sorry if that's how it came across. What I meant is simply that developers are probably not using ninja test when bisecting or fixing bugs, which is why they never noticed the difference between it and meson test.

All it really says is that no one is using the functionality you added, for better or for worse.

That seems quite a jump. Did you miss that dependencies come from more than just test(depends: ...) perhaps? I guarantee that it's in use.

@eli-schwartz
Copy link
Member Author

eli-schwartz commented May 23, 2022

However, I don't see what you mean by "no one understood the change". The issue that you point out ("meson test -C builddir with under-specified test deps would actually build a handful of deps and not anything else, then fail") was both expected and mentioned in the release notes: "this change could cause failures when upgrading to 0.57, if the dependencies are not specified correctly".

Again, I actually like the idea of meson test having that effect.

But the actual PR and release notes was described as having that effect for meson test <test-name>.

I asked @jpakkane whether that was expected for meson test too, because we don't really document how this works -- he said no, so I implemented what he said is supposed to happen.

I discovered this as a side effect of discovering that a test without any dependencies rebuilt the entire project.

That seems quite a jump. Did you miss that dependencies come from more than just test(depends: ...) perhaps? I guarantee that it's in use.

My test case explicitly makes use of such dependencies. It tests that a test('name', runner, args: built_exe) works with correctly wired dependencies, and it tests that a test('name', runner, args: built_exe.full_path()) has incorrectly wired dependencies, because I have seen various projects actually use .full_path() which is completely extraneous and outright breaks what would have otherwise worked, presumably because they strangely think you can't pass a built target as an argument to test() and didn't check the docs to see whether it is allowed. Strange, but there you have it.

The thing that I think perhaps people aren't actually using, is "bisecting projects other than qemu with something other than ninja test because we don't have such complicated needs", or "running meson test for anything other than a scripted recipe doing meson setup ... && ninja && meson test && meson install.

What I think is that actually using the full power of the tools is relatively rare, and the relatively few people doing it also tend to spend a lot of time making sure their test definitions are correct, and are unlikely to notice edge cases, so... Saying that nobody complained about the interactions of a uncommon workflow is not really a defense for functionality that apparently the lead developer doesn't like. Any more than it's a defense for the bug you didn't notice because it's uncommon for you to have tests that don't depend on anything.

Instead of discussing whether people noticed or should have noticed, why don't we just discuss whether, in a bubble, it's the desired behavior? And then document it, fair and square and out in the open so it's no longer a matter of confusion or disagreement.

@eli-schwartz
Copy link
Member Author

eli-schwartz commented May 23, 2022

Alright, opinion time.

@jpakkane

The meson <foo> tools generally have flexibility in how you use them. We also generate ninja <foo> targets that try to guess the most useful default and then do that. This is for example why ninja test prints errorlogs.

I posit that this is also why ninja test depends on all, to cover your back in case you have badly specified tests that lack dependencies due to globbing, wrapper scripts with assumptions, or the unfortunate use of .full_path().

IMO meson test should fundamentally assume you specify the test dependencies correctly, and only rebuild actual dependencies. This is no longer a breaking change, since we've been doing it for a while now, so... If we document that, then people know what to assume, and we have a project stance that aligns with the implementation to point to and stick with.

If people want to run something that goes to extra lengths to rebuild everything you might need but forgot to say you need, then that exists, and it's called ninja test.

@bonzini
Copy link
Contributor

bonzini commented May 23, 2022

The thing that I think perhaps people aren't actually using, is "bisecting projects other than qemu with something other than ninja test"

Okay, for that I also can guarantee that people are using it - there are a couple totally unrelated to qemu (or libvirt) even in the pull request itself.

I posit that this is also why ninja test depends on all, to cover your back in case you have badly specified tests that lack dependencies due to globbing, wrapper scripts with assumptions, or the unfortunate use of .full_path().

That would be retconning, but it's sensible retconning and I agree with it.

So, anyway we agree on both the desired outcome and the way to reach it (commit the second patch and document the rest). Any improvements to meson compile (including burning and salting it) can be discussed in a separate PR.

@bonzini
Copy link
Contributor

bonzini commented May 24, 2022

@eli-schwartz are you going to adjust the PR or perhaps open another one?

@@ -1666,9 +1666,10 @@ def doit(self) -> int:
raise RuntimeError('Test harness object can only be used once.')
self.is_run = True
tests = self.get_tests()
rebuild_only_tests = tests if self.options.args else []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I think this line could go after the if not tests: return 0

Copy link
Member

@bruchar1 bruchar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Would be helpful to me if we could get it in the next release.

eli-schwartz and others added 3 commits November 20, 2024 13:47
Inconsistency in the original implementation of commit
79e2c52.

If an explicit list of targets is passed on the CLI, then that is passed
to rebuild_deps. If not, we pass every loaded test to rebuild_deps
instead. This means we cannot distinguish between "trying to run all
tests" and "trying to run specific tests". We then load all the deps for
all tests, and try to build them all as explicit arguments to the
underlying ninja.

There are two situations where this falls flat:

- given underspecified deps
- given all (selected?) tests legitimately happen to have no
  dependencies

In both cases, we calculate that there are no deps to rebuild, we run
ninja without any targets, and this invokes the default "all" rule and
maybe builds a few thousand targets that this specific test run does not
need.

Additionally, in large projects which define many tests with many
dependencies, we could end up overflowing ARG_MAX when processing *all*
tests.

Instead, pass no tests to rebuild_deps. We then specially handle this by
directly running the relevant ninja target for "all test deps", which is
overall more elegant than specifying many many dependencies by name.

Given a subset of tests to guarantee the freshness of, we instead skip
running ninja at all if there are indeed no test dependencies.
When running `ninja all` we shouldn't build testsuite programs as these
might not be wanted e.g. in order to just install the project. We do
want them to be built when running `ninja test`. Since meson 0.63 we
actually have a dedicated ninja alias for test dependencies -- move
these from the "all" rule to the dedicated test/benchmark rules.
Copy link
Contributor

@bonzini bonzini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworded the release notes, but otherwise looks great.

Comment on lines +3 to +19
The `ninja test` / `meson test` rules have been reworked to no longer force
tests to be built unnecessarily. The correct, guaranteed workflow for this has
been to either run `ninja test` or `meson test`, both of which rebuild
dependencies on demand. *Also* building test-only binaries as part of
installing the project (`ninja && ninja install`) was unnecessary and has no
use case.

Some users might have been relying on the "all" target building test
dependencies in combination with `meson test --no-rebuild` in order to skip
calling out to ninja when running tests. The `--no-rebuild` option has always
been intended for expert usage -- you must provide your own guarantees that it
will work -- and it should be noted that this change means test programs are no
longer guaranteed to have been built, depending on whether those test programs
were *also* defined to build by default / marked as installable. The desired
behavior of building test programs in a separate stage can be restored by
building `ninja all meson-test-prereq` (or `meson-benchmark-prereq` for running
benchmarks), as these prereq targets have been available since meson 0.63.0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ninja test behavior did not have substantial changes though, did it? Instead the change was in ninja all (removing meson-test-prereq meson-benchmark-prereq) and meson test (removing the invocation of ninja with no arguments).

Suggested change
The `ninja test` / `meson test` rules have been reworked to no longer force
tests to be built unnecessarily. The correct, guaranteed workflow for this has
been to either run `ninja test` or `meson test`, both of which rebuild
dependencies on demand. *Also* building test-only binaries as part of
installing the project (`ninja && ninja install`) was unnecessary and has no
use case.
Some users might have been relying on the "all" target building test
dependencies in combination with `meson test --no-rebuild` in order to skip
calling out to ninja when running tests. The `--no-rebuild` option has always
been intended for expert usage -- you must provide your own guarantees that it
will work -- and it should be noted that this change means test programs are no
longer guaranteed to have been built, depending on whether those test programs
were *also* defined to build by default / marked as installable. The desired
behavior of building test programs in a separate stage can be restored by
building `ninja all meson-test-prereq` (or `meson-benchmark-prereq` for running
benchmarks), as these prereq targets have been available since meson 0.63.0.
`meson test` and the `ninja all` rule have been reworked to no longer force
unnecessary rebuilds.
`meson test` was invoking `ninja all` due to a bug if the chosen set of tests
had no build dependencies. The behavior is now the same as when tests do
have build dependencies, i.e. to only build the actual set of targets that are
used by the test. This change could cause failures when upgrading to Meson
1.7.0, if the dependencies are not specified correctly in meson.build.
`ninja all` does not rebuild all tests anymore; it should be noted that this
change means test programs are no longer guaranteed to have been
built, depending on whether those test programs were *also* defined
to build by default / marked as installable. This avoids building test-only
binaries as part of installing the project (`ninja && ninja install`), which
is unnecessary and has no use case. Some users might have been
relying on the "all" target building test dependencies in combination
with `meson test --no-rebuild` in order to skip calling out to ninja when
running tests. The correct, guaranteed workflow for this has always been
to run either `ninja test` or `ninja && meson test`, both of which rebuild
dependencies on demand.
If you wish to build test programs and dependencies in a separate stage,
you can restore this behavior with `ninja all meson-test-prereq` (or
`meson-benchmark-prereq` for benchmarks), as these prereq targets
have been available since meson 0.63.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants