Fewer mutations generated with version 1.15.4 #1291

Open
Stephan202 opened this issue Jan 21, 2024 · 10 comments

@Stephan202

While reviewing PicnicSupermarket/error-prone-support#984, I noticed that version 1.15.4 of Pitest reports fewer mutations than version 1.15.3. On cursory inspection it appears that version 1.15.4 no longer mutates deferred code that is referenced by static initialization expressions, such as:

  • Lambda expressions directly or indirectly referenced by enum and static field initialization expressions, such as here and here.
  • Methods that are referenced only by a method reference passed to a static field initialization expression, such as here.

The following reproduction steps show the issue (the above are just some examples; quite a lot of other classes are impacted as well):

# Clone the repo.
git clone git@github.com:PicnicSupermarket/error-prone-support.git
cd error-prone-support

# Collect mutation coverage before the upgrade.
./run-mutation-tests.sh
mv error-prone-contrib/target/pit-reports /tmp/pit-reports-before

# Collect mutation coverage after the upgrade.
git checkout origin/renovate/pitest-maven-plugin-1.x
./run-mutation-tests.sh
mv error-prone-contrib/target/pit-reports /tmp/pit-reports-after

# Compare the reports.
firefox /tmp/pit-reports-before/index.html
firefox /tmp/pit-reports-after/index.html

An example of a dramatic difference, for the class tech.picnic.errorprone.bugpatterns.CanonicalAnnotationSyntax:

  • Before: image
  • After: image
@hcoles
Owner

hcoles commented Jan 21, 2024

Thanks for the report.

I've just taken a quick look and I suspect the cause is #1274, which went into this release but was left out of the release notes.

This change expands pitest's search for code that is only called from static initializers. Although we'd ideally like to mutate this, because the code is only executed once within a JVM, it is only possible to kill these mutants if pitest happens to have launched a fresh JVM just before the mutant is inserted.

From a very quick scan of your example class, this looks to be working as intended, since dropRedundantParentheses etc. are called only when initializing a static field.

@Stephan202
Author

Stephan202 commented Jan 21, 2024

@hcoles thanks for the quick response!

Perhaps I misunderstand, but since execution of the impacted code is deferred, I would expect it to be excluded from the search. After all, the version 1.15.3 report shows that the mutants could indeed be killed prior to this change (without restarting the JVM).

Put another way, given a method foo that is referenced only by a static field, I would expect it to be excluded when used as

private static final String CONST = foo();

but not when used as

private static final Supplier<String> CONST = () -> foo();
private static final Supplier<String> CONST_2 = ThisClass::foo;

Because in the latter cases any mutation of foo will impact use of the static Suppliers.
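The distinction can be made concrete with a small sketch. It is an illustration only, not code from the affected project: the class name DeferredExample and the fooCalls counter are assumptions added here to make the number of executions of foo observable.

```java
import java.util.function.Supplier;

class DeferredExample {
    // Hypothetical counter, added only to observe how often foo() runs.
    static int fooCalls = 0;

    static String foo() {
        fooCalls++;
        return "foo";
    }

    // Eager: foo() runs exactly once, while the class is being initialized.
    static final String CONST = foo();

    // Deferred: nothing runs during initialization; foo() runs on every get().
    static final Supplier<String> CONST_LAMBDA = () -> foo();
    static final Supplier<String> CONST_REF = DeferredExample::foo;

    public static void main(String[] args) {
        // Only the eager initializer has called foo() so far.
        System.out.println(fooCalls); // 1
        CONST_LAMBDA.get();
        CONST_REF.get();
        // Each Supplier invocation executed foo() again.
        System.out.println(fooCalls); // 3
    }
}
```

Because foo is re-executed on every get(), a test that invokes the Suppliers after a mutant has been inserted still exercises the mutated body, which is not true for the eager CONST case.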

@hcoles
Owner

hcoles commented Jan 21, 2024

The mutants are not completely unkillable, but they are not reliably killable.

If they happen to be the first mutant in one of the JVMs that pitest launches, they will be killed by an effective test. Unfortunately, they may also cause other mutants to be falsely killed, as any side effects from their execution (e.g. bad state stored in a static variable) will endure in the JVM from that point on.
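Both effects can be reproduced in a single JVM with a rough sketch. The mutantActive flag is a hypothetical stand-in for "a mutant is currently inserted" (pitest replaces bytecode rather than flipping a flag), and all class names are made up for the illustration.

```java
class StaticInitOnce {
    // Hypothetical stand-in for an active mutant; pitest swaps bytecode instead.
    static boolean mutantActive = false;

    static int compute() {
        return mutantActive ? -1 : 42;
    }

    // Nested classes are initialized lazily, on first access to VALUE.
    static class InitializedBeforeMutant {
        static final int VALUE = compute();
    }

    static class InitializedDuringMutant {
        static final int VALUE = compute();
    }

    public static void main(String[] args) {
        // First access runs the initializer with the original behaviour.
        System.out.println(InitializedBeforeMutant.VALUE); // 42
        mutantActive = true;
        // The initializer does not run again, so the mutant goes unnoticed.
        System.out.println(InitializedBeforeMutant.VALUE); // 42
        // First access happens while the mutant is active: the bad value is stored.
        System.out.println(InitializedDuringMutant.VALUE); // -1
        mutantActive = false;
        // The bad value outlives the mutant and can fail unrelated tests.
        System.out.println(InitializedDuringMutant.VALUE); // -1
    }
}
```

The first class shows a mutant surviving because its code already ran; the second shows stale state that could make later, unrelated mutants appear killed.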

However, I think you are correct that there is an issue here. The change has not considered delayed execution from Suppliers etc.

I'll most likely re-release as 1.15.5 tomorrow with the change rolled back until this is reexamined.

Thanks again for the report.

@hcoles
Owner

hcoles commented Aug 28, 2024

@Stephan202 Pitest 1.16.2 has just been released with #1274 reintroduced, but with some additional analysis to pick up Suppliers etc.

Let me know if it causes any problems for your codebase.

@Stephan202
Author

Very cool! The automated upgrade PR should land tomorrow; I'll keep an eye on it :)

@Stephan202
Author

Alright, now that PicnicSupermarket/error-prone-support#1313 landed I had a look. On our code base the number of generated mutants went down again, causing a lower "test strength" to be reported. The reproduction case in this issue's first message applies again: for CanonicalAnnotationSyntax the number of mutants went back down to three. I guess for this case it may be hard to infer that the methods are BiFunctions? 🤔

If "functional types" such as Supplier are special-cased, it may be nice to make that set extensible. For example, in our code base we define a lot of static com.google.errorprone.matchers.Matcher (subtype) instances.

Alternatively: perhaps the feature as a whole could be toggled? (But I can see that's not nice from a UX/API point of view.)

@hcoles
Owner

hcoles commented Aug 29, 2024

Yes, it can't currently cope with collections of function types as the contained type is lost due to type erasure on call. Thinking about this again, it ought to be possible to fix that with a pre-scan of class fields to collect the declarations.
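To illustrate why the field declaration helps, here is a minimal sketch using reflection. It is not how pitest does it (pitest analyses bytecode); the class ErasureExample and the helper elementType are assumptions for the example. The point is only that the contained type is absent from the runtime value but still recorded on the declaration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.function.Supplier;

class ErasureExample {
    static String foo() {
        return "foo";
    }

    // A collection of function types, as in the problematic case.
    static final List<Supplier<String>> SUPPLIERS = List.of(ErasureExample::foo);

    // Hypothetical helper: returns the first type argument of the
    // declared (generic) type of the named field.
    static Type elementType(String fieldName) {
        try {
            Field field = ErasureExample.class.getDeclaredField(fieldName);
            ParameterizedType declared = (ParameterizedType) field.getGenericType();
            return declared.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // The runtime value only knows it is some List implementation.
        System.out.println(SUPPLIERS.getClass());
        // The declaration still records that the elements are Suppliers.
        System.out.println(elementType("SUPPLIERS"));
    }
}
```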

That would just leave the problem of the special-cased types. In theory that could be (largely) solved by scanning the classpath for anything annotated with FunctionalInterface, although that may cause some issues.
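As a rough sketch of what such an annotation check could look like, and of one limitation it would have: the annotation is optional in Java, so an interface usable as a lambda target need not carry it. The two matcher interfaces below are hypothetical and stand in for library types; the reflection-based check is an assumption for illustration, not pitest's approach.

```java
import java.util.function.Supplier;

class FunctionalInterfaceScan {
    // Hypothetical interface: one abstract method, but no annotation.
    interface UnannotatedMatcher {
        boolean matches(Object tree);
    }

    // Same shape, explicitly annotated.
    @FunctionalInterface
    interface AnnotatedMatcher {
        boolean matches(Object tree);
    }

    // True only for interfaces that carry @FunctionalInterface.
    static boolean isAnnotatedFunctionType(Class<?> type) {
        return type.isInterface() && type.isAnnotationPresent(FunctionalInterface.class);
    }

    public static void main(String[] args) {
        System.out.println(isAnnotatedFunctionType(Supplier.class));           // true
        System.out.println(isAnnotatedFunctionType(AnnotatedMatcher.class));   // true
        // Still a valid lambda target, yet invisible to an annotation scan.
        System.out.println(isAnnotatedFunctionType(UnannotatedMatcher.class)); // false
    }
}
```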

I'll take another look at this and see if things can be improved with a 2nd iteration.

@hcoles
Owner

hcoles commented Aug 30, 2024

I had a play with this yesterday and made some significant improvements. Unfortunately, scanning the project classpath (i.e. all dependencies) for function classes looks like it would cause a performance hit of several seconds. This is insignificant for a full mutation analysis, but is not acceptable when running against diffs.

So, it looks like it will need to stick to a fixed list of classes.

I'm still undecided on whether to scan the user code. This is less of a performance hit, but may be confusing when the same class acts differently when used as a dependency vs directly.

@hcoles
Owner

hcoles commented Sep 2, 2024

@Stephan202 1.16.3 is releasing now and ought to re-enable the majority of mutants.

After some consideration I went with the simplest possible approach, with no scanning and a fixed list of classes.
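For readers following along, a fixed-list check might look roughly like the sketch below. The entries are illustrative guesses only; the list actually shipped with pitest is not stated in this thread, and the names FixedFunctionTypes and delaysExecution are made up.

```java
import java.util.Set;

class FixedFunctionTypes {
    // Illustrative entries only; not the list pitest uses.
    static final Set<String> DEFERRED_TYPES = Set.of(
            "java.util.function.Supplier",
            "java.util.function.Function",
            "java.util.function.BiFunction");

    // Hypothetical check: is a static field of this type assumed to delay
    // execution of the code it references?
    static boolean delaysExecution(String fieldTypeName) {
        return DEFERRED_TYPES.contains(fieldTypeName);
    }

    public static void main(String[] args) {
        System.out.println(delaysExecution("java.util.function.Supplier")); // true
        // A project-specific function type is not recognised by a fixed list.
        System.out.println(delaysExecution("com.google.errorprone.matchers.Matcher")); // false
    }
}
```

The lookup needs no classpath scanning, which matches the performance concern above, at the cost of not recognising function types outside the list.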

@Stephan202
Author

Tnx for working on this @hcoles! I can report that version 1.16.3 generates almost the same set of mutants as version 1.16.1. I documented a small regression here. IIUC the first point documented there is "expected", given the trade-off you describe. Not sure about the second point.
