Question: what is the right way to deal with state during generation in jqwik? #617

Open
DavidGregory084 opened this issue Feb 17, 2025 · 4 comments


DavidGregory084 commented Feb 17, 2025

Thanks for jqwik! It's really great to be able to use such a well documented and comprehensive PBT framework in Java codebases.

I am using jqwik for something that it's perhaps not ideally suited for: generating well-typed syntax trees in a compiler codebase, so that I can fuzz the JVM code generation process. It's working surprisingly well, and has helped me to find all kinds of bugs!

You can see the arbitraries I am using here.

I have been able to get this far by using Arbitraries#recursive and sprinkling lazy around liberally.

However, I still (infrequently) run into issues where the syntax trees generated make use of variables that are out of scope at the use site.

This is because I am trying to put values into mutable scopes inside my generators and then push and pop those scopes as appropriate for the current point in the generation, but doing side effects inside an arbitrary seems difficult to reason about without knowledge of the internals of jqwik.

I have looked at the stateful generation features, but I don't actually want to test a mutable interface in my properties, I want to make use of state during generation. I actually don't mind if that is done via a mutable or immutable interface, I just need a way to thread that state through the generation process somehow.

In the Scala world I have previously used StateT with ScalaCheck's Gen to do the same thing and that worked well, but I'm struggling to find a good way to achieve the same thing with jqwik.
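For readers unfamiliar with that pattern, here is a rough sketch of a state-passing generator in plain Java. Everything in it (`Step`, `StateGen`, `StateGenDemo`, the `v0`, `v1`, ... naming scheme) is hypothetical and written only for illustration; it is not jqwik or ScalaCheck API, it draws directly from `java.util.Random`, and it does no shrinking. Each step receives the current scope and returns the updated scope together with the generated value, so a later step can only see names that earlier steps actually declared.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

// Result of one generation step: the updated state plus the generated value.
final class Step<S, A> {
    final S state;
    final A value;

    Step(S state, A value) {
        this.state = state;
        this.value = value;
    }
}

// Hypothetical state-passing generator, loosely modelled on StateT over a generator.
interface StateGen<S, A> {
    Step<S, A> run(Random random, S state);

    // Run this step, then feed the state it produced into the next step.
    default <B> StateGen<S, B> flatMap(Function<? super A, StateGen<S, B>> next) {
        return (random, state) -> {
            Step<S, A> first = run(random, state);
            return next.apply(first.value).run(random, first.state);
        };
    }

    // Transform the generated value and leave the state as this step produced it.
    default <B> StateGen<S, B> map(Function<? super A, ? extends B> f) {
        return (random, state) -> {
            Step<S, A> step = run(random, state);
            return new Step<>(step.state, f.apply(step.value));
        };
    }
}

final class StateGenDemo {
    // Declares a fresh variable initialised from a literal or from a variable already in scope.
    static StateGen<List<String>, String> declaration() {
        return (random, scope) -> {
            String name = "v" + scope.size();
            String init = (scope.isEmpty() || random.nextBoolean())
                ? Integer.toString(random.nextInt(10))
                : scope.get(random.nextInt(scope.size()));
            List<String> extended = new ArrayList<>(scope);
            extended.add(name);
            return new Step<>(Collections.unmodifiableList(extended), name + " = " + init);
        };
    }

    // Three declarations in sequence; each one sees the scope left by the previous one.
    static StateGen<List<String>, String> program() {
        return declaration().flatMap(first ->
            declaration().flatMap(second ->
                declaration().map(third -> first + "; " + second + "; " + third)));
    }

    public static void main(String[] args) {
        Step<List<String>, String> result =
            program().run(new Random(7), Collections.<String>emptyList());
        System.out.println(result.value + "  (scope afterwards: " + result.state + ")");
    }
}
```

The scope is an immutable list that is copied on every declaration, so no step can observe a scope that a different branch of the generator has pushed or popped.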

Am I missing something that I could use for this? If not, no big deal, what I have now is working most of the time and when it doesn't it typically means I forgot a lazy somewhere!

@DavidGregory084 DavidGregory084 changed the title Question: what is the right way to deal with mutable state in jqwik? Question: what is the right way to deal with mutable state during generation in jqwik? Feb 17, 2025
@DavidGregory084 DavidGregory084 changed the title Question: what is the right way to deal with mutable state during generation in jqwik? Question: what is the right way to deal with state during generation in jqwik? Feb 17, 2025
Collaborator

jlink commented Feb 17, 2025

@DavidGregory084 Thanks for using jqwik! Could you provide the simplest example you can think of that would show your intent? Or a test that fails reliably so that I can check what's going on?
Without really understanding what you do, I can at least confirm that the combination of recursive and lazy might become problematic. Both make use of jqwik's internal generator caching, which is tricky and uses side-effects.
So maybe a minimal example could enable me to understand the problem at hand.

Collaborator

jlink commented Feb 17, 2025

In general, you have to push all state into dependent generators via flatMap or combine(..).flatAs(..); otherwise you just don't know which version of your state is available during generation. It looks like this rule is violated by GenEnvironment env being handed around very liberally. But my dry analysis might be wrong.
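A minimal sketch of that rule in plain Java, so that it runs without any dependencies: `Gen<T>` is a hypothetical stand-in for jqwik's `Arbitrary<T>` that offers only `map` and `flatMap`, and `Scope`, `ScopedGenDemo`, `ref`, `freshName` and `let` are names made up for this example. The point it tries to show is that the scope is an immutable value, and a generator that needs an extended scope is built inside `flatMap` from the value that was actually generated, rather than reading a shared mutable environment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

// Hypothetical stand-in for jqwik's Arbitrary<T>: a value is produced only when
// generate(random) is called, never while the generator is being constructed.
interface Gen<T> {
    T generate(Random random);

    default <U> Gen<U> map(Function<? super T, ? extends U> f) {
        return random -> f.apply(generate(random));
    }

    // The dependent generator is built from the value that was actually generated.
    default <U> Gen<U> flatMap(Function<? super T, Gen<U>> f) {
        return random -> f.apply(generate(random)).generate(random);
    }
}

// Immutable scope: declaring a name returns a new scope and leaves the old one untouched.
final class Scope {
    final List<String> names;

    Scope(List<String> names) {
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    Scope declare(String name) {
        List<String> extended = new ArrayList<>(names);
        extended.add(name);
        return new Scope(extended);
    }
}

final class ScopedGenDemo {
    // A reference can only pick from the names in the scope it was given.
    static Gen<String> ref(Scope scope) {
        return random -> scope.names.get(random.nextInt(scope.names.size()));
    }

    // Derives a fresh variable name from the size of the given scope.
    static Gen<String> freshName(Scope scope) {
        return random -> "v" + scope.names.size();
    }

    // let <name> = <ref in outer scope> in <ref in extended scope>
    static Gen<String> let(Scope outer) {
        return freshName(outer).flatMap(name ->
            ref(outer).flatMap(bound ->
                ref(outer.declare(name)).map(body ->
                    "let " + name + " = " + bound + " in " + body)));
    }

    public static void main(String[] args) {
        Scope global = new Scope(Collections.singletonList("x"));
        System.out.println(let(global).generate(new Random(42)));
    }
}
```

With jqwik itself, the same shape would presumably be written with `Arbitrary.flatMap` or `Combinators.combine(..).flatAs(..)`, passing the scope as a plain argument to each arbitrary-producing method.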

Author

DavidGregory084 commented Feb 17, 2025

> @DavidGregory084 Thanks for using jqwik! Could you provide the simplest example you can think of that would show your intent? Or a test that fails reliably so that I can check what's going on?

I will see what I can do - like you say it might be difficult to minimise as it relies so much on the peculiarities of generating a recursive data structure and the specific evaluation order of my arbitraries.

The one that I have just encountered was (I think) triggered by the fact that refNode is not wrapped in lazy, so I think that some of the other arbitraries (e.g. refNodeWithType) are probably also affected.

> In general, you have to push all state into dependent generators via flatMap or combine(..).flatAs(..); otherwise you just don't know which version of your state is available during generation. It looks like this rule is violated by GenEnvironment env being handed around very liberally. But my dry analysis might be wrong.

I think this is the insight I was looking for, thanks! Perhaps I can rework the generation process so that the GenEnvironment is combined into each generator rather than being passed directly to an arbitrary method.

One idea that occurred to me is that there could be a more Java-like design for StateT, a bit like Builders, that could be used to manipulate some scoped state during generation, e.g.

State
  // State.StateCombinator<S, S> init(Supplier<? extends Arbitrary<S>> initialState)
  .init(initialState)
  // State.StateCombinator<T, A> use(Function<? super S, Tuple.Tuple2<? extends T, ? extends Arbitrary<A>>> useState)
  .use(initState -> {
    // create some new arbitrary using `initState` and return a `newState` and the arbitrary
    return Tuple.of(newState, newArb);
  })
  // State.StateCombinator<U, B> recursive(
  //      Supplier<? extends Arbitrary<B>> base,
  //      BiFunction<? super T, ? super Arbitrary<B>, Tuple.Tuple2<? extends U, ? extends Arbitrary<B>>> recurState,
  //      int depth)
  .recursive(base, (initState, arb) -> {
    // create some new arbitrary using `initState` and `arb` and return a `newState` and the new arbitrary
    return Tuple.of(newState, newArb);
  }, depth)
  .build(); // Arbitrary<B>

However that seems like a lot of work for quite a niche usage, and I suspect I can probably achieve something like this with the primitives already in jqwik, so please don't let me nerd-snipe you!

Collaborator

jlink commented Feb 21, 2025

The State builder idea is interesting. The difference to the normal builder would be the recursive nature. In theory that should just take a couple (maybe a few dozen) lines of code. Somehow my gut warns me that the recursion will introduce some problems that I cannot think of right now.

@DavidGregory084 Maybe you want to give it a shot in a PR?
