Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully simplifying assertions with base sometimes harmful for precision of relational analyses #1650

Open
michael-schwarz opened this issue Dec 20, 2024 · 2 comments
Labels
precision relational Relational analyses (Apron, affeq, lin2var)

Comments

@michael-schwarz
Copy link
Member

michael-schwarz commented Dec 20, 2024

Currently, when translating CIL expression -> Apron expression for guards, we fully simplify all appearing expressions, which is usually beneficial.

Thus, if base knows something will hold it gets simplified to 1. However, this can have adverse effects when a relational domain is not as precise as the base domain, as the information from the guard is not added to the relational domain, even if is expressible.
After a control-flow join, base may no longer be able to show the invariant even if it could in both branches. If the information was added to the relational domain, it would survive.

Consider, e.g., the following example with intervals and lin2vareq:

  • In both branches, a and b are constant and b == a+2 holds which is checked by __goblint_check. However, as the constant value for a and b is established by a sequence of inequalities, this information is not available in the equality-based lin2vareq. Then, a guard asserting b == a+2 is executed. Normally, this would cause 2vareq to pick up on the equality, but we simplify it to 1 before passing it to the domain. Thus the value in 2vareq remains T.

  • After join, base can no longer establish b == a+2. As it was not added to 2vareq either, it cannot be shown.

//SKIP PARAM:  --enable ana.int.interval  --set sem.int.signed_overflow assume_none  --set ana.activated[+] lin2vareq

int main() {
  int a;
  int b;

  if(a <= 7 && a >= 7 && b <= 9 && b >= 9) {
    // a = 7, b = 9
    // But the lin2vareq does not know about it as it is established via guards
    // Holds trivially => is simplified to Pos(1) before being passed to linvareq
    if(b == a+2) {
      __goblint_check(b == a+2); // Assertion passes because base knows about it
    } else {
      return 0;
    }
  }
  else if(a <= 5 && a >= 5 && b <= 7 && b >= 7) {
    // a = 5, b = 7
    if(b == a+2) {
      __goblint_check(b == a+2);
    } else {
      return 0;
    }
  }
  else {
    return 0;
  }

  // b == a+2 no longer can be shown by base, but was not added to lin2vareq
  // Assertion fails despite b == a+2 guard on all paths that lead here and being expressioble by lin2vareq
  __goblint_check(b == a+2);


  return 0;
}

There seem to be various things that could be done here:

  • Ensure that the information from base is propagated to relational analyses when it is expressible.
  • Ensure that guards do not get completely simplified if basencan show them, but instead get simplified into (all?) implied guards of the form that the domain can handle?

@DrMichaelPetter, @jerhard and I jointly came across this while talking about #1610.

@michael-schwarz michael-schwarz added precision relational Relational analyses (Apron, affeq, lin2var) labels Dec 20, 2024
@sim642
Copy link
Member

sim642 commented Dec 30, 2024

// Holds trivially => is simplified to Pos(1) before being passed to linvareq

I don't think that's exactly what's happening. In parsing Cil.constFold does get applied, but that doesn't apply here. And base evaluating an expression doesn't change the expression for relational analyses. We don't even have a query for simplifying expressions.

I think what might be happening is that relational analyses themselves do some EvalInt queries (with integer domain results, not simplified expressions) to sometimes simplify things. So it should be entirely up to the relational analyses: they should still have the original expression as well if they want to assert both.


Can lin2vareq express constant propagation (e.g. x = 42)? I think all Apron domains can. If not, then it might not be too difficult because it should be similar to DBMs, where one special variable can stand in for zero (i.e. x = zero + 42).
Although it's probably not too useful and doesn't really solve this problem, but might be interesting nevertheless.

@DrMichaelPetter
Copy link
Collaborator

// Holds trivially => is simplified to Pos(1) before being passed to linvareq

I don't think that's exactly what's happening. In parsing Cil.constFold does get applied, but that doesn't apply here. And base evaluating an expression doesn't change the expression for relational analyses. We don't even have a query for simplifying expressions.

I think what might be happening is that relational analyses themselves do some EvalInt queries (with integer domain results, not simplified expressions) to sometimes simplify things. So it should be entirely up to the relational analyses: they should still have the original expression as well if they want to assert both.

The apron analyses share the functionality of converting cil expressions to apron expressions via the sharedFunctions module, currently. During this processing of expressions, this simplification process is implicitly performed, without the actual domain/analyses even noticing that this has happened.


Can lin2vareq express constant propagation (e.g. x = 42)? I think all Apron domains can. If not, then it might not be too difficult because it should be similar to DBMs, where one special variable can stand in for zero (i.e. x = zero + 42).
Although it's probably not too useful and doesn't really solve this problem, but might be interesting nevertheless.

All apron domains can indeed represent constant propagation. The precision tweak that was behind this whole simplification thing was a little more complicated. On the one hand, there are the weakly relational domains that can profit enormously from these kinds of simplifications, to have a chance of interpreting certain guards at all. On the other hand, even though the equality based domains like affeq or lin2vareq can represent constants, they in practice do miss constants that are inferred by means of inequality guards. These two aspects have led us to pull the evalint simplification into the expression conversion part.

The purpose of this issue here is to document and create awareness of the downside of this implementation choice.

What I have wondered is, whether as soon as algebraic effects are available, we could let evalint communicate back via effect what the outcome of the greatest lower bound of the query is, in order to refine the other domains....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
precision relational Relational analyses (Apron, affeq, lin2var)
Projects
None yet
Development

No branches or pull requests

3 participants