Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BASIL-1 - Added unit tests for SMT operations #57

Merged
merged 5 commits into from
Sep 12, 2023
Merged

BASIL-1 - Added unit tests for SMT operations #57

merged 5 commits into from
Sep 12, 2023

Conversation

ziggyfish
Copy link
Contributor

No description provided.

@l-kent
Copy link
Contributor

l-kent commented Sep 11, 2023

Why have you replaced all the type matching with much messier casting?

@ziggyfish
Copy link
Contributor Author

Why have you replaced all the type matching with much messier casting?

I replaced them for code readability, and type checking was being done multiple times, which is very inefficient. So this was repaced for the following reasons:

  • smt_bvuge(s: Literal, t: Literal) was being checked at smt_bvuge and at smt_bvule.
  • there is also quite a significant performance hit on match case vs asInstanceOf.
  • The type checking should really be handled at the level above this code, and these functions accept only BitVecLiterals (but thats an other issue for another day)
  • better debugging information as there is a distinction between arg1 not being the correct type and arg2
  • Remove redundant code, "Exception("cannot apply bitvector operator to non-bitvectors")" apears 27 times, which is bad practice.
  • Easier to read, and identify missing require statements.
  • Mark seemed to like that approach

@l-kent
Copy link
Contributor

l-kent commented Sep 11, 2023

I really disagree that it's more readable to cast the variables every time they are used. For readability I personally would at least create new variables for the casted variables so you don't have to put .asInstanceOf everywhere.

It would certainly be good to have the type checking handled at a level above this code, but that seemed to require significant overhaul of the rather convoluted Lattice code inherited from TIP when I looked into fixing it. The current situation really is far from ideal.

Are the performance issues meaningful right now? If they were such an issue, maybe it's worth trying to fix the Lattice code so we don't need any casting or type-checking at all?

@ziggyfish
Copy link
Contributor Author

I really disagree that it's more readable to cast the variables every time they are used. For readability I personally would at least create new variables for the casted variables so you don't have to put .asInstanceOf everywhere.

Yeah, I can do that. I'll update the code soon.

In terms of the Lattice code. I say, one step at a time. We can review that code in another ticket.

@ailrst
Copy link
Contributor

ailrst commented Sep 11, 2023

there is also quite a significant performance hit on match case vs asInstanceOf.

Do you have a source for this? I'd need a lot of convincing to believe this is (1) true and (2) significant in this context.

The pipeline will always be bottle-necked by boogie, by orders of magnitude. Performance of the Scala code has never really been an issue, so fine-grained optimizations like this aren't really justified if it comes at the expense of readability.

@l-kent
Copy link
Contributor

l-kent commented Sep 11, 2023

It's plausible that a performance issue here could potentially be an issue in the future, given that the bitvector evaluation code will be heavily used in some of the analyses and so small improvements here could have meaningful benefits when applied to larger programs, but I also don't think it's worth worrying about too much in the short term.

@ailrst
Copy link
Contributor

ailrst commented Sep 11, 2023

If that was the case I'd still want profile data to show that its in the hot path and is a significant issue, but yeah true.

@ailrst
Copy link
Contributor

ailrst commented Sep 11, 2023

profile.zip

If you're interested... Its just a profile of the JVM so there is a lot of irrelevant stuff, but yeah the analysis is the most expensive part.

java -agentpath:/path/to/libasyncProfiler.so=start,file=profile.html -jar target/scala-3.1.0/wptool-boogie-assembly-0.0.1.jar examples/jumptable/jumptable.adt examples/jumptable/jumptable.relf -analyse

image

@l-kent
Copy link
Contributor

l-kent commented Sep 12, 2023

I'm still curious to see data on the performance differences with the pattern matching vs. AsInstanceOf, but it makes sense that there would be one. I don't have an issue with the readability now.

The following tests are all testing invalid inputs and if anything we should be throwing exceptions for these cases:

  • Natural to BitVector - should return 0 when natural number does not fit within specified bit size
  • BitVector Add - should subtract two BitVectors when the second value is negative
  • BitVector Negate - should return x when negating negative numbers
  • SMT - ConCat - should combine two numbers into a single number, with last number being larger than the indicated size
  • BitVector Extract - should extract bits from a given number when range is outside of bit given bit range

BitVecLiterals must have a value in the range [0, 2^size]. I realise plenty of things do not check this as-is (we really should enforce this, possibly with a new constructor), but testing for these invalid behaviours is not useful. smt_extract is not defined where i > size or j > size, but we should check this and currently do not.

The test 'BitVector Multiply - should return wrap BitVectors when sum is greater size number of bits' isn't doing what it says as 255*1 = 255 so there is no overflow to handle.

I would like to keep around some of the old tests that were removed too - the 128-bit bvadd and the extract tests. We do want to have at least some tests for larger sizes of bitvectors - 128 bits is approximately the largest size we are likely to ever need to deal with), especially as BigInt's underlying implementation changes for larger values, which has meant that some bugs in our implementation have not shown up at certain sizes in the past. Extract is probably the trickiest implementation and I would like to keep all those tests around given past bugs.

@ailrst
Copy link
Contributor

ailrst commented Sep 12, 2023

Is it worth checking functionality against Z3's implementation of the theories? E.g. encoding something like echo "(simplify (bvadd #x07 #x03))" | z3 -smt2 -in -t:1 and matching the expected result, in Scala, shouldn't be too hard?

@ziggyfish
Copy link
Contributor Author

I'm still curious to see data on the performance differences with the pattern matching vs. AsInstanceOf, but it makes sense that there would be one. I don't have an issue with the readability now.

When I have a bit of time, I'll prove this.

The following tests are all testing invalid inputs and if anything we should be throwing exceptions for these cases:

  • Natural to BitVector - should return 0 when natural number does not fit within specified bit size
  • BitVector Add - should subtract two BitVectors when the second value is negative
  • BitVector Negate - should return x when negating negative numbers
  • SMT - ConCat - should combine two numbers into a single number, with last number being larger than the indicated size
  • BitVector Extract - should extract bits from a given number when range is outside of bit given bit range

Removed.

The test 'BitVector Multiply - should return wrap BitVectors when sum is greater size number of bits' isn't doing what it says as 255*1 = 255 so there is no overflow to handle.

Fixed.

I would like to keep around some of the old tests that were removed too - the 128-bit bvadd and the extract tests. We do want to have at least some tests for larger sizes of bitvectors - 128 bits is approximately the largest size we are likely to ever need to deal with), especially as BigInt's underlying implementation changes for larger values, which has meant that some bugs in our implementation have not shown up at certain sizes in the past. Extract is probably the trickiest implementation and I would like to keep all those tests around given past bugs.

Yeah, it was unclear what the old tests were actually testing, I'm happy if you want to add them back in, however the tests will need to be renamed to be more descriptive.

I had to modify the implementation of smt_sign_extend because it was implemented incorrectly and producing bad results.

@ziggyfish
Copy link
Contributor Author

Is it worth checking functionality against Z3's implementation of the theories? E.g. encoding something like echo "(simplify (bvadd #x07 #x03))" | z3 -smt2 -in -t:1 and matching the expected result, in Scala, shouldn't be too hard?

Excuse my ignorance, where is the source code for Z3 located?

@ailrst
Copy link
Contributor

ailrst commented Sep 12, 2023

@l-kent
Copy link
Contributor

l-kent commented Sep 12, 2023

Yes, it would be good to make sure we've tested our tests against Z3, although that leaves open the possibility of a bug in Z3's implementation of the spec.

The old extract tests were just from playing around with some cases that had a bug in the past.

Thanks for fixing the sign_extend issue. Looking at the old version, I think I forgot what I was doing partway through writing it.

@ziggyfish
Copy link
Contributor Author

Yes, it would be good to make sure we've tested our tests against Z3, although that leaves open the possibility of a bug in Z3's implementation of the spec.

I agree, lets see if we can create a script in Z3 and Boogie to produce a result which tests its implementation, then create the same test in Scala, and check if we get the same result (i.e testing multiple operations at the same time). Let me think about this over the coming days.

The old extract tests were just from playing around with some cases that had a bug in the past.

No worries, not sure I can add those tests in, unless we know exactly the bug we were testing for.

Thanks for fixing the sign_extend issue. Looking at the old version, I think I forgot what I was doing partway through writing it.

No worries. it happens to the best of us.

@l-kent l-kent merged commit 7b9472f into main Sep 12, 2023
1 check passed
@l-kent l-kent deleted the BASIL-1 branch November 6, 2023 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants