Performance tests #121

ailrst · 2023-10-24T07:19:25Z

Change summary:

Add timing to SystemTests
Differentiate timeout from verification failure
Have SystemTests write out two CSV files summarising the test results and times
Pass CLI configuration through the program in runutils and the boogie translation
- Add CLI flag --boogie-use-lambda-stores to generate the lambda definition for stores wider than 8 bits, rather than the nested MemOp definition. This gives reasonable performance without ArrayAxioms.
Add inline flag to BFunction
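
To illustrate the difference between the two store encodings mentioned in the list above, here is a small Scala model. It is not the Boogie that the translator emits: the names `StoreEncodings`, `storeNested` and `storeLambda` are hypothetical, memory is modelled as a plain function from address to byte, only little-endian stores are covered, and address wrap-around is ignored.

```scala
object StoreEncodings {
  // Hypothetical model of memory: address -> byte value (0..255).
  type Memory = BigInt => Int

  // Byte number `k` of `value`, counting from the least significant byte.
  def byteOf(value: BigInt, k: Int): Int = ((value >> (8 * k)) & 0xff).toInt

  // Nested encoding: one single-byte update per byte,
  // each update wrapping the memory produced by the previous one.
  def storeNested(mem: Memory, addr: BigInt, value: BigInt, sizeBytes: Int): Memory =
    (0 until sizeBytes).foldLeft(mem) { (m, k) =>
      (i: BigInt) => if (i == addr + k) byteOf(value, k) else m(i)
    }

  // Lambda encoding: a single function definition with one range check
  // that selects the right byte of `value` for addresses inside the store.
  def storeLambda(mem: Memory, addr: BigInt, value: BigInt, sizeBytes: Int): Memory =
    (i: BigInt) =>
      if (i >= addr && i < addr + sizeBytes) byteOf(value, (i - addr).toInt)
      else mem(i)
}
```

Both functions describe the same resulting memory; the difference that matters for the verifier is only the shape of the term it has to reason about.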

Breaking changes:

A Boogie timeout is now always a test failure
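
A minimal sketch of the pass/fail rule this implies, assuming each test records whether Boogie verified the program, whether it timed out, and whether the test is expected to verify. The type and function names here (`Outcome`, `OutcomeRules`, `classify`, `passed`) are hypothetical and not taken from SystemTests.

```scala
// Hypothetical three-way outcome for a single Boogie run.
sealed trait Outcome
case object Verified extends Outcome
case object Counterexample extends Outcome
case object TimedOut extends Outcome

object OutcomeRules {
  // A timeout takes priority: a run that timed out is never counted
  // as a verification failure (counterexample).
  def classify(verified: Boolean, timedOut: Boolean): Outcome =
    if (timedOut) TimedOut
    else if (verified) Verified
    else Counterexample

  // A timeout always fails the test, regardless of the expectation;
  // otherwise the outcome must match whether the test should verify.
  def passed(outcome: Outcome, shouldVerify: Boolean): Boolean = outcome match {
    case TimedOut       => false
    case Verified       => shouldVerify
    case Counterexample => !shouldVerify
  }
}
```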

With the defaults (MemOps and ArrayAxioms):

[info] - summary
[info]   + Test summary: 305 verified, 81 failed to verify including 0 timeouts. 
[info]   + Average time to verify: 599 
[info]   + Average time to counterexample: 701

With lambdas and array theories:

[info] - summary
[info]   + Test summary: 305 verified, 81 failed to verify including 2 timeouts. 
[info]   + Average time to verify: 665 
[info]   + Average time to counterexample: 1851
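
For reference, per-test times like those averaged above could be collected with a helper along these lines. This is a sketch only: `Timing.timed` is a hypothetical name, and reporting in milliseconds is an assumption, since the summary output does not state its units.

```scala
object Timing {
  // Runs `body` once and returns its result together with the elapsed
  // wall-clock time in milliseconds (unit chosen for illustration).
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }
}
```

A test would wrap the translation step and the Boogie invocation separately, for example `val (output, translateTime) = Timing.timed { runTranslation() }`, where `runTranslation` stands in for whatever the test actually calls.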

l-kent · 2023-10-30T00:13:14Z

What's the point of storing the expected IL output too? It isn't used for anything here and seems completely redundant since it is just going to be the same information as the final output.

I also think that something like that --boogie-use-lambda-stores flag should probably be in its own pull request so we can really assess if it's a better approach, but I guess you've added it as a flag so you can experiment with it for now.

ailrst · 2023-10-30T00:16:50Z

> I also think that something like that --boogie-use-lambda-stores flag should probably be in its own pull request so we can really assess if it's a better approach, but I guess you've added it as a flag so you can experiment with it for now.

It's necessary for some examples to verify (though none of the test cases we currently have) and just gives worse performance for others, so I think we want to keep both around at this stage. But yes, it could have been its own PR.

l-kent · 2023-10-30T03:38:16Z

Can you explain about storing the IL output?

ailrst · 2023-10-30T03:42:30Z

> Can you explain about storing the IL output?

Yeah, that's not particularly useful at this point. My thinking was that once we have more in-place transformation of the IR, we'd have a history in case of regressions.

l-kent · 2023-10-30T03:52:15Z

I don't see any point in including it right now; it's just unnecessary bloat, and any regressions will be visible in the Boogie output.

l-kent · 2023-10-30T04:02:02Z

src/test/scala/SystemTests.scala

+    val csvHeader = "passed,verified,shouldVerify,hasExpected,timedOut,matchesExpected,translateTime,verifyTime"
+  }
+
+  val testResults: mutable.Map[String, TestResult] = mutable.HashMap()


Is there a reason for using a Map here, given that it means the output isn't ordered, which is much less convenient? It doesn't seem like we care about the constant lookup time?

Uses an ArrayBuffer now
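
A sketch of what collecting results in an ArrayBuffer might look like, so that CSV rows come out in the order the tests ran. The field names follow `csvHeader` from the diff above; the field types, the `toCsv` and `render` helpers, and the leading `testCase` column are assumptions made for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Field names mirror csvHeader; the Boolean/Long types are assumed.
final case class TestResult(
  passed: Boolean,
  verified: Boolean,
  shouldVerify: Boolean,
  hasExpected: Boolean,
  timedOut: Boolean,
  matchesExpected: Boolean,
  translateTime: Long,
  verifyTime: Long
) {
  // Fields are emitted in declaration order, matching csvHeader.
  def toCsv: String = productIterator.mkString(",")
}

object ResultsCsv {
  val csvHeader = "passed,verified,shouldVerify,hasExpected,timedOut,matchesExpected,translateTime,verifyTime"

  // An ArrayBuffer keeps insertion order, so the rows appear in the
  // order the tests were appended, unlike iteration over a HashMap.
  def render(results: ArrayBuffer[(String, TestResult)]): String =
    (Seq("testCase," + csvHeader) ++ results.map { case (name, r) => s"$name,${r.toCsv}" })
      .mkString("\n")
}
```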

l-kent · 2023-10-30T04:03:39Z

src/test/scala/SystemTests.scala

-      Main.main(Array("--adt", ADTPath, "--relf", RELFPath, "--spec", specPath, "--output", outPath))
+      Main.main(Array("--adt", ADTPath, "--relf", RELFPath, "--spec", specPath, "--output", outPath, "--dump-il", ilPath))
    } else {
-      Main.main(Array("--adt", ADTPath, "--relf", RELFPath, "--output", outPath))
+      Main.main(Array("--adt", ADTPath, "--relf", RELFPath, "--output", outPath, "--dump-il", ilPath))


Not really useful right now, as mentioned

l-kent · 2023-10-30T04:03:49Z

build.sbt

+            if (ILOutPath.exists() && !(ILExpectedPath.exists() && filesContentEqual(ILExpectedPath, ILOutPath))) {
+              IO.copyFile(ILOutPath, ILExpectedPath)
+            }


Also unnecessary right now


ailrst · 2023-11-01T06:23:28Z

src/test/scala/SystemTests.scala

    if (verifying.nonEmpty)
      info(s"Average time to verify: ${verifying.sum / verifying.size}")
    if (counterExamples.nonEmpty)
      info(s"Average time to counterexample: ${counterExamples.sum/ counterExamples.size}")

    val summaryHeader = "verifiedCount,counterexampleCount,timeoutCount,verifyTotalTime,counterexampleTotalTime"
-    val summaryRow = s"$numVerified,${counterExamples.size},$numTimeout,${verifying.sum},${counterExamples.sum}"
+    val summaryRow = s"$numSuccess,${counterExamples.size},$numTimeout,${verifying.sum},${counterExamples.sum}"


I wanted verifyTotalTime/verifyCount = mean verify time. ScalaTest already tells you how many tests passed.

I'll just make the summary include both things.
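
The intent can be sketched as follows, assuming the first column counts the tests that verified, so that dividing the total by the count recovers the mean. The `Summary` object and `meanVerifyTime` are hypothetical names; the header string is the one from the diff.

```scala
object Summary {
  val summaryHeader =
    "verifiedCount,counterexampleCount,timeoutCount,verifyTotalTime,counterexampleTotalTime"

  // Builds one summary row from the per-test times of each category.
  def summaryRow(verifying: Seq[Long], counterExamples: Seq[Long], numTimeout: Int): String =
    s"${verifying.size},${counterExamples.size},$numTimeout,${verifying.sum},${counterExamples.sum}"

  // Recovers the mean verify time from a row: verifyTotalTime / verifiedCount.
  // Returns None when no test verified, to avoid dividing by zero.
  def meanVerifyTime(row: String): Option[Long] = {
    val cols = row.split(",").map(_.trim.toLong)
    if (cols(0) == 0L) None else Some(cols(3) / cols(0))
  }
}
```

If the first column held the number of passing tests instead, the same division would no longer give the mean, because tests that pass by producing an expected counterexample would be counted in the denominator.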

ailrst and others added 15 commits October 23, 2023 10:47

add config case class

4a11293

add IL to expected

805e596

fix add IL updateExpected

4fc09f7

update gitignore

7477bbd

add IL expected files

81fa6f0

add csv test summary and timing

c953a7e

add test summary

b2e6406

and condition on store generator

37c0661

Change memory updates > 8 bits to use lambdas

d1b558c

add lambda mem flag

13a5186

distinguish timeout from failure, add summary test

a1b4d37

update gitignore

277f167

revert defaults

ff0049a

add bigendian support for lambda stores

55b1911

always consider timeout a failure

1a358bc

ailrst mentioned this pull request Oct 25, 2023

Performance regression testing #120

Open

ailrst marked this pull request as ready for review October 27, 2023 07:32

ailrst requested a review from l-kent October 27, 2023 07:32

l-kent requested changes Oct 30, 2023


ailrst and others added 5 commits November 1, 2023 13:32

use list for results

7370c6a

do not output IL.expected

71624fb

remove il.expected files

51d94f7

Merge branch 'main' into performance-tests

23a7b80

# Conflicts: # src/main/scala/boogie/BProgram.scala # src/main/scala/translating/IRToBoogie.scala

make summary give more useful information

cbf547d

l-kent approved these changes Nov 1, 2023


ailrst commented Nov 1, 2023


add num verify/failed to summary

e9b5b0a

ailrst merged commit 5f02a10 into main Nov 1, 2023
1 check passed

ailrst deleted the performance-tests branch November 6, 2023 23:52
