Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests Cleanup #245

Merged
merged 34 commits into from
Sep 18, 2024
Merged

Tests Cleanup #245

merged 34 commits into from
Sep 18, 2024

Conversation

l-kent
Copy link
Contributor

@l-kent l-kent commented Sep 16, 2024

  • Removes many redundant tests, including all the no_plt_no_pic variants, which reduces the execution time by nearly half
  • Removes most of examples folder, which was largely very out of date, redundant, and not a good target for tests
  • Separated tests containing unresolved indirect calls into a new IndirectCallTests suite that replaces the existing IndirectCallsTests, containing many more tests and more detailed checks
  • Consistently uses BigInts for addresses since they can be longer than 32 bits (it may make sense to make them Longs instead)
  • Removed deprecated lifting scripts and replaces them with a single script, scripts/lift.sh which can be used to lift single examples
  • Removes deprecated MemoryRegionAnalysisMiscTest
  • Takes steps towards unifying test suite functionality with BASILTest (replacing TestUtil) now the base for many other tests
  • Adds new test suites AnalysisSystemTestsBAP and AnalysisSystemTestsGTIRB which just run all the tests in /correct and /incorrect with the static analysis enabled - this is very useful for finding bugs
  • Minor cleanup elsewhere in the tests

I'm still not sure what to do with all the histogram stuff in SystemTests - it doesn't really feel like it all belongs there, but I'm not sure where to put it.

I probably should move the cases where GTIRB successfully pre-resolves indirect calls back into the /correct folder, because they're at least tests of the lifter?

I'm not sure what to do with correct/syscall/clang_O2 - maybe it should be in IndirectCallTests? It regularly breaks things in interesting ways though and the indirect call isn't really the point?

l-kent added 24 commits September 5, 2024 13:12
# Conflicts:
#	src/main/scala/ir/Program.scala
#	src/main/scala/translating/GTIRBToIR.scala
#	src/main/scala/translating/ReadELFLoader.scala
#	src/main/scala/util/RunUtils.scala
#	src/test/scala/IndirectCallsTests.scala
#	src/test/scala/IrreducibleLoop.scala
#	src/test/scala/LiveVarsAnalysisTests.scala
#	src/test/scala/MemoryRegionAnalysisMiscTest.scala
#	src/test/scala/PointsToTest.scala
#	src/test/scala/TaintAnalysisTests.scala
#	src/test/scala/ir/IRTest.scala
@l-kent l-kent requested a review from ailrst September 16, 2024 02:49
Copy link
Contributor

@ailrst ailrst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SystemTests/arrays_simple and syscall/clang_O2 are still failing in system-tests, I suggest we add a suite for TDD style tests, functionally the same as system tests but just containing the borderline test cases.
Otherwise I have no strong objections.

@@ -36,48 +26,52 @@ trait SystemTests extends AnyFunSuite {

val testResults: mutable.ArrayBuffer[(String, TestResult)] = mutable.ArrayBuffer()

def runTests(programs: Array[String], path: String, name: String, conf: TestConfig): Unit = {
private val testPath = "./src/test/"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set Logger.setLevel(LogLevel.WARN), printing the globals, proc addresses and then the same list of translation stage log messages for every test really isn't useful and makes it harder to see the failing/passing tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just replaced most of the Logger.info calls with Logger.debug, only keeping ones to print the very basic test progress.

@@ -45,7 +42,7 @@ jobs:
uses: actions/checkout@v4

- name: System Tests
run: ./mill test.testOnly '*SystemTests*' || true
run: ./mill test.testOnly 'SystemTests*' || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we expect SystemTests to pass we should remove the || true, its possibly also worth adding the AnalysisSystemTests as a separate stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't expect SystemTests to pass yet - arrays_simple doesn't pass yet for reasons unrelated to indirect calls and that has long been the case.

AnalysisSystemTests is not really worth putting in the GitHub action - it's useful to check for crashes but rather redundant in total. Many of the other test suites may be worth including though.

@l-kent
Copy link
Contributor Author

l-kent commented Sep 16, 2024

Which are the borderline cases you mean? arrays_simple/gcc because it still fails? Getting it to pass requires a more sophisticated specification system.

@ailrst
Copy link
Contributor

ailrst commented Sep 16, 2024

Which are the borderline cases you mean? arrays_simple/gcc because it still fails? Getting it to pass requires a more sophisticated specification system.

Yeah, if it requires new features to pass we should put it in a separate test suite to test new features against.

@l-kent
Copy link
Contributor Author

l-kent commented Sep 17, 2024

Other tests that don't always pass right now are some of the malloc_memcpy_strlen_memset_free tests which tend to cause Z3 to time out (though I think this is slightly inconsistent). Should they go in their own suite, or in the same one as arrays_simple?

@ailrst
Copy link
Contributor

ailrst commented Sep 17, 2024

I think we could relax the time limit (by 5 - 10 s?) so they pass reliably, since we don't have so many tests timing out now. They should consistently pass in a reasonable amount of time, clang is 8s on my laptop and gcc < 1s.

@l-kent
Copy link
Contributor Author

l-kent commented Sep 17, 2024

Relaxing the time limit doesn't help for incorrect/malloc_memcpy_strlen_memset_free/gcc_O2:BAP but it does for correct/malloc_memcpy_strlen_memset_free/clang_O2:BAP.

@l-kent l-kent closed this Sep 17, 2024
@l-kent l-kent reopened this Sep 17, 2024
@l-kent
Copy link
Contributor Author

l-kent commented Sep 17, 2024

correct/malloc_memcpy_strlen_memset_free/clang_O2:BAP still times out for the github action too. Maybe we should just put those tests in their own suite?

@ailrst
Copy link
Contributor

ailrst commented Sep 17, 2024

Yeah that's probably best

@l-kent
Copy link
Contributor Author

l-kent commented Sep 18, 2024

I think this is good now. I've moved the malloc_memcpy_strlen_memset_free tests to their own ExtraSpecTests suite

@l-kent l-kent merged commit 3030d6b into main Sep 18, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants