# Notes

To test that BenchExec is installed correctly and can run tools:

```sh
python3 -m benchexec.test_tool_info dummy --read-only-dir / --overlay-dir /home
```

To test the Apalache tool info (from within the same directory as the `apalache.py` module):

```sh
python3 -m benchexec.test_tool_info .apalache --read-only-dir / --overlay-dir /home
```

To run the scratch configuration (from within the `./scratch` dir):

```sh
benchexec scratch/config.xml --read-only-dir / --overlay-dir /home
```

## Output

Results end up in the `results` directory inside the current working directory.
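BenchExec's `table-generator` tool can summarize the raw result files as HTML/CSV tables. A sketch (the glob is illustrative; actual file names include the benchmark name and a timestamp):

```sh
# Generate HTML and CSV summary tables from the raw BenchExec result files.
# The path pattern is an assumption; adjust it to the actual results directory.
table-generator results/*.results*.xml.bz2
```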

## Plan

- Get tool info working
- Make a PR with the tool info
- Generate task definitions from sbt, with a small EDSL as an sbt plugin
- Review Igor's PR on benchmark organization
- Use sbt for running the benchmarks
- Make the sbt config into a plugin
- Fetch the Apalache executable
- Add a way to specify which Apalache version to use
- Publish the BenchExec reports on GitHub Pages
  - Set up sbt-site
  - Construct an index for all benchmarks
- Transfer over the existing benchmarks
- Fix any missing results in the output data
- Compare results across different versions
- Enable running individual suites
- Enable running individual benchmarks within a suite?
- Open-source the BenchExec sbt rules
- Set up incremental builds for benchmarks? (With an option to override?)
- Document how to run and add benchmarks
- Support adding benchmarks for files in remote repos

## Committing results

If the target directory for the output files (specified with `--outputpath`) is a git repository without uncommitted changes and the option `--commit` is specified, BenchExec will add and commit all created files to the git repository. One can use this to create a reliable archive of experimental results.
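For example, combining this with the scratch run above (the `results/` path is hypothetical; it must be a git checkout with a clean working tree):

```sh
# Write results into a clean git repository and let benchexec commit them.
# The output path is illustrative, not part of the repo's current setup.
benchexec scratch/config.xml --read-only-dir / --overlay-dir /home \
  --outputpath results/ --commit
```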

## TODO ADR

- Why we picked BenchExec
  - Alternatives considered:
    - bench press (SMT only, not as mature)
    - python thing
    - a bespoke solution
  - Robust
  - Actively maintained
  - Widely known
  - Publication-quality results
  - Containment (isolated, reproducible runs)
- Why we picked sbt
  - Alternatives considered:
    - Dhall: another novelty in our stack, not known by most
    - XML: can't generate experiments programmatically, verbose, not statically checked
  - We need a build system
  - We already use it, and the team must know Scala