Submission for smtinterpol #45

jhoenicke · 2024-05-15T20:05:01Z

The smtinterpol submission.

bobot · 2024-05-21T13:50:12Z

Hi @jhoenicke , could you rebase your submission? For simplicity instead of being in the tests, the submission summary is now presented as a specific comment.

jhoenicke · 2024-05-22T08:39:44Z

@bobot, I'm not sure about the Docker requirement. Is the solver meant to be run with docker run. Because this adds something like 4 seconds overhead just for running solver -version.

Since we have a java application as a portable jar archive, is it okay if our tar.gz submission contains only the jar file and we just assume someone installed java 8 or higher on the path, or do we need to include java in the archive, which is about 300 MB and also architecture specific?

Our docker file is currently just using the public java docker eclipse-tumerin:8 and then copying the jar file and the start script to some directory. Is this what you had in mind?

mpreiner · 2024-05-23T23:14:33Z

@jhoenicke As far as I understand the Dockerfile is only for building the solver binaries.

bobot · 2024-05-24T06:04:38Z

The Dockerfile mention were an idea that we forgot to remove in the last iteration of the rules. I'm really sorry . The binaries must already be built. The computers are Ubuntus, we are going to provide the exact name of the image today. I will make an announcement later today.

mbromber · 2024-05-27T05:27:53Z

JFYI: Your solver link does not seem to work. I get the following error:

Forbidden

You don't have permission to access this resource.

Maybe it is only accessible from your university's intranet?

jhoenicke · 2024-05-27T05:56:43Z

Fixed. I always have to set permission manually and forgot to set it.

mbromber · 2024-05-27T06:56:21Z

Thanks. It works now.

Draft for the smtinterpol submission.

github-actions · 2024-05-31T09:07:13Z

Summary of modified submissions

SMTInterpol

14 authors
website: https://ultimate.informatik.uni-freiburg.de/smtinterpol
Participations
- UnsatCore
  - Arith
    - all
  - Equality
    - all
  - Equality+LinearArith
    - all
  - Equality+NonLinearArith
    - all
  - QF_Datatypes
    - all
  - QF_Equality
    - all
  - QF_Equality+LinearArith
    - all
  - QF_Equality+NonLinearArith
    - all
  - QF_LinearIntArith
    - all
  - QF_LinearRealArith
    - all
  - QF_NonLinearIntArith
    - all
  - QF_NonLinearRealArith
    - all
- SingleQuery
  - Arith
    - all
  - Bitvec
    - all
  - Equality
    - all
  - Equality+LinearArith
    - all
  - Equality+MachineArith
    - ABV
    - AUFBV
    - AUFBVDTLIA
    - AUFBVDTNIA
    - AUFBVDTNIRA
    - UFBV
    - UFBVDT
    - UFBVLIA
  - Equality+NonLinearArith
    - all
  - QF_Bitvec
    - all
  - QF_Datatypes
    - all
  - QF_Equality
    - all
  - QF_Equality+Bitvec
    - all
  - QF_Equality+LinearArith
    - all
  - QF_Equality+NonLinearArith
    - all
  - QF_LinearIntArith
    - all
  - QF_LinearRealArith
    - all
  - QF_NonLinearIntArith
    - all
  - QF_NonLinearRealArith
    - all
- ModelValidation
  - QF_ADT+BitVec
    - all
  - QF_ADT+LinArith
    - all
  - QF_Bitvec
    - all
  - QF_Datatypes
    - all
  - QF_Equality
    - all
  - QF_Equality+Bitvec
    - all
  - QF_Equality+LinearArith
    - all
  - QF_Equality+NonLinearArith
    - all
  - QF_LinearIntArith
    - all
  - QF_LinearRealArith
    - all
  - QF_NonLinearIntArith
    - all
  - QF_NonLinearRealArith
    - all
- Incremental
  - Arith
    - all
  - Bitvec
    - all
  - Equality
    - all
  - Equality+LinearArith
    - all
  - Equality+NonLinearArith
    - all
  - QF_Bitvec
    - all
  - QF_Equality
    - all
  - QF_Equality+Bitvec
    - all
  - QF_Equality+Bitvec+Arith
    - all
  - QF_Equality+LinearArith
    - all
  - QF_Equality+NonLinearArith
    - all
  - QF_LinearIntArith
    - all
  - QF_LinearRealArith
    - all
  - QF_NonLinearIntArith
    - all

- New version uploaded - Added Contact - Added new BV logics to model-validation track

#84: Create cvc5-cloud #74: Draft STP submission #70: draft yicesQS submission #68: Create STP-CNFLS #66: Yices2 SMTCOMP 2024 Submission #65: Z3-alpha draft PR #64: Solver submission: cvc5 #63: submission iProver #61: OSTRICH 1.4 #60: SMT-RAT submission #57: Amaya's submission for SMT-COMP 2024 #55: plat-smt submission #54: Add 2024 Bitwuzla submission. #53: 2024 solver participant submission: OpenSMT #52: Z3-Noodler submission #51: Submission Colibri #45: Submission for smtinterpol #42: Adding Algaroba to SMTCOMP 2024

martinjonas · 2024-06-10T22:09:02Z

@jhoenicke We have executed the latest version of smtinterpol on a randomly chosen subset of 20 single-query benchmarks from each logic where it participates. The benchmarks are also scrambled by the competition scrambler (with seed 1). You can find the results here: https://www.fi.muni.cz/~xjonas/smtcomp/smtinterpol.table.html#/table

Quick explanation:

Green status means that the result agrees with the (set-info :status _) annotation from the benchmark.
Blue status means that the benchmark has annotation (set-info :status unknown).
By clicking on the result (e.g. false, true, ABORTED, …) you can see the command-line arguments with which your solver was called and its output on the benchmark.
By clicking on the benchmark name (i.e., *scrambled*.yml), you can see the details of the benchmark including its contents (by clicking on the file link in input_files) and the name of the original bennchmark before scrambling (e.g., # original_files: 'non-incremental/AUFBVFP/20210301-Alive2-partial-undef/ph7/583_ph7.smt2').

Please check whether there are some discrepancies, such as missing/extra logics, unexpected aborts or unknowns, and similar. If you update the solver, let me know and I can execute further test runs. We still have plenty of time for several follow-up test runs.

martinjonas · 2024-06-12T08:46:09Z

@jhoenicke Here are also results of the test run on a small subset of incremental benchmarks: https://www.fi.muni.cz/~xjonas/smtcomp/smtinterpol_inc.table.html#/table

The status column has the following meaning:

DONE (X correct): the solver terminated successfully and correctly decided X (check-sat) queries before that.
SOMETHING_ELSE (X correct) (e.g., TIMEOUT (10 correct) or ABORTED (10 correct)): The solver did not terminate successfully, but correctly decided X (check-sat) queries before it crashed or was killed.
WRONG: There was at least one answer to (check-sat) that did not agree with the status specified in the benchmark.

Again, if there are discrepancies (missing/extra logics, unexpected results), please let me know.

jhoenicke · 2024-06-12T11:12:11Z

@martinjonas Thanks for the preliminary runs. Do you by any chance already have the error.log files?

martinjonas · 2024-06-12T11:18:08Z

@martinjonas Thanks for the preliminary runs. Do you by any chance already have the error.log files?

I can obtain them if you want. Do you need SingleQuery, Incremental, or both?

martinjonas · 2024-06-12T12:41:00Z

Do you need SingleQuery, Incremental, or both?

Nevermind, I got them both. :-)

Single-query: https://www.fi.muni.cz/~xjonas/smtcomp/processed_results/smtinterpol/smtinterpol.2024-06-12_14-26-36.files.tar.gz
Incremental: https://www.fi.muni.cz/~xjonas/smtcomp/processed_results/smtinterpol_inc/smtinterpol_inc.2024-06-12_14-29-11.files.tar.gz

jhoenicke · 2024-06-12T12:59:47Z

It looks like the zip file in the archive are all empty. I would suspect that most of the BV benchmarks trigger an output to error.log. And all runs should create at least an empty error.log file. Is this working? Could this be a permission problem, that we can't write to the docker image?

martinjonas · 2024-06-13T10:16:33Z

It looks like the zip file in the archive are all empty. I would suspect that most of the BV benchmarks trigger an output to error.log. And all runs should create at least an empty error.log file. Is this working? Could this be a permission problem, that we can't write to the docker image?

Ah, it seems that there is some problem of our configuration of cluster for returning the generated files. The file gets created but is not returned from the computation nodes to the server. I will look into that.

martinjonas · 2024-06-13T10:17:26Z

@jhoenicke We have finished test runs of model-validation and unsat-core generation tracks. You can find the results here:

As before, you can click on the status of the benchmark to see the output of your solver. If you find any discrepancies or extra/missing logics, please let me know.

Note that we selected only SAT benchmarks for model validation and only UNSAT for unsat-core generation. As a result of that, some logics do not contain any benchmarks. So do not be surprised if you have subscribed to one of these logics and you do not have any result for it. In particular, the logics are:

For model-generation: QF_UFFP, QF_UFBVDT, QF_UFDTNIA, QF_NIRA.

For unsat-core: UFBVDT, FPLRA, QF_AUFBVFP, QF_ABVFPLRA, QF_UFBVDT, QF_UFDTNIA, QF_UFDT, QF_NIA, QF_SNIA.

jhoenicke · 2024-06-13T11:26:25Z

Hi Martin, thanks for the logs. I found two problems:

the links to the yml-Files are not working, so it's impossible to see which benchmark it is.
it seems that for model-validation, the model is not checked at all. It reports correct for QF_ABV/scrambled63358.smt2, even though the model production crashed.

martinjonas · 2024-06-13T11:36:20Z

1. the links to the yml-Files are not working, so it's impossible to see which benchmark it is.

Thanks for the report. It should be ok now.

2. it seems that for model-validation, the model is not checked at all. It reports correct for QF_ABV/scrambled63358.smt2, even though the model production crashed.

Yes, this is true. We are just running the solver to check whether it can be executed and not doing any extra post-processing yet. The true in the table corresponds to the result of the (check-sat) result. I see why this can be confusing and it would be better to show ERROR in this case. I will check whether we can do something about that.

jhoenicke · 2024-06-13T15:24:42Z

I updated the solver binaries for our final submission.

jhoenicke self-assigned this May 15, 2024

bobot added the submission Submissions for SMT-COMP label May 21, 2024

jhoenicke force-pushed the jhoenicke-smtinterpol branch from 17c006b to 40bb385 Compare May 22, 2024 00:51

jhoenicke changed the title ~~Create smtinterpol~~ Submission for smtinterpol May 26, 2024

jhoenicke marked this pull request as ready for review May 26, 2024 09:56

jhoenicke added 5 commits May 31, 2024 11:05

Create smtinterpol

5c2aaec

Draft for the smtinterpol submission.

Updated links

980d0cb

trim trailing whitespace

b4c9065

Add bitvector logics

3633c3e

Added seed

fef3875

jhoenicke force-pushed the jhoenicke-smtinterpol branch from 9e28d35 to fef3875 Compare May 31, 2024 09:06

jhoenicke added 3 commits May 31, 2024 12:05

Updated run script, fixed paths

74316ec

Updated submission

479e315

- New version uploaded - Added Contact - Added new BV logics to model-validation track

Added BV logics to model-validation

1e431f9

Updated binary

8d8ccad

martinjonas merged commit 91ed77f into master Jun 18, 2024
5 checks passed

martinjonas deleted the jhoenicke-smtinterpol branch June 18, 2024 06:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Submission for smtinterpol #45

Submission for smtinterpol #45

jhoenicke commented May 15, 2024 •

edited

Loading

bobot commented May 21, 2024

jhoenicke commented May 22, 2024

mpreiner commented May 23, 2024

bobot commented May 24, 2024

mbromber commented May 27, 2024 •

edited

Loading

jhoenicke commented May 27, 2024

mbromber commented May 27, 2024

github-actions bot commented May 31, 2024 •

edited

Loading

SMTInterpol

martinjonas commented Jun 10, 2024

martinjonas commented Jun 12, 2024

jhoenicke commented Jun 12, 2024

martinjonas commented Jun 12, 2024

martinjonas commented Jun 12, 2024

jhoenicke commented Jun 12, 2024 •

edited

Loading

martinjonas commented Jun 13, 2024

martinjonas commented Jun 13, 2024

jhoenicke commented Jun 13, 2024

martinjonas commented Jun 13, 2024 •

edited

Loading

jhoenicke commented Jun 13, 2024

Submission for smtinterpol #45

Submission for smtinterpol #45

Conversation

jhoenicke commented May 15, 2024 • edited Loading

bobot commented May 21, 2024

jhoenicke commented May 22, 2024

mpreiner commented May 23, 2024

bobot commented May 24, 2024

mbromber commented May 27, 2024 • edited Loading

jhoenicke commented May 27, 2024

mbromber commented May 27, 2024

github-actions bot commented May 31, 2024 • edited Loading

SMTInterpol

martinjonas commented Jun 10, 2024

martinjonas commented Jun 12, 2024

jhoenicke commented Jun 12, 2024

martinjonas commented Jun 12, 2024

martinjonas commented Jun 12, 2024

jhoenicke commented Jun 12, 2024 • edited Loading

martinjonas commented Jun 13, 2024

martinjonas commented Jun 13, 2024

jhoenicke commented Jun 13, 2024

martinjonas commented Jun 13, 2024 • edited Loading

jhoenicke commented Jun 13, 2024

jhoenicke commented May 15, 2024 •

edited

Loading

mbromber commented May 27, 2024 •

edited

Loading

github-actions bot commented May 31, 2024 •

edited

Loading

jhoenicke commented Jun 12, 2024 •

edited

Loading

martinjonas commented Jun 13, 2024 •

edited

Loading