sail -smt very slow to generate. #783

nwf · 2024-11-19T22:23:40Z

Building prefixes of the CHERIoT sail model's properties file gets very slow very quickly:

Up through line	Time taken (s)
49 (no `$property`-s)	2.18
57 (`prop_nullzero`)	247.75
72 (`prop_seal_root`)	248.19
83 (`prop_decEnc`)	539.42
119 (`prop_andperms`)	(killed after several hours)

This is not the kind of growth curve that inspires confidence in running the whole file. Unfortunately, the vast majority of this time is spent after --verbose 1 stops being useful:

Compiling [================================================= ] 99% (prop_nullzero)

And similarly for --dtc_verbose and --dsmt_verbose. Any hints as to what I could do to see where sail is spending its time?

The text was updated successfully, but these errors were encountered:

Alasdair · 2024-11-20T04:40:30Z

There's the --dprofile flag which gives some coarse-grained timing information, anything more and you need an actual profiler.

| | Profiled SSA conversion (prop_nullzero): 0.012452s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_nullzero): 10.668481s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_nullzero): 10.141833s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_mem_root): 0.006286s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_mem_root): 0.066871s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_mem_root): 0.021631s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_exe_root): 0.006511s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_exe_root): 0.068106s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_exe_root): 0.021916s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_seal_root): 0.006965s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_seal_root): 0.068628s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_seal_root): 0.021575s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_decEnc): 0.006986s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_decEnc): 8.335943s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_decEnc): 13.978323s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_andperms): 0.042519s
| |   SMT calls: 0, SMT time: 0.000000s

I think the problem might be the path-condition generation after inlining in the SMT properties. The current method is very slow, but the previous method was subtly wrong in some rare cases.

Alasdair · 2024-11-20T04:47:05Z

I think I recently made the application of some optimisations a bit too conservative when I was working on the SystemVerilog generation, if I force them to be applied everywhere (even when potentially unsound) it goes fast:

> time make -B generate_smt
...
Profiled Generating SMT: 0.251947s
  SMT calls: 0, SMT time: 0.000000s
        0.65 real         0.58 user         0.05 sys

Just a case of finding the right point where the optimisations fire, but not anywhere it might be unsound.

nwf · 2024-11-20T04:59:10Z

Thank you, again, for your fast turnaround. Indeed, I think --dprofile tells me nothing that I (think I) don't already know:

Profiled Compiling to Jib IR: 0.073096s
  SMT calls: 443, SMT time: 0.016352s
| | Profiled SSA conversion (prop_nullzero): 0.043928s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_nullzero): 88.200192s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_nullzero): 151.570311s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_mem_root): 0.021943s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_mem_root): 0.331778s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_mem_root): 0.117937s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_exe_root): 0.022117s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_exe_root): 0.330953s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_exe_root): 0.119901s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_seal_root): 0.022884s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_seal_root): 0.333424s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_seal_root): 0.119254s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_decEnc): 0.024844s
| |   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT conversion (prop_decEnc): 73.973810s
|   SMT calls: 0, SMT time: 0.000000s
| Profiled SMT optimization (prop_decEnc): 216.931438s
|   SMT calls: 0, SMT time: 0.000000s
| | Profiled SSA conversion (prop_andperms): 0.112770s
| |   SMT calls: 0, SMT time: 0.000000s

Alasdair · 2024-11-20T05:15:50Z

I think I have a fix here #784, but it has a few issues still that'll I'll need to fix tomorrow before I can merge it.

rmn30 · 2024-11-20T12:04:29Z

It definitely goes faster now :-)

Alasdair · 2024-11-20T14:45:05Z

I think that commit was unsound after testing it some more though...

Looks like the issue is with encCapabilityToCapability. The way it uses mutable variables means that we can't optimise the match how we'd like. Right now for SMT generation we do inlining -> SSA transform -> SMT output. What we really need to make it faster is to do it as SSA transform -> inline -> SMT output. The match optimisations are mostly there to make sure the SSA transform doesn't blow up after inlining.

rmn30 · 2024-11-20T15:56:37Z

Maybe we should rewrite that not to use mutable variables: just define the return Capability upfront with all permissions false and otype for the non-exe case then use with in the match cases to override as necessary.

Alasdair · 2024-11-20T17:22:00Z

Yes, I think there are a few other similar functions though. Essentially I've already done the thing I need in order to fix the performance issue more generally in the Sail to SystemVerilog backend.

The intermediate representation I use when generating SystemVerilog (SVIR) is just bitvector SMT expressions wrapped in a SystemVerilog module structure. If I inlined that structure into plain SMT, I'd get the right thing and wouldn't have to worry about things like mutable variables turning off optimisations which is just going to stay brittle and annoying.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sail -smt very slow to generate. #783

sail -smt very slow to generate. #783

nwf commented Nov 19, 2024

Alasdair commented Nov 20, 2024

Alasdair commented Nov 20, 2024

nwf commented Nov 20, 2024

Alasdair commented Nov 20, 2024

rmn30 commented Nov 20, 2024

Alasdair commented Nov 20, 2024

rmn30 commented Nov 20, 2024

Alasdair commented Nov 20, 2024

sail -smt very slow to generate. #783

sail -smt very slow to generate. #783

Comments

nwf commented Nov 19, 2024

Alasdair commented Nov 20, 2024

Alasdair commented Nov 20, 2024

nwf commented Nov 20, 2024

Alasdair commented Nov 20, 2024

rmn30 commented Nov 20, 2024

Alasdair commented Nov 20, 2024

rmn30 commented Nov 20, 2024

Alasdair commented Nov 20, 2024