Generalize mutex-meet-tid privatization to arbitrary digest #1278

sim642 · 2023-11-30T11:05:19Z

The current implementation of PerMutexMeetTIDPriv for both base and relational analyses hard-codes thread IDs as the digest framework. This generalizes them over a digest module.

sim642 · 2023-12-01T09:00:21Z

Unfortunately, this is far from being general enough. I've been trying to get may-have-ever-been-locked digests working on the lock-digest branch and it doesn't work out in any simple way.

Unlike the lockset digests and thread ID digests, the observing lock operation for this abstraction requires getting all traces from all observable unlocks. Moreover, a cartesian product should be taken: one combining operation for each local state at lock and one state from the incoming traces. That essentially demands some kind of path splitting in globals (path-sensitive flow-insensitive unknowns).
Moreover, the splitting by maylocksdigest analysis should somehow split the globals of the apron analysis, but that's not possible in our current framework. Local path-sensitivity works like this like a product, but globals are a sum over analyses.

Besides that, mutex-meet-tid, both as implemented and as specified in the paper, tightly couples thread IDs in another way for the thread returns and joins. It all works out if thread IDs are both used for that and the digests, but not in general.

michael-schwarz · 2023-12-01T09:21:02Z

I have not looked into the implementation yet, so these are if you want "observations untainted by reality" 😉

requires getting all traces from all observable unlocks.

No, it suffices here to get the digests, no need to access the actual traces.

That essentially demands some kind of path splitting in globals (path-sensitive flow-insensitive unknowns).

I don't really understand what you mean by this: This is the same kind of information as are the thread ids: Things that additionally refine the values -- whether implemented as a map under the global (implementation) or as a in the paper (by having different unknowns).
Or were you talking about the implementation attempt in this PR here?

Moreover, the splitting by maylocksdigest analysis should somehow split the globals of the apron analysis

In what way is this different from the thread id based splitting we already have?

Besides that, mutex-meet-tid, both as implemented and as specified in the paper, tightly couples thread IDs in another way for the thread returns and join

Yes, this is a problem: Actually even the restricted publishing at unlock to only things I have written myself (as well as not reading from your own global invariant for unique threads) is a departure from the generic refinement framework. (See appendix G for the actual instance of $\mathcal{A}$ one needs to construct to prove these things sound).

sim642 · 2023-12-01T10:12:26Z

No, it suffices here to get the digests, no need to access the actual traces.

Sorry, that's what I meant: digests for all the reaching traces.

The big point is in the following:

Lockset digests don't actually use that information: (the $S'$ is unused).
Thread IDs don't either because the ID of the ego thread does not change in any way due to anything it observes (we don't even give right-hand sides for it).
May-ever-have-been-locked digests crucially use it: (the $L'$ is used).

For this reason, lockset and thread ID analyses don't even need global unknowns. This crucially makes them nicely fit into Goblint.

I don't really understand what you mean by this: This is the same kind of information as are the thread ids: Things that additionally refine the values -- whether implemented as a map under the global (implementation) or as a in the paper (by having different unknowns).
Or were you talking about the implementation attempt in this PR here?

It's not just that (map domain vs separate unknowns), but follows from the above observation. Because lockset and thread ID digests don't have globals, their (unit) globals also don't split all the other globals. But for may-ever-have-been-locked digests, this is no longer the case. In the paper view, the digests act as a product with the existing global unknowns (if you have two digests, it's effectively a product digest), not as a sum.

In Goblint, each analysis providing digests (e.g. mutex or threadid) adds global unknowns instead of multiplying. In a product, the may-ever-have-been-locked digests split apron global unknowns (this also explicitly happens with the current mutex-meet-tid implementation for thread IDs), but also vice versa (!): the global unknowns for may-ever-have-been-locked should be split. Since locksets/thread IDs don't use global unknowns, there's nothing to split and no circularity arises.

Therefore, I think the ad-hoc implementation of digests into relation analysis itself doesn't cut it. Instead we'll probably have to do how the paper presents it: the digest splitting lifts any existing analysis, i.e. it's a Spec functor in Goblint. It also means digests (provided by whatever activated analyses) need to be first-class citizens in Spec and multiplied into a product by MCP (as opposed to summing like for V).
Not only that, there's a recursive dependency between the two:

V needs to be lifted into a product with the combined Digest,
while the global unknowns in each Digest-providing analysis are split by all other digests themselves.

But even in that case, a very practical problem arises from the definition of observing actions:

Formally, we define it universally over all $A_0, A_1, A'$ but that's not how we can implement it at all. In a Goblint transfer function, we're in the middle of computing it for a particular constraint unknown (i.e. for a given $A'$). Therefore, $A_0, A_1$ are existentially quantified: we have to find all combinations of them that actually yield the $A'$ whose right-hand side we're in. Multiple combinations may give the same $A'$, so the implementation needs to join over all such.

In a way, that's analogous to the non-side-effecting constraint system definition for partial context-sensitivity. And we work around the existential nature using side effects. I suspect we'll need to somehow do a similar transformation here: have additional side-effected unknowns that can be directly looked up for the joined value, but it's not so clear to me how that should work (an unlock side-effects to all lock nodes in all functions and contexts?).
This is a non-problem for existing digests precisely because they don't depend on the incoming digest, but only at the previous local one, which we know where to find (and put into ctx.local).

michael-schwarz · 2023-12-01T10:30:41Z

Ah, yes, these are indeed problems one faces here.

I am actually defining the term of being ego-lane-derived for these kinds of digests, that can be computed from the ego lane and do not consider the values of digests incorporated at other unknowns (beyond whether the transition is defined at all or not):

but also vice versa (!): the global unknowns for may-ever-have-been-locked should be split.

I don't quite understand this point, as none of the digests should have their own unknowns -- instead they ought to refine the existing ones. But maybe we can discuss this in detail at the GobCon then.

sim642 · 2023-12-05T14:39:25Z

This discussion can continue in #1286. Although this PR clearly does a generalization, it might not be enough.

michael-schwarz · 2023-12-29T09:14:29Z

@sim642 does it make sense for me to review this, i.e., is this something that is sound and we should have merged even if it does not generalize as much as we had hoped for?

sim642 · 2024-01-09T09:41:51Z

is this something that is sound and we should have merged even if it does not generalize as much as we had hoped for?

I think we should. It better distinguishes TIDs of digest and TIDs of thread returns in these privatizations.

Generalize mutex-meet-tid privatization to arbitrary digest

a392323

sim642 added cleanup Refactoring, clean-up feature labels Nov 30, 2023

sim642 requested a review from michael-schwarz November 30, 2023 11:05

sim642 added 2 commits November 30, 2023 13:34

Simplify CommonPriv.ThreadDigest

6dfc08f

Avoid MustJoinedThreads query in ThreadDigest if not needed

05198f9

sim642 mentioned this pull request Dec 5, 2023

Add may-ever-have-been-locked digests for relational mutex-meet #1286

Closed

sim642 added 2 commits December 5, 2023 17:09

Rename Digest.compatible -> accounted_for

bda139f

Flip Digest.accounted_for implementation to match name

0704cd5

michael-schwarz approved these changes Jan 10, 2024

View reviewed changes

sim642 added this to the v2.4.0 milestone Jan 11, 2024

sim642 merged commit f53e4eb into master Jan 11, 2024
16 checks passed

sim642 deleted the mutex-meet-digest branch January 11, 2024 09:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generalize mutex-meet-tid privatization to arbitrary digest #1278

Generalize mutex-meet-tid privatization to arbitrary digest #1278

sim642 commented Nov 30, 2023

sim642 commented Dec 1, 2023

michael-schwarz commented Dec 1, 2023 •

edited

Loading

sim642 commented Dec 1, 2023

michael-schwarz commented Dec 1, 2023

sim642 commented Dec 5, 2023

michael-schwarz commented Dec 29, 2023

sim642 commented Jan 9, 2024

Generalize mutex-meet-tid privatization to arbitrary digest #1278

Generalize mutex-meet-tid privatization to arbitrary digest #1278

Conversation

sim642 commented Nov 30, 2023

sim642 commented Dec 1, 2023

michael-schwarz commented Dec 1, 2023 • edited Loading

sim642 commented Dec 1, 2023

michael-schwarz commented Dec 1, 2023

sim642 commented Dec 5, 2023

michael-schwarz commented Dec 29, 2023

sim642 commented Jan 9, 2024

michael-schwarz commented Dec 1, 2023 •

edited

Loading