Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better ways to check/enforce prerequisite properties (decomposability, etc.) #61

Open
guyvdbroeck opened this issue Mar 4, 2021 · 5 comments

Comments

@guyvdbroeck
Copy link
Member

Most of our inference functions assume something about the circuit (smoothness, decomposability, determinism, structured decomposability). Currently there are some functions that check these preconditions explicitly in code, some that add them as warnings in the documentation, and some that simply assume the user knows what's safe.

There are 2 possible improvements:

  1. Provide a cross-API function that can tell the user whether the input circuit is supported by a given function. For example:
prereq(::typeof(issatisfiable), circuit) = isdecomposable(circuit)
prereq(::typeof(istautology), circuit) = isdecomposable(circuit) && isdeterministic(circuit)

> prereq(issatisfiable, circuit)
true
> prereq(issatisfiable, circuit & rain)
false
  1. Create a wrapper struct for circuits that keeps track of all these properties, that sets/unsets them after transformations, that allows for them to be checked before running inference functions, etc. This change would require duplicating all inference functions into a 'safe' version that takes this new struct and checks properties, and a lower-level 'unsafe' one that assumes the user knows what they're doing (i.e., the current code).
@guyvdbroeck
Copy link
Member Author

I would like general feedback on this issue, and I'm especially curious what our external users like @RenatoGeh and @eliavw think.

@RenatoGeh
Copy link
Contributor

For me, as a user, I usually know beforehand if I have a decomposable, deterministic, etc circuit. So (1) seems like a good feature to have with minimum effort for users to know what is required for queries. I don't see myself needing (2) unless I were working with circuits I had no idea how they were generated, but even then that'd be far fetched, since I would probably blame myself for not reading the article before using that circuit learner.

@khosravipasha
Copy link
Contributor

I think 1 is very useful in most cases. Checking determinsim each time for bigger circuit might not be doable though. For (1) we can maybe also have functions that simply outputs what properties are needed for each query type. Then the user can decide whether to check the properties hold or if they already know it holds or not.

Option (2), is also very useful sepcially for new users that might not know what these properties are. Its probably more work than 1.

@eliavw
Copy link

eliavw commented Mar 4, 2021

When it comes to circuits, I consider myself very much a novice still, so you can take my view on these matters as a somewhat "newbie" perspective.

My recent usecase basically came down to the following;

  • I have an SDD from elsewhere
  • Load this SDD in Juice
  • Use Juice to convert the SDD to a PSDD
  • Learn the parameters of this PSDD on data.

After reading the docs, my workflow was;

c = load_logic_circuit("my_sdd.sdd")
c2 = smooth(c);
c3 = propagate_constants(c2);
pc = ProbCircuit(c3);

The smoothing in line (2) did not work and did not crash, and that's probably where things get dangerous; in my case I checked (I knew I was working on something that potentially would malfunction, due to reading issue #53) and caught the error. But people that do not double-check and 'trust' the code potentially have a hard time figuring this one out, and only catch the bug at a much later stage.

So, how to keep this from happening? Option (2) suggested above, which basically introduces a safe variant of all inference functions would catch this kind of error by definition, as it (if I understand correctly) would enforce all prerequisites.

However, I believe option (1) could also be a viable solution, as long as the documentation is very clear on these matters. Essentially, the docs should be very explicit about which properties each query requires from the circuit. A user that is unsure should then be very aware that he/she needs to use the prerequisite function to check whatever properties are needed. If the docs are clear enough on these matters, I see no issues with keeping the inference functions themselves "unsafe".

Lastly, if a user notices that their circuit is not suitable for the desired query, it's probably also nice to have a section in the docs about how to fix that, since otherwise having failed the prerequisites becomes a dead end.

So, in conclusion: (2) would work for sure, but (1) would already be enough, given enough documentation on how to properly do prerequisite checks and fixes within Juice.

@eliavw
Copy link

eliavw commented Mar 4, 2021

So, I just looked back to the previous issue in a bit more detail and I should be more clear on which workflows I tried and where the problems were

This strategy;

c = load_logic_circuit("my_sdd.sdd")
c2 = smooth(c);
c3 = propagate_constants(c2);
pc = ProbCircuit(c3);

does not throw any errors, but does crash when computing the EVI-query (so thats a good thing)

Then I tried this strategy;

circuit = load_logic_circuit(filename)

# Initialize VTree with correct amount of vars
vtree = Vtree(num_variables(circuit), :balanced)
mgr = SddMgr(vtree)
    
# Add structure to the logic circuit (i.e. vtree)
struct_circuit = StructLogicCircuit(mgr, circuit)
struct_circuit = propagate_constants(struct_circuit; remove_unary=true)

# Convert to probabilistic circuit
prob_circuit = StructProbCircuit(mgr, struct_circuit)
psdd = convert(StructProbCircuit, prob_circuit)
psdd

which does not fail, but running an EVI-query on this yields incorrect probabilities (do not sum to one). So that's a bit more dangerous. Probably this way of trying things is not what a regular user would do, I was just really trying to shoehorn my SDD into Juice. The solutions (1) and (2) proposed above would still be of help, even in such a contrived scenario, since before running the EVI-query, I would still be advised to meet the prerequisites or expect errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants