Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Batching Query Public Inputs for Tabular Queries + Remove Old Circuits #417

Merged

Conversation

nicholas-mainardi
Copy link
Contributor

This PR cleans up the circuit code in verifiable-db crate. In particular

  • Rather than keeping 2 versions of query circuits (batched and non-batched) in the codebase, the old non-batched circuits, which will become useless when batching circuits will be integrated, are removed
  • To avoid having to deal with 2 different set of public inputs in query circuits API, the main change of this PR is making tabular queries circuits compatible with the set of public inputs exposed by batching circuits. This will yield some unnecessary public inputs for tabular queries, but it will allow to keep a single API to generate query proofs
  • As a consequence of the removal of the old non-batched circuits, also the integration test code for queries can be greatly simplified

Copy link
Collaborator

@nikkolasg nikkolasg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in depth review but looks good !! Thanks for the simplification !

Comment on lines 128 to 131
/// Public inputs for generic query circuits
pub type PublicInputs<'a, T, const S: usize> = PublicInputsFactory<'a, T, S, false>;
/// Public inputs for universal query circuit
pub type PublicInputsUniversalCircuit<'a, T, const S: usize> = PublicInputsFactory<'a, T, S, true>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we should probably rename these like

  • PublicInputs => BatchingQueryPublicInputs
  • PublicInputsUniversalCircuit => TabularQueryPublicInputs

Would that be accurate ? The "generic query circuits" is not specific enough to be meaningful imo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure about using BatchingQueryPublicInputs because we no longer have the non-batched circuits, so imo doesn't make much sense to label these public inputs as Batching ones. Also, TabularQueryPublicInputs would be accurate now, but maybe once we will introduce generic tabular queries with results tree, I think we might still use BatchingQueryPublicInputs, so TabularQueryPublicInputs would be misleading. That's why I thought about using UniversalCircuit to identifiy this version of public inputs. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I renamed PublicInputs in PublicInputsQueryCircuits in commit 0666bb9, until we find better names xD

@@ -173,7 +174,9 @@ where
.chain(empty_hash.elements.iter())
.chain(node_min.to_targets().iter())
.chain(node_max.to_targets().iter())
.chain(once(second_index_id))
.chain(once(
&hash_wires.input_wires.column_extraction_wires.column_ids[1],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we adding this one again ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean. You are wondering why I didn't keep the second_index_id variable?

@@ -1,2 +1,82 @@
pub(crate) mod binding;
pub(crate) mod construction;
/// Old query public inputs, moved here because the circuits in this module still expects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this whole module is unused, I would be more in favor of just removing it all together. We can always go back in time where it was there to copy paste no ?
That OR importing this module only when enabling a legacy feature for example if we want to keep it around live.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a feature results_tree to disable this code until we don't resume development on it

row_cells,
predicate_operations,
placeholders,
false, // doesn't matter for placeholder hash computation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it doesn't matter in both cases (agg or no agg), then why keeping these arguments in the first place ? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in some other cases (e.g., when we want to instantiate a dummy row to pad to maximum number of rows processed in a circuit) those parameters are meaningful. Maybe I could use an Option so that we just provide None in cases where the value is not relevant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit 0666bb9, I changed the code to avoid instantiating that data structure providing dummy values for these flags

TabularQueryOutputModifiers::new(
limit,
offset,
results_structure.distinct.unwrap_or(false),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we can use unwrap_or_default i believe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 0666bb9

@@ -221,6 +167,97 @@ where

let flat_computational_hash = flatten_poseidon_hash_target(b, computational_hash);

// additional constraints on boundary rows to ensure completeness of proven rows
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear why these new assertions are needed now and weren't present before. Can you clarify ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constraints are necessary to enforce completeness in batching query circuits, as detailed in the revelation circuit description. They weren't necessary in the old circuits, and since we replace also the revelation circuit for the old query circuits with the one for batching, that's why these constraints have been added to this circuit

@nikkolasg nikkolasg merged commit 95bfa33 into feat/batching_integration_test Jan 8, 2025
4 checks passed
@nikkolasg nikkolasg deleted the fix/remove_old_query_circuits branch January 8, 2025 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants