-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(rules): config structure #47
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well done @cmdoret 🥳
praise: the code for rules looks both more structured and cleaner
praise: the new config is definitely easier to understand and use
I only have a major comment and an open question related to it:
issue: Since we're playing with the rules, I tried a test run on my end after compiling your code and noticed that all literal values were not pseudonymized in the output. After a bit of digging I stumbled upon a matching condition that was looking for only NamedNode
as a type and used _
for any other case. I suggested avoiding this when possible because it goes a bit against Rust's principle of explicit typing, let me know your thoughts.
./target/debug/tripsu pseudo --index tests/data/type_map.nt --config tests/data/config.yaml tests/data/test.nt
INFO Args: PseudoArgs { index: "tests/data/type_map.nt", input: "tests/data/test.nt", config: "tests/data/config.yaml", output: "-", secret: None }
<http://example.org/4cbe42b24189cb18b362a8cd0e952c482ad3b757ae96dc2590ba8b94d0911895> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/4cbe42b24189cb18b362a8cd0e952c482ad3b757ae96dc2590ba8b94d0911895> <http://xmlns.com/foaf/0.1/holdsAccount> <http://example.org/a7ee682dbe9f90d4b52c156744757a4aae47df7da05f2ac91261fb4c66cb5340> .
<http://example.org/a7ee682dbe9f90d4b52c156744757a4aae47df7da05f2ac91261fb4c66cb5340> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/OnlineAccount> .
<http://example.org/a7ee682dbe9f90d4b52c156744757a4aae47df7da05f2ac91261fb4c66cb5340> <http://schema.org/name> "my_account32" . <- should be pseudo
<http://example.org/a7ee682dbe9f90d4b52c156744757a4aae47df7da05f2ac91261fb4c66cb5340> <http://schema.org/accessCode> "secret-123" . <- should be pseudo
<http://example.org/4cbe42b24189cb18b362a8cd0e952c482ad3b757ae96dc2590ba8b94d0911895> <http://schema.org/name> "Alice" . <- should be pseudo
[...]
suggestion(tests): the bug above is very funny because all unit tests pass, meaning that all of our main functions work. The issue arises when you run the whole pipeline and a "side" function doesn't do everything it's supposed to. How can we address this?
I suggest adding a test function in pass_second
to test the encryption of several triples using rstest
. Let me know if you have a better idea 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for pointing out the wrong line for the bug and excellent job with fixing and the tests 😎
praise: seeing the index!
macro in action with multiple entries is so clean!
praise: the test for triples is exactly how I imagined it
praise: thank you for renaming pass_first and pass_second, now it feels like we're 100% in release state
All looks good to me!
Proposed Changes
Changes the rule specification format to match the proposal in #45.
match_rules
into the rules moduleThe mask and triple were cloned a few times by the matching rules, the refactored version passes everything by reference, resulting in a 13% speedup (15.9s -> 14.1s on a 2.3M triples file). That's ~164k triples/second.
Types of Changes
What types of changes does your code introduce? Put an
x
in the boxes thatapply
functionality to not work as expected).
other choices apply).
Checklist
Put an
x
in the boxes that apply. You can also fill these out after creatingthe PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.
CONTRIBUTING
guidelines.
works.