-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor, fix, and optimizer filters/rules #1794
base: dsb/depickle
Are you sure you want to change the base?
Conversation
""" | ||
Apply the rule. Return True on a match. | ||
""" | ||
obj = self.get_object(data) | ||
# FIXME: probably can do the rest without object: | ||
if not self.match_substring(0, obj.get_name()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not self.match_substring(0, obj.get_name()): | |
if not self.match_substring(0, data["name"]): |
def apply(self, db, person): | ||
def apply_to_one(self, db, data): | ||
person = self.get_object(data) | ||
# FIXME: can probably do all of this without object: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use data["birth_ref_index"] to avoid constructing the object?
@@ -145,6 +145,8 @@ def apply(self, db, place): | |||
|
|||
# now we know at least one is given in the filter and is valid | |||
|
|||
place = self.get_object(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use data["lat"] and data["long"] to avoid constructing the object?
Thanks @stevenyoungs for looking this over and the feedback. I'm hoping that @Nick-Hall is open to these three ideas (fixes, using JSON data, and the optimizer) because all three make the filter system so much faster while keeping the API almost the same. Many places in Gramps use the raw data for speed, including tools, views, displayers, and importers. It would be a shame if filters couldn't do the same. |
No problem. I'm also keen to see the benefits of more standard storage and see what opportunities it unlocks. The recent discussion of GQL is an interesting option. |
Your feedback is very useful. It always helps for another person to review the code. I very much appreciate your contributions. |
With the new JSON format, I think that we can regard the raw data as providing lightweight objects. The format mirrors the Gramps objects and is therefore much easier to understand and more resilient to updates. I've been doing some research today to see if we can make the dict format more object-like. For example, I am open to using JSON data in more core code, including filters. We should probably discuss where the raw format is acceptable. Third-party code should be careful when using the raw format. It may change between feature releases, not just major releases, but his has always been the case. An optimizer seems like a good idea. I haven't reviewed your code yet though. |
How about just wrapping our raw data with class DataDict(dict):
def __getattr__(self, key):
value = self[key]
if isinstance(value, dict):
return DataDict(value)
elif isinstance(value, list):
return DataList(value)
else:
return value
class DataList(list):
def __getitem__(self, position):
value = super().__getitem__(position)
if isinstance(value, dict):
return DataDict(value)
elif isinstance(value, list):
return DataList(value)
else:
return value It allows exactly what you describe, is low-cost, and doesn't require any more code than that above.
Agreed. The rule has seemed to be: use where it is needed for speed or easy representation. |
The only downside is that we leak the internal representation of the objects into the higher layers of code. But since those layers, in part, already use the raw format, this PR does not change make a material difference. |
For many attributes of an object, it isn't so much "internal" now that we are switching to JSON. For example, this is the Person object API to get the parent family list: person.parent_family_list # object attribute
person.get_parent_family_handle_list() # function call With the above person.parent_family_list # JSON access
person.get_parent_family_handle_list() # function call, instantiates object The new version would hide the fact that the function call creates the object (but just once). So because the JSON data mirrors the actual attribute names, it doesn't have to change at all. |
This PR does three things: