-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filtering doesn't work as expected #40
Comments
what dpath version are you using? |
Nevermind, I can replicate this behavior on latest master. |
This has been broken for a long time, ever since values() was introduced, possibly even before. Really sorry about this one. I may not get around to a fix for this one tonight, but it's high on my list. |
From a quick glance at the logic in the code yesterday, I think it is due to the way the code does in place filtering while it does the search so it fails on the top level branch of the dict before it hits a result where the filter would pass it. I was expecting a post dpath filter, not necessarily a bug. Although it didn't print anything so it could be something else. Also I have a bug in the lambda should be .get() == 'correct', will test once I get into the office |
So the typo doesn't change anything |
TL;DR - Filtering doesn't work the way any of us assumes it does! So, after sleeping on this and pulling my head out of my buttcheeks, this boils down to a lack of good documentation. In fact, the lack of good docs is so bad that even I misunderstood what's actually going on behind the scenes, and how these functions work. First, you're correct, filters are pre, not post. They are executed as each level of the source dictionary is parsed. Second, dpath treats dictionaries and lists as trees. Filters execute on the leaf nodes, e.g., the individual dictionary values; they do not execute on dictionaries or lists, which are branches on the tree, not leaf nodes. Consider:
The first call to values failed (and printed nothing) because actions/* matches only one level deep (not multi-level), and the only thing there is 2 dictionary objects, neither of which are leaf nodes, so the filter isn't executed on them, hence nothing gets printed (and not returned). This behavior is not exclusive to values() either, the filtering happens in search, so search has the same issue:
.. If we expand the search range to "actions/**", we will reach the leaf nodes we want to filter against, and our print works:
... if we change our filter to actually give us some results now (with a .get() lambda like you had), we see now that this is obviously going to fail, because we're calling .get() on a string:
... but it works great if we check against a simple string comparison, because we're getting the leaf nodes:
So, while filtering does work exactly as it was designed, this implementation clearly defies the rule of least astonishment. Even though the docs clearly state how it works:
... I think it's worth considering changing how this works, or supporting some kind of alternate filtering mechanisms. RFC, @calebcase, OP? |
In terms of alternative post-search, I am able to handle it using other methods. Thus it's not that big of a deal if a pre filter is kept. Is it less efficient than doing it while stepping through the tree? Yes, but it's not something that should result in significant issues for me in my use case. Honestly I can't think of too many use cases where I would want a pre search, but I can see where others may want it. I suppose the documentation could definitely use some clarification on this. Changing the filter type might cause backwards incompatibility for users though so a post-search might be better to be added as a separate arg param. Incidentally I need to remember to test unicode/ascii with filter. |
What if we could say that all filter arguments must be subclasses of some base filter object, say dpath.Filter, and each Filter can implement one or more filtering steps, which would be called at various steps of the process? Then the user could just implement/override the steps they care about, and you could also make multi-step filters this way:
This would raise the question re: what to do with the existing filter functionality (being able to pass a lambda is awfully convenient), but I think this implementation would be significantly more clear. |
This seems like an excellent thing to fix in 2.0, re-labeling. |
Blocked by #136 |
I guess I will look into this sometime
The text was updated successfully, but these errors were encountered: