-
Notifications
You must be signed in to change notification settings - Fork 42
Setting node name keeps tree linkage #310
Setting node name keeps tree linkage #310
Conversation
datatree/treenode.py
Outdated
parent = self._parent | ||
self.orphan() | ||
self._name = name | ||
self._set_parent(parent, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this orphaning then un-orphaning logic, couldn't we just change the key in .children
directly via some logic more like in _post_attach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem immediate to me right now (I tried some things but not successfully until now)
I still need to remove then add again the node in the parent's children at some point 🤔
In all cases I get more messy code than the current solution
__delitem__
itself uses orphan
under the hood for instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, I went for a .children =
assignment. I ensured the order of children is preserved by re-creating an OrderedDict from a generator on the existing parent's children. Code is more verbose, but I think this is unavoidable to ensure the order is preserved. Indeed, orphaning then re-attaching would have lost the ordering information.
Note: _post_attach
solely won't rename the node _name
, so two renaming still occur:
One on the parent's children (preserving order)One on the node_name
itself
Edit: if the node had a parent, there is no need to manually set the ._name
. Reassigning the updated list of children to its parents renames the node along the way
I remained conservative and forbid the renaming of a node to None if it has a parent Edit: according to a comment in the datatree design document mentioned in pydata/xarray#8747, section 4) Are nodes named? in practice, only the root node would remain anonmyous. Hence it makes sense to only authorize a node without a parent (root) to be renamed to None? See my comment #309 (comment) Edit: build is failing because of the new change in printing sizes of DataArrays: pydata/xarray#8702 Whether |
I see I got to this pretty late, so it may be solved already (if so, please ignore this). But I couldn't quite tell from this conversation if everything was working or not. If not, I have a solution which passes all pytests (except for a format error that was not passing prior to my changes) at https://github.com/marcel-goldschen-ohm/datatree/tree/main. Changes are in datatree.py in |
Any news on this? It would be great if this could finally be incorporated into the PyPI distribution of datatree. |
I know everyone is busy, so I hate to harp on this. But I'm surprised that even after 7 months this very fundamental issue with what seems like should be a simple solution is still not resolved (as far as I can tell). I'm not sure what the status is or what the issue is regarding unsuccessful checks for @etienneschalk's solution. I also have a solution that works for me at https://github.com/marcel-goldschen-ohm/datatree/tree/main as previously mentioned. How can I help get this resolved? |
Sorry @marcel-goldschen-ohm for letting this slip! I understand it's frustrating that we didn't just merge this and release. For context we've been concentrating on trying to get datatree merged upstream into xarray (see pydata/xarray#8572 (comment)), which would allow us to completely deprecate and archive this repository. We were hoping to have that done in July. 😅 This process is necessary, but it's an awkward time because it's tricky to keep track of both repos at once, especially as now some bugs in one are fixed in the other... We've made various small changes during that migration, including to whether or not copies of trees are created during operations, with the aim of making everything more consistent with the rest of xarray (see for example pydata/xarray#9297). We've also been trying to close issues here in favour of raising them upstream in xarray (because annoyingly we can't just auto-transfer all these issues to xarray upstream as it's in a different github org). However I've been looking at this specific issue again and I don't think it's trivial. After @etienneschalk 's comment in #309 (comment) it is now genuinely not clear to me what the desired behaviour should be. I have raised an upstream issue on xarray to track that question of what the behaviour should be going forward (pydata/xarray#9447). I am happy to merge this PR here and release another version if that would really help you, but my main concern has been ensuring that the upstream (non-prototype) version of datatree makes the right decisions going forward, not making sure all bugs are fixed in this version that's about to be deprecated. |
Regarding @etienneschalk 's comment in #309 (comment) , my personal opinion is that All of that said, I hear you that you have a lot on your plate with the move to Xarray. Also, although the desired behavior is not unclear to me, I hear you that Xarray itself apparently has some pretty silly behavior in this regard, and I get wanting to be Xarray compatible although I disagree with it in this case. Now that I know what the issue is, I can find a way to work around it, so don't worry about this for my benefit. Hopefully in the future both Xarray and DataTree will be more intuitive in this regard. Best wishes on the merge with Xarray! And also a big thank you for DataTree in the first place, I couldn't agree more that a tree structure of Datasets is highly useful :) |
Thanks @marcel-goldschen-ohm for your (very useful) perspective.
In that case I'm closing this in favour of the discussion upstream in pydata/xarray#9447. |
Small bugfix. I added a test reproducing the example in #309, and tests are not broken
Note: In
_post_attach
, I set the private_name
instead of setting thename
. Indeed, it can lead to infinite recursions when a setter is used inside of a class. I assume the_post_attach
method is like a "runtime assertion"pre-commit run --all-files
New functions/methods are listed inapi.rst
docs/source/whats-new.rst