I think (1) is fixed if you upgrade.
Re (2): First, to do the arithmetic you'd need grouped arithmetic (https://docs.xarray.dev/en/stable/user-guide/groupby.html#grouped-arithmetic)
But that's still a bug (#8952)
Do you really need the MultiIndex? It works just fine if I comment out the |
I have been using xarray for a while and all my troubles seem to be around how groupby operations work, especially with a multiindex.
Let us take a simple, common example: a dataset with a multiindex, where one wishes to group by some variable to compute the mean, and then subtract the mean of each group from the variable.
[Edit] Added code that works and updated issues status
Example code
Note: the example has been changed to reflect a real use case where we want to compute the luminosity of each frame normalized by the average luminosity of its video. To justify the use of a multiindex, the videos have different numbers of frames.
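A minimal sketch of the kind of data described in the note above, assuming a single stacked dimension named `vidframe` with `video` and `frame` as plain coordinates. The video names, frame counts and random values are hypothetical and only illustrate the layout and the per-video mean.

```python
import numpy as np
import xarray as xr

# Hypothetical data: 3 videos with 2, 3 and 5 frames (10 frames in total).
frames_per_video = {"a": 2, "b": 3, "c": 5}
video = np.concatenate([[name] * n for name, n in frames_per_video.items()])
frame = np.concatenate([np.arange(n) for n in frames_per_video.values()])

rng = np.random.default_rng(0)
d = xr.Dataset(
    {"luminosity": ("vidframe", rng.random(video.size))},
    coords={
        # Plain (non-index) coordinates along the stacked dimension.
        "video": ("vidframe", video),
        "frame": ("vidframe", frame),
    },
)

# Mean luminosity per video: one value per group, along a new "video" dimension.
m = d.groupby("video").mean()
```

Here `m` has one entry per video while `d` has one entry per frame, which is why the two cannot be combined by plain arithmetic without telling xarray how the groups map back to the frames.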
Encountered issues
There are several issues with this code:
1. With `use_flox=False`, `d.groupby("video").mean()` fails.
2. `res = d - m` fails because `m` is an array of size 10 after the grouping, and it seems that xarray does not automatically call pandas merge on the multiindex beforehand. [Solution] One is expected to do `d.groupby("video") - m` (see the "Code that works" section and dcherian's reply).
3. With `d.set_index(vidframe=["video", "frame"])`, `d.groupby("video") - m` fails. The solution is to not use a multiindex.
Code that works
Sparse attempted solution
Another solution is simply to promote each level of a multiindex to a dimension and use a sparse array backend to avoid memory blowup. This has its own issues, but using it provides the most readable code.
[EDIT] I actually tried with the following code and I get errors... Is xarray fully integrated with sparse, or am I making a mistake?
[EDIT] It works with the correct code (i.e. `d["luminosity"].mean("frame")`); however, it fails with `d["luminosity"].mean("video")`, and I do not see why it should fail.
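A minimal sketch of the "promote each level to a dimension" idea, shown here with a dense NumPy backend rather than sparse, so it does not reproduce the memory savings or the errors mentioned above. The data and the `vidframe` dimension name are hypothetical. `unstack` also accepts `sparse=True`, which is the variant that would use the sparse backend; it is not exercised here.

```python
import numpy as np
import xarray as xr

d = xr.Dataset(
    {"luminosity": ("vidframe", np.array([1.0, 3.0, 2.0, 4.0, 6.0]))},
    coords={
        "video": ("vidframe", np.array(["a", "a", "b", "b", "b"])),
        "frame": ("vidframe", np.array([0, 1, 0, 1, 2])),
    },
)

# Promote both levels to dimensions; missing (video, frame) pairs become NaN.
wide = d.set_index(vidframe=["video", "frame"]).unstack("vidframe")

# Reduce over frames to get one mean per video (NaN is skipped by default).
per_video = wide["luminosity"].mean("frame")

# Ordinary broadcasting now aligns each frame with the mean of its video.
res = wide["luminosity"] - per_video
```

With both levels as dimensions, the subtraction needs no groupby at all, which is what makes this form the most readable; the cost is the padding for videos with fewer frames.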