-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.x and v2.x harmony and the transition to xarray #143
Comments
I should note that I'm not a regular v2.x user, so for anyone who is, please report other incompatibility issues here. The ultimate goal is to have code work either way, and to provide a transition path even if we ultimately decide to change the structure of v2.x. |
@freemansw1 Thanks for bringing up this discussion point. We might need to do a selection on which methods should run in both version and which not. For one of our tutorial notebooks, I started to test a relative import header like this: # relative imports
try: # version 2.0 structure
from tobac.themes.tobac_v1 import feature_detection_multithreshold, segmentation, linking_trackpy
from tobac.plot import map_tracks, animation_mask_field, plot_lifetime_histogram_bar
print('...imported tobac in version 2.x structure')
except: # version 1.3 structure
from tobac import feature_detection_multithreshold, segmentation, linking_trackpy
from tobac import map_tracks, animation_mask_field, plot_lifetime_histogram_bar
print('...imported tobac in version 1.x structure')
from tobac.utils import get_spacings It is not working properly ... more thoughts are needed, but I guess it is anyway much more than adjusting the import statements :-( |
The issue raised in #145 is related to that and should be taken into account I guess? |
@fsenf Yes, this only addresses the first point but could definitely be a good start to make the tutorials work for both versions. Your code works fine for me using either |
Hmm, you are right! Again, the difference between Now, it fails here: # Determine temporal and spatial sampling:
In [10] : dxy,dt = get_spacings(OLR)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [10], in <cell line: 2>()
1 # Determine temporal and spatial sampling:
----> 2 dxy,dt = get_spacings(OLR)
File ~/TROPOS/proj/2022-06-tobac-devel/tobac/tobac/utils.py:480, in get_spacings(field_in, grid_spacing, time_spacing)
476 from copy import deepcopy
478 # set horizontal grid spacing of input data
479 # If cartesian x and y corrdinates are present, use these to determine dxy (vertical grid spacing used to transfer pixel distances to real distances):
--> 480 coord_names = [coord.name() for coord in field_in.coords()]
482 if (
483 "projection_x_coordinate" in coord_names
484 and "projection_y_coordinate" in coord_names
485 ) and (grid_spacing is None):
486 x_coord = deepcopy(field_in.coord("projection_x_coordinate"))
TypeError: 'DataArrayCoordinates' object is not callable which is somehow expected because version 1.3 expects Iris cubes, right? An option for the tutorial notebook would be that the user selects @freemansw1 & @JuliaKukulies : What do you think? Would it be an option for a rather painless update for version compatibility? |
I am more inclined to fix the incompatibilities on the tobac side than on the tutorial notebook side, personally. I think the best way to resolve that issue could be to start wrapping functions in the |
I've broken up the xarray/iris/pandas checkboxes into input and output above. |
I agree with @freemansw1, fixing the incompabilities on the tobac side may be smoother and a bit less "just fixing the symptoms". This would keep the code in the example notebooks much cleaner and help users to use old or write new code that works in either of the two versions. I think the suggestion above is good. Just remind me: Was the plan in the long long run to completely move to xarray meaning remove the iris dependency and rewrite the code within the old functions that are based on iris?
Yes, exactly. So that is maybe one good example that shows that it would be smoother to solve this within
That makes sense, thanks! |
I believe that is the plan, based on the roadmap document and previous discussions. Iris isn't well supported in the current ecosystem and can be pretty inflexible. Xarray also has better dask support, allowing for better use with large datasets. |
Agreed with that. But yes, working with the decorators first as you suggested is probably a good solution, as it may take longer time to completely migrate the code inside the functions. And we probably would like to keep the one that converts an iris cube to an xarray.DataArray |
If everybody is ok with that I would like to work on a PR that copies the decorators from v2 to v1.x and adds some tests. Since we talked about this in the last meeting I played around with them for a while and they seem to be working just fine. I wouldn't decorate any function in the code yet. |
@snilsn I think we would all be interested in that pull request. As we make this transition, we will have to be careful with user communication. Not all xarray Datasets are Iris-compatible. I think porting over the decorators is a great first step, though. |
Full support also from my side. And yes, some fields need to be treated with special care when internal transformation from xarray to Iris is made. |
With #191 merged, we've now addressed another item on this list. I suspect that will be the only item fixed for v1.5. I think a good goal for v1.6 would be to switch much of the internal functionality over to xarray such that the xarray goal can be addressed as well. |
Minimally invasive workflow for xarray transitionInspired by the interesting discussions we had the last couple of days, I was re-thinking how we could reach the goal to transition to xarray as fast as possible. My proposal would be a test-driven development approach sticking to the structure, functions, call schemes we already have and doing the code refactoring a bit later. Unit testingA propose the following test-driven workflow:
Real data testingWe need more real data testing opportunities for the xarray transition. ideas
@freemansw1 @JuliaKukulies @w-k-jones @kelcyno @deeplycloudy : What do you think about this approach and how we can expand out real data testing options? |
I created a branch exp_iris2xarray (branched from main & merged RC_v1.5) where you can see the details in action |
BTW: I did not change protection and review rules for this experimental branch. Happy if somebody can point me to the place where this can be done ... |
I like this plan. I think the first step should be to make a set of really robust (and well tested!) conversion routines between xarray and iris/pandas, that go beyond the existing xarray routines and ensure consistency in e.g. dimension names and handle the proleptic gregorian time weirdness. After that we should be able to test consistently for iris vs xarray input, then change the internals and ensue the results are the same. Also we definitely need more real data for larger scale tests. Will also come in useful for the new analysis cookbooks! |
I also like this plan and I agree with will’s note below. For the larger scale tests and more real data, I have several gridded radar cases that are anywhere from 4-24hrs, depending on how big we might need going forward.
…-Kelcy
--
Kelcy Brunner
Post Doctoral Researcher
TTU, National Wind Institute
On Apr 22, 2023, at 3:07 AM, William Jones ***@***.******@***.***>> wrote:
This email originated outside TTU. Please exercise caution<https://askit.ttu.edu/phishing>!
I like this plan. I think the first step should be to make a set of really robust (and well tested!) conversion routines between xarray and iris/pandas, that go beyond the existing xarray routines and ensure consistency in e.g. dimension names and handle the proleptic gregorian time weirdness. After that we should be able to test consistently for iris vs xarray input, then change the internals and ensue the results are the same.
Also we definitely need more real data for larger scale tests. Will also come in useful for the new analysis cookbooks!
—
Reply to this email directly, view it on GitHub<#143 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AU7Z2U7AA3LIPHJMA2BEZMDXCOGUBANCNFSM5ZQ36VHQ>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
The protection/review rules for a particular branch can be changed here: https://github.com/tobac-project/tobac/settings/branches |
Thank you for the well developed and detailed plan on how to proceed on this issue @fsenf! Agree with @w-k-jones and I guess the only way to develop more advanced conversion routines we will need real data to be able to know about all naming convention and time oddities, right? Can we collect those somewhere? One issue with the dimension names was for example discussed here |
How big is your data @kelcyno ? Will you be able to upload this data to our zenodo repository? I think in addition to the specific data sets each of us is working with, it would also be nice to add more test cases with commonly used openly accessible datasets (e.g. ERA5, more satellite obs? ) |
@fsenf I had a thought this weekend that I described briefly in #337. I think another step we can take in concert with the test-driven development approach that you propose here is to add in type hints as we start this transition... I found that it really helped me understand where xarray and iris were used, and it helps us with our goal on #75 as well. What do you think? |
@freemansw1 : Thank you for the suggestion! I actually have no experience with type hints and therefore cannot make a meaningful contribution to the discussion. However, I am not against it, but we should not destroy back compatibility. |
Agreed. I don't think adding type hints should break anything, especially after I made the change to #337. We had to switch to python >=3.7 first, but we have now done this. All part of an integrated strategy :) |
I'm opening this as an overarching issue of resolving the differences between v1.x and v2.x and allowing codebases to work with either without modification. I encourage discussion of this current incompatibility and any ideas on how to link things better. Below is a checklist of everything currently on my mind that causes incompatibility, with the intent of checking things off as we resolve the incompatibility:
import
strings (i.e.,import tobac.themes
vsimport tobac
)xarray
as output from Feature Detection, Tracking, and Segmentation #83)utils
module into internal utilities and external utilities #122)The text was updated successfully, but these errors were encountered: