-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check nan handling in aggregation #26
Comments
@atrisovic @andyhultgren here is the proposed NaN handling spec. Does this seem right? |
@delgadom @atrisovic Yes this is right. Here is the full set of NaN conditions:
For example, our transforms of the climate data are structured to pass NaNs through like this: And then when we spatially collapse gridcells to adm units, we take the weighted average while ignoring missing values (unless they are all missing for a given adm unit, in which case NaN is returned): However when aggregating over time (for an adm unit) if e.g. one day has missing data for the entire adm unit, then the temporally-aggregated output (say a month containing that day) is also reported as missing. As in: All of these code snippets are from the master branch of the |
@andyhultgren thanks! You're absolutely right - @atrisovic the aggregation math in aggregations.py line 92 does need to be changed:
should be
also @andyhultgren, sorry for the nitpicky off-topic point, but just so you know np.vectorize is super slow compared with using actual vector math operations. It's essentially just a for-loop, so this is a pure python cell-wise operation:
You'd see major speedups with something like this:
Although exponentiation does preserve nans, so this doesn't actually do anything and could be written:
|
Oh and to completely respond - the aggregation code only aggregates in space - aggregation in time is defined in each transformation. mortality polynomials preserve daily resolution, Ag GDD/KDDs sum to the month, and energy is annual. We'll make sure the transformations fit the current spec, but just so you know this will ultimately be up to the user to make sure the transformations are done right. |
Proposed spec:
For a given region to be aggregated:
Note that no interpolation is done in time, space, or any other dimensions with these tools
This should be tested in the output
The text was updated successfully, but these errors were encountered: