[Feature]: More Sophisticated Bounds Handling in Temporal Averaging Operations #594

pochedls · 2024-01-31T02:26:35Z

Is your feature request related to a problem?

xcdat temporal averaging operations currently bin data by the labelled time point with the weights derived from the difference in the time bounds. This works for most conventional climate data: a timepoint of 2020-01-16 12:00 with bounds of [2020-01-01 00:00, 2020-02-01 00:00] would be given 31 days of weight in January (e.g., in creating an annual average or climatology), which is correct.

There are reasonable instances where this wouldn't work. Imagine pentad data with a time point of 2020-02-02 12:00 with bounds of [2020-01-31 00:00, 2020-02-05 00:00]. This time point should be given one day of weight in January and four days of weight in February. The current algorithm (e.g., for monthly averaging) assigns all five days of weight in February (the labelled time point).
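To make the pentad case concrete, here is a minimal sketch of the behavior described above, using pandas only. It is not xcdat's actual implementation; the variable names are illustrative assumptions, and the timestamps are the ones from the example.

```python
import pandas as pd

# One pentad record, using the values from the example above.
time_point = pd.Timestamp("2020-02-02 12:00")
lower = pd.Timestamp("2020-01-31 00:00")
upper = pd.Timestamp("2020-02-05 00:00")

# The weight is the full width of the bounds, in days.
weight_days = (upper - lower) / pd.Timedelta(days=1)

# The record is binned by the month of its labelled time point only.
label_month = time_point.to_period("M")

current_weights = {str(label_month): weight_days}
print(current_weights)  # {'2020-02': 5.0}
```

All five days land in February, even though one of them belongs to January.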

Describe the solution you'd like

Weights should be assigned according to how much of each time point's bounds falls within each averaging period. This would mean that a given time point can contribute to averages in more than one time interval.
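A rough sketch of that idea, assuming monthly averaging periods and bounds treated as half-open intervals. `split_weight_by_month` is a hypothetical helper written for illustration, not an xcdat function.

```python
import pandas as pd


def split_weight_by_month(lower, upper):
    """Return {month: days of overlap} for one [lower, upper) bounds interval.

    Hypothetical helper for illustration only; not part of xcdat.
    """
    weights = {}
    # Visit every calendar month touched by the bounds.
    for month in pd.period_range(lower, upper, freq="M"):
        # Clip the bounds to this month's start and the next month's start.
        start = max(lower, month.start_time)
        end = min(upper, (month + 1).start_time)
        days = (end - start) / pd.Timedelta(days=1)
        if days > 0:
            weights[str(month)] = days
    return weights


# Pentad bounds from the example: one day in January, four in February.
pentad = split_weight_by_month(
    pd.Timestamp("2020-01-31 00:00"), pd.Timestamp("2020-02-05 00:00")
)
print(pentad)  # {'2020-01': 1.0, '2020-02': 4.0}

# Conventional monthly bounds still give the whole month to January.
monthly = split_weight_by_month(
    pd.Timestamp("2020-01-01 00:00"), pd.Timestamp("2020-02-01 00:00")
)
print(monthly)  # {'2020-01': 31.0}
```

With this approach a single time point contributes to more than one output period, so the averaging step would need to work from (time point, period, weight) triples rather than from one group label per time point.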

Describe alternatives you've considered

Solutions for the time being would be to update documentation to note that:

Weights are determined from the difference in the bounds
Data is grouped into the labelled time point for averaging operations
If one time point spans across the time intervals that you are averaging into, then weights are not properly assigned
get_time_bounds generally assumes data with frequencies of annual, monthly, daily, or sub-daily (I thought of this while writing this issue)
Perhaps other disclaimers I haven't thought of

Additional context

I'm not sure cdms / cdutil covers this case; it would be helpful to determine what cdutil does.

This seems like it could be a challenging issue to address in general and might require a major refactor of the logic used for existing temporal averaging calculations.

tomvothecoder · 2024-02-06T17:06:43Z

I opened up PR #601 to implement the documentation updates suggested in your alternative solution.

pochedls · 2025-01-29T18:49:33Z

Just referencing this comment, which provides further discussion of this issue.

pochedls · 2025-01-30T00:21:53Z

FYI – @tomvothecoder – this is a pentad dataset that this issue would affect:

/p/user_pub/climate_work/pochedley1/msu/netcdf/uah_6.1-pentad_tlt_197901-202412.nc

pochedls · 2025-02-01T01:31:41Z

@tomvothecoder – I put together a notebook on this – maybe we could schedule a time to go over it? Or should I post it? Or just start a PR?

tomvothecoder · 2025-02-03T18:17:23Z

@pochedls You can start a new PR and commit it temporarily there, I'll analyze it, then we can go over it if needed. Thanks for starting work on this!

pochedls · 2025-02-06T04:20:32Z

@tomvothecoder - I've prototyped some new functionality in a new branch. It is not anywhere near complete – should I still do a PR (that way I could comment on it to help orient you)?

FYI: this is the prototype example of producing a monthly average from a pentad dataset with correct weights:

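import xcdat as xc

# Pentad dataset referenced earlier in this thread
fn = '/p/user_pub/climate_work/pochedley1/msu/netcdf/uah_6.1-pentad_tlt_197901-202412.nc'
ds = xc.open_dataset(fn)
# Spatially average the 'tlt' variable
ds = ds.spatial.average('tlt')
# Monthly average via the prototype method on the new branch
dsa = ds.temporal.compute_monthly_average('tlt')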

tomvothecoder · 2025-02-10T18:10:44Z

Yes, a PR would be helpful to get things rolling. Thanks!

github-project-automation bot added this to xCDAT Development Jan 31, 2024

github-project-automation bot moved this to Todo in xCDAT Development Jan 31, 2024

tomvothecoder mentioned this issue Feb 6, 2024

[Doc]: Add more info about how temporal averaging is performed and how weights are generated #600

Closed

tomvothecoder added the type: enhancement New enhancement request label Feb 12, 2024

tomvothecoder added this to the FY24Q4 (07/01/24 - 09/30/24) milestone Jun 20, 2024

tomvothecoder moved this from Todo to In Progress in xCDAT Development Jun 20, 2024

tomvothecoder moved this from In Progress to Todo in xCDAT Development Jun 20, 2024

tomvothecoder modified the milestones: FY24Q4 (07/01/24 - 09/30/24), FY25Q1 (10/01/24 - 12/31/24) Sep 25, 2024

pochedls mentioned this issue Jan 15, 2025

[Feature]: Retain bounds and compute time point for group averaging operations #565

Open

tomvothecoder modified the milestones: FY25Q1 (10/01/24 - 12/31/24), FY25 Q2 (01/01/25 - 03/31/25) Jan 17, 2025

pochedls linked a pull request Feb 10, 2025 that will close this issue

More sophisticated bounds handling for temporal averaging #735

Open

9 tasks

tomvothecoder moved this from Todo to In Progress in xCDAT Development Feb 10, 2025
