Investigate interfaces to monthly and annual temporal averaging #193

Open
agstephens opened this issue Oct 8, 2021 · 7 comments

@agstephens
Contributor

Description

We are looking at adding monthly and annual averaging to the roocs stack. Carsten has made a suggestion in this PR:

#191

Task

@ellesmith88 Please could you look at similar code stacks and standards to see how they describe interfaces to these kinds of temporal averages (as you did when we defined the main interface to the subset operation)? It would be good for us to learn from other approaches, although we may still choose our own :-)

Please document what you find in this issue.

@ellesmith88
Contributor

I've put together a spreadsheet of examples I was able to find: https://docs.google.com/spreadsheets/d/1VLkIxFvYlGPqO-MW1pNaHKXuBFTQNMbi4KZrNG2RmWo/edit?usp=sharing

I couldn't think of or find any more stacks/standards to look at for temporal averaging. If there are others, let me know and I'll look into them :)

@agstephens
Contributor Author

Hi Elle,
This looks great.
Thanks

@agstephens
Contributor Author

After discussions with the C3S team, we are considering:

resample(collection, dims="time", freq="month", how="mean")

or

aggregate(collection, dims="time", freq="month", how="mean")

The resample operation/method is common to xarray, pandas and cds-toolbox. The aggregate operation/method exists in xarray.
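For reference, here is a minimal sketch of what a monthly mean looks like when written directly against xarray's existing `resample` method. The dataset, the variable name `tas` and the choice of the `"MS"` (month start) offset alias are assumptions made for illustration; this is not the proposed roocs interface.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series for January-March 2000 (31 + 29 + 31 = 91 days).
# The variable name "tas" is only an example.
time = pd.date_range("2000-01-01", "2000-03-31", freq="D")
ds = xr.Dataset(
    {"tas": ("time", np.arange(time.size, dtype=float))},
    coords={"time": time},
)

# "MS" is the pandas offset alias for month start, so each monthly
# bin is labelled with the first day of its month.
monthly_mean = ds.resample(time="MS").mean()
```

In this form the `freq` and `how` arguments of the proposals above correspond to the offset alias passed to `resample` and the reduction method called on the result.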

@agstephens
Contributor Author

agstephens commented Nov 12, 2021

Or:

clisops.ops.average::average(ds, dims="time", freq="month")

or:

clisops.ops.average::average_time(ds, freq="month")

@agstephens
Contributor Author

agstephens commented Nov 15, 2021

Side discussion...

  • ideally, clisops.core would contain numerous functions that all do 1 thing - and you can chain them in a workflow because they are all operating on dask.delayed objects
  • maybe we should refactor functions like subset_bbox so that the time and level parts are excluded
    • then clisops.ops.subset can decide which functions are called

If we refactored as above, then it would make more sense to write clisops.core.average::average(ds, dims=["time", "longitude"], freq="month") knowing that clisops.ops.average::average(...) could offer more functionality (or maybe not).

Note refactoring issue: roocs/clisops#114

Ouranos support this suggestion: clisops.core has single/specific functions that can be chained together (with delayed compute).
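As a rough illustration of the "one function, one job" idea, the sketch below chains three small functions with xarray's `Dataset.pipe`. The function names (`subset_time`, `average_over_dims`, `resample_mean`) are hypothetical and are not existing clisops functions. The example uses in-memory NumPy data; the assumption is that with dask-backed arrays (for example after `ds.chunk()`) the same chain would stay lazy until computed.

```python
import numpy as np
import pandas as pd
import xarray as xr


def subset_time(ds, start, end):
    """Hypothetical single-purpose step: select a time range only."""
    return ds.sel(time=slice(start, end))


def average_over_dims(ds, dims):
    """Hypothetical single-purpose step: collapse the given dimensions."""
    return ds.mean(dim=dims)


def resample_mean(ds, freq):
    """Hypothetical single-purpose step: mean over time bins of size `freq`."""
    return ds.resample(time=freq).mean()


# Synthetic daily data for the year 2000 on four longitudes.
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
longitude = [0.0, 90.0, 180.0, 270.0]
ds = xr.Dataset(
    {"tas": (("time", "longitude"), np.ones((time.size, len(longitude))))},
    coords={"time": time, "longitude": longitude},
)

# Each step does one thing; the workflow decides which steps to call.
result = (
    ds.pipe(subset_time, "2000-01-01", "2000-06-30")
    .pipe(average_over_dims, ["longitude"])
    .pipe(resample_mean, "MS")
)
```

A higher-level function in `clisops.ops` could then be a thin wrapper that picks and orders steps like these.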

@agstephens
Contributor Author

agstephens commented Dec 16, 2021

Decision! We will start by implementing this:

clisops.core.average::average_time(ds, freq="month"|"year")

and...

clisops.ops.average::average_time(ds, freq="month"|"year")
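A minimal sketch of how the agreed signature might be implemented on top of xarray, assuming the dataset has a `time` coordinate and that `"month"` and `"year"` map to the pandas offset aliases `"MS"` and `"YS"`. The mapping, the error handling and the function body are assumptions for illustration, not the clisops implementation.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumed mapping from the proposed freq names to pandas offset aliases
# (month start and year start).
_FREQ_ALIASES = {"month": "MS", "year": "YS"}


def average_time(ds, freq):
    """Average `ds` over time in bins of one month or one year (sketch)."""
    if freq not in _FREQ_ALIASES:
        raise ValueError(
            f"freq must be one of {sorted(_FREQ_ALIASES)}, got {freq!r}"
        )
    return ds.resample(time=_FREQ_ALIASES[freq]).mean()


# Usage: two years of synthetic daily data.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
ds = xr.Dataset(
    {"tas": ("time", np.ones(time.size))},
    coords={"time": time},
)
monthly = average_time(ds, freq="month")
yearly = average_time(ds, freq="year")
```

Restricting `freq` to a small set of named values keeps the public interface independent of pandas alias spellings, which can then change internally without affecting callers.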

@agstephens
Contributor Author

@ellesmith88 please have a look.
