Use Case: [C]Worthy OAE dataset #119

Open
3 of 8 tasks
TomNicholas opened this issue Sep 27, 2024 · 4 comments
Labels
use case 🌎 Real-world use case virtual references 👻 Involves virtual kerchunk/virtualizarr chunk references

Comments

@TomNicholas
Contributor

TomNicholas commented Sep 27, 2024

This issue but for icechunk: zarr-developers/VirtualiZarr#132

I was originally planning to virtualize this [C]Worthy dataset and save the references using the kerchunk parquet format, but now the timelines have changed such that both icechunk and the [C]Worthy OAE atlas are planned to release on the same day (Oct 15th 2024)! So I could use icechunk's format instead (or just write both)...

I think it's pretty unlikely that virtualizing using icechunk happens by then (I have enough work to do just to release the un-virtualized version of the dataset), but I do need to do all this by December anyway because I submitted this as a talk to AGU 🙃 Regardless of when it happens, this dataset is a good real-world test case for icechunk, as I said in zarr-developers/VirtualiZarr#132:

If we can virtualize this we should be able to virtualize most things 💪

Wishlist:

@dcherian
Contributor

> Datetime support

you probably don't need this since Xarray encodes datetimes by default.

@TomNicholas
Contributor Author

> you probably don't need this since Xarray encodes datetimes by default.

You mean if I save the time coordinates as non-virtual zarr arrays then xarray's decoding should handle this as normal?

@dcherian
Contributor

if you go through xarray, yes.

@rabernat
Contributor

Should also work with virtual data. Typical CF datasets use int as the raw array dtype and carry attributes like `units: days since X`, which Xarray / cftime decode to Python datetimes. There is no native datetime type in netCDF.
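
As a rough illustration of the decoding step described above (a sketch added for clarity, not code from this thread and not Xarray's actual implementation), the arithmetic behind a CF-style `units` attribute can be written in plain Python. The values in `raw_time`, the `attrs` dictionary, and the helper `decode_cf_time` are all hypothetical. The sketch assumes a standard calendar and handles only a few units.

```python
from datetime import datetime, timedelta

# Hypothetical raw integers and attributes, as a CF dataset might store them.
raw_time = [0, 1, 31]
attrs = {"units": "days since 2000-01-01"}

# Seconds per supported unit (a small subset of what CF allows).
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_time(values, units):
    """Decode integers using a CF-style '<unit> since <reference date>' string."""
    unit, _, reference = units.partition(" since ")
    origin = datetime.fromisoformat(reference.strip())
    step = timedelta(seconds=_SECONDS_PER_UNIT[unit.strip()])
    # Each stored integer is an offset from the reference date.
    return [origin + int(v) * step for v in values]


decoded = decode_cf_time(raw_time, attrs["units"])
print(decoded[-1].isoformat())
```

Because the stored array holds only plain integers, the chunks themselves need no special datetime handling. The interpretation lives entirely in the attributes, which is presumably why the same decoding can apply whether the chunks are native or virtual references.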

@TomNicholas TomNicholas added use case 🌎 Real-world use case virtual references 👻 Involves virtual kerchunk/virtualizarr chunk references labels Nov 6, 2024