Zarr V3 metadata fixes #248
Open
LDeakin wants to merge 6 commits into zarr-developers:main from LDeakin:zarr_v3_fixes
6 commits
da9188b Change Zarr V3 chunk key encoding to `v2` with `.` separator (LDeakin)
3de4d5d Fix Zarr V3 non-finite float fill value encoding (LDeakin)
425b5b4 Fix Zarr V3 NaN fill value for integer arrays (LDeakin)
479f7bc Zarr V3 default to little endian configuration for `bytes` codec (LDeakin)
2a5dc68 Fix tests using `null` fill value or `nan` fill value for integer dat… (LDeakin)
c10535e [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
This seems wrong? For writing v3 metadata?
In general, if we're not planning to use this format any more (see #262 (comment)), how much of this PR do you want to keep, @LDeakin?
Presumably all the rest of the fixes are still relevant?
The chunk manifest example in zarr-developers/zarr-specs#287 and `virtualizarr` produces "0.0" style chunk key encoding, which is `v2` with `.` separator. `default` with `/` would be "c/0/0".

If the chunk key encoding of the array and the chunk manifest matches, then the `chunk-manifest-json` storage transformer does not need to concern itself with chunk key encodings, which makes sense to me.

Not fussed, this PR was just the minimal changes I needed to use the `chunk-manifest-json` as currently spec'd and produced by `virtualizarr`. I'd hope most of these changes would be superseded by bringing in `zarr-python` V3 as a dependency anyway.

I haven't looked thoroughly at the spec for icechunk yet, but do you see it replacing `chunk-manifest-json` entirely? Can the time travel stuff be decoupled from the chunk manifests?
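For readers comparing the two encodings mentioned above, here is a minimal sketch of how chunk grid coordinates map to keys, based on my reading of the Zarr V3 chunk key encoding rules. The helper name `encode_chunk_key` and its parameters are hypothetical and are not taken from zarr-python, virtualizarr, or this PR.

```python
from typing import Optional, Sequence


def encode_chunk_key(
    coords: Sequence[int],
    name: str = "default",
    separator: Optional[str] = None,
) -> str:
    """Hypothetical helper: map chunk grid coordinates to a store key.

    Assumes the two encodings discussed in the thread:
    - "default": keys are prefixed with "c", separator defaults to "/"
    - "v2": no prefix, separator defaults to "."
    """
    parts = [str(c) for c in coords]
    if name == "default":
        sep = "/" if separator is None else separator
        # A zero-dimensional array has no coordinates, so the key is just "c".
        return sep.join(["c", *parts])
    if name == "v2":
        sep = "." if separator is None else separator
        # A zero-dimensional array uses the single key "0".
        return sep.join(parts) or "0"
    raise ValueError(f"unknown chunk key encoding: {name!r}")


print(encode_chunk_key((0, 0), name="v2", separator="."))  # 0.0
print(encode_chunk_key((0, 0)))                            # c/0/0
```

If the array metadata and the manifest both use the `v2` encoding with the `.` separator, the keys in the manifest can be looked up directly without translating between encodings.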
My intention was to test out writing to and reading from a v3-compatible json-based chunk manifest spec. If what I actually did looks more like v2 then that's my bad for not understanding the spec properly!
Okay thanks. Maybe we get virtualizarr working fully, then look at the updated diff, as I would expect @mpiannucci's efforts on icechunk compatibility should iron out similar concerns around fill values?
👍 We're close to being able to do that now that zarr-python v3 alpha (beta today actually) is out.
I think that is Earthmover's intention.
In theory it probably could, but in practice, unless there is a strong use case for using chunk manifests where you wouldn't also like to have all the other features of icechunk, I'm not really sure why you would bother separating them. All the features of icechunk are closely related in that they all involve/require adding a new layer of indirection into the store, i.e. the manifests + snapshots (which are kind of like time-stamped consolidated metadata IIUC). This question deserves discussion on that zarr spec proposal issue though.
I've asked in zarr-developers/zarr-specs#287 (comment)