Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to the spec and schemata for RFC-2 #242

Merged
merged 39 commits into from
Nov 22, 2024
Merged

Conversation

normanrz
Copy link
Contributor

@normanrz normanrz commented May 31, 2024

This is the companion PR for RFC-2 which adds the changes to the spec document, json schemata, examples and test files. The PR is meant to support the review process of RFC-2 by providing the specifics.

Again, a brief summary of the main changes:

  • Zarr v3 is used for OME-Zarr that includes that all metadata moves from .zattrs to zarr.json
  • The OME-Zarr metadata will live under the ome key in the attributes of the zarr.json files
  • Renaming the spec doc to OME-Zarr

Copy link
Contributor

github-actions bot commented May 31, 2024

Automated Review URLs

latest/index.bs Outdated Show resolved Hide resolved
@will-moore
Copy link
Member

Unchanged in this PR but index.bs still has:

Each "multiscales" dictionary SHOULD contain the field "name". It MUST contain the field "version", which indicates the version of the multiscale metadata of this image (current version is [NGFFVERSION]).

Even the example below that text doesn't contain version. Also it seems that version is no-longer needed since that's provided by the https://ngff.openmicroscopy.org/0.5 key?

latest/index.bs Outdated Show resolved Hide resolved
@imagesc-bot
Copy link

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/bioformats2raw-removing-resolution-index-from-zarr-hierarchy-seems-to-forfeit-some-metadata/97347/4

@will-moore
Copy link
Member

In looking to implement support for reading the proposed V0.5 data (in ome-ngff-validator), I am finding the usage of the versioned key https://ngff.openmicroscopy.org/0.5 is a bit painful when you want to get_version() since I have to iterate through a list of potential versions to check if the key exists.
This means that I will always have to update the code to support new versions, instead of retrieving the version and using that to automatically pick the correct schema to validate against.

So I find that I am in agreement with various comments on RFC-2 about the concerns of using a version string as a key.

@d-v-b
Copy link
Contributor

d-v-b commented Jun 12, 2024

Instead of using a URL-with-a-version-inside as a key, I think it would be better to pick a name like "ome" or "ome-ngff" as the key for an object, and have a version field in that object, and a schema_url field in that object. much clearer, and it would allow parsers to check the version of the metadata without knowing the version beforehand.

@normanrz normanrz mentioned this pull request Jul 2, 2024
@normanrz
Copy link
Contributor Author

normanrz commented Jul 2, 2024

I updated this PR for the RFC-2 revision. The namespace key is now ome and there is a separate version attribute.

@will-moore will-moore mentioned this pull request Jul 3, 2024
4 tasks
@will-moore
Copy link
Member

Working with these schemas and those from @d-v-b's dev1 branch e.g. https://github.com/ome/ngff/blob/7da3d7bbd7c49db29b44e54a6bf5fd7e1387f100/0.5-dev1/schemas/image.schema in the ome-ngff-validator, I noticed that in this PR, the schemas include the attributes (and ome), so that you can validate the raw zarr.json against the schema, whereas in the dev1 branch, the attributes were not included in the schemas, so you needed to validate against the contents of the attributes key. This approach may have been chosen to reduce the number of changes in going from zarr v2 -> v3.

I don't know which approach is most useful to the community, given the various tools that might want to consume these schemas? Is it most useful to be able to validate against a whole zarr.json file or against the root.attrs as loaded in hand via zarr-python?
In the case of ome-ngff-validator I'm happy to use either approach, so I just wanted to flag it up for discussion in case others have strong views?

rfc/2/index.md Outdated Show resolved Hide resolved
@normanrz
Copy link
Contributor Author

normanrz commented Jul 8, 2024

Is there a json schema for the base zarr.json where we could plug in the OME-Zarr metadata schema? cc @d-v-b

@d-v-b
Copy link
Contributor

d-v-b commented Jul 8, 2024

Is there a json schema for the base zarr.json where we could plug in the OME-Zarr metadata schema? cc @d-v-b

I'm not aware of one, but we should a) make one b) include it with the zarr v3 spec. Were I to work on this today, I would start by fixing up the rather meager v3 support in pydantic-zarr, and then use that to generate the schema. But any way of generating such a schema is valid.

@joshmoore
Copy link
Member

The three approving reviews of RFC-2 have now been merged: #261

Minor changes here to address the above discussions, #259 and any issues of versioning, etc. are welcome.

@normanrz
Copy link
Contributor Author

I changed the JSON schema files to use the attributes as a root instead of the root of the zarr.json. I think that composes better because we don't have to redefine the Zarr core metadata in our schema.

I also added schema_url as a new property but removed it again because it caused issues with the ome-ngff-challenge. We should discuss whether to add schema_url to the OME-Zarr metadata. Personally, I am not convinced that this is necessary because it is trivial to look up the schema in this repository based on the version attribute. That makes schema_url somewhat redundant and verbose.
cc @joshmoore @d-v-b

conf.py Outdated Show resolved Hide resolved
@joshmoore
Copy link
Member

See speced/bikeshed#2946 for some of the concerns around bikeshed

@joshmoore
Copy link
Member

As someone of you may have noticed from the large number of commits pushed here, the spec build is currently broken due to an update in the build system (bikeshed). This is now breaking other work like #267. I'm in the process of copying the reference data (IDR/ome-ngff-samples#22) and getting implementations other than vizarr and validator (like neuroglancer: google/neuroglancer#670) released. I'm going to move forward with our TODO list (#242 (comment)) and merge this but not undertake the announcement step. That way we can get the data in place and the implementations re-tested before announcing more widely.

@will-moore: please note the new behavior of /latest/ as a built copy of /0.5/ in case there is any specific login in the validator that depends on something there. It might be useful to be able to specify an explicit schema location in the validator (either via semi-hidden text box or another URL parameter) to be able to test before a full release.

@joshmoore joshmoore merged commit 3fc518b into ome:main Nov 22, 2024
10 checks passed
github-actions bot added a commit that referenced this pull request Nov 22, 2024
Changes to the spec and schemata for RFC-2

SHA: 3fc518b
Reason: push, by @joshmoore

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@joshmoore joshmoore mentioned this pull request Nov 22, 2024
"attributes": {
"ome": {
"version": "0.5",
"schema_url": "https://ngff.openmicroscopy.org/latest/schemas/ome_zarr.schema",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this on purpose? The schema_url key is not defined in the specification and I assume a latest URL means the underlying schema will evolve through time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not on purpose. This needs fixing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to open a PR to remove it unless it's already part of a batch of changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #279

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants