Add ZEP 9 (extension naming) draft #65

joshmoore · 2025-02-12T13:30:02Z

This ZEP, drafted with @normanrz and reviewed by the rest of the @zarr-developers/steering-council, attempts to unblock lingering spec-related issues like:

TODOs:

* Open phase 1 PR against zarr-specs (PR330: "ZEP9 (phase 1): add clarifications for extension naming", zarr-specs#330)
* Define a location for assigning names (e.g. https://zarr.dev/extensions)
* Cross-link related conversations in governance and specs
* ~~Clarify stores are not extensions (not written in metadata)~~ (This is never asserted, but we should agree that they are not extensions)
* Decide what content should be moved to subdocuments

Please see https://zarr.dev/zeps/draft/ZEP0009.html 🎉 Comments welcome as issues or on Zulip.

d-v-b · 2025-02-13T15:03:30Z

draft/ZEP0009.md

+🛠️ We propose defining two categories of names for immediate use by extensions:
+raw names and URI-based.


this would be a good place to add an explanation for why exactly we want to use URIs instead of some other kind of prefixing scheme. I understand that URIs are mentioned in the spec, but the spec doesn't provide any reason for their use.

jbms · 2025-02-13

The aspect of URLs that I like is that it delegates the name registration problem to DNS. But I am also unclear on the advantages of using fake https URLs compared to something slightly less verbose like <domain>/<suffix>, which less strongly implies a real resource. Someone is surely going to try to fetch the documents, find them missing, and think that the URLs are just stale. Then there is also the question of using http or https; or are other schemes like ftp, etc. allowed?

Aside from that, the added https:// part just adds extra verbosity. Tensorstore relies on just the json representation itself for specifying certain metadata, and making it more verbose just makes it harder to read and write.

Reading the proposal a bit more, I see that the URLs are recommended to resolve to some description of the extension; that is just not a requirement. I was thrown off a bit by the later examples showing https://zarr.dev/array/data_type, which didn't seem like a plausible actual URL, since for one thing it does not specify zarr v3 at all, and having a separate document for each field in the zarr v3 metadata is also rather different from the current documentation.

joshmoore · 2025-02-13

> this would be a good place to add an explanation for why exactly we want to use URIs

Thanks, will do.

> I see that the URLs are recommended to resolve to some description of the extension; that is just not a requirement

Is this generally a positive? Or a request that it be a requirement?

> I was thrown off a bit by the later examples showing https://zarr.dev/array/data_type, which didn't seem like a plausible actual URL

Sorry, perhaps chosen too quickly. I'll review the example for plausibility.
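
As a rough illustration of the two name categories discussed in this thread, the following Python sketch classifies a name as raw or URI-based. The rule used here (a name counts as URI-based when it parses with both a scheme and a host) and the helper name `classify_extension_name` are assumptions made for illustration only; the ZEP text remains the authority on what is a valid name.

```python
from urllib.parse import urlsplit


def classify_extension_name(name: str) -> str:
    """Classify a name as "uri" or "raw".

    Hypothetical helper, not part of any Zarr library.
    """
    parts = urlsplit(name)
    # Assumption: a URI-based name carries both a scheme and a host.
    if parts.scheme and parts.netloc:
        return "uri"
    return "raw"


# Example names are made up for this sketch.
for candidate in ("zstd", "https://example.com/zarr/my-codec", "example.com/my-codec"):
    print(candidate, "->", classify_extension_name(candidate))
```

Under this assumed rule, the shorter `<domain>/<suffix>` form raised above would be treated as a raw name, which is one way to see why the choice of prefixing scheme matters.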

rabernat · 2025-02-13T16:32:28Z

Josh, thanks so much for this important ZEP which addresses some critical issues around V3. 🙏

jbms · 2025-02-13T18:00:24Z

Regarding "raw" name registration:

Can you clarify a bit more how you are proposing this to work?

* Is the spec going to get reviewed, or will it just be a cursory inspection?
* If I register a raw name, does that mean I am registering a fixed version of a specification for that name, and any further revisions will require another PR, or am I getting delegated authority over that name, and can evolve it as I see fit without any further involvement of the zarr organization?

I think it would be better to leave delegation to the URL-based names, or at least names that are clearly prefixed by the delegate name, in order to make the nature of the name clearer.

jbms · 2025-02-13T18:05:08Z

Regarding "@context": my expectation is that the nature of the zarr metadata is such that each name is typically only specified once per a given array metadata, and furthermore other than the standard names in the zarr specification itself, a given URL prefix is also unlikely to be used more than once.

Consequently, I'm concerned that adding an additional "@context" mechanism (similar to xmlns) would not really help with verbosity, but would make it more complicated for computers to parse, and harder for humans to read the format easily, since they would now need to keep track of this additional name indirection mechanism.

joshmoore · 2025-02-13T19:49:25Z

> Can you clarify a bit more how you are proposing this to work?

In general, I would say we're trying to define how things get technically added more than the decision making, but a few points from my perspective:

> Is the spec going to get reviewed, or will it just be a cursory inspection?

I'd probably tend toward less, say, schema reviewing and more an evaluation of whether or not the name can be safely given out: will it lead to confusion, name squatting, etc.? But I'm open to discussion, and this is perhaps a matter for an additional governance document.

> If I register a raw name, does that mean I am registering a fixed version of a specification for that name, and any further revisions will require another PR, or am I getting delegated authority over that name, and can evolve it as I see fit without any further involvement of the zarr organization?

I think that depends on which versioning scheme is chosen by the implementation. If versioning is "within extension" then it's just one PR. If it's "versioning by name", then it would be multiple, but that would also put the burden on the reviewers to be comfortable with accepting the multiple names.
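
To make the two versioning schemes concrete, here is a minimal Python sketch of how the metadata entries might differ. The extension name `example_codec`, the `version` key inside `configuration`, and the helper `names_to_register` are all hypothetical; only the `name`/`configuration` object shape follows the Zarr v3 metadata convention.

```python
# Versioning "within extension": one registered name, and the version
# travels inside the configuration (the "version" key is an assumption).
within_extension = [
    {"name": "example_codec", "configuration": {"version": 1}},
    {"name": "example_codec", "configuration": {"version": 2}},
]

# Versioning "by name": every revision is published under a new name.
by_name = [
    {"name": "example_codec_v1", "configuration": {}},
    {"name": "example_codec_v2", "configuration": {}},
]


def names_to_register(entries: list[dict]) -> list[str]:
    """Return the distinct names that would each need a registration PR."""
    return sorted({entry["name"] for entry in entries})


print(names_to_register(within_extension))
print(names_to_register(by_name))
```

In this sketch the first scheme needs a single registration while the second needs one per revision, which is the reviewer burden mentioned above.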

> Regarding "@context": my expectation is that the nature of the zarr metadata is such that each name is typically only specified once per a given array metadata, and furthermore other than the standard names in the zarr specification itself, a given URL prefix is also unlikely to be used more than once.

Agreed.

> Consequently, I'm concerned that adding an additional "@context" mechanism (similar to xmlns) would not really help with verbosity, but would make it more complicated for computers to parse, and harder for humans to read the format easily, since they would now need to keep track of this additional name indirection mechanism.

Phase 3 definitely needs multiple proposals with weighted pros & cons after phase 1 and phase 2 are complete. I think it's useful though to take a full specification that doesn't need designing and see which modifications (if any) are needed to make it useful. For example, is a single flag in the metadata which states which version of the context one is currently using sufficient?

  "zarr_format": 3,
  "@context": "v3.1", // out of spec for JSON-LD

This is a bit like RDFa's "initial contexts":

Were this elevated to a zarr "extension point" this could then also take a URI (at the risk of verbosity).
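
The idea of a single context-version flag can be sketched in Python as follows. Nothing here is specified by the ZEP: the `CONTEXTS` table, its URI, and the `expand_name` helper are hypothetical, and the sketch only shows the kind of name indirection a reader or parser would have to follow.

```python
# Hypothetical table mapping a context version to short-name expansions.
CONTEXTS = {
    "v3.1": {
        "zstd": "https://example.org/zarr/v3.1/codec/zstd",  # made-up URI
    },
}


def expand_name(metadata: dict, name: str) -> str:
    """Expand a short name using the metadata's "@context" flag.

    Names without a known expansion are returned unchanged.
    """
    context = CONTEXTS.get(metadata.get("@context"), {})
    return context.get(name, name)


metadata = {"zarr_format": 3, "@context": "v3.1"}
print(expand_name(metadata, "zstd"))
print(expand_name({"zarr_format": 3}, "zstd"))
```

Even in this small form, resolving a name requires a lookup outside the metadata document, which is the readability cost raised in the quoted concern.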

P.S. apologies to @context. I suggest we write `@context` from here on.

joshmoore · 2025-02-14T14:24:15Z

Pushed the minor clarifications. I would work towards merging this as a draft ZEP. Questions and comments would continue to be welcome in issues, PRs, Zulip, or most other places.

The bulk of feedback on phase 1, however, is likely better placed on the PR with spec changes in zarr-developers/zarr-specs#330.

Once those clarifications have been released (e.g., tagged as 3.1) then we can return to the conversation about phase 2 in this ZEP for a final decision.

Timeline:

* few days: major clarifications on this PR
* few weeks: comments and improvements on 330 (phase 1)
* following that, we will follow the ZEP for the timeline of phase 2

A decision on whether or not phase 3 is needed or desired will follow.

rabernat

I agree we should merge this ASAP. According to the ZEP process, the feedback period starts once the draft is published. Let's use the spec PR to discuss and give further feedback.

normanrz · 2025-02-17T20:11:35Z

> Regarding "raw" name registration:
>
> Can you clarify a bit more how you are proposing this to work?
>
> * Is the spec going to get reviewed, or will it just be a cursory inspection?
> * If I register a raw name, does that mean I am registering a fixed version of a specification for that name, and any further revisions will require another PR, or am I getting delegated authority over that name, and can evolve it as I see fit without any further involvement of the zarr organization?

I added a section to the ZEP about the registration process that we envision for the raw names in 3538de5. We hope to keep the process very lightweight. The review is only supposed to avoid confusing names and prevent malicious activities. It is not intended to review the contents of the specification or its design decisions. If changes to an extension specification go into the zarr-extensions repo, which I would encourage, there needs to be another review for updates as well.

joshmoore · 2025-02-18T14:53:15Z

All the (initial) TODOs have been handled from my side. Following on from @rabernat's comment, I'd be in favor of getting this in as a "draft ZEP", knowing that a) we can still modify it and b) feedback can be taken via zarr-developers/zarr-specs#330.

cc: @zarr-developers/steering-council @zarr-developers/implementation-council @zarr-developers/python-core-devs

joshmoore added 4 commits February 11, 2025 17:45

Add ZEP 9 (extension naming) draft

56b95ef

Add ellipses in second example

4bffa12

Cleanup

1b87688

Add 'Immediate Clarifications'

04f2b0b

joshmoore mentioned this pull request Feb 12, 2025

zarr-specs v3.1 follow-ons zarr-developers/zarr-specs#329

Open

12 tasks

joshmoore commented Feb 12, 2025

Update draft/ZEP0009.md

fe215da

d-v-b reviewed Feb 13, 2025

joshmoore marked this pull request as ready for review February 14, 2025 14:16

joshmoore added 2 commits February 14, 2025 15:17

Add explanation of URI usage

231e803

Clarify make believe URI prefix

3c77d62

joshmoore mentioned this pull request Feb 14, 2025

ZEP9 (phase 1): add clarifications for extension naming zarr-developers/zarr-specs#330

Open

5 tasks

rabernat approved these changes Feb 14, 2025

d-v-b reviewed Feb 14, 2025

joshmoore and others added 2 commits February 16, 2025 13:15

Fix crazy typos and further clarify

a859a36

add zarr-extensions repo

3538de5

Review of primary phase 1 text

312bd73

This was referenced Feb 18, 2025

Add zstd codec zarr-developers/zarr-specs#256

Open

Define the list of codecs in the v3 spec zarr-developers/zarr-specs#312

Closed

Added consolidated metadata to spec zarr-developers/zarr-specs#309

Open

joshmoore mentioned this pull request Feb 18, 2025

Prototype of new DType interface zarr-developers/zarr-python#2750

Draft

normanrz merged commit a644e70 into zarr-developers:main Feb 18, 2025
1 check passed

joshmoore deleted the zep9-extension-naming branch February 19, 2025 08:02

normanrz mentioned this pull request Feb 25, 2025

Mesh specification ome/ngff#33

Open
