Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Further updates to property definitions #460

Closed

Conversation

rartino
Copy link
Contributor

@rartino rartino commented Feb 27, 2023

In the work on #445 a few more things have come up in need of adjustments in the specification.

This PR adopts $comment from JSON Schema to address #459. Even if I, in the work on definitions, ended up not needing it, I foresee possible future uses and it is a trivial extension.

This PR bundles a potentially controversial change of something that has irked me more and more when working on the definitions. The type field we inherit from JSON Schema is AFAIK the only place in the OPTIMADE standard where a dictionary field may have multiple data types, either String or List. Hence, I'm going to propose that we force Property Definitions to always use the list format. Most of our properties anyway allow "null" and are thus not affected. For the few that don't, it may seem silly to force ["string"] over "string"; but having started the work on converting these definitions between various formats in #445, I really think the simplification of the representation format is worth it.

Further changes in this PR are:

  • The addition of x-optimade-requirements/response-default-level to specify to what degree a property should or should not be included in a response per default. We have this information embedded in some of our current property definitions on this format: "Response: REQUIRED in the response."

  • A corresponding addition in x-optimade-implementation/response-default for implementations to indicate what a specific implementation does for each field. (I had to name it differently, since the latter should be a boolean and the former a string.)

When I was adding these, I realized we need to clarify what is, and what is not, "the same" in terms of Property Definitions, so this is also done. Feel free to complain (and/or suggest fixes) if these formulations are not clear. They are, however, IMO absolutely important.

I should say I was initially a bit hesitant if the "default response" info really belong in the Property Definitions. However, after some thinking, I really believe they do. When groupings of people uses this format to "develop standards" e.g. for biomol, experimental, magnetism, etc., databases they will want to be able to say what a database is expected to return by "default", just as we wanted for the basic properties we have standardized so far.

@rartino rartino requested review from ml-evs and vaitkus February 27, 2023 13:53
@rartino
Copy link
Contributor Author

rartino commented Feb 27, 2023

For some reason I cannot request a review from @fekad via the Reviewers system, so I ping him here instead. Since you gave much good feedback on the initial property definitions, and I suspect you may have thoughts on the type field change, I'd be happy if you have time to take a look at this.

@rartino rartino requested a review from merkys February 27, 2023 13:58
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To limit slosh/churn I will wait until Antanas' review has been addressed before taking a look, please re-request my review after that

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
Co-authored-by: Antanas Vaitkus <[email protected]>
optimade.rst Outdated Show resolved Hide resolved
@rartino
Copy link
Contributor Author

rartino commented Feb 28, 2023

@ml-evs Ok, I think I've addressed everything from @vaitkus now, except the ongoing discussion we have about the default for response-default.

vaitkus
vaitkus previously approved these changes Feb 28, 2023
Copy link
Contributor

@vaitkus vaitkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes.

@rartino
Copy link
Contributor Author

rartino commented Mar 1, 2023

So, another thing came up as I was working on the property definitions yesterday. It turns out that, since I don't want to duplicate my unit definitions in the source files, I more or less automatically get definition files also for units.

Hence, it is going to be trivial to do the same thing for physical units as we are about to do for properties, i.e., give them URIs that are URLs under https://schemas.optimade.org/units/..., that point at the unit definitions using the format described in the OPTIMADE specification section 7.2.

I've set things up for this in #445. For example, YAML source files for the angstrom and meter:

This actually seems very useful in itself, in that it provides a way in any context to reference our precise unit definitions via these URIs.

However, this motivates a few additions I'd like to do the Unit definitions format:

  • Allowing an $id field also inside the unit definition.
  • Allowing version and property-format, and make property-format MANDATORY if the unit definition is not embedded inside a property definition.
  • Add another tag defining-relation that allows us to describe how the units we define relate to others (primarily to relate them to SI units).

Any thoughts? Given that this PR isn't yet merged, can I add these changes also to this PR, or should they go in their own?

Edit: Also one more thing: we have inherited from JSON Schema that the text in our descriptions uses commonmark. However, it is super inconvenient not to be able to write math expressions in our context. I suppose one could embed HTML MathML. But, for our use, should we not refer to an extension (or define ourselves) that allow embedding of LaTeX expressions inside $ signs?

Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments @rartino. Might make sense to also accept further contributions to this PR if that is easier for you. Will try to respond your most recent comment outside of this particular review.

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Show resolved Hide resolved
@ml-evs
Copy link
Member

ml-evs commented Mar 1, 2023

So, another thing came up as I was working on the property definitions yesterday. It turns out that, since I don't want to duplicate my unit definitions in the source files, I more or less automatically get definition files also for units.

Hence, it is going to be trivial to do the same thing for physical units as we are about to do for properties, i.e., give them URIs that are URLs under https://schemas.optimade.org/units/..., that point at the unit definitions using the format described in the OPTIMADE specification section 7.2.

I've set things up for this in #445. For example, YAML source files for the angstrom and meter:

* https://github.com/rartino/OPTIMADE/blob/add-property-definitons-3/properties/src/units/current/atomistic/angstrom.yaml

* https://github.com/rartino/OPTIMADE/blob/add-property-definitons-3/properties/src/units/current/si/m.yaml

This actually seems very useful in itself, in that it provides a way in any context to reference our precise unit definitions via these URIs.

However, this motivates a few additions I'd like to do the Unit definitions format:

* Allowing an `$id` field also inside the unit definition.

* Allowing `version` and `property-format`, and make `property-format` MANDATORY if the unit definition is not embedded inside a property definition.

* Add another tag `defining-relation` that allows us to describe how the units we define relate to others (primarily to relate them to SI units).

I like the sound of this!

Edit: Also one more thing: we have inherited from JSON Schema that the text in our descriptions uses commonmark. However, it is super inconvenient not to be able to write math expressions in our context. I suppose one could embed HTML MathML. But, for our use, should we not refer to an extension (or define ourselves) that allow embedding of LaTeX expressions inside $ signs?

I think MathML makes som sense as the only real native web standard I know of (especially now that Chrome has support for it again since January 2023), however I think we are all most familiar with LaTeX. GitHub right now will render e.g.,

$$ x = y^2 $$

with MathJax, but seemingly not MathML:

x = − b ± b2 − 4⁢a⁢c 2 ⁢ a x = \frac{-b\pm\sqrt{b^2-4ac}}{2a} x = {-b plusminus sqrt {b^2 - 4 ac}} over {2 a}

Given that Mathjax supports MathML input, perhaps this is the standard we could assume (which would also enable asciimath etc)?

Co-authored-by: Matthew Evans <[email protected]>
ml-evs
ml-evs previously approved these changes Mar 1, 2023
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve this for now @rartino, just let me know if you want another review if you do add the unit stuff here.

@rartino
Copy link
Contributor Author

rartino commented Mar 1, 2023

I think we are all most familiar with LaTeX. GitHub right now will render e.g., $x=y^2$, but seemingly not MathML [...] Given that Mathjax supports MathML input, perhaps this is the standard we could assume (which would also enable asciimath etc)?

I think this is a great way to go. How is it best expressed? Something like:

  • Formatting in the text SHOULD use Markdown in the CommonMark v0.3 format, with mathematical expressions rendered using the LaTeX mode of Mathjax 3.2. When possible, it is preferable if mathematical expressions use as straightforward notation as possible to make them readable also when not rendered through Mathjax.

I rather not allow people to use MathML as it really isn't very human readable in raw form.

@ml-evs
Copy link
Member

ml-evs commented Mar 1, 2023

Sounds good to me! I think that makes it a lot easier for us to host a really nice rendered version of the definitions and schemas somewhere outside of GitHub

@fekad
Copy link
Contributor

fekad commented Mar 1, 2023

Sounds good to me! I think that makes it a lot easier for us to host a really nice rendered version of the definitions and schemas somewhere outside of GitHub

In that case KaTex could also be a nice Latex renderer.

@rartino rartino marked this pull request as draft March 21, 2023 15:57
@rartino rartino changed the title Further updates to property definitions [WIP] Further updates to property definitions Mar 21, 2023
rartino added 2 commits March 22, 2023 02:52
…units to be defined one level higher, but rename unit-defintions to units so the fields can be named the same thing.
…units to be defined one level higher, but rename unit-defintions to units so the fields can be named the same thing.
optimade.rst Outdated Show resolved Hide resolved
@rartino rartino mentioned this pull request Jun 11, 2023
5 tasks
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
rartino and others added 2 commits July 14, 2023 15:56
Co-authored-by: Johan Bergsma <[email protected]>
Co-authored-by: Antanas Vaitkus <[email protected]>
@ml-evs ml-evs added the blocking-release This is a PR or issue that presently blocks the release of next version of the spec. label Dec 19, 2023
Comment on lines +2127 to +2132
- :field:`x-optimade-unit-definitions`: List.
A list of definitions of the symbols used in the Property Definition (including its nested levels) for physical units given as values of the :field:`x-optimade-unit` field.
This field **MUST be included at the outermost level of a property definition** if the defined property, at any level, includes an :field:`x-optimade-unit` with a value that is not :val:`dimensionless` or :val:`inapplicable`, and it MUST include definitions of all units used on all levels in the property definition.
The field MAY also occur at deeper nesting levels (but this is not required).
If it does, the unit definitions provided MUST be redundant with those provided at higher nesting levels.
See subsection `Physical Units in Property Definitions`_ for the details on how units are represented in OPTIMADE Property Definitions and the precise format of this dictionary.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should make x-optimade-unit-definitions always REQUIRED in the outermost level (so, possibly an empty list). The specification just gets bulky from the OPTIONAL - oh but often REQUIRED - categorization. However, setting this to REQUIRED requires updating the property definitions in #445.

@ml-evs
Copy link
Member

ml-evs commented Dec 23, 2023

Now I look at this wrt my comments on #488 and #445, I see we do not use the JSON Schema pattern field here. Is that intentional @rartino? It looks like we have maxLength and minLength for strings but not pattern.

optimade.rst Outdated Show resolved Hide resolved
@rartino
Copy link
Contributor Author

rartino commented Dec 23, 2023

I see we do not use the JSON Schema pattern field here. Is that intentional @rartino? It looks like we have maxLength and minLength for strings but not pattern.

pattern was indeed excluded intentionally, since we would have to agree on a regex format for OPTIMADE before we can support pattern (which to be used here would have to be equal to, or a subset of, ECMA since that is what JSON Schema specifies). Since we likely want to sync the regex format across OPTIMADE, this is waiting for a decision in #160.

For now, I'd suggest we merge the property definitions without 'pattern' and thus the symops would have to be validated outside of what can be done with the schemas. If anyone feels strongly enough about this omission, I think we have all pieces in place to finish #160, and then bring over the regex subset into this PR.

@rartino rartino marked this pull request as ready for review January 5, 2024 16:54
@rartino rartino closed this Jan 5, 2024
@rartino
Copy link
Contributor Author

rartino commented Jan 5, 2024

This PR is now merged into #445

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocking-release This is a PR or issue that presently blocks the release of next version of the spec.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants