Add ranged properties #452

Closed
Changes from 22 commits
35 commits
c18cf0c
Intermediate state adding ranged properties.
JPBergsma Dec 31, 2022
7dc50f2
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Jan 3, 2023
6afcf9c
first draft for ranged properties.
JPBergsma Jan 6, 2023
1a55c2a
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Jan 6, 2023
b597a67
Removed average, set, min and max fields for now as these become quit…
JPBergsma Jan 9, 2023
37db878
Small corrections.
JPBergsma Jan 9, 2023
a832751
Added how to treat missing values for requested range.
JPBergsma Jan 11, 2023
16aba07
changed description field
JPBergsma Jan 12, 2023
b1d69a8
Apply suggestions from code review
JPBergsma Jan 12, 2023
edbfc25
Merge branch 'Materials-Consortia:develop' into JPBergsma/Add_ranged_…
JPBergsma Jan 12, 2023
14de45d
Apply suggestions from code review Vaitkus
JPBergsma Jan 17, 2023
906db81
intermediate state from implementing code review.
JPBergsma Jan 17, 2023
1feb4a9
Merge branch 'JPBergsma/Add_ranged_properties' of https://github.com/…
JPBergsma Jan 17, 2023
a96dffe
Processed comments rartino.
JPBergsma Jan 18, 2023
15f599c
Small corrections.
JPBergsma Feb 15, 2023
73905dc
Added returned range property.
JPBergsma Feb 15, 2023
c6834f3
Added extra explanation values field.
JPBergsma Feb 16, 2023
b0cc94c
Apply suggestions from code review
JPBergsma Feb 17, 2023
0cee1e6
Merge branch 'develop' into JPBergsma/Add_ranged_properties
JPBergsma Mar 6, 2023
d7c8a9c
Processed comments rickard and a few more small improvements.
JPBergsma Mar 9, 2023
d1e8d74
Further changes after proof reading.
JPBergsma Mar 9, 2023
139c70e
further refinements.
JPBergsma Mar 9, 2023
916d6f2
Changed 'n_' to 'n' for ranged metadata properties tio be consistent …
JPBergsma Mar 13, 2023
ee3651e
Changed wording of range_id field after suggestion Rickard.
JPBergsma Mar 13, 2023
1f794b6
placed subsequent sentences on seperate lines.
JPBergsma Mar 16, 2023
169a1f4
Processed points discussed with Rickard.
JPBergsma Mar 24, 2023
f513596
Added per entry next field + small corrections.
JPBergsma Mar 28, 2023
65b8ad1
Apply suggestions from code review
JPBergsma May 2, 2023
94db38c
Adjusted the description next fields for ranged properties.
JPBergsma May 2, 2023
033ea11
Corrected range name for _exmpl_ranged_thermostat.
JPBergsma May 25, 2023
3762d30
Added that querying on properties in the range dictionary is optional.
JPBergsma May 30, 2023
8332567
Specifically mention that support for queries directly on the values …
JPBergsma May 31, 2023
a87b301
Updated example to latest version metadata proposal and added more ex…
JPBergsma Jun 2, 2023
7e9b4f4
Improved explanation returned_range field.
JPBergsma Jun 2, 2023
4b298c5
Small corrections.
JPBergsma Jun 2, 2023
214 changes: 214 additions & 0 deletions optimade.rst
@@ -21,6 +21,7 @@ OPTIMADE API specification v1.2.0~develop

entry : names of type of resources, served via OPTIMADE, pertaining to data in a database.
property : data item that belongs to an entry.
ranged_property : A property that can be returned in pieces and that supports slicing.
val : value examples that properties can be.
:val: is ONLY used when referencing values of actual properties, i.e., information that belongs to the database.
type : data type of values.
@@ -67,6 +68,8 @@ OPTIMADE API specification v1.2.0~develop

.. role:: property(literal)

.. role:: ranged-property(literal)

.. role:: val(literal)

.. role:: type(literal)
@@ -442,6 +445,120 @@ For example, the following query can be sent to API implementations `exmpl1` and

:filter:`filter=_exmpl1_band_gap<2.0 OR _exmpl2_band_gap<2.5`


Ranged Properties
-----------------

Ranged properties are used for properties that are too large to be returned by default for every entry in a response.
They can also support slicing, so that a client can request only a subset of the values.
The server can limit the size of the response and offer the values of the property in multiple parts.


- **Requirements/Conventions**:

- **Support**: OPTIONAL support in implementations.
- A ranged property can be recognized by the presence of the field :field:`range` in the metadata of the property, i.e. in the field: :field:`<property_name>_meta`.
- The server may return null or only a part of the values of the property under the field :field:`<property_name>`. In that case a links object MUST be provided in the field :field:`<property_name>_meta.range.next` from which the next part of the property is returned.

The dictionary under :field:`<property_name>_meta.range` MUST include these fields.
All fields in this section SHOULD be queryable with support for all mandatory filter features.

- :field:`serialization_format`: string.

To improve the compactness of the data, there are several ways to show to which index a value belongs.
The string MUST take one of the following values:

- `linear`: The value is a linear function of the indexes.
This function is defined by :property:`offset_linear` and :property:`step_size_linear`.
- `regular`: The value is set for one out of every :property:`step_size_regular` indexes, with :property:`offset_regular` indicating the index of the first value.
- `custom`: A separate list with indexes is defined in the field :property:`indexes` to indicate to which index each value belongs.


- :field:`n_indexable_dim`: integer.

The number of dimensions that can be indexed. The values themselves can also be lists, so a property may have more dimensions than listed here, but it may not be practical for the server to support indexing at that level.

- :field:`dim_size`: list of integers.

The size of the range in each indexable dimension.

Depending on the value of the :field:`serialization_format`, the following fields SHOULD/MUST be present or SHOULD NOT be present in the dictionary.
All fields in this section SHOULD be queryable with support for all mandatory filter features.

- :field:`n_values`: integer.

The total number of values in the property. This may be larger than the number of values that are returned. This field SHOULD be present when :property:`serialization_format` is not set to :val:`"linear"`; otherwise, it SHOULD NOT be present.

- :field:`step_size_linear`: list of floats.

If :property:`serialization_format` is set to :val:`"linear"`, this value gives the change in the value of the property per step along each of the dimensions of the range.
For example, if the value of :property:`offset_linear` = 0.5 and the value of :property:`step_size_linear` = [0.2,0.3], then at index [3,4] the value of the property will be 1.8 (0.5 + 2 × 0.2 + 3 × 0.3, counting steps from the origin at index [1,1]).
The value MUST be present when :property:`serialization_format` is set to :val:`"linear"`.
Otherwise, it SHOULD NOT be present.

- :field:`step_size_regular`: list of integers.

If :property:`serialization_format` is set to :val:`"regular"`, this value indicates that a value is defined one out of every :property:`step_size_regular` steps in each dimension.
The value MUST be present when :property:`serialization_format` is set to :val:`"regular"`.
Otherwise, it SHOULD NOT be present.


Depending on the value of the :property:`serialization_format`, the following fields MAY be present or SHOULD NOT be present in the dictionary.
All fields in this section SHOULD be queryable with support for all mandatory filter features.

- :field:`offset_linear`: float.

If :property:`serialization_format` is set to :val:`"linear"`, this property gives the value at the origin, i.e. where the index in all dimensions is 1.
The value MAY be present when :property:`serialization_format` is set to :val:`"linear"`, otherwise the value SHOULD NOT be present.
The default value is 0.

- :field:`offset_regular`: list of integers.

If :property:`serialization_format` is set to :val:`"regular"`, this property gives the indexes of the first value.
The value MAY be present when :property:`serialization_format` is set to :val:`"regular"`, otherwise the value SHOULD NOT be present.
The default value is 1 in every dimension.

The dictionary MAY include these fields.

- :field:`range_ids`: list of strings.

A list with an identifier for each dimension of the property.
If two properties have the same range_id for a dimension, it means that the values at the same index along this dimension belong together.
For example, if both the :property:`energy` and :property:`cartesian_site_positions` of a trajectory have the same identifier in :field:`range_ids` for a dimension, it indicates that the energy at an index x (in the dimension labelled by this identifier) belongs to the :property:`cartesian_site_positions` at the same index x.
SHOULD be a queryable property with support for all mandatory filter features.

If the :field:`<property_name>` contains data, the following properties MUST be present or SHOULD NOT be present, depending on the value of the :property:`serialization_format`. Querying is not relevant for these properties and SHOULD NOT be supported.
Contributor

Should range_id really be a per-entry metadata? Is that perhaps something we define as part of the property definition?

Contributor Author

@JPBergsma JPBergsma Mar 13, 2023

I have not yet been able to come up with an example where defining range_id in the property definition would cause more problems than at the single entry level. So I think it should be possible to define it in the property definition.


- :field:`indexes`: List of lists of integers.

If :property:`serialization_format` is set to :val:`"custom"`, this field holds the indexes to which the values in the field :field:`<property_name>` belong.
The value MUST be present when :property:`serialization_format` is set to :val:`"custom"`.
Otherwise, it SHOULD NOT be present.

- :field:`n_returned_values`: integer

The number of values that have been returned.
This value SHOULD be present when :property:`serialization_format` is set to :val:`"custom"` or :val:`"regular"`. Otherwise, it SHOULD NOT be present.

- :field:`returned_range`: List of lists of integers.

The range belonging to the returned data. It uses the same format as the :query-param:`property_ranges` query parameter.
It consists of a list which, for each dimension, contains a list of three values.
The first value indicates the index, in that dimension, of the first value that has been returned.
The second value indicates the index of the last returned value.
The third value is the step size.
It MUST be returned when the :property:`serialization_format` is :val:`"regular"` or :val:`"custom"`. Otherwise, it should not be returned.

- :field:`next`: a `JSON API links object <http://jsonapi.org/format/1.0/#document-links>`__.
Contributor

Suggested change
- :field:`next`: a `JSON API links object <http://jsonapi.org/format/1.0/#document-links>`__.
- :field:`next`: String.


If the property is too large to be returned in a single response, this field contains a URL to the entry to which this property belongs, from which the next set of values for this property can be retrieved.
If all the data for this property has been returned, its value should be :val:`null`.

- :field:`more_data_available`: boolean.

:field-val:`false` if all the values in the requested range have been returned, and :field-val:`true` if the returned values are incomplete.
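
A minimal sketch of how a client might evaluate a property whose :property:`serialization_format` is :val:`"linear"`, assuming the origin lies at index 1 in every dimension as described for :field:`offset_linear`. The function name ``linear_value`` is hypothetical and is not part of the proposal.

```python
from typing import Sequence


def linear_value(
    index: Sequence[int],
    step_size_linear: Sequence[float],
    offset_linear: float = 0.0,
) -> float:
    """Value of a "linear" ranged property at a 1-based index.

    Assumption: the origin is the index that is 1 in every dimension,
    and the value at the origin equals offset_linear (default 0).
    """
    if len(index) != len(step_size_linear):
        raise ValueError("index and step_size_linear must have the same length")
    # Each dimension contributes (index - 1) steps of its own step size.
    return offset_linear + sum(
        step * (i - 1) for i, step in zip(index, step_size_linear)
    )


# Example from the step_size_linear description:
# offset 0.5, steps [0.2, 0.3], index [3, 4] gives approximately 1.8.
value = linear_value([3, 4], [0.2, 0.3], offset_linear=0.5)
print(round(value, 10))
```

Because the value follows from the metadata alone, a server can return :val:`null` for the property itself, as the ``_exmpl_ranged_time`` example in the Responses section does.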


Responses
=========

@@ -697,6 +814,79 @@ An example of a full response:
]
}


- Several examples of how ranged properties can be returned in the JSON format:

.. code:: jsonc

{
"cartesian_site_positions": [[[2.36, 5.36, 9.56],[7.24, 3.58, 0.56],[8.12, 6.95, 4.56]],
[[2.38, 5.37, 9.56],[7.24, 3.57, 0.58],[8.11, 6.93, 4.58]],
[[2.39, 5.38, 9.55],[7.23, 3.57, 0.59],[8.10, 6.93, 4.57]]
// ...
],
"cartesian_site_positions_meta": {
"range": {
"n_indexable_dim": 3,
"dim_size": [100, 3, 3],
"range_ids": ["mdsteps","particles","xyz"],
"serialization_format": "regular",
"offset_regular": [1, 1, 1],
"step_size_regular": [1, 1, 1],
"n_values": 900,
"n_returned_values": 50,
"returned_range": [[1,100,2],[1,3,1],[1,3,1]],
"more_data_available": true,
"next": "https://example.com/optimade/v1/structures/id123456?response_fields=cartesian_site_positions&property_ranges=cartesian_site_positions[[101,900,2],[1,3,1],[1,3,1]]"
},
// ...
},
"species_at_sites": ["He", "Ne", "Ar"],
"species_at_sites_meta": {
"range": {
"n_indexable_dim": 1,
"dim_size": [3],
"range_ids": ["particles"],
"serialization_format": "regular",
"offset_regular": [1],
"step_size_regular": [1],
"n_returned_values": 3,
"n_values": 3,
"returned_range":[[1,3,1]],
"more_data_available": false,
"next": null
},
// ...
},
"_exmpl_ranged_time": null,
"_exmpl_ranged_time_meta":{
"range":{
"n_indexable_dim": 1,
"dim_size": [100],
"range_ids": ["mdsteps"],
"serialization_format": "linear",
"step_size_linear": [0.2]
},
// ...
},
"_exmpl_ranged_thermostat": [20, 40, 60],
"_exmpl_ranged_thermostat_meta": {
"range": {
"n_indexable_dim": 1,
"dim_size": [100],
"range_ids": ["mdsteps"],
"serialization_format": "custom",
"n_values": 3,
"n_returned_values": 3,
"indexes": [[0], [20], [80]],
"more_data_available": false,
"next": null
},
// ...
}
}
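
The following sketch illustrates how a client might relate returned values to their indexes for the :val:`"regular"` and :val:`"custom"` formats, using the metadata from the example above. It assumes that :field:`returned_range` holds 1-based, inclusive ``[first, last, step]`` triples; the helper names ``indexes_from_range`` and ``pair_custom`` are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple


def indexes_from_range(returned_range: Sequence[Sequence[int]]) -> List[List[int]]:
    """Expand [first, last, step] triples (1-based, inclusive) per dimension."""
    return [
        list(range(first, last + 1, step))
        for first, last, step in returned_range
    ]


def pair_custom(
    values: Sequence[Any], indexes: Sequence[Sequence[int]]
) -> Dict[Tuple[int, ...], Any]:
    """Map each index of a "custom" ranged property to its value."""
    if len(values) != len(indexes):
        raise ValueError("values and indexes must have the same length")
    return {tuple(idx): val for idx, val in zip(indexes, values)}


# returned_range of cartesian_site_positions in the example above.
steps, particles, xyz = indexes_from_range([[1, 100, 2], [1, 3, 1], [1, 3, 1]])
print(len(steps), steps[:3], steps[-1])

# values and indexes of _exmpl_ranged_thermostat in the example above.
thermostat = pair_custom([20, 40, 60], [[0], [20], [80]])
print(thermostat[(20,)])
```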


HTTP Response Status Codes
--------------------------

@@ -880,6 +1070,30 @@ Standard OPTIONAL URL query parameters not in the JSON API specification:
If provided, these fields MUST be returned along with the REQUIRED fields.
Other OPTIONAL fields MUST NOT be returned when this parameter is present.
Example: :query-url:`http://example.com/optimade/v1/structures?response_fields=last_modified,nsites`
- **property\_ranges**: specifies which data ranges should be returned for ranged properties.
In general, support is OPTIONAL; property definitions may, however, deviate from this and place stricter requirements on servers.
It consists of a property name directly followed by the range that should be returned.
A range is a list containing a list for each dimension.
A list consists of a pair of square brackets ("[", ASCII 91 (0x5B), and "]", ASCII 93 (0x5D)) enclosing a number of values separated by commas (",", ASCII 44 (0x2C)).
Each dimension's list has three integer values.
The first value of the range specifies the first index in that dimension for which values should be returned.
The second value specifies the last index for which values should be returned.
The third value specifies the step size.

Ranges can be specified for multiple properties by separating them with a comma.
The field may include optional space characters, which do not alter the meaning of the expression.
Databases SHOULD return the values falling within this range and the :property:`indexes` or :property:`returned_range` field belonging to the returned values.
For properties with :property:`serialization_format` :val:`custom`, indexes that fall in the requested range but for which no value is defined should not be returned.
For properties with :property:`serialization_format` :val:`regular`, indexes that fall in the requested range but for which no value is defined should have the value :val:`null`.
The ranges are 1-based, i.e. the first value has index 1, and inclusive, i.e. for the range :val:`[10,20,1]` the last value returned belongs to index 20.
Example:

A database has a :entry:`structure` entry with id: :val:`id_12345` and a ranged property :property:`test_field` with the two-dimensional data values :val:`[[9.64, 7.52, 0.69, 5.69], [4.82, 8.35, 3.26, 3.25], [4.82, 2.78, 7.87, 7.42], [5.49, 3.48, 1.65, 0.75]]`.
A client makes a request :query-url:`http://example.com/optimade/v1/structures/id_12345?property_ranges=test_field[[1, 3, 2], [2, 3, 1]]`.
The response is then a single entry response for structure `id_12345` where the `test_field` property is included with the values :val:`[[7.52, 0.69], [2.78, 7.87]]`.

Multiple ranges can be requested in one query, e.g. :query-param:`property_ranges=test_field[[1, 3, 2], [2, 3, 1]], other_field[[1,100,1]]`.
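
As an illustration only, a server-side handler for :query-param:`property_ranges` could be sketched as follows. The function names ``parse_property_ranges`` and ``slice_ranged`` are hypothetical, the parser assumes well-formed input with plain identifier property names, and the handling of undefined values (:val:`null` padding for :val:`regular`) is left out.

```python
import json
import re
from typing import Any, Dict, List, Sequence

# A property name followed by a list of [first, last, step] lists.
_RANGE_RE = re.compile(
    r"([A-Za-z_][A-Za-z0-9_]*)\s*(\[(?:\s*\[[^\[\]]*\]\s*,?)+\s*\])"
)


def parse_property_ranges(param: str) -> Dict[str, List[List[int]]]:
    """Parse 'name[[a,b,c],...], other[[a,b,c]]' into a dict of ranges."""
    return {name: json.loads(ranges) for name, ranges in _RANGE_RE.findall(param)}


def slice_ranged(values: Any, ranges: Sequence[Sequence[int]]) -> Any:
    """Apply 1-based, inclusive [first, last, step] triples, one per dimension."""
    if not ranges:
        return values
    first, last, step = ranges[0]
    # Convert the 1-based inclusive bounds to a Python slice.
    selected = values[first - 1:last:step]
    return [slice_ranged(item, ranges[1:]) for item in selected]


# The test_field example from the text above.
test_field = [
    [9.64, 7.52, 0.69, 5.69],
    [4.82, 8.35, 3.26, 3.25],
    [4.82, 2.78, 7.87, 7.42],
    [5.49, 3.48, 1.65, 0.75],
]
requested = parse_property_ranges(
    "test_field[[1, 3, 2], [2, 3, 1]], other_field[[1,100,1]]"
)
print(slice_ranged(test_field, requested["test_field"]))
# prints [[7.52, 0.69], [2.78, 7.87]]
```

Converting each triple to the slice ``[first - 1:last:step]`` keeps the 1-based, inclusive convention of the query parameter separate from Python's 0-based, half-open slicing.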


Additional OPTIONAL URL query parameters not described above are not considered to be part of this standard, and are instead considered to be "custom URL query parameters".
These custom URL query parameters MUST be of the format "<database-provider-specific prefix><url\_query\_parameter\_name>".