Make standards that can be exported/imported/validated via HDF5/JSON/XML/YAML.
mkstd uses pre-existing standards for validation and schema specification, so files produced using mkstd can be used independently of mkstd. For example, a tool developer can use mkstd to create a standard for their tool's data, but users do not necessarily need mkstd installed to use the data. However, mkstd also provides importers and exporters, so the intended use also involves an mkstd installation for convenience.
pip install mkstd
# For HDF5 support
pip install hdfdict@git+https://github.com/SiggiGue/hdfdict
For environments requiring numpy<2, replace mkstd with mkstd[numpyv1] above.
mkstd is intended to be used at two stages of data management. An example of these stages is provided in .
At this stage, the "user" is the person designing the data type and corresponding standard, for example a tool developer who wants to standardize the data produced by their tool. The steps could look like:
- (with mkstd) Design the data type as a Pydantic data model. Thanks to Pydantic, it behaves like a standard for your data, as a Python object.
- (with mkstd) Export the standard as e.g. XML and JSON schemas.
- (TODO, with mkstd) Generate documentation for the standard, based on the Pydantic data model docstrings.
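The first two steps can be sketched with plain Pydantic. This is a minimal sketch, assuming Pydantic v2: the `Measurement` model and its fields are hypothetical, and `model_json_schema()` is Pydantic's own method rather than an mkstd exporter, whose API is not shown here.

```python
import json

from pydantic import BaseModel, Field


class Measurement(BaseModel):
    """A single measurement produced by a hypothetical tool."""

    name: str = Field(description="Identifier of the measured quantity.")
    value: float = Field(description="Measured value.")
    unit: str = Field(default="dimensionless", description="Unit of the value.")


# Plain Pydantic call (not an mkstd function): derive a JSON schema from the model.
schema = Measurement.model_json_schema()
schema_text = json.dumps(schema, indent=2)

# The model itself validates data as a Python object and fills in defaults.
measurement = Measurement.model_validate({"name": "temperature", "value": 21.5})

print(schema_text)
print(measurement)
```

Because the schema is ordinary JSON Schema, it can be handed to users who do not have Pydantic or mkstd installed.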
At this stage, the "user" is someone who wants to use data generated by the tool, or import their own data into the tool.
- (with or without mkstd) Reformat data to match the standard specified by, e.g., the mkstd-generated XML or JSON schema.
- (with or without mkstd) Validate the data against the schema.
- (with or without mkstd) Import/export the reformatted data with the tool.
As written above, many uses of the standard produced by mkstd are intended to be possible without an mkstd installation. This is because the generated standards are in standardized schema formats. Below are the different formats supported by mkstd, and how to use/validate standards/data independently of mkstd.
The XSD format is used. Search the web for "validate xml data against schema".
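As one possible route, XML data can be validated against an XSD in Python with the third-party `lxml` package. This is a minimal sketch, not part of mkstd: the inline schema and documents are hypothetical stand-ins for an mkstd-generated XSD and real data files.

```python
from lxml import etree

# Hypothetical XSD standing in for an mkstd-generated schema.
xsd_text = b"""<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="measurement">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="value" type="xs:double"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

xml_schema = etree.XMLSchema(etree.fromstring(xsd_text))

valid_doc = etree.fromstring(
    b"<measurement><name>temperature</name><value>21.5</value></measurement>"
)
invalid_doc = etree.fromstring(
    b"<measurement><name>temperature</name><value>warm</value></measurement>"
)

# validate() returns a bool; details of any failures are kept in xml_schema.error_log.
valid_result = xml_schema.validate(valid_doc)
invalid_result = xml_schema.validate(invalid_doc)

print(valid_result, invalid_result)
```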
The official JSON schema format is used. Search the web for "validate json data against schema".
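For instance, the third-party `jsonschema` package can perform this validation in Python without mkstd. The sketch below is illustrative only: the schema is a hypothetical stand-in for an mkstd-generated one, and `is_valid` is a helper defined here, not a library function.

```python
from jsonschema import ValidationError, validate

# Hypothetical schema standing in for an mkstd-generated JSON schema.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "value": {"type": "number"},
    },
    "required": ["name", "value"],
}


def is_valid(data: dict) -> bool:
    """Return True if `data` conforms to `schema`, False otherwise."""
    try:
        validate(instance=data, schema=schema)
    except ValidationError:
        return False
    return True


good = is_valid({"name": "temperature", "value": 21.5})
bad = is_valid({"name": "temperature", "value": "warm"})

print(good, bad)
```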
There is no official YAML schema format, so YAML data is typically validated against JSON schemas. mkstd takes this approach too. Hence, tools that can validate YAML data against a JSON schema can be used, without an mkstd installation.
For example, the pajv tool can be used to validate YAML data against a JSON schema, without mkstd.
pajv validate -s output/mkstd_generated_schema.yaml -d output/data.yaml
By default, mkstd stores the schemas for YAML standards in YAML too.
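The same check as the pajv command can be approximated in Python with the third-party `pyyaml` and `jsonschema` packages. This is a sketch under assumptions: the inline YAML strings are hypothetical stand-ins for the schema and data files, and `yaml_is_valid` is a helper defined here for illustration.

```python
import yaml
from jsonschema import ValidationError, validate

# Hypothetical JSON schema, stored as YAML.
schema_yaml = """
type: object
properties:
  name:
    type: string
  value:
    type: number
required:
  - name
  - value
"""

data_yaml = """
name: temperature
value: 21.5
"""

# Both documents become plain Python objects, so the JSON schema applies directly.
schema = yaml.safe_load(schema_yaml)
data = yaml.safe_load(data_yaml)


def yaml_is_valid(text: str) -> bool:
    """Parse YAML `text` and check it against `schema`."""
    try:
        validate(instance=yaml.safe_load(text), schema=schema)
    except ValidationError:
        return False
    return True


print(yaml_is_valid(data_yaml))
```

To use files instead, the strings can be replaced with the contents of the schema and data files read from disk.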
There is currently no standard available for the specification of HDF5 schemas. Hence, the HDF5 files produced by mkstd can only be validated with mkstd.
There is a format for HDF5 that enables interconversion with JSON, but this is out of scope for mkstd.