search-schema

JSON Schemas for documents returned by the HuBMAP Search API. The HuBMAP Portal depends on these documents having a consistent structure.

The wrapped metadata elements in these documents come from metadata TSVs submitted along with the data; their structure is described by ingest-validation-tools.

Usage

JSON Schemas (as YAML) are in data/schemas. This repo is incorporated as a git submodule in portal-ui.

Maintenance Plan

As we see validation warnings in the Portal, we should update the schemas here to tolerate these variations. For each tweak, we should also file an issue here to follow up with PSC:

  • Is the variation something intentional, or a bug?
  • If intentional, what are the semantics?
  • Are there other variations we should be aware of?
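
As an illustration of what "tolerating a variation" can mean, here is a minimal sketch, assuming a hypothetical field whose schema originally allowed only strings and is loosened to accept integers as well. The fragments and the `type_matches` helper are made up for this example; the helper checks only the `type` keyword and is not a full JSON Schema validator.

```python
# Hypothetical schema fragments; they do not describe a real HuBMAP field.
strict = {"type": "string"}
tolerant = {"type": ["string", "integer"]}  # `type` may be a list of type names

# Mapping from a few JSON Schema type names to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def type_matches(value, schema):
    """Return True if `value` satisfies the schema's `type` keyword only."""
    allowed = schema["type"]
    if isinstance(allowed, str):
        allowed = [allowed]
    for name in allowed:
        # bool is a subclass of int in Python, so keep True/False out of "integer".
        if isinstance(value, bool) and name != "boolean":
            continue
        if isinstance(value, JSON_TYPES[name]):
            return True
    return False


print(type_matches(12, strict))    # the variation fails the original schema
print(type_matches(12, tolerant))  # and passes once the schema is loosened
```

Whether such a loosening should stay is exactly what the follow-up issue is meant to settle.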

Development

After cloning the repo and changing into its directory:

pip install -r requirements.txt
pip install -r requirements-dev.txt
./test.sh

  • Schemas begin as TSV files.
  • definitions-tsv-to-yaml.py consolidates that information into a single YAML file.
  • definitions-yaml-to-schema.py generates the JSON Schemas from that single YAML file.
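
The flow described above can be sketched roughly as follows. This is not the code in definitions-tsv-to-yaml.py or definitions-yaml-to-schema.py: the TSV columns (`name`, `type`, `description`), the two sample fields, and both function names are assumptions made for illustration, and the intermediate structure is kept as a plain dict rather than written out as YAML so that the sketch needs only the standard library.

```python
import csv
import io
import json

# Hypothetical TSV content; the real columns in data/definitions may differ.
FIELDS_TSV = (
    "name\ttype\tdescription\n"
    "uuid\tstring\tUnique identifier\n"
    "created_timestamp\tinteger\tCreation time\n"
)


def tsv_to_definitions(tsv_text):
    """Consolidate TSV rows into one dict keyed by field name."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["name"]: {"type": row["type"], "description": row["description"]}
        for row in reader
    }


def definitions_to_schema(definitions):
    """Wrap the consolidated definitions in a JSON Schema object."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": definitions,
        "additionalProperties": False,
    }


definitions = tsv_to_definitions(FIELDS_TSV)
schema = definitions_to_schema(definitions)
print(json.dumps(schema, indent=2))
```

Keeping a single consolidated structure between the two steps means the TSVs remain the only place that needs hand editing.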

Update

The schemas are derived from TSVs in data/definitions. If updates are needed, please make the changes there, and Chuck will regenerate the schemas.

Tag and release

For now, we're just using git tags. When it's time to tag a version, check out a new branch and run:

./push.sh

Then update the CHANGELOG: add the date for the current tag, stub the new "in progress" version, and open a PR from the branch.
