@@ -989,6 +989,103 @@ Broad Institute).
+ Ontologies
+ ==========
+
+ The `HCA Metadata Schema`_ designates certain document properties as
+ ontologized. An *ontologized property* (OP) contains a JSON object referencing a
+ term in an ontology that is hosted externally, outside of the DCP/2. The shape
+ of that JSON object is specified by one of the `ontology modules`_ of the `HCA
+ Metadata Schema`_. All such modules specify at least the following three child
+ properties:
+
+ ``ontology``
+     optional; the stable and unique identifier of an ontology term
+
+ ``ontology_label``
+     optional; a human-readable description of the term referred to by the
+     ``ontology`` child property
+
+ ``text``
+     required; a human-readable description to fall back on should no term exist
+
+ .. _ontology modules: https://github.com/HumanCellAtlas/metadata-schema/tree/master/json_schema/module/ontology
+
+
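As an illustration of the three child properties listed above, the following sketch models an OP as a plain Python dictionary and checks that the required child is present. The helper name ``missing_children`` and the example identifier and label are assumptions made for this sketch; they are not taken from the schema.

```python
import json

# Child properties shared by all ontology modules, per the list above.
REQUIRED = {"text"}
OPTIONAL = {"ontology", "ontology_label"}


def missing_children(op: dict) -> set:
    """Return the required child properties that are absent from an OP."""
    return REQUIRED - set(op)


# Illustrative OP; the identifier and label are example values only.
organ = {
    "text": "liver",
    "ontology": "UBERON:0002107",
    "ontology_label": "liver",
}

# An OP for which no term exists carries only the fallback text.
unmatched = {"text": "a description without a matching term"}

print(json.dumps(organ, indent=2))
print(missing_children(organ), missing_children(unmatched))
```

Only ``text`` is mandatory, so an OP without ``ontology`` and ``ontology_label`` is still well-formed under these three rules. Individual modules may impose further constraints.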
+ Rules for producers
+ -------------------
+
+ When setting an OP in a metadata document, producers of metadata should
+ select the most specific ontology term currently available that best describes
+ the experimental facts and satisfies the requirements of the ontology module
+ governing the OP.
+
+ A) If a sufficiently specific match is found, the producer
+
+    - sets the ``ontology`` child property of the OP to the identifier of the
+      selected term and
+
+    - sets the ``ontology_label`` and ``text`` child properties to the label
+      of the selected term.
+
+    The label of an ontology term can change over time. The producer must keep
+    the ``ontology_label`` and ``text`` child properties up to date whenever the
+    document is updated. There is no requirement to update the document whenever
+    the label changes.
+
+ B) If no sufficiently specific term exists, but a more general one does, the
+    producer
+
+    - sets the ``ontology`` child property of the OP to the identifier of the
+      more general term,
+
+    - sets the ``ontology_label`` child property to the label of that term and
+
+    - sets the ``text`` child property of the OP to what they expect the label
+      of a hypothetical exact match would be.
+
+    The producer initiates the process of adding that expected term to the
+    ontology. After that term has been added, the producer updates the
+    document as described under A).
+
+ C) Otherwise, the producer
+
+    - omits the ``ontology`` and ``ontology_label`` child properties of the OP
+      and
+
+    - sets the ``text`` child property of the OP to what they expect the
+      label of a hypothetical term would be if it existed.
+
+    The producer initiates the process of adding that assumed term to the
+    ontology. After that term has been added, the producer updates the
+    document as described under A).
+
+
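The three producer scenarios above can be sketched as a single function. This is a minimal illustration, not part of any HCA tooling: the name ``build_op``, the ``(identifier, label)`` tuple representation of a term and the ``EX:`` identifiers are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical representation of an ontology term: (identifier, label).
Term = Tuple[str, str]


def build_op(expected_label: str,
             exact: Optional[Term] = None,
             general: Optional[Term] = None) -> dict:
    """Build an OP from the best term the producer was able to find."""
    if exact is not None:
        # Scenario A: both label properties mirror the selected term's label.
        term_id, label = exact
        return {"ontology": term_id, "ontology_label": label, "text": label}
    if general is not None:
        # Scenario B: reference the general term, keep the expected label
        # of the hypothetical exact match in ``text``.
        term_id, label = general
        return {"ontology": term_id, "ontology_label": label,
                "text": expected_label}
    # Scenario C: no usable term at all, only the expected label.
    return {"text": expected_label}


# Placeholder identifiers and labels, for illustration only.
op_a = build_op("specific thing", exact=("EX:0000002", "specific thing"))
op_b = build_op("specific thing", general=("EX:0000001", "thing"))
op_c = build_op("specific thing")
```

Once the missing term has been added to the ontology, calling the function again with ``exact`` set yields the scenario A form that the document should be updated to.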
+ Rules for consumers
+ -------------------
+
+ When reading an ontologized property (OP) in a metadata document, consumers of
+ metadata should read the ``ontology`` child property of the OP, if that child
+ property is present. If a description of the term in English (or any other
+ language supported by the ontology) is needed, the consumer should look that
+ description up in the ontology API referred to by the module governing the OP,
+ using the term identifier in the ``ontology`` child property. If a lookup is not
+ possible for technical reasons, the consumer should read the ``text`` child
+ property if present or the ``ontology_label`` otherwise. If both are absent, the
+ consumer should raise an error.
+
+ If the ``ontology`` child property is absent, the consumer instead reads the
+ ``text`` child property of the OP.
+
+ |nn| Under the above rules, if an OP was set under scenario B, consumers will
+ ignore the hypothetical label. This leads to a more consistent user experience.
+ There is no guarantee that different wranglers come up with the same
+ hypothetical term, and we don't want the UX to suffer in that case, considering
+ that there is at least a partial match available. If an OP was set under
+ scenario C, the hypothetical term label is the best we have. In both scenarios
+ the producer must update the document once the term becomes available, so the
+ degraded UX is only temporary. |ne|
+
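The consumer rules above can be sketched as follows. The function name ``describe`` and the ``lookup`` callable are assumptions for this sketch: ``lookup`` stands in for a call to the ontology API and is expected to raise ``LookupError`` when a lookup is not possible. The identifier and labels are placeholders.

```python
from typing import Callable


def describe(op: dict, lookup: Callable[[str], str]) -> str:
    """Return a human-readable description of an OP for display."""
    term_id = op.get("ontology")
    if term_id is None:
        # No term referenced: fall back on the ``text`` child property.
        return op["text"]
    try:
        # Preferred source: the label currently held by the ontology.
        return lookup(term_id)
    except LookupError:
        # Lookup not possible: ``text`` if present, else ``ontology_label``.
        for key in ("text", "ontology_label"):
            if key in op:
                return op[key]
        raise ValueError("OP has neither 'text' nor 'ontology_label'")


# Stand-ins for the ontology API, for illustration only.
labels = {"EX:0000001": "thing"}


def online(term_id: str) -> str:
    return labels[term_id]  # KeyError is a subclass of LookupError


def offline(term_id: str) -> str:
    raise LookupError("ontology API unreachable")


scenario_b = {"ontology": "EX:0000001", "ontology_label": "thing",
              "text": "specific thing"}
scenario_c = {"text": "specific thing"}
```

With a working lookup, the scenario B document is described by the general term's label and the hypothetical label in ``text`` is ignored, which matches the behaviour the note above describes.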
Project-level matrices
======================