@@ -733,7 +733,7 @@ descriptors, one for metadata files and one for subgraphs.
of one entity to coexist in a non-delta staging area. A delta staging area,
on the other hand, must contain at most one object with a given
``entity_id``, and therefore only one version of that entity.
-
+

The ``.remove`` suffix is used to request the removal of an entity. It can
only be used in staging areas that have the ``is_delta`` property set to
@@ -1134,7 +1134,7 @@ staging areas may contain updates is for backwards compatibility: The DCP
already utilized this functionality before this section of the specification was
written. |ne|

- |nn| It may be tempting to reuse an existing staging area after it has been
+ |nn| It may be tempting to reuse an existing staging area after it has been
imported so as to avoid having to repopulate a completely new staging area for
the next import. For non-delta staging areas this can be a good strategy. For
delta staging areas it usually isn't because delta staging areas can only
@@ -1443,6 +1443,103 @@ row and finally soft-deleting any unmarked rows. |ne|

+ Ontologies
+ ==========
+
+ The `HCA Metadata Schema`_ designates certain document properties as
+ ontologized. An *ontologized property* (OP) contains a JSON object referencing a
+ term in an ontology that is hosted externally, outside of the DCP/2. The shape
+ of that JSON object is specified by one of the `ontology modules`_ of the `HCA
+ Metadata Schema`_. All such modules specify at least the following three child
+ properties:
+
+ ``ontology``
+     optional; the stable and unique identifier of an ontology term
+
+ ``ontology_label``
+     optional; a human-readable description of the term referred to by the
+     ``ontology`` child property
+
+ ``text``
+     required; a human-readable description to fall back on should no term exist
+
+ .. _ontology modules: https://github.com/HumanCellAtlas/metadata-schema/tree/master/json_schema/module/ontology
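
As an illustration of the shape described above, an OP value might look like the dictionary below. This is only a sketch: the term identifier and label are example values, and ``check_op_shape`` is a hypothetical helper that checks nothing beyond the three child properties listed above. It is not a substitute for validating against the governing ontology module.

```python
# Example OP value; the identifier and label are illustrative only.
example_op = {
    "ontology": "UBERON:0002048",
    "ontology_label": "lung",
    "text": "lung",
}


def check_op_shape(op: dict) -> None:
    """Hypothetical helper: check only the three child properties named above."""
    # ``text`` is the only required child property.
    if not isinstance(op.get("text"), str):
        raise ValueError("'text' is required and must be a string")
    # ``ontology`` and ``ontology_label`` are optional.
    for key in ("ontology", "ontology_label"):
        if key in op and not isinstance(op[key], str):
            raise ValueError(f"{key!r} must be a string when present")


check_op_shape(example_op)               # all three child properties present
check_op_shape({"text": "some tissue"})  # only the required one
```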
+
+
+ Rules for producers
+ -------------------
+
+ When setting an OP in a metadata document, producers of metadata should
+ select the most specific ontology term currently available that best describes
+ the experimental facts and satisfies the requirements of the ontology module
+ governing the OP.
+
+ A) If a sufficiently specific match is found, the producer
+
+    - sets the ``ontology`` child property of the OP to the identifier of the
+      selected term and
+
+    - sets the ``ontology_label`` and ``text`` child properties to the label
+      of the selected term.
+
+    The label of an ontology term can change over time. The producer must bring
+    the ``ontology_label`` and ``text`` child properties up to date whenever the
+    document is updated. There is no requirement to update the document solely
+    because the label has changed.
+
+ B) If no sufficiently specific term exists, but a more general one does, the
+    producer
+
+    - sets the ``ontology`` child property of the OP to the identifier of the more
+      general term,
+
+    - sets the ``ontology_label`` child property to the label of that term and
+
+    - sets the ``text`` child property of the OP to what they expect the label
+      of a hypothetical exact match would be.
+
+    The producer initiates the process of adding that expected term to the
+    ontology. After that term has been added, the producer updates the
+    document as described under A).
+
+ C) Otherwise, the producer
+
+    - omits the ``ontology`` and ``ontology_label`` child properties of the OP
+      and
+
+    - sets the ``text`` child property of the OP to what they expect the
+      label of a hypothetical term would be if it existed.
+
+    The producer initiates the process of adding that expected term to the
+    ontology. After that term has been added, the producer updates the
+    document as described under A).
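
The three scenarios above can be sketched as a single function. This is a minimal illustration, not part of any HCA or DCP/2 library: ``build_op`` and its parameters are hypothetical names, and deciding whether a term is "sufficiently specific" is left to the caller because the specification does not define a mechanical test for it.

```python
from typing import Optional


def build_op(expected_label: str,
             term_id: Optional[str] = None,
             term_label: Optional[str] = None,
             sufficiently_specific: bool = False) -> dict:
    """
    Hypothetical helper that builds an OP value for scenarios A, B and C.

    expected_label: the label the producer expects an exact match to have
    term_id, term_label: the best term found in the ontology, if any
    sufficiently_specific: the producer's judgement about that term
    """
    if term_id is None:
        # Scenario C: no usable term, so only the fallback text is set.
        return {"text": expected_label}
    if term_label is None:
        raise ValueError("term_label is required when term_id is given")
    if sufficiently_specific:
        # Scenario A: both descriptions carry the label of the selected term;
        # expected_label is not used.
        return {"ontology": term_id,
                "ontology_label": term_label,
                "text": term_label}
    # Scenario B: reference the more general term, but keep the expected
    # label of the hypothetical exact match in ``text``.
    return {"ontology": term_id,
            "ontology_label": term_label,
            "text": expected_label}
```

In scenarios B and C the producer still has to request the missing term and later update the document as described under A); that follow-up is outside the scope of this sketch.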
+
+
+ Rules for consumers
+ -------------------
+
+ When reading an ontologized property (OP) in a metadata document, consumers of
+ metadata should read the ``ontology`` child property of the OP, if that child
+ property is present. If a description of the term in English (or any other
+ language supported by the ontology) is needed, the consumer should look that
+ description up in the ontology API referred to by the module governing the OP,
+ using the term identifier in the ``ontology`` child property. If a lookup is not
+ possible for technical reasons, the consumer should read the ``text`` child
+ property if present or the ``ontology_label`` otherwise. If both are absent, the
+ consumer should raise an error.
+
+ If the ``ontology`` child property is absent, the consumer instead reads the
+ ``text`` child property of the OP.
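
The consumer rules above can be sketched as follows. This is an illustration under stated assumptions: ``describe_op``, ``lookup`` and ``LookupUnavailable`` are hypothetical names. ``lookup`` stands in for a call to the ontology API referred to by the governing module, and it is assumed to raise ``LookupUnavailable`` when a lookup is not possible for technical reasons. Passing no ``lookup`` at all is treated the same way.

```python
from typing import Callable, Optional


class LookupUnavailable(Exception):
    """Hypothetical signal that the ontology API cannot be queried."""


def describe_op(op: dict,
                lookup: Optional[Callable[[str], str]] = None) -> str:
    """Hypothetical helper returning a description for an OP value."""
    term_id = op.get("ontology")
    if term_id is None:
        # No term referenced: read the required ``text`` child property.
        return op["text"]
    if lookup is not None:
        try:
            # Preferred path: ask the ontology for the current description.
            return lookup(term_id)
        except LookupUnavailable:
            pass
    # Lookup not possible: ``text`` if present, ``ontology_label`` otherwise.
    for key in ("text", "ontology_label"):
        if key in op:
            return op[key]
    raise ValueError("OP has neither 'text' nor 'ontology_label'")
```

A caller would typically wrap its own HTTP client in ``lookup`` and translate timeouts or connection failures into ``LookupUnavailable``.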
+
+ |nn| Under the above rules, if an OP was set under scenario B, consumers will
+ ignore the hypothetical label. This leads to a more consistent user experience.
+ There is no guarantee that different wranglers come up with the same
+ hypothetical term, and we don't want the UX to suffer in that case, considering
+ that there is at least a partial match available. If an OP was set under
+ scenario C, the hypothetical term label is the best we have. In both scenarios
+ the producer must update the document once the term becomes available, so the
+ degraded UX is only temporary. |ne|
+
Project-level matrices
======================
