DC vs oboInOwl annotation properties? #1202

Open
turbomam opened this issue Aug 10, 2021 · 14 comments

@turbomam
Contributor

@pbuttigieg recently suggested switching to Dublin Core annotation properties instead of oboInOwl ones.

The NMDC team has been using oboInOwl in our templates, like NMDC-03_EnvO_template_robot_sheet

  • oboInOwl:created_by
  • oboInOwl:creation_date^^xsd:dateTime
  • oboInOwl:hasBroadSynonym@en
  • oboInOwl:hasDbXref
  • oboInOwl:hasExactSynonym@en
  • oboInOwl:hasNarrowSynonym@en
  • oboInOwl:hasRelatedSynonym@en
  • oboInOwl:inSubset

That's based on @kaiiam's nice EnvO/robot documentation

I have checked for predicate usage in EnvO via http://sparql.hegroup.org/sparql/

Remember there are two different Dublin Core prefixes.

     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:terms="http://purl.org/dc/terms/"

SELECT  ?p (COUNT(?s) AS ?count)
WHERE
  { GRAPH <http://purl.obolibrary.org/obo/merged/ENVO>
      { ?s  ?p  ?o }
  }
GROUP BY ?p

DC and oboInOwl usage

?p ?count
http://purl.org/dc/elements/1.1/contributor 186
http://purl.org/dc/elements/1.1/creator 451
http://purl.org/dc/elements/1.1/date 352
http://purl.org/dc/elements/1.1/description 1
http://purl.org/dc/elements/1.1/source 3
http://purl.org/dc/elements/1.1/title 1
http://purl.org/dc/terms/creator 3
http://purl.org/dc/terms/date 1
http://purl.org/dc/terms/license 2
http://purl.org/dc/terms/rightsHolder 1
http://www.geneontology.org/formats/oboInOwl#consider 101
http://www.geneontology.org/formats/oboInOwl#created_by 810
http://www.geneontology.org/formats/oboInOwl#creation_date 591
http://www.geneontology.org/formats/oboInOwl#default-namespace 1
http://www.geneontology.org/formats/oboInOwl#hasAlternativeId 86
http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym 468
http://www.geneontology.org/formats/oboInOwl#hasDbXref 7676
http://www.geneontology.org/formats/oboInOwl#hasExactSynonym 3131
http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym 465
http://www.geneontology.org/formats/oboInOwl#hasOBONamespace 1130
http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym 1999
http://www.geneontology.org/formats/oboInOwl#hasSynonym 36
http://www.geneontology.org/formats/oboInOwl#hasSynonymType 106
http://www.geneontology.org/formats/oboInOwl#id 1115
http://www.geneontology.org/formats/oboInOwl#inSubset 1253
http://www.geneontology.org/formats/oboInOwl#is_class_level 1
http://www.geneontology.org/formats/oboInOwl#is_inferred 4
http://www.geneontology.org/formats/oboInOwl#is_metadata_tag 1
http://www.geneontology.org/formats/oboInOwl#shorthand 2
http://www.geneontology.org/formats/oboInOwl#source 1

Searching for owl:onProperty values on owl:Restrictions doesn't show any DC or oboInOwl usage that might need remediation.

There is also heterogeneity in the objects of the various APs for mentioning creators and contributors:

SELECT  ?o (COUNT(?s) AS ?count)
WHERE
  { GRAPH <http://purl.obolibrary.org/obo/merged/ENVO>
      { ?s  <http://purl.org/dc/elements/1.1/contributor>  ?o }
  }
GROUP BY ?o
  • dc:contributor: all IRIs, except for nine "https://orcid.org/0000-0002-7556-2097" strings
  • dc:creator: a mixture of ORCID IRIs, ORCID strings and some non-ORCID strings; some use http and some https
  • terms:creator: all names or initials
  • http://www.geneontology.org/formats/oboInOwl#created_by: most heterogeneous!
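
The heterogeneity described above could be tallied with something like the following. This is only a rough sketch, not the method used for the analysis in this issue: the `classify` helper, the category labels, and the "J. Doe" value are made up for illustration, and it assumes each object has already been fetched along with a flag saying whether it was an IRI or a literal.

```python
import re
from collections import Counter

# ORCID iDs are four groups of four characters; the last one may be "X".
ORCID_ID = r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]"
ORCID_URL = re.compile(rf"^(https?)://orcid\.org/{ORCID_ID}$")


def classify(value: str, is_iri: bool) -> str:
    """Bucket a creator/contributor object by form (hypothetical categories)."""
    match = ORCID_URL.match(value.strip())
    if match:
        kind = "IRI" if is_iri else "string"
        return f"ORCID {kind} ({match.group(1)})"
    return "other IRI" if is_iri else "other string"


# (value, is_iri) pairs; the last entry is an invented placeholder name.
sample = [
    ("https://orcid.org/0000-0002-7556-2097", True),
    ("https://orcid.org/0000-0002-7556-2097", False),
    ("http://orcid.org/0000-0002-7556-2097", True),
    ("J. Doe", False),
]

counts = Counter(classify(value, is_iri) for value, is_iri in sample)
for category, n in sorted(counts.items()):
    print(category, n)
```

Running this over the real query results would give one count per form for each annotation property, which makes the http/https and IRI/string splits easier to see.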
@cmungall
Member

cmungall commented Aug 10, 2021

Thanks for this analysis

The very first thing that should be done is to replace DC elements (http://purl.org/dc/elements/1.1/) with DC terms (http://purl.org/dc/terms/). This should not be controversial. See OBOFoundry/OBOFoundry.github.io#540
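
As a rough illustration of what that replacement amounts to at the IRI level, here is a minimal sketch. The `elements_to_terms` function is hypothetical and not part of ROBOT, ODK or OMO. It relies on the fact that each DC elements 1.1 property has a same-named property in the DC terms namespace, and it only rewrites the predicate IRI; it does nothing about the object values.

```python
DC_ELEMENTS = "http://purl.org/dc/elements/1.1/"
DC_TERMS = "http://purl.org/dc/terms/"


def elements_to_terms(predicate: str) -> str:
    """Rewrite a DC elements predicate IRI to the same-named DC terms IRI.

    Predicates outside the DC elements namespace are returned unchanged.
    """
    if predicate.startswith(DC_ELEMENTS):
        return DC_TERMS + predicate[len(DC_ELEMENTS):]
    return predicate


# Predicates taken from the usage table earlier in this issue.
for p in [
    "http://purl.org/dc/elements/1.1/contributor",
    "http://purl.org/dc/elements/1.1/creator",
    "http://purl.org/dc/elements/1.1/date",
    "http://www.geneontology.org/formats/oboInOwl#created_by",
]:
    print(p, "->", elements_to_terms(p))
```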

I am not sure that the question should be "dc vs oio". There are no synonyms in dc for example.

I would strongly advocate for ENVO continuing to use

oboInOwl:hasBroadSynonym@en
oboInOwl:hasExactSynonym@en
oboInOwl:hasNarrowSynonym@en
oboInOwl:hasRelatedSynonym@en

for synonyms (there is no equivalent for this in dc. there is also no direct equivalent in skos, as skos predicates map between concept URIs, and should not be used for literals)

I would also recommend keeping on using

oboInOwl:inSubset

for subsets

and

oboInOwl:hasDbXref

on axiom annotations for provenance

This will all be made clearer with the next OMO release

I think your analysis is showing a lot of terms from other ontologies. I am not sure it's the best use of your time to redo on envo-base as this will all be folded into OMO-based validation tools soon.

@turbomam
Contributor Author

@cmungall or @pbuttigieg has this happened yet?

this will all be folded into OMO-based validation tools soon.

@pbuttigieg
Member

I am not sure that the question should be "dc vs oio". There are no synonyms in dc for example.

Whatever can be replaced by DC should be, for more global interoperability.

I would strongly advocate for ENVO continuing to use

oboInOwl:hasBroadSynonym@en
oboInOwl:hasExactSynonym@en
oboInOwl:hasNarrowSynonym@en
oboInOwl:hasRelatedSynonym@en

for synonyms (there is no equivalent for this in dc. there is also no direct equivalent in skos, as skos predicates map between concept URIs, and should not be used for literals)

Agreed. There are no valid equivalents.

I would also recommend keeping on using

oboInOwl:inSubset

for subsets

Agreed. There are likely simpler ways to do this via queries stored in the repo, but this doesn't hurt.

and

oboInOwl:hasDbXref

on axiom annotations for provenance

This can / should be replaced by more broadly understood properties from DC, schema.org or similar.

@pbuttigieg
Member

@cmungall or @pbuttigieg has this happened yet?

this will all be folded into OMO-based validation tools soon.

I'm not aware of what OMO has done since this was posted. I'd rather we use more broadly interoperable properties where possible, rather than those idiosyncratic to OBO.

@cmungall
Member

Let's keep the ontology engineering issue of how we validate separate from the question of what the standard is we are validating against for now. See

for a general discussion on mechanisms for enforcing.

Regarding the standard for annotations in ENVO, looks like we are reaching consensus, with @pbuttigieg's comments from yesterday:

#1202 (comment)

We have agreement about entity-level annotations; what remains is axiom-level annotations.

me:

oboInOwl:hasDbXref on axiom annotations for provenance

@pbuttigieg:

This can / should be replaced by more broadly understood properties from DC, schema.org or similar.

I am sympathetic to this. If we do replace axiom-level hasDbXref, I would strongly advocate for dcterms, and in particular dcterms:source. I'm not sure what specific schema.org property we'd use here, and I created some recommendations for how schema.org should be used in OBO (and other) ontologies here: information-artifact-ontology/ontology-metadata#176

Personally I think we should stick to hasDbXref on axiom-level annotations. It's an established de facto standard in OBO, and OBO is one of the few groups with ontologies that represent axiom-level provenance. Switching would cause a lot of churn and make it harder to mix and match ENVO and OBO workflows. But I am also willing to switch, and to do the search and replace.

@matentzn
Collaborator

I can add my votes to a clearer breakdown of all the changes you want to make (Google Docs, or individual ENVO issues), but I agree we should hammer out the 90% non-controversial ones.

Rule of thumb: if it is in OMO, it should be used; if it is not, we should discuss.

hasDbXref should be retained IMO for its wide use and augmented (not replaced) by skos:*Match properties where precision is known.

@turbomam
Contributor Author

turbomam commented Jul 29, 2024

Rule of thumb: if it is in OMO, it should be used; if it is not, we should discuss.

Thanks @matentzn that's helpful

@turbomam
Contributor Author

turbomam commented Jul 29, 2024

I haven't used OMO in anger yet so I'm doing some familiarization

For communicating who should get credit for a term's presence in an ontology, and for communicating when the term was first added, it seems like even OMO contains some redundancy/ambiguity regarding DC terms, DC elements and oboInOwl.

@matentzn
Collaborator

This is silly.. and most likely my fault :D

We have some pretty clear preferences now in any case:

http://purl.org/dc/terms/contributor
http://purl.org/dc/terms/creator
http://purl.org/dc/terms/date or better http://purl.org/dc/terms/created for dates (not decided, but pretty clear)

@turbomam
Contributor Author

I agree with @cmungall's questions about community consensus on the acceptable datatypes for the object. He mentioned dates in information-artifact-ontology/ontology-metadata#63, but I am equally concerned about the representation of creators/contributors. I propose a single ORCID URI for each contributor. That would require some effort to clean up creator/contributor assertions whose values are a string representation of somebody's (or some group's) name.
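
To make the proposal concrete, here is a minimal sketch of what normalizing to one ORCID URI per contributor might look like. It is not the ROBOT or ODK check: `to_orcid_uri` is a hypothetical helper, the choice of the https form as the canonical one is an assumption (the proposal only says "a single ORCID URI"), and the name string is invented. Values with no recognizable ORCID iD are returned as `None` so they can be listed for manual cleanup.

```python
import re
from typing import Optional

# An ORCID iD anywhere in the value, e.g. inside an http or https URL.
ORCID_ID = re.compile(r"(\d{4}-\d{4}-\d{4}-\d{3}[\dX])")


def to_orcid_uri(value: str) -> Optional[str]:
    """Return an https ORCID URI if the value contains an ORCID iD, else None.

    Assumption: https://orcid.org/<iD> is taken as the canonical form.
    """
    match = ORCID_ID.search(value)
    if match:
        return f"https://orcid.org/{match.group(1)}"
    return None


values = [
    "http://orcid.org/0000-0002-7556-2097",
    "https://orcid.org/0000-0002-7556-2097",
    "J. Doe",  # invented name string; needs manual curation
]
needs_review = [v for v in values if to_orcid_uri(v) is None]
normalized = {to_orcid_uri(v) for v in values} - {None}
print(normalized)
print(needs_review)
```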

@turbomam
Contributor Author

I am going to use @matentzn's guidelines now to make a PR for

@matentzn
Collaborator

but I am equally concerned about the representation of creators/contributors. I propose a single ORCID URI for each contributor. That would require some effort to clean up creator/contributor assertions whose values are a string representation of somebody's (or some group's) name.

We have all these checks now in ROBOT / ODK. I can add them to ENVO in your PR once you get started.
