Improve Rhea import #168

Merged
cthoyt merged 6 commits into main from update-rhea on Feb 2, 2024


Member

@cthoyt cthoyt commented Jan 30, 2024

Closes #167

  1. Add "has participant" relations between the "master" and "bidirected" reactions and ChEBI identifiers
  2. Add "has input" / "has output" relations between directional reactions and ChEBI identifiers
  3. Add "enabled by" relations for UniProt and EC xrefs
  4. Add GO relations with the newly minted debio:0000047 (as a placeholder pending the outcome of "How to model relations between EC, Rhea, and GO", oborel/obo-relations#783)
  5. Add programmatically generated names for left-to-right, right-to-left, and bi-directed reactions
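
As a rough illustration of items 1 and 2, the sketch below shows one way the relations could be chosen per reaction term. It is not the code from this PR: `reaction_relationships` and its `direction` argument are hypothetical names, and only the relation CURIEs are taken from the example output.

```python
# Relation CURIEs as they appear in the example output of this PR.
HAS_PARTICIPANT = "RO:0000057"
HAS_INPUT = "RO:0002233"
HAS_OUTPUT = "RO:0002234"


def reaction_relationships(left, right, direction):
    """Return (relation, ChEBI CURIE) pairs for one reaction term.

    Hypothetical helper, not pyobo's API. ``direction`` is "undirected"
    (master or bidirectional), "ltr", or "rtl".
    """
    if direction == "undirected":
        # No direction: every chemical on either side is just a participant.
        return [(HAS_PARTICIPANT, c) for c in sorted(set(left) | set(right))]
    if direction == "rtl":
        # Right-to-left: the right-hand side becomes the input side.
        left, right = right, left
    elif direction != "ltr":
        raise ValueError(f"unknown direction: {direction}")
    inputs = [(HAS_INPUT, c) for c in sorted(left)]
    outputs = [(HAS_OUTPUT, c) for c in sorted(right)]
    return inputs + outputs


# Sides of rhea:11880, read off the example output.
LEFT = ["CHEBI:58182", "CHEBI:15377", "CHEBI:15379"]
RIGHT = ["CHEBI:58682", "CHEBI:16240"]
for relation, chebi in reaction_relationships(LEFT, RIGHT, "ltr"):
    print(f"relationship: {relation} {chebi}")
```

The sort is a plain string sort, which is enough for this example but is only an assumption about how the real output is ordered.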

Example output:

[Term]
id: rhea:11880
name: \(S\)-6-hydroxynicotine + H2O + O2 = 6-hydroxypseudooxynicotine + H2O2
relationship: RO:0000057 CHEBI:15377 ! has participant water
relationship: RO:0000057 CHEBI:15379 ! has participant dioxygen
relationship: RO:0000057 CHEBI:16240 ! has participant hydrogen peroxide
relationship: RO:0000057 CHEBI:58182 ! has participant (S)-6-hydroxynicotinium(1+)
relationship: RO:0000057 CHEBI:58682 ! has participant 6-hydroxypseudooxynicotinium(1+)
relationship: RO:0002333 eccode:1.5.3.5 ! enabled by
relationship: RO:0002333 uniprot:A0A075BSX9 ! enabled by
relationship: RO:0002333 uniprot:Q93NH4 ! enabled by
relationship: debio:0000007 rhea:11881 ! has left-to-right reaction (S)-6-hydroxynicotine + H2O + O2 => 6-hydroxypseudooxynicotine + H2O2
relationship: debio:0000008 rhea:11882 ! has right-to-left reaction 6-hydroxypseudooxynicotine + H2O2 => (S)-6-hydroxynicotine + H2O + O2
relationship: debio:0000009 rhea:11883 ! has bi-directional reaction (S)-6-hydroxynicotine + H2O + O2 <=> 6-hydroxypseudooxynicotine + H2O2
relationship: debio:0000047 GO:0018531 ! reaction enabled by molecular function

[Term]
id: rhea:11881
name: \(S\)-6-hydroxynicotine + H2O + O2 => 6-hydroxypseudooxynicotine + H2O2
xref: metacyc.compound:S-6-HYDROXYNICOTINE-OXIDASE-RXN
is_a: rhea:11880 ! (S)-6-hydroxynicotine + H2O + O2 = 6-hydroxypseudooxynicotine + H2O2
relationship: RO:0002233 CHEBI:15377 ! has input water
relationship: RO:0002233 CHEBI:15379 ! has input dioxygen
relationship: RO:0002233 CHEBI:58182 ! has input (S)-6-hydroxynicotinium(1+)
relationship: RO:0002234 CHEBI:16240 ! has output hydrogen peroxide
relationship: RO:0002234 CHEBI:58682 ! has output 6-hydroxypseudooxynicotinium(1+)
relationship: RO:0002333 uniprot:A0A075BSX9 ! enabled by
relationship: RO:0002333 uniprot:Q93NH4 ! enabled by

[Term]
id: rhea:11882
name: 6-hydroxypseudooxynicotine + H2O2 => \(S\)-6-hydroxynicotine + H2O + O2
is_a: rhea:11880 ! (S)-6-hydroxynicotine + H2O + O2 = 6-hydroxypseudooxynicotine + H2O2
relationship: RO:0002233 CHEBI:16240 ! has input hydrogen peroxide
relationship: RO:0002233 CHEBI:58682 ! has input 6-hydroxypseudooxynicotinium(1+)
relationship: RO:0002234 CHEBI:15377 ! has output water
relationship: RO:0002234 CHEBI:15379 ! has output dioxygen
relationship: RO:0002234 CHEBI:58182 ! has output (S)-6-hydroxynicotinium(1+)

[Term]
id: rhea:11883
name: \(S\)-6-hydroxynicotine + H2O + O2 <=> 6-hydroxypseudooxynicotine + H2O2
xref: kegg.reaction:R03202
is_a: rhea:11880 ! (S)-6-hydroxynicotine + H2O + O2 = 6-hydroxypseudooxynicotine + H2O2
relationship: RO:0000057 CHEBI:15377 ! has participant water
relationship: RO:0000057 CHEBI:15379 ! has participant dioxygen
relationship: RO:0000057 CHEBI:16240 ! has participant hydrogen peroxide
relationship: RO:0000057 CHEBI:58182 ! has participant (S)-6-hydroxynicotinium(1+)
relationship: RO:0000057 CHEBI:58682 ! has participant 6-hydroxypseudooxynicotinium(1+)
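
The names in the example output follow a simple pattern: the master name uses `=`, and the three child reactions use `=>`, a swapped `=>`, and `<=>`. A minimal sketch of that pattern, assuming the master name contains exactly one ` = ` separator (`directional_names` is a hypothetical helper, not the function used in this PR):

```python
def directional_names(master_name: str) -> dict[str, str]:
    """Derive directional reaction names from a master name like 'A + B = C'.

    Hypothetical helper for illustration only.
    """
    parts = master_name.split(" = ")
    if len(parts) != 2:
        raise ValueError(f"expected exactly one ' = ' in: {master_name!r}")
    left, right = parts
    return {
        "left_to_right": f"{left} => {right}",
        "right_to_left": f"{right} => {left}",
        "bidirectional": f"{left} <=> {right}",
    }


names = directional_names(
    "(S)-6-hydroxynicotine + H2O + O2 = 6-hydroxypseudooxynicotine + H2O2"
)
for key, value in names.items():
    print(key, "->", value)
```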

@cmungall

I think having the top and bottom of the diamond be undistinguished directionally and having direction on the middle makes sense practically and ontologically.

Part of me really wants the convenience of having explicit left and right relations on the tip, even if it doesn't make perfect ontological sense. A compromise could be an axiom annotation on the participant relationships (this does make the RDF/XML explode a little - even though it's nice and compact in the .obo :-). We can imagine adding stoichiometry on these later.

Of course, it should be RHEA not rhea - but I'll check with the RHEA folks to see if they concur :-)

Will you declare debio:0000009 as an AP? An existential restriction doesn't make sense for linking parts of the diamond

Wasn't familiar with debio - looks like placeholder ontology? I'd also use chemrof over debio for this.

@cmungall

cmungall commented Jan 30, 2024

Also I just noticed that this forces an ontological commitment to EC representing gene products. This is of course on the surface defensible: the E stands for enzyme, and enzymes are gene products. But other valid treatments are to say the EC represents the reaction or mechanism, which would make a mapping rather than a relationship more appropriate.

Of course we shouldn't hold up this PR on rabbit holes like this but this does highlight the need for governance if obo-db-ingest moves more into semi-official obo rendering territory

Another issue with the existential restriction is that it's too strong logically for many cases, e.g. for https://www.rhea-db.org/rhea/11092

you would end up with:

id: RHEA:11092
relationship: RO:0002333 EC:1.1.1.8
relationship: RO:0002333 EC:1.1.1.94

This is logically wrong - you're asserting that every instance of this reaction is catalyzed by both those enzymes
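
To spell the problem out: in OBO format, a `relationship:` line on a term is read as an existential restriction that holds for every instance of the class. A sketch in description-logic notation, writing $R$ for RHEA:11092 and `enabledBy` as shorthand for RO:0002333 (the symbols are illustrative and not from any ontology file):

```latex
\begin{align*}
R &\sqsubseteq \exists\,\mathrm{enabledBy}.\,\text{EC:1.1.1.8} \\
R &\sqsubseteq \exists\,\mathrm{enabledBy}.\,\text{EC:1.1.1.94} \\
\Longrightarrow\quad
R &\sqsubseteq \bigl(\exists\,\mathrm{enabledBy}.\,\text{EC:1.1.1.8}\bigr)
   \sqcap \bigl(\exists\,\mathrm{enabledBy}.\,\text{EC:1.1.1.94}\bigr)
\end{align*}
```

The two axioms are each asserted for all instances of $R$, so together they entail the conjunction on the last line.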

@cthoyt
Member Author

cthoyt commented Jan 31, 2024

> I think having the top and bottom of the diamond be undistinguished directionally and having direction on the middle makes sense practically and ontologically.
>
> Part of me really wants the convenience of having explicit left and right relations on the tip, even if it doesn't make perfect ontological sense. A compromise could be an axiom annotation on the participant relationships (this does make the RDF/XML explode a little - even though it's nice and compact in the .obo :-). We can imagine adding stoichiometry on these later.

I'm not sure I understand what you mean by "the diamond". I am happy to update this to fit whatever pattern you say is best, though

> Of course, it should be RHEA not rhea - but I'll check with the RHEA folks to see if they concur :-)

We can address this upstream in the Bioregistry if Anne wants to get involved

> Will you declare debio:0000009 as an AP? An existential restriction doesn't make sense for linking parts of the diamond
>
> Wasn't familiar with debio - looks like placeholder ontology? I'd also use chemrof over debio for this.

debio (https://github.com/biopragmatics/debio) is my solution to the RO governance taking too long to get new relations in (Biolink does the same thing as far as I understand :p). I would be much happier to get stuff in RO if you can tell me what you think it should be encoded as.

Is chemrof ready for use? I am not confident using it, since the documentation is confusing and doesn't give examples or support for me as a potential user

@cthoyt
Member Author

cthoyt commented Jan 31, 2024

> Also I just noticed that this forces an ontological commitment to EC representing gene products. This is of course on the surface defensible: the E stands for enzyme, and enzymes are gene products. But other valid treatments are to say the EC represents the reaction or mechanism, which would make a mapping rather than a relationship more appropriate.
>
> Of course we shouldn't hold up this PR on rabbit holes like this but this does highlight the need for governance if obo-db-ingest moves more into semi-official obo rendering territory
>
> Another issue with the existential restriction is that it's too strong logically for many cases, e.g. for https://www.rhea-db.org/rhea/11092
>
> you would end up with:
>
> id: RHEA:11092
> relationship: RO:0002333 EC:1.1.1.8
> relationship: RO:0002333 EC:1.1.1.94
>
> This is logically wrong - you're asserting that every instance of this reaction is catalyzed by both those enzymes

What's the best way to represent that either EC catalyzes this reaction? If I turn this into a knowledge graph, it's obvious to me that I want both relations.
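
For the knowledge-graph reading, each `relationship:` line simply becomes one edge. A minimal sketch of that flattening, using the stanza quoted above as input (`stanza_to_triples` is a hypothetical helper, not a pyobo function, and it handles only `id:` and `relationship:` lines):

```python
def stanza_to_triples(stanza: str) -> list[tuple[str, str, str]]:
    """Turn one OBO term stanza into (subject, predicate, object) triples."""
    subject = None
    triples = []
    for raw_line in stanza.splitlines():
        # Drop a trailing " ! label" comment if one is present.
        line = raw_line.split(" ! ", 1)[0].strip()
        if line.startswith("id: "):
            subject = line[len("id: "):]
        elif line.startswith("relationship: ") and subject is not None:
            predicate, obj = line[len("relationship: "):].split(maxsplit=1)
            triples.append((subject, predicate, obj))
    return triples


STANZA = """\
id: RHEA:11092
relationship: RO:0002333 EC:1.1.1.8
relationship: RO:0002333 EC:1.1.1.94
"""
for triple in stanza_to_triples(STANZA):
    print(triple)
```

Read as plain edges, the two triples carry no "every instance" claim; that stronger reading only arises when the same lines are interpreted as OWL axioms.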

@cthoyt
Member Author

cthoyt commented Jan 31, 2024

Btw, here's a diagram to help us guide discussions on how to ontologize GO, Rhea, and EC:

@cmungall

cmungall commented Feb 1, 2024

> I'm not sure I understand what you mean by "the diamond".

See slide 2:

https://docs.google.com/presentation/d/1Umumzq4Ix9o55RwKEn2WI6u2QClUogGZRRU2Li-7D_k/edit#slide=id.gd29e7f76ab_0_6

image

> debio (https://github.com/biopragmatics/debio) is my solution to the RO governance taking too long to get new relations in (Biolink does the same thing as far as I understand :p).

Nope! RO is an OWL ontology.

> I would be much happier to get stuff in RO if you can tell me what you think it should be encoded as.

tl;dr don't. I think oboizations of databases should avoid distributing annotations (in the GO sense not OWL sense) in the same file as the entities.

Made an issue for this:

In this particular case, some may have good reasons to model EC to RHEA as SSSOM (this is what we would do in GO, we consider these mappings, and we use SKOS for the relations). But not everyone needs to agree! If someone wants to model this using an incorrect OWL axiom in a local obo file ecosystem that's fine too! IMO pyobo/obo-db-ingest will be more successful with fewer commitments. Just distribute the associations as TSVs!

(if you like we can get into the ways of modeling the relationship between an enzyme molecule and a reaction on the RO tracker, but the tl;dr is there are semantic minefields in going down this path. I didn't lay the minefields, I'm just trying to stop people wandering into them...)
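
A minimal sketch of what distributing the associations as a TSV could look like, using the RHEA:11092 example from earlier in the thread. The column names and the `associations_to_tsv` helper are assumptions made for illustration; they are not an existing pyobo or obo-db-ingest format.

```python
import csv
import io


def associations_to_tsv(rows):
    """Serialize (rhea_id, ec_id) pairs as tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["rhea_id", "ec_id"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()


ASSOCIATIONS = [
    ("RHEA:11092", "EC:1.1.1.8"),
    ("RHEA:11092", "EC:1.1.1.94"),
]
tsv_text = associations_to_tsv(ASSOCIATIONS)
print(tsv_text, end="")
```

A flat table like this leaves the choice of semantics (mapping, OWL axiom, or plain edge) to whoever consumes it.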

@cthoyt
Member Author

cthoyt commented Feb 2, 2024

FYI I've made two changes before merging:

  1. Added a placeholder relation that's not an xref to GO (debio:0000047)
  2. Added names for ltr, rtl, and bi reactions

@cthoyt cthoyt merged commit a2da824 into main Feb 2, 2024
8 checks passed
@cthoyt cthoyt deleted the update-rhea branch February 2, 2024 08:29
@cmungall

It would be awesome if we could get a new build of https://github.com/biopragmatics/obo-db-ingest with this incorporated

@cthoyt
Member Author

cthoyt commented Mar 18, 2024

@cmungall new build from March 14th, 2024 available. New links available in https://github.com/biopragmatics/obo-db-ingest/blob/6a136f3aadceb11aa27dd542786be2f0614a03fc/docs/_data/manifest.yml#L877-L918

@cthoyt cthoyt mentioned this pull request Mar 24, 2024
Development

Successfully merging this pull request may close these issues.

Add participants to Rhea import
2 participants