OBI:0100026 as taxon for variant objects #860

Open
deepakunni3 opened this issue Nov 6, 2019 · 10 comments

@deepakunni3

This bug was originally raised by @iimpulse

For reference, the query: https://api-dev.monarchinitiative.org/api/bioentity/gene/MGI:98297/variants?fetch_objects=true&start=0&rows=10&facet=true&facet_fields=subject_taxon&taxon=OBI%3A0100026

yields gene-to-variant associations. But there are some associations where the object has a BNODE prefix and a taxon of OBI:0100026 'organism'. This makes it difficult to filter variants by taxon, since 'organism' is too generic a taxon term.
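To make the problem concrete, here is a minimal Python sketch that rebuilds the query above and flags the problematic associations. It is an illustration under assumptions, not the API's documented behavior: the helper names (`variants_url`, `has_generic_taxon`), the `BNODE:` prefix string, and the shape of each association record (`{"object": {"id": ..., "taxon": {"id": ...}}}`) are all hypothetical and should be checked against a real response.

```python
from urllib.parse import urlencode

API_BASE = "https://api-dev.monarchinitiative.org/api"
GENERIC_TAXON = "OBI:0100026"  # 'organism', too generic to filter on


def variants_url(gene_id, taxon=None, start=0, rows=10):
    """Build the gene-to-variant association URL quoted in this issue."""
    params = {
        "fetch_objects": "true",
        "start": start,
        "rows": rows,
        "facet": "true",
        "facet_fields": "subject_taxon",
    }
    if taxon is not None:
        params["taxon"] = taxon
    return f"{API_BASE}/bioentity/gene/{gene_id}/variants?{urlencode(params)}"


def has_generic_taxon(association, bnode_prefix="BNODE:"):
    """Return True if the object is a blank node whose taxon is just 'organism'.

    ASSUMPTION: `association` is a dict shaped like
    {"object": {"id": "...", "taxon": {"id": "..."}}}.
    """
    obj = association.get("object") or {}
    taxon = obj.get("taxon") or {}
    is_bnode = str(obj.get("id", "")).startswith(bnode_prefix)
    return is_bnode and taxon.get("id") == GENERIC_TAXON
```

Running `has_generic_taxon` over the associations returned for `variants_url("MGI:98297", taxon="OBI:0100026")` would be one way to produce a list of affected IDs.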

@cmungall @kshefchek Thoughts?

CC'ing @monicacecilia for her awesomeness!

@kshefchek
Contributor

We originally attempted to infer taxon on genotype parts instead of making an explicit edge for each (in retrospect, maybe a mistake). There's an inference path in the Solr loader that has been broken for some time, in the sense that it either doesn't infer the taxa or infers 'organism'.

tl;dr: this should probably be fixed in Dipper, and it likely won't be in the short term.

@kshefchek
Contributor

see also - SciGraph/golr-loader#10

@deepakunni3
Author

That's good to know.

Thanks @kshefchek

So is this blocked by Dipper or the SciGraph loader? Or both?

@kshefchek
Contributor

The way it works now, this could be fixed in either Dipper or the golr-loader code. We could also add something in SciGraph, but it would have to be some new post-processor. I think the best thing to do is to add it in Dipper.

@kshefchek transferred this issue from monarch-initiative/monarch-ui on Nov 6, 2019
@kshefchek
Contributor

Looking closer, many of these are transgenes, so which taxon applies? I would think the taxon in which the variant is studied, but that is not entirely accurate.

@iimpulse
Member

iimpulse commented Nov 8, 2019

@mbrush Are you able to join the monarch-ui call on Tuesday November 12?

This ticket is in relation to representation of variants, and we believe we need your help.

@mbrush
Member

mbrush commented Nov 8, 2019

Hi. Happy to join the call on Tuesday. In the meantime, Appendix I of this document provides food for thought that I think is relevant to this topic. It gets pretty far into the weeds concerning what it means to be a 'transgene' or an 'allele' from the GENO perspective. But the key bits are in the third paragraph, which starts with "An allele . . .". Copying key text below, but see the document for broader context.

An allele in GENO, including those caused by insertions, is an allele_of some reference genomic feature. This feature is typically a gene, but even insertions falling outside of genes are considered alleles_of the reference feature they alter (e.g. alleles of other named features such as QTLs). The feature or gene that an allele is an allele_of is entirely dependent on its genomic position, and not on the sequence content it contains. For example, insertion of the S. cerevisiae GAL4 gene sequence within the D. melanogaster Bx gene locus would create an allele_of this Bx gene, but the resulting transgene would not be considered an allele_of the S. cerevisiae GAL4 gene - because positionally it is not located in a yeast genome at the yeast GAL4 locus. Rather, GENO would say that this transgene derives_sequence_from the S. cerevisiae GAL4 gene.
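The distinction drawn in the quoted passage can be sketched in a few lines of Python. This is only an illustration of the idea, using the example from the text: the `Gene` and `Insertion` classes are hypothetical and are not part of GENO, Dipper, or any Monarch code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    symbol: str
    taxon: str


@dataclass(frozen=True)
class Insertion:
    """Hypothetical model of an insertion allele, for illustration only."""
    locus: Gene                             # reference feature at the insertion site
    sequence_source: Optional[Gene] = None  # gene the inserted sequence comes from

    @property
    def allele_of(self) -> Gene:
        # Determined by genomic position, not by the inserted sequence.
        return self.locus

    @property
    def derives_sequence_from(self) -> Optional[Gene]:
        # Determined by the sequence content of the insertion.
        return self.sequence_source


bx = Gene("Bx", "D. melanogaster")
gal4 = Gene("GAL4", "S. cerevisiae")
transgene = Insertion(locus=bx, sequence_source=gal4)

print(transgene.allele_of.taxon)              # taxon of the host locus
print(transgene.derives_sequence_from.taxon)  # taxon of the sequence source
```

Under this reading, a transgene carries two candidate taxa, one via `allele_of` and one via `derives_sequence_from`, which is why a single inferred taxon on the variant can be ambiguous.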

@cmungall
Member

I am confused about why we are talking about an OBI ID in the first place. We shouldn't be using the OBI class for organism.

@kshefchek
Contributor

@cmungall this comes from a multi-integration issue: first from running ELK on GENO, then from attempting to infer taxon via this graph path search: https://github.com/SciGraph/golr-loader/blob/master/src/main/java/org/monarch/golr/GolrLoader.java#L157

@justaddcoffee
Member

@kshefchek send along the list of troublesome IDs when you get a chance, and I'll figure out which Dipper ingests need to be corrected here.
