The icon shown on the gene is a combined protein and gene symbol.
From a user POV, we expect that it is a "protein" that is being affected: the protein of the gene Hebp1. But in our graph building, disambiguating genes and proteins causes a lot of noise, both too many edges and too few. Some sources annotate only to "gene" IDs when they biologically mean "protein"; they "pre-conflate" the two concepts. Because of this, a query has to "know" whether to look for a protein or a gene based on the source, and no one really has a good way of doing those queries effectively. So we decided that NN should do "conflation", which is configurable: if you query by "gene", you get results for both "genes" and "proteins". This eases searchability.
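To make the conflation behavior concrete, here is a minimal sketch in Python. It is not NN's actual implementation or API: the `normalize` function, the `CONFLATED_CLIQUES` table, and the `EXGENE`/`EXPROT` identifiers are all made up for illustration. It only shows the idea that, with conflation on, a gene and its proteins resolve to one shared entity, and with conflation off they stay separate.

```python
# Toy conflation table. All identifiers are hypothetical placeholders,
# not real gene or protein IDs.
CONFLATED_CLIQUES = [
    {"gene": "EXGENE:1", "proteins": ["EXPROT:1", "EXPROT:2"]},
]


def normalize(curie: str, conflate: bool = True) -> dict:
    """Return the preferred ID and the equivalent IDs for a CURIE.

    With conflate=True, a gene and its protein products are treated as
    one entity; with conflate=False, each ID stands alone.
    """
    if conflate:
        for clique in CONFLATED_CLIQUES:
            members = [clique["gene"], *clique["proteins"]]
            if curie in members:
                # Assumption for this sketch: the gene ID is the preferred one.
                return {"preferred": clique["gene"], "equivalent": members}
    return {"preferred": curie, "equivalent": [curie]}


if __name__ == "__main__":
    print(normalize("EXPROT:1"))                  # resolves to the gene's entity
    print(normalize("EXPROT:1", conflate=False))  # stays a protein
```

With conflation on, a query by either the gene ID or a protein ID reaches the same set of identifiers, so the caller does not need to know which kind of ID a given source used.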
Polypeptide, protein, and protein isoform were what the users were looking for. This was translated into a protein category in the TRAPI, and the search went by protein ID or protein symbol (which is not necessarily the same as the gene ID).
From the UI perspective, we need to explain this. Chemists might be more interested in the difference.
We like the way it works now.
E.g., if a person asks for a protein, maybe we de-conflate at the user level: return the protein instead of the gene, even if the underlying data sources use genes? If it is a conflated entity, it will always choose the gene first. Arbitrarily de-conflating would be dangerous.
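One way the user-level idea could look is sketched below in Python. This is a hypothetical illustration, not existing UI code: the `display_label` function is invented here, and it assumes the UI sees Biolink-style category strings such as `biolink:Gene` and `biolink:Protein` on each node. It changes only the label shown to the user and leaves the underlying conflated node untouched.

```python
from typing import List, Optional

GENE = "biolink:Gene"
PROTEIN = "biolink:Protein"


def display_label(node_categories: List[str], requested: Optional[str] = None) -> str:
    """Pick the label shown in the UI without changing the underlying node."""
    is_conflated = GENE in node_categories and PROTEIN in node_categories
    # Relabel only when the user explicitly asked for a protein AND the node
    # really is a conflated gene/protein entity; nothing is de-conflated
    # arbitrarily.
    if is_conflated and requested == PROTEIN:
        return "Protein"
    # Default behavior: a conflated entity is shown as the gene.
    if GENE in node_categories:
        return "Gene"
    # Fallback: strip the prefix from the first category, if there is one.
    return node_categories[0].split(":")[-1] if node_categories else "Unknown"


if __name__ == "__main__":
    print(display_label([GENE, PROTEIN], requested=PROTEIN))
    print(display_label([GENE, PROTEIN]))
```

A node that carries only the gene category is still shown as "Gene" even when a protein was requested, since the sketch has no evidence that a protein is meant.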
This should be tagged as low priority to "fix"; if it is easy, change the icon.
We might be testing with a more critical eye.
When a path intends to show a protein (as denoted by the icon/symbol), the node is annotated as "Gene." It would be easier on the user if it said "Protein."