AG: class tree visualization issues with AG backend #264
Comments
Updated ncbo_cron to the latest codebase in the staging environment and reprocessed the ontologies. The missing prefLabel for purl.obolibrary.org/obo/GO_0008150 in the GO ontology is fixed.
Labels are still missing in MONDO, e.g., https://stage.bioontology.org/ontologies/MONDO/?p=classes&conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMONDO_0005200 ('viral dilated cardiomyopathy').
See also ncbo/ontologies_linked_data#137 for an example from production (fixed by re-parsing in that case).
Note that the MONDO case does not have prefLabels in the original ontology; it only has regular labels throughout the XML/RDF file. I believe the rdfs:label is displayed as the Preferred Label annotation when there is no prefLabel.
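The fallback described in the comment above (show the regular label when no prefLabel exists) can be sketched roughly as follows. This is only an illustration, not BioPortal's actual code: the function name display_label and the dictionary shape are assumptions made for the example; only the two predicate IRIs are standard.

```python
# Standard predicate IRIs for SKOS prefLabel and RDFS label.
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def display_label(annotations):
    """Return the first prefLabel, else the first rdfs:label, else None.

    `annotations` is a hypothetical mapping of predicate IRI -> list of values.
    """
    for predicate in (PREF_LABEL, RDFS_LABEL):
        values = annotations.get(predicate) or []
        if values:
            return values[0]
    return None
```

Under this sketch, a class that carries only an rdfs:label still gets a display name, and a class with neither annotation yields None, which corresponds to the blank tree entries reported in this issue.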
In case this gives any clue: if I click on the missing label (highlighted under herpes zoster on the right side of the screenshot), the page may not resolve (big WAITING box), but if I hit reload in the browser, the class is displayed. After further testing: if I click on the empty label location, I get the LOADING CLASS spinner, nothing happens, and the network trace shows a 500 error on that class (first GIF). If I then hit Reload, the page refreshes to display the item and the network trace shows a normal 200 response. The only visible difference between the two calls is that the second one has no callback=load.
I think we found the responsible code for this problem. Well, we have a good theory, anyway. It's all in Slack for now; I'll let Misha decide what is worth summarizing in this thread.
I was able to identify the cause of this issue. AllegroGraph does not impose a default ordering on records in paginated results, so duplicate values are included when iterating over the entire record set:
No single run of this query produces duplicates, but the runs taken together over the entire graph do. Because of these duplicates, many of the legitimate classes are omitted and are left without a label. The attached file contains a good illustration of the issue. It includes both the queries run and the results of each run right below it: vto_id_queries_with_results_run1.txt. If you grep for the term
and
4store does the internal ordering correctly, so we never encountered this issue before AG. Because the internal ordering of records in AG is not deterministic, random labels end up missing from one run to the next. Here is another run to compare to the first one, with different duplicates and different missing terms: Per the selected answer in this StackOverflow thread: https://stackoverflow.com/questions/55146844/offset-in-sparql,
Based on this, a possible solution is to add an ORDER BY clause to the query:
This, however, may come at a performance cost.
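The pagination problem and the ORDER BY fix discussed in this comment can be illustrated with a small simulation. It is a sketch under assumptions, not the code used in BioPortal: fetch_page is a hypothetical stand-in for the triple store that models "no default ordering" by shuffling the rows before applying OFFSET/LIMIT, and the query text built by paged_query uses an illustrative triple pattern rather than the actual query from this thread.

```python
import random


def fetch_page(rows, offset, limit, order_by=False, rng=None):
    """Hypothetical stand-in for one OFFSET/LIMIT query against the store.

    Without order_by the store may return rows in any order on each call,
    which is modeled here by shuffling before slicing.
    """
    rng = rng or random.Random()
    view = sorted(rows) if order_by else rng.sample(rows, len(rows))
    return view[offset:offset + limit]


def fetch_all(rows, page_size, order_by=False, rng=None):
    """Walk the whole record set page by page and concatenate the pages."""
    rng = rng or random.Random()
    collected = []
    for offset in range(0, len(rows), page_size):
        collected.extend(fetch_page(rows, offset, page_size, order_by, rng))
    return collected


def paged_query(offset, limit):
    """Build an illustrative paginated SPARQL query with a stable ordering."""
    return (
        "SELECT ?id ?label WHERE { "
        "?id <http://www.w3.org/2004/02/skos/core#prefLabel> ?label } "
        "ORDER BY ?id ?label "
        f"LIMIT {limit} OFFSET {offset}"
    )


if __name__ == "__main__":
    ids = [f"http://example.org/class/{i:04d}" for i in range(1000)]
    for ordered in (False, True):
        got = fetch_all(ids, 100, order_by=ordered)
        duplicates = len(got) - len(set(got))
        missing = len(set(ids) - set(got))
        print(f"order_by={ordered}: duplicates={duplicates}, missing={missing}")
```

In the unordered case every page is individually free of duplicates, yet the pages taken together will typically repeat some IDs, and each repeat crowds out one legitimate ID. With a stable ordering, the pages partition the record set and both counts are zero.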
I am working with the Franz developers on improving the performance of the ORDER BY clause in AllegroGraph. As of now, the performance degradation experienced as a result of adding ORDER BY is unacceptable.
Resolved, but needs to be confirmed with AllegroGraph v7.4 when it comes up.
A number of ontologies have class tree visualization problems when BioPortal runs with the AllegroGraph backend. The preferred name is missing, so the class tree has blank entries.
API shows
prefLabel: null