Feature: Top pathway hit #1425

Open
jvwong opened this issue Nov 28, 2022 · 12 comments

Comments

@jvwong
Member

jvwong commented Nov 28, 2022

Goal
Feature the top pathway search hit and include a visual preview (SBGN) when a query references at least one of the molecular participants.

Primary use case
Referral from public entity databases. In these cases, the query will be in the form of (a) a UniProt accession (referrer: UniProt) and/or (b) an HGNC symbol (referrer: GeneCards). The response should contain the information necessary to populate a Feature Component for the pathway: name, source, number of participants, and a URL pointing to an SBGN preview.
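A minimal sketch of what such a response payload could look like, assuming a flat record with the four fields listed above. The class name, field names, and example values are hypothetical and not taken from the app-ui code base.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FeaturedPathway:
    """Fields the Feature Component would need (names are illustrative)."""

    name: str              # pathway display name
    source: str            # providing data source
    num_participants: int  # number of molecular participants
    preview_url: str       # URL pointing to the SBGN preview


def to_response(hit: FeaturedPathway) -> dict:
    """Serialize the top hit into a plain dict, ready for JSON encoding."""
    return asdict(hit)


# Hypothetical values, for illustration only.
example = to_response(
    FeaturedPathway(
        name="Example pathway",
        source="ExampleDB",
        num_participants=12,
        preview_url="https://example.org/previews/example-pathway.png",
    )
)
```

Keeping the record flat means the client can render the Feature Component without a second request for pathway metadata.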

Restrictions & Limitations
The main concern is a slower "time to interactive" for the search, which would degrade the user experience. Several potential bottlenecks:

  • Generating an SBGN snapshot on-the-fly
  • Identifying entity IDs from the query
  • Retrieving a pathway's participants (UnificationXref db, id).
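For the second bottleneck, a cheap first pass could classify query tokens by shape before any lookup is attempted. The sketch below uses the accession pattern published in the UniProt help pages. Function names are hypothetical, and HGNC symbols cannot be recognized by shape alone, so anything that is not accession-shaped is only flagged as a candidate that still needs a lookup.

```python
import re

# Accession pattern as documented by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def classify_token(token: str) -> str:
    """Return 'uniprot' for accession-shaped tokens, else 'symbol_candidate'."""
    if UNIPROT_ACCESSION.fullmatch(token.strip().upper()):
        return "uniprot"
    return "symbol_candidate"


def classify_query(query: str) -> dict:
    """Classify each whitespace-separated token of a search query."""
    return {token: classify_token(token) for token in query.split()}
```

Because this is a pure regular-expression check, it adds no network round trip to the search path.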

Edge cases
Integration with the case where Biofactoid is contained in a Feature view (i.e. triggered when the top search hit's source is Biofactoid).

Notes
To reduce bottlenecks, we might consider:

a. Rendering and caching SBGN previews for all pathways instead of generating them on-the-fly, which might also be useful for Google indexing

@jvwong
Member Author

jvwong commented Nov 29, 2022

Update

Goal
Feature the top pathway search hit and include a visual preview (SBGN) so that users are aware of what the pathway section links will hold.

Primary use case
Any user that visits the search page.

Mockup

[Image: mockup summary]

Details

Benefits:

  • Generic to all users of the search, whether referred (UniProt, GeneCards) or arriving via Google
  • Does not change the experience for the majority of users, who are mainly interested in the interactions app
  • Preview is placed in the Pathway section, indicating that other links will provide the same info (SBGN)
  • No complex mapping of query to pathway previewed
  • Uses existing components (e.g. SBGN pathway viewer)

Risks:

  • The top hit may or may not be relevant (i.e. we're not checking that a database ID query is in the pathways)
  • Generating the SBGN visualization may be slow
  • SBGN network can be enormous, complex
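One way to limit the last risk would be to check the size of a pathway before rendering a preview. The sketch below counts the top-level glyphs in an SBGN-ML document and compares the count against a cutoff. The cutoff value and the function names are assumptions for illustration, and counting only direct children of `<map>` is a simplification (nested glyphs such as complex members are ignored).

```python
import xml.etree.ElementTree as ET

MAX_PREVIEW_NODES = 100  # assumed cutoff; would need tuning on real pathways


def _local_name(tag: str) -> str:
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def count_top_level_glyphs(sbgnml: str) -> int:
    """Count <glyph> elements that are direct children of a <map> element."""
    root = ET.fromstring(sbgnml)
    count = 0
    for elem in root.iter():
        if _local_name(elem.tag) == "map":
            count += sum(1 for child in elem if _local_name(child.tag) == "glyph")
    return count


def should_render_preview(sbgnml: str, max_nodes: int = MAX_PREVIEW_NODES) -> bool:
    """Return True when the pathway is small enough to preview."""
    return count_top_level_glyphs(sbgnml) <= max_nodes
```

Matching on local names keeps the check independent of the SBGN-ML namespace version used by the source.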

@cannin
Member

cannin commented Dec 7, 2023

Python code to generate static images of PC pathways using Syblars (https://pubmed.ncbi.nlm.nih.gov/36374853/, from Ugur Dogrusoz)

Code

https://gist.github.com/cannin/7e35f3fae274370bd0a70c7b1840c743

General Workflow

  1. Use a PC GMT file to get pathway URIs
  2. Call Pathway Commons API to get SBGN
  3. Call Syblars to get PNG data content
  4. Save to file
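The four steps above can be sketched roughly as follows. This is not the code from the gist. It assumes that the first tab-separated column of each GMT row is the pathway URI and that the Pathway Commons `get` endpoint accepts `uri` and `format=SBGN` query parameters at the base URL shown. The two network calls are passed in as callables (`fetch_sbgn`, `render_png`, both hypothetical names), so the Syblars request itself is deliberately left to the caller.

```python
import re
from pathlib import Path
from typing import Callable, Iterable, Iterator, List
from urllib.parse import urlencode

PC_GET_ENDPOINT = "https://www.pathwaycommons.org/pc2/get"  # assumed base URL


def pathway_uris(gmt_lines: Iterable[str]) -> Iterator[str]:
    """Step 1: yield the first tab-separated column of each non-empty GMT row."""
    for line in gmt_lines:
        line = line.strip()
        if line:
            yield line.split("\t", 1)[0]


def sbgn_request_url(uri: str) -> str:
    """Step 2: build the request URL asking Pathway Commons for SBGN output."""
    return f"{PC_GET_ENDPOINT}?{urlencode({'uri': uri, 'format': 'SBGN'})}"


def render_all(
    gmt_lines: Iterable[str],
    fetch_sbgn: Callable[[str], str],     # performs the HTTP GET, returns SBGN-ML
    render_png: Callable[[str], bytes],   # sends SBGN-ML to a renderer, returns PNG bytes
    out_dir: str,
) -> List[Path]:
    """Run steps 1-4 for every pathway and return the paths that were written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for uri in pathway_uris(gmt_lines):
        sbgn = fetch_sbgn(sbgn_request_url(uri))  # step 2
        png = render_png(sbgn)                    # step 3
        stem = re.sub(r"[^A-Za-z0-9]+", "-", uri).strip("-")
        path = out / f"{stem}.png"
        path.write_bytes(png)                     # step 4
        written.append(path)
    return written
```

Injecting the two callables keeps the pipeline testable offline and makes it easy to swap the renderer later.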

Examples

There seems to be an upper limit on the number of nodes beyond which the automated SBGN layout is poor (~100 nodes?)

http-identifiers-org-kegg-pathway-hsa00601
http-identifiers-org-kegg-pathway-hsa00860
http-identifiers-org-kegg-pathway-hsa00920
http-identifiers-org-kegg-pathway-hsa01100
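The image names above look like pathway URIs with every run of non-alphanumeric characters collapsed to a hyphen. The sketch below reproduces that naming under this assumption; the actual script may differ. Since such a mapping can send two different URIs to the same name, a second, hypothetical helper appends a short hash of the full URI.

```python
import hashlib
import re


def image_stem(uri: str) -> str:
    """Collapse each run of non-alphanumeric characters into a single hyphen."""
    return re.sub(r"[^A-Za-z0-9]+", "-", uri).strip("-")


def unique_image_stem(uri: str, digest_len: int = 8) -> str:
    """Append a short SHA-1 prefix so distinct URIs are very unlikely to collide."""
    digest = hashlib.sha1(uri.encode("utf-8")).hexdigest()[:digest_len]
    return f"{image_stem(uri)}-{digest}"


# The URI below is inferred from the first image name, not confirmed.
print(image_stem("http://identifiers.org/kegg.pathway/hsa00601"))
# → http-identifiers-org-kegg-pathway-hsa00601
```

For example, `a.b/c` and `a/b.c` both map to `a-b-c` with `image_stem`, but get different names with `unique_image_stem`.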

Indexing of Static Content by Google

Google depends on sitemaps to know what to index. An example from another project with 1700 pages shows the minimum needed to have static images indexed: https://discover.nci.nih.gov/rsconnect/cellminercdb/cell_lines/sitemap.xml Pages were slow to index at first, but everything was eventually indexed; images tend to be even slower.
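A minimal sketch of generating such a sitemap for pathway pages and their static images, using the sitemap protocol namespace and Google's image sitemap extension. The function name and the example URLs in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_sitemap(pages: Iterable[Tuple[str, str]]) -> str:
    """Build a sitemap with one <url> per (page_url, image_url) pair."""
    # Serialize with a default namespace and an "image" prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_url in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Calling `build_sitemap([("https://example.org/pathways/p1", "https://example.org/previews/p1.png")])` returns a document with one page entry and its associated image.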

@jvwong
Member Author

jvwong commented Dec 12, 2023

I looked into using the existing services here in app-ui, and the only blocking issue in generating image snapshots with cytosnap is the lack of support for styles via cytoscape-sbgn-stylesheet, as mentioned in this issue. In fact, I think the latter issue was posted in reference to syblars and I suspect they've ended up using a cytosnap fork of some sort.

I was also thinking that, in addition or instead, we could pre-cache the pathway JSON data; the PC HGNC GMT file says there are 3971 pathways.
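A rough sketch of what pre-caching could look like, assuming one JSON file per pathway keyed by a hash of its URI. The loader is passed in as a callable (`load_pathway_json`, a hypothetical name) because the actual service call in app-ui is not shown here.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterable, Optional


def cache_path(cache_dir: Path, uri: str) -> Path:
    """Map a pathway URI to a stable file name via its SHA-256 digest."""
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.json"


def precache(
    uris: Iterable[str],
    load_pathway_json: Callable[[str], dict],
    cache_dir: str,
) -> int:
    """Store JSON for every URI not cached yet; return how many were fetched."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    fetched = 0
    for uri in uris:
        target = cache_path(directory, uri)
        if target.exists():
            continue  # already cached, skip the expensive load
        target.write_text(json.dumps(load_pathway_json(uri)), encoding="utf-8")
        fetched += 1
    return fetched


def read_cached(cache_dir: str, uri: str) -> Optional[dict]:
    """Return the cached JSON for a URI, or None on a cache miss."""
    target = cache_path(Path(cache_dir), uri)
    if not target.exists():
        return None
    return json.loads(target.read_text(encoding="utf-8"))
```

Skipping existing files makes the job resumable, which matters when a run over a few thousand pathways is interrupted.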

@maxkfranz
Member

> and I suspect they've ended up using a cytosnap fork of some sort.

This looks like it: https://github.com/iVis-at-Bilkent/cytosnap/commits/master

There are a number of new features there, including SVG export. It would be best to get those merged into the main Cytosnap lib. The main thing for now is SBGN support.

@ugurdogrusoz, @hasanbalci -- are there any technical reasons why the general stylesheet / SBGN support could not be merged into the main lib? If not, we should aim to have features like this merged into the main Cytosnap lib so that it's maintained and compatible with the rest of the ecosystem going forward.

@hasanbalci

@maxkfranz I think we also have some extension-specific code in that fork, but I will open a PR with some of the basic features we added, such as the ability to use other stylesheets or cy extensions and to export the image and node positions at the same time.

@maxkfranz
Member

Great. Thanks, @hasanbalci

@jvwong
Member Author

jvwong commented Jan 29, 2024

This first pass is running on our appsbeta.pathwaycommons.org. There's some good, bad and ugly, depending on the pathway size:

Good

Mitochondrial transcription termination | Reactome

Bad

TP53 regulates transcription of additional cell cycle genes... | Reactome

Ugly

Direct p53 effectors | NCI-PID

@maxkfranz
Member

Improving the network layout generally would at least remove the 'ugly' (hairball) cases, even if they're still 'bad' (too large).

This could motivate bumping up the priority of updated layout as a student project. We have lots of new layouts to try that didn't exist when the PC apps were built.

@jvwong
Member Author

jvwong commented Jan 30, 2024

> Improving the network layout generally would at least remove the 'ugly' (hairball) cases, even if they're still 'bad' (too large).

By "bad", I meant from the standpoint of "how is this helpful or attractive to a user" - not so much if text labels are unreadable. So maybe better zooming, panning, and cropping (like google maps).

> This could motivate bumping up the priority of updated layout as a student project. We have lots of new layouts to try that didn't exist when the PC apps were built.

Sounds good.

@maxkfranz
Member

Yeah, the 'bad' is a balance. On the one hand, a zoomed in view is prettier. On the other hand, a full view is more what-you-see-is-what-you-get.

Either way, better layout would be good.

@jvwong
Member Author

jvwong commented Jan 31, 2024

Just to re-up on my previous comment: #1425 (comment)

The rendering and layouts are done in Syblars. It would be nice (preferred) if we could just do all the rendering, snapshotting, and layout here.

@jvwong
Member Author

jvwong commented Dec 2, 2024

Just noting this here: there is an outstanding issue regarding how my command-line script generates image names (@IgorRodchenkov):

#1443 (comment)
