Import REDIportal data #169

Open
1 of 4 tasks
blakesweeney opened this issue Feb 9, 2023 · 0 comments
blakesweeney commented Feb 9, 2023

REDIportal provides information on the locations of modified RNA nucleotides in the human genome. We can think of their data as a BED file of locations plus some metadata about the type of modification. This will be an expert database that does not provide any sequences; we already have one like this, CRS. To import their data we need to:

  • Intersect the coordinates they provide with all ncRNA locations in the correct genome
  • Create rnc_sequence_feature entries for each overlap. Each entry should indicate that the feature is an RNA editing event, provide information on the edit, and link to the REDIportal database.
  • Run these steps after genome mapping, each time the mapping is run.
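
As a rough sketch of the first two steps, the intersection could look something like the following. Everything here is an assumption for illustration: the `Site` and `Region` types, the `edit_0001`-style IDs, the 1-based inclusive coordinates, and the keys of the feature dict do not reflect the real REDIportal files or the actual rnc_sequence_feature columns. The link back to REDIportal is left out because the URL pattern is not given here.

```python
from collections import defaultdict
from typing import NamedTuple


class Site(NamedTuple):
    edit_id: str    # hypothetical editing event identifier
    chrom: str
    pos: int        # assumed 1-based position of the edited nucleotide
    strand: str
    edit_type: str  # e.g. "I"


class Region(NamedTuple):
    urs_taxid: str
    chrom: str
    start: int      # assumed 1-based, inclusive
    stop: int       # assumed 1-based, inclusive
    strand: str


def intersect(sites, regions):
    """Yield (site, region) pairs where the site lies inside the region."""
    by_location = defaultdict(list)
    for region in regions:
        by_location[(region.chrom, region.strand)].append(region)
    for site in sites:
        # .get avoids adding empty entries for chromosomes with no ncRNAs
        for region in by_location.get((site.chrom, site.strand), []):
            if region.start <= site.pos <= region.stop:
                yield site, region


def to_feature(site, region):
    """Build a feature record; the keys are illustrative, not the real schema."""
    return {
        "urs_taxid": region.urs_taxid,
        "feature_name": "rna_editing_event",
        "genomic_position": site.pos,
        "metadata": {"edit_id": site.edit_id, "edit_type": site.edit_type},
    }


sites = [
    Site("edit_0001", "chr1", 150, "+", "I"),
    Site("edit_0002", "chr1", 500, "+", "I"),  # outside any ncRNA
    Site("edit_0003", "chr1", 150, "-", "I"),  # wrong strand
]
regions = [Region("URS0000000001_9606", "chr1", 100, 200, "+")]
features = [to_feature(site, region) for site, region in intersect(sites, regions)]
```

A linear scan per chromosome is fine for a sketch; with the full ncRNA set an interval index or an existing BED intersection tool would probably be a better fit. Multi-exon ncRNAs would also need each exon treated as its own region.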

We also need to provide a search export that makes it possible to find all sequences with these editing events and possibly search them. Maybe adding terms like:

  • [ ] has_editing_event (True/False), like our other flags
  • [ ] edit_type:I
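
To make the two proposed terms concrete, here is a minimal sketch of how they might be derived per sequence. The function name `build_search_fields`, the input shape (pairs of URS_taxid and edit type), and the output dict are all assumptions; the real search export format is not described here.

```python
from collections import defaultdict


def build_search_fields(hits, all_urs_taxids):
    """Compute the proposed search terms for every sequence.

    hits: iterable of (urs_taxid, edit_type) pairs, one per editing event.
    all_urs_taxids: every sequence being exported, with or without edits.
    """
    types = defaultdict(set)
    for urs_taxid, edit_type in hits:
        types[urs_taxid].add(edit_type)
    return {
        urs_taxid: {
            "has_editing_event": urs_taxid in types,
            "edit_type": sorted(types.get(urs_taxid, ())),
        }
        for urs_taxid in all_urs_taxids
    }


fields = build_search_fields(
    [("URS0000000001_9606", "I"), ("URS0000000001_9606", "I")],
    ["URS0000000001_9606", "URS0000000002_9606"],
)
```

Keeping edit_type as a list would allow more than one kind of edit per sequence, if other edit types are ever imported.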

I'm not sure the second term is ideal, so better suggestions are encouraged.

Finally, we need to provide them a linkage between editing event and URS_taxid. I think a file like:

  • tab-separated file of editing ID and URS_taxid
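
A sketch of writing such a file with the standard library `csv` module is below. The function name `write_linkage`, the column order, the lack of a header row, and the example IDs are assumptions; whatever REDIportal prefers should win.

```python
import csv
import io


def write_linkage(pairs, handle):
    """Write unique (editing ID, URS_taxid) pairs as tab-separated lines."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Deduplicate and sort so repeated runs give a stable file
    for edit_id, urs_taxid in sorted(set(pairs)):
        writer.writerow([edit_id, urs_taxid])


buffer = io.StringIO()
write_linkage(
    [
        ("edit_0002", "URS0000000002_9606"),
        ("edit_0001", "URS0000000001_9606"),
        ("edit_0001", "URS0000000001_9606"),  # duplicate, written once
    ],
    buffer,
)
```

For a real export, `handle` would be a file opened with `open(path, "w", newline="")` rather than an in-memory buffer.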