Bigbed to gtf converter and tests #19809

Open
wants to merge 8 commits into
base: dev

d-callan
Contributor

Adding a bigBed to GTF datatype converter. See galaxyproject/brc-analytics#371.

I leaned on the BED to GFF converter a bit, so that similar formats are treated consistently. Let me know if we prefer a different strategy.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

mvdbeek force-pushed the bigbed-to-gtf-converter branch from 94d38d2 to ddbda74 on March 17, 2025 13:56
temp.bed &&

## Step 2: Convert BED to GFF using the existing converter script
python '$__tool_directory__/bed_to_gff_converter.py' temp.bed temp.gff &&
Member

Is there a reason you're not using genePredToGtf? I don't know if we can trust that 17-year-old hand-written BED to GFF script...

Contributor Author

I was partly trying to be consistent with what was already here. It seemed to me that it would be unexpected for a user to get a very different GTF vs. GFF2 file from the same input. But I was also a little worried that if the incoming file wasn't from UCSC, the genePred intermediate file wouldn't make sense, or might not even work. (I'm not really confident I know what genePred is, to be honest.) And if we did know the incoming data was from UCSC, why not have a genePred to GTF converter?

Member

Agree these are issues, but realistically I think all bigBeds come from UCSC. That said, I think the bigBed to BED converter was the important one that would have an application. Maybe we can wait until there's a reason not to use the pre-built GTFs hosted by UCSC before we proceed here? If there is an actual application, we'll also know whether this will all work out correctly.

2 participants