Bigbed to gtf converter and tests #19809

Open: wants to merge 8 commits into base: dev
clean up output gtf some for bigbedtogtf converter
d-callan committed Mar 13, 2025
commit edc7aa6b38c8623f1c3aa74344d2d8b7641210e1
4 changes: 2 additions & 2 deletions lib/galaxy/datatypes/converters/bigbed_to_gtf_converter.xml
@@ -1,4 +1,4 @@
-<tool id="ucsc_bigbedtogtf" name="Convert from bigBed to GTF format" version="@TOOL_VERSION@+galaxy@SUFFIX_VERSION@" profile="20.01">
+<tool id="bigbedtogtf" name="Convert from bigBed to GTF format" version="@TOOL_VERSION@+galaxy@SUFFIX_VERSION@" profile="20.01">
<description>Convert bigBed to GTF</description>
<macros>
<token name="@TOOL_VERSION@">377</token>
@@ -26,7 +26,7 @@
python '$__tool_directory__/bed_to_gff_converter.py' temp.bed temp.gff &&
Member

Is there a reason you're not using genePredToGtf? I don't know if we can trust that 17-year-old hand-written bed to gff script ...

Contributor Author

I was partly trying to be consistent with what was already here. It seemed to me that it would be unexpected, as a user, to get a very different GTF vs. GFF2 file from the same input. I was also a little worried that if the incoming file wasn't from UCSC, the genePred intermediate file wouldn't make sense or might not even work (I'm not really confident I know what genePred is, to be honest). And if we did know the incoming data was from UCSC, why not have a genePred to GTF converter?

Member

Agreed, these are issues, but realistically I think all bigBeds come from UCSC. That said, I think the bigBed to BED converter was the important one that would have an application. Maybe we can wait until there's a reason not to use the pre-built GTFs hosted by UCSC before we proceed here? If there is an actual application, we'll also know whether this will all work out correctly.
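
For reference, the genePred route raised in this thread could look roughly like the sketch below. It is not what this PR does. The helper name `genepred_route_commands` and the temporary file names are hypothetical; the three executables are the UCSC command-line utilities, and whether `bedToGenePred` accepts BED data that did not originate at UCSC is exactly the open question above.

```python
def genepred_route_commands(bigbed, gtf):
    """Return the UCSC commands for bigBed -> BED -> genePred -> GTF.

    Hypothetical helper: temp.bed and temp.genePred are placeholder names.
    """
    bed = "temp.bed"
    genepred = "temp.genePred"
    return [
        ["bigBedToBed", bigbed, bed],
        ["bedToGenePred", bed, genepred],
        # The literal "file" tells genePredToGtf to read a genePred file
        # instead of looking up a table in a database.
        ["genePredToGtf", "file", genepred, gtf],
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` in order, assuming the UCSC utilities are on `PATH`.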


## Step 3: Convert GFF to GTF (GFF2 is very similar to GTF, just need to adjust the last column)
-awk -F '\t' '{
+awk -F '\t' 'NR > 3 && $1 !~ /^#/ {
if ($1 !~ /^#/) {
split($9, attrs, ";");
feature_id = "";
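
To make the effect of the new awk condition concrete, here is a minimal Python sketch of the same filter. The condition `NR > 3 && $1 !~ /^#/` comes from the diff; everything else is an assumption. In particular, the function name `gff2_to_gtf` is hypothetical, and the `gene_<chrom>_<start>_<end>` ID scheme is inferred from the records in test-data/output.gtf, since the rest of the awk body is collapsed in this view.

```python
def gff2_to_gtf(lines, header_lines=3):
    """Yield GTF records, skipping the header lines and any comment lines.

    Assumes each record has 9 tab-separated columns.
    """
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # Mirrors the awk condition: NR > 3 && $1 !~ /^#/
        if number <= header_lines or fields[0].startswith("#"):
            continue
        # ASSUMPTION: ID scheme inferred from test-data/output.gtf,
        # not taken from the (collapsed) awk body.
        feature_id = f"gene_{fields[0]}_{fields[3]}_{fields[4]}"
        fields[8] = f'gene_id "{feature_id}"; transcript_id "{feature_id}";'
        yield "\t".join(fields)
```

With this filter the header lines never reach the attribute rewrite, which is consistent with the three lines removed from test-data/output.gtf in this commit.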
3 changes: 0 additions & 3 deletions test-data/output.gtf
@@ -1,6 +1,3 @@
-##gff-version 2
-##bed_to_gff_converter.py
-gene_id "gene___"; transcript_id "gene___";
chr7 bed2gff uc003sii.2 54029 73584 0 - . gene_id "gene_chr7_54029_73584"; transcript_id "gene_chr7_54029_73584";
chr7 bed2gff uc010krx.1 60329 61569 0 - . gene_id "gene_chr7_60329_61569"; transcript_id "gene_chr7_60329_61569";
chr7 bed2gff uc003sij.2 62968 63529 0 - . gene_id "gene_chr7_62968_63529"; transcript_id "gene_chr7_62968_63529";