-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new BAM example for full PG entry #245
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the intentions of this, but handling this properly is more complex than what should be shown in an example. I'd like to first work on program record chaining before adding something like this. What you have now can certainly be used as a simple workaround.
_ => unreachable!(), | ||
}; | ||
// the command-line to insert into the CL tag | ||
let cmd_str = env::args().collect::<Vec<_>>().join(" "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra care needs to be taken for arguments containing tab characters.
let program = if let Some(last_pg) = header.programs().keys().last() { | ||
program.insert(pp, last_pg.clone()) | ||
} else { | ||
program | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Header records are not guaranteed to be ordered by recency. This would be incorrect for, e.g.,
@PG ID:pg1 PP:pg0
@PG ID:pg0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so this is what you mean by program-chaining--you want to parse the order during Program
construction?
I modified this to use the defined tags in |
The SAM header programs are now wrapped by noodles/noodles-sam/src/header/programs.rs Lines 52 to 69 in f23008d
Note how pg0 is the root of the chain and pg1 links to pg0 via the previous program ID ( noodles/noodles-sam/src/header/programs.rs Lines 229 to 291 in f23008d
I added creating the self program record in the reheader examples, e.g., noodles/noodles-bam/examples/bam_reheader.rs Lines 15 to 23 in f23008d
Thanks for initial exploration of this issue! |
This is an example script that adds a full PG tag to the header of a BAM file (
sample.bam
is the output ofbam_write
):I spent some time yesterday figuring out how to do this, and I figured the example might be useful. I'm also opening the PR to see if there are any improvements you can see.
The name could be more clear, perhaps
bam_add_pg_entry
would be better. People might expect "tag" to refer to the reads.Constructing the
PN
,PP
,VN
, andCL
tags seems cumbersome. Maybe these should be built-in tags (I believe they're part of the spec). Or if there's a way to construct them more cleanly I'd be happy to use that.