Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extra tools #21

Merged
merged 22 commits into from
Mar 6, 2020
Merged

extra tools #21

merged 22 commits into from
Mar 6, 2020

Conversation

matthdsm
Copy link
Contributor

No description provided.

@matthdsm
Copy link
Contributor Author

matthdsm commented Feb 27, 2020

Hi,

I'm probably missing some things for the new tools to be enables, could you give this PR a quick review?

Nevermind!
Tools check out and generate valid cwl. Ready to merge IMO

Thanks
M

@illusional
Copy link
Member

This is looking great!

I'll highlight one feature: secondaries_present_as in the ToolOutput. Essentially, we made the decision for bam indexes to be .bam.bai, but GATK will only recognise ^.bai. So the secondaries_present_as is a way for Janis to automatically tell CWL / WDL to localise a file as the required extension (or pick this back up).

Unfortunately, there's a problem in CWLTool (untested in other engines) that may hinder your ability to run these workflows when collecting these outputs.

I realised that the ToolArgument wasn't in the docs, but it's here now.

A couple of minor things I noticed in the scramble wrapper:

  • The tag for quay is 1.14.12--h244ad75_0 to be quay.io/biocontainers/staden_io_lib:1.14.12--h244ad75_0
  • A ToolInput has a default= argument. Usually, the only argument a DataType (like String) takes is optional=True/False, for example String(optional=True).
  • Within the arguments method, you should return a ToolArgument which has the structure (ToolArgument(value, prefix=none, position=None)).
  • The tool arguments should be for values that should never change. For example, if you never wanted to change the version or compressions, you could use:
    [ # ...other args
        ToolArgument("3.0", prefix="-V"),
        ToolArgument("-9")
    ]

@matthdsm
Copy link
Contributor Author

Great, thanks for commenting.
You can expect a fix shortly.

Cheers
M

@matthdsm
Copy link
Contributor Author

I think we're good to go now!

Cheers
M

@matthdsm
Copy link
Contributor Author

I'm trying to fix input errors in the new Gatk4MergeBamAlignment tool.
The ToolInput aligned can be both an array of either Samof Bam inputs. How do I go about declaring that? Atm I have the following

            ToolInput(
                "aligned",
                Array(Sam(), Bam()),
                prefix="--ALIGNED_BAM",
                prefix_applies_to_all_elements=True,
                doc="SAM or BAM file(s) with alignment data.",
                position=10,
            ),

But this still yields
2020-02-27T15:08:25 [CRITICAL]: Mismatch of types when joining 'generateAlignedBam.out' to 'mergeBam.bam': stdout<SAM> -/→ Optional<Array<BAM>>

Thanks for the help
Cheers
M

@illusional
Copy link
Member

Hey @matthdsm, at the moment you can't (but proposed in Unions in Janis: PMCC-BioinformaticsCore/janis#10). I'd recommend just letting it be an Array of Bams. Janis will tell you there's a type connection error but it should work anyway.

(Sort of like TypeScript errors, there's an error but it'll finish the transpile)

@matthdsm
Copy link
Contributor Author

matthdsm commented Mar 5, 2020

I suppose a squash and merge would be best here 😅

@illusional
Copy link
Member

Thanks @matthdsm for the contribution, this is awesome!!

@illusional illusional merged commit 2844e5a into PMCC-BioinformaticsCore:master Mar 6, 2020
@matthdsm
Copy link
Contributor Author

matthdsm commented Mar 6, 2020

No problem! More stuff to come!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants