Feedback after reading tutorials #51
Changes from all commits
dd6cb91
c50d220
7042fdc
8d26050
e0f3145
f0814d5
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
# Tutorial 1 - Building a Workflow | ||
|
||
> This tutorial uses directories created in [Tutorial 0](https://janis.readthedocs.io/en/latest/tutorials/tutorial0.html). | ||
|
||
In this stage, we're going to build a simple workflow to align short reads of DNA. | ||
|
||
1. Start with a pair of compressed `FASTQ` files, | ||
|
@@ -15,16 +17,16 @@ These tools already exist within the Janis Tool Registry, you can see their docu | |
|
||
## Preparation | ||
|
||
To prepare for this tutorial, we're going to create a folder and download some data: | ||
To prepare for this tutorial, we're going to need to download some data first: | ||
|
||
```bash | ||
mkdir janis-tutorials && cd janis-tutorials | ||
cd ~/janis/janis-tutorials | ||
|
||
# If WGET is installed | ||
wget -q -O- "https://github.com/PMCC-BioinformaticsCore/janis-workshops/raw/master/janis-data.tar" | tar -xz | ||
wget -q -O- "https://github.com/PMCC-BioinformaticsCore/janis-workshops/raw/master/janis-data.tar" | tar -x | ||
I think the
||
|
||
# If CURL is installed | ||
curl -Ls "https://github.com/PMCC-BioinformaticsCore/janis-workshops/raw/master/janis-data.tar" | tar -xz | ||
curl -Ls "https://github.com/PMCC-BioinformaticsCore/janis-workshops/raw/master/janis-data.tar" | tar -x | ||
``` | ||
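The change from `tar -xz` to `tar -x` only makes a difference when the archive is not gzip-compressed. As a minimal sketch (not part of the tutorial), the check below compares the first two bytes of a byte string with the gzip magic number `0x1f 0x8b`. The helper names `is_gzip` and `make_tar` and the in-memory demo archives are illustrative assumptions; nothing here asserts the actual format of `janis-data.tar`.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(data: bytes) -> bool:
    """Return True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def make_tar(payload: bytes, name: str = "hello.txt") -> bytes:
    """Build a small uncompressed tar archive in memory (demo data only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


plain_tar = make_tar(b"hello\n")        # what `tar -x` expects
gzipped_tar = gzip.compress(plain_tar)  # what `tar -xz` expects

print(is_gzip(plain_tar))    # False
print(is_gzip(gzipped_tar))  # True
```

Running the same check on the first bytes of a downloaded file would indicate which flag is appropriate. Python's `tarfile.open(..., mode="r:*")` performs a similar detection transparently.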
|
||
|
||
|
@@ -60,7 +62,7 @@ from janis_bioinformatics.data_types import FastqGzPairedEnd, FastaWithDict | |
|
||
### Tools | ||
|
||
We've discussed the tools we're going to use. The documentation for each tool has a row in the tbale caled "Python" that gives you the import statement. This is how we'll import how tools: | ||
We've discussed the tools we're going to use. The documentation for each tool has a row in the table called "Python" that gives you the import statement. This is how we'll import these tools: | |
|
||
|
||
```python | ||
|
@@ -129,7 +131,7 @@ Workflow.step( | |
) | ||
``` | ||
|
||
We provide a identifier for the step (unique amongst the other nodes in the workflow), and intialise our tool, passing our inputs of the step as parameters. | ||
We provide an identifier for the step (unique amongst the other nodes in the workflow), and initialise our tool, passing our inputs of the step as parameters. | |
|
||
We can refer to an input (or previous result) using the dot notation. For example, to refer to the `fastq` input, we can use `w.fastq`. | ||
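To illustrate the idea behind the dot notation, here is a toy sketch of a builder that resolves node identifiers as attributes and rejects duplicate identifiers. `MiniWorkflow` and the placeholder tool and type strings are hypothetical names invented for this example; this is an assumption about how such a lookup could work, not the Janis implementation.

```python
class MiniWorkflow:
    """Toy stand-in for a workflow builder; NOT the Janis implementation."""

    def __init__(self, name):
        self.name = name
        self._nodes = {}  # identifier -> node description

    def input(self, identifier, datatype):
        self._register(identifier, ("input", datatype))

    def step(self, identifier, tool, **connections):
        self._register(identifier, ("step", tool, connections))

    def _register(self, identifier, node):
        # Identifiers must be unique amongst all nodes in the workflow.
        if identifier in self._nodes:
            raise ValueError(f"Duplicate identifier: {identifier}")
        self._nodes[identifier] = node

    def __getattr__(self, identifier):
        # Only called when normal attribute lookup fails, so `name` and
        # `_nodes` never reach this method once they are set.
        nodes = self.__dict__.get("_nodes", {})
        if identifier in nodes:
            return nodes[identifier]
        raise AttributeError(identifier)


w = MiniWorkflow("alignment")
w.input("fastq", "paired-fastq-placeholder")
# `w.fastq` resolves to the node registered under the identifier "fastq".
w.step("bwamem", "bwa-mem-placeholder", reads=w.fastq)

print(w.fastq)      # ('input', 'paired-fastq-placeholder')
print(w.bwamem[2])  # {'reads': ('input', 'paired-fastq-placeholder')}
```

Keeping every node in one dictionary is what makes the uniqueness requirement natural in this sketch: an input and a step cannot share an identifier because both would map to the same attribute name.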
|
||
|
@@ -212,7 +214,7 @@ w.output("out", source=w.sortsam.out) | |
|
||
## Workflow + Translation | ||
|
||
Hopefully you have a workflow that looks like the following! | ||
Hopefully now you have a workflow that looks like the following! | ||
|
||
```python | ||
from janis_core import WorkflowBuilder, String | ||
|
@@ -272,33 +274,30 @@ janis translate tools/alignment.py wdl | |
We'll run the workflow against the current directory. | ||
|
||
```bash | ||
janis run -o . --engine cwltool \ | ||
janis run -o tutorial1 --engine cwltool \ | ||
I think the intention is to keep tutorial 1 files within the
||
tools/alignment.py \ | ||
--fastq data/BRCA1_R*.fastq.gz \ | ||
--reference reference/hg38-brca1.fasta \ | ||
--sample_name NA12878 \ | ||
--read_group "@RG\tID:NA12878\tSM:NA12878\tLB:NA12878\tPL:ILLUMINA" | ||
``` | ||
|
||
After the workflow has run, you'll see the outputs in the current directory: | ||
After the workflow has run, you'll see the outputs in the tutorial1 directory: | ||
|
||
```bash | ||
ls | ||
ls ~/janis/janis-tutorials/tutorial1 | ||
|
||
# drwxr-xr-x mfranklin 1677682026 160B data | ||
# drwxr-xr-x mfranklin 1677682026 256B janis | ||
# -rw-r--r-- mfranklin wheel 2.7M out.bam | ||
# -rw-r--r-- mfranklin wheel 296B out.bam.bai | ||
# drwxr-xr-x mfranklin 1677682026 320B reference | ||
# drwxr-xr-x mfranklin 1677682026 128B tools | ||
``` | ||
|
||
### OPTIONAL: Run with Cromwell | ||
|
||
If you have `java` installed, Janis can run the workflow in the Cromwell execution engine by using the `--engine cromwell` parameter: | |
|
||
```bash | ||
janis run -o run-with-cromwell --engine cromwell \ | ||
janis run -o tutorial1-run-with-cromwell --engine cromwell \ | ||
tools/alignment.py \ | ||
--fastq data/BRCA1_R*.fastq.gz \ | ||
--reference reference/hg38-brca1.fasta \ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -73,7 +73,7 @@ ToolName = CommandToolBuilder( | |
Let's start by creating a file with this template inside a second output directory: | ||
|
||
```bash | ||
mkdir -p tools | ||
cd ~/janis/janis-tutorials | ||
vim tools/samtoolsflagstat.py | ||
``` | ||
|
||
|
@@ -280,13 +280,13 @@ Jobs: | |
[✓] samtoolsflagstat (N/A) | ||
|
||
Outputs: | ||
- stats: $HOME/janis-tutorials/tutorial2/stats.txt | ||
- stats: $HOME/janis-tutorials/tutorial2/stats | ||
There was no
||
``` | ||
|
||
Janis (and CWLTool) said the tool executed correctly, let's check the output file: | ||
|
||
```bash | ||
cat tutorial2/stats.txt | ||
cat tutorial2/stats | ||
``` | ||
|
||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -33,7 +33,6 @@ | |
modules = ["janis_assistant." + p for p in sorted(find_packages("./janis_assistant"))] | ||
|
||
|
||
fixed_unix_version = f"janis-pipelines.unix==" + JANIS_UNIX_VERSION | ||
Already defined up in the same code block with the same value ☝️
||
setup( | ||
name="janis pipelines", | ||
version=__version__, | ||
|
Creating `janis-tutorials` in tutorial 0. Assuming users will follow the tutorials in order (added a note to each about it, if not already there).