Merge pull request #16 from PMCC-BioinformaticsCore/tutorials-fix
Update tutorials to 0.9.x
illusional authored Mar 16, 2020
2 parents 7c89597 + c739018 commit f4ce511
Showing 3 changed files with 355 additions and 200 deletions.
119 changes: 78 additions & 41 deletions docs/tutorials/tutorial0.md
@@ -1,17 +1,15 @@
# Tutorial 0 - Introduction to Janis

Welcome to the introduction for Janis! This tutorial introduces Janis and installs it on your local computer, ready for building your first workflow.
Janis is a workflow framework that uses Python to construct a declarative workflow. It has a simple workflow API within Python that you use to build your workflow. Janis converts your pipeline to the Common Workflow Language (CWL) and Workflow Description Language (WDL) for execution, and it’s also great for publishing and archiving.

Janis is a workflow framework that uses Python to construct a declarative workflow. It has a simple workflow API within Python that you use to declare your workflow. Janis can convert your pipeline to the Common Workflow Language (CWL) and Workflow Description Language (WDL) for execution, but it's also great for publishing and archiving.
Janis was designed with a few points in mind:

Janis was designed with a few priorities:

- Workflows should be easy to build.
- Workflows and tools must be easily shared (portable).
- Execution must be able to occur on HPCs and cloud environments.
- Workflows should be easy to build,
- Workflows and tools must be easily shared (portable),
- Workflows should be able to execute on HPCs and cloud environments.
- Workflows should be reproducible and re-runnable.

Janis uses an *abstracted execution environment*, which removes the shared file system in favour of you specifying all the files you need up front and passing them around as a File object. This allows the same workflow to be executable on your local machine, HPCs and cloud, and we let the `execution engine` handle moving our files. This also means that we can use file systems like ``S3``, ``GCS``, ``FTP`` and more without any code changes.
Janis uses an *abstracted execution environment*, which removes the shared file system in favour of you specifying all the files you need up front and passing them around as a File object. This allows the same workflow to be executable on your local machine, HPCs and cloud, and we let the `execution engine` handle moving our files. This also means that we can use file systems like ``S3``, ``GCS``, ``FTP`` and more without any changes to our workflow.

> Instructions for setting up Janis on a compute cluster are under construction.
@@ -46,14 +44,16 @@ We'll install Janis in a virtual environment as it preserves versioning of Janis

```bash
janis -v
# -------------------- -------
# janis-core v0.8.0
# janis-assistant v0.8.0
# janis-unix v0.8.0
# janis-bioinformatics v0.8.0
# -------------------- -------
# -------------------- ------
# janis-core v0.9.7
# janis-assistant v0.9.9
# janis-unix v0.9.0
# janis-bioinformatics v0.9.5
# janis-pipelines v0.9.2
# janis-templates v0.9.4
# -------------------- ------
```

### Installing CWLTool

[CWLTool](https://github.com/common-workflow-language/cwltool) is a reference workflow engine for the Common Workflow Language. Janis can run your workflow using CWLTool and collect the results. For more information about which engines Janis supports, visit the [Engine Support](https://janis.readthedocs.io/en/latest/references/engines.html) page.
@@ -72,56 +72,93 @@ cwltool --version

## Running an example workflow with Janis

First off, let's create a directory to store our Janis workflows. This could be anywhere you want, but for now we'll put it at `$HOME/janis/`:

```bash
mkdir ~/janis
cd ~/janis
```

You can test run an example workflow with Janis and CWLTool with the following command:

```bash
janis run --engine cwltool --stay-connected hello
janis run --engine cwltool -o tutorial0 hello
```

Usually Janis starts a separate process to run and manage the workflow. By including the `--stay-connected` parameter, Janis and the engine are connected, so you'll see any errors that occur. When you exit the Janis process, this will also exit the engine.
You'll see the `INFO` statements from CWLTool in the terminal.
If this works successfully, you can omit the `--stay-connected` param and you'll be presented with the progress screen as your workflow completes. Some things to note:
> To see all logs, add `-d` so that the command becomes:
> ```bash
> janis -d run --engine cwltool -o tutorial0 hello
> ```
- `WID` - the Janis identifier of your workflow.
- `Task Dir` - where your workflow, output files and logs are.
At the start, we see these two lines in our output:
```
WID: df5daa
EngId: df5daa
Name: hello
Engine: cwltool

Task Dir: $HOME/janis/hello/20191115_105042_df5daa/
Exec Dir: None
2020-03-16T18:49:08 [INFO]: Starting task with id = 'd909df'
d909df
```
Status: Completed
Duration: 6s
Start: 2019-11-14T23:50:42.940196+00:00
Finish: 2019-11-14T23:50:48.453133+00:00
Updated: Just now (2019-11-14T23:50:53+00:00)
This is our workflow ID (wid) and is one way we can refer to our workflow.
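
If you want to capture that ID inside a script, one option is to pull it out of the log line. The snippet below is a minimal sketch that assumes the log line keeps the `Starting task with id = '...'` shape shown above; `extract_wid` is a hypothetical helper name and is not part of Janis.

```shell
#!/bin/sh
# Hypothetical helper (not part of Janis): read log text on stdin and print
# the ID from the first line shaped like: Starting task with id = 'd909df'
extract_wid() {
    sed -n "s/.*Starting task with id = '\([^']*\)'.*/\1/p" | head -n 1
}

# Sample log line copied from the run above.
sample="2020-03-16T18:49:08 [INFO]: Starting task with id = 'd909df'"
wid=$(printf '%s\n' "$sample" | extract_wid)
echo "$wid"
```

Lines that do not match the pattern produce no output, so an empty result signals that no ID was found.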
Jobs:
[] hello (2s)
After the workflow has completed (or in a different window), you can see the progress of this workflow with:
Outputs:
- out: $HOME/janis/execution/hello/20191115_105042_df5daa/output/out
```bash
janis watch d909df
# WID: d909df
# EngId: d909df
# Name: hello
# Engine: cwltool
#
# Task Dir: $HOME/janis/tutorial0
# Exec Dir: None
#
# Status: Completed
# Duration: 4s
# Start: 2020-03-16T07:49:08.367981+00:00
# Finish: 2020-03-16T07:49:11.881006+00:00
# Updated: 3h:51m:54s ago (2020-03-16T07:49:11+00:00)
#
# Jobs:
# [✓] hello (1s)
#
# Outputs:
# - out: $HOME/janis/tutorial0/out
```
The workflow has a single output, `out`. Printing it with `cat` gives:
```bash
cat $HOME/janis/execution/hello/20191115_105042_df5daa/output/out
cat $HOME/janis/tutorial0/out
# Hello, World
```
### Overriding an input
The workflow `hello` has one input `inp`. We can override this input by passing `--inp $value` at the end of our run statement, e.g.:
The workflow `hello` has one input `inp`. We can override this input by passing `--inp $value` at the end of our run statement. Note the structure for workflow parameters and parameter overriding:
```
janis run <run options> workflowname <workflow inputs>
```
We can run the following command:
```bash
janis run --engine cwltool -o tutorial0-override hello --inp "Hello, $(whoami)"
# out: Hello, mfranklin
```
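
The ordering matters because everything before the workflow name is read as a run option and everything after it as a workflow input. As a rough sketch, the script below assembles the same command from variables and prints it one argument per line instead of running it; the variable names and the `Hello, Janis` value are illustrative assumptions.

```shell
#!/bin/sh
# Print (rather than run) a janis command so the argument order is visible.
# Run options go before the workflow name; workflow inputs go after it.
engine="cwltool"                  # run option: which engine to use
output_dir="tutorial0-override"   # run option: output directory
workflow="hello"                  # workflow name
greeting="Hello, Janis"           # illustrative value for the workflow input `inp`

set -- janis run --engine "$engine" -o "$output_dir" "$workflow" --inp "$greeting"

# Record the argument count and the final argument for inspection.
argc=$#
for last_arg in "$@"; do :; done

# Show one argument per line; the quoted greeting stays a single argument.
printf '%s\n' "$@"
```

To launch the workflow for real, replace the final `printf` line with `"$@"`.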
### Running Janis in the background
You may want to run Janis in the background as its own process. You could do this with `nohup [command] &`; however, we can also run Janis with the `--background` flag and capture the workflow ID to watch, e.g.:

```bash
janis run --engine cwltool hello --inp "Hello, yourname"
# out: Hello, yourname
wid=$(janis run \
--background --engine cwltool -o tutorial0-background \
hello \
--inp "Run in background")
janis watch $wid
```

