Clarify and simplify README #38
Merged
Commits
- 54e34b9 Clarify and simplify README (asmacdo)
- 52ebedf container runtime vars should default, not clobber (asmacdo)
- 3d84fbc Document latex dependencies and container (asmacdo)
- 6565050 Clarify that pdfs should always be downloaded (asmacdo)
- 0d6b681 Changes for chymera comments (asmacdo)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,51 +17,88 @@ cd opfvta-replication-2023 | |
|
||
## How to re-run | ||
|
||
**Note:** *If the `SCRATCH_PATH` variable is not defined for the `make` invocation, all intermediary results (approx. 400GB) will be stored in the `scratch/` directory, which is inside the directory of the repository. | ||
This might go beyond the available space on the respective partition, crashing the workflow and possibly other programs. | ||
It is advisable to check space availability on your partition before full reexecution, and if sufficient space is unavailable specify a `SCRATCH_PATH` on a partition with more available space.* | ||
|
||
There are 2 distinct phases of executing this study, which differ strongly in both time and space requirements. | ||
While they are hierarchically related, the results of the first step are version tracked, meaning that you can choose to only run the latter. | ||
|
||
### I. Reexecuting the OPFVTA Article | ||
|
||
This is by far the most time consuming and resource-intensive step as it re-computes all work that was required to generate the original OPFVTA article, starting from the bare raw data. | ||
The requirements of this step are therefore the raw data (study data and mouse brain templates), and the article code, which are included in this repository as submodules and whose content can be fetched via a dedicated `make` target: | ||
::: warning | ||
Warnings: | ||
1. We estimate that the analysis requires more than 500GB of disk space, about 400GB of which will be stored in a scratch directory, which is `./scratch/` by default and can be configured with the `SCRATCH_PATH` variable. | ||
1. The analysis limits its own RAM usage so that it can run on less powerful systems. | ||
1. Reexecuting the computation and the article is time-consuming and resource-intensive; we recommend using a tool such as `tmux` or `screen` to keep long-running processes alive. | ||
::: | ||
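
A minimal sketch of the free-space check advised above, assuming a POSIX shell with `df` and `awk` available. The helper name `has_free_gb` and the `/path/to/scratch` placeholder are illustrative and not part of this repository:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the partition holding DIR has at least
# NEED_GB gigabytes available.
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P prints one POSIX-format line per filesystem; -k reports
    # 1024-byte blocks. Column 4 of the second line is "Available".
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb /path/to/scratch 400 && make analysis-oci SCRATCH_PATH=/path/to/scratch` would start the analysis only if roughly 400GB are available on that partition. Passing the variable on the `make` command line is one way to define it for the invocation.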
|
||
First, retrieve the data and other large files: | ||
|
||
```console | ||
make submodule-data | ||
``` | ||
|
||
Once the required content is fetched, you can reexecute the OPFVTA article via either of the following commands, depending on the desired platform: | ||
Once the required content has been fetched, you can reexecute the OPFVTA article via `singularity` or `oci` containers. | ||
This step generates intermediate results in the scratch directory, which are not preserved by default, as configured in `scratch/.gitignore`. | ||
The final result is a PDF article and its associated elements (mainly volumetric binary data, `.nii.gz` files) which will be stored in a datestamped and annotated directory under `outputs/`. | ||
Most large files, including the results, are stored and versioned via `git-annex` and are therefore present in this repository; your output can also be saved and recorded. | ||
|
||
For apptainer/singularity: | ||
|
||
```console | ||
make analysis-singularity | ||
``` | ||
_or_ | ||
|
||
With docker or podman, you can execute the analysis inside an OCI container. | ||
|
||
```console | ||
make analysis-oci | ||
``` | ||
|
||
This produces a PDF article and its associated elements (mainly volumetric binary data, `.nii.gz` files) which will be stored in a datestamped and annotated directory under `outputs/`. | ||
A number of outputs are recorded via `git-annex` and therefore present in this repository, and your output can also be saved and recorded. | ||
|
||
### II. Reexecuting the Meta-Article | ||
|
||
To avoid confusion, we use the term 'article' to refer to a version of the OPFVTA article, and 'meta-article' to refer to the paper regarding the reexecution process and findings. | ||
|
||
### II. Reexecuting the Meta-Article | ||
Generation of the meta-article uses files generated by the OPFVTA analysis, which are expected to be in the `outputs/` directory. | ||
Prior to generating the meta-article, `outputs/` must contain the data from previous analyses, which is not locally available by default. | ||
|
||
This uses the aforementioned PDF files in `outputs/` in order to generate dynamic graphical elements, and subsequently compiles them alongside the document text via LaTeX into a novel and fully distinct article PDF. | ||
To avoid confusion, please make sure you understand this is *not* another version of the OPFVTA article but a fully different text. | ||
Note: Regenerating the OPFVTA article will create an additional PDF, but the previous PDFs are still required for comparison. | ||
|
||
This phase requires fetching the actual binary content for the myriad PDF outputs of OPFVTA reexecution, and then running the `make article` target: | ||
To fetch the OPFVTA analysis outputs: | ||
|
||
```console | ||
datalad get outputs/*/article.pdf | ||
make article | ||
``` | ||
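
Before running `make article`, it can help to confirm that the PDFs were actually fetched. The sketch below assumes the default `git-annex` layout, in which a file whose content is not present locally is a dangling symlink; in unlocked or adjusted-branch checkouts this check does not apply. The helper name `missing_pdfs` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print every DIR/*/article.pdf that is a symlink
# whose target does not exist, i.e. annexed content that was not fetched.
missing_pdfs() {
    for f in "$1"/*/article.pdf; do
        # -L: the path is a symlink; ! -e: its target is absent.
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `missing_pdfs outputs` after `datalad get outputs/*/article.pdf` should print nothing if every article PDF is available locally.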
|
||
A finer point here is that the dynamic elements of this article can be cached. | ||
If you are not merely trying to get a PDF to read or working on the human-readable text — but instead working on the figure-generating code — it is advisable to always clean the dynamic elements in between re-making the article via the dedicated target. | ||
Finally, we generate new graphical elements and compile the text via LaTeX into a novel meta-article PDF. | ||
|
||
The meta-article can then be generated by a container with all of the dependencies preinstalled using: | ||
|
||
```console | ||
make container-article | ||
``` | ||
|
||
_or_ | ||
|
||
If you prefer to run the generation outside of a container, you will need to install the following dependencies (we suggest using your distribution's package manager; the names below are Debian package names): | ||
- LaTeX | ||
- biber | ||
- datalad | ||
- diff-pdf | ||
- graphviz | ||
- matplotlib | ||
- pandas | ||
- seaborn | ||
- sklearn | ||
- statsmodels | ||
- yaml | ||
|
||
You will also need to install the `sourceserifpro` font using `tlmgr`. | ||
see other |
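
A quick way to see which command-line tools are still missing before building outside the container is sketched below. The helper name `missing_cmds` is hypothetical, and the executable names in the usage note are assumptions about what the listed packages install; the Python packages are not covered by this check:

```shell
#!/bin/sh
# Hypothetical helper: print each command from the argument list that
# cannot be found on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

For example, `missing_cmds pdflatex biber datalad diff-pdf dot tlmgr` lists whichever of those executables are not installed; `pdflatex` and `dot` are assumed here to stand in for the LaTeX and graphviz entries.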
||
|
||
```console | ||
make container-article | ||
``` | ||
|
||
#### Cleaning up between runs | ||
|
||
The steps are designed to be idempotent, and some dynamically generated components will not be regenerated for subsequent runs. | ||
If you are not merely trying to get a PDF to read or working on the human-readable text — but instead working on the figure-generating code — it is advisable to always deep-clean the dynamic elements in between re-making the article. | ||
|
||
```console | ||
make article-clean && make article | ||
``` | ||
What does this do differently? `=` here won't overwrite the variable if you set it.

Interesting, for me `THE_VAR` is `foo`, even if I `export THE_VAR=somethingelse`. When I added the `?=` it started working.

strange, ok, let's leave it like that then.
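
The behaviour discussed in this thread can be reproduced with a throwaway Makefile. This is a minimal sketch, assuming GNU make and `mktemp` are installed; `make_sees` is a hypothetical helper, and `THE_VAR` and `foo` are the placeholder names from the comment above, not variables from this repository's Makefile:

```shell
#!/bin/sh
# Hypothetical helper: print the value make sees for THE_VAR when the
# Makefile assigns it with operator $1 ("=" or "?=") and the environment
# provides THE_VAR=$2.
make_sees() {
    op=$1
    env_value=$2
    tmp=$(mktemp -d) || return 1
    # Recipe lines must begin with a tab, hence \t in the printf format.
    printf 'THE_VAR %s foo\nshow:\n\t@echo $(THE_VAR)\n' "$op" > "$tmp/Makefile"
    # The prefix assignment places THE_VAR in make's environment only.
    THE_VAR=$env_value make -s -f "$tmp/Makefile" show
    rm -rf "$tmp"
}
```

With GNU make, `make_sees '=' somethingelse` prints `foo`, because a plain assignment in the Makefile takes precedence over an exported environment variable (unless `make -e` is used), while `make_sees '?=' somethingelse` prints `somethingelse`, because `?=` only assigns when the variable is not already defined. Variables given on the `make` command line override both forms.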