This repository contains the FedDESeq2 package, a Python package for Federated Differential Expression Analysis based on PyDESeq2 which is itself a python implementation of DESeq2.
You can install the package from PyPI using the following command:
pip install fedpydeseq2
Start by cloning this repository
git clone git@github.com:owkin/fedpydeseq2.git
conda create -n fedpydeseq2 python=3.11 # or a version compatible
conda activate fedpydeseq2
Run
conda install pip
pip install poetry==1.8.2
and test the installation with poetry --version
.
cd
to the root of the repository and run
poetry install --with linting,testing
To download the data, cd
to the root of the repository run this command.
fedpydeseq2-download-data --raw_data_output_path data/raw
This way, you create a data/raw
subdirectory in the directory containing all the necessary data. If you want to modify
the location of this raw data, you can in the following way. Run this command instead:
fedpydeseq2-download-data --raw_data_output_path MY_RAW_PATH
And create a file in the tests
directory named paths.json
containing
- A
raw_data
field with the path to the raw dataMY_RAW_PATH
- An optional
assets_tcga
field with the path to the directory containing theopener.py
file and its description (by default present in the fedpydeseq2_datasets module, so no need to specify this unless you need to modify the opener); - An optional
processed_data
field with the path to the directory where you want to save processed data. Note that this is used only if you want to run tests locally without reprocessing the data during each test session (test marked with thelocal
marker).
Still in the root of the repository, run
pre-commit install
You are now ready to contribute.
Tests are run using a self-hosted runner. To add a self-hosted runner, instantiate the machine
you want to use as a runner, go to the repository settings, then to the Actions
tab, and click on
Add runner
. Follow the instructions to install the runner on the machine you want
to use as a self-hosted runner.
Make sure to label the self-hosted runner with the label "fedpydeseq2-self-hosted" so that the CI workflow can find it.
The docker mode is only tested manually. To test it, first run poetry build
in order to create a wheel in the dist
folder. Then launch in a tmux the
following:
pytest -m "docker" -s
The -s
option enables to print all the logs/outputs continuously. Otherwise, these
outputs appear only once the test is done. As the test takes time, it's better to
print them continuously.
To run a compute plan on an environment with the substra front-end, you need first to generate token in each of the
substra nodes. Then you need to duplicate
credentials-template.yaml
into a new file
credentials.yaml and fill in the
tokens. You should not need to rebuild the wheel manually by running
poetry build
as the script will try to do it for you, but watch out for
related error message when executing the file.