flexynesis

Flexynesis is a deep-learning-based suite for integrating multi-omics bulk sequencing data, with a focus on (pre-)clinical endpoint prediction. The package includes several types of deep learning architectures, such as simple fully connected networks and supervised variational autoencoders, offers different options for data layer fusion, and automates feature selection and hyperparameter optimisation. The tools are continuously benchmarked on publicly available datasets, mostly related to the study of cancer. Applications of the methods we develop include drug response modeling in cancer patients or preclinical models (such as cell lines and patient-derived xenografts), cancer subtype prediction, and any other clinically relevant outcome prediction that can be formulated as a regression or classification problem.

Documentation

Detailed documentation of the classes and functions in this repository can be found here.

Benchmarks

For the latest benchmark results see: https://bimsbstatic.mdc-berlin.de/akalin/buyar/flexynesis-benchmark-datasets/dashboard.html

The code for the benchmarking pipeline is at: https://github.com/BIMSBbioinfo/flexynesis-benchmarks

Quick Start

# install 
git clone https://github.com/BIMSBbioinfo/flexynesis.git
cd flexynesis
conda create --name flexynesis --file spec-file.txt
conda activate flexynesis
pip install -e .

# test the installation
curl -L -o dataset1.tgz https://bimsbstatic.mdc-berlin.de/akalin/buyar/flexynesis-benchmark-datasets/dataset1.tgz
tar -xzvf dataset1.tgz

flexynesis --data_path dataset1 \
    --model_class DirectPred \
    --target_variables Erlotinib \
    --fusion_type early \
    --hpo_iter 1 \
    --features_min 50 \
    --features_top_percentile 5 \
    --log_transform False \
    --data_types gex,cnv \
    --outdir . \
    --prefix erlotinib_direct \
    --early_stop_patience 3 \
    --use_loss_weighting False \
    --evaluate_baseline_performance False

Input Dataset Structure

InputFolder/
|-- train
|   |-- omics1.csv
|   |-- omics2.csv
|   |-- ...
|   |-- clin.csv
|-- test
|   |-- omics1.csv
|   |-- omics2.csv
|   |-- ...
|   |-- clin.csv
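
Before starting a run, it can help to confirm that a folder follows the layout above. The following is a minimal sketch using only the Python standard library; `check_layout` is a hypothetical helper written for this README, not part of the flexynesis API, and it assumes that every `.csv` file other than `clin.csv` is an omics table.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of problems found in a train/test input folder.

    Hypothetical helper for illustration; not part of flexynesis.
    """
    problems = []
    for split in ("train", "test"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split}")
            continue
        if not (split_dir / "clin.csv").is_file():
            problems.append(f"missing file: {split}/clin.csv")
        # Assumption: every other .csv file in the split is an omics table.
        omics = [p for p in split_dir.glob("*.csv") if p.name != "clin.csv"]
        if not omics:
            problems.append(f"no omics tables in {split}")
    return problems


# An empty list means the folder matches the expected layout.
print(check_layout("dataset1"))
```

The function only checks that the expected files exist; it does not inspect their contents.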

File contents

clin.csv

clin.csv contains the sample metadata. The first column contains unique sample identifiers. The other columns contain sample-associated clinical variables. NA values are allowed in the clinical variables.

v1,v2
s1,a,b
s2,c,d
s3,e,f

omics.csv

The first column of each feature table must contain unique feature identifiers (e.g. gene names). The column names must be sample identifiers that overlap with those in clin.csv. The two sets of identifiers don't have to be completely identical or in the same order. Samples from clin.csv that are not represented in the omics table will be dropped.

s1,s2,s3
g1,0,1,2
g2,3,3,5
g3,2,3,4
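
To see which samples would be kept or dropped under the rule above, the identifiers in the two files can be compared directly. This is a rough sketch using the standard `csv` module, not the loader flexynesis itself uses; the function names are hypothetical, the sample `s4` is added only to show a dropped sample, and each table is assumed to have at least one data row.

```python
import csv
import io


def clin_samples(handle):
    """Sample identifiers: the first field of every row after the header."""
    rows = [row for row in csv.reader(handle) if row]
    return [row[0] for row in rows[1:]]


def omics_samples(handle):
    """Sample identifiers: the header fields of an omics table."""
    rows = [row for row in csv.reader(handle) if row]
    header, first = rows[0], rows[1]
    # If the header is as wide as the data rows, its first field is a label
    # for the feature column rather than a sample identifier.
    return header[1:] if len(header) == len(first) else header


def sample_overlap(clin_handle, omics_handle):
    """Split the clin.csv samples into those present in or absent from the omics table."""
    clin = clin_samples(clin_handle)
    omics = set(omics_samples(omics_handle))
    shared = [s for s in clin if s in omics]
    missing = [s for s in clin if s not in omics]
    return shared, missing


# The example tables from this section, plus a hypothetical sample s4
# that has clinical data but no omics measurements.
clin_text = "v1,v2\ns1,a,b\ns2,c,d\ns3,e,f\ns4,g,h\n"
omics_text = "s1,s2,s3\ng1,0,1,2\ng2,3,3,5\ng3,2,3,4\n"

shared, missing = sample_overlap(io.StringIO(clin_text), io.StringIO(omics_text))
print(shared)   # ['s1', 's2', 's3']
print(missing)  # ['s4']
```

With real files, pass open file handles instead of `io.StringIO` objects.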

Concordance between train/test splits

The corresponding omics files in train/test splits must contain overlapping feature names (they don't have to be identical or in the same order). The clin.csv files in train/test must contain matching clinical variables.
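
The same kind of check can be applied across the train/test splits. The sketch below uses only the standard library and hypothetical helper names; it interprets "matching clinical variables" as the two clin.csv headers containing the same set of names, which is an assumption rather than a statement about how flexynesis validates its input.

```python
import csv
import io


def feature_ids(handle):
    """Feature identifiers: the first field of every row after the header."""
    rows = [row for row in csv.reader(handle) if row]
    return {row[0] for row in rows[1:]}


def shared_features(train_handle, test_handle):
    """Sorted feature identifiers present in both omics tables."""
    return sorted(feature_ids(train_handle) & feature_ids(test_handle))


def clinical_variables_match(train_handle, test_handle):
    """True if both clin.csv headers list the same variable names (order ignored)."""
    train_header = next(csv.reader(train_handle))
    test_header = next(csv.reader(test_handle))
    return set(train_header) == set(test_header)


# Hypothetical train/test tables: g1 is missing from test and g4 from train.
train_omics = "s1,s2\ng1,0,1\ng2,3,3\ng3,2,3\n"
test_omics = "s7,s8\ng3,1,1\ng2,0,4\ng4,5,5\n"
print(shared_features(io.StringIO(train_omics), io.StringIO(test_omics)))
# ['g2', 'g3']

train_clin = "v1,v2\ns1,a,b\ns2,c,d\n"
test_clin = "v2,v1\ns7,d,a\ns8,b,c\n"
print(clinical_variables_match(io.StringIO(train_clin), io.StringIO(test_clin)))
# True
```

An empty result from `shared_features` means the two omics tables have no features in common.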

Guix

You can also create a reproducible development environment or build a reproducible package of Flexynesis with GNU Guix. You will need at least the Guix channels listed in channels.scm. It also helps to have authorized the Inria substitute server to get binaries for CUDA-enabled packages. See this page for instructions on how to configure fetching binary substitutes from the build servers.

You can build a Guix package from the current committed state of your git checkout, using the state of Guix specified in channels.scm, like this:

guix time-machine -C channels.scm -- \
    build --no-grafts -f guix.scm

To enter an environment containing just Flexynesis:

guix time-machine -C channels.scm -- \
    shell --no-grafts -f guix.scm

To enter a development environment to hack on Flexynesis:

guix time-machine -C channels.scm -- \
    shell --no-grafts -Df guix.scm

To build a Docker image containing this package together with a matching Python installation:

guix time-machine -C channels.scm -- \
  pack -C none \
  -e '(load "guix.scm")' \
  -f docker \
  -S /bin=bin -S /lib=lib -S /share=share \
  glibc-locales coreutils bash python

Defining Kernel for Jupyter Notebook

To use flexynesis interactively in Jupyter notebooks, you can register a kernel that makes flexynesis and its dependencies available in the Jupyter session.

Assuming you have already defined an environment and installed the package:

conda activate flexynesis
python -m ipykernel install --user --name "flexynesis" --display-name "flexynesis"

To export the current environment to spec-file.txt:

conda list --explicit > spec-file.txt

Testing

Run unit tests

pytest -vvv tests/unit

This runs all the unit tests in the tests/unit directory.

Contributing

If you would like to contribute to the project, please open an issue or a pull request on the GitHub repository.

Branches

When working on a feature on a new branch, don't forget to write a branch description:

git branch --edit-description

You can view branch descriptions:

git config branch.<branch name>.description

Documentation

To regenerate the HTML documentation in the docs directory with pdoc:

pdoc --html --output-dir docs --force flexynesis
