DeepVelo - A Deep Learning-based velocity estimation tool with cell-specific kinetic rates

PyPI version License: MIT

This is the official implementation of the DeepVelo method. DeepVelo employs cell-specific kinetic rates and provides more accurate RNA velocity estimates for complex differentiation and lineage decision events in heterogeneous scRNA-seq data. Please check out the paper for more details.


Installation

Installing the pip version is currently recommended. The supported Python versions are 3.7, 3.8, and 3.9.

pip install deepvelo
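Since only certain Python versions are listed as supported, it can help to check the interpreter before installing. The following is a minimal sketch using only the standard library; the helper name is_supported_python is hypothetical and not part of DeepVelo.

```python
import sys

# (major, minor) pairs that this README lists as supported.
SUPPORTED_VERSIONS = {(3, 7), (3, 8), (3, 9)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is listed as supported."""
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_supported_python():
        print("Python " + current + " is listed as supported.")
    else:
        print("Python " + current + " is not listed as supported.")
```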

Using GPU

The CPU version of dgl is installed by default. For GPU acceleration, install a GPU version of dgl that is compatible with your CUDA environment.

pip uninstall dgl # remove the cpu version
# replace cu101 with your desired CUDA version and run the following
pip install "dgl-cu101>=0.4.3,<0.7"
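The cu101 tag in the command above corresponds to CUDA 10.1. As a rough sketch, the requirement string for another CUDA version could be built as shown here. The helper dgl_gpu_requirement is hypothetical, and it only formats the string: it assumes the dgl-cuXYZ naming pattern and does not check that a matching package has actually been published for your CUDA version.

```python
import re


def dgl_gpu_requirement(cuda_version, spec=">=0.4.3,<0.7"):
    """Build a pip requirement such as 'dgl-cu101>=0.4.3,<0.7' from '10.1'."""
    match = re.fullmatch(r"(\d+)\.(\d+)", cuda_version.strip())
    if match is None:
        raise ValueError("expected a CUDA version like '10.1', got %r" % cuda_version)
    major, minor = match.groups()
    # Assumed naming pattern: dgl-cu<major><minor>, e.g. dgl-cu101 for CUDA 10.1.
    return "dgl-cu{}{}{}".format(major, minor, spec)


print(dgl_gpu_requirement("10.1"))  # dgl-cu101>=0.4.3,<0.7
```

Pass the printed string, in quotes, to pip install.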

Install the development version

We use poetry to manage dependencies.

poetry install

This will install the exact versions in the provided poetry.lock file. If you want to install the latest version for all dependencies, use the following command.

poetry update

Minimal example

We provide a number of notebooks in the examples folder to help you get started. This folder contains analyses from the paper, as well as a minimal python notebook.

DeepVelo fully integrates with scanpy and scVelo. The basic usage is as follows:

import anndata as ann
import deepvelo as dv
import scvelo as scv

adata = ann.read_h5ad("..") # load your data in AnnData here - modify the path accordingly

# preprocess the data
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
scv.pp.moments(adata, n_neighbors=30, n_pcs=30)

# run DeepVelo using the default configs
trainer = dv.train(adata, dv.Constants.default_configs)
# This trains the model and predicts the velocity vectors.
# The result is stored in adata.layers['velocity'].
# Use trainer.model to access the model.

# Plot the velocity results 
scv.tl.velocity_graph(adata, n_jobs=4)
scv.pl.velocity_embedding_stream(
    adata,
    basis="umap",
    color="clusters",
    legend_fontsize=9,
    dpi=150,
)
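After training, you may want to confirm that the velocity layer was written before plotting. The sketch below is one possible sanity check, not part of the DeepVelo API: summarize_velocity is a hypothetical helper that only assumes the object exposes a dict-like .layers attribute, as AnnData does.

```python
import numpy as np


def summarize_velocity(adata, layer="velocity"):
    """Return basic statistics for a velocity layer stored in adata.layers."""
    if layer not in adata.layers:
        raise KeyError("layer %r not found; has training finished?" % layer)
    velocity = adata.layers[layer]
    # Sparse matrices expose toarray(); dense arrays pass through unchanged.
    if hasattr(velocity, "toarray"):
        velocity = velocity.toarray()
    velocity = np.asarray(velocity, dtype=float)
    return {
        "shape": velocity.shape,  # expected to be (n_cells, n_genes)
        "n_nan": int(np.isnan(velocity).sum()),  # count of NaN entries, if any
        "mean_abs": float(np.nanmean(np.abs(velocity))),  # mean magnitude, ignoring NaN
    }
```

For example, print(summarize_velocity(adata)) after dv.train should report a shape that matches adata.shape.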

Fitting a large number of cells

If you cannot fit a large dataset into (GPU) memory using the default configs, try setting a small inner_batch_size in the configs. This reduces memory usage while maintaining the same performance.
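A minimal sketch of overriding this option without editing the defaults in place is shown below. The helper set_nested_key is hypothetical and not part of DeepVelo; it searches a nested config dict for the key, so it makes no assumption about where inner_batch_size sits. The example dict and the value 100 are illustrative only.

```python
from copy import deepcopy


def set_nested_key(configs, key, value):
    """Return a deep copy of configs with every occurrence of key set to value."""
    updated = deepcopy(configs)
    hits = 0
    stack = [updated]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            if key in node:
                node[key] = value
                hits += 1
            stack.extend(v for v in node.values() if isinstance(v, (dict, list)))
        elif isinstance(node, list):
            stack.extend(v for v in node if isinstance(v, (dict, list)))
    if hits == 0:
        raise KeyError(key)  # the key does not exist anywhere in configs
    return updated


# Hypothetical config layout, for illustration only; the real
# dv.Constants.default_configs may nest the option elsewhere.
example_configs = {
    "loss": {"args": {"inner_batch_size": None}},
    "trainer": {"epochs": 100},
}
small = set_nested_key(example_configs, "inner_batch_size", 100)
print(small["loss"]["args"]["inner_batch_size"])  # 100
```

With DeepVelo installed, the same helper could be applied as configs = set_nested_key(dv.Constants.default_configs, "inner_batch_size", 100), followed by dv.train(adata, configs). A KeyError here would mean the option is not present in the default configs and has to be added by hand.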

Currently, training operates on the whole graph of cells. We plan to release a more flexible version that uses graph node sampling in the near future.