Junjie-Zhu/IDPFold

Precise Generation of Conformational Ensembles for Intrinsically Disordered Proteins Using Fine-tuned Diffusion Models

PyTorch Lightning Config: Hydra Template bioRxiv

Overview

We developed IDPFold, a generative deep learning model that predicts conformational ensembles of intrinsically disordered proteins (IDPs) directly from their sequences using fine-tuned diffusion models. IDPFold does not require multiple sequence alignments (MSAs) or experimental data, and achieves accurate predictions of ensemble properties across numerous IDPs.

IDPFold is pretrained on structures from the Protein Data Bank (PDB) and fine-tuned on conformational ensembles provided by IDRome, achieving more precise sampling of IDP ensembles than state-of-the-art (SOTA) deep learning models and MD simulation.

Overview of IDPFold

The IDPFold codebase is largely inspired by Str2Str. We thank Jiarui Lu for his valuable suggestions.

Installation

git clone https://github.com/Junjie-Zhu/IDPFold.git
cd IDPFold

# Create a new conda environment
conda env create -f environment.yml
conda activate idpfold

# Install ESM for sequence embedding extraction
pip install fair-esm

# Install IDPFold as a package
pip install -e .

After installation, update the .env file, which contains the paths to the datasets. We provide a script that initializes the .env file; just run the following command:

python initialize.py
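
Once the .env file has been initialized, it can be useful to confirm that the paths it lists actually exist. The following is a minimal sketch, not part of IDPFold: it assumes the file uses plain KEY=VALUE lines and that every value is a filesystem path. The helper names `parse_env` and `missing_paths` are hypothetical.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and '#' comments.

    Assumption: the .env file uses plain KEY=VALUE syntax.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_paths(values):
    """Return the keys whose value is empty or not an existing path.

    Assumption: every value in the file is meant to be a path.
    """
    return [
        key
        for key, value in values.items()
        if not value or not Path(value).expanduser().exists()
    ]
```

For example, `missing_paths(parse_env(Path(".env").read_text()))` returns the keys that still need to be edited by hand.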

Inference

To generate conformational ensembles for given sequences:

  • Prepare a FASTA file containing one or more sequences. An example with 3 IDP sequences is provided in data/example.fasta.
  • Check that you have a checkpoint file. Our pretrained model checkpoints are available on Google Drive.
  • Run the following commands:
# Extract sequence embeddings
python src/read_seqs.py pred_dir='./data/example.fasta'

# Inference
python src/eval.py ckpt_path='/path/to/ckpt'
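
Before extracting embeddings, it may help to sanity-check the input FASTA file. The sketch below is an illustration only and is not part of IDPFold. It parses FASTA text and flags empty sequences and any characters outside the 20 standard amino acids. Whether IDPFold accepts other letters is not stated here, so treat those flags as warnings, not errors. The names `read_fasta` and `find_problems` are hypothetical.

```python
# One-letter codes of the 20 standard amino acids.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def find_problems(records):
    """Return human-readable warnings for empty or unusual sequences."""
    problems = []
    for header, seq in records:
        if not seq:
            problems.append(f"{header}: empty sequence")
        unexpected = sorted(set(seq) - VALID_RESIDUES)
        if unexpected:
            problems.append(f"{header}: unexpected characters {''.join(unexpected)}")
    return problems
```

To check a file, read it with `pathlib.Path("data/example.fasta").read_text()`, pass the text to `read_fasta`, and print whatever `find_problems` returns.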

Training

To be updated ...


This is a test version of IDPFold. If you have any questions, please either create an issue or contact [email protected] directly!
