DeepFRI (Deep Functional Residue Identification)

A PaddlePaddle and Paddle Graph Learning (PGL) implementation of DeepFRI, the protein function prediction method proposed by Gligorijević et al. [1].

Implementation Setup

  • Python==3.7
  • PaddlePaddle==2.2.1
  • Pgl==2.2.2
  • scikit-learn==1.0.1
  • tqdm==4.62.3
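To confirm that an environment matches the pinned versions above, a small check along the following lines may help. This is only a sketch: the distribution names `paddlepaddle` and `pgl` are assumed to be the names the packages were installed under (a GPU build of PaddlePaddle, for example, is distributed under a different name), and `find_mismatches` is a hypothetical helper, not part of this repository.

```python
# Pinned versions copied from the list above.
# The keys are assumed distribution names; adjust them for your install.
PINS = {
    "paddlepaddle": "2.2.1",
    "pgl": "2.2.2",
    "scikit-learn": "1.0.1",
    "tqdm": "4.62.3",
}


def find_mismatches(pins, get_version):
    """Return {package: (expected, found)} for every pin that is not met.

    get_version is any callable that maps a package name to its installed
    version string and raises an exception if the package is missing.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except Exception:
            found = None  # treated as "not installed"
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    try:
        from importlib.metadata import version  # Python 3.8+
    except ImportError:
        from importlib_metadata import version  # backport for Python 3.7
    for name, (expected, found) in find_mismatches(PINS, version).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The script prints nothing when every pin is satisfied.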

Dataset

The data come from the Protein Data Bank (PDB). The code for pre-processing proteins and transforming them into graphs can be found here. After pre-processing, copy the data to the ./data folder. The dataset splits (i.e., training, validation, and test) proposed by [1] can be downloaded here or from their repository; after extraction, copy them to the ./data folder as well.
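To illustrate what transforming a protein into a graph can look like, here is a minimal sketch that builds a residue contact graph from C-alpha coordinates. It assumes that residues are nodes and that two residues are connected when their C-alpha atoms are closer than 10 Å, which is our reading of [1]; the function name, the strict `<` comparison, and the toy coordinates are illustrative only. The linked pre-processing code remains the authoritative version.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contact_edges(ca_coords, threshold=10.0):
    """Return undirected edges (i, j) with i < j between residues whose
    C-alpha atoms lie closer than `threshold` angstroms.

    The 10 angstrom default is an assumption, not read from this repository.
    """
    edges = []
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + 1, n):
            if distance(ca_coords[i], ca_coords[j]) < threshold:
                edges.append((i, j))
    return edges


# Toy example: four residues on a line; the last one is far from the rest.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_edges(coords))  # [(0, 1), (0, 2), (1, 2)]
```

The pairwise loop is quadratic in the number of residues, which is acceptable for a sketch; a real pipeline would more likely compute the distance matrix in a vectorized way.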

Training

python train.py [params]   

Here, params are keyword arguments; see train.py for the full list and their default values.

Testing

python test.py --model_name <path-to-saved-model> --label_data_path <path-to-protein-with-their-labels> [more params]  

model_name and label_data_path are required; all other parameters are optional. See test.py for the full list of arguments.
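For readers unfamiliar with this style of command line, the sketch below shows how two required arguments and an optional one with a default are typically declared with `argparse`. Only `--model_name` and `--label_data_path` come from this README; `--batch_size`, its default, and the file names in the example are hypothetical, and test.py may define its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the required arguments named above."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of a test.py-style command line."
    )
    parser.add_argument("--model_name", required=True,
                        help="path to the saved model")
    parser.add_argument("--label_data_path", required=True,
                        help="path to the proteins with their labels")
    # Hypothetical optional argument; not taken from test.py.
    parser.add_argument("--batch_size", type=int, default=32,
                        help="illustrative optional parameter")
    return parser


# Example invocation with made-up file names.
args = build_parser().parse_args(
    ["--model_name", "model.pdparams", "--label_data_path", "labels.tsv"]
)
print(args.model_name, args.label_data_path, args.batch_size)
```

Omitting either required argument makes `argparse` print a usage message and exit with an error.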

References

[1] Gligorijević, V., Renfrew, P.D., Kosciolek, T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 12, 3168 (2021).