DeepFRI (Deep Functional Residue Identification)

Implementation in PaddlePaddle and Paddle Graph Learning (PGL) of the method proposed by Gligorijević et al.[1] for protein function prediction.

Implementation Setup

  • Python==3.7
  • PaddlePaddle==2.2.1
  • Pgl==2.2.2
  • scikit-learn==1.0.1
  • tqdm==4.62.3
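
As a quick sanity check, the pinned versions above can be compared against what is actually installed. The following is a minimal sketch and not part of this repository: the helper names `installed_version` and `check_pins` are hypothetical, and the pip distribution names (`paddlepaddle`, `pgl`, `scikit-learn`, `tqdm`) are assumptions. A GPU build of PaddlePaddle would typically be installed under a different distribution name.

```python
# Hypothetical helper, not part of this repository.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python >= 3.8
except ImportError:  # Python 3.7 needs the importlib-metadata backport
    from importlib_metadata import version, PackageNotFoundError

# Pinned versions from the list above; distribution names are assumptions.
PINS = {
    "paddlepaddle": "2.2.1",
    "pgl": "2.2.2",
    "scikit-learn": "1.0.1",
    "tqdm": "4.62.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every package that does not match its pin."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result from `check_pins(PINS)` means every listed package matches its pinned version.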

Dataset

The data come from the Protein Data Bank (PDB). The code for pre-processing and transforming proteins into graphs can be found here. After preprocessing, copy the data to the ./data folder. The dataset splits (i.e., training, validation, and test) proposed by [1] can be downloaded here or from their repository. After extraction, copy them to the ./data folder as well.
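
To illustrate what transforming a protein into a graph usually means for this method: in [1], each residue is a node, and two residues are connected when their Cα atoms lie within a distance cutoff (10 Å in the paper). The sketch below builds such an edge list in plain Python. It shows the general idea only and is not this repository's preprocessing code; the function names `distance` and `contact_edges` and the toy coordinates are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points given as (x, y, z) tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contact_edges(ca_coords, cutoff=10.0):
    """Return undirected edges (i, j) with i < j between residues whose
    C-alpha atoms are closer than `cutoff` angstroms.

    The 10 A default follows the contact-map definition in [1]; the
    preprocessing used by this repository may differ.
    """
    edges = []
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if distance(ca_coords[i], ca_coords[j]) < cutoff:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Toy coordinates (angstroms): three residues in a row plus one far away.
    toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
    print(contact_edges(toy))
```

The double loop is quadratic in the number of residues, which is usually acceptable for single chains; a vectorized distance matrix would be the natural next step for larger batches.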

Training

python train.py [params]   

Here, params are keyword arguments. See train.py for the full list of arguments and their default values.

Testing

python test.py --model_name <path-to-saved-model> --label_data_path <path-to-protein-with-their-labels> [more params]  

model_name and label_data_path are required arguments. More (optional) parameters can be added as well. See test.py for a full list of expected arguments.
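
When the test script is launched from another Python program (for example, to evaluate several saved models in a loop), the command line above can be assembled programmatically. The sketch below is one way to do that under stated assumptions: it uses only the two required arguments named above, and the helpers `build_test_command` and `run_test` are hypothetical, not part of this repository. Any optional arguments must be taken from test.py.

```python
import subprocess
import sys


def build_test_command(model_name, label_data_path, extra_args=()):
    """Assemble the argv list for test.py with its two required arguments."""
    if not model_name or not label_data_path:
        raise ValueError("model_name and label_data_path are both required")
    return [
        sys.executable, "test.py",
        "--model_name", str(model_name),
        "--label_data_path", str(label_data_path),
        *extra_args,  # optional arguments, passed through unchanged
    ]


def run_test(model_name, label_data_path, extra_args=()):
    """Run test.py in the current directory and raise if it exits with an error."""
    cmd = build_test_command(model_name, label_data_path, extra_args)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with real files before calling run_test.
    cmd = build_test_command("<path-to-saved-model>", "<path-to-protein-with-their-labels>")
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.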

References

[1] Gligorijević, V., Renfrew, P.D., Kosciolek, T. et al. Structure-based protein function prediction using graph convolutional networks. Nat Commun 12, 3168 (2021).