This is the code used to train and run PROTGOAT, which placed 4th in the 2023 CAFA5 protein function prediction competition. PROTGOAT can be trained and run for all three GO domains using CAFA5_Train_and_infer_f02.ipynb.
The embeddings generated with protein language models (PLMs) for all 142246 proteins in the CAFA5 training set and all 141864 proteins in the CAFA5 test superset are available for download at https://www.kaggle.com/datasets/zmcxjt/cafa5-train-test-data
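
After downloading, it can be worth confirming that each embedding matrix has one row per protein before starting the notebook. The following is a minimal sketch of such a check, not part of the PROTGOAT code: the function name `check_embedding_rows` is hypothetical, the file name in the comment is a placeholder, and the `.npy` storage format is an assumption, since the layout of the dataset is not described here. Only the two protein counts come from this README.

```python
# Protein counts as stated in this README.
EXPECTED_ROWS = {"train": 142246, "test": 141864}


def check_embedding_rows(embeddings, split):
    """Return the row count if it matches the expected number of proteins.

    `embeddings` is anything that supports len(), e.g. a NumPy array whose
    first axis indexes proteins. `split` is "train" or "test".
    Raises ValueError on a mismatch and KeyError for an unknown split.
    """
    expected = EXPECTED_ROWS[split]
    n_rows = len(embeddings)
    if n_rows != expected:
        raise ValueError(
            f"{split} embeddings have {n_rows} rows, expected {expected}"
        )
    return n_rows


# Hypothetical usage, assuming the embeddings are stored as a .npy file
# (the actual file names and format in the dataset may differ):
#
#   import numpy as np
#   train_emb = np.load("train_embeddings.npy")  # placeholder file name
#   check_embedding_rows(train_emb, "train")
```

A row-count mismatch usually points to an incomplete download or to a file from a different split, which is cheaper to catch here than partway through training.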