For these scripts to work, the RR-GCN package needs to be installed (pip install -e .
in the parent directory) and this directory needs to be the working directory. Note that node classification results will only exactly match those reported in the paper as CatBoost on CPU as GPUs produce different results.
- Make sure
git-lfs
is installed and working on your machine - Edit
dataset_root.py
and setroot
equal to a path where datasets can be downloaded
Either run the shell script run_all_node_classification.sh
and upon completion look in the updated tables.ipynb
for new results or follow the following steps:
tune.py
to generate validation results for different hyperparameter setups. The resulting validation scores will be saved intuning_results.csv
(you need to delete the supplied version first if you want to run it yourself).evaluate.py
to generate test scores based on the optimal validation hyperparameters. This script requires a completetuning_results.csv
and saves test performance for both RR-GCN and RR-GCN-PPV inevaluation_results.csv
(you need to delete the supplied version first if you want to run it yourself).evaluate_degree_cutting.py
to generate results with nodes of degree < 5 removed for AM, AIFB and BGS. This script requires a completetuning_results.csv
and saves test performance for both RR-GCN and RR-GCN-PPV inevaluation_results_degree_cutting.csv
(you need to delete the supplied version first if you want to run it yourself).evaluate_relation_cutting.py
to generate results with nodes of degree < 5 and unimportant relations (as extracted from a trained R-GCN) removed for AM. This script requires a completetuning_results.csv
and saves test performance for both RR-GCN and RR-GCN-PPV inevaluation_results_relation_cutting.csv
(you need to delete the supplied version first if you want to run it yourself).tables.ipynb
aggregates all previously generated csv-files into the results used in the paper.
link_prediction.ipynb
prints the results reported in the paper in the last cell.
Run the rrgcn_attenuation_node_classification.py to generate a results.csv with the reported metrics (Accuracy, F1 scores). Note if the GPU is running out of memory, the model will restart training using checkpoint activation in an attempt to fit intermediate activations in memory. The rrgcn architecture differs from the other two tasks and makes use of RRGCN_A. The included RRGCN architecture has different API and targets a newer version of ["Pytorch Geometric"] >=2.3.0