First major release
This is the first major release of OGB.
A number of changes have been made to the datasets, which are summarized below.
- Re-indexed all the nodes in the node/link datasets (the graphs remain essentially the same).
- In the dataset folders of all the datasets, added a `mapping/` directory that contains information to map node/edge/graph/label indices to real-world entities (e.g., mapping from nodes in PPA to unique protein identifiers, or from molecular graphs to SMILES strings).
- Deleted the `ogbn-proteins` node features, and put them in the species variable.
- Deleted the `ogbl-reviews` datasets.
- Added 4 datasets: `ogbn-arxiv`, `ogbl-citation`, `ogbl-collab`, `ogbl-wikikg`.
- Renamed `ogbg-ppi` to `ogbg-ppa`.
- Renamed `ogbg-mol-hiv` and `ogbg-mol-pcba` to `ogbg-molhiv` and `ogbg-molpcba`, respectively.
- Changed the evaluation metric of imbalanced molecule datasets (e.g., pcba) from ROC-AUC to PRC-AUC.
- Changed the `get_split_edge()` interface in `LinkPropPredDataset`. The downloaded dataset files are also changed accordingly.
- Added a `num_classes` attribute for multi-class classification datasets.
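
Scripts written against the pre-release names may need their dataset strings updated. The renames and deletions listed above can be sketched as a small lookup. This is an illustrative helper only: `migrate_dataset_name`, `RENAMED`, and `REMOVED_PREFIXES` are hypothetical names that are not part of the `ogb` package, and treating every name that starts with `ogbl-reviews` as deleted is an assumption.

```python
# Hypothetical migration helper; not part of the ogb package.
# The mapping below only restates the renames listed in these release notes.
RENAMED = {
    "ogbg-ppi": "ogbg-ppa",
    "ogbg-mol-hiv": "ogbg-molhiv",
    "ogbg-mol-pcba": "ogbg-molpcba",
}

# Assumption: any dataset whose name starts with this prefix was deleted.
REMOVED_PREFIXES = ("ogbl-reviews",)


def migrate_dataset_name(name: str) -> str:
    """Return the post-release name of a dataset.

    Raises ValueError if the dataset was deleted in this release.
    Names that were not renamed are returned unchanged.
    """
    if name.startswith(REMOVED_PREFIXES):
        raise ValueError(f"{name!r} was deleted in this release")
    return RENAMED.get(name, name)


if __name__ == "__main__":
    for old in ("ogbg-ppi", "ogbg-mol-hiv", "ogbg-mol-pcba", "ogbn-arxiv"):
        print(old, "->", migrate_dataset_name(old))
```

Unchanged names pass through as-is, so the helper can be applied to every dataset string in an older script before it is handed to a dataset loader.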