Leveraging prior knowledge for protein–protein interaction extraction with memory network. Zhou H., Liu Z., Yang Y., et al. Published in Database: The Journal of Biological Databases and Curation.
An implementation of the Memory Network Model (MNM) for the protein–protein interaction (PPI) extraction task.
This code is written with PyTorch 0.2.
The word embeddings, as well as the entity and relation embeddings learned with TransE [1], are provided in the data folder.
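If you want to inspect these files outside the training script, the snippet below is a minimal loading sketch. It assumes the usual TransE output layout (one whitespace-separated vector per line in entity2vec.vec / relation2vec.vec, and "name id" pairs in entity2id.txt); the helper names are illustrative and not part of this repository.

```python
# Minimal loading sketch (assumed file layout, not part of this repository).
import numpy as np

def load_transe_vectors(path):
    """Load TransE vectors, assuming one whitespace-separated vector per line."""
    with open(path) as f:
        return np.array([[float(x) for x in line.split()] for line in f if line.strip()])

def load_id_map(path):
    """Load 'name id' pairs, e.g. Entrez Gene ID -> row index into entity2vec.vec."""
    mapping = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                mapping[parts[0]] = int(parts[1])
    return mapping

entity_emb = load_transe_vectors("data/KB/entity2vec.vec")      # shape: (num_entities, ed)
relation_emb = load_transe_vectors("data/KB/relation2vec.vec")  # shape: (num_relations, ed)
entity2id = load_id_map("data/KB/entity2id.txt")
```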
Go to the model directory and run:
❱❱❱ python3 main.py
This uses the default hyperparameters. Alternatively, run with specific settings:
❱❱❱ python3 main.py --trainPath ../data/train.txt --testPath ../data/test.txt --batchSize 100 --wd 100 --ed 100 --hop 2 --clas 2 --epoch 20 --wePath ../data/wordEmb/bio-word2id100 --w2IDPath ../data/wordEmb/bio-embed100 --eePath ../data/KB/entity2vec.vec --rePath ../data/KB/relation2vec.vec --t2idPath ../data/KB/triple2id.txt --e2idPath ../data/KB/entity2id.txt --paraPath ./parameters/ --results ./results/
The available options are listed below (a minimal sketch of the corresponding argument parser follows the list):
--trainPath   path of the training dataset.
--testPath    path of the test dataset.
--batchSize   batch size.
--wd          dimension of the word embeddings.
--ed          dimension of the entity embeddings learned from TransE.
--hop         number of memory hops.
--clas        number of classes.
--epoch       number of training epochs.
--wePath      path of the word embedding file.
--w2IDPath    path of the file that maps each word to its index.
--eePath      path of the entity embedding file.
--rePath      path of the relation embedding file.
--t2idPath    path of the file that contains the triples.
--e2idPath    path of the file that maps Entrez Gene IDs to indices.
--paraPath    path where the model parameters are saved.
--results     path where the results are written.
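As a reference, the block below is a minimal sketch of how these flags could be wired up with argparse, using the defaults from the example command above; it is illustrative and may differ from the actual main.py.

```python
# Sketch of an argument parser for the flags above (defaults taken from the
# example command); the real main.py may differ.
import argparse

parser = argparse.ArgumentParser(description="Memory network for PPI extraction")
parser.add_argument("--trainPath", default="../data/train.txt", help="path of the training dataset")
parser.add_argument("--testPath", default="../data/test.txt", help="path of the test dataset")
parser.add_argument("--batchSize", type=int, default=100, help="batch size")
parser.add_argument("--wd", type=int, default=100, help="word embedding dimension")
parser.add_argument("--ed", type=int, default=100, help="entity embedding dimension (TransE)")
parser.add_argument("--hop", type=int, default=2, help="number of memory hops")
parser.add_argument("--clas", type=int, default=2, help="number of classes")
parser.add_argument("--epoch", type=int, default=20, help="number of training epochs")
parser.add_argument("--wePath", default="../data/wordEmb/bio-word2id100", help="word embedding file")
parser.add_argument("--w2IDPath", default="../data/wordEmb/bio-embed100", help="word-to-index mapping file")
parser.add_argument("--eePath", default="../data/KB/entity2vec.vec", help="entity embedding file")
parser.add_argument("--rePath", default="../data/KB/relation2vec.vec", help="relation embedding file")
parser.add_argument("--t2idPath", default="../data/KB/triple2id.txt", help="triples file")
parser.add_argument("--e2idPath", default="../data/KB/entity2id.txt", help="Entrez Gene ID to index mapping")
parser.add_argument("--paraPath", default="./parameters/", help="where model parameters are saved")
parser.add_argument("--results", default="./results/", help="where results are written")
args = parser.parse_args()
```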
[1] Bordes A., Usunier N., Garcia-Duran A., et al. Translating embeddings for modeling multi-relational data. Proceedings of NIPS, 2013.