Knowledge-aware Attention Network for Protein-Protein Interaction Extraction
Zhou, H., Liu, Z., Ning, S. et al. Accepted by the Journal of Biomedical Informatics.
An implementation of the Knowledge-aware Attention Network (KAN) for the protein-protein interaction extraction task.
This code was written using PyTorch 0.4.
The word embeddings, along with the entity and relation embeddings learned with TransE [1], are provided in the data folder.
Go to the model path and run:
❱❱❱ python3 main.py
This uses the default hyperparameters. To run with specific settings instead:
❱❱❱ python3 main.py --trainPath ../data/train.txt --validPath ../data/valid.txt --testPath ../data/test.txt --trainGold ../data/trainGold.txt --testGold ../data/testGold.txt --batchSize 100 --wd 100 --ed 100 --hop 2 --clas 2 --epoch 20 --wePath ../data/wordEmb/bio-word2id100 --w2IDPath ../data/wordEmb/bio-embed100 --eePath ../data/KB/entity2vec.vec --rePath ../data/KB/relation2vec.vec --t2idPath ../data/KB/triple2id.txt --e2idPath ../data/KB/entity2id.txt --paraPath ./parameters/kan --results ./results/ --training True --lr 0.1 --wdecay 0.0 --validsetR 0.15
The available options are:
--trainPath: path of the training dataset.
--validPath: path of the validation dataset.
--testPath: path of the test dataset.
--trainGold: path of the triples of the training dataset.
--testGold: path of the triples of the test dataset.
--batchSize: batch size.
--wd: dimension of the word embeddings.
--ed: dimension of the entity embeddings learned with TransE.
--hop: number of hops.
--clas: number of classes.
--epoch: number of training iterations (epochs).
--wePath: path of the word embedding file.
--w2IDPath: path of the file that maps each word to its numeric ID.
--eePath: path of the entity embedding file.
--rePath: path of the relation embedding file.
--t2idPath: path of the file that contains the triples.
--e2idPath: path of the file that maps each Entrez Gene ID to its numeric ID.
--paraPath: path of the model parameters.
--results: path where the results are written.
--training: boolean flag for training or testing. A non-empty string selects the training phase; an empty string selects the test phase.
--lr: learning rate.
--wdecay: weight decay.
--validsetR: fraction (in [0.0, 1.0]) of the documents in the training set that is held out as the validation set.
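The behaviour described for --training (any non-empty string means training) matches what Python's argparse does when an option is declared with type=bool. The sketch below illustrates that, together with one possible way to hold out a --validsetR fraction of documents. It is only an illustration under assumptions: whether main.py actually uses argparse this way is not stated here, and build_parser and split_documents are hypothetical names, not functions from this repository.

```python
import argparse
import random


def build_parser():
    # Hypothetical parser covering two of the options listed above.
    # The real main.py may define them differently.
    parser = argparse.ArgumentParser()
    # type=bool calls bool() on the raw string, so only "" maps to False.
    parser.add_argument("--training", type=bool, default=True)
    parser.add_argument("--validsetR", type=float, default=0.15)
    return parser


def split_documents(doc_ids, valid_ratio, seed=0):
    # Hypothetical helper: hold out a fraction of the training documents.
    if not 0.0 <= valid_ratio <= 1.0:
        raise ValueError("valid_ratio must lie in [0.0, 1.0]")
    shuffled = list(doc_ids)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_ratio)
    # Returns (training documents, validation documents).
    return shuffled[n_valid:], shuffled[:n_valid]


parser = build_parser()
train_args = parser.parse_args(["--training", "True"])
still_training = parser.parse_args(["--training", "False"])  # non-empty string
test_args = parser.parse_args(["--training", ""])  # empty string

train_docs, valid_docs = split_documents(range(100), train_args.validsetR)
```

Under this assumption, passing --training False would still select the training phase, because "False" is a non-empty string; only an empty string (for example --training "") selects the test phase.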
[1] Bordes, Antoine, et al. Translating embeddings for modeling multi-relational data. Proceedings of NIPS, 2013.