
Investigating Robustness and Interpretability of Link Prediction
via Adversarial Modifications
Pouya Pezeshkpour
University of California
Irvine, CA
Yifan Tian
University of California
Irvine, CA
Sameer Singh
University of California
Irvine, CA
Abstract
Representing entities and relations in an embedding space is a well-studied approach for machine learning on relational data. Existing approaches, however, primarily focus on improving accuracy and overlook other aspects such as robustness and interpretability. In this paper, we propose adversarial modifications for link prediction models: identifying the fact to add into or remove from the knowledge graph that changes the prediction for a target fact after the model is retrained. Using these single modifications of the graph, we identify the most influential fact for a predicted link and evaluate the sensitivity of the model to the addition of fake facts. We introduce an efficient approach to estimate the effect of such modifications by approximating the change in the embeddings when the knowledge graph changes. To avoid the combinatorial search over all possible facts, we train a network to decode embeddings to their corresponding graph components, allowing the use of gradient-based optimization to identify the adversarial modification. We use these techniques to evaluate the robustness of link prediction models (by measuring sensitivity to additional facts), study interpretability through the facts most responsible for predictions (by identifying the most influential neighbors), and detect incorrect facts in the knowledge base.
1 Introduction
Knowledge graphs (KG) play a critical role in many real-world applications such as search, structured data management, recommendations, and question answering. Since KGs often suffer from incompleteness and noise in their facts (links), a number of recent techniques have proposed models that embed each entity and relation into a vector space, and use these embeddings to predict facts. These dense representation models for link prediction include tensor factorization [Nickel et al., 2011, Socher et al., 2013, Yang et al., 2015], algebraic operations [Bordes et al., 2011, 2013b, Dasgupta et al., 2018], multiple embeddings [Wang et al., 2014, Lin et al., 2015, Ji et al., 2015, Zhang et al., 2018], and complex neural models [Dettmers et al., 2018, Nguyen et al., 2018]. However, there are only a few studies [Kadlec et al., 2017, Sharma et al., 2018] that investigate the quality of the different KG models. There is a need to go beyond just the accuracy on link prediction, and instead focus on whether these representations are robust and stable, and what facts they make use of for their predictions.
In this paper, our goal is to design approaches that minimally change the graph structure such that the prediction of a target fact changes the most after the embeddings are relearned, which we collectively call Completion Robustness and Interpretability via Adversarial Graph Edits (CRIAGE). First, we consider perturbations that remove a neighboring link for the target fact, thus identifying the most influential related fact and providing an explanation for the model's prediction. As an example, consider the excerpt from a KG in Figure 1a with two observed facts, and a target predicted fact that Princess Henriette is the parent of Violante Bavaria. Our proposed graph perturbation, shown in Figure 1b, identifies the existing fact that Ferdinand Maria is the father of Violante Bavaria as the one that, when removed and the model retrained, will change the prediction of Princess Henriette's child.
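This removal attack can be sketched as an exhaustive search: retrain the model once per candidate neighboring fact and keep the fact whose removal most changes the target's score. The sketch below assumes hypothetical `train_embeddings` and `score` helpers standing in for a real KG embedding pipeline (e.g. DistMult or ConvE); it is the expensive brute-force baseline, not the paper's efficient approximation.

```python
def most_influential_fact(graph, target, train_embeddings, score):
    """Return the neighboring fact whose removal changes the target
    fact's score the most after retraining.

    graph: list of (subject, relation, object) triples
    target: the (s, r, o) triple whose prediction we want to explain
    train_embeddings, score: hypothetical stand-ins for an embedding
    trainer and a triple-scoring function.
    """
    base_model = train_embeddings(graph)
    base_score = score(base_model, target)

    s, _, o = target
    # Candidate facts are those sharing an entity with the target triple.
    neighbors = [f for f in graph
                 if f != target and (s in (f[0], f[2]) or o in (f[0], f[2]))]

    best_fact, best_change = None, -1.0
    for fact in neighbors:
        reduced = [f for f in graph if f != fact]
        model = train_embeddings(reduced)   # expensive: one full retrain
        change = abs(base_score - score(model, target))
        if change > best_change:
            best_fact, best_change = fact, change
    return best_fact, best_change
```

The cost of one retraining per candidate is exactly the combinatorial burden that motivates the paper's first-order approximation of the embedding change.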
We also study attacks that add a new, fake fact into the KG to evaluate the robustness and sensitivity of link prediction models to small additions to the graph. An example attack for the original graph in Figure 1a is depicted in Figure 1c. Such perturbations to the training data are from a family of adversarial modifications that have been applied to other machine learning tasks, known as poisoning [Biggio et al., 2012, Corona et al., 2013, Biggio
arXiv:1905.00563v1 [cs.LG] 2 May 2019