
Investigating Robustness and Interpretability of Link Prediction
via Adversarial Modifications
Pouya Pezeshkpour
University of California
Irvine, CA
Yifan Tian
University of California
Irvine, CA
Sameer Singh
University of California
Irvine, CA
Abstract
Representing entities and relations in an embedding space is a well-studied approach for machine learning on relational data. Existing approaches, however, primarily focus on improving accuracy and overlook other aspects such as robustness and interpretability. In this paper, we propose adversarial modifications for link prediction models: identifying the fact to add into or remove from the knowledge graph that changes the prediction for a target fact after the model is retrained. Using these single modifications of the graph, we identify the most influential fact for a predicted link and evaluate the sensitivity of the model to the addition of fake facts. We introduce an efficient approach to estimate the effect of such modifications by approximating the change in the embeddings when the knowledge graph changes. To avoid the combinatorial search over all possible facts, we train a network to decode embeddings to their corresponding graph components, allowing the use of gradient-based optimization to identify the adversarial modification. We use these techniques to evaluate the robustness of link prediction models (by measuring sensitivity to additional facts), study interpretability through the facts most responsible for predictions (by identifying the most influential neighbors), and detect incorrect facts in the knowledge base.
1 Introduction
Knowledge graphs (KG) play a critical role in many real-world applications such as search, structured data management, recommendations, and question answering. Since KGs often suffer from incompleteness and noise in their facts (links), a number of recent techniques have proposed models that embed each entity and relation into a vector space, and use these embeddings to predict facts. These dense representation models for link prediction include tensor factorization [Nickel et al., 2011, Socher et al., 2013, Yang et al., 2015], algebraic operations [Bordes et al., 2011, 2013b, Dasgupta et al., 2018], multiple embeddings [Wang et al., 2014, Lin et al., 2015, Ji et al., 2015, Zhang et al., 2018], and complex neural models [Dettmers et al., 2018, Nguyen et al., 2018]. However, there are only a few studies [Kadlec et al., 2017, Sharma et al., 2018] that investigate the quality of the different KG models. There is a need to go beyond just the accuracy on link prediction, and instead focus on whether these representations are robust and stable, and what facts they make use of for their predictions.
In this paper, our goal is to design approaches that minimally change the graph structure such that the prediction of a target fact changes the most after the embeddings are relearned, which we collectively call Completion Robustness and Interpretability via Adversarial Graph Edits (CRIAGE). First, we consider perturbations that remove a neighboring link for the target fact, thus identifying the most influential related fact and providing an explanation for the model's prediction. As an example, consider the excerpt from a KG in Figure 1a with two observed facts, and a target predicted fact that Princess Henriette is the parent of Violante Bavaria. Our proposed graph perturbation, shown in Figure 1b, identifies the existing fact that Ferdinand Maria is the father of Violante Bavaria as the one that, when removed and the model retrained, will change the prediction of Princess Henriette's child.
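This removal attack can be sketched as an exhaustive search: retrain the model once per candidate neighboring fact and keep the fact whose removal most changes the target's score. The sketch below assumes hypothetical `train_embeddings` and `score` helpers standing in for a real KG embedding pipeline (e.g. DistMult or ConvE); it is the expensive brute-force baseline, not the paper's efficient approximation.

```python
def most_influential_fact(graph, target, train_embeddings, score):
    """Return the neighboring fact whose removal changes the target
    fact's score the most after retraining.

    graph: list of (subject, relation, object) triples
    target: the (s, r, o) triple whose prediction we want to explain
    train_embeddings, score: hypothetical stand-ins for an embedding
    trainer and a triple-scoring function.
    """
    base_model = train_embeddings(graph)
    base_score = score(base_model, target)

    s, _, o = target
    # Candidate facts are those sharing an entity with the target triple.
    neighbors = [f for f in graph
                 if f != target and (s in (f[0], f[2]) or o in (f[0], f[2]))]

    best_fact, best_change = None, -1.0
    for fact in neighbors:
        reduced = [f for f in graph if f != fact]
        model = train_embeddings(reduced)   # expensive: one full retrain
        change = abs(base_score - score(model, target))
        if change > best_change:
            best_fact, best_change = fact, change
    return best_fact, best_change
```

The cost of one retraining per candidate is exactly the combinatorial burden that motivates the paper's first-order approximation of the embedding change.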
We also study attacks that add a new, fake fact into the KG to evaluate the robustness and sensitivity of link prediction models to small additions to the graph. An example attack for the original graph in Figure 1a is depicted in Figure 1c. Such perturbations to the training data are from a family of adversarial modifications that have been applied to other machine learning tasks, known as poisoning [Biggio et al., 2012, Corona et al., 2013, Biggio
arXiv:1905.00563v1 [cs.LG] 2 May 2019