
Department: Visualization Viewpoints
Editor: Theresa-Marie Rhyne, theresamarierhyne@[Link]

Visual Analytics for Human-Centered Machine Learning

N. Andrienko, Fraunhofer Institute IAIS (Germany) and City, University of London (UK)
L. Adilova, Fraunhofer Institute IAIS and Ruhr University Bochum (Germany)
G. Andrienko, Fraunhofer Institute IAIS (Germany) and City, University of London (UK)
S. Wrobel, Fraunhofer Institute IAIS and University of Bonn (Germany)

Abstract—We introduce a new research area in Visual Analytics (VA) that aims to bridge the existing gaps between methods of interactive Machine Learning (ML) and eXplainable Artificial Intelligence (XAI), on the one side, and human minds, on the other. The gaps are, first, a conceptual mismatch between ML/XAI outputs and human mental models and ways of reasoning and, second, a mismatch between the quantity and level of detail of the information and the human capabilities to perceive and understand it. A grand challenge is to adapt ML and XAI to human goals, concepts, values, and ways of thinking. Complementing the current efforts in XAI towards solving this challenge, VA can contribute by exploiting the potential of visualization as an effective way of communicating information to humans and a strong trigger of human abstractive perception and thinking. We propose a cross-disciplinary research framework and formulate research directions for VA.

THE IMPORTANCE of involving humans in the process of creating and training Machine Learning (ML) models is currently widely recognized in the ML community [1]. It is argued that humans involved in this process need to understand what the machine is doing and how it uses human inputs; hence, the machine must be able to explain its behavior to the users. Understanding of ML models is also of critical importance for deciding whether they can be adopted for practical use. Explainability of models may even be more important than their performance, especially in high-stakes domains. In response to the need to explain opaque ML models ("black boxes") to users, the research field of eXplainable Artificial Intelligence (XAI) has recently emerged [8]. The work in this field was boosted by the European Parliament's adoption of the General Data Protection Regulation (GDPR), which introduces the right of individuals to receive explanations of automatically made decisions relevant to them.
However, it is increasingly admitted that model "explainability" does not necessarily mean that the model is indeed properly explained to humans and understood by them [9,12]. In this paper, we discuss the deficiencies of the common approaches to explaining ML models, mention current efforts towards overcoming these deficiencies, argue why Visual Analytics (VA) [11] should be involved in such efforts, and consider its possible role in helping humans to understand models.

Figure 1. Schematic representation of the research framework for human-centered ML supported by VA.

Based on our considerations, we propose a research framework for developing VA approaches that support human-centered ML. The basic idea is schematically represented in Fig. 1. Here, the term "informed ML" means involving prior knowledge in the process of deriving models from data, and "informed XAI" means involving knowledge in the process of explaining models to humans. While informed ML uses knowledge that is explicitly represented in a machine-processable form, VA can support acquiring knowledge from a human expert, including the expert's prior knowledge and new knowledge that the expert has obtained through interactive visual data analysis. The knowledge of the expert is externalized through interactive visual interfaces and supplied to the ML and XAI components. Please note that "informed XAI" is a new term that we introduce by analogy with "informed ML". The contents of Fig. 1 will be explained in more detail later.

We shall begin by providing background information concerning the explainability of ML models and the deficiencies of common approaches in XAI. After an overview of the relevant research in ML, XAI, and VA, we present the general idea of how VA can contribute to human-centered ML and propose research directions towards realizing this idea.

BACKGROUND
The following definitions and statements are based on a recent survey of the XAI research [8] unless another reference is specified.

In the ML and XAI literature, the terms "explainability" and "interpretability" are used interchangeably. Interpretability is defined as the ability to explain or to provide the meaning of something in terms understandable to a human. The definition implicitly assumes that an explanation is self-contained and does not need further explanations.

An important distinction is made between global and local interpretability. Global interpretability means that humans can understand the whole logic of a model and follow the reasoning leading to all possible outcomes. Local interpretability means the possibility to understand the reasons for a specific decision.

Among the existing types of ML models, a few are recognized as interpretable and easily understandable for humans, namely, decision trees, rules, and linear (regression) models. A decision tree can be represented graphically, and a human can trace its branches and read the logical conditions in the nodes. Rules have the form of logical statements "if ... then ...", which are familiar and understandable to humans. Linear models can be interpreted by considering the sign and magnitude of the contribution of each attribute to a prediction.

These model types are considered interpretable by their nature and as needing no explanations. The research in XAI is concerned with explaining other types of models that are opaque to humans. This research addresses three distinct problems. The black box explanation problem consists in providing a globally interpretable model that is able to mimic the behavior of the black box. The black box outcome explanation problem consists in providing explanations of the reasons for predictions or decisions made by a black box; it is not required to explain the whole logic behind the black box. The black box inspection problem consists in providing a representation (visual or textual) for understanding either how the black box model works or why the black box returns certain predictions more likely than others.

The content of this paper partly refers to the first problem, i.e., black box explanation, which is being solved by creating interpretable mimic models. However, the problem we consider here is different. We share the doubts of ML researchers who question the common belief that certain model types are easily understood by humans just because they can be represented in a human-readable form.

DEFICIENCIES OF CURRENT XAI
Some ML researchers argue that the current XAI approaches fail to provide satisfactory explanations that can be well understood by humans, i.e., linked to their mental models. The term "explainability" is contrasted with "explanation" [12] and "causability" [9]. According to Kovalerchuk et al. [12], a model is truly explained only if a domain expert accepts it based on both empirical evidence of satisfactory accuracy and domain knowledge, theory, or reasoning that goes beyond the given dataset. Instead, XAI methods generate "quasi-explanations", which refer to components and properties of the data and specifics of the modelling algorithm but do not explain models in terms of the domain knowledge and concepts that humans use in their reasoning. "Causability" [9] is defined as the extent to which an explanation achieves a specified level of causal understanding.

The authors of [12] give the following example. Consider a branch of a decision tree or a logical rule "If (x1 > 5) and (x2 < 7) and (x3 > 10) then x belongs to class 1". It may be quite accurate in classifying data instances, and a domain expert can understand what it says if the attributes x1 to x3 are meaningful in the domain the data come from. However, the domain expert may say that, despite its high empirical confirmation, it is not clear why this model should work. The model is not explained in terms of domain knowledge, such as causal relations known in the domain. This is a quite common situation in ML.

Another example given in [12] refers to linear models, which are also commonly recognized as interpretable. It is typical that linear models involve heterogeneous attributes, such as blood pressure, cholesterol level, temperature, and so on. The weighted summation of such heterogeneous attributes has no physical meaning. Even when the attributes are homogeneous, the regression model is not necessarily meaningful. For instance, what is the meaning of a weighted sum of systolic and diastolic blood pressure measurements? Additionally, a theoretically interpretable model, similarly to a deep learning model, may involve highly engineered features, such as a cube root of several indicators, which may not have a domain interpretation.

The problem that these examples refer to can be characterized as a conceptual mismatch between ML/XAI outcomes and human mental models. Another problem, also discussed in [12], is that a model interpretable in theory may be incomprehensible in practice due to its size and complexity. Consider, for example, a decision tree containing hundreds of nodes, as in Fig. 2. A human can trace and understand any small part of it, but the whole tree is beyond the human capabilities for tracing and understanding. Hence, there is a mismatch between the information quantity and the human perceptual and cognitive capabilities.

Figure 2. The structure of a decision tree meant to "explain" the logic of an ML model (an example).

There exists research in XAI aimed at making models easier to comprehend. A few representative works are mentioned in Sidebar 1. However, XAI researchers strive to develop purely algorithmic approaches. VA researchers can complement these efforts by supporting the involvement of human knowledge and reasoning.

STATE OF THE ART
Here we briefly overview the state of the art and open problems in ML and VA concerning the two sides of human-computer collaboration in the development of ML models. One side can be called "Humans for ML": how to make better use of human intellectual capabilities in developing ML models? The other side is "ML for humans": how to ensure that ML results are properly explained to humans in terms of human-relevant concepts rather than machine-specific objects and structures?
Interactive ML
ML acknowledges the value of human feedback in the process of deriving models from data [1]. Most of all, ML researchers are concerned with eliciting training data from human experts [14]. In the ML paradigm known as active learning, an algorithm applies some strategy to choose examples from a pool of unlabeled examples and queries a human "oracle" to provide labels. Apart from the practical difficulties of finding suitable strategies [5,6], this approach can cause problems such as the human's frustration and unwillingness to repeatedly perform a routine task [1]. Visual-interactive labeling, in contrast, gives users an active role and the possibility to apply different strategies [6].

The concept of interactive machine learning [10] acknowledges the fact that people may be capable of and willing to do much more for the development of a good model than just provide data labels. Interactive ML engages human users in a tight interaction loop of iterative modification of data and/or features to improve model performance [2,10]. However, to play such an active role, the users need to understand what the machine is doing and how it uses their inputs. Hence, the machine must be able to explain its behavior to the users.

Informed ML
While traditional ML develops methods to derive models purely from data, more and more researchers call for combining data- and knowledge-based approaches, which can reduce the required amount of training data and, at the same time, lead to better model quality, explainability, and trustworthiness. A research field called informed ML [16] works on integrating machine learning techniques with the processing of conceptual and contextual knowledge. Researchers mostly focus on utilizing knowledge that has been previously prepared and represented in a machine-readable form, such as logic rules, algebraic equations, or concept graphs.

The survey [16] refers to many works on involving the knowledge of human experts in the ML pipeline. Expert knowledge may be provided in the form of algebraic equations, probabilistic relations (often represented by Bayesian network structures), or human feedback. The first two forms can be directly used in an ML algorithm. Examples of human feedback are setting preferences, judging relevance, editing algorithm outcomes, and pre-specifying learning targets, such as topics in text documents or data patterns and hierarchies. For obtaining different forms of human feedback, machine learning is increasingly combined with visual analytics [15].

Granular Computing (GC)
Granular computing [15,17] is a paradigm in computer science and applied mathematics that strives to reflect the human ability to perceive the world at different levels of granularity and to switch between these levels. According to [17], there are three basic concepts that underlie human cognition: granulation, organization, and causation. Informally, granulation involves the decomposition of a whole into parts; organization involves the integration of parts into a whole; and causation involves the association of causes with effects. The central concept of GC is an information granule, which is a construct composed of data or information items based on their similarity, adjacency, or other relationships. The ultimate objective of information granules is to describe phenomena in an easily understood way and at a certain level of abstraction. Therefore, the ideas of GC align very well with the need to explain ML models in human-friendly ways [15].

Acknowledging that the information granules created and used by humans are fuzzy rather than crisp, the founder of GC, L. Zadeh, proposed the theory of fuzzy information granulation supported by fuzzy logic [17]. There are also research works in GC applying the theory of rough sets.

Granular computing does not consist of specific methods; it is rather a set of ideas and a way of thinking. The book [15] contains some examples of involving the ideas of GC in building ML models for specific applications. One of the book chapters calls for combining GC with visual analytics.

Visual Analytics (VA)
VA is a natural partner of ML and AI in the research both on involving users in ML processes and on explaining ML to users. Combining human and machine intelligence is the central idea of VA [2]. Sidebar 2 points out the research areas in VA related to ML and refers to representative works.

Most of the research dealing with ML models has so far been related to different aspects of the problem "humans for ML". The area of VA for XAI can, in principle, be categorized as "ML for humans", but the current research in it addresses mostly the needs of model developers rather than domain experts. The visualization of classification rules in RuleMatrix [13] is meant for users with little competence in ML; however, the authors do not consider the problem of the comprehensibility of large rule sets with rules involving many conditions.
A series of VISxAI workshops (Visualization for AI Explainability, [Link]) promotes the creation of interactive visual "explainables" or "explorables" explaining how ML/AI techniques work using visualization. AI explorables are also being created and published by the Google team PAIR (People + AI Research, [Link]). There are many interesting works allowing users to experiment with models by changing parameters or supplying different inputs. Such experiments, however, do not explain the internal logic of the models. Other works focus on explaining ML concepts and methods rather than models created for specific applications. Both groups of work are oriented more towards students and the curious public than towards domain experts who are going to use the models in practice.

It can be seen that different research communities are concerned with making ML models understood by users. These communities focus on different aspects of the model explanation problem, such as model complexity, form of representation, level of abstraction, and "what-if" explorability. It seems, however, that satisfactory solutions can only be achieved when the communities join their efforts in tackling the problem. Therefore, we propose an interdisciplinary research framework for human-centered ML.

RESEARCH FRAMEWORK
The idea of the proposed research framework is schematically represented in Fig. 1. It is interpreted as follows. Following the paradigms of interactive ML and informed ML, models are developed in tight interaction of ML algorithms with humans, so that human knowledge and human-defined concepts are transferred to the algorithms and used in building computer models. This process is supported by interactive visual interfaces provided by VA. The knowledge and concepts that have been acquired from the human experts are involved not only in model building but also in generating explanations of the models. The methods for doing this, which still need to be developed, can be called "informed XAI", by analogy with informed ML. It can be expected that such methods will soon be developed in the XAI area. When they appear, it will be the task of VA to represent their outcomes to users. VA researchers should also think about possible visual and interactive ways of organizing the outputs of current XAI methods based on human knowledge.

This research framework refers simultaneously to both perspectives of human-computer collaboration in the creation of computer models, i.e., "humans for ML" and "ML for humans". These two perspectives are united through the involvement of human expert knowledge. The role of VA is to support the acquisition of knowledge from experts and the use of the expert knowledge in providing model explanations to the end users.

The research on human-centered ML can build on the achievements and current developments in the areas of interactive ML, informed ML, XAI, and VA. Since VA is interdisciplinary by its nature, it will be the task of VA researchers to design and develop integrated VA-ML-XAI workflows implementing the conceptual view of visual analytics activity as the process of model building [4].

Integrated VA-ML-XAI workflows
Two complementary directions for integration can be envisaged. The first direction involves applying VA to the data that will be used for model building. The idea is that VA supports the human analyst in organizing the data and defining meaningful concepts at an appropriate level of abstraction. A special ML component learns the concepts and the ways of organizing data items into instances of these concepts. The knowledge thus gained from the human is then used in an ML algorithm that derives a model from the data: the algorithm is designed to utilize this expert knowledge for directing the data-driven learning process, according to the ideas and approaches of informed ML. In addition, the knowledge is used by an XAI component, which generates and organizes explanations according to the human-defined concepts, thereby implementing the idea of "informed XAI".

There exist multiple VA solutions supporting the transfer of knowledge from humans to ML algorithms, e.g., [7]. However, we are not aware of works implementing the next step, in which the knowledge obtained is used for the generation of human-oriented explanations.

The second direction is the interplay of the VA and XAI components. The XAI component initially generates detailed low-level explanations. The human analyst uses VA techniques to organize subsets of these explanations into meaningful information granules, in terms of granular computing, and thereby define relevant concepts at suitable levels of abstraction. The XAI component learns the concepts and the ways of organizing explanations from the analyst and applies this knowledge to other subsets of "raw" explanations under the expert's supervision. Being trained in this way, the XAI component will later be able to use the learned principles of structuring and abstracting in explaining other ML models of the same type (e.g., classification or regression) in the same domain. This is another way of implementing the idea of "informed XAI". VA techniques are used to present the resulting explanations to users in effective ways.
To create a theoretical basis underpinning these practical developments, we propose to work on combining the ideas and frameworks of visual analytics and granular computing.

Theoretical research
GC aims to model the human ability to organize and perceive information at different levels of abstraction. VA, in turn, is concerned with supporting abstractive perception of data and information from visual displays. The central concept in VA is a pattern, which is a combination of multiple items perceived and considered together as a single entity due to relationships existing between the items [3]. Patterns themselves may also be linked by relationships and, on this basis, integrated into patterns of a higher level of abstraction.

There is a semantic similarity between the concepts of an information granule in GC and a pattern in VA. The ultimate goal of VA is similar to that of GC: to enable humans to understand phenomena at appropriate levels of abstraction. Therefore, it appears reasonable to link these two research fields. It needs to be investigated what theories and methods of GC can be integrated with techniques of VA, how different types of information granules can be represented visually, and how these types of granules can be formed through human-computer discourse using visual and interactive techniques.

In particular, GC is concerned with modelling the approximate, fuzzy way of human conceptualization and reasoning. As mathematical apparatuses for this, GC proposes to use the theories of fuzzy sets and rough sets. These formalisms appear suitable for representing data patterns, such as a cluster or a trend, which usually have an approximate character.

Based on the definition of a data pattern as a system of type-specific relationships between data items [3], it may be possible to generate formal representations of data patterns discovered in the process of visual analysis and roughly outlined or otherwise marked by the analysts. These formal representations can be processed by computers and used in model building. To find suitable ways of representing data patterns, it is also reasonable to consult the literature on knowledge representation in classical AI.

To provide a theoretical foundation for the organization and abstraction of low-level XAI outputs, it is necessary to elaborate the pattern theory in more detail for defining possible patterns in such a complex type of information as XAI-generated explanations, e.g., those having the form of decision rules or trees. In the next section, we describe some preliminary ideas concerning patterns in a set of rules and possibilities for uniting and generalizing related rules. Please note that these ideas and examples refer to the second direction in the work on implementing integrated VA-ML-XAI workflows.

EXAMPLE: GRANULATION OF RULES
Let us consider decision rules with conditions involving numeric attributes (features). Such rules may be components of an original ML model or of a mimic model constructed by some XAI method to explain a black box model. Each condition of a rule refers to one feature and states that the feature value must be lower or higher than a certain constant, or that it must lie within a certain interval. A rule usually contains several conditions connected by the logical operator AND. The outcome, or consequent, of a rule is one element from a finite set of possible classes, decisions, or actions.

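For concreteness, such a rule can be encoded as a mapping from features to value intervals together with an outcome label. The following minimal Python sketch is our own illustration (the names Rule and matches are not taken from any particular library); open-ended conditions such as "x1 > 5" are expressed with infinite bounds, and the strictness of inequalities is ignored for simplicity:

```python
import math
from dataclasses import dataclass

@dataclass
class Rule:
    # Maps each feature name to the (low, high) interval of its condition;
    # an open-ended condition such as "x1 > 5" uses an infinite bound.
    conditions: dict   # e.g., {"x1": (5.0, math.inf)}
    outcome: str       # the predicted class, decision, or action

    def matches(self, instance: dict) -> bool:
        """True if the instance satisfies all conditions (logical AND)."""
        return all(low <= instance[f] <= high
                   for f, (low, high) in self.conditions.items())

# "If (x1 > 5) and (x2 < 7) and (x3 > 10) then x belongs to class 1"
r = Rule({"x1": (5.0, math.inf), "x2": (-math.inf, 7.0),
          "x3": (10.0, math.inf)}, outcome="class 1")
print(r.matches({"x1": 6.0, "x2": 3.0, "x3": 12.0}))  # True
```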

Figure 3. A fragment of a table representing rules.

To see rules in a visual form, we can use a table view like the one shown in Fig. 3. Each table row corresponds to one rule, and for each feature there is a column. Conditions, i.e., intervals of feature values, are represented by horizontal bars, which show the relative positions of the intervals between the minimal and maximal feature values. If a feature is not used in a rule, the corresponding cell is empty.

A single rule can also be represented by a glyph, as shown in Fig. 4. Based on the idea of parallel coordinate axes, a glyph includes vertical axes corresponding to all features occurring in a rule set. Vertical bars represent the value intervals of the features used in the rule. The color of the glyph frame encodes the rule outcome.

Figure 4. One rule represented by a glyph.

According to the definition of a pattern [3], patterns in a set of rules emerge due to relationships between rules. Relationships between rules are composed of relationships between their conditions and between their outcomes. For the outcomes, two relationships are possible: same or distinct. Relationships between two conditions involving the same feature are relationships between the value intervals specified in the conditions. The intervals can be disjoint, overlapping, coinciding, or one can lie inside the other. Relationships between intervals can also be expressed numerically as distances between them. For this purpose, we can use an adapted version of the Hausdorff distance between two subsets of a metric space.

Figure 5 demonstrates a possible visual representation of relationships between rule conditions. Three rules are represented by glyphs. The outcome of the first rule differs from the outcomes of the two others. The first rule is selected. Its conditions are represented in all three glyphs by bars shaded in light blue and drawn to the right of the corresponding feature axes. The relative positions of the framed hollow bars and the shaded bars represent the relationships between the feature value intervals in the conditions of the selected rule and in the other rules.

Figure 5. Representation of relationships between rule conditions.

To understand the possible relationships between rule antecedents composed of multiple conditions, let us imagine the multidimensional space of all features involved in all rules. The antecedent of a rule can be imagined as a shape (a hyper-parallelepiped) in this space. When some feature is not used in a rule explicitly, it can be treated as implicitly present with the value interval covering the whole range of feature values from the smallest to the largest. In such a view, relationships between rule antecedents translate to relationships between such multidimensional shapes. Possible types of relationships are set relationships (disjoint, intersect, include, coincide) and metric distance relationships between the shapes. As a numerical expression of these distances, we can use, for example, the mean of the distances in all individual dimensions. This numeric measure of rule similarity can be used to algorithmically find groups (clusters) of close rules, as well as for ordering rules. Thus, adjacent table rows in Fig. 3 correspond to close rules.

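These notions translate directly into code. Continuing the hypothetical Rule sketch from above, the functions below classify the relationship between two closed intervals and compute their distance. For closed intervals on the real line, the Hausdorff distance reduces to the maximum of the two endpoint differences, which is one plausible reading of the "adapted version" mentioned above. The distance between rules is then the mean of the per-feature interval distances, with absent conditions treated as full value ranges and infinite bounds clipped to the feature ranges so that the distances stay finite:

```python
def clip(interval, full_range):
    """Clip an interval's (possibly infinite) bounds to the feature range."""
    lo, hi = interval
    return (max(lo, full_range[0]), min(hi, full_range[1]))

def interval_relation(i1, i2):
    """Classify two closed intervals: disjoint, overlapping, coinciding,
    or one lying inside the other."""
    (a1, b1), (a2, b2) = i1, i2
    if (a1, b1) == (a2, b2):
        return "coincide"
    if b1 < a2 or b2 < a1:
        return "disjoint"
    if a1 <= a2 and b2 <= b1:
        return "second lies inside first"
    if a2 <= a1 and b1 <= b2:
        return "first lies inside second"
    return "overlap"

def interval_distance(i1, i2):
    """Hausdorff distance between two closed real intervals."""
    (a1, b1), (a2, b2) = i1, i2
    return max(abs(a1 - a2), abs(b1 - b2))

def rule_distance(r1, r2, feature_ranges):
    """Mean per-feature interval distance between two rule antecedents.
    A feature absent from a rule counts as its full value range."""
    dists = [interval_distance(clip(r1.conditions.get(f, full), full),
                               clip(r2.conditions.get(f, full), full))
             for f, full in feature_ranges.items()]
    return sum(dists) / len(dists)
```

Normalizing each per-feature distance by the width of the feature's value range would make heterogeneous features comparable; we omit this here for brevity.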
Figure 6. A union of a group of close rules.

An important rule pattern is a cluster of close rules having the same outcome. Such a cluster can be abstracted into a multidimensional shape enclosing it. This envelope shape, in turn, corresponds to a rule that is more general than each member rule of the cluster; we shall call it a union rule. We say that the union rule covers each original rule of the cluster that has been abstracted. In terms of rule conditions, this means that each interval of feature values of the union rule covers (i.e., coincides with or includes) the value intervals of the same feature in all original rules. Hence, a union rule can be derived from a group of rules by taking the unions of the value intervals of the same features. When the union of two or more intervals equals the full range of the feature values, the condition referring to this feature can be omitted from the union rule. Fig. 6 shows an example of a union rule abstracting a cluster of five close rules.

In terms of granular computing, a union rule is an information granule. Union rules can be derived by iterative joining of pairs of close rules. This creates rule hierarchies involving information granules of different degrees of abstraction.
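In code, this derivation amounts to taking, for every feature, the smallest interval enclosing the intervals of all member rules (their interval hull) and dropping any condition that grows to the full feature range. A minimal sketch, reusing Rule and clip from the sketches above:

```python
def union_rule(rules, feature_ranges):
    """Abstract a group of same-outcome rules into a union rule whose
    per-feature interval encloses the intervals of all member rules."""
    assert len({r.outcome for r in rules}) == 1, "expected one common outcome"
    conditions = {}
    for f, full in feature_ranges.items():
        intervals = [clip(r.conditions.get(f, full), full) for r in rules]
        lo = min(i[0] for i in intervals)
        hi = max(i[1] for i in intervals)
        # A condition spanning the whole feature range carries no
        # information and is omitted from the union rule.
        if (lo, hi) != full:
            conditions[f] = (lo, hi)
    return Rule(conditions, rules[0].outcome)
```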
A union rule covering a cluster of close rules with the same outcome may occasionally also cover some other rules with different outcomes. This is similar to enclosing a cluster of points of the same class in a scatterplot by a bounding box: some points of another class may also fit into the box. Hence, a union rule can be an approximate, rough representation of a cluster of similar rules. We shall use the term rough rule for a rule that covers two or more rules with the same outcome as its own and at least one rule with a different outcome. The accuracy of a rule can be numerically expressed as the ratio of the number of covered rules with the same outcome as in this rule to the total number of rules covered by this rule. The accuracy of a rough rule will thus be less than 1.

Obviously, a rough union rule is less suitable for making predictions than the original group of rules that has been abstracted. However, it may be quite suitable for explaining the model logic to a human, since it is normal for human cognition to deal with rough concepts and approximations. A user of an ML model may agree to accept some inaccuracies in exchange for a simpler description of the model logic, and the user can choose the minimal accuracy that is still acceptable. Hence, by finding and abstracting clusters of rules with the same outcomes, we aim to derive a simpler model that is descriptive but not necessarily predictive.
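One simple way to operationalize this is a greedy loop that repeatedly unites the closest pair of same-outcome rules as long as the resulting union rule stays above a chosen accuracy threshold. The sketch below is only an illustration of the idea, not necessarily the algorithm used in the experiments reported next; it reuses Rule, clip, rule_distance, and union_rule from the previous sketches:

```python
def covers(general, specific, feature_ranges):
    """True if every interval of `general` encloses the corresponding
    interval of `specific`."""
    for f, full in feature_ranges.items():
        glo, ghi = clip(general.conditions.get(f, full), full)
        slo, shi = clip(specific.conditions.get(f, full), full)
        if not (glo <= slo and shi <= ghi):
            return False
    return True

def rule_accuracy(union, all_rules, feature_ranges):
    """Ratio of covered rules sharing the union rule's outcome to all
    covered rules; 1 for a pure union, less than 1 for a rough rule."""
    covered = [r for r in all_rules if covers(union, r, feature_ranges)]
    same = [r for r in covered if r.outcome == union.outcome]
    return len(same) / len(covered)

def granulate(rules, feature_ranges, min_accuracy=0.75):
    """Greedily merge the closest same-outcome pair whose union rule
    meets the accuracy threshold; repeat until no merge is possible."""
    rules = list(rules)
    merged = True
    while merged:
        merged = False
        pairs = sorted(
            ((rule_distance(a, b, feature_ranges), i, j)
             for i, a in enumerate(rules) for j, b in enumerate(rules)
             if i < j and a.outcome == b.outcome),
            key=lambda t: t[0])
        for _, i, j in pairs:
            u = union_rule([rules[i], rules[j]], feature_ranges)
            if rule_accuracy(u, rules, feature_ranges) >= min_accuracy:
                rules = [r for k, r in enumerate(rules) if k not in (i, j)]
                rules.append(u)
                merged = True
                break
    return rules
```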
We have conducted multiple experiments on the granulation of different ML models consisting of rules or decision trees (a decision tree can be transformed into a set of rules by representing each path from the root to a leaf by one rule). The models were created from several benchmark datasets using state-of-the-art ML methods. Our goal was to find out how much a model can be simplified by means of rule granulation. We varied the minimal accuracy threshold from 1 to 0.6. Interestingly, even with the threshold equal to 1, some compression is achieved. For example, a 3-class classification model with 109 rules and 818 conditions in total was reduced to 103 rules with 762 conditions. With the threshold of 0.75 for the same model, we obtained 84 rules with 594 conditions, and the threshold of 0.6 gave us 54 rules with 342 conditions. A model with 10 classes containing 202 rules (1739 conditions) was abstracted to 167 rules (1357 conditions) with the threshold 0.75 and to 139 rules (1062 conditions) with the threshold 0.6. Similar degrees of compression were achieved in the other experiments.
be an approximate, rough representation of a cluster of training data as the original rules. A model can be
similar rules. We shall use the term rough rule for a better understood by a domain expert if it refers to
rule covering two or more rules with the same higher-level domain-relevant concepts. Such concepts
outcome as in this rule and at least one rule with a cannot be automatically derived from data but need to


As a simple example of feature granulation, let us imagine the creation of a model for diagnosing various allergies. Elementary features may be symptoms like sneezing, runny nose, blocked nose, red eyes, itchy eyes, watery eyes, itchy skin, red rash, and many others. A domain expert may organize the symptoms into groups, such as nasal symptoms, eye symptoms, skin symptoms, etc., and tell the learning algorithm which groups of symptoms are related to respiratory allergies, skin allergies, food allergies, and so on. When the groups of symptoms and groups of allergies defined by the expert are involved in the model, or at least used in generating explanations of the model, it can be expected that the explanations will be more structured, more meaningful for domain users, and better understood by them.
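In its simplest form, feature granulation is an expert-defined mapping from elementary features to named groups, with group-level feature values derived by some aggregation. In the following sketch, the grouping and the maximum aggregation are our own illustrative choices, not a prescription:

```python
# Hypothetical symptom groups defined by a domain expert.
FEATURE_GROUPS = {
    "nasal symptoms": ["sneezing", "runny nose", "blocked nose"],
    "eye symptoms":   ["red eyes", "itchy eyes", "watery eyes"],
    "skin symptoms":  ["itchy skin", "red rash"],
}

def granulate_features(instance: dict) -> dict:
    """Replace elementary symptom indicators (0/1) by group-level
    features, here simply the maximum over the group's members."""
    return {group: max(instance.get(f, 0) for f in members)
            for group, members in FEATURE_GROUPS.items()}

patient = {"sneezing": 1, "runny nose": 1, "red eyes": 0, "itchy skin": 0}
print(granulate_features(patient))
# {'nasal symptoms': 1, 'eye symptoms': 0, 'skin symptoms': 0}
```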
CONCLUSION
With this paper, we aim to motivate and trigger research on bridging the gaps between machine learning and human mental models using a synergy of approaches from informed machine learning, artificial intelligence, and visual analytics. While a substantial amount of research is being conducted in several areas of computer science, the contribution from visual analytics is still low. We believe that VA researchers should take a lead in these efforts, since the goal of combining human and computer intelligence lies at the core of VA. Interactive visual interfaces serve as means of human-computer communication and as facilitators of human abstractive perception of information and derivation of new knowledge, which refines and enriches human mental models [4]. Since human knowledge plays the key role in the proposed framework (Fig. 1), visual analytics researchers are supposed to care about capturing this knowledge and transferring it to computers.

We have outlined several lines of research in VA that fit in the proposed research framework. These include theoretical developments, such as models and methods of information granulation and transformation of data patterns into knowledge structures, and practice-oriented design of workflows involving cross-disciplinary approaches. Progress in these directions will result in methods and systems for building models enhanced by the power of human intelligence and readily accepted by humans as extensions of their mental models and enhancers of their reasoning.

ACKNOWLEDGEMENTS
This research was supported by the Fraunhofer Center for Machine Learning within the Fraunhofer Cluster for Cognitive Internet Technologies, by DFG within Priority Programme 1894 (SPP VGI), by the EU in the project SoBigData++, and by SESAR in the projects TAPAS and SIMBAD.

REFERENCES
1. S. Amershi, M. Cakmak, W.B. Knox, and T. Kulesza. "Power to the People: The Role of Humans in Interactive Machine Learning". AI Magazine, vol. 35, no. 4, pp. 105-120, 2014.
2. N. Andrienko, G. Andrienko, G. Fuchs, A. Slingsby, C. Turkay, and S. Wrobel. "Visual Analytics for Data Scientists". Springer, 2020.
3. N. Andrienko, G. Andrienko, S. Miksch, H. Schumann, and S. Wrobel. "A theoretical model for pattern discovery in visual analytics". Visual Informatics, vol. 5, no. 1, pp. 23-42, 2021.
4. N. Andrienko, T. Lammarsch, G. Andrienko, G. Fuchs, D. Keim, S. Miksch, and A. Rind. "Viewing visual analytics as model building". Computer Graphics Forum, vol. 37, no. 6, pp. 275-299, 2018.
5. J. Attenberg and F. Provost. "Inactive learning? Difficulties employing active learning in practice". ACM SIGKDD Explorations Newsletter, vol. 12, no. 2, pp. 36-41, 2011.
6. J. Bernard, M. Zeppelzauer, M. Lehmann, M. Müller, and M. Sedlmair. "Towards User-Centered Active Learning Algorithms". Computer Graphics Forum, vol. 37, pp. 121-132, 2018.
7. S. van den Elzen and J.J. van Wijk. "BaobabView: Interactive construction and analysis of decision trees". 2011 IEEE Conference on Visual Analytics Science and Technology (VAST), pp. 151-160, 2011.
8. R. Guidotti, A. Monreale, F. Turini, D. Pedreschi, and F. Giannotti. "A Survey of Methods for Explaining Black Box Models". ACM Computing Surveys, vol. 51, 2018.
9. A. Holzinger, G. Langs, H. Denk, K. Zatloukal, and H. Müller. "Causability and explainability of artificial intelligence in medicine". Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 9, no. 4, e1312, Jul.-Aug. 2019.
10. L. Jiang, S. Liu, and C. Chen. "Recent research advances on interactive machine learning". Journal of Visualization, vol. 22, no. 2, pp. 401-417, 2019.
11. D. Keim, G. Andrienko, J.-D. Fekete, C. Görg, J. Kohlhammer, and G. Melançon. "Visual Analytics: Definition, Process, and Challenges". In: A. Kerren, J.T. Stasko, J.-D. Fekete, and C. North (eds.), Information Visualization, Lecture Notes in Computer Science, vol. 4950, Springer, Berlin, Heidelberg, 2008.
12. B. Kovalerchuk, M.A. Ahmad, and A. Teredesai. "Survey of Explainable Machine Learning with Visual and Granular Methods Beyond Quasi-Explanations". In: W. Pedrycz and S.-M. Chen (eds.), Interpretable Artificial Intelligence: A Perspective of Granular Computing, Studies in Computational Intelligence, vol. 937, pp. 217-267, Springer, 2021.

13. Y. Ming, H. Qu, and E. Bertini. "RuleMatrix: Visualizing and Understanding Classifiers with Rules". IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 342-352, 2019.
14. R. Monarch. "Human-in-the-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI". Manning Publications, 2021.
15. W. Pedrycz and S.-M. Chen (eds.). "Interpretable Artificial Intelligence: A Perspective of Granular Computing". Springer, 2021.
16. L. von Rueden et al. "Informed Machine Learning - A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems". IEEE Transactions on Knowledge and Data Engineering, 2021, doi: 10.1109/TKDE.2021.3079836.
17. L.A. Zadeh. "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic". Fuzzy Sets and Systems, vol. 90, pp. 111-127, 1997.

Natalia Andrienko is a lead scientist at the Fraunhofer Institute for Intelligent Analysis and Information Systems and a part-time professor at City, University of London. Results of her research have been published in two monographs, "Exploratory Analysis of Spatial and Temporal Data: A Systematic Approach" (2006) and "Visual Analytics of Movement" (2013). Natalia Andrienko has been an associate editor of IEEE Transactions on Visualization and Computer Graphics (2016-2020) and is now an associate editor of Visual Informatics.

Gennady Andrienko is a lead scientist responsible for visual analytics research at the Fraunhofer Institute for Intelligent Analysis and Information Systems and a part-time professor at City, University of London. Gennady Andrienko was a paper chair of the IEEE VAST conference (2015-2016) and an associate editor of IEEE Transactions on Visualization and Computer Graphics (2012-2016), Information Visualization, and the International Journal of Cartography.

Linara Adilova is a PhD student at Ruhr University Bochum and a research scientist at Fraunhofer IAIS. She has been working and publishing on multiple research directions, e.g., distributed (federated) learning and autonomous driving. Her main research focus lies in the theory and mathematics of deep learning.

Stefan Wrobel is Professor of Computer Science at the University of Bonn and Director of the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS. His work focuses on questions of the digital revolution, in particular intelligent algorithms and systems for the large-scale analysis of data and the influence of Big Data/Smart Data on the use of information in companies and society. He is the author of a large number of publications on data mining and machine learning, is on the editorial board of several leading academic journals in his field, and is an elected founding member of the International Machine Learning Society.


SIDEBAR 1: XAI EFFORTS FOR IMPROVING MODEL COMPREHENSIBILITY

A so-called "user-centric XAI framework" [3] aims to link XAI approaches to theories describing human reasoning and decision making, which have been developed in psychology and philosophy. The framework is intended to inform XAI researchers about human cognitive patterns that should be taken into account in designing XAI methods. The authors care most of all about the use of XAI for the mitigation of human cognitive biases and the improvement of human reasoning and decision making, rather than about the improvement of XAI itself.

There exist research works on structuring and abstracting information to increase model comprehensibility. One example is an approach to identifying the contribution of groups of features to the predictive accuracy of a model [1]. It uses a predefined hierarchy of features and tries to ascertain the level of resolution at which the importance of the features and feature groups can be determined. Another example is the integration of multiple decision tree models into a more general model [2]. The proposed approaches are purely algorithmic. VA researchers can complement these efforts by supporting the involvement of human knowledge and reasoning.

REFERENCES
1. K. Lee, A. Sood, and M. Craven. "Understanding Learned Models by Identifying Important Features at the Right Resolution". AAAI, vol. 33, no. 01, pp. 4155-4163, Jul. 2019.
2. P. Strecht, J. Mendes-Moreira, and C. Soares. "Inmplode: A framework to interpret multiple related rule-based models". Expert Systems, 2021.
3. D. Wang, Q. Yang, A. Abdul, and B.Y. Lim. "Designing Theory-Driven User-Centric Explainable AI". Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, ACM, New York, NY, USA, Paper 601, pp. 1-15, 2019.

SIDEBAR 2: VA RESEARCH ON COMBINING VA AND ML

There are several research areas in VA related to ML:

• ML in VA: incorporation of ML methods in VA systems and workflows to complement human reasoning and advance data analysis [1].
• Predictive VA: synergistic use of ML and VA techniques for the development of predictive models [4].
• VA-assisted ML: leveraging VA techniques in ML workflows [5,7].
• VA of ML models: VA support for model inspection, i.e., the process of understanding, diagnosing, and refining an ML model [2,3].
• VA for XAI: interactive visual interfaces to XAI methods [6].

REFERENCES
1. A. Endert, W. Ribarsky, [Link], B.L.W. Wong, I.T. Nabney, I. Diaz-Blanco, and F. Rossi. "The State of the Art in Integrating Machine Learning into Visual Analytics". Computer Graphics Forum, vol. 36, no. 8, pp. 458-486, 2017.
2. F. Hohman, M. Kahng, R. Pienta, and D.H. Chau. "Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers". IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 8, pp. 2674-2693, 2019.
3. S. Liu, X. Wang, M. Liu, and J. Zhu. "Towards better analysis of machine learning models: A visual analytics perspective". Visual Informatics, vol. 1, no. 1, pp. 48-56, 2017.
4. Y. Lu, R. Garcia, B. Hansen, M. Gleicher, and R. Maciejewski. "The state-of-the-art in predictive visual analytics". Computer Graphics Forum, vol. 36, no. 3, pp. 539-562, 2017.
5. D. Sacha, M. Kraus, D.A. Keim, and M. Chen. "VIS4ML: An Ontology for Visual Analytics Assisted Machine Learning". IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 385-395, 2019.
6. T. Spinner, U. Schlegel, H. Schäfer, and M. El-Assady. "explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning". IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 1, pp. 1064-1074, 2020.
7. J. Yuan, C. Chen, W. Yang, M. Liu, J. Xia, and S. Liu. "A survey of visual analytics techniques for machine learning". Computational Visual Media, vol. 7, pp. 3-36, 2021.
