Visual Analytics For Human-Centered Machine Learning
N. Andrienko, L. Adilova
Abstract—We introduce a new research area in Visual Analytics (VA) aiming to bridge existing gaps between methods of interactive Machine Learning (ML) and eXplainable Artificial Intelligence (XAI), on one side, and human minds, on the other. The gaps are, first, a conceptual mismatch between ML/XAI outputs and human mental models and ways of reasoning and, second, a mismatch between the quantity and level of detail of the provided information and human capabilities to perceive and understand it. A grand challenge is to adapt ML and XAI to human goals, concepts, values, and ways of thinking. Complementing the current efforts in XAI towards solving this challenge, VA can contribute by exploiting the potential of visualization as an effective way of communicating information to humans and a strong trigger of human abstractive perception and thinking. We propose a cross-disciplinary research framework and formulate research directions for VA.
IEEE CG&A. Published by the IEEE Computer Society. XXXX-XXXX © 2019 IEEE

THE IMPORTANCE of involving humans in the process of creating and training Machine Learning (ML) models is currently widely recognized in the ML community [1]. It is argued that humans involved in this process need to understand what the machine is doing and how it uses human inputs; hence, the machine must be able to explain its behavior to the users. Understanding of ML models is also of critical importance for deciding whether they can be adopted for practical use. Explainability of models may even be more important than their performance, especially in high-stake domains. In response to the need to explain untransparent ML models ("black boxes") to users, the research field of eXplainable Artificial Intelligence (XAI) has emerged recently [8]. The work in this field was boosted by the European Parliament's adoption of the General Data Protection Regulation (GDPR), which introduces the right of individuals to receive explanations of automatically made decisions relevant to them.

However, there is a tendency to admit that model "explainability" does not necessarily mean that the model is indeed properly explained to humans and understood by them [9,12]. In this paper, we discuss the deficiencies of the common approaches to explaining ML models, mention current efforts towards overcoming these deficiencies, argue why Visual Analytics (VA) [11] should be involved in such efforts, and consider its possible role in helping humans to understand models.

Figure 1. Schematic representation of the research framework for human-centered ML supported by VA.

Based on our considerations, we propose a research framework for developing VA approaches supporting human-centered ML. The basic idea is schematically represented in Fig. 1. Here, the term "informed ML" means involving prior knowledge in the process of deriving models from data, and "informed XAI" means involving knowledge in the process of explaining models to humans. While informed ML uses knowledge that is explicitly represented in a machine-processable form, VA can support acquiring knowledge from a human expert, including the expert's prior knowledge and new knowledge that the expert has obtained through interactive visual data analysis. The knowledge of the expert is externalized through interactive visual interfaces and supplied to the ML and XAI components. Please note that "informed XAI" is a new term that we introduce by analogy with "informed ML". The contents of Fig. 1 will be explained in more detail later.

We shall begin by providing background information concerning the explainability of ML models and the deficiencies of common approaches in XAI. After an overview of the relevant research in ML, XAI, and VA, we present the general idea of how VA can contribute to human-centered ML and propose research directions towards realizing this idea.

BACKGROUND
The following definitions and statements are based on a recent survey of the XAI research [8] unless another reference is specified.

In the ML and XAI literature, the terms "explainability" and "interpretability" are used interchangeably. Interpretability is defined as the ability to explain or to provide the meaning of something in terms understandable to a human. The definition implicitly assumes that an explanation is self-contained and does not need further explanations.

An important distinction is made between global and local interpretability. Global interpretability means that humans can understand the whole logic of a model and follow the reasoning leading to all possible outcomes. Local interpretability means the possibility to understand the reasons for a specific decision.

Among the existing types of ML models, a few are recognized as interpretable and easily understandable for humans, namely, decision trees, rules, and linear (regression) models. A decision tree can be represented graphically, and a human can trace its branches and read the logical conditions in the nodes. Rules have the form of logical statements "if … then …", which are familiar and understandable to humans. Linear models can be interpreted by considering the sign and magnitude of the contribution of each attribute to a prediction.

These model types are considered interpretable by their nature and as needing no explanations. The research in XAI is concerned with explaining other types of models that are untransparent to humans. The research addresses three distinct problems. The black box explanation problem consists in providing a globally interpretable model which is able to mimic the behavior of the black box. The black box outcome explanation problem consists in providing explanations of the reasons for predictions or decisions made by a black box; it is not required to explain the whole logic behind the black box. The black box inspection problem consists in providing a representation (visual or textual) for understanding either how the black box model works or why the black box returns certain predictions more likely than others.

The content of this paper partly refers to the first problem, i.e., black box explanation, which is being
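The reading of linear models by coefficient sign and magnitude can be made concrete with a small sketch; the feature names and coefficient values below are invented for illustration and do not come from any model discussed in this paper.

```python
# A minimal sketch of interpreting a linear model by the sign and magnitude
# of its coefficients; the feature names and weights are hypothetical.

def interpret_linear_model(feature_names, coefficients):
    """Return one human-readable note per feature, strongest influence first."""
    notes = []
    for name, w in sorted(zip(feature_names, coefficients),
                          key=lambda pair: -abs(pair[1])):
        direction = "increases" if w > 0 else "decreases"
        notes.append(f"{name}: {direction} the predicted score by {abs(w):.2f} per unit")
    return notes

# Hypothetical model: score = 2.5*income - 0.8*debt + 0.1*age
for note in interpret_linear_model(["income", "debt", "age"], [2.5, -0.8, 0.1]):
    print(note)
```

Sorting by absolute magnitude puts the most influential attribute first, which mirrors how a human reader would scan such an explanation (assuming comparably scaled features).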
relevant concepts rather than machine-specific objects and structures?

Interactive ML
ML acknowledges the value of human feedback in the process of deriving models from data [1]. Most of all, ML researchers are concerned with eliciting training data from human experts [14]. In the ML paradigm known as active learning, an algorithm applies some strategy to choose from a pool of unlabeled examples and queries a human "oracle" to provide labels. Apart from practical difficulties in finding suitable strategies [5,6], this approach can cause such problems as the human's frustration and unwillingness to repeatedly perform a routine task [1]. Visual-interactive labeling gives users an active role and the possibility to apply different strategies [6].

The concept of interactive machine learning [10] acknowledges the fact that people may be capable of and willing to do much more for the development of a good model than just providing data labels. Interactive ML engages human users in a tight interaction loop of iterative modification of data and/or features to improve model performance [2,10]. However, to play such an active role, the users need to understand what the machine is doing and how it uses their inputs. Hence, the machine must be able to explain its behavior to the users.

Informed ML
While traditional ML develops methods to derive models purely from data, more and more researchers call for combining data- and knowledge-based approaches, which can reduce the required amount of training data and, at the same time, lead to better model quality, explainability, and trustworthiness. A research field called informed ML [16] works on integrating machine learning techniques with the processing of conceptual and contextual knowledge. Researchers mostly focus on utilizing knowledge that has been previously prepared and represented in a machine-readable form, such as logic rules, algebraic equations, or concept graphs.

The survey [16] refers to many works on involving the knowledge of human experts in the ML pipeline. Expert knowledge may be provided in the form of algebraic equations, probabilistic relations (often represented by Bayesian network structures), or human feedback. The first two forms can be directly used in an ML algorithm. Examples of human feedback are setting preferences, judging relevance, editing algorithm outcomes, and pre-specifying learning targets, such as topics in text documents or data patterns and hierarchies. For obtaining different forms of human feedback, machine learning is increasingly combined with visual analytics [15].

Granular Computing (GC)
Granular computing [15,17] is a paradigm in computer science and applied mathematics that strives to reflect the human ability to perceive the world at different levels of granularity and to switch between these levels. According to [17], three basic concepts underlie human cognition: granulation, organization, and causation. Informally, granulation involves decomposition of a whole into parts; organization involves integration of parts into a whole; and causation involves association of causes with effects. The central concept of GC is an information granule, which is a construct composed of data or information items based on their similarity, adjacency, or other relationships. The ultimate objective of information granules is to describe phenomena in an easily understood way and at a certain level of abstraction. Therefore, the ideas of GC align very well with the need to explain ML models in human-friendly ways [15].

Acknowledging that the information granules created and used by humans are fuzzy rather than crisp, the founder of GC, L. Zadeh, proposed the theory of fuzzy information granulation supported by fuzzy logic [17]. There are also research works in GC applying the theory of rough sets.

Granular computing does not consist of specific methods; it is rather a set of ideas and a way of thinking. The book [15] contains some examples of involving the ideas of GC in building ML models for specific applications. One of the book chapters calls for combining GC with visual analytics.

Visual Analytics (VA)
VA is a natural partner of ML and AI in the research both on involving users in ML processes and on explaining ML to users. Combining human and machine intelligence is the central idea of VA [2]. Sidebar 2 points out the research areas in VA related to ML and refers to representative works.

Most of the research dealing with ML models has so far been related to different aspects of the problem "humans for ML". The area of VA for XAI can, in principle, be categorized as "ML for humans", but the current research in it mostly addresses the needs of model developers rather than domain experts. The visualization of classification rules in RuleMatrix [13]
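To make the notion of an information granule tangible, the sketch below groups numeric data items into granules by adjacency. The gap threshold and the one-dimensional setting are simplifying assumptions made for illustration, not a method prescribed by the GC literature.

```python
# A minimal sketch of granulation: adjacent numeric items are merged into
# one granule; a gap wider than max_gap starts a new granule. The threshold
# value is an arbitrary illustrative choice.

def granulate(values, max_gap=1.0):
    """Split sorted numeric items into granules at gaps wider than max_gap."""
    items = sorted(values)
    granules = [[items[0]]]
    for prev, curr in zip(items, items[1:]):
        if curr - prev > max_gap:
            granules.append([])   # gap too wide: start a new granule
        granules[-1].append(curr)
    return granules

print(granulate([1.0, 1.2, 1.5, 4.0, 4.3, 9.0]))
# [[1.0, 1.2, 1.5], [4.0, 4.3], [9.0]]
```

Each resulting list stands for one granule; describing the data at the level of granules (e.g., by their ranges) rather than individual items is the kind of abstraction GC aims at.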
XAI component learns the concepts and the ways of organizing explanations from the analyst and applies this knowledge to other subsets of "raw" explanations under the expert's supervision. Being trained in this way, the XAI component will later be able to use the learned principles of structuring and abstracting in explaining other ML models of the same type (e.g., classification or regression) in the same domain. This is another way of implementing the idea of "informed XAI". VA techniques are used to present the resulting explanations to users in effective ways.

To create a theoretical basis underpinning these practical developments, we propose to work on combining the ideas and frameworks of visual analytics and granular computing.

Theoretical research
GC aims to model the human ability to organize and perceive information at different levels of abstraction. VA, in turn, is concerned with supporting abstractive perception of data and information from visual displays. The central concept in VA is a pattern, which is a combination of multiple items perceived and considered together as a single entity due to relationships existing between the items [3]. Patterns themselves may also be linked by relationships and, on this basis, integrated into patterns of a higher level of abstraction.

There is a semantic similarity between the concepts of information granule in GC and pattern in VA. The ultimate goal of VA is similar to that of GC: to enable humans to understand phenomena at appropriate levels of abstraction. Therefore, it appears reasonable to link these two research fields. It needs to be investigated what theories and methods of GC can be integrated with techniques of VA, how different types of information granules can be represented visually, and how these types of granules can be formed through human-computer discourse using visual and interactive techniques.

In particular, GC is concerned with modelling the approximate, fuzzy way of human conceptualization and reasoning. As mathematical apparatuses for this, GC proposes to use the fuzzy sets and rough sets theories. These formalisms appear suitable for representing data patterns, such as a cluster or a trend, which usually have an approximate character.

Based on the definition of a data pattern as a system of type-specific relationships between data items [3], it may be possible to generate formal representations of data patterns discovered in the process of visual analysis and roughly outlined or otherwise marked by the analysts. These formal representations can be processed by computers and used in model building. To find suitable ways of representing data patterns, it is also reasonable to consult the literature on knowledge representation in classical AI.

To provide a theoretical foundation for the organization and abstraction of low-level XAI outputs, it is necessary to elaborate the pattern theory in more detail for defining possible patterns in such a complex type of information as XAI-generated explanations, e.g., having the form of decision rules or trees. In the next section, we describe some preliminary ideas concerning patterns in a set of rules and possibilities for uniting and generalizing related rules. Please note that these ideas and examples refer to the second direction in the work on implementing integrated VA-ML-XAI workflows.

EXAMPLE: GRANULATION OF RULES
Let us consider decision rules with conditions involving numeric attributes (features). Such rules may be components of an original ML model or of a mimic model constructed by some XAI method to explain a black box model. Each condition of a rule refers to one feature and states that the feature value must be lower or higher than a certain constant, or that it must be within a certain interval. A rule usually contains several conditions connected by the logical operator AND. The outcome, or consequent, of a rule is one element from a finite set of possible classes, decisions, or actions.
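The rule structure just described can be represented straightforwardly in code; the sketch below is our own illustration (the feature names and bounds are hypothetical), with open-ended conditions encoded as infinite interval bounds.

```python
# A minimal sketch of a decision rule: a conjunction (AND) of per-feature
# interval conditions plus a class outcome. Feature names are hypothetical.

from dataclasses import dataclass

INF = float("inf")

@dataclass
class Rule:
    conditions: dict   # feature name -> (low, high) interval
    outcome: str       # one element of a finite set of classes

    def matches(self, instance):
        """True if every conditioned feature value lies in its interval."""
        return all(low <= instance[feature] <= high
                   for feature, (low, high) in self.conditions.items())

# "IF petal_length <= 2.5 AND petal_width <= 0.8 THEN class = setosa"
r = Rule({"petal_length": (-INF, 2.5), "petal_width": (-INF, 0.8)}, "setosa")
print(r.matches({"petal_length": 1.4, "petal_width": 0.2}))  # True
print(r.matches({"petal_length": 4.7, "petal_width": 1.4}))  # False
```

Encoding one-sided conditions as half-open intervals lets "lower than", "higher than", and "within" conditions share a single representation, which simplifies the interval unions discussed next.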
for example, the mean of the distances in all individual dimensions. This numeric measure of rule similarity can be used to algorithmically find groups (clusters) of close rules, as well as for the ordering of rules. Thus, adjacent table rows in Fig. 3 correspond to close rules.

Figure 6. A union of a group of close rules.

An important rule pattern is a cluster of close rules having the same outcome. Such a cluster can be abstracted into a multidimensional shape enclosing it. This envelope shape, in turn, corresponds to a rule that is more general than each member rule of the cluster; we shall call it a union rule. We say that the union rule covers each original rule of the cluster that has been abstracted. In terms of rule conditions, it means that each interval of feature values of the union rule covers (i.e., coincides with or includes) the value intervals of the same feature of all original rules. Hence, a union rule can be derived from a group of rules by obtaining the unions of the value intervals of the same features. When a union of two or more intervals equals the full range of the feature values, the condition referring to this feature can be omitted from the union rule. Fig. 6 shows an example of a union rule abstracting a cluster of five close rules.

In terms of granular computing, a union rule is an information granule. Union rules can be derived by iteratively joining pairs of close rules. This creates rule hierarchies involving information granules of different degrees of abstraction.

A union rule covering a cluster of close rules with the same outcome may occasionally also cover some other rules with different outcomes. This is similar to enclosing a cluster of points of the same class on a scatterplot by a bounding box: some points of another class may also fit into the box. Hence, a union rule can be an approximate, rough representation of a cluster of similar rules. We shall use the term rough rule for a rule covering two or more rules with the same outcome as in this rule and at least one rule with a different outcome. The accuracy of a rule can be numerically expressed as the ratio of the number of covered rules with the same outcome as in this rule to the total number of the rules covered by this rule. The accuracy of a rough rule will thus be less than 1.

Obviously, a rough union rule is less suitable for making predictions than the original group of rules that has been abstracted. However, it may be quite well suited for explaining the model logic to a human, since it is normal for human cognition to deal with rough concepts and approximations. A user of an ML model can agree to accept some inaccuracies in exchange for a simpler description of the model logic, and the user can choose the minimal accuracy that is still acceptable. Hence, by finding and abstracting clusters of rules with the same outcomes, we aim to derive a simpler model that is descriptive but not necessarily predictive.

We have conducted multiple experiments on the granulation of different ML models consisting of rules or decision trees (a decision tree can be transformed into a set of rules by representing each path from the root to a leaf by one rule). The models were created based on several benchmark datasets using state-of-the-art ML methods. Our goal was to find out how much a model can be simplified by means of rule granulation. We varied the minimal accuracy threshold from 1 to 0.6. Interestingly, even with the threshold equal to 1, some compression is achieved. For example, a 3-class classification model with 109 rules and 818 conditions in total was reduced to 103 rules with 762 conditions. With the threshold of 0.75 for the same model, we obtained 84 rules with 594 conditions, and the threshold 0.6 gave us 54 rules with 342 conditions. A model with 10 classes containing 202 rules (1739 conditions) was abstracted to 167 rules (1357 conditions) taking the threshold 0.75 and to 139 rules (1062 conditions) taking the threshold 0.6. Similar degrees of compression were achieved in the other experiments.

Based on our experiments, we can conclude that rule granulation is a viable approach to the simplification of rule sets. However, its power is limited: the simplified models still contain too many rules and conditions to be treated as easily comprehensible. The reason for this inadequacy is that the abstracted rules involve the same low-level features taken from the training data as the original rules. A model can be better understood by a domain expert if it refers to higher-level domain-relevant concepts. Such concepts cannot be automatically derived from data but need to
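The union-rule and rough-rule notions above can be sketched in code as follows; the interval-dictionary encoding, feature ranges, and example rules are our own illustrative assumptions, not the exact implementation used in our experiments.

```python
# A minimal sketch of rule granulation: a union rule unites the per-feature
# value intervals of a group of rules; a condition that grows to the full
# feature range is omitted; accuracy is the share of covered rules with the
# same outcome. The features, ranges, and rules below are hypothetical.

def union_rule(rules, feature_ranges):
    """Abstract rules (dicts: feature -> (low, high), plus 'outcome')."""
    merged = {}
    features = {f for rule in rules for f in rule if f != "outcome"}
    for f in features:
        low = min(rule.get(f, feature_ranges[f])[0] for rule in rules)
        high = max(rule.get(f, feature_ranges[f])[1] for rule in rules)
        if (low, high) != feature_ranges[f]:   # omit full-range conditions
            merged[f] = (low, high)
    merged["outcome"] = rules[0]["outcome"]
    return merged

def covers(union, rule, feature_ranges):
    """True if every interval of the union includes the rule's interval."""
    return all(union[f][0] <= rule.get(f, feature_ranges[f])[0] and
               rule.get(f, feature_ranges[f])[1] <= union[f][1]
               for f in union if f != "outcome")

def accuracy(union, all_rules, feature_ranges):
    """Covered rules with the union's outcome / all covered rules."""
    covered = [r for r in all_rules if covers(union, r, feature_ranges)]
    same = [r for r in covered if r["outcome"] == union["outcome"]]
    return len(same) / len(covered)

ranges = {"x": (0.0, 10.0), "y": (0.0, 10.0)}
group = [{"x": (1.0, 3.0), "y": (2.0, 4.0), "outcome": "A"},
         {"x": (2.0, 4.0), "y": (3.0, 5.0), "outcome": "A"}]
u = union_rule(group, ranges)   # x -> (1.0, 4.0), y -> (2.0, 5.0)
# a rule with a different outcome that happens to fall inside the union:
other = {"x": (1.5, 3.5), "y": (2.5, 4.5), "outcome": "B"}
print(round(accuracy(u, group + [other], ranges), 3))  # 0.667: a rough rule
```

Repeatedly joining close pairs of such rules, and rejecting unions whose accuracy falls below a user-chosen threshold, gives the kind of granulation procedure whose compression results are reported above.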
In: Pedrycz W., Chen S.M. (eds.), "Interpretable Artificial Intelligence: A Perspective of Granular Computing", Studies in Computational Intelligence, vol. 937, pp. 217–267. Springer, 2021.
13. Y. Ming, H. Qu, and E. Bertini, "RuleMatrix: Visualizing and Understanding Classifiers with Rules", IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 342–352, Jan. 2019.
14. R. Monarch, "Human-in-the-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI", Manning Publications, 2021.
15. W. Pedrycz and S.-M. Chen (eds.), "Interpretable Artificial Intelligence: A Perspective of Granular Computing", Springer, 2021.
16. L. von Rueden et al., "Informed Machine Learning - A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems", IEEE Transactions on Knowledge and Data Engineering, 2021, doi: 10.1109/TKDE.2021.3079836.
17. L.A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic", Fuzzy Sets and Systems, vol. 90, pp. 111–127, 1997.

Natalia Andrienko is a lead scientist at Fraunhofer Institute for Intelligent Analysis and Information Systems and part-time professor at City University London. Results of her research have been published in two monographs, "Exploratory Analysis of Spatial and Temporal Data: A Systematic Approach" (2006) and "Visual Analytics of Movement" (2013). Natalia Andrienko was an associate editor of IEEE Transactions on Visualization and Computer Graphics (2016–2020) and is now an associate editor of Visual Informatics.

Gennady Andrienko is a lead scientist responsible for visual analytics research at Fraunhofer Institute for Intelligent Analysis and Information Systems and part-time professor at City University London. Gennady Andrienko was a paper chair of the IEEE VAST conference (2015–2016) and an associate editor of IEEE Transactions on Visualization and Computer Graphics (2012–2016), Information Visualization, and International Journal of Cartography.

Linara Adilova is a PhD student at Ruhr University Bochum and a research scientist at Fraunhofer IAIS. She has been working and publishing on multiple research directions, e.g., distributed (federated) learning and autonomous driving. Her main research focus lies in the theory and mathematics of deep learning.

Stefan Wrobel is Professor of Computer Science at the University of Bonn and Director of the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS. His work is focused on questions of the digital revolution, in particular intelligent algorithms and systems for the large-scale analysis of data and the influence of Big Data/Smart Data on the use of information in companies and society. He is the author of a large number of publications on data mining and machine learning, is on the editorial board of several leading academic journals in his field, and is an elected founding member of the "International Machine Learning Society".