Erik Cambria
cambria@[Link]
Lorenzo Malandri
Dept. of Statistics and Quantitative Methods
University of Milano-Bicocca, Milan, Italy
[Link]@[Link]
Fabio Mercorio
Dept. of Statistics and Quantitative Methods
University of Milano-Bicocca, Milan, Italy
[Link]@[Link]
Navid Nobani
Dept. of Statistics and Quantitative Methods
University of Milano-Bicocca, Milan, Italy
[Link]@[Link]
Andrea Seveso
Dept. of Statistics and Quantitative Methods
University of Milano-Bicocca, Milan, Italy
[Link]@[Link]
Abstract—In this survey, we address the key challenges in Large Language Models (LLM) research, focusing on the importance of interpretability. Driven by increasing interest from AI and business sectors, we highlight the need for transparency in LLMs. We examine the dual paths in current LLM research and eXplainable Artificial Intelligence (XAI): enhancing performance through XAI and the emerging focus on model interpretability. Our paper advocates for a balanced approach that values interpretability equally with functional advancements. Recognizing the rapid development in LLM research, our survey includes both peer-reviewed and preprint (arXiv) papers, offering a comprehensive overview of XAI's role in LLM research. We conclude by urging the research community to advance both LLM and XAI fields together.

Index Terms—Explainable Artificial Intelligence, Interpretable Machine Learning, Large Language Models, Natural Language Processing

I. INTRODUCTION

The emergence of LLMs has significantly impacted Artificial Intelligence (AI), given their excellence in several Natural Language Processing (NLP) applications. Their versatility reduces the need for handcrafted features, enabling applications across various domains. Their heightened creativity in content generation and contextual understanding contributes to advancements in creative writing and conversational AI. Additionally, extensive pre-training on large amounts of data enables LLMs to exhibit strong generalisation capacities without further domain-specific data from the user Zhao et al. [2023a], Amin et al. [2023]. For those reasons, LLMs are swiftly becoming mainstream tools, deeply integrated into many industry sectors, such as medicine (see, e.g., Thirunavukarasu et al. [2023]) and finance (see, e.g., Wu et al. [2023a]), to name a few.
However, their emergence also raises ethical concerns, necessitating ongoing efforts to address issues related to bias, misinformation, and responsible AI deployment. LLMs are notoriously complex "black-box" systems. Their inner workings are opaque, and their intricate complexity makes their interpretation challenging Kaadoud et al. [2021], Cambria et al. [2023a]. Such opaqueness can lead to the production of inappropriate content or misleading outputs Weidinger et al. [2021]. Finally, the lack of visibility into their training data can further hinder trust and accountability in critical applications Liu [2023].

In this context, XAI is a crucial bridge between complex LLM-based systems and human understanding of their behaviour. Developing XAI frameworks for LLMs is essential for building user trust, ensuring accountability and fostering a responsible and ethical use of those models.

In this article, we review and categorise current XAI for LLMs in a structured manner. Emphasising the importance of clear and truthful explanations, as suggested by Sevastjanova and El-Assady [2022], this survey aims to guide future research towards enhancing LLMs' explainability and trustworthiness in practical applications.

A. Contribution

The contribution of our work is threefold:

1) We introduce a novel categorisation framework for assessing the body of research concerning the explainability of LLMs. The framework provides a clear and organised overview of the state of the art.
2) We conduct a comprehensive survey of peer-reviewed and preprint papers based on the arXiv and DBLP databases, going beyond the use of common research tools.
3) We critically assess current practices, identifying research gaps and issues and articulating potential future research trajectories.

B. Research questions

In this survey, we explore the coexistence of XAI methods with LLMs and how these two fields are intertwined. Specifically, our investigation revolves around these key questions:

Q1 How are XAI techniques currently being integrated with LLMs?
Q2 What are the emerging trends in converging LLMs with XAI methodologies?
Q3 What are the gaps in the current related literature, and what areas require further research?

II. THE NEED FOR EXPLANATIONS IN LLMS

In the XAI field, the intersection with LLMs presents unique challenges and opportunities. This survey paper aims to dissect these challenges, extending the dialogue beyond the conventional understanding of XAI's objective, which is to illuminate the inner mechanisms of opaque models for various stakeholders while avoiding the introduction of new uncertainties (see, e.g., Cambria et al. [2023b], Burkart and Huber [2021]).

Despite their advancements, LLMs struggle with complexity and opacity, raising design, deployment and interpretation issues. Inspired by Weidinger et al. [2021], this paper categorises LLM challenges into user-visible and invisible ones.

a) Visible User Challenges: Challenges that users can perceive directly, without specialised tools.

b) Trust and Transparency: Trust issues arise in crucial domains, e.g., healthcare Mercorio et al. [2020], Gozzi et al. [2022], Alimonda et al. [2022] or finance Xing et al. [2020], Castelnovo et al. [2023], Yeo et al. [2023], due to the opacity of black-box models, including LLMs. XAI must offer transparent, ethically aligned explanations for wider acceptance, especially under stringent regulations that mandate explainability (e.g., the EU's GDPR Novelli et al. [2024]). This impacts regulatory compliance and public credibility, with examples in European skill intelligence projects requiring XAI for decision explanations Malandri et al. [2022a, 2024, 2022b,c].
c) Misuse and Critical Thinking Impacts: LLMs' versatility risks misuse, such as content creation for harmful purposes and evading moderation Shen et al. [2023]. Over-reliance on LLMs may also erode critical thinking and independent analysis, as seen in educational contexts (see, e.g., Abd-Alrazaq et al. [2023]).

d) Invisible User Challenges: Challenges requiring deeper model understanding.

e) Ethical and Privacy Concerns: Ethical dilemmas from LLM use, such as fairness and hate speech issues, and privacy risks like sensitive data exposure, require proactive measures and ethical guidelines Weidinger et al. [2021], Yan et al. [2023], Salimi and Saheb [2023].

f) Inaccuracies and Hallucinations: LLMs can generate false information, posing risks in various sectors like education, journalism, and healthcare. Addressing these issues involves improving LLM accuracy, educating users, and developing fact-checking systems Rawte et al. [2023], Azaria and Mitchell [2023].
III. METHODOLOGY

Systematic Mapping Studies (SMSs) are comprehensive surveys that categorise and summarise a range of published works in a specific research area, identifying literature gaps, trends, and future research needs. They are especially useful in large or under-explored fields where a detailed Systematic Literature Review (SLR) may not be feasible.

SMSs and SLRs follow a three-phase method (planning, conducting, reporting) but differ in their approach: SMSs address broader questions, cover a wider range of publications with a less detailed review, and aim to provide an overview of the research field. In contrast, SLRs focus on specific questions, thoroughly review fewer publications, and strive for precise, evidence-based outcomes Barn et al. [2017].

Following Martínez-Gárate et al. [2023], we designed our SMS for XAI and LLMs to include both peer-reviewed and preprint papers. The latter choice reflects our view that, in rapidly evolving fields like computer science, preprints offer access to the latest research and are essential for a comprehensive review Oikonomidi et al. [2020].

We followed these steps to structure our SMS: Section I-B proposes and defines the research questions; Section III-A describes how the paper retrieval has been performed; Section III-B describes the paper selection process based on the defined criteria; Section III-C explains how we dealt with false positive results; and finally, in Section IV, we describe the obtained results.

A. Paper retrieval

a) Overview: Instead of utilising common scientific search engines such as Google Scholar, we employed a custom search methodology described in the following parts. By scrutinising the titles and abstracts of the obtained papers, we conducted targeted searches using a predefined set of keywords pertinent to LLMs and XAI. This manual and deliberate search strategy was chosen to minimise the risk of overlooking relevant studies that automated search algorithms might miss and to ensure our SMS dataset's accuracy and relevance. Through this rigorous process, we constructed a well-defined corpus of literature poised for in-depth analysis and review. Figure 1 provides an overview of this process.

b) Peer-reviewed papers: We initiated this step by identifying top-tier Q1 journals within the "Artificial Intelligence" category of 2022 (the last year available at the start of the study), providing us with 58 journals from which to draw relevant publications.

Subsequently, we utilised the XML dump¹ of the dblp computer science bibliography to get the titles of all papers published in the identified Q1 journals, except ten journals not covered by dblp. Once we gathered these paper titles, we proceeded to find their abstracts. To do so, we initially used the last available citation network of AMiner², but given that this dump lacks the majority of 2023 publications, we leveraged the Scopus API, a detailed database of scientific abstracts and citations, to retrieve the missing abstracts corresponding to the amassed titles.

¹ [Link]
² [Link] [Link]
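For illustration, a minimal sketch of this title-harvesting step is shown below. It is not the exact pipeline behind the survey: the journal set is a placeholder subset, matching is done on dblp's abbreviated journal names, and the dump is assumed to have its character entities already resolved (the raw dump relies on dblp.dtd for entity definitions).

```python
import gzip
import xml.etree.ElementTree as ET

# Illustrative subset of the 58 Q1 journal names (dblp-style abbreviations).
Q1_JOURNALS = {"Artif. Intell.", "Inf. Fusion", "Knowl. Based Syst."}

def journal_titles(dump_path):
    """Stream the dblp XML dump and collect titles of articles in the target journals."""
    titles = []
    with gzip.open(dump_path, "rb") as fh:
        # iterparse keeps memory bounded on the multi-gigabyte dump.
        for _, elem in ET.iterparse(fh, events=("end",)):
            if elem.tag == "article":
                if elem.findtext("journal", default="") in Q1_JOURNALS:
                    titles.append(elem.findtext("title", default=""))
                elem.clear()  # discard processed records
    return titles
```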
c) Pre-print papers: We scraped all computer science papers available in the arXiv database from 2010 until October 2023, resulting in 548,711 papers. We then used the arXiv API to retrieve the abstracts of these papers.
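The abstract-collection step can be approximated with the public arXiv API, as in the sketch below; the batch size, pause, and returned fields are illustrative assumptions rather than a description of the actual crawler.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
API = "http://export.arxiv.org/api/query"

def fetch_abstracts(arxiv_ids, batch_size=100, pause=3.0):
    """Fetch titles and abstracts for a list of arXiv identifiers in batches."""
    records = {}
    for start in range(0, len(arxiv_ids), batch_size):
        batch = arxiv_ids[start:start + batch_size]
        query = urllib.parse.urlencode(
            {"id_list": ",".join(batch), "max_results": len(batch)}
        )
        with urllib.request.urlopen(f"{API}?{query}") as resp:
            feed = ET.fromstring(resp.read())
        for entry in feed.findall(f"{ATOM}entry"):
            arxiv_id = entry.findtext(f"{ATOM}id", default="").rsplit("/", 1)[-1]
            records[arxiv_id] = {
                "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
                "abstract": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
            }
        time.sleep(pause)  # stay well within the API's rate limits
    return records
```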
B. Paper selection

We employed a comprehensive set of keywords to filter the collected papers for relevance to LLMs and XAI. The search terms were carefully chosen to encompass the various terminologies and phrases commonly associated with each field.³

³ The keywords for XAI included: ['xai', 'explain', 'explanation', 'interpret', 'black box', 'black-box', 'blackbox', 'transparent model understanding', 'feature importance', 'accountable ai', 'ethical ai', 'trustworthy ai', 'fairness', 'ai justification', 'causal inference', 'ai audit']. For LLMs, the keywords were: ['llm', 'large language model', 'gpt-3', 'gpt-2', 'gpt3', 'gpt2', 'bert', 'language model pre-training', 'fine-tuning language models', 'generative pre-trained transformer', 'llama', 'bard', 'roberta', 't5', 'xlnet', 'megatron', 'electra', 'deberta', 'ernie', 'albert', 'bart', 'blenderbot', 'open pre-trained transformer', 'mt-nlg', 'turing-nlg', 'pegasus', 'gpt-3.5', 'gpt-4', 'gpt3.5', 'gpt4', 'cohere', 'claude', 'jurassic-1', 'openllama', 'falcon', 'dolly', 'mpt', 'guanaco', 'bloom', 'alpaca', 'openchatkit', 'gpt4all', 'flan-t5', 'orca']

In our search, we applied a logical OR operator within the members of each list to capture any of the terms within a single category, and an AND operator between the two lists to ensure that only papers containing terms from both categories were retrieved for our analysis.
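As a concrete illustration of this Boolean filter, the sketch below applies the OR/AND logic to a paper's title and abstract using plain substring matching. The keyword lists are shortened versions of those in footnote 3, and the matching details (lower-casing, substring search) are simplifying assumptions.

```python
# Shortened versions of the two keyword lists given in footnote 3.
XAI_TERMS = ["xai", "explain", "explanation", "interpret", "black box",
             "black-box", "feature importance", "trustworthy ai"]
LLM_TERMS = ["llm", "large language model", "gpt-3", "gpt-4", "bert",
             "llama", "roberta", "t5", "bloom", "alpaca"]

def mentions_any(text: str, terms: list[str]) -> bool:
    """OR within a list: at least one term occurs in the text."""
    text = text.lower()
    return any(term in text for term in terms)

def is_candidate(title: str, abstract: str) -> bool:
    """AND between the lists: the paper must match both the XAI and the LLM terms."""
    text = f"{title} {abstract}"
    return mentions_any(text, XAI_TERMS) and mentions_any(text, LLM_TERMS)

# Example: a paper mentioning both fields passes the filter.
print(is_candidate("Explaining GPT-4 decisions",
                   "We study feature importance for large language models."))  # True
```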
C. Dealing with false positives

Upon completion of the initial retrieval phase, we identified a total of 1,030 manuscripts. Since some research keywords have a broad meaning (for instance, the words 'explain' and 'interpret' can be used in contexts other than XAI), we retrieved a number of false positive papers, i.e., papers not dealing with both XAI and LLMs. We excluded these false positives, that is, publications that address only XAI, only LLMs, or neither of them. To do so, we manually analysed the title and abstract of each paper. This meticulous vetting process resulted in 233 papers relevant to XAI and LLMs. Given that including all these papers in our survey was not feasible, we selected the most relevant ones based on their average number of citations per year. The whole selection process resulted in 35 articles.
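The ranking step can be summarised by the following sketch; the record fields and the 2024 reference year are illustrative assumptions, since the citation source itself is not detailed above.

```python
# Hypothetical record structure: each vetted paper carries its publication
# year and a citation count taken from a citation database.
vetted_papers = [
    {"title": "Paper A", "year": 2021, "citations": 90},
    {"title": "Paper B", "year": 2023, "citations": 40},
]

def citations_per_year(paper, reference_year=2024):
    """Average number of citations per year since publication."""
    age = max(reference_year - paper["year"], 1)
    return paper["citations"] / age

# Rank the vetted papers and keep the most cited ones; the survey retains 35.
ranked = sorted(vetted_papers, key=citations_per_year, reverse=True)
selected = ranked[:35]
```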
IV. RETRIEVAL RESULTS

We divide the papers into two macro-categories: Application papers, i.e., papers that generate explanations in some form, either towards explainability or to use them as a feature for another task, and Discussion papers, i.e., papers that do not engage with explanation generation but address an issue or research gap regarding explainable LLM models.

A. Application Papers

The first macro-category includes papers using LLMs in a methodology, tool, or task. Based on how LLMs are used, we further divide this category into two sub-categories. The first, "To explain", comprises papers which try to explain how LLMs work and provide an insight into the opaque nature of these models. The second, "As feature", comprises papers that use the explanations and features generated by LLMs to improve the results of various tasks. The following parts discuss these sub-categories.

1) To Explain: Most papers, i.e., 17 out of 35, fit into this sub-category, with most addressing the need for more interpretable and transparent LLMs.
Fig. 1: The process used for obtaining the papers related to our keywords: (1) definition of the research questions; (2) paper retrieval (preprint papers from arXiv; peer-reviewed papers via the journal list, dblp, and the DBLP-Citation-network V14); (3) paper selection; (4) elimination of false positives; (5) classification of the papers into the pre-defined categories.

For instance, Vig [2019] introduces a visualisation tool for understanding the attention mechanism in Transformer models like BERT and GPT-2. Their proposed tool provides insights at multiple scales, from individual neurons to whole model layers, helping to detect model bias, locate relevant attention heads, and link neurons to model behaviour.
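Vig's tool is distributed as the BertViz package; a minimal, notebook-oriented usage sketch (with an example sentence chosen here purely for illustration) looks roughly as follows.

```python
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "The cat sat on the mat because it was tired."
inputs = tokenizer.encode(sentence, return_tensors="pt")
attention = model(inputs).attentions  # one attention tensor per layer

# Interactive head-level view; bertviz also offers model_view and neuron_view.
head_view(attention, tokenizer.convert_ids_to_tokens(inputs[0].tolist()))
```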
Swamy et al. [2021] presents a methodology for interpreting the knowledge acquisition and linguistic skills of BERT-based language models by extracting knowledge graphs from these models at different stages of their training. Knowledge graphs are often used for explainable extrapolation reasoning Lin et al. [2023].

Wu et al. [2021] propose Polyjuice, a general-purpose counterfactual generator. This tool generates diverse, realistic counterfactuals by fine-tuning GPT-2 on multiple datasets, allowing for controlled perturbations regarding type and location.

Wang et al. [2022] investigates the mechanistic interpretability of GPT-2 small, particularly its ability to identify indirect objects in sentences. The study involves circuit analysis and reverse engineering of the model's computational graph, identifying specific attention heads and their roles in this task.

Menon and Vondrick [2022] introduce a novel approach for visual classification using descriptions generated by LLMs. This method, which they term "classification by description," involves using LLMs like GPT-3 to generate descriptive features of visual categories. These features are then used to classify images more accurately while providing more transparent results than traditional methods that rely solely on category names.
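In spirit, the method can be sketched as follows. The descriptor lists and the image_text_score function are placeholders: in the original work the descriptors are generated by prompting GPT-3 and the scoring uses a pretrained vision-language model such as CLIP, so this outline is illustrative rather than a faithful implementation.

```python
# Illustrative descriptors; in the original method they are obtained by
# prompting an LLM with the category name.
DESCRIPTORS = {
    "hen": ["a small red comb on the head", "brown or white feathers", "two thin legs"],
    "tiger": ["orange fur with black stripes", "a large feline body", "long whiskers"],
}

def classify_by_description(image, image_text_score):
    """Score each class by the average agreement between the image and its
    descriptors; image_text_score(image, text) stands in for any pretrained
    image-text similarity model."""
    scores = {
        label: sum(image_text_score(image, d) for d in descs) / len(descs)
        for label, descs in DESCRIPTORS.items()
    }
    best = max(scores, key=scores.get)
    # The per-descriptor scores double as a human-readable explanation.
    return best, scores
```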
Gao et al. [2023a] examines ChatGPT's capabilities in causal reasoning using tasks like Event Causality Identification (ECI), Causal Discovery (CD), and Causal Explanation Generation (CEG). The authors claim that while ChatGPT is effective as a causal explainer, it struggles with causal reasoning and often exhibits causal hallucinations. The study also investigates the impact of In-Context Learning (ICL) and Chain-of-Thought (CoT) techniques, concluding that ChatGPT's causal reasoning ability is highly sensitive to the structure and wording of prompts.

Pan et al. [2023] propose a framework that aims to enhance LLMs with explicit, structured knowledge from Knowledge Graphs (KGs), addressing issues like hallucinations and lack of interpretability. The paper outlines three main approaches: KG-enhanced LLMs, LLM-augmented KGs, and synergised LLMs with KGs. This unification improves the performance and explainability of AI systems in various applications.

TABLE I: Synthesis of recent application papers, summarising engagement indicators as of January 2024, update timelines, model specificity, and the overarching aims of each study. To Explain papers are listed in the first section of the table and As Feature works in the second. Stars, forks, and last updates are not reported (-) for papers lacking associated repositories. Target is the specific focus of the study, such as a particular type of language model. Agnostic indicates whether the study is model-agnostic or not. Goal represents the primary objective of each study: comparison of models (C), explanation (E), improvement (IMP), interpretability (INT), and reasoning (R).

Conmy et al. [2023] focuses on automating a part of the mechanistic interpretability workflow in neural networks. Using algorithms like Automatic Circuit Discovery (ACDC), the authors automate the identification of sub-graphs in neural models that correspond to specific behaviours or functionalities.

He et al. [2022] presents a novel post-processing approach for LLMs that leverages external knowledge to enhance the faithfulness of explanations and improve overall performance. This approach, called Rethinking with Retrieval, uses CoT prompting to generate reasoning paths refined with relevant external knowledge. The authors claim that their method significantly improves the performance of LLMs on complex reasoning tasks by producing more accurate and reliable explanations.
Multi-Chain Reasoning (MCR), introduced by Yoran et al. [2023], improves question-answering in LLMs by prompting them to meta-reason over multiple reasoning chains. This approach helps select relevant facts, mix information from different chains, and generate better explanations for the answers. The paper demonstrates MCR's superior performance over previous methods, especially in multi-hop question-answering.

Inseq Sarti et al. [2023] is a Python library that facilitates interpretability analyses of sequence generation models. The toolkit focuses on extracting model internals and feature importance scores, particularly for transformer architectures. It centralises access to various feature attribution methods, intuitively representable with visualisations such as heatmaps Aminimehr et al. [2023], promoting fair and reproducible evaluations of sequence generation models.
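To give a flavour of the library's interface, a minimal attribution run looks roughly like the sketch below; the call pattern follows Inseq's documented API, though exact argument names may vary across versions.

```python
import inseq

# Load a generation model together with a feature attribution method.
model = inseq.load_model("gpt2", "integrated_gradients")

# Attribute the model's own continuation of a prompt and render the
# token-level importance scores as a heatmap-style visualisation.
out = model.attribute("Explainability for large language models is")
out.show()
```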
Boundless Distributed Alignment Search (Boundless DAS), introduced by Wu et al. [2023b], is a method for identifying interpretable causal structures in LLMs. In their paper, the authors demonstrate that the Alpaca model, a 7B-parameter LLM, solves numerical reasoning problems by implementing simple algorithms with interpretable boolean variables.

Li et al. [2023] investigate how various demonstrations influence ICL in LLMs by exploring the impact of contrastive input-label demonstration pairs, including label flipping, input perturbation, and adding complementary explanations. The study employs saliency maps to qualitatively and quantitatively analyse how these demonstrations affect the predictions of LLMs.

LMExplainer Chen et al. [2023] is a method for interpreting the decision-making processes of LMs. This approach combines a knowledge graph and a graph attention neural network to explain the reasoning behind an LM's predictions.

Gao et al. [2023b] propose a novel recommendation system framework, Chat-REC, which integrates LLMs for generating more interactive and explainable recommendations. The system converts user profiles and interaction histories into prompts for LLMs, enhancing the recommendation process with the ICL capabilities of LLMs.

DSR-LM, proposed by Zhang et al. [2022], is a framework combining differentiable symbolic reasoning with pre-trained language models. The authors claim their framework improves logical reasoning in language models through a symbolic module that performs deductive reasoning, enhancing accuracy on deductive reasoning tasks.

2) As Feature: Papers in this sub-category do not directly aim to provide more transparent models or explain LLM-based models. Instead, they use LLMs to generate reasoning and descriptions, which are used as input to a secondary task.

For instance, Li et al. [2022] explore how LLMs' explanations can enhance the reasoning capabilities of smaller language models (SLMs). They introduce a multi-task learning framework where SLMs are trained with explanations from LLMs, leading to improved performance in reasoning tasks.

Ye and Durrett [2022] evaluate the reliability of explanations generated by LLMs in few-shot learning scenarios. The authors claim that LLM explanations often do not significantly improve learning performance and can be factually unreliable, highlighting a potential misalignment between LLM reasoning and factual correctness in their explanations.

Turpin et al. [2023] investigates the reliability of CoT reasoning. The authors claim that while CoT can improve task performance, it can also systematically misrepresent the true reason behind a model's prediction. They demonstrate this through experiments showing how biasing features in model inputs, such as reordering multiple-choice options, can heavily influence CoT explanations without being acknowledged in the explanation itself.

Kang et al. [2023] introduce an approach for automating the debugging process called Automated Scientific Debugging (AutoSD). It leverages LLMs to generate hypotheses about bugs in code and uses debuggers to interact with the buggy code, leading to automated conclusions and patch generation together with clear explanations of the debugging decisions, potentially enabling more efficient and accurate decisions by developers.

Krishna et al. [2023] present a framework [...]
[...] includes around 21k questions with diverse science topics and annotations, featuring lectures and explanations to aid in understanding the reasoning process. The authors demonstrate how language models, particularly LLMs, can be trained to generate these lectures and explanations as part of a CoT process, enhancing their reasoning capabilities. The study shows that CoT improves question-answering performance and provides insights into the potential of LLMs to mimic human-like multi-step reasoning in complex, multimodal domains.

Golovneva et al. [2022] introduce ROSCOE, a set of metrics designed to evaluate the step-by-step reasoning of language models, especially in scenarios without a golden reference. This work includes a taxonomy of reasoning errors and a comprehensive evaluation of ROSCOE against baseline metrics across various reasoning tasks. The authors demonstrate ROSCOE's effectiveness in assessing semantic consistency, logicality, informativeness, fluency, and factuality in model-generated rationales.

Zhao et al. [2023b] presents a comprehensive survey on explainability techniques for LLMs, focusing on Transformer-based models. It categorises these techniques based on traditional fine-tuning and prompting paradigms, detailing methods for generating local and global explanations. The paper addresses the challenges and potential directions for future research in explainability, highlighting LLMs' unique complexities and capabilities compared to conventional deep-learning models. Nevertheless, the survey mainly focuses on XAI in general and has minimal coverage of the relationship between XAI and LLMs.

V. DISCUSSION

Our analysis indicates that a limited number of the reviewed publications directly tackle the challenges highlighted in Section II. For example, the work by Liu et al. [2023] focuses on trust-related concerns in LLMs, whereas Gao et al. [2023a] investigates the issue of misinformation propagation by LLMs. This scant attention to the identified problems suggests an imperative for substantial engagement from the XAI community to confront these issues adequately.

a) Open-Source Engagement: Our survey shows that more studies are moving beyond the traditional approach of merely describing methodologies in text. Instead, they release them as tangible tools or open-source code, frequently hosted on platforms such as GitHub. This evolution is a commendable step toward enhancing transparency and reproducibility in computer science research. The trend suggests a growing inclination among authors to release their code and publicly publish their tools, a notable change from a few years ago. However, we should also mention the inconsistency in the level of community engagement with these repositories. While some repositories attract substantial interest, fostering further development and improvement, others remain underutilised. This disparity in engagement raises important questions about the factors influencing community interaction with these resources.

b) Target: Predominantly, most works have directed their attention towards LLMs in general rather than concentrating on more specialised or narrower subjects within AI-based systems. This broad approach contrasts with the relatively few studies that focus specifically on Transformers or are confined to examining particular categories of systems, such as recommendation systems. This overarching focus on LLMs represents a positive and impactful trend within the AI community. Given the rapid development and increasing prominence of LLM systems in academic and practical applications, this broader focus is timely and crucial for driving our understanding and capabilities in this domain forward. It ensures that research keeps pace with the advancements in the field, fostering a comprehensive and forward-looking approach essential for the continued growth and evolution of AI technologies.
c) Goal: Our analysis, as delineated in Table I, reveals a bifurcation in the objectives of the LLM studies under review. On the one hand, a subset of these works is primarily dedicated to explaining and enhancing the interpretability of these 'black box' models. On the other hand, a larger contingent is more task-oriented, focusing on augmenting specific tasks and models, with interpretability emerging merely as a byproduct. This dichotomy in research focus underscores a pivotal trend: a pressing need to shift more attention towards demystifying the inner workings of LLMs. Rather than solely leveraging these models to boost task performance, researchers should not overlook their inherently opaque nature. The pursuit of performance improvements must be balanced with efforts to unravel and clarify the underlying mechanisms of LLMs. This approach is crucial for fostering a deeper understanding of these complex systems, ensuring their application is both effective and transparent. Such a balanced focus is essential for advancing the field technically and maintaining ethical and accountable AI development.

VI. CONCLUSION

Our SMS reveals that only a handful of works are dedicated to developing explanation methods for LLM-based systems. This finding is particularly salient, considering the rapidly growing prominence of LLMs in various applications. Our study, therefore, serves a dual purpose in this context. Firstly, it acts as a navigational beacon for the XAI community, highlighting the fertile areas where efforts to create interpretable and transparent LLM-based systems can effectively address the challenges the broader AI community faces. Secondly, it is a call to action, urging researchers and practitioners to venture into this relatively underexplored domain. The need for explanation methods in LLM-based systems is not just a technical necessity but also a step towards responsible AI practice. By focusing on this area, the XAI community can contribute significantly to making AI systems more efficient, trustworthy and accountable.

Our call to action is as follows. Firstly, researchers employing LLMs must acknowledge and address the potential long-term challenges posed by the opacity of these systems. The importance of explainability should be elevated from a mere 'nice-to-have' feature to an integral aspect of the development process. This involves a proactive approach to incorporating explainability in the design and implementation phases of LLM-based systems. Such a shift in perspective is essential to ensure that these models are effective, transparent and accountable. Secondly, we urge researchers in the XAI field to broaden their investigative scope. The focus should not only be on devising methodologies capable of handling the complexity of LLM-based systems but also on enhancing the presentation layer of these explanations. Currently, the explanations provided are often too complex for non-technical stakeholders. Therefore, developing approaches that render these explanations more accessible and understandable to a wider audience is imperative. This dual approach will make LLMs more understandable and user-friendly and bridge the gap between technical efficiency and ethical responsibility in AI development.

REFERENCES

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. A survey of large language models. arXiv:2303.18223, 2023a.
Mostafa Amin, Erik Cambria, and Björn Schuller. Can ChatGPT's responses boost traditional natural language processing? IEEE Intelligent Systems, 38(5):5–11, 2023.
Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutierrez, Ting Fang Tan, and Daniel Shu Wei Ting. Large language models in medicine. Nature Medicine, pages 1–11, 2023.
Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, and Gideon Mann. BloombergGPT: A large language model for finance. arXiv:2303.17564, 2023a.
Ikram Chraibi Kaadoud, Lina Fahed, and Philippe Lenca. Explainable AI: a narrative review at the crossroad of knowledge discovery, knowledge representation and representation learning. In MRC, volume 2995, pages 28–40. ceur-ws.org, 2021.
Erik Cambria, Rui Mao, Melvin Chen, Zhaoxia Wang, and Seng-Beng Ho. Seven pillars for the future of artificial intelligence. IEEE Intelligent Systems, 38(6):62–69, 2023a.
Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, et al. Ethical and social risks of harm from language models. arXiv:2112.04359, 2021.
Yang Liu. The importance of human-labeled data in the era of LLMs. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pages 7026–7032, 2023.
Rita Sevastjanova and Mennatallah El-Assady. Beware the rationalization trap! When language model explainability diverges from our mental models of language, 2022.
Erik Cambria, Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, and Navid Nobani. A survey on XAI and natural language explanations. Information Processing & Management, 60(1):103111, 2023b.
Nadia Burkart and Marco F. Huber. A survey on the explainability of supervised machine learning. Journal of Artificial Intelligence Research, 70:245–317, 2021.
Fabio Mercorio, Mario Mezzanzanica, and Andrea Seveso. exDIL: A tool for classifying and explaining hospital discharge letters. In International Cross-Domain Conference for Machine Learning and Knowledge Extraction, pages 159–172. Springer, 2020.
Noemi Gozzi, Lorenzo Malandri, Fabio Mercorio, and Alessandra Pedrocchi. XAI for myo-controlled prosthesis: Explaining EMG data for hand gesture classification. Knowledge-Based Systems, 240:108053, 2022.
Nicola Alimonda, Luca Guidotto, Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, and Giovanni Tosi. A survey on XAI for cyber physical systems in medicine. In 2022 IEEE International Conference on Metrology for Extended Reality, Artificial Intelligence and Neural Engineering (MetroXRAINE), pages 265–270. IEEE, 2022.
Frank Xing, Lorenzo Malandri, Yue Zhang, and Erik Cambria. Financial sentiment analysis: an investigation into common mistakes and silver bullets. In Proceedings of the 28th International Conference on Computational Linguistics, pages 978–987, 2020.
Alessandro Castelnovo, Nicole Inverardi, Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, and Andrea Seveso. Leveraging group contrastive explanations for handling fairness. In World Conference on Explainable Artificial Intelligence, pages 332–345. Springer, 2023.
Wei Jie Yeo, Wihan van der Heever, Rui Mao, Erik Cambria, Ranjan Satapathy, and Gianmarco Mengaldo. A comprehensive review on financial explainable AI. arXiv:2309.11960, 2023.
Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, and Luciano Floridi. Generative AI in EU law: Liability, privacy, intellectual property, and cybersecurity. EU Law: Liability, Privacy, Intellectual Property, and Cybersecurity (January 14, 2024), 2024.
Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, Navid Nobani, and Andrea Seveso. ContrXT: Generating contrastive explanations from any text classifier. Information Fusion, 81:103–115, 2022a. doi: 10.1016/[Link].2021.11.016. URL [Link]
Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, and Andrea Seveso. Model-contrastive explanations through symbolic reasoning. Decision Support Systems, 176:114040, 2024.
Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, Navid Nobani, and Andrea Seveso. Contrastive explanations of text classifiers as a service. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, pages 46–53, 2022b.
Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica, Navid Nobani, Andrea Seveso, et al. The good, the bad, and the explainer: a tool for