The Dawn of the Human-Machine Era: A Forecast Report 2021
This work is licensed under a Creative Commons Attribution 4.0 International Licence
https://creativecommons.org/licenses/by/4.0/
This publication is based upon work from COST Action ‘Language in the Human-Machine Era’, supported by COST
(European Cooperation in Science and Technology).
COST (European Cooperation in Science and Technology) is a funding agency for research and innovation networks.
Our Actions help connect research initiatives across Europe and enable scientists to grow their ideas by sharing them
with their peers. This boosts their research, career and innovation.
www.cost.eu
1 Introduction: speaking through and to technology
“Within the next 10 years, many millions of people will … walk around
wearing relatively unobtrusive AR devices that offer an immersive and high-res-
olution view of a visually augmented world” (Perlin 2016: 85)
The ‘human-machine era’ is coming soon: a time when technology is integrated with our senses, not confined to
mobile devices. What will this mean for language?
Over the centuries there have been very few major and distinctive milestones in how we use language. The inven-
tion(s) of writing allowed our words to outlive the moment of their origin (Socrates was famously suspicious of
writing for this reason). The printing press enabled faithful mass reproduction of the same text. The telegraph and
later the telephone allowed speedy written and then spoken communication worldwide. The internet enabled bil-
lions of us to publish mass messages in a way previously confined to mass media and governments. Smartphones
brought all these prior inventions into the palms of our hands. The next major milestone is coming very soon.
For decades, there has been a growing awareness that technology plays some kind of active role in our communi-
cation. As Marshall McLuhan so powerfully put it, ‘the medium is the message’ (e.g. McLuhan & Fiore 1967; Carr
2020; Cavanaugh et al. 2016). But the coming human-machine era represents something much more fundamental.
Highly advanced audio and visual filters powered by artificial intelligence – evolutionary leaps from the filters we
know today – will overlay and augment the language we hear, see, and feel in the world around us, in real time,
all the time. We will also hold complex conversations with highly intelligent machines that are able to respond in
detail.
In this report we describe and forecast two imminent changes to human communication:
• Speaking through technology. Technology will actively contribute and participate in our commu-
nication – altering the voices we hear and facial movements we see, instantly and imperceptibly trans-
lating between languages, while clarifying and amplifying our own languages. This will not happen
overnight, but it will happen. Technology will weave into the fabric of our language in real time, no
longer as a supplementary resource but as an inextricable part of it.
• Speaking to technology. The current crop of smart assistants, embedded in phones, wearables,
and home listening devices will evolve into highly intelligent and responsive utilities, able to address
complex queries and engage in lengthy detailed conversation. Technology will increasingly under-
stand both the content and the context of natural language, and interact with us in real time. It
will understand and interpret what we say. We will have increasingly substantive and meaningful
conversations with these devices. Combined with enhanced virtual reality featuring lifelike characters,
this will increasingly enable learning and even socialising among a limitless selection of intelligent and
responsive artificial partners.
In this introduction, we further elaborate these two features of the human-machine era, by describing the advance
of key technologies and offering some illustrative scenarios. The rest of our report then goes into further detail
about the current state of relevant technologies, and their likely future trajectories.
and see the same. This is what we mean when we say technology will become an active participant, inextricably
woven into the interaction.
All this might feel like a sci-fi scenario, but it is all based on real technologies currently at prototype stage, under
active development, and the subject of vast (and competing) corporate R&D investment. These devices are com-
ing, and they will transform how we use and think about language.
and innovative, harder to process. Next is modalities. Modalities are the various ways that humans use language
through our senses, including writing, speech, sign, and touch. The more of these a machine uses at once, the more
processing power is needed. The model below sets all these out for comparison.
Figure 1. Levels of difficulty for machines, according to language formality and modalities
There are predictions that over time the distinction between written and spoken language will gradually fade, as
more texts are dictated to (and processed by) speech recognition tools, and texts we read become more speech-like.
Below we discuss types of human language, combining the perspectives of linguists and technologists. As above,
this is relevant to the amount of work a machine must do.
| Modality | Meaning is encoded in... | Sense required | Commonly associated languages | Machine must produce... |
|---|---|---|---|---|
| Written | Graphemes (written characters) | Sight | English, Finnish, Esperanto, Quechua, etc. | Text |
| Spoken | Phonemes (distinctive sounds) | Hearing | English, Finnish, Esperanto, Quechua, etc. | Synthesised voice |
| Haptic | Touch (as in Braille or fingerspelling) | Touch | English, Finnish, Esperanto, Quechua, etc. | Moveable surface |
| Signed | Movements of the hands, arms, head and body; facial expression | Vision | British Sign Language, Finnish Sign Language, International Sign, etc. | Avatar with distinguishable arms, fingers, facial features, mouth detail and posture |
Table 1. Modalities of language and what they require from machines
Put another way, the signed modality is the basic modality for individual sign languages, but some other languages
can also be expressed in the signed modality. It is possible to differentiate further into full sign languages and
signed languages, such as fingerspelling, which are often used in school education for young students (see ISO, in prep.).
A further distinction is needed between visual sign languages and tactile sign languages. For example, unlike visual
sign languages, tactile sign languages do not have clearly defined grammatical forms to mark questions. Additionally,
visual sign languages use a whole range of visible movements beyond just the handshapes hearing people typically
associate with sign. This includes facial expression, head tilt, eyebrow position, and other ways of managing what
in spoken language would be intonation (Willoughby et al. 2018). “Unlike spoken languages, sign languages employ
multiple asynchronous channels to convey information. These channels include both the manual (i.e. upper body
motion, hand shape and trajectory) and non-manual (i.e. facial expressions, mouthings, body posture) features”
(Stoll et al. 2018). It is important to distinguish all these in order to understand different people’s needs and the
different use cases of new and emerging language technologies.
diversity of facial expression, body posture, and social context that add multiple layers of meaning, emphasis and
feeling to sign. Moreover, these technologies help non-signers to understand something from sign but they strip
signers of much intended meaning. The inequality is quite palpable. There are early signs of progress, with small
and gradual steps towards multimodal chatbots which are more able to detect and produce facial movements and
complex gestures. But this is a much more emergent field than verbal translation, so for the foreseeable future, sign
language automation will be distantly inferior.
Another issue is privacy and security. The more we speak through and to a company’s technology, the more data
we provide. AI feeds on data, using it to learn and improve. We already trade privacy for technology. AI, the
Internet of Things, and social robots all offer endless possibilities, but they may conceal boundless risks. Whilst
improving user experiences, reducing health and safety risks, easing communication between languages and other
benefits, technology can also lead to discrimination and exclusion, surveillance, and security risks. This can take
many forms. Some exist already, and may be exacerbated, like the “filter bubbles” (Pariser 2011), “ideological
frames” (Scheufele, 1999; Guenther et al. 2020) or “echo chambers” (Cinelli et al., 2021) of social media, which risk
intellectual isolation and constrained choices (Holone 2016). Meanwhile automated text analysis will increasingly
help to identify criminals based on their writing, for example grooming messages, threatening letters, or a faked
suicide note. Text generation technologies, conversely, can challenge current plagiarism detection methods and
procedures, making it easier to pass off others’ original texts as one’s own. Likewise, the automatic
emulation of someone’s speech can be used to trick speech recognition systems used by banks, thus contributing
to cybercriminal activities. New vectors for deception and fraud will emerge with every new advance.
The limits of technology must be clearly understood by human users. Consider the scenario we outlined earlier, a
virtual world of lifelike characters – endlessly patient interlocutors, teachers, trainers, sports partners, and plenty
else besides. Those characters will never be truly sad or happy for us, or empathise – even if they can emulate these
things. We may be diverted away from communicating and interacting with – imperfect but real – humans.
Last but not least, another challenging setting for technology is its use by minority language communities. From
a machine learning perspective, the shortage of digital infrastructure to support these languages may hamper
development of appropriate technologies. Speakers of less widely-used languages may lag in access to the exciting
resources that are coming. The consequences of this can be far-reaching, well beyond the technological domain:
unavailability of a certain technology may lead speakers of a language to use another one, hastening the disappear-
ance of their language altogether.
LITHME is here to scrutinise these various critical issues, not simply shrug our shoulders as we cheer exciting
shiny new gadgets. A major purpose of this report, and of the LITHME network, is to think through and foresee
future societal risks as technology advances, and amplify these warnings so that technology developers and regu-
lators can act pre-emptively.
2 Behind the scenes: the software powering the human-machine era
Summary and overview
Artificial Intelligence (AI) is a broad term applied to computing approaches that enable ma-
chines to ‘learn’ from data, and generate new outputs that were not explicitly programmed into
them. AI has been trained on a wide range of inputs, including maps, weather data, planetary
movements, and human language. The major overarching goal for language AI is for machines
to both interpret and then produce language with human levels of accuracy, fluency, and speed.
Recent advances in ‘Neural Networks’ and ‘deep learning’ have enabled machines to reach un-
precedented levels of accuracy in interpretation and production. Machines can receive text or
audio inputs and summarise these or translate them into other languages, with reasonable (and
increasing) levels of comprehensibility. They are not yet generally at a human level, and there
is distinct inequality between languages, especially smaller languages with less data to train the
AI, and sign languages – sign is a different ‘modality’ of language in which data collection and
machine training are significantly more difficult.
There are also persistent issues of bias. Machines learn from large bodies of human language
data, which naturally contain all of our biases and prejudices. Work is underway to address this
ongoing challenge and attempt to mitigate those biases.
Machines are being trained to produce human language and communicate with us in in-
creasingly sophisticated ways – enabling us to talk to technology. Currently these chatbots
power many consumer devices including ‘smart assistants’ embedded in mobile phones and
standalone units. Development in this area will soon enable more complex conversations on a
wider range of topics, though again marked by inequality, at least in the early stages, between
languages and modalities.
Automatic recognition of our voices, and then production of synthesised voices, is progressing
rapidly. Currently machines can receive and automatically transcribe many languages, though
only after training on several thousand hours of transcribed audio data. This presents issues
for smaller languages.
Deep learning has also enabled machines to produce highly lifelike synthetic voices. Recently
this has come to include the ability to mimic real people’s voices, based on a similar principle of
churning through long recordings of their voice and learning how individual sounds are pro-
duced and combined. This has remarkable promise, especially when combined with automated
translation, for both dubbing of recorded video and translation of conversation, potentially
enabling us to talk in other languages, in our own voice. There are various new ways of talking
through technology that will appear in the coming years.
Aside from text and voice, attempts are underway to train AI on sign language. Sign is an
entirely different system of language with its own grammar, and uses a mix of modalities to
achieve full meaning: not just shapes made with the hands but also facial expression, gaze, body
posture, and other aspects of social context. Currently AI is only being trained on handshapes;
other modalities are simply beyond current technologies. Progress on handshape detection and
production is focused on speed, accuracy, and making technologies less intrusive – moving
from awkward sensor gloves towards camera-based facilities embedded in phones and web-
cams. Still, progress is notably slower than for the spoken and written modalities.
A further significant challenge for machines will be to understand what lies beyond just words,
all the other things we achieve in conversation: from the use of intonation (questioning, happy,
aggressive, polite, etc.), to the understanding of physical space, implicit references to common
knowledge, and other aspects woven into our conversation which we typically understand
alongside our words, almost without thinking, but which machines currently cannot.
Progress to date in all these areas has been significant, and more has been achieved in recent
years than in the preceding decades. However, significant challenges lie ahead, both in the state
of the art and in the equality of its application across languages and modalities.
This section covers advances in software that will power the human-machine era. We describe
the way machines will be able to understand language. We begin with text, then move on to
speech, before looking at paralinguistic features like emotion, sentiment, and politeness.
Underlying these software advances are some techniques and processes that enable machines to understand human
speech, text, and to a lesser extent facial expression, sign and gesture. ‘Deep learning’ techniques have now been
used extensively to analyse and understand text sequences, to recognise human speech and transcribe it to text, and
to translate between languages. This has typically relied on ‘supervised’ machine learning approaches; that is, large
manually annotated corpora from which the machine can learn. An example would be a large transcribed audio
database, from which the machine could build up an understanding of the likelihood that a certain combination of
sounds corresponds to certain words, or (in a bilingual corpus) that a certain word in one language will correspond
to another word in another language. The machine learns from a huge amount of data, and is then able to make
educated guesses based on probabilities in that data set.
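To make this concrete, here is a toy illustration (our own sketch, not any production system) of learning from annotated data by counting: given a handful of “transcribed” sound sequences, the model estimates which word most probably corresponds to a given combination of sounds.

```python
from collections import Counter, defaultdict

# Toy "transcribed audio" corpus: (sound sequence, word) pairs standing in
# for the thousands of hours of manually annotated recordings used in practice.
corpus = [
    ("k-a-t", "cat"), ("k-a-t", "cat"), ("k-a-p", "cap"),
    ("b-a-t", "bat"), ("k-a-t", "cut"),  # one noisy annotation
]

counts = defaultdict(Counter)
for sounds, word in corpus:
    counts[sounds][word] += 1

def guess(sounds):
    """Most probable word for a sound sequence, with its estimated probability."""
    c = counts[sounds]
    word, n = c.most_common(1)[0]
    return word, n / sum(c.values())

print(guess("k-a-t"))  # ('cat', 0.666...): an educated guess from the data
```

Real systems replace the raw counting with statistical models over millions of examples, but the principle of probabilistic guessing from annotated data is the same.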
The term ‘Neural Networks’ is something of an analogy, based on the idea that these probabilistic models are
working less like a traditional machine – with fixed inputs and outputs – and more like a human brain, able to
arrive at new solutions somewhat more independently, having ‘learned’ from prior data. This is a problematic
and somewhat superficial metaphor; the brain cannot be reduced to the sum of its parts, to its computational
abilities (see e.g. Epstein 2016; Cobb 2020; Marincat 2020). Neural Networks do represent a clear advance from
computers that simply repeat code programmed into them. Still, they continue to require extensive prior data and
programming, and have less flexibility in computing the importance and accuracy of data points. This is significant
in the real world because, for example, the large amounts of data required for deep learning are costly and time
consuming to gather. Investment has therefore followed the line of greatest utility and profit with lowest initial
cost. Low-resource languages lose out from deep learning.
‘Deep Neural Networks’ (DNNs), by contrast, work by building up layers of knowledge about different aspects
of a given type of data, and establishing accuracies more dynamically. DNNs enable much greater flexibility in
determining, layer by layer, whether a sound being made was a ‘k’ or a ‘g’ and so on, and whether a group of sounds
together corresponded to a given word, and words to sentences. DNNs allow adaptive, dynamic, estimated guesses
of linguistic inputs which have much greater speed and accuracy. Consequently, many commercial products inte-
grate speech recognition; and some approach a level comparable with human recognition.
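As a minimal sketch of that layered idea (ours, using random stand-in numbers rather than real acoustic features), a small PyTorch network might stack layers that successively refine a frame of audio features towards a phoneme decision such as ‘k’ versus ‘g’:

```python
import torch
import torch.nn as nn

# A small deep network: each layer builds a more abstract representation
# of the input acoustic features (here 13 MFCC-like values per frame).
phoneme_classifier = nn.Sequential(
    nn.Linear(13, 64), nn.ReLU(),   # low-level feature combinations
    nn.Linear(64, 64), nn.ReLU(),   # more abstract patterns
    nn.Linear(64, 2),               # scores for the classes 'k' vs 'g'
)

frame = torch.randn(1, 13)                     # one (random) acoustic frame
probs = phoneme_classifier(frame).softmax(-1)  # probabilities over {'k', 'g'}
print(probs)
```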
Major recent advances in machine learning have centred around different approaches to Neural Networks. Widely
used technical terms include Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), and Gated
Recurrent Units (GRUs). Each of these three can be used for a technique known as sequence-to-sequence, ‘se-
q2seq’. Introduced by Google in 2014 (https://arxiv.org/pdf/1409.3215.pdf), seq2seq analyses language input
(speech, audio etc.) not as individual words or sounds, but as combined sequences; for example in a translation
task, interpreting a whole sentence in the input (based on prior understanding of grammar) and assembling that
into a likely whole sentence in a target language – all based on probabilities of word combinations in each language.
This marks a major advance from translating word for word, and enables more fluent translations. In particular it
allows input and output sequences of different lengths, for example a different number of words in the source and
translation – useful if source and target languages construct grammar differently (for example presence or absence
of articles, prepositions, etc.) or have words that don’t translate into a single word in another language.
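The following is a bare-bones sketch of the encoder-decoder structure behind seq2seq (our simplification in PyTorch; real systems add attention mechanisms, far larger vocabularies, and trained weights). Note that the source and target sequences can have different lengths, as described above:

```python
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 100, 120, 32, 64  # toy sizes

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, EMB)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)

    def forward(self, src_ids, tgt_ids):
        # Encode the whole source sentence into a single hidden state...
        _, state = self.encoder(self.src_emb(src_ids))
        # ...then unroll the decoder from that state; the target sequence
        # may be a different length from the source.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)  # scores over the target vocabulary

model = Seq2Seq()
src = torch.randint(0, SRC_VOCAB, (1, 7))  # 7 source tokens
tgt = torch.randint(0, TGT_VOCAB, (1, 5))  # 5 target tokens
print(model(src, tgt).shape)               # torch.Size([1, 5, 120])
```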
The above is a highly compressed review of some of the underlying machinery for machine learning of language.
It is also worth noting that many of these same processes are used in areas like automatic captioning of photos (inter-
preting what is in a photo by comparing similar combinations of colours and shapes in billions of other photos),
facial recognition (identifying someone’s unique features by referring to different ‘layers’ of what makes a face look
like a human, like a man, like a 45 year old, and so on), self-driving cars (distinguishing a cyclist from a parking
space), and so on. These algorithms will govern far more than language technology in the human-machine era.
We move on now to discuss how these underlying machine smarts are used to analyse text, speech, paralinguistic
features like sentiment, and then visual elements like gesture and sign.
Attempts at machine translation were soon dropped, but were resumed later on by projects such as Google
Translate, which approached the problem not based on rules but statistics, not on direct dictionary correspondence
but on the likelihood of one word following another, or surrounding others in the semantic space. Statistical
machine translation systems first aligned large volumes of text in a source and target language side by side, and
then arrived at statistical assumptions for which words or word combinations were more likely to produce the
same meanings in another language. Companies like Google were ideally placed for this, as they indexed trillions
of pages written in many languages. The system would soon become a victim of its own success, as companies
and users worldwide started using poor quality translations, including those produced by Google, to produce
websites in many different languages. As a result, poor quality data fed into the same system. Garbage in, garbage
out. Statistical machine translation, too, then fell short of expectations, and Google invited their users to correct
the translations produced by the system.
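The statistical alignment idea can be illustrated with a toy version of the classic IBM Model 1 algorithm (our sketch; production systems of the statistical era were far more elaborate). Starting from uniform guesses, expectation-maximisation over a tiny parallel corpus gradually concentrates probability on the right word correspondences:

```python
from collections import defaultdict

# Tiny aligned corpus of (source, target) sentence pairs.
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

# t[f][e]: estimate of P(target word e | source word f), uniform at first.
t = defaultdict(dict)
for src, tgt in corpus:
    for f in src:
        for e in tgt:
            t[f][e] = 1.0

for _ in range(10):                          # EM iterations
    counts = defaultdict(float)              # expected (f, e) co-occurrences
    totals = defaultdict(float)
    for src, tgt in corpus:
        for e in tgt:                        # E-step: spread each target
            norm = sum(t[f][e] for f in src) # word's "mass" over sources
            for f in src:
                counts[(f, e)] += t[f][e] / norm
                totals[f] += t[f][e] / norm
    for (f, e), c in counts.items():         # M-step: re-normalise
        t[f][e] = c / totals[f]

print(max(t["haus"], key=t["haus"].get))     # -> 'house'
```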
Translation is nowadays perhaps the area where human-machine interaction technologies have advanced the most.
Yet, not all types of translation have evolved at the same pace; translation of written language has progressed more
than spoken and haptic languages.
More recently, research has focused on neural machine translation (NMT). The rationale behind NMT is that
technology is able to simulate human reasoning and hence produce human-like machine translations. Indeed,
the functions of MT are likely to continue to expand. In the area of machine translation there are now various
utilities including Google Translate, Microsoft Translator and DeepL. Open source alternatives include ESPnet,
and FBK-Fairseq-ST.
These are based on deep learning techniques, and can produce convincing results for many language pairs. Deep
learning uses large datasets of previously translated text to build probabilistic models for translating new text.
There are many such sources of data. One example is multilingual subtitles: and within these, a particularly useful
dataset comes from TED talks – these are routinely translated by volunteers into many languages with adminis-
tratively managed quality checks; they cover a variety of topics and knowledge domains, and they are open access
(Cettolo et al. 2012). There are limitations, for example translations are mainly from English to other languages;
and since many talks are pre-scripted, they may not represent typical conversational register (Dupont & Zufferey
2017; Lefer & Grabar 2015). TED talks are nevertheless valuable for parallel data. They are employed as a data set
for statistical machine translation systems and are one of the most popular data resources for multilingual neural
machine translation (Aharoni et al. 2019; Chu et al. 2017; Hoang et al. 2018; Khayrallah et al. 2018; Zhang et al.
2019).
The accuracy of machine translation is lower in highly inflected languages (as in the Slavic family), and aggluti-
native languages (like Hungarian, Turkish, Korean, and Swahili). In many cases, this can be remedied with more
data, since the basis of deep learning is precisely to churn through huge data sets to infer patterns. This, however,
presents problems for languages spoken by relatively small populations – often minority languages. Hence, prog-
ress is running at different paces, with potential for inequalities.
Even though deep learning techniques can provide good results, there are still rule-based machine translation
systems on the market, such as that of SYSTRAN (systransoft.com), the oldest machine translation company. There are also
open source systems like Apertium (apertium.org). Alongside these, several open-source toolkits allow users to train neural
machine translation (NMT) systems with parallel corpora, word embeddings (for source and target languages), and dictionaries. The
different toolkits offer different (maybe overlapping) model implementations and architectures. Nematus (https://
github.com/EdinburghNLP/nematus) implements an attention-based encoder-decoder model for NMT built in
Tensorflow. OpenNMT (opennmt.net, https://www.aclweb.org/anthology/P17-4012) and MarianNMT (https://
marian-nmt.github.io/) are two other open source translation systems. One of the most prolific open source
machine translation systems is the Moses phrase-based system (https://www.statmt.org/moses), used by Amazon
and Facebook, among other corporations. Moses was also successfully used for translation of MOOCs across four
translation directions – from English into German, Greek, Portuguese, and Russian (Castilho et al. 2017).
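To give a flavour of how accessible such systems have become, pretrained Marian models published by the Helsinki-NLP group can be run in a few lines via the transformers library (a sketch on our part; the model name is one example among many publicly available pairs):

```python
from transformers import MarianMTModel, MarianTokenizer

# One of many publicly available OPUS-MT language pairs (English -> German).
model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["Machine translation is improving rapidly."],
                  return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```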
Another research trend is AI-powered Quality Estimation (QE) of machine translation. This provides a quality
indication for machine translation output without human intervention. Much work is being undertaken on QE,
and some systems such as those of Memsource (https://www.memsource.com/features/translation-quality-es-
timation/) are available; but so far none seems to have reached sufficient robustness for large-scale adoption.
According to Sun et al. (2020), it is likely that QE models trained on publicly available datasets are simply guessing
translation quality rather than estimating it. Although QE models might capture fluency of translated sentences
and complexity of source sentences, they cannot model adequacy of translations effectively. There could be vari-
ous reasons for this, but this ineffectiveness has been attributed to potential inherent flaws in current QE datasets,
which cause the resulting models to ignore semantic relationships between translated segments and the originals,
resulting in incorrect judgments of adequacy.
Systran (SYStem TRANslation) has contributed significantly to machine translation; its long-running work for the
European institutions even reached the Court of Justice of the EU (CJEU) (https://curia.europa.eu/jcms/upload/docs/application/pdf/2013-04/cp130048en.pdf). Another example is the European
Union’s eTranslation online machine translation service, which is provided by the European Commission (EC) for
European official administration, small and medium sized enterprises (SMEs), and higher education institutions
(https://ec.europa.eu/info/resources-partners/machine-translation-public-administrations-etranslation_en).
Bergamot (browser.mt/) is a further interesting project whose aim is to add and improve client-side machine trans-
lation in a web browser. The project will release an open-source software package to run inside Mozilla Firefox.
It aims to enable bottom-up adoption by non-experts, resulting in cost savings for private and public sector users.
Lastly, ParaCrawl (paracrawl.eu/) is a European project which applies state-of-the-art neural methods to the
detection of parallel sentences, and the processing of the extracted corpora.
As mentioned above, translation systems tend to focus on languages spoken by large populations. However, there
are systems focusing on low-resource languages. For instance, the GoURMET project (https://gourmet-project.
eu/) aims to use and improve neural machine translation for low-resource language pairs and domains. The WALS
database (https://wals.info/) (Dryer & Haspelmath 2013) is used to improve systems (language transfer), especial-
ly for less-resourced languages (Naseem et al. 2012; Ahmad et al. 2019).
Machine translation has been particularly successful when applied to specialized domains, such as education, health,
and science. Activities focused on specific domains abound: for example, the Workshop on Machine Translation
(WMT) has offered a track on biomedical machine translation which has led to the development of domain-specific
resources (http://www.statmt.org/wmt20/biomedical-translation-task.html). Parallel corpora are limited in
specialized domains, while monolingual data is far more plentiful (e.g. for the biomedical domain: https://www.aclweb.
org/anthology/L18-1043.pdf). Back-translation has therefore been studied as a way to integrate monolingual corpora into
NMT training for domain-adapted machine translation (https://www.aclweb.org/anthology/P17-2061.pdf).
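Back-translation itself is simple in outline, as in this sketch (ours; the model name and example sentence are illustrative): monolingual in-domain sentences in the target language are machine-translated into the source language, and the resulting synthetic pairs are added to the training data.

```python
from transformers import MarianMTModel, MarianTokenizer

# A reverse-direction model (German -> English) used only to synthesise data.
name = "Helsinki-NLP/opus-mt-de-en"
tok = MarianTokenizer.from_pretrained(name)
rev = MarianMTModel.from_pretrained(name)

# Monolingual in-domain (e.g. biomedical) target-side sentences.
mono_de = ["Die Dosis wurde nach zwei Wochen angepasst."]

batch = tok(mono_de, return_tensors="pt", padding=True)
synthetic_en = tok.batch_decode(rev.generate(**batch), skip_special_tokens=True)

# Each (synthetic source, authentic target) pair joins the EN->DE training set.
training_pairs = list(zip(synthetic_en, mono_de))
print(training_pairs)
```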
European Language Resource Coordination (ELRC) — http://lr-coordination.eu/node/2 — is gathering data
(corpora) specialised on Digital Service Infrastructures. The EU’s Connecting Europe Facility (CEF) in Telecom
enables cross-border interaction between organisations (public and private). Projects financed by CEF Telecom
usually deliver domain-specific corpora (especially for less resourced languages) for training and tuning of the
e-Translation system. Examples include MARCELL (marcell-project.eu) and CURLICAT (curlicat.eu).
Currently, the main obstacle is the need for huge amounts of data. As noted above, this creates inequalities for
smaller languages. Current technology based on neural systems conceals a hidden threat: neural systems require
much more data for training than rule-based or traditional statistical machine-learning systems. Hence, technologi-
cal language inclusion depends to a significant extent on how much data is available, which furthers the technolog-
ical gap between ‘resourced’ and ‘under-resourced’ languages. Inclusion of additional, under-resourced languages
is desirable, but this becomes harder as the resources to build on are scarce. Consequently, these languages will be
excluded from the use of current technologies for a long time to come and this might pose serious threats to the
vitality and future active use of such languages. A useful analytical tool to assess the resources of such languages
is the ‘Digital Language Vitality Scale’ (Soria 2017).
Advances in ‘transfer learning’ may help here (Nguyen & Chiang 2017; Aji et al. 2020), as well as less super-
vised MT (Artetxe et al. 2018). Relevant examples include HuggingFace (https://huggingface.co/Helsinki-NLP/
opus-mt-mt-en) and OPUS (opus.nlpl.eu). There is also a need to consider the economic impact for translation
companies. For example in Wales the Cymen translation company has developed and trained its own NMT within
its workflow, as part of the public-private SMART partnership (https://businesswales.gov.wales/expertisewales/
support-and-funding-businesses/smart-partnerships). Other companies (e.g. rws.com) have adopted similar ap-
proaches. The benefits of such technology are evident, although their use raises issues related to ownership of
data, similarly to older ethical questions of who owns translation memories.
Human translators have not yet been entirely surpassed, but machines are catching up. A 2017 university study of
Korean-English translation, pitting various machine translators against a human rival, came out decisively in favour
of the human; but still the machines averaged around one-third accuracy (Andrew 2018). Another controlled test,
comparing the accuracy of automated translation tools, concludes that “new technologies of neural and adaptive
translation are not just hype, but provide substantial improvements in machine translation quality” (Lilt Labs 2017).
More recently, Popel et al. (2020) demonstrated a deep learning system for machine translation of news media,
which human judges assessed as more accurate than humans, though not yet as fluent. This was limited to news
media, which is a specific linguistic register that follows fairly predictable conventions compared to conversation,
personal correspondence, etc. (see Biber & Conrad, 2009); but this still shows progress.
Natural Language Understanding (NLU) includes a range of technologies such as pattern-based NLU; these are
powerful and successful due to a huge number of stored patterns. For instance, AIML (Artificial Intelligence
Mark-up Language) forms the brain of Kuki (formerly Mitsuku), the Loebner Prize-winning chatbot.
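AIML itself is an XML format, but the underlying principle is plain pattern matching, roughly as in this minimal Python sketch of ours: stored patterns with wildcards are matched against the user’s utterance and mapped to response templates.

```python
import re

# AIML-style (pattern, response template) pairs; a captured wildcard can be
# reused in the response, much like AIML's <star/>.
patterns = [
    (r"HELLO.*",           "Hi there! How can I help?"),
    (r"MY NAME IS (.+)",   r"Nice to meet you, \1."),
    (r"WHAT IS YOUR NAME", "I'm a very small chatbot."),
]

def respond(utterance):
    text = utterance.upper().strip("?!. ")
    for pattern, template in patterns:
        m = re.fullmatch(pattern, text)
        if m:
            return m.expand(template)
    return "I don't have a pattern for that yet."

print(respond("My name is Ada"))  # -> "Nice to meet you, ADA."
```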
Hidden Markov Models (HMMs) also made a big impact, allowing researchers to move beyond conventional
recognition methods to statistical approaches. Accuracy, accordingly, increased. By the 1990s, products began to
appear on the market. Perhaps the most well-known is Dragon Dictate (released 1990) – which, though cutting
edge for its time, actually required consumers to “train” the algorithm themselves, and to speak very slowly.
Progress from this point was relatively slow until the 2010s, when Deep Neural Networks (DNNs, discussed
earlier) were introduced in speech engineering.
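For illustration, here is a toy Viterbi decoder of the kind of computation an HMM recogniser performs (our sketch, with invented probabilities): it finds the most likely sequence of hidden phoneme states behind a series of observed acoustic symbols.

```python
# Toy HMM: hidden states are phonemes, observations are coarse acoustic cues.
states = ("k", "g")
start = {"k": 0.5, "g": 0.5}
trans = {"k": {"k": 0.7, "g": 0.3}, "g": {"k": 0.4, "g": 0.6}}
emit = {"k": {"burst": 0.8, "voiced": 0.2}, "g": {"burst": 0.3, "voiced": 0.7}}

def viterbi(observations):
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start[s] * emit[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that maximises the path probability.
            p, path = max(
                (prev_p * trans[prev][s] * emit[s][obs], prev_path + [s])
                for prev, (prev_p, prev_path) in best.items()
            )
            new_best[s] = (p, path)
        best = new_best
    return max(best.values())

print(viterbi(["burst", "voiced", "voiced"]))  # -> (~0.035, ['k', 'g', 'g'])
```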
Commercial speech recognition facilities include Microsoft Windows’ inbuilt dictation facility (https://support.
microsoft.com/en-us/fec94565-c4bd-329d-e59a-af033fa5689f), IBM Watson (ibm.com/cloud/watson-speech-
to-text), Amazon Transcribe (aws.amazon.com/transcribe), and Google Speech-to-Text (cloud.google.com/
speech-to-text).
Open source alternatives include Mozilla Deep Speech (github.com/mozilla/DeepSpeech), NVIDIA Jasper
(https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/jasper), Kaldi ASR (https://github.com/kaldi-
asr/kaldi), wav2letter (https://github.com/flashlight/wav2letter), VOSK (alphacephei.com/vosk), and Fairseq-
S2T (Wang et al. 2020). Notably some of these open source facilities are developed by private companies (e.g.
Facebook, NVIDIA) with their own incentives to contribute to other products in their portfolio.
A recently founded EU-funded project, MateSub (matesub.com), is leveraging these kinds of capabilities specifically
for adding subtitles following speech recognition. Machine translation of subtitles was earlier the topic of the
SUMAT project (http://www.fp7-sumat-project.eu/).
A huge step towards unsupervised speech recognition was made when Facebook released wav2vec (Schneider et
al. 2019) and its successor wav2vec 2.0 (Baevski et al. 2020), which is able to achieve a 5.2% word error rate using
only 10 minutes of transcribed speech. It learns speech representations directly from raw speech signals without
any annotations, requiring no domain knowledge, while the model is fine-tuned using only a minimal amount of
transcribed speech. This holds great promise, though success will depend on factors including accessibility, privacy
concerns, and end user cost.
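For a sense of how such models are used, a fine-tuned wav2vec 2.0 checkpoint can transcribe a recording in a few lines (a sketch on our part; the file name and model choice are illustrative):

```python
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Load a recording and resample to the 16 kHz the model expects.
waveform, rate = torchaudio.load("recording.wav")
waveform = torchaudio.functional.resample(waveform, rate, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000,
                   return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
print(processor.batch_decode(ids))  # transcription hypothesis
```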
Mozilla Deep Speech, together with its annotated-data initiative Common Voice (see next section), aims to employ
transfer learning. That is, models previously trained on existing large annotated corpora, such as for American
English, are adapted and re-trained with smaller annotated corpora for a new domain and/or language. Such an
approach has proven viable for bootstrapping speech recognition in a voice assistant for a low-resourced
language. Companies like Google, as well as many university AI research groups, are busily attempting to apply
self-supervised learning techniques to the automatic discovery and learning of representations in speech. With
little or no need for an annotated corpus, self-supervised learning has the potential to provide speech technology
to a very wide diversity of languages and varieties: see for example https://icml-sas.gitlab.io/.
Further challenges ahead for automated subtitling include improved quality, and less reliance on human post-edit-
ing (Matusov et al. 2019).
• Currently: deep neural networks for acoustic modelling and waveform generation (replacing decision
tree HMM-based models and vocoders)
• Advanced techniques for acoustic modelling using sequence-to-sequence (‘seq2seq’) (Hewitt & Kriz
2018) models for end-to-end (‘e2e’) speech synthesis (Taylor & Richmond 2020)
Although all progress is welcome, the technological advances in the field of sign language have been slow, when
compared to the pace at which written and spoken language technologies have evolved.
As Jantunen et al. (2021) predict, machine translation of sign languages will face at least three important challenges
for some time to come (that is, long after similar obstacles are overcome for spoken and written modalities): (a)
multimodality, wherein meaning is made not only with hand gestures but also with a rich mix of gesture, facial
expression, body posture, and other physical cues, yet even the most advanced detection systems in development
are focused only on hand movement; (b) there are hundreds to thousands of sign languages, but research so far
has focused on ‘major’ sign languages, so the complexity of sign language communication and translation is higher
than gesture-recognition systems currently take into account; (c) meaning often also depends on socio-cultural
context and signers’ knowledge of each other’s lives, which machines cannot know (and training them to find out
provokes major privacy concerns).
In section 1.3.3 we discussed sign language. In section 2.3.1 we expand on sign recognition and synthesis – including
its many persistent limitations.
that do not neatly fit in other linguistic levels (syntax, morphology, etc.). To elaborate a little further, the myriad
phenomena at play include at least:
• how we encode physical distance (“You’re too far away”, “Come closer”)
• how we select the right terms to address each other (“Madam”, “buddy”, “Mx”, etc.)
• tacit cultural knowledge, or prior understanding that directly affects word choice, like who I am and
who you are, where we are, what will happen if you drop a ball vs. a glass, etc.
• how we convince people to do things by appearing authoritative, weak, apologetic, etc.
• discourse markers and fillers (“hmm”, “uhuh”, “right”)
And for every single one of these, there is bountiful and wonderful but baffling and unknowable diversity and
change – across languages and cultures, within smaller groups, between individuals, and in the same individual
when talking to different people at different times. All this adds up to quite a challenge for machines to understand
pragmatic meaning.
The second limitation noted above is the need to further develop other fields of Computational Linguistics.
Recognising and classifying pragmatic phenomena first relies on recognising and classifying other linguistic phe-
nomena (e.g. phonological, prosodic, morphological, syntactic or semantic). If someone tells a friend they are short
of money, it could imply a request, a dispensation, a simple plea for pity, or something else; a machine cannot know
which without first knowing about different ways of describing money, poverty, and so on; as well as the subtle
but vital combination of gestures, facial expressions and other movements that could contribute to any of these
intended meanings. All this might entail the incorporation into pragmatics-related corpora of sound at a larger
scale and its processing and annotation with adequate schemes and formats.
Archer et al. (2008) mention two definitions of computational pragmatics: “the computational study of the relation
between utterances and action” (Jurafsky 2004: 578); and “getting natural language processing systems to reason in
a way that allows machines to interpret utterances in context” (McEnery 1995: 12). As far as pragmatic annotation
is concerned, it is noted that “the majority of the better-known (corpus-based) pragmatic annotation schemes are
devoted to one aspect of inference: the identification of speech/dialogue acts” (Archer et al. 2008: 620).
Some projects developed subsequently – such as the Penn Discourse Treebank (https://www.cis.upenn.edu/~p-
dtb/) – have also worked extensively in the annotation of discourse connectives, discourse relations and discourse
structure in many languages (see e.g. Ramesh et al. 2012; Lee et al. 2016; Webber et al. 2012).
Progress in the last two decades includes attempts to standardise subareas of Pragmatics, such as discourse struc-
ture (ISO/TS 24617-5:2014), discourse relations (ISO 24617-8:2016), speech act annotation (ISO 24617-2:2020),
dialogue acts (ISO 24617-2:2020), and semantic relations in discourse (ISO 24617-8:2016); and even to structure
the whole field of Computational Pragmatics and pragmatic annotation (Pareja-Lora & Aguado de Cea 2010;
Pareja-Lora 2014) and integrate it with other levels of Computational Linguistics and linguistic annotation (Pareja-
Lora 2012). Further current research concerns, for instance, the polarity of speech acts, that is, their classification
as neutral, face-saving or face-threatening acts (Naderi & Hirst 2018).
However, as Archer, Culpeper & Davies (2008) indicate, “[u]nlike the computational studies concerning speech act
interpretation, [...] corpus-based schemes are, in the main, applied manually, and schemes that are semi-automatic
tend to be limited to specific domains” (e.g. “task-oriented telephone dialogues”). This is only one of the manifold
limitations of research in this area.
All this could be solved, to some extent, by suitable annotated gold standards to help train machine learning tools
for the (semi-)automatic annotation and/or recognition of pragmatic phenomena. These gold standards would need
to include and integrate annotations pertaining to all linguistic levels – as discussed above, a machine may struggle
to identify pragmatic values if it cannot first identify other linguistic features (for instance politeness encoded
in honorifics, pronouns, and verb forms). These annotated gold standards would be quite useful also for the
evaluation of any other kinds of systems classifying and/or predicting some particular pragmatic phenomenon.
Another big limitation in this field, as discussed above, is about the journey our words take out there in the real
world. Much of our discussion so far in this report is all about the basic message we transmit and receive: words,
signs, and so on. But human language is much more complex. The basic message – the combination of sounds,
the array of signs and gestures, that make up our ‘utterances’ – are absolutely not the end of the story for language.
When we put together the words ‘Let me go’, those words have a linguistic meaning; they also carry an intention;
and then subsequently (we hope) they have an actual effect on our lives. These are different aspects of our lan-
guage, all essential for any kind of full understanding. Neurotypical adults understand all these intuitively but they
must be learned; and so a machine must be trained accordingly. There have been some advances but constraints
remain (mentioned below and also pervasively in this document), for example:
1. Higher-order logical representation of language, discourse and statement meaning are still partial,
incomplete and/or under development.
2. Perhaps also as a consequence, computational inference over higher-order logic(s) for language (e.g.
to deal with presuppositions or inference) require further research to overcome their own current
limitations and problems. “Indeed, inference is said to pose “four core inferential problems” for the
computational community: abduction [...], reference resolution [...], the interpretation and generation
of speech acts [...], and the interpretation and generation of discourse structure and coherence rela-
tions [...]” (Archer, Culpeper & Davies 2008). The first of these, abduction, means roughly inference
towards the best possible explanation, and has proved the most challenging for machines to learn;
no great progress is expected here in the next decade. But progress on speech acts and discourse
structure (and/or relations) has been robust for some widely-spoken languages; and some resources
and efforts are being devoted to the reference resolution problem (that is, reference, inference, and
ways of referring to physical space), in the fields of (i) named entity recognition and annotation and
(ii) anaphora (and co-reference) resolution.
The final big limitation of this field is the strong dependency of pragmatic features on culture and cultural differ-
ences. Indeed, once identified, the values of these pragmatic features must be interpreted (or generated) according
to their particular cultural and societal (not only linguistic) context. That pragmatic disambiguation is often a
challenge for humans, let alone machines. Take ‘face-saving’ or ‘face-threatening’ acts: for example we attempt
face-saving for a friend when we say something has gone missing, not that our friend lost it; while face-threatening
acts, by contrast, are less forgiving.
Interpretation or formulation of face-saving and face-threatening acts are highly culture-dependent. This also
affects, for example, the interpretation and production of distance-related features (any kind of distance: spatial,
social, temporal, etc.). Earlier we mentioned some levels of pragmatic meaning – our basic literal message, our
intention, its possible effects in the world. Understanding face-saving is a key part of managing those things, and
they all differ according to who we are talking to. It is almost impossible to understand pragmatic meaning without
understanding a huge amount of overlapping social information. Nothing is understood before everything is
understood.
In the end, all these aspects entail the codification and management of lots of common (or world) knowledge,
information, features and values. Machines might be more able to process all these items now than in the past by
means of big data processes and techniques (such as supercomputation or cloud computing). However, all these
items still need to be identified and encoded in a suitable computer-readable format. The community of linked
open data is working hard on this aspect, and its advances might help solve this issue in due course (Pareja-Lora
et al. 2020).
2.5 Politeness
Linguistic politeness concerns the way we negotiate relationships with language. This is in some senses a sub-dis-
ciplinary area of pragmatics. We design our speech in order to further our intentions. For that, we pay attention
to ‘face’. In common parlance, to ‘save face’ is to say something in a way that minimises the imposition or embar-
rassment it might cause. It is simple everyday conversational diplomacy. Politeness theory builds on this simple
insight to interrogate the various ways we attend to other people’s ‘face needs’, their self-esteem and their sense of
worth. Neurotypical adults have an intuitive sense of interlocutors’ ‘face needs’. That sense enables a choice about
whether we want to either uphold or undermine those needs. Do we say ‘I’m sorry I wasn’t clear’ or ‘You idiot, you
completely misunderstood me!’, or something in between these extremes? They ‘mean’ the same thing, but they
attend to face needs very differently.
How we attend to face needs will depend on the nature of the relationship, and what we want to achieve in
interaction. There is the basic classical distinction between ‘positive face’ (the desire to be liked) and ‘negative
face’ (the desire not to impose on people). Brown & Levinson (1987) is a commonly used foundational text in the
discipline, though obviously somewhat dated now. Since then, the study of politeness has proceeded to explore
various aspects of extra-linguistic behaviour and meaning. This has resulted in increased insight, but has somewhat
fragmented the field away from the kinds of unifying, standardised principles represented by Brown & Levinson
(1987). But developers of machine learning systems need stability; and so, current studies in machine learning of
politeness are in the slightly curious position of continuing to apply categories from Brown & Levinson (1987) to
machine learning – for example Li et al. (2020) on social media posts in the US and China; Naderi & Hirst (2018)
on a corpus from the official Canadian parliamentary proceedings; and Lee et al. (2021) on interactions between
robots and children. All these studies use categories from Brown & Levinson (1987) as a basis for machine learn-
ing. The research teams themselves are not failing per se in keeping up with more recent developments in the field;
rather, this reliance on older work simply reflects the relatively early stage of the technology, and the need for tight
structures and categories from which machines can learn. This in turn indicates the distance left to go in designing
pragmatically aware machines.
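In practice, such studies often reduce to supervised text classification over Brown & Levinson-style labels, along these lines (a deliberately simplified sketch with invented examples, not any cited team’s actual pipeline):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set labelled with Brown & Levinson-style strategies.
texts = [
    "Would you possibly be able to help me?",     # negative politeness
    "Could I perhaps trouble you for a moment?",  # negative politeness
    "Great job, mate, you nailed it!",            # positive politeness
    "We did brilliantly together, friend!",       # positive politeness
]
labels = ["negative", "negative", "positive", "positive"]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["Might I possibly ask a small favour?"]))  # likely ['negative']
```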
The difficulty (for machines and indeed often for humans) lies in the heavily contextual nature of linguistic polite-
ness within one language and across languages, and within individuals and different social groups. Culturally aware
robots will need to understand that some sentiments will be expressed in some communities, while in others it is
not acceptable to express them. Some verbal reactions may be acceptable in some cultures, in others less so or not
at all. This will be important for social inclusion and justice in multicultural societies.
The above reports are also based on text alone; or if they cover gesture, they focus on production of politeness
gestures by machines, not interpretation of human gestures by machines. As discussed above in relation to sign
language, this is simply beyond the purview of the current state of the art. Bots can themselves approximate polite
behaviour, but cannot distinguish it in users.
3 Gadgets & gizmos:
human-integrated devices
Summary and overview
Two major technological fields will influence language in the human-machine era: Augmented
Reality (AR), and Virtual Reality (VR). Both of these are currently somewhat niche areas of
consumer technology, but both are subject to enormous levels of corporate investment explic-
itly targeting wide adoption in the coming years. By 2025, the global market for AR is forecast
to be around $200 billion, and for VR around $140 billion.
AR is currently widely available: embedded into mobile phones in apps like Google Translate
which can translate language in the phone's view; and in dedicated headsets and glasses that
overlay the user's visual field with richer and more detailed information. Progress in AR is
rapid, though to date the headsets and glasses have been mostly used in industrial settings due
to their somewhat awkward and unsightly dimensions. In the next year or two, AR tech will
shrink down into normal sized glasses, at which point the above-noted market expansion will
begin in earnest.
AR eyepieces will combine with intelligent earpieces for more immersive augmentation. These
devices will enable other AI technologies to combine and transform language use in specific
ways. Two in particular are augmentation of voice, and augmentation of facial/mouth move-
ments. As covered in the previous section, advances in speech recognition and voice synthesis
are enabling automatic transcription of human language, rapid translation, and synthetic speech
closely mimicking human voices. These will be straightforward to include in AR eyepieces and
earpieces. That in turn will enable normal conversation to be subtitled for clarity or recording,
amplified, filtered in noisy environments, and translated in real time. Meanwhile advances in
facial recognition and augmentation are beginning to enable real-time alterations to people's
facial movements. At present this is targeted at automatically lip-syncing translations of video,
making the mouth look as though it is producing the target language as well as speaking it.
Embedded into AR eyepieces, this will also allow us to talk to someone who speaks another
language, hear them speaking our language (in their own voice) while their mouth appears to
make the translated words. This is what we mean by speaking through technology. The devices
and algorithms will be an active contributor to our language, tightly woven into the act of
speaking and listening.
VR is similarly the site of rapid technological progress, and vast corporate investment. Devices
currently on the market enable highly immersive gaming and interaction between humans.
Future advances in VR point to much more transformative experiences with regard to lan-
guage. Mostly this relates to the combination of VR with the same technologies mentioned
earlier, plus some extra ones. Augmentation of voice and face, noted above, will enable much
more immersive first-person and multi-player games and other interaction scenarios, including
translation and revoicing. The main distinguishing feature of VR will be in the addition of
future improved chatbot technology.
Currently chatbots are somewhat limited in their breadth of topics and complexity of re-
sponses. This is rapidly changing. In the near future chatbots will be able to hold much more
complex and diverse conversations, switching between topics, registers, and languages at pace.
Embedded into virtual worlds, this will enable us to interact with highly lifelike virtual char-
acters. These could become effective teachers, debaters, counsellors, friends, and language
learning partners, among others. The potential for language learning is perhaps one of the
most transformative aspects. VR is already used for language learning, mostly as a venue, with
some limited chatbot-based interaction. This will change as VR chatbots in virtual characters
evolve. The possibility of learning from endlessly patient conversation partners, who would
never tire of you repeating the same word, phrase or sentence, who would happily repeat an
interaction again and again for your confidence, could truly upend much about pedagogical
practice and learner motivation and engagement.
AR too will enable language learning in different ways. As we noted above, it will soon be
possible to translate in-person conversations in AR earpieces. AR eyepieces could also show
you the same overlain information in whatever language it supports. There are tools already
available for this, and their sophistication and number will grow rapidly.
Next, law and order. Machines are already used for some legal tasks, and recent years have
seen marked growth in the market for 'robot lawyers' able to file relatively simple legal claims,
for example appealing parking tickets, drafting basic contracts, and so on. This too is set to
grow in breadth and complexity with advances in AI, moving into drafting of legislation, and
regulatory compliance.
Somewhat relatedly, machine learning will increasingly take on enforcement and forensic
applications. Machine learning is already applied for example to plagiarism detection in educa-
tion, to personality profiling in human resource management, and other comparable tasks, as
well as identifying faces in crowds and voices in audio recordings. Illicit uses include creation
of fake news and other disinformation, as well as impersonation of others to bypass voice-
based authentication systems. All these are likely to expand in due course, especially as these
capabilities become embedded into wearable devices.
In health and care, language AI will enable better diagnosis and monitoring of health con-
ditions. Examples include diagnosis of conditions that come with tell-tale changes to the
patient's voice, like Parkinson's or Alzheimer's disease. AI can detect these changes much
earlier than friends or relatives, since it is not distracted by familiarity or slow progression.
Ubiquitous 'always on' listening devices will be primely placed for this kind of diagnosis and
monitoring.
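As a hint of what such monitoring could look like computationally, the following minimal sketch uses the open-source librosa library to track pitch variability, one acoustic cue studied in this context. The measure, its threshold and the sample file path are illustrative assumptions, not clinical criteria.

```python
# A minimal sketch of extracting a pitch-variability cue from a voice
# recording with librosa. This illustrates the idea only; clinical
# systems use validated feature sets and models.

import numpy as np
import librosa

def pitch_variability(path):
    y, sr = librosa.load(path, sr=None)
    f0, voiced, _ = librosa.pyin(y,
                                 fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C7"),
                                 sr=sr)
    f0 = f0[~np.isnan(f0)]  # keep voiced frames only
    # coefficient of variation of the fundamental frequency
    return float(np.std(f0) / np.mean(f0))

# tracked over months, a drift in this kind of measure is the sort of
# slow change a machine can notice before friends or relatives do
print(pitch_variability("sample_voice.wav"))  # placeholder path
```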
For sign language, certain devices are being developed for automatic recognition and synthesis
of sign. These have tended to focus exclusively on detecting the handshapes made during
signing, but signing actually involves much more than this. Fully understanding and producing
sign also involves a range of other visible features, including gesture, gaze, facial expression,
body posture, and social context. These remain out of reach for current and even planned
technologies. For this reason the Deaf community will benefit much more slowly from
technological advances in the human-machine era.
Moreover, as we noted previously, and as we will continue to note in this report, all these
exciting advances will not work equally well for everyone. They will obviously be unaffordable
for many, at least in the early stages. They are also likely to work better in larger languages than
in smaller and less well-resourced languages. Sign languages face the additional obstacles just
noted.
AR has additional major implications for a range of language professionals – writers, editors,
translators, journalists, and others. As machines become more able to auto-complete our sen-
tences, remember common sentences or responses, and indeed generate significant amounts of
content independently, so too the world of language work will change significantly. There will
still be a role for humans for the foreseeable future, but language professionals, perhaps more
than anyone, will be writing and talking through technology.
All this will also in turn provoke major dilemmas and debates about privacy, security, regulation
and safeguarding. These will come to the fore alongside the rise of these technologies, and
spark into civic debate in the years to come.
This is useful and used in a range of other professional contexts, but the form factor is still slightly too bulky
for streetwise social contexts. Nevertheless, the technology is being rapidly refined and miniaturised to become
inconspicuous and sleek. There is currently much media hype surrounding Apple Glass, rumoured for release in
2022. Apple is being characteristically secretive about this; but certain competitors are saying more, for example
Facebook’s ‘Project Aria’: https://about.fb.com/realitylabs/projectaria/. Other notable rivals include Nreal (nreal.
ai), and MagicLeap (magicleap.com). With all these advances in view, the global market for AR devices is forecast
to rise from $3.5 billion in 2017 to around $200 billion by 2025 (Statista 2020).
Altogether, advances in AR will combine for a comprehensive and varied means to deliver visual augmentations.
Now, the challenge is to take advantage of such overlays to build applications that go beyond mere visual augmen-
tations, and exploit the multimodal nature of human interactions. In this respect, combining visual tracking and
augmentations with language interactions would be particularly interesting and symbiotic. A major milestone, and
a cornerstone of the human-machine era, will be feeding information into conversation in real time.
Like Bregler et al. (1997) before them, Thies et al. (2016) target dubbing – “face videos can be convincingly dubbed
to a foreign language” (p. 2387). And by making it work ‘online’, they also target real-time uses, “for instance, in
video conferencing, the video feed can be adapted to match the face motion of a translator” (ibid.). This would be
a human translator (as the source actor), whose facial movements Face2Face would map on to the live video of the
person speaking. So there were two main limitations to Face2Face: it could only alter the mouth movements within
an existing video; and it required a source actor. The next evolution of this technology would progress beyond that.
The team behind Face2Face went on to develop ‘NerFACE’. This did away with the need for a source actor or
indeed a video of the original person speaking. NerFACE begins by applying machine learning to recordings of
a person speaking and otherwise using their face. It builds up a series of four-dimensional maps of each facial
expression: everything from how that person looks when they say each vowel to the contours of their smile, their
frown, and so on. With all that programmed in, NerFACE can make the recorded face say and do completely
new things, achieving “photo-realistic image generation that surpasses the quality of state-of-the-art video-based
reenactment methods” (Gafni et al. 2020: 1). Alongside progress with ‘visemes’ – face shapes associated with
each sound (e.g. Peymanfard et al. 2021) – there is clear potential for anyone’s face to be quickly and convincingly
re-animated to say anything.
As machine translation becomes quicker and approaches real time, NerFACE could create a machine-generated
visualisation of your face speaking another language. As we discussed earlier, machine translation is not quite
yet able to work in real time with your speech; but it is getting there. Progress here could also enable convincing
animations of historical figures. Right now there are somewhat rudimentary re-animations using currently available
methods, often for fun purposes, for example getting Winston Churchill to sing pop songs. As the technology
improves, this could become more convincing, and more immersive. Imagine, for example, Da Vinci’s Mona Lisa
giving an art lecture, Albert Einstein teaching physics, or Charles Darwin lecturing about the science of evolution.
All these are perfectly predictable extensions of these technologies currently in development. And all this will be
complemented by lifelike avatars in virtual bodies, which we come back to under Virtual Reality below.
Meanwhile advances in real-time hologram technology point to a near future where augmented faces could also
be beamed onto surfaces or in space around us, using the relatively low processing power of phones and other
wearable devices; see for example the Tensor Holography project at MIT: http://cgh.csail.mit.edu/.
So, near future advances in augmented reality, combined with improved eyepieces and earpieces, suggest the fol-
lowing distinct possibilities within the foreseeable future:
• Talking with people in high-noise environments or over a distance and hearing their amplified and
clarified voice
• Real-time translation of people speaking different languages, fed into your earpiece as you talk to them
• An augmented view of their face, fed into your eyepiece, so you see their mouth moving as if they were
speaking your language, or if using the same language, ‘seeing’ their face when it is covered by clothing
or they’re facing away from you
As we discussed earlier, this is unlikely to include sign language anywhere near as comprehensively in the foreseeable
future. That is down to the combined problems of mixed modality in sign language (handshapes, gesture, body
position, facial expression, etc.) that are much harder for machines to understand, and generally lower funding and
incentives in a highly privatised and profit-led innovation space.
But the innocent uses of speech and face synthesis outlined above can and will also be used for malicious purposes,
to impersonate people for profit, to spread misinformation, and to abuse people by making lifelike depictions of
them doing and saying degrading and disgusting things. This is already a reality with the rise of ‘deep fakes’, and
indeed this has been showcased in a number of high-profile contexts, including the feature series Sassy Justice,
made by the creators of South Park and featuring an augmented Donald Trump and other celebrities, doing and
saying things that are out of character. The future will see this kind of augmentation available in real time on
mobile equipment, with the potential for quite a mix of exciting prospects and challenging risks.
Relatively affordable prices have led to high levels of adoption. AR and VR are poised to grow and mature quickly,
fueled by enormous private investment; see for example Facebook Reality Labs, explicitly aiming for widespread
adoption of AR and VR: https://tech.fb.com/ar-vr/. According to market analysis (Consultancy.uk 2019), VR is
forecast to contribute US $138.3 billion to global GDP in 2025, rising to US $450.5 billion by 2030.
VR will be a key contributor to the human-machine era. And while AR will enable us to speak through technology,
with VR we will also increasingly speak to technology. VR enables both: it can augment our voices, faces and
movements; and can provide us with virtual characters to talk to.
Progress in the field may recall the ‘uncanny valley’ hypothesis, which suggests that humanoid objects that closely but imperfectly resemble actual humans produce unsettling feelings in observers. The history of earlier technologies, at first rejected but gradually embraced, offers a useful parallel here. This tension will certainly play out in the coming years.
The key advance in the coming years will be the improvement and refinement of automated conversation (chatbots),
discussed in the next section below. As chatbots become more able to conduct natural, spontaneous conversations
on any topic, VR will gradually change. At present we can interact with other connected humans, or relatively
simplistic chatbots. Given the likely technological advances we have discussed so far, in the near future that line
will blur, at least in our sense of whether we are talking to a human or a machine. That will be the key turning point
for the human-machine era: talking to technology.
applications, webchat, e-learning systems, cars, smart-home applications and many others.
4. Policymakers: discuss what the chatbots should and should not do.
5. Users: love and hate chatbots.
We discuss each of these in turn.
Researchers have worked on all aspects of chatbots for decades. While automated dialogue was rather a niche area before 2015, the number of conferences dedicated to chatbots has increased since then. Most research groups and
conferences focus on dialogue systems, for example SIGDial (sigdial.org), SEMDial (semdial.org), CUI (dl.acm.
org/conference/cui, focused on conversational user interfaces), CONVERSATIONS (conversations2021.word-
press.com, user experience and conversational interfaces), EXTRAAMAS (extraamas.ehealth.hevs.ch, explanation
generation, inter alia). Major AI and NLP conferences and journals also publish about various aspects of chatbots.
Researchers create models of conversations/dialogues, find suitable algorithms to implement those models, and
deploy software that uses those algorithms. Highly influential theories from the past include:
• Attention, Intentions, and the Structure of Discourse (Grosz & Sidner 1986).
• Speech Acts, emanating from Searle (1969) but developed for annotation in corpora (Weisser 2014).
Current approaches dominating conferences and workshops are variations of Deep Learning-based end-to-end
systems (https://arxiv.org/abs/1507.04808) trained on masses of examples (e.g. Meena, Blenderbot).
Research topics cover a variety of aspects of discourse processing (e.g. discourse parsing, reference resolution, multilin-
gual discourse processing), dialogue system development (e.g. language generation, chatbot personality and dialogue
modelling), pragmatics and semantics of dialogue, corpora and tools, as well as applications of dialogue systems.
Language Technology Creators and Communication Channel providers are distributed across a variety of
startups and big companies. Language technology companies provide business-to-business and business-to-cus-
tomer conversational AI solutions. Pre-trained language models can be downloaded and reused or accessed via
APIs and cloud-based services. Proprietary language understanding tools (e.g. ibm.com/watson, luis.ai) and open
source alternatives (such as rasa.com) facilitate the development of chatbots. Services such as chatfuel.com and
botsify.com allow creating chatbots for messengers within hours. And support tools such as botsociety.io facilitate
chatbot design. These tools take a great deal of programming work out of chatbot design, significantly lowering
the technological bar and enabling a wide range of people to build a chatbot.
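To show how low that bar now is, the following minimal sketch implements the intent-matching core that such platforms wrap in graphical tooling. The intents and training phrases are invented for illustration; real services use trained language models rather than simple word overlap.

```python
# A minimal, hypothetical sketch of intent matching, the core task that
# NLU platforms expose. This toy version ranks intents by word overlap.

import re

INTENTS = {
    "greet": ["hello", "hi there", "good morning"],
    "opening_hours": ["when are you open", "what are your opening hours"],
    "goodbye": ["bye", "see you later", "goodbye"],
}

def tokens(text):
    return set(re.findall(r"[a-z']+", text.lower()))

def classify(utterance):
    """Return the intent whose training phrases best overlap (Jaccard)
    with the user's utterance."""
    words = tokens(utterance)
    def best_score(phrases):
        return max(len(words & tokens(p)) / len(words | tokens(p))
                   for p in phrases)
    return max(INTENTS, key=lambda name: best_score(INTENTS[name]))

print(classify("hi, when do you open?"))  # -> 'opening_hours'
```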
Language technology creators focus less on answering challenging research questions, more on providing a service
that is scalable, available, and usable; and that satisfies a specific need of a specific customer group. For instance,
NLU-as-a-Service providers only support their users in the NLU task. All remaining problems related to the
production of a good chatbot are left open. Even with the support of NLU platforms, chatbot authors still need
to provide their own data for re-training existing language models. Especially for cloud-based services,
chatbot builders need to read the privacy notice carefully: what happens with the training examples that are up-
loaded to the cloud? After a chatbot is developed, it needs to be connected with its users on a particular channel.
Typically chatbots are deployed on instant messengers such as Facebook Messenger, Telegram, Viber, WhatsApp.
Policy makers regulate the chatbot development implicitly and explicitly. For an overview of AI policy in the EU,
see https://futureoflife.org/ai-policy-european-union/. As of May 2021 the EU has 59 policy initiatives aimed at
regulation of different aspects of AI – up from 51 in March 2021 (see https://oecd.ai/dashboards/countries/
EuropeanUnion). Implicit regulations include for instance GDPR, which influences among other things how
chatbots can record and store user input. Some new privacy regulations have negatively impacted the user expe-
rience, for example GDPR policies invoked in December 2020 removed many popular and engaging functions
of online messengers, including persistent menus (previously shown to improve the user’s freedom and control:
Hoehn & Bongard-Blancy 2020) – disabled for all chatbots connected to a Facebook page based in Europe, and/or for
users located in Europe (see https://chatfuel.com/blog/posts/messenger-euprivacy-changes). The same update
disabled chatbots from sending video or audio, or displaying one-time notifications. The solution suggested by
some technology providers was to create a copy of the bot with reduced functionality and to maintain two versions: one bot with full functionality, and the other a “reduced European version”. So the evolution of chatbots is not
linear, and will be punctuated by such debates over the balance between privacy and functionality.
There are wider ethical and philosophical questions raised by speaking to technology in the form of chatbots; and
these questions in turn highlight the challenge to many traditional areas of linguistic research. Capturing some of
these questions, in 2019 the French National Digital Ethics Committee (CNPEN) opened a call for opinions on
Petersen (2010) distinguishes Communicative ICALL and Non-Communicative ICALL. He sees Communicative ICALL as an extension of the human-computer interaction (HCI) field: “Communicative ICALL employs methods and techniques similar to those used in HCI research, but focuses on interaction in an L2 context” (Petersen 2010: 25). We look at applications in both subfields.
Frequently cited real-life ICALL applications are E-Tutor for German learners (Heift 2002, 2003), Robo-Sensei for
Japanese learners (Nagata 2009) and TAGARELA for learning Portuguese (Amaral et al. 2011). These systems have conceptually similar structures. Main technical components include an expert model, a student model and
an activity model. Language technologies are heavily employed to analyse learner errors and provide corrective
feedback. Automated linguistic analysis of learner language includes the analysis of the form (tokenization, spell-
check, syntactic parsing, disambiguation and lexical look-up) and the analysis of the meaning (whether the learner
answer makes sense, e.g. expected words appear in the input, the answer is correct etc.).
Typical learner errors are usually part of the instruction model in ICALL systems. Corpora of learner language
are used to model typical L2 errors for different L1 speakers, for example FALKO (Reznicek et al. 2012, 2013),
WHiG (Krummes & Ensslin 2014) and EAGLE (Boyd 2010). The repository Learner Corpora around the World
(https://uclouvain.be/en/research-institutes/ilc/cecl/learner-corpora-around-the-world.html) contains many
other learner corpora. The annotation of learner corpora is mainly focused on annotation of learner errors;
however, annotation of linguistic categories in learner corpora is also of interest. Error annotation of a corpus
assumes a non-ambiguous description of the deviations from the norm and, therefore, the norm itself. The
creation of such a description may even be problematic for errors in spelling, morphology and syntax (Dickinson
& Ragheb 2015). In addition, different annotators’ interpretations lead to a huge variation in annotation of errors
in semantics, pragmatics, textual argumentation (Reznicek et al. 2013) and usage (Tetreault & Chodorow 2008).
Multiple annotation schemes and error taxonomies have been proposed for learner corpora (e.g. Díaz-Negrillo & Domínguez 2006; Reznicek et al. 2012).
Mondly VR: Learn Languages in VR – this app focuses on non-native language learning and operates in 33 different languages. It has two sections: a vocabulary section and a conversation section. Learners study new words and phrases in context (e.g. in a restaurant or at the post office), practice listening and reading, and receive feedback on their pronunciation.
ImmerseMe – an online language learning platform offering several VR-based configurations for teaching a range of languages. The technology provides an authentic representation of a region where the target language is spoken. After selecting a configuration and a class, students interact with pre-recorded “teachers”. Beginners can participate in dictation exercises in which they repeat the words spoken by the virtual teachers; the words spoken by the students are automatically recorded and transcribed by the application. The application provides materials for learning nine languages. In short, the application creates a contextualised environment in which students can improve their skills in the desired language.
Panolingo – this application offers users the opportunity to learn a foreign language through gamification, earning points and bonuses. Users can share their scores with friends on the app, stimulating a competitive mindset and engaging them in the learning process. The immersive nature of this interactive platform guides users through different challenges, providing more effective learning experiences by exposing them directly to scenarios where the foreign language is the basis of the whole scene.
The descriptions of the apps above indicate that the only skill not practiced is writing. Otherwise, these apps attempt to teach the target language in context through meaningful activities, with a special focus on the development of vocabulary. However, little attention is paid to collaborative learning, which appears to be crucial for learning a non-native language because it facilitates understanding, develops relationships and stimulates critical thinking.
Foreign/second language learning through VR presents interesting didactic challenges. Despite VR's much-stud-
ied benefits and positive learning outcomes (Lin & Lan 2015; Parmaxi 2020; Guerra et al. 2018; Barreira et al. 2012;
Solak & Erdem 2015), other studies suggest that in the long term conventional approaches are better (Sandusky
2015). There is still a need for systematic comparative evaluations here (Hansen & Petersen 2012). More extensive
longitudinal studies are needed.
conversation prompts and augmented conversation (as we discussed earlier). When combined with advances in
machine translation, this is absolutely ripe for language learning (Zhang et al. 2020). Through the use of mobiles,
tablets and wearable technologies like smart glasses and contact lenses, AR will enable innovative learning scenarios
in which language learners can use their knowledge and skills in the real world. Learning by doing.
AR has a relatively recent history of research and application in second language learning. In the relevant literature,
recurring themes regarding the potential benefits of AR applications are: a) learning through real-time interaction,
b) experiential learning, c) better learner engagement, d) increased motivation, e) effective memorization and better
retention of the content (Khoshnevisan & Le 2019; Parmaxi & Demetriou 2020).
Communication and interaction – vital in language learning – are effectively supported in AR-enriched environ-
ments. Through the use of QR codes, markers and sensors such as GPS, gyroscopes, and accelerometers, any
classroom can be turned into a smart environment in which additional information is added to physical surround-
ings. These smart environments can be exploited in the form of guided digital tours or place-based digital games
in which learners can interact with the objects and people synchronously, while fulfilling any language-related tasks.
If well-designed, AR-enriched environments will enable learners to take an active role speaking, reading, listening
and writing in the target language. The project Mentira (Holden & Sykes 2011) is an early example of such an
environment in which a place-based game, a digital tour and classroom activities were combined to teach Spanish
pragmatics. While framing requests or refusals, students were able to interact with game characters as well as actual Spanish-speaking inhabitants during the tours. These innovative interactive environments also boost learner
motivation and engagement.
Collaboration, novelty of the application, feeling of presence, and enjoyable tasks through playful elements are
among the reported factors attributed to AR enhancing learner motivation (Liu & Tsai 2013; Chen & Chan 2019;
Cózar-Gutiérrez & Sáez-López 2016). Experiential learning – the “process whereby knowledge is created through the transformation of experience” (Kolb 1984) – is another theme frequently underlined in AR-based language
learning. Experiential learning emphasizes that learning happens when learners participate in meaningful encoun-
ters through concrete experience and reflect upon their learning process.
AR-based instruction has potential to turn the educational content into a concrete experience for the learners.
Place-based AR environments, for example, guide the users at certain locations and help them to carry out certain
tasks through semiotic resources and prompts. Within the language learning contexts, these tasks could be in
the form of maintaining a dialogue at an airport or a library or asking for directions on a street. As the learners’
attention is oriented to relevant features of the setting, their participation in the context is embodied, which makes their experience more concrete than in-class practicing of these tasks. This “embodied interactive action” in the language learning process (Wei et al. 2019) also leads to successful uptake and better retention of the content.
Based on the possible benefits above, there is increasing interest in AR tools and applications integrated into language education, which yields more and more attempts to develop new tools or adapt existing ones for educational settings. Below is a brief presentation of AR software frequently used for language teaching purposes.
ARIS (Augmented Reality and Interactive Storytelling), developed at the University of Wisconsin. As a free open-
source editor, ARIS can be used by any language teacher without technical knowledge to create simple apps such as tours or scavenger hunts. More complex instructional tasks, on the other hand, require HTML and JavaScript
knowledge. Action triggers could be either GPS coordinates (i.e. location) or a QR code, which could be exploited
to start a conversation, show a plaque, or open a website to guide learners in their language practice (Godwin-
Jones, 2016). Using ARIS, the Center for Applied Second Language Studies at the University of Oregon has recently released a number of language teaching games and free-to-use AR materials targeting different proficiency levels, which means language teachers can readily integrate AR into their teaching practices.
TaleBlazer, developed at MIT. Its visual blocks-based scripting language enables the users to develop interactive
games while avoiding syntax errors. GPS coordinates can be used as triggers to initiate tasks targeting speaking, writing or vocabulary practice in the target language. ImparApp, developed with this software at Coventry University to teach beginner-level Italian, is a good example of exploiting TaleBlazer for language teaching purposes (Cervi-Wilson & Brick, 2018).
HP Reveal (formerly Aurasma). A free to use tool, HP Reveal has both mobile and web interfaces. While the
mobile version is generally used to view AR materials and offers limited opportunity to create AR materials, the
web interface provides the users with a wide range of tools including content management, statistics and share
options to develop any AR-enriched environment. Myriad ready to use AR materials makes it accessible and
popular. Its uses for language teaching purposes range from creating word walls to teach vocabulary, designing
campus trips, interactive newspaper articles, and improving reading skill in the target language. For discussion of
HP Reveal for language learning, see Plunkett (2019).
Unity (ARToolkit), released by the University of Washington. As an open-source AR tool library with an active
community of developers, Unity ARToolkit’s two distinguishing aspects are efficient viewpoint tracking and virtual
object interaction; these make it popular among AR developers. For teachers with less programming knowledge, however, it may be less approachable.
Vuforia. A freely available development kit (on condition of displaying the Vuforia watermark), Vuforia allows users to develop a single native app for both Android and iOS, and provides a number of functions including Text Recognition, Cloud Recognition, Video Playback, Frame Markers, etc. It works smoothly in connection with Unity 3D, and its developer community constantly offers updates, which contributes to its popularity. Although it is widely used in different fields of education, the field of language teaching has yet to embrace Vuforia in instructional activities. Interested readers could examine Alrumayh et al. (2021).
Looking ahead, the effects of increased use of these technologies (integrated with our senses rather than simply confined to mobile or external devices) on the cognitive processing of information are very hard to predict. Research has shown that reading modalities, for instance, can impact metacognitive regulation (Ackerman & Goldsmith 2011; Baron 2013; Carrier et al. 2015) and higher-order cognitive processing (Hayles 2012; Greenfield 2009; Norman & Furnes 2016; Singer & Alexander 2017). The addition of numerous extra layers of multimodal information will have to be studied carefully to ensure that optimal cognitive processing remains possible. If it does not, efforts will be needed to tailor the input so that it is manageable for the human brain.
Various attempts have been made to use formal logic to capture legislation. An example is the DAPRECO knowledge base (Robaldo et al. 2020), which used an input-output logic (Makinson & van der Torre 2003) to capture the whole European General Data Protection Regulation (GDPR).
There have also been successful implementations of legal reasoning procedures. One example is the Regorous system (Governatori 2015), which is capable of checking regulatory compliance problems written in defeasible deontic
logic (Governatori et al. 2013).
In the remainder of this subsection, we will describe two applications of these and similar systems to law.
Consistent legal drafting – In computer programming, a program written in a formal programming language can be executed by a computer. The compilation/interpretation process also checks the program for consistency issues and errors. Legal language has some similarities to programming languages – both depend on a relatively small vocabulary and precise semantics. In contrast to computer programs, however, legislation cannot be checked by a compiler for consistency issues and errors. There may be many sources of such errors, ranging from syntactical errors in the legislation to contradictions between different pieces of legislation or authorities that apply at the same time. In order to validate and check legislation for such errors, two processes are needed: first, the ability to translate legislation into a computer-processable form, for example via annotations (Libal & Steen 2019) or even manually; second, once the legislation is translated, other programs can check it for errors and consistency issues (Novotná & Libal 2020).
Regulatory compliance checking – The ability to translate legislation into a computer-processable form has other benefits. An important application is the ability to check various documents for regulatory compliance. Regulatory compliance checking in many domains, such as finance, is a very complicated process. Ideally, such a process should result in a yes/no answer; in practice, it normally results only in a decreased likelihood of compliance violation. A computer program that can instantly check all possible violations and interpretations, for example when checking for GDPR compliance (Libal 2020), can greatly improve this process.
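The following minimal sketch illustrates the idea of machine-checkable norms. The representation is a deliberately toy one of our own devising, far simpler than the defeasible deontic logic used by systems like Regorous.

```python
# A minimal, toy sketch of machine-checkable norms. Norms assign a
# deontic status to an action under a condition; we can then look for
# contradictions between norms, and check recorded actions for compliance.

from dataclasses import dataclass

@dataclass(frozen=True)
class Norm:
    condition: str   # e.g. "data_is_personal"
    action: str      # e.g. "store_data_abroad"
    status: str      # "obligatory", "permitted" or "forbidden"

NORMS = [
    Norm("data_is_personal", "obtain_consent", "obligatory"),
    Norm("data_is_personal", "store_data_abroad", "forbidden"),
    Norm("data_is_personal", "store_data_abroad", "permitted"),  # conflicts!
]

def find_conflicts(norms):
    """Flag pairs of norms giving incompatible statuses to the same
    action under the same condition (a crude consistency check)."""
    conflicts = []
    for i, a in enumerate(norms):
        for b in norms[i + 1:]:
            same_scope = (a.condition, a.action) == (b.condition, b.action)
            if same_scope and "forbidden" in (a.status, b.status) \
                    and a.status != b.status:
                conflicts.append((a, b))
    return conflicts

def check_compliance(facts, actions, norms):
    """Report forbidden actions performed and obligations omitted,
    given the conditions (facts) that hold."""
    violations = []
    for n in norms:
        if n.condition not in facts:
            continue
        if n.status == "forbidden" and n.action in actions:
            violations.append(f"violates prohibition: {n.action}")
        if n.status == "obligatory" and n.action not in actions:
            violations.append(f"omits obligation: {n.action}")
    return violations

print(find_conflicts(NORMS))
print(check_compliance({"data_is_personal"}, {"store_data_abroad"}, NORMS))
```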
The following are some of the most common applications of computational forensic linguistics:
• Forensic authorship analysis, often referred to also as authorship attribution, consists of establish-
ing the most likely author of a text from a pool of possible suspects. This analysis is applied to texts
of questioned authorship, such as anonymous offensive or threatening messages, defamation, libel,
suicide notes of questioned authorship or fabricated documents (e.g. wills), among others. Authorship
analysis typically involves identifying an author’s writing style and establishing the distinction between
this style and that of other authors. A high-profile case involving authorship analysis is that of Juola
(2015), who concluded that Robert Galbraith, the author of the novel ‘The Cuckoo’s Calling’, was
indeed J.K. Rowling, but forensic authorship analysis has also been used in criminal contexts, e.g. the
‘unabomber’ case (a minimal stylometric sketch follows this list).
• Authorship profiling consists of establishing the linguistic persona of the author of a suspect text,
and is crucial in cases where an anonymous text containing criminal content is disseminated, but no
suspects exist. It allows linguists to find elements in the text that provide hints to the age range, sex/
gender, socioeconomic status of the author, geographical origin, level of education or even whether
the author is a native or a non-native speaker of the language. When successful, authorship profiling
allows the investigator to narrow down the pool of possible suspects. Recent approaches to authorship
profiling include profiling of hate speech spreaders on Twitter (https://pan.webis.de/clef21/pan21-
web/author-profiling.html).
• Plagiarism is a problem of authorship – or, more precisely, of its violation. Although plagiarism
detection and analysis is approached differently from authorship attribution, in some cases authorship
attribution methods can also be helpful. This is the case, in particular, when the reader intuits that
the text does not belong to the purported author, but is unable to find the true originals. An intrinsic
analysis can be used, in this case, to find style inconsistencies in the text that are indicative of someone
else’s authorship. The most frequent cases of plagiarism, however, can be detected externally, i.e. by
comparing the suspect, plagiarising text against other sources; if those sources are known, a side-by-
side comparison can be made, otherwise a search is required, e.g. using a common search engine or one
of the so-called ‘plagiarism detection software’ packages. Technology plays a crucial role in plagiarism
detection; as was argued by Coulthard & Johnson (2007), the technology that helps plagiarists plagia-
rise also helps to catch them, and an excellent example of the potential of technology is ‘translingual
plagiarism’ detection (Sousa-Silva 2013, 2021): this is where a plagiarist lifts the text from a source in
another language, machine-translates the text into their own language and passes it off as their own. In
this case, machine translation can be used to reverse the plagiarist’s procedure and identify the original
source.
• Cybercrime has become extremely sophisticated, to the extent that cybercriminals easily adopt obfus-
cation techniques that prevent their positive identification. However, cybercriminal activities, including
stalking, cyberbullying and online trespassing usually resort to language for communication. A forensic
authorship analysis and profiling of the cybercriminal communications can assist the positive identifi-
cation of the cybercriminals.
• Fake news has increasingly been a topic of concern, and has gained relevance especially after the
election of Donald Trump, in the USA, in 2016 and Bolsonaro, in Brazil, in 2018. The phenomenon
has been approached computationally from a fact-checking perspective; however, not all fake news is factually deceiving (Sousa-Silva 2019), and such items pass the fact-checking test. Given the ubiquitous nature
of misinformation, a computational forensic linguistic analysis is crucial to assist the detection and
analysis of fake news. This is an area where the Human-Machine relationship is particularly effective,
since together they are able to identify patterns that are typical of pieces of misinformation.
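Returning to authorship analysis, the following minimal sketch illustrates one classic, simplified stylometric approach: comparing relative frequencies of common function words. The candidate texts are invented for illustration; real forensic casework uses far richer features and validated methods.

```python
# A minimal, hypothetical stylometric sketch: attribute a disputed text
# to the candidate author whose function-word profile it most resembles.

import math
import re
from collections import Counter

FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that",
                  "it", "is", "was", "for", "on", "with", "as"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def attribute(disputed, candidates):
    """Rank candidate authors by stylistic similarity to the disputed text."""
    d = profile(disputed)
    scores = {name: round(cosine(d, profile(text)), 3)
              for name, text in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

candidates = {
    "author_A": "it was the cat that sat on the mat and it was happy there",
    "author_B": "to be or not to be that is the question for us",
}
print(attribute("it was the best of times and it was the worst of times",
                candidates))
```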
Filing paperwork for the general public is one of the concrete applications of legal chatbots: for instance DoNotPay
(donotpay.com), widely termed the “first robot lawyer”, is an interactive tool meant to help members of the US
population to e.g. appeal parking tickets, fight credit card fees and ask for compensation, among the many use
cases. SoloSuit (solosuit.com) helps people who are sued for a debt to respond to the lawsuit. Australia-based ‘Ailira’ (Artificially Intelligent Legal Information Research Assistant; ailira.com/build-a-legal-chatbot) is a chatbot implemented directly on Facebook Messenger that offers the possibility to build one’s own legal chatbot to advise clients on a variety of matters, and promises a “focused and deep understanding of the
law”. PriBot (pribot.org/bot) is able to analyse privacy policies published by organizations and provide direct
answers concerning their data practices (e.g. does this service share my data with third parties?). In such a scenario,
chatbots spare individuals (end-users and supervisory authorities alike) the effort of finding information in an
off-putting, lengthy and complex legal document. Chatbots are also employed to automatically check for the state
of compliance of an organization with applicable regulations and even suggest a course of action based on the
answers (e.g. GDPR-chatbot.com).
In all these cases, however, the tradeoff is between ease of use and low or no cost on the one hand, and reliability
and formal assurance on the other. Naturally, bots that interpret legal terms and conditions also themselves have
terms and conditions. Bespoke professional legal advice from a human may still have value in the human-machine
era.
This would be entirely feasible as an embedded feature of any smartphone, virtual assistant, or other listening
device. Viable, widely used applications are therefore to be expected in the near future. But these will of course bring
major dilemmas in terms of privacy and security, as well as potential uses and abuses for providers of private health
insurance and others who may profit from this health data.
Aside from diagnosing mental illnesses and cognitive decline, there are numerous AI robots and chatbots made for
neurodivergent conditions, for example bots designed to help autistic children to develop and refine their social
and interpersonal skills (e.g. Jain et al. 2020).
The second broad approach mentioned above, camera-based systems, has seen some slow incremental progress
recently. For example, Herazo (2020a) shows some accurate recognition capabilities, though overall the results are
“discouraging” (see also Herazo 2020b). Google’s MediaPipe and SignAll (https://signall.us/sdk) embed visual
detection of sign into smartphone cameras or webcams. However, as they note somewhat quietly in a blog post
(https://developers.googleblog.com/2021/04/signall-sdk-sign-language-interface-using-mediapipe-now-avail-
able.html), this is still limited to handshapes, excluding the various other multimodal layers also essential to meaning
in sign:
“While sign languages’ complexity goes far beyond handshapes (facial features, body, grammar,
etc.), it is true that the accurate tracking of the hands has been a huge obstacle in the first layer
of processing – computer vision. MediaPipe unlocked the possibility to offer SignAll’s solutions
not only glove free, but also by using a single camera.”
It is positive to see a less physically intrusive solution; but as they briefly acknowledge, this shares precisely the
same limitations as its predecessors, and does nothing to advance a solution.
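For readers curious about the handshape-tracking layer that MediaPipe does provide, the following minimal sketch shows how hand landmarks can be read from a webcam (assuming OpenCV for capture). As the text stresses, this captures only one layer of sign; everything downstream of the landmarks is left open.

```python
# A minimal sketch of camera-based handshape tracking with Google's
# MediaPipe Hands, assuming OpenCV for camera capture. This captures
# hand landmarks only -- none of the facial, bodily or grammatical
# layers that full sign comprehension requires.

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2,
                                 min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)  # default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV delivers BGR
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # 21 (x, y, z) landmarks per detected hand; a downstream
            # classifier could map these to candidate handshapes
            wrist = hand.landmark[0]
            print(f"wrist at ({wrist.x:.2f}, {wrist.y:.2f})")
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
```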
(e.g. Estonian), so as to help students develop their own writing and receive corrective feedback on their free
writing. This is the case of the free online resource UNED Grammar Checker (http://portal.uned.es/portal/
page?_pageid=93,53610795&_dad=portal&_schema=PORTAL).
Currently, systems like Apple’s predictive writing or Google Docs guess what one’s next word is based on previous
words, which allows writers to speed up typing, find that ‘missing’ word, and make fewer mistakes. The improvement of this technology over time has inevitably had an impact on authors’ writing: one can speculate that as more
writing assistants are available, and of a higher quality, the line between human and machine in the writing process
will blur. This blurring is precisely a defining feature of the human-machine era.
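The principle behind such predictive writing can be illustrated with a deliberately simple bigram model, sketched below. The neural models behind commercial systems are vastly more capable, but they rank candidate next words on the same underlying idea.

```python
# A minimal, hypothetical sketch of next-word prediction with a bigram
# model: rank candidate next words by how often they followed the
# previous word in the training text.

from collections import Counter, defaultdict

corpus = ("the meeting is on monday . the meeting is cancelled . "
          "the report is due on monday .").split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def suggest(prev_word, k=3):
    """Return the k most likely continuations of prev_word."""
    return [w for w, _ in following[prev_word].most_common(k)]

print(suggest("the"))   # -> ['meeting', 'report']
print(suggest("is"))    # -> ['on', 'cancelled', 'due']
```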
Other tools use the results of the linguistic and NLP analysis of large corpora (by learning, for instance, the use
of typical collocations or multi-word units) to help users improve the vocabulary range, specificity and fluency
of their academic English writing. A tool like ColloCaid, for example (https://collocaid.uk/prototype/editor/
public/), does so by suggesting typical collocations for a large number of academic terms. Users can start writing
directly in the ColloCaid editor, or paste an existing draft into it.
Textio Flow (textio.com) advertises itself as an augmented writing tool that can ‘transform a simple idea of a few
words into a full expression of thoughts consisting of several sentences or even a whole paragraph’. The system, then, goes farther than simple text-processing tools, because it allows text to be written from a conceptualisation of the writer’s ideas. See also: https://www.themarketingscope.com/augmented-writing/
In professional settings, writing assistants have been used, for example, to mitigate gender bias in job advertise-
ments. Hodel et al. (2017) investigate whether and to what extent gender-fair language correlates with linguistic,
cultural, and socioeconomic differences between countries with grammatically gendered languages. The authors
conclude that a correlation does indeed exist, and that such elements contribute to socially reproduce gender (in)
equalities and gender stereotypes. Similarly, Gaucher et al. (2011) investigate whether gendered wording (which
includes, e.g. male-/ female-themed words used to establish gender stereotypes) may contribute to the maintenance
of institutional-level inequality. This is especially the case of gendered wording commonly used in job recruitment
materials, especially in roles that are traditionally male-dominated. Several tools can be used to balance such gender
bias in job advertisements. An example is Textio (textio.com), a tool that seeks to provide more inclusive language,
which assesses a preexisting job description, scores it and subsequently makes suggestions on how to improve the
writing and obtain a higher score – which means more applications, including from people who otherwise would
not apply. Unlike Textio, which is a subscription-based tool, Gender Decoder (gender-decoder.katmatfield.com) is a free tool that assists companies by reviewing their job descriptions. The tool makes suggestions based on a word list, flagging words associated with masculine-coded roles for removal, hence reducing gender bias.
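The word-list approach can be sketched very simply. In the illustration below, the two lists are short samples of the kinds of items discussed by Gaucher et al. (2011), not any tool's actual lists.

```python
# A minimal sketch of word-list-based gendered-wording detection, in
# the spirit of Gender Decoder. The word lists are illustrative samples.

MASCULINE_CODED = {"competitive", "dominant", "leader", "ambitious",
                   "assertive", "independent"}
FEMININE_CODED = {"collaborative", "supportive", "interpersonal",
                  "committed", "nurturing", "loyal"}

def audit(ad_text):
    words = {w.strip(".,;:!?()").lower() for w in ad_text.split()}
    m, f = words & MASCULINE_CODED, words & FEMININE_CODED
    verdict = ("masculine-coded" if len(m) > len(f)
               else "feminine-coded" if len(f) > len(m) else "neutral")
    return {"masculine": sorted(m), "feminine": sorted(f), "verdict": verdict}

print(audit("We seek a competitive, ambitious leader to join our team."))
# {'masculine': ['ambitious', 'competitive', 'leader'], 'feminine': [],
#  'verdict': 'masculine-coded'}
```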
However, gender bias is not exclusive to corporations. Research has found that court decisions are often gender-biased, despite the expectation that the law treats everyone equally. Recent work (Pinto et al. 2020) has thus focused on building a linguistic model to underpin a writing assistant for legal practitioners. The tool will flag drafted text for possible instances of gendered language and draw the writer’s attention to those instances.
like Google Translate. This allows translators to auto-translate a text segment if it does not exist in the TM. More
recently, translation companies have developed proprietary machine translation systems. One of the main advan-
tages of these is that access is controlled, and hence the risk of violating copyright is smaller. An example of such
technology is RWS Language Cloud (rws.com/translation/language-cloud). The role of the human translator has
thus shifted from a quiet, independent human working at home, to a ‘techno-translator’, trained to use technology
for their own advantage, rather than resist it. This is one of the main applications of machine translation, but not
the only one.
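The fuzzy matching at the heart of TM reuse can be sketched with the standard library alone, as below. Commercial CAT tools use more elaborate similarity scores, but the principle is the same: reuse the stored translation of the most similar previously translated segment.

```python
# A minimal sketch of translation-memory (TM) fuzzy matching using
# only the Python standard library.

from difflib import SequenceMatcher

tm = {
    "The battery must be charged before first use.":
        "La batterie doit être chargée avant la première utilisation.",
    "Do not expose the device to water.":
        "N'exposez pas l'appareil à l'eau.",
}

def best_match(segment, memory, threshold=0.75):
    """Return (source, target, score) for the closest TM entry, or None."""
    scored = [(src, tgt,
               SequenceMatcher(None, segment.lower(), src.lower()).ratio())
              for src, tgt in memory.items()]
    src, tgt, score = max(scored, key=lambda t: t[2])
    return (src, tgt, round(score, 2)) if score >= threshold else None

print(best_match("The battery should be charged before first use.", tm))
# when no match clears the threshold, a CAT tool falls back to MT
```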
The issue of privacy in CAT has long been a topic of concern for translators. The fact that TM can be used to
speed up the translation process (while lowering translators’ fees) has raised concerns over intellectual property. If several translators contribute to a translation project, it becomes unclear whose property the TM is. Add to this
the fact that translations usually include confidential information that may be compromised when using translation
systems, in general, and MT in particular. MT has multiplied these concerns exponentially, as identifying the
contributors becomes virtually impossible. In any future developments, therefore, it will be important to embed
privacy as a core element.
But nowadays most of the machine translation in the world is done for non-professionals: for people living in a
multilingual area, travelling abroad, shopping online in another language, and so on. Machine translation is not yet fully accurate, but it is usually good enough for the gist of text or speech in supported language pairs. It is also used on a daily basis by monolingual speakers – or writers – of a language, or by people who can’t speak a certain language, to get the gist of a text or to produce texts immediately and cost-effectively where hiring a human translator is not possible. The result cannot yet be compared to a human-translated text, but it allows access to a text that would otherwise be inaccessible.
Machine translation can also be used at a more technical level, as part of a process to further train and refine
machine-translation systems. Texts can be auto-translated, then ‘post-edited’ (corrected) by a human translator,
then this output is used to train new machine translation systems.
Machine translation can also be used to perform other tasks. An example of such an application is translingual
plagiarism detection (Sousa-Silva 2014), i.e. to detect plagiarism where the plagiarist copies the text from a source
language, translates it to another target language and uses it as their own. The underlying assumption is that
because plagiarism often results from time pressure or laziness, plagiarists use machine (rather than human) trans-
lation. Hence, by reversing the procedure – i.e. by translating the suspect text back – investigators can establish a
comparison, identify the original and demonstrate the theft.
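A minimal sketch of that reversal is given below. The machine_translate function is a placeholder for any MT service (its name and signature are our assumption), and the string similarity measure is illustrative only.

```python
# A minimal, hypothetical sketch of translingual plagiarism detection
# by back-translation: back-translate the suspect text into the
# candidate source's language and measure the resemblance.

from difflib import SequenceMatcher

def machine_translate(text, source_lang, target_lang):
    """Placeholder for a real MT call (e.g. an online MT API)."""
    raise NotImplementedError

def translingual_similarity(suspect_text, suspect_lang,
                            candidate_source, source_lang):
    # Back-translate the suspect text into the candidate's language,
    # then compare: a high score suggests the suspect text may be a
    # machine translation of the candidate source.
    back = machine_translate(suspect_text, suspect_lang, source_lang)
    return SequenceMatcher(None, back.lower(),
                           candidate_source.lower()).ratio()
```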
• Translation;
• Identification of a grammatically ambiguous pronoun referent;
• Common sense reasoning;
• Reading comprehension;
• Natural language inference.
Among the applications enabled by automatic text generation systems are:
• Cyber Journalism;
• Legal text/contract generation;
• Automatic essay writing.
As reported by Brown et al. (2020), despite the wide range of beneficial applications, the system still has several
limitations. It tends to perform well when completing a specific task if trained over such a task; conversely, it
tends to perform less efficiently when trained over a diverse range of materials to perform a diverse set of tasks.
Indeed, the quality of the text currently generated automatically by AI tools is obviously largely dependent on the
text genre. For example, smart document drafting is already used as a service provided by companies for instance
to automatically draft contracts (see https://www.legito.com/US/en; https://www.erlang-solutions.com/blog/
smart-contracts-how-to-deliver-automated-interoperability.html). However, the success of these services relies on the fact that this text genre is highly formulaic and controlled, and that often only minor details
change across different contracts. Automatic text generation can be less fruitful in other areas, or when producing
other text genres.
Automatic text generation systems, like GPT-3, do not have reasoning of their own; hence, they do what com-
puters have done in recent decades: augment the authors’ writing experience. Additionally, because they are trained
– at least for now – on text produced by humans, they tend to carry with them some major human flaws, including bias, unfairness and skewed representation (Brown et al. 2020).
Another challenge faced by automatic text generation systems lies in the human skills of recursion and productivity. Recursion is the property that allows humans to embed existing sentences and sentence excerpts within larger linguistic frames; productivity is the property that allows speakers and writers of a language to combine
a few linguistic elements and structures to generate and understand an infinite number of sentences. Therefore,
given a relatively small number of elements and rules, “humans can produce and understand a limitless number of
sentences” (Finegan 2008) – which systems, even those as sophisticated as GPT-3, still struggle to do.
However, regardless of whether GPT-3 (or a successor, GPT-n) can produce text independently, or whether it is
used to augment the writing process, the implications will be significant. Think for example about automatic essay
generation. One of the basic tenets of academic and scientific writing is integrity and honesty. Currently, ‘cheating’
students may resort to ‘essay mills’, private companies or freelancers who write essays for money. Although current
technology is somewhat limited, future developments could bring high quality essays into the purview of AI,
including accurate approximation of the student’s own writing style (assuming they have previously written enough
themselves to train the AI) (Feng et al. 2018). But by the same token, that same AI could detect whether an essay
had been created by AI. We may be in for something of a virtual arms race.
Although the system performs better than previous machine learning systems in some of these tasks, it is not yet
on a par with human-generated text. Yet.
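One plausible ingredient of such an arms race is perplexity scoring: text that a language model finds unusually predictable may be machine-generated. The sketch below uses the freely available GPT-2 model via the Hugging Face transformers library; it is an illustration of the idea, not a reliable detector, and the threshold shown is arbitrary.

```python
# A minimal sketch of perplexity-based screening for machine-generated
# text, using the freely available GPT-2 model. Low perplexity (highly
# predictable text) can hint at machine generation, though by itself
# it is far from conclusive.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # the model shifts labels internally to score next-word prediction
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

essay = "The causes of the First World War were complex and varied."
print(f"perplexity: {perplexity(essay):.1f}")
flagged = perplexity(essay) < 20  # illustrative threshold only
```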
participants to write free text answers. Since language models have moved far beyond mere domain vocabulary-matching and word-counting, a deeper understanding of the text, taking into account context and implicit information, can be expected. In the case of personality profiling, a model can be trained to capture not only the content of the text but also the way the information is conveyed. A real-life application of such a solution is implemented at Zortify (zortify.com), which claims to enable participants to express themselves freely in a non-competitive environment. A large amount of annotated, multilingual data is collected for training a deep learning model: 30 thousand combinations of personality-related open questions. It has been found that the length of the text influences the performance of the model: the analysis is more accurate when the answer is relatively long. In addition, performance suffers when the answers are not related to the questions, as one would expect.
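The following minimal sketch shows the general shape of text-based trait prediction using scikit-learn. The toy texts and labels are invented for illustration; production systems of the kind described are trained on far larger multilingual datasets with deep models.

```python
# A minimal, hypothetical sketch of text-based trait prediction with
# scikit-learn. The toy labels and texts are invented for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I love meeting new people and organising events with friends.",
    "I prefer quiet evenings alone with a good book.",
    "Parties energise me; I talk to everyone in the room.",
    "Crowds drain me, so I avoid large gatherings.",
]
labels = ["extravert", "introvert", "extravert", "introvert"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

print(clf.predict(["I spent the weekend hiking by myself."]))
```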
There are multiple and overlapping concerns about the use of automated personality profiling. Traditional models
work well under certain constraints while performing poorly when those constraints cannot be met. Imagine a scenario where questionnaires are applied in a competitive environment: the answers can be manipulated to create a ‘perfect’ profile that oversells the participant without revealing their real personality. Questions are also clearly based on certain cultural assumptions about behaviour, values, and other subjective traits. Profiling is at the very least a simplistic and reductive exercise in categorising humans; at worst it can be racially discriminatory or otherwise unappreciative of cultural differences (see Emre 2018). Involving AI could introduce some nuance into this, but could also add new problems. Concerns arise about data security, since a lot of personal data must be handed over for these tests to operate; and relatedly there is the issue of consent – what happens if you refuse to surrender your data and therefore fail the interview? Moreover, any profiling, however
intelligent, is based on a snapshot and will struggle to handle changes in individual humans’ lives, while also
potentially screening out those with mental health or other conditions that may present as undesirable personality
traits. Further, as machines become superficially better at performing a given task, they can be trusted more fully
and unquestioningly by those who employ them. This can further reduce the scope for appreciating nuance (Emre
2018). Progress in this area will clearly centre on issues of trust, consent, equality, diversity, and discrimination,
many of which – as discussed earlier – can be baked into a system by virtue of biases in the training data.
4 Language in the Human-Machine Era
4.1 Multilingual everything
We have outlined a near future of ubiquitous, fast or even real-time translation. This will bring new issues of user
engagement, trust, understanding, and critical awareness. There is a need for “machine translation literacy” to
provide people with the critical resources to become informed users. One relevant project on this is the Machine
Translation Literacy project (sites.google.com/view/machinetranslationliteracy, see e.g. Bowker 2020).
The technological developments of the last decades have given rise to a “feedback loop”: technology influencing
our production of language. It is not uncommon (especially for large companies) to write for machine translation.
This process, known as pre-editing, consists of drafting a text following a set of rules so that the produced text can be easily translated by an MT engine and requires minimal post-editing effort, if any.
This involves, e.g. mimicking the syntax of the target language, producing simple and short sentences, avoiding
idioms and cultural references, using repetitions, and so on. This is not a new procedure, since controlled language
has been used for decades, especially in technical writing, to encourage accuracy and technical/scientific rigour.
More recently, language technology companies have encouraged their content creators to use such controlled lan-
guage, and this can ultimately lead to a simplified use of language (see e.g. cloud.ibm.com/docs/GlobalizationPipe
line?topic=GlobalizationPipeline-globalizationpipeline_tips, unitedlanguagegroup.com/blog/writing-content-ma-
chine-translation-keep-it-simple-recycle-repeat, ajujaht.ee/en/estonians-take-machine-translation-to-the-next-lev-
el/). Meanwhile for the user, similar issues may arise; see for example Cohn et al. (2021) showing the way people
may adjust their speech patterns when talking to a virtual assistant. If we spend more time talking to technology,
this effect may grow. In due course, with enough exposure, it could lead to larger changes in human speech, at
least amongst those most frequently connected. All this is wide open space for research in the human-machine era.
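A pre-editing checker can be sketched very simply, as below. The sentence-length limit and idiom list are illustrative samples of the kinds of guidelines described above, not any company's actual style rules.

```python
# A minimal, hypothetical sketch of a pre-editing (controlled language)
# checker: flag sentences that are too long or contain idioms that
# machine translation tends to mangle.

import re

MAX_WORDS = 20
IDIOMS = ["kick the bucket", "piece of cake", "ballpark figure"]

def pre_edit_report(text):
    issues = []
    # naive sentence split on ., ! and ?
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    for s in sentences:
        if len(s.split()) > MAX_WORDS:
            issues.append(f"too long ({len(s.split())} words): {s[:40]}...")
        for idiom in IDIOMS:
            if idiom in s.lower():
                issues.append(f"idiom '{idiom}' may not survive MT: {s[:40]}...")
    return issues or ["text looks MT-friendly"]

print(pre_edit_report(
    "Getting sign-off was a piece of cake. "
    "The committee approved the budget."))
```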
fine-grained statistically “fair” models will contribute to feeding machine learning with more representative and inclusive training data, which will be reflected in output that is sensitive to bias and discrimination.
The following questions are likely to be at the centre of research in the coming years at least: How will diversity be
represented? Synthetic voices will be able to mimic a wide array of accents and dialects, as we have noted,
and so to what extent will we choose difference over sameness in talking to and through technology? Could all this
have a positive impact on society, by helping avoid linguistic discrimination on the basis of accent? Or, in contrast,
with AI tools that are programmed to detect frequent patterns, will there be even less diversity?
Studies in other realms (e.g. image recognition, face recognition) have shown severe problems, such as members of
minority groups or women being misrecognized due to the smaller amounts of training data representing these groups. This makes it all the more
pressing to accelerate efforts to develop machine-readable corpora for minority languages. The human-machine
era is coming. We will be interacting evermore through and to technology. Some languages and language varieties
will be excluded, at least to begin with. The research community must rise to this task of equalising access and
representation.
A number of initiatives have arisen in the past 10 years on ethical and legal issues in NLP (e.g. Hovy et al. 2017;
Axelrod et al. 2019). An example of an ethical application of NLP is the deployment of tools to identify biased
language in court decisions, thereby giving the courts an opportunity to rephrase those decisions (Pinto et al.
2020). Another is in the detection of fake news. Until now, most computational systems have approached fake
news as a problem of untruthful facts, and consequently have focused on fact-checking or pattern recognition of
fake news spreading. The problem, however, is that misinformation is not necessarily based on true or false facts,
but rather on the untruthful representation of such facts (Sousa-Silva 2019). Hence, more sophisticated systems
are required that are able to identify linguistic patterns to flag potential fake news. A team of researchers, building
upon previous work (Cruz et al. 2019) that focused on hyperpartisan news detection, is currently developing such
a system. This is another area to watch for progress in the coming years.
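To make the contrast with fact-checking concrete, here is a toy sketch (our own illustration, not the system referred to above; the four training texts and their labels are invented) of a classifier that attends to linguistic form – capitalisation, punctuation runs, sensationalist wording – rather than to the truth of any underlying facts:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy data: labels mark sensationalist framing, not factual accuracy.
texts = [
    "SHOCKING: You won't BELIEVE what they are hiding from you!!!",
    "Experts DESTROYED by this one simple trick the elites fear",
    "The ministry published its annual budget figures on Tuesday.",
    "Researchers reported a modest decline in unemployment last quarter.",
]
labels = [1, 1, 0, 0]  # 1 = suspicious framing, 0 = neutral register

# Character n-grams capture stylistic cues (capitalisation, punctuation)
# rather than the factual content of the story.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4), lowercase=False),
    LogisticRegression(),
)
model.fit(texts, labels)

print(model.predict(["UNBELIEVABLE!!! The truth they never tell you"]))
```

Real systems of this kind use far richer linguistic features and much larger corpora, but the shape of the approach – modelling how something is said, not whether it is true – is as above.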
Moreover, BBIs present the potential for transformations that are orders of magnitude beyond the comparatively trivial
tweaks represented by the AR and VR gadgets reviewed in this report. The potential for communicating directly
between our brains could ultimately enable communication without language, or with some mix of language and
inter-brain fusion that we cannot begin to imagine yet. But with great power comes great responsibility. There
will come great risks, not only of surgery and physical brain damage, but also risks to autonomy, privacy, agency,
accountability and ultimately our very identity (Hildt 2019).
colonising discourses in speech and language technology. For example, Bird (2020) invites us towards a postcolo-
nial approach to computational methods for supporting language vitality, by suggesting new ways of working with
indigenous communities. This includes the ability to handle regional or non-standard languages/language varieties
as (predominantly) oral, emergent, untranslatable and tightly attached to a place, rather than as a communication
tool whose discourse may be readily available for processing (ibid.).
This approach to technology has attracted the interest of linguists and non-linguists alike. An example of the latter
is Rudo Kemper, a human geographer with a background in archives and international administration, and a
lifelong technology tinkerer. His work has revolved around co-creating and using technology to support
marginalized communities in defending their right to self-determination and representation, thereby contributing
towards decolonizing and emancipatory ends. This is the case, in particular, of his work on the programs team at
Digital Democracy, where he leads the creation of the Earth Defenders Toolkit (earthdefenderstoolkit.com). The
aim of this project is to provide communities with documents, tools and materials that foster community-based
independence. To that end, the project supports communities’ capacity-building, local autonomy, and ownership
of data.
another being; and whether – and at what stage – we will be speaking to technology in the way that we speak to
other humans. As the lines separating chatbots from humans become increasingly blurred, new affective robots
and chatbots bring a new dimension to interaction, and could become a means of influencing individuals. The
ethical issues underlying affective computing are myriad (see Devillers et al. 2020 for an extensive discussion).
5 Conclusion, looking ahead
At this particular moment in human history, it may feel like something of a dangerous distraction to spend so much
time thinking about gadgets for watching movies, playing games, and taking the effort out of simply talking to each
other. There are, it must be said, slightly more pressing issues afoot, as the following tweet wryly puts it:
Figure 9. https://twitter.com/etienneshrdlu/status/1388589083765710854
We are not here to cheerlead for new language technologies, or to suggest they are anywhere near as important as
the future of the planet, or other issues like access to clean water, healthcare, and so on. Rather, we begin from the
position that these technologies are coming, thanks to the huge and competing private investment fuelling rapid
progress; and that we can either understand and foresee their effects, or be taken by surprise and spend our time
trying to catch up.
Debates on Beneficial and Humane AI (humane-ai.eu) may be a source of inspiration for debate on new
and emerging language technologies. Dialogue will enrich and energise all sides - see https://futureoflife.org/2017/12/27/research-for-beneficial-artificial-intelligence/ and https://uva.nl/en/research/research-at-the-uva/artificial-intelligence/artificial-intelligence.
This report has sketched out some transformative new technologies that are likely to fundamentally
change our use of language. Widespread AR will soon augment our conversations in real time, while showing us
information about the world around us. We will be able to talk in noisy or dark environments easily, and across
multiple languages - as voices and faces are augmented to sound and look like they are producing other languages.
We will be able to immerse ourselves in virtual worlds, and interact with a limitless selection of lifelike virtual
characters equipped with new chatbot technology able to hold complex conversations, discuss plans, and even
teach us new things and help us practise other languages.
Some of these may feel unrealistically futuristic or far-fetched, but a central purpose of this report - and the
wider LITHME network - is to illustrate that these are mostly just the logical development and maturation of
technologies currently in prototype. Huge levels of corporate investment lie behind these new technologies; and
once they are released, huge levels of marketing will follow.
One of the founders of RASA (an open-source chatbot development platform), Alan Nichol, predicted in 2018
that chatbots are just the beginning of AI assistants in the enterprise sector: oreilly.com/radar/the-next-generation-of-ai-assistants-in-enterprise. His forecast of the future of automation in the enterprise sector is structured
into five levels: automated notifications, FAQ chatbots, contextual assistants, personalised assistants, and autonomous organisations of assistants. If his prediction holds, then as we write in 2021 we are in the age of contextual
and personalised assistants. Autonomous organisations of assistants - or multi-agent systems - are due to appear
in 5-7 years according to Nichol’s forecast, leaving the academic niche and reaching “normal” users. See for instance
the AAMAS conference for an overview of topics and problems: https://aamas2021.soton.ac.uk/.
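The gap between Nichol's second and third levels can be shown in a few lines of code. In the sketch below (our own illustration; the FAQ entries and the weekend follow-up rule are invented), the FAQ bot treats every message in isolation, while the contextual assistant carries one piece of dialogue state between turns:

```python
FAQ = {
    "opening hours": "We are open 9-17, Monday to Friday.",
    "refund": "Refunds are processed within 14 days.",
}

def faq_bot(message: str) -> str:
    """Level 2: keyword lookup, with no memory between turns."""
    for keyword, answer in FAQ.items():
        if keyword in message.lower():
            return answer
    return "Sorry, I did not understand that."

class ContextualAssistant:
    """Level 3: remembers the previous topic, so follow-ups can be resolved."""

    def __init__(self):
        self.last_topic = None

    def reply(self, message: str) -> str:
        for keyword, answer in FAQ.items():
            if keyword in message.lower():
                self.last_topic = keyword
                return answer
        if "weekend" in message.lower() and self.last_topic == "opening hours":
            return "We are closed on weekends."
        return "Sorry, I did not understand that."

bot = ContextualAssistant()
print(bot.reply("What are your opening hours?"))
print(bot.reply("What about weekends?"))  # resolved only via remembered context
print(faq_bot("What about weekends?"))    # the stateless bot cannot cope
```

The only difference is the stored last_topic; everything separating level 2 from level 3 grows out of keeping and using that kind of state.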
But will everyone benefit from all these shiny new gadgets? Throughout this report we have tried to emphasise a
range of groups who will be disadvantaged. This begins with the most obvious, that new technologies are always
out of reach for those with the lowest means; and the furious pace of consumer tech guarantees the latest versions
will always be out of reach. As these new technologies mature and spread, this may evolve into an issue of gov-
ernment concern and philanthropy. It may seem fanciful to imagine that futuristic AR eyepieces could become the
subject of benevolence or aid; but there is recent historical form here. Only two decades ago, broadband internet
was a somewhat exclusive luxury; but today it is the subject of large-scale government subsidy (e.g. in the UK:
https://gov.uk/guidance/building-digital-uk), as well as philanthropic investment in poorer countries (https://connectivity.fb.com/, https://www.gatesfoundation.org/our-work/programs/global-development/global-libraries);
and even a UN resolution (https://article19.org/data/files/Internet_Statement_Adopted.pdf). In due course,
the technologies we discuss in this report could go the same way. VR for all.
Then there are less universal but still pressing and persistent issues of inequality, which we have also highlighted.
Artificial intelligence feeds on data. It churns through huge datasets to arrive at what are essentially informed
guesses about what a human would write or say. More data equals better guesses. Little or no data? Not so useful.
The world’s bigger languages - English, Mandarin, Spanish, Russian, and so on - have significant datasets for AI to
chew over. For smaller languages, this is a taller order. Progress in ‘transfer learning’, as we reviewed in this report,
may help here.
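The point that “more data equals better guesses” holds even for the simplest statistical models of language. The sketch below (our own illustration, using an add-one-smoothed character bigram model rather than any modern neural architecture) scores the same test string higher when estimated from a larger corpus:

```python
import math
from collections import Counter

def bigram_model(corpus: str):
    """Return an add-one-smoothed character bigram probability function."""
    pairs = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)
    vocab = len(set(corpus))
    def prob(a: str, b: str) -> float:
        return (pairs[(a, b)] + 1) / (unigrams[a] + vocab)
    return prob

def avg_log_prob(prob, text: str) -> float:
    """Average log-probability per bigram; higher means better guesses."""
    return sum(math.log(prob(a, b)) for a, b in zip(text, text[1:])) / (len(text) - 1)

small_corpus = "the cat sat"
large_corpus = "the cat sat on the mat while the other cat sat by the door " * 50

test = "the cat sat by the mat"
for name, corpus in [("small", small_corpus), ("large", large_corpus)]:
    print(name, round(avg_log_prob(bigram_model(corpus), test), 3))
```

The model estimated from more text assigns the test string a markedly higher score; scaled up by many orders of magnitude, this is the data hunger that leaves smaller languages behind, and that transfer learning tries to mitigate by importing statistics from better-resourced relatives.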
The inequality faced by minority languages is essentially just a question of gathering enough data. The data in
question will be in exactly the same format as for bigger languages: audio recordings, automatically transcribed
and perhaps tidied up by humans for AI to digest. But for sign languages, there is a much bigger hill to climb. Sign
language is multimodal: it involves not only making shapes with the hands, but also relies heavily on facial expression,
gesture, gaze, and knowledge of the other signer’s own life. All of that represents a much greater technological
challenge than teaching a machine to hear us and speak like us. For the Deaf community, the human-machine era
is less promising.
Important issues of security and privacy will accompany new language technologies. AR glasses that see everything
you see, VR headsets that track your every movement; these devices will have unprecedented access to incredibly
personal data. Privacy policies for technologies are not an open buffet. You either accept all the terms, or no
gadget for you. This is playing out in the news at time of writing, with a controversial update to Facebook’s instant
messenger WhatsApp, enabling Facebook to “share payment and transaction data in order to help them better
target ads” (Singh 2021). The two choices are to either accept this or stop using WhatsApp. Reportedly millions
are choosing the latter. But this is actually a fairly minor level of data collection compared to what lies ahead with
AR and VR devices, able to collect orders of magnitude more data. When those companies try to sneak clauses into their
privacy policies to monetise that data, we may look back at today’s WhatsApp controversy as amusingly
trivial.
A further caution to end with is to re-emphasise the current limitations of AI. It is very popular to compare the
astonishing abilities of AI to the human brain; but as we noted earlier, this is over-simplistic (see e.g. Epstein 2016;
Cobb 2020; Marincat 2020). And as the technology progresses further, we may look back at this comparison with
a wry smile. There is indeed a history of comparing the brain to the latest marvellous technology of the day, which
has inspired some gentle mockery here and there.
The strong reliance of AI on form, rather than meaning, limits its understanding of natural language (Bender &
Koller 2020). In making meaning, we humans connect our conversations to previous conversations, common
knowledge, and other social context. AI systems have little access to any of that. To become really intelligent, to
reason in real-time scenarios on natural language statements and conversations, even to approximate emotions,
AI systems will need access to many more layers of semantics, discourse, and pragmatics (Pareja-Lora et al. 2020).
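A toy demonstration of this limit of form: in the sketch below (our own illustration, on an invented four-sentence corpus), distributional context vectors make the antonyms “hot” and “cold” look practically identical, because they occur in the same sentence frames – precisely the gap between form and meaning that Bender & Koller (2020) point to:

```python
import math
from collections import Counter, defaultdict

# Invented corpus: "hot" and "cold" occur in near-identical contexts.
corpus = (
    "the coffee is hot today . the coffee is cold today . "
    "my soup is hot now . my soup is cold now ."
).split()

# Count context words in a +/-2 window: pure form, no grounding in meaning.
vectors = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if j != i:
            vectors[word][corpus[j]] += 1

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v)

print(round(cosine(vectors["hot"], vectors["cold"]), 3))    # 1.0: "synonyms"?
print(round(cosine(vectors["hot"], vectors["coffee"]), 3))  # much lower
```

Nothing in these counts distinguishes the two words’ opposite meanings; only knowledge beyond the text – the extra layers of semantics, discourse, and pragmatics just mentioned – could.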
Looking ahead, we see many intriguing opportunities and new capabilities, but a range of other uncertainties and
inequalities. New devices will enable new ways to talk, to translate, to remember, and to learn. But advances in
technology will reproduce existing inequalities among those who cannot afford these devices, among the world’s
smaller languages, and especially for sign language. Debates over privacy and security will flare and crackle with
every new immersive gadget. We will move together into this curious new world with a mix of excitement and
apprehension - reacting, debating, sharing and disagreeing as we always do.
Plug in, as the human-machine era dawns.
Acknowledgements
First and foremost we thank our funders, the COST Association and European Commission (cost.eu). Without
this funding, there would be no report. We have further benefited from excellent administrative support at our
host institution, the University of Jyväskylä, Finland, especially the allocated project manager Hanna Pöyliö. All
the authors express their gratitude to supportive friends, family, and pets. Appropriately for a report like this,
we should express satisfaction with Google Docs and Google Meet, which together enabled us to collaborate
smoothly in real time, and avoid the horror of email attachments and endless confusion over the latest version. We
recommend these two facilities for any such collaboration.
References
Abbasi, A., Chen, H., & Salem, A. (2008). Sentiment Analysis in Multiple Languages: Feature
Selection for Opinion Classification in Web Forums. ACM Trans. Inf. Syst., 26(3). https://doi.
org/10.1145/1361684.1361685
Ackerman, R., & Goldsmith, M. (2011). Metacognitive regulation of text learning: On screen versus on
paper. Journal of Experimental Psychology: Applied, 17(1), 18–32. https://doi.org/10.1037/a0022086
Ahmad, W., Zhang, Z., Ma, X., Hovy, E., Chang, K.-W., & Peng, N. (2019). On difficulties of cross-lingual
transfer with order differences: A case study on dependency parsing. Proceedings of the 2019 Conference
of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
Volume 1 (Long and Short Papers), 2440–2452.
Aharoni, R., Johnson, M., & Firat, O. (2019). Massively Multilingual Neural Machine Translation. Proceedings
of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, Volume 1 (Long and Short Papers), 3874–3884. https://doi.org/10.18653/v1/
N19-1388
Ai, H. (2017). Providing graduated corrective feedback in an intelligent computer-assisted language learning
environment. ReCALL, 29(3), 313–334.
Ai, R., Krause, S., Kasper, W., Xu, F., & Uszkoreit, H. (2015). Semi-automatic Generation of Multiple-
Choice Tests from Mentions of Semantic Relations. Proceedings of the 2nd Workshop on Natural Language
Processing Techniques for Educational Applications, 26–33. https://doi.org/10.18653/v1/W15-4405
Aji, A. F., Bogoychev, N., Heafield, K., & Sennrich, R. (2020). In Neural Machine Translation, What Does
Transfer Learning Transfer? In D. Jurafsky, J. Chai, N. Schluter, & J. Tetreault (Eds.), Proceedings of the
58th Annual Meeting of the Association for Computational Linguistics (pp. 7701–7710). https://www.aclweb.
org/anthology/2020.acl-main.688.pdf
Alonso, J. M., & Catala, A. (Eds.). (2021). Interactive Natural Language Technology for Explainable Artificial
Intelligence. In TAILOR Workshop 2020 Post-proceedings. LNCS, Springer.
Alrumayh, A. S., Lehman, S. M., & Tan, C. C. (2021). Emerging mobile apps: Challenges and open problems.
CCF Transactions on Pervasive Computing and Interaction, 3(1), 57–75.
Amaral, L. (2011). Revisiting current paradigms in computer assisted language learning research and develop-
ment. Ilha Do Desterro A Journal of English Language, Literatures in English and Cultural Studies, 0. https://
doi.org/10.5007/2175-8026.2011n60p365
Amaral, L. A., Meurers, D., & Ziai, R. (2011). Analyzing learner language: Towards a flexible natural lan-
guage processing architecture for intelligent language tutors. Computer Assisted Language Learning, 24,
1–16.
Amoia, M., Bretaudiere, T., Denis, R., Gardent, C., & Perez-beltrachini, L. (2012). A Serious Game for
Second Language Acquisition in a Virtual Environment. Journal on Systemics, Cybernetics and Informatics
JSCI, 24–34.
Andresen, B., & van den Brink, K. (2013). Multimedia in Education: Curriculum. Unesco Institute for
Information Technologies in Education.
Beukeboom, C. J., & Burgers, C. (2017). Linguistic bias. In H. Giles & J. Harwood (Eds.), Oxford Encyclopedia
of Intergroup Communication. Oxford University Press.
Bibauw, S., François, T., & Desmet, P. (2019). Discussing with a computer to practice a foreign language:
Research synthesis and conceptual framework of dialogue-based CALL. Computer Assisted Language
Learning, 32(8), 827–877. https://doi.org/10.1080/09588221.2018.1535508
Biber, D., & Conrad, S. (2009). Register, Genre, and Style. Cambridge University Press.
Bigand, F., Prigent, E., & Braffort, A. (2020). Person Identification Based On Sign Language
Motion: Insights From Human Perception and Computational Modeling. Proceedings of the
7th International Conference on Movement and Computing (MOCO ’20), Article 3, 1–7. https://doi.
org/10.1145/3401956.3404187
Bird, S. (2020). Decolonising Speech and Language Technology. Proceedings of the 28th International Conference
on Computational Linguistics, 3504–3519. https://doi.org/10.18653/v1/2020.coling-main.313
Blazquez, M. (2019). Using bigrams to detect written errors made by learners of Spanish as a foreign lan-
guage. CALL-EJ, 20, 55–69.
Blodgett, S. L., Barocas, S., Daumé, H., & Wallach, H. (2020). Language (technology) is power: A critical
survey of “bias” in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics,
454–476.
Bowker, L. (2020). Machine translation literacy instruction for international business students and business
English instructors. Journal of Business & Finance Librarianship, 25(1–2), 25–43. https://doi.org/10.1080/08963568.2020.1794739
Boyd, A. (2010). EAGLE: an Error-Annotated Corpus of Beginning Learner German. Proceedings of
LREC.
Bragg, D., Koller, O., Bellard, M., Berke, L., Boudreault, P., Braffort, A., Caselli, N., Huenerfauth, M.,
Kacorri, H., Verhoef, T., Vogler, C., & Morris, M. R. (2019). Sign Language Recognition, Generation,
and Translation: An Interdisciplinary Perspective. ASSETS 2019. https://www.microsoft.com/en-us/research/publication/sign-language-recognition-generation-and-translation-an-interdisciplinary-perspective/
Brandtzaeg, P. B., & Følstad, A. (2018). Chatbots: Changing user needs and motivations. Interactions, 25(5),
38–43.
Brandtzaeg, P. B., & Følstad, A. (2017). Why people use chatbots. 377–392.
Bregler, C., Covell, M., & Slaney, M. (1997). Video Rewrite: Driving Visual Speech with Audio. Proceedings
of the 24th Annual Conference on Computer Graphics and Interactive Techniques, 353–360. https://doi.
org/10.1145/258734.258880
Brown, C., Chauhan, J., Grammenos, A., Han, J., Hasthanasombat, A., Spathis, D., Xia, T., Cicuta, P., &
Mascolo, C. (2020). Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory
Sound Data. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery &
Data Mining, 3474–3484. https://doi.org/10.1145/3394486.3412865
Brown, P., & Levinson, S. C. (1987). Politeness: Some universals in language usage. Cambridge University Press.
Buckley, P., & Doyle, E. (2016). Gamification and student motivation. Interactive Learning Environments, 24(6),
1162–1175. https://doi.org/10.1080/10494820.2014.964263
Cahill, A. (2015). Parsing learner text: To shoehorn or not to shoehorn. The 9th Linguistic Annotation Workshop
Held in Conjunction with NAACL 2015, 144.
Camgöz, N. C., Hadfield, S., Koller, O., Ney, H., & Bowden, R. (2018). Neural sign language translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 7784–7793. https://openaccess.thecvf.com/content_cvpr_2018/papers/Camgoz_Neural_Sign_Language_CVPR_2018_paper.pdf
Capuano, N., Greco, L., Ritrovato, P., & Vento, M. (2020). Sentiment analysis for customer relationship
management: An incremental learning approach. Applied Intelligence. https://doi.org/10.1007/
s10489-020-01984-x
Carmo, J., & Jones, A. J. (2002). Deontic logic and contrary-to-duties. In D. Gabbay & F. Guenthner (Eds.),
Handbook of philosophical logic (pp. 265–343). Springer.
Carosia, A. E. de O., Coelho, G. P., & Silva, A. E. A. da. (2019). The Influence of Tweets and News on
the Brazilian Stock Market through Sentiment Analysis. Proceedings of the 25th Brazilian Symposium on
Multimedia and the Web, 385–392. https://doi.org/10.1145/3323503.3349564
Carosia, A. E. O., Coelho, G. P., & Silva, A. E. A. (2020). Analyzing the Brazilian Financial Market through
Portuguese Sentiment Analysis in Social Media. Applied Artificial Intelligence, 34(1), 1–19. https://doi.org/10.1080/08839514.2019.1673037
Carr, N. (2020). The shallows: What the Internet is doing to our brains. WW Norton & Company.
Carrier, L. M., Rosen, L. D., Cheever, N. A., & Lim, A. F. (2015). Causes, effects, and practicalities of every-
day multitasking. Developmental Review, 35, 64–78. https://doi.org/10.1016/j.dr.2014.12.005
Castilho, S., Moorkens, J., Gaspari, F., Calixto, I., Tinsley, J., & Way, A. (2017). Is Neural Machine Translation
the New State of the Art? The Prague Bulletin of Mathematical Linguistics, 108(1), 109–120. https://doi.
org/10.1515/pralin-2017-0013
Cavanaugh, J. M., Giapponi, C. C., & Golden, T. D. (2016). Digital technology and student cognitive
development: The neuroscience of the university classroom. Journal of Management Education, 40(4),
374–397.
Cervi-Wilson, T., & Brick, B. (2018). ImparApp: Italian language learning with MIT’s TaleBlazer mobile app.
In F. Rosell-Aguilar, T. Beaven, & M. Fuertes Gutiérrez (Eds.), Innovative language teaching and learning
at university: Integrating informal learning into formal language education (pp. 49–58). Research-publishing.net.
https://doi.org/10.14705/rpnet.2018.22.775
Cettolo, M., Girardi, C., & Federico, M. (2012). WIT3: Web inventory of transcribed and translated talks.
Proceedings of the 16th EAMT Conference, 28-30 May 2012, Trento, Italy, 8.
Chapelle, C. A., & Sauro, S. (Eds.). (2017). The Handbook of Technology and Second Language Teaching and Learning.
Wiley Blackwell.
Char, D. S., Shah, N. H., & Magnus, D. (2018). Implementing Machine Learning in Health Care—Addressing
Ethical Challenges. N Engl J Med., 378(11), 981–983.
Chauhan, P., Sharma, N., & Sikka, G. (2021). The emergence of social media data and sentiment analysis in
election prediction. Journal of Ambient Intelligence and Humanized Computing, 12(2), 2601–2627. https://
doi.org/10.1007/s12652-020-02423-y
Chen, R., & Chan, K. K. (2019). Using Augmented Reality Flashcards to Learn Vocabulary in Early
Childhood Education. Journal of Educational Computing Research, 57, 073563311985402. https://doi.
org/10.1177/0735633119854028
Chen, Y.-L., & Hsu, C.-C. (2020). Self-regulated mobile game-based English learning in a virtual reality envi-
ronment. Computers & Education, 154, 103910. https://doi.org/10.1016/j.compedu.2020.103910
Christensen, L. B. (2009). RoboBraille – Braille Unlimited. The Educator, XXI(2), 32–37.
Chu, C., Dabre, R., & Kurohashi, S. (2017). An Empirical Comparison of Domain Adaptation Methods for
Neural Machine Translation. Proceedings of the 55th Annual Meeting of the Association for Computational
Linguistics (Volume 2: Short Papers), 385–391. https://doi.org/10.18653/v1/P17-2061
Cinelli, M., De Francisci Morales, G., Galeazzi, A., Quattrociocchi, W., & Starnini, M. (2021). The echo
chamber effect on social media. Proceedings of the National Academy of Sciences, 118(9). https://doi.
org/10.1073/pnas.2023301118
Cobb, M. (2020). The Idea of the Brain: The Past and Future of Neuroscience. Profile.
Cohn, M., Liang, K. H., Sarian, M., Zellou, G., & Yu, Z. (2021). Speech rate adjustments in conversations
with an Amazon Alexa socialbot. Frontiers in Communication, 6, 82.
Dubiel, M., Halvey, M., & Oplustil, P. (2020). Persuasive Synthetic Speech: Voice Perception and
User Behaviour. Proceedings of the 2nd Conference on Conversational User Interfaces, 1–9. https://doi.
org/10.1145/3405755.3406120
Dupont, M., & Zufferey, S. (2017). Methodological issues in the use of directional parallel corpora: A case
study of English and French concessive connectives. International Journal of Corpus Linguistics, 22.
https://doi.org/10.1075/ijcl.22.2.05dup
Elfeky, M. G., Moreno, P., & Soto, V. (2018). Multi-Dialectical Languages Effect on Speech Recognition:
Too Much Choice Can Hurt. Procedia Computer Science, 128, 1–8. https://doi.org/10.1016/j.
procs.2018.03.001
Emre, M. (2018). The personality brokers: The strange history of Myers-Briggs and the birth of personality testing.
Doubleday.
Epstein, R. (2016). The empty brain [Website]. Aeon. https://aeon.co/essays/
your-brain-does-not-process-information-and-it-is-not-a-computer
Erard, M. (2017). Why Sign-Language Gloves Don’t Help Deaf People [Website].
The Atlantic. https://www.theatlantic.com/technology/archive/2017/11/
why-sign-language-gloves-dont-help-deaf-people/545441/
Esuli, A., & Sebastiani, F. (2006). SENTIWORDNET: A Publicly Available Lexical Resource for Opinion
Mining. Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06).
http://www.lrec-conf.org/proceedings/lrec2006/pdf/384_pdf.pdf
Färber, M., Qurdina, A., & Ahmedi, L. (2019). Team Peter Brinkmann at SemEval-2019 Task 4: Detecting
Biased News Articles Using Convolutional Neural Networks. SemEval@NAACL-HLT 2019,
1032–1036.
Fairclough, N. (1989). Language and power. Longman.
Felice, M. (2016). Artificial error generation for translation-based grammatical error correction.
Feng, X., Liu, M., Liu, J., Qin, B., Sun, Y., & Liu, T. (2018). Topic-to-Essay Generation with Neural
Networks. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18),
7.
Filhol, M., Hadjadj, M. N., & Testu, B. (2016). A rule triggering system for automatic text-to-sign translation.
Univ Access Inf Soc, 15, 487–498. https://doi.org/10.1007/s10209-015-0413-4
Finegan, E. (2008). Language: Its Structure and Use (6th, Inter ed.). Wadsworth.
Følstad, A., & Brandtzaeg, P. B. (2020). Users’ experiences with chatbots: Findings from a questionnaire
study. Quality and User Experience, 5(1), 3. https://doi.org/10.1007/s41233-020-00033-2
Fortune Business Insights. (2021). Chatbot Market Size and Regional Forecast, 2020-2027. https://www.fortunebusinessinsights.com/chatbot-market-104673
Fryer, L., & Carpenter, R. (2006). Emerging technologies—Bots as language learning tools. Language Learning
& Technology, 3(10), 8–14.
Gafni, G., Thies, J., Zollhöfer, M., & Nießner, M. (2020). Dynamic Neural Radiance Fields for Monocular 4D Facial
Avatar Reconstruction.
Gálvez, R. H., Beňuš, Š., Gravano, A., & Trnka, M. (2017). Prosodic Facilitation and Interference While
Judging on the Veracity of Synthesized Statements. INTERSPEECH.
Gaucher, D., Friesen, J., & Kay, A.C. (2011). Evidence That Gendered Wording in Job Advertisements Exists
and Sustains Gender Inequality. Journal of Personality and Social Psychology. https://doi.org/10.1037/
a0022530
Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). The gesturer is the speaker. 2013 IEEE International
Conference on Acoustics, Speech and Signal Processing, 3751–3755. https://doi.org/10.1109/
ICASSP.2013.6638359
Godwin-Jones, R. (2016). Augmented Reality and Language Learning: From annotated vocabulary to place-
based mobile games. Language Learning & Technology, 20(3), 9–19.
Governatori, G. (2015). The Regorous Approach to Process Compliance. 2015 IEEE 19th International
Enterprise Distributed Object Computing Workshop, 33–40. https://doi.org/10.1109/EDOCW.2015.28
Governatori, G., & Shek, S. (2013). Regorous: A Business Process Compliance Checker. Proceedings
of the Fourteenth International Conference on Artificial Intelligence and Law, 245–246. https://doi.
org/10.1145/2514601.2514638
Greenfield, P. M. (2009). Technology and informal education: What is taught, what is learned. Science (New
York, N.Y.), 323(5910), 69–71. https://doi.org/10.1126/science.1167190
Grieve, J., Chiang, E., Clarke, I., Gideon, H., Heini, A., Nini, A., & Waibel, E. (2018). Attributing the Bixby
Letter using n-gram tracing. Digital Scholarship in the Humanities, April 2018, 1–20.
Grosz, B., & Sidner, C. L. (1986). Attention, intentions, and the structure of discourse. Computational
Linguistics, 12(3), 175–204.
Guerra, A. M., Ferro, R., & Castañeda, M. A. (2018). Analysis on the gamification and implementation of
Leap Motion Controller in the I.E.D. Técnico industrial de Tocancipá. Interactive Technology and Smart
Education, 15(2), 155–164. https://doi.org/10.1108/ITSE-12-2017-0069
Guenther, L., Ruhrmann, G., Bischoff, J., Penzel, T., & Weber, A. (2020). Strategic Framing and Social Media
Engagement: Analyzing Memes Posted by the German Identitarian Movement on Facebook. Social
Media + Society, 6. https://doi.org/10.1177/2056305119898777
Hadjadj, M. N., Filhol, M., & Braffort, A. (2018). Modeling French Sign Language: A proposal for a semanti-
cally compositional system. Proceedings of the Language Resource and Evaluation Conference (LREC).
Hansen, T., & Petersen, A. C. (2012). ‘The Hunt for Harald’-Learning Language and Culture Through
Gaming. Proceedings of the 6th European Conference on Games Based Learning: ECGBL, 184.
Hassani, K., Nahvi, A., & Ahmadi, A. (2016). Design and implementation of an intelligent virtual environ-
ment for improving speaking and listening skills. Interactive Learning Environments, 24(1), 252–271.
https://doi.org/10.1080/10494820.2013.846265
Hayles, K. (2012). How We Think: Digital Media and Contemporary Technogenesis. University of Chicago Press.
He, M., Xiong, B., & Xia, K. (2021). Are You Looking at Me? Eye Gazing in Web Video Conferences.
CPEN 541 HIT’21, Vancouver, BC, Canada, 8.
Heift, T. (2002). Learner Control and Error Correction in ICALL: Browsers, Peekers, and Adamants.
CALICO Journal, 19.
Heift, T. (2003). Multiple Learner Errors and Meaningful Feedback: A Challenge for ICALL Systems.
CALICO Journal, 20(3), 533–548.
Herazo, J. (2020a). Reconocimiento de señas de la lengua de señas panameña mediante aprendizaje profundo [Masters
Thesis, Universidad Carlos III de Madrid]. https://github.com/joseherazo04/SLR-CNN/blob/
master/Jos%C3%A9%20Herazo%20TFM.pdf
Herazo, J. (2020b). Sign language recognition using deep learning. https://towardsdatascience.com/
sign-language-recognition-using-deep-learning-6549268c60bd
Hewitt, J., & Kriz, R. (2018). Sequence-to-sequence Models [Lecture]. CIS 530 Computational Linguistics, U. Penn.
https://nlp.stanford.edu/~johnhew/public/14-seq2seq.pdf
Hildt, E. (2019). Multi-Person Brain-To-Brain Interfaces: Ethical Issues. Frontiers in Neuroscience, 13, 1177.
https://doi.org/10.3389/fnins.2019.01177
Hoang, H., Dwojak, T., Krislauks, R., Torregrosa, D., & Heafield, K. (2018). Fast Neural Machine Translation
Implementation. 116–121. https://doi.org/10.18653/v1/W18-2714
Hodel, L., Formanowicz, M., Sczesny, S., Valdrová, J., & Stockhausen, L. von. (2017). Gender-Fair Language
in Job Advertisements: A Cross-Linguistic and Cross-Cultural Analysis. Journal of Cross-Cultural
Psychology, 48(3), 384–401. https://doi.org/10.1177/0022022116688085
Hoehn, S. (2019). Artificial Companion for Second Language Conversation: Chatbots Support Practice Using
Conversation Analysis. Springer International Publishing.
Hoehn, S., & Bongard-Blanchy, K. (2020). Heuristic Evaluation of COVID-19 Chatbots. Proceedings of
CONVERSATIONS 2020.
Holden, C., & Sykes, J. (2011). Mentira.
Holone, H. (2016). The filter bubble and its effect on online personal health information. Croatian Medical
Journal, 57, 298–301. https://doi.org/10.3325/cmj.2016.57.298
Hovy, D., Spruit, S., Mitchell, M., Bender, E. M., Strube, M., & Wallach, H. (Eds.). (2017). Proceedings of the
First ACL Workshop on Ethics in Natural Language Processing. Association for Computational Linguistics.
https://doi.org/10.18653/v1/W17-16
Hutson, M. (2021). Robo-writers: The rise and risks of language-generating AI. Nature, 591, 22–25.
Ijaz, K., Bogdanovych, A., & Trescak, T. (2017). Virtual worlds vs books and videos in history education.
Interactive Learning Environments, 25(7), 904–929. https://doi.org/10.1080/10494820.2016.1225099
inVentiv Health Communications. (2017). 2017 Digital Trends. https://www.gsw-w.com/2017Trends/
inV_GSW_2017_Digital_Trends.pdf
Jain, S., Thiagarajan, B., Shi, Z., Clabaugh, C., & Matarić, M. J. (2020). Modeling engagement in long-term,
in-home socially assistive robot interventions for children with autism spectrum disorders. Science
Robotics, 5(39). https://doi.org/10.1126/scirobotics.aaz3791
Jantunen, T., Rousi, R., Rainò, P., Turunen, M., Moeen Valipoor, M., & García, N. (2021). Is There Any
Hope for Developing Automated Translation Technology for Sign Languages? In M. Hämäläinen, N.
Partanen, & K. Alnajjar (Eds.), Multilingual Facilitation (pp. 61–73).
Jia, J. (2009). CSIEC: A computer assisted English learning chatbot based on textual knowledge and reason-
ing. Knowledge-Based Systems, 22(4), 249–255. https://doi.org/10.1016/j.knosys.2008.09.001
Jiang, L., Stocco, A., Losey, D. M., Abernethy, J. A., Prat, C. S., & Rao, R. P. N. (2019). BrainNet: A Multi-
Person Brain-to-Brain Interface for Direct Collaboration Between Brains. Scientific Reports, 9(1), 6115.
https://doi.org/10.1038/s41598-019-41895-7
Jones, D. (2020). Macsen: A Voice Assistant for Speakers of a Lesser Resourced Language. Proceedings of the
1st Joint Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU) and Collaboration
and Computing for Under-Resourced Languages (CCURL), 194–201. https://www.aclweb.org/anthology/2020.sltu-1.27
Juan, E. S. (2007). Language and Decolonization. In U.S. Imperialism and Revolution in the Philippines (pp.
67–87). Palgrave Macmillan US. https://doi.org/10.1057/9780230607033_4
Juola, P. (2015). The Rowling Case: A Proposed Standard Analytic Protocol for Authorship Questions.
Digital Scholarship in the Humanities, 30(suppl_1), i100–i113. https://doi.org/10.1093/llc/fqv040
Jurafsky, D. (2004). Pragmatics and computational linguistics. In L. R. Horn & G. Ward (Eds.), Handbook of
Pragmatics (pp. 578–604). Blackwell Publishing.
Kasilingam, D. L. (2020). Understanding the attitude and intention to use smartphone chatbots for shopping.
Technology in Society, 62, 101280.
Kasper, G., & Wagner, J. (2011). A conversation-analytic approach to second language acquisition. In D.
Atkinson (Ed.), Alternative approaches to second language acquisition (pp. 117–142). Taylor & Francis.
https://doi.org/10.4324/9780203830932
Kastrati, Z., Imran, A. S., & Kurti, A. (2020). Weakly Supervised Framework for Aspect-Based Sentiment
Analysis on Students’ Reviews of MOOCs. IEEE Access, 8, 106799–106810. https://doi.
org/10.1109/ACCESS.2020.3000739
Khayrallah, H., & Koehn, P. (2018). On the Impact of Various Types of Noise on Neural Machine
Translation. Proceedings of the 2nd Workshop on Neural Machine Translation and Generation,
74–83. https://doi.org/10.18653/v1/W18-2709
Khoshnevisan, B., & Le, N. (2018). Augmented reality in language education: A systematic literature review.
Proceedings of the Global Conference on Education and Research (GLOCER) Conference, 2, 57–71.
Klüwer, T. (2011). From Chatbots to Dialog Systems. In A. Sagae, W. L. Johnson, & A. Valente (Eds.),
Conversational Agents and Natural Language Interaction: Techniques and Effective Practices (pp. 1–22). IGI
Global. https://doi.org/10.4018/978-1-60960-617-6.ch016
Koehler, M., Mishra, P., Kereluik, K., Seob, S., & Graham, C. (2014). The Technological Pedagogical Content
Knowledge Framework. Handbook of Research on Educational Communications and Technology, 101–111.
https://doi.org/10.1007/978-1-4614-3185-5_9
Kolb, D. (1984). Experiential learning. Prentice-Hall.
Krummes, C., & Ensslin, A. (2014). What’s Hard in German? WHiG: a British learner corpus of German.
Corpora, 9(2), 191–205.
Laguarta, J., Hueto, F., & Subirana, B. (2020). COVID-19 Artificial Intelligence Diagnosis Using Only
Cough Recordings. IEEE Open Journal of Engineering in Medicine and Biology, 1, 275–281. https://doi.
org/10.1109/OJEMB.2020.3026928
Lee, A., Prasad, R., Webber, B., & Joshi, A. (2016). Annotating discourse relations with the PDTB annotator.
Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016): System
Demonstrations, 121–125.
Lee, J., Lee, J., & Lee, D. (2021). Cheerful encouragement or careful listening: The dynamics of robot eti-
quette at Children’s different developmental stages. Computers in Human Behavior, 118, 106697. https://
doi.org/10.1016/j.chb.2021.106697
Lefer, M.-A., & Grabar, N. (2015). Super-creative and over-bureaucratic: A cross-genre corpus-based study
on the use and translation of evaluative prefixation in TED talks and EU parliamentary debates.
Across Languages and Cultures, 16, 187–208. https://doi.org/10.1556/084.2015.16.2.3
Legault, J., Zhao, J., Chi, Y.-A., Chen, W., Klippel, A., & Li, P. (2019). Immersive Virtual Reality as an
Effective Tool for Second Language Vocabulary Learning. Languages, 4(1). https://doi.org/10.3390/
languages4010013
Leviathan, Y., & Matias, Y. (2018). Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone.
https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
Li, B., Sainath, T. N., Sim, K. C., Bacchiani, M., Weinstein, E., Nguyen, P., Chen, Z., Wu, Y., & Rao, K.
(2017). Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model.
Li, M.-C., & Tsai, C.-C. (2013). Game-Based Learning in Science Education: A Review of Relevant
Research. Journal of Science Education and Technology, 22(6), 877–898. https://doi.org/10.1007/
s10956-013-9436-x
Li, M., Hickman, L., Tay, L., Ungar, L., & Guntuku, S. C. (2020). Studying Politeness across Cultures Using
English Twitter and Mandarin Weibo. Proc. ACM Hum.-Comput. Interact., 4(CSCW2). https://doi.
org/10.1145/3415190
Libal, T. (2020). Towards Automated GDPR Compliance Checking. In F. Heintz, M. Milano, & B. O’Sullivan
(Eds.), Trustworthy AI - Integrating Learning, Optimization and Reasoning—First International Workshop,
TAILOR 2020, Virtual Event, September 4-5, 2020, Revised Selected Papers (Vol. 12641, pp. 3–19).
Springer. https://doi.org/10.1007/978-3-030-73959-1_1
Libal, T., & Steen, A. (2019). The NAI Suite-Drafting and Reasoning over Legal Texts. In M. Araszkiewicz
& V. Rodríguez-Doncel (Eds.), Proceedings of the 32nd International Conference on Legal Knowledge and
Information Systems (JURIX 2019) (pp. 243–246). http://dx.doi.org/10.3233/FAIA190333
Lilt Labs. (2017). 2017 Machine Translation Quality Evaluation. https://web.archive.org/web/20170724060649/
http://labs.lilt.com/2017/01/10/mt-quality-evaluation/
Lin, T.-J., & Lan, Y.-J. (2015). Language Learning in Virtual Reality Environments: Past, Present, and Future.
Journal of Educational Technology & Society, 18(4), 486–497.
Lo, S. L., Cambria, E., Chiong, R., & Cornforth, D. (2017). Multilingual sentiment analysis: From formal
to informal and scarce resource languages. Artificial Intelligence Review, 48(4), 499–527. https://doi.
org/10.1007/s10462-016-9508-4
Lum, K., & Isaac, W. (2016). To predict and serve? Significance, 13(5), 14–19. https://doi.
org/10.1111/j.1740-9713.2016.00960.x
Macedo, D. (Ed.). (2019). Decolonizing Foreign Language Education: The Misteaching of English and Other Colonial
Languages. Routledge.
Makinson, D., & van der Torre, L. (2003). What is input/output logic? Trends in Logic, 17, 163–174.
Makransky, G., Terkildsen, T., & Mayer, R. (2017). Adding Immersive Virtual Reality to a Science Lab
Simulation Causes More Presence But Less Learning. Learning and Instruction, 60. https://doi.
org/10.1016/j.learninstruc.2017.12.007
Marincat, N. (2020). Why the brain is not like a computer—And artificial intelligence is not
likely to surpass human intelligence any time soon. Medium.Com. https://medium.com/
is-consciousness/6d93d45df077
Markee, N. (2000). Conversation analysis. Routledge.
Matusov, E., Wilken, P., & Georgakopoulou, Y. (2019). Customizing Neural Machine Translation for
Subtitling. Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers), 82–93.
https://doi.org/10.18653/v1/W19-5209
McCorduck, P. (2004). Machines Who Think: A Personal Inquiry into the History and Prospects of Artificial
Intelligence. AK Peters Ltd.
McEnery, T. (1995). Computational Pragmatics: Probability, Deeming and Uncertain References [Unpublished PhD
thesis]. Lancaster University.
McLuhan, M., & Fiore, Q. (1967). The medium is the message.
McLuhan, M. (1962). The Gutenberg galaxy: The making of typographic man. University of Toronto
Press.
McTear, M. (2020). Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots. Synthesis
Lectures on Human Language Technologies, 13(3), 1–251.
Meunier, F. (2020). A case for constructive alignment in DDL: Rethinking outcomes, practices and assess-
ment in (data-driven) language learning. In P. Crosthwaite (Ed.), Data-Driven Learning for the Next
Generation. Corpora and DDL for Pre-tertiary Learners. Routledge.
Mishra, P., & Koehler, M. (2006). Technological Pedagogical Content Knowledge: A
Framework for Teacher Knowledge. Teachers College Record, 108, 1017–1054. https://doi.org/10.1111/j.1467-9620.2006.00684.x
Naderi, N., & Hirst, G. (2018). Using context to identify the language of face-saving. Proceedings of the 5th
Workshop on Argument Mining, 111–120. https://www.aclweb.org/anthology/W18-5214.pdf
Nagata, N. (2009). Robo-Sensei’s NLP-Based Error Detection and Feedback Generation Robo-Sensei’s
NLP-Based Error Detection and Feedback Generation. CALICO, 26.
Nangia, N., Vania, C., Rasika, B., & Bowman, S. R. (2020). CrowS-Pairs: A Challenge Dataset for Measuring
Social Biases in Masked Language Models. Proc. of EMNLP 2020.
Naseem, T., Barzilay, R., & Globerson, A. (2012). Selective sharing for multilingual dependency parsing.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers,
629–637.
Nass, C. (2004). Etiquette equality: Exhibitions and expectations of computer politeness. Communications of
the ACM, 47(4), 35–37.
Nass, C., & Yen, C. (2010). The man who lied to his laptop: What we can learn about ourselves from our machines.
Penguin.
Nerbonne, J. (2016). Data from Non-standard Varieties. In S. Dipper, F. Neubarth, & H. Zinsmeister (Eds.),
Proceedings of the 13th Conference on Natural Language Processing: (KONVENS 2016) (Vol. 16, pp. 1–12).
Bochumer Linguistische Arbeitsbereichte.
Nguyen, M., He, T., An, L., Alexander, D. C., Feng, J., & Yeo, B. T. T. (2020). Predicting Alzheimer’s
disease progression using deep recurrent neural networks. NeuroImage, 222, 117203. https://doi.
org/10.1016/j.neuroimage.2020.117203
Nguyen, T., & Chiang, D. (2017). Transfer Learning Across Low-Resource Related Languages For Neural
Machine Translation. IJCNLP8 Proceedings.
Noffs, G., Perera, T., Kolbe, S. C., Shanahan, C. J., Boonstra, F. M. C., Evans, A., Butzkueven, H., van der
Walt, A., & Vogel, A. P. (2018). What speech can tell us: A systematic review of dysarthria charac-
teristics in Multiple Sclerosis. Autoimmunity Reviews, 17(12), 1202–1209. https://doi.org/10.1016/j.
autrev.2018.06.010
Norman, E., & Furnes, B. (2016). The relationship between metacognitive experiences and learning: Is there
a difference between digital and non-digital study media? Computers in Human Behavior, 54, 301–309.
https://doi.org/10.1016/j.chb.2015.07.043
North, M. M., & North, S. M. (2018). The Sense of Presence Exploration in Virtual Reality Therapy. J-Jucs,
24(2), 72–84.
Novotná, T., & Libal, T. (2020). Towards Automating Inconsistency Checking of Legal Texts. Jusletter IT. Die
Zeitschrift Für IT Und Recht.
Ntoutsi, E., Fafalios, P., Gadiraju, U., Iosifidis, V., Nejdl, W., Vidal, M.-E., Ruggieri, S., Turini, F.,
Papadopoulos, S., Krasanakis, E., Kompatsiaris, I., Kinder-Kurlanda, K., Wagner, C., Karimi, F.,
Fernandez, M., Alani, H., Berendt, B., Kruegel, T., Heinze, C., … Staab, S. (2020). Bias in data-driven
artificial intelligence systems—An introductory survey. WIREs Data Mining and Knowledge Discovery,
10(3), e1356. https://doi.org/10.1002/widm.1356
Ong, W. J. (1982). Orality and Literacy. The Technologizing of the Word. Routledge.
Pareja-Lora, A. (2012). OntoLingAnnot’s Ontologies: Facilitating Interoperable Linguistic Annotations (Up
to the Pragmatic Level). In C. Chiarcos, S. Nordhoff, & S. Hellmann (Eds.), Linked Data in Linguistics:
Representing and Connecting Language Data and Language Metadata (pp. 117–127). Springer Berlin
Heidelberg. https://doi.org/10.1007/978-3-642-28249-2_12
Pareja-Lora, A. (2014). The Pragmatic Level of OntoLingAnnot’s Ontologies and Their Use in Pragmatic
Annotation for Language Teaching. In E. Bárcena, T. Read, & J. Arús (Eds.), Languages for
Specific Purposes in the Digital Era (pp. 323–344). Springer International Publishing. https://doi.
org/10.1007/978-3-319-02222-2_15
Pareja-Lora, A., & Aguado de Cea, G. (2010). Modelling Discourse-Related Terminology in
OntoLingAnnot’s Ontologies. Proceedings of TKE 2010: Presenting Terminology and Knowledge Engineering
Resources Online: Models and Challenges, 549–575.
Pareja-Lora, A., Blume, M., Lust, B. C., & Chiarcos, C. (2020). Development of Linguistic Linked Open Data
Resources for Collaborative Data-Intensive Research in the Language Sciences. The MIT Press.
Pariser, E. (2011). The Filter Bubble: What the Internet Is Hiding from You. Penguin.
Parmaxi, A. (2020). Virtual reality in language learning: A systematic review and implications for research
and practice. Interactive Learning Environments, 0(0), 1–13. https://doi.org/10.1080/10494820.2020.176
5392
Parmaxi, A., & Demetriou, A. A. (2020). Augmented Reality in Language Learning: A State-of-the-Art
Review of 2014-2019. Journal of Computer Assisted Learning, 36(6), 861–875.
Patil, M., Chaudhari, N., Bhavsar, R., & Pawar, B. (2020). A review on sentiment analysis in
psychomedical diagnosis. Open Journal of Psychiatry & Allied Sciences, 11, 80. https://doi.
org/10.5958/2394-2061.2020.00025.7
Pearson, J., Hu, J., Branigan, H. P., Pickering, M. J., & Nass, C. I. (2006). Adaptive language behavior in HCI:
how expectations and beliefs about a system affect users’ word choice. Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems, 1177–1180.
Peixoto, B., Pinto, D., Krassmann, A., Melo, M., Cabral, L., & Bessa, M. (2019). Using Virtual Reality
Tools for Teaching Foreign Languages. In Á. Rocha, H. Adeli, L. P. Reis, & S. Costanzo (Eds.), New
Knowledge in Information Systems and Technologies (pp. 581–588). Springer International Publishing.
Pereira, A. L. D. (2003). Problemas actuais da gestão do direito de autor: Gestão individual e gestão colectiva
do direito de autor e dos direitos conexos na sociedade da informação. In Estudos em Homenagem ao
Professor Doutor Jorge Ribeiro de Faria – Faculdade de Direito da Universidade do Porto (pp. 17–37). Coimbra
Editora.
Perlin, K. (2016). Future Reality: How Emerging Technologies Will Change Language Itself. IEEE Computer
Graphics and Applications, 36(3), 84–89.
Petersen, K. A. (2010). Implicit corrective feedback in computer-guided interaction: Does mode matter? [PhD Thesis].
Georgetown University.
Peymanfard, J., Mohammadi, M. R., Zeinali, H., & Mozayani, N. (2021). Lip reading using external viseme
decoding. CoRR, abs/2104.04784. https://arxiv.org/abs/2104.04784
Pierce, J. R., & Carroll, J. (1966). Language and Machines: Computers in Translation and Linguistics.
Pinto, A. G., Cardoso, H. L., Duarte, I. M., Warrot, C. V., & Sousa-Silva, R. (2020). Biased Language
Detection in Court Decisions. In C. Analide, P. Novais, D. Camacho, & H. Yin (Eds.), IDEAL
(2) (Vol. 12490, pp. 402–410). Springer. http://dblp.uni-trier.de/db/conf/ideal/ideal2020-2.
html#PintoCDWS20
Plank, B. (2016). What to do about non-standard (or non-canonical) language in NLP. 8.
Plunkett, K. N. (2019). A simple and practical method for incorporating augmented reality into the class-
room and laboratory. J. Chem. Educ., 96(11), 2628–2631.
Poncelas, A., Lohar, P., Way, A., & Hadley, J. (2020). The Impact of Indirect Machine Translation on Sentiment
Classification.
Popel, M., Tomkova, M., Tomek, J., Kaiser, Ł., Uszkoreit, J., Bojar, O., & Žabokrtský, Z. (2020).
Transforming machine translation: A deep learning system reaches news translation quality
comparable to human professionals. Nature Communications, 11(1), 4381. https://doi.org/10.1038/
s41467-020-18073-9
Prakken, H., & Sergot, M. (1996). Contrary-to-duty obligations. Studia Logica, 57(1), 91-115.
PwC EU Services. (2019). Architecture for public service chatbots. European Commission.
Ramesh, B. P., Prasad, R., Miller, T., Harrington, B., & Yu, H. (2012). Automatic Discourse Connective
Detection in Biomedical Text. Journal of the American Medical Informatics Association (JAMIA), 19(5),
800–808.
Repetto, C. (2014). The use of virtual reality for language investigation and learning. Frontiers in Psychology, 5,
1280. https://doi.org/10.3389/fpsyg.2014.01280
Research and Markets. (2020). Intelligent Virtual Assistant Market Size, Share & Trends Analysis Report by Product
(Chatbot, Smart Speakers), by Technology, by Application (BFSI, Healthcare, Education), by Region, and Segment
Forecasts, 2020-2027. https://www.researchandmarkets.com/reports/3292589/
Reznicek, M., Lüdeling, A., & Hirschmann, H. (2013). Competing target hypotheses in the Falko corpus:
A flexible multi-layer corpus architecture. In A. Díaz-Negrillo, N. Ballier, & P. Thompson (Eds.),
Automatic Treatment and Analysis of Learner Corpus Data (pp. 101–123). John Benjamins.
Reznicek, M., Lüdeling, A., Krummes, C., Schwantuschke, F., Walter, M., Schmidt, K., Hirschmann, H., &
Andreas, T. (2012). Das Falko-Handbuch. Korpusaufbau und Annotationen (2.01). Humboldt Universität zu
Berlin.
Robin, J., Harrison, J. E., Kaufman, L. D., Rudzicz, F., Simpson, W., & Yancheva, M. (2020). Evaluation of
Speech-Based Digital Biomarkers: Review and Recommendations. Digit Biomark, 4(99–108). https://
doi.org/10.1159/000510820
Sagae, A., Johnson, W. L. & Valente, A. (Eds.) (2011). Conversational Agents and Natural Language Interaction:
Techniques and Effective Practices (pp. 1–22). IGI Global.
Saleiro, P., Rodolfa, K. T., & Ghani, R. (2020). Dealing with Bias and Fairness in Data Science Systems: A
Practical Hands-on Tutorial. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining, 3513–3514. https://doi.org/10.1145/3394486.3406708
Sandusky, S. (2015). Gamification in Education.
Saunders, J., Hunt, P., & Hollywood, J. S. (2016). Predictions put into practice: A quasi-experimental evalua-
tion of Chicago’s predictive policing pilot. Journal of Experimental Criminology, 12(3), 347–371. https://
doi.org/10.1007/s11292-016-9272-0
Scheufele, D. A. (1999). Framing as a theory of media effects. Journal of Communication, 49, 103–122.
Schneider, B. (2020). What is ‘correct’ language in digital society? From Gutenberg
to Alexa Galaxy [Blog post]. Digital Society Blog. https://www.hiig.de/en/
what-is-correct-language-in-digital-society-from-gutenberg-to-alexa-galaxy/
Schneider, B. (Forthcoming). Von Gutenberg zu Alexa – Posthumanistische Perspektiven auf Sprachideologie [From Gutenberg to Alexa – posthumanist perspectives on language ideology]. In M. Schmidt-Jüngst (Ed.), Mensch – Tier – Maschine [Human – animal – machine]. Transcript Verlag.
Schneider, S., Baevski, A., Collobert, R., & Auli, M. (2019). wav2vec: Unsupervised Pre-training for Speech Recognition. arXiv preprint arXiv:1904.05862.
Searle, J. R. (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press.
Ségouat, J. (2010). Modélisation de la coarticulation en Langue des Signes Française pour la diffusion automatique d'informations en gare ferroviaire à l'aide d'un signeur virtuel [Modelling coarticulation in French Sign Language for automatic broadcasting of information in railway stations using a virtual signer] [PhD Dissertation, Université Paris Sud]. https://tals.limsi.fr/docs/TheseJeremieSegouat.pdf
Shahid, A. H., & Singh, M. P. (2020). A deep learning approach for prediction of Parkinson’s dis-
ease progression. Biomedical Engineering Letters, 10(2), 227–239. https://doi.org/10.1007/
s13534-020-00156-7
Shalunts, G., Backfried, G., & Commeignes, N. (2016). The Impact of Machine Translation on Sentiment
Analysis.
Shawar, B. A., & Atwell, E. (2007). Chatbots: Are they really useful? LDV Forum, 22(1), 29–49.
Shokat, S., Riaz, R., Rizvi, S. S., Khan, K., Riaz, F., & Kwon, S. J. (2020). Analysis and Evaluation of
Braille to Text Conversion Methods. Mobile Information Systems, 2020, 3461651. https://doi.
org/10.1155/2020/3461651
Sinclair, A., McCurdy, K., Lucas, C. G., Lopez, A., & Gašević, D. (2019). Tutorbot Corpus: Evidence of
Human-Agent Verbal Alignment in Second Language Learner Dialogues. Proceedings of the 12th
International Conference on Educational Data Mining, 414–419.
Singer, L. M., & Alexander, P. A. (2017). Reading Across Mediums: Effects of Reading Digital and Print
Texts on Comprehension and Calibration. The Journal of Experimental Education, 85(1), 155–172.
https://doi.org/10.1080/00220973.2016.1143794
Singh, M. (2021). WhatsApp details what will happen to users who don’t agree to privacy changes. TechCrunch. https://
techcrunch.com/?p=2115500
Solak, E., & Erdem, G. (2015). A Content Analysis of Virtual Reality Studies in Foreign Language
Education. Participatory Educational Research, 2(5), 21–26.
Sorgini, F., Caliò, R., Carrozza, M. C., & Oddo, C. M. (2018). Haptic-assistive technologies for audition and
vision sensory disabilities. Disability and Rehabilitation: Assistive Technology, 13(4), 394–421. https://doi.
org/10.1080/17483107.2017.1385100
Soria, C. (2017). The digital language vitality scale: A model for assessing digital vitality of languages.
Sousa Silva, R., Laboreiro, G., Sarmento, L., Grant, T., Oliveira, E., & Maia, B. (2011). ‘twazn me! ;(’ Automatic Authorship Analysis of Micro-Blogging Messages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Vol. 6716 LNCS (pp. 161–168). https://doi.org/10.1007/978-3-642-22327-3_16
Sousa-Silva, R. (2013). Detecting plagiarism in the forensic linguistics turn [PhD Thesis]. Aston University.
Sousa-Silva, R. (2014). Detecting translingual plagiarism and the backlash against translation plagiarists.
Language and Law / Linguagem e Direito, 1(1), 70–94.
Sousa-Silva, R. (2018). Computational Forensic Linguistics: An Overview of Computational Applications in
Forensic Contexts. Language and Law / Linguagem e Direito, 5(2), 118–143.
Sousa-Silva, R. (2019). When news become a forensic issue (revisited): Fake News, Mis- and Disinformation, and other ethi-
cal breaches [Conference presentation]. IV International Symposium on Communication Management
- XESCOM, Porto.
Sousa-Silva, R. (2021). Plagiarism: Evidence based plagiarism detection in forensic contexts. In Malcolm
Coulthard, A. May, & R. Sousa-Silva (Eds.), The Routledge Handbook of Forensic Linguistics (2nd ed., pp.
364–381). Routledge.
Statista. (2020). Augmented reality (AR) market size worldwide in 2017, 2018 and 2025. https://www.statista.com/
statistics/897587/
Statista. (2021). Number of digital voice assistants in use worldwide from 2019 to 2024 (in billions). https://www.
statista.com/statistics/973815/
Stepin, I., Alonso, J. M., Catala, A., & Pereira-Fariña, M. (2021). A Survey of Contrastive and Counterfactual
Explanation Generation Methods for Explainable Artificial Intelligence. IEEE Access, 9,
11974–12001.
Stoll, S., Camgöz, N. C., Hadfield, S., & Bowden, R. (2018). Sign language production using neural machine
translation and generative adversarial networks. Proceedings of the 29th British Machine Vision Conference
(BMVC 2018). http://bmvc2018.org/contents/papers/0906.pdf
Sun, S., Guzmán, F., & Specia, L. (2020). Are we Estimating or Guesstimating Translation Quality? Proceedings
of the 58th Annual Meeting of the Association for Computational Linguistics, 6262–6267. https://doi.
org/10.18653/v1/2020.acl-main.558
Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., Mirza, D., Belding, E., Chang, K.-W., & Wang,
W. Y. (2019). Mitigating Gender Bias in Natural Language Processing: Literature Review. Proceedings
of the 57th Annual Meeting of the Association for Computational Linguistics, 1630–1640. https://doi.
org/10.18653/v1/P19-1159
Symonenko, S., Shmeltser, E., Zaitseva, N., Osadchyi, V., & Osadcha, K. (2020). Virtual reality in foreign
language training at higher educational institutions.
Tatman, R. (2017). Gender and Dialect Bias in YouTube’s Automatic Captions. Proceedings of the First ACL
Workshop on Ethics in Natural Language Processing, 53–59. https://doi.org/10.18653/v1/W17-1606
Taylor, J., & Richmond, K. (2020). Enhancing Sequence-to-Sequence Text-to-Speech with Morphology.
Submitted to IEEE ICASSP. http://homepages.inf.ed.ac.uk/s1649890/morph/Morphology_inter-
speech2020.pdf
Tetreault, J. R., & Chodorow, M. (2008). Native Judgments of Non-Native Usage: Experiments in
Preposition Error Detection. Proceedings of the Workshop on Human Judgements in Computational Linguistics,
24–32.
Thies, J., Zollhöfer, M., Stamminger, M., Theobalt, C., & Nießner, M. (2016). Face2Face: Real-Time Face Capture and Reenactment of RGB Videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Timpe-Laughlin, V., Evanini, K., Green, A., Blood, I., Dombi, J., & Ramanarayanan, V. (2017). Designing interactive, automated dialogues for L2 pragmatics learning. Proceedings of SemDial 2017 (Workshop on the Semantics and Pragmatics of Dialogue), 116–125. https://doi.org/10.21437/SemDial.2017-13
Tomalin, M., Byrne, B., Concannon, S., Saunders, D., & Ullmann, S. (2021). The practical ethics of bias
reduction in machine translation: Why domain adaptation is better than data debiasing. Ethics and
Information Technology. https://doi.org/10.1007/s10676-021-09583-1
Tseng, W.-T., Liou, H.-J., & Chu, H.-C. (2020). Vocabulary learning in virtual environments: Learner autono-
my and collaboration. System, 88, 102190. https://doi.org/10.1016/j.system.2019.102190
Vanmassenhove, E., Hardmeier, C., & Way, A. (2018). Getting Gender Right in Neural Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 3003–3008. https://
doi.org/10.18653/v1/D18-1334
Vázquez, C., Xia, L., Aikawa, T., & Maes, P. (2018). Words in Motion: Kinesthetic Language Learning in
Virtual Reality. 2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT),
272–276. https://doi.org/10.1109/ICALT.2018.00069
Vijayakumar, B., Höhn, S., & Schommer, C. (2018). Quizbot: Exploring formative feedback with conversa-
tional interfaces. International Conference on Technology Enhanced Assessment, 102–120.
Vilares, D., Alonso, M. A., & Gómez-Rodríguez, C. (2017). Supervised sentiment analysis in multilingual
environments. Information Processing & Management, 53(3), 595–607. https://doi.org/10.1016/j.
ipm.2017.01.004
Wang, Changhan, Tang, Y., Ma, X., Wu, A., Okhonko, D., & Pino, J. (2020a). Fairseq S2T: Fast Speech-to-
Text Modeling with fairseq. ArXiv E-Prints, arXiv:2010.05171.
Wang, Chien-pang, Lan, Y.-J., Tseng, W.-T., Lin, Y.-T. R., & Gupta, K. C.-L. (2020b). On the effects of
3D virtual worlds in language learning – a meta-analysis. Computer Assisted Language Learning, 33(8),
891–915. https://doi.org/10.1080/09588221.2019.1598444
Wang, R., Newton, S., & Lowe, R. (2015). Experiential Learning Styles in the Age of a Virtual Surrogate.
International Journal of Architectural Research: ArchNet-IJAR, 9, 93–110. https://doi.org/10.26687/
archnet-ijar.v9i3.715
Webber, B., Egg, M., & Kordoni, V. (2012). Discourse Structure and Language Technology. Natural Language
Engineering, 18(4), 437–490.
Wei, X., Yang, G., Wang, X., Zhang, K., & Li, Z. (2019). The Influence of Embodied Interactive Action
Games on Second Language Vocabulary Acquisition. 2019 International Joint Conference on Information,
Media and Engineering (IJCIME), 383–387. https://doi.org/10.1109/IJCIME49369.2019.00083
Weisser, M. (2014). Speech act annotation. In K. Aijmer & C. Rühlemann (Eds.), Corpus Pragmatics: A
Handbook (pp. 84–116). Cambridge University Press.
Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication
between man and machine. Communications of the ACM, 9(1), 36–45.
Wik, P., & Hjalmarsson, A. (2009). Embodied conversational agents in computer assisted language learning.
Speech Communication, 51(10), 1024–1037.
Wik, P., Hjalmarson, A., & Brusk, J. (2007). DEAL – A Serious Game For CALL Practicing Conversational Skills
In The Trade Domain. Workshop on Speech and Language Technology in Education. http://citeseerx.
ist.psu.edu/viewdoc/download?doi=10.1.1.384.8231&rep=rep1&type=pdf
Williams, K. (1991). Decolonizing the Word: Language, Culture, and Self in the Works of Ngũgĩ wa Thiong’o and Gabriel Okara. Research in African Literatures, 22(4), 53–61.
Willoughby, L., Iwasaki, S., Bartlett, M., & Manns, H. (2018). Tactile sign languages. In J.-O. Östman & J.
Verschueren (Eds.), Handbook of Pragmatics (pp. 239–258). John Benjamins Publishing Company.
https://doi.org/10.1075/hop.21.tac1
Wilske, S. (2014). Form and Meaning in Dialog-Based Computer-Assisted Language Learning [PhD Thesis]. Saarland University.
Woolls, D. (2021). Computational forensic linguistics: Computer-assisted document comparison. In M.
Coulthard, A. May, & R. Sousa-Silva (Eds.), The Routledge Handbook of Forensic Linguistics (2nd ed.).
Routledge.
Yadav, A., & Vishwakarma, D. K. (2020). Sentiment analysis using deep learning architectures: A review.
Artificial Intelligence Review, 53(6), 4335–4385. https://doi.org/10.1007/s10462-019-09794-5
Yang, L., Li, Y., Wang, J., & Sherratt, R. S. (2020). Sentiment analysis for E-commerce product reviews in
Chinese based on sentiment lexicon and deep learning. IEEE Access, 8, 23522–23530.
Yule, G. (1996). Pragmatics. Oxford University Press.
Zellou, G., & Cohn, M. (2020). Top-down effect of apparent humanness on vocal alignment toward human and device
interlocutors. 2020 Cognitive Science Society Meeting.
Zhang, T., Huang, H., Feng, C., & Wei, X. (2020). Similarity-aware neural machine translation: Reducing human translator efforts by leveraging high-potential sentences with translation memory. Neural Computing and Applications, 1–13.
Zhang, Z., Wu, S., Liu, S., Li, M., Zhou, M., & Xu, T. (2019). Regularizing neural machine translation by target-bidirectional agreement. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 443–450.
Zhou, Y. (2019). Preventive Strategies for Pedophilia and the Potential Role of Robots: Open
Workshop Discussion. In Y. Zhou & M. H. Fischer (Eds.), AI Love You: Developments in Human-
Robot Intimate Relationships (pp. 169–174). Springer International Publishing. https://doi.
org/10.1007/978-3-030-19734-6_9
Zhou, Y., & Fischer, M. H. (Eds.). (2019). AI Love You: Developments in Human-Robot Intimate Relationships.
Springer.
Zhou, Z., Chen, K., Li, X., Zhang, S., Wu, Y., Zhou, Y., Meng, K., Sun, C., He, Q., Fan, W., Fan, E., Lin, Z., Tan, X., Deng, W., Yang, J., & Chen, J. (2020). Sign-to-speech translation using machine-learning-assisted stretchable sensor arrays. Nature Electronics, 3, 571–578.