THEORIES IN SECOND
LANGUAGE ACQUISITION
This third edition of the best-selling Theories in Second Language Acquisition surveys the
major theories currently used in second language acquisition (SLA) research, serving as
an ideal introductory text for undergraduate and graduate students in SLA and language
teaching.
Designed to provide a consistent and coherent presentation for those seeking a basic
understanding of the theories that underlie contemporary SLA research, each chapter
focuses on a single theory. Chapters are written by leading scholars in the field and
incorporate a basic foundational description of the theory, relevant data or research
models used with this theory, common misunderstandings, and a sample study from the
field to show the theory in practice.
New to this edition is a chapter addressing the relationship between theories and L2
teaching, as well as refreshed coverage of all theories throughout the book. A key work
in the study of second language acquisition, this volume will be useful to students of
linguistics, language and language teaching, and to researchers as a guide to theoretical
work outside their respective domains.
Bill VanPatten was a professor of Spanish at Michigan State University, where he
was also an affiliate faculty member in the Department of Cognitive Science. He is currently an
independent scholar.
Gregory D. Keating is an associate professor of Linguistics at San Diego State
University. He is an associate editor of Studies in Second Language Acquisition.
Stefanie Wulff is an associate professor in the Linguistics Department at the University
of Florida, and between 2019 and 2023, Professor II at UiT The Arctic University of
Norway.
Second Language Acquisition Research
Susan M. Gass and Alison Mackey, Series Editors
The Second Language Acquisition Research series presents and explores issues
bearing directly on theory construction and/or research methods in the study
of second language acquisition. Its titles (both authored and edited volumes)
provide thorough and timely overviews of high-interest topics, and include
key discussions of existing research findings and their implications. A special
emphasis of the series is reflected in the volumes dealing with specific data
collection methods or instruments. Each of these volumes addresses the kinds
of research questions for which the method/instrument is best suited, offers
extended description of its use, and outlines the problems associated with its use.
The volumes in this series will be invaluable to students and scholars alike, and
perfect for use in courses on research methodology and in individual research.
Using Judgments in Second Language Acquisition Research
Patti Spinner and Susan M. Gass
Language Aptitude
Advancing Theory, Testing, Research and Practice
Edited by Zhisheng (Edward) Wen, Peter Skehan, Adriana Biedroń, Shaofeng Li and
Richard Sparks
Eye Tracking in Second Language Acquisition and Bilingualism
A Research Synthesis and Methodological Guide
Aline Godfroid
Theories in Second Language Acquisition
An Introduction, Third Edition
Edited by Bill VanPatten, Gregory D. Keating and Stefanie Wulff
For more information about this series, please visit:
www.routledge.com/Second-Language-Acquisition-Research-Series/book-series/LEASLARS
THEORIES IN
SECOND LANGUAGE
ACQUISITION
An Introduction
Third Edition
Edited by Bill VanPatten,
Gregory D. Keating and Stefanie Wulff
Third edition published 2020
by Routledge
52 Vanderbilt Avenue, New York, NY 10017
and by Routledge
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2020 Taylor & Francis
The right of Bill VanPatten, Gregory D. Keating and Stefanie Wulff to
be identified as the authors of the editorial material, and of the authors
for their individual chapters, has been asserted in accordance with
sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or
reproduced or utilised in any form or by any electronic, mechanical,
or other means, now known or hereafter invented, including
photocopying and recording, or in any information storage or retrieval
system, without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks
or registered trademarks, and are used only for identification and
explanation without intent to infringe.
First edition published by Lawrence Erlbaum Associates, Inc. 2007
Second edition published by Routledge 2015
Library of Congress Cataloging-in-Publication Data
Names: VanPatten, Bill, editor. | Keating, Gregory D., editor. |
Wulff, Stefanie, editor.
Title: Theories in second language acquisition: an introduction /
edited by Bill VanPatten, Gregory D. Keating and Stefanie Wulff.
Description: Third edition. | New York, NY: Routledge, 2020. |
Series: Second language acquisition research | Includes bibliographical
references and index.
Identifiers: LCCN 2019045044 (print) | LCCN 2019045045 (ebook) |
ISBN 9781138587373 (hardback) | ISBN 9781138587380 (paperback) |
ISBN 9780429503986 (ebook)
Subjects: LCSH: Second language acquisition.
Classification: LCC P118.2.T45 2020 (print) | LCC P118.2 (ebook) |
DDC 418.0071—dc23
LC record available at https://lccn.loc.gov/2019045044
LC ebook record available at https://lccn.loc.gov/2019045045
ISBN: 978-1-138-58737-3 (hbk)
ISBN: 978-1-138-58738-0 (pbk)
ISBN: 978-0-429-50398-6 (ebk)
Typeset in Bembo
by codeMantra
CONTENTS
Contributors
Preface
Acknowledgments
Glossary
Index
CONTRIBUTORS
Kathleen Bardovi-Harlig is a provost professor of second language studies
at Indiana University, where she teaches and conducts research on second
language acquisition in tense-aspect systems, L2 pragmatics, and conventional
expressions. Her work employing functional approaches to tense-aspect sys-
tems has appeared in Language Learning, Studies in Second Language Acquisition,
EuroSLA Yearbook, and edited volumes. She is the author of Tense and Aspect in
Second Language Acquisition: Form, Meaning, and Use (2000).
Robert DeKeyser is a professor of second language acquisition at the University
of Maryland. His interests include skill acquisition theory, the roles of implicit
and explicit learning, individual differences, aptitude-treatment interaction,
and study abroad. He is a former associate editor of Bilingualism: Language and
Cognition and former editor of Language Learning.
Nick C. Ellis is a professor of psychology and linguistics at the University of
Michigan. His interests include language acquisition, cognition, emergen-
tism, corpus linguistics, cognitive linguistics, and psycholinguistics. His SLA
research concerns explicit and implicit language learning and their interface;
usage-based acquisition and the probabilistic tuning of the system; vocabulary
and phraseology; and learned attention and language transfer. He serves as gen-
eral editor of Language Learning.
Susan M. Gass is university distinguished professor in the Second Language
Studies Program at Michigan State University. She has published widely in
the field of second language acquisition and is the winner of numerous local,
national, and international awards. She has served as president of the American
Association for Applied Linguistics and the International Association of
Applied Linguistics. She is the current editor of Studies in Second Language
Acquisition.
Gregory D. Keating is an associate professor of Linguistics at San Diego State
University. His research interests include second language acquisition, heritage
language bilingualism, sentence processing, and online methods. He is an asso-
ciate editor of Studies in Second Language Acquisition.
James P. Lantolf is Greer professor in language acquisition, emeritus, in the
Department of Applied Linguistics, Penn State University, and Changjiang
Professor (Yangtze River Professor) in the School of Foreign Studies at Xi’an
Jiaotong University. He was the president of AAAL, coeditor of Applied
Linguistics, and founding editor of Language and Sociocultural Theory (Equinox).
He coauthored Sociocultural Theory and the Genesis of Second Language Develop-
ment (Oxford University Press, 2006) and Sociocultural Theory and the Pedagogical
Imperative in L2 Education (Routledge, 2014).
Diane Larsen-Freeman is a professor emerita of education, professor emerita
of linguistics, and research scientist emerita and former director of the English
Language Institute at the University of Michigan. She is also a professor emerita
at the SIT Graduate Institute in Vermont and a visiting senior fellow at the
University of Pennsylvania.
Anke Lenzing is a senior lecturer of English linguistics and psycholinguistics
at Paderborn University. Her main research interests within SLA are early L2
acquisition and the L2 initial state, L2 transfer, and the interface between com-
prehension and production.
Alison Mackey is a professor of linguistics at Georgetown University and, in
summers, a professor of applied linguistics at Lancaster University. She investigates
how second languages are learned and taught. She has published numerous
journal articles and book chapters and 15 books. She is the current editor-in-chief
of the Annual Review of Applied Linguistics, an official journal of the American
Association for Applied Linguistics.
Manfred Pienemann is a professor of English Linguistics at Paderborn University.
He has held positions in linguistics and applied linguistics at the University
of Newcastle (UK), the Australian National University, the University
of Sydney, and the Universities of Hamburg and Passau (Germany). He is the
cofounder of PacSLRF and coeditor of the PALART series.
Matthew E. Poehner is a professor of education (world languages) and applied
linguistics at the Pennsylvania State University. His research focuses on
Vygotskian sociocultural theory and its relevance to L2 educational practices,
particularly through work in dynamic assessment, mediated development, and
concept-based language instruction. He is an associate editor of Language and
Sociocultural Theory and president of the International Association for Cognitive
Education and Psychology.
Steven L. Thorne is a professor of second language acquisition in the
Department of World Languages and Literatures at Portland State University,
with a secondary appointment in the Department of Applied Linguistics at
the University of Groningen (the Netherlands). His research utilizes cultural-
historical, usage-based, and critical approaches to language development, often
with a focus on human interactivity in technology-culture contexts.
Michael T. Ullman is a professor in the Department of Neuroscience, with sec-
ondary appointments in neurology and psychology, at Georgetown University.
He is a director of the Brain and Language Laboratory and the Georgetown
EEG/ERP Lab. He teaches undergraduate, masters, PhD, and medical students.
His research examines the neurocognition of first and second language, math,
reading, and memory; how these domains are affected in various disorders
(e.g., autism, dyslexia, developmental language disorder, aphasia, Alzheimer’s,
Parkinson’s, and Huntington’s diseases); and how they may be modulated by
factors such as sex, handedness, aging, and genetic variability.
Bill VanPatten was a professor of Spanish at Michigan State University, where
he was also an affiliate faculty member in Cognitive Science. He is currently an
independent scholar who also pursues fiction writing in both English and Spanish.
His primary areas of research within second language acquisition are the
acquisition of formal properties of language, language processing and parsing,
and the interface between processing and acquisition.
Lydia White is James McGill professor Emeritus at McGill University and a
Fellow of the Royal Society of Canada. She is a leading scholar in the field of
generative second language acquisition and has published extensively in this
area. She is a coeditor of the book series Language Acquisition and Language Dis-
orders, and she is on the editorial boards of several international journals.
Jessica Williams is an emeritus professor of the Linguistics Department at the
University of Illinois at Chicago, where she has taught in the MA TESOL
program. Her primary research is in second language writing instruction and
the effect of instruction on second language acquisition.
Stefanie Wulff is an associate professor in the Linguistics Department at the
University of Florida, and between 2019 and 2023, Professor II at UiT The
Arctic University of Norway. Her research interests are in second language
learning, quantitative corpus linguistics, and student writing development.
She is the editor-in-chief of Corpus Linguistics and Linguistic Theory (de Gruyter
Mouton).
PREFACE
This book focuses on a number of contemporary mainstream theories in sec-
ond language (L2) acquisition research that have generated attention among
scholars. Since the mid-1980s, the field of second language acquisition (SLA)1
has struggled with the nature of theories, what they are, and what would be
an “acceptable” theory of SLA. Indeed, the present volume draws on one par-
ticular publication by Michael Long in a special issue of the TESOL Quarterly
from 1990 devoted to the construction of a theory in SLA. In that article, Long
discussed the nature of what a theory needs to be in SLA and also summarized
the research to establish “the least” a theory of SLA needs to explain. In other
words, what are the observed phenomena that a theory ought to explain? We
borrow from Long’s article in our first chapter to outline the challenges to
contemporary theories and list 10 observations that need to be accounted for
on theoretical grounds.
One might ask why there are so many “competing” theories in SLA at this
point. Why isn’t there just one theory that accounts for L2 acquisition? What
is it about L2 acquisition that invites a diffusion of theoretical perspectives? To
understand this, one might consider the parable about the four blind men and
the elephant. These sightless men chance upon a pachyderm for the first time
and one, holding its tail, says, “Ah! The elephant is very much like a rope.” The
second one has wrapped his arms around a giant leg and says, “Ah! The ele-
phant is like a tree.” The third has been feeling alongside the elephant’s massive
body and says, “Ah! The elephant is very much like a wall.” The fourth, having
seized the trunk, cries out, “Ah! The elephant is very much like a snake.” For
us, SLA is a big elephant that researchers can easily look at from different per-
spectives. L2 acquisition is, after all, an incredibly complex set of processes, and
if you have been introduced to the field via any of the excellent overviews, this
most likely is your conclusion. Thus, researchers have grabbed onto different
parts of the elephant as a means of coming to grips with the complex phe-
nomenon. This does not mean, however, that researchers and scholars have
gone poking around L2 acquisition blindly and without thought; the present
chapters should convince you otherwise. Unlike the blind men of our fable,
researchers understand that to grasp the whole of L2 acquisition, they may need
to concentrate on the smaller parts first. In the end, we may even need multiple
complementary theories to account for different observed phenomena of SLA
(e.g., VanPatten, 2018). As you complete the readings in this book, you might
ask yourself, “Just what part of the elephant is each theory examining?”
The present book grew out of a perceived need for a comprehensive
yet readily accessible set of readings for the beginning student of SLA. Each
of us has taught introductory courses on SLA to students in TESOL, applied
linguistics, and related areas, and we have felt that a good introduction to theories
is beneficial. At the same time, we know that it is easy for authors who
don’t work in a particular theory to oversimplify it to the point that students
misinterpret it, or to misinterpret the theory themselves and pass this
misinterpretation on to students. To this end, we decided that a collection of chapters
written by the experts who work in the theories would best suit our needs as
well as those of our students. We are pleased to present this third edition for the
beginning student of SLA.
Since the publication of the first edition of this book, the field has continued
to develop, incorporating insights from theories and research methods from
other fields. Over the various editions of this book, we have added theories
to the original set in the first edition. However, it is important to be
clear that this book does not cover all theories of SLA or perspectives about what
the field of L2 research should include. The focus of the original book was on
linguistic, psycholinguistic, and cognitive perspectives in SLA, and subsequent
editions have maintained this focus. There are several fine books exploring
alternative and, in particular, more social perspectives on L2 acquisition. We
encourage the reader to seek these out.
Note
Reference
VanPatten, B. (2018). Theories of second language acquisition. In K. Geeslin (Ed.),
The Cambridge Handbook of Spanish Linguistics (pp. 649–667). Cambridge: CUP.
ACKNOWLEDGMENTS
Since its inception, this volume has been developed with the novice reader in
mind—the beginning student of SLA who may not have much background in
linguistics or SLA. Keeping that novice reader in mind has been a challenge for
us and no less for the various contributors whose theories you will read here.
The process of getting this volume into final form was long and demanded
considerable effort on the part of the contributors to present some very complex
notions in an accessible and consistent format. We know this often tried the
patience of our authors. We took them away from their research and teaching
duties to answer our numerous queries and revise their chapters, not once but,
for most of the authors, now three times for the third edition. That they stuck
with us to the end is a demonstration of their commitment and dedication to
the profession and to its newest members. They have our heartfelt thanks. As
the reader may notice, there are two new editors on the volume and a former
editor, Jessica Williams, has retired from academic work. We note that the
third edition could not have been possible without her contributions on the
first two volumes, and her spirit, generosity of time and effort, and her acumen
in presenting information serve as a model for all. We send her our best wishes.
Finally, we thank the folks at Routledge for bringing this volume into the
hands of the reader.
1
INTRODUCTION
The Nature of Theories
Bill VanPatten, Jessica Williams, Gregory D. Keating,
and Stefanie Wulff
Almost everyone has heard of Einstein’s Theory of Relativity. People have also
heard of things such as the Theory of Evolution, Atomic Theory, Quantum
Theory, Plate Tectonics, and The Big Bang Theory. What is common to
all these theories is that they are theories about what scientists call natural
phenomena: things that we observe every day or are somehow observable.
Theories are a fundamental staple in science, and all advances in science are,
in some way or another, advances in theory development. If you asked scientists,
they would tell you that the sciences could not proceed without theories. And
if you asked applied scientists (such as those who develop medicines or attempt
to solve the problem of how to travel from Earth to Mars), they would tell you
that a good deal of their work is derived from assumptions and laws within
theories.
Theories are also used in the social and behavioral sciences, such as psy-
chology, sociology, and economics. As in the natural sciences, social sciences
attempt to explain observed phenomena, such as why people remember some
things better than others under certain conditions or why the stock market
behaves the way it does.
In the field of second language acquisition (SLA) research, theories have also
come to occupy a central position. Some researchers, though by no means all,
would even say that the only way SLA can advance as a research field is if it
is theory driven. The purpose of the present book is to introduce the reader to
certain current or mainstream theories in second language (L2) acquisition and
provide a background for continued in-depth reading of the same. As a starting
point, we will need to examine the nature of theories in general.
What Is a Theory?
At its most fundamental level, a theory is a set of statements (“laws”) about
natural phenomena that explains why these phenomena occur the way they do.
In the sciences, theories are used in what Kuhn (1996) calls the job of “puzzle
solving.” By this, Kuhn means that scientists look at observable phenomena
as puzzles or questions to be solved. Why does the earth revolve around the
sun and not fly off into space? Why are humans bipedal but gorillas knuckle-
walkers? Why are some eyes blue and some brown, for example, but not red or
orange? These are all questions about things that confront us every day, and it
is the job of scientists to account for them.
In short, then, the first duty of a theory is to account for or explain observed
phenomena. But a theory ought to do more than that. A theory also ought
to make predictions about what would occur under specific conditions. Let’s
look at three examples: one familiar, the other two perhaps less so. In the
early part of the 19th century, scientists were already aware of the presence of
microorganisms in the air and water, and they had an idea about the connection
between the organisms and disease. However, they had no idea of how
these organisms came into existence; indeed, belief in their spontaneous
generation was widespread. Disease was thought to be caused by “bad air.”
Careful experimentation by Louis Pasteur and other scientists demonstrated
that microbes, though carried by air, are not created by air. Living organisms
come from other living organisms. These discoveries led to the development
of the germ theory of disease, which proposed that disease was caused by micro-
organisms. The acceptance of this theory had obvious important applications
in public health, such as the development of vaccines, hygienic practices in
surgery, and the pasteurization of milk. It not only could explain the presence
and spread of disease, but it could also predict, for example, that doctors who
delivered babies without washing their hands after performing autopsies on
patients who had died from childbirth fever would transmit the disease to
new patients. Even more important, the same theory could be used to connect
phenomena that, on the surface, appeared unrelated, such as the transmittal
of disease, fermentation processes in wine and beer production, and a decline
in silkworm production.
Now let’s take an example from psychology. It is an observed phenomenon
that some people read and comprehend written text faster and better than oth-
ers. As researchers began to explore this question, a theory of individual differ-
ences in working memory evolved. That theory says that people vary in their
ability to hold information in what is called working memory (defined, roughly,
as that mental processing space in which a person performs computations on
information at lightning speed). More specifically, the theory says that people
vary in their working memory capacity: Some have greater capacity for process-
ing incoming information compared with others, but for everyone, capacity
is limited in some way. Initially used to account for individual differences in
reading comprehension ability in a person’s first language (L1), the theory also
accounts for a wide range of seemingly unrelated phenomena, such as why
people remember certain sequences of numbers and not others, why they recall
certain words that have been heard, why people vary on what parts of sentences
or sequences they remember best, why certain stimuli are ignored and others
attended to, and why some students are good note takers and others are not.
A theory of working memory, then, allows psychologists to unify a variety of
behaviors and outcomes that on the surface level do not necessarily appear to be
related. There are even attempts to apply the theory to SLA in order to explain
why some people learn faster and better than others.
Let’s take a final example, this time from language. In one theory of syntax
(sentence structure), a grammar can allow movement of elements in the sen-
tence. This is how we get two sentences that essentially mean the same thing,
as in the following:
(2) What did Mary say?
In this particular theory, the what in (2) is said to have moved from its position
as an object of the verb said to occupy a place in a different part of the sentence.
At the same time, this theory also says that when something moves, it leaves a
hidden trace. Thus, the syntactician would write (2) like (3):
In (3), the t stands for the empty spot that the what left and the i simply shows
that the what and the t are “co-indexed”; that is, if there happens to be more
than one thing that moves, you can tell which trace it left behind.
To add to the picture, the theory also says that ts, although hidden, are psy-
chologically real and occupy the spot left behind. Thus, nothing can move into
that spot and no contractions can occur across it. Armed with this, the syntac-
tician can make a variety of predictions about grammatical and ungrammatical
sentences in English. We might predict, for example, that (4) is a good sentence
but (5) is bad and not allowed by English grammar:
The reason for this is that should has moved from its original spot and left a
t behind, as illustrated in (6):
At the same time, the syntactician would predict restrictions on the contrac-
tion of want to to wanna. Thus, (7) is fine because there is no trace intervening
where a contraction wants to happen:
(7) Who_i do you want to invite t_i to dinner? → Who do you wanna invite to
dinner?
All English speakers would agree, however, that (8) is awful:
(8) *Who do you wanna invite Susie to dinner?
You could probably work this out yourself, but the reason (8) sounds bad is that
the who has moved and has left behind a t that blocks a possible contraction.
Compare (7) and (8) redone here as (9) and (10):
(9) Who_i do you want to invite t_i to dinner? → Who do you wanna invite to
dinner?
(10) Who_i do you want t_i to invite Susie to dinner? → *Who do you wanna
invite Susie to dinner?
Be careful not to pronounce wanna like want tuh; want tuh is not a contraction
and is merely the schwawing of the vowel sound in to. Want tuh sounds OK in
sentence (8) precisely because it is not a contraction.
Thus, the theory unifies constraints on contractions with modals (should,
would, will, may, might), with auxiliaries (do, have), with copular verbs (be), with
the verb want, and with pronouns (I, you, he, and so on). It makes predictions
about good and bad sentences that perhaps we have never seen or heard, some
of which—like silkworms and beer—don’t seem to have much in common.
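The prediction about want to and wanna can be made concrete with a small sketch in Python. This is only an illustration under stated assumptions, not an implementation of the syntactic theory: the token notation (a trace written as "t_i", a moved word written with a trailing "_i") and the function names is_trace, wanna_allowed, and spell_out are all hypothetical. The sketch encodes just one idea from the discussion above, namely that contraction is possible only when want and to are adjacent with no trace between them.

```python
# Toy illustration only. The notation and all names here are assumptions:
# a trace is the token "t_i", and a moved word carries the index "_i".

def is_trace(token):
    """A token such as 't_i' stands for a hidden trace."""
    return token.startswith("t_")

def wanna_allowed(tokens):
    """True if 'want' is immediately followed by 'to' (no trace between them)."""
    return any(a == "want" and b == "to" for a, b in zip(tokens, tokens[1:]))

def spell_out(tokens, contract=False):
    """Pronounce the sentence: drop traces and indices; contract where allowed."""
    words = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if is_trace(tok):
            i += 1  # traces are silent
            continue
        next_is_to = i + 1 < len(tokens) and tokens[i + 1] == "to"
        if contract and tok == "want" and next_is_to:
            words.append("wanna")  # adjacent 'want to' contracts
            i += 2
            continue
        words.append(tok.split("_")[0])  # strip the index from moved words
        i += 1
    return " ".join(words) + "?"

# Sentences (9) and (10) from the text, with traces made explicit.
s9 = ["Who_i", "do", "you", "want", "to", "invite", "t_i", "to", "dinner"]
s10 = ["Who_i", "do", "you", "want", "t_i", "to", "invite", "Susie", "to", "dinner"]

print(wanna_allowed(s9), spell_out(s9, contract=True))
print(wanna_allowed(s10), spell_out(s10, contract=True))
```

In this sketch the trace in (10) sits between want and to, so the check fails and no contraction is produced, whereas in (9) the trace follows invite and the contraction goes through. A single rule about traces thus yields both judgments, which is the kind of prediction the text describes.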
To summarize so far, a theory ought to account for and explain observed
phenomena and also make predictions about what is possible and what is not.
In addition, most theories—good ones, that is—when accounting for and pre-
dicting things, also tend to unify a series of generalizations about the world
or unify a series of observations about the world. In the brief view we had of
syntactic theory, the few generalizations made about how syntax works unify a
variety of observations about contractions and not just contractions with should.
All contractions conform to the generalizations.
For SLA, then, we will want a theory that acts like a theory should. We will
want it to account for observable phenomena (something to which we turn our
attention later in this chapter). We want it to make predictions. And, ideally,
we want it to unify the generalizations we make as part of the theory. In other
words, we want a single theory to bring all of the observed phenomena under
one umbrella. Whether this is possible at this time has yet to be determined and
is something that this book will explore.
What Is a Model?
Many people confuse theories and models. A model describes processes or sets
of processes of a phenomenon. A model may also show how different compo-
nents of a phenomenon interact. The important word here is how. A model
does not need to explain why. Whereas a theory can make predictions based
on generalizations, this is not required of a model. In short, theories are always
explanatory and predictive whereas models need only be descriptive. The
problem is that in the real world—and in SLA as a research discipline—this
distinction is not always maintained. You will find as you read further in the
field that researchers often use model and theory interchangeably. Thus, although
in principle it would be a good idea to distinguish between these two terms
as is done in the natural sciences, in practice many of us in SLA do not do so.
What Is a Hypothesis?
Distinct from a theory, a hypothesis does not unify various phenomena; it
is usually an idea about a single phenomenon. Some people use theory and
hypothesis interchangeably, but in fact, they are distinct and should be kept sep-
arate. In science, we would say that a theory can generate hypotheses that can
then be tested by experimentation or observation. In psychology, for example,
there are theories regarding memory. You may recall the theory about work-
ing memory and capacity discussed earlier. The theory says (among many
other things) that working memory is limited in capacity. This means that
people can pay attention to only so much information at a given time before
working memory is overloaded. The theory also says that there are individual
differences in working memory and how people use what they have. Some
people have X amount of working memory capacity as they attend to incoming
information, whereas others have more or less. A hypothesis that falls out of
this, then, is that working memory differences among individuals should affect
reading comprehension: Those with greater working memory capacity should
be faster readers or should comprehend more. This is a testable hypothesis.
We ought to add here that the only valuable hypotheses for a theory are those
that are testable, meaning some kind of experiment can be run or some kind
of data can be examined to see if the hypothesis holds up. Another example
of a hypothesis comes from SLA: the Critical Period Hypothesis. There is a
theory in neurolinguistics claiming that at an early age, the brain begins to
specialize; specific brain functions become increasingly associated with spe-
cific areas of the brain. In addition, some functions may be developmentally
controlled; that is, they turn on and off at specific points in development. The
Critical Period Hypothesis is a hypothesis in L2 research based on this theory.
It states that the ability to attain native-like proficiency in a language is related
to the initial age of exposure. If language learning begins after a certain age
(and there is considerable controversy over what this age is as well as whether
there even is a critical period—see the various papers in Birdsong, 1999), the
learners will never reach a level of proficiency or competence comparable to a
native speaker’s. A corollary to this hypothesis is that language-learning ability
declines with age after this point. Again, both of these are testable hypotheses.
Recall that earlier we said we wanted a theory to make predictions. Predictions
are actually hypotheses. When we make a prediction based on a theory, we are
in effect making a hypothesis.
These definitions about theories, models, and hypotheses are important
because in everyday speech, we may use the term theory in a way not
intended in science. For example, one might hear in a disparaging tone that
something is “just a theory.” In science, the phrase “just a theory” makes
no sense, as all work is theoretically driven. What is more, the term theory
has often been politicized to denigrate particular theories (e.g., evolution,
climate change) so that “just a theory” becomes a way of dismissing something
that has scientific rigor but runs against some other set of beliefs.
Finally, in movies and other nonscientific situations, one often hears the
term theory used to mean “an idea” or a “hypothesis.” A detective trying
to solve a crime might say, “I have a theory about the killer,” when that
detective means, “I have an idea about the killer.” We cannot, of course, rid
everyday speech of how it uses certain words. Our point in bringing up the
everyday use of theory is to make sure that the reader understands the term
as it is used in this book. Theories in science are not just “ideas” and thus
theories in SLA should not be either.
Constructs
All theories have what are called constructs. Constructs are key features, con-
cepts, or mechanisms on which the theory relies; they must be definable in
the theory. In the theory about disease transmission, germ is a construct; in the
theory about working memory, capacity is a construct; and in the theory about
syntax, a trace is a construct.
In evaluating any theory, it is important to understand the constructs on
which the theory relies; otherwise, it is easy to judge a theory one way or
another—that is, as a good or bad theory—without a full understanding of the
underpinnings of the theory. For example, without an understanding of the
construct germ, it would have been easy to dismiss germ theory. But given that
the construct germ was easily definable and identifiable, dismissal of germ trans-
mission and diseases was not so facile. To fully understand something like rela-
tivity, one must have a thorough grasp of the constructs time, space, and others.
Going back to the theory of evolution for a moment, one way in which some
people dismiss the theory is by a misunderstanding of the constructs natural
selection and adaptation. Some people think this means that as an organism adapts,
the species it represents disappears to become the new species. While this may
sometimes happen, it is also true that adaptations create new species while the
original species continues to develop along another path. This is one reason
why humans are humans and spider monkeys are spider monkeys. Humans
did not replace monkeys or even evolve from monkeys. Both evolved from
something that pre-existed both species. In short, constructs are fundamental
to understanding a theory and what it both explains and predicts.
In SLA, we find an abundance of constructs that are in need of defini-
tions. For example, take the term second language acquisition itself. Each word
is actually a construct, and you can ask yourself, “What does second mean?”
“What does language mean?” and “How do we define acquisition?” In SLA
theorizing, most people use the term second to mean any language other than
one’s L1. It makes no difference what the language is, where it is learned, or
how it is learned. This suggests, then, that any theorizing about L2 acquisi-
tion ought to apply equally to the person learning Egyptian Arabic in Cairo
without the benefit of instruction as to the person learning French in a college
classroom in the United States. Defining second in an all-encompassing
way has an effect on the scope of the theory. If the construct second were
not defined this way, then it would have limited scope over the contexts of
language learning. For example, some people define second language to refer
to a language learned where it is spoken (e.g., immigrants learning English in
the United States, an American learning Japanese in Osaka), whereas foreign
is used to refer to situations in which the language is not spoken outside of
the classroom (e.g., German in San Diego, California). Thus, if second were
defined in the more restricted way, a theory in SLA would be limited to the
first context of learning.
The term language is deceptively simple as a construct, but have you ever
tried to define it? Does it mean speech? Or does it mean the rules that govern
speech production? Or does it mean the unconscious knowledge system that
contains all the information about language (e.g., the sound system, the mental
dictionary—or lexicon, as theorists like to call it—syntactic constraints, rules
on word formation, rules on use of language in context)? Or does it mean
something else? Or does it mean some combination of things? Thus, any theory
in SLA needs to be clear on what it means by language. Otherwise, the reader
may not fully grasp what the theory claims, or worse, misinterpret it.
In summary, here are key issues discussed so far:
• Theories ought to explain observable phenomena.
• Theories ought to unify explanations of various phenomena where possible.
• Theories are used to generate hypotheses that can be tested empirically.
• Theories may be explanations of a thing (such as language) or explanations
of how something comes to be (such as the acquisition of language).
• Theories have constructs, which in turn are defined in the theory.
Why Are Theories and Models Either Good or Necessary for SLA?
We have explored what theories are but only obliquely addressed why they
might be useful. Certainly, they help us to understand the phenomena that
we observe. Consider again the Critical Period Hypothesis. It has often been
observed that speakers who begin the process of L2 acquisition later in life
usually have an accent. A theory about the loss of brain plasticity during nat-
ural maturation may help explain this phenomenon. The same theory might
predict that learners who begin language study in high school will be less likely
to approach a native-like standard of pronunciation than those learners who
have access to significant amounts of target-language input much earlier in life.
These kinds of predictions have clear practical applications; for example, they
might suggest that language learning should begin at a young age.
Let’s look at another concrete example. In one theory in SLA, producing
language (usually called output) is considered an important element in struc-
turing linguistic knowledge and anchoring it in memory. In another theory,
in contrast, output is considered unimportant in developing second language
knowledge. Its role is limited to building control over knowledge that has
already been acquired. These differences in theory would have clear and
important consequences for second language instruction. In the first case, out-
put practice would have a significant role in all aspects of instruction. In the
second case, it would be most prominent in fluency practice if at all.
So far we have explored the utility of theories from a practical, real-world
perspective. Theories are also useful in guiding research, which may not always
have immediate practical purposes related to, say, instruction. If we step back
for a moment and consider the theories previously mentioned, we have looked
at the following:
• a theory that explains/predicts constraints on contraction in English
• a theory that explains/predicts foreign accents in adult learners
• theories that predict the role of output in the L2 acquisition process.
You may notice that they are not all the same. The first is a theory of what is
to be acquired, that is, the unconscious mental representation of constraints
on language. It is not enough to say, for example, that learners are acquiring
English, for this raises the question, “What is English? How is it different from
Spanish or Chinese?” Clearly, a dictionary of the English language is not the
language itself, and so memorizing a dictionary is not equivalent to acquiring
English. Nor would it be sufficient to study a big grammar book and commit
all its rules to memory. It is very unlikely that any grammar book includes
the constraints on wanna/’ve contractions that appeared earlier in this chap-
ter, for example. And what about the sound system and constraints on syllable
formation (e.g., no syllable in English can start with the cluster rw, but such a
syllable-initial cluster is possible in French)? In short, English, like any other
language, is complex and consists of many components. You may recall that
we touched on this issue when we noted that language itself is a construct that
a theory needs to define. Once the theory defines what it means by language, it
can better guide the questions needed to conduct research.
The second and third items on the preceding list are not really about the target
of acquisition; rather, they address the factors that affect learning outcomes
(e.g., the Critical Period position) or they address how learning takes place, in
other words, processes learners must undergo. These processes may be internal
to the learner (such as what might be happening in working memory as the
learner is attempting to comprehend language and how this impacts learning)
or they may be external to the learner (such as how learners and native speakers
engage in conversation and how this impacts learning). Theories regarding
factors or processes are clearly different from theories about the what of acquisi-
tion, but they, too, can guide researchers conducting empirical research.
Finally, research can return the favor to theorists by evaluating competing
theories. For example, one theory of learning, including one theory in language
learning, maintains that humans are sensitive to the frequency of events and
experiences and that this sensitivity shapes their learning. Within this theory,
linguistic elements are abstracted from exposure to language and from language
use. What look like rules in a learner’s grammar are really just the result of
repeated exposure to regularities in the input. A competing theory maintains
that language learning takes place largely by the interaction of innate knowledge
(i.e., human-specific and universal linguistic knowledge) and data gathered from
the input (i.e., language the learner is exposed to in communicative contexts).
Within this theory, frequency may have some role in making some aspects of
language more “robust,” but it is not a causal factor as it is in the first theory.
Each of these two theories can generate predictions, or hypotheses, about how
language acquisition will take place under specific conditions. These hypotheses
can then be tested against observations and the findings of empirical studies.
What Needs to Be Explained by Theories in SLA?
As we mentioned at the outset of this chapter, one of the roles of theories is
to explain observed phenomena. Examples we gave from the sciences were
the observation that the Earth revolves around the sun and doesn’t fly off
into space and that humans are bipedal while some of our closest relatives are
knuckle-walkers. Theories in science attempt to explain these observations,
that is, tell why they exist.
In the field of SLA research, a number of observations have been catalogued
(e.g., Long, 1990), and what follows is a condensed list of them. At the end of
the chapter are references for more detailed accounts of these observations.
Observation 1: Exposure to input is necessary for L2 acquisition. This observation
means that acquisition will not happen for learners of a second language unless
they are exposed to input. Input is defined as language the learner attempts to
comprehend during communicative events. For example, when a learner hears
“Open your books to page 24” in a second language, the learner is expected to
comprehend the message and open his or her book to page 24. Language the
learner does not respond to for its meaning (such as language used in a mechan-
ical drill or rote practice) is not input. Although everyone agrees that input is
necessary for L2 acquisition, not everyone agrees that it is sufficient.
Observation 2: A good deal of L2 acquisition happens incidentally. This captures
the observation that various aspects of language enter learners’ minds/brains
when they are focused on communicative interaction (including reading). In
other words, with incidental acquisition, the learner’s primary focus of attention
is on the message contained in the input, and linguistic features are “picked up”
in the process. Incidental acquisition can occur with any aspect of language
(e.g., vocabulary, syntax, morphology [inflections], phonology).
Observation 3: Learners come to know more than what they have been exposed to in
the input. Captured here is the idea that learners attain unconscious knowledge
about the L2 that could not come from the input alone. For example, learners
come to know what is ungrammatical in a language, such as the constraints
on wanna/I’ve contraction that we saw earlier in this chapter. These constraints
are not taught and are not evident in the samples of language learners hear.
Another kind of unconscious knowledge that learners attain involves ambigu-
ity. Learners come to know, for example, that the sentence John told Fred that he
was going to sing can mean that either John will sing or Fred will sing. The issue
of unconscious knowledge that is not directly derivable from input data is often
referred to as “The Poverty of the Stimulus” in second language circles but is
also observed in L1 acquisition and is sometimes called the “Logical Problem
of Language Acquisition.”
Observation 4: Learners’ output (speech) often follows predictable paths with pre-
dictable stages in the acquisition of a given structure. Learners’ speech shows evi-
dence of what are called “developmental sequences.” One example involves
the acquisition of negation in English. Learners from all language backgrounds
show evidence of the following stages:
Stage 1: no + phrase: No want that.
Stage 2: subject + no + phrase: He no want that.
Stage 3: don’t, can’t, not may alternate with no: He can’t/don’t/not want that.
Stage 4: Negation is attached to modal verbs: He can’t do that.
Stage 5: Negation is attached to auxiliaries: He doesn’t want that.
In addition to developmental sequences, there are such things as “acquisition
orders” for various inflections and small words. For example, in English, -ing
is mastered before regular past tense, which is mastered before irregular past
tense forms, which in turn are mastered before third person (present tense) -s.
In languages like Spanish, we find that learners acquire plural marking before
gender marking on adjectives, while typically using masculine singular as the
default or starting point.
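One way to make the idea of an acquisition order concrete is to treat it as an ordered list and check a learner's profile against it. The following is a minimal Python sketch, not a research instrument: the list simply restates the English order given above, while the function name is_consistent_with_order and the notion of a learner "profile" are illustrative assumptions of ours.

```python
# English order as stated in the text above, earliest-mastered first.
ACQUISITION_ORDER = ["-ing", "regular past", "irregular past", "third person -s"]


def is_consistent_with_order(mastered):
    """Return True if the mastered forms are an unbroken initial stretch of the order.

    `mastered` is a hypothetical learner profile: a list of the forms the
    learner is judged to have mastered, in any order.
    """
    unique = set(mastered)
    if len(unique) != len(mastered):
        return False  # duplicate entries make the profile ill-formed
    return unique == set(ACQUISITION_ORDER[:len(unique)])


print(is_consistent_with_order(["-ing", "regular past"]))  # follows the order
print(is_consistent_with_order(["third person -s"]))       # skips earlier forms
```

A profile that includes third person -s without the earlier forms is flagged as inconsistent, which is the pattern an acquisition order rules out.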
Related to the above is that learners may pass through “U-shaped” devel-
opment. In such a case, the learner starts out doing something correctly then
subsequently does it incorrectly and then “reacquires” the correct form. A clas-
sic example comes from the irregular past tense in which learners begin with
came, went (and similar forms), then may begin to produce camed, goed/wented,
and then later produce the correct went, came and other irregular forms. When
this pattern of accuracy is represented in a line graph, it forms a U shape, hence
the term “U-shaped” development.
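The shape described here can be stated precisely: accuracy starts high, drops to a low point somewhere in the middle, and then recovers. The short Python sketch below checks a series of accuracy scores for that pattern. It is only an illustration under our own assumptions; the function name is_u_shaped is hypothetical, and the scores are invented for the example rather than taken from any study.

```python
def is_u_shaped(accuracy):
    """Return True if accuracy dips to an interior low point and then recovers.

    `accuracy` is a list of proportion-correct scores ordered in time.
    """
    if len(accuracy) < 3:
        return False  # a U needs a start, a dip, and a recovery
    # Index of the (first) lowest score in the series.
    low = min(range(len(accuracy)), key=accuracy.__getitem__)
    is_interior = 0 < low < len(accuracy) - 1
    return is_interior and accuracy[low] < accuracy[0] and accuracy[low] < accuracy[-1]


# Invented scores for illustration only: correct early use (came, went),
# then overregularized forms (camed, goed), then correct use again.
hypothetical_scores = [0.9, 0.5, 0.95]
print(is_u_shaped(hypothetical_scores))
```

A series that only rises over time, such as steady improvement from low to high accuracy, would not count as U-shaped under this check.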
Observation 5: Second language learning is variable in its outcome. Here we mean
that not all learners achieve the same degree of unconscious knowledge about a
second language. They may also vary on speaking ability, comprehension, and
a variety of other aspects of language knowledge and use. This may happen
even under the same conditions of exposure. Learners under the same condi-
tions may be at different stages of developmental sequences or be further along
than others in acquisition orders. What is more, it is a given that most learners
do not achieve native-like ability in a second language. In fact, this could be
an observation all by itself: Most learners demonstrate non-nativeness in one or more
domains of language knowledge and language use.
Observation 6: Second language learning is variable across linguistic subsystems.
Language is made up of a number of components that interact in different ways.
For example, there is the sound system (including constraints on what sound
combinations are possible and impossible as well as constraints on pronuncia-
tion), the lexicon (the mental dictionary along with word-specific information
such as verb “X” cannot take a direct object or it requires a prepositional phrase
or it can only become a noun by addition of -tion and not -ment, for example),
syntax (what are possible and impossible sentences), pragmatics (knowledge of
what a speaker’s intent is, say, a request versus an actual question), and others.
Learners may vary in whether the syntax is more developed compared with the
sound system, for example.
Observation 7: There are limits on the effects of frequency on L2 acquisition. It has
long been held that frequency of occurrence of a linguistic feature in the input
correlates with whether it is acquired early or late, for example. However,
frequency is not an absolute predictor of when a feature is acquired. In some
cases, something very frequent takes longer to acquire than something less
frequent.
Observation 8: There are limits on the effect of a learner’s L1 on L2 acquisition.
Evidence of the effects of the L1 on L2 acquisition has been around since the
beginning of contemporary SLA (i.e., the early 1970s). It is clear, however, that
the L1 does not have massive effects on either processes or outcomes, as once
thought. Instead, it seems that the influence of the L1 is somehow attenuated
and also varies across individual learners.
Observation 9: There are limits on the effects of instruction on L2 acquisition.
Teachers and learners of languages often believe that what is taught and prac-
ticed is what gets learned. The research on instructed L2 acquisition says
otherwise. First, instruction sometimes has no effect on acquisition. As one
example, instruction has not been shown to cause learners to skip develop-
mental sequences or to alter acquisition orders (see Observation 4). Second,
some research has shown that instruction can be detrimental, slowing down
acquisition processes by causing stagnation at a given stage. On the other hand,
there is also evidence that in the end, instruction may affect how fast learners
progress through sequences and acquisition orders and possibly how far they get
in those sequences and orders. Thus, there appear to be beneficial effects from
instruction, but they are not direct and not what many people think. It is also
not the case that instruction is necessary, even though it is a ubiquitous aspect
of L2 acquisition around the world.
Observation 10: There are limits on the effects of output (learner production) on lan-
guage acquisition. Although it may seem like common sense that “practice makes
perfect,” this adage is not entirely true when it comes to L2 acquisition. There
is evidence that having learners produce language has an effect on acquisition,
and there is evidence that it does not. What seems to be at issue, then, is that
whatever role learner production (i.e., using language to speak or write) plays
in acquisition, there are constraints on that role, as there are on other factors,
as noted earlier.
Again, the role of a theory is to explain these ten (and other) phenomena.
It is not enough for a theory to say they exist or to predict them; it also has to
provide an underlying explanation for them. For example, natural orders and
stages exist. But why do they exist and why do they exist in the form they
do? Why do the stages of negation look the way they do? As another exam-
ple, why is instruction limited? What is it about language acquisition that puts
constraints on it? Why can’t stages of acquisition be skipped if instruction is
provided for a structure? And if instruction can speed up processes, why can it?
As you read through the various theories in this volume, you will see that
current theories in SLA may explain close to all, some, or only a few of the
phenomena. What is more, the theories will differ in their explanations as they
rely on different premises and different constructs.
The Explicit/Implicit Debate
Of concern and considerable controversy in SLA are the roles of explicit
and implicit learning and knowledge. These concepts are notoriously diffi-
cult to define, in part because they rest on constructs such as consciousness
and awareness, which themselves have been the subject of extended scholarly
debate.
Hulstijn (2005) defines the distinction in learning as follows:
Explicit learning is input processing with the conscious intention to find
out whether the input information contains regularities and, if so, to
work out the concepts and rules with which these regularities can be
captured. Implicit learning is input processing without such an intention,
taking place unconsciously.
(p. 131)
Hulstijn’s definition of explicit learning appears to include both awareness of
what is to be learned and the intention to learn it. Not all researchers agree.
DeKeyser (2003) counts only the former as a hallmark of explicit learning and
its absence as a defining feature of implicit learning, which he calls “learning
without awareness of what is being learned” (p. 314). Elsewhere, Hulstijn (2003)
also provides a more fine-grained distinction, noting that whereas explicit
learning involves awareness at the point of learning, intentional learning
additionally involves a “deliberate attempt to commit new information to
memory” (p. 360). Ellis (2009a) offers a definition of explicit learning that
includes intentionality, demands on attentional resources, and awareness of
what is being learned and a definition of implicit learning as learning that takes
place when all of these features are absent.
What is important to note about these definitions and others is the absence
of instruction; that is, they present explicit/implicit learning from the view-
point of what the learner thinks and does, not from the perspective of what the
environment is doing to the learner. Thus, the issue that confronts us here is
not the role of instruction (that is handled by Observation 9). Instead, the focus
is on what is going on in the mind/brain of the learner when that learner is
exposed to L2 input (with or without instruction). Thus, the reader is cautioned
not to confuse explicit/implicit learning with explicit/implicit teaching.
As we mentioned, the relative roles (or contributions) of explicit and implicit
learning are debated in SLA. Does L2 acquisition fully or largely involve explicit
learning? Does it fully or largely involve implicit learning? Or does L2 acquisition
somehow engage both explicit and implicit learning, and if so, how, under what
conditions, and for what aspects of language? On the one hand, some scholars
have questioned whether learning without awareness is even possible. On the
other hand, others have questioned whether explicit learning can ever provide
the basis for spontaneous and automatic retrieval of knowledge, while others go
so far as to reject any role for explicit learning in L2 acquisition at all.
Indeed, embedded within these questions about learning is the distinction
between explicit and implicit knowledge. Ellis (2009b) asserts both a behav-
ioral and neurobiological basis for this distinction. For the first, he offers “the
well-attested fact that speakers of a language may be able to use a linguistic
feature accurately and fluently without any awareness of what the feature con-
sists of and vice versa” (p. 335), and for the second, “whereas implicit knowledge
involves widely divergent and diffuse neural structures … explicit knowledge
is localized in more specific areas of the brain” (p. 335). Implicit and explicit
learning and knowledge are distinct concepts (Schmidt, 1994), yet Ellis (2009a)
connects them by referring to the resulting representations of the two types of
learning. Specifically, he claims that implicit learning leads to sub-symbolic
knowledge representations, whereas explicit learning results in symbolic repre-
sentations, allowing learners to verbalize what they have learned. Others speak
to the qualitative difference between explicit and implicit knowledge, albeit in
different ways from what we have just seen. Relying on linguistic theory for a
definition of language, these scholars do not see a connection between explicit
and implicit knowledge (e.g., Schwartz, 1993; VanPatten, 2016).
Regardless of how one defines the two types of knowledge, the major
question that has challenged researchers is the nature of any interface between
them. Although most scholars agree that implicit knowledge is the goal of
acquisition, how does implicit knowledge develop? Can explicit knowledge
become implicit? Does explicit knowledge somehow aid the acquisition of
implicit knowledge? Or are they completely separate systems, which, under
most conditions of L2 acquisition, do not interact?
Because the field has not yet arrived at a consensus on these questions, and
because there is conflicting evidence on the relative roles of explicit and implicit
learning, we cannot offer an observation like those that have preceded this
section. Therefore, we have asked the contributors to this volume to address
explicit and implicit learning and knowledge in a special section in each chap-
ter, asking them to discuss what each theory or framework would claim about
the two types of learning and the development of the two types of knowledge.
Early Theories in SLA
Prior to the late 1980s, little real theorizing happened in L2 research that con-
sidered what theories were supposed to do and how they worked. The two
most well-known theories were behaviorism and Monitor Theory.
Behaviorism was a psychological theory of learning that focused on behaviors,
as its name implies. For behaviorists, all actions are the result of some kind of
conditioned response. An entity was somehow “rewarded” or “punished” for
an action. Reward resulted in continued use of that action or behavior, while
punishment resulted in suppression. Thus, rewards reinforced certain behav-
iors. In terms of language acquisition, language was seen as a set of patterns or
behaviors. Reward for doing something right with language reinforced that
linguistic behavior. Lack of reward (or punishment) suppressed a linguistic be-
havior. This theory was seen to apply to both L1 and L2 acquisition. Without
going into detail here, the revolution in linguistics in the late 1950s begun under
Chomsky’s observations about the abstract and complex nature of language
coupled with the research in child L1 acquisition provided strong evidence that
behaviorism could not account for language acquisition in children. (For more
detail, we direct the reader to a summary of this history in Chapter 1 of
VanPatten, Smith, & Benati, forthcoming.)
As behaviorism lost ground and the research began pouring in on child L1
acquisition, L2 researchers began looking at similar questions to those posed by
L1 acquisition researchers. The 1970s saw an early proliferation of descriptive
studies that suggested the same thing as the L1 acquisition data: behaviorism
could not account for the development of second language learners that scholars
were uncovering. By the late 1970s, Stephen Krashen had begun to formulate
his ideas about L2 acquisition that were eventually solidified in the early 1980s
in what became known as Monitor Theory (e.g., Krashen, 1982). In essence,
Monitor Theory said the following:
• like children learning an L1, L2 learners constructed linguistic systems
based on communicatively embedded input they were exposed to;
• development occurred as learners were exposed to language that was
slightly beyond their current level;
• there was a difference between acquisition (acquiring language using
the same underlying processes that children used) and learning (explicit
and intentional focus on grammar and vocabulary to try and internalize
language); only acquisition resulted in an implicit system, while learning
resulted in an explicit system that could only be used as a monitor of output
under very reduced situations;
• acquisition was largely unaffected by instructional efforts as evidenced in
natural orders and staged development;
• learners possessed affective filters that, if high, could block acquisition
(i.e., keep input from “getting in”).
Krashen’s theory was well received among many language teachers
(and still is) but was quickly criticized by researchers for not explaining any facts,
for not making testable predictions, and for including vague constructs. For example,
what does “slightly beyond the current level” mean when it comes to useful
input for learners? How would researchers operationalize this kind of construct
to set up experiments? Another example was the affective filter. Just what was
this and again, how could it be operationalized to be included in an experiment?
Although the criticisms of Monitor Theory as a scientific theory might be valid,
this is not to say that Krashen’s observations about L2 acquisition were necessar-
ily wrong or that his theory did not make significant contributions to the field.
Again, his ideas fueled curricular development in some circles. It was in the sci-
entific realm where the theory ran into trouble. As stated earlier, for theories to
be good theories, they need to offer hypotheses that can be tested and constructs
that can be operationalized. Interestingly, it was Krashen’s efforts at launching
Monitor Theory that led to serious interest in what an L2 theory should do and
was the catalyst for much of the theorizing we see today (and most if not all of his
observations about language acquisition more generally are still valid).
About This Volume
In this volume, we have asked some of the foremost proponents of particular
theories and models to describe and discuss them in an accessible manner to
the beginning student of SLA theory and research. As they do so, the various
authors address particular topics and questions so that the reader may compare
and contrast theories more easily:
• The Theory and Its Constructs
• What Counts as Evidence for the Theory
• Common Misunderstandings
• An Exemplary Study
• How the Theory Addresses the Observable Phenomena of SLA
• The Explicit/Implicit Debate
Our own interests and areas of expertise have led us to the linguistic and cog-
nitive aspects of L2 acquisition. Thus, the theories and perspectives taken in
the present volume—for the most part—reflect such orientations. To be sure,
there are social perspectives that can be brought to bear on SLA (see Atkinson,
2011; Block, 2003). These perspectives are often offered as “alternatives” to
the linguistic and cognitive orientations that are said to dominate L2 research,
but in our view, they are simply looking at different phenomena (see, e.g., the
discussion in Rothman & VanPatten, 2013). For those who seek more detailed
accounts of socially oriented frameworks used in L2 research, we suggest
consulting something like Atkinson’s (2011) edited volume.
Discussion Questions
1. In what ways do theories affect our everyday lives? Try to list and discuss
examples from politics, education, and society.
2. Discuss a theory from the past that has been disproved. Also discuss a
theory from the past that has stood the test of time. Do you notice any
differences between these theories in terms of their structures? Is one
simpler than the other? Does one rely on nonnatural constructs for
explanation?
3. Theories are clearly useful in scientific ventures and may have practi-
cal applications. They have also become useful, if not necessary, in the
4. Reexamine the list of observable phenomena. Are you familiar with all of
them and the empirical research behind them? You may wish to consult
some basic texts on this topic listed in the “Suggested Further Reading”
section (e.g., Ellis, Gass, Long).
5. Is there an observable phenomenon in particular you would like to see
explained? Select one and, during the course of the readings, keep track of
how each theory accounts for this phenomenon.
Suggested Further Reading
Atkinson, D. (Ed.). (2011). Alternative approaches to second language acquisition. New York,
NY: Routledge.
This volume presents six approaches to SLA that complement or contrast with
cognitive approaches to the field. Two of the approaches are represented in this
volume.
Ellis, R. (2008). The study of second language acquisition (2nd ed.). Oxford, England:
Oxford University Press.
This volume is a comprehensive overview of the field and remains an
excellent resource on many of its topics.
Gass, S. (2013). Second language acquisition: An introductory course (4th ed.). New York,
NY: Routledge.
This is a basic introduction to the field in a form that is accessible to readers new
to the field. It includes authentic data-based problems at the end of each chapter that
help readers grapple with issues typical of SLA research.
Hulstijn, J. (2005). Theoretical and empirical issues in the study of implicit and explicit
second language learning. Studies in Second Language Acquisition, 27, 129–140.
This article is the introduction to a special issue on implicit and explicit learning
and knowledge in SLA. As such, it provides a good overview of the issues on this
topic.
Lightbown, P., & Spada, N. (2013). How languages are learned (4th ed.). Oxford, England:
Oxford University Press.
This volume is aimed at teachers and focuses on language acquisition in class-
room settings.
Long, M. H. (1990). The least a second language acquisition theory needs to explain.
TESOL Quarterly, 24, 649–666.
The observations listed in this chapter are based, in part, on this seminal article.
Rothman, J., & VanPatten, B. (2013). On multiplicity and mutual exclusivity: The
case for different theories. In M. P. García Mayo, M. J. Gutiérrez-Mangado, &
M. Martínez Adrián (Eds.), Contemporary approaches to second language acquisition
(pp. 243–256). Amsterdam, Netherlands: John Benjamins.
This chapter, while taking a generative perspective on language, argues that
different theories exist because of the complexity of acquisition, suggesting that
multiple theories may be necessary to understand acquisition in its entirety.
VanPatten, B., Smith, M., & Benati, A. (2020). Key questions in second language acquisition:
An introduction. Cambridge, England: Cambridge University Press.
This is an introductory volume for those with little background in SLA, linguis-
tics, psychology, or related areas. It focuses on the major questions that drive the
field of L2 research, bringing them together to address the fundamental question of
whether L1 and L2 acquisition are similar or different.
References
Atkinson, D. (Ed.). (2011). Alternative approaches to second language acquisition. New York,
NY: Routledge.
Birdsong, D. (Ed.). (1999). Second language acquisition and the critical period hypothesis.
Mahwah, NJ: Lawrence Erlbaum Associates.
Block, D. (2003). The social turn in second language acquisition. Edinburgh, Scotland:
Edinburgh University Press.
DeKeyser, R. (2003). Implicit and explicit learning. In C. Doughty & M. Long (Eds.),
The handbook of second language acquisition (pp. 313–348). Cambridge, England:
Cambridge University Press.
Ellis, R. (2009a). Implicit and explicit learning, knowledge and instruction. In R. Ellis,
S. Loewen, C. Elder, R. Erlam, J. Philp, & H. Reinders (Eds.), Implicit and explicit
knowledge in second language learning, testing and teaching (pp. 3–25). Bristol, England:
Multilingual Matters.
Ellis, R. (2009b). Retrospect and prospect. In R. Ellis, S. Loewen, C. Elder, R. Erlam,
J. Philp, & H. Reinders (Eds.), Implicit and explicit knowledge in second language learning,
testing and teaching (pp. 335–353). Bristol, England: Multilingual Matters.
Hulstijn, J. (2003). Incidental and intentional learning. In C. Doughty & M. Long
(Eds.), The handbook of second language acquisition (pp. 349–381). Cambridge, England:
Cambridge University Press.
Hulstijn, J. (2005). Theoretical and empirical issues in the study of implicit and explicit
second language learning. Studies in Second Language Acquisition, 27, 129–140.
Krashen, S. D. (1982). Principles and practice in second language acquisition. New York, NY:
Pergamon Press.
Kuhn, T. S. (1996). The structure of scientific revolutions (3rd ed.). Chicago, IL: University
of Chicago Press.
Rothman, J., & VanPatten, B. (2013). On multiplicity and mutual exclusivity: The
case for different theories. In M. P. García Mayo, M. J. Gutiérrez-Mangado, &
M. Martínez Adrián (Eds.), Contemporary approaches to second language acquisition
(pp. 243–256). Amsterdam, Netherlands: John Benjamins.
Schmidt, R. (1994). Deconstructing consciousness: In search of useful definitions for
applied linguistics. AILA Review, 11, 129–158.
Schwartz, B. (1993). On explicit and negative data effecting and affecting competence
and linguistic behavior. Studies in Second Language Acquisition, 15, 147–163.
VanPatten, B. (2016). Why explicit information cannot become implicit knowledge.
Foreign Language Annals, 49, 650–657.
2
LINGUISTIC THEORY, UNIVERSAL
GRAMMAR, AND SECOND
LANGUAGE ACQUISITION
Lydia White
The Theory and Its Constructs
The Linguistic Competence of Native Speakers and L1 Acquirers
Generative linguistic theory aims to provide a characterization of the linguistic
competence of native speakers of a language and to explain how it is possible
for child first language (L1) acquirers to achieve that competence. The genera-
tive perspective on second language (L2) acquisition has parallel goals, specif-
ically, to account for the nature and acquisition of interlanguage competence
(see Gregg, 1996; White, 1989, 2003).
In this framework, language use (comprehension and production) is assumed
to be based upon an abstract linguistic system, a mental representation of
grammar (syntax, phonology, morphology, and semantics). The knowledge
of language represented in this way is unconscious. Furthermore, much of
this unconscious knowledge does not have to be learned in the course of L1
acquisition; rather, it is derived from Universal Grammar (UG). This claim is
motivated by the so-called logical problem of language acquisition or the
problem of the poverty of the stimulus, namely, the mismatch between the input
that children are exposed to and their ultimate attainment (e.g., Chomsky,
1986b). Our knowledge of language goes beyond the input in numerous ways.
For instance, children and adults can understand and produce sentences that
they have never heard before, they know that certain structures are ungram-
matical without being taught this, and they know that certain interpretations
of sentences are not possible in certain contexts.
Consider the following example, from de Villiers, Roeper, and Vainikka
(1990) and Roeper and de Villiers (1992). Imagine a scenario where a boy
climbs a tree in the afternoon and falls out of it and hurts himself. In the
evening, he tells his father about what happened. Now consider the questions
in (1), uttered in this context.
(1)
a. When did the boy say (that) he got a bruise?
b. When did the boy say how he got a bruise?
In this context, a question like (1a) is ambiguous; it can be a question about the
time that he got hurt (in the afternoon) or the time that he told his father about
the incident (in the evening). Question (1b), conversely, is not ambiguous. Even
though it differs by only one word (the embedded clause being introduced by
how), this question can only have an answer that relates to the time of telling,
such as in the evening or when he was in the bath. In other words, it must be con-
strued as a question about the main clause; the embedded clause interpretation
is impossible, even though it is perfectly acceptable and available in the case of
(1a). De Villiers and colleagues conducted a series of experiments using such
scenarios and found that young children acquiring English as their mother
tongue are highly sensitive to the difference between these two sentence types,
allowing both interpretations in the case of (1a) but only one interpretation (the
matrix clause one) in the case of (1b).
How do children know this? It is most unlikely that children are explicitly
told that certain sentences are ambiguous, while others (which are superficially
very similar) are not. Nor does this kind of information seem to be inducible
from the language that children hear, given that children will be exposed to
a range of grammatical wh-questions, in simple and embedded clauses. The
observation, then, is that the input underdetermines the child’s linguistic com-
petence. Hence, it is argued, children must bring innate, built-in knowledge
to bear on the task of first language acquisition. In this case, a principle of UG,
one of a number of so-called island constraints (Ross, 1967) (also known as
the Subjacency Principle), restricts wh-movement in particular ways. Effec-
tively, these constraints state that certain constituents form islands, from which
phrases cannot escape.1 The embedded clause in (1b) is a wh-island, headed
by a question phrase (how), whereas the embedded clause in (1a) is not. The
wh-phrase when can “escape” only in the case of (1a), passing through a position
which is not available in the case of (1b), since it is already filled by how. The
alternative interpretation is possible in both cases because when is construed
with the main clause, so there has been no extraction from an embedded clause
(island or otherwise).
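The reasoning above can be made concrete with a toy sketch in Python. It is not a parser and not part of the theory's formal apparatus; the function name readings_of_when and the encoding of the embedded clause (just the word that introduces it) are assumptions made purely for illustration. The sketch records one thing only: the embedded-clause reading of when requires an empty intermediate position, and that position is unavailable when another wh-phrase such as how already occupies it.

```python
# Toy illustration only: the names and the encoding are hypothetical.
WH_WORDS = {"how", "who", "what", "why", "where", "when"}


def readings_of_when(embedded_introducer):
    """Return the clauses that sentence-initial 'when' can be construed with.

    embedded_introducer is the word that introduces the embedded clause:
    "that", "" (nothing), or a wh-word such as "how".
    """
    # Construal with the main clause involves no extraction from the
    # embedded clause, so it is always available.
    readings = {"main"}
    # The embedded reading needs the intermediate position as an escape
    # hatch; a wh-word sitting there makes the clause a wh-island.
    if embedded_introducer.lower() not in WH_WORDS:
        readings.add("embedded")
    return readings


print(sorted(readings_of_when("that")))  # sentence type (1a)
print(sorted(readings_of_when("how")))   # sentence type (1b)
```

Under these assumptions, the call for (1a) yields both readings and the call for (1b) yields only the main clause reading, mirroring the contrast described in the text.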
A related effect of island constraints is that sentences where wh-movement
takes place out of an island are ungrammatical in English, as shown in (2a).
Here, what has been extracted out of an embedded wh-clause, an extraction that
is impossible for the same reason that the embedded clause interpretation of (1b)
is impossible. In contrast, (2b) is acceptable because the embedded clause is not
an island; there is a position for the wh-phrase to “escape” through, indicated
by the intermediate t in (2b).
(2)
a. *Whatᵢ does John wonder [who bought tᵢ]?
b. Whatᵢ does John think [tᵢ that Mary bought tᵢ]?
Once again, on learnability grounds, it is implausible to suppose that L1
acquirers of English arrive at knowledge of the ungrammaticality of sentences
like (2a) on the basis of English input alone. Instead, constraints of this kind
must derive from UG; acquisition of wh-movement (and many other properties
of language) is constrained by innate principles. Language is acquired presup-
posing such knowledge, with the consequence that L1 acquirers do not have to
learn when certain kinds of sentences are ungrammatical or when there can or
cannot be certain kinds of structural ambiguity.
Interlanguage Competence
Given that linguistic theory offers a model of the linguistic competence of
native speakers, it may be able to provide a characterization of nonnative com-
petence as well. This is the assumption of researchers working on L2 acquisition
from the perspective of generative linguistics. It has long been observed that the
language of L2 learners is systematic and rule-governed (e.g., Corder, 1967).
The term interlanguage, coined by Selinker (1972), has been widely adopted to
refer to the linguistic competence of L2 learners and L2 speakers (henceforth
L2ers). L2 researchers working in the generative paradigm assume that inter-
language grammars, like native speaker grammars, involve unconscious mental
representations, though they do not necessarily agree as to the precise nature of
these representations, for example, the nature and degree of influence of the L1
and the status of UG constraints.
While the operation of UG in L2 acquisition cannot be taken for granted,
considerations of learnability (the logical problem of L2 acquisition) apply in L2
acquisition as they do in L1 (e.g., White, 1989). That is, if it can be shown that
L2ers acquire abstract and subtle properties that are underdetermined by the L2
input, this suggests that interlanguage competence must be subject to the same
constraints as native competence.
Consider wh-movement once again. The ambiguity of (1a) in contrast to
(1b), and ungrammaticality of (2a) in contrast to (2b), constitute an L2 learn-
ability problem, parallel to the problem faced by L1 acquirers. There is no
reason to suppose that the L2 English input is any more informative about
wh-questions than the L1 English input, unless L2ers receive specific instruc-
tion on these aspects of wh-movement, which seems highly unlikely. In other
words, in the case of successful acquisition of this kind of abstract knowledge,
island constraints must be implicated in L2 as well as L1.
However, if it turns out that L2ers indeed demonstrate the same kind of
subtle knowledge as native speakers, a reasonable objection would be that the
source of this knowledge is not UG directly; rather it is the mother tongue
grammar (see, e.g., Bley-Vroman, 1989; Schachter, 1990). In effect, the
L2er might show knowledge of island constraints because these have been
activated in the L1 grammar and not because interlanguage grammars are
UG-constrained as such. Hence, to eliminate this possibility, it is necessary
to investigate cases where the L1 and L2 differ in such a way that the mother
tongue grammar could not provide the learner with the necessary knowledge.
In the case of wh-questions, this would be achieved if the L1 is a language
with so-called wh-in-situ instead of wh-movement, such as Chinese, Japanese,
or Korean. In these languages, in contrast to English, wh-phrases do not move
but remain in their underlying positions. This is true of simple wh-questions,
as in the Chinese example in (3a), and wh-questions from embedded clauses,
as in (3b).2
(3)
a. ni xihuan shei?
you like who
“Who do you like?”
b. Zhangsan xiangxin shei mai-le shu?
Zhangsan believe who buy-ASP books
“Who does Zhangsan believe bought books?”
In (3a), there is no wh-fronting; rather, the wh-phrase (shei “who”) remains in
object position within the clause. The same is true of (3b), where shei remains
in subject position in the embedded clause.
Now consider Chinese equivalents of (1). In (4a), the wh-phrase (shenmoshihou
“when”) does not move out of the embedded clause. In consequence, and in
contrast to its English equivalent, this question is not ambiguous. Instead, it
can only be a question about the time the boy got hurt (the embedded clause
reading), not the time of telling (the main clause reading). To ask a question
about the time of telling, when must be in the main clause, as in (4b). Again, this
question is not ambiguous. In Chinese, then, each interpretation is reflected in
a different word order (compare (4a) vs. (4b)), in contrast to English, where one
word order can have two meanings (as in (1a)).
(4)
a. nanhai shuo ta shenmoshihou nong qing
boy say he when got bruise
“When did the boy say that he got a bruise?”
b. nanhai shenmoshihou shuo ta nong qing
boy when say he got bruise
“When did the boy say that he got a bruise?”
c. nanhai shenmoshihou shuo ta zenyang nong qing
boy when say he how got bruise
“When did the boy say how he got a bruise?”
In the case of (4c), when again unambiguously requires the matrix interpreta-
tion. As a result, there is no contrast in ambiguity in Chinese equivalent to the
contrast found in English between (1a) and (1b). Thus, if L2 learners come to
know that English sentences like (1a) are ambiguous, whereas sentences like
(1b) are not, this would suggest not only that they have acquired wh-movement
but also that they have knowledge of constraints on wh-movement which could
not have come from the L1.
We have also seen that sentences like (2a), where wh-movement has taken
place out of an island, are ungrammatical in English. In contrast, the Chinese
sentence in (5) is grammatical because no wh-movement out of an island has
taken place; rather, the wh-phrase remains in situ.
(5) ni xiang-zhidao shei mai-le shenme?
you wonder who buy-ASP what
“What is the thing such that you wonder who bought it?”
(cf. “*What do you wonder who bought?”)
To summarize so far, the linguistic competence of native speakers of a lan-
guage includes knowledge of ambiguity and of ungrammaticality, as exempli-
fied by the preceding restrictions on wh-movement. Given that the input alone
is insufficient to account for how such knowledge could have been acquired,
children acquiring their mother tongues must have an innate specification for
language, UG, which guides and limits their hypotheses about the form of the
grammar that they are acquiring. If L2 learners of English come to know sim-
ilar restrictions on wh-movement, especially if these could not be derived from
the L1 grammar, this provides evidence for the continuing functioning of UG
constraints in interlanguage grammars as well.
Universal Grammar: Principles and Parameters
The precise nature and content of UG is the domain of linguistic theory; propos-
als change and are refined as the theory develops. Nevertheless, broadly speak-
ing, the following assumptions hold true across different versions of generative
grammar, such as Government and Binding (GB) theory (Chomsky, 1981) or
Minimalism (Chomsky, 1995). Principles of UG constrain the form of grammars,
as well as the operation of linguistic rules. The island constraints discussed earlier
are examples of such principles, specifying universal restrictions on wh-movement,
the idea being that all cases of wh-movement will be subject to such constraints.
As we have seen, the claim is that language acquirers do not have to learn these
principles—they are built into UG. Parameters, conversely, account for certain
circumscribed differences across languages; the idea is that these differences are
encoded in UG, so that language acquirers can easily determine what kind of
language they are acquiring. Input data are said to trigger the appropriate paramet-
ric choice for the language being acquired (Lightfoot, 1989). It is the input that
determines the choice between parameter values made available by UG.
As an example of a parameter, consider the case of wh-movement again. As
we have seen, the position of wh-phrases differs across languages. This difference
is attributed to a parameter, namely, ±wh-movement. Languages divide
into two main types, those with wh-in-situ, such as East Asian languages, and
those with wh-movement, such as Germanic, Romance, and Slavic languages.
In this case, input in the form of simple wh-questions will be enough to trig-
ger the appropriate parameter value (see Crain & Lillo-Martin, 1999). The
child acquiring English will be exposed to questions like (6a), with a fronted
wh-phrase, indicating wh-movement, whereas the child acquiring Chinese will
be exposed to sentences like (6b), indicating lack of movement.
(6)
a. What do you want?
b. ni xihuan shei?
you like who
Universal principles do not necessarily operate in all languages but only in that
subset of languages that exhibits the relevant properties. In languages with overt
wh-movement, movement is subject to UG principles in the form of island con-
straints. In other words, movement is not totally free; rather, there are certain
kinds of constituents from which a wh-phrase cannot escape, as we have seen.
It is UG that specifies which domains form islands.3 In wh-in-situ languages,
conversely, island constraints are irrelevant, because of absence of movement.4
Table 2.1 summarizes the relevant differences between Chinese and English.
If interlanguage competence is UG-constrained, then, once L2ers reset the
±wh-movement parameter, wh-movement in the interlanguage grammar should
be subject to island constraints.
TABLE 2.1 Parameters and Constraints Relating to wh-Expressions
Language    Parameters      Constraints
English     +wh-movement    Island constraints activated
Chinese     −wh-movement    Island constraints inactive
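The relationship summarized in Table 2.1, together with the idea of triggering, can be restated as a short Python sketch. Everything named here (set_wh_parameter, island_constraints_apply, the boolean encoding of the input) is a hypothetical simplification introduced for illustration; it is not a model of how triggering actually proceeds. Each input question is reduced to a single assumed observation: whether its wh-phrase appears fronted rather than in its underlying position.

```python
# Hypothetical simplification of parameter triggering; illustration only.

def set_wh_parameter(fronted_observations):
    """Choose a value for the wh-movement parameter from simple wh-questions.

    fronted_observations: iterable of booleans, True when the wh-phrase in
    an input question is fronted (as in "What do you want?"), False when it
    stays in situ (as in "ni xihuan shei?").
    Returns "+wh-movement", "-wh-movement", or None if there is no input.
    """
    observations = list(fronted_observations)
    if not observations:
        return None  # no triggering data yet
    # Assumption of this sketch: any fronted wh-phrase counts as a trigger.
    return "+wh-movement" if any(observations) else "-wh-movement"


def island_constraints_apply(parameter_value):
    """Island constraints restrict movement, so they are relevant only
    when the grammar has wh-movement (compare Table 2.1)."""
    return parameter_value == "+wh-movement"


english = set_wh_parameter([True, True, True])
chinese = set_wh_parameter([False, False])
print(english, island_constraints_apply(english))
print(chinese, island_constraints_apply(chinese))
```

The two functions are kept separate on purpose: the parameter value is chosen on the basis of input, whereas the relevance of island constraints follows from that value without any further input.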
What Counts as Evidence?
It must be understood that linguistic competence is an abstraction; it is impos-
sible to “tap” linguistic competence directly. The generative perspective on L2
explores the nature of interlanguage competence by adopting a variety of per-
formance measures to try to discover the essential characteristics of underlying
mental representations. One frequently encountered problem is that it can be
difficult to construct tasks that relate to unconscious knowledge, as opposed to
conscious knowledge learned explicitly in the classroom. Ideally, performance
data from a variety of sources should be brought to bear on the question of
interlanguage competence. Relevant data can be classified into three broad
categories: production data, intuitional data, and data relating to interpretations (or
comprehension). The appropriateness of a particular elicitation task will depend
on what the researcher is trying to discover.
Spontaneous production data might seem to provide an obvious source of
information as to the nature of interlanguage competence. However, usage
does not necessarily accurately reflect knowledge or acquisition; production
data can result in an underestimation of an L2er’s overall linguistic compe-
tence. Lardiere (1998) examines such a case, showing that an L2er, whose use
of appropriate tense and agreement morphology is very infrequent in spoken
English, at the same time shows mastery of complex syntax. It is also possible
that production data might lead one to overestimate a learner’s competence.
That is, L2ers may appear to be highly proficient, even native-like, and yet have
nonnative grammars; sentences that superficially appear to be identical to those
produced by native speakers might in fact have different underlying represen-
tations (e.g., Hawkins & Chan, 1997).
Furthermore, when researchers are interested in phenomena that might not
show up readily in production, alternatives are required. In the case of the island
constraints discussed earlier, the ambiguity of sentences like (1a) and lack of
ambiguity of sentences like (1b) are unlikely to be observable in production.
Similarly, it is unlikely that L2ers will produce sentences like (2a). However,
failure to find violations of island constraints in production cannot be taken as
evidence of knowledge of ungrammaticality, since their absence might be due
to an accident of data sampling. Since a major research question in this frame-
work is whether interlanguage grammars are constrained by UG, it is essential
to discover whether forms ruled out by principles of UG are in fact ungrammat-
ical in the interlanguage grammar. One potential means of establishing this is
through elicited production (Crain & Thornton, 1998). If a task is set up so that
a certain structure might be expected and this structure is avoided by L2ers, this
suggests, indirectly, that the structure is ungrammatical for the learner. White
and Juffs (1998) use this technique to investigate island constraints in L2.
The most commonly used task to determine knowledge of L2 (un)grammat-
icality is the grammaticality judgment task, which taps linguistic intuitions.
This kind of task allows the researcher to investigate whether sentences that
are disallowed for native speakers because of principles of UG are also prohib-
ited in the interlanguage grammar. Considering island constraints once again,
another kind of island is formed by a complex NP (an NP containing a relative
clause or a complement clause). Extraction of a wh-phrase from a complex NP
is ungrammatical in English, as shown in (7a). In contrast, sentences like (7b)
are grammatical, because no such extraction has taken place.
(7)
a. *Whose life did you read a biography that described?
(cf. I read a biography that described someone’s life.)
b. Whose life did you read about?
Suppose that you wish to determine whether an L2 learner of English knows
that sentences like (7a) are ungrammatical, in other words, whether their gram-
mar is subject to island constraints. To establish whether L2 learners “know”
this constraint, a grammaticality judgment task is appropriate, in which learn-
ers are given a set of grammatical and ungrammatical sentences relevant to the
structure in question and are asked to indicate whether the sentences are gram-
matical. If interlanguage grammars are constrained by UG, and provided that
wh-movement has been acquired, then L2ers are expected to reject sentences
like (7a), while accepting grammatical equivalents. Hence, by using grammati-
cality judgments, the experimenter can (indirectly) investigate aspects of inter-
language competence that may not otherwise be amenable to inspection. Many
studies have used such tasks to explore island effects in interlanguage grammars
(e.g., Hawkins & Chan, 1997; Schachter, 1990; White & Juffs, 1998).
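As a rough illustration of how responses from such a task might be summarized, the following Python sketch computes two proportions for one learner: how often grammatical items were accepted and how often ungrammatical items were rejected. The function name gjt_summary, the data layout, and the responses themselves are invented for this example; actual studies use their own designs, rating scales, and statistical analyses.

```python
# Hypothetical scoring sketch; the responses below are invented.

def gjt_summary(responses):
    """Summarize grammaticality judgments for one participant.

    responses: list of (is_grammatical, accepted) pairs of booleans, one
    per test sentence. Returns the proportion of grammatical items that
    were accepted and the proportion of ungrammatical items that were
    rejected (None when a sentence type is absent).
    """
    grammatical = [accepted for is_gram, accepted in responses if is_gram]
    ungrammatical = [accepted for is_gram, accepted in responses if not is_gram]
    accept_rate = sum(grammatical) / len(grammatical) if grammatical else None
    reject_rate = (
        sum(not a for a in ungrammatical) / len(ungrammatical)
        if ungrammatical else None
    )
    return {"grammatical_accepted": accept_rate,
            "ungrammatical_rejected": reject_rate}


# One imaginary learner: four items like (7b), then four items like (7a).
learner = [(True, True), (True, True), (True, False), (True, True),
           (False, False), (False, False), (False, False), (False, True)]
print(gjt_summary(learner))
```

The two proportions are reported separately because a learner who simply accepted every sentence would look accurate on grammatical items alone; it is the rejection of sentences like (7a) that bears on island constraints.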
No single methodology is appropriate for investigating all aspects of lin-
guistic competence. If questions of interpretation are being investigated, gram-
maticality judgments can be totally uninformative. Consider, once again, the
ambiguity of sentences like (1a), contrasted with the lack of ambiguity of (1b). If
L2ers were asked to judge such sentences and indicated that both sentences are
grammatical, this would not help to determine how they were interpreting the
sentences (i.e., whether when was being construed as a question about the embed-
ded clause or the main clause). For this reason, in testing whether L1 acquirers
of English know the relevant properties, de Villiers and colleagues adopted a
comprehension task, where children were shown pictures and asked the test
question; their responses indicated how they had interpreted that question.
Another way of investigating L2ers’ interpretations of sentences is by means
of truth-value judgments. In this methodology, participants are presented
with contexts, in the form of a short story or a picture, and have to judge
whether a given sentence is true or false in that context. Dekydtspotter and
colleagues (e.g., Dekydtspotter, Sprouse, & Swanson, 2001) have used this
methodology to show that L2ers are sensitive to subtle interpretive properties
related to word order in French, proposing that this sensitivity must come from
UG rather than the L1. What makes this methodology particularly suitable as
a means of providing evidence as to the nature of unconscious linguistic com-
petence is that the judgments are not made on a metalinguistic basis, a poten-
tial problem with grammaticality judgments, where one cannot eliminate the
possibility that conscious knowledge is being brought to bear on the judgment.
In addition to production data, intuitional data, and data from sentence
interpretation, in recent years there has been a growing interest in the use of
online methodologies to provide evidence concerning how L2ers process or
compute the input incrementally, in real time. As far as L2 knowledge and
use of wh-movement and island constraints are concerned, methodologies have
included—but are not limited to—eye-tracking (e.g., Cunnings, Batterham,
Felser, & Clahsen, 2010) and self-paced reading (e.g., Aldwayan, Fiorentino, &
Gabriele, 2010; Omaki & Schultz, 2011); see Clahsen and Felser (2006) for
an overview. We return to the use of self-paced reading when discussing the
exemplary study.
Common Misunderstandings
Here we consider four common areas of misunderstanding about generative
SLA research. These relate to (a) the scope of the theory, (b) lack of native-like
“success” in L2, (c) transfer, and (d) methodology.
To address the first of these misconceptions, the theory described in this
chapter does not seek to account for all aspects of L2 acquisition. On the con-
trary, the theory is deliberately circumscribed, concentrating on description
and explanation of interlanguage competence, defined in a technical way. The
focus is on how the learner represents the L2 in terms of a mental grammar.
The theory does not aim to account for second language use, nor does it aim to
account for all of the observable phenomena (see later).
It is important to understand that UG is a theory of constraints on rep-
resentation, as shown by the examples discussed earlier; this is true both of
L1 acquisition (e.g., Borer, 1996) and L2 (White, 2003). UG determines the
nature of linguistic competence; principles of UG (constraints) guarantee that
certain potential analyses are never in fact entertained. This says nothing about
the time course of acquisition (L1 or L2) or about what drives changes to the
grammar during language development. Similarly, the theory of parameter set-
ting does not, in fact, provide a theory of language development, even though
it is often seen as such. The concept of parameter resetting in L2 presupposes
that some kind of change takes place in the interlanguage grammar, from the
L1 parameter value to some other parameter value (e.g., the change from
−wh-movement to +wh-movement). In consequence, interlanguage grammars at
different points in time may be characterized in terms of different parameter
settings. However, the precise mechanisms that lead to such grammar change
are not part of the theory of UG. Rather, the theory needs to be augmented in
various ways (Carroll, 2001; Gregg, 1996).
The second misconception is that if UG constrains interlanguage gram-
mars, this necessarily predicts a “successful” outcome in L2 acquisition, such
that the endstate grammars of L2ers should not differ in significant respects
from those of native speakers. However, the claim that interlanguage grammars
are UG-constrained is a claim that the linguistic representations of L2ers are
subject to principles of UG, like other natural languages. It is not a claim that
L2ers will necessarily achieve the same grammar as a native speaker would.
In the case of wh-questions, UG does not dictate that wh-movement must be
acquired (since it is not acquired by L1 acquirers of wh-in-situ languages), only
that, if acquired, it must be constrained by the relevant principles, including
island constraints. Many factors come into play in L2 that simply do not arise
in L1 acquisition—including prior knowledge of another language and possible
deficiencies in the input—which might prevent native-like attainment.
The third misconception is that the L1 should play a relatively trivial role in
L2 acquisition, the idea being that strong L1 influence is somehow incompatible
with the claim that UG is implicated in L2 acquisition. In fact, however, many
proponents of generative SLA incorporate the L1 grammar as an integral part
of the theory. In particular, the Full Transfer Full Access Hypothesis (FTFA)
and its precursors claim that the initial state of L2 acquisition consists of the
steady state grammar of L1 acquisition (Schwartz & Sprouse, 1996; White,
1989). L2ers initially adopt the L1 grammar as a means of characterizing the
L2 data; this constitutes full transfer. Subsequently, in the light of L2 input,
revisions to the grammar may be effected. Such revisions are assumed to be
UG-constrained, hence full access. Transfer may be persistent or not, depending
on particular linguistic properties and particular language combinations (see
observable phenomena, in what follows). In the event that L2ers fail to arrive at
properties of the L2, interlanguage grammars are nevertheless expected to fall
within the range permitted by UG; that is, they will be subject to constraints,
like any natural language. It is also conceivable that L2ers arrive at analyses
appropriate for other languages but not for the L1 or the L2 (e.g., Finer, 1991).
Continuing with our wh-movement examples, what this implies is that,
prior to acquiring wh-movement, learners whose L1 is a wh-in-situ language
would be expected to treat the L2 as wh-in-situ as well. In support of this,
evidence for wh-in-situ in the L2 English of speakers of Hindi (a wh-in-situ language)
is reported by Bhatt and Hancin-Bhatt (2002). Furthermore, even when
L2ers appear, superficially at least, to have abandoned a wh-in-situ analysis of the
L2, they may not have acquired wh-movement, instead generating wh-phrases
as clause initial topics, analogous to other topics in the L1 (Hawkins & Chan,
1997; Martohardjono & Gair, 1993; White, 1992).5 In consequence, island con-
straints would be nonoperative for the same reason that they do not apply to
wh-in-situ, because movement has not taken place.
Finally, there are misconceptions relating to methodology. It has been
claimed (Carroll & Meisel, 1990; Ellis, 1991) that researchers working in the
UG paradigm take grammaticality judgment tasks to have some kind of privi-
leged status, such that they provide a direct reflection of linguistic competence.
In fact, judgment data are recognized as being performance data, on par with
other data (White, 1989, 2003). The only privilege that grammaticality judg-
ment tasks offer is a relatively straightforward way of assessing knowledge of
ungrammaticality. As described in the section on evidence, different kinds of
data provide different kinds of evidence, and the suitability of any particular
task (and the performance data gathered by means of it) will depend on the
precise issue that the researcher is trying to investigate.
An Exemplary Study: Aldwayan, Fiorentino,
and Gabriele (2010)
In recent years, researchers have become interested not only in the issue of
whether L2 grammars are subject to UG principles but also whether L2ers are
constrained by such knowledge when comprehending linguistic input in real
time. The question is whether L2ers are able to make full use of structural
information, including universal constraints, in on-line processing, technically
referred to as parsing.
This issue has been investigated by examining processing of long-distance
dependencies, including those relating to wh-movement. As someone listens to
(or reads) a sentence, the fronted wh-phrase (often referred to as the filler) must
be held in memory until it can be associated with the structural position that
it has moved from (the gap). In a simple wh-question like (8a), what is the filler
and the gap occurs after the verb see, in the object position where what is inter-
preted. In (8b), a more complex wh-question, what is again the filler and there
are actually two potential gap positions, one after the verb say and one after the
verb found; the first gap is not the actual gap but at the point of hearing the word
say, that is not yet obvious.
(8)
a. What did the girl see _?
b. What did the boy say _ that he found _?
Clahsen and Felser (2006, 2018) advance the Shallow Structure Hypothesis (SSH),
proposing that L2ers, in contrast to native speakers, rely mostly on lexical and
semantic cues rather than syntax in parsing. In consequence, their behavior
when parsing L2 sentences differs from that of native speakers, particularly where
complex structures are concerned. Aldwayan, Fiorentino, and Gabriele (2010)
hypothesize, instead, that L2 parsing is syntactically based; not only are inter-
language grammars constrained by principles of UG, but these principles are
accessible to L2ers when parsing L2 input. Their study examines the case of
native speakers of Najdi Arabic, a language without wh-movement, whose L2
is English, a wh-movement language.
In Najdi Arabic, a wh-phrase is found in initial position, but not as the result
of movement. Rather, it is base-generated there and a resumptive pronoun (ih
“him” in the examples in (9), from Aldwayan et al., 2010) fills the position
where the wh-phrase is interpreted. In consequence, there is no gap, as can be
seen in (9a) and, as in the case of Chinese discussed earlier, sentences which
would be ungrammatical in wh-movement languages like English are accept-
able in Arabic; see the contrast between (9b) and its English translation.
(9)
a. min alli arsal ar-rasalah li-ih
who C send the-letter to-him
“Who did you send the letter to?”
b. hatha ar-rjal alli Mary 9alima-t-ni mita ib-til zor-ih
this the-man C Mary tell when will-she visit-him
“*This is the man who Mary told me when she will visit”
If L2 acquisition is UG-constrained, and if these Najdi Arabic speakers have
acquired English wh-movement (i.e., reset the wh-movement parameter), they
are expected to observe and apply constraints on wh-movement, even though
these constraints are not activated in the L1. This hypothesis is tested by means
of a self-paced reading task, based on Stowe (1986). In self-paced reading, par-
ticipants read sentences presented word-by-word (or phrase-by-phrase) on a
computer screen. After reading a word, they press a key to indicate that they
are ready for the next word; reading time is measured when the key is pressed.
Stowe found that native speakers of English slow down in their reading times
at a point where they are expecting a gap but find it to be filled (the so-called
filled-gap effect). Consider the sentence in (10).
(10) My cousin wondered who David will put _ me near _ at the wedding.
This sentence contains an embedded wh-clause, starting with who as the filler;
there are two potential gaps, one after the verb put and one after the prepo-
sition near. The correct interpretation involves the second gap. However, on
encountering the verb put (without yet seeing the rest of the sentence), the first
assumption that native speakers make is that who is the object of that verb, so
they show a slower reading time on encountering me, indicating the unexpect-
edness of the filled gap.
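The measurement behind such reading times can be made concrete with a small sketch. The Python below is purely illustrative and is not taken from Stowe (1986) or Aldwayan et al. (2010): the function name, the timestamps, and the assumption that each word appears at the moment of the previous key press are all hypothetical.

```python
def reading_times(words, press_times_ms, start_ms=0):
    """Pair each word with its reading time in a word-by-word self-paced reading trial.

    Hypothetical assumptions: word i appears at the moment of the previous
    key press (or at start_ms for the first word), and press_times_ms[i] is
    the moment the reader pressed the key to move on from word i.
    """
    if len(words) != len(press_times_ms):
        raise ValueError("need exactly one key press per word")
    times = []
    onset = start_ms
    for word, press in zip(words, press_times_ms):
        # Reading time = key press time minus the time the word appeared.
        times.append((word, press - onset))
        onset = press
    return times


# Invented timestamps for the beginning of sentence (10); not real data.
words = ["My", "cousin", "wondered", "who", "David", "will", "put", "me"]
presses = [350, 720, 1130, 1480, 1850, 2190, 2560, 3080]
rts = reading_times(words, presses)
```

In this invented trial, the reading time for me is longer than for the preceding words, which is the pattern a filled-gap effect would produce at that position.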
Now consider sentences like (11). Here, there also appears to be a potential
gap, after the preposition about, that might slow down the reader; this gap is in
fact filled (by Sam’s sick mother). However, Stowe found that native speakers of
English show no slowdown in such cases. This is because the potential gap is
contained within a complex NP subject, which is a syntactic island. There can
be no relationship between a wh-phrase and a position inside an island, since
movement is prohibited in such cases, as we have seen.
(11) The girl questioned who [the sad findings about _ Sam’s sick mother] were
announced to upset _.
To summarize, native speakers show filled-gap effects but only in cases where
the “misleading” gap is in a grammatically possible position.
Aldwayan et al. (2010) tested whether Najdi Arabic speakers would show
native-like patterns of behavior in their L2. They hypothesize that L2ers will
be sensitive: (a) to licit gaps (demonstrated by a slowing down in their reading
of filled gaps in sentences like (10)); and (b) to the fact that gaps cannot occur
inside syntactic islands (demonstrated by absence of a slowing down in sen-
tences like (11)). Such results would indicate that L2ers parse the input with
reference to syntactic structure, including island constraints.
Participants included 40 adult native speakers of Najdi Arabic who were
advanced English L2ers and 40 native speakers of English. They took part in a
self-paced (word-by-word) reading task, which included two sets of sentences (as
well as fillers). The first set were like those in (10) (as well as structurally similar
sentences with no movement, hence no gaps), to establish whether or not par-
ticipants exhibited a filled-gap effect. The second set was designed to find out
how they would treat illicit gaps (i.e., ones which violate island constraints), as
in (11). If L2ers are sensitive to islands, they should not show a filled-gap effect
in such cases.
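The logic of the two sentence sets can be sketched as a comparison of mean reading times at the critical word (e.g., me in (10)). The sketch below is illustrative only: the function name and all numbers are invented for this example, and it computes a descriptive difference rather than the inferential statistics that an actual study would report.

```python
from statistics import mean


def filled_gap_effect(wh_rts_ms, control_rts_ms):
    """Mean slowdown (ms) at the critical word in wh-sentences relative to
    matched control sentences without movement. A clearly positive value is
    the signature of a filled-gap effect; a value near zero indicates none."""
    return mean(wh_rts_ms) - mean(control_rts_ms)


# Invented reading times (ms) at the critical word; not real data.
licit_effect = filled_gap_effect([520, 560, 540], [430, 450, 440])
island_effect = filled_gap_effect([445, 455, 450], [440, 450, 460])
```

On these invented numbers, the licit-gap comparison yields a positive slowdown while the island comparison yields none, which is the pattern predicted if readers posit gaps only where the grammar permits them.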
Results showed a significant filled-gap effect (a slowing down in reading
time) for sentences like (10) compared to equivalent sentences with no gaps,
for both native speakers and L2ers. In contrast, there was no filled-gap effect
in sentences like (11), suggesting that neither group posited a gap within the
syntactic island. Taken together, these results suggest that L2ers with an L1
without wh-movement know when movement is permitted, so gaps can be
postulated, as well as when it is not permitted, so gaps are unavailable. Recall
that there are no gaps in the L1 wh-constructions because Najdi Arabic requires
resumptive pronouns, which can occur inside syntactic islands. The fact that
L2ers demonstrated this contrast suggests access to abstract structure, including
constraints on wh-movement, both in their underlying knowledge and in
their access to that knowledge in online processing.
To summarize, these results are consistent with the claim that parameters
of UG can be reset (in this case from –wh-movement to +wh-movement) and
that interlanguage grammars are subject to principles of UG (in this case, island
constraints), principles which are also available for L2 processing. For other
research demonstrating that L2ers access island constraints in online processing,
see Cunnings, Batterham, Felser, and Clahsen (2010) and Omaki and Schultz
(2011), amongst others.
Explanation of Observed Findings in L2 Acquisition
In seeking to characterize the unconscious underlying linguistic competence of
L2 learners, the generative perspective on L2 acquisition cannot and does not
aim to account for all of the observable phenomena discussed in this volume.
Nevertheless, this perspective does offer insights into several of them.
Observation 1: Exposure to input is necessary for L2 acquisition. According to
UG theory, there are certain aspects of grammar that are not learned through
exposure to input, specifically, knowledge of universal constraints. Nevertheless,
UG does not operate in a vacuum: Universal principles and language-specific
parameter settings must be triggered by input from the language being acquired.
In the case of our examples, learners acquiring an L2 with wh-movement will
require input to motivate the +wh-movement value of the parameter. However,
once they have established that the L2 has wh-movement, they will not require
input to determine that island constraints operate; these come for free, so to
speak.
Observation 3: Learners come to know more than what they have been exposed to in
the input. This is the central observation that the theory aims to account for. The
main motivation for the proposal that L1 language acquisition is constrained
by UG is precisely the fact that native speakers come to know more than they
have been exposed to. Generative SLA researchers make the same claim in the
case of L2 acquisition: L2ers come to know very subtle properties of the L2
(such as ambiguity and ungrammaticality) which are underdetermined by the
L2 input, both naturalistic input and classroom instruction, and which cannot
be explained in terms of the L1 grammar either.
Observation 6: Second language learning is variable across linguistic subsystems. It has
frequently been observed by researchers working within the generative SLA
framework that there is a dissociation between attainment in the syntactic and
morphological domains. As discussed under Evidence, syntax acquisition is largely
successful, in contrast to inflectional morphology, which is often not supplied or
supplied inappropriately. This is not taken to reflect a failure of UG-constraints.
Rather, Lardiere (1998, 2000) attributes this dissociation to a problem in map-
ping between these two domains, a problem that can be very persistent, showing
up not just in the course of L2 acquisition but in the endstate. In a related vein,
Prévost and White (2000) advance the Missing Surface Inflection Hypothesis (MSIH),
proposing that learners have a low-level morphological deficit such that they can-
not consistently realize L2 morphology, resorting to default forms when in doubt,
while at the same time having no problems with associated syntax.
Another discrepancy that has been investigated in recent years concerns
syntax versus discourse. Sorace and colleagues (Belletti, Bennati, & Sorace,
2007; Sorace, 2011; Sorace & Filiaci, 2006) report that near-native speakers of
null subject languages achieve native-like proficiency in syntax but nevertheless
show problems at the interface between syntax and discourse, overusing
overt subjects in discourse contexts where null subjects would be more appro-
priate. These problems are attributed, in part, to lasting effects of discourse
properties of the L1 or to processing problems that arise when syntax must be
integrated with other domains.
Observation 7: There are limits on the effects of frequency on L2 acquisition. The
claim of UG theory is that certain properties of language are not subject to fre-
quency effects. Indeed, the idea is the opposite: UG allows learners to acquire
properties quite unrelated to frequency; children achieve certain kinds of
knowledge on the basis of little or no input. For example, consider the case of
so-called parasitic gaps, illustrated in (12). In (12a), the sentence is ungrammati-
cal because the verb correcting requires an overt direct object pronoun (i.e., them).
The example in (12b), a yes–no question, is ungrammatical for the same reason.
But (12c), a wh-question, is significantly better, even though correcting still lacks
an overt object.
(12)
a. *By mistake, I filed the papers without correcting.
b. *Did you file the papers without correcting?
c. Which papers did you file without correcting?
The grammaticality of (12c) is a consequence of properties of wh-movement that
are encoded in UG. Native speakers of English acquire the distinction between
these sentence types, even though sentences like (12a) and (12b) will not be
exemplified in the input (because they are ungrammatical), while sentences like
(12c) are, presumably, relatively rare, even nonexistent. The point, then, is that
frequency cannot play a role in such cases. The same claim would apply in L2;
certain properties of the L2 are expected to be acquired regardless of frequency.
Observation 8: There are limits on the effect of a learner’s first language on L2 acqui-
sition. We have already seen that certain versions of generative SLA (e.g., FTFA)
assume strong L1 influence, with the L1 grammar taken as the starting point
(the initial state) in L2 acquisition. Although this claim implies that all param-
eters are initially set at the L1 setting in the interlanguage grammar, it does not
imply that they will all be reset to the L2 value at the same time. Hence, L1
effects may be quite fleeting in some cases but lasting in others. Depending on
the L1 and the L2 in question, triggering input may motivate resetting to the L2
value extremely early, as Haznedar (1997) shows for the headedness parameter
(switching from head final L1 to head initial L2). Conversely, if the L2 input
does not provide suitable positive evidence to motivate resetting, transfer
effects will be much longer lasting, maybe even permanent. For instance, as
discussed earlier, some L2ers appear to take fronted wh-phrases in L2 English
not as evidence for wh-movement as such but rather as evidence that wh-phrases
can be topicalized (a possibility permitted in the L1). Once such an analysis is
adopted, it is not clear what evidence would lead learners to abandon it.
Observation 9: There are limits on the effects of instruction on L2 acquisition. Clearly
one cannot instruct L2ers as to UG-constraints (nor does anyone attempt to
do so). On the contrary, this kind of abstract, complex, and subtle knowl-
edge is achieved without being taught. Furthermore, several researchers have
shown that classroom instruction and information provided in textbooks can
often be quite misleading, providing superficial and incorrect analyses of cer-
tain complex linguistic phenomena assumed to stem from UG (e.g., Belikova,
2008; Bruhn-Garavito, 1995). Conversely, it could be that instruction might
be effective in providing L2 input necessary to trigger parameter resetting
(e.g., in providing evidence—in the form of wh-questions—for or against
wh-movement). Attempts have been made to provide learners with triggering
input in the classroom, with mixed results (see White, 2003, for discussion).
The Explicit/Implicit Debate
As already discussed, a central assumption of the generative perspective on sec-
ond language acquisition described in this chapter is that language learners (both
L1 and L2) acquire linguistic competence, which takes the form of an abstract
and unconscious grammar, achieved, at least in part, by means of UG. Assum-
ing Hulstijn’s (2005) identification of explicit with conscious knowledge and
implicit with unconscious knowledge, what this theory seeks to account for is
the implicit linguistic knowledge that L2 learners attain. This does not mean that
all linguistic behavior reflects implicit knowledge. There are aspects of language
that relate to what is sometimes called “encyclopedic” knowledge, such as entries
in the mental lexicon and morphological paradigms. This kind of knowledge
may well be explicit and the way it is learned may be explicit, with general cog-
nitive processes such as attention and memorization playing a crucial role.
It is the acquisition of implicit knowledge that is at the heart of the generative
perspective on L2 acquisition. We now turn to the question of how this implicit
knowledge is achieved. Here the distinction between acquisition and learn-
ing originally formulated by Krashen (1981) becomes relevant (see Chapter 1).
Schwartz (1986, 1993) has extended Krashen’s proposal, arguing for a distinc-
tion between unconscious (i.e., implicit) linguistic competence (in the techni-
cal sense assumed in this chapter) and learned linguistic knowledge. (See also
Felix, 1985, for a similar division of labor. Felix argues that competing systems
are implicated in adult L2, that is, UG and problem-solving systems.)
Researchers working in the generative SLA framework agree that the
acquisition process is implicit and that the outcome of acquisition (in terms
of unconscious knowledge) is implicit. But there is ongoing debate about the
exact relationship between the acquisition of an unconscious system and a
more explicitly learned system and the extent to which linguistic input can be
manipulated in the classroom to affect language acquisition.
Arguing for the claim that linguistic competence is acquirable only by
means of implicit mechanisms, Schwartz (1986, 1993) follows Fodor (1983)
in assuming that language implicates cognitive processes that are domain
specific, or modular, in contrast to more general, nonmodular processes that
apply to other aspects of cognition. Language acquisition takes place by means
of mechanisms within the language module (including UG), involving only
implicit mechanisms, specific to language. Language acquisition cannot come
about through conscious memorization of rules, for example, or by means of
a conscious search for patterns in the input, or by paying explicit attention to
certain aspects of the input. Schwartz furthermore makes the strong assump-
tion (again, following Krashen, 1981) that the outcome of explicit learning can
never become the input to the implicit acquisition system or to implicit com-
petence. On this account, there is no interface between implicit and explicit
knowledge, or between explicit learning and implicit linguistic competence.
Schwartz further extends these ideas to issues relating to the kinds of input
typically available in language classrooms, arguing that explicit input (such
as grammar teaching or correction) can never serve as input to the language
acquisition system. In contrast, White (1991), making the same assumptions
about the acquisition of implicit linguistic competence, has nevertheless argued
that explicit input might contribute to the shaping of underlying competence,
particularly in cases where the L2 positive input does not provide evidence of
ungrammaticality. For more recent perspectives on such issues, see papers in
Whong, Gil, and Marsden (2013).
Conclusion
In this chapter, the perspective offered by linguistic theory has been presented.
The central tenet of the theory is that the linguistic competence of native
speakers is underdetermined by the input that children are exposed to, hence
that an innate UG is implicated in language acquisition. Researchers with a
generative perspective on SLA investigate whether the same holds true for L2
acquisition. If interlanguage competence goes beyond the L2 input and the
L1 grammar in particular respects, then UG must be implicated in nonnative
language acquisition as well.
Discussion Questions
1. Like other researchers within a UG framework, White relies on the logical
problem of learning and the poverty of the stimulus argument to posit an
innate language faculty. What other explanation might there be, other
than innateness, for the problems White discusses?
2. If both L1 acquisition and L2 acquisition are constrained by UG, how
would you explain the observed differential outcomes between L1 acquisi-
tion and L2 acquisition (e.g., L1 acquisition is universally successful while
L2 acquisition is not; L1 learners all attain some kind of native pronuncia-
tion, whereas most L2 learners do not)?
3. What evidence does White use to suggest language acquisition is different
from other kinds of learning? Do you agree with this evidence? If not, how
would you explain the findings of UG research?
4. Research within the UG framework tends to ignore the social context of
language learning. Why is this appropriate for the framework?
5. As a theory of linguistic competence, the crux of research on UG is how
learners come to know more than what they are exposed to. In addition
to UG, what other mechanisms in the mind–brain would you suggest
are necessary to explain language acquisition? For example, what triggers
movement from one stage of acquisition to the next?
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Notes
Suggested Further Reading
Hawkins, R. (2019). How second languages are learned. Cambridge, England: Cambridge
University Press.
This textbook provides a clear and comprehensive introduction to L2 acquisition
from a generative linguistic perspective.
Slabakova, R. (2016). Second language acquisition. Oxford, England: Oxford University
Press.
A recent perspective on generative L2 acquisition, including an overview of cur-
rent issues and research. The main focus is on adult second language acquisition,
placed in the broader context of generative linguistic theory, first language acquisi-
tion, childhood second language acquisition and bilingualism.
White, L. (2003). Second language acquisition and Universal Grammar. Cambridge, England:
Cambridge University Press.
In this book, theories as to the role of UG and the extent of mother tongue
influence are presented and discussed. Particular consideration is given to the nature
of the interlanguage grammar at different points in development, from the initial
state to ultimate attainment.
White, L. (2012). Research timeline: Universal Grammar, crosslinguistic variation and
second language acquisition. Language Teaching, 45, 309–328.
This paper traces some of the main strands of generative SLA research conducted
between 1985 and 2011.
References
Aldwayan, S., Fiorentino, R., & Gabriele, A. (2010). Evidence of syntactic constraints
in the processing of wh-movement: A study of Najdi Arabic learners of English.
In B. VanPatten & J. Jegerski (Eds.), Research in second language processing and parsing
(pp. 65–86). Amsterdam, Netherlands: John Benjamins.
Belikova, A. (2008). Explicit instruction vs. linguistic competence in adult L2-acquisition.
In H. Chan, H. Jacob, & E. Kapia (Eds.), Proceedings of the 32nd annual Boston University
Conference on language development (BUCLD 32) (pp. 48–59). Somerville, MA:
Cascadilla Press.
Belletti, A., Bennati, E., & Sorace, A. (2007). Theoretical and developmental issues
in the syntax of subjects: Evidence from near-native Italian. Natural Language and
Linguistic Theory, 25, 657–689.
Bhatt, R., & Hancin-Bhatt, B. (2002). Structural minimality, CP and the initial state
in adult L2 acquisition. Second Language Research, 18, 348–392.
Bley-Vroman, R. (1989). What is the logical problem of foreign language learning?
In S. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition
(pp. 41–68). Cambridge, England: Cambridge University Press.
Borer, H. (1996). Access to Universal Grammar: The real issues. Behavioral and Brain
Sciences, 19, 718–720.
Bruhn-Garavito, J. (1995). L2 acquisition of verb complementation and Binding
Principle B. In F. Eckman, D. Highland, P. Lee, J. Mileman, & R. Rutkowski
Weber (Eds.), Second language acquisition theory and pedagogy (pp. 79–99). Hillsdale,
NJ: Lawrence Erlbaum.
Carroll, S. (2001). Input and evidence: The raw material of second language acquisition.
Amsterdam, Netherlands: John Benjamins.
Carroll, S., & Meisel, J. (1990). Universals and second language acquisition: Some com-
ments on the state of current theory. Studies in Second Language Acquisition, 12, 201–208.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht, Netherlands: Foris.
Chomsky, N. (1986a). Barriers. Cambridge, MA: MIT Press.
Chomsky, N. (1986b). Knowledge of language: Its nature, origin, and use. New York, NY:
Praeger.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Clahsen, H., & Felser, C. (2006). Grammatical processing in language learners. Applied
Psycholinguistics, 27, 3–42.
Clahsen, H., & Felser, C. (2018). Some notes on the Shallow Structure Hypothesis.
Studies in Second Language Acquisition, 40, 693–706.
Corder, S. P. (1967). The significance of learners’ errors. International Review of Applied
Linguistics, 5, 161–170.
Crain, S., & Lillo-Martin, D. (1999). An introduction to linguistic theory and language acqui-
sition. Oxford, England: Blackwell.
Crain, S., & Thornton, R. (1998). Investigations in Universal Grammar: A guide to experi-
ments on the acquisition of syntax. Cambridge, MA: MIT Press.
Cunnings, I., Batterham, C., Felser, C., & Clahsen, H. (2010). Constraints on L2 learn-
ers’ processing of wh-dependencies. In B. VanPatten & J. Jegerski (Eds.), Research
in second language processing and parsing (pp. 87–110). Amsterdam, Netherlands: John
Benjamins.
Dekydtspotter, L., Sprouse, R., & Swanson, K. (2001). Reflexes of mental architec-
ture in second-language acquisition: The interpretation of combien extractions in
English-French interlanguage. Language Acquisition, 9, 175–227.
de Villiers, J., Roeper, T., & Vainikka, A. (1990). The acquisition of long-distance
rules. In L. Frazier & J. de Villiers (Eds.), Language processing and language acquisition
(pp. 257–297). Dordrecht, Netherlands: Kluwer.
Ellis, R. (1991). Grammaticality judgments and second language acquisition. Studies in
Second Language Acquisition, 13, 161–186.
Felix, S. (1985). More evidence on competing cognitive systems. Second Language
Research, 1, 47–72.
Finer, D. (1991). Binding parameters in second language acquisition. In L. Eubank (Ed.),
Point counterpoint: Universal Grammar in the second language (pp. 351–374). Amsterdam,
Netherlands: John Benjamins.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Gregg, K. R. (1996). The logical and developmental problems of second language
acquisition. In W. Ritchie & T. Bhatia (Eds.), Handbook of second language acquisition
(pp. 49–81). San Diego, CA: Academic Press.
Hawkins, R., & Chan, C. Y. H. (1997). The partial availability of Universal Grammar
in second language acquisition: The “failed functional features hypothesis.” Second
Language Research, 13, 187–226.
Haznedar, B. (1997). L2 acquisition by a Turkish-speaking child: Evidence for L1 influ-
ence. In E. Hughes, M. Hughes, & A. Greenhill (Eds.), Proceedings of the 21st annual
Boston University Conference on language development (pp. 245–256). Somerville, MA:
Cascadilla Press.
Huang, C. T. J. (1982). Move WH in a language without WH movement. The Linguistic
Review, 1, 369–416.
Hulstijn, J. H. (2005). Theoretical and empirical issues in the study of implicit and
explicit second-language learning. Studies in Second Language Acquisition, 27, 129–140.
Krashen, S. (1981). Second language acquisition and second language learning. Oxford,
England: Pergamon Press.
Lardiere, D. (1998). Case and tense in the “fossilized” steady state. Second Language
Research, 14, 1–26.
Lardiere, D. (2000). Mapping features to forms in second language acquisition. In
J. Archibald (Ed.), Second language acquisition and linguistic theory (pp. 102–129).
Oxford, England: Blackwell.
Lightfoot, D. (1989). The child’s trigger experience: Degree-0 learnability. Behavioral and
Brain Sciences, 12, 321–375.
Martohardjono, G., & Gair, J. (1993). Apparent UG inaccessibility in second language
acquisition: Misapplied principles or principled misapplications? In F. Eckman (Ed.),
Confluence: Linguistics, L2 acquisition and speech pathology (pp. 79–103). Amsterdam,
Netherlands: John Benjamins.
Omaki, A., & Schultz, B. (2011). Filler-gap dependencies and island constraints in
second-language sentence processing. Studies in Second Language Acquisition, 33, 563–588.
Prévost, P., & White, L. (2000). Missing surface inflection or impairment in second
language acquisition? Evidence from tense and agreement. Second Language Research,
16, 103–133.
Roeper, T., & de Villiers, J. (1992). Ordered decisions in the acquisition of
wh- questions. In J. Weissenborn, H. Goodluck, & T. Roeper (Eds.), Theoretical issues
in language acquisition: Continuity and change in development (pp. 191–236). Hillsdale,
NJ: Lawrence Erlbaum.
Ross, J. R. (1967). Constraints on variables in syntax (Unpublished doctoral dissertation).
Cambridge, MA: MIT.
Schachter, J. (1990). On the issue of completeness in second language acquisition. Second
Language Research, 6, 93–124.
Schwartz, B. D. (1986). The epistemological status of second language acquisition.
Second Language Research, 2, 120–159.
Schwartz, B. D. (1993). On explicit and negative data effecting and affecting compe-
tence and “linguistic behavior.” Studies in Second Language Acquisition, 15, 147–163.
Schwartz, B. D., & Sprouse, R. (1996). L2 cognitive states and the full transfer/full
access model. Second Language Research, 12, 40–72.
Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209–231.
Sorace, A. (2011). Pinning down the concept of “interface” in bilingualism. Linguistic
Approaches to Bilingualism, 1, 1–33.
Sorace, A., & Filiaci, F. (2006). Anaphora resolution in near-native speakers of Italian.
Second Language Research, 22, 339–368.
Stowe, L. (1986). Parsing wh-constructions: Evidence for online gap location. Language
and Cognitive Processes, 1, 227–245.
White, L. (1989). Universal Grammar and second language acquisition. Amsterdam,
Netherlands: John Benjamins.
White, L. (1991). Adverb placement in second language acquisition: Some effects of pos-
itive and negative evidence in the classroom. Second Language Research, 7, 133–161.
White, L. (1992). Subjacency violations and empty categories in L2 acquisition. In
H. Goodluck & M. Rochemont (Eds.), Island constraints (pp. 445–464). Dordrecht,
Netherlands: Kluwer.
White, L. (2003). Second language acquisition and Universal Grammar. Cambridge, England:
Cambridge University Press.
White, L., & Juffs, A. (1998). Constraints on wh-movement in two different contexts
of non-native language acquisition: Competence and processing. In S. Flynn,
G. Martohardjono, & W. O’Neil (Eds.), The generative study of second language acquisi-
tion (pp. 111–129). Hillsdale, NJ: Lawrence Erlbaum.
Whong, M., Gil, K. H., & Marsden, H. (Eds.). (2013). Universal Grammar and the second
language classroom. New York, NY: Springer.
Xu, L. (1990). Remarks on LF movement in Chinese questions. Linguistics, 28, 355–382.
3
ONE FUNCTIONAL APPROACH
TO L2 ACQUISITION
The Concept-Oriented Approach
Kathleen Bardovi-Harlig
The Theory and Its Constructs
Functionalist approaches to language hold that language is primarily used
for communication and does not exist without language users. Functional-
ism views language in terms of form-to-function and function-to-form mappings.
Functional approaches to second language (L2) acquisition investigate such
mappings in interlanguage and are especially interested in how these change
over time in the developing interlanguage system. Functionalist approaches to
linguistics in general and to L2 acquisition in particular are not common in
North America, and readers might find the functionalist emphasis on meaning
and function to be both exciting and unfamiliar.
This chapter provides an overview of the concept-oriented approach, one
functionalist approach to L2 acquisition. A functionalist approach can take
either a form-oriented approach or a concept- (or meaning-) oriented approach.
A form-to-function approach would begin with a form such as the English past
tense (-ed) and follow the use of the form to discover how it functions. If we
took this approach in L2 acquisition to examine the acquisition of the simple
past, we would likely discover that the first use of the simple past is to mark
completion with a certain class of predicates. We would also discover a second
function of indicating the main events in a story. Finally, we would observe
that the morphological past takes the function of expressing past time regardless
of predicate type or role in a story. These observations have been made under
the auspices of the Aspect Hypothesis (Andersen, 1991; Bardovi-Harlig, 1998,
2000) and the Discourse Hypothesis (Bardovi-Harlig, 1995), examples of the
form-to-function type of functional analysis (Ellis, 2013).
A function-to-form approach, typically called the concept-oriented
approach, identifies one function, concept, or meaning and investigates how it is
expressed. In this way, the concept-oriented approach focuses on one direction
of the form and function mapping, specifically the function-to-form mapping.1
Within the concept-oriented approach, the main construct is the concept that
is being investigated. Concepts can be overarching, like time or temporality
(time relations), or they can be subsets of larger concepts, like futurity (which
considers time that follows the moment of speaking or writing that we call the
future). The concepts considered in this chapter are temporal, that is, related
to time. Non-temporal concepts that have been investigated include space or
spatial relations (the location of objects in space or in relation to each other),
movement (how objects move through space), and reference (the naming of
actors or objects). Concepts are frequently referred to—when a suitable word
exists—with the –ity suffix to distinguish the semantic concept, for example,
temporality, futurity, and modality, from the grammatical or morphological means
of expressing it, for example, tense, future, and modals, respectively.
Following Klein (2009), a concept such as time can be encoded linguis-
tically by six devices: tense, grammatical aspect, lexical aspect, temporal
adverbials, temporal particles, and discourse principles. Even semantically
more restricted concepts like the future are expressed by a range of linguistic
devices; for example, expression of future by learners of English includes tem-
poral adverbials (e.g., tomorrow), modals (e.g., might, could), will, going to, and
lexical futures (future-oriented verbs), such as want to or need to
(Bardovi-Harlig, 2005). Concept-oriented studies have investigated a range of
temporal concepts, which include past (time prior to the moment of speech;
Bardovi-Harlig, 1992, 2000; Dietrich, Klein, & Noyau, 1995), reverse-order
reports (RORs) (reports in which the events are reported in anti-chronological
order; Bardovi-Harlig, 1994), futurity (time following the moment of speak-
ing; Bardovi-Harlig, 2004, 2005; Howard, 2012; Kanwit, 2017; Moses, 2002;
Solon & Kanwit, 2014), and simultaneity (the expression of events that hap-
pen at the same time or simultaneously; Aksu-Koç & von Stutterheim, 1994;
Leclercq, 2009; Schmiedtová, 2004).
A basic tenet of the concept-oriented approach to L2 acquisition is that
adult learners of second or foreign languages have access to the full range of
semantic concepts from their previous linguistic and cognitive experience. Von
Stutterheim and Klein (1987) argue that “a second language learner—in con-
trast to a child learning his first language—does not have to acquire the under-
lying concepts. What he has to acquire is a specific way and a specific means of
expressing them” (p. 194). The concept-oriented approach begins with a learn-
er’s need to express a certain concept, such as time, space, reference, epistemic
modality (possibility or necessity) or deontic modality (permission or obliga-
tion), or a meaning within a larger concept (such as past or future time, within
the more general concept of time), and investigates the means that a learner uses
to express that concept.
The basic claim of functional approaches is the centrality of meaning and
function in influencing language structure and language acquisition. Cooreman
and Kilborn (1991) outline two major tenets: Language serves communication
and form serves function. Functional approaches always work on multiple levels
of language. As Cooreman and Kilborn state, “there is no formal separation of
the traditionally recognized subcomponents in language, i.e., morphosyntax,
semantics, and pragmatics” (p. 196).
Consistent with other functional approaches, the concept-oriented approach
embraces a multilevel analysis, including lexical devices, morphology, syntax,
discourse, and pragmatics. In other words, the concept-oriented approach
includes all means of expression used by learners. As Long and Sato (1984) note,
“function to form analysis automatically commits one to multi-level [emphasis
added] analysis, since the entire repertoire of devices and strategies used by
learners must be examined” (p. 217).
Thus, concept-oriented analyses document the range of linguistic devices
that speakers use to express a particular concept (von Stutterheim & Klein,
1987), the interplay of ways to express a meaning, and the balance of what is
explicitly expressed and what is left to contextual information (Klein, 1995).
As Klein observes, from the concept-oriented perspective, a substantial part
of language acquisition is the permanent reorganization of the balance among
means of expression. The analysis seeks to explain how meanings within a
larger concept are expressed at a given time, and how the expression of the
concept changes over time.
As an example of the interplay among means of expression and the changing
balance, consider a learner’s expression of past time. The earliest resources that
learners have are their interlocutors’ turns, which may provide a time frame on
which a learner can build (this is called scaffolding), and universal principles
such as chronological order, by which listeners assume that events in narratives
are told in the same order in which they happened. This is called the pragmatic
stage (Meisel, 1987). In the next stage, the lexical stage, learners use temporal
and locative adverbials as well as connectives (e.g., and then) to indicate time.
Finally, learners may move to the morphological stage, in which tense indi-
cates temporal relations. At the same time that past morphology develops, it also
participates in structuring the narrative. The main story line (the foreground)
is distinguished from the supporting information (the background) by high use
of simple past in English (or preterit in Spanish and passé composé in French).
Note that both the inventory and the balance change. The inventory changes
as new forms are added: first lexical markers, then verbal morphology. The
balance changes as the use of morphology becomes more reliable. In the early
stages, adverbs are used in the absence of tense, whereas in the morphological
stage, tense is used more than adverbials. However, as Schumann (1987) points
out, adverbials persist in advanced interlanguage just as they do in the native
speaker system.
The concepts of interplay and balance in a system also relate to the function-
alist concept of functional load. Every linguistic device, whether a structure,
morphology, or word, has a function. For example, if an adverb such as yesterday
is the only indicator in a sentence that an event happened in the past, then the
functional load of the adverb is high. If the sentence also employs past tense
verb morphology to indicate the time frame, the functional load of both the
adverb and the verbal morphology is lower than either one occurring alone
(Bardovi-Harlig, 1992, 2000).
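As a rough illustration of this idea, the sketch below treats functional load as an equal share among the devices that mark past time in a sentence. The equal-share formula, the function name `functional_load`, the device labels, and the two constructed sentences are assumptions made for this example only; the chapter describes functional load qualitatively and does not give a formula.

```python
def functional_load(devices):
    """Return a toy functional-load share for each past-time device.

    Assumption (not from the chapter): the load is split equally
    among the devices that mark past time in the sentence.
    """
    if not devices:
        return {}
    share = 1 / len(devices)
    return {device: share for device in devices}


# Constructed example "Yesterday I go": the adverb is the only
# indicator of past time, so it carries the whole load.
adverb_only = functional_load(["adverb"])

# Constructed example "Yesterday I went": the adverb and the past
# morphology both indicate past time, so each carries less.
adverb_and_tense = functional_load(["adverb", "past morphology"])

print(adverb_only)
print(adverb_and_tense)
```

Any scheme in which the share of each device falls as more devices mark the same meaning would serve equally well; the equal split is simply the easiest one to read.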
One natural outcome of functionalism’s interest in the interplay of linguistic
resources and their change over time is an attempt to understand how interlan-
guage selects the first meaning-to-form mappings and how they expand. This
interest in the development of function-to-form and form-to-function mapping
is captured in two principles for L2 acquisition, the one-to-one principle and
the multifunctionality principle (Andersen, 1984, 1990). The one-to-one
principle states that an interlanguage system “should be constructed in such a
way that an intended underlying meaning is expressed with one clear invariant
surface form (or construction)” (Andersen, 1984, p. 79). As Andersen sums up,
the one-to-one principle “is a principle of one form to one meaning [emphasis
original]” (p. 79). The multifunctionality principle comes into play at later
stages and was formulated as follows (Andersen, 1990):
(a) Where there is clear evidence in the input that more than one form
marks the meaning conveyed by only one form in the interlanguage, try
to discover the distribution and additional meaning (if any) of the new
form. (b) Where there is evidence in the input that an interlanguage form
conveys only one of the meanings that the same form has in the input, try
to discover the additional meanings of the form in the input.
(p. 53)
The multifunctionality principle, then, allows multiple forms for a single
meaning and multiple meanings for a single form.
As an illustration, consider the early expression of futurity by learners of
English. Learners begin to express futurity with will, and only later, under
certain circumstances, expand their repertoire to include the going to future
(Bardovi-Harlig, 2004, 2005). Audiences often ask me why learners do not just
use the present progressive (e.g., I’m going to Chicago). The data show that the
present progressive is used in less than 2% of learner expressions of the future.
The explanation is rather straightforward functionally. The present progressive
has the primary function of expressing ongoing action. In other words, it is
involved in a one-to-one relationship with another meaning in the interlan-
guage. With time, learners do expand their systems beyond the initial stage
described by the one-to-one principle and move into a stage characterized by
multifunctionality, but at the outset they begin with a transparent, invariant,
and simple association of futurity and will.
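The contrast between the two principles can be sketched as a check on a meaning-to-form mapping. This is only an illustration: the dictionary representation, the function name `is_one_to_one`, and the two hypothetical learner systems are assumptions, loosely based on the futurity example above.

```python
def is_one_to_one(mapping):
    """Check whether a meaning-to-forms mapping is one-to-one.

    mapping: dict from a meaning (str) to the list of forms that
    express it. The mapping is one-to-one when every meaning has
    exactly one form and no form serves more than one meaning.
    """
    one_form_per_meaning = all(len(forms) == 1 for forms in mapping.values())
    all_forms = [form for forms in mapping.values() for form in forms]
    one_meaning_per_form = len(all_forms) == len(set(all_forms))
    return one_form_per_meaning and one_meaning_per_form


# Hypothetical early system: will for futurity, and the present
# progressive reserved for ongoing action.
early_system = {
    "futurity": ["will"],
    "ongoing action": ["present progressive"],
}

# Hypothetical later system: going to joins will as a second
# form for futurity (multifunctionality).
later_system = {
    "futurity": ["will", "going to"],
    "ongoing action": ["present progressive"],
}

print(is_one_to_one(early_system))
print(is_one_to_one(later_system))
```

In the early system the check succeeds because the present progressive is already tied to ongoing action; in the later system it fails because futurity now has two forms.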
Adult learners use language in the service of communication, so making
(and expressing) meaning is the main process underlying acquisition. Fail-
ure to convey the intended meaning is seen as an impetus to moving to the
next acquisitional stage. Consider the three main stages in the expression of
temporality: the pragmatic (the use of chronological order or building on an
interlocutor’s discourse that provides temporal reference), the lexical (the use
of temporal adverbials to establish a time orientation), and the morphological
(the use of verb inflections to indicate time relations). Failure to convey the
intended meaning using pragmatic means may drive learners to develop a more
elaborated system, moving from the pragmatic to the lexical stage, or from the
lexical stage to greater lexical elaboration or to the acquisition of verbal mor-
phology in the final stage (Dietrich et al., 1995).
The emphasis on the learner’s use of various linguistic devices and the
change in the balance of those devices in the course of acquisition aligns func-
tional approaches with Selinker’s (1972) influential concept of interlanguage
(Bardovi-Harlig, 2014). The concepts of interlanguage (Selinker, 1972) and
learner varieties (Dietrich et al., 1995; Klein, 1995; Klein & Dimroth, 2009) both
emphasize the systematicity of the emerging linguistic system and de-emphasize
comparison with achievement of target-language norms. Studies that quantify
the use of various means of expression do so, not in relation to target-language
use, but rather in terms of other means used at the same time in the interlan-
guage (see, e.g., the example study in this chapter; also Bardovi-Harlig, 2004,
2005; Kanwit, 2017; Moses, 2002; Solon & Kanwit, 2014).
The concept-oriented approach is also compatible with research on vari-
ation in L2 acquisition and the acquisition of variable targets. Within the
variationist framework, Gudmestad and Geeslin (2011, 2013) investigated the
expression of futurity in Spanish by advanced learners, and Kanwit (2017)
investigated the acquisition of the expression of futurity cross-sectionally by
university learners of Spanish as a foreign language at five levels of profi-
ciency, ranging from third semester undergraduate students to graduate stu-
dent instructors of Spanish. In addition, Kanwit and Solon (2013) explored
the effect of regional exposure on the acquisition of Spanish future expression
in two study abroad contexts.
In contrast to the theories and models outlined in other chapters in this
volume, the concept-oriented approach is neither a theory, nor a model, but
rather a framework for analysis.2 Although it does not make predictions or
model the acquisition process as theories and models do, it does provide an
orientation to L2 acquisition research that guides research and research ques-
tions. If one of the functions of a theory is to provide direction in identifying
important research questions, this analytic framework satisfies that function.
Klein (1995) compared the concept-oriented framework to a theory in the
following way:
A frame of analysis, such as the one used here, is not a theory which is
meant to excel by the depth of its insights or by its explanatory power.
Rather, it is an instrument designed for a specific purpose [to analyze lan-
guage], and to serve this purpose, it should be simple, clear and handy. …
A frame of analysis, if it is to be more than a temporary crutch, should
also be flexible in the sense that it can easily be enlarged, refined and
made more precise, whenever there is need to.
(p. 17)
What Are the Origins of the Approach?
Functionalist approaches to L2 acquisition are related to functional linguistics more
generally, which may also be a valuable resource for L2 research. The interest in
the function-to-form and form-to-function mapping is broader than the concept-
oriented approach, and I will mention a few areas of investigation to give the reader
a sense of the breadth of functionalist inquiry possible in L2 research.
Different approaches to functionalism explore different functions. Prague
School functionalism pioneered work on functional sentence perspective (the
role of information bearing elements, whether known or unknown, given or
new) in determining word order (Firbas, 1979; Svoboda, 1974). This parallels
a syntactic concern for word order but investigates it functionally. Similarly,
research on topic (and topic-comment structure) in both first language (L1)
(Chafe, 1970; Kuno, 1980; Prince, 1981) and L2 (Hendriks, 2000; Huebner,
1983; Rutherford, 1983) offers a second perspective on word order. Discourse
concerns related to text type, specifically narratives, have been investigated
cross-linguistically for a range of languages by Hopper (1979) and for L2
(Bardovi-Harlig, 1995, 1998; Flashner, 1989; Kumpf, 1984; von Stutterheim,
1991; von Stutterheim & Lambert, 2005). Functionalist approaches can also be
found in studies of processing and weighting of cues, most notably in the com-
petition model (Bates & MacWhinney, 1981, 1987), which has influenced a
number of L2 studies (Cooreman & Kilborn, 1991; Gass, 1987; Kilborn & Ito,
1989; MacWhinney, 1987).
The concept-oriented approach to L2 acquisition owes its articulation to
von Stutterheim and Klein (1987). The concept-oriented approach is particu-
larly compatible with other meaning-oriented or function-oriented approaches
to language and linguistic universals, such as semantic or notional typology,
which investigates the expression of semantic concepts across the world’s
languages (Croft, 1995; Palmer, 2001). The research on L2 temporality (the
expression of time relations) has benefited greatly from cross-linguistic stud-
ies (e.g., Bybee, 1985; Bybee, Perkins, & Pagliuca, 1994; Dahl, 1985, 2000;
Klein & Li, 2009). Such inquiries inform L2 acquisition researchers about both
the range of expression and the range of systems in which they appear that are
possible in human language.
What Counts as Evidence?
Studies in the concept-oriented framework typically take as evidence language
used communicatively, a subset of what is generally called production data.
Studies in this framework also prefer to observe production over time, in what is
called a longitudinal design. The tasks used to elicit data allow learners to con-
struct meaning. The studies tend to observe learners’ production (or output) in
fairly natural situations. When speakers communicate, they encode meaning
in various ways. Since the concept-oriented approach is interested in the way
in which meanings or semantic concepts are expressed, communicative tasks or
activities which have a clearly definable concept or purpose are used. Examples
of some tasks that have been used include the telling of narratives (stories),
retelling of short film excerpts, and giving directions for the reenactment of an
event. Telling or retelling stories allows researchers to study how events in the
past are expressed, for example. Asking learners to make predictions may reveal
how they express the future, and also how they express certainty or uncertainty,
which is related to modality. Giving directions on how to perform an action
naturally allows learners to encode both the order of events (what to do first,
second, and third) and spatial relations (where to put what).
Studies in this framework have typically employed a longitudinal design.
Longitudinal designs allow individual learners to be observed for a relatively long
time, which facilitates the observation of how the use of various linguistic devices
changes over time. Longitudinal studies have included both instructed learners
observed over 15–18 months (Bardovi-Harlig, 1994, 2000, 2004, 2005) and pre-
dominantly uninstructed learners observed over 30 months in a large multinational
study (e.g., Becker & Carroll, 1997; Dietrich et al., 1995). In academic settings, the
academic year becomes a natural period of observation (Moses, 2002; Salsbury,
2000). In addition to the longitudinal component of the study in which Moses
elicited learner production six times during the course of an academic year, he also
collected data from students in four levels of instruction at the same university,
adding a cross-sectional component. As interest in the concept-oriented approach
grew, cross-sectional studies were conducted (Kanwit, 2017; Solon & Kanwit,
2014) as well as single-moment studies (Edmonds, Gudmestad, & Donaldson, 2017;
Howard, 2012). Early studies investigated learners only in the target-language
environment (such as English in the United States or Swedish in Sweden), but later
studies have also included instructed foreign language learners (e.g., for French:
Howard, 2012; Moses, 2002; for Spanish: Kanwit, 2017; Solon & Kanwit, 2014).
Evidence from language processing studies is also valued in functional
approaches (Cooreman & Kilborn, 1991). In contrast to the concept-oriented
studies which rely on production tasks that are as close to spontaneous commu-
nication as possible, processing studies rely on highly controlled experimental
designs and the results are understood in terms of preference, interpretation, and
rate of processing. Evidence from processing studies suggests that patterns in
production—namely, that in early stages L2 learners rely on adverbs to convey
temporal relations—are mirrored in comprehension; that is, learners may use
adverbs to understand temporal relations even when morphological indicators
are present or when they conflict with the adverb (e.g., Lee, Cadierno, Glass, &
VanPatten, 1997; Musumeci, 1989; Sanz & Fernández, 1992). Additional stud-
ies of the Spanish future have investigated the presence of adverbials as well as
other factors (Lee, 2002; Rossomondo, 2007). These studies are offline pro-
cessing studies. In offline studies, participants hear or read a sentence and then
respond to a comprehension probe (e.g., a question) that assesses their under-
standing of the sentence. In offline studies, the primary data of interest (i.e., the
response to the question and sometimes the time taken to answer) is gathered
after the primary stimulus has been perceived. In contrast, online studies mea-
sure processing as it unfolds in real time. One online technique, eyetracking,
records readers’ eye movements while they read sentences or while they hear
sentences related to an array of pictures that is visible while the sentence is
heard. Fixation times on words or pictures are measured every few milliseconds
and provide data about how the sentence is understood as each incoming word
is perceived. We could easily imagine a replication of the earlier findings on
the reliance on adverbials using eyetracking as learners read written sentences.
von Stutterheim and colleagues (Flecken, 2011; von Stutterheim, Andermann,
Carroll, Flecken, & Schmiedtová, 2012; von Stutterheim & Carroll, 2006) have
used eyetracking as one means of assessing how learners prepare to encode
motion events using perfective or imperfective (progressive) morphology in
narratives as they view short film clips. Participants were asked to view video
clips and were prompted to tell what is happening. The researchers were interested
in whether native speakers, bilinguals, and/or advanced learners would repre-
sent the events as completed (using perfective aspect) or ongoing or incomplete
(imperfective/progressive aspect), and whether this corresponded to fixation
times on motion-path endpoints.
What these different types of studies have in common is their focus on
investigating form–meaning associations. Research designs which facilitate the
investigation of such associations would be considered to be consistent with a
functional inquiry.
Because functionalist approaches do not seek to explain form for form’s
sake, or structure in the absence of function, functionalist inquiries, includ-
ing concept-oriented studies, tend to avoid designs that focus on form rather
than meaning or that focus on form in the absence of meaning. Thus, tasks
such as grammaticality judgments, which are used by other approaches, are not
found in functionalist studies. Similarly, one would not expect to find sentence
correction tasks, if the sentences are isolated; however, if the sentences were
part of a text and thus context and meaning are involved, it would be harder to
rule out such a task a priori.
It is also important to consider what type of analysis would and would not
be appropriate to the approach. Concept-oriented analyses report how learners
use language and how they construct their language but typically do not report
the findings in terms of whether the learners are correct or incorrect relative to
the language being learned (what we call the target language). For example, in a
concept-oriented approach, we would say that learners who use goed (yesterday)
are using morphological inflection to express the past, and we would be unlikely
to discount the form as ill-formed.
Common Misunderstandings
Although functionalist approaches are not very common, I do not think the
concept-oriented approach is misunderstood, largely because few people think
about this approach to interlanguage research and analysis. I think many
people instinctively like the concept-oriented approach (and other functionalist
approaches) because it is meaning oriented, but when novice researchers attempt
studies in this framework, they find it very difficult not to refer to form as the
primary focus or to describe learner production without evaluating accuracy
based on what is expected in the target language. As an illustration, consider
the concept of plurality, which is distinct from the plural morpheme (in English
indicated by -s). Consider also the noun phrases two boy, many friend, and three
girls. In a target-like analysis, only one noun phrase, three girls, correctly uses
plural morphology; formally speaking, it is the only noun phrase that is “plu-
ral.” In contrast, in a concept-oriented analysis, all three noun phrases express
plurality. The interlanguage is seen to have three means of indicating plurality,
namely, quantifiers, numerals, and plural morphology. Over time, the balance
will change and -s will become the dominant marker of plurality, co-occurring
with the other markers.
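The difference between the two analyses can be sketched in a few lines of code. The word lists, the function name `plurality_markers`, and the naive test for -s on the last word are simplifying assumptions for illustration, not a procedure taken from any study.

```python
# Toy word lists (assumptions for this example only).
NUMERALS = {"two", "three", "four", "five"}
QUANTIFIERS = {"many", "several", "some"}


def plurality_markers(noun_phrase):
    """List the means a noun phrase uses to express plurality.

    Assumes the head noun is the last word and treats a final -s
    as plural morphology, which is too naive for real data.
    """
    words = noun_phrase.lower().split()
    markers = []
    if any(word in NUMERALS for word in words):
        markers.append("numeral")
    if any(word in QUANTIFIERS for word in words):
        markers.append("quantifier")
    if words and words[-1].endswith("s"):
        markers.append("plural morphology")
    return markers


for phrase in ["two boy", "many friend", "three girls"]:
    print(phrase, plurality_markers(phrase))
```

An accuracy-based analysis would keep only the phrases whose marker list includes plural morphology, whereas the concept-oriented analysis records every marker for every phrase and tracks how their balance shifts over time.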
Researchers who were trained in other traditions may regard the lack of
formal separation of the traditionally recognized subcomponents in language
(morpho-syntax, semantics, and pragmatics) as rather disconcerting and
perhaps as a reflection of “fuzzy thinking,” as a syntactician suggested to me many years
ago. However, to a functionalist, taking many levels into account at once leads
to a more complete picture of language in the service of communication.
An Exemplary Study: Bardovi-Harlig (1994)
A concept-oriented approach typically begins with a concept to be investi-
gated. It examines (a) how learners express the concept, (b) how the means of
expression interact, and (c) how the expression of the concept changes over
time. The study presented here, Bardovi-Harlig (1994), investigated the con-
cept of reverse-order reports (RORs), or how learners conveyed events that are
not in the order in which they happened.
Without evidence to the contrary, a series of events is understood to be in
the order in which they occurred, or chronological order. Narratives, for example,
relate events in chronological order (Dahl, 1984). Any change from chronolog-
ical order must be indicated. As Klein (1986) states, “unless marked otherwise,
the sequence of events mentioned in an utterance corresponds to their real
sequence” (p. 127).
Despite the strong tendency to report events in chronological order, nar-
rators, including L2 learners, must also be able to deviate from chronological
order. Compare example (1), which reports events in chronological order, with
(2), which reports them in reverse order. The first event is labeled [1] and the
second [2].
1. John graduated from high school in 1975. [1] He went to college five years
later. [2]
2. John entered college in 1980. [2] He had graduated from high school five
years earlier. [1]
3. The first one they met was a horse as thin as a stick, tied to an oak tree [2].
He had eaten the leaves as far as he could reach [1]. (Thompson, 1968, p. 2)
4. I ate my lunch [2] after my wife came back from her shopping [1]. (Leech,
1971, p. 43)
English signals RORs by tense (the pluperfect, as in 3), by adverbials (4), and by a
combination of both (2).
Method
This study is a longitudinal production study that followed 16 learners for
9–16 months during their enrollment in an intensive English language pro-
gram in the United States. The learners were from four language back-
grounds (Arabic, Japanese, Korean, and Spanish) and were low-level learners,
as measured by their placement in the first of six levels in the intensive
English program. The intensive English classes met for 23 hours per week
and provided instruction in listening and speaking, reading, writing, and
grammar.
The data for the study came from two sources: primary language samples
produced by the learners and teaching logs completed by participating gram-
mar and writing instructors. The production data comprised the first three
past-time texts from each half-month sampling period, resulting in 430 texts:
376 journal entries, 37 narratives from film retell tasks, and 17 essay exams
and in-class compositions. Past-time texts were identified by the use of time
adverbials that provided time frames (Bardovi-Harlig, 1992; Harkness, 1987;
Thompson & Longacre, 1985) and program calendars.
Every verb supplied in past-time contexts was coded for its verbal mor-
phology, and all adverbials were identified and coded (Harkness, 1987). Rates
of appropriate use of past tense were calculated as the ratio of the number of
different past tense forms (i.e., types in which all occurrences of a verb form
such as was are counted only once, regardless of number of uses) supplied to
the number of obligatory environments. Next, the RORs were identified and
coded for verbal morphology and presence of other markers, namely, adverbi-
als, relative clauses, complements, and causal constructions.
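A minimal sketch of a type-based rate follows, assuming each verb in a past-time context has been coded as a pair: the form supplied and a flag for past morphology. The function names, the sample data, and in particular the choice to count each distinct verb form as one obligatory environment are assumptions for illustration; the study's own counting procedure may differ.

```python
def type_based_rate(coded_verbs):
    """Ratio of distinct past-marked forms to distinct forms overall.

    coded_verbs: list of (form, has_past_morphology) pairs, one per
    verb supplied in a past-time context. Repeated forms such as
    "was" count only once (types, not tokens).
    Assumption: each distinct form is one obligatory environment.
    """
    all_types = {form for form, _ in coded_verbs}
    past_types = {form for form, is_past in coded_verbs if is_past}
    if not all_types:
        return 0.0
    return len(past_types) / len(all_types)


def token_based_rate(coded_verbs):
    """Ratio of past-marked verb tokens to all verb tokens."""
    if not coded_verbs:
        return 0.0
    return sum(1 for _, is_past in coded_verbs if is_past) / len(coded_verbs)


# Hypothetical coded sample: "was" occurs three times.
sample = [
    ("was", True), ("was", True), ("was", True),
    ("went", True), ("walked", True),
    ("go", False), ("eat", False),
]

print(type_based_rate(sample))
print(token_based_rate(sample))
```

Counting types rather than tokens keeps a frequent form such as was from inflating the rate, which is why the two functions give different values for the same sample.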
The findings show that RORs are indeed marked as Klein predicted: RORs
are marked by a variety of devices; fewer lexical and syntactic devices are used
when specialized verbal morphology is used; and RORs seem to emerge when
expression of the past has stabilized. The individual research questions are
examined in turn.
How Are RORs Expressed?
One hundred (97.1%) of the 103 RORs showed an explicit marker of reverse
order, whereas only 3 (or 2.9%) did not. The explicitly marked RORs
exhibited a variety of linguistic devices: morphological contrast (tense-aspect
usage), adverbials (single and dual), and syntactic devices including causal con-
structions (especially the use of because), complementation (especially reported
speech or thought), and relative clauses (see Table 3.1). Sixty-three of the 103
RORs exhibited a contrast in verbal morphology and 40 did not. Some of the
RORs were indicated in multiple ways; thus, there are more markers of RORs
than RORs themselves in Table 3.1.
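The arithmetic behind this point can be checked directly from the counts in Table 3.1. The sketch below is an illustration only: the dictionary layout and the function name `percentages` are assumptions, and the counts are copied from the table. The percentages printed in the table are consistent with being computed over the total number of markers in each column rather than over the number of RORs.

```python
# Marker counts copied from Table 3.1.
no_contrast = {
    "no marking": 3, "single adverb": 13, "dual adverb": 11,
    "relative clause": 7, "because": 12, "complement": 6,
}
contrast = {
    "morphology only": 14, "single adverb": 23, "dual adverb": 7,
    "relative clause": 11, "because": 7, "complement": 6,
}


def percentages(counts):
    """Share of each device among all markers in one column."""
    total = sum(counts.values())
    return {device: round(100 * n / total, 1) for device, n in counts.items()}


# Marker totals exceed the number of RORs in each column
# (40 without and 63 with morphological contrast).
print(sum(no_contrast.values()), sum(contrast.values()))
print(percentages(no_contrast))
print(percentages(contrast))
```

With these counts the columns contain 52 and 68 markers, more than the 40 and 63 RORs they describe, because a single ROR can carry several markers.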
The morphological contrasts employed are often target-like, and two-thirds
of the sample showed a contrast between the simple past to indicate the second
event and the pluperfect to indicate the first event. The remaining one-third
comprises other contrasts including past [event 2] with present perfect or past
progressive [event 1] and base forms [event 2] with past [event 1].
How Do the Means of Expression Interact?
When learners use verbal morphology to indicate RORs, the use of other
devices decreases (Table 3.1). The use of dual adverbials to show contrast
declines by about half (21.2% to 10.3%). Learners also produced utterances that
have neither lexical nor syntactic devices to indicate RORs once verbal morphol-
ogy marked RORs, whereas this is very unusual when verbal morphology is not
used: 20.6% of RORs with morphological contrast occur with no additional
lexical or syntactic devices.
TABLE 3.1 Past-Time Reverse-Order Reports

Type of contrast: no morphological contrast (40 tokens) versus morphological contrast (63 tokens). For each device, the example is followed by N and %.

No marking
  No morphological contrast (N = 3; 5.8%): She said to me “Yes.” [2] She didn’t eat breakfast, lunch [1] so also she was hungry.
  Morphological contrast: N/A

Morphology
  No morphological contrast: N/A
  Morphological contrast (N = 14; 20.6%): John and I went to her building [2]. She had only invited her friends [1].

Single adverb
  No morphological contrast (N = 13; 25.0%): My sister played piano very well. Before she played [2], we were very nervous [1]
  Morphological contrast (N = 23; 33.8%): By the time the baker caught her [2] she had run into our hero [1]. [Carlos T5.5]

Dual adverb
  No morphological contrast (N = 11; 21.2%): In level two I studied many new things for me [2], I didn’t study before in the another school [1]
  Morphological contrast (N = 7; 10.3%): Today morning my father called us. [2] He told us [3] that grandmother has been sick during two weeks [1]

Relative clause
  No morphological contrast (N = 7; 13.5%): Then the bolice [police] but [put] the girl [2] which stole the bread [1] on the lory with Charlie
  Morphological contrast (N = 11; 16.2%): In order to avoid mistakes and misunderstandings I had to review severals time [2] what I had done [1].

Because
  No morphological contrast (N = 12; 23.1%): Yesterday was a pusy [busy] day because I had to go to Indiana bell [2] because I didn’t biad [paid] my bell [1] so I did.
  Morphological contrast (N = 7; 10.3%): I spent that time with my family [2] because I had been here since Ogust [August] [1].

Complement
  No morphological contrast (N = 6; 11.5%): He thought [2] that I said “Coming” [1]
  Morphological contrast (N = 6; 8.8%): I thought about [2] how she had bought them and packed them [1].

Total
  No morphological contrast: N = 52; 100.1%
  Morphological contrast: N = 68; 100.0%
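The percentages in Table 3.1 can be recomputed from the marker counts, which makes clear that each percentage is a share of the markers in its column (52 and 68) rather than of the RORs (40 and 63). The following Python sketch assumes only the counts reported in the table; the dictionary and function names are illustrative.

```python
# Marker counts copied from Table 3.1 (variable names are illustrative).
NO_CONTRAST = {
    "No marking": 3, "Single adverb": 13, "Dual adverb": 11,
    "Relative clause": 7, "Because": 12, "Complement": 6,
}
CONTRAST = {
    "Morphology only": 14, "Single adverb": 23, "Dual adverb": 7,
    "Relative clause": 11, "Because": 7, "Complement": 6,
}


def marker_percentages(counts):
    """Return each device's share of all markers in one column, to one decimal."""
    total = sum(counts.values())
    return {device: round(100 * n / total, 1) for device, n in counts.items()}


for label, counts in (("No morphological contrast", NO_CONTRAST),
                      ("Morphological contrast", CONTRAST)):
    print(label, "total markers:", sum(counts.values()))
    for device, pct in marker_percentages(counts).items():
        print(f"  {device}: {pct}%")
```

Because each value is rounded separately, a column of percentages need not sum to exactly 100, which is why the first column totals 100.1.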
How Does Expression Change over Time?
Looking at language production over time shows that the learners exhibited
variable rates of emergence for RORs. (Emergence refers to the earliest expres-
sion of a concept or a form.) Eight of the 16 learners began to use RORs within
the first 3 months of observation, another 4 in the next 3 months, and another
4 between months 7 and 13. Although calendar time does not present a consis-
tent picture, emergence with respect to other features of the temporal system
does. RORs emerge when learners show stable use of simple past tense, at about
80% appropriate use in past-time contexts (keeping in mind that appropriate use
means marking the past in a past-time context, even if the form used, e.g., goed,
is incorrect by formal analyses, although by the time 80% use is achieved, few
interlanguage forms are in evidence).
Pluperfect, the grammatical form that serves to uniquely mark the past-in-
the-past, emerges even later. Whereas half of the learners showed use of RORs
in the first three months, only half of those showed early use of the pluperfect.
One learner showed emergent use by the end of the sixth month, and another
two by the end of the seventh month. Three additional learners started to use
the pluperfect between months 9.5 and 12.5.
As predicted by Klein’s principle of natural order, deviations from chrono-
logical order were signaled in interlanguage. The expression of RORs is
delayed until the learner can use a marker to distinguish them from the sur-
rounding narrative.
The finding that the acquisition of the pluperfect serves the expression
of RORs, but does not itself make RORs possible, is an important result
of employing a meaning-oriented approach to this inquiry. Both a form-
focused approach (i.e., focusing on the acquisition of the pluperfect) and
a meaning-oriented approach (i.e., focusing on the expression of RORs)
could identify the acquisitional prerequisite of high appropriate use of past
tense. However, focus on the form of the pluperfect alone would fail to
capture the fact that the pluperfect moves into an established semantic envi-
ronment. In contrast, a meaning-oriented approach reveals both prerequi-
sites for the acquisition of pluperfect: high appropriate use of past tense and
expression of RORs.
Explanation of Observed Findings in SLA
As noted at the beginning of this chapter, the concept-oriented approach is
an analytic framework rather than a theory. It thus lacks the predictive power
of a theory. It does, however, contribute to detailed descriptions of L2 acqui-
sition that take meaning as well as form into account. Studies in the concept-
oriented framework have contributed to a number of observations outlined in
Chapter 1, especially, stages of acquisition, variable outcomes, influence of the
L1, and the effects of instruction. The longitudinal design which is favored by
concept-oriented studies permits the investigation of multiple learner variables.
The functionalist approach offers accounts of two of the observations from
Chapter 1: predictable stages and the limitations of instruction.
Observation 4: Learners’ output (speech) often follows predictable paths with pre-
dictable stages in the acquisition of a given structure. The overriding concern of
functionalism is communication. Therefore, it is in keeping with its orienta-
tion for functionalism to explain major stages in those terms. More success-
ful communication—resulting in conveying the speaker’s meaning—propels
learners to the next acquisitional stage. In the sequence from pragmatic to lex-
ical to morphological expression of temporality, each stage affords a learner a
greater range of expression and less dependence on interlocutors, resulting in
an increasingly independent language user who speaks an interlanguage with
ever increasing communicative power.
Within the larger stages of development, there are multiple discrete stages.
The morphological stage, for example, exhibits many substages, in which dif-
ferent morphological markings emerge and enter into meaning-to-form map-
pings. One explanation for the order of acquisition of morphemes within the
same subsystem is functional load. Meanings for which there are reasonable
(i.e., grammatical and communicatively comprehensible) means of expression
are less likely than others to promote acquisition of a new form. Take, as an
example, the present perfect (e.g., have gone) and the pluperfect (e.g., had gone).
Both are equally complex structurally, having both a tensed form of have plus a
past participle. Longitudinal observation shows that the present perfect emerges
noticeably earlier in adult L2 acquisition, even when both are taught at the
same time (Bardovi-Harlig, 2000). The difference between them is that the
present perfect has no functional equivalent, that is, there is no alternative verb
form with or without an adverb that is both grammatical and carries the same
meaning as the present perfect. In contrast, the meaning of the pluperfect (past
in the past) can be expressed by the simple past plus an adverbial, as discussed
earlier. This helps explain the order of emergence. It appears that an emergent
interlanguage system puts greater store in range of expression (covering all the
conceptual bases) than in redundancy.
Observation 9: There are limits on the effects of instruction on L2 acquisition.
Naturally, the explanation for instructional effects depends on the instructional
effects that one sees. If instructed and uninstructed learners are compared
in the early period before instructed learners overtake uninstructed learners
(Bardovi-Harlig, 2000), the stages are the same. As Gass (1989) argued, the
fundamental psycholinguistic process of L2 acquisition is the same whether
learners enter classrooms or acquire language outside of them.
On the level of expression of individual concepts, instruction is also
seen as having a limited role. The stage of acquisition of individual learners
interacts with instruction as Pienemann (1989, 1998; Pienemann & Lenzing,
this volume) has argued. To better understand what leads to development in
temporal expression, the study reported on earlier (Bardovi-Harlig, 1994)
collected instructional logs from teachers in addition to written and oral lan-
guage production from learners. Comparing the documented form-focused
instruction with the emergence and use patterns of the pluperfect revealed
that learners began to use the pluperfect at or following the time of instruc-
tion if they had already passed through the prerequisite stages for the plu-
perfect: stable use of past in past-time contexts (above 80%) and expression
of RORs. Learners who had not established a stable use of past, or had not
expressed RORs, did not show spontaneous use of pluperfect in their written
texts following instruction.
This suggests that meeting these acquisitional prerequisites is a necessary step
even when the pluperfect is available in instructional input, but it also shows
that merely meeting the prerequisites at the time of instruction is not sufficient.
This is consistent with Pienemann’s teachability hypothesis (now part of pro-
cessability theory; see Pienemann & Lenzing, this volume), according to which
the effects of instruction on the developing interlanguage are constrained by
the learner’s current stage of acquisition. However, even learners who appar-
ently satisfy the acquisitional prerequisites for an instructionally targeted form
may not immediately integrate that form into productive use.
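The logic of the two acquisitional prerequisites can be summarized in a brief Python sketch. It is a reading aid rather than a predictive model: the function name, the parameter names, and the treatment of "about 80%" as a cutoff of 0.80 or higher are assumptions made for illustration.

```python
# Assumed cutoff: the chapter describes stable past use as about 80%
# appropriate use in past-time contexts.
PAST_USE_THRESHOLD = 0.80


def meets_pluperfect_prerequisites(past_use_rate, has_expressed_rors,
                                   threshold=PAST_USE_THRESHOLD):
    """Check the two prerequisites reported for the pluperfect.

    past_use_rate: proportion of appropriate past-tense use (0.0 to 1.0).
    has_expressed_rors: whether the learner has produced reverse-order reports.

    A True result means only that the prerequisites are met. They are
    necessary but not sufficient, so it does not predict that the learner
    will use the pluperfect after instruction.
    """
    return bool(past_use_rate >= threshold and has_expressed_rors)


# Hypothetical learners for illustration.
print(meets_pluperfect_prerequisites(0.85, True))   # both prerequisites met
print(meets_pluperfect_prerequisites(0.85, False))  # no RORs yet
print(meets_pluperfect_prerequisites(0.60, True))   # past use not yet stable
```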
The learners in the ROR study show a similar lack of incorporation of a
targeted form after instruction in the expression of futurity. According to the
instructional logs kept by the teachers, the instruction of the future introduced
going to + Verb (e.g., going to study, going to do homework) a full chapter before
will (Bardovi-Harlig, 2004). Learners nevertheless show a delay in using going
to compared to the early acquisition of will. The presence of targeted forms in
instructional input and completion of acquisitional prerequisites, with con-
tinued nonuse of the target form, suggests that even in the presence of focused
instruction, the one-to-one principle is at work. When learners have an estab-
lished means of expression for a given concept in interlanguage (and especially
when that form is communicatively clear and grammatical), they may be slow
to expand their grammars even with instruction.
In addition to linguistic and cognitive constraints on the effect of instruc-
tion, Klein and Dimroth (2009) outline three potentially disadvantageous ways
in which instructed learning differs from untutored acquisition which are partic-
ularly relevant to the concerns of a functionalist approach. Compared to contact
language learning, classrooms offer preprocessed language, reduce communica-
tive urgency, and provide an external arbiter of correctness (the teacher) rather
than allowing learners to develop their own assessment of their success by asking
themselves, “Do I understand, am I understood?” and “Do I have the impres-
sion that my way of speaking is exactly like that of the others?” (p. 508). All of
these emphasize the centrality of communication to a functional approach to L2
acquisition.
The Explicit/Implicit Debate
The seminal works of the concept-oriented approach (Dietrich et al., 1995;
Klein, 1995, 2009; von Stutterheim & Klein, 1987) do not directly address the
constructs of explicit and implicit learning. It is clear from the early writing
that researchers regarded the linguistic phenomena under observation to be the
product of acquisition or implicit knowledge (Ellis, 2009). This is even clearer
in light of Klein and Dimroth’s (2009) characterization of instruction as some-
what disadvantageous to acquisition processes found in the communicatively
oriented experiences of untutored learners.
Because the concept-oriented approach is a framework for analysis rather
than a theory (following Klein, 1995), this section evaluates the likelihood
that the phenomena investigated by the seminal research studies were indeed
reflexes of implicit knowledge using Ellis’s (2009) work on explicit and implicit
knowledge as a framework.
The early concept-oriented studies shared at least four design characteristics:
(a) learners were essentially untutored; (b) they often had low literacy skills in
L1 (and L2); (c) they produced both personal narratives and narratives based
on film retells; and (d) all data were oral. Using the categories of degree of
awareness, focus of attention, and utility of metalanguage (Ellis, 2009, p. 47),
the elicited narratives would score high in the implicit knowledge categories
of production by feel for degree of awareness, meaning for focus of attention,
and “no” for utility of metalanguage. In fact, the extended narratives that form
the data for the concept-oriented studies far exceed the focus on meaning and
communication of the read-and-repeat task called “oral narrative” in Ellis’s
(2009) test battery (Bardovi-Harlig, 2013).
In addition, the learners in the early concept-oriented studies (Dietrich
et al., 1995; Klein, 1995, 2009; von Stutterheim & Klein, 1987) and the early
form-oriented studies (Andersen, 1991) were untutored learners, thereby fur-
ther increasing the likelihood that implicit knowledge accounted for observed
patterns of development. Relatively low educational levels and low literacy
levels in L1 and L2 may have further contributed to limiting the potential for
explicit knowledge.
As functional analyses were adopted in other sites, researchers added
instructed learners in the target-language environment and later foreign lan-
guage learners to the learner populations, careful to retain oral, meaning-
oriented communication tasks that have included personal narrative, narrative
retells, and personalized narratives (Bardovi-Harlig, 2000) and sociolinguistic
interviews and danger-of-death stories (Bayley, 1994; see Bardovi-Harlig, 2013,
for an analysis of texts and tasks in tense-aspect research). Given the literacy
levels of the instructed learners and the potential for pronunciation to affect the
oral production of verbal morphology (English past creates consonant clusters
in verbs such as walked which may not be articulated by some learners), written
narratives were added for comparison. As the concepts broadened from past
to futurity, narratives necessarily gave way to elicitation tasks suited to future
expression (Kanwit, 2017; Moses, 2002; Solon & Kanwit, 2014).
The value that functional approaches in general, and the concept-oriented
approach in particular, place on communication heavily favors tasks that tap
implicit knowledge, but as can be seen from the number of variables involved,
does not guarantee it. All approaches to L2 acquisition must consider features of
the task and learner characteristics when designing elicitation tasks.
Conclusion
The study of interlanguage development from a concept-oriented approach
highlights the relationship of the various linguistic devices that learners may
employ to express a given concept. In this chapter, we saw how various linguis-
tic means convey temporal reference, how they relate to each other, and how
the balance changes over time. Because the concept-oriented approach investigates
concepts rather than specific forms, the approach encourages the investigation
of the interlanguage system from the very earliest stages, although it also
allows investigation in advanced stages. It thus allows the L2 acquisition
researcher to document premorphological stages that importantly form part of
the sequences of L2 acquisition. The concept-oriented approach also emphasizes
the investigation of the interlanguage system in its own right. This is important
because target-language orientations, which are more common in the general
L2 acquisition literature, often focus on the distance between a learner’s inter-
language and the target language rather than exploring the emergence, inter-
play, and balance of features of the interlanguage as a linguistic system.
Discussion Questions
1. Compare and contrast the generative approach (White, this volume) with
the functional approach in this chapter. As a starting point, you might con-
sider what counts as evidence within each approach. Are the kinds of evi-
dence similar or different, and what do they suggest about each framework?
2. How does a functionalist approach explain staged development in L2
acquisition?
3. Identify a concept or question that you would like to investigate using a
concept-oriented framework. How would you set up the study? Keep in mind
that you need to determine how learners express the concept, how the various
means of expression interact, and how the expression changes over time.
4. Functionalist approaches are useful for explaining the acquisition of many,
if not all, meaning-based aspects of language. Can you think of areas of
language that are not meaning-based? If so, how would a functionalist
approach account for their acquisition?
5. Review the three stages involved in the expression of temporality in an L2.
What factors could explain why some L2 learners fail to reach the morpho-
logical stage?
6. Read the exemplary study presented in this chapter and prepare a discussion for
class in which you describe how you would conduct a replication study. Be sure
to explain any changes you would make and what motivates such changes.
Notes
Suggested Further Reading
Bardovi-Harlig, K. (2000). Tense and aspect in second language acquisition: Form, meaning,
and use. Oxford, England: Blackwell.
This book synthesizes research on the acquisition of tense and aspect from a
variety of research perspectives, including concept-/meaning-oriented perspectives,
across a variety of target languages. See especially Chapters 2 and 6.
Bardovi-Harlig, K. (2018). Concept-oriented analysis: A reflection on one approach
to studying interlanguage development. In A. Edmonds & A. Gudmestad (Eds.),
Critical reflections on data in second language acquisition (pp. 171–195). Amsterdam,
Netherlands: John Benjamins.
For those interested in going beyond an introduction to the concept-oriented
approach, this chapter discusses some of the methodological and analytical challenges
and rewards of using the concept-oriented approach to study interlanguage temporality.
Becker, A., & Carroll, M. (Eds.). (1997). The acquisition of spatial relations in a second lan-
guage. Amsterdam, Netherlands: John Benjamins.
For readers who would like to see how the concept-oriented framework is
applied to areas beyond the expression of time, this European Science Foundation–
sponsored study (see also Dietrich et al., 1995) reports on the expression of spatial
concepts in five languages.
Cooreman, A., & Kilborn, K. (1991). Functionalist linguistics: Discourse structure and
language processing in second language acquisition. In T. Huebner & C. A. Ferguson
(Eds.), Crosscurrents in second language acquisition and linguistic theories (pp. 195–224).
Amsterdam, Netherlands: John Benjamins.
This article presents another view of functionalism in L2 acquisition research,
including a comparison of functionalism and the competition model (Bates &
MacWhinney, 1987).
Dietrich, R., Klein, W., & Noyau, C. (1995). The acquisition of temporality in a second
language. Amsterdam, Netherlands: John Benjamins.
Sponsored by the European Science Foundation, the study reported here is a
masterful undertaking. The acquisition of temporal expression in five languages
(English, French, German, Swedish, and Dutch) by adult learners is reported on
from a concept-oriented approach. Summary introduction and conclusion chapters
provide an overview of the procedures and results.
von Stutterheim, C., & Klein, W. (1987). A concept-oriented approach to second
language studies. In C. W. Pfaff (Ed.), First and second language acquisition processes
(pp. 191–205). Cambridge, MA: Newbury House.
This seminal article introduces the concept-oriented approach to the L2 acqui-
sition literature in English.
References
Aksu-Koç, A., & von Stutterheim, C. (1994). Temporal relations in narrative: Simulta-
neity. In R. A. Berman & D. I. Slobin (Eds.), Relating events in narrative: A crosslinguis-
tic developmental study (pp. 393–455). Hillsdale, NJ: Lawrence Erlbaum.
Andersen, R. W. (1984). The one-to-one principle of interlanguage construction.
Language Learning, 34, 77–95.
Andersen, R. W. (1990). Models, processes, principles and strategies: Second language
acquisition inside and outside the classroom. In B. VanPatten & J. F. Lee (Eds.),
Second language acquisition—foreign language learning (pp. 45–78). Clevedon, UK:
Multilingual Matters. (Reprinted from IDEAL, 3, 111–138).
Andersen, R. W. (1991). Developmental sequences: The emergence of aspect marking
in second language acquisition. In T. Huebner & C. A. Ferguson (Eds.), Crosscur-
rents in second language acquisition and linguistic theories (pp. 305–324). Amsterdam,
Netherlands: John Benjamins.
Bardovi-Harlig, K. (1992). The use of adverbials and natural order in the development
of temporal expression. IRAL, 30, 299–320.
Bardovi-Harlig, K. (1994). Reverse-order reports and the acquisition of tense: Beyond
the principle of chronological order. Language Learning, 44, 243–282.
Bardovi-Harlig, K. (1995). A narrative perspective on the development of the tense/
aspect system in second language acquisition. Studies in Second Language Acquisition,
17, 263–291.
Bardovi-Harlig, K. (1998). Narrative structure and lexical aspect: Conspiring factors in
second language acquisition of tense-aspect morphology. Studies in Second Language
Acquisition, 20, 471–508.
Bardovi-Harlig, K. (2000). Tense and aspect in second language acquisition: Form, meaning,
and use. Oxford, England: Blackwell.
Bardovi-Harlig, K. (2004). Monopolizing the future OR how the go-future breaks into
will’s territory and what it tells us about SLA. In S. Foster-Cohen (Ed.), EuroSLA
yearbook (pp. 177–201). Amsterdam, Netherlands: John Benjamins.
Bardovi-Harlig, K. (2005). The future of desire: Lexical futures and modality in L2
English future expression. In L. Dekydtspotter, R. A. Sprouse, & A. Liljestrand
(Eds.), Proceedings of the 7th generative approaches to second language acquisition conference
(GASLA 2004) (pp. 1–12). Somerville, MA: Cascadilla Proceedings Project.
Bardovi-Harlig, K. (2013). Research design: From text to task. In R. Salaberry &
L. Comajoan (Eds.), Research design and methodology in studies on second language tense
and aspect (pp. 219–269). Berlin, Germany: Mouton de Gruyter.
Bardovi-Harlig, K. (2014). Documenting interlanguage development. In Z. H. Han
& E. Tarone (Eds.), Interlanguage: Forty years later (pp. 127–146). Amsterdam,
Netherlands: John Benjamins.
Bates, E., & MacWhinney, B. (1981). Second language acquisition from a functionalist
perspective: Pragmatics, semantics and perceptual strategies. In H. Winitz (Ed.),
Annals of the New York Academy of Sciences conference on native language and foreign
language acquisition (pp. 190–214). New York, NY: New York Academy of Sciences.
Bates, E., & MacWhinney, B. (1987). Language universals, individual variation, and
the competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition
(pp. 157–193). Hillsdale, NJ: Lawrence Erlbaum.
Bayley, R. J. (1994). Interlanguage variation and the quantitative paradigm: Past tense
marking in Chinese-English. In S. Gass, A. Cohen, & E. Tarone (Eds.), Research
methodology in second language acquisition (pp. 157–181). Hillsdale, NJ: Lawrence
Erlbaum.
Becker, A., & Carroll, M. (1997). The acquisition of spatial relations in a second language.
Amsterdam, Netherlands: John Benjamins.
Bybee, J. L. (1985). Morphology: A study of the relation between meaning and form. Amsterdam,
Netherlands: John Benjamins.
Bybee, J., Perkins, R., & Pagliuca, W. (1994). The evolution of grammar: Tense, aspect, and
modality in the languages of the world. Chicago, IL: University of Chicago Press.
Chafe, W. (1970). Meaning and structure of language. Chicago, IL: University of Chicago
Press.
Cooreman, A., & Kilborn, K. (1991). Functionalist linguistics: Discourse structure and
language processing in second language acquisition. In T. Huebner & C. A. Ferguson
(Eds.), Crosscurrents in second language acquisition and linguistic theories (pp. 195–224).
Amsterdam, Netherlands: John Benjamins.
Croft, W. (1995). Modern syntactic typology. In M. Shibatani & T. Bynon (Eds.),
Approaches to language typology (pp. 85–144). Oxford, England: Clarendon Press.
Dahl, Ö. (1984). Temporal distance: Remoteness distinctions in tense-aspect systems.
In B. Butterworth, B. Comrie, & Ö. Dahl (Eds.), Explanations for language universals
(pp. 105–122). Berlin, Germany: Mouton de Gruyter.
Dahl, Ö. (1985). Tense and aspect systems. Oxford, England: Basil Blackwell.
Dahl, Ö. (2000). Tense and aspect in the languages of Europe. Berlin, Germany: Mouton
de Gruyter.
Dietrich, R., Klein, W., & Noyau, C. (1995). The acquisition of temporality in a second
language. Amsterdam, Netherlands: John Benjamins.
Dik, S. (1978). Functional grammar. Amsterdam, Netherlands: North-Holland.
Edmonds, A., Gudmestad, A., & Donaldson, B. (2017). A concept-oriented analysis of
future-time reference in native and near-native Hexagonal French. Journal of French
Language Studies, 27, 381–404.
Ellis, N. (2013). Frequency-based grammar and the acquisition of tense and aspect in
L2 learning. In R. Salaberry & L. Comajoan (Eds.), Research design and methodology
in studies on second language tense and aspect (pp. 89–117). Berlin, Germany: Mouton
de Gruyter.
Ellis, R. (2009). Measuring implicit and explicit knowledge of a second language. In
R. Ellis, S. Loewen, C. Elder, R. Erlam, J. Philp, & H. Reinders (Eds.), Implicit and
explicit knowledge in second language learning, testing, and teaching (pp. 31–64). Bristol,
England: Multilingual Matters.
Firbas, J. (1979). A functional view of “ordo naturalis.” Brno Studies in English, 13, 29–59.
Flashner, V. E. (1989). Transfer of aspect in the English oral narratives of native
Russian speakers. In H. Dechert & M. Raupach (Eds.), Transfer in language production
(pp. 71–97). Norwood, NJ: Ablex.
Flecken, M. (2011). Event conceptualization by early Dutch–German bilinguals: Insights
from linguistic and eye-tracking data. Bilingualism: Language and Cognition, 14, 61–77.
Foley, W. A., & Van Valin, R. D., Jr. (1984). Functional syntax and universal grammar.
Cambridge, England: Cambridge University Press.
Gass, S. M. (1987). The resolution of conflicts among competing systems: A bidirec-
tional perspective. Applied Psycholinguistics, 8, 329–350.
Gass, S. M. (1989). Second and foreign language learning: Same, different, or none of
the above? In B. VanPatten & J. F. Lee (Eds.), Second language acquisition—foreign lan-
guage learning (pp. 34–44). Clevedon, England: Multilingual Matters.
Giacalone Ramat, A. (1992). Grammaticalization processes in the area of temporal and
modal relations. Studies in Second Language Acquisition, 14, 297–322.
Gudmestad, A., & Geeslin, K. L. (2011). Assessing the use of multiple forms in variable
contexts: The relationship between linguistic factors and future-time reference in
Spanish. Studies in Hispanic and Lusophone Linguistics, 4, 3–34.
Gudmestad, A., & Geeslin, K. L. (2013). Second-language development of variable
forms of future-time expression in Spanish. In S. Beaudrie & A. M. Carvalho (Eds.),
Selected proceedings of the 6th Workshop on Spanish Sociolinguistics (pp. 63–75). Somer-
ville, MA: Cascadilla Proceedings Project.
Harkness, J. (1987). Time adverbials in English and reference time. In A. Schopf (Ed.),
Essays on tensing in English: Vol. 1. Reference time, tense and adverbs (pp. 71–110).
Tübingen, Germany: Niemeyer.
Hendriks, H. (2000). The acquisition of topic marking in L1 Chinese and L1 and L2
French. Studies in Second Language Acquisition, 22, 369–397.
Hopper, P. J. (1979). Aspect and foregrounding in discourse. In T. Givón (Ed.), Syntax
and semantics: Discourse and syntax (pp. 213–241). New York, NY: Academic Press.
Howard, M. (2012). From tense and aspect to modality—The acquisition of future,
conditional and subjunctive morphology in L2 French: A preliminary study. In
E. Labeau (Ed.), Development of tense, aspect and mood in L1 and L2 (pp. 203–226).
Amsterdam, Netherlands: Rodopi/Cahiers Chronos.
Huebner, T. (1983). A longitudinal analysis of the acquisition of English. Ann Arbor, MI:
Karoma.
Kanwit, M. (2017). What we gain by combining variationist and concept-oriented
approaches: The case of acquiring Spanish future-time expression. Language Learn-
ing, 67, 461–498.
Kanwit, M., & Solon, M. (2013). Acquiring variation in future time expression abroad
in Valencia, Spain and Mérida. In J. E. Aaron, J. Cabrelli Amaro, G. Lord, & A. de
Prada Pérez (Eds.), Selected proceedings of the 16th Hispanic Linguistics Symposium
(pp. 206–221). Somerville, MA: Cascadilla Proceedings Project.
Kilborn, K., & Ito, T. (1989). Sentence processing strategies in adult bilinguals. In
B. MacWhinney & E. Bates (Eds.), The crosslinguistic study of sentence processing
(pp. 257–291). Cambridge, England: Cambridge University Press.
Klein, W. (1986). Second language acquisition (Rev. ed., Bohuslaw Jankowski, Trans.).
Cambridge, England: Cambridge University Press.
Klein, W. (1995). The acquisition of English. In R. Dietrich, W. Klein, & C. Noyau
(Eds.), The acquisition of temporality in a second language (pp. 31–70). Amsterdam,
Netherlands: John Benjamins.
Klein, W. (2009). How time is encoded. In W. Klein & P. Li (Eds.), The expression of time
(pp. 39–82). Berlin, Germany: Mouton de Gruyter.
Klein, W., & Dimroth, C. (2009). Untutored second language acquisition. In W. C.
Ritchie & T. K. Bhatia (Eds.), The new handbook of second language acquisition (2nd rev.
ed., pp. 503–522). Bingley, England: Emerald.
Klein, W., & Li, P. (Eds.). (2009). The expression of time. Berlin, Germany: Mouton de
Gruyter.
Kumpf, L. (1984). Temporal systems and universality in interlanguage: A case study.
In F. Eckman, L. Bell, & D. Nelson (Eds.), Universals of second language acquisition
(pp. 132–143). Rowley, MA: Newbury House.
Kuno, S. (1980). Functional syntax. In E. Moravcsik & J. Wirth (Eds.), Current approaches
to syntax: Vol. 13. Syntax and semantics (pp. 117–135). New York, NY: Academic
Press.
Langacker, R. W. (1987). Foundations of cognitive grammar: Theoretical prerequisites.
Stanford, CA: Stanford University Press.
Leclercq, P. (2009). The influence of L2 French on near native French learners of
English: The case of simultaneity. In E. Labeau & F. Myles (Eds.), The advanced
learner variety: The case of French (pp. 269–289). Oxford, England: Peter Lang.
Lee, J. F. (2002). The incidental acquisition of Spanish future tense morphology through
reading in a second language. Studies in Second Language Acquisition, 24, 55–80.
Lee, J. F., Cadierno, T., Glass, W. R., & VanPatten, B. (1997). The effects of lexical and
grammatical cues on processing past temporal reference in second language input.
Applied Language Learning, 8, 1–23.
Leech, G. N. (1971). Meaning and the English verb. Harlow, England: Longman.
Long, M., & Sato, C. J. (1984). Methodological issues in interlanguage studies: An
interactionist perspective. In A. Davies, C. Criper, & A. P. R. Howatt (Eds.), Inter-
language (pp. 253–279). Edinburgh, Scotland: Edinburgh University Press.
MacWhinney, B. (1987). Applying the competition model to bilingualism. Applied
Psycholinguistics, 8, 315–327.
Meisel, J. (1987). Reference to past events and actions in the development of natural
language acquisition. In C. W. Pfaff (Ed.), First and second language acquisition processes
(pp. 206–224). Cambridge, MA: Newbury House.
Moses, J. (2002). The expression of futurity by English-speaking learners of French (Unpublished
doctoral dissertation). Indiana University, Bloomington.
Musumeci, D. (1989). The ability of second language learners to assign tense at the sentence level
(Unpublished doctoral dissertation). University of Illinois, Urbana-Champaign.
Palmer, F. R. (2001). Mood and modality. Cambridge, England: Cambridge University
Press.
Pienemann, M. (1989). Is language teachable? Psycholinguistic experiments and
hypotheses. Applied Linguistics, 10, 52–79.
Pienemann, M. (1998). Language processing and second language development: Processability
theory. Amsterdam, Netherlands: John Benjamins.
Prince, E. (1981). Toward a taxonomy of given-new information. In P. Cole (Ed.),
Radical pragmatics (pp. 223–255). New York, NY: Academic Press.
Rossomondo, A. E. (2007). The role of lexical temporal indicators and text interaction
format in the incidental acquisition of the Spanish future tense. Studies in Second
Language Acquisition, 29, 39–66.
Rutherford, W. (1983). Language typology and language transfer. In S. Gass & L.
Selinker (Eds.), Language transfer in language learning (pp. 358–370). Rowley, MA:
Newbury House.
Salsbury, T. (2000). The grammaticalization of unreal conditionals: A longitudinal study of L2
English (Unpublished doctoral dissertation). Indiana University, Bloomington.
Sanz, C., & Fernández, M. (1992). L2 learners’ processing of temporal cues in Spanish.
MIT Working Papers in Linguistics, 16, 155–168.
Schmiedtová, B. (2004). At the same time … The expression of simultaneity in learner variet-
ies. Berlin, Germany: Mouton de Gruyter.
Schumann, J. (1987). The expression of temporality in basilang speech. Studies in Second
Language Acquisition, 9, 21–41.
Selinker, L. (1972). Interlanguage. IRAL, 10, 209–231.
Skiba, R., & Dittmar, N. (1992). Pragmatic, semantic, and syntactic constraints and
grammaticalization: A longitudinal perspective. Studies in Second Language Acquisi-
tion, 14, 323–349.
Solon, M., & Kanwit, M. (2014). The emergence of future verbal morphology in Spanish
as a foreign language. Studies in Hispanic and Lusophone Linguistics, 7, 115–148.
Svoboda, A. (1974). On two communicative dynamisms. In F. Daneš (Ed.), Papers on
functional sentence perspective (pp. 38–42). Prague, Czech Republic: Academia Pub-
lishing, Czechoslovak Academy of Sciences.
Thompson, S. (1968). One hundred favorite folktales. Bloomington, IN: Indiana Univer-
sity Press.
Thompson, S. A., & Longacre, R. E. (1985). Adverbial clauses. In T. Shopen (Ed.), Lan-
guage typology and syntactic description (pp. 171–234). Cambridge, England: Cambridge
University Press.
Trévise, A., & Porquier, R. (1986). Second language acquisition by adult immigrants:
Exemplified methodology. Studies in Second Language Acquisition, 8, 265–275.
von Stutterheim, C. (1991). Narrative and description: Temporal reference in second
language acquisition. In T. Huebner & C. A. Ferguson (Eds.), Crosscurrents in sec-
ond language acquisition and linguistic theories (pp. 385–403). Amsterdam, Netherlands:
John Benjamins.
von Stutterheim, C., Andermann, M., Carroll, M., Flecken, M., & Schmiedtová, B.
(2012). How grammaticized concepts shape event conceptualization in language
production: Insights from linguistic analysis, eye tracking data, and memory perfor-
mance. Linguistics, 50, 833–867.
von Stutterheim, C., & Carroll, M. (2006). The impact of grammatical temporal categories
on ultimate attainment in L2 learning. In H. Byrnes, H. Weger-Guntharp, & K. A.
Sprang (Eds.), Educating for advanced foreign language capacities: Constructs, curriculum, instruc-
tion, assessment (pp. 40–53). Washington, DC: Georgetown University Press.
von Stutterheim, C., & Klein, W. (1987). A concept-oriented approach to second
language studies. In C. W. Pfaff (Ed.), First and second language acquisition processes
(pp. 191–205). Cambridge, MA: Newbury House.
von Stutterheim, C., & Lambert, M. (2005). Cross-linguistic analysis of temporal per-
spectives in text production. In H. Hendricks (Ed.), The structure of learner varieties
(pp. 203–230). Berlin, Germany: Mouton de Gruyter.
4
USAGE-BASED APPROACHES
TO L2 ACQUISITION
Nick C. Ellis and Stefanie Wulff
The Theory and Its Constructs
Various approaches to second language acquisition (L2 acquisition) can be
labeled as “usage-based.” What unites these approaches is their commitment to
two working hypotheses:
(1) Language learning is primarily based on learners’ exposure to their second
language (L2) in use, that is, their communicative experience using the L2.
(2) Learners induce the rules of their L2 from this experience by employing
cognitive mechanisms that are not exclusive to language learning, but that
are general cognitive mechanisms at work in any kind of learning, includ-
ing language learning.
We will look at the following major constructs of usage-based approaches to L2
acquisition in more detail:
• Constructions: language learning is the learning of constructions, pairings
of form and meaning or function. Constructions range from simple
morphemes like -ing to complex and abstract syntactic frames such as
Subject-Verb-Object-Object (as in Nick made Steffi a sandwich).
• Associative language learning: learning constructions means learning the
association between form and meaning or function. The more reliable the
association between a form and its meaning or function, the easier it is
to learn. For example, the sound sequence /ˈsæn(d)wɪtʃ/ is reliably asso-
ciated with a particular meaning (“slices of meat and/or cheese between
two slices of bread”). The form -ing, in contrast, has different meaning/
functions in different contexts, making it comparatively harder to learn.
• Rational cognitive processing: language learning is rational such that a
learner’s knowledge of a given form–meaning pair at any point in their
language development is a reflection of how often and in what specific
contexts the learner has encountered that form–meaning pair.
• Exemplar-based learning: language learning is in large parts implicit in
the sense of taking place without the learner being consciously aware of it.
The learner’s brain engages simple learning mechanisms in distributional
analyses of the exemplars of a given form–meaning pair that take various
characteristics of the exemplar into consideration, including how frequent
it is, what kind of words and phrases and larger contexts it occurs with,
and so on.
• Emergent relations and patterns: language learning is a gradual process in
which language emerges as a complex and adaptive (in the sense of contin-
uously fine-tuning) system from the interaction of simple cognitive learn-
ing mechanisms with the input (and in interaction with other speakers in
various social settings).
Constructions
The basic units of language representation are constructions. Constructions
are pairings of form and meaning or function. By that definition, we know
that words like, say, squirrel, must be constructions: a form—that is, a particular
sequence of letters or sounds—is conventionally associated with a meaning (in
the case of squirrel, something like “agile, bushy-tailed, tree-dwelling rodent
that feeds on nuts and seeds”). In Construction Grammar, constructions are not
restricted to the level of words (Goldberg, 2006; Trousdale & Hoffman, 2013).
Instead, these form-function pairings are assumed to pervade all layers of lan-
guage. Morphemes such as -licious (roughly meaning “delightful or extremely
attractive”) are constructions. Idiomatic expressions such as I can’t wrap my head
around this (meaning “I do not fully comprehend this”) are constructions. Even
abstract syntactic frames are constructions: sentences like Nick gave the squirrel
a nut, Steffi gave Nick a hug, or Bill baked Jessica a cake all have a particular form
(Subject-Verb-Object-Object) that, regardless of the specific words that realize
its form, share at least one stable aspect of meaning: something is being trans-
ferred (nuts, hugs, and cakes). Some constructions do not have a meaning in
the traditional sense but serve more functional purposes; passive constructions,
for example, serve to shift what is in attentional focus by defocusing the agent
of the action (compare an active sentence such as Bill baked Jessica a cake with its
passive counterpart A cake was baked for Jessica).
Constructions can be simultaneously represented and stored in multiple
forms and at various levels of abstraction (e.g., table + s = tables; [Noun] +
(morpheme -s) = “plural things”). Ultimately, constructions blur the traditional
distinction between lexicon and grammar. A sentence is not viewed as
the application of grammatical rules to put a number of words obtained from
the lexicon in the right order; a sentence is instead seen as a combination of
constructions, some of which are simple and concrete while others are quite
complex and abstract. For example, What did Nick give the squirrel? comprises
the following constructions:
• Nick, squirrel, give, what, do constructions
• VP, NP constructions
• Subject-Verb-Object-Object construction
• Subject-Auxiliary inversion construction
We can therefore see the language knowledge of an adult as a huge ware-
house of constructions. Constructions vary in their degree of complexity and
abstraction. Some of them can be combined with one another while others can-
not; their combinability largely depends on whether their meanings/functions
are compatible, or can at least be coerced into compatibility, given the spe-
cific context and situation in which a speaker may want to use them together.
The more often a speaker encounters a particular construction, or combination
of constructions, in the input, the more entrenched the (arrangement of)
constructions becomes.
Associative Learning Theory
Constructions that are frequent in the input are processed more easily than
rare constructions are. This empirical fact is compatible with the idea that we
learn language from usage through associative learning (Ellis, 2002). Let’s stick
to words for now, though the same is true for letters, morphemes, syntactic
patterns, and all other types of constructions. Through experience, a learner’s
perceptual system becomes tuned to expect constructions according to their
probability of occurrence in the input, with words like one or won occurring
more frequently than words like seventeen or synecdoche.
When a learner notices a word in the input for the first time, a memory is
formed that binds its features into a unitary representation, such as the phonological
sequence /wʌn/ or the orthographic sequence one. Alongside this representation,
a so-called detector unit is added to the learner’s perceptual system.
The job of the detector unit is to signal the word’s presence whenever its features
are present in the input. Every detector unit has a set resting level of activation
and some threshold level which, when exceeded, will cause the detector to fire.
When the component features are present in the environment, they send acti-
vation to the detector that adds to its resting level, increasing it; if this increase
is sufficient to bring the level above threshold, the detector fires. With each
firing of the detector, the new resting level is slightly higher than the previ-
ous one—the detector is primed. This means it will need less activation from
the environment in order to reach threshold and fire the next time. Priming
events sum to lifespan-practice effects: features that occur frequently acquire
chronically high resting levels. Their resting level of activation is heightened by
the memory of repeated prior activations. Thus, our pattern-recognition units
for higher-frequency words require less evidence from the sensory data before
they reach the threshold necessary for firing.
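The detector mechanism just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model taken from the literature: the class name Detector, the numeric values, and the fixed priming increment are all hypothetical. A constant increment is used only for simplicity; the relationship between frequency and activation is in fact curvilinear.

```python
from dataclasses import dataclass


@dataclass
class Detector:
    """Toy detector unit for one word form (all numbers are illustrative)."""
    resting: float = 0.0     # resting level of activation
    threshold: float = 1.0   # level that must be exceeded for the unit to fire
    priming: float = 0.05    # hypothetical rise in resting level per firing

    def present(self, evidence: float) -> bool:
        """Add sensory evidence to the resting level; fire if above threshold."""
        fired = self.resting + evidence > self.threshold
        if fired:
            # Each firing leaves the resting level slightly higher (priming).
            self.resting += self.priming
        return fired

    def evidence_needed(self) -> float:
        """Activation the input must still supply to reach the threshold."""
        return max(0.0, self.threshold - self.resting)


frequent, rare = Detector(), Detector()
for _ in range(10):      # a high-frequency word such as "one"
    frequent.present(1.2)
rare.present(1.2)        # a low-frequency word such as "synecdoche"

# The frequently fired detector now needs less evidence than the rare one.
print(frequent.evidence_needed(), rare.evidence_needed())
```

The point of the sketch is only the direction of the effect: repeated firing raises the resting level, so less sensory evidence is needed the next time.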
The same is true for the strength of the mappings from form to interpreta-
tion. Each time /wʌn/ is properly interpreted as one, the strength of this con-
nection is incremented. Each time /wʌn/ signals won, this is tallied too, as are
the less frequent occasions when it forewarns of wonderland. Thus, the strengths
of form–meaning associations are summed over experience. The resultant net-
work of associations, a semantic network comprising the structured inventory
of a speaker’s knowledge of language, is tuned such that the spread of activation
upon hearing the formal cue /wʌn/ reflects prior probabilities of its different
interpretations.
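The tallying of form–meaning mappings can be illustrated with a simple counter. The sketch below is an assumption-laden toy: the function names experience and prior are hypothetical, and the counts are invented rather than drawn from any corpus. It shows only how summed experience yields relative strengths for the competing interpretations of one form.

```python
from collections import Counter, defaultdict

# form -> Counter of interpretations (counts are invented for illustration)
associations = defaultdict(Counter)


def experience(form: str, interpretation: str) -> None:
    """Increment the strength of one form-interpretation mapping."""
    associations[form][interpretation] += 1


def prior(form: str) -> dict:
    """Relative strength of each interpretation, summed over experience."""
    counts = associations[form]
    total = sum(counts.values())
    if total == 0:
        return {}  # the form has never been encountered
    return {meaning: n / total for meaning, n in counts.items()}


for meaning, times in [("one", 6), ("won", 3), ("wonderland", 1)]:
    for _ in range(times):
        experience("wʌn", meaning)

print(prior("wʌn"))
```

With these toy counts, the interpretation one receives the largest share of activation, won a smaller share, and wonderland the smallest.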
Many additional factors qualify this simple picture. First, the relationship
between frequency of usage and activation threshold is not linear but follows a
curvilinear “power law of practice” whereby the effects of practice are greatest
at early stages of learning, but eventually reach asymptote (see DeKeyser, this
volume). Second, the amount of learning induced from an experience of a con-
struction depends upon the salience of the form (i.e., how much it stands out
relative to its context) and the importance of understanding it correctly (Ellis,
2017; Wulff & Ellis, 2018). Third, the learning of a construction is interfered
with if the learner already knows another form that cues that interpretation, or,
conversely, if the learner knows another interpretation for that form. Fourth, a
construction may provide a partial specification of the structure of an utterance,
and hence an utterance’s structure is specified by a number of distinct construc-
tions which must be collectively interpreted. Some cues are much more reliable
signals of an interpretation than others, and it is not just first-order probabilities
that are important—sequential probabilities matter a great deal as well, because
context qualifies interpretation. For example, the interpretation of /wʌn/ in
the context Alice in wun … is already clear after the learner has heard Alice in …;
in other words, Alice in and /wʌn/ are highly reliably associated with each
other. If a sentence starts out with I /wʌn/ …, in contrast, several competing
interpretations are co-activated (I wonder …, I won …, I once …, etc.) because
the first person pronoun I is a much less reliable cue for the interpretation of
/wʌn/ than Alice in.
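The role of context can be made concrete by conditioning the tallies on what precedes the cue. The following sketch is illustrative only: the data are invented, and the function name p_interpretation is hypothetical. It estimates the probability of each interpretation of /wʌn/ given the preceding context.

```python
from collections import Counter, defaultdict

# (preceding context, interpretation of /wʌn/) pairs; invented toy data
heard = (
    [("Alice in", "wonderland")] * 5
    + [("I", "wonder")] * 4
    + [("I", "won")] * 3
    + [("I", "once")] * 3
)

by_context = defaultdict(Counter)
for context, interpretation in heard:
    by_context[context][interpretation] += 1


def p_interpretation(context: str) -> dict:
    """P(interpretation | context), estimated from the tallies."""
    counts = by_context[context]
    total = sum(counts.values())
    return {m: n / total for m, n in counts.items()} if total else {}


print(p_interpretation("Alice in"))  # a single interpretation dominates
print(p_interpretation("I"))         # several interpretations compete
```

In this toy data set, the context Alice in leaves one candidate, whereas the context I spreads probability over three competitors.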
Rational Language Processing
These associative underpinnings allow language users to be rational in the
sense of having a mental model of their language that is custom-fit to their
linguistic experience at any given time (Ellis, 2002, 2006a). The words that
they are likely to hear next, the most likely senses of these words, the linguistic
constructions they are most likely to utter next, the syllables they are likely
to hear next, the graphemes they are likely to read next, the interpretations
that are most relevant, and the rest of what’s coming …? (next) across all levels
of language representation, are made readily available to them by their lan-
guage processing systems. Their unconscious language representation systems
are adaptively tuned to predict the linguistic constructions that are most likely
to be relevant in the ongoing discourse context, optimally preparing them for
comprehension and production. As a field of research, the rational analysis
of cognition is guided by the principle that human psychology can be under-
stood in terms of the operation of a mechanism that is optimally adapted to its
environment in the sense that the behavior of the mechanism is as efficient as
it conceivably could be, given the structure of the problem space and the cue-
interpretation mappings it must solve (Anderson, 1989).
Exemplar-Based Learning
Much of our language use is formulaic, that is, we recycle phrasal constructions
that we have memorized from prior use (Wulff, 2008; Wulff & Ellis, 2018).
However, we are obviously not limited to these constructions in our language
processing. Some constructions are a little more open in scope, like the slot-
and-frame greeting pattern [Good + (time-of-day)] which generates examples
like Good morning and Good afternoon. Others still are abstract, broad-ranging,
and generative, such as the schemata that represent more complex morphologi-
cal (e.g., [NounStem-PL]), syntactic (e.g., [Adj Noun]), and rhetorical (e.g., the
iterative listing structure, [the (), the (), the (), …, together they …]) patterns.
Usage-based theories investigate how the acquisition of these productive pat-
terns, generative schema, and other rule-like regularities of language is based
on exemplars. Every time the language learner encounters an exemplar of
a construction, the language system compares this exemplar with memories
of previous encounters of either the same or a sufficiently similar exemplar to
retrieve the correct interpretation. According to exemplar theory, constructions
such as [Good + (time-of-day)], [Adj Noun], or [NounStem-PL] all gradually
emerge over time as the learner’s language system, processing exemplar after
exemplar, identifies the regularities that exemplars share and makes the corre-
sponding abstractions.
The Associative Bases of Abstraction
Prototypes, the exemplars that are most typical of their categories, are those
that are similar to many members of their category but not similar to members of
other categories. People more quickly classify sparrows as birds (or other average
sized, average colored, average beaked, average featured specimens) than
they do birds with less common features or feature combinations, like geese or
albatrosses. They do so on the basis of an unconscious frequency analysis of the
birds they have known (their usage history) with the prototype that reflects the
central tendencies of the distributions of the relevant features of these memorized
exemplars. We don’t walk around consciously counting these features,
yet we have very accurate knowledge of the underlying distributions and
their most usual settings.
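One simple way to picture a prototype as a central tendency over stored exemplars is sketched below. Everything in it is an assumption made for illustration: the two features, their values, the encounter counts, and the use of Euclidean distance as a stand-in for typicality are not taken from any study.

```python
from math import dist
from statistics import mean

# Hypothetical features per kind of bird: (body length in cm, wingspan in cm)
features = {
    "sparrow": (15, 22),
    "robin": (14, 21),
    "finch": (13, 22),
    "goose": (85, 160),
    "albatross": (110, 300),
}
# Hypothetical usage history: how often each kind has been encountered
encounters = {"sparrow": 50, "robin": 30, "finch": 16, "goose": 3, "albatross": 1}

# Memory holds one stored exemplar per encounter.
memory = [features[kind] for kind, n in encounters.items() for _ in range(n)]

# Prototype: the central tendency of each feature across all stored exemplars.
prototype = tuple(mean(column) for column in zip(*memory))


def atypicality(kind: str) -> float:
    """Euclidean distance from the prototype; larger means less typical."""
    return dist(features[kind], prototype)


ranking = sorted(features, key=atypicality)
print(prototype)
print(ranking)  # most typical first
```

Because the frequently encountered small birds dominate memory, the prototype lies close to them, and the goose and the albatross end up far from it.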
We are really good at this. Research in cognitive psychology demonstrates
that such implicit tallying is the raw basis of human pattern recognition, cat-
egorization, and rational cognition. As the world is classified, so language is
classified. As for the birds, so for their plural forms. In fact, world and lan-
guage categorization go hand in hand: Psycholinguistic research demonstrates
that people are faster at generating plurals for the prototype or default case that
is exemplified by many types and are slower and less accurate at generating
“irregular” plurals, the ones that go against the central tendency and that are
exemplified by fewer types, such as [plural + “NounStem s” = “NounStems-es”]
or, worse still, [plural + moose =?], [plural + noose =?], [plural + goose =?].
These examples make it clear that there is no 1:1 mapping between cues
and their outcome interpretations. Associative learning theory demonstrates
that the more reliable the mapping between a cue and its outcome, the more
readily it is learned. Consider an ESL learner trying to learn from natural-
istic input what -s at the ends of words might signify. This particular form
has several potential interpretations: It could be the plural (squirrels), it could
indicate possession (Nick’s hat), it could mark third person singular present
(Steffi sleeps), and so on. Therefore, if we evaluate -s as a cue for any one of
these outcomes, it is clear that the cue will be abundantly frequent in learners’
input, yet the cue is not reliably associated with any one interpretation or
outcome. A similar picture emerges when we reverse the directionality of
our thinking: plural -s, third person singular present -s, and possessive -s
all have variant expression as the allomorphs [s], [z], and [ɨz]. Thus, if we
evaluate just one of these, say, [ɨz], as a cue for one particular outcome, say,
plurality, then it is clear that there are many instances of that outcome in
the absence of the cue. Such contingency analysis of the reliabilities of the
cue-interpretation associations suggests that they will not be readily learn-
able. High-frequency grammatical functors are often highly ambiguous in
their interpretations (Goldschneider & DeKeyser, 2001).
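A contingency analysis of this kind can be sketched with a two-by-two table of cue and outcome counts. The measure used below, ΔP (the probability of the outcome given the cue minus its probability in the absence of the cue), is one common index of contingency in associative learning research; the passage above does not name a specific measure, so its use here is an assumption, and all counts are invented.

```python
def delta_p(a: int, b: int, c: int, d: int) -> float:
    """Delta P = P(outcome | cue) - P(outcome | no cue).

    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent,  outcome present    d: cue absent,  outcome absent
    """
    return a / (a + b) - c / (c + d)


# Invented counts over 1,000 word tokens, for illustration only.
# Cue: word-final -s. Outcome: plural meaning.
# Many -s tokens are not plural (possessive, third person singular),
# and some plurals occur without -s.
s_as_plural = delta_p(a=60, b=140, c=20, d=780)

# Cue: the word form "sandwich". Outcome: the sandwich meaning.
sandwich = delta_p(a=10, b=0, c=0, d=990)

print(s_as_plural, sandwich)
```

With these toy numbers, the lexical cue is perfectly contingent on its outcome, while -s is a frequent but much weaker predictor of plurality.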
Connectionism is one strand of research in L2 acquisition that seeks to
investigate how simple associative learning mechanisms such as the kind of
contingency analysis mentioned earlier meet the complex language evidence
available to a learner in their input and output. The term “connectionist”
reflects the idea that mental and behavioral models are in essence intercon-
nected networks of simple units. Connectionist models are typically run as com-
puter simulations. The simulations are data-rich and process-light: Massively
parallel systems of artificial neurons use simple learning processes to statistically
generalize over masses of input data. It is important that the input data is
representative of learners’ usage history, which is why connectionist and other
input-influenced research rests heavily on large-scale, maximally representa-
tive digital collections of authentic language (these are often called databanks
or corpora). Connectionist simulations show how prototypes emerge as the
prominent underlying structural regularity in the whole problem space, and
how minority subpatterns of inflection regularity, such as the English plural
subpatterns discussed earlier (or the much richer varieties of the German plural
system, for example), also emerge as smaller, less powerful attractors. Connec-
tionism provides the computational framework for testing usage-based theories
as simulations, for investigating how patterns appear from the interactions of
many language parts.
Emergent Relations and Patterns
Complex systems are those that involve the interactions of many different parts,
such as ecosystems, economies, and societies. All complex systems share the
key aspect that many of their systematicities are emergent: They develop over
time in complex, sometimes surprising, dynamic, and adaptive ways. Com-
plexity arises from the interactions of learners and problems too. Consider the
path of an ant making its homeward journey on a pebbled beach. The path
seems complicated as the ant probes, doubles back, circumnavigates, and zig-
zags. But these actions are not deep and mysterious manifestations of intellec-
tual power. Instead, the control decisions are simple and few in number. An
environment-driven problem solver often produces behavior that is complex
because it relates to a complex environment.
Language is a complex adaptive system (Beckner et al., 2009; Ellis &
Larsen-Freeman, 2009; see also Larsen-Freeman, this volume). It comprises
the interactions of many players: People who want to communicate and a world
to be talked about. It operates across many different levels (neurons, brains,
and bodies; phonemes, constructions, interactions, and discourses), different
human conglomerations (individuals, social groups, networks, and cultures),
and different timescales (evolutionary, epigenetic, ontogenetic, interactional,
neurosynchronic, diachronic) (MacWhinney & O’Grady, 2015).
Emergentists believe that simple learning mechanisms, operating in and
across the human systems for perception, motor-action and cognition
as they are exposed to language data as part of a communicatively-rich
human social environment by an organism eager to exploit the function-
ality of language, suffice to drive the emergence of complex language
representations.
(Ellis, 1998, p. 657)
Two Languages and Language Transfer
Our neural apparatus is highly plastic in its initial state. It is not entirely an
empty slate, since there are broad genetic constraints on the usual networks of
system-level connections and on the broad timetable of maturation. Nevertheless,
the cortex of the brain is broadly equipotent in the types of information it can represent
(Elman et al., 1996). From this starting point, the brain quickly responds to the
input patterns it receives, and through associative learning, it optimizes its rep-
resentations to model the particular world of an individual’s experience. The
term “neural plasticity” summarizes the fact that the brain is tuned by experi-
ence. Our neural endowment provides a general-purpose cognitive apparatus
that, constrained by the makeup of our human bodies, filters and determines
our experiences. In the first few years of life, the human learning mechanism
optimizes its representations of the first language (L1) being learned. Thousands
of hours of L1 processing tune the system to the cues of the L1 and automatize
its recognition and production. It is impressive how rapidly we start tuning into
our ambient language and disregarding cues that are not relevant to it (Kuhl,
2004). One result of this process is that the initial state for L2 acquisition is no
longer a plastic system; it is one that is already tuned and committed to the L1.
Our later experience is shaded by prior associations; it is perceived through the
memories of what has gone before. Since the optimal representations for the L2
do not match those of the L1, L2 acquisition is impacted by various types of L1
interference. Transfer phenomena pervade L2 acquisition (Flege, 2002; Jarvis &
Pavlenko, 2008; Lado, 1957; MacWhinney, 1997; Odlin, 1989; Weinreich, 1953).
Associative Aspects of Transfer: Learned Attention and Interference
Associative learning provides the rational mechanisms for L1 acquisition from
linguistic usage and its analysis—allowing just about every human being to
acquire fluency in their native tongue. Yet although L2 learners too are sur-
rounded by language, not all of it “goes in,” and L2 acquisition is typically
limited in success. This is Corder’s distinction between input, the available
target language, and intake, that subset of input that actually gets in and that the
learner utilizes in some way (Corder, 1967). Does this mean that L2 acquisition
cannot be understood according to the general principles of associative learn-
ing? If L1 acquisition is rational, is L2 acquisition fundamentally irrational?
No. Paradoxically perhaps, it is the very achievements of L1 acquisition that
limit the input analysis of the L2. Associative learning theory explains these
limitations too, because associative learning in animals and humans alike is
affected by what is called learned attention.
We can consider just one example of learned attention here. Many gram-
matical form–meaning relationships are both low in salience and redundant
in the understanding of the meaning of an utterance. It is often unnecessary,
for instance, to interpret inflections that mark grammatical meanings such as
tense because they are usually accompanied by adverbs that indicate the tem-
poral reference: “if the learner knows the French word for ‘yesterday,’ then in
the utterance Hier nous sommes allés au cinéma (Yesterday we went to the mov-
ies) both the auxiliary and past participle are redundant past markers” (Terrell,
1991, p. 59). This redundancy is much more influential in L2 acquisition than L1
acquisition. Children learning their native language only acquire the meanings
of temporal adverbs quite late in development. But L2 learners already know
about adverbs from their L1 experience, and adverbs are both salient and reliable
in their communicative functions, while tense markers are neither (Ellis, 2017;
see also VanPatten, this volume). Thus, the L2 expression of temporal reference
begins with a phase where reference is established by adverbials alone, and the
grammatical expression of tense and aspect thereafter emerges only slowly if at
all (Bardovi-Harlig, 2000; see also Bardovi-Harlig, this volume).
This is an example of the associative learning phenomenon of “blocking,”
where redundant cues are overshadowed because the learners’ L1 experience
leads them to look elsewhere for the cues to interpretation (Ellis, 2006b;
Wulff & Ellis, 2018). Under normal L1 circumstances, usage optimally tunes
the language system to the input; under these circumstances of low salience of
L2 form and blocking, however, all the extra input in the world can sum to
nothing, with interlanguage sometimes being described as having “fossilized”
(Han & Odlin, 2006). Untutored adult associative L2 learning from naturalistic
usage can thus stabilize at a “Basic Variety” of interlanguage which, although
sufficient for everyday communicative purposes, predominantly comprises
just nouns, verbs, and adverbs, with little or no functional inflection and with
closed-class items, in particular determiners, subordinating elements, and prep-
ositions, being rare or not present at all (Klein, 1998).
The usual social-interactional or pedagogical reactions to such nonnative-like
utterances involve an interaction partner (Long, 1983; Mackey, Abbuhl, &
Gass, 2011) or instructor (Doughty & Williams, 1998) who intentionally brings
additional evidence to the learner’s attention by some means of attentional
focus that helps the learner to “notice” the cue (Schmidt, 2001). This way, L2
acquisition can be freed from the bounds of L1-induced selective attention: a
focus on form is provided in social interaction (Tarone, 1997; see also Lantolf,
Thorne, & Poehner, this volume) that recruits the learner’s explicit conscious
processing. We might say that the input to the associative network is “socially
gated” (Cadierno & Eskildsen, 2015; Ellis, 2015, 2019; Hulstijn et al., 2014).
What Counts as Evidence?
Like other enterprises in cognitive science and cognitive neuroscience,
usage-based approaches are not restricted to one specific research methodology
or evidential source. Indeed, different approaches require different methods,
and often a combination of different qualitative and quantitative methods. As
mentioned earlier, many usage-based analyses employ data from large digitized
collections of language, so-called corpora; computational modeling is at the
heart of rational cognition analysis, exemplar theory, and emergentist anal-
yses alike. Other relevant research methods include classroom field research,
psycholinguistic studies of processing, and dense longitudinal recording (that
is, recordings that capture learners’ development at dense intervals over a pro-
longed period of time).
Corpus-based analysis constitutes a growing trend across
usage-based paradigms (McEnery & Hardie, 2012; Sinclair, 1991). If language
learning is in the social-cognitive linguistic moment of usage, we need to cap-
ture all these moments so that we can objectively study them. We need large,
dense, longitudinal corpora of language use, with audio, video, transcriptions,
and multiple layers of annotation, for data sharing in open archives. We need
recordings at short time intervals so that we can chart learners’ usage
history and their development (Tomasello & Stahl, 2004). We need them in
sufficient detail that we can engage in detailed analyses of the processes of
interaction (Kasper & Wagner, 2011). MacWhinney (2000) has long been
working toward these ends with CHILDES, a corpus of L1 acquisition data,
and later with TalkBank, a corpus that also covers language data from L2 learners.
Alongside these and other corpora, a growing number of computer tools
are becoming available that assist the researcher in analyzing corpus data. These
corpus tools can help researchers interested in the most diverse areas of L2
acquisition by covering the full range from qualitative data analysis, such as a
fine-grained conversation analysis of individual corpus files (say, a transcribed
conversation between a student and an ESL teacher), to semi-quantitative anal-
ysis of a representative sample of attestations of a particular phenomenon (such
as the use of the -ing morpheme by English language learners), to large-scale
quantitative analysis of distributional patterns (e.g., the association strength
between verbs and the larger constructions they occur in; see the exemplary
study in this chapter or Gries & Wulff, 2005, 2009, for examples).
What Are Some Common Misunderstandings
about the Theory?
Broad frameworks, particularly those that revive elements of no-longer-
fashionable theories such as behaviorism or structuralist approaches to linguis-
tics, open the potential for misunderstanding. Common misconceptions include
that connectionism is the new behaviorism, that connectionist models cannot
explain creativity and have no regard for internal representation, and that cogni-
tive approaches deny influence of social factors, motivational aspects, and other
individual differences between learners. At the heart of most of these misun-
derstandings is the idea that usage-based analyses only do number-crunching,
with too much of a focus on the effects that the frequency of constructions and
other cues play in the learning process. While it is true that most usage-based
approaches will discuss frequency as one of several factors, no usage-based the-
orist would claim that frequency is the only factor impacting L2 acquisition. In
fact, there is a lively debate among usage-based theorists about the exact role
frequency effects play in what is conceived of as a complex network of factors
that can mute and amplify each other in complex ways (Ellis & Larsen-Freeman,
2006). At an even more fundamental level, what constitutes a frequency effect in
the first place is a question we are far from having answered. Without going into
too much detail here, there is ample empirical evidence, for instance, that we
cannot always define a frequency effect by the rule “the more frequent, the more
salient/important/relevant”—by that rationale, English articles and prepositions,
which are the most frequent words in the English language, should not pose such
an obstinate challenge to the average language learner! Instead, it seems that
frequency effects come in different kinds (as absolute frequencies, ratios, associ-
ation strengths, and other distributional patterns), and they will have differently
weighted impacts depending on the target structure under examination, and,
crucially, depending on the state of the learner’s language development (Ellis,
2017; Wulff & Ellis, 2018). An emergentist/complex systems approach views L2
acquisition as a dynamic process in which regularities and systems emerge from
many of the processes covered in this volume—from the interaction of people,
brains, selves, societies, and cultures using languages in the world (Beckner
et al., 2009; Ellis, 2008, 2019)—while at the same time investigating component
processes in a rigorous fashion.
An Exemplary Study: Ellis, O’Donnell, and Römer (2014a)
Research Questions
While previous studies were able to demonstrate that frequency, prototypicality,
and contingency are factors that impact L2 learners’ constructional knowledge,
most of these studies have considered only one of these factors at a time. This study
wanted to determine whether and how these factors jointly affect L2 learners’ con-
structional knowledge. The specific kind of constructions this study focused on are
so-called verb-argument constructions (VACs). VACs are semi-abstract patterns
that comprise verbs and the arguments they occur with, such as “V across N” or “V
of N”; in this study, the authors examined VACs that another team of researchers
previously identified using corpus analysis (Francis, Hunston, & Manning, 1996).
Methods
One hundred thirty-one German, 131 Spanish, and 131 Czech advanced L2
learners of English as well as 131 native speakers of English were engaged in a
free association task: They were shown 40 VAC frames such as “V across N” or
“it V of the N” and asked to fill in the verb slot with the first word that came to
mind. The learners’ responses were compared with results obtained from two
native speaker databases. To get an impression of the frequencies with which
different verbs occur in the VACs examined, and to calculate how strongly each
verb is associated with the individual VACs, the authors consulted the British
National Corpus (BNC). The BNC is a 100-million-word corpus of British
English that strives to be representative of language use across different registers
and genres. To obtain the verb type frequencies, one can simply run a search for
the VACs in the BNC and count how often each verb type occurs. To calculate
the association strength between each verb type and each VAC, the authors
used a specific association measure called DeltaP (for more information on how
DeltaP works, see Ellis, 2006a). To see how prototypical the verbs selected by
the participants would be for each VAC, the authors consulted a second data-
base called WordNet (Miller, 2009). WordNet is a lexical database, so unlike
the BNC, it is not a collection of cohesive and complete texts and dialogues
but rather a thesaurus-like database that groups words together based on their
meanings. Using sophisticated computational techniques, the authors used this
information to generate semantic networks for each of the VACs examined. For
the “V across N” VAC, for instance, the verbs in the center of the network are
go, move, run, and travel, while verbs like shout, splash, and echo constitute less
prototypical verbs in that VAC.
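The two corpus-derived measures described above, verb type frequency and DeltaP, can be illustrated with a minimal sketch. It assumes the standard one-way contingency definition of DeltaP, P(outcome | cue) minus P(outcome | no cue), computed from a 2-by-2 table of co-occurrence counts. All counts and the helper names (verb_type_frequencies, delta_p) are hypothetical and invented for illustration; they are not BNC figures and not the authors' own code.

```python
from collections import Counter


def verb_type_frequencies(hits):
    """Count how often each verb type fills the V slot of one VAC."""
    return Counter(hits)


def delta_p(a, b, c, d):
    """One-way contingency: P(outcome | cue) - P(outcome | no cue).

    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent,  outcome present    d: cue absent,  outcome absent
    """
    return a / (a + b) - c / (c + d)


# Hypothetical verbs retrieved for the frame "V across N" (toy data, NOT from the BNC).
vac_hits = ["go", "go", "go", "run", "run", "walk", "shout", "go"]
freqs = verb_type_frequencies(vac_hits)

# Assumed totals for the toy corpus.
corpus_verb_tokens = 1000   # all verb tokens in the toy corpus
go_total = 50               # all tokens of "go", inside or outside the VAC

a = freqs["go"]                          # "go" inside the VAC
b = len(vac_hits) - a                    # other verbs inside the VAC
c = go_total - a                         # "go" outside the VAC
d = corpus_verb_tokens - a - b - c       # other verbs outside the VAC

# DeltaP is directional, so the two directions generally differ.
dp_vac_to_verb = delta_p(a, b, c, d)     # VAC as cue, verb as outcome
dp_verb_to_vac = delta_p(a, c, b, d)     # verb as cue, VAC as outcome

print(freqs.most_common(3))
print(round(dp_vac_to_verb, 3), round(dp_verb_to_vac, 3))
```

In this toy table the VAC predicts the verb far better than the verb predicts the VAC, which shows why the direction of the association has to be stated whenever a DeltaP score is reported.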
Main Findings
A multifactorial analysis (i.e., a statistical technique that can gauge the impact of more than
one factor on a specific outcome at a time; in our example, it measured the
potential impact of frequency, prototypicality, and contingency on speakers’
associations) revealed that for all of the VACs examined, each factor made an
independent contribution to learners’ and native speakers’ associations:
1. The more frequently a particular verb occurred in a specific VAC in the
native speaker corpus data, the more likely it was elicited as a response for
that VAC in the word association experiment.
2. The more strongly a verb and a VAC were associated with each other
as expressed in their DeltaP association scores, the more likely that
verb was elicited as a response for that VAC in the word association
experiment.
3. The more prototypical a verb was for the VAC as indicated by its position
in the semantic networks the authors generated for each VAC, the more
likely it was elicited as a response for that VAC in the word association
experiment.
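What it means for each factor to make an independent contribution can be sketched with a multiple regression. The chapter describes the analysis only as multifactorial, so ordinary least squares is used here as a stand-in and should not be read as the authors' exact model. The data are synthetic: the response for six hypothetical verbs is built from known weights on three invented predictors (frequency, contingency, prototypicality), so that the fit can be checked against those weights. The function names solve and ols are assumptions of this sketch.

```python
def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y.

    An intercept column of ones is prepended, so beta[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic predictors per verb: (frequency, contingency, prototypicality).
# These numbers are invented for illustration only.
X = [
    (3.0, 0.40, 0.9),
    (2.5, 0.10, 0.7),
    (1.0, 0.30, 0.2),
    (0.5, 0.05, 0.6),
    (2.0, 0.20, 0.1),
    (1.5, 0.35, 0.8),
]

# The response is generated from known weights, so every predictor contributes.
true_weights = (1.0, 2.0, 3.0, 0.5)  # intercept, frequency, contingency, prototypicality
y = [true_weights[0] + sum(w * x for w, x in zip(true_weights[1:], row)) for row in X]

beta = ols(X, y)
print([round(b, 3) for b in beta])
```

Because the three predictors are entered together, each estimated weight reflects the effect of one factor with the other two held constant, which is the sense in which a factor contributes independently.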
Theoretical Implications
Based on the statistical analyses, the authors concluded that advanced L2 learn-
ers’ knowledge of VACs involves rich associations that are very similar in kind
and strength to those of native speakers (Ellis, O’Donnell, & Römer, 2014a,
2014b). The word associations generated in the experiment testified to learn-
ers having rich associations for VACs that are tuned by verb frequency, verb
prototypicality, and verb-VAC contingency alike—factors that, in combina-
tion, interface across syntax and semantics.
Why/How This Theory Provides an Adequate Explanation
of Observable Phenomena in L2 Acquisition
Observation 1: Exposure to input is necessary for L2 acquisition. Usage-based
approaches are input driven, emphasizing the associative learning of construc-
tions from input. As with other statistical estimations, a large and representative
sample of language is required for the learner to abstract a rational model that is a
good fit to the language data. Usage is necessary, and it is sufficient for successful
L1 acquisition though not for L2 acquisition. This is because the initial state for
L2 acquisition is knowledge of an L1, and the learner’s representations, process-
ing routines, and attention to language are tuned and committed to the L1.
Observation 2: A good deal of L2 acquisition happens incidentally. The majority of
language learning is implicit. Implicit tallying is the raw basis of human pattern
recognition, categorization, and rational cognition. All of the counting that
underpins the setting of thresholds and the tuning of the system to the proba-
bilities of the input evidence is unconscious, as is the emergence of structural
regularities, prototypes, attractors, and other system regularities. At any one
point we are conscious of one particular communicative meaning, yet mean-
while the cognitive operations involved in each of these usages are tuning the
system without us being aware of it (Ellis, 2002). We know (or can be shown
to be sensitive to in our processing) far too many linguistic regularities for us to
have explicitly learned them. Usage-based approaches maintain that incidental
associative learning provides the rational mechanisms and is sufficient for L1
acquisition from input analysis and usage, allowing just about every human
being to acquire fluency in their native tongue. These mechanisms do not suffice
for L2 acquisition because of learned attention.
Observation 3: Learners come to know more than what they have been exposed
to in the input. The study of implicit human cognition shows us to know far
more about the world than we have been exposed to or have been explicitly
taught. Prototype effects are one clear and ubiquitous example of this: learn-
ers who have never been exposed to the prototype of a category nevertheless
classify it faster and more accurately than examples further from the central
tendency, and name it with the category label with great facility. The same is
true for language, where learners go beyond the input in producing U-shaped
learning, with novel errors (like goed instead of went) and other systematici-
ties of stages of interlanguage development in L2 acquisition, for example, of
negation or question formation. These creations demonstrate that the learners’
language system is constantly engaged in making generalizations and finding
abstractions of systematicities.
Observation 4: Learners’ output (speech) often follows predictable paths with predictable
stages in the acquisition of a given structure. As in L1 acquisition, L2 acquisition is
characterized not by complete idiosyncrasy or variability but rather by predictable
errors and stages during the course of development: interlanguage is systematic.
Usage-based approaches hold that these systematicities arise from regularities in
the input: For example, constructions that are much more frequent, that are con-
sistent in their mappings and exhibit high contingency, that have many friends
(constructions that behave in a similar way) of like-type, and that are salient are
likely to be acquired earlier than those that do not have these features (Ellis, 2007).
Observation 6: Second language learning is variable across linguistic subsystems. The
learners’ mental lexicon is diverse in its contents, spanning lexical, morpholog-
ical, syntactic, phonological, pragmatic, and sociolinguistic knowledge. Within
any of these areas of language, learners may master some structures before they
acquire others. Such variability is a natural consequence of input factors such
as exemplar type and token frequency, recency, context, salience, contingency,
regularity, and reliability, along with the various other associative learning fac-
tors that affect the emergence of attractors in the problem space. Some aspects
of these problem spaces are simpler than others. Second language learning is a
piecemeal development from a database of exemplars with patterns of regular-
ity emerging dynamically.
Observation 7: There are limits on the effects of frequency on L2 acquisition. This is
explicitly addressed above under Associative aspects of transfer: learned atten-
tion and interference.
Observation 8: There are limits on the effect of a learner’s first language on L2 acqui-
sition. The effect of a learner’s L1 is no longer considered the exclusive deter-
minant of L2 acquisition as proposed in the Contrastive Analysis Hypothesis.
Usage-based accounts see the major driving force of language acquisition to be
the constructions of the target language and the learner’s experience of these
constructions. However, the significance of transfer from L1 in the L2 learning
process is uncontroversial. As we explained earlier under “Two Languages and
Language Transfer,” at every level of language, there is L1 influence, both neg-
ative and positive. The various cross-linguistic phenomena of learned selective
attention, overshadowing and blocking, latent inhibition, perceptual learning,
interference, and other effects of salience, transfer, and inhibition all filter and
color the perception of the L2. So, usage-based accounts of L2 acquisition look at
the effects of both L1 and L2 usage upon L2 acquisition (Robinson & Ellis, 2008).
Observation 9: There are limits on the effects of instruction on L2 acquisition.
L1-tuned learned attention limits the amount of intake from L2 input, thus
restricting the endstate of L2 acquisition. Attention to language form is some-
times necessary to allow learners to notice some blocked, overshadowed, or
otherwise nonsalient aspect of the language form. Reviews of the empirical
studies of instruction demonstrate that social recruitment of learners’ con-
scious, explicit learning processes can be effective.
However, instruction is not always effective. Any classroom teacher can
provide anecdotal evidence that what is taught is not always learned. But this
observation can be made for all aspects of the curriculum, not just language.
Explicit knowledge about language is of a different stuff from that of the
implicit representational systems, and it need not impact upon acquisition for
a large variety of reasons. Explicit instruction can be ill-timed and out of syn-
chrony with development (Pienemann, 1998; see also Pienemann & Lenzing,
this volume); it can be confusing; it can be easily forgotten; it can be dissociated
from usage, lacking in transfer-appropriateness and thus never brought to bear
so as to tune attention to the relevant input features during usage; it can be
unmotivating; it can fail in so many ways.
The Explicit/Implicit Debate
Learning a new symbol, for example, the French word étoile for a *, initially
involves explicit learning: you are consciously aware of the fact that you did
not know the French word for “star” before, and that now you do (Ellis, 1994).
Some facts about how to use étoile properly you may not know yet, such as
its proper pronunciation, its grammatical gender, synonymous forms, words,
phrases, and idioms that étoile is associated with. Some of these facts you will
learn by making a conscious effort, that is, via explicit learning; other facts
you will not consciously figure out but rather learn implicitly. Without you
being aware of it, your language system is hard at work, upon each subsequent
encounter of étoile, to fill in these knowledge gaps and fine-tune the mental
representation you have for this construction.
Despite the fact that many of us go to great lengths to engage in explicit
language learning, the bulk of language acquisition is implicit learning from
usage. Most knowledge is tacit knowledge; most learning is implicit; the vast
majority of our cognitive processing is unconscious. Implicit learning supplies
a distributional analysis of the problem space: our language system implicitly
figures out how likely a given construction is in particular contexts, how often
it instantiates one sense or another, how these senses are in turn associated
with different features of the context, and so on. To the extent that these distri-
butional analyses are confirmed time and again through continuous exposure
to more input, generalizations and abstractions are formed that are also largely
implicit.
Implicit learning would not do the job alone. Some aspects of an L2 are
unlearnable—or at best are acquired very slowly—from implicit processes
alone. In cases where linguistic form lacks perceptual salience and so goes
unnoticed by learners, or where the L2 semantic/pragmatic concepts to be
mapped onto the L2 forms are unfamiliar, additional attention is necessary in
order for the relevant associations to be learned. To counteract the L1 atten-
tional biases to allow implicit estimation procedures to optimize induction, all
of the L2 input needs to be made to count (as it does in L1 acquisition), not just
the restricted sample typical of the biased intake of L2 acquisition.
Ellis (2005) reviews the instructional, psychological, social, and neurolog-
ical dynamics of the interface by which explicit knowledge of form–meaning
associations impacts upon implicit language learning:
The interface is dynamic: It happens transiently during conscious pro-
cessing, but the influence upon implicit cognition endures thereafter.
Explicit memories can also guide the conscious building of novel linguistic
utterances through processes of analogy. Patterned practice and declarative
pedagogical grammar rules both contribute to the conscious creation of
utterances whose subsequent usage promotes implicit learning and proce-
duralization. Flawed output can prompt focused feedback by way of recasts
that present learners with psycholinguistic data ready for explicit analysis.
(p. 305)
Once a construction has been represented in this way, its use in subsequent
implicit processing can update the statistical tallying of its frequency of usage
and probabilities of form-function mapping.
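The idea of statistical tallying can be made concrete with a minimal sketch in which each usage event increments a count and the probability of a form-function mapping is read off as a relative frequency. This is a deliberately simplified illustration, not a model proposed in the chapter; the class name FormFunctionTally, its methods, and the usage events are all hypothetical.

```python
from collections import Counter, defaultdict


class FormFunctionTally:
    """Keeps running counts of which functions each form has been used for."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, form, function):
        """Register one usage event pairing a form with a function."""
        self.counts[form][function] += 1

    def frequency(self, form):
        """Total number of usage events seen for this form."""
        return sum(self.counts[form].values())

    def probability(self, form, function):
        """Relative frequency of the function given the form (0.0 if the form is unseen)."""
        total = self.frequency(form)
        return self.counts[form][function] / total if total else 0.0


# Invented usage events for the English form "-s" (illustration only).
tally = FormFunctionTally()
for function in ["plural", "plural", "third-person", "plural", "possessive"]:
    tally.observe("-s", function)

print(tally.frequency("-s"))
print(tally.probability("-s", "plural"))
```

Each further call to observe shifts the estimates slightly, which mirrors the claim that every subsequent use of a construction updates its tallied frequency and mapping probabilities.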
So, we believe that learners’ language systematicity emerges from their his-
tory of interactions of implicit and explicit language learning, from the sta-
tistical abstraction of patterns latent within and across form and function in
language usage. The complex adaptive system of interactions within and across
form and function is far richer than that emergent from implicit or explicit
learning alone (Ellis, 2014).
Conclusion
In the terms of Chapter 1, the usage-based approaches touched upon here are
too broad and far-ranging to qualify as a theory. Instead, they are a framework
for understanding many of the complex agents that underlie language learn-
ing. No single factor alone is a sufficient cause of L2 acquisition. Language is a
complex adaptive system. It comprises the interactions of many players: people
who want to communicate and a world to be talked about. It operates across
many different levels, different human configurations, and different timescales
(Douglas Fir Group, 2016; Ellis, 2019). Take out any one of these levels and a
different pattern emerges, a different conclusion is reached. Nevertheless,
as in other complex dynamic systems, many systematicities emerge that, like
Observations 1–10, form the things a theory should explain.
Discussion Questions
1. One critique of the type of approach Ellis and Wulff take has been that it is
an updated version of behaviorism. Do you agree with this criticism? How
do the authors of the chapter handle this criticism?
2. Explain the difference between rule-based and rule-like behavior.
3. How do usage-based approaches address explicit and implicit learning and
the nature of their interface?
4. Consider the case of the acquisition order of the present perfect and the
pluperfect from Bardovi-Harlig (this volume). The functionalist approach
offers one explanation for the order of emergence. How would usage-based
approaches account for the order? What evidence might distinguish the
two interpretations?
5. As we saw in White (this volume), the principal foundation of the approach
White takes is the poverty of the stimulus (POS) situation or the logical problem
of language acquisition. How do usage-based approaches view the POS? (You
might want to review the examples in White’s chapter before answering.)
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Suggested Further Reading
Ellis, N. C. (2019). Essentials of a theory of language cognition. Modern Language Jour-
nal, 103, 39–60.
A state-of-the-art overview of language cognition.
Ellis, N. C., Römer, U., & O’Donnell, M. B. (2016). Usage-based approaches to lan-
guage acquisition and processing: Cognitive and corpus investigations of construction grammar.
Language Learning Monograph Series. Wiley-Blackwell.
Recent investigations into usage-based L2 acquisition.
Ellis, N. C. (Ed.). (1994). Implicit and explicit learning of languages. San Diego, CA:
Academic Press.
An edited collection focusing upon the explicit/implicit debate.
Rebuschat, P. (Ed.). (2014). Implicit and explicit learning of language. Amsterdam,
Netherlands: John Benjamins.
A reprise summarizing explicit/implicit research 20 years on.
Ellis, N. C., & Larsen-Freeman, D. (Eds.). (2009). Language as a complex adaptive
system [Special issue]. Language Learning, 59 (Suppl. 1), 93–128.
A special issue gathering experts from various language domains who share the
CAS perspective.
Robinson, P., & Ellis, N. C. (Eds.). (2008). Handbook of cognitive linguistics and second
language acquisition. London, England: Routledge.
The first collection focusing upon usage-based approaches to L2 acquisition.
Tomasello, M. (2003). Constructing a language. Boston, MA: Harvard University Press.
A thorough usage-based account of child language.
Tyler, A. (2012). Cognitive linguistics and second language learning. New York, NY:
Routledge.
L2 acquisition from a cognitive-linguistic perspective, referencing many
usage-based studies and research into pedagogical applications of a usage-based
approach to L2 acquisition.
References
Anderson, J. R. (1989). A rational analysis of human memory. In H. L. I. Roediger &
F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel
Tulving (pp. 195–210). Hillsdale, NJ: Lawrence Erlbaum.
Bardovi-Harlig, K. (2000). Tense and aspect in second language acquisition: Form, meaning,
and use. Oxford, England: Blackwell.
Beckner, C., Blythe, R., Bybee, J., Christiansen, M. H., Croft, W., Ellis, N. C., &
Schoenemann, T. (2009). Language is a complex adaptive system. Language Learning,
59 (Suppl. 1), 1–26.
Cadierno, T., & Eskildsen, S. W. (Eds.). (2015). Usage-based perspectives on second language
learning. Berlin, Germany: De Gruyter Mouton.
Corder, S. P. (1967). The significance of learners’ errors. International Review of Applied
Linguistics, 5, 161–169.
Doughty, C., & Williams, J. (Eds.). (1998). Focus on form in classroom second language
acquisition. New York, NY: Cambridge University Press.
Douglas Fir Group (Atkinson, D., Byrnes, H., Doran, M., Duff, P., Ellis, N., Hall,
J. K., … Tarone, E.) (2016). A transdisciplinary framework for SLA in a multilingual
world. Modern Language Journal, 100, 19–47.
Ellis, N. C. (1994). Vocabulary acquisition: The implicit ins and outs of explicit
cognitive mediation. In N. C. Ellis (Ed.), Implicit and explicit learning of languages
(pp. 211–282). San Diego, CA: Academic Press.
Ellis, N. C. (1998). Emergentism, connectionism and language learning. Language
Learning, 48, 631–664.
Ellis, N. C. (2002). Frequency effects in language processing: A review with impli-
cations for theories of implicit and explicit language acquisition. Studies in Second
Language Acquisition, 24, 143–188.
Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit lan-
guage knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ellis, N. C. (2006a). Language acquisition as rational contingency learning. Applied
Linguistics, 27, 1–24.
Ellis, N. C. (2006b). Selective attention and transfer phenomena in SLA: Contingency,
cue competition, salience, interference, overshadowing, blocking, and perceptual
learning. Applied Linguistics, 27, 1–31.
Ellis, N. C. (2007). Dynamic systems and SLA: The wood and the trees. Bilingualism:
Language & Cognition, 10, 23–25.
Ellis, N. C. (2008). The dynamics of second language emergence: Cycles of language
use, language change, and language acquisition. Modern Language Journal, 92, 232–249.
Ellis, N. C. (2014). Implicit AND explicit learning of language. In P. Rebuschat (Ed.),
Implicit and explicit learning of language (pp. 3–23). Amsterdam, Netherlands: John
Benjamins.
Ellis, N. C. (2015). Cognitive and social aspects of learning from usage. In T. Cadierno
& S. Eskildsen (Eds.), Usage-based perspectives on second language learning (pp. 49–73).
Berlin, Germany: De Gruyter Mouton.
Ellis, N. C. (2017). Salience in usage-based SLA. In S. Gass, P. Spinner, & J. Behney
(Eds.), Saliency in second language acquisition (pp. 39–58). New York, NY: Routledge.
Ellis, N. C. (2019). Essentials of a theory of language cognition. Modern Language Jour-
nal, 103, 39–60.
Ellis, N. C., & Larsen-Freeman, D. (2006). Language emergence: Implications for
applied linguistics. Applied Linguistics, 27, 558–589.
Ellis, N. C., & Larsen-Freeman, D. (Eds.). (2009). Language as a complex adaptive system.
Boston, MA: Wiley-Blackwell.
Ellis, N. C., O’Donnell, M. B., & Römer, U. (2014a). Second language verb-argument
constructions are sensitive to form, function, frequency, contingency, and prototyp-
icality. Linguistic Approaches to Bilingualism, 4, 405–431.
Ellis, N. C., O’Donnell, M. B., & Römer, U. (2014b). The processing of verb-
argument constructions is sensitive to form, function, frequency, contingency, and
prototypicality. Cognitive Linguistics, 25, 55–98.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett,
K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge,
MA: MIT Press.
Flege, J. (2002). Interactions between the native and second-language phonetic sys-
tems. In P. Burmeister, T. Piske, & A. Rohde (Eds.), An integrated view of lan-
guage development: Papers in honor of Henning Wode (pp. 217–244). Trier, Germany:
Wissenschaftlicher Verlag Trier.
Francis, G., Hunston, S., & Manning, E. (Eds.). (1996). Grammar patterns 1: Verbs. The
COBUILD Series. London, England: HarperCollins.
Goldberg, A. E. (2006). Constructions at work: The nature of generalization in language.
Oxford, England: Oxford University Press.
Goldschneider, J. M., & DeKeyser, R. (2001). Explaining the “natural order of L2 mor-
pheme acquisition” in English: A meta-analysis of multiple determinants. Language
Learning, 51, 1–50.
Gries, S. T., & Wulff, S. (2005). Do foreign language learners also have constructions?
Evidence from priming, sorting, and corpora. Annual Review of Cognitive Linguistics,
3, 182–200.
Gries, S. T., & Wulff, S. (2009). Psycholinguistic and corpus-linguistic evidence for L2
constructions. Annual Review of Cognitive Linguistics, 7, 164–187.
Han, Z. H., & Odlin, T. (Eds.). (2006). Studies of fossilization in second language acquisition.
Clevedon, England: Multilingual Matters.
Hulstijn, J. H., Young, R. F., Ortega, L., Bigelow, M., DeKeyser, R., Ellis, N. C., …
Talmy, S. (2014). Bridging the gap: Cognitive and social approaches to research in
second language learning and teaching. Studies in Second Language Acquisition, 36(3),
361–421.
Jarvis, S., & Pavlenko, A. (2008). Crosslinguistic influence in language and cognition.
New York, NY: Routledge.
Kasper, G., & Wagner, J. (2011). A conversation-analytic approach to second language
acquisition. In D. Atkinson (Ed.), Alternative approaches to second language acquisition
(pp. 117–142). New York, NY: Routledge.
Klein, W. (1998). The contribution of second language acquisition research. Language
Learning, 48, 527–550.
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature
Reviews Neuroscience, 5, 831–843.
Lado, R. (1957). Linguistics across cultures: Applied linguistics for language teachers. Ann
Arbor, MI: University of Michigan Press.
Long, M. H. (1983). Linguistic and conversational adjustments to non-native speakers.
Studies in Second Language Acquisition, 5, 177–193.
Mackey, A., Abbuhl, R., & Gass, S. (2011). Interactionist approach. In S. Gass &
A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 7–23).
New York, NY: Routledge.
MacWhinney, B. (1997). Second language acquisition and the Competition Model. In
A. M. B. De Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspec-
tives (pp. 113–142). Mahwah, NJ: Lawrence Erlbaum.
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
MacWhinney, B., & O’Grady, W. (Eds.). (2015). The handbook of language emergence.
Oxford, England: Wiley-Blackwell.
McEnery, T., & Hardie, A. (2012). Corpus linguistics. Cambridge, England: Cambridge
University Press.
Miller, G. A. (2009). WordNet—About us. Retrieved from http://wordnet.princeton.edu
Odlin, T. (1989). Language transfer. New York, NY: Cambridge University Press.
Pienemann, M. (1998). Language processing and second language development: Processability
theory. Amsterdam, Netherlands: John Benjamins.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 3–32). Cambridge, England: Cambridge University Press.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford, England: Oxford University
Press.
Tarone, E. (1997). Analyzing IL in natural settings: A sociolinguistic perspective of
second-language acquisition. Communication and Cognition, 30, 137–150.
Terrell, T. (1991). The role of grammar instruction in a communicative approach. The
Modern Language Journal, 75, 52–63.
Tomasello, M., & Stahl, D. (2004). Sampling children’s spontaneous speech: How much
is enough? Journal of Child Language, 31, 101–121.
Trousdale, G., & Hoffmann, T. (Eds.). (2013). Oxford handbook of construction grammar.
Oxford, England: Oxford University Press.
Weinreich, U. (1953). Languages in contact. The Hague, Netherlands: Mouton de
Gruyter.
Wulff, S. (2008). Rethinking idiomaticity: A usage-based approach. London, England:
Continuum.
Wulff, S., & Ellis, N. C. (2018). Usage-based approaches to SLA. In D. Miller, F. Bayram,
J. Rothman, & L. Serratrice (Eds.), Bilingual cognition and language: The state of the sci-
ence across its subfields (pp. 37–56). Amsterdam, Netherlands: John Benjamins.
5
SKILL ACQUISITION THEORY
Robert DeKeyser
Skill Acquisition Theory accounts for how people progress in learning a variety
of skills, from initial learning to advanced proficiency. Skills studied include
both cognitive and psychomotor skills, in domains that range from classroom
learning to applications in sports and industry. Research in this area ranges
from quite theoretical (computational modeling of skill acquisition, the place of
skills in an architecture of the mind) to quite applied (how to sequence activi-
ties for maximal learning efficiency in areas as diverse as teaching high school
algebra, tutoring college physics, coaching professional basketball, or training
airplane pilots).
The scientific roots of Skill Acquisition Theory are to be found in various
branches of psychology, but this research area has proven to be remarkably
resilient through various developments in psychology, from behaviorism to
cognitivism to connectionism. After all, the practical needs as well as the
fundamental theoretical questions and the basic empirical facts remain, regard-
less of the continuous developments in psychological theory, methodology, and
terminology.
The Theory and Its Constructs
The basic claim of Skill Acquisition Theory is that the learning of a wide
variety of skills shows a remarkable similarity in development from initial
representation of knowledge through initial changes in behavior to eventual
fluent, spontaneous, largely effortless, and highly skilled behavior, and that
this set of phenomena can be accounted for by a set of basic principles common
to the acquisition of all skills. The terminology in the previous sentence was
deliberately chosen to be nontechnical and theory-neutral; it will come as no
surprise that a theory that has been applied to so many domains over such a
long period of time has seen its share of technical terms, which have varied
with the area of psychology researchers have worked in and the types of skills
they have studied. Generally speaking, however, researchers have posited
three stages of development, whether they called them cognitive, associative,
and autonomous, as Fitts and Posner (1967); or declarative, procedural, and
automatic, as Anderson (e.g., Anderson, 1982, 1993, 2007; Anderson et al.,
2004; Taatgen, Huss, Dickison, & Anderson, 2008); or presentation, practice,
and production, as Byrne (1986).
These three stages are characterized by large differences in the nature of
knowledge and its use, as reflected in various ways through introspection,
verbalization, and, most importantly, various aspects of behavior, especially under
demanding conditions. Initially, a student, learner, apprentice, or trainee may
acquire quite a bit of knowledge ABOUT a skill without ever even trying
to use it. That knowledge may be acquired through perceptive observation
and analysis of others engaged in skilled behavior (e.g., learning a new dance
move), but most often is transmitted in verbal form from one who knows to
one who does not (as in a parent or driving instructor teaching a teenager how
to drive a car), and often through a combination of the two, when the “expert”
demonstrates the behavior slowly while commenting on the relevant aspects
(e.g., teaching a child how to swim or how to play tennis).
Next comes the stage of “acting on” this knowledge, turning it into a
behavior, turning “knowledge that” into “knowledge how,” or in more techni-
cal terms, turning declarative knowledge into procedural knowledge. This
proceduralization of knowledge is not particularly arduous or time consum-
ing. Provided that the relevant declarative knowledge is available and drawn on
in the execution of the target behavior, proceduralization can be complete after
just a few trials/instances. Anderson et al. (2004, p. 1046), for instance, point
out that, in a typical psychology experiment, the participant is converting from
a declarative representation and a slow interpretation of the task (as set forth in
the experimenter’s instructions) to a smooth, rapid, procedural execution of
the task (for an example in second language learning, see DeKeyser, 1997, who
argues that proceduralization was essentially complete after the first 16-item
block of practice items). Yet, proceduralized knowledge has a big advantage
over declarative knowledge: It no longer requires the individual to retrieve bits
and pieces of information from memory to assemble them into a “program” for
a specific behavior; instead, that “program” is now available as a ready-made
chunk (as a result of production compilation, i.e., the combination of several
production rules; see Anderson, 2007; Taatgen & Lee, 2003) to be called up in
its entirety each time the conditions for that behavior are met.
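As a loose illustration of what a ready-made "program" saves, the following Python sketch contrasts a route that consults declarative memory on every use with a route in which the retrieved fact has been built into the rule itself. It is a toy analogy only, not ACT-R code: the fact store, the function names, and the English plural example are all assumptions introduced here for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical declarative fact; the key and value are made up for this sketch.
DECLARATIVE_MEMORY: Dict[str, str] = {"regular-plural-suffix": "s"}

# Records every consultation of declarative memory so the two routes can be compared.
retrieval_log: List[str] = []


def retrieve(fact_name: str) -> str:
    """Look up a declarative fact and record that a retrieval took place."""
    retrieval_log.append(fact_name)
    return DECLARATIVE_MEMORY[fact_name]


def pluralize_declaratively(noun: str) -> str:
    """Interpretive route: retrieve the fact first, then apply it."""
    suffix = retrieve("regular-plural-suffix")
    return noun + suffix


def compile_rule(fact_name: str) -> Callable[[str], str]:
    """Return a ready-made rule with the retrieved fact already built in."""
    suffix = retrieve(fact_name)  # a single retrieval, at compile time

    def compiled_rule(noun: str) -> str:
        return noun + suffix  # no retrieval is needed when the rule is used

    return compiled_rule


# The compiled rule plays the role of the ready-made chunk in this analogy.
pluralize_procedurally = compile_rule("regular-plural-suffix")
```

Both routes give the same answer in this sketch; the difference is that each call to `pluralize_declaratively` adds an entry to `retrieval_log`, whereas calls to `pluralize_procedurally` add none.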
Once procedural knowledge has been acquired, there is still a long way to go
before the relevant behavior can be consistently displayed with complete fluency
or spontaneity, rarely showing any errors. In other words, the knowledge is not
yet robust and fine-tuned. A large amount of practice is needed to decrease
the time required to execute the task (“reaction time”), the percentage of errors
(“error rate”), and the amount of attention required (and hence interference
with/from other tasks, or more generally “robustness”; cf. Taatgen et al., 2008).
This practice leads to gradual automatization of knowledge. Automaticity
is not an all-or-nothing affair; even highly automatized behaviors are not 100%
automatic, as becomes clear when we stumble walking down the stairs, when
we realize we are driving too fast when engaged in an exciting conversation
with a passenger, or when we stumble over our words while uttering a simple
sentence in our native language.
It should be stressed that this intensive practice (sometimes called overlearn-
ing) after mastery over the task has been achieved is only useful if it takes
learners from the proceduralization stage (where declarative and procedural
knowledge are used) to the automatization stage (where knowledge is com-
pletely procedural already). In such cases, however, its impact is great, not only
because of the obvious immediate advantages of reaching high levels of auto-
maticity but also because (automatized) procedural knowledge is known to
decay less with time. On the other hand, while some tasks can be carried out
completely on the basis of procedural knowledge (esp. motor skills), others keep
requiring access to at least some declarative information, and hence benefit less
from overlearning (Kim, Ritter, & Koubek, 2013).
A central concept in the study of skill acquisition is the power law of
learning (named this way because its mathematical formalization is a power
function: an equation in which one quantity, here the amount of practice, is raised
to a constant exponent). This equation formalizes mathematically what has been
observed many times, for skills as different as making cigars out of tobacco
leaves or writing computer programs: that both reaction time and error rate
decrease systematically as a consequence of practice. If the learning curves for
reaction time and error rate for such a variety of skills share the very specific
shape of a power function (and not even a quite similar one like that of an
exponential function), then this shape must contain the key to some funda-
mental learning mechanisms. Since Newell and Rosenbloom’s (1981) semi-
nal article on the power law, a variety of hypotheses have been formulated to
explain this robust empirical phenomenon. This chapter is not the place to dis-
cuss the relative merit of these hypotheses (for more discussion, see DeKeyser,
2001; Segalowitz, 2010), but what they all have in common is that they posit
a qualitative change over time, as a result of practice, in the basic cognitive
mechanisms used to execute the same task. What superficially seems like a set
of smooth quantitative changes (reaction time and error rate declining follow-
ing a power function) in fact reflects a qualitative change in mechanisms of
knowledge retrieval, quite radical for a while, and then gradually stabilizing
without ever reaching an absolute endpoint (hence the learning curve in the
specific shape of a power function illustrated in Figure 5.1).
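The shape of the curve can be made concrete with a small Python sketch. It assumes the commonly cited form of the power law, T = a * N^(-b), where T is the time to complete the task, N is the amount of practice (here, the trial number), and a and b are constants; it also assumes a simple exponential alternative, T = a * exp(-c * (N - 1)), for contrast. The function names and parameter values are made up for illustration and are not fitted to Figure 5.1 or to any data set.

```python
import math
from typing import Callable


def power_law_time(n: int, a: float, b: float) -> float:
    """Predicted time on trial n under the power law T = a * n**(-b)."""
    if n < 1:
        raise ValueError("trial number must be at least 1")
    return a * n ** (-b)


def exponential_time(n: int, a: float, c: float) -> float:
    """Predicted time on trial n under an exponential curve T = a * exp(-c * (n - 1))."""
    if n < 1:
        raise ValueError("trial number must be at least 1")
    return a * math.exp(-c * (n - 1))


def log_log_slope(curve: Callable[[int], float], n1: int, n2: int) -> float:
    """Slope of the curve between trials n1 and n2 when both axes are logarithmic."""
    return (math.log(curve(n2)) - math.log(curve(n1))) / (math.log(n2) - math.log(n1))


# Illustrative parameter values only.
A, B, C = 1400.0, 0.5, 0.05


def power_curve(n: int) -> float:
    return power_law_time(n, A, B)


def exponential_curve(n: int) -> float:
    return exponential_time(n, A, C)


for trial in (1, 4, 16, 64):
    print(trial, round(power_curve(trial), 1), round(exponential_curve(trial), 1))

print("power law slopes:", log_log_slope(power_curve, 1, 2), log_log_slope(power_curve, 50, 100))
print("exponential slopes:", log_log_slope(exponential_curve, 1, 2), log_log_slope(exponential_curve, 50, 100))
```

With b set to 0.5, every quadrupling of practice halves the predicted time, so the gains are large early on and ever smaller later without reaching an endpoint. On logarithmic axes the power function has the same slope (-b) early and late in practice, whereas the exponential curve does not, which is one way the two similar-looking shapes can be told apart.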
Probably the most widely accepted interpretation of this change is that it
represents first a shift from declarative to procedural knowledge (achieved
rather quickly, hence the rather steep initial section of the curve) followed
by a much slower process of automatization of procedural knowledge. The
term automatization itself can be interpreted in various ways, ranging from a
mere speed-up of the same basic mechanisms to a speed-up of a broader task
through a qualitative change in its components. Again, we are not taking a
position here on this point either (for more discussion, see DeKeyser, 2007a;
Segalowitz, 2010), but we are using automatization in a more specific sense than
just “improvement through practice,” because we are reserving the term for
the latter, flatter part of the learning curve, after the steep decline due to rapid
proceduralization has taken place (see Figure 5.1).
Another point on which there is widespread agreement is that, regardless
of the exact nature of the knowledge drawn on in the later stages of devel-
opment, this knowledge is much more specific than at the beginning, and in
fact, so highly specific that it does not transfer well, even to what may seem
quite similar tasks. A well-known example from the skill acquisition litera-
ture is reading versus writing computer programs (see Singley & Anderson,
1989), and an obvious parallel in the domain of language learning is
comprehension versus production (De Jong, 2005; DeKeyser, 1997; DeKeyser &
Sokalski, 2001; Shintani, Li, & Ellis, 2013; Tanaka, 2001). Other examples, of
course, would be transfer from speaking to writing, or from one situation to
another (such as from orderly dialogue to argument with multiple interlocutors
or from the kitchen table to the boardroom). The implication for training
is that two kinds of knowledge need to be fostered: highly specific
procedural knowledge, highly automatized for efficient use in the situations
that the learner is most likely to confront in the immediate future, and solid
abstract declarative knowledge that can be called upon to be integrated into
much broader, more abstract procedural rules, which are indispensable when
confronting new contexts of use.
FIGURE 5.1 A sample graph of the power law of learning curve (x-axis: number of problems; y-axis: time to solution in seconds).
What is often overlooked is that this whole sequence of proceduralization
and automatization cannot get started if the right conditions for procedural-
ization are not present (the declarative knowledge required by the task at hand
and a task setup that allows for use of that declarative knowledge). Anderson,
Fincham, and Douglass (1997), in particular, show convincingly that the com-
bination of abstract rules and concrete examples is necessary to get learners past
the declarative threshold into proceduralization. DeKeyser (2007b) argues that
precisely this is often lacking in language teaching in general and in preparing
students for maximum benefit from a stay abroad in particular.
What Counts as Evidence?
The oldest form of evidence in this area is behavioral in nature: reaction times,
error rates, and differences in performance from one condition to another
such as interference from a secondary task. Any overview of the behavioral
data should start with Newell and Rosenbloom (1981), not because it was the
first study in this area but because it was seminal in that it brought together
empirical data from so many different studies about so many different forms
of skill acquisition, and it proposed both a quantitative model (the power law)
and a qualitative interpretation for this mountain of data. Some of the domains
of learning included motor behavior, reading, decision making, and problem
solving. For information on the individual studies included, see Newell and
Rosenbloom’s article. Major empirical studies since then include Anderson
et al. (1997) on the role of rules and examples in the proceduralization of a sim-
ple reasoning task, Logan (1988, 1992, 2002) on the learning of a new form of
arithmetic (with letters), and Taatgen et al. (2008) on robustness and flexibility
of procedural knowledge as a function of the form of the production rules (with
or without explicit statement of pre- and postconditions in the environment).
In the last 25 years, less direct evidence in the form of computational mod-
eling has become very important in the study of skill acquisition, even more
so than in other subfields of psychology. This line of evidence includes large
amounts of work with a variety of computer models such as the various con-
secutive incarnations of ACT (Adaptive Control of Thought) (see especially
During this same period, skill acquisition researchers have begun to draw on
what some would see as data that are even more direct than the behavioral data
themselves, that is, neuroimaging and other forms of neurological evidence
such as evoked potentials (measures of electrophysiological activity in specific
areas of the brain, experimentally linked to specific cognitive tasks).
Increasing use of techniques from cognitive neuroscience has yielded studies
such as Raichle et al. (1994) using PET (positron emission tomography) to trace
the effect of practice on the relative involvement of different brain areas in the
same task (word generation), and Qin, Anderson, Silk, Stenger, and Carter
(2004) using fMRI (functional magnetic resonance imaging) to investigate the
effect of children’s practice in algebra. (For discussion of the role of neuroimag-
ing, and fMRI in particular, in the development of skill acquisition models, see
esp. Anderson, 2007, Chapter 2 and pp. 169–181; see also Chein & Schneider,
2005; Hill & Schneider, 2006, for a broader discussion of neuroimaging of skill
development.)
In sum, the behavioral data show the similarity in skill development across
different cognitive domains (how reaction time and error rate develop as a
result of practice); the neurological data show how different areas of the brain
are involved to a different extent after different amounts of practice; and the
computational models show the hypothetical inner workings of the mecha-
nisms that cannot be observed directly through behavioral or neurological data.
As should be clear from the literature cited earlier, evidence for central
constructs such as the power law, procedural knowledge, or automatization
abounds in the psychological literature. What is harder to come by is empirical
data that unambiguously point to a specific interpretation of these phenomena
in terms of learning mechanisms. More importantly for our purposes here, not
much research in the field of second language learning has explicitly set out to
gather data from second language learners to test (a specific variant of) Skill
Acquisition Theory.
The same can be said about other directions in which skill acquisition
research has expanded in recent years: the study of the forgetting of skills and
the role of distributed versus massed practice in learning and forgetting.
The longstanding topic of what constitutes ideal distribution of practice has
been revived in the cognitive and educational psychology literature in the last
decade, and the results of individual studies often appear contradictory, but a
provisional conclusion from this literature as a whole (see esp. the meta-analyses
in Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006; Rohrer, 2015; the literature
review in Carpenter, Cepeda, Rohrer, Kang, & Pashler, 2012; and the studies
by Cepeda et al., 2009; Rohrer & Pashler, 2007) is that the ideal spacing of
practice is determined by the ratio of inter-session interval (the amount of time
between different encounters with the same item) and retention interval (the
amount of time between the end of practice and the beginning of testing).
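The ratio just described is simple arithmetic, as the following Python sketch shows. The function name and the two schedules are hypothetical, both intervals are assumed to be measured in the same unit (days here), and the sketch makes no claim about which value of the ratio is ideal.

```python
def spacing_ratio(inter_session_interval: float, retention_interval: float) -> float:
    """Ratio of inter-session interval to retention interval (same time unit for both)."""
    if inter_session_interval < 0:
        raise ValueError("inter-session interval cannot be negative")
    if retention_interval <= 0:
        raise ValueError("retention interval must be positive")
    return inter_session_interval / retention_interval


# Hypothetical schedule 1: sessions 7 days apart, test 70 days after the last session.
weekly_practice = spacing_ratio(7, 70)

# Hypothetical schedule 2: sessions 1 day apart, test 10 days after the last session.
daily_practice = spacing_ratio(1, 10)

print(weekly_practice, daily_practice)
```

The two schedules differ considerably in absolute terms but have the same ratio, which is why, on this view, a spacing that works well for a test given soon after practice need not work well for a test given much later.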
On this point, too, the L2 acquisition literature is still rather limited.
On the one hand, studies on complete foreign language programs (Collins,
Halter, Lightbown, & Spada, 1999; Lightbown & Spada, 1994; Serrano, 2011;
Serrano & Muñoz, 2007; White & Turner, 2005) have shown massed practice
to be more effective. Much more narrowly focused studies, on the other hand,
have come to divergent conclusions. Bird (2010) found distributed practice to
be superior for past tense practice in English as a second language (ESL), and
Nakata (2012) obtained similar results for vocabulary learning in ESL. Suzuki
and DeKeyser (2017a, 2017b), however, in a study which was narrowly focused
on the “gerund” in Japanese L2 but still required integration of grammatical
skills and vocabulary knowledge, found that massed practice was best for the
acquisition of procedural skill; they also found that memory was more import-
ant in massed practice and analytical ability more in distributed practice. Li
and DeKeyser (2019), in a study on the learning of tone in Chinese L2, found
the same advantage of massed practice for procedural skill, but an advantage
of distributed practice for declarative knowledge. It appears then that whether
massed or distributed practice is better may not only depend on the length of
the treatment and the scope of the knowledge involved but also—and perhaps
most importantly—on the extent to which declarative and procedural learning
are involved.
Finally, long-term studies on forgetting of L2 skills among foreign language
learners (as opposed to heritage learners or fluent second language speakers)
are rare (see, however, Bahrick, Hall, & Baker, 2013; Bahrick, Hall, Goggin,
Bahrick, & Berger, 1994); none have taken a skill acquisition perspective.
One of the reasons why research from a skill acquisition perspective has
been rather rare in the field of second language acquisition is the methodology
required. Experiments on skill acquisition typically involve rather large num-
bers of participants over rather long periods of time, yielding very large amounts
of data for statistical analysis. Moreover, the collection of these data, and the
control required over the treatments and practice conditions, requires a certain
amount of investment in hardware and software.
This methodological challenge combined with the fact that focus on form
was out of fashion for a number of years in applied linguistics research explains
the small volume of directly relevant empirical research so far. The studies
that have tested the predictions of skill acquisition most directly are DeKeyser
(1997), Robinson (1997), De Jong (2005), De Jong and Perfetti (2011), Rodgers
(2011), and Li and DeKeyser (2017). The first two each test one of two com-
peting theories of skill acquisition with L2 data: DeKeyser (1997) found that
the concepts of proceduralization, automatization, and specificity of procedural
rules accounted well for the learning curves for reaction time and error rate
during a semester of practice of a small number of grammar rules. Robinson
(1997), on the other hand, found that his data on the learning of an ESL gram-
mar rule did not fit the predictions of Logan’s competing theory of automa-
tization through retrieval of specific instances from memory instead of rules.
De Jong (2005), with learners of Spanish as a second language, provides
further evidence for the skill specificity documented by DeKeyser (1997).
She showed that extensive aural comprehension training, while increasing
processing speed in comprehension, did not preempt a substantial number
of errors in production and that, conversely, early production did not hin-
der acquisition. Rodgers (2011) worked with learners of Italian L2 to show
that automatization of verbal morphology developed as a function of practice
but that it was less advanced in production than in comprehension, further
documenting the specificity of procedural knowledge. Zooming in on pro-
ceduralization, De Jong and Perfetti (2011) showed in detail how indices of
proceduralization such as length of runs and phonation/time ratio develop as
a result of repeated but gradually sped-up practice with a task, either identical
or similar. Li and DeKeyser (2017) once more provided strong evidence for
skill specificity, in this case for the learning of tone in Chinese L2. They
compared learners who had been trained either in productive or in receptive
skills, and the results showed that performance was far
worse when participants were tested on the reverse skill than when they were
tested on the practiced skill in terms of both error rates and reaction times (for
more details, see the boxed inset).
A recent development in the area of Skill Acquisition Theory in L2 acquisi-
tion has been research on the declarative/procedural distinction in the area of
individual differences, in particular the aptitudes for declarative and procedural
learning. Morgan-Short, Faretta-Stutenberg, Bill-Schuetz, Carpenter, and
Wong (2014) used the CVMT (Continuous Visual Memory Task; Trahan &
Larrabee, 1988) along with part V (Paired Associates) of the MLAT (Modern
Language Aptitude Test; Carroll & Sapon, 1959) as measures of “declara-
tive memory” (aptitude for acquiring declarative knowledge) and the WPT
(Weather Prediction Task; Knowlton, Squire, & Gluck, 1994) and the TOL
(Tower of London task) as measures of “procedural memory” (aptitude for
acquiring procedural knowledge). They found that declarative learning ability
was highly predictive of language development at early stages, while procedural
learning ability was highly predictive of later development.
Faretta-Stutenberg and Morgan-Short (2018) found compatible results,
using ERP (event-related potentials) to show that, while learners in a classroom
context made behavioral progress during a semester that did not correlate with
individual differences in various aspects of memory, learners during a semester
abroad showed correlations between behavioral changes and processing
changes on the one hand, and procedural learning ability and working mem-
ory on the other hand.
Given the increasing sophistication of the technology as well as the research
methodology at the disposal of second language researchers, along with a return
to focus on form and explicit learning in recent years (see, e.g., DeKeyser,
2009, 2017; Doughty, 2001; Ellis, 2012; Goo, Granena, Yilmaz, & Novella,
2015; Norris & Ortega, 2000; Spada & Tomita, 2010), one can expect this
area of research to pick up, especially as many researchers have begun to at
least interpret existing findings from the second language literature within
the framework of Skill Acquisition Theory (de Bot, 1996; Healy et al., 1998;
Lyster, 2004; Lyster & Sato, 2013; Macaro, 2003; O’Malley & Chamot, 1990;
Ranta & Lyster, 2007; Sato & Lyster, 2012; Towell & Hawkins, 1994; Towell,
Hawkins, & Bazergui, 1996). Researchers do not need to be trained in com-
putational modeling or neuroscience at all to contribute to research on skill
acquisition; with a sophisticated approach to design, data collection, and data
analysis, using technology that is fairly easily available at research institutions,
behavioral data still have much to contribute to this area.
Common Misunderstandings
Two kinds of misunderstanding about the contribution of Skill Acquisition
Theory to second language acquisition research are very common: the idea that
skill acquisition either explains everything about second language acquisition
or nothing, in other words, that it competes with other theories to be the one
and only valid explanation of the set of phenomena we call “second language
acquisition,” and the idea that it is incompatible with a variety of empirical
findings in the field. These two misunderstandings are, of course, related, as
we will see later.
Because of its emphasis on the importance of explicit/declarative knowledge
in initial stages of learning, Skill Acquisition Theory is most easily applicable
to what happens in (a) high-aptitude adult learners engaged in (b) the learn-
ing of simple structures at (c) fairly early stages of learning in (d) instructional
contexts. That does not mean these four conditions all have to be fulfilled for
Skill Acquisition Theory to be applicable, but it does mean that the more the
learning situation deviates from this prototypical situation in one of these four
respects, the less likely it is that concepts from Skill Acquisition Theory will
account well for the data. If adults have below-average verbal aptitude, they
may find it hard to form declarative representations of grammar rules (whether
with the help of a teacher and textbooks or not). By the same token, children
will not be able to conceptualize most grammar rules, which are of course
inherently abstract. This problem is even worse when the rules are very com-
plex: In that case even adults of above-average aptitude will find it hard to
understand, and especially to proceduralize and automatize, the rule. Finally, as
learners enter more advanced stages of learning (where they interact constantly
and fluently with native speakers and are exposed to a large amount of oral and
written input), the likelihood of implicit learning of frequent and relatively
concrete patterns in the input increases substantially. That in turn does not
mean that Skill Acquisition Theory is of marginal relevance: a substantial amount
of second/foreign language learning is done by adolescents and adults of above-
average aptitude going through the initial stages of learning in a school context.
Moreover, if the potential for learning in these initial stages is not maximized
(because everything we know about cognitive skill acquisition is ignored), this
will have repercussions, of course, for all learning thereafter.
Related to overgeneralization of Skill Acquisition Theory to situations
where it does not apply well is the tendency to see the theory as incompatible
with a number of empirical findings as well as theoretical positions in the field.
Some will overinterpret the theory as predicting that any kind of construction
can be learned, practiced, and automatized by anybody in any order and that
therefore it is incompatible with the literature on the natural order of acqui-
sition (summarized, e.g., in Dulay, Burt, & Krashen, 1982; Goldschneider &
DeKeyser, 2001; Luk & Shirai, 2009). This reasoning actually combines a mis-
reading of both Skill Acquisition Theory and research on the natural order
of acquisition, because the latter never found an ordering for all or even most
structures in the language, only for a few morphemes in some studies or for
a few closely related syntactic patterns in others, and because most studies of
order of acquisition were carried out with learners who had massive expo-
sure to the language and/or were young learners, which means that they were
largely implicit learners and that the skill acquisition model (going from declar-
ative/explicit to procedural/implicit knowledge) did not apply to them.
Similarly, Skill Acquisition Theory should not be seen as being in compe-
tition with the theory underlying processing instruction (see esp. VanPatten,
2004), as long as the latter is not seen as implying that practice in production is
not important for full-fledged skill acquisition or the fine-tuning of declarative
knowledge; in fact, processing instruction does for comprehension skills exactly
what Skill Acquisition Theory suggests should be done: taking students from
explicitly taught (or induced) declarative knowledge, through careful proce-
duralization by engaging in the relevant task while the declarative knowledge
is maximally activated, to (very initial stages of) automatization. Skill
Acquisition Theory is not incompatible either with other contemporary tendencies
in the way focus on form is implemented, such as task-based learning (see esp.
Bygate, 2018; Ellis, 2003; Long, 2015; Robinson, 2011; Van den Branden,
2006), because engaging in carefully sequenced tasks (from a psycholinguistic
perspective) will again lead to proceduralization and potentially some degree
of automatization, provided that the requisite declarative knowledge is at the
disposal of the learner during the task. Nor does Skill Acquisition Theory
contradict the notion that implicit learning is important (leading directly
to implicit knowledge, that is, knowledge that one is not aware of, which is
stressed both in the universal grammar approach [see White, this volume] and
the usage-based approach to learning [see Ellis & Wulff, this volume]). While
stressing the importance of implicit learning in general and frequency in par-
ticular, Ellis (see esp. Ellis, 2002, 2005; see also Ellis & Wulff, this volume)
makes it very clear that “many aspects of a second language are unlearnable—
or at best acquired very slowly—from implicit processes alone” (Ellis, 2005,
p. 307), and that “slot-and-frame patterns, drills, mnemonics, and declarative
statements of pedagogical grammar … all contribute to the conscious creation
of utterances that then partake in subsequent implicit learning and procedur-
alization” (p. 308).
Finally, perhaps the most common misunderstanding concerns the concept
of declarative knowledge “turning into” procedural knowledge. This is not
meant to suggest that any mysterious transformation or move happens in the
brain (for more about the declarative/procedural distinction and the brain, see
Ullman, 2004; see also Ullman, this volume), not even that the more proce-
dural knowledge there is, the less declarative knowledge. The phrase “turn-
ing into” is a bit misleading on that point; all that is claimed is that existing
declarative knowledge, via practice, plays a causal role in the development of
procedural knowledge (see, e.g., DeKeyser, 2009).
An Exemplary Study: Li and DeKeyser (2017)
I have chosen this article as an example because it clearly shows how different
forms of skill-specific procedural knowledge develop from the same declar-
ative knowledge. Participants in this study were 38 monolingual native
English-speaking adults with little or no musical training; the majority had
had some exposure to foreign languages in class, but none had any fluency, and
none had even attempted to learn a tone language. They were taught about
tone in Mandarin Chinese, the instruction focusing on the four tonal patterns
in Chinese, the diacritics to represent them, and the meaning of 16 monosyl-
labic words with their tone (four words for each tone). They were trained on
these words till they knew all of them, using pictures and the corresponding
words in the Roman alphabet, with the diacritics for tone, and then practiced
them either in production or in comprehension during three training sessions,
on three separate days. For comprehension practice, the experimenter read
words aloud, and the participants pointed at their transcription in the Roman
alphabet (in the first training phase) or at a corresponding picture (in the second
phase). For production practice, in the first phase, the participants were given
written words to read aloud, and in the second phase, they were shown pictures
and had to say the corresponding word.
After the training, both the production and the comprehension training
groups were tested on both skills. For comprehension, they were given a tone
identification test (they heard a word and had to select the correct transcrip-
tion out of a set of four, with the tone included) and a word comprehension
test (they heard a word and had to point to the picture corresponding to its
meaning, out of a set of four). For production, the participants also took two
tests: a word reading test (reading aloud words written in the Roman alphabet
with the diacritics, the 16 training words as well as 16 new ones) and a picture
naming test (pronouncing the word corresponding to each of the 16 pictures
shown). For all tests, each trial started with a fixation point (***) for 500 ms
followed by the stimulus (an auditorily presented word in the comprehension
tests; a written word or a picture in the production tests).
Both accuracy and reaction time (for correct responses only) were recorded,
and the expected pattern was observed. The production practice group did
better than the receptive practice group on both production tests, for both
accuracy and reaction time and for both old and new words (except for a non-
significant difference for reaction time on the productive tests for the new
words). The receptive practice group did better than the productive practice
group on both receptive tests, for accuracy and reaction time and for both old
and new items (except that reaction time for the new items showed an advan-
tage for the receptive practice group for only one of the receptive tests; the
effect for the other one was very small).
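The scoring rule just described (accuracy over all trials, reaction time over correct responses only) can be illustrated with a minimal Python sketch. The trial records, the field layout, and the function name score are assumptions made purely for illustration; the numbers are invented and are not data from the study.

```python
from statistics import mean

# Hypothetical trial records (invented values, NOT data from the study).
# Each trial: (practice_group, tested_skill, correct, reaction_time_ms)
trials = [
    ("production", "production", True, 900),
    ("production", "production", True, 1100),
    ("production", "comprehension", False, 1500),
    ("production", "comprehension", True, 1300),
    ("comprehension", "production", False, 1700),
    ("comprehension", "production", True, 1400),
    ("comprehension", "comprehension", True, 800),
    ("comprehension", "comprehension", True, 1000),
]


def score(trials):
    """Return {(group, skill): (accuracy, mean RT of correct trials or None)}."""
    cells = {}
    for group, skill, correct, rt in trials:
        cells.setdefault((group, skill), []).append((correct, rt))

    summary = {}
    for key, rows in cells.items():
        # Accuracy is computed over every trial in the cell.
        accuracy = sum(1 for correct, _ in rows if correct) / len(rows)
        # Reaction time is averaged over correct responses only.
        correct_rts = [rt for correct, rt in rows if correct]
        summary[key] = (accuracy, mean(correct_rts) if correct_rts else None)
    return summary


summary = score(trials)
for (group, skill), (accuracy, rt) in sorted(summary.items()):
    print(f"{group} practice, {skill} test: accuracy={accuracy:.2f}, mean RT={rt}")
```

A skill specificity effect would show up in such a table as higher accuracy and lower reaction times in the cells where the tested skill matches the practiced one.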
Overall then, the results showed a strong skill specificity effect of practice.
Performance was far worse when participants were tested on the reverse skill
from the practiced one, the only clear exception being a loss, on the new items,
of the previous reaction time advantages for the production group on the pro-
duction tests and for the receptive group on the receptive tests.
Explanation of Observed Findings in L2 Acquisition
Observation 7: There are limits on the effects of frequency on L2 acquisition; Observa-
tion 9: There are limits on the effects of instruction on L2 acquisition; Observation 10:
There are limits on the effects of output (learner production) on language acquisition.
The findings that there are limits on the effects of frequency, on the effects of
instruction, and on the effects of output are very easily explained in this frame-
work: factors such as whether students receive instruction, produce output, and
are exposed to certain structures frequently play little role if (explicit) instruc-
tion and practice with input and output are not integrated in a way that makes
sense according to this theory. Automatization requires procedural knowledge.
Proceduralization requires declarative knowledge and slow deliberate practice.
The acquisition of declarative knowledge of a kind that can be procedural-
ized requires the judicious use of rules and examples. These stages cannot be
skipped, reversed, or rushed. Unfortunately, however, just about any kind of
existing teaching methodology tends to do at least one of the latter three.
Observation 5: Second language learning is variable in its outcome; Observation 6:
Second language learning is variable across linguistic subsystems. The findings that sec-
ond language learning is variable in its outcome and variable across linguistic
subsystems are equally easy to explain in this framework. Different learners
achieve very different levels of proficiency in a given area because of their dif-
ferent levels of ability to grasp the declarative knowledge, the widely differing
amounts of practice of specific kinds that individual learners receive for specific
structures, and most importantly, the different sequencing of various kinds of
explicit information, implicit input, and practice with input and output that dif-
ferent learners receive or create for themselves (which are influenced in turn by
motivation, personality, and social context). Learners also show a large amount
of intraindividual variation between the different linguistic domains because
of differential aptitude, instruction, and practice. Even more importantly, Skill
Acquisition Theory easily explains the differences in performance from task to
task that are so often observed for the same subcomponent of language in the
same individual learner. Performance draws on procedural knowledge, which
we saw is very specific, and unevenly developed depending on the amount
of practice of various elements of the language under various task conditions.
In the same vein, Skill Acquisition Theory explains a factor that is not often
addressed in the more linguistically oriented literature, but that is of tremendous
importance in the more applied literature: the importance of learning activities
and their sequencing and spacing. No amount of any activity means much if it
does not fit into the right point of development of skill for a given individual.
Observation 4: Learners’ output (speech) often follows predictable paths with predict-
able stages in the acquisition of a given structure. The fact that learners follow a pre-
dictable path in their development for a given structure also fits well with Skill
Acquisition Theory, especially if it is understood somewhat more broadly than
in merely linguistic terms. Learners who are exposed to little or no instruction
may learn different variants of a structure in a certain order through implicit
mechanisms, and show little task variation at a given point in time, but learn-
ers who are carefully guided through the stages of skill acquisition for a given
structure may show less developmental variation in that kind of structure, but
more developmental variation in speed and systematicity of use of this structure,
including variation due to (even small variations in) task conditions. When such
learners are forced to perform beyond the level of skill they have reached, they
may or may not fall back on the same variants of structures used by implicit
learners, depending on factors such as how much exposure they have received
along with their systematic instruction and what age they are (these two factors
influence their opportunity for and their relative susceptibility to implicit and
explicit learning).
Skill Acquisition Theory and the Explicit/Implicit Debate
As stated in the previous sections, Skill Acquisition Theory stresses the impor-
tance of the distinction between declarative and procedural knowledge and sees
the transition from mostly declarative to mostly procedural as the norm in skill
development (cf. Anderson, 2007). The declarative/procedural and explicit/
implicit distinctions do not quite coincide, but for our purposes here, they are
equivalent (for more in-depth discussion, see DeKeyser, 2009). It is important
to realize, however, that Skill Acquisition Theory by no means denies a role
for implicit learning. There can even be “synergy” between the two types of
learning for a particular rule or a distribution of roles between the two when a
variety of different rules, patterns, or regularities need to be learned. Research
on skill acquisition outside of the language domain, as well as research with
artificial languages and research with regular second/foreign language learning,
is increasingly concerned with such synergies or role distributions of implicit
and explicit learning processes.
Early work with serial reaction time tasks (Cohen, Ivry, & Keele, 1990)
or artificial grammars (Mathews et al., 1989) already hinted at such syner-
gies. More recently and again with artificial grammars, Sallas, Mathews, Lane,
and Sun (2007) showed that while chunk learning may lead to better approx-
imation, structure learning through animated model presentation leads to a
much higher number of perfect letter strings. Ferman, Olshtain, Schechtman,
and Karni (2009) showed that there may be a role distribution in the sense that
the simpler rules tend to be learned explicitly and the complex or probabilis-
tic ones—being hard to induce, comprehend, or proceduralize—tend to fare
very poorly in explicit learning, to the extent that implicit learning, slow and
probabilistic as it may be, yields better results. The latter study with an artificial
grammar (letter strings without meaning) is reminiscent of earlier research with
a miniature linguistic system constituting a made-up natural language, that is,
with a meaning component (DeKeyser, 1995), which also showed that explicit
learning worked significantly better for the abstract, but simple and categorical
rules of morphology, while implicit learning yielded at least descriptively better
results for the concrete, but complex and probabilistic patterns of allomorphs.
(For thorough reviews of the implicit–explicit learning issue in L2 acquisition,
see, e.g., DeKeyser, 2003, 2009; Williams, 2009; for a discussion of the potential
interaction in L2 acquisition, see esp. Ellis, 2005.)
Skill Acquisition Theory, then, does not reject the possibility or usefulness of
implicit learning, but focuses on how explicit learning (which is often the only
realistic possibility for specific learning problems because of time constraints
or logistic issues) can, via proceduralization and automatization of explicitly
learned knowledge, lead to knowledge that is functionally equivalent to implicit
knowledge. From a purely psycholinguistic point of view, it is important to
stress, as does Paradis (2009), that explicit knowledge never becomes implicit
through practice; from an applied point of view, however, it is equally import-
ant to stress that what matters is fast, accurate, and robust use, the hallmark of
automatized procedural knowledge. Moreover, there is at least tentative evi-
dence that explicit knowledge, through a long process of proceduralization and
automatization, can be instrumental in the development of implicit knowledge
(Suzuki & DeKeyser, 2017c). Given how difficult it is to determine whether
knowledge is implicit or explicit (and even more whether learning was implicit
or explicit), even under controlled laboratory conditions, it stands to reason that
the implicit/explicit distinction in this narrow sense should be of little concern
to second language learners and teachers. Proceduralization, however, and
a certain degree of automatization of explicitly acquired knowledge are necessary
conditions for practically useful levels of proficiency. How exactly to get to that
point is what Skill Acquisition Theory is all about.
Conclusion
In this chapter, I have presented both major findings and methodological aspects
of skill acquisition research, illustrated them with a study from the second lan-
guage domain, and explained how Skill Acquisition Theory is quite compatible
with many of the major findings from second language acquisition research and
even explains some phenomena better than other theories. In closing, however,
it is only fitting to take a somewhat broader view of how well explanations of
second language acquisition phenomena based on Skill Acquisition Theory fit
into the larger enterprise of cognitive science; in our case, that means trying
to understand how the same mind that learns how to recognize the neighbors,
play chess, appreciate music, ride a bicycle, program a computer, or use a native
language also learns to understand and produce a second language.
A first advantage of the approach illustrated in this chapter is that
it fits in very well with other aspects of cognitive science. The same mecha-
nisms, whether couched in psychological or neurological terms, are invoked to
explain second language learning and a wide variety of other skills. Second, this
approach to skill learning has itself proven to be quite robust over the decades,
despite the obvious changes in emphases, methodology, and terminology.
Furthermore, research on skill acquisition, whether carried out with behav-
ioral data or through neuroimaging or computer modeling, is tremendously
explicit in its procedures and claims. Power curves, computer programs, and
brain scanners give precise answers to precise questions (even though interpret-
ing the answers can still leave a lot of room for discussion). Most important of
all, perhaps, research in this area is truly developmental. It does not take snap-
shots of learners at two or three points between initial learning and near-native
proficiency and speculate on how learners got from point a to point b. It can
document learning day after day and show how rapid acquisition of declarative
knowledge about some structures, rapid proceduralization of knowledge about
others, and automatization of some elements of knowledge for specific uses all
happen in parallel, while other elements never get automatized, or maybe not
even proceduralized, or perhaps not even learned. It may have less to say about
which elements of language are going to be learned in what order than other,
more (psycho-)linguistically oriented approaches, but it is painstakingly precise
and explicit about the big and small steps a learner takes in acquiring (a specific
use of) a specific structure.
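As an illustration of the kind of precision a power curve affords, the following minimal Python sketch fits the simplest form of the power law of practice, RT = b * N^(-c), by linear regression on log-transformed values. The function name fit_power_law, the omission of an asymptote term, and the synthetic reaction times are assumptions for illustration only, not an analysis reported in this chapter.

```python
import math


def fit_power_law(practice_counts, reaction_times):
    """Fit RT = b * N**(-c) by least squares on log(RT) vs. log(N).

    Returns (b, c). Assumes all inputs are positive and that no
    asymptote term is needed (a simplifying assumption).
    """
    xs = [math.log(count) for count in practice_counts]
    ys = [math.log(rt) for rt in reaction_times]
    size = len(xs)
    mean_x = sum(xs) / size
    mean_y = sum(ys) / size
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # log(RT) = log(b) - c * log(N), so b = exp(intercept) and c = -slope.
    return math.exp(intercept), -slope


# Synthetic reaction times (ms) generated from a known curve, not real data.
sessions = list(range(1, 11))
rts = [2000 * n ** -0.4 for n in sessions]

b, c = fit_power_law(sessions, rts)
print(f"b = {b:.1f} ms, c = {c:.3f}")
```

With real learner data, the fitted exponent c would summarize how steeply reaction time drops with practice, and departures from the curve would be one place to look for qualitative changes in processing.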
Discussion Questions
1. Central to Skill Acquisition Theory are the constructs of declarative
knowledge, proceduralization, and automatization. Discuss each, pay-
ing particular attention to the difference between proceduralization and
automatization as well as the context(s) in which automatization may occur.
2. Both DeKeyser and Ellis and Wulff offer approaches that are cognitive in
nature, that is, built on models/theories from psychology rather than, say,
linguistics. How are the two approaches similar or different?
3. It is clear that Skill Acquisition Theory is concerned with language behav-
ior. Do you think that such an approach is incompatible with an approach
that focuses on competence (e.g., White, this volume)?
4. One interpretation of Skill Acquisition Theory is that it is better suited to
explain tutored language acquisition as compared to nontutored language
acquisition. Another is that it is better suited to explain adult L2 acquisition
but not child L1 acquisition or child L2 acquisition. Do you agree?
5. As you read in Chapter 1, a perennial issue in L2 acquisition concerns the
roles of explicit and implicit learning and knowledge. Now that you have
read about four different theories and models, compare and contrast what
each has to say about this issue.
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Suggested Further Reading
Anderson, J. R. (2007). How can the human mind occur in the physical universe? New York,
NY: Oxford University Press.
This book provides a more thorough and at the same time more readable account
of what was covered in the 2004 article, with ample discussion of how modeling
skill acquisition fits into the broader psychological currents of the last three decades.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004).
An integrated theory of the mind. Psychological Review, 111, 1036–1060.
An overview of ACT-R theory, with new emphases on neuro-imaging data
and the issue of modularity of the mind. Parts are very technical; others are very
readable.
DeKeyser, R. (Ed.). (2007). Practice in a second language: Perspectives from applied linguistics
and cognitive psychology. New York, NY: Cambridge University Press.
A book that takes a broad view of practice, with many chapters drawing on Skill
Acquisition Theory, applying it to issues from error correction in the classroom to
interaction with native speakers during study abroad.
DeKeyser, R. (2018). Task repetition for language learning: A perspective from skill
acquisition theory. In M. Bygate (Ed.), Learning language through task repetition
(pp. 27–41). Amsterdam, Netherlands: Benjamins.
This chapter provides detailed discussion of issues such as distribution of practice,
skill specificity, and transfer from the point of view of Skill Acquisition Theory.
DeKeyser, R. M., & Criado-Sánchez, R. (2012). Automatization, skill acquisition, and
practice in second language acquisition. In C. A. Chapelle (Ed.), The encyclopedia of
applied linguistics (pp. 323–331). Oxford, England: Wiley-Blackwell.
A discussion of what Skill Acquisition Theory means for practice activities in a
second language.
Lim, H., & Godfroid, A. (2015). Automatization in second language sentence process-
ing: A partial, conceptual replication of Hulstijn, Van Gelderen, and Schoonen’s 2009
study. Applied Psycholinguistics, 36(5), 1247–1282. doi:10.1017/S0142716414000137
An interesting discussion of the coefficient-of-variation criterion for automaticity
introduced by Segalowitz and Segalowitz (1993) and used by, e.g., Faretta-Stutenberg
and Morgan-Short (2018), Rodgers (2011), and Suzuki and Sunada (2018).
Lyster, R., & Sato, M. (2013). Skill acquisition theory and the role of practice in L2
development. In M. García Mayo, J. Gutierrez-Mangado, & M. Martínez Adrián
(Eds.), Contemporary approaches to second language acquisition (pp. 71–91). Amsterdam,
Netherlands: John Benjamins.
A thorough discussion of Skill Acquisition Theory and practice in L2, with
some emphasis on the role of feedback. Very useful to read in conjunction with this
chapter.
Segalowitz, N. (2010). Cognitive bases of second language fluency. London, England:
Routledge.
The most thorough discussion to date of automaticity and the process of autom-
atization as they apply to second language learning and bilingualism.
Suzuki, Y., & Sunada, M. (2018). Automatization in second language sentence process-
ing: Relationship between elicited imitation and maze tasks. Bilingualism: Language
and Cognition, 21, 32–46. doi:10.1017/S1366728916000857
This study shows how the qualitative change indexed by the coefficient of vari-
ation may occur during study abroad, but that reaction time remains a better pre-
dictor of proficiency.
Ullman, M. (2020). The declarative/procedural model: A neurobiologically motivated
theory of first and second language. In B. VanPatten, G. D. Keating, & S. Wulff (Eds.),
Theories in second language acquisition: An introduction (pp. 128–161). New York: Rout-
ledge (this volume).
A recent discussion of the procedural/declarative model, with emphasis on its
neurological underpinnings.
References
Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369–406.
Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum.
Anderson, J. R. (2007). How can the human mind occur in the physical universe? New York,
NY: Oxford University Press.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004).
An integrated theory of the mind. Psychological Review, 111(4), 1036–1060.
Anderson, J. R., Fincham, J. M., & Douglass, S. (1997). The role of examples and rules
in the acquisition of a cognitive skill. Journal of Experimental Psychology: Learning,
Memory and Cognition, 23, 932–945.
Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ:
Lawrence Erlbaum.
Bahrick, H. P., Hall, L. K., & Baker, M. K. (2013). Life-span maintenance of knowledge.
New York, NY: Psychology Press.
Bahrick, H., Hall, L. K., Goggin, J., Bahrick, L., & Berger, S. (1994). Fifty years of
language maintenance in bilingual Hispanic immigrants. Journal of Experimental Psy-
chology: General, 123, 264–283.
Bird, S. (2010). Effects of distributed practice on the acquisition of second language
English syntax. Applied Psycholinguistics, 31, 635–650.
Bygate, M. (Ed.). (2018). Learning language through task repetition. Amsterdam,
Netherlands: Benjamins.
Byrne, D. (Ed.). (1986). Teaching oral English (2nd ed.). Harlow, England: Longman.
Carpenter, S. K., Cepeda, N. J., Rohrer, D., Kang, S. H. K., & Pashler, H. (2012).
Using spacing to enhance diverse forms of learning: Review of recent research and
implications for instruction. Educational Psychology Review, 24, 369–378.
Carroll, J. B., & Sapon, S. (1959). Modern language aptitude test. Form A. New York, NY:
The Psychological Corporation.
Cepeda, N., Coburn, N., Rohrer, D., Wixted, J., Mozer, M., & Pashler, H. (2009).
Optimizing distributed practice. Experimental Psychology, 56, 236–246.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed
practice in verbal recall tasks: A review and quantitative synthesis. Psychological
Bulletin, 132, 354–380.
Chein, J., & Schneider, W. (2005). Neuroimaging studies of practice-related change:
fMRI and meta-analytic evidence of a domain-general control network for learn-
ing. Cognitive Brain Research, 25, 607–623.
Cohen, A., Ivry, R. I., & Keele, S. W. (1990). Attention and structure in sequence learn-
ing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 17–30.
Collins, L., Halter, R. H., Lightbown, P. M., & Spada, N. (1999). Time and the distri-
bution of time in L2 instruction. TESOL Quarterly, 33, 655–680.
de Bot, K. (1996). The psycholinguistics of the output hypothesis. Language Learning,
46, 529–555.
De Jong, N. (2005). Can second language grammar be learned through listening? An
experimental study. Studies in Second Language Acquisition, 27, 205–234.
De Jong, N., & Perfetti, C. A. (2011). Fluency training in the ESL classroom: An
experimental study of fluency development and proceduralization. Language Learn-
ing, 62, 533–568.
DeKeyser, R. M. (1995). Learning second language grammar rules: An experiment with
a miniature linguistic system. Studies in Second Language Acquisition, 17, 379–410.
DeKeyser, R. M. (1997). Beyond explicit rule learning: Automatizing second language
morphosyntax. Studies in Second Language Acquisition, 19, 195–221.
DeKeyser, R. M. (2001). Automaticity and automatization. In P. Robinson (Ed.),
Cognition and second language instruction (pp. 125–151). New York, NY: Cambridge
University Press.
DeKeyser, R. M. (2003). Implicit and explicit learning. In C. Doughty & M. Long
(Eds.), Handbook of second language acquisition (pp. 313–348). Oxford, England:
Blackwell.
DeKeyser, R. M. (2007a). Situating the concept of practice. In R. M. DeKeyser (Ed.),
Practice in a second language: Perspectives from applied linguistics and cognitive psychology
(pp. 1–18). New York, NY: Cambridge University Press.
DeKeyser, R. M. (2007b). Study abroad as foreign language practice. In R. M. DeKeyser
(Ed.), Practice in a second language: Perspectives from applied linguistics and cognitive psychol-
ogy (pp. 208–226). New York, NY: Cambridge University Press.
DeKeyser, R. M. (2009). Cognitive-psychological processes in second language
learning. In M. Long & C. Doughty (Eds.), Handbook of second language teaching
(pp. 119–138). Oxford, England: Wiley-Blackwell.
DeKeyser, R. (2017). Knowledge and skill in SLA. In S. Loewen & M. Sato (Eds.),
Handbook of instructed second language acquisition (pp. 15–32). London: Routledge.
DeKeyser, R. (2018). Task repetition for language learning: A perspective from skill
acquisition theory. In M. Bygate (Ed.), Learning language through task repetition
(pp. 27–41). Amsterdam, Netherlands: Benjamins.
DeKeyser, R. M., & Criado-Sánchez, R. (2012). Automatization, skill acquisition, and
practice in second language acquisition. In C. A. Chapelle (Ed.), The encyclopedia of
applied linguistics (pp. 323–331). Oxford, England: Wiley-Blackwell.
DeKeyser, R. M., & Sokalski, K. (2001). The differential role of comprehension and
production practice. In R. Ellis (Ed.), Form-focused instruction and second language learn-
ing (pp. 81–112). Oxford, England: Blackwell.
Doughty, C. (2001). Cognitive underpinnings of focus on form. In P. Robinson
(Ed.), Cognition and second language instruction (pp. 206–257). Cambridge, England:
Cambridge University Press.
Dulay, H., Burt, M., & Krashen, S. (1982). Language two. New York, NY: Oxford
University Press.
Ellis, N. (2002). Reflections on frequency effects in language processing. Studies in
Second Language Acquisition, 24(2), 297–339.
Ellis, N. (2005). At the interface: Dynamic interactions of explicit and implicit lan-
guage knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, England: Oxford
University Press.
Ellis, R. (2012). Language teaching research and language pedagogy. Malden, MA:
Wiley-Blackwell.
Faretta-Stutenberg, M., & Morgan-Short, K. (2018). The interplay of individual dif-
ferences and context of learning in behavioral and neurocognitive second language
development. Second Language Research, 34(1), 67–101.
Ferman, S., Olshtain, E., Schechtman, E., & Karni, A. (2009). The acquisition of a
linguistic skill by adults: Procedural and declarative memory interact in the learning
of an artificial morphological rule. Journal of Neurolinguistics, 22, 384–412.
Fitts, P., & Posner, M. (1967). Human performance. Belmont, CA: Brooks/Cole.
Goldschneider, J., & DeKeyser, R. (2001). Explaining the “natural order of L2 mor-
pheme acquisition” in English: A meta-analysis of multiple determinants. Language
Learning, 51, 1–50.
Goo, J., Granena, G., Yilmaz, Y., & Novella, M. (2015). Implicit and explicit instruc-
tion in L2 learning: Norris & Ortega (2000) revisited and updated. In P. Rebuschat
(Ed.), Implicit and explicit learning of languages. Amsterdam, Netherlands: Benjamins.
Healy, A. F., Barshi, I., Crutcher, R. J., Tao, L., Rickard, T. C., Marmie, W. R., ...
Bourne, L. E., Jr. (1998). Toward the improvement of training in foreign lan-
guages. In A. F. Healy & L. E. Bourne, Jr. (Eds.), Foreign language learning: Psycholin-
guistic studies on training and retention (pp. 3–53). Mahwah, NJ: Lawrence Erlbaum.
Hill, N. M., & Schneider, W. (2006). Brain changes in the development of expertise:
Neuroanatomical and neurophysiological evidence about skill-based adaptations.
In K. A. Ericsson, N. Charness, P. J. Feltovich, & R. R. Hoffman (Eds.), The
Cambridge handbook of expertise and expert performance (pp. 653–682). New York, NY:
Cambridge University Press.
Just, M., & Varma, S. (2007). The organization of thinking: What functional brain
imaging reveals about the neuroarchitecture of complex cognition. Cognitive,
Affective and Behavioral Neuroscience, 7(3), 153–191.
Kim, J. W., Ritter, F. E., & Koubek, R. J. (2013). An integrated theory for improved
skill acquisition and retention in the three stages of learning. Theoretical Issues in
Ergonomics Science, 14, 22–37.
Knowlton, B. J., Squire, L. R., & Gluck, M. A. (1994). Probabilistic classification in
amnesia. Learning and Memory, 1, 106–120.
Laird, J. E. (2012). The soar cognitive architecture. Cambridge, MA: MIT Press.
Lightbown, P., & Spada, N. (1994). An innovative program for primary ESL students in
Quebec. TESOL Quarterly, 28, 563–579.
Li, M., & DeKeyser, R. (2017). Perception practice, production practice, and musical
ability in L2 Mandarin tone-word learning. Studies in Second Language Acquisition,
39(4), 593–620.
Li, M., & DeKeyser, R. (2019). Distribution of practice effects in the acquisition and
retention of L2 Mandarin tonal word production. The Modern Language Journal,
103(3), 607–628.
Lim, H., & Godfroid, A. (2015). Automatization in second language sentence process-
ing: A partial, conceptual replication of Hulstijn, Van Gelderen, and Schoonen’s 2009
study. Applied Psycholinguistics, 36(5), 1247–1282. doi:10.1017/S0142716414000137
Logan, G. (1988). Toward an instance theory of automatization. Psychological Review,
95, 492–527.
Logan, G. (1992). Shapes of reaction-time distributions and shapes of learning curves:
A test of the instance theory of automaticity. Journal of Experimental Psychology: Learn-
ing, Memory and Cognition, 18, 883–914.
Logan, G. (2002). An instance theory of attention and memory. Psychological Review,
109, 376–400.
Long, M. (2015). Task-based language learning. Oxford, England: Wiley-Blackwell.
Luk, Z. P., & Shirai, Y. (2009). Is the acquisition order of grammatical morphemes
impervious to L1 knowledge? Evidence from the acquisition of plural -s, articles,
and possessive ’s. Language Learning, 59, 721–754.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruc-
tion. Studies in Second Language Acquisition, 26, 399–432.
Lyster, R., & Sato, M. (2013). Skill acquisition theory and the role of practice in L2
development. In M. García Mayo, J. Gutierrez-Mangado, & M. Martínez Adrián
(Eds.), Contemporary approaches to second language acquisition (pp. 71–91). Amsterdam,
Netherlands: John Benjamins.
Macaro, E. (2003). Teaching and learning a second language: A guide to recent research and its
applications. London, England: Continuum.
Mathews, R., Buss, R., Stanley, W., Blanchard-Fields, F., Cho, J. R., & Druhan, B.
(1989). Role of implicit and explicit processes in learning from examples: A syner-
gistic effect. Journal of Experimental Psychology: Learning, Memory and Cognition, 15,
1083–1100.
Meyer, D., & Kieras, D. (1997). A computational theory of executive cognitive pro-
cesses and multiple-task performance: Part 1. Basic mechanisms. Psychological Review,
104, 3–65.
Morgan-Short, K., Faretta-Stutenberg, M., Bill-Schuetz, K. A., Carpenter, H., &
Wong, P. C. M. (2014). Declarative and procedural memory as individual differ-
ences in second language acquisition. Bilingualism: Language and Cognition, 17(1),
56–72.
Nakata, T. (2012, September). Effects of expanding and equal spacing on second lan-
guage vocabulary learning: Do the amount of spacing and retention interval make a
difference? Paper presented at the EUROSLA conference, Poznan, Poland.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University
Press.
Newell, A., & Rosenbloom, P. (1981). Mechanisms of skill acquisition and the law
of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1–55).
Hillsdale, NJ: Lawrence Erlbaum.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis
and quantitative meta-analysis. Language Learning, 50, 417–528.
O’Malley, J., & Chamot, A. (1990). Learning strategies in second language acquisition.
New York, NY: Cambridge University Press.
Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam,
Netherlands: John Benjamins.
Qin, Y., Anderson, J. R., Silk, E., Stenger, V., & Carter, C. (2004). The change of the
brain activation patterns along with the children’s practice in algebra equation solv-
ing. Proceedings of the National Academy of Sciences, 100, 5686–5691.
Raichle, M., Fiez, J., Videen, T., MacLeod, A. M., Pardo, J. V., Fox, P., & Petersen,
S. (1994). Practice-related changes in human brain functional anatomy during non-
motor learning. Cerebral Cortex, 4, 8–26.
Ranta, L., & Lyster, R. (2007). A cognitive approach to improving immersion stu-
dents’ oral production abilities: The awareness, practice, and feedback sequence. In
R. DeKeyser (Ed.), Practice in a second language: Perspectives from applied linguistics and
cognitive psychology (pp. 141–160). New York, NY: Cambridge University Press.
Robinson, P. (1997). Generalizability and automaticity of second language learning
under implicit, incidental, enhanced, and instructed conditions. Studies in Second
Language Acquisition, 19, 223–247.
Robinson, P. (2011). Task-based language learning: A review of issues. Language Learn-
ing, 61(S1), 1–36.
Rodgers, D. M. (2011). The automatization of verbal morphology in instructed second
language acquisition. IRAL, 49, 295–319.
Rohrer, D. (2015). Student instruction should be distributed over long time periods.
Educational Psychology Review, 27(4), 635–643. doi:10.1007/s10648-015-9332-4
Rohrer, D., & Pashler, H. (2007). Increasing retention without increasing study time.
Current Directions in Psychological Science, 16(4), 1209–1224.
Sallas, B., Mathews, R. C., Lane, S. M., & Sun, R. (2007). Developing rich and quickly
accessed knowledge of an artificial grammar. Memory and Cognition, 35, 2118–2133.
Sato, M., & Lyster, R. (2012). Peer interaction and corrective feedback for accuracy and
fluency development: Monitoring, practice, and proceduralization. Studies in Second
Language Acquisition, 34, 591–626.
Segalowitz, N. (2010). Cognitive bases of second language fluency. London, England:
Routledge.
Segalowitz, N. S., & Segalowitz, S. J. (1993). Skilled performance, practice, and the
differentiation of speed-up from automatization effects: Evidence from second
language word recognition. Applied Psycholinguistics, 14, 369–385.
Serrano, R. (2011). The time factor in EFL classroom practice. Language Learning, 61,
117–145.
Serrano, R., & Muñoz, C. (2007). Same hours, different time distribution: Any differ-
ence in EFL? System, 35, 305–321.
Shintani, N., Li, S., & Ellis, R. (2013). Comprehension-based versus production-based
grammar instruction: A meta-analysis of comparative studies. Language Learning,
63, 296–329.
Singley, M., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA:
Harvard University Press.
Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of
language feature: A meta-analysis. Language Learning, 60, 263–308.
Suzuki, Y., & DeKeyser, R. (2017a). Effects of distributed practice on the automatiza-
tion of L2 morphosyntax. Language Teaching Research, 21(2), 166–188.
Suzuki, Y., & DeKeyser, R. (2017b). Exploratory research on L2 distributed practice:
An aptitude-by-treatment interaction. Applied Psycholinguistics, 38(1), 27–56.
Suzuki, Y., & DeKeyser, R. (2017c). The interface of explicit and implicit knowledge in
a second language. Language Learning, 67(4), 747–779. doi:10.1111/lang.12241
Taatgen, N. A., & Anderson, J. R. (2010). The past, present, and future of cognitive
architectures. Topics in Cognitive Science, 2(3), 693–704.
Taatgen, N. A., Huss, D., Dickison, D., & Anderson, J. R. (2008). The acquisition of
robust and flexible cognitive skills. Journal of Experimental Psychology: General, 137,
548–565.
Taatgen, N. A., & Lee, F. J. (2003). Production compilation: A simple mechanism to
model complex skill acquisition. Human Factors, 45, 61–76.
Tanaka, T. (2001). Comprehension and production practice in grammar instruction:
Does their combined use facilitate second language acquisition? JALT Journal, 23,
6–30.
Towell, R., & Hawkins, R. (1994). Approaches to second language acquisition. Clevedon,
England: Multilingual Matters.
Towell, R., Hawkins, R., & Bazergui, N. (1996). The development of fluency in
advanced learners of French. Applied Linguistics, 17, 84–119.
Trahan, D. E., & Larrabee, G. J. (1988). Continuous visual memory test. Odessa, FL:
Assessment Resources.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/
procedural model. Cognition, 92, 231–270.
Van den Branden, K. (Ed.). (2006). Task-based language education: From theory to practice.
Cambridge, England: Cambridge University Press.
VanPatten, B. (Ed.). (2004). Processing instruction: Theory, research, and commentary.
Mahwah, NJ: Lawrence Erlbaum.
White, J., & Turner, C. E. (2005). Comparing children’s oral ability in two ESL
programs. The Canadian Modern Language Review, 61, 491–517.
Williams, J. (2009). Implicit learning in second language acquisition. In T. Bhatia &
W. Ritchie (Eds.), The new handbook of second language acquisition (pp. 319–353).
Bingley, England: Emerald.
6
INPUT PROCESSING IN ADULT
L2 ACQUISITION
Bill VanPatten
Imagine the speaker of Spanish learning English. In a conversation or discussion,
she hears someone say, “The police officer was killed by the robber.” Although
for the native speaker it is clear that it was the police officer who died, the
learner of English may interpret this sentence as “The police officer killed the
robber.” Why does the learner make this misinterpretation? It cannot be due
to L1 influence because Spanish has the exact same construction: El policía fue
matado por el ladrón.
Now imagine an English speaker learning Spanish in a formal setting. That
learner studies the preterit (simple past) in Spanish. A month later she hears
someone say Juan estudió en Cuernavaca “John studied in Cuernavaca.” How-
ever, she interprets this sentence to mean that John is studying in Cuernavaca,
even though she has studied and practiced past tense formation in Spanish and
even though English also clearly marks past from present (e.g., “studies” vs.
“studied”). Why does she make this misinterpretation?
Input processing (IP) is concerned with these situations, the reason being
that acquisition is, to a certain degree, a by-product of comprehension (see, e.g.,
Truscott & Sharwood Smith, 2004). Although comprehension does not guar-
antee acquisition, acquisition cannot happen if comprehension does not occur.
Why? Because a good deal of acquisition is dependent upon learners making
appropriate form–meaning connections during the act of comprehension. It
is the raw data in the input that learners need to construct a linguistic system. A
good deal of acquisition is dependent upon learners correctly interpreting what
a sentence means (Carroll, 2001; VanPatten, 2004a; VanPatten & Rothman,
2014; White, 1987).
In this chapter, I deal with the fundamentals of IP and the research associ-
ated with it. What will become clear is that IP is not a comprehensive theory
or model of language acquisition. Instead, it aims to be a theory or model of
what happens during comprehension that may subsequently affect or interact
with other processes. I will begin with a sketch of the theory and its constructs.
The Theory and Its Constructs
IP is concerned with three fundamental questions that involve the assumption
that an integral part of language acquisition is making form–meaning connec-
tions during comprehension:
• Under what conditions do learners make initial form–meaning connections?
• Why, at a given moment in time, do they make some and not other form–
meaning connections?
• What internal psycholinguistic strategies do learners use in comprehend-
ing sentences and how might this affect acquisition?
Let’s take a concrete example based on the introduction of this chapter. In English
as an L2, learners must, at some point, map the meaning of PASTNESS onto the
verb inflection /-t/ (or “-ed” in written form). How does this happen and why
don’t learners do this from the first time they encounter this form in a context
in which the speaker is clearly making reference to the past? In this regard, IP is
a model of moment-by-moment sentence processing during comprehension and
how learners connect or don’t connect particular forms with particular mean-
ings at a given moment in time. It is a model of how learners derive the initial
data from input for creating a linguistic system, that is, the data that are delivered
to other processors and mechanisms that actually store and organize the data
(e.g., UG; see Chapter 3¹). This can be sketched as in Figure 6.1.
[Figure 6.1: Input → Input processing → Other processors and mechanisms (e.g., UG, general learning architecture) → The learner’s internal grammar (i.e., the developing system)]
FIGURE 6.1 Where IP fits into an acquisition scheme.
IP makes a number of claims about what guides learners’ processing of
linguistic data in the input as they are engaged in comprehension. These claims
can be summarized as follows.
• Learners are driven to get meaning while comprehending.
• Comprehension for learners is initially quite effortful in terms of cog-
nitive processing and working memory. Unlike L1 native speakers, L2
learners must develop the ability to comprehend, and comprehension for
some time may tax the computational resources as learners engage in the
millisecond-by-millisecond analysis of a sentence. This has consequences
for what the IP mechanisms will pay attention to.
• At the same time, learners are limited capacity processors and cannot
process and store the same amount of information as native speakers can
during moment-by-moment processing.
• Learners may make use of certain universals of IP but may also make use of
the L1 input processor (or parser, which we will define shortly).
The first claim has led to the principle in IP that learners will seek to grasp
meaning by searching for lexical items, although the precise manner in which
this is done is still not clear.² In other words, learners enter the task of L2 acquisition
knowing that languages have words. They are thus first driven to make
form–meaning connections that are lexical in nature. For example, if they hear
“The cat is sleeping” and this sentence is uttered in a context in which a cat is
indeed sleeping, the learner will seek to isolate the lexical forms that encode
the meanings of CAT and SLEEP, for instance, because the learner (a) has these
concepts stored somewhere in the mind/brain based on past human experience
and (b) knows that there are probably words for these concepts that must be
somewhere in the speech stream. What is more, learners know that there are
differences between content lexical items (e.g., “cat,” “sleep”) and noncontent
lexical items (e.g., “the,” “is”) and will seek out content lexical items first as
the building blocks of interpreting sentences. Thus, in “The cat is sleeping,”
the learner may initially only make the connections between cat-CAT and
sleep-SLEEP (again, the reader is reminded that such a sentence is uttered in
a context in which there is a cat sleeping). These claims are codified in the
following IP principle:
The Primacy of Content Words Principle. Learners process content words in
the input before anything else.
At this point, the learner most likely does not process the noncontent words or
the inflections on nouns and verbs, “process” referring specifically to actually
making connections between meaning and form (as opposed to mere “notic-
ing”). If the learner does process noncontent words and/or inflections, it is likely
that the (other) processors responsible for data storage and grammar building
may not yet be able to make use of them and will dump them, preventing fur-
ther processing. One example is the auxiliary do in English. This auxiliary may
be initially perceived by learners (it is almost always in sentence-initial position
in yes/no questions such as Do you like Mexican food?; see the later definition
of the “Sentence Location Principle”), but because learners can’t attach any
meaning to it in the early stages of processing, do does not get processed after
it is initially perceived.
However, the model of IP also makes another claim regarding such things as
inflections and grammatical markers, namely, that if the marker is redundant, it
may not get processed because the learner is focused on getting content words
first. Processing the content word (i.e., the Primacy of Content Words) obvi-
ates the need to process the grammatical marker if it encodes the same meaning.
In this scenario, presented with a sentence such as “I called my mother yester-
day,” learners will not process tense markers. Instead, they will derive tense
from their processing of adverbs of time (e.g., “yesterday,” “tomorrow”). The
Primacy of Content Words principle thus has consequences for what learners
extract from the input when grammatical devices are present. This is codified
in the following principle:
The Lexical Preference Principle. Learners will process lexical items for
meaning before grammatical forms when both encode the same semantic
(“real world”) information.
Learners will first tend to link semantic notions with content lexical items in
the input and only later link grammatical forms that encode the same semantic
notions. There are two possible consequences of this particular principle. The
first is that learners will begin to process redundant grammatical markers only
when they have processed and incorporated corresponding lexical forms into
their developing linguistic systems. Thus, past tense markers won’t be pro-
cessed and incorporated until learners have processed and incorporated lexical
forms such as “yesterday,” “last night,” and so on.³ If so, the Lexical Preference
Principle might be revised to state the following:
(Revised) Lexical Preference Principle. If grammatical forms express a mean-
ing that can also be encoded lexically (i.e., that grammatical marker is
redundant), then learners will not initially process those grammatical
forms until they have lexical forms to which they can match them.
The other possible consequence is that learners may begin to rely exclusively
on lexical forms for all information and never process grammatical markers
in the input at all. In this scenario, the processing of lexical items “over-
rides” any need to process grammatical markers when redundancy is involved
(i.e., the lexical form and the grammatical form express the same meaning
as in PASTNESS/-ed, FUTURE/will, THIRD-PERSON SINGULAR/-s,⁴
and so on). In either scenario, one of the predictions of the model of IP is that
learners will continue to focus on the processing of lexical items to the detri-
ment of grammatical markers given that lexical items maximize the extraction
of meaning, at least from the learner’s point of view. Grammatical markers
will be processed later. In VanPatten and Keating (2007), we demonstrated
this in one study in which learners of Spanish L2 with English L1 processed
Spanish sentences in which the adverb matched or didn’t match the verb for
tense (e.g., “Yesterday I am talking to John” vs. “Now I’m talking to John”).
Using eye-tracking (for a discussion of eye-tracking, see the section “What
Kind of Evidence”), we found that native speakers lingered or “regressed” to
verb forms to verify temporal reference, whereas beginning- and intermediate-stage
learners tended to linger on or regress to adverbs to verify temporal reference.
However, advanced learners patterned like native speakers, suggesting
that eventually learners begin to focus on grammatical inflections in the input
to obtain temporal information (see also Ellis & Sagarra, 2010; Lee, Cadierno,
Glass, & VanPatten, 1995). At the same time, VanPatten and Keating found that
Spanish L1 learners of English L2 did not begin processing English sentences
by relying on the Spanish preference for verbs. Instead, their early-stage
learners of English patterned after the English L1 learners of Spanish, using
adverbials to process temporal reference in sentences, suggesting this strategy is
universal and not dependent on L1 experience.
Not all grammatical markers are redundant. In English, -ing is the sole
marker of the semantic notion of an event in progress as in “The cat is sleep-
ing [IN PROGRESS].” There is no lexical indication of IN PROGRESS in
the sentence with -ing. This contrasts with something like “The cat sleeps ten
hours every day,” where the meaning of -s of “sleeps” [THIRD-PERSON,
SINGULAR, ITERATIVE] is encoded lexically in “the cat [THIRD-PERSON,
SINGULAR]” and “every day [ITERATIVE].” Because learners
always search for ways in which meaning is encoded, if it is not encoded lex-
ically, only then will they turn to grammatical markers to see if a seman-
tic notion is expressed there. Thus, if learners are confronted with something
like -ing on verb forms, they will be forced to make this form–meaning connection
sooner than, say, third-person -s because the latter is redundant and the
former is not. This leads to another principle of IP:
The Preference for Nonredundancy Principle. Learners are more likely to pro-
cess non-redundant meaningful grammatical markers before they process
redundant meaningful markers.⁵
Until now, we have considered only grammatical markers that carry meaning,
such as -s on the end of a noun, which means “more than one,” and -ing, which means “in
progress.” But there are some grammatical markers, albeit not many, that do
not carry meaning. Consider “that” in the sentence “John thinks that Mary is
smart.” What real-world semantic information does “that” encode? It’s not a
tense marker. It’s not an indication of whether or not the event is in progress
or iterative. It’s not a plurality marker or any other such semantically linked
grammatical device. There is nothing you can point to in the world or describe
and say “that’s a ‘that’” as you might with “that’s a cow” and “that’s love.” It has
a grammatical function, to be sure—to link two sentences (i.e., introduce an
embedded clause) but it doesn’t encode any semantic information. In Spanish,
adjective agreement is similar. In the case of el libro blanco (“the white book”)
and la casa blanca (“the white house”), there is no semantic reason why in one
case blanco must be used and in another blanca must be used.⁶ Spanish just makes
adjectives agree with nouns. The model of IP says that such formal features of
language will be processed in the input later than those for which true form–
meaning connections can be made. The principle says:
The Meaning before Nonmeaning Principle. Learners are more likely to
process meaningful grammatical markers before nonmeaningful gram-
matical markers.
IP is more, however, than making form–meaning connections. When a
person hears a sentence, whether in the L1 or the L2, that person also does a
millisecond-by-millisecond computation of the syntactic structure of that
sentence. This is called parsing. For example, in English when a person hears
“The cat …” the parsing mechanism (called a parser) does the following: the
cat = DP (determiner phrase) = subject. This is called a projection because the
parser projects a syntactic structure (i.e., the parser is making the best guess at
what the grammatical relationships will be among words). If a verb follows, the
parser may continue in this path. For example, “The cat chased …”, the cat =
DP = subject, chased = verb [so far, so good for the syntactic projection]. If a
phrase like “the mouse” comes next, the parser may continue: the cat = DP =
subject, chased = verb, the mouse = DP = object; parsing completed, syntac-
tic projection successful, sentence computed and understood. But if instead of
“the mouse” what follows is “by the boy,” the parser must reanalyze on the
spot and project something different onto the syntactic structure: the cat =
DP = subject, chased = verb, by the boy = oops, not an object therefore “the
cat chased by the boy” = DP = subject. If a verb follows such as “howled” the
parser continues: the cat chased by the boy = DP = subject, howled = verb,
parsing completed and successful.
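
The walk-through above can be mimicked with a toy script. This is a minimal sketch for illustration only and is not part of the IP model: the word lists, the helper `chunk`, the function `parse_incrementally`, and the log format are all assumptions, and the "parser" knows only the handful of words used in the examples.

```python
# Toy illustration of incremental parsing with on-the-spot reanalysis.
# Every name and word list here is hypothetical, chosen only to
# reproduce the "The cat chased ..." examples from the text.

DETERMINERS = {"the"}
NOUNS = {"cat", "mouse", "boy"}
VERBS = {"chased", "howled"}


def chunk(words):
    """Group words into minimal units: DP, V, or a by-phrase (BY)."""
    chunks, i = [], 0
    while i < len(words):
        w = words[i]
        if w in DETERMINERS and i + 1 < len(words) and words[i + 1] in NOUNS:
            chunks.append(("DP", " ".join(words[i:i + 2])))
            i += 2
        elif (w == "by" and i + 2 < len(words)
              and words[i + 1] in DETERMINERS and words[i + 2] in NOUNS):
            chunks.append(("BY", " ".join(words[i:i + 3])))
            i += 3
        elif w in VERBS:
            chunks.append(("V", w))
            i += 1
        else:
            raise ValueError(f"unknown word: {w}")
    return chunks


def parse_incrementally(sentence):
    """Assign roles chunk by chunk; return (roles, log of each step)."""
    roles, log = {}, []
    for kind, text in chunk(sentence.lower().split()):
        if kind == "DP" and "subject" not in roles:
            roles["subject"] = text                 # first projection
            log.append(f"{text} = DP = subject")
        elif kind == "V" and "subject" in roles and "verb" not in roles:
            roles["verb"] = text
            log.append(f"{text} = verb")
        elif kind == "DP" and "verb" in roles and "object" not in roles:
            roles["object"] = text
            log.append(f"{text} = DP = object")
        elif kind == "BY" and "verb" in roles and "object" not in roles:
            # Reanalysis: the supposed main verb belongs inside the subject.
            participle = roles.pop("verb")
            roles["subject"] = f"{roles['subject']} {participle} {text}"
            log.append(f"reanalysis: {roles['subject']} = DP = subject")
        else:
            raise ValueError(f"toy parser cannot integrate: {text}")
    return roles, log


for s in ("The cat chased the mouse", "The cat chased by the boy howled"):
    final_roles, steps = parse_incrementally(s)
    print(s, "->", final_roles)
```

The point of the sketch is only that the first projection is a best guess that later material can force the parser to revise; a real parser is of course far richer than this.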
The previous description of parsing is greatly simplified to be sure,⁷ but for
the present discussion it allows us to ask the following question: how do learn-
ers parse sentences in the L2 when they do not have a fully developed parser as
they do for L1 sentence processing? (Again, I am ignoring here how learners
come to perceive word boundaries and isolate words during parsing.) The first
avenue is that learners possess universal parsing strategies (or procedures) and
apply these as they begin interacting with the L2 input. The other avenue is that
learners transfer or attempt to transfer their L1 parsing strategies (or procedures)
when interacting with the L2 input. These two positions are clear when we
examine sentences such as the following in English and Spanish:
(1a) Mary hates John.
(1b) María detesta a Juan.
(1c) A Juan lo detesta María.
In English, only subject-verb-object (SVO) order is possible, regardless of
whether an object is a full noun (John) or a pronoun (him) as in (1a), (2a),
and (3a). This is true whether the sentence is a simple declarative or whether
it is a yes/no question. In Spanish, however, although SVO is certainly prototypical,
other orders also occur: SOV (with pronouns as in 2b), OVS (with full nouns and
pronouns as in 1c and 2c), and OV (when the subject is null, that is, not expressed, as in
3b).⁸ In Spanish, OV and OVS are fairly standard for yes/no questions, are not
infrequent in simple declaratives, and are the prototypical orders for sentences
containing certain verbs. So, what happens when a language learner, say of
English L1 background, first encounters (and continues to encounter) OVS
and OV type sentences? Research has shown that such learners misinterpret
such sentences and reverse “who does what to whom.” In the case of A Juan
lo detesta María, learners misinterpret this as “John hates Mary” rather than
“Mary hates John.” In the case of Lo detesta María, they misinterpret this sen-
tence as “He hates Mary” rather than “Mary hates him.” The result is that
incorrect form–meaning connections are made (e.g., lo = he[subject] rather
than lo = him[object]) and wrong data about sentence structure are provided to
the internal processors responsible for storage and organization of language; in
this case, these processors receive incorrect information that Spanish is rigid
SVO and the pronoun system becomes a mess.
The question is as follows: is this parsing problem due to some universal strategy⁹
or to the English parser interacting with Spanish input data? In previous
research, I have taken the position that this is a universal strategy and posed the
following principle:
The First-Noun Principle. Learners tend to process the first noun or
pronoun they encounter in a sentence as the subject.
Under this universal position, any learner, whether from an SVO language or a
language with flexible word order or rigid OVS order, would initially process
the first noun as the subject.
Under the alternative position, that the L1 parser is transferred into L2 IP,
the principle would look different and would have different consequences. The
principle might look like this:
The L1 Transfer Principle. Learners begin acquisition with L1 parsing
procedures.
In this case, problems would be language specific in terms of transfer. So, the
Italian speaker learning Spanish would not have difficulty with OV and OVS
structures in Spanish because these exist in Italian (e.g., Lo vede Maria “Maria
sees him”) and the L1 parser has computing mechanisms for dealing with them.
The English speaker, on the other hand, would have difficulty due to the rigid
word order of English with no parsing mechanism to handle non-SVO struc-
tures (except cleft sentences such as “Him I hate”) (see Isabelli, 2008, for a
sample study on this issue).
A question arises with this example: Is the transfer due to transfer of the pars-
ing mechanism itself or lexical transfer? Spanish and Italian share object pronouns
such as lo and la so that the Italian speaker learning Spanish can transfer these lex-
ical items along with their functional features into the new lexicon. The English
speaker cannot do this. The underlying features of lo prohibit it from being taken
as a subject in Italian, and presumably this would happen in Spanish as L2 for
these learners. In this case, the learner of Italian L1 begins Spanish L2 with the
First-Noun Principle, but when that learner accesses lo with its features borrowed
from Italian, the principle is at odds with the underlying features of the lexical item.
On the other hand, there is research on the acquisition of passives that sug-
gests that word for word passive structure equivalents in languages like English
and French do not transfer, so that early-stage learners of French tend to misin-
terpret passives in terms of who does what to whom (Ervin-Tripp, 1974). Thus,
the question is open as to whether and to what degree there is L1 influence
in basic IP, and whether that influence is an actual processing procedure or
lexical influence. In another study of Italian and English as both L1s and L2s,
Gass (1989) showed that although there were some differences between groups
on how they interpreted first nouns, the groups patterned the same when ani-
macy was not considered (see below regarding principles that can attenuate the
First-Noun Principle, as well as Tight, 2012).
Other factors may influence how learners parse and thus interpret sentences.
Consider the following verb: scold. Which is more likely, for a parent to scold
a child or a child to scold a parent? In the real world, the first situation is more
likely. So, what happens if a learner hears “The child scolded the parent”? In such
cases, it is possible (though not necessary) that the probability of real-life scenarios
might override the First-Noun Principle (or the alternative L1 Transfer Principle).
The learner might incorrectly reparse the sentence to mean “the parent scolded
the child” and send information to the internal processors that the language has
OVS structures (when it may not). This is what would happen during parsing
under this scenario: the child = NP = subject, scolded = verb, the parent = NP =
object, but wait, children don’t scold parents, parents scold children so the sen-
tence must mean that the parent scolded the child, reanalyze the parse: the child =
NP = object, scolded = verb, the parent = NP = subject. The influence of what
are called event probabilities is captured in the following principle:
The Event Probability Principle. Learners may rely on event probabilities,
where possible, instead of the First-Noun Principle to interpret sentences.
Similarly, learners also come to the task of parsing knowing that certain verbs
require certain situations. For example, the verb “kick” requires an animate
being with legs for the action to occur. Thus, people, horses, frogs, and even
dogs can kick, but snakes, rocks, and germs cannot kick. When confronted with
the sentence “The cow was kicked by the horse,” the First-Noun Principle (or
L1 Transfer Principle) may cause a misinterpretation: The cow did the kicking.
However, when confronted with the sentence “The fence was kicked by the
horse,” a faulty interpretation is unlikely (how can a fence kick anything?) and
the sentence may actually cause the parser to reanalyze what it just computed
(assuming there is time to do so). This situation involves what is called lexical
semantics. Lexical semantics refers to how the meanings of verbs place require-
ments on nouns for an action or event to occur. Does the event expressed by the
verb require an animate being to bring the event about? Does the event require
particular properties of a being or entity for the event to come about? Note that
lexical semantics differs from event probabilities in a fundamental
way: with event probabilities, either noun may be capable of the action, but one
is more likely. With lexical semantics, it is the case that only one noun is capable
of the action. Thus, both a child and a parent can scold, but one is more likely to
scold the other (event probabilities). However, between the two entities “horse”
and “fence” a horse can kick something else; a fence cannot kick something else
(lexical semantics). The use of lexical semantics during parsing can be expressed
by the following principle:
The Lexical Semantics Principle. Learners may rely on lexical semantics,
where possible, instead of the First-Noun Principle (or an L1 parsing
procedure) to interpret sentences.
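
One way to see how these principles interact is to write them as an ordered set of fallbacks. The sketch below is illustrative only: the function `assign_agent`, the small set of nouns that can kick, and the event-probability table are assumptions invented for this example. The principles say that learners "may" rely on these cues, whereas the toy code applies them deterministically and in a fixed order, which is a simplification and not a claim of the model.

```python
# Hypothetical toy knowledge; the entries are invented for illustration.
CAN_KICK = {"horse", "cow", "person", "frog", "dog"}  # lexical semantics of "kick"
LIKELY_AGENT = {
    # event probabilities: which of two capable nouns is the likelier doer
    ("scold", frozenset({"parent", "child"})): "parent",
}


def assign_agent(first_noun, verb, second_noun):
    """Guess which noun is taken as the doer, and which cue decided it."""
    # 1. Lexical semantics: only one of the nouns is capable of the action.
    if verb == "kick":
        capable = [n for n in (first_noun, second_noun) if n in CAN_KICK]
        if len(capable) == 1:
            return capable[0], "lexical semantics"
    # 2. Event probabilities: both are capable, but one is more likely.
    likely = LIKELY_AGENT.get((verb, frozenset({first_noun, second_noun})))
    if likely is not None:
        return likely, "event probabilities"
    # 3. Default: the First-Noun Principle.
    return first_noun, "first-noun principle"


print(assign_agent("cow", "kick", "horse"))     # "The cow was kicked by the horse"
print(assign_agent("fence", "kick", "horse"))   # "The fence was kicked by the horse"
print(assign_agent("child", "scold", "parent")) # "The child scolded the parent"
```

In the first call neither cue separates the two nouns, so the default applies and the passive is misread; in the second and third calls the other cues override the default, correctly in one case and incorrectly in the other.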
Research on L2 IP has also demonstrated that context may affect how learners
parse sentences. Consider the following two sentences:
(4a) John is in the hospital because Mary attacked him.
(4b) John told his friends that Mary attacked him.
In Spanish, the embedded clause can either be SOV (María lo atacó) or OVS
(lo atacó María). If the First-Noun Principle or its L1 alternative were active
(for English speakers, say), the OVS structure could be misinterpreted as “he
attacked Mary.” But note that if the preceding context is “John is in the hospi-
tal” a misinterpretation is less likely. Why would John be in the hospital if he
attacked Mary? He’d be in jail, if anything. No, it’s most reasonable that he’s
in the hospital as the result of an injury so Mary must have attacked him. If the
preceding context is neutral as in “John told his friends …,” there is nothing
to constrain interpretation of the following clause: John could equally tell his
friends that he attacked Mary or that Mary attacked him (see VanPatten &
Houston, 1998). The effects of context, then, result in another principle:
The Contextual Constraint Principle. Learners may rely less on the First-Noun
Principle (or L1 transfer) if preceding context constrains the possible
interpretation of a clause or sentence.
So far, we have dealt with factors that affect the connection of form and mean-
ing during processing, as well as parsing (e.g., computation of syntactic structure).
There is another area of processing that enters the picture: where elements are more
likely to appear in a sentence. Imagine you hear the following set of numbers:
11 32 51 4 8 42 71 39 7 22 60 15 96 12 85 44
If you are typical, you will remember the numbers at the beginning (say 11, 32)
before you would remember numbers at the end (say 44, 85) and in turn would
remember both before you would remember any numbers in the middle (say
39, 7, or 60). This ability to process and remember best things at the beginning,
followed by things at the end, followed by things in the middle is true of a good
deal of human information processing, and language is no different. We can
couch this phenomenon in the following principle:
The Sentence Location Principle. Learners tend to process items in sentence
initial position before those in final position and those in medial position.
Barcroft and VanPatten (1997) found this to be the case for the initial detec-
tion of the grammatical morpheme se in Spanish in which the morpheme was
much more frequently detected by naïve learners of Spanish in sentences such as
Se levanta Juan temprano todos los días compared with Todos los días Juan se levanta
temprano “John gets up early every day.”
To be sure, the principles just outlined (and any others that might affect
IP¹⁰) do not act in isolation. One can envision, for example, that even though
object pronouns in Spanish can and do appear in initial position (e.g., in OVS
structures), this does not mean that learners will process them correctly. The
First-Noun Principle would most likely interact with object pronouns so that
learners may indeed process the object pronoun because it is in initial position
in an OVS structure (as opposed to when it might normally appear in medial
position in the sentence) but they would process it incorrectly. What is import-
ant to keep in mind here is that the term “process” means that learners link
meaning and form, either locally (words, morphology) or at the sentence level.
As I will discuss later, processing is not an equivalent term for “noticing.”
What Counts as Evidence?
It is probably clear that only data gathered during comprehension-oriented
research is appropriate for making inferences about IP. Typical research designs
include sentence interpretation tasks and eye-tracking experimentation.
Sentence Interpretation Tasks
In this kind of experimentation, participants hear sentences and indicate what
they understand. For example, in the case of word order, participants might
hear “The cow was kicked by the horse.” They are then asked to choose
between two pictures that could represent what they heard. In one picture,
the cow is kicking the horse. In the other, the horse is kicking the cow. If the
participant chooses the first picture, then we can infer that the First-Noun
Principle is guiding sentence processing. If the participant selects the second
picture, then the First-Noun Principle is not guiding sentence processing (see,
e.g., VanPatten, 1984).
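
The inference described above can be made concrete by scoring picture choices. What follows is a minimal sketch under stated assumptions: the `Trial` record, the function `first_noun_rate`, and the responses are hypothetical and invented for illustration; they are not data or procedures from any cited study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One picture-selection trial (all fields are hypothetical labels)."""
    first_noun: str    # first noun heard, e.g. "cow"
    actual_agent: str  # noun that really performs the action, e.g. "horse"
    chosen_agent: str  # agent shown in the picture the participant picked


def first_noun_rate(trials):
    """Proportion of critical trials answered with the first noun as agent.

    A trial is "critical" when the first noun is NOT the actual agent
    (as in passives), so choosing it signals first-noun processing.
    """
    critical = [t for t in trials if t.first_noun != t.actual_agent]
    if not critical:
        raise ValueError("no critical trials to score")
    hits = sum(t.chosen_agent == t.first_noun for t in critical)
    return hits / len(critical)


# Invented responses for illustration only.
responses = [
    Trial("cow", "horse", "cow"),      # passive, first noun chosen
    Trial("dog", "cat", "dog"),        # passive, first noun chosen
    Trial("boy", "girl", "girl"),      # passive, correct choice
    Trial("horse", "horse", "horse"),  # active filler, not critical
]
print(first_noun_rate(responses))
```

A rate near 1 on critical trials would be consistent with the First-Noun Principle guiding interpretation, and a rate near 0 would suggest it is not.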
With form–meaning connections, a variation on this type of task may occur.
Participants might hear sentences such as Mi mamá me llamó por teléfono “My
mother called me on the phone” and Mi amigo me ayuda con la casa “My friend
is helping me with the house.” Note that there are no adverbials of time in
the sentences. At the same time, the participant may hear similar sentences in which
an adverbial of time is present, as in Mi mamá me llamó anoche por teléfono “My
mother called me on the phone last night.” Learners are then asked to indicate
whether the action occurred in the past, is happening now, happens every day,
or is going to/will happen in the future. If learners fail to correctly make such
indications when the adverbs are not present in sentences but correctly do so
when they are, this tells us they are relying on lexical items to get semantic
information and not verbal inflections.
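
The logic of this task can be sketched with a toy "learner" that assigns tense from adverbials alone. This is an assumption-laden illustration, not a model of any actual learner: the mini-lexicon `ADVERB_TENSE` and the function `lexical_only_tense` are hypothetical names introduced here.

```python
# Hypothetical mini-lexicon of Spanish time adverbials.
ADVERB_TENSE = {"anoche": "past", "ayer": "past", "ahora": "present"}


def lexical_only_tense(sentence):
    """Tense reported by a learner who relies only on adverbials.

    Returns None when the sentence has no known adverbial, that is,
    when tense is carried by the verb inflection alone.
    """
    for word in sentence.lower().replace(".", "").split():
        if word in ADVERB_TENSE:
            return ADVERB_TENSE[word]
    return None


print(lexical_only_tense("Mi mamá me llamó anoche por teléfono"))
print(lexical_only_tense("Mi mamá me llamó por teléfono"))
```

Such a learner succeeds on the sentence containing anoche and has nothing to go on in the sentence without it, which is the response pattern the task is designed to detect.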
Eye-tracking
Eye-tracking research involves having participants read sentences or text
on a computer screen while tiny cameras track eye-pupil movements via
a very small infrared light directed at their pupils. As people read, they
unconsciously skip words and parts of words, regress to some words, and so
on, on a millisecond-by-millisecond basis (see Keating, 2013, for an overview
of eye-tracking research in L2). Eye-tracking can reveal, for instance, whether
learners attend to verbal inflections during IP and whether they regress like
native speakers do when encountering something that does not seem right. For
example, given the sentence “Last night my mother calls me on the phone,”
eye-tracking of native speakers reveals fixations on the verb “calls,” often with
regression to the phrase “last night.” We do not see the same eye-movement
behavior from beginning and intermediate learners. However, when asked to
press a button to indicate past, present, or future for the action, both native
speakers and nonnatives always press “past.” These combined results suggest
that learners do indeed rely on lexical cues for meaning and skip over grammat-
ical markers that encode the same meaning as they process sentences.
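To make the notion of a regression concrete, the sketch below counts eye movements that go from the verb directly back to the adverbial phrase. It assumes a simplified record in which each fixation is just the index of the fixated word; real eye-tracking data are far richer, and the two fixation sequences are invented for illustration.

```python
def count_regressions(fixations, source_index, target_indices):
    """Count moves from the word at source_index directly back to an earlier target word."""
    count = 0
    for prev, cur in zip(fixations, fixations[1:]):
        if prev == source_index and cur in target_indices and cur < prev:
            count += 1
    return count

words = "Last night my mother calls me on the phone".split()
verb = words.index("calls")  # position of the tense-marked verb
adverbial = {0, 1}           # positions of "Last" and "night"

# Invented fixation sequences (word indices in the order they were fixated).
native_like = [0, 1, 2, 3, 4, 1, 4, 5, 6, 8]  # goes back from "calls" to "night"
learner_like = [0, 1, 3, 5, 6, 8]             # skips the verb entirely

print(count_regressions(native_like, verb, adverbial))
print(count_regressions(learner_like, verb, adverbial))
```

The first sequence contains one regression from the verb to the adverbial and the second contains none, which mirrors the contrast between native speakers and learners described above.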
There are other online methods in addition to eye-tracking that can be used
to research IP. The reader is referred to Jegerski and VanPatten (2013) for a
volume dedicated to psycholinguistic methods used in L2 research.
Common Misunderstandings
There are several common misunderstandings about both IP and the specific
model of IP described here.
Misunderstanding 1: IP is a model/theory of acquisition. People who claim this
believe that IP attempts to account for acquisition more generally. It does not.
As stated, IP is only concerned with how learners come to make form–meaning
connections and/or parse sentences. Acquisition involves other processes as
well, including accommodation of data (how the data are incorporated into
the developing linguistic system and why they might not be), how Universal
Grammar acts upon the data, restructuring (how incorporated data affect the
system, such as when regular forms cause irregular forms to become regularized),
how learners make output, how interaction affects acquisition, and
others. In short, IP is only concerned with initial data gathering. Consider the fol-
lowing analogy: honey-making. Bees have to make honey. To do so, they have
to gather nectar. They go to some flowers and not others. They have to find
their way to flowers and then back to the hive. They then process the nectar to
produce honey. They build combs to store the honey, and so on. All of these
endeavors are part of honey-making. But we can isolate our research to ask
the following questions: How do bees gather nectar? Why do they select some
flowers and not others? This is similar to the concerns of IP: how do learners
Input Processing in Adult L2 Acquisition 117
make form–meaning connections? Why, at a given period in time, do they
make some connections and not others? IP isolates one part of acquisition; it is
concerned with the “nectar gathering aspect” of acquisition and leaves other
models and theories to account for what happens to the nectar when it gets to
the hive. (See Rothman & VanPatten, 2013, as well as VanPatten, 2014a, 2018,
for a discussion of how various theories are necessary to account for the com-
plex picture that is acquisition.)
Misunderstanding 2: Input processing discounts a role for output, social factors, and
other matters. Under this scenario, the person believes that because a researcher or scholar focuses
on one aspect of acquisition, he or she does not believe anything
else plays a role in acquisition. We thus hear of such things as “the input
versus output debate” or “comprehension versus production” in SLA. Again,
if we go back to the honey analogy, clearly someone who examines how bees
collect nectar and why they do it the way they do it understands quite well that
gathering nectar is not the same as making honey. And hopefully someone who
researches what happens in the hive once the nectar arrives clearly understands
that without nectar there is no honey-making. That a researcher focuses on one
particular part of the acquisition puzzle does not mean he or she discounts the
rest. It means that the researcher is merely staking out a piece of the puzzle to
examine in detail.
Misunderstanding 3: Input processing is equivalent to “noticing.” Some readers of
research on IP mistakenly equate processing with noticing. As a reminder to
the reader, processing means that learners are connecting meaning and form.
Examples include the following: /kaet/ means “cat” (however this concept is represented in the
mind of the learner); /tahkt/ means that the talking happened in a past time
frame; and “John was told a lie by Mary” means that Mary did the lying to
John. Noticing, as defined by Schmidt (e.g., Schmidt, 1990, 2001), does not
entail a connection between form and meaning. Noticing simply means that
learners have become aware of a formal feature of language (including new
words). In addition, noticing has not been applied to the sentence level; its use is
almost always restricted to morpho-lexical form. The distinction between the
two is important because in some publications, researchers have argued against
the principles outlined in this chapter. However, their research methods use
techniques for noticing and not processing, including measures of knowledge
(e.g., grammaticality judgment tasks), introspective think-alouds (e.g., “Tell me
what you notice in what you are reading”), and mark-up tasks (e.g., “As you
read, circle anything in the text that catches your attention”). One cannot use
research paradigms for noticing to argue against principles related to processing
(see, e.g., VanPatten, 2014b, 2015).
Misunderstanding 4: Input processing is a meaning-based approach to studying
acquisition and ignores what we know about syntactic processes. People who make
this claim are focused on aspects of the model in which lexical primacy and
the quest to get meaning from the input drives sentence interpretation, for
example. Their conclusion is understandable, but it is not correct. As we have
seen with the issue of the First-Noun Principle and with parsing, the model is
also concerned with syntactic aspects of parsing and how these affect sentence
interpretation and processing (which in turn affects acquisition). What is more,
sometimes those who believe IP to ignore syntactic processes may be thinking
of what we know about adult native-speaking models of sentence interpretation,
which are largely (but not exclusively) syntactic in nature. The idea is that
if this is what native-speaking processing models entail, shouldn’t L2 models do
the same? The answer is maybe. The position taken by those of us in IP research
is that other than the kinds of principles described here, processing develops
over time. What learners begin with may not be processing mechanisms that
can make full use of syntactic processes in sentence interpretation the way
native speakers can. For example, in one experiment, researchers have shown
that native speakers and nonnatives process “gap” sentences differently. Gap
sentences are those in which a wh- element (e.g., who) has been moved out of
one part of the sentence into another: “The nurse who the doctor argued that
the rude patient had angered is refusing to work late.” In this kind of sentence,
who (the relative clause marker) is actually linked to the verb angered (i.e., is the
object of the verb angered). What the researchers noticed is that even though
both natives and nonnatives can equally determine who was rude to whom,
their millisecond-by-millisecond processing reveals substantial differences in
how they make use of syntactic processing, with the nonnatives relying much
more on lexical-semantic and other nonsyntactic cues. This is referred to as the
Shallow Structure Hypothesis (Clahsen & Felser, 2006). This does not mean
that the model of IP and the SSH are equivalent; the point here is that it is possi-
ble to process sentences and not make full use of syntactic resources in doing so.
Misunderstanding 5: Input processing is a pedagogical approach. Some people believe
that the model of IP as described here is a pedagogical model. This is because
there is a pedagogical intervention called processing instruction (what some people
mistakenly call “input processing instruction”) that is derived from insights about
IP. Processing instruction is directed at the following question: If we know what
learners are doing wrong at the level of IP, can we create a pedagogical intervention
that is comprehension based to push them away from nonoptimal processing? IP,
however, is not about pedagogy nor is it concerned with what learners in class-
rooms do. As a model of processing, it is meant to apply to all learners of all lan-
guages in all contexts (in and out of classrooms). Thus, the First-Noun Principle
could be researched with learners of English as immigrants in the United States,
learners of English in a classroom in Canada, learners of English in a classroom in
Saudi Arabia, and so on. The model attempts to describe what learners do on their
own, the same way research on, say, Universal Grammar describes what learners
do on their own regardless of instruction. With this said, there have been many
more studies on processing instruction than on IP. Perhaps for this reason there is
some confusion in the literature.
An Exemplary Study: Wong and Ito (2018)
In this study, Wong and Ito (2018) set out to see if a particular pedagogical
intervention informed by IP called processing instruction changes process-
ing behaviors as measured in real time. In two experiments, they examined
third-semester college-level learners of French as an L2 with English as L1. The
processing problem was the First-Noun Principle and its intersection with the
French causative with the verb faire “to make.” English and French form caus-
ative sentences slightly differently as illustrated in (5) and (6).
What is different is that in the embedded clause after the verbs “make” and faire,
the subject/agent of the verb clean comes before the verb in English while in
French the subject/agent of the verb nettoyer comes after the verb and is marked
by a preposition. The prediction from the First-Noun Principle would be that
learners of French will carry the first noun they encounter, in this case Henri,
over to the second verb, nettoyer, and “assume” he is the agent/subject of that
verb. In short, they may not process the verb fait in any real way and interpret
the sentence to mean something like “Henry cleans the kitchen for Phillip.”
In earlier research (VanPatten & Wong, 2004), this was shown to be the case.
What Wong and Ito did was divide their participants into two treatment
groups, one receiving processing instruction (derived from insights of IP principles)
and the other receiving traditional instruction (which involves presentation plus practice of
a given structure, in this case causatives). Participants engaged in two pre- and
post-treatment assessments. One was designed to test basic sentence compre-
hension and involved selecting the correct picture to match what was heard
(e.g., either Henry is standing there while Phillip is cleaning the kitchen or
Phillip is watching while Henry is cleaning the kitchen). The other assessment
was an on-line eye-tracking experiment using what is called the Visual World
Paradigm. In this kind of experiment, the participants hear a sentence while
two or more pictures are displayed. Their unconscious eye movements (i.e.,
what pictures they first look at, when they look at another if at all, if they return
to another picture, and so on) are recorded.
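One common way to summarize recordings of this kind is the proportion of looks to the target picture in successive time windows. The sketch below assumes a hypothetical record of (time in milliseconds, picture label) samples and an arbitrary 200 ms window; it is not a description of how Wong and Ito analyzed their data.

```python
def target_look_proportions(samples, target, bin_ms=200):
    """Return {bin_start_ms: proportion of samples in that bin that are on the target}."""
    bins = {}
    for time_ms, picture in samples:
        start = (time_ms // bin_ms) * bin_ms
        hits, n = bins.get(start, (0, 0))
        bins[start] = (hits + (picture == target), n + 1)
    return {start: hits / n for start, (hits, n) in sorted(bins.items())}

# Invented toy samples: gaze moves from the distractor picture to the target picture.
toy_samples = [
    (0, "distractor"),
    (100, "distractor"),
    (200, "distractor"),
    (300, "target"),
    (400, "target"),
    (500, "target"),
]
print(target_look_proportions(toy_samples, "target"))
```

In this toy example the proportion of looks to the target rises across the three windows, which is the kind of shift toward the correct picture that such an analysis is meant to capture.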
On the pre-treatment assessments, Wong and Ito found the participants per-
formed rather poorly on interpretation of causatives in French; they chose the
correct picture less than 10% of the time. In eye-tracking, their eyes tended not
to go to the correct picture either as they seemed to settle on the wrong picture
from the outset. What is interesting and meaningful for the present chapter are
the post-treatment results. For the comprehension test, scores
improved significantly for the processing instruction group, to around 62%, but
for the traditional group, scores nudged up to around 11%. In eye-tracking,
the processing group began to show different eye movements after treatment,
going more often toward the correct picture and regressing to it while the
traditional group did not. When the researchers added explicit information
into the mix (their experiment 2), the results were the same except that the
traditional group’s comprehension scores only increased to 30%, while their
unconscious eye movements slightly nudged toward the correct picture. The
results for comprehension and eye movement did not change for the processing
group when explicit information was added.
The results of this experiment are interesting for a number of reasons. First,
they show that explicit information (discussed later in this chapter) is
barely useable by participants to change behaviors or knowledge in something
like IP. Second, the researchers were somewhat surprised that the processing
instruction group did not improve more and concluded that the First-Noun Principle
represents a fairly strong strategy that takes considerable encounters in
the input to overcome. (For an interesting comparison with similar results, see
the study by VanPatten, Borst, Collopy, Qualin, and Price (2013), in which the
First-Noun Principle was applied to four different languages: Spanish, Russian,
German, and French.) In short, the researchers concluded that the first-noun strategy
used by learners can be fairly resilient.
Explanation of Observed Findings in SLA
Observation 1: Exposure to input is necessary for L2 acquisition; Observation 2: A good
deal of L2 acquisition happens incidentally. It goes without saying that IP incorpo-
rates the important role of input. What is more, however, is that the model of
IP would suggest that most of acquisition is incidental. As we noted earlier, IP
is dependent on comprehension (learners actively engaged in getting meaning
from what they hear or read). In a certain sense, acquisition is a byproduct of
learners’ actively attempting to comprehend input. Their primary focus is on
meaning; the connection of form and meaning and the parsing of sentences are a
result of the learners’ communicative endeavors.
Observation 4: Learners’ output (speech) often follows predictable paths with predict-
able stages in the acquisition of a given structure. Although it is not the goal of IP to
explain all of L2 acquisition, there are certain observed phenomena for which it
can help to account. Due to its concern with the question of “Why do learners
make some form–meaning connections and not others?” it can speak to orders
of acquisition. When taken together, the various principles of IP account for
why the verbal inflection system in English, for example, emerges the way it
does. Learners will first process (and subsequently acquire) -ing due to the Mean-
ing before Nonmeaning Principle and due to the Lexical Preference Principle
(no lexical items carry the meaning of -ing). Third-person -s will be acquired
last because of the Preference for Nonredundancy Principle (third-person -s is
always redundant whereas the other verbal inflections in English are not). Like-
wise, the initial stages of the acquisition of negation in English, for example,
are marked by the isolation of specific words to indicate negation: notably “no”
and unanalyzed “don’t” (i.e., the learner does not know that “don’t” consists of
two words, “do” and “not,” and merely uses it as a substitute for “no”). What
this suggests is that initial IP is attempting to isolate content words to indicate
negation.
Observation 7: There are limits on the effects of frequency on L2 language acquisition;
Observation 8: There are limits on the effect of a learner’s first language on L2 acquisi-
tion. Within IP, frequency is not a major factor. Because IP is concerned with
initial processing and the factors that affect it, frequency does not play a major
role. For example, adjective agreement is frequent in Spanish, but the principle
regarding redundancy militates against initial processing of agreement. Other
less frequent things, if they are not redundant, will get processed sooner. The
problem with frequency is that sometimes it goes hand-in-hand with redun-
dancy/nonredundancy. For example, -ing may be more frequent in English
than simple past tense, -ed. But -ing is also never redundant whereas -ed often
is (see earlier). The question then becomes: Is it frequency that gets -ing processed
before -ed, or is it the nonredundancy and meaning-based nature of -ing,
as suggested by the Lexical Preference Principle? Such questions can only be
answered by continued research on a variety of languages.
IP accounts for limits of the effects of both frequency and the L1. The various
principles that deal with Lexical Preference, Nonredundancy, Meaning before
Nonmeaning, and so on, would militate against the sheer effects of frequency
as well as against the L1. Just because a form is highly frequent does not mean
it will be processed if (a) it is redundant and/or (b) it carries no meaning, for
example. At the same time, if parsing strategies turn out to be at least partially
universal rather than L1 based (see the discussion on the First-Noun Principle),
then the model of IP would account even more for the limited effects of the L1.
Observation 9: There are limits on the effects of instruction on L2 acquisition. The
present model of IP also helps to account for the limited effects of instruction.
A good deal of instruction is centered on product rather than process. That is,
instruction is most often concerned with rules and with learner output. Our
model of IP suggests that part of the learning problem is in processing. Thus,
if instruction fails to account for how things get processed in the input, it may
not be as useful as we think. Work on IP has led to an instructional inter-
vention called processing instruction, which speaks to this very issue. In process-
ing instruction, instruction actually seeks to intervene during IP, thus altering
learners’ processing behaviors and leading to more grammatically rich (one
might even say more “appropriate”) intake.
Observation 10: There are limits on the effects of output (learner production) on
language acquisition. Although IP does not speak directly to issues of output, the
model would suggest that the effects of learner output would be constrained if
output does not help to alter learners’ processing behaviors. For example, an
English-speaking learner of Spanish can produce all the sentences he or she
wants in a variety of contexts. But if the interaction does not lead the learner
to realize that he or she has misinterpreted an OVS sentence, then little will
change in terms of acquisition. That learner will continue to process Spanish
first (pro)nouns as subjects. Under this scenario, output is useful if it leads learn-
ers to register and then correct their misinterpretations of others’ meanings (see
VanPatten, 2004c, for some discussion on this).
The Explicit/Implicit Debate
The model of IP presented in this chapter is neutral/agnostic on the issue of
whether adults engage implicit or explicit processes when learning a second
language, although more recently VanPatten (2015, and elsewhere) has begun
to take a stand on how explicit processing cannot be a significant part of IP.
In most models of parsing and processing, syntactic computations and map-
pings occur outside of awareness, except perhaps in the case of learning lexical
items (see, e.g., Truscott & Sharwood Smith, 2004, 2011). Indeed, processing
would be a very laborious process if the learner stopped at each word or piece
of morphological datum to explicitly register it (e.g., “this is a verb, it means
X, it refers to a 3rd person, it’s in the past tense” and so on). Such explicit pro-
cessing would grind comprehension to a halt. People can and do experience
moments of “what?” while listening or reading, very brief milliseconds of rec-
ognizing that what they heard or read is not what was meant. But even in such
cases, it is not clear that the resolution of what was meant happens explicitly;
that is, awareness of a problem does not necessarily entail awareness of how
to resolve the problem (e.g., “oh, that was a reduced relative clause and not a
main verb …”, “oh, that was a past tense verb form and not a present tense verb
form …”). (For additional discussion see VanPatten (2015).) And, there is some
evidence that, as far as explicit processing is concerned—that is, the explicit
teaching of information about the language and whether that information can
be used during on-line comprehension—explicit information usually is not
and probably cannot be used to process language (e.g., VanPatten et al., 2013).
Conclusion
IP as a phenomenon should be viewed as one part of a complex set of processes
that we call acquisition. As such, any model or theory of IP should not be
expected to be a model or theory of acquisition more generally. Ideally, one
would like to see various models that account for different processes in acquisi-
tion and when viewed this way, a better picture of acquisition ought to emerge.
Models and theories undergo change and evolution and this is no less true
for a model of L2 IP. As is the case with almost every theory and model in
SLA, challenges have been leveled against IP resulting in lively debate in the
professional literature (see, e.g., DeKeyser, Salaberry, Robinson, & Harrington,
2002; VanPatten, 2002, for one exchange; and Harrington, 2004; Carroll,
2004; VanPatten, 2004b, for another exchange). However, these challenges are
leveled at the specifics of the model and not at the underlying questions that
drive the model, namely, “Why some form–meaning connections and not oth-
ers? Under what conditions?” Some kind of model of IP will need to coexist
alongside models that deal with how linguistic data are incorporated into the
developing system as well as how learners access the system to make output,
and so on. The current model of IP is our first pass at considering how learners
process input during real-time comprehension.
Discussion Questions
1. IP theory claims that lexical items are privileged in IP (i.e., the Primacy
of Content Words Principle). Do you think lexical items are privileged in
acquisition more generally? What about learner attempts to produce lan-
guage? What about learner strategies in terms of overt attempts to learn a
language (e.g., conscious strategies to try and comprehend what someone
else is saying)?
2. VanPatten argues that L2 parsing may involve universal procedures or it
may be L1 based initially (i.e., the L1 parser is “transferred”). Or some
combination of both may be at play. Which do you think is more likely?
Can you think of additional experimentation and data that would help to
determine which position is more likely?
3. The theory of IP in this chapter claims that learners’ initial orientation
toward input is to process it for meaning; that is, they do what they can
to extract basic meanings from sentences. Can you think of any circum-
stances under which learners would approach processing sentences for
form/structure first? Do you think this leads to acquisition? Keep in mind
the definition of processing as the connection of form and meaning.
4. Take the language you teach or are most familiar with and try to apply
either the Lexical Preference Principle or the First-Noun Principle to
IP for that language. Can you make any predictions about processing
problems? For example, under the Lexical Preference Principle, what
formal features of the language tend to co-occur with lexical items or
phrases that express the same concept? What is your prediction about
processing?
5. One of the most well-known outcomes of the model of IP is VanPatten’s
processing instruction. Select one of the following studies and present it to your
class: Cadierno (1995), VanPatten et al. (2013), Uludag and VanPatten (2012).
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Notes
Suggested Further Reading
VanPatten, B. (2015). Foundations of processing instruction. International Review of
Applied Linguistics, 53, 91–109.
This article updates some of the basic ideas about input processing that under-
lie processing instruction including a detailed discussion of the difference between
processing and noticing as well as why processing cannot involve explicit learning
or explicit processing to any significant degree.
VanPatten, B. (2009). Processing matters in input enhancement. In T. Piske &
M. Young-Scholten (Eds.), Input matters (pp. 47–61). Clevedon, England: Multilin-
gual Matters.
This book chapter compares and contrasts three frameworks related to input pro-
cessing, discussing such things as structure distance and the role of the L1.
VanPatten, B. (Ed.). (2004). Processing instruction: Theory, research, and commentary.
Mahwah, NJ: Lawrence Erlbaum.
This book is, essentially, an update on VanPatten’s (1996) book Input Process-
ing and Grammar Instruction: Theory and Research. The 2004 volume contains two
important expository essays (one on input processing and one on processing instruc-
tion). Also included are 10 previously unpublished research papers. What makes the
book interesting is the inclusion of commentary and criticism by six other scholars,
offering a balance for the reader.
VanPatten, B., & Rothman, J. (2014). Against “rules.” In A. Benati, C. Laval, &
M. J. Arche (Eds.), The grammar dimension in instructed second language acquisition: Theory,
research, and practice (pp. 15–35). London, England: Bloomsbury Press.
This chapter takes a generative perspective on language and language acquisi-
tion, while demonstrating the role that processing plays in acquisition. It thus situ-
ates one theory alongside another to demonstrate how two perspectives are not in
competition but may work together to help understand acquisition.
VanPatten, B., Williams, J., & Rott, S. (2004). Form–meaning connections in sec-
ond language acquisition. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet
(Eds.), Form–meaning connections in second language acquisition (pp. 1–26). Mahwah, NJ:
Lawrence Erlbaum.
This first chapter in the VanPatten, Williams, Rott, and Overstreet book offers
an overview of the many factors that contribute to how form–meaning connections
are made and strengthened. As such, it extends beyond the scope of IP theory,
demonstrating how IP fits into a larger picture of acquisition.
References
Barcroft, J., & VanPatten, B. (1997). Acoustic salience: Testing location, stress and the
boundedness of grammatical form in second language acquisition input perception.
In W. R. Glass & A. T. Pérez-Leroux (Eds.), Contemporary perspectives on the acquisi-
tion of Spanish: Vol. 2. Production, processing, and comprehension (pp. 109–121). Somer-
ville, MA: Cascadilla Press.
Cadierno, T. (1995). Formal instruction from a processing perspective: An investigation
into the Spanish past tense. The Modern Language Journal, 79, 179–193.
Carreiras, M., García-Albea, J., & Sebastián-Gallés, N. (Eds.). (1999). Language process-
ing in Spanish. Mahwah, NJ: Lawrence Erlbaum.
Carroll, S. (2001). Input and evidence: The raw material of second language acquisition.
Amsterdam, Netherlands: John Benjamins.
Carroll, S. (2004). Commentary: Some general and specific comments on input pro-
cessing and processing instruction. In B. VanPatten (Ed.), Processing instruction: The-
ory, research, and commentary (pp. 293–309). Mahwah, NJ: Lawrence Erlbaum.
Clahsen, H., & Felser, C. (2006). Grammatical processing in language learners. Applied
Psycholinguistics, 27, 3–42.
Clifton, C., Frazier, L., & Rayner, K. (1994). Perspectives on sentence processing. Mahwah,
NJ: Lawrence Erlbaum.
DeKeyser, R., Salaberry, R., Robinson, P., & Harrington, M. (2002). What gets
processed in processing instruction? A commentary on Bill VanPatten’s “processing
instruction: An update.” Language Learning, 52, 805–823.
Ellis, N. C., & Sagarra, N. (2010). The bounds of adult language acquisition: Blocking
and learned attention. Studies in Second Language Acquisition, 32, 553–580.
Ervin-Tripp, S. (1974). Is second language learning like the first? TESOL Quarterly, 8,
111–127.
Gass, S. M. (1989). How do learners resolve linguistic conflicts? In S. M. Gass & J.
Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 183–199).
Cambridge, England: Cambridge University Press.
Harrington, M. (2004). IP as a theory of processing input. In B. VanPatten (Ed.), Process-
ing instruction: Theory, research, and commentary (pp. 79–92). Mahwah, NJ: Lawrence
Erlbaum.
Isabelli, C. (2008). First-noun principle or L1 transfer in SLA? Hispania, 91, 465–478.
Jegerski, J., & VanPatten, B. (Eds.). (2013). Research methods in second language psycholin-
guistics. New York, NY: Routledge.
Keating, G. D. (2013). Eye-tracking with text. In J. Jegerski & B. VanPatten (Eds.),
Research methods in second language psycholinguistics (pp. 69–92). New York, NY:
Routledge.
Lee, J. F., Cadierno, T., Glass, W. R., & VanPatten, B. (1995). The effects of lexical and
grammatical cues on processing tense in second language input. Applied Language
Learning, 8, 1–23.
Pritchett, B. L. (1992). Grammatical competence and parsing performance. Chicago, IL:
University of Chicago Press.
Rothman, J., & VanPatten, B. (2013). On multiplicity and mutual exclusivity: The
case for different theories. In M. P. García Mayo, M. J. Gutierrez-Mangado, &
M. Martínez Adrián (Eds.), Contemporary approaches to second language acquisition
(pp. 243–256). Amsterdam, Netherlands: John Benjamins.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied
Linguistics, 11, 129–158.
Schmidt, R. W. (2001). Attention. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 3–32). Cambridge, England: Cambridge University Press.
Tight, D. G. (2012). The first-noun principle and ambitransitive verbs. Hispania, 95,
103–115.
Truscott, J., & Sharwood Smith, M. (2004). Acquisition by processing: A modular
perspective on language development. Bilingualism: Language and Cognition, 7, 1–20.
Truscott, J., & Sharwood Smith, M. (2011). Input, intake, and consciousness: The quest
for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528.
Uludag, O., & VanPatten, B. (2012). The comparative effects of processing instruction
and dictogloss on the acquisition of the English passive by speakers of Turkish. Inter-
national Review of Applied Linguistics, 50, 187–210.
VanPatten, B. (1984). Learner comprehension of clitic object pronouns in Spanish.
Hispanic Linguistics, 1, 56–66.
VanPatten, B. (1996). Input processing and grammar instruction: Theory and research.
Norwood, NJ: Ablex.
VanPatten, B. (2002). Processing the content of input processing and processing
instruction research: A response to DeKeyser, Salaberry, Robinson, and Harrington.
Language Learning, 52, 825–831.
VanPatten, B. (2004a). Input processing in second language acquisition. In B. VanPatten
(Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ:
Lawrence Erlbaum.
VanPatten, B. (2004b). Several reflections on why there is good reason to continue
researching the effects of processing instruction. In B. VanPatten (Ed.), Processing
instruction: Theory, research, and commentary (pp. 325–335). Mahwah, NJ: Lawrence
Erlbaum.
VanPatten, B. (2004c). On the role(s) of input and output in making form–meaning
connections. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form–
meaning connections in second language acquisition (pp. 29–47). Mahwah, NJ: Lawrence
Erlbaum.
VanPatten, B. (2014a). Language acquisition theories. In C. Fäcke (Ed.), Language acqui-
sition (pp. 103–109). Berlin, Germany: Mouton de Gruyter.
VanPatten, B. (2014b). Input processing by novices: The nature of processing and
research methods. In Z. H. Hong & R. Rast (Eds.), Input processing at second language
initial state. Cambridge, England: Cambridge University Press.
VanPatten, B. (2015). Foundations of processing instruction. For a special issue of Inter-
national Review of Applied Linguistics, 53, 91–109.
VanPatten, B. (2018). Theories of second language acquisition. In K. Geeslin (Ed.),
The handbook of Spanish linguistics (pp. 649–667). Cambridge, England: Cambridge
University Press.
VanPatten, B., Borst, S., Collopy, E., Qualin, A., & Price, J. (2013). Explicit informa-
tion, grammatical sensitivity, and the First-Noun Principle: A cross-linguistic study
in processing instruction. The Modern Language Journal, 92, 506–527.
VanPatten, B. & Houston, T. (1998). The effects of intrasentential context on second
language sentence processing. Spanish Applied Linguistics, 2, 53–70.
VanPatten, B., & Keating, G. D. (2007, April). Getting tense. In Paper delivered at the
annual meeting of the American Association for Applied Linguistics, Costa Mesa, CA.
VanPatten, B., & Rothman, J. (2014). Against “rules.” In A. Benati, C. Laval, & M.
J. Arche (Eds.), The grammar dimension in instructed second language acquisition: Theory,
research, and practice (pp. 15–35). London, England: Bloomsbury Press.
VanPatten, B., & Wong, W. (2004). Processing instruction vs. traditional instruction,
once again: A study on the French causative. In B. VanPatten (Ed.), Processing instruc-
tion: Theory, research, and commentary (pp. 97–118). Mahwah, NJ: Erlbaum.
White, L. (1987). Against comprehensible input: The input hypothesis and the devel-
opment of L2 competence. Applied Linguistics, 8, 95–110.
Wong, W., & Ito, K. (2018). The effects of processing instruction and traditional
instruction on L2 online processing of the causative construction in French: An
eye-tracking study. Studies in Second Language Acquisition, 40, 241–268.
7
THE DECLARATIVE/PROCEDURAL MODEL
A Neurobiologically Motivated Theory of First and Second Language¹
Michael T. Ullman
In evolution and biology, previously existing structures and mechanisms are
constantly being reused—that is, co-opted—for new purposes. For exam-
ple, fins evolved into limbs, which in turn became wings, while scales were
modified into feathers. Feathers themselves initially appear to have evolved for
temperature regulation and then only later were adapted for flight. Reusing
structures to solve new problems occurs not only evolutionarily but also devel-
opmentally. For example, our ability to read emerges from the co-optation of
brain circuitry that existed before the inception of reading.
It is thus likely that the uniquely human capacity of language depends at
least in part, if not largely, on neurobiological substrates that existed prior to
this capacity—whether or not those substrates have become further special-
ized for language, either through evolution or acquisition and development.
In this chapter, I focus on the co-optation of learning and memory sys-
tems for language, since most of language must be learned, whether or not
aspects are innately specified. In particular, I am interested in whether and
how declarative memory and procedural memory, arguably the two most
important learning and memory systems in the brain, play roles in language.
The declarative/procedural (DP) model simply posits that, due to the funda-
mental principle of co-optation, these two systems should play key roles in
language in ways that are largely analogous to the functioning of the systems in
non-linguistic domains.
These learning and memory systems have been well studied in both humans
and non-human animals and thus are relatively well understood at many levels,
including their neurobiological substrates and their behavioral correlates. This
independent knowledge of the systems leads to a wide range of often quite
specific and novel predictions about language that one might have no reason
to make based on the study of language alone. For example, if a given brain
structure or gene is known to play a particular role in these memory systems,
we might expect it to play a similar role in language, even if one might have no
reason to make such a prediction based on our knowledge of language alone.
The DP model is thus a very powerful theoretical framework—whether it is
right or wrong, and indeed scientific hypotheses are often at least partly wrong.
As we shall see, however, accumulating evidence does in fact seem to support
important aspects of the model, in both first and second language (also see
Ullman, 2004, 2005, 2016; Ullman, Earle, Walenski, & Janacsek, 2020).
The Theory and Its Constructs
In this section, I present an overview of the two learning and memory systems
and how they interact with each other. I then examine predictions for language
that follow from this knowledge of the systems. Although I focus on predic-
tions for second language (L2), I specifically compare L2 and first language
(L1), since the predictions for L2 are intimately bound up with those for L1.
For more on the DP model, including background, predictions, and evidence,
see Ullman (2001b, 2004, 2007, 2016), Hamrick, Lum, and Ullman (2018),
and Ullman et al. (2020). For more on the DP model and second language in
particular, see Ullman (2001a, 2005, 2006, 2012), Morgan-Short and Ullman
(2012), and Hamrick et al. (2018).
A Primer on the Brain
Before we delve into the two memory systems, I will provide a quick tour of
the brain, as an overview of the necessary neurobiological basics. The largest
part of the brain, and the most important for cognition, including language,
is the cerebrum (Figure 7.1). The cerebrum is composed of two hemispheres,
each of which contains four lobes: the frontal, temporal, parietal, and occipital
lobes. Each lobe contains many smaller structures known as gyri and sulci (sin-
gular: gyrus and sulcus). The gyri are the ridges on the surface of the brain, and
the sulci the valleys that lie between them. These gyri and sulci form the outer
part of the cerebrum, called cortex (from Latin for “bark” or “rind”).
Although most studies of language focus on cortex, that is, cortical regions
such as Broca’s area (which corresponds to the opercular and triangular parts
of the inferior frontal gyrus; see Figure 7.1), other parts of the brain are also
important for language. The cerebellum, which lies below the cerebrum at the
back of the brain, was previously thought to be involved only in movement.
However, we now know that it plays roles in cognition, including memory
and language. There are also a number of structures deep inside the cerebrum
itself. Of particular interest here are two sets of structures (see Plate 1): first,
the basal ganglia (which include the caudate nucleus, the nearby putamen,
and other portions not discussed here), which were previously thought to be
mainly involved in movement, and second, the hippocampus and other medial
temporal lobe (MTL) regions (i.e., structures located toward the inner
part of the temporal lobe, which can be found under the temples), which were
previously thought to underlie only memory. As we shall see, these structures
play important roles in language.
FIGURE 7.1 The left side of the brain, indicating structures referred to in the chapter.
Labeled structures: central sulcus; premotor cortex; frontal, parietal, occipital, and
temporal lobes; inferior frontal gyrus (opercular, triangular, and orbital parts);
lateral sulcus (Sylvian fissure); cerebrum; cerebellum.
The Learning and Memory Systems
Declarative Memory
Declarative memory is defined here as the learning and memory that rely on
the MTL and its associated circuitry (e.g., see Ullman et al., 2020). Declar-
ative memory has been intensively studied in both humans and non-human
animals. The hippocampus and other MTL structures are critical for learning
and consolidating new knowledge in this system (consolidation refers to the
stabilization of memories after learning, for example, during sleep). These
MTL structures may be not just involved, but actually required, for learning
arbitrary pieces of information and their associations. Evidence for this comes
from studies of patients with extensive MTL damage. These individuals, such
as the famous patient H.M., cannot generally learn new idiosyncratic infor-
mation, such as facts or events. The different MTL structures have somewhat
different roles. For example, the hippocampus may be especially important
for learning associations, while other MTL structures, such as perirhinal cor-
tex, may be more important for learning individual items (Davachi, 2006).
And while the hippocampus seems to be involved in the explicit recollec-
tion of at least recently learned information, perirhinal cortex may underlie
familiarity-based recognition (Eichenbaum, Yonelinas, & Ranganath, 2007).
Although MTL structures are critical for learning and processing new knowl-
edge in this system, eventually this knowledge seems to depend less on MTL
structures and more on neocortex, especially in the temporal lobes (neocortex
refers to all cerebral cortex outside the MTL; for example, all of the cortex you
see in Figure 7.1 is neocortex). Additionally, a region corresponding largely to
the triangular and orbital parts of the inferior frontal cortex—often simplified
to Brodmann’s Areas (BAs) 45 and 47, respectively—may play roles in the
encoding of new memories as well as their later recall. The molecular bases
of declarative memory are also beginning to be understood. For example,
various genes (e.g., for the proteins BDNF or APOE) play important roles in
declarative memory and hippocampal function, as does the hormone estro-
gen (higher levels are associated with better declarative memory). For more
information about the declarative memory brain system and its functions, see
Eichenbaum (2012), Henke (2010), Squire and Wixted (2011), and Ullman
(2004, 2016).
The learning and memory functions of this network of brain structures
are reasonably well characterized. Their role in learning idiosyncratic knowl-
edge across a wide range of modalities and domains may explain why they
are important for learning information about facts (semantic knowledge) and
events (episodic knowledge), such as the fact that the dish röschti is relished in
Switzerland, or that earlier this week my daughter Clemi and I ate delicious Thai
pork buns together. However, declarative memory is extremely flexible and
can learn much more than just semantic and episodic knowledge. Knowledge
can be learned very quickly in declarative memory, with as little as a single
exposure of the stimulus (the reader now knows what I ate with Clemi earlier
this week), although additional exposures of course strengthen these memories.
Knowledge learned in this system is at least partly, although not completely,
explicit (available to conscious awareness): the system also underlies implicit
(non-conscious) knowledge (e.g., Henke, 2010); also see “The Explicit/Implicit
Debate” later in this chapter. Nevertheless, declarative memory appears to be
the only learning and memory system that underlies explicit knowledge; thus,
any explicit knowledge must have been learned in declarative memory.
Finally, a number of subject-level factors, including both developmental and
individual differences, appear to modulate learning in this system. Of particular
interest for L2 acquisition, learning in declarative memory seems to improve
during childhood and then plateaus in adolescence and early adulthood, after
which it declines. Thus, an older child, adolescent, or young adult tends to be
better at learning in this system than a young child. Sex is also a factor, with
evidence suggesting that females tend to have an advantage at declarative mem-
ory over males, possibly due to their higher estrogen levels. Other factors that
seem to affect declarative memory include handedness (left-handedness may
be associated with an advantage at declarative memory), sleep (memory con-
solidation seems to improve during sleep), and exercise (which may enhance
declarative memory).
Procedural Memory
Procedural memory is defined here as the learning and memory that rely on
the basal ganglia and their associated circuitry (Ullman et al., 2020). The sys-
tem is composed of a network of interconnected brain structures rooted in
circuitry running through the basal ganglia and neocortical regions, especially
in frontal cortex. The basal ganglia, particularly anterior portions of the caudate
nucleus and putamen, appear to underlie (early phases of) learning motor
and cognitive functions. In contrast, frontal and other neocortical regions—in
particular premotor (BA 6) and related cortex, including the opercular part
of the inferior frontal gyrus (BA 44; Figure 7.1)—may be more important
for processing such functions, especially after they have become automatized.
Note that although the cerebellum interacts with basal ganglia-based learning
(Bostan & Strick, 2018; Ullman, 2016), we do not focus on the cerebellum
here. Finally, some aspects of the molecular bases of procedural memory are
beginning to be understood. For example, a number of genes that play roles in
procedural memory have been identified, including for the proteins FOXP2,
DARPP-32, and DRD2. For more on this memory system, see Ashby, Turner,
and Horvitz (2010), Doyon et al. (2009), Eichenbaum (2012), Packard (2008),
Ullman (2004, 2016), and Ullman et al. (2020).
PLATE 1 The caudate nucleus (red), part of the basal ganglia, and the hippocampus
(green), in the medial temporal lobe.
PLATE 2 ERPs for word-order violations as compared to correct sentences, for
participants undergoing explicit (instructed) or implicit (uninstructed)
training of an artificial language: at low proficiency, high proficiency,
and 5 months later (retention). Adapted from Morgan-Short, Finger, et al.
(2012) and Morgan-Short, Steinhauer, et al. (2012).
This brain circuitry underlies the learning and processing of a range of
functions, including habits, perceptual-motor skills, perceptual sequences
(often tested in “statistical learning” paradigms), categories, and routes (for
navigation); see Ullman et al. (2020). Learning takes place by (1) generating
temporally sensitive predictions (e.g., predicting what the next item
in a sequence is and when it will occur, or that an item is a member of a
particular category); then (2) evaluating these predictions based on rapidly
occurring feedback (e.g., whether the next item in a sequence and when it
occurred were correctly predicted or not, or an indication if the predicted
category was correct); and (3) creating or updating representations after
incorrect predictions. The need for rapid feedback may help explain why
local dependencies in a sequence are learned faster than long-distance
dependencies (since the predicted item generally occurs sooner after the prediction
in local than in long-distance dependencies), which in turn are learned
faster if the dependency distance (the number of non-predictive items between the
elements) is smaller. Learning and knowledge in procedural memory seem to
be implicit. Learning occurs gradually rather than quickly as in declarative
memory but appears to eventually result in more automatized (faster, reliable,
inflexible) processing of functions than does learning in declarative memory.
Additionally, longer-term retention seems to be better for knowledge learned
in procedural memory than in declarative memory. Various factors appear to
modulate procedural memory abilities, including age, which is of course a
factor of particular interest for L2 learning: learning and consolidation in pro-
cedural memory may already be robust early in childhood, though they seem
to decline around adolescence, resulting in poorer learning/consolidation
abilities in adulthood.
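The three-step learning cycle described above (predict, evaluate against feedback, update) can be illustrated with a toy sketch. This is not a formalization proposed by the DP model: the function name `train_predictive_learner`, the `learning_rate` parameter, and the error-proportional update rule are all assumptions introduced here purely for illustration, and the sketch ignores timing, categories, and neurobiology.

```python
from collections import defaultdict


def train_predictive_learner(sequence, learning_rate=0.1, passes=5):
    """Toy prediction-driven sequence learner (illustrative assumption only).

    strengths[current][nxt] holds how strongly `nxt` is expected after
    `current`. Returns the strengths and the mean prediction error per pass.
    """
    strengths = defaultdict(lambda: defaultdict(float))
    mean_errors = []
    for _ in range(passes):
        total_error = 0.0
        for current, actual in zip(sequence, sequence[1:]):
            # (1) Prediction: how strongly was the upcoming item expected?
            expected = strengths[current][actual]
            # (2) Evaluation: immediate feedback yields a prediction error
            #     (1.0 = completely unexpected, 0.0 = fully expected).
            error = 1.0 - expected
            total_error += error
            # (3) Update: weaken competing expectations, then strengthen the
            #     correct one in proportion to the error, so well-predicted
            #     transitions change very little.
            for other in strengths[current]:
                if other != actual:
                    strengths[current][other] *= 1.0 - learning_rate
            strengths[current][actual] += learning_rate * error
        mean_errors.append(total_error / (len(sequence) - 1))
    return strengths, mean_errors


if __name__ == "__main__":
    # A repeating sequence with purely local dependencies (hypothetical input).
    seq = list("ABC" * 10)
    _, errors = train_predictive_learner(seq)
    for i, err in enumerate(errors, start=1):
        print(f"pass {i}: mean prediction error = {err:.3f}")
```

With a small learning rate, the mean prediction error in this sketch falls gradually over repeated passes rather than after a single exposure, which loosely mirrors the gradual learning attributed to procedural memory in the text.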
Interactions between the Memory Systems
Declarative and procedural memory interact with each other (also see Packard,
2008; Poldrack & Packard, 2003; Ullman, 2004, 2016; Ullman et al., 2020),
with important consequences for L2 acquisition. First, the two systems can
complement each other in learning and processing analogous knowledge, such
as knowledge of sequences and categories. That is, they play at least partly
redundant roles, though they generally learn and process the knowledge in
different ways. (However, some types of knowledge may be learnable only
in one or the other system, such as arbitrary pieces of information and their
associations in declarative memory, and perhaps automatized motor skills in
procedural memory.) Second, representations learned in declarative memory
can apparently inhibit (block) analogous representations learned in procedural
memory, and vice versa, depending on which is predominant. The two systems
can therefore also be thought of as being in competition.
Various factors appear to modulate which of the two systems is relied on
more. Declarative memory often learns knowledge initially (e.g., about a given
sequence or category), thanks to its quick learning abilities, while the proce-
dural system gradually acquires analogous knowledge, which eventually may
become predominant. Thus, overall, the processing of such knowledge tends to
gradually become automatized, due to its increasing dependence on procedural
memory and its automatization in this system. The learning context can also
affect which system is relied on more. Explicit instruction (e.g., of sequences),
or conscious attention to input stimuli and an attempt to understand underly-
ing patterns, can increase learning in declarative memory. Conversely, a lack
of explicit instruction, as well as manipulations that reduce attention to the
stimuli (e.g., in dual-task paradigms in which the learner’s attention is shifted
toward another task), or a high level of complexity of underlying patterns (thus
decreasing the learner’s ability to explicitly detect them), can shift learning
toward procedural memory. Additionally, rapid feedback in tasks involving
prediction leads to procedural learning, whereas slow feedback or none at all
leads primarily to learning in declarative memory.
Individual or group differences can also moderate the relative dependence
on the two systems. Any subject-level factors that enhance or depress learning
or processing in declarative or procedural memory relative to the other system
may shift reliance toward the more functionally available system. Thus, young
children (good procedural memory, not-so-good declarative memory) will
tend to rely heavily on procedural memory for learning, whereas young adults
(not-so-good procedural memory, good declarative memory) will instead tend
to rely heavily on declarative memory. Similarly, learning in one versus the
other system can be affected by factors such as genotype (e.g., individuals with
versions of the gene for BDNF that are associated with better declarative mem-
ory should rely more on this system) and sex (females should rely more on
declarative memory than males).
Predictions for Language
The fundamental claim of the DP model is that the declarative and proce-
dural memory systems should play roles in language that are largely analo-
gous to the roles they play in non-linguistic domains. Thus, our independent
knowledge of the two memory systems, as laid out above, leads to quite
specific predictions for language. Here I lay out a number of these predic-
tions, specifying where they are common to L1 and L2, and where they hold
particularly for L2.
Predictions: Declarative Memory
Since declarative memory is important, and perhaps necessary, for learning
arbitrary pieces of information and associating them, this learning and memory
system should be crucial for all learned idiosyncratic knowledge in language.
Thus, in both L1 and L2, simple content words (e.g., wombat, devour), includ-
ing their phonological forms, meanings, subcategorization frames (e.g., devour
requires a complement), and the mappings between them (e.g., their sound-
meaning mappings), should be learned at least in part in this system. Some sort
of knowledge about irregular morphological forms, both inflectional (e.g., dig-
dug) and derivational (e.g., solemn-solemnity), should also be stored in declarative
memory, as should knowledge about idiosyncratic aspects of idioms, proverbs,
and so on (e.g., jump the gun).
Since declarative memory is flexible in what it can learn, it should also be
available for learning arguably non-idiosyncratic aspects of language, including
those involving sequences, categories, and other types of knowledge that can
be learned by procedural memory. Thus, just like simple and irregular words,
one should be able to store, in some manner (e.g., as “chunks”), rule-governed
complex forms (e.g., walked, the cat), together with their meanings. Grammatical
rules and constraints should also be learnable by this system, explicitly or
implicitly. Other language-related knowledge that can be supported by proce-
dural memory may also be learned in declarative memory, including speech-
sound representations (Ullman et al., 2020). As we shall see, all such linguistic
knowledge that could be learned by either system should generally depend
more on declarative memory (and less on procedural memory) in L2 than L1,
due to factors such as age of acquisition and learning context.
In both L1 and L2, linguistic knowledge in declarative memory should be
learned relatively rapidly, perhaps in some cases even from a single presentation
of the information (e.g., if the information is simple enough), though repeated
exposure should improve learning and retention. Linguistic knowledge learned
in this system could be explicit or implicit, since both types can be learned by
declarative memory. However, any explicit long-term knowledge of language
must have been learned by declarative memory, since this appears to be the only
long-term memory system to underlie explicit knowledge. Importantly for L2,
language learning that depends on declarative memory should improve during
childhood, plateau in adolescence/early adulthood, and then decline.
We can also make neurobiological predictions about language knowledge
learned in declarative memory. The functional neuroanatomy of this knowl-
edge, whether in L1 or L2, and whether lexical or grammatical, should reflect
the functional neuroanatomy of declarative memory. Thus, the learning of
such knowledge should crucially depend on the hippocampus and other MTL
structures. Since the hippocampus seems to be especially important for learn-
ing associations, learning form–meaning mappings and other linguistic as-
sociations should rely heavily on the hippocampus. Retrieval of knowledge
learned in declarative memory should also depend on MTL structures, at
least soon after learning, with explicit recollection relying particularly on the
hippocampus, and familiarity-based recognition involving perirhinal cortex.
Eventually, MTL structures should become less important, with a correspond-
ing increasing role for neocortical regions, especially in the temporal lobes.
The area encompassing BA 45 and 47 should underlie the encoding of new
linguistic information being learned in declarative memory, as well as the
recall of that knowledge once it has been learned. Finally, the genes, hor-
mones (e.g., estrogen), and other factors (e.g., sex, handedness, sleep, exercise)
that moderate declarative memory should play analogous roles in aspects of
language learned by this system.
Predictions: Procedural Memory
Given what we know about procedural memory, this system should under-
lie the learning and processing of a variety of language-related functions,
including those related to sequences and categories (Ullman et al., 2020).
For example, it should play an important role in learning linguistic cate-
gories, such as grammatical categories (e.g., Noun, Determiner, Tense) and
speech-sound categories (e.g., phonemic and phonetic categories). Given the
apparent implicit nature of procedural memory, only implicit (not explicit)
linguistic knowledge should be learned in this system. However, not all
implicit linguistic knowledge should involve procedural memory, since there
are other implicit memory systems, and, as we have seen, declarative memory
also supports implicit knowledge (also see “The Explicit/Implicit Debate”).
Since procedural memory learns gradually, linguistic knowledge learned in
this system should be gradually acquired. Such knowledge should eventu-
ally tend to become automatized and should show excellent retention. The
system may be particularly important for prediction in language, including
prediction-based learning and the eventual automatized processing of such
predictions. However, only prediction-based linguistic learning with rapid
feedback should depend on procedural memory. Thus, in contexts where
feedback or its processing are slowed (e.g., where the input speech stream
does not occur at a rapid rate), procedural memory may be unlikely to learn
and process efficiently.
Procedural memory shares key characteristics with grammar learning,
in particular in L1. Both involve learning sequential and categorical knowl-
edge, which is largely implicit and eventually automatized in both cases, and
both involve prediction (Ullman et al., 2020). Thus, procedural memory is
expected to play a critical role in grammar. This should hold across linguis-
tic subdomains, including syntax, morphology, and phonology (phonotac-
tics). In particular, the system may subserve learning grammatical knowledge
that underlies the (real-time) combination of elements, perhaps through the
acquisition of relations between categories (e.g., Noun, Verb) or specific units
(e.g., individual words) that allow for the real-time prediction of downstream
elements. However, and of particular importance for L2 acquisition, grammar
should be easier to acquire and process in procedural memory in childhood
(whether in L1 or L2) than in adulthood (generally as an L2) for various reasons,
including the decline of procedural learning/consolidation during childhood
and the concomitant improvement in declarative memory (also see below).
Other aspects of language may also be learned in procedural memory
(Ullman, 2016; Ullman et al., 2020). First, even though lexical knowledge
should depend importantly on declarative memory, it may also rely partially
on procedural memory. For example, the implicit learning of word bound-
aries in a speech stream (“word segmentation,” often examined in statistical
learning paradigms) may rely on procedural memory, as may closed-class words
and morphemes (e.g., function words such as determiners and auxiliaries, and
bound inflectional morphemes such as the past tense suffix –ed), which are not
tightly linked to conceptual meanings but depend strongly on grammatical
structure. Second, articulation, which is a motor skill (in that it involves a serial
combination of motor computations that are gradually learned and automatized
over time), should depend heavily on procedural memory. Likewise, important
aspects of speech production, such as the timing, prediction, selection, and initiation
of speech-motor programs, should involve procedural memory (Ullman
et al., 2020). Third, efficient speech perception (the process of mapping the low-
level features of the speech signal to meaning), which depends on the real-time
prediction of upcoming sounds and other linguistic information, may also be
expected to rely on procedural memory (Ullman et al., 2020). Indeed, evidence
supports roles for procedural memory in all of these cases in L1 (Ullman
et al., 2020). However, analogous to grammar, in most if not all of these cases,
procedural memory should play less of a role (vs. declarative memory) and may
be less efficient in L2 than in L1.
As with declarative memory, neurobiological predictions for L1 and L2 fol-
low largely from what we know independently about procedural memory,
from both human and animal studies. Linguistic skills and knowledge learned
in this system should rely on frontal and basal ganglia structures. Learning
should rely on the basal ganglia, especially (during early phases of learning)
anterior portions of the caudate nucleus and putamen. The eventual autom-
atized processing of such knowledge should depend heavily on neocortical
regions, in particular premotor cortex and BA 44. However, since procedural
memory abilities decline during childhood, in L2 automatization may be a
slow and unreliable process. Thus, one might expect that the basal ganglia are
involved in learning for a longer period of time in L2 than in L1 (or early L2).
Finally, genes such as FOXP2 may modulate learning and processing in this
system.
Predictions: Interactions between the Memory Systems
Our understanding of the interactions between the two memory systems leads
to predictions for language and is of particular interest for L2 (see Ullman,
2004, 2005, 2016; Ullman et al., 2020). To a fair extent, we expect the two
memory systems to acquire the same or analogous linguistic knowledge, that
is, to play at least partly redundant roles. As in non-linguistic domains, such
redundancy may be found for any functions that can be subserved by both
systems. Given the learning flexibility of declarative memory, and the fact that
this system can underlie implicit as well as explicit knowledge, it should at least
partly support many (though not all) linguistic functions that can be learned by
procedural memory.
Declarative memory could support grammatical knowledge in a variety of
ways, including learning grammatical rules explicitly or implicitly, and stor-
ing rule-governed complex forms as chunks (which could be unstructured
or structured representations). Some aspects of grammar should be easier to
learn in declarative memory than others. For example, sequences involving
local dependencies (e.g., walked, the cat) should be easier to chunk than those
involving long-distance dependencies (e.g., John … walks). Thus, long-distance
dependencies should be particularly problematic in L2, as they are difficult to
learn not only in procedural memory (whose learning abilities are moreover
attenuated in later learners) but also in declarative memory. Declarative memory
may also support non-grammatical functions that can be learned in procedural
memory. For example, speech-sound representations may be learned in declar-
ative memory as suboptimal “episodic” (instance-specific) features of speech
rather than as categories in procedural memory (Ullman et al., 2020). Note that
articulation and speech-motor aspects of speech production may rely less on
declarative memory, which might not be well suited for learning automatized
motor-related functions (Ullman et al., 2020); thus, these aspects of language
may remain particularly difficult for L2 learners.
Various factors should modulate which of the two memory systems is
relied on more for linguistic knowledge that can be learned by either system.
Such knowledge should generally be learned first by declarative memory, but
more slowly and in parallel by procedural memory. However, since learning
in declarative memory improves during childhood up to adolescence and
young adulthood, while learning in procedural memory seems to become
less effective during this period, young adult L2 learners should rely more
on knowledge learned in declarative memory and less on that learned in
procedural memory as compared to (L1 or L2) child learners, holding con-
stant their exposure to the language. And though such knowledge should
eventually become at least somewhat proceduralized (i.e., dependent on
knowledge and skills learned in the procedural memory system, which can
lead to automatized processing) in both child and adult learners, this pro-
cess should occur faster and more completely in children. Thus, even after
years of exposure, adult L2 learners may still not attain the same degree of
proceduralization of their grammar as L1 or early L2 learners, and similarly
for other knowledge that can be learned in either system, such as speech-sound
categories. It is worth pointing out that similarly-aged adult L2
learners and L1 learners are often compared in empirical studies examining
L2 and L1 processing, even though this comparison probes the two groups at
quite different points in the learning trajectory, that is, L2 learners at earlier
stages than L1 learners.
Explicit language instruction, or encouraging attention to language stim-
uli or patterns in the input, should increase language learning in declarative
memory, while a lack of such instruction or attention, and greater complexity
of rules or patterns (e.g., more complex grammatical rules or constraints), may
lead to a greater relative dependence on procedural memory. Thus, explicit
instruction of grammar, as is often given in classrooms to L2 learners, should
encourage learning in declarative memory (which may then inhibit reliance on
procedural memory). Conversely, exposure to the L2 without explicit instruc-
tion, as often occurs in immersion contexts, may enhance grammar acquisition
in procedural memory, and thus lead to more L1-like grammatical process-
ing. Similarly, dual-task paradigms during learning (such as encouraging the
learner to pay attention to something other than grammatical patterns in the
input) may encourage proceduralization. Additionally, slower feedback or no
feedback in linguistic learning that involves prediction should push learning
toward declarative memory, while faster feedback should promote learning
in procedural memory. This should hold whatever the form of the feedback,
including the mere presence or absence of upcoming items in the speech stream
that may have been predicted, and thus can serve as feedback regarding the
correctness of predictions that they should occur (Ullman et al., 2020). Thus,
slower input (e.g., in a classroom environment, or from other non-native
speakers) may decrease procedural memory-based learning of grammar, lead-
ing to a greater dependence on declarative memory, whereas native-speaker
environments such as immersion contexts may be particularly beneficial for
proceduralization and thus for eventual automatization of the L2.
Other factors may also modulate the relative dependence on the two mem-
ory systems. For example, females should tend to depend more on declarative
than procedural memory for grammar and other linguistic functions that can
rely on both systems, as compared to males. Thus, females should, on average,
be more successful at L2 learning than males, since they should show advan-
tages both at word learning (which depends heavily on declarative memory)
and grammar learning (which is particularly reliant on declarative memory
in L2). Analogously, sleep and exercise may be beneficial for L2 learning, given
their positive effects on declarative memory.
Summary of Predictions
Here I summarize some of the main non-neurobiological predictions, focusing
on similarities and differences between L2 and L1. First of all, in some ways, the
predictions are similar in first and second language. In both L1 and L2, declar-
ative memory should critically underlie the learning, storage, and use of all
idiosyncratic knowledge in language. Such knowledge should always depend
importantly on this system, across linguistic subdomains (e.g., for simple words
and their meanings, irregular morphology, syntactic complements, idioms).
Moreover, in both L1 and L2, aspects of grammar as well as other functions
that can rely on either system (e.g., learning speech-sound categories) should
initially be learned in declarative memory, while in parallel procedural mem-
ory should gradually learn the same or analogous knowledge. After substan-
tial experience with the language, procedural memory-based processing may
take precedence over analogous declarative knowledge, resulting in increased
automatized processing.
However, L2 acquisition and processing are also expected to differ in
important ways from L1 acquisition and processing. (Throughout this chap-
ter and in our research more generally, I use learning and acquisition inter-
changeably, consistent with the use of these terms in cognitive neuroscience.)
Perhaps most importantly, grammar, as well as other linguistic functions that
can rely on either system, should tend to be less well learned in procedural
memory, with less automatization, and should depend more on declarative
memory, in L2 than in L1, for several reasons. Here I discuss this with a focus
on grammar.
First, L2 learners will generally have had less language exposure than L1
learners at the same age, simply because they began learning the L2 later. The
later the L2 age of acquisition, the more pronounced this difference. Since
declarative memory learns quite rapidly, while procedural memory learns
only gradually, at any given age a learner’s L2 grammar should typically be
less proceduralized (and thus less automatized) and should therefore depend
more on declarative memory than their L1 grammar. Thus, just for this rea-
son alone, L2 grammar should tend to rely more on declarative memory than
L1 grammar.
Second, because learning abilities in procedural memory seem to be estab-
lished early and then decline, while declarative memory shows the opposite
pattern, L1 learners and early L2 learners should tend to rely particularly on
procedural memory for learning grammar, which should eventually lead to
automatized grammatical processing. In contrast, later L2 learners should rely
more on declarative memory, and indeed may never proceduralize their gram-
mar to the same extent as L1 or early L2 learners. Since declarative memory is
not well suited for learning certain functions (e.g., articulation, long-distance
dependencies), these may be particularly problematic in L2. Importantly, these
patterns should hold even after the same amount of language exposure in L1
and L2. However, most neurocognitive studies do not compare L1 children
with L2 adults (e.g., both after ten years of language exposure). Rather, as
pointed out earlier, most studies comparing the neurocognition of L2 with L1
examine both groups at the same age, and thus at different points in the learn-
ing trajectory. This is not problematic per se, but it must be kept in mind when
interpreting the data.
Third, the type of language experience may influence the learner’s rela-
tive dependence on the two memory systems. As we have seen, explicit,
classroom-like instruction of grammar may encourage learning in declarative
memory, perhaps at the expense of learning in procedural memory. Conversely,
the lack of explicit instruction, and immersion contexts in particular, may
encourage learning in procedural memory. These predictions should hold for
both L1 and L2 learners. However, L1 learning generally occurs primarily in
an immersion (naturalistic) context, further encouraging proceduralization of
the grammar in L1 speakers and eventual automatization. In contrast, since L2
learners vary considerably with respect to the type of their exposure, this fac-
tor should often have an impact on the neurocognition of L2, and should tend
to result in an increased reliance of grammar on declarative memory in L2 as
compared to L1.
Fourth, an intriguing possibility is that lexical learning in declarative mem-
ory may also be a driving factor regarding whether or not grammar relies on
automatized procedures. For example, in many cases automatized operations
learned in procedural memory in the L1 (or even an earlier learned L2) could
be used in an L2, if the grammars are similar (transfer), or if very general
grammatical operations are learned in procedural memory (e.g., Merge, a basic
operation posited by some linguistic theories in which two syntactic elements
are combined to form a new element). However, since automatized skills
learned in this system tend to operate rapidly (see above), grammatical pro-
cessing relying on automatized procedural memory routines may not function
properly unless lexical access is also rapid, since words need to be integrated
into grammatical structures (at least in syntax and morphology) (Ferrill, Love,
Walenski, & Shapiro, 2012). Since word retrieval becomes faster and more
reliable as words are learned better (as a result of more frequent encounters),
at early stages of L2 learning word retrieval may not be fast enough to enable
previously learned proceduralized routines to operate in the L2 (for a related
idea, see Hopp, 2016). This leads to the testable prediction that grammatical
transfer is possible but occurs mainly with well-learned vocabulary. Along the
same lines, to the extent that new grammatical learning in procedural memory
is involved in learning an L2, grammatical processing relying on such newly
learned procedures may be limited when lexical retrieval is not fast enough
to support it. Moreover, slow lexical access may impede procedural learning
itself, since it may slow down evaluation of the predicted item (e.g., if one has
predicted a noun in the oncoming speech stream, but the encountered noun is
then accessed slowly in lexical memory, evaluation of that item as being correct
or not may be too late to provide timely feedback for learning in procedural
memory). In all such cases, grammar may rely largely on whatever knowledge
has been learned in declarative memory, such as chunks or explicit rules, rather
than on procedural memory.
Note that much of the literature on the neurocognition of L2 grammar,
and whether L1-like neurocognitive grammatical processing can be attained,
has focused on two factors: age of acquisition and proficiency. However, proficiency
is a somewhat problematic variable. First, it is operationalized and
measured quite differently across studies. Perhaps more importantly, in the vast
majority of studies proficiency is highly confounded with other variables. In
particular, proficiency is generally confounded with the amount of exposure
and even the type of exposure, since higher proficiency is associated with
higher exposure and in many studies with more immersion experience. As we
have seen, the DP model makes separate predictions for both the amount and
the type of exposure. In contrast, the model does not take a strong position on
proficiency itself to the extent that it may vary independently from these other
variables. For example, it might (or might not) be the case that higher profi-
ciency is associated with greater grammatical proceduralization when holding
constant the amount and type of experience. Future research will hopefully
elucidate this issue.
What Counts as Evidence?
Multiple types of evidence can help test the predictions of the DP model. These
include evidence from different methodologies, different language paradigms
(e.g., natural languages, artificial languages, artificial grammars), different
tasks, and different experimental designs. Of course, every methodology, par-
adigm, task, and design has both strengths and weaknesses. Thus, it is crucial
to obtain evidence from multiple approaches to test for converging evidence. Only
with converging evidence should we begin to have confidence in a theory. In
this section, I discuss several types of relevant evidence. For an overview of
other methodological approaches, such as direct brain recording and stimula-
tion, or transcranial magnetic stimulation, see Ullman (2014).
Note that here I focus on what counts as evidence, not on laying out the cur-
rent evidence in any detail. For the latter, see especially Ullman (2001a, 2005,
2016), Ullman et al. (2020), Hamrick et al. (2018), Morgan-Short and Ullman
(2012), Tagarelli, Shattuck, Turkeltaub, and Ullman (2019), and Tagarelli,
Grey, Turkeltaub, and Ullman (in preparation), as well as the empirical refer-
ences cited below in this section.
Behavioral Evidence: Correlational Studies—Leveraging
Individual Differences
Various types of behavioral evidence can be used to test the predictions of
the DP model (e.g., see Babcock, Stowe, Maloof, Brovetto, & Ullman, 2012;
Ullman, 2004, 2016). However, one of the most straightforward behavioral
approaches is examining predicted correlations across participants, between
how well they learn in the memory systems and how well they learn or
process language. For example, one can test the predicted reliance of lexical
abilities on declarative memory by examining whether people who are better
at learning in declarative memory are also better at word learning. Thus,
the correlational approach leverages individual differences in both language
abilities and learning abilities in the memory systems to test for relations
between them.
The correlational approach may be a particularly valid test of the predictions
of the DP model, since declarative and procedural memory are operationalized
as the learning and memory abilities that depend on particular neural circuitry:
the medial temporal lobe and associated structures for declarative memory, and
the basal ganglia and associated structures for procedural memory (see above).
Thus, testing associations between language abilities and such learning abili-
ties (i.e., learning performance in tasks that have been independently linked
to one or the other memory system) can directly test the model’s predictions.
In contrast, since the neural substrates of both systems likely also underlie
(non-learning) functions not related to either system, linking language solely
to the systems’ neural correlates may not reliably implicate involvement of
either declarative or procedural memory. For example, implication of the basal
ganglia in grammar in lesion or neuroimaging studies may in part be due to
the roles of the basal ganglia in attention and working memory as well as pro-
cedural memory. Thus, overall, a particularly powerful empirical approach for
testing the predictions of the DP model is to examine the predicted pattern of
correlations between measures of particular language abilities and measures of
learning abilities in one or the other system.
However, we have to be careful, because correlation does not imply
causation. A correlation between word learning and declarative memory could
be explained not by words being learned in declarative memory but instead
by some other cognitive process that underlies both word learning and declar-
ative memory. One way to address this problem is to hold such other factors
constant (e.g., in statistical analyses), if one can identify them. Another approach
is to show the specificity of the correlation. If word learning or processing cor-
relates with learning in declarative memory but not with learning in procedural
memory, this suggests that lexical memory has a particular link to declarative
memory that is not found with all learning systems. Moreover, if the converse
also holds (a double dissociation), that is, performance at grammar learning
or processing correlates with learning in procedural but not declarative mem-
ory, this would increase confidence in both language/memory system relations.
Given that the DP model generates multiple specific predictions regarding which
aspects of language depend on which memory systems in which circumstances
(e.g., lexicon or grammar in L1 or L2 and at different stages of learning), it is
quite unlikely that the full pattern of predictions will be explained by other
factors or by chance (Hamrick et al., 2018).
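The two safeguards described above (holding other factors constant, and showing the specificity of a correlation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scores are invented, the function names pearson_r and partial_r are hypothetical, and a first-order partial correlation is only one of several ways to hold a third variable constant statistically.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y with z held constant (first-order partial)."""
    rxy = pearson_r(x, y)
    rxz = pearson_r(x, z)
    ryz = pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented scores for six hypothetical participants.
word_learning = [12, 15, 9, 20, 17, 11]
declarative = [30, 34, 25, 41, 37, 29]   # declarative memory task
procedural = [5, 9, 7, 6, 8, 4]          # procedural memory task
working_memory = [3, 5, 4, 4, 6, 5]      # a possible third factor

# Specificity: compare the link to each memory system.
print("word learning ~ declarative:", pearson_r(word_learning, declarative))
print("word learning ~ procedural: ", pearson_r(word_learning, procedural))

# Holding a third factor constant.
print("word learning ~ declarative | working memory:",
      partial_r(word_learning, declarative, working_memory))
```

A pattern in which the first correlation is reliably larger than the second, and survives partialling out the third factor, would be the kind of specific association discussed in this section. With real data one would of course also test each correlation for statistical significance.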
Hamrick et al. (2018) reported multiple meta-analyses that synthe-
sized findings from 16 L1 and L2 studies of such language/memory system
correlations, with a total of 665 participants. The meta-analyses tied lexical
abilities to learning only in declarative memory, while grammar was linked
to learning in both systems in both child first language and adult second lan-
guage, in specific ways. Of interest here, in second language learners, grammar
was associated with only declarative memory at lower language experience but
with only procedural memory at higher experience. The findings yielded large
effect sizes and held consistently across languages, language families, linguistic
structures, and tasks, underscoring their reliability and validity. The results,
which met the patterns predicted by the DP model for both first and sec-
ond language, provide comprehensive evidence that language is indeed linked
closely to the two learning and memory systems in typical development, both
in children acquiring their native language and in adults learning an additional
language.
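As a rough illustration of how correlations from separate studies can be pooled, the sketch below averages Fisher z-transformed correlations weighted by sample size. This is a generic fixed-effect procedure shown for orientation only. It is not claimed to be the method used by Hamrick et al. (2018), and the correlation values, sample sizes, and function name pooled_correlation are invented.

```python
import math

def pooled_correlation(rs, ns):
    """Pool study correlations via Fisher's z, weighting each study by n - 3.

    rs: correlation coefficients, each strictly between -1 and 1.
    ns: matching sample sizes, each greater than 3.
    """
    if not rs or len(rs) != len(ns):
        raise ValueError("need one sample size per correlation")
    if any(n <= 3 for n in ns):
        raise ValueError("each study needs more than 3 participants")
    weights = [n - 3 for n in ns]          # inverse of the variance of Fisher z
    zs = [math.atanh(r) for r in rs]       # Fisher r-to-z transform
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)                # back-transform to the r scale

# Invented example: three hypothetical studies.
study_rs = [0.35, 0.50, 0.42]
study_ns = [24, 40, 31]
print("pooled r:", pooled_correlation(study_rs, study_ns))
```

Larger studies pull the pooled estimate toward their own correlation. A random-effects model would be the usual alternative when studies differ in populations, languages, or tasks.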
Neurological Evidence: The Lesion Method
If a person suffers from damage (lesions) to particular brain structures, and then
loses the ability to carry out certain cognitive functions, one might infer that
the lost functions previously depended on the damaged structures. Using this
approach to understand which brain structures normally underlie which func-
tions is referred to as the lesion method. For example, the fact that lesions to
the occipital lobes consistently lead to visual deficits indicates that the occipital
lobes are important for vision. By analogy, if you damage your lungs, you will
have trouble breathing, whereas if you damage your stomach, you will probably
have trouble with your digestion. This shows that one cannot perform these
functions without these particular organs, and so these organs are necessary for
these functions.
The lesion method can be used to test the DP model. Patients with lesions
limited to the medial temporal lobes, including the hippocampus, should have
trouble learning not only facts and events in declarative memory, but also words.
This is indeed the case, including, for example, for the famous amnesic patient
H.M. (Davis & Gaskell, 2009; Postle & Corkin, 1998). Additionally, patients
with lesions that extend to temporal neocortex (e.g., from Alzheimer’s disease)
should have more trouble with an already-learned L2 grammar than with an L1
grammar, since the former should depend more on the temporal lobe neocor-
tical regions that are important for already-learned declarative memories. Con-
versely, patients with lesions to frontal/basal ganglia circuits (e.g., from a stroke
or from Parkinson’s disease) should have greater grammatical impairments in
their L1 than their L2. And any such patients who have more than one L2
should show greater grammatical impairments in the L2 to which
they had more exposure. Note that these predictions are striking given that
the L1 (and the higher exposure L2) had presumably been better learned than
the L2 (and the lower exposure L2) and yet are predicted to be more impaired.
Indeed, such double dissociations have been found, supporting the DP model
(Hyltenstam & Stroud, 1989; Johari et al., 2013; Ullman, 2001a, 2005; Zanini,
Tavano, & Fabbro, 2010).
Like all other methods, however, the lesion method has its weaknesses.
Clearly, we cannot go around lesioning people’s hippocampi willy-nilly.
Rather, we must test patients who have already had a brain injury. But such
“accidental experiments” are not ideal. One cannot choose the location of the
lesion, which is moreover often large and involves multiple brain structures,
complicating structure–function inferences: how do you know which brain
structure does what when many structures are damaged? Moreover, since a
given structure may underlie more than one function (e.g., the basal ganglia
subserve working memory in addition to procedural memory), even a lesion
restricted to that structure could have impacts that are difficult to interpret
(e.g., since grammar likely depends on working memory as well as procedural
memory, a lesion to the basal ganglia could cause grammar deficits for more
than one reason; see above). Timing is also an issue. If one waits too long
after the onset of an acute brain lesion, other structures may take over some
of the functions that the damaged structure used to perform. Such compen-
sation confuses one’s inferences, since a lesioned structure may indeed have
been critical for a function, but compensation by a different part of the brain
leads to normal functioning, and thus could lead to the false conclusion that
the lesioned structure is not generally important for the function. On the other
hand, if one tests a patient too quickly after a stroke or head injury, the loss
of function can be much greater than is attributable to the damaged regions,
because nearby regions are often temporarily affected by factors such as tissue
swelling. In practice, researchers tend to err on the side of longer periods of
time, usually waiting months or even a year or more after acute brain damage
before testing a patient.
Electrophysiological Evidence: Event-Related Potentials
Event-Related Potentials (ERPs) are measures of brain activity, specifically the
electrical activity that constitutes the basis of brain function. In an ERP study,
one records EEGs (electroencephalograms)—that is, electrical potentials from
brain activity—from electrodes placed on the scalp. ERPs are simply the EEG
activity that occurs right after a person hears or sees a word, sees a picture,
and so on. The presentation of such a stimulus is called an “event,” hence the
name ERP.
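In practice, the EEG response to a single event is small relative to ongoing background activity, so ERPs are typically obtained by averaging many EEG segments that are time-locked to the same kind of event. The following minimal sketch illustrates that averaging step for a single electrode. It is a toy under stated assumptions: the signal is a plain list of numbers, event positions are given as sample indices, and the function name average_erp is hypothetical. Real pipelines also involve filtering, baseline correction, and artifact rejection.

```python
def average_erp(eeg, event_onsets, epoch_length):
    """Average the EEG segments that follow each event onset.

    eeg: list of voltage samples from one electrode.
    event_onsets: sample indices at which a stimulus (event) was presented.
    epoch_length: number of samples to keep after each onset.
    Events whose segment would run past the end of the recording are skipped.
    """
    if epoch_length <= 0:
        raise ValueError("epoch_length must be positive")
    epochs = [
        eeg[onset:onset + epoch_length]
        for onset in event_onsets
        if onset >= 0 and onset + epoch_length <= len(eeg)
    ]
    if not epochs:
        raise ValueError("no complete epochs available")
    n = len(epochs)
    # Average sample by sample across epochs.
    return [sum(epoch[i] for epoch in epochs) / n for i in range(epoch_length)]

# Invented recording with two events.
recording = [0, 1, 2, 3, 4, 5, 6, 7]
print(average_erp(recording, event_onsets=[0, 4], epoch_length=3))
```

Because activity unrelated to the event varies from trial to trial, it tends to cancel out in the average, whereas activity that is consistently time-locked to the event remains.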
ERPs offer several advantages over some other methodologies. First,
unlike functional neuroimaging methods like functional magnetic resonance
imaging (fMRI) or positron emission tomography (PET) (see later), ERPs
provide excellent temporal resolution, with millisecond measurements that
allow one to examine the actual time course of language processing in the
brain. On the downside, however, localizing the particular neuroanatomical
source(s) of ERPs in the brain is quite difficult. Second, ERP research
has revealed a set of widely studied language-related ERP patterns (called
“ERP components”) in L1, whose characteristics and underlying functions
are reasonably well understood: primarily the N400, Left Anterior Negativ-
ity (LAN) and P600 (see later). Moreover, lexical and grammatical processing
in the L1 are each associated with relatively distinct ERP components. These
components thus provide a reasonably clear way of comparing the neuro-
cognition of language processing between L2 and L1, in particular for these
language domains. Finally, since ERPs can be sensitive to effects that are not
actually observed with behavioral measures, including in language learning
studies, they can potentially reveal L2-L1 differences and similarities that
might not be found with behavioral approaches. For reviews of ERP research
in L2, see Van Hell and Tokowicz (2010), Morgan-Short and Ullman (2012),
Bowden, Steinhauer, Sanz, and Ullman (2013), Morgan-Short (2014), and
Steinhauer (2014).
Here I briefly outline the main ERP language components—the N400,
(L)AN, and P600—and explain how they can be used to test the DP model.
In L1, lexical manipulations, such as seeing or hearing an unexpected content
word (e.g., rugs in Rose-Marie likes to eat rugs) reliably lead to N400 ERP
components: that is, negative (hence the N) potentials that are generally
found about 400 milliseconds after the presentation of the word, mainly at
electrodes on the top of the head (see Plate 2). It has been argued that the
N400 reflects, at least in part, the learning and/or processing of knowledge in
declarative memory (Reichle, Bowden, & Ullman, in preparation; Ullman,
2001a). Since the DP model predicts that lexical memory depends on declar-
ative memory not only in L1 but also in L2, lexical manipulations should
consistently elicit N400s in L2 as well as in L1. Indeed, this is the case; see
reviews above.
Disruptions of rule-governed grammatical processing (of syntactic word
order, or of morphosyntax such as agreement or tense, as in the sentence Yes-
terday my father Fred walk all around Prague) often produce two ERP components
in L1 (a “biphasic” response). First, they can elicit anterior negativities (“ANs”;
Plate 2). These are often predominant in the left hemisphere and hence are fre-
quently referred to as LANs. LANs can begin as early as 100 milliseconds after
the critical word (e.g., walk) and often continue for hundreds of milliseconds.
It has been suggested that LANs may partly reflect procedural memory-based
processes (Ullman, 2001a). (Note that some recent work suggests the possi-
bility that LANs might be due partially, or even largely, to additivity effects
between N400 and P600 components; Tanner & Van Hell, 2014.) Second,
grammatical disruptions typically also produce P600s, positive potentials
(hence the P) that often begin around 600 milliseconds after the presentation
of a word (Plate 2). The P600 seems to reflect largely conscious processing of
syntax and is not posited to depend on procedural memory, though it generally
follows LANs, perhaps due to accompanying processes. As we have seen, the
DP model predicts that grammar depends more on declarative memory in L2
than L1, in particular in low exposure L2, but can be proceduralized to some
extent at high exposure L2. Thus, grammatical disruptions may elicit N400s
in L2, especially at lower levels of L2 experience, but LANs (and P600s) at
higher levels. The evidence thus far seems broadly consistent with this pattern;
see reviews above.
Functional Neuroimaging Evidence
Functional neuroimaging methods such as PET and the more common fMRI
have also been widely used to examine the neural bases of both L1 and L2.
These techniques detect changes in blood flow (PET) or blood oxygen levels
(fMRI) that are known to correlate with changes in neuronal activity. For
example, if during grammar processing neurons fire particularly in the oper-
cular part of the inferior frontal gyrus (Figure 7.1), this region should show
particular changes in oxygen levels (since firing neurons need more oxygen),
which should lead to changes in fMRI activation. The primary benefit of
functional neuroimaging techniques is their excellent spatial resolution,
allowing one to pinpoint activity to within a few millimeters in the brain. In
contrast, such changes in the blood are too slow for the detection of real-time
processing changes, so (unlike with ERPs) one cannot use functional neuro-
imaging to measure real-time language processing in the brain. For a sum-
mary of fMRI and other neuroimaging methods, including their pros and
cons, see Ullman (2014).
Functional neuroimaging can be used to test the DP model. The model
predicts that in both L1 and L2, learning new words should show activation in
the MTL, perhaps in particular the hippocampus. Over time, once words are
reasonably well learned, such activation should decrease and activation in neo-
cortical regions, especially in the temporal lobe and BA 45/47, should become
predominant. Grammar learning may also often yield MTL activation in both
L1 and L2 (though in L1 this should be found primarily in children because
they may still be learning the grammar). MTL activation for grammar learning
should occur particularly in contexts that encourage learning in declarative
memory, such as explicit instruction. As such grammar learning proceeds,
activation should decrease in the MTL. Crucially, grammar learning should
also activate procedural memory structures, in particular the anterior caudate
nucleus and anterior putamen during early stages of learning, whereas over the
course of learning activation in BA 6/BA 44 should become predominant. In L1
in adults, declarative memory structures should generally not still be engaged
in grammar, and the basal ganglia may no longer be reliably activated, leaving
mainly BA 6/BA 44 activation (in addition to brain structures not involved in
the two memory systems). A similar pattern may be found in L2 at high expo-
sures, though the basal ganglia may continue to be active (since learning in pro-
cedural memory should continue for an extended period in L2). The specificity
of these predictions allows the DP model to be tested and potentially falsified
(i.e., shown to be incorrect). However, as with all other techniques, functional
neuroimaging has its weaknesses. For example, as discussed above, activation
in a particular structure (e.g., the basal ganglia) during grammar learning may
be due to other basal ganglia functions (e.g., working memory) instead of or in
addition to procedural memory, complicating interpretation.
Thus far, the neuroimaging evidence from fMRI and PET has been
somewhat inconsistent. However, two recent neuroanatomical meta-analysis
papers have helped clarify the picture. One reported several neuroanatom-
ical meta-analyses that synthesized the functional neuroimaging literature
for different aspects of experimentally controlled adult language learning,
in particular for word learning and grammar learning (for both natural and
artificial grammar learning paradigms) (Tagarelli, Shattuck, et al., 2019).
The meta-analyses revealed that, across studies, word learning was associ-
ated with reliable activation of ventral temporal lobe areas that are adjacent
to the MTL and are linked to declarative memory, while grammar learning
was associated with activation of the anterior caudate nucleus and putamen,
consistent with their role in early stages of procedural learning. Moreover,
grammar learning predicted by the DP model to rely especially on declar-
ative memory (e.g., with explicit/instructed training) showed hippocam-
pal involvement, whereas grammar learning predicted to rely particularly
on procedural memory (e.g., with implicit/uninstructed training) showed
anterior caudate/putamen involvement. Both word and grammar learning
yielded activation in both BA 45 and BA 44. This is consistent with both
lexical and grammatical abilities relying on declarative memory (BA 45),
while grammar relies on procedural memory (BA 44); the involvement of
BA 44 in lexical processing may be due to articulatory processes (Tagarelli
et al., 2019).
The other neuroanatomical meta-analysis paper focused on adult L2 and L1
processing, that is, the processing of L2 and L1 that were learned in real-world
contexts (Tagarelli et al., in preparation). (Note that at the time of writing
this chapter, this paper is still in preparation, and thus the results presented
here should be interpreted with caution.) The basal ganglia, in particular the
anterior caudate nucleus, yielded reliable activation across meta-analyses only
for L2 grammatical processing, not for L1 grammatical processing or L2 or L1
lexical processing. This supports the prediction that procedural memory plays
an important role for grammatical but not lexical abilities, and that in adults
grammar learning is still taking place in L2 but not in L1. In contrast, only L2
lexicon yielded activation in ventral temporal lobe areas adjacent to the MTL
and linked to declarative memory. Hippocampal activation was not observed,
perhaps in part because in the experiments included in these meta-analyses,
the learners had already had a fair amount of L2 exposure. As in the language
learning meta-analyses, both lexical and grammatical processing in L2 (but
not in L1) yielded activation in both BA 45 and BA 44, again implicating both
systems in both language abilities.
Overall, the findings from these functional neuroanatomical meta-analyses
link the adult learning and L2 processing of words mainly to declarative
memory, but of grammar primarily to procedural memory—though also to
declarative memory, in particular when it was predicted by the DP model to
rely especially on this system (e.g., with explicit training). However, further
neuroimaging studies are clearly needed, ideally with better controls for fac-
tors such as age of acquisition, and the timing, amount, and type of language
experience.
Common Misunderstandings
Here I address two common misunderstandings about the DP model and also
discuss how this model differs from other neurocognitive perspectives of L2.
I will discuss the various misunderstandings regarding the relation between
the declarative and procedural memory systems on the one hand, and explicit
and implicit knowledge on the other, in the section “The Explicit/Implicit
Debate.”
First, there is a common misunderstanding regarding the domain generality
of the two memory systems. Both systems are indeed “domain general” in that
they underlie multiple cognitive domains. However, this does not preclude
sub-specialization for language within either
system, which could come about either evolutionarily or during learning
and development. Indeed, evidence from other domains suggests that sub-
specialization can occur in both systems. For example, different portions
of the MTL and different regions of temporal neocortex underlie different
types of information (Ullman, 2016). Likewise, different frontal/basal gan-
glia circuits subserve different sorts of information (Middleton & Strick,
2000). Nevertheless, at this time there is no convincing evidence for
domain-specific circuitry for language (i.e., circuitry dedicated exclusively to
language), either within structures involved in the two memory systems or
elsewhere in the brain (Ullman, Lum, & Conti-Ramsden, 2014). Future
research may further clarify this issue.
Second, a common misconception is that the shift in grammar’s reliance
from declarative to procedural memory is due to some sort of
“transformation” of knowledge from one system to the other. This is not the
case. Rather, the two systems seem to acquire knowledge largely independently
from each other (apart from interactions such as competition between the sys-
tems, as described above). Indeed, amnesic patients such as H.M. can acquire
skills in procedural memory without ever having learned such skills in declarative
memory (such dissociations were in fact the basis of the discovery of multiple
memory systems). Thus, the proceduralization of grammar does not constitute
the “transformation” of declarative into procedural representations but rather
the gradual acquisition of grammatical knowledge in procedural memory,
which is increasingly relied on, with an accompanying decrease in reliance on
any analogous grammatical knowledge that was learned in declarative memory.
Finally, to clarify any potential misconceptions regarding differences
between the DP model and other neurocognitive models of L2, here I will
compare the models. The DP model lies within one of three broad classes of
neurocognitive models of L2. One class of models posits that the neurocogni-
tive mechanisms underlying L2 are essentially the same as those subserving L1
(Abutalebi, 2008; Ellis, 2005; Green, 2003; Hernandez, Li, & MacWhinney,
2005; Indefrey, 2006; MacWhinney, 2011). Second, it has been suggested
that the mechanisms underlying L2 are fundamentally different from those
underlying L1 (Bley-Vroman, 1989). A third group of models hypothesizes that
L2 learners initially depend heavily on different substrates than L1, but, with
increasing experience or proficiency, gradually rely more on L1-like neuro-
cognitive mechanisms. This group of theories includes the views espoused by
Paradis and Clahsen, as well as the DP model. Although these views are similar
in certain respects, they also differ.
Here I summarize the perspectives taken by Paradis and Clahsen, in the
context of the DP model. Paradis (2004, 2009) suggests that a shift between
neurocognitive systems can take place not only for rule-governed grammatical
processes but also for at least some lexical properties, specifically, for grammat-
ical properties of lexical items that are generally implicit in L1. More generally,
Paradis takes a traditional view equating the explicit/implicit distinction with
the declarative/procedural memory distinction, a view that is not tenable given
what we know about the memory systems (see details on the memory systems
above, and the later section “The Explicit/Implicit Debate”). Clahsen pro-
poses a model that is quite similar to the DP model in many respects, though
with less of an expectation that the processing of grammar can become L1-like
(Clahsen & Felser, 2006a, 2006b; Clahsen, Felser, Neubauer, Sato, & Silva,
2010). Additionally, Clahsen’s model focuses on psycholinguistic processing
claims, rather than the neurocognitive bases of language.
An Exemplary Study: Morgan-Short,
Steinhauer, Sanz, and Ullman (2012)
A major limitation of L2 research that examines the L2 learning trajectory is
the time it takes learners to reach high proficiency. This makes it impractical
to examine participants over the full course of language learning (i.e., in a
longitudinal study). As a result, the trajectory of learning has almost always
been examined between different groups of learners at different proficiency or
exposure levels. However, as with any between-subjects design, this approach
is not ideal, since the difficulty in selecting and matching different subject
groups on critical factors of L2 exposure and use, let alone on other factors that
may influence language learning (e.g., genetic makeup), can introduce noise
and inconsistency, reducing confidence in the findings.
To address these weaknesses, some studies have turned to artificial gram-
mars or artificial languages. An artificial grammar typically involves pre-
senting participants with letter or tone sequences that are generated by some
grammar. Artificial grammars can be learned to high proficiency quickly,
in minutes to hours. However, even though the rules of artificial gram-
mars can be consistent with the rules of natural languages, they are not
fully language-like, since they lack vocabulary and the sequences have no
meanings. Additionally, unlike a natural language, one does not speak or
comprehend artificial grammars. Artificial languages address some of these
concerns (though note that recent evidence suggests overlapping neural sub-
strates between artificial grammars and more naturalistic paradigms, includ-
ing artificial languages, supporting the validity of artificial grammars for
probing grammar learning; Tagarelli et al., 2019). An artificial language
contains a small, meaningful lexicon and a limited number of grammatical
rules that are generally consistent with those found in natural languages. The
sentences have meanings, and the language can be spoken and understood.
Crucially, their small size makes them learnable to high proficiency within
hours to days, allowing one to longitudinally examine the L2 learning tra-
jectory to high proficiency.
A noteworthy ERP and behavioral study using an artificial language para-
digm examined L2 learning longitudinally to high proficiency (Morgan-Short,
Steinhauer, et al., 2012). Monolingual native English-speaking adults were
trained to speak and comprehend an artificial language, Brocanto2. The
words in this language refer to the pieces and moves of a computer-based
game, and the rules follow those of natural languages. Half the participants in
this study were given “explicit,” instructed, classroom-like training, and half
were given an equivalent amount of “implicit,” uninstructed, immersion-like
training. ERPs on violations of syntactic word order and morpho-syntactic
gender agreement were measured three times: at low proficiency, at high
proficiency, and again about five months later to test the neurocognition of
L2 retention (retention testing is reported in Morgan-Short, Finger, Grey, &
Ullman, 2012).
Behavioral analyses showed that both the explicitly and implicitly trained
groups learned the language to high proficiency and then retained it five
months later, and did not differ from each other at any of these time points.
In contrast, ERPs showed clear group differences (here I discuss word order
violations; for agreement violations, see Morgan-Short, Sanz, Steinhauer, &
Ullman, 2010). At low proficiency and exposure, the implicitly trained group
showed an N400, whereas the explicitly trained group showed no detect-
able ERP effects (Plate 2). At high proficiency and exposure, the implicitly
trained group showed an AN-P600 biphasic pattern (although the AN was
not significantly left-lateralized), with the AN continuing as a late anterior
negativity. In contrast, the explicitly trained group showed only an anterior
positivity (not typical of native language syntactic processing) followed by
a P600. At retention testing five months later, the implicitly trained group
showed a more robust and left-lateralized AN than at high proficiency, the
explicitly trained group no longer showed the (non-L1-like) anterior positiv-
ity and developed a more robust P600, and both groups showed a stronger
late anterior negativity.
In sum, L1-like processing of syntactic word order was more likely for
implicit, uninstructed, immersion-like training than for explicit, instructed,
classroom-like training, and was more likely at retention testing than at high
proficiency/exposure, and more likely at high than at low proficiency/exposure. Specifically, N400s
were only found at low proficiency, suggesting a reliance of syntax on declar-
ative memory early in the learning trajectory. The fact that no N400 or any
other ERP component was reliably found in the explicitly-trained partici-
pants at low proficiency may be due to greater temporal variability (i.e., in
when the component occurs) for explicit, conscious strategies, resulting in
the lack of any consistent ERP components. At high proficiency, more
native-like grammatical ERP components were found, including an AN. This
is consistent with proceduralization and more generally with greater L1-like
grammatical processing emerging with greater exposure and proficiency. The
finding that both training groups showed more native-like syntactic process-
ing at retention testing may have been due in part to continuing consolida-
tion of the grammar in procedural memory (Morgan-Short, Finger, et al.,
2012). Finally, the greater native-like processing resulting from implicit than
explicit training is consistent with evidence for greater procedural learning
(activation in the anterior caudate/putamen) for grammar in implicit than
explicit learning contexts in the functional neuroimaging literature (see ear-
lier), and more generally with immersion leading to more native-like pro-
cessing and proceduralization than explicit instructed classroom training
(Bowden et al., 2013).
This study is exemplary in several respects. First, the use of an artificial lan-
guage allows one to control for the amount, type, and timing of L2 exposure.
Second, the fact that an artificial language rather than an artificial grammar was
examined, and moreover one that participants learned to speak and compre-
hend, and that followed the rules of natural languages, suggests that the results
are reasonably likely to generalize to natural languages, which is of course what
we actually care about understanding. Third, the measurement of ERPs as well
as behavioral assessments provides a variety of advantages (see earlier), includ-
ing revealing ERP differences that were not found in behavior. Fourth, exam-
ining and contrasting explicit, instructed, classroom-like training and implicit,
uninstructed, immersion-like training, moreover in a tightly controlled design,
elucidates the neurocognitive effects of the type as well as the amount of input.
Fifth, examining retention, moreover after quite an extended period, provides
insights into longer-term outcomes of language learning. Since people usually
learn an L2 in order to retain it (at least for a reasonable period), this is partic-
ularly important.
Explanation of Observed Findings in SLA
Observation 1: Exposure to input is necessary for L2 acquisition; Observation 2: A good
deal of L2 acquisition happens incidentally. Not only is exposure to input neces-
sary for learning an L2, but, as discussed earlier, the amount and even the type
of input are important. Specifically, both greater exposure (correlating with
higher proficiency) and implicit, immersion-like experience (which presum-
ably is associated with incidental learning) seem to facilitate proceduralization
of the grammar and the attainment of L1-like neurocognitive grammatical
processing.
Observation 5: Second language learning is variable in its outcome; Observation 6:
Second language learning is variable across linguistic subsystems. According to the DP
model, both behavioral and neural correlates of L2 learning should vary on the
basis of multiple factors, including biological variables (e.g., sex and genetic
variability), input variables (e.g., amount and type of L2 exposure), and linguis-
tic subsystems (e.g., lexicon vs. grammar). Moreover, a number of these vari-
ables likely interact (Babcock et al., 2012). Some of these factors have already
been reasonably well examined (in particular, lexicon vs. grammar, and input
variables), and indeed the evidence suggests that they influence L2 acquisition.
A host of other variables should be examined in future studies.
Observation 9: There are limits on the effects of instruction on L2 acquisition. As
we have seen, implicit, uninstructed, immersion-like L2 training appears to be
more effective than explicit, instructed, classroom-like training for the proce-
duralization of grammar and the attainment of L1-like grammatical processing.
The Explicit/Implicit Debate
At first blush, the distinction between the declarative and procedural mem-
ory brain systems seems to parallel that between explicit and implicit knowl-
edge. Indeed, explicit knowledge is subserved only by declarative memory,
while procedural memory underlies implicit knowledge. However, the parallel
largely falls apart at this point.
First, the DP model is largely based on claims from cognitive neurosci-
ence about brain systems, whereas the explicit/implicit distinction is premised
on claims about awareness. This latter distinction is somewhat problematic in
that awareness is difficult not only to define but also to test (DeKeyser, 2003;
Schmidt, 1994). In contrast, the distinction between declarative and procedural
memory, as operationalized by the DP model and often in cognitive neurosci-
ence, is relatively clear, and the dichotomy can be tested, as we have seen, with
a variety of methodological approaches.
Second, the mapping between declarative/procedural memory, on the one
hand, and explicit/implicit knowledge, on the other, is by no means isomorphic
(one-to-one). Information stored in declarative memory can indeed be explicit
(accessible to conscious awareness in some sense). In fact, as we have seen, this
brain system appears to be the only long-term memory system to underlie
explicit knowledge—a finding that is useful since it allows us to identify declar-
ative memory as the locus of any long-term explicit knowledge. However, this
system also underlies implicit knowledge. Although declarative memory was
historically associated only with explicit knowledge, this was always a highly
problematic assumption (even though this problem was rarely discussed). It was
never shown (how would one do so?) that this brain system does not underlie
implicit knowledge. Indeed, work in non-human animals such as rats and
monkeys on this brain system did not assume that learning involved explicit
knowledge, since testing animals’ conscious awareness of what they have learned
would clearly be very difficult. And of course, it is also highly unwarranted
to simply define a biological entity such as a brain system as having particular
behavioral characteristics, in this case that it only underlies explicit knowledge.
Rather, this is an empirical question. Thus, not only was it always the case that
the assumption that declarative memory underlies only explicit knowledge was
unwarranted, but evidence now indicates that this assumption was not correct,
and that declarative memory also underlies implicit knowledge (Henke, 2010;
Ullman, 2016). In sum, although declarative memory appears to be the only
long-term memory system in the brain that underlies explicit knowledge, it also
underlies implicit knowledge.
There are also often confusions with respect to procedural memory. Cog-
nitive neuroscientists often define procedural memory as it is operationalized
here, that is, as the (implicit) learning and memory abilities that are rooted in
the basal ganglia and associated structures. Importantly, on this view, proce-
dural memory is only one of several brain systems that underlie implicit knowl-
edge, including not just declarative memory but also other systems (e.g., those
underlying priming and habituation) (Eichenbaum, 2012; Squire & Dede,
2015). Nevertheless, the terms procedural memory and implicit memory are
still often used interchangeably in some fields, which can result in substantial
confusion. To clarify: although procedural memory appears to only underlie
implicit knowledge, several other brain systems, including the declarative mem-
ory system, also underlie implicit knowledge.
I have been discussing problems pertaining to explicit/implicit knowledge.
However, the explicit/implicit distinction in other respects is at least as prob-
lematic. First, one may hear of a distinction between explicit and implicit
learning. This distinction usually refers to whether knowledge is explicit or
implicit, but during the learning period rather than subsequent to learning
(i.e., the product). The terms explicit and implicit are also used with respect
to the input (e.g., see the exemplary study in this chapter). However, this ter-
minology is perhaps even more problematic, since it also causes confusion as
to where the explicit knowledge is supposed to lie, that is, with the instructor
(experimenter) or the learner (participant). If the knowledge lies with the
teacher/experimenter, then it is uninteresting with respect to learning; if
the knowledge lies with the learner/participant, then again, the same issues
described earlier apply. Clearer terms, such as instructed and uninstructed
learning, may be more useful. In the exemplary study, we used the terms
explicit and implicit training to be consistent with the use of these terms in
the existing literature (including in Morgan-Short, Steinhauer, et al., 2012),
though we have attempted to further clarify them by specifying that the dis-
tinction can also be described as instructed/uninstructed and classroom-like/
immersion-like.
Conclusion
The DP model is a powerful theoretical framework. First, it is motivated by
basic principles of evolution and biology. Second, it generates a wide range of
behavioral and neurocognitive predictions, for both L1 and L2, many of which
would be unwarranted by the study of language alone. Third, it is testable by
multiple methods. Fourth, converging evidence from different methods and
experimental paradigms supports many of the key predictions of the model, for
both first and second language.
Finally, the DP model has important applied implications. Studies have
shown that particular manipulations (interventions) lead to better learning and
retention in declarative or procedural memory, such as spaced (versus massed)
presentation, retrieval practice (the testing effect), and exercise (Cepeda,
Pashler, Vul, Wixted, & Rohrer, 2006; Roediger & Karpicke, 2006; Stern &
Alberini, 2012). The DP model predicts that to the extent to which these tech-
niques enhance learning and retention in declarative or procedural memory,
they should also enhance language learning and retention, including in L2
acquisition, in specific ways, depending on which aspect of language depends
on which memory system in which circumstance. A recent review paper sug-
gests that L2 acquisition can indeed be enhanced as predicted by the DP model,
in particular by spaced presentation and retrieval practice (Ullman & Lovelett,
2018). The DP model may therefore prove useful for improving pedagogy and
thus has implications beyond basic research.
Discussion Questions
1. A major tenet of the declarative/procedural model is that the learning and
memory systems used in non-language learning are co-opted for learn-
ing language. Does this perspective suggest that language learning is like
learning anything else? Is it possible for language to make use of human
memory systems and yet be “special” in the way that, say, generative
approaches to L2 acquisition (White, this volume) suggest language is
special?
2. Both skill theory (DeKeyser, this volume) and the DP model make distinc-
tions between declarative and procedural memory. What differences do
you see between the two theoretical approaches regarding these memory
constructs?
3. One of the predictions of the DP model is that learning grammar in
procedural memory becomes more difficult as we get older during child-
hood and adolescence. Compare and contrast this perspective with what is
known as the Critical Period Hypothesis, which basically states that adults
cannot make use of the same device(s) for language acquisition as children
learning a first language.
4. An interesting finding is that immersion-like L2 experience seems to result
in more L1-like (i.e., native-like) neurocognition than instructional L2
experience. What do you make of this finding in the context of the DP
model?
5. Explain, in your own words, why one cannot equate declarative memory
with explicit knowledge/learning and procedural memory with implicit
knowledge/learning.
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Suggested Further Reading
Ashby, F. G., Turner, B. O., & Horvitz, J. C. (2010). Cortical and basal ganglia contri-
butions to habit learning and automaticity. Trends in Cognitive Sciences, 14, 208–215.
Doyon, J., Bellec, P., Amsel, R., Penhune, V. B., Monchi, O., Carrier, J., & Benali, H.
(2009). Contributions of the basal ganglia and functionally related brain structures
to motor learning. Behavioural Brain Research, 199, 61–75.
Overviews of aspects of procedural memory.
Hamrick, P., Lum, J. A. G., & Ullman, M. T. (2018). Child first language and adult
second language are both tied to general-purpose learning systems. Proceedings of
the National Academy of Sciences of the United States of America, 115, 1487–1492.
doi.org/10.1073/pnas.1713975115
A meta-analysis of correlational studies testing the DP model in first and second
language.
Squire, L. R., & Wixted, J. T. (2011). The cognitive neuroscience of human memory
since H.M. Annual Review of Neuroscience, 34, 259–288.
Squire, L. R., & Dede, A. J. O. (2015). Conscious and unconscious memory systems.
Cold Spring Harbor Perspectives in Biology, 7(3), 1–14.
Overviews of learning and memory systems in the brain, with somewhat of a
focus on declarative memory.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/
procedural model. Cognition, 92, 231–270.
This paper gives an in-depth overview of the DP model and relevant evidence.
Ullman, M. T. (2016). How does language depend on general-purpose long-term
memory systems in the brain? In G. Hickok & S. A. Small (Eds.), The neurobiology of
language (pp. 953–968). New York, NY: Elsevier.
A recent overview of the DP model, its predictions, and relevant evidence.
Ullman, M. T., Earle, F. S., Walenski, M., & Janacsek, K. (2020). The neurocognition
of developmental disorders of language. Annual Review of Psychology, 71, 389–417.
A new overview of the DP model, including updates on the two memory systems
and evidence of their involvement in grammar, lexicon, and other aspects of language
(e.g., speech-sound representations, articulation, speech production, speech percep-
tion), in particular (but not only) in neurodevelopmental disorders of language.
References
Abutalebi, J. (2008). Neural aspects of second language representation and language
control. Acta Psychologica, 128, 466–478.
Ashby, F. G., Turner, B. O., & Horvitz, J. C. (2010). Cortical and basal ganglia contri-
butions to habit learning and automaticity. Trends in Cognitive Sciences, 14, 208–215.
Babcock, L., Stowe, J. C., Maloof, C. J., Brovetto, C., & Ullman, M. T. (2012).
The storage and composition of inflected forms in adult-learned second lan-
guage: A study of the influence of length of residence, age of arrival, sex, and other
factors. Bilingualism: Language and Cognition, 15, 820–840.
Bley-Vroman, R. (1989). What is the logical problem of foreign language learning?
In S. M. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition
(pp. 41–68). Cambridge, England: Cambridge University Press.
Bostan, A. C., & Strick, P. L. (2018). The basal ganglia and the cerebellum: Nodes in
an integrated network. Nature Reviews Neuroscience, 19, 338–350.
Bowden, H. W., Steinhauer, K., Sanz, C., & Ullman, M. T. (2013). Native-like brain
processing of syntax can be attained by university foreign language learners. Neuro-
psychologia, 51, 2492–2511.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed
practice in verbal recall tasks: A review and quantitative synthesis. Psychological
Bulletin, 132, 354–380.
Clahsen, H., & Felser, C. (2006a). Grammatical processing in language learners. Applied
Psycholinguistics, 27, 3–42.
Clahsen, H., & Felser, C. (2006b). How native-like is non-native language processing?
Trends in Cognitive Sciences, 10, 564–570.
Clahsen, H., Felser, C., Neubauer, K., Sato, M., & Silva, R. (2010). Morphological
structure in native and nonnative language processing. Language Learning, 60, 21–43.
Davachi, L. (2006). Item, context and relational episodic encoding in humans. Current
Opinion in Neurobiology, 16, 693–700.
Davis, M. H., & Gaskell, M. G. (2009). A complementary systems account of word
learning: Neural and behavioural evidence. Philosophical Transactions of the Royal
Society of London, 364, 3773–3800.
DeKeyser, R. M. (2003). Implicit and explicit learning. In C. J. Doughty & M. H. Long
(Eds.), The handbook of second language acquisition (pp. 313–348). Malden, MA: Blackwell.
Doyon, J., Bellec, P., Amsel, R., Penhune, V. B., Monchi, O., Carrier, J., & Benali, H.
(2009). Contributions of the basal ganglia and functionally related brain structures
to motor learning. Behavioural Brain Research, 199, 61–75.
Eichenbaum, H. (2012). The cognitive neuroscience of memory: An introduction (2nd ed.).
Oxford, England: Oxford University Press.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal lobe
and recognition memory. Annual Review of Neuroscience, 30, 123–152.
Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit lan-
guage knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ferrill, M., Love, T., Walenski, M., & Shapiro, L. P. (2012). The time-course of lexical
activation during sentence comprehension in people with aphasia. American Journal
of Speech-Language Pathology, 21(2), S179–S189.
Green, D. W. (2003). Neural basis of the lexicon and the grammar in L2 acquisition.
In R. V. Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The interface between syntax
and the lexicon in second language acquisition (pp. 197–208). Amsterdam, Netherlands:
John Benjamins.
Hamrick, P., Lum, J. A. G., & Ullman, M. T. (2018). Child first language and adult
second language are both tied to general-purpose learning systems. Proceedings of
the National Academy of Sciences of the United States of America, 115, 1487–1492.
doi.org/10.1073/pnas.1713975115
Henke, K. (2010). A model for memory systems based on processing modes rather than
consciousness. Nature Reviews Neuroscience, 11, 523–532.
Hernandez, A., Li, P., & MacWhinney, B. (2005). The emergence of competing mod-
ules in bilingualism. Trends in Cognitive Sciences, 9, 220–225.
Hopp, H. (2016). The timing of lexical and syntactic processes in second language sen-
tence comprehension. Applied Psycholinguistics, 37, 1253–1280.
Hyltenstam, K., & Stroud, C. (1989). Bilingualism in Alzheimer’s dementia: Two case
studies. In K. Hyltenstam & L. Obler (Eds.), Bilingualism across the lifespan: Aspects of
acquisition, maturity and loss (pp. 202–226). Cambridge, England: Cambridge Uni-
versity Press.
Indefrey, P. (2006). A meta-analysis of hemodynamic studies on first and second lan-
guage processing: Which suggested differences can we trust and what do they mean?
Language Learning, 56 (Suppl. 1), 279–304.
Johari, K., Ashrafi, F., Zali, A., Ashayeri, H., Fabbro, F., & Zanini, S. (2013). Gram-
matical deficits in bilingual Azari–Farsi patients with Parkinson’s disease. Journal of
Neurolinguistics, 26, 22–30.
MacWhinney, B. (2011). The logic of the unified model. In S. M. Gass & A. Mackey
(Eds.), The Routledge handbook of second language acquisition (pp. 211–227). New York,
NY: Routledge.
Middleton, F. A., & Strick, P. L. (2000). Basal ganglia output and cognition: Evi-
dence from anatomical, behavioral, and clinical studies. Brain and Cognition, 42,
183–200.
Morgan-Short, K. (2014). Electrophysiological approaches to understanding second
language acquisition: A field reaching its potential. Annual Review of Applied Lin-
guistics, 34, 15–36.
Morgan-Short, K., Finger, I., Grey, S., & Ullman, M. T. (2012). Second language pro-
cessing shows increased native-like neural responses after months of no exposure.
PLoS ONE, 7, e32974.
Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. T. (2010). Second language
acquisition of gender agreement in explicit and implicit training conditions: An
event-related potential study. Language Learning, 60, 154–193.
Morgan-Short, K., Steinhauer, K., Sanz, C., & Ullman, M. T. (2012). Explicit and
implicit second language training differentially affect the achievement of native-like
brain activation patterns. Journal of Cognitive Neuroscience, 24, 933–947.
Morgan-Short, K., & Ullman, M. T. (2012). The neurocognition of second language.
In S. M. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition
(pp. 282–299). New York, NY: Routledge.
Packard, M. G. (2008). Neurobiology of procedural learning in animals. In J. H.
Byrne (Ed.), Concise learning and memory: The editor’s selection (pp. 341–356). London,
England: Elsevier Science and Technology.
Paradis, M. (2004). A neurolinguistic theory of bilingualism. Amsterdam, Netherlands: John
Benjamins.
Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam,
Netherlands: John Benjamins.
Poldrack, R. A., & Packard, M. G. (2003). Competition among multiple memory sys-
tems: Converging evidence from animal and human brain studies. Neuropsychologia,
41, 245–251.
Postle, B. R., & Corkin, S. (1998). Impaired word-stem completion priming but intact
perceptual identification priming with novel words: Evidence from the amnesic
patient H.M. Neuropsychologia, 15, 421–440.
Reichle, R. V., Bowden, H. W., & Ullman, M. T. (in preparation). The Janus
Hypothesis: The N400 reflects learning as well as processing.
Roediger, H. L., III, & Karpicke, J. D. (2006). Test-enhanced learning: Taking mem-
ory tests improves long-term retention. Psychological Science, 17, 249–255.
Schmidt, R. W. (1994). Implicit learning and the cognitive unconscious: Of artificial
grammars and SLA. In N. C. Ellis (Ed.), Implicit and explicit learning of languages
(pp. 165–209). London, England: Academic Press.
Squire, L. R., & Dede, A. J. O. (2015). Conscious and unconscious memory systems.
Cold Spring Harbor Perspectives in Biology, 7(3), 1–14.
160 Michael T. Ullman
Squire, L. R., & Wixted, J. T. (2011). The cognitive neuroscience of human memory
since H.M. Annual Review of Neuroscience, 34, 259–288.
Steinhauer, K. (2014). Event-related potentials (ERPs) in second language research: A
brief introduction to the technique, a selected review, and an invitation to recon-
sider critical periods in L2. Applied Linguistics, 35, 393–417.
Stern, S. A., & Alberini, C. M. (2012). Mechanisms of memory enhancement. Wiley
Interdisciplinary Reviews: Systems Biology and Medicine, 5, 37–53.
Tagarelli, K. M., Grey, S., Turkeltaub, P. E., & Ullman, M. T. (in preparation). The
brain bases of second language: A comprehensive neuroanatomical meta-analysis.
Tagarelli, K. M., Shattuck, K. F., Turkeltaub, P. E., & Ullman, M. T. (2019). Language
learning in the adult brain: A neuroanatomical meta-analysis of lexical and gram-
matical learning. NeuroImage, 193, 178–200.
Tanner, D., & van Hell, J. G. (2014). ERPs reveal individual differences in morphosyn-
tactic processing. Neuropsychologia, 56, 289–301.
Ullman, M. T. (2001a). The neural basis of lexicon and grammar in first and second
language: The declarative/procedural model. Bilingualism: Language and Cognition,
4, 105–122.
Ullman, M. T. (2001b). A neurocognitive perspective on language: The declarative/
procedural model. Nature Reviews Neuroscience, 2, 717–726.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/
procedural model. Cognition, 92, 231–270.
Ullman, M. T. (2005). A cognitive neuroscience perspective on second language acqui-
sition: The declarative/procedural model. In C. Sanz (Ed.), Mind and context in adult
second language acquisition: Methods, theory and practice (pp. 141–178). Washington, DC:
Georgetown University Press.
Ullman, M. T. (2006). The declarative/procedural model and the shallow-structure
hypothesis. Applied Psycholinguistics, 27, 97–105.
Ullman, M. T. (2007). The biocognition of the mental lexicon. In M. G. Gaskell (Ed.),
The Oxford handbook of psycholinguistics (pp. 267–286). Oxford, England: Oxford
University Press.
Ullman, M. T. (2012). The declarative/procedural model. In P. Robinson (Ed.),
Routledge encyclopedia of second language acquisition (pp. 160–164). New York, NY:
Routledge.
Ullman, M. T. (2014). Language and the brain. In J. Connor-Linton & R. W. Fasold
(Eds.), An introduction to language and linguistics, 2nd ed. (pp. 249–286). Cambridge,
England: Cambridge University Press.
Ullman, M. T. (2016). How does language depend on general-purpose long-term
memory systems in the brain? In G. Hickok & S. A. Small (Eds.), The neurobiology of
language (pp. 953–968). New York, NY: Elsevier.
Ullman, M. T., Earle, F. S., Walenski, M., & Janacsek, J. (2020). The neurocognition
of developmental disorders of language. Annual Review of Psychology, 71, 389–417.
Ullman, M. T., & Lovelett, J. T. (2018). Implications of the declarative/procedural
model for improving second language learning: The role of memory enhancement
techniques. Second Language Research, 34, 39–65.
Ullman, M. T., Lum, J. A. G., & Conti-Ramsden, G. (2014). Domain specificity in
language development. In P. Brooks, V. Kempe, & J. G. Golson (Eds.), Encyclopedia
of language development (pp. 163–166). Los Angeles, CA: Sage.
The Declarative/Procedural Model 161
Van Hell, J. G., & Tokowicz, N. (2010). Event-related brain potentials and second
language learning: Syntactic processing in late L2 learners at different L2 proficiency
levels. Second Language Research, 26, 43–74.
Zanini, S., Tavano, A., & Fabbro, F. (2010). Spontaneous language production in bilin-
gual Parkinson’s disease patients: Evidence of greater phonological, morphological
and syntactic impairments in native language. Brain & Language, 113, 84–89.
8
PROCESSABILITY THEORY¹
Manfred Pienemann and Anke Lenzing
The Theory and Its Constructs
Processability Theory (PT) (e.g., Pienemann, 1998) is a theory of second
language development. The logic underlying PT is as follows: At any stage of
development, the learner can produce and comprehend only those second lan-
guage (L2) linguistic forms which the current state of the language processor
can handle. It is therefore crucial to understand the architecture of the language
processor and the way in which it handles an L2. This enables one to predict
the course of development of L2 linguistic forms in language production and
comprehension across languages.
The architecture of the language processor accounts for language processing
in real time and within human psychological constraints such as word access
(i.e., the rapid retrieval of words from the mental lexicon within milliseconds)
and working memory. The incorporation of the language processor in the
study of L2 acquisition therefore brings to bear a set of human psychological
constraints that are crucial for the processing of languages. The view on lan-
guage production followed in PT is largely that described by Levelt (1989),
which overlaps to some extent with the computational model of Kempen and
Hoenkamp (1987) and Merrill Garrett’s work (e.g., Garrett, 1976, 1980, 1982).
The basic premises of that view are the following:
• Processing components (e.g., modules that grammatically or phonologi-
cally encode the message that the speaker wants to utter) operate largely
automatically and are generally not consciously controlled (i.e., the speaker
does not need to be aware of the grammatical structures he/she produces).
• Processing is incremental (i.e., the speaker can start producing an utterance
without having planned all of it).
• The output of the processor is linear, although it may not be mapped onto
the underlying meaning in a linear way (for instance, the idea produced
first does not need to occur first in natural events, as in, Before I drove off,
I started the engine).
• Grammatical processing has access to a temporary memory store that can
hold grammatical information (e.g., in the sentence The little kid loves ice
cream, the grammatical information singular, third-person present in the little
kid is retained in grammatical memory and it is used when the verb loves is
produced, which is marked for third-person singular) (see Pienemann, 1998,
for details).
The core of PT is formed by a universal processability hierarchy that is based
on Levelt’s (1989) approach to language production. PT is formally modeled
using Lexical-Functional Grammar (LFG) (Bresnan, 2001). PT is a univer-
sal framework that has the capacity to predict developmental trajectories for
any L2. The notion of a developmental trajectory implies a developmental
dimension, also known as staged development, as well as a variational
dimension that accounts for differences between the developmental paths of
individual learners.
For instance, in English as a second language (ESL) question formation, the
following developmental sequence has been found (Pienemann, 1998):
PT Level   Structure                   Example
1          One-constituent question    Here?
2          SVO question                He live here?
3          WH+SVO                      Where he is?
4          Copula inversion            Where is he?
5          Aux-second                  Where has he been?
6          Cancel Aux-second           I wonder where he has been.
This developmental sequence is predicted by PT’s processability hierarchy. In
the PT paradigm, each level of development represents a set of grammatical
rules that shares certain processing procedures. Further details of these rules
and procedures will be discussed later in this chapter. At this point, it is cru-
cial to note that PT differentiates between the observation of developmental
sequences (staged development) in L2 data and the psycholinguistic explanation
of such sequences through a universal system of rules and procedures that is
formally modeled using a theory of grammar.
In the PT paradigm, each interlanguage variety represents a specific vari-
ant of the grammatical rules relating to a given level. This means that the
grammatical rules at each level of development allow for a limited number of
structural options that learners can produce. For example, learners attempting
to produce “Aux-second” at level 4 of the ESL sequence of question formation
(i.e., before they are ready for this structure) have been found to produce the
following interlanguage variants:
A. Where he been?
B. Where has been?
C. Where he has been?
D. He has been where?
What variants A to D have in common is that they get around placing the
auxiliary in second position after an initial wh- word. In other words, they
constitute different solutions to the same learning problem. In the course of L2
development, learners accumulate grammatical rules and their variants, allow-
ing them to develop individual developmental trajectories while adhering to
the overall developmental schedule. In this way, PT accounts for both universal
levels of development and individual variation within levels.
There are two separate problems that are crucial to address in understanding
L2 acquisition. The original version of PT focused solely on what is known as
the developmental problem (i.e., why learners follow universal sequences of
acquisition). The extended version of PT (Pienemann, Di Biase, & Kawaguchi,
2005) and recent developments of the theory (Lenzing, 2013) also begin to
address the so-called logical problem (i.e., how learners come to know what
they know if their knowledge is not represented in the input) (see White, this
volume). The developmental and the logical problems are the key issues that a
theory of (second) language acquisition must address (e.g., Gregg, 1996), and
PT addresses these issues in a modular fashion. One part of the theory deals
with the developmental problem; a separate but connected part deals with
the logical problem. Both parts are based on LFG because LFG is designed to
account for linguistic knowledge in a way that is compatible with the archi-
tecture of the language processor. The developmental problem is addressed by
describing the constraints the language processor places on development, and
the logical problem is addressed using specific components of LFG that are
summarized later in this chapter.
The basic claim of the original version of PT (Pienemann, 1998) is that
language development is constrained by processability, the definition of
which will emerge as the discussion progresses. Processability affects first
language (L1) and L2 development (albeit in different ways) and also con-
strains (i.e., puts limits on) interlanguage variation and L1 transfer. The
extended version of PT adds to this the claim that the initial form of gram-
mar in L2 acquisition is determined by the default relationship between
arguments (i.e., the entities representing who does what to whom) and the
way they are expressed by the grammatical forms of the target language
(i.e., the language being acquired). We turn our attention now to the major
constructs of the theory.
Processability Hierarchy
The processability hierarchy (Pienemann, 1998) is based on the notion
of transfer of grammatical information within and between the phrases of a
sentence. For instance, in the sentence Little Peter goes home, the grammatical
information third-person singular is present in the phrase Little Peter and in
goes. This is commonly referred to as subject-verb agreement. In LFG and in
Levelt’s model of language generation, it is assumed that the language processor
checks that the two parts of the sentence, Little Peter and goes, contain the same
grammatical information. To be able to carry out this matching operation, the
procedures that build phrases in language generation need to have developed
in the L2 processing system. In our example, learners need to have developed a
procedure for building noun phrases such as Little Peter and verb phrases such as
goes home. They also need to have developed a procedure for putting these two
phrases together to form a sentence. In Levelt’s model of language generation,
it is assumed that the grammatical information third-person singular needs to
be stored in the procedures that build the phrases in which this information is
used and that the two sets of information are compared within the procedure
that puts the two phrases together to form a sentence (i.e., the sentence proce-
dure). The learner of a language needs to develop procedures that can handle
the job of storing and comparing grammatical information. The acquisition of
the processing procedures enables the learner to produce grammatically well-
formed utterances. For instance, in the ungrammatical sentence Little Peter go
home, the noun phrase Little Peter is marked for third-person singular but the
verb is not. A competent speaker will detect this mismatch when the noun phrase
and the verb phrase are assembled, and the combination will be rejected before a
sentence is formed. However, if the learner has not yet developed a fully functioning
sentence procedure, or if a specific verb has not yet been annotated for person/
number, the mismatch will not be detected, resulting in the production of an
ungrammatical sentence like the one above.
The same principle applies to grammatical information contained within
phrases. For instance, in the noun phrase two kids, the grammatical informa-
tion plural is contained in the numeral two and in the noun kids. In language
generation, these two bits of information are compared when the noun phrase
is assembled by the noun phrase procedure. In the case of two and kids, the
two bits of grammatical information match. The different types of information
exchange are illustrated in Figure 8.1.
We can now see that in both examples grammatical information has to be
matched between parts of the sentence. In LFG, this process is called feature
unification. In nontechnical language we might describe this process as infor-
mation matching. LFG uses formal means to account for such processes. The
fact that LFG has this capacity is one of the key reasons why PT uses LFG to
model these psycholinguistic processes.
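The matching operation described above can be illustrated with a minimal computational sketch. The Python code below is an illustration only and is not part of LFG or PT: the function name unify, the representation of lexical entries as dictionaries, and the simplified feature annotations are all assumptions made for this example.

```python
from typing import Optional

Features = dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature sets; return None if a shared feature clashes."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None  # the two pieces of information do not match
        merged[name] = value
    return merged


# Hypothetical, simplified lexical annotations based on the examples in the text.
two = {"NUMBER": "plural"}
kids = {"NUMBER": "plural"}
kid = {"NUMBER": "singular"}
little_peter = {"NUMBER": "singular", "PERSON": "3"}
goes = {"NUMBER": "singular", "PERSON": "3"}
unannotated_verb: Features = {}  # a verb not yet annotated for person/number

print(unify(two, kids))           # match within the noun phrase
print(unify(two, kid))            # clash: None
print(unify(little_peter, goes))  # match between noun phrase and verb
print(unify(little_peter, unannotated_verb))  # no clash can be detected
```

In this simplified representation, an entry that lacks a feature unifies with anything. This parallels the observation that a mismatch goes undetected when a verb has not yet been annotated for person/number.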
Information Exchange
Stage      Locus of exchange     Example             Illustration
Sentence   within sentence       Peter sees a dog    [S [NP [N 3rd ps sg]] [VP [V 3rd ps sg] NP]]
Phrase     within phrase only    two kids            [NP [Det pl] [N pl]]
Category   no exchange           talk-ed             [V past]
FIGURE 8.1 A simplified account of the processability hierarchy.
The two examples we have used also illustrate the implicational nature of
the processability hierarchy. It is easy to see that in the Little Peter example,
grammatical information has to be matched between the noun phrase and
the verb phrase and that this occurs when the two pieces are assembled to
form the sentence. In contrast, in the second example (i.e., two kids), the
information matching occurs in the noun phrase procedure, before the sen-
tence is assembled. In other words, there is a time sequence involved in the
matching of grammatical information, which forms the basis of the original
processability hierarchy. Noun phrases are assembled before verb phrases,
which are assembled before sentences. In addition, individual words belong
to categories such as noun and verb, and category procedures are the mem-
ory stores that hold grammatical information about features such as num-
ber (singular, plural), person (first, second, third), and tense (present, past),
among others. Therefore, category procedures are acquired before noun
phrase procedures.
The following is an overview of the original processability hierarchy, fol-
lowing Pienemann (1998):
1. no procedure: no access to syntactic information; production of single
words and formulaic utterances (e.g., producing a simple word such as yes
or a formulaic utterance such as What’s your name?)
2. category procedure: annotation of L2 lexical items for their syntactic cat-
egory and activation of lexical morphemes (e.g., adding a past tense mor-
pheme to a verb as in talked)
3. phrasal procedure: exchange of grammatical information within phrases
(particularly noun phrases and prepositional phrases) and production of
phrasal morphemes (e.g., matching number information in English noun
phrases, as in two kids)
4. verb phrase procedure: exchange of grammatical information within the
verb phrase (e.g., the agreement between the auxiliary and the past parti-
ciple as in He has seen him)
5. sentence procedure: exchange of grammatical information across phrases
(e.g., subject-verb agreement as in Peter sees a dog)
6. subordinate clause procedure: unification of features/exchange of gram-
matical information across sentences (e.g., use of subjunctive in subordi-
nate clauses triggered by information in a main clause as in The doctor
insisted that the patient be quiet)
The basic hypothesis underlying PT is that learners develop their grammat-
ical inventories following this hierarchy for two reasons: (a) the hierarchy is
implicationally ordered, that is, every procedure is a necessary prerequisite
for the next procedure, and (b) the hierarchy mirrors the time course in
language generation, that is, the procedures are acquired in the sequence in
which they are activated in the language generation process. Therefore, the
learner has no choice but to develop along this hierarchy. Phrases cannot be
assembled without words being assigned to categories such as noun and verb,
and sentences cannot be assembled without the phrases they contain and so
forth. The fact that learners have no choice in the path they take in the devel-
opment of processing procedures follows from the time course of language
generation and the design of the processing procedures. This is how the
architecture of language generation constrains language development. So,
observed sequences of development are a direct result of the stage of process-
ing in which learners find themselves. For example, if learners are at level 3
of the processing hierarchy (i.e., they can only exchange information within
a noun phrase or prepositional phrase), they will not produce wh- questions
that require processing abilities beyond level 3 (e.g., they will not produce
questions with correct use of auxiliaries as in Where has he gone?) because
such questions involve processing that relies on the exchange of information
across phrases.
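The implicational logic of the hierarchy can be made concrete with a short sketch. The Python code below is an illustration only: the names HIERARCHY, REQUIRES, is_processable, and processable_structures are hypothetical, and the pairing of structures with procedures simply restates the examples given in the overview above.

```python
# The six procedures in the order given in the overview (level 1 to level 6).
HIERARCHY = [
    "no procedure",
    "category procedure",
    "phrasal procedure",
    "verb phrase procedure",
    "sentence procedure",
    "subordinate clause procedure",
]

# Example structures and the procedure each one requires (from the overview).
REQUIRES = {
    "past -ed (talked)": "category procedure",
    "number agreement in NP (two kids)": "phrasal procedure",
    "aux + past participle (He has seen him)": "verb phrase procedure",
    "subject-verb agreement (Peter sees a dog)": "sentence procedure",
    "subjunctive in subordinate clause": "subordinate clause procedure",
}


def level_of(procedure: str) -> int:
    """Return the 1-based level of a procedure in the hierarchy."""
    return HIERARCHY.index(procedure) + 1


def is_processable(required: str, learner_level: int) -> bool:
    """A structure is processable only if its procedure has been acquired."""
    return level_of(required) <= learner_level


def processable_structures(learner_level: int) -> list[str]:
    """List the example structures available at a given level."""
    return [s for s, p in REQUIRES.items() if is_processable(p, learner_level)]


for structure in processable_structures(3):
    print(structure)
```

Because each level presupposes all lower levels, a single number is enough to describe the learner's current state in this sketch. A learner at level 3 is predicted to process the first two structures but not subject-verb agreement.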
As mentioned earlier, the original version of the processability hierarchy
focuses on information transfer within phrase structure. In the extended ver-
sion of PT (Pienemann et al., 2005), the processability hierarchy is extended
to include further aspects of language generation, in particular, the relation-
ship between what is known as argument structure and grammatical structure.
Grammatical structure contains information about grammatical functions, such
as subject and object. Argument structure refers to the basic ideas conveyed in
a sentence, namely, who does what to whom. The extended version of PT also
includes the relationship between what is intended to be said and the way this
is expressed using L2 grammatical forms. This extension is also modeled using
LFG. Details will be summarized later on.
Hypothesis Space
The processability hierarchy has been described as the sequence in which
the fundamental design of the language processor develops in L2 acquisition,
and it has been added that the learner is constrained to follow this sequence.
At the same time, the processing procedures developed at every stage of the
hierarchy do allow for some degree of leeway for the shape of the L2 gram-
mar. Hypothesis Space is created by the interplay between the processability
hierarchy and the leeway it generates at every level.
An example will illustrate the constraining effect of the processability hier-
archy. At level 3 (noun phrase and prepositional phrase procedures), grammat-
ical information can be exchanged only within noun phrases and prepositional
phrases, not beyond the phrasal boundary. Therefore, grammatical struc-
tures requiring information exchange beyond the phrase boundary, such as
subject-verb agreement, cannot be processed at this stage. At the same time, these
constraints leave sufficient leeway for learners to find different solutions to
structural learning problems. We illustrated this previously with the example of
the position of auxiliaries in English wh- questions. Correct placement of auxil-
iaries in second position requires processing procedures from a much later stage
in the hierarchy. L2 learners may nevertheless produce wh- questions. When
they attempt to do this, they have four structural options within the range of
the available means processable by their interlanguage grammar that avoid the
placement of the auxiliary in second position. The reader will recall the exam-
ples given previously. Note how the learner can remain at level 3 of processing
(i.e., can only process information in noun phrases and prepositional phrases)
even when confronted with target structures that require higher processing
procedures. In each example that follows, the reader can see how learners delete
something or use a nonstandard or unexpected word order to avoid processing
information across phrases.
A. Where he been?
B. Where has been?
C. Where he has been?
D. He has been where?
Transfer of Grammatical Information and Feature Unification
As previously mentioned, the original version of PT focused on phrase structure
(which is called constituent structure in LFG) and the transfer of grammatical
information within it. This information transfer process is modeled using
feature unification. Every entry in the learner’s mental lexicon needs to be
annotated for the specific features of the target language. For instance, the
entry Peter needs to be assigned to the lexical class noun. It needs to be anno-
tated as a proper noun, the feature NUMBER needs to have the value singular,
and the feature PERSON needs to have the value third. The lexical entry sees
needs to be assigned to the lexical class verb, and the features NUMBER,
PERSON, TENSE, and ASPECT need to have the following values:
NUMBER = singular
PERSON = 3
TENSE = present
ASPECT = noncontinuous
To achieve subject-verb agreement in the sentence Peter sees a dog, the value
of the features NUMBER and PERSON has to be matched (or unified).
The features NUMBER and PERSON have the values singular and third,
respectively, and these values reside in the lexical entries of the noun Peter and
the verb sees. This grammatical information is passed on to the noun phrase
procedure and verb phrase procedure, respectively. From there the two sets of
information are passed on to the sentence procedure, where they are matched.
In the design of PT, the point of unification is related to the hierarchy of
processability that reflects the time course of real-time processing. The hierar-
chy that results from a comparison of the points of feature unification can be
ordered as follows:
1. No exchange of grammatical information (= no unification of features)
2. Exchange of grammatical information within the phrase
3. Exchange of grammatical information within the sentence
If one applies this hierarchy to the acquisition of English as a second language,
the following developmental trajectory can be predicted:
1. past –ed will appear before
2. plural –s, which in turn will appear before
3. third-person –s.
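The way this prediction follows from the point of unification can be sketched in a few lines. The Python code below is an illustration only: the names Locus, MORPHEME_LOCUS, and predicted_order are hypothetical, and the assignment of each morpheme to a point of unification follows the three-step hierarchy listed above.

```python
from enum import IntEnum


class Locus(IntEnum):
    """Point of feature unification, ordered as in the hierarchy above."""
    NO_EXCHANGE = 1
    WITHIN_PHRASE = 2
    WITHIN_SENTENCE = 3


# English morphemes paired with the point at which their features are unified.
# The entries are deliberately listed out of order.
MORPHEME_LOCUS = {
    "third-person -s": Locus.WITHIN_SENTENCE,
    "plural -s": Locus.WITHIN_PHRASE,
    "past -ed": Locus.NO_EXCHANGE,
}


def predicted_order(morpheme_locus: dict[str, Locus]) -> list[str]:
    """Sort morphemes by the point at which unification takes place."""
    return sorted(morpheme_locus, key=lambda m: morpheme_locus[m])


for morpheme in predicted_order(MORPHEME_LOCUS):
    print(f"{MORPHEME_LOCUS[morpheme].name:16} {morpheme}")
```

The ordering is derived entirely from the point of unification, so the same function could in principle be applied to a table of morphemes from another target language, provided each morpheme has been analyzed for where its features are matched.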
To appreciate the universal nature of PT, it is crucial to consider that the
processability hierarchy is not language-specific and that, in principle, it applies
to the transfer of grammatical information in any language. In contrast, the
examples that were given for the acquisition of English morphology utilize
this hierarchy to make predictions about linguistic development in one specific
target language.
What the preceding discussion suggests is that learners develop a lexically
driven grammar; that is, the lexicon stores grammatical information about fea-
tures like number, person, tense, and so forth. For instance, the lexical entry
for walked is marked for past tense. Lexical information of this type is required
in the assembly of the sentence, and grammatical information and features must
be matched or unified.
Lexical-Functional Grammar
LFG has three independent and parallel levels of representation: argument
structure (a-structure), functional structure (f-structure), and constituent struc-
ture (c-structure). The three levels of linguistic representation are related to
each other by specific linking or mapping principles to unify the information
that is encoded in each of the three levels. These three levels are illustrated in
Figure 8.2.
Argument structure is related to who does what to whom in a sentence. It
contains the verb and its corresponding arguments. Arguments take specific
thematic roles (such as agent, experiencer, locative, or patient/theme) that are
ordered according to a universal hierarchy of thematic roles. This hierarchy
structures the semantic roles of the verb according to their prominence. For
instance, the agent always precedes the patient in a-structure. The arguments
for each verb are listed in the lexical entry of the verb. As illustrated in the
following examples, verbs differ in both the number and types of arguments
they require:
1. John (agent) threw the ball (patient/theme).
2. Peter (experiencer) sees ghosts (patient/theme).
3. Mary (agent) put the pen (patient/theme) on the desk (location).
argument structure       agent               theme
functional structure     SUBJ                OBJ
constituent structure    NPsubj              NPobj
                         John     [threw]    the ball.
FIGURE 8.2 Sample of three levels of structure in LFG.
Functional structure consists of universal grammatical functions (such as
subject or object) that are related to constituent structure in a language-specific
way. Functional structure serves to connect argument structure and constituent
structure. For instance, in example (1), the constituent [[John]N]NP is the
grammatical subject of the sentence (and the agent of the verb played).
(1) play <agent, patient/theme>
            |          |
          SUBJ        OBJ
            |          |
          John played the piano.
As previously mentioned, constituent structure is basically another name for phrase
structure and describes the structure of the parts of sentences. This component
consists of units that are constructed on the basis of a universal core of constituent
categories (verb, noun, and so forth), but these are arranged in a way that is specific
for every language. For instance, in some languages, descriptive adjectives precede
the noun (e.g., in English: the white horse), whereas in other languages, they follow
the noun (e.g., in French: le cheval blanc; and in Spanish: el caballo blanco).
Lexical Mapping
Lexical Mapping Theory is a component of LFG (e.g., Bresnan, 2001). Lexical
Mapping specifies the mapping processes from a-structure to f-structure, that
is, from arguments to grammatical functions. This design of LFG as a theory
of language ensures that universal argument roles (e.g., agent, patient/theme)
can be expressed using a range of different grammatical forms (e.g., subject,
object). For instance, in English, agents and experiencers can be realized as the
grammatical subject of a sentence, as was shown in the examples we saw earlier
with the verbs throw, put, and see. But note that in English, other arguments can
be realized as the subject in a sentence. In passives such as The ball was thrown
by John, the theme is now the subject. In other words, the relationship between
argument structure and the other two levels of structure is variable in a spe-
cific language and it also varies between languages. This variable relationship
between what is intended to be said (argument structure) and the way it is
expressed using grammatical forms (such as grammatical morphemes or word
order) creates expressiveness in language, but it also creates what Levelt (1981)
calls the linearization problem, that is, the sequencing of the communicative
intention to be expressed. As mentioned previously, the output of the processor
is linear, but it may not be mapped onto the underlying meaning in a linear way.
In the case of the active-passive alternation introduced earlier, the linearization
problem applies to the relationship between argument structure and functional
structure. The passive in the preceding example deviates from a simple match
Unmarked Alignment
Lexical Mapping Theory accounts for the mapping of argument structure onto
functional structure. In PT the default mapping principle is unmarked align-
ment, which is based on the one-to-one mapping of argument roles onto gram-
matical functions. In English, for instance, agent = SUBJECT is the prototypical
or default association between argument structure and functional structure. But
as we saw with passives, languages allow for a much wider range of relationships
between argument structure and functional structure, and the ability to map these
relationships develops stepwise in L2 acquisition. Principles of lexical mapping
can account for these developmental processes. For L2 acquisition, unmarked
alignment is the initial state of development and results in canonical word order
(i.e., the most typical word order for that language). For English, this is SVO (but
for Japanese, for example, canonical word order is SOV). Unmarked alignment
simplifies language processing for the L2 learner of English who, at this stage, will
classify the first noun phrase as the agent. This way, canonical word order avoids
any kind of transfer of grammatical information during language processing.
PT claims that L2 acquisition starts with an unmarked assignment of func-
tional structure. Subsequent changes of the relationship between arguments
and functional structure will require additional processing procedures that
will be acquired later. Hence, the unmarked alignment hypothesis implies a
developmental prediction for L2 structures affecting the relationship between
argument structure and functional structure. Let’s return to the passive we saw
earlier. In the passive, the relationship between argument roles and grammati-
cal functions may be altered, as illustrated in examples (2) and (3).
(2) throw <agent, patient/theme>
             |          |
           SUBJ        OBJ
             |          |
           John threw the ball.
(3) thrown <agent, patient/theme>
              |          |
              Ø         SUBJ   (ADJ)
    The ball was thrown by John.
Sentences (2) and (3) describe the same event involving two participants. The dif-
ference between the two is that in (3), the argument the ball that is OBJECT in
(2) is realized as SUBJECT, and the argument John that is SUBJECT in (2) is
realized as ADJUNCT.
These alterations of the relationship between argument roles and grammati-
cal functions constitute a deviation from unmarked alignment. In order for this
type of marked alignment to be possible, learners have to acquire additional
processing resources to be able to map arguments onto grammatical functions
in a more flexible way. Therefore, PT predicts that the passive is acquired later
than active SVO sentences in English.
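The contrast between unmarked and marked alignment in examples (2) and (3) can be restated as a small sketch. This is only a toy illustration of the two mappings described above, not an implementation of Lexical Mapping Theory; the function names are hypothetical and chosen for this example.

```python
# Toy illustration of unmarked vs. marked alignment (hypothetical helper names).
# Role and function labels follow examples (2) and (3) in the text.

def unmarked_alignment(roles):
    """Default mapping: the first role becomes SUBJ, the second becomes OBJ."""
    functions = ["SUBJ", "OBJ"]
    return dict(zip(roles, functions))

def passive_alignment(roles):
    """Marked mapping: the agent is realized as an optional ADJ, the patient/theme as SUBJ."""
    agent, patient = roles
    return {agent: "ADJ", patient: "SUBJ"}

roles = ("agent", "patient/theme")
active = unmarked_alignment(roles)    # John threw the ball.
passive = passive_alignment(roles)    # The ball was thrown by John.
print(active)
print(passive)
```

The second function stands for the additional, nonlinear mapping that, according to PT, becomes available to the learner only later.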
The TOPIC Hypothesis
As discussed earlier, Lexical Mapping Theory specifies the relationship
between argument structure and functional structure, and PT derives
developmental predictions from the language-specific relationship between
argument structure and functional structure using Lexical Mapping The-
ory. Similar predictions can also be derived from the relationship between
functional structure and constituent structure. One set of such predictions is
entailed in the TOPIC Hypothesis. To account for developmental dynamics
in the relationship between functional structure and constituent structure,
Pienemann et al. (2005) propose the TOPIC Hypothesis, which predicts that
learners will initially not differentiate between SUBJECT and other gram-
matical functions in sentence-initial position (e.g., TOPIC). In this context,
it is important to note that in LFG, TOPIC is a grammatical function. For
instance, in the sentence Ann, he likes, Ann has two functions, OBJECT
(of the verb likes) and TOPIC (of the sentence). In this case, the TOPIC
function is assigned to a constituent in sentence-initial position other than
the SUBJECT (because the subject of likes is he). This process is referred
to as topicalization. When the learner is able to add a constituent before
the subject position, this will trigger the differentiation of the grammatical
functions TOPIC and SUBJECT.
The TOPIC Hypothesis predicts that, initially, the first noun phrase is
mapped onto the SUBJECT function (as in 1), because the learner does not
differentiate between the grammatical function TOPIC and SUBJECT
(see also VanPatten, this volume). At a later stage, the unmarked alignment
between constituents and grammatical functions is altered: The assignment of
the TOPIC function to nonargument functions results in the occurrence of
adjuncts in sentence-initial position (as in 2). Finally, the TOPIC function is
assigned to a core argument that is not the subject. This applies, for instance, to
the topicalization of objects (as in 3).
1. TOPIC and SUBJECT are not differentiated.
   (Peter saw Mary.) He liked the girl.
                     |        |
                     SUBJ     OBJ
2. The initial constituent is an ADJUNCT or a question-word. TOPIC is differentiated
   from SUBJECT.
   Yesterday everyone smiled.
   |         |
   ADJ       SUBJ
3. The TOP function is assigned to a core argument other than SUBJ.
   Ann, I think, he likes.
   |             |
   OBJ           SUBJ
The Initial L2 Grammatical System
As pointed out earlier, PT is based on the assumption that the interlanguage gram-
mar is constrained by the limited processing resources available to the learner. This
assumption materializes in the form of Hypothesis Space, which delineates struc-
tural hypotheses available to the learner at any given stage. Processing constraints
also delineate possible L1 transfer. PT predicts that the initial state of L2 syntax fol-
lows the canonical word order of the target language (depending on the language
being learned, canonical word order can be SVO, SOV, VSO, and so forth).
Lenzing (2013) investigated the oral speech production of 24 beginning
learners of English as L2 with a German background at primary school level.
The study was both cross-sectional and longitudinal in design, as the data
collection took place on two occasions, after one year and after two years of
instruction in English. The data were elicited using different communicative
tasks that the learners completed in pairs. Lenzing analyzed large quantities
of early L2 learner data that initially appeared to be much more diverse than
what the highly constrained initial state assumed in PT would predict. Early
learners produced “strange” utterances that differed from the target language
in several ways. For instance, the structures differed in terms of their syntax,
as in Ski the mouse (for The mouse is skiing). Other utterances were semantically
ill-formed. For example, the learners produced question forms such as What’s
the spaghetti? and What’s you {ne} sister? In these cases, the intended meaning of
the question could only be inferred from the context; that is, the first question
was intended to ask Do you like spaghetti? and the second question meant Do you
have a sister? A further deviation from the target language relates to the number
of arguments the learners expressed in their utterances. These ranged from ut-
terances with missing arguments, as in Is sleep on the {wolk} (for The elephant is
sleeping on the cloud), in which the agent is missing, to structures that contained
more arguments than the learner wished to express, as was the case in She likes
you spinach? (for Do you like spinach?). Finally, in a number of utterances, the
lexical class did not match. In the question form It’s a pink?, the lexical item
pink does not seem to be annotated for its correct syntactic category (adjective).
The same applies to the structure What’s your eating? Table 8.1 summarizes
these observations.
TABLE 8.1 Deviations in Early L2 Learner Utterances
1. Syntactic deviation            Ski the mouse (= The mouse is skiing)
2. Semantic deviation             What’s the spaghetti? (= Do you like spaghetti?)
                                  What’s you {ne} sister? (= Do you have a sister?)
3. Number of arguments            sleeping on the {wolk}. (= The elephant is sleeping on the cloud)
   (participants in event)        She likes you spinach? (= Do you like spinach?)
4. Lexical class does not match   It’s a pink? (= Is it pink?)
                                  What’s your eating? (= What do you like to eat?)
However, a detailed distributional analysis of a large corpus of the very
early learner data revealed that semantic and syntactic deviations like those
shown in Table 8.1 were limited to a very small class of verbs in the context of
a highly limited lexicon. The same was true for nontarget-like argument struc-
tures. Lenzing’s analysis revealed that the linguistic system of very early learn-
ers is highly constrained in its constituent structure, its argument structure,
and lexicon. She concluded that the unpredicted “chaotic” structures shown in
Table 8.1 are not generated by the learners’ grammatical system. Instead, these
structures are based on lexical processes by which the arguments are mapped
directly onto surface form. This implies that, initially, no c-structure is present.
In some cases, the learners rely on formulaic units (such as What’s or She likes)
and simply attach to these units the lexical item(s) that best match the argu-
ment(s) they intend to express. This process results in idiosyncratic question
forms composed of a formulaic question marker and one or more lexical items
attached to it, as in What’s the spaghetti?
Initially, learners also fail to assign a lexical class to L2 words. For instance,
in the anomalous question It’s a pink?, the adjective pink occupies the syn-
tactic slot of a noun. In terms of the syntactic distribution of adjectives in
English, they can occur between a determiner and a noun. However, they
cannot occupy the syntactic slot of a noun, as is the case in the example above.
Lenzing concludes that the L2 constituent structure and argument structure
need to be discovered stepwise by the learner, and lexical classes need to be
assigned gradually to new lexical entries. As essential features and functions are
underdeveloped or missing, feature unification and mapping cannot be carried
out at the initial state. Lenzing refers to these assumptions as the Multiple
Constraints Hypothesis (MCH).
In PT, universal grammatical functions (subject, object, and so forth) are
assumed to be present in the initial L2 mental grammatical system. The MCH
makes the additional assumption that grammatical functions are inaccessible
at the initial state, as the mapping process from a- to f-structure is blocked.
The L2-specific c-structure is assumed to develop gradually in the acquisition
process following the predictions spelled out in PT. Its gradual development
is characterized by basic, flat c-structures that later become more complex,
hierarchical ones.
FIGURE 8.3 The Multiple Constraints Hypothesis. [Diagram not reproduced. Panels:
Lexicon (a-structure such as <experiencer, patient/theme>, with the syntactic side
not (fully) annotated for syntactic features); F-structure (grammatical functions
such as SUBJ and OBJ present but inaccessible due to lack of syntactic features in
a-structure); C-structure (initially not present; direct mapping through lexical
processes, as in I milk. and I like rolls {mit} jam.; development from flat
c-structure trees, S over N V N, to more hierarchical ones).]
The MCH is illustrated in Figure 8.3, which shows the direct mapping of
a-structure onto c-structure, bypassing f-structure. Development of this initial
learner system into the L2 is driven by the gradual annotation of the lexicon,
which permits the processor to map a- and c-structure onto f-structure, thus
facilitating nonlinear mapping processes.
L2 Comprehension
To date, the focus of PT has mainly been on L2 production. Recently, research-
ers have begun to ask whether the psycholinguistic processing procedures that
govern the acquisition of L2 learner capacities to produce morpho-syntactic
structures also play a role in L2 comprehension. Lenzing (2019, in press) argues
that L2 production and comprehension rely on (at least partially) shared pro-
cessing resources. This does not imply that production and comprehension are
based on exactly the same resources. A key difference is that in comprehen-
sion, learners may not have to rely on the structural form of the message they
encounter, as other cues (e.g., semantic, pragmatic, contextual) form an essen-
tial part of the resources available for comprehension.
Acknowledging the differences in resources, Lenzing argues that L2
comprehension can follow either of two routes, a semantic or a syntactic route.
Lenzing claims that the semantic route allows learners to rely on nonsyntac-
tic cues to meaning, such as lexical semantics (e.g., that the agent of the verb
kick must be animate). In relation to the syntactic route, she argues that PT’s
processing procedures for L2 production also apply to L2 comprehension. This
means that when the focus is on the syntactic aspects of a message, the same
syntactic procedures that operate in encoding (speech production) also operate
in decoding (comprehension). Lenzing claims that in L2 comprehension, learn-
ers initially draw on semantic cues and follow the semantic route in compre-
hension because they lack the processing procedures required to syntactically
decode the message. As learners acquire the syntactic processing procedures,
they progressively become able to process more nonsemantic/nonpragmatic
aspects of what they hear (e.g., the acquisition of the sentence procedure is a
prerequisite for L2 learners to process subject-verb agreement as in Peter plays).
As we pointed out earlier in this chapter, in production, passives are more
complex than active sentences because one processing prerequisite for passive
structures is the acquisition of nonlinear mapping operations between arguments
and grammatical functions. Basically, in order to produce a sentence such as
The wall was painted by the boy, learners need to be able to map the patient/theme
argument (the wall) onto the SUBJECT function and the agent argument (the boy)
onto the ADJUNCT function. Learners who have not yet acquired the capac-
ity to implement this kind of nonlinear mapping operation have been found to
produce sentences in active voice instead of passive (e.g., The boy paint the wall).
Lenzing (in press) applied these claims to the L2 comprehension of the
English passive. She pointed out that nonreversible passives (i.e., those with
an inanimate patient/theme argument in subject position) such as The wall was
painted by the boy can be understood before they can be produced by L2 learners.
Lenzing’s claim is that learners are able to do this because they do not have to
process this kind of sentence via the syntactic route. In this instance, they can
rely on semantic cues, as there is only one plausible interpretation of the rela-
tionship between the wall and the boy. An active interpretation would result in
a semantically anomalous sentence (The wall paints the boy) (see also VanPatten,
this volume).
However, there are passive forms that provide no semantic cues to aid their
comprehension. In these reversible passives, swapping the positions of agent and
patient would result in a logically possible meaning (compare The girl was kissed
by the boy with The boy was kissed by the girl). In addition, both interpretations
would be equally plausible (i.e., likely to occur in the real world). Lenzing
claims that in instances such as these, when semantic cues are insufficient, learn-
ers have to use syntactic resources to make sense of what they hear and process
the sentence syntactically in order to understand who does what to whom.
In order to test her claims, Lenzing conducted a study with 82 German
learners of English in a classroom context. The learners were at different lev-
els of L2 development. Lenzing examined different aspects of the learners’ L2
acquisition process in both comprehension and production utilizing differ-
ent instruments: (1) communicative tasks to determine the learners’ levels of
acquisition; (2) a sentence-picture matching task and an enactment task to elicit
comprehension of English passives; and (3) a sentence-matching reaction time
experiment to gain insights into morpho-syntactic processing during the com-
prehension of English passives. She reported three findings. First, at lower levels
of acquisition, L2 learners comprehended only nonreversible passives (e.g., The
wall was painted by the boy) and passives with low event probability in active
interpretation (e.g., The cat is fed by the woman). Second, passive sentences that
lacked semantic cues to meaning (e.g., The boy was kissed by the girl) were only
reliably comprehended by learners at higher levels of acquisition. Third, the
results of the sentence-matching experiment showed that, in comprehension,
morpho-syntactic processing of English passives occurs only if the respective
processing procedures have been acquired.
These findings support Lenzing’s claim that PT’s processing procedures do
not explain all aspects of the L2 comprehension of passives but do apply to the
syntactic aspects of L2 comprehension. For other work that extends PT to com-
prehension (albeit with a somewhat different focus), see Buyl (2015), Buyl and
Housen (2013, 2015), Spinner (2013), and Spinner and Jung (2018).
L2 Profiling and the Implementation of PT
into Artificial Intelligence
Given that PT’s objective is to describe and explain L2 developmental trajec-
tories as well as learner variation, it is only natural that researchers in the PT
community have applied this framework to the measurement of L2 acquisi-
tion. Pienemann, Johnston, and Brindley (1988) started this work before the
conceptualization of PT was finalized. These authors utilized standard devel-
opmental ESL patterns as an empirical basis for the construction of an ESL
profiling procedure. Linguistic profiling was initially developed in the context
of L1 speech pathology (Crystal, Fletcher, & Garman, 1984). Such profiles were
based on a comparison of conversational data obtained from patients with stan-
dard developmental patterns found for the target language. When used with
an individual, this kind of systematic comparison of a developing linguistic
system with an established standard can reveal the current state of the patient’s
language system and hence identify developmental delays because it informs
speech pathologists of the exact grammatical inventory of a patient at a specific
point in time. Pienemann et al. (1988) applied this approach to the acquisi-
tion of ESL. They saw a benefit in being able to detail the exact grammatical
inventory of an L2 learner. Their vision was that L2 profiling would enable
teachers to pick up L2 learners from the exact point to which their L2 system
had developed. They suggested that one advantage of profiling approaches over
proficiency measures is that proficiency measures do not reflect the learner’s
language acquisition processes and are not specific about the learner’s gram-
matical inventory. In seeking to make this approach more manageable for busy
professionals, Pienemann and collaborators (Mackey, Pienemann, & Thornton,
1991; Pienemann, 1990, 1992; Pienemann & Mackey, 1993) extended this
approach to a computer-assisted application called Rapid Profile (RP). In RP,
conversational data are collected and analyzed in a 10-minute session using
communicative tasks. The analyst observes the L2 output of the learner for
pre-defined developmental markers and enters them into a customized database.
RP thus requires only a 10-minute session, whereas a manual
analysis of an equivalent ESL sample may take several hours. Pienemann and
collaborators assumed that this increased speed would be necessary before lin-
guistic profiling would be feasible for ESL professionals.
Despite its potential as a valid approach to the measurement of L2 develop-
ment, RP has not found its way into many ESL classrooms. The reasons for this
are simple: First, for decades, it has been the objective of mainstream language
testing to obtain a global view of a learner’s language proficiency. RP does the
exact opposite: it focuses on maximum precision in its description of the learn-
er’s language system. Second, RP relies on an observational procedure carried
out by a human analyst. Due to the attentional limits of the analyst, many
features of the learner’s language remain unanalyzed, including the entire vari-
ational dimension. Third, although the observation-based RP procedure is
about 30 times faster than the original approach for speech pathology, language
practitioners have found administering RP too time-intensive.
To overcome these limitations, Pienemann and Lanze (2017) took linguistic
profiling into an artificial intelligence environment. They designed a system
called APES (Automatic Profiling Expert System) that is capable of carrying
out a full linguistic analysis of L2 data without needing a human analyst. APES
permits transcriptions of L2 conversations to be entered for analysis. Alternatively,
learners can enter their own text into the system and receive immediate
feedback. In a third method of data entry, which is currently under construction,
APES obtains learner input from a chat interface using typed learner utterances.
APES operates on the basis of a lexical unification grammar similar
to LFG. It also incorporates the PT processing procedures. This is achieved
by computing the unifications required for each structure and identifying the
position of these aspects of the learner’s linguistic system on the processability
hierarchy.
FIGURE 8.4 ESL profile of a speech sample generated automatically by APES.
Figure 8.4 shows the profile generated by APES for an ESL sample that had
been entered as a text file. In the syntax block in the top left part of Figure 8.4,
the names of syntactic structures are listed in the first column and the number of
times that the structure is used by the learner in the second column (labeled +).
The third column (labeled -) lists the number of times when the structure is not
used in an obligatory context, and the fourth column (labeled %) displays the
percentage resulting from the comparison of uses of the structure to the total
number of contexts for its use. The fifth column (Acq) indicates whether the
emergence criterion (cf. Pienemann, 1998) is met. Further to the right, the sec-
ond block (labeled Morphology) is laid out in the same way as the syntax block,
with one difference. The morphological analysis contains an additional column
for over-application (labeled >) to account for errors such as putting past –ed
on irregular verbs, for instance, wented. The quantification of three variational
features is displayed under the heading “variational features.” The APES analysis
is based on all data contained in the sample, yielding fully quantified results for
all features. All computations including the application of the emergence cri-
terion take less than one second for a sample containing 50 sentences. In other
words, APES solves the time problem, errors introduced by the human analyst
are reduced to near zero, and the variational dimension is included with maxi-
mum precision.
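The percentage column described above can be illustrated with a short sketch. It assumes only what the text states, namely that the percentage compares the uses of a structure (+) with the total number of contexts for its use (+ and -); the function name, the structure labels, and the counts are invented for illustration and do not come from APES. The emergence criterion itself is not modeled here.

```python
def suppliance_percentage(used, not_used):
    """Percentage of obligatory contexts in which a structure was supplied.

    used     -> the "+" column (structure produced)
    not_used -> the "-" column (structure missing in an obligatory context)
    Returns None when there are no contexts, because no conclusion can be drawn.
    """
    contexts = used + not_used
    if contexts == 0:
        return None
    return 100.0 * used / contexts

# Invented counts for three structures: (+, -)
rows = {"SVO": (12, 0), "3sg -s": (3, 9), "Passive": (0, 0)}
profile = {name: suppliance_percentage(plus, minus) for name, (plus, minus) in rows.items()}
print(profile)
```

Returning None for a structure without any context keeps "not supplied" apart from "no evidence either way."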
The breadth of features able to be included in APES exceeds what could be
included in RP. The RP approach was limited to observing 18 pre-defined
developmental markers because of the attentional limits of human analysts.
In contrast, the APES parser is equipped with the knowledge of the majority
of morpho-syntactic structures that occur in English. It is also capable of
processing a wide array of grammatically ill-formed structures that occur in
the language production of L2 learners and of analyzing them in terms of the
processability hierarchy. Comparing the nonnative language production with
potential requirements of the target language allows the system to offer the user
alternatives for ungrammatical sentences as illustrated in Figure 8.5.
The fragment of an APES analysis displayed in Figure 8.5 illustrates some
of these capabilities. In Figure 8.5, we show how APES offers a morphological
correction for the ungrammatical sentence She go home every day. Some details
of the correction process are displayed in the form of a tree diagram. In addi-
tion to generating the tree structure, APES computes the unification of lexical
features (not displayed in Figure 8.5). It recognizes the features PERSON = 3rd
and NUMBER = singular of the subject noun phrase (She). It also recognizes
that these features do not match those of the verb go and that the phrase every
day denotes habitual action and is thus connected to the feature -continuous.
APES further recognizes that no marking for tense is present in what is being
analyzed. It also compares the features of the preceding sentence (She works
in the city) with the current sentence. As the morphologically unmarked tense
is present and the inter-sentential comparison of features does not indicate a
change of tense, the conditions for third-person –s are met. Therefore, APES
copies the features of the subject noun phrase onto the verb and adds the –s affix
to the verb for the suggested correction.2 Because none of the feature unifications
that occur in this sentence are attributable to level 3 or above of the PT
hierarchy, the sentence is classified as level 2.
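The agreement check described for She go home every day can be sketched as a toy feature unification. This is a minimal illustration under stated assumptions, not the APES implementation: the feature tables, the spelling table, and the function names are hypothetical, the verb passed in is assumed to be a bare present-tense form, and the inter-sentential tense comparison is left out.

```python
# Toy feature tables (hypothetical, for illustration only).
SUBJECT_FEATURES = {
    "she": {"PERSON": "3rd", "NUMBER": "singular"},
    "they": {"PERSON": "3rd", "NUMBER": "plural"},
}
THIRD_SINGULAR = {"PERSON": "3rd", "NUMBER": "singular"}
S_FORMS = {"go": "goes", "work": "works"}  # toy spelling table for the -s affix

def unify(a, b):
    """Merge two feature sets; return None if a shared feature has conflicting values."""
    merged = dict(a)
    for key, value in b.items():
        if merged.get(key, value) != value:
            return None
        merged[key] = value
    return merged

def correct_present_tense(subject, verb):
    """Return (suggested verb form, features copied onto the verb) for a bare verb."""
    subject_features = SUBJECT_FEATURES[subject.lower()]
    if unify(subject_features, THIRD_SINGULAR) is None:
        return verb, {}  # subject is not third-person singular: bare form stands
    return S_FORMS[verb], dict(subject_features)

corrected, copied = correct_present_tense("She", "go")
unchanged, _ = correct_present_tense("They", "go")
print(corrected, copied, unchanged)
```

In this sketch, the subject's features are copied onto the verb only when they unify with third-person singular, which mirrors the condition for third-person –s described above.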
For reasons of limited space, we are only able to display a small fragment of
the APES system. However, this limited example may serve to illustrate that
the APES system goes far beyond looking for developmental markers in L2
speech. Instead, it infers an entire grammatical system from a sufficiently rich
interlanguage sample and relates it to the norms of the target language system
and the PT architecture.
FIGURE 8.5 Fragment of an APES analysis.
What Counts as Evidence?
Given the focus of PT on developmental dynamics, the most suitable research
design is a longitudinal (sampling the same learners over time) or cross-
sectional (sampling different learners at one specific point in time) study with
a large set of data relevant for the phenomena under scrutiny. In such studies,
the researcher collects naturalistic or elicited speech data that form the corpus
on which the study is based. Relevant data do not necessarily imply a large data
set. The data need to be relevant to the point to be studied. For instance, the
study of subject-verb agreement marking requires a large set of contexts for
subject-verb agreement marking. This will allow the researcher to decide if
the verbal marker is supplied or not. If no context appears, no conclusion can
be drawn. However, even the presence of a number of morphological mark-
ers is no guarantee that these are based on productive interlanguage rules. To
exclude the use of formulae and chunks (i.e., unanalyzed pieces of language like
How are you? or What’s your name?), the researcher needs to check lexical and
morphological variation (i.e., same morpheme applied to different words and
same word used with different morphemes). For instance, to determine that the
structure he goes is used productively and not merely stored as a chunk in the
learner’s mental lexicon, one needs to ensure that the third-person singular –s
occurs with different lexical verbs in the speech sample (e.g., eat-s, walk-s, sleep-s,
like-s) and that the verb appears with different suffixes (e.g., go-ing, go-Ø).
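The check for lexical and morphological variation can be made concrete with a brief sketch. It assumes that the speech sample has already been segmented into (stem, suffix) pairs; the function name and the threshold of two different stems are assumptions made for this illustration and are not a statement of PT's emergence criterion.

```python
from collections import defaultdict

def is_productive(tokens, suffix, min_variation=2):
    """Check a suffix for lexical and morphological variation.

    tokens: iterable of (stem, suffix) pairs from a speech sample.
    Lexical variation: the suffix occurs on at least `min_variation` different stems.
    Morphological variation: at least one of those stems also occurs with another suffix.
    """
    stems_with_suffix = set()
    suffixes_by_stem = defaultdict(set)
    for stem, suf in tokens:
        suffixes_by_stem[stem].add(suf)
        if suf == suffix:
            stems_with_suffix.add(stem)
    lexical = len(stems_with_suffix) >= min_variation
    morphological = any(len(suffixes_by_stem[s]) >= 2 for s in stems_with_suffix)
    return lexical and morphological

# eat-s, walk-s, go-s, go-ing, go-Ø (the empty string stands for the zero suffix)
sample = [("eat", "s"), ("walk", "s"), ("go", "s"), ("go", "ing"), ("go", "")]
chunk_only = [("go", "s"), ("go", "s")]  # he goes repeated, possibly a stored chunk
print(is_productive(sample, "s"), is_productive(chunk_only, "s"))
```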
Apart from corpora (plural of corpus) of speech data, reaction time exper-
iments also constitute valid tests of PT. As an example, a learner might be
tested on subject-verb agreement in a sentence-matching experiment (as in
Pienemann, 1998, pp. 215–230). The learner is presented with two sentences,
either in written form on a computer screen or in aural form via headphones
(see, e.g., Lenzing, in press). The sentences are either identical or not, and
the learner has to decide as quickly as possible whether the two sentences are
identical or not by pressing particular computer keys for yes and no responses.
Some pairs of sentences are grammatical (e.g., John goes… followed by John
goes…), and some are not (John go… followed by John go…). What is measured
is the time it takes the learner to make the decision. The focus of the analysis
is only on the matching sentence pairs. A crucial finding of studies in this par-
adigm is that native speakers are faster in responding to matching grammatical
sentence pairs than to matching ungrammatical sentence pairs, and it is assumed
that ungrammatical sentences take longer to process because the native speaker
is “checking” for feature agreement. However, language learners do not nec-
essarily behave in the same way. If they are presented with a sentence pair with
a missing third-person –s and have not acquired subject-verb agreement, they
are not slower to respond to the ungrammatical sentence pair than to the gram-
matical pair. In this way, sentence-matching experiments can provide insights
into which features the learner has acquired at a given point in time. Evidence
of the acquisition of particular linguistic features allows researchers to draw
conclusions about the acquisition of the procedures required for the processing
of those features.
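The logic of the sentence-matching analysis can be sketched as follows. The reaction times are invented, the field names and the function name are hypothetical, and a real study would rely on inferential statistics rather than a simple difference of means.

```python
from statistics import mean

def grammaticality_effect(trials):
    """Mean RT for matching ungrammatical pairs minus mean RT for matching grammatical pairs.

    trials: list of dicts with keys 'matching' (bool), 'grammatical' (bool), 'rt_ms' (number).
    Only matching pairs enter the analysis. A clearly positive value is the pattern
    reported for native speakers; a value near zero suggests the feature is not yet processed.
    """
    matching = [t for t in trials if t["matching"]]
    grammatical = [t["rt_ms"] for t in matching if t["grammatical"]]
    ungrammatical = [t["rt_ms"] for t in matching if not t["grammatical"]]
    return mean(ungrammatical) - mean(grammatical)

# Invented reaction times in milliseconds
trials = [
    {"matching": True, "grammatical": True, "rt_ms": 900},
    {"matching": True, "grammatical": True, "rt_ms": 1000},
    {"matching": True, "grammatical": False, "rt_ms": 1100},
    {"matching": True, "grammatical": False, "rt_ms": 1200},
    {"matching": False, "grammatical": True, "rt_ms": 5000},  # ignored: non-matching pair
]
effect = grammaticality_effect(trials)
print(effect)
```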
Another method that is used in research on L2 comprehension within the
PT paradigm is sentence-picture matching (see, e.g., Buyl & Housen, 2015;
Lenzing, in press). In a sentence-picture matching task, the learner is presented
with an aural stimulus (either words or a full sentence) and is asked to match
the stimulus with a picture representing the item or event described by the
sentence. For each prompt presented to the learner, he or she has to choose
among three different pictures. For instance, in a sentence-picture matching
task focusing on passive sentences, the learner hears the prompt The girl is kissed
by the boy. One of the pictures depicts the event described by the sentence,
that is, a boy kissing a girl. The second picture represents the event involving
a reversal of roles, that is, it shows a girl kissing a boy. The third picture is a
distractor item: It shows both participants—the boy and the girl—but does not
display the action described by the sentence. In this way, one can determine
whether learners comprehend a particular structure, for instance, the passive
discussed above.
Common Misunderstandings
A major misunderstanding regarding PT is that it can be applied to any lan-
guage without first considering how particular features of a target language
are processed. For example, some scholars who have tried to apply PT to a
new target language have based their application on the developmental tra-
jectories found for English and German, the two key target languages in early
research on PT. These researchers looked for such things as agreement or word
order phenomena that appeared similar to the developmental sequences found
in English and German. But the grammars of individual languages may vary
considerably, as may the processes involved in producing specific structures.
For instance, languages differ in how grammatical functions, such as subject
and object, are realized at the phrase structure level. The language-specific
nature of the relationship between grammatical functions and constituent
structure becomes evident when comparing English and Italian. Let us consider
the sentence I see them. In English, the grammatical functions SUBJECT and
OBJECT appear as noun phrases in constituent structure ([I]SUBJ see [them]OBJ)
(see Figure 8.6).
In Italian, grammatical functions are marked differently in constituent
structure: The SUBJECT function is marked on the verb by means of a morphological
subject marker, and the OBJECT function appears as a clitic ([li]OBJ ved-[o]SUBJ)
(see Figure 8.7).
This example shows that the processes involved in the production of sen-
tences differ in English and Italian and thus the rules for English cannot be
applied wholesale to Italian. Instead, one has to take into account how the
mapping of grammatical functions onto phrase structure is modeled for different
languages in LFG.
FIGURE 8.6 Matching grammatical functions onto c-structure in English. Adapted
from Di Biase and Kawaguchi (2002).
FIGURE 8.7 Matching grammatical functions onto c-structure in Italian. Adapted
from Di Biase and Kawaguchi (2002).
In sum, the processability hierarchy needs to be applied to a new target
language on the basis of the fundamental principles of PT, not on the basis of
developmental trajectories found in specific target languages. Utilizing funda-
mental principles of PT includes a detailed analysis of the information transfer
required for the production of specific structures. This is best done on the basis
of an LFG analysis of the structures in question. The exemplary study, discussed
next, takes into account these crucial considerations.
An Exemplary Study: Kawaguchi (2005)
Given that PT has been designed as a universal theory of L2 development, it
is important to demonstrate that it can be applied to typologically distant lan-
guages and that the predictions for developmental trajectories derived from this
cross-linguistic application are borne out by empirical studies. Kawaguchi’s
(2005) study exemplifies the applicability of PT to the acquisition of Japanese
by deriving a developmental trajectory for the acquisition of Japanese as a second
language (JSL), which is supported by longitudinal data.
To appreciate Kawaguchi’s application of PT to JSL, it is crucial to consider
some of the key features of Japanese grammar that the predicted developmental
trajectory is based on. First, Japanese is a head-final language, meaning that
the heads of phrases appear after their complements (e.g., verbs are the heads
of verb phrases and in Japanese verbs appear after their complements, as in rice
eats). English is head-initial (i.e., the head appears first, followed by the comple-
ment, as in eats rice). Second, the verb is always in final position (i.e., canonical
word order in Japanese is SOV). However, syntactic relations (such as subject,
object) are not marked by word order (as they are in English). Instead, syntactic
relations are marked by nominal particles that follow the noun to be marked.
Kawaguchi (2005, p. 259) provides the following example:
(4) Piano-o Tamiko-ga hii.ta
Piano-acc Tamiko-nom play-past
“Tamiko played the piano”
In this example, the marker –o marks the word piano for accusative, and the
particle –ga marks Tamiko for nominative. These markers allow the correct
interpretation of the sentence with Tamiko as the agent and piano as the theme.
As Kawaguchi points out, Japanese word order is relatively free. Therefore, the
two noun phrases in (4) may be scrambled (i.e., moved to different positions)
without affecting the meaning of the sentence.
Kawaguchi derives a specific and testable developmental trajectory from the
extended version of PT for JSL. For the purpose of this chapter, we summarize only
the three example structures that follow from the hypotheses discussed previously:
Level                   Information transfer   Structure
1. Lexical procedure    Category               TOPSUBJ(O)V
2. Phrasal procedure    Phrase                 TOPIC + S(O)V
3. Sentence procedure   Sentence               OBJECT topicalization
The TOPIC Hypothesis predicts that, initially, L2 learners of Japanese will not
differentiate between TOPIC and SUBJECT. This is reflected in the structure
TOPSUBJ(O)V, where placement of TOPSUBJ prior to the verb is the canonical
word order that applies to Japanese (SOV). At the phrasal level, TOPIC and
SUBJECT can be two different phrases (TOPIC + S(O)V), and at the sentence
level, objects can be in initial (i.e., TOPIC) position. The latter two steps are
predicted by the TOPIC Hypothesis, which states that the TOPIC function
will first be applied to nonarguments (e.g., adjuncts) and only then to core
arguments (i.e., to objects in Kawaguchi’s study). Kawaguchi demonstrates that
these structures are related to the general levels of the processability hierarchy
as shown above. It is this systematic linkage of the specific JSL structures with
the processability hierarchy that yields the crucial prediction of a JSL develop-
mental trajectory.
Kawaguchi conducted two longitudinal studies spanning two and three
years, respectively. The informants were Australian native speakers of English
who started learning Japanese in a formal setting. The informants also had reg-
ular contact with Japanese exchange students using Japanese. The informants
received six hours of linguistic input per week for 24 weeks per year. Data were
collected in natural conversation and using communicative tasks. Data collec-
tion started four weeks after commencement of the course. Each session lasted
between 20 and 30 minutes. Samples were collected every month. The data
were transcribed and further transliterated using a romanization system that
permits a computer-based analysis of the data. To test the hypotheses derived
from PT, Kawaguchi carried out a distributional analysis of the data. For this
analysis, she searched the learner data for the structures that are included in
the hypothesized developmental trajectory. She then counted every absence or
presence of these structures.
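The counting step of such a distributional analysis can be sketched in code. The following Python sketch is an illustration only and is not Kawaguchi's actual procedure: it assumes that every utterance has already been hand-tagged with a structure label, and the session data, the labels used as tags, and the function name are all hypothetical.

```python
from collections import Counter

# Structures in the hypothesized developmental trajectory
# (labels follow the summary given in this chapter).
TARGET_STRUCTURES = ["TOPSUBJ(O)V", "TOPIC + S(O)V", "OBJECT topicalization"]


def distributional_analysis(sessions):
    """Count how often each target structure occurs in each session.

    `sessions` maps a session label to a list of structure tags, one tag
    per utterance. A count of 0 records the absence of a structure.
    """
    table = {}
    for session, tags in sessions.items():
        counts = Counter(tags)  # missing tags count as 0
        table[session] = {s: counts[s] for s in TARGET_STRUCTURES}
    return table


# Invented data for illustration: tags from three monthly sessions.
sessions = {
    "month 1": ["TOPSUBJ(O)V", "TOPSUBJ(O)V"],
    "month 2": ["TOPSUBJ(O)V", "TOPIC + S(O)V"],
    "month 3": ["TOPIC + S(O)V", "OBJECT topicalization", "TOPSUBJ(O)V"],
}

table = distributional_analysis(sessions)
for session, counts in table.items():
    print(session, counts)
```

Reading the resulting table session by session shows when each structure first appears, which is the information needed to compare the observed sequence with the predicted one.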
The data analysis revealed that in Kawaguchi’s corpus, all structures included
in her hypotheses followed the predicted sequence: The verb appeared in the
last position in every sentence and in every session right from the start. The
structure TOPSUBJ(O)V appeared in a clearly distinguishable next step, and this
was followed by structures in which TOPIC and SUBJECT were differenti-
ated. Thus, the results support the predictions made by PT about how process-
ing constrains language development.
Explanation of Observed Findings in SLA
PT can account for several of the observed phenomena in L2 acquisition out-
lined in Chapter 1.
Observation 4: Learners’ output (speech) often follows predictable paths with pre-
dictable stages in the acquisition of a given structure. Explaining this observation is
one of the key aims of PT. PT has the capacity to predict levels of acquisition
in typologically diverse languages by locating grammatical structures of the
L2 within the processability hierarchy. These predictions can be universally
applied because they are specified within LFG.
Observation 5: Second language learning is variable in its outcome. Interlanguage
variability is generated by the leeway defined by Hypothesis Space at every
level of development. We demonstrated above that every learning problem
(i.e., every developmental structure) can be solved in a limited number of
different ways and that the range of solutions is defined by Hypothesis Space.
In the course of development, the learner thus accumulates different variants
of developmental structures. The accumulated choices made by the learner
determine the shape of the learner’s interlanguage variety. One class of choices
made by learners implies that the specific interlanguage rule cannot develop
further. For instance, at level 2 (SVO), a learner may opt to leave out the cop-
ula (i.e., the verb to be) in sentences like He is nice (i.e., He nice). If this learner
develops an interlanguage system without a copula, he or she will not be able to
produce copula inversion (level 4) later on, for instance, Is he nice? When learn-
ers accumulate many of these choices the interlanguage stabilizes. Different
degrees of “bad choices” made by the learner determine the point in develop-
ment at which the interlanguage system stabilizes.
Observation 7: There are limits on the effects of frequency on L2 acquisition. Given
the implicational nature of the processability hierarchy, none of the process-
ing procedures can be skipped because every lower procedure constitutes a
prerequisite for the next higher one. Therefore, frequency in the input cannot
override the constraints imposed by the hierarchy.
Observation 8: There are limits on the effect of a learner’s first language on L2
acquisition. The key assumption of PT is that L2 learners can produce only
those linguistic forms for which they have acquired the necessary processing
procedures. Under this scenario, L1 features and structures can only be trans-
ferred when the learner begins to process L2 features and structures that are
relevant to the L1. For example, learners cannot transfer knowledge or abilities
regarding L1 subject-verb agreement until they get to the stage where they can
process this kind of grammatical information in the L2. This claim is referred
to as the Developmentally Moderated Transfer Hypothesis (Pienemann, Di
Biase, Kawaguchi, & Håkansson, 2005).
Observation 9: There are limits on the effects of instruction on L2 acquisition. Given
that every processing procedure in the hierarchy is a prerequisite for the next
higher one, none of the procedures can be skipped. This implies that levels of
acquisition cannot be skipped through formal instruction. In other words, the
effect of teaching is constrained by processability. This was formerly referred
to as the Teachability Hypothesis (Pienemann, 1984) and has been subsumed
under PT.
Observation 10: There are limits on the effects of output (learner production) on
language acquisition. Because output is constrained by processability, learners
cannot produce structures that are beyond their current level of processing.
Thus, practice does not make perfect in language learning, and interaction in
which learners may become aware of structures may not lead to them being
produced. Production of new features and structures reflects a change in pro-
cessing (i.e., the acquisition of a processing procedure) and is not the cause of it.
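The implicational logic that underlies Observations 7, 9, and 10 (no procedure can be skipped because each lower procedure is a prerequisite for the next higher one) can be made concrete in a few lines of code. The Python sketch below is a minimal illustration under simplifying assumptions: it uses only the three levels summarized for JSL in this chapter, and the function name and learner profiles are hypothetical.

```python
# Levels of the processability hierarchy, lowest first. Only the three
# levels summarized for JSL in this chapter are listed here.
HIERARCHY = ["lexical procedure", "phrasal procedure", "sentence procedure"]


def is_implicational(acquired):
    """Return True if no acquired procedure has an unacquired prerequisite.

    `acquired` is a set of procedure names. A profile is consistent with
    the hierarchy only if the acquired procedures form an unbroken run
    that starts at the lowest level.
    """
    flags = [level in acquired for level in HIERARCHY]
    # Once a level is missing, no higher level may be present.
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))


# Hypothetical learner profiles.
print(is_implicational({"lexical procedure", "phrasal procedure"}))   # True
print(is_implicational({"lexical procedure", "sentence procedure"}))  # False
```

The second profile is rejected because it contains the sentence procedure without the phrasal procedure, that is, a skipped level, which the hierarchy rules out regardless of input frequency or instruction.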
The Explicit/Implicit Debate
PT does not address the explicit/implicit debate directly. Given that PT is
based on Levelt’s approach to language generation, it shares Levelt’s assump-
tions regarding the automaticity and implicit nature of several of the processing
components. The key component of Levelt’s approach utilized by PT is the
Grammatical Encoder, which is thought to operate largely automatically and
on the basis of implicit knowledge. In the context of Levelt’s model, explicit
knowledge comes into play through monitoring, which is seen as highly con-
strained by the overall architecture of the language generator. These assumptions
allow for a very constrained interface between explicit and implicit knowledge.
To this end, PT would not take a stand on explicit/implicit learning as it is
normally discussed in the literature.
Discussion Questions
1. How does PT explain staged development in L2 acquisition?
2. In what ways is PT different from (or similar to) generative or cognitive
theories used in research on L2 acquisition?
3. Select two different structures from a language you know or have studied.
What would you predict about their relative order of emergence based on
PT? What processing procedures seem to be involved?
4. Consider the insights about developmental trajectories provided by PT. Do
you see any implications for language teaching?
5. What methodological challenges do you see to testing the processability
hierarchy in comprehension?
6. Read the exemplary study presented in this chapter and prepare a discussion
for class in which you describe how you would conduct a replication study.
Be sure to explain any changes you would make and what motivates such
changes.
Suggested Further Reading
Lenzing, A. (2013). The development of the grammatical system in early second language acqui-
sition: The Multiple Constraints Hypothesis. Amsterdam, Netherlands: John Benjamins.
This book introduces the Multiple Constraints Hypothesis. Based on PT and
LFG, it focuses on the nature of the L2 initial grammatical system and its underlying
constraints.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT
Press.
This book is foundational for all work on speech production and is critical for a
deeper understanding of how PT operates. It is also foundational for understanding
L1 speech production.
Pienemann, M. (1998). Language processing and second language development: Processability
theory. Amsterdam, Netherlands: John Benjamins.
This is the first book published on PT and is essential reading. The first and
second chapters are particularly useful for grasping the basic tenets of the theory.
Pienemann, M. (Ed.). (2005). Cross-linguistic aspects of processability theory. Amsterdam,
Netherlands: John Benjamins.
The first chapter in this volume contains a good overview of PT. Other chap-
ters include empirical research that tests the predictions made by PT on cross-
linguistically diverse languages such as Arabic, Chinese, and Japanese.
Pienemann, M., & Keßler, J.-U. (Eds.). (2011). Studying processability theory: An introduc-
tory textbook. Amsterdam, Netherlands: John Benjamins.
This textbook provides a reader-friendly introduction to PT. Designed for stu-
dents with basic knowledge of (applied) linguistics, it offers a comprehensive over-
view of the key claims of the theory.
References
Bresnan, J. (2001). Lexical-functional syntax. Malden, MA: Blackwell.
Buyl, A. (2015). Studying receptive grammar acquisition within a PT framework:
A methodological exploration. In K. Baten, A. Buyl, K. Lochtman, & M. Van
Herreweghe (Eds.), Theoretical and methodological developments in processability theory
(pp. 139–168). Amsterdam, Netherlands: John Benjamins.
Buyl, A., & Housen, A. (2013). Testing the applicability of PT to receptive grammar
knowledge in early immersion education: Theoretical considerations, methodological
challenges and some empirical results. In A. Flyman Mattsson & C. Norrby (Eds.),
Language acquisition and use in multilingual contexts: Theory and practice (pp. 13–27). Lund,
Sweden: Lund University Press.
Buyl, A., & Housen, A. (2015). Developmental stages in receptive grammar acquisition:
A processability theory account. Second Language Research, 31, 523–550.
Crystal, D., Fletcher, P., & Garman, M. (1984). The grammatical analysis of language dis-
ability. London, England: Arnold.
Di Biase, B., & Kawaguchi, S. (2002). Exploring the typological plausibility of pro-
cessability theory: Language development in Italian second language and Japanese
second language. Second Language Research, 18, 274–302.
Garrett, M. F. (1976). Syntactic process in sentence production. In R. Wales &
E. Walker (Eds.), New approaches to language mechanisms (pp. 231–256). Amsterdam,
Netherlands: North-Holland.
Garrett, M. F. (1980). Levels of processing in sentence production. In B. Butterworth
(Ed.), Language production: Vol. 1. Speech and talk (pp. 177–220). London, England:
Academic Press.
Garrett, M. F. (1982). Production of speech: Observations from normal and patholog-
ical language use. In A. W. Ellis (Ed.), Normality and pathology in cognitive functions
(pp. 19–76). London, England: Academic Press.
Gregg, K. R. (1996). The logical and developmental problems of second language
acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language
acquisition (pp. 49–81). San Diego, CA: Academic Press.
Kawaguchi, S. (2005). Argument structure and syntactic development in Japanese as a
second language. In M. Pienemann (Ed.), Cross-linguistic aspects of processability theory
(pp. 253–298). Amsterdam, Netherlands: John Benjamins.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sen-
tence formulation. Cognitive Science, 11, 201–258.
Lenzing, A. (2013). The development of the grammatical system in early second language
acquisition: The multiple constraints hypothesis. Amsterdam, Netherlands: John
Benjamins.
Lenzing, A. (2019). Towards an integrated model of grammatical encoding and
decoding in SLA. In A. Lenzing, H. Nicholas, & J. Roos (Eds.), Working with pro-
cessability approaches: Theories and issues (pp. 13–48). Amsterdam, Netherlands: John
Benjamins.
Lenzing, A. (in press). The production-comprehension interface in second language acquisition:
An integrated encoding-decoding model. London, England: Bloomsbury.
Levelt, W. J. M. (1981). The speaker’s linearisation problem. Philosophical Transactions of
the Royal Society of London, Series B, 295, 305–315.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT
Press.
Mackey, A., Pienemann, M., & Thornton, I. (1991). Rapid profile: A second language
screening procedure. Language and Language Education, 1, 61–82.
Pienemann, M. (1984). Psychological constraints on the teachability of languages.
Studies in Second Language Acquisition, 6, 186–214.
Pienemann, M. (1990). LARC research projects. NLIA/LARC publications. Sydney,
Australia: Sydney University.
Pienemann, M. (1992). Assessing second language acquisition through rapid profile.
LARC Occasional Papers, 3, 1–24. Sydney, Australia: Sydney University.
Pienemann, M. (1998). Language processing and second language development: Processability
theory. Amsterdam, Netherlands: John Benjamins.
Pienemann, M., & Mackey, A. (1993). An empirical study of children’s ESL devel-
opment and rapid profile. In P. McKay (Ed.), ESL development: Language and liter-
acy in schools (pp. 115–259). Canberra: Commonwealth of Australia and National
Languages and Literacy Institute of Australia.
Pienemann, M., & Lanze, F. (2017, September). Constructing an automatic procedure
for ESL profile analysis. Plenary presented at the 16th Symposium on Processability
Approaches to Language Acquisition (PALA), Ludwigsburg University of Education,
Germany.
Pienemann, M., Di Biase, B., & Kawaguchi, S. (2005). Extending processability theory.
In M. Pienemann (Ed.), Cross-linguistic aspects of processability theory (pp. 199–252).
Amsterdam, Netherlands: John Benjamins.
Pienemann, M., Di Biase, B., Kawaguchi, S., & Håkansson, G. (2005). Processing
constraints on L1 transfer. In J. F. Kroll & A. M. B. de Groot (Eds.), Handbook of
bilingualism: Psycholinguistic approaches (pp. 128–153). New York, NY: Oxford
University Press.
Pienemann, M., Johnston, M., & Brindley, G. (1988). Constructing an acquisition-based
procedure for second language assessment. Studies in Second Language Acquisition, 10,
217–243.
Spinner, P. (2013). Language production and reception: A processability theory study.
Language Learning, 63, 704–739.
Spinner, P., & Jung, S. (2018). Production and comprehension in processability theory:
A self-paced reading study. Studies in Second Language Acquisition, 40, 295–318.
9
INPUT, INTERACTION, AND
OUTPUT IN L2 ACQUISITION
Susan M. Gass and Alison Mackey
The Theory and Its Constructs
As VanPatten, Williams, Keating, and Wulff note in Chapter 1, a distinction
needs to be made between models and theories. Notably, they distinguish
between the how and the why. They also describe hypotheses, which differ
from theories in that a hypothesis “does not unify various phenomena; it is
usually an idea about a single phenomenon.” This chapter deals with input,
interaction, feedback, and output in second language (L2) acquisition. These
constructs have been integrated and were originally referred to as the Interac-
tion Hypothesis. However, following a significant amount of empirical work
leading to greater specificity and theoretical advancement, it is now generally
referred to in the literature as the interaction approach.
In its current form, the interaction approach subsumes some aspects of the
Input Hypothesis (e.g., Krashen, 1982, 1985) together with the Output
Hypothesis (Swain, 1985, 1995, 2005). It has also been referred to as the
input, interaction, output model (Block, 2003) and interaction theory (Carroll,
1999). As Mackey (2012) notes, “it is important to point out that the Interac-
tion Hypothesis was not intended or claimed to be a complete theory of SLA,”
despite the fact that it is occasionally characterized this way in the literature
(p. 4). As Pica points out, “as a perspective on language learning, [the Interac-
tion Hypothesis] holds none of the predictive weight of an individual theory.
Instead, it lends its weight to any number of theories” (1998, p. 10).
If we follow the distinction provided in Chapter 1, it becomes clear that
the Interaction Hypothesis includes elements of a hypothesis (an idea that
needs to be tested about a single phenomenon), elements of a model (a descrip-
tion of processes or a set of processes of a phenomenon), as well as elements
of a theory (a set of statements about natural phenomena that explains why
these phenomena occur the way they do). Recent work reflects the nature
and development of the interaction approach from its inception over the past
two and a half decades. In fact, Jordan (2005) suggested that the Interaction
Hypothesis shows signs of progression toward a theory, using it as an example of
how “an originally well-formulated hypothesis is upgraded in the light of crit-
icism and developments in the field” (p. 220). Likewise, Myles (2013) included
it in the “theoretical family” of interactional, sociolinguistic, and sociocultural
approaches to L2 acquisition. At the point when this book was published in
its first edition, various aspects of the Interaction Hypothesis had been tested
and links between interaction and learning clearly demonstrated, thereby sug-
gesting that it was time for a change in the term hypothesis. Its inclusion in a
volume on theories in second language acquisition (SLA), references to it as
the “model that dominates current SLA research” (Ramírez, 2005, p. 293) and
“the dominant interactionist paradigm” (Byrnes, 2005, p. 296) supported this
view, together with the appearance of book-length critiques of it (Block, 2003).
All of these collectively showed that researchers were moving toward think-
ing about the Interaction Hypothesis in terms of a model of L2 acquisition.
Using the framework of this book, for example, it is a model in the sense that
it describes the processes involved when learners encounter input, are involved
in interaction, receive feedback, and produce output. However, it is moving
toward the status of a theory in the sense that it also attempts to explain why in-
teraction and learning can be linked, using cognitive concepts derived from psy-
chology, such as noticing, working memory (WM), and attention. In this
chapter, then, as in much of the current literature, including recent handbook
and encyclopedia articles and reviews (García-Mayo & Alcón-Soler, 2013; Gor &
Long, 2009; Mackey, 2012; Mackey, Abbuhl, & Gass, 2012; Mackey & Goo,
2012; Myles, 2013; Robinson, 2013), we refer to it as the interaction approach.
Since the early 1980s and since Long’s update in 1996, the interaction
approach has witnessed a growth in empirical research and is now at a point where
meta-analyses and research syntheses can be carried out (Keck, Iberri-Shea,
Tracy-Ventura, & Wa-Mbaleka, 2006; Kim, 2017; Li, 2010; Lyster & Saito,
2010; Mackey & Goo, 2007; Norris & Ortega, 2000; Plonsky & Gass, 2011;
Russell & Spada, 2006), increasingly with a focus on specific aspects of interac-
tion such as corrective feedback (Brown, 2016) or the influence of advances in
technology on interaction (Ziegler, 2016), and book-length treatments of mate-
rials and methods associated with interaction are appearing (Mackey, 2020). It
is now commonly accepted within the SLA literature that there is a robust con-
nection between interaction and learning. In the current chapter, we provide
an update in which we present a description of the constructs of the interaction
approach as well as a discussion of the theoretical underpinnings that account
for the link between interaction and learning.
The interaction approach attempts to account for learning through the
learner’s exposure to language, production of language, and feedback on that
production. As Gass (2003) notes, interaction research “takes as its starting
point the assumption that language learning is stimulated by communicative
pressure and examines the relationship between communication and acquisi-
tion and the mechanisms (e.g., noticing, attention) that mediate between them”
(p. 224). In the following sections, we turn to an examination of the major
components of this approach.
Input
Input is the sine qua non of acquisition. Quite simply it refers to the language
that a learner is exposed to in a communicative context (i.e., from reading or lis-
tening, or, in the case of sign language, from visual language). In all approaches
to L2 acquisition, input is an essential component for learning in that it provides
the crucial evidence from which learners can form linguistic hypotheses.
Because input serves as the basis for hypotheses about the language being
learned, researchers within the interaction approach have sought over the years
to characterize the input that is addressed to learners, and like UG researchers
(see White, this volume), interaction researchers also see input as providing
positive evidence, that is, information about what is possible in a language.
Early interaction researchers have shown that the language addressed to learn-
ers differs in interesting ways from the language addressed to native speak-
ers and fluent L2 speakers (for overviews, see Gass, Behney, & Plonsky, 2020;
Hatch, 1983; Wagner-Gough & Hatch, 1975). This language that is addressed
to learners has been referred to as modified input.
One proposal concerning the function of modified input is that modifying
input makes the language more comprehensible. If learners cannot understand
the language that is being addressed to them, then that language is not useful
to them as they construct their L2 grammars. An example of how individuals
modify their speech is provided below (from Kleifgen, 1985). In this exam-
ple, a teacher of kindergarteners, including native speakers (NSs) and nonna-
tive speakers (NNSs) of English at varying levels of proficiency, is providing
instructions to the class and to individuals.
1. Instructions to a kindergarten class
a) Instructions to English NSs in a kindergarten class
These are babysitters taking care of babies. Draw a line from Q to q.
From S to s and then trace.
b) To a single NS of English
Now, Johnny, you have to make a great big pointed hat.
c) To an intermediate level NS of Urdu.
No her hat is big. Pointed.
d) To a low intermediate level NS of Arabic.
See hat? Hat is big. Big and tall.
e) To a beginning level NS of Japanese.
Big, big, big hat.
As shown in the example, when addressing a learner of a language, speakers
often make adjustments that are likely to render the language comprehensible,
which, in turn, ease the burden for the learner. It is important to note that
simplifications are not the only form of adjustments, which can also include
elaborations, thereby providing the learner with a greater amount of semantic
detail. An example of elaboration is seen in (2) (from Gass & Varonis, 1985). In
this example, when the NNS indicates a possible lack of understanding (Pardon
me?), the NS replies by elaborating on her original comment about nitrites,
adding an example and restating that she doesn’t eat them.
2. Elaboration
NNS: There has been a lot of talk lately about additives and preservatives
in food. In what ways has this changed your eating habits?
NS: I try to stay away from nitrites.
NNS: Pardon me?
NS: Uh, from nitrites in uh like lunch meats and that sort of thing.
I don’t eat those.
Input, along with negative evidence obtained through interaction (to which
we turn next), is believed to be crucial for acquisition to occur, not only in the
interaction approach but in other approaches as well (e.g., input processing) (see
VanPatten, this volume).
Interaction
Interaction, simply put, refers to the conversations that learners participate in.
Interactions are important because it is in this context that learners receive
information about the correctness and, more important, about the incorrect-
ness of their utterances. Within the interaction approach, negative evidence,
as in the UG literature (see White, this volume), refers to the information
that learners receive concerning the incorrectness of their own utterances. For
our purposes, learners receive negative evidence through interactional feed-
back that occurs following problematic utterances and provides learners with
information about the linguistic and communicative success or failure of their
production. Gass (2018, p. 144) presents the model in Figure 9.1 to characterize
the role negative evidence plays in the interaction-learning process.
FIGURE 9.1 The function of negative evidence. [Flowchart nodes, top to bottom:
Negative Evidence; Negotiation / Other correction types; Notice Error; Search
Input; Input available (confirmatory/disconfirmatory) / Input not available.]
Negative evidence, which can come (among other ways) through overt correction
or negotiation, is one way of alerting a learner to the possibility of
an error in his or her speech. Assuming that the error is noticed, the learner
then has to determine what the problem was and how to modify existing lin-
guistic knowledge. The learner then comes up with a hypothesis as to what the
correct form should be (e.g., he wented home versus he went home). Obtaining
further input (e.g., listening, reading) is a way of determining that in English
one says he went home, but never says he wented home. Thus, listening for further
input is a way to confirm or disconfirm a hypothesis that he or she may have
come up with regarding the nature of the target language. The learner may also
use output to test these hypotheses, which we address next.
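The confirm/disconfirm step described above can be illustrated with a toy sketch. The Python code below is purely illustrative and is not a model proposed in the interaction literature: it assumes that the learner's candidate forms are simply checked against a small, invented sample of further input, and the function name and sample sentence are hypothetical.

```python
def check_against_input(candidates, further_input):
    """Report, for each candidate form, whether it is attested in the input.

    `candidates` is a list of word forms the learner is unsure about;
    `further_input` is a string standing in for language heard or read
    after the feedback.
    """
    words = set(further_input.lower().split())
    return {form: ("attested" if form in words else "not attested")
            for form in candidates}


# Invented sample of further input.
further_input = "He went home early and she went to work"

result = check_against_input(["went", "wented"], further_input)
print(result)
```

In this sample, went is attested and wented is not, which mirrors how further input can support one hypothesis about the target form over the other.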
Output
Known in the literature as the Output Hypothesis (Swain, 1985, 1993, 1995,
1998, 2005), Swain’s observations about the importance of output emerged from
her research that took place in the context of immersion programs in Canada.
Swain observed that children who had spent years in immersion programs still
had a level of competence in the L2 that fell significantly short of native-like
abilities. She hypothesized that what was lacking was sufficient opportunities
for language use. She claimed that language production forces learners to move
from comprehension (semantic use of language) to syntactic use of language.
As Swain (1995) states,
output may stimulate learners to move from the semantic, open-ended,
nondeterministic, strategic processing prevalent in comprehension to
the complete grammatical processing needed for accurate production.
Output, thus, would seem to have a potentially significant role in the
development of syntax and morphology.
(p. 128)
For example, after producing the initially problematic Spanish utterance Al lado
de la sofá? “on the side of the couch” and receiving feedback about its incorrect
form via a recast, the NNS in (3) appears to realize that he used the incorrect
form of the definite article (de la versus del). Pushed to reformulate his initial
utterance in order to incorporate the interlocutor’s recast, he modifies his lin-
guistic output by reformulating the utterance in a more target-like way.
3. Modified output (from Gurzynski-Weiss & Baralt, 2014)
LEARNER: Al lado de la sofá.
To the side of the [fem.] couch [masc.].
INTERLOCUTOR: Al lado del sofá.
To the side of the [masc.] couch [masc.].
LEARNER: Del sofá.
Of the [masc.] couch [masc.].
In addition to pushing learners to produce more target-like output, another func-
tion of production, as mentioned earlier, is that it can be used to test hypotheses
about the target language. An example of hypothesis testing is provided in (4).
This example comes from a study in which learners were involved in videotaped
interactions and then interviewed immediately afterward, using the video as a
prompt. The retrospective comments, given in the learners’ L1, which was English,
demonstrate that the learner was using the conversation as a forum through which
she could test the accuracy of her knowledge (in particular, “I’ll say it and see”).
4. From Mackey, Gass, and McDonough (2000) (INT = interviewer)
NNS: Poi un bicchiere.
then a glass
INT: Un che, come?
a what, what?
NNS: Bicchiere.
glass
NNS RECALL COMMENTS: “I was drawing a blank. Then I thought of a
vase but then I thought that since there was no flowers, maybe it was just
a big glass. So, then I thought I’ll say it and see.”
Another function of output is to promote automaticity, which refers to the
routinization of language use. Little effort is expended when dealing with
automatic processes (e.g., driving from home to work is automatic and does
not require much thought as to the route to take). Automatic processes come
about as a result of “consistent mapping of the same input to the same pattern
of activation over many trials” (McLaughlin, 1987, p. 134; cf. Chapter 6). We
can consider the role of production as playing an integral role in automaticity.
To return to the example of driving, the automaticity of the route from home
to work occurs following multiple trips along that route. The first time may
require more effort and more concentration. With regard to language learning,
continued use of language moves learners to more fluent, automatic production.
How Interaction Brings about Learning
The relationship among these three components can be summed up by Long’s
(1996) frequently cited explanation that
negotiation for meaning, and especially negotiation work that triggers inter-
actional adjustments by the NS or more competent interlocutor, facilitates
acquisition because it connects input, internal learner capacities, particu-
larly selective attention, and output in productive ways.
(pp. 451–452)
Furthermore,
it is proposed that environmental contributions to acquisition are
mediated by selective attention and the learner’s developing L2 process-
ing capacity, and that these resources are brought together most usefully,
although not exclusively, during negotiation for meaning. Negative feedback
obtained during negotiation work or elsewhere may be facilitative of L2
development, at least for vocabulary, morphology, and language-specific
syntax, and essential for learning certain specifiable L1-L2 contrasts.
(p. 414)
In this view, through interaction, a learner’s attentional resources (selective
attention) are directed to problematic aspects of knowledge or production. This
is evident in the following interaction between four learners of Spanish:
5. From Fernández Dobao (2016, p. 39)
BOB: ¿Co-cómo se dice seemed like?
h-how is it said seemed like
“How do you say seemed like?”
ANN: Parecer.
to seem like
“To seem like.”
JEAN: Parecer.
to seem like
“To seem like.”
TOM: Parece.
it seems like
“It seems like.”
BOB: ¡Parece! que...so...el once de septiembre parecía día normal
it seems like that so the eleven of September seemed like normal day
“it seems like! So September 11 seemed like a normal day.”
This interaction reflects a pattern typical of L2 learners. It’s often the case that
learners, in interaction, notice that what they say or understand differs from
what a native speaker (or another, sometimes more competent, L2 learner) says
or seems to understand. These sorts of processes are sometimes referred to as
learners having noticed a gap (Schmidt & Frota, 1986). As part of this, learners, as
is possibly the case with Bob in the example above, may notice that since they
can’t express what they want to express, they have a hole in their interlanguage
(Swain, 1998). Researchers often interpret interaction between learners and
their interlocutors as having been responsible for directing a learner’s attention
to something new, such as a new lexical item or grammatical construction, thus
promoting the acquisition of the L2.
Feedback
There are two broad types of feedback: explicit and implicit. Explicit feedback
includes corrections and metalinguistic explanations. Of concern to us here are
implicit forms of feedback, which include negotiation strategies such as
• confirmation checks (expressions one interlocutor uses to confirm that
s/he has correctly heard or understood another interlocutor, for example,
Is this what you mean?)
• clarification requests (expressions one interlocutor uses to elicit clarifica-
tion of another interlocutor’s preceding utterance(s), for example, What did
you say?)
• comprehension checks (expressions one interlocutor uses to verify that
another interlocutor has correctly heard or understood, for example, Did
you understand?)
• recasts (a rephrasing of a non-target-like utterance using a more target-like
form while maintaining the original meaning)
Feedback may help to make problematic aspects of learners’ interlanguage
salient and may give them additional opportunities to focus on their production
or comprehension, thus promoting L2 acquisition. For instance, in example
(6), the NS’s provision of implicit feedback in the form of confirmation checks
(lines 2 and 4) gives the learner the opportunity to infer (from her interlocutor’s
lack of comprehension) that there was a problem with her pronunciation.
6. Confirmation checks (from Mackey et al., 2000)
NNS: There’s a basen of flowers on the bookshelf
NS: a basin?
NNS: base
NS: a base?
NNS: a base
NS: oh, a vase
NNS: vase
Feedback occurs during negotiation for meaning. Long (1996, 2015) defines
negotiation as
the process in which, in an effort to communicate, learners and compe-
tent speakers provide and interpret signals of their own and their interloc-
utor’s perceived comprehension, thus provoking adjustments to linguistic
form, conversational structure, message content, or all three, until an
acceptable level of understanding is achieved.
(2015, p. 418)
Negotiation for meaning has traditionally been viewed and coded in terms of
the “three Cs”: confirmation checks, clarification requests, and comprehension
checks, each of which we defined earlier. A confirmation check was seen in
example (6). Examples (7) and (8) exemplify clarification requests. In example
(7), the NNS’s clarification request (line 2) and the NS’s rephrasings (lines 3
and 5) result in input that the learner finally seems to understand.
7. Clarification Request and Rephrasing (from Mackey, 2000)
NS: A curve slightly to the left here and then straight ahead the road goes
NNS: A Er er straight?
NS: No, it goes on a curve left first, then it goes straight ahead
NNS: No, because dry cleaner is the way is here? Curve? It means how?
NS: Exactly so go a little bit to the left, curve slightly left, then go straight
ahead with it
NNS: Oh a little bit left around then straight ahead goes first curve
NS: right, like that, exactly, right, curve, go straight ahead, no, no, no I
mean left right curve left [laughs]
NNS: [laughs] curve
Example (8) illustrates a clarification request, in which Learner 2 needs more
information to understand Learner 1’s question about what is important to the
character in the task they are completing.
8. Clarification Request (from Gass, Mackey, & Ross-Feldman, 2005, 2011)
LEARNER 1: ¿Qué es importante a ella?
what is important to her
“What is important to her?”
LEARNER 2: ¿Cómo?
how
“What?”
LEARNER 1: ¿Qué es importante a la amiga? ¿Es solamente el costo?
what is important to the friend is only the cost
“What is important to the friend? Is it just the cost?”
A comprehension check is an attempt to anticipate and prevent a breakdown
in communication. In example (9), Learner 1 asks if Learner 2 needs him
to repeat what he has just said (¿Quieres que repita? “Do you want me to
repeat?”), basically checking to see if Learner 2 has understood the previous
utterance.
9. Comprehension Check (from Gass et al., 2005, 2011)
LEARNER 1: La avenida siete va en una dirección hacia el norte desde la
calle siete hasta la calle ocho. ¿Quieres que repita?
Avenue Seven goes in one direction toward the north from
Street Seven to Street Eight. Do you want me to repeat?
LEARNER 2: Por favor.
Please.
LEARNER 1: La avenida seven, uh siete, va en una dirección hacia el norte
desde la calle siete hasta la calle ocho.
Avenue Seven, uh Seven, goes in one direction toward the
north from Street Seven to Street Eight.
Through negotiation, input can be uniquely tailored to individual learners’ par-
ticular strengths, weaknesses, and communicative needs, providing language
that is in line with learners’ developmental levels. Pica (1994, 1996) and Mackey
(2012) describe how negotiation contributes to the language learning process,
suggesting that negotiation facilitates comprehension of L2 input and serves
to draw learners’ attention to form–meaning relationships through processes
of repetition, segmentation, and rewording. Gass (2018) similarly claims that
negotiation can draw learners’ attention to linguistic problems and proposes
that initial steps in interlanguage development occur when learners notice mis-
matches between the input and their own organization of the target language.
Interaction research, with its focus on the cognitive processes that drive
learning, has augmented and in some cases replaced the three Cs with other
constructs, including recasts. Recasts are a form of implicit feedback and have
received a great deal of attention in recent research. Nicholas, Lightbown, and
Spada (2001) define recasts as “utterances that repeat a learner’s incorrect utter-
ance, making only the changes necessary to produce a correct utterance, with-
out changing the meaning” (p. 733). In other words, recasts are interactional
moves that give learners linguistically target-like reformulations of what they
have just said. A recast does not necessarily involve the repetition of every-
thing a learner said and may include additional elaborations not present in the
learner’s original utterance, but it is semantically contingent upon the learner’s
utterance and often comes directly after it. For instance, in example (10), an NS
recasts an NNS’s utterance.
10. Recast (from Fernández-García & Martínez-Arbelaiz, 2014)
NNS: sí primero tiempo
yes first [incorrect form of adjective] period [incorrect word choice]
“Yes, first period of time.”
NS: vez
time [corrected word choice]
“Time.”
NNS: primera vez en Europa
first time in Europe
“First time in Europe.”
Recasts have been associated with L2 learning in a number of primary
research studies (e.g., Ammar, 2008; Ammar & Spada, 2006; Ayoun, 2001;
Bigelow, Delmas, Hansen, & Tarone, 2006; Braidi, 2002; Carpenter,
Jeon, MacGregor, & Mackey, 2006; Egi, 2007; Ellis & Sheen, 2006;
Fernández-García & Martínez-Arbelaiz, 2014; Fujii, Ziegler, & Mackey,
2016; Goo, 2012; Han, 2002; Ishida, 2004; Iwashita, 2003; Kim & Han, 2007;
Leeman, 2003; Loewen & Philp, 2006; Lyster, 2004; Lyster & Izquierdo, 2009;
Mackey & Philp, 1998; McDonough & Mackey, 2006; Morris, 2002; Nassaji,
2009; Nicholas et al., 2001; Philp, 2003; Révész, 2012; Révész & Han, 2006;
Sachs & Suh, 2007; Sagarra, 2007; Sheen, 2008; Storch, 2002; Trofimovich,
Ammar, & Gatbonton, 2007; Ziegler et al., 2013) as well as meta-analyses
(e.g., Cleave, Becker, Curran, Van Horne, & Fey, 2015; Li, 2010; Mackey &
Goo, 2007; Miller & Pan, 2012). Current research has also indicated that recasts
and negotiation may work to impact learning in different ways. For example,
recasts are complex discourse structures that have been said to contain positive
evidence (a model of the correct form in the target language) and negative feed-
back (since the correct form is juxtaposed with the non-target-like form) in an
environment where the positive evidence is enhanced (because the corrected
form is given directly after the incorrect form is uttered). If learners do not
selectively attend to and recognize the negative feedback contained in recasts,
then the documented contribution of recasts to learning might be attributed to
the positive evidence they contain, or to the enhanced salience of the positive
evidence, which is one of Leeman’s (2003) suggestions.
While negotiation for meaning always requires learner involvement, as shown
in example (6), recasts do not consistently make such participatory demands.
There is a wide range of possibilities with this sort of discourse, as shown by the
learner’s response to the recast in example (10). As a number of researchers (e.g.,
Lyster, 1998a, 1998b) have pointed out, reformulations sometimes occur after
grammatical utterances as well, and a recast may be perceived as responding to
the content rather than the form of an utterance, or as an optional and alternative
way of saying the same thing. Thus, learners may not repeat or rephrase their
original utterances following recasts, and they may not even perceive recasts
as corrective at all (Mackey et al., 2000; McDonough & Mackey, 2006). It
also must be kept in mind that even when learners do understand the correc-
tive nature of recasts, they may have trouble understanding and addressing the
source of the problem (as discussed by several researchers, including Carroll,
2001). However, it is possible that neither a response nor a recognition of the
corrective intent of the recast is crucial for learning (Mackey & Philp, 1998)
and a substantial body of research, using increasingly innovative methods, has
linked recasts with L2 learning of different forms, in different languages, for a
range of learners in both classroom and laboratory contexts (for a review, see
Mackey & Gass, 2006).
Language-Related Episodes
Another construct, language-related episodes (LREs), is also studied within the
context of interaction. Briefly defined, LREs refer to instances where learners
consciously reflect on their own language use, or, more specifically,
instances in which learners may (a) question the meaning of a linguis-
tic item; (b) question the correctness of the spelling/pronunciation of a
word; (c) question the correctness of a grammatical form; or (d) implic-
itly or explicitly correct their own or another’s usage of a word, form or
structure.
(Leeser, 2004, p. 56; see also Swain & Lapkin, 1998; Williams, 1999)
LREs, as Williams (1999) notes, encompass a wide range of discourse moves,
such as requests for assistance, negotiation sequences, and explicit and implicit
feedback, and are generally taken as signs that learners have noticed a gap
between their interlanguages (or their partners’ interlanguages) and the system
of the target language. Example (11) illustrates an LRE where two interlocutors
in a conversation group in a study abroad context discuss the Spanish word for
translated:
11. Language-related episode (from Bryfonski & Sanz, 2018)
NNS: Nunca de mis bromas pueden ser traslados.
none of my jokes can be translated [incorrect word choice]
“None of my jokes can be translated.”
NS: ¿Pueden ser?
they can be
“They can be?”
NNS: ¿Traslados?
transported
“Translated?”
NS: Traducidas.
translated
“Translated.”
NNS: Traducidas.
translated
“Translated.”
Based on this example, it might be possible to conclude that the NNS recog-
nized a gap in her lexical knowledge of Spanish and thus produced an LRE (an
explicit request for assistance). A number of studies investigating L2 learners’
use of LREs have found that LREs not only represent language learning in
process (Donato, 1994; Swain & Lapkin, 1998) but are also positively correlated
with L2 acquisition (e.g., Basturkmen, Loewen, & Ellis, 2002; Leeser, 2004;
Swain, Brooks, & Tocalli-Beller, 2002; Williams, 2001).
Attention
While input such as that provided in recasts may be regarded as a catalyst for
learning, and LREs as evidence that learning processes are being engaged,
attention is believed to be one of the mechanisms that mediates between input
(or intake) and learning. It is widely agreed that L2 learners are exposed to
more input than they can process, and that some mechanism is needed to help
learners “sort through” the massive amounts of input they receive. As Gass,
Svetics, and Lemelin (2003) explain, “language processing is like other kinds
of processing: Humans are constantly exposed to and often overwhelmed by
various sorts of external stimuli and are able to, through attentional devices,
“tune in” some stimuli and “tune out” others” (p. 498). Attention, broadly
conceptualized, may be regarded as the mechanism that allows learners to
“tune in” to a portion of the input they receive.
Although generally held to be crucial for L2 acquisition, attention has never-
theless been the focus of much debate in the field. Schmidt (1990, 2001, 2012),
for example, argues that learning cannot take place without awareness because
the learner must be consciously aware of linguistic input in order for it to become
internalized; thus, awareness and learning cannot be dissociated. Similarly,
Robinson (1995, 2001, 2002) claims that attention to input is a consequence
of encoding in WM, and only input encoded in WM may be subsequently
transferred to long-term memory. Thus, in Robinson’s model, as in Schmidt’s,
attention is crucial for learning, and in both models, no learning can take place
without attention and some level of awareness. An alternative and distinct per-
spective, emerging from work in cognitive psychology (Posner, 1988, 1992;
Posner & Petersen, 1990), is presented by Tomlin and Villa (1994), who advocate
for a dissociation between learning and awareness. As can be seen from this
brief overview, not all researchers use the same terminology when discussing
attention, and in fact, there have been proposals that have divided attention into
different components. What is important for the current chapter is that interac-
tion researchers assume that the cognitive constructs of attention, awareness, and
the related construct of noticing are part of the interaction-L2 learning process.
WM has also been implicated as a potential explanation for how
interaction-driven L2 learning takes place, as well as language learning in
general (Mackey, Adams, Stafford, & Winke, 2010; Mackey & Sachs, 2012).
For example, in a study of Korean L1 English language learners, Goo’s (2012)
research showed that WM was associated with the noticing of recasts, while
Trofimovich et al. (2007) suggested that WM (along with attention control and
analytical ability) was associated with their Francophone learners’ production of
English morphosyntax. Such research suggests that WM may play an important
role in the processing and use of recasts by L2 learners. Another factor that may
relate to a learner’s ability to benefit from interaction is their ability to sup-
press information, referred to as inhibitory control. Gass, Behney, and Uzum
(2013) found evidence that those individuals who were better able to suppress
interfering information were also better able to learn from interaction. Recent
work has also suggested that an individual’s creativity may affect their L2 inter-
actions. For example, McDonough, Crawford, and Mackey (2015) found that
individuals who exhibited higher levels of creativity as measured by a divergent
thinking test were more likely to use certain interactive features such as ques-
tions, which might enable them to make confirmation checks to support their
own learning. In addition, Pipes (2016) suggested links between creativity and
interactive communication strategies such as questions and exemplification.
There have been hundreds of empirical studies of various aspects
of interaction since the mid-1990s. As outlined in Mackey et al.
(2012), researchers have concentrated on interaction and its impact on specific
morphosyntactic features, finding benefits for a range of features, “including
articles (Muranoi, 2000; Sheen, 2007), questions (Mackey, 1999; Mackey &
Philp, 1998; Philp, 2003), past tense formation (Doughty & Varela, 1998; Ellis,
2007; Ellis, Loewen, & Erlam, 2006; McDonough, 2007), and plurals (Mackey,
2006)” (p. 10). As they also point out, these results have been found across sev-
eral areas, including:
• children as well as adults (Mackey & Oliver, 2002; Mackey & Silver, 2005;
Van den Branden, 1997) and older adults (Mackey & Sachs, 2012);
• different learning contexts, including classroom as well as laboratory set-
tings (Brown, 2016; Gass, Mackey, & Ross-Feldman, 2005; Russell &
Spada, 2006), naturalistic settings (McDonough & Hernández González,
2013; Ranta & Meckelborg, 2013), peer-to-peer interactions (Sato, 2017;
Ziegler et al., 2013), and in CALL contexts (Smith, 2012); and
• with several different languages, including French (Ayoun, 2001; Swain &
Lapkin, 1998, 2002), Japanese (Ishida, 2004; Iwashita, 2003), Korean (Jeon,
2007), and Spanish (de la Fuente, 2002; Fernández-García & Martínez-
Arbelaiz, 2014; Gass & Alvarez-Torres, 2005; Leeman, 2003).
The interaction research agenda now seems to focus on a range of different
topics, including (a) grammatical aspects of the L2 and their likelihood of being
impacted by interaction; (b) individual differences variables, such as WM,
inhibition, and cognitive creativity, and how these might be related to the link
between interaction and L2 acquisition; and (c) what forms of interaction (and
in particular, what types of feedback) are the most beneficial for L2 learners in
particular contexts and settings. There has also been a move to recognize the
influence of the social context in interaction, with factors such as the relation-
ship between the learners (whether they know each other already, for exam-
ple), affecting inter alia their willingness to communicate (Dörnyei, 2009) and
therefore their opportunities to learn through interaction (Philp & Mackey,
2010). The field has reached some level of maturity with the aforementioned
meta-analyses and analyses of study quality (Plonsky & Gass, 2011).
What Counts as Evidence?
As Mackey and Gass (2005) point out, the goal of much interaction-based
research involves manipulating the kinds of interactions that learners are
involved in, the kinds of feedback they receive during interaction, and the
kinds of output they produce, to determine the relationship between the
various components of interaction and L2 learning. Thus, longitudinal designs,
cross-sectional designs (sampling learners at different proficiency levels), and
case studies are all appropriate methods. Methods for collecting interaction data
are discussed in Mackey (2020). However, the most common way of gathering
data is to involve learners in a range of carefully planned tasks.
Tasks
Various ways of categorizing task types have been discussed (for discussions of
task categorization, modality, and complexity, see Bygate, 2016; Ellis, 2003;
Kim, 2012; Mackey & Gass, 2007; Payant & Kim, 2015; Pica, Kanagy, &
Falodun, 1993; and Zalbidea, 2017). For example, a common distinction is
to classify tasks as one-way and two-way. In a one-way task, the information
flows from one person to the other, as when a learner describes a picture to her
partner. In other words, the information that is being conveyed is held by one
person. In a two-way task, there is an information exchange whereby all parties
hold information that is vital to the resolution of the task. For example, in a
story completion task, each learner may hold a portion of the information and
must convey it to the other learner(s) to successfully complete the task. Each
type of task may produce different kinds of interaction, with different oppor-
tunities for feedback and output.
Interaction researchers are usually interested in eliciting specific gram-
matical structures to test whether particular kinds of interactive feedback on
non-target-like forms are associated with learning. Learning is sometimes
examined through immediate changes in the learners’ output on the particu-
lar structures about which they have received interactional feedback, although
short- and long-term change on posttests is generally considered to be the
gold standard.
Obviously, tasks need to be carefully pilot-tested to ensure they produce the
language intended. It is also possible, and becoming more common in interac-
tion research, to try to examine learners’ thought processes as they carry out a
task or to interview learners on previous thought processes. For example, if a
researcher employed a dictogloss task (a type of consensus task where learners
work together to reconstruct a text that has been read to them; Swain & Lapkin,
2002), that researcher could examine the text that learners produce (the out-
put). Or, instead of examining the output in isolation, the researcher could also
ask the learners to think aloud as they carry out the task (this is known as an
introspective protocol or “think aloud”). Alternatively, the researcher could ask
the learners to make retrospective comments as soon as they are finished with
a task. This is often done by providing the learners with a video replay to jog
their memories (a procedure known as stimulated recall) (Gass & Mackey,
2000, 2017).
Difficulties in Determining Learning
It is often difficult to determine if learning has actually taken place. One diffi-
culty, common in any approach to L2 acquisition, is in the operationalization of
learning. If a learner utters a new form once and then does not do so again for
two months, does that constitute knowledge? If a learner utters a new form two
times, does that constitute knowledge? Questions like these (and many more)
are ones that are often faced when conducting research on interaction and L2
learning more generally.
A second difficulty in determining learning occurs when considering actual
interactions in the absence of posttests or in the absence of some commentary,
as in a stimulated recall or an LRE. If we consider the example presented in
(6), for instance, it might appear on the surface that the NS and NNS have
negotiated the difficulty to the point where the NS did understand that the
NNS is referring to a vase rather than a basin. But when we focus on the NNS,
we need to ask what learning has occurred. Is she simply repeating what the
NS had said without true understanding, or did some type of learning take
place? Or was some process engaged that might eventually lead to, or facilitate,
later learning? Example (12), taken from Bryfonski and Sanz (2018), illustrates
a similar concern:
12. Bryfonski and Sanz (2018)
NNS: Pero yo perdí el um... ¿Cómo se dice “password” otra vez?
But I lost the um... How do you say “password” again?
NS: ¿Contraseña?
Password?
NNS: El col- la- colaseña [ill-formed]. Uh yo perdí la colaseña [ill-formed].
The col- the – password [ill-formed]. Uh I lost the password [ill-formed].
The question that must be addressed here is whether or not the learner did in
fact learn the Spanish word for password. In the above example, the learner
did not accurately modify her output. However, the results from a posttest
and interview indicate that she did remember the correct form. During a
stimulated-recall interview, the learner stated that she remembered the word
because she asked about it so many times throughout the program. She said,
“There were a few words that came up like contraseña that I definitely learned
throughout the program.” What can’t be determined from this example is
whether the learner learned the word because of the modified output she was
given in (12), or because she was exposed to it multiple times over her study
abroad experience.
These examples help foreground the concern that whatever the data source,
the important point is not to rely solely on the transcript of the interaction but
to investigate the link between interaction and learning by whatever means
possible. For this reason, research designs which employ pretests and posttests
(and ideally, delayed posttests and possibly tailor-made posttests as well) and/
or designs that include introspective or retrospective protocols are of value. As
research designs progress, clearer answers to the questions about interaction and
learning can be obtained.
Common Misunderstandings
Here we will consider two common areas of misunderstanding about input,
interaction, and L2 acquisition. These relate to the nature of the interac-
tion approach and the relationship of the interaction approach to teaching
methods.
The first misunderstanding concerns the scope of the interaction approach.
Although occasionally criticized for not addressing all aspects of the learning
process (such as how input is processed, or the sociocultural context of the
learning), the interaction approach, like all approaches and theories in SLA,
takes as its primary focus particular aspects of the L2 learning process. Some
theories focus on innateness, others on the sociolinguistic context, and still
others purely on the cognitive mechanisms involved in learning a language.
The interaction approach, for the time being, is focused primarily on the role
of input, interaction, and output in learning. Future research will undoubtedly
be enriched by exploring the connections between various approaches to the
study of L2 acquisition in greater depth, so as to arrive at a more comprehensive
explanation of the L2 acquisition process.
A second misunderstanding is that the interaction approach can be directly
applied to classroom methodology. For example, work on task-based language
teaching (see Bygate, 2016; Bygate, Skehan, & Swain, 2001; Ellis, 2003) and
focus on form (Long & Robinson, 1998) both draw heavily on the Interac-
tion Hypothesis as part of their theoretical basis. Task-based language teach-
ing and the research that supports its use, in the words of Ellis (2003), “has
been informed primarily by the interaction hypothesis” (p. 100). Like most
SLA researchers, however, Ellis is cautious about making direct connections
between theory, research, and teaching practice, saying both that “the case
for including an introduction to the principles and techniques of task-based
teaching in an initial teacher-training program is a strong one” and also that “if
task-based teaching is to make the shift from theory to practice it will be neces-
sary to go beyond the psycholinguistic rationale…to address the contextual fac-
tors that ultimately determine what materials and procedures teachers choose”
(p. 337). The interaction approach, like most other accounts of L2 acquisition,
is primarily focused on how languages are learned. Thus, direct application to
the classroom may be premature.
An Exemplary Study: Gurzynski-Weiss and Baralt (2014)
The study carried out by Gurzynski-Weiss and Baralt (2014) illustrates many of
the issues and constructs discussed in this chapter. Their research investigated
learners’ perception and use of feedback in both computer-mediated commu-
nication (CMC) and face-to-face (FTF) modes. The main research questions
were as follows: (a) Do learners perceive feedback provided during task-based
interaction? (b) Do learners recognize the target of the feedback provided during
task-based interaction? (c) Does learner perception of feedback differ according to
the mode in which it is provided (i.e., CMC or FTF)? (d) Does opportunity for
learner-modified output differ according to mode? (e) Does learner use of oppor-
tunity for modified output differ according to the mode in which it is provided?
The 24 participants were intermediate learners of Spanish studying at the
university level. All participants were native speakers of English and some had
studied a different foreign language prior to taking Spanish classes.
Each participant interacted with one of the researchers, a native English
speaker with near-native proficiency in Spanish, to complete an equivalent
version of an information-gap task in each of the two previously mentioned
modes (CMC and FTF), with task and mode counterbalanced. The tasks were
completed in Spanish and the interlocutor gave the participants feedback on
errors of lexis, morphosyntax, semantics, phonology (in the FTF mode), and
spelling (in the CMC mode) wherever it seemed appropriate and in whatever
form seemed appropriate during the interaction.
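To make the idea of counterbalancing task and mode concrete, the following is a minimal sketch in Python of one way such a schedule could be built. It is an illustration only: the study does not report its exact assignment scheme, and the task labels "A" and "B", the function name, and the round-robin assignment are all assumptions introduced here.

```python
from itertools import product

MODES = ("FTF", "CMC")
TASKS = ("A", "B")  # hypothetical labels for the two equivalent task versions


def counterbalanced_schedule(n_participants):
    """Cycle participants through every mode-order x task-order combination.

    Each schedule entry is a list of two (mode, task) sessions in the
    order the participant would complete them.
    """
    orders = [
        list(zip(mode_order, task_order))
        for mode_order, task_order in product(
            (MODES, MODES[::-1]), (TASKS, TASKS[::-1])
        )
    ]
    # Round-robin assignment (an assumption, not the study's procedure).
    return {
        p: orders[(p - 1) % len(orders)]
        for p in range(1, n_participants + 1)
    }


schedule = counterbalanced_schedule(24)
for participant in (1, 2, 3, 4):
    print(participant, schedule[participant])
```

With 24 participants and four possible orders, each order is used six times, so neither mode nor task version is systematically tied to first or second position.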
Introspective data were collected from the learners using stimulated recall
methodology (Gass & Mackey, 2000, 2017) immediately following completion
of the task-based activities. The participants in the FTF mode watched clips of
the interaction with a second researcher who also gave the directions for this
part of the research to the learner. While watching the videotape, the learners
could pause the tape if they wished to describe their thoughts at any particular
point in the interaction. The researcher paused the tape after each interaction
episode and asked the participant what they remembered thinking at the time
and what they believed their interlocutor was trying to communicate in that
moment. For the participants in the CMC mode, the same procedure as the
FTF mode was repeated, except that the clips were viewed on a computer
instead of a TV. This recall procedure was aimed at eliciting learners’ original
perceptions about the feedback episodes—that is, their perceptions at the time
they were taking part in the interaction. After the first task and mode-specific
stimulated recall were completed, the learner began the second task, followed
by the second stimulated recall exercise.
The interactional feedback episodes and the stimulated recall comments that
were provided about the episodes were coded and analyzed. From the inter-
action episodes, the nature and amount of student errors, interlocutor feed-
back, and opportunities for and learner production of modified output were
coded while the stimulated recall comments were coded for reported learner
perception. Participant errors coded in both modes included lexis, semantics,
morphosyntax, and other, plus phonological errors coded for the FTF mode
and spelling errors for the CMC mode. Thus, for each mode, there were five
possible error types.
In answer to the first research question, learners did accurately perceive
feedback the majority of the time: overall, 68.3% of the time in the FTF mode
and 71.1% in CMC. In terms of correct recognition of the target of feedback
(research question 2), learners were most accurate in their perception of feed-
back targeting lexis (80.2% in the FTF mode and 78.5% in the CMC). Learners
were the second most accurate in their perception of feedback targeting seman-
tics (66.7% in the FTF mode and 69.2% in CMC). Next were learners’ inter-
pretations of feedback targeting morphosyntax (41.5% in FTF mode and 48.4%
in CMC). Finally, learners accurately perceived feedback targeting phonology
40% of the time in the FTF mode. The one instance of spelling feedback in the
CMC mode was not correctly perceived.
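Percentages of this kind are typically obtained by dividing the number of accurately perceived feedback episodes by the total number of episodes in each mode and target category. The sketch below shows one way such a tally could be computed. The record layout, the function name, and the episode data are invented for illustration and do not reproduce the study's coding scheme or figures.

```python
from collections import defaultdict

# Hypothetical coded episodes: (mode, feedback target, accurately perceived?)
episodes = [
    ("FTF", "lexis", True),
    ("FTF", "lexis", True),
    ("FTF", "morphosyntax", False),
    ("FTF", "morphosyntax", True),
    ("CMC", "lexis", True),
    ("CMC", "semantics", False),
]


def accuracy_by_cell(coded):
    """Return percent accurately perceived for each (mode, target) pair."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for mode, target, accurate in coded:
        totals[(mode, target)] += 1
        hits[(mode, target)] += int(accurate)
    return {cell: 100.0 * hits[cell] / totals[cell] for cell in totals}


rates = accuracy_by_cell(episodes)
```

Cells with very few episodes, such as the single spelling episode in the CMC mode, yield percentages that should be read with caution.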
Regarding feedback mode (research question 3), learners did not perceive
feedback more accurately in one mode compared to the other. Modality, how-
ever, did have an effect on learners’ opportunities to provide modified output
(research question 4). Interaction in the FTF mode resulted in significantly more
opportunities to modify output than in CMC, and the effect size for this differ-
ence was more than one standard deviation. Finally, there was also a significant
difference in learner-modified output after receiving feedback in FTF com-
pared to CMC mode (research question 5); again, the effect size for this finding
was more than one standard deviation.
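An effect size of "more than one standard deviation" means that the standardized difference between the two mode means exceeds 1. The summary above does not state which effect size statistic was used. The sketch below assumes Cohen's d with a pooled standard deviation, which is one common choice, and the counts are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference using the pooled sample variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


# Invented counts of modified-output opportunities per learner (not study data).
ftf = [6, 7, 8, 9, 10]
cmc = [3, 4, 5, 6, 7]

d = cohens_d(ftf, cmc)  # a value above 1 means the means differ by more than one pooled SD
```

Because the same learners took part in both modes, a repeated-measures variant of the statistic could also be appropriate; the pooled version is used here only because it is the simplest to show.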
In summary, this study of L2 Spanish learners’ perceptions about feedback
in task-based interaction showed that learners were most accurate in their
perceptions of lexical and semantic feedback, moderately accurate in
perceiving morphosyntactic feedback, and least accurate in perceiving
phonological feedback. Proponents of the interaction approach have
suggested that interaction can result in feedback that focuses learners’ attention
on aspects of their language that deviate from the target language. If learners’
reports about their perceptions can be equated with attention, then the findings
in this study are consistent with the claims of the interaction approach, at least
with regard to the lexicon and semantics.
Explanation of Observed Findings in SLA
As we noted in the first section of this chapter, the interactionist approach does not
address all aspects of L2 acquisition and therefore does not account for all of the
observable phenomena outlined in Chapter 1. In this section, therefore, we discuss
the observable phenomena that are most relevant to the interactionist approach.
Observation 1: Exposure to input is necessary for L2 acquisition. The interactionist
approach relies heavily on input to account for acquisition and so is in agreement
with Observation 1. However, there is no assumption in the interactionist
approach that input alone is sufficient. In fact, it is the way that a learner interacts
with the input (through interaction) that is at the heart of this approach. If input
were sufficient, we would not have so many learners who, despite years in an
L2 environment, are not highly proficient. For example, if exposure alone were
enough, the French immersion students Swain refers to in her studies should
have been able to acquire native-like proficiency in the L2, as they were
consistently exposed to it.
Observation 2: A good deal of L2 acquisition happens incidentally. The interac-
tionist approach does not deal specifically with incidental learning, but insofar
as attention is seen as a driving, explanatory force behind the interactionist
approach, incidental learning is not seen as a major part of L2 learning. Within
the interactionist approach, learning takes place through an interactive context.
For example, negotiation for meaning involves the learner in directing specific
attention toward a linguistic problem.
Observation 5: Second language learning is variable in its outcome. To the extent
that this observation is compatible with the idea that individuals vary in
whether and how they negotiate meaning as well as the extent to which they
focus attention on specific parts of language, it is consistent with interaction-
ist proposals. Keeping in mind the importance to interaction proposals of the
individual learner and the cognitive differences that exist between learners such
as differences in WM capacity (as opposed to innate dispositions), individuals
will likely have different results in terms of their outcomes.
Observation 7: There are limits on the effects of frequency on L2 acquisition. A
frequency-based explanation of L2 acquisition is compatible with some of the
interactionist claims in that one way in which interactional modifications are
claimed to impact learning is through facilitating pattern identification and
recognition of matches and mismatches. However, input frequency is not suf-
ficient to account for learning in the absence of some other considerations. For
example, in an interactionist approach, the native language might play some
role when trying to understand which forms a learner might attend to follow-
ing feedback, particularly implicit feedback. The impact of frequency is depen-
dent on a learner noticing the input. Other factors such as the native language
may play a role in determining what is noticed and what is not.
Observation 10: There are limits on the effects of output (learner production) on
language acquisition. At this point in SLA research, no approach or theory can
account for all learning. The interactionist approach is no exception. The
interactionist approach takes a particular perspective on output and highly
values pushed or modified output, or that output which involves a learner
attempting to go beyond his/her current level of knowledge. In other words,
the most important output is that output which stretches the limited linguistic
resources of a learner. Thus, while output may be important for automatiza-
tion, it is less valuable for language learning.
The Explicit/Implicit Debate
Regarding the role of explicit and implicit learning in relation to the inter-
action approach, the approach does not make claims about learning processes
or knowledge types, but it does make claims about feedback types, in partic-
ular, the roles of implicit and explicit feedback. As has been noted in earlier
discussions, one of the central components of the interaction approach is the
role of attention. If attention is central, one must then consider how attention
is drawn to language forms and/or functions. For example, is it explicit (e.g.,
through metalinguistic correction) or is it implicit (e.g., through recasts)? Both
are beneficial for language learning (see Goo & Mackey, 2013, for a review of
the literature on recasts), but the interaction approach, with few exceptions,
does not go further to investigate the type of knowledge that results.
One notable exception comes from Ellis, Loewen, and Erlam (2006), who
measured the effects of two types of feedback on the acquisition of English
past tense and their relationship to implicit and explicit knowledge (learning
processes are not dealt with in their study). Their learning data came from
three tests: an untimed grammaticality judgment test, a metalinguistic
knowledge test, and an oral imitation test. The first two of these were
intended to provide information
about explicit knowledge and the third about implicit knowledge. What they
claim is that both implicit and explicit knowledge benefit from feedback (more
so from metalinguistic feedback than implicit feedback). Lyster and Ranta
(2013) have continued to hold that more explicit feedback is more conducive to
noticing, while Long (2015) emphasizes the importance of recasts.
Thus, even though interaction-based research is centrally concerned with
learning that emanates from an interactive event that includes both implicit
and explicit information, it has been silent on the predicted result of that
information.
Conclusion
In this chapter, the perspective offered by input and interaction has been
presented. The central tenet of the approach is that interaction facilitates the
process of acquiring a second language, as it provides learners with
opportunities to receive modified input and to receive feedback, both explicit
and implicit, which in turn may draw learners’ attention to problematic aspects
of their interlanguage and push them to produce modified output.
Discussion Questions
1. The authors describe this approach as a model but not a theory. Do you
agree? Why? Do you think this will change? What would it take to change
how we refer to this approach?
2. Is the Interaction Approach compatible with, for example, the UG
approach (White, this volume) and usage-based approaches (Ellis & Wulff,
this volume)?
3. Describe the role of negative evidence within the interaction approach.
Does this differ from other approaches you have read about in this volume?
4. One possible critique of the interaction approach is that it ignores import-
ant social factors that may affect people’s interactions, for example, power
relationships, social status, or gender. Do you think this is a valid criticism?
To what extent would a theory of L2 acquisition need to consider such
social factors?
5. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Suggested Further Reading
Gass, S. M. (2018). Input, interaction, and the second language learner. New York, NY:
Routledge.
This book provides a thorough and accessible introduction to the main compo-
nents of the interaction approach, including classroom applications and implications.
Gass, S. M. (2003). Input and interaction. In C. Doughty & M. H. Long (Eds.), Hand-
book of second language acquisition (pp. 224–255). Oxford, England: Blackwell.
In this article, Gass provides an overview of the interaction approach from a
cognitive perspective. The article considers the role of input and output from the
perspective of the sine qua non of learning. She considers both input and interaction
in early and more recent studies of L2 acquisition and discusses the research that
links interaction and learning. Gass additionally focuses on the role of attention and
relates it to the theory of contrast proposed by Saxton (1997).
Long, M. H. (1996). The role of the linguistic environment in second language acqui-
sition. In W. Ritchie & T. K. Bhatia (Eds.), Handbook of language acquisition: Vol. 2.
Second language acquisition (pp. 413–468). San Diego, CA: Academic Press.
One of the most often cited articles in the field, Long’s article discusses the
theoretical underpinnings of the interaction approach, including positive evidence,
comprehensible input, input and cognitive processing, and negotiating for meaning.
Mackey, A. (2007). Interaction and second language development: Perspectives from
SLA research. In R. DeKeyser (Ed.), Practice in second language learning: Perspec-
tives from linguistics and psychology (pp. 85–110). Cambridge, England: Cambridge
University Press.
This chapter discusses research on interaction in SLA that points to the
importance of a range of interactional processes in the L2 learning process. These
processes include negotiation for meaning, the provision of feedback, and the pro-
duction of modified output. Highlighted in this chapter is the importance of cogni-
tive (learner-internal) factors such as attention, noticing, and memory for language.
Mackey, A. (Ed.). (2007). Conversational interaction in second language acquisition. Oxford,
England: Oxford University Press.
This book provides an edited collection of empirical studies on a variety of
issues concerning the relationship between conversational interaction and L2 learn-
ing. In particular, it highlights the benefits of interactional feedback, explores the
relationship between learners’ perceptions and learning, and investigates individual
differences and social and cognitive factors.
Mackey, A. (2012). Input, interaction, and corrective feedback in L2 learning. Oxford,
England: Oxford University Press.
This book provides a comprehensive and up-to-date survey of 20 years of
research on interaction-driven L2 learning, with a particular interest in the recent
growth in research into the role of cognitive and social factors in evaluating how
interaction works.
Mackey, A., & Abbuhl, R. (2005). Input and interaction. In C. Sanz (Ed.), Internal and
external factors in adult second language acquisition (pp. 207–233). Washington, DC:
Georgetown University Press.
This chapter provides a detailed overview of the interaction approach, discussing
both empirical work that has investigated the relationship between interaction and
L2 learning and implications for L2 pedagogy.
Mackey, A., Abbuhl, R., & Gass, S. M. (2012). Interactionist approach. In S. M. Gass &
A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 7–23).
New York, NY: Routledge.
This chapter provides an overview of the historical background of the interac-
tionist approach and discusses core issues surrounding it. It examines some ways of
collecting data, explores practical applications of the approach, and gives directions
for future research.
References
Ammar, A. (2008). Prompts and recasts: Differential effects on second language
morpho-syntax. Language Teaching Research, 12, 183–210.
Ammar, A., & Spada, N. (2006). One size fits all? Recasts, prompts, and L2 learning.
Studies in Second Language Acquisition, 28, 543–574.
Ayoun, D. (2001). The role of negative and positive feedback in the second language
acquisition of the passé composé and imparfait. The Modern Language Journal, 85,
226–243.
Basturkmen, H., Loewen, S., & Ellis, R. (2002). Metalanguage in focus on form in the
communicative classroom. Language Awareness, 11, 1–13.
Bigelow, M., delMas, R., Hansen, K., & Tarone, E. (2006). Literacy and processing of
oral recasts in SLA. TESOL Quarterly, 40, 665–689.
Block, D. (2003). The social turn in second language acquisition. Edinburgh, Scotland:
Edinburgh University Press.
Braidi, S. (2002). Reexamining the role of recasts in native-speaker/nonnative-speaker
interactions. Language Learning, 52, 1–42.
Brown, D. (2016). The type and linguistic foci of oral corrective feedback in the L2
classroom: A meta-analysis. Language Teaching Research, 20, 436–458.
Bryfonski, L., & Sanz, C. (2018). Opportunities for corrective feedback during study
abroad: A mixed methods approach. Annual Review of Applied Linguistics, 38, 1–32.
Bygate, M. (2016). Sources, developments and directions of task-based language teach-
ing. The Language Learning Journal, 44, 381–400.
Bygate, M., Skehan, P., & Swain, M. (Eds.). (2001). Researching pedagogic tasks: Second
language learning, teaching, and testing. Harlow, England: Pearson.
Byrnes, H. (2005). Review of task-based language learning and teaching. The Modern
Language Journal, 89, 297–298.
Carpenter, H., Jeon, S., MacGregor, D., & Mackey, A. (2006). Learners’ interpretations
of recasts. Studies in Second Language Acquisition, 28, 209–236.
Carroll, S. E. (1999). Putting ‘input’ in its proper place. Second Language Research, 15,
337–388.
Carroll, S. E. (2001). Input and evidence: The raw material of second language acquisition.
Amsterdam, Netherlands: John Benjamins.
Cleave, P., Becker, S., Curran, M., Van Horne, A., & Fey, M. (2015). The efficacy of
recasts in language intervention: A systematic review and meta-analysis. American
Journal of Speech-Language Pathology, 24, 237–255.
de la Fuente, M. J. (2002). Negotiation and oral acquisition of L2 vocabulary: The roles
of input and output in the receptive and productive acquisition of words. Studies in
Second Language Acquisition, 24, 81–112.
Donato, R. (1994). Collective scaffolding in second language learning. In J. Lantolf &
G. Appel (Eds.), Vygotskian approaches to second language research (pp. 33–56). Norwood,
NJ: Ablex.
Dörnyei, Z. (2009). Individual differences: Interplay of learner characteristics and
learning environment. Language Learning, 59 (Suppl. 1), 230–248.
Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 114–138).
Cambridge, England: Cambridge University Press.
Egi, T. (2007). Interpreting recasts as linguistic evidence: The roles of linguistic target,
length, and degree of change. Studies in Second Language Acquisition, 29, 511–537.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, England: Oxford
University Press.
Ellis, R. (2007). The differential effects of corrective feedback on two grammatical
structures. In A. Mackey (Ed.), Conversational interaction in second language acquisition
(pp. 339–360). Oxford, England: Oxford University Press.
Ellis, R., Loewen, S., & Erlam, R. (2006). Implicit and explicit corrective feedback and
the acquisition of L2 grammar. Studies in Second Language Acquisition, 28, 339–368.
Ellis, R., & Sheen, Y. (2006). Reexamining the role of recasts in second language
acquisition. Studies in Second Language Acquisition, 28, 575–600.
Fernández Dobao, A. (2016). Peer interaction and learning. In M. Sato & S. Ballinger
(Eds.), Peer interaction and second language learning: Pedagogical potential and research
agenda (pp. 33–62). Amsterdam, Netherlands: John Benjamins.
Fernández-García, M., & Martínez-Arbelaiz, A. (2014). Native speaker–non-native
speaker study abroad conversations: Do they provide feedback and opportunities for
pushed output? System, 42, 93–104.
Fujii, A., Ziegler, N., & Mackey, A. (2016). Learner-learner interaction and
metacognitive instruction in the EFL classroom. In M. Sato & S. Ballinger (Eds.),
Peer interaction and second language learning: Pedagogical potential and research
agenda (pp. 63–89). Amsterdam, Netherlands: John Benjamins.
García-Mayo, M., & Alcón-Soler, E. (2013). Negotiated input and output interaction.
In J. Herschensohn & M. Young-Scholten (Eds.), The Cambridge handbook of second
language acquisition (pp. 209–229). Cambridge, England: Cambridge University Press.
Gass, S. M. (2018). Input, interaction, and the second language learner. New York, NY: Routledge.
Gass, S. M. (2003). Input and interaction. In C. Doughty & M. H. Long (Eds.), Hand-
book of second language acquisition (pp. 224–255). Oxford, England: Blackwell.
Gass, S. M., & Alvarez-Torres, M. (2005). Attention when? An investigation of the
ordering effect of input and interaction. Studies in Second Language Acquisition, 27, 1–31.
Gass, S. M., Behney, J., & Plonsky, L. (2020). Second language acquisition: An introductory
course (5th ed.). New York, NY: Routledge.
Gass, S. M., Behney, J. N., & Uzum, B. (2013). Inhibitory control, working memory
and L2 interaction gains. In K. Drozdzial-Szelest & M. Pawlak (Eds.), Psycholinguis-
tic and sociolinguistic perspectives on second language learning and teaching (pp. 91–114).
New York, NY: Springer.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research.
Mahwah, NJ: Lawrence Erlbaum.
Gass, S. M., & Mackey, A. (2017). Stimulated recall methodology in applied linguistics and L2
research (2nd ed.). New York, NY: Routledge.
Gass, S. M., Mackey, A., & Ross-Feldman, L. (2005). Task-based interactions in class-
room and laboratory settings. Language Learning, 55, 575–611.
Gass, S. M., Mackey, A., & Ross-Feldman, L. (2011). Task-based interactions in
classroom and laboratory settings. Language Learning, 61, 189–220.
Gass, S. M., Svetics, I., & Lemelin, S. (2003). Differential effects of attention. Language
Learning, 53, 497–545.
Gass, S. M., & Varonis, E. (1985). Variation in native speaker speech modification to
non-native speakers. Studies in Second Language Acquisition, 7, 37–57.
Goo, J. (2012). Corrective feedback and working memory capacity in interaction-driven
L2 learning. Studies in Second Language Acquisition, 34, 445–474.
Goo, J., & Mackey, A. (2013). The case against the case against recasts. Studies in Second
Language Acquisition, 35, 127–165.
Gor, K., & Long, M. (2009). Input and second language processing. In W. Ritchie &
T. Bhatia (Eds.), The new handbook of second language acquisition (pp. 445–472). New
York, NY: Academic Press.
Gurzynski-Weiss, L., & Baralt, M. (2014). Does type of modified output correspond to
learner noticing of feedback? A closer look in face-to-face and computer-mediated
task-based interaction. Applied Psycholinguistics, 36, 1393–1420.
Han, Z. (2002). A study of the impact of recasts on tense consistency in L2 output.
TESOL Quarterly, 36, 543–572.
Hatch, E. (1983). Psycholinguistics: A second language perspective. Rowley, MA: Newbury.
Ishida, M. (2004). Effects of recasts on the acquisition of the aspectual form -te i- (ru) by
learners of Japanese as a foreign language. Language Learning, 54, 311–394.
Iwashita, N. (2003). Negative feedback and positive evidence in task-based interaction.
Studies in Second Language Acquisition, 25, 1–36.
Jeon, K. S. (2007). Interaction-driven L2 learning: Characterizing linguistic develop-
ment. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A col-
lection of empirical studies (pp. 379–403). Oxford, England: Oxford University Press.
Jordan, G. (2005). Theory construction in second language acquisition. Amsterdam,
Netherlands: John Benjamins.
Keck, C., Iberri-Shea, G., Tracy-Ventura, N., & Wa-Mbaleka, S. (2006). Investigating
the empirical link between task-based interaction and acquisition: A meta-analysis.
In J. M. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teach-
ing (pp. 91–131). Philadelphia, PA: John Benjamins.
Kim, Y. (2012). Task complexity, learning opportunities, and Korean EFL learners’
question development. Studies in Second Language Acquisition, 34, 627–658.
Kim, Y. (2017). Cognitive-interactionist approaches to L2 instruction. In S. Loewen &
M. Sato (Eds.), The Routledge handbook of instructed second language acquisition
(pp. 126–135). New York, NY: Routledge.
Kim, J., & Han, Z. (2007). Recasts in communicative EFL classes: Do teacher intent
and learner interpretation overlap? In A. Mackey (Ed.), Conversational interaction in
second language acquisition (pp. 269–297). Oxford, England: Oxford University Press.
Kleifgen, J. (1985). Skilled variation in a kindergarten teacher’s use of foreigner talk.
In S. M. Gass & C. Madden (Eds.), Input in second language acquisition (pp. 59–68).
Rowley, MA: Newbury.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford, England:
Pergamon.
Krashen, S. (1985). The input hypothesis: Issues and implications. London, England:
Longman.
Leeman, J. (2003). Recasts and second language development: Beyond negative evi-
dence. Studies in Second Language Acquisition, 25, 37–63.
Leeser, M. J. (2004). Learner proficiency and focus on form during collaborative
dialogue. Language Teaching Research, 8, 55–81.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language
Learning, 60, 309–365.
Loewen, S., & Philp, J. (2006). Recasts in the adult English L2 classroom: Characteris-
tics, explicitness, and effectiveness. The Modern Language Journal, 90, 536–556.
Long, M. H. (1996). The role of the linguistic environment in second language acquisi-
tion. In W. Ritchie & T. Bhatia (Eds.), Handbook of language acquisition: Vol. 2. Second
language acquisition (pp. 413–468). San Diego, CA: Academic Press.
Long, M. (2015). Second language acquisition and task-based language teaching. West Sussex,
England: John Wiley & Sons, Ltd.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition
(pp. 15–41). New York, NY: Cambridge University Press.
Lyster, R. (1998a). Recasts, repetition, and ambiguity in L2 classroom discourse. Studies
in Second Language Acquisition, 20, 51–81.
Lyster, R. (1998b). Negotiation of form, recasts, and explicit correction in relation
to error types and learner repair in immersion classrooms. Language Learning, 48,
183–218.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruc-
tion. Studies in Second Language Acquisition, 26, 399–432.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language
Learning, 59, 453–498.
Lyster, R., & Ranta, L. (2013). Counterpoint piece: The case for variety in corrective
feedback research. Studies in Second Language Acquisition, 35, 1–18.
Lyster, R., & Saito, Y. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies
in Second Language Acquisition, 32, 265–302.
Mackey, A. (1999). Input, interaction, and second language development: An empirical
study of question formation in ESL. Studies in Second Language Acquisition, 21, 557–587.
Mackey, A. (2000, September). Feedback, noticing and second language development:
An empirical study of L2 classroom interaction. Paper presented at the annual
meeting of the British Association for Applied Linguistics, Cambridge, England.
Mackey, A. (2006). Feedback, noticing and instructed second language learning.
Applied Linguistics, 27, 405–430.
Mackey, A. (2012). Input, interaction, and corrective feedback in L2 learning. Oxford,
England: Oxford University Press.
Mackey, A. (2020). Interaction, feedback and task research in L2 learning: Methods and
design. Cambridge, England: Cambridge University Press.
Mackey, A., Abbuhl, R., & Gass, S. M. (2012). Interactionist approach. In S. M. Gass
& A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 7–23).
New York, NY: Routledge.
Mackey, A., Adams, R., Stafford, C., & Winke, P. (2010). Exploring the relationship
between modified output and working memory capacity. Language Learning, 60,
501–533.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design.
Mahwah, NJ: Lawrence Erlbaum.
Mackey, A., & Gass, S. M. (2006). Introduction: Methodological innovation in interac-
tion research. Studies in Second Language Acquisition, 28, 169–178.
Mackey, A., & Gass, S. M. (2007). Data elicitation for second and foreign language research.
Mahwah, NJ: Lawrence Erlbaum.
Mackey, A., Gass, S. M., & McDonough, K. (2000). How do learners perceive interac-
tional feedback? Studies in Second Language Acquisition, 22, 471–497.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research
synthesis. In A. Mackey (Ed.), Conversational interaction in second language acquisition:
A collection of empirical studies (pp. 407–449). Oxford, England: Oxford University
Press.
Mackey, A., & Goo, J. (2012). Interaction approach in second language acquisition. In
C. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 2748–2758). Malden, MA:
Wiley-Blackwell.
Mackey, A., & Oliver, R. (2002). Interactional feedback and children’s L2 develop-
ment. System, 30, 459–477.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language devel-
opment: Recasts, responses, and red herrings? Modern Language Journal, 82, 338–356.
Mackey, A., & Sachs, R. (2012). Older learners in SLA research: A first look at working
memory, feedback, and L2 development. Language Learning, 62, 704–740.
Mackey, A., & Silver, R. E. (2005). Interactional tasks and English L2 learning by
immigrant children in Singapore. System, 33, 239–260.
McDonough, K. (2007). Interactional feedback and the emergence of simple past
activity verbs in L2 English. In A. Mackey (Ed.), Conversational interaction in second
language acquisition: A collection of empirical studies (pp. 323–338). Oxford, England:
Oxford University Press.
McDonough, K., Crawford, W. J., & Mackey, A. (2015). Creativity and EFL stu-
dents’ language use during a group problem-solving task. TESOL Quarterly, 49,
188–199.
McDonough, K., & Hernández González, T. H. (2013). Language production
opportunities during whole-group interaction in conversation group settings. In
K. McDonough & A. Mackey (Eds.), Second language interaction in diverse educational
contexts (pp. 293–314). Amsterdam, Netherlands: John Benjamins.
McDonough, K., & Mackey, A. (2006). Responses to recasts: Repetitions, primed
production, and linguistic development. Language Learning, 56, 693–720.
McLaughlin, B. (1987). Theories of second-language learning. London, England: Arnold.
Miller, P., & Pan, W. (2012). Recasts in the L2 classroom: A meta-analytic review.
International Journal of Educational Research, 56, 48–59.
Morris, F. A. (2002). Negotiation moves and recasts in relation to error types and learner
repair in the foreign language classroom. Foreign Language Annals, 35, 395–404.
Muranoi, H. (2000). Focus on form through interaction enhancement: Integrating for-
mal instruction into a communicative task in EFL classrooms. Language Learning,
50, 617–673.
Myles, F. (2013). Theoretical approaches. In J. Herschensohn & M. Young-Scholten
(Eds.), The Cambridge handbook of second language acquisition (pp. 46–70). Cambridge,
England: Cambridge University Press.
Nassaji, H. (2009). Effects of recasts and elicitations in dyadic interaction and the role
of feedback. Language Learning, 59, 411–452.
Nicholas, H., Lightbown, P., & Spada, N. (2001). Recasts as feedback to language
learners. Language Learning, 51, 719–758.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis
and quantitative meta-analysis. Language Learning, 50, 417–528.
Payant, C., & Kim, Y. (2015). Language mediation in an L3 classroom: The role of task
modalities and task types. Foreign Language Annals, 48, 706–729.
Philp, J. (2003). Constraints on “noticing the gap”: Nonnative speakers’ noticing of
recasts in NS-NNS interaction. Studies in Second Language Acquisition, 25, 99–126.
Philp, J., & Mackey, A. (2010). Interaction research: What can socially informed
approaches offer to cognitivists (and vice versa)? In R. Batstone (Ed.), Sociocogni-
tive perspectives on language use and language learning (pp. 210–227). Oxford, England:
Oxford University Press.
Pica, T. (1994). Research on negotiation: What does it reveal about second-language
learning conditions, processes, and outcomes? Language Learning, 44, 493–527.
Pica, T. (1996). Second language learning through interaction: Multiple perspectives.
University of Pennsylvania Working Papers in Educational Linguistics, 12, 1–22.
Pica, T. (1998). Second language learning through interaction: Multiple perspectives.
In V. Regan (Ed.), Contemporary approaches to second language acquisition (pp. 1–31).
Dublin, Ireland: University of Dublin Press.
Pica, T., Kanagy, R., & Falodun, J. (1993). Choosing and using communication tasks for
second language instruction and research. In G. Crookes & S. M. Gass (Eds.), Tasks
and language learning: Integrating theory and practice (pp. 9–34). Clevedon, England:
Multilingual Matters.
Pipes, A. (2016, August). Cognitive creativity as an individual difference and its
relationship with L2 communication strategy use. Paper presented at EuroSLA 26,
Jyväskylä, Finland.
Plonsky, L., & Gass, S. M. (2011). Quantitative research methods, study quality, and
outcomes: The case of interaction research. Language Learning, 61, 325–366.
Posner, M. I. (1988). Structures and functions of selective attention. In T. Boll &
B. Bryant (Eds.), Clinical neuropsychology and brain function: Research, measurement, and
practice (pp. 173–201). Washington, DC: American Psychological Association.
Posner, M. I. (1992). Attention as a cognitive and neural system. Current Directions in
Psychological Science, 1, 11–14.
Posner, M. I., & Petersen, S. (1990). The attention system of the human brain. Annual
Review of Neuroscience, 13, 25–42.
Ramírez, A. G. (2005). Review of the social turn in second language acquisition. The
Modern Language Journal, 89, 292–293.
Ranta, L., & Meckelborg, A. (2013). How much exposure to English do interna-
tional graduate students really get? Measuring language use in a naturalistic setting.
Canadian Modern Language Review, 69, 1–33.
Révész, A. (2012). Working memory and the observed effectiveness of recasts in differ-
ent L2 outcome measures. Language Learning, 62, 93–132.
Révész, A., & Han, Z. (2006). Task content familiarity, task type and efficacy of recasts.
Language Awareness, 15, 160–178.
Robinson, P. (1995). Attention, memory, and the “noticing” hypothesis. Language
Learning, 45, 283–331.
Robinson, P. (Ed.). (2001). Cognition and second language instruction. Cambridge, England:
Cambridge University Press.
Robinson, P. (Ed.). (2002). Individual differences and instructed language learning.
Amsterdam, Netherlands: John Benjamins.
Robinson, P. (Ed.). (2013). The Routledge encyclopedia of second language acquisition.
New York, NY: Routledge.
Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acqui-
sition of L2 grammar: A meta-analysis of the research. In J. M. Norris & L. Ortega
(Eds.), Synthesizing research on language learning and teaching (pp. 133–164). Philadel-
phia, PA: John Benjamins.
Sachs, R., & Suh, B.-R. (2007). Textually enhanced recasts, learner awareness, and
L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.),
Conversational interaction in second language acquisition: A collection of empirical studies
(pp. 197–227). Oxford, England: Oxford University Press.
Sagarra, N. (2007). From CALL to face-to-face interaction: The effect of
computer-delivered recasts and working memory on L2 development. In A.
Mackey (Ed.), Conversational interaction in second language acquisition: A collection of
empirical studies (pp. 229–248). Oxford, England: Oxford University Press.
Sato, M. (2017). Oral peer corrective feedback: Multiple theoretical perspectives.
In H. Nassaji & E. Kartchava (Eds.), Corrective feedback in second language teaching
and learning: Research, theory, applications, implications (pp. 19–34). New York, NY:
Routledge.
Saxton, M. (1997). The contrast theory of negative input. Journal of Child Language, 24,
139–161.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied
Linguistics, 11, 129–158.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 3–32). Cambridge, England: Cambridge University Press.
Schmidt, R. (2012). Attention, awareness, and individual differences in language learn-
ing. In Perspectives on individual characteristics and foreign language education (pp. 27–50).
Berlin, Germany: De Gruyter Mouton.
Schmidt, R., & Frota, S. (1986). Developing basic conversational ability in a second
language: A case study of an adult learner of Portuguese. In R. Day (Ed.), Talking to
learn: Conversation in second language acquisition (pp. 237–326). Rowley, MA: Newbury.
Sheen, Y. (2007). The effect of focused written corrective feedback and language apti-
tude on ESL learners’ acquisition of articles. TESOL Quarterly, 41, 255–283.
Sheen, Y. (2008). Recasts, language anxiety, modified output, and L2 learning.
Language Learning, 58, 835–874.
Smith, B. (2012). Eye tracking as a measure of noticing: A study of explicit recasts in
SCMC. Language Learning & Technology, 16, 53–81.
Storch, N. (2002). Patterns of interaction in ESL pair work. Language Learning, 52,
119–158.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input
and comprehensible output in its development. In S. M. Gass & C. Madden (Eds.),
Input in second language acquisition (pp. 235–253). Rowley, MA: Newbury.
Swain, M. (1993). The output hypothesis: Just speaking and writing aren’t enough.
Canadian Modern Language Review, 50, 158–164.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook &
B. Seidlhofer (Eds.), Principle and practice in applied linguistics (pp. 125–144). Oxford,
England: Oxford University Press.
Swain, M. (1998). Focus on form through conscious reflection. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 64–81).
Cambridge, England: Cambridge University Press.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.),
Handbook on research in second language learning and teaching (pp. 471–483). Mahwah,
NJ: Lawrence Erlbaum.
Swain, M., Brooks, L., & Tocalli-Beller, A. (2002). Peer-peer dialogue as a means of
second language learning. Annual Review of Applied Linguistics, 22, 171–185.
Swain, M., & Lapkin, S. (1998). Interaction and second language learning: Two ado-
lescent French immersion students working together. The Modern Language Journal,
82, 320–337.
Swain, M., & Lapkin, S. (2002). Talking it through: Two French immersion learners’
response to reformulation. International Journal of Educational Research, 37, 285–304.
Tomlin, R., & Villa, V. (1994). Attention in cognitive science and second language
acquisition. Studies in Second Language Acquisition, 16, 183–203.
Trofimovich, P., Ammar, A., & Gatbonton, E. (2007). How effective are recasts? The
role of attention, memory, and analytical ability. In A. Mackey (Ed.), Conversational
interaction in second language acquisition: A collection of empirical studies (pp. 171–195).
Oxford, England: Oxford University Press.
Van den Branden, K. (1997). Effects of negotiation on language learners’ output.
Language Learning, 47, 589–636.
Wagner-Gough, J., & Hatch, E. (1975). The importance of input data in second lan-
guage acquisition studies. Language Learning, 25, 297–308.
Williams, J. (1999). Learner-generated attention to form. Language Learning, 51,
303–346.
Williams, J. (2001). The effectiveness of spontaneous attention to form. System, 29,
325–340.
Zalbidea, J. (2017). ‘One task fits all’? The roles of task complexity, modality, and
working memory capacity in L2 performance. The Modern Language Journal, 101,
335–352.
Ziegler, N. (2016). Synchronous computer-mediated communication and interaction:
A meta-analysis. Studies in Second Language Acquisition, 38, 553–586.
Ziegler, N., Seas, C., Ammons, S., Lake, J., Hamrick, P., & Rebuschat, P. (2013).
Interaction in conversation groups: The development of L2 conversational styles. In
K. McDonough & A. Mackey (Eds.), Second language interaction in diverse educational
contexts (pp. 269–292). Amsterdam, Netherlands: John Benjamins.
10
SOCIOCULTURAL THEORY
AND L2 DEVELOPMENT
James P. Lantolf, Matthew E. Poehner,
and Steven L. Thorne
Sociocultural theory (hereafter SCT) has its origins in the writings of the Russian
psychologist L. S. Vygotsky and his colleagues. SCT argues that human mental
functioning is fundamentally a mediated process that is organized by cultural
artifacts, activities, and concepts (Ratner, 2002).1 Within this framework,
humans are understood to utilize existing, and to create new, cultural artifacts
that allow them to regulate, or more fully monitor and control, their material
and symbolic activity. Practically speaking, developmental processes take place
through participation in cultural, linguistic, and historically formed settings
such as family life, peer group interaction, public spaces (e.g., restaurants, banks,
leisure-time activities, etc.), work places, and above all, for our purposes, formal
educational contexts. SCT argues that while human neurobiology is a necessary
condition for higher mental processes, the most important forms of human
cognitive activity develop through interaction within social and material
environments, including conditions found in instructional settings (Engeström,
1987). Importantly, SCT and its sibling approaches, such as cultural-historical
activity theory, emphasize not only research and understanding of human
developmental processes but also praxis-based research, which entails inter-
vening and creating conditions for development (see Lantolf & Poehner, 2014).
Second language (L2) SCT researchers are increasingly emphasizing a praxis
orientation to understand processes of language development through active
engagement with teachers and learners, as illustrated in the Exemplary Study
and other sections of this chapter.
Despite an untimely death from tuberculosis at the age of 38 in 1934,
Vygotsky had an extremely productive career profoundly influenced by the
social conditions produced by the Russian Revolution. While SCT is most
strongly associated with the research of Vygotsky and his colleagues, Luria
and Leont’ev (see Valsiner & van der Veer, 2000), its intellectual roots extend
back to 18th- and 19th-century European philosophy (particularly Hegel and
Spinoza) and to the sociological, economic, and natural science writings of
Marx and Engels (specifically Theses on Feuerbach and Capital). Drawing on this
work, Vygotsky attempted to formulate “a psychology grounded in Marxism”
(Wertsch, 1995, p. 7), which emphasized locating individual development
within material, social, and historical conditions. Wertsch (1985, p. 199) has
suggested that Vygotsky’s developmental research was inspired specifically by
three principles of Marxist theory: (1) that human consciousness is fundamen-
tally social, rather than merely biological, in origin; (2) that human social and
psychological activity are mediated by material artifacts (e.g., computers, the
layout of built environments) and symbolic tools/signs (e.g., language, literacy,
numeracy, concepts); and (3) that units of analysis for understanding human
activity and development should be holistic in nature, that is, the units take
their function and meaning from the whole in which they participate.
This chapter describes the major theoretical principles and constructs asso-
ciated with SCT as they primarily relate to L2 development as a psychological
process that should be accounted for through the same principles and concepts
that account for all other higher (i.e., socially organized) mental processes.
While Vygotsky’s own empirical research focused largely on development
during childhood, the theory proposes principles and concepts intended to
account for all higher mental processes, including L2 development.
Particular attention is given to development in instructed settings, where
activities and environments may be intentionally organized according to the-
oretical principles in order to optimally guide developmental processes. In
the first section, we elaborate on mediation—the central construct of the
theory. We then discuss and relate to L2 development other aspects of SCT,
namely, private speech, internalization, regulation (closely connected to
mediation and internalization), and the zone of proximal development
(ZPD). We also consider SCT-informed L2 pedagogy, in particular the growing
bodies of work in Dynamic Assessment (DA) and in Systemic-Theoretical
Instruction, which has been imported into L2 education as Concept-Based
Language Instruction (C-BLI).
The Theory and Its Constructs
Mediation
Vygotsky laid the foundation for a unified theory of mental functioning that
initiated a new way of thinking about development. He acknowledged that
the human mind includes what he characterized as lower mental
processes, that is, processes (memory, attention, perception, etc.) that are
biologically specified and that are generally shared with other higher ani-
mals. These processes result from millennia of evolution and are more or less
“instinctive or habitual reactions” to specific environmental inputs (Arievitch,
2017, p. 32). For example, if someone hears a sudden loud noise or sees a sud-
den flash of light, an automatic response is to react to these environmental
disturbances without intentionally considering if, and how, to react. How-
ever, because the environment in which humans live includes a myriad of
unanticipated occurrences, responses to which are not determined by our
automatic instincts, another system emerged among humans that provided for
non-automatic regulation of behavior. This is our conscious psychological
system, which develops ontogenetically, that is, from childhood to adult life
(Arievitch, 2017, p. 34). Gal’perin, whose research was heavily influenced by
Vygotsky’s writings, proposed that the function of consciousness was precisely
to cope with the unanticipated events of human reality (Arievitch, 2017, p. 35).
It enables humans, acting collectively in social configurations, and eventually,
individually, to plan a potential response and assess the risks and likelihood
of success when the plan is implemented in concrete reality. In other words,
consciousness imbues humans with the unique capacity to act mentally prior
to acting materially in response to a particular circumstance. Mental action,
however, requires an abstract system of symbols, and this is provided by the
most powerful and pervasive symbolic system available to humans—language.
Thus, language is considered to have a dual, communicative and psychological,
function. That is, it serves to influence the activity both of others and of the
self. Both of these processes are referred to in the theory as mediation. Our par-
ticular interest here is in psychological mediation.
To better understand psychological mediation via symbolic systems, we can
consider the more obvious relationship between humans and the physical world
that is mediated by concrete tools. If we want to dig a hole in order to plant a tree,
it is possible, following the behavior of other species, to simply use our hands.
However, modern humans rarely engage in such non-mediated activity; instead,
we mediate the digging process through the use of a shovel, which allows us to
make more efficient use of our physical energy and to dig a more precise hole. We
can be even more efficient and expend less physical energy if we use a mechanical
digging device such as a backhoe. Notice that the object of our activity remains
the same whether we dig with our hands or with a tool, but the action of digging
itself changes its appearance when we shift from hands to a shovel or a backhoe.
While physical tools imbue humans with a great deal more ability than natural
endowments alone (e.g., microscopes and telescopes radically augment our visual
acuity), tools also exert influence on their users as we are generally not completely
free to use a tool in any way we like. The design of the tool as well as the habitual
patterns of its use (i.e., its “cultures-of-use,” see Thorne, 2003, 2016) influence
the purposes to which it is put and methods by which it is used.
Vygotsky reasoned that in a parallel fashion to the development and use
of material tools, humans also have the capacity to create and use symbols
as tools to mediate their own psychological activity. He proposed that while
physical tools are outwardly directed, symbolic tools are bi-directional, in that
through social communication we influence and control others, and through
self-communication we influence and control our own thinking process. This
ability allows humans, unlike other species, to inhibit and delay the function-
ing of automatic biological processes and to use this time to consider possible
actions (i.e., plan) on an ideal plane before realizing them on the objective
plane. Planning itself entails memory of previous actions, attention to relevant
(and overlooking of irrelevant) aspects of the situation, rational thinking, and
projecting possible outcomes. All of this, according to Vygotsky, constitutes
human consciousness. From an evolutionary perspective, this capacity imbues
humans with a considerable advantage over other species because, through the
creation of auxiliary means of mediation, we are able to assay a situation and
consider alternative courses of action and possible outcomes on the ideal or
mental plane before acting on the concrete objective plane (see Arievitch & van
der Veer, 2004).
Within SCT, a pervasive and powerful form of mediation is provided by
language. When children learn language, words not only function to isolate
specific objects and actions, they also serve to reshape biological perception
into cultural perception and concepts (see also Gibson, 1979; for an inter-
face of SCT and language socialization, see Duff, 2007). SCT researchers
describe a developmentally sequenced shift in the locus of control of human
activity as three processes of regulation—object-, other-, and self-regulation.
Object-regulation describes instances when artifacts in the environment serve
as affordances for activity, including cognitive activity, as occurs in the use
of an online translation tool to look up unknown words while reading or
writing, the use of PowerPoint or an outline when making an oral presenta-
tion, or pen and paper for making a to-do list or working out mathematical
problems. Other-regulation describes mediation of cognition by other people
and can include explicit or implicit feedback on grammatical form, corrective
comments on writing assignments, or guidance from an expert or teacher.
Self-regulation refers to individuals who have internalized external forms of
mediation for the execution or completion of a task. In this way, development
can be described as the process of gaining greater voluntary control over one’s
capacity to think and act either by becoming more proficient in the use of
mediational resources, or through a lessening or severed reliance on external
mediational means (Thorne & Tasker, 2011).
To be a proficient user of a language, L1 or otherwise, is to be self-regulated;
however, self-regulation is not a stable condition. Even the most proficient
communicators, including native speakers, may need to re-access earlier
stages of development (i.e., other- or object-regulation) when confronted
with challenging communicative situations. Under stress, for example, adult
native users of a language produce ungrammatical and incoherent utterances
(see Frawley, 1997). In this instance, an individual may become regulated by
the language as an object and instead of controlling the language, they become
disfluent and may require assistance from another person or from objects such
as a thesaurus, dictionary, or exemplar of a genre-specific text. Each of the three
stages—object regulation, other regulation, and self-regulation—is “symmet-
rical and recoverable, an individual can traverse this sequence at will [or by
necessity], given the demands of the task” (Frawley, 1997, p. 98).
Language in all its forms is the most pervasive and powerful cultural artifact
that humans possess to mediate their connection to the world, to each other,
and to themselves. The key that links thinking to social and communicative
activity, as we have said earlier, resides in the double function of the linguistic
sign, which simultaneously points in two directions—outwardly, “as a unit of
social interaction (i.e., a unit of behavior),” and inwardly, “as a unit of thinking
(i.e., as a unit of mind)” (Prawat, 1999, p. 268, italics in original). The inward or
self-directed use of language as a symbolic tool for cognitive regulation is called
inner speech (see Lantolf & Thorne, 2006). When we learn to communicate
socially, we appropriate the patterns and meanings of this social speech and also
utilize it inwardly to mediate our mental activity.
Considerable research has been carried out on the form and function of the
developmental precursor of inner speech in children, generally referred to as
private speech, which tends to be highly fragmentary yet observable speech
that is reminiscent of the dialogue that occurs in everyday face-to-face social
interaction (see Diaz & Berk, 1992; Wertsch, 1985). L2 researchers, beginning
with the work of Frawley and Lantolf (1985), have also examined the cognitive
function of private speech. Private speech is defined as an individual’s exter-
nalization of language for purposes of maintaining or regaining self-regulation,
for example, to aid in focusing attention, problem-solving, orienting oneself to
a task, to support memory-related tasks, to facilitate internalization of novel or
difficult information (e.g., language forms, sequences of numbers and mathe-
matical computation), and to objectify and make salient phenomena and infor-
mation to the self (e.g., DiCamilla & Anton, 2004; Frawley, 1997; McCafferty,
1992; Ohta, 2001). Such use of language shares empirical features that include
averted gaze, lowered speech volume, altered prosody, abbreviated syntax,
and multiple repetitions. Recent research on private speech has explored the
fact that on occasion private speech can have the unintended secondary func-
tion of stimulating collective attention to group-relevant problems and issues
(e.g., Smith, 2007; Steinbach-Koehler & Thorne, 2011).
A particularly intriguing issue has been whether or not learners are capable
of producing L2 private speech that has the capacity to regulate their behavior
similar to what happens when they use their L1. Research by Centeno-Cortés
and Jiménez-Jiménez (2004) showed that even advanced L2 users had a difficult
time solving problems that required technical (e.g., mathematical) knowledge
when they deployed L2 private speech. More recently, Garbaj (2018) demon-
strated that advanced L2 speakers can use L2 private speech to solve tasks that
do not require specialized knowledge. Jiménez-Jiménez (2015) reported that if
the technical knowledge is appropriated through the L2 instead of the L1, as
in the case of some bilingual individuals, it is indeed possible to successfully
resolve a complex task through use of L2 private speech.
Internalization
Vygotsky (1981) stated that the challenge to psychology was to “show how the
individual response emerges from the forms of collective life [and] in contrast
to Piaget, we hypothesize that development does not proceed toward social-
ization, but toward the conversion of social relations into mental functions”
(p. 165). The process through which cultural artifacts, including language,
take on a psychological function is known as internalization (Kozulin, 1990).
Drawing from earlier theorists such as Janet (see Valsiner & van der Veer, 2000),
Vygotsky described the process of internalization as follows:
Any function in the child’s cultural development appears twice, or on
two planes. First it appears on the social plane, and then on the psycho-
logical plane. First it appears between people as an interpsychological
category, and then within the child as an intrapsychological category.
This is equally true with regard to voluntary attention, logical memory,
the formation of concepts, and the development of volition
(Vygotsky, 1981, p. 163)
As this quotation makes clear, higher-order cognitive functions, which include
planning, categorization, and interpretive strategies, are initially social and
subsequently are internalized and made available as cognitive resources. This
process of creative appropriation occurs through exposure to, and use of, semi-
otic systems such as languages, textual (and now digital) literacies, numeracy
and mathematics, and other historically accumulated cultural practices. In this
sense, internalization describes the developmental process whereby humans
gain the capacity to perform complex cognitive and physical-motor functions
with progressively decreasing reliance on external, and increasing reliance on
internal, mediation. Recently, Arievitch (2017), following Gal’perin (1967), has
proposed that internalization not be interpreted as taking information into the
mind for the purpose of processing it internally but rather as the capacity “to
solve problems while being detached (and often being away) from [sic] immedi-
ately displayed problem situation” (p. 89). On this view, there is not a separate
domain of mental representations that is “fundamentally different from the
outside world” (ibid.). Rather, acting mentally “retains the major characteristics
of real-life, practical (external) activity,” and as such goal-directed actions “are
carried out in the medium of meanings, that is, without overt physical execu-
tion” (pp. 89–90). Children have a difficult time carrying out mental actions
in the absence of physical objects, whereas for adults it is a relatively easy task,
as, for example, when one mentally plans to rearrange furniture in one’s home
while sitting in a café. The point is that while the planning is mental it is still
directed at impacting the external world and as such must take account of “the
regularities and relationships in the environment” (p. 90).
The Zone of Proximal Development
The ZPD has had a substantial impact on developmental psychology, educa-
tion, and applied linguistics. The most frequently referenced definition of the
ZPD is “the distance between the actual developmental level [of a person or
group] as determined by independent problem solving and the level of potential
development as determined through problem solving under adult guidance or
in collaboration with more capable peers” (Vygotsky, 1978, p. 86).
The ZPD has captivated educators and psychologists for a number of rea-
sons. One is the notion of assisted performance, which though not equivalent
to the ZPD, has been a driving force behind much of the interest in Vygotsky’s
research. Another compelling attribute of the ZPD is that, in contrast to tra-
ditional tests and measures that only indicate the level of development already
attained, the ZPD is forward looking through its assertion that the functioning
one can reach in the present through cooperation with others is indicative
of what one will be able to do independently in the future. In this sense,
ZPD-oriented assessment provides a nuanced determination of both develop-
ment achieved and developmental potential.
With the ZPD, Vygotsky put into concise form his more general conviction
that “human learning presupposes a specific social nature and a process by
which children grow into the intellectual life of those around them” (1978,
p. 88). Vygotsky was particularly intrigued with the complex effects that
schooling had on cognitive development. One of Vygotsky’s most important
findings, and contra Piaget, is that instruction, especially formal instruction in
school, can precede development and therefore shape it. In this sense, the ZPD
is not only a model of the developmental process but also a conceptual tool that
educators can use to understand aspects of students’ emerging capacities that are
in early stages of maturation. Teachers who use the ZPD proactively as a
diagnostic have the potential to create conditions that may give rise to specific
forms of future development.
In L2 research, the ZPD concept was used by Aljaafreh and Lantolf (1994) to
analyze the relationship between mediation and L2 development. They identi-
fied a number of mechanisms of effective mediation, for example, that media-
tion should be contingent on actual need, provided following a continuum that
begins with implicit hints and moves toward explicit correction as necessary,
and that mediation should be removed when a student demonstrates the capac-
ity to function independently. This process requires continuous assessment of a
learner’s emerging abilities and subsequent tailoring of mediation to best facil-
itate progression from other- to self-regulation. This insight has been formal-
ized as DA, a framework for integrating teaching and assessment discussed later.
In a study that builds on conversation analysis of classroom discourse, Ohta
(2001) describes the interaction cues to which peers orient in dyad work in
order to provide developmentally appropriate mediation to one another. A key
finding is that differences in learner abilities are not fixed or located solely
within individuals. While some instances show one participant consistently
providing assistance to another, there are often expert–novice reversals over
the course of a single, or multiple, sessions. Reiterating Donato’s (1994) notion
of collective expertise, Ohta observed that, “when learners work together …
strengths and weaknesses may be pooled, creating a greater expertise for the
group than of any of the individuals involved” (Ohta, 2001, p. 76). Swain
(2000) made similar claims about what she termed “collaborative dialogue,”
and more recently, through the development of the process she termed “lan-
guaging” (Swain, Lapkin, Knouzi, Suzuki, & Brooks, 2009), extends her ear-
lier formulation about communicative output “to include its operation as a
socially constructed cognitive tool” … that “serves second-language learning
by mediating its own construction, and the construction of knowledge about
itself” (2000, p. 112).
What Counts as Evidence?
Sociocultural research is grounded in the genetic method, an approach to sci-
entific research in which the development of individuals, groups, and processes
is traced over time. Consequently, single snapshots of learner performance are
not assumed to constitute adequate evidence of development. Evidence must
have a historical perspective. This is not necessarily an argument for the exclu-
sive use of long-term longitudinal studies. While development surely occurs
over the course of months, years, or even the entire lifespan of an individ-
ual or group, it may also occur over relatively short periods of time, where
learning takes place during a single interaction between, for example, parent
and child or tutor and student. Moreover, development arises in the dialogic
interaction that transpires among individuals (this includes the self-talk that
people produce when trying to bootstrap themselves through difficult activities
such as learning another language) as they collaborate in ZPD activity (Swain
et al., 2009). Evidence of development from this perspective is not limited to
the actual linguistic performance of learners. On the face of it, this perfor-
mance in itself might not change very much from one time to another. What
may change, however, is the frequency and quality of mediation needed by a
particular learner in order to perform appropriately in the new language (see
the discussion of DA, below). On one occasion, a learner may respond only to
explicit mediation from a teacher or peer to produce a specific feature of the L2
and on a later occasion (later in the same interaction or in a future interaction)
the individual may only need a subtle hint to produce the feature. Thus, while
nothing has ostensibly changed in the learner’s actual performance, develop-
ment has taken place, because the quality of mediation needed to prompt the
performance has changed.2
These observations have been formalized into an assessment framework
for diagnosing both learners’ actual abilities and those that have not yet
fully developed but are still emerging. This framework is generally referred
to as DA, a term inspired by Vygotsky’s colleague, Luria (1961). An exten-
sive research literature documents a range of DA procedures that have been
designed outside the L2 field and for use with populations including those
with special needs, minority and at-risk individuals, gifted learners, patients
suffering from dementia, and prison inmates (see Haywood & Lidz, 2007).
DA approaches are characterized by the intentional inclusion of mediation—
usually through dialogic interaction but in some cases scripted prompts,
tutoring, or feedback—as part of the assessment procedure. Analysis of the
extent to which learners required mediation, and of their responsiveness to it, yields a
profile that includes identification of emerging ability, underlying sources of
difficulty, and the forms of mediation that were most helpful to individuals.
Beginning with Poehner’s (2005) study of DA with learners of L2 French,
a significant line of L2 SCT research has elaborated DA procedures with a
range of L2 populations and focused on a variety of language features and
abilities (see Poehner, 2018).
Common Misconceptions about SCT
We will focus here only on misconceptions that relate to the ZPD, easily the
most widely used and yet least understood of the central concepts of SCT
(Chaiklin, 2003). There are two general misconceptions about the ZPD. The
first is that the ZPD is equivalent to scaffolding (or assisted performance) and
the second is that it is similar to Krashen’s notion of i + 1 (e.g., Krashen, 1982).
Both assumptions are inaccurate. Scaffolding, a term popularized by Jerome
Bruner and his colleagues in the 1970s (see Wood, Bruner, & Ross,
1976), refers to any type of adult-child (expert-novice) assisted performance.
Scaffolding, unlike the ZPD, has task completion as its primary goal and is
thought of in terms of the amount of assistance provided by the expert to the
novice rather than in terms of the quality, and changes in the quality, of
mediation that is negotiated between expert and novice (Stetsenko, 1999).
With regard to ZPD and Krashen’s i + 1, the fundamental problem is that
the ZPD focuses on the nature of the concrete dialogic relationship between
232 James P. Lantolf et al.
expert and novice and its goal of moving the novice toward greater self-
regulation through the new language. Krashen’s concept focuses on language
and the language acquisition device, which is assumed to be the same for all
learners with very little room for differential development (e.g., Dunn &
Lantolf, 1998; Thorne, 2000). As researchers have pointed out, there is no
way of determining precisely the i + 1 of any given learner in advance of
development. It can only be assumed after the fact. In terms of the ZPD,
development can be predicted in advance for any given learner on the basis
of his or her responsiveness to mediation. Moreover, as we mentioned in
our discussion of the ZPD, development is not merely a function of shifts in
linguistic performance, as in the case of Krashen’s model, but is also deter-
mined by the type of, and changes in, mediation negotiated between expert
and novice.
Exemplary Study: Kim and Lantolf (2018)
Arievitch (2017) discusses approaches to formal classroom instruction from
the perspective of SCT. Two of these are relevant to the exemplary study to be
considered here: traditional instruction and systemic-theoretical instruction,
which is a form of concept-based instruction based on Vygotsky’s educational
theory.3 Traditional instruction, according to Arievitch (2017, p. 122), typically
encompasses five general procedures: (1) presentation and explanation
of a problem or task, (2) presentation of rules for dealing with the problem,
(3) illustration of the rules through use of typical examples, (4) learner mem-
orization of the rules, and (5) practice using the rules to resolve the problem.
Arievitch (2017) argues that a major shortcoming of traditional instruction
is that as students work to resolve problems, their lack of understanding of
the rules as well as inadequacy of the rules themselves means that they resort
to trial-and-error strategies, resulting in slow and gradual learning at best
(see also Karpov, 2018). One of the consequences of traditional instruction
is markedly differential performance across learners. Hence, appeal is often
made to individual differences to account for the variation in learning out-
comes (Arievitch, 2017, p. 122).
In systemic-theoretical instruction, learners are provided with sophisticated
explicit knowledge of particular concepts, including those related to language,
that capture the essence of the concept. The more complex and subtle the con-
cept, the more explicit instruction is necessary. This knowledge must be made
pedagogically understandable and presented in a way that avoids rote memori-
zation and yet is still memorable for learners to access. In addition, the knowl-
edge must be sufficiently general to enable learners to extend it to a wide array
of contexts. Gal’perin (1992) developed—and tested in over 800 studies in all
school subjects—an instructional program designed to concretize Vygotsky’s
general theory of educational development (see Haenen, 2001; Talyzina, 1981).
Gal’perin’s program encompasses several phases that incorporate Vygotsky’s
developmental principles, as follows:
• systematic explanation of a concept presented to students in non-technical
language that can be easily understood;
• this explanation is compared and contrasted with the students’ current
understanding of the concept, which might be derived from their everyday
knowledge or from previous instruction;
• the concept is formulated as a visual image such as a drawing or graph that
captures the meaning of the concept;
• students are then asked to verbalize their understanding of the concept
first to each other (communicated understanding) and then to themselves
(dialogic understanding). The importance of verbalization, Swain’s “lan-
guaging,” is based on Vygotsky’s notion that language plays a powerful role
in mediating human thinking;
• the concept is linked to specific concrete activities designed to provide
opportunities for students to practice using the concept. In the case of lan-
guage instruction, this might involve tasks, drama, role-plays, reading and
writing activities, etc.;
• the graphic representation of the concept is eventually internalized as stu-
dents improve their ability to manipulate the concept to meet their specific
communicative needs.
Gal’perin created the acronym SCOBA, or Schema for the Orienting Basis
of Action, to capture the fact that the graphic image serves as a resource for
learners to guide their performance. Gal’perin called his approach to education
Systemic Theoretical Instruction or STI, because not only was the conceptual
knowledge to be organized systematically in order to reflect the essence of a
concept, it was also to be theoretical in the sense that it was to avoid the problem
of traditional instruction whereby knowledge is often linked to specific
contexts and as such is not easily transferred to other contexts. For conceptual
knowledge to be functional, it must be generalizable and therefore applicable
to a wide array of environments. Within the SCT-L2 literature, STI is often
referred to as Concept-Based Language Instruction, or C-BLI, to avoid confu-
sion with content-based instruction.
Although initial C-BLI studies focused on L2 grammar (e.g., Negueruela,
2003; Lai, 2012; Zhang & Lantolf, 2015) as well as pragmatics (van Compernolle,
2014), the exemplary study we present here focused on figurative language,
specifically, sarcasm—a concept that is notoriously difficult for L2 learners to
identify and appropriately interpret (Kim, 2014). Briefly, sarcasm is a type of
verbal irony in which a speaker (or writer) produces utterances that contrast
literal and intended meaning and which function to either criticize, insult, or
generate a humorous effect on listeners (Kim & Lantolf, 2018, p. 208).
According to Kim and Lantolf (2018), sarcasm appears to be a universal
feature of human communication, but its identifying cues and its intended
meaning generally vary across cultures. For instance, in American English, spoken
sarcasm can be displayed through such prosodic features as vowel length,
nasalization, slurred speech, and enhanced articulation rate, as well as through facial
expressions, posture, and gestures; even the physical setting or an individual’s
personality can be indicators of sarcasm. These cues can be used in isolation or in
combination. Not only do cues vary cross-culturally, so do the intended meanings
of sarcastic utterances. Kim and Lantolf (2018) point out, for instance, that
in Korean, the L1 of the speakers participating in Kim’s study, sarcasm is used
much less frequently than in English and its intended meaning is almost always
negative (i.e., criticism and insult). The term for sarcasm in Korean is bi-kkom or
“twisting” and refers to the twist that use of sarcasm gives to the communica-
tive situation in which it is used (Kim & Lantolf, 2018, p. 210). While bi-kkom is
not uniquely signaled (Kim & Lantolf, 2018, p. 210), it may be marked through
vowel length or by bodily behavior such as clapping.
The nine participants in the study were advanced ESL learners pursuing different
courses of study at a large North American university. Prior to instruction, and
following Gal’perin’s educational procedure, Kim (2014) asked the students to
provide their pre-understanding of sarcasm. One of the students volunteered a
statement that reflects the general understanding of the group:
When someone uses sarcasm with me in Korean, I can at least show
that I am offended by making an angry face or something…but in the
English-speaking context, I simply become stupid. Even when someone
is being sarcastic towards me, I will not understand what is actually going
on and they will think of me as a stupid person, which is obviously not
true […] I hate that reality.
(Kim & Lantolf, 2018, p. 209)
The statement shows that the participants had some sense that English
speakers frequently use sarcasm but that, even when sarcasm is identified, a
speaker’s intention may be missed because the utterance is interpreted
negatively through the lens of Korean.
Kim’s (2014) instructional program was implemented over 16 weeks, which
included the initial interviews to determine the students’ pre-understanding
of the concept. The students were also given a series of pre-tests to assess their
ability to detect and interpret English sarcasm. This was followed by ten weekly
one-on-one hour-long instruction sessions which presented learners with a
series of SCOBAs designed to illustrate different ways of identifying English
sarcasm along with authentic video examples selected from various TV shows
and YouTube political discussions, debates and talk-show interviews. Between
weeks 5 and 10, Kim conducted three 90-minute focus groups, each composed
of three students. The groups discussed their understandings and
interpretations of the video examples presented during the one-on-one ses-
sions. In week 12, the students were given a posttest as well as a second inter-
view to determine if changes had occurred in their understanding and attitude
toward sarcasm. Four weeks later, the students were given a second posttest. All
instruction and interviews were conducted in L1 Korean.
Figure 10.1 is the initial SCOBA intended to explain the function of
sarcasm.
The SCOBA links use of sarcasm to the types of emotions that a speaker
wishes to convey. As the participants were shown the selected video clips, they
were asked to use Figure 10.1 to “deconstruct” utterances with regard to their
sarcastic or non-sarcastic intent as well as with regard to whether or not the
sarcastic utterances imparted positive or negative speaker intentions (Kim &
Lantolf, 2018, p. 217).
Additional SCOBAs were introduced throughout the instructional process
to illustrate visual cues for sarcasm (e.g., facial contortion, eye movement, man-
ual gestures, etc.). Figure 10.2 shows the SCOBA for describing use of eyes
and eyebrows and Figure 10.3 is the SCOBA for body movement indicators of
sarcasm.
Each of the SCOBAs was robustly illustrated with appropriate video clips.
The remaining SCOBAs described how gestures, mouth configurations, facial
expressions, and prosodic cues are used to indicate sarcasm.
FIGURE 10.1 SCOBA describing the features of positive and negative sarcasm.
FIGURE 10.2 SCOBA for eye and eyebrow cues for indicating sarcasm.
FIGURE 10.3 SCOBA for body movement indicators of sarcasm.
The pre-, post-, and delayed posttests each consisted of ten items drawn
from the videos described above. The items contained one sarcastic utterance
with some combination of three cues (e.g., gesture, tone of voice, body posture)
that marked the relevant utterance as sarcastic. In scoring the tests, one point
was assigned for correct identification of the sarcastic utterance and if at least
one correct cue was selected. Additional points were given for correct iden-
tification of each of the remaining cues for a maximum score of 3 for each
item; thus, the total score possible on each assessment was 30. A score of 0 was
assigned for misidentification of the sarcastic utterance or if the correct utter-
ance was identified but the participant failed to select any of the appropriate
cues. Following is an example of a test item from the pretest. It is an excerpt
based on the American TV show Desperate Housewives:
OLDER DAUGHTER: do you and Daddy play with them?
GABBI: if we did, you and I wouldn’t be having this conversation.
((hurriedly goes upstairs to Anna’s room and abruptly
opens the door without knocking))
ANNA: ((playing with her cell phone and looks at Gabbi))
NICE KNO::ck
GABBI: ((puts up her hand and shows condoms)) what are
these?
(Kim & Lantolf, 2018, p. 222)
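The scoring rule described before the excerpt can be made concrete with a short sketch. The code below is only an illustration of the arithmetic as we read it, not the instrument used in the study: the function names, the representation of cues as sets of labels, and the decision to ignore (rather than penalize) incorrectly selected cues are all assumptions introduced here.

```python
def score_item(picked_sarcastic_utterance, selected_cues, correct_cues):
    """Return a 0-3 score for one test item (hypothetical helper)."""
    # Count only the selected cues that belong to the item's correct cues.
    # Assumption: wrongly selected cues are ignored, not penalized.
    hits = len(set(selected_cues) & set(correct_cues))
    # Zero if the utterance was misidentified or no correct cue was chosen.
    if not picked_sarcastic_utterance or hits == 0:
        return 0
    # One point for the utterance plus a first cue, one more per further cue,
    # capped at the stated maximum of 3 per item.
    return min(hits, 3)


def score_test(items):
    """Sum the item scores; with ten items the maximum is 30."""
    return sum(score_item(*item) for item in items)


cues = {"facial expression", "prosody", "encyclopedic knowledge"}
print(score_item(True, {"prosody"}, cues))                       # 1
print(score_item(True, {"prosody", "facial expression"}, cues))  # 2
print(score_item(True, set(), cues))                             # 0
print(score_item(False, cues, cues))                             # 0
print(score_test([(True, cues, cues)] * 10))                     # 30
```

Under this reading, a learner who picks the right utterance but none of its cues scores the same as one who misses the utterance entirely, which is why cue identification carries so much weight in the results.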
The video clips were evaluated for sarcasm and cues by 15 English native speak-
ers. Five of the evaluators were trained to identify the cues illustrated in the
SCOBAs. Not surprisingly, the native speakers had little difficulty selecting the
sarcastic utterance, but only the five speakers trained through the SCOBAs were
able to identify the specific cues. When asked what the cues were, the non-trained
NSs responded by saying that the sarcasm is obvious or that the speaker is not say-
ing what she means (Kim & Lantolf, 2018, p. 218). Consequently, for cue iden-
tification, only the responses of the trained NSs were used for comparison with
the learners’ performance on the tests. On each of the three assessments, a high
degree of inter-evaluator agreement on cue identification was reported (p. 219). For
example, in the excerpt from Desperate Housewives, the NSs agreed that utterance
six (Anna’s NICE KNO::ck) showed sarcasm with the intended meaning you should
have knocked, and they identified the cues as facial expression, prosody, and
encyclopedic knowledge of the target culture (Kim & Lantolf, 2018, p. 221). On the
pretest, seven of the nine learners incorrectly identified utterance seven (Gabbi’s
what are these?) as sarcastic, with the intended meaning you should not have
condoms (p. 222).
Kim and Lantolf (2018, p. 220) report a significant improvement for the
group mean from the pre- to the posttest and from the posttest to the delayed
posttest. Here we forgo the details of learner responses regarding changes in
cue selection across the assessments provided in Kim and Lantolf (2018). Suffice
it to say that the learners improved in the variety of cues they were able to
detect and in several cases were more robust in their selection than even the
trained NSs.
Turning finally to the qualitative interview data, one of the participants in
the early stages of the project attached no relevant meaning to the cue provided
by facial expression, something that is not typically used in Korean. The
participant remarked that a witness who rolled her eyes during an episode of
the Judge Judy TV show would not be exhibiting sarcasm in Korean but would instead
be seen as a “crazy person” (p. 224). After some instruction and during the
focus-group discussion from week 5, the same participant remarked that she
had become much more sensitive to the different ways English speakers indicate
sarcasm, including through visual signals. Several participants indicated in the
postinstruction interview that they had gained an increased sense of empower-
ment as a result of their newly developed ability to detect and interpret sarcasm
in English. One learner even felt that the program had enhanced her awareness
of and even ability to more effectively use sarcasm in her L1 (p. 226).
Explanation of Observed Findings in SLA
As a preamble to our discussion, we would like to point to a fundamental
difference between the observed phenomena taken as a whole and how SCT
approaches the learning process. The ten phenomena taken together are predi-
cated on a theoretical assumption (in our view, and in the view of many other
researchers, scientific observations, as Vygotsky (1997) insightfully stated, are
never theory-free) that separates individuals from the social world. In other
words, the phenomena assume a dualism between autonomous learners and their
social environment represented as linguistic input—a concept closely linked to
the computational metaphor of cognition and learning. SCT is grounded in
a perspective that does not separate the individual from the social and in fact
argues that the individual emerges from social interaction and as such is always
quasi social (Vygotsky, 1994). This includes not only obvious social relation-
ships but also the qualities that comprise higher order mental activity mediated
through language. With this as a background, we will briefly address the given
observations as they pertain to an SCT perspective of L2 development.
Observation #1. Exposure to input is necessary for SLA. Since the social world is
the source of all learning in SCT, participation in culturally organized activity
is essential for learning to happen. This entails not just the obvious case of
interaction with others but also the artifacts that others have produced, includ-
ing written texts. It also includes Ohta’s (2001) “vicarious” participation in
which learners observe the linguistic behavior of others and attempt to imitate
it through private speech. However, as our discussion makes clear, development
may be optimally guided when intentional effort is made to sensitize interac-
tions to learners’ emergent needs.
Observation #2. A good deal of SLA happens incidentally. Here we believe a bit of
clarification is in order. From the perspective of SCT, what matters is the specific
subgoal that learners form in which the language itself becomes the intentional
object of their attention in the service of a higher goal. Thus, looking up a
word in a dictionary, guessing at the meaning of a word when reading a text,
and asking for clarification or help are subgoals that subserve higher order goals
such as writing a research paper, passing a test, or finding one’s way through
an unknown city. This process reflects the instrumental function of language;
that is, the use of language to achieve specific concrete goals. Thus, what is
called incidental learning is not really incidental. It is at some level a function of
intentional, goal directed, meaningful activity. Moreover, as we explain in the
next section, SCT compels us to place a premium on the explicit presentation of
linguistic knowledge in order to intentionally provoke L2 development.
Observation #4. Learners’ output (speech) often follows predictable paths with predictable
stages in the acquisition of a given structure and Observation #9. There are limits on the effects of
instruction in SLA. It is important to distinguish between learning in untutored
immersion settings and highly organized educational settings. The evidence
reported in the L2 literature supports the developmental hypothesis position
in the case of untutored learners. There is also research that shows learners
follow the same paths in classroom settings (see Pienemann & Lenzing, this
volume). Our concern with this research is that, as far as we are aware,
the teaching did not take account of the ZPD. In other words, it provided a
uniform intervention for all learners and did not engage learners in the type
of negotiated mediation demanded by the concept of ZPD. Zhang and Lan-
tolf (2015) in a study on L2 development of Chinese word order report that
instruction organized in accordance with principles of SCT and that is sensi-
tive to learners’ ZPD can, in fact, alter the predicted developmental sequence,
including skipping stages.
Observation #5. Second language learning is variable in its outcome and Observation
#6. Second language learning is variable across linguistic subsystems. Variability in
the development of any given learner as well as across learners is a characteristic
of L2 acquisition. In addition, the evidence shows that learners variably acquire
different subsystems of a new language depending on the type of mediation they
receive and the specific goals for which they use the language (see Lantolf &
Aljaafreh, 1995; for a discussion of L2 variability that takes into account both
SCT and a dynamic systems theory perspective, see de Bot, Lowie, Thorne, &
Verspoor, 2013).
Observation #8. There are limits on the effect of a learner’s first language on SLA.
From an SCT perspective it is important to distinguish form from meaning
when addressing this observation. While L1 formal features may have a lim-
ited effect on L2 learning, it is clear with regard to observations on variability
that L1 meanings continue to have a pervasive effect in L2 development (see
Negueruela, Lantolf, Jordan, & Gelabert, 2004). In addition, as was discussed
in regard to private speech, L2 speakers encounter difficulties using the new
language to mediate their cognitive activity in tasks that require specialized
knowledge, but not in those involving general knowledge.
Observation #10. There are limits on the effects of output (learner production) on
language acquisition. In this case, it is important to distinguish between use of the
L1 to mediate the learning of the L2 and the effects of L1 on L2 production.
Because our first language is used not only for communicative interaction but
also to regulate our cognitive processes, it stands to reason that learners must
necessarily rely on this language in order to mediate their learning of the L2.
However, there is also evidence showing that social speech, whether produced in the
L1 or the L2, impacts L2 learning. Swain and her colleagues documented
how classroom learners of second languages, including immersion learners,
push linguistic development forward by talking, either in the L1 or L2, about
features of the new language (Swain et al., 2009; Swain & Lapkin, 2002).
Implicit and Explicit L2 Knowledge
SCT acknowledges a distinction between implicit and explicit knowledge,
including knowledge of language. Implicit knowledge, which Vygotsky (1987)
discusses under the rubric of spontaneous concepts, is largely non-conscious and
appropriated from participation in the everyday activities of a community.
Explicit knowledge, which Vygotsky (1987) discusses under the rubric of scien-
tific concepts, is primarily learned through intentional and systematic instruction
generally associated with formal education. Paradis (2009) and Ullman (2005)
have argued that the distinction between implicit and explicit knowledge is
not supported by neurological systems of the brain. Instead, they point out that
the brain comprises two memory systems: procedural and declarative. Among
other things, the former underlies the kind of knowledge that people have of
their native language as acquired in immersion settings and which is not usually
available to conscious inspection per se. The latter system, on the other hand,
supports lexical knowledge and other kinds of explicit information that people
generally learn through intentional and conscious instruction.
According to Paradis (2009), as we mature into our teenage years and beyond,
learning through the procedural system declines, whereas learning through the
declarative system increases. Leaving aside, because of space constraints, many
of the complexities and subtleties of the processes entailed in the respective
models proposed by Paradis and Ullman, we wish to highlight two components
that are directly relevant for the SCT position. The first is that there are no neu-
rological pathways connecting the procedural and declarative systems, which
means that declarative knowledge cannot convert into procedural knowledge,
no matter how much practice one engages in (NB: neither researcher rules out
the possibility of developing the procedural system in immersion settings but it
requires extensive and intensive experiences—experiences that are not likely to
occur in educational settings). The second is that the declarative system, with
appropriate practice, can be accessed smoothly and rapidly, although perhaps
not as rapidly as the procedural system. What this means is that explicit and
systematic classroom instruction can result in functionally useful knowledge of
a second language that learners can access for spontaneous spoken and written
communicative purposes. SCT research for the past decade has largely focused
on this aspect of L2 acquisition: the intentional development of second language
ability through systematic explicit instruction. The Exemplary Study consid-
ered above illustrates how SCT principles are implemented in concrete instruc-
tional practice. Again, concern with development through explicit instruction
does not deny that spontaneous development of the procedural system is possi-
ble in immersion settings where implicit knowledge may be accessed.
Conclusion
In this chapter, we have outlined the primary constructs of SCT, namely mediation
and regulation, internalization, and the ZPD. Additionally, we have considered
DA and C-BLI and how they inform the study of L2 acquisition and
the structuring of educational interventions. Mediation is the principal construct
that unites all varieties of SCT and is rooted in the observation that humans
do not act directly on the world—rather their cognitive and material activities
are mediated by symbolic artifacts (such as languages, literacy, numeracy, con-
cepts, and forms of logic and rationality) as well as by material artifacts and tech-
nologies. The claim is that higher-order mental functions, including voluntary
memory, logical thought, learning, and attention, are organized and amplified
through participation in culturally organized activity. This emphasis within the
theory embraces a wide range of research including linguistic relativity, distrib-
uted cognition, and cognitive linguistics. We also addressed the concept of inter-
nalization, the processes through which interpersonal and person–environment
interaction form and transform one’s internal mental functions, and the role of
imitation in learning and development. Finally, we discussed the ZPD, the dif-
ference between the level of development already obtained and the cognitive
functions comprising the proximal next stage of development that may be visible
through participation in collaborative activity. We emphasized that the ZPD is
not only a model of developmental processes, but also a conceptual and pedagog-
ical tool that educators can use to better understand aspects of students’ emerging
capacities that are in early stages of formation.
Because of its emphasis on praxis, SCT does not rigidly separate under-
standing (research) from transformation (concrete action). While SCT is used
descriptively and analytically as a research framework, it is also an applied
methodology that can be used to improve educational processes and environ-
ments (see Lantolf & Poehner, 2008, 2011, 2014; Thorne, 2004, 2005). SCT
encourages engaged critical inquiry wherein investigation into psychological
abilities leads to the development of material and symbolic tools necessary to
enact positive interventions. In other words, the value of the theory resides
not just in the analytical lens it provides for the understanding of psychological
development but in its capacity to directly impact that development. Though
certainly not unique among theoretical perspectives, SCT approaches take
seriously the issue of applying research to practice by understanding commu-
nicative processes as inherently cognitive processes, and cognitive processes
as indivisible from humanistic issues of self-efficacy, agency, and the effects of
participation in culturally organized activity.
Discussion Questions
1. The perspectives outlined by Lantolf, Poehner, and Thorne, on the one
hand, and Gass and Mackey, on the other, are both “interactional” in
nature. How are they different?
2. What is private speech? Reflecting on your own language learning experi-
ences, can you relate instances in which private speech has played a role in
language development or cognitive regulation?
3. Considering your own experience teaching language or observing
instructional interactions in classrooms, what do you think of the Dynamic
Assessment principle of providing mediation that is initially implicit and
only becomes more explicit as necessary? How does this relate to discus-
sions of corrective feedback?
4. Concept-based language instruction argues for the presentation of linguis-
tic concepts to capture the ways in which language may be used to con-
struct meaning rather than reliance on grammar rules. What implications
does this have for language textbooks? What challenges does this entail for
classroom teachers and teacher education programs?
5. If you were to adopt a sociocultural approach, what implications would
this have for conducting classroom SLA research? How would it compare
with research using other approaches you might adopt?
Notes
making more rapid gains than others and some showing gradual rather than dra-
matic improvement.
Suggested Further Reading
Lantolf, J. P., & Poehner, M. E. (2014). Sociocultural theory and the pedagogical imperative
in L2 education: Vygotskian praxis and the research/practice divide. London, England:
Routledge.
The authors elaborate a reading of Vygotsky that brings to light his commit-
ment to a Marxian perspective on theory and practice, according to which the
proper role of theory and research is to orient practical activity while practical
activity provides the ultimate test of theory. The implications of this perspec-
tive for SLA research and L2 education are treated in detail. Dynamic Assess-
ment and Systemic-Theoretical Instruction are presented as two essential forms
of Vygotskian praxis.
Lantolf, J. P., & Thorne, S. L. (2006). Sociocultural theory and the genesis of second language
development. Oxford, England: Oxford University Press.
This book presents an in-depth introduction to sociocultural theory and to L2
research and pedagogical interventions carried out within this framework.
Lantolf, J. P., & Beckett, T. (2009). Research timeline for sociocultural theory and
second language acquisition. Language Teaching, 42, 459–475.
This article presents a critical survey of SCT research conducted on L2
learning.
Lantolf, J. P., & Poehner, M. E. (2008). Sociocultural theory and the teaching of second lan-
guages. London, England: Equinox.
This volume, focusing exclusively on issues of L2 teaching and pedagogy, includes
chapters addressing Systemic-Theoretical Instruction, Dynamic Assessment, and
instructional initiatives framed by the ZPD and related constructs.
Lantolf, J. P., Poehner, M. E., with Swain, M. (Eds.) (2018). The Routledge handbook of
sociocultural theory and second language development. New York, NY: Routledge.
This collection of 35 chapters surveys the research carried out on a wide
array of topics dealing with the development and teaching of languages beyond
the first as informed by sociocultural theory. In addition to work on assessment
and pedagogy, it also includes chapters that address teacher education, bilin-
gualism and neurolinguistics, and the potential political implications of L2 SCT
research.
Poehner, M. E. (2008). Dynamic assessment: A Vygotskian approach to understanding and
promoting second language development. Berlin, Germany: Springer Publishing.
This volume offers the only book-length treatment of L2 Dynamic Assess-
ment (DA), detailing its origins in Vygotsky’s writings and the development of DA
approaches in psychology and education before elaborating a DA framework for
integrating teaching and assessment in L2 classrooms.
Thorne, S. L. (2005). Epistemology, politics, and ethics in sociocultural theory. Modern
Language Journal, 89, 393–409.
This article describes the history of Vygotsky-inspired cultural historical
research, provides a select review of L2 investigations taking this approach, and
outlines recent conceptual, theoretical, and methodological innovations.
van Lier, L. (2004). The ecology and semiotics of language learning: A sociocultural perspective.
Boston, MA: Kluwer Academic Publishers.
van Lier insightfully combines Vygotskian theory with detailed discussions of
semiotics and ecological approaches to language and L2 development.
References
Aljaafreh, A., & Lantolf, J. P. (1994). Negative feedback as regulation and second
language learning in the zone of proximal development. The Modern Language Jour-
nal, 78, 465–483.
Arievitch, I. (2017). Beyond the brain. An agentive activity perspective on mind, development,
and learning. Rotterdam, Netherlands: Sense Publishers.
Arievitch, I., & van der Veer, R. (2004). The role of nonautomatic process in activity
regulation: From Lipps to Gal’perin. History of Psychology, 7, 154–182.
Chaiklin, S. (2003). The zone of proximal development in Vygotsky’s analysis of
learning and instruction. In A. Kozulin, B. Gindis, V. Ageyev, & S. Miller (Eds.),
Vygotsky’s educational theory in cultural context (pp. 39–64). Cambridge, England:
Cambridge University Press.
Centeno-Cortés, B., & Jiménez-Jiménez, A. (2004). Problem-solving tasks in a foreign
language: The importance of the L1 in private verbal thinking. International Journal
of Applied Linguistics, 14, 7–35.
de Bot, K., Lowie, W., Thorne, S. L., & Verspoor, M. (2013). Dynamic sys-
tems theory as a theory of second language development. In M. García Mayo,
M. Gutierrez-Mangado, & M. Martínez Adrián (Eds.), Contemporary approaches
to second language acquisition (pp. 199–220). Amsterdam, Netherlands: John
Benjamins.
Diaz, R., & Berk, L. (Eds.). (1992). Private speech: From social interaction to self regulation.
Hillsdale, NJ: Erlbaum.
DiCamilla, F. J., & Antón, M. (2004). Private speech: A study of language for thought
in the collaborative interaction of language learners. International Journal of Applied
Linguistics, 14(1), 36–69.
Donato, R. (1994). Collective scaffolding in second language learning. In J. P. Lantolf &
G. Appel (Eds.), Vygotskian approaches to second language research (pp. 334–356).
Norwood, NJ: Ablex Publishing.
Dunn, W., & Lantolf, J. (1998). Vygotsky’s zone of proximal development and Krashen’s
i + 1: Incommensurable constructs; incommensurable theories. Language Learning,
48, 411–442.
Duff, P. (2007). Second language socialization as sociocultural theory: Insights and
issues. Language Teaching, 40, 309–319.
Engeström, Y. (1987). Learning by expanding: An activity theoretical approach to developmen-
tal research. Helsinki, Finland: Orienta-Konsultit.
Erlam, R., Ellis, R., & Batstone, R. (2013). Oral corrective feedback on L2 writing:
Two approaches compared. System, 41, 257–268.
Frawley, W. (1997). Vygotsky and cognitive science. Language and the unification of the social
and computational mind. Cambridge, MA: Harvard University Press.
Frawley, W., & Lantolf, J. (1985). Second language discourse: A Vygotskyan perspec-
tive. Applied Linguistics, 6, 19–44.
Gal’perin, P. Ya. (1967). On the notion of internalization. Soviet Psychology, 5(3), 28–33.
Gal’perin, P. Ya. (1992). Stage-by-stage formation as a method of psychological inves-
tigation. Journal of Russian and East European Psychology, 30, 60–80.
Garbaj, M. (2018). Thinking through the non-native language: The role of private
speech in mediating cognitive functioning in problem solving among proficient
non-native speakers. Language and Sociocultural Theory, 5(2), 108–129.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton
Mifflin.
Haenen, J. (2001). Outlining the teaching-learning process: Piotr Gal’perin’s contribu-
tion. Learning and Instruction, 11, 157–170.
Haywood, H. C., & Lidz, C. S. (2007). Dynamic assessment in practice. Clinical and educa-
tional applications. New York, NY: Cambridge University Press.
Jiménez-Jiménez, A. (2015). Private speech during problem-solving activities in
bilingual speakers. International Journal of Bilingualism, 19(3), 259–281.
Karpov, Y. V. (2018). Acquisition of scientific concepts as the content of school instruc-
tion. In J. P. Lantolf, M. E. Poehner, & M. Swain (Eds.), The Routledge handbook
of sociocultural theory and second language development (pp. 102–116). New York, NY:
Routledge.
Kim, J. (2014). Developing conceptual understanding of sarcasm in a second language through
concept-based instruction (Unpublished doctoral dissertation). The Pennsylvania State
University, University Park, PA.
Kim, J., & Lantolf, J. P. (2018). Developing understanding of sarcasm in L2 English
through explicit instruction. Language Teaching Research, 22, 208–229.
Kozulin, A. (1990). Vygotsky’s psychology. A biography of ideas. Cambridge, MA: Harvard
University Press.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Lai, W. (2012). Concept-based foreign language pedagogy: Teaching the Chinese temporal sys-
tem (Unpublished PhD dissertation). The Pennsylvania State University, University
Park, PA.
Lantolf, J. P., & Aljaafreh, A. (1995). Second language learning in the zone of prox-
imal development: A revolutionary experience. International Journal of Educational
Research, 23, 619–632.
Lantolf, J. P., Kurtz, L., & Kisselev, O. (2017). Understanding the revolutionary char-
acter of L2 development in the ZPD: Why levels of mediation matter. Language and
Sociocultural Theory, 3(2), 153–171.
Lantolf, J. P., & Poehner, M. E. (Eds.) (2008). Sociocultural theory and the teaching of second
languages. London: Equinox Press.
Lantolf, J. P., & Poehner, M. E. (2011). Dynamic assessment in the classroom: Vygotskian
praxis for L2 development. Language Teaching Research, 15, 11–33.
Lantolf, J. P., & Poehner, M. E. (2014). Sociocultural theory and the pedagogical imperative
in L2 education: Vygotskian praxis and the research/practice divide. London: Routledge.
Lantolf, J., & Thorne, S. L. (2006). Sociocultural theory and the genesis of second language
development. Oxford: Oxford University Press.
Luria, A. R. (1961). Study of the abnormal child. American Journal of Orthopsychiatry. A
Journal of Human Behavior, 31, 1–16.
McCafferty, S. G. (1992). The use of private speech by adult second language learners:
A cross-cultural study. The Modern Language Journal, 76, 179–189.
Negueruela, E. (2003). A sociocultural approach to the teaching and learning of second lan-
guages: Systemic-theoretical instruction and L2 development (Unpublished doctoral disser-
tation). The Pennsylvania State University, University Park, PA.
Negueruela, E., Lantolf, J. P., Jordan, S., & Gelabert, J. (2004). The “private function”
of gesture in second language speaking activity: A study of motion verbs and ges-
turing in English and Spanish. International Journal of Applied Linguistics, 14, 113–147.
Ohta, A. (2001). Second language acquisition processes in the classroom: Learning Japanese.
Mahwah, NJ: Erlbaum.
Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam,
Netherlands: John Benjamins.
Poehner, M. E. (2005). Dynamic assessment of oral proficiency among advanced learners of
L2 French (Unpublished doctoral dissertation). The Pennsylvania State University,
University Park, PA.
Poehner, M. E. (2018). Probing and provoking L2 development: The object of mediation
in dynamic assessment and mediated development. In J. P. Lantolf & M. E. Poehner
(Eds.), with M. Swain, The Routledge handbook of sociocultural theory and second language
development (pp. 249–265). London, England: Routledge.
Prawat, R. S. (1999). Social constructivism and the process-content distinction as
viewed by Vygotsky and the pragmatists. Mind, Culture, and Activity, 6, 255–273.
Ratner, C. (2002). Cultural psychology: Theory and method. New York, NY: Kluwer/
Plenum.
Smith, H. (2007). The social and the private worlds of speech: Speech for inter- and
intramental activity. The Modern Language Journal, 91, 341–356.
Steinbach-Koehler, F., & Thorne, S. L. (2011). The social life of self-directed talk: A
sequential phenomenon? In J. Hall, J. Hellermann, S. Pekarek Doehler, & D. Olsher
(Eds.), L2 interactional competence and development (pp. 66–92). Bristol, England:
Multilingual Matters.
Stetsenko, A. (1999). Social interaction, cultural tools and the zone of proximal
development: In search of a synthesis. In S. Chaiklin, M. Hedegaard, & U. J. Jensen
(Eds.), Activity theory and social practice: Cultural historical approaches (pp. 235–253).
Aarhus, Denmark: Aarhus University Press.
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through
collaborative dialogue. In J. Lantolf (Ed.), Sociocultural approaches to second language
research (pp. 97–115). Oxford, England: Oxford University Press.
Swain, M., & Lapkin, S. (2002). Talking it through: Two French immersion learners’
response to reformulation. International Journal of Educational Research, 37, 285–304.
Swain, M., Lapkin, S., Knouzi, I., Suzuki, W., & Brooks, L. (2009). Languaging:
University students learn the grammatical concept of voice in French. Modern Lan-
guage Journal, 93, 5–29.
Talyzina, N. (1981). The psychology of learning. Moscow, Russia: Progress Press.
Thorne, S. L. (2000). Second language acquisition theory and the truth(s) about relativ-
ity. In J. Lantolf (Ed.), Sociocultural approaches to second language research (pp. 219–243).
Oxford, England: Oxford University Press.
Thorne, S. L. (2003). Artifacts and cultures-of-use in intercultural communication.
Language Learning and Technology, 7, 38–67.
Thorne, S. L. (2004). Cultural historical activity theory and the object of innovation.
In O. St. John, K. van Esch, & E. Schalkwijk (Eds.), New insights into foreign language
learning and teaching (pp. 51–70). Frankfurt, Germany: Peter Lang.
Thorne, S. L. (2005). Epistemology, politics, and ethics in sociocultural theory. The
Modern Language Journal, 89, 393–409.
Thorne, S. L. (2016). Cultures-of-use and morphologies of communicative action.
Language Learning & Technology, 20(2), 185–191.
Thorne, S. L., & Tasker, T. (2011). Sociocultural and cultural-historical theories of
language development. In J. Simpson (Ed.), Routledge handbook of applied linguistics
(pp. 487–500). New York, NY: Routledge.
Ullman, M. T. (2005). A cognitive neuroscience perspective on second language acqui-
sition: The declarative/procedural model. In C. Sanz (Ed.), Mind and context in adult
second language acquisition. Methods, theory, and practice (pp. 141–178). Washington,
D.C.: Georgetown University Press.
Valsiner, J., & Van der Veer, R. (2000). The social mind: Construction of the idea. Cambridge,
England: Cambridge University Press.
van Compernolle, R. A. (2014). Sociocultural theory and L2 instructional pragmatics. Bristol,
England: Multilingual Matters.
Vygotsky, L. (1978). In M. Cole, V. John-Steiner, S. Scribner, & E. Souberman (Eds.),
Mind in society. The development of higher psychological processes. Cambridge, MA: Har-
vard University Press.
Vygotsky, L. (1981). The genesis of higher mental functions. In J. Wertsch (Ed.), The
concept of activity in Soviet psychology. Armonk, NY: M.E. Sharpe.
Vygotsky, L. (1987). Thinking and speech. In R. Rieber & A. Carton (Eds.), The collected
works of L. S. Vygotsky, volume 1. Problems of general psychology. New York, NY:
Plenum Press.
Vygotsky, L. S. (1994). The problem of the environment. In R. van der Veer & J. Valsiner
(Eds.), The Vygotsky reader (pp. 338–354). Oxford: Blackwell.
Vygotsky, L. S. (1997). On psychological systems. In R. W. Rieber & J. Wollock (Eds.),
The collected works of L. S. Vygotsky. Volume 3: Problems of the theory and history of psy-
chology. New York, NY: Plenum.
Wertsch, J. (1985). Vygotsky and the social formation of mind. Cambridge, MA: Harvard
University Press.
Wertsch, J. (1995). The need for action in sociocultural research. In J. Wertsch, P. Del
Rio & A. Alvarez (Eds.), Sociocultural studies of mind (pp. 56–74). New York, NY:
Cambridge University Press.
Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem-solving.
Journal of Child Psychology and Psychiatry, 17, 89–100.
Zhang, X., & Lantolf, J. P. (2015). Natural or artificial: Is the route of second language
development teachable? Language Learning, 65(1), 152–180.
11
COMPLEX DYNAMIC SYSTEMS
THEORY
Diane Larsen-Freeman
The Theory and Its Constructs
Complexity theorists are fundamentally concerned with describing and tracing
emerging patterns in dynamic systems in order to explain change and growth.
As such, Complexity Theory (CT) is well suited for use by researchers who
study L2 acquisition, and it is not surprising, therefore, that its influence has
been increasing. In fact, the famous physicist Stephen Hawking (2000) has
called the 21st century “the century of complexity.” This chapter begins with
an overview of the constructs within CT, and then turns to how they apply to
L2 acquisition, or second language development (L2 development), as an adher-
ent of CT would prefer to call it (Larsen-Freeman, 2015). More recently, in its
application to L2 development, the theory has been called Complex Dynamic
Systems Theory (CDST), uniting CT with Dynamic Systems Theory (de
Bot & Larsen-Freeman, 2011).
Complexity Theory
CT has a broad reach. It is transdisciplinary in two senses of the term. First,
it has been used to inform a variety of disciplines, for example, epidemiol-
ogy in biology, dissipative systems in chemistry, stock market performance
in business—and more germane to our interests—language issues in linguis-
tics and applied linguistics. Here is just a sampling of the language issues that
have been addressed: the nature of language (e.g., Bybee & Hopper, 2001;
Ellis & Larsen-Freeman, 2009), language use (e.g., Kretzschmar, 2009, 2015),
language evolution (e.g., MacWhinney, 1999; Mufwene, Coupé, & Pellegrino,
2017), L1 acquisition (Evans, 2007), L2 development (e.g., Larsen-Freeman,
2006b; de Bot, Lowie, Thorne, & Verspoor, 2013), discourse (e.g., Cameron,
2007), English as a lingua franca (e.g., Baird, Baker, & Kitazawa, 2014),
World Englishes (Larsen-Freeman, 2018a), language policy and planning (e.g.,
Larsen-Freeman, 2018b), multilingualism (e.g., Herdina & Jessner, 2002), and
others (Larsen-Freeman, 2017).
The second way that CT is transdisciplinary is that it contributes a new
cross-cutting theme, comparable to innovative and far-reaching transdisci-
plinary themes such as structuralism and evolution (Halliday & Burns, 2006).
Complexity introduces the theme of emergence (Holland, 1998), “the spon-
taneous occurrence of something new” (van Geert, 2008, p. 182) that arises
from the interaction of the components of a complex system, just as a bird
flock emerges from the interaction of individual birds. Since a bird flock can-
not be understood from examining a single bird, the search for understand-
ing a phenomenon shifts from reductionism, or explaining the phenomenon
by describing its simpler components, to understanding how complex order
emerges from the interconnectedness of its components. Furthermore, this
order emerges “without direction from external factors and without a plan of
the order embedded in an individual component” (Mitchell, 2003, p. 6). In
other words, complex systems are decentralized and self-organizing.
It is important to add that saying that order emerges does not mean that the
resulting pattern remains static, just as a bird flock is not fixed. In this regard,
complex systems are also known as dynamic systems. Calling them such high-
lights their ceaseless movement: they attain periods of stability but never stasis.
They are about becoming, not being (Gleick, 1987, p. 5). Complexity theorists
study change through time, sometimes continuous change, sometimes sudden.
Dynamic systems are represented as trajectories or paths in state space (de Bot,
2008). As the systems evolve, they undergo phase transitions, in which one
more or less stable pattern gives rise to another. One way to think of phase
transitions is to observe a pot of water on a stove. As the water heats, it changes
from a seemingly inert phase to a roiling phase.
Complex systems are context dependent, that is, they are inseparable from
their environment or context. Further, they are open, exchanging informa-
tion, matter, or energy with other systems, while showing the emergence of
order. We can think of an eddy in a stream. The whorl is created within a par-
ticular spatio-temporal context. The water molecules that comprise the pattern
are constantly passing through it because the stream is an open, dynamic sys-
tem. Nonetheless, the whorl remains more or less constant—a pattern emergent
in the flux.
Complex systems are also adaptive. An adaptive system changes in response
to changes in its environment. Successful adaptive behavior entails the ability
to respond to novelty. For example, a human being’s adaptive immune system
lacks centralized control and does not settle into a permanent, fixed structure;
for this reason, it is able to adapt to combat previously unknown invaders.
Complex dynamic systems exhibit nonlinearity, which means that an effect
is not proportionate to a cause. In a nonlinear system, a small change in one
parameter can have huge implications downstream. In the closely related Chaos
Theory, this sensitivity has been called the “butterfly effect,” to make the point
that a small change, such as a butterfly’s flapping its wings in one part of the
world, can have a big impact on the weather elsewhere. While admittedly these
examples may seem remote from human activities, such as language learning,
the next section shows how they apply.
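The disproportion between cause and effect can be made concrete with a small numerical sketch. The example below is not drawn from the chapter: it assumes the logistic map, a standard textbook illustration from chaos theory, and the function name, the growth parameter r = 4, and the starting values are chosen here purely for illustration. It follows two trajectories whose starting points differ by one part in a billion and records how far apart they drift.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate x -> r * x * (1 - x) and return every state visited.

    The logistic map is an illustrative stand-in for a nonlinear system;
    it is an assumption of this sketch, not a model proposed in the text.
    """
    states = [x0]
    for _ in range(steps):
        current = states[-1]
        states.append(r * current * (1.0 - current))
    return states


# Two starting points that differ by a tiny amount (the "butterfly").
baseline = logistic_trajectory(0.2)
perturbed = logistic_trajectory(0.2 + 1e-9)

# Distance between the two trajectories at each step.
gaps = [abs(a - b) for a, b in zip(baseline, perturbed)]

if __name__ == "__main__":
    for step in (0, 10, 20, 30, 40):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

Inspecting gaps shows the difference starting out negligible and growing until the two trajectories bear no resemblance to each other, which is the sense in which a small change in a nonlinear system can have large consequences downstream. Nothing in the sketch models language learning itself; it only illustrates the property of nonlinearity.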
Complex Dynamic Systems Theory
Thus far, we have seen that complexity theorists seek to explain the function-
ing of emergent, complex, interconnected, dynamic, self-organizing, context-
dependent, open, adaptive, and nonlinear systems (Larsen-Freeman, 1997).
The position taken here is that these attributes also characterize language
use and development; applied to this domain, CT is referred to as CDST. Gleick (1987) observes
that in a complex dynamic system, “the act of playing the game has a way of
changing the rules” (p. 24). Gleick is not a linguist, and he was not writing
about linguistic rules; nonetheless, the connection between dynamically play-
ing the game and the emergence of a changing nonlinear system can be made.
Language is a meaning-making resource (Byrnes, 2013); when it is used mean-
ingfully (playing the game), it changes.
The game is played over multiple timescales: over long periods of time in
the evolution of language, on an intermediate timescale in the spread of lin-
guistic innovations within a speech community and the formation of dialects to
distinguish communities, and, on the shortest timescale, in on-line processing
of linguistic stimuli, leading to the acquisition of language in infants and the
development of a second language, or indeed plurilingualism. Thus, CDST
offers us a way to unite language use, evolution, change, and development or
acquisition: Real-time language processing, evolutionary change in language
structure, and developmental change in learner language are all reflections
of the same dynamic process of language using, albeit at different timescales
(Bybee, 2006; Larsen-Freeman, 2003).
Now turning specifically to language development, the focus on dyna-
mism and change that CDST motivates is significant because, as Elman (2003)
notes, often studies of language development focus on behaviors that occur
during development without considering what precedes or follows them or
for that matter the mechanisms of change themselves. As for the nonlinearity
of the process, Elman points out that “the processing mechanisms that under-
lie [language development] … are fundamentally nonlinear. This means that
development itself will frequently have phase-like characteristics, that there
may be periods of extreme sensitivity to input (‘critical periods’)” (p. 431).
Importantly, also, the concept of emergence problematizes the deep-seated
assumption that learning is a matter of assembling an internal model of an
external reality (Davis & Simmt, 2003, p. 142). Instead, language patterns are
seen to be continually emerging. They can conform to linguists’ categories
of regularities, such as canonical grammatical structures, but in their self-
organizing, they need not. Also, they can be, but need not be, the patterns
linguists describe. They may be sequences of a few words or an intonation con-
tour, such as the one Peters (1977) observed her Vietnamese participant Minh
using. Minh would also use “fillers,” which straddle the boundary between
phonology and morphosyntax, as place-holders to fill out not yet analyzed parts
of his phrases. Sometimes learners’ patterns are an amalgam of old and new, the
most obvious instance being the use of two or more languages in a single utter-
ance. They can also be formulaic, for example, in English, “Nice to meet you,”
and “Nice to see you again,” two formulas that English learners sometimes
confuse. As with other language users, learners have the capacity to create their
own patterns and to expand the meaning potential of a given language, not just
to conform to a ready-made system.
CDST adopts an embodied sociocognitive view of L2 development. The
process is embodied because L2 development is not merely about changes in the
brain but rather is distributed throughout the body (de Bot, 2015; Smith, 2005),
and further, rests on the brain and the body’s interaction with the environment
(Gallagher, 2018). Iteration is key to cognitive processing. Through encoun-
tering repeated instances of patterns, learners, with their capacious memories,
adaptively imitate them, an innovative and iterative process that involves per-
ceiving and transforming a pattern in accordance with co-textual and contex-
tual constraints to meet the user’s goals (Macqueen, 2012). They also adopt
cognitive strategies, such as inferencing and analogizing. One vehicle for itera-
tion and adaptation is the social process of co-adaptation, where each partner
in an interaction adjusts to the other over and over again (Larsen-Freeman &
Cameron, 2008), much as in reciprocal child-directed speech between a child
and the child’s caregiver (van Dijk et al., 2013). Not only is co-adaptation
reciprocal in the moment, but it is also adjustable over time (i.e., the tendency
to adapt may increase or decrease as a function of learning or progress made
by the learner over time) (van Dijk et al., 2013). Thus, some of the multiple
timescales that were alluded to above obtain in language development. The
adaptation that takes place locally, say between conversation partners, in the
short term is self-similar to the process of adaptation in learning the language
in the long term (van Dijk et al., 2013).
The dynamic patterns emerging in learner language are the consequence
of the learners’ adaptation to a specific context, which includes the learners’
interlocutors. In this way, it can be said that CDST is an ecological theory, rec-
ognizing the embeddedness of the process in a particular context. The patterns
are variegated and “softly assembled” by learners (Thelen & Smith, 1994). Soft
assembly refers to the fact that the patterns are flexibly “created and dissolved as
tasks and environments change” (Thelen & Bates, 2003). It might be helpful to
think in terms of Lévi-Strauss’s “bricolage,” reusing available materials to solve
new problems (Lévi-Strauss, 1962), or what Becker (1994) called reshaping prior
texts to new contexts. Similarly, Makoni and Makoni (2010) accord speakers
agency in using “bits and pieces” of languages. Nowhere is this improvisation
more evident than in translanguaging, where plurilingual learners draw on the
language resources of their different languages to respond to the demands of
the situation. Rather than thinking in terms of transfer from one language to
another, then, CT inspires the thinking that a multilingual system expands the
language resources from which a plurilingual may draw, the use of one lan-
guage affects the use of another, and therefore the influence between languages
is bidirectional (Herdina & Jessner, 2002; Pavlenko & Jarvis, 2002).
Thus, adapting patterns sometimes means appropriating the language
others use in a Bakhtinian dialogic sense. At other times, it means innovat-
ing by analogy or recombination. Learner language can thus be seen as an
ensemble of interacting elements (Cooper, 1999), though overall achieving
some stability. An important point is that learners transform their knowledge
(Larsen-Freeman, 2013a); they do not merely copy or transfer it, nor do they
implement knowledge in the form in which it was received, or in which it was
delivered through instruction.
New forms are not mere additions to a learner’s system; they change the system
itself (Feldman, 2006). As they do so, there is a great deal of variability
in a learner’s language resources. Self-organizing growth is ordered but not
invariant (Verspoor, Lowie, & van Dijk, 2008). The language resources of a
learner are not fixed internal representations but rather continuously assembled
in real time, depending on the real-time interactions between person- and
context-specific properties (van Geert, Steenbeek, & van Dijk, 2011). Fur-
thermore, the learning trajectory is not necessarily linear because the learn-
er’s language resources are constantly under construction and in flux as usage
environments change (Hopper, 1998). Even when the conditions of learning
remain steady, learners’ performance both regresses and progresses from a
target-language perspective. In other words, development is not unidirectional.
In an adaptive system, learners are not passive recipients of input. From
a CDST perspective, then, it is better to think in terms of affordances, or
opportunities for learning that are present, rather than input (van Lier, 2000). In
other words, what a learner attends to and makes use of in a particular instance
is determined by the “reciprocal relationship between an organism and a partic-
ular feature of its environment. What becomes an affordance depends on what
the organism does, what it wants, and what is useful for it” (van Lier, 2000,
p. 252). It also depends on the emotional valence that an affordance holds for
a learner. The interconnectedness of components in a complex system extends
to the connection between emotion and cognition (e.g., Dewaele & Li, 2018).
As such, CDS researchers seek to avoid what William James called “the
psychologist’s fallacy”: the expectation that the observer can register the truth
about an event. Instead, CDST recognizes the unique perspective of the learner
(Nelson, 2013) and thus finds an emic perspective useful when analyzing data.
Learners are agentive in another sense. Although feedback in a dynamic
system is a stimulus for adaptation, from a CDST perspective, learners do
not need to depend on others for negative feedback. They can generate their
own feedback through anticipation (Spivey, 2007), something called predictive
error (Jaeger & Ferreira, 2013) or statistical preemption: “If learners consistently
witness one construction in contexts where they might have expected
to hear another, the former can statistically preempt the latter” (Johnson,
Turk-Browne, & Goldberg, 2013, p. 361). Neuroscientists who model the brain
as a complex network suggest that every sensory input, e.g., every use of a word,
simultaneously strengthens certain, and weakens other, connections in a neu-
ral network (Globus, 1995). Neural network models are thus constantly being
updated both by negative evidence that something doesn’t exist and positive
evidence that it does.
A view of language as a complex adaptive system (Ellis & Larsen-Freeman,
2009) counters the tendency to portray learner language as being an incomplete
and deficient version of the target language. Indeed, implicit in this under-
standing of language as a self-modifying, emerging system is that the develop-
mental change process is never complete (Larsen-Freeman, 2006a), and neither
is its learning. Learner language “is the way it is because of the way it has been
used, its emergent stabilities arising out of interaction” (Larsen-Freeman &
Cameron, 2008, p. 115).
Provided that the system remains open, a learner’s language resources can
grow in the learner’s quest for increased functionality, a quest which also
motivates change in other biological systems (Givón, 2002). For some lan-
guage learners, this means choosing to do more—to make more meaningful
distinctions in more pragmatically appropriate ways, resulting in a multiplic-
ity of (competing) forms and the ability to express increasingly nuanced mes-
sages. For example, learners may expand their language resources in the area
of greetings, increasingly tailoring greetings to the situation, going beyond the
one-to-one associations that they forge initially (Andersen, 1984).
This picture that I have painted so far is primarily of natural L2 develop-
ment. It is necessary to recognize that although L2 development proceeds at
least partially implicitly, instruction that recruits and directs learners’ attention
explicitly can make the process whereby increased functionality is achieved
more efficient, especially with adults. Therefore, taking the ethical imperative
for social science research to heart (Ortega, 2005), CDST has a contribution to
make in terms of pedagogical suggestions. CDST-informed instruction would
teach adaptation through iteration (Larsen-Freeman, 2013b) by changing the
conditions of a particular task or activity each time it is conducted. It would
also involve meaningful practice, recognizing that it is the learners who will
determine meaningfulness for themselves. It would acknowledge that instruc-
tion calls for teaching learners, not just teaching language. In other words, it
would acknowledge the individual differences among learners, which might
call for differentiated instruction and would certainly call for cultivating a
relationship between learners’ perceptions and the affordances in the target lan-
guage (Thoms, 2014). It would insist on ongoing self-referential assessment, as
opposed to a single snapshot taken from the perspective of the target language.
Finally, it would provide for appropriate cognitive and affectively supportive
feedback, which would make the process of learning more efficacious.
What Counts as Evidence?
Because complex systems operate on many different levels (from the inner
workings of the brain to the interactions of different speech communities) and
on many different timescales (from nanoseconds to millennia), different sources
count as evidence, including those ranging from the brain scans in neuroscience
to pattern detection in corpus analysis to tracing the evolution of patterns in evo-
lutionary linguistics. Because of the system-environment interconnectedness,
there exists the challenge of defining the boundaries for any complex system
(Larsen-Freeman, 2017). Therefore, in the study of L2 development, because
CDST is a theory of holistic change, data gathered in longitudinal case studies
and from simplex systems in educational contexts (van Geert & Steenbeek,
2014) are particularly prized, as is true of other approaches. Such studies yield
data in which dynamic patterns are revealed and examined through both qual-
itative and quantitative means (Verspoor, de Bot, & Lowie, 2011).
With regard to the latter, various techniques from Dynamic Systems Theory,
such as min-max graphs and other descriptive techniques, have been employed
(e.g., Hepford, 2017). Min-max graphs trace the average minimum and maxi-
mum scores on developmental indices over time, thus displaying the variabil-
ity in the form of bandwidths. The larger the bandwidth, the more variable
the scores. Also, CDST studies often include analyses that smooth the data
and check for statistical significance (Verspoor, de Bot, & Lowie, 2011). What
these techniques show is that individual learners exhibit very different patterns
in their development over time. In one study, even identical twins, who had
highly similar experiences in their exposure to an L2, exhibited contrasting
patterns of development (Lowie, van Dijk, Chan, & Verspoor, 2017). In fact,
the twins’ developmental patterns for spoken and written language went so far
as to show opposite tendencies. These observations buttress the claim for the
individual nature of the process of second language development.
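The bandwidth idea behind a min-max graph can be sketched in a few lines of code. This is a minimal illustration only, assuming a moving window over one learner's repeated scores; the window size, the function names, and the scores are hypothetical and are not taken from any of the studies cited.

```python
def moving_min_max(scores, window=5):
    """Return the minimum and maximum score within each full moving window."""
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and len(scores)")
    mins, maxs = [], []
    for start in range(len(scores) - window + 1):
        chunk = scores[start:start + window]
        mins.append(min(chunk))
        maxs.append(max(chunk))
    return mins, maxs


def bandwidths(scores, window=5):
    """Width of the min-max band per window; a wider band means more variability."""
    mins, maxs = moving_min_max(scores, window)
    return [hi - lo for lo, hi in zip(mins, maxs)]


# Hypothetical scores for one learner across ten writing samples.
scores = [2, 3, 2, 5, 3, 6, 4, 7, 7, 7]
print(bandwidths(scores, window=3))
```

Plotting the two lists returned by `moving_min_max` against time would give the band itself; the sketch stops at the numbers so that it needs no plotting library.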
Complex Dynamic Systems Theory 255
In addition, evidence is found in frequently occurring patterns in longitudi-
nal corpora of learner speech or writing, which can provide useful signposts for
tracing the trajectory of a dynamic system. Because of the sensitivity of the sys-
tem, one, perhaps unanticipated, pattern can trigger a turning point and cause
the system to veer in a different direction. Looking for these unanticipated
patterns is important because they can initiate a phase shift in the learner’s lan-
guage resources, often resulting in a bifurcation (Evans, 2018), a characteristic
pattern in a complex dynamic system.
Research on individual differences from a CDST perspective has been
conducted as well (Dörnyei, MacIntyre, & Henry, 2015). One technique that
produces evidence of individual differences is retrospection or retrodiction.
Rather than predicting the future from the present, retrodiction uses present
evidence to look back in someone’s developmental history to see if the present
state of affairs can be explained by the past. In other words, prediction involves
forward inference; retrodiction requires backward inference from present data.
Researchers employ retrodiction with the expectation that they will find evi-
dence of past events or patterns, which can explain the present (Dörnyei, 2014;
Larsen-Freeman & Cameron, 2008).
Of course, not all factors in a complex system can be anticipated and identi-
fied, let alone controlled for. This makes true experiments, in which the aim is
to control all factors but one, an unpromising method for producing evidence
in support of complex systems. This problem is compounded by the nonlin-
earity of the system and the fact that each factor does not make a uniform con-
tribution over time. Therefore, Byrne and Callaghan (2014, pp. 6–7) point to
the inadequacy of commonly employed statistical techniques when it comes to
the study of complex systems. Forgoing the usual statistical means used to gen-
eralize does not mean that generalizability is impossible. Case studies provide
evidence that may not reveal much about a population of language learners,
but they do have a direct bearing on theory (van Geert, 2011, p. 276). In other
words, generalizability from single case studies can relate to how they link to an
underlying theory. Moreover, in the history of SLD, there are many examples
of how single case studies have acted as “correctives” in challenging generaliza-
tions present in extant theories (Larsen-Freeman, 2019).
Furthermore, Morin (2008) distinguishes between general complexity and
restricted complexity, the former being philosophical and the latter method-
ological. From a restricted complexity perspective, evidence can come from
computer simulations of complex systems; formalisms can be used to model
complex systems. Evidence stemming from such models always needs to be held
accountable to “authentic data,” and such models are limited in that they are
decontextualized; however, such simulations have been useful in investigations
of language phenomena by allowing the exploration of different hypotheses. To
cite two examples, Ellis with Larsen-Freeman (2009) used a simple recurrent
network to model the emergence of verb-argument constructions as generalized
linguistic schema, and Caspi and Lowie (2013) built a model that supports the
hypothesis that complex interactions between levels of vocabulary knowledge
account for the gap between reception and production in L2 vocabulary use and
learning. Moss (2008) adds that combining the precision of modeling with the
richness of narrative scenarios holds promise in providing evidence in the study
of complex dynamic systems. Other combinations that have been recommended
are Conversation Analysis with longitudinal studies of L2 development, and the
latter with corpus linguistics (Larsen-Freeman & Cameron, 2008). Recently,
building on Larsen-Freeman and Cameron’s (2008) “complexity thought mod-
eling” (a way to use CT to think about a problem or situation), a “dynamic
ensemble” of research methods (nine considerations intended to inform research
design) has been proposed (Hiver & Al-Hoorie, 2016, 2020).
Common Misunderstandings
A possible source of confusion is that the genesis of CT lies in the physical sci-
ences. For this reason, some might find it inapplicable to more human concerns,
such as language development. However, this concern can be put to rest once
it is clear that the explanatory power of the theory extends beyond the physical
sciences. For instance, Larsen-Freeman (2019) has used CDST to make a case
for the importance of human agency in L2 development. Moreover, Byrne and
Callaghan (2014) assert “that much of the world and most of the social world con-
sists of complex systems and if we want to understand it we have to understand it
in those terms” (p. 8). For this reason, it is well suited for the transdisciplinarity
required to understand L2 development (Douglas Fir Group, 2016). Indeed, some
call CT a metatheory (Larsen-Freeman, 2017) or paradigm, which radically shifts
the underlying ontology and epistemology of disciplines from reductionism to
emergentism (Wheeler, 2006), atomism to holism, and from a Cartesian mecha-
nistic worldview to a relational ecological one (Overton, 2013). While reduction-
ism has led to much success through its investigation of simple linear cause–effect
links, there is a need for complementary systems-level investigations that explore
the situated interactions among the components of a complex system.
The other, perhaps most prevalent, misunderstanding is that “complex”
means “complicated.” It does not. A complex system may be made up of many
heterogeneous components, but what is of interest is the complex, ordered
behavior that arises from their interactions. In other words, “complex” relates
to the emergence of order and structure from the interactions of components
while the system is simultaneously interacting with its environment.
An Exemplary Study: Lowie and Verspoor (2018)
As has already been mentioned, one of the issues that has been brought to light
by CDST is the individual nature of the developmental trajectories. Of course,
this is not an entirely new insight, and certainly anyone who teaches knows
how differently students perform. The point not to be missed here, though, is
that typically both teachers and researchers make generalizations at the group
level that are assumed to hold at the individual level. For instance, teachers
might talk about a certain class of students being difficult, though they obvi-
ously don’t mean everyone in the class. Similarly, researchers aim to generalize
to a population from the sample that they study. For instance, researchers who
study individual differences speak in general terms of which factors influence
learner motivation. To offer another example, researchers who study interactive
tasks often attempt to identify which task designs are most effective in engaging
all learners in meaningful exchanges.
While generalizing at the group level is an important pursuit in research,
CDST warns against the common practice of making inferences about individuals
based on the behavior of groups that they are members of. Doing so violates
what is called the ergodicity principle (Molenaar & Campbell, 2009). The
ergodicity principle simply states that such an inference is valid only under
two special conditions: homogeneity of the participant sample (i.e., the means
and other statistics describing the data should not vary across individual
participants) and stationarity (i.e., the mean and variance should not change
between the measurements). These conditions are not likely met in studies of
language development.
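A small numerical sketch may help show why group-level findings need not hold for individuals when the sample is not homogeneous. The three learners and their values below are invented for illustration (they are not data from any study cited here), and the example addresses only the homogeneity condition, not stationarity.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series with nonzero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


# Hypothetical repeated measures: (motivation ratings, writing scores) per learner.
learners = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Group-level view: pool every observation and correlate.
pooled_x = [x for xs, _ in learners.values() for x in xs]
pooled_y = [y for _, ys in learners.values() for y in ys]
group_r = pearson(pooled_x, pooled_y)

# Individual-level view: correlate within each learner separately.
within_r = {name: pearson(xs, ys) for name, (xs, ys) in learners.items()}

print(group_r)
print(within_r)
```

In this constructed case the pooled correlation is strongly positive, yet the relationship within every individual learner is negative, so an inference from the group to any one member would be mistaken.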
To further assess the ergodicity principle in a study of second language
development, researchers Lowie and Verspoor (2018) looked into the role of
motivation and aptitude at both a group level and in longitudinal case stud-
ies. The participants in their study were 22 Dutch learners of English at the
secondary level, aged 12–13. They lived in a small town in the North of the
Netherlands and attended the same school. For this reason and the fact that they
were all enrolled in a special bilingual program, the researchers expected the
participants to constitute a relatively homogeneous sample.
Samples of the participants’ writing on a topic of their choosing were col-
lected every other week during the school year for a total of 23 samples for all
participants. The samples were anonymized and randomly sequenced. Each
sample text was then scored holistically and analytically. For the former, each
text was scored independently by three trained raters for its relative complexity,
accuracy, and fluency, using a 5-point scale from strongest writing to weakest.
For the analytic measurements, the texts were scored on measures of syntactic
complexity, phrasal complexity, lexical complexity, and lexical sophistication.
In addition, participants’ motivation was assessed using a 6-point Likert scale
with each of nine statements, and aptitude was determined by the general Dutch
Cito score. Information on other factors that might contribute to L2 develop-
ment was obtained, such as participants’ amount of out-of-school exposure to
English and their proficiency level of L2 English at the onset of the study.
Not surprisingly, the overall writing scores improved between the first and
the last texts produced, but the variability patterns over time for the individual
learners turned out to be quite different. Because Molenaar and Campbell
(2009) did suggest that generalization to the wider population might be
achieved “through the identification of subsets of similar individuals” (p. 116),
the researchers took an additional step of creating two subgroups that were
maximally homogeneous in terms of motivation and aptitude (and other fac-
tors such as initial proficiency and out-of-school exposure). However, even
after constituting very small groups, each group having only three members,
group members exhibited different developmental paths from the beginning
to the end of the school year. None of the individual differences significantly
predicted the final ratings, although the lexical measures did. Moreover, a
higher degree of variability coincided with higher overall proficiency gains.
This finding is consistent with the CDST position that variability is facilitative,
even necessary, for development.
In sum, cross-sectional group studies give us valuable information about
the relative weight of factors that may play a role in L2 development. How-
ever, researchers must bear in mind that the findings may not hold for a lon-
ger period of time and cannot predict much about an individual’s behavior at
any one point in time (Larsen-Freeman, 2006b). This is because both of the
ergodicity requirements are violated: a randomized group is most probably not
homogeneous, and the data are not stationary. On the other hand, longitudinal
case studies can tell us something about how an individual’s language resources
develop, but the findings may not hold for other learners in other contexts.
Moreover, even the variable nature of development for an individual means
that the development is not predetermined and not completely predictable,
but does evince a developing system. CDST approaches focus on describing
the process of development, and they include measures of the learner’s variable
performance over time as important sources of information in these descrip-
tions (Lowie, Verspoor, & van Dijk, 2018, p. 105). By so doing, they highlight
not only the dynamic nature of the process, and its ubiquitous variability, but
also its individual nature and the need for repeated observation and sampling.
Explanation of Observed Findings in SLA
Observation 1: Exposure to input is necessary for L2 acquisition. It goes without
saying that learners have to be exposed to ambient language to expand their
language resources. However, the term “input” is problematic, dehumanizing
the learner, making responsibility for learning appear to be unidirectional by
overlooking the learner’s agency, metaphorizing him or her as a computer, and
necessitating all manner of terminological qualifications in terms of “intake,”
“uptake,” and “output.” In contrast, it may very well be the case that, just as it
is in child language development (Smith, Jayaraman, Clerkin, & Chen, 2018),
agentive learners construct their own tacit curricula for language learning as
they seek to make meaning with the new code.
Observation 1 also draws a line between the learner and the environ-
ment, which is antithetical to a CDST perspective. In contrast, the concept
of affordance reunites the two. Affordances are realized in the interaction
between organisms and objects in the environment (Bærentsen & Trettvik,
2002). In this way, affordances are opportunities for semiotic action in the
ecosocial environments (as perceived by learners) that can motivate agents to
act and co-act (Zheng & Newgarden, 2012).
A way to word this is to say that the learner’s language resources develop
from experience, a product of the learner’s perception of affordances and the
need for these specific resources in the environment. The language of the
environment or ambient language does, therefore, play a role in their shape.
But the point is that it does not determine them, nor does it define the learning
trajectory. If it did, there would be no way to account for the individual devel-
opmental paths that learners take.
Observation 3: Learners come to know more than what they have been exposed to
in the input. First, complex systems are sensitive to initial conditions, and L2
learners are not blank slates. All come knowing one other language, and
some come knowing several, which should help to narrow the hypothesis
space for learning. Often our learners are treated as if they were speakers of one
language learning to be monolinguals of another (Ortega, 2018). In contrast to
this depiction, plurilingual learners, especially, can use inference to go beyond
the language data to which they have been exposed (e.g., Todeva, 2010). There
are also some powerful claims about the ability of learners of all ages to extract
the organizational structure of sequences from the language and to generalize
these learned patterns to novel instances through statistical learning (Aslin &
Newport, 2012). In other words, those universal characteristics of language,
such as structure dependence, may well be extractable from ambient language
sequences and not built in, although individual differences in the ability to
extract patterns have been found (Misyak & Christiansen, 2012).
Second, learners come to be able to use more than what they have been
exposed to because they are creative. However, the creativity does not reside
in the linguistic system; rather, it is in the learners’ relationship with the envi-
ronment, including their interlocutors. In other words, the creativity is not a
property of the linguistic system itself but rather is a property of agents’ behav-
ior in co-regulated interactions (Shanker & King, as cited in de Bot, Lowie, &
Verspoor, 2007, p. 10). Through analogizing and recombining, learners create
new patterns, presumably by soft-assembling or cobbling them together at a
particular time for a particular purpose, using what language resources are
accessible in the moment, often made so by interactions with others.
Observation 4: Learners’ output (speech) often follows predictable paths with predict-
able stages in the acquisition of a given structure. Humans are superb pattern detec-
tors. Certain patterns are more detectable due to their frequency and salience.
In addition, because we are meaning-making beings, we will be attracted to
those patterns that afford us the most communicative potential. Thus, patterns
that are both nonsalient and semantically redundant are likely to develop later.
Attested stages could also be due to developmental constraints, such as marked-
ness, or processing constraints (see Chapter 9). Presumably learners have many
demands competing for their attention; therefore, it should not be surprising
that they cannot attend to everything at once.
However, CDST would confer more agency on learners than what these
observations suggest. For instance, Eskildsen (2012) clearly demonstrates that
the putative stages of acquisition of syntactic structures previously reported are
not followed uniformly, but rather, are influenced by learners’ present interac-
tional goals and learners’ perception of affordances in the environment. One of
Eskildsen’s participants, a Spanish-speaker learning English, used you no+verb
as in you no write frequently during one lesson. The learner’s reliance on this
phrase may have resulted from the interaction between the initial conditions (of
Spanish), which led the learner to misperceive a negative structure in English as
being similar to Spanish, and to the abundant presence of the negative particle
“no” in the English ambient language. Thus, because his L1 encouraged the
adoption of a particular form and because he was motivated to communicate a
particular message, he made use of the resources he perceived in the language
he was learning, despite their ungrammaticality from a target-language perspec-
tive. The study showed that one’s language resources are locally constituted, and
therefore do not always conform to the putative stages and that locally contextu-
alized interactions can influence L2 development. It is clear, concludes Eskildsen,
that, as CDST would have it, local usage and long-term learning are inseparable.
Also in keeping with CDST, in relation to her Chinese learners of English,
Tasker (2013) observed:
Seemingly unpredictable differences in the behaviour of systems, or of
elements within systems, can be caused by tiny differences (sometimes
imperceptible or apparently insignificant) in their very initial stages.
(p. 137)
Observation 5: Second language learning is variable in its outcome. Iteration in a com-
plex system introduces heterogeneity; it generates variability (Larsen-Freeman,
2012, 2017). Pavlenko and Jarvis (2002) write that “the evidence of bidirectional
transfer underscores the unstable nature of ‘native-speakerness’” (p. 210). If lan-
guage use by native speakers is variable, there is no reason that the outcome of
second language learning should be any less so, which speaks to the issue of what
the target should be. With today’s sensibilities, it is clear that it should no longer
be isomorphism with native speaker norms (Larsen-Freeman, 2014). Grammar
books can be written summarizing norms of usage, but no single user conforms
to the norms. They are idealized abstractions from the collective:
A language is not a single homogeneous construct to be acquired; rather,
a complex systems view … foregrounds the centrality of variation among
different speakers and their developing awareness of the choice they have
in how they use patterns within a social context.
(Larsen-Freeman & Cameron, 2008, p. 116)
For instance, Eisner and Macqueen (2006) have shown how context influences
the pronunciation of phonemes. Indeed, it might be better said that complex
systems “have a set of potential states rather than a single determined state”
(Byrne & Callaghan, 2014, p. 19). The existence of variable outcomes is not
an aberration from a CDST perspective, but instead is both typical and useful.
As Larsen-Freeman and Cameron (2008) have observed, “when we make use
of genres in speaking or writing, we use the stabilized patterns but exploit the
variability around them to create what is uniquely needed for that particular
literacy or discourse event” (p. 190).
Awareness of variability also allows us to interpret the speech of others.
Eisner and Macqueen (2006) note that when we listen to others speaking, we
need to adjust our interpretation of their differences in articulation. Interest-
ingly, they claim that “the variability in the speech signal that is introduced by
speaker idiosyncrasies continues to be problematic for automatic speech recog-
nizers, but is usually handled with remarkable ease by the human perceptual
system” (p. 1950). This is so because perceptual representations of phonemes
are flexible and adapt rapidly to accommodate idiosyncratic articulation in the
speech of a particular speaker. In addition to creating options for speakers, and
for allowing them to interpret the speech cues of others, variation gives speak-
ers the resources to adapt their speech to that of others, and by so doing, achieve
either social congruence or distance (Larsen-Freeman, 2012).
Finally, as we have seen in the exemplary study, in CDST, variability of out-
comes is inevitable due to the complexity of the many interacting factors, which
change in their relation to each other over time. A CDST approach to language
development considers variability as indicative of change processes, not merely
measurement error (van Geert & van Dijk, 2002). Due to the observation that
variability is a necessary condition for change to take place, variability has been
termed the “motor of change” (Lowie & Verspoor, 2015).
Observation 6: Second language learning is variable across linguistic subsystems. In
CDST research on language development, there is a great deal of evidence for
variability across linguistic subsystems. As Hirsh-Pasek, Golinkoff, and Hollich
(1999) have shown for L1 acquisition, syntactic complexity, phonological com-
plexity, and frequency may be separate but dynamically interacting forces shap-
ing acquisition. A change in one leads to a change in another. This implies
that any account that focuses on one aspect only or only at one time cannot
but provide an oversimplification of reality. Only an account that incorporates
the dynamic interaction of multiple factors can inform an appreciation of the
actual complexity (de Bot, Lowie, & Verspoor, 2007, pp. 18–19). Indeed, it
is well known that learners rely on lexis before syntax, especially early on in
their language development. Then, too, Spoelman and Verspoor (2010) have
demonstrated that a correlation between accuracy and complexity may not
show significant results, yet a moving correlation of the same data can show
periods of strong correlations alternating over time.
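The contrast between a single overall correlation and a moving correlation can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Spoelman and Verspoor's analysis: the two series, the window size, and the function names are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series with nonzero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def moving_correlation(xs, ys, window):
    """Correlate the two series inside each full window as it slides over time."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


# Hypothetical accuracy and complexity scores for one learner over eight samples.
accuracy = [1, 2, 3, 4, 5, 6, 7, 8]
complexity = [1, 2, 3, 4, 4, 3, 2, 1]

overall = pearson(accuracy, complexity)
moving = moving_correlation(accuracy, complexity, window=4)

print(overall)
print(moving)
```

For these invented series the overall correlation is zero, while the moving correlation runs from strongly positive in the earliest window to strongly negative in the latest one, which is the kind of alternation a single coefficient conceals.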
Larsen-Freeman’s (2006b) L2 study bears out Hirsh-Pasek et al.’s finding for
L1. The English learners in this study clearly charted their own distinct paths
through state space when it came to the development of grammatical complex-
ity, lexical complexity, fluency, and accuracy in their writing over time. Learner
agency was apparent in the way that learners chose to focus on one subsystem over
another over the course of the study. Though they may not have consciously
favored one subsystem over the other, individuals seemed to perceive and seize
affordances to make progress in different subsystems. Of course, experience and
learning conditions matter as well. Polat and Kim (2014) hypothesized that their
participant improved in the areas necessary to communicate (lexical diversity)
rather than grammatical correctness because he was a naturalistic learner who
was not receiving any type of explicit grammar instruction.
Observation 7: There are limits on the effects of frequency on SLA. Years ago,
Larsen-Freeman (1975, 1976) reported a correlation between frequency of
occurrence of grammatical morphemes in English speakers’ speech and their
accuracy order in ESL learner language. Since then, a frequency effect has
been observed to be influential in language processing (Ellis, 2002), and fre-
quency, working in conjunction with other factors, has been seen to increase
the salience of a form, thus presumably helping to attract learners’ attention
to it (Goldschneider & DeKeyser, 2001). From a CDST perspective, the way
it would be stated today is that the frequency of forms in the language of the
environment forms deep attractors in state space.
However, from the same perspective, there are also limits to the effects of
frequency in L2 development. First of all, often the frequency of a given form is
determined when researchers consult a large corpus of natural language use. If
acquisition were determined by frequency of forms in this manner, then articles
and prepositions would be the first acquired since corpora show that the and
of are the two most frequently occurring forms in English. This is clearly not
the case. Therefore, L2 acquisition cannot only be about frequency matching
(Larsen-Freeman, 2002). A view from CDST is fundamentally learner cen-
tered. Frequency in a general corpus, even one constructed from second lan-
guage learner speech, is not necessarily the frequency with which a particular
learner experiences the form. Perceptions about language are state dependent,
and therefore local, not universal. When a learner becomes aware of a new form
or has a framework into which a new form fits and has a need for the form, the
new form becomes salient to the learner; otherwise, it is noise.
Second, embedded in this position is another limitation on frequency. Most
often, attributions of frequency (my own included) are attributions based
on structure. However, language exists for meaning making. Heightened
frequency of a construction over time leads to its becoming ambiguous (Zipf’s
principle of economy) and semantically bleached (Bybee, 2010). Thus, often it
is the less frequent, more irregular, more marked constructions that are more
meaningful and useful to the learner.
Finally, increasing frequency is a linear measure; however, there are periods
of nonlinearity, where increasing frequencies will seem to have no effect or, con-
versely, a radical one. One example is with type/token ratios. We know that high
token frequency promotes entrenchment, where little change in performance is
evident. Increasing type frequency, on the other hand, can stimulate a funda-
mental shift in the learner’s language resources, in keeping with the preceding
caveats (Ellis & Larsen-Freeman, 2006). Thus, according to CDST, frequency is
one factor in a complex process and is always relative to learner perception.
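The distinction between token frequency and type frequency can be made concrete with a short count. The sketch below assumes a hypothetical list of verbs that one learner used to fill the same construction slot; the list and the function name are invented for illustration.

```python
from collections import Counter


def type_token_summary(fillers):
    """Summarize how often a slot was filled (tokens) and by how many distinct items (types)."""
    counts = Counter(fillers)
    tokens = len(fillers)
    types = len(counts)
    return {
        "tokens": tokens,
        "types": types,
        "type_token_ratio": types / tokens if tokens else 0.0,
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical verbs observed in one construction slot across a learner's utterances.
fillers = ["write", "write", "write", "go", "write", "see"]
summary = type_token_summary(fillers)
print(summary)
```

Here a single verb accounts for most of the tokens, the situation associated above with entrenchment, whereas a rise in the number of distinct types would show up as a higher ratio.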
Explicit and Implicit Learning
Language learners can learn implicitly. Even neonates can tally probabilistic
patterns in the language spoken to them (Saffran, Aslin, & Newport, 1996), and
infants and adults demonstrate this ability for tone sequences (Saffran, Johnson,
Aslin, & Newport, 1999). Babies can use statistical learning “to detect units
within continuous, dynamic events,” and “the ability to segment these units
is critical not only for interpreting meaning in the flux and flow of events,
but also for language learning” (Roseberry, Richie, Hirsh-Pasek, Michnick
Golinkoff, & Shipley, 2011, p. 1424).
With older second language learners, for whom it is a challenge to associate
meaning with forms, let alone learn to use the forms appropriately, the issue
of learning implicitly becomes more complicated, and research results have
been mixed (Hama & Leow, 2010; Williams, 2005). Nevertheless, the case
for implicit learning remains strong as successful L2 development among some
untutored immigrants attests.
Segmenting speech into units can also be learned explicitly when learners’
attention is drawn to them so that the learner perceives them. From a CDST
perspective, there are different paths to the same outcome; therefore, more
important than the simple dichotomy between explicit and implicit learning is
the contribution of either to the learner’s perception of affordances. In a com-
plex system, the present level of development is critically dependent on what
preceded it. Learning is motivated by an awareness of difference (Marton, 2006).
The perception of contrasts can be learner-generated or can be promoted by
others. However, the response by a learner depends on the learner’s history.
Then, too, especially in the case of plurilinguals, the perception of similarity
among languages can contribute to productive analogizing (Todeva, 2010). The
assumption cannot be made that both the learner and the other language users
are operating on the same system, so attempts to promote awareness may not
always afford an immediate opportunity for learning. That said, humans are
always learning to adapt, even when there is a mismatch between the learner and
available affordances. Learning is therefore continuous and always self-referential
(Larsen-Freeman, 2014), and there are multiple paths to the same outcome. We
learn how to learn, and this is what is important in L2 development.
Conclusion
CDST inspires a view of language that is not a fixed code but is rather an open
and dynamic meaning-making system, the learning of which is a sociocognitive
and ecological process, the latter in the sense that it insists on the context dependency
of learning. In the moment, embodied learners soft-assemble their language
resources, co-adapting to the immediate environment. As they do so, their
language resources change, as does the environment. Learning is not the taking
in of linguistic forms by learners. Instead, the language resources of learners
are emergent, mutable, variable, and self-organizing. Their development is self-
referential, not an act of conformity. Development is spurred by learners’ quest
for increasing functionality, enabled by the learners’ awareness of similarities and
differences, made perceptible by the affordances in the environment, and by a
continuing dynamic adaptation to a specific, but ever-changing, context.
Discussion Questions
1. Larsen-Freeman describes one of the themes of complexity theory as
“emergence.” What characteristics does this theory share with usage-based
theories (Chapter 5) that are also often described as emergentist?
2. Complexity theory rejects standard experimental design as “unpromising.”
Critics of this theory would claim that without imposing control on some
of the factors in a learning context, it is impossible to interpret research
findings. After reading this chapter, what is your view?
3. In what sense is variation “useful” from the perspective of CDST?
4. Compare the constructs of input and affordance.
5. How do the impact and importance of frequency differ in complexity the-
ory, usage-based approaches, and skill theory?
6. Read the exemplary study presented in this chapter and prepare a discus-
sion for class in which you describe how you would conduct a replication
study. Be sure to explain any changes you would make and what motivates
such changes.
Suggested Further Reading
Byrne, D., & Callaghan, G. (2014). Complexity theory and the social sciences: The state of the
art. Oxon, England: Routledge.
This book treats complexity from a social realist perspective.
de Bot, K. (2008). Introduction: Second language development as a dynamic process.
The Modern Language Journal, 92, 166–178.
An edited special issue of the Modern Language Journal, with contributions from
various researchers on the value of seeing language development dynamically.
Dörnyei, Z., MacIntyre, P. D., & Henry, A. (2015). Motivational dynamics in second lan-
guage learning. Bristol, UK: Multilingual Matters.
Applies the lessons and methods of CDST to the individual difference factor of
motivation.
Larsen-Freeman, D., & Cameron, L. (2008). Complex systems in applied linguistics.
Oxford, England: Oxford University Press.
Introduces complex systems thinking and applies it to language, language devel-
opment, discourse, and classroom interaction.
Ortega, L., & Han, Z.-H. (Eds.). (2017). Complexity theory and language development: In
celebration of Diane Larsen-Freeman. Amsterdam: John Benjamins.
Updates and expands upon CT, tracing its genealogy and illustrating its many
applications.
Verspoor, M., de Bot, K., & Lowie, W. (Eds.). (2011). A dynamic approach to second lan-
guage development. Amsterdam, Netherlands: John Benjamins.
A description of research methods and techniques that can be used to study
dynamic systems.
References
Andersen, R. (1984). The one to one principle of interlanguage construction. Language
Learning, 34, 77–95.
Aslin, R., & Newport, E. (2012). Statistical learning: From acquiring specific items to
forming general rules. Current Directions in Psychological Science, 21, 170–176.
Baird, R., Baker, W., & Kitazawa, M. (2014). The complexity of ELF. Journal of English
as a Lingua Franca, 3, 171–196.
Bærentsen, K. B., & Trettvik, J. (2002). An activity theory approach to affordance. In
O. W. Bertelsen, S. Bødker, & K. Kuutti (Eds.), NordiCHI 2002: Proceedings of the
second Nordic conference on human-computer interaction (pp. 51–60). Aarhus, Denmark:
ACM-SIGCHI.
Becker, A. L. (1994). Repetition and otherness: An essay. In B. Johnstone (Ed.), Repetition
in discourse: Interdisciplinary perspectives (Vol. 2, pp. 162–175). Norwood, NJ: Ablex.
Bybee, J. (2006). From usage to grammar: The mind’s response to repetition. Language,
82, 711–733.
Bybee, J. (2010). Language, usage and cognition. Cambridge, England: Cambridge
University Press.
Bybee, J., & Hopper, P. (Eds.). (2001). Frequency and the emergence of linguistic structure.
Amsterdam, Netherlands: John Benjamins.
Byrne, D., & Callaghan, G. (2014). Complexity theory and the social sciences: The state of the
art. Oxon, England: Routledge.
Byrnes, H. (2013). On the way to meaning-making: Language education and applied
linguistics. In Distinguished scholarship and service award lecture presented at the American
Association for Applied Linguistics Conference. Dallas, Texas, March 18, 2013.
Cameron, L. (2007). Patterns of metaphor use in reconciliation talk. Discourse and
Society, 18, 197–222.
Caspi, T., & Lowie, W. (2013). The dynamics of L2 vocabulary development: A case
study of receptive and productive knowledge. Revista Brasileira de Linguística Aplicada,
13, 437–462.
Cooper, D. (1999). Linguistic attractors: The cognitive dynamics of language acquisition and
change. Amsterdam, Netherlands: John Benjamins.
Davis, B., & Simmt, E. (2003). Understanding learning systems: Mathematics educa-
tion and complexity. Journal for Research in Mathematics Education, 34, 137–167.
de Bot, K. (2008). Introduction: Second language development as a dynamic process.
Modern Language Journal, 92, 166–178.
de Bot, K. (2015). A history of applied linguistics. From 1980 to the present. London, England
and New York, NY: Routledge.
de Bot, K., & Larsen-Freeman, D. (2011). Researching second language development
from a dynamic systems perspective. In M. Verspoor, K. de Bot, & W. Lowie (Eds.),
A dynamic approach to second language development: Methods and techniques (pp. 5–23).
Amsterdam, Netherlands: John Benjamins.
de Bot, K., Lowie, W., Thorne, S. L., & Verspoor, M. (2013). Dynamic systems the-
ory as a comprehensive theory of second language development. In M. Mayo,
M. Gutierrez-Mangado, & M. Adrián (Eds.), Contemporary approaches to second lan-
guage acquisition (pp. 199–220). Amsterdam, Netherlands: John Benjamins.
de Bot, K., Lowie, W., & Verspoor, M. (2007). A dynamic systems theory approach to
second language acquisition. Bilingualism: Language and Cognition, 10, 7–21.
Dewaele, J.-M., & Li, C. (Eds.). (2018). Emotions and second language acquisition. Stud-
ies in Second Language Learning and Teaching, 8 (Special Issue), 15–19.
Douglas Fir Group. (2016). A transdisciplinary framework for SLA in a multilingual
world. Modern Language Journal, 100 (Supp. 2016), 19–47.
Dörnyei, Z. (2014). Researching complex dynamic systems: Retrodictive qualitative
modeling in the language classroom. Language Teaching, 47, 80–91.
Dörnyei, Z., MacIntyre, P. D., & Henry, A. (Eds.). (2015). Motivational dynamics in second
language learning. Bristol, UK: Multilingual Matters.
Eisner, F., & McQueen, J. M. (2006). Perceptual learning in speech: Stability over time
(L). Journal of the Acoustical Society of America, 119, 1950–1953.
Ellis, N. C. (2002). Frequency effects in language processing. Studies in Second Language
Acquisition, 24, 143–188.
Ellis, N. C., & Larsen-Freeman, D. (2006). Language emergence: Implications for
applied linguistics—Introduction to the special issue. Applied Linguistics, 27, 558–589.
Ellis, N. C., & Larsen-Freeman, D. (Eds.). (2009). Language as a complex adaptive system.
Boston, MA: Wiley-Blackwell.
Ellis, N. C., with Larsen-Freeman, D. (2009). Constructing a second language: Analy-
ses and computational simulations of the emergence of linguistic constructions from
usage. Language Learning, 59 (Suppl. 1), 90–112.
Elman, J. (2003). Development: It’s about time. Developmental Science, 6, 430–433.
Eskildsen, S. (2012). L2 negation constructions at work. Language Learning, 62, 335–372.
Evans, J. (2007). The emergence of language: A dynamical systems account. In E. Hoff &
M. Shatz (Eds.), Blackwell handbook of language development (pp. 128–148). Malden,
MA: Blackwell.
Evans, R. (2018). Bifurcations, fractals, and non-linearity in second language development: A
complex dynamic systems perspective (Unpublished Ph.D. dissertation). University of
Buffalo.
Feldman, J. (2006). From molecule to metaphor: A neural theory of language. Cambridge,
MA: MIT Press.
Gallagher, S. (2018). Decentering the brain. Embodied cognition and the critique of
neurocentrism and narrow-minded philosophy of mind. Constructivist Foundations,
14, 8–21.
Givón, T. (2002). Biolinguistics: The Santa Barbara lectures. Amsterdam, Netherlands:
John Benjamins.
Gleick, J. (1987). Chaos: Making a new science. New York, NY: Penguin Books.
Globus, G. (1995). The postmodern brain. Amsterdam, Netherlands: John Benjamins.
Goldschneider, J., & DeKeyser, R. (2001). Explaining the “natural order of L2 mor-
pheme acquisition” in English: A meta-analysis of multiple determinants. Language
Learning, 51, 1–50.
Halliday, M., & Burns, A. (2006). Applied linguistics: Thematic pursuits or disciplinary
moorings? Journal of Applied Linguistics, 3, 113–128.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending
Williams (2005). Studies in Second Language Acquisition, 32, 465–491.
Hawking, S. (2000, January 23). “Unified Theory” is getting closer, Hawking predicts.
San Jose Mercury News, p. 29A.
Hepford, E. (2017). Dynamic second language development: The interaction of complexity,
accuracy, and fluency in a naturalistic learning context (Unpublished Ph.D. dissertation).
Temple University, Philadelphia, PA.
Herdina, P., & Jessner, U. (2002). A dynamic model of multilingualism. Clevedon, England:
Multilingual Matters.
Hirsh-Pasek, K., Golinkoff, R. M., & Hollich, G. (1999). Trends and transitions in lan-
guage development: Looking for the missing piece. Developmental Neuropsychology,
16, 139–162.
Hiver, P., & Al-Hoorie, A. H. (2016). A dynamic ensemble for second language
research: Putting complexity theory into practice. Modern Language Journal, 100,
741–756.
Hiver, P., & Al-Hoorie, A. H. (2020). Research methods for complexity theory in applied
linguistics. Bristol, England: Multilingual Matters.
Holland, J. (1998). Emergence: From chaos to complexity. Reading, MA: Addison Wesley.
Hopper, P. (1998). Emergent grammar. In M. Tomasello (Ed.), The new psychology of
language (pp. 155–175). Mahwah, NJ: Lawrence Erlbaum.
Jaeger, F. T., & Ferreira, V. (2013). Seeking predictions from a predictive framework.
Behavioral and Brain Sciences, 36, 359–360.
Johnson, M., Turk-Browne, N. B., & Goldberg, A. (2013). Prediction plays a key role in
language development as well as processing. Behavioral and Brain Sciences, 36, 360–361.
Kretzschmar, W. (2009). The linguistics of speech. Cambridge, England: Cambridge
University Press.
Kretzschmar, W. (2015). Language and complex systems. Cambridge, England: Cambridge
University Press.
Larsen-Freeman, D. (1975). The acquisition of grammatical morphemes by adult ESL
students. TESOL Quarterly, 9, 409–419.
Larsen-Freeman, D. (1976, June). ESL teacher speech as input to the ESL learner.
Workpapers in TESL, 10. Los Angeles, CA: UCLA.
Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisi-
tion. Applied Linguistics, 18, 141–165.
Larsen-Freeman, D. (2002). Making sense of frequency. Studies in Second Language
Acquisition, 24, 275–285.
Larsen-Freeman, D. (2003). Teaching language: From grammar to grammaring. Boston, MA:
Heinle/Cengage.
Larsen-Freeman, D. (2006a). Second language acquisition and the issue of fossilization:
There is no end, and there is no state. In Z.-H. Han & T. Odlin (Eds.), Studies of
fossilization in second language acquisition (pp. 189–200). Clevedon, England: Multilin-
gual Matters.
Larsen-Freeman, D. (2006b). The emergence of complexity, fluency, and accuracy in
the oral and written production of five Chinese learners of English. Applied Linguis-
tics, 27, 590–619.
Larsen-Freeman, D. (2012). On the roles of repetition in language teaching and learn-
ing. Applied Linguistics Review, 3, 195–210.
Larsen-Freeman, D. (2013a). Transfer of learning transformed. Language Learning,
63(Suppl. 1), 107–129.
Larsen-Freeman, D. (2013b). Complex systems and technemes: Learning as iterative
adaptations. In J. Arnold & T. Murphey (Eds.), Meaningful action: Earl Stevick’s influence
on language teaching (pp. 190–201). Cambridge, England: Cambridge University
Press.
Larsen-Freeman, D. (2014). Another step to be taken: Rethinking the endpoint of the
interlanguage continuum. In Z.-H. Han & E. Tarone (Eds.), Interlanguage: Forty years
later (pp. 203–220). Amsterdam, Netherlands: John Benjamins.
Larsen-Freeman, D. (2015). Saying what we mean: Making a case for language acquisition
to become language development. Language Teaching, 48, 491–505.
Larsen-Freeman, D. (2017). Complexity theory: The lessons continue. In L. Ortega &
Z.-H. Han (Eds.), Complexity theory and language development: In celebration of Diane
Larsen-Freeman (pp. 11–50). Amsterdam, Netherlands: John Benjamins.
Larsen-Freeman, D. (2018a). Resonances: Second language development and language
planning and policy from a complexity theory perspective. In F. Hult, T. Kupisch,
& M. Siiner (Eds.), Language acquisition and language policy planning (pp. 203–217).
Cham, Switzerland: Springer.
Larsen-Freeman, D. (2018b). Second language acquisition, WE, and language as a complex
adaptive system (CAS). In P. De Costa & K. Bolton (Eds.), World Englishes, 37(1)
(Special Issue), 80–92.
Larsen-Freeman, D. (2019). On language learner agency: A complex dynamic systems
perspective. Modern Language Journal, 103(Special Issue), 61–79.
Larsen-Freeman, D., & Cameron, L. (2008). Complex systems in applied linguistics.
Oxford, England: Oxford University Press.
Lévi-Strauss, C. (1962). La pensée sauvage. Paris: Plon.
Lowie, W., van Dijk, M., Chan, H., & Verspoor, M. (2017). Finding the key to suc-
cessful L2 learning in groups and individuals. Studies in Second Language Learning and
Teaching, 7, 127–148.
Lowie, W., & Verspoor, M. (2015). Variability and variation in second language acqui-
sition orders: A dynamic reevaluation. Language Learning, 65, 63–88.
Lowie, W. M., & Verspoor, M. H. (2018). Individual differences and the ergodicity
problem. Language Learning. Online pre-publication (25 September 2018). Retrieved
from https://onlinelibrary.wiley.com/doi/abs/10.1111/lang.12324
Lowie, W., Verspoor, M., & van Dijk, M. (2018). The acquisition of L2 speaking: A
dynamic perspective. In R. Alonso (Ed.), Speaking in a second language (pp. 105–126).
Amsterdam, Netherlands: John Benjamins.
Macqueen, S. (2012). The emergence of patterns in second language writing: A sociocognitive
exploration of lexical trails. Bern, Switzerland: Peter Lang.
MacWhinney, B. (Ed.). (1999). The emergence of language. Mahwah, NJ: Lawrence
Erlbaum.
Makoni, B., & Makoni, S. (2010). Multilingual discourses on wheels and public English
in Africa: A case for vague linguistique. In J. Maybin, & J. Swann (Eds.), The
Routledge companion to English language studies (pp. 258–270). Abingdon, England:
Routledge.
Marton, F. (2006). Sameness and difference in transfer. Journal of the Learning Sciences,
15, 499–535.
Misyak, J., & Christiansen, M. (2012). Statistical learning and language: An individual
differences study. Language Learning, 62, 302–331.
Mitchell, S. D. (2003). Biological complexity and integrative pluralism. Cambridge, England:
Cambridge University Press.
Molenaar, P. C. M., & Campbell, C. G. (2009). The new person-specific paradigm in
psychology. Current Directions in Psychological Science, 18, 112–117.
Morin, E. (2008). On complexity. Cresskill, NJ: Hampton Press.
Moss, S. (2008). Alternative approaches to the empirical validation of agent-based
models. Journal of Artificial Societies and Social Simulation, 11. Retrieved from
http://jasss.soc.surrey.ac.uk/11/1/5.html
Mufwene, S., Coupé, C., & Pellegrino, F. (Eds.). (2017). Complexity in language:
Developmental and evolutionary perspectives. Cambridge, England: Cambridge University
Press.
Nelson, K. (2013). Editorial. Cognitive Development, 28, 175–177.
Overton, W. F. (2013). A new paradigm for developmental science: Relationism and
relational-developmental systems. Applied Developmental Science, 17, 94–107.
Ortega, L. (2005). For what and for whom is our research? The ethical as transformative
lens in instructed SLA. Modern Language Journal, 89, 427–443.
Ortega, L. (2018). SLA in uncertain times: Disciplinary constraints, transdisciplinary
hopes. Working Papers in Educational Linguistics, 33. Retrieved from
https://repository.upenn.edu/wpel/vol33/iss1/1.
Pavlenko, A., & Jarvis, S. (2002). Bidirectional transfer. Applied Linguistics, 23, 190–214.
Peters, A. M. (1977). Language learning strategies: Does the whole equal the sum of the
parts? Language, 53, 560–573.
Polat, B., & Kim, Y. (2014). Dynamics of complexity and accuracy: A longitudinal case
study of advanced untutored development. Applied Linguistics, 35, 184–207.
Roseberry, S., Richie, R., Hirsh-Pasek, K., Michnick Golinkoff, R., & Shipley, T. F.
(2011). Babies catch a break: 7- to 9-month-olds track statistical probabilities in
continuous dynamic events. Psychological Science, 22, 1422–1424.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-
old infants. Science, 274, 1926–1928.
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learn-
ing of tone sequences by human infants and adults. Cognition, 70, 27–52.
Smith, L. B. (2005). Cognition as a dynamic system: Principles from embodiment.
Developmental Review, 25, 278–298.
Smith, L. B., Jayaraman, S., Clerkin, E., & Chen, Y. (2018). The developing infant
creates a curriculum for statistical learning. Trends in Cognitive Sciences, 22, 325–336.
Spivey, M. (2007). The continuity of mind. Oxford, England: Oxford University Press.
Spoelman, M., & Verspoor, M. (2010). Dynamic patterns in development of accuracy
and complexity: A longitudinal case study in the acquisition of Finnish. Applied
Linguistics, 31, 532–553.
Tasker, I. (2013). The dynamics of Chinese learning journeys: A longitudinal study of adult
learners of Mandarin in Australia (Unpublished Ph.D. dissertation). University of
New England, Armidale, Australia.
Thelen, E., & Bates, E. (2003). Connectionism and dynamic systems: Are they really
different? Developmental Science, 6, 378–391.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition
and action. Cambridge, MA: MIT Press.
Thoms, J. J. (2014). An ecological view of whole-class discussions in a second language
literature classroom: Teacher reformulations as affordances for learning. Modern
Language Journal, 98, 724–741.
Todeva, E. (2010). Multilingualism as a kaleidoscopic experience: The mini universes
within. In E. Todeva, & J. Cenoz (Eds.), The multiple realities of multilingualism: Personal
narratives and researchers’ perspectives (pp. 53–74). Berlin, Germany: Mouton de Gruyter.
van Dijk, M., van Geert, P., Korecky-Kröll, K., Maillochon, I., Laaha, S., Dressler,
W. U., & Bassano, D. (2013). Dynamic adaptation in child–adult language interac-
tion. Language Learning, 63, 243–270.
van Geert, P. (2008). The dynamic systems approach in the study of L1 and L2 acquisi-
tion: An introduction. Modern Language Journal, 92, 179–199.
van Geert, P. (2011). The contribution of complex dynamic systems to development.
Child Development Perspectives, 5, 273–278.
van Geert, P., & Steenbeek, H. (2014). The good, the bad and the ugly? The dynamic
interplay between educational practice, policy and research. Complicity: An Interna-
tional Journal of Complexity and Education, 11, 22–39.
van Geert, P., Steenbeek, H., & van Dijk, M. (2011). A dynamic model of language
learning and acquisition. In M. Schmid, & W. Lowie (Eds.), Modeling bilingualism:
From structure to chaos (pp. 235–266). Amsterdam, Netherlands: John Benjamins.
van Geert, P., & van Dijk, M. (2002). Focus on variability: New tools to study
intra-individual variability in developmental data. Infant Behavior and Development,
25, 340–374.
van Lier, L. (2000). From input to affordance: Social-interactive learning from an
ecological perspective. In J. Lantolf (Ed.), Sociocultural theory and second language learn-
ing (pp. 245–259). Oxford, England: Oxford University Press.
Verspoor, M., de Bot, K., & Lowie, W. (Eds.). (2011). A dynamic approach to second language
development: Methods and techniques. Amsterdam, Netherlands: John Benjamins
Publishing Company.
Verspoor, M., Lowie, W., & van Dijk, M. (2008). Variability in L2 development from a
dynamic systems perspective. Modern Language Journal, 92, 214–231.
Wheeler, W. (2006). The whole creature: Complexity, biosemiotics, and the evolution of culture.
London, UK: Lawrence & Wishart.
Williams, J. (2005). Learning without awareness. Studies in Second Language Acquisition,
27, 269–304.
Zheng, D., & Newgarden, K. (2012). Rethinking language learning: Virtual world as a
catalyst for change. The International Journal of Learning and Media, 3, 13–36.
12
THEORIES AND LANGUAGE
TEACHING
Bill VanPatten
Those of us working in both second language (L2) theory/research and teacher
education/development often get asked this question: What does the theory
and research have to say to teachers? This is not a new question. In 1985,
Patsy Lightbown addressed this question in an essay titled “Great Expectations:
Second Language Research and Classroom Teaching.” Stephen Krashen has
addressed this off and on for a number of decades but perhaps most famously
in his 1982 book Principles and Practice in Second Language Acquisition. I have
addressed it recently in my book While We’re On the Topic. However, the idea
of linking L2 theory and research to language teaching can be traced to the
origins of contemporary L2 research in Corder’s (1967) essay on the signif-
icance of learners’ errors. Corder called on researchers to look at L2 devel-
opment in the same way linguists had begun looking at first language (L1)
development a decade prior. In that essay, he suggested that understanding
L2 development would ultimately be informative for language teaching. For
the reader unfamiliar with the history of L2 research, prior to the 1970s, ideas
about L2 development and their implications for teaching were based on the
prevailing notions in behaviorist psychology and were largely that: ideas. Little
was known about how L2s actually developed in the mind/brain of learners
and how this manifested itself in communicative events.
As it now stands, the relationship between theory/research and language
teaching depends on who you are talking to—and of course what theory and
what research you are talking about. It may surprise the reader when I say that
most L2 theory and research is irrelevant to language teaching, that most of what
is relevant to language teaching we’ve known for some 30 years. What I will
argue in this chapter is that there are some basic facts about language acquisition
that are of interest to teachers. These facts exist independently of any particular
theory. Teachers need to grapple with basic facts before grappling with any
particular theory. In order for the reader to understand such a claim, the present
chapter will:
• briefly review what theories are and what they are supposed to do
(i.e., quickly summarize key ideas from Chapter 1);
• offer some basic ideas about what teaching is using definitions from both
scholars and practicing teachers;
• highlight the different purposes of current theoretical development and
language teaching;
• summarize Lightbown’s basic claim about expectations and review some
basic “facts” about acquisition and what they might suggest to teachers,
showing how these facts exist independent of any particular theory.
What Are Theories and What Do They Do?
In Chapter 1 of this book, we discussed the nature of theories and their role in
research. Here I will review and expand a bit on some of those ideas. The reader
is advised that my approach in this chapter is to view theories as a scientific
endeavor and not a humanistic or cultural endeavor (e.g., Feminist Theory and
Queer Theory are not theories in the way that Quantum Theory and Evolution-
ary Theory are). This section is based on another publication, VanPatten (2018).
In science, a theory is a set of laws or statements set out to explain observed
natural phenomena (e.g., Kuhn, 1962; Popper, 2002). As we noted in Chapter 1,
a central purpose of any theory is to explain things we see around us. Let’s look
at the theory of evolution. For evolutionists, there are essentially two phe-
nomena in need of explanation. The first is why species vary within a given
related group. Why are some canaries completely yellow but some are yellow
and black? Why does human eye color vary among blue, brown, green, gray,
and shades in between (but not purple, red, or yellow)? The second phenomenon
in need of explanation involves inter-species differentiation. For example,
dolphins and humpback whales are both mammalian, live in the oceans,
and breathe through blowholes, but they are not of the same species. What
makes them different and how did they get that way? Evolutionary theory is an
attempt to explain such things that we see in the natural world.
Another related purpose of theories is to provide testable hypotheses. In other
words, a scientist must be able to articulate a hypothesis based on constructs and
laws within the theory and the scientist must be able to conduct research to test
that hypothesis. These testable hypotheses are indispensable for evaluation of
the theory itself. If the hypothesis is supported by research, then the theory is
supported. If sufficient evidence runs counter to the hypothesis, then the theory
is questioned and may be refined or replaced. Let’s return to evolution once
more. Two of the fundamental laws of the theory are (1) mutation occurs by
chance (at the genetic level) and (2) natural selection encourages mutations that
increase the possibility of reproduction. So, from these two laws we can state
a hypothesis that says, “Over successive generations of a species, a particular
change that increases survival, which in turn increases chances of reproduction
(passing along the genetic trait responsible for the mutation), will spread
through the population.” Traditionally, this hypothesis was “tested” through
the fossil record and reconstruction. Sometimes it was tested by observation of
naturally occurring conditions. In a series of studies, referred to as the Peppered
Moth studies, it was found that the environment favored one coloring of Peppered
Moths over another. Light-colored Peppered Moths thrived in wooded areas where
the bark was light; they were harder for birds to see and pick off. Dark-colored
Peppered Moths thrived in wooded areas where trees were covered by soot;
they blended better against the soot-darkened tree bark and thus were harder
for birds to see (e.g., Cook, Grant, Saccheri, & Mallet, 2012).
A decade or so ago, the hypothesis of natural selection was experimen-
tally manipulated and thus tested in a study on leg-length in lizards. Losos,
Schoener, Langerhans, and Spiller (2006) reported the findings of a study in
which a new predatory species of lizard, Leiocephalus carinatus, was introduced
into six islands in the Bahamas. The prediction was that the introduction of
this lizard would force a smaller terrestrial lizard, Anolis sagrei, to become arbo-
real in order to escape predation. As it became arboreal, A. sagrei’s legs would
become shorter in order to better accommodate its movement along rough tree
limbs and branches. One year after the introduction of the new predatory liz-
ard, the researchers compared hind leg lengths on A. sagrei to those of a control
population on other islands where the predatory lizard was not introduced.
As predicted, the lizards moved into tree habitats and after just 12 months
the researchers found a significant change in hind leg length. The legs indeed
became shorter. In both the moth study and the lizard study, the results support
the hypotheses. The support of the hypotheses in turn adds evidence for the
tenets of the theory.
In SLA, a theory—if it is to be a theory—should do the same as it does in
other scientific endeavors (e.g., Jordan, 2004; Long, 1990, 2007; see the over-
view in Chapter 1 of this volume). It should (1) provide a set of laws or state-
ments with (2) the purpose of explaining observed phenomena. And it should
provide means by which to test the theory itself through empirical research on
hypotheses derived from the theory. If a theory cannot do these things, it is not
a theory but something else (e.g., a framework, a hypothesis).
What Is Language Teaching?
Definitions of teaching and what it means to be a teacher abound. A simple
Google search reveals that the most widely used definitions of teaching come
from schools of education and from scholars who write about teaching and
education (e.g., Biesta & Stengel, 2016). The National Education Association
provides a position statement in the form of a PDF that can be found at
http://www.nea.org/assets/docs/HE/mf_ltbrief.pdf. And if you ask non-teachers,
they will offer their own definitions of teaching.
I won’t review all of the possible definitions here and instead note that a
recurring theme in all of them (whether from scholars in education or from
non-teachers) is this: teaching involves helping someone else learn. Of course,
this common thread among definitions raises two important questions. The first
is: what does it mean to learn something and how does that learning happen
(i.e., what does development look like over time)? The second is: learn what? In
considering language teaching, let’s examine the “what” first. We will return
to the issue of learning in a later section.
There are two ways to conceive of language teaching.1 One is that language teaching
is like the teaching of other subject matter.2 That is, the principles, pedagogy,
and methods used in schools of education to develop social science teachers,
math teachers, and English teachers are the same for French teachers, Spanish
teachers, and German teachers. Language is the object of some kind of explicit
teaching, learning, and testing in this scenario and the teacher’s job is to apply
the principles, pedagogies, and methods used to teach other subject matter
to the teaching of French, Spanish, and German. Underlying this scenario is
the assumption that language can be “taught” and its learning measured like any other
subject matter, with grades assigned to indicate “how well” someone did on what
was to be learned. It is probably fair to say that some version of this scenario
dominates most language teaching around the world, especially in those con-
texts in which the language is not spoken outside the classroom (e.g., English
in Osaka, French in Missouri, Spanish in Manchester, Arabic in Las Vegas). In
this scenario, the “what” of language teaching is to be found in textbooks that
organize units and lessons around grammatical points or structures, vocabulary
groups, and so on (especially in world language contexts). There is an assumed
and expected “scope and sequence” of language courses, for example, that is
reflected in the materials that teachers use.
Another way to conceive of language teaching is that language is not like
other subject matter; therefore, language teaching requires its own pedagogy.
Language teaching is different because language is different from math, social
science, and English. (Whether the others are different from each other is not
the question here.) In this scenario, language is typically not viewed as an object
to be learned explicitly and practiced but as an implicit “knowledge” to develop,
while skill (via communication) is also acquired over time. In this scenario, both
language and communication (not the same thing; see endnote 2) are viewed
as too abstract and complex to be taught and learned like subject matter. In this
scenario, language as mental representation (e.g., VanPatten & Rothman, 2014)
and communication (as defined by Sandra Savignon, Lyle Bachman, and Buzz
Palmer among others—see VanPatten, 2017a for an overview) cannot be found
on textbook pages. In such scenarios, traditional syllabi, methods, and text-
books are abandoned in favor of approaches that organize lessons around non-
linguistic units such as themes/questions, tasks, projects, stories, among others.
The “what” in such cases is not specified because the underlying approach is
that whatever language and communication are, they will develop as the learner
is exposed to level-appropriate communicative events over time.
Both of the above scenarios also hint at what teachers believe or understand
language acquisition to be. In the language-as-subject-matter scenario, some
version of explicit teaching and practice is involved. The level of explicitness
and the kind of practice may vary from teacher to teacher and curriculum to
curriculum but the fundamental similarity is the unstated belief that language
is learned like any other subject matter (largely because of the idea that lan-
guage is to be found in the pages of textbooks). In the language-is-different
scenario, teachers tend to believe that learners get data from the communicative
events they are involved in (i.e., they get input) and, from this, learners create
language in their heads on their own. However, in both cases, it is not clear
what teachers have as an operational definition of acquisition (and its processes),
what actually happens in the mind/brain of the learner, and what development
over time looks like. The two scenarios differ more on what the role of the teacher and the
curriculum is. And, of course, they generally lead to different views of
“how” to teach.
So, it is not so clear-cut that we can define language teaching in any
particular way. There seem to be two broad conceptualizations (i.e., language-
is-subject-matter and language-is-different) but like all applications of ideas
there is variation in how the concepts are realized by teachers. Let’s return to
the relationship between L2 theories and language teaching.
Different Purposes
It would seem from the previous sections that there is a largely unbridgeable
chasm between the purpose of theories and the purpose of language teach-
ing. Theories attempt to explain observed phenomena, in and out of class-
rooms. In the present volume, theories can be examined based on how well
they explain the ten basic observations about L2 acquisition found in Chapter
1. Language teaching, on the other hand, is about what to do in the classroom
and how this might help learners. My experience—if not almost everyone’s
experience—is that a good many people working in theoretical development
within SLA (including the nature of language and the nature of communi-
cation) do not engage in teacher education. And many academics involved
in teacher education do not engage in L2 theory development of the kinds
discussed in this volume and specified above in the discussion about the nature
of theories. (I underscore the use of “good many people/many academics” and
remind the reader that I did not say “all.”)
The situation just described is often the case in most disciplines. For example,
theorists working in evolution tend not to be involved in teacher education
related to biology. The authors of the “lizard study” cited earlier in this chap-
ter do not, as far as I know, work in K-12 teacher education. Likewise, those
working to develop future teachers of biology in schools of education tend not
to be involved in developing or testing evolutionary theory.
However, in teacher education, it is often the case that an undergraduate
is understood to develop content matter in a discipline while learning about
teaching and education in a school of education—at least in the context of
the United States. So the student interested in teaching biology develops an
understanding of biology from the biology department while learning about
curriculum development, adolescent behavior and cognition, assessment,
and other issues from the school of education. The same would be true for
teacher education students in math, history, and English, for example. This
scenario leads us to an interesting question: what is the “subject matter”
for the teacher education student of languages? Traditionally, this is seen as
course work in French, Spanish, German, English, and so on. But such pro-
grams do not focus on the nature of language, communication, or language
acquisition. These programs focus on literature and culture for the most
part. Where do teacher education students of languages get any background
in the nature of language, the nature of communication, and the nature of
language acquisition, for example? The answer is they tend not to. And yet it
is precisely a background in these areas that would help to shape the teacher
education student’s ideas about both what to teach (or what not to teach) and
how to teach.
In the next section, I outline some basic “facts” about L2 acquisition that
are relevant to language teaching and what they offer the teacher. The
reader will notice that I make no reference to any particular theory but instead
to some basic information about both language and language acquisition.
What Is Relevant to Teachers (Expectations Revisited)
I begin this section by referring the reader to Lightbown (1985), which was cited at
the outset of this chapter. In her essay, Lightbown essentially makes the following
observation: that at the time (early to mid-1980s) teachers were wondering what
the relevance of L2 research was for teachers. Did teachers have expectations that
L2 research might tell them what to do in class or what curriculum is best? Light-
bown suggested that this expectation could not be met—and yet, L2 research
did have something to offer teachers. She suggested that the research might lead
teachers to have different expectations—expectations about what their learners
could and could not do and possibly of the limits of instruction (my interpreta-
tion). As I reflect on Lightbown’s basic idea, information from L2 research could
create a “revolution” in the mind of the teacher who is largely uninformed about
L2 acquisition. The situation is akin to something like astronomy’s finding that
the Earth was not the center of the universe or evolution’s finding that modern
species, including Homo sapiens sapiens, are but the latest outcomes of processes that
began millions of years ago. These facts led individuals and whole
societies to rethink some of their basic notions. These facts became elements
of change, and change they indeed did effect.
What about language teaching? The ten observations in Chapter 1 are basic
findings about L2 acquisition that could serve as such information for lan-
guage teachers. For teachers who are not familiar with them, they could serve
as elements of change or at least reflection. In this section, I am going to bor-
row from some of those observations and discuss seven facts we know about
acquisition. I have selected these seven because in my many conversations
with language teachers across the United States, I find that these basic facts
are often unknown—and they tend not to appear in textbooks for language
education and they also tend not to be part of standards or certification tests in
the United States. In short, what Lightbown suggested some 30+ years ago as
relevant information for language teachers is information largely not known
by language teachers and—for the most part—not part of teacher education
in languages.
Fact 1. What winds up in learners’ heads is an abstract, complex, and implicit mental
representation. Language does not consist of the rules and paradigms presented in text-
books or other sources.
As evidenced in this volume, theories may differ in how they conceptualize
“language”—and some are actually agnostic or mute regarding the nature of
language. Still, there is consensus that what winds up in learners’ heads is not
what we see in textbooks. That is, textbook rules do not resemble the abstract
and complex nature of language that develops in both first and second language
learners. What this means for teachers is that language is not what is on page 32
of the textbook (VanPatten, 2017a).
To illustrate what I mean, I have sometimes used the example of differen-
tial object marking in Spanish (e.g., VanPatten, 2017a), erroneously called the
“personal a” in Spanish textbooks. Spanish has an object marker a that has no
meaning other than “the noun phrase following the verb (immediately dom-
inated by the verb) is the object of the verb.” A typical rule one might find in
a textbook or in a Google search is this (which will sound familiar to every
Spanish teacher in the United States):
In Spanish, when the direct object is a person, it is preceded by the prepo-
sition “a.” This word has no English translation. The personal “a” is not used
when the direct object is not a person or is an animal for which no personal
feelings are felt.
This rule does not match up well with what winds up in people’s heads.
Here are some sample sentences in Spanish based on the majority of dialects
around the world:
(1) María conoce a Juan.
“Mary knows John.”
(2) María conoce la materia.
“Mary knows the material.”
(3) ¿Conoces un buen médico?
“Do you know a good doctor?”
(4) Tengo una hermana.
“I have a sister.”
(5) El camión sigue al carro.
“The truck is following the car.”
(6) El chico asustó al coyote.
“The boy scared the coyote.”
For (1) and (2), our rule works well: John is a person and is the object of the verb
and is marked with a while the material is not a person and is not marked with
a. But (3) and (4) create problems for the rule because both a doctor and a sister
are persons but in this case neither is marked with a. In (5), a car is not a person
but is marked with a and in (6) the coyote is not a pet or animal with which
the boy has a personal relationship but is marked with a. (3) through (6) are
not exceptions to the rule, as some might claim. Languages tend to be highly
regular and congruent. Instead, sentences (3) through (6) reflect underlying
abstract features that govern the use of differential object marking in Spanish—
an abstraction that can’t be captured by a textbook or a rule obtained from a
Google search. Most, if not all, of the language is like this.
Again, to underscore the distinction between a fact and a theory, that lan-
guage is abstract, complex and implicit is a fact independent of any theory.
Different theories may describe the abstractness and complexity in different
ways—although they would concur on what it means for something to be
implicit (i.e., the content exists outside of one’s awareness). That is, theories
will “describe and explain” language in different ways, but they would not
disagree that what’s in our heads is not what is represented on textbook pages.
Fact 2. Communicatively embedded input is necessary for language acquisition. Such
input forms the “data pool” from which a linguistic system is constructed.
The basic data that learners need to construct a mental representation of
language are found in language that they hear, read, or see (in the case of sign
languages) during communicative events. It is language the learner processes for
its meaning, for its message. The fundamental role of communicatively embed-
ded input calls into question such things as practice, repetition, and explanation
(from teachers, textbooks, or the Internet) as the basic data for acquisition.
At first, the fundamental role of input was put forth as a hypothesis
(e.g., Corder, 1967; Hatch, 1983; Krashen, 1978) based on its role in first lan-
guage acquisition and based on the research on L2 learners that started to come
in the early and late 1970s. As research has accumulated since then, the fun-
damental role of communicatively embedded input has been established and
accepted as a basic fact of second language acquisition. Acquisition of mental
representation for language doesn’t happen without the kind of input talked
about here. And as in Fact 1, the role of input in language acquisition exists
independently of any theory. Theories may differ on how input is used by
learners to create language, but the importance of communicatively embedded
input is commonly accepted among all theories.
Returning to Fact 1, it is important to point out that when we refer to input
data, we are not referring to “rules” or “forms” in the traditional sense (e.g., text-
book rules, what students are told about language). Regardless of how theories
conceptualize how learning from input happens, the consensus is that there are
no rules or forms “out there in the input” for learners to take in. There are simply
raw data in the input on which the learner’s internal mechanisms operate. So, for
example, there is no rule for making yes/no questions in English in the input. There
are no rules for differential object marking in Spanish that we saw earlier. There
are no rules for the use of the passé composé and the imparfait in the input. There is no
rule in the input that keeps finite verbs in German in final position in embedded
clauses. There are only data from which something that looks like a rule (to us on
the outside) evolves over time. This point leads us to the next two facts.
Fact 3. The learner comes equipped to the task of language acquisition to process and
organize language data in particular ways.
Different theories posit different mechanisms responsible for language
acquisition, as the reader of this book is well aware. No matter what these
mechanisms are, a common thread is that something internal to the learner
organizes language in particular ways (see, e.g., Fact 4). Additionally, these
mechanisms are beyond the means of the teacher to change (Fact 7). Teachers
notice that what learners can do spontaneously with speech (or signing) does
not match what they have been taught and practiced. At the same time, they
often lament this (e.g., “Why can’t students apply what they have learned and
practiced?”). In addition, teachers tend to notice what they think are the links
between their instructional efforts and what learners eventually do, especially
on tests and writing exercises (e.g., “Now he gets it!”). However, what is
actually evolving in the student’s mind/brain seems to be operating
independently of instruction and testing.
What teachers generally don’t notice—because you’d have to be a specialist
trained to look for it—is that learners also come to know more about language
than what they are exposed to. In some circles, this is referred to as the pov-
erty of the stimulus problem (see Observation 3 in Chapter 1). Traditionally,
this means that, like L1 learners, L2 learners come to know such things as the
constraints on contractions described in Chapter 1. That is, learners come to
know when we can contract want to but there is nothing in the input that tells
them when we can’t do the contraction. That is, the input only contains exam-
ples of possible contractions. How do they come to know when contractions
aren’t possible? Theoretically, want + to should always be contractible: if you
get wanna enough in the input, why not wanna wherever you want? The same
with I’ve. Learners easily get input data that tell them I + have is contractible
but how do they come to know that Should I have done it? can never be *Should
I’ve done it? Something internal to the learner is organizing language so that in
English, across the board, something is allowing contractions in some places
and disallowing them in others.3 Both L1 and L2 learners of English come to
know this but they can’t have learned it from the input.
Here’s another example, albeit more complex and abstract. Spanish allows
null subjects in simple declarative sentences.
(7) ¿Escribió Bill este capítulo? Sí. De hecho, pro escribe mucho sobre este tema.
“Did Bill write this chapter? Yes. In fact he writes a lot about this topic.”
(8) ¿Por qué quieres pro leer esto?
“Why do you want to read this?”
At the same time, null subjects are required when there is no antecedent.
(9) weather: Está nevando/*Ello está nevando. “It’s snowing”
(10) time: Es la una/*Ello es la una “It’s one o’clock”
(11) impersonals: Es probable que…/*Ello es probable que… “It’s probable that…”
(12) existentials: ¿Hay razón para esto? ¿*Allí hay razón para esto? “Is there a
reason for this?”
(13) unidentified subjects: ¡Me robaron!/*¡Ellos me robaron! “I was robbed!/
They robbed me!” (“they” refers to an unknown person or persons)
In addition to the above phenomena, overt subject pronouns cannot take quan-
tified and negative antecedents. This is known as the Overt Pronoun Con-
straint or OPC (e.g., Montalbetti, 1984).
(14a) Cada profesor_i piensa que pro_i/j es muy inteligente.
(b) Cada profesor_i piensa que él_*i/j es muy inteligente.
“Each professor thinks he is intelligent.”
(c) El profesor_i piensa que pro_i/j es muy inteligente.
(d) El profesor_i piensa que él_i/j es muy inteligente.
“The professor thinks he is intelligent.”
The workings of null and explicit subjects in Spanish as illustrated in (7)
through (13) are readily available in the input. However, the OPC as illus-
trated in (14a-d) is not. Even early stage learners of Spanish show evidence
of the OPC constraining how they interpret sentences such as those in (14)
(e.g., Pérez-Leroux & Glass, 1999). As in the case of contractions in English,
how do learners come to know such things when there is no evidence for this
constraint in the input? This is what some scholars call the poverty of the stimu-
lus: the stimulus (input) doesn’t contain everything in order for the learner to
arrive at a mental representation of the language yet the learner does. There
are lots and lots of examples of this. What this observation points to is that
something is happening internal to the learner that pushes language along
particular paths and not others. In the next section, we will see how learners
organize language in stage-like and ordered ways over time that don’t neces-
sarily resemble what they hear and see in the input.
Fact 4. Language acquisition is largely ordered. It is piecemeal and stage-like, and gen-
erally not linear. The ordered nature of language acquisition is observed across learners.
Idiosyncratic aspects of acquisition and variation among learners do not affect these orders.
In both first and second language acquisition, we find that acquisition is not
some simple linear path with components of language simply piled on over
time. That is, learners don’t simply add B to A and then C to AB and then D
to ABC and so on until presto! Language! Instead, simply getting A reveals a
complex process with stages representing qualitative shifts in the mental repre-
sentation. And while learners are getting A, they’re beginning to get B (in part)
and possibly even C. Let’s briefly review a number of examples.
Negation in English. L2 learners of English are observed to traverse various
stages on their way to something that looks like negation in English. Briefly,
the stages are these.
Stage 1: negator + phrase (e.g., no hungry, no like ham)
Stage 2: subject + negator + phrase (e.g., I no hungry, I no like ham, Teacher don’t
like morning [where don’t is an unanalyzed chunk of language])
Stage 3: Modals (e.g., can, may) and auxiliary have creep in (e.g., I can’t eat that,
I haven’t studied)
Stage 4: analyzed auxiliary do comes under control (e.g., I don’t/do not like ham,
The teacher doesn’t/does not like mornings)
Ser and estar in Spanish. Spanish has two equivalents of the verb “be.” The
following stages have been uncovered.
Stage 1: no verb (e.g., Juan, uh, alto “John tall,” Elena no aquí “Helen not here”)
Stage 2: emergence of ser which is overextended to all contexts (e.g., Juan es alto,
Elena no es aquí, El chico es correr “The boy is running”)
Stage 3: emergence and predominance of estar as auxiliary (e.g., El chico está
corriendo “The boy is running”)
Stage 4: emergence and some control of estar with adjectives and location (e.g.,
Juan es alto “John is tall” but Elena no está aquí “Helen is not here” and El
profe está enojado “The prof is ticked off.”)
Singular before plural. A general finding across languages that overtly mark
plurality on nouns and verbs is that singular forms come in before plural forms.
For example, in Spanish we see something like these three broad stages for
learners using the verb tomar “to drink” as an example:
bare verb form, toma → tomo “I drink,” tomas “you drink” → tomamos “we
drink,” tomáis “you all drink,” toman “they drink”
There are dozens of other examples we could offer. To be sure, stages and
orders aren’t neat little steps in which a learner cleanly and abruptly moves
from one stage to the next. There is overlap and there is some variation within
a stage, just as there is in L1 acquisition. But the stages are discernible and they
appear to be universal (although learners may vary as to rate of acquisition, as
we will discuss for Fact 5).
What these stages suggest is that learners are pulling data from the input and
using it to create language in their heads—and that something internal to the
learner is pushing them in particular directions, hence the universality of so much
of language acquisition among learners. At each and every stage, an abstract and
complex system is developing in learners’ heads that does not directly resemble
the input and certainly does not look like textbook rules and information.
Fact 5. Learners vary as to their rate of acquisition and as to their eventual outcomes. But
this variation does not suggest differential acquisition mechanisms across learners.
Individual variation is a fact of life. Although two canaries may look alike to
us, with close scrutiny we see they are not identical. And although all planets orbit
the sun in much the same way, there are slight differences in how circular the
orbits actually are, for example, as well as the angle of each planet’s axis of rotation.
Language acquisition is no different. In children learning their first language, we
see that although they traverse the same stages in the acquisition of something
like negation in English, there are differences in the rate of acquisition. One child
might get through all the stages of negation by 3;0 (years;months) while another child takes
until 3;3 and yet another until 3;6 (all growing up in the same community).
In second language acquisition, we have observed the same phenomenon. There
are universal aspects of stages and ordered development, for example, but some
learners take longer than others to traverse those stages or orders.
Unlike L1 learners, who, if from the same community, converge on
the same mental representation of language (e.g., syntax, phonology, morphol-
ogy), L2 learners may vary one from the other in terms of how far they get.
Some learners sound native-like with the sound system, some don’t. Some are
native-like with vocabulary and formal features of language, some are not.
Technically, this may be an extension of the rate issue: because some learners
are “faster” than others, they get to an outcome sooner than others and, by the
time we measure them, it looks like there are different outcomes. But in reality,
this may mean that some just haven’t caught up yet to where the others are.
So individual variation in rate (and its consequence, outcome) is to be
expected in the L2 context. Learners are all building a mental representation of
language in much the same way; they just aren’t doing it at the same rate.
In addition to rate and outcome, there is some individual variation within
stages of acquisition. For example, in Stage 2 of the acquisition of ser and estar,
in which learners have yet to uncover the function of estar as an auxiliary, we
might get variation in how learners “solve the problem” of expressing progres-
sives while almost universally using ser as the default (and non-native) verb (e.g.,
Juan es correr, Juan es corriendo, Juan es corre “John is running”). And as learners
move toward incorporating estar as the auxiliary for progressives, we might see
variation as to which main verbs occur with estar and which don’t, possibly
based on frequency in the input. In short, learners could produce some “correct”
structures and some “incorrect” structures during a given stage of acquisition.
Fact 6. The L1 influences the L2 but is constrained in how it does this.
It is clear that the L1 somehow influences L2 acquisition. As speakers of
English, without seeing someone, we may be able to tell if their first language is
French, Spanish, Russian, German, or another. Before the days of empirical
research in SLA (i.e., prior to the 1970s), it was thought that the L1 was the
source of all “difficulties” and that the L1 was massively transferred into the L2
learning context. As research began to trickle in, many scholars adjusted this
claim toward the idea that L1 transfer or influence was constrained in some
way (e.g., Andersen, 1983; Dulay, Burt, & Krashen, 1982; Kellerman, 1983).
And to be sure, the universality of much of developmental stages and ordered
development (Fact 4) also suggests that L1 transfer is constrained in some way
(e.g., no matter how your first language “does negation” you will go through
the stages of the acquisition of negation in English like others although your L1
may influence the rate at which you traverse various stages).
How L1 transfer is constrained depends on the theory or framework of the
scholar. Some discuss the role of markedness (e.g., Eckman, 1977), others dis-
cuss constraints based on ideas from a generativist perspective (e.g., Vainikka &
Young-Scholten, 1996), others discuss strategies of input processing (e.g.,
VanPatten, 2015), others discuss constraints on strategies for output processing
(e.g., Pienemann, 1998), and still others discuss alternative ways in which L1
influence is constrained. To be sure, there is continued argument about what is
called the “initial state” in L2 acquisition (i.e., to what degree the L1 is trans-
ferred into the “hypothesis space” of an L2 at the outset of acquisition), but
even those who believe in full or massive transfer of the L1 at the beginning
note that many aspects of the L1 are quickly attenuated in acquisition, which
suggests a kind of constraint in that the L1 is not persistently transferred in all
areas during SLA. In short, even under a scenario of massive L1 transfer at the
outset, universals put the L1 in its place as the data from the input come in.
Again, to be clear to the reader, it is not the case that there is no L1 influence.
It is not the case that we can’t hear or see L1 influence on learner production in
spontaneous communication and in certain types of language processing tasks.
The point here is that the L1 works in tandem with universals of language
acquisition (however these are conceptualized). Where the universals of acqui-
sition and L1 transfer are not aligned, universals seem to win out.
Fact 7. Attempts to purposely alter or affect the processes of language acquisition are
severely constrained because language acquisition is constrained. In other words, the effects
of explicit instruction (and practice) are limited.
Ever since the beginning of contemporary L2 research, scholars have pondered
and debated the effects of instruction on form (grammar, pronunciation, vocabu-
lary, and so on) on the acquisition of language. As early as 1972, Selinker hypoth-
esized in his seminal essay on “interlanguage” that the effects of such instruction
would be minimal on the abstract and complex system developing in the learner’s
head (see VanPatten, 2014, for an update of Selinker’s original claim). To date,
here are two critical aspects of what the research on explicit instruction has shown.
• Explicit instruction and practice don’t alter ordered development (Fact 4).
Staged and ordered development doesn’t disappear with instruction.
Ordered development seems to always assert itself somehow (e.g., Ellis,
1989; Lightbown, 1983; Kessler, Liebner, & Mansouri, 2011; VanPatten,
2014; and many others).
• Explicit instruction and practice don’t obviate the role of input and the
internal mechanisms that create language. When I talk about the role of
input at workshops, some teachers claim that because they don’t have the lux-
ury of time, they can’t provide input to learners. Instruction is their “short-
cut” to acquisition. However, given the fundamental and critical nature of
input for acquisition (Fact 2) and the nature of what winds up in learners’
heads (Fact 1), it is clear that instruction cannot substitute for input. It might
be supplemental in some way, but it is not a substitute. Given that ordered
development is not affected by instruction in any significant way (see above),
input and the internal mechanisms for acquisition that act on input data seem
to override most instructional efforts. What is more, there is no theory in SLA
that has been able to link explicit instruction and explicit knowledge with
the acquisition of mental representation. That is, explicit knowledge cannot
turn into mental representation via practice (e.g., Schwartz, 1993; VanPatten,
2016). There are references in the literature to “interfaces” between explicit
knowledge and mental representation but just what these interfaces are or
even could be has never been delineated and remains vague.
There are meta-analyses that suggest that instruction is somehow beneficial
to language learning (e.g., Goo, Granena, Yilmaz, & Novella, 2015; Norris &
Ortega, 2000; Spada & Tomita, 2010). However, a number of drawbacks to
the research used in these meta-analyses have been pointed out, suggesting that
instruction mainly helps explicit knowledge and explicit learning but does not
affect the development of mental representation or even communicative ability
(e.g., Doughty, 2003; Truscott, 2004; VanPatten, 2017b).
To be sure, there may be other facts that other scholars would suggest should
be listed as “facts” here (e.g., Long, 1990; VanPatten, 2017b, and some intro-
ductory books on SLA) and we have not touched on all ten observations from
Chapter 1 of this volume. The seven facts provided here are not meant to be exhaustive
but suggestive. In short, the point of this chapter is to offer a sampling in
order to illustrate what Lightbown meant about expectations and impact of
research on teachers’ thinking. Taken together, just these seven facts could
and should lead teachers who believe in the “language-as-subject-matter”
approach to instruction to question to what extent what they do promotes
actual language acquisition. They might revise their expectations about how
learners learn and what they can learn. Likewise, the teachers who believe in
the “language-is-different” approach to instruction need to examine just why
they believe this and what the facts are that underlie such an approach. In short,
facts help teachers understand acquisition. Understanding acquisition leads to
informed decision making. Belief is replaced by knowing.
Back to Theories for a Moment
I remind the reader here that one of the charges of authors in the present
volume was to address (explain) the ten observations as part of the presentation
of each theory or framework. No doubt, the seven facts presented above could
or should be addressed in some way by each theory (and, as already stated, some
of the facts have considerable overlap with the ten observations). How theories
might address these facts is, in my opinion, of less concern to language teachers
than the facts themselves. Why would I say this? Aside from Lightbown’s obser-
vations about great expectations, it should be clear to the reader that no theory
included in this book can explain all of these facts or all of the observations
from Chapter 1. As suggested elsewhere (e.g., Rothman & VanPatten, 2013;
VanPatten, 2018), no single theory can account for the complex nature of SLA.
The reason we have multiple theories is that theories in SLA tend to look at different
things and from different perspectives. Only three theories in the present
volume are theories specific to the L2 context: input processing, processability,
and interaction (which is technically a hypothesis). The rest are theories
imported from other disciplines: linguistics, psychology, and education, mostly.
This suggests to me that depending on what one wishes to examine and what
one wishes to explain, different theories do different kinds of work. In the end,
we may see that multiple theories will co-exist as they actually explain different
aspects of acquisition, not all of acquisition. Because of this multiplicity, there is
the question of which theory or theories we want teachers to know and why we
want them to know this particular theory and not any of the others. Teachers
will want to know which theory is the “correct one” and miss the point that
there are basic facts about acquisition that need to be explained by any theory
and that these facts are useful in and of themselves.
This multiplicity makes the role of theories in teachers’ lives more compli-
cated than it needs to be. In short, the teacher in training and perhaps the prac-
ticing teacher with no background in L2 acquisition both need to understand
the basic facts of L2 acquisition before venturing into the “why” of theories.
And in the cases of competing “why’s,” the explanation does not obviate the
implications of the fact. That is, the facts exist independent of any particular
theory that might try to account for them. Let’s take the case of the fundamen-
tal role of input in SLA. One theory, such as generative linguistic theory, might
say that language is special and modularized and can only operate on linguistic
data in the input. Another theory, such as a usage-based approach, might say
that language is learned like anything else and constructed from data in the
input. In both cases, input is fundamental—and that is what the teacher needs
to know. It is not the theory per se that suggests to the teacher “get as much
communicatively embedded input into the curriculum as possible” but the sim-
ple fact that such input is necessary. The “why” is less important to the teacher
and to materials development than the fact in need of explanation.
Let’s look at one more example. We noted earlier that L1 influence
is constrained. However, one theory might explain constraints one way
(e.g., Processability Theory) and another might explain them in another way
(e.g., Input Processing) and yet another might explain them another way (e.g.,
Functional Approaches). What is relevant to the teacher is not so much the
explanation but the mere fact that L1 influence is constrained. This fact may
help teachers make decisions about, for example, what to focus on and what not
to focus on or what to expect and what not to expect.
Concluding Thoughts
I have worked in L2 research and theory development for over 30 years and my
formation as a scholar was born on the heels of the revolution in linguistics and
first/second language acquisition that occurred in the 1960s and 1970s. At the
same time, I have worked with language teachers at all levels of instruction in
the United States for that same time period. I have directed language programs
in which we have had to make curricular decisions. In those three decades, I
have come to understand both the needs of theory/research on the one hand and
the needs of teachers and students on the other. Because this chapter is about
the relationship between theories and language teaching, I have let my expe-
rience in these matters guide my thinking. This has led me to the thesis that
theory is less important for teachers than basic facts about language acquisition.
As I have said earlier, very often teachers just don’t know the facts of how L2
acquisition unfolds over time. Worrying about theory first puts the cart before
the horse. In this sense, I am standing on the shoulders of Patsy Lightbown by
saying that basic L2 research is relevant to teachers. And in agreement with her,
the relevance is not so much about what to do on Monday morning but instead
about how to conceptualize one’s expectations and how this influences decision
making. As such, I have focused on facts about language acquisition and less on
the theories that try to explain them. I have focused on facts about language
acquisition that in my experience have been “eye openers” for language teach-
ers. Once teachers grapple with the facts, they have reasons to explore options.
Nothing dictates to the teacher what to do, but the basics about L2 acquisition
allow teachers to question unstated assumptions and to fashion language teach-
ing in ways they might not have thought of otherwise.
With the above said, I should clarify that theories may be of interest to
teachers but not without the basic facts in need of explanation. Theories (i.e.,
understanding the “why” behind what we see in acquisition) can further push
teachers to think about practice. But I would argue that bringing theories
to teachers only makes sense once teachers understand the basic facts about
acquisition. After all, a theory’s job is to explain what we see—and only by
understanding the basic facts will teachers be able to judge which theory does
the best job of accounting for such facts. In short, theory may be relevant to
teachers in the end but only after teachers have a solid grasp of what acquisition
looks like.
To be sure, some of the contributors in this volume might disagree with me
on the relevance of theories to language teaching. “Oh, but my theory has much
to say to teachers.” Others might agree with me and echo with a “Hear! Hear!”
Others might not have an opinion. I remind the reader that such variations in
reaction may very well reflect what a particular theory assumes to explain and
what it excludes from its domain of inquiry. The reactions might very well
reflect the various disciplines from which a theory is imported (i.e., linguistic,
cognitive, educational). I would hope, however, that we would all agree that
the basic facts about language acquisition are of interest to everyone—theorists
(for one reason) and teachers (for another). It is the facts about language acqui-
sition that exist independently of any theory that teachers need to grapple with.
Acknowledgments
I would like to thank Michael Leeser, Reed Riggs, Luke Plonsky, Russell
Simonsen, Karen Lichtman, Paul Mandell, Stefanie Wulff, Dustin De Felice,
and Greg Keating for comments on the first draft of this chapter.
Discussion Questions and Projects
1. VanPatten’s basic argument is that understanding the “why” is secondary
to, or less important than, knowing the basic “facts” about L2 acquisition for
teachers and teachers in training. In what ways do you agree or disagree
with this argument?
2. Select one or two theories or frameworks in this book and determine what
implications there are for language instruction. Then do the same with the
seven facts in this chapter. How do your implications compare?
3. An important but unstated issue regarding the facts presented in this chap-
ter is that they are not derived from observations about how classroom
learners perform on paper-and-pencil tests or tests about what they have
explicitly learned in class. Instead, they are based on the kinds of data col-
lected by researchers. Review some of the theories in this book focusing
on the sections “What Counts as Evidence.” Do you think knowing about
how researchers study language acquisition would be of use to teachers? If
so, in what ways?
4. Conduct a survey of teacher education majors in world languages as well as
practicing teachers to see if they are familiar with the facts presented in this
chapter and to what degree they are familiar with them. For example, can they
provide examples? Can they illustrate? What are your results from this survey?
5. Conduct an Internet-based research study in which you examine the course
content and goals of courses in teacher education programs. Determine
whether students are required to take courses in the nature of language, the
nature of communication, or the nature of L1 and L2 acquisition as part of
their education.
Notes
References
Andersen, R. W. (1983). Transfer to somewhere. In S. Gass, & L. Selinker (Eds.),
Language transfer in language learning (pp. 177–201). Rowley, MA: Newbury House.
Biesta, G. J. J., & Stengel, B. S. (2016). Thinking philosophically about teaching.
In D. H. Gitomer, & C. A. Bell (Eds.), Handbook of research on teaching (5th ed.,
pp. 7–68). Washington, DC: AERA.
Cook, L. M., Grant, B. S., Saccheri, I. J., & Mallet, J. (2012). Selective bird predation
on the peppered moth: The last experiment of Michael Majerus. Biology Letters, 8,
609–612.
Corder, S. P. (1967). The significance of learners’ errors. The International Review of
Applied Linguistics, 5, 147–159.
Doughty, C. J. (2003). Instructed SLA: Constraints, compensation, and enhancement.
In C. J. Doughty, & M. H. Long (Eds.), The handbook of second language acquisition
(pp. 206–257). New York, NY: Blackwell.
Dulay, H., Burt, M., & Krashen, S. D. (1982). Language two. Oxford, England: Oxford
University Press.
Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learn-
ing, 27, 315–330.
Ellis, R. (1989). Are classroom and naturalistic acquisition the same? A study of the
classroom acquisition of the German word order rules. Studies in Second Language
Acquisition, 11, 303–328.
Goo, J., Granena, G., Yilmaz, Y., & Novella, M. (2015). Implicit and explicit instruction
in L2 learning: Norris and Ortega (2000) revisited and updated. In P. Rebuschat
(Ed.), Implicit and explicit learning of languages (pp. 443–482). Amsterdam, Netherlands:
John Benjamins.
Hatch, E. (1983). Simplified input and second language acquisition. In R. W. Andersen
(Ed.), Pidginization and creolization as language acquisition (pp. 64–86). Rowley, MA:
Newbury House.
Jordan, G. (2004). Theory construction in second language acquisition. Amsterdam,
Netherlands: John Benjamins.
Kellerman, E. (1983). Now you see it, now you don’t. In S. Gass & L. Selinker (Eds.),
Language transfer in language learning (pp. 112–134). Rowley, MA: Newbury House.
Kessler, J.-U., Liebner, M., & Mansouri, F. (2011). Teaching. In M. Pienemann & J.-U.
Kessler (Eds.), Studying processability theory (pp. 149–156). Amsterdam, Netherlands:
John Benjamins.
Krashen, S. D. (1978). Adult second language acquisition and learning: A review of theory
and practice. In R. Gingras (Ed.), Second language acquisition and foreign language
teaching. Washington, DC: The Center for Applied Linguistics.
Krashen, S. D. (1982). Principles and practice in second language acquisition. New York, NY:
Pergamon.
Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago, IL: The University of
Chicago Press.
Larsen-Freeman, D., & Tedick, D. (2016). Teaching world languages: Thinking differ-
ently. In D. H. Gitomer, & C. A. Bell (Eds.), Handbook of research on teaching (5th ed.,
pp. 1135–1387). Washington, DC: The American Educational Research Association.
Lightbown, P. M. (1983). Exploring relationships between developmental and instruc-
tional sequences in L2 acquisition. In H. W. Seliger, & M. H. Long (Eds.), Classroom
oriented research (pp. 217–245). Rowley, MA: Newbury House.
Lightbown, P. M. (1985). Great expectations: Second language research and classroom
teaching. Applied Linguistics, 6, 173–189.
Long, M. H. (1990). The least a second language acquisition theory needs to explain.
TESOL Quarterly, 24, 649–666.
Long, M. H. (2007). Problems in SLA. Mahwah, NJ: Lawrence Erlbaum Associates.
Losos, J. B., Schoener, T. W., Langerhans, B., & Spiller, D. A. (2006). Rapid temporal
reversal in predator-driven natural selection. Science, 314, 1111.
Montalbetti, M. M. (1984). After binding. On the interpretation of pronouns. Ph.D. disserta-
tion. Massachusetts Institute of Technology, Cambridge, MA.
Norris, J., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and
quantitative meta-analysis. Language Learning, 50, 417–528.
Pienemann, M. (1998). Language processing and second language development: Processability
theory. Amsterdam, Netherlands: John Benjamins.
Pérez-Leroux, A. T., & Glass, W. R. (1999). Null anaphora in Spanish second language
acquisition: Probabilistic versus generative approaches. Second Language Research, 15,
220–249.
Popper, K. (2002). The logic of scientific discovery. New York, NY: Routledge.
Rothman, J., & VanPatten, B. (2013). On multiplicity and mutual exclusivity: The
case for different SLA theories. In M. del Pilar García-Mayo, M. Junkal Gutiérrez-
Mangado, & M. Martínez Adrián (Eds.), Contemporary approaches to second language
acquisition (pp. 243–256). Amsterdam, Netherlands: John Benjamins.
Schwartz, B. D. (1993). On explicit and negative evidence effecting and affecting com-
petence and “linguistic behavior.” Studies in Second Language Acquisition, 15, 147–163.
Selinker, L. (1972). Interlanguage. The International Review of Applied Linguistics, 10,
209–231.
Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of
language feature: A meta-analysis. Language Learning, 60, 263–308.
Truscott, J. (2004). The effectiveness of grammar instruction: Analysis of a meta-
analysis. English Teaching & Learning, 28, 17–29.
Vainikka, A., & Young-Scholten, M. (1996). Gradual development of L2 phrase struc-
ture. Second Language Research, 12, 7–39.
VanPatten, B. (2014). On the limits of instruction: 40 years after ‘Interlanguage’. In
Z.-H. Han, & E. Tarone (Eds.), Interlanguage: 40 years later (pp. 105–126). Amsterdam,
Netherlands: John Benjamins.
VanPatten, B. (2015). Foundations of processing instruction. International Review of
Applied Linguistics, 53, 91–109.
VanPatten, B. (2016). Why explicit knowledge cannot turn into implicit knowledge.
Foreign Language Annals, 49, 650–657.
VanPatten, B. (2017a). While we’re on the topic…. Principles of contemporary language teach-
ing. Alexandria, VA: The American Council on the Teaching of Foreign Languages.
VanPatten, B. (2017b). Situating instructed second language acquisition within second
language acquisition: Facts and consequences. Instructed Second Language Acquisition,
1, 45–59.
VanPatten, B. (2018). Theories of second language acquisition. In K. Geeslin (Ed.),
The handbook of Spanish linguistics (pp. 649–667). Cambridge, England: Cambridge
University Press.
VanPatten, B., & Rothman, J. (2014). Against “rules.” In A. Benati, C. Laval, & M. J.
Arche (Eds.), The grammar dimension in instructed second language acquisition: Theory,
research, and practice (pp. 15–35). London, England: Bloomsbury.
GLOSSARY
Adaptive Adaptive systems change in response to their changing environments.
Affordances In Complexity Theory, as applied to second language develop-
ment, affordances are learning opportunities that learners perceive to exist
in the environment.
Attention The orientation of mental powers.
Automaticity (1) The end point of the process of automatization, character-
ized by the capacity to carry out a task at high speed, with a low error rate
and minimal interference from or with other tasks or new task conditions.
The latter is sometimes referred to as robustness. (2) The extent of routin-
ized control over (linguistic) knowledge.
Automatization The gradual improvement in speed, error rate, and effort
required that occurs as a function of task practice. The (virtual)
end point is automaticity. Sometimes automatization is used in the
sense of mere speed-up, but usually in this broader sense; for some (e.g.,
Segalowitz), it is only used to refer to changes as a result of practice that are
beyond mere speed-up.
Basal ganglia A group of highly interconnected structures deep in the brain.
Within each hemisphere, the basal ganglia include the caudate nucleus, the
putamen, and other structures (e.g., the globus pallidus).
Broca’s area A classical brain language area in the frontal lobe that is gener-
ally considered to include the opercular part and the triangular part of the
inferior frontal gyrus (these correspond largely to Brodmann’s areas 44 and
45, respectively). The area is named for the French scientist Paul Broca,
who first suggested in the 1800s that brain tissue in this region is involved
in language.
Co-adaptation A reciprocal social process whereby speakers adjust their
language resources to their interlocutors.
Co-optation In evolution and biology, the term co-optation (or co-option)
is often used to refer to the re-use of an existing trait (e.g., mechanism,
structure) for new function(s). One can think of this as an existing trait
being hijacked for new purposes.
Comprehensible input Input that is slightly above the level of the learner’s
current proficiency.
Construct Within a theory, a clearly defined feature or mechanism. For
example, in atomic theory, a proton is a feature and particle attraction is a
mechanism. Both are constructs within the theory.
Construction In Construction Grammar, constructions are conventionalized
pairings of form and meaning or function that range from morphemes to
words and abstract syntactic frames.
Content words/content lexical items Lexical items, or words, such as nouns,
most verbs (not auxiliaries and modals), adjectives, and adverbs, that are
used to express an object, process, or some nongrammatical meaning.
Contingency When the presence and/or specific realization of a form A
depends on the presence and/or specific realization of another form B,
then A is contingent on B. Contingency can vary in strength. Some cues
(like lightning → thunder) are highly predictive and so have a high con-
tingency, and other cues are less reliable (summer days → fine weather).
Corpus (pl. corpora) A large and structured collection of transcribed spoken
and/or written language data in digital format.
Cultural artifacts Physical objects and symbolic systems developed by
human societies over the course of their history that mediate (see Mediation)
their social and psychological behavior. Artifacts include physical
tools (e.g., hammers, saws, shovels, bulldozers, computers) and symbolic
systems (e.g., language, numbers, art, music, literature, sanctioned social
behaviors). Artifacts mediate through their use and not as objects in
themselves—hammering, not hammers; counting, not numbers; commu-
nicating, not language. It is important to remember that cultural artifacts
have psychological impacts on how people think.
Declarative knowledge Knowledge that can be explicitly expressed
(“declared”), such as a law of physics, a grammar rule, or a historical fact,
as opposed to knowledge that can only be performed (procedural knowl-
edge), such as how to swim or speak fluently. Sometimes called factual
knowledge, or knowledge that as opposed to knowledge how (procedural
knowledge).
Declarative memory Declarative memory is defined by the declarative/
procedural model as the learning and memory that rely on the medial tem-
poral lobe and its associated circuitry. It underlies knowledge of facts and
personal experiences, as well as other types of information. Knowledge
learned in this system can be explicit or implicit.
Developmental problem One of the two core issues to be addressed by a
theory of SLA (see also logical problem). It focuses on the question of why
learners follow a specific path in their L2 acquisition process.
Developmental trajectory Relates to the path L2 learners follow in their
acquisition process. This includes the developmental dimension, which is
characterized by universal stages that L2 learners pass through, and the
variational dimension, which captures individual learner variation within
the constraints of processability. The PT hierarchy of processing proce-
dures generates specific predictions for developmental trajectories.
Distributed versus massed practice Large versus minimal spacing between
successive instances of practicing a rule or retrieval of an item.
Double dissociation The demonstration that two experimental manipu-
lations have different effects on two dependent variables. For example,
grammar is impaired from a lesion to brain structure X but not Y, while
lexical abilities are impaired from a lesion to structure Y but not X. Or
grammar is associated with brain activation in structure X but not Y, and
vice versa for lexical processing.
Dynamic assessment A type of assessment based on the Zone of Proximal
Development (see Zone of Proximal Development) used to diagnose indi-
vidual and group abilities by introducing an instructional element, such as
the provision of hints, models, and leading questions, in order to deter-
mine learner responsiveness. It can be used in summative and formative
contexts of assessment.
Effortful comprehension Real-time nonfluent comprehension that causes a
hearer (usually a second language learner) to miss information in a speech
stream.
Emergence In the concept-oriented approach, a learner’s earliest expression
of a concept or use of a form. In dynamic systems theory, the spontaneous
occurrence of something new that arises when the components of a com-
plex system interact.
Emergentism A system is emergent if it is a new outcome of some other
properties of the system and their interaction, while it is itself different
from them.
Endstate The final grammar or linguistic competence achieved by a learner.
No further acquisition occurs beyond this point (with the exception of
vocabulary). Often referred to as ultimate attainment.
Exemplar Exemplars are specific examples of a category. The word “house,”
for instance, is an exemplar of the category NOUN in English.
Explicit learning Explicit learning is the learning of information in a con-
scious and often effortful manner.
Feature unification A central component of Lexical Functional Grammar.
The mechanism of feature unification ensures that the different parts of a
sentence fit together by merging the features that are present in the lexical
entries. Feature unification allows for the matching of features that are
conceptually related even if they occur in different parts of the sentence.
This mechanism accounts, for instance, for agreement, such as in The mon-
keys are in the forest, as the feature number = plural in the lexical entries of
monkeys and are is unified.
Form–meaning connections The matching of a linguistic form (such as a
word, a morpheme, or a structure) to a function/meaning/concept (such as
an action, a time reference, person, number, and so on). Same as function-
to-form mapping.
Functional load The information value of a linguistic form in context.
Genetic method The approach to scientific research proposed by Vygotsky
in which development of individuals, groups, and processes is traced over
time. The goal is to discover the contributions of cultural artifacts to psy-
chological development. The research can entail different temporal scales,
including ontogenesis, whereby children are studied in either natural or
laboratory settings to trace their ability to incorporate and eventually
internalize (see Internalization) cultural artifacts into their psychological
behavior; and the history of a society or even human culture as a whole, as
artifacts are created, modified, and abandoned over long stretches of time.
It also includes the reverse process, whereby adults with cerebral impair-
ment lose their ability to regulate their mental behavior.
Grammaticality judgments Judgments made regarding the possibility or
impossibility of certain sentence types. Grammaticality judgment tasks
typically include grammatical and ungrammatical sentences which partic-
ipants are asked to assess.
Higher mental processes Mental processes built on the foundation of lower
mental processes (see lower mental processes) as a consequence of the
appropriation and internalization of cultural artifacts that in turn convert
the lower processes from involuntary to voluntary, or mediated processes,
and organize them into a unified system of human consciousness.
Hippocampus A brain structure in the medial temporal lobe (that is, deep
inside the brain, beneath your temples) that underlies learning in declarative
memory. As with other brain structures, there is both a left and a right hip-
pocampus, respectively in the left and right hemispheres of the cerebrum.
Hypothesis A singular testable idea generated by a theory or by a set of
observations.
Hypothesis Space Specifies the scope of the structural hypotheses at a given
stage of development. The structural options are constrained by the pro-
cessing procedures available to the L2 learner. The concept of Hypothesis
Space represents both the developmental and the variational dimensions
of L2 acquisition and defines the variation occurring in the learners’
interlanguage.
Implicit learning Learning of a particular thing without awareness of what
was learned.
Input Hypothesis A position that holds that what is needed for learning
is input that is slightly above learners’ current knowledge of the second
language.
Interface/noninterface theories of SLA Theories that claim that explicit
knowledge can or cannot become implicit knowledge, respectively.
Internalization The process through which forms of mediation are
appropriated, or made one’s own. This often occurs under mediation (see
Mediation) in the ZPD (see Zone of Proximal Development), frequently
involves private speech (see Private speech), and results in self-regulation
(see Regulation).
Interpsychological [function] The process whereby mental ability is distrib-
uted between two individuals or between an individual and a cultural
artifact that is used as an external form of mediation. Thus, behavior under
mediation in the ZPD (see Zone of Proximal Development) is interpsy-
chological, and so is the activity of looking up a word in a dictionary or
consulting a grammar in an L2. The concept captures the notions of other-
and object-regulation. See Intrapsychological [function].
Intrapsychological [function] The process whereby mental ability is located
under the control of the individual (i.e., self-regulation). It results from the
internalization of cultural artifacts. See Interpsychological [function].
Island constraints Constraints of UG that place limits on how far wh-phrases
can move. The idea is that certain syntactic domains form units such that
nothing can move out of them. Island constraints are often subsumed
under the Subjacency Principle.
Iteration Repetition that is not exact, which takes place when the results of
one procedure are applied to the results of a previous application.
Learned attention People learn to attend to the cues that are relevant to a
problem-space, and this increases speed of acquisition and automaticity of
processing. While selective attention benefits acquisition, it can also lead
to distortions of knowledge that are evident when the learner transfers to
novel problem spaces. Learners continue to attend to the old cues, even
when these are no longer optimal, and can ignore relevant new cues, espe-
cially when they are lacking in salience.
Lexical mapping The principles specified in LFG that govern the linking
between the arguments of a verb (e.g., agent, patient, or theme) and the
corresponding grammatical functions (e.g., subject and object). The corre-
spondence between arguments and functions can be linear, as in John threw the
ball, where the agent is mapped onto the subject ( John), or nonlinear, as in the
passive sentence The ball was thrown by John. In this case, the theme is linked
to the subject (the ball) and the agent is mapped onto the adjunct (by John).
Linearization problem Addresses the question of how speakers order the
information they intend to express. The mapping of conceptual material
onto linguistic form does not necessarily take place in a linear fashion,
as in the sentence Before he went home, he had dinner. In this case, proposi-
tional content needs to be stored in memory. The linearization problem
also applies to the morphosyntactic level, as in Peter sees a dog. Here, gram-
matical information needs to be stored in memory to achieve subject-verb
agreement.
Linguistic competence The underlying unconscious and abstract knowledge
of language of native speakers and L2 learners.
Logical problem One of the two core issues to be addressed by a theory of
SLA (see also developmental problem). It refers to the claim that learners
come to know more than what they were exposed to along with the ques-
tion of how this is possible. Also referred to as the poverty of the stimulus
problem or the learnability problem.
Lower mental processes Those mental processes governed by the endowed
neurological organization of our brains, including involuntary attention,
memory, perception, and awareness of the environment. These serve as the
foundation on which higher mental processes are constructed.
Mediation The central concept of sociocultural theory to which all other
theoretical concepts are directly or indirectly connected. It argues that
all forms of higher human mental processes (see Higher mental processes)
result from participation in and appropriation of social relationships (e.g.,
family life, school, work) and cultural artifacts (see Cultural artifacts) that
intervene between people and their relationship to each other and the
objective world.
Meta-analysis (plural meta-analyses) Statistically combining data from multi-
ple studies. A meta-analytical approach can rigorously synthesize an existing
literature and reveal any consistent patterns. Meta-analyses have substantial
power because they typically examine a much larger number of participants
than individual studies. Thus, meta-analysis results are likely more reliable
and generalizable than single-study findings. Additionally, they can be more
objective than qualitative reviews. A neuroanatomical meta-analysis typi-
cally combines results from many neuroimaging studies to reveal, for exam-
ple, where neural activation was found consistently across studies.
Metalanguage Language used to talk about language.
Model A description of a set of processes. Models describe how something
occurs; they are not required to explain why something occurs.
Morphological stage The last stage in the development of temporality in
which learners begin to use verb morphology to indicate temporal (time)
relations.
Multifunctionality principle In later stages of linguistic development, a
learner’s interlanguage allows multiple forms for one meaning or multiple
meanings for one form (compare to the one-to-one principle).
Multiple Constraints Hypothesis Relates to the initial L2 mental grammati-
cal system and the question of what kind of linguistic resources are available
to L2 learners at the initial state. The Multiple Constraints Hypothesis
proposes that this system is initially highly constrained at the three levels
of representation spelled out in Lexical-Functional Grammar (a-structure,
f-structure, and c-structure). The lack of syntactic features at the level of
a-structure affects the learners’ ability to map arguments onto grammatical
functions at f-structure level. Grammatical functions are initially inacces-
sible, and learners need to rely on direct mapping operations from argu-
ments to surface form. Constituent structure is not present in the initial
L2 mental grammatical system, and utterances are generated on the basis
of lexical processes. The L2 learner’s lexicon is claimed to be annotated
successively in the course of the L2 acquisition process.
Negotiation (of) for meaning The attempt made in conversation to resolve
a lack of understanding.
Nonlinearity When an effect is not proportionate to a cause.
Noticing Detection involving cognitive registration.
One-to-one principle Captures the tendency in early stages of linguistic
development for one interlanguage form to convey only one meaning and
one meaning to be realized by only one form (compare to the multifunc-
tionality principle).
Open An open system interacts with its environment. Depending on what
type of system it is, it exchanges information, matter, or energy with its
environment.
Output Hypothesis A position that holds that output (language production)
is a significant factor in second language learning.
Parsing The moment-by-moment implicit computation of sentence structure
during real-time comprehension.
Power Law of Learning The law invoked to describe the specific way reac-
tion time and error rate decline as a function of practice for a wide variety
of skills. “Power” refers to the exponent in the mathematical equation
describing the learning curve.
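One common way of writing such a learning curve is sketched below. The symbols T, N, a, and b are assumptions introduced for this illustration and are not taken from the chapter; other formulations add an asymptote term.

```latex
% T : reaction time (or error rate) after N practice trials
% a : performance on the first trial (a > 0)
% b : the "power", i.e. the exponent that sets how fast performance improves (b > 0)
T = a \, N^{-b}
\qquad\Longleftrightarrow\qquad
\log T = \log a - b \log N
```

With the illustrative values a = 2 seconds and b = 0.5, the predicted reaction time is 1 second after 4 trials and 0.5 seconds after 16 trials. Gains are large early in practice and shrink steadily, and the curve becomes a straight line with slope -b when both axes are logarithmic.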
Practice In a narrow sense, activities engaged in repeatedly with the
goal of becoming better at them (often called deliberate practice); in
a wider sense, activities that make an individual draw on procedural
knowledge, whether or not the goal is to improve that knowledge (e.g.,
speaking a foreign language because somebody just addressed you in
that language).
Private speech A form of speech that appears social in form but is psycholog-
ical in function. That is, it often appears as a conversational turn with an
interlocutor; however, it is directed not at another person but at oneself
and functions to regulate psychological behavior (e.g., trying to figure out
a difficult math problem or to remember a word or learn a new linguistic
feature of an L2). Private speech may be completely externalized and thus
audible. It may also be whispered, or even subvocal.
Proceduralization In skill theory, the process of creating procedural
knowledge by merging declarative knowledge with more encompassing
procedural rules (more recently often called productions). This takes place
when learners repeatedly engage in a task that calls on the broad proce-
dural rules and the relevant declarative knowledge. Production compila-
tion (combination of rules or productions frequently used together into
one production) is also part of this process. According to the declarative/
procedural model, proceduralization instead refers to the gradual learning
of skills or knowledge in the procedural memory system, and their increasing
dependence on it, which is likely to lead to automatized processing. This
can occur whether or not analogous skills or knowledge were previously
learned in declarative memory.
Procedural knowledge Knowledge that can only be performed, such as how
to swim, to do mental arithmetic, or to speak fluently. Sometimes called
task knowledge, or also knowledge how as opposed to knowledge that
(declarative knowledge).
Procedural memory Procedural memory is defined by the declarative/
procedural model as the learning and memory that rely on the basal
ganglia and their associated circuitry. Procedural memory underlies a
range of cognitive and motor functions and behaviors, including habits,
perceptual-motor sequences, perceptual sequences, and categories. This
knowledge seems to be entirely implicit.
Processability hierarchy A central construct in PT. It is based on the notion
of transfer of grammatical information within and across phrases. The
hierarchy consists of five specific processing procedures that are ordered
hierarchically and are implicationally related. The hierarchical arrange-
ment of these procedures accounts for the developmental path L2 learners
follow in L2 acquisition.
Processing The connection or linking of form and meaning during real-time
comprehension. The connection can be local (at the word level) or senten-
tial (the interpretation of an entire sentence).
Processing instruction A pedagogical intervention or technique that manip-
ulates input in certain ways to counteract the (potential) negative effects of
various input processing principles.
Prototype A prototype is the most typical member of a category, and it is
created by combining the most representative attributes of that category.
Rational analysis of cognition Rational analysis is an empirical program
that attempts to explain the function and purpose of cognitive processes.
Unlike traditional cognitive science, in which the cognitive system is often
treated as an arbitrary assortment of mechanisms with likewise arbitrary
limitations, rational analysis views cognition as intricately adapted to its
environment and the problems it faces.
Reaccess The process through which earlier forms of development are called
upon, either intentionally or unintentionally, in carrying out specific
activities. Thus, individuals who may be able to regulate their own psy-
chological or social behavior (see Regulation), at times, find it necessary to
seek support (i.e., mediation) from others (i.e., other-regulation) or from
cultural artifacts (i.e., object-regulation) during difficult activities.
Regulation The human ability to intentionally control our own social and/
or psychological behavior (i.e., self-regulation) or the behavior of others
(i.e., other-regulation), or to subject our behavior to that of others (also
other-regulation) or to cultural forms of mediation (i.e., object-regulation).
Retrodiction Predicting that one will find evidence of past events to explain
current performance.
Reverse-order reports A portion of an oral or written account in which
events are not reported in the order in which they happened.
Scaffold(ing) Communicative support provided by another speaker’s turn on
which a learner can build a contribution.
Self-organizing When order in a complex system emerges on its own, with-
out direction from an external force or without a preexisting plan.
Spacing of practice The time interval between different instances of practic-
ing the same rule or retrieving the same item.
Spatial resolution The precision of a measurement with respect to space. A
neurocognitive method with high spatial resolution, such as fMRI, allows
one to localize neural activity accurately in the brain.
Statistical preemption Learners can generate their own negative feedback
when they come to expect one form in a particular context yet witness
another.
Stimulated recall A research methodology in which, following completion
of a task, individuals are asked to verbalize what they were thinking at
the time of the original task. A stimulus (such as a video of the participant
engaged in the task) is provided.
Systemic-Theoretical Instruction An approach to education developed by
P. Gal’perin and his colleagues based on Vygotsky’s theory. It privileges
explicit systematic conceptual knowledge of any subject domain and links
it to concrete practical activity whereby the conceptual knowledge medi-
ates this activity.
Temporal resolution The precision of a measurement with respect to time.
A neurocognitive method with high temporal resolution, such as ERPs,
allows one to track the actual time course of brain activity, for example,
during language processing.
Theory A set of statements or laws designed to explain observable phenomena
and make predictions about other phenomena.
Transfer The transfer of first language knowledge to second language use.
Truth-value judgments Judgments made regarding the appropriateness of a
sentence in a given context. Participants pay attention to the meaning of the
test sentences rather than grammaticality (see grammaticality judgments).
Universal Grammar (UG) A system of linguistic principles and parameters,
placing limitations on the form of grammars. UG is assumed to be part of
a biologically endowed language faculty (i.e., innate). Principles of UG
are invariable across languages, whereas parameters allow for constrained
variation from language to language.
Unmarked alignment The default mapping principle. In this case, the corre-
spondence between arguments, grammatical functions, and surface struc-
ture constituents is entirely linear. For instance, in the sentence John played
the piano, the most prominent argument—the agent—is mapped onto the
subject function, which is realized as the initial noun phrase in constituent
structure.
U-shaped learning U-shaped learning denotes one frequent developmental
path when new cognitive skills are developed. Imagine a curve shaped like
the letter “U” in a graph, with the x-axis depicting time and the y-axis
depicting the learner’s level of skill. Learners often start out with seem-
ingly high levels of skill but then go through a phase in which their profi-
ciency plummets before it rises again. U-shaped learning characterizes the
learning of new words, high-level mathematical algorithms, and even the
building of muscle strength, among many other skills. Early high levels of
performance often reflect memorized, unanalyzed responding; middle to
lower levels of performance reflect the development of analyzed systematic
responding (whereby irregular responses are now overgeneralized).
Wh-movement Syntactic movement of a wh-phrase (containing an expres-
sion like who, which, what) to a higher position in the clause, typically in
questions and relative clauses.
Working memory The metaphoric “computational space” available to lis-
teners and readers during the act of comprehension; allows for temporary
information storage and manipulation.
Zone of Proximal Development The activity whereby individuals and
groups, interacting under the systematic and planned (e.g., schooling), or
unsystematic and unplanned, mediation (see Mediation) of other individuals
and groups, take part in tasks that they cannot perform alone and at the
same time appropriate the cultural artifacts available in their community.
INDEX
NOTE: Page numbers followed by “n” denote endnotes
abstraction, associative bases of 67–69
acquisition orders 10
adaptive systems 249, 251, 291
adult learners 41, 44, 139
affordances 252, 291
Aldwayan, S. 29–31
alignment, unmarked 172–173, 300
Aljaafreh, A. 229
Andersen, R. W. 43
Anderson, J. R. 84, 87, 88
Anolis sagrei 273
anterior negativities (ANs) 146, 152
Arievitch, I. 228, 232, 243n3
artificial grammars/languages 151–153
artificial intelligence 178–181
associative learning 63, 65–71
Atkinson, D. 16
attention 70–71, 204–206, 291, 295
automaticity 85, 197, 291
Automatic Profiling Expert System
(APES) 179–181
automatization 84–87, 291
Bahamas 273
Baralt, M. 210–211
Barcroft, J. 114
Bardovi-Harlig, K. 48–52
basal ganglia 129–131, Plate 1
Becker, A. L. 252
behavioral evidence 142–144
behaviorism 14–16
Behney, J. N. 205
Bill-Schuetz, K. A. 90
Bird, S. 89
blocking 71
Brindley, G. 178
Broca’s area 129, 291
Bryfonski, L. 204, 208
Byrne, D. 255–256
Callaghan, G. 255–256
Cameron, L. 256
Campbell, C. G. 257
Carpenter, H. 90
Carter, C. 88
Caspi, T. 256
CDST (Complex Dynamic Systems
Theory) see Complex Dynamic Systems
Theory (CDST)
Centeno-Cortés, B. 227
Chan, C. Y. H. 36n5
Chinese language 22–24
Clahsen, H. 29–30, 150–151
clarification requests 199, 200
co-adaptation 251, 292
cognition 35, 67, 298
communication 274, 275
complex adaptive system 253
complex dynamic systems theory (CDST)
250–254, 256–258; evidence for
254–256; exemplary study 256–258;
explicit/implicit debate 263–264;
misunderstandings, common 256;
observable phenomena 258–263;
theory and its constructs 248–254;
transdisciplinary nature of 248–249
complexity theory 248–250
comprehensible input 292
comprehension, effortful 107, 293
comprehension checks 199, 201
computer-mediated communication
(CMC) 210, 211
computer simulations, complex
systems 255
Concept-Based Language Instruction
(C-BLI) 233
concept-oriented approach: evidence
for 46–48; exemplary study 48–52;
explicit/implicit debate 55–56;
misunderstandings, common 48;
observable phenomena 52–54; origins
of 45–46; theory and its constructs
40–45
confirmation checks 199–200
connectionism 68–69
constituent structure 168, 171
constructions 63, 64–65, 73–75, 292
constructs 6–7, 292; see also specific constructs
content words/content lexical items
107–108, 292
context dependent complex systems 249
Contextual Constraint Principle 114
contingency 68, 292
Contrastive Analysis Hypothesis 76
co-optation 292
Cooreman, A. 42
Corder, S. P. 271
corpus (corpora) 72, 182, 292
correlational studies 142–144
Critical Period Hypothesis 5, 8
criticisms of Monitor Theory 15
CT see complexity theory
cultural artifacts 223, 227, 228, 292
data pool 278
declarative knowledge 84, 93, 292
declarative memory 131–136, 138, 139,
154, 155; see also Declarative/Procedural
model
Declarative/Procedural model (DP
model): about 128; declarative memory
131–136, 138, 139, 154, 155; evidence
for 142–149; exemplary study 151–153;
explicit/implicit debate 138–142,
154–155; misunderstandings, common
149–151; observable phenomena
153–154; predictions for language
134–142; procedural memory 132–133,
136–137, 154, 155; theory and its
constructs 129–142
De Jong, N. 89–90
DeKeyser, R. M. 13, 87, 89–90, 93–94
Desperate Housewives 237
developmental problem 164, 293
developmental sequences 10, 11
developmental trajectory 163, 293
Di Biase, B. 184
Dimroth, C. 54, 55
dissociation, double 145, 293
distributed versus massed practice
88–89, 293
double dissociation 145, 293
Douglass, S. 87
DP model see Declarative/Procedural
model
Dutch Cito score 257
dynamic assessment (DA) 224, 293
dynamic systems 249
educational theory 232
effortful comprehension 107, 293
Eisner, F. 261
elaborations 195
electrophysiological evidence 145–147,
151–153
Ellis, N. C. 73–75, 78, 93, 255
Ellis, R. 13–14, 55, 209, 213
Elman, J. 250
emergence 52, 249, 293
emergentism 256, 293
endstate 28, 32, 77, 293
English as a Second Language (ESL) 179
Erlam, R. 213
error, predictive 253
Eskildsen, S. 260
Event Probability Principle 113
Event-Related Potentials (ERPs) 145–147,
151–153
evidence, negative 195–196
evolutionary theory 272
exemplar-based learning 64, 67–69
exemplars 67, 293
explicit feedback 199
explicit/implicit debate: about 12–14;
complex dynamic systems theory
263–264; concept-oriented approach
55–56; Declarative/Procedural
model 138–142, 154–155; input
processing 122; interaction approach
213; linguistic theory and Universal
Grammar 34–35; processability theory
188; Skill Acquisition Theory 95–97;
Sociocultural Theory 240–241; usage-
based approaches 77–78
explicit knowledge 13–14, 240; see also
explicit/implicit debate
explicit learning 13, 154–155, 293; see also
explicit/implicit debate
exposure to input see Observation 1
(exposure to input)
eye-tracking experiment 116, 119, 120
face-to-face (FTF) modes 210, 211
Faretta-Stutenberg, M. 90
feature unification 165, 168–170, 293–294
feedback 199–203, 253
Felser, C. 29–30
Ferman, S. 96
Fernández Dobao, A. 198
filled-gap effect 30, 31
Fincham, J. M. 87
Fiorentino, R. 29–31
first language 283
first-noun principle 112, 114, 115,
118–120
Fitts, P. 84
Fodor, J. A. 34
form-meaning connections 105, 294;
see also input processing
form-to-function approach 40
fossil record 273
full access 28
full transfer 28
Full Transfer Full Access Hypothesis
(FTFA) 28
functionalist approaches 40–41, 45–46;
see also concept-oriented approach
functional load 43, 294
functional neuroimaging evidence
147–149
functional structure 171
function-to-form approach
see concept-oriented approach
function-to-form mapping
see form-meaning connections
Gabriele, A. 29–31
Gal’perin, P. Ya. 225, 228, 232–234
Garrett, M. F. 162
Gass, S. M. 53, 112, 194–195, 197, 202,
204–206
Geeslin, K. L. 44
gender agreement 124n6
generative linguistic theory 286
genetic method 230, 294
germ theory of disease 2, 6
Gleick, J. 250
Golinkoff, R. M. 261, 262
Google search 278
grammars, artificial 151–153
grammaticality judgments 25–26, 294
“Great Expectations: Second Language
Research and Classroom Teaching” 271
Gudmestad, A. 44
Gurzynski-Weiss, L. 210–211
Hamrick, P. 129, 144
Hawkins, R. 36n5
Haznedar, B. 33
higher mental processes 223, 294
hippocampus 131, 294, Plate 1
Hirsh-Pasek, K. 261, 262
Hoenkamp, E. 162
Hollich, G. 261, 262
Hopper, P. J. 45
Hulstijn, J. H. 13, 34
hypotheses 5, 6, 294; see also specific
hypotheses
hypothesis space 168, 294
idiosyncratic knowledge in language 135
implicit feedback 199–203
implicit knowledge 13–14, 240;
see also explicit/implicit debate
implicit learning 13, 154–155, 294;
see also explicit/implicit debate
incidental effects see Observation 2
(incidental effects)
inner speech 227
input: comprehensible 292; interaction
approach 194–195; modified
194–195; see also interaction approach;
Observation 1 (exposure to input);
Observation 3 (limits of effects of input)
input, interaction, output model
see interaction approach
Input Hypothesis 192, 295;
see also interaction approach
input processing (IP): about 105; evidence
for 115–116; exemplary study
119–120; explicit/implicit debate 122;
misunderstandings, common 116–118;
observable phenomena 120–122; theory
and its constructs 106–115
instruction: processing 118, 298; see
also Observation 9 (limitations of
instruction)
interaction approach: attention 204–206;
evidence for 206–209; exemplary
study 210–211; explicit/implicit
debate 213; feedback 199–203; input
194–195; interaction 195–196, 198–199;
language-related episodes 203–204;
learning 198–199; misunderstandings,
common 209; observable phenomena
211–212; output 196–198; scope of 209;
theory and its constructs 192–206
interface/noninterface theories of
SLA 295
interference 71
interlanguage 21–23, 76, 204
internalization 228–229, 295
interpsychological function 228, 295
intrapsychological function 228, 295
intuitional data 25–26
IP see input processing
island constraints 20–21, 24, 295
Italian language 112, 183–184
iteration 251, 295
Ito 119–120
Ito, K. 119–120
James, W. 253
Japanese language 185–186
Jarvis, S. 260
Jegerski, J. 116
Jiménez-Jiménez, A. 227, 228
Johnston, M. 178
Jordan, G. 193
Judge Judy TV 238
Juffs, A. 25
Juffs, W. 25
Kanwit, M. 44
Karni, A. 96
Kawaguchi, S. 184–186
Keating, G. D. 1–16, 109, 192
Kempen, G. 162
Kilborn, K. 42
Kim, J. 232–238
Kim, Y. 262
Klein, W. 41, 42, 45, 49, 50, 52, 54–55
knowledge: declarative 84, 93, 292; explicit
13–14, 240; implicit 13–14, 240;
procedural 84, 93, 298; “transformation”
of 150; see also explicit/implicit debate
Krashen, S. 232
Krashen, S. D. 15–16, 34, 231–232, 271;
see also Monitor Theory
L1 transfer principle 113
L2 comprehension 176–178
L2 grammars 29, 233
L2 profiling 178–181
L2 theory development 275
Lane, S. M. 96
Langerhans, B. 273
language, as construct 7
language acquisition 14, 15; constrained
284–285; negation, English 281; order
281–282; singular before plural 282;
verb, Spanish 281
language-as-subject-model approach 285
language-is-different approach 285
language learning 284
language processing tasks 284
language-related episodes (LREs) 203–204
languages: artificial 151–153
language teaching 271–288; conceive 274
Lantolf, J. P. 229, 232–239
Lanze, F. 179
Lardiere, D. 25, 32
Larsen-Freeman, D. 255–256, 261–262,
288n1
learnability problem see logical problem
learned attention 70–71, 295
learner language 253
learners: differential acquisition
mechanisms 282–283; noticed
a gap 199; systemic-theoretical
instruction 232
learning: associative 63, 65–71; difficulties
in determining 208–209; exemplar-
based 64, 67–69; explicit 13, 154–155,
293; implicit 13, 154–155, 294;
interaction and 198–199; interaction
approach 198–199; operationalization of
208; U-shaped 76, 300; see also explicit/
implicit debate
Leeman, J. 203
Leiocephalus carinatus 273
Lemelin, S. 204
Lenzing, A. 174–178
lesion method 144–145
Levelt, W. J. M. 163, 165, 171, 188
Lévi-Strauss, C. 252
Lexical-Functional Grammar (LFG) 164,
165, 170–171; see also Processability
Theory
lexical mapping 171–172, 295
Lexical Mapping Theory 171–172
Lexical Preference Principle 108
Lexical Semantics Principle 113
Li, M. 89, 90, 93–94
Lightbown, P. M. 202, 271, 276, 277,
285, 286
limitations of instruction see Observation 9
(limitations of instruction)
linearization problem 171–172, 295–296
linguistic competence 19–21, 296
linguistic theory and universal grammar:
evidence for 24–27; exemplary
study 29–31; explicit/implicit debate
34–35; interlanguage 21–23; linguistic
competence 19–21; methodology 29;
misunderstandings, common 27–29;
observable phenomena 32–34; scope of
theory 27–28; theory and its constructs
19–24; transfer 28; Universal Grammar,
defined 300; Universal Grammar
principles and parameters 23–24
load, functional 43, 294
Loewen, S. 213
Logan, G. 87, 90
logical problem 19–20, 164, 296
Long, M. H. 42, 198, 200, 213
Losos, J. B. 273
lower mental processes 224–225, 296
Lowie, W. 256–258
LREs (language-related episodes) 203–204
Luria, A. R. 231
McDonough, K. 19, 197, 205
Mackey, A. 192, 197, 201, 205–207
Macqueen, J. M. 261
Makoni, B. 252
Makoni, S. 252
Mandarin Chinese 93
mapping, lexical 171–172, 295
Marxism 224
massed practice 88–89, 293
Mathews, R. C. 96
meaning, negotiation for 198, 200–201,
203, 297
Meaning before Nonmeaning
Principle 110
meaning-oriented approach see
concept-oriented approach
medial temporal lobe (MTL) 131
mediation 224–228, 231, 242n2, 296
memory: declarative 131–136, 138, 154,
155; procedural 132–133, 136–137, 154,
155, 298; working 2–3, 5, 6, 205, 300
mental processes: higher 223, 294; lower
224–225, 296
meta-analysis 296
metalanguage 55, 296
Minh 251
min-max graphs 254
Missing Surface Inflection Hypothesis
(MSIH) 32
models 5, 296; see also specific models
modified input 194–195
modified output 196–198
Molenaar, P. C. M. 257
Monitor Theory 14–16; behaviorism and
14–16; criticisms of 15
Morgan-Short, K. 90, 129, 151–153
Morin, E. 255
morphological stage 42, 53, 296
Moses, J. 46
Moss, S. 256
MSIH (Missing Surface Inflection
Hypothesis) 32
MTL (medial temporal lobe) 131
multifunctionality principle 43, 44, 296
Multiple Constraints Hypothesis 175–176,
296–297
Myles, F. 193
N400 ERP components 146, 152
Najdi Arabic 30, 31
Nakata, T. 89
National Education Association 274
negation in English 10, 121, 281–283
negative evidence 195–196
negotiation for meaning 198, 200–201,
203, 297
neural network models 253
neurological evidence 144–145
Newell, A. 85, 87
Nicholas, H. 202
noticed a gap 199
noticing 117, 297
“null subject” languages 124n8
object-regulation 226–227
observable phenomena: about 9–12;
Complexity Theory (CT) 258–263;
concept-oriented approach 52–54;
Declarative/Procedural model
153–154; input processing (IP)
120–122; interaction approach 211–212;
linguistic theory and Universal
Grammar 32–34; Processability Theory
186–187; Skill Acquisition Theory
94–95; Sociocultural Theory 238–240;
usage-based approaches 75–77
Observation 1 (exposure to input): about
10; Complexity Theory 258–259;
Declarative/Procedural model 153;
input processing 120; interaction
approach 212; linguistic theory and
Universal Grammar 32; Sociocultural
Theory 238; usage-based approaches 75
Observation 2 (incidental effects): about
10; Declarative/Procedural model
153; input processing 120; interaction
approach 212; Sociocultural Theory
238–239; usage-based approaches 75
Observation 3 (limits of effects of
input): about 10; Complexity Theory
259; linguistic theory and Universal
Grammar 32; usage-based approaches
75–76
Observation 4 (predictable stages of
output): about 10–11; Complexity
Theory 259–260; concept-oriented
approach 53; input processing 120–121;
Processability Theory 186; Skill
Acquisition Theory 95; Sociocultural
Theory 239; usage-based approaches 76
Observation 5 (outcome variability of
SLA): about 11; Complexity Theory
260–261; Declarative/Procedural
model 153–154; interaction approach
212; Processability Theory 186–187;
Skill Acquisition Theory 94–95;
Sociocultural Theory 239
Observation 6 (variability of SLA
across linguistic subsystems): about
11; Complexity Theory 261–262;
Declarative/Procedural model
153–154; linguistic theory and
Universal Grammar 32; Skill Acquisition
Theory 94–95; Sociocultural Theory
239; usage-based approaches 76
Observation 7 (limits on effects of
frequency): about 11; Complexity
Theory 262–263; input processing 121;
interaction approach 212; linguistic
theory and Universal Grammar
33; Processability Theory 187; Skill
Acquisition Theory 94; usage-based
approaches 76
Observation 8 (limits on effects of
first language): about 11–12; input
processing 121; linguistic theory and
Universal Grammar 33; Processability
Theory 187; Sociocultural Theory 239;
usage-based approaches 76
Observation 9 (limitations of instruction):
about 12; concept-oriented approach
53–54; Declarative/Procedural model
154; input processing 121; linguistic
theory and Universal Grammar
34; Processability Theory 187; Skill
Acquisition Theory 94; usage-based
approaches 77
Observation 10 (limits on effects of
output): about 12; input processing
121–122; interaction approach 212;
Processability Theory 187; Skill
Acquisition Theory 94; Sociocultural
Theory 240
O’Donnell, M. B. 73–75
Ohta, A. 230, 238
Olshtain, E. 96
one-to-one principle 43, 297
one-way tasks 207
open systems 249, 297
operationalization of learning 208
other-regulation 226–227
output 8, 196–198; see also interaction
approach; Observation 4 (predictable
stages of output); Observation 10 (limits
on effects of output)
Output Hypothesis 192, 196–198, 297;
see also interaction approach
overlearning 85
P600 ERP components 146, 152
Paradis, M. 96, 150–151, 240
parsing 29, 110–111, 297
Pavlenko, A. 260
Peppered Moth studies 273
Perfetti, C. A. 89–90
Peters, A. M. 251
phrase structure 168, 171
Pica, T. 201
Pienemann, M. 54, 166–167, 173, 178–179
Pipes, A. 205
pluperfect 52
plural meta-analyses 296
Poehner, M. E. 231, 242
Polat, B. 262
Posner, M. 84
poverty 279; of stimulus 281 (see also
logical problem)
power law 85–86; of learning 297
practice 85, 88–89, 293, 297, 299
predictive error 253
Preference for Nonredundancy Lexical
Preference Principle 109, 124n5
Prévost, P. 32
Primacy of Content Words Principle
107–108
Principles and Practice in Second Language
Acquisition 271
private speech 227–228, 297
proceduralization 84, 87, 93, 298
procedural knowledge 84, 93, 298
procedural memory 132–133, 136–137,
154, 155, 298; see also Declarative/
Procedural model
process 107
processability hierarchy 165–168, 298
processability theory (PT) 186–187;
evidence for 182–183; exemplary study
184–186; explicit/implicit debate 188;
hypothesis space 168; implementation
178–181; initial L2 grammatical system
174–176; Lexical-Functional Grammar
170–171; Lexical Mapping Theory
171–172; misunderstandings, common
183–184; observable phenomena
186–187; processability hierarchy
165–168; theory and its constructs
162–181; TOPIC Hypothesis 173–174;
transfer of grammatical information and
feature unification 168–170; unmarked
alignment 171–172
processing 71, 298
processing instruction 118, 298
production data 25, 46
projection 110
pronouns, subject 124n8
prototypes 67–68, 298
psychology theories 2–3, 5, 6
PT see processability theory
Qin, Y. 88
Raichle, M. 88
Rapid Profile (RP) approach 179, 180
rational analysis of cognition 67, 298
rational language processing 64, 66–67
re-access 226, 299
reaction time experiments 182–183
recall, stimulated 207, 299
recasts 199, 202
regulation 226–227, 299
reports, reverse-order 48–52, 54, 299
response, learners 203
retrodiction 255, 299
reverse-order reports 48–52, 54, 299
(Revised) Lexical Preference
Principle 108
Robinson, P. 89–90, 205
Rodgers, D. M. 89–90
Roman alphabet 93
Römer, U. 73–75
Rosenbloom, P. 85, 87
Sallas, B. 96
Sanz, C. 151–153, 208
Sato, C. J. 42
scaffolding 42, 231, 299
Schechtman, E. 96
Schema for the Orienting Basis of Action
(SCOBA) 233–237
Schmidt, R. W. 117, 205
Schoener, T. W. 273
Schumann, J. 42
Schwartz, B. D. 34
science theories 2, 5, 9
scientific concepts see explicit knowledge
SCT see sociocultural theory
second language, as construct 7
second language acquisition (SLA):
early theories in 14–16; interface/
noninterface theories of 295; as term 7;
see also specific topics
self-organizing 249, 299
self-regulation 226–227
Selinker, L. 21, 44, 284
semantics, lexical 113
sentence interpretation tasks 115
Sentence Location Principle 114
Shallow Structure Hypothesis (SSH)
29–30, 118
Silk, E. 88
simplifications 195
singular, third-person 124n4
Skill Acquisition Theory 95; about 83;
evidence for 87–91; exemplary study
93–94; explicit/implicit debate 95–97;
misunderstandings, common 91–93;
observable phenomena 94–95; theory
and its constructs 83–87
SLA see second language acquisition
sociocultural theory (SCT): about
223–224; evidence for 230–231;
exemplary study 232–238; explicit/
implicit debate 240–241; internalization
228–229; mediation 224–228, 230,
242n2; misconceptions 231–232;
misconceptions, common 231–232;
observable phenomena 238–240; theory
and its constructs 224–230; Zone of
Proximal Development (ZPD) 229–230
soft assembly 252
Solon, M. 44
Sorace, A. 32
spacing of practice 89, 299
Spada, N. 202
Spanish language 277, 278; input
processing 109–112, 114, 115, 124n8
spatial resolution 148, 299
speech, private 227–228, 297
Spiller, D. A. 273
Spoelman, M. 262
spontaneous concepts see implicit
knowledge
SSH (Shallow Structure Hypothesis)
29–30, 118
statistical preemption 253, 299
Steinhauer, K. 151–153
Stenger, V. 88
stimulated recall 207, 299
stimulus 94
Stutterheim, C., von 41, 45, 47
Subjacency Principle see island constraints
subject matter 276
subject pronouns 124n8
subject-verb agreement 177, 182
Sun, R. 96
Suzuki, Y. 89
Svetics, I. 204
Swain, M. 196, 212, 230, 233, 240
syntactic theory 3–4, 6
Systemic Theoretical Instruction (STI)
224, 232, 233, 299
Tasker, I. 260
tasks 115, 207
teacher education 276
teachers expectations 276–285
Tedick, D. 288n1
temporal resolution 145, 299
theories: about 2–4; defined 2, 299; duties
of 2–4; hypotheses versus 5; models
versus 5; in psychology 2–3, 5, 6; in
sciences 2, 5, 9; as term 6; usefulness of
8–9; see also specific theories
third-person singular 124n4
Tomlin, R. 205
TOPIC Hypothesis 173–174
traces 3–4, 6
transfer 28, 70–71, 168–170, 299
truth-value judgments 26, 300
two-way tasks 207
Ullman, M. T. 151–153, 240
universal grammar (UG) see linguistic
theory and universal grammar
unmarked alignment 171–172, 300
usage-based approaches 286; associative
aspects of transfer 70–71; associative
bases of abstraction 67–69;
associative learning theory 63, 65–66;
constructions 63, 64–65; emergent
relations and patterns 64, 69–71;
evidence for 71–72; exemplar-based
learning 64, 67–69; exemplary study
73–75; explicit/implicit debate 77–78;
misunderstandings, common 72–73;
observable phenomena 75–77; rational
language processing 64, 66–67; theory
and its constructs 63–71; two languages
and language transfer 70
U-shaped learning 76, 300
Uzum, B. 205
VanPatten, B. 1–16, 105–124, 192, 272
verb-argument constructions (VACs)
73–75
Verspoor, M. 256–258, 262
Villa, V. 205
Vygotsky, L. S. 223–226, 228–229,
231–233, 238, 240, 242n1; see also
sociocultural theory
Wertsch, J. 224
While We’re On the Topic 271
wh-in-situ 22–24, 28, 36n4
White, L. 25, 32, 35
wh-movement 29; about 20–21, 23–24,
28–31; defined 300; lack of 22–24,
36n5; languages 30
Williams, J. 1–16, 192, 204
Wong, P. C. M. 90
Wong, W. 119–120
working memory 2–3, 5, 6, 205, 300
Wulff, S. 1–16, 192
Zhang, X. 239
zone of proximal development (ZPD)
229–232, 300