Language Learning ISSN 0023-8333

METHODOLOGICAL REVIEW ARTICLE

The Effects of Spaced Practice on Second Language Learning: A Meta-Analysis
Su Kyung Kim and Stuart Webb
University of Western Ontario

Abstract: This meta-analysis investigates earlier studies of spaced practice in second
language learning. We retrieved 98 effect sizes from 48 experiments (N = 3,411). We
compared the effects of three aspects of spacing (spaced vs. massed, longer vs. shorter
spacing, and equal vs. expanding spacing) on immediate and delayed posttests to calcu-
late mean effect sizes. We also examined the extent to which nine empirically motivated
variables moderated the effects of spaced practice. Results showed that (a) spacing had
a medium-to-large effect on second language learning; (b) shorter spacing was as effec-
tive as longer spacing in immediate posttests but was less effective in delayed posttests
than longer spacing; (c) equal and expanding spacing were statistically equivalent; and
(d) variability in spacing effect size across studies was explained methodologically by
the learning target, number of sessions, type of practice, activity type, feedback timing,
and retention interval. The methodological and pedagogical significance of the findings
is discussed.
Keywords meta-analysis; spaced practice; spacing effect; second language learning

This research received no specific grant from any funding agency. We would like to express our
gratitude to the journal editor, Emma Marsden, and the anonymous Language Learning reviewers
for their insightful feedback and suggestions at every stage of the manuscript writing and revising
process. We would like to thank the following researchers, who kindly provided additional
information necessary for the current meta-analysis: Irina Elgort, Emilie Gerbier, Sean Kang, Jeffrey
Karpicke, Tatsuya Nakata, Steven Pan, and Ulf Schuetze.
Correspondence concerning this article should be addressed to Su Kyung Kim, University of
Western Ontario, Faculty of Education, 1137 Western Road, London, ON N6G 1G7, Canada.
E-mail: skim2323@uwo.ca
The handling editor for this article was Emma Marsden.

Language Learning 72:1, March 2022, pp. 269–319

© 2022 Language Learning Research Club, University of Michigan
DOI: 10.1111/lang.12479

Introduction
Massed practice involves studying the same items in succession without any
intervening time or items, whereas spaced practice involves studying items
separated by an interval of time or other items. For example, massed practice
in second language (L2) learning could involve learning cat, dog, and fish in
the sequence cat, cat, cat, dog, dog, dog, fish, fish, fish, whereas spaced prac-
tice could involve learning the same items in a sequence such as cat, dog,
fish, cat, dog, fish, cat, dog, fish. Research reveals that the inclusion of spac-
ing promotes learning (e.g., Bahrick, 1979). The term spacing effect refers to
enhanced learning, for a given item, during spaced practice as compared with
massed practice.
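The two orderings described above can be made concrete with a short sketch. The function names (massed_sequence, spaced_sequence, intervening_items) are illustrative assumptions and do not come from any of the studies cited; the item list is the cat, dog, fish example from the text.

```python
def massed_sequence(items, repetitions):
    """Repeat each item back to back before moving to the next item."""
    return [item for item in items for _ in range(repetitions)]


def spaced_sequence(items, repetitions):
    """Cycle through the whole list so other items separate the repeats."""
    return [item for _ in range(repetitions) for item in items]


def intervening_items(sequence, target):
    """Count the items that occur between successive encounters of target."""
    positions = [i for i, item in enumerate(sequence) if item == target]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


items = ["cat", "dog", "fish"]
massed = massed_sequence(items, 3)
spaced = spaced_sequence(items, 3)

print(massed)
print(spaced)
print(intervening_items(massed, "cat"))  # no items between repeats
print(intervening_items(spaced, "cat"))  # two items between repeats
```

In the massed ordering no other item separates two encounters with cat, whereas in the spaced ordering two items (dog and fish) always intervene.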
There are, however, different types of spacing. Absolute spacing is the sum
of the intervals between all learning opportunities for a given item
(Karpicke & Bauernschmidt, 2011). For example, if an item is encountered
six times with an encounter occurring every 3 minutes, the absolute spacing
is 18 minutes. The distribution of learning opportunities relative to one an-
other, including equal and expanding spacing, is captured by relative spacing
(Karpicke & Bauernschmidt, 2011). Equal spacing, also known as fixed or
uniform spacing, expresses the condition where the spacing between encoun-
ters for a given item is constant. In expanding spacing, the interval between
encounters gradually increases. Lag effects refer to comparisons of the effects
of different amounts of spacing (e.g., relatively short vs. relatively long).
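The distinction between absolute and relative spacing can be sketched in a few lines. This is a minimal illustration, not a procedure taken from the studies cited: the function names are hypothetical, the schedules are expressed as lists of intervals in minutes, and the linear growth of the expanding schedule is an assumption made only for the example. The equal schedule follows the arithmetic of the example in the text (six 3-minute intervals).

```python
def equal_schedule(n_intervals, interval):
    """Equal (fixed, uniform) spacing: every interval has the same length."""
    return [interval] * n_intervals


def expanding_schedule(n_intervals, first_interval, step):
    """Expanding spacing: each interval is longer than the one before it.

    Linear growth by a positive step is an assumption for illustration.
    """
    return [first_interval + i * step for i in range(n_intervals)]


def absolute_spacing(intervals):
    """Absolute spacing: the sum of all intervals for a given item."""
    return sum(intervals)


# Six 3-minute intervals, as in the example in the text.
equal = equal_schedule(6, 3)
# Hypothetical expanding schedule chosen to have the same total.
expanding = expanding_schedule(6, 0.5, 1)

print(equal, absolute_spacing(equal))
print(expanding, absolute_spacing(expanding))
```

Both schedules have the same absolute spacing (18 minutes) but differ in relative spacing, which is the kind of contrast an equal versus expanding comparison isolates.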
Blocking ensures that the amount of practice devoted to a particular skill
(or concept) is massed, and interleaving guarantees that practice of the par-
ticular skill (or concept) is spaced across multiple learning opportunities and
separated by intervening tasks. In interleaved practice, for example, under the
category of English tense (as a superordinate concept), learners learn differ-
ent types of tense an equal number of times but in a different order (e.g.,
present, past, future, past, present, future, present, past, future). In blocked
practice, learners learn one type of tense, followed by another type (e.g.,
present, present, present, past, past, past, future, future, future). Although in-
terleaving and spacing are separate constructs (i.e., interleaving operates at a
superordinate level; Metcalfe, 2011), they are often confounded, and interleav-
ing effects may reflect the contribution of spacing (Taylor & Rohrer, 2010).
Learning new skills or knowledge typically requires practice, and learning
is enhanced when practice is spaced rather than massed (Baddeley, 1999;
DeKeyser, 2007). Consequently, the development of spaced practice has
become one of the most powerful advancements in learning and memory re-
search. Numerous empirical studies (e.g., Carpenter & DeLosh, 2005) and re-
views of literature (e.g., Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006; Dono-
van & Radosevich, 1999) have demonstrated the benefits of spaced practice for
skill learning (e.g., music performance, airplane control simulation) and for
verbal memory learning tasks such as picture naming, fact recall, and paired-
associate learning (e.g., first language [L1] word pairs, L2−L1 word pairs).
Although previous reviews included L2−L1 word pairs as a verbal memory
task, L2 learning studies were limited in number (no more than seven studies,
e.g., Cepeda et al., 2006) or not clearly mentioned (Donovan & Radosevich,
1999), and thus the effects of spaced practice on L2 learning are less clear.
There has been a great deal of research investigating the effects of spaced
practice on L2 learning, but the effects reported have been inconsistent. Re-
search has revealed that (a) spaced practice benefited learning and retention
of L2 vocabulary (e.g., Bloom & Shuell, 1981) and L2 grammar (e.g., Suzuki,
Yokosawa, & Aline, 2020); (b) spaced practice was as effective as massed prac-
tice on immediate posttests (Lee & Choe, 2014); (c) longer spacing was supe-
rior to shorter spacing on delayed posttests measuring L2 vocabulary (e.g.,
Pashler, Zarow, & Triplett, 2003) and L2 grammar learning (Rogers, 2015);
(d) shorter spacing contributed to greater learning than longer spacing on
delayed posttests (e.g., Küpper-Tetzel, Erdfelder, & Dickhäuser, 2014); (e)
shorter spacing was as effective as longer spacing on delayed posttests (e.g.,
Kasprowicz, Marsden, & Sephton, 2019); (f) equal spacing was more effective
than expanding spacing on delayed posttests (e.g., Çekiç & Bakla, 2019); and
(g) equal spacing was as effective as expanding spacing on delayed posttests
(e.g., Kang, Lindsey, Mozer, & Pashler, 2014).
Given the limited number of L2 studies included in previous reviews and
the inconsistent results obtained from L2 studies on spaced practice, research
aimed at clarifying findings is warranted. Compared to skill learning and
verbal memory, there are arguably even more individual differences (e.g.,
language aptitude, Kasprowicz et al., 2019) and contextual variables (e.g.,
teaching techniques, Rogers & Cheung, 2020a; multiple modes of L2 input,
Serrano & Huang, 2018; type of knowledge to be learned, Suzuki & DeKeyser,
2017a; task complexity, Suzuki et al., 2020) involved in L2 learning. Further-
more, there is abundant evidence of various instructional treatment benefits
(e.g., form-focused instruction, implicit inductive teaching) in L2 learning
(e.g., Norris & Ortega, 2000). It is, therefore, important to clarify the overall
effects of spacing and the different types of spacing on L2 learning in order
to provide pedagogical guidance, as well as to identify useful directions for
future research. In addition, because learner-related variables (e.g., prior L2
knowledge, Nakata & Suzuki, 2019b) and methodological features (e.g., feed-
back, Nakata, 2015a) were noted as reasons for the inconsistency in findings, it
is important to explore whether and to what extent the effect of spaced practice
is moderated by different variables across studies. The present study aims to
address these questions by conducting a meta-analysis, one of the most effective
tools for comprehensive research synthesis (Hunter & Schmidt, 2004).

Background Literature
Theories of Spaced Practice Effects
Many theories of spaced practice effects have been proposed and examined.
First, spacing between learning opportunities makes learning more difficult,
but desirably so (desirable difficulty framework, e.g., Bjork, 1994; Suzuki,
Nakata, & DeKeyser, 2019). Second, forgetting occurring via spacing cre-
ates more effortful retrieval attempts, which strengthens retention (Bjork,
1975). Third, spacing between learning opportunities enhances subsequent re-
peated learning (consolidation, e.g., Wickelgren, 1972). Fourth, spacing be-
tween learning opportunities results in more attentional processing, but massed
learning results in less processing (deficient processing, e.g., Jacoby, 1978;
Koval, 2019). Fifth, reducing the accessibility of information in memory after
spacing enhances additional learning of that information (accessibility prin-
ciple, e.g., Bjork & Bjork, 1992). Sixth, spacing makes subsequent repeated
learning more distinctive, and the learning in different contexts is better re-
membered (contextual variability theory, e.g., Melton, 1970). Seventh, spacing
manipulated between retrievals (i.e., testing information from memory) pro-
duces benefits on long-term retention (study-phase retrieval, e.g., Toppino &
Bloom, 2002).

Previous Meta-Analytic Reviews of Spaced Practice

Donovan and Radosevich (1999) examined a total of 63 studies with 112 ef-
fect sizes and found that spaced practice was superior to massed practice. They
reported that about 10% of the sample examined verbal memory with tasks
(e.g., face–name pairs, low associate pairs; in which all the written and oral
tasks were presented in the L1) and with L2−L1 word pairs. However, the
number of L2 studies was unclear. Cepeda et al. (2006) meta-analyzed the ef-
fect of spaced practice in verbal recall tasks for memory (e.g., picture naming,
spelling, low associates; in which all the materials were presented in the L1)
and for L2 learning (e.g., learning the meanings of L2 words from paired asso-
ciates), but only about 4% of the studies out of 184 research reports involved
L2 learning. They found that spaced conditions were significantly better than
massed conditions. They also found that longer spacing was more effective
than shorter spacing at longer retention intervals (the interval between the last
learning session and the final posttest). However, they found no obvious difference
between equal and expanding spacing. Although Cepeda et al. reviewed
the effects of spaced practice, there is as yet no clear description of the extent
to which spaced practice affects L2 learning. This is because they mainly
investigated the relationship between spacing intervals (the interval between
learning opportunities) and retention intervals, and there were few L2 studies
examined.
Uchihara, Webb, and Yanagisawa’s (2019) meta-analysis included spacing
as a moderator variable and found that frequency effects in L2 incidental vo-
cabulary learning (whereby the higher the number of encounters with a word,
the better the learning) were larger when words were encountered in massed
conditions (defined as within one session), r = .38, 95% CI [.31, .45], than
when words were encountered in spaced conditions (defined as learning across
multiple sessions), r = .23, 95% CI [.12, .34]. However, spacing was not exam-
ined as the sole construct, so a clear picture of spacing effects on L2 vocabulary
learning was not obtained.

Review of Moderator Variables on Spacing Effects

Age
Several L1 studies have examined the effects of spaced practice at different
ages but have obtained inconsistent results: Older children showed spacing ef-
fects, but not younger children (e.g., Toppino & DiGeorge, 1984); young adults
showed larger spacing effects than older adults (Maddox, Balota, Coane, &
Duchek, 2011); there was no age difference between the effects of shorter and
longer spacing (e.g., Seabrook, Brown, & Solity, 2005) or between the effects
of equal and expanding spacing (e.g., Maddox et al., 2011). Furthermore, some
findings conflict with Wilson’s (1976) hypothesis that the effects of different
types of spacing are dependent on working memory capacity (the ability to not
only temporarily store information but also manipulate it for learning, Badde-
ley, Eysenck, & Anderson, 2015), which develops with age (Gathercole, Pick-
ering, Ambridge, & Wearing, 2004). In L2 studies, spaced practice effects have
been observed with adult learners (e.g., Li & DeKeyser, 2019) and with young
learners (e.g., Lotfolahi & Salehi, 2017). However, given that no studies have
examined age as an independent variable, the effects of spaced practice with
L2 learners of different ages remain unclear. Furthermore, given that working
memory capacity is significantly positively correlated with L2 learning (e.g.,
Linck, Osthus, Koeth, & Bunting, 2014), the effects of spaced practice may not
be the same among L2 learners of different ages.

Learning Target
Most L2 spaced practice studies have investigated L2 vocabulary learning
(e.g., Koval, 2020). Positive effects have also been demonstrated with L2 gram-
mar or morphology (e.g., Suzuki et al., 2020) and L2 pronunciation (e.g., Car-
penter & Mueller, 2013). However, acquisition of vocabulary and grammar
may occur through different processes (Pinker, 1998). For example, Ullman
(2015) reported that declarative memory may play different roles in lexical
and grammatical aspects of learning and processing. Pronunciation learning
is a different skill from vocabulary and grammar learning (Li & DeKeyser,
2019). Therefore, the effects of spaced practice may not be the same among
different domains (vocabulary, grammar, and pronunciation) of an L2.

Number of Sessions
Spaced practice studies involve spacing within a single session or between
multiple sessions. Most single-session studies manipulate item spacing (i.e.,
studying items separated by an interval of other items), and most multiple-
session studies manipulate time spacing (i.e., studying items separated by an
interval of time). It is also possible for multiple-session studies to manipu-
late item spacing (i.e., manipulating item spacing within each session). Spaced
practice benefits have been observed when manipulated within a single session
(e.g., Nakata & Suzuki, 2019b) as well as between multiple sessions (e.g., Li
& DeKeyser, 2019). However, it is not clear whether the number of sessions
affects outcomes. Therefore, it may be methodologically and pedagogically
valuable to see whether it influences learning through spaced practice.

Type of Practice
Spaced practice can involve repeated practice in studying materials (study
trials), retrieving information from memory (test trials), or a combination
of studying and retrieval (test–restudy or study–test trials; e.g., Roediger &
Karpicke, 2006). Several studies have revealed long-term retention benefits
of information relearned in spaced practice (e.g., Verkoeijen, Rikers, & Öz-
soy, 2008). Other studies found that repeatedly assessing information across
time promotes learning (e.g., Lawrence, 2013). This suggests that both spaced
restudy and retrieval practice are effective for learning and retention. However,
studies comparing repeated restudy practice (study trials) to repeated retrieval
followed by feedback across time (test–restudy trials) found that the best re-
tention occurred in the test–restudy trials (e.g., Butler & Roediger, 2007). L2
studies have found positive effects of retrieval relative to restudy on L2 vocabulary
learning and retention (e.g., Barcroft, 2007). None of these studies,
however, involved spacing as an independent variable. Furthermore, there has
been no empirical research comparing restudy to retrieval on L2 grammar or
pronunciation.

Activity Type
Research (e.g., Dunlosky, Rawson, Marsh, Nathan, & Willingham, 2013) has
found the benefits of spaced practice to be general across a range of mate-
rials, such as verbal materials (e.g., word pairs, facts), visual materials (e.g.,
pictures, videos), and educational materials (e.g., lectures, mathematical for-
mulas). However, not all tasks yield large benefits of spaced practice. Dono-
van and Radosevich (1999) found that there was a large spacing effect with
a low level of task complexity, d = 0.97, 95% CI [0.88, 1.06], but a small
effect with a high level of task complexity, d = 0.07, 95% CI [−0.05, 0.18].
Spaced practice for L2 learning has also been studied with a wide range of
activities: paired-associate tasks (e.g., Nakata, 2015a), listening and reading
activities for form–meaning mapping (e.g., Kasprowicz et al., 2019), judgment
tasks (e.g., Li & DeKeyser, 2019), oral description using pictures (e.g., Suzuki
et al., 2020), and exercises such as multiple-choice tasks, fill-in-the-blanks
tasks (e.g., Bloom & Shuell, 1981), and crossword puzzles (e.g., Rogers &
Cheung, 2020b). These activities are used to help L2 learners to comprehend
target items (e.g., multiple-choice tasks, reading texts, listening and identifying
the correct spoken forms of words) and to produce target items (e.g., picture
description, making sentences, pronouncing words). Donovan and Radosevich
(1999) coded foreign language tasks (L2–L1 word pairs) as representing an
average level of task complexity and found a small-to-medium effect of spac-
ing, d = 0.42, 95% CI [0.36, 0.48]. However, there might be a difference in the
level of difficulty that learners experience in comprehending versus producing
target items, and hence this may impact the magnitude of spacing effects.

Provision of Feedback
Studies have demonstrated that spacing effects may be influenced by the pro-
vision of feedback after retrieval (e.g., Roediger & Karpicke, 2006). Cepeda
et al. (2006) reported that feedback may be a variable that explains differences
between equal and expanding spacing; when feedback is provided, expanding
spacing benefits performance because feedback minimizes the chance of for-
getting an item (Pashler, Cepeda, Wixted, & Rohrer, 2005). However, Cepeda
et al. (2006) could not examine the effect of feedback because all three stud-
ies included in their meta-analysis for equal and expanding spacing provided
feedback. It would be useful to examine the effects of feedback because spaced
practice studies that have provided feedback have reported contrasting results.
For example, Kang et al. (2014) failed to find a positive effect for expanding
spacing with feedback relative to equal spacing with feedback, whereas Nakata
(2015a) found expanding spacing with feedback to be superior to equal spac-
ing with feedback. However, it should be noted that Nakata found a significant
effect of expanding spacing only on a posttest involving receptive recall (from
L2 to L1), with very small effect sizes, d = 0.12−0.19, 95% CI [−0.80, 0.53].
Furthermore, given that feedback to correct learners’ responses has generally
been found to be beneficial to L2 learning (e.g., Li, 2010), it would be interest-
ing to see whether the effect of spaced practice is moderated by feedback.

Feedback Timing
The timing of feedback may also moderate learning through spaced practice.
Some studies in cognitive psychology found that delayed feedback (e.g., feed-
back given after all responses) had a greater effect on learning than imme-
diate feedback (e.g., Butler, Karpicke, & Roediger, 2007), but others found
more benefit from immediate feedback (e.g., Brosvic, Epstein, Cook, & Dihoff,
2005). The superiority of delayed feedback can be explained by the fact that
delayed feedback results in more laborious learning circumstances, which fits
with the desirable difficulty framework (e.g., Bjork, 1994; Suzuki et al., 2019).
In contrast, because immediate feedback is generally provided after each re-
sponse, it is more likely to make learners fully process feedback after both
incorrect and correct responses (Butler & Roediger, 2007).
In L2 studies, Nakata (2015b) examined feedback timing (immediate and
delayed) in four different repeated retrieval practice conditions (one, three, five,
or seven retrievals). Sixteen English–Japanese word pairs were divided into
two sets of eight items. One set was assigned to the immediate feedback con-
dition, in which feedback was provided immediately after each response. The
other set was assigned to the delayed feedback condition, in which feedback
was provided after all eight items were performed. The interval between the
last encounter with a given item and the posttest was controlled. Nakata found
no main effect of feedback timing for L2 vocabulary learning on either recep-
tive (from L2 to L1) or productive (from L1 to L2) recall posttests. On the
1-week delayed posttest, he found a significant effect of the immediate feed-
back on only the receptive recall posttest, with a very small effect size, d =
0.14, 95% CI [0.03, 0.51]. However, because this study did not manipulate the
spaced learning conditions, the effect of feedback timing on spaced practice
for L2 vocabulary learning and retention remains unclear. Furthermore, there
has been no empirical research on L2 grammar or pronunciation learning that
has directly investigated the interaction between spacing and feedback timing.
Given that the impact of feedback on learning and memory has been endorsed
by the majority of investigations, it is useful to examine whether immediate or
delayed feedback is more conducive to L2 learning in more versus less spaced
conditions.

Frequency of Practice
Spaced practice studies have included different numbers of encounters with
target items, ranging from one or two (e.g., Pyc & Rawson, 2009) to 27 or 30
(e.g., Suzuki, 2017). Greater frequency of practice can provide learners with
more time to restudy or more attempts to retrieve. Maddox and Balota (2015)
found, in an L1 study using low associate word pairs (e.g., apple–evil), significant
increases in retrieval practice performance as the number of tests during
the training sessions increased from one to five in a shorter spacing condition,
whereas in a longer spacing condition retrieval practice performance increased
from the one-test to the three-test condition, but did not increase further in the
five-test condition. These findings may suggest that providing more practice
does not always lead to better performance or better retention. Nakata (2017)
looked at the role of retrieval frequency (one, three, five, or seven retrievals)
within a single session for L2 vocabulary learning. He found that five or seven
retrievals led to better performance than one or three retrievals on both im-
mediate and delayed posttests. To our knowledge, there is no L2 empirical re-
search investigating the relationship between spaced conditions and frequency
of practice.

Retention Interval
Spaced practice effects may depend on when knowledge is measured (Cepeda
et al., 2006; Cepeda, Vul, Rohrer, Wixted, & Pashler, 2008; Rohrer & Pash-
ler, 2007). Cepeda et al. (2006) found a positive relationship between spacing
intervals and retention intervals (RIs); the longer the spacing, the greater the
retention. Rohrer and Pashler (2007) reported that spacing effects depended
jointly on spacing intervals and RI, arguing that the learning outcomes of dif-
ferent types of spaced practice may be better or worse depending on when the
final test is taken. Cepeda et al. (2008) found that longer spacing produces
better retention than shorter spacing at long RIs, whereas shorter spacing out-
performed longer spacing at short RIs. These findings suggest that the length
of RI may have a considerable impact on the effects of spaced practice.

The Present Study

Research Questions
The current meta-analysis was guided by the following research questions:
1. To what extent does spacing affect L2 learning?
2. To what extent do learning gains differ in relation to type of spacing?
3. Which empirically motivated variables (age, learning target, number of ses-
sions, type of practice, activity type, provision of feedback, feedback tim-
ing, frequency of practice, and RI) moderate the effects of spaced practice?

Method
Literature Search
First, we comprehensively searched 22 relevant journals of cognitive psychol-
ogy, applied psychology, applied linguistics, and second language acquisition
for different combinations of key words: spacing effect, massed, interleaving,
blocking, lag effect, shorter spacing, longer spacing, absolute spacing, relative
spacing, equal spacing, fixed spacing, uniform spacing, expanding spacing,
second language learning, and foreign language learning. We then employed
the following electronic databases in order to extend the search: Education
Resources Information Center, Linguistics and Language Behavior Abstracts,
PsycINFO, and Google Scholar. In addition, we searched references in review
articles (e.g., Cepeda et al., 2006) and in book chapters (e.g., Carpenter, 2017).
We set 1979 as the starting point because Bahrick’s study from that year is one
of the classic experiments on spaced practice (as observed by Dunlosky et al.,
2013), and because there were very few L2 empirical studies prior to 1979 (cf.
Crothers & Suppes, 1967, Experiments 8, 9, 10, and 11), and those that existed
did not report sufficient statistical information to calculate effect sizes. We set
July 2020 as the completion point for our data collection.
In order to minimize the “file-drawer” problem in research synthesis (the
fact that some studies remain in researchers’ files because of the publication
bias toward studies reporting significant findings; Rosenthal, 1979), we consid-
ered retrieving “fugitive” literature (e.g., unpublished papers, doctoral theses,
conference presentations). However, due to the difficulty involved in retrieving
those sources, we decided to include only doctoral theses that were carefully
designed and provided detailed statistical information. We used the electronic
database ProQuest Global Dissertations and Theses to search for doctoral the-
ses, employing the same key words as for published studies.

Inclusion Criteria
All reports that appeared initially eligible for the meta-analysis were then ex-
amined in reference to a set of inclusion criteria. To be included in the meta-
analysis, a study report had to meet all of the following criteria:
1. The study had to examine the effect of spaced practice on L2 learning.
We took L2 learning to include learning of L2 vocabulary such as sin-
gle words or collocations (Snoder, 2017), L2 grammatical structures (e.g.,
past perfect tense; Bird, 2010), L2 morphological features (e.g., Japanese
te-form of the verb; Suzuki & DeKeyser, 2017a), L2 pronunciation (e.g.,
Mandarin monosyllables such as ba with different tones; Li & DeKeyser,
2019), and orthographic and phonological nonsense items (e.g., Nakata &
Elgort, 2021).
2. The study had to feature a comparison of one type of practice with an-
other type of practice in order to examine the effects of spaced practice
(i.e., comparing spaced with massed practice, longer with shorter spacing,
or equal with expanding spacing). For example, Uchihara et al.’s (2019)
meta-analysis included massed and spaced conditions as a moderator to
examine frequency effects in L2 incidental vocabulary learning. However,
the studies included in their meta-analysis were not included in the cur-
rent meta-analysis because none of them qualified as a comparative study
examining the effects of spaced practice.
3. Studies comparing blocking to interleaving were included (Carpenter &
Mueller, 2013; Nakata & Suzuki, 2019b; Pan, Tajran, Lovelett, Osuna, &
Rickard, 2019; Suzuki et al., 2020). Blocking corresponds to massed prac-
tice or shorter spacing (not pure massed practice), whereas interleaving is
equivalent to spaced practice or longer spacing (see Appendix S2 in the
Supporting Information online for the category criteria).
4. The study had to provide clear spacing intervals. For example, we ex-
cluded the study by Lightbown and Spada (1994) that compared 18 hours
per week to 2 hours per week because it was not clear whether the time distribution
was shorter or longer, or equal or expanding. Additionally,
we excluded studies involving spaced practice with different criterion lev-
els via a dropout method (where items that were correctly retrieved during
a trial were removed from the to-be-practiced list in the subsequent trial),
because the number of test–restudy trials per item was variable between
participants (e.g., five-drop group, Pyc & Rawson, 2007).
5. The study had to control for participants’ preexisting knowledge of target
items (vocabulary, grammatical features, and pronunciation rules).
Conducting a pretest to show no statistically significant difference between
groups on the pretest (e.g., Suzuki et al., 2020), using nonsense
items (e.g., Nakata & Elgort, 2021) or a miniature artificial language (e.g.,
Suzuki, 2017), and pilot testing of target items with another population
(e.g., Nakata & Elgort, 2021) are common ways of controlling for prior
knowledge.
6. Studies adopting either intentional or incidental learning conditions for
the target L2 items were included. In the former, target items are explicitly
taught or studied (e.g., Bird, 2010). In the latter, the target items are not
explicitly taught or studied, and participants are not told about subsequent
posttests (e.g., Serrano & Huang, 2018).
7. The study had to provide enough statistical information for effect size cal-
culation. Several studies (e.g., Bahrick, 1979) did not provide enough in-
formation to calculate effect sizes. We contacted authors and were grateful
to receive additional information that allowed us to complete the current
meta-analysis (our thanks to Emilie Gerbier, Jeffrey Karpicke, Sean Kang,
Steve Pan, and Thomas Toppino).
8. When the study included more than one experiment with different par-
ticipants, each experiment contributed an effect size in the meta-analysis
(e.g., Pan et al., 2019).
9. Replicated or extended studies had to involve different data samples. For
example, Suzuki (2017) reported the same data as Suzuki (2018, 2019);
Suzuki and DeKeyser (2017a) reported the same data as Suzuki and
DeKeyser (2017b). In this meta-analysis, we included the study by Suzuki
(2017), which was replicated, and the study by Suzuki and DeKeyser
(2017a), which examined the effects of spaced practice as the main focus,
whereas we excluded the authors’ subsequent studies (Suzuki, 2018, 2019;
Suzuki & DeKeyser, 2017b), which reanalyzed the same data using cog-
nitive aptitude (e.g., working memory) from the perspective of aptitude–
treatment interaction.
10. The study was written in English.
11. Studies adopting both within-participants and between-participants de-
signs were included. In a within-participants design, the independent vari-
able (spacing) is manipulated within participants. For example, half of the
items might be studied in a massed condition whereas the other half are
studied in a spaced condition. In a between-participants design, spacing
is manipulated between participants. For instance, half of the participants
study the items with a massed condition and the other half study them with
a spaced condition.

Figure 1 PRISMA flow diagram.

The PRISMA flow diagram presented in Figure 1 depicts the study inclu-
sion criteria (Moher, Liberati, Tetzlaff, & Altman, 2009) and provides the num-
ber of included and excluded references. More detailed information is reported
in Appendix S1 in the online Supporting Information for this article.
Forty-eight experiments reported in 37 studies (N = 3,411) were selected
for this meta-analysis. The 48 experiments were then divided into three differ-
ent categories of spaced schedules: (a) spaced versus massed, (b) longer versus
shorter, and (c) equal versus expanding comparisons (see Appendix S2 in the
Supporting Information online for the category criteria). Each category was
meta-analyzed separately.

Coding: Dependent and Moderator Variables

The dependent variables were the effect sizes derived from the included
L2 studies. The effect sizes were classified as either immediate effects
(from immediate posttest scores) or delayed effects (from delayed posttest
scores). Some previous studies on spaced practice involved filler tasks (e.g.,
5-minute U.S. state naming, Carpenter & Mueller, 2013; 73-second 10
two-digit additions, Nakata, 2015a), followed by immediate posttests, and
other studies involved delayed posttests 1 day after treatments (e.g., Pashler
et al., 2003). In the current meta-analysis, a test was defined as an immediate
posttest if it was taken on the same day as the treatment (i.e., immediately after
the last training session or after a filler task administered at the end of the last
training session); a test was defined as a delayed posttest if it was taken after
a delay of 1 day or greater following the treatment.
Following Suzuki (2017), when a study administered two or more delayed
posttests, only the last posttest’s score was included and coded as the dependent
variable. For example, when a study administered two delayed posttests (e.g.,
7-day and 35-day delayed posttests), the first delayed posttest was regarded
as a learning session, and the second (last) posttest’s score was included and
coded as the dependent variable (for delayed effect). When the posttest was
administered at three different RIs (e.g., 1-day, 4-week, and 8-week delayed
posttests; Schuetze, 2014), the first and second delayed posttests were regarded
as learning sessions, and the last delayed posttest’s score was coded as the
dependent variable. Note that this was the case only if the RI was manipulated
within participants.
When there were multiple types of posttests, a shifting unit of analysis was
adopted (Patall, Cooper, & Robinson, 2008). For example, if a study involved
two different types of immediate and/or delayed posttests (e.g., matching and
grammaticality judgment tests, Kasprowicz et al., 2019; error correction and
translation tests, Miles, 2014), two separate effect sizes were calculated. To
estimate the overall effect of spacing, we averaged these two effect sizes so
that the sample contributed only one effect size. However, we did not include
reaction time data, because such data were provided in only a few of the studies
included in the meta-analysis (k = 4: Li & DeKeyser, 2019; Nakata & Elgort,
2021; Suzuki, 2017; Suzuki & DeKeyser, 2017a), and they are based on differ-
ent metrics (e.g., speed rate or word processing). For example, Suzuki’s (2017)
study measured accuracy (from vocabulary and grammar tests) and speed
(from reaction time), and we included only data from the accuracy measure.
Another reason not to include reaction time data was that Avery and Marsden
(2019) found that effect sizes from reaction time data are considerably lower
than the field averages, and they speculated that this could be because the
standard deviations are normally larger than for other metrics.
In the current meta-analysis, therefore, each of the three categories (spaced
vs. massed, longer vs. shorter, and equal vs. expanding) includes two differ-
ent timings of the outcome measures: immediate and delayed effects in the
spaced versus massed category, immediate and delayed effects in the longer
versus shorter category, and immediate and delayed effects in the equal versus
expanding category.
We included a total of nine moderator variables: one learner-related vari-
able (age) and eight methodology-related variables (learning target, number
of sessions, type of practice, activity type, provision of feedback, feedback
timing, frequency of practice, and RI) (see Appendix S3 in the Supporting In-
formation online for the coding scheme). The coding sheet with the data (Kim
& Webb, 2021) is publicly available at http://www.iris-database.org.

Age
Because a limited number of studies reported the age of participants (21 of
37 studies, 57%), age was initially categorized according to grade levels (e.g.,
Grade 3, Rogers & Cheung, 2020a). However, because some studies involved
participants with a wide range of grade levels (e.g., Grades 9−12, Bloom &
Shuell, 1981; Grades 3−8, Lotfolahi & Salehi, 2016) or involved adults rang-
ing from 20 to 63 years (Kang et al., 2014), which makes it difficult to deter-
mine the differential effects of spaced practice on learners of different grade
levels, this variable was coded as young learners (Grades 1−12) and adult
learners (university students or older).

Learning Target
This variable consists of three types of L2 items: vocabulary (both single words
and multiword items), grammar (including morphological structure), and pro-
nunciation (a monosyllabic item with different tones or pronunciation rules).

Number of Sessions
This variable was coded as single session and multiple sessions. Note that the
number of sessions includes only training sessions and does not include testing
(immediate or delayed posttest) sessions. For example, if a study used time
spacing (e.g., a 10-minute interval between trials) within one training session,
followed by testing sessions (e.g., one immediate and two delayed posttests),
the study is coded as single session.

Type of Practice
Practice includes two types of conditions: study trial and test trial. A study trial
refers to an opportunity to restudy the target items that participants learned or
studied. A test trial refers to an opportunity to recall or retrieve the target items
that participants learned or studied. Note that feedback provided after a test
trial can also be an opportunity to restudy the target items that participants
learned in the initial learning session. Type of practice was coded as being one
of five types: test–restudy (all) trial (testing, followed by restudying all target
items); test–restudy (not recalled) trial (testing, followed by restudying only
the items that were not recalled); study trial; test trial; and study–test trial (for
details, see Tables S4.2 and S4.3, Appendix S4, in the Supporting Information
online).

Activity Type
This variable was coded as one of: paired associate; comprehension activi-
ties; production activities; and combined activities that involved both compre-
hension and production activities. Paired-associate learning included learning
from word lists or word cards. As in the descriptions of activities reported in the
meta-analysis by Shintani, Li, and Ellis (2013), L2 activities other than paired-
associate learning were coded as comprehension or production activities. Ad-
ditionally, activities that involved both comprehension (e.g., multiple-choice
tasks) and production (e.g., fill-in-the-blanks tasks) were coded as combined
activities. Note that although a paired-associate task can involve either receptive
retrieval (comprehending the L1 meaning of an L2 word) or productive retrieval
(producing the L2 word corresponding to a given L1 word), we consider
paired-associate tasks as a separate type of activity, distinct from comprehen-
sion, production, and combined activities (for details, see Tables S4.4 and S4.5,
Appendix S4, in the Supporting Information online).

Provision of Feedback and Feedback Timing

Provision of feedback was coded according to the absence or presence of feed-
back. The presence of feedback was further categorized into two subgroups
according to feedback timing (whether feedback was provided immediately
after each response or with a delay).

Frequency of Practice
Frequency of practice was reported as the amount of repeated practice (ex-
cluding the initial presentation to learn target items). Thus, this is different
from the total number of sessions, which includes the presentation, practice,
and posttest sessions used in the treatment. For example, Nakata and Suzuki
(2019a) included two sessions: The first session consisted of the pretest, learn-
ing session (presentation followed by three test trials), and immediate posttest,
whereas the second session involved a delayed posttest. Frequency of practice
in this study is 3 and the total number of sessions is 2. Following Suzuki (2017),
when a study administered two posttests (immediate and delayed) and RI was
manipulated within participants, the immediate posttest can be regarded as a
learning session. When a study administered three posttests (immediate and
two delayed), the immediate posttest and the first delayed posttest are regarded
as learning sessions. Thus, in Nakata and Suzuki’s (2019a) study, whereas fre-
quency of practice was 3 at the time point for immediate posttest, frequency of
practice was 4 at the time point for delayed posttest.1

Retention Interval
RI was coded as the number of days between the last learning session and
the final posttest. In the current meta-analysis, six studies administered multi-
ple delayed posttests (Bird, 2010; Li & DeKeyser, 2019; Lotfolahi & Salehi,
2016; Schuetze, 2014; Suzuki, 2017; Suzuki & DeKeyser, 2017a). Suzuki
(2017) pointed out that the first delayed posttest could influence the reten-
tion of knowledge measured by the second delayed posttest. Hence, the first
delayed posttest was considered another retrieval practice in Suzuki’s (2017)
study. Following Suzuki (2017), if a study involved 7-day and 35-day delayed
posttests, the calculated RI is 28 days (RI of the last delayed posttest − RI of
the delayed posttest administered before the last delayed posttest; 35 days − 7
days = 28 days).2 It should be noted that this was the case only if the RI was
manipulated within participants.3
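
The RI coding rule described above can be illustrated with a minimal Python sketch. The function name `coded_retention_interval` and its input format are hypothetical and introduced only for illustration; the sketch assumes that all delays are expressed in days after the last training session and that RI was manipulated within participants, so that earlier delayed posttests count as learning sessions.

```python
def coded_retention_interval(posttest_days):
    """Return the RI (in days) coded for the final delayed posttest.

    posttest_days: delays, in days after the last training session, at which
    the same participants took delayed posttests (e.g., [7, 35]).
    """
    days = sorted(posttest_days)
    if not days:
        raise ValueError("at least one delayed posttest is required")
    if len(days) == 1:
        # Only one delayed posttest: the RI is simply its delay.
        return days[0]
    # Earlier delayed posttests are treated as learning sessions, so the RI
    # runs from the second-to-last posttest to the last posttest.
    return days[-1] - days[-2]


# The 7-day and 35-day example from the text: 35 - 7 = 28 days.
ri_example = coded_retention_interval([7, 35])
```

Under these assumptions, a study with a single 14-day delayed posttest would simply be coded with an RI of 14 days.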

Reliability of the Coding

To assess the reliability of our coding procedures, 12 studies (approximately
32% of 37 studies) were randomly selected and independently coded by a
second rater. The second rater is an expert in the area of spaced practice re-
search with a doctoral thesis examining the effects of spaced practice on L2
vocabulary learning. Agreement between the two raters was assessed with
Cohen’s kappa (a statistic for either interrater or intrarater reliability
testing). The overall agreement was rated at 99% (almost perfect agreement;
Cohen, 1960). After all disagreements were resolved
through discussion, the first author coded the remaining studies (see Appendix
S5 in the Supporting Information online for coding reliability, including Cohen’s
kappa [κ] for each variable that was coded).
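
As a rough illustration of how Cohen's kappa corrects raw agreement for chance, the following Python sketch computes kappa for one categorical variable. The function name `cohen_kappa` and the two lists of codes are hypothetical examples, not data from the coding sheet.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item set")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for "number of sessions" in 12 double-coded studies,
# with one disagreement (illustration only).
rater_1 = ["single"] * 6 + ["multiple"] * 6
rater_2 = ["single"] * 5 + ["multiple"] * 7
kappa = cohen_kappa(rater_1, rater_2)  # observed 11/12, expected 0.5
```

In this made-up case the raters agree on 11 of 12 studies, yet kappa is about 0.83 because half of that agreement would be expected by chance.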

Data Analysis
We used Comprehensive Meta-Analysis (version 3.3) software (Borenstein,
Hedges, Higgins, & Rothstein, 2013) to calculate the overall effect sizes and
conduct analyses for nine moderator variables. In order to address the first re-
search question, we aggregated effect sizes from the studies included in the
spaced versus massed comparison to produce a weighted mean effect size. For
the second research question, we aggregated effect sizes from the studies in-
cluded in the longer versus shorter and equal versus expanding categories. To
aggregate effect sizes, we used a random-effects model (using the unrestricted
maximum likelihood method) so that variation in intervention effects across
studies was accommodated (Borenstein, Hedges, Higgins, & Rothstein, 2009).
A significant between-group Q value indicates that effect sizes are distributed
heterogeneously rather than sharing a common effect size among identified
samples, and thus supports subsequent moderator analyses. However, a
nonsignificant Q value should not be taken as assurance that the effects are
consistent, because the Q statistic and its p value address only the viability
of the null hypothesis (Borenstein et al., 2009). In the current meta-analysis,
therefore, we also report I² statistics (the proportion of observed variance in
effect sizes that reflects variance in true effects rather than sampling error),
tau (the standard deviation of true effects), and the prediction interval (the
range within which true effect sizes are expected to fall across studies), which
are intended to quantify heterogeneity (the distribution of effects; Borenstein,
Higgins, Hedges, & Rothstein, 2017). For the last research question, we
conducted moderator analyses in all of the three categories
(spaced vs. massed; longer vs. shorter; and equal vs. expanding). A random-
effects meta-regression (using the unrestricted maximum likelihood method)
was performed for continuous variables (frequency of practice and RI). Results
were considered statistically significant when the p value was less than the
prespecified alpha of .05.
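
To make the heterogeneity statistics above more concrete, the following Python sketch pools a set of effect sizes under a random-effects model and returns Q, I², and tau. It is only a sketch: it uses the DerSimonian-Laird method-of-moments estimator for brevity, which is an assumption and not the unrestricted maximum likelihood method implemented in Comprehensive Meta-Analysis, and the function name and input values are hypothetical.

```python
import math


def random_effects_summary(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Note: DerSimonian-Laird is an assumption made for brevity; the
    meta-analysis itself reports a maximum likelihood estimator.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effects with matching variances")
    k = len(effects)
    # Fixed-effect (inverse-variance) weights are needed to compute Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # Method-of-moments estimate of the between-study variance (tau squared).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # I squared: share of observed variance attributed to true effects.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau squared to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "g": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau": math.sqrt(tau2),
    }


# Hypothetical effect sizes and sampling variances, for illustration only.
summary = random_effects_summary([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
```

With these made-up inputs, Q exceeds its degrees of freedom, so part of the observed dispersion is attributed to variance in true effects (a nonzero tau) rather than to sampling error alone.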

Effect Size Calculation

To calculate the effect size of each study, the standardized mean difference
from a study that used two independent groups was estimated and converted to
Hedges’s g by multiplying it by a correction factor: J = 1 − (3/[4 × df − 1]). The
overall effect size was calculated by weighting the average effect size for each
study according to sample size and then pooling the effect sizes across studies.
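
The conversion described above can be sketched in Python as follows. The function name `hedges_g` and the descriptive statistics in the example are hypothetical; the sketch assumes two independent groups, a pooled standard deviation, and df = n1 + n2 − 2, together with the usual large-sample variance formula for the standardized mean difference.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_b, sd_b, n_b):
    """Return (g, variance of g) for treated vs. baseline independent groups."""
    df = n_t + n_b - 2
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_b - 1) * sd_b ** 2) / df)
    # Standardized mean difference (Cohen's d).
    d = (mean_t - mean_b) / pooled_sd
    # Small-sample correction factor J = 1 - 3 / (4 * df - 1).
    j = 1 - 3 / (4 * df - 1)
    # Approximate sampling variance of d, then of g.
    var_d = (n_t + n_b) / (n_t * n_b) + d ** 2 / (2 * (n_t + n_b))
    return j * d, j ** 2 * var_d


# Hypothetical posttest scores: spaced (treated) vs. massed (baseline) group.
g, var_g = hedges_g(12.0, 4.0, 20, 10.0, 4.0, 20)
```

Here d is 0.50 and the correction shrinks it slightly (g is about 0.49); a positive value favors the treated condition, in line with the coding of comparative effect sizes in this section.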
Because the current study examines the effectiveness of spaced practice
(spaced vs. massed, longer vs. shorter, and equal vs. expanding), comparative
effect sizes were computed. A comparative effect size represents the effect of
treated groups in comparison with baseline groups (Shintani et al., 2013). In
the spaced versus massed comparison, for example, a significant effect size
(g = 0.50) in the positive direction implies that spaced practice (the treated
condition) is more effective than massed practice (the baseline condition) by
0.5 standard deviation units. In contrast, a significant effect size in the negative
direction (g = −0.50) suggests that massed practice (the baseline condition)
is more effective than spaced practice (the treated condition) by 0.5 standard
deviation units. In the longer versus shorter comparison, longer and shorter
spacing data were coded as treated and baseline data, respectively. In the equal
versus expanding comparison, equal and expanding spacing data were coded
as treated and baseline data, respectively.
From 48 experiments, we identified 26 effect sizes in the spaced versus
massed comparison, including 11 with immediate posttests and 15 with de-
layed posttests. In the longer versus shorter spacing comparison, we identi-
fied 49 effect sizes, including 17 with immediate posttests and 32 with de-
layed posttests. Finally, in the equal versus expanding comparison, we identi-
fied 23 effect sizes, including 7 with immediate posttests and 16 with delayed
posttests.
The detection of outliers was performed to ensure the robustness of the re-
sults, because the presence of studies with extreme effect sizes may have an
impact on the results. Following previous meta-analyses (e.g., that by Shintani
et al., 2013), the effect sizes contributed by the included studies were transformed
into z scores, and any effect size whose z score exceeded 2.0 in absolute value
(i.e., regardless of sign) was removed from the analysis. Outlier detection was
performed repeatedly until there were no further outliers. One outlier was iden-
tified from the z-score examination (Lotfolahi & Salehi, 2017: z = 2.152).
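
The repeated outlier screening described above can be sketched in Python as follows. The function name `remove_outliers`, the study labels, and the effect sizes are hypothetical; the sketch also assumes that z scores are based on the sample standard deviation (n − 1 denominator), which the text does not specify.

```python
from statistics import mean, stdev


def remove_outliers(effects, cutoff=2.0):
    """Repeatedly drop effect sizes whose |z| exceeds the cutoff.

    effects: dict mapping a study label to its effect size.
    Returns (kept, removed), where removed maps labels to their z scores.
    """
    kept = dict(effects)
    removed = {}
    while len(kept) > 2:
        m = mean(kept.values())
        s = stdev(kept.values())  # sample SD; an assumption of this sketch
        if s == 0:
            break
        flagged = {
            label: (g - m) / s
            for label, g in kept.items()
            if abs((g - m) / s) > cutoff
        }
        if not flagged:
            break  # no further outliers, so the screening stops
        for label, z in flagged.items():
            removed[label] = z
            del kept[label]
    return kept, removed


# Hypothetical effect sizes; study "I" is deliberately extreme.
example = {"A": 0.2, "B": 0.3, "C": 0.4, "D": 0.5, "E": 0.3,
           "F": 0.4, "G": 0.2, "H": 0.5, "I": 3.0}
kept, removed = remove_outliers(example)
```

After the extreme value is dropped, the z scores are recomputed on the remaining studies, and the loop ends once none exceeds the cutoff.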
Finally, we assessed publication bias in the current data sets. Because most
studies in this meta-analysis were published (35, alongside one contribution to
conference proceedings, Khoii & Abed, 2017, and one doctoral thesis, Koval,
2020), our meta-analysis is more likely to include statistically significant findings
than statistically nonsignificant findings (Lipsey & Wilson, 2001); therefore,
a bias might influence the results of our meta-analysis. The results indicated
that publication bias is a potential threat to conclusions
drawn about the effects of spaced practice. The true magnitudes of effects of
spaced practice on L2 learning might be smaller than those reported in this
meta-analysis, though it is not known how much smaller and whether it would
affect all three categories (spaced vs. massed, longer vs. shorter, and equal vs.
expanding) of comparisons and all the moderator variables in the same way
(see Appendix S6 in the Supporting Information online for publication bias
analyses).

Results
To What Extent Does Spacing Affect Second Language Learning?
Results showed that spaced practice led to significant improvement in L2
learning and retention compared to massed practice (see Figures 2 and 3).
Spaced practice was significantly more effective than massed practice on the

Figure 2 Overall average effect size (indicated by a diamond) of spaced practice when
compared to massed practice, and effect sizes with 95% confidence intervals for each
study (dependent variable = immediate posttest scores, k = 11). Effect sizes are calcu-
lated as Hedges’s g.

Figure 3 Overall average effect size (indicated by a diamond) of spaced practice when
compared to massed practice, and effect sizes with 95% confidence intervals for each
study (dependent variable = delayed posttest scores, k = 15). Effect sizes are calculated
as Hedges’s g.

immediate posttests, g = 0.58, 95% CI [0.16, 1.00], a medium effect according
to Cohen’s benchmarks (1988; g = 0.2 for small, 0.5 for medium, and 0.8
for large), and small-to-medium with reference to the benchmarks found by
Plonsky and Oswald (2014; between-group contrast, g = 0.4 for small, 0.7 for
medium, and 1.0 for large). For the domain of psychology (g = 0.32, median
effect, Schäfer & Schwarz, 2019), however, the spacing effect of 0.58 from our
meta-analysis could be considered large. A significant Q value (Q = 54.72, p
< .001) indicates that the true effect size is not identical in all the studies. Of
the variance in observed effects, 81.72% reflects variance in true effects rather
than sampling error (I² = 81.72%), and the standard deviation of true effects (tau)
was 0.631. We predict that the true effects would fall in the range of −0.93 to
2.09, and it would make sense to apply moderator analyses or meta-regression
to explain the variance (Borenstein et al., 2009).
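
The I² value and prediction interval reported above can be approximately recovered from the other reported statistics, as the following Python sketch shows. It rests on assumptions that are not stated in the text: I² is taken as (Q − df)/Q with df = k − 1, the standard error of the mean effect is back-calculated from the 95% confidence interval, and the prediction interval uses a t critical value with k − 2 degrees of freedom (2.262 for k = 11, taken from a standard t table). The function names are hypothetical.

```python
import math


def i_squared(q, k):
    """I squared (in percent) from Q and the number of effect sizes k."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100.0


def prediction_interval(g, ci_low, ci_high, tau, t_crit):
    """Approximate 95% prediction interval for the true effects.

    The standard error of the mean effect is back-calculated from the
    reported 95% confidence interval (an assumption of this sketch).
    t_crit is the two-tailed 95% critical t value for k - 2 degrees of freedom.
    """
    se = (ci_high - ci_low) / (2 * 1.96)
    half_width = t_crit * math.sqrt(tau ** 2 + se ** 2)
    return g - half_width, g + half_width


# Spaced vs. massed, immediate posttests (k = 11), values as reported above.
i2_immediate = i_squared(54.72, 11)  # about 81.7
pi_immediate = prediction_interval(0.58, 0.16, 1.00, 0.631, 2.262)
# pi_immediate rounds to (-0.93, 2.09)
```

Because Q and the confidence limits are themselves rounded in the text, the recovered values agree with the reported ones only up to rounding.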
A spacing effect was also found on the delayed posttests, g = 0.80, 95% CI
[0.44, 1.17], and the confidence interval values (which do not cross zero) suggested
that the size of the spacing effect in the long term could be considered
medium to large (Plonsky & Oswald, 2014), and large with reference to Cohen
(1988) and to Schäfer and Schwarz (2019). A significant Q test (Q = 79.83,
p < .001) and a high I² value (82.46%) indicated substantial heterogeneity
among the identified samples. Tau was 0.639, and the prediction interval
indicates that most true effects would fall in the range of −0.64 to 2.24. This
justified subsequent moderator analyses or meta-regression.

To What Extent Do Learning Gains Differ in Relation to Type of Spacing?
Results demonstrated that the effects of shorter and longer spacing were similar
on the immediate posttests, g = −0.15, 95% CI [−0.37, 0.06]; the confidence
intervals crossed zero (see Figure 4), and tau was 0.332. The prediction interval
was −0.90 to 0.60, and we predict that the true effects would fall in this wide
range. A significant Q value (Q = 37.07, p < .001) and an I² value of 56.84%
justified subsequent moderator analyses or meta-regression. However, longer
spacing showed a greater effect than shorter spacing on the delayed posttests,

Figure 4 Overall average effect size of longer spaced practice (treated) when compared
to shorter spaced practice (baseline), and effect sizes with 95% confidence intervals for
each study (dependent variable = immediate posttest scores, k = 17). Effect sizes are
calculated as Hedges’s g.

Figure 5 Overall average effect size of longer spaced practice (treated) when compared
to shorter spaced practice (baseline), and effect sizes with 95% confidence intervals for
each study (dependent variable = delayed posttest scores, k = 32). Effect sizes are
calculated as Hedges’s g.

g = 0.40, 95% CI [0.16, 0.64] (see Figure 5). The confidence interval values,
with the lower bound only just above zero, suggested that the size of longer
spacing effects in the long term could be considered small (Plonsky & Oswald,
2014), or small to medium with reference to Cohen (1988), but in the medium
range within the domain of psychology (Schäfer & Schwarz, 2019). Tau was
0.607, and the prediction interval was −0.87 to 1.67 for the delayed effects.
We would predict that the true effect sizes would fall in this wide range. A
significant Q value (Q = 163.63, p < .001) and an I² value of 81.05% justified
subsequent moderator analyses or meta-regression.
Results showed that equal spacing was as effective as expanding spacing
on both immediate posttests, g = 0.15, 95% CI [−0.07, 0.37], and delayed
posttests, g = −0.15, 95% CI [−0.33, 0.03]; the confidence intervals crossed
zero (see Figures 6 and 7). The I² value for the immediate posttests was zero,
suggesting that almost no observed variance remained to be explained; thus, no
subsequent moderator analysis for the immediate effects is reported. The I²
value for the delayed posttests (27.19%) indicated that only a small part of the
observed dispersion reflected variance in true effects. For the delayed effects,
tau was 0.188, and the prediction interval was −0.60 to 0.30.

Figure 6 Overall average effect size of equal spaced practice (treated) when compared
to expanding spaced practice (baseline), and effect sizes with 95% confidence intervals
for each study (dependent variable = immediate posttest scores, k = 7). Effect sizes are
calculated as Hedges’s g.

Figure 7 Overall average effect size of equal spaced practice (treated) when compared
to expanding spaced practice (baseline), and effect sizes with 95% confidence intervals
for each study (dependent variable = delayed posttest scores, k = 16). Effect sizes are
calculated as Hedges’s g.

Subsequent moderator analyses and meta-regression for the delayed effects in
the equal versus expanding comparison were therefore only partly justified, and
any indication that moderator variables explain the variance should be
interpreted cautiously.
It should be noted that publication bias analyses indicated that apparent
bias exists in the subset of effects from delayed posttests from the spaced versus
massed comparison. However, the results of p-uniform (see Appendix S6 in the
Supporting Information online) showed that the bias is negligible. In the subset
of effects from immediate posttests from the equal versus expanding comparison,
I² and tau were zero, indicating that estimates of p-uniform should be examined.
P-uniform enables testing of the extent of heterogeneity and considers
the statistical significance of effect sizes (van Aert, Wicherts, & van Assen,
2016). However, the results of both p-uniform and the random-effects model
were similar (very small effects with confidence intervals that crossed zero),
which led to the conclusion that random-effects meta-analysis results may be
interpreted as the standard meta-analytic estimates. Because most studies in-
cluded in the current meta-analysis were published studies (published studies
= 35, contribution to conference proceedings = 1, and PhD thesis = 1), a sym-
metrical distribution may not rule out publication bias. Therefore, the overall
effects of spaced practice on L2 learning from the current meta-analysis should
be interpreted with caution.

Which Empirically Motivated Variables Moderate the Effects of Spacing?

The Q test indicates whether a variable is a significant predictor; that is,
whether the comparative effect sizes differ significantly across the levels of
that variable. However, because of small samples in the
current meta-analysis, we interpret the results by focusing more on effect sizes
and their confidence interval values. Recall that the moderator analyses are
based on the comparative effect sizes; a positive effect size indicates a better
effect for the treated group and a negative effect size shows a superior effect
for the baseline group. No moderator analysis for the immediate effects in the
equal versus expanding comparison (I² = 0, tau = 0) was reported. Separate
meta-regression analyses (using the unrestricted maximum likelihood method)
for two continuous variables (frequency of practice and RI) were performed to
determine whether these variables were significant predictors of the effects of
spaced practice on L2 learning. Moderator analyses for learner and method-
ological variables are shown in Tables 1 and 2 (see Appendix S8 in the Sup-
porting Information online for details).

Age
Spacing promoted better learning when it involved adult learners, g = 0.66,
95% CI [0.13, 1.20], than when it involved young learners, g = 0.39, 95% CI
[−0.44, 1.22]. However, in the long term, the effects were larger with young
learners, g = 0.97, 95% CI [0.11, 1.82], than with adult learners, g = 0.77,
95% CI [0.36, 1.18]. Longer spacing led to significantly better retention than
shorter spacing when it involved adult learners, g = 0.54, 95% CI [0.27, 0.81].

Table 1 Moderator analyses for categorical variables (immediate posttests)

                                                        95% CI             Q tests
Variables                         k      g  Variance     LL     UL    pᵃ      Q    pᵃ

Age
  Spaced vs. massed                                                        0.30   .58
    Young                         3   0.39      0.03  −0.44   1.22   .36
    Adult                         8   0.66      0.10   0.13   1.20   .01
  Longer vs. shorter                                                       0.45   .50
    Young                         3  −0.03      0.04  −0.42   0.37   .89
    Adult                        14  −0.19      0.02  −0.44   0.06   .14
  Equal vs. expanding                                                      2.18   .14
    Young                         2   0.35      0.09   0.01   0.69   .05
    Adult                         5   0.01      0.02  −0.29   0.30   .96
Learning target
  Spaced vs. massed                                                        1.71   .19
    Vocabulary                    8   0.76      0.08   0.26   1.25   .00
    Grammar                       3   0.14      0.08  −0.64   0.92   .72
  Longer vs. shorter                                                      15.59   .00
    Vocabulary                    9   0.14      0.02  −0.11   0.38   .28
    Grammar                       4  −0.41      0.02  −0.70  −0.13   .01
    Pronunciation                 4  −0.64      0.03  −0.98  −0.30   .00
Number of sessions
  Spaced vs. massed                                                        5.86   .02
    Single session                6   1.04      0.01   0.49   1.59   .00
    Multiple sessions             5   0.04      0.06  −0.55   0.63   .88
  Longer vs. shorter                                                       0.78   .38
    Single session               10  −0.08      0.03  −0.40   0.23   .60
    Multiple sessions             7  −0.27      0.02  −0.52  −0.01   .04
  Equal vs. expanding                                                      0.25   .62
    Single session                4   0.07      0.03  −0.29   0.44   .70
    Multiple sessions             3   0.19      0.06  −0.12   0.51   .23
Type of practice
  Spaced vs. massed                                                        1.34   .72
    Test–restudy (all)            6   0.69      0.13   0.05   1.34   .04
    Test–restudy (not recalled)   2   0.48      0.05  −0.06   1.55   .39
    Study trial                   2   0.81      0.45  −0.34   1.97   .17
  Longer vs. shorter                                                      11.74   .01
    Test–restudy (all)            6   0.22      0.02  −0.08   0.51   .16
    Test–restudy (not recalled)   3  −0.54      0.03  −0.89  −0.18   .00
    Study trial                   5  −0.41      0.05  −0.86   0.04   .07
    Study–test trial              3  −0.24      0.02  −0.54   0.07   .13
Activity type
  Spaced vs. massed                                                        1.91   .59
    Paired associate              3   0.67      0.60  −0.29   1.63   .17
    Comprehension activities      3   0.97      0.37   0.04   1.91   .04
    Production activities         2   0.68      0.03  −0.42   1.78   .22
    Combined activities           3   0.07      0.03  −0.85   1.00   .88
  Longer vs. shorter                                                      13.75   .00
    Paired associate              7   0.17      0.02  −0.11   0.45   .24
    Comprehension activities      4  −0.38      0.03  −0.69  −0.07   .02
    Production activities         3  −0.64      0.03  −0.99  −0.28   .00
    Combined activities           3  −0.15      0.08  −0.71   0.41   .60
Provision of feedback
  Spaced vs. massed                                                        1.32   .25
    Absence                       2   0.02      0.06  −0.99   1.02   .98
    Presence                      7   0.69      0.09   0.15   1.23   .01

Note. LL = lower limit; UL = upper limit. ᵃ Estimates in boldface are statistically
significant at α = .05.

However, longer spacing was as effective as shorter spacing when it involved
young learners. Because the sample sizes for young learners were small (k =
3 in the spaced vs. massed comparison and k = 8 in the longer vs. shorter
comparison), readers should be cautious in interpreting the benefits of spaced
practice with young learners.

Learning Target
Spacing led to better learning and retention when it involved L2 vocabulary,
g = 0.76−1.15, 95% CI [0.26, 1.49], than when it involved L2 grammar,
g = 0.11−0.14, 95% CI [−0.64, 0.92]. However, the confidence intervals for
L2 grammar learning crossed zero, suggesting that the spacing effects could
be statistically unstable when learning involves L2 grammar. Shorter spacing
was significantly more effective for the immediate learning of L2 pronuncia-
tion, g = −0.64, 95% CI [−1.06, −0.21] (not passing through zero), and of
grammar, g = −0.41, 95% CI [−0.70, −0.13] (not passing through zero), but

Table 2 Moderator analyses for categorical variables (delayed posttests)

                                                          95% CI              Q tests
Variables                         k       g  Variance      LL      UL     p       Q     p

Age
  Spaced vs. massed                                                            0.16   .69
    Young                         3    0.97      0.25    0.11    1.82   .03
    Adult                        12    0.77      0.04    0.36    1.18   .00
  Longer vs. shorter                                                           4.35   .04
    Young                         8   −0.04      0.03   −0.52    0.44   .86
    Adult                        24    0.54      0.02    0.27    0.81   .00
Learning target
  Spaced vs. massed                                                           13.78   .00
    Vocabulary                   10    1.15      0.04    0.81    1.49   .00
    Grammar                       5    0.11      0.03   −0.32    0.54   .61
  Longer vs. shorter                                                           0.54   .76
    Vocabulary                   22    0.34      0.02    0.04    0.64   .03
    Grammar                       8    0.56      0.07    0.06    1.06   .03
    Pronunciation                 2    0.42      0.06   −0.57    1.42   .41
Number of sessions
  Spaced vs. massed                                                            1.91   .17
    Single session                9    0.61      0.05    0.16    1.05   .01
    Multiple sessions             6    1.12      0.10    0.55    1.69   .00
  Longer vs. shorter                                                           6.83   .01
    Single session               11    0.76      0.04    0.42    1.11   .00
    Multiple sessions            21    0.18      0.02   −0.10    0.45   .21
  Equal vs. expanding                                                          0.68   .41
    Single session                6   −0.04      0.02   −0.35    0.28   .81
    Multiple sessions            10   −0.20      0.02   −0.42    0.02   .08
Type of practice
  Spaced vs. massed                                                            3.35   .34
    Test–restudy (all)           10    0.70      0.05    0.25    1.14   .00
    Test–restudy (no recalled)    2    1.73      0.65    0.67    2.79   .00
    Study trial                   2    0.69      0.43   −0.36    1.73   .20
  Longer vs. shorter                                                          15.86   .00
    Test–restudy (all)           16    0.38      0.02    0.10    0.67   .01
    Test–restudy (no recalled)    6    1.06      0.09    0.61    1.50   .00
    Study trial                   6   −0.12      0.06   −0.62    0.38   .64
    Study–test trial              3    0.40      0.06   −0.23    1.03   .22
  Equal vs. expanding                                                         15.33   .00
    Test–restudy (all)            8   −0.32      0.01   −0.54   −0.10   .00
(Continued)
14679922, 2022, 1, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/lang.12479 by Charité - Universitaetsmedizin, Wiley Online Library on [03/07/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Kim and Webb Spacing and Second Language Learning

Table 2 (Continued)

                                                          95% CI              Q tests
Variables                         k       g  Variance      LL      UL     p       Q     p

Type of practice (continued)
  Equal vs. expanding (continued)
    Test–restudy (no recalled)    3   −0.05      0.03   −0.37    0.27   .76
    Study trial                   2   −0.17      0.11   −0.82    0.49   .62
    Test trial                    2   −0.23      0.03   −0.59    0.12   .19
Activity type
  Spaced vs. massed                                                            6.26   .10
    Paired associate              6    1.36      0.08    0.80    1.92   .00
    Comprehension activities      5    0.43      0.10   −0.15    1.00   .14
    Combined activities           3    0.53      0.07   −0.23    1.29   .52
  Longer vs. shorter                                                          10.72   .01
    Paired associate             12    0.58      0.05    0.23    0.93   .00
    Comprehension activities      9    0.73      0.03    0.31    1.15   .00
    Production activities         8   −0.24      0.03   −0.72    0.24   .32
    Combined activities           3    0.16      0.03   −0.55    0.86   .66
  Equal vs. expanding                                                         13.42   .00
    Paired associate             13   −0.23      0.01   −0.41   −0.06   .01
    Production activities         2   −0.23      0.03   −0.59    0.12   .19
Provision of feedback
  Spaced vs. massed                                                            0.00   .95
    Absence                       3    0.85      0.03    0.02    1.68   .05
    Presence                     10    0.82      0.06    0.36    1.27   .00
  Longer vs. shorter                                                           0.71   .40
    Absence                       4    0.24      0.12   −0.41    0.89   .47
    Presence                     23    0.55      0.02    0.27    0.82   .00
  Equal vs. expanding                                                          0.01   .93
    Absence                       6   −0.16      0.01   −0.45    0.14   .31
    Presence                      8   −0.14      0.03   −0.42    0.15   .36
Feedback timing
  Spaced vs. massed                                                           10.40   .00
    Immediate feedback            8    0.52      0.05    0.10    0.94   .02
    Delayed feedback              2    2.35      0.13    1.36    3.34   .00
  Longer vs. shorter                                                           2.83   .09
    Immediate feedback           15    0.39      0.03    0.08    0.71   .01
    Delayed feedback              8    0.87      0.06    0.41    1.34   .00
  Equal vs. expanding                                                          1.06   .30
    Immediate feedback            5    0.04      0.09   −0.44    0.52   .88
    Delayed feedback              3   −0.36      0.03   −0.94    0.22   .23
Note. LL = lower limit; UL = upper limit. a Estimates in boldface are statistically
significant at α = .05.

longer spacing significantly enhanced retention for L2 grammar and vocabulary;
the effect was larger with grammar, g = 0.56, 95% CI [0.06, 1.06], than
with vocabulary, g = 0.34, 95% CI [0.04, 0.64]. However, the benefit from
longer spacing in the long term remains unclear when it targets pronunciation,
because the sample size was small (k = 2 for delayed effects).
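As a rough reading aid for the intervals reported in this section, the sketch below recovers an approximate standard error, z value, and two-tailed p value from a reported g and its 95% CI. It assumes the intervals are symmetric and normal-based (g ± 1.96 × SE), which is an assumption on our part rather than a procedure stated here; the function name `ci_to_test` is hypothetical, and the recovered values are only approximate because the tabled figures are rounded.

```python
import math

def ci_to_test(g, lower, upper, z_crit=1.96):
    """Approximate SE, z, and two-tailed p from an effect size and its 95% CI.

    Assumes a symmetric, normal-based interval (an illustrative assumption,
    not necessarily the procedure used to build the tables).
    """
    se = (upper - lower) / (2 * z_crit)
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p under a standard normal
    return se, z, p

# Delayed posttests, longer vs. shorter spacing (values taken from Table 2)
rows = [("Grammar", 0.56, 0.06, 1.06),
        ("Vocabulary", 0.34, 0.04, 0.64),
        ("Pronunciation", 0.42, -0.57, 1.42)]

for label, g, lo, hi in rows:
    se, z, p = ci_to_test(g, lo, hi)
    excludes_zero = lo > 0 or hi < 0
    print(f"{label}: SE ~ {se:.2f}, z ~ {z:.2f}, p ~ {p:.2f}, "
          f"CI excludes zero: {excludes_zero}")
```

An interval that excludes zero corresponds to p below .05 under these assumptions, which is why the pronunciation estimate (k = 2) is treated as unclear despite a positive point estimate.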

Number of Sessions
We found a large, statistically significant benefit of spacing for immediate L2
performance when practice involved a single session, g = 1.04, 95% CI [0.49, 1.59].
However, better retention occurred when practice involved multiple sessions, g =
1.12, 95% CI [0.55, 1.69], than when it involved a single session, g = 0.61,
95% CI [0.16, 1.05]. Longer spacing promoted significantly greater retention
than shorter spacing when practice involved a single session, g = 0.76, 95% CI
[0.42, 1.11]. However, when practice involved multiple sessions, longer spacing
was as effective as shorter spacing. Small effects favoring expanding spacing for
retention were found with a single session, g = −0.04, 95% CI [−0.35,
0.28], and with multiple sessions, g = −0.20, 95% CI [−0.42, 0.02], but the effects
were not statistically reliable.
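The subgroup contrasts above rest on the Q tests in Table 2. As a minimal sketch, assuming a two-subgroup between-groups statistic of the form Q = (g1 − g2)² / (SE1² + SE2²) referred to a chi-square distribution with 1 degree of freedom, and assuming standard errors can be back-calculated from normal-based 95% CIs, the single-session versus multiple-session contrast for longer versus shorter spacing can be approximated as follows. The function names are hypothetical, and because the tabled values are rounded the result only approximates the reported Q = 6.83, p = .01.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Back-calculate a standard error from a normal-based 95% CI (assumption)."""
    return (upper - lower) / (2 * z_crit)

def q_between_two(g1, ci1, g2, ci2):
    """Between-subgroups Q for exactly two subgroups, with its p value (df = 1)."""
    v1 = se_from_ci(*ci1) ** 2
    v2 = se_from_ci(*ci2) ** 2
    q = (g1 - g2) ** 2 / (v1 + v2)
    p = math.erfc(math.sqrt(q / 2))  # chi-square survival function for df = 1
    return q, p

# Longer vs. shorter spacing, delayed posttests (values taken from Table 2)
q, p = q_between_two(0.76, (0.42, 1.11),    # single session
                     0.18, (-0.10, 0.45))   # multiple sessions
print(f"Q ~ {q:.2f}, p ~ {p:.3f}")
```

With more than two subgroups the statistic would be a weighted sum of squared deviations from the pooled mean, with degrees of freedom equal to the number of subgroups minus one; that case is not covered by this sketch.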

Type of Practice
Spaced practice promoted better learning and retention when it involved a
test–restudy trial, g = 0.48−1.73, 95% CI [−0.06, 2.79], than when it involved
a study-only trial, g = 0.69−0.81, 95% CI [−0.36, 1.97]. However, because
the sample size for the study-only trial was small (k = 2), the smaller effect with
a study-only trial should be interpreted with caution. Longer spacing led to
significantly greater retention than shorter spacing when it involved a
test–restudy trial, g = 0.38−1.06, 95% CI [0.10, 1.50], but longer spacing was as
effective as shorter spacing when it involved a study trial or study–test trial.
Expanding spacing led to greater retention when it involved a test–restudy trial
than when it involved a study trial or test trial. Although the confidence
intervals for the test–restudy trial showed statistically reliable effects of expanding
spacing, the findings from the equal versus expanding comparison should be
interpreted with caution because of small samples (k = 2 for study trial, k = 2
for test trial).

Activity Type
Spacing promoted better learning on immediate posttests when it in-
volved comprehension activities, g = 0.97, 95% CI [0.04, 1.91], than
when it involved other activities, g = 0.07−0.68, 95% CI [−0.85, 1.78].
However, better retention occurred when it involved a paired-associate
task, g = 1.36, 95% CI [0.80, 1.92], than when it involved other ac-
tivities, g = 0.43−0.53, 95% CI [−0.23, 1.29]. Shorter spacing bene-
fited immediate L2 performance when it involved production activities,
g = −0.64, 95% CI [−0.99, −0.28], but longer spacing led to greater retention
when it involved comprehension activities and paired associates than when it
involved production or combined activities; the positive effect of longer com-
pared to shorter spacing was larger with comprehension activities, g = 0.73,
95% CI [0.31, 1.15], than with paired associates, g = 0.58, 95% CI [0.23,
0.93]. Expanding spacing led to significantly better retention than equal spac-
ing when it involved paired associates, g = −0.23, 95% CI [−0.41, −0.06].
Because the sample size for production activities was small (k = 2), the benefit
of expanding spacing with production activities remains unclear.

Provision of Feedback
Spaced practice relative to massed practice improved immediate L2 perfor-
mance more when feedback was provided, g = 0.69, 95% CI [0.15, 1.23], than
when feedback was not provided, g = 0.02, 95% CI [−0.99, 1.02]. However,
spacing enhanced retention regardless of whether feedback was provided or
not. The effect when there was an absence of feedback should be interpreted
with caution due to small samples (k = 2 at immediate posttests and k = 3 at
delayed posttests in the spaced vs. massed comparison). Longer spacing pro-
duced better retention at delayed posttests when feedback was provided, g =
0.55, 95% CI [0.27, 0.82], than when feedback was not provided, g = 0.24,
95% CI [−0.41, 0.89]. The confidence interval (95% CI [0.27, 0.82]) for the
presence of feedback did not include zero, suggesting that longer spacing between
feedback and the subsequent trial promotes better retention. Feedback
did not have an impact on the comparative effectiveness of equal and expanding
spacing.

Feedback Timing
Spacing led to greater retention when feedback was provided with a delay,
g = 2.35, 95% CI [1.36, 3.34], than when feedback was immediately provided,
g = 0.52, 95% CI [0.10, 0.94]. However, the extreme effect size should be
interpreted with caution due to the small samples (k = 2 for delayed feedback).
Longer spacing led to significantly better retention when delayed feedback
was provided, g = 0.87, 95% CI [0.41, 1.34], than when immediate feedback
was provided, g = 0.39, 95% CI [0.08, 0.71]. An extremely small to negligible
effect in favor of equal spacing was found when immediate feedback was
provided, g = 0.04, 95% CI [−0.44, 0.52], and a small effect was found in
favor of expanding spacing when delayed feedback was provided, g = −0.36,
95% CI [−0.94, 0.22]. However, for both these effects the confidence intervals
crossed zero indicating that these differential effects between equal and
expanding spacing regarding feedback timing are unlikely to be statistically
reliable.

Frequency of Practice
The random-effects meta-regression analyses showed a positive relationship
between frequency of practice and the immediate effects (i.e., the greater the
frequency of practice, the larger the spacing effects relative to massed practice
on immediate learning), but a negative relationship with the delayed effects
(i.e., the greater the frequency of practice, the smaller the spacing effects rel-
ative to massed practice in the long term). A negative relationship between
frequency of practice and effect sizes was found in the longer versus shorter
comparison (i.e., the greater the frequency of practice, the larger the effects for
shorter spacing). A negative relationship was also found in the equal versus
expanding comparison (i.e., the greater the frequency of practice, the larger
the expanding spacing effects). However, the effects of frequency of practice
in the three comparisons (spaced vs. massed, longer vs. shorter, and equal vs.
expanding) were small to negligible (not statistically significant).

Retention Interval
The random-effects meta-regression analyses showed a positive, albeit small
and negligible (not statistically significant), relationship between RI and ef-
fect sizes in the spaced versus massed comparison (i.e., the longer the RI, the
greater the spacing effects relative to massed practice). In the longer versus
shorter comparison, the analyses indicated that the longer the RI, the greater
the effects of shorter spacing; however, the relationship was negligible (not
statistically significant). In the equal versus expanding comparison, the results
showed a significant negative relationship, indicating that the longer the RI,
the larger the effects of expanding spacing schedules.
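To make the direction of these relationships concrete, the sketch below fits a weighted regression of effect size on retention interval, with weights 1 / (v_i + τ²) of the kind used in a random-effects meta-regression. It is a simplification under stated assumptions: τ² is passed in rather than estimated, the function name `weighted_slope` is hypothetical, and the data are invented for illustration and are not taken from the studies in this meta-analysis. Under the coding used in the equal versus expanding comparison (negative g favors expanding spacing), a negative slope means that the advantage of expanding spacing grows as the RI lengthens.

```python
def weighted_slope(x, y, variances, tau2=0.0):
    """Slope and intercept of a weighted least-squares line y = a + b * x.

    Weights are 1 / (sampling variance + tau2); tau2 is supplied by the
    caller here rather than estimated from the data (a simplification).
    """
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data: retention interval in days, Hedges' g, sampling variance
ri_days = [1, 2, 7, 14, 28]
g_values = [0.05, -0.05, -0.15, -0.30, -0.45]
variances = [0.03, 0.02, 0.04, 0.03, 0.05]

slope, intercept = weighted_slope(ri_days, g_values, variances, tau2=0.01)
print(f"slope per day = {slope:.3f}, intercept = {intercept:.3f}")
```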

Discussion
The analyses of comparative effects indicated that spaced practice was signifi-
cantly more effective for L2 learning (g = 0.58) and retention (g = 0.80) than
massed practice. It is notable that spaced practice can lead to better imme-
diate gains than massed practice. The benefits of massed learning have been
demonstrated at extremely short RIs (2 or 4 seconds, e.g., Peterson, Saltzman,
Hillner, & Land, 1962). Our finding contrasts with results obtained by Peterson
et al. (1962) and suggests that spaced practice is a more effective strategy than
massed practice to enhance learners’ L2 performance immediately. Our find-
ing is consistent with previous meta-analyses (Cepeda et al., 2006; Donovan &
Radosevich, 1999). Donovan and Radosevich (1999) found a mean weighted
effect size of 0.45, 95% CI [0.41, 0.50], for immediate learning and 0.51, 95%
CI [0.39, 0.64], for retention, indicating that spaced practice was significantly
more beneficial than massed practice for both immediate learning and reten-
tion. Cepeda et al. (2006) found positive effects of spaced practice at short
RIs ranging from 1 second to less than 1 day (averaged percentage correct on
the final test: 38.5% for massed practice, 47.6% for spaced practice) and at
longer RIs ranging from 1 day to more than 31 days (28.5% for massed practice,
47.4% for spaced practice). It is also important to note that the effects of
spacing are considered smaller than those of certain types of L2 instruction
(e.g., form-focused or explicit instruction). Norris and Ortega (2000)
meta-analyzed the effectiveness of L2 instruction (i.e., focus on form explicit and
implicit treatments, and focus on forms explicit and implicit treatments) and
found a large effect of all instructional treatments, d = 0.96, 95% CI [0.78,
1.14]. Although the benefits of spaced practice on L2 learning found in the
current meta-analysis were smaller (g = 0.58 to 0.80) than the effects of other
types of L2 instruction (e.g., focus on form explicit, focus on forms explicit)
found by Norris and Ortega, spacing can still be considered to be useful for L2
learning.
The analyses indicated that both shorter and longer spacing have initial
benefits, whereas longer spacing has a greater effect on durable learning.
Cepeda et al. (2006) also found a pattern with the greatest increases in reten-
tion at longer spacing. Consistent with the desirable difficulty framework (e.g.,
Bjork, 1994), better retention occurred under difficult conditions, such as after
longer spacing as opposed to shorter spacing. The overall magnitude of the
longer spacing effect (g = 0.40) from our findings is small to medium, in spite
of a number of previous memory studies (e.g., Cepeda et al., 2005) that have
demonstrated the benefits of longer spacing in the long term. This might be
because some inconsistency was shown regarding the effects of shorter and
longer spacing on L2 learning, suggesting that other variables affecting the
benefits of one type of practice over another could be observable in instructed
L2 learning.
The analyses also revealed that there were no significant differences
between equal and expanding spacing in either the immediate or the delayed
posttests. It should be noted that only a small number of studies included
immediate posttests (k = 7), and so we should be cautious about the differential
effects of these two spacing types on short-term learning. It is important to
note, however, that expanding spacing was as effective as equal spacing in the
delayed posttests. This finding suggests that how soon learners retrieve items
in the first (initial) retrieval practice, or how soon subsequent practice occurs,
may not have much impact on long-term retention.
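For readers less familiar with the two schedule types, the sketch below generates review times for an equal and an expanding schedule that share the same number of reviews and the same total time span. The specific expanding pattern (gaps growing in the ratio 1 : 2 : ... : n) and the function names are illustrative assumptions; the studies in this meta-analysis used a variety of schedules.

```python
from itertools import accumulate

def equal_gaps(n_reviews, total):
    """Equal spacing: every gap between practice opportunities is the same."""
    return [total / n_reviews] * n_reviews

def expanding_gaps(n_reviews, total):
    """Expanding spacing: gaps grow in the ratio 1 : 2 : ... : n.

    This is one of many possible expanding patterns (an illustrative choice).
    """
    weight_sum = n_reviews * (n_reviews + 1) / 2
    return [total * i / weight_sum for i in range(1, n_reviews + 1)]

def review_times(gaps):
    """Cumulative times at which each review occurs."""
    return list(accumulate(gaps))

print(review_times(equal_gaps(3, 12)))      # gaps of 4, 4, 4
print(review_times(expanding_gaps(3, 12)))  # gaps of 2, 4, 6
```

Both schedules end at the same point, so the only difference is how early the first retrieval occurs and how the later gaps are distributed, which is the contrast at issue in the paragraph above.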
We focused on variables that may moderate the effects of spacing on L2
learning. First, spaced practice promoted better learning and retention of L2
vocabulary and grammar for both young and adult learners. Specifically, adult
learners showed greater retention with longer spacing than young learners.
This supports Wilson’s (1976) hypothesis that the effect of different types of
spacing is dependent on working memory capacity; increasing the spacing be-
tween items may be more beneficial to older learners than younger learners.
Because the sample sizes for young learners were small (k = 3 in the spaced
vs. massed comparison and k = 8 in the longer vs. shorter comparison), there
would be value in further exploring the effects of spaced practice with young
learners.
Second, the effects of different types of spacing were evident in the learn-
ing of L2 grammar and pronunciation. Shorter spacing led to greater immedi-
ate learning of L2 grammar (g = −0.41) and pronunciation (g = −0.64) than
longer spacing. This may be due to the complexity of the task or skill to be
learned in grammar and pronunciation learning. It may be more difficult for
learners to retrieve grammatical rules in oral production tasks than in compre-
hension and written tasks (Suzuki, 2017). Brief auditory input in pronunciation
learning may be difficult for learners to access after spacing, especially when
the spacing is longer (Baddeley, Thomson, & Buchanan, 1975). The benefits of
blocking and interleaving may be more relevant for pronunciation and gram-
mar learning than for vocabulary learning. Blocking can help learners identify
the commonalities within each concept, whereas interleaving can help learn-
ers distinguish among different concepts (Taylor & Rohrer, 2010). However,
when target features (e.g., pronunciation rules) are easily distinguished from
each other (e.g., eau, s, ch; Carpenter & Mueller, 2013), the benefits of in-
terleaving can be reduced (less pronounced). Thus, shorter spacing (or block-
ing, with immediate repetition of items sharing the same pronunciation rules)
may be particularly beneficial for helping learners to notice and understand the
pronunciation rule patterns (Carpenter & Mueller, 2013). Saito and Plonsky
(2019) found a medium effect of L2 pronunciation teaching on L2 pronuncia-
tion development, d = 0.68, 95% CI [0.49, 0.86], for between-group contrasts.
Similarly, we found a medium effect of shorter spacing for L2 pronunciation
learning relative to longer spacing (g = −0.64). However, given that our study
sample size was small (k = 4), there would be value in further exploring the
effects of spacing on L2 pronunciation learning.
Longer spacing promoted better retention for L2 grammar than shorter
spacing. One explanation is that learners’ comprehension can be impaired
by shorter spacing between presentations of different (but related) types of
grammatical rules, leading to undesirable difficulties (Metcalfe, 2011). How-
ever, learners may devote more attention or processing effort to longer spaced
conditions (Jacoby, 1978). Interleaving can benefit the retention of grammati-
cal features (e.g., Nakata & Suzuki, 2019b). Interleaved practice requires that
learners repeatedly switch between different kinds of intervening tasks for
the target features, which improves discriminability (Taylor & Rohrer, 2010).
However, given that the number of blocked and interleaved practice studies
on grammar learning was small (Nakata & Suzuki, 2019b; Pan et al., 2019;
Suzuki et al., 2020), researchers should be cautious in interpreting the effects
of blocking and interleaving for L2 grammar learning. Shintani et al. (2013)
found large effects of comprehension-based instruction (e.g., error identifica-
tion) on receptive knowledge of L2 grammar, d = 1.09, 95% CI [0.64, 1.55],
and small effects of production-based instruction (e.g., translation) on produc-
tive knowledge, d = −0.21, 95% CI [−0.39, −0.02]. Shintani’s (2015) meta-
analysis revealed very large effects of processing instruction (e.g., structured
input activities) on receptive knowledge, d = 2.60, 95% CI [2.19, 3.00], and
productive knowledge, d = 2.03, 95% CI [1.65, 2.41], of L2 grammar. We
found a small-to-medium effect of spaced practice for L2 grammar learning
(g = 0.56 for overall effect; g = 0.88 for receptive knowledge, g = 0.42 for
productive knowledge), which is smaller than the effects found by Shintani et al.
(2013) for comprehension-based instruction and by Shintani (2015) for processing
instruction but larger than the effect Shintani et al. (2013) found for
production-based instruction (for details, see Table S7.2, Appendix S7, in the
Supporting Information online).
Third, spacing manipulated within one session promoted better immedi-
ate L2 performance than spacing manipulated between sessions, but spacing
manipulated between sessions led to better retention than spacing manipu-
lated within one session. Because within-session spacing inevitably involves
shorter spacing than between-session spacing, spaced practice within a single
session may support higher levels of retrieval success at immediate posttests
than spaced practice between multiple sessions. The effects of between-session
spacing on long-term retention support the distributed practice effect (e.g.,
Bahrick, Bahrick, Bahrick, & Bahrick, 1993), suggesting that longer spacing
(time intervals between multiple sessions are relatively longer than intervals
within a session) yields better retention. However, we found a greater effect of
longer spacing for the retention of L2 vocabulary when the spacing was manipulated
within a single session, g = 0.79, 95% CI [0.32, 1.25], than when it
was manipulated between multiple sessions, g = 0.02, 95% CI [−0.23, 0.26].
It should be noted that all within-session studies included in the longer versus
shorter comparison (k = 11) involved a retrieval condition as practice. Consistent
with the study-phase retrieval account (proposing that the benefits of spacing
arise from the effects of retrieving information from the first presentation dur-
ing the second presentation, e.g., Toppino & Bloom, 2002) and the desirable
difficulties framework (proposing the desirability of making study more diffi-
cult by increasing spacing, e.g., Bjork, 1994), increasing spacing within a sin-
gle session might be expected to produce superior retention when it involves
retrieval conditions.
Fourth, when longer spacing was involved, greater retention occurred in
test–restudy trials than in study-only trials. Specifically, the effect of longer
spacing was greater in L2 vocabulary learning, g = 1.27, 95% CI [0.75, 1.78],
than in L2 grammar learning, g = 0.84, 95% CI [0.39, 1.29]. Consistent with
study-phase retrieval theory and the desirable difficulties framework, increas-
ing spacing between test–restudy trials represents a condition that requires
more effort, leading to greater learning than study-only trials. It is also notable
that we found no clear effects for equal versus expanding spacing in either
retrieval or restudy practice. This might be explained by study time and time
available to take a posttest. Gerbier and Koenig (2012), in their Experiment 1,
allowed unlimited time for studying and performing the posttest and found the
superiority of expanding spacing. However, Gerbier and Koenig in their Exper-
iment 2 and Schuetze (2014) controlled studying time and time on posttest, and
they found no benefits for expanding versus equal spacing. Although learning
is desirably difficult in the case of spaced practice, learners may compensate
for this difficulty by spending more time on tasks (Gerbier & Koenig, 2012).
Fifth, spacing with comprehension activities enhanced learning and reten-
tion of L2 vocabulary, g = 1.38−1.56, 95% CI [0.87, 2.08]. However, it is
notable that no clear spacing effect was found with paired associates on the
immediate posttests. This might be because a paired-associate task has a fast
presentation rate (shorter study time), and learners may not encode what they
need for deep and useful encoding during practice (Metcalfe, 2011). As the de-
sirable difficulty perspective recommends, massing may be advantageous when
initial encoding has not been completed during the first presentation. This sug-
gests that spacing may work at slower presentation rates; during spaced condi-
tions, more study time is needed to encode.

Sixth, there were greater effects of spacing relative to massed practice on
L2 vocabulary learning, g = 1.42, 95% CI [0.86, 1.99], when feedback was
provided than when feedback was not provided. As the desirable difficulty per-
spective recommends, spacing after processing feedback can provide a learner
with a desirably difficult learning condition on the subsequent trial, improv-
ing subsequent retention. However, we found that feedback did not have much
impact on the differences between equal and expanding spacing conditions.
Cepeda et al. (2006) mentioned that the variability in effects between equal
and expanding spacing could be explained by the presence or absence of feed-
back, which was often a potential confound in the studies comparing these
two conditions. However, our findings suggest that the differences in effects
across these spacing conditions might be impacted by other variables, rather
than feedback.
Seventh, delayed feedback influenced the effects of spaced practice for
retention. There were larger effects of spaced practice on L2 vocabulary
learning when delayed feedback was provided (g = 2.34, 95% CI [1.64, 3.04],
in the spaced vs. massed comparison; g = 0.64, 95% CI [0.15, 1.14], in the
longer vs. shorter comparison) than when immediate feedback was provided (g
= 1.04, 95% CI [0.59, 1.49], in the spaced vs. massed comparison; g = 0.37,
95% CI [−0.16, 0.90], in the longer vs. shorter comparison). In the current
meta-analysis, most vocabulary studies that provided delayed feedback ma-
nipulated spacing between multiple sessions (between-session spacing, k = 6)
rather than within one session (within-session spacing, k = 2), whereas vocab-
ulary studies that provided immediate feedback more often involved within-
session spacing (k = 9) than between-session spacing (k = 4). One explanation
is that delayed feedback that is also between multiple sessions provides (even)
longer spacing intervals between opportunities of feedback for a given item
than does immediate feedback within one session. This supports distributed
practice effects (Bahrick et al., 1993), suggesting that longer spacing promotes
better retention. Delayed feedback can also decrease the competition between
a learner’s incorrect response and the correct response, because an incorrect
response tends to be forgotten over time (Butler et al., 2007).
In the current meta-analysis, almost all the studies that provided delayed
feedback after a test trial targeted L2 vocabulary: Only one L2 grammar study
included delayed feedback (Bird, 2010), whereas seven grammar studies in-
cluded immediate feedback, and one L2 pronunciation study included imme-
diate feedback (Li & DeKeyser, 2019). We should be careful in interpreting
the effects of feedback timing on L2 grammar and pronunciation learning, and
further research in this area would be valuable.

It is noteworthy that delayed feedback provided in classroom-based studies
with paper-and-pencil tasks (Bird, 2010) and computer-based studies (e.g.,
Gerbier, Toppino, & Koenig, 2015) may lead to different recall rates, because
it may be possible for learners to look over all of their responses on the pa-
pers in the classroom-based studies, whereas this might not be the case with
computer-based delayed feedback. Therefore, the operationalization of feed-
back timing should be carefully considered when a study is carried out with
paper-and-pencil tasks.
Eighth, frequency of practice did not have a significant influence on the
effects of spaced practice on L2 learning. However, a closer inspection of the
data revealed that the results may be accounted for by other potential con-
founding variables. It was found that grammar studies included much greater
frequency of practice than vocabulary studies (2−30 repetitions in grammar
studies compared with 2−9 repetitions in vocabulary studies). Grammar stud-
ies that engaged greater values (e.g., 10−30 repetitions) showed differential
effects of spaced practice in relation to number of sessions (i.e., whether the
practice was manipulated within a session or between multiple sessions). The
study by Suzuki et al. (2020) was a within-session study (10−11 repetitions)
and showed a diminished spacing effect on the delayed posttest (g = 0.67 on
the immediate posttest and g = 0.41 on the delayed posttest). However, the
study by Suzuki (2017) was a between-session study (27−30 repetitions), and
the effect did not attenuate on the delayed posttest (g = −0.63 on the immedi-
ate posttest and g = −0.64 on the delayed posttest [note that a negative value
here indicates the superiority of a baseline condition relative to a treated con-
dition]). This may suggest that spaced practice promotes better learning and
retention of L2 grammar when the practice is manipulated between sessions
rather than within a session (see Tables S9.5 and S9.6, Appendix S9, in the
Supporting Information online).
Finally, the effects of expanding spacing were greater than those of equal
spacing when the RI was longer. The authors of some previous studies have
argued that the advantage produced by expanding spacing is strongly related
to the timing of the first retrieval attempt during practice (e.g., Carpenter &
DeLosh, 2005). However, Logan and Balota (2008) found that fewer items
(low associate word pairs, e.g., cloth-sheep) were recalled in the expanding
condition compared to the equal condition on a 24-hour-delayed posttest. Fur-
thermore, in our data, L2 studies that controlled the timing of the first retrieval
attempt (e.g., Gerbier & Koenig, 2012; Gerbier et al., 2015) found expanding
spacing to be superior to equal spacing on 2-day-delayed posttests. Consistent
with contextual variability theory (e.g., Melton, 1970) and the accessibility
principle (e.g., Jacoby, 1978), the gradual expansion of spacing between learn-
ing opportunities can lead to greater contextual variation and serve to decrease
the accessibility of a target item but increase reprocessing of the item in spaced
repetitions. Overall, our findings suggest that the timing of the final posttest
and gradual expansion of the spacing interval between learning opportunities
(rather than the timing of the initial retrieval attempt) may have a profound ef-
fect on spaced practice. However, as only two studies controlled for the initial
retrieval attempt, more research is warranted to test this interpretation.
It is pertinent to mention that some of the results of the moderator analyses
(age, learning target, activity type, feedback timing) as interpreted above were
not statistically significant due to small study sample sizes. However, tentative
explanations were offered because the findings could be noteworthy, and we
hope that these explanations will provide some direction for future research
initiatives.
We turn now to the pedagogical implications of our findings. There are
many such implications for both young and adult L2 learners. First, teachers
may need to revisit target words over spaced time intervals. However, the analyses
indicated that it might be useful to space the learning of pronunciation rules
with shorter rather than longer spacing, specifically when the rules are easily
distinguished from each other. This may allow students the time needed to
recognize the patterns and fully comprehend the rules. Second, teachers may
need to revisit target words across a single session. For better retention, teach-
ers could use longer spacing within a single session and/or, for likely even
larger benefits, (also) space items over multiple days. Third, teachers may need
to intersperse spaced retrieval (i.e., tests) with some kind of restudying prac-
tice. For example, teachers could revisit target words that had not been cor-
rectly recalled by students when tested or could provide feedback with a delay
(e.g., feedback given after testing all items). Furthermore, there could be some
value in spaced learning with comprehension activities (e.g., reading sentences
or listening to words, followed by comprehension questions), but teachers may
need to make sure that the activities are desirably challenging for students and
that there is sufficient study (or presentation) time to help students fully com-
prehend target items or features (e.g., Hausman & Kornell, 2014).

Limitations and Future Directions

This meta-analysis identified several limitations that would be useful to address
in further research on spaced practice. First, there have been comparatively
few studies of relative spacing (i.e., equal or expanding spacing). Second, we
found a need for additional research investigating the effects of spacing on L2

Language Learning 72:1, March 2022, pp. 269–319 306

learning that (a) involves young learners, (b) targets L2 grammar and pronun-
ciation learning, (c) includes production activities, (d) includes delayed feed-
back, and (e) measures productive knowledge. Moreover, there is a need for
clearer reporting of participants’ L2 proficiency (as also observed in the syn-
thesis by Park, Solon, Dehghan-Chaleshtori, & Ghanbar, 2021), which could
help teachers to understand how learner differences may interact with the ef-
fects of spaced practice. Although learners may be learning through the same
activities across and within courses, their L2 proficiency (and aptitude) will
vary. Differential effects of spacing might be expected for learners of one pro-
ficiency level as compared to learners of a different proficiency level in the
same learning condition (see Serrano, 2011). Finally, we were not able to rule
out publication bias in the current meta-analysis. Therefore, the overall effects
of spaced practice on L2 learning from the current synthesis should be inter-
preted with caution.

Conclusion
This meta-analysis revealed that although the spacing effect was robust, its
size was small to medium (g = 0.58) for immediate effects (i.e.,
immediately after the last training session) and medium to large (g = 0.80)
for delayed effects (i.e., a delay of one day or greater following the treatment).
It also revealed that longer spacing was more effective than shorter spacing
for long-term retention (small-to-medium effect, g = 0.40), but that learning
gains were not significantly different between the equal and expanding spacing
conditions. Some of the differences between the effects of different spacing
conditions were explained by particular variables (e.g., learning target, number
of sessions).
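The verbal labels attached to these effect sizes can be reproduced with a short sketch. It assumes the between-groups benchmarks proposed by Plonsky and Oswald (2014) for L2 research (0.40 small, 0.70 medium, 1.00 large); this section does not state which benchmarks were applied, so the thresholds and the function name are assumptions made for illustration.

```python
def describe_effect(g):
    """Label an effect size against assumed L2 benchmarks."""
    # Assumed thresholds: Plonsky & Oswald (2014), between-groups contrasts.
    small, medium, large = 0.40, 0.70, 1.00
    size = abs(g)
    if size < small:
        return "below small"
    if size < medium:
        return "small to medium"
    if size < large:
        return "medium to large"
    return "large"


# Effect sizes reported in the Conclusion.
reported = {
    "spaced vs. massed, immediate posttest": 0.58,
    "spaced vs. massed, delayed posttest": 0.80,
    "longer vs. shorter spacing, delayed posttest": 0.40,
}

for comparison, g in reported.items():
    print(f"{comparison}: g = {g:.2f} ({describe_effect(g)})")
```

Under these assumed thresholds, the three values fall into the same bands as the labels used in the paragraph above.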

Final revised version accepted 14 September 2021

Notes
1 An anonymous reviewer pointed out that there were some studies (k = 12) that
involved different types of posttests (e.g., receptive and productive) administered as
immediate posttests, and that in such cases each different type of posttest could be
considered as a separate learning session when coding the frequency of practice. To
examine whether this affected the results, we did further analyses. We coded
multiple types of posttests as one learning session and also, separately, we coded
multiple types of posttests as separate learning sessions. We did the analyses in both
ways, and the results showed no difference (see Appendix S9 in the Supporting
Information online for details).

2 An anonymous reviewer pointed out that retention interval is a key variable in
examining spaced practice effects and suggested that the first delayed posttest
should be used as a dependent variable (for examining delayed effects) when a
study involved two or three delayed posttests. We recoded and further analyzed
whether this choice affected the results. We found no statistically significant
difference between our earlier coding (where the interval between the first or
second delayed posttest [if there were three delayed posttests] and the final delayed
posttest was used to examine delayed effects) and this coding suggested by the
reviewer (where the first delayed posttest was used to examine delayed effects) in
both the spaced versus massed comparison and the longer versus shorter spacing
comparison (see Appendix S9 in the Supporting Information online for details).
3 Serrano and Huang’s (2018) study was excluded because their RI was manipulated
between participants, not within participants.

Open Research Badges

This article has earned Open Data and Open Materials badges for making pub-
licly available the digitally-shareable data and the components of the research
methods needed to reproduce the reported procedure and results. All data and
materials that the authors have used and have the right to share are available
at http://www.iris-database.org. All proprietary materials have been precisely
identified in the manuscript.

References
Note. The full reference list of the studies included in the meta-analysis is available in
Appendix S10 in the Supporting Information online.
Avery, N., & Marsden, E. J. (2019). A meta-analysis of sensitivity to grammatical
information during self-paced reading: Towards a framework of reference for
reading time effect sizes. Studies in Second Language Acquisition, 41(5),
1055–1087. Retrieved from https://doi.org/10.1017/S0272263119000196
Baddeley, A. (1999). Human memory: Theory and practice (rev. ed.). East Sussex:
Psychology Press. Retrieved from https://doi.org/10.1016/s0145-2134(00)00166-6
Baddeley, A., Eysenck, M. W., & Anderson, M. C. (2015). Memory (2nd ed). New
York: Psychology Press. Retrieved from https://doi.org/10.4324/9781315749860
Baddeley, A. D., Thomson, N., & Buchanan, M. (1975). Word length and the structure
of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14,
575–589. Retrieved from https://doi.org/10.1016/S0022-5371(75)80045-4
Bahrick, H. P. (1979). Maintenance of knowledge: Questions about memory we forgot
to ask. Journal of Experimental Psychology: General, 108(3), 296–308. Retrieved
from https://doi.org/10.1037/0096-3445.108.3.296

Bahrick, H. P., Bahrick, L. E., Bahrick, A. S., & Bahrick, P. E. (1993). Maintenance of
foreign language vocabulary and the spacing effect. Psychological Science, 4(5),
316–321. Retrieved from https://doi.org/10.1111/j.1467-9280.1993.tb00571.x
Barcroft, J. (2007). Effects of opportunities for word retrieval during second language
vocabulary learning. Language Learning, 57(1), 35–56. Retrieved from
https://doi.org/10.1111/j.1467-9922.2007.00398.x
Bird, S. (2010). Effects of distributed practice on the acquisition of second language
English syntax. Applied Psycholinguistics, 31, 635–650. Retrieved from
http://doi.org/10.1017/S0142716410000172
Bjork, R. A. (1975). Retrieval as a memory modifier. In R. Solso (Ed.), Information
processing and cognition: The Loyola Symposium (pp. 123–144). Mahwah, NJ:
Lawrence Erlbaum Associates. Retrieved from https://doi.org/10.2307/1421430
Bjork, R. A. (1994). Memory and metamemory considerations in the training of
human beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing
about knowing (pp. 185–205). Cambridge, MA: MIT Press. Retrieved from
https://doi.org/10.7551/mitpress/4561.003.0011
Bjork, R. A., & Bjork, E. L. (1992). A new theory of disuse and an old theory of
stimulus fluctuation. In A. Healy, S. Kosslyn, & R. Shiffrin (Eds.), From learning
processes to cognitive processes: Essays in honor of William K. Estes (Vol. 2, pp.
35–67). Hillsdale, NJ: Erlbaum.
Bloom, K. C., & Shuell, T. J. (1981). Effects of massed and distributed practice on the
learning and retention of second-language vocabulary. Journal of Educational
Research, 74(4), 245–248. Retrieved from
http://doi.org/10.1080/00220671.1981.10885317
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009).
Introduction to meta-analysis. Chichester: John Wiley and Sons, Ltd. Retrieved
from https://doi.org/10.1002/9780470743386
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2013).
Comprehensive Meta-Analysis Version 3 [Software]. Englewood, NJ: Biostat, Inc.
Borenstein, M., Higgins, J. P. T., Hedges, L. V., & Rothstein, H. R. (2017). Basics of
meta-analysis: I² is not an absolute measure of heterogeneity. Research Synthesis
Methods, 8(1), 5–18. Retrieved from https://doi.org/10.1002/jrsm.1230
Brosvic, G. M., Epstein, M. L., Cook, M. J., & Dihoff, R. E. (2005). Efficacy of error
for the correction of initially incorrect assumptions and of feedback for the
affirmation of correct responding: Learning in the classroom. Psychological
Record, 55(3), 401–418. Retrieved from https://doi.org/10.1007/BF03395518
Butler, A. C., Karpicke, J. D., & Roediger, H. L. (2007). The effect of type and timing
of feedback on learning from multiple-choice tests. Journal of Experimental
Psychology: Applied, 13(4), 273–281. Retrieved from
https://doi.org/10.1037/1076-898X.13.4.273

Butler, A. C., & Roediger, H. L. (2007). Testing improves long-term retention in a
simulated classroom setting. European Journal of Cognitive Psychology, 19(4–5),
514–527. Retrieved from https://doi.org/10.1080/09541440701326097
Carpenter, S. K. (2017). Spacing effects on learning and memory. In J. T. Wixted
(Ed.), Cognitive psychology of memory, Vol. 2 of Learning and memory: A
comprehensive reference (2nd ed., pp. 465–485). Amsterdam, The Netherlands:
Academic Press. Retrieved from
https://doi.org/10.1016/b978-0-12-809324-5.21054-7
Carpenter, S. K., & DeLosh, E. L. (2005). Application of the testing and spacing
effects to name learning. Applied Cognitive Psychology, 19(5), 619–636. Retrieved
from https://doi.org/10.1002/acp.1101
Carpenter, S. K., & Mueller, F. E. (2013). The effects of interleaving versus blocking
on foreign language pronunciation learning. Memory & Cognition, 41, 671–682.
Retrieved from http://doi.org/10.3758/s13421-012-0291-4
Çekiç, A., & Bakla, A. (2019). The effects of spacing patterns on incidental L2
vocabulary learning through reading with electronic glosses. Instructional Science,
47(3), 353–371. Retrieved from https://doi.org/10.1007/s11251-019-09483-4
Cepeda, N. J., Mozer, M. C., Coburn, N., Rohrer, D., Wixted, J. T., & Pashler, H.
(2005). Optimizing distributed practice: Theoretical analysis and practical
implications. Experimental Psychology, 56(4), 236–246. Retrieved from
https://doi.org/10.1027/1618-3169.56.4.236
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed
practice in verbal recall tasks: A review and quantitative synthesis. Psychological
Bulletin, 132(3), 354–380. Retrieved from
https://doi.org/10.1037/0033-2909.132.3.354
Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008). Spacing effects
in learning: A temporal ridgeline of optimal retention. Psychological Science,
19(11), 1095–1102. Retrieved from
https://doi.org/10.1111/j.1467-9280.2008.02209.x
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and
Psychological Measurement, 20(1), 37–46. Retrieved from
https://doi.org/10.1177/001316446002000104
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed).
Hillsdale, NJ: Lawrence Erlbaum.
Crothers, E., & Suppes, P. (1967). Experiments in second-language learning. New
York: Academic Press. Retrieved from
https://doi.org/10.1016/b978-0-12-395568-5.50010-7
DeKeyser, R. M. (2007). Practice in a second language: Perspectives from applied
linguistics and cognitive psychology. Cambridge: Cambridge University Press.
Retrieved from https://doi.org/10.1017/cbo9780511667275

Donovan, J. J., & Radosevich, D. J. (1999). A meta-analytic review of the distribution
of practice effect: Now you see it, now you don’t. Journal of Applied Psychology,
84(5), 795–805. Retrieved from https://doi.org/10.1037/0021-9010.84.5.795
Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013).
Improving students’ learning with effective learning techniques: Promising
directions from cognitive and educational psychology. Psychological Science in the
Public Interest, 14(1), 4–58. Retrieved from
https://doi.org/10.1177/1529100612453266
Gathercole, S. E., Pickering, S. J., Ambridge, B., & Wearing, H. (2004). The structure
of working memory from 4 to 15 years of age. Developmental Psychology, 40(2),
177–190. Retrieved from https://doi.org/10.1037/0012-1649.40.2.177
Gerbier, E., & Koenig, O. (2012). Influence of multiple-day temporal distribution of
repetitions on memory: A comparison of uniform, expanding, and contracting
schedules. The Quarterly Journal of Experimental Psychology, 65(3), 514–525.
Retrieved from http://doi.org/10.1080/17470218.2011.600806
Gerbier, E., Toppino, T. C., & Koenig, O. (2015). Optimising retention through
multiple study opportunities over days: The benefit of an expanding schedule of
repetition. Memory, 23(6), 943–954. Retrieved from
http://doi.org/10.1080/09658211.2014.944916
Hausman, H., & Kornell, N. (2014). Mixing topics while studying does not enhance
learning. Journal of Applied Research in Memory and Cognition, 3(3), 153–160.
Retrieved from https://doi.org/10.1016/j.jarmac.2014.03.003
Hunter, J., & Schmidt, F. (2004). Methods of meta-analysis: Correcting error and bias
in research findings. London: SAGE Publications.
Jacoby, L. L. (1978). On interpreting the effects of repetition: Solving a problem
versus remembering a solution. Journal of Verbal Learning and Verbal Behavior,
17(6), 649–667. Retrieved from https://doi.org/10.1016/S0022-5371(78)90393-6
Kang, S., Lindsey, R., Mozer, M., & Pashler, H. (2014). Retrieval practice over the
long term: Should spacing be expanding or equal-interval? Psychonomic Bulletin &
Review, 21(6), 1544–1550. Retrieved from
http://doi.org/10.3758/s13423-014-0636-z
Karpicke, J. D., & Bauernschmidt, A. (2011). Spaced retrieval: Absolute spacing
enhances learning regardless of relative spacing. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 37(5), 1250–1257. Retrieved from
http://doi.org/10.1037/a0023436
Kasprowicz, R., Marsden, E., & Sephton, N. (2019). Investigating distribution of
practice effects for the learning of foreign language verb morphology in the young
learner classroom. The Modern Language Journal, 103(3), 580–606. Retrieved
from http://doi.org/10.1111/modl.12586
Khoii, R., & Abed, K. F. (2017). Effects of equal spacing, expanding spacing, and
massed condition on EFL learners’ receptive and productive vocabulary retrieval. In
Pixel, Proceedings of ICT for language learning (19th ed, pp. 500–504). Florence,
Italy: ICT for Language Learning. Retrieved from
https://conference.pixel-online.net/ICT4LL/files/ict4ll/ed0010/FP/0960-SLA2580-FP-ICT4LL10.pdf
Kim, S. K., & Webb, S. (2021). Coding scheme. Materials from “The effects of spaced
practice on second language learning: A meta-analysis” [Coding scheme]. IRIS
Database, University of York, UK. https://doi.org/10.48316/rn3w-1b17
Koval, N. G. (2019). Testing the deficient processing account of the spacing effect in
second language vocabulary learning: Evidence from eye tracking. Applied
Psycholinguistics, 40(5), 1–37. Retrieved from
http://doi.org/10.1017/S0142716419000158
Koval, N. G. (2020). Testing the reminding account of the lag effect in L2 vocabulary
acquisition from L2-L1 retrieval practice within a paired-associate learning format
(Published doctoral dissertation). Michigan State University. Retrieved from
http://doi.org/10.25335/pg4k-p594
Küpper-Tetzel, C. E., Erdfelder, E., & Dickhäuser, O. (2014). The lag effect in
secondary school classrooms: Enhancing students’ memory for vocabulary.
Instructional Science, 42(3), 373–388. Retrieved from
https://doi.org/10.1007/s11251-013-9285-2
Lawrence, N. K. (2013). Cumulative exams in the introductory psychology course.
Teaching of Psychology, 40(1), 15–19. Retrieved from
https://doi.org/10.1177/0098628312465858
Lee, E., & Choe, M-H. (2014). The effect of spaced repetitions on Korean elementary
students’ L2 English vocabulary learning. Studies in English Education, 19(1),
55–75.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis.
Language Learning, 60(2), 309–365. Retrieved from
https://doi.org/10.1111/j.1467-9922.2010.00561.x
Li, M., & DeKeyser, R. (2019). Distribution of practice effects in the acquisition and
retention of L2 Mandarin tonal word production. The Modern Language Journal,
103(3), 607–628. Retrieved from http://doi.org/10.1111/modl.12580
Lightbown, P. M., & Spada, N. (1994). An innovative program for primary ESL
students in Quebec. TESOL Quarterly, 28(3), 563–579. Retrieved from
https://doi.org/10.2307/3587308
Linck, J. A., Osthus, P., Koeth, J. T., & Bunting, M. F. (2014). Working memory and
second language comprehension and production: A meta-analysis. Psychonomic
Bulletin and Review, 21(4), 861–883. Retrieved from
https://doi.org/10.3758/s13423-013-0565-2
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. London: SAGE
Publications. Retrieved from https://doi.org/10.1177/01632780122034902
Logan, J. M., & Balota, D. A. (2008). Expanded vs. equal interval spaced retrieval
practice: Exploring different schedules of spacing and retention interval in younger
and older adults. Neuropsychology, Development, and Cognition. Section B, Aging,
Neuropsychology and Cognition, 15(3), 257–280. Retrieved from
https://doi.org/10.1080/13825580701322171
Lotfolahi, A. R., & Salehi, H. (2016). Learners’ perceptions of the effectiveness of
spaced learning schedule in L2 vocabulary learning. SAGE Open, 6(2), 1–9.
Retrieved from http://doi.org/10.1177/2158244016646148
Lotfolahi, A. R., & Salehi, H. (2017). Spacing effects in vocabulary learning: Young
EFL learners in focus. Cogent Education, 4(1), 1–10. Retrieved from
https://doi.org/10.1080/2331186X.2017.1287391
Maddox, G. B., & Balota, D. A. (2015). Retrieval practice and spacing effects in young
and older adults: An examination of the benefits of desirable difficulty. Memory and
Cognition, 43(5), 760–774. Retrieved from
https://doi.org/10.3758/s13421-014-0499-6
Maddox, G. B., Balota, D. A., Coane, J. H., & Duchek, J. M. (2011). The role of
forgetting rate in producing a benefit of expanded over equal spaced retrieval in
young and older adults. Psychology and Aging, 26(3), 661–670. Retrieved from
https://doi.org/10.1037/a0022942
Melton, A. W. (1970). The situation with respect to the spacing of repetitions and
memory. Journal of Verbal Learning and Verbal Behavior, 9(5), 596–606.
Retrieved from https://doi.org/10.1016/S0022-5371(70)80107-4
Metcalfe, J. (2011). Desirable difficulties and studying in the region of proximal
learning. In A. S. Benjamin (Ed.), Successful remembering and successful
forgetting: A Festschrift in honor of Robert A. Bjork (pp. 259–276). London/New
York: Psychology Press. Retrieved from https://doi.org/10.4324/9780203842539-18
Miles, S. W. (2014). Spaced vs. massed distribution instruction for L2 grammar
learning. System, 42, 412–428. Retrieved from
http://doi.org/10.1016/j.system.2014.01.014
Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G., & the PRISMA Group. (2009).
Preferred reporting items for systematic reviews and meta-analyses: The PRISMA
statement. PLOS Medicine, 6(7), e1000097. Retrieved from
https://doi.org/10.1371/journal.pmed.1000097
Nakata, T. (2015a). Effects of expanding and equal spacing on second language
vocabulary learning: Does gradually increasing spacing increase vocabulary
learning? Studies in Second Language Acquisition, 37(4), 677–711. Retrieved from
http://doi.org/10.1017/S0272263114000825
Nakata, T. (2015b). Effects of feedback timing on second language vocabulary
learning: Does delaying feedback increase learning? Language Teaching Research,
19(4), 416–434. Retrieved from https://doi.org/10.1177/1362168814541721
Nakata, T. (2017). Does repeated practice make perfect? The effects of within-session
repeated retrieval on second language vocabulary learning. Studies in Second
Language Acquisition, 39(4), 653–679. Retrieved from
https://doi.org/10.1017/S0272263116000280

Nakata, T., & Elgort, I. (2021). Effects of spacing on contextual vocabulary learning:
Spacing facilitates the acquisition of explicit, but not tacit, vocabulary knowledge.
Second Language Research, 37(2), 233–260. Retrieved from
http://doi.org/10.1177/0267658320927764
Nakata, T., & Suzuki, Y. (2019a). Effects of massing and spacing on the learning of
semantically related and unrelated words. Studies in Second Language Acquisition,
41(2), 287–311. Retrieved from http://doi.org/10.1017/S0272263118000219
Nakata, T., & Suzuki, Y. (2019b). Mixing grammar exercises facilitates long-term
retention: Effects of blocking, interleaving, and increasing practice. The Modern
Language Journal, 103(3), 629–647. Retrieved from
http://doi.org/10.1111/modl.12581
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research
synthesis and quantitative meta-analysis. Language Learning, 50(3), 417–528.
Retrieved from https://doi.org/10.1111/0023-8333.00136
Pan, S. C., Tajran, J., Lovelett, J., Osuna, J., & Rickard, T. C. (2019). Does interleaved
practice enhance foreign language learning? The effects of training schedule on
Spanish verb conjugation skills. Journal of Educational Psychology, 111,
1172–1188. Retrieved from http://doi.org/10.1037/edu0000336
Park, H. I., Solon, M., Dehghan-Chaleshtori, M., & Ghanbar, H. (2021). Proficiency
reporting practices in research on second language acquisition: Have we made any
progress? Language Learning, 72. Retrieved from
https://doi.org/10.1111/lang.12475
Pashler, H., Cepeda, N. J., Wixted, J. T., & Rohrer, D. (2005). When does feedback
facilitate learning of words? Journal of Experimental Psychology: Learning,
Memory, and Cognition, 31(1), 3–8. Retrieved from
https://doi.org/10.1037/0278-7393.31.1.3
Pashler, H., Zarow, G., & Triplett, B. (2003). Is temporal spacing of tests helpful even
when it inflates error rates? Journal of Experimental Psychology: Learning,
Memory, and Cognition, 29(6), 1051–1057. Retrieved from
http://doi.org/10.1037/0278-7393.29.6.1051
Patall, E. A., Cooper, H., & Robinson, J. C. (2008). The effects of choice on intrinsic
motivation and related outcomes: A meta-analysis of research findings.
Psychological Bulletin, 134(2), 270–300. Retrieved from
https://doi.org/10.1037/0033-2909.134.2.270
Peterson, L. R., Saltzman, D., Hillner, K., & Land, V. (1962). Recency and frequency
in paired-associate learning. Journal of Experimental Psychology, 63, 396–403.
Retrieved from https://doi.org/10.1037/h0043571
Pinker, S. (1998). Words and rules. Lingua, 106(1–4), 219–242. Retrieved from
https://doi.org/10.1016/S0024-3841(98)00035-7
Plonsky, L., & Oswald, F. L. (2014). How big is big? Interpreting effect sizes in L2
research. Language Learning, 64(4), 878–912. Retrieved from
https://doi.org/10.1111/lang.12079

Pyc, M. A., & Rawson, K. A. (2007). Examining the efficiency of schedules of
distributed retrieval practice. Memory & Cognition, 35(8), 1917–1927. Retrieved
from http://doi.org/10.3758/BF03192925
Pyc, M. A., & Rawson, K. A. (2009). Testing the retrieval effort hypothesis: Does
greater difficulty correctly recalling information lead to higher levels of memory?
Journal of Memory and Language, 60(4), 437–447. Retrieved from
http://doi.org/10.1016/j.jml.2009.01.004
Roediger, H. L., & Karpicke, J. D. (2006). The power of testing memory: Basic
research and implications for educational practice. Perspectives on Psychological
Science, 1(3), 181–210. Retrieved from
https://doi.org/10.1111/j.1745-6916.2006.00012.x
Rogers, J. (2015). Learning second language syntax under massed and distributed
conditions. TESOL Quarterly, 49(4), 857–866. Retrieved from
http://doi.org/10.1002/tesq.252
Rogers, J., & Cheung, A. (2020a). Input spacing and the learning of L2 vocabulary in
a classroom context. Language Teaching Research, 24, 616–641. Retrieved from
http://doi.org/10.1177/1362168818805251
Rogers, J., & Cheung, A. (2020b). Does it matter when you review? Input spacing,
ecological validity, and the learning of L2 vocabulary. Studies in Second Language
Acquisition. Retrieved from http://doi.org/10.1017/S0272263120000236
Rohrer, D., & Pashler, H. (2007). Increasing retention without increasing study time.
Current Directions in Psychological Science, 16(4), 183–186. Retrieved from
https://doi.org/10.1111/j.1467-8721.2007.00500.x
Rosenthal, R. (1979). The “file drawer problem” and tolerance for null results.
Psychological Bulletin, 86(3), 638–641. Retrieved from
https://doi.org/10.1037/0033-2909.86.3.638
Saito, K., & Plonsky, L. (2019). Effects of second language pronunciation teaching
revisited: A proposed measurement framework and meta-analysis. Language
Learning, 69(3), 652–708. Retrieved from https://doi.org/10.1111/lang.12345
Schäfer, T., & Schwarz, M. A. (2019). The meaningfulness of effect sizes in
psychological research: Differences between sub-disciplines and the impact of
potential biases. Frontiers in Psychology, 10, 813. Retrieved from
https://doi.org/10.3389/fpsyg.2019.00813
Schuetze, U. (2014). Spacing techniques in second language vocabulary acquisition:
Short-term gains vs. long-term memory. Language Teaching Research, 19(1),
28–42. Retrieved from http://doi.org/10.1177/1362168814541726
Seabrook, R., Brown, G. D. A., & Solity, J. E. (2005). Distributed and massed
practice: From laboratory to classroom. Applied Cognitive Psychology, 19(1),
107–122. Retrieved from https://doi.org/10.1002/acp.1066
Serrano, R. (2011). The time factor in EFL classroom practice. Language Learning,
61(1), 117–145. Retrieved from https://doi.org/10.1111/j.1467-9922.2010.00591.x

Serrano, R., & Huang, H-Y. (2018). Learning vocabulary through assisted repeated
reading: How much time should there be between repetitions of the same text?
TESOL Quarterly, 52(4), 971–994. Retrieved from http://doi.org/10.1002/tesq.445
Shintani, N. (2015). The effectiveness of processing instruction and production-based
instruction on L2 grammar acquisition: A meta-analysis. Applied Linguistics, 36(3),
306–325. Retrieved from https://doi.org/10.1093/applin/amu067
Shintani, N., Li, S., & Ellis, R. (2013). Comprehension-based versus productive-based
grammar instruction: A meta-analysis of comparative studies. Language Learning,
63(2), 296–329. Retrieved from https://doi.org/10.1111/lang.12001
Snoder, P. (2017). Improving English learners’ productive collocation knowledge: The
effects of involvement load, spacing, and intentionality. TESL Canada Journal,
34(3), 140–164. Retrieved from http://doi.org/10.18806/tesl.v34i3.1277
Suzuki, Y. (2017). The optimal distribution of practice for the acquisition of L2
morphology: A conceptual replication and extension. Language Learning, 67(3),
512–545. Retrieved from http://doi.org/10.1111/lang.12236
Suzuki, Y. (2018). The role of procedural learning ability in automatization of L2
morphology under different learning schedules. Studies in Second Language
Acquisition, 40(4), 923–937. Retrieved from
https://doi.org/10.1017/S0272263117000249
Suzuki, Y. (2019). Individualization of practice distribution in second language
grammar learning: A role of metalinguistic rule rehearsal ability and working
memory capacity. Journal of Second Language Studies, 2(2), 170–197. Retrieved
from http://doi.org/10.1075/bct.116.02suz
Suzuki, Y., & DeKeyser, R. (2017a). Effects of distributed practice on the
proceduralization of morphology. Language Teaching Research, 21(2), 166–188.
Retrieved from http://doi.org/10.1177/1362168815617334
Suzuki, Y., & DeKeyser, R. (2017b). Exploratory research on second language practice
distribution: An aptitude × treatment interaction. Applied Psycholinguistics, 38(1),
27–56. Retrieved from https://doi.org/10.1017/S0142716416000084
Suzuki, Y., Nakata, T., & DeKeyser, R. M. (2019). The desirable difficulty framework
as a theoretical foundation for optimizing and researching second language
practice. The Modern Language Journal, 103(3), 713–720. Retrieved from
https://doi.org/10.1111/modl.12585
Suzuki, Y., Yokosawa, S., & Aline, D. (2020). The role of working memory in blocked
and interleaved grammar practice: Proceduralization of L2 syntax. Language
Teaching Research. Retrieved from http://doi.org/10.1177/1362168820913985
Taylor, K., & Rohrer, D. (2010). The effects of interleaved practice. Applied Cognitive
Psychology, 24(6), 837–848. Retrieved from https://doi.org/10.1002/acp.1598
Toppino, T. C., & Bloom, L. C. (2002). The spacing effect, free recall, and two-process
theory: A closer look. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 28(3), 437–444. Retrieved from
https://doi.org/10.1037//0278-7393.28.3.437
Toppino, T. C., & DiGeorge, W. (1984). The spacing effect in free recall emerges with
development. Memory and Cognition, 12(2), 118–122. Retrieved from
https://doi.org/10.3758/bf03198425
Uchihara, T., Webb, S., & Yanagisawa, A. (2019). The effects of repetition on
incidental vocabulary learning: A meta-analysis of correlational studies. Language
Learning, 69(3), 559–599. Retrieved from https://doi.org/10.1111/lang.12343
Ullman, M. T. (2015). The declarative/procedural model: A neurobiologically
motivated theory of first and second language. In B. VanPatten & J. Williams
(Eds.), Theories in second language acquisition: An introduction (pp. 135–158).
New York: Routledge. Retrieved from https://doi.org/10.4324/9780429503986-7
van Aert, R. C. M., Wicherts, J. M., & van Assen, M. A. L. M. (2016). Conducting
meta-analyses based on p values: Reservations and recommendations for applying
p-uniform and p-curve. Perspectives on Psychological Science, 11(5), 713–729.
Retrieved from https://doi.org/10.1177/174569116659874
Verkoeijen, P. P. J. L., Rikers, R. M. J. P., & Özsoy, B. (2008). Distributed rereading
can hurt the spacing effect in text memory. Applied Cognitive Psychology, 22(5),
685–695. Retrieved from https://doi.org/10.1002/acp.1388
Wickelgren, W. A. (1972). Trace resistance and the decay of long-term memory.
Journal of Mathematical Psychology, 9(4), 418–455. Retrieved from
https://doi.org/10.1016/0022-2496(72)90015-6
Wilson, W. P. (1976). Developmental changes in the lag effect: An encoding
hypothesis for repeated word recall. Journal of Experimental Child Psychology,
22(1), 113–122. Retrieved from https://doi.org/10.1016/0022-0965(76)90094-1

Supporting Information
Additional Supporting Information may be found in the online version of this
article at the publisher’s website:

Appendix S1. PRISMA Flow Diagram.
Appendix S2. Category Criteria.
Appendix S3. Coding Scheme.
Appendix S4. Details of the Studies Included in the Meta-Analysis.
Appendix S5. Coding Reliability.
Appendix S6. Publication Bias Analyses.
Appendix S7. Overall Results Under Each Category.
Appendix S8. Moderator Analyses for Each Posttest (Immediate and Delayed) Under Each Category.
Appendix S9. Further Analyses for the Moderators Frequency of Practice and Retention Interval.
Appendix S10. A Full List of All the Included Studies in the Current Meta-Analysis.

Appendix: Accessible Summary (also publicly available at https://oasis-database.org)

Spaced Practice Effects in L2 Learning

What This Research Was About and Why It Is Important
Spaced practice, in which time or other events occur between repeated practice
sessions, has shown robust effects on different aspects of learning in cognitive
psychology, and many studies have examined it in the field of second language
(L2) learning. The present study systematically reviewed 37 L2 studies (providing
98 effect sizes from 48 experiments) of spaced practice to (a) provide
a more reliable and informative estimate of its effect on L2 learning than is
possible from a single study, and (b) determine the extent to which the effects
are moderated by different variables. Results showed that spaced practice had
significantly greater effects on L2 learning and retention than practice with
no spacing. Longer spacing was more effective for long-term retention than
shorter spacing, but there was no difference in learning gains between equal
and expanding spacing conditions. Variability in spaced practice effects across
studies was explained by several methodological variables (e.g., learning target,
number of sessions).

What the Researchers Did

• The researchers’ comprehensive search for studies on spaced practice for L2
learning found 37 studies satisfying all inclusion criteria.
• The 37 studies (providing 98 effect sizes from 48 experiments) were divided
into three categories of spaced schedules (i.e., comparisons of: spaced vs.
massed; longer spacing vs. shorter spacing; and equal spacing vs. expanding
spacing), based on carefully defined category criteria.
• The 98 effect sizes were analyzed to examine the extent to which spaced
practice affects L2 learning.
• The researchers examined which learner-related (age: adult [university students
or older] versus younger [Grades 1–12]) and methodological (learning
target [vocabulary, grammar, pronunciation], number of sessions, type
of practice, activity type, provision of feedback, feedback timing, frequency
of practice, and retention interval) variables moderated the effects of spaced
practice.

What the Researchers Found

• Spaced practice showed greater benefits for immediate L2 learning (g = 0.58,
i.e., over half a standard deviation unit) and longer-term (i.e., after a delay of
1 day or greater following the treatment) retention (g = 0.80, i.e., 0.8 of a
standard deviation unit) than when there was no spacing.
• While shorter spacing was as effective as longer spacing on immediate L2
performance, longer spacing was more effective than shorter spacing for
longer-term retention (g = 0.40, i.e., 0.4 of a standard deviation unit).
• Spacing effects (longer intervals between encountering the items) on L2
vocabulary were more pronounced when spacing was within a single training
session than between multiple training sessions.
• Greater retention occurred when longer spacing involved test-restudy trials
than when it involved study-only trials.
• Learning gains were not different between equal and expanding spacing
conditions, but the effects of expanding spacing were greater than those of
equal spacing when the retention interval was longer (i.e., when the tests were
administered longer after the last practice or test session).
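
The g values above are standardized mean differences. As a rough illustration only, and not the authors' actual computation, the sketch below shows how Hedges' g is commonly calculated for two independent groups from posttest means, standard deviations, and sample sizes. The function name hedges_g and all scores are hypothetical and are not taken from any included study.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group A minus group B)
    with Hedges' small-sample correction."""
    df = n_a + n_b - 2
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    # Cohen's d: raw mean difference in pooled-SD units
    d = (mean_a - mean_b) / pooled_sd
    # Approximate small-sample correction factor
    correction = 1 - 3 / (4 * df - 1)
    return correction * d


# Hypothetical posttest scores: spaced group (A) vs. massed group (B)
g = hedges_g(mean_a=14.0, sd_a=5.0, n_a=30, mean_b=11.0, sd_b=5.0, n_b=30)
print(round(g, 2))
```

With these made-up numbers, g comes out at about 0.59, meaning the spaced group scored roughly 0.6 pooled standard deviations higher than the massed group, which is close in size to the immediate-posttest effect reported above.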

Things to Consider
• This meta-analysis showed significant effects of spaced practice on L2
vocabulary, grammar, and pronunciation learning. However, the majority of studies
examining spacing effects have investigated L2 vocabulary learning and there
is a need for more research on the effects of spaced practice on L2 grammar
and pronunciation.
• Spaced practice benefits L2 learning, but the effects seemed to depend on
what is being learned (e.g., learning target) and how the learning happens
(e.g., number of sessions, type of practice).
Materials, data, open access article: Coding sheet and data are publicly
available at http://www.iris-database.org.
How to cite this summary: Kim, S. K., & Webb, S. (2022). Spaced practice
effects in L2 learning. OASIS Summary of Kim & Webb (2022) in Language
Learning. https://oasis-database.org

This summary has a CC BY-NC-SA license.
