Chi et al. (2021)
Abstract
Objective: Metamemory tasks have been utilized to investigate anosognosia in older adults with dementia, though previous
research has not systematically compared memory self-awareness in prodromal dementia groups. This represents an important
oversight given that remedial and interventional efforts may be most beneficial before individuals’ transition to clinical dementia.
We examine differences in memory self-awareness and memory self-monitoring between cognitively healthy elderly controls
and prodromal dementia groups.
Methods: Participants with subjective cognitive decline despite intact objective neuropsychological functioning (SCD; n = 82),
amnestic mild cognitive impairment (aMCI; n = 18), nonamnestic mild cognitive impairment (naMCI; n = 38), and normal cognitive
functioning (HC; n = 120) were recruited from the Einstein Aging Study for a cross-sectional study. Participants completed
an experimental visual memory-based global metamemory prediction task and subjective assessments of memory/cognition and
self-awareness.
Results: While, relative to HC, memory self-awareness and memory self-monitoring were preserved for delayed memory
performance in SCD and aMCI, these processes were impaired in naMCI. Furthermore, results suggest that poor metamemory
accuracy captured by our experimental task can be generalized to everyday memory problems.
Conclusions: Within the framework of the Cognitive Awareness Model, our findings provide preliminary evidence that poor
memory self-awareness/self-monitoring in naMCI may reflect an executive or primary anosognosia, with implications for
tailored rehabilitative interventions.
Keywords: Mild cognitive impairment; Metacognition; Alzheimer’s disease; Dementia; Rehabilitation; Executive functions
Introduction
The ability to accurately self-assess one’s own memory functioning has been shown to be vulnerable to the neuropathological
changes associated with Alzheimer’s disease (AD) (Brandt, Carvalho, Belfort, & Dourado, 2018; Morris & Mograbi, 2013). Poor
self-awareness of memory ability can be conceived as a problem with metamemory—broadly defined as knowledge about one’s
own memory functioning (Nelson & Narens, 1990). Metamemory is supported by monitoring and control mechanisms that,
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]
https://doi.org/10.1093/arclin/acab008 Advance Access publication 23 April 2021
S. Y. Chi et al. / Archives of Clinical Neuropsychology 36 (2021); 1404–1425 1405
respectively, assess the status of ongoing memory performance and direct behavior to optimize memory functioning (Nelson
& Narens, 1990). Impairment of metamemory functioning, including poor awareness of memory/cognitive deficits, has been
linked to objective functional difficulties (Steward, Bull, Kennedy, Crowe, & Wadley, 2019), as well as poorer utilization of
compensatory strategies (Schmitter-Edgecombe & Seelye, 2011) and cognitive rehabilitation outcomes (Clare, Wilson, Carter,
Roth, & Hodges, 2004). Taken together, the inclusion of metamemory assessments in diagnostic assessments of AD and other
neurocognitive disease conditions has been recommended (Brandt et al., 2018; Chapman, Colvin, & Cosentino, 2020; Morris &
Mograbi, 2013; Ryals, O’Neil, Mesulam, Weintraub, & Voss, 2018). Additionally, recent research has focused on investigating
the translational utility of experimental metamemory tasks in clinical settings, especially with regard to clarifying diagnoses and
subtle cognitive changes that are not yet detectable on standardized testing during this early disease stage, and it is only after a
critical point in the disease process, when the level of cognitive decline associated with the underlying disease burden surpasses
these individuals’ ability to compensate, that objective cognitive impairment and the onset of MCI become apparent (e.g., see
Jessen et al., 2014 and Rabin et al., 2017). Therefore, a key diagnostic assumption is that individuals with SCD are capable of
making accurate, global self-assessments about their cognitive and memory functioning. Although this assumption is supported
by a recent study conducted by our laboratory that showed comparable performance between SCD and HC participants on
online measures of absolute and relative metamemory accuracy, SCD has also been attributed to other factors, including poor
metacognition and depression (Buckley, Laming, Chen, Crole, & Hester, 2016; Chapman et al., 2020; Metternich, Schmidtke,
Method
Participants
Participants were recruited from the Einstein Aging Study (EAS), which enrolls ethnically and socioeconomically diverse
community-dwelling individuals who reside in the Bronx, NY. EAS participants are recruited through systematic sampling from
voter registration and Medicare lists (Katz et al., 2012; Lipton et al., 2003), with the following exclusion criteria: age < 70 years,
active psychiatric symptomatology and/or visual/auditory impairments that would interfere with neuropsychological testing,
Procedure
As noted above, this study was part of a larger longitudinal study of cognitive aging. Participants were first assessed during
their annual EAS visit, which included neuropsychological and neurological examinations (see Katz et al., 2012 for details);
∼2 weeks later, they completed a second assessment session that included our visual memory global prediction task, as well as
other objective and subjective assessments.
For the metamemory task, we embedded queries to elicit pre- and post-experience predictions about prospective memory
performance into the standard administration of the BVMT-R (Benedict et al., 1996) to capture memory monitoring and episodic
visual memory performance within a single paradigm. As per the standardized task instructions, participants were informed that
they would have 10 s to study “six geometric figures” presented on a stimulus sheet (BVMT-R, Form 1), after which they
would “draw each figure exactly as it appeared and in its correct location on the page.” Prior to learning the stimuli, participants
made predictions about their future performance for immediate and delayed memory (see Appendix B for exact instructions).
Their responses to these queries were recorded and represented their “pre-experience predictions” for immediate recall (Learning
Trials 1, 2, 3) and delayed recall (Delayed Recall Trial). Then, participants continued with the standardized administration of the
BVMT-R to complete the three learning trials. Specifically, the stimulus sheet included six simple geometric designs arranged in
a 2 × 3 matrix. On each learning trial, participants studied the stimulus sheet for 10 s and then freely recalled as many figures as
they could by drawing them from memory. Following completion of all three learning trials, participants were queried to make a
global JOL about their future performance on the delayed recall trial: “If I ask you about the figures later, how many do you think
you will remember?” Their response was recorded and represented a global JOL rating, an assessment of how much information
one feels has been acquired subsequent to a period of learning. Participants then completed questionnaires for ∼25 min with
Measures
Experimental Task Measures. Objective memory performance: episodic memory scores—BVMT-R Standardized measures of
visual memory performance based on the BVMT-R protocol were calculated for immediate free recall/learning trials (i.e., Trials
1, 2, and 3) and delayed memory trials, including free delayed recall (Delayed Recall Trial) and recognition (for which we
used two scores, Recognition Hits and Recognition Discrimination Index). For all recall trials, participants received 1 point for a
correctly recalled figure and 1 point for a figure that was drawn in the correct location. Therefore, the score for each figure ranged
from 0 to 2 (where 0 indicated neither correct recall nor correct location, 1 indicated either correct recall or correct location, 2
indicated both correct recall and location), and the total score for each trial (sum of all figure scores) ranged from 0 to 12 (where
0 indicated no points were obtained for any of the 6 figures and 12 indicated that the maximum points were obtained for all 6
figures). The raw score for BVMT-R-Delayed Recall equaled the total score for the Delayed Recall Trial, ranging from 0 to 12.
The raw score for BVMT-R-Recognition Hits equaled the sum of correctly recognized target figures, ranging from 0 to 6. The
raw score for the BVMT-R-Recognition Discrimination Index was calculated by subtracting the number of Recognition False
Alarms from the Recognition Hits score, where Recognition False Alarms equaled the number of distractor figures incorrectly
recognized as target figures. Thus, the range for the Recognition Discrimination Index score was −6 to 6, with 6 representing
normal discrimination and −6 representing very poor discrimination (i.e., endorsement of 6 false positive and no target figures).
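The standardized scoring rules described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text, not the published BVMT-R scoring software; the names `FigureResponse`, `trial_score`, and `recognition_discrimination_index` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FigureResponse:
    """One drawn figure on a recall trial (hypothetical container)."""
    recalled: bool          # figure itself was drawn correctly
    correct_location: bool  # figure was placed in its correct location


def trial_score(figures: Sequence[FigureResponse]) -> int:
    """Sum of per-figure scores (0-2 each), giving 0-12 for six figures."""
    if len(figures) != 6:
        raise ValueError("the stimulus sheet contains six figures")
    return sum(int(f.recalled) + int(f.correct_location) for f in figures)


def recognition_discrimination_index(hits: int, false_alarms: int) -> int:
    """Recognition Hits minus Recognition False Alarms (range -6 to 6)."""
    if not (0 <= hits <= 6 and 0 <= false_alarms <= 6):
        raise ValueError("hits and false alarms must each be between 0 and 6")
    return hits - false_alarms
```

A participant who recognizes all six targets and endorses no distractors obtains an index of 6, and the reverse pattern yields −6, matching the range given in the text.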
Objective memory performance: episodic memory scores—global metamemory prediction task. Objective memory perfor-
mance scores for all immediate (i.e., Trial 1, 2, 3) and delayed recall trials were calculated using the standardized BVMT-R
scoring criteria based on recall accuracy for figure but not for location because metamemory queries pertained only to number
of figures that would be remembered in the future. This was done to simplify the comparison between subjective and objective
memory performance, as well as to reduce participant confusion when describing the task prior to learning, which could have
led to unwanted influences on judgments and evaluations. Therefore, participants received 1 point for each figure that was
accurately drawn for each recall trial regardless of whether the figures were drawn in their correct locations. The range for all
recall measures was also 0 to 6, where 0 indicated no figures and 6 indicated all figures were correctly recalled. For delayed
recognition, we utilized the standardized scoring criteria for BVMT-R, Recognition Hits and Recognition Discrimination Index
(see above).
Subjective memory performance: pre-experience predictions. Our global metamemory prediction task yielded three trial-
specific pre-experience predictions pertaining to immediate memory performance (i.e., for Trials 1, 2, and 3) and one trial-
specific pre-experience prediction for delayed memory performance (i.e., for the Delayed Recall trial). Scores ranged from 0 to
6, where 0 indicated no figures and 6 indicated all figures would be correctly recalled on the respective recall trial.
Subjective memory performance: post-experience predictions. Our procedure also yielded a global JOL rating about delayed
memory performance (on the Delayed Recall trial; range = 0 to 6, where 0 indicated no figures and 6 indicated all figures would
be correctly recalled on delayed recall), which was recorded after the completion of all three learning trials. Lastly, there was
one global recognition estimate (range = 0 to 6, where 0 indicated no figures and 6 indicated all figures would be correctly
The Cognitive Change Index (CCI) (Rattanabannakit et al., 2016) is a 20-item measure of participants’ ability level on certain
tasks and cognitive skills (e.g., recalling information, making decisions) compared with 5 years ago. The CCI includes both self
and informant forms and uses a 5-point Likert scale (1 = normal ability/no change to 5 = severe problem/much worse). Of the 20
CCI items, 12 items focus on memory, 5 on executive functioning, and 3 on language. The CCI self (CCI-S) and CCI informant
(CCI-I) scores are the sum of all items on the self-reported and informant reported versions of the assessment, respectively
(range = 20–100), with greater scores representing increased cognitive problems/cognitive change. In addition, the CCI includes
a difference score between self and informant reports (CCI-D, range = −80 to 80), which is calculated as CCI-S − CCI-I,
and represents the discrepancy between self- and informant-reports (Rattanabannakit et al., 2016). A positive score indicates
that the participant reported greater cognitive impairment relative to the informant, while a negative score indicates the reverse.
Compared with self-report measures of cognitive decline, informant-report measures have been more strongly correlated to
participants’ objective neuropsychological test scores (Gavett, Dunn, Stoddard, Harty, & Weintraub, 2011; Rami et al., 2014);
therefore, greater distances between self- and informant-ratings have been conceptualized to reflect poorer awareness of overall
cognitive change/functioning.
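As a rough illustration of the CCI scoring just described, the total and difference scores could be computed as follows. The function names are hypothetical, and the sketch assumes that all 20 items are answered on the 1 to 5 scale.

```python
from typing import Sequence


def cci_total(items: Sequence[int]) -> int:
    """Sum of the 20 CCI items, each rated 1-5 (total range 20-100)."""
    if len(items) != 20 or any(not 1 <= rating <= 5 for rating in items):
        raise ValueError("expected 20 ratings between 1 and 5")
    return sum(items)


def cci_difference(self_items: Sequence[int],
                   informant_items: Sequence[int]) -> int:
    """CCI-D = CCI-S - CCI-I (range -80 to 80).

    Positive values mean the participant reported more cognitive
    change than the informant did; negative values mean the reverse.
    """
    return cci_total(self_items) - cci_total(informant_items)
```

For instance, a participant rating every item 5 paired with an informant rating every item 1 produces the maximum CCI-D of 80.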
The Comprehensive Assessment of Prospective Memory, Section B (CAPM B; Chau et al., 2007) is a 39-item questionnaire
that assesses how problematic everyday prospective memory failures are to an individual, thus measuring his/her level of concern
(Chau et al., 2007). Specifically, prospective memory refers to memory for intended actions that are to be carried out at a specific
time in the future, such as remembering to pass on a phone message, take medication, or turn off the stove after a set period
of time. The CAPM B includes self- and informant-rated versions. Each item describes a memory failure and participants and
informants indicate “how much of a problem” each listed failure has been in the past month (scale 1–5) with “1” representing
“no problem at all” and “5” representing “a very serious problem.” A “not applicable” (N/A) option is also available. We used
the CAPM B total score, the average rating of all items answered, excluding N/A responses, with scores ranging from 0 to 5.
We also calculated a difference score between self and informant reports (CAPM B-D; range = −5 to 5) as the CAPM B-self
score − CAPM B-informant score, with positive values reflecting greater concern reported by participants relative to informants
and negative values representing the reverse. We considered the magnitude of the CAPM B-D score to reflect the distance
between informant and participant ratings, with larger magnitudes likely reflecting poorer awareness of everyday prospective
memory failures.
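The CAPM B total and difference scores described above amount to a mean over answered items and a subtraction. The sketch below is illustrative only: the function names are hypothetical, and representing an N/A response as `None` is an assumption made for the example.

```python
from typing import Optional, Sequence


def capm_b_total(ratings: Sequence[Optional[int]]) -> float:
    """Mean rating over answered items; None marks an N/A response."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        raise ValueError("no answered items to average")
    if any(not 1 <= r <= 5 for r in answered):
        raise ValueError("ratings must be between 1 and 5")
    return sum(answered) / len(answered)


def capm_b_difference(self_ratings: Sequence[Optional[int]],
                      informant_ratings: Sequence[Optional[int]]) -> float:
    """CAPM B-D = self total - informant total.

    Positive values indicate greater concern reported by the
    participant than by the informant.
    """
    return capm_b_total(self_ratings) - capm_b_total(informant_ratings)
```

Because the magnitude of CAPM B-D is what is interpreted as poorer awareness, `abs(capm_b_difference(...))` would be the quantity of interest for that reading.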
The short form of the Geriatric Depression Scale (GDS; Sheikh & Yesavage, 1986) is a self-reported measure of depressive
symptoms, using a yes/no rating scale. Scores range from 0 to 15, and scores of 5 or higher suggest clinical depression (Almeida
& Almeida, 1999; Marc, Raue, & Bruce, 2008).
Statistical Analyses
We used SPSS Version 26 for analyses, and all p-values were two-tailed with an alpha level of .05. Effect sizes for analysis
of variance (ANOVA) and analysis of covariance (ANCOVA) were assessed using partial eta squared. With regard to the
Results
Participant Characteristics
Table 1 shows the demographic and clinical characteristic comparisons for the HC, SCD¹, aMCI, and naMCI groups.
Education significantly differed and post-hoc tests showed the mean years of education was significantly lower for naMCI
compared to HC and SCD (ps<.001). Ethnicity also significantly differed, with post-hoc tests showing a greater proportion
¹ Only 8 SCD participants were classified based on informant-report of cognitive concerns. Box-plot analyses showed that these 8 participants were not outliers in any of the relevant demographic
or experimental measures.
Table 1. Demographic and clinical characteristics of the healthy control, subjective cognitive decline, amnestic mild cognitive impairment, and nonamnestic
mild cognitive impairment groups (n = 260)
HC SCD aMCI naMCI
M (SD) or # (%) M (SD) or # (%) M (SD) or # (%) M (SD) or # (%)
Variable n = 120 n = 84 n = 18 n = 38 p
Age (years) 80.28 (5.53) 81.56 (5.18) 81.56 (7.00) 80.42 (5.72) ns
Sex (women) 80.00 (66.67) 54.00 (64.30) 11.00 (61.10) 31.00 (81.60) ns
Education (years) 14.99 (3.13) 15.10 (3.20) 13.56 (4.44) 11.84 (2.91)∗∗∗++ <.001
Ethnicity (non-white) 45.00 (37.50) 24.00 (28.60) 8.00 (44.40) 27.00 (71.10)∗∗∗++ <.001
of non-white participants in the naMCI group compared to HC (p<.001) and SCD (p<.001). GDS² scores also differed significantly across groups, and
post-hoc tests showed that the mean GDS score was significantly higher for SCD compared to HC (p<.05). However, all group
means for GDS were below the cut-off score associated with clinical depression (i.e., GDS > 4). There were no significant
between-group differences of age or sex.
Tables 2 and 3 summarize the means, standard deviations, and group and within-subjects comparisons for HC, SCD, aMCI,
and naMCI on measures of immediate recall, delayed recall, and delayed recognition obtained using the BVMT-R and global
prediction task, respectively. Figure 1 visually depicts group and within-subjects comparisons across trials for objective and
subjective memory/metamemory measures. As expected, using both sets of scores, the MCI groups demonstrated poor overall
immediate recall and delayed recall compared with HC; however, only aMCI demonstrated impaired recognition (see also
Figure 1a). This pattern shows that, although both MCI groups struggled with recall, memory storage was intact for naMCI but
not aMCI, highlighting the latter group’s primary deficit in episodic memory.
Memory Self-awareness
Table 4 summarizes the group means and standard deviations for HC, SCD, aMCI, and naMCI, and notes significant
between-group and within-subject differences, on measures of memory self-awareness, as indexed by pre-experience memory
predictions and predictive accuracy.
Memory predictions. The 4 Group (HC, SCD, aMCI, and naMCI) X 4 Trial (1, 2, 3, Delayed Recall) mixed ANCOVA,
using pre-experience memory prediction (with education, ethnicity, and GDS as covariates), showed no significant main effect
of group (F < 1). There was a significant within-subjects effect of Trial (F(2.23, 570.95) = 10.71, MSE = .90, ε = .76,
p < .001, ηp² = .04) that was qualified by a significant Group X Trial interaction (F(6.88, 570.95) = 3.45, MSE = .897, ε = .76,
p <, ηp² = .04). Group X Trial contrasts revealed significant Group X Trial interactions between Trial 2 and Trial 3 (F(3,
249) = 2.96, MSE = .64, p < .05, ηp² = .03), as well as between Trial 3 and Delayed Recall (F(3, 249) = 7.04, MSE = .64,
p < .001, ηp² = .08). The remaining contrast (Trial 2 vs. Trial 1) did not reveal a significant interaction term (F < 1). Post-hoc
tests using separate paired t-tests for each group, together with analysis of a simple effects plot (Figure 1b), showed that in the
pre-experience phase all groups expected their recall performance to significantly improve between Trial 1 and Trial 2 (HC:
t(119) = −9.69, p < .001; SCD: t(81) = −10.01, p < .001; aMCI: t(17) = −5.58, p < .001; naMCI: t(37) = −4.91, p < .001).
In addition, all groups except aMCI also expected their recall performance to continue to significantly improve between
Trial 2 and Trial 3 (HC: t(119) = −6.95, p < .001; SCD: t(81) = −6.24, p < .001; aMCI: t(17) = −1.49, p = ns; naMCI:
t(37) = −2.14, p < .05). Furthermore, all groups except naMCI expected their recall performance to markedly decline (see
Figure 1b to compare slopes) between Trial 3 and Delayed Recall (HC: t(117) = 14.59, p < .001; SCD: t(81) = 12.84, p < .001;
² Seventeen participants demonstrated GDS scores above the cut-off associated with depression. Results from a Pearson chi-square test showed that the frequency of these depressed participants did
not significantly differ between groups. Results from all statistical tests that included these participants with elevated clinical depression scores did not differ from those that excluded them;
therefore, we reported findings from the analyses that included all participants.
Table 2. Analyses for group and within-subjects differences for the healthy control, subjective cognitive decline, amnestic mild cognitive impairment, and nonamnestic mild cognitive impairment groups
on standardized measurements of immediate recall, delayed recall, and recognition obtained using the Brief Visual Memory Test-Revised (N = 260)
Table 3. Analyses for group and within-subjects differences for the healthy control, subjective cognitive decline, amnestic mild cognitive impairment, and
nonamnestic mild cognitive impairment groups on measurements of immediate recall, delayed recall, and recognition obtained using the global metamemory
prediction task (N = 260)
aMCI: t(17) = 3.74, p < .01; naMCI: t(37) = 3.14, p < .01). Taken together, findings suggest that, prior to exposure to task
stimuli/learning, all groups demonstrated knowledge that memory performance typically improves with repetition and may
decline after a delay.
Accuracy of memory predictions. The 4 Group (HC, SCD, aMCI, and naMCI) X 4 Trial (1, 2, 3, Delayed Recall) mixed
ANCOVA conducted using calibration scores (i.e., predictive/metamemory accuracy), with education, ethnicity, and GDS as
covariates, showed no significant main effect of group (F = 1.43, p > .05). There was, however, a significant main effect of Trial
(F(2.61, 615.26) = 3.42, MSE = 1.44, ε = .87, p < .05, ηp² = .02) that was qualified by a significant Group X Trial interaction
(F(7.82, 615.26) = 3.31, MSE = 1.44, ε = .87, p < .01, ηp² = .04). Separate one-way independent ANCOVAs, controlling for
appropriate covariates, revealed no significant group differences in calibration for Trial 1 (F < 1), Trial 2 (F = 1.31), or Trial
3 (F < 1). However, groups did significantly differ on calibration on Delayed Recall (F(3, 236) = 4.67, p < .01, ηp² = .06).
Post-hoc tests showed that HC participants were significantly more accurate (demonstrated by a smaller mean calibration score
because perfect calibration is 0) in their pre-experience memory predictions compared with naMCI. Analysis of simple effects
plot (see Figure 1c) also showed that all groups were significantly more accurate in their pre-experience memory predictions
for delayed compared with immediate recall trials. Taken together, findings indicate that, at pre-experience, the predementia
groups were just as accurate as HC in predicting their immediate memory performance; however, the naMCI group (but no
other predementia group) was less accurate in predicting their delayed memory performance relative to HC.
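To make the calibration measure concrete, the sketch below computes a signed calibration score as predicted minus actual figure-only recall. This definition is an assumption inferred from the conventions reported in the text (perfect calibration equals 0, positive scores reflect overconfidence, negative scores reflect underconfidence); the function name `calibration_score` is hypothetical.

```python
def calibration_score(predicted: int, actual: int) -> int:
    """Signed calibration: predicted figures minus figures actually recalled.

    Assumed convention: 0 = perfect calibration, positive = overconfidence,
    negative = underconfidence. Both inputs use the 0-6 figure-only scale.
    """
    for value in (predicted, actual):
        if not 0 <= value <= 6:
            raise ValueError("scores must be between 0 and 6")
    return predicted - actual
```

Under this convention, a participant who predicts recalling 5 figures after the delay but recalls 2 has a calibration score of 3, whereas one who predicts 2 and recalls 5 has a score of −3.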
Memory Self-monitoring. Table 5 summarizes the group means and standard deviations for HC, SCD, aMCI, and naMCI, and
notes significant between-group and within-subject differences, on measures of memory self-monitoring, as indexed by
post-experience memory predictions and their accuracy.
Memory predictions: Delayed recall. The 4 Group (HC, SCD, aMCI, and naMCI) by 2 Experience (pre-experience vs. post-
experience) mixed ANCOVA using memory predictions for Delayed Recall (with education, ethnicity, and GDS as covariates),
showed no significant main effect of Group (F < 1). There was, however, a significant effect of Experience (F(1, 242) = 7.90,
MSE = 1.34, ε = 1, p < .01, ηp² = .03), where the predicted performance for Delayed Recall was higher at pre-experience
(M = 3.06, SE = .12) compared with post-experience (M = 2.71, SE = .12). The Group X Experience interaction effect was not
significant (F < 1). Taken together, findings suggest that all participants, irrespective of group classification, predicted higher
delayed recall performance prior to learning compared with after learning, indicating that the experience of learning impacted
(i.e., lowered) their performance expectations.
Accuracy of memory predictions: Delayed recall. The 4 Group (HC, SCD, aMCI, and naMCI) by 2 Experience (pre-
experience vs. post-experience) mixed ANCOVA using calibration scores for Delayed Recall (with education, ethnicity, and
GDS as covariates), showed a significant main effect of Group (F(3, 230) = 6.54, MSE = 2.71, p < .001, ηp² = .08). Post-
hoc tests showed the naMCI group (M = 1.84, SE = .26) but not aMCI (M = 1.39, SE = .37) demonstrated significantly
worse calibration with higher calibration scores (perfect calibration = 0) compared with the HC (M = 0.67, SE = .15) and
SCD (M = 0.48, SE = .18) groups (ps < .01 and .001, respectively). There was also a significant main effect of Experience (F(1,
230) = 7.69, MSE = 1.36, ε = 1, p < .01, ηp² = .03), which showed that all participants were more accurate in their memory
predictions at post-experience (M = 1.27, SE = .17) compared with pre-experience (M = 0.92, SE = .12). Lastly, the Group
Table 4. Means and standard deviations reported for the healthy control, subjective cognitive decline, amnestic mild cognitive impairment, and nonamnestic mild
cognitive impairment groups on measurements of memory self-awareness, indexed by pre-experience memory predictions and the accuracy of pre-experience
memory predictions (calibration scores), obtained from the global metamemory prediction task (N = 260)
HC SCD aMCI naMCI
Variable M (SD) M (SD) M (SD) M (SD)
n = 120 n = 84 n = 18 n = 38
Memory Self-Awareness
Predictions
Trial 1 3.50 (1.25) 3.42 (1.38) 3.11 (1.18) 2.92 (1.08)
Table 5. Means and standard deviations for the healthy control, subjective cognitive decline, amnestic mild cognitive impairment, and nonamnestic mild
cognitive impairment groups on measurements of memory self-monitoring, indexed by post-experience memory predictions and the accuracy of post-experience
predictions (calibration scores), obtained from the global metamemory prediction task (N = 260)
HC SCD aMCI naMCI ANCOVA ηp 2
Variable M (SD) M (SD) M (SD) M (SD) F-value (p)
n = 120 n = 84 n = 18 n = 38
Memory Self-Monitoring
Predictions
Pre-Exp, DR 2.92 (1.45) 2.94 (1.36) 2.94 (1.30) 3.34 (1.60) — —
Post-Exp, DR (Global JOL) 2.67 (1.32)a 2.87 (1.36)a 2.44 (1.15)a 2.68 (1.34)a — —
Prediction Accuracy
Delayed Recall
Pre-Exp, DR 0.75 (1.87) 0.48 (2.22) 1.65 (1.58) 2.34 (2.10)∗++ — —
Post-Exp, DR (Global JOL) 0.51 (1.43)b 0.30 (1.28)b 1.12 (1.87)b 1.74 (1.50)b ∗∗++ — —
Delayed Recognition
Post-Exp, GRE (RDI) −0.92 (1.82) −1.01 (1.46) −0.73 (2.15) −0.78 (2.11) 0.10 (ns) .00
Post-Exp, Rec 4.14 (1.50) 4.37 (1.44) 3.31 (1.20) 3.51 (1.40)∗∗∗+ 3.76 (<.05) .05
Notes. M = mean; SD = standard deviations; HC = healthy control; SCD = subjective cognitive decline; aMCI = amnestic mild cognitive impairment;
naMCI = nonamnestic mild cognitive impairment; ANCOVA = analysis of covariance. Sample sizes slightly vary due to omission of scores by certain
participants. Pre-Exp = Pre-experience; Post-Exp = Post-Experience; DR = delayed recall; Rec = delayed recognition; GRE = Global Recognition Estimate;
Global JOL = global judgment of learning; Rec Hits = recognition hits; RDI = Recognition Discrimination Index; ns = not significant. ANCOVA was used to
compare group differences of all variables, adjusting for education, ethnicity, and GDS. All effect sizes are partial eta square.
Significantly different from HC (p < .05)∗ , (p < .01)∗∗
Significant trend compared to HC (p = .06)∗∗∗
Significantly different from SCD (p < .05)+ , (p < .01)++
Predicted recall performance declined from subsequent trial (ps < .001)a
Prediction accuracy better at post-experience compared with pre-experience (ps < .001)b
X Experience interaction effect was not significant (F < 1). Taken together, findings suggest that, although all participants
demonstrated an increase in prediction accuracy from pre- to post-experience, naMCI participants were less accurate (in the
direction of overconfidence) in predicting their delayed memory performance both before and after learning compared with HC and SCD
participants.
Accuracy of memory predictions: Delayed recognition. We evaluated group differences in post-experience prediction
accuracy for delayed recognition performance (i.e., global recognition estimate accuracy) for both RDI and Recognition Hits
Table 6. Summary of results for Pearson’s correlations between global prediction measures and subjective report scores related to everyday prospective memory
failures and overall cognitive change
Subjective Report Scores JOL rating JOL Accuracy
CAPM-S .048 -.067
CAPM-I .133 .221∗∗
CAPM-D .083 -.015
CCI-S .073 .031
CCI-I -.110 -.036
CCI-D .240∗∗ .153
using separate one-way ANCOVAs, where education, ethnicity, and GDS were entered as covariates. There was a significant
group effect on global recognition estimate accuracy when the calibration score was based on recognition hits. Post-hoc tests
showed that naMCI demonstrated worse calibration with a larger mean global recognition estimate score compared with HC
(p = .06) and SCD (p < .05). However, there was no significant group effect on global recognition estimate accuracy when
the calibration score was based on the Recognition Discrimination Index score. Notably, group means for global recognition
estimate accuracy (i.e., using both recognition hits and the Recognition Discrimination Index score) were negative overall,
suggesting that all groups were underconfident in prospectively judging their recognition performance. In addition, group means
for global recognition estimate accuracy reflected greater underconfidence when the calibration score was based on recognition
hits compared with the Recognition Discrimination Index score, which was due to the fact that participants’ actual recognition
scores were higher when false positives were not accounted for. Participants likely did not take penalties for endorsing false
alarms into consideration when making their global recognition estimates. Therefore, the fact that naMCI only differed from
HC and SCD participants on global recognition estimates when the recognition hits score (but not Recognition Discrimination Index)
was used for calibration, suggests that naMCI participants were prone to false recognition errors.
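A small worked example may help show how the choice of recognition score changes the accuracy measure. The sketch assumes accuracy is the global recognition estimate minus the actual score (so negative values indicate underconfidence, as reported above); the function name and the numbers are hypothetical and are not study data.

```python
from typing import Tuple


def recognition_estimate_accuracy(estimate: int, hits: int,
                                  false_alarms: int) -> Tuple[int, int]:
    """Return (accuracy vs. hits, accuracy vs. discrimination index).

    Assumed convention: estimate minus actual score, so that negative
    values indicate underconfidence.
    """
    discrimination_index = hits - false_alarms
    return estimate - hits, estimate - discrimination_index


# Hypothetical participant: estimates 4 figures, then recognizes all
# 6 targets but also endorses 1 distractor.
vs_hits, vs_rdi = recognition_estimate_accuracy(estimate=4, hits=6,
                                                false_alarms=1)
print(vs_hits, vs_rdi)  # -2 -1
```

Because false alarms lower the Recognition Discrimination Index but not the hits score, the same estimate looks more underconfident against hits (−2) than against the index (−1).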
Correlation Between Global Prediction Measures and Self- and Informant-reported Inventories of Everyday Cognitive
Functioning. We explored the relationship between global JOL ratings, as well as global JOL accuracy, and the six subjective
measures associated with the CAPM-B and CCI (i.e., self-report, informant-report, and self-informant difference scores for each
inventory) utilizing the entire sample in two different Pearson correlation analyses. See Table 6 for a summary of results.
There was a significant small positive correlation between global JOL ratings and the CCI-D score, suggesting that,
subsequent to learning, participants who gave higher global JOL ratings also showed poorer self-awareness (i.e., a larger positive
discrepancy between self- and informant-reported cognitive decline). Unexpectedly, however, poor self-awareness was in the
direction of underconfidence.
There was also a significant small positive correlation between global JOL calibration scores and CAPM-I, suggesting that
participants who demonstrated poorer global JOL accuracy (i.e., larger calibration scores) also had informants who were more
concerned about their everyday prospective memory failures. No other significant correlations emerged between experimental
measures and self-report assessments.
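For readers who want to reproduce this type of analysis, Pearson's r between an experimental measure and a questionnaire score can be computed as sketched below. The analyses in this study were run in SPSS; the pure-Python function here is only a minimal illustration, and its name is hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Passing, for example, each participant's global JOL calibration score as `x` and the corresponding CAPM B informant score as `y` would give the kind of coefficient summarized in Table 6.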
Discussion
Our primary goal was to investigate differences in metamemory accuracy between cognitively healthy elderly controls (HC)
and those with SCD, aMCI, and naMCI using a global metamemory prediction task based on visual memory. Compared with HC,
all prodromal dementia groups demonstrated comparable metamemory accuracy with regard to immediate recall performance at
pre-experience, indicating intact self-awareness about immediate memory processes. In addition, all participant groups were able
to appropriately modify their memory predictions after task exposure/learning, demonstrating significantly better metamemory
accuracy at post-experience compared with pre-experience. However, compared with HC, only the naMCI group demonstrated
significantly worse metamemory accuracy at pre-experience and significantly worse JOL accuracy at post-experience when
predicting their delayed recall performance, indicating that, despite demonstrating prediction upgrading, naMCI (but not aMCI)
participants still struggled with poor self-awareness and self-monitoring of delayed memory processes. Altogether, results
S. Y. Chi et al. / Archives of Clinical Neuropsychology 36 (2021); 1404–1425 1417
suggest neuropsychological deficits specific to MCI subtypes may interfere with different mechanisms in the brain’s cognitive
awareness system resulting in differential levels of awareness and metamemory accuracy. Moreover, our findings provide
confirmatory evidence that metamemory abilities remain intact in SCD. A second goal was to explore the relationship between
global JOL measures and self-report measures related to everyday memory/cognitive problems. Results offer some support that
metamemory difficulties captured using our experimental metamemory task may generalize to everyday memory difficulties.
Memory self-awareness. Relative to HC, participants with SCD, aMCI, and naMCI demonstrated intact general knowledge
and self-awareness related to immediate memory processes. These findings are consistent with previous research showing that
memory self-knowledge and self-awareness of working memory and immediate memory abilities are preserved in MCI and
AD (Bertrand et al., 2019; Seelye et al., 2010; Silva, Pinho, Macedo, Souchay, & Moulin, 2017; Thomas, Lee, & Balota,
2013). Notably, relative to HC, the naMCI participants (and no other preclinical dementia group) demonstrated evidence of
poor memory self-awareness of delayed recall, which could potentially suggest that those with naMCI base their metamemory
judgments on inaccurate representations of self-ability. However, given that memory self-awareness, especially as measured by
predictive accuracy at pre-experience, has also been linked to other factors, such as familiarity with task procedures (Connor,
Dunlosky, & Hertzog, 1997), future research is needed to investigate the determinants of impaired memory self-awareness
observed in those with naMCI. Lastly, given that SCD is conceived as a pre-MCI condition (Rabin et al., 2017), findings of
intact general knowledge and self-awareness of memory processes were expected.
Memory self-monitoring: Mild Cognitive Impairment. Consistent with our predictions, those with naMCI (but not aMCI)
demonstrated deficits in memory self-monitoring (i.e., JOL accuracy) compared with HC. Given that past research has shown
that the frontal cortex but not the temporal lobes is critical in supporting JOL accuracy (Andrés, Mazzoni, & Howard, 2010;
Howard et al., 2010; Howard, Andrés, & Mazzoni, 2013; Vilkki et al., 1998; Vilkki et al., 1999), our results suggest that poor
JOL calibration observed in naMCI participants was likely attributable to primary deficits in executive functions/frontal systems,
while intact JOL calibration observed in aMCI participants was likely because JOL accuracy is not dependent on the integrity
of episodic memory/medial temporal lobe systems. In addition, because the naMCI group included participants characterized
by low executive and/or global/verbal functioning factor scores, deficits in global functioning, which may also interfere with
the efficiency of executive functioning processes, could have also contributed to poor JOL calibration in some of our naMCI
participants. Importantly, all groups demonstrated the ability to significantly improve the accuracy of their memory predictions
for Delayed Recall after learning/task exposure (i.e., prediction upgrading) essentially by lowering their memory predictions.
This suggests that even the naMCI group was able to utilize experience with the task to update memory self-knowledge, albeit
less efficiently compared with HC.
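The idea of prediction upgrading can be made concrete with a small sketch. It assumes, for illustration only, that a calibration score is the absolute difference between the predicted and actual number of figures recalled (0 to 6), which is consistent with larger scores indicating poorer accuracy. The function names and the example participant are hypothetical.

```python
def calibration_score(predicted, actual):
    """Absolute gap between predicted and actual recall.

    Assumed definition for this sketch: 0 means a perfectly calibrated
    prediction, and larger values mean poorer accuracy.
    """
    return abs(predicted - actual)


def shows_prediction_upgrading(pre_prediction, post_prediction, actual):
    """True if the post-experience prediction is closer to actual
    performance than the pre-experience prediction was."""
    pre_error = calibration_score(pre_prediction, actual)
    post_error = calibration_score(post_prediction, actual)
    return post_error < pre_error


# Hypothetical participant (not study data): predicts 6 of 6 figures before
# seeing the task, lowers the prediction to 4 after the learning trials,
# and recalls 3 figures after the delay.
pre, post, recalled = 6, 4, 3
print(calibration_score(pre, recalled))    # pre-experience error
print(calibration_score(post, recalled))   # post-experience (JOL) error
print(shows_prediction_upgrading(pre, post, recalled))
```

In this example the participant upgrades the prediction by lowering it, yet a post-experience error remains, which mirrors the pattern of improvement with residual inaccuracy described for the naMCI group.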
Altogether, our findings offer support for the hypothesis that differences in metamemory accuracy may arise in MCI subtypes
as a function of their primary neuropsychological impairments that may interfere with specific mechanisms in the brain’s
cognitive awareness system. Findings for aMCI participants are consistent with past research showing preserved JOL accuracy
in aMCI (Akhtar et al., 2006; Ryals et al., 2018; Seelye et al., 2010). Results for naMCI are also generally consistent with those
reported by Seelye et al. (2010)—who, using a verbal memory-based metamemory prediction paradigm, showed evidence that
naMCI (but not aMCI) participants were significantly more poorly calibrated in their delayed recall predictions both at pre- and
post-experience compared with HC. Finally, findings for naMCI in the current study are consistent with those from our previous
study (Chi et al., 2020) using a retrospective metamemory task, which showed that metamemory monitoring processes were
impaired in naMCI (but not aMCI).
Memory self-monitoring: Subjective Cognitive Decline. Consistent with our prediction, SCD participants were able to
accurately predict their memory performance relative to HC, demonstrating intact memory self-monitoring. These findings
are also consistent with our previous study (Chi et al., 2020), which showed comparable performance between SCD and HC
on a retrospective metamemory task. Notably, although SCD participants demonstrated a significantly higher mean GDS score
compared with HC in the current study (although depression symptoms were in the nonclinical range for all groups), performance
between SCD and HC participants did not significantly differ on any of our metamemory accuracy measures after controlling for
Global recognition estimates. We found that participants underestimated their delayed recognition performance. Notably,
participants were queried about their prospective recognition performance immediately after the delayed recall trial. It is possible
that the perception of poor performance on the delayed recall trial or perception that the task was difficult lowered participants’
confidence in recognition performance. Confidence in one’s response on a given task/trial has been reported to influence
confidence on the following task/trial (Rahnev, Koizumi, McCurdy, D’Esposito, & Lau, 2015). Unfortunately, we did not query
participants about their predicted delayed recognition performance before learning, which precludes further analysis of this
issue. Future research should explore the effect of task order.
Relationship Between JOL Accuracy and Subjective Reports of Everyday Cognitive Functioning
JOL predictions were positively correlated with an SRD measure that was based on overall cognitive change/problem,
suggesting that participants who made higher predictions for delayed recall performance at post-experience also had poorer
self-awareness scores (i.e., a larger discrepancy between self- and informant-report), although unexpectedly in the direction
of underconfidence. One explanation is that participants may have been able to accurately assess their memory/cognitive
functioning when completing offline metamemory measures (e.g., questionnaires) because these are based on metacognitive
knowledge (i.e., general knowledge and beliefs about their memory; Flavell, 1979). In spite of this, it is possible that those
with cognitive difficulties struggled to spontaneously use their metacognitive knowledge to support their predictions while
engaged in online performance monitoring (Perrotin, Belleville, & Isingrini, 2007), given the high cognitive load of experimental
metamemory tasks, resulting in their higher JOL ratings. Lastly, although the size of both MCI groups is comparable to those
reported in other metacognition in MCI studies (e.g., Ryals et al., 2018; Seelye et al., 2010; see also Piras et al., 2016, Table 2), it
limited our ability to explore correlational differences between “online” JOL measures and “offline” measures of both cognitive
change and prospective memory difficulties for each group individually due to insufficient power.
In addition, JOL calibration scores (i.e., higher scores indicate poorer JOL accuracy) were positively correlated with
informant but not self-reported concern about prospective memory failures, suggesting that participants who demonstrated the
worst JOL accuracy also had informants who were the most concerned about their everyday memory difficulties. Given that
prospective memory failures are associated with safety implications for activities of daily life, such as remembering to take
medication or turn off the stove (Chau et al., 2007), it is not unexpected that everyday PM failures related to poor JOL accuracy
would be linked to greater informant concern as these failures may be more salient. In addition, the lack of a relationship
between self-reported concern and poorer JOL accuracy could provide more evidence of poorer self-awareness of memory
functioning in participants with poor visual memory JOL accuracy. Importantly, given that anosognosia is also associated
with safety risks (Starkstein, Jorge, Mizrahi, Adrian, & Robinson, 2007), we offer support that outcome measures from our
experimental metamemory task can generalize to memory difficulties in everyday life.
Although our global metamemory prediction task is not suitable for detection of a mnemonic anosognosia, it holds potential
for detecting the executive and primary forms of anosognosia. In the context of CAM, failure to detect errors and/or impaired
evaluative processes during performance monitoring (e.g., due to an executive anosognosia) or lack of conscious awareness
of memory failures (e.g., due to a primary anosognosia) could result in one possessing inaccurate self-knowledge about
his/her cognitive abilities over time (Hannesdottir & Morris, 2007; Morris & Mograbi, 2013). Interestingly, although naMCI
participants struggled with self-monitoring of delayed memory processes (i.e., demonstrating poor JOL accuracy relative to
HC) in our study, they were still able to improve their prediction accuracy after learning, demonstrating the ability to use their
experience with the task to update memory self-knowledge. Overall, this finding may provide preliminary support to the idea
that poor JOL accuracy in naMCI could be more attributable to problems with performance monitoring rather than to a lack of
conscious awareness of memory failures. However, modifications to our global metamemory prediction task will be necessary
to definitively distinguish between executive and primary forms of anosognosia. For example, incorporating a separate feedback
Funding
This work was supported by the National Institute on Aging (NIA) and National Institute of General Medical Sciences
(SC2AG039235), NIA (P01 AG03949, R01AG039409-0), The Czap Foundation, and The Leonard and Sylvia Marx Foundation.
Acknowledgements
The authors are appreciative of Milushka Elbulok-Charcape, Moisey Abramov, Valdiva Da Silva, Dr. Avner Aronov, Dr. Ashu
Kapoor, Dr. Erica Meltzer, Charlotte Magnotta, Dr. Wendy Ramratan, Dr. Molly Zimmerman, Dr. Richard Lipton, and Mindy
Katz for their contributions.
Conflict of Interest
None declared.
References
Agnew, S. K., & Morris, R. (1998). The heterogeneity of anosognosia for memory impairment in Alzheimer’s disease: A review of the literature and a proposed
model. Aging & Mental Health, 2(1), 7–19.
Akhtar, S., Moulin, C. J., & Bowie, P. C. (2006). Are people with mild cognitive impairment aware of the benefits of errorless learning? Neuropsychological
Rehabilitation, 16(3), 329–346.
Almeida, O. P., & Almeida, S. A. (1999). Confiabilidade da versão Brasileira da Escala de Depressão em Geriatria (GDS) versão reduzida. Arq Neuropsiquiatr,
57(2B), 421–426.
Amariglio, R. E., Mormino, E. C., Pietras, A. C., Marshall, G. A., Vannini, P., Johnson, K. A. et al. (2015). Subjective cognitive concerns, amyloid-β, and
neurodegeneration in clinically normal elderly. Neurology, 85(1), 56–62.
Anderson, S.-E. (2009). Predictions of episodic memory following moderate to severe traumatic brain injury during inpatient rehabilitation. Journal of Clinical
and Experimental Neuropsychology, 31(4), 425–438. doi: 10.1080/13803390802232667.
Andrés, P., Mazzoni, G., & Howard, C. E. (2010). Preserved monitoring and control processes in temporal lobe epilepsy. Neuropsychology, 24(6), 775.
Barnes, D. E., Santos-Modesitt, W., Poelke, G., Kramer, A. F., Castro, C., Middleton, L. E. et al. (2013). The mental activity and eXercise (MAX) trial: A
randomized controlled trial to enhance cognitive function in older adults. JAMA Internal Medicine, 173(9), 797–804.
doi: 10.1001/jamainternmed.2013.189.
Beckett, L. A., Donohue, M. C., Wang, C., Aisen, P., Harvey, D. J., Saito, N. et al. (2015). The Alzheimer’s disease neuroimaging initiative phase 2: Increasing
the length, breadth, and depth of our understanding. Alzheimer’s and Dementia, 11(7), 823–831. doi: 10.1016/j.jalz.2015.05.004.
Benedict, R. H., Schretlen, D., Groninger, L., Dobraski, M., & Shpritz, B. (1996). Revision of the brief visuospatial memory test: Studies of normal performance,
reliability, and validity. Psychological Assessment, 8(2), 145.
Jessen, F., Feyen, L., Freymann, K., Tepest, R., Maier, W., Heun, R. et al. (2006). Volume reduction of the entorhinal cortex in subjective memory impairment.
Neurobiology of Aging, 27(12), 1751–1756. doi: 10.1016/j.neurobiolaging.2005.10.010.
Katz, M. J., Lipton, R. B., Hall, C. B., Zimmerman, M. E., Sanders, A. E., Verghese, J., Dickson, D. W., & Derby, C. A. (2012). Age-specific and sex-specific
prevalence and incidence of mild cognitive impairment, dementia, and Alzheimer dementia in blacks and whites: a report from the Einstein Aging Study.
Alzheimer disease and associated disorders, 26(4), 335–343. doi: 10.1097/WAD.0b013e31823dbcfc.
Lipton, R. B., Katz, M. J., Kuslansky, G., Sliwinski, M. J., Stewart, W. F., Verghese, J., Crystal, H. A., & Buschke, H. (2003). Screening for dementia by
telephone using the memory impairment screen. Journal of the American Geriatrics Society, 51(10), 1382–1390. doi: 10.1046/j.1532-5415.2003.51455.x.
Marc, L. G., Raue, P. J., & Bruce, M. L. (2008). Screening performance of the 15-item geriatric depression scale in a diverse elderly home care population.
American Journal of Geriatric Psychiatry, 16(11), 914–921. doi: 10.1097/JGP.0b013e318186bd67.
Metternich, B., Schmidtke, K., & Hüll, M. (2009). How are memory complaints in functional memory disorder related to measures of affect, metamemory and
Saykin, A., Wishart, H., Rabin, L., Santulli, R., Flashman, L., West, J. et al. (2006). Older adults with cognitive complaints show brain atrophy similar to that
of amnestic MCI. Neurology, 67(5), 834–842.
Schmitter-Edgecombe, M., & Seelye, A. M. (2011). Predictions of verbal episodic memory in persons with Alzheimer’s disease. Journal of Clinical and
Experimental Neuropsychology, 33(2), 218–225. doi: 10.1080/13803395.2010.507184.
Schraw, G. (2009). Measuring metacognitive judgments. Handbook of metacognition in education, 415.
Seelye, A. M., Schmitter-Edgecombe, M., & Flores, J. (2010). Episodic memory predictions in persons with amnestic and nonamnestic mild cognitive
impairment. Journal of Clinical and Experimental Neuropsychology, 32(4), 433–441.
Sheikh, J. I., & Yesavage, J. A. (1986). Geriatric Depression Scale (GDS): recent evidence and development of a shorter version. Clinical Gerontologist: The
Journal of Aging and Mental Health.
Silva, A. R., Pinho, M. S., Macedo, L., Souchay, C., & Moulin, C. (2017). Mnemonic anosognosia in Alzheimer’s disease is caused by a failure to transfer
APPENDIX A
First, we established robust norms for the 13 neuropsychological tests utilizing 411 independent EAS participants who were
dementia-free for 3 years, who were not participants in the current study, and whom we refer to as the “robust sample.” Second,
three underlying cognitive factors were identified using a principal component analysis: global/verbal; executive/processing
speed; and memory. Third, for participants in the current study, global/verbal, executive/processing speed, and memory
cognitive domain scores were calculated as the average Z score of each neuropsychological test associated within a given
factor, derived using means and standard deviations (SD) of the robust sample stratified by age group (70–79 and 80 and
above).
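The domain score calculation described above can be sketched as follows. The test names, normative means, and SDs are placeholders rather than the actual robust-sample norms, the grouping of tests into a factor is hypothetical, and the sketch assumes higher raw scores indicate better performance on every test.

```python
from statistics import mean

# Placeholder norms: {age band: {test name: (mean, SD)}}.
# These numbers are invented for illustration, not robust-sample values.
ROBUST_NORMS = {
    "70-79": {"test_a": (50.0, 10.0), "test_b": (20.0, 4.0)},
    "80+": {"test_a": (45.0, 10.0), "test_b": (18.0, 4.0)},
}

# Hypothetical assignment of tests to a cognitive factor.
FACTOR_TESTS = {"memory": ["test_a", "test_b"]}


def age_band(age):
    """Map age to the two strata named in the text (70-79, 80 and above)."""
    return "70-79" if age < 80 else "80+"


def domain_score(raw_scores, age, factor):
    """Average Z score across the tests that load on one factor, using
    the norms for the participant's age band."""
    norms = ROBUST_NORMS[age_band(age)]
    z_scores = []
    for test in FACTOR_TESTS[factor]:
        norm_mean, norm_sd = norms[test]
        z_scores.append((raw_scores[test] - norm_mean) / norm_sd)
    return mean(z_scores)


scores = {"test_a": 40.0, "test_b": 18.0}
print(domain_score(scores, 75, "memory"))  # scored against 70-79 norms
print(domain_score(scores, 82, "memory"))  # same raw scores, 80+ norms
```

The same raw scores yield a less negative domain score in the older band because the placeholder normative means are lower there, which is the purpose of age stratification.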
MCI was classified in participants whose cognitive domain scores were considerably lower (>1 SD) than the mean
of the robust sample on one or more cognitive factors and who endorsed at least one cognitive complaint on EAS self-
report measures—i.e., items that assess participants’ self-perceptions of their cognitive abilities taken from the Consortium
to Establish a Registry for Alzheimer’s Disease (Morris et al., 1993), a yes-no rating scale of current functioning of
several cognitive domains; or the “cognitive item” from the GDS (Sheikh & Yesavage, 1986), a dichotomous item that
asks participants whether they feel they have “more memory problems than most.” MCI was further subdivided into aMCI
and naMCI. Participants whose cognitive factor Z scores were below 1 SD on memory or memory plus global, and/or
executive/processing speed domains of the robust sample were classified as aMCI. Participants whose cognitive factor Z
scores were below 1 SD on the executive/processing speed and/or global domains of the robust sample were classified as
naMCI.
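The classification rules above can be read as a simple decision procedure. The sketch below is one interpretation under stated assumptions: "considerably lower (>1 SD)" is treated as a domain Z score below -1.0, any memory impairment leads to aMCI regardless of other domains, and participants who do not meet the MCI criteria are labeled "not MCI" without separating HC from SCD, since that step depends on the complaint cut point. The function name and labels are illustrative.

```python
CUTOFF = -1.0  # assumed threshold: more than 1 SD below the robust-sample mean


def classify_mci(memory_z, executive_z, global_z, has_complaint):
    """Return 'aMCI', 'naMCI', or 'not MCI' from three domain Z scores
    and whether at least one cognitive complaint was endorsed."""
    memory_low = memory_z < CUTOFF
    other_low = executive_z < CUTOFF or global_z < CUTOFF

    # MCI requires both an objective deficit and a cognitive complaint.
    if not has_complaint or not (memory_low or other_low):
        return "not MCI"

    # Memory impairment, alone or with other domains, is treated as amnestic.
    if memory_low:
        return "aMCI"

    # Executive/processing speed and/or global deficit without memory deficit.
    return "naMCI"


print(classify_mci(-1.5, 0.2, 0.1, True))    # memory only
print(classify_mci(-1.5, -1.2, 0.1, True))   # memory plus executive
print(classify_mci(0.3, -1.3, 0.0, True))    # executive only
print(classify_mci(0.3, -1.3, 0.0, False))   # deficit but no complaint
```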
SCD was classified in cognitively intact participants (i.e., cognitive factor Z scores for all three domains did not fall
considerably lower [>1 SD] than the mean of the robust sample) who exceeded an optimal cut point for self- and/or informant
complaints. We used cognitive complaints items from previous research (Rabin et al., 2012) to derive scores that were the
APPENDIX B
Before presentation of each learning trial, the respondent’s attention should be fixed at the point where the Recall Stimulus
Booklet will be positioned. Then say:
I will show you a sheet that has six geometric figures on it. I want you to study the figures so that you can remember as
many of them as possible. You will have just 10 seconds to study the entire display. I will present the figures right here (place
hand at eye level approximately 16 inches in front of respondent). After I take the display away, try to draw each figure exactly
as it appeared and in its correct location on the page.
Repeat instructions and clarify as often as necessary. Then say:
Before we begin the task, I have a few questions for you. How many of the six geometric figures do you think you will
recall after they are displayed for a total of 10 seconds?
Record response in the upper right-hand corner of the response sheet for (Trial 1).
Then say: How many of the six geometric figures do you think you will recall after they are displayed a second time for a
total of 10 seconds? (Clarify if necessary).
Record response in the upper right-hand corner of the response sheet for (Trial 2).
Then say: How many of the six geometric figures do you think you will recall after they are displayed a third time for a
total of 10 seconds? (Clarify if necessary).
Record response in the upper right-hand corner of the response sheet for (Trial 3).
Then say: How many of the six geometric figures do you think you will recall after a 25 minute delay period where you
are performing other tasks? (Clarify if necessary).
Record response in the upper right-hand corner of the response sheet for (DR Trial).
Delayed Recall
Delay should consist of predominantly questionnaires and verbal tasks. After 25 min, position the response sheet for the
Delayed Recall Trial and say:
Remember the figures I showed you before? I want to see how many you can remember now. I know it sounds difficult, but
try to draw as many of the figures as you can in their correct location on the page. Remember, try to draw them accurately.
Just do the best you can.
After the respondent indicates being finished drawing, remove the Response Form. Record the time, determine the delay
interval in minutes, and also record this number in the appropriate location.
Recognition Trial
Immediately after the Delayed Recall Trial, position the Recognition Booklet in front of the respondent with the card indicating
the form of the test and instructions visible to the administrator. Then say:
Now I will show you some more figures, one at a time. Some were on the display I showed you before and others are new
figures you have not seen before. Say “yes” for those figures that were on the display and say “no” if I show you a figure that
was not on the display. Do you understand?