UNSW Face Test: A Screening Tool For Super-Recognizers
James D. Dunn1, Stephanie Summersby1, Alice Towler1, Josh P. Davis2, and David White1
1 School of Psychology, UNSW Sydney, Australia
2 Department of Psychology, University of Greenwich, United Kingdom
Author Note
Corresponding Author: James D. Dunn, School of Psychology, UNSW Sydney, NSW 2052
Australia. Email: [email protected]
Acknowledgements
Thanks to Daniel Noble and Natalija Pavleski for assistance with stimuli selection and
preparation, to Christel Macdonald, Daniel Guilbert, Albert Lin, and Monika Durova for
assistance with data collection, and to Richard Kemp for his thoughtful commentary.
UNSW FACE TEST
Abstract
We present a new test – the UNSW Face Test (www.unswfacetest.com) – that has been
specifically designed to screen for super-recognizers in large online cohorts and is available
free for scientific use. Super-recognizers are people who demonstrate sustained performance
in the very top percentiles in tests of face identification ability. Because they represent a
small proportion of the population, screening large online cohorts is an important tool for
their initial recruitment, before completing confirmatory testing via standardized measures
and more detailed cognitive testing. We provide normative data on the test from three cohorts
tested via the internet (combined n = 23,902) and two cohorts tested in our lab
(combined n = 182). The UNSW Face Test: (i) captures both identification memory and
perceptual matching, as confirmed by correlations with existing tests of these abilities; (ii)
captures face-specific perceptual and memorial abilities, as confirmed by non-significant
correlations with non-face object processing tasks; (iii) enables researchers to apply stricter
selection criteria than other available tests, which boosts the average accuracy of the
individuals selected in subsequent testing. Together, these properties make the test uniquely
suited to screening for super-recognizers in large online cohorts.
People show a surprising degree of variation in their ability to identify faces, ranging from
chance-level to perfect accuracy. These individual differences are stable over repeated testing
(e.g. Balsdon et al., 2018), generalise from one face identification task to another (e.g.
Wilhelm et al., 2010), and represent a domain-specific cognitive skill that is dissociable from
general intelligence (e.g. Gignac et al., 2016) and visual object processing ability (e.g.
Richler et al., 2017). Moreover, twin studies reveal this ability is highly heritable (Shakeshaft
& Plomin, 2015; Wilmer et al., 2010). Together, this evidence indicates that face
identification ability is a stable cognitive trait with a biological basis, which means it can be
reliably measured.
Face identification ability is normally distributed, and people at the very top end – ‘super-
recognizers’ – demonstrate extraordinary innate abilities (Noyes et al., 2017; Russell et al.,
2009). Super-recognizers could, therefore, make substantial contributions to the theoretical
understanding of face identification, by helping to pinpoint the cognitive, perceptual, and
neural mechanisms underlying accurate identification (Bobak et al., 2017; McCaffery et al.,
2018; Tardif et al., 2018). They can also make important practical contributions by working
in applied face identification roles to reduce error-rates in law enforcement (Robertson et al.,
2016), criminal trials (Davis et al., 2019; Edmond & Wortley, 2016), and security-critical
identity management tasks (Balsdon et al., 2018; Bobak, Dowsett & Bate, 2016; White et al.,
2015; see Ramon et al., 2019).
However, finding super-recognizers is difficult because they make up just 2-3% of the
general population. Most studies rely on testing small groups of people (typically fewer than
10) who present to researchers with anecdotal claims of superior ability in response to
participant recruitment adverts (e.g. Russell et al., 2009; Bobak, Hancock, & Bate, 2016), or
who were recruited to form specialist teams based on standardised tests (e.g. Davis et al.,
2016; Robertson et al., 2016). However, small sample sizes limit the statistical power of
comparisons between super-recognizers and normative samples. Additionally, identifying
super-recognizers based on self-report alone is unreliable (Bate & Dudfield, 2019; Bobak,
Pampoulov, et al., 2016) and better suited for detecting deficits in face identification ability
(e.g. PI-20; Shah et al., 2015). This, coupled with the varying patterns of performance shown
by different super-recognizers across different face processing tasks (Bobak, Bennetts, et al.,
2016; Noyes et al., 2017), limits the generalisability of research findings in this field.
A more promising method of finding super-recognizers is to administer standardised
cognitive tests of face identification ability and map people’s performance to a normative
population (Bate et al., 2018; Belanova et al., 2018; Bobak, Pampoulov, et al., 2016). The
two most widely used tests are the Cambridge Face Memory Test (CFMT/CFMT+; Duchaine &
Nakayama, 2006; Russell et al., 2009), and the Glasgow Face Matching Test (GFMT, Burton
et al., 2010). Both have been used in academic research (Bate et al., 2018; Belanova et al.,
2018; Bobak, Pampoulov, et al., 2016) and in professional recruitment processes (Davis et
al., 2016; Robertson et al., 2016; White et al., 2015). However, existing standardised tests of
face identification ability are unsuitable for online screening. The CFMT and GFMT are
carefully calibrated psychometric tests intended to be reliable measures of a person's ability,
and unsupervised online administration, where participants can attempt a test repeatedly,
undermines that purpose.
Here we present the UNSW Face Test, an online screening tool for super-recognizers. This
test has been designed specifically as the first step in a procedure to reliably identify super-
recognizers. We propose that mass online testing is the ideal solution to identify super-
recognizers in the population. The ability to test large groups of people remotely would also
allow organisations to identify super-recognizers among their staff to bolster the
organisation’s face identification capability. Further, any single test provides an unreliable
indication of face identification ability. Therefore, we propose that researchers and employers
verify the super-recognizer status of those who score highly on the UNSW Face Test in
controlled conditions using existing standardised tests, such as the CFMT and GFMT.
Three main properties of the UNSW Face Test distinguish it from existing tests and make it
ideally suited to identifying super-recognizers. First, the UNSW Face Test is very
challenging. Unlike common practice in standardised psychometric testing (Wilmer et al.,
2012), we intentionally did not calibrate the test so that mean accuracy is centred on the
midpoint of the scale. This is an important departure from existing tests of face
identification ability which were not designed to discriminate between the highest levels of
performance. As a result, super-recognizers typically achieve ceiling or near-ceiling accuracy
on existing standardised tests (for example see McCaffery et al., 2018; Russell et al., 2009;
Robertson et al., 2016). Shifting the mean towards the lower end of the scale enables more
precise stratification of abilities in the upper tail of the test score distribution. This allows
strict thresholds to be applied at the screening phase so that even if someone scores lower on
subsequent confirmatory tests because of regression to the mean effects, they are unlikely to
drop so far that they will not meet super-recognizer criteria on those tests.
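To make this staged-threshold logic concrete, here is a minimal Python sketch; the normative scores, cut-off values, and function names below are illustrative assumptions, not values prescribed by the test:

```python
import statistics

# Illustrative normative scores (percent correct); real norms come from
# a normative sample such as the one reported in this paper.
NORM = [52, 55, 58, 60, 61, 63, 65, 68, 70, 72]
SCREEN_Z = 3.0   # stricter cut-off applied at the screening phase
CONFIRM_Z = 2.0  # conventional super-recognizer criterion (2 SD above mean)

def z_score(score, norm_scores):
    """Standardize a test score against a normative sample."""
    mu = statistics.mean(norm_scores)
    sd = statistics.stdev(norm_scores)
    return (score - mu) / sd

def passes_screening(score):
    return z_score(score, NORM) >= SCREEN_Z

def meets_confirmation(score):
    return z_score(score, NORM) >= CONFIRM_Z
```

Setting the screening cut-off above the confirmation criterion builds in a buffer: a candidate whose confirmatory score regresses toward the mean by up to 1 SD still meets the conventional criterion.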
Second, we designed the UNSW Face Test so that it captures people with a general ability to
identify faces, across memory and matching tasks. The CFMT is designed to test face
recognition memory, and the GFMT is designed to test face matching ability. These abilities
are employed to greater or lesser degrees in different professional tasks that super-recognizers
have been recruited to perform. For example, in CCTV surveillance, super-recognizers
monitor footage for faces they have committed to memory (e.g. Davis et al., 2019), whereas
passport officers match photo-ID to unfamiliar travellers. While these abilities may be
dissociable to a limited extent (e.g. Bate et al., 2018; 2019; Wilhelm et al., 2010), the high
correlation between them suggests there is substantial overlap in these two abilities
(McCaffery et al., 2018; Robertson et al., 2017; Verhallen et al., 2017). Identifying people
with a general face identification ability, therefore, allows researchers to follow up initial
screening with more detailed profiling of participants’ abilities, tailored to the specific
identification task of interest (see Ramon et al., 2019).
Third, images in the UNSW Face Test capture natural ‘ambient’ variability in appearance —
caused by changes in age, pose, lighting and expression. The GFMT and CFMT use highly
standardised images captured under optimal studio conditions in a single session. These
types of images do not reflect the challenge of real-world face identification (see Jenkins et
al., 2011). Using ambient images to test face identification ability, therefore, more closely
approximates real-world face identification tasks, and makes the task challenging without
having to resort to artificial degradation of test images, such as in the CFMT+.
In this paper, we describe the development and validation of the UNSW Face Test. We find
that the UNSW Face Test is a valid and reliable test that is uniquely suited to screening for
super-recognizers.
Test development
Test delivery
The UNSW Face Test is free for use in research and can be completed at
www.unswfacetest.com. Unique weblinks can be created for researchers and organizations to
screen their own participants. Those interested in using the test should complete the following
web form (http://www.unswfacetest.com/request.html). A package containing jsPsych
functions (de Leeuw, 2015), experiment scripts and images can also be provided on request.
This enables researchers to create their own versions of the test using image datasets
collected using the protocol described below. This may be desirable if researchers wish to
target super-recognizers in a particular demographic (for example see Supplementary
Materials for analysis of accuracy by participant ethnicity; see also Wan et al., 2017).
Figure 1. The UNSW Face Test contains two tasks. Left: In the recognition memory task
participants study studio-quality target faces for 5 seconds each (Study Phase), and then
make old/new recognition judgments on ambient test faces (Test Phase). Right: In the match-
to-sample sorting task participants memorize a studio-quality target face for 5 seconds and
then sort 4 ambient test images according to whether they are the target face. Scores on each
task are summed for a maximum score of 120 and then expressed as a percentage.
Recognition memory task. In this task, participants complete a standard old/new recognition
memory paradigm. In the study phase, participants memorize 20 studio-quality target faces,
shown for 5 seconds each in random order. In the test phase, participants see 20 ambient
images of the targets randomly intermixed with 20 ambient foil faces and decide whether
each face appeared or did not appear in the study phase. Participants make 40 decisions,
giving a maximum score of 40. This task takes approximately 5 minutes.
Match-to-sample sorting task. This task combines immediate face memory, perceptual
matching and pile sorting (see Jenkins et al., 2011), and is designed to model ‘identity
clustering’, a common task in criminal investigations where police determine which of
multiple images shows the person of interest. On each trial, participants memorise a studio-
quality target face for 5 seconds. Next, participants sort a ‘pile’ of four ambient images by
dragging an image to the right if it shows the target or to the left if it does not. Participants
are told the pile could contain between 0 and 4 images of the target. The remaining images
are of the target’s foil. Participants complete two practice trials, followed by 20 trials in a
fixed order. Four of these trials contain 0 images of the target, four contain 1 image of the
target, and so on. Participants make 4 decisions on each trial, giving a maximum
score of 80. This task takes approximately 8 minutes.
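The scoring rule described above (correct decisions summed across the two tasks out of a possible 120, then expressed as a percentage) can be sketched as follows; the function name is ours, not part of the published test:

```python
def unsw_style_score(memory_correct, sort_correct):
    """Combine sub-task scores: up to 40 correct old/new decisions plus
    up to 80 correct sorting decisions, expressed as a percentage of 120."""
    assert 0 <= memory_correct <= 40 and 0 <= sort_correct <= 80
    return 100 * (memory_correct + sort_correct) / 120
```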
First, we established normative accuracy on the test using an online sample of 290
participants from Amazon’s Mechanical Turk (see Buhrmester et al., 2011). Next, because
the UNSW Face Test is designed to identify super-recognizers within large online cohorts,
we recruited two large online samples by targeting high performers on the GFMT and
CFMT+. Online Sample 1 consists of 22,776 people who completed the UNSW Face Test via
www.unswfacetest.com between September 7, 2017, and August 23, 2019, after following
links in various news media (e.g. Towler et al., 2019). Online Sample 2 consists of 836
people who completed the UNSW Face Test, CFMT+ and GFMT between March 7, 2018,
and August 1, 2019, after responding to an online advertisement by The University of
Greenwich. We used these online samples to (i) calculate accuracy on the UNSW Face Test
for this target group, (ii) confirm the test is difficult enough not to suffer from ceiling effects,
(iii) compare the test’s effectiveness as a screening tool to existing tests, and (iv) examine the
effect of participant demographics on accuracy.
Finally, we recruited two lab-based samples of university students from UNSW to establish
the fundamental psychometric properties of the test. Lab Sample 1 consists of 80 participants
who completed the UNSW Face Test, GFMT and CFMT, and then returned one week later to
repeat the UNSW Face Test. We used this sample to assess (i) test-retest reliability, which is
important to confirm the test will be a useful screening tool, and (ii) convergent validity,
which confirms the test measures skill in face identification. Lab Sample 2 consists of 102
participants who completed the UNSW Face Test, CFMT+, and non-face tasks: the
Cambridge Car Memory Test (Dennett et al., 2012), and the Matching Familiar Figures Test
(Kagan, 1966). We used this sample to assess discriminant validity, which confirms the test
measures face identification skills rather than domain-general object processing skills.
Participants in both lab samples also completed an ethnicity questionnaire.
Figure 3. Comparison of the normative, lab and online samples on UNSW Face Test scores.
The central dotted line indicates the mean, and the lower and upper dotted lines indicate the
25th and 75th percentiles, respectively.
Figure 4. Left: Accuracy distribution of Online Sample 1 (top) and Online Sample 2 (bottom)
compared to the normative accuracy distribution (black line). Right: The portion of each
distribution above the super-recognizer threshold (2 SD above the mean). The long tail of the
distribution shows that the UNSW Face Test is sensitive to differences in performance up to 6
SD above the mean.
Figure 5. Violin plots showing how the distribution of performance on each of the online tests
varies as a function of the screening criteria used to select individuals based on their UNSW
Face Test score (top row), CFMT+ score (middle row) and GFMT score (bottom row). Boxes
on the right show the number of participants in Online Sample 2 represented in each
distribution. These data show that the ability to set stricter screening criteria on the UNSW
Face Test provides greater precision for targeting high-performing individuals for follow-up
testing than the CFMT+ or GFMT.
The top row of Figure 5 shows that the distribution of scores on the CFMT+ and GFMT
improves as stricter criteria are applied on the UNSW Face Test. This same pattern is not
evident when using the CFMT+ or GFMT for screening. For the CFMT+, applying
progressively stricter criteria does not select groups that perform progressively better on the
other tests (Figure 5, middle row). For the GFMT, applying the strictest limit possible
provides only moderate benefits (Figure 5, bottom row). These results show that the ability to
set stricter screening criteria using the UNSW Face Test, compared to existing tests, provides
researchers with an enhanced ability to target high performing people for follow-up testing.
Figure 6 shows the correlations between the three tests used to perform the analysis shown in
Figure 5. Visual inspection of these figures suggests that the enhanced ability of the UNSW
Face Test to screen for super-recognizers is due to the reduced frequency of ceiling level
performance relative to the other tests. Ceiling effects in these tests are likely caused by the
recruitment methods that explicitly targeted higher performers, which is consistent with the
superior accuracy we observe in our online samples relative to lab-based samples.
Figure 6. Correlations between performance on the UNSW Face Test, CFMT+ and GFMT
for online sample 2, showing that the UNSW Face Test does not suffer from ceiling effects,
unlike existing tests. ** Significant at .01 level
Test-retest reliability
To establish test-retest reliability of the UNSW Face Test, 80 participants in Lab Sample 1
completed the UNSW Face Test twice, one week apart (see Method for full details). Their
scores at each time point are plotted in Figure 7. Test-retest reliability was r(78) = 0.592 (p <
.001, CI95 [0.428, 0.718]): relatively high in the context of psychometric tests more
broadly, but slightly lower than the test-retest reliabilities of some existing face identification
tests, including the CFMT (r = 0.70; Wilmer et al., 2010) and Kent Face Matching Test (r =
0.67; Fysh & Bindemann, 2018). This might be attributable to the fact that accuracy on the
UNSW Face Test is not calibrated to the midpoint of the measurement scale, as we aimed to
produce a challenging test with a greater resolution at the upper tail of the distribution.
Nonetheless, as demonstrated in Figure 5, the test is very effective at identifying high
performers on subsequent tests.
Figure 7. Test-retest reliability on the UNSW Face Test after a one-week delay. **
Significant at .01 level.
We also note that repeating the UNSW Face Test significantly improved accuracy from
60.2% (SD = 5.7%) at Time 1 to 62.1% (SD = 5.8%) at Time 2, t(79) = 3.30, p < .001, CI95 =
[0.76%, 3.07%]. This 1.9% improvement equates to 0.33 SDs, meaning that practice on this
test yields only modest gains that are unlikely to invalidate estimates of face
identification ability.
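As a sketch of how an effect of this size is computed (assuming the mean improvement is scaled by the average of the two sessions' standard deviations; the data below are hypothetical):

```python
import statistics

def practice_effect_d(time1, time2):
    """Within-subject practice effect in SD units: mean Time 2 minus
    Time 1 improvement divided by the average of the session SDs."""
    diff = statistics.mean(time2) - statistics.mean(time1)
    pooled_sd = (statistics.stdev(time1) + statistics.stdev(time2)) / 2
    return diff / pooled_sd
```

With the values reported here (a 1.9% improvement against SDs of 5.7% and 5.8%), this gives 1.9 / 5.75 ≈ 0.33.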
Convergent validity
Next, we sought to establish convergent validity. The 80 participants in Lab Sample 1 also
completed the CFMT and GFMT at Time 1 (see Method for full details). Accuracy scores
and correlations between the tests are shown in Table 2. Accuracy on the UNSW Face Test is
more strongly correlated with the CFMT than with the GFMT, indicating that the test is more
closely aligned with face memory tasks like the CFMT than with perceptual face matching
tasks like the GFMT. These correlations are consistent with previous reports of an association
between standardized tests of face identification ability (Bate et al., 2018; Burton et al., 2010;
Fysh & Bindemann, 2018; McCaffery et al., 2018; McKone et al., 2011; Wilmer et al., 2012)
and provide evidence of high convergent validity.
Table 2. Reliable correlations between the UNSW Face Test and the CFMT and GFMT for
Lab Sample 1 demonstrate high convergent validity.
Discriminant validity
To establish discriminant validity of the UNSW Face Test, 102 participants in Lab Sample 2
completed the UNSW Face Test, the CFMT+, and two non-face tasks: the Cambridge Car
Memory Test, and the Matching Familiar Figures Test (see Method for full details). Their
accuracy scores are shown in Table 3, along with correlation coefficients between each of the
tests. Consistent with the results of the previous analysis, we find evidence of convergent
validity: the UNSW Face Test is significantly positively correlated with the CFMT+, r(100)
= .306, p = .002, CI95 [0.119, 0.472]. More important for this analysis, we find evidence of
discriminant validity: the UNSW Face Test does not correlate significantly with performance
on either the Cambridge Car Memory Test, r(100) = .034, p = .738, CI95 [-0.162, 0.227], or
the Matching Familiar Figures Test, r(100) = .142, p = .154, CI95 [-0.054, 0.328]. This pattern
confirms that the UNSW Face Test has discriminant validity and is measuring domain-
specific face identification abilities.
Table 3. Mean accuracy and correlation matrix for all tests in Lab Sample 2, demonstrating
discriminant validity.
Figure 8 shows accuracy on the UNSW Face Test as a function of age. Previous research
shows face identification ability increases markedly from childhood to adulthood – peaking at
around age 31 – before slowly declining with further aging (Germine et al., 2011; see also
Carey et al., 1980). Visual inspection of test accuracy in Figure 8 shows a strikingly similar
pattern. We estimated peak accuracy and its standard error by fitting a quadratic
function to the logarithm of age, using a bootstrap procedure that resampled from the data
with replacement 200 times. The estimated age of peak accuracy for
the UNSW Face Test is 30.7 years (SE = 0.2), which is remarkably close to the 31.4 years
(SE = 0.5) for the CFMT reported by Germine et al. (2011).
Figure 8. Average accuracy for each participant age on the UNSW Face Test separated for
overall (left), memory task (middle) and sorting task (right). Size and shade of each data
point show the number of participants in that age group.
Visual inspection of Figure 8 suggests that accuracy on the memory sub-task is modulated
more by participant age than is the sorting sub-task. This is apparent in a greater
increase in accuracy from childhood to adulthood (Memory increase = 9.3%, SE
= 0.4; Sorting increase = 7.7%, SE = 0.3), an earlier peak (Memory = 28.6 years, SE = 0.2;
Sorting = 32.5 years, SE = 0.4), and a more marked decline of accuracy after the peak
(Memory decrease = 8.0%, SE = 0.3; Sorting decrease = 4.0%, SE = 0.2). The divergence of
aging effects for memory and sorting tasks is consistent with previous work showing that face
identification accuracy is less sensitive to aging in face matching, which is akin to sorting,
compared to recognition memory tasks (Burton et al., 2010).
General Discussion
The UNSW Face Test is a challenging new online screening test designed to identify super-
recognizers. Unlike existing face identification tests, the UNSW Face Test is designed to be
administered online and delivered en masse to large cohorts of participants. Because super-
recognizers are rare, a mass screening tool enables researchers and organizations to identify
larger groups of super-recognizers for follow-up confirmatory testing than is possible with
existing tests. This will improve researchers’ ability to achieve the statistical power necessary
to investigate the cognitive, perceptual, and neural mechanisms supporting the highest levels
of accuracy in face identification tasks and allow organizations to bolster their face
identification capabilities using existing staff.
An important property of the UNSW Face Test is that it is extremely challenging. Despite
testing over 24,000 participants, no participant has achieved a perfect score at the time of
writing. It is, therefore, an open question as to whether the limits of human ability in face
identification tasks fall below the upper bound of the measurement scale used in this test.
Moreover, because the accuracy of super-recognizers falls below this upper bound, the test enables
researchers to discriminate between super-recognizers who achieve, for example, a score that
is 2 SD versus 4 SD above the mean. As we have shown here, this unique property of the
test makes it especially suited to screening large online cohorts because stricter recruitment
criteria translate to higher performance in participant groups.
The UNSW Face Test will enable researchers to adopt a more staged approach to recruiting
super-recognizers. A great deal of effort is required to create a standardized psychometric test
that is well-calibrated, reliable and provides a valid measure of ability (Wilmer et al., 2012).
And yet, many standardized tests are freely available online as initial screening tools, which
means participants can practice them repeatedly. This is problematic because it reduces the
legitimacy of these tests in scientific use and, perhaps more concerningly, in recruitment
for security and policing roles (Davis et al., 2016; Robertson et al., 2016; White et al.,
2015).
We propose a solution whereby an initial screening test is followed by a battery of other
standardized tests, held under stricter control, to verify an individual’s ability. We have used
the UNSW Face Test for precisely this purpose and have found it to be a very effective
recruitment tool. Because the people pictured in the test have agreed for their images to be
used, it can be linked to popular media content, and the interactive nature of the task is suited
to engaging consumers of popular media. Anecdotal accounts from super-recognizers and our
students suggest that they find the task enjoyable and are motivated to perform well. We
attribute this to the difficulty of the test, and its strong face validity – stemming from the fact
it was created using the type of images that people typically encounter on the internet.
Scientific understanding of superior ability in face identification is limited. Given that super-
recognizers are increasingly being deployed to perform important real-world tasks, there are
strong theoretical and practical motivations for researchers to rectify this in the years ahead
(Ramon et al., 2019). We hope the UNSW Face Test can support the initial recruitment phase
on which these research activities are based.
Method
Descriptions of tests
Glasgow Face Matching Test (GFMT). In the GFMT (Burton et al., 2010) participants
decide whether pairs of images show the same person or two different people. Participants
completed the short version of the GFMT, which contains 40 image pairs (20 match, 20 non-
match). GFMT images were captured minutes apart in studio conditions with different
cameras.
Cambridge Face Memory Test (CFMT). The CFMT (Duchaine & Nakayama, 2006) is a
standardized test of face memory. In this test, there are 3 blocks of 24 increasingly difficult
trials. Participants memorize 6 novel faces (study phase) and then attempt to identify them in
a three-person lineup (test phase). Images in block 1 are the same as those shown in the study
phase. Images in block 2 are novel images of the study faces captured in untrained views and
lighting conditions. Finally, images in block 3 are novel images that have been degraded with
visual noise.
Cambridge Face Memory Test – long form (CFMT+). The CFMT+ (Russell et al., 2009)
contains all trials from the original CFMT but also includes an additional, more difficult
block of 24 trials intended to prevent ceiling effects. In this block, participants learn and are
tested on novel images that contain more extensive visual noise and variability in the pose,
expression, and visible features.
Cambridge Car Memory Test (CCMT). The CCMT (Dennett et al., 2012) was created as a
measure of individual differences in object discrimination. Using the same trial structure as
the CFMT, this test provides a measure of object recognition ability that is independent of
face recognition.
Matching Familiar Figures Test (MFFT). The MFFT (Kagan, 1966) measures cognitive
style: impulsivity versus reflectivity. Participants decide whether a target drawing is
identical to one of six variants, or absent. Participants complete 20 trials.
Participant samples
Normative sample. We recruited an online sample of 321 US residents from Amazon's
Mechanical Turk (see Buhrmester et al., 2011), all of whom completed the UNSW Face
Test. Thirty-one participants were removed for scoring below 40% accuracy (48 out of 120),
for exclusively responding Match or Nonmatch in the Memory task, or for not sorting any
images on the Sort task. The final sample of 290 participants (114 males and 176 females,
mean age = 37.1 years, SD = 11.4 years) contained 212 Caucasians (73.1%), 25 Africans
(8.6%), 24 East Asians (8.3%), 15 Hispanics (5.2%), 5 people of mixed race (1.7%), 1
Middle Eastern (0.3%), and 8 other/not-specified (2.8%).
Online Sample 1. Participants in Online Sample 1 were 24,159 people who completed only
the UNSW Face Test via www.unswfacetest.com between September 7, 2017, and August
23, 2019. Online participants were excluded if any one of the following applied: 1) they
scored less than 40% overall on the test, 2) they responded exclusively match or nonmatch in
the Memory task, or 3) they did not move any images on a trial in the Sort task. Because this
is an online test, our sample likely contains multiple attempts by some participants. When we were able to
link multiple attempts to the same email address, we only included their first attempt. This
left a final sample of 22,776 participants (9,211 males, 13,277 females, and 288 other/not
specified, mean age = 36.7 years, SD = 13.9 years). The sample consisted of 17,241
Caucasians (75.7%), 2157 Asians (9.5%), 1,330 people of mixed race (5.8%), 892 other
(3.9%), 405 Pacific Islanders (1.8%), 220 Middle Eastern (1%), 196 Hispanics (0.9%), 178
Aboriginal Australians (0.8%), and 157 Africans (0.7%).
Online Sample 2. Participants in Online Sample 2 completed the CFMT+, GFMT and
UNSW Face Test online, in this fixed order. These participants volunteered to complete the
tests online by clicking an advertisement for research participation with The University of
Greenwich between March 7, 2018, and August 1, 2019. This sample consisted of 836
participants (355 males, 456 females, and 25 other/not specified, mean age = 34.6 years, SD
= 11.5 years) comprising 682 Caucasians (81.6%), 60 Asians (7.2%), 54 people of mixed
race (6.5%), 15 other (1.8%), 14 Hispanics (1.7%), 9 Africans (1.1%), 1 Pacific Islander
(0.1%) and 1 Middle Eastern person (0.1%).
Lab Sample 1. Participants in Lab Sample 1 completed two test sessions, one week apart. At
Time 1, participants completed the GFMT, CFMT and UNSW Face Test, in this fixed order.
At Time 2, participants completed the UNSW Face Test again. Due to time constraints, one
participant did not complete the GFMT but did complete the other tests, resulting in a final
sample of 79 for the GFMT and 80 for the remaining tests. Participants took between 45 and
60 minutes to complete the test battery at Time 1 and 15 to 20 minutes to complete Time 2.
sample consisted of 80 UNSW undergraduate students (21 males and 59 females, mean age =
19.3 years, SD = 3.0 years) who participated in exchange for course credit. There were 46
Asians (57.5%), 23 Caucasians (28.8%), 6 people of mixed race (7.5%), 3 others (3.8%), and
2 people of Middle Eastern descent (2.5%).
Lab Sample 2. Lab Sample 2 completed the following test battery in a counterbalanced order:
the UNSW Face Test, CFMT+, CCMT, and the MFFT. Participants took between 75 and 90
minutes to complete the test battery. This sample consisted of 102 UNSW undergraduate
students (27 males and 75 females, mean age = 18.9 years, SD = 1.6 years) who participated
in exchange for course credit. There were 62 Asians (60.8%), 31 Caucasians (30.4%), 3
people of mixed race (2.9%), 2 Africans (2%), 2 Aboriginal Australians (2%), and 2 Middle
Eastern people (2%).
References
Balsdon, T., Summersby, S., Kemp, R. I., & White, D. (2018). Improving face identification
with specialist teams. Cognitive Research: Principles and Implications, 3(1).
https://doi.org/10.1186/s41235-018-0114-7
Bate, S., & Dudfield, G. (2019). Subjective assessment for super recognition: An evaluation
of self-report methods in civilian and police participants. PeerJ, 7, e6330.
https://doi.org/10.7717/peerj.6330
Bate, S., Frowd, C. D., Bennetts, R., Hasshim, N., Murray, E., Bobak, A. K., Wills, H., &
Richards, S. (2018). Applied screening tests for the detection of superior face
recognition. Cognitive Research: Principles and Implications, 3, 1-19.
https://doi.org/10.1186/s41235-018-0116-5
Bate, S., Frowd, C. D., Bennetts, R., Hasshim, N., Portch, E., Murray, E., & Dudfield, G.
(2019). The consistency of superior face recognition skills in police officers. Applied
Cognitive Psychology. https://doi.org/10.1002/acp.3525
Belanova, E., Davis, J. P., & Thompson, T. (2018). Cognitive and neural markers of super-
recognisers’ face processing superiority and enhanced cross-age effect. Cortex, 108,
92-111. https://doi.org/10.1016/j.cortex.2018.07.008
Bobak, A. K., Bennetts, R. J., Parris, B. A., Jansari, A., & Bate, S. (2016). An in-depth
cognitive examination of individuals with superior face recognition skills. Cortex, 82,
48-62. https://doi.org/10.1016/j.cortex.2016.05.003
Bobak, A. K., Dowsett, A. J., & Bate, S. (2016). Solving the border control problem:
Evidence of enhanced face matching in individuals with extraordinary face recognition
skills. PLoS One, 11(2), e0148148. https://doi.org/10.1371/journal.pone.0148148
Bobak, A. K., Hancock, P. J. B., & Bate, S. (2016). Super-recognisers in action: Evidence
from face-matching and face memory tasks. Applied Cognitive Psychology, 30(1),
81-91. https://doi.org/10.1002/acp.3170
Bobak, A. K., Mileva, V. R., & Hancock, P. J. (2018). Facing the facts: Naive participants
have only moderate insight into their face recognition and face perception abilities.
Quarterly Journal of Experimental Psychology, 1747021818776145.
https://doi.org/10.1177/1747021818776145
Bobak, A. K., Pampoulov, P., & Bate, S. (2016). Detecting superior face recognition skills
in a large sample of young British adults. Frontiers in Psychology, 7, 1378.
https://doi.org/10.3389/fpsyg.2016.01378
Bobak, A. K., Parris, B. A., Gregory, N. J., Bennetts, R. J., & Bate, S. (2017). Eye-movement
strategies in developmental prosopagnosia and "super" face recognition. Quarterly
Journal of Experimental Psychology, 70(2), 201-217.
https://doi.org/10.1080/17470218.2016.1161059
Bowles, D. C., McKone, E., Dawel, A., Duchaine, B., Palermo, R., Schmalzl, L., Rivolta, D.,
Wilson, C. E., & Yovel, G. (2009). Diagnosing prosopagnosia: Effects of ageing, sex,
and participant-stimulus ethnic match on the Cambridge Face Memory Test and
Cambridge Face Perception Test. Cognitive Neuropsychology, 26(5), 423-455.
https://doi.org/10.1080/02643290903343149
Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon's Mechanical Turk: A new
source of inexpensive, yet high-quality data? Perspectives on Psychological Science,
6(1), 3-5. https://doi.org/10.1177/1745691610393980
Burton, A. M., White, D., & McNeill, A. (2010). The Glasgow Face Matching Test.
Behavior Research Methods, 42(1), 286-291. https://doi.org/10.3758/BRM.42.1.286
Carey, S., Diamond, R., & Woods, B. (1980). Development of face recognition – a
maturational component? Developmental Psychology, 16(4), 257-269.
Davis, J. P., Lander, K., Evans, R., & Jansari, A. (2016). Investigating predictors of superior
face recognition ability in police super-recognisers. Applied Cognitive Psychology,
30(6), 827-840. https://doi.org/10.1002/acp.3260
Davis, J. P., Maigut, A., & Forrest, C. (2019). The wisdom of the crowd: A case of post- to
ante-mortem face matching by police super-recognisers. Forensic Science
International. https://doi.org/10.1016/j.forsciint.2019.109910
de Leeuw, J. R. (2015). jsPsych: A JavaScript library for creating behavioral experiments in
a web browser. Behavior Research Methods, 47(1), 1-12.
https://doi.org/10.3758/s13428-014-0458-y
Dennett, H. W., McKone, E., Tavashmi, R., Hall, A., Pidcock, M., Edwards, M., & Duchaine,
B. (2012). The Cambridge Car Memory Test: A task matched in format to the
Cambridge Face Memory Test, with norms, reliability, sex differences, dissociations
from face memory, and expertise effects. Behavior Research Methods, 44(2), 587-605.
https://doi.org/10.3758/s13428-011-0160-2
Duchaine, B., & Nakayama, K. (2006). The Cambridge Face Memory Test: Results for
neurologically intact individuals and an investigation of its validity using inverted
face stimuli and prosopagnosic participants. Neuropsychologia, 44(4), 576-585.
https://doi.org/10.1016/j.neuropsychologia.2005.07.001
Edmond, G., & Wortley, N. (2016). Interpreting image evidence: Facial mapping, police
familiars and super-recognisers in England and Australia. Journal of International
and Comparative Law, 3, 473-522.
Fysh, M. C., & Bindemann, M. (2018). The Kent Face Matching Test. British Journal of
Psychology, 109(2), 219-231. https://doi.org/10.1111/bjop.12260
Germine, L. T., Duchaine, B., & Nakayama, K. (2011). Where cognitive development and
aging meet: Face learning ability peaks after age 30. Cognition, 118(2), 201-210.
https://doi.org/10.1016/j.cognition.2010.11.002
Gignac, G. E., Shankaralingam, M., Walker, K., & Kilpatrick, P. (2016). Short-term memory
for faces relates to general intelligence moderately. Intelligence, 57, 96-104.
https://doi.org/10.1016/j.intell.2016.05.001
Jenkins, R., White, D., Van Montfort, X., & Burton, A. M. (2011). Variability in photos of
the same face. Cognition, 121(3), 313-323.
https://doi.org/10.1016/j.cognition.2011.08.001
Kagan, J. (1966). Reflection-impulsivity: The generality and dynamics of conceptual tempo.
Journal of Abnormal Psychology, 71(1), 17-24.
McCaffery, J. M., Robertson, D. J., Young, A. W., & Burton, A. M. (2018). Individual
differences in face identity processing. Cognitive Research: Principles and
Implications, 3(1). https://doi.org/10.1186/s41235-018-0112-9
McKone, E., Hall, A., Pidcock, M., Palermo, R., Wilkinson, R. B., Rivolta, D., Yovel, G.,
Davis, J. M., & O'Connor, K. B. (2011). Face ethnicity and measurement reliability
affect face recognition performance in developmental prosopagnosia: Evidence from
the Cambridge Face Memory Test-Australian. Cognitive Neuropsychology, 28(2),
109-146. https://doi.org/10.1080/02643294.2011.616880
Noyes, E., Phillips, P., & O'Toole, A. (2017). What is a super-recogniser? In M. Bindemann
& A. M. Megreya (Eds.), Face processing: Systems, disorders and cultural
differences (pp. 173-201). Nova Science Publishers.
Palermo, R., Rossion, B., Rhodes, G., Laguesse, R., Tez, T., Hall, B., Albonico, A.,
Malaspina, M., Daini, R., Irons, J., Al-Janabi, S., Taylor, L. C., Rivolta, D., &
McKone, E. (2016). Do people have insight into their face recognition abilities?
Quarterly Journal of Experimental Psychology, 1-33.
https://doi.org/10.1080/17470218.2016.1161058
Ramon, M., Bobak, A. K., & White, D. (2019). Super-recognizers: From the lab to the world
and back again. British Journal of Psychology, 110(3), 461-479.
https://doi.org/10.1111/bjop.12368
Richler, J. J., Wilmer, J. B., & Gauthier, I. (2017). General object recognition is specific:
Evidence from novel and familiar objects. Cognition, 166, 42-55.
https://doi.org/10.1016/j.cognition.2017.05.019
Robertson, D. J., Jenkins, R., & Burton, A. M. (2017). Face detection dissociates from face
identification. Visual Cognition, 25(7-8), 740-748.
https://doi.org/10.1080/13506285.2017.1327465
Robertson, D. J., Noyes, E., Dowsett, A. J., Jenkins, R., & Burton, A. M. (2016). Face
recognition by Metropolitan Police super-recognisers. PLoS One, 11(2), e0150036.
https://doi.org/10.1371/journal.pone.0150036
Russell, R., Duchaine, B., & Nakayama, K. (2009). Super-recognizers: People with
extraordinary face recognition ability. Psychonomic Bulletin & Review, 16(2),
252-257. https://doi.org/10.3758/PBR.16.2.252
Shah, P., Gaule, A., Sowden, S., Bird, G., & Cook, R. (2015). The 20-item prosopagnosia
index (PI20): A self-report instrument for identifying developmental prosopagnosia.
Royal Society Open Science, 2(6), 140343. https://doi.org/10.1098/rsos.140343
Shakeshaft, N. G., & Plomin, R. (2015). Genetic specificity of face recognition. Proceedings
of the National Academy of Sciences, 112(41), 12887-12892.
https://doi.org/10.1073/pnas.1421881112
Tardif, J., Morin Duchesne, X., Cohan, S., Royer, J., Blais, C., Fiset, D., Duchaine, B., &
Gosselin, F. (2018). Use of face information varies systematically from
developmental prosopagnosics to super-recognizers. Psychological Science,
956797618811338. https://doi.org/10.1177/0956797618811338
Towler, A., & White, D. (2019). Super-recognisers accurately pick out a face in a crowd –
but can this skill be taught? The Conversation. https://theconversation.com/super-
recognisers-accurately-pick-out-a-face-in-a-crowd-but-can-this-skill-be-taught-
112003
Verhallen, R. J., Bosten, J. M., Goodbourn, P. T., Lawrance-Owen, A. J., Bargary, G., &
Mollon, J. D. (2017). General and specific factors in the processing of faces. Vision
Research. https://doi.org/10.1016/j.visres.2016.12.014
Wan, L., Crookes, K., Dawel, A., Pidcock, M., Hall, A., & McKone, E. (2017). Face-blind
for other-race faces: Individual differences in other-race recognition impairments.
Journal of Experimental Psychology: General, 146(1), 102-122.
https://doi.org/10.1037/xge0000249
Weule, G. (2018). So, you think you're good at recognising faces. ABC Science.
https://www.abc.net.au/news/science/2018-03-11/super-face-recognisers-are-you-
one/9517772
White, D., Burton, A. L., & Kemp, R. I. (2016). Not looking yourself: The cost of
self-selecting photographs for identity verification. British Journal of Psychology,
107(2), 359-373. https://doi.org/10.1111/bjop.12141
White, D., Dunn, J. D., Schmid, A. C., & Kemp, R. I. (2015). Error rates in users of
automatic face recognition software. PLoS One, 10(10), e0139827.
https://doi.org/10.1371/journal.pone.0139827
Wilhelm, O., Herzmann, G., Kunina, O., Danthiir, V., Schacht, A., & Sommer, W. (2010).
Individual differences in perceiving and recognizing faces - One element of social
cognition. Journal of Personality and Social Psychology, 99(3), 530-548.
https://doi.org/10.1037/a0019972
Wilmer, J. B., Germine, L., Chabris, C. F., Chatterjee, G., Gerbasi, M., & Nakayama, K.
(2012). Capturing specific abilities as a window into human individuality: The
example of face recognition. Cognitive Neuropsychology, 29(5-6), 360-392.
https://doi.org/10.1080/02643294.2012.753433
Wilmer, J. B., Germine, L., Chabris, C. F., Chatterjee, G., Williams, M., Loken, E.,
Nakayama, K., & Duchaine, B. (2010). Human face recognition ability is specific and
highly heritable. Proceedings of the National Academy of Sciences, 107(11),
5238-5241. https://doi.org/10.1073/pnas.0913053107