Cerebral Cortex, 2023, 33, 10380–10400

https://doi.org/10.1093/cercor/bhad289
Advance access publication date 9 August 2023
Original Article

The language network is not engaged in object categorization

Yael Benn1,†,*, Anna A. Ivanova2,3,†, Oliver Clark1, Zachary Mineroff2,3, Chloe Seikus4, Jack Santos Silva4, Rosemary Varley4,‡, Evelina Fedorenko2,3,‡

Downloaded from https://academic.oup.com/cercor/article/33/19/10380/7239890 by guest on 13 October 2023


1 Department of Psychology, Manchester Metropolitan University, Manchester M15 6BH, United Kingdom,
2 Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA 02139, United States,
3 McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, United States,
4 Division of Psychology & Language Sciences, University College London, London WC1E 6BT, UK

*Corresponding author: Department of Psychology, Manchester Metropolitan University, Brooks Building, Birley Fields Campus, 53 Bonsall Street, Manchester M15
6GX, United Kingdom. Email: [email protected]
† Yael Benn and Anna A. Ivanova contributed equally
‡ Co-senior authors

The relationship between language and thought is the subject of long-standing debate. One claim states that language facilitates
categorization of objects based on a certain feature (e.g. color) through the use of category labels that reduce interference from other,
irrelevant features. Therefore, language impairment is expected to affect categorization of items grouped by a single feature (low-
dimensional categories, e.g. “Yellow Things”) more than categorization of items that share many features (high-dimensional categories,
e.g. “Animals”). To test this account, we conducted two behavioral studies with individuals with aphasia and an fMRI experiment with
healthy adults. The aphasia studies showed that selective low-dimensional categorization impairment was present in some, but not
all, individuals with severe anomia and was not characteristic of aphasia in general. fMRI results revealed little activity in language-
responsive brain regions during both low- and high-dimensional categorization; instead, categorization recruited the domain-general
multiple-demand network (involved in wide-ranging cognitive tasks). Combined, results demonstrate that the language system is not
implicated in object categorization. Instead, selective low-dimensional categorization impairment might be caused by damage to brain
regions responsible for cognitive control. Our work adds to the growing evidence of the dissociation between the language system and
many cognitive tasks in adults.

Key words: aphasia; categorization; fMRI; language.

Received: September 27, 2021. Revised: July 12, 2023. Accepted: July 13, 2023
© The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction
The role of language in mediating or augmenting thought is the subject of long-standing debate. According to one view, language is necessary for many cognitive functions, such as math, logic, and propositional thought (Darwin 1871; Dennett 1994; Bickerton 1995; Carruthers 2002; Bermúdez 2007; Baldo et al. 2010; Baldo et al. 2015, and others). However, a large body of evidence supports a different view: that language is cognitively and neurally independent from the rest of human cognition. This evidence includes the lack of activity in the language brain regions during non-linguistic tasks (Monti et al. 2009; Fedorenko et al. 2011; Monti et al. 2012; Amalric and Dehaene 2016; Amalric and Dehaene 2019; Ivanova et al. 2021), the retained ability of some individuals with aphasia to perform such tasks (e.g. Varley et al. 2005; Siegal and Varley 2006; Bek et al. 2013; Benn et al. 2013), and variability across cultures in the use of language resources during thought (Kim 2002). However, the role of language is still contested for one important aspect of human cognition: categorization.

Like other animals, humans can convert rich, multi-dimensional perceptual inputs into a latent lower-dimensional structured representation of the world. Grouping discriminable individual objects and events into classes allows us not only to decide whether some new object/event belongs to a particular category, but also to draw powerful inferences about shared properties from one category member to another (Mervis and Rosch 1981; Smith and Medin 1981; Wasserman et al. 1988; Smith and Heise 1992; Pearce 1994; Mareschal and Quinn 2001; Murphy 2002).

In contrast to other animals, humans additionally label individual categories with words, the core building blocks of a powerful communication system that allows us to share complex thoughts with one another. Even though categorization is a basic cognitive capacity that evolved long before language, evidence exists that word learning affects category learning in development (e.g. Gershkoff-Stowe et al. 1997; Sloutsky and Fisher 2004; Plunkett et al. 2008; Waxman and Gelman 2009; Ferguson and Waxman 2017) and, to some extent, in adulthood (Lupyan et al. 2007; Brojde et al. 2011; Lupyan and Casasanto 2015). Here, we ask the following: how does language affect the process of grouping objects into categories when the category boundaries are already known?

High-dimensional and low-dimensional categories
Before summarizing the key prior evidence, it is important to introduce a distinction that some have considered to be relevant to the question of whether language affects categorization.

Lupyan and colleagues (e.g. Lupyan 2009; Lupyan and Mirman 2013; Perry and Lupyan 2014) distinguish between “high-dimensional” (HD) categories, where members share many features, and “low-dimensional” (LD) categories, where members share one or a few features. HD categories typically correspond to established sets that reflect either the taxonomic (similarity-based) or relational/thematic (co-occurrence-based) structure of the world (Bain 1855; Mirman et al. 2017). Taxonomic HD categories can often be labeled by superordinate terms such as ANIMALS, FRUIT, or TOOLS. Relational HD categories correspond to common events/scenarios: for example, THINGS YOU TAKE ON A PICNIC or NON-FOOD THINGS FOUND IN THE KITCHEN. For such relational categories, the shared features have to do with typical co-occurrences (e.g. although a fridge and a spatula are quite different, they both co-occur with a large number of kitchen objects, like a stove, pots and pans, a kettle, etc.). In contrast to HD categories, LD categories are more likely to be novel groupings of items that often straddle taxonomic and relational boundaries, such as THINGS MADE OF WOOD or THINGS THAT ARE YELLOW (e.g. things made of wood may include a cupboard, a sledge, and a wooden spoon, and things that are yellow may include a lemon, a yellow hat, and a canary).

Similar distinctions have been made by others, in related literatures. For example, Barsalou (1983) distinguishes between “common” categories, which mirror the correlational structure of the environment, and “ad-hoc,” or “goal-derived,” categories, which are constructed for a specific goal and are thus often based on a small number of features. Kloos and Sloutsky (2008) and Sloutsky (2010) distinguish between “dense” and “sparse” categories based on the ratio of category-relevant variance to total variance. Members of statistically dense categories share many inter-correlated features that matter for category membership, and members of sparse categories have very few features in common, with many other features varying independently and being irrelevant for category membership. Couchman et al. (2010) contrast family-resemblance categorization, which relies on judgments of overall similarity, considering multiple features in tandem, and criterial-attribute categorization (or “rule-based categorization”), which requires adhering to a single-dimensional criterial attribute and suppressing all other, irrelevant dimensions (see also Ashby and O’Brien 2005). Langland et al. (2021) relate the HD/LD distinction to the concrete/abstract distinction, arguing that items in concrete categories have many shared features, whereas identifying items from an abstract category requires generalizing over many irrelevant properties to identify a small set of commonalities. In this work, we use the HD/LD category distinction proposed by Lupyan et al. (although see the discussion for criticisms of that distinction).

The LD-specific language recruitment hypothesis
One claim that emerged in the literature in recent years is that language plays a special role in LD categorization (Lupyan 2009; Lupyan 2012; Lupyan and Mirman 2013). The argument goes as follows: during LD categorization, only one to two features are relevant to the task, whereas the rest of the features interfere and have to be inhibited; for instance, when categorizing objects by color, their shape and function have to be ignored. A verbal label (e.g. “yellow”) can help maintain focus on the relevant categorization criterion and reduce interference from irrelevant features. The hypothesis states that language resources are used to maintain the label and are therefore more important for LD categorization compared with holistic, HD categorization. The LD-specific language recruitment hypothesis predicts that reduced availability of language resources should lead to a greater disruption of LD compared with HD categorization.

Fig. 1. Trial structure in (A) Aphasia Study 1 and (B) Aphasia Study 2 and the fMRI experiment. HD, high dimensional category; LD, low dimensional category.

This prediction found some support in the aphasia literature. Some patients with linguistic deficits have been reported to exhibit impairments in non-verbal categorization tasks when the task required focusing on one particular dimension and ignoring other salient dimensions (De Renzi and Spinnler 1967; Cohen et al. 1980; Cohen and Woll 1981; Hjelmquist 1989; Davidoff and Roberson 2004). Building on these findings, Lupyan (2009) manipulated verbal versus spatial interference in a dual-task paradigm in neurotypical participants and found that verbal, but not visuo-spatial, interference affected the participants’ ability to decide whether an object belongs to an LD category. In contrast, verbal and visuo-spatial interference had similar (and negligible) effects on HD categorization. In a follow-up study, Lupyan and Mirman (2013) directly compared performance on HD and LD categorization in individuals with aphasia and neurotypical controls. Participants were provided with a category descriptor (or label) and then had to select from a picture array the subset of objects that belong to the target category (similar to Fig. 1, top). Performance in the LD condition was lower for both groups, but critically, the HD versus LD difference was larger in individuals with aphasia, particularly in those with low scores on a picture-naming task. Lupyan and colleagues therefore concluded that access to lexical resources is important for LD categorization.

However, evidence from aphasia does not provide uniform support for the LD-specific language recruitment hypothesis. For example, Burger and Muma (1980) found deficits in HD categorization in individuals with anomia and in individuals with Wernicke’s aphasia using a task similar to that used in Lupyan and Mirman (2013). Others described aphasia-related categorization deficits for both HD and LD categories (Koemeda-Lutz et al. 1987) or no deficits in either (Hough 1993). Further, variations in the task (such as showing the category label to the participant during the entire trial versus just at the beginning of the trial) significantly affected categorization performance in participants with aphasia (Koemeda-Lutz et al. 1987), suggesting that task demands may
contribute to the observed results (above and beyond alleged effects of category type). Finally, some have argued for a relationship between categorization difficulties and conceptual-semantic rather than purely linguistic impairments (Caramazza et al. 1982; Whitehouse et al. 1978; cf. Le Dorze and Nespoulous 1989).

The possible role of cognitive control mechanisms in LD categorization
Even if individuals with aphasia consistently showed a selective impairment in LD categorization, this result would not necessarily implicate language as the source of the deficit. In particular, the language network in the left hemisphere, especially in the left frontal cortex, lies adjacent to the domain-general multiple demand network, which supports executive functions, like working memory (WM) and inhibitory control (Duncan 2010; Fedorenko et al. 2012; Duncan 2013; Fedorenko et al. 2013; Assem et al. 2020b). As a result, left hemisphere damage can lead to joint linguistic and domain-general executive deficits (Gainotti et al. 1986; Baldo et al. 2010). Prior work has shown that performance on executive function tasks, not language tasks, predicts success in learning novel categories (Vallila-Rohter and Kiran 2015), and LD categorization consists of novel groupings of elements that are not typically grouped together. Further, the multiple demand network, but not the language network, is robustly sensitive to cognitive effort across domains (e.g. Fedorenko et al. 2011; Fedorenko et al. 2013; Hugdahl et al. 2015; Shashidhara et al. 2019), and LD categorization appears to be more cognitively challenging than HD categorization: LD categories are harder to learn for both human children (e.g. Kloos and Sloutsky 2008) and non-human primates (Couchman et al. 2010), require supervision (e.g. Kloos and Sloutsky 2008), and are generally linked with executively-taxing intentional learning (Kemler Nelson 1984; Ashby et al. 1998; Ashby and Ell 2001; Ashby and O’Brien 2005; Couchman et al. 2010). It is therefore possible that impaired performance on LD categorization (and on categorization tasks more broadly) depends on domain-general cognitive control resources rather than on language resources.

The LD-specific language recruitment hypothesis further predicts that LD categories would evoke stronger activity within the language brain regions. To our knowledge, this hypothesis has not been directly tested in the neuroimaging literature; instead, many studies have investigated differences between taxonomic and thematic relations (Sachs et al. 2008; Kalénine et al. 2009; Sass et al. 2009; Lewis et al. 2015), both of which are considered HD. Further, few neuroimaging studies employ methods that would be required to dissociate the contributions of language-specific regions from those of domain-general cognitive control regions: given the inter-individual variability in the precise locations of functional areas, voxels in anatomically identical locations within the frontal lobe might be language-specific in one individual and domain-general in another, so traditional group-based analyses (Friston et al. 1994) would fail to distinguish between them (Fedorenko et al. 2012; Nieto-Castañón and Fedorenko 2012; Fedorenko and Blank 2020). Disentangling the role of language and executive resources in LD categorization requires identification of language-specific and domain-general cognitive control regions in individual participants and testing their responses to LD compared with HD conditions.

Current study
Here, we re-examine the role of language in LD and HD categorization by reporting evidence from two behavioral studies with patients with aphasia (and patients with Parkinson’s disease and healthy adults as controls) and a functional Magnetic Resonance Imaging (fMRI) study. In Study 1, we use the setup from Lupyan and Mirman (2013; L&M henceforth) to determine whether their findings can be replicated in a sample of participants with moderate aphasia. In Study 2, we adjust the experimental paradigm to reduce task complexity by decreasing the amount of visual information on the screen at any one time, and test whether the LD-selective categorization impairment holds in a sample of individuals with severe anomia. In the fMRI study, we collect data from neurotypical individuals to test the prediction that the language system is engaged during LD categorization more than during HD categorization.

To foreshadow our results, the LD-selective categorization impairment was observed only in some participants with severe anomia (Study 2), not in the general aphasia sample (Study 1). Only three of the five individuals with severe anomia exhibited an LD-selective categorization impairment, casting doubt on the immediate causal link between language (or naming ability) and LD categorization. Finally, the fMRI study revealed low engagement of the language network during both LD and HD categorization, with no significant difference between the two. Thus, the influence of language on LD categorization is behaviorally inconsistent and is not supported by fMRI evidence, leading us to conclude that the language system does not play a special role in LD (single-feature-based) categorization and is not engaged during categorization in general.

Aphasia Study 1
The aim of Study 1 was to test the LD-specific language recruitment hypothesis using a paradigm that is closely related to the original L&M study. L&M compared LD and HD categorization performance in participants with anomic aphasia and in neurotypical controls. They found (i) lower performance on LD compared with HD categories in both healthy adults and participants with anomic aphasia; and, critically, (ii) a greater decrement in performance for the LD, compared with the HD condition in participants with aphasia. We explored whether these same effects would replicate in our sample of participants with moderate aphasia. To additionally examine the extent to which performance might depend on the general effect of brain damage, as opposed to a linguistic impairment, we also included a group of individuals with Parkinson’s disease (PD).

Methods
Participants
Neurotypical older participants (n = 9 (6 F), age M = 67.89, SD = 14.98) were recruited by convenience sampling; individuals with chronic aphasia (n = 11 (3 F), age M = 61.18, SD = 12.09) were recruited from the UCL Aphasia Clinic Research Register. The aphasia group included patients with a range of aphasia types and severities. Unlike L&M, we did not try to limit our sample to individuals classified as having “Anomic” aphasia, given that the use of such rigid classification labels fails to account for the heterogeneity among the symptoms observed across patients (Badecker and Caramazza 1985; Caramazza and Badecker 1989; Wilson et al. 2023), and given that some degree of anomia is present in all forms of aphasia (e.g. Goodglass and Geschwind 1976; Blumstein 1988). According to the normative literature on the Boston Naming Test (BNT; Goodglass et al. 1983), which recommends accounting for age, education, and gender when diagnosing anomia (Welch et al. 1996; Zec et al. 2007), 7 of

Table 1. Participant information, Study 1.

Group         Participant  Age  Education     Gender  TPO (months)  BNT  HD accuracy (SD)  LD accuracy (SD)
Neurotypical  1            75   Up to 16      F       -             51   97% (17)          96% (20)
Neurotypical  2            68   Up to 16      F       -             55   98% (15)          97% (17)
Neurotypical  3            68   Up to 16      M       -             55   98% (14)          97% (17)
Neurotypical  4            56   Degree-Level  F       -             59   100% (7)          98% (14)
Neurotypical  5            98   Up to 16      F       -             47   94% (24)          93% (26)
Neurotypical  6            54   Degree-Level  M       -             53   99% (11)          97% (17)
Neurotypical  7            69   Up to 16      M       -             55   98% (14)          97% (17)
Neurotypical  8            76   Up to 16      F       -             52   96% (20)          94% (24)
Neurotypical  9            47   Up to 18      F       -             58   99% (12)          97% (17)
PD            1            60   Postgraduate  M       36            59   99% (9)           98% (14)
PD            2            58   Degree-Level  M       12            58   99% (9)           99% (11)
PD            3            80   Up to 18      F       48            58   98% (14)          99% (9)
PD            4            56   Postgraduate  F       48            54   99% (10)          98% (16)
PD            5            66   Degree-Level  F       72            59   99% (9)           97% (17)
PD            6            75   Degree-Level  F       96            56   98% (15)          97% (17)
PD            7            59   Degree-Level  F       60            55   98% (16)          97% (18)
PD            8            69   Postgraduate  F       36            54   100% (7)          96% (19)
PD            9            63   Postgraduate  F       60            56   98% (14)          98% (16)
PD            10           77   Degree-Level  M       12            46   98% (15)          99% (9)
PD            11           72   Postgraduate  M       120           53   96% (19)          98% (14)
PD            12           75   Degree-Level  M       2             58   98% (15)          96% (19)
PD            13           75   Postgraduate  F       360           53   96% (20)          96% (20)
Aphasia       1            52   Degree-Level  M       120           30   92% (27)          89% (32)
Aphasia       2            57   Up to 16      M       84            57   99% (10)          98% (14)
Aphasia       3            52   Up to 18      M       48            52   98% (15)          97% (18)
Aphasia       4            59   Postgraduate  M       120           43   100% (7)          98% (15)
Aphasia       5            79   Up to 16      F       36            50   95% (23)          96% (20)
Aphasia       6            44   Up to 18      F       12            14   98% (16)          95% (21)
Aphasia       7            81   Up to 16      M       96            57   92% (27)          95% (21)
Aphasia       8            56   Up to 18      M       60            12   90% (31)          89% (31)
Aphasia       9            57   Up to 18      M       48            51   100% (7)          98% (14)
Aphasia       10           60   Up to 16      M       132           34   96% (19)          95% (22)
Aphasia       11           76   Up to 16      F       84            14   93% (26)          93% (26)

TPO, time post onset; BNT, Boston Naming Test; HD, high dimension categories; LD, low dimension categories; SD, standard deviation
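
The group-level naming scores reported for Study 1 can be cross-checked against the BNT column of Table 1. The following is a minimal Python sketch under stated assumptions, not the authors' analysis code (their group comparisons were run in SPSS): it recomputes the mean and sample standard deviation of BNT scores for the neurotypical and aphasia groups. The names `bnt_scores`, `summarize`, and `summary` are illustrative and do not come from the paper.

```python
from statistics import mean, stdev

# BNT scores transcribed from the BNT column of Table 1 (participant order preserved).
bnt_scores = {
    "neurotypical": [51, 55, 55, 59, 47, 53, 55, 52, 58],
    "aphasia": [30, 57, 52, 43, 50, 14, 57, 12, 51, 34, 14],
}


def summarize(scores):
    """Return (mean, sample SD) rounded to two decimals (hypothetical helper)."""
    return round(mean(scores), 2), round(stdev(scores), 2)


summary = {group: summarize(scores) for group, scores in bnt_scores.items()}

for group, (m, sd) in summary.items():
    print(f"{group}: M = {m:.2f}, SD = {sd:.2f}")
```

With the values as transcribed, this yields M = 53.89, SD = 3.66 for the neurotypical group and M = 37.64, SD = 17.78 for the aphasia group, in line with the figures given under Group profiles.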

the 11 participants in the aphasia group (P1, P4, P6, P8, P9, P10, and P11) were below the cut-off for normative naming performance. Individuals with PD (n = 13 (8 F), age M = 68.08, SD = 8.20) were recruited from the Parkinson’s UK Research Registry. For detailed participant information, see Table 1. All participants used English as their primary language. Patients were offered a £10.00 reimbursement. Ethical approval was granted by the UCL Research Ethics panel, Project ID: LC/2013/05, and all volunteers gave informed consent to participate in the study.

Design and materials
The critical categorization task was modeled closely on L&M’s study, with two modifications. First, the original study used 34 unique categories (18 HD categories and 16 LD categories), with some repetition of categories in each condition. We chose to not repeat any categories, so we limited the materials to 16 categories in each condition (dropping “BODY PARTS” and “FACIAL FEATURES” from the HD set). And second, we used a different set of images. L&M used normed color drawings (Rossion and Pourtois 2004), and we used high-quality color photographs selected from the Hemera Photo Objects 5000 and Google Images. For each category, we selected 8–15 targets and 25–27 distractors. Distractors included some items which were related to the target category (for example, for the category “DANGEROUS ANIMALS,” 13 of the 26 distractors were animals that were not dangerous, and the category “ANIMALS WITH STRIPES” included distractors that were animals without stripes, and inanimate objects with stripes). A total of 1087 unique images were used (any given image appeared as a target in 0–2 categories and as a distractor in 0–2 categories). All photographs depicted objects on a white background. The materials and the experimental scripts for all studies are available on OSF: https://osf.io/guwh8/.

To determine the extent of lexical impairment in the aphasia group and to compare lexical abilities across the three groups, all participants completed the BNT (Goodglass et al. 1983), where they were sequentially presented with up to 60 line drawings of objects and asked to overtly name each one. The standard discontinuation rule was applied, with testing stopped after eight consecutive failed naming attempts. No semantic or phonological cues were given.

Experimental procedure
Testing was carried out individually either in a quiet well-lit room at the UCL Aphasia clinic or at the participants’ home, using a MacBook Pro (Retina, 13-inch display) and an external computer mouse. The study was set up using PsychoPy (Version 1.83), and the procedure closely followed that used in L&M’s study, except where noted. On each trial (see Fig. 1A for a sample HD and LD trial), participants were presented with a 4 x 5 grid of images. The image sets for the individual trials, each consisting of 20 images (4 targets and 16 distractors), were randomly selected from the
pool of targets/distractors for each participant separately. The category was stated at the top of the screen in lower-case Arial bold letters (e.g. “objects that hold water”) and remained on the screen for the duration of the trial. Participants selected the objects that belonged to the target category by clicking on each relevant image. A gray frame appeared around an image once it was clicked; clicking the image again de-selected it (removed the gray frame) to allow participants to modify responses. Once the participant had selected all of the images they deemed appropriate for the target category, they clicked a large green button with the word “Done” at the bottom of the screen (in the L&M version, the button said “click here when done”). Doing so triggered the next trial. Although each trial contained a fixed number of targets (four), participants were not informed of the number of targets during the instructions and could therefore select as many images as they wished on any given trial. No time limit was imposed on the trials, but participants were encouraged to work as quickly and accurately as possible. HD and LD trials were interleaved, and the order of conditions was randomized for each participant. Each participant performed the experiment twice for a total of 64 trials (32 per condition), but in contrast to L&M, different sets of images were used for the two instances of each category to minimize practice effects. Responses were recorded for each image; response times were recorded for each trial (the time from the onset of the trial until the “Done” button was pressed). The session lasted approximately 1 hour. The BNT (Goodglass et al. 2001) was administered between the two runs of the study.

We wish to note that in their study, L&M state that they only included ‘the correct responses’ in their RT analyses. It is not clear what is meant here given the internal complexity of the trials (i.e., possible errors including misses and false alarms). It is possible that L&M only included trials where no errors of any kind were made, but they also talk about ‘per click’ RTs, which is not consistent with this interpretation. It also appears that L&M analyzed median, not mean RTs. For simplicity and to avoid collider bias (Elwert and Winship 2014), we chose to analyze all trials here. We use mean per-trial values, but we make the per-image data available on OSF (https://osf.io/guwh8/), so other researchers could perform additional analyses.

Statistical analyses
To determine possible differences in demographics and BNT scores across groups, we conducted ANOVA tests (with follow-up Bonferroni-corrected t-tests), implemented in SPSS 22 (IBM Corp 2013). For the critical analyses, we used linear/logistic mixed effect regression models (Baayen et al. 2008). Given that correct or incorrect selection of items is categorical in nature, we use logistic regression to analyze accuracy measures (Jaeger 2008). For response times, we use linear regression. When specifying model contrasts, we used sum coding for category dimension (HD vs. LD); the effect of group was therefore estimated across both category dimensions. For the participant group (neurotypical vs. aphasia vs. PD), we used dummy coding with “neurotypical” as the reference level; thus, the effect of category was estimated specifically for the neurotypical group (with interaction terms denoting whether the category effect differed for the aphasia/PD groups). For completeness and to facilitate result comparison with L&M, we also ran pairwise comparisons across groups using “aphasia” as the reference level (the results were Bonferroni-corrected, n = 2). The mixed effect analyses were run using the lmer function from the lme4 R package (Bates et al. 2015); statistical significance of the effects was evaluated using the lmerTest package (Kuznetsova et al. 2017); follow-up comparisons were conducted using the emmeans package (https://cran.r-project.org/package=emmeans).

Lastly, due to a technical error, if participants accidentally double-clicked the “Done” button, the next set of images was skipped, and the software registered it as though no response was made by participants. As a result, we excluded trials where no selection was made and where the trial length was less than 5 seconds. This resulted in the exclusion of 40 trials (out of 2,112; ~2%), spread randomly between participants, groups and categories. The analysis code is available on OSF: https://osf.io/guwh8/.

Results
Group profiles
As expected, the neurotypical, aphasia, and PD groups differed significantly in their BNT scores (F(2,31) = 9.85, P < 0.001). Post-hoc pairwise comparisons showed that the BNT scores of participants with aphasia (M = 37.64, SD = 17.78) were significantly lower than those of neurotypical participants (P = 0.005) or participants with PD (P = 0.001), with the latter two groups not differing significantly (M = 53.89, SD = 3.66 vs. M = 55.21, SD = 3.42; P > 0.999). The groups did not differ in age (F(2,31) = 1.45, P = 0.250), but a significant difference was observed in the level of education (F(2,31) = 14.36, P < 0.001): participants in the PD group were significantly more educated than both neurotypical participants (P = 0.001) and participants with aphasia (P = 0.002), with the latter two not differing significantly (P > 0.999).

Categorization task
Categorization results for Study 1 are summarized in Fig. 2.

Accuracy
We did not observe predicted categorization deficits in the aphasia group. Participants with aphasia had high accuracy for both LD (M = 0.95, SD = 0.03) and HD categories (M = 0.95, SD = 0.03; LD > HD: β = −0.11, SE = 0.26, P = 0.672). The overall accuracy for participants with aphasia (M = 0.95, SD = 0.03) was similar to neurotypical participants (M = 0.97, SD = 0.02; neurotypical > aphasia: β = 0.36, SE = 0.26, P = 0.166) and slightly lower than for participants with PD (M = 0.98, SD = 0.01; PD > aphasia: β = 0.69, SE = 0.24, P = 0.004). The key comparison, the interaction between category dimension and group (aphasia vs. neurotypical), was marginally significant (β = −0.26, SE = 0.14, P = 0.055), and the trend was in the opposite direction from that predicted by the LD-specific language recruitment hypothesis (the performance gap for the neurotypical group was larger). The category dimension by group interaction for the aphasia versus PD comparison was not significant (β = −0.12, SE = 0.13, P = 0.341). Thus, we did not observe LD-specific categorization impairment in the aphasia group.

We additionally conducted an exploratory analysis to investigate the difference between the aphasia and PD groups. Given that the PD group had a higher average education level, we repeated the analysis above with “education level” as an additional fixed effect. The updated model had a similar fit to the data compared with the original (as per the likelihood ratio test: χ² = 1.55, P = 0.213); under this model, the difference between the aphasia and the PD groups was no longer significant (β = 0.44, SE = 0.31, P = 0.158). The significance of other effects was unchanged.

Response times
The RT analysis revealed that participants with aphasia were faster to respond during LD trials (M = 32.36, SD = 9.33) compared with HD trials (M = 37.10, SD = 12.90; LD > HD: β = −4.75, SE = 2.26, P = 0.042), in contrast to the predictions of the LD-specific language recruitment hypothesis. The overall RTs for participants with aphasia (M = 34.70, SD = 11.30) were longer than for neurotypical participants (M = 26.30, SD = 12.10; β = −8.42,

Fig. 2. Study 1 results. (A) Accuracy and (B) response time (RT) across the three participant groups (here, RT is the time from trial onset until participants
pressed the “done” button). (C) Accuracy and (D) RT plotted against participants’ BNT scores, a measure of naming performance. Here and elsewhere,
error bars depict the standard error across participants.

SE = 4.02, P = 0.044) and the PD group (M = 21.40, SD = 5.14; β = −13.30, SE = 3.66, P < 0.001). The interactions between category dimension and group were not significant (neurotypical > aphasia: β = 0.82, SE = 1.29, P = 0.522; PD > aphasia: β = 1.98, SE = 1.17, P = 0.091). Follow-up analyses showed no overall effect of category dimension across groups (β = 3.81, SE = 2.19, P = 0.249), within the neurotypical group (β = 3.92, SE = 2.34, P = 0.271) or within the PD group (β = 2.76, SE = 2.27, P = 0.521).

Effect of naming performance
To explore the effect of naming ability on the categorization task performance, we fitted mixed effect regression models (logistic for accuracy, linear for RT) with the BNT score, category dimension, and their interaction as fixed effects and participants (across the three groups) and categories (e.g. “DANGEROUS ANIMALS”) as random effects. Similar to L&M, we also included education level as a fixed effect. We found that BNT was a significant predictor of accuracy (β = 0.36, SE = 0.08, P < 0.001) and RT (β = −5.26, SE = 1.41, P < 0.001), such that higher BNT scores corresponded to more accurate and faster performance (Fig. 2C, D). There was no main effect of category dimension (accuracy: β = −0.24, SE = 0.26, P = 0.358; RT: β = −3.73, SE = 2.14, P = 0.092) and no interaction between BNT and category dimension (accuracy: β = −0.05, SE = 0.04, P = 0.271; RT: β = 0.74, SE = 0.49, P = 0.131). Education was a significant predictor for both accuracy (β = 0.23, SE = 0.07, P = 0.001) and RT (β = −2.95, SE = 1.24, P = 0.024). Although these results indicate that there exists a relationship between the BNT score and categorization performance, they do not support the LD-specific language recruitment hypothesis.

Interim discussion
In Study 1, we used the setup from a previous study (Lupyan and Mirman 2013, or L&M) to test the hypothesis that language is selectively recruited to support LD categorization. To examine the generality of the language-categorization link, we recruited a group of individuals with aphasia with diverse degrees of aphasia severity. We found that the aphasia group performed comparably to the control groups on the categorization tasks. Naming ability (as measured with the BNT) predicted overall categorization performance, but we observed no interaction between naming ability and category dimension (HD vs. LD). In summary, Study 1 provides no support for the hypothesis that language plays a special role in LD categorization.
Participants with aphasia performed object categorization as accurately as the neurotypical controls. Participants with PD performed better than the other groups, but this difference is likely explained by the higher education level in this group. As in L&M, participants with aphasia were significantly slower to complete the categorization task compared with the neurotypical group and with our additional PD control group. However, this slower performance in the aphasia group can be explained by the presence
of motor impairments (e.g. right hemiplegia), often more severe than in participants with PD, which often necessitate use of their non-preferred hand. This difference could also be explained by the fact that participants with aphasia may require longer to process the category descriptions, which are presented verbally and sometimes in lengthy phrases (e.g. “NON-FOOD THINGS FOUND IN THE KITCHEN”). Thus, we are hesitant to place a lot of weight on the RT differences.
Across groups, BNT scores significantly predicted performance on all three outcome measures (although this effect did not differ for LD and HD categorization). Although BNT scores may be a proxy for the severity of linguistic impairment, they also might index the degree of executive function impairments (Higby et al. 2019). Due to the proximity of language-specific and multiple-demand brain regions in some parts of the brain (Fedorenko et al. 2012; Fedorenko and Blank 2020), brain damage that causes lower BNT scores is also likely to lead to difficulties with cognitively demanding tasks. The categorization task adopted from L&M involves visual search and selecting among multiple options, which require a substantial degree of cognitive control (Posner and Petersen 1990; Petersen and Posner 2012); thus, categorization difficulties on the current task might reflect this increased recruitment of executive/cognitive control resources.
Given the heterogeneity of the aphasia group in Study 1 and a relatively low sample size, our results in this section should be interpreted with caution. Therefore, in the next two studies, we (i) test the hypothesis that LD-specific categorization impairments might be observed specifically in participants with low BNT scores (Aphasia Study 2) and (ii) evaluate the relative contributions of language and executive resources to categorization in neurotypical participants (fMRI experiment).
For Aphasia Study 2 and the fMRI experiment, we use a modified paradigm that temporally separates the process of reading the category label and the process of categorizing objects based on that label. Reading the label necessarily requires the use of language but is not the target of the LD-specific categorization hypothesis: thus, in the new setup, participants first read the label and then make categorization judgments.

Aphasia Study 2

The aim of Study 2 was 3-fold. First, we wanted to further probe the relationship between naming ability (BNT scores) and categorization performance, which was reported by L&M and found in Study 1. Thus, we recruited participants with aphasia who had severe anomia, as measured by the BNT (score range 1–11, compared with 12–57 in Study 1; see Tables 1 and 2). Second, we adjusted the paradigm to minimize executive demands, including attention, visual search, selection/inhibition, and updating. Third, we sought to validate a version of the task that could be used in an fMRI setting (time-locked to events). See Fig. 1B for the modified task setup.

Method
Participants
Neurotypical participants (n = 15 (15 F), age M = 72.47, SD = 6.41) were recruited by convenience sampling; patients with chronic aphasia and severe lexical impairment (n = 5 (all males), age M = 66.60, SD = 8.91) were recruited from Aphasia volunteer research registers; PD patients (n = 15 (1 F), age M = 66.60, SD = 6.38) were recruited from the Parkinson’s UK Research Registry (see Table 2 for detailed participant information). None of the participants took part in Study 1. All participants used English as their primary language and were offered a £15.00 reimbursement. Ethical approval was granted by the UCL Research Ethics panel, Project ID: LC/2013/05, and all volunteers gave informed consent to participate in the study.

Design and materials
The categories were identical to those of Study 1. The images were also largely the same, although some were replaced by better quality photographs. Unlike Study 1, we presented the images sequentially (Fig. 1B). Each block started with a category label, followed by 12 images presented one at a time. The category label remained on the screen to minimize memory demands. The images for each category block were randomly selected from the general set of pictures for that category. The number of targets varied across blocks (minimum: 4, maximum: 6) so as to minimize the implicit learning of a fixed number of targets. Categories were grouped by dimension (LD/HD) into groups of four, for a total of eight blocks (four blocks per dimension). These 8-block sequences (“runs”) were separated by a rest period of fixation (10 s in duration). The order of runs, the order of conditions within runs (LD first vs. HD first), the order of categories within runs, and the order of images within category blocks were randomized for each participant.

Experimental procedure
Testing was carried out individually either in a quiet well-lit room at a clinic nearest to the participant’s location or in their home, using a Dell Latitude E5540 (14.1-inch display). The paradigm was set up using Python (version 2.7.10). Each category block started with an instruction screen presented for 2 s that read “Please find [CATEGORY LABEL]” (e.g. “Please find objects that hold water”). Given that the participants in the aphasia group had severe lexical impairment and had difficulty processing orthographic information, the experimenter read the category label aloud to all participants (in all groups) during this trial-initial 2 s window. This screen was followed by a sequence of 12 images presented one at a time for a maximum of 10 s per image. For each image, participants had to decide whether the depicted object belongs to the target category by pressing one of two keys on the keyboard: the “Y” key marked with a green sticker for YES, or the “N” key marked with a red sticker for NO. If no response was recorded for 10 s, the experiment advanced to the next image. Responses and response times were recorded for each image. The session lasted approximately 1 hour. The BNT was administered at the beginning of the testing session.

Statistical analyses
The statistical analysis procedure was the same as in Study 1. No trials were excluded.

Results
Group profiles
As expected, the groups differed significantly in their BNT scores (F(2,32) = 202.67, P < 0.001). Post-hoc pairwise comparisons revealed that the BNT scores of participants with aphasia (M = 6.00, SD = 4.00) were significantly lower than both neurotypical participants (P < 0.001) and participants with PD (P < 0.001), with the latter two groups not differing significantly (M = 53.67, SD = 5.42 vs. M = 54.87, SD = 4.73, P > 0.999). The groups did not differ in age (F(2,32) = 3.23, P = 0.053), but a significant difference was observed in the level of education (F(2,32) = 5.42, P = 0.009), with neurotypical participants and participants with PD having

Table 2. Participant information, Study 2.

Group         Participant  Age  Education     Gender  TPO (months)  BNT  HD Accuracy (SD)  LD Accuracy (SD)
Neurotypical  1            68   Degree-Level  F       -             51   99% (11)          98% (14)
Neurotypical  2            61   Postgraduate  F       -             41   98% (12)          98% (12)
Neurotypical  3            85   Degree-Level  F       -             54   99% (10)          95% (21)
Neurotypical  4            73   Postgraduate  F       -             58   95% (21)          96% (19)
Neurotypical  5            72   Up to 18      F       -             58   99% (9)           99% (9)
Neurotypical  6            77   Postgraduate  F       -             55   96% (21)          99% (11)
Neurotypical  7            77   Degree-Level  F       -             59   97% (17)          98% (12)
Neurotypical  8            66   Degree-Level  F       -             57   98% (13)          99% (7)
Neurotypical  9            66   Postgraduate  F       -             54   98% (12)          98% (12)
Neurotypical  10           76   Degree-Level  F       -             59   99% (9)           98% (14)
Neurotypical  11           65   Postgraduate  F       -             56   98% (15)          98% (13)
Neurotypical  12           80   Up to 18      F       -             45   97% (18)          99% (11)
Neurotypical  13           74   Postgraduate  F       -             54   98% (12)          99% (10)
Neurotypical  14           71   Degree-Level  F       -             57   96% (20)          95% (23)
Neurotypical  15           76   Degree-Level  F       -             47   96% (19)          95% (22)
PD            1            71   Degree-Level  M       24            58   98% (13)          97% (17)
PD            2            78   Degree-Level  M       24            47   95% (22)          93% (25)
PD            3            64   Postgraduate  M       30            48   98% (13)          95% (21)
PD            4            72   Postgraduate  M       18            59   98% (14)          96% (19)
PD            5            54   Degree-Level  M       204           58   97% (16)          97% (16)
PD            6            72   Degree-Level  M       4             48   96% (21)          98% (14)
PD            7            62   Postgraduate  F       120           56   98% (14)          99% (9)
PD            8            65   Postgraduate  M       17            59   97% (17)          98% (13)
PD            9            74   Up to 18      M       96            56   96% (19)          97% (17)
PD            10           67   Up to 16      M       60            54   99% (9)           98% (13)
PD            11           67   Postgraduate  M       72            59   98% (14)          96% (19)
PD            12           59   Postgraduate  M       30            58   99% (10)          99% (7)
PD            13           59   Degree-Level  M       48            60   97% (18)          99% (11)
PD            14           67   Degree-Level  M       18            55   97% (17)          95% (22)
PD            15           68   Degree-Level  M       98            48   92% (27)          94% (23)
Aphasia       1            58   Up to 18      M       42            5    88% (32)          82% (39)
Aphasia       2            68   Up to 16      M       68            9    77% (42)          79% (41)
Aphasia       3            77   Up to 18      M       111           11   91% (29)          88% (33)
Aphasia       4            57   Degree-Level  M       34            1    96% (19)          95% (22)
Aphasia       5            73   Up to 18      M       326           4    96% (19)          91% (28)

TPO, time post onset; BNT, Boston Naming Test; HD, high dimension categories; LD, low dimension categories; SD, standard deviation.
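
As a quick consistency check, the group-level BNT summaries reported under Group profiles can be recomputed from the BNT column of Table 2. The sketch below is illustrative only: the variable and function names are ours, and it assumes that the reported SDs are sample standard deviations (n − 1 denominator).

```python
from statistics import mean, stdev

# BNT scores transcribed from Table 2 (names here are illustrative, not from the paper).
bnt_scores = {
    "neurotypical": [51, 41, 54, 58, 58, 55, 59, 57, 54, 59, 56, 45, 54, 57, 47],
    "pd": [58, 47, 48, 59, 58, 48, 56, 59, 56, 54, 59, 58, 60, 55, 48],
    "aphasia": [5, 9, 11, 1, 4],
}


def summarize(scores):
    """Return (mean, sample SD), each rounded to two decimals."""
    return round(mean(scores), 2), round(stdev(scores), 2)


# One (M, SD) pair per participant group.
group_summary = {group: summarize(scores) for group, scores in bnt_scores.items()}

for group, (m, sd) in group_summary.items():
    print(f"{group}: M = {m:.2f}, SD = {sd:.2f}")
```

Under that assumption, the recomputed values agree with the group means and SDs given in the text for all three groups.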

significantly more years of education than participants with aphasia (P = 0.010 and 0.016, respectively). The neurotypical participants and participants with PD did not differ (P > 0.999).

Categorization task
Categorization results for Study 2 are summarized in Fig. 3.

Accuracy
As in Study 1, participants with aphasia had similar accuracies for LD (M = 0.87, SD = 0.07) and HD categories (M = 0.90, SD = 0.08; LD > HD: β = −0.24, SE = 0.22, P = 0.282). Participants with aphasia had overall lower accuracies (M = 0.88, SD = 0.07) compared with neurotypical participants (M = 0.98, SD = 0.01; neurotypical > aphasia: β = 1.70, SE = 0.28, P < 0.001) and participants with PD (M = 0.97, SD = 0.02; PD > aphasia: β = 1.44, SE = 0.28, P < 0.001), which is consistent with the relationship between naming ability and categorization performance observed in Study 1. We did not observe a reliable category dimension by group interaction for the aphasia versus neurotypical comparison (β = 0.44, SE = 0.26, P = 0.086), nor for the aphasia versus PD comparison (β = 0.42, SE = 0.23, P = 0.070). Critically, in accordance with the LD-specific language recruitment hypothesis, we observed a category dimension by group interaction both for the aphasia versus neurotypical comparison (β = 0.37, SE = 0.16, P = 0.021) and for the aphasia versus PD comparison (β = 0.32, SE = 0.15, P = 0.037).

Response times
RT results were also consistent with the LD-specific language recruitment hypothesis. Participants with aphasia were slower to respond during LD trials (M = 2.37, SD = 0.70) compared with HD trials (M = 2.22, SD = 0.64; LD > HD: β = 0.16, SE = 0.08, P = 0.044). The overall RTs for participants with aphasia (M = 2.30, SD = 0.64) were longer than for neurotypical participants (M = 1.48, SD = 0.34; β = −0.81, SE = 0.19, P < 0.001) and participants with PD (M = 1.43, SD = 0.29; β = −0.86, SE = 0.19, P < 0.001). We also observed an interaction between group and category dimension for both the neurotypical versus aphasia comparison (β = −0.23, SE = 0.03, P < 0.001) and the PD versus aphasia comparison (β = −0.19, SE = 0.03, P < 0.001), such that participants with aphasia had longer RTs for LD categories compared with HD categories.

Effect of naming performance
As in Study 1, BNT was a significant predictor of categorization performance (accuracy: β = 0.50, SE = 0.11, P < 0.001; RT: β = −0.29, SE = 0.07, P < 0.001). There was no main effect of category dimension (accuracy: β = 0.06, SE = 0.21, P = 0.787; RT: β = −0.02, SE = 0.07, P = 0.742); however, unlike Study 1, and as predicted by

Fig. 3. Study 2 results. (A) Accuracy and (B) RT across the three participant groups (here, RT is the time until participants pressed a “yes” or “no” button
for each image within a trial). (C) Accuracy and (D) RT plotted against participants’ BNT scores, a measure of naming performance.

the LD-specific language recruitment hypothesis, we observed an interaction between BNT and category dimension for accuracy (β = 0.13, SE = 0.05, P = 0.007) and RT (β = −0.08, SE = 0.01, P < 0.001). Finally, education was not a significant predictor of performance in this dataset (accuracy: β = 0.12, SE = 0.13, P = 0.372; RT: β = −0.01, SE = 0.08, P = 0.940).

Single case analysis
Although the effect of naming performance in Study 2 is in line with L&M’s prediction, careful examination of individual participants’ scores casts doubt on the causal relationship between naming ability and categorization performance. Specifically, participants A4 and A5 in the aphasia group (Table 2) had very low BNT scores (1/60 and 4/60), but nonetheless performed well relative to both the neurotypical and PD groups (accuracy: LD A4 = 95%; A5 = 91%; HD A4 = 96%; A5 = 96%). Using the Adjusted F Calculator for comparing single cases to groups (Hulleman and Humphreys 2007), these two participants did not differ significantly from the combined neurotypical and PD groups for either the HD condition (A4: F[1,29] < 0.01, P (one-tailed) = 0.414; A5: F[1,29] < 0.01, P (one-tailed) = 0.414) or the LD condition (A4: F[1,29] = 0.02, P (one-tailed) = 0.337; A5: F[1,29] = 0.14, P (one-tailed) = 0.154). This dissociation indicates that naming impairment is not necessarily accompanied by a decrement in LD categorization.

Interim discussion
In Study 2, we examined object categorization performance of individuals with severe anomia using a modified task paradigm (with the goal of reducing executive demands). We found that, in accordance with the LD-selective language recruitment hypothesis, individuals with aphasia were impaired on LD categorization more than on HD categorization. However, performance of individual participants offers a reason to be skeptical about a direct link between naming and categorization. Participants A4 and A5 demonstrated a dissociation between these two tasks: despite very low BNT scores (lower than 5/60), they performed similarly on the HD and LD categorization trials, and their accuracy on both conditions was well within the range of the control groups.
Dissociations observed in individual case studies are critical in informing debates about cognitive architecture (e.g. Caramazza and McCloskey 1988; Badecker et al. 1991; Caramazza and Coltheart 2006). Naturally occurring brain lesions do not respect the boundaries between functionally distinct brain areas, and comorbidities or associations of impairments are common (e.g. Bates et al. 2003). For example, damage to the left inferior frontal gyrus (LIFG) is likely to cause multiple cognitive impairments due to the high functional heterogeneity of that region (Fedorenko et al. 2012; Fedorenko and Blank 2020). Thus, a correlation that we observe between naming and categorization might be because the brain regions that support these functions are located nearby and thus are likely to be damaged together (rather than naming and categorization engaging the same brain region/mechanism). The dissociation that we observe in participants A4 and A5 supports this possibility: in both cases, severely limited lexical access did not prevent success on the categorization task, revealing that intact linguistic (naming) skills are not necessary for object categorization.
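
To make the logic of a single-case comparison concrete, the sketch below implements the Crawford–Howell modified t-test, a widely used way of comparing one case against a small control sample. This is a swapped-in illustration and not the Adjusted F Calculator used in the Single case analysis; the control and case scores are hypothetical, and the function name is ours. An F statistic on (1, n − 1) degrees of freedom corresponds to the square of a t statistic on n − 1 degrees of freedom, which is why a control sample of 30 yields 29 denominator degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def crawford_howell_t(case_score, control_scores):
    """Crawford-Howell modified t-test for one case versus a control sample.

    Returns (t, df), where t = (case - control mean) / (control SD * sqrt((n + 1) / n))
    and df = n - 1. The control SD is the sample SD (n - 1 denominator).
    """
    n = len(control_scores)
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)
    t = (case_score - control_mean) / (control_sd * sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical accuracy scores (percent correct), for illustration only.
hypothetical_controls = [96, 98, 97, 99, 95, 98, 97, 96]
hypothetical_case = 96

t_stat, df = crawford_howell_t(hypothetical_case, hypothetical_controls)
print(f"t({df}) = {t_stat:.3f}")
```

A t value close to zero, as in this made-up example, indicates that the case falls well within the control distribution; a p-value would be obtained from the t distribution with the returned degrees of freedom.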

As in Study 1, naming ability significantly predicted performance. Furthermore, possibly because in this study we recruited participants with aphasia who had extremely poor naming performance, we also observed a group difference: participants with aphasia had lower accuracy and longer response times than the two control groups. This evidence points to a possible link between naming performance and categorization. As in Study 1, this link might arise from the fact that task instructions are presented verbally; thus, linguistic impairments might affect task performance simply because they make it more challenging to process the instructions. Another explanation, also offered by L&M, is that LD categorization is correlated with naming impairments because both tasks may be affected by damage to cognitive control mechanisms, which lie in close proximity to language areas, especially in the LIFG (Thompson-Schill et al. 1997; Kan and Thompson-Schill 2004; Fedorenko et al. 2012). In line with this conjecture, Hu et al. (2021) observed strong neural responses (in fMRI) in the domain-general MD network during an object naming task. Thus, the correlation between anomia severity and object categorization performance does not offer evidence of a language-specific impairment and might reflect an executive impairment instead.
The results of Studies 1 and 2 did not allow us to resolve the question of whether language plays a key role in LD categorization. Study 1 failed to replicate the selective LD categorization impairments as reported in L&M. Study 2 did show a selective decrease in accuracy (and increase in RTs) for LD categories in participants with low naming scores, as predicted by the LD-specific language recruitment hypothesis. However, this piece of evidence is undermined by the dissociation observed in participants A4 and A5 and the possibility that the performance deficits in individuals with severe anomia could be caused by damage to the domain-general executive brain regions that are adjacent to the language system in the left frontal lobe.
To definitively establish whether LD categorization recruits the language system, we next turned to fMRI.

fMRI experiment
To further test the relationship between language and categorization, we conducted an fMRI experiment. Neurotypical participants performed the same LD/HD categorization task as participants in Study 2. In addition, they completed two functional “localizer” tasks (Saxe et al. 2006; Fedorenko et al. 2010) that were used to identify the networks of interest: the language network and the multiple demand network. The use of standard, extensively validated language network and multiple demand network localizers allows us to identify and characterize these networks consistently across studies (Saxe et al. 2006; Fedorenko 2021).
The language localizer was designed to identify brain regions that respond more strongly to meaningful and structured language than to a perceptually similar control condition (for example, sentences versus meaningless sequences of letters (“nonwords”); Fedorenko et al. 2010). A large number of studies have shown that sentences > nonwords and similar contrasts pick out a set of brain regions that are strongly and selectively recruited for language processing, including spoken, written, and signed language comprehension, spoken and written language production, and inner speech (Amit et al. 2017; Braga et al. 2020; Fedorenko et al. 2010, 2011; Giglio et al. 2022; Hu et al. 2021; Menenti et al. 2011; Scott et al. 2017; Silbert et al. 2014). These regions (henceforth, the language network) also respond to linguistic units at different levels of the processing hierarchy, including both phrases and single words (albeit no single region or voxel is sensitive just to word-level or sentence-level meaning; Blank et al. 2016; Fedorenko et al. 2020). Therefore, if a task requires activating verbal labels, we expect to observe activity in the regions identified with the language localizer.
The multiple demand localizer identifies a set of brain regions that respond to a wide range of cognitively demanding tasks. Specifically, these regions are sensitive to general cognitive effort, exhibiting higher activity when the task is more difficult (Assem et al. 2020b; Duncan 2010; Fedorenko et al. 2013; Hugdahl et al. 2015). The hard > easy response signature in the multiple demand network holds across many diverse tasks, including spatial WM, logic, math, relational reasoning, and cognitive control (Fedorenko et al. 2013; Coetzee and Monti 2018; Shashidhara et al. 2019; Assem et al. 2020b). Thus, if LD categorization is more cognitively challenging, we expect it to elicit higher activity in the multiple demand network.
Examining activation patterns in both the language and the multiple demand networks allows us to examine the relative contributions of linguistic and cognitive control resources to LD and HD categorization. As discussed before, brain damage leading to aphasia is often comorbid with multiple demand network damage: the language-selective regions and these domain-general regions in left inferior frontal cortex lie in close proximity to each other (Fedorenko et al. 2012; Blank et al. 2014; Fedorenko and Blank 2020), with precise locations varying substantially across individuals. Thus, impaired categorization performance of participants with aphasia in Studies 1 and 2 could have potentially arisen from damage to either or both networks. Study 3 allows us to disambiguate between these possibilities. If, as suggested by L&M, LD categorization indeed relies on language more than HD categorization, we expect to see more activity within the language system during LD trials compared with HD trials. Further, if LD categorization is a more cognitively demanding task, we expect to see higher responses within the multiple demand network during LD trials compared with HD trials (in accordance with the fact that multiple demand regions are sensitive to effort across diverse tasks; Duncan and Owen 2000; Fedorenko et al. 2013; Hugdahl et al. 2015). Finally, if a brain network does not respond to either LD or HD categorization, we can conclude that this network is not recruited for this task.

Method
Participants
Fourteen neurotypical participants (7 F, age M = 22.31, SD = 3.51) were recruited from MIT and the surrounding community and paid $60 for their participation. All were native speakers of English. One participant was left-handed (see Willems et al. 2014, for motivation to include left-handers in cognitive neuroscience research) but showed typical left-lateralized language activation as determined by the language localizer task (described below). All participants gave informed consent in accordance with the requirements of MIT’s Committee On the Use of Humans as Experimental Subjects.

Design, materials, and procedure
Each participant completed a language localizer task aimed at identifying language-responsive brain regions (Fedorenko et al. 2010), a spatial WM task aimed at identifying the multiple demand network (Fedorenko et al. 2013) and the critical categorization task. Some participants completed one or more additional tasks for unrelated studies. The entire scanning session lasted two hours.

Language network localizer completed three runs. Across the three runs, any given participant
Participants read sentences (e.g. NOBODY COULD HAVE PRE- saw a random subset of the 32 categories, with some categories
DICTED THE EARTHQUAKE IN THIS PART OF THE COUNTRY) repeating (but never repeating within a run; see Appendix S1,
and lists of unconnected, pronounceable nonwords (e.g. U BIZBY Table 1 for details). Condition order was counterbalanced across
ACWORRILY MIDARAL MAPE LAS POME U TRINT WEPS WIBRON runs and participants.
PUZ) in a blocked design. Each stimulus consisted of twelve word-
s/nonwords. The sentences > nonword-lists contrast has been
fMRI data acquisition
previously shown to reliably activate high-level language process- Structural and functional data were collected on the whole-
ing regions and to be robust to changes in the materials, task, body, 3 Tesla, Siemens Trio scanner with a 32-channel head coil,
and modality of presentation (Fedorenko et al. 2010; Mahowald at the Athinoula A. Martinos Imaging Center at the McGovern
and Fedorenko 2016; Scott et al. 2017). For details of how the Institute for Brain Research at MIT. T1-weighted structural images
language materials were constructed, see Fedorenko et al. (2010). were collected in 176 sagittal slices with 1-mm isotropic voxels
The materials are available at http://evlab.mit.edu/funcloc. Stim- (TR = 2,530 ms, TE = 3.48 ms). Functional, blood oxygenation level
dependent (BOLD), data were acquired using an EPI sequence

Downloaded from https://academic.oup.com/cercor/article/33/19/10380/7239890 by guest on 13 October 2023


uli were presented in the center of the screen, one word/nonword
at a time, at the rate of 450 ms per word/nonword. Each stimulus (with a 90 ◦ f lip angle and using GRAPPA with an acceleration
was preceded by a 100 ms blank screen and followed by a 400- factor of 2), with the following acquisition parameters: 31 4-mm
ms screen showing a picture of a finger pressing a button, and thick near-axial slices acquired in the interleaved order (with 10%
a blank screen for another 100 ms, for a total trial duration of distance factor), 2.1 mm × 2.1 mm in-plane resolution, FoV in
6 s. Participants were asked to press a button whenever they saw the picture of a finger pressing a button. This task was included to help participants stay alert and awake. Condition order was counterbalanced across runs. Experimental blocks lasted 18 s (with three trials per block), and fixation blocks lasted 14 s. Each run (consisting of 5 fixation blocks and 16 experimental blocks) lasted 358 s. Each participant completed two runs.

Multiple demand network localizer
Participants had to keep track of four (easy condition) or eight (hard condition) sequentially presented locations in a 3 × 4 grid (Fedorenko et al. 2013). The hard > easy contrast has been previously shown to robustly activate multiple demand regions (Fedorenko et al. 2013; Blank et al. 2014; Mineroff et al. 2018; Assem et al. 2020a). Stimuli in both conditions were presented in the center of the screen across four steps. Each of these steps lasted for 1 s and presented one location on the grid in the easy condition, and two locations in the hard condition. Each stimulus was followed by a choice-selection step, which showed two grids side by side. One grid contained the locations shown on the previous four steps, whereas the other contained an incorrect set of locations. Participants were asked to press one of two buttons to choose the grid that showed the correct locations. Condition order was counterbalanced across runs and participants. Experimental blocks lasted 32 s (with 4 trials per block), and fixation blocks lasted 16 s. Each run lasted 448 s, consisting of 12 experimental blocks (6 per condition) and 4 fixation blocks. Twelve participants completed two runs and two participants completed one run.

the phase encoding (A >> P) direction 200 mm and matrix size 96 mm × 96 mm, TR = 2000 ms and TE = 30 ms. The first 10 s of each run were excluded to allow for steady state magnetization.

fMRI data preprocessing
fMRI data were analyzed using SPM12 (release 7487), CONN EvLab module (release 19b), and other custom MATLAB scripts. Each participant’s functional and structural data were converted from DICOM to NIFTI format. All functional scans were coregistered and resampled using B-spline interpolation to the first scan of the first session (Friston et al. 1995). Potential outlier scans were identified from the resulting subject-motion estimates, as well as from BOLD signal indicators, using default thresholds in CONN preprocessing pipeline (5 standard deviations above the mean in global BOLD signal change, or framewise displacement values above 0.9 mm; Nieto 2020), and used as regressors of no interest in first-level analyses (see below). Functional and structural data were independently normalized into a common space [the Montreal Neurological Institute (MNI) template; IXI549Space] using SPM12 unified segmentation and normalization procedure (Ashburner and Friston 2005) with a reference functional image computed as the mean functional data after realignment across all timepoints omitting outlier scans. The output data were resampled to a common bounding box between MNI-space coordinates (−90, −126, −72) and (90, 90, 108), using 2-mm isotropic voxels and fourth-order spline interpolation for the functional data, and 1-mm isotropic voxels and trilinear interpolation for the structural data. Last, the functional data were smoothed spatially using spatial convolution with a 4-mm FWHM Gaussian kernel.
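
As a rough illustration of the outlier-scan criteria stated in the preprocessing description (global BOLD signal change more than 5 standard deviations above the mean, or framewise displacement above 0.9 mm), the following Python sketch flags scans that exceed either threshold. It is not the CONN implementation: the function name `flag_outlier_scans`, its arguments, and the use of a plain sample mean and standard deviation are assumptions made for illustration, and both inputs are assumed to be precomputed per-scan values.

```python
from statistics import mean, stdev


def flag_outlier_scans(global_signal_change, framewise_displacement,
                       sd_threshold=5.0, fd_threshold_mm=0.9):
    """Return indices of scans that exceed either outlier criterion.

    Hypothetical helper, not part of CONN or SPM. Inputs are assumed to be
    equal-length sequences with one value per scan.
    """
    if len(global_signal_change) != len(framewise_displacement):
        raise ValueError("inputs must have one value per scan")

    # Cutoff for global signal change: mean + sd_threshold * SD.
    cutoff = mean(global_signal_change) + sd_threshold * stdev(global_signal_change)

    flagged = []
    for scan, (signal, displacement) in enumerate(
            zip(global_signal_change, framewise_displacement)):
        # A scan is flagged if it violates either criterion.
        if signal > cutoff or displacement > fd_threshold_mm:
            flagged.append(scan)
    return flagged
```

In a pipeline of this kind, each flagged scan would typically be entered as its own nuisance regressor in the first-level model, which matches the use of outlier scans as regressors of no interest described above.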

Critical categorization task
The categorization materials were the same as those used in Study 2 (see Fig. 1, bottom). The timing differed in the following way. In order to make blocks uniform in duration, each category block started with a category label presented for 2 s, and then the 12 images were presented sequentially at the fixed speed of 2 s per image. As in Study 2, any given category block contained between four and six target images. Participants were asked to press a button if the picture belonged to the target category and not to press anything if it did not. As before, the category label was displayed at the top of the screen for the duration of the trial to minimize memory demands. Category blocks lasted 26 s (2 s category label presentation + 2 s × 12 images), and fixation blocks lasted 14 s. Each run, consisting of 12 category blocks (6 LD and 6 HD) and 4 fixation blocks, lasted 368 s. Each participant

First-level analysis
Responses in individual voxels were estimated using a General Linear Model (GLM) in which each experimental condition was modeled with a boxcar function convolved with the canonical hemodynamic response function (HRF) (fixation was modeled implicitly, such that all timepoints that did not correspond to one of the conditions were assumed to correspond to a fixation period). Temporal autocorrelations in the BOLD signal timeseries were accounted for by a combination of high-pass filtering with a 128-s cutoff and whitening using an AR(0.2) model (first-order autoregressive model linearized around the coefficient a = 0.2) to approximate the observed covariance of the functional data in the context of Restricted Maximum Likelihood estimation (ReML). In addition to experimental condition effects, the GLM design included first-order temporal derivatives for each

condition (included to model variability in the HRF delays), as well as nuisance regressors to control for the effect of slow linear drifts, subject-motion parameters, and potential outlier scans on the BOLD signal.

Defining individual functional regions of interest
Responses to the critical categorization experiment were extracted from regions of interest that were defined functionally in each individual participant (Saxe et al. 2006; Nieto-Castañón and Fedorenko 2012). Three sets of functional regions of interest (fROIs) were defined: one for the language network, one for the multiple demand network, and one for the putative LD > HD categorization regions. To do so, we used the Group-constrained Subject-Specific (GSS) approach (Fedorenko et al. 2010; Julian et al. 2012). In particular, fROIs were constrained to fall within a set of “parcels,” which marked the expected gross locations of activations for the relevant contrast. For the language network, the parcels were generated based on a group-level representation of language localizer data from 220 participants. For the multiple demand network, the parcels were generated based on a group-level representation of spatial WM task data from 197 participants. For the putative LD categorization regions, we generated the parcels based on the data collected in this study. The parcels are available on OSF (https://osf.io/guwh8/).

To create each set of parcels, individual activation maps for the relevant localizer contrast were binarized (by turning all voxels significant at the P < 0.001 whole-brain threshold (uncorrected) into 1s, and the rest into 0s) and overlaid in the MNI space to create a probabilistic overlap map. The map was then smoothed (FWHM = 6 mm), and voxels with fewer than 10% of participants overlapping were excluded. The resulting map was divided into regions using a watershed algorithm. Finally, we excluded parcels that did not show significant effects for the relevant localizer contrast in a left-out run or did not contain supra-threshold voxels in at least 60% of the participants (for language and multiple demand networks) or in at least 50% of the participants (for putative LD categorization regions). For the multiple demand network, we also (i) excluded parcels in the visual cortex (the hard condition includes more visual information than the easy condition and thus yields more activation in the visual cortex), and (ii) divided a parcel that encompassed parts of both the precentral gyrus and the opercular portion of the inferior frontal gyrus according to the macroanatomical boundary.

For each participant, each set of masks was intersected with the participant’s activation map for the relevant contrast (sentences > nonwords for the language network, hard > easy spatial WM for the multiple demand network, and LD > HD for putative LD categorization regions). Within each mask, the voxels were sorted based on their t-values for the relevant contrast, and the top 10% of voxels were selected as that participant’s fROI. This top n% approach ensures that the fROIs can be defined in every participant, thus enabling us to generalize the results to the entire population (Nieto-Castañón and Fedorenko 2012).

Examining the functional response profiles of fROIs
After defining fROIs in individual participants, we evaluated their responses to the conditions of interest by averaging the responses across voxels to get a single value per condition per fROI. This fROI-level estimate of the BOLD response magnitude is our main effect of interest in this study (and the response magnitude averaged across participants constitutes a measure of the effect size).

The responses to the localizer conditions (sentences and nonwords for language fROIs, hard and easy WM conditions for multiple demand fROIs, and LD and HD categorization for categorization fROIs) were estimated using an across-runs cross-validation procedure, where one run was used to define the fROI and the other to estimate the response magnitudes, then the procedure was repeated switching the runs used for fROI definition versus response estimation, and finally the estimates were averaged to derive a single value per condition per fROI per participant. This cross-validation procedure allows one to use all of the data for defining the fROIs as well as for estimating their responses (see Nieto-Castañón and Fedorenko 2012, for discussion), while ensuring the independence of the data used for fROI definition and response estimation (Kriegeskorte et al. 2009). Two participants completed only one run of the multiple demand localizer task; therefore, we did not estimate the strength of their responses to the hard and easy multiple demand localizer conditions but ensured that the whole-brain activation maps for the hard > easy contrast showed the expected topography.

Statistical analyses
Similar to Studies 1 and 2, we analyzed our data using mixed effect regression models (Baayen et al. 2008). For accuracy, we use logistic regression (Jaeger 2008). For RT and fROI response magnitudes, we use linear regression. In all models, condition was a fixed effect and participant was a random intercept. The model for the multiple demand network included hemisphere as an additional fixed effect. For language and multiple demand network analyses, we also included fROI as a random intercept and then ran follow-up analyses on individual fROIs using false discovery rate (FDR) correction (Benjamini and Hochberg 1995) for the number of fROIs in each network. Behavioral analyses used sum coding for condition (LD vs. HD in the categorization task and Hard vs. Easy in the multiple demand localizer task). Neuroimaging analyses used custom contrasts (see Appendix 3 for detailed contrast specification). The mixed effect analyses were run using the lmer function from the lme4 R package (Bates et al. 2015); statistical significance of the effects was evaluated using the lmerTest package (Kuznetsova et al. 2017). The hypothesis-specific contrasts were defined using the hypr package (Rabe et al. 2020).

In sum, if linguistic resources are engaged during categorization, we would expect an overall high response of the language network to categorization conditions. Further, if, as L&M have argued, LD categorization taxes linguistic resources to a greater extent, we would expect to see a stronger response of this network to the LD compared with the HD condition. Lastly, if LD categorization is generally more taxing, we would expect to see greater responses to the LD condition in the domain-general multiple demand regions that are sensitive to effort across diverse tasks (Duncan 2010; Duncan 2013; Fedorenko et al. 2013; Hugdahl et al. 2015).

Results
Behavioral data
Multiple demand network localizer
Due to a technical error, behavioral data for one participant got overwritten. For the remaining thirteen participants, performance on the spatial WM task was as expected: participants were more accurate and faster in the easy condition (accuracy M = 93.91%, SD = 3.00%; reaction time (RT) M = 1.18 s, SD = 0.16 s) than the hard condition (accuracy M = 79.65%, SD = 12.03%; RT

Fig. 4. Categorization responses within the language brain network. (A) Parcels used to define fROIs in individual participants. (B) Average responses
within the language network to four conditions of interest (sentence reading and nonword reading vs. LD and HD categorization). (C) fROI responses to
the four conditions of interest.
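
The follow-up analyses on individual fROIs apply false discovery rate correction (Benjamini and Hochberg 1995) across the fROIs in each network. The Python sketch below shows one common way to compute Benjamini-Hochberg adjusted p-values. The paper's analyses were run in R, so this is only an illustration of the procedure; the function name `benjamini_hochberg` and the default `alpha` are assumptions, and no p-values from the study are reproduced here.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted_p_values, rejected_flags) in the input order.

    Hypothetical helper illustrating the Benjamini-Hochberg step-up
    procedure: one p-value per fROI within a network.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min

    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected
```

Because the correction depends on the number of tests, the same raw p-value can survive correction in a network with few fROIs and fail in a network with many.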

M = 1.52 s, SD = 0.25 s). Mixed effect models with condition as a fixed effect and participant as a random intercept showed that both accuracy and RT effects were significant (accuracy: β = −1.41, SE = 0.202, P < 0.001; RT: β = 0.33, SE = 0.027, P < 0.001).

Critical categorization task
The accuracies for the two categorization conditions did not significantly differ (LD M = 95.73%, SD = 4.20%; HD M = 95.44%, SD = 4.11%; LD > HD β = 0.14, SE = 0.20, P = 0.454). Similarly, there was no significant difference between response times in the LD condition (RT = 0.81 s, SD = 0.1 s) and the HD condition (RT = 0.84 s, SD = 0.1 s; LD > HD β = −0.03, SE = 0.02, P = 0.156).

Functional response profile of the language network
There was no significant difference between language network responses to LD and HD categorization (β = −0.02, SE = 0.10, P = 0.848). Overall, responses to the categorization task were barely above 0 (β = 0.42, SE = 0.19, P = 0.054; see Fig. 4), not significantly different from responses to nonword reading, the control condition in the language localizer task (β = 0.13, SE = 0.09, P = 0.144), and significantly weaker than responses to sentences (β = −1.49, SE = 0.09, P < 0.001).

Follow-up analyses in individual language fROIs (Appendix 2, Table 1) showed that responses to categorization were significantly above 0 in frontal fROIs (MFG, IFG, and IFGorb). However, none of the responses were significantly higher than responses during the control task, nonword reading, indicating that these responses are not language-specific. Thus, our results suggest that the language network does not support either LD or HD categorization in neurotypical participants.

Functional response profile of the multiple demand network
Multiple demand network response to LD categorization was higher than to HD categorization (β = 0.19, SE = 0.09, P = 0.025), indicating that, as predicted, LD categorization is more effortful. In general, multiple demand network responses to categorization were significantly above 0 (β = 1.07, SE = 0.21, P < 0.001; see Fig. 5) and stronger than responses to control conditions from the language localizer task (categorization > sentences: β = 0.73, SE = 0.08, P < 0.001; categorization > nonwords: β = 0.41, SE = 0.08, P < 0.001). However, they were weaker than responses to the spatial WM task (β = −1.43, SE = 0.07, P < 0.001), indicating that the WM task was more effortful. Responses to the categorization task were stronger in the left hemisphere (β = 0.24, SE = 0.09, P = 0.005). We also observed an interaction between the WM > categorization contrast and hemisphere (β = 0.29, SE = 0.13, P = 0.024), showing that the WM task engages the right hemisphere to a greater extent. There was also an interaction between the Hard > Easy WM task and hemisphere, such that the effect was greater in the right hemisphere (β = 0.38, SE = 0.19, P = 0.040).

Follow-up analyses on individual fROIs (Appendix 2, Table 2) showed that responses to categorization were significantly above 0 in all fROIs. However, they were weaker than the overall responses to the WM task in almost all fROIs (except the left middle frontal fROI). This result highlights the domain-general nature of these responses. Further, none of the fROIs had significantly different responses to LD and HD categories, despite the presence of this effect in the network-level analysis.

Whole-brain analyses
We also conducted a whole-brain analysis to identify fROIs that might respond more strongly to LD or HD categorization but lie outside the language and multiple demand fROIs described above. The GSS analysis (see Methods for details) revealed that no regions exhibited consistent HD > LD responses across participants; however, the LD > HD contrast revealed two parcels, both located in the left parietal lobe (Fig. 6). Further analysis of fROIs defined within these parcels showed that the LD > HD response only reached significance in fROI 2 (β = 0.43, SE = 0.17, P = 0.013), but not in fROI 1 (β = 0.58, SE = 0.30, P = 0.060). The overall categorization response was significantly above 0 in fROI 1 (β = 0.65, SE = 0.19, P = 0.001) but not fROI 2 (β = −0.13, SE = 0.15, P = 0.389).

Importantly, both fROIs responded to the WM task more strongly than to the categorization task (fROI 1: β = 1.66, SE = 0.21, P < 0.001; fROI 2: β = 0.64, SE = 0.12, P < 0.001), indicating that these regions likely respond to general cognitive effort rather than to LD categorization (or feature selection) specifically, and thus likely belong to the MD network. Neither of the two fROIs exhibited a sentences > nonwords effect; in fact, both showed a trend in the opposite direction (fROI 1: β = −0.51, SE = 0.30, P = 0.094; fROI 2: β = −0.28, SE = 0.17, P = 0.098), which shows that these regions do not respond to linguistic input.

The whole-brain analysis provides additional evidence against the LD-specific language recruitment hypothesis and shows that differences in LD versus HD categorization, if present, are likely caused by domain-general mechanisms.

Interim discussion
In the fMRI Experiment, we examined neural responses to LD and HD categorization. Our main goal was to evaluate the hypothesis that LD categorization relies more heavily on linguistic resources compared with HD categorization. For this purpose, we identified the language network individually in 14 healthy adults and examined its responses during LD and HD categorization. The

Fig. 5. Categorization responses within the multiple demand brain network. (A) Left hemisphere parcels used to define fROIs in individual participants.
(B) Average responses within the left hemisphere fROIs to four conditions of interest (hard and easy WM tasks vs. LD and HD categorization). (C) Left
hemisphere fROI responses to the four conditions of interest. (D–F) Parcels, average responses, and fROI-level responses in the right hemisphere.

Fig. 6. Results of the whole-brain analyses. (A) Parcels defined with the LD > HD categorization contrast. (B) Responses to conditions of interest within
the two fROIs (defined as the top 10% of voxels within each parcel, sorted by the magnitude of the LD > HD response). WM, working memory task.
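
The fROI definition used throughout (the top 10% of voxels within a parcel, sorted by their t-values for the relevant contrast) and the across-runs cross-validation used for the localizer conditions can be sketched in Python as follows. This is a simplified illustration, not the authors' MATLAB code: the function names, the representation of maps as per-voxel sequences, and the rounding rule for the number of voxels kept are all assumptions.

```python
def select_froi(t_values, parcel_voxels, top_fraction=0.10):
    """Pick the top fraction of parcel voxels by t-value (hypothetical helper).

    t_values: per-voxel t-values for the defining contrast, indexed by voxel.
    parcel_voxels: voxel indices that fall inside the parcel.
    """
    ranked = sorted(parcel_voxels, key=lambda v: t_values[v], reverse=True)
    # Rounding rule is an assumption; at least one voxel is always kept.
    n_keep = max(1, round(top_fraction * len(ranked)))
    return ranked[:n_keep]


def cross_validated_response(t_maps_by_run, effect_maps_by_run,
                             parcel_voxels, top_fraction=0.10):
    """Define the fROI in one run, estimate the response in the other runs,
    repeat for every defining run, and average the estimates."""
    estimates = []
    n_runs = len(t_maps_by_run)
    for define_run in range(n_runs):
        froi = select_froi(t_maps_by_run[define_run], parcel_voxels, top_fraction)
        for estimate_run in range(n_runs):
            if estimate_run == define_run:
                continue  # keep definition and estimation data independent
            effects = effect_maps_by_run[estimate_run]
            estimates.append(sum(effects[v] for v in froi) / len(froi))
    return sum(estimates) / len(estimates)
```

Because a fixed fraction of voxels is always selected, an fROI exists for every participant and parcel, which is the property the top n% approach is meant to guarantee.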

language network exhibited low responses to both categorization tasks, which did not differ from activations elicited by reading of nonword sequences (a low-level control condition). There was no difference between responses to LD and HD categories, contrary to the prediction that the language network would be selectively or preferentially engaged during LD categorization. Thus, we conclude that (i) the neuroimaging results disconfirm the LD-specific language recruitment hypothesis and (ii) the language network is not at all engaged in object categorization, highlighting a dissociation between linguistic processing and non-linguistic semantic cognition.

Unlike the language network, the domain-general multiple demand network (also defined individually in each participant) was engaged during categorization, indicating that this task is cognitively challenging. This network responded more strongly to LD than HD categorization, but this effect was small. The whole-brain analyses specifically aimed at identifying regions with stronger responses to LD than HD categorization confirmed that the two identified fROIs responded more strongly to a WM task than to a categorization task, and the LD > HD effect was small and/or not statistically significant. We conclude that categorization, and LD categorization in particular, relies on domain-general multiple demand regions and not on language-specific regions. Future work should examine whether the small difference between LD and HD categories is driven by a small subset of categories or whether it indeed reflects greater domain-general cognitive demands associated with all LD categorization.

Neuroimaging of healthy individuals provides a powerful complement to patient studies. Given the strong and selective engagement of the language network during all behaviors requiring access to linguistic representations (Fedorenko et al. 2010; Fedorenko et al. 2011; Menenti et al. 2011; Scott et al. 2017; Giglio et al. 2022; Hu et al. 2021, among others), the lack of activity in the language regions during categorization strongly suggests that they do not contribute to categorization (Mather et al. 2013). The response to categorization within the multiple demand network, on the other hand, indicates its involvement in categorization, even though we note that the fMRI evidence described here is correlational, not causal, and should be complemented with patient studies or brain stimulation studies that specifically target this hypothesis (that interfering with the activity in the multiple demand network or damage to this network should lead to impairments in categorization tasks). Neuroimaging evidence

is particularly helpful when patient studies do not produce conclusive results, as in our case.

Whereas some previous work suggested that a region within the left angular gyrus is involved in inhibiting irrelevant semantic information (Lewis et al. 2019), as may be required for LD categorization, the results of our study suggest that activation of the language-responsive portion of the left angular gyrus was comparable during LD and HD categorization. If anything, this language fROI showed numerically higher activation during HD categorization, suggesting that it may be recruited for recognizing and thinking about established sets more than for constructing novel sets that may require inhibition of object-irrelevant characteristics. We also did not find significant differences in the engagement of the language fROIs in the left inferior frontal cortex during LD and HD categorization. These results are in contrast to findings from Lupyan et al. (2012), which suggested that tDCS to the left inferior frontal cortex disrupted performance on LD but not HD categorization. The latter result might be explained by the fact that the left inferior frontal cortex contains not only language-responsive areas, but also multiple demand areas (Fedorenko et al. 2012; Fedorenko and Blank 2020), and interfering with the latter areas’ activity may have a disproportionately higher effect on LD categorization.

The response to categorization within the multiple demand network was stronger in the left hemisphere, consistent with the view that label-based categorization recruits the left hemisphere more strongly (Gilbert et al. 2006; Franklin et al. 2008). This makes the categorization task similar to logic and math, which also evoke left-lateralized responses within the multiple demand network (Monti et al. 2009; Pinel and Dehaene 2009; Monti et al. 2012; Amalric and Dehaene 2016). Importantly, our result demonstrates that, just because the function is left-lateralized, it is not necessarily related to language, at least not in fully formed brains (contra, e.g. Gilbert et al. 2006; see also Holmes and Wolff 2012).

All in all, results from the fMRI Experiment disconfirm the hypothesis that LD categorization relies on linguistic resources. Instead, they show that categorization recruits the multiple demand brain regions and that LD categorization is, on average, slightly more effortful than HD categorization.

Alternative account: semantic versus perceptual categories
Throughout this paper, we have adopted the LD/HD distinction proposed by L&M and tested their hypothesis using the same categories as those in their study. However, the LD/HD distinction might not be the only relevant distinction for testing the role of language in object categorization (see Section 6.2 for potential issues with this classification scheme). Therefore, we additionally tested an alternative hypothesis: that the language network would be selectively recruited for processing semantic categories (e.g. DANGEROUS ANIMALS) but not perceptual categories (e.g. THINGS THAT ARE BLUE). This classification does not fully align with the HD/LD distinction and instead reflects the view that language and semantic, or conceptual, processing are tightly linked (see, e.g. Binder et al. 2009; Binder and Desai 2011; cf. Patterson et al. 2007; Ivanova et al. 2021).

Method
We re-analyzed the data from the two aphasia studies and the fMRI experiment by re-coding the categories as either semantic or perceptual. The criterion we used was the following. For perceptual categorization, one does not need to know the identity of the object because the information required for categorization (e.g. color, length) is directly extractable from the image. For semantic categorization (e.g. danger level or typical location), however, the identity of the object is important. The result of this re-coding is reported in Appendix 1. The rest of the analyses were the same as those described for LD/HD category types.

Results and discussion
The results are shown in Appendices S2 and S3. In both aphasia studies, category type had no effect on accuracy, nor did it interact with participant group or BNT. However, semantic categorization overall elicited longer response times compared with perceptual categorization. This main effect of category type on response times interacted with participant group for both studies, but the interaction went in opposite directions across studies: in Study 1, individuals with low BNT showed an increased difference in response time between semantic and perceptual categories, whereas in Study 2, this gap was reduced. The results of the aphasia studies are therefore inconclusive but do not provide support for a consistent relationship between naming ability and categorization.

The neuroimaging results, however, are clear. The language network is not significantly recruited for either semantic or perceptual categories, reinforcing our conclusion that the cognitive mechanisms responsible for core language processing are not engaged in object categorization.

Given that the semantic nature of the category has an effect on response times during categorization tasks, future work should aim to disentangle category dimensionality and semantic content when designing the stimuli.

General discussion
We reported three studies that evaluated the hypothesis that linguistic resources are essential for performing feature-based, or LD, categorization, which we refer to as the “LD-specific language recruitment hypothesis” (Lupyan 2009; Lupyan et al. 2012; Lupyan and Mirman 2013; Langland et al. 2021). In Study 1, we aimed to replicate the results of Lupyan and Mirman (2013), who showed a selective impairment in LD categorization in individuals with aphasia. Our results failed to replicate this critical finding, although they did show that naming ability, as measured by BNT scores, was a significant predictor of overall categorization performance.

In Study 2, we modified the design to reduce general task complexity and examined the specific contribution of naming ability to categorization by recruiting a group of participants with very low naming scores. We found that, in accordance with the LD-specific language recruitment hypothesis, individuals with aphasia were more impaired on LD compared with HD categorization. However, a case-by-case analysis revealed that two individuals with a severe naming impairment (with scores of 1 and 4 out of 60 on the BNT) performed within the neurotypical range on both HD and LD categorization. Evidence from patients with brain lesions remains an important way to establish whether specific cognitive capacities support performance on particular tasks (Rorden and Karnath 2004), and dissociations are more important than associations in this kind of evidence (Caramazza and Coltheart 2006). Patient studies have previously demonstrated that many high-order cognitive functions are not affected by even severe linguistic deficits (e.g. Apperly et al. 2006; Bek et al. 2013; Chen et al. 2020; Varley et al. 2001, 2005; Varley and Siegal 2000; Willems et al. 2011; Ivanova et al. 2021). Based on Study 2, we

therefore concluded that lexical retrieval is not necessary for suc- Yet another possibility is that both naming and categorization
cessful categorization, including categorization based on single performance rely not only on domain-general, but also on
features. semantic control resources. Semantic control is a cognitive
In Study 3, we used a complementary approach and examined construct posited by several groups that investigate controlled
the engagement of the language network and a domain-general retrieval of conceptual information (e.g. Thompson-Schill et al.
multiple demand network in HD and LD categorization using fMRI 1997; Badre and Wagner 2002; Jefferies 2013; Lambon Ralph et al.
in neurotypical adults. The language network was not engaged 2017). Although the location of the putative regions responsible
during either LD or HD categorization: its responses did not for semantic control (or, more neutrally, semantic demand)
significantly differ from responses during the control, nonword resembles that of the language regions, precise localization
reading, task. This observation goes against the hypothesis that approaches in individual brains indicate that language, multiple
categorization (either LD or HD) relies on linguistic resources. demand, and semantic demand regions are spatially distinct
In contrast, the multiple demand network was recruited dur- (Ivanova et al. in prep). If semantic demand regions support
ing the categorization task, consistent with prior evidence of deliberate, controlled semantic tasks, damage to these regions
its involvement in diverse cognitively challenging tasks (Duncan might explain both categorization and naming difficulties in



2010; Duncan 2013; Fedorenko et al. 2013; Assem et al. 2020b). individuals with anomia. However, that would not constitute
It also responded more strongly during LD than HD categoriza- evidence in favor of the LD-specific language recruitment hypoth-
tion. Given extensive evidence that the multiple demand network esis: semantic demand regions get recruited both for verbal and
responds more strongly when the task is harder (e.g. Fedorenko nonverbal inputs (Ivanova et al. in prep) and are therefore not
et al. 2011; Fedorenko et al. 2013; Hugdahl et al. 2015; Shashidhara language-specific.
et al. 2019), the increased response during LD categorization is Future patient studies should explicitly test the cognitive con-
consistent with the hypothesis that LD categorization is more trol accounts of LD-selective categorization impairments. One
cognitively challenging. However, this effect was small and did way to do so is to use lesion mapping along with probabilis-
not come out as statistically significant in any of the individual tic maps of functional networks of interest (see, e.g. Woolgar
multiple demand regions in follow-up analyses. In sum, we find et al. 2018): this method allows explicitly determining which net-
little evidence in favor of the LD-specific language recruitment work (language, multiple demand, or semantic control) underlies
hypothesis. observed behavior patterns. Another way is to measure domain-
general and semantic cognitive control in individuals with brain
The cognitive control account of categorization damage and use them as predictors when evaluating the rela-
performance tionship between naming performance and categorization. Yet
The failure to replicate the results from L&M in Study 1 and an another approach would be to explore these relationships in neu-
only partial replication in Study 2 have several possible explana- rotypical participants by examining the correlational structure of
tions. The first explanation is that the effect described by L&M is these abilities across individuals. Such studies could provide addi-
real, but we could not detect it due to low power (e.g. small sample tional evidence in favor or against the cognitive control accounts
size). This explanation is unlikely because of our neuroimaging of categorization impairments, complementing our neuroimaging
results: if language was indeed required for LD categorization, the results and reconciling conflicting findings from individuals with
language network would be active during the LD categorization aphasia.
condition. The second explanation is that the result that was
reported by L&M is a false positive. The third explanation is that The relevance of LD versus HD distinction
the effect holds in a subset of individuals with aphasia, due to Why did we find no, few, or inconsistent differences in perfor-
comorbid cognitive control impairments. We cannot definitively mance and neural responses between LD and HD categories? A
rule out either the second or the third explanation, although our possible explanation is that “LD” and “HD” category types are not
neuroimaging results provide some support for the latter: the “natural kinds.” In the interest of replicability, we here chose to
multiple demand network, implicated in cognitively demanding keep the categories used by L&M for most analyses, but future
tasks, was somewhat more active during LD than during HD research will possibly refine or even abandon this distinction. As
categorization. discussed in the introduction, different researchers have empha-
The hypothesis that domain-general cognitive control deficits sized different distinctions among categories, such as natural/ad
underlie impaired categorization can also explain the link hoc, taxonomic/thematic, dense/sparse, concrete/abstract, etc.
between categorization and naming, which we observed in both Many of these distinctions are not isomorphic with the LD/HD dis-
Studies 1 and 2, and which was also reported by L&M. Con- tinction. In particular, HD categories encompass both taxonomic
frontation naming is a complex, multi-component behavior that (e.g. “animals”) and thematic (e.g. “non-food things found in the
involves not only linguistic, but also visual, motor-articulatory, kitchen”) categories. Multiple studies show that the processing
and critically, executive resources. Indeed, a recent fMRI study (Hu of taxonomic and thematic relations relies on distinct cognitive
et al. 2021) reports strong responses within the multiple demand and neural mechanisms (e.g. Kalénine et al. 2009; Sass et al. 2009;
network to an object naming condition. Furthermore, unlike syn- Schwartz et al. 2011; Lewis et al. 2015; Xu et al. 2018); collapsing
tactic comprehension, both naming ability and f luid intelligence them into a single “HD” category type leads to substantial within-
(a trait linked to the multiple demand network; Gläscher et al. HD heterogeneity and may therefore obscure potential HD/LD
2010; Woolgar et al. 2010; Woolgar et al. 2018) decline with age, and differences.
this decline is linked to decreased activity in the multiple demand In addition, there is currently no principled way of labeling
brain regions during both of these tasks (Samu et al. 2017). Thus, categories as LD versus HD. Different researchers might disagree
although both our work and L&M show a relationship between on whether items in a given category have few or many features in
naming and categorization, the underlying cognitive mechanism common: for instance, Lupyan and Mirman (2013) classify “things
of this relationship is likely related to cognitive control, not that f ly” as an HD category, even though the majority of members
language. in this category can be identified using an LD label “have wings”;
under other accounts (e.g. Langland-Hassan et al. 2021), “flying” might be a feature in and of itself, uniting objects that are otherwise highly diverse. The lack of clarity on what exactly constitutes an HD category makes it hard to generalize the results beyond the specific categories used in the study.

Furthermore, not all LD categories as defined by Lupyan and Mirman (2013) necessarily involve conceptual processing. For instance, many are based on color: e.g. “THINGS THAT ARE YELLOW”. Although color is often encoded as part of the conceptual representation of an object, this conceptual representation was not required for the task in question: participants were simply asked to indicate whether the object they were viewing was yellow, and decisions could be made on the basis of surface perceptual features alone. Thus, even if “true” (semantic) LD categories are indeed harder to process than HD categories, inclusion of perception-based color categories could have prevented us from reliably observing this difference.

Our results are somewhat inconsistent with recent work by Langland-Hassan et al. (2021), who observed that individuals with aphasia were slower and less accurate than healthy adults when processing abstract categories compared with concrete categories. The authors argue that the abstract/concrete distinction is similar to the LD/HD distinction because members of abstract categories share fewer common features. However, another important difference is the kind of features used for categorization. For instance, their example of an abstract category “predict” (which includes a weatherperson and a fortune-teller) relies on an unobservable functional similarity rather than on an observable visual similarity. Unobserved features play an important role in the use of verbal category labels (Gelman and Roberts 2017), so it is possible that language mediates categorization based on latent features rather than LD categorization per se. In short, the LD/HD and the abstract/concrete distinctions do not cleanly map onto each other, which makes it difficult to compare the results of our studies to those by Langland-Hassan et al. More generally, the typology of category types remains vague and inconsistent, and more careful work should be done to establish meaningful category distinctions and thus facilitate comparisons across studies.

Possible paradigm-specific effects of verbal labels

Even if we were able to successfully replicate L&M’s findings, our conclusions about the language–categorization link would be complicated by the fact that the paradigm introduced by L&M is not language-free. In order to successfully sort objects into categories, participants need to read (or hear) and encode the category label, presented verbally. The importance of language during the instruction encoding stage might account for the relationship between categorization performance and naming ability; it might even explain the (putative) LD-specific categorization impairments, given that category labels for LD categories are often longer. In Studies 2 and 3, we simplified the visual processing demands and separated the category-label instruction from the task, which allowed us to measure the behavioral and neural responses to categorization more clearly. Another solution to this issue would be to modify the paradigm to remove verbal labels altogether, e.g. by providing several category exemplars instead.

In addition, linguistic labels might contribute to the task via verbal rehearsal: participants might employ a phonological loop to maintain an active representation of the labels in WM. Such an assistive role of language labels has been observed in conditions of high cognitive demand (e.g. during mathematical calculation; Benn et al. 2012; Klessinger et al. 2012). However, such low-level verbal/phonological rehearsal appears to rely on lower-level speech processing mechanisms (e.g. Scott and Perrachione 2019) and the domain-general multiple-demand network (e.g. Fedorenko et al. 2011; Shashidhara et al. 2020), not on the language network. In any case, the verbal rehearsal account is quite different from L&M’s original LD-specific language recruitment hypothesis.

Relationship to other work on language and categorization

Other results from psycho- and neurolinguistics also support the view that linguistic resources do not typically mediate categorization in humans. If access to linguistic representations were necessary for categorization, categorizing images would take longer than categorizing words; instead, they take approximately the same amount of time (Potter and Faulconer 1975). When asked to match a picture with a label, participants do not explicitly generate/rehearse verbal labels in advance unless there is an additional memory demand (e.g. if images disappear from the screen) (Pontillo et al. 2015). Previous work also shows that language is not necessary for performing tasks that require isolating a specific aspect (“feature”) of the semantic representation, including theory of mind inferences (Varley and Siegal 2000; Varley et al. 2001; Apperly et al. 2006) and thematic role identification (Ivanova et al. 2021). Our work therefore adds to the growing body of evidence for a separation between linguistic and visual semantic processing.

That said, many studies have shown that linguistic labels influence categorization behavior in infants (e.g. Gershkoff-Stowe et al. 1997; Sloutsky and Fisher 2004; Plunkett et al. 2008; Waxman and Gelman 2009; Ferguson and Waxman 2017) and adults (e.g. Lupyan et al. 2007; Lupyan 2009; Brojde et al. 2011; Zettersten and Lupyan 2020), so the relationship between words and categories is clearly an important one. What we are showing here is that the mechanisms responsible for language processing are not engaged during object categorization, nor are they specifically recruited for LD categorization. It is possible that linguistic labels, once acquired, may influence categorization via other brain systems, e.g. semantic, domain-general, or perceptual. The cognitive and neural mechanisms underlying the influence of labels on categorization thus remain to be determined (for some modeling proposals, see Gliozzi et al. 2009; Lupyan 2012; Ivanova and Hofer 2020; Luo et al. 2023).

Overall, our study shows that categorizing items is not a language-dependent task in the adult brain, regardless of whether the categorization is made on the basis of multiple features (HD) or a single feature (LD). Instead, this task relies on the domain-general multiple demand system, which supports diverse goal-directed behaviors. Our work provides evidence against the view of language as an aid for feature-based (LD) categorization and highlights the value of complementing patient studies with neuroimaging experiments.

Acknowledgments (including disclaimers and address of the corresponding author)

We would like to acknowledge the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT, and its support team (Steve Shannon and Atsushi Takahashi). The authors thank Naveen Hanif and Anis Adila Khairil Anuar, who helped with the development and piloting of the paradigms used in Studies 1 and 2, and Alvincé Pongos for help with data analysis.
CRediT taxonomy

Yael Benn (Conceptualization, Formal analysis, Methodology, Project administration, Software, Validation, Writing – original draft, Writing – review and editing), Anna Ivanova (Conceptualization, Data curation, Formal analysis, Validation, Visualization, Writing – original draft, Writing – review and editing), Oliver Clark (Formal analysis, Software, Validation), Zachary Mineroff (Formal analysis, Investigation, Methodology), Chloe Seikus (Investigation, Methodology), Jack Santos Silva (Investigation, Methodology), Rosemary Varley (Conceptualization, Investigation, Methodology, Supervision, Writing – original draft, Writing – review and editing), Evelina Fedorenko (Conceptualization, Methodology, Supervision, Writing – original draft, Writing – review and editing).

Supplementary material

Supplementary material is available at Cerebral Cortex online.

Funding

This research was supported by National Institutes of Health awards (R00-HD057522, R01-DC016607, R01-DC016950 to E.F.), a grant from the Simons Foundation to the Simons Center for the Social Brain at MIT, and funds from the Brain and Cognitive Sciences Department and the McGovern Institute for Brain Research at MIT. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflict of interest statement: None declared.

References

Amalric M, Dehaene S. Origins of the brain networks for advanced mathematics in expert mathematicians. Proc Natl Acad Sci U S A. 2016:113(18):4909–4917.
Amalric M, Dehaene S. A distinct cortical network for mathematical knowledge in the human brain. NeuroImage. 2019:189:19–31.
Amit E, Hoeflin C, Hamzah N, Fedorenko E. An asymmetrical relationship between verbal and visual thinking: converging evidence from behavior and fMRI. NeuroImage. 2017:152:619–627.
Apperly IA, Samson D, Carroll N, Hussain S, Humphreys G. Intact first- and second-order false belief reasoning in a patient with severely impaired grammar. Soc Neurosci. 2006:1(3–4):334–348.
Ashburner J, Friston KJ. Unified segmentation. NeuroImage. 2005:26(3):839–851.
Ashby FG, Ell SW. The neurobiology of human category learning. Trends Cogn Sci. 2001:5(5):204–210.
Ashby FG, O’Brien JB. Category learning and multiple memory systems. Trends Cogn Sci. 2005:9(2):83–89.
Ashby FG, Alfonso-Reese LA, Turken AU, Waldron EM. A neuropsychological theory of multiple systems in category learning. Psychol Rev. 1998:105(3):442–481.
Assem M, Blank IA, Mineroff Z, Ademoğlu A, Fedorenko E. Activity in the fronto-parietal multiple-demand network is robustly associated with individual differences in working memory and fluid intelligence. Cortex. 2020a:131:1–16.
Assem M, Glasser MF, Van Essen DC, Duncan J. A domain-general cognitive core defined in multimodally parcellated human cortex. Cereb Cortex. 2020b:30(8):4361–4380.
Baayen RH, Davidson DJ, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. J Mem Lang. 2008:59(4):390–412.
Badecker W, Caramazza A. On considerations of method and theory governing the use of clinical categories in neurolinguistics and cognitive neuropsychology: the case against agrammatism. Cognition. 1985:20(2):97–125.
Badecker W, Nathan P, Caramazza A. Varieties of sentence comprehension deficits: a case study. Cortex. 1991:27(2):311–321.
Badre D, Wagner AD. Semantic retrieval, mnemonic control, and prefrontal cortex. Behav Cogn Neurosci Rev. 2002:1(3):206–218.
Bain A. The senses and the intellect. London: John W. Parker & Son; 1855.
Baldo JV, Bunge SA, Wilson SM, Dronkers NF. Is relational reasoning dependent on language? A voxel-based lesion symptom mapping study. Brain Lang. 2010:113(2):59–64.
Baldo JV, Paulraj SR, Curran BC, Dronkers NF. Impaired reasoning and problem-solving in individuals with language impairment due to aphasia or language delay. Front Psychol. 2015:6:1523.
Barsalou LW. Ad hoc categories. Mem Cogn. 1983:11(3):211–227.
Bates E, Wilson SM, Saygin AP, Dick F, Sereno MI, Knight RT, Dronkers NF. Voxel-based lesion-symptom mapping. Nat Neurosci. 2003:6(5):448–450.
Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015:67(1):1–48.
Bek J, Blades M, Siegal M, Varley RA. Dual-task interference in spatial reorientation: linguistic and nonlinguistic factors. Spatial Cognition & Computation. 2013:13(1):26–49.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995:57(1):289–300.
Benn Y, Zheng Y, Wilkinson ID, Siegal M, Varley R. Language in calculation: a core mechanism? Neuropsychologia. 2012:50(1):1–10.
Benn Y, Wilkinson ID, Zheng Y, Kadosh KC, Romanowski CAJ, Siegal M, Varley R. Differentiating core and co-opted mechanisms in calculation: the neuroimaging of calculation in aphasia. Brain Cogn. 2013:82(3):254–264.
Bermúdez JL. Thinking without words. Oxford University Press; 2007.
Bickerton D. Language and human behavior. Seattle, WA: University of Washington Press; 1995.
Binder JR, Desai RH. The neurobiology of semantic memory. Trends Cogn Sci. 2011:15(11):527–536.
Binder JR, Desai RH, Graves WW, Conant LL. Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cereb Cortex. 2009:19(12):2767–2796.
Blank IA, Kanwisher N, Fedorenko E. A functional dissociation between language and multiple-demand systems revealed in patterns of BOLD signal fluctuations. J Neurophysiol. 2014:112(5):1105–1118.
Blank IA, Balewski Z, Mahowald K, Fedorenko E. Syntactic processing is distributed across the language system. NeuroImage. 2016:127:307–323.
Blumstein SE. Neurolinguistics: an overview of language–brain relations in aphasia. In: Newmeyer FJ, editor. Linguistics: The Cambridge Survey: Volume 3: language: psychological and biological aspects [Internet]. Vol. 3. Cambridge: Cambridge University Press; 1988 [accessed 2021 Sep 23]; p. 210–236.
Braga RM, DiNicola LM, Becker HC, Buckner RL. Situating the left-lateralized language network in the broader organization of multiple specialized large-scale distributed networks. J Neurophysiol. 2020:124(5):1415–1448.
Brojde CL, Porter C, Colunga E. Words can slow down category learning. Psychon Bull Rev. 2011:18(4):798–804.
Burger RA, Muma JR. Cognitive distancing in mediated categorization in aphasia. J Psycholinguist Res. 1980:9(4):355–365.
Caramazza A, Badecker W. Patient classification in neuropsychological research. Brain Cogn. 1989:10(2):256–295.
Caramazza A, Coltheart M. Cognitive neuropsychology twenty years on. Cogn Neuropsychol. 2006:23(1):3–12.
Caramazza A, McCloskey M. The case for single-patient studies. Cogn Neuropsychol. 1988:5(5):517–527.
Caramazza A, Berndt RS, Brownell HH. The semantic deficit hypothesis: perceptual parsing and object classification by aphasic patients. Brain Lang. 1982:15(1):161–189.
Carruthers P. The cognitive functions of language. Behav Brain Sci. 2002:25(6):657–674; discussion 674–725.
Chen X, Affourtit J, Norman-Haignere S, Jouravlev O, Malik-Moraleda S, Kean HH, Regev T, McDermott JH, Fedorenko E. The fronto-temporal language system does not support the processing of music. Society for Neurobiology of Language. 2020.
Coetzee JP, Monti MM. At the core of reasoning: dissociating deductive and non-deductive load. Hum Brain Mapp. 2018:39(4):1850–1861.
Cohen R, Woll G. Facets of analytical processing in aphasia: a picture ordering task. Cortex. 1981:17(4):557–569.
Cohen R, Kelter S, Woll G. Analytical competence and language impairment in aphasia. Brain Lang. 1980:10(2):331–347.
IBM Corp. IBM SPSS statistics for Windows, Version 22.0. Armonk, NY: IBM Corp., 2013. https://hadoop.apache.org.
Couchman JJ, Coutinho MVC, Smith JD. Rules and resemblance: their changing balance in the category learning of humans (Homo sapiens) and monkeys (Macaca mulatta). J Exp Psychol Anim Behav Process. 2010:36(2):172–183.
Darwin C. The descent of man and selection in relation to sex. London: John Murray; 1871.
Davidoff J, Roberson D. Preserved thematic and impaired taxonomic categorisation: a case study. Lang Cogn Process. 2004:19(1):137–174.
De Renzi E, Spinnler H. Impaired performance on color tasks in patients with hemispheric damage. Cortex. 1967:3(2):194–217.
Dennett DC. The role of language in intelligence. In: Khalfa J, editor. What is intelligence? Cambridge, UK: the Darwin College lectures. Cambridge: Cambridge University Press; 1994.
Duncan J. The multiple-demand (MD) system of the primate brain: mental programs for intelligent behaviour. Trends Cogn Sci (Regul Ed). 2010:14(4):172–179.
Duncan J. The structure of cognition: attentional episodes in mind and brain. Neuron. 2013:80(1):35–50.
Duncan J, Owen AM. Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends Neurosci. 2000:23(10):475–483.
Elwert F, Winship C. Endogenous selection bias: the problem of conditioning on a collider variable. Annu Rev Sociol. 2014:40(1):31–53.
Fedorenko E. The early origins and the growing popularity of the individual-subject analytic approach in human neuroscience. Curr Opin Behav Sci. 2021:40:105–112.
Fedorenko E, Blank IA. Broca’s area is not a natural kind. Trends Cogn Sci. 2020:24(4):270–284.
Fedorenko E, Hsieh P-J, Nieto-Castañón A, Whitfield-Gabrieli S, Kanwisher N. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. J Neurophysiol. 2010:104(2):1177–1194.
Fedorenko E, Behr MK, Kanwisher N. Functional specificity for high-level linguistic processing in the human brain. Proc Natl Acad Sci. 2011:108(39):16428–16433.
Fedorenko E, Duncan J, Kanwisher N. Language-selective and domain-general regions lie side by side within Broca’s area. Curr Biol. 2012:22(21):2059–2062.
Fedorenko E, Duncan J, Kanwisher N. Broad domain generality in focal regions of frontal and parietal cortex. Proc Natl Acad Sci U S A. 2013:110(41):16616–16621.
Fedorenko E, Blank IA, Siegelman M, Mineroff Z. Lack of selectivity for syntax relative to word meanings throughout the language network. Cognition. 2020:203:104348.
Ferguson B, Waxman S. Linking language and categorization in infancy. J Child Lang. 2017:44(3):527–552.
Franklin A, Drivonikou GV, Clifford A, Kay P, Regier T, Davies IRL. Lateralization of categorical perception of color changes with color term acquisition. Proc Natl Acad Sci. 2008:105(47):18221–18225.
Friston KJ, Ashburner J, Frith CD, Poline J-B, Heather JD, Frackowiak RSJ. Spatial registration and normalization of images. Hum Brain Mapp. 1995:3(3):165–189.
Friston KJ, Holmes AP, Worsley KJ, Poline J-P, Frith CD, Frackowiak RSJ. Statistical parametric maps in functional imaging: a general linear approach. Hum Brain Mapp. 1994:2(4):189–210.
Gainotti G, D’Erme P, Villa G, Caltagirone C. Focal brain lesions and intelligence: a study with a new version of Raven’s colored matrices. J Clin Exp Neuropsychol. 1986:8(1):37–50.
Gelman SA, Roberts SO. How language shapes the cultural inheritance of categories. PNAS. 2017:114(30):7900–7907.
Gershkoff-Stowe L, Thal DJ, Smith LB, Namy LL. Categorization and its developmental relation to early language. Child Dev. 1997:68(5):843–859.
Giglio L, Ostarek M, Weber K, Hagoort P. Commonalities and asymmetries in the neurobiological infrastructure for language production and comprehension. Cerebral Cortex. 2022:32(7):1405–1418.
Gilbert AL, Regier T, Kay P, Ivry RB. Whorf hypothesis is supported in the right visual field but not the left. Proc Natl Acad Sci U S A. 2006:103(2):489–494.
Gläscher J, Rudrauf D, Colom R, Paul LK, Tranel D, Damasio H, Adolphs R. Distributed neural system for general intelligence revealed by lesion mapping. PNAS. 2010:107(10):4705–4709.
Gliozzi V, Mayor J, Hu J-F, Plunkett K. Labels as features (not names) for infant categorization: a neurocomputational approach. Cogn Sci. 2009:33(4):709–738.
Goodglass H, Geschwind N. Language disturbance (aphasia). In: Carterette EC, Friedman MP, editors. Handbook of perception. Vol. 7. New York: Academic Press; 1976. pp. 389–428.
Goodglass H, Kaplan E, Weintraub S. Boston Naming Test. Philadelphia, PA: Lea & Febiger; 1983.
Goodglass H, Kaplan E, Weintraub S. BDAE: The Boston diagnostic aphasia examination. Philadelphia, PA: Lippincott Williams & Wilkins; 2001.
Higby E, Cahana-Amitay D, Vogel-Eyny A, Spiro A, Albert ML, Obler LK. The role of executive functions in object- and action-naming among older adults. Exp Aging Res. 2019:45(4):306–330.
Hjelmquist EK. Concept formation in non-verbal categorization tasks in brain-damaged patients with and without aphasia. Scand J Psychol. 1989:30(4):243–254.
Holmes KJ, Wolff P. Does categorical perception in the left hemisphere depend on language? J Exp Psychol Gen. 2012:141(3):439–443.
Hough MS. Categorization in aphasia: access and organization of goal-derived and common categories. Aphasiology. 1993:7(4):335–357.
Hu J, Small H, Kean H, Takahashi A, Zekelman L, Kleinman D, Ryan E, Ferreira V, Fedorenko E. The language network supports
both lexical access and sentence generation during language production. Biorxiv. 2021:2021–09.
Hugdahl K, Raichle ME, Mitra A, Specht K. On the existence of a generalized non-specific task-dependent network. Front Hum Neurosci [Internet]. 2015:9:430.
Hulleman J, Humphreys GW. Maximizing the power of comparing single cases against a control sample: an argument, a program for making comparisons, and a worked example from the Pyramids and Palm Trees Test. Cogn Neuropsychol. 2007:24(3):279–291.
Ivanova AA, Hofer M. Linguistic overhypotheses in category learning: explaining the label advantage effect. In: Proceedings of the 42nd Annual Conference of the Cognitive Science Society. Cognitive Science Society; 2020, p. 723–729.
Ivanova AA, Mineroff Z, Zimmerer V, Kanwisher N, Varley R, Fedorenko E. The language network is recruited but not required for nonverbal event semantics. Neurobiology of Language. 2021:2(2):176–201.
Jaeger TF. Categorical data analysis: away from ANOVAs (transformation or not) and towards logit mixed models. J Mem Lang. 2008:59(4):434–446.
Jefferies E. The neural basis of semantic cognition: converging evidence from neuropsychology, neuroimaging and TMS. Cortex. 2013:49(3):611–625.
Julian JB, Fedorenko E, Webster J, Kanwisher N. An algorithmic method for functionally defining regions of interest in the ventral visual pathway. NeuroImage. 2012:60(4):2357–2364.
Kalénine S, Peyrin C, Pichat C, Segebarth C, Bonthoux F, Baciu M. The sensory-motor specificity of taxonomic and thematic conceptual relations: a behavioral and fMRI study. NeuroImage. 2009:44(3):1152–1162.
Kan IP, Thompson-Schill SL. Selection from perceptual and conceptual representations. Cogn Affect Behav Neurosci. 2004:4(4):466–482.
Kemler Nelson DG. The effect of intention on what concepts are acquired. J Verbal Learn Verbal Behav. 1984:23(6):734–759.
Kim HS. We talk, therefore we think? A cultural analysis of the effect of talking on thinking. J Pers Soc Psychol. 2002:83(4):828–842.
Klessinger N, Szczerbinski M, Varley RA. The role of number words: the phonological length effect in multidigit addition. Mem Cogn. 2012:40(8):1289–1302.
Kloos H, Sloutsky VM. What’s behind different kinds of kinds: effects of statistical density on learning and representation of categories. J Exp Psychol Gen. 2008:137(1):52–72.
Koemeda-Lutz M, Cohen R, Meier E. Organization of and access to semantic memory in aphasia. Brain Lang. 1987:30(2):321–337.
Kriegeskorte N, Simmons WK, Bellgowan PSF, Baker CI. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci. 2009:12(5):535–540.
Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest package: tests in linear mixed effects models. J Stat Softw. 2017:82(1):1–26.
Lambon Ralph MA, Jefferies E, Patterson K, Rogers TT. The neural and computational bases of semantic cognition. Nat Rev Neurosci. 2017:18(1):42–55.
Langland-Hassan P, Faries FR, Gatyas M, Dietz A, Richardson MJ. Assessing abstract thought and its relation to language with a new nonverbal paradigm: evidence from aphasia. Cognition. 2021:211:104622.
Le Dorze G, Nespoulous JL. Anomia in moderate aphasia: problems in accessing the lexical representation. Brain Lang. 1989:37(3):381–400.
Lewis GA, Poeppel D, Murphy GL. The neural bases of taxonomic and thematic conceptual relations: an MEG study. Neuropsychologia. 2015:68:176–189.
Lewis GA, Poeppel D, Murphy GL. Contrasting semantic versus inhibitory processing in the angular gyrus: an fMRI study. Cereb Cortex. 2019:29(6):2470–2481.
Luo X, Sexton NJ, Love BC. A deep learning account of how language affects thought. Language, Cognition and Neuroscience. 2023:38(4):499–508.
Lupyan G. Extracommunicative functions of language: verbal interference causes selective categorization impairments. Psychon Bull Rev. 2009:16(4):711–718.
Lupyan G. Linguistically modulated perception and cognition: The label-feedback hypothesis. Front Psychol. 2012:3:54.
Lupyan G, Casasanto D. Meaningless words promote meaningful categorization. Lang Cogn. 2015:7(2):167–193.
Lupyan G, Mirman D. Linking language and categorization: evidence from aphasia. Cortex. 2013:49(5):1187–1194.
Lupyan G, Rakison DH, McClelland JL. Language is not just for talking: redundant labels facilitate learning of novel categories. Psychol Sci. 2007:18(12):1077–1083.
Lupyan G, Mirman D, Hamilton R, Thompson-Schill SL. Categorization is modulated by transcranial direct current stimulation over left prefrontal cortex. Cognition. 2012:124(1):36–49.
Mahowald K, Fedorenko E. Reliable individual-level neural markers of high-level language processing: a necessary precursor for relating neural variability to behavioral and genetic variability. NeuroImage. 2016:139:74–93.
Mareschal D, Quinn PC. Categorization in infancy. Trends Cogn Sci. 2001:5(10):443–450.
Mather M, Cacioppo JT, Kanwisher N. How fMRI can inform cognitive theories. Perspect Psychol Sci. 2013:8(1):108–113.
Menenti L, Gierhan SME, Segaert K, Hagoort P. Shared language: overlap and segregation of the neuronal infrastructure for speaking and listening revealed by functional MRI. Psychol Sci. 2011:22(9):1173–1182.
Mervis CB, Rosch E. Categorization of natural objects. Annu Rev Psychol. 1981:32(1):89–115.
Mineroff Z, Blank IA, Mahowald K, Fedorenko E. A robust dissociation among the language, multiple demand, and default mode networks: evidence from inter-region correlations in effect size. Neuropsychologia. 2018:119:501–511.
Mirman D, Landrigan J-F, Britt AE. Taxonomic and thematic semantic systems. Psychol Bull. 2017:143(5):499–520.
Monti MM, Parsons LM, Osherson DN. The boundaries of language and thought in deductive inference. Proc Natl Acad Sci. 2009:106(30):12554–12559.
Monti MM, Parsons LM, Osherson DN. Thought beyond language: neural dissociation of algebra and natural language. Psychol Sci. 2012:23(8):914–922.
Murphy G. The big book of concepts. Cambridge, MA: MIT Press; 2002.
Nieto-Castañon A. Handbook of functional connectivity magnetic resonance imaging methods in CONN. Boston: Hilbert Press; 2020.
Nieto-Castañón A, Fedorenko E. Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. NeuroImage. 2012:63(3):1646–1669.
Patterson K, Nestor PJ, Rogers TT. Where do you know what you know? The representation of semantic knowledge in the human brain. Nat Rev Neurosci. 2007:8(12):976–987.
Pearce JM. Discrimination and categorization. In: Mackintosh NJ, editor. Animal learning and cognition [Internet]. San Diego, CA: Academic Press; 1994 [accessed 2021 Mar 11]. pp. 109–134.
Perry LK, Lupyan G. The role of language in multi-dimensional categorization: evidence from transcranial direct current stimulation and exposure to verbal labels. Brain Lang. 2014:135:66–72.
Petersen SE, Posner MI. The attention system of the human brain: 20 years after. Annu Rev Neurosci. 2012:35:73–89.
Pinel P, Dehaene S. Beyond hemispheric dominance: brain regions underlying the joint lateralization of language and arithmetic to the left hemisphere. J Cogn Neurosci. 2009:22(1):48–66.
Plunkett K, Hu J-F, Cohen LB. Labels can override perceptual categories in early infancy. Cognition. 2008:106(2):665–681.
Pontillo DF, Salverda AP, Tanenhaus MK. 2015. Flexible use of phonological and visual memory in language-mediated visual search. In: Proceedings of the 37th Meeting of the Cognitive Science Society. Pasadena, California.
Posner MI, Petersen SE. The attention system of the human brain. Annu Rev Neurosci. 1990:13:25–42.
Potter MC, Faulconer BA. Time to understand pictures and words. Nature. 1975:253(5491):437–438.
Rabe MM, Vasishth S, Hohenstein S, Kliegl R, Schad DJ. hypr: an R package for hypothesis-driven contrast coding. J Open Source Softw. 2020:5(48):2134.
Rorden C, Karnath H-O. Using human brain lesions to infer function: a relic from a past era in the fMRI age? Nat Rev Neurosci. 2004:5(10):812–819.
Rossion B, Pourtois G. Revisiting Snodgrass and Vanderwart’s object pictorial set: the role of surface detail in basic-level object recognition. Perception. 2004:33(2):217–236.
Sachs O, Weis S, Zellagui N, Huber W, Zvyagintsev M, Mathiak K, Kircher T. Automatic processing of semantic relations in fMRI: neural activation during semantic priming of taxonomic and thematic categories. Brain Res. 2008:1218:194–205.
Samu D, Campbell KL, Tsvetanov KA, Shafto MA, Tyler LK. Preserved cognitive functions with age are determined by domain-dependent shifts in network responsivity. Nat Commun. 2017:8(1):14743.
Sass K, Sachs O, Krach S, Kircher T. Taxonomic and thematic categories: neural correlates of categorization in an auditory-to-visual priming task using fMRI. Brain Res. 2009:1270:78–87.
Saxe R, Brett M, Kanwisher N. Divide and conquer: a defense of functional localizers. NeuroImage. 2006:30(4):1088–1096; discussion 1097–1099.
Schwartz MF, Kimberg DY, Walker GM, Brecher A, Faseyitan OK, Dell GS, Mirman D, Coslett HB. Neuroanatomical dissociation for taxonomic and thematic knowledge in the human brain. Proc Natl Acad Sci U S A. 2011:108(20):8520–8524.
Scott TL, Perrachione TK. Common cortical architectures for phonological working memory identified in individual brains. NeuroImage. 2019:202:116096.
Scott TL, Gallée J, Fedorenko E. A new fun and robust version of an fMRI localizer for the frontotemporal language system. Cogn Neurosci. 2017:8(3):167–176.
Shashidhara S, Mitchell DJ, Erez Y, Duncan J. Progressive recruitment of the frontoparietal multiple-demand system with increased task complexity, time pressure, and reward. J Cogn Neurosci. 2019:31(11):1617–1630.
Shashidhara S, Spronkers FS, Erez Y. Individual-subject functional localization increases univariate activation but not multivariate pattern discriminability in the “multiple-demand” frontoparietal network. J Cogn Neurosci. 2020:32(7):1348–1368.
Siegal M, Varley R. Aphasia, language, and theory of mind. Soc Neurosci. 2006:1(3–4):167–174.
Silbert LJ, Honey CJ, Simony E, Poeppel D, Hasson U. Coupled neural systems underlie the production and comprehension of naturalistic narrative speech. Proc Natl Acad Sci. 2014:111(43):E4687–E4696.
Sloutsky VM. From perceptual categories to concepts: what develops? Cogn Sci. 2010:34(7):1244–1286.
Sloutsky VM, Fisher AV. Induction and categorization in young children: a similarity-based model. J Exp Psychol Gen. 2004:133(2):166–188.
Smith LB, Heise D. Perceptual similarity and conceptual structure. In: Burns B, editor. Advances in psychology. Vol. 93. North-Holland: Elsevier; 1992. pp. 233–272.
Smith EE, Medin DL. Categories and concepts. Cambridge MA: Harvard University Press; 1981.
Thompson-Schill SL, D’Esposito M, Aguirre GK, Farah MJ. Role of left inferior prefrontal cortex in retrieval of semantic knowledge: a reevaluation. Proc Natl Acad Sci. 1997:94(26):14792–14797.
Vallila-Rohter S, Kiran S. An examination of strategy implementation during abstract nonlinguistic category learning in aphasia. J Speech Lang Hear Res. 2015:58(4):1195–1209.
Varley RA, Siegal M. Evidence for cognition without grammar from causal reasoning and “theory of mind” in an agrammatic aphasic patient. Curr Biol. 2000:10(12):723–726.
Varley RA, Siegal M, Want SC. Severe impairment in grammar does not preclude theory of mind. Neurocase. 2001:7(6):489–493.
Varley RA, Klessinger NJC, Romanowski CAJ, Siegal M. Agrammatic but numerate. Proc Natl Acad Sci U S A. 2005:102(9):3519–3524.
Wasserman E, Kiedinger RE, Bhatt R. Conceptual behavior in pigeons: Categories, subcategories, and pseudocategories. Journal of Experimental Psychology: Animal Behavior Processes. 1988:14(3):235.
Waxman SR, Gelman SA. Early word-learning entails reference, not merely associations. Trends Cogn Sci. 2009:13(6):258–263.
Welch LW, Doineau D, Johnson S, King D. Educational and gender normative data for the Boston naming test in a group of older adults. Brain Lang. 1996:53(2):260–266.
Whitehouse P, Caramazza A, Zurif E. Naming in aphasia: interacting effects of form and function. Brain Lang. 1978:6(1):63–74.
Willems RM, Benn Y, Hagoort P, Toni I, Varley RA. Communicating without a functioning language system: implications for the role of language in mentalizing. Neuropsychologia. 2011:49(11):3130–3135.
Willems RM, der Haegen LV, Fisher SE, Francks C. On the other hand: including left-handers in cognitive neuroscience and neurogenetics. Nat Rev Neurosci. 2014:15(3):193–201.
Wilson SM, Entrup JL, Schneck SM, Onuscheck CF, Levy DF, Rahman M, Willey E, Casilio M, Yen M, Brito AC, et al. Recovery from aphasia in the first year after stroke. Brain. 2023:146(3):1021–1039.
Woolgar A, Parr A, Cusack R, Thompson R, Nimmo-Smith I, Torralva T, Roca M, Antoun N, Manes F, Duncan J. Fluid intelligence loss linked to restricted regions of damage within frontal and parietal cortex. Proc Natl Acad Sci. 2010:107(33):14899–14902.
Woolgar A, Duncan J, Manes F, Fedorenko E. Fluid intelligence is supported by the multiple-demand system not the language system. Nat Hum Behav. 2018:2(3):200–204.
Xu Y, Xiaosha W, Xiaoying W, Men W, Gao J-H, Bi Y. Doctor, teacher, and stethoscope: neural representation of different types of semantic relations. J Neurosci. 2018:38(13):3303–3317.
Zec RF, Burkett NR, Markwell SJ, Larsen DL. Normative data stratified for age, education, and gender on the Boston Naming Test. Clin Neuropsychol. 2007:21(4):617–637.
Zettersten M, Lupyan G. Finding categories through words: more nameable features improve category learning. Cognition. 2020:196:104135.
