CHAPTER 2
Systematics, Phylogeny, and Classifications
The affinities of all the beings of the same class have sometimes been represented
by a great tree. . . . As buds give rise by growth to fresh buds, and these, if
vigorous, branch out and overtop on all sides many a feebler branch, so by
generation I believe it has been with the great Tree of Life, which fills with its
dead and broken branches the crust of the earth, and covers the surface with
its ever branching and beautiful ramifications.
Charles Darwin, The Origin of Species, 1859
Systematics is the study of biological diversity and its origins. Systematists identify and distinguish species, describe and name new taxa, provide tools to aid others in identifying specimens, infer the evolutionary relationships among species and higher taxa, undertake biogeographic analyses, and produce classification systems that reflect evolutionary history. These stages in systematic research overlap and cycle back on themselves in a highly iterative fashion. In sum, the role of systematics is to document and understand Earth’s biological diversity, to reconstruct the history of that biodiversity, and to develop natural (evolutionary) classifications of living and extinct organisms.
Just as your own family has grown through time and you are connected to your parents and your grandparents, all organisms that have ever lived on Earth are related to one another through genetic descent. Phylogenetic trees are diagrams comprising branches and nodes that depict this flow of genetic information over time to illustrate how organisms are related. Since all organisms are kin to one another, there is a single tree of life, whose branching pattern reflects the evolutionary relationships among all species on Earth (Darwin’s “great Tree of Life”). However, because multicellular life has been evolving on Earth for hundreds of millions of years, and we cannot time travel, we don’t know for certain how all of the branches of the tree are connected to one another. That is, there is not a universal record of the pattern of genetic descent since life began. But there are traces of that pattern embedded in the anatomy and genomes of all of the organisms roaming the Earth today and even in some that are no longer among us (e.g., fossils can be studied anatomically, and ancient DNA allows us to study the genomes of some organisms that died more than one million years ago). Systematists seek to uncover those traces to generate hypotheses of how organisms and species are related to one another. They do this by creating and analyzing data matrices of homologous anatomical or genetic traits, formally called characters, among organisms. As new data become available and are added to the analysis, trees are revised and updated. As we will see throughout this book, many long-held ideas about animal relationships have recently been altered (or even overturned) by analyses of DNA sequence data. In this chapter, we will present the underpinnings of phylogenetic analysis and the construction of classifications.

The chapter opener photo shows representatives of two phyla that are only distantly related, deeply in time: sponges (Porifera) and crabs (Arthropoda). Both are ancient phyla that arose in Precambrian seas more than 550 million years ago.

Phylogeny, Monophyly, Paraphyly, and Polyphyly

A phylogenetic tree is a branching diagram that depicts how organisms are related to one another (Figure 2.1). It is a graphical means of expressing relationships (or genetic connectivity) among species or other taxa and is, in fact, a nested set of clades. Understanding the concept of a clade (or monophyletic group) is critical to being able to read a tree correctly. A monophyletic group, or clade, is a group of species that includes a common ancestor and all of the descendants of that common ancestor; that is, it is a unique branch in the tree of life. Monophyletic groups can be separated from the rest of the tree of life by cutting a single branch once. If you were able to walk up to the tree of life and cut off one branch, any branch, and hold that trimming in your hand, you would be holding a monophyletic group (Figure 2.1). A monophyletic group is a group of taxa that are one another’s closest relatives. You and your immediate family (you, your parents, and your siblings) are one example of a monophyletic group.

Most trees published in the literature (and in this text) are rooted, and rooted trees not only show a pattern of relationship but also indicate the direction in which time is moving along the branches, always from the root (oldest) to the tips (youngest) of the tree, no matter how the tree is drawn (Figure 2.2). Thus, rooted trees depict ancestor-descendant relationships.
FIGURE 2.1. Learning to read trees. This phylogeny depicts the relationships among six taxa (A–F). Phylogenetic trees are always composed of nodes and branches, but the nodes are usually not indicated. In this tree, the terminal nodes are solid black circles (A–F), and they represent the extant species. The internal nodes are gray or open circles (m–q), and they represent the most recent common ancestors (MRCA) of their descendants. Node o is the root node; it indicates where this tree would attach to the rest of the tree of life, and it tells us the direction in which time moves along the branches of the tree (from bottom to top). Time always moves from the root node toward the terminal nodes, as indicated by the arrow to the left. A monophyletic group (or clade) is a group of taxa that includes the MRCA and the descendants of that ancestor. You could cut this tree along any single branch and hold the trimming in your hand, and that would be a monophyletic group. Trees are nested sets of monophyletic groups (or clades). Sister groups are two monophyletic groups that are more closely related to one another than they are to any other groups. Sister groups share a most recent common ancestor. The sister group of taxon A in this tree is m + B + C, and their MRCA is represented by node n. The sister group of the clade that includes A + B + C is the clade composed of p + q + D + E + F. These two clades share a MRCA depicted by node o.

FIGURE 2.2. Trees, the root node, and direction of time. These three trees depict the same pattern of relationship between taxa A–F as the tree shown in Figure 2.1. In each tree the root node is indicated by a circle, and an arrow depicts the direction in which time moves along the branches, from the root node to the terminal nodes (sometimes called “leaves”).
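The idea that a tree is a nested set of clades, and that a monophyletic group is whatever falls off when a single branch is cut, can be sketched in a few lines of Python. This is only an illustrative sketch: the grouping of A with B + C follows the Figure 2.1 caption, but the arrangement of D, E, and F is an assumption made for the example, and the function names (`leaves`, `clades`, `is_monophyletic`, `sister_group`) are hypothetical.

```python
# Tree from Figure 2.1 written as nested tuples.
# (A, (B, C)) follows the caption; (D, (E, F)) is an ASSUMED arrangement.
TREE = (("A", ("B", "C")), ("D", ("E", "F")))


def leaves(node):
    """Return the set of terminal taxa found at or below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Return the leaf set of every clade in the tree (one per node)."""
    found = {leaves(node)}
    if not isinstance(node, str):
        for child in node:
            found |= clades(child)
    return found


def is_monophyletic(group, tree=TREE):
    """A group is monophyletic if it is exactly the leaf set of one node,
    that is, if it could be removed by cutting a single branch."""
    return frozenset(group) in clades(tree)


def sister_group(group, node=TREE):
    """Return the leaf set of the sister of a clade, or None if not found."""
    if isinstance(node, str):
        return None
    target = frozenset(group)
    child_sets = [leaves(child) for child in node]
    if len(node) == 2 and target in child_sets:
        return child_sets[1 - child_sets.index(target)]
    for child in node:
        result = sister_group(group, child)
        if result is not None:
            return result
    return None
```

Under these assumptions, `is_monophyletic({"A", "B", "C"})` is true, whereas `is_monophyletic({"A", "B"})` is false because the group leaves out C, a descendant of the same most recent common ancestor.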
Characters | Crustacea | Springtails | Silverfish | Dragonflies | Crickets | Beetles | Butterflies
1. Thorax of only 3 segments, each bearing one pair of legs | No | Yes | Yes | Yes | Yes | Yes | Yes
2. Mouthparts external, not embedded within head | No | No | Yes | Yes | Yes | Yes | Yes
3. Wings present | No | No | No | Yes | Yes | Yes | Yes

Tree labels (right of matrix): Crustacea (outgroup); Hexapoda (ingroup): Springtails, Silverfish, and Pterygota (Dragonflies, Crickets, Beetles, Butterflies).

FIGURE 2.4. Example of a data matrix of five characters and seven taxa. The character states for each of the seven taxa are coded in the matrix. If the character is not present within a taxon, the matrix is coded with “no”; if it is present in that taxon, the matrix is coded with “yes” (systematists often use “absence” and “presence” instead of “no” and “yes,” and these are represented in matrices as “0” and “1,” but additional states may exist in the so-called multistate characters, as opposed to binary ones). These character states are plotted on the corresponding tree to the right along the branch where they change from no to yes. The place where the character state first appears represents an apomorphy for the corresponding ancestral node to the right, and that character state occurs in all the descendants of the ancestor. Among the descendants, manifestations of this character are called homologues, or homologous features. Homologues share a common developmental and evolutionary origin. For example, the presence of wings is an apomorphy of clade Pterygota (and wings are homologous throughout the Pterygota), but the presence of wings is a plesiomorphy of the clade containing only beetles and butterflies. The clade containing only beetles and butterflies is defined by the apomorphy of a pupal stage (representative of complete metamorphosis).
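As a rough illustration of how such a matrix is coded, the three characters legible in Figure 2.4 can be written as binary (0/1) rows and checked for the nested pattern that the tree implies. This is a sketch only: the short character names, the column order (taken from the tree labels), and the helper functions `taxa_with` and `nested_order` are assumptions for the example, not part of the original figure.

```python
# Column order is ASSUMED from the tree labels in Figure 2.4.
TAXA = ["Crustacea", "Springtails", "Silverfish", "Dragonflies",
        "Crickets", "Beetles", "Butterflies"]

# 1 = "yes" (present), 0 = "no" (absent); short names are hypothetical labels.
MATRIX = {
    "three-segment thorax": [0, 1, 1, 1, 1, 1, 1],
    "external mouthparts":  [0, 0, 1, 1, 1, 1, 1],
    "wings":                [0, 0, 0, 1, 1, 1, 1],
}


def taxa_with(character):
    """Return the set of taxa coded 1 (present) for a character."""
    return {taxon for taxon, state in zip(TAXA, MATRIX[character]) if state == 1}


def nested_order(characters):
    """Sort characters from most to least inclusive and report whether
    each set of taxa contains the next one (a nested hierarchy)."""
    ordered = sorted(characters, key=lambda c: len(taxa_with(c)), reverse=True)
    nested = all(taxa_with(later) <= taxa_with(earlier)
                 for earlier, later in zip(ordered, ordered[1:]))
    return ordered, nested


ordered, nested = nested_order(MATRIX)
print(ordered, nested)
```

The sets of taxa sharing each derived state sit inside one another (winged insects within the taxa with external mouthparts, which in turn sit within the Hexapoda), which is the hierarchical pattern of homologies that a tree summarizes.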
(which are usually either morphological or molecular) are assembled into data matrices (Figure 2.4) in which every character amounts to a hypothesis of homology. Homologous characters have shared ancestry. That is, they are traits that are present in two or more taxa and inherited from their common ancestor. Homologous characters share an evolutionary history. The process of evolutionary descent with modification has produced a hierarchical pattern of homologies that can be traced through lineages of living organisms. It is this pattern that we use to reconstruct the history of life. The functions of homologous structures may be similar or different, but this has no bearing on the underlying homology of the structures involved. Our ability to recognize anatomical homologues often depends on developmental or embryological evidence and on the relative position of the anatomical structure or the nucleotide or amino acids in nucleotide or protein sequences.

Homology is a concept that is applicable to anatomical structures, to genes, and to developmental processes. However, homology at one of these levels does not necessarily indicate homology at another. Biologists should always be clear regarding the level at which they are inferring homology: genes, their expression patterns, their developmental roles, or the structures to which they give rise. Researchers sometimes assume similar patterns of regulatory gene expression are also evidence of homology among structures. This can result in mistakes because it ignores the evolutionary histories of the genes and of the structures in which they are expressed. The functions of homologous genes (orthologues or paralogues), just like those of homologous structures, can diverge from one another over evolutionary time. Similarly, the functions of nonhomologous genes can converge over time. Therefore, similarity of function is not a valid criterion for the determination of homology of either genes or structures. For example, the phenomenon of gene recruitment (co-option) can lead to situations in which truly orthologous genes are expressed in nonhomologous structures during development. Most regulatory genes play several distinct roles during development, and homologous genes can be independently recruited to superficially similar roles. A classic example is the regulatory gene Distal-less, which is expressed in the distal portion of appendages of many animals during their embryogeny (e.g., arthropods, echinoderms, chordates). Although the domains of Distal-less gene expression might reflect a homologous role in specifying proximodistal axes of appendages, the appendages themselves are clearly not homologous.

Characters are the attributes, or features, of organisms or groups of organisms (clades, taxa) that biologists rely on to indicate their relatedness to other similar organisms and to distinguish them from other groups.
Characters are the observable features and expressions of the genotype, and they can be anything from the actual amino acid sequences of the genes themselves to the phenotypic expressions of the genotype. A character can be any genetically inherited trait that systematists can examine and measure; it can be a morphological, anatomical, developmental, or molecular feature of an organism, its chromosomal makeup (karyotype) or biochemical “fingerprint,” or even a physiological or ethological (behavioral) attribute; the webs of spiders are a classic example of behavioral traits that are relatively easy to document. A large number of biochemical, molecular, and bioinformatics techniques for measuring similarity and inferring relationships among organisms have been developed over the past few decades. Thus, a variety of kinds of data are available that provide systematists with the homologous characters needed to define and compare species and infer phylogenies.

With the advent of technology for rapid amplification of DNA fragments and sequencing of the nucleic acids, the field of systematics has undergone dramatic change.³ Questions of evolutionary relationship, which had traditionally been addressed only by comparative anatomy, can now be tested by an independent source of data. Since the 1990s, molecular phylogenetics has swept the literature with evolutionary trees built by analyses of gene sequences. Trees built from DNA sequence data have corroborated some relationships previously inferred by analyses of morphological traits, but they also provided support for novel relationships never predicted.⁴ In molecular-based data matrices, every column in the aligned matrix represents a character, or a hypothesis of a homologous position in the alignment, and the character states are the specific nucleotides (A, adenine; T, thymine; C, cytosine; and G, guanine) or amino acids that occur in those taxa at those sites, or their inferred insertions or deletions.⁵ Every character in a phylogenetic character matrix, whether built from anatomical or genetic data, is a hypothesis of homology.

³ On February 12, 1988 (by appropriate coincidence, Charles Darwin’s birthday), a paper published by Katherine Field, Rudy Raff, and others presented the first credible molecular analysis of metazoan phylogeny based on sequences from the small ribosomal subunit RNA gene (SSU or 18S rRNA). This work initiated a paradigm shift in phylogenetic analysis, and today the field of molecular phylogenetics is rooted in the methods pioneered in that important paper. In 1997, Anna Marie Aguinaldo and colleagues also published a revolutionary paper, proposing a radical new view of animal phylogeny, one that hypothesized the Protostomia to comprise two distinct clades, a “molting clade” (called Ecdysozoa) and a nonmolting clade (called Spiralia in this book).
The first molecular phylogenies were constructed from analyses of ribosomal genes, which code for RNA that forms the 3-D structure of ribosomes (the large ribosomal subunit or 28S rRNA, and the small subunit or 18S rRNA). However, the number of genes used to infer phylogenies increased quickly to include both nuclear and mitochondrial protein-coding genes. Whereas analyses of single genes were the standard only a few years ago, most molecular phylogenetic analyses today use multiple, preferably unlinked, genes concatenated together into one supermatrix. In fact, new and relatively inexpensive DNA sequencing technology, high-throughput sequencing, is allowing for molecular phylogenies to be constructed from larger portions of the genome, up to tens of thousands of genes. Specific regions of the genome can be targeted a priori through methods such as anchored hybrid enrichment or ultraconserved elements, or novel genes of phylogenetic significance can be discovered after shotgun sequencing of all DNA molecules (genomes) or RNA molecules (transcriptomes). No matter what type of genetic data are selected or how they are sequenced, these methods result in rich multigene data sets for phylogenetic inference. Resulting trees are therefore inferred from a larger portion of the genome than previous methods that relied on only a handful of genes, whose history may or may not precisely reflect the history of the taxa being analyzed.

⁴ Equally exciting are techniques being developed to extract DNA from fossilized bone, tissue, or dung. In 2003, scientists managed to extract DNA from Siberian permafrost sediments and soils of caves in New Zealand. The Siberian sediments yielded the oldest reliable ancient DNA up to that time, from plants as much as 400,000 years old (angiosperms, gymnosperms, and mosses) as well as from numerous animals, including both living and extinct species up to 30,000 years old. In 2008, the entire genome of the extinct woolly mammoth (Mammuthus primigenius) from Siberia was sequenced, showing a sequence identity of over 98% with modern African elephants (the two diverged from one another about 6 million years ago). Also in 2008, DNA was sequenced from lice (Insecta: Phthiraptera: Pediculus humanus) preserved in the scalps of 1,000-year-old Peruvian mummies. The lice belonged to a subtype of head and body lice found all over the world, thus proving that human lice were in the New World well before Columbus. Since then, scientists have begun sequencing DNA from even older animals; the oldest so far is from two mammoths older than 1 million years, perhaps pushing the limits of how long DNA can actually survive. By extracting fragments of DNA from ancient soils and sediments (known as “environmental DNA”) in North America, scientists have begun to reconstruct Pleistocene and early Holocene ecosystems. The extraction of DNA from 2,000- to 4,000-year-old cores from the Arctic region has yielded a variety of plants, bison, horses, bears, mammoths, and lemmings. Environmental DNA comes from urine, feces, hair, skin, eggshells, feathers, and even the saliva of animals, as well as from the decaying leaves and fine rootlets of plants. Ancient DNA analyses have led to the discovery of new types of ancient humans and revealed interbreeding between our ancestors and our archaic cousins, which left a genetic legacy that shapes who we are today. Much of the Neanderthal genome has now been sequenced. As a result, we now know that modern Homo sapiens carry a remnant of Neanderthal DNA from interbreeding events that have been postulated to have occurred as humans migrated out of Africa and into Eurasia, at least 80,000 years ago.

⁵ Not long after the first molecular-based trees were published, it was recognized that analyses of molecular sequence data are prone to predicting erroneous relationships under certain circumstances. Rapidly evolving lineages were inferred to be closely related, regardless of their true evolutionary relationships, due to a phenomenon known as long branch attraction (LBA). Since there are only four possible character states in molecular sequence data (the four nucleotides), when DNA substitution rates are high, there is a high probability that two lineages will independently evolve the same nucleotide at the same site by chance alone. In these circumstances, phylogenetic algorithms (especially parsimony methods) erroneously interpret these convergences to be signs of shared ancestry (synapomorphies) and therefore misinterpret taxa on long branches to be close relatives. Using phylogenetic algorithms that incorporate models of evolution can minimize the LBA problem. These models include three components: (1) models of DNA substitution, which describe the rates at which one nucleotide replaces another over evolutionary time; (2) the relative nucleotide base frequency in a data set; and (3) the relative rates at which sites in an alignment evolve in a data set. Things get more complicated when data are analyzed as amino acids, as is typically done in many phylogenomic analyses, especially those using transcriptomes, and models become more complex and computationally intensive. Popular evolutionary-model-based algorithms include maximum likelihood and Bayesian methods of phylogenetic estimation.
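The point in footnote 5 that four character states leave ample room for chance matches can be made concrete with a small calculation. The sketch below uses the Jukes-Cantor (JC69) substitution model, which is an assumption introduced here for illustration and is not named in the text; it treats all four nucleotides as equally frequent and all substitutions as equally likely. The function name `jc69_match_probability` is hypothetical.

```python
import math


def jc69_match_probability(d):
    """Probability that two sequences show the same nucleotide at a site
    when they are separated by d expected substitutions per site,
    under the JC69 model (equal base frequencies, equal rates)."""
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)


# As divergence grows, the chance of a match falls toward 1 in 4, not zero.
for d in (0.0, 0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"d = {d:>3}: P(match) = {jc69_match_probability(d):.3f}")
```

Under this simplified model, even lineages that have diverged enormously still agree at roughly a quarter of their sites, and a method that counts every such agreement as shared ancestry can be misled into grouping long branches together.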
Apomorphy and Plesiomorphy

Monophyletic groups (or clades) are defined only by apomorphies. Apomorphies are shared, derived character states, and they stand in opposition to plesiomorphies, which are relatively older states (or forms) of that character.⁶ An apomorphy is only an apomorphy at one specific place in a phylogeny, along the branch in the tree in which the character state evolved. At the specific point on a phylogenetic tree where such a transformation takes place, the new (derived) character state is called an apomorphy and the former (ancestral) state is called a plesiomorphy. Character states can be apomorphic in one place in the tree and plesiomorphic in another. Within Arthropoda, for example, having just three thoracic segments, each bearing one pair of legs, is a derived character state whose evolutionary appearance marked the origin of the Hexapoda (thus distinguishing them from all other arthropods) (Figure 2.4). But for groups or lineages within the Hexapoda, such as the Pterygota, these same features represent retained ancestral features (plesiomorphies), whereas having wings is a shared, derived trait (or apomorphy) for the clade Pterygota. In the most general sense, so-called primitive character states are attributes of lineages that are relatively older and have been retained from some more distant ancestor; in other words, they have been around for a longer time relative to the apomorphic state, geologically or genealogically speaking.

In sum, systematists aim to document biodiversity by recognizing the boundaries of natural lineages and describing them by inferring the shape of the tree of life and building systems to classify species that reflect that evolutionary history. They do this by analyzing homologous characters to build a phylogeny, with each clade in that phylogeny defined by at least one apomorphy, or shared derived character state. It seems fairly straightforward, so why is this process so difficult? The answer to that question boils down to the fact that we cannot always recognize homologous characters, so there are many chances for error as we seek to obtain the shape of the one true tree of life.

Challenges of Phylogenetic Inference

Attempts to relate two taxa by comparing nonhomologous characters will result in errors. For example, the hands of chimpanzees and humans are homologous characters (i.e., homologues) because they have the same evolutionary and developmental origin; the wings of bats and insects, although similar in some ways, are not homologous characters because they have completely different origins. In a strict sense, the concept of homology has nothing to do with similarity or degree of resemblance. Some homologous features look very different in different taxa (e.g., the pectoral fins of whales and the arms of humans; the forewings of beetles and those of flies). Again, the concept of homology is related to the level of analysis being considered. The wings of bats and birds are homologous as tetrapod forelimbs, but they are not homologous as “wings,” because wings evolved independently in these two groups (i.e., the wings of bats and birds do not share a common ancestral wing). Homology is a powerful concept, but it is important to remember that homologies are really hypotheses, open to testing and possible refutation.

Through convergent evolution, similar-appearing (but nonhomologous) structures may evolve in distantly related groups of organisms in quite different ways; that is, they have separate genetic and developmental origins. For example, early biologists were misled by the superficial similarities between the vertebrate eye and the cephalopod eye, the bivalve shells of molluscs and of brachiopods, and the sucking mouthparts of true bugs (Hemiptera) and of mosquitoes (Diptera). Structures such as these, which appear superficially similar but have arisen independently and have separate genetic and phylogenetic origins, are called convergent characters. There are both ecological and genomic explanations for the evolution of morphological similarity. Through the phenomenon of convergent evolution, similar-appearing structures may arise independently, with separate genetic and developmental origins, in response to the same ecological factors. Convergent traits (among both animals and plants) have been recognized at nearly all levels of biological organization, ranging from molecules to morphology to behavior. Among the most interesting cases of convergent evolution are the recently discovered analogies between voice and vocal learning in some mammals and birds (vocal learning is the ability to imitate sounds). Not only have the vocal regions of certain bird and mammal brains converged in their anatomy, but more than 50 genes have contributed to their convergent specialization: convergent behavior and neural circuits for vocal learning are accompanied by convergent molecular changes of multiple genes in species separated by millions of years from a common ancestor.

Convergence is often confused with parallelism. Parallel characters are similar features that have arisen more than once in different species but that share a common genetic and developmental basis. Parallel evolution is the result of “distant” or underlying homology; for parallel evolution to occur, the genetic potential for certain features must persist within a group, thus allowing the feature to appear and reappear in various related species or groups of species. Parallelism might be thought of as a kind of “evolutionary repeatedness.”⁷ Failure to

⁶ Apomorphies that are shared by two or more taxa are sometimes referred to as synapomorphies; plesiomorphies shared by two or more taxa are sometimes referred to as symplesiomorphies.

⁷ Parallelism in this context is not to be confused with the evolution of species (or characters within species) “in parallel,” that is, when two species (or characters) change more or less together over time. Host-parasite coevolution is an example of “evolution in parallel.”
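The statement that a character state can be apomorphic at one place in the tree and plesiomorphic at another can be sketched as a small lookup. The example below is a deliberate simplification: it assumes each derived state arose once and was never lost, it uses the taxa of Figure 2.4, and the names `STATE_DISTRIBUTION` and `status` are hypothetical.

```python
# Taxa showing each derived state, following the "yes" entries of Figure 2.4.
HEXAPODA = {"Springtails", "Silverfish", "Dragonflies",
            "Crickets", "Beetles", "Butterflies"}
PTERYGOTA = {"Dragonflies", "Crickets", "Beetles", "Butterflies"}

STATE_DISTRIBUTION = {
    "three-segment thorax": HEXAPODA,
    "wings": PTERYGOTA,
}


def status(character, clade):
    """Classify a derived state relative to a clade, ASSUMING the state
    evolved once and has been retained by all descendants."""
    holders = STATE_DISTRIBUTION[character]
    clade = set(clade)
    if holders == clade:
        return "apomorphy"       # the state arose on the branch leading to this clade
    if clade < holders:
        return "plesiomorphy"    # the state is inherited from a more distant ancestor
    return "not shared by the whole clade"


print(status("three-segment thorax", HEXAPODA))   # apomorphy
print(status("three-segment thorax", PTERYGOTA))  # plesiomorphy
print(status("wings", {"Beetles", "Butterflies"}))  # plesiomorphy
```

The same state receives a different label depending only on which clade is being asked about, which is why apomorphy and plesiomorphy are relative terms.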
[Figure: panels (B) and (C); axis label “Morphological change”]
more than two descendant lineages. (D) Parallel evolution occurs when two or more species (or lineages) change similarly so that, despite evolutionary activity, they remain similar in some ways. Parallelism generally refers to closely related taxa, usually species, within which the characters or structures in question share a common genetic basis.
Patterns of relatedness are usually displayed by originally proposed. Its detailed methodology has been
biologists in branching diagrams called trees, and these formalized and expanded and will certainly continue to
are the least ambiguous (most testable) way to present be elaborated for some time to come. The goal of phylo
evolutionary hypotheses, although it is now recognized genetic systematics is to produce explicit and testable
that for some organisms networks may be better rep- hypotheses of genealogical relationships among mono-
resentations of relationships than trees, especially in phyletic groups of organisms. As a systematic methodol-
organisms with rampant genome recombination. Once ogy, it is based entirely on recency of common descent
constructed, such trees can then be converted into clas- (i.e., genealogy). The trees used by phylogenetic sys-
sification schemes, which are dynamic ways of repre- tematists are constructed to depict only genealogy, or
senting our understanding of the history of life on Earth. ancestor-descendant relationships. The term cladogene-
Thus, trees and classifications are actually hypotheses of sis refers to splitting; in the case of biology, this means the
the evolution of life and the natural order it has created. splitting of one species (or one lineage) into two or more
Although classification schemes are ultimately derived species (or lineages). It is this splitting process that pro-
from phylogenetic trees, they do not always reflect pre- duces genealogical (ancestor-descendant) relationships.
cisely the arrangement of natural groups in the trees. Phylogenetic inference can be a time-consuming
Discrepancies between phylogenetic trees and classifi- process. The number of mathematically possible trees
cations derived from them most commonly occur when (of all branching patterns) for more than a few species
biologists choose to establish or recognize paraphyletic is enormous—for three taxa there are only four pos-
taxa. Thus, to recognize the protists as a distinct taxon sible rooted trees, but for ten taxa there are 34 million
(the kingdom Protista) would be to recognize a paraphy- possible rooted trees. Needless to say, such analyses are
letic group (because it excludes three large lineages that not possible without the aid of computers. Algorithms
descended from it—animals, plants, and fungi). Whereas for computer-assisted tree construction began appear-
most systematists advocate that only monophyletic taxa ing in the late 1970s, and today the development of
be recognized in a formal classification, many paraphy- such programs comprises an entire field of research.
letic taxa persist in animal classification, for convenience Phylogenetic systematists generally use either the
or tradition or because they are simply not yet known to principle of parsimony8 or maximum likelihood to select
be paraphyletic (there are probably thousands of taxa for the optimal tree from among the set of all possible trees
which we do not yet know whether they are monophy- and thus select the tree with the fewest evolutionary
letic or paraphyletic). For example, the long-recognized transformations (character state changes) or with an
group Reptilia is paraphyletic because it excludes one optimized likelihood score. The use of gene sequence
of that group’s most distinct lineages, the birds. The
subphylum Crustacea is paraphyletic because it omits 8
Parsimony is a method of logic in which economy in reasoning is
the Hexapoda (insects and their kin), which evolved sought. The principle of parsimony, also known as Ockham’s razor,
has strong support in science. William of Ockham (Occam), the
from within the crustaceans long ago. And, of course, the group “Invertebrata” is a paraphyletic group; it is Metazoa excluding the vertebrates. Even Prokaryota is a paraphyletic group (the Eukaryota evolved out of it). In fact, taxonomic groups at all levels are likely to have been derived from within other taxonomic groups, leaving the latter paraphyletic, and we are beginning to discover that paraphyly abounds in the Linnean hierarchy of life that has been built over the past centuries. The issue of how to deal with such long-standing, well-known paraphyletic taxa in classification schemes is still being debated. One way of doing this might be to indicate their paraphyletic status by a code in the classification scheme (e.g., some type of notation beside the name). This code would inform readers that to view the precise phylogenetic relationships of such taxa, they must look to the phylogenetic tree.
Biologists today use a method known as phylogenetic systematics (or cladistics) when inferring patterns of evolutionary relationships. Phylogenetic systematics had its origin in 1950 in a book by the German biologist Willi Hennig; the English translation (with revisions) appeared in 1966. Its popularity has grown steadily since that time. Through the years, phylogenetic systematics has evolved well beyond the framework Hennig

fourteenth-century English philosopher, stated the principle as “plurality must not be posited without necessity.” A modern rendering would read, “An explanation of the facts should be no more complicated than necessary” or “Among competing hypotheses, favor the simplest one.” Scientists in all disciplines follow this rule daily, and it can be viewed as a consequence of deeper principles that are supported by statistical inferences. Thus, parsimonious solutions or hypotheses are those that explain the data in the simplest way. Evolutionary biologists rely on the principle of logical parsimony for the same reason other scientific disciplines rely on it: doing so presumes the fewest ad hoc assumptions and produces the most testable (i.e., the most easily falsified) hypotheses. If evidential support favored only one hypothesis, we would have little need for parsimony as a method. The reason we must rely on parsimony in science is that there is virtually always more than one hypothesis that can explain our data. Parsimony considerations come into play most strongly when a choice must be made among equally supported hypotheses.
In phylogenetic reconstruction, any given data set can be explained by a great number of possible trees. A three-taxon data set has 3 possible dichotomous (all lines divide into just two branches) trees that explain it. A four-taxon data set has 15 possible dichotomous trees, a five-taxon data set has 105 possible dichotomous trees, and so on. Thus, the evidence alone does not sufficiently narrow the class of admissible hypotheses, and some extra-evidential criterion (parsimony) is required. Again, the virtue of choosing the shortest (i.e., most parsimonious) tree among a universe of possible trees (the one that requires the fewest character transformations) lies in its testability. William of Ockham, by the way, also denied the existence of universals except in the minds of humans and in language. This notion resulted in a charge of heresy from the Roman Catholic Church, after which he fled and, alas, died of the plague.
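The tree counts quoted above (3, 15, and 105 for three, four, and five taxa) match a standard combinatorial result: for n labeled taxa, the number of rooted, fully dichotomous trees is (2n - 3)!! = 1 x 3 x 5 x ... x (2n - 3). The short Python sketch below is only an illustration of how quickly that number grows; the function name rooted_binary_trees is a hypothetical one chosen here and does not come from any phylogenetics package.

```python
def rooted_binary_trees(n_taxa: int) -> int:
    """Count the rooted, fully dichotomous trees for n_taxa labeled taxa.

    Uses the double factorial (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3).
    For fewer than two taxa there is only one (trivial) arrangement.
    """
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (the range is empty when n < 3).
    for odd in range(3, 2 * n_taxa - 2, 2):
        count *= odd
    return count


for n in (3, 4, 5, 10):
    print(n, rooted_binary_trees(n))
# 3 3
# 4 15
# 5 105
# 10 34459425
```

With only ten taxa there are already more than 34 million candidate trees, which suggests why an explicit criterion such as parsimony is needed to choose among them.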
data has spawned a new family of model-based methods that incorporate hypotheses of nucleotide evolution (see Footnote 5). In these methods (i.e., maximum likelihood, distance methods, and Bayesian analyses), DNA nucleotide (or protein) sequences from organisms in the study group are analyzed within a framework of assumptions based on our understanding of how nucleotides (or amino acids) operate and change over time.
By identifying the precise points at which apomorphies occur, phylogenetic trees unambiguously define monophyletic lineages. Hence, these trees are explicit phylogenetic hypotheses. Being explicit, they can be tested (and potentially falsified) by anyone. Apomorphies are markers that identify specific places in the tree where new monophyletic taxa arise. For phylogenetic systematists, a phylogeny consists of a genealogical branching pattern expressed as a phylogenetic tree. Each split or dichotomy within the tree produces a pair of newly derived taxa called sister taxa, or sister groups (e.g., “sister species”). Sister groups always share an immediate common ancestor. In Figures 2.1 and 2.2, A is the sister group of B + C; D is the sister of E + F; and A + B + C is the sister group of D + E + F.
All of the trees in Figures 2.1 and 2.2 are fully dichotomous. That is, only two branches emerge from each internal node. Sometimes, trees include a polytomy, when more than two branches emerge from a node. Polytomies (or multifurcations) can have several different meanings. In this textbook (and most commonly in the literature) polytomies represent uncertainty about the precise evolutionary relationships among the members; it is unclear who is the exact sister group of whom. For example, we are uncertain about whether Porifera or Ctenophora is the sister group to all other animals, and in Figure 28.1 we represent this uncertainty as a polytomy at the base of the tree of Metazoa.
Like all scientific hypotheses, phylogenetic analyses and their resulting trees are tested by the discovery of new data. As new characters or new species are identified and their character states elucidated, new data matrices are developed, and new analyses are undertaken. The first molecular phylogenetic trees were based on a single gene fragment. But as techniques improved, these trees were tested with multiple gene sets and, eventually, with entire genomic data sets. Hypotheses (branches of the tree) that consistently resist refutation are said to be highly corroborated. For example, the clade called Arthropoda has been examined in thousands of analyses using a great variety of data, and it has consistently been shown to constitute a monophyletic group (i.e., it is a highly corroborated phylogenetic hypothesis).

Biological Classification

And you see that every time I made a further division, up came more boxes based on these divisions until I had a huge pyramid of boxes. Finally you see that while I was splitting the cycle up into finer and finer pieces, I was also building a structure. This structure of concepts is formally called a hierarchy and since ancient times has been a basic structure for all Western knowledge.
Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance, 1974

Classifications are necessary for several reasons, not the least of which is to efficiently catalog the enormous number of species of organisms on Earth. Nearly 2 million species of prokaryotes and eukaryotes have been named and described (and a great many more remain undescribed). The insects alone comprise nearly a million named species, and over 380,000 of those are beetles! Classifications provide a detailed system for storage and retrieval of these names. The second, and most important, reason to evolutionary biologists is that classifications serve a descriptive function. This function is served not only by the descriptions that define each taxon, but also, as noted earlier, by the detailed hypotheses of evolutionary relationships among the organisms that inhabit Earth. In other words, classifications are (or should be) constructed from evolutionary relationships and capture the patterns of ancestry and descent depicted in phylogenetic trees.
The construction of a classification may at first appear straightforward. Specimens are grouped into species; related species are grouped into genera (sing., genus); related genera are grouped into families; and so forth. The grouping process creates a system of subordinated, or nested, taxa arranged in a hierarchical fashion following basic set theory (Figure 2.6). If the taxa are properly grouped, following the pattern in a phylogenetic tree (i.e., on the basis of shared derived characteristics), the hierarchy will reflect patterns of evolutionary descent. These hierarchical categories (or ranks) are commonly used for the biological classification of animals:

Kingdom
Phylum
Class
Cohort
Order
Family
Tribe
Genus
Species

Thus, the common eastern Pacific sea star Pisaster giganteus is classified as follows:

Category     Taxon
Phylum       Echinodermata
Subphylum    Asterozoa
Class        Asteroidea
Order        Forcipulatida
Family       Asteriidae
Genus        Pisaster
Species      Pisaster giganteus
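Because each rank is nested inside the one above it, a classification such as the sea star example above can be treated as an ordered list of (category, taxon) pairs. The Python sketch below is one illustrative encoding, not an established format. The names RANKS, PISASTER_GIGANTEUS, is_properly_nested, and taxon_at are hypothetical, and placing Subphylum between Phylum and Class is an assumption taken from the sea star example, since Subphylum does not appear in the list of ranks.

```python
# Ranks from most to least inclusive, as listed in the text.
# "Subphylum" is added (assumption) because the sea star example uses it.
RANKS = [
    "Kingdom", "Phylum", "Subphylum", "Class", "Cohort",
    "Order", "Family", "Tribe", "Genus", "Species",
]

# The classification of Pisaster giganteus given in the text.
PISASTER_GIGANTEUS = [
    ("Phylum", "Echinodermata"),
    ("Subphylum", "Asterozoa"),
    ("Class", "Asteroidea"),
    ("Order", "Forcipulatida"),
    ("Family", "Asteriidae"),
    ("Genus", "Pisaster"),
    ("Species", "Pisaster giganteus"),
]


def is_properly_nested(classification):
    """Return True if the ranks run strictly from more to less inclusive."""
    positions = [RANKS.index(rank) for rank, _ in classification]
    return all(a < b for a, b in zip(positions, positions[1:]))


def taxon_at(classification, rank):
    """Return the taxon recorded at the given rank, or None if the rank is unused."""
    return dict(classification).get(rank)


print(is_properly_nested(PISASTER_GIGANTEUS))  # True
print(taxon_at(PISASTER_GIGANTEUS, "Family"))  # Asteriidae
print(taxon_at(PISASTER_GIGANTEUS, "Tribe"))   # None
```

Ranks may be skipped (the example has no Cohort or Tribe), but they may never appear out of order; that ordering is what makes the hierarchy a set of nested groups.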
[Figure: dated phylogenetic tree of HEXAPODA (INSECTA), with numbered nodes and representative genera grouped by order: PSOCODEA (bark & true lice); HYMENOPTERA (sawflies, wasps, bees, ants); RAPHIDIOPTERA (snakeflies); MEGALOPTERA (alderflies & dobsonflies); NEUROPTERA (net-winged insects); STREPSIPTERA (twisted wing parasites); COLEOPTERA (beetles); TRICHOPTERA (caddisflies); LEPIDOPTERA (moths & butterflies); SIPHONAPTERA (fleas); MECOPTERA (scorpionflies); DIPTERA (true flies). Time axis in Ma, from 550 to 0.]
his Systema Naturae 100 years prior to the appearance of Darwin and Wallace’s theory of evolution by means of natural selection (1859), and thus his use of similarities in classification foreshadowed the subsequent emphasis by biologists on evolutionary relationships among taxa. Linnaeus was granted nobility in 1761 (and became Carl von Linné); he died in 1778.
Binomens are Latin (or Latinized) because of the custom followed in Europe prior to the eighteenth century of publishing scientific papers in Latin, the universal language of the educated people of the time. For several decades after Linnaeus, names for animals and plants proliferated, and there were often several names for any given species (different names for the same species are called synonyms). The name in common use was usually the most descriptive one, or often it was simply the one used by the preeminent authority of the time. In addition, some generic names and specific epithets were composed of more than one word each. This lack of nomenclatural uniformity led, in 1842, to the adoption of a code of rules formulated under the auspices of the British Association for the Advancement of Science, called the Strickland code. In 1901 the newly formed International Commission on Zoological Nomenclature adopted a revised version of the Strickland code, called the International Code of Zoological Nomenclature (ICZN). Botanists had adopted a similar code for plants in 1813, the Théorie Élémentaire de la Botanique, which became in 1930 the International Code of Botanical Nomenclature (there is also a separate but complementary code for cultivated plants). Since then, it has been revised as the International Code of Nomenclature for Algae, Fungi, and Plants. There is also an International Code of Nomenclature of Bacteria.
The ICZN established January 1, 1758 (the year the tenth edition of Linnaeus’s Systema Naturae appeared), as the starting date for modern zoological nomenclature. Any names published the same year, or in subsequent years, are regarded as having appeared after the Systema. The ICZN also slightly changed the description of Linnaeus’s naming system, from binomial nomenclature (names of two parts) to binominal nomenclature (names of two names). However, one still sees the former designation in common use. This subtle change implies that the system must be truly binary; that is, both genus and species epithet names can be only one word each. Although the system is binary, it also accepts the use of subspecies names, creating a trinomen (three names) within which is contained the mandatory binomen. For example, the sea star Pisaster giganteus is known to have a distinct form occurring in the southern part of its range, which is designated as a subspecies, Pisaster giganteus capitatus.
All codes of biological nomenclature share the following five basic principles:

1. Botanical, bacterial, and zoological codes are independent of each other. It is therefore permissible, although not recommended, for a plant genus and an animal genus to bear the same name (e.g., Aotus is the generic name of both golden peas and night monkeys).
2. A taxon can bear one and only one correct name.
3. No two genera within a given code can bear the same name (i.e., generic names are unique), and no two species within one genus can bear the same name (i.e., binomens are unique).
4. The correct or valid name of a taxon is based on priority of publication (first usage), with a few exceptions for very old names that have not been in use for a long period of time.
5. For the categories of superfamily in animals and order in plants, and for all categories below these, taxon names must be based on type specimens, type species, or type genera.10

10 When a systematist first names and describes a new species, she or he takes a “typical” or representative individual, declares it a type specimen, and deposits it in a safe repository such as a large natural history museum. If later workers are ever uncertain about whether they are working with the same species described by the original author, they can compare their material to the type specimen. Although of substantially less value, the designation of a “typical” or type species for a genus, or a type genus for a family, serves a somewhat similar purpose in establishing a “typical” species or genus upon which a genus or family is based.

When strict application of a code results in confusion or ambiguity, problems are referred to the appropriate commission for a “legal” decision. Rulings of the International Commission on Zoological Nomenclature are published regularly in its journal, the Bulletin of Zoological Nomenclature. Note that the international commissions rule only on nomenclature or “legal” matters, not on questions of scientific or biological interpretation; these latter problems are the business of systematists. Obviously, the correct name of a species (and any future changes in that name) has great importance to all fields of biology (e.g., ecology, conservation biology, physiology) and also in the field of law, where many environmental decisions are made today, because scientists (and governments) must know the correct names of their study organisms in order to communicate about them.
Names given to animals and plants are usually descriptive in some way, or perhaps indicative of the geographic area in which the species occurs. Others are named in honor of persons for one reason or another. Occasionally one runs across purely whimsical names, or even names that seem to have been formulated for seemingly diabolical reasons.11
The biological species definition (or genetic species concept), as codified by Ernst Mayr, defines species as groups of interbreeding (or potentially interbreeding)
natural populations that are reproductively isolated from other such groups. Obviously, this definition fails to accommodate nonsexual species. George Gaylord Simpson and Edward O. Wiley developed the evolutionary species concept, which states that a species is a single lineage of ancestor-descendant populations that maintains its identity separate from other such lineages and that has its own evolutionary tendencies and historical fate. In reality, of course, biologists rely heavily on morphological aspects of organisms and on gene sequence data as surrogates in gauging these conceptual views of species. That is, we conceive of species as genetic or evolutionary entities, but we recognize them primarily by their phenotypic or gene sequence characters. Hence, an understanding of such characters is of great importance; read on.
Higher taxa (categories and taxa above the species level) are natural groups of species (or lineages) chosen by biologists for naming in order to reflect our state of knowledge regarding their evolutionary relationships. Higher taxa, if correctly constructed, represent ancestor-descendant lineages (clades) that, like species, have an origin, a common ancestry and descent, and eventually a death (extinction of the lineage); thus they too are evolutionary units with definable boundaries.
There are no rules for how many species should make up a genus, only that it be a natural group. Nor are there rules about how many genera constitute a family, or whether any group of genera should be recognized as a family, or a subfamily, or an order, or any other categorical rank. What matters is simply that the named group (the taxon) be a natural group. Hence, it is incorrect to assume that families of insects are in some way evolutionarily comparable to families of molluscs, or orders of worms comparable to orders of crustaceans. Nor are there any rules about categorical rank and geological or evolutionary age. These aspects of higher taxa are often misunderstood. Interestingly, this being said, family-level taxa often tend to be the most stable taxonomic groupings, usually recognizable even to laypersons; think, for example, of cats (Felidae), dogs (Canidae), abalone (Haliotidae), ladybird beetles (Coccinellidae), mosquitoes (Culicidae), octopuses (Octopodidae), or weevils (Curculionidae). This stability seems to be an artifact of the history of taxonomy, but it nonetheless makes families convenient higher taxa to study and discuss. However, biologists err when they compare equally ranked higher taxa from different groups in ways that presuppose them to be somehow equivalent.
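The requirement that a named taxon be a natural group can be made concrete with a small example, taking "natural group" to mean a clade, as the text does for correctly constructed higher taxa. The Python sketch below is an illustration under stated assumptions, not a standard tool: it encodes the six-taxon tree described for Figures 2.1 and 2.2 (A sister to B + C, D sister to E + F) as nested tuples and asks whether a proposed group matches exactly one clade. The names TREE, leaves, clades, and is_monophyletic are hypothetical.

```python
from typing import Iterator, Union

# A tree is either a tip label (str) or a tuple of subtrees.
Tree = Union[str, tuple]

# Tree described in the text: A + (B + C) is sister to D + (E + F).
TREE: Tree = (("A", ("B", "C")), ("D", ("E", "F")))


def leaves(node: Tree) -> frozenset:
    """Return the set of tip labels descended from this node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node: Tree) -> Iterator[frozenset]:
    """Yield the tip set of every node in the tree (tips included)."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree: Tree, group) -> bool:
    """True if the group is exactly the tip set of some node (one ancestor, all descendants)."""
    return frozenset(group) in set(clades(tree))


print(is_monophyletic(TREE, {"B", "C"}))                 # True
print(is_monophyletic(TREE, {"A", "B", "C"}))            # True
print(is_monophyletic(TREE, {"A", "B"}))                 # False
print(is_monophyletic(TREE, {"A", "B", "C", "D", "E"}))  # False: F is left out
```

The last group is analogous to a paraphyletic taxon: it contains the common ancestor of all six tips but omits one descendant, so it does not correspond to any single clade on the tree.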
11 Among the many clever names given to animals are Agra vation (a tropical beetle that was extremely difficult for Dr. Terry Erwin to collect) and Lightiella serendipida (a small crustacean; the generic name honors the famous Pacific naturalist S. F. Light, 1886–1947, while the species epithet is taken from “serendipity,” a word coined by Walpole in allusion to the tale of “The Three Princes of Serendip,” who in their travels were always discovering, by chance or sagacity, things they did not seek; the term is said to aptly describe the circumstances of the initial discovery of this species). There are actually over 500 described species of Agra (those carabid beetles known as “elegant canopy beetles”), including Agra eponine, named after the street urchin in Les Miserables who, in the Broadway version of the story, personified tragic beauty (“such is the state of the tropical forests where these beetles live,” according to Dr. Erwin, who also named this species). Another of Erwin’s names is Agra ichabod, referring to the fact that the holotype is missing its head, the allusion being to the frightened schoolteacher Ichabod Crane’s phantom nemesis, the Headless Horseman, in “The Legend of Sleepy Hollow.” The nineteenth-century British naturalist W. E. Leach erected numerous genera of isopod crustaceans whose spellings were anagrams of the name Caroline. Exactly who Caroline was (and the nature of her relationship with Professor Leach) is still being debated, but the prevailing theory implicates Caroline of Brunswick, who was in the public eye at this time in history. It is said that Caroline was badly treated by her husband (the Prince Regent, later George IV) and that she was herself a lady of questionable fidelity. Leach, from Devon, may have taken the side of support for Caroline by honoring her with a long series of generic names, including Cirolana, Lanocira, Rocinela, Nerocila, Anilocra, Conilera, Olincera, and others.
A light-hearted attitude toward naming organisms has not always been without Freudian overtones, as there also exist Thetys vagina (a large, hollow, tubular pelagic salp), Succinea vaginacontorta (a hermaphroditic snail whose vagina twists in corkscrew fashion), Phallus impudicus (a slime-covered mushroom), and Amanita phalloides and Amanita vaginata (two species of highly toxic mushrooms around which numerous Indigenous ceremonies and legends exist). Humbert humberti is a wasp named after Vladimir Nabokov’s Humbert Humbert, the narrator in the great novel Lolita who was obsessed with his 12-year-old, soon-to-be stepdaughter. Crepidula fornicata is a hermaphroditic slipper shell (gastropod) that forms stacks of alternating male- and female-functioning individuals (males on top turn into females as they grow). Injecting a lyrical dose of sexual innuendo into taxonomy is not new. Linnaeus himself incorporated a few good zingers into his writings and, in fact, drew parallels between plant sexuality and human love. In 1729 he wrote of flower petals, “[These] serve as bridal beds which the Creator has so gloriously arranged, adorned with such noble bed curtains, and perfumed with so many soft scents, that the bridegroom with his bride might there celebrate their nuptials with so much the greater solemnity.” Such sexually explicit writing (in the early eighteenth century) did not go uncriticized, and Linnaeus had his detractors. The German botanist Johann Siegesbeck (a Demonstrator at the Botanical Garden at St. Petersburg) called it “loathsome harlotry” and commented, “Who would have thought that bluebells, lilies, and onions could be up to such immorality?” Linnaeus had his revenge, however, when he named a small, ugly, foul-smelling, mud-inhabiting European weed (St. Paul’s wort) Siegesbeckia.
Other fun names include Upupa epops, euphoniously named for the call of the hoopoe (a bird), and the fish Zappa confluentus, which was named by a fan of Frank Zappa. The Grateful Dead have a fly named in their honor (Dicrotendipes thanatogratus). And there is the vampire squid Vampyroteuthis infernalis (the “vampire squid from hell”), a bivalve named Abra cadabra, a blood-sucking spider Draculoides bramstokeri, and a wasp Aha ha. Even Linnaeus created a curious name for a common ameba, Chaos chaos. And, in a stroke of whimsy, the entomologist G. W. Kirkaldy created the bug genera Polychisme (“Polly kiss me”), Peggichisme, Marichisme, Dolychisme, and Florichisme. There are fish genera named Zeus, Satan, Zen, Batman, and Sayonara. There are insect genera named Cinderella, Aloha, Oops, and Euphoria; the siboglinid genus Bobmarleya; and a lucinid clam with a periostracum looking like dreadlocks named Rasta. The spider genus Orsonwelles contains such species as O. macbeth and O. othello. Some other clever binomens include Leonardo davincii (a moth), Phthiria relativitae (a fly), and Ba humbugi (a snail). A few biologists have gone overboard in erecting names for new animals, and many binomens exceed 30 letters in length, including Strongylocentrotus droebachiensis (32 letters), for the common North Pacific sea urchin, and Lagenivaginopseudobenedenia, a 27-letter genus name for a group of monogenean flukes.
Chapter Summary
In this chapter we have introduced you to the field of systematics, the oldest and most foundational field among the biological sciences. The main goal of systematists is to describe and organize the many forms of life in evolutionarily meaningful ways, whether those genetic linkages are depicted as a branching diagram (or phylogenetic tree) or a nested list (classification). This task would be straightforward if we knew how all species were related to one another. But since we do not know for certain what Darwin’s “great Tree of Life” looks like, we must infer the shape of the tree through phylogenetic inference. In recent years, the field of molecular phylogenetics has revealed a great deal about how species are related to one another, but much more remains to be accomplished.