English Phonetics and Phonology
A practical course
Fourth edition
PETER ROACH
Emeritus Professor of Phonetics
University of Reading
CAMBRIDGE UNIVERSITY PRESS
www.cambridge.org
Information on this title: www.cambridge.org/9780521717403
Printed and bound in the United Kingdom by the MPG Books Group
A catalogue record for this publication is available from the British Library
1 Introduction
1.1 How the course is organised
1.2 The English Phonetics and Phonology website
1.3 Phonemes and other aspects of pronunciation
1.4 Accents and dialects
8 The syllable
8.1 The nature of the syllable
8.2 The structure of the English syllable
8.3 Syllable division
12 Weak forms
15 Intonation 1
15.1 Form and function in intonation
15.2 Tone and tone languages
15.3 Complex tones and pitch height
15.4 Some functions of English tones
15.5 Tones on other words
In previous editions I have used the Preface as a place to thank all the people who have
helped me with the book. My debt to them, which in some cases dates back more than
twenty-five years, remains, and I have put copies of the Prefaces to the first three editions on
the new website of the book so that those acknowledgements are not lost and forgotten. In
this new edition, I would like firstly to thank Professor Nobuo Yuzawa of the Takasaki City
University of Economics for his wise suggestions and his meticulous and expert scrutiny of
the text, which have been invaluable to me. Any errors that remain are entirely my fault.
At Cambridge University Press, I would like to thank Jane Walsh, Jeanette Alfoldi, Liz
Driscoll, Anna Linthe, Clive Rumble and Brendan Wightman.
As in all previous editions, I want to thank my wife Helen for all her help and support.
List of symbols
1 Symbols for phonemes
iː as in ‘key’ kiː
ɪ as in ‘pit’ pɪt
ɑː as in ‘car’ kɑː
e as in ‘pet’ pet
ɔː as in ‘core’ kɔː
æ as in ‘pat’ pæt
uː as in ‘coo’ kuː
ʌ as in ‘putt’ pʌt
ɜː as in ‘cur’ kɜː
ɒ as in ‘pot’ pɒt
ʊ as in ‘put’ pʊt
ə as in ‘about’, ‘upper’ əbaʊt, ʌpə
eɪ as in ‘bay’ beɪ
əʊ as in ‘go’ gəʊ
aɪ as in ‘buy’ baɪ
aʊ as in ‘cow’ kaʊ
ɔɪ as in ‘boy’ bɔɪ
ɪə as in ‘peer’ pɪə
eə as in ‘pear’ peə
ʊə as in ‘poor’ pʊə
2 Non-phonemic symbols
i as in ‘react’, ‘happy’ riækt, hæpi
u as in ‘to each’ tu iːtʃ
ʔ (glottal stop)
ʰ aspiration, as in ‘pin’ pʰɪn
◌̩ syllabic consonant, as in ‘button’ bʌtn̩
˘ shortened vowel, as in ‘miss’ mɪ̆s
. syllable division, as in ‘differ’ dɪf.ə
3 Word stress
ˈ primary stress, as in ‘open’ ˈəʊpən
ˌ secondary stress, as in ‘half time’ ˌhɑːfˈtaɪm
4 Intonation
| tone-unit boundary
‖ pause
Tones: \ fall
/ rise
∨ fall-rise
∧ rise-fall
_ level
ˈ stressed syllable in head, high pitch, as in ˈplease \do
ˌ stressed syllable in head, low pitch, as in ˌplease \do
· stressed syllable in the tail, as in \my ·turn
↑ extra pitch height, as in ↑\my ·turn
THE INTERNATIONAL PHONETIC ALPHABET (revised to 2005) © 2005 IPA
[Chart: consonants (pulmonic) by place and manner of articulation; clicks, voiced implosives and ejectives; vowels (front, central, back); other symbols; diacritics; tones and word accents.]
You probably want to know what the purpose of this course is, and what you can expect
to learn from it. An important purpose of the course is to explain how English is
pronounced in the accent normally chosen as the standard for people learning the English
spoken in England. If this was the only thing the course did, a more suitable title would
have been “English Pronunciation”. However, at the comparatively advanced level at which
this course is aimed, it is usual to present this information in the context of a general
theory about speech sounds and how they are used in language; this theoretical context is
called phonetics and phonology. Why is it necessary to learn this theoretical background?
A similar question arises in connection with grammar: at lower levels of study one is
concerned simply with setting out how to form grammatical sentences, but people who
are going to work with the language at an advanced level as teachers or researchers need
the deeper understanding provided by the study of grammatical theory and related areas
of linguistics. The theoretical material in the present course is necessary for anyone who
needs to understand the principles regulating the use of sounds in spoken English.
You should keep in mind that this is a course. It is designed to be studied from beginning
to end, with the relevant exercises being worked on for each chapter, and it is therefore
quite different from a reference book. Most readers are expected to be either studying
English at a university, or to be practising English language teachers. You may be working
under the supervision of a teacher, or working through the course individually; you may
be a native speaker of a language that is not English, or a native English-speaker.
Each chapter has additional sections:
• Notes on problems and further reading: this section gives you information on
how to find out more about the subject matter of the chapter.
• Notes for teachers: this gives some ideas that might be helpful to teachers using
the book to teach a class.
• Written exercises: these give you some practical work to do in the area covered
by the chapter. Answers to the exercises are given on pages 200-9.
• Audio exercises: these are recorded on the CDs supplied with this book (also
convertible to mp3 files), and there are places marked in the text when there is a
relevant exercise.
• Additional exercises: you will find more written and audio exercises, with
answers, on the book’s website.
Only some of the exercises are suitable for native speakers of English. The exercises for
Chapter 1 are mainly aimed at helping you to become familiar with the way the written
and audio exercises work.
If you have access to the Internet, you can find more information on the website
produced to go with this book. You can find it at www.cambridge.org/elt/peterroach.
Everything on the website is additional material - there is nothing that is essential to
using the book itself, so if you don’t have access to the Internet you should not suffer a
disadvantage.
The website contains the following things:
• Additional exercise material.
• Links to useful websites.
• A discussion site for exchanging opinions and questions about English phonetics
and phonology in the context of the study of the book.
• Recordings of talks given by Peter Roach.
• Other material associated with the book.
• A Glossary giving brief explanations of the terms and concepts found in
phonetics and phonology.
The nature of phonetics and phonology will be explained as the course progresses, but
one or two basic ideas need to be introduced at this stage. In any language we can identify a
small number of regularly used sounds (vowels and consonants) that we call phonemes;
for example, the vowels in the words ‘pin’ and ‘pen’ are different phonemes, and so are
the consonants at the beginning of the words ‘pet’ and ‘bet’. Because of the notoriously
confusing nature of English spelling, it is particularly important to learn to think of
English pronunciation in terms of phonemes rather than letters of the alphabet; one must
be aware, for example, that the word ‘enough’ begins with the same vowel phoneme as
that at the beginning of ‘inept’ and ends with the same consonant as ‘stuff’. We often use
special symbols to represent speech sounds; with the symbols chosen for this course, the
word ‘enough’ would be written (transcribed) as ɪnʌf. The symbols are always printed in
blue type in this book to distinguish them from letters of the alphabet. A list of the symbols
is given on pp. x-xi, and the chart of the International Phonetic Association (IPA) on
which the symbols are based is reproduced on p. xii.
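The idea that words are best compared phoneme by phoneme rather than letter by letter can be sketched in a few lines of Python. This is only an illustration: the small `LEXICON` dictionary and the function name `differ_in_one_phoneme` are assumptions made for the example, and the transcriptions simply follow the symbols listed for this course.

```python
# Hypothetical mini-lexicon: spellings mapped to lists of phoneme symbols.
# The entries are illustrative assumptions, not data supplied with the course.
LEXICON = {
    "pin": ["p", "ɪ", "n"],
    "pen": ["p", "e", "n"],
    "pet": ["p", "e", "t"],
    "bet": ["b", "e", "t"],
    "enough": ["ɪ", "n", "ʌ", "f"],
    "inept": ["ɪ", "n", "e", "p", "t"],
    "stuff": ["s", "t", "ʌ", "f"],
}


def differ_in_one_phoneme(word_a, word_b):
    """Return True if the two transcriptions differ in exactly one position."""
    a, b = LEXICON[word_a], LEXICON[word_b]
    if len(a) != len(b):
        return False
    mismatches = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(mismatches) == 1


# 'pin'/'pen' differ only in the vowel; 'pet'/'bet' only in the first consonant.
print(differ_in_one_phoneme("pin", "pen"))
print(differ_in_one_phoneme("pet", "bet"))
# Spelling hides the fact that 'enough' starts like 'inept' and ends like 'stuff'.
print(LEXICON["enough"][0] == LEXICON["inept"][0])
print(LEXICON["enough"][-1] == LEXICON["stuff"][-1])
```

Storing each word as a list of symbols, rather than as a string, keeps a two-character symbol such as uː as a single unit when positions are compared.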
The first part of the course is mainly concerned with identifying and describing the phonemes
of English. Chapters 2 and 3 deal with vowels and Chapter 4 with some consonants.
After this preliminary contact with the practical business of how some English sounds are
pronounced, Chapter 5 looks at the phoneme and at the use of symbols in a theoretical
way, while the corresponding Audio Unit revises the material of Chapters 2-4. After the
phonemes of English have been introduced, the rest of the course goes on to look at larger
units of speech such as the syllable and at aspects of speech such as stress (which could be
roughly described as the relative strength of a syllable) and intonation (the use of the pitch
of the voice to convey meaning). As an example of stress, consider the difference between
the pronunciation of ‘contract’ as a noun (‘they signed a contract’) and ‘contract’ as a verb
(‘it started to contract’). In the former the stress is on the first syllable, while in the latter it is
on the second syllable. A possible example of intonation would be the different pitch movements
on the word ‘well’ said as an exclamation and as a question: in the first case the pitch
will usually fall from high to low, while in the second it will rise from low to high.
You will have to learn a number of technical terms in studying the course: you will find
that when they are introduced in order to be defined or explained, they are printed in bold
type. This has already been done in this Introduction in the case of, for example, phoneme,
phonetics and phonology. Another convention to remember is that when words used as
examples are given in spelling form, they are enclosed in single quotation marks - see for
example ‘pin’, ‘pen’, etc. Double quotation marks are used where quotation marks would
normally be used - that is, for quoting something that someone has said or might say. Words
are sometimes printed in italics to mark them as specially important in a particular context.
Languages have different accents: they are pronounced differently by people from
different geographical places, from different social classes, of different ages and different
educational backgrounds. The word accent is often confused with dialect. We use the word
dialect to refer to a variety of a language which is different from others not just in pronunciation
but also in such matters as vocabulary, grammar and word order. Differences of
accent, on the other hand, are pronunciation differences only.
The accent that we concentrate on and use as our model is the one that is most
often recommended for foreign learners studying British English. It has for a long time
been identified by the name Received Pronunciation (usually abbreviated to its initials,
RP), but this name is old-fashioned and misleading: the use of the word “received” to
mean “accepted” or “approved” is nowadays very rare, and the word if used in that sense
seems to imply that other accents would not be acceptable or approved of. Since it is most
familiar as the accent used by most announcers and newsreaders on BBC and British
independent television broadcasting channels, a preferable name is BBC pronunciation.
This should not be taken to mean that the BBC itself imposes an “official” accent -
individual broadcasters all have their own personal characteristics, and an increasing
number of broadcasters with Scottish, Welsh and Irish accents are employed. However, the
accent described here is typical of broadcasters with an English accent, and there is a useful
degree of consistency in the broadcast speech of these speakers.
This course is not written for people who wish to study American pronunciation,
though we look briefly at American pronunciation in Chapter 20. The pronunciation of
English in North America is different from most accents found in Britain. There are exceptions
to this - you can find accents in parts of Britain that sound American, and accents in
North America that sound English. But the pronunciation that you are likely to hear from
most Americans does sound noticeably different from BBC pronunciation.
In talking about accents of English, the foreigner should be careful about the difference
between England and Britain; there are many different accents in England, but the
range becomes very much wider if the accents of Scotland, Wales and Northern Ireland
(Scotland and Wales are included in Britain, and together with Northern Ireland form the
United Kingdom) are taken into account. Within the accents of England, the distinction
that is most frequently made by the majority of English people is between northern and
southern. This is a very rough division, and there can be endless argument over where
the boundaries lie, but most people on hearing a pronunciation typical of someone from
Lancashire, Yorkshire or other counties further north would identify it as “Northern”. This
course deals almost entirely with BBC pronunciation. There is no implication that other
accents are inferior or less pleasant-sounding; the reason is simply that BBC is the accent
that has usually been chosen by British teachers to teach to foreign learners, it is the accent
that has been most fully described, and it has been used as the basis for textbooks and
pronunciation dictionaries.
A term which is widely found nowadays is Estuary English, and many people have
been given the impression that this is a new (or newly-discovered) accent of English. In
reality there is no such accent, and the term should be used with care. The idea originates
from the sociolinguistic observation that some people in public life who would previously
have been expected to speak with a BBC (or RP) accent now find it acceptable to speak
with some characteristics of the accents of the London area (the estuary referred to is the
Thames estuary), such as glottal stops, which would in earlier times have caused comment
or disapproval.
If you are a native speaker of English and your accent is different from BBC you
should try, as you work through the course, to note what your main differences are for
purposes of comparison. I am certainly not suggesting that you should try to change your
pronunciation. If you are a learner of English you are recommended to concentrate on
BBC pronunciation initially, though as you work through the course and become familiar
with this you will probably find it an interesting exercise to listen analytically to other
accents of English, to see if you can identify the ways in which they differ from BBC and
even to learn to pronounce some different accents yourself.
The recommendation to use the name BBC pronunciation rather than RP is not universally
accepted. ‘BBC pronunciation’ is used in recent editions of the Cambridge English
Pronouncing Dictionary (Jones, eds. Roach, Hartman and Setter, 2006), in Trudgill (1999)
and in Ladefoged (2004); for discussion, see the Introduction to the Longman Pronunciation
Dictionary (Wells, 2008), and to the Cambridge English Pronouncing Dictionary (Jones, eds.
Roach et al., 2006). In Jones’s original English Pronouncing Dictionary of 1917 the term
used was Public School Pronunciation (PSP). Where I quote other writers who have used the
term RP in discussion of standard accents, I have left the term unchanged. Other writers
have suggested the name GB (General British) as a term preferable to RP: I do not feel this
is satisfactory, since the accent being described belongs to England, and citizens of other
parts of Britain are understandably reluctant to accept that this accent is the standard for
countries such as Scotland and Wales. The BBC has an excellent Pronunciation Research
Unit to advise broadcasters on the pronunciation of difficult words and names, but most
people are not aware that it has no power to make broadcasters use particular pronunciations:
BBC broadcasters only use it on a voluntary basis.
I feel that if we had a completely free choice of model accent for British English it
would be possible to find more suitable ones: Scottish and Irish accents, for example, have a
more straightforward relationship between spelling and sounds than does the BBC accent;
they have simpler vowel systems, and would therefore be easier for most foreign learners to
acquire. However, it seems that the majority of English teachers would be reluctant to learn
to speak in the classroom with a non-English accent, so this is not a practical possibility.
For introductory reading on the choice of English accent, see Brown (1990: 12-13);
Abercrombie (1991: 48-53); Cruttenden (2008: Chapter 7); Collins and Mees (2008: 2-6);
Roach (2004, 2005). We will return to the subject of accents of English in Chapter 20.
Much of what has been written on the subject of “Estuary English” has been in minor
or ephemeral publications. However, I would recommend looking at Collins and Mees
(2008: 5-6, 206-8, 268-272); Cruttenden (2008: 87).
A problem area that has received a lot of attention is the choice of symbols for representing
English phonemes. In the past, many different conventions have been proposed
and students have often been confused by finding that the symbols used in one book are
different from the ones they have learned in another. The symbols used in this book are
in most respects those devised by A. C. Gimson for his Introduction to the Pronunciation
of English, the latest version of which is the revision by Cruttenden (Cruttenden, 2008).
These symbols are now used in almost all modern works on English pronunciation published
in Britain, and can therefore be looked on as a de facto standard. Although good
arguments can be made for some alternative symbols, the advantages of having a common
set of symbols for pronunciation teaching materials and pronunciation entries in dictionaries
are so great that it would be very regrettable to go back to the confusing diversity of
earlier years. The subject of symbolisation is returned to in Section 5.2 of Chapter 5.
Notes for teachers
Pronunciation teaching has not always been popular with teachers and language-teaching
theorists, and in the 1970s and 1980s it was fashionable to treat it as a rather outdated
activity. It was claimed, for example, that it attempted to make learners try to sound like
native speakers of Received Pronunciation, that it discouraged them through difficult and
repetitive exercises and that it failed to give importance to communication. A good example
of this attitude is to be found in Brown and Yule (1983: 26-7). The criticism was
misguided, I believe, and it is encouraging to see that in recent years there has been a significant
growth of interest in pronunciation teaching and many new publications on the
subject. There are very active groups of pronunciation teachers who meet at TESOL and
IATEFL conferences, and exchange ideas via Internet discussions.
No pronunciation course that I know has ever said that learners must try to speak
with a perfect RP accent. To claim this mixes up models with goals: the model chosen
is BBC (RP), but the goal is normally to develop the learner’s pronunciation sufficiently
to permit effective communication with native speakers. Pronunciation exercises can be
difficult, of course, but if we eliminate everything difficult from language teaching and
learning, we may end up doing very little beyond getting students to play simple communication
games. It is, incidentally, quite incorrect to suggest that the classic works on
pronunciation and phonetics teaching concentrated on mechanically perfecting vowels
and consonants: Jones (1956, first published 1909), for example, writes “ ‘Good’ speech
may be defined as a way of speaking which is clearly intelligible to all ordinary people.
‘Bad’ speech is a way of talking which is difficult for most people to understand ... A
person may speak with sounds very different from those of his hearers and yet be clearly
intelligible to all of them, as for instance when a Scotsman or an American addresses an
English audience with clear articulation. Their speech cannot be described as other than
‘good’ ” (pp. 4-5).
Much has been written recently about English as an International Language, with
a view to defining what is used in common by the millions of people around the world
who use English (Crystal, 2003; Jenkins, 2000). This is a different goal from that of this
book, which concentrates on a specific accent. The discussion of the subject in Cruttenden
(2008: Chapter 13) is recommended as a survey of the main issues, and the concept of an
International English pronunciation is discussed there.
There are many different and well-tried methods of teaching and testing pronunciation,
some of which are used in this book. I do not feel that it is suitable in this book to
go into a detailed analysis of classroom methods, but there are several excellent treatments
of the subject; see, for example, Dalton and Seidlhofer (1995); Celce-Murcia et al. (1996)
and Hewings (2004).
Written exercises
The exercises for this chapter are simple ones aimed at making you familiar with the style
of exercises that you will work on in the rest of the course. The answers to the exercises are
given on page 200.
1 Give three different names that have been used for the accent usually used for
teaching the pronunciation of British English.
2 The production of speech sounds
All the sounds we make when we speak are the result of muscles contracting. The
muscles in the chest that we use for breathing produce the flow of air that is needed for
almost all speech sounds; muscles in the larynx produce many different modifications in
the flow of air from the chest to the mouth. After passing through the larynx, the air goes
through what we call the vocal tract, which ends at the mouth and nostrils; we call the
part comprising the mouth the oral cavity and the part that leads to the nostrils the nasal
cavity. Here the air from the lungs escapes into the atmosphere. We have a large and
complex set of muscles that can produce changes in the shape of the vocal tract, and in
order to learn how the sounds of speech are produced it is necessary to become familiar
with the different parts of the vocal tract. These different parts are called articulators, and
the study of them is called articulatory phonetics.
Fig. 1 is a diagram that is used frequently in the study of phonetics. It represents the
human head, seen from the side, displayed as though it had been cut in half. You will need
to look at it carefully as the articulators are described, and you will find it useful to have a
mirror and a good light placed so that you can look at the inside of your mouth.
i) The pharynx is a tube which begins just above the larynx. It is about 7 cm long
in women and about 8 cm in men, and at its top end it is divided into two, one
part being the back of the oral cavity and the other being the beginning of the
way through the nasal cavity. If you look in your mirror with your mouth open,
you can see the back of the pharynx.
ii) The soft palate or velum is seen in the diagram in a position that allows air
to pass through the nose and through the mouth. Yours is probably in that
position now, but often in speech it is raised so that air cannot escape through
the nose. The other important thing about the soft palate is that it is one of the
articulators that can be touched by the tongue. When we make the sounds k, g
the tongue is in contact with the lower side of the soft palate, and we call these
velar consonants.
iii) The hard palate is often called the “roof of the mouth”. You can feel its smooth
curved surface with your tongue. A consonant made with the tongue close to the
hard palate is called palatal. The sound j in ‘yes’ is palatal.
iv) The alveolar ridge is between the top front teeth and the hard palate. You can
feel its shape with your tongue. Its surface is really much rougher than it feels,
and is covered with little ridges. You can only see these if you have a mirror small
enough to go inside your mouth, such as those used by dentists. Sounds made
with the tongue touching here (such as t, d, n) are called alveolar.
v) The tongue is a very important articulator and it can be moved into many different
places and different shapes. It is usual to divide the tongue into different
parts, though there are no clear dividing lines within its structure. Fig. 2 shows
the tongue on a larger scale with these parts shown: tip, blade, front, back and
root. (This use of the word “front” often seems rather strange at first.)
vi) The teeth (upper and lower) are usually shown in diagrams like Fig. 1 only at the
front of the mouth, immediately behind the lips. This is for the sake of a simple
diagram, and you should remember that most speakers have teeth to the sides of
their mouths, back almost to the soft palate. The tongue is in contact with the
upper side teeth for most speech sounds. Sounds made with the tongue touching
the front teeth, such as English θ, ð, are called dental.
vii) The lips are important in speech. They can be pressed together (when we
produce the sounds p, b), brought into contact with the teeth (as in f, v), or
rounded to produce the lip-shape for vowels like uː. Sounds in which the lips
are in contact with each other are called bilabial, while those with lip-to-teeth
contact are called labiodental.
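The place labels introduced in (i) to (vii) can be collected into a simple lookup table. The sketch below is only an illustration: the dictionary contains just the example sounds mentioned in the descriptions above, and the names `PLACE_OF_ARTICULATION` and `place_of` are assumptions made for the example.

```python
# Place labels mapped to the example sounds given in the descriptions above.
# This is deliberately incomplete: it lists only the sounds named in the text.
PLACE_OF_ARTICULATION = {
    "velar": ["k", "g"],
    "palatal": ["j"],
    "alveolar": ["t", "d", "n"],
    "dental": ["θ", "ð"],
    "bilabial": ["p", "b"],
    "labiodental": ["f", "v"],
}


def place_of(sound):
    """Return the place label for a sound, or None if it is not in the table."""
    for place, sounds in PLACE_OF_ARTICULATION.items():
        if sound in sounds:
            return place
    return None


print(place_of("k"))   # velar
print(place_of("ð"))   # dental
print(place_of("v"))   # labiodental
```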
The seven articulators described above are the main ones used in speech, but there
are a few other things to remember. Firstly, the larynx (which will be studied in Chapter 4)
could also be described as an articulator - a very complex and independent one. Secondly,
the jaws are sometimes called articulators; certainly we move the lower jaw a lot in speaking.
But the jaws are not articulators in the same way as the others, because they cannot
themselves make contact with other articulators. Finally, although there is practically nothing
active that we can do with the nose and the nasal cavity when speaking, they are a very
important part of our equipment for making sounds (which is sometimes called our vocal
apparatus), particularly nasal consonants such as m, n. Again, we cannot really describe
the nose and the nasal cavity as articulators in the same sense as (i) to (vii) above.
The words vowel and consonant are very familiar ones, but when we study the
sounds of speech scientifically we find that it is not easy to define exactly what they mean.
The most common view is that vowels are sounds in which there is no obstruction to the
flow of air as it passes from the larynx to the lips. A doctor who wants to look at the back
of a patient’s mouth often asks them to say “ah”; making this vowel sound is the best way
of presenting an unobstructed view. But if we make a sound like s, d it can be clearly felt
that we are making it difficult or impossible for the air to pass through the mouth. Most
people would have no doubt that sounds like s, d should be called consonants. However,
there are many cases where the decision is not so easy to make. One problem is that some
English sounds that we think of as consonants, such as the sounds at the beginning of the
words ‘hay’ and ‘way’, do not really obstruct the flow of air more than some vowels do.
Another problem is that different languages have different ways of dividing their sounds
into vowels and consonants; for example, the usual sound produced at the beginning of
the word ‘red’ is felt to be a consonant by most English speakers, but in some other languages
(e.g. Mandarin Chinese) the same sound is treated as one of the vowels.
If we say that the difference between vowels and consonants is a difference in the way
that they are produced, there will inevitably be some cases of uncertainty or disagreement;
this is a problem that cannot be avoided. It is possible to establish two distinct groups of
sounds (vowels and consonants) in another way. Consider English words beginning with
the sound h; what sounds can come next after this h? We find that most of the sounds
we normally think of as vowels can follow (e.g. e in the word ‘hen’), but practically none
of the sounds we class as consonants, with the possible exception of j in a word such as
‘huge’ hjuːdʒ. Now think of English words beginning with the two sounds bɪ; we find
many cases where a consonant can follow (e.g. d in the word ‘bid’, or l in the word ‘bill’),
but practically no cases where a vowel may follow. What we are doing here is looking at
the different contexts and positions in which particular sounds can occur; this is the study
of the distribution of the sounds, and is of great importance in phonology. Study of the
sounds found at the beginning and end of English words has shown that two groups of
sounds with quite different patterns of distribution can be identified, and these two groups
are those of vowel and consonant. If we look at the vowel-consonant distinction in this
way, we must say that the most important difference between vowel and consonant is not
the way that they are made, but their different distributions. It is important to remember
that the distribution of vowels and consonants is different for each language.
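The distributional test described above (which sounds can follow initial h, and which can follow initial bɪ) can be tried out on a toy word list. This is a minimal sketch under stated assumptions: the `WORDS` dictionary is a hypothetical sample in which ‘hen’, ‘huge’, ‘bid’ and ‘bill’ come from the text and ‘hat’, ‘hit’ and ‘bit’ are added for illustration, and `followers` is a name invented for the example. A real study would need a full pronouncing dictionary.

```python
# Toy sample of transcribed words (an illustrative assumption, not a real lexicon).
WORDS = {
    "hen": ["h", "e", "n"],
    "hat": ["h", "æ", "t"],
    "hit": ["h", "ɪ", "t"],
    "huge": ["h", "j", "uː", "dʒ"],
    "bid": ["b", "ɪ", "d"],
    "bill": ["b", "ɪ", "l"],
    "bit": ["b", "ɪ", "t"],
}


def followers(prefix):
    """Collect the sounds found immediately after a word-initial sequence."""
    n = len(prefix)
    found = set()
    for phonemes in WORDS.values():
        if phonemes[:n] == prefix and len(phonemes) > n:
            found.add(phonemes[n])
    return found


after_h = followers(["h"])         # sounds that follow initial h
after_bi = followers(["b", "ɪ"])   # sounds that follow initial bɪ

print(sorted(after_h))
print(sorted(after_bi))
# In this sample the two sets do not overlap.
print(after_h.isdisjoint(after_bi))
```

In this sample the sounds after h are vowels plus j (the possible exception noted in the text), while the sounds after bɪ are all consonants, which is the pattern of two distinct distributions that the paragraph describes.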
We begin the study of English sounds in this course by looking at vowels, and it
is necessary to say something about vowels in general before turning to the vowels of
English. We need to know in what ways vowels differ from each other. The first matter to
consider is the shape and position of the tongue. It is usual to simplify the very complex
possibilities by describing just two things: firstly, the vertical distance between the upper
surface of the tongue and the palate and, secondly, the part of the tongue, between front
and back, which is raised highest. Let us look at some examples:
i) Make a vowel like the iː in the English word ‘see’ and look in a mirror; if you tilt
your head back slightly you will be able to see that the tongue is held up close to
the roof of the mouth. Now make an æ vowel (as in the word ‘cat’) and notice
how the distance between the surface of the tongue and the roof of the mouth
is now much greater. The difference between iː and æ is a difference of tongue
height, and we would describe iː as a relatively close vowel and æ as a relatively
open vowel. Tongue height can be changed by moving the tongue up or down,
or moving the lower jaw up or down. Usually we use some combination of the
two sorts of movement, but when drawing side-of-the-head diagrams such as
Fig. 1 and Fig. 2 it is usually found simpler to illustrate tongue shapes for vowels
as if tongue height were altered by tongue movement alone, without any accompanying
jaw movement. So we would illustrate the tongue height difference
between iː and æ as in Fig. 3.
ii) In making the two vowels described above, it is the front part of the tongue that
is raised. We could therefore describe iː and æ as comparatively front vowels. By
changing the shape of the tongue we can produce vowels in which a different part
of the tongue is the highest point. A vowel in which the back of the tongue is the
highest point is called a back vowel. If you make the vowel in the word ‘calm’,
which we write phonetically as ɑː, you can see that the back of the tongue is raised.
Compare this with æ in front of a mirror; æ is a front vowel and ɑː is a back
vowel. The vowel in ‘too’ (uː) is also a comparatively back vowel, but compared
with ɑː it is close.
So now we have seen how four vowels differ from each other; we can show this in a simple
diagram.
         Front   Back
Close    i:      u:
Open     æ       ɑ:
However, this diagram is rather inaccurate. Phoneticians need a very accurate way of
classifying vowels, and have developed a set of vowels which are arranged in a close-open,
front-back diagram similar to the one above but which are not the vowels of any particular
language. These cardinal vowels are a standard reference system, and people being trained
in phonetics at an advanced level have to learn to make them accurately and recognise them
correctly. If you learn the cardinal vowels, you are not learning to make English sounds, but
you are learning about the range of vowels that the human vocal apparatus can make, and
also learning a useful way of describing, classifying and comparing vowels. They are recorded
on Track 12 of CD 2.
It has become traditional to locate cardinal vowels on a four-sided figure (a
quadrilateral of the shape seen in Fig. 4 - the design used here is the one recommended by the
International Phonetic Association). The exact shape is not really important - a square
would do quite well - but we will use the traditional shape. The vowels in Fig. 4 are the so-
called primary cardinal vowels; these are the vowels that are most familiar to the speakers
of most European languages, and there are other cardinal vowels (secondary cardinal
vowels) that sound less familiar. In this course cardinal vowels are printed within square
brackets [ ] to distinguish them clearly from English vowel sounds.
Cardinal vowel no. 1 has the symbol [i], and is defined as the vowel which is as close
and as front as it is possible to make a vowel without obstructing the flow of air enough to
produce friction noise; friction noise is the hissing sound that one hears in consonants like
s or f. Cardinal vowel no. 5 has the symbol [ɑ] and is defined as the most open and back
vowel that it is possible to make. Cardinal vowel no. 8 [u] is fully close and back and no. 4
[a] is fully open and front. After establishing these extreme points, it is possible to put in
intermediate points (vowels no. 2, 3, 6 and 7). Many students when they hear these vowels
find that they sound strange and exaggerated; you must remember that they are extremes of
vowel quality. It is useful to think of the cardinal vowel framework like a map of an area or
country that you are interested in. If the map is to be useful to you it must cover all the area;
but if it covers the whole area of interest it must inevitably go a little way beyond that and
include some places that you might never want to go to.
When you are familiar with these extreme vowels, you have (as mentioned above)
learned a way of describing, classifying and comparing vowels. For example, we can say
that the English vowel æ (the vowel in ‘cat’) is not as open as cardinal vowel no. 4 [a]. We
have now looked at how we can classify vowels according to their tongue height and their
frontness or backness. There is another important variable of vowel quality, and that is
lip-position. Although the lips can have many different shapes and positions, we will at
this stage consider only three possibilities. These are:
i) Rounded, where the corners of the lips are brought towards each other and the
lips pushed forwards. This is most clearly seen in cardinal vowel no. 8 [u].
ii) Spread, with the corners of the lips moved away from each other, as for a smile.
This is most clearly seen in cardinal vowel no. 1 [i].
iii) Neutral, where the lips are not noticeably rounded or spread. The noise most
English people make when they are hesitating (written ‘er’) has neutral lip position.
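The three variables introduced so far (tongue height, frontness or backness, and lip position) can be pictured as the fields of a simple record. The following Python sketch is only an illustration of that idea: the class and function names are invented here, and the lip position of cardinal vowels no. 4 and no. 5 is deliberately left unset because this section does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CardinalVowel:
    number: int
    symbol: str
    height: str            # "close" or "open"
    backness: str          # "front" or "back"
    lips: Optional[str]    # "rounded", "spread", "neutral", or None if not stated


# The four extreme points described in the text.
CARDINALS = {
    1: CardinalVowel(1, "i", "close", "front", "spread"),
    4: CardinalVowel(4, "a", "open", "front", None),
    5: CardinalVowel(5, "ɑ", "open", "back", None),
    8: CardinalVowel(8, "u", "close", "back", "rounded"),
}


def differing_dimensions(a, b):
    """List the stated dimensions on which two vowels differ."""
    dims = []
    for name in ("height", "backness", "lips"):
        x, y = getattr(a, name), getattr(b, name)
        # Skip a dimension if either value is unstated.
        if x is not None and y is not None and x != y:
            dims.append(name)
    return dims


print(differing_dimensions(CARDINALS[1], CARDINALS[8]))
print(differing_dimensions(CARDINALS[1], CARDINALS[4]))
```

Comparing no. 1 with no. 8 reports a difference in backness and lip position but not in height, which matches the description of both as fully close vowels.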
Now, using the principles that have just been explained, we will examine some of the
English vowels.
English has a large number of vowel sounds; the first ones to be examined are short
vowels. The symbols for these short vowels are: ɪ, e, æ, ʌ, ɒ, ʊ. Short vowels are only relatively
short; as we shall see later, vowels can have quite different lengths in different contexts.
Each vowel is described in relation to the cardinal vowels.
ɪ (example words: ‘bit’, ‘pin’, ‘fish’) The diagram shows that, though this vowel is in
the close front area, compared with cardinal vowel no. 1 [i] it is more open, and
nearer in to the centre. The lips are slightly spread.
e (example words: ‘bet’, ‘men’, ‘yes’) This is a front vowel between cardinal vowel
no. 2 [e] and no. 3 [ɛ]. The lips are slightly spread.
æ (example words: ‘bat’, ‘man’, ‘gas’) This vowel is front, but not quite as open as
cardinal vowel no. 4 [a]. The lips are slightly spread.
ʌ (example words: ‘cut’, ‘come’, ‘rush’) This is a central vowel, and the diagram
shows that it is more open than the open-mid tongue height. The lip position is
neutral.
ɒ (example words: ‘pot’, ‘gone’, ‘cross’) This vowel is not quite fully back, and between
open-mid and open in tongue height. The lips are slightly rounded.
ʊ (example words: ‘put’, ‘pull’, ‘push’) The nearest cardinal vowel is no. 8 [u], but it
can be seen that ʊ is more open and nearer to central. The lips are rounded.
There is one other short vowel, for which the symbol is ə. This central vowel - which is
called schwa - is a very familiar sound in English; it is heard in the first syllable of the
words ‘about’, ‘oppose’, ‘perhaps’, for example. Since it is different from the other vowels in
several important ways, we will study it separately in Chapter 9.
One of the most difficult aspects of phonetics at this stage is the large number of technical
terms that have to be learned. Every phonetics textbook gives a description of the
articulators. Useful introductions are Ladefoged (2006: Chapter 1), Ashby (2005), and Ashby and
Maidment (2005: Chapter 3).
An important discussion of the vowel-consonant distinction is by Pike (1943: 66-79).
He suggested that since the two approaches to the distinction produce such different
results we should use new terms: sounds which do not obstruct the airflow
(traditionally called “vowels”) should be called vocoids, and sounds which do obstruct the
airflow (traditionally called “consonants”) should be called contoids. This leaves the terms
“vowel” and “consonant” for use in labelling phonological elements according to their
distribution and their role in syllable structure; see Section 5.8 of Laver (1994). While
vowels are usually vocoids and consonants are usually contoids, this is not always the
case; for example, j in ‘yet’ and w in ‘wet’ are (phonetically) vocoids but function
(phonologically) as consonants. A study of the distributional differences between vowels and
consonants in English is described in O’Connor and Trim (1953); a briefer treatment
is in Cruttenden (2008: Sections 4.2 and 5.6). The classification of vowels has a large
literature: I would recommend Jones (1975: Chapter 8); Ladefoged (2006) gives a brief
introduction in Chapter 1, and much more detail in Chapter 9; see also Abercrombie
(1967: 55-60 and Chapter 10). The Handbook of the International Phonetic Association
(1999: Section 2.6) explains the IPA’s principles of vowel classification. The distinction
between primary and secondary cardinal vowels is a rather dubious one which appears
to be based to some extent on a division between those vowels which are familiar and
those which are unfamiliar to speakers of most European languages. It is possible to
classify vowels quite unambiguously without resorting to this notion by specifying their
front/back, close/open and lip positions.
Written exercises
2 Using the descriptive labels introduced for vowel classification, say what the
following cardinal vowels are:
a) [u] b) [e] c) [a] d) [i] e) [o]
3 Draw a vowel quadrilateral and indicate on it the correct places for the following
English vowels:
a) æ b) ʌ c) ɪ d) e
4 Write the symbols for the vowels in the following words:
a) bread b) rough c) foot d) hymn
e) pull f) cough g) mat h) friend
3 Long vowels, diphthongs and triphthongs
In Chapter 2 the short vowels were introduced. In this chapter we look at other
types of English vowel sound. The first to be introduced here are the five long vowels;
these are the vowels which tend to be longer than the short vowels in similar contexts.
It is necessary to say “in similar contexts” because, as we shall see later, the length of
all English vowel sounds varies very much according to their context (such as the type
of sound that follows them) and the presence or absence of stress. To remind you that
these vowels tend to be long, the symbols consist of one vowel symbol plus a length
mark made of two dots :. Thus we have i:, ɜ:, ɑ:, ɔ:, u:. We will now look at each of
these long vowels individually.
The five long vowels are different from the six short vowels described in Chapter
2, not only in length but also in quality. If we compare some similar pairs of long and
short vowels, for example ɪ with i:, or ʊ with u:, or æ with ɑ:, we can see distinct
differences in quality (resulting from differences in tongue shape and position, and lip
position) as well as in length. For this reason, all the long vowels have symbols which
are different from those of short vowels; you can see that the long and short vowel
symbols would still all be different from each other even if we omitted the length mark, so
it is important to remember that the length mark is used not because it is essential but
because it helps learners to remember the length difference. Perhaps the only case where
a long and a short vowel are closely similar in quality is that of ə and ɜ:, but ə is a
special case - as we shall see later.
AU3 (CD 1), Exs 1-5
i: (example words: ‘beat’, ‘mean’, ‘peace’) This vowel is nearer to cardinal vowel no.
1 [i] (i.e. it is closer and more front) than is the short vowel of ‘bid’, ‘pin’, ‘fish’
described in Chapter 2. Although the tongue shape is not much different from
cardinal vowel no. 1, the lips are only slightly spread and this results in a rather
different vowel quality.
ɜ: (example words: ‘bird’, ‘fern’, ‘purse’) This is a mid-central vowel which is used in
most English accents as a hesitation sound (written ‘er’), but which many learners
find difficult to copy. The lip position is neutral.
ɑ: (example words: ‘card’, ‘half’, ‘pass’) This is an open vowel in the region of
cardinal vowel no. 5 [ɑ], but not as back as this. The lip position is neutral.
ɔ: (example words: ‘board’, ‘torn’, ‘horse’) The tongue height for this vowel is
between cardinal vowel no. 6 [ɔ] and no. 7 [o], and closer to the latter. This
vowel is almost fully back and has quite strong lip-rounding.
u: (example words: ‘food’, ‘soon’, ‘loose’) The nearest cardinal vowel to this is no. 8
[u], but BBC u: is much less back and less close, while the lips are only moderately
rounded.
Fig. 7 Diphthongs (diagram dividing the diphthongs into centring and closing groups)
The centring diphthongs glide towards the ə (schwa) vowel, as the symbols indicate.
ɪə (example words: ‘beard’, ‘weird’, ‘fierce’) The starting point
is a little closer than ɪ in ‘bit’, ‘bin’.
eə (example words: ‘aired’, ‘cairn’, ‘scarce’) This diphthong
begins with a vowel sound that is more open than the e
of ‘get’, ‘men’.
ʊə (example words: ‘moored’, ‘tour’, ‘lure’) For speakers who
have this diphthong, this has a starting point similar to ʊ
in ‘put’, ‘pull’. Many speakers pronounce ɔ: instead.
Fig. 8 Centring diphthongs
The closing diphthongs have the characteristic that they all end with a glide towards a
closer vowel. Because the second part of the diphthong is weak, they often do not reach
a position that could be called close. The important thing is that a glide from a relatively
more open towards a relatively closer vowel is produced.
Three of the diphthongs glide towards ɪ, as described below:
eɪ (example words: ‘paid’, ‘pain’, ‘face’) The starting point is
the same as the e of ‘get’, ‘men’.
aɪ (example words: ‘tide’, ‘time’, ‘nice’) This diphthong begins
with an open vowel which is between front and back; it is
quite similar to the ʌ of the words ‘cut’, ‘bun’.
ɔɪ (example words: ‘void’, ‘loin’, ‘voice’) The first part of this
diphthong is slightly more open than ɔ: in ‘ought’, ‘born’.
Fig. 9 Closing diphthongs
Two diphthongs glide towards ʊ, so that as the tongue moves closer to the roof of the
mouth there is at the same time a rounding movement of the lips. This movement is not a
large one, again because the second part of the diphthong is weak.
əʊ (example words: ‘load’, ‘home’, ‘most’) The vowel position for the beginning
of this is the same as for the “schwa” vowel ə, as found in the first syllable of
the word ‘about’. The lips may be slightly rounded in anticipation of the glide
towards ʊ, for which there is quite noticeable lip-rounding.
aʊ (example words: ‘loud’, ‘gown’, ‘house’) This diphthong begins with a vowel
similar to aɪ. Since this is an open vowel, a glide to ʊ would necessitate a large
movement, and the tongue often does not reach the ʊ position. There is only
slight lip-rounding.
3.3 Triphthongs
The most complex English sounds of the vowel type are the triphthongs. They can be
rather difficult to pronounce, and very difficult to recognise. A triphthong is a glide from
one vowel to another and then to a third, all produced rapidly and without interruption. For
example, a careful pronunciation of the word ‘hour’ begins with a vowel quality similar to ɑ:,
goes on to a glide towards the back close rounded area (for which we use the symbol ʊ), then
ends with a mid-central vowel (schwa, ə). We use the symbol aʊə to represent the
pronunciation of ‘hour’, but this is not always an accurate representation of the pronunciation.
The triphthongs can be looked on as being composed of the five closing diphthongs
described in the last section, with ə added on the end. Thus we get:
eɪ + ə = eɪə     əʊ + ə = əʊə
aɪ + ə = aɪə     aʊ + ə = aʊə
ɔɪ + ə = ɔɪə
The principal cause of difficulty for the foreign learner is that in present-day English the
extent of the vowel movement is very small, except in very careful pronunciation. Because
of this, the middle of the three vowel qualities of the triphthong (i.e. the ɪ or ʊ part) can
hardly be heard and the resulting sound is difficult to distinguish from some of the
diphthongs and long vowels. To add to the difficulty, there is also the problem of whether a
triphthong is felt to contain one or two syllables. Words such as ‘fire’ faɪə or ‘hour’ aʊə
are probably felt by most English speakers (with BBC pronunciation) to consist of only
one syllable, whereas ‘player’ pleɪə or ‘slower’ sləʊə are more likely to be heard as two
syllables.
We will not go through a detailed description of each triphthong. This is partly
because there is so much variation in the amount of vowel movement according to how
slow and careful the pronunciation is, and also because the “careful” pronunciation can be
found by looking at the description of the corresponding diphthong and adding ə to the
end. However, to help identify these triphthongs, some example words are given here:
eɪə ‘layer’, ‘player’     əʊə ‘lower’, ‘mower’
aɪə ‘liar’, ‘fire’        aʊə ‘power’, ‘hour’
ɔɪə ‘loyal’, ‘royal’
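The rule that each triphthong is a closing diphthong with schwa added on the end can be written out mechanically. The short Python sketch below is offered only as an illustration; the variable and function names are invented for this example, and the symbols are the five closing diphthongs (eɪ, aɪ, ɔɪ, əʊ, aʊ) with the example words listed above.

```python
SCHWA = "ə"

# The five closing diphthongs, each paired with the example words given in the
# text for the triphthong built on it.
CLOSING_DIPHTHONGS = {
    "eɪ": ["layer", "player"],
    "aɪ": ["liar", "fire"],
    "ɔɪ": ["loyal", "royal"],
    "əʊ": ["lower", "mower"],
    "aʊ": ["power", "hour"],
}


def triphthong(diphthong):
    """Form the careful-speech triphthong symbol: diphthong plus schwa."""
    return diphthong + SCHWA


TRIPHTHONGS = {triphthong(d): words for d, words in CLOSING_DIPHTHONGS.items()}

for symbol, words in TRIPHTHONGS.items():
    print(symbol, ", ".join(words))
```

The sketch captures only the symbols of the "careful" pronunciation; it says nothing about how far the vowel movement is actually carried out in ordinary speech.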
For more information about vowels, see Ashby (2005, Chapter 4), Ladefoged (2004,
Chapter 3). Long vowels and diphthongs can be seen as a group of vowel sounds that
are consistently longer in a given context than the short vowels described in the previous
chapter. Some writers give the label tense to long vowels and diphthongs and lax to the
short vowels. Giegerich (1992) explains how this concept applies to three different accents
of English: SSE (Standard Scottish English), RP (BBC pronunciation) and GA (General
American). The accents are described in 3.1 and 3.2; the idea of pairs of vowels differing
in tenseness and laxness follows in 3.3. Jakobson and Halle (1964) explain the
historical background to the distinction, which plays an important role in the treatment of the
English vowel system by Chomsky and Halle (1968).
As mentioned in the notes on Chapter 1, the choice of symbols has in the past tended
to vary from book to book, and this is particularly noticeable in the case of length marks
for long vowels (this issue comes up again in Section 5.2 of Chapter 5); you could read
Cruttenden (2008: Section 8.5). As an example of a contemporary difference in symbol
choice, see Kreidler (2004, 4.3).
The phonemes i:, u: are usually classed as long vowels; it is worth noting that most
English speakers pronounce them with something of a diphthongal glide, so that a possible
alternative transcription could be ii, uu, respectively. This is not normally proposed,
however.
It seems that triphthongs in BBC pronunciation are in a rather unstable state, resulting
in the loss of some distinctions: in the case of some speakers, for example, it is not easy to
hear a difference between ‘tyre’ taɪə, ‘tower’ taʊə, ‘tar’ tɑ:. BBC newsreaders often pronounce
‘Ireland’ as ɑ:lənd. Gimson (1964) suggested that this shows a change in progress in the
phonemic system of RP.
Notes for teachers
I mention above that i:, u: are often pronounced as slightly diphthongal: although this
glide is often noticeable, I have never found it helpful to try to teach foreign learners to
pronounce i:, u: in this way. Foreign learners who wish to get close to the BBC model
should be careful not to pronounce the “r” that is often found in the spelling
corresponding to ɑ:, ɔ:, ɜ: (‘ar’, ‘or’, ‘er’).
Most of the essential pronunciation features of the diphthongs are described in
Chapter 3. One of the most common pronunciation characteristics that result in a learner
of English being judged to have a foreign accent is the production of pure vowels where a
diphthong should be pronounced (e.g. [e] for eɪ, [o] for əʊ).
Two additional points are worth making. The diphthong ʊə is included, but this is not
used as much as the others - many English speakers use ɔ: in words like ‘moor’, ‘mourn’,
‘tour’. However, I feel that it is important for foreign learners to be aware of this diphthong
because of the distinctiveness of words in pairs like ‘moor’ and ‘more’, ‘poor’ and ‘paw’ for
many speakers. The other diphthong that requires comment is əʊ. English speakers seem to
be specially sensitive to the quality of this diphthong, particularly to the first part. It often
happens that foreign learners, having understood that the first part of the diphthong is not
a back vowel, exaggerate this by using a vowel that is too front, producing a diphthong like
eʊ. Unfortunately, this gives the impression of someone trying to copy a “posh” or upper-
class accent: eʊ for əʊ is noticeable in the speech of the Royal Family.
Written exercises
2 Write the symbols for the long vowels in the following words:
a) broad d) learn g) err
b) ward e) cool h) seal
c) calf f) team i) curl
Write the symbols for the diphthongs in the following words:
a) tone d) way g) hair
b) style e) beer h) why
c) out f) coil i) prey
4 Voicing and consonants
We begin this chapter by studying the larynx. The larynx has several very important
functions in speech, but before we can look at these functions we must examine its
anatomy and physiology - that is, how it is constructed and how it works.
The larynx is in the neck; it has several parts, shown in Fig. 10. Its main structure is
made of cartilage, a material that is similar to bone but less hard. If you press down on
your nose, the hard part that you can feel is cartilage. The larynx’s structure is made of
two large cartilages. These are hollow and are attached to the top of the trachea; when we
breathe, the air passes through the trachea and the larynx. The front of the larynx comes
to a point and you can feel this point at the front of your neck - particularly if you are a
man and/or slim. This point is commonly called the Adam’s Apple.
Inside the “box” made by these two cartilages are the vocal folds, which are two thick
flaps of muscle rather like a pair of lips; an older name for these is vocal cords. Looking
down the throat is difficult to do, and requires special optical equipment, but Fig. 11 shows
in diagram form the most important parts. At the front the vocal folds are joined together
and fixed to the inside of the thyroid cartilage. At the back they are attached to a pair of
small cartilages called the arytenoid cartilages so that if the arytenoid cartilages move, the
vocal folds move too.
The arytenoid cartilages are attached to the top of the cricoid cartilage, but they can
move so as to move the vocal folds apart or together (Fig. 12). We use the word glottis to
refer to the opening between the vocal folds. If the vocal folds are apart we say that the
glottis is open; if they are pressed together we say that the glottis is closed. This seems
quite simple, but in fact we can produce a very complex range of changes in the vocal folds
and their positions.
These changes are often important in speech. Let us first look at four easily recognisable
states of the vocal folds; it would be useful to practise moving your vocal folds into
these different positions.
i) Wide apart: The vocal folds are wide apart for normal breathing and usually
during voiceless consonants like p, f, s (Fig. 13a). Your vocal folds are probably
apart now.
ii) Narrow glottis: If air is passed through the glottis when it is narrowed as in
Fig. 13b, the result is a fricative sound for which the symbol is h. The sound
is not very different from a whispered vowel. It is called a voiceless glottal
fricative. (Fricatives are discussed in more detail in Chapter 6.) Practise saying
hahahaha - alternating between this state of the vocal folds and that described
in (iii) below.
iii) Position for vocal fold vibration: When the edges of the vocal folds are touching
each other, or nearly touching, air passing through the glottis will usually cause
vibration (Fig. 13c). Air is pressed up from the lungs and this air pushes the vocal
folds apart so that a little air escapes. As the air flows quickly past the edges of
the vocal folds, the folds are brought together again. This opening and closing
happens very rapidly and is repeated regularly, roughly between two and three
hundred times per second in a woman’s voice and about half that rate in an
adult man’s voice.
iv) Vocal folds tightly closed: The vocal folds can be firmly pressed together so that
air cannot pass between them (Fig. 13d). When this happens in speech we call it
a glottal stop or glottal plosive, for which we use the symbol ʔ. You can practise
this by coughing gently; then practise the sequence aʔaʔaʔaʔaʔa.
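The vibration rates mentioned in (iii) above can be turned into the duration of a single opening-and-closing cycle by simple arithmetic: one cycle lasts one second divided by the number of cycles per second. The Python sketch below works this out. It takes the rough figures in the text at face value (two to three hundred cycles per second for a woman's voice, about half that for an adult man's voice), and the names used are invented for this illustration.

```python
def period_ms(rate_hz):
    """Duration of one open-close cycle of the vocal folds, in milliseconds."""
    return 1000.0 / rate_hz


# Rough range from the text: two to three hundred cycles per second.
WOMAN_RATE_HZ = (200.0, 300.0)
# "About half that rate" for an adult man's voice.
MAN_RATE_HZ = tuple(rate / 2 for rate in WOMAN_RATE_HZ)

for label, (low, high) in (("woman", WOMAN_RATE_HZ), ("man", MAN_RATE_HZ)):
    print(f"{label}: {low:.0f}-{high:.0f} cycles per second, "
          f"one cycle lasts {period_ms(high):.1f}-{period_ms(low):.1f} ms")
```

At 200 cycles per second, for example, each cycle lasts 5 milliseconds, which gives some idea of how rapid the opening and closing is.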
Section 4.1 referred several times to air passing between the vocal folds. The normal
way for this airflow to be produced is for some of the air in the lungs to be pushed out;
when air is made to move out of the lungs we say that there is an egressive pulmonic
airstream. All speech sounds are made with some movement of air, and the egressive
pulmonic is by far the most commonly found air movement in the languages of the
world. There are other ways of making air move in the vocal tract, but they are not usually
relevant in the study of English pronunciation, so we will not discuss them here.
How is air moved into and out of the lungs? Knowing about this is important, since
it will make it easier to understand many aspects of speech, particularly the nature of
stress and intonation. The lungs are like sponges that can fill with air, and they are
contained within the rib cage (Fig. 14). If the rib cage is lifted upwards and outwards there
is more space in the chest for the lungs and they expand, with the result that they take in
more air. If we allow the rib cage to return to its rest position quite slowly, some of the air
is expelled and can be used for producing speech sounds. If we wish to make the egressive
pulmonic airstream continue without breathing in again - for example, when saying
a long sentence and not wanting to be interrupted - we can make the rib cage press down
on the lungs so that more air is expelled.
In talking about making air flow into and out of the lungs, the process has been
described as though the air were free to pass with no obstruction. But, as we saw in
Chapter 2, to make speech sounds we must obstruct the airflow in some way - breathing
by itself makes very little sound. We obstruct the airflow by making one or more
obstructions or strictures in the vocal tract, and one place where we can make a stricture is in
the larynx, by bringing the vocal folds close to each other as described in the previous
section. Remember that there will be no vocal fold vibration unless the vocal folds are
in the correct position and the air below the vocal folds is under enough pressure to be
forced through the glottis.
If the vocal folds vibrate we will hear the sound that we call voicing or phonation.
There are many different sorts of voicing that we can produce - think of the differences in
the quality of your voice between singing, shouting and speaking quietly, or think of the
different voices you might use reading a story to young children in which you have to read
out what is said by characters such as giants, fairies, mice or ducks; many of the differences
are made with the larynx. We can make changes in the vocal folds themselves - they can,
for example, be made longer or shorter, more tense or more relaxed or be more or less
strongly pressed together. The pressure of the air below the vocal folds (the subglottal
pressure) can also be varied. Three main differences are found:
i) Variations in intensity: We produce voicing with high intensity for shouting, for
example, and with low intensity for speaking quietly.
ii) Variations in frequency: If the vocal folds vibrate rapidly, the voicing is at high
frequency; if there are fewer vibrations per second, the frequency is lower.
iii) Variations in quality: We can produce different-sounding voice qualities, such as
those we might call harsh, breathy, murmured or creaky.
4.3 Plosives
4.4 English plosives
Are b, d, g voiced plosives? The description of them makes it clear that it is not very
accurate to call them “voiced”; in initial and final position they are scarcely voiced at all,
and any voicing they may have seems to have no perceptual importance. Some phoneticians
say that p, t, k are produced with more force than b, d, g, and that it would therefore be
better to give the two sets of plosives (and some other consonants) names that indicate
that fact; so the voiceless plosives p, t, k are sometimes called fortis (meaning ‘strong’) and
b, d, g are then called lenis (meaning ‘weak’). It may well be true that p, t, k are produced
with more force, though nobody has really proved it - force of articulation is very
difficult to define and measure. On the other hand, the terms fortis and lenis are difficult to
remember. Despite this, we shall follow the practice of many books and use these terms.
The plosive phonemes of English can be presented in the form of a table as shown
here:
                         PLACE OF ARTICULATION
                         Bilabial   Alveolar   Velar
Fortis (“voiceless”)     p          t          k
Lenis (“voiced”)         b          d          g
Tables like this can be produced for all the different consonants. Each major type of
consonant (such as plosives like p, t, k, fricatives like s, z, and nasals like m, n) obstructs
the airflow in a different way, and these are classed as different manners of articulation.
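A table of this kind is in effect a two-way lookup: each plosive is identified by its row (fortis or lenis) and its column (place of articulation). As a purely illustrative sketch, with names invented for the example, the table of English plosives can be held in a small Python structure and queried in both directions.

```python
PLACES = ("bilabial", "alveolar", "velar")

# Rows of the table; the columns follow the order of PLACES.
PLOSIVES = {
    "fortis": ("p", "t", "k"),
    "lenis": ("b", "d", "g"),
}


def describe(symbol):
    """Return (force, place) for an English plosive phoneme."""
    for force, row in PLOSIVES.items():
        if symbol in row:
            return force, PLACES[row.index(symbol)]
    raise ValueError(f"{symbol!r} is not in the plosive table")


def counterpart(symbol):
    """Return the plosive with the same place but the other force label."""
    force, place = describe(symbol)
    other = "lenis" if force == "fortis" else "fortis"
    return PLOSIVES[other][PLACES.index(place)]


print(describe("d"))
print(counterpart("k"))
```

The labels here are only the cover terms used in the table; the sketch makes no claim about what physical difference underlies them.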
4.1, 4.2 For more information about the larynx and about respiration in relation to
speech, see Raphael et al. (2006); Laver (1994: Chapters 6 and 7); Ashby and Maidment
(2005: Chapter 2).
4.3 The outline of the stages in the production of plosives is based on Cruttenden (2008:
158). In classifying consonants it is possible to go to a very high level of complexity if
one wishes to account for all the possibilities; see, for example, Pike (1943: 85-156).
4.4 It has been pointed out that the transcription sb, sd, sg could be used quite
appropriately instead of sp, st, sk in syllable-initial position; see Davidsen-Nielsen
(1969). The vowel length difference before final voiceless consonants is apparently found
in many (possibly all) languages, but in English this difference - which is very slight in
most languages - has become exaggerated so that it has become the most important
factor in distinguishing between final p, t, k and b, d, g; see Chen (1970). Some
phonetics books wrongly state that b, d, g lengthen preceding vowels, rather than that p,
t, k shorten them. The conclusive evidence on this point is that if we take the pair ‘right’
raɪt and ‘ride’ raɪd, and then compare ‘rye’ raɪ, the length of the aɪ diphthong when no
consonant follows is practically the same as in ‘ride’; the aɪ in ‘right’ is much shorter than
the aɪ in ‘ride’ and ‘rye’.
4.5 The fortis/lenis distinction is a very complicated matter. It is necessary to consider
how one could measure “force of articulation”; many different laboratory techniques
have been tried to see if the articulators are moved more energetically for fortis
consonants, but all have proved inconclusive. The only difference that seems reasonably reliable
is that fortis consonants have higher air pressure in the vocal tract, but Lisker (1970) has
argued convincingly that this is not conclusive evidence for a “force of articulation”
difference. It is possible to ask phonetically untrained speakers whether they feel that more
energy is used in pronouncing p, t, k than in b, d, g, but there are many difficulties in
doing this. A useful review of the “force of articulation” question is in Catford (1977:
199-208). I feel the best conclusion is that any term one uses to deal with this distinction
(whether fortis/lenis or voiceless/voiced) is to be looked on as a cover term - a term which
has no simple physical meaning but which may stand for a large and complex set of
phonetic characteristics.
Written exercises
1 Write brief descriptions of the actions of the articulators and the respiratory
system in the words given below. Your description should start and finish with
the position for normal breathing. Here is a description of the pronunciation of
the word ‘bee’ bi: as an example:
Starting from the position for normal breathing, the lips are closed and the
lungs are compressed to create air pressure in the vocal tract. The tongue
moves to the position for a close front vowel, with the front of the tongue
raised close to the hard palate. The vocal folds are brought close together
and voicing begins; the lips then open, releasing the compressed air. Voicing
continues for the duration of an i: vowel. Then the lung pressure is lowered,
voicing ceases and the articulators return to the normal breathing position.
Words to describe: (a) goat; (b) ape.
2 Transcribe the following words:
a) bake d) bought g) bored
b) goat e) tick h) guard
c) doubt f) bough i) pea
5 Phonemes and symbols
In Chapters 2-4 we have been studying some of the sounds of English. It is now
necessary to consider some fundamental theoretical questions. What do we mean when we
use the word “sound”? How do we establish what are the sounds of English, and how do
we decide how many there are of them?
When we speak, we produce a continuous stream of sounds. In studying speech we
divide this stream into small pieces that we call segments. The word ‘man’ is pronounced
with a first segment m, a second segment æ and a third segment n. It is not always easy
to decide on the number of segments. To give a simple example, in the word ‘mine’ the
first segment is m and the last is n, as in the word ‘man’ discussed above. But should we
regard the aɪ in the middle as one segment or two? We will return to this question.
As well as the question of how we divide speech up into segments, there is the
question of how many different sounds (or segment types) there are in English. Chapters
2 and 3 introduced the set of vowels found in English. Each of these can be pronounced in
many slightly different ways, so that the total range of sounds actually produced by speakers
is practically infinite. Yet we feel quite confident in saying that the number of English vowels
is not greater than twenty. Why is this? The answer is that if we put one of those twenty in
the place of one of the others, we can change the meaning of a word. For example, if we
substitute æ for e in the word ‘bed’ we get a different word: ‘bad’. But in the case of two
slightly different ways of pronouncing what we regard as “the same sound”, we usually find
that, if we substitute one for the other, a change in the meaning of a word does not result. If
we substitute a more open vowel, for example cardinal vowel no. 4 [a] for the æ in the word
‘bad’, the word is still heard as ‘bad’.
The principles involved here may be easier to understand if we look at a similar situation
related to the letters of the alphabet that we use in writing English. The letter of the
alphabet in writing is a unit which corresponds fairly well to the unit of speech we have
been talking about earlier in this chapter - the segment. In the alphabet we have five letters
that are called vowels: ‘a’, ‘e’, ‘i’, ‘o’, ‘u’. If we choose the right context we can show how
substituting one letter for another will change meaning. Thus with a letter ‘p’ before and
a letter ‘t’ after the vowel letter, we get the five words spelt ‘pat’, ‘pet’, ‘pit’, ‘pot’, ‘put’, each
of which has a different meaning. We can do the same with sounds. If we look at the short
vowels ɪ, e, æ, ʌ, ɒ, ʊ, for example, we can see how substituting one for another in between
the plosives p and t gives us six different words as follows (given in spelling on the left):
‘pit’ pɪt ‘putt’ pʌt
‘pet’ pet ‘pot’ pɒt
‘pat’ pæt ‘put’ pʊt
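The substitution test can be sketched in a few lines of Python. This is only an illustration, not a procedure given in the course: the names `LEXICON`, `differs_in_one_segment` and `contrasting_pairs` are assumptions made for the example, and each transcription is stored as a tuple of segments so that a symbol such as æ counts as a single unit.

```python
from itertools import combinations

# Toy lexicon: spelling -> phonemic transcription as a tuple of segments.
# The six words are the ones listed above; the data layout is illustrative.
LEXICON = {
    "pit": ("p", "ɪ", "t"),
    "pet": ("p", "e", "t"),
    "pat": ("p", "æ", "t"),
    "putt": ("p", "ʌ", "t"),
    "pot": ("p", "ɒ", "t"),
    "put": ("p", "ʊ", "t"),
}


def differs_in_one_segment(a, b):
    """Return the (segment_a, segment_b) pair if the two transcriptions
    have equal length and differ in exactly one position, else None."""
    if len(a) != len(b):
        return None
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def contrasting_pairs(lexicon):
    """Collect every pair of words kept apart by a single segment."""
    pairs = []
    for (w1, t1), (w2, t2) in combinations(lexicon.items(), 2):
        contrast = differs_in_one_segment(t1, t2)
        if contrast is not None:
            pairs.append((w1, w2, contrast))
    return pairs


pairs = contrasting_pairs(LEXICON)
# Every segment that takes part in at least one meaning-changing substitution.
contrastive_vowels = {seg for _, _, contrast in pairs for seg in contrast}
```

Because every one of the six words differs from every other in the vowel alone, each vowel ends up in `contrastive_vowels`, which is the sense in which the six are treated as different sounds of the language.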
Let us return to the example of letters of the alphabet. If someone who knew nothing
about the alphabet saw these four characters:
‘A’ ‘ɑ’ ‘a’ ‘u’
they would not know that to users of the alphabet three of these characters all represent
the same letter, while the fourth is a different letter. They would quickly discover, through
noticing differences in meaning, that ‘u’ is a different letter from the first three. What
would our illiterate observer discover about these three? They would eventually come to
the conclusion about the written characters ‘a’ and ‘ɑ’ that the former occurs most often
in printed and typed writing while the latter is more common in handwriting, but that if
you substitute one for the other it will not cause a difference in meaning. If our observer
then examined a lot of typed and printed material they would eventually conclude that a
word that began with ‘a’ when it occurred in the middle of a sentence would begin with ‘A’,
and never with ‘a’, at the beginning of a sentence. They would also find that names could
begin with ‘A’ but never with ‘a’; they would conclude that ‘A’ and ‘a’ were different ways of
writing the same letter and that a context in which one of them could occur was always a
context in which the other could not. As will be explained below, we find similar situations
in speech sounds.
If you have not thought about such things before, you may find some difficulty in
understanding the ideas that you have just read about. The principal difficulty lies in the
fact that what is being talked about in our example of letters is at the same time something
abstract (the alphabet, which you cannot see or touch) and something real and concrete
(marks on paper). The alphabet is something that its users know; they also know that it
has twenty-six letters. But when the alphabet is used to write with, these letters appear on
the page in a practically infinite number of different shapes and sizes.
Now we will leave the discussion of letters and the alphabet; these have only been
introduced in this chapter in order to help explain some important general principles.
Let us go back to the sounds of speech and see how these principles can be explained. As
was said earlier in this chapter, we can divide speech up into segments, and we can find
great variety in the way these segments are made. But just as there is an abstract alphabet
as the basis of our writing, so there is an abstract set of units as the basis of our speech.
These units are called phonemes, and the complete set of these units is called the phonemic
system of the language. The phonemes themselves are abstract, but there are many slightly
different ways in which we make the sounds that represent these phonemes, just as there
are many ways in which we may make a mark on a piece of paper to represent a particular
(abstract) letter of the alphabet.
We find cases where it makes little difference which of two possible ways we choose to
pronounce a sound. For example, the b at the beginning of a word such as ‘bad’ will usually
be pronounced with practically no voicing. Sometimes, though, a speaker may produce
the b with full voicing, perhaps in speaking very emphatically. If this is done, the sound is
still identified as the phoneme b, even though we can hear that it is different in some way.
We have in this example two different ways of making b - two different realisations of the
phoneme. One can be substituted for the other without changing the meaning.
We also find cases in speech similar to the writing example of capital ‘A’ and little
‘a’ (one can only occur where the other cannot). For example, we find that the realisation
of t in the word ‘tea’ is aspirated (as are all voiceless plosives when they occur before
stressed vowels at the beginning of syllables). In the word ‘eat’, the realisation of t is
unaspirated (as are all voiceless plosives when they occur at the end of a syllable and are
not followed by a vowel). The aspirated and unaspirated realisations are both recognised
as t by English speakers despite their differences. But the aspirated realisation will never be
found in the place where the unaspirated realisation is appropriate, and vice versa. When
we find this strict separation of places where particular realisations can occur, we say that
the realisations are in complementary distribution. One more technical term needs to
be introduced: when we talk about different realisations of phonemes, we sometimes call
these realisations allophones. In the last example, we were studying the aspirated and
unaspirated allophones of the phoneme t. Usually we do not indicate different allophones
when we write symbols to represent sounds.
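The idea of complementary distribution can be expressed as a simple check on where each realisation has been observed. The sketch below is an illustration under stated assumptions: the observation list, the context labels and the function names are invented for the example and are much cruder than a real description of English.

```python
from collections import defaultdict

# Hypothetical observations: (realisation, context in which it was heard).
# The context labels are simplified stand-ins for the environments
# described in the text; they are not a complete description of English.
OBSERVATIONS = [
    ("tʰ", "syllable-initial before stressed vowel"),  # as in 'tea'
    ("tʰ", "syllable-initial before stressed vowel"),  # as in 'tip'
    ("t", "syllable-final, no following vowel"),       # as in 'eat'
    ("t", "syllable-final, no following vowel"),       # as in 'pit'
]


def contexts_by_realisation(observations):
    """Group the set of contexts in which each realisation occurs."""
    contexts = defaultdict(set)
    for realisation, context in observations:
        contexts[realisation].add(context)
    return contexts


def in_complementary_distribution(observations, a, b):
    """True if a and b are both attested and never share a context."""
    contexts = contexts_by_realisation(observations)
    if a not in contexts or b not in contexts:
        return False
    return contexts[a].isdisjoint(contexts[b])
```

With the data above, the aspirated and unaspirated realisations of t never share a context, so the check succeeds. For the two realisations of b mentioned earlier (with and without voicing), both can occur in the same word-initial position, so the same check would fail: that is substitution without a change of meaning, not complementary distribution.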
You have now seen a number of symbols of several different sorts. Basically the
symbols are for one of two purposes: either they are symbols for phonemes (phonemic
symbols) or they are phonetic symbols (which is what the symbols were first
introduced as).
We will look first at phonemic symbols. The most important point to remember is
the rather obvious-seeming fact that the number of phonemic symbols must be exactly
the same as the number of phonemes we decide exist in the language. It is rather like
typing on a keyboard - there is a fixed number of keys that you can press. However, some
of our phonemic symbols consist of two characters; for example, we usually treat tʃ (as
in ‘chip’ tʃɪp) as one phoneme, so tʃ is a phonemic symbol consisting of two characters
(t and ʃ).
One of the traditional exercises in pronunciation teaching by phonetic methods is
that of phonemic transcription, where every speech sound must be identified as one of
the phonemes and written with the appropriate symbol. There are two different kinds
of transcription exercise: in one, transcription from dictation, the student must listen
to a person, or a recording, and write down what they hear; in the other, transcription
from a written text, the student is given a passage written in orthography and must
use phonemic symbols to represent how she or he thinks it would be pronounced by
type, and the context should make it clear whether the symbols are phonemic or phonetic
in function.
It should now be clear that there is a fundamental difference between phonemic
symbols and phonetic symbols. Since the phonemic symbols do not have to indicate precise
phonetic quality, it is possible to choose among several possible symbols to represent
a particular phoneme; this has had the unfortunate result that different books on English
pronunciation have used different symbols, causing quite a lot of confusion to students.
In this course we are using the symbols now most frequently used in British publishing.
It would be too long a task to examine other writers’ symbols in detail, but it is worth
considering some of the reasons for the differences. One factor is the complication and
expense of using special symbols which create problems in typing and printing; it could,
for example, be argued that a is a symbol that is found in practically all typefaces whereas
æ is unusual, and that the a symbol should be used for the vowel in ‘cat’ instead of æ.
Some writers have concentrated on producing a set of phonemic symbols that need the
minimum number of special or non-standard symbols. Others have thought it important
that the symbols should be as close as possible to the symbols that a phonetician would
choose to give a precise indication of sound quality. To use the same example again, referring
to the vowel in ‘cat’, it could be argued that if the vowel is noticeably closer than
cardinal vowel no. 4 [a], it is more suitable to use the symbol æ, which is usually used
to represent a vowel between open-mid and open. There can be disagreements about the
most important characteristics of a sound that a symbol should indicate: one example is
the vowels of the words ‘bit’ and ‘beat’. Some writers have claimed that the most important
difference between them is that the former is short and the latter long, and transcribed the
former with i and the latter with i: (the difference being entirely in the length mark); other
writers have said that the length (or quantity) difference is less important than the quality
difference, and transcribe the vowel of ‘bit’ with the symbol ɪ and that of ‘beat’ with i.
Yet another point of view is that quality and quantity are both important and should both
be indicated; this point of view results in a transcription using ɪ for ‘bit’ and i:, a symbol
different from ɪ both in shape of symbol (suggesting quality difference) and in length
mark (indicating quantity difference), for ‘beat’. This is the approach taken in this course.
5.3 Phonology
Chapters 2-4 were mainly concerned with matters of phonetics - the comparatively
straightforward business of describing the sounds that we use in speaking. When we talk
about how phonemes function in language, and the relationships among the different phonemes
- when, in other words, we study the abstract side of the sounds of language, we are
studying a related but different subject that we call phonology. Only by studying both the
phonetics and the phonology of English is it possible to acquire a full understanding of
the use of sounds in English speech. Let us look briefly at some areas that come within the
subject of phonology; these areas of study will be covered in more detail later in the course.
Suprasegmental phonology
Many significant sound contrasts are not the result of differences between phonemes.
For example, stress is important: when the word ‘import’ is pronounced with the first
syllable sounding stronger than the second, English speakers hear it as a noun, whereas
when the second syllable is stronger the word is heard as a verb. Intonation is also important:
if the word ‘right’ is said with the pitch of the voice rising, it is likely to be heard as a
question or as an invitation to a speaker to continue, while falling pitch is more likely to
be heard as confirmation or agreement. These examples show sound contrasts that extend
over several segments (phonemes), and such contrasts are called suprasegmental. We will
look at a number of other aspects of suprasegmental phonology later in the course.
This chapter is theoretical rather than practical. There is no shortage of material to read
on the subject of the phoneme, but much of it is rather difficult and assumes a lot of
background knowledge. For basic reading I would suggest Katamba (1989: Chapter 2),
Cruttenden (2008: Chapter 5, Section 3) or Giegerich (1992: 29-33). There are many
classic works: Jones (1976; first published 1950) is widely regarded as such, although it
is often criticised nowadays for being superficial or even naive. Another classic work is
Pike’s Phonemics (1947), subtitled “A Technique for Reducing Languages to Writing”:
this is essentially a practical handbook for people who need to analyse the phonemes of
unknown languages, and contains many examples and exercises.
The subject of symbols is a large one: there is a good survey in Abercrombie (1967:
Chapter 7). The IPA has tried as far as possible to keep to Roman-style symbols, although it
is inevitable that these symbols have to be supplemented with diacritics (extra marks that
add detail to symbols - to mark the vowel [e] as long, we can add the length diacritic ː to
give [eː], or to mark it as centralised we can add the centralisation diacritic ¨ to give [ë]).
The IPA’s present practice on symbolisation is set out in the Handbook of the International
Phonetic Association (IPA, 1999). There is a lot of information about symbol design and
choice in Pullum and Ladusaw (1996). Some phoneticians working at the end of the nineteenth
century tried to develop non-alphabetic sets of symbols whose shape would indicate
all essential phonetic characteristics; these are described in Abercrombie (1967: Chapter 7).
We have seen that one must choose between, on the one hand, symbols that are very
informative but slow to write and, on the other, symbols that are not very precise but are
quick and convenient to use. Pike (1943) presents at the end of his book an “analphabetic
notation” designed to permit the coding of sounds with great precision on the basis of their
articulation; an indication of the complexity of the system is the fact that the full specification
of the vowel [o] requires eighty-eight characters. On the opposite side, many American
writers have avoided various IPA symbols as being too complex, and have tried to use as far
as possible symbols and diacritics which are already in existence for various special alphabetic
requirements of European languages and which are available on standard keyboards.
For example, where the IPA has ʃ and ʒ, symbols not usually found outside phonetics,
many Americans use š and ž, the mark above the symbols being widely used for Slavonic
languages that do not use the Cyrillic alphabet. The widespread use of computer printers
and word processing has revolutionised the use of symbols, and sets of phonetic fonts are
widely available via the Internet. We are still some way, however, from having a universally
agreed set of IPA symbol codes, and for much computer-based phonetic research it is
necessary to make do with conventions which use existing keyboard characters.
Note for teachers
It should be made clear to students that the treatment of the phoneme in this chapter is
only an introduction. It is difficult to go into detailed examples since not many symbols
have been introduced at this stage, so further consideration of phonological issues is left
until later chapters.
Written exercises
The words in the following list should be transcribed first phonemically, then (in square
brackets) phonetically. In your phonetic transcription you should use the following diacritics:
• b, d, g pronounced without voicing are transcribed b̥, d̥, g̊
• p, t, k pronounced with aspiration are transcribed pʰ, tʰ, kʰ
• i:, ɑ:, ɔ:, ɜ:, u: when shortened by a following fortis consonant should be
transcribed iˑ, ɑˑ, ɔˑ, ɜˑ, uˑ
• ɪ, e, æ, ʌ, ɒ, ʊ, ə when shortened by a following fortis consonant should be transcribed
ɪ̆, ĕ, æ̆, ʌ̆, ɒ̆, ʊ̆, ə̆. Use the same mark for diphthongs, placing the diacritic
on the first part of the diphthong.
Example spelling: ‘peat’; phonemic: pi:t; phonetic: pʰiˑt
Words for transcription
a) speed c) book e) car g) appeared i) stalk
b) partake d) goat f) bad h) toast
6 Fricatives and affricates
Fricatives are consonants with the characteristic that air escapes through a narrow
passage and makes a hissing sound. Most languages have fricatives, the most commonly
found being something like s. Fricatives are continuant consonants, which means that you
can continue making them without interruption as long as you have enough air in your
lungs. Plosives, which were described in Chapter 4, are not continuants. You can demonstrate
the importance of the narrow passage for the air in the following ways:
i) Make a long, hissing s sound and gradually lower your tongue so that it is no
longer close to the roof of the mouth. The hissing sound will stop as the air
passage gets larger.
ii) Make a long f sound and, while you are producing this sound, use your fingers to
pull the lower lip away from the upper teeth. Notice how the hissing sound of the
air escaping between teeth and lip suddenly stops.
Affricates are rather complex consonants. They begin as plosives and end as fricatives.
A familiar example is the affricate heard at the beginning and end of the word ‘church’.
It begins with an articulation practically the same as that for t, but instead of a rapid
release with plosion and aspiration as we would find in the word ‘tip’, the tongue moves
to the position for the fricative ʃ that we find at the beginning of the word ‘ship’. So
the plosive is followed immediately by fricative noise. Since phonetically this affricate
is composed of t and ʃ we represent it as tʃ, so that the word ‘church’ is transcribed
as tʃɜ:tʃ.
However, the definition of an affricate must be more restricted than what has been
given so far. We would not class all sequences of plosive plus fricative as affricates; for
example, we find in the middle of the word ‘breakfast’ the plosive k followed by the fricative
f. English speakers would generally not accept that kf forms a consonantal unit in
the way that tʃ seems to. It is usually said that the plosive and the following fricative must
be made with the same articulators - the plosive and fricative must be homorganic. The
sounds k, f are not homorganic, but t, d and ʃ, ʒ, being made with the tongue blade against
the alveolar ridge, are homorganic. This still leaves the possibility of quite a large number
of affricates since, for example, t, d are homorganic not only with ʃ, ʒ but also with s, z, so
ts, dz would also count as affricates. We could also consider tr, dr as affricates for the same
reason. However, we normally only count tʃ, dʒ as affricate phonemes of English.
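The point that being homorganic is necessary but not sufficient can be shown with a small lookup. This is a rough sketch: the `ARTICULATORS` table uses deliberately coarse labels that follow the wording of the text, and the table, labels and function names are assumptions made for the example only.

```python
# Coarse articulator labels for the sounds discussed above. The labels
# follow the wording of the text (tongue blade against the alveolar ridge
# for t, d, s, z, ʃ, ʒ); the dictionary itself is an illustrative assumption.
ARTICULATORS = {
    "t": "tongue blade + alveolar ridge",
    "d": "tongue blade + alveolar ridge",
    "s": "tongue blade + alveolar ridge",
    "z": "tongue blade + alveolar ridge",
    "ʃ": "tongue blade + alveolar ridge",
    "ʒ": "tongue blade + alveolar ridge",
    "k": "back of tongue + soft palate",
    "f": "lower lip + upper teeth",
}

# The only two sequences normally counted as affricate phonemes of English.
AFFRICATE_PHONEMES = {("t", "ʃ"), ("d", "ʒ")}


def is_homorganic(plosive, fricative):
    """True if both sounds are made with the same articulators."""
    return ARTICULATORS[plosive] == ARTICULATORS[fricative]


def classify(plosive, fricative):
    """Label a plosive + fricative sequence in the terms used above."""
    if not is_homorganic(plosive, fricative):
        return "not homorganic"
    if (plosive, fricative) in AFFRICATE_PHONEMES:
        return "affricate phoneme"
    return "homorganic, but not counted as a phoneme"
```

Here `classify("k", "f")` rules out the sequence in ‘breakfast’ on phonetic grounds, while `classify("t", "s")` passes the homorganic test and is excluded only by the separate, phonological decision about which units count as phonemes.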
Although tʃ, dʒ can be said to be composed of a plosive and a fricative, it is usual
to regard them as being single, independent phonemes of English. In this way, t is one
phoneme, ʃ is another and tʃ yet another. We would say that the pronunciation of the word
‘church’ tʃɜ:tʃ is composed of three phonemes, tʃ, ɜ: and tʃ. We will look at this question
of “two sounds = one phoneme” from the theoretical point of view in Chapter 13.
English has quite a complex system of fricative phonemes. They can be seen in the
table below:
PLACE OF ARTICULATION
                       Labiodental  Dental  Alveolar  Post-alveolar | Glottal
Fortis (“voiceless”)   f            θ       s         ʃ             |
                                                                    | h
Lenis (“voiced”)       v            ð       z         ʒ             |
With the exception of glottal, each place of articulation has a pair of phonemes, one fortis
and one lenis. This is similar to what was seen with the plosives. The fortis fricatives are
said to be articulated with greater force than the lenis, and their friction noise is louder. The
lenis fricatives have very little or no voicing in initial and final positions, but may be voiced
when they occur between voiced sounds. The fortis fricatives have the effect of shortening a
preceding vowel in the same way as fortis plosives do (see Chapter 4, Section 4). Thus in a pair
of words like ‘ice’ aɪs and ‘eyes’ aɪz, the aɪ diphthong in the first word is considerably shorter
than ai in the second. Since there is only one fricative with glottal place of articulation, it
would be rather misleading to call it fortis or lenis (which is why there is a line on the chart
above dividing h from the other fricatives).
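The fricative system can also be written down as a small data structure, which makes the pairing by place of articulation explicit. The layout and the helper names `describe` and `partner` are assumptions for illustration; h is kept in a separate table because, as noted above, calling it fortis or lenis would be misleading.

```python
# Paired fricatives: place of articulation -> (fortis, lenis).
PAIRED_FRICATIVES = {
    "labiodental": ("f", "v"),
    "dental": ("θ", "ð"),
    "alveolar": ("s", "z"),
    "post-alveolar": ("ʃ", "ʒ"),
}

# The glottal fricative has no partner and is given no fortis/lenis label.
UNPAIRED_FRICATIVES = {"glottal": "h"}


def describe(symbol):
    """Return (place, 'fortis' | 'lenis' | None) for a fricative symbol."""
    for place, (fortis, lenis) in PAIRED_FRICATIVES.items():
        if symbol == fortis:
            return (place, "fortis")
        if symbol == lenis:
            return (place, "lenis")
    for place, unpaired in UNPAIRED_FRICATIVES.items():
        if symbol == unpaired:
            return (place, None)
    raise KeyError(f"not a fricative in this table: {symbol!r}")


def partner(symbol):
    """Return the other member of a fortis/lenis pair, or None for h."""
    place, strength = describe(symbol)
    if strength is None:
        return None
    fortis, lenis = PAIRED_FRICATIVES[place]
    return lenis if strength == "fortis" else fortis


# Total number of fricative phonemes in the table.
fricative_count = 2 * len(PAIRED_FRICATIVES) + len(UNPAIRED_FRICATIVES)
```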
AU6 (CD 1), Exs 1-3
We will now look at the fricatives separately, according to their place of articulation.
f, v (example words: ‘fan’, ‘van’; ‘safer’, ‘saver’; ‘half’, ‘halve’)
These are labiodental: the lower lip is in contact with the upper teeth as shown in Fig. 18.
The fricative noise is never very strong and is scarcely audible in the case of v.
θ, ð (example words: ‘thumb’, ‘thus’; ‘ether’, ‘father’; ‘breath’, ‘breathe’)
The dental fricatives are sometimes described as if the tongue were placed between the
front teeth, and it is common for teachers to make their students do this when they are
trying to teach them to make this sound. In fact, however, the tongue is normally placed
behind the teeth, as shown in Fig. 19, with the tip touching the inner side of the lower
teeth. The air escapes through the gaps between the tongue and the teeth. As with f, v, the
fricative noise is weak.
s, z (example words: ‘sip’, ‘zip’; ‘facing’, ‘phasing’; ‘rice’, ‘rise’)
These are alveolar fricatives, with the same place of articulation as t, d. The air escapes
through a narrow passage along the centre of the tongue, and the sound produced is
comparatively intense. The tongue position is shown in Fig. 16 in Chapter 4.
ʃ, ʒ (example words: ‘ship’ (initial ʒ is very rare in English); ‘Russia’, ‘measure’; ‘Irish’,
‘garage’)
These fricatives are called post-alveolar, which can be taken to mean that the tongue is
in contact with an area slightly further back than that for s, z (see Fig. 20). If you make s,
then ʃ, you should be able to feel your tongue move backwards.
The air escapes through a passage along the centre of the tongue, as in s, z, but the passage
is a little wider. Most BBC speakers have rounded lips for ʃ, ʒ, and this is an important
difference between these consonants and s, z. The fricative ʃ is a common and
widely distributed phoneme, but ʒ is not. All the other fricatives described so far (f, v,
θ, ð, s, z, ʃ) can be found in initial, medial and final positions, as shown in the example
words. In the case of ʒ, however, the distribution is much more limited. Very few English
words begin with ʒ (most of them have come into the language comparatively recently
from French) and not many end with this consonant. Only medially, in words such as
‘measure’ meʒə, ‘usual’ ju:ʒuəl is it found at all commonly.
h (example words: ‘head’, ‘ahead’, ‘playhouse’)
The place of articulation of this consonant is glottal. This means that the narrowing that
produces the friction noise is between the vocal folds, as described in Chapter 4. If you
breathe out silently, then produce h, you are moving your vocal folds from wide apart to
close together. However, this is not producing speech. When we produce h in speaking
English, many different things happen in different contexts. In the word ‘hat’, the h is
followed by an æ vowel. The tongue, jaw and lip positions for the vowel are all produced
simultaneously with the h consonant, so that the glottal fricative has an æ quality. The
same is found for all vowels following h; the consonant always has the quality of the
vowel it precedes, so that in theory if you could listen to a recording of h-sounds cut
off from the beginnings of different vowels in words like ‘hit’, ‘hat’, ‘hot’, ‘hut’, etc., you
should be able to identify which vowel would have followed the h. One way of stating
the above facts is to say that phonetically h is a voiceless vowel with the quality of the
voiced vowel that follows it.
Phonologically, h is a consonant. It is usually found before vowels. As well as being
found in initial position it is found medially in words such as ‘ahead’ əhed, ‘greenhouse’
gri:nhaʊs, ‘boathook’ bəʊthʊk. It is noticeable that when h occurs between voiced sounds
(as in the words ‘ahead’, ‘greenhouse’), it is pronounced with voicing - not the normal voicing
of vowels but a weak, slightly fricative sound called breathy voice. It is not necessary for
foreign learners to attempt to copy this voicing, although it is important to pronounce
h where it should occur in BBC pronunciation. Many English speakers are surprisingly
sensitive about this consonant; they tend to judge as sub-standard a pronunciation in
which h is missing. In reality, however, practically all English speakers, however carefully
they speak, omit the h in non-initial unstressed pronunciations of the words ‘her’, ‘he’,
‘him’, ‘his’ and the auxiliary ‘have’, ‘has’, ‘had’, although few are aware that they do this.
There are two rather uncommon sounds that need to be introduced; since they
are said to have some association with h, they will be mentioned here. The first is the
sound produced by some speakers in words which begin orthographically (i.e. in their
spelling form) with ‘wh’; most BBC speakers pronounce the initial sound in such words
(e.g. ‘which’, ‘why’, ‘whip’, ‘whale’) as w (which is introduced in Chapter 7), but there are
some (particularly when they are speaking clearly or emphatically) who pronounce the
sound used by most American and Scottish speakers, a voiceless fricative with the same
lip, tongue and jaw position as w. The phonetic symbol for this voiceless fricative is ʍ.
We can find pairs of words showing the difference between this sound and the voiced
sound w:
‘witch’ wɪtʃ ‘which’ ʍɪtʃ
‘wail’ weɪl ‘whale’ ʍeɪl
‘Wye’ waɪ ‘why’ ʍaɪ
‘wear’ weə ‘where’ ʍeə
The obvious conclusion to draw from this is that, since substituting one sound for the
other causes a difference in meaning, the two sounds must be two different phonemes.
It is therefore rather surprising to find that practically all writers on the subject of
the phonemes of English decide that this answer is not correct, and that the sound ʍ
in ‘which’, ‘why’, etc., is not a phoneme of English but is a realisation of a sequence of two
phonemes, h and w. We do not need to worry much about this problem in describing
the BBC accent. However, it should be noted that in the analysis of the many accents of
English that do have a “voiceless w” there is not much more theoretical justification for
treating the sound as h plus w than there is for treating p as h plus b. Whether the question
of this sound is approached phonetically or phonologically, there is no h sound in the
“voiceless w”.
A very similar case is the sound found at the beginning of words such as ‘huge’,
‘human’, ‘hue’. Phonetically this sound is a voiceless palatal fricative (for which the
phonetic symbol is ç); there is no glottal fricative at the beginning of ‘huge’, etc. However,
it is usual to treat this sound as h plus j (the latter is another consonant that is introduced
in Chapter 7 - it is the sound at the beginning of ‘yes’, ‘yet’). Again we can see that
a phonemic analysis does not necessarily have to be exactly in line with phonetic facts. If
we were to say that these two sounds ʍ, ç were phonemes of English, we would have two
extra phonemes that do not occur very frequently. We will follow the usual practice of
transcribing the sound at the beginning of ‘huge’, etc., as hj just because it is convenient
and common practice.
It was explained in Section 6.1 that tʃ, dʒ are the only two affricate phonemes
in English. As with the plosives and most of the fricatives, we have a fortis/lenis pair,
and the voicing characteristics are the same as for these other consonants. tʃ is slightly
aspirated in the positions where p, t, k are aspirated, but not strongly enough for it to
be necessary for foreign learners to give much attention to it. The place of articulation
is the same as for ʃ, ʒ - that is, it is post-alveolar. This means that the t component of
tʃ has a place of articulation rather further back in the mouth than the t plosive usually
has. When tʃ is final in the syllable it has the effect of shortening a preceding vowel, as
do other fortis consonants. tʃ, dʒ often have rounded lips.
6.4 Fortis consonants
All the consonants described so far, with the exception of h, belong to pairs distinguished
by the difference between fortis and lenis. Since the remaining consonants to be
described are not paired in this way, a few points that still have to be made about fortis
consonants are included in this chapter.
The first point concerns the shortening of a preceding vowel by a syllable-final fortis
consonant. As was said in Chapter 4, the effect is most noticeable in the case of long vowels
and diphthongs, although it does also affect short vowels. What happens if something
other than a vowel precedes a fortis consonant? This arises in syllables ending with l, m, n,
ŋ, followed by a fortis consonant such as p, t, k as in ‘belt’ belt, ‘bump’ bʌmp, ‘bent’ bent,
‘bank’ bæŋk. The effect on those continuant consonants is the same as on a vowel: they
are considerably shortened.
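The shortening rule can be stated as a small function. This is a simplified sketch under assumptions that go beyond the text: a syllable is given as a tuple of phoneme symbols, only the segment immediately before the final fortis consonant is marked, and the names `FORTIS`, `CONTINUANTS`, `VOWELS` and `shortened_segment` are invented for the example.

```python
# Fortis consonants described so far (plosives, fricatives, affricate).
FORTIS = {"p", "t", "k", "tʃ", "f", "θ", "s", "ʃ"}

# The continuant consonants named in the text as undergoing shortening.
CONTINUANTS = {"l", "m", "n", "ŋ"}

# Short vowels, long vowels and diphthongs of the accent described.
VOWELS = {
    "ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə",
    "i:", "ɑ:", "ɔ:", "ɜ:", "u:",
    "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə",
}


def shortened_segment(syllable):
    """Return the index of the segment shortened by a syllable-final
    fortis consonant, or None if the rule does not apply.

    Simplification: only the segment directly before the final
    consonant is considered.
    """
    if len(syllable) < 2 or syllable[-1] not in FORTIS:
        return None
    before = syllable[-2]
    if before in VOWELS or before in CONTINUANTS:
        return len(syllable) - 2
    return None
```

For ‘belt’, given as `("b", "e", "l", "t")`, the function picks out the l; for ‘ice’ `("aɪ", "s")` it picks out the diphthong, while for ‘eyes’ `("aɪ", "z")` it returns `None` because the final consonant is lenis.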
Fortis consonants are usually articulated with open glottis - that is, with the vocal folds
separated. This is always the case with fricatives, where airflow is essential for successful
production. However, with plosives an alternative possibility is to produce the consonant
with completely closed glottis. This type of plosive articulation, known as glottalisation,
is found widely in contemporary English pronunciation, though only in specific contexts.
The glottal closure occurs immediately before p, t, k, tʃ. The most widespread glottalisation
is that of tʃ at the end of a stressed syllable (I leave defining what “stressed syllable”
means until Chapter 8). If we use the symbol ʔ to represent a glottal closure, the phonetic
transcription for various words containing tʃ can be given as follows:
Learners usually find these rules difficult to learn, from the practical point of view,
and find it simpler to keep to the more conservative pronunciation which does not use
glottalisation. However, it is worth pointing out the fact that this occurs - many learners
notice the glottalisation and want to know what it is that they are hearing, and many of
them find that they acquire the glottalised pronunciation in talking to native speakers.
The dental fricative ð is something of a problem: although there are not many English
words in which this sound appears, those words are ones which occur very frequently -
words like ‘the’, ‘this’, ‘there’, ‘that’. This consonant often shows so little friction noise that
on purely phonetic grounds it seems incorrect to class it as a fricative. It is more like a weak
(lenis) dental plosive. This matter is discussed again in Chapter 14, Section 14.2.
On the phonological side, I have brought in a discussion of the phonemic analysis
of two “marginal” fricatives ʍ, ç which present a problem (though not a particularly
important or fundamental one): I feel that this is worth discussing in that it gives a
good idea of the sort of problem that can arise in analysing the phonemic system of a
language. The other problem area is the glottalisation described at the end of the chapter.
There is now a growing awareness of how frequently this is to be found in contemporary
English speech; however, it is not easy to formulate rules stating the contexts in which this
occurs. There is discussion in Brown (1990: 28-30), in Cruttenden (2008: Section 9.2.8),
in Ladefoged (2006: 60-1) and in Wells (1982: Section 3.4.5).
Notes for teachers
Written exercises
So far we have studied two major groups of consonants - the plosives and fricatives
- and also the affricates tʃ, dʒ; this gives a total of seventeen. There remain the nasal
consonants - m, n, ŋ - and four others - l, r, w, j; these four are not easy to fit into groups.
All of these seven consonants are continuants and usually have no friction noise, but in
other ways they are very different from each other.
7.1 Nasals
The basic characteristic of a nasal consonant is that the air escapes through the nose.
For this to happen, the soft palate must be lowered; in the case of all the other consonants
and vowels of English, the soft palate is raised and air cannot pass through the nose. In
nasal consonants, however, air does not pass through the mouth; it is prevented by a complete
closure in the mouth at some point. If you produce a long sequence dndndndndn
without moving your tongue from the position for alveolar closure, you will feel your
soft palate moving up and down. The three types of closure are: bilabial (lips), alveolar
(tongue blade against alveolar ridge) and velar (back of tongue against the palate). This set
of places produces three nasal consonants - m, n, ŋ - which correspond to the three places
of articulation for the pairs of plosives p b, t d, k g.
The consonants m, n are simple and straightforward with distributions quite similar
to those of the plosives. There is in fact little to describe. However, ŋ is a different matter. It
is a sound that gives considerable problems to foreign learners, and one that is so unusual
in its phonological aspect that some people argue that it is not one of the phonemes of
English at all. The place of articulation of ŋ is the same as that of k, g; it is a useful
exercise to practise making a continuous ŋ sound. If you do this, it is very important not
to produce a k or g at the end - pronounce the ŋ like m or n.
AU7 (CD 1), Exs 1 & 2
We will now look at some ways in which the distribution of ŋ is unusual.
i) In initial position we find m, n occurring freely, but ŋ never occurs in this
position. With the possible exception of ʒ, this makes ŋ the only English
consonant that does not occur initially.
ii) Medially, ŋ occurs quite frequently, but there is in the BBC accent a rather
complex and quite interesting rule concerning the question of when ŋ may
Nasals and other consonants 47
A                      B
‘finger’ fɪŋgə         ‘singer’ sɪŋə
‘anger’ æŋgə           ‘hanger’ hæŋə
this rule. It is important to remember that English speakers in general (apart from those
trained in phonetics) are quite ignorant of this rule, and yet if a foreigner uses the wrong
pronunciation (i.e. pronounces ŋg where ŋ should occur, or ŋ where ŋg should be used),
they notice that a mispronunciation has occurred.
iii) A third way in which the distribution of ŋ is unusual is the small number of
vowels it is found to follow. It rarely occurs after a diphthong or long vowel,
so only the short vowels ɪ, e, æ, ʌ, ɒ, ʊ, ə are regularly found preceding this
consonant.
The velar nasal consonant ŋ is, in summary, phonetically simple (it is no more difficult
to produce than m or n) but phonologically complex (it is, as we have seen, not easy to
describe the contexts in which it occurs).
The l phoneme (as in ‘long’ lɒŋ, ‘hill’ hɪl) is a lateral approximant. This is a consonant
in which the passage of air through the mouth does not go in the usual way along the
centre of the tongue; instead, there is complete closure between the centre of the tongue
and the part of the roof of the mouth where contact is to be made (the alveolar ridge in
the case of l). Because of this complete closure along the centre, the only way for the air
to escape is along the sides of the tongue. The lateral approximant is therefore somewhat
different from other approximants, in which there is usually much less contact between
the articulators. If you make a long l sound you may be able to feel that the sides of your
tongue are pulled in and down while the centre is raised, but it is not easy to become
consciously aware of this; what is more revealing (if you can do it) is to produce a long
sequence of alternations between d and l without any intervening vowel. If you produce
dldldldldl without moving the middle of the tongue, you will be able to feel the movement
of the sides of the tongue that is necessary for the production of a lateral. It is also possible
to see this movement in a mirror if you open your lips wide as you produce it. Finally, it
is also helpful to see if you can feel the movement of air past the sides of the tongue; this
is not really possible in a voiced sound (the obstruction caused by the vibrating vocal folds
reduces the airflow), but if you try to make a very loud whispered l, you should be able to
feel the air rushing along the sides of your tongue.
We find l initially, medially and finally, and its distribution is therefore not
particularly limited. In BBC pronunciation, the consonant has one unusual characteristic:
the realisation of l found before vowels sounds quite different from that found in
other contexts. For example, the realisation of l in the word ‘lea’ li: is quite different
from that in ‘eel’ i:l. The sound in ‘eel’ is what we call a “dark l”; it has a quality rather
similar to an [u] vowel, with the back of the tongue raised. The phonetic symbol for
this sound is ɫ. The sound in ‘lea’ is what is called a “clear l”; it resembles an [i] vowel,
with the front of the tongue raised (we do not normally use a special phonetic symbol,
different from l, to indicate this sound). The “dark l” is also found when it precedes
a consonant, as in ‘eels’ i:lz. We can therefore predict which realisation of l (clear or
dark) will occur in a particular context: clear l will never occur before consonants or
before a pause, but only before vowels; dark l never occurs before vowels. We can say,
using terminology introduced in Chapter 5, that clear l and dark l are allophones of the
phoneme l in complementary distribution. Most English speakers do not consciously
know about the difference between clear and dark l, yet they are quick to detect the
difference when they hear English speakers with different accents, or when they hear
foreign learners who have not learned the correct pronunciation. You might be able to
observe that most American and lowland Scottish speakers use a “dark l” in all positions,
and don’t have a “clear l” in their pronunciation, while most Welsh and Irish speakers
have “clear l” in all positions.
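The complementary distribution of clear l and dark l just described can be written down as a very small decision rule. The following Python sketch is only an illustration: the vowel inventory, the function names and the convention of passing `None` for a pause are assumptions made for the example, not part of the course.

```python
# Illustrative BBC vowel inventory (an assumption of this sketch); each
# symbol, including each diphthong, is treated as a single segment.
VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə", "i:", "ɑ:", "ɔ:", "u:", "ɜ:",
          "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə"}


def l_allophone(next_segment):
    """Predict the allophone of l from what follows it.

    next_segment is the following phoneme, or None for a pause.
    """
    if next_segment in VOWELS:
        return "clear l"
    return "dark l"  # before a consonant or before a pause


def describe_l(segments):
    """Return the predicted allophone for every l in a segmented word."""
    result = []
    for i, seg in enumerate(segments):
        if seg == "l":
            nxt = segments[i + 1] if i + 1 < len(segments) else None
            result.append(l_allophone(nxt))
    return result


print(describe_l(["l", "i:"]))       # 'lea'
print(describe_l(["i:", "l"]))       # 'eel'
print(describe_l(["i:", "l", "z"]))  # 'eels'
```

The sketch models only the BBC pattern; accents with dark l or clear l in all positions would need a different rule.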
Another allophone of l is found when it follows p, k at the beginning of a stressed
syllable. The l is then devoiced (i.e. produced without the voicing found in most realisations
of this phoneme) and pronounced as a fricative. The situation is (as explained
in Chapter 4) similar to the aspiration found when a vowel follows p, t, k in a stressed
syllable: the first part of the vowel is devoiced.
the mouth than that for alveolar consonants such as t, d, which is why this approximant
is called “post-alveolar”. A rather different r sound is found at the beginning of a syllable
if it is preceded by p, t, k; it is then voiceless and fricative. This pronunciation is found in
words such as ‘press’, ‘tress’, ‘cress’.
One final characteristic of the articulation of r is that it is usual for the lips to be
slightly rounded; learners should do this but should be careful not to exaggerate it. If
the lip-rounding is too strong the consonant will sound too much like w, which is the
sound that most English children produce until they have learned to pronounce r in the
adult way.
The distributional peculiarity of r in the BBC accent is very easy to state: this
phoneme only occurs before vowels. No one has any difficulty in remembering this rule,
but foreign learners (most of whom, quite reasonably, expect that if there is a letter ‘r’ in
the spelling then r should be pronounced) find it difficult to apply the rule to their own
pronunciation. There is no problem with words like the following:
i) ‘red’ red ‘arrive’ əraɪv ‘hearing’ hɪərɪŋ
In these words r is followed by a vowel. But in the following words there is no r in the
pronunciation:
ii) ‘car’ kɑ: ‘ever’ evə ‘here’ hɪə
iii) ‘hard’ hɑ:d ‘verse’ vɜ:s ‘cares’ keəz
Many accents of English do pronounce r in words like those of (ii) and (iii) (e.g. most
American, Scots and West of England accents). Those accents which have r in final position
(before a pause) and before a consonant are called rhotic accents, while accents in which
r only occurs before vowels (such as BBC) are called non-rhotic.
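The difference between rhotic and non-rhotic accents can be sketched as a rule that keeps r only when a vowel follows. In the Python example below, the input is a segment list in which every spelling ‘r’ is represented by r; that starting point, the vowel inventory and the function name are assumptions made for illustration. The sketch does not reproduce the vowel differences between accents (for example the ɪə of BBC ‘here’).

```python
# Illustrative vowel inventory (an assumption of this sketch).
VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə", "i:", "ɑ:", "ɔ:", "u:", "ɜ:",
          "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə"}


def realise_r(segments, rhotic=False):
    """Keep r everywhere in a rhotic accent; otherwise keep it only before a vowel."""
    if rhotic:
        return list(segments)
    out = []
    for i, seg in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg == "r" and nxt not in VOWELS:
            continue  # r before a consonant or a pause is not pronounced
        out.append(seg)
    return out


print(realise_r(["r", "e", "d"]))              # 'red': r is before a vowel
print(realise_r(["k", "ɑ:", "r"]))             # 'car': final r
print(realise_r(["h", "ɑ:", "r", "d"]))        # 'hard': r before a consonant
print(realise_r(["k", "ɑ:", "r"], rhotic=True))
```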
These are the consonants found at the beginning of words such as ‘yet’ and ‘wet’.
They are known as approximants (introduced in Section 7.3 above). The most important
thing to remember about these phonemes is that they are phonetically like vowels
but phonologically like consonants (in earlier works on phonology they were known as
“semivowels”). From the phonetic point of view the articulation of j is practically the same
as that of a front close vowel such as [i], but is very short. In the same way w is closely
similar to [u]. If you make the initial sound of ‘yet’ or ‘wet’ very long, you will be able to
hear this. But despite this vowel-like character, we use them like consonants. For example,
they only occur before vowel phonemes; this is a typically consonantal distribution. We can
show that a word beginning with w or j is treated as beginning with a consonant in the
following way: the indefinite article is ‘a’ before a consonant (as in ‘a cat’, ‘a dog’), and ‘an’
before a vowel (as in ‘an apple’, ‘an orange’). If a word beginning with w or j is preceded
by the indefinite article, it is the ‘a’ form that is found (as in ‘a way’, ‘a year’). Another
example is that of the definite article. Here the rule is that ‘the’ is pronounced as ðə before
consonants (as in ‘the dog’ ðə dɒg, ‘the cat’ ðə kæt) and as ði before vowels (as in ‘the
apple’ ði æpl, ‘the orange’ ði ɒrɪndʒ). This evidence illustrates why it is said that j, w are
phonologically consonants. However, it is important to remember that to pronounce them
as fricatives (as many foreign learners do), or as affricates, is a mispronunciation. Only in
special contexts do we hear friction noise in j or w; this is when they are preceded by p, t,
k at the beginning of a syllable, as in these words:
‘pure’ pjʊə (no English words begin with pw)
‘tune’ tju:n ‘twin’ twɪn
‘queue’ kju: ‘quit’ kwɪt
When p, t, k come at the beginning of a syllable and are followed by a vowel, they are aspirated,
as was explained in Chapter 4. This means that the beginning of a vowel is voiceless
in this context. However, when p, t, k are followed not by a vowel but by one of l, r, j, w,
these voiced continuant consonants undergo a similar process, as has been mentioned
earlier in this chapter: they lose their voicing and become fricative. So words like ‘play’
pleɪ, ‘tray’ treɪ, ‘quick’ kwɪk, ‘cue’ kju: contain devoiced and fricative l, r, w, j whereas
‘lay’, ‘ray’, ‘wick’, ‘you’ contain voiced l, r, w, j. Consequently, if for example ‘tray’ were to
be pronounced without devoicing of the r (i.e. with fully voiced r) English speakers would
be likely to hear the word ‘dray’.
This completes our examination of the consonant phonemes of English. It is useful to
place them on a consonant chart, and this is done in Table 1. On this chart, the different places
of articulation are arranged from left to right and the manners of articulation are arranged
from top to bottom. When there is a pair of phonemes with the same place and manner of
articulation but differing in whether they are fortis or lenis (voiceless or voiced), the symbol
for the fortis consonant is placed to the left of the symbol for the lenis consonant.
The notes for this chapter are devoted to giving further detail on a particularly difficult
theoretical problem. The argument that ŋ is an allophone of n, not a phoneme in its own
right, is so widely accepted by contemporary phonological theorists that few seem to feel
it worthwhile to explain it fully. Since the velar nasal is introduced in this chapter, I have
chosen to attempt this here. However, it is a rather complex theoretical matter, and you
may prefer to leave consideration of it until after the discussion of problems of phonemic
analysis in Chapter 13.
There are brief discussions of the phonemic status of ŋ in Chomsky and Halle (1968:
85) and Ladefoged (2006); for a fuller treatment, see Wells (1982: 60-4) and Giegerich
(1992: 297-301). Everyone agrees that English has at least two contrasting nasal phonemes,
m and n. However, there is disagreement about whether there is a third nasal phoneme ŋ.
In favour of accepting ŋ as a phoneme is the fact that traditional phoneme theory more or
less demands its acceptance despite the usual preference for making phoneme inventories
as small as possible. Consider minimal pairs (pairs of words in which a difference in
meaning depends on the difference of just one phoneme) like these: ‘sin’ sɪn - ‘sing’ sɪŋ;
‘sinner’ sɪnə - ‘singer’ sɪŋə.
[Table 1: chart of English consonant phonemes, with places of articulation from left to right and manners of articulation from top to bottom; chart contents not recoverable from the source]
There are three main arguments against accepting ŋ as a phoneme:
i) In some English accents it can easily be shown that ŋ is an allophone of n,
which suggests that something similar might be true of BBC pronunciation too.
ii) If ŋ is a phoneme, its distribution is very different from that of m and n, being
restricted to syllable-final position (phonologically), and to morpheme-final
position (morphologically) unless it is followed by k or g.
iii) English speakers with no phonetic training are said to feel that ŋ is not a ‘single
sound’ like m, n. Sapir (1925) said that “no native speaker of English could be
made to feel in his bones” that ŋ formed part of a series with m, n. This is, of
course, very hard to establish, although that does not mean that Sapir was wrong.
We need to look at point (i) in more detail and go on to see how this leads to the argument
against having ŋ as a phoneme. Please note that I am not trying to argue that this proposal
must be correct; my aim is just to explain the argument. The whole question may seem
of little or no practical consequence, but we ought to be interested in any phonological
problem if it appears that conventional phoneme theory is not able to deal satisfactorily
with it.
In some English accents, particularly those of the Midlands, ŋ is only found with k
or g following. For example:
‘sink’ sɪŋk ‘singer’ sɪŋgə
‘sing’ sɪŋg ‘singing’ sɪŋgɪŋg
This was my own pronunciation as a boy, living in the West Midlands, but I now usually
have the BBC pronunciation sɪŋk, sɪŋ, sɪŋə, sɪŋɪŋ. In the case of an accent like this, it can
be shown that within the morpheme the only nasal that occurs before k, g is ŋ. Neither m
nor n can occur in this environment. Thus within the morpheme ŋ is in complementary
distribution with m, n. Since m, n are already established as distinct English phonemes in
other contexts (mæp, næp, etc.), it is clear that for such non-BBC accents ŋ must be an
allophone of one of the other nasal consonant phonemes. We choose n because when a
morpheme-final n is followed by a morpheme-initial k, g it is usual for that n to change
to ŋ; however, a morpheme-final m followed by a morpheme-initial k, g usually doesn’t
change to ŋ. Thus:
‘raincoat’ reɪŋkəʊt but ‘tramcar’ træmkɑ:
So in an analysis which contains no ŋ phoneme, we would transcribe ‘raincoat’ phonemically
as reɪnkəʊt and ‘sing’, ‘singer’, ‘singing’ as sɪng, sɪngə, sɪngɪng. The phonetic realisation
of the n phoneme as a velar nasal will be accounted for by a general rule that we will call
Rule 1:
Rule 1: n is realised as ŋ when it occurs in an environment in which it precedes
either k or g.
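Rule 1 can be stated as a one-step rewriting of a phonemic form. The Python sketch below is illustrative only: the function name and the representation of a word as a list of segments are assumptions of the example.

```python
def apply_rule_1(segments):
    """Rule 1: n is realised as ŋ when it immediately precedes k or g."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in ("k", "g"):
            out[i] = "ŋ"
    return out


# 'raincoat': phonemic reɪnkəʊt becomes phonetic reɪŋkəʊt
print("".join(apply_rule_1(["r", "eɪ", "n", "k", "əʊ", "t"])))
# 'tramcar': m is not affected by the rule
print("".join(apply_rule_1(["t", "r", "æ", "m", "k", "ɑ:"])))
```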
Let us now look at BBC pronunciation. As explained in Section 7.1 above, the crucial
difference between ‘singer’ sɪŋə and ‘finger’ fɪŋgə is that ‘finger’ is a single, indivisible
morpheme whereas ‘singer’ is composed of two morphemes ‘sing’ and ‘-er’. When ŋ
occurs without a following k or g it is always immediately before a morpheme boundary.
Consequently, the sound ŋ and the sequence ŋg are in complementary distribution.
But within the morpheme there is no contrast between the sequence ŋg and the sequence
ng, which makes it possible to say that ŋ is also in complementary distribution with the
sequence ng.
After establishing these “background facts”, we can go on to state the argument as
follows:
i) English has only m, n as nasal phonemes.
ii) The sound ŋ is an allophone of the phoneme n.
iii) The words ‘finger’, ‘sing’, ‘singer’, ‘singing’ should be represented phonemically
as fɪngə, sɪng, sɪngə, sɪngɪng.
iv) Rule 1 (above) applies to all these phonemic representations to give these
phonetic forms: fɪŋgə, sɪŋg, sɪŋgə, sɪŋgɪŋg
v) A further rule (Rule 2) must now be introduced:
Rule 2: g is deleted when it occurs after ŋ and before a morpheme boundary.
It should be clear that Rule 2 will not apply to ‘finger’ because the g is not immediately
followed by a morpheme boundary. However, the rule does apply to all the others, hence
the final phonetic forms: fɪŋgə, sɪŋ, sɪŋə, sɪŋɪŋ.
vi) Finally, it is necessary to remember the exception we have seen in the case of
comparatives and superlatives.
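The derivation set out in steps (i) to (v) can be followed mechanically. The Python sketch below applies Rule 1 and then Rule 2 to phonemic forms written as space-separated segments. The `+` morpheme-boundary marker, the function names and the treatment of the end of the word as a boundary are assumptions made for the illustration, and the exception mentioned in (vi) is not modelled.

```python
BOUNDARY = "+"  # hypothetical marker for a morpheme boundary inside a word


def rule_1(segments):
    """Rule 1: n is realised as ŋ when it immediately precedes k or g."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in ("k", "g"):
            out[i] = "ŋ"
    return out


def rule_2(segments):
    """Rule 2: g is deleted after ŋ and before a morpheme boundary."""
    out = []
    for i, seg in enumerate(segments):
        prev = segments[i - 1] if i > 0 else None
        # The end of the word is also the end of a morpheme.
        at_boundary = i + 1 == len(segments) or segments[i + 1] == BOUNDARY
        if seg == "g" and prev == "ŋ" and at_boundary:
            continue
        out.append(seg)
    return out


def derive(phonemic):
    """Apply Rule 1, then Rule 2, and return the phonetic form as a string."""
    segments = phonemic.split()
    phonetic = rule_2(rule_1(segments))
    return "".join(seg for seg in phonetic if seg != BOUNDARY)


for word in ["f ɪ n g ə", "s ɪ n g", "s ɪ n g + ə", "s ɪ n g + ɪ n g"]:
    print(word, "->", derive(word))
```

The order of the rules matters in this sketch: Rule 2 refers to ŋ, which exists only after Rule 1 has applied.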
The argument against treating ŋ as a phoneme may not appeal to you very much. The
important point, however, is that if one is prepared to use the kind of complexity and
abstractness illustrated above, one can produce quite far-reaching changes in the phonemic
analysis of a language.
The other consonants - l, r, w, j - do not, I think, need further explanation, except
to mention that the question of whether j, w are consonants or vowels is examined on
distributional grounds in O’Connor and Trim (1953).
Written exercises
1 List all the consonant phonemes of the BBC accent, grouped according to
manner of articulation.
2 Transcribe the following words phonemically:
a) sofa
b) verse
c) steering
d) breadcrumb
e) square
f) anger
g) bought
h) nineteen
3 When the vocal tract is in its resting position for normal breathing, the soft
palate is usually lowered. Describe what movements are carried out by the soft
palate in the pronunciation of the following words:
a) banner b) mid c) angle
8 The syllable
The syllable is a very important unit. Most people seem to believe that, even if they
cannot define what a syllable is, they can count how many syllables there are in a given
word or sentence. If they are asked to do this they often tap their finger as they count,
which illustrates the syllable’s importance in the rhythm of speech. As a matter of fact,
if one tries the experiment of asking English speakers to count the syllables in, say, a
recorded sentence, there is often a considerable amount of disagreement.
When we looked at the nature of vowels and consonants in Chapter 1 it was shown
that one could decide whether a particular sound was a vowel or a consonant on phonetic
grounds (in relation to how much they obstructed the airflow) or on phonological grounds
(vowels and consonants having different distributions). We find a similar situation with
the syllable, in that it may be defined both phonetically and phonologically. Phonetically
(i.e. in relation to the way we produce them and the way they sound), syllables are usually
described as consisting of a centre which has little or no obstruction to airflow and which
sounds comparatively loud; before and after this centre (i.e. at the beginning and end of
the syllable), there will be greater obstruction to airflow and/or less loud sound. We will
now look at some examples:
i) What we will call a minimum syllable is a single vowel in isolation (e.g. the
words ‘are’ ɑ:, ‘or’ ɔ:, ‘err’ ɜ:). These are preceded and followed by silence.
Isolated sounds such as m, which we sometimes produce to indicate agreement,
or ʃ, to ask for silence, must also be regarded as syllables.
ii) Some syllables have an onset - that is, instead of silence, they have one or more
consonants preceding the centre of the syllable:
‘bar’ bɑ: ‘key’ ki: ‘more’ mɔ:
iii) Syllables may have no onset but have a coda - that is, they end with one or
more consonants:
‘am’ æm ‘ought’ ɔ:t ‘ease’ i:z
iv) Some syllables have both onset and coda:
‘ran’ ræn ‘sat’ sæt ‘fill’ fɪl
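The four syllable types listed above differ only in whether there is anything before or after the centre. As a rough illustration, the Python sketch below splits a one-syllable word into onset, centre and coda and names the type. The vowel inventory and function names are assumptions of the example, and isolated consonantal syllables such as m or ʃ are deliberately left out.

```python
# Illustrative vowel inventory (an assumption of this sketch).
VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə", "i:", "ɑ:", "ɔ:", "u:", "ɜ:",
          "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə"}


def split_syllable(segments):
    """Split a one-syllable word into (onset, centre, coda)."""
    positions = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if len(positions) != 1:
        raise ValueError("expected exactly one vowel")
    i = positions[0]
    return segments[:i], segments[i], segments[i + 1:]


def syllable_type(segments):
    """Name the syllable type according to the four cases in the text."""
    onset, _centre, coda = split_syllable(segments)
    if not onset and not coda:
        return "minimum syllable"
    if onset and coda:
        return "onset and coda"
    return "onset only" if onset else "coda only"


print(syllable_type(["ɑ:"]))            # 'are'
print(syllable_type(["b", "ɑ:"]))       # 'bar'
print(syllable_type(["æ", "m"]))        # 'am'
print(syllable_type(["r", "æ", "n"]))   # 'ran'
```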
This is one way of looking at syllables. Looking at them from the phonological point
of view is quite different. What this involves is looking at the possible combinations of
English phonemes; the study of the possible phoneme combinations of a language is called
phonotactics. It is simplest to start by looking at what can occur in initial position - in
other words, what can occur at the beginning of the first word when we begin to speak
after a pause. We find that the word can begin with a vowel, or with one, two or three
consonants. No word begins with more than three consonants. In the same way, we can
look at how a word ends when it is the last word spoken before a pause; it can end with a
vowel, or with one, two, three or (in a small number of cases) four consonants. No current
word ends with more than four consonants.
Let us now look in more detail at syllable onsets. If the first syllable of the word in
question begins with a vowel (any vowel may occur, though ʊ is rare) we say that this initial
syllable has a zero onset. If the syllable begins with one consonant, that initial consonant
may be any consonant phoneme except ŋ; ʒ is rare.
We now look at syllables beginning with two consonants. When we have two or more
consonants together we call them a consonant cluster. Initial two-consonant clusters are of
two sorts in English. One sort is composed of s followed by one of a small set of consonants;
examples of such clusters are found in words such as ‘sting’ stɪŋ, ‘sway’ sweɪ, ‘smoke’ sməʊk.
The s in these clusters is called the pre-initial consonant and the other consonant (t, w, m in
the above examples) the initial consonant. These clusters are shown in Table 2.
The other sort begins with one of a set of about fifteen consonants, followed by one
of the set l, r, w, j as in, for example, ‘play’ pleɪ, ‘try’ traɪ, ‘quick’ kwɪk, ‘few’ fju:. We call
the first consonant of these clusters the initial consonant and the second the post-initial.
There are some restrictions on which consonants can occur together. This can best be
shown in table form, as in Table 3. When we look at three-consonant clusters we can
recognise a clear relationship between them and the two sorts of two-consonant cluster
described above; examples of three-consonant initial clusters are: ‘split’ splɪt, ‘stream’
stri:m, ‘square’ skweə. The s is the pre-initial consonant, the p, t, k that follow s in the
three example words are the initial consonant and the l, r, w are post-initial. In fact, the
number of possible initial three-consonant clusters is quite small and they can be set out
in full (words given in spelling form):
AU8 (CD 1), Ex 2
                         POST-INITIAL
                         l            r          w          j
                  p      ‘splay’      ‘spray’    -          ‘spew’
s plus initial    t      -            ‘string’   -          ‘stew’
                  k      ‘sclerosis’  ‘screen’   ‘squeak’   ‘skewer’
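The table of three-consonant onsets is small enough to be encoded directly. The Python sketch below stores the filled cells of the table and checks whether a given sequence of pre-initial, initial and post-initial consonants is one of them; the variable and function names are assumptions made for the illustration.

```python
# Filled cells of the table: initial consonant -> possible post-initial consonants.
THREE_CONSONANT_ONSETS = {
    "p": {"l", "r", "j"},       # 'splay', 'spray', 'spew'
    "t": {"r", "j"},            # 'string', 'stew'
    "k": {"l", "r", "w", "j"},  # 'sclerosis', 'screen', 'squeak', 'skewer'
}


def is_three_consonant_onset(pre_initial, initial, post_initial):
    """Return True if the three consonants form an onset listed in the table."""
    allowed = THREE_CONSONANT_ONSETS.get(initial, set())
    return pre_initial == "s" and post_initial in allowed


print(is_three_consonant_onset("s", "p", "l"))  # as in 'splay'
print(is_three_consonant_onset("s", "t", "l"))  # a gap in the table
```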
Table 2 Two-consonant clusters with pre-initial s
Pre-initial s followed by: [table contents not recoverable from the source]
[Table 3, setting out the initial consonants that may be followed by post-initial l, r, w, j: table contents not recoverable from the source]
The second type shows how more than one post-final consonant can occur in a final
cluster: final plus post-final 1 plus post-final 2. Post-final 2 is again one of s, z, t, d, θ.
To sum up, we may describe the English syllable as having the following maximum
phonological structure:
In the above structure there must be a vowel in the centre of the syllable. There is, however,
a special case, that of syllabic consonants (which are introduced in Chapter 9); we do not,
for example, analyse the word ‘students’ stju:dnts as consisting of one syllable with the
three-consonant cluster stj for its onset and a four-consonant final cluster dnts. To fit
in with what English speakers feel, we say that the word contains two syllables, with the
second syllable ending with the cluster nts; in other words, we treat the word as though
there was a vowel between d and n, although a vowel only occurs here in very slow, careful
pronunciation. This phonological problem will be discussed in Chapter 13.
Much present-day work in phonology makes use of a rather more refined analysis
of the syllable in which the vowel and the coda (if there is one) are known as the rhyme;
if you think of rhyming English verse you will see that the rhyming works by matching
just that part of the last syllable of a line. The rhyme is divided into the peak (normally
the vowel) and the coda (but note that this is optional: the rhyme may have no coda, as
in a word like ‘me’). As we have seen, the syllable may also have an onset, but this is not
obligatory. The structure is thus the following:
              syllable
             /        \
         onset        rhyme
                     /     \
                  peak     coda
There are still problems with the description of the syllable: an unanswered question
is how we decide on the division between syllables when we find a connected sequence of
them as we usually do in normal speech. It often happens that one or more consonants
from the end of one word combine with one or more at the beginning of the following
word, resulting in a consonant sequence that could not occur in a single syllable. For
example, ‘walked through’ wɔ:kt θru: gives us the consonant sequence ktθr.
We will begin by looking at two words that are simple examples of the problem of
dividing adjoining syllables. Most English speakers feel that the word ‘morning’ mɔ:nɪŋ
consists of two syllables, but we need a way of deciding whether the division into syllables
should be mɔ: and nɪŋ, or mɔ:n and ɪŋ. A more difficult case is the word ‘extra’ ekstrə.
One problem is that by some definitions the s in the middle, between k and t, could
be counted as a syllable, which most English speakers would reject. They feel that the
word has two syllables. However, the more controversial issue relates to where the two
syllables are to be divided; the possibilities are (using the symbol . to signify a syllable
boundary):
i) e.kstrə
ii) ek.strə
iii) eks.trə
iv) ekst.rə
v) ekstr.ə
How can we decide on the division? No single rule will tell us what to do without bringing
up problems.
One of the most widely accepted guidelines is what is known as the maximal onsets
principle. This principle states that where two syllables are to be divided, any consonants
between them should be attached to the right-hand syllable, not the left, as far as possible.
In our first example above, ‘morning’ would thus be divided as mɔ:.nɪŋ. If we just followed
this rule, we would have to divide ‘extra’ as (i) e.kstrə, but we know that an English syllable
cannot begin with kstr. Our rule must therefore state that consonants are assigned to the
right-hand syllable as far as possible within the restrictions governing syllable onsets and
codas. This means that we must reject (i) e.kstrə because of its impossible onset, and (v)
ekstr.ə because of its impossible coda. We then have to choose between (ii), (iii) and (iv).
The maximal onsets rule makes us choose (ii). There are, though, many problems still
remaining. How should we divide words like ‘better’ betə? The maximal onsets principle
tells us to put the t on the right-hand syllable, giving be.tə, but that means that the first
syllable is analysed as be. However, we never find isolated syllables ending with one of the
vowels ɪ, e, æ, ʌ, ɒ, ʊ, so this division is not possible. The maximal onsets principle must
therefore also be modified to allow a consonant to be assigned to the left syllable if that
prevents one of the vowels ɪ, e, æ, ʌ, ɒ, ʊ from occurring at the end of a syllable. We can
then analyse the word as bet.ə, which seems more satisfactory. There are words like ‘carry’
kæri which still give us problems: if we divide the word as kæ.ri, we get a syllable-final æ,
but if we divide it as kær.i we have a syllable-final r, and both of these are non-occurring
in BBC pronunciation. We have to decide on the lesser of two evils here, and the preferable
solution is to divide the word as kær.i on the grounds that in the many rhotic accents of
English (see Section 7.3) this division would be the natural one to make.
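The maximal onsets principle, together with the modification for short vowels, can be sketched as a small procedure. In the Python example below, the list of legal onsets is deliberately incomplete and purely illustrative, and the vowel sets and function name are likewise assumptions. The sketch does not check whether the resulting codas are possible, and it does not attempt the ambisyllabic analysis.

```python
# Illustrative inventories (assumptions of this sketch, not a full description).
VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə", "i", "i:", "ɑ:", "ɔ:", "u:", "ɜ:",
          "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə"}
SHORT_VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ"}  # never found syllable-finally
SINGLE_CONSONANTS = set("pbtdkgfvszhmnlrwj") | {"θ", "ð", "ʃ", "ʒ", "tʃ", "dʒ"}
# The empty onset, every single consonant except ŋ, and a few sample clusters.
LEGAL_ONSETS = ({()} | {(c,) for c in SINGLE_CONSONANTS} |
                {("s", "t"), ("t", "r"), ("p", "l"), ("k", "w"),
                 ("s", "t", "r"), ("s", "p", "l"), ("s", "k", "w")})


def syllabify(segments):
    """Divide a word into syllables by the (modified) maximal onsets principle."""
    vowel_positions = [i for i, seg in enumerate(segments) if seg in VOWELS]
    boundaries = []
    for left, right in zip(vowel_positions, vowel_positions[1:]):
        cluster = segments[left + 1:right]
        # Give the right-hand syllable the longest legal onset: move consonants
        # to the left-hand syllable one at a time until what remains is legal.
        k = 0
        while tuple(cluster[k:]) not in LEGAL_ONSETS:
            k += 1
        # Modification: do not leave a short vowel at the end of a syllable.
        if k == 0 and cluster and segments[left] in SHORT_VOWELS:
            k = 1
        boundaries.append(left + 1 + k)
    pieces, start = [], 0
    for b in boundaries:
        pieces.append("".join(segments[start:b]))
        start = b
    pieces.append("".join(segments[start:]))
    return ".".join(pieces)


print(syllabify(["m", "ɔ:", "n", "ɪ", "ŋ"]))          # 'morning'
print(syllabify(["e", "k", "s", "t", "r", "ə"]))      # 'extra'
print(syllabify(["b", "e", "t", "ə"]))                # 'better'
print(syllabify(["k", "æ", "r", "i"]))                # 'carry'
```

For ‘carry’ the sketch simply follows the short-vowel modification, which gives the division preferred in the text.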
One further possible solution should be mentioned: when one consonant stands
between vowels and it is difficult to assign the consonant to one syllable or the other - as
in ‘better’ and ‘carry’ - we could say that the consonant belongs to both syllables. The term
used by phonologists for a consonant in this situation is ambisyllabic.
Notes for teachers
Analysing syllable structure, as we have been doing in this chapter, can be very useful
to foreign learners of English, since English has a more complex syllable structure than
most languages. There are many more limitations on possible combinations of vowels
and consonants than we have covered here, but an understanding of the basic structures
described will help learners to become aware of the types of consonant cluster that present
them with pronunciation problems. In the same way, teachers can use this knowledge to
construct suitable exercises. Most learners find some English clusters difficult, but few find
all of them difficult. For reading in this area, see Celce-Murcia et al. (1996: 80-9); Dalton
and Seidlhofer (1994: 34-8); Hewings (2004: 1.4, 2.10-2.12).
Written exercise
Using the analysis of the word ‘cramped’ given below as a model, analyse the structure of
the following one-syllable English words:
a) squealed
b) eighths
c) splash
d) texts
9 Strong and weak syllables
One of the most noticeable features of English pronunciation is that some of its
syllables are strong while many others are weak; this is also true of many other languages,
but it is necessary to study how these weak syllables are pronounced and where they occur
in English. The distribution of strong and weak syllables is a subject that will be met in
several later chapters. For example, we will look later at stress, which is very important
in deciding whether a syllable is strong or weak. Elision is a closely related subject, and in
considering intonation the difference between strong and weak syllables is also important.
Finally, words with “strong forms” and “weak forms” are clearly a related matter. In this
chapter we look at the general nature of weak syllables.
What do we mean by “strong” and “weak”? To begin with, we can look at how we
use these terms to refer to phonetic characteristics of syllables. When we compare weak
syllables with strong syllables, we find the vowel in a weak syllable tends to be shorter,
of lower intensity (loudness) and different in quality. For example, in the word ‘data’
deɪtə the second syllable, which is weak, is shorter than the first, is less loud and has a
vowel that cannot occur in strong syllables. In a word like ‘bottle’ bɒtl̩ the weak second
syllable contains no vowel at all, but consists entirely of the consonant l. We call this a
syllabic consonant.
There are other ways of characterising strong and weak syllables. We could describe
them partly in terms of stress (by saying, for example, that strong syllables are stressed and
weak syllables unstressed) but, until we describe what “stress” means, such a description
would not be very useful. The most important thing to note at present is that any strong
syllable will have as its peak one of the vowel phonemes (or possibly a triphthong) listed
in Chapters 2 and 3, but not ə, i, u (the last two are explained in Section 9.3 below). If the
vowel is one of ɪ, e, æ, ʌ, ɒ, ʊ, then the strong syllable will always have a coda as well. Weak
syllables, on the other hand, as they are defined here, can only have one of a very small
number of possible peaks. At the end of a word, we may have a weak syllable ending with
a vowel (i.e. with no coda):
The most frequently occurring vowel in English is ə, which is always associated with
weak syllables. In quality it is mid (i.e. halfway between close and open) and central (i.e.
halfway between front and back). It is generally described as lax - that is, not articulated
with much energy. Of course, the quality of this vowel is not always the same, but the
variation is not important.
Not all weak syllables contain ə, though many do. Learners of English need to learn
where ə is appropriate and where it is not. To do this we often have to use information
that traditional phonemic theory would not accept as relevant - we must consider spelling.
The question to ask is: if the speaker were to pronounce a particular weak syllable as if it
were strong instead, which vowel would it be most likely to have, according to the usual
rules of English spelling? Knowing this will not tell us which syllables in a word or
utterance should be weak - that is something we look at in later chapters - but it will give us a
rough guide to the correct pronunciation of weak syllables. Let us look at some examples:
i) Spelt with ‘a’; strong pronunciation would have æ
‘attend’ ətend ‘character’ kærəktə
‘barracks’ bærəks
Two other vowels are commonly found in weak syllables, one close front (in
the general region of i:, ɪ) and the other close back rounded (in the general region of
u:, ʊ). In strong syllables it is comparatively easy to distinguish i: from ɪ or u: from ʊ, but
in weak syllables the difference is not so clear. For example, although it is easy enough to
decide which vowel one hears in ‘beat’ or ‘bit’, it is much less easy to decide which vowel
one hears in the second syllable of words such as ‘easy’ or ‘busy’. There are accents of
English (e.g. Welsh accents) in which the second syllable sounds most like the i: in the
first syllable of ‘easy’, and others (e.g. Yorkshire accents) in which it sounds more like the
ɪ in the first syllable of ‘busy’. In present-day BBC pronunciation, however, the matter is
not so clear. There is uncertainty, too, about the corresponding close back rounded vowels.
If we look at the words ‘good to eat’ and ‘food to eat’, we must ask if the word ‘to’ is
pronounced with the ʊ vowel phoneme of ‘good’ or the u: phoneme of ‘food’. Again, which
vowel comes in ‘to’ in ‘I want to’?
One common feature is that the vowels in question are more like i: or u: when
they precede another vowel, less so when they precede a consonant or pause. You should
notice one further thing: with the exception of one or two very artificial examples, there
is really no possibility in these contexts of a phonemic contrast between i: and ɪ, or
between u: and ʊ. Effectively, then, the two distinctions, which undoubtedly exist within
strong syllables, are neutralised in weak syllables of BBC pronunciation. How should we
transcribe the words ‘easy’ and ‘busy’? We will use the close front unrounded case as an
example, since it is more straightforward. The possibilities, using our phoneme symbols,
are the following:
‘easy’ ‘busy’
i) i:zi: bɪzi:
ii) i:zɪ bɪzɪ
Few speakers with a BBC accent seem to feel satisfied with any of these transcriptions.
There is a possible solution to this problem, but it goes against standard phoneme theory.
We can symbolise this weak vowel as i - that is, using the symbol for the vowel in ‘beat’ but
without the length mark. Thus:
i:zi bɪzi
The i vowel is neither the i: of ‘beat’ nor the ɪ of ‘bit’, and is not in contrast with them. We
can set up a corresponding vowel u that is neither the u: of ‘shoe’ nor the ʊ of ‘book’ but
a weak vowel that shares the characteristics of both. If we use i, u in our transcription as
well as i:, ɪ, u:, ʊ, it is no longer a true phonemic transcription in the traditional sense.
However, this need not be too serious an objection, and the fact that native speakers seem
to think that this transcription fits better with their feelings about the language is a good
argument in its favour.
AU9 (CD 1), Ex 2
Let us now look at where these vowels are found, beginning with close front
unrounded ones. We find i occurring:
i) In word-final position in words spelt with final ‘y’ or ‘ey’ after one or more
consonant letters (e.g. ‘happy’ hæpi, ‘valley’ væli) and in morpheme-final
position when such words have suffixes beginning with vowels (e.g. ‘happier’
hæpiə, ‘easiest’ i:ziəst, ‘hurrying’ hʌriɪŋ).
ii) In a prefix such as those spelt ‘re’, ‘pre’, ‘de’ if it precedes a vowel and is
unstressed (e.g. in ‘react’ riækt, ‘create’ krieɪt, ‘deodorant’ diəʊdərənt).
iii) In the suffixes spelt ‘iate’, ‘ious’ when they have two syllables (e.g. in ‘appreciate’
əpri:ʃieɪt, ‘hilarious’ hɪleəriəs).
iv) In the following words when unstressed: ‘he’, ‘she’, ‘we’, ‘me’, ‘be’ and the word
‘the’ when it precedes a vowel.
In most other cases of syllables containing a short close front unrounded vowel we can
assign the vowel to the ɪ phoneme, as in the first syllable of ‘resist’ rɪzɪst, ‘inane’ ɪneɪn,
‘enough’ ɪnʌf, the middle syllable of ‘incident’ ɪnsɪdənt, ‘orchestra’ ɔ:kɪstrə, ‘artichoke’
ɑ:tɪtʃəʊk, and the final syllable of ‘swimming’ swɪmɪŋ, ‘liquid’ lɪkwɪd, ‘optic’ ɒptɪk. It can
be seen that this vowel is most often represented in spelling by the letters ‘i’ and ‘e’.
Weak syllables with close back rounded vowels are not so commonly found. We find
u most frequently in the words ‘you’, ‘to’, ‘into’, ‘do’, when they are unstressed and are not
immediately preceding a consonant, and ‘through’, ‘who’ in all positions when they are
unstressed. This vowel is also found before another vowel within a word, as in ‘evacuation’
ɪvækjueɪʃn̩, ‘influenza’ ɪnfluenzə.
In the above sections we have looked at vowels in weak syllables. We must also
consider syllables in which no vowel is found. In this case, a consonant, either l, r or
a nasal, stands as the peak of the syllable instead of the vowel, and we count these as
weak syllables like the vowel examples given earlier in this chapter. It is usual to indicate
that a consonant is syllabic by means of a small vertical mark ( ̩ ) beneath the symbol, for
example ‘cattle’ kætl̩.
words like ‘happen’, ‘happening’, ‘ribbon’ we can consider it equally acceptable to pronounce
them with syllabic n (hæpn̩, hæpn̩ɪŋ, rɪbn̩) or with ən (hæpən, hæpənɪŋ, rɪbən). In a
similar way, after velar consonants in words like ‘thicken’, ‘waken’, syllabic n is possible but
ən is also acceptable.
After f, v, syllabic n is more common than ən (except, as with the other cases
described, in word-initial syllables). Thus ‘seven’, ‘heaven’, ‘often’ are more usually sevn̩,
hevn̩, ɒfn̩ than sevən, hevən, ɒfən.
In all the examples given so far the syllabic n has been following another consonant;
sometimes it is possible for another consonant to precede that consonant, but in this case
a syllabic consonant is less likely to occur. If n is preceded by l and a plosive, as in ‘Wilton’,
the pronunciation wɪltn̩ is possible, but wɪltən is also found regularly. If s precedes, as in
‘Boston’, a final syllabic nasal is less frequent, while clusters formed by nasal + plosive +
syllabic nasal are very unusual: thus ‘Minton’, ‘lantern’, ‘London’, ‘abandon’ will normally
have ə in the last syllable and be pronounced mɪntən, læntən, lʌndən, əbændən. Other
nasals also discourage a following plosive plus syllabic nasal, so that for example ‘Camden’
is normally pronounced kæmdən.
Syllabic m, ŋ
We will not spend much time on the syllabic pronunciation of these consonants.
Both can occur as syllabic, but only as a result of processes such as assimilation and elision
that are introduced later. We find them sometimes in words like ‘happen’, which can be
pronounced hæpm̩, though hæpn̩ and hæpən are equally acceptable, and ‘uppermost’,
which could be pronounced as ʌpm̩əʊst, though ʌpəməʊst would be more usual.
Examples of possible syllabic velar nasals would be ‘thicken’ θɪkŋ̍ (where θɪkən and θɪkn̩
are also possible), and ‘broken key’ brəʊkŋ̍ ki:, where the nasal consonant occurs between
velar consonants (n̩ or ən could be substituted for ŋ̍).
Syllabic r
In many accents of the type called “rhotic” (introduced in Chapter 7), such as most
American accents, syllabic r is very common. The word ‘particular’, for example, would
probably be pronounced pr̩tɪkjəlr̩ in careful speech by most Americans, while BBC speakers
would pronounce this word pətɪkjələ. Syllabic r is less common in BBC pronunciation:
it is found in weak syllables such as the second syllable of ‘preference’ prefr̩əns. In most
cases where it occurs there are acceptable alternative pronunciations without the syllabic
consonant.
There are a few pairs of words (minimal pairs) in which a difference in meaning appears
to depend on whether a particular r is syllabic or not, for example:
‘hungry’ hʌŋgri ‘Hungary’ hʌŋgr̩i
But we find no case of syllabic r where it would not be possible to substitute either
non-syllabic r or ər; in the example above, ‘Hungary’ could equally well be pronounced
hʌŋgəri.
9.1 I have at this point tried to bring in some preliminary notions of stress and prominence
without giving a full explanation. By this stage in the course it is important to be getting
familiar with the difference between stressed and unstressed syllables, and the nature of
the “schwa” vowel. However, the subject of stress is such a large one that I have felt it best
to leave its main treatment until later. On the subject of schwa, see Ashby (2005: p. 29);
Cruttenden (2008: Section 8.9.12).
9.2 The introduction of i and u is a relatively recent idea, but it is now widely accepted
as a convention in influential dictionaries such as the Longman Pronunciation Dictionary
(Wells, 2008), the Cambridge English Pronouncing Dictionary (Jones, eds. Roach et al.,
2006) and the Oxford Dictionary of Pronunciation (Upton et al., 2001). Since I mention
native speakers’ feelings in this connection, and since I am elsewhere rather sceptical about
appeals to native speakers’ feelings, I had better explain that in this case my evidence comes
from the native speakers of English I have taught in practical classes on transcription over
many years. A substantial number of these students have either been speakers with BBC
pronunciation or had accents only slightly different from it, and their usual reaction to
being told to use ɪ for the vowel at the end of ‘easy’, ‘busy’ has been one of puzzlement and
frustration; like them, I cannot equate this vowel with the vowel of ‘bit’. I am, however,
reluctant to use i:, which suggests a stronger vowel than should be pronounced (like the
final vowel in ‘evacuee’, ‘Tennessee’). I must emphasise that the vowels i, u are not to be
included in the set of English phonemes but are simply additional symbols to make the
writing and reading of transcription easier. The Introduction to the Cambridge English
Pronouncing Dictionary (Jones, eds. Roach et al., 2006) discusses some of the issues
involved in syllabic consonants and weak syllables: see section 2.10 and p. 492.
Notes for teachers
Introduction of the “schwa” vowel has been deliberately delayed until this chapter, since
I wanted it to be presented in the context of weak syllables in general. Since students
Written exercise
The following sentences have been partially transcribed, but the vowels have been left
blank. Fill in the vowels, taking care to identify which vowels are weak; put no vowel at
all if you think a syllabic consonant is appropriate, but put a syllabic mark beneath the
syllabic consonant.
1 A particular problem of the boat was a leak
p t kj l pr bl mv ð b t wz l k
2 Opening the bottle presented no difficulty
p n ŋ ð b t l p r z n t d n d f k l t
3 There is no alternative to the government’s proposal
ð rzn ltn tv tð g v nm nt spr p zl
4 We ought to make a collection to cover the expenses
w ttm k k l k ʃ n t k v ð ksp ns z
5 Finally they arrived at a harbour at the edge of the mountains
f n l ð r v d t h b r t ð dʒ v ð m n t n z
10 Stress in simple words
Stress has been mentioned several times already in this course without an explanation
of what the word means. The nature of stress is simple enough: practically everyone
would agree that the first syllable of words like ‘father’, ‘open’, ‘camera’ is stressed, that the
middle syllable is stressed in ‘potato’, ‘apartment’, ‘relation’, and that the final syllable is
stressed in ‘about’, ‘receive’, ‘perhaps’. Also, most people feel they have some sort of idea
of what the difference is between stressed and unstressed syllables, although they might
explain it in different ways.
We will mark a stressed syllable in transcription by placing a small vertical line (')
high up, just before the syllable it relates to; the words quoted above will thus be transcribed
as follows:
'fɑ:ðə pə'teɪtəʊ ə'baʊt
'əʊpən ə'pɑ:tmənt rɪ'si:v
'kæmrə rɪ'leɪʃn̩ pə'hæps
What are the characteristics of stressed syllables that enable us to identify them? It is
important to understand that there are two different ways of approaching this question.
One is to consider what the speaker does in producing stressed syllables and the other is to
consider what characteristics of sound make a syllable seem to a listener to be stressed. In
other words, we can study stress from the points of view of production and of perception;
the two are obviously closely related, but are not identical. The production of stress is
generally believed to depend on the speaker using more muscular energy than is used
for unstressed syllables. Measuring muscular effort is difficult, but it seems possible,
according to experimental studies, that when we produce stressed syllables, the muscles
that we use to expel air from the lungs are often more active, producing higher subglottal
pressure. It seems probable that similar things happen with muscles in other parts of our
vocal apparatus.
Many experiments have been carried out on the perception of stress, and it is clear
that many different sound characteristics are important in making a syllable recognisably
stressed. From the perceptual point of view, all stressed syllables have one characteristic in
common, and that is prominence. Stressed syllables are recognised as stressed because they
are more prominent than unstressed syllables. What makes a syllable prominent? At least
four different factors are important:
i) Most people seem to feel that stressed syllables are louder than unstressed
syllables; in other words, loudness is a component of prominence. In a sequence
of identical syllables (e.g. ba:ba:ba:ba:), if one syllable is made louder than
the others, it will be heard as stressed. However, it is important to realise that
it is very difficult for a speaker to make a syllable louder without changing
other characteristics of the syllable such as those explained below (ii-iv); if one
literally changes only the loudness, the perceptual effect is not very strong.
ii) The length of syllables has an important part to play in prominence. If one
of the syllables in our “nonsense word” ba:ba:ba:ba: is made longer than the
others, there is quite a strong tendency for that syllable to be heard as stressed.
iii) Every voiced syllable is said on some pitch; pitch in speech is closely related to the
frequency of vibration of the vocal folds and to the musical notion of low- and
high-pitched notes. It is essentially a perceptual characteristic of speech. If one
syllable of our “nonsense word” is said with a pitch that is noticeably different
from that of the others, this will have a strong tendency to produce the effect of
prominence. For example, if all syllables are said with low pitch except for one
said with high pitch, then the high-pitched syllable will be heard as stressed and
the others as unstressed. To place some movement of pitch (e.g. rising or falling)
on a syllable is even more effective in making it sound prominent.
iv) A syllable will tend to be prominent if it contains a vowel that is different
in quality from neighbouring vowels. If we change one of the vowels in our
“nonsense word” (e.g. ba:bi:ba:ba:) the “odd” syllable bi: will tend to be heard
as stressed. This effect is not very powerful, but there is one particular way in
which it is relevant in English: the previous chapter explained how the most
frequently encountered vowels in weak syllables are ə, ɪ, i, u (syllabic consonants
are also common). We can look on stressed syllables as occurring against a
“background” of these weak syllables, so that their prominence is increased by
contrast with these background qualities.
Prominence, then, is produced by four main factors: (i) loudness, (ii) length, (iii) pitch and
(iv) quality. Generally these four factors work together in combination, although syllables
may sometimes be made prominent by means of only one or two of them. Experimental
work has shown that these factors are not equally important; the strongest effect is produced
by pitch, and length is also a powerful factor. Loudness and quality have much less effect.
Up to this point we have talked about stress as though there were a simple distinction
between “stressed” and “unstressed” syllables with no intermediate levels; such a treatment
would be a two-level analysis of stress. Usually, however, we have to recognise one or more
intermediate levels. It should be remembered that in this chapter we are dealing only with
In some words, we can observe a type of stress that is weaker than primary stress but
stronger than that of the first syllable of ‘around’; for example, consider the first syllables
of the words ‘photographic’ fəʊtəgræfɪk, ‘anthropology’ ænθrəpɒlədʒi. The stress in these
words is called secondary stress. It is usually represented in transcription with a low mark
(ˌ) so that the examples could be transcribed as ˌfəʊtə'græfɪk, ˌænθrə'pɒlədʒi.
We have now identified two levels of stress: primary and secondary; this also implies
a third level which can be called unstressed and is regarded as being the absence of any
recognisable amount of prominence. These are the three levels that we will use in describing
English stress. However, it is worth noting that unstressed syllables containing ə, ɪ, i, u, or a
syllabic consonant, will sound less prominent than an unstressed syllable containing some
other vowel. For example, the first syllable of ‘poetic’ pəʊ'etɪk is more prominent than the
first syllable of ‘pathetic’ pə'θetɪk. This could be used as a basis for a further division of
stress levels, giving us a third (“tertiary”) level. It is also possible to suggest a tertiary level of
stress in some polysyllabic words. To take an example, it has been suggested that the word
‘indivisibility’ shows four different levels: the syllable bɪl is the strongest (carrying primary
stress), the initial syllable ɪn has secondary stress, while the third syllable vɪz has a level
of stress which is weaker than those two but stronger than the second, fourth, sixth and
seventh syllable (which are all unstressed). Using the symbol to mark this tertiary stress,
the word could be represented like this: ˌɪndɪvɪzə'bɪləti. While this may be a phonetically
correct account of some pronunciations, the introduction of tertiary stress seems to
introduce an unnecessary degree of complexity. We will transcribe the word as ˌɪndɪˌvɪzə'bɪləti.
the correct syllable or syllables to stress in an English word? As is well known, English
is not one of those languages where word stress can be decided simply in relation to the
syllables of the word, as can be done in French (where the last syllable is usually stressed),
Polish (where the syllable before the last - the penultimate syllable - is usually stressed)
or Czech (where the first syllable is usually stressed). Many writers have said that English
word stress is so difficult to predict that it is best to treat stress placement as a property of
the individual word, to be learned when the word itself is learned. Certainly anyone who
tries to analyse English stress placement has to recognise that it is a highly complex matter.
However, it must also be recognised that in most cases (though certainly not all), when
English speakers come across an unfamiliar word, they can pronounce it with the correct
stress; in principle, it should be possible to discover what it is that the English speaker
knows and to write it in the form of rules. The following summary of ideas on stress
placement in nouns, verbs and adjectives is an attempt to present a few rules in the simplest
possible form. Nevertheless, practically all the rules have exceptions and readers may feel
that the rules are so complex that it would be easier to go back to the idea of learning the
stress for each word individually.
In order to decide on stress placement, it is necessary to make use of some or all of
the following information:
i) Whether the word is morphologically simple, or whether it is complex as a
result either of containing one or more affixes (i.e. prefixes or suffixes) or of
being a compound word.
ii) What the grammatical category of the word is (noun, verb, adjective, etc.).
iii) How many syllables the word has.
iv) What the phonological structure of those syllables is.
It is sometimes difficult to make the decision referred to in (i). The rules for complex words
are different from those for simple words and these will be dealt with in Chapter 11. Single
syllable words present no problems: if they are pronounced in isolation they are said with
primary stress.
Point (iv) above is something that should be dealt with right away, since it affects many
of the other rules that we will look at later. We saw in Chapter 9 that it is possible to divide
syllables into two basic categories: strong and weak. One component of a syllable is the
rhyme, which contains the syllable peak and the coda. A strong syllable has a rhyme with
either (i) a syllable peak which is a long vowel or diphthong, with or without a
following consonant (coda). Examples:
‘die’ daɪ ‘heart’ hɑ:t ‘see’ si:
or (ii) a syllable peak which is a short vowel, one of ɪ, e, æ, ʌ, ɒ, ʊ, followed by at least
one consonant. Examples:
‘bat’ bæt ‘much’ mʌtʃ ‘pull’ pʊl
A weak syllable has a syllable peak which consists of one of the vowels ə, i, u and no
coda except when the vowel is ə. Syllabic consonants are also weak. Examples:
The vowel ɪ may also be the peak of a weak syllable if it occurs before a consonant that is
initial in the syllable that follows it. Examples:
(However, this vowel is also found frequently as the peak of stressed syllables, as in ‘thinker’
'θɪŋkə, ‘input’ 'ɪnpʊt.)
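The strong/weak definitions just given can be expressed as a small classification function. This is a minimal sketch rather than a definitive implementation: the name classify and the vowel sets are assumptions introduced here, and any peak that is not a long vowel, diphthong or short vowel with a coda (including a syllabic consonant) is simply treated as weak.

```python
# Illustrative sketch of the strong/weak syllable definition.
# The symbol sets are assumptions for the example, not a full inventory.

LONG_VOWELS_AND_DIPHTHONGS = {
    "i:", "ɑ:", "ɔ:", "u:", "ɜ:",
    "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə",
}
SHORT_VOWELS = {"ɪ", "e", "æ", "ʌ", "ɒ", "ʊ"}


def classify(peak, coda=""):
    """Return 'strong' or 'weak' for a syllable with this peak and coda."""
    # (i) long vowel or diphthong, with or without a coda
    if peak in LONG_VOWELS_AND_DIPHTHONGS:
        return "strong"
    # (ii) short vowel followed by at least one consonant
    if peak in SHORT_VOWELS and coda:
        return "strong"
    # Everything else: ə (with or without coda), i, u, ɪ with no coda,
    # and syllabic consonants.
    return "weak"


print(classify("aɪ"))        # 'die'
print(classify("æ", "t"))    # 'bat'
print(classify("ə", "n"))    # a syllable such as ən
print(classify("ɪ"))         # ɪ with no coda
print(classify("e", "kt"))   # last syllable of 'dialect'
```

The calls above give strong, strong, weak, weak and strong respectively. Note that the function says nothing about stress: a strong syllable such as the last one in ‘dialect’ may still be unstressed.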
The important point to remember is that, although we do find unstressed strong
syllables (as in the last syllable of ‘dialect’ 'daɪəlekt), only strong syllables can be
stressed. Weak syllables are always unstressed. This piece of knowledge does not by
any means solve all the problems of how to place English stress, but it does help in
some cases.
In the case of simple two-syllable words, either the first or the second syllable will be
stressed - not both. There is a general tendency for verbs to be stressed nearer the end of
a word and for nouns to be stressed nearer the beginning. We will look first at verbs. If the
final syllable is weak, then the first syllable is stressed. Thus:
A final syllable is also unstressed if it contains əʊ (e.g. ‘follow’ 'fɒləʊ, ‘borrow’ 'bɒrəʊ).
If the final syllable is strong, then that syllable is stressed even if the first syllable is
also strong. Thus:
Two-syllable simple adjectives are stressed according to the same rule, giving:
As with most stress rules, there are exceptions; for example: ‘honest’ 'ɒnɪst, ‘perfect’
'pɜ:fɪkt, both of which end with strong syllables but are stressed on the first syllable.
Nouns require a different rule: stress will fall on the first syllable unless the first
syllable is weak and the second syllable is strong. Thus:
‘money’ 'mʌni ‘divan’ dɪ'væn
‘product’ 'prɒdʌkt ‘balloon’ bə'lu:n
‘larynx’ 'lærɪŋks ‘design’ dɪ'zaɪn
Other two-syllable words such as adverbs seem to behave like verbs and adjectives.
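The two-syllable rules above can be summarised in a few lines of code. The sketch below is an illustration only: the function name and its parameters are hypothetical, the strong/weak status of each syllable has to be supplied by the user, and the rules themselves have exceptions, as the text points out.

```python
# Illustrative sketch of the stress rules for simple two-syllable words.
# Returns 0 if the first syllable is stressed, 1 if the second is.

def stress_two_syllable(category, first_strong, second_strong,
                        final_has_ou=False):
    """category: 'noun', 'verb', 'adjective' or 'adverb'.

    final_has_ou marks a final syllable containing the diphthong əʊ
    (as in 'follow'), which is treated as unstressed.
    """
    if category == "noun":
        # First syllable, unless it is weak and the second is strong.
        if not first_strong and second_strong:
            return 1
        return 0
    # Verbs and adjectives (other categories seem to behave like them).
    if not second_strong or final_has_ou:
        return 0
    return 1


print(stress_two_syllable("noun", True, False))        # 'money'
print(stress_two_syllable("noun", False, True))        # 'divan'
print(stress_two_syllable("noun", True, True))         # 'product'
print(stress_two_syllable("verb", True, True, True))   # 'follow'
```

These calls give 0, 1, 0 and 0, matching the transcriptions in the text. For the adjective ‘honest’ the function would predict final stress, which is wrong; this is one of the exceptions that a rule of this kind cannot capture.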
Three-syllable words
Here we find a more complicated picture. One problem is the difficulty of identifying
three-syllable words which are indisputably simple. In simple verbs, if the final syllable is
strong, then it will receive primary stress. Thus:
‘entertain’ ˌentə'teɪn ‘resurrect’ ˌrezə'rekt
If the last syllable is weak, then it will be unstressed, and stress will be placed on the
preceding (penultimate) syllable if that syllable is strong. Thus:
‘encounter’ ɪŋ'kaʊntə ‘determine’ dɪ'tɜ:mɪn
If both the second and third syllables are weak, then the stress falls on the initial syllable:
‘parody’ 'pærədi ‘monitor’ 'mɒnɪtə
Nouns require a slightly different rule. The general tendency is for stress to fall on the first
syllable unless it is weak. Thus:
‘quantity’ 'kwɒntəti ‘emperor’ 'empərə
‘custody’ 'kʌstədi ‘enmity’ 'enməti
However, in words with a weak first syllable the stress comes on the next syllable:
‘mimosa’ mɪ'məʊzə ‘disaster’ dɪ'zɑ:stə
‘potato’ pə'teɪtəʊ ‘synopsis’ sɪ'nɒpsɪs
When a three-syllable noun has a strong final syllable, that syllable will not usually receive
the main stress:
‘intellect’ 'ɪntəlekt ‘marigold’ 'mærɪgəʊld
‘alkali’ 'ælkəlaɪ ‘stalactite’ 'stæləktaɪt
Adjectives seem to need the same rule, to produce stress patterns such as:
‘opportune’ 'ɒpətju:n ‘insolent’ 'ɪnsələnt
‘derelict’ 'derəlɪkt ‘anthropoid’ 'ænθrəpɔɪd
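The three-syllable rules can be sketched in the same way. Again this is only an illustration under assumptions: the function name is hypothetical, the strong/weak values for each syllable are supplied by hand, and the rules are followed literally as stated above, without any attempt to handle exceptions.

```python
# Illustrative sketch of the stress rules for simple three-syllable words.
# 'strong' is a tuple of three booleans (first, second, third syllable).
# Returns the index (0, 1 or 2) of the syllable with primary stress.

def stress_three_syllable(category, strong):
    first, second, third = strong
    if category == "verb":
        if third:          # strong final syllable takes primary stress
            return 2
        if second:         # weak final, strong penultimate
            return 1
        return 0           # second and third both weak
    # Nouns, and adjectives, which seem to need the same rule:
    # stress the first syllable unless it is weak, in which case the
    # stress comes on the next syllable. A strong final syllable does
    # not usually receive the main stress.
    return 0 if first else 1


print(stress_three_syllable("verb", (True, False, True)))    # 'entertain'
print(stress_three_syllable("verb", (True, True, False)))    # 'encounter'
print(stress_three_syllable("verb", (True, False, False)))   # 'parody'
print(stress_three_syllable("noun", (False, True, True)))    # 'potato'
print(stress_three_syllable("noun", (True, False, True)))    # 'intellect'
```

The results are 2, 1, 0, 1 and 0, in agreement with the transcriptions given for these words.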
The above rules certainly do not cover all English words. They apply only to major
categories of lexical words (nouns, verbs and adjectives in this chapter), not to function
words such as articles and prepositions. There is not enough space in this course to deal
with simple words of more than three syllables, nor with special cases of loan words
(words brought into the language from other languages comparatively recently). Complex
and compound words are dealt with in Chapter 11. One problem that we must also leave
until Chapter 11 is the fact that there are many cases of English words with alternative
possible stress patterns (e.g. ‘controversy’ as either 'kɒntrəvɜ:si or kən'trɒvəsi). Other
words - which we will look at in studying connected speech - change their stress pattern
according to the context they occur in. Above all, there is not space to discuss the many
exceptions to the above rules. Despite the exceptions, it seems better to attempt to produce
some stress rules (even if they are rather crude and inaccurate) than to claim that there is
no rule or regularity in English word stress.
The subject of English stress has received a large amount of attention, and the references
given here are only a small selection from an enormous number. As I suggested in the
notes on the previous chapter, incorrect stress placement is a major cause of intelligibility
problems for foreign learners, and is therefore a subject that needs to be treated
very seriously.
10.1 I have deliberately avoided using the term accent, which is found widely in the
literature on stress - see, for example, Cruttenden (2008), p. 23. This is for three main reasons:
i) It increases the complexity of the description without, in my view, contributing
much to its value.
ii) Different writers do not agree with each other about the way the term should be
used.
iii) The word accent is used elsewhere to refer to different varieties of
pronunciation (e.g. “a foreign accent”); it is confusing to use it for a quite different
purpose. To a lesser extent we also have this problem with the word stress, which
can be used to refer to psychological tension.
10.2 On the question of the number of levels of stress, in addition to Laver (1994: 516),
see also Wells (2008).
10.3 It is said in this chapter that one may take one of two positions. One is that stress is
not predictable by rule and must be learned word by word (see, for example, Jones 1975:
Sections 920-1). The second (which I prefer) is to say that, difficult though the task is, one
must try to find a way of writing rules that express what native speakers naturally tend to
do in placing stress, while acknowledging that there will always be a substantial residue of
cases which appear to follow no regular rules. A very thorough treatment is given by Fudge
(1984). More recently, Giegerich (1992) has presented a clear analysis of English word
stress (including a useful explanation of strong, weak, heavy and light syllables); see p. 146
and Chapter 7. I have not adopted the practice of labelling syllables heavy and light to
denote characteristics of phonological structure (e.g. types of peak and coda), though this
could have been done to avoid confusion with the more phonetically-based terms strong
and weak introduced in Chapter 9. For our purposes, the difference is not important
enough to need additional terminology.
There is another approach to English stress rules which is radically different. This is
based on generative phonology, an analysis which was first presented in Chomsky and
Halle (1968) and has been followed by a large number of works exploring the same field.
To anyone not familiar with this type of treatment, the presentation will seem difficult
or even unintelligible; within the generative approach, many different theories, all with
different names, tend to come and go with changes in fashion. The following paragraph
is an attempt to summarise the main characteristics of basic generative phonology, and
recommends some further reading for those interested in learning about it in detail.
The level of phonology is very abstract in this theory. An old-fashioned view of speech
communication would be that what the speaker intends to say is coded - or represented - as
a string of phonemes just like a phonemic transcription, and what a hearer hears is also
converted by the brain from sound waves into a similar string of phonemes. A generative
phonologist, however, would say that this phonemic representation is not accurate; the
representation in the brain of the speaker or hearer is much more abstract and is often
quite different from the ‘real’ sounds recognisable in the sound wave. You may hear the
word ‘football’ pronounced as fʊpbɔːl, but your brain recognises the word as made up
of ‘foot’ and ‘ball’ and interprets it phonologically as fʊtbɔːl. You may hear ə in the first
syllable of ‘photography’, in the second syllable of ‘photograph’ and in the third syllable of
‘photographer’, but these ə vowels are only the surface realisations of underlying vowel phonemes.
An abstract phonemic representation of ‘photograph’ (including the relevant part
of ‘photography’, ‘photographic’ and ‘photographer’) would be something like fo:tograf;
each of the three underlying vowels (for which I am using symbols different from those
used in the rest of this book) would be realised differently according to the stress they
received and their position in the word: the o: in the first syllable would be realised as əʊ
if stressed (‘photograph’ ˈfəʊtəgrɑːf, ‘photographic’ ˌfəʊtəˈgræfɪk) and as ə if unstressed
(‘photography’ fəˈtɒgrəfi); the o in the second syllable would be realised as ɒ if stressed
(‘photography’ fəˈtɒgrəfi) and as ə if unstressed (‘photograph’ ˈfəʊtəgrɑːf), while the a
in the third syllable would be realised as æ if stressed (‘photographic’ ˌfəʊtəˈgræfɪk), as
either ɑː or æ if in a word-final syllable (‘photograph’ ˈfəʊtəgrɑːf or ˈfəʊtəgræf) and as
ə if unstressed in a syllable that is not word-final (‘photography’ fəˈtɒgrəfi). These vowel
changes are brought about by rules - not the sort of rules that one might teach to language
learners, but more like the instructions that one might build into a machine or write into
a computer program. According to Chomsky and Halle, at the abstract phonological level
words do not possess stress; stress (of many different levels) is the result of the application
of phonological rules, which are simple enough in theory but highly complex in practice.
The principles of these rules are explained first on pp. 15-43 of Chomsky and Halle (1968),
and in greater detail on pp. 69-162.
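The comparison with instructions written into a computer program can be made concrete with a small sketch. The Python code below is not the rule system of Chomsky and Halle; it only encodes the three realisation statements given above for the underlying form of ‘photograph’. The function names, the ASCII labels for the underlying vowels and the choice of ɑː (rather than the equally possible æ) in a word-final syllable are assumptions made for the illustration.

```python
# Illustrative sketch only: it encodes the three vowel statements made in the
# text for 'photograph', not the actual rules of generative phonology.

def realise_vowel(underlying, stressed, word_final):
    """Return a surface vowel for one underlying vowel (labels are hypothetical)."""
    if underlying == "o:":                 # vowel of the first syllable
        return "əʊ" if stressed else "ə"
    if underlying == "o":                  # vowel of the second syllable
        return "ɒ" if stressed else "ə"
    if underlying == "a":                  # vowel of the third syllable
        if stressed:
            return "æ"
        # Unstressed: a full vowel is kept only in a word-final syllable.
        # The text allows ɑː or æ here; ɑː is chosen arbitrarily.
        return "ɑː" if word_final else "ə"
    raise ValueError(f"no rule for underlying vowel {underlying!r}")


def realise_stem(stressed_syllables, last_syllable_is_word_final):
    """Realise the stem f-o:-t-o-gr-a-f.

    stressed_syllables: set of stem syllable numbers (1-3) carrying primary
    or secondary stress; the stress marks themselves are not generated.
    """
    v1 = realise_vowel("o:", 1 in stressed_syllables, False)
    v2 = realise_vowel("o", 2 in stressed_syllables, False)
    v3 = realise_vowel("a", 3 in stressed_syllables, last_syllable_is_word_final)
    return f"f{v1}t{v2}gr{v3}f"


print(realise_stem({1}, True))        # stem as in 'photograph'
print(realise_stem({2}, False))       # stem as in 'photography'
print(realise_stem({1, 3}, False))    # stem as in 'photographic'
```

Stress is supplied as input here, whereas in the generative account it is itself the output of rules; the sketch covers only the final step from stress pattern to vowel quality.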
Notes for teachers
It should be clear from what is said above that from the purely practical classroom point
of view, explaining English word stress in terms of generative phonology could well create
confusion for learners. Finding practice and testing material for word stress is very simple,
however: any modern English dictionary shows word stress patterns as part of word
entries, and lists of these can be made either with stress marks for students to read from
(as in Exercise 2 of Audio Unit 10), or without stress marks for students to put their own
marks on (as in Exercise 1 of the same Audio Unit).
Written exercises
11 Complex word stress
In Chapter 10 the nature of stress was explained and some broad general rules were
given for deciding which syllable in a word should receive primary stress. The words that
were described were called “simple” words; “simple” in this context means “not composed
of more than one grammatical unit”, so that, for example, the word ‘care’ is simple while
‘careful’ and ‘careless’ (being composed of two grammatical units each) are complex; ‘carefully’
and ‘carelessness’ are also complex, and are composed of three grammatical units
each. Unfortunately, as was suggested in Chapter 10, it is often difficult to decide whether
a word should be treated as complex or simple. The majority of English words of more
than one syllable (polysyllabic words) have come from other languages whose way of
constructing words is easily recognisable; for example, we can see how combining ‘mit’
with the prefixes ‘per-’, ‘sub-’, ‘com-’ produced ‘permit’, ‘submit’, ‘commit’ - words which
have come into English from Latin. Similarly, Greek has given us ‘catalogue’, ‘analogue’,
‘dialogue’, ‘monologue’, in which the prefixes ‘cata-’, ‘ana-’, ‘dia-’, ‘mono-’ are recognisable.
But we cannot automatically treat the separate grammatical units of other languages as
if they were separate grammatical units of English. If we did, we would not be able to
study English morphology without first studying the morphology of five or six other
languages, and we would be forced into ridiculous analyses such as that the English word
‘parallelepiped’ is composed of four or five grammatical units (which is the case in Ancient
Greek). We must accept, then, that the distinction between “simple” and “complex” words
is difficult to draw.
Complex words are of two major types:
i) words made from a basic word form (which we will call the stem), with the
addition of an affix; and
ii) compound words, which are made of two (or occasionally more) independent
English words (e.g. ‘ice cream’, ‘armchair’).
We will look first at the words made with affixes. Affixes are of two sorts in English:
prefixes, which come before the stem (e.g. prefix ‘un-’ + stem ‘pleasant’ → ‘unpleasant’)
and suffixes, which come after the stem (e.g. stem ‘good’ + suffix ‘-ness’ →
‘goodness’).
Affixes have one of three possible effects on word stress:
i) The affix itself receives the primary stress (e.g. ‘semi-’ + ‘circle’ ˈsɜːkl →
‘semicircle’ ˈsemisɜːkl; ‘-ality’ + ‘person’ ˈpɜːsn → ‘personality’ ˌpɜːsnˈæləti).
ii) The word is stressed as if the affix were not there (e.g. ‘pleasant’ ˈpleznt,
‘unpleasant’ ʌnˈpleznt; ‘market’ ˈmɑːkɪt, ‘marketing’ ˈmɑːkɪtɪŋ).
iii) The stress remains on the stem, not the affix, but is shifted to a different
syllable (e.g. ‘magnet’ ˈmægnət, ‘magnetic’ mægˈnetɪk).
11.2 Suffixes
There are so many suffixes that it will only be possible here to examine a small
proportion of them: we will concentrate on those which are common and productive - that
is, are applied to a considerable number of stems and could be applied to more to make
new English words. In the case of the others, foreign learners would probably be better
advised to learn the ‘stem + affix’ combination as an individual item.
One of the problems that we encounter is that we find words which are obviously
complex but which, when we try to divide them into stem + affix, turn out to have a stem
that is difficult to imagine as an English word. For example, the word ‘audacity’ seems to
be a complex word - but what is its stem? Another problem is that it is difficult in some
cases to know whether a word has one, or more than one, suffix: for example, should we
analyse ‘personality’, from the point of view of stress assignment, as pɜːsn + æləti or as
pɜːsn + æl + əti? In the study of English word formation at a deeper level than we can
go into here, it is necessary for such reasons to distinguish between a stem (which is what
remains when affixes are removed), and a root, which is the smallest piece of lexical material
that a stem can be reduced to. So, in ‘personality’, we could say that the suffix ‘-ity’ is
attached to the stem ‘personal’ which contains the root ‘person’ and the suffix ‘-al’. We will
not spend more time here on looking at these problems, but go on to look at some generalisations
about suffixes and stress, using only the term ‘stem’ for the sake of simplicity.
The suffixes are referred to in their spelling form.
In the examples given, which seem to be the most common, the primary stress
is on the first syllable of the suffix. If the stem consists of more than one syllable there
will be a secondary stress on one of the syllables of the stem. This cannot fall on the last
syllable of the stem and is, if necessary, moved to an earlier syllable. For example, in ‘Japan’
dʒəˈpæn the primary stress is on the last syllable, but when we add the stress-carrying
suffix ‘-ese’ the primary stress is on the suffix and the secondary stress is placed not on the
second syllable but on the first: ‘Japanese’ ˌdʒæpəˈniːz.
• ‘-ee’: ‘refugee’ ˌrefjuˈdʒiː; ‘evacuee’ ɪˌvækjuˈiː
• ‘-eer’: ‘mountaineer’ ˌmaʊntɪˈnɪə; ‘volunteer’ ˌvɒlənˈtɪə
• ‘-ese’: ‘Portuguese’ ˌpɔːtʃəˈgiːz; ‘journalese’ ˌdʒɜːnl̩ˈiːz
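The placement rule for stress-carrying suffixes can be sketched as a short function. This is a minimal illustration, assuming that the suffix is added without changing the number of syllables in the stem (true of ‘Japan’ + ‘-ese’ but not of every example listed). The text says only that a secondary stress on the last stem syllable moves "to an earlier syllable"; moving it back by exactly one syllable is an assumption of the sketch, and the function name and index convention are hypothetical.

```python
def stress_with_carrying_suffix(stem_syllables, stem_stress):
    """Return (secondary, primary) as 0-based syllable indices of the whole word.

    stem_syllables: number of syllables in the stem
    stem_stress:    0-based index of the stem's own primary stress
    """
    primary = stem_syllables              # first syllable of the suffix
    if stem_syllables < 2:
        return None, primary              # one-syllable stem: no secondary stress
    secondary = stem_stress
    if secondary == stem_syllables - 1:
        # Secondary stress may not fall on the last stem syllable.
        # Assumption: move it back by one syllable.
        secondary -= 1
    return secondary, primary


# 'Japan' has two syllables with stress on the second (index 1);
# in 'Japanese' the secondary stress moves to the first syllable.
print(stress_with_carrying_suffix(2, 1))
```

For ‘Japan’ the function returns secondary stress on syllable 0 and primary stress on syllable 2, matching ˌdʒæpəˈniːz.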
Otherwise the syllable before the last one receives the stress: ‘inheritance’ ɪnˈherɪtəns,
‘military’ ˈmɪlɪtri.
11.3 Prefixes
We will look only briefly at prefixes. Their effect on stress does not have the comparative
regularity, independence and predictability of suffixes, and there is no prefix of
one or two syllables that always carries primary stress. Consequently, the best treatment
seems to be to say that stress in words with prefixes is governed by the same rules as those
for polysyllabic words without prefixes.
The words discussed so far in this chapter have all consisted of a stem plus an affix.
We now pass on to another type of word. This is called a compound, and its main characteristic
is that it can be analysed into two words, both of which can exist independently
as English words. Some compounds are made of more than two words, but we will not
consider these. As with many of the distinctions being made in connection with stress,
there are areas of uncertainty. For example, it could be argued that ‘photograph’ may be
divided into two independent words, ‘photo’ and ‘graph’; yet we usually do not regard
it as a compound, but as a simple word. If, however, someone drew a graph displaying
numerical information about photos, this would perhaps be called a ‘photo-graph’ and the
word would then be regarded as a compound. Compounds are written in different ways:
sometimes they are written as one word (e.g. ‘armchair’, ‘sunflower’); sometimes with the
words separated by a hyphen (e.g. ‘open-minded’, ‘cost-effective’); and sometimes with
two words separated by a space (e.g. ‘desk lamp’, ‘battery charger’). In this last case there
would be no indication to the foreign learner that the pair of words was to be treated as
a compound. There is no clear dividing line between two-word compounds and pairs of
words that simply happen to occur together quite frequently.
As far as stress is concerned, the question is quite simple. When is primary stress
placed on the first constituent word of the compound and when on the second? Both
patterns are found. A few rules can be given, although these are not completely reliable.
Perhaps the most familiar type of compound is the one which combines two nouns and
which normally has the stress on the first element, as in:
‘typewriter’ ˈtaɪpraɪtə
‘car ferry’ ˈkɑːferi
‘sunrise’ ˈsʌnraɪz
‘suitcase’ ˈsuːtkeɪs
‘teacup’ ˈtiːkʌp
It is probably safest to assume that stress will normally fall in this way on other compounds;
however, a number of compounds receive stress instead on the second element. The first
words in such compounds often have secondary stress. For example, compounds with
an adjectival first element and the -ed morpheme at the end have this pattern (given in
spelling only):
ˌbad-ˈtempered
ˌhalf-ˈtimbered
ˌheavy-ˈhanded
Compounds in which the first element is a number in some form also tend to have final
stress:
ˌthree-ˈwheeler
ˌsecond-ˈclass
ˌfive-ˈfinger
Compounds functioning as adverbs are usually final-stressed:
ˌheadˈfirst
ˌNorth-ˈEast
ˌdownˈstream
Finally, compounds which function as verbs and have an adverbial first element take final
stress:
ˌdownˈgrade
ˌback-ˈpedal
ˌill-ˈtreat
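The tendencies listed above for compounds can be collected into one decision function. The sketch below is only a summary of those statements, which the text itself describes as not completely reliable; the function name, the argument names and the category labels ("noun", "adjective", "number", "adverb", "verb") are assumptions introduced for the illustration.

```python
def compound_primary_stress(first_element, compound_role="noun", ends_in_ed=False):
    """Return "first" or "second": the element expected to carry primary stress.

    first_element: word class of the first element ("noun", "adjective",
                   "number", "adverb", ...)
    compound_role: how the whole compound functions ("noun", "adjective",
                   "adverb", "verb")
    ends_in_ed:    True if the compound ends in the -ed morpheme
    """
    if first_element == "adjective" and ends_in_ed:
        return "second"                   # e.g. bad-tempered
    if first_element == "number":
        return "second"                   # e.g. three-wheeler
    if compound_role == "adverb":
        return "second"                   # e.g. downstream
    if compound_role == "verb" and first_element == "adverb":
        return "second"                   # e.g. downgrade
    return "first"                        # default, e.g. noun + noun 'typewriter'


print(compound_primary_stress("noun"))                           # typewriter
print(compound_primary_stress("adjective", "adjective", True))   # bad-tempered
```

The default branch reflects the advice that it is probably safest to assume first-element stress unless one of the listed conditions applies.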
It would be wrong to imagine that the stress pattern is always fixed and unchanging
in English words. Stress position may vary for one of two reasons: either as a result of the
stress on other words occurring next to the word in question, or because not all speakers
agree on the placement of stress in some words. The former case is an aspect of connected
speech that will be encountered again in Chapter 14: the main effect is that the stress on
a final-stressed compound tends to move to a preceding syllable and change to secondary
stress if the following word begins with a strongly stressed syllable. Thus (using some
examples from the previous section):
ˌbad-ˈtempered but a ˌbad-tempered ˈteacher
ˌhalf-ˈtimbered but a ˌhalf-timbered ˈhouse
ˌheavy-ˈhanded but a ˌheavy-handed ˈsentence
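The stress shift in these examples can be expressed as a string operation on words marked with ˈ (primary) and ˌ (secondary) stress. The sketch below treats the shift as if it always applied, whereas the text says only that the stress "tends to" move; the function name and the convention of marking stress inside the spelling are assumptions made for the illustration.

```python
PRIMARY, SECONDARY = "ˈ", "ˌ"


def shift_stress(compound, following_word):
    """Return the compound as stressed before `following_word`."""
    # Only a final-stressed compound is affected: its primary mark
    # must occur somewhere after the first character.
    if compound.find(PRIMARY) <= 0:
        return compound
    # The shift happens only before a strongly stressed initial syllable.
    if not following_word.startswith(PRIMARY):
        return compound
    shifted = compound.replace(PRIMARY, "", 1)
    if SECONDARY not in shifted:
        shifted = SECONDARY + shifted     # stress survives as secondary, earlier
    return shifted


print(shift_stress("ˌbad-ˈtempered", "ˈteacher"))
```

A compound with primary stress on its first element, such as ˈtypewriter, is returned unchanged.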
The second is not a serious problem, but is one that foreign learners should be aware
of. A well-known example is ‘controversy’, which is pronounced by some speakers as
ˈkɒntrəvɜːsi and by others as kənˈtrɒvəsi; it would be quite wrong to say that one version
was correct and one incorrect. Other examples of different possibilities are ‘ice cream’
One aspect of word stress is best treated as a separate issue. There are several dozen
pairs of two-syllable words with identical spelling which differ from each other in stress
placement, apparently according to word class (noun, verb or adjective). All appear to
consist of prefix + stem. We shall treat them as a special type of word and give them
the following rule: if a pair of prefix-plus-stem words exists, both members of which are
spelt identically, one of which is a verb and the other of which is either a noun or an
adjective, then the stress is placed on the second syllable of the verb but on the first syllable
of the noun or adjective. Some common examples are given below (V = verb, A = adjective,
N = noun):
abstract    ˈæbstrækt (A)     æbˈstrækt (V)
conduct     ˈkɒndʌkt (N)      kənˈdʌkt (V)
contract    ˈkɒntrækt (N)     kənˈtrækt (V)
contrast    ˈkɒntrɑːst (N)    kənˈtrɑːst (V)
desert      ˈdezət (N)        dɪˈzɜːt (V)
escort      ˈeskɔːt (N)       ɪˈskɔːt (V)
export      ˈekspɔːt (N)      ɪkˈspɔːt (V)
import      ˈɪmpɔːt (N)       ɪmˈpɔːt (V)
insult      ˈɪnsʌlt (N)       ɪnˈsʌlt (V)
object      ˈɒbdʒekt (N)      əbˈdʒekt (V)
perfect     ˈpɜːfɪkt (A)      pəˈfekt (V)
permit      ˈpɜːmɪt (N)       pəˈmɪt (V)
present     ˈpreznt (N, A)    prɪˈzent (V)
produce     ˈprɒdjuːs (N)     prəˈdjuːs (V)
protest     ˈprəʊtest (N)     prəˈtest (V)
rebel       ˈrebl (N)         rɪˈbel (V)
record      ˈrekɔːd (N, A)    rɪˈkɔːd (V)
subject     ˈsʌbdʒekt (N)     səbˈdʒekt (V)
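The rule for these word-class pairs is simple enough to state as a function. The sketch below marks stress in the spelling only and does not model the accompanying vowel changes (for example ə in the unstressed first syllable of the verb ‘conduct’); the function name and the division of each word into two spelling syllables are assumptions made for the illustration.

```python
PRIMARY = "ˈ"


def stress_pair_member(syllables, word_class):
    """Mark primary stress on a two-syllable prefix-plus-stem word.

    syllables:  (prefix, stem) in ordinary spelling, e.g. ("re", "bel")
    word_class: "V" for verb, "N" for noun, "A" for adjective
    """
    if len(syllables) != 2:
        raise ValueError("the rule covers two-syllable words only")
    first, second = syllables
    if word_class == "V":
        return f"{first}{PRIMARY}{second}"    # verbs: second syllable
    if word_class in ("N", "A"):
        return f"{PRIMARY}{first}{second}"    # nouns and adjectives: first syllable
    raise ValueError(f"unknown word class {word_class!r}")


print(stress_pair_member(("re", "bel"), "N"))
print(stress_pair_member(("re", "bel"), "V"))
```

The rule is meant to apply only to pairs of the kind listed in the table, not to two-syllable words in general.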
Most of the reading recommended in the notes for the previous chapter is relevant
for this one too. Looking specifically at compounds, it is worth reading Fudge (1984:
Chapter 5). See also Cruttenden (2008: 242-5). If you wish to go more deeply into
compound-word stress, you should first study English word formation. Recommended
reading for this is Bauer (1983). On the distinction between stem and root, see Radford
et al. (1999: 67-8).
Written exercises
Put stress marks on the following words (try to put secondary stress marks on as
well).
a) shopkeeper f) confirmation
b) open-ended g) eight-sided
c) Javanese h) fruitcake
d) birthmark i) defective
e) anti-clockwise j) roof timber
Write the words in phonemic transcription, including the stress marks.
12 Weak forms
Chapter 9 discussed the difference between strong and weak syllables in English. We have
now moved on from looking at syllables to looking at words, and we will consider certain
well-known English words that can be pronounced in two different ways; these are called
strong forms and weak forms. As an example, the word ‘that’ can be pronounced ðæt
(strong form) or ðət (weak form). The sentence ‘I like that’ is pronounced aɪ laɪk ðæt
(strong form); the sentence ‘I hope that she will’ is pronounced aɪ həʊp ðət ʃi wɪl (weak
form). There are roughly forty such words in English. It is possible to use only strong
forms in speaking, and some foreigners do this. Usually they can still be understood by
other speakers of English, so why is it important to learn how weak forms are used? There
are two main reasons: first, most native speakers of English find an “all-strong form” pronunciation
unnatural and foreign-sounding, something that most learners would wish to
avoid. Second, and more importantly, speakers who are not familiar with the use of weak
forms are likely to have difficulty understanding speakers who do use weak forms; since
practically all native speakers of British English use them, learners of the language need to
learn about these weak forms to help them to understand what they hear.
We must distinguish between weak forms and contracted forms. Certain English
words are shortened so severely (usually to a single phoneme) and so consistently that they
are represented differently in informal writing (e.g. ‘it is’ → ‘it’s’; ‘we have’ → ‘we’ve’; ‘do
not’ → ‘don’t’). These contracted forms are discussed in Chapter 14, and are not included
here.
Almost all the words which have both a strong and weak form belong to a category
that may be called function words - words that do not have a dictionary meaning in the
way that we normally expect nouns, verbs, adjectives and adverbs to have. These function
words are words such as auxiliary verbs, prepositions, conjunctions, etc., all of which are
in certain circumstances pronounced in their strong forms but which are more frequently
pronounced in their weak forms. It is important to remember that there are certain
contexts where only the strong form is acceptable, and others where the weak form is the
normal pronunciation. There are some fairly simple rules; we can say that the strong form
is used in the following cases:
i) For many weak-form words, when they occur at the end of a sentence; for
example, the word ‘of’ has the weak form əv in the following sentence:
‘I’m fond of chips’ aɪm ˈfɒnd əv ˈtʃɪps
However, when it comes at the end of the sentence, as in the following example,
it has the strong form ɒv:
‘Chips are what I’m fond of’ ˈtʃɪps ə ˈwɒt aɪm ˈfɒnd ɒv
Many of the words given below (particularly 1-9) never occur at the end of a
sentence (e.g. ‘the’, ‘your’). Some words (particularly the pronouns numbered
10-14 below) do occur in their weak forms in final position.
ii) When a weak-form word is being contrasted with another word; for example:
‘The letter’s from him, not to him’ ðə ˈletəz ˈfrɒm ɪm nɒt ˈtuː ɪm
1 ‘the’
Weak forms: ðə (before consonants)
‘Shut the door’ ˈʃʌt ðə ˈdɔː
ði (before vowels)
‘Wait for the end’ ˈweɪt fə ði ˈend
2 ‘a’, ‘an’
Weak forms: ə (before consonants)
‘Read a book’ ˈriːd ə ˈbʊk
ən (before vowels)
‘Eat an apple’ ˈiːt ən ˈæpl
3 ‘and’
Weak form: ən (sometimes n̩ after t, d, s, z, ʃ)
‘Come and see’ ˈkʌm ən ˈsiː
‘Fish and chips’ ˈfɪʃ n̩ ˈtʃɪps
4 ‘but’
Weak form: bət
‘It’s good but expensive’ ɪts ˈgʊd bət ɪkˈspensɪv
5 ‘that’
This word only has a weak form when used in a relative clause; when used with
a demonstrative sense it is always pronounced in its strong form.
Weak form: ðət
‘The price is the thing that annoys me’ ðə ˈpraɪs ɪz ðə ˈθɪŋ ðət əˈnɔɪz mi
6 ‘than’
Weak form: ðən
‘Better than ever’ ˈbetə ðən ˈevə
7 ‘his’ (when it occurs before a noun)
Weak form: ɪz (hɪz at the beginning of a sentence)
‘Take his name’ ˈteɪk ɪz ˈneɪm
(Another sense of ‘his’, as in ‘it was his’, or ‘his was late’, always
has the strong form)
8 ‘her’
When used with a possessive sense, preceding a noun; as an object pronoun, this
can also occur at the end of a sentence.
Weak forms: ə (before consonants)
‘Take her home’ ˈteɪk ə ˈhəʊm
ər (before vowels)
‘Take her out’ ˈteɪk ər ˈaʊt
9 ‘your’
Weak forms: jə (before consonants)
‘Take your time’ ˈteɪk jə ˈtaɪm
jər (before vowels)
‘On your own’ ˈɒn jər ˈəʊn
10 ‘she’, ‘he’, ‘we’, ‘you’
This group of pronouns has weak forms pronounced with weaker vowels than
the iː, uː of their strong forms. I use the symbols i, u (in preference to ɪ, ʊ) to
represent them. There is little difference in the pronunciation in different places
in the sentence, except in the case of ‘he’.
Weak forms:
a) ‘she’ ʃi
‘Why did she read it?’ ˈwaɪ dɪd ʃi ˈriːd ɪt
‘Who is she?’ ˈhuː ˈɪz ʃi
b) ‘he’ i (the weak form is usually pronounced without h except at
the beginning of a sentence)
‘Which did he choose?’ ˈwɪtʃ dɪd i ˈtʃuːz
‘He was late, wasn’t he?’ hi wəz ˈleɪt ˈwɒznt i
c) ‘we’ wi
‘How can we get there?’ ˈhaʊ kən wi ˈget ðeə
‘We need that, don’t we?’ wi ˈniːd ðæt ˈdəʊnt wi
d) ‘you’ ju
‘What do you think?’ ˈwɒt də ju ˈθɪŋk
‘You like it, do you?’ ju ˈlaɪk ɪt ˈduː ju
11 ‘him’
Weak form: ɪm
‘Leave him alone’ ˈliːv ɪm əˈləʊn
‘I’ve seen him’ aɪv ˈsiːn ɪm
12 ‘her’
Weak form: ə (hə when sentence-initial)
‘Ask her to come’ ˈɑːsk ə tə ˈkʌm
‘I’ve met her’ aɪv ˈmet ə
13 ‘them’
Weak form: ðəm
‘Leave them here’ ˈliːv ðəm ˈhɪə
‘Eat them’ ˈiːt ðəm
14 ‘us’
Weak form: əs
‘Write us a letter’ ˈraɪt əs ə ˈletə
‘They invited all of us’ ðeɪ ɪnˈvaɪtɪd ˈɔːl əv əs
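Several of the words listed so far (‘the’, ‘a’/‘an’, ‘her’, ‘your’) have two weak forms, chosen according to whether a consonant or a vowel follows. That choice can be sketched as a lookup. The table below copies the forms given above; the function name, the way the following word is supplied (as a transcription) and the set of vowel symbols used to test its first sound are assumptions made for the illustration.

```python
# First symbols of the vowels and diphthongs used in this book's transcriptions.
VOWEL_STARTS = set("iɪeæʌɑɒɔʊuəɜa")

# word: (weak form before consonants, weak form before vowels)
WEAK_FORMS = {
    "the": ("ðə", "ði"),
    "a": ("ə", "ən"),          # spelt 'an' before vowels
    "her": ("ə", "ər"),
    "your": ("jə", "jər"),
}


def weak_form(word, next_transcription):
    """Pick the weak form of `word` that suits the following sound."""
    before_consonant, before_vowel = WEAK_FORMS[word]
    # Ignore a stress mark at the start of the following word.
    first_sound = next_transcription.lstrip("ˈˌ")[:1]
    return before_vowel if first_sound in VOWEL_STARTS else before_consonant


print(weak_form("the", "ˈdɔː"))    # 'Shut the door'
print(weak_form("the", "ˈend"))    # 'Wait for the end'
```

The sketch says nothing about when a weak form is appropriate in the first place; it only chooses between two weak forms once that decision has been made.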
The next group of words (some prepositions and other function words) occur in their
strong forms when they are in final position in a sentence; examples of this are given.
Number 19, ‘to’, is a partial exception.
15 ‘at’
Weak form: ət
‘I’ll see you at lunch’ aɪl ˈsiː ju ət ˈlʌnʃ
In final position: æt
‘What’s he shooting at?’ ˈwɒts i ˈʃuːtɪŋ æt
16 ‘for’
Weak form: fə (before consonants)
‘Tea for two’ ˈtiː fə ˈtuː
fər (before vowels)
‘Thanks for asking’ ˈθæŋks fər ˈɑːskɪŋ
In final position: fɔː
‘What’s that for?’ ˈwɒts ˈðæt fɔː
17 ‘from’
Weak form: frəm
‘I’m home from work’ aɪm ˈhəʊm frəm ˈwɜːk
The remaining weak-form words are all auxiliary verbs, which are always used in conjunction
with (or at least implying) another (“full”) verb. It is important to remember that in
their negative form (i.e. combined with ‘not’) they never have the weak pronunciation, and
some (e.g. ‘don’t’, ‘can’t’) have different vowels from their non-negative strong forms.
23 ‘can’, ‘could’
Weak forms: kən, kəd
‘They can wait’ ˈðeɪ kən ˈweɪt
‘He could do it’ ˈhiː kəd ˈduː ɪt
In final position: kæn, kʊd
‘I think we can’ aɪ ˈθɪŋk wi ˈkæn
‘Most of them could’ ˈməʊst əv ðəm ˈkʊd
24 ‘have’, ‘has’, ‘had’
Weak forms: əv, əz, əd (with initial h in initial position)
‘Which have you seen?’ ˈwɪtʃ əv ju ˈsiːn
‘Which has been best?’ ˈwɪtʃ əz biːn ˈbest
‘Most had gone home’ ˈməʊst əd gɒn ˈhəʊm
In final position: hæv, hæz, hæd
‘Yes, we have’ ˈjes wi ˈhæv
‘I think she has’ aɪ ˈθɪŋk ʃi ˈhæz
‘I thought we had’ aɪ ˈθɔːt wi ˈhæd
25 ‘shall’, ‘should’
Weak forms: ʃəl or ʃl̩; ʃəd
‘We shall need to hurry’ wi ʃl̩ ˈniːd tə ˈhʌri
‘I should forget it’ ˈaɪ ʃəd fəˈget ɪt
In final position: ʃæl, ʃʊd
‘I think we shall’ aɪ ˈθɪŋk wi ˈʃæl
‘So you should’ ˈsəʊ ju ˈʃʊd
26 ‘must’
This word is sometimes used with the sense of forming a conclusion or deduction
(e.g. ‘she left at eight o’clock, so she must have arrived by now’); when
‘must’ is used in this way, it is less likely to occur in its weak form than when it is
being used in its more familiar sense of obligation.
Weak forms: məs (before consonants)
‘You must try harder’ ju məs ˈtraɪ ˈhɑːdə
məst (before vowels)
‘He must eat more’ hi məst ˈiːt ˈmɔː
In final position: mʌst
‘She certainly must’ ʃi ˈsɜːtnli ˈmʌst
27 ‘do’, ‘does’
Weak forms:
‘do’ də (before consonants)
‘Why do they like it?’ ˈwaɪ də ðeɪ ˈlaɪk ɪt
du (before vowels)
‘Why do all the cars stop?’ ˈwaɪ du ˈɔːl ðə ˈkɑːz ˈstɒp
‘does’ dəz
‘When does it arrive?’ ˈwen dəz ɪt əˈraɪv
In final position: duː, dʌz
‘We don’t smoke, but some people do’ ˈwiː dəʊnt ˈsməʊk bət
ˈsʌm piːpl ˈduː
‘I think John does’ aɪ ˈθɪŋk ˈdʒɒn dʌz
28 ‘am’, ‘are’, ‘was’, ‘were’
Weak forms: əm
‘Why am I here?’ ˈwaɪ əm aɪ ˈhɪə
ə (before consonants)
‘Here are the plates’ ˈhɪər ə ðə ˈpleɪts
ər (before vowels)
‘The coats are in there’ ðə ˈkəʊts ər ɪn ˈðeə
wəz
‘He was here a minute ago’ hi wəz ˈhɪər ə ˈmɪnɪt əˈgəʊ
wə (before consonants)
‘The papers were late’ ðə ˈpeɪpəz wə ˈleɪt
wər (before vowels)
‘The questions were easy’ ðə ˈkwestʃənz wər ˈiːzi
In final position: æm, ɑː, wɒz, wɜː
‘She’s not as old as I am’ ʃiz ˈnɒt əz ˈəʊld əz ˈaɪ æm
‘I know the Smiths are’ aɪ ˈnəʊ ðə ˈsmɪθs ɑː
‘The last record was’ ðə ˈlɑːst ˈrekɔːd wɒz
‘They weren’t as cold as we were’ ðeɪ ˈwɜːnt əz ˈkəʊld əz
ˈwiː wɜː
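For the auxiliary verbs, the choice between weak and strong form depends mainly on position and contrast. The sketch below covers only the words in items 23-25 and only two of the conditions mentioned in this chapter (final position, and contrast with another word). It ignores the h-initial variants of ‘have’, ‘has’, ‘had’ at the beginning of a sentence, the syllabic variant of ‘shall’, and the negative forms; the function and parameter names are hypothetical.

```python
# word: (weak form, strong form used in final position), from items 23-25
AUXILIARIES = {
    "can": ("kən", "kæn"),
    "could": ("kəd", "kʊd"),
    "have": ("əv", "hæv"),
    "has": ("əz", "hæz"),
    "had": ("əd", "hæd"),
    "shall": ("ʃəl", "ʃæl"),
    "should": ("ʃəd", "ʃʊd"),
}


def auxiliary_form(word, sentence_final=False, contrastive=False):
    """Return the weak form unless a listed condition calls for the strong one."""
    weak, strong = AUXILIARIES[word]
    if sentence_final or contrastive:
        return strong
    return weak


print(auxiliary_form("can"))                        # 'They can wait'
print(auxiliary_form("can", sentence_final=True))   # 'I think we can'
```

A fuller treatment would also take the following sound into account for words such as ‘must’, ‘do’ and ‘are’, whose weak forms differ before consonants and vowels.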
This chapter is almost entirely practical. All books about English pronunciation devote
a lot of attention to weak forms. Some of them give a great deal of importance to using
these forms, but do not stress the importance of also knowing when to use the strong
forms, something which I feel is very important; see Hewings (2007: 48-9). There is a very
detailed study of English weak forms in Obendorfer (1998).
Written exercise
In the following sentences, the transcription for the weak-form words is left blank. Fill in
the blanks, taking care to use the appropriate form (weak or strong).
1 I want her to park that car over there.
aɪ wɒnt ___ ___ pɑːk ___ kɑːr əʊvə ___
2 Of all the proposals, the one that you made is the silliest.
___ ɔːl ___ prəpəʊzl̩z ___ wʌn ___ ___ meɪd ɪz ___ sɪliəst
3 Jane and Bill could have driven them to and from the party.
dʒeɪn ___ bɪl ___ ___ drɪvn ___ ___ ___ ___ ___ pɑːti
4 To come to the point, what shall we do for the rest of the week?
___ kʌm ___ ___ pɔɪnt wɒt ___ ___ ___ ___ ___ rest ___ ___ wiːk
5 Has anyone got an idea where it came from?
___ eniwʌn gɒt ___ aɪdɪə weər ɪt keɪm ___
6 Pedestrians must always use the crossings provided.
pədestriənz ___ ɔːlweɪz juːz ___ krɒsɪŋz prəvaɪdɪd
7 Each one was a perfect example of the art that had been developed there.
iːtʃ wʌn ___ ___ pɜːfɪkt ɪgzɑːmpl ___ ___ ɑːt ___ ___ biːn dɪveləpt ___
13 Problems in phonemic analysis
The concept of the phoneme was introduced in Chapter 5, and a few theoretical problems
connected with phonemic analysis have been mentioned in other chapters. The general
assumption (as in most phonetics books) has been that speech is composed of phonemes
and that usually whenever a speech sound is produced by a speaker it is possible to identify
which phoneme that sound belongs to. While this is often true, we must recognise that
there are exceptions which make us consider some quite serious theoretical problems.
From the comparatively simple point of view of learning pronunciation, these problems
are not particularly important. However, from the point of view of learning about the
phonology of English they are too important to ignore.
There are problems of different types. In some cases, we have difficulty in deciding
on the overall phonemic system of the accent we are studying, while in others we are
concerned about how a particular sound fits into this system. A number of such problems
are discussed below.
13.1 Affricates
instead of the three phonemes that result from the one-phoneme analysis:
tʃ - ɜː - tʃ    dʒ - ʌ - dʒ
and there would be no separate tʃ, dʒ phonemes. But how can we decide which analysis
is preferable? The two-phoneme analysis has one main advantage: if there are no separate
tʃ, dʒ phonemes, then our total set of English consonants is smaller. Many phonologists
have claimed that one should prefer the analysis which is the most “economical” in the
number of phonemes it results in. The argument for this might be based on the claim
that when we speak to someone we are using a code, and the most efficient codes do
not employ unnecessary symbols. Further, it can be claimed that a phonological analysis
is a type of scientific theory, and a scientific theory should be stated as economically as
possible. However, it is the one-phoneme analysis that is generally chosen by phonologists.
Why is this? There are several arguments: no single one of them is conclusive, but added
together they are felt to make the one-phoneme analysis seem preferable. We will look
briefly at some of these arguments.
i) One argument could be called “phonetic” or “allophonic”: if it could be shown
that the phonetic quality of the t and ʃ (or d and ʒ) in tʃ, dʒ is clearly different
from realisations of t, ʃ, d, ʒ found elsewhere in similar contexts, this would
support the analysis of tʃ, dʒ as separate phonemes. As an example, it might
be claimed that ʃ in ‘hutch’ hʌtʃ was different (perhaps in having shorter
duration) from ʃ in ‘hush’ hʌʃ or ‘Welsh’ welʃ; or it might be claimed that the
place of articulation of t in ‘watch apes’ wɒtʃ eɪps was different from that of
t in ‘what shapes’ wɒt ʃeɪps. This argument is a weak one: there is no clear
evidence that such phonetic differences exist, and even if there were such
evidence, it would be easy to produce explanations for the differences that did
not depend on phonemic analyses (e.g. the position of the word boundary in
‘watch apes’, ‘what shapes’).
ii) It could be argued that the proposed phonemes tʃ, dʒ have distributions
similar to other consonants, while other combinations of plosive plus fricative
do not. It can easily be shown that tʃ, dʒ are found initially, medially and
finally, and that no other combination (e.g. pf, dz, tθ) has such a wide
distribution. However, several consonants are generally accepted as phonemes
of the BBC accent despite not being free to occur in all positions (e.g. r, w,
j, h, ŋ, ʒ), so this argument, although supporting the one-phoneme analysis,
does not actually prove that tʃ, dʒ must be classed with other single-consonant
phonemes.
iii) If tʃ, dʒ were able to combine quite freely with other consonants to form
consonant clusters, this would support the one-phoneme analysis. In initial
position, however, tʃ, dʒ never occur in clusters with other consonants. In final
position in the syllable, we find that tʃ can be followed by t (e.g. ‘watched’
wɒtʃt) and dʒ by d (e.g. ‘wedged’ wedʒd). Final tʃ, dʒ can be preceded by l
(e.g. ‘squelch’ skweltʃ, ‘bulge’ bʌldʒ); ʒ is never preceded by l, and ʃ is preceded
by l only in a few words and names (e.g. ‘Welsh’ welʃ, ‘Walsh’ wɒlʃ). A fairly
similar situation is found if we ask if n can precede tʃ, dʒ; some BBC speakers
have ntʃ in ‘lunch’, ‘French’, etc., and never pronounce the sequence nʃ within
a syllable, while other speakers (like me) always have nʃ in these contexts and
never ntʃ. In words like ‘lunge’, ‘flange’ there seems to be no possible phonological
distinction between lʌndʒ, flændʒ and lʌnʒ, flænʒ. It seems, then, that no
contrast between syllable-final lʃ and ltʃ exists in the BBC accent, and the same
appears to be true in relation to nj and ntj and to n3 and nd3. There are no
other possibilities for final-consonant clusters containing tf, d3 , except that the
pre-final 1 or n may occur in combination with post-final t, d as in ‘squelched’
skwelt/t, ‘hinged’ hind3d. It could not, then, be said that tf, d3 combine freely
with other consonants in forming consonant clusters; this is particularly notice
able in initial position.
How would the two-phoneme analysis affect the syllable-structure
framework that was introduced in Chapter 8? Initial tʃ, dʒ would have
to be interpreted as initial t, d plus post-initial ʃ, ʒ, with the result that
the post-initial set of consonants would have to contain l, r, w, j and also
ʃ, ʒ - consonants which are rather different from the other four and which
could only combine with t, d. (The only alternative would be to put t, d with
s in the pre-initial category, again with very limited possibilities of combining
with another consonant.)
iv) Finally, it has been suggested that if native speakers of English who have not
been taught phonetics feel that tʃ, dʒ are each “one sound”, we should be guided
by their intuitions and prefer the one-phoneme analysis. The problem with this
is that discovering what untrained (or “naive”) speakers feel about their own
language is not as easy as it might sound. It would be necessary to ask questions
like this: “Would you say that the word ‘chip’ begins with one sound - like ‘tip’
and ‘sip’ - or with two sounds - like ‘trip’ and ‘skip’?” But the results would be
distorted by the fact that two consonant letters are used in the spelling; to
do the test properly one should use illiterate subjects, which raises many
further problems.
This rather long discussion of the phonemic status of tʃ, dʒ shows how difficult it can be
to reach a conclusion in phonemic analysis.
For the rest of this chapter a number of other phonological problems will be discussed
comparatively briefly. We have already seen (in Chapter 6) problems of analysis in connection
with the sounds usually transcribed hw, hj. The velar nasal ŋ, described in Chapter 7,
also raises a lot of analysis problems: many writers have suggested that the correct analysis
is one in which there is no ŋ phoneme, and this sound is treated as an allophone of the
phoneme n that occurs when it precedes the phoneme g. It was explained in Chapter 7
that in certain contexts no g is pronounced, but it can be claimed that at an abstract level
there is a g phoneme, although in certain contexts the g is not actually pronounced. The
sound ŋ is therefore, according to this theory, an allophone of n.
The analysis of the English vowel system presented in Chapters 2 and 3 contains a
large number of phonemes, and it is not surprising that some phonologists who believe in
the importance of keeping the total number of phonemes small propose different analyses
which contain fewer than ten vowel phonemes and treat all long vowels and diphthongs
as composed of two phonemes each. There are different ways of doing this: one way is to
treat long vowels and diphthongs as composed of two vowel phonemes. Starting with a
set of basic or “simple” vowel phonemes (e.g. ɪ, e, æ, ʌ, ɒ, ʊ, ə) it is possible to make up
long vowels by using short vowels twice. Our usual transcription for long vowels is given
in brackets:
ɪɪ (iː)    ææ (ɑː)    ɒɒ (ɔː)    ʊʊ (uː)    əə (ɜː)
This can be made to look less unusual by choosing different symbols for the basic vowels.
We will use i, e, a, ʌ, ɔ, u, ə: thus iː could be transcribed as ii, ɑː as aa, ɔː as ɔɔ, uː as uu
and ɜː as əə. In this approach, diphthongs would be composed of a basic vowel phoneme
followed by one of i, u, ə, while triphthongs would be made from a basic vowel plus one of
i, u followed by ə, and would therefore be composed of three phonemes.
Another way of doing this kind of analysis is to treat long vowels and diphthongs as
composed of a vowel plus a consonant; this may seem a less obvious way of proceeding,
but it was for many years the choice of most American phonologists. The idea is that long
vowels and diphthongs are composed of a basic vowel phoneme followed by one of j, w , h
(we should add r for rhotic accents). Thus the diphthongs would be made up like this (our
usual transcription is given in brackets):
ej (eɪ)    əw (əʊ)    ɪh (ɪə)
æj (aɪ)    æw (aʊ)    eh (eə)
ɒj (ɔɪ)               ʊh (ʊə)
Long vowels:
ɪj (iː)    æh (ɑː)    ɒh (ɔː)    əh (ɜː)    ʊw (uː)
Diphthongs and long vowels are now of exactly the same phonological composition. An
important point about this analysis is that j, w , h do not otherwise occur finally in the
syllable. In this analysis, the inequality of distribution is corrected.
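The economy of the vowel-plus-consonant analysis can be checked with a short sketch. The table below simply restates the correspondences listed above as a Python dictionary; the names REANALYSIS, basic_vowels and final_consonants are invented for this illustration and are not part of any established notation or library.

```python
# Vowel-plus-consonant reanalysis of long vowels and diphthongs.
# Keys: the usual transcription; values: (basic vowel, final consonant).
# The dictionary name and helper names are illustrative assumptions.
REANALYSIS = {
    "eɪ": ("e", "j"), "əʊ": ("ə", "w"), "ɪə": ("ɪ", "h"),
    "aɪ": ("æ", "j"), "aʊ": ("æ", "w"), "eə": ("e", "h"),
    "ɔɪ": ("ɒ", "j"), "ʊə": ("ʊ", "h"),
    "iː": ("ɪ", "j"), "ɑː": ("æ", "h"), "ɔː": ("ɒ", "h"),
    "ɜː": ("ə", "h"), "uː": ("ʊ", "w"),
}

def basic_vowels(table):
    """Return the distinct basic vowels used as first elements."""
    return sorted({vowel for vowel, _ in table.values()})

def final_consonants(table):
    """Return the distinct consonants used as second elements."""
    return sorted({consonant for _, consonant in table.values()})

print(len(REANALYSIS), "long vowels and diphthongs")
print("basic vowels needed:", basic_vowels(REANALYSIS))
print("final consonants needed:", final_consonants(REANALYSIS))
```

Thirteen long vowels and diphthongs are built from a small set of basic vowels plus j, w, h, which is the sense in which this analysis keeps the number of vowel phonemes small.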
In Chapter 9 we saw how, although ɪ, iː are clearly distinct in most contexts, there
are other contexts where we find a sound which cannot clearly be said to belong to one
or other of these two phonemes. The suggested solution to this problem was to use the
symbol i, which does not represent any single phoneme; a similar proposal was made for
u. We use the term neutralisation for cases where contrasts between phonemes which
exist in other places in the language disappear in particular contexts. There are many other
ways of analysing the very complex vowel system of English, some of which are extremely
ingenious. Each has its own advantages and disadvantages.
A final analysis problem that we will consider is that mentioned at the end of Chapter 8:
how to deal with syllabic consonants. It has to be recognised that syllabic consonants are a
problem: they are phonologically different from their non-syllabic counterparts. How do
we account for the following minimal pairs, which were given in Chapter 9?
Syllabic                  Non-syllabic
‘coddling’ kɒdl̩ɪŋ         ‘codling’ kɒdlɪŋ
‘Hungary’ hʌŋgr̩i          ‘hungry’ hʌŋgri
One possibility is to add new consonant phonemes to our list. We could invent the
phonemes l̩, r̩, n̩, etc. The distribution of these consonants would be rather limited, but
the main problem would be fitting them into our pattern of syllable structure. For a word
like ‘button’ bʌtn̩ or ‘bottle’ bɒtl̩, it would be necessary to add n̩, l̩ to the first post-final
set; the argument would be extended to include the r̩ in ‘Hungary’. But if these consonants
now form part of a syllable-final consonant cluster, how do we account for the fact that
English speakers hear the consonants as extra syllables? The question might be answered
by saying that the new phonemes are to be classed as vowels. Another possibility is to set
up a phoneme that we might name syllabicity, symbolised with the mark  ̩. Then the
word ‘codling’ would consist of the following six phonemes: k - ɒ - d - l - ɪ - ŋ, while the
word ‘coddling’ would consist of the following seven phonemes: k - ɒ - d - l and simultaneously
 ̩ - ɪ - ŋ. This is superficially an attractive theory, but the proposed phoneme is
nothing like the other phonemes we have identified up to this point - putting it simply,
the syllabic mark doesn’t have any sound.
Some phonologists maintain that a syllabic consonant is really a case of a vowel and
a consonant that have become combined. Let us suppose that the vowel is ə. We could then
say that, for example, ‘Hungary’ is phonemically hʌŋgəri while ‘hungry’ is hʌŋgri; it would
then be necessary to say that the ə vowel phoneme in the phonemic representation is not
pronounced as a vowel, but instead causes the following consonant to become syllabic.
This is an example of the abstract view of phonology where the way a word is represented
phonologically may be significantly different from the actual sequence of sounds heard, so
that the phonetic and the phonemic levels are quite widely separated.
Words like ‘spill’, ‘still’, ‘skill’ are usually represented with the phonemes p, t, k
following the s. But, as many writers have pointed out, it would be quite reasonable to
transcribe them with b, d, g instead. For example, b, d, g are unaspirated while p, t, k in
syllable-initial position are usually aspirated. However, in sp, st, sk we find an unaspirated
plosive, and there could be an argument for transcribing them as sb, sd, sg. We do not do
this, perhaps because of the spelling, but it is important to remember that the contrasts
between p and b, between t and d and between k and g are neutralised in this context.
13.5 Schwa (ə)
It has been suggested that there is not really a contrast between ə and ʌ, since ə only
occurs in weak syllables and no minimal pairs can be found to show a clear contrast between
ə and ʌ in unstressed syllables (although there have been some ingenious attempts). This
has resulted in a proposal that the phoneme symbol ə should be used for representing any
occurrence of ə or ʌ, so that ‘cup’ (which is usually stressed) would be transcribed ˈkəp
and ‘upper’ (with stress on the initial syllable) as ˈəpə. This new ə phoneme would thus
have two allophones, one being ə and the other ʌ; the stress mark would indicate the ʌ
allophone and in weak syllables with no stress it would be more likely that the ə allophone
would be pronounced.
Other phonologists have suggested that ə is an allophone of several other vowels; for
example, compare the middle two syllables in the words ‘economy’ ɪˈkɒnəmi and ‘economic’
ˌiːkəˈnɒmɪk - it appears that when the stress moves away from the syllable containing ɒ the
vowel becomes ə. Similarly, compare ‘Germanic’ dʒɜːˈmænɪk with ‘German’ ˈdʒɜːmən -
when the stress is taken away from the syllable mæn, the vowel weakens to ə. (This view
has already been referred to in the Notes for Chapter 10, Section 3.) Many similar examples
could be constructed with other vowels; some possibilities may be suggested by the list of
words given in Section 9.2 to show the different spellings that can be pronounced with ə. The
conclusion that could be drawn from this argument is that ə is not a phoneme of English,
but is an allophone of several different vowel phonemes when those phonemes occur in an
unstressed syllable. The argument is in some ways quite an attractive one, but since it leads
to a rather complex and abstract phonemic analysis it is not adopted for this course.
Many references have been made to phonology in this course, with the purpose of
making use of the concepts and analytical techniques of that subject to help explain various
facts about English pronunciation as efficiently as possible. One might call this “applied
phonology”; however, the phonological analysis of different languages raises a great number
of difficult and interesting theoretical problems, and for a long time the study of phonology
“for its own sake” has been regarded as an important area of theoretical linguistics. Within
this area of what could be called “pure phonology”, problems are examined with little or no
reference to their relevance to the language learner. Many different theoretical approaches
have been developed, and no area of phonology has been free from critical examination. The
very fundamental notion of the phoneme, for example, has been treated in many different
ways. One approach that has been given a lot of importance is distinctive feature analysis,
which is based on the principle that phonemes should be regarded not as independent
and indivisible units, but instead as combinations of different features. For example, if we
consider the English d phoneme, it is easy to show that it differs from the plosives b, g in its
place of articulation (alveolar), from t in being lenis, from s, z in not being fricative, from n
in not being nasal, and so on. If we look at each of the consonants just mentioned and see
which of the features each one has, we get a table like this, where + means that a phoneme
does possess that feature and - means that it does not.
            d    b    g    t    s    z    n
alveolar    +    -    -    +    +    +    +
bilabial    -    +    -    -    -    -    -
velar       -    -    +    -    -    -    -
lenis       +    +    +    -    -    +   (+)*
plosive     +    +    +    +    -    -    -
fricative   -    -    -    -    +    +    -
nasal       -    -    -    -    -    -    +
* Since there is no fortis/lenis contrast among nasals this could be left blank.
If you look carefully at this table, you will see that the combination of + and - values
for each phoneme is different; if two sounds were represented by exactly the same +’s and
-’s, then by definition they could not be different phonemes. In the case of the limited set
of phonemes used for this example, not all the features are needed: if one wished, it would
be possible to dispense with, for example, the feature velar and the feature nasal. The g
phoneme would still be distinguished from b, d by being neither bilabial nor alveolar, and
n would be distinct from plosives and fricatives simply by being neither plosive nor fricative.
To produce a complete analysis of all the phonemes of English, other features would
be needed for representing other types of consonant, and for vowels and diphthongs. In
distinctive feature analysis the features themselves thus become important components of
the phonology.
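The two claims made about the table (that every phoneme has a different combination of values, and that velar and nasal can be dispensed with for this limited set) can be checked mechanically. The following is only a sketch: the feature values are copied from the table above, with n entered as + for lenis, and the names FEATURES, TABLE and all_distinct are invented for the illustration.

```python
FEATURES = ["alveolar", "bilabial", "velar", "lenis",
            "plosive", "fricative", "nasal"]

# One row per phoneme, in the order of FEATURES: True for +, False for -.
# Assumption: n is treated as + lenis (the table brackets this value).
TABLE = {
    "d": (True,  False, False, True,  True,  False, False),
    "b": (False, True,  False, True,  True,  False, False),
    "g": (False, False, True,  True,  True,  False, False),
    "t": (True,  False, False, False, True,  False, False),
    "s": (True,  False, False, False, False, True,  False),
    "z": (True,  False, False, True,  False, True,  False),
    "n": (True,  False, False, True,  False, False, True),
}

def all_distinct(table, dropped=()):
    """True if every phoneme still has a unique feature combination
    after the features named in `dropped` are ignored."""
    keep = [i for i, name in enumerate(FEATURES) if name not in dropped]
    bundles = [tuple(row[i] for i in keep) for row in table.values()]
    return len(set(bundles)) == len(bundles)

print(all_distinct(TABLE))                                  # full table
print(all_distinct(TABLE, dropped=("velar", "nasal")))      # reduced set
print(all_distinct(TABLE, dropped=("velar", "alveolar")))   # too few features
```

Dropping velar and nasal leaves all seven phonemes distinct, whereas dropping velar and alveolar together makes d and g identical, so not every pair of features is redundant.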
It has been claimed by some writers that distinctive feature analysis is relevant to the
study of language learning, and that pronunciation difficulties experienced by learners are
better seen as due to the need to learn a particular feature or combination of features than
as the absence of particular phonemes. For example, English speakers learning French or
German have to learn to produce front rounded vowels. In English it is not necessary
to deal with vowels which are +front, +round, whereas this is necessary for French and
German; it could be said that the major task for the English-speaking learner of French or
German in this case is to learn the combination of these features, rather than to learn the
individual vowels y, ø and (in French) œ*.
English, on the other hand, has to be able to distinguish dental from labiodental
and alveolar places of articulation, for θ to be distinct from f, s and for ð to be distinct
from v, z. This requires an additional feature that most languages do not make use of, and
learning this could be seen as a specific task for the learner of English. Distinctive feature
phonologists have also claimed that when children are learning their first language, they
acquire features rather than individual phonemes.
* The phonetic symbols represent the following sounds: y is a close front rounded vowel (e.g. the vowel in French
tu, German Bühne); ø is a close-mid front rounded vowel (e.g. French peu, German schön); œ is an open-mid
front rounded vowel (e.g. French oeuf).
13.7 Conclusion
This chapter is intended to show that there are many ways of analysing the English
phonemic system, each with its own advantages and disadvantages. We need to consider
the practical goal of teaching or learning about English pronunciation, and for this purpose
a very abstract analysis would be unsuitable. This is one criterion for judging the value of
an analysis; unless one believes in carrying out phonological analysis for purely aesthetic
reasons, the only other important criterion is whether the analysis is likely to correspond
to the representation of sounds in the human brain. Linguistic theory is preoccupied with
economy, elegance and simplicity, but cognitive psychology and neuropsychology show us
that the brain often uses many different pathways to the same goal.
The analysis of tʃ, dʒ is discussed in Cruttenden (2008: 181-8). The phonemic analysis
of the velar nasal has already been discussed above (see Notes on problems and further
reading in Chapter 7). The “double vowel” interpretation of English long vowels was
put forward by MacCarthy (1952) and is used by Kreidler (2004: 45-59). The “vowel-
plus-semivowel” interpretation of long vowels and diphthongs was almost universally
accepted by American (and some British) writers from the 1940s to the 1960s, and still
pervades contemporary American descriptions. It has the advantage of being economical
on phonemes and very “neat and tidy”. The analysis in this form is presented in Trager
and Smith (1951). In generative phonology it is claimed that, at the abstract level, English
vowels are simply tense or lax. If they are lax they are realised as short vowels, if tense
as diphthongs (this category includes what I have been calling long vowels). The quality
of the first element of the diphthongs/long vowels is modified by some phonological
rules, while other rules supply the second element automatically. This is set out in
Chomsky and Halle (1968: 178-87). There is a valuable discussion of the interpretation
of the English vowel system with reference to several different accents in Giegerich (1992:
Chapter 3), followed by an explanation of the distinctive feature analysis of the English
vowel system (Chapter 4) and the consonant system (Chapter 5). A more wide-ranging
discussion of distinctive features is given in Clark et al. (2007: Chapter 10).
The idea that ə is an allophone of many English vowels is not a new one. In generative
phonology, ə results from vowel reduction in vowels which have never received stress in the
process of the application of stress rules. This is explained - in rather difficult terms - in
Chomsky and Halle (1968: 110-26). A clearer treatment of the schwa problem is in Giegerich
(1992: 68-9 and 285-7).
Note for teachers
Since this is a theoretical chapter it is difficult to provide practical work. I do not feel
that it is helpful for students to do exercises on using different ways of transcribing
English phonemes - just learning one set of conventions is difficult enough. Some books
on phonology give exercises on the phonemic analysis of other languages (e.g. Katamba,
1989; Roca and Johnson, 1999), but although these are useful, I do not feel that it would
be appropriate in this book to divert attention from English. The exercises given below
Written exercises
All the following exercises involve different ways of looking at the phonemic interpretation
of English sounds. We use square brackets here to indicate when symbols are phonetic
rather than phonemic.
1 In this exercise you must look at phonetically transcribed material from an
English accent different from BBC pronunciation and decide on the best way to
interpret and transcribe it phonemically.
a) ‘thing’ [θɪŋg]
b) ‘think’ [θɪŋk]
c) ‘thinking’ [θɪŋkɪŋg]
d) ‘finger’ [fɪŋgə]
e) ‘singer’ [sɪŋgə]
f) ‘singing’ [sɪŋgɪŋg]
2 It often happens in rapid English speech that a nasal consonant disappears when
it comes between a vowel and another consonant. For example, this may happen
to the n in ‘front’: when this happens the preceding vowel becomes nasalised -
some of the air escapes through the nose. We symbolise a nasalised vowel in
phonetic transcription by putting the ~ diacritic above it; for example, the word
‘front’ may be pronounced [frʌ̃t]. Nasalised vowels are found in the words given
in phonetic transcription below. Transcribe them phonemically.
a) ‘sound’ [saʊ̃d]
b) ‘anger’ [æ̃gə]
c) ‘can’t’ [kɑ̃ːt]
d) ‘camper’ [kæ̃pə]
e) ‘bond’ [bɒ̃d]
3 When the phoneme t occurs between vowels it is sometimes pronounced as
a “tap”: the tongue blade strikes the alveolar ridge sharply, producing a very brief
voiced plosive. The IPA phonetic symbol for this is ɾ, but many books which
deal with American pronunciation prefer to use the phonetic symbol t̬; this
sound is frequently pronounced in American English, and is also found in a
number of accents in Britain: think of a typical American pronunciation of
“getting better”, which we can transcribe phonetically as [get̬ɪŋ bet̬ɚ]. Look at
the transcriptions of the words given below and see if you can work out (for the
accent in question) the environment in which t̬ is found.
a) ‘betting’ [bet̬ɪŋ]
b) ‘bedding’ [bedɪŋ]
c) ‘attend’ [ətʰend]
d) ‘attitude’ [æt̬ətʰuːd]
e) ‘time’ [tʰaɪm]
f) ‘tight’ [tʰaɪt]
4 Distinctive feature analysis looks at different properties of segments and classes
of segments. In the following exercise you must mark the value of each feature
in the table for each segment listed on the top row with either a + or a -; you will
probably find it useful to look at the IPA chart on p. xii.
              p    d    s    m    z
Continuant
Alveolar
Voiced
5 In the following sets of segments (a-f), all segments in the set possess some
characteristic feature which they have in common and which may distinguish
them from other segments. Can you identify what this common feature might be
for each set?
a) English iː, ɪ, uː, ʊ; cardinal vowels [i], [e], [u], [o]
b) t d n l s tʃ dʒ ʃ ʒ r
c) b f v k g h
d) p t k f θ s ʃ tʃ
e) uː ɔː əʊ aʊ
f) l r w j
14 Aspects of connected speech
Many years ago scientists tried to develop machines that produced speech from a vocabulary
of pre-recorded words; the machines were designed to join these words together to
form sentences. For very limited messages, such as those of a “talking clock”, this technique
was usable, but for other purposes the quality of the speech was so unnatural that
it was practically unintelligible. In recent years, developments in computer technology
have led to big improvements in this way of producing speech, but the inadequacy of the
original “mechanical speech” approach has many lessons to teach us about pronunciation
teaching and learning. In looking at connected speech it is useful to bear in mind
the difference between the way humans speak and what would be found in “mechanical
speech”.
The notion of rhythm involves some noticeable event happening at regular intervals
of time; one can detect the rhythm of a heartbeat, of a flashing light or of a piece of
music. It has often been claimed that English speech is rhythmical, and that the rhythm is
detectable in the regular occurrence of stressed syllables. Of course, it is not suggested that
the timing is as regular as a clock: the regularity of occurrence is only relative. The theory
that English has stress-timed rhythm implies that stressed syllables will tend to occur at
relatively regular intervals whether they are separated by unstressed syllables or not; this
would not be the case in “mechanical speech”. An example is given below. In this sentence,
the stressed syllables are given numbers: syllables 1 and 2 are not separated by any
unstressed syllables, 2 and 3 are separated by one unstressed syllable, 3 and 4 by two, and
4 and 5 by three.
 1     2         3            4             5
'Walk 'down the 'path to the 'end of the ca'nal
The stress-timed rhythm theory states that the times from each stressed syllable to
the next will tend to be the same, irrespective of the number of intervening unstressed
syllables. The theory also claims that while some languages (e.g. Russian, Arabic) have
stress-timed rhythm similar to that of English, others (e.g. French, Telugu, Yoruba)
have a different rhythmical structure called syllable-timed rhythm; in these languages, all
syllables, whether stressed or unstressed, tend to occur at regular time intervals and the
time between stressed syllables will be shorter or longer in proportion to the number of
unstressed syllables. Some writers have developed theories of English rhythm in which a
unit of rhythm, the foot, is used (with a parallel in the metrical analysis of verse). The foot
begins with a stressed syllable and includes all following unstressed syllables up to (but
not including) the following stressed syllable. The example sentence given above would be
divided into feet as follows:
|   1   |     2     |      3       |       4        |  5   |
| 'Walk | 'down the | 'path to the | 'end of the ca | 'nal |
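The division into feet follows a simple rule, which can be sketched as below. This is an illustration only: the function name split_into_feet and the representation of a sentence as (syllable, stressed) pairs are assumptions made for the example, and the handling of unstressed syllables before the first stress is a choice of this sketch, not something stated in the text.

```python
def split_into_feet(syllables):
    """Group (syllable, is_stressed) pairs into feet.

    A foot begins with a stressed syllable and includes all following
    unstressed syllables up to, but not including, the next stressed one.
    Assumption of this sketch: unstressed syllables before the first
    stress are collected into an initial group of their own.
    """
    feet = []
    for text, stressed in syllables:
        if stressed or not feet:
            feet.append([])          # start a new foot
        feet[-1].append(text)
    return feet

# The example sentence, syllable by syllable.
SENTENCE = [
    ("Walk", True), ("down", True), ("the", False), ("path", True),
    ("to", False), ("the", False), ("end", True), ("of", False),
    ("the", False), ("ca", False), ("nal", True),
]

feet = split_into_feet(SENTENCE)
unstressed_per_foot = [len(foot) - 1 for foot in feet]

for number, foot in enumerate(feet, start=1):
    print(number, " ".join(foot))
print(unstressed_per_foot)
```

The five feet contain 0, 1, 2, 3 and 0 unstressed syllables respectively. Under the stress-timed theory all five would tend to take about the same time; under syllable timing their durations would grow with the number of syllables.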
Some theories of rhythm go further than this, and point to the fact that some feet are
stronger than others, producing strong-weak patterns in larger pieces of speech above the
level of the foot. To understand how this could be done, let’s start with a simple example:
the word ‘twenty’ has one strong and one weak syllable, forming one foot. A diagram
of its rhythmical structure can be made, where s stands for “strong” and w stands for
“weak”.
  s    w
twen   ty

  s    w
 pla  ces
Now consider the phrase ‘twenty places’, where ‘places’ normally carries stronger stress
than ‘twenty’ (i.e. is rhythmically stronger). We can make our “tree diagram” grow to look
like this:
     w            s
  s     w      s     w
twen    ty    pla   ces
If we then look at this phrase in the context of a longer phrase ‘twenty places further back’,
and build up the ‘further back’ part in a similar way, we would end up with an even more
elaborate structure:
  s    w    s    w    s    w     s
twen   ty  pla  ces  fur  ther  back
By analysing speech in this way we are able to show the relationships between strong and
weak elements, and the different levels of stress that we find. The strength of any particular
syllable can be measured by counting up the number of times an s symbol occurs above it.
The levels in the sentence shown above can be diagrammed like this (leaving out syllables
that have never received stress at any level):
                                 s
            s         s          s
  s         s         s          s
twen   ty  pla  ces  fur  ther  back
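The counting procedure can be illustrated on the shorter phrase ‘twenty places’, whose structure is fully described in the text (each word is strong + weak, and ‘places’ is the stronger of the two). The nested-list representation and the function name strengths are assumptions made for this sketch.

```python
# A node is (label, content): label is "s" or "w"; content is either a
# syllable (a string) or a list of daughter nodes.
# Assumed structure for 'twenty places': 'twenty' is w, 'places' is s,
# and each word consists of a strong syllable followed by a weak one.
TWENTY_PLACES = [
    ("w", [("s", "twen"), ("w", "ty")]),
    ("s", [("s", "pla"), ("w", "ces")]),
]

def strengths(nodes, s_above=0):
    """Count, for each syllable labelled s, how many s labels lie on the
    path from the top of the tree down to that syllable (inclusive).
    Syllables labelled w at the lowest level are left out, as in the grid."""
    result = {}
    for label, content in nodes:
        count = s_above + (1 if label == "s" else 0)
        if isinstance(content, str):
            if label == "s":
                result[content] = count
        else:
            result.update(strengths(content, count))
    return result

print(strengths(TWENTY_PLACES))
```

For this phrase the first syllable of ‘places’ receives a count of 2 and the first syllable of ‘twenty’ a count of 1, which corresponds to the two lowest rows of a grid.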
The above “metrical grid” may be correct for very slow speech, but we must now look at
what happens to the rhythm in normal speech: many English speakers would feel that,
although in ‘twenty places’ the right-hand foot is the stronger, the word ‘twenty’ is stronger
than ‘places’ in ‘twenty places further back’ when spoken in conversational style. It is widely
claimed that English speech tends towards a regular alternation between stronger and
weaker, and tends to adjust stress levels to bring this about. The effect is particularly noticeable
in cases such as the following, which all show the effect of what is called stress-shift:
compact (adjective) kəmˈpækt    but    compact disk ˈkɒmpækt ˈdɪsk
thirteen θɜːˈtiːn               but    thirteenth place ˈθɜːtiːnθ ˈpleɪs
Westminster westˈmɪnstə         but    Westminster Abbey ˈwestmɪnstər ˈæbi
In brief, it seems that stresses are altered according to context: we need to be able to explain
how and why this happens, but this is a difficult question and one for which we have only
partial answers.
An additional factor is that in speaking English we vary in how rhythmically we speak:
sometimes we speak very rhythmically (this is typical of some styles of public speaking)
while at other times we may speak arhythmically (i.e. without rhythm) if we are hesitant
or nervous. Stress-timed rhythm is thus perhaps characteristic of one style of speaking,
not of English speech as a whole; one always speaks with some degree of rhythmicality,
but the degree varies between a minimum value (arhythmical) and a maximum value
(completely stress-timed rhythm).
It follows from what was stated earlier that in a stress-timed language all the feet
are supposed to be of roughly the same duration. Many foreign learners of English are
made to practise speaking English with a regular rhythm, often with the teacher beating
time or clapping hands on the stressed syllables. It must be pointed out, however, that
the evidence for the existence of truly stress-timed rhythm is not strong. There are many
laboratory techniques for measuring time in speech, and measurement of the time intervals
between stressed syllables in connected English speech has not shown the expected
regularity; moreover, using the same measuring techniques on different languages, it has
not been possible to show a clear difference between “stress-timed” and “syllable-timed”
languages. Experiments have shown that we tend to hear speech as more rhythmical than
it actually is, and one suspects that this is what the proponents of the stress-timed rhythm
theory have been led to do in their auditory analysis of English rhythm. However, one
ought to keep an open mind on the subject, remembering that the large-scale, objective
study of suprasegmental aspects of real speech is difficult to carry out, and much research
remains to be done.
What, then, is the practical value of the traditional “rhythm exercise” for foreign
learners? The argument about rhythm should not make us forget the very important
difference in English between strong and weak syllables. Some languages do not have such
a noticeable difference (which may, perhaps, explain the subjective impression of “syllable-
timing”), and for native speakers of such languages who are learning English it can be
helpful to practise repeating strongly rhythmical utterances since this forces the speaker
to concentrate on making unstressed syllables weak. Speakers of languages like Japanese,
Hungarian and Spanish - which do not have weak syllables to anything like the same
extent as English does - may well find such exercises of some value (as long as they are not
overdone to the point where learners feel they have to speak English as though they were
reciting verse).
The device mentioned earlier that produces “mechanical speech” would contain
all the words of English, each having been recorded in isolation. A significant difference
in natural connected speech is the way that sounds belonging to one word can cause
changes in sounds belonging to neighbouring words. Assuming that we know how the
phonemes of a particular word would be realised when the word is pronounced in isolation,
in cases where we find a phoneme realised differently as a result of being near some
other phoneme belonging to a neighbouring word we call this difference an instance of
assimilation. Assimilation is something which varies in extent according to speaking rate
and style: it is more likely to be found in rapid, casual speech and less likely in slow, careful
speech. Sometimes the difference caused by assimilation is very noticeable, and sometimes
it is very slight. Generally speaking, the cases that have most often been described are
assimilations affecting consonants. As an example, consider a case where two words are
combined, the first of which ends with a single final consonant (which we will call Cf) and
the second of which starts with a single initial consonant (which we will call Ci); we can
construct a diagram like this:
- - - - - Cf | Ci - - - - -
           word
         boundary
If Cf changes to become like Ci in some way, then the assimilation is called regressive
(the phoneme that comes first is affected by the one that comes after it); if Ci changes to
become like Cf in some way, then the assimilation is called progressive. An example of the
latter is what is sometimes called coalescence, or coalescent assimilation: a final t, d and
an initial j following often combine to form tʃ, dʒ, so that ‘not yet’ is pronounced nɒtʃet
and ‘could you’ is kʊdʒu. In what ways can a consonant change? We have seen that the
main differences between consonants are of three types:
differently, the only noticeable change being that s becomes ʃ, and z becomes ʒ when followed
by ʃ or j, as in: ‘this shoe’ ðɪʃ ʃuː; ‘those years’ ðəʊʒ jɪəz. It is important to note
that the consonants that have undergone assimilation have not disappeared; in the above
examples, the duration of the consonants remains more or less what one would expect for
a two-consonant cluster. Assimilation of place is only noticeable in this regressive assimilation
of alveolar consonants; it is not something that foreign learners need to learn to do.
Assimilation of manner is much less noticeable, and is only found in the most rapid
and casual speech; generally speaking, the tendency is again for regressive assimilation
and the change in manner is most likely to be towards an “easier” consonant - one which
makes less obstruction to the airflow. It is thus possible to find cases where a final
plosive becomes a fricative or nasal (e.g. ‘that side’ ðæs saɪd, ‘good night’ gʊn naɪt), but
most unlikely that a final fricative or nasal would become a plosive. In one particular
case we find progressive assimilation of manner, when a word-initial ð follows a plosive
or nasal at the end of a preceding word: it is very common to find that the ð becomes
identical in manner to the Cf but with dental place of articulation. For example (the arrow
symbol means “becomes”):
‘in the’        ɪn ðə       →  ɪnn̪ə
‘get them’      get ðəm     →  gett̪əm
‘read these’    riːd ðiːz   →  riːdd̪iːz
The ð phoneme frequently occurs with no discernible friction noise.
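The progressive assimilation of ð can be written as a small rewriting rule. The sketch below is deliberately limited to the alveolar plosives and nasal that appear in the three examples (t, d, n); that restriction, the function name and the use of the combining dental diacritic to mark the result are all assumptions of the illustration.

```python
DENTAL = "\u032A"  # combining bridge below, the IPA dental diacritic

# Assumption: only the final consonants seen in the examples are handled.
TRIGGERS = {"t", "d", "n"}

def assimilate_initial_dh(first, second):
    """Join two transcribed words, replacing a word-initial ð with a dental
    copy of the preceding final consonant when that consonant is t, d or n.
    Otherwise the two words are returned unchanged, separated by a space."""
    if first and second.startswith("ð") and first[-1] in TRIGGERS:
        return first + first[-1] + DENTAL + second[1:]
    return first + " " + second

for pair in [("ɪn", "ðə"), ("get", "ðəm"), ("riːd", "ðiːz"), ("ðɪs", "ðɪŋ")]:
    print(pair, "->", assimilate_initial_dh(*pair))
```

The last pair shows the rule failing to apply: after a fricative the ð is left as it is.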
Assimilation of voice is also found, but again only in a limited way. Only regressive
assimilation of voice is found across word boundaries, and then only of one type; since
this matter is important for foreign learners we will look at it in some detail. If Cf is a lenis
(i.e. “voiced”) consonant and Ci is fortis (“voiceless”) we often find that the lenis consonant
has no voicing; for example in ‘I have to’ the final v becomes voiceless f because of
the following voiceless t in aɪ hæf tu, and in the same way the z in ‘cheese’ tʃiːz becomes
more like s when it occurs in ‘cheesecake’ tʃiːskeɪk. This is not a very noticeable case of
assimilation, since, as was explained in Chapter 4, initial and final lenis consonants usually
have little or no voicing anyway; these devoiced consonants do not shorten preceding
vowels as true fortis consonants do. However, when Cf is fortis (“voiceless”) and Ci lenis
(“voiced”), a context in which in many languages Cf would become voiced, assimilation
of voice never takes place; consider the following example: ‘I like that black dog’ aɪ laɪk
ðæt blæk dɒg. It is typical of many foreign learners of English that they allow regressive
assimilation of voicing to change the final k of ‘like’ to g, the final t of ‘that’ to d and the
final k of ‘black’ to g, giving aɪ laɪg ðæd blæg dɒg. This creates a strong impression of a
foreign accent.
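The one-directional restriction described above can be sketched in a few lines of Python. This is only an illustration: the consonant table is a small assumed subset, the function name is hypothetical, and writing a devoiced lenis consonant with the fortis symbol is a simplification (the text notes that such consonants do not shorten preceding vowels as true fortis consonants do).

```python
# Lenis -> fortis counterparts (illustrative subset, not a full inventory).
DEVOICED = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ð": "θ"}
FORTIS = set(DEVOICED.values())


def assimilate_voice(cf, ci):
    """Return the realisation of word-final Cf before word-initial Ci.

    Only the regressive, devoicing direction is modelled: a lenis Cf
    loses its voicing before a fortis Ci. A fortis Cf is never voiced,
    whatever follows it.
    """
    if cf in DEVOICED and ci in FORTIS:
        return DEVOICED[cf]
    return cf


print(assimilate_voice("v", "t"))  # final v of 'have' before t of 'to'
print(assimilate_voice("k", "ð"))  # final k of 'like' before ð of 'that'
```

The second call returns the k unchanged, which is the behaviour learners are advised to aim for in ‘I like that black dog’.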
Up to this point we have been looking at some fairly clear cases of assimilation across
word boundaries. However, similar effects are also observable across morpheme boundaries
and to some extent also within the morpheme. Sometimes in the latter case it seems
that the assimilation is rather different from the word-boundary examples; for example,
if in a syllable-final consonant cluster a nasal consonant precedes a plosive or a fricative
in the same morpheme, then the place of articulation of the nasal is always determined
by the place of articulation of the other consonant; thus: ‘bump’ bʌmp, ‘tenth’ tenθ, ‘hunt’
hʌnt, ‘bank’ bæŋk. It could be said that this assimilation has become fixed as part of the
phonological structure of English syllables, since exceptions are almost non-existent.
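The fixed nasal-plus-consonant pattern just described amounts to a lookup from the place of the following consonant to the nasal that precedes it. The sketch below assumes a small, hand-picked set of consonants and uses hypothetical names; it is an illustration of the pattern, not a complete account of English clusters.

```python
# Place of articulation of a few following consonants (assumed subset).
PLACE = {
    "p": "bilabial", "b": "bilabial",
    "t": "alveolar", "d": "alveolar",
    "θ": "dental",
    "k": "velar", "g": "velar",
}

# Nasal phoneme used at each place. Before a dental consonant the nasal
# is still written n phonemically, although it is articulated as dental.
NASAL_FOR_PLACE = {
    "bilabial": "m",
    "alveolar": "n",
    "dental": "n",
    "velar": "ŋ",
}


def nasal_before(consonant):
    """Return the nasal expected before the given syllable-final consonant."""
    return NASAL_FOR_PLACE[PLACE[consonant]]


for word, final in [("bump", "p"), ("tenth", "θ"), ("hunt", "t"), ("bank", "k")]:
    print(word, nasal_before(final))
```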
A similar example of a type of assimilation that has become fixed is the progressive
assimilation of voice with the suffixes s, z; when a verb carries a third person singular ‘-s’
suffix, or a noun carries an ‘-s’ plural suffix or an ‘-’s’ possessive suffix, that suffix will be
pronounced as s if the preceding consonant is fortis (“voiceless”) and as z if the preceding
consonant is lenis (“voiced”). Thus:
‘cats’ kæts     ‘dogs’ dɒgz
‘jumps’ dʒʌmps  ‘runs’ rʌnz
‘Pat’s’ pæts    ‘Pam’s’ pæmz
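The suffix rule stated above can be written as a short function. This is a minimal sketch under stated assumptions: the fortis set is an illustrative subset, the function names are hypothetical, and stems ending in sibilants (such as s, z or ʃ) fall outside the rule as given here and are deliberately not handled.

```python
# Fortis ("voiceless") stem-final consonants covered by this sketch.
# Sibilants are left out on purpose: the rule as stated does not cover them.
FORTIS = {"p", "t", "k", "f", "θ"}


def s_suffix(stem_final):
    """Return 's' after a fortis consonant and 'z' otherwise."""
    return "s" if stem_final in FORTIS else "z"


def add_suffix(stem):
    """Append the suffix to a stem given as a list of phoneme symbols."""
    return stem + [s_suffix(stem[-1])]


print("".join(add_suffix(["k", "æ", "t"])))  # 'cats'
print("".join(add_suffix(["d", "ɒ", "g"])))  # 'dogs'
```

Because the suffix copies the voicing of the sound before it, the direction of influence is left to right, which is why the text calls this progressive assimilation.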
Assimilation creates something of a problem for phoneme theory: when, for example, d in
‘good’ gʊd becomes g in the context ‘good girl’, giving gʊg gɜːl or b in the context ‘good
boy’ gʊb bɔɪ, should we say that one phoneme has been substituted for another? If we do
this, how do we describe the assimilation in ‘good thing’, where d becomes dental d̪ before
the θ of ‘thing’, or in ‘good food’, where d becomes a labiodental plosive before the f in
‘food’? English has no dental or labiodental plosive phonemes, so in these cases, although
there is clearly assimilation, there could not be said to be a substitution of one phoneme
for another. The alternative is to say that assimilation causes a phoneme to be realised by
a different allophone; this would mean that, in the case of gʊg gɜːl and gʊb bɔɪ, the phoneme
d of ‘good’ has velar and bilabial allophones. Traditionally, phonemes were supposed
not to overlap in their allophones, so that the only plosives that could have allophones
with bilabial place of articulation were p, b; this restriction is no longer looked on as
so important. The traditional view of assimilation as a change from one phoneme to
another is, therefore, naive: modern instrumental studies in the broader field of coarticulation
show that when assimilation happens one can often see some sort of combination
of articulatory gestures. In ‘good girl’, for example, it is not a simple matter of the first
word ending either in d or in g, but rather a matter of the extent to which alveolar and/or
velar closures are achieved. There may be an alveolar closure immediately preceding and
overlapping with a velar closure; there may be simultaneous alveolar and velar closure, or
a velar closure followed by slight contact but not closure in the alveolar region. There are
many other possibilities.
Much more could be said about assimilation but, from the point of view of learning
or teaching English pronunciation, to do so would not be very useful. It is essentially a
natural phenomenon that can be seen in any sort of complex physical activity, and the only
important matter is to remember the restriction, specific to English, on voicing assimilation
mentioned above.
The nature of elision may be stated quite simply: under certain circumstances
sounds disappear. One might express this in more technical language by saying that in
certain circumstances a phoneme may be realised as zero, or have zero realisation or be
deleted. As with assimilation, elision is typical of rapid, casual speech. Producing elisions
is something which foreign learners do not need to learn to do, but it is important for
them to be aware that when native speakers of English talk to each other, quite a number
of phonemes that the foreigner might expect to hear are not actually pronounced. We
will look at some examples, although only a small number of the many possibilities can
be given here.
• ‘are’: spelt ’re, pronounced ə after vowels, usually with some change in the
preceding vowel (e.g. ‘you’ juː - ‘you’re’ jʊə or jɔː, ‘we’ wiː - ‘we’re’ wɪə, ‘they’
ðeɪ - ‘they’re’ ðeə); linking is used when a vowel follows, as explained in the
next section. Contracted ‘are’ is also pronounced as ə or ər when following a
consonant.
14.4 Linking
In our hypothetical “mechanical speech” all words would be separate units placed
next to each other in sequence; in real connected speech, however, we link words together
in a number of ways. The most familiar case is the use of linking r; the phoneme r does
not occur in syllable-final position in the BBC accent, but when the spelling of a word
suggests a final r, and a word beginning with a vowel follows, the usual pronunciation is to
pronounce with r. For example:
‘here’ hɪə but ‘here are’ hɪər ə
‘four’ fɔː but ‘four eggs’ fɔːr egz
BBC speakers often use r in a similar way to link words ending with a vowel, even when
there is no “justification” from the spelling, as in:
‘Formula A’ fɔːmjələr eɪ
‘Australia all out’ ɒstreɪliər ɔːl aʊt
‘media event’ miːdiər ɪvent
This has been called intrusive r; some English speakers and teachers still regard this as
incorrect or substandard pronunciation, but it is undoubtedly widespread.
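The conditions for linking r and intrusive r can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the spelling test (a final ‘r’ or ‘re’) is a rough heuristic, the vowel check looks only at the first symbol of the next word, and intrusive r is modelled only after a final ə because that is the only case shown in the examples above.

```python
# Symbols that can begin a vowel in the transcriptions used here (assumed set).
VOWEL_SYMBOLS = set("ɪeæʌɒʊəiuɑɔɜa")


def spelling_has_final_r(spelling):
    """Rough heuristic: the spelling ends in 'r' or 're' (e.g. 'four', 'here')."""
    return spelling.lower().rstrip("e").endswith("r")


def link(pron, spelling, next_pron, allow_intrusive=False):
    """Join two transcribed words, inserting r where the conditions are met."""
    next_starts_with_vowel = next_pron[0] in VOWEL_SYMBOLS
    ends_in_schwa = pron.endswith("ə")
    use_r = next_starts_with_vowel and (
        spelling_has_final_r(spelling) or (allow_intrusive and ends_in_schwa)
    )
    return pron + ("r " if use_r else " ") + next_pron


print(link("hɪə", "here", "ə"))      # linking r
print(link("fɔː", "four", "egz"))    # linking r
print(link("miːdiə", "media", "ɪvent", allow_intrusive=True))  # intrusive r
```

Keeping intrusive r behind a flag reflects the point that not all speakers accept it, whereas linking r is the usual pronunciation.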
“Linking r” and “intrusive r” are special cases of juncture; we need to consider the
relationship between one sound and the sounds that immediately precede and follow it. If
we take the two words ‘my turn’ maɪ tɜːn, we know that the sounds m and aɪ, t and ɜː, and
ɜː and n are closely linked. The problem lies in deciding what the relationship is between
aɪ and t; since we do not usually pause between words, there is no silence to indicate word
division and to justify the space left in the transcription. But if English speakers hear maɪ
tɜːn they can usually recognise this as ‘my turn’ and not ‘might earn’. This is where the
problem of juncture becomes apparent. What is it that makes perceptible the difference
between maɪ tɜːn and maɪt ɜːn? The answer is that in one case the t is fully aspirated
(initial in ‘turn’), and in the other case it is not (being final in ‘might’). In addition to this,
the aɪ diphthong is shorter in ‘might’. If a difference in meaning is caused by the difference
between aspirated and unaspirated t, how can we avoid the conclusion that English has
a phonemic contrast between aspirated and unaspirated t? The answer is that the position
of a word boundary has some effect on the realisation of the t phoneme; this is one
of the many cases in which the occurrence of different allophones can only be properly
explained by making reference to units of grammar (something which was for a long time
disapproved of by many phonologists).
Many ingenious minimal pairs have been invented to show the significance of
juncture, a few of which are given below:
• ‘might rain’ maɪt reɪn (r voiced when initial in ‘rain’, aɪ shortened), vs.
‘my train’ maɪ treɪn (r voiceless following t in ‘train’, aɪ longer)
• ‘all that I’m after today’ ɔːl ðət aɪm ɑːftə tədeɪ (t relatively unaspirated when
final in ‘that’)
‘all the time after today’ ɔːl ðə taɪm ɑːftə tədeɪ (t aspirated when initial in
‘time’)
• ‘tray lending’ treɪ lendɪŋ (“clear l” initial in ‘lending’)
‘trail ending’ treɪl endɪŋ (“dark l” final in ‘trail’)
• ‘keep sticking’ kiːp stɪkɪŋ (t unaspirated after s)
‘keeps ticking’ kiːps tɪkɪŋ (t aspirated in ‘ticking’)
The context in which the words occur almost always makes it clear where the boundary
comes, and the juncture information is then redundant.
It should by now be clear that there is a great deal of difference between the way
words are pronounced in isolation and their pronunciation in the context of connected
speech.
14.1 English rhythm is a controversial subject on which widely differing views have been
expressed. On one side there have been writers such as Abercrombie (1967) and Halliday
(1967) who set out an elaborate theory of the rhythmical structure of English speech
(including foot theory). On the other side there are sceptics like Crystal (1969: 161-5)
who reject the idea of an inherent rhythmical pattern. The distinction between physically
measurable time intervals and subjective impressions of rhythmicality is discussed in
Roach (1982) and Lehiste (1977). Adams (1979) presents a review and experimental
study of the subject, and concludes that, despite the theoretical problems, there is practical
value in teaching rhythm to learners of English. The “stress-timed / syllable-timed”
dichotomy is generally agreed in modern work to be an oversimplification; a more widely
accepted view is that all languages display characteristics of both types of rhythm, but each
may be closer to one or the other; see Mitchell (1969) and Dauer (1983). Dauer’s theory
makes possible comparisons between different languages in terms of their relative positions
on a scale from maximally stress-timed to maximally syllable-timed (see for example
Dimitrova, 1997).
For some writers concerned with English language teaching, the notion of rhythm is
a more practical matter of making a sufficiently clear difference between strong and weak
syllables, rather than concentrating on a rigid timing pattern, as I suggest at the end of
Section 14.1; see, for example, Taylor (1981).
The treatment of rhythmical hierarchy is based on the theory of metrical phonology.
Hogg and McCully (1987) give a full explanation of this, but it is difficult material.
Goldsmith (1990: Chapter 4) and Katamba (1989: Chapter 11.1) are briefer and somewhat
simpler. A paper by Fudge (1999) discusses the relationship between syllables, words and
feet. James (1988) explores the relevance of metrical phonology to language learning.
14.2 Factors such as assimilation and elision are dealt with in an interesting and original
way in Shockey (2003). Assimilation is described in more conventional terms in
Cruttenden (2008: 297-303). For reading on coarticulation, which studies the influences
of sounds on each other in wider and more complex ways than assimilation, see Roach
(2002), Ladefoged (2006: 68-71).
Notes for teachers
There is a lot of disagreement about the importance of the various topics in this chapter
from the language teacher’s point of view. My feeling is that while the practice and study
of connected speech are agreed by everyone to be very valuable, this can sometimes result
in some relatively unimportant aspects of speech (e.g. assimilation, juncture) being given
more emphasis than they should. It would not be practical or useful to teach all learners
of English to produce assimilations; practice in making elisions is more useful, and it is
clearly valuable to do exercises related to rhythm and linking. Perhaps the most important
consequence of what has been described in this chapter is that learners of English must
be made very clearly aware of the problems that they will meet in listening to colloquial,
connected speech.
In looking at the importance of studying aspects of speech above the segmental
level some writers have claimed that learners can come to identify an overall “feel” of the
pronunciation of the language being learned. Differences between languages have been
described in terms of their articulatory settings - that is, overall articulatory posture -
by Honikman (1964). She describes such factors as lip mobility and tongue setting for
English, French and other languages. The notion seems a useful one, although it is difficult
to confirm these settings scientifically.
Audio Unit 14 is liable to come as something of a surprise to students who have not
had the experience of examining colloquial English speech before. The main message to
get across is that concentration on selective, analytic listening will help them to recognise
what is being said, and that practice usually brings confidence.
Written exercises
1 Divide the following sentences up into feet, using a dotted vertical line (⋮) as a
boundary symbol. If a sentence starts with an unstressed syllable, leave it out of
consideration - it doesn’t belong in a foot.
a) A bird in the hand is worth two in the bush.
b) Over a quarter of a century has elapsed since his death.
c) Computers consume a considerable amount of money and time.
d) Most of them have arrived on the bus.
e) Newspaper editors are invariably underworked.
2 Draw tree diagrams of the rhythmical structure of the following phrases.
a) Christmas present
b) Rolls-Royce
c) pet-food dealer
d) Rolls-Royce rally event
3 The following sentences are given in spelling and in a “slow, careful” phonemic
transcription. Rewrite the phonemic transcription as a “broad phonetic” one so
as to show likely assimilations, elisions and linking.
a) One cause of asthma is supposed to be allergies
wʌn kɔːz əv æsθmə ɪz səpəʊzd tə bi ælədʒiz
Many of the previous chapters have been concerned with the description of phonemes, and
in Section 5.2 it was pointed out that the subject of phonology includes not just this aspect
(which is usually called segmental phonology) but also several others. In Chapters 10 and
11, for example, we studied stress. Clearly, stress has linguistic importance and is therefore
an aspect of the phonology of English that must be described, but it is not usually regarded
as something that is related to individual segmental phonemes; normally, stress is said
to be something that is applied to (or is a property of) syllables, and is therefore part of
the suprasegmental phonology of English. (Another name for suprasegmental phonology
is prosodic phonology or prosody.) An important part of suprasegmental phonology is
intonation, and the next five chapters are devoted to this subject.
What is intonation? No definition is completely satisfactory, but any attempt at a
definition must recognise that the pitch of the voice plays the most important part. Only
in very unusual situations do we speak with fixed, unvarying pitch, and when we speak
normally the pitch of our voice is constantly changing. One of the most important tasks
in analysing intonation is to listen to the speaker’s pitch and recognise what it is doing;
this is not an easy thing to do, and it seems to be a quite different skill from that acquired
in studying segmental phonetics. We describe pitch in terms of high and low, and some
people find it difficult to relate what they hear in someone’s voice to a scale ranging from
low to high. We should remember that “high” and “low” are arbitrary choices for end
points of the pitch scale. It would be perfectly reasonable to think of pitch as ranging
instead from “light” to “heavy”, for example, or from “left” to “right”, and people who have
difficulty in “hearing” intonation patterns are generally only having difficulty in relating
what they hear (which is the same as what everyone else hears) to this “pseudo-spatial”
representation.
It is very important to make the point that we are not interested in all aspects of a
speaker’s pitch; the only things that should interest us are those which carry some linguistic
information. If a speaker tries to talk while riding fast on a horse, his or her pitch will
make a lot of sudden rises and falls as a result of the irregular movement; this is something
which is outside the speaker’s control and therefore cannot be linguistically significant.
Similarly, if we take two speakers at random we will almost certainly find that one speaker
typically speaks with lower pitch than the other; the difference between the two speakers
is not linguistically significant because their habitual pitch level is determined by their
physical structure. But an individual speaker does have control over his or her own pitch,
and may choose to speak with a higher than normal pitch; this is something which is
potentially of linguistic significance.
A word of caution is needed in connection with the word pitch. Strictly speaking,
this should be used to refer to an auditory sensation experienced by the hearer. The rate
of vibration of the vocal folds - something which is physically measurable, and which is
related to activity on the part of the speaker - is the fundamental frequency of voiced
sounds, and should not be called “pitch”. However, as long as this distinction is understood,
it is generally agreed that the term “pitch” is a convenient one to use informally
to refer both to the subjective sensation and to the objectively measurable fundamental
frequency.
We have established that for pitch differences to be linguistically significant, it is
a necessary condition that they should be under the speaker’s control. There is another
necessary condition and that is that a pitch difference must be perceptible; it is possible to
detect differences in the frequency of the vibration of a speaker’s voice by means of laboratory
instruments, but these differences may not be great enough to be heard by a listener
as differences in pitch. Finally, it should be remembered that in looking for linguistically
significant aspects of speech we must always be looking for contrasts; one of the most
important things about any unit of phonology or grammar is the set of items it contrasts
with. We know how to establish which phonemes are in contrast with b in the context -in;
we can substitute other phonemes (e.g. p, s) to change the identity of the word from ‘bin’
to ‘pin’ to ‘sin’. Can we establish such units and contrasts in intonation?
To summarise what was said above, we want to know the answers to two questions
about English speech:
i) What can we observe when we study pitch variations?
ii) What is the linguistic importance of the phenomena we observe?
These questions might be rephrased more briefly as:
i) What is the form of intonation?
ii) What is the function of intonation?
We will begin by looking at intonation in the shortest piece of speech we can find -
the single syllable. At this point a new term will be introduced: we need a name for a
continuous piece of speech beginning and ending with a clear pause, and we will call this
an utterance. In this chapter, then, we are going to look at the intonation of one-syllable
utterances. These are quite common, and give us a comparatively easy introduction to the
subject.
Two common one-syllable utterances are ‘yes’ and ‘no’. The first thing to notice is
that we have a choice of saying these with the pitch remaining at a constant level, or with
the pitch changing from one level to another. The word we use for the overall behaviour
of the pitch in these examples is tone; a one-syllable word can be said with either a level
tone or a moving tone. If you try saying ‘yes’ or ‘no’ with a level tone (rather as though
you were trying to sing them on a steady note) you may find the result does not sound
natural, and indeed English speakers do not use level tones on one-syllable utterances very
frequently. Moving tones are more common. If English speakers want to say ‘yes’ or ‘no’ in
a definite, final manner they will probably use a falling tone - one which descends from a
higher to a lower pitch. If they want to say ‘yes?’ or ‘no?’ in a questioning manner they may
say it with a rising tone - a movement from a lower pitch to a higher one.
Notice that already, in talking about different tones, some idea of function has been
introduced; speakers are said to select from a choice of tones according to how they want
the utterance to be heard, and it is implied that the listener will hear one-syllable utterances
said with different tones as sounding different in some way. During the development of
modern phonetics in the twentieth century it was for a long time hoped that scientific
study of intonation would make it possible to state what the function of each different
aspect of intonation was, and that foreign learners could then be taught rules to enable
them to use intonation in the way that native speakers use it. Few people now believe this
to be possible. It is certainly possible to produce a few general rules, and some will be given
in this course, just as a few general rules for word stress were given in Chapters 10 and 11.
However, these rules are certainly not adequate as a complete practical guide to how to use
English intonation. My treatment of intonation is based on the belief that foreign learners
of English at advanced levels who may use this course should be given training to make
them better able to recognise and copy English intonation. The only really efficient way to
learn to use the intonation of a language is the way a child acquires the intonation of its
first language, and the training referred to above should help the adult learner of English
to acquire English intonation in a similar (though much slower) way - through listening
to and talking to English speakers. It is perhaps a discouraging thing to say, but learners of
English who are not able to talk regularly with native speakers of English, or who are not
able at least to listen regularly to colloquial English, are not likely to learn English intonation,
although they may learn very good pronunciation of the segments and use stress
correctly.
In the preceding section we mentioned three simple possibilities for the intonation
used in pronouncing the one-word utterances ‘yes’ and ‘no’. These were: level, fall and rise.
It will often be necessary to use symbols to represent tones, and for this we will use marks
placed before the syllable in the following way (phonemic transcription will not be used in
these examples - words are given in spelling):
Level    _yes  _no
Falling  \yes  \no
Rising   /yes  /no
This simple system for tone transcription could be extended, if we wished, to cover a
greater number of possibilities. For example, if it were important to distinguish between a
high level and low level tone for English we could do it in this way:
High level ¯yes ¯no
Low level _yes _no
Although in English we do on occasions say ¯yes or ¯no and on other occasions _yes or
_no, a speaker of English would be unlikely to say that the meaning of the words ‘yes’ and
‘no’ was different with the different tones; as will be seen below, we will not use the symbols
for high and low versions of tones in the description of English intonation. But there
are many languages in which the tone can determine the meaning of a word, and changing
from one tone to another can completely change the meaning. For example, in Kono, a
language of West Africa, we find the following (meanings given in brackets):
High level ¯beŋ (‘uncle’) ¯buu (‘horn’)
Low level _beŋ (‘greedy’) _buu (‘to be cross’)
Similarly, while we can hear a difference between English _yes, /yes and \yes, and between
_no, /no and \no, there is not a difference in meaning in such a clear-cut way as in
Mandarin Chinese, where, for example, ¯ma means ‘mother’, /ma means ‘hemp’ and \ma
means ‘scold’. Languages such as the above are called tone languages; although to most
speakers of European languages they may seem strange and exotic, such languages are in
fact spoken by a very large proportion of the world’s population. In addition to the many
dialects of Chinese, many other languages of South-East Asia (e.g. Thai, Vietnamese) are
tone languages; so are very many African languages, particularly those of the South and
West, and a considerable number of Native American languages. English, however, is not
a tone language, and the function of tone is much more difficult to define than in a tone
language.
We have introduced three simple tones that can be used on one-syllable English
utterances: level, fall and rise. However, other more complex tones are also used. One that
is quite frequently found is the fall-rise tone, where the pitch descends and then rises
again. Another complex tone, much less frequently used, is the rise-fall in which the pitch
follows the opposite movement. We will not consider any more complex tones, since these
are not often encountered and are of little importance.
One further complication should be mentioned here. Each speaker has his or her
own normal pitch range: a top level which is the highest pitch normally used by the
speaker, and a bottom level that the speaker’s pitch normally does not go below. In ordinary
speech, the intonation tends to take place within the lower part of the speaker’s pitch
range, but in situations where strong feelings are to be expressed it is usual to make use
of extra pitch height. For example, if we represent the pitch range by drawing two parallel
\yes and ↑\yes
In this chapter only a very small part of English intonation has been introduced.
We will now see if it is possible to state in what circumstances the different tones are
used within the very limited context of the words ‘yes’ and ‘no’ said in isolation. We will
look at some typical occurrences; no examples of extra pitch height will be considered
here, so the examples should be thought of as being said relatively low in the speaker’s
pitch range.
Fall \yes \no
This is the tone about which least needs to be said, and which is usually regarded
as more or less “neutral”. If someone is asked a question and replies \yes or \no it will be
understood that the question is now answered and that there is nothing more to be said.
The fall could be said to give an impression of “finality”.
Rise /yes /no
In a variety of ways, this tone conveys an impression that something more is to follow.
A typical occurrence in a dialogue between two speakers whom we shall call A and B
might be the following:
(B’s reply is, perhaps, equivalent to ‘what do you want?’) Another quite common occurrence
would be:
A: Do you know John Smith?
One possible reply from B would be /yes, inviting A to continue with what she intends
to say about John Smith after establishing that B knows him. To reply instead \yes would
give a feeling of “finality”, of “end of the conversation”; if A did have something to say
about John Smith, the response with a fall would make it difficult for A to continue.
We can see similar “invitations to continue” in someone’s response to a series of
instructions or directions. For example:
A: You start off on the ring road ...
B: /yes
A: turn left at the first roundabout...
B: /yes
A: and ours is the third house on the left.
Whatever B replies to this last utterance of A, it would be most unlikely to be /yes again,
since A has clearly finished her instructions and it would be pointless to “prompt” her to
continue.
With ‘no’, a similar function can be seen. For example:
A: Have you seen Ann?
If B replies \no (without using high pitch at the start) he implies that he has no interest
in continuing with that topic of conversation. But a reply of /no would be an invitation
to A to explain why she is looking for Ann, or why she does not know where
she is.
Similarly, someone may ask a question that implies readiness to present some new
information. For example:
A: Do you know what the longest balloon flight was?
If B replies /no he is inviting A to tell him, while a response of \no would be more likely
to mean that he does not know and is not expecting to be told. Such “do you know?” questions
are, in fact, a common cause of misunderstanding in English conversation, when
a question such as A’s above might be a request for information or an offer to provide
some.
B’s reply would be taken to mean that he would not completely agree with what A
said, and A would probably expect B to go on to explain why he was reluctant to agree.
Similarly:
A: It’s not really an expensive book, is it?
B: ˅no
The fall-rise in B’s reply again indicates that he would not completely agree with A.
Fall-rise in such contexts almost always indicates both something “given” or “conceded”
and at the same time some reservation or hesitation. This use of intonation will be returned
to in Chapter 19.
Level _yes _no
This tone is certainly used in English, but in a rather restricted context: it almost
always conveys (on single-syllable utterances) a feeling of saying something routine, uninteresting
or boring. A teacher calling the names of students from a register will often do
so using a level tone on each name, and the students are likely to respond with _yes when
their name is called. Similarly, if one is being asked a series of routine questions for some
purpose - such as applying for an insurance policy - one might reply to each question of
a series (like ‘Have you ever been in prison?’, ‘Do you suffer from any serious illness?’, ‘Is
your eyesight defective?’, etc.) with _no.
A few meanings have been suggested for the five tones that have been introduced,
but each tone may have many more such meanings. Moreover, it would be quite wrong to
conclude that in the above examples only the tones given would be appropriate; it is, in
fact, almost impossible to find a context where one could not substitute a different tone.
This is not the same thing as saying that any tone can be used in any context: the point
is that no particular tone has a unique “privilege of occurrence” in a particular context.
When we come to look at more complex intonation patterns, we will see that defining
intonational “meanings” does not become any easier.
We can now move on from examples of ‘yes’ and ‘no’ and see how some of these tones
can be applied to other words, either single-syllable words or words of more than one
syllable. In the case of polysyllabic words, it is always the most strongly stressed syllable
that receives the tone; the tone mark is equivalent to a stress mark. We will underline
syllables that carry a tone from this point onwards.
Examples:
Fall (usually suggests a “final” or “definite” feeling)
\stop  \eighty  a\gain
Rise (often suggesting a question)
/sure    /really    to/night
When a speaker is giving a list of items, they often use a rise on each item until the last,
which has a fall, for example:
You can have it in /red, /blue, /green or \black
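The listing pattern just described can be sketched in a few lines of Python. This is only an illustration of the tendency, not a rule of English: the function name `mark_list` and its parameters are invented for the example, and the marks / and \ are simply typed in front of each word, whereas the course places them before the stressed syllable.

```python
def mark_list(items, conjunction="or", rise="/", fall="\\"):
    """Put a rise mark before every item except the last, which gets a fall.

    Hypothetical helper: it marks whole one-syllable words and does not
    attempt to locate the stressed syllable of a longer word.
    """
    if not items:
        return ""
    *earlier, last = items
    marked = [rise + item for item in earlier]
    final = fall + last
    if not marked:
        return final
    return ", ".join(marked) + f" {conjunction} " + final


print(mark_list(["red", "blue", "green", "black"]))
# /red, /blue, /green or \black
```

Passing `rise="v"` would give the fall-rise variant of the same listing pattern.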
Fall-rise (often suggesting uncertainty or hesitation)
vsome vnearly pervhaps
Fall-rise is sometimes used instead of rise in giving lists.
Rise-fall (often sounds surprised or impressed)
Aoh    Alovely    imAmense
15.1 The study of intonation went through many changes in the twentieth century, and
different theoretical approaches emerged. In the United States the theory that evolved was based
on ‘pitch phonemes’ (Pike, 1945; Trager and Smith, 1951): four contrastive pitch levels were
established and intonation was described basically in terms of a series of movements from
one of these levels to another. You can read a summary of this approach in Cruttenden
(1997: 38-40). In Britain the ‘tone-unit’ or ‘tonetic’ approach was developed by (among
others) O’Connor and Arnold (1973) and Halliday (1967). These two different theoretical
approaches became gradually more elaborate and difficult to use. I have tried in this course
to stay within the conventions of the British tradition, but to present an analysis that is
simpler than most. A good introduction to the theoretical issues is Cruttenden (1997).
Wells (2006) is also in the tradition of British analyses, but goes into much more detail
than the present course, including a lot of recorded practice material.
15.2 The amount of time to be spent on learning about tone languages should depend to
some extent on your background. Those whose native language is a tone language should
be aware of the considerable linguistic importance of tone in such languages; often it is
extremely difficult for people who have spoken a tone language all their life to learn to
observe their own use of tone objectively. The study of tone languages when learning
English is less important for native speakers of non-tone languages, but most students
seem to find it an interesting subject. A good introduction is Ladefoged (2006: 247-253).
The classic work on the subject is Pike (1948) while more modern treatments are Hyman
(1975: 212-29), Fromkin (1978) and Katamba (1989: Chapter 10).
Many analyses within the British approach to intonation include among tones both
“high” and “low” varieties. For example, O’Connor and Arnold (1973) distinguished
between “high fall” and “low fall” (the former starting from a high pitch, the latter from
mid), and also between “low rise” and “high rise” (the latter rising to a higher point than
the former). Some writers had high and low versions of all tones. Compared with our
separate feature of extra pitch height (which is explained more fully in Section 18.1), this
is unnecessary duplication. However, if one adds extra pitch height to a tone, one has
not given all possible detail about it. If we take as an example a fall-rise without extra
pitch height:
then something symbolised as ↑v could be any of the following:
students find it very helpful to work with a computer showing a real-time display of their
pitch movements as they speak.
Written exercise
In the following sentences and bits of dialogue, each underlined syllable must be given an
appropriate tone mark. Write a tone mark just in front of the syllable.
1 This train is for Leeds, York and Hull.
2 Can you give me a lift?
Possibly. Where to?
3 No! Certainly not! Go away!
4 Did you know he’d been convicted of drunken driving?
No!
5 If I give him money he goes and spends it.
If I lend him the bike he loses it.
He’s completely unreliable.
16 Intonation 2
16.1 The tone-unit
In Chapter 15 it was explained that many of the world’s languages are tone languages,
in which substituting one distinctive tone for another on a particular word or morpheme
can cause a change in the dictionary (“lexical”) meaning of that word or morpheme, or
in some aspect of its grammatical categorisation. Although tones or pitch differences are
used for other purposes, English is one of the languages that do not use tone in this way.
Languages such as English are sometimes called intonation languages. In tone languages
the main suprasegmental contrastive unit is the tone, which is usually linked to the
phonological unit that we call the syllable. It could be said that someone analysing the function
and distribution of tones in a tone language would be mainly occupied in examining
utterances syllable by syllable, looking at each syllable as an independently variable item.
In Chapter 15, five tones found on English one-syllable utterances were introduced, and if
English were spoken in isolated monosyllables, the job of tonal analysis would be a rather
similar one to that described for tone languages. However, when we look at continuous
speech in English utterances we find that these tones can only be identified on a small
number of particularly prominent syllables. For the purposes of analysing intonation, a
unit generally greater in size than the syllable is needed, and this unit is called the
tone-unit; in its smallest form the tone-unit may consist of only one syllable, so it would in fact
be wrong to say that it is always composed of more than one syllable. The tone-unit is
difficult to define, and one or two examples may help to make it easier to understand the
concept. As explained in Chapter 15, examples used to illustrate intonation transcription
are usually given in spelling form, and you will notice that no punctuation is used; the
reason for this is that intonation and stress are the vocal equivalents of written punctuation,
so that when these are transcribed it would be unnecessary or even confusing to include
punctuation as well.
AU16 (CD 2), Exs 1 & 2
Let us begin with a one-syllable utterance:
/ you
We underline syllables that carry a tone, as explained at the end of the previous chapter.
Now consider this utterance:
is it / you
The third syllable is more prominent than the other two and carries a rising tone.
The other two syllables will normally be much less prominent, and be said on a level
pitch. Why do we not say that each of the syllables ‘is’ and ‘it’ carries a level tone? This is
a difficult question that will be examined more fully later; for the present I will answer
it (rather unsatisfactorily) by saying that it is unusual for a syllable said on a level pitch
to be so prominent that it would be described as carrying a level tone. To summarise the
analysis of ‘is it /you’ so far, it is an utterance of three syllables, consisting of one
tone-unit; the only syllable that carries a tone is the third one. From now on, a syllable which
carries a tone will be called a tonic syllable. It has been mentioned several times that
tonic syllables have a high degree of prominence; prominence is, of course, a property of
stressed syllables, and a tonic syllable not only carries a tone (which is something related
to intonation) but also a type of stress that will be called tonic stress. (Some writers use
the terms nucleus and nuclear stress for tonic syllable and tonic stress.)
The example can now be extended:
vJohn is it /you
A fall-rise tone is used quite commonly in calling someone’s name. If there is a clear pause
(silence) between ‘vJohn’ and ‘is it /you’ then, according to the definition of an utterance
given in Chapter 15, there are two utterances; however, it is quite likely that a speaker
would say ‘vJohn is it /you’ with no pause, so that the four syllables would make up a
single utterance. In spite of the absence of any pause, the utterance would normally be
regarded as divided into two tone-units: ‘vJohn’ and ‘is it /you’. Since it is very difficult to
lay down the conditions for deciding where the boundaries between tone-units exist, the
discussion of this matter must wait until later.
It should be possible to see now that the tone-unit has a place in a range of
phonological units that are in a hierarchical relationship: speech consists of a number of
utterances (the largest units that we shall consider); each utterance consists of one or more
tone-units; each tone-unit consists of one or more feet; each foot consists of one or more
syllables; each syllable consists of one or more phonemes.
In Chapter 8 the structure of the English syllable was examined in some detail. Like
the syllable, the tone-unit has a fairly clearly defined internal structure, but the only
component that has been mentioned so far is the tonic syllable. The first thing to be done is
to make more precise the role of the tonic syllable in the tone-unit. Most tone-units are
of a type that we call simple, and the sort that we call compound are not discussed in this
chapter. Each simple tone-unit has one and only one tonic syllable; this means that the
tonic syllable is an obligatory component of the tone-unit. (Compare the role of the vowel
in the syllable.) We will now see what the other components may be.
The head
Consider the following one-syllable utterance:
\ those
We can find the same tonic syllable in a long utterance (still of one tone-unit):
'give me \those
The rest of the tone-unit in this example is called the head. Notice that the first syllable
has a stress mark: this is important. A head is all of that part of a tone-unit that extends
from the first stressed syllable up to (but not including) the tonic syllable. It follows that if
there is no stressed syllable before the tonic syllable, there cannot be a head. In the above
example, the first two syllables (words) are the head of the tone-unit. In the following
example, the head consists of the first five syllables:
'Bill 'called to 'give me \ these
As was said a little earlier, if there is no stressed syllable preceding the tonic syllable, there
is no head. This is the case in the following example:
in an \hour
Neither of the two syllables preceding the tonic syllable is stressed. The syllables ‘in an’
form a pre-head, which is the next component of the tone-unit to be introduced.
The pre-head
The pre-head is composed of all the unstressed syllables in a tone-unit preceding the
first stressed syllable. Thus pre-heads are found in two main environments:
i) when there is no head (i.e. no stressed syllable preceding the tonic syllable), as in
this example:
in an \ hour
ii) when there is a head, as in this example:
in a 'little 'less than an \hour
In this example, the pre-head consists of ‘in a’, the head consists of ‘'little 'less than an’, and
the tonic syllable is ‘\hour’.
The tail
It often happens that some syllables follow the tonic syllable. Any syllables between
the tonic syllable and the end of the tone-unit are called the tail. In the following
examples, each tone-unit consists of an initial tonic syllable and a tail:
\look at it    /what did you say    \both of them were here
When it is necessary to mark stress in a tail, we will use a special symbol, a raised dot • for
reasons that will be explained later. The above examples should, then, be transcribed as
follows:
\look at it    /what did you •say    \both of them were •here
This completes the list of tone-unit components. If we use brackets to indicate optional
components (i.e. components which may be present or may be absent), we can summarise
tone-unit structure as follows:
(pre-head) (head) tonic syllable (tail)
or, more briefly, as:
(PH) (H) TS (T)
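The definitions of pre-head, head, tonic syllable and tail given above amount to a simple procedure, which can be sketched in Python. This is a minimal sketch under stated assumptions: the `Syllable` class and the function `segment_tone_unit` are invented for the illustration, the input is assumed to be a simple tone-unit that has already been divided into syllables, and stress and tone are assumed to have been marked by hand.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    text: str
    stressed: bool = False  # carries a stress mark (in the head or in the tail)
    tonic: bool = False     # carries one of the five tone marks


def segment_tone_unit(syllables):
    """Split a simple tone-unit into (pre-head) (head) tonic syllable (tail)."""
    tonic_positions = [i for i, s in enumerate(syllables) if s.tonic]
    if len(tonic_positions) != 1:
        raise ValueError("a simple tone-unit has one and only one tonic syllable")
    t = tonic_positions[0]

    # The head runs from the first stressed syllable up to (but not including)
    # the tonic syllable; if no syllable before the tonic is stressed, there
    # is no head and everything before the tonic is pre-head.
    first_stress = next((i for i in range(t) if syllables[i].stressed), t)

    return {
        "pre_head": [s.text for s in syllables[:first_stress]],
        "head": [s.text for s in syllables[first_stress:t]],
        "tonic": syllables[t].text,
        "tail": [s.text for s in syllables[t + 1:]],
    }


# in a 'little 'less than an \hour
example = [
    Syllable("in"), Syllable("a"),
    Syllable("lit", stressed=True), Syllable("tle"),
    Syllable("less", stressed=True), Syllable("than"), Syllable("an"),
    Syllable("hour", tonic=True),
]
print(segment_tone_unit(example))
```

The three optional components come out as empty lists when they are absent, which mirrors the brackets in the summary formula; only the tonic syllable is obligatory.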
To illustrate this more fully, let us consider the following passage, which is transcribed
from a recording of spontaneous speech (the speaker is describing a picture). When we
analyse longer stretches of speech, it is necessary to mark the places where tone-unit
boundaries occur - that is, where one tone-unit ends and another begins, or where a
tone-unit ends and is followed by a pause, or where a tone-unit begins following a pause. It was
mentioned above that tone-units are sometimes separated by silent pauses and sometimes
not; pause-type boundaries can be marked by double vertical lines (‖) and non-pause
boundaries with a single vertical line (|). In practice it is not usually important to mark
pauses at the beginning and end of a passage, though this is done here for completeness.
The boundaries within a passage are much more important.
‖ and then 'nearer to the vfront ‖ on the /left | theres a 'bit of \forest | 'coming
'down to the vwaterside ‖ and then a 'bit of a /bay ‖
We can mark their structure as follows (using dotted lines to show divisions between
tone-unit components, though this is only done for this particular example):
PH       ¦ H              ¦ TS        PH     ¦ TS       PH
and then ¦ 'nearer to the ¦ vfront    on the ¦ /left    theres a

H       ¦ TS   ¦ T      H                    ¦ TS  ¦ T
'bit of ¦ \for ¦ est    'coming 'down to the ¦ \wa ¦ terside

PH         ¦ H         ¦ TS
and then a ¦ 'bit of a ¦ /bay
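The division of a transcribed passage at its boundary marks can also be sketched in Python. The sketch assumes that pause boundaries are typed as ‖ and non-pause boundaries as |, each separated from the neighbouring words by spaces; the function name `split_tone_units` is invented for the illustration, and the passage is the one transcribed above.

```python
PAUSE = "‖"     # tone-unit boundary accompanied by a silent pause
NO_PAUSE = "|"  # tone-unit boundary with no pause


def split_tone_units(transcription):
    """Return a list of (tone-unit text, closing boundary mark) pairs."""
    units = []
    current = []
    for token in transcription.split():
        if token in (PAUSE, NO_PAUSE):
            if current:  # a boundary that opens the passage closes nothing
                units.append((" ".join(current), token))
                current = []
        else:
            current.append(token)
    if current:  # passage not closed by a boundary mark
        units.append((" ".join(current), None))
    return units


passage = (
    "‖ and then 'nearer to the vfront ‖ on the /left | theres a 'bit of \\forest | "
    "'coming 'down to the vwaterside ‖ and then a 'bit of a /bay ‖"
)
units = split_tone_units(passage)
for text, boundary in units:
    print(f"{text!r} closed by {boundary}")
```

The sketch only finds boundaries that the analyst has already marked; deciding where the boundaries fall in real speech is the difficult part and is not attempted here.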
The above passage contains five tone-units. Notice that in the third tone-unit, since it is
the syllable rather than the word that carries the tone, it is necessary to divide the word
‘forest’ into two parts, ‘for-’ fɒr and ‘-est’ ɪst; in the fourth tone-unit the word ‘waterside’ is
/here    'shall we 'sit /here
It would not be useful (unless you are doing research on the subject) to go into all the
different ways in which English intonation has been represented, but it is worth noting
that simpler approaches have been used in the past. In the earlier part of the last century, a
common approach was to treat all the pitch movement in the tone-unit as a single “tune”;
Tune 1 was typically descending and ending in a fall, while Tune 2 ended up rising (I was
taught French intonation in this way in the 1960s). In more modern work, we can see that
it is possible to represent intonation as a simple sequence of tonic and non-tonic stressed
syllables, and pauses, with no higher-level organisation; an example of this is the
transcription used in the Spoken English Corpus (Williams, 1996). Brown (1990, Chapter 5)
uses a relatively simple analysis of intonation to present valuable examples of authentic
recorded speech. Most contemporary British analyses, however, use a unit similar or
identical to what I call a tone-unit divided into components such as pre-head, head, tonic
syllable and tail. Different writers use different names: “tone-group”, “intonation-group”,
“sense-group”, “intonation unit” and “intonation phrase (IP)” are all more or less
synonymous with “tone-unit”. Good background reading on this is Cruttenden (1997: 26-55).
Written exercises
a) 'Only when the vwind •blows
b) /When did you •say
In Chapter 16 the structure of the tone-unit was introduced and it was explained that
when a tonic syllable is followed by a tail, that tail continues and completes the tone begun
on the tonic syllable. Examples were given to show how this happens in the case of rising
and falling tones. We now go on to consider the rather more difficult cases of fall-rise and
rise-fall tones.
If we add a syllable, the “fall” part of the fall-rise is usually carried by the first (tonic) syllable
and the “rise” part by the second. The result may be a continuous pitch movement very
similar to the one-syllable case, if there are no voiceless medial consonants to cause a
break in the voicing. This would give a pitch movement that we could draw like this:
vsome •men
vsome •chairs
i) I vmight •buy it
I vmight have •thought of •buying it
ii) vmost of them
vmost of it was for them
With the rise-fall tone we find a similar situation: if the tonic syllable is followed by a
single syllable in the tail, the “rise” part of the tone takes place on the first (tonic) syllable
and the “fall” part is on the second. Thus:
Abeautiful    Aall of them •went
i) with high head
we 'asked if it had \come
ii) with low head
we ,asked if it had \come
,I could have ,bought it for ,less than a \pound
When we examine the intonation of polysyllabic heads we find much greater variety than
these simple examples suggest. However, the division into high and low heads as general
types is probably the most basic that can be made, and it would be pointless to set up a more
elaborate system to represent differences if these differences were not recognised by most
native speakers. Some writers on intonation claim that the intonation pattern starting at a
fairly high pitch, with a gradual dropping down of pitch during the utterance, is the most
basic, normal, “unmarked” intonation pattern; this movement is often called declination.
The claim that declination is universally unmarked in English, or even in all languages, is a
strong one. As far as English is concerned, it would be good to see more evidence from the
full range of regional and national varieties in support of the claim.
It should be noted that the two marks ' and , are being used for two different purposes
in this course, as they are in many phonetics books. When stress is being discussed, the
mark ' (blue type) indicates primary stress and , indicates secondary stress. For the purposes
of marking intonation, however, the mark ' (black type) indicates a stressed syllable in a high
head and the mark , indicates a stressed syllable in a low head. In practice this is not usually
found confusing as long as one is aware of whether one is marking stress levels or intonation,
and the colour difference helps to distinguish them. When the high and low marks ' and ,
are being used to indicate intonation, it is no longer possible to mark two different levels of
stress within the word. However, when looking at speech at the level of the tone-unit we are
not usually interested in this; a much more important difference here is the one between
tonic stress (marked by underlining the tonic syllable and placing before it one of the five
tone-marks) and non-tonic stressed syllables (marked ' or , in the head or • in the tail).
It needs to be emphasised that in marking intonation, only stressed syllables are
marked; this implies that intonation is carried entirely by the stressed syllables of a
tone-unit and that the pitch of unstressed syllables is either predictable from that of stressed
syllables or is of so little importance that it is not worth marking. Remember that the
additional information given in the examples above by drawing pitch levels and movements
between lines is only included here to make the examples clearer and is not normally given
with our system of transcription; all the important information about intonation must,
therefore, be given by the marks placed in the text.
i) Ive \seen /him
In this example there seems to be equal prominence on ‘seen’ and ‘him’. It could be claimed
that this is the same thing as:
ii) Ive vseen him
In version (ii), on the other hand, the word ‘seen’ is given the greatest prominence, and
it is likely to sound as though the speaker has some reservation, or has something further
to say:
The same is found with ‘her’, as in:
compared with:
Ive vseen her
aɪv vsiːn ə
This is a difficult problem, since it weakens the general claim made earlier that each
tone-unit contains only one tonic syllable.
Anomalous tone-units
However comprehensive one's descriptive framework may be (and the one given in
this course is very limited), there will inevitably be cases which do not fit within it. For
example, other tones such as fall-rise-fall or rise-fall-rise are occasionally found. In the
head, we sometimes find cases where the stressed syllables are not all high or all low, as in
the following example:
,After ,one of the 'worst 'days of my vlife
It can also happen that a speaker is interrupted and leaves a tone-unit incomplete - for
example, lacking a tonic syllable. To return to the analogy with grammar, in natural speech
one often finds sentences which are grammatically anomalous or incomplete, but this
does not deter the grammarian from describing “normal” sentence structure. Similarly,
although there are inevitably problems and exceptions, we continue to treat the tone-unit
as something that can be described, defined and recognised.
The main concern of this chapter is to complete the description of intonational form,
including analysis of perhaps the most difficult aspect: that of recognising fall-rise and
rise-fall tones when they are extended over a number of syllables. This is necessary since
no complete analysis of intonation can be done without having studied these “extended
tones”.
Cruttenden (1997: Chapters 3 and 4) gives a good introduction to the problems of
analysing tones both within the traditional British framework and in autosegmental terms.
On tone-unit boundaries, there is a clear explanation of the problems in Cruttenden (1997:
Section 3.2), and in more detail in Crystal (1969: 204-7). A study of Scottish English by
Brown et al. (1980) gives ample evidence that tone-units in real life are not as easy to
identify as tone-units in textbooks.
Some writers follow Halliday (1967) in using the terms tone, tonality and tonicity
(the “three Ts”) to refer (respectively) to tone, to the division of speech into tone-units and
to the placement of the tonic syllable; see for example Tench (1996), Wells (2006). In my
experience people find it difficult to remember which is which, so I don’t use these terms.
There has recently been a growth of interest in the comparative study of intonation
in different languages and dialects: see Cruttenden (1997: Chapter 5); Hirst and di Cristo
(1998); Ladd (1996: Chapter 4).
On declination, see Cruttenden (1997: 121-3).
For reading on autosegmental analysis (often given the name ToBI, which stands for
Tones and Break Indices), a good introduction is Cruttenden (1997: 56-67). A fuller and
more critical analysis can be read in Ladd (1996: Chapters 2 and 3); see also Roca and
Johnson (1999: Chapter 14). A short account of the problems found in trying to compare
this approach with the traditional British analysis is given in Roach (1994). ToBI is
essentially a computer-based transcription system, and more information about it is provided
on this book’s website.
Note for teachers
I would like to emphasise how valuable an exercise it is for students and teachers to attempt
to analyse some recorded speech for themselves. For beginners it is best to start on slow,
careful speech - such as that of newsreaders - before attempting conversational speech.
One can learn more about intonation in an hour of this work than in days of reading
textbooks on the subject, and one’s interest in and understanding of theoretical problems
becomes much more profound.
Written exercises
1 The following sentences are given with intonation marks. Sketch the pitch within
the lines below, leaving a gap between each syllable.
2 This exercise is similar, but here you are given polysyllabic words and a tone. You
must draw an appropriate pitch movement between the lines.
a) (rise) opportunity d) (rise-fall) magnificent
The form of intonation has now been described in some detail, and we will move on to
look more closely at its functions. Perhaps the best way to start is to ask ourselves what
would be lost if we were to speak without intonation: you should try to imagine speech
in which every syllable was said on the same level pitch, with no pauses and no changes
in speed or loudness. This is the sort of speech that would be produced by a
“mechanical speech” device (as described at the beginning of Chapter 14) that made sentences by
putting together recordings of isolated words. To put it in the broadest possible terms, we
can see that intonation makes it easier for a listener to understand what a speaker is trying
to convey. The ways in which intonation does this are very complex, and many suggestions
have been made for ways of isolating different functions. Among the most often proposed
are the following:
i) Intonation enables us to express emotions and attitudes as we speak, and this
adds a special kind of “meaning” to spoken language. This is often called the
attitudinal function of intonation.
ii) Intonation helps to produce the effect of prominence on syllables that need
to be perceived as stressed, and in particular the placing of tonic stress on
a particular syllable marks out the word to which it belongs as the most
important in the tone-unit. In this case, intonation works to focus attention
on a particular lexical item or syllable. This has been called the accentual
function of intonation.
iii) The listener is better able to recognise the grammar and syntactic structure
of what is being said by using the information contained in the intonation;
for example, such things as the placement of boundaries between phrases,
clauses or sentences, the difference between questions and statements, and the
use of grammatical subordination may be indicated. This has been called the
grammatical function of intonation.
iv) Looking at the act of speaking in a broader way, we can see that intonation
can signal to the listener what is to be taken as “new” information and what is
already “given”, can suggest when the speaker is indicating some sort of contrast
or link with material in another tone-unit and, in conversation, can convey to
the listener what kind of response is expected. Such functions are examples of
intonation’s discourse function.
The attitudinal function has been given so much importance in past work on
intonation that it will be discussed separately in this chapter, although it should eventually
become clear that it overlaps considerably with the discourse function. In the case of the
other three functions, it will be argued that it is difficult to see how they could be treated
as separate; for example, the placement of tonic stress is closely linked to the
presentation of “new” information, while the question/statement distinction and the indication
of contrast seem to be equally important in grammar and discourse. What seems to be
common to accentual, grammatical and discourse functions is the indication, by means of
intonation, of the relationship between some linguistic element and the context in which
it occurs. The fact that they overlap with each other to a large degree is not so important
if one does not insist on defining watertight boundaries between them.
The rest of this chapter is concerned with a critical examination of the attitudinal
function.
Many writers have expressed the view that intonation is used to convey our
feelings and attitudes: for example, the same sentence can be said in different ways, which
might be labelled “angry”, “happy”, “grateful”, “bored”, and so on. A major factor in this
is the tone used, and most books agree on some basic meanings of tones. Here are some
examples (without punctuation):
1 Fall
Finality, definiteness: That is the end of the \news
Im absolutely \ certain
Stop \ talking
2 Rise
Most of the functions attributed to rises are nearer to grammatical than
attitudinal, as in the first three examples given below; they are included here mainly to
give a fuller picture of intonational function.
General questions: Can you /help me
Is it /over
Listing: / Red / brown /yellow or \ blue
(a fall is usual on the last item)
“More to follow”: I phoned them right a/way (‘and they agreed to come’)
You must write it a/gain (‘and this time, get it right’)
Encouraging: It wont / hurt
3 Fall-rise
Uncertainty, doubt: You vmay be right
Its vpossible
Requesting: Can I vbuy it
Will you vlend it to me
4 Rise-fall
Surprise, being impressed: You were Afirst
AAll of them
It has also been widely observed that the form of intonation is different in different
languages; for example, the intonation of languages such as Swedish, Italian or Hindi
is instantly recognisable as being different from that of English. Not surprisingly, it has
often been said that foreign learners of English need to learn English intonation. Some
writers have gone further than this and claimed that, unless the foreign learner learns the
appropriate way to use intonation in a given situation, there is a risk that he or she may
unintentionally give offence; for example, the learner might use an intonation suitable for
expressing boredom or discontent when what is needed is an expression of gratitude or
affection. This misleading view of intonation must have caused unnecessary anxiety to
many learners of the language.
Let us begin by considering how one might analyse the attitudinal function of
intonation. One possibility would be for the analyst to invent a large number of sentences and
to try saying them with different intonation patterns (i.e. different combinations of head
and tone), noting what attitude was supposed to correspond to the intonation in each
case; of course, the results are then very subjective, and based on an artificial
performance that has little resemblance to conversational speech. Alternatively, the analyst could
say these different sentences to a group of listeners and ask them all to write down what
attitudes they thought were being expressed; however, we have a vast range of adjectives
available for labelling attitudes and the members of the group would probably produce
a very large number of such adjectives, leaving the analyst with the problem of deciding
whether pairs such as “pompous” and “stuck-up”, or “obsequious” and “sycophantic” were
synonyms or represented different attitudes. To overcome this difficulty, one could ask the
members of the group to choose among a small number of adjectives (or “labels”) given
by the analyst; the results would then inevitably be easier to quantify (i.e. the job of
counting the different responses would be simpler) but the results would no longer represent
the listeners’ free choices of label. An alternative procedure would be to ask a lot of
speakers to say a list of sentences in different ways according to labels provided by the analyst,
and see what intonational features are found in common - for example, one might count
how many speakers used a low head in saying something in a “hostile” way. The results of
such experiments are usually very variable and difficult to interpret, not least because the
range of acting talent in a randomly selected group is considerable.
A much more useful and realistic approach is to study recordings of different
speakers’ natural, spontaneous speech and try to make generalisations about attitudes and
intonation on this basis. Many problems remain, however. In the method described
previously, the analyst tries to select sentences (or passages of some other size) whose meaning
is fairly “neutral” from the emotional point of view, and will tend to avoid material
such as ‘Why don’t you leave me alone?’ or ‘How can I ever thank you enough?’ because
the lexical meaning of the words used already makes the speaker’s attitude pretty clear,
whereas sentences such as ‘She’s going to buy it tomorrow’ or ‘The paper has fallen under
the table’ are less likely to prejudice the listener. The choice of material is much less free
for someone studying natural speech. Nevertheless, if we are ever to make new
discoveries about intonation, it will be as a result of studying what people really say rather than
inventing examples of what they might say.
The notion of “expressing an emotion or attitude” is itself a more complex one than
is generally realised. First, an emotion may be expressed involuntarily or voluntarily; if I
say something in a “happy” way, this may be because I feel happy, or because I want to
convey to you the impression that I am happy. Second, an attitude that is expressed could
be an attitude towards the listener (e.g. if I say something in a “friendly” way), towards
what is being said (e.g. if I say something in a “sceptical” or “dubious” way) or towards
some external event or situation (e.g. “regretful” or “disapproving”).
However, one point is much more important and fundamental than all the problems
discussed above. To understand this point you should imagine (or even actually perform)
your pronunciation of a sentence in a number of different ways: for example, if the sentence
was ‘I want to buy a new car’ and you were to say it in the following ways: “pleading”,
“angry”, “sad”, “happy”, “proud”, it is certain that at least some of your performances
will be different from some others, but it is also certain that the technique for analysing
and transcribing intonation introduced earlier in the course will be found inadequate to
represent the different things you do. You will have used variations in loudness and speed,
for example; almost certainly you will have used different voice qualities for different
attitudes. You may have used your pitch range (see Section 15.3) in different ways: your pitch
movements may have taken place within quite a narrow range (narrow pitch range) or
using the full range between high and low (wide pitch range); if you did not use wide
pitch range, you may have used different keys: high key (using the upper part of your
pitch range), mid key (using the middle part of the range) or low key (the lower part). It is
very likely that you will have used different facial expressions, and even gestures and body
movements. These factors are all of great importance in conveying attitudes and emotions,
yet the traditional handbooks on English pronunciation have almost completely ignored
them.
If we accept the importance of these factors it becomes necessary to consider how
they are related to intonation, and what intonation itself consists of. We can isolate three
distinct types of suprasegmental variable: sequential, prosodic and paralinguistic.
Sequential
These components of intonation are found as elements in sequences of other such
elements occurring one after another (never simultaneously). These are:
i) pre-heads, heads, tonic syllables and tails (with their pitch possibilities);
ii) pauses;
iii) tone-unit boundaries.
These have all been introduced in previous chapters.
Prosodic
These components are characteristics of speech which are constantly present and
observable while speech is going on. The most important are:
i) width of pitch range;
ii) key;
iii) loudness;
iv) speed;
v) voice quality.
It is not possible to speak without one’s speech having some degree or type of pitch range,
loudness, speed and voice quality (with the possible exception that pitch factors are largely
lost in whispered speech). Different speakers have their own typical pitch range, loudness,
voice quality, etc., and contrasts among prosodic components should be seen as relative to
these “background” speaker characteristics.
Each of these prosodic components needs a proper framework for categorisation,
and this is an interesting area of current research. One example of the prosodic
component “width of pitch range” has already been mentioned in Section 15.3, when “extra
pitch height” was introduced, and the “rhythmicality” discussed in Section 14.1 could be
regarded as another prosodic component. Prosodic components should be regarded as
part of intonation along with sequential components.
Paralinguistic
Mention was made above of facial expressions, gestures and body movements.
People who study human behaviour often use the term body language for such activity.
One could also mention certain vocal effects such as laughs and sobs. These paralinguistic
effects are obviously relevant to the act of speaking but could not themselves properly be
regarded as components of speech. Again, they need a proper descriptive and classificatory
system, but this is not something that comes within the scope of this course, nor in my
opinion should they be regarded as components of intonation.
What advice, then, can be given to the foreign learner of English who wants to learn
“correct intonation”? It is certainly true that a few generalisations can be made about the
attitudinal functions of some components of intonation. We have looked at some basic
examples earlier in this chapter. Generalisations such as these are, however, very broad,
and foreign learners do not find it easy to learn to use intonation through studying them.
Similarly, within the area of prosodic components most generalisations tend to be rather
obvious: wider pitch range tends to be used in excited or enthusiastic speaking, slower
speed is typical of the speech of someone who is tired or bored, and so on. Most of the
generalisations one could make are probably true for a lot of other languages as well.
In short, of the rules and generalisations that could be made about conveying attitudes
through intonation, those which are not actually wrong are likely to be too trivial to be
worth learning. I have witnessed many occasions when foreigners have unintentionally
caused misunderstanding or even offence in speaking to a native English speaker, but can
remember only a few occasions when this could be attributed to “using the wrong
intonation”; most such cases have involved native speakers of different varieties of English,
rather than learners of English. Sometimes an intonation mistake can cause a difference
in apparent grammatical meaning (something that is dealt with in Chapter 19). It should
not be concluded that intonation is not important for conveying attitudes. What is being
claimed here is that, although it is of great importance, the complexity of the total set of
sequential and prosodic components of intonation and of paralinguistic features makes it
a very difficult thing to teach or learn. One might compare the difficulty with that of trying
to write rules for how one might indicate to someone that one finds him or her sexually
attractive; while psychologists and biologists might make detailed observations and
generalisations about how human beings of a particular culture behave in such a situation, most
people would rightly feel that studying these generalisations would be no substitute for
practical experience, and that relying on a textbook could lead to hilarious consequences.
The attitudinal use of intonation is something that is best acquired through talking with
and listening to English speakers, and this course aims simply to train learners to be more
aware of and sensitive to the way English speakers use intonation.
Perhaps the most controversial question concerning English intonation is what its
function is; pedagogically speaking, this is a very important question, since one would not wish
to devote time to teaching something without knowing what its value is likely to be. At the
beginning of this chapter I list four commonly cited functions. It is possible to construct a
longer list: Wells (2006) suggests six, while Lee (1958) proposed ten.
For general introductory reading on the functions of intonation, there is a good
survey in Cruttenden (1997: Chapter 4). Critical views are expressed in Brazil et al. (1980:
98-103) and Crystal (1969: 282-308). There are many useful examples in Brazil (1994). Few
people have carried out experiments on listeners’ perception of attitudes through
intonation, probably because it is extremely difficult to design properly controlled experiments.
Once one has recognised the importance of features other than pitch, it is
necessary to devise a framework for categorising these features. There are many different views
about the meaning of the term “paralinguistic”. In the framework presented in Crystal
and Quirk (1964), paralinguistic features of the “vocal effect” type are treated as part of
intonation, and it is not made sufficiently clear how these are to be distinguished from
prosodic features. Crystal (1969) defines paralinguistic features as “vocal effects which
are primarily the result of physiological mechanisms other than the vocal cords, such
as the direct results of the workings of the pharyngeal, oral or nasal cavities” but this
does not seem to me to fit the facts. In my view, “paralinguistic” implies “outside the
system of contrasts used in spoken language” - which does not, of course, necessarily mean
Note for teachers
Written exercise
In the following bits of conversation, you are supplied with an “opening line” and a
response that you must imagine saying. You are given an indication in brackets of the
feeling or attitude expressed, and you must mark on the text the intonation you think is
appropriate (mark only the response). As usual in intonation work in this book,
punctuation is left out, since it can cause confusion.
It 'looks 'nice for a \swim                its rather cold (doubtful)
'Why not 'get a /car                       because I cant afford it (impatient)
Ive ˌlost my \ticket                       youre silly then (stating the obvious)
You 'cant 'have an 'ice \cream             oh please (pleading)
'What 'times are the /buses                seven oclock seven thirty and eight (listing)
She got 'four \A-levels                    four (impressed)
'How much \work have you ·got to ·do       Ive got to do the shopping (and more things after that ...)
'Will the ∨children ·go                    some of them might (uncertain)
19 Functions of intonation 2
In the previous chapter we looked at the attitudinal function of intonation. We now turn
to the accentual, grammatical and discourse functions.
The term accentual is derived from “accent”, a word used by some writers to refer
to what in this course is called “stress”. When writers say that intonation has accentual
function they imply that the placement of stress is something that is determined by
intonation. It is possible to argue against this view: in Chapters 10 and 11 word stress
is presented as something quite independent of intonation, and subsequently (p. 140) it
was said that “intonation is carried entirely by the stressed syllables of a tone-unit”. This
means that the presentation so far has implied that the placing of stress is independent
of and prior to the choice of intonation. However, one particular aspect of stress could be
regarded as part of intonation: this is the placement of the tonic stress within the tone-
unit. It would be reasonable to suggest that while word stress is independent of intonation,
the placement of tonic stress is a function (the accentual function) of intonation. Some
older pronunciation handbooks refer to this function as “sentence stress”, which is not an
appropriate name: the sentence is a unit of grammar, while the location of tonic stress is a
matter which concerns the tone unit, a unit of phonology.
The location of the tonic syllable is of considerable linguistic importance. The most
common position for this is on the last lexical word (e.g. noun, adjective, verb, adverb as
distinct from the function words introduced in Chapter 12) of the tone-unit. For
contrastive purposes, however, any word may become the bearer of the tonic syllable. It is
frequently said that the placement of the tonic syllable indicates the focus of the information.
In the following pairs of examples, (i) represents normal placement and (ii) contrastive:
i) I ˌwant to ˌknow ˌwhere hes \travelling to
(The word ‘to’ at the end of the sentence, being a preposition and not a lexical
word, is not stressed.)
ii) (I 'dont want to 'know 'where hes 'travelling ∨from)
I ˌwant to ˌknow ˌwhere hes ˌtravelling \to
i) She was 'wearing a 'red \dress
ii) (She 'wasnt 'wearing a ∨green ·dress) | She was ˌwearing a \red ·dress
Similarly, for the purpose of emphasis we may place the tonic stress in other positions; in
these examples, (i) is non-emphatic and (ii) is emphatic:
i) It was 'very \boring
ii) It was \very ·boring
i) You 'mustnt 'talk so \loudly
ii) You \mustnt ·talk so ·loudly
However, it would be wrong to say that the only cases of departure from putting tonic
stress on the last lexical word were cases of contrast or emphasis. There are quite a few
situations where it is normal for the tonic syllable to come earlier in the tone-unit. A well-
known example is the sentence ‘I have plans to leave’; this is ambiguous:
i) I have 'plans to \leave
(i.e. I am planning to leave)
ii) I have \plans to ·leave
(i.e. I have some plans/diagrams/drawings that I have to leave)
Version (ii) could not be described as contrastive or emphatic. There are many examples
similar to (ii); perhaps the best rule to give is that the tonic syllable will tend to occur on
the last lexical word in the tone-unit, but may be placed earlier in the tone-unit if there is
a word there with greater importance to what is being said. This can quite often happen
as a result of the last part of the tone-unit being already “given” (i.e. something which has
already been mentioned or is completely predictable); for example:
i) 'Heres that \book you ·asked me to ·bring
(The fact that you asked me to bring it is not new)
ii) Ive 'got to 'take the \dog for a ·walk
(‘For a walk’ is by far the most probable thing to follow ‘I’ve got to take the dog’;
if the sentence ended with ‘to the vet’ the tonic syllable would probably be ‘vet’)
Placement of tonic stress is, therefore, important and is closely linked to intonation. A
question that remains, however, is whether one can and should treat this matter as separate
from the other functions described below.
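The tendency described above (tonic stress on the last lexical word, unless that word is already “given”) can be sketched as a small procedure. This is only an illustrative simplification: the function-word list and the function name are invented for the example, the “given” words have to be supplied by hand, and contrastive or emphatic placement is not handled at all.

```python
# Hypothetical, deliberately tiny list of function words (unpunctuated forms,
# as in the transcriptions). A real analysis would need a much fuller list.
FUNCTION_WORDS = {
    "a", "an", "the", "to", "for", "from", "of", "at", "in", "on",
    "i", "you", "he", "she", "it", "we", "they", "me", "him", "her",
    "is", "was", "are", "were", "have", "has", "had", "that", "and",
    "ive", "hes", "shes",
}


def default_tonic_word(words, given=()):
    """Return the index of the last lexical word that is not 'given'.

    words: the words of one tone-unit, in order.
    given: words treated as already mentioned or completely predictable.
    Returns None if every word is a function word or given.
    """
    skip = FUNCTION_WORDS | {w.lower() for w in given}
    # Scan from the end of the tone-unit towards the beginning.
    for index in range(len(words) - 1, -1, -1):
        if words[index].lower() not in skip:
            return index
    return None


unit = "Ive got to take the dog for a walk".split()
tonic = default_tonic_word(unit, given=["walk"])
```

With ‘walk’ marked as given, the sketch selects ‘dog’; with nothing marked as given it would select ‘walk’. Deciding what counts as given is exactly the part that depends on context and cannot be read off the words themselves.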
The word “grammatical” tends to be used in a very loose sense in this context. It is
usual to illustrate the grammatical function by inventing sentences which when written are
ambiguous, and whose ambiguity can only be removed by using differences of intonation.
A typical example is the sentence ‘Those who sold quickly made a profit’. This can be said
in at least two different ways:
i) 'Those who 'sold ∨quickly | ˌmade a \profit
ii) 'Those who ∨sold | ˌquickly ˌmade a \profit
The difference caused by the placement of the tone-unit boundary is seen to be equivalent
to giving two different paraphrases of the sentences, as in:
i) A profit was made by those who sold quickly.
ii) A profit was quickly made by those who sold.
Let us look further at the role of tone-unit boundaries, and the link between the tone-
unit and units of grammar. There is a strong tendency for tone-unit boundaries to occur
at boundaries between grammatical units of higher order than words; it is extremely
common to find a tone-unit boundary at a sentence boundary, as in:
I 'wont have any /tea | I 'dont \like it
In sentences with a more complex structure, tone-unit boundaries are often found at
phrase and clause boundaries as well, as in:
In ∨France | where ˌfarms ˌtend to be ∨smaller | the 'subsidies are 'more im\portant
It is very unusual to find a tone-unit boundary at a place where the only grammatical
boundary is a boundary between words. It would, for example, sound distinctly odd to have a
tone-unit boundary between an article and a following noun, or between auxiliary and
main verbs if they are adjacent (although we may, on occasions, hesitate or pause in such
places within a tone-unit; it is interesting to note that some people who do a lot of arguing
and debating, notably politicians and philosophers, develop the skill of pausing for breath
in such intonationally unlikely places because they are less likely to be interrupted than if
they pause at the end of a sentence). Tone-unit boundary placement can, then, indicate
grammatical structure to the listener and we can find minimal pairs such as the following:
i) The Con'servatives who ∨like the pro·posal | are \pleased
ii) The Conservatives | who ∨like the pro·posal | are \pleased
The intonation makes clear the difference between (i) “restrictive” and (ii) “non-restrictive”
relative clauses: (i) implies that only some Conservatives like the proposal, while (ii) implies
that all the Conservatives like it.
Another component of intonation that can be said to have grammatical significance
is the choice of tone on the tonic syllable. One example that is very familiar is the use of
a rising tone with questions. Many languages have the possibility of changing a statement
into a question simply by changing the tone from falling to rising. This is, in fact, not
used very much by itself in the variety of English being described here, where questions
are usually grammatically marked. The sentence ‘The price is going up’ can be said as a
statement like this:
The \price is going ·up
(the tonic stress could equally well be on ‘up’). It would be quite acceptable in some dialects
of English (e.g. many varieties of American English) to ask a question like this:
If we think of linguistic analysis as usually being linked to the sentence as the maximum
unit of grammar, then the study of discourse attempts to look at the larger contexts in
which sentences occur. For example, consider the four sentences in the following:
A: Have you got any free time this morning?
B: I might have later on if that meeting’s off.
A: They were talking about putting it later.
B: You can’t be sure.
Each sentence could be studied in isolation and be analysed in terms of grammatical
construction, lexical content, and so on. But it is clear that the sentences form part of
some larger act of conversational interaction between two speakers; the sentences contain
several references that presuppose shared knowledge (e.g. ‘that meeting’ implies that both
speakers know which meeting is being spoken about), and in some cases the meaning of a
sentence can only be correctly interpreted in the light of knowledge of what has preceded
it in the conversation (e.g. ‘You can’t be sure’).
If we consider how intonation may be studied in relation to discourse, we can identify
two main areas: one of them is the use of intonation to focus the listener’s attention on
aspects of the message that are most important, and the other is concerned with the
regulation of conversational behaviour. We will look at these in turn.
In the case of “attention focusing”, the most obvious use has already been described:
this is the placing of tonic stress on the appropriate syllable of one particular word in the
tone-unit. In many cases it is easy to demonstrate that the tonic stress is placed on the
word that is in some sense the “most important”, as in:
She 'went to \Scotland
Sometimes it seems more appropriate to describe tonic stress placement in terms of
“information content”: the more predictable a word’s occurrence is in a given context, the
lower is its information content. Tonic stress will tend to be placed on words with high
information content, as suggested above when the term focus was introduced. This is the
explanation that would be used in the case of the sentences suggested in Section 19.1:
i) Ive 'got to 'take the \dog for a ·walk
ii) Ive 'got to 'take the 'dog to the \vet
The word ‘vet’ is less predictable (has a higher information content) than ‘walk’. However,
we still find many cases where it is difficult to explain tonic placement in terms of
“importance” or “information”. For example, in messages like:
Your coat’s on fire          The wing’s breaking up
The radio’s gone wrong       Your uncle’s died
probably the majority of English speakers would place the tonic stress on the subject noun,
although it is difficult to see how this is more important than the last lexical word in each
of the sentences. The placement of tonic stress is still to some extent an unsolved mystery;
it is clear, however, that it is at least partly determined by the larger context (linguistic and
non-linguistic) in which the tone-unit occurs.
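One common way of making “information content” precise, which is borrowed from information theory rather than taken from this course, is to measure it as the negative base-2 logarithm of a word's probability in its context. The sketch below applies that measure to the ‘walk’ and ‘vet’ examples. The probabilities are invented placeholders chosen only to show the direction of the effect; they are not measured values.

```python
import math


def information_content(probability):
    """Information content in bits: the less probable, the higher the value."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be greater than 0 and at most 1")
    return -math.log2(probability)


# Invented probabilities for what follows 'Ive got to take the dog'.
# These numbers are assumptions for illustration only.
continuations = {"for a walk": 0.5, "to the vet": 0.125}

bits = {phrase: information_content(p) for phrase, p in continuations.items()}
```

Under these assumed figures the less predictable continuation carries more bits, which matches the tendency for it to attract the tonic stress. As the examples such as ‘Your coat’s on fire’ show, a measure of this kind cannot account for every case.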
We can see at least two other ways in which intonation can assist in focusing
attention. The tone chosen can indicate whether the tone-unit in which it occurs is being used
to present new information or to refer to information which is felt to be already possessed
by speaker and hearer. For example, in the following sentence:
'Since the ∨last time we ·met | 'when we 'had that 'huge ∨dinner |
Ive ˌbeen on a \diet
the first two tone-units present information which is relevant to what the speaker is
saying, but which is not something new and unknown to the listener. The final tone-unit,
however, does present new information. Writers on discourse intonation have proposed
that the falling tone indicates new information while rising (including falling-rising) tones
indicate “shared” or “given” information.
Another use of intonation connected with the focusing of attention is intonational
subordination; we can signal that a particular tone-unit is of comparatively low
importance and as a result give correspondingly greater importance to adjacent tone-units. For
example:
i) As I exˌpect youve \heard | theyre 'only ad'mitting e\mergency ·cases
ii) The 'Japa∨nese | for ˌsome ˌreason or /other | 'drive on the \left | like \us
In a typical conversational pronunciation of these sentences, the first tone-unit of (i) and
the second and fourth tone-units of (ii) might be treated as intonationally subordinate;
the prosodic characteristics marking this are usually:
i) a drop to a lower part of the pitch range (“low key”);
ii) increased speed;
iii) narrower range of pitch; and
iv) reduced loudness, relative to the non-subordinate tone-unit(s).
The use of these components has the result that the subordinate tone-units are less easy
to hear. Native speakers can usually still understand what is said, if necessary by guessing
at inaudible or unrecognisable words on the basis of their knowledge of what the speaker
is talking about. Foreign learners of English, on the other hand, having in general less
“common ground” or shared knowledge with the speaker, often find that these
subordinate tone-units - with their “throwaway”, parenthetic style - cause serious difficulties in
understanding.
We now turn to the second main area of intonational discourse function: the
regulation of conversational behaviour. We have already seen how the study of sequences
of tone-units in the speech of one speaker can reveal information carried by intonation
which would not have been recognised if intonation were analysed only at the level of
individual tone-units. Intonation is also important in the conversational interaction of
two or more speakers. Most of the research on this has been on conversational
interaction of a rather restricted kind - such as between doctor and patient, teacher and student,
or between the various speakers in court cases. In such material it is comparatively easy
to identify what each speaker is actually doing in speaking - for example, questioning,
challenging, advising, encouraging, disapproving, etc. It is likely that other forms of
conversation can be analysed in the same way, although this is considerably more
difficult. In a more general way, it can be seen that speakers use various prosodic components
to indicate to others that they have finished speaking, that another person is expected
to speak, that a particular type of response is required, and so on. A familiar example is
that quoted above (p. 156), where the difference between falling and rising intonation on
question-tags is supposed to indicate to the listener what sort of response is expected.
It seems that key (the part of the pitch range used) is important in signalling
information about conversational interaction. We can observe many examples in non-linguistic
behaviour of the use of signals to regulate turn-taking: in many sports, for example, it is
necessary to do this - footballers can indicate that they are looking for someone to pass
the ball to, or that they are ready to receive the ball, and doubles partners in tennis can
indicate to each other who is to play a shot. Intonation, in conjunction with “body
language” such as eye contact, facial expression, gestures and head-turning, is used for similar
purposes in speech, as well as for establishing or confirming the status of the participants
in a conversation.
19.4 Conclusions
Important work was done on the placement of tonic stress by Halliday (1967); his term
for this is “tonicity”, and he adopts the widely-used linguistic term “marked” for tonicity
that deviates from what I have called (for the sake of simplicity) “normal”. Within
generative phonology there has been much debate about whether one can determine the placing
of tonic (“primary”) stress without referring to the non-linguistic context in which the
speaker says something. This debate was very active in the 1970s; it is well summarised and
criticised in Schmerling (1976), but see Bolinger (1972). For more recent accounts, see
Couper-Kuhlen (1986: Chapters 7 and 8) and Ladd (1996: 221-35).
One of the most interesting developments of recent years has been the emergence
of a theory of discourse intonation. Readers unfamiliar with the study of discourse may
find some initial difficulty in understanding the principles involved; the best introduction
is Brazil et al. (1980), while the ideas set out there are given more practical expression
in Brazil (1994). I have not been able to do more than suggest the rough outline of this
approach.
The treatment of intonational subordination is based not on the work of Brazil but
on Crystal and Quirk (1964: 52-6) and Crystal (1969: 235-52). The basic philosophy is
the same, however, in that both views illustrate the fact that there is in intonation some
organisation at a level higher than the isolated tone-unit; see Fox (1973). A parallel might
be drawn with the relationship between the sentence and the paragraph in writing. It
seems likely that a considerable amount of valuable new research on pronunciation will
grow out of the study of discourse.
Note for teachers
Audio Unit 19 (CD 2) is short and intensive. It is meant primarily to give a reminder that
English spoken at something like full conversational speed is very different from the slow,
careful pronunciation of the early Audio Units.
Written exercises
1 In the following exercise, read the “opening line” and then decide on the most
suitable place for tonic stress placement (underline the syllable) in the response.
a) Id 'like you to \help me
   (right) can I do the shopping for you
b) I 'hear youre 'offering to 'do the \shopping for someone
   (right) can I do the shopping for you
c) 'What was the 'first 'thing that \happened
   first the professor explained her theory
d) 'Was the 'theory ex'plained by the /students
   (no) first the professor explained her theory
e) 'Tell me 'how the \theory was pre·sented
   first she explained her theory
f) I 'think it 'starts at 'ten to \three
   (no) ten past three
g) I 'think it 'starts at 'quarter past \three
   (no) ten past three
h) I 'think it 'starts at 'ten past \four
   (no) ten past three
2 The following sentences are given without punctuation. Underline the
appropriate tonic syllable places and mark tone-unit boundaries where you think they are
most appropriate.
a) (he wrote the letter in a sad way) he wrote the letter sadly
b) (it’s regrettable that he wrote the letter) he wrote the letter sadly
c) four plus six divided by two equals five
d) four plus six divided by two equals seven
e) we broke one thing after another fell down
f) we broke one thing after another that night
20 Varieties of English pronunciation
In Chapter 1 there was some discussion of different types of English pronunciation and
the reasons for choosing the accent that is described in this book. The present chapter
returns to this topic to look in more detail at differences in pronunciation.
Differences between accents are of two main sorts: phonetic and phonological. When
two accents differ from each other only phonetically, we find the same set of phonemes in
both accents, but some or all of the phonemes are realised differently. There may also be
differences in stress or intonation, but not such as would cause a change in meaning. As an
example of phonetic differences at the segmental level, it is said that Australian English has
the same set of phonemes and phonemic contrasts as BBC pronunciation, yet Australian
pronunciation is so different from that accent that it is easily recognised.
Many accents of English also differ noticeably in intonation without the difference
being such as would cause a difference in meaning; some Welsh accents, for example, have
a tendency for unstressed syllables to be higher in pitch than stressed syllables. Such a
difference is, again, a phonetic one. An example of a phonetic (non-phonological)
difference in stress would be the stressing of the final syllable of verbs ending in ‘-ise’ in some
Scottish and Northern Irish accents (e.g. ‘realise’ rɪə'laɪz).
Phonological differences are of various types: again, we can divide these into
segmental and suprasegmental. Within the area of segmental phonology the most obvious
type of difference is where one accent has a different number of phonemes (and hence
of phonemic contrasts) from another. Many speakers with northern English accents, for
example, do not have a contrast between ʌ and ʊ, so that ‘luck’ and ‘look’ are pronounced
identically (both as lʊk); in the case of consonants, many accents do not have the phoneme
h, so that there is no difference in pronunciation between ‘art’ and ‘heart’. The phonemic
system of such accents is therefore different from that of the BBC accent. On the other
hand, some accents differ from others in having more phonemes and phonemic contrasts.
For example, many northern English accents have a long eː sound as the realisation of the
phoneme symbolised eɪ in BBC pronunciation (which is a simple phonetic difference);
but in some northern accents there is both an eɪ diphthong phoneme and also a contrasting
long vowel phoneme that can be symbolised as eː. Words like ‘eight’, ‘reign’ are pronounced
eɪt, reɪn, while ‘late’, ‘rain’ (with no ‘g’ in the spelling) are pronounced leːt, reːn.
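One way to make the idea of "more" or "fewer" phonemic contrasts concrete is to treat each accent's vowel system as a set and compare the sets. The sketch below is only an illustration: the inventories are small fragments chosen for the example (not complete vowel systems), and the names `bbc`, `northern_a`, `northern_b`, `missing_contrasts` and `extra_contrasts` are invented here.

```python
# Illustrative fragments of vowel inventories (NOT complete systems).
bbc = {"ʌ", "ʊ", "eɪ", "iː"}

# Hypothetical northern accent A: no contrast between ʌ and ʊ.
northern_a = {"ʊ", "eɪ", "iː"}

# Hypothetical northern accent B: no ʌ, but both eɪ and a contrasting eː.
northern_b = {"ʊ", "eɪ", "eː", "iː"}


def missing_contrasts(reference, accent):
    """Phonemes of the reference accent that the other accent lacks."""
    return reference - accent


def extra_contrasts(reference, accent):
    """Phonemes of the other accent that the reference accent lacks."""
    return accent - reference


print(missing_contrasts(bbc, northern_a))
print(extra_contrasts(bbc, northern_b))
```

A purely phonetic difference, such as eː used as the realisation of eɪ, would not show up in a comparison of this kind, because the set of phonemes is unchanged.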
A more complicated kind of difference is where, without affecting the overall set
of phonemes and contrasts, a phoneme has a distribution in one accent that is different
from the distribution of the same phoneme in another accent. The clearest example is
r, which is restricted to occurring in pre-vocalic position in BBC pronunciation, but in
many other accents is not restricted in this way. Another example is the occurrence of
j between a consonant and uː, ʊ or ʊə. In BBC pronunciation we can find the following:
‘pew’ pjuː, ‘tune’ tjuːn, ‘queue’ kjuː. However, in most American accents and in some
English accents of the south and east we find that, while ‘pew’ is pronounced pjuː and
‘queue’ as kjuː, ‘tune’ is pronounced tuːn; this absence of j is found after the other alveolar
consonants, hence ‘due’ duː, ‘new’ nuː. In Norwich, and other parts of East Anglia, we find
many speakers who have no consonant + j clusters at the beginning of a syllable, so that
‘music’ is pronounced muːzɪk and ‘beautiful’ as buːtɪfl.
We also find another kind of variation: in the example just given above, the
occurrence of the phonemes being discussed is determined by their phonological context;
however, sometimes the determining factor is lexical rather than phonological. For
example, in many accents of the Midlands and north-western England a particular set
of words containing a vowel represented by ‘o’ in the spelling is pronounced with ʌ in
BBC but with ɒ in these other accents; the list of words includes ‘one’, ‘none’, ‘nothing’,
‘tongue’, ‘mongrel’, ‘constable’, but does not include some other words of similar form such
as ‘some’ sʌm and ‘ton’ tʌn. One result of this difference is that such accents have different
pronunciations for the two members of pairs of words that are pronounced identically
(i.e. are homophones) in BBC - for example, ‘won’ and ‘one’, ‘nun’ and ‘none’. In my own
pronunciation when I was young, I had ɒ instead of ʌ in these words, so that ‘won’ was
pronounced wʌn and ‘one’ as wɒn, ‘nun’ as nʌn and ‘none’ as nɒn; this has not completely
disappeared from my accent.
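The effect of this lexical difference on homophones can be checked mechanically. The following is a minimal sketch: the four transcriptions per accent are the ones discussed in the paragraph above, while the dictionary layout, the label `midlands` and the function name `homophone_sets` are assumptions made for the illustration.

```python
from collections import defaultdict

# Transcriptions for the four words discussed above. "midlands" is just a
# label chosen here for the Midlands / north-western accents described.
pronunciations = {
    "bbc": {"won": "wʌn", "one": "wʌn", "nun": "nʌn", "none": "nʌn"},
    "midlands": {"won": "wʌn", "one": "wɒn", "nun": "nʌn", "none": "nɒn"},
}


def homophone_sets(lexicon):
    """Group words by transcription and keep groups with more than one word."""
    groups = defaultdict(list)
    for word, transcription in lexicon.items():
        groups[transcription].append(word)
    return sorted(sorted(words) for words in groups.values() if len(words) > 1)


for accent, lexicon in pronunciations.items():
    print(accent, homophone_sets(lexicon))
```

For the BBC entries the function finds two homophone pairs; for the other accent it finds none, which is the point made in the text.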
It would be satisfying to be able to list examples of phonological differences
between accents in the area of stress and intonation but, unfortunately, straightforward
examples are not available. We do not yet know enough about the phonological
functions of stress and intonation, and not enough work has been done on comparing
accents in terms of these factors. It will be necessary to show how one accent is able
to make some difference in meaning with stress or intonation that another accent is
unable to make. Since some younger speakers seem not to distinguish between the
noun ‘protest’ and the verb ‘protest’, pronouncing both as ˈprəʊtest, we could say that
in their speech a phonological distinction in stress has been lost, but this is a very
limited example. It is probable that such differences will in the future be identified by
suitable research work.
For a long time, the study of variation in accents was part of the subject of
dialectology, which aimed to identify all the ways in which a language differed from
place to place. Dialectology in its traditional form is therefore principally interested in
American
In many parts of the world, the fundamental choice for learners of English is whether
to learn an American or a British pronunciation, though this is by no means true everywhere.
Since we have given very little attention to American pronunciation in this course,
it will be useful at this stage to look at the most important differences between American
accents and the BBC accent. It is said that the majority of American speakers of English
have an accent that is often referred to as General American (GA); since it is the American
accent most often heard on international radio and television networks, it is also called
Network English. Most Canadian speakers of English have a very similar accent (few
British people can hear the difference between the Canadian and American accents, as
is the case with the difference between Australian and New Zealand accents). Accents in
America different from GA are mainly found in New England and in the “deep south” of
the country, but isolated rural communities everywhere tend to preserve different accents;
there is also a growing section of American society whose native language is Spanish
(or who are children of Spanish speakers) and they speak English with a pronunciation
influenced by Spanish.
The most important difference between GA and BBC is the distribution of the r
phoneme, GA being rhotic (i.e. r occurs in all positions, including before consonants
and at the end of utterances). Thus where BBC pronounces ‘car’ as kɑː and ‘cart’ as kɑːt,
GA has kɑːr and kɑːrt. Long vowels and diphthongs that are written with an ‘r’ in the
spelling are pronounced in GA as simple vowels followed by r. We can make the following
comparisons:
          BBC    GA
‘car’     kɑː    kɑːr
‘more’    mɔː    mɔːr
‘fear’    fɪə    fɪr
‘care’    keə    ker
‘tour’    tʊə    tʊr
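The correspondences in this comparison can be written as a small lookup table. This is a rough sketch only: it covers just the five vowels listed, it relies on being told whether the spelling contains an ‘r’ after the vowel, and the names `BBC_TO_GA_BEFORE_R` and `ga_from_bbc` are invented for the example.

```python
# BBC vowel -> GA "simple vowel + r", for the five cases compared above.
BBC_TO_GA_BEFORE_R = {
    "ɑː": "ɑːr",
    "ɔː": "ɔːr",
    "ɪə": "ɪr",
    "eə": "er",
    "ʊə": "ʊr",
}


def ga_from_bbc(bbc, spelt_with_r):
    """Apply the correspondence to a BBC transcription of a one-vowel word.

    spelt_with_r says whether the vowel is followed by 'r' in the spelling.
    """
    if not spelt_with_r:
        # No 'r' in the spelling: the table has nothing to say about the
        # word, so the input is returned unchanged.
        return bbc
    for bbc_vowel, ga_vowel in BBC_TO_GA_BEFORE_R.items():
        if bbc_vowel in bbc:
            return bbc.replace(bbc_vowel, ga_vowel, 1)
    return bbc


for word, bbc in [("car", "kɑː"), ("cart", "kɑːt"), ("fear", "fɪə")]:
    print(word, bbc, ga_from_bbc(bbc, spelt_with_r=True))
```

The `spelt_with_r` flag matters because the rule depends on the spelling, not on the BBC vowel alone.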
American vowels followed by r are strongly “r-coloured”, to the extent that one often hears
the vowel at the centre of a syllable as a long r with no preceding vowel. The GA vowel in
‘fur’, for example, could be transcribed as ɜːr (with a transcription that matched those for
the other long vowels in the list above), but it is more often transcribed ɝ with a diacritic
to indicate that the whole vowel is “r-coloured”. Similarly, the short “schwa” in GA may
be r-coloured and symbolised ɚ as in ‘minor’ maɪnɚ. It would be wrong to assume that
GA has no long vowels like those of the BBC accent: in words like ‘psalm’, ‘bra’, ‘Brahms’,
where there is no letter ‘r’ following the ‘a’ in the spelling, a long non-rhotic vowel is
pronounced, whose pronunciation varies from region to region.
One vowel is noticeably different: the ɒ of ‘dog’, ‘cot’ in BBC pronunciation is not
found in GA. In most words where the BBC accent has ɒ we find ɑː or ɔː, so that ‘dog’,
which is dɒg in BBC, is dɑːg or dɔːg in American pronunciation. In this case, we have
a phonological difference, since one phoneme that is present in BBC pronunciation is
absent in American accents. Other segmental differences are phonetic: the l phoneme,
which was introduced in Section 2 of Chapter 7, is almost always pronounced as a “dark
l” in American English: the sound at the beginning of ‘like’ is similar to that at the end
of ‘mile’. The pronunciation of t is very different in American English when it occurs at
the end of a stressed syllable and in front of an unstressed vowel. In a word like ‘betting’,
which in BBC pronunciation is pronounced with a t that is plosive and slightly aspirated,
American speakers usually have what is called a “flapped r” in which the tip of the tongue
makes very brief contact with the alveolar ridge, a sound similar to the r sound in Spanish
and many other languages. This is sometimes called “voiced t”, and it is usually
represented with the symbol t̬.
There are many other differences between American and English pronunciation,
many of them the subject of comic debates such as “You say tomato (təˈmeɪtəʊ) and I say
tomato (təˈmɑːtəʊ).”
Scottish
There are many accents of British English, but one that is spoken by a large number
of people and is radically different from BBC English is the Scottish accent. There is much
variation from one part of Scotland to another; the accent of Edinburgh is the one most
usually described. Like the American accent described above, Scottish English pronunciation
is essentially rhotic and an ‘r’ in the spelling is always pronounced; the words ‘shore’
and ‘short’ can be transcribed as ʃɔr and ʃɔrt. The Scottish r sound is usually pronounced
as a “flap” or “tap” similar to the r sound in Spanish.
It is in the vowel system that we find the most important differences between BBC
pronunciation and Scottish English. As with American English, long vowels and diphthongs
that correspond to spellings with ‘r’ are composed of a vowel and the r consonant,
as mentioned above. The distinction between long and short vowels does not exist, so that
‘good’, ‘food’ have the same vowel, as do ‘Sam’, ‘psalm’ and ‘caught’, ‘cot’. The BBC diphthongs
eɪ, əʊ are pronounced as pure vowels e, o, but the diphthongs aɪ, aʊ, ɔɪ exist as in
the BBC accent (though with phonetic differences).
This brief account may cover the most basic differences, but it should be noted that
these and other differences are so radical that people from England and from parts of
lowland Scotland have serious difficulty in understanding each other. It often happens
that foreigners who have learned to pronounce English as it is spoken in England find life
very difficult when they go to Scotland, though in time they do manage to deal with the
pronunciation differences and communicate successfully.
We do not have space for a detailed examination of all the different types of variation
in pronunciation, but a few more are worth mentioning.
Age
Everybody knows that younger people speak differently from older people. This seems
to be true in every society, and many people believe that younger people do this specially to
annoy their parents and other people of the older generation, or to make it difficult for their
parents to understand what they are saying to their friends. We can look at how younger
people speak and guess at how the pronunciation of the language will develop in the future,
but such predictions are of limited value: elderly professors can safely try to predict how
pronunciation will change over the coming decades because they are not likely to be around
to find themselves proved wrong. The speech of young people tends to show more elisions
than that of older people. This seems to be true in all cultures, and is usually described by
older speakers as “sloppy” or “careless”. A sentence like the following: ‘What’s the point of
going to school if there’s no social life?’ might be pronounced in a careful way as (in phonemic
transcription) wɒts ðə pɔɪnt əv gəʊɪŋ tə skuːl ɪf ðəz nəʊ səʊʃl laɪf, but a young
speaker talking to a friend might (in the area of England where I live) say it in a way that
might be transcribed phonetically as s p 5 i? gaeu? skau f s naeu saeuj loif.
There is an aspect of intonation that has often been quoted in relation to age
differences: this is the use of rising intonation in making statements, a style of speaking
that is sometimes called “upspeak” or “uptalk”. Here is a little invented example:
I was in Marks and Spencer’s. In the food section. They had this chocolate cake. I
just had to buy some.
A typical adult pronunciation would be likely to use a sequence of falling tones, like this:
I was in 'Marks and \Spencers | In the \food section | They had this \chocolate
cake | I just 'had to \buy some
But the “upspeak” version would sound like this:
I was in 'Marks and /Spencers | In the /food section | They had this /chocolate
cake | I just 'had to \buy some
(with a falling tone only on the last tone-unit). It is widely believed that this style of
intonation arose from copying young actors in Australian and American soap operas. One
thing that keeps it alive in young people’s speech is that older people find it so intensely
irritating. It is, I believe, a passing fashion that will not last long.
Style
Many linguists have attempted to produce frameworks for the analysis of style in
language. There is not space for us to consider this in detail, but we should note that, for
foreign learners, a typical situation - regrettably, an almost inevitable one - is that they
learn a style of pronunciation which could be described as careful and formal. Probably
their teachers speak to them in this style, although what the learners are likely to encounter
when they join in conversations with native speakers is a “rapid, casual” style. We all
have the ability to vary our pronunciation to suit the different styles of speech that we
use. Speaking to one’s own children, for example, is a very different activity from that
of speaking to adults that one does not know well. In broadcasting, there is a very big
difference between formal news-reading style and the casual speech used in chat shows and
game shows. Some politicians change their pronunciation to suit the context: it was often
noticed that Tony Blair, when he was prime minister, would adopt an “Estuary English”
style of pronunciation when he wanted to project an informal “man of the people” style,
but a BBC accent when speaking on official state occasions. In the former style, it was
not unusual to hear him say something like ‘We’ve got a problem’ with a glottal stop
replacing the t in ‘got’: wiv gɒʔ ə prɒbləm. I can’t remember any other prime minister
doing this.
Rhythm forms an important part of style: careful, deliberate speech tends to go with
regular rhythm and slow speed. Casual speech, as well as being less rhythmical and faster,
tends to include a lot of “fillers” - such as hesitation noises (usually written ‘um’ or ‘er’) or
exaggeratedly long vowels to cover a hesitation.
It should now be clear that the pronunciation described in this course is only one of
a vast number of possible varieties. The choice of a slow, careful style is made for the sake
of convenience and simplicity; learners of English need to be aware of the fact that this
style is far from being the only one they will meet, and teachers of English to foreigners
should do their best to expose their students to other varieties.
20.1 For general reading about sociolinguistics and dialectology, see Trudgill (1999);
Foulkes and Docherty (1999); Spolsky (1998).
20.2 There are some major works on geographical variation in English pronunciation.
Wells (1982) is an important source of information in this field. For a brief overview,
with recorded examples, see Collins and Mees (2008: Section C). To find out more about
American and Scottish pronunciation, see Cruttenden (2001: Sections 7.6.1 and 7.6.2);
there is a good account of the vowel systems of American, Scottish and BBC English in
Giegerich (1992: Chapter 3). In a more practical way, it can be useful to compare the
accounts of American and British pronunciation in pronunciation dictionaries such as
Jones (eds. Roach et al., 2006) or Wells (2008); the CDs of these dictionaries allow you to
listen to the British and American pronunciations of all the words in the dictionary, and to
compare your own pronunciation.
20.3 On “upspeak” or “uptalk”, see Wells (2006: Section 2.9); Cruttenden (1997:129-130).
Collins and Mees (2008) reproduce a valuable extract from the work of Barbara Bradford,
who has done pioneering work in this area. Shockey (2003) shows the great variation
between formal and informal styles of speech.
Note for teachers
that when they get back home they risk being given a lower mark by their (middle-aged)
examiners in an oral examination than students producing a more traditional accent. I
regret this, but I can’t change it.
The comment about Audio Unit 18 at the end of Chapter 18 applies also to Audio
Unit 20 (CD 2). At first hearing it seems very difficult, but when worked on step by step
it is far from impossible. If there is time, students should now be encouraged to go back
to some of the more difficult Audio Units dealing with connected speech (say from Audio
Unit 12 onwards, missing out Audio Unit 15); they will probably discover a lot of things
they did not notice before.
Written exercise
Phonological differences between accents are of various types. For each of the following
sets of phonetic data, based on non-BBC accents, say what you can conclude about the
phonology of that accent.
1  ‘sing’ sɪŋ          ‘finger’ fɪŋgə
   ‘sung’ sʌŋ          ‘running’ rʌnɪn
   ‘singing’ sɪŋɪn     ‘ring’ rɪŋ
2  ‘day’ deː           ‘you’ juː
   ‘buy’ baɪ           ‘me’ miː
   ‘go’ goː            ‘more’ mɔː
   ‘now’ naʊ           ‘fur’ fɜː
   ‘own’ oːn           ‘eight’ eːt
3  ‘mother’ mʌvə       ‘father’ fɑːvə
   ‘think’ fɪŋk        ‘breath’ bref
   ‘lip’ lɪp           ‘pill’ pɪw
   ‘help’ ewp          ‘hill’ ɪw
4  ‘mother’ mʌðər      ‘father’ fɑːðər
   ‘car’ kɑːr          ‘cart’ kɑːrt
   ‘area’ eːriəl       ‘aerial’ eːriəl
   ‘idea’ aɪdɪəl       ‘ideal’ aɪdɪəl
   ‘India’ ɪndiəl      ‘Norma’ nɔːməl
5  ‘cat’ kat           ‘plaster’ plaːstər
   ‘cart’ kaːrt        ‘grass’ graːs
   ‘calm’ kaːm         ‘gas’ gas
Recorded exercises
These exercises are mainly intended for students whose native language is not English;
however, those exercises which involve work with transcription (exercises 1.2, 2.2, 3.3, 3.5,
3.7, 4.5, 5.3, 5.4, 6.2, 7.6, 9.5, 10.1, 10.2, 11.5, 12.3, 13.1, 13.2, 13.3, all of Audio Unit 14
and Exercise 19.2) and those which give practice in intonation (Audio Units 15-20) will be
useful to native speakers as well.
Each Audio Unit corresponds to a chapter of this book. As far as possible I have tried
to relate the content of each Audio Unit to the material of the chapter; however, where the
chapter is devoted to theoretical matters I have taken advantage of this to produce revision
exercises going back over some of the subjects previously worked on.
In some of the exercises you are asked to put stress or intonation marks on the text.
It would be sensible to do this in such a way that will make it possible for you, or someone
else, to erase these marks and use the exercise again.
As with the chapters of the book, these exercises are intended to be worked through
from first to last. Those at the beginning are concerned with individual vowels and
consonants, and the words containing them are usually pronounced in isolation in a slow,
careful style. Pronouncing isolated words in this way is a very artificial practice, but the
recorded exercises are designed to lead the student towards the study of comparatively
natural and fluent speech by the end of the course. In some of the later exercises you will
find it necessary to stop the recording in order to allow yourself enough time to write a
transcription. You will also need to stop the recording to check your answers. The answers
section for the Recorded Exercises is on pages 210-18.
Audio Unit 1 Introduction
To give you practice in using the audio exercises in this book, here are two simple exercises
on English word stress.
Exercise 1 Repetition
Each word is shown with a diagram showing which syllables are strong (●) and which are
weak (•). Listen to each word and repeat it.
1  • ● •      potato
2  ● • •      optimist
3  • ●        decide
4  • • ● •    reservation
5  ● • •      quantity
The exercises in this Unit practise the six short vowels introduced in Chapter 2. When
pronouncing them, you should take care to give the vowels the correct length and the
correct quality.
Exercise 1 Repetition
Listen and repeat:
ɪ
bit bɪt    bid bɪd    hymn hɪm    miss mɪs
e
bet bet    bed bed    hen hen    mess mes
æ
bat bæt    bad bæd    ham hæm    mass mæs
ʌ
Exercise 3 Production
When you hear the number, pronounce the word (which is given in spelling and in
phonetic symbols). Repeat the correct pronunciation when you hear it.
Example: 1 ‘mad’
1 mad mæd    4 bet bet
2 mud mʌd    5 cut kʌt
3 bit bɪt    6 cot kɒt
Exercise 4 Short vowels contrasted
Listen and repeat (words given in spelling):
ɪ and e    e and æ    æ and ʌ
bit bet hem ham lack luck
tin ten set sat bad bud
fill fell peck pack fan fun
built belt send sand stamp stump
lift left wreck rack flash flush
ʌ and ɒ    ɒ and ʊ
dug dog lock look
cup cop cod could
rub rob pot put
stuck stock shock shook
luck lock crock crook
Long vowels
Exercise 1 Repetition
Listen and repeat:
iː
beat biːt    bead biːd    bean biːn    beef biːf
ɑː
heart hɑːt    hard hɑːd    harm hɑːm    hearth hɑːθ
ɔː
caught kɔːt    cord kɔːd    corn kɔːn    course kɔːs
uː
root ruːt    rude ruːd    room ruːm    roof ruːf
ɜː
hurt hɜːt    heard hɜːd    earn ɜːn    earth ɜːθ
Exercise 2 Production
When you hear the number, pronounce the word. Repeat the correct pronunciation when
you hear it.
1 heard hɜːd     6 heart hɑːt
2 bean biːn      7 cord kɔːd
3 root ruːt      8 beef biːf
4 hearth hɑːθ    9 rude ruːd
5 caught kɔːt    10 earn ɜːn
Exercise 4 Long-short vowel contrasts
Listen and repeat (words in spelling):
iː and ɪ    ɑː and ʌ    ɑː and æ
feel fill calm come part pat
bead bid cart cut lard lad
steel still half huff calm Cam
reed rid lark luck heart hat
bean bin mast must harms hams
ɔː and ɒ    uː and ʊ    ɜː and ʌ    ɑː and ɒ
caught cot pool pull hurt hut dark dock
stork stock suit soot turn ton part pot
short shot Luke look curt cut lark lock
cord cod wooed wood girl gull balm bomb
port pot fool full bird bud large lodge
Diphthongs
Exercise 6 Repetition
Listen and repeat, making sure that the second part of the diphthong is weak.
eɪ
mate meɪt    made meɪd    main meɪn    mace meɪs
aɪ
right raɪt    ride raɪd    rhyme raɪm    rice raɪs
ɔɪ
quoit kɔɪt    buoyed bɔɪd    Boyne bɔɪn    Royce rɔɪs
əʊ
coat kəʊt    code kəʊd    cone kəʊn    close kləʊs
aʊ
gout gaʊt    loud laʊd    gown gaʊn    louse laʊs
ɪə
feared fɪəd    fierce fɪəs
eə
cared keəd    cairn keən    scarce skeəs
ʊə
moored mʊəd
Triphthongs
Exercise 8 Repetition
Listen and repeat:
eɪə layer leɪə    əʊə lower ləʊə
aɪə liar laɪə     aʊə tower taʊə
ɔɪə loyal lɔɪəl
Audio Unit 4 Plosives
INITIAL LENIS b, d, g
Each word begins with a lenis plosive; notice that there is practically no voicing of the
plosive. Listen and repeat:
bee biː     gear gɪə
door dɔː    boy bɔɪ
go gəʊ      dear dɪə
bear beə    bough baʊ
do duː      day deɪ
INITIAL sp, st, sk
The plosive must be unaspirated. Listen and repeat:
spy spaɪ      score skɔː
store stɔː    spear spɪə
ski skiː      stay steɪ
spare speə    sky skaɪ
steer stɪə    spar spɑː
Exercise 2 Repetition of final plosives
In the pairs of words in this exercise one word ends with a fortis plosive and the other
ends with a lenis plosive. Notice the length difference in the vowel. Listen to each pair and
repeat:
FORTIS FOLLOWED BY LENIS
mate made      meɪt meɪd
rope robe      rəʊp rəʊb
leak league    liːk liːg
cart card      kɑːt kɑːd
back bag       bæk bæg
Audio Unit 5 Revision
Exercise 1 Vowels and diphthongs
Listen and repeat:
ɑː and ɜː    eɪ and e    aɪ and ɑː
barn burn fade fed life laugh
are err sale sell tight tart
fast first laid led pike park
cart curt paste pest hide hard
lark lurk late let spike spark
ɔɪ and ɔː    əʊ and ɔː    ɪə and iː
toy tore phone fawn fear fee
coin corn boat bought beard bead
boil ball code cord mere me
boy bore stoke stork steered steed
foil fall bowl ball peer pea
eə and eɪ    eə and ɪə    ʊə and ɔː
dare day fare fear poor paw
stared stayed pair pier sure shore
pairs pays stare steer moor more
hair hay air ear dour door
mare may snare sneer tour tore
Exercise 2 Triphthongs
Listen and repeat:
eɪə    player pleɪə
aɪə    tyre taɪə
ɔɪə    loyal lɔɪəl
əʊə    mower məʊə
aʊə    shower ʃaʊə
Exercise 3 Transcription of words
You should now be able to recognise all the vowels, diphthongs and triphthongs of English,
and all the plosives. In the next exercise you will hear one-syllable English words composed
of these sounds. Each word will be said twice. You must transcribe these words using the
phonemic symbols that you have learned in the first three chapters. When you hear the
word, write it with phonemic symbols. (1-20)
which you should repeat. If you want to see how these words are spelt when you have
finished the exercise, you will find them in the answers section.
1 kiːp      11 dʌk
2 bəʊt      12 kəʊp
3 kʌp       13 dɒg
4 dɜːt      14 kaʊəd
5 baɪk      15 beɪk
6 kæb       16 taɪd
7 geɪt      17 bɪəd
8 keəd      18 pʊt
9 taɪəd     19 bʌg
10 bɜːd     20 daʊt
Fortis Lenis
1 right raɪt    ride raɪd
2 bat bæt       bad bæd
3 bet bet       bed bed
4 leak liːk     league liːg
5 feet fiːt     feed fiːd
6 right raɪt    ride raɪd
7 tack tæk      tag tæg
8 rope rəʊp     robe rəʊb
9 mate meɪt     made meɪd
10 beat biːt    bead biːd
ʒ    measure meʒə    rouge ruːʒ
h    hot hɒt    beehive biːhaɪv
c) Final ʃ and tʃ
mæʃ mætʃ (mash, match)
kæʃ kætʃ (cash, catch)
wɪʃ wɪtʃ (wish, witch)
d) Medial ʒ and dʒ
leʒə ledʒə (leisure, ledger)
pleʒə pledʒə (pleasure, pledger)
liːʒən liːdʒən (lesion, legion)
Exercise 5 Discrimination between fricatives and affricates
You will hear some of the words from Exercise 4. When you hear the word, say “A” if you
hear the word on the left, or “B” if you hear the word on the right. You will then hear the
correct answer and the word will be said again for you to repeat.
A B
ʃɒp      tʃɒp
kæʃ      kætʃ
wɒʃɪŋ    wɒtʃɪŋ
ʃuːz     tʃuːz
liːʒən   liːdʒən
bæʃɪz    bætʃɪz
ʃiːt     tʃiːt
leʒə     ledʒə
liːʃɪz   liːtʃɪz
wɪʃ      wɪtʃ
pleʒə    pledʒə
mæʃ      mætʃ
Exercise 1 Repetition of words containing a velar nasal
Listen and repeat; take care not to pronounce a plosive after the velar nasal.
hæŋ      hæŋə
sɪŋɪŋ    rɒŋ
tʌŋ      bæŋɪŋ
θɪŋ      rɪŋ
Exercise 2 Velar nasal with and without g
WORDS OF ONE MORPHEME
Listen and repeat:
fɪŋgə finger
æŋgə anger
bæŋgə Bangor
hʌŋgə hunger
æŋgl angle
Exercise 4 r
Listen and repeat, concentrating on not allowing the tongue to make contact with the roof
of the mouth in pronouncing this consonant:
eərɪŋ airing           reərə rarer
riːraɪt rewrite        herɪŋ herring
terərɪst terrorist     mɪrə mirror
ærəʊ arrow             rɔːrɪŋ roaring
Exercise 5 j and w
Listen and repeat:
juː you      weɪ way
jɔːn yawn    wɔː war
jɪə year     wɪn win
jʊə your     weə wear
Exercise 6 Dictation of words
When you hear the word, write it down using phonemic symbols. Each word will be said
three times; you should pause the CD if you need more time for writing. (1-12)
Exercise 1 Devoicing of l, r, w, j
When l, r, w, j follow p, t or k in syllable-initial position they are produced as voiceless,
slightly fricative sounds.
Listen and repeat:
pleɪ play    treɪ tray     klɪə clear
preɪ pray    twɪn twin     kraɪ cry
pjuː pew     tjuːn tune    kjuː queue
Exercise 2 Repetition of initial clusters
TWO CONSONANTS
Listen and repeat:
spɒt spot        plaʊ plough
stəʊn stone      twɪst twist
skeɪt skate      kriːm cream
sfɪə sphere      pjʊə pure
smaɪl smile      fleɪm flame
snəʊ snow        ʃrɪŋk shrink
slæm slam        vjuː view
swɪtʃ switch     θwɔːt thwart
THREE CONSONANTS
Listen and repeat:
spleɪ splay    streɪ stray    skruː screw
spreɪ spray    stjuː stew     skwɒʃ squash
spjuː spew     skjuː skew
5 krʌnʃt    7 plʌndʒd
6 θrəʊnz    8 kwenʃ
(The spelling of these words is given in the answers section.)
Exercise 7 Repetition o f sentences with consonant clusters
Listen and repeat:
1 Strong trucks climb steep gradients    strɒŋ trʌks klaɪm stiːp greɪdiənts
2 He cycled from Sloane Square through Knightsbridge    hi saɪkld frəm sləʊn
skweə θruː naɪtsbrɪdʒ
3 Old texts rescued from the floods were preserved    əʊld teksts reskjuːd frəm ðə
flʌdz wə prɪzɜːvd
4 Six extra trays of drinks were spread around    sɪks ekstrə treɪz əv drɪŋks wə
spred əraʊnd
5 Thick snowdrifts had grown swiftly    θɪk snəʊdrɪfts əd grəʊn swɪftli
6 Spring prompts flowers to grow    sprɪŋ prɒmpts flaʊəz tə grəʊ
Exercise 1 “Schwa”
TWO-SYLLABLE WORDS WITH WEAK FIRST SYLLABLE AND STRESS ON THE
SECOND SYLLABLE
Listen and repeat:
Weak syllable spelt ‘a’
about əˈbaʊt    ahead əˈhed    again əˈgen
Spelt ‘o’
obtuse əbˈtjuːs    oppose əˈpəʊz    offend əˈfend
Spelt ‘u’
suppose səˈpəʊz    support səˈpɔːt    suggest səˈdʒest
Spelt ‘or’
forget fəˈget    forsake fəˈseɪk    forbid fəˈbɪd
Spelt ‘er’
perhaps pəˈhæps    per cent pəˈsent    perceive pəˈsiːv
Spelt ‘ur’
survive səˈvaɪv    surprise səˈpraɪz    survey (verb) səˈveɪ
Spelt ‘e’
hundred ˈhʌndrəd    sullen ˈsʌlən    open ˈəʊpən
Spelt ‘u’
circus ˈsɜːkəs    autumn ˈɔːtəm    album ˈælbəm
Spelt ‘ar’
tankard ˈtæŋkəd    custard ˈkʌstəd    standard ˈstændəd
Spelt ‘or’
juror ˈdʒʊərə    major ˈmeɪdʒə    manor ˈmænə
Spelt ‘er’
longer ˈlɒŋgə    eastern ˈiːstən    mother ˈmʌðə
Spelt ‘ure’
nature ˈneɪtʃə    posture ˈpɒstʃə    creature ˈkriːtʃə
Spelt ‘ous’
ferrous ˈferəs    vicious ˈvɪʃəs    gracious ˈgreɪʃəs
Spelt ‘ough’
thorough ˈθʌrə    borough ˈbʌrə
Spelt ‘our’
saviour ˈseɪvjə    succour ˈsʌkə    colour ˈkʌlə
THREE-SYLLABLE WORDS WITH WEAK SECOND SYLLABLE AND STRESS ON THE
FIRST SYLLABLE
Listen and repeat:
Weak syllable spelt ‘a’
workaday ˈwɜːkədeɪ    roundabout ˈraʊndəbaʊt
Spelt ‘o’
customer ˈkʌstəmə    pantomime ˈpæntəmaɪm
Spelt ‘u’
perjury ˈpɜːdʒəri    venturer ˈventʃərə
Spelt ‘ar’
standardise ˈstændədaɪz    jeopardy ˈdʒepədi
Spelt ‘er’
wonderland ˈwʌndəlænd    yesterday ˈjestədeɪ
Exercise 1 Stress marking
When you hear the word, repeat it, then place a stress mark (ˈ) before the stressed
syllable.
enəmi enemy           səbtrækt subtract
kəlekt collect        elɪfənt elephant
kæpɪtl capital        əbzɜːvə observer
kɑːneɪʃn carnation    prɒfɪt profit
pærədaɪs paradise     entəteɪn entertain
1 ˈʃrəʊzbri           6 ˈbɜːmɪŋəm
2 pɒlˈperəʊ           7 ˌnɔːˈθæmptən
3 ˌæbəˈdiːn           8 dʌnˈdiː
4 ˌwʊlvəˈhæmptən      9 ˈkæntəbri
5 ˌæbəˈrɪstwəθ        10 ˈbeɪzɪŋstəʊk
(The spelling for these names is given in the answers section.)
TWO-SYLLABLE WORDS
VERBS
1 dɪsiːv deceive          6 əbdʒekt object
2 ʃɑːpən sharpen          7 kɒŋkə conquer
3 kəlekt collect          8 rɪkɔːd record
4 prənaʊns pronounce      9 pɒlɪʃ polish
5 kɒpi copy               10 dɪpend depend
ADJECTIVES
1 iːzi easy               6 jeləʊ yellow
2 kəmpliːt complete       7 ɜːli early
3 meɪdʒə major            8 səblaɪm sublime
4 ələʊn alone             9 hevi heavy
5 bɪləʊ below             10 əlaɪv alive
NOUNS
1 bɪʃəp bishop            6 ɒfɪs office
2 æspekt aspect           7 əreɪ array
3 əfeə affair             8 pətrəʊl patrol
4 kɑːpɪt carpet           9 dentɪst dentist
5 dɪfiːt defeat           10 ɔːtəm autumn
THREE-SYLLABLE WORDS
VERBS
1 entəteɪn entertain      6 ɪlɪsɪt elicit
2 rezərekt resurrect      7 kɒməndɪə commandeer
3 əbændən abandon         8 ɪmædʒɪn imagine
4 dɪlɪvə deliver          9 dɪtɜːmɪn determine
5 ɪntərʌpt interrupt      10 sepəreɪt separate
ADJECTIVES
1 ɪmpɔːtnt important      6 ɪnsələnt insolent
2 ɪnɔːməs enormous        7 fæntæstɪk fantastic
3 derəlɪkt derelict       8 negətɪv negative
4 desɪml decimal          9 ækjərət accurate
5 æbnɔːməl abnormal       10 ʌnlaɪkli unlikely
NOUNS
1 fɜːnɪtʃə furniture      6 kəθiːdrəl cathedral
2 dɪzɑːstə disaster       7 hɒləkɔːst holocaust
3 dɪsaɪpl disciple        8 trænzɪstə transistor
4 æmbjələns ambulance     9 æksɪdnt accident
5 kwɒntəti quantity       10 təmɑːtəʊ tomato
Exercise 2 Neutral suffixes
When you hear the stem word, add the suffix, without changing the stress.
comfort+-able power+-less
anchor+-age hurried+-ly
refuse+-al (refusal) punish+-ment
wide+-en (widen) yellow+-ness
wonder+-ful poison+-ous
amaze+-ing (amazing) glory+-fy (glorify)
devil+-ish other+-wise
bird+-like fun+-y (funny)
Words occurring in their weak forms are printed in smaller type than stressed words and
strong forms, for example:
'we can 'wait    ˈwiː kən ˈweɪt
Audio Unit 12 Weak forms
Exercise 2 Weak forms with pre-vocalic and pre-consonantal forms
DIFFERENT VOWELS
When you hear the number, say the phrase, using the appropriate weak form:
the    1 the apple  ði æpl    2 the pear  ðə peə
to    3 to Edinburgh  tu ednbrə    4 to Leeds  tə liːdz
do    5 so do I  səʊ du aɪ    6 so do they  səʊ də ðeɪ
LINKING CONSONANT
a/an    7 an ear  ən ɪə    8 a foot  ə fʊt
(The other words in this section have “linking r”.)
her    9 her eyes  hər aɪz    10 her nose  hə nəʊz
your    11 your uncle  jər ʌŋkl    12 your friend  jə frend
for    13 for Alan  fər ælən    14 for Mike  fə maɪk
there    15 there aren’t  ðər ɑːnt    16 there couldn’t  ðə kʊdnt
are    17 these are ours  ðiːz ər aʊəz    18 these are mine  ðiːz ə maɪn
were    19 you were out  juː wər aʊt    20 you were there  juː wə ðeə
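The pattern practised in this exercise can be sketched in a few lines of Python: each word has one weak form used before a vowel and another used before a consonant. This is only an illustration under stated assumptions. The names `WEAK_FORMS`, `VOWEL_SYMBOLS` and `weak_form` are invented for the example, and the vowel test simply looks at the first symbol of the following word's transcription.

```python
# Weak forms from the exercise above: (form before a vowel, form before a consonant).
WEAK_FORMS = {
    "the": ("ði", "ðə"),
    "to": ("tu", "tə"),
    "do": ("du", "də"),
    "a/an": ("ən", "ə"),
    "her": ("hər", "hə"),
    "your": ("jər", "jə"),
    "for": ("fər", "fə"),
    "there": ("ðər", "ðə"),
    "are": ("ər", "ə"),
    "were": ("wər", "wə"),
}

# Assumption: the first symbol of every vowel and diphthong used in this course.
VOWEL_SYMBOLS = set("ɪeæʌɒʊəiɑɔuɜa")


def weak_form(word: str, next_transcription: str) -> str:
    """Pick the weak form of `word` that fits the sound starting the next word."""
    pre_vocalic, pre_consonantal = WEAK_FORMS[word]
    if next_transcription and next_transcription[0] in VOWEL_SYMBOLS:
        return pre_vocalic
    return pre_consonantal


print(weak_form("the", "æpl"))   # the apple
print(weak_form("the", "peə"))   # the pear
print(weak_form("her", "aɪz"))   # her eyes
```

Note that j and w count as consonants here, so a following word such as juː takes the pre-consonantal form.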
Write the following sentences in transcription, taking care to give the correct weak forms
for the words printed in smaller type.
1 'Leave the 'rest of the 'food for 'lunch
2 'Aren’t there some 'letters for her to 'open?
3 'Where do the 'eggs 'come from?
4 'Read his 'book and 'write some 'notes
5 At 'least we can 'try and 'help
Now correct your transcription, using the version in the answers section.
Exercise 4 Pronunciation of weak forms
This exercise uses the sentences of Exercise 3. When you hear the number, say the sentence,
giving particular attention to the weak forms. (1-5)
Audio Unit 13 Revision
Exercise 1 Rhythm and the foot
Listen to the following sentences. Put a stress mark ' on each stressed syllable, then divide
the sentences into feet by placing a dotted line ¦ at each foot boundary.
Example: ¦ 'Come to the ¦ 'party on ¦ 'Monday ¦ 'evening ¦
1 Each person in the group was trained in survival
2 About three hundred soldiers were lined up
3 Buying a new computer is a major expense
4 All the people who came to the wedding were from England
5 Try to be as tactful as you can when you talk to him
Exercise 2 Elisions
Read this before starting this exercise
This Audio Unit gives you practice in recognising places where elision occurs in natural
speech (i.e. where one or more phonemes which would be pronounced in careful speech are
not pronounced). The examples are extracted from dialogues between speakers who are
discussing differences between two similar pictures. Each extract is given three times. You must
transcribe each item, using phonemic symbols so that the elision can be seen in the
transcription. For example, if you heard ‘sixth time’ pronounced without the θ fricative at the
end of the first word you would write sɪks taɪm, and the elision would be clearly indicated in
this way. You can use the ʰ symbol to indicate a devoiced weak vowel, as in ‘potato’ pʰteɪtəʊ.
You will probably need to pause your CD or tape to give yourself more time to write
the transcription. This is a difficult exercise, but explanatory notes are given in the answers
section.
Transcription
ONE ELISION
1 a beautiful girl
2 we seem to have a definite one there
3 could it be a stool rather than a table
4 a fifth in
5 any peculiarities about that
6 and how many stripes on yours
7 well it appears to button up its got three
8 or the what do you call it the sill
TWO ELISIONS
9 by column into columns all right
10 diamond shaped patch
11 and I should think from experience of kitchen knives
12 what shall we do next go down
THREE ELISIONS
13 the top of the bottle is projecting outwards into the room
Now check your transcriptions.
Audio Unit 15 Tones
Exercise 1 Repetition of tones
Listen and repeat:
Fall: \yes \no \well \four
Rise: /yes /no /well /four
Fall-rise: ∨yes ∨no ∨well ∨four
Rise-fall: ∧yes ∧no ∧well ∧four
Level: _yes _no _well _four
Exercise 2 Production of tones
When you hear the number, say the syllable with the tone indicated:
1 /them
2 \why
3 ∨well
4 \John
5 /what
6 ∧no
7 \here
8 /you
9 /now
10 ∨end
Audio Unit 16 The tone-unit
7 Here it is
8 That was a loud noise
9 We could go from Manchester
10 Have you finished
Now check your answers.
Exercise 2 Pronouncing the tonic syllable
When you hear the number, say the item with the tonic syllable in the place indicated,
using a falling tone:
1 Dont do that
2 Dont do that
3 Dont do that
4 Write your name
5 Write your name
6 Write your name
7 Heres my pen
8 Heres my pen
9 Heres my pen
10 Why dont you try
11 Why dont you try
12 Why dont you try
13 Why dont you try
Audio Unit 17 Intonation
Exercise 3 High and low head
The following tone-units will be repeated with high and low heads. Listen and repeat:
'Taxes have 'risen by 'five per \cent
ˌTaxes have ˌrisen by ˌfive per \cent
'Havent you 'asked the 'boss for /more
ˌHavent you ˌasked the ˌboss for /more
We 'dont have 'time to 'read the \paper
We ˌdont have ˌtime to ˌread the \paper
'Wouldnt you 'like to 'read it on the /train
ˌWouldnt you ˌlike to ˌread it on the /train
The following extracts are from the same recorded conversations as were used in Audio
Unit 14. Each extract will be heard three times, with four or five seconds between
repetitions. Mark the intonation; the instructions for how to do this are given in the text for
Audio Unit 17, Exercise 4. In addition, for numbers 10-16 you will need to use the vertical
line | to separate tone-units.
Transcription
ONE TONE-UNIT
1 it looks like a French magazine
2 the television is plugged in
3 does your colander have a handle
4 a flap on it
5 you tell me about yours
6 well dark hair
7 more than halfway
Audio Unit 19 Further practice on connected speech
Exercise 1 Dictation
You will hear five sentences spoken rapidly. Each will be given three times. Write each
sentence down in normal spelling. (1-5)
Exercise 4 Study passage
The following passage will first be read as continuous speech, then each tone-unit will be
heard separately, twice.
They’re building wind farms all over the area where we live. We can see long lines of
them along the tops of the hills, and down by the coast there are wind turbines out at
sea and along the shore. They only build them where there’s plenty of wind, obviously.
We certainly get a lot of that near us. You could say the landscape’s been completely
transformed, but most people don’t seem to mind.
a) Transcribe each tone-unit using phonemic symbols, but paying attention to
connected-speech features such as elisions and assimilations.
b) Add intonation transcription to each tone-unit.
do not apply
if youre turning right
which means that
if youre coming up to a traffic light
someone stopped
who wants to go straight on
or turn left
and you want to turn right
then you pull out
overtake them
and then cut across
in front
Now check your transcription.
Answers to written exercises
Chapter 1
Chapter 2
a) e    e) ʊ
b) ʌ    f) ɒ
c) ʊ    g) æ
d) ɪ    h) e
2 a) ɔː    d) ɜː    g) ɜː
b) ɔː    e) uː    h) iː
c) ɑː    f) iː    i) ɜː
3 a) əʊ    d) eɪ    g) eə
b) aɪ    e) ɪə    h) aɪ
c) aʊ    f) ɔɪ    i) eɪ
Chapter 4
1 You will obviously not have written descriptions identical to the ones given
below. The important thing is to check that the sequence of articulatory events is
more or less the same.
a) goat
Starting from the position for normal breathing, the back of the tongue is raised
to form a closure against the velum (soft palate). The lungs are compressed to
produce higher air pressure in the vocal tract and the vocal folds are brought
together in the voicing position. The vocal folds begin to vibrate, and the back
of the tongue is lowered to allow the compressed air to escape. The tongue is
moved to a mid-central vowel and then moves in the direction of a closer, backer
vowel: the lips are moderately rounded for the second part. The tongue blade
is raised to make a closure against the alveolar ridge, the vocal folds are
separated and voicing ceases. Then the compressed air is released quietly and the lips
return to an unrounded shape.
b) ape
The tongue is moved slightly upward and forward, and the vocal folds are
brought together to begin voicing. The tongue glides to a slightly closer and
more central vowel position. Then the lips are pressed together, making a
closure, and at the same time the vocal folds are separated so that voicing ceases.
The lips are then opened and the compressed air is released quietly, while the
tongue is lowered to the position for normal breathing.
2 a) beɪk    d) bɔːt    g) bɔːd
b) gəʊt    e) tɪk    h) gɑːd
c) daʊt    f) baʊ    i) piː
Chapter 5
Chapter 6
a) fɪʃɪz    e) ətʃiːvz
b) ʃeɪvə    f) ʌðəz
c) sɪksθ    g) meʒə
d) ðiːz    h) əhed
Starting from the position for normal breathing, the lower lip is brought into
contact with the upper teeth. The lungs are compressed, causing air to flow
through the constriction, producing fricative noise. The tongue moves to the
position for ɪ. The vocal folds are brought together, causing voicing to begin,
and at the same time the lower lip is lowered. Then the tongue blade is raised to
make a fairly wide constriction in the post-alveolar region and the vocal folds
are separated to stop voicing; the flow of air causes fricative noise. Next, the
vocal folds are brought together to begin voicing again and at the same time the
tongue is lowered from the constriction position into the ɪ vowel posture. The
tongue blade is then raised against the alveolar ridge, forming a constriction
which results in fricative noise. This is initially accompanied by voicing, which
then dies away. Finally, the tongue is lowered from the alveolar constriction, the
vocal folds are separated and normal breathing is resumed.
Chapter 7
Plosives: p t k b d g
Fricatives: f θ s ʃ h v ð z ʒ
Affricates: tʃ dʒ
Nasals: m n ŋ
Lateral: l
Approximants: r w j
(This course has also mentioned the possibility of ç and ʍ.)
a) səʊfə    c) stɪərɪŋ
b) vɜːs    d) bredkrʌm
e) skweə    g) bɔːt
f) æŋgə    h) naɪntiːn
3 a) The soft palate is raised for the b plosive and remains raised for æ. It is
lowered for n, then raised again for the final ə.
b) The soft palate remains lowered during the articulation of m, and is then
raised for the rest of the syllable.
c) The soft palate is raised for the æ vowel, then lowered for ŋ. It is then raised
for the g plosive and remains raised for the l.
Chapter 8
c)  PRE-INITIAL  INITIAL  POST-INITIAL  PEAK  CODA (FINAL)
    s            p        l             æ     ʃ
Chapter 9
1 ə pətɪkjələ prɒbləm əv ðə bəʊt wəz ə liːk
2 əʊpnɪŋ ðə bɒtl prɪzentɪd nəʊ dɪfɪklti
Chapter 10
1 a) pro'tect    prəˈtekt
b) 'clamber    ˈklæmbə
c) fes'toon    fesˈtuːn
d) de'test    dɪˈtest
e) 'bellow    ˈbeləʊ
f) 'menace    ˈmenɪs
g) ˌdisco'nnect    ˌdɪskəˈnekt
h) 'entering    ˈentərɪŋ (ˈentrɪŋ)
2 a) 'language    ˈlæŋgwɪdʒ
b) 'captain    ˈkæptɪn
c) ca'reer    kəˈrɪə
d) 'paper    ˈpeɪpə
e) e'vent    ɪˈvent
f) 'jonquil    ˈdʒɒŋkwɪl
g) 'injury    ˈɪndʒəri (ˈɪndʒri)
h) co'nnection    kəˈnekʃən (kəˈnekʃn)
Chapter 11
1 and 2
a) 'shopˌkeeper    ˈʃɒpˌkiːpə
b) ˌopen'ended    ˌəʊpənˈendɪd
c) ˌJava'nese    ˌdʒɑːvəˈniːz
d) 'birthmark    ˈbɜːθmɑːk
e) ˌanti'clockwise    ˌæntiˈklɒkwaɪz
f) ˌconfir'mation    ˌkɒnfəˈmeɪʃn
g) ˌeight'sided    ˌeɪtˈsaɪdɪd
h) 'fruitcake    ˈfruːtˌkeɪk
i) de'fective    dɪˈfektɪv
j) 'roofˌtimber    ˈruːfˌtɪmbə
Chapter 12
Chapter 13
            p  d  s  m  z
Continuant  -  -  +  +  +
Alveolar    -  +  +  -  +
Voiced      -  +  -  +  +
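A feature table of this kind can be treated as a small lookup structure, which makes it easy to check which segments share a given set of values. The Python sketch below is only an illustration: the names `FEATURES`, `segments_with` and `all_distinct` are invented for the example, and the values are copied from the table above (True for +, False for -).

```python
# Feature values copied from the table above (True for +, False for -).
FEATURES = {
    "p": {"continuant": False, "alveolar": False, "voiced": False},
    "d": {"continuant": False, "alveolar": True, "voiced": True},
    "s": {"continuant": True, "alveolar": True, "voiced": False},
    "m": {"continuant": True, "alveolar": False, "voiced": True},
    "z": {"continuant": True, "alveolar": True, "voiced": True},
}


def segments_with(**wanted):
    """Return the segments whose features match every requested value."""
    return [
        segment
        for segment, values in FEATURES.items()
        if all(values[name] == value for name, value in wanted.items())
    ]


def all_distinct():
    """Check that no two segments share the same combination of values."""
    bundles = {tuple(sorted(values.items())) for values in FEATURES.values()}
    return len(bundles) == len(FEATURES)


print(segments_with(alveolar=True, voiced=True))  # ['d', 'z']
print(segments_with(continuant=False))            # ['p', 'd']
print(all_distinct())                             # True
```

The last check shows that these three features are enough to keep all five segments apart.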
5 a) All the vowels are close or close-mid (or between these heights).
b) All require the tongue blade to be raised for their articulation, and all are in
the alveolar or post-alveolar region.
c) None of these requires the raising of the tongue blade - all are front or back
articulations.
d) All are voiceless.
e) All are rounded or end with lip-rounding.
f) All are approximants (they create very little obstruction to the airflow).
Chapter 14
c) Com ¦ puters con ¦ sume a con ¦ siderable a ¦ mount of ¦ money and ¦ time
d) ¦ Most of them have a ¦ rrived on the ¦ bus
e) ¦ Newspaper ¦ editors are in ¦ variably ¦ under ¦ worked
2 a)-d) [Metrical tree diagrams with s/w labels; d) is ‘Rolls Royce rally event’]
(The stress levels of ‘Rolls’ and ‘Royce’ are exchanged to avoid “stress clash”
between ‘Royce’ and ‘ra-’.)
3 a) wʌŋ kɔːz əv æsmər ɪs spəʊs tə bi ælədʒiz
b) wɒt ði ɜːbm pɒpjəleɪʃn kʊdʒuːz ɪz betə treɪnz
c) ʃi æks pətɪkjli wel ɪn nə fɜːs siːn
(Each of the above represents just one possible pronunciation: many others are
possible.)
Chapter 15
Chapter 16
1 (This is an exercise where there is more than one correct answer.)
a) buy it for me
b) hear it
c) talk to him
2 a) 'mind the step
b) 'this is the 'ten to 'seven train
c) 'keep the 'food hot
3 a) 'Only when the ∨wind ·blows
b) /When did you ·say
Chapter 17
c) ˌShe would have ˌthought it was ∧obvious
2 a) ˌoppor/tunity
b) ∨actually
c) \confidently
d) mag∧nificent
e) re/lationship
f) ˌafter∨noon
Chapter 18
1 Its 'rather ∨cold
2 Be'cause I 'cant a\fford it
3 Youre \silly then
4 Oh ∨please
5 ˌSeven o\clock | ˌseven /thirty | and \eight
6 ∧Four
7 Ive ˌgot to ˌdo the /shopping
8 ∨Some of them ·might
Answers to w ritten exercises 209
Chapter 19
Chapter 20
This accent has a distribution for ŋ similar to BBC pronunciation (i.e. a case can
be made for a ŋ phoneme), except that in the case of the participial ‘-ing’ ending
n is found instead of ŋ.
This accent has two additional long vowels (eː, oː) and, correspondingly, two
fewer diphthongs (eɪ, əʊ). This situation is found in many Northern accents.
The fricatives θ, ð, h are missing from the phoneme inventory, and f, v are used
in place of θ, ð. This accent has w where BBC pronunciation has “dark l”. This is
typical of a Cockney accent.
This data is based on the traditional working-class accent of Bristol, where words
of more than one syllable do not usually end in ə. The accent is rhotic, so where
there is an ‘r’ in the spelling (as in ‘mother’) an r is pronounced: where the
spelling does not have ‘r’, an l sound is added, resulting in the loss of distinctiveness
in some words (cf. ‘idea’, ‘ideal’; ‘area’, ‘aerial’).
Here we appear to have three vowels where BBC pronunciation has two: the
word ‘cat’ has the equivalent of æ, ‘calm’ has a vowel similar to ɑː while in the set
of words that have æ in many Northern accents (‘plaster’, ‘grass’, etc.) an
additional long vowel aː is used. This is found in Shropshire.
Answers to recorded exercises
Audio Unit 1
Exercise 2
1 radical • • •
2 emigration • • • •
3 enormous • • •
4 disability • • • • •
5 alive • •
Audio Unit 2
Exercise 2
Audio Unit 3
Exercise 3
Exercise 5
Exercise ^
Audio Unit 4
Exercise 3 b)
Exercise 5
1 ‘debate’ 6 ‘guarded’
2 ‘copied’ 7 ‘dedicated’
3 ‘buttercup’ 8 ‘paddock’
4 ‘cuckoo’ 9 ‘boutique’
5 ‘decayed’ 10 ‘appetite’
Audio Unit 5
Exercise 3
Exercise 4
1 ‘keep’ 11 ‘duck’
2 ‘boat’ 12 ‘cope’
3 ‘cup’ 13 ‘dog’
4 ‘dirt’ 14 ‘coward’
5 ‘bike’ 15 ‘bake’
6 ‘cab’ 16 ‘tied’
7 ‘gate’ 17 ‘beard’
8 ‘cared’ 18 ‘put’
9 ‘tired’ 19 ‘bug’
10 ‘bird’ 20 ‘doubt’
Audio Unit 6
Exercise 2
a) initial position b) medial position c) final position
1 ʃ in ʃəʊ ‘show’    6 v in əʊvə ‘over’    11 ð in ləʊð ‘loathe’
2 θ in θaɪ ‘thigh’    7 ʒ in meʒə ‘measure’    12 v in iːv ‘Eve’
3 z in zuː ‘zoo’    8 s in aɪsɪŋ ‘icing’    13 ʃ in æʃ ‘ash’
4 f in fɑː ‘far’    9 ʃ in eɪʃə ‘Asia’    14 f in rʌf ‘rough’
5 ð in ðəʊ ‘though’    10 h in əhed ‘ahead’    15 θ in əʊθ ‘oath’
Audio Unit 7
Exercise 6
1 juːʒuəl ‘usual’    7 vaɪələns ‘violence’
2 rɪmeɪn ‘remain’    8 emfəsɪs ‘emphasis’
3 eksəsaɪz ‘exercise’    9 dʒentli ‘gently’
4 weərɪŋ ‘wearing’    10 θɪŋkɪŋ ‘thinking’
5 ɜːdʒənt ‘urgent’    11 taɪpraɪtə ‘typewriter’
6 mɪnɪməm ‘minimum’    12 jɪəli ‘yearly’
Audio Unit 8
Exercise 6 (spellings)
1 ‘scraped’ 5 ‘crunched’
2 ‘grudged’ 6 ‘thrones’
3 ‘clothes’ 7 ‘plunged’
4 ‘scripts’ 8 ‘quench’
Audio Unit 9
Exercise 5
1 ˈgɑːdnə ‘gardener’    6 ˈsʌdn ‘sudden’
2 ˈkɒləm ‘column’    7 ˈkæləs ‘callous’
3 ˈhændlz ‘handles’    8 ˈθretnɪŋ ‘threatening’
4 əˈlaɪv ‘alive’    9 pəˈlaɪt ‘polite’
5 prɪˈtend ‘pretend’    10 ˈpʌzl ‘puzzle’
Audio Unit 10
Exercise 1
1 ˈenəmi    6 səbˈtrækt
2 kəˈlekt    7 ˈelɪfənt
3 ˈkæpɪtl    8 əbˈzɜːvə
4 kɑːˈneɪʃn    9 ˈprɒfɪt
5 ˈpærədaɪs    10 ˌentəˈteɪn
1 Shrewsbury 6 Birmingham
2 Polperro 7 Northampton
3 Aberdeen 8 Dundee
4 Wolverhampton 9 Canterbury
5 Aberystwyth 10 Basingstoke
Audio Unit 12
Exercise 3
Audio Unit 13
Exercise 1 (spellings)
1 Colchester 4 Scunthorpe
2 Carlisle 5 Glamorgan
3 Hereford 6 Holyhead
7 Framlingham 9 Cheltenham
8 Southend 10 Inverness
Exercise 2
Exercise 3
Audio Unit 14
Exercise 1
Exercise 2
Note: When recordings of conversational speech are used, it is no longer possible to give
definite decisions about “right” and “wrong” answers. Some problems, points of interest
and alternative possibilities are mentioned.
1 (Careful speech would have had b j u i t i f l or b j u i t i f u l . )
a b j u : t hf l g 3 i l
2 (Careful speech would have d e f i n i t ,
w i s i:m t a haev a d e f n a t w A n d e s
d e f i n o t or d e f n o t ; notice that this speaker uses a glottal stop at the end
o f ‘definite’ so that the transcription - phonetic rather than phonemic -
d e f n a ? would be acceptable. There is a good example of assimilation in
the pronunciation of ‘one there’; as often happens when n and d are com
bined, the n becomes dental n . In addition, the 6 loses its friction - which
is always weak —and becomes a dental nasal, so that this could be tran
scribed phonetically as w A n n e a . )
3 k u d it bi 9 stuil r a i d d n a teibl (Careful speech would have r a i d s d a n a ; the 6
is long, so the symbol is written twice to indicate this.)
Audio Unit 15
Exercise 3
1 ∨one    6 /six
2 \two    7 \now
3 /three    8 ∨you
4 ∧four    9 ∧more
5 \five    10 /us
Audio Unit 16
Exercise 1
Audio Unit 17
Exercise 4
Audio Unit 18
Note: Since these extracts were not spoken deliberately for illustrating intonation, it is
not possible to claim that the transcription given here is the only correct version. There
are several places where other transcriptions would be acceptable, and suggestions about
alternative possibilities are given with some items, in addition to a few other comments.
1 it 'looks like a 'French maga\zine (slight hesitation between ‘looks’ and ‘like’)
2 the 'television 'is plugged ∨in
3 'does your 'colander have a \handle (‘does’ possibly not stressed)
4 a /flap on it
5 'you tell me about /yours (narrow pitch movement on ‘yours’; ‘tell’ may also be
stressed)
6 'well \dark hair
7 ˌmore than ˌhalf /way
8 but er 'not in the \other ·corners
9 a ˌsort of ˌDaily \Sketch ·format ·newspaper (‘sort’ possibly not stressed)
10 'on the \top | 'on the \lid (both pronunciations of ‘on’ might be unstressed)
11 well theyre 'on al∨ternate ·steps | theyre 'not on ∨every ·step
12 'what about the ∨vent | at the \back
13 and a 'ladys \handbag | ˌhanging on a ˌnail on the \wall
14 'you do the \left hand ·bit of the ·picture | and ˌIll do the \right hand ·bit
15 were being 'very par∨ticular | but we 'just haven’t 'hit upon 'one of the
\differences ·yet (stress on ‘just’ is weak or absent)
16 and 'what about your tele\vision | 'two /knobs | in the /front
Audio Unit 19
Exercise 1
| ðeə ˈbɪldɪŋ \wɪn fɑːmz | ˈɔːl ˈəʊvə ði /eəriə | ˌweə wi \lɪv | wi kən siː ˈlɒŋ \laɪnz əv
ðəm | əˌlɒŋ ðə ˌtɒps əv ðə \hɪlz | ən ˈdaʊn baɪ ðə ∨kəʊst | ðər ə ˈwɪn ˈtɜːbaɪnz ˈaʊt
ət /siː | ˈænd əˈlɒŋ ðə \ʃɔː | ðeɪ ˈəʊnli ˈbɪld ðəm ˈweə ðəz ˈplenti əv ∨wɪnd | ∧ɒbviəsli
| wi ˌsɜːtnli ˌget ə ˌlɒt əv \ðæt nɪər ·əs | ju ∨kʊd ·seɪ | ðə ˌlænskeɪps ˌbiːŋ kəmˌpliːtli
træns\fɔːmd | bəp ˈməʊs ˈpiːpl ˈdəʊnt siːm tə \maɪnd |
Audio Unit 20
Note: Transcription of natural speech involves making decisions that have the effect of
simplifying complex phonetic events. The broad transcription given below is not claimed
to be completely accurate, nor to be the only “correct” version.
iwaz 'raida \ frai?mn
bikaz da dara\ skaiz
a di:z \baiskjz
ju ˈriːli \hæv tu
References to reading on specific topics are given at the end of each chapter. The following
is a list of basic books and papers recommended for more general study: if you wish to go
more fully into any of the areas given below you would do well to start by reading these.
I would consider it very desirable that any library provided for students using this book
should possess most or all of the books listed. I give full bibliographic references to the
books recommended in this section.
The best and most comprehensive book in this field is A. C. Gimson’s book originally
titled Introduction to the Pronunciation of English, now in its Seventh Edition edited by A.
Cruttenden with the title The Pronunciation of English (London, Edward Arnold, 2008);
the level is considerably more advanced and the content much more detailed than the
present course. All writers on the pronunciation of British English owe a debt to Daniel
Jones, whose book An Outline of English Phonetics first appeared in 1918 and was last
reprinted in its Ninth Edition (Cambridge University Press, 1975), but the book, though
still of interest, must be considered out of date.
Two other books that approach the subject in rather different ways are G. O. Knowles,
Patterns of Spoken English (London: Longman, 1987) and C. W. Kreidler, The Pronunciation
of English, Second Edition (Oxford: Blackwell, 2004). A. McMahon, An Introduction
to English Phonology (Edinburgh: Edinburgh University Press, 2002) covers the theory of
phonology in more depth than this book: it is short and clearly written. H. Giegerich,
English Phonology: An Introduction (Cambridge: Cambridge University Press, 1992)
is more advanced, and contains valuable information and ideas. I would also recommend
Practical Phonetics and Phonology by B. Collins and I. Mees (Second Edition, London:
Routledge, 2008).
General phonetics
I have written a basic introductory book on general phonetics, called Phonetics in the
series ‘Oxford Introductions to Language Studies’ (Oxford: Oxford University Press, 2002).
There are many good introductory books at a more advanced level: I would recommend
P. Ladefoged, A Course in Phonetics (Fifth Edition, Boston: Thomson, 2006), but see also
the same author’s Vowels and Consonants (Second Edition, Oxford: Blackwell, 2004) or M.
Ashby and J. Maidment, Introducing Phonetic Science (Cambridge: Cambridge University
Press, 2005). Also recommended is Phonetics: The Science of Speech by M. Ball and J.
Rahilly (London: Edward Arnold, 1999). D. Abercrombie, Elements of General Phonetics
(Edinburgh: Edinburgh University Press, 1967) is a well-written classic, but less suitable
as basic introductory reading. J. C. Catford, A Practical Introduction to Phonetics (Oxford:
Oxford University Press, 1988) is good for explaining the nature of practical phonetics;
a simpler and more practical book is P. Ashby, Speech Sounds (Second Edition, London:
Routledge, 2005). J. Laver, Principles of Phonetics (Cambridge: Cambridge University Press,
1994) is a very comprehensive and advanced textbook.
Phonology
Several books explain the basic elements of phonological theory. F. Katamba, An Introduction
to Phonology (London: Longman, 1989) is a good introduction. Covering both this area
and the previous one in a readable and comprehensive way is J. Clark, C. Yallop and
J. Fletcher, An Introduction to Phonetics and Phonology (Third Edition, Oxford: Blackwell,
2007). A lively and interesting course in phonology is I. Roca and W. Johnson, A Course
in Phonology (Oxford: Blackwell, 1999). A recent addition to the literature is D. Odden’s
Introducing Phonology (Cambridge: Cambridge University Press, 2005). The classic work
on the generative phonology of English is N. Chomsky and M. Halle, The Sound Pattern of
English (New York: Harper and Row, 1968); most people find this very difficult.
Accents of English
The major work in this area is J. C. Wells, Accents of English, 3 vols. (Cambridge: Cambridge
University Press, 1982), which is a large and very valuable work dealing with accents of
English throughout the world. A shorter and much easier introduction is A. Hughes, P.
Trudgill and D. Watt, English Accents and Dialects (Third Edition, London: Edward Arnold,
2005). See also P. Foulkes and G. Docherty, Urban Voices (London: Edward Arnold, 1999)
and P. Trudgill, The Dialects of England (Second Edition, Oxford: Blackwell, 1999).
I do not include here books which are mainly classroom materials. Good introductions
to the principles of English pronunciation teaching are M. Celce-Murcia, D. Brinton and
J. Goodwin, Teaching Pronunciation (Cambridge: Cambridge University Press, 1996), C.
Dalton and B. Seidlhofer, Pronunciation (Oxford: Oxford University Press, 1994) and J.
Kenworthy, Teaching English Pronunciation (London: Longman, 1987). M. Hewings,
Pronunciation Practice Activities (Cambridge: Cambridge University Press, 2004) contains
much practical advice. A. Cruttenden’s revision of A. C. Gimson’s The Pronunciation of
English (Seventh Edition, London: Edward Arnold, 2008) has a useful discussion of
requirements for English pronunciation teaching in Chapter 13.
Pronunciation dictionaries
Most modern English dictionaries now print recommended pronunciations for each word
listed, so for most purposes a dictionary which gives only pronunciations and not
meanings is of limited value unless it gives a lot more information than an ordinary
dictionary could. A few such dictionaries are currently available for British English. One is the
Seventeenth Edition of the Cambridge English Pronouncing Dictionary, originally by Daniel
Jones, edited by P. Roach, J. Hartman and J. Setter (Cambridge: Cambridge University
Press, 2006). Jones’ work was the main reference work on English pronunciation for most
of the twentieth century; I was the principal editor for this new edition, and have tried to
keep it compatible with this book. There is a CD-ROM disk to accompany the dictionary
which allows you to hear the English and American pronunciations of any word. Another
dictionary is J. C. Wells, Longman Pronunciation Dictionary (Third Edition, London:
Longman, 2008). See also C. Upton, W. Kretzschmar and R. Konopka (eds.), Oxford
Dictionary of Pronunciation (Oxford: Oxford University Press, 2001). A useful addition to
the list is L. Olausson and C. Sangster, The Oxford BBC Guide to Pronunciation (Oxford:
Oxford University Press, 2006), which makes use of the BBC Pronunciation Research
Unit’s database to suggest pronunciations of difficult names, words and phrases.
Bibliography
Collins, B. and Mees, I. (2008) Practical Phonetics and Phonology, 2nd edn., London:
Routledge.
Couper-Kuhlen, E. (1986) An Introduction to English Prosody, London: Edward Arnold.
Cruttenden, A. (1997) Intonation, 2nd edn., Cambridge: Cambridge University Press.
Cruttenden, A. (ed.) (2008) Gimson’s Pronunciation of English, 7th edn., London: Edward
Arnold.
Crystal, D. (1969) Prosodic Systems and Intonation in English, Cambridge: Cambridge
University Press.
Crystal, D. (2003) English as a Global Language, 2nd edn., Cambridge: Cambridge
University Press.
Crystal, D. and Quirk, R. (1964) Systems of Prosodic and Paralinguistic Features in
English, The Hague: Mouton.
Dalton, C. and Seidlhofer, B. (1994) Pronunciation, Oxford: Oxford University Press.
Dauer, R. (1983) ‘Stress-timing and syllable-timing reanalysed’, Journal of Phonetics,
vol. 11, pp. 51-62.
Davidsen-Nielsen, N. (1969) ‘English stops after initial /s/’, English Studies, vol. 50,
pp. 321-8.
Dimitrova, S. (1997) ‘Bulgarian speech rhythm: stress-timed or syllable-timed?’,
Journal of the International Phonetic Association, vol. 27, pp. 27-34.
Foulkes, P. and Docherty, G. (eds.) (1999) Urban Voices, London: Arnold.
Fox, A. T. C. (1973) ‘Tone sequences in English’, Archivum Linguisticum, vol. 4, pp. 17-26.
Fromkin, V. A. (ed.) (1978) Tone: A Linguistic Survey, New York: Academic Press.
Fudge, E. (1969) ‘Syllables’, Journal of Linguistics, vol. 5, pp. 253-86.
Fudge, E. (1984) English Word Stress, London: Allen and Unwin.
Fudge, E. (1999) ‘Words and feet’, Journal of Linguistics, vol. 35, pp. 273-96.
Giegerich, H. (1992) English Phonology: An Introduction, Cambridge: Cambridge University
Press.
Gimson, A. C. (1964) ‘Phonetic change and the RP vowel system’, in D. Abercrombie et al.
(eds.) In Honour of Daniel Jones, London: Longman, pp. 131-6.
Goldsmith, J. A. (1990) Autosegmental and Metrical Phonology, Oxford: Blackwell.
Halliday, M. A. K. (1967) Intonation and Grammar in British English, The Hague:
Mouton.
Harris, J. (1994) English Sound Structure, Oxford: Blackwell.
Hewings, M. (2004) Pronunciation Practice Activities, Cambridge: Cambridge University
Press.
Hewings, M. (2007) English Pronunciation in Use: Advanced, Cambridge: Cambridge
University Press.
Hirst, D. and di Cristo, A. (eds.) (1998) Intonation Systems, Cambridge: Cambridge
University Press.
Hogg, R. and McCully, C. (1987) Metrical Phonology: A Coursebook, Cambridge: Cambridge
University Press.
O’Connor, J. D. and Arnold, G. F. (1973) The Intonation of Colloquial English, 2nd edn.,
London: Longman.
O’Connor, J. D. and Tooley, O. (1964) ‘The perceptibility of certain word boundaries’, in D.
Abercrombie et al. (eds.) In Honour of Daniel Jones, pp. 171-6, London: Longman.
O’Connor, J. D. and Trim, J. L. (1953) ‘Vowel, consonant and syllable: a phonological
definition’, Word, vol. 9, pp. 103-22.
Odden, D. (2005) Introducing Phonology, Cambridge: Cambridge University Press.
Olausson, L. and Sangster, C. (eds.) (2006) The Oxford BBC Guide to Pronunciation,
Oxford: Oxford University Press.
Pike, K. L. (1943) Phonetics, Ann Arbor: University of Michigan Press.
Pike, K. L. (1945) The Intonation of American English, Ann Arbor: University of Michigan
Press.
Pike, K. L. (1947) Phonemics, Ann Arbor: University of Michigan Press.
Pike, K. L. (1948) Tone Languages, Ann Arbor: University of Michigan Press.
Pullum, G. K. and Ladusaw, W. (1996) Phonetic Symbol Guide, 2nd edn., Chicago: University
of Chicago Press.
Radford, A., Atkinson, M., Britain, D., Clahsen, H. and Spencer, A. (1999) Linguistics: An
Introduction, Cambridge: Cambridge University Press.
Raphael, L. J., Borden, G. and Harris, K. (2006) Speech Science Primer, London:
Lippincott, Williams and Wilkins.
Roach, P. J. (1982) ‘On the distinction between “stress-timed” and “syllable-timed”
languages’, in D. Crystal (ed.) Linguistic Controversies, London: Edward Arnold.
Roach, P. J. (1994) ‘Conversion between prosodic transcription systems: “Standard British”
and ToBI’, Speech Communication, vol. 15, pp. 91-9.
Roach, P. J. (2002) Phonetics, Oxford: Oxford University Press.
Roach, P. J. (2004) ‘Illustration of British English: Received Pronunciation’, Journal of the
International Phonetic Association, vol. 34.2, pp. 239-46.
Roach, P. J. (2005) ‘Representing the English model’, in Dziubalska-Kołaczyk, K. and
Przedlacka, J. (eds.) English Pronunciation Models: a Changing Scene, pp. 393-9,
Basel: Peter Lang.
Roca, I. and Johnson, W. (1999) A Course in Phonology, Oxford: Blackwell.
Sapir, E. (1925) ‘Sound patterns in language’, Language, vol. 1, pp. 37-51.
Schmerling, S. (1976) Aspects of English Sentence Stress, Austin: University of Texas Press.
Shockey, L. (2003) Sound Patterns of Spoken English, Oxford: Blackwell.
Spolsky, B. (1998) Sociolinguistics, Oxford: Oxford University Press.
Taylor, D. S. (1981) ‘Non-native speakers and the rhythm of English’, International Review
of Applied Linguistics, vol. 19, pp. 219-26.
Tench, P. (1996) The Intonation Systems of English, London: Cassell.
Trager, G. and Smith, H. (1951) An Outline of English Structure, Washington: American
Council of Learned Societies.
Trudgill, P. (1999) The Dialects of England, 2nd edn., Oxford: Blackwell.