Glossary of Pronunciation Terms
Accentedness: This describes listeners’ judgment of whether a speaker’s pronunciation fits into
the norms of standard pronunciation that they expect to hear. (We ask, “Does this person have a
noticeable accent?”)
Allophones: Variations of a phoneme that are still heard to be the same sound are
called allophones of the same phoneme. They’re different sounds that function as the same
sound. Changing from one allophone to another doesn’t change meaning, although it may make
the word sound strange.
Alphabetic Principle: The understanding that written words are composed of letters, and the
letters represent the sounds of spoken words.
Articulatory system: The parts of the body that are used in producing sounds.
These parts include:
1. Lips
2. Teeth
3. Alveolar ridge (tooth ridge)
4. Hard palate
5. Soft palate (velum)
6. Nasal cavity
7. Tongue
8. Jaw
9. Vocal cords and glottis
Aspiration: The puff of air that is produced with some sounds. In English, voiceless stops are
often aspirated.
Sounds that are pronounced with this puff of air are called aspirated sounds.
Sounds that are pronounced without this puff of air are called unaspirated sounds.
Assimilation: A sound change in which one sound becomes more similar to a sound that comes
before or after it. This makes the words easier to pronounce.
Auditory learning modality: Learning through hearing.
Authentic materials: Materials that were created for “real life” purposes, not for teaching, such
as newspapers, magazines, TV or radio programs, movies, advertisements, and poems.
Bottom-up processing: When we listen to individual sounds to figure out what words we’re
hearing and decode the meaning of the message, we’re using bottom-up processing.
Citation form of a word: The pronunciation of a word when it is said carefully, and usually
alone. For example, the citation form of and is /ænd/, and the citation form of to is /tuw/.
Closed syllable: A syllable that ends in a consonant sound, like sun, bat, made, or the last
syllable in return.
Communicability: This describes how well a speaker’s pronunciation lets him/her function and
communicate in the real-life situations he/she faces. (We ask, “Can this person communicate?”)
Consonant: A sound in which the air stream meets some obstacles on its way up from the
lungs. Words like “big,” “map,” and “see” begin with consonants.
Consonant blends or consonant clusters: Combinations of letters that represent a sequence
of sounds, such as “str” in “street” or “mp” in “lamp.”
Consonant chart: A table that shows all the consonants of a language, categorizing them by
place of articulation, manner of articulation, and voicing.
Content words: Words that have lexical meaning, not grammatical meaning, such as nouns,
verbs, adjectives, adverbs, and question words.
Contrastive analysis: Comparing the sound systems of L1 and L2 to determine what sounds
might be most difficult for learners. Sounds that are different can cause more problems and need
more teaching time. (The focus is on sounds.)
Contrastive stress: When a sentence contains two words that contrast with each other, they
often both receive stress. For example, “Today isn’t Tuesday; it’s Wednesday.”
Decoding skills: The ability to recognize words that follow predictable spelling patterns—to
“sound out” words by putting together their sounds.
Deletion: A sound change in which a sound may disappear or not be clearly pronounced in
certain contexts. Deletion is also called omission, elision, or ellipsis.
Dialect or variety of a language: A form of a language that is associated with a particular place
or social group. A dialect can have its own pronunciation, vocabulary, and grammar.
Digraph: A combination of two graphemes (letters) that together represent one sound. In
English, “sh” is a digraph that represents /ʃ/.
Distinctive feature analysis: A way of analyzing two languages to determine which sound
features are different, to determine what to emphasize in pronunciation teaching. (The focus is
on features that might affect many different sounds, such as voicing, aspiration, or nasalization,
rather than on single sounds.)
Drama techniques: Techniques borrowed from acting, such as breathing practice exercises,
voice warm-ups, role play, and skits.
Duration: The length of time that a sound lasts.
Emphatic stress: When a speaker stresses one word in a sentence because it’s important and
he/she wants to emphasize it. For example, “No, I do not want to eat a dead lizard.”
English as a Foreign Language (EFL): English is being taught in a country where English is not
commonly spoken.
English as an International Language (EIL): English instruction is designed for learners who
will need to communicate with people from many different backgrounds, both native and non-
native speakers.
English as a Second Language (ESL): English is being taught in a country where English is the
main language.
Epenthetic vowel: An extra vowel sound that is added between other sounds. Its purpose is
often to separate two sounds that are very similar. The process of adding an extra sound,
either a vowel or a consonant, to an existing string of sounds is called epenthesis.
Final position: Occurring at the end of a word. For example, in the word cat, the sound /t/ is in
final position.
Flapped /t/ (or tapped /t/): A sound made when the tongue taps the alveolar ridge very quickly,
so that it sounds like a quick /d/. This is called an alveolar flap or tap, and it is represented by this
symbol: [ɾ]. It’s a voiced sound.
Fossilization: A process that occurs when a language learner progresses to a certain point and
has a hard time making further progress. He/she keeps making the same mistakes over and
over. The mistakes seem frozen in time, like a dinosaur fossil.
Function words: Words that have grammatical meaning rather than lexical meaning, such as
articles, pronouns, prepositions, and conjunctions. They don’t have much meaning in
themselves; they show the relationship between other words.
Functional load: A way of measuring how important the contrast between two sounds is in a
language. If there are many minimal pairs with two sounds, the contrast between those sounds
has a high functional load. If there are few minimal pairs with those two sounds, then the contrast
has a low functional load. It’s usually more useful to teach contrasts with a high functional load
than those with a low functional load.
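As a rough illustration of the idea, the sketch below counts minimal pairs for a given pair of sounds in a tiny word list. The word list, the phoneme spellings, and the function names are all made up for this example; a real estimate of functional load would need a full pronouncing dictionary and would usually take word frequency into account.

```python
from itertools import combinations

# Hypothetical mini-lexicon: each word is written as a tuple of phoneme symbols.
LEXICON = {
    "late": ("l", "ey", "t"),
    "rate": ("r", "ey", "t"),
    "light": ("l", "ay", "t"),
    "right": ("r", "ay", "t"),
    "lead": ("l", "iy", "d"),
    "read": ("r", "iy", "d"),
    "think": ("θ", "ɪ", "ŋ", "k"),
    "sink": ("s", "ɪ", "ŋ", "k"),
}


def is_minimal_pair(a, b):
    """True if two phoneme sequences have the same length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def count_minimal_pairs(lexicon, sound1, sound2):
    """Count word pairs in the lexicon that differ only by sound1 versus sound2."""
    target = {sound1, sound2}
    count = 0
    for (_, p1), (_, p2) in combinations(lexicon.items(), 2):
        if is_minimal_pair(p1, p2):
            # Find the single position where the two words differ.
            differing = next((x, y) for x, y in zip(p1, p2) if x != y)
            if set(differing) == target:
                count += 1
    return count


print(count_minimal_pairs(LEXICON, "l", "r"))  # /l/ vs. /r/ in this toy list
print(count_minimal_pairs(LEXICON, "θ", "s"))  # /θ/ vs. /s/ in this toy list
```

In this toy list, the /l/ and /r/ contrast separates more word pairs than the /θ/ and /s/ contrast, so it would count as having the higher functional load here.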
Glottal stop: A sound produced by closing the vocal cords tightly and releasing them quickly,
like the beginning of a small cough, or the middle sound when we say “huh-uh” to mean “no.” It’s
represented by this symbol: [ʔ].
Grapheme: A written symbol that represents a sound. In English, the letters of the alphabet are
graphemes.
Homonyms or homophones: Words that sound alike but are spelled differently, such
as meet and meat or write, right, and rite.
Inflections: Forms of words that change to show a grammatical category, such as tense or
number. In English, these are inflectional suffixes, or endings that have some grammatical
meaning (as in want/wanted, cat/cats, eat/eating, or hot/hotter).
Initial position: Occurring at the beginning of a word. For example, in the word cat, the sound /k/
is in initial position.
Intonation: The pitch pattern of a sentence–the up-and-down “melody” of your voice as you
speak.
Intelligibility or comprehensibility: Both describe whether it is easy for listeners to understand
what a speaker is saying. They both imply that the speaker’s accent does not distract or cause
problems for listeners. (We ask, “Can people understand this person?”)
International Phonetic Alphabet (IPA): A system of symbols developed in the late 1800s to
represent all the sounds that are used in human languages. Variations of IPA are used in many
dictionaries and textbooks, although most of them are not exactly like “real” IPA.
Intervocalic: In a position between two vowels, such as /t/ in water or /m/ in among.
“Invisible /y/”: When the letter “u” represents the sound /yuw/, as in cube or music, we say
there’s an “invisible /y/.” There’s no separate letter that represents /y/.
Kinesthetic learning modality: Learning through doing--through body movements and
manipulating objects and tools.
Lateral: A description of a sound that is produced with the sides of the tongue open, like /l/.
Learning modalities: Different ways that people learn and understand new information:
Auditory: Learning through hearing.
Visual: Learning through seeing.
Tactile: Learning through touching.
Kinesthetic: Learning through doing--through body movements and manipulating
objects and tools.
Linguistics, linguist: Linguistics is the systematic study of language and how it works. Linguists
are people who study linguistics. (They’re not necessarily people who speak a lot of languages.
Those are polyglots.)
Linking: In normal speech, the last sound of one word is often linked or blended with the first
sound of the next word so that the two words sound like one unit.
Lip rounding: A description of whether the lips are rounded, relaxed, or stretched wide when a
vowel sound is pronounced.
Manner of articulation: A description of how we produce a particular consonant sound:
Stop/plosive: The air stream is blocked completely before it is released, like a tiny
explosion.
Fricative: The air stream is compressed and passes through a small opening, creating
friction—a hissing sound.
Affricate: A combination of a stop followed by a fricative—an explosion with a slow
release.
Nasal: The air passes through the nose instead of the mouth.
Liquid: The air stream moves around the tongue in a relatively unobstructed manner.
Glide/semivowel: The sound is like a very quick vowel.
Medial position: Occurring in the middle of a word. For example, in the word cat, the sound /æ/
is in medial position.
Merging: When a learner can’t hear or pronounce the difference between two similar sounds in a
new language, he/she may pronounce both sounds the same, as one of the sounds of his/her
first language.
Minimal pair: Two words that differ by just one sound, for example, late and rate, beat and bit,
sat and sap.
Morphology: The study of the forms of words and the different parts that are put together to
make words (in English, these are prefixes, suffixes, and word roots). It includes:
Inflectional morphology: Adding grammatical endings to words to make a different form
of the same word. (For example, work + ing >> working, class + es >> classes)
Derivational morphology: Putting word parts together (roots, prefixes, suffixes) to make
new words. (For example, work + er >> worker, un + happy >> unhappy, class + room >>
classroom)
Neurolinguistic programming: A technique from psychology that is concerned with the
connection between the body, thoughts, and emotions. It says that if we want to change one of
these things, we need to change the others. It combines relaxation and multisensory techniques
to increase learners’ awareness of their pronunciation and then to change it in positive ways.
North American English (NAE): The standard form of English spoken in the United States and
Canada (although there are slight differences between U.S. and Canadian English).
Open syllable: A syllable that ends in a vowel sound, like go, eye, through, or the last syllable
in party.
Oral cavity: The space inside the mouth.
Part of speech: A grammatical category for describing words. In English, there are nouns,
verbs, adjectives, adverbs, pronouns, prepositions, conjunctions, articles, and interjections.
Peak of a syllable: The central part of a syllable. It’s usually a vowel, but it could also be a
syllabic consonant. For example, in the one-syllable word cat, the peak is the vowel /æ/.
Phonemes: The distinctive sounds of a language–the sounds that a native speaker of the
language considers to be separate sounds. Changing from one phoneme to another changes the
meaning of the word. Sometimes it makes a word meaningless.
(A sound feature) is phonemic: This means that changing this feature makes a
difference in both sound and meaning. It changes one sound into another. For example,
aspiration of stops is not phonemic in English. (An aspirated /t/ is still /t/.) It is phonemic
in Korean, Thai, and many other languages.
Phonemic alphabet: A set of symbols that represent the distinctive sounds of a language. One
symbol represents exactly one phoneme. Many textbooks and dictionaries use phonemic
alphabets.
Phonetic alphabet: A set of symbols, such as the International Phonetic Alphabet, that tries to
represent all the possible sounds of human languages, not just the sounds of one language. A
full phonetic alphabet would be too complex to use in textbooks and dictionaries.
Phonetics: The study and classification of the sounds of human speech—not just the system of
one language.
Phonemic awareness: The understanding that words are made up of individual sounds that
can be separated, counted and rearranged.
Phonics: The study of the relationship between written letters and spoken sounds. Also, a way
of teaching people to read that emphasizes the relationship between written letters and spoken
sounds of the language.
Phonologist: A person who studies the sounds of languages, how they are produced, and how
they work together as a system in a particular language.
Phonology: The study of speech sounds in language, how they are produced, and how they
work together as a system in a particular language.
Pitch: A measure of how high or low the voice is at a particular point. (This means high or low in
the sense that a musical note is high or low; it doesn’t mean a high or low volume.)
Place of articulation: A description of which parts of the vocal apparatus are working when we
produce a particular consonant sound.
Bilabial: Both lips touch or almost touch.
Labiodental: The upper teeth touch the lower lip.
Dental/Interdental: The tip of the tongue touches the teeth/between the teeth.
Alveolar: The tip of the tongue touches or almost touches the alveolar ridge (tooth
ridge).
Palatal/alveopalatal: The body of the tongue touches or almost touches the hard palate.
Velar: The back of the tongue touches the soft palate.
Glottal: There is friction in the glottis (the space between the vocal cords).
Polysyllabic word: A word with more than one syllable.
Prefix: A word part that is placed before a word root to change its meaning (as in
happy/unhappy or port/transport).
Prosody: The patterns of intonation and stress in a language (some of the suprasegmental
features). The term prosody is often used in talking about poetry.
Received Pronunciation (RP): The standard form of British English pronunciation, based on
educated speech in southern England. It is also called “The Queen’s English” or “The King’s
English.” (Actually, very few people in the UK speak RP.)
Reduced form of a word: The pronunciation of a word when it is said in normal speech at a
normal speed, and it is not being stressed. For example, the reduced form of and is /æn/ or /n/,
and the reduced form of to is /tə/.
Resyllabification: A way of making consonant clusters easier to pronounce by splitting up a
consonant cluster so that the last consonant goes with the syllable after it.
Retroflex: A description of a sound that is made with the tongue curled slightly backwards, as in
some pronunciations of /r/.
Rhotic and nonrhotic dialects: If speakers pronounce the phoneme /r/ after a vowel in words
like car and butter, the dialect is called rhotic. Most varieties of American English are rhotic. If
people don’t pronounce /r/ after a vowel, the dialect is called nonrhotic. Many varieties of British
and Australian English are nonrhotic.
Rhythm: The regular, patterned beat of stressed and unstressed syllables and pauses when
people speak a language.
Schwa: A mid-central, lax vowel represented by the symbol /ə/. Many vowel sounds change to
/ə/ when they’re not stressed.
Segmental features of pronunciation: The individual sounds (phonemes) of a language–
vowels and consonants.
Segmentation: Breaking up a stream of sounds into the words that are being said. For example,
when we hear /ətæks/, we might interpret it as attacks or a tax, depending on the context.
Sentence stress, prominence, or focus: One syllable in each thought group that receives
more stress than the others. It is often the stressed syllable of the last content word in the
sentence or thought group.
Shadowing and mirroring: Pronunciation practice techniques in which students mimic a
recording, such as a video clip, trying to speak in exactly the same way as the actors.
Sight words: Words with spellings that do not follow predictable patterns and have to be
memorized individually, like eye, eight, or would.
Silent letter: A letter that is seen in the spelling of a word, but does not represent a sound, such
as k and e in the word knife or gh in the word night.
Simple vowels, glided vowels, and diphthongs: Categories of vowels based on whether the
tongue moves during the pronunciation of the vowel.
If the tongue stays in one position during a vowel, it’s a simple vowel.
If the tongue position changes just a little, it’s a glided vowel.
If the tongue position changes a lot, so it sounds like two separate vowels mashed
together, it’s a diphthong.
Stress-timed language: A language in which the time between stressed syllables remains fairly
steady, and the unstressed syllables have to crowd in between them. English is a stress-timed
language.
Stressed syllable: One syllable in a word that is emphasized. It can be longer, louder, clearer,
and higher in pitch than the others. It stands out from the other syllables. The syllables of a word
may have one of three degrees of stress:
Strongly stressed (also called primary stress)
Lightly stressed (also called secondary stress)
Unstressed (also called tertiary stress)
Substitution: When a learner can’t pronounce an unfamiliar sound in a new language, he/she
may substitute a similar sound from his/her first language that’s easier to pronounce. For
example, speakers of many languages might substitute /s/ for /θ/ or /b/ for /v/.
Suffix: A word part that is placed after a word root to change its meaning (as in open/opened or
nation/national).
Suprasegmental features of pronunciation: Aspects of pronunciation that affect more than just
one sound segment, such as stress, rhythm, and intonation–the musical aspects of
pronunciation.
Syllabic consonant: A consonant that is stretched out so that it becomes a syllable. The
consonants /n/, /l/, and /r/ can sometimes be a full syllable by themselves. This most often
happens after a stressed syllable that ends in an alveolar consonant.
Syllable: A rhythmic unit in speech—a unit of sound that gets one “beat” in a word. A syllable
must have a vowel (or a syllabic consonant). It might also have one or more consonants before
the vowel and one or more consonants after it. For example, the word banana has three
syllables: ba-na-na.
Syllable-timed languages: Languages that give each syllable about the same amount of time.
Japanese, Korean, Chinese, and Spanish are often described as syllable-timed languages.
Thought group: A group of spoken words that form a grammatical and semantic unit. It is often
a sentence, a clause, or a phrase–a chunk of language that feels like a logical unit. Because
each thought group has its own intonation contour, a thought group can also be called
an intonation unit.
Tense and lax vowels: A description of whether the muscles of the tongue are relatively tense
or more relaxed when we say a vowel sound. Although this is not entirely accurate, it can be a
useful way of thinking about these sounds.
Tongue Position: A description of where the highest, tensest, or most active part of the tongue
is when we pronounce a vowel sound. We describe:
Vertical position: High, mid, or low
Horizontal position: Front, central, or back
Top-down processing: When we listen for the overall meaning of what we hear, using our
background knowledge of the topic to create expectations about what it must all mean, we’re
using top-down processing.
Unreleased stop: When we start to say a stop by blocking off the flow of air in our mouth, but we
don’t release the air, it’s called an unreleased stop. These are very common at the ends of
words.
Visual learning modality: Learning through seeing.
Voice quality: The overall characteristics of a speaker’s voice, such as average pitch, tenseness
of the muscles of the throat and vocal tract, or whether the speaker’s voice sounds breathy,
nasal, etc.
Voiced sound: A sound that is produced with vibration of the vocal cords.
Voiceless sound: A sound that is produced without vibration of the vocal cords.
Vowel: A sound in which the air stream moves out very smoothly. Words like “apple,” “east,”
“over,” and “out” begin with vowels.
Vowel quadrant: A diagram showing the tongue position for the vowels of a language.
Vowel reduction: In English, a vowel in an unstressed syllable usually becomes weaker, softer,
and less distinct than a full vowel. In this case, we call it a reduced vowel. Reduced vowels in
English often sound like /ə/.
Contractions and blends: Two words that blend together to make a shorter word. If the two-
word combination is written as one word with an apostrophe, we call it a contraction, such
as isn’t, that’s, or I’m. If the combination is not commonly written as one word, we call it a blend,
such as /watəl/ for what will.
Word root: A part of a word that carries its basic meaning. It might occur alone (as in dog, care,
or cover), or with one or more prefixes or suffixes (as in dogs, careful, uncover, or discoverer).