Why do people use BAT instead of TRAP?

Lexical Sets

Joey Stanley


October 28, 2019

In English sociolinguistics, you’ll often see vowel phonemes represented by a single word in small caps. For example, trap represents /æ/. However, in a lot of American dialectology papers, you’ll see authors use the label bat instead. In this post, I explain why I think these competing labels are used… and why I prefer trap over bat.

See also: “Thoughts on allophonic extensions to Wells’ lexical sets.”

Wells Lexical Sets

As it turns out, the trap label came first. In fact, trap is just one in a set of 24 labels, one for each English vowel. The creator of these lexical sets is John C. Wells, who established them in his 1982 three-volume series, Accents of English. In the preface of each volume, Wells explains a notation system that has since been called the “Wells Lexical Sets.” Because it is brief, I’ll quote it in its entirety (bold and small caps in original):

Words written in capitals

Throughout the work, use is made of the concept of standard lexical sets. These enable one to refer concisely to large groups of words which tend to share the same vowel, and to the vowel which they share. They are based on the vowel correspondences which apply between British Received Pronunciation and (a variety of) General American, and make use of keywords intended to be unmistakable no matter what accent one says them in. Thus ‘the kit words’ refers to ‘ship, bridge, milk…’; ‘the kit vowel’ refers to the vowel these words have (in most accents, /ɪ/); both may just be referred to as kit.

Wells then provides this table:

Table 1: Wells’ original lexical sets. From Wells (1982:xviii–xix).
| RP | GenAm | Keyword | Example words |
|---|---|---|---|
| ɪ | ɪ | 1. kit | ship, sick, bridge, milk, myth, busy |
| e | ɛ | 2. dress | step, neck, edge, shelf, friend, ready |
| æ | æ | 3. trap | tap, back, badge, scalp, hand, cancel |
| ɒ | ɑ | 4. lot | stop, sock, dodge, romp, possible, quality |
| ʌ | ʌ | 5. strut | cup, suck, budge, pulse, trunk, blood |
| ʊ | ʊ | 6. foot | put, bush, full, good, look, wolf |
| ɑː | æ | 7. bath | staff, brass, ask, dance, sample, calf |
| ɒ | ɔ | 8. cloth | cough, broth, cross, long, Boston |
| ɜː | ɜr | 9. nurse | hurt, lurk, urge, burst, jerk, term |
| iː | i | 10. fleece | creep, speak, leave, feel, key, people |
| eɪ | eɪ | 11. face | tape, cake, raid, veil, steak, day |
| ɑː | ɑ | 12. palm | psalm, father, bra, spa, lager |
| ɔː | ɔ | 13. thought | taught, sauce, hawk, jaw, broad |
| əʊ | o | 14. goat | soap, joke, home, know, so, roll |
| uː | u | 15. goose | loop, shoot, tomb, mute, huge, view… |
| aɪ | aɪ | 16. price | ripe, write, arrive, high, try, buy |
| ɔɪ | ɔɪ | 17. choice | adroit, noise, join, toy, royal |
| aʊ | aʊ | 18. mouth | out, house, loud, count, crowd, cow |
| ɪə | ɪ(r | 19. near | beer, sincere, fear, beard, serum |
| ɛə | ɛ(r | 20. square | care, fair, pear, where, scarce, vary |
| ɑː | ɑ(r | 21. start | far, sharp, bark, carve, farm, heart |
| ɔː | ɔ(r | 22. north | for, war, short, scorch, born, warm |
| ɔː | o(r | 23. force | four, wore, sport, porch, borne, story |
| ʊə | ʊ(r | 24. cure | poor, tourist, pure, plural, jury |

Later on in the book (pp. 122–124), Wells compares Received Pronunciation and General American English and goes into more detail about the principle behind the lexical sets:

When we compare the pronunciation of particular words in the two accents, we find that in many respects there is a good match: for example, almost all words that have /iː/ in RP have the corresponding /i/ in GenAm, and vice versa: thus creep, sleeve, key, people and hundreds of other words. Likewise /aɪ/, transcribed identically for the two accents, and used in both cases for ripe, arrive, high, try and many other words…

Investigation shows that… we can successfully match the vowels in RP and GenAm forms of particular words for the vast bulk of the vocabulary…

This matching furnishes us with the framework of standard lexical sets which we use not only for comparing RP and GenAm but also for describing the lexical incidence of vowels in all the many accents we consider in this work. It turns out that for vowels in strong (stressed or stressable) syllables there are twenty-four matching pairs of RP and GenAm vowels. We identify each pair, and each standard lexical set of words whose stressed syllable exhibits the correspondence in question, by a keyword, which we shall always write in small capitals. Thus the correspondence between RP /iː/ and GenAm /i/ is the basis for the standard lexical set fleece.

In the rest of this work standard lexical set keywords will also be used to refer to (i) any or all of the words belonging to the standard lexical set in question; and (ii) the vowel sound used for the standard lexical set in question in the accent under consideration. Rather than using expressions such as ‘short i’, for example, we shall speak of the kit vowel or simply of kit.

If you’re unfamiliar with these labels, I encourage you to look at Wells’ book. He explains each of these lexical sets in greater detail on pages 122–168 of Volume I.

An Alternative Lexical Set: the b_t frame

The Wells sets are very useful, but for some reason, they have not been universally adopted. Several researchers have opted to use an alternative set of labels that takes advantage of a large minimal set in English, the b_t frame.

I did some digging in old volumes of American Speech and the Publications of the American Dialect Society to see when these labels were first used. The earliest instance I could find goes all the way back to Sumner Ives’ 1954 study called The Phonology of the Uncle Remus Stories, which was the 22nd volume in the PADS series. On page 6, the author states that the following words are to refer to English vowels: beet, bit, bait, bet, bat, not, bought, boat, put, boot, but, curt, bite, bout, boy, and above. This set is remarkably close to what some researchers use today!

Sumner Ives. 1954. The Phonology of the Uncle Remus Stories. Publication of the American Dialect Society 22(1): 3–59. https://doi.org/10.1215/-22-1-3

In contemporary sociolinguistics, I believe the b_t frame was popularized by Erik Thomas & Malcah Yaeger-Dror’s 2009 edited volume, African American English Speakers and Their Participation in Local Sound Changes: A Comparative Study. In the introduction, starting on page 8 and spilling into page 9, they say that they

…found that it would be helpful to formulate a convention to unify the text and simplify the reader’s task; with that thought in mind, we have suggested that authors use neither a phonological / / nor a variable ( ) presentation, both of which differ in conventions from author to author. We have chosen instead to refer to a given vowel class using keywords, following the principle behind Wells (1982). To further simplify, we turned to Ladefoged’s (2005) choice of keyword paradigm, which uses words that are as untrammeled by their consonantal environment as possible. To obtain these keywords, he chose an h_d frame, to have his speakers “say heed again.”

To minimize the need for varying the “carrier” environment, in each case, the vowel being focused on here will be a b_t paradigm.

Erik Thomas & Malcah Yaeger-Dror, eds. 2009. African American English Speakers and Their Participation in Local Sound Changes: A Comparative Study. Publication of the American Dialect Society 94.

Table 2: The lexical set based on the b_t frame. This is a subset of the table by Thomas & Yaeger-Dror (2009:6).
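| IPA | Keyword |
|---|---|
| /i/ | beet |
| /ɪ/ | bit |
| /e/ | bait |
| /ɛ/ | bet |
| /æ/ | bat |
| /ɑ/ | bot |
| /ɔ/ | bought |
| /o/ | boat |
| /ʌ/ | but |
| /ʊ/ | book |
| /u/ | boot |
| /aɪ/ | bite |
| /aʊ/ | bout |
| /ɔɪ/ | boy |
| /ɚ/ | bird |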

They end with this statement:

We hope that this convention will permit the reader to follow all the authors without difficult transitioning between chapters.

It appears that their goal of continuity has extended beyond their volume because the set was used in later volumes of the Publications of the American Dialect Society. For instance, here are the remarks by the editors of Speech in the Western States: Volume 1: The Coastal States (Fridland et al. 2016):

For the purpose of clarity and continuity, authors use the conventions of the International Phonetic Alphabet throughout the chapters, though, in many cases, keywords in the b_t frame are used to highlight particular word classes and subclasses, following other recent PADS volumes (Thomas & Yaeger-Dror 2009). These frames are built upon those made for comparative study of English dialects by Wells (1982) but have been adapted to allow representation of the particular vowel changes and conditioning environments of interest to the present study of the U.S. West.

The description in Speech in the Western States: Volume 2: The Mountain West (Fridland et al. 2017) is almost identical.

Valerie Fridland, Tyler Kendall, Betsy E. Evans, & Alicia Beckford Wassink, eds. 2016. Speech in the Western States, Vol. 1, The Coastal States. Publication of the American Dialect Society 101.

Table 3: The lexical set used in the Speech in the Western States volumes. From Fridland et al. 2016:3 and Fridland et al. 2017:5; Table 1.1 in both. The columns have been rearranged for consistency within this blog post.
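| IPA | Wells Keyword | b_t Keyword |
|---|---|---|
| ɪ | kit | bit |
| ɛ | dress | bet |
| æ | trap | bat |
| ɑ ~ a | lot | bot |
| ɔ ~ a | cloth/thought | bought |
| ʌ | strut | but |
| ʊ | foot | book |
| ɚ | nurse | burt |
| i | fleece | beet |
| e | face | bait |
| o | goat | boat |
| u | goose | boot |
| aɪ | price | bite |
| ɔɪ | choice | boy |
| aʊ | mouth | bout |
| ɪɹ ~ iɹ | near | beer |
| ɚɹ | square | bare |
| ɑɹ | start | bar |
| ɔr / or | north/force | bore |
| ʊɹ | cure | burr |
| əɹ | letter | |
| ə | comma | |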

The only difference is that bird was changed to burt.
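If you need to move between the two labeling systems in your own data, the correspondence in Table 3 can be written down as a simple lookup. This is only a rough sketch in Python: the names BT_TO_WELLS, ALIASES, and to_wells are my own hypothetical choices, not anything from the PADS volumes, and the pairs are just copied from the table.

```python
# Illustrative only: keyword pairs copied from Table 3.
# The names BT_TO_WELLS, ALIASES, and to_wells are hypothetical.
BT_TO_WELLS = {
    "bit": "kit",
    "bet": "dress",
    "bat": "trap",
    "bot": "lot",
    "bought": "cloth/thought",  # one b_t label covers two Wells sets
    "but": "strut",
    "book": "foot",
    "burt": "nurse",
    "beet": "fleece",
    "bait": "face",
    "boat": "goat",
    "boot": "goose",
    "bite": "price",
    "boy": "choice",
    "bout": "mouth",
    "beer": "near",
    "bare": "square",
    "bar": "start",
    "bore": "north/force",  # likewise covers two Wells sets
    "burr": "cure",
}

# Table 2 spells the nurse keyword "bird"; Table 3 spells it "burt".
ALIASES = {"bird": "burt"}


def to_wells(bt_keyword: str) -> str:
    """Return the Wells keyword(s) for a b_t keyword, ignoring case."""
    key = bt_keyword.strip().lower()
    key = ALIASES.get(key, key)
    return BT_TO_WELLS[key]  # raises KeyError for labels not in Table 3
```

For example, to_wells("BAT") gives "trap". Note that the lookup only works cleanly in this direction: because bought and bore each stand for two Wells sets, going from Wells labels back to the b_t frame loses the cloth/thought and north/force distinctions.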

Outside of the PADS volumes, this frame was also used in McCarthy (2011), which explicitly states that these labels were used because of Thomas & Yaeger-Dror (2009).

I don’t know why the b_t frame was designed when the Wells lexical sets were already established. Perhaps the draw was a nearly complete minimal set that contrasts all English vowels. Maybe it’s because the words in the b_t frame are shorter, which makes for less cluttered visualizations and written prose. Thomas & Yaeger-Dror did say that the use of the consonants /b/ and /t/ in the keywords helped reduce the effects of surrounding consonants on the vowels themselves. Ultimately though, I’m not sure.

My thoughts on the b_t frame

Hot take: I don’t think the b_t frame is any more useful than IPA. Hear me out:

Imagine you’re at a conference talking about the low back merger and you yourself have the merger. Since you don’t naturally differentiate the words bot and bought, you struggle to explain the differences that may exist in your study. Even if you do distinguish the sounds, many people in your audience may not, so they’ll have a hard time understanding. This was one of the reasons Wells came up with his lexical sets: neither you nor your audience would have much difficulty understanding the words lot or thought since, as far as I know, /lɔt/ and /θɑt/ are not English words.

Even if you’re not talking about a merger, if you are someone with particularly shifted vowels, when you say an isolated, ambiguous token like [bɛ̞t] or [bæ̙t], it may not be immediately clear to listeners of other dialects which vowel you’re talking about.

The words in the b_t frame may be “untrammeled by their consonantal environment,” but I don’t know if the lack of transition formants makes for the most effective label in speech or writing. Keywords are labels to refer to large lexical sets, so while they may not make for ideal tokens when collecting phonetic data, they still need to serve the purpose of unambiguously identifying a vowel phoneme.

In fact, Wells designed his original set specifically so that it would not use the b_t frame:

The keywords have been chosen in such a way that clarity is maximized: whatever accent of English they are spoken in, they can hardly be mistaken for other words. Although fleece is not the commonest of words, it cannot be mistaken for a word with some other vowel; whereas beat, say, if we had chosen it instead, would have been subject to the drawback that one man’s pronunciation of beat may sound like another’s pronunciation of bait or bit. (Wells 1982:123)

I question the usefulness of the b_t frame. It’s convenient that a common enough English word can be created by inserting almost any vowel into the template, but I don’t know if this large minimal set makes for the most unambiguous lexical set. When he introduces his keywords, Wells says that they are “intended to be unmistakable no matter what accent one says them in.” This property is not retained in the b_t set of keywords. In fact, creating a set based on minimal pairs defeats the very purpose of a lexical set.

To put it another way, just because we call something the cot-caught merger doesn’t mean we should refer to the entire lexical sets as cot and caught. In fact, I think we should actively avoid referring to them as cot and caught for the very reason that they do form a minimal pair.

It’s my impression that researchers on non-American varieties of English use Wells’ original lexical sets without any problems. Using a different set is potentially confusing and may isolate us from studies on other World Englishes.

To be clear, I am in no way criticizing the researchers who came up with or use the b_t frame. If you’re familiar with what I do, you’ll know that their work is highly relevant to my own studies, and I cite a lot of studies that use the b_t frame. Their work is excellent and I model my own work after theirs. I just think the labels could be clearer.


This reminds me of how some people use the Labovian transcription system instead of standard IPA. See Josef Fruehwald’s blog post on those.

There are two competing systems of lexical sets being used in American dialectology: the Wells lexical set and the b_t frame. To answer my titular question of why people use bat instead of trap… I don’t really know. I think it may largely depend on what university the work is coming out of. But I think Wells’ original set may be a little better.

PS: Regardless of which system you use, I think we should make sure we use small caps instead of ALL CAPS or even Capitalized Small Caps. It’s truer to Wells’ original notation and I think it just looks a lot better typographically.

PPS: To my knowledge, Wells never intended for the lexical set labels (or even the example words in the original explanation) to be ideal tokens for eliciting the vowels they represent. So, while the b_t labels might be “untrammeled by their consonantal environment,” the Wells labels are not. So, there’s probably no need to elicit the words fleece, kit, face, dress, etc. when getting tokens of these vowels.

Update: See “Thoughts on allophonic extensions to Wells’ lexical sets” for further musings, ramblings, and recommendations for non-canonical extensions to Wells’ lexical sets when referring to allophones.