In English sociolinguistics, you’ll often see vowel phonemes represented by a single word in small caps. For example,
See also: “Thoughts on allophonic extensions to Wells’ lexical sets.”
Wells Lexical Sets
As it turns out the
Words written in capitals
Throughout the work, use is made of the concept of standard lexical sets. These enable one to refer concisely to large groups of words which tend to share the same vowel, and to the vowel which they share. They are based on the vowel correspondences which apply between British Received Pronunciation and (a variety of) General American, and make use of keywords intended to be unmistakable no matter what accent one says them in. Thus ‘the kit words’ refers to ‘ship, bridge, milk…’; ‘the kit vowel’ refers to the vowel these words have (in most accents, /ɪ/); both may just be referred to as kit.
Wells then provides this table:
RP | GenAm | No. | Keyword | Examples |
---|---|---|---|---|
ɪ | ɪ | 1. | kit | ship, sick, bridge, milk, myth, busy… |
e | ɛ | 2. | dress | step, neck, edge, shelf, friend, ready… |
æ | æ | 3. | trap | tap, back, badge, scalp, hand, cancel… |
ɒ | ɑ | 4. | lot | stop, sock, dodge, romp, possible, quality… |
ʌ | ʌ | 5. | strut | cup, suck, budge, pulse, trunk, blood… |
ʊ | ʊ | 6. | foot | put, bush, full, good, look, wolf… |
ɑː | æ | 7. | bath | staff, brass, ask, dance, sample, calf… |
ɒ | ɔ | 8. | cloth | cough, broth, cross, long, Boston… |
ɜː | ɜr | 9. | nurse | hurt, lurk, urge, burst, jerk, term… |
iː | i | 10. | fleece | creep, speak, leave, feel, key, people… |
eɪ | eɪ | 11. | face | tape, cake, raid, veil, steak, day… |
ɑː | ɑ | 12. | palm | psalm, father, bra, spa, lager… |
ɔː | ɔ | 13. | thought | taught, sauce, hawk, jaw, broad… |
əʊ | o | 14. | goat | soap, joke, home, know, so, roll… |
uː | u | 15. | goose | loop, shoot, tomb, mute, huge, view… |
aɪ | aɪ | 16. | price | ripe, write, arrive, high, try, buy… |
ɔɪ | ɔɪ | 17. | choice | adroit, noise, join, toy, royal… |
aʊ | aʊ | 18. | mouth | out, house, loud, count, crowd, cow… |
ɪə | ɪ(r | 19. | near | beer, sincere, fear, beard, serum… |
ɛə | ɛ(r | 20. | square | care, fair, pear, where, scarce, vary… |
ɑː | ɑ(r | 21. | start | far, sharp, bark, carve, farm, heart… |
ɔː | ɔ(r | 22. | north | for, war, short, scorch, born, warm… |
ɔː | o(r | 23. | force | four, wore, sport, porch, borne, story… |
ʊə | ʊ(r | 24. | cure | poor, tourist, pure, plural, jury… |
Later on in the book (pp. 122–124), Wells compares Received Pronunciation and General American English and goes into more detail about the principle behind the lexical sets:
When we compare the pronunciation of particular words in the two accents, we find that in many respects there is a good match: for example, almost all words that have /iː/ in RP have the corresponding /i/ in GenAm, and vice versa: thus creep, sleeve, key, people and hundreds of other words. Likewise /aɪ/, transcribed identically for the two accents, and used in both cases for ripe, arrive, high, try and many other words…
Investigation shows that… we can successfully match the vowels in RP and GenAm forms of particular words for the vast bulk of the vocabulary…
This matching furnishes us with the framework of standard lexical sets which we use not only for comparing RP and GenAm but also for describing the lexical incidence of vowels in all the many accents we consider in this work. It turns out that for vowels in strong (stressed or stressable) syllables there are twenty-four matching pairs of RP and GenAm vowels. We identify each pair, and each standard lexical set of words whose stressed syllable exhibits the correspondence in question, by a keyword, which we shall always write in small capitals. Thus the correspondence between RP /iː/ and GenAm /i/ is the basis for the standard lexical set fleece…
In the rest of this work standard lexical set keywords will also be used to refer to (i) any or all of the words belonging to the standard lexical set in question; and (ii) the vowel sound used for the standard lexical set in question in the accent under consideration. Rather than using expressions such as ‘short i’ for example, we shall speak of the kit vowel or simply of kit.
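To make the idea concrete, here is a minimal sketch of how the correspondences could be stored and queried. It only covers a handful of rows from Wells' table, the keywords are his standard ones (written in capitals, as is conventional), and the dictionary layout and function names are my own hypothetical choices, not anything from the book.

```python
from collections import defaultdict

# A few rows of Wells' table: keyword -> vowel in each reference accent.
# The layout and names here are illustrative assumptions only.
LEXICAL_SETS = {
    "KIT":     {"RP": "ɪ",  "GenAm": "ɪ", "examples": ["ship", "sick", "bridge"]},
    "TRAP":    {"RP": "æ",  "GenAm": "æ", "examples": ["tap", "back", "badge"]},
    "LOT":     {"RP": "ɒ",  "GenAm": "ɑ", "examples": ["stop", "sock", "dodge"]},
    "BATH":    {"RP": "ɑː", "GenAm": "æ", "examples": ["staff", "brass", "ask"]},
    "CLOTH":   {"RP": "ɒ",  "GenAm": "ɔ", "examples": ["cough", "broth", "cross"]},
    "FLEECE":  {"RP": "iː", "GenAm": "i", "examples": ["creep", "speak", "leave"]},
    "PALM":    {"RP": "ɑː", "GenAm": "ɑ", "examples": ["psalm", "father", "bra"]},
    "THOUGHT": {"RP": "ɔː", "GenAm": "ɔ", "examples": ["taught", "sauce", "hawk"]},
}


def vowel_of(keyword, accent):
    """Sense (ii): the vowel used for a lexical set in a given accent."""
    return LEXICAL_SETS[keyword.upper()][accent]


def words_of(keyword):
    """Sense (i): some of the words that belong to a lexical set."""
    return LEXICAL_SETS[keyword.upper()]["examples"]


def sets_sharing_a_vowel(accent):
    """Group keywords by their vowel and keep only vowels shared by 2+ sets."""
    groups = defaultdict(list)
    for keyword, row in LEXICAL_SETS.items():
        groups[row[accent]].append(keyword)
    return {vowel: keys for vowel, keys in groups.items() if len(keys) > 1}


print(vowel_of("fleece", "GenAm"))
print(words_of("kit"))
print(sets_sharing_a_vowel("RP"))
print(sets_sharing_a_vowel("GenAm"))
```

With just these eight rows, the RP grouping pairs LOT with CLOTH and BATH with PALM, while the GenAm grouping pairs TRAP with BATH, LOT with PALM, and CLOTH with THOUGHT. That is the whole point of having separate keywords: each set is one RP–GenAm pairing, so two sets can share a vowel in one accent and still be kept apart in the other.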
If you’re unfamiliar with these labels, I encourage you to look at Wells’ book. He explains each of these lexical sets in greater detail on pages 122–168 of Volume I.
An Alternative Lexical Set: the b_t frame
The Wells sets are very useful, but for some reason they have not been universally adopted. Several researchers have opted to use an alternative set of labels that takes advantage of a large minimal set in English, the b_t frame.
I did some digging in old issues of American Speech and volumes of the Publications of the American Dialect Society (PADS) to see when these labels were first used. The earliest instance I could find goes all the way back to Sumner Ives’ 1954 study, The Phonology of the Uncle Remus Stories, which was the 22nd volume in the PADS series. On page 6, the author states that the following words are to refer to English vowels: beet, bit, bait, bet, bat, not, bought, boat, put, boot, but, curt, bite, bout, boy, and above. This set is remarkably close to what some researchers use today!
In contemporary sociolinguistics, I believe the
…found that it would be helpful to formulate a convention to unify the text and simplify the reader’s task; with that thought in mind, we have suggested that authors use neither a phonological / / nor a variable ( ) presentation, both of which differ in conventions from author to author. We have chosen instead to refer to a given vowel class using keywords, following the principle behind Wells (1982). To further simplify, we turned to Ladefoged’s (2005) choice of keyword paradigm, which uses words that are as untrammeled by their consonantal environment as possible. To obtain these keywords, he chose an h_d frame, to have his speakers “say heed again.”
To minimize the need for varying the “carrier” environment, in each case, the vowel being focused on here will be a b_t paradigm.
IPA | Keyword |
---|---|
/i/ | |
/ɪ/ | |
/e/ | |
/ɛ/ | |
/æ/ | |
/ɑ/ | |
/ɔ/ | |
/o/ | |
/ʌ/ | |
/ʊ/ | |
/u/ | |
/aɪ/ | |
/aʊ/ | |
/ɔɪ/ | |
/ɚ/ |
They end with this statement:
We hope that this convention will permit the reader to follow all the authors without difficult transitioning between chapters.
It appears that their goal of continuity has extended beyond their own volume, because the set was used in later volumes of the Publications of the American Dialect Society. For instance, here are the remarks by the editors of Speech in the Western States: Volume 1: The Coastal States (Fridland et al. 2016):
For the purpose of clarity and continuity, authors use the conventions of the International Phonetic Alphabet throughout the chapters, though, in many cases, keywords in the b_t frame are used to highlight particular word classes and subclasses, following other recent PADS volumes (Thomas & Yaeger-Dror 2009). These frames are built upon those made for comparative study of English dialects by Wells (1982) but have been adapted to allow representation of the particular vowel changes and conditioning environments of interest to the present study of the U.S. West.
IPA | Wells Keyword | b_t Keyword |
---|---|---|
ɪ | ||
ɛ | ||
æ | ||
ɑ ~ a | ||
ɔ ~ a | ||
ʌ | ||
ʊ | ||
ɚ | ||
i | ||
e | ||
o | ||
u | ||
aɪ | ||
ɔɪ | ||
aʊ | ||
ɪɹ ~ iɹ | ||
ɚɹ | ||
ɑɹ | ||
ɔr / or | ||
ʊɹ | ||
əɹ | ||
ə |
The only difference is that
Outside of the PADS volumes, this frame was also used in McCarthy (2011), which explicitly states that these labels were used because of Thomas & Yaeger-Dror (2009).
I don’t know the reason why the
My thoughts on the b_t frame
Hot take: I don’t think this the
Imagine you’re at a conference talking about the low back merger and you yourself have the merger. Since you don’t naturally differentiate the words bot and bought, you struggle to explain the differences that may exist in your study. Even if you do distinguish the sounds, many people in your audience may not, so they’ll have a hard time understanding. This was one of the reasons Wells came up with his lexical sets: neither you nor your audience would have much difficulty understanding the words lot or thought since, as far as I know, /lɔt/ and /θɑt/ are not English words.
Even if you’re not talking about a merger, if you are someone with particularly shifted vowels, when you say an isolated, ambiguous token like [bɛ̞t] or [bæ̙t], it may not be immediately clear to listeners of other dialects which vowel you’re talking about.
The words in the
In fact, Wells specifically designed his original set so that it would not use the
The keywords have been chosen in such a way that clarity is maximized: whatever accent of English they are spoken in, they can hardly be mistaken for other words. Although fleece is not the commonest of words, it cannot be mistaken for a word with some other vowel; whereas beat, say, if we had chosen it instead, would have been subject to the drawback that one man’s pronunciation of beat may sound like another’s pronunciation of bait or bit. (Wells 1982:123)
I question the usefulness of the
To put it another way, just because we call something the cot-caught merger doesn’t mean we should refer to the entire lexical sets as
It’s my impression that researchers on non-American varieties of English use Wells’ original lexical sets without any problems. Using a different set is potentially confusing and may alienate us from studies on other World Englishes.
To be clear, I am in no way criticizing the researchers who came up with or use the
Conclusion
This reminds me of how some people use the Labovian transcription system instead of standard IPA. See Josef Fruehwald’s blog post on those. There are two competing systems of lexical sets being used in American dialectology: the Wells lexical set and the
PS: Regardless of which system you use, I think we should make sure we use
PPS: To my knowledge, Wells never intended for the lexical set labels (or even the example words in the original explanation) to be ideal tokens for eliciting the vowels they represent. So, while the