English Language

Download

This chapter describes the linguistic resources included in the file eng.zip of the lang folder.

List of phonemes

The following list of phonemes covers both British English and American English, and the acoustic model contains all of them. However, only an American English pronunciation dictionary is provided, and the examples given are for American English.

SPPAS is based on the X-SAMPA standard. See https://en.wikipedia.org/wiki/X-SAMPA for details.

Consonant Plosives

SPPAS  IPA  Description         Examples
p      p    voiceless bilabial  pie, spy, cap
b      b    voiced bilabial     buy, cab
t      t    voiceless alveolar  tie, sty, cat, atom
d      d    voiced alveolar     dye, cad, do
k      k    voiceless velar     sky, crack, quick
g      g    voiced velar        guy, bag, luggage

Consonant Fricatives

SPPAS  IPA  Description             Examples
D      ð    voiced dental           thy, breathe, father
f      f    voiceless labiodental   phi, caff, fan
s      s    voiceless alveolar      sigh, mass
S      ʃ    voiceless postalveolar  shy, cash, emotion
z      z    voiced alveolar         zoo, has
Z      ʒ    voiced postalveolar     equation, pleasure, vision, beige
v      v    voiced labiodental      vie, have
T      θ    voiceless dental        thigh, math
h      h    voiceless glottal       high, ahead

Consonant Nasals

SPPAS  IPA  Description   Examples
m      m    bilabial      my, smile, cam
n      n    alveolar      nigh, snide, can
N      ŋ    voiced velar  sang, sink, singer

Consonant Liquids

SPPAS  IPA  Description           Examples
l      l    alveolar lateral      lie, sly, gal
4      ɾ    alveolar flap         lyda, maddy, makita
r\     ɹ    alveolar approximant  red, try, very

Semivowels

SPPAS  IPA  Description        Examples
j      j    voiced palatal     yes, yacht, william
w      w    voiced labiovelar  wye, swine, why

Vowels

SPPAS  IPA  Description                      Examples
E      ɛ    open-mid front unrounded         dress, bed, fell, men
A:     ɑ:   open back unrounded              palm, father, bra
A      ɒ    open back rounded                lot, pod, John
O:     ɔ:   open-mid back rounded            thought, Maud, dawn, fall
V      ʌ    open-mid back unrounded          strut, mud, dull, gun
i      i    close front unrounded            happy, serious
i:     i:   close front unrounded            fleece, seed, feel, sea
u:     u:   close back rounded               goose, food, chew, do
@      ə    schwa                            a, baccus
I      ɪ    near-close near-front unrounded  kit, lid, fill, bin
u      u    close back rounded vowel         absolute, assume
U      ʊ    near-close near-back rounded     foot, full, woman
{      æ    near-open front unrounded        trap, pad, shall, ban

Affricates

SPPAS  IPA  Description             Examples
dZ     d͡ʒ   voiced postalveolar     giant, badge, jam
tS     t͡ʃ   voiceless postalveolar  China, catch

Other symbols

SPPAS  IPA  Examples
aI     aɪ   price, ride, file, pie
aU     aʊ   mouth, loud, down, how
eI     eɪ   face, fail, vein, pay
OI     ɔɪ   choice, void, foil, boy
@U     əʊ   goat, code, foal, go
3:r    ɜ:r  liner, foundered, current
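
The tables above amount to a lookup from SPPAS symbols to IPA. The following is a minimal sketch of such a lookup, assuming the phonemes have already been split into a list; the table is deliberately partial, and the names SPPAS_TO_IPA and to_ipa are hypothetical, not part of SPPAS.

```python
# Partial SPPAS (X-SAMPA) to IPA table, copied from the tables of this chapter.
# Hypothetical helper for illustration only; this is not SPPAS code.
SPPAS_TO_IPA = {
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    "D": "ð", "T": "θ", "S": "ʃ", "Z": "ʒ", "N": "ŋ",
    "4": "ɾ", "r\\": "ɹ",
    "E": "ɛ", "A:": "ɑ:", "V": "ʌ", "i:": "i:", "@": "ə", "I": "ɪ", "{": "æ",
    "dZ": "d͡ʒ", "tS": "t͡ʃ", "OI": "ɔɪ",
}


def to_ipa(phonemes, unknown="?"):
    """Map each SPPAS symbol to its IPA symbol; unlisted symbols give `unknown`."""
    return [SPPAS_TO_IPA.get(p, unknown) for p in phonemes]


# "badge" and "thing", written as space-separated SPPAS symbols (assumed input format).
print("".join(to_ipa("b { dZ".split())))
print("".join(to_ipa("T I N".split())))
```

Because each phoneme is a separate list item, multi-character symbols such as dZ or A: need no special handling.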

Recently added (not documented yet)

SPPAS  IPA  Description
@U
E@
3:     ɜ:   open-mid central unrounded vowel
I@
l=          /l/ syllabic
n=          /n/ syllabic
Q      ɒ    open back rounded vowel

Fillers

SPPAS  Description
laugh  laughter
noise  noises, unintelligible speech
dummy  un-transcribed speech

Lexicons

All English lexicons are (c) CNRS, Laboratoire Parole et Langage, Aix-en-Provence, France:

  • eng.vocab contains a list of 120k different words;
  • eng.stp is a list of 150 stop-words;
  • eng_num.repl converts numbers to their written form;
  • eng.repl converts symbols and abbreviations into their text form.
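
As an illustration of how these four lexicons could work together, here is a minimal sketch that replaces symbols and numbers in a token list, then checks each word against a vocabulary and a stop-word list. Every entry is invented for the example and the file formats are not described in this chapter, so the tables are plain in-memory Python objects; the function names are hypothetical.

```python
# All entries below are invented for illustration; they are NOT copied from
# eng.repl, eng_num.repl, eng.stp or eng.vocab.
REPL = {"&": "and", "%": "percent"}      # symbols/abbreviations -> text form
NUM_REPL = {"2": "two", "10": "ten"}     # numbers -> written form
STOP_WORDS = {"and", "the", "of"}
VOCAB = {"and", "the", "of", "two", "ten", "percent", "cats", "dogs"}


def normalize(tokens):
    """Lowercase each token and replace it if a replacement table lists it."""
    result = []
    for token in tokens:
        token = token.lower()
        result.append(REPL.get(token, NUM_REPL.get(token, token)))
    return result


def classify(words):
    """Return (word, in_vocabulary, is_stop_word) for each word."""
    return [(w, w in VOCAB, w in STOP_WORDS) for w in words]


words = normalize("2 cats & 10 dogs".split())
print(words)  # ['two', 'cats', 'and', 'ten', 'dogs']
for word, known, stop in classify(words):
    print(word, known, stop)
```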

All of them are distributed under the terms of the GNU General Public License v3.

Pronunciation dictionary

The pronunciation dictionary is for North American English. It was downloaded in 2011 from the CMU web page. This Carnegie Mellon Pronouncing Dictionary (version 0.6) is Copyright (C) 1993-2008 by Carnegie Mellon University. We thank CMU for freely distributing this resource and allowing its redistribution.

Brigitte Bigi converted the original CMUdict from its ARPAbet encoding into X-SAMPA, and converted the file format to HTK-ASCII.
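
The following minimal sketch illustrates what such a conversion involves for a single dictionary entry. It is not the actual conversion script: the ARPAbet to X-SAMPA table is partial and assumed, the handling of AH by stress mark is an assumption, and the output layout "word [word] phonemes" is assumed from the usual HTK dictionary form.

```python
# Partial ARPAbet -> SPPAS (X-SAMPA) table. This mapping is an assumption made
# for illustration; it is not the table actually used to build the dictionary.
ARPABET_TO_SPPAS = {
    "HH": "h", "L": "l", "K": "k", "T": "t", "S": "s", "SH": "S",
    "CH": "tS", "JH": "dZ", "TH": "T", "DH": "D", "NG": "N", "R": "r\\",
    "IY": "i:", "IH": "I", "EH": "E", "AE": "{", "UW": "u:", "UH": "U",
    "AY": "aI", "AW": "aU", "EY": "eI", "OY": "OI", "OW": "@U",
}


def convert_phone(phone):
    """Convert one ARPAbet phone, dropping its trailing stress digit (0, 1 or 2)."""
    base = phone.rstrip("012")
    stress = phone[len(base):]
    if base == "AH":
        # Assumption: unstressed AH is a schwa, stressed AH is the STRUT vowel.
        return "@" if stress == "0" else "V"
    return ARPABET_TO_SPPAS[base]  # KeyError for phones missing from the partial table


def convert_line(line):
    """Turn a CMUdict-style line into an (assumed) HTK-style dictionary line."""
    word, *phones = line.split()
    word = word.lower()
    return "{} [{}] {}".format(word, word, " ".join(convert_phone(p) for p in phones))


print(convert_line("HELLO  HH AH0 L OW1"))  # hello [hello] h @ l @U
```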

Acoustic Model

The first version of the acoustic model distributed in the SPPAS resources was downloaded in 2014 from the VoxForge project at http://www.voxforge.org/. It was context-dependent (better accuracy) but did not contain the fillers. For the second model, the monophones were extracted to create a new context-independent model, to which the fillers (i.e. laugh, noise and dummy) were added. The model was distributed under the terms of the GNU General Public License.
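
To give an idea of what going from context-dependent to context-independent units means, the sketch below derives a monophone inventory from triphone labels and appends the fillers. It assumes the common HTK naming scheme left-centre+right, and it only manipulates label names: it does not touch the model definitions themselves and is not the procedure actually used. The labels, including "sil", are invented for the example.

```python
FILLERS = ["laugh", "noise", "dummy"]


def center_phone(label):
    """Return the centre phone of a label such as 'k-{+t' (assumed HTK-style naming)."""
    if "-" in label:
        label = label.split("-", 1)[1]   # drop the left context
    if "+" in label:
        label = label.split("+", 1)[0]   # drop the right context
    return label


def monophone_inventory(labels):
    """Collect the distinct centre phones, sorted, then append the fillers."""
    phones = sorted({center_phone(label) for label in labels})
    return phones + [f for f in FILLERS if f not in phones]


# Invented labels: a full triphone, two biphones, a context-free unit, another triphone.
labels = ["k-{+t", "{+t", "k-{", "sil", "t-i:+tS"]
print(monophone_inventory(labels))
```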

In 2022, a new context-independent acoustic model was trained, and phonemes for British English were introduced. This model is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.

Syllabification

No syllabification resource is available for English, because English syllabification cannot be solved with the implemented algorithm.

Cued Speech

There is no available resource yet, but American English will be supported by mid-2025. So... check back soon!