English Language

Download

This chapter describes the linguistic resources included in the file eng.zip of the "Ortolang repository".

List of phonemes

The following list of phonemes includes both the British English and the American English. The acoustic model contains all of them. However, only the American English pronunciation dictionary is provided. The given examples are for American English.

SPPAS is based on the X-SAMPA standard. See https://en.wikipedia.org/wiki/X-SAMPA for details.

Consonant Plosives

SPPAS	IPA	Description	Examples
p	p	voiceless bilabial	pie, spy, cap
b	b	voiced bilabial	buy, cab
t	t	voiceless alveolar	tie, sty, cat, atom
d	d	voiced alveolar	dye, cad, do
k	k	voiceless velar	sky, crack, quick
g	g	voiced velar	guy, bag, luggage

Consonant Fricatives

SPPAS	IPA	Description	Examples
D	ð	voiced dental	thy, breathe, father
f	f	voiceless labiodental	phi, caff, fan
s	s	voiceless alveolar	sigh, mass
S	ʃ	voiceless postalveolar	shy, cash, emotion
z	z	voiced alveolar	zoo, has
Z	ʒ	voiced postalveolar	equation, pleasure, vision, beige
v	v	voiced labiodental	vie, have
T	θ	voiceless dental	thigh, math
h	h	voiceless glottal	high, ahead

Consonant Nasals

SPPAS	IPA	Description	Examples
m	m	bilabial	my, smile, cam
n	n	alveolar	nigh, snide, can
N	ŋ	voiced velar	sang, sink, singer

Consonant Liquids

SPPAS	IPA	Description	Examples
l	l	alveolar lateral	lie, sly, gal
4	ɾ	alveolar flap	lyda, maddy, makita
r\	ɹ	alveolar approximant	red, try, very

Semivowels

SPPAS	IPA	Description	Examples
j	j	voiced palatal	yes, yacht, william
w	w	voiced labiovelar	wye, swine, why

Vowels

SPPAS	IPA	Description	Examples
E	ɛ	open-mid front unrounded	dress, bed, fell, men
A:	ɑ:	open back unrounded	palm, father, bra
A	ɒ	open back rounded	lot, pod, John
O:	ɔ:	open-mid back rounded	thought, Maud, dawn, fall
V	ʌ	open-mid back unrounded	strut, mud, dull, gun
i	i	close front unrounded	happy, serious
i:	i:	close front unrounded	fleece, seed, feel, sea
u:	u:	close back rounded	goose, food, chew, do
@	ə	schwa	a, baccus
I	ɪ	near-close near-front unrounded	kit, lid, fill, bin
u	u	close back rounded vowel	absolute, assume
U	ʊ	near-close near-back rounded	foot, full, woman
{	æ	near-open front unrounded	trap, pad, shall, ban

Affricates

SPPAS	IPA	Description	Examples
dZ	d͡ʒ	voiced postalveolar	giant, badge, jam
tS	t͡ʃ	voiceless postalveolar	China, catch

Other symbols

SPPAS	IPA	Examples
aI	aɪ	price, ride, file, pie
aU	aʊ	mouth, loud, down, how
eI	eɪ	face, fail, vein, pay
OI	ɔɪ	choice, void, foil, boy
@U	oʊ	goat, code, foal, go
3:r	ɜ:r	liner, foundered, current

Recently added (not documented yet)

SPPAS	IPA	Description
@U
E@
3:	ɜ	open-mid central unrounded vowel
I@
l=		/l/ syllabic
n=		/n/ syllabic
Q	ɒ	open back rounded vowel

Fillers

SPPAS	Description
laugh	laughter
noise	noises, unintelligible speech
dummy	un-transcribed speech

Lexicons

All English lexicons are (c) CNRS, Laboratoire Parole et Langage, Aix-en-Provence, France:

eng.vocab contains a list of 120k different words;
eng.stp is a list of 150 stop-words;
eng_num.repl allows to convert numbers to their written form;
eng.repl allows to convert symbols and abbreviations into a text form.

All of them are distributed under the terms of the GNU General Public License v3.

Pronunciation dictionary

The pronunciation dictionary is for North American English. It was downloaded in 2011 from the CMU web page. This Carnegie Mellon Pronouncing Dictionary (version 0.6) is Copyright (C) 1993-2008 by Carnegie Mellon University. We acknowledge CMU for distributing freely this resource and allowing its re-distribution.

Brigitte Bigi converted the original CMUdict encoded with ARPAbet into X-SAMPA and converted the format of the file in HTK-ASCII.

Acoustic Model

The first version of the acoustic model was context-dependent (better accuracy) but did not contain the fillers. This model distributed in SPPAS resources was downloaded in 2014 from the VoxForge project at http://www.voxforge.org/. For the second model, the monophones were extracted to create a new context-independent model in which the fillers (i.e. laugh, noise and dummy) were added. The model was under the terms of the GNU Public License.

In 2022, a new context-independent acoustic model was trained and phonemes for British English are introduced. This model is under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License

Syllabification

There's no such available resource because English syllabification can't be solved with the implemented algorithm.

Cued Speech

There is no available resource yet, but American English will be supported by mid-2025. So... check back soon!