Nigerian Pidgin Language
Download
This chapter describes the linguistic resources included in the file
pcm.zip
of the lang
folder.
This work was financed by the French Agence Nationale pour la Recherche (ANR-16-CE27-0007), in the context of the NaijaSynCor project.
The iso-639-3 code of Naija language is PCM.
List of phonemes
Consonant Plosives
SPPAS | IPA | Description | Examples |
---|---|---|---|
p | p | voiceless bilabial | public, palaver |
b | b | voiced bilabial | bye, bojuboju, boli |
t | t | voiceless alveolar | two, tree, tranga |
d | d | voiced alveolar | drop, duma, this |
k | k | voiceless velar | ketu, cut, quick |
g | g | voiced velar | gain, girl, guy |
Consonant Fricatives
SPPAS | IPA | Description | Examples |
---|---|---|---|
f | f | voiceless labiodental | farm, phone, view |
s | s | voiceless alveolar | centre, safe, zero |
S | ʃ | voiceless postalveolar | cheque, sabi, ship |
z | z | voiced alveolar | used, diesel, eze |
v | v | voiced labiodental | visit, view |
h | h | voiceless glottal | happy, hope, who |
T | θ | voiceless dental | thing, ethnic, three |
D | ð | voiced dental |
Consonant Nasals
SPPAS | IPA | Description | Examples |
---|---|---|---|
m | m | bilabial | make, milk, magaji |
n | n | alveolar | knock, name, nitel |
N | ŋ | voiced velar | bongo, sings |
Consonant Liquids
SPPAS | IPA | Description | Examples |
---|---|---|---|
l | l | alveolar lateral | load, lokodan |
r\ | ɹ | alveolar approximant | radio, root, wrap |
Semivowels
SPPAS | IPA | Description | Examples |
---|---|---|---|
j | j | voiced palatal | uni, yes, europe |
w | w | voiced labiovelar | one, wait, wowo |
Vowels
SPPAS | IPA | Description | Examples |
---|---|---|---|
E | ɛ | open-mid front unrounded | air, early, egg, men |
a | a | open front unrounded | our, ask, above |
O | ɔ | open-mid back rounded | us, onion, all, oba |
i | i | close front unrounded | each, even, ile, is |
e | e | close-mid front unrounded | alone, eko |
o | o | close-mid back rounded | obodo, ojo |
u | u | close back rounded | ugu, una, upo |
Nasal vowels
SPPAS | IPA | Examples |
---|---|---|
a~ | ɑ̃ | auntie, african, commander |
e~ | ẽ | fiyen, britain, town |
E~ | ɛ̃ | calendar, men, accent |
i~ | ĩ | admin, ani, benin |
O~ | ɔ̃ | election, lokodan, million |
u~ | ũ | remove, segun, broken |
Affricates
SPPAS | IPA | Description |
---|---|---|
tS | t͡ʃ | voiceless postalveolar |
dZ | d͡ʒ | voiced postalveolar |
Others
SPPAS | IPA | Examples |
---|---|---|
aI | aɪ | I, write, type |
aU | aʊ | out, town |
OI | ɔɪ | oil, boy |
eI | eɪ | a, eight, age |
Fillers
SPPAS | Description |
---|---|
laugh | laughter |
noise | noises, unintelligible speech |
dummy | un-transcribed speech |
Pronunciation Dictionary
The dictionary was originally created by extracting the lexicon of the corpus published in annex of (Deuber 2005). New words with their orthographic variants and pronunciations were added to the dictionary by team of four transcribers, native speakers of the language.
It is distributed under the terms of the GNU General Public License.
Acoustic Model
A first version of the Naija acoustic model was created in 2017-07 by Brigitte Bigi with the SPPAS training scripts.
An initial model was created on the basis of other language prototypes. Such prototypes were mostly extracted from the English acoustic model. For the missing models of phonemes, the nasals /O~/, /a~/, /e~/, /i~/ and /u~/ were picked off Southern Min language, and /E~/ was extracted from French language using /U~/ prototype. The vowels /a/ and /e/ were extracted from the French model; and finally /O/ and /o/ from the Italian one. The fillers were also added to the model in order to be automatically time-aligned too: silence, noise, laughter.
The acoustic model was then trained with a set of 8 files (totalling 3 min 29 seconds in length.) manually phonetized and time-aligned.
The currently distributed acoustic model was trained at the end of the project, in 2020, thanks to all the manual transcription of the whole corpus (see Bigi et al., submitted).
The model is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.
The model was created using a Python script available in the SPPAS package:
acmtrain.py
.
References
Brigitte Bigi, Abiola S. Oyelere, Bernard Caron (submitted). Resources for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin). LNAI, Springer.
Brigitte Bigi, Bernard Caron, Abiola S. Oyelere (2017). Developing Resources for Automated Speech Processing of the African Language Naija (Nigerian Pidgin). In 8th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 441-445, Poznań, Poland.