Nigerian Pidgin Language

This chapter describes the linguistic resources included in the file pcm.zip of the lang folder.

This work was funded by the French Agence Nationale de la Recherche (ANR-16-CE27-0007) as part of the NaijaSynCor project.

The ISO 639-3 code of the Naija language is pcm.

List of phonemes

Consonant Plosives

SPPAS  IPA  Description         Examples
p      p    voiceless bilabial  public, palaver
b      b    voiced bilabial     bye, bojuboju, boli
t      t    voiceless alveolar  two, tree, tranga
d      d    voiced alveolar     drop, duma, this
k      k    voiceless velar     ketu, cut, quick
g      g    voiced velar        gain, girl, guy

Consonant Fricatives

SPPAS  IPA  Description             Examples
f      f    voiceless labiodental   farm, phone, view
s      s    voiceless alveolar      centre, safe, zero
S      ʃ    voiceless postalveolar  cheque, sabi, ship
z      z    voiced alveolar         used, diesel, eze
v      v    voiced labiodental      visit, view
h      h    voiceless glottal       happy, hope, who
T      θ    voiceless dental        thing, ethnic, three
D      ð    voiced dental

Consonant Nasals

SPPAS  IPA  Description   Examples
m      m    bilabial      make, milk, magaji
n      n    alveolar      knock, name, nitel
N      ŋ    voiced velar  bongo, sings

Consonant Liquids

SPPAS  IPA  Description           Examples
l      l    alveolar lateral      load, lokodan
r\     ɹ    alveolar approximant  radio, root, wrap

Semivowels

SPPAS  IPA  Description        Examples
j      j    voiced palatal     uni, yes, europe
w      w    voiced labiovelar  one, wait, wowo

Vowels

SPPAS  IPA  Description                 Examples
E      ɛ    open-mid front unrounded    air, early, egg, men
a      a    open front unrounded        our, ask, above
O      ɔ    open-mid back rounded       us, onion, all, oba
i      i    close front unrounded       each, even, ile, is
e      e    close-mid front unrounded   alone, eko
o      o    close-mid back rounded      obodo, ojo
u      u    close back rounded          ugu, una, upo

Nasal vowels

SPPAS  IPA  Examples
a~     ɑ̃    auntie, african, commander
e~          fiyen, britain, town
E~     ɛ̃    calendar, men, accent
i~          admin, ani, benin
O~     ɔ̃    election, lokodan, million
u~          remove, segun, broken

Affricates

SPPAS  IPA  Description
tS     t͡ʃ   voiceless postalveolar
dZ     d͡ʒ   voiced postalveolar

Others

SPPAS  IPA  Examples
aI          I, write, type
aU          out, town
OI     ɔɪ   oil, boy
eI          a, eight, age
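
The tables above amount to a mapping from SPPAS symbols to IPA. A minimal Python sketch of that mapping is given here for illustration only: the names SPPAS_TO_IPA and to_ipa are hypothetical and are not part of the SPPAS API, and the space-separated input format is an assumption. Symbols for which the tables give no IPA value (e~, i~, u~, aI, aU, eI) are left out rather than guessed.

```python
# SPPAS symbol -> IPA, copied from the phoneme tables of this chapter.
# Symbols with no IPA value in the tables are deliberately omitted.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "f": "f", "s": "s", "S": "ʃ", "z": "z", "v": "v", "h": "h",
    "T": "θ", "D": "ð",
    # nasals
    "m": "m", "n": "n", "N": "ŋ",
    # liquids
    "l": "l", "r\\": "ɹ",
    # semivowels
    "j": "j", "w": "w",
    # oral vowels
    "E": "ɛ", "a": "a", "O": "ɔ", "i": "i", "e": "e", "o": "o", "u": "u",
    # nasal vowels
    "a~": "ɑ̃", "E~": "ɛ̃", "O~": "ɔ̃",
    # affricates
    "tS": "t͡ʃ", "dZ": "d͡ʒ",
    # diphthong ("Others" table)
    "OI": "ɔɪ",
}


def to_ipa(phonemes, sep=" "):
    """Convert a separator-delimited string of SPPAS symbols to IPA.

    Symbols that are not in the mapping are passed through unchanged.
    """
    return sep.join(SPPAS_TO_IPA.get(p, p) for p in phonemes.split(sep))
```

For instance, to_ipa("S i p") gives "ʃ i p", while a symbol absent from the mapping, such as aI, is returned as is.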

Fillers

SPPAS  Description
laugh  laughter
noise  noises, unintelligible speech
dummy  un-transcribed speech

Pronunciation Dictionary

The dictionary was originally created by extracting the lexicon of the corpus published in the annex of (Deuber 2005). New words, with their orthographic variants and pronunciations, were then added to the dictionary by a team of four transcribers, all native speakers of the language.

It is distributed under the terms of the GNU General Public License.

Acoustic Model

A first version of the Naija acoustic model was created in July 2017 by Brigitte Bigi with the SPPAS training scripts.

An initial model was built from prototypes of other languages, most of them extracted from the English acoustic model. For the remaining phonemes, the nasal vowels /O~/, /a~/, /e~/, /i~/ and /u~/ were taken from the Southern Min model, and /E~/ was derived from the French model using its /U~/ prototype. The vowels /a/ and /e/ were extracted from the French model, and /O/ and /o/ from the Italian one. The fillers (silence, noise, laughter) were also added to the model so that they can be time-aligned automatically too.
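
The borrowing plan described above can be summarised as a small lookup table. The following Python sketch is only an illustration of that paragraph: the names BORROWED, RENAMED and prototype_source are hypothetical, the language names are plain labels rather than SPPAS identifiers, and treating English as the source of every phoneme not listed explicitly is an assumption based on "most of them".

```python
# Source of each initial prototype, as described in the paragraph above.
BORROWED = {
    "Southern Min": ["O~", "a~", "e~", "i~", "u~"],
    "French": ["E~", "a", "e"],
    "Italian": ["O", "o"],
}

# /E~/ was initialised from the French /U~/ prototype.
RENAMED = {"E~": "U~"}

# Assumption: every phoneme not listed above comes from the English model.
DEFAULT_SOURCE = "English"


def prototype_source(phoneme):
    """Return (source language, source phoneme) for a Naija phoneme."""
    for language, phonemes in BORROWED.items():
        if phoneme in phonemes:
            return language, RENAMED.get(phoneme, phoneme)
    return DEFAULT_SOURCE, phoneme
```

With this table, prototype_source("E~") returns ("French", "U~") and prototype_source("p") falls back to ("English", "p").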

The acoustic model was then trained on a set of 8 files (3 min 29 s in total) that were manually phonetized and time-aligned.

The currently distributed acoustic model was trained at the end of the project, in 2020, using the manual transcriptions of the whole corpus (see Bigi et al., submitted).

The model is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.

The model was created using a Python script available in the SPPAS package: acmtrain.py.

References

Brigitte Bigi, Abiola S. Oyelere, Bernard Caron (submitted). Resources for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin). LNAI, Springer.

Brigitte Bigi, Bernard Caron, Abiola S. Oyelere (2017). Developing Resources for Automated Speech Processing of the African Language Naija (Nigerian Pidgin). In 8th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 441-445, Poznań, Poland.