Nigerian Pidgin Language
Download
This chapter describes the linguistic resources included in the file pcm.zip of the Ortolang repository.
This work was financed by the French Agence Nationale de la Recherche (ANR-16-CE27-0007), in the context of the NaijaSynCor project.
The ISO 639-3 code of the Naija language is pcm.
List of phonemes
Consonant Plosives
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| p | p | voiceless bilabial | public, palaver |
| b | b | voiced bilabial | bye, bojuboju, boli |
| t | t | voiceless alveolar | two, tree, tranga |
| d | d | voiced alveolar | drop, duma, this |
| k | k | voiceless velar | ketu, cut, quick |
| g | g | voiced velar | gain, girl, guy |
Consonant Fricatives
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| f | f | voiceless labiodental | farm, phone, view |
| s | s | voiceless alveolar | centre, safe, zero |
| S | ʃ | voiceless postalveolar | cheque, sabi, ship |
| z | z | voiced alveolar | used, diesel, eze |
| v | v | voiced labiodental | visit, view |
| h | h | voiceless glottal | happy, hope, who |
| T | θ | voiceless dental | thing, ethnic, three |
| D | ð | voiced dental | |
Consonant Nasals
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| m | m | bilabial | make, milk, magaji |
| n | n | alveolar | knock, name, nitel |
| N | ŋ | voiced velar | bongo, sings |
Consonant Liquids
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| l | l | alveolar lateral | load, lokodan |
| r\ | ɹ | alveolar approximant | radio, root, wrap |
Semivowels
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| j | j | voiced palatal | uni, yes, europe |
| w | w | voiced labiovelar | one, wait, wowo |
Vowels
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| E | ɛ | open-mid front unrounded | air, early, egg, men |
| a | a | open front unrounded | our, ask, above |
| O | ɔ | open-mid back rounded | us, onion, all, oba |
| i | i | close front unrounded | each, even, ile, is |
| e | e | close-mid front unrounded | alone, eko |
| o | o | close-mid back rounded | obodo, ojo |
| u | u | close back rounded | ugu, una, upo |
Nasal vowels
| SPPAS | IPA | Examples |
|---|---|---|
| a~ | ɑ̃ | auntie, african, commander |
| e~ | ẽ | fiyen, britain, town |
| E~ | ɛ̃ | calendar, men, accent |
| i~ | ĩ | admin, ani, benin |
| O~ | ɔ̃ | election, lokodan, million |
| u~ | ũ | remove, segun, broken |
Affricates
| SPPAS | IPA | Description |
|---|---|---|
| tS | t͡ʃ | voiceless postalveolar |
| dZ | d͡ʒ | voiced postalveolar |
Others
| SPPAS | IPA | Examples |
|---|---|---|
| aI | aɪ | I, write, type |
| aU | aʊ | out, town |
| OI | ɔɪ | oil, boy |
| eI | eɪ | a, eight, age |
Fillers
| SPPAS | Description |
|---|---|
| laugh | laughter |
| noise | noises, unintelligible speech |
| dummy | un-transcribed speech |
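The tables above can be read as a lookup from SPPAS symbols to IPA. The following is a minimal sketch of that lookup in Python, not part of the SPPAS package: the names `SPPAS_TO_IPA`, `FILLERS` and `sppas_to_ipa` are hypothetical, and the space-separated input format is an assumption made for illustration.

```python
# SPPAS symbol -> IPA symbol, copied from the tables of this chapter.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "f": "f", "s": "s", "S": "ʃ", "z": "z", "v": "v", "h": "h",
    "T": "θ", "D": "ð",
    # nasals
    "m": "m", "n": "n", "N": "ŋ",
    # liquids (the SPPAS symbol is the two characters r and backslash)
    "l": "l", "r\\": "ɹ",
    # semivowels
    "j": "j", "w": "w",
    # vowels
    "E": "ɛ", "a": "a", "O": "ɔ", "i": "i", "e": "e", "o": "o", "u": "u",
    # nasal vowels
    "a~": "ɑ̃", "e~": "ẽ", "E~": "ɛ̃", "i~": "ĩ", "O~": "ɔ̃", "u~": "ũ",
    # affricates
    "tS": "t͡ʃ", "dZ": "d͡ʒ",
    # others
    "aI": "aɪ", "aU": "aʊ", "OI": "ɔɪ", "eI": "eɪ",
}

# Fillers have no IPA equivalent and are passed through unchanged.
FILLERS = {"laugh", "noise", "dummy"}


def sppas_to_ipa(phones):
    """Convert a space-separated string of SPPAS symbols to a list of IPA symbols.

    The space-separated input is an assumption of this sketch.
    """
    result = []
    for symbol in phones.split():
        if symbol in FILLERS:
            result.append(symbol)
        elif symbol in SPPAS_TO_IPA:
            result.append(SPPAS_TO_IPA[symbol])
        else:
            raise ValueError("Unknown SPPAS symbol: " + symbol)
    return result


if __name__ == "__main__":
    # An illustrative symbol sequence, not an entry of the dictionary.
    print(sppas_to_ipa("S i p"))
```

Raising an error on an unknown symbol, rather than skipping it, makes typos in a phonetization visible early.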
Pronunciation Dictionary
The dictionary was originally created by extracting the lexicon of the corpus published in the annex of (Deuber 2005). New words, with their orthographic variants and pronunciations, were then added to the dictionary by a team of four transcribers, all native speakers of the language.
It is distributed under the terms of the GNU General Public License.
Acoustic Model
A first version of the Naija acoustic model was created in July 2017 by Brigitte Bigi with the SPPAS training scripts.
An initial model was built from prototypes of other languages, most of them extracted from the English acoustic model. For the remaining phonemes, the nasal vowels /O~/, /a~/, /e~/, /i~/ and /u~/ were taken from the Southern Min model, and /E~/ was derived from the French /U~/ prototype. The vowels /a/ and /e/ were extracted from the French model, and /O/ and /o/ from the Italian one. The fillers (silence, noise, laughter) were also added to the model so that they are automatically time-aligned too.
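The origin of each prototype described above can be summarised as a small lookup table. This is only an illustrative sketch, not the code used to build the model: the names `PROTOTYPE_SOURCE`, `DEFAULT_SOURCE`, `prototype_source` and `group_by_source` are hypothetical, and treating English as the source of every phoneme not listed explicitly is an assumption, since the text only says that prototypes were "mostly" extracted from the English model.

```python
from collections import defaultdict

# Phonemes whose prototype did not come from the English model,
# as listed in the paragraph above.
PROTOTYPE_SOURCE = {
    "O~": "Southern Min",
    "a~": "Southern Min",
    "e~": "Southern Min",
    "i~": "Southern Min",
    "u~": "Southern Min",
    "E~": "French",   # derived from the French /U~/ prototype
    "a": "French",
    "e": "French",
    "O": "Italian",
    "o": "Italian",
}

# Assumption of this sketch: any phoneme not listed above came from English.
DEFAULT_SOURCE = "English"


def prototype_source(phoneme):
    """Return the language whose model provided the prototype of a phoneme."""
    return PROTOTYPE_SOURCE.get(phoneme, DEFAULT_SOURCE)


def group_by_source(phonemes):
    """Group phonemes by the language their prototype was taken from."""
    groups = defaultdict(list)
    for phoneme in phonemes:
        groups[prototype_source(phoneme)].append(phoneme)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_source(["p", "a", "O", "u~", "E~"]))
```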
The acoustic model was then trained on a set of 8 files (totalling 3 min 29 s) that had been manually phonetized and time-aligned.
The currently distributed acoustic model was trained at the end of the project, in 2020, using the manual transcription of the whole corpus (see Bigi et al., submitted).
The model is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.
The model was created using a Python script available in the SPPAS package: acmtrain.py.
References
Brigitte Bigi, Abiola S. Oyelere, Bernard Caron (submitted). Resources for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin). LNAI, Springer.
Brigitte Bigi, Bernard Caron, Abiola S. Oyelere (2017). Developing Resources for Automated Speech Processing of the African Language Naija (Nigerian Pidgin). In 8th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 441-445, Poznań, Poland.