Catalan Language
Download
This chapter describes the linguistic resources included in the file cat.zip of the lang folder.
List of phonemes
Consonant Plosives
SPPAS | IPA | Description | Examples |
---|---|---|---|
p | p | voiceless bilabial | pala |
b | b | voiced bilabial | bala, via |
t | t | voiceless alveolar | tela |
d | d | voiced alveolar | donar |
k | k | voiceless velar | cala |
g | g | voiced velar | gala |
Consonant Fricatives
SPPAS | IPA | Description | Examples |
---|---|---|---|
D | ð | voiced dental | cada |
G | ɣ | voiced velar | alga, mages |
f | f | voiceless labiodental | fals |
s | s | voiceless alveolar | si, sala |
z | z | voiced alveolar | desde |
S | ʃ | voiceless postalveolar | caixa |
Z | ʒ | voiced postalveolar | mújol |
v | v | voiced labiodental | va, vol |
T | θ | voiceless dental | circus |
Consonant Nasals
SPPAS | IPA | Description | Examples |
---|---|---|---|
m | m | bilabial | mena |
n | n | alveolar | nena |
J | ɲ | palatal | any |
N | ŋ | voiced velar | lingot, lingual |
Consonant Liquids
SPPAS | IPA | Description | Examples |
---|---|---|---|
l | l | alveolar lateral | líquid |
L | ʎ | palatal lateral | llamp |
r | r | alveolar trill | carro |
4 | ɾ | alveolar flap | cara |
Semivowels
SPPAS | IPA | Description | Examples |
---|---|---|---|
j | j | voiced palatal | iaia, naciós, iogurt |
w | w | voiced labiovelar | veu, veuran |
Vowels
SPPAS | IPA | Description | Examples |
---|---|---|---|
E | ɛ | open-mid front unrounded | sec, veça |
a | a | open front unrounded | sac |
O | ɔ | open-mid back rounded | soc |
o | o | close-mid back rounded | sóc |
e | e | close-mid front unrounded | séc, cec |
i | i | close front unrounded | sic, ric |
u | u | close back rounded | suc |
@ | ə | schwa | contra, estada |
U | ʊ | near-close near-back rounded | òpols |
Affricates
SPPAS | IPA | Description | Examples |
---|---|---|---|
dZ | d͡ʒ | voiced postalveolar | metge |
tS | t͡ʃ | voiceless postalveolar | cotxe |
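The phoneme tables above can be read as a lookup from SPPAS symbols to IPA. The following is a minimal sketch of that lookup in Python. It is not part of SPPAS: the names `SPPAS_TO_IPA` and `sppas_to_ipa` are hypothetical, and the input is assumed to be a whitespace-separated sequence of SPPAS symbols, which may differ from the separators used in actual SPPAS output.

```python
# SPPAS symbol -> IPA, copied from the tables in this chapter.
# Illustrative only: this dictionary is not shipped with SPPAS.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "D": "ð", "G": "ɣ", "f": "f", "s": "s", "z": "z",
    "S": "ʃ", "Z": "ʒ", "v": "v", "T": "θ",
    # nasals
    "m": "m", "n": "n", "J": "ɲ", "N": "ŋ",
    # liquids
    "l": "l", "L": "ʎ", "r": "r", "4": "ɾ",
    # semivowels
    "j": "j", "w": "w",
    # vowels
    "E": "ɛ", "a": "a", "O": "ɔ", "o": "o", "e": "e",
    "i": "i", "u": "u", "@": "ə", "U": "ʊ",
    # affricates
    "dZ": "d͡ʒ", "tS": "t͡ʃ",
}


def sppas_to_ipa(symbols):
    """Convert a whitespace-separated string of SPPAS symbols to IPA.

    Assumption: symbols are separated by whitespace. Unknown symbols
    raise a KeyError instead of being silently dropped.
    """
    ipa = []
    for sym in symbols.split():
        if sym not in SPPAS_TO_IPA:
            raise KeyError(f"unknown SPPAS symbol: {sym!r}")
        ipa.append(SPPAS_TO_IPA[sym])
    return "".join(ipa)


if __name__ == "__main__":
    # An arbitrary symbol sequence, chosen only to exercise the mapping.
    print(sppas_to_ipa("k a 4 @"))
```

Raising on unknown symbols is a deliberate choice here: a symbol missing from the tables usually signals a transcription or phoneset error that is better surfaced than ignored.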
Fillers
SPPAS | Description |
---|---|
laugh | laughter |
noise | noises, unintelligible speech |
dummy | un-transcribed speech |
Lexicons
All Catalan lexicons are (c) Laboratoire Parole et Langage, Aix-en-Provence, France:
cat.vocab contains a list of 94k different words;
cat.repl converts symbols and abbreviations into their text form.
All of them are distributed under the terms of the GNU General Public License.
Help is welcome to create a cat_num.repl file that would allow SPPAS to convert numbers to their written form.
Pronunciation dictionary
The Catalan pronunciation dictionary was downloaded in 2014 from the Ralf catalog of dictionaries for the Simon ASR system at http://spirit.blau.in/simon/import-pls-dictionary/. Brigitte Bigi then converted its format and phoneset. Eva Bosch i Roura added some new words and phonetized them manually. Further entries were added from pronunciations observed in the Glissando corpus.
It is distributed under the terms of the GNU General Public License.
Acoustic Model
The acoustic model was trained on the Glissando corpus. It is
distributed under the terms of the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International Public
License. It was created using a Python script available in
the SPPAS package: acmtrain.py.
Special thanks to Juan-Maria Garrido for giving us access to the Glissando corpus:
GARRIDO, J. M. - ESCUDERO, D. - AGUILAR, L. - CARDEÑOSO, V. - RODERO, E. - DE-LA-MOTA, C. - GONZÁLEZ, C. - RUSTULLET, S. - LARREA, O. - LAPLAZA, Y. - VIZCAÍNO, F. - CABRERA, M. - BONAFONTE, A. (2013). Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan. Language Resources and Evaluation. DOI 10.1007/s10579-012-9213-0.