Catalan Language

Download

This chapter describes the linguistic resources included in the file cat.zip of the lang folder.

List of phonemes

Consonant Plosives

SPPAS  IPA  Description          Examples
p      p    voiceless bilabial   pala
b      b    voiced bilabial      bala, via
t      t    voiceless alveolar   tela
d      d    voiced alveolar      donar
k      k    voiceless velar      cala
g      g    voiced velar         gala

Consonant Fricatives

SPPAS  IPA  Description             Examples
D      ð    voiced dental           cada
G      ɣ    voiced velar            alga, mages
f      f    voiceless labiodental   fals
s      s    voiceless alveolar      si, sala
z      z    voiced alveolar         desde
S      ʃ    voiceless postalveolar  caixa
Z      ʒ    voiced postalveolar     mújol
v      v    voiced labiodental      va, vol
T      θ    voiceless dental        circus

Consonant Nasals

SPPAS  IPA  Description   Examples
m      m    bilabial      mena
n      n    alveolar      nena
J      ɲ    palatal       any
N      ŋ    voiced velar  lingot, lingual

Consonant Liquids

SPPAS  IPA  Description       Examples
l      l    alveolar lateral  líquid
L      ʎ    palatal lateral   llamp
r      r    alveolar trill    carro
4      ɾ    alveolar flap     cara

Semivowels

SPPAS  IPA  Description        Examples
j      j    voiced palatal     iaia, naciós, iogurt
w      w    voiced labiovelar  veu, veuran

Vowels

SPPAS  IPA  Description                   Examples
E      ɛ    open-mid front unrounded      sec, veça
a      a    open front unrounded          sac
O      ɔ    open-mid back rounded         soc
o      o    close-mid back rounded        sóc
e      e    close-mid front unrounded     séc, cec
i      i    close front unrounded         sic, ric
u      u    close back rounded            suc
@      ə    schwa                         contra, estada
U      ʊ    near-close near-back rounded  òpols

Affricates

SPPAS  IPA  Description             Examples
dZ     d͡ʒ   voiced postalveolar     metge
tS     t͡ʃ   voiceless postalveolar  cotxe

Fillers

SPPAS  Description
laugh  laughter
noise  noises, unintelligible speech
dummy  un-transcribed speech
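
As an illustration, the symbol tables above can be turned into a simple lookup table. The following is a minimal Python sketch and is not part of SPPAS: the names SPPAS_TO_IPA, FILLERS and to_ipa are hypothetical, and both the "-" separator between symbols and the sample transcription are assumptions made for this example only.

```python
# SPPAS symbol -> IPA symbol, copied from the tables of this chapter.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "D": "ð", "G": "ɣ", "f": "f", "s": "s", "z": "z",
    "S": "ʃ", "Z": "ʒ", "v": "v", "T": "θ",
    # nasals
    "m": "m", "n": "n", "J": "ɲ", "N": "ŋ",
    # liquids
    "l": "l", "L": "ʎ", "r": "r", "4": "ɾ",
    # semivowels
    "j": "j", "w": "w",
    # vowels
    "E": "ɛ", "a": "a", "O": "ɔ", "o": "o", "e": "e",
    "i": "i", "u": "u", "@": "ə", "U": "ʊ",
    # affricates
    "dZ": "d͡ʒ", "tS": "t͡ʃ",
}

# Fillers have no IPA equivalent; they are passed through unchanged.
FILLERS = {"laugh", "noise", "dummy"}


def to_ipa(phonemes, separator="-"):
    """Convert a string of SPPAS symbols into a list of IPA symbols.

    The separator is an assumption of this sketch, not a documented format.
    """
    result = []
    for symbol in phonemes.split(separator):
        if symbol in FILLERS:
            result.append(symbol)
        elif symbol in SPPAS_TO_IPA:
            result.append(SPPAS_TO_IPA[symbol])
        else:
            raise ValueError("unknown SPPAS symbol: %r" % symbol)
    return result


# Hypothetical transcription written for this example.
print(" ".join(to_ipa("k-a-4-@")))
```

Raising an error on an unknown symbol, rather than passing it through, makes a typo in a transcription visible immediately.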

Lexicons

All Catalan lexicons are (c) Laboratoire Parole et Langage, Aix-en-Provence, France:

  • cat.vocab contains a list of 94k different words;
  • cat.repl is used to convert symbols and abbreviations into their written form.

All of them are distributed under the terms of the GNU General Public License.

Help is welcome to create a cat_num.repl file that would allow SPPAS to convert numbers to their written form.

Pronunciation dictionary

The Catalan pronunciation dictionary was downloaded in 2014 from the Ralf catalog of dictionaries for the Simon ASR system at http://spirit.blau.in/simon/import-pls-dictionary/. Brigitte Bigi then converted its format and phoneset. Eva Bosch i Roura added some new words and phonetized them manually, and further entries were added from pronunciations observed in the Glissando corpus.

It is distributed under the terms of the GNU General Public License.

Acoustic Model

The acoustic model was trained on the Glissando corpus. It is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License. It was created with acmtrain.py, a Python script available in the SPPAS package.

Special thanks go to Juan-Maria Garrido for giving us access to the Glissando corpus:

GARRIDO, J. M. - ESCUDERO, D. - AGUILAR, L. - CARDEÑOSO, V. - RODERO, E. - DE-LA-MOTA, C. - GONZÁLEZ, C. - RUSTULLET, S. - LARREA, O. - LAPLAZA, Y. - VIZCAÍNO, F. - CABRERA, M. - BONAFONTE, A. (2013). Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan, Language Resources and Evaluation, DOI 10.1007/s10579-012-9213-0.