Italian Language

Download

This chapter describes the linguistic resources included in the file ita.zip of the lang folder.

List of phonemes

Consonant Plosives

SPPAS IPA Description Examples
p p voiceless bilabial primo, ampio, copertura
b b voiced bilabial banca, cibo
t t voiceless alveolar tranne, mito, Fiat
d d voiced alveolar dove, idra
k k voiceless velar cavolo, acuto, anche, quei
g g voiced velar gatto, agro, glifo, ghetto

Consonant Fricatives

SPPAS IPA Description Examples
f f voiceless labiodental fatto, fosforo
s s voiceless alveolar sano, scatola, presentire
S ʃ voiceless postalveolar scena, sciame, pesci
z z voiced alveolar sbavare, presentare, asma
v v voiced labiodental vado, povero

Consonant Nasals

SPPAS IPA Description Examples
m m bilabial mano, amare, campo
n n alveolar nano, punto, pensare, anfibio
J ɲ palatal gnocco, ogni
N ŋ voiced velar fango, unghia, panchina, dunque

Consonant Liquids

SPPAS IPA Description Examples
l l alveolar lateral lato, lievemente
L ʎ palatal lateral gli, glielo, maglia
r r alveolar trill Roma, quattro, morte

Semivowels

SPPAS IPA Description Examples
j j voiced palatal ieri, più, Jesi
w w voiced labiovelar uovo, fuoco, qui

Vowels

SPPAS IPA Description Examples
E ɛ open-mid front unrounded elica, cioè
a a open front unrounded alto, sarà
O ɔ open-mid back rounded otto, posso, sarò
o o close-mid back rounded ombra, come
e e close-mid front unrounded vero, perché
i i close front unrounded imposta, colibrì, zie
u u close back rounded ultimo, caucciù, tuo
a~ ã nasal -
e~ ɛ̃ nasal -
O~ ɔ̃ nasal -

Affricates

SPPAS IPA Description Examples
tS t͡ʃ voiceless postalveolar Cennini, cinque, ciao
ts t͡s voiceless alveolar sozzo, canzone, marzo
dz d͡z voiced alveolar zaino, zelare, mezzo
dZ d͡ʒ voiced postalveolar giungla, magia, fingere

Fillers

SPPAS Description
laugh laughter
noise noises, unintelligible speech
dummy un-transcribed speech
fp filled pause (eh, ah)
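
As an illustration only, the symbol tables above can be transcribed into a lookup that converts a space-separated SPPAS phoneme string into IPA. This is a minimal sketch and not part of SPPAS: the names SPPAS_TO_IPA, FILLERS and sppas_to_ipa are hypothetical, the angle-bracket rendering of fillers is an arbitrary choice, and it assumes that ɔ is written as upper-case O (consistent with O~) so that it stays distinct from o.

```python
# SPPAS symbol -> IPA, transcribed from the phoneme tables of this chapter.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "f": "f", "s": "s", "S": "ʃ", "z": "z", "v": "v",
    # nasals
    "m": "m", "n": "n", "J": "ɲ", "N": "ŋ",
    # liquids
    "l": "l", "L": "ʎ", "r": "r",
    # semivowels
    "j": "j", "w": "w",
    # vowels (assumption: "O" denotes ɔ, see the note above)
    "E": "ɛ", "a": "a", "O": "ɔ", "o": "o", "e": "e", "i": "i", "u": "u",
    "a~": "ã", "e~": "ɛ̃", "O~": "ɔ̃",
    # affricates
    "tS": "t͡ʃ", "ts": "t͡s", "dz": "d͡z", "dZ": "d͡ʒ",
}

# Filler symbols have no IPA equivalent.
FILLERS = {"laugh", "noise", "dummy", "fp"}


def sppas_to_ipa(phonemes: str) -> str:
    """Convert a space-separated SPPAS phoneme string into an IPA string."""
    out = []
    for symbol in phonemes.split():
        if symbol in FILLERS:
            # Arbitrary rendering chosen for this sketch.
            out.append(f"<{symbol}>")
        elif symbol in SPPAS_TO_IPA:
            out.append(SPPAS_TO_IPA[symbol])
        else:
            raise ValueError(f"unknown SPPAS symbol: {symbol!r}")
    return "".join(out)
```

For example, sppas_to_ipa("J O k o") yields "ɲɔko". Unknown symbols raise an error rather than passing through silently, which makes typos in a transcription easier to spot.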

Lexicons

All Italian lexicons are (c) Laboratoire Parole et Langage, Aix-en-Provence, France:

  • ita.vocab contains a list of 389k different words;
  • ita_num.repl converts numbers to their written form;
  • ita.repl converts symbols and abbreviations to their written form.

All of them are distributed under the terms of the GNU General Public License.

Pronunciation dictionary

The Italian dictionary was downloaded in 2011 from the Festival speech synthesizer. Many of its phonetizations were manually corrected by Brigitte Bigi, and a large set of missing words and pronunciation variants was added manually.

It is distributed under the terms of the GNU General Public License.
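
The following sketch shows how pronunciation variants could be grouped per word when reading such a dictionary. The line layout is an assumption, not something stated in this chapter: each entry is taken to be "word [word] p1 p2 ...", with variants marked as "word(2)", "word(3)", and so on. The function name parse_dict_lines and the sample entries are hypothetical and are not taken from the distributed file.

```python
import re
from collections import defaultdict

# Assumed entry layout: word, optional "(n)" variant marker,
# a bracketed field, then space-separated phonemes.
_ENTRY = re.compile(
    r"^(?P<word>\S+?)(?:\(\d+\))?\s+\[[^\]]*\]\s+(?P<phones>.+)$"
)


def parse_dict_lines(lines):
    """Group pronunciation variants by word; skip lines that do not match."""
    prons = defaultdict(list)
    for line in lines:
        match = _ENTRY.match(line.strip())
        if match is None:
            continue
        prons[match.group("word")].append(match.group("phones").split())
    return dict(prons)


# Invented sample entries, for illustration only.
sample = [
    "casa [casa] k a s a",
    "casa(2) [casa] k a z a",
    "ciao [ciao] tS a o",
]
entries = parse_dict_lines(sample)
```

With these sample lines, entries["casa"] holds two variants and entries["ciao"] holds one.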

Acoustic Model

The Italian acoustic model was created during the Evalita 2011 evaluation campaign, from the CLIPS MapTask corpus (3h30), and updated during the Evalita 2014 evaluation campaign.

It is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.

Three nasalized vowels were added in September 2021, but they are not currently used in the pronunciation dictionary.

Syllabification configuration file

The syllabification configuration file corresponds to the rules defined in the paper (Bigi and Petrone, 2014). This file is distributed under the terms of the GNU General Public License.

References

Brigitte Bigi, Caterina Petrone (2014). A generic tool for the automatic syllabification of Italian. In Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 and of the Fourth International Workshop EVALITA 2014, pp. 73-77, Pisa, Italy.

Brigitte Bigi (2014). The SPPAS participation to Evalita 2014. In Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 and the Fourth International Workshop EVALITA 2014, Pisa, Italy.

Brigitte Bigi (2012). The SPPAS participation to Evalita 2011. Lecture Notes in Artificial Intelligence, LNAI-7689, pp. 312-321. Rome, Italy.