Polish Language

Download

This chapter describes the linguistic resources included in the file pol.zip of the lang folder.

List of phonemes

Consonant Plosives

SPPAS  IPA  Description             Examples
p      p    voiceless bilabial      paw, pan
b      b    voiced bilabial         bal, ból
t      t    voiceless (post)dental  tak, tata
d      d    voiced (post)dental     dom, dawać
k      k    voiceless velar         kot, kawa
g      g    voiced velar            gar, gwiazda

Consonant Fricatives

SPPAS  IPA  Description                             Examples
f      f    voiceless labiodental                   fan, fotel, faza
s      s    voiceless (post)dental                  sabat, subaru, sen
s\     ɕ    voiceless alveolo-palatal               świerszcz, śpi, się, siwy
s`     ʂ    voiceless alveolar with retroflex hook
z      z    voiced (post)dental                     za, ząb
z\     ʑ    voiced alveolo-palatal                  źrebak, zima, ziemia
z`     ʐ    voiced retroflex
Z      ʒ    voiced alveolar                         że, żaba, rzeka
v      v    voiced labiodental                      wrak, lawenda
x      x    voiceless velar                         hak, chór
S      ʃ    voiceless postalveolar                  szum, sztama, Szczecin, Warszawa

Consonant Nasals

SPPAS  IPA  Description      Examples
m      m    bilabial         mój, mama
n      n    (post)dental     nos, nowy
n`     ɳ    retroflex
N      ŋ    velar            bank, gang
n\     ɲ    alveolo-palatal

Consonant Liquids

SPPAS  IPA  Description           Examples
l      l    lateral (post)dental  las, lato
r      r    alveolar trill/flap   rok, rata, krok

Semivowels / Approximants

SPPAS  IPA  Description        Examples
j      j    front approximant  ja, jajo, już
w      w    back approximant   ławka, łyk, łże

Vowels

SPPAS  IPA  Description                   Examples
a      a    open front unrounded          pat, ptak, Ala, adres
E      ɛ    open-mid front unrounded      test, ten, Ewa, deszcz
O      ɔ    open-mid back rounded         pot, kot, Ola, ogród, ogórek
i      i    close front unrounded         miś, Irena, instytut
I\     ɨ    near-close central unrounded  ryba, mysz, być
u      u    close back rounded            bum, uwaga, tutaj, wóz
y      y    close front rounded           mysz

Nasal vowels

SPPAS  IPA  Examples
E~     ɛ̃    węże, kęsy
o~     ɔ̃    wąż, mąż, wąsy

Affricates

SPPAS  IPA  Description                Examples
t^S    t͡ʃ   voiceless alveolar         czas, czy, czwartek
t^s    t͡s   voiceless (post)dental     co, cały, Francja
t^s\   t͡ɕ   voiceless alveolo-palatal  ćwiczenie, pamięć
d^z    d͡z   voiced (post)dental        dzwon, sadza
d^z\   d͡ʑ   voiced alveolo-palatal     dźwięk, dziwny, niedziela
d^Z    d͡ʒ   voiced alveolar            dżem, drożdże, dżuma, dżungla
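
The SPPAS and IPA columns of the tables above amount to a symbol-to-symbol mapping. The following is a minimal sketch of how such a mapping could be used to render a sequence of SPPAS symbols in IPA. The names SPPAS_TO_IPA and sppas_to_ipa are hypothetical and are not part of the SPPAS package; only a handful of rows are copied here, and the two example words were segmented by hand for this illustration.

```python
# Partial SPPAS -> IPA mapping, copied from rows of the tables above.
# Only the symbols needed for the examples are listed; extend as required.
SPPAS_TO_IPA = {
    "k": "k", "t": "t", "s": "s", "m": "m",  # consonants
    "a": "a", "E": "ɛ", "O": "ɔ",            # vowels
    "t^S": "t͡ʃ", "d^Z": "d͡ʒ",                # affricates
}


def sppas_to_ipa(symbols):
    """Join the IPA equivalents of a sequence of SPPAS symbols.

    Raises KeyError if a symbol is missing from the mapping.
    """
    return "".join(SPPAS_TO_IPA[symbol] for symbol in symbols)


# Hand-segmented examples (an assumption, not dictionary output).
print(sppas_to_ipa(["k", "O", "t"]))    # kot
print(sppas_to_ipa(["d^Z", "E", "m"]))  # dżem
```

Multi-character symbols such as t^S are kept as single dictionary keys, so the input has to be a list of symbols rather than a plain string.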

Fillers

SPPAS  Description
laugh  laughter
noise  noises, unintelligible speech
dummy  un-transcribed speech

Lexicons

All Polish lexicons are (c) Laboratoire Parole et Langage, Aix-en-Provence, France:

  • pol.vocab contains a list of 500k different words;
  • pol_num.repl converts numbers to their written form;
  • pol.repl converts symbols and abbreviations into a text form.

All of them are distributed under the terms of the GNU General Public License.
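
To illustrate what a replacement lexicon does, here is a minimal sketch of token-by-token replacement. It assumes a hypothetical two-column layout (token, then replacement text, separated by whitespace); the actual layout of pol.repl and pol_num.repl is not described in this chapter and may differ. The sample entries are made up for the illustration and are not taken from the files, and the function names are hypothetical.

```python
def load_replacements(lines):
    """Parse 'token replacement' lines into a dict.

    Assumed layout (not confirmed by this chapter): the first
    whitespace-separated field is the token, the rest is its written form.
    Blank lines and lines without a replacement are skipped.
    """
    table = {}
    for line in lines:
        token, _, replacement = line.strip().partition(" ")
        replacement = replacement.strip()
        if token and replacement:
            table[token] = replacement
    return table


def apply_replacements(text, table):
    """Replace each whitespace-separated token found in the table."""
    return " ".join(table.get(token, token) for token in text.split())


# Made-up sample entries, for illustration only.
sample = ["2 dwa", "% procent"]
table = load_replacements(sample)
print(apply_replacements("2 %", table))  # dwa procent
```

Tokens that are absent from the table are passed through unchanged, so the same function can be applied with either lexicon in turn.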

Pronunciation Dictionary

The Polish pronunciation dictionary was downloaded in 2015 from the Ralf catalog of dictionaries for the Simon ASR system at http://spirit.blau.in/simon/import-pls-dictionary/.

It was then converted (format and phoneset) and corrected by Brigitte Bigi, with the help of Katarzyna Klessa (http://katarzyna.klessa.pl/). It was updated in 2017 to correct systematic errors.

It is distributed under the terms of the GNU General Public License.

Acoustic Model

The acoustic model was created by Brigitte Bigi. Special thanks go to Katarzyna Klessa for providing access to a corpus.

It is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.

The model was created using a Python script available in the SPPAS package: acmtrain.py.

References

Brigitte Bigi, Katarzyna Klessa (2015). Automatic Syllabification of Polish. In 7th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 262-266, Poznań, Poland.