Polish Language
Download
This chapter describes the linguistic resources included in the file pol.zip of the "Ortolang repository".
List of phonemes
Consonant Plosives
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| p | p | voiceless bilabial | paw, pan |
| b | b | voiced bilabial | bal, ból |
| t | t | voiceless (post)dental | tak, tata |
| d | d | voiced (post)dental | dom, dawać |
| k | k | voiceless velar | kot, kawa |
| g | g | voiced velar | gar, gwiazda |
Consonant Fricatives
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| f | f | voiceless labiodental | fan, fotel, faza |
| s | s | voiceless (post)dental | sabat, subaru, sen |
| s\ | ɕ | voiceless alveolo-palatal | świerszcz, śpi, się, siwy |
| s` | ʂ | voiceless retroflex | |
| z | z | voiced (post)dental | za, ząb |
| z\ | ʑ | voiced alveolo-palatal | źrebak, zima, ziemia |
| z` | ʐ | voiced retroflex | |
| Z | ʒ | voiced postalveolar | że, żaba, rzeka |
| v | v | voiced labiodental | wrak, lawenda |
| x | x | voiceless velar | hak, chór |
| S | ʃ | voiceless postalveolar | szum, sztama, Szczecin, Warszawa |
Consonant Nasals
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| m | m | bilabial | mój, mama |
| n | n | (post)dental | nos, nowy |
| n` | ɳ | retroflex | |
| N | ŋ | velar | bank, gang |
| n\ | ɲ | alveolo-palatal | |
Consonant Liquids
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| l | l | lateral (post)dental | las, lato |
| r | r | alveolar trill/flap | rok, rata, krok |
Semivowels / Approximants
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| j | j | front approximant | ja, jajo, już |
| w | w | back approximant | ławka, łyk, łże |
Vowels
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| a | a | open front unrounded | pat, ptak, Ala, adres |
| E | ɛ | open-mid front unrounded | test, ten, Ewa, deszcz |
| O | ɔ | open-mid back rounded | pot, kot, Ola, ogród, ogórek |
| i | i | close front unrounded | miś, Irena, instytut |
| I\ | ɨ | near-close central unrounded | ryba, mysz, być |
| u | u | close back rounded | bum, uwaga, tutaj, wóz |
| y | y | close front rounded | mysz |
Nasal vowels
| SPPAS | IPA | Examples |
|---|---|---|
| E~ | ɛ̃ | węże, kęsy |
| o~ | ɔ̃ | wąż, mąż, wąsy |
Affricates
| SPPAS | IPA | Description | Examples |
|---|---|---|---|
| t^S | t͡ʃ | voiceless postalveolar | czas, czy, czwartek |
| t^s | t͡s | voiceless (post)dental | co, cały, Francja |
| t^s\ | t͡ɕ | voiceless alveolo-palatal | ćwiczenie, pamięć |
| d^z | d͡z | voiced (post)dental | dzwon, sadza |
| d^z\ | d͡ʑ | voiced alveolo-palatal | dźwięk, dziwny, niedziela |
| d^Z | d͡ʒ | voiced postalveolar | dżem, drożdże, dżuma, dżungla |
Fillers
| SPPAS | Description |
|---|---|
| laugh | laughter |
| noise | noises, unintelligible speech |
| dummy | un-transcribed speech |
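The correspondences in the tables above can be used to turn a sequence of SPPAS symbols into IPA. The following is a minimal sketch, not part of SPPAS: the names `SPPAS_TO_IPA`, `FILLERS` and `sppas_to_ipa` are hypothetical, the input is assumed to be a whitespace-separated string of symbols, and only a subset of the table entries is included.

```python
# Illustrative subset of the SPPAS-to-IPA correspondences listed above.
# The retroflex symbols and the vowel "y" are deliberately left out.
SPPAS_TO_IPA = {
    # plosives
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    # fricatives
    "f": "f", "v": "v", "s": "s", "z": "z", "x": "x",
    "S": "ʃ", "Z": "ʒ", "s\\": "ɕ", "z\\": "ʑ",
    # nasals, liquids, semivowels
    "m": "m", "n": "n", "N": "ŋ", "n\\": "ɲ",
    "l": "l", "r": "r", "j": "j", "w": "w",
    # oral and nasal vowels
    "a": "a", "E": "ɛ", "O": "ɔ", "i": "i", "I\\": "ɨ", "u": "u",
    "E~": "ɛ̃", "o~": "ɔ̃",
    # affricates
    "t^S": "t͡ʃ", "t^s": "t͡s", "t^s\\": "t͡ɕ",
    "d^z": "d͡z", "d^z\\": "d͡ʑ", "d^Z": "d͡ʒ",
}

# Fillers have no IPA equivalent; they are kept as bracketed labels.
FILLERS = {"laugh", "noise", "dummy"}


def sppas_to_ipa(phonemes):
    """Convert a whitespace-separated string of SPPAS symbols to IPA.

    Assumption: symbols are separated by whitespace. Unknown symbols
    raise a ValueError instead of being silently dropped.
    """
    converted = []
    for symbol in phonemes.split():
        if symbol in FILLERS:
            converted.append("<" + symbol + ">")
        elif symbol in SPPAS_TO_IPA:
            converted.append(SPPAS_TO_IPA[symbol])
        else:
            raise ValueError("unknown SPPAS symbol: " + repr(symbol))
    return "".join(converted)


# "Warszawa" and "czas" appear as examples in the tables above.
warszawa = sppas_to_ipa("v a r S a v a")
czas = sppas_to_ipa("t^S a s")
```

Raising an error on unknown symbols is a design choice of this sketch: it makes gaps in the mapping visible, which is useful when the table is only partially transcribed.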
Lexicons
All Polish lexicons are (c) Laboratoire Parole et Langage, Aix-en-Provence, France:
- pol.vocab contains a list of 500k different words;
- pol_num.repl converts numbers to their written form;
- pol.repl converts symbols and abbreviations into their text form.
All of them are distributed under the terms of the GNU General Public License.
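To illustrate what a replacement lexicon does during text normalization, here is a minimal sketch of token-by-token substitution. It makes several assumptions: the table is a hypothetical in-memory dictionary whose entries are invented for illustration and are not copied from pol.repl, the function name `normalize` is hypothetical, and the actual file format and SPPAS implementation are not described in this chapter.

```python
# Hypothetical replacement table. These entries are illustrative only and
# are NOT taken from pol.repl.
REPLACEMENTS = {
    "%": "procent",
    "&": "i",
    "nr": "numer",
}


def normalize(text, table):
    """Replace each whitespace-separated token that has an entry in the table.

    Tokens without an entry are kept unchanged.
    """
    return " ".join(table.get(token, token) for token in text.split())


normalized = normalize("10 % nr 5", REPLACEMENTS)
```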
Pronunciation Dictionary
The Polish pronunciation dictionary was downloaded in 2015 from the Ralf catalog of dictionaries for the Simon ASR system at http://spirit.blau.in/simon/import-pls-dictionary/.
It was then converted (format and phoneset) and corrected by Brigitte Bigi, with the help of Katarzyna Klessa http://katarzyna.klessa.pl/. It was updated in 2017 to correct systematic errors.
It is distributed under the terms of the GNU General Public License.
Acoustic Model
The acoustic model was created by Brigitte Bigi. We address special thanks to Katarzyna Klessa for giving us access to a corpus.
It is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License.
The model was created using a Python script available in the SPPAS package: acmtrain.py.
References
Brigitte Bigi, Katarzyna Klessa (2015). Automatic Syllabification of Polish. In 7th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pp. 262-266, Poznań, Poland.