annotations.Phon package

Submodules

annotations.Phon.dagphon module

filename

sppas.src.annotations.Phon.dagphon.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Directed Acyclic Graph for the phonetization of unknown entries.

class annotations.Phon.dagphon.sppasDAGPhonetizer(variants=4)[source]

Bases: object

DAG used to phonetize unknown entries.

DAG2phon(graph, pron_graph)[source]

Convert a DAG into a dict, including all pronunciation variants.

Parameters
  • graph

  • pron_graph

Returns

__init__(variants=4)[source]

Create a sppasDAGPhonetizer instance.

Parameters

variants – (int) Maximum number of variants for phonetizations.

decompose(pron1, pron2='')[source]

Create a decomposed phonetization from a string as follows:

>>> self.decompose("p1 p2|x2 p3|x3")
p1-p2-p3|p1-p2-x3|p1-x2-p3|p1-x2-x3

The input string is converted into a DAG, then output corresponds to all paths.
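The expansion described above can be sketched in a few lines. This is only an illustration of enumerating all paths, assuming that whitespace separates positions and pipes separate alternatives; `decompose_sketch` is a hypothetical helper that uses a Cartesian product rather than the DAG built by sppasDAGPhonetizer.

```python
from itertools import product


def decompose_sketch(pron):
    """Expand 'p1 p2|x2 p3|x3' into every path, joined by '|'.

    Hypothetical helper for illustration, not the SPPAS implementation.
    """
    # Each whitespace-separated position holds one or more alternatives.
    slots = [slot.split("|") for slot in pron.split()]
    # The Cartesian product enumerates every path through the positions.
    paths = ["-".join(path) for path in product(*slots)]
    return "|".join(paths)


print(decompose_sketch("p1 p2|x2 p3|x3"))
# p1-p2-p3|p1-p2-x3|p1-x2-p3|p1-x2-x3
```

The number of paths is the product of the number of alternatives at each position, which is why a maximum number of variants is useful on long entries.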

phon2DAG(pron)[source]

Convert a phonetization into a DAG.

Parameters

pron

set_variants(v)[source]

Set the maximum number of variants.

Parameters

v – (int) If v is set to 0, all variants will be returned.

annotations.Phon.phonetize module

filename

sppas.src.annotations.Phon.phonetize.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Phonetization of an utterance.

class annotations.Phon.phonetize.sppasDictPhonetizer(pdict, maptable=None)[source]

Bases: object

Dictionary-based automatic phonetization.

Grapheme-to-phoneme conversion is a complex task, for which a number of diverse solutions have been proposed. It is a structure prediction task; both the input and output are structured, consisting of sequences of letters and phonemes, respectively.

This phonetization system is entirely designed to handle multiple languages and/or tasks with the same algorithms and the same tools. Only resources are language-specific, and the approach relies on the simplest possible resources: this automatic annotation uses a dictionary-based approach.

The dictionary can contain words with a set of pronunciations (the canonical one, and optionally some common reductions, etc). In this approach, it is then assumed that most of the words of the speech transcription and their phonetic variants are mentioned in the pronunciation dictionary. If a word is missing, our system is based on the idea that given enough examples it should be possible to predict the pronunciation of unseen words purely by analogy.

__init__(pdict, maptable=None)[source]

Create a sppasDictPhonetizer instance.

Parameters
  • pdict – (sppasDictPron) The pronunciation dictionary.

  • maptable – (Mapping) A mapping table for phones.

get_dict_filename()[source]

get_phon_entry(entry)[source]

Return the phonetization of an entry.

Unknown entries are not automatically phonetized. This is a pure dictionary-based method.

Parameters

entry – (str) The entry to be phonetized.

Returns

A string with the phonetization of the given entry or the unknown symbol.

get_phon_tokens(tokens, phonunk=True)[source]

Return the phonetization of a list of tokens, with the status.

Unknown entries are automatically phonetized if phonunk is set to True.

Parameters
  • tokens – (list) The list of tokens to be phonetized.

  • phonunk – (bool) Phonetize unknown words (or not).

TODO: EOT is not fully supported.

Returns

A list of tuples (token, phon, status).

phonetize(utterance, phonunk=True, delimiter=' ')[source]

Return the phonetization of an utterance.

Parameters
  • utterance – (str) The utterance string to be phonetized.

  • phonunk – (bool) Phonetize unknown words (or not).

  • delimiter – (char) The character used to separate entries in the result, and which was used in the given utterance.

Returns

A string with the phonetization of the given utterance.
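As a rough illustration of the dictionary look-up behind this method, the sketch below phonetizes an utterance with a plain Python dict. It is not the SPPAS code: `phonetize_sketch`, the `UNK` placeholder, and the toy dictionary are assumptions made for the example, and unknown entries are simply replaced by the placeholder (no automatic phonetization).

```python
UNK = "<UNK>"  # hypothetical placeholder standing for the unknown symbol


def phonetize_sketch(utterance, pron_dict, delimiter=" "):
    """Look up each entry of the utterance in a plain dict.

    Hypothetical helper for illustration, not the SPPAS implementation.
    """
    phones = []
    for token in utterance.split(delimiter):
        if not token:
            continue  # skip empty entries caused by repeated delimiters
        # A missing entry is replaced by the unknown placeholder.
        phones.append(pron_dict.get(token, UNK))
    return delimiter.join(phones)


# Toy dictionary: variants are separated by '|', phonemes by '-'.
toy_dict = {"the": "D-@|D-i:", "cat": "k-{-t"}
print(phonetize_sketch("the cat sat", toy_dict))
# D-@|D-i: k-{-t <UNK>
```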

set_dict(pron_dict)[source]

Set the pronunciation dictionary.

Parameters

pron_dict – (sppasDictPron) The pronunciation dictionary.

set_maptable(map_table)[source]

Set the mapping table dictionary.

Parameters

map_table – (Mapping) The mapping table dictionary.

set_unk_variants(value)[source]

Set the maximum number of variants for unknown entries.

Parameters

value – (int) If value is set to 0, all variants will be returned.

annotations.Phon.phonunk module

filename

sppas.src.annotations.Phon.phonunk.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Phonetization of an unknown entry.

class annotations.Phon.phonunk.sppasPhonUnk(pron_dict)[source]

Bases: object

Perform a dictionary-based phonetization for unknown entries.

Implements a language-independent algorithm to phonetize unknown tokens. The algorithm is based on the idea that given enough examples it should be possible to predict the pronunciation of unseen tokens purely by analogy. It explores the unknown token from left to right, then from right to left, to find the longest strings in the dictionary. Since this algorithm uses the dictionary, the quality of such a phonetization strongly depends on this resource.

Example of use:

>>> d = { 'a':'a|aa', 'b':'b', 'c':'c|cc', 'abb':'abb', 'bac':'bac' }
>>> p = sppasPhonUnk(d)

__init__(pron_dict)[source]

Create a sppasPhonUnk instance.

Parameters

pron_dict – (sppasDictPron) Dictionary mapping each token (key) to its phonetization (value).

get_phon(entry)[source]

Return the phonetization of an unknown entry.

Parameters

entry – (str) the string to phonetize

Returns

a string with the proposed phonetization

Raises

Exception if the word can NOT be phonetized
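The left-to-right part of the longest-match idea can be sketched as follows. This is a simplified illustration on a plain dict, not the sppasPhonUnk implementation: `longest_match_sketch` is a hypothetical helper, it only scans in one direction, and it returns the matched pieces as a list rather than a formatted string.

```python
def longest_match_sketch(entry, pron_dict):
    """Greedy left-to-right segmentation into the longest known strings.

    Hypothetical helper for illustration, not the SPPAS implementation.
    """
    pieces = []
    start = 0
    while start < len(entry):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(entry), start, -1):
            piece = entry[start:end]
            if piece in pron_dict:
                pieces.append(pron_dict[piece])
                start = end
                break
        else:
            # No substring starting here is in the dictionary.
            raise ValueError("cannot phonetize %r" % entry[start:])
    return pieces


d = {'a': 'a|aa', 'b': 'b', 'c': 'c|cc', 'abb': 'abb', 'bac': 'bac'}
print(longest_match_sketch("abbac", d))
# ['abb', 'a|aa', 'c|cc']
```

A right-to-left scan of the same entry would match "bac" first, so the two directions can propose different segmentations.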

set_variants(v)[source]

Set the maximum number of variants.

Parameters

v – (int) If v is set to 0, all variants will be returned.

annotations.Phon.sppasphon module

filename

sppas.src.annotations.Phon.sppasphon.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

SPPAS integration of Phonetization automatic annotation.

class annotations.Phon.sppasphon.sppasPhon(log=None)[source]

Bases: annotations.baseannot.sppasBaseAnnotation

SPPAS integration of the Phonetization automatic annotation.

__init__(log=None)[source]

Create a sppasPhon instance without any linguistic resources.

Log is used for a better communication of the annotation process and its results. If None, logs are redirected to the default logging system.

Parameters

log – (sppasLog) Human-readable logs.

convert(tier)[source]

Phonetize annotations of a tokenized tier.

Parameters

tier – (Tier) the ortho transcription previously tokenized.

Returns

(Tier) phonetized tier with name “Phones”

fix_options(options)[source]

Set all options.

Available options are:

  • phonunk

  • usestdtokens

Parameters

options – (sppasOption)

get_input_patterns()[source]

Pattern this annotation expects for its input filename.

get_inputs(input_files)[source]

Return the tier with aligned tokens.

Parameters

input_files – (list)

Raises

NoTierInputError

Returns

(sppasTier)

get_output_pattern()[source]

Pattern this annotation uses in an output filename.

load_resources(dict_filename=None, map_filename=None, **kwargs)[source]

Set the pronunciation dictionary and the mapping table.

Parameters
  • dict_filename – (str) The pronunciation dictionary in HTK-ASCII format with UTF-8 encoding.

  • map_filename – (str) The filename of a mapping table. It is used to generate new pronunciations by mapping phonemes of the dict.

run(input_files, output=None)[source]

Run the automatic annotation process on an input.

Parameters
  • input_files – (list of str) Normalized text

  • output – (str) the output name

Returns

(sppasTranscription)

set_unk(unk)[source]

Set the unk option value.

Parameters

unk – (bool) If unk is set to True, the system attempts to phonetize unknown entries (i.e. tokens missing in the dictionary). Otherwise, the phonetization of an unknown entry unit is set to the default stamp.

set_usestdtokens(stdtokens)[source]

Set the stdtokens option.

Parameters

stdtokens – (bool) If it is set to True, the phonetization uses the standard transcription as input, instead of the faked transcription. This option makes sense only for an Enriched Orthographic Transcription.

Module contents

filename

sppas.src.annotations.Phon.__init__.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Multilingual system for Phonetization.

Phonetization is the process of representing sounds with phonetic signs. There are two general ways to construct a phonetization process: dictionary-based solutions, which consist in storing a maximum of phonological knowledge in a lexicon, and rule-based systems, with rules based on inference approaches or proposed by expert linguists.

A system based on a dictionary solution consists in storing a maximum of phonological knowledge in a lexicon. In this sense, this approach is language-independent unlike rule-based systems. The SPPAS phonetization of the orthographic transcription produces a phonetic transcription based on a phonetic dictionary. The phonetization is the equivalent of a sequence of dictionary look-ups.

It is assumed that all words of the speech transcription are mentioned in the pronunciation dictionary. Otherwise, SPPAS implements a language-independent algorithm to phonetize unknown words. This implementation is in its early stage. It explores the unknown word from left to right to find the longest strings in the dictionary. Since this algorithm uses the dictionary, the quality of such a phonetization will depend on this resource.

Actually, some words can correspond to several entries in the dictionary with various pronunciations. Unlike rule-based systems, in SPPAS the pronunciation is not supposed to be “standard”. Phonetic variants are proposed for the aligner to choose the phoneme string. The hypothesis is that the answer to the phonetization question is in the signal.

SPPAS can take as input a tokenized standard orthographic transcription and some enrichment only if the acoustic model includes them. For example, the French transcriptions can contain laugh (represented by the symbol ‘@’ in the transcription).

The SPPAS phonetization follows the conventions:

  • whitespace separates tokens,

  • minus signs separate phonemes,

  • pipes separate phonetic variants.
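The three conventions above are enough to read a phonetized string back into a nested structure. The sketch below is only an illustration; `parse_phonetization` is a hypothetical helper and the sample string is invented.

```python
def parse_phonetization(phon):
    """Split a phonetized string into tokens, variants, and phonemes.

    Hypothetical helper for illustration, not part of SPPAS.
    """
    return [
        # '|' separates the variants of a token, '-' the phonemes of a variant.
        [variant.split("-") for variant in token.split("|")]
        # Whitespace separates tokens.
        for token in phon.split()
    ]


parsed = parse_phonetization("D-@|D-i: k-{-t")
print(parsed)
# [[['D', '@'], ['D', 'i:']], [['k', '{', 't']]]
```

Here the first token has two variants of two phonemes each, and the second token has a single variant of three phonemes.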

For details, read the following reference:

Brigitte Bigi (2016).
A phonetization approach for the forced-alignment task in SPPAS.
Human Language Technology. Challenges for Computer Science and
Linguistics, LNAI 9561, Springer, pp. 515–526.

To summarize:

A phoneme is the smallest structural unit that distinguishes meaning in a language. Phonemes are not the physical segments themselves, but are cognitive abstractions or categorizations of them. On the other hand, phones refer to the instances of phonemes in the actual utterances - i.e. the physical segments.

Phonetization consists in searching the possible phones of the given utterance. In the approach implemented in this package, phonetic variants are included in the result.

class annotations.Phon.sppasDictPhonetizer(pdict, maptable=None)[source]

Bases: object

Dictionary-based automatic phonetization.

Grapheme-to-phoneme conversion is a complex task, for which a number of diverse solutions have been proposed. It is a structure prediction task; both the input and output are structured, consisting of sequences of letters and phonemes, respectively.

This phonetization system is entirely designed to handle multiple languages and/or tasks with the same algorithms and the same tools. Only resources are language-specific, and the approach relies on the simplest possible resources: this automatic annotation uses a dictionary-based approach.

The dictionary can contain words with a set of pronunciations (the canonical one, and optionally some common reductions, etc). In this approach, it is then assumed that most of the words of the speech transcription and their phonetic variants are mentioned in the pronunciation dictionary. If a word is missing, our system is based on the idea that given enough examples it should be possible to predict the pronunciation of unseen words purely by analogy.

__init__(pdict, maptable=None)[source]

Create a sppasDictPhonetizer instance.

Parameters
  • pdict – (sppasDictPron) The pronunciation dictionary.

  • maptable – (Mapping) A mapping table for phones.

get_dict_filename()[source]

get_phon_entry(entry)[source]

Return the phonetization of an entry.

Unknown entries are not automatically phonetized. This is a pure dictionary-based method.

Parameters

entry – (str) The entry to be phonetized.

Returns

A string with the phonetization of the given entry or the unknown symbol.

get_phon_tokens(tokens, phonunk=True)[source]

Return the phonetization of a list of tokens, with the status.

Unknown entries are automatically phonetized if phonunk is set to True.

Parameters
  • tokens – (list) The list of tokens to be phonetized.

  • phonunk – (bool) Phonetize unknown words (or not).

TODO: EOT is not fully supported.

Returns

A list of tuples (token, phon, status).

phonetize(utterance, phonunk=True, delimiter=' ')[source]

Return the phonetization of an utterance.

Parameters
  • utterance – (str) The utterance string to be phonetized.

  • phonunk – (bool) Phonetize unknown words (or not).

  • delimiter – (char) The character used to separate entries in the result, and which was used in the given utterance.

Returns

A string with the phonetization of the given utterance.

set_dict(pron_dict)[source]

Set the pronunciation dictionary.

Parameters

pron_dict – (sppasDictPron) The pronunciation dictionary.

set_maptable(map_table)[source]

Set the mapping table dictionary.

Parameters

map_table – (Mapping) The mapping table dictionary.

set_unk_variants(value)[source]

Set the maximum number of variants for unknown entries.

Parameters

value – (int) If value is set to 0, all variants will be returned.

class annotations.Phon.sppasPhon(log=None)[source]

Bases: annotations.baseannot.sppasBaseAnnotation

SPPAS integration of the Phonetization automatic annotation.

__init__(log=None)[source]

Create a sppasPhon instance without any linguistic resources.

Log is used for a better communication of the annotation process and its results. If None, logs are redirected to the default logging system.

Parameters

log – (sppasLog) Human-readable logs.

convert(tier)[source]

Phonetize annotations of a tokenized tier.

Parameters

tier – (Tier) the ortho transcription previously tokenized.

Returns

(Tier) phonetized tier with name “Phones”

fix_options(options)[source]

Set all options.

Available options are:

  • phonunk

  • usestdtokens

Parameters

options – (sppasOption)

get_input_patterns()[source]

Pattern this annotation expects for its input filename.

get_inputs(input_files)[source]

Return the tier with aligned tokens.

Parameters

input_files – (list)

Raises

NoTierInputError

Returns

(sppasTier)

get_output_pattern()[source]

Pattern this annotation uses in an output filename.

load_resources(dict_filename=None, map_filename=None, **kwargs)[source]

Set the pronunciation dictionary and the mapping table.

Parameters
  • dict_filename – (str) The pronunciation dictionary in HTK-ASCII format with UTF-8 encoding.

  • map_filename – (str) The filename of a mapping table. It is used to generate new pronunciations by mapping phonemes of the dict.

run(input_files, output=None)[source]

Run the automatic annotation process on an input.

Parameters
  • input_files – (list of str) Normalized text

  • output – (str) the output name

Returns

(sppasTranscription)

set_unk(unk)[source]

Set the unk option value.

Parameters

unk – (bool) If unk is set to True, the system attempts to phonetize unknown entries (i.e. tokens missing in the dictionary). Otherwise, the phonetization of an unknown entry unit is set to the default stamp.

set_usestdtokens(stdtokens)[source]

Set the stdtokens option.

Parameters

stdtokens – (bool) If it is set to True, the phonetization uses the standard transcription as input, instead of the faked transcription. This option makes sense only for an Enriched Orthographic Transcription.

class annotations.Phon.sppasPhonUnk(pron_dict)[source]

Bases: object

Perform a dictionary-based phonetization for unknown entries.

Implements a language-independent algorithm to phonetize unknown tokens. The algorithm is based on the idea that given enough examples it should be possible to predict the pronunciation of unseen tokens purely by analogy. It explores the unknown token from left to right, then from right to left, to find the longest strings in the dictionary. Since this algorithm uses the dictionary, the quality of such a phonetization strongly depends on this resource.

Example of use:

>>> d = { 'a':'a|aa', 'b':'b', 'c':'c|cc', 'abb':'abb', 'bac':'bac' }
>>> p = sppasPhonUnk(d)

__init__(pron_dict)[source]

Create a sppasPhonUnk instance.

Parameters

pron_dict – (sppasDictPron) Dictionary mapping each token (key) to its phonetization (value).

get_phon(entry)[source]

Return the phonetization of an unknown entry.

Parameters

entry – (str) the string to phonetize

Returns

a string with the proposed phonetization

Raises

Exception if the word can NOT be phonetized

set_variants(v)[source]

Set the maximum number of variants.

Parameters

v – (int) If v is set to 0, all variants will be returned.