annotations.Syll package

Submodules

annotations.Syll.rules module

filename

sppas.src.annotations.Syll.rules.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Rules of the syllabification system.

class annotations.Syll.rules.SyllRules(filename=None)[source]

Bases: object

Manager of a set of rules for syllabification.

The rules we propose follow usual phonological statements for most of the corpus. A configuration file indicates phonemes, classes and rules. This file can be edited and modified to adapt the syllabification.

The syllable configuration file is a simple ASCII text file that the user can change as needed.

BREAK_SYMBOL = '#'

__init__(filename=None)[source]

Create a new SyllRules instance.

Parameters

filename – (str) Name of the file with the rules.

get_boundary(phonemes)[source]

Get the index of the syllable boundary (EXCRULES or GENRULES).

Phonemes are separated by the symbol defined in the separators.phonemes variable.

Parameters

phonemes – (str) Sequence of phonemes to syllabify

Returns

(int) boundary index or -1 if phonemes don’t match any rule.

get_class(phoneme)[source]

Return the class identifier of the phoneme.

If the phoneme is unknown, the break symbol is returned.

Parameters

phoneme – (str) A phoneme

Returns

class of the phoneme or break symbol
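The fallback to the break symbol can be pictured with a small stand-alone lookup. This is a minimal sketch, not the SPPAS implementation: the PHONEME_CLASSES table and the toy_get_class function are hypothetical names, and the class letters are borrowed from the classes_phonetized example on this page.

```python
# Hypothetical phoneme-to-class table (illustration only, not a SPPAS resource).
PHONEME_CLASSES = {"a": "V", "p": "P", "k": "P", "s": "F"}
BREAK_SYMBOL = "#"  # same value as SyllRules.BREAK_SYMBOL

def toy_get_class(phoneme):
    """Return the class of a phoneme, or the break symbol if it is unknown."""
    return PHONEME_CLASSES.get(phoneme, BREAK_SYMBOL)

print(toy_get_class("a"))    # V
print(toy_get_class("sil"))  # #  (unknown phoneme, so the break symbol)
```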

get_class_rules_boundary(classes)[source]

Get the index of the syllable boundary (EXCRULES or GENRULES).

Parameters

classes – (str) The class sequence to syllabify

Returns

(int) boundary index or -1 if it does not match any rule.
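The return contract (a boundary index, or -1 when nothing matches) can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the two rule tables, their contents, the priority given to exception rules, and the toy_class_rules_boundary name are all hypothetical and are not taken from a SPPAS configuration file.

```python
# Hypothetical rule tables (illustration only): each key is a class sequence
# running from one vowel to the next, each value is the boundary index.
EXCEPTION_RULES = {"VPLV": 0}
GENERAL_RULES = {"VV": 0, "VXV": 0, "VXXV": 1}

def toy_class_rules_boundary(classes):
    """Return the boundary index for a class sequence, or -1 if no rule matches."""
    # Assumption: a specific exception rule takes priority over a general one.
    if classes in EXCEPTION_RULES:
        return EXCEPTION_RULES[classes]
    # The toy general rules only distinguish vowels (V) from any other class (X).
    generic = "".join("V" if c == "V" else "X" for c in classes)
    return GENERAL_RULES.get(generic, -1)

print(toy_class_rules_boundary("VPLV"))   # 0, from the exception table
print(toy_class_rules_boundary("VPFV"))   # 1, from the general rule VXXV
print(toy_class_rules_boundary("VPFPV"))  # -1, no rule matches
```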

get_gap(phonemes)[source]

Return the shift to apply (OTHRULES).

Parameters

phonemes – (str) Phonemes to syllabify

Returns

(int) boundary shift

is_exception(rule)[source]

Return True if the rule is an exception rule.

Parameters

rule – (str)

load(filename)[source]

Load the rules from a file.

Parameters

filename – (str) Name of the file with the rules.

reset()[source]

Reset the set of rules.

annotations.Syll.sppassyll module

filename

sppas.src.annotations.Syll.sppassyll.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

SPPAS integration of the automatic syllabification annotation.

class annotations.Syll.sppassyll.sppasSyll(log=None)[source]

Bases: annotations.baseannot.sppasBaseAnnotation

SPPAS integration of the automatic syllabification annotation.

__init__(log=None)[source]

Create a new sppasSyll instance with only the general rules.

The log is used to better communicate the annotation process and its results. If None, logs are redirected to the default logging system.

Parameters

log – (sppasLog) Human-readable logs.

convert(phonemes, intervals=None)[source]

Syllabify the labels of a time-aligned phoneme tier.

Parameters
  • phonemes – (sppasTier) time-aligned phonemes tier

  • intervals – (sppasTier)

Returns

(sppasTier)

fix_options(options)[source]

Fix all options.

Available options are:

  • usesintervals

  • usesphons

  • tiername

  • createclasses

  • createstructures

Parameters

options – (sppasOption)

get_input_pattern()[source]

Pattern this annotation expects for its input filename.

get_inputs(input_files)[source]

Return the tier with aligned tokens.

Parameters

input_files – (list)

Raises

NoTierInputError

Returns

(sppasTier)

get_output_pattern()[source]

Pattern this annotation uses in an output filename.

load_resources(config_filename, **kwargs)[source]

Fix the syllabification rules from a configuration file.

Parameters

config_filename – Name of the configuration file with the rules

make_classes(syllables)[source]

Create the tier with syllable classes.

Parameters

syllables – (sppasTier)

run(input_files, output=None)[source]

Run the automatic annotation process on an input.

Parameters
  • input_files – (list of str) Time-aligned phonemes

  • output – (str) the output file name

Returns

(sppasTranscription)

set_create_tier_classes(create=True)[source]

Fix the createclasses option.

Parameters

create – (bool)

set_tiername(tier_name)[source]

Fix the tiername option.

Parameters

tier_name – (str)

set_usesintervals(mode)[source]

Fix the usesintervals option.

Parameters

mode – (bool) If mode is set to True, the syllabification operates inside specific (given) intervals.

set_usesphons(mode)[source]

Fix the usesphons option.

Parameters

mode – (bool) If mode is set to True, the syllabification operates by using only the tier with phonemes.

syllabify_interval(phonemes, from_p, to_p, syllables)[source]

Perform the syllabification of one interval.

Parameters
  • phonemes – (sppasTier)

  • from_p – (int) index of the first phoneme to be syllabified

  • to_p – (int) index of the last phoneme to be syllabified

  • syllables – (sppasTier)

annotations.Syll.syllabify module

filename

sppas.src.annotations.Syll.syllabify.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Syllabification of a sequence of phonemes.

class annotations.Syll.syllabify.Syllabifier(rules_filename=None)[source]

Bases: object

Syllabification of a sequence of phonemes.

__init__(rules_filename=None)[source]

Create a new Syllabifier instance.

Load the rules from a text file, which depends on the language and on the phoneme encoding. See the documentation for details about this file.

Parameters

rules_filename – (str) Name of the file with the list of rules.

annotate(phonemes)[source]

Return the syllable boundaries of a sequence of phonemes.

>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> Syllabifier("fra-config-file").annotate(phonemes)
[(0, 3), (4, 6)]

Parameters

phonemes – (list)

Returns

list of tuples (begin index, end index)

classes_phonetized(phonetized_syllable)[source]

Return the classes of a phonetized syllable.

>>> syllabifier = Syllabifier("fra-config-file")
>>> syllable = "a-p-s-k"
>>> syllabifier.classes_phonetized(syllable)
'V-P-F-P'

static phonetize_syllables(phonemes, syllables)[source]

Return the phonetized sequence of syllables.

>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> syllables = Syllabifier("fra-config-file").annotate(phonemes)
>>> Syllabifier.phonetize_syllables(phonemes, syllables)
'a-p-s-k.m-w-a'

Parameters
  • phonemes – (list) List of phonemes

  • syllables – list of tuples (begin index, end index)

Returns

(str) String representing the syllables segmentation
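The string format shown in the example can be reproduced with a few lines of plain Python. This is a minimal sketch, not the SPPAS code: toy_phonetize_syllables is a hypothetical name, the "-" and "." separators are inferred from the example output, and the end index of each tuple is assumed to be inclusive, as the example suggests.

```python
def toy_phonetize_syllables(phonemes, syllables):
    """Join phonemes with '-' inside a syllable and syllables with '.'."""
    return ".".join(
        "-".join(phonemes[begin:end + 1])  # end index is taken as inclusive
        for begin, end in syllables
    )

phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
print(toy_phonetize_syllables(phonemes, [(0, 3), (4, 6)]))  # a-p-s-k.m-w-a
```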

Module contents

filename

sppas.src.annotations.Syll.__init__.py

author

Brigitte Bigi

contact

develop@sppas.org

summary

Syllabification rule system.

The syllabification of phonemes is performed with a rule-based system. This RBS phoneme-to-syllable segmentation system is based on 2 main principles:

  • a syllable contains a vowel, and only one.

  • a pause is a syllable boundary.

These two principles reduce the task to finding the syllabic boundary between two vowels. As in state-of-the-art systems, phonemes are grouped into classes, and rules are established to deal with these classes.
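The two principles can be sketched as a small stand-alone function. This is an illustration under stated assumptions, not the SPPAS algorithm: the vowel set, the pause symbol, the toy_syllabify name and the placeholder rule (half of the consonants between two vowels stay with the first vowel) are all hypothetical, whereas the real system decides the cut from class-based rules.

```python
VOWELS = {"a", "e", "i", "o", "u"}  # hypothetical vowel inventory
PAUSE = "#"                         # hypothetical pause symbol

def toy_syllabify(phonemes, boundary_rule=lambda n: n // 2):
    """Return (begin, end) index pairs of syllables, end index inclusive.

    boundary_rule(n) gives how many of the n consonants found between two
    vowels stay with the first vowel (placeholder for the class-based rules).
    """
    syllables = []
    start = None       # index where the current syllable begins
    last_vowel = None  # index of the vowel of the current syllable
    for i, p in enumerate(phonemes):
        if p == PAUSE:
            # Principle 2: a pause always closes the current syllable.
            if last_vowel is not None:
                syllables.append((start, i - 1))
            start, last_vowel = None, None
            continue
        if start is None:
            start = i
        if p in VOWELS:
            if last_vowel is not None:
                # Principle 1: a second vowel requires a boundary in between.
                cut = last_vowel + boundary_rule(i - last_vowel - 1)
                syllables.append((start, cut))
                start = cut + 1
            last_vowel = i
    if last_vowel is not None:
        syllables.append((start, len(phonemes) - 1))
    return syllables

print(toy_syllabify(["p", "a", "t", "a", "#", "k", "a", "s", "t", "o"]))
# [(0, 1), (2, 3), (5, 7), (8, 9)]
```

In this sketch, a stretch of consonants that contains no vowel before a pause is simply dropped, since it cannot form a syllable under the first principle.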

For details, read the following reference:

B. Bigi, C. Meunier, I. Nesterenko, R. Bertrand (2010).
Automatic detection of syllable boundaries in spontaneous speech.
In Language Resources and Evaluation Conference, pp. 3285–3292,
Valletta, Malta.

class annotations.Syll.SyllRules(filename=None)[source]

Bases: object

Manager of a set of rules for syllabification.

The rules we propose follow usual phonological statements for most of the corpus. A configuration file indicates phonemes, classes and rules. This file can be edited and modified to adapt the syllabification.

The syllable configuration file is a simple ASCII text file that the user can change as needed.

BREAK_SYMBOL = '#'

__init__(filename=None)[source]

Create a new SyllRules instance.

Parameters

filename – (str) Name of the file with the rules.

get_boundary(phonemes)[source]

Get the index of the syllable boundary (EXCRULES or GENRULES).

Phonemes are separated by the symbol defined in the separators.phonemes variable.

Parameters

phonemes – (str) Sequence of phonemes to syllabify

Returns

(int) boundary index or -1 if phonemes don’t match any rule.

get_class(phoneme)[source]

Return the class identifier of the phoneme.

If the phoneme is unknown, the break symbol is returned.

Parameters

phoneme – (str) A phoneme

Returns

class of the phoneme or break symbol

get_class_rules_boundary(classes)[source]

Get the index of the syllable boundary (EXCRULES or GENRULES).

Parameters

classes – (str) The class sequence to syllabify

Returns

(int) boundary index or -1 if it does not match any rule.

get_gap(phonemes)[source]

Return the shift to apply (OTHRULES).

Parameters

phonemes – (str) Phonemes to syllabify

Returns

(int) boundary shift

is_exception(rule)[source]

Return True if the rule is an exception rule.

Parameters

rule – (str)

load(filename)[source]

Load the rules from a file.

Parameters

filename – (str) Name of the file with the rules.

reset()[source]

Reset the set of rules.

class annotations.Syll.Syllabifier(rules_filename=None)[source]

Bases: object

Syllabification of a sequence of phonemes.

__init__(rules_filename=None)[source]

Create a new Syllabifier instance.

Load the rules from a text file, which depends on the language and on the phoneme encoding. See the documentation for details about this file.

Parameters

rules_filename – (str) Name of the file with the list of rules.

annotate(phonemes)[source]

Return the syllable boundaries of a sequence of phonemes.

>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> Syllabifier("fra-config-file").annotate(phonemes)
[(0, 3), (4, 6)]

Parameters

phonemes – (list)

Returns

list of tuples (begin index, end index)

classes_phonetized(phonetized_syllable)[source]

Return the classes of a phonetized syllable.

>>> syllabifier = Syllabifier("fra-config-file")
>>> syllable = "a-p-s-k"
>>> syllabifier.classes_phonetized(syllable)
'V-P-F-P'

static phonetize_syllables(phonemes, syllables)[source]

Return the phonetized sequence of syllables.

>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> syllables = Syllabifier("fra-config-file").annotate(phonemes)
>>> Syllabifier.phonetize_syllables(phonemes, syllables)
'a-p-s-k.m-w-a'

Parameters
  • phonemes – (list) List of phonemes

  • syllables – list of tuples (begin index, end index)

Returns

(str) String representing the syllables segmentation

class annotations.Syll.sppasSyll(log=None)[source]

Bases: annotations.baseannot.sppasBaseAnnotation

SPPAS integration of the automatic syllabification annotation.

__init__(log=None)[source]

Create a new sppasSyll instance with only the general rules.

The log is used to better communicate the annotation process and its results. If None, logs are redirected to the default logging system.

Parameters

log – (sppasLog) Human-readable logs.

convert(phonemes, intervals=None)[source]

Syllabify the labels of a time-aligned phoneme tier.

Parameters
  • phonemes – (sppasTier) time-aligned phonemes tier

  • intervals – (sppasTier)

Returns

(sppasTier)

fix_options(options)[source]

Fix all options.

Available options are:

  • usesintervals

  • usesphons

  • tiername

  • createclasses

  • createstructures

Parameters

options – (sppasOption)

get_input_pattern()[source]

Pattern this annotation expects for its input filename.

get_inputs(input_files)[source]

Return the tier with aligned tokens.

Parameters

input_files – (list)

Raises

NoTierInputError

Returns

(sppasTier)

get_output_pattern()[source]

Pattern this annotation uses in an output filename.

load_resources(config_filename, **kwargs)[source]

Fix the syllabification rules from a configuration file.

Parameters

config_filename – Name of the configuration file with the rules

make_classes(syllables)[source]

Create the tier with syllable classes.

Parameters

syllables – (sppasTier)

run(input_files, output=None)[source]

Run the automatic annotation process on an input.

Parameters
  • input_files – (list of str) Time-aligned phonemes

  • output – (str) the output file name

Returns

(sppasTranscription)

set_create_tier_classes(create=True)[source]

Fix the createclasses option.

Parameters

create – (bool)

set_tiername(tier_name)[source]

Fix the tiername option.

Parameters

tier_name – (str)

set_usesintervals(mode)[source]

Fix the usesintervals option.

Parameters

mode – (bool) If mode is set to True, the syllabification operates inside specific (given) intervals.

set_usesphons(mode)[source]

Fix the usesphons option.

Parameters

mode – (bool) If mode is set to True, the syllabification operates by using only the tier with phonemes.

syllabify_interval(phonemes, from_p, to_p, syllables)[source]

Perform the syllabification of one interval.

Parameters
  • phonemes – (sppasTier)

  • from_p – (int) index of the first phoneme to be syllabified

  • to_p – (int) index of the last phoneme to be syllabified

  • syllables – (sppasTier)