annotations.Syll package
Submodules
annotations.Syll.rules module
- filename
sppas.src.annotations.Syll.rules.py
- author
Brigitte Bigi
- contact
- summary
Rules of the syllabification system.
- class annotations.Syll.rules.SyllRules(filename=None)
Bases:
object
Manager of a set of rules for syllabification.
The rules we propose follow usual phonological statements for most of the corpus. A configuration file indicates phonemes, classes and rules. This file can be edited and modified to adapt the syllabification.
The syllable configuration file is a simple ASCII text file that the user can change as needed.
- BREAK_SYMBOL = '#'
- __init__(filename=None)
Create a new SyllRules instance.
- Parameters
filename – (str) Name of the file with the rules.
- get_boundary(phonemes)
Get the index of the syllable boundary (EXCRULES or GENRULES).
Phonemes are separated with the symbol defined by the separators.phonemes variable.
- Parameters
phonemes – (str) Sequence of phonemes to syllabify
- Returns
(int) boundary index or -1 if phonemes don’t match any rule.
- get_class(phoneme)
Return the class identifier of the phoneme.
If the phoneme is unknown, the break symbol is returned.
- Parameters
phoneme – (str) A phoneme
- Returns
class of the phoneme or break symbol
- get_class_rules_boundary(classes)
Get the index of the syllable boundary (EXCRULES or GENRULES).
- Parameters
classes – (str) The class sequence to syllabify
- Returns
(int) boundary index or -1 if it does not match any rule.
- get_gap(phonemes)
Return the shift to apply (OTHRULES).
- Parameters
phonemes – (str) Phonemes to syllabify
- Returns
(int) boundary shift
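The lookup behaviour described for get_class and get_class_rules_boundary can be illustrated with a minimal sketch. This is not the SPPAS implementation: `ToyRules`, its in-memory tables, the class letters and the example rules are all hypothetical, and giving exception rules precedence over general rules is an assumption. Only the `'#'` break symbol and the "-1 when nothing matches" convention come from the descriptions above.

```python
BREAK_SYMBOL = "#"  # same value as SyllRules.BREAK_SYMBOL


class ToyRules:
    """Hypothetical in-memory stand-in for a rules file."""

    def __init__(self, classes, general, exceptions=None):
        self.classes = classes              # phoneme -> class letter
        self.general = general              # class sequence -> boundary index
        self.exceptions = exceptions or {}  # class sequence -> boundary index

    def get_class(self, phoneme):
        # Unknown phonemes fall back to the break symbol.
        return self.classes.get(phoneme, BREAK_SYMBOL)

    def get_class_rules_boundary(self, class_seq):
        # Assumption: exception rules are checked before general rules.
        if class_seq in self.exceptions:
            return self.exceptions[class_seq]
        # -1 signals that no rule matches the class sequence.
        return self.general.get(class_seq, -1)


rules = ToyRules(
    classes={"a": "V", "p": "P", "s": "F", "k": "P"},
    general={"VPV": 0, "VPPV": 1},  # invented rules, for illustration only
    exceptions={"VPPV": 0},         # invented exception overriding the rule above
)
print(rules.get_class("a"))                     # V
print(rules.get_class("?"))                     # #
print(rules.get_class_rules_boundary("VPV"))    # 0
print(rules.get_class_rules_boundary("VPPV"))   # 0
print(rules.get_class_rules_boundary("VFFFV"))  # -1
```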
annotations.Syll.sppassyll module
- filename
sppas.src.annotations.Syll.sppassyll.py
- author
Brigitte Bigi
- contact
- summary
SPPAS integration of Syllabification automatic annotation
- class annotations.Syll.sppassyll.sppasSyll(log=None)
Bases:
annotations.baseannot.sppasBaseAnnotation
SPPAS integration of the automatic syllabification annotation.
- __init__(log=None)
Create a new sppasSyll instance with only the general rules.
Log is used for a better communication of the annotation process and its results. If None, logs are redirected to the default logging system.
- Parameters
log – (sppasLog) Human-readable logs.
- convert(phonemes, intervals=None)
Syllabify labels of a time-aligned phones tier.
- Parameters
phonemes – (sppasTier) time-aligned phonemes tier
intervals – (sppasTier)
- Returns
(sppasTier)
- fix_options(options)
Fix all options.
Available options are:
usesintervals
usesphons
tiername
createclasses
createstructures
- Parameters
options – (sppasOption)
- get_inputs(input_files)
Return the tier with aligned tokens.
- Parameters
input_files – (list)
- Raise
NoTierInputError
- Returns
(sppasTier)
- load_resources(config_filename, **kwargs)
Fix the syllabification rules from a configuration file.
- Parameters
config_filename – Name of the configuration file with the rules
- make_classes(syllables)
Create the tier with syllable classes.
- Parameters
syllables – (sppasTier)
- run(input_files, output=None)
Run the automatic annotation process on an input.
- Parameters
input_files – (list of str) Time-aligned phonemes
output – (str) the output file name
- Returns
(sppasTranscription)
- set_create_tier_classes(create=True)
Fix the createclasses option.
- Parameters
create – (bool)
- set_usesintervals(mode)
Fix the usesintervals option.
- Parameters
mode – (bool) If mode is set to True, the syllabification operates inside specific (given) intervals.
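As a rough illustration of how a set of named options like the one listed under fix_options might be validated and applied, here is a self-contained sketch. It does not use the sppasOption class: `apply_options`, the plain-dict representation, the default values and the choice to raise `KeyError` on an unknown key are all assumptions made for the example. Only the five option names come from the list above.

```python
# Option names taken from the fix_options description.
KNOWN_OPTIONS = (
    "usesintervals",
    "usesphons",
    "tiername",
    "createclasses",
    "createstructures",
)


def apply_options(current, requested):
    """Return a copy of `current` updated with the entries of `requested`.

    Hypothetical helper: unknown keys are rejected with KeyError.
    """
    updated = dict(current)
    for key, value in requested.items():
        if key not in KNOWN_OPTIONS:
            raise KeyError("Unknown option: {}".format(key))
        updated[key] = value
    return updated


# Hypothetical defaults, chosen only for the example.
defaults = {"usesintervals": False, "createclasses": True}
options = apply_options(defaults, {"usesintervals": True})
print(options["usesintervals"])  # True
print(defaults["usesintervals"])  # False (the input dict is not modified)
```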
annotations.Syll.syllabify module
- filename
sppas.src.annotations.Syll.syllabify.py
- author
Brigitte Bigi
- contact
- summary
Syllabification of a sequence of phonemes.
- class annotations.Syll.syllabify.Syllabifier(rules_filename=None)
Bases:
object
Syllabification of a sequence of phonemes.
- __init__(rules_filename=None)
Create a new Syllabifier instance.
Load rules from a text file, depending on the language and phonemes encoding. See documentation for details about this file.
- Parameters
rules_filename – (str) Name of the file with the list of rules.
- annotate(phonemes)
Return the syllable boundaries of a sequence of phonemes.
>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> Syllabifier("fra-config-file").annotate(phonemes)
[(0, 3), (4, 6)]
- Parameters
phonemes – (list)
- Returns
list of tuples (begin index, end index)
- classes_phonetized(phonetized_syllable)
Return the classes of a phonetized syllable.
>>> syllable = "a-p-s-k"
>>> syllabifier.classes_phonetized(syllable)
"V-P-F-P"
- static phonetize_syllables(phonemes, syllables)
Return the phonetized sequence of syllables.
>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> syllables = Syllabifier("fra-config-file").annotate(phonemes)
>>> Syllabifier.phonetize_syllables(phonemes, syllables)
"a-p-s-k.m-w-a"
- Parameters
phonemes – (list) List of phonemes
syllables – list of tuples (begin index, end index)
- Returns
(str) String representing the syllables segmentation
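The relation between the (begin index, end index) tuples and the resulting string can be sketched as follows. `join_syllables` is a hypothetical re-implementation written for illustration, not the SPPAS code. It assumes that both indices are inclusive and that phonemes are joined with "-" and syllables with ".", which is what the example above suggests.

```python
def join_syllables(phonemes, syllables, phon_sep="-", syll_sep="."):
    """Join phonemes into a syllabified string.

    Each (begin, end) pair is treated as inclusive on both ends,
    so the slice must run up to end + 1.
    """
    chunks = [phon_sep.join(phonemes[b:e + 1]) for b, e in syllables]
    return syll_sep.join(chunks)


phonemes = ["a", "p", "s", "k", "m", "w", "a"]
print(join_syllables(phonemes, [(0, 3), (4, 6)]))  # a-p-s-k.m-w-a
```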
Module contents
- filename
sppas.src.annotations.Syll.__init__.py
- author
Brigitte Bigi
- contact
- summary
Syllabification rule system.
The syllabification of phonemes is performed with a rule-based system. This RBS phoneme-to-syllable segmentation system is based on 2 main principles:
a syllable contains a vowel, and only one.
a pause is a syllable boundary.
These two principles reduce the task to finding the syllable boundary between two consecutive vowels. As in state-of-the-art systems, phonemes are grouped into classes, and rules are defined over these classes.
For details, read the following reference:
B. Bigi, C. Meunier, I. Nesterenko, R. Bertrand (2010). Automatic detection of syllable boundaries in spontaneous speech. In Language Resource and Evaluation Conference, pp. 3285–3292, La Valetta, Malta.
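The two principles can be made concrete with a small sketch that works on a sequence of class letters. It is a toy, not the rule system described in the reference: `toy_syllabify` is a hypothetical function, "V" and "#" are assumed to mark vowels and pauses, and the boundary rule used between two vowels (keep a single consonant as the onset of the next syllable) is a placeholder for the class-based rules that the real system reads from its configuration file.

```python
def toy_syllabify(classes, vowel="V", pause="#"):
    """Return inclusive (begin, end) index pairs, one per syllable."""
    syllables = []
    start = 0
    # A sentinel pause closes the final segment.
    for i, c in enumerate(list(classes) + [pause]):
        if c != pause:
            continue
        # Principle 2: a pause is a syllable boundary.
        vowels = [j for j in range(start, i) if classes[j] == vowel]
        begin = start
        # Principle 1: exactly one vowel per syllable, so a boundary
        # must be placed between each pair of consecutive vowels.
        for v1, v2 in zip(vowels, vowels[1:]):
            # Placeholder rule: the next syllable starts one position
            # before its vowel, unless the two vowels are adjacent.
            nxt = max(v1 + 1, v2 - 1)
            syllables.append((begin, nxt - 1))
            begin = nxt
        if vowels:
            syllables.append((begin, i - 1))
        start = i + 1
    return syllables


print(toy_syllabify("VPV#PVPPV"))  # [(0, 0), (1, 2), (4, 6), (7, 8)]
```

Segments without any vowel produce no syllable in this sketch, which is also an assumption.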
- class annotations.Syll.SyllRules(filename=None)
Bases:
object
Manager of a set of rules for syllabification.
The rules we propose follow usual phonological statements for most of the corpus. A configuration file indicates phonemes, classes and rules. This file can be edited and modified to adapt the syllabification.
The syllable configuration file is a simple ASCII text file that the user can change as needed.
- BREAK_SYMBOL = '#'
- __init__(filename=None)
Create a new SyllRules instance.
- Parameters
filename – (str) Name of the file with the rules.
- get_boundary(phonemes)
Get the index of the syllable boundary (EXCRULES or GENRULES).
Phonemes are separated with the symbol defined by the separators.phonemes variable.
- Parameters
phonemes – (str) Sequence of phonemes to syllabify
- Returns
(int) boundary index or -1 if phonemes don’t match any rule.
- get_class(phoneme)
Return the class identifier of the phoneme.
If the phoneme is unknown, the break symbol is returned.
- Parameters
phoneme – (str) A phoneme
- Returns
class of the phoneme or break symbol
- get_class_rules_boundary(classes)
Get the index of the syllable boundary (EXCRULES or GENRULES).
- Parameters
classes – (str) The class sequence to syllabify
- Returns
(int) boundary index or -1 if it does not match any rule.
- get_gap(phonemes)
Return the shift to apply (OTHRULES).
- Parameters
phonemes – (str) Phonemes to syllabify
- Returns
(int) boundary shift
- class annotations.Syll.Syllabifier(rules_filename=None)
Bases:
object
Syllabification of a sequence of phonemes.
- __init__(rules_filename=None)
Create a new Syllabifier instance.
Load rules from a text file, depending on the language and phonemes encoding. See documentation for details about this file.
- Parameters
rules_filename – (str) Name of the file with the list of rules.
- annotate(phonemes)
Return the syllable boundaries of a sequence of phonemes.
>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> Syllabifier("fra-config-file").annotate(phonemes)
[(0, 3), (4, 6)]
- Parameters
phonemes – (list)
- Returns
list of tuples (begin index, end index)
- classes_phonetized(phonetized_syllable)
Return the classes of a phonetized syllable.
>>> syllable = "a-p-s-k"
>>> syllabifier.classes_phonetized(syllable)
"V-P-F-P"
- static phonetize_syllables(phonemes, syllables)
Return the phonetized sequence of syllables.
>>> phonemes = ['a', 'p', 's', 'k', 'm', 'w', 'a']
>>> syllables = Syllabifier("fra-config-file").annotate(phonemes)
>>> Syllabifier.phonetize_syllables(phonemes, syllables)
"a-p-s-k.m-w-a"
- Parameters
phonemes – (list) List of phonemes
syllables – list of tuples (begin index, end index)
- Returns
(str) String representing the syllables segmentation
- class annotations.Syll.sppasSyll(log=None)
Bases:
annotations.baseannot.sppasBaseAnnotation
SPPAS integration of the automatic syllabification annotation.
- __init__(log=None)
Create a new sppasSyll instance with only the general rules.
Log is used for a better communication of the annotation process and its results. If None, logs are redirected to the default logging system.
- Parameters
log – (sppasLog) Human-readable logs.
- convert(phonemes, intervals=None)
Syllabify labels of a time-aligned phones tier.
- Parameters
phonemes – (sppasTier) time-aligned phonemes tier
intervals – (sppasTier)
- Returns
(sppasTier)
- fix_options(options)
Fix all options.
Available options are:
usesintervals
usesphons
tiername
createclasses
createstructures
- Parameters
options – (sppasOption)
- get_inputs(input_files)
Return the tier with aligned tokens.
- Parameters
input_files – (list)
- Raise
NoTierInputError
- Returns
(sppasTier)
- load_resources(config_filename, **kwargs)
Fix the syllabification rules from a configuration file.
- Parameters
config_filename – Name of the configuration file with the rules
- make_classes(syllables)
Create the tier with syllable classes.
- Parameters
syllables – (sppasTier)
- run(input_files, output=None)
Run the automatic annotation process on an input.
- Parameters
input_files – (list of str) Time-aligned phonemes
output – (str) the output file name
- Returns
(sppasTranscription)
- set_create_tier_classes(create=True)
Fix the createclasses option.
- Parameters
create – (bool)
- set_usesintervals(mode)
Fix the usesintervals option.
- Parameters
mode – (bool) If mode is set to True, the syllabification operates inside specific (given) intervals.