annotations.Align.aligners package¶
Submodules¶
annotations.Align.aligners.aligner module¶
- filename
sppas.src.annotations.Align.aligners.aligner.py
- author
Brigitte Bigi
- contact
- summary
Aligners manager.
- class annotations.Align.aligners.aligner.sppasAligners[source]¶
Bases:
object
Manager of the aligners implemented in the package.
- check(aligner_name)[source]¶
Check whether the aligner name is known or not.
- Parameters
aligner_name – (str) Name of the aligner.
- Returns
The formatted aligner name.
- classes(aligner_name=None)[source]¶
Return the list of aligner classes.
- Parameters
aligner_name – (str) A specific aligner
- Returns
BasicAligner, or a list if no aligner name is given
- default_extension(aligner_name=None)[source]¶
Return the default extension of each aligner.
- Parameters
aligner_name – (str) A specific aligner
- Returns
str, or a dict of str if no aligner name is given
- extensions(aligner_name=None)[source]¶
Return the list of supported extensions of each aligner.
- Parameters
aligner_name – (str) A specific aligner
- Returns
list of str, or a dict of list if no aligner name is given
- instantiate(model_dir=None, aligner_name='basic')[source]¶
Instantiate an aligner to the appropriate system from its name.
If an error occurs, the basic aligner is returned instead.
- Parameters
model_dir – (str) Directory of the acoustic model
aligner_name – (str) Name of the aligner
- Returns
an Aligner instance.
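The name check and the fallback to the basic aligner can be sketched as follows. This is a minimal, self-contained illustration and not the SPPAS implementation: the stand-in classes, the `ALIGNERS` registry, the lower-casing performed by `check` and the `KeyError` it raises are all assumptions made for the example.

```python
class BaseAligner:
    """Hypothetical stand-in for the package's BaseAligner."""

    def __init__(self, model_dir=None):
        self.model_dir = model_dir


class BasicAligner(BaseAligner):
    """Stand-in for the aligner that needs no acoustic model."""


class JuliusAligner(BaseAligner):
    """Stand-in for an aligner that requires an acoustic model."""

    def __init__(self, model_dir=None):
        if model_dir is None:
            raise ValueError("an acoustic model directory is required")
        super().__init__(model_dir)


# Illustrative registry: aligner name -> aligner class (assumption).
ALIGNERS = {"basic": BasicAligner, "julius": JuliusAligner}


def check(aligner_name):
    """Return the formatted name, or raise KeyError if it is unknown."""
    name = aligner_name.strip().lower()
    if name not in ALIGNERS:
        raise KeyError("Unknown aligner name: {!r}".format(aligner_name))
    return name


def instantiate(model_dir=None, aligner_name="basic"):
    """Instantiate the named aligner; fall back to BasicAligner on error."""
    try:
        return ALIGNERS[check(aligner_name)](model_dir)
    except Exception:
        return BasicAligner()
```

With this sketch, an unknown name or a missing model directory both yield a `BasicAligner`, which mirrors the documented fallback.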
annotations.Align.aligners.alignerio module¶
- filename
sppas.src.annotations.Align.aligners.alignerio.py
- author
Brigitte Bigi
- contact
- summary
Aligners Input/Output readers and writers
- class annotations.Align.aligners.alignerio.AlignerIO[source]¶
Bases:
object
Reader/writer of the output files of the aligners.
AlignerIO implements methods to read/write files of the external aligner systems.
- EXTENSIONS_READ = {'mlf': <class 'annotations.Align.aligners.alignerio.mlf'>, 'palign': <class 'annotations.Align.aligners.alignerio.palign'>, 'walign': <class 'annotations.Align.aligners.alignerio.walign'>}¶
- EXTENSIONS_WRITE = {'palign': <class 'annotations.Align.aligners.alignerio.palign'>}¶
- static read_aligned(basename)[source]¶
Find an aligned file and read it.
- Parameters
basename – (str) File name without extension
- Returns
Two lists of tuples, one for phones and one for words:
(start-time end-time phoneme score)
(start-time end-time word score)
The score can be None.
Todo: the “phoneme” column can be a sequence of alternative phonemes.
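Finding an aligned file from a base name can be sketched as below. The helper `find_aligned`, the order in which extensions are tried and the `IOError` raised when nothing is found are assumptions for illustration; only the three extension names come from `EXTENSIONS_READ`.

```python
import os

# Assumption: extensions are tried in this order; the real order may differ.
READ_EXTENSIONS = ("palign", "walign", "mlf")


def find_aligned(basename, extensions=READ_EXTENSIONS):
    """Return (path, extension) of the first existing aligned file."""
    for ext in extensions:
        path = basename + "." + ext
        if os.path.isfile(path):
            return path, ext
    raise IOError("No aligned file found for {!r}".format(basename))
```

The returned extension could then be used to select the matching reader class.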
- class annotations.Align.aligners.alignerio.BaseAlignersReader[source]¶
Bases:
object
Base class for readers/writers of time-aligned files.
- static get_lines(filename)[source]¶
Return the lines of a file with the SPPAS encoding.
- Parameters
filename – file to load
- Returns
list of decoded lines
- static get_phonemes_julius(lines)[source]¶
Return the pronunciation of all words.
- Parameters
lines – (List of str)
- Returns
List of tuples (ph1 ph2…phN)
- static get_units_julius(lines)[source]¶
Return the units of a palign/walign file (in frames).
- Parameters
lines – (List of str)
- Returns
List of tuples (start, end)
- static get_word_scores_julius(lines)[source]¶
Return all scores of words.
- Parameters
lines – (List of str)
- Returns
List
- static get_words_julius(lines)[source]¶
Return all words.
- Parameters
lines – (List of str)
- Returns
List
- static make_result(units, words, phonemes, scores)[source]¶
Make a unique data structure from the given data.
- Parameters
units – (List of tuples)
words – (List of str)
phonemes – (List of tuples)
scores – (List of str, or None)
- Returns
Two data structures
List of (start_time end_time phoneme None)
List of (start_time end_time word score)
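One way to build the two structures is sketched below. It assumes that `units` holds exactly one (start, end) pair per phoneme, in order, and that every word has at least one phoneme; the real method may handle more cases.

```python
def make_result(units, words, phonemes, scores=None):
    """Merge units, words and phonemes into phone and word alignments.

    Assumption: units[i] is the time span of the i-th phoneme overall.
    """
    aligned_phones = []
    aligned_words = []
    i = 0  # index of the next unused unit
    for w, (word, phones) in enumerate(zip(words, phonemes)):
        if not phones:
            continue  # nothing to align for this word in this sketch
        first = i
        for phone in phones:
            start, end = units[i]
            aligned_phones.append((start, end, phone, None))
            i += 1
        score = scores[w] if scores is not None else None
        # A word spans from its first phoneme start to its last phoneme end.
        aligned_words.append((units[first][0], units[i - 1][1], word, score))
    return aligned_phones, aligned_words
```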
- static shift_time_units(units, delta)[source]¶
Return the units shifted by a delta time.
The first start time and the last end time are not shifted.
- Parameters
units – (list of tuples) Time units
delta – (float) Delta time value in range [-0.02;0.02]
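The shifting rule, with the outer boundaries left untouched, can be sketched as follows. This is an illustration written from the description only; it does not validate the range of `delta`.

```python
def shift_time_units(units, delta):
    """Shift every boundary by delta, except the first start and last end."""
    shifted = []
    last = len(units) - 1
    for i, (start, end) in enumerate(units):
        new_start = start if i == 0 else start + delta
        new_end = end if i == last else end + delta
        shifted.append((new_start, new_end))
    return shifted
```

Because a shared inner boundary is the end of one unit and the start of the next, both copies move by the same amount and the units stay contiguous.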
- static units_to_time(units, samplerate)[source]¶
Return the conversion of units.
Convert units (in frames) into time values (in seconds).
- Parameters
units – (list of tuples) Units (start, end), in frames
samplerate – (int) Sample rate to be applied to the units.
- Returns
List of tuples (start, end)
Note: in previous versions, everything was shifted 10 ms to the right.
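The conversion can be sketched as a plain division by the rate. This assumes that both boundaries are frame indices to be divided as-is; the real method may round the values or treat the end frame differently.

```python
def units_to_time(units, samplerate):
    """Convert (start, end) frame indices into seconds."""
    rate = float(samplerate)
    return [(start / rate, end / rate) for start, end in units]
```

For instance, with 100 frames per second, the unit (25, 100) covers the interval from 0.25 s to 1.0 s.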
- class annotations.Align.aligners.alignerio.mlf[source]¶
Bases:
annotations.Align.aligners.alignerio.BaseAlignersReader
mlf reader of time-aligned files (HTK Toolkit).
When the -m option is used, the transcriptions output by HVite contain by default both the model-level and word-level transcriptions. For example, a typical fragment of the output might be:
7500000 8700000 f -1081.604736 FOUR 30.000000
8700000 9800000 ao -903.821350
9800000 10400000 r -665.931641
10400000 10400000 sp -0.103585
10400000 11700000 s -1266.470093 SEVEN 22.860001
11700000 12500000 eh -765.568237
12500000 13000000 v -476.323334
13000000 14400000 n -1285.369629
14400000 14400000 sp -0.103585
- static get_phonemes(lines)[source]¶
Return the pronunciation of all words.
- Parameters
lines – (List of str)
- Returns
List of tuples (ph1 ph2…phN)
- static get_units(lines)[source]¶
Return the units of an mlf file (HTK time stamps, expressed in units of 100 ns).
- Parameters
lines – (List of str)
- Returns
List of tuples (start, end)
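Parsing such a fragment can be sketched as below. The functions are written from the HVite example only: they assume that each line holds "start end model score", that a line with two extra fields opens a new word, and that any other line (headers, separators) is ignored. The real readers may differ.

```python
def _fields(line):
    """Return the fields of a time-stamped line, or None for other lines."""
    fields = line.split()
    if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
        return fields
    return None


def get_units(lines):
    """Return the (start, end) pair of each time-stamped line."""
    units = []
    for line in lines:
        fields = _fields(line)
        if fields is not None:
            units.append((int(fields[0]), int(fields[1])))
    return units


def get_phonemes(lines):
    """Return one tuple of model names per word."""
    words = []
    for line in lines:
        fields = _fields(line)
        if fields is None:
            continue
        # A word label on the line (5+ fields) starts a new pronunciation.
        if len(fields) >= 5 or not words:
            words.append([])
        words[-1].append(fields[2])
    return [tuple(w) for w in words]
```

In this sketch a trailing `sp` model is attached to the word that precedes it, since its line carries no word label.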
- class annotations.Align.aligners.alignerio.palign[source]¶
Bases:
annotations.Align.aligners.alignerio.BaseAlignersReader
palign reader/writer of time-aligned files (Julius CSR Engine).
- static read(filename)[source]¶
Read an alignment file in the format of Julius CSR engine.
- Parameters
filename – (str) The input file name.
- Returns
3 lists of tuples
List of (start-time end-time phoneme None)
List of (start-time end-time word None)
List of (start-time end-time pron_word score)
- static write(phoneslist, tokenslist, alignments, outputfilename)[source]¶
Write an alignment output file.
- Parameters
phoneslist – (list) The phonetization of each token
tokenslist – (list) Each token
alignments – (list) Tuples (start-time end-time phoneme)
outputfilename – (str) Output file name (a Julius-like output).
- class annotations.Align.aligners.alignerio.walign[source]¶
Bases:
annotations.Align.aligners.alignerio.BaseAlignersReader
walign reader of time-aligned files (Julius CSR Engine).
annotations.Align.aligners.basealigner module¶
- filename
sppas.src.annotations.Align.aligners.basealigner.py
- author
Brigitte Bigi
- contact
- summary
Base class for any automatic forced alignment system.
- class annotations.Align.aligners.basealigner.BaseAligner(model_dir=None)[source]¶
Bases:
object
Base class for any automatic alignment system.
Base class for a system to perform phonetic speech segmentation.
- __init__(model_dir=None)[source]¶
Create a BaseAligner instance.
- Parameters
model_dir – (str) the acoustic model directory name
- add_tiedlist(entries)[source]¶
Add missing triphones/biphones in the tiedlist of the model.
Backup the initial file if entries were added.
- Parameters
entries – (list) List of missing entries into the tiedlist.
- Returns
List of the entries actually added.
- check_data()[source]¶
Check the given data to be aligned (phones and tokens).
- Returns
A warning message, or an empty string if check is OK.
- run_alignment(input_wav, output_align)[source]¶
Perform forced-alignment.
The alignment is expected to be performed on a file no longer than a sentence (sentence, IPU, segment or utterance).
The audio file must be of type PCM-WAV 16000 Hz, 16 bits, like in the model.
- Parameters
input_wav – (str) the audio input file name
output_align – (str) the output file name
- Returns
(str) A message from the aligner
annotations.Align.aligners.basicalign module¶
- filename
sppas.src.annotations.Align.aligners.basicalign.py
- author
Brigitte Bigi
- contact
- summary
An aligner to set the same duration to each sound.
- class annotations.Align.aligners.basicalign.BasicAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
Basic automatic alignment system.
This segmentation assigns the same duration to each phoneme. In case of phonetic variants, the first shortest pronunciation is selected.
- __init__(model_dir=None)[source]¶
Create a BasicAligner instance.
This class aligns one unit by assigning the same duration to each phoneme. It selects the shortest sequence in case of variants.
- Parameters
model_dir – (str) Ignored.
- run_alignment(input_wav, output_align)[source]¶
Perform the speech segmentation.
Assign the same duration to each phoneme.
- Parameters
input_wav – (str/float) audio input file name, or its duration
output_align – (str) the output file name
- Returns
Empty string.
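The equal-duration strategy can be sketched as follows. The function name `equal_durations` and its input shape (a list of variants, each a list of phonemes) are assumptions for the example; the real class reads its phones from the aligner state and writes a file instead of returning a list.

```python
def equal_durations(variants, duration, start=0.0):
    """Spread the phonemes of the first shortest variant over the duration.

    variants: list of pronunciations, each a list of phonemes (assumed shape).
    Returns a list of (start-time, end-time, phoneme) tuples.
    """
    # min() returns the first minimal item, i.e. the first shortest variant.
    phones = min(variants, key=len)
    if not phones:
        return []
    step = duration / len(phones)
    return [
        (start + i * step, start + (i + 1) * step, phone)
        for i, phone in enumerate(phones)
    ]
```

When `input_wav` is given as a duration rather than a file name, that value would play the role of `duration` here.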
annotations.Align.aligners.hvitealign module¶
- filename
sppas.src.annotations.Align.aligners.hvitealign.py
- author
Brigitte Bigi
- contact
- summary
Wrapper for HVite.
http://htk.eng.cam.ac.uk/links/asr_tool.shtml
- class annotations.Align.aligners.hvitealign.HviteAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
HVite automatic alignment system.
- __init__(model_dir=None)[source]¶
Create a HviteAligner instance.
This class aligns one inter-pausal unit with the external segmentation tool HVite.
- HVite is able to align one audio segment that can be:
an inter-pausal unit,
an utterance,
a sentence,
a paragraph…
no longer than a few seconds.
- Parameters
model_dir – (str) Name of the directory of the acoustic model
- gen_dependencies(grammar_name, dict_name)[source]¶
Generate the dependencies (grammar, dictionary) for HVite.
- Parameters
grammar_name – (str) the file name of the tokens
dict_name – (str) the dictionary file name
- run_alignment(input_wav, output_align)[source]¶
Execute the external program HVite to align.
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
input_wav – (str) audio input file name
output_align – (str) the output file name
- Returns
(str) An empty string.
annotations.Align.aligners.juliusalign module¶
- filename
sppas.src.annotations.Align.aligners.juliusalign.py
- author
Brigitte Bigi
- contact
- summary
Wrapper for Julius aligner.
http://julius.sourceforge.jp/en_index.php
Julius is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. Based on word N-gram and context-dependent HMM, it can perform almost real-time decoding on most current PCs in 60k word dictation task. Major search techniques are fully incorporated such as tree lexicon, N-gram factoring, cross-word context dependency handling, enveloped beam search, Gaussian pruning, Gaussian selection, etc. Besides search efficiency, it is also modularized carefully to be independent from model structures, and various HMM types are supported such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phones. Standard formats are adopted to cope with other free modeling toolkit such as HTK, CMU-Cam SLM toolkit, etc.
The main platform is Linux and other Unix workstations, and also works on Windows. Most recent version is developed on Linux and Windows (cygwin / mingw), and also has Microsoft SAPI version. Julius is distributed with open license together with source codes.
Julius has been developed as a research software for Japanese LVCSR since 1997, and the work was continued under IPA Japanese dictation toolkit project (1997-2000), Continuous Speech Recognition Consortium, Japan (CSRC) (2000-2003) and currently Interactive Speech Technology Consortium (ISTC).
- class annotations.Align.aligners.juliusalign.JuliusAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
Julius automatic alignment system.
- JuliusAligner is able to align one audio segment that can be:
an inter-pausal unit,
an utterance,
a sentence…
no longer than a few seconds.
Things needed to run JuliusAligner:
To perform speech segmentation with Julius, three “models” have to be prepared. The models should define the linguistic property of the language: recognition unit, audio properties of the unit and the linguistic constraint for the connection between the units. Typically the unit should be a word, and you should give Julius these models below:
1. “Acoustic model”, which is a stochastic model of input waveform patterns, typically per phoneme. Format is HTK-ASCII model.
2. “Word dictionary”, which defines the vocabulary.
3. “Language model”, which defines syntax-level rules that constrain the connections between words. It should give the constraint for the acceptable or preferable sentence patterns. It can be:
either a rule-based grammar,
or probabilistic N-gram model.
This class automatically constructs the word dictionary and the language model from both:
the tokenization of speech,
the phonetization of speech.
If outext is set to “palign”, JuliusAligner uses a grammar and produces both phone and word alignments. If outext is set to “walign”, JuliusAligner uses an SLM and produces word alignments only.
- __init__(model_dir=None)[source]¶
Create a JuliusAligner instance.
- Parameters
model_dir – (str) Name of the directory of the acoustic model
- gen_grammar_dependencies(basename)[source]¶
Generate the dependencies (grammar, dictionary) for julius.
- Parameters
basename – (str) base name of the grammar and dictionary files
- gen_slm_dependencies(basename, N=3)[source]¶
Generate the dependencies (slm, dictionary) for julius.
- Parameters
basename – (str) base name of the slm and dictionary files
N – (int) Language model N-gram length.
- run_alignment(input_wav, output_align, N=3)[source]¶
Execute the external program julius to align.
The data of the unit to time-align must be set beforehand with:
set_phones(str)
set_tokens(str)
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
input_wav – (str) the audio input file name
output_align – (str) the output file name
N – (int) for N-grams, used only if SLM (i.e. outext=walign)
- Returns
(str) A message from julius.
- run_julius(inputwav, basename, outputalign)[source]¶
Perform the speech segmentation.
System call to the command julius.
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
inputwav – (str) audio input file name
basename – (str) base name of grammar and dictionary files
outputalign – (str) output file name
Module contents¶
- filename
sppas.src.annotations.Align.aligners.__init__.py
- author
Brigitte Bigi
- contact
- summary
Internal or external automatic aligners.
How to get the list of supported aligner names?
>>> a = sppasAligners()
>>> a.default_aligner_name()
>>> a.names()
How to get an instance of a given aligner?
>>> a1 = sppasAligners().instantiate(model_dir, "julius")
>>> a2 = JuliusAligner(model_dir)
- class annotations.Align.aligners.BasicAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
Basic automatic alignment system.
This segmentation assigns the same duration to each phoneme. In case of phonetic variants, the first shortest pronunciation is selected.
- __init__(model_dir=None)[source]¶
Create a BasicAligner instance.
This class aligns one unit by assigning the same duration to each phoneme. It selects the shortest sequence in case of variants.
- Parameters
model_dir – (str) Ignored.
- run_alignment(input_wav, output_align)[source]¶
Perform the speech segmentation.
Assign the same duration to each phoneme.
- Parameters
input_wav – (str/float) audio input file name, or its duration
output_align – (str) the output file name
- Returns
Empty string.
- class annotations.Align.aligners.HviteAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
HVite automatic alignment system.
- __init__(model_dir=None)[source]¶
Create a HviteAligner instance.
This class aligns one inter-pausal unit with the external segmentation tool HVite.
- HVite is able to align one audio segment that can be:
an inter-pausal unit,
an utterance,
a sentence,
a paragraph…
no longer than a few seconds.
- Parameters
model_dir – (str) Name of the directory of the acoustic model
- gen_dependencies(grammar_name, dict_name)[source]¶
Generate the dependencies (grammar, dictionary) for HVite.
- Parameters
grammar_name – (str) the file name of the tokens
dict_name – (str) the dictionary file name
- run_alignment(input_wav, output_align)[source]¶
Execute the external program HVite to align.
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
input_wav – (str) audio input file name
output_align – (str) the output file name
- Returns
(str) An empty string.
- class annotations.Align.aligners.JuliusAligner(model_dir=None)[source]¶
Bases:
annotations.Align.aligners.basealigner.BaseAligner
Julius automatic alignment system.
- JuliusAligner is able to align one audio segment that can be:
an inter-pausal unit,
an utterance,
a sentence…
no longer than a few seconds.
Things needed to run JuliusAligner:
To perform speech segmentation with Julius, three “models” have to be prepared. The models should define the linguistic property of the language: recognition unit, audio properties of the unit and the linguistic constraint for the connection between the units. Typically the unit should be a word, and you should give Julius these models below:
1. “Acoustic model”, which is a stochastic model of input waveform patterns, typically per phoneme. Format is HTK-ASCII model.
2. “Word dictionary”, which defines the vocabulary.
3. “Language model”, which defines syntax-level rules that constrain the connections between words. It should give the constraint for the acceptable or preferable sentence patterns. It can be:
either a rule-based grammar,
or probabilistic N-gram model.
This class automatically constructs the word dictionary and the language model from both:
the tokenization of speech,
the phonetization of speech.
If outext is set to “palign”, JuliusAligner uses a grammar and produces both phone and word alignments. If outext is set to “walign”, JuliusAligner uses an SLM and produces word alignments only.
- __init__(model_dir=None)[source]¶
Create a JuliusAligner instance.
- Parameters
model_dir – (str) Name of the directory of the acoustic model
- gen_grammar_dependencies(basename)[source]¶
Generate the dependencies (grammar, dictionary) for julius.
- Parameters
basename – (str) base name of the grammar and dictionary files
- gen_slm_dependencies(basename, N=3)[source]¶
Generate the dependencies (slm, dictionary) for julius.
- Parameters
basename – (str) base name of the slm and dictionary files
N – (int) Language model N-gram length.
- run_alignment(input_wav, output_align, N=3)[source]¶
Execute the external program julius to align.
The data of the unit to time-align must be set beforehand with:
set_phones(str)
set_tokens(str)
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
input_wav – (str) the audio input file name
output_align – (str) the output file name
N – (int) for N-grams, used only if SLM (i.e. outext=walign)
- Returns
(str) A message from julius.
- run_julius(inputwav, basename, outputalign)[source]¶
Perform the speech segmentation.
System call to the command julius.
The given audio file must match the ones used to train the acoustic model: PCM-WAV, 16000 Hz, 16 bits.
- Parameters
inputwav – (str) audio input file name
basename – (str) base name of grammar and dictionary files
outputalign – (str) output file name
- class annotations.Align.aligners.sppasAligners[source]¶
Bases:
object
Manager of the aligners implemented in the package.
- check(aligner_name)[source]¶
Check whether the aligner name is known or not.
- Parameters
aligner_name – (str) Name of the aligner.
- Returns
The formatted aligner name.
- classes(aligner_name=None)[source]¶
Return the list of aligner classes.
- Parameters
aligner_name – (str) A specific aligner
- Returns
BasicAligner, or a list if no aligner name is given
- default_extension(aligner_name=None)[source]¶
Return the default extension of each aligner.
- Parameters
aligner_name – (str) A specific aligner
- Returns
str, or a dict of str if no aligner name is given
- extensions(aligner_name=None)[source]¶
Return the list of supported extensions of each aligner.
- Parameters
aligner_name – (str) A specific aligner
- Returns
list of str, or a dict of list if no aligner name is given
- instantiate(model_dir=None, aligner_name='basic')[source]¶
Instantiate an aligner to the appropriate system from its name.
If an error occurs, the basic aligner is returned instead.
- Parameters
model_dir – (str) Directory of the acoustic model
aligner_name – (str) Name of the aligner
- Returns
an Aligner instance.