annotations.SearchIPUs package¶
Submodules¶
annotations.SearchIPUs.searchipus module¶
- filename
sppas.src.annotations.SearchIPUs.searchipus.py
- author
Brigitte Bigi
- contact
- summary
Silences vs sounding segments segmentation.
- class annotations.SearchIPUs.searchipus.SearchIPUs(channel, win_len=0.02)[source]¶
Bases:
annotations.SearchIPUs.silences.sppasSilences
An automatic silence/tracks segmentation system.
Silence/tracks segmentation aims at finding IPUs. IPUs (Inter-Pausal Units) are blocks of speech bounded by silent pauses of more than X ms, time-aligned on the speech signal.
- DEFAULT_MIN_IPU_DUR = 0.3¶
- DEFAULT_MIN_SIL_DUR = 0.25¶
- DEFAULT_SHIFT_END = 0.02¶
- DEFAULT_SHIFT_START = 0.02¶
- DEFAULT_VOL_THRESHOLD = 0¶
- MIN_IPU_DUR = 0.06¶
- MIN_SIL_DUR = 0.06¶
- __init__(channel, win_len=0.02)[source]¶
Create a new SearchIPUs instance.
- Parameters
channel – (sppasChannel) the input channel
win_len – (float) duration of a window
- get_effective_threshold()[source]¶
Return the threshold volume estimated automatically to search for silences.
- get_track_data(tracks)[source]¶
Return the audio data of tracks.
- Parameters
tracks – List of tracks. A track is a tuple (start, end).
- Returns
List of audio data
- get_tracks(time_domain=False)[source]¶
Return a list of tuples (from,to) of tracks.
(from,to) values are converted, or not, into the time-domain.
The tracks are found from the current list of silences, which is first filtered with the min_sil_dur.
- This method requires the following members to be fixed:
the volume threshold,
the minimum duration for a silence,
the minimum duration for a track,
the duration to remove from the start boundary,
the duration to add to the end boundary.
- Parameters
time_domain – (bool) Convert from/to values in seconds
- Returns
(list of tuples) with (from,to) of the tracks
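As an illustration of the time_domain flag, the sketch below converts (from,to) frame positions into seconds. It assumes that positions are frame indexes and that the conversion is a plain division by the frame rate; to_time_domain and framerate are hypothetical names, not part of SPPAS.

```python
def to_time_domain(tracks, framerate):
    """Convert (from, to) frame positions into (from, to) values in seconds."""
    return [(from_pos / framerate, to_pos / framerate) for from_pos, to_pos in tracks]


# Two tracks expressed in frames, for a channel assumed to be sampled at 16000 Hz.
tracks_in_frames = [(16000, 48000), (56000, 64000)]
tracks_in_seconds = to_time_domain(tracks_in_frames, framerate=16000)
print(tracks_in_seconds)
```

With time_domain=False the caller would keep working with the frame positions unchanged.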
- set_min_ipu(min_ipu_dur)[source]¶
Fix the default minimum duration of an IPU.
- Parameters
min_ipu_dur – (float) Duration in seconds.
- set_min_sil(min_sil_dur)[source]¶
Fix the default minimum duration of a silence.
- Parameters
min_sil_dur – (float) Duration in seconds.
- set_shift_end(s)[source]¶
Fix the end boundary shift value.
- Parameters
s – (float) Duration in seconds.
- set_shift_start(s)[source]¶
Fix the start boundary shift value.
- Parameters
s – (float) Duration in seconds.
annotations.SearchIPUs.silences module¶
- filename
sppas.src.annotations.SearchIPUs.silences.py
- author
Brigitte Bigi
- contact
- summary
Search for silences in a channel.
- class annotations.SearchIPUs.silences.sppasSilences(channel, win_len=0.02, vagueness=0.005)[source]¶
Bases:
object
Silence search on a channel of an audio file.
Silences are stored in a list of (from_pos,to_pos) values, indicating the frames at which each silence begins and ends, respectively.
- __init__(channel, win_len=0.02, vagueness=0.005)[source]¶
Create a sppasSilences instance.
- Parameters
channel – (sppasChannel) the input channel
win_len – (float) duration of a window
vagueness – (float) Window length used to estimate the boundaries.
The maximum value of vagueness is win_len. The duration of a window (win_len) is relevant for the estimation of the rms values.
The radius (see sppasPoint) of the boundaries is 2*vagueness.
- extract_tracks(min_track_dur, shift_dur_start, shift_dur_end)[source]¶
Return the tracks, deduced from the silences and track constraints.
- Parameters
min_track_dur – (float) The minimum duration for a track
shift_dur_start – (float) The time to remove from the start boundary
shift_dur_end – (float) The time to add to the end boundary
- Returns
list of tuples (from_pos,to_pos)
Duration is in seconds.
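The relationship between silences and tracks can be sketched as follows. This is not the SPPAS implementation: tracks_from_silences, nframes and framerate are hypothetical names, and the sketch assumes that a track is the gap between two consecutive silences, that gaps shorter than min_track_dur are dropped, and that the shifts widen each track without leaving the channel.

```python
def tracks_from_silences(silences, nframes, framerate,
                         min_track_dur, shift_dur_start, shift_dur_end):
    """Return (from_pos, to_pos) tracks found between silences (positions in frames)."""
    shift_start = int(round(shift_dur_start * framerate))
    shift_end = int(round(shift_dur_end * framerate))
    min_len = int(round(min_track_dur * framerate))

    tracks = []
    cursor = 0  # first frame that is not yet known to be silence
    # A zero-length sentinel silence at the end closes the last track.
    for sil_from, sil_to in list(silences) + [(nframes, nframes)]:
        gap = sil_from - cursor
        if gap > 0 and gap >= min_len:
            start = max(0, cursor - shift_start)      # remove time from the start
            end = min(nframes, sil_from + shift_end)  # add time to the end
            tracks.append((start, end))
        cursor = max(cursor, sil_to)
    return tracks


# 5 seconds of audio at an assumed 1000 Hz, with two silences.
tracks = tracks_from_silences([(0, 1000), (2500, 3000)], nframes=5000, framerate=1000,
                              min_track_dur=0.3, shift_dur_start=0.02, shift_dur_end=0.02)
print(tracks)
```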
- filter_silences(threshold, min_sil_dur=0.2)[source]¶
Filter the current silences.
- Parameters
threshold – (int) Expected minimum volume (rms value). If threshold is set to 0, search_minvol() will assign a value.
min_sil_dur – (float) Minimum silence duration in seconds
- Returns
Number of silences with the expected minimum duration
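The duration filter can be pictured with the minimal sketch below. It only covers the min_sil_dur part, under the assumption that silences are (from_pos,to_pos) frame positions and that a silence lasting exactly min_sil_dur is kept; keep_long_silences and framerate are hypothetical names.

```python
def keep_long_silences(silences, framerate, min_sil_dur=0.2):
    """Keep only the silences lasting at least min_sil_dur seconds."""
    min_frames = int(round(min_sil_dur * framerate))
    return [(from_pos, to_pos) for from_pos, to_pos in silences
            if to_pos - from_pos >= min_frames]


# Three candidate silences at an assumed 16000 Hz: 0.1 s, 0.25 s and 0.05 s long.
candidates = [(0, 1600), (8000, 12000), (20000, 20800)]
kept = keep_long_silences(candidates, framerate=16000, min_sil_dur=0.2)
print(len(kept), kept)
```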
- filter_silences_from_tracks(min_track_dur=0.6)[source]¶
Filter the given silences to remove very small tracks.
- Parameters
min_track_dur – (float) Minimum duration of a track
- Returns
filtered silences
- fix_threshold_vol()[source]¶
Fix the threshold for tracks/silences segmentation.
This is an observation of the distribution of rms values.
- Returns
(int) volume value
- search_silences(threshold=0)[source]¶
Search for windows with a volume lower than a given threshold.
This amounts to a search for silences. All windows with a volume higher than the threshold are considered tracks and are not included in the result. Blocks of silence shorter than min_sil_dur are also considered tracks.
- Parameters
threshold – (int) Expected minimum volume (rms value). If threshold is set to 0, search_minvol() will assign a value.
- Returns
threshold
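A minimal sketch of a window-based search, assuming the volume of a window is the rms of its samples and that consecutive windows below the threshold are merged into one silence. The helper names window_rms and find_silences are hypothetical, the automatic threshold estimation is not reproduced, and filtering by minimum duration is left out.

```python
import math


def window_rms(samples, win_size):
    """Return the rms value of each consecutive window of win_size samples."""
    values = []
    for start in range(0, len(samples), win_size):
        window = samples[start:start + win_size]
        values.append(math.sqrt(sum(s * s for s in window) / len(window)))
    return values


def find_silences(rms_values, threshold, win_size):
    """Return (from_pos, to_pos) sample positions of runs of windows below threshold."""
    silences = []
    run_start = None
    for i, value in enumerate(rms_values):
        if value < threshold:
            if run_start is None:
                run_start = i  # a new silent run begins at this window
        elif run_start is not None:
            silences.append((run_start * win_size, i * win_size))
            run_start = None
    if run_start is not None:  # the signal ends inside a silent run
        silences.append((run_start * win_size, len(rms_values) * win_size))
    return silences


samples = [0, 0, 0, 0, 900, -900, 800, -800, 0, 0, 0, 0]
rms = window_rms(samples, win_size=2)
silences = find_silences(rms, threshold=100, win_size=2)
print(silences)
```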
- set_channel(channel)[source]¶
Set a channel, then reset all previous results.
- Parameters
channel – (sppasChannel)
- set_silences(silences)[source]¶
Fix the silences manually.
To be used carefully!
- Parameters
silences – (list of tuples (start_pos, end_pos))
annotations.SearchIPUs.sppassearchipus module¶
- filename
sppas.src.annotations.SearchIPUs.sppassearchipus.py
- author
Brigitte Bigi
- contact
- summary
SPPAS integration of Search for IPUs automatic annotation
- class annotations.SearchIPUs.sppassearchipus.sppasSearchIPUs(log=None)[source]¶
Bases:
annotations.baseannot.sppasBaseAnnotation
SPPAS integration of the IPUs detection.
- __init__(log=None)[source]¶
Create a new sppasSearchIPUs instance.
- Parameters
log – (sppasLog) Human-readable logs.
- convert(channel)[source]¶
Search for IPUs in the given channel.
- Parameters
channel – (sppasChannel) Input channel
- Returns
(sppasTier)
- fix_options(options)[source]¶
Fix all options.
Available options are:
threshold: volume threshold to decide a window is silence or not
win_length: length of the window for the estimation of volume values
min_sil: minimum duration of a silence
min_ipu: minimum duration of an IPU
shift_start: start boundary shift value
shift_end: end boundary shift value
- Parameters
options – (sppasOption)
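One way to picture the mapping from option keys to the setters documented for this class is a dispatch table. The sketch below is not the SPPAS code: it represents options as plain (key, value) pairs rather than sppasOption objects, OptionTarget is a hypothetical stand-in that only records what each setter receives, and win_length is left unhandled because no setter for it is listed here.

```python
class OptionTarget:
    """Hypothetical stand-in recording the value passed to each documented setter."""

    def __init__(self):
        self.received = {}

    def set_threshold(self, value):
        self.received["threshold"] = value

    def set_min_sil(self, value):
        self.received["min_sil"] = value

    def set_min_ipu(self, value):
        self.received["min_ipu"] = value

    def set_shift_start(self, value):
        self.received["shift_start"] = value

    def set_shift_end(self, value):
        self.received["shift_end"] = value


def apply_options(target, options):
    """Send each (key, value) pair to the matching setter; return the unhandled keys."""
    setters = {
        "threshold": target.set_threshold,
        "min_sil": target.set_min_sil,
        "min_ipu": target.set_min_ipu,
        "shift_start": target.set_shift_start,
        "shift_end": target.set_shift_end,
    }
    ignored = []
    for key, value in options:
        if key in setters:
            setters[key](value)
        else:
            ignored.append(key)
    return ignored


target = OptionTarget()
ignored = apply_options(target, [("threshold", 0), ("min_sil", 0.25), ("win_length", 0.02)])
print(target.received, ignored)
```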
- static get_input_extensions()[source]¶
Extensions that the annotation expects for its input filename.
- run(input_files, output=None)[source]¶
Run the automatic annotation process on an input.
- Parameters
input_files – (list of str) Audio
output – (str) the output file name
- Returns
(sppasTranscription)
- run_for_batch_processing(input_files)[source]¶
Perform the annotation on a file.
This method is called by ‘batch_processing’. It fixes the name of the output file. If the output file already exists, the annotation is cancelled (the file won’t be overwritten). Otherwise, it calls the run method.
- Parameters
input_files – (list of str) the inputs to perform a run
- Returns
output file name or None
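The "do not overwrite" behaviour described above can be sketched with the standard library alone. The function run_unless_exists, its run callback and the file names are hypothetical; how SPPAS actually derives the output file name is not shown here.

```python
import os
import tempfile


def run_unless_exists(input_files, output_name, run):
    """Call run(input_files, output_name) unless output_name already exists.

    Return the output file name, or None when the annotation is cancelled.
    """
    if os.path.exists(output_name):
        return None
    run(input_files, output_name)
    return output_name


def fake_run(input_files, output_name):
    """Stand-in for the annotation: it only creates an empty output file."""
    with open(output_name, "w"):
        pass


with tempfile.TemporaryDirectory() as folder:
    output = os.path.join(folder, "sample-ipus.txt")
    first = run_unless_exists(["sample.wav"], output, fake_run)
    second = run_unless_exists(["sample.wav"], output, fake_run)
    first_matches_output = (first == output)

print(first_matches_output, second)
```

The second call returns None because the first one already created the file.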
- set_min_ipu(value)[source]¶
Fix the default minimum duration of an IPU.
- Parameters
value – (float) Duration in seconds.
- set_min_sil(value)[source]¶
Fix the default minimum duration of a silence.
- Parameters
value – (float) Duration in seconds.
- set_shift_end(value)[source]¶
Fix the end boundary shift value.
- Parameters
value – (float) Duration in seconds.
- set_shift_start(value)[source]¶
Fix the start boundary shift value.
- Parameters
value – (float) Duration in seconds.
- set_threshold(value)[source]¶
Fix the threshold volume.
- Parameters
value – (int) RMS value used as volume threshold
Module contents¶
- filename
sppas.src.annotations.SearchIPUs.__init__.py
- author
Brigitte Bigi
- contact
- summary
Search for Inter-Pausal Units in an audio file.
- class annotations.SearchIPUs.SearchIPUs(channel, win_len=0.02)[source]¶
Bases:
annotations.SearchIPUs.silences.sppasSilences
An automatic silence/tracks segmentation system.
Silence/tracks segmentation aims at finding IPUs. IPUs (Inter-Pausal Units) are blocks of speech bounded by silent pauses of more than X ms, time-aligned on the speech signal.
- DEFAULT_MIN_IPU_DUR = 0.3¶
- DEFAULT_MIN_SIL_DUR = 0.25¶
- DEFAULT_SHIFT_END = 0.02¶
- DEFAULT_SHIFT_START = 0.02¶
- DEFAULT_VOL_THRESHOLD = 0¶
- MIN_IPU_DUR = 0.06¶
- MIN_SIL_DUR = 0.06¶
- __init__(channel, win_len=0.02)[source]¶
Create a new SearchIPUs instance.
- Parameters
channel – (sppasChannel) the input channel
win_len – (float) duration of a window
- get_effective_threshold()[source]¶
Return the threshold volume estimated automatically to search for silences.
- get_track_data(tracks)[source]¶
Return the audio data of tracks.
- Parameters
tracks – List of tracks. A track is a tuple (start, end).
- Returns
List of audio data
- get_tracks(time_domain=False)[source]¶
Return a list of tuples (from,to) of tracks.
(from,to) values are converted, or not, into the time-domain.
The tracks are found from the current list of silences, which is first filtered with the min_sil_dur.
- This method requires the following members to be fixed:
the volume threshold,
the minimum duration for a silence,
the minimum duration for a track,
the duration to remove from the start boundary,
the duration to add to the end boundary.
- Parameters
time_domain – (bool) Convert from/to values in seconds
- Returns
(list of tuples) with (from,to) of the tracks
- set_min_ipu(min_ipu_dur)[source]¶
Fix the default minimum duration of an IPU.
- Parameters
min_ipu_dur – (float) Duration in seconds.
- set_min_sil(min_sil_dur)[source]¶
Fix the default minimum duration of a silence.
- Parameters
min_sil_dur – (float) Duration in seconds.
- set_shift_end(s)[source]¶
Fix the end boundary shift value.
- Parameters
s – (float) Duration in seconds.
- set_shift_start(s)[source]¶
Fix the start boundary shift value.
- Parameters
s – (float) Duration in seconds.
- class annotations.SearchIPUs.sppasSearchIPUs(log=None)[source]¶
Bases:
annotations.baseannot.sppasBaseAnnotation
SPPAS integration of the IPUs detection.
- __init__(log=None)[source]¶
Create a new sppasSearchIPUs instance.
- Parameters
log – (sppasLog) Human-readable logs.
- convert(channel)[source]¶
Search for IPUs in the given channel.
- Parameters
channel – (sppasChannel) Input channel
- Returns
(sppasTier)
- fix_options(options)[source]¶
Fix all options.
Available options are:
threshold: volume threshold to decide a window is silence or not
win_length: length of the window for the estimation of volume values
min_sil: minimum duration of a silence
min_ipu: minimum duration of an IPU
shift_start: start boundary shift value
shift_end: end boundary shift value
- Parameters
options – (sppasOption)
- static get_input_extensions()[source]¶
Extensions that the annotation expects for its input filename.
- run(input_files, output=None)[source]¶
Run the automatic annotation process on an input.
- Parameters
input_files – (list of str) Audio
output – (str) the output file name
- Returns
(sppasTranscription)
- run_for_batch_processing(input_files)[source]¶
Perform the annotation on a file.
This method is called by ‘batch_processing’. It fixes the name of the output file. If the output file already exists, the annotation is cancelled (the file won’t be overwritten). Otherwise, it calls the run method.
- Parameters
input_files – (list of str) the inputs to perform a run
- Returns
output file name or None
- set_min_ipu(value)[source]¶
Fix the default minimum duration of an IPU.
- Parameters
value – (float) Duration in seconds.
- set_min_sil(value)[source]¶
Fix the default minimum duration of a silence.
- Parameters
value – (float) Duration in seconds.
- set_shift_end(value)[source]¶
Fix the end boundary shift value.
- Parameters
value – (float) Duration in seconds.
- set_shift_start(value)[source]¶
Fix the start boundary shift value.
- Parameters
value – (float) Duration in seconds.
- set_threshold(value)[source]¶
Fix the threshold volume.
- Parameters
value – (int) RMS value used as volume threshold
- class annotations.SearchIPUs.sppasSilences(channel, win_len=0.02, vagueness=0.005)[source]¶
Bases:
object
Silence search on a channel of an audio file.
Silences are stored in a list of (from_pos,to_pos) values, indicating the frames at which each silence begins and ends, respectively.
- __init__(channel, win_len=0.02, vagueness=0.005)[source]¶
Create a sppasSilences instance.
- Parameters
channel – (sppasChannel) the input channel
win_len – (float) duration of a window
vagueness – (float) Window length used to estimate the boundaries.
The maximum value of vagueness is win_len. The duration of a window (win_len) is relevant for the estimation of the rms values.
The radius (see sppasPoint) of the boundaries is 2*vagueness.
- extract_tracks(min_track_dur, shift_dur_start, shift_dur_end)[source]¶
Return the tracks, deduced from the silences and track constraints.
- Parameters
min_track_dur – (float) The minimum duration for a track
shift_dur_start – (float) The time to remove from the start boundary
shift_dur_end – (float) The time to add to the end boundary
- Returns
list of tuples (from_pos,to_pos)
Duration is in seconds.
- filter_silences(threshold, min_sil_dur=0.2)[source]¶
Filter the current silences.
- Parameters
threshold – (int) Expected minimum volume (rms value). If threshold is set to 0, search_minvol() will assign a value.
min_sil_dur – (float) Minimum silence duration in seconds
- Returns
Number of silences with the expected minimum duration
- filter_silences_from_tracks(min_track_dur=0.6)[source]¶
Filter the given silences to remove very small tracks.
- Parameters
min_track_dur – (float) Minimum duration of a track
- Returns
filtered silences
- fix_threshold_vol()[source]¶
Fix the threshold for tracks/silences segmentation.
This is an observation of the distribution of rms values.
- Returns
(int) volume value
- search_silences(threshold=0)[source]¶
Search for windows with a volume lower than a given threshold.
This amounts to a search for silences. All windows with a volume higher than the threshold are considered tracks and are not included in the result. Blocks of silence shorter than min_sil_dur are also considered tracks.
- Parameters
threshold – (int) Expected minimum volume (rms value). If threshold is set to 0, search_minvol() will assign a value.
- Returns
threshold
- set_channel(channel)[source]¶
Set a channel, then reset all previous results.
- Parameters
channel – (sppasChannel)
- set_silences(silences)[source]¶
Fix the silences manually.
To be used carefully!
- Parameters
silences – (list of tuples (start_pos, end_pos))