audiodata package¶
Subpackages¶
Submodules¶
audiodata.audio module¶
- filename
sppas.src.audiodata.audio.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
Main class to work with a recorded audio file.
Pulse-code modulation (PCM) is a method used to digitally represent sampled analog signals. A PCM signal is a sequence of digital audio samples containing the data providing the necessary information to reconstruct the original analog signal. Each sample represents the amplitude of the signal at a specific point in time, and the samples are uniformly spaced in time. The amplitude is the only information explicitly stored in the sample.
A PCM stream has two basic properties that determine the stream’s fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that can be used to represent each sample.
For speech analysis, the recommended sampling rates are 16000 Hz (for automatic analysis) or 48000 Hz (for manual analysis), and the recommended sample depth is 16 bits per sample.
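As a rough illustration of how the sampling rate and the sample depth determine the amount of data, the sketch below computes the size of an uncompressed PCM stream. The helper name pcm_byte_size is hypothetical and is not part of the audiodata package.

```python
def pcm_byte_size(framerate, sampwidth, nchannels, duration):
    """Return the number of bytes of raw PCM data (file headers excluded).

    framerate: samples per second, for each channel
    sampwidth: bytes per sample (2 means 16 bits)
    nchannels: number of channels
    duration:  length of the recording, in seconds
    """
    nframes = int(round(framerate * duration))
    return nframes * sampwidth * nchannels


# One second of 16-bit mono speech at the two recommended sampling rates.
size_16k = pcm_byte_size(16000, 2, 1, 1.0)
size_48k = pcm_byte_size(48000, 2, 1, 1.0)
```

Tripling the sampling rate triples the storage needed, which is one reason to prefer 16000 Hz when the audio is only processed automatically.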
- class audiodata.audio.sppasAudioPCM[source]¶
Bases:
object
An audio manager.
- These variables are user gettable through appropriate methods:
nchannels – the number of audio channels
framerate – the sampling frequency
sampwidth – the number of bytes per audio sample (1, 2 or 4)
nframes – the number of frames
params – parameters of the wave file
filename – the name of the wave file
The audiofp member is assigned by the IO classes (WaveIO, AifIO, SunauIO). It is expected that it can access the following methods:
readframes()
writeframes()
getsampwidth()
getframerate()
getnframes()
getnchannels()
setpos()
tell()
rewind()
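The method names listed above coincide with those of the reader and writer objects of the Python standard library wave module. The sketch below exercises them on an in-memory WAV file. Whether the IO classes of this package actually wrap the wave module is an assumption; the example only shows what an object offering this interface behaves like.

```python
import io
import wave

# Write half a second of 16-bit mono silence into an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(16000)
    writer.writeframes(b"\x00\x00" * 8000)

# Read it back with the same kind of methods audiofp is expected to offer.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    nchannels = reader.getnchannels()
    sampwidth = reader.getsampwidth()
    framerate = reader.getframerate()
    nframes = reader.getnframes()
    first = reader.readframes(100)   # 100 frames of 2 bytes each
    position = reader.tell()         # current position, in frames
    reader.rewind()                  # go back to the first frame
    restart = reader.tell()
```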
- append_channel(channel)[source]¶
Append a channel to the list of uploaded channels.
- Parameters
channel – (sppasChannel) the channel to append
- Returns
index of the channel
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
- extract_channel(index=0)[source]¶
Extract a channel from the Audio File Pointer.
Append the channel into the list of channels.
Frames are stored into a sppasChannel() instance. Index of the channel in the audio file: 0 = 1st channel (left); 1 = 2nd channel (right); 2 = 3rd channel…
- Parameters
index – (int) The index of the channel to extract
- Returns
the index of the sppasChannel() in the list
- extract_channels()[source]¶
Extract all channels from the Audio File Pointer.
Append the extracted channels to the list of channels.
- get_channel(idx)[source]¶
Get an uploaded channel.
- Parameters
idx – (int) the index of the channel to return
- Returns
(sppasChannel)
- get_duration()[source]¶
Return the duration of the Audio File Pointer.
- Returns
(float) duration of the audio file (in seconds)
- get_framerate()[source]¶
Return the frame rate of the Audio File Pointer.
- Returns
(int) frame rate of the audio file
- get_nchannels()[source]¶
Return the number of channels of the Audio File Pointer.
- Returns
(int) number of channels of the audio file
- get_nframes()[source]¶
Return the number of frames of the Audio File Pointer.
- Returns
(int) number of frames of the audio file
- get_sampwidth()[source]¶
Return the sample width of the Audio File Pointer.
- Returns
(int) sample width of the audio file
- insert_channel(idx, channel)[source]¶
Insert a channel at the position given in the list of uploaded channels.
- Parameters
idx – (int) the index where the channel has to be inserted
channel – (sppasChannel) the channel to insert
- pop_channel(idx)[source]¶
Pop a channel at the position given from the list of uploaded channels.
- Parameters
idx – (int) the index of the channel to remove
- read_frames(nframes)[source]¶
Read n frames from the audio file.
- Parameters
nframes – (int) the number of frames to read
- Returns
(str) frames
- read_samples(nframes)[source]¶
Read the samples from the audio file.
- Parameters
nframes – (int) the number of frames to read
- Returns
(list of list) list of samples of each channel
- remove_channel(channel)[source]¶
Remove a channel from the list of uploaded channels.
- Parameters
channel – (sppasChannel) the channel to remove
audiodata.audioconvert module¶
- filename
sppas.src.audiodata.audioconvert.py
- author
Brigitte Bigi
- contact
- summary
A utility to convert audio data formats.
- class audiodata.audioconvert.sppasAudioConverter[source]¶
Bases:
object
A utility to convert data formats.
- static amp2db(value)[source]¶
Return the equivalent value in a dB scale, from an amplitude value.
Decibels express a power ratio, not an amount. They tell how many times more (positive dB) or less (negative dB) but not how much in absolute terms. Decibels are logarithmic, not linear. Doubling the amplitude value leads to an increase of 6.02 dB.
- Parameters
value – (int) the amplitude value to convert
- Returns
(float) the value in dB
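The statement that doubling the value adds 6.02 dB is consistent with the usual amplitude convention, dB = 20 * log10(value). The sketch below assumes that convention. The function name amp_to_db is hypothetical, and how amp2db itself treats zero or negative input is not documented here.

```python
import math


def amp_to_db(value):
    """Convert an amplitude ratio to decibels (assumed 20*log10 convention)."""
    if value <= 0:
        raise ValueError("the amplitude must be strictly positive")
    return 20.0 * math.log10(value)


# Doubling an amplitude always adds the same number of decibels.
gain_double = amp_to_db(2)
step = amp_to_db(2000) - amp_to_db(1000)
```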
- static hz2mel(value)[source]¶
Return the equivalent value in a mel scale, from a frequency value.
Mel is a unit of pitch proposed by Stevens, Volkmann and Newman in 1937. The mel scale is a scale of pitches judged by listeners to be equal in distance from one another. The name mel comes from the word melody to indicate that the scale is based on pitch comparisons.
- Parameters
value – (int) the value to convert
- Returns
(int) the value in mel
- static mel2hz(value)[source]¶
Return the equivalent value in frequency, from a mel value.
- Parameters
value – (int) the value in mel to convert
- Returns
(int) the value in Hz
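Several formulas exist for the mel scale, and this page does not say which one hz2mel and mel2hz use. As an illustration only, the sketch below uses one widely used variant, mel = 2595 * log10(1 + f / 700), together with its inverse. The function names are hypothetical.

```python
import math


def hz_to_mel(freq):
    """Convert a frequency in Hz to mels (assumed 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + freq / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert a value in mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# With this formula, 1000 Hz maps to approximately 1000 mels.
mel_1000 = hz_to_mel(1000.0)
round_trip = mel_to_hz(mel_1000)
```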
- static samples2frames(samples, samples_width, nchannels=1)[source]¶
Turn samples into frames.
- Parameters
samples – (int[][]) samples list, first index is the index of the channel, second is the index of the sample.
samples_width – (int) sample width of the frames.
nchannels – (int) number of channels in the samples
- Returns
frames
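To make the layout of the samples argument concrete, the sketch below interleaves per-channel sample lists into a byte string with the standard struct module. It assumes signed little-endian samples and interleaved channels. The encoding actually produced by samples2frames is not specified on this page, and the name samples_to_frames is hypothetical.

```python
import struct

# struct format codes for signed samples, indexed by width in bytes.
_CODES = {1: "b", 2: "h", 4: "i"}


def samples_to_frames(samples, sampwidth, nchannels=1):
    """Interleave per-channel sample lists into a bytes object.

    samples[c][i] is sample number i of channel number c.
    """
    code = _CODES[sampwidth]
    nframes = len(samples[0])
    interleaved = []
    for i in range(nframes):
        for c in range(nchannels):
            interleaved.append(samples[c][i])
    return struct.pack("<%d%s" % (len(interleaved), code), *interleaved)


# Two channels of two samples each: the result holds L0, R0, L1, R1.
frames = samples_to_frames([[1, 2], [-1, -2]], 2, nchannels=2)
```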
audiodata.audiodataexc module¶
- filename
sppas.src.audiodata.audiodataexc.py
- author
Brigitte Bigi
- contact
- summary
Exceptions for the audiodata package.
- exception audiodata.audiodataexc.AudioDataError(filename='')[source]¶
Bases:
Exception
:ERROR 2015:.
No data or corrupted data in the audio file {filename}.
- exception audiodata.audiodataexc.AudioError[source]¶
Bases:
Exception
:ERROR 2000:.
No audio file is defined.
- exception audiodata.audiodataexc.AudioIOError(message='', filename='')[source]¶
Bases:
OSError
:ERROR 2010:.
Opening, reading or writing error.
- exception audiodata.audiodataexc.AudioTypeError(extension)[source]¶
Bases:
TypeError
:ERROR 2005:.
Audio type error: not supported file format {extension}.
- exception audiodata.audiodataexc.ChannelError[source]¶
Bases:
Exception
:ERROR 2050:.
No channel defined.
- exception audiodata.audiodataexc.ChannelIndexError(index)[source]¶
Bases:
ValueError
:ERROR 2020:.
{number} is not a right index of channel.
- exception audiodata.audiodataexc.FrameRateError(value)[source]¶
Bases:
ValueError
:ERROR 2080:.
Invalid framerate {value}.
- exception audiodata.audiodataexc.IntervalError(value1, value2)[source]¶
Bases:
ValueError
:ERROR 2025:.
From {value1} to {value2} is not a proper interval.
audiodata.audioframes module¶
- filename
sppas.src.audiodata.audioframes.py
- authors
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
Manipulate frames of an Audio()
- class audiodata.audioframes.sppasAudioFrames(frames=b'', sampwidth=2, nchannels=1)[source]¶
Bases:
object
A utility class for audio frames.
TODO: There are no unit tests for this class.
- __init__(frames=b'', sampwidth=2, nchannels=1)[source]¶
Create an instance.
- Parameters
frames – (str) input frames
sampwidth – (int) sample width of the frames (1, 2 or 4)
nchannels – (int) number of channels in the samples
- bias(value)[source]¶
Return frames that are the original frames with a bias added to each sample.
Samples wrap around in case of overflow.
- Parameters
value – (int) the bias which will be applied to each sample.
- Returns
(str) converted frames
- change_sampwidth(new_sampwidth)[source]¶
Return frames with the given number of bytes.
- Parameters
new_sampwidth – (int) new sample width of the frames. (1 for 8 bits, 2 for 16 bits, 4 for 32 bits)
- Returns
(str) converted frames
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
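One plausible reading of the factor parameter is that a sample counts as clipped when its absolute value exceeds factor times the maximum amplitude. The sketch below implements that reading for 16-bit signed little-endian frames. Both the interpretation and the name clipping_rate_16bit are assumptions, not the documented behaviour of clipping_rate.

```python
import struct


def clipping_rate_16bit(frames, factor):
    """Return the fraction of samples outside [-factor*max, factor*max]."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    nsamples = len(frames) // 2
    if nsamples == 0:
        return 0.0
    samples = struct.unpack("<%dh" % nsamples, frames[:2 * nsamples])
    limit = factor * 32767
    clipped = sum(1 for s in samples if abs(s) > limit)
    return clipped / nsamples


# Two of these four samples sit at the edges of the 16-bit range.
frames = struct.pack("<4h", 0, 1000, 32767, -32768)
rate = clipping_rate_16bit(frames, 0.9)
```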
- static get_maxval(size, signed=True)[source]¶
Return the max value for a given sampwidth.
- Parameters
size – (int) the sampwidth
signed – (bool) if the values will be signed or not
- Returns
(int) the max value
- static get_minval(size, signed=True)[source]¶
Return the min value for a given sampwidth.
- Parameters
size – (int) the sampwidth
signed – (bool) if the values will be signed or not
- Returns
(int) the min value
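For reference, the value range of an integer sample follows directly from its width. The sketch below assumes two's complement for signed samples and a zero minimum for unsigned ones. The names max_value and min_value are hypothetical, and get_maxval and get_minval may follow different conventions.

```python
def max_value(size, signed=True):
    """Largest integer that fits in `size` bytes."""
    bits = 8 * size
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


def min_value(size, signed=True):
    """Smallest integer that fits in `size` bytes."""
    bits = 8 * size
    return -(2 ** (bits - 1)) if signed else 0


# Range of a 16-bit signed sample.
range_16 = (min_value(2), max_value(2))
```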
- mul(factor)[source]¶
Return frames for which all samples are multiplied by factor.
Samples are truncated in case of overflow.
- Parameters
factor – (int) the factor which will be applied to each sample.
- Returns
(str) converted frames
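The bias and mul methods handle overflow differently: bias wraps around, while mul truncates. The sketch below contrasts the two behaviours on single 16-bit sample values. It reads "truncated" as "clamped to the nearest representable value", which is an assumption, and the helper names wrap and clamp are hypothetical.

```python
def wrap(value, sampwidth=2):
    """Wrap an integer into the signed range, as two's complement does."""
    bits = 8 * sampwidth
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def clamp(value, sampwidth=2):
    """Limit an integer to the signed range, keeping its sign."""
    bits = 8 * sampwidth
    half = 1 << (bits - 1)
    return max(-half, min(half - 1, value))


biased = wrap(32767 + 10)       # overflow wraps to a large negative value
scaled = clamp(int(20000 * 2))  # overflow stays at the maximum
```

Wrapping turns a loud positive peak into a loud negative one, so adding a large bias can produce audible artifacts, whereas clamping only flattens the peak.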
audiodata.audiopitch module¶
- filename
sppas.src.audiodata.audiopitch.py
- author
Brigitte Bigi
- contact
- summary
TO BE IMPLEMENTED.
audiodata.audiovolume module¶
audiodata.autils module¶
audiodata.basevolume module¶
audiodata.channel module¶
- filename
sppas.src.audiodata.channel.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
Represent a channel (frames) of an audio file.
- class audiodata.channel.sppasChannel(framerate=16000, sampwidth=2, frames=b'')[source]¶
Bases:
object
Manage data and information of a channel.
- __init__(framerate=16000, sampwidth=2, frames=b'')[source]¶
Create a sppasChannel instance.
- Parameters
framerate – (int) The frame rate of this channel, in Hertz.
sampwidth – (int) 1 for 8 bits, 2 for 16 bits, 4 for 32 bits.
frames – (str) The frames represented by a string.
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
- extract_fragment(begin=None, end=None)[source]¶
Extract a fragment between the beginning and the end.
- Parameters
begin – (int: number of frames) the beginning of the fragment to extract
end – (int: number of frames) the end of the fragment to extract
- Returns
(sppasChannel) the fragment extracted.
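Because begin and end are expressed in frames, extracting a fragment from a mono byte string amounts to slicing at multiples of the sample width. The sketch below works on raw bytes rather than on a sppasChannel. The function name and the handling of out-of-range positions are assumptions.

```python
def extract_fragment_bytes(frames, sampwidth, begin=None, end=None):
    """Return the bytes of the frames in [begin, end) of a mono channel."""
    nframes = len(frames) // sampwidth
    begin = 0 if begin is None else max(0, begin)
    end = nframes if end is None else min(nframes, end)
    if begin > end:
        raise ValueError("from %d to %d is not a proper interval" % (begin, end))
    return frames[begin * sampwidth:end * sampwidth]


# Ten 16-bit frames; keep frames 2, 3 and 4.
data = bytes(range(20))
fragment = extract_fragment_bytes(data, 2, begin=2, end=5)
```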
- get_duration()[source]¶
Return the duration of the channel, in seconds.
- Returns
(float) the duration of the channel
- get_frames(chunck_size=None)[source]¶
Return some frames from the current position.
- Parameters
chunck_size – (int) the size of the chunk to return. None for all frames of the channel.
- Returns
(str) the frames
- get_nframes()[source]¶
Return the number of frames.
A frame has a length of (sampwidth) bytes.
- Returns
(int) the total number of frames
- rms()[source]¶
Return the root mean square of the channel.
- Returns
(int) the root mean square of the channel
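The root mean square is the square root of the mean of the squared sample values. The sketch below computes it for 16-bit signed little-endian frames and converts the result to an integer, since the documented return type is int. Whether rms() truncates or rounds is not stated, and the name rms_16bit is hypothetical.

```python
import math
import struct


def rms_16bit(frames):
    """Return the root mean square of 16-bit signed samples, as an int."""
    nsamples = len(frames) // 2
    if nsamples == 0:
        return 0
    samples = struct.unpack("<%dh" % nsamples, frames[:2 * nsamples])
    mean_square = sum(s * s for s in samples) / nsamples
    return int(math.sqrt(mean_square))


# A signal alternating between +3 and -3 has an RMS of 3.
value = rms_16bit(struct.pack("<4h", 3, -3, 3, -3))
```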
- set_framerate(framerate)[source]¶
Set a new framerate to the channel.
- Parameters
framerate – (int) The frame rate of this channel, in Hertz.
A value between 8000 and 192000
- set_frames(frames)[source]¶
Set new frames to the channel.
The sampwidth and framerate are assumed to be the same as the current ones.
- Parameters
frames – (str) the new frames
audiodata.channelformatter module¶
- filename
sppas.src.audiodata.channelformatter.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
Tools to apply on frames of a channel.
- class audiodata.channelformatter.sppasChannelFormatter(channel)[source]¶
Bases:
object
Utility to format frames of a channel.
- __init__(channel)[source]¶
Create a sppasChannelFormatter instance.
- Parameters
channel – (sppasChannel) The channel to work on.
- add_frames(frames, position)[source]¶
Convert the channel by adding frames.
- Parameters
frames – (str)
position – (int) the position where the frames will be inserted
- append_frames(frames)[source]¶
Convert the channel by appending frames.
- Parameters
frames – (str) the frames to append
- bias(bias_value)[source]¶
Convert the channel with a bias added to each frame.
Samples wrap around in case of overflow.
- Parameters
bias_value – (int) the value to bias the frames
- convert()[source]¶
Convert the channel.
Convert to the expected (already) given sample width and frame rate.
- get_framerate()[source]¶
Return the expected frame rate for the channel.
Note that until convert is applied, it can be different from the current one of the channel.
- Returns
the frame rate that will be used by the converter
- get_sampwidth()[source]¶
Return the expected sample width for the channel.
Note that until convert is applied, it can be different from the current one of the channel.
- Returns
the sample width that will be used by the converter
- mul(factor)[source]¶
Convert the channel.
All frames in the original channel are multiplied by the floating-point value factor. Samples are truncated in case of overflow.
- Parameters
factor – (float) the factor to multiply the frames
- remove_frames(begin, end)[source]¶
Convert the channel by removing frames.
- Parameters
begin – (int) the position of the beginning of the frames to remove
end – (int) the position of the end of the frames to remove
- set_framerate(framerate)[source]¶
Fix the expected frame rate for the channel.
Note that until convert is applied, it can be different from the current one of the channel.
- Parameters
framerate – (int) the expected frame rate, in Hertz
audiodata.channelframes module¶
- filename
sppas.src.audiodata.channelframes.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
Represent frames of a channel.
- class audiodata.channelframes.sppasChannelFrames(frames=b'')[source]¶
Bases:
object
Data structure to deal with the frames of a channel.
- __init__(frames=b'')[source]¶
Create a sppasChannelFrames instance.
- Parameters
frames – (str) Frames that must be MONO ONLY.
- append_silence(nframes)[source]¶
Create n frames of silence and append it to the frames.
- Parameters
nframes – (int) the number of frames of silence to append
- change_sampwidth(sampwidth, newsampwidth)[source]¶
Change the number of bytes used to encode the frames.
- Parameters
sampwidth – (int) current sample width of the frames. (1 for 8 bits, 2 for 16 bits, 4 for 32 bits)
newsampwidth – (int) new sample width of the frames. (1 for 8 bits, 2 for 16 bits, 4 for 32 bits)
- prepend_silence(nframes)[source]¶
Create n frames of silence and prepend it to the frames.
- Parameters
nframes – (int) the number of frames of silence to prepend
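For signed PCM data, silence is a run of zero bytes, so padding a mono channel reduces to byte concatenation. The sketch below assumes signed samples and takes the sample width as an explicit argument. The sample width assumed by append_silence and prepend_silence is not documented here, and the function names are hypothetical.

```python
def append_silence_bytes(frames, nframes, sampwidth=2):
    """Return frames followed by nframes frames of silence."""
    return frames + b"\x00" * (nframes * sampwidth)


def prepend_silence_bytes(frames, nframes, sampwidth=2):
    """Return frames preceded by nframes frames of silence."""
    return b"\x00" * (nframes * sampwidth) + frames


speech = b"\x10\x00\x20\x00"                 # two 16-bit frames
padded = prepend_silence_bytes(append_silence_bytes(speech, 3), 1)
```

Unsigned 8-bit audio uses 128 rather than 0 as its rest value, so this sketch does not apply to it.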
audiodata.channelmfcc module¶
- filename
sppas.src.audiodata.channelmfcc.py
- author
Brigitte Bigi
- contact
- summary
Estimate MFCC. TO BE IMPLEMENTED.
Requires HTK to be installed.
Mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency.
MFCCs are commonly derived as follows:
Take the Fourier transform of (a windowed excerpt of) a signal.
Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
Take the logs of the powers at each of the mel frequencies.
Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
The MFCCs are the amplitudes of the resulting spectrum.
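The steps above can be sketched in pure Python on a single window. This is not the implementation of this module, which is marked as to be implemented and relies on HTK. The Hamming window, the number of filters, the mel formula and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(window):
    """Step 1: naive DFT of a Hamming-windowed excerpt, as powers."""
    n = len(window)
    tapered = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
               for i, x in enumerate(window)]
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(tapered))
        powers.append(abs(acc) ** 2)
    return powers


def mel_filterbank(nfilters, nbins, framerate):
    """Step 2: triangular overlapping filters, evenly spaced in mels."""
    nyquist = framerate / 2.0
    top = hz_to_mel(nyquist)
    # nfilters + 2 edge frequencies, converted to fractional bin positions.
    edges = [mel_to_hz(top * i / (nfilters + 1)) / nyquist * (nbins - 1)
             for i in range(nfilters + 2)]
    bank = []
    for j in range(nfilters):
        left, center, right = edges[j], edges[j + 1], edges[j + 2]
        row = []
        for b in range(nbins):
            if left < b <= center:
                row.append((b - left) / (center - left))
            elif center < b < right:
                row.append((right - b) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(window, framerate, nfilters=12, ncoeffs=6):
    powers = power_spectrum(window)
    bank = mel_filterbank(nfilters, len(powers), framerate)
    # Step 3: log of the power in each mel band (floored to avoid log(0)).
    logs = [math.log(max(sum(w * p for w, p in zip(row, powers)), 1e-10))
            for row in bank]
    # Steps 4 and 5: DCT-II of the log powers; its amplitudes are the MFCCs.
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / nfilters)
                for j, v in enumerate(logs))
            for c in range(ncoeffs)]


# 256 samples of a 440 Hz sine sampled at 16000 Hz.
signal = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(256)]
coeffs = mfcc(signal, 16000)
```

The naive DFT keeps the example dependency-free; a real implementation would use an FFT and process overlapping windows across the whole signal.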
audiodata.channelsilence module¶
audiodata.channelsmixer module¶
- filename
sppas.src.audiodata.channelsmixer.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
A channel utility class to mix several channels in one.
- class audiodata.channelsmixer.sppasChannelMixer[source]¶
Bases:
object
A channel utility class to mix several channels in one.
- append_channel(channel, factor=1)[source]¶
Append a channel and the corresponding factor for a mix.
- Parameters
channel – (Channel object) the channel to append
factor – (float) the factor associated to the channel
- get_channel(idx)[source]¶
Return the channel of a given index.
- Parameters
idx – (int) the index of the channel to return
- Returns
(sppasChannel)
- get_minmax()[source]¶
Return a tuple with the minimum and the maximum sample values.
- Returns
the tuple (minvalue, maxvalue)
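Mixing channels with per-channel factors can be understood as a weighted sum of the samples at each position. The sketch below works on plain lists of 16-bit sample values. The functions mix and minmax are hypothetical and are not the documented API of sppasChannelMixer. Clamping the sum to the 16-bit range is likewise an assumption.

```python
def mix(channels, factors):
    """Return the weighted sum of the channels, clamped to 16 bits.

    channels: list of lists of integer samples
    factors:  one weight per channel
    """
    if len(channels) != len(factors):
        raise ValueError("one factor is required for each channel")
    nsamples = min(len(c) for c in channels)
    mixed = []
    for i in range(nsamples):
        total = sum(f * c[i] for c, f in zip(channels, factors))
        mixed.append(max(-32768, min(32767, int(total))))
    return mixed


def minmax(samples):
    """Return the tuple (minimum, maximum) of the sample values."""
    return min(samples), max(samples)


# The second channel is mixed in at half its level.
mixed = mix([[1000, -2000, 30000], [500, 500, 30000]], [1, 0.5])
extremes = minmax(mixed)
```

Checking the extremes of the mix is a simple way to choose factors that keep the result inside the representable range.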
audiodata.channelvolume module¶
Module contents¶
- filename
sppas.src.audiodata.__init__.py
- author
Nicolas Chazeau, Brigitte Bigi
- contact
- summary
audio file manager.
audiodata: management of digital audio data.¶
Requires the following other packages:
config
- class audiodata.sppasAudioFrames(frames=b'', sampwidth=2, nchannels=1)[source]¶
Bases:
object
A utility class for audio frames.
TODO: There are no unit tests for this class.
- __init__(frames=b'', sampwidth=2, nchannels=1)[source]¶
Create an instance.
- Parameters
frames – (str) input frames
sampwidth – (int) sample width of the frames (1, 2 or 4)
nchannels – (int) number of channels in the samples
- bias(value)[source]¶
Return frames that are the original frames with a bias added to each sample.
Samples wrap around in case of overflow.
- Parameters
value – (int) the bias which will be applied to each sample.
- Returns
(str) converted frames
- change_sampwidth(new_sampwidth)[source]¶
Return frames with the given number of bytes.
- Parameters
new_sampwidth – (int) new sample width of the frames. (1 for 8 bits, 2 for 16 bits, 4 for 32 bits)
- Returns
(str) converted frames
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
- static get_maxval(size, signed=True)[source]¶
Return the max value for a given sampwidth.
- Parameters
size – (int) the sampwidth
signed – (bool) if the values will be signed or not
- Returns
(int) the max value
- static get_minval(size, signed=True)[source]¶
Return the min value for a given sampwidth.
- Parameters
size – (int) the sampwidth
signed – (bool) if the values will be signed or not
- Returns
(int) the min value
- mul(factor)[source]¶
Return frames for which all samples are multiplied by factor.
Samples are truncated in case of overflow.
- Parameters
factor – (int) the factor which will be applied to each sample.
- Returns
(str) converted frames
- class audiodata.sppasAudioPCM[source]¶
Bases:
object
An audio manager.
- These variables are user gettable through appropriate methods:
nchannels – the number of audio channels
framerate – the sampling frequency
sampwidth – the number of bytes per audio sample (1, 2 or 4)
nframes – the number of frames
params – parameters of the wave file
filename – the name of the wave file
The audiofp member is assigned by the IO classes (WaveIO, AifIO, SunauIO). It is expected that it can access the following methods:
readframes()
writeframes()
getsampwidth()
getframerate()
getnframes()
getnchannels()
setpos()
tell()
rewind()
- append_channel(channel)[source]¶
Append a channel to the list of uploaded channels.
- Parameters
channel – (sppasChannel) the channel to append
- Returns
index of the channel
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
- extract_channel(index=0)[source]¶
Extract a channel from the Audio File Pointer.
Append the channel into the list of channels.
Frames are stored into a sppasChannel() instance. Index of the channel in the audio file: 0 = 1st channel (left); 1 = 2nd channel (right); 2 = 3rd channel…
- Parameters
index – (int) The index of the channel to extract
- Returns
the index of the sppasChannel() in the list
- extract_channels()[source]¶
Extract all channels from the Audio File Pointer.
Append the extracted channels to the list of channels.
- get_channel(idx)[source]¶
Get an uploaded channel.
- Parameters
idx – (int) the index of the channel to return
- Returns
(sppasChannel)
- get_duration()[source]¶
Return the duration of the Audio File Pointer.
- Returns
(float) duration of the audio file (in seconds)
- get_framerate()[source]¶
Return the frame rate of the Audio File Pointer.
- Returns
(int) frame rate of the audio file
- get_nchannels()[source]¶
Return the number of channels of the Audio File Pointer.
- Returns
(int) number of channels of the audio file
- get_nframes()[source]¶
Return the number of frames of the Audio File Pointer.
- Returns
(int) number of frames of the audio file
- get_sampwidth()[source]¶
Return the sample width of the Audio File Pointer.
- Returns
(int) sample width of the audio file
- insert_channel(idx, channel)[source]¶
Insert a channel at the position given in the list of uploaded channels.
- Parameters
idx – (int) the index where the channel has to be inserted
channel – (sppasChannel) the channel to insert
- pop_channel(idx)[source]¶
Pop a channel at the position given from the list of uploaded channels.
- Parameters
idx – (int) the index of the channel to remove
- read_frames(nframes)[source]¶
Read n frames from the audio file.
- Parameters
nframes – (int) the number of frames to read
- Returns
(str) frames
- read_samples(nframes)[source]¶
Read the samples from the audio file.
- Parameters
nframes – (int) the number of frames to read
- Returns
(list of list) list of samples of each channel
- remove_channel(channel)[source]¶
Remove a channel from the list of uploaded channels.
- Parameters
channel – (sppasChannel) the channel to remove
- class audiodata.sppasChannel(framerate=16000, sampwidth=2, frames=b'')[source]¶
Bases:
object
Manage data and information of a channel.
- __init__(framerate=16000, sampwidth=2, frames=b'')[source]¶
Create a sppasChannel instance.
- Parameters
framerate – (int) The frame rate of this channel, in Hertz.
sampwidth – (int) 1 for 8 bits, 2 for 16 bits, 4 for 32 bits.
frames – (str) The frames represented by a string.
- clipping_rate(factor)[source]¶
Return the clipping rate of the frames.
- Parameters
factor – (float) An interval to be more precise on clipping rate. It will consider that all frames outside the interval are clipped. Factor has to be between 0 and 1.
- Returns
(float) the clipping rate
- extract_fragment(begin=None, end=None)[source]¶
Extract a fragment between the beginning and the end.
- Parameters
begin – (int: number of frames) the beginning of the fragment to extract
end – (int: number of frames) the end of the fragment to extract
- Returns
(sppasChannel) the fragment extracted.
- get_duration()[source]¶
Return the duration of the channel, in seconds.
- Returns
(float) the duration of the channel
- get_frames(chunck_size=None)[source]¶
Return some frames from the current position.
- Parameters
chunck_size – (int) the size of the chunk to return. None for all frames of the channel.
- Returns
(str) the frames
- get_nframes()[source]¶
Return the number of frames.
A frame has a length of (sampwidth) bytes.
- Returns
(int) the total number of frames
- rms()[source]¶
Return the root mean square of the channel.
- Returns
(int) the root mean square of the channel
- set_framerate(framerate)[source]¶
Set a new framerate to the channel.
- Parameters
framerate – (int) The frame rate of this channel, in Hertz.
A value between 8000 and 192000
- set_frames(frames)[source]¶
Set new frames to the channel.
The sampwidth and framerate are assumed to be the same as the current ones.
- Parameters
frames – (str) the new frames