Python API

This section documents the pure Python API of bob.ap.

bob.ap.get_config()

Returns a string containing the configuration information.

class bob.ap.Ceps

Bases: bob.ap.Spectrogram

Ceps(sampling_frequency, [win_length_ms=20., [win_shift_ms=10., [n_filters=24, [n_ceps=19, [f_min=0., [f_max=4000., [delta_win=2, [pre_emphasis_coeff=0.95, [mel_scale=True, [dct_norm=True]]]]]]]]]]) -> new Ceps
Ceps(other) -> new Ceps

Objects of this class, after configuration, can extract the cepstral coefficients from 1D audio arrays (signals).

Parameters:

sampling_frequency
[float] the sampling frequency (sampling rate)
win_length_ms
[float] the window length in milliseconds
win_shift_ms
[float] the window shift in milliseconds
n_filters
[int] the number of filter bands
n_ceps
[int] the number of cepstral coefficients
f_min
[double] the minimum frequency of the filter bank
f_max
[double] the maximum frequency of the filter bank
delta_win
[int] The integer delta value used for computing the first and second order derivatives
pre_emphasis_coeff
[double] the coefficient used for the pre-emphasis
mel_scale
[bool] whether cepstral features are extracted on a linear scale (LFCC, set it to False) or on the Mel scale (MFCC, set it to True, the default)
dct_norm
[bool] whether the cepstral coefficients are multiplied by a normalization factor
other
[Ceps] an object that is, or inherits from, Ceps; it is deep-copied into a new instance.

dct_norm

Tells whether the cepstral coefficients are multiplied by a normalization factor
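As a rough illustration of what a DCT normalization flag can do, the sketch below derives cepstral coefficients from log filter-bank energies with a DCT-II. The function name dct_ceps, the sqrt(2/N) factor, and the choice to skip the zeroth coefficient are assumptions made for this example; they are not taken from the bob.ap sources.

```python
import math

def dct_ceps(log_energies, n_ceps, dct_norm=True):
    """DCT-II of log filter-bank energies (hypothetical helper, not bob.ap code)."""
    n = len(log_energies)
    # Assumption: when dct_norm is set, every coefficient is scaled by sqrt(2/N).
    scale = math.sqrt(2.0 / n) if dct_norm else 1.0
    ceps = []
    # Assumption: coefficients start at k = 1, so the DC term is left out.
    for k in range(1, n_ceps + 1):
        acc = sum(e * math.cos(math.pi * k * (j + 0.5) / n)
                  for j, e in enumerate(log_energies))
        ceps.append(scale * acc)
    return ceps

# A flat spectrum carries no cepstral structure: all coefficients are ~0.
flat = dct_ceps([1.0] * 24, 19)
```

With the constructor defaults listed above (n_filters=24, n_ceps=19), such a transform maps 24 filter outputs to 19 coefficients per frame.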

delta_win

The integer delta value used for computing the first and second order derivatives
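To make the role of a delta window concrete, here is a minimal sketch of the regression formula commonly used for derivative features. The helper delta and its edge handling (clamping to the first and last frame) are assumptions for illustration; bob.ap may compute its derivatives differently.

```python
def delta(features, delta_win=2):
    """First-order regression deltas over a list of scalar features (illustrative)."""
    n = len(features)
    denom = 2.0 * sum(k * k for k in range(1, delta_win + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(1, delta_win + 1):
            ahead = features[min(t + k, n - 1)]  # clamp at the last frame
            behind = features[max(t - k, 0)]     # clamp at the first frame
            acc += k * (ahead - behind)
        out.append(acc / denom)
    return out

ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
first = delta(ramp)    # first-order derivatives
second = delta(first)  # second-order derivatives: the same operator applied again
```

Away from the edges, a ramp with unit slope yields a delta of 1, which is the expected behaviour of a slope estimator.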

energy_bands

Tells whether we compute a spectrogram or energy bands

energy_filter

Tells whether we use the energy or the square root of the energy

energy_floor

The energy flooring threshold

f_max

The maximum frequency of the filter bank

f_min

The minimum frequency of the filter bank

get_shape(input) → tuple

Computes the shape of the output features, given either an input array or its size.

Parameters:

input
[int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

log_filter

Tells whether we use the log triangular filter or the triangular filter

mel_scale

Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale
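The difference between the two scales can be sketched with a widely used Mel formula. The conversion constants (2595 and 700), the helper names, and the even spacing of filter centers are assumptions for this example and are not confirmed to match the bob.ap implementation.

```python
import math

def hz_to_mel(f_hz):
    # A common Mel formula; assumed here, not taken from bob.ap.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def center_frequencies(n_filters=24, f_min=0.0, f_max=4000.0, mel_scale=True):
    """Filter centers in Hz, evenly spaced on the chosen scale (illustrative)."""
    if mel_scale:
        lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    else:
        lo, hi = f_min, f_max
    step = (hi - lo) / (n_filters + 1)
    points = [lo + i * step for i in range(1, n_filters + 1)]
    return [mel_to_hz(p) for p in points] if mel_scale else points

mel_centers = center_frequencies(mel_scale=True)
lin_centers = center_frequencies(mel_scale=False)
```

On the Mel scale the centers crowd together at low frequencies and spread out at high frequencies; on the linear scale they are equidistant in Hz.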

n_ceps

The number of cepstral coefficients

n_filters

The number of filter bands

pre_emphasis_coeff

The coefficient used for the pre-emphasis
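Pre-emphasis is usually a first-order high-pass filter. The sketch below shows the textbook form y[n] = x[n] - a * x[n-1]; the function name and the treatment of the first sample are assumptions, and bob.ap may apply the filter per frame rather than on the whole signal.

```python
def pre_emphasis(signal, coeff=0.95):
    """Apply y[n] = x[n] - coeff * x[n-1] to a list of samples (illustrative)."""
    if not signal:
        return []
    # Assumption: the first sample has no predecessor and is passed through unchanged.
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out

filtered = pre_emphasis([1.0, 1.0, 1.0], coeff=0.5)
```

A constant input is attenuated after the first sample, which reflects the filter's purpose of reducing low-frequency content relative to high-frequency content.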

sampling_frequency

The sampling frequency (sampling rate)

win_length

The normalized window length w.r.t. the sample frequency

win_length_ms

The window length of the cepstral analysis in milliseconds

win_shift

The normalized window shift w.r.t. the sample frequency

win_shift_ms

The window shift of the cepstral analysis in milliseconds
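The relation between the millisecond attributes and their normalized counterparts can be sketched as a conversion to samples. The helper name ms_to_samples and the truncating conversion are assumptions; the exact rounding used by bob.ap is not stated in this section.

```python
def ms_to_samples(sampling_frequency, duration_ms):
    """Convert a duration in milliseconds to a whole number of samples (illustrative)."""
    # Assumption: truncation toward zero; the library may round differently.
    return int(sampling_frequency * duration_ms / 1000.0)

# Constructor defaults from this section (20 ms window, 10 ms shift) at 8000 Hz.
win_length = ms_to_samples(8000.0, 20.0)
win_shift = ms_to_samples(8000.0, 10.0)
```

Under these assumptions, an 8 kHz signal gives a window of 160 samples and a shift of 80 samples.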

with_delta

Tells if we add the first derivatives to the output feature

with_delta_delta

Tells if we add the second derivatives to the output feature

with_energy

Tells if we add the energy to the output feature

class bob.ap.Energy

Bases: bob.ap.FrameExtractor

Energy(sampling_frequency, [win_length_ms=20., [win_shift_ms=10.]]) -> new Energy
Energy(other) -> new Energy

Objects of this class, after configuration, can extract the energy of frames extracted from a 1D audio array/signal.

Parameters:

sampling_frequency
[float] the sampling frequency (sampling rate)
win_length_ms
[float] the window length in milliseconds
win_shift_ms
[float] the window shift in milliseconds
other
[Energy] an object that is, or inherits from, Energy; it is deep-copied into a new instance.

energy_floor

The energy flooring threshold
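A flooring threshold typically prevents taking the logarithm of a near-zero energy. The sketch below shows one plausible reading: the frame energy is the sum of squared samples, clipped from below before the log. The function name, the default floor of 1.0, and the use of a natural logarithm are assumptions, not documented bob.ap behaviour.

```python
import math

def frame_log_energy(frame, energy_floor=1.0):
    """Log energy of one frame with a lower bound on the energy (illustrative)."""
    energy = sum(x * x for x in frame)
    # Assumption: energies below the floor are replaced by the floor itself.
    return math.log(max(energy, energy_floor))

silent = frame_log_energy([0.0, 0.0, 0.0])  # floored, so log(1.0)
loud = frame_log_energy([1.0, 1.0, 1.0])    # log(3.0)
```

Without the floor, a silent frame would produce log(0), which is undefined.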

get_shape(input) → tuple

Computes the shape of the output features, given either an input array or its size.

Parameters:

input
[int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

sampling_frequency

The sampling frequency (sampling rate)

win_length

The normalized window length w.r.t. the sample frequency

win_length_ms

The window length of the analysis in milliseconds

win_shift

The normalized window shift w.r.t. the sample frequency

win_shift_ms

The window shift of the analysis in milliseconds

class bob.ap.FrameExtractor

Bases: object

FrameExtractor(sampling_frequency, [win_length_ms=20., [win_shift_ms=10.]]) -> new FrameExtractor
FrameExtractor(other) -> new FrameExtractor

This class is a base type for classes that perform audio processing on a frame basis. It can be instantiated from Python.

Objects of this class, after configuration, can extract audio frames from a 1D audio array/signal. You can instantiate objects of this class by passing a set of construction parameters or another object whose base type is FrameExtractor.

Parameters:

sampling_frequency
[float] the sampling frequency (sampling rate)
win_length_ms
[float] the window length in milliseconds
win_shift_ms
[float] the window shift in milliseconds
other
[FrameExtractor] an object that is, or inherits from, FrameExtractor; it is deep-copied into a new instance.

get_shape(input) → tuple

Computes the shape of the output features, given either an input array or its size.

Parameters:

input
[int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.
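How a 2-tuple shape could follow from the input size can be sketched with the usual sliding-window count. The formula, the helper names, and the assumption that each output row holds one frame of win_length samples are illustrative only; the value actually returned by get_shape should be checked against the library.

```python
def frame_count(n_samples, win_length, win_shift):
    """Number of full windows that fit in the signal (illustrative formula)."""
    if n_samples < win_length:
        return 0
    return 1 + (n_samples - win_length) // win_shift

def expected_shape(n_samples, win_length, win_shift):
    # Assumption: one row per frame, one column per sample in the window.
    return (frame_count(n_samples, win_length, win_shift), win_length)

# One second of audio at 8000 Hz with a 160-sample window and an 80-sample shift.
shape = expected_shape(8000, 160, 80)
```

Under these assumptions the one-second example yields 99 frames of 160 samples each.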

sampling_frequency

The sampling frequency (sampling rate)

win_length

The normalized window length w.r.t. the sample frequency

win_length_ms

The window length of the analysis in milliseconds

win_shift

The normalized window shift w.r.t. the sample frequency

win_shift_ms

The window shift of the analysis in milliseconds

class bob.ap.Spectrogram

Bases: bob.ap.Energy

Spectrogram(sampling_frequency, [win_length_ms=20., [win_shift_ms=10., [n_filters=24, [f_min=0., [f_max=4000., [pre_emphasis_coeff=0.95, [mel_scale=True]]]]]]]) -> new Spectrogram
Spectrogram(other) -> new Spectrogram

Objects of this class, after configuration, can extract the spectrogram from 1D audio arrays (signals).

Parameters:

sampling_frequency
[float] the sampling frequency (sampling rate)
win_length_ms
[float] the window length in milliseconds
win_shift_ms
[float] the window shift in milliseconds
n_filters
[int] the number of filter bands
f_min
[double] the minimum frequency of the filter bank
f_max
[double] the maximum frequency of the filter bank
pre_emphasis_coeff
[double] the coefficient used for the pre-emphasis
mel_scale
[bool] whether cepstral features are extracted on a linear scale (LFCC, set it to False) or on the Mel scale (MFCC, set it to True, the default)
other
[Spectrogram] an object that is, or inherits from, Spectrogram; it is deep-copied into a new instance.

energy_bands

Tells whether we compute a spectrogram or energy bands

energy_filter

Tells whether we use the energy or the square root of the energy

energy_floor

The energy flooring threshold

f_max

The maximum frequency of the filter bank

f_min

The minimum frequency of the filter bank

get_shape(input) → tuple

Computes the shape of the output features, given either an input array or its size.

Parameters:

input
[int|array] Either an integral value or an array for which the output shape of this extractor is going to be computed.

This method always returns a 2-tuple containing the shape of output features produced by this extractor.

log_filter

Tells whether we use the log triangular filter or the triangular filter
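The distinction between a triangular and a log triangular filter can be sketched as follows: a triangular window weights a band of spectrum bins, and the log variant takes the logarithm of the weighted sum. The helper names, the bin-index interface, and the small floor value are assumptions for this example, not the bob.ap implementation.

```python
import math

def triangular_weights(left, center, right):
    """Weights of one triangular filter over bins left..right, with left < center < right."""
    weights = []
    for k in range(left, right + 1):
        if k <= center:
            weights.append((k - left) / (center - left))    # rising edge
        else:
            weights.append((right - k) / (right - center))  # falling edge
    return weights

def filter_output(power_spectrum, left, center, right, log_filter=True, floor=1e-10):
    """Weighted band sum, optionally log-compressed (floor is a hypothetical guard)."""
    w = triangular_weights(left, center, right)
    band = sum(wi * power_spectrum[left + i] for i, wi in enumerate(w))
    return math.log(max(band, floor)) if log_filter else band

weights = triangular_weights(0, 2, 4)
linear_out = filter_output([1.0] * 5, 0, 2, 4, log_filter=False)
log_out = filter_output([1.0] * 5, 0, 2, 4, log_filter=True)
```

The weights rise linearly from 0 at the left edge to 1 at the center and fall back to 0 at the right edge.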

mel_scale

Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale

n_filters

The number of filter bands

pre_emphasis_coeff

The coefficient used for the pre-emphasis

sampling_frequency

The sampling frequency (sampling rate)

win_length

The normalized window length w.r.t. the sample frequency

win_length_ms

The window length of the analysis in milliseconds

win_shift

The normalized window shift w.r.t. the sample frequency

win_shift_ms

The window shift of the analysis in milliseconds
