Bob 2.0 extraction of cepstral features (MFCC or LFCC) from audio
Algorithms have at least one input and one output. All algorithm endpoints are organized in groups. Groups are used by the platform to indicate which inputs and outputs are synchronized together. The first group is automatically synchronized with the channel defined by the block in which the algorithm is deployed.
Endpoint Name | Data Format | Nature |
---|---|---|
speech | system/array_1d_floats/1 | Input |
vad | system/array_1d_integers/1 | Input |
features | system/array_2d_floats/1 | Output |
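The endpoints above can be pictured as keys on the objects handed to the algorithm. The following is a minimal sketch only, assuming the platform calls `setup(parameters)` once and then `process(inputs, outputs)` for each unit of data. The accessors `.data.value` and `.write(...)`, and the helper names `FakeOutput` and `fake_input`, are assumptions made for illustration and are not confirmed by this page. The body of `process` is a placeholder, not the actual feature extraction.

```python
from types import SimpleNamespace


class Algorithm:
    """Skeleton only: shows endpoint wiring, not the real feature extractor."""

    def setup(self, parameters):
        # Fall back to the documented defaults when a parameter is not given.
        self.rate = parameters.get("rate", 16000.0)
        self.n_ceps = parameters.get("n_ceps", 19)
        return True

    def process(self, inputs, outputs):
        # Assumed accessors: each input exposes its payload as `.data.value`.
        speech = inputs["speech"].data.value  # 1D array of floats
        vad = inputs["vad"].data.value        # 1D array of integers
        # Placeholder: one all-zero row per VAD label instead of real features.
        features = [[0.0] * self.n_ceps for _ in vad]
        outputs["features"].write({"value": features})  # 2D array of floats
        return True


# Hypothetical stand-ins for the platform's input and output objects,
# used here only so that the sketch can run on its own.
class FakeOutput:
    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


def fake_input(value):
    return SimpleNamespace(data=SimpleNamespace(value=value))


algo = Algorithm()
algo.setup({"n_ceps": 19})
out = {"features": FakeOutput()}
algo.process(
    {"speech": fake_input([0.0] * 640), "vad": fake_input([1, 1, 0])},
    out,
)
```

The point of the sketch is the naming: the two inputs and the single output are addressed by the endpoint names listed in the table, and the output is written as a 2D structure matching `system/array_2d_floats/1`.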
Parameters allow users to change the configuration of an algorithm when scheduling an experiment.
Name | Description | Type | Default | Range/Choices |
---|---|---|---|---|
f_max | Max frequency of the range used in bandpass filtering | float64 | 8000.0 | |
delta_win | Window size used in delta and delta-delta computation | uint32 | 2 | |
withDelta | Compute deltas (with window size specified by delta_win) | bool | True | |
pre_emphasis_coef | Pre-emphasis coefficient | float64 | 0.95 | |
win_shift_ms | The length of the overlap between neighboring windows, typically half the window length | float64 | 10.0 | |
win_length_ms | The length of the sliding processing window, typically about 20 ms | float64 | 20.0 | |
dct_norm | Use normalized DCT | bool | False | |
normalizeFeatures | Normalize computed Cepstral features (shift by mean and divide by std) | bool | True | |
filter_frames | Filter frames with computed Cepstral features based on the VAD labels. Either trim out silence head/tails, keep only speech, or keep only silence. | string | trim_silence | trim_silence, silence_only, speech_only |
rate | Sampling rate of the speech signal | float64 | 16000.0 | |
n_filters | Number of filter bands | uint32 | 24 | |
f_min | Min frequency of the range used in bandpass filtering | float64 | 0.0 | |
withDeltaDelta | Compute delta-deltas (with window size specified by delta_win) | bool | True | |
withEnergy | Use power of the FFT magnitude, otherwise just an absolute value of the magnitude | bool | True | |
mel_scale | Set true to use Mel-scaled triangular filter, otherwise it's a linear scale | bool | True | |
n_ceps | Number of cepstral coefficients | uint32 | 19 | |
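To make a few of these parameters concrete, here is a rough sketch in plain Python of how `rate`, `win_length_ms` and `win_shift_ms` translate into frame counts, what the three `filter_frames` choices could do, and what `normalizeFeatures` describes. It is not the algorithm's actual code. It assumes that only full windows are kept, that a VAD label of 1 means speech and 0 means silence, and that normalization is applied per feature column. The function names are hypothetical.

```python
import math


def frame_layout(n_samples, rate=16000.0, win_length_ms=20.0, win_shift_ms=10.0):
    """Return (win_length, win_shift, n_frames), the first two in samples."""
    win_length = int(rate * win_length_ms / 1000.0)
    win_shift = int(rate * win_shift_ms / 1000.0)
    if n_samples < win_length:
        return win_length, win_shift, 0
    # Assumption: only full windows are kept (the last frame is not padded).
    return win_length, win_shift, 1 + (n_samples - win_length) // win_shift


def filter_frames(features, vad, mode="trim_silence"):
    """Select feature rows according to per-frame VAD labels."""
    # Assumption: vad[i] == 1 marks speech and 0 marks silence for frame i.
    if mode == "speech_only":
        return [f for f, v in zip(features, vad) if v == 1]
    if mode == "silence_only":
        return [f for f, v in zip(features, vad) if v == 0]
    if mode == "trim_silence":
        speech = [i for i, v in enumerate(vad) if v == 1]
        if not speech:
            return []  # Assumption: nothing is kept when no speech is found.
        # Drop leading and trailing silence, keep everything in between.
        return features[speech[0]:speech[-1] + 1]
    raise ValueError("unknown filter_frames mode: %r" % mode)


def normalize(features):
    """Shift each column by its mean and divide by its standard deviation."""
    if not features:
        return []
    n = len(features)
    columns = list(zip(*features))
    means = [sum(c) / n for c in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / n)
        for c, m in zip(columns, means)
    ]
    # Guard against constant columns to avoid dividing by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in features]
```

With the default values, one second of audio at 16000 Hz gives windows of 320 samples shifted by 160 samples, so `frame_layout(16000)` yields 99 full frames under the assumptions stated above.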
The code for this algorithm in Python
Extract cepstral features (MFCC or LFCC) from audio
Name | Databases/Protocols | Analyzers |
---|---|---|
pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_lbp_hist_ratios_lr-fusion_lr-pa_aligned | avspoof/2@physicalaccess_verify_train,avspoof/2@physicalaccess_verification,avspoof/2@physicalaccess_verification_spoof,avspoof/2@physicalaccess_verify_train_spoof,avspoof/2@physicalaccess_antispoofing | pkorshunov/spoof-score-fusion-roc_hist/1 |
pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_gmm-fusion_lr-pa | avspoof/2@physicalaccess_verify_train,avspoof/2@physicalaccess_verification,avspoof/2@physicalaccess_verification_spoof,avspoof/2@physicalaccess_verify_train_spoof,avspoof/2@physicalaccess_antispoofing | pkorshunov/spoof-score-fusion-roc_hist/1 |
pkorshunov/pkorshunov/speech-pad-simple/1/speech-pad_gmm-pa | avspoof/2@physicalaccess_antispoofing | pkorshunov/simple_antispoofing_analyzer/4 |
pkorshunov/pkorshunov/isv-speaker-verification-spoof/1/isv-speaker-verification-spoof-pa | avspoof/2@physicalaccess_verification_spoof,avspoof/2@physicalaccess_verification | pkorshunov/eerhter_postperf_iso_spoof/1 |
pkorshunov/pkorshunov/isv-speaker-verification/1/isv-speaker-verification-licit | avspoof/2@physicalaccess_verification | pkorshunov/eerhter_postperf_iso/1 |
This table shows the number of times this algorithm has been successfully run using the given environment. Note that this does not provide sufficient information to evaluate whether the algorithm will run when submitted under different conditions.