Implements Linear and Mel Frequency Cepstral Coefficients (LFCC and MFCC)

This algorithm is a legacy one. The API has changed since its implementation. New versions and forks will need to be updated.

Algorithms have at least one input and one output. All algorithm endpoints are organized in groups. Groups are used by the platform to indicate which inputs and outputs are synchronized together. The first group is automatically synchronized with the channel defined by the block in which the algorithm is deployed.

Unnamed group

Endpoint Name    Data Format                  Nature
speech           system/array_1d_floats/1     Input
vad              system/array_1d_integers/1   Input
features         system/array_2d_floats/1     Output

Parameters allow users to change the configuration of an algorithm when scheduling an experiment.

Name    Description    Type       Default    Range/Choices
rate    (none)         float64    16000.0    (none)
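
As a rough illustration of what the rate parameter influences, the sketch below converts the window length and shift used in the code (20 ms and 10 ms) into sample counts for a given sampling rate. The helper name frame_layout, the plain dict standing in for the platform's parameter object, and the no-padding frame-count formula are assumptions made for this example; they are not part of the platform or of Bob.

```python
def frame_layout(rate, win_length_ms=20, win_shift_ms=10, n_samples=None):
    """Convert window settings from milliseconds to samples.

    Hypothetical helper: the frame count assumes no padding at the end
    of the signal, which may differ from what the extractor does.
    """
    win_length = int(rate * win_length_ms / 1000)
    win_shift = int(rate * win_shift_ms / 1000)
    n_frames = None
    if n_samples is not None and n_samples >= win_length:
        n_frames = 1 + (n_samples - win_length) // win_shift
    return win_length, win_shift, n_frames


# A plain dict stands in for the platform's parameter mapping (assumption).
parameters = {"rate": 8000.0}
rate = parameters.get("rate", 16000.0)  # falls back to the default rate

# One second of audio at the selected rate
print(frame_layout(rate, n_samples=int(rate)))
```

With the default rate of 16000 Hz, the same call yields a 320-sample window shifted by 160 samples.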
import bob
import numpy


def normalize_std_array(vector):
  """Applies a zero-mean, unit-variance normalization to an arrayset"""

  # Initializes variables
  length = 1
  n_samples = len(vector)
  mean = numpy.ndarray((length,), 'float64')
  std = numpy.ndarray((length,), 'float64')

  mean.fill(0)
  std.fill(0)

  # Computes mean and variance
  for array in vector:
    x = array.astype('float64')
    mean += x
    std += (x ** 2)

  mean /= n_samples
  std /= n_samples
  std -= (mean ** 2)
  std = std ** 0.5  # standard deviation: sqrt(E[x^2] - mean^2)
  arrayset = numpy.ndarray(shape=(n_samples,mean.shape[0]), dtype=numpy.float64)

  # Normalizes every sample
  for i in range (0, n_samples):
    arrayset[i,:] = (vector[i]-mean) / std
  return arrayset


class Algorithm:

  def __init__(self):
    self.win_length_ms = 20
    self.win_shift_ms = 10
    self.n_filters = 24
    self.n_ceps = 19
    self.f_min = 0
    self.f_max = 4000
    self.delta_win = 2
    self.pre_emphasis_coef = 0.95
    self.dct_norm = False
    self.mel_scale = True
    self.withEnergy = True
    self.withDelta = True
    self.withDeltaDelta = True
    #TODO: find a way to compute this automatically
    self.rate = 16000
    self.features_mask = numpy.arange(0,60)
    self.normalizeFeatures = True


  def setup(self, parameters):
    # 'rate' is the only setting exposed as a platform parameter
    self.rate = parameters.get('rate', self.rate)
    wl = self.win_length_ms
    ws = self.win_shift_ms
    nf = self.n_filters
    nc = self.n_ceps
    f_min = self.f_min
    f_max = self.f_max
    dw = self.delta_win
    pre = self.pre_emphasis_coef
    rate = self.rate
    self.extractor = bob.ap.Ceps(rate, wl, ws, nf, nc, f_min, f_max, dw, pre)
    self.extractor.dct_norm = self.dct_norm
    self.extractor.mel_scale = self.mel_scale
    self.extractor.with_energy = self.withEnergy
    self.extractor.with_delta = self.withDelta
    self.extractor.with_delta_delta = self.withDeltaDelta

    return True

  def normalize_features(self, params):
    #########################
    ## Initialisation part ##
    #########################
    normalized_vector = [ [ 0 for i in range(params.shape[1]) ] for j in range(params.shape[0]) ]
    # Normalizes each feature dimension (column) independently
    for index in range(params.shape[1]):
      vector = numpy.array([row[index] for row in params])
      n_samples = len(vector)
      norm_vector = normalize_std_array(vector)

      for i in range(n_samples):
        # .item() replaces numpy.asscalar, which recent NumPy versions no longer provide
        normalized_vector[i][index]=norm_vector[i].item()
    data = numpy.array(normalized_vector)
    return data


  def process(self, inputs, outputs):

    float_wav = inputs["speech"].data.value
    labels = inputs["vad"].data.value

    cepstral_features = self.extractor(float_wav)
    features_mask = self.features_mask
    filtered_features = numpy.ndarray(shape=((labels == 1).sum(),len(features_mask)), dtype=numpy.float64)
    i=0
    cur_i=0

    # Keeps only the frames labelled as speech (label 1) by the VAD
    for row in cepstral_features:
      if i < len(labels):
        if labels[i]==1:
          for k in range(len(features_mask)):
            filtered_features[cur_i,k] = row[features_mask[k]]
          cur_i = cur_i + 1
        i = i+1
      else:
        if labels[-1]==1:
          if cur_i == cepstral_features.shape[0]:
            for k in range(len(features_mask)):
              filtered_features[cur_i,k] = row[features_mask[k]]
            cur_i = cur_i + 1
        i = i+1
    if self.normalizeFeatures:
      normalized_features = self.normalize_features(filtered_features)
    else:
      normalized_features = filtered_features
    if normalized_features.shape[0] == 0:
      # The original message referenced an undefined name (input_file)
      print("Warning: no speech found in the current input")
      # But do not keep it empty!!! This avoids errors in next steps
      normalized_features=numpy.array([numpy.zeros(len(features_mask))])

    outputs["features"].write({
      'value': numpy.vstack(normalized_features)
    })

    return True
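
The frame selection and normalization steps in the code above can be sketched more compactly with NumPy boolean indexing. This is a simplified illustration, not the author's implementation: the function name select_and_normalize is hypothetical, feature rows beyond the length of the VAD labels are simply dropped, and constant columns are left at zero instead of being divided by a zero standard deviation.

```python
import numpy


def select_and_normalize(features, labels):
    """Keep frames labelled as speech, then standardize each column.

    Hypothetical helper illustrating the two steps with vectorized NumPy.
    """
    features = numpy.asarray(features, dtype=numpy.float64)
    labels = numpy.asarray(labels)

    # Align both arrays on the shorter length, then keep rows with label 1
    n = min(len(labels), features.shape[0])
    speech = features[:n][labels[:n] == 1]

    if speech.shape[0] == 0:
        # Same fallback idea as the code above: a single all-zero row
        return numpy.zeros((1, features.shape[1]))

    mean = speech.mean(axis=0)
    std = speech.std(axis=0)  # population standard deviation (ddof=0)
    std[std == 0] = 1.0       # assumption: avoid dividing by zero
    return (speech - mean) / std


features = numpy.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
labels = numpy.array([1, 0, 1, 1])
print(select_and_normalize(features, labels).shape)
```

Each column of the result has zero mean and unit variance over the retained frames, which is the normalization computed one column at a time by normalize_features.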

The code for this algorithm in Python

This algorithm implements the MFCC and LFCC feature extraction. It relies on the Bob library.
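
To give an idea of the difference between the two variants, the sketch below places filter centre frequencies either on a Mel scale (MFCC) or on a linear scale (LFCC), using the 0 to 4000 Hz range and the 24 filters set in the code. The Mel mapping shown is the common textbook formula and the function names are hypothetical; Bob's internal filter bank may be built differently.

```python
import numpy


def hz_to_mel(f):
    # Common textbook mapping (assumption: not necessarily Bob's formula)
    return 2595.0 * numpy.log10(1.0 + numpy.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (numpy.asarray(m, dtype=float) / 2595.0) - 1.0)


def filter_centers(f_min, f_max, n_filters, mel_scale=True):
    """Centre frequencies (Hz) of n_filters triangular filters."""
    if mel_scale:
        # Evenly spaced on the Mel axis, then mapped back to Hz
        edges = mel_to_hz(
            numpy.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
        )
    else:
        # Evenly spaced directly in Hz
        edges = numpy.linspace(f_min, f_max, n_filters + 2)
    return edges[1:-1]  # drop the two outer band edges


mel_centers = filter_centers(0, 4000, 24, mel_scale=True)
lin_centers = filter_centers(0, 4000, 24, mel_scale=False)
print(numpy.round(mel_centers[:3], 1), numpy.round(lin_centers[:3], 1))
```

On the Mel scale the filters are packed more densely at low frequencies and spread out towards f_max, whereas the linear variant keeps a constant spacing.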

The following settings are defined in the algorithm's constructor. Apart from 'rate', which is exposed as a platform parameter, they can be modified by editing the code:

  • 'win_length_ms': length of the processing window, in milliseconds
  • 'win_shift_ms': shift between consecutive windows, in milliseconds
  • 'n_filters': number of filters
  • 'n_ceps': number of cepstral coefficients
  • 'f_min': minimum frequency, in Hz
  • 'f_max': maximum frequency, in Hz
  • 'delta_win': window on which first and second derivatives are computed
  • 'pre_emphasis_coef': pre-emphasis coefficient
  • 'mel_scale': flag for Mel scale (MFCC when set, LFCC otherwise)
  • 'dct_norm': flag for DCT normalization
  • 'with_delta' ('withDelta' in the code): flag for computing the first derivatives
  • 'with_delta_delta' ('withDeltaDelta' in the code): flag for computing the second derivatives
  • 'with_energy' ('withEnergy' in the code): flag for computing the energy
  • 'features_mask': mask to use only a subset of features
  • 'normalizeFeatures': flag to apply zero-mean, unit-variance normalization
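
As a quick check on how these settings fit together, the size of each feature vector can be worked out from the flags. The sketch below uses a hypothetical helper and assumes that the second derivatives are only enabled together with the first derivatives; with the values set in the code it gives 60, which matches the default 'features_mask' of numpy.arange(0,60).

```python
def feature_dimension(n_ceps, with_energy, with_delta, with_delta_delta):
    """Number of values per frame (hypothetical helper).

    Assumes delta-delta features are only used together with delta features.
    """
    # Static part: cepstral coefficients, plus one value for the energy
    base = n_ceps + (1 if with_energy else 0)
    # One block for the static part, one per enabled derivative order
    blocks = 1 + (1 if with_delta else 0) + (1 if with_delta_delta else 0)
    return base * blocks


# Values set in the algorithm's constructor: (19 + 1) * 3
print(feature_dimension(19, True, True, True))  # 60
```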
No experiments are using this algorithm.
Version history: elie_khoury/cepstral/1, elie_khoury/cepstral/2 (timeline ticks: Aug 27, 2014, Sep 6)
This algorithm was never executed.
Terms of Service | Contact Information | BEAT platform version 2.2.1b0 | © Idiap Research Institute - 2013-2025