Bob 2.0 LBP histograms of spectrogram bands plus bands-ratio.

This algorithm is a legacy one. The API has changed since its implementation. New versions and forks will need to be updated.
This algorithm is splittable.

Algorithms have at least one input and one output. All algorithm endpoints are organized in groups. Groups are used by the platform to indicate which inputs and outputs are synchronized together. The first group is automatically synchronized with the channel defined by the block in which the algorithm is deployed.

Group: main

Endpoint Name | Data Format | Nature
labels | system/array_1d_integers/1 | Input
speech | system/array_1d_floats/1 | Input
features | system/array_1d_floats/1 | Output
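To exercise the endpoints above outside the platform, something like the following stand-ins could be used. This is only a sketch: `FakeData`, `FakeInput` and `FakeOutput` are hypothetical classes, not part of BEAT, and they mirror only the attribute access that the algorithm's `process()` method performs (`inputs[name].data.value` and `outputs[name].write(...)`).

```python
import numpy as np


class FakeData:
    """Hypothetical stand-in: holds the array exposed as `.value`."""
    def __init__(self, value):
        self.value = value


class FakeInput:
    """Hypothetical stand-in for a platform input endpoint."""
    def __init__(self, value):
        self.data = FakeData(value)


class FakeOutput:
    """Hypothetical stand-in for a platform output endpoint."""
    def __init__(self):
        self.written = []

    def write(self, record):
        # Keep every written record so it can be inspected afterwards.
        self.written.append(record)


inputs = {
    "labels": FakeInput(np.array([0, 1, 1, 0], dtype=np.int32)),
    "speech": FakeInput(np.zeros(800, dtype=np.float64)),
}
outputs = {"features": FakeOutput()}

# Same access pattern as in process(): read both inputs, write one record.
speech = inputs["speech"].data.value.astype("float64")
labels = inputs["labels"].data.value
outputs["features"].write({"value": speech[:3]})
```

With such objects in place, an `Algorithm` instance could be driven by hand, assuming Bob is installed locally.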

Parameters allow users to change the configuration of an algorithm when scheduling an experiment.

Name | Description | Type | Default | Range/Choices
mel_scale | Apply Mel-scale filtering (True) or linear filtering (False) | bool | True |
pre_emphasis_coef | Pre-emphasis coefficient used in the spectrogram computation | float64 | 1.0 |
f_max | Maximum frequency of the spectrogram, in Hz | float64 | 4000.0 |
lbp_to_average | LBP parameter. Compare the neighboring pixels to the average (True) or to the center pixel (False) | bool | False |
win_shift_ms | Shift between neighboring windows, in ms. Typically half the window length. | float64 | 10.0 |
win_length_ms | Length of the sliding processing window, in ms. Typically about 20 ms. | float64 | 20.0 |
n_lbp_histograms | Number of bands the resulting spectrogram is split into; one LBP histogram is computed for each band | uint32 | 2 |
rate | Sampling rate of the speech signal, in Hz | float64 | 16000.0 | [2000.0, 256000.0]
n_filters | Number of filter bands used in the spectrogram computation | uint32 | 40 |
lbp_circular | LBP parameter. Extract neighbors on a circle (True) or on a square (False) | bool | True |
lbp_radius | LBP parameter. The radius of the LBP in both vertical and horizontal direction together | uint32 | 1 | [1, 10]
lbp_neighbors | LBP parameter. Number of neighbors | uint32 | 8 | 4, 8, 16
lbp_uniform | LBP parameter. Only uniform LBP codes (with at most two bit-changes between 0 and 1) are considered; all other strings are combined into one LBP code | bool | False |
lbp_elbp_type | LBP parameter. How to generate the bit strings from the pixels: regular - choose one bit for each comparison of the neighboring pixel with the central pixel; transitional - compare only the neighboring pixels and skip the central one; direction-coded - compute a 2-bit code for four directions | string | regular | regular, transitional, direction-coded
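The timing parameters are given in milliseconds, while the signal is indexed in samples. The sketch below shows how `rate`, `win_length_ms` and `win_shift_ms` translate into sample counts. The helper name `frame_geometry` is hypothetical, and the frame-count formula is a common framing convention assumed here for illustration; Bob's exact framing may differ.

```python
def frame_geometry(rate, win_length_ms, win_shift_ms, n_samples):
    """Hypothetical helper: convert window timings (ms) into sample counts."""
    win_length = int(rate / 1000.0 * win_length_ms)  # samples per window
    win_shift = int(rate / 1000.0 * win_shift_ms)    # samples between window starts
    if n_samples < win_length:
        n_frames = 0  # signal is shorter than a single window
    else:
        # Assumed convention: only windows that fit entirely are counted.
        n_frames = 1 + (n_samples - win_length) // win_shift
    return win_length, win_shift, n_frames


# One second of audio at 16 kHz with 20 ms windows shifted by 10 ms.
win_length, win_shift, n_frames = frame_geometry(16000.0, 20.0, 10.0, 16000)
print(win_length, win_shift, n_frames)
```

With a shift of half the window length, as suggested in the table, consecutive windows overlap by 50%.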
###############################################################################
#                                                                             #
# Copyright (c) 2016 Idiap Research Institute, http://www.idiap.ch/           #
# Contact: beat.support@idiap.ch                                              #
#                                                                             #
# This file is part of the beat.core module of the BEAT platform.             #
#                                                                             #
# Commercial License Usage                                                    #
# Licensees holding valid commercial BEAT licenses may use this file in       #
# accordance with the terms contained in a written agreement between you      #
# and Idiap. For further information contact tto@idiap.ch                     #
#                                                                             #
# Alternatively, this file may be used under the terms of the GNU Affero      #
# Public License version 3 as published by the Free Software and appearing    #
# in the file LICENSE.AGPL included in the packaging of this file.            #
# The BEAT platform is distributed in the hope that it will be useful, but    #
# WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY  #
# or FITNESS FOR A PARTICULAR PURPOSE.                                        #
#                                                                             #
# You should have received a copy of the GNU Affero Public License along      #
# with the BEAT platform. If not, see http://www.gnu.org/licenses/.           #
#                                                                             #
###############################################################################

import numpy
import bob.ap
import bob.sp
import bob.ip.base
import math

class Algorithm:

    def __init__(self):
        self.win_length_ms = 20
        self.win_shift_ms = 10
        self.rate = 16000
        self.pre_emphasis_coef = 1.0
        self.mel_scale = True
        self.n_filters = 40
        self.f_max = 8000
        self.n_lbp_histograms = 5

        self.lbp_neighbors = 16
        self.lbp_to_average = False
        self.lbp_elbp_type = 'regular'
        self.lbp_uniform = False
        self.lbp_circular = True
        self.lbp_radius = 1
        
    def setup(self, parameters):
        self.rate = float(parameters.get('rate', self.rate))
        self.win_length_ms = float(parameters.get('win_length_ms', self.win_length_ms))
        self.win_shift_ms = float(parameters.get('win_shift_ms', self.win_shift_ms))

        self.pre_emphasis_coef = float(parameters.get('pre_emphasis_coef', self.pre_emphasis_coef))
        self.mel_scale = parameters.get('mel_scale', self.mel_scale)
        self.n_filters = parameters.get('n_filters', self.n_filters)
        self.f_max = parameters.get('f_max', self.f_max)
        self.n_lbp_histograms = parameters.get('n_lbp_histograms', self.n_lbp_histograms)

        self.lbp_neighbors = parameters.get('lbp_neighbors', self.lbp_neighbors)
        self.lbp_to_average = parameters.get('lbp_to_average', self.lbp_to_average)
        self.lbp_elbp_type = parameters.get('lbp_elbp_type', self.lbp_elbp_type)
        self.lbp_uniform = parameters.get('lbp_uniform', self.lbp_uniform)
        self.lbp_circular = parameters.get('lbp_circular', self.lbp_circular)
        self.lbp_radius = parameters.get('lbp_radius', self.lbp_radius)

        return True
    
    def compute_spectrogram(self, data):
        c = bob.ap.Spectrogram(float(self.rate), float(self.win_length_ms), float(self.win_shift_ms),
                               int(self.n_filters), 0.0, float(self.f_max), float(self.pre_emphasis_coef),
                               bool(self.mel_scale))
        # energy power spectrum
        c.energy_filter = True  # ^2 of FFT spectrum
        # take the log of the filter-bank energies
        c.log_filter = True
        c.energy_bands = True  # band filtering

        return c(data)
        
    def compute_lbp_histograms_and_ratios(self, data):
        histograms = []
        ratios = []
        prev_textogram = None
        # width (in filter bands) of each spectrogram slice;
        # cast to int so it can be used as a slice index
        textogram_width = int(math.floor(float(self.n_filters) / float(self.n_lbp_histograms)))

        for i in range(0, int(self.n_lbp_histograms)):
            textogram = data[:, i*textogram_width:(i+1)*textogram_width]
            if prev_textogram is None:
                # note: this reference is never updated, so every ratio
                # compares the first band with the current band
                prev_textogram = textogram
            else:
                ratios.append(numpy.mean(prev_textogram)/numpy.mean(textogram))

            # scale the band so that its maximum is 255; this happens in place
            # on a view of `data`, so the first band is already rescaled when
            # the ratios above are computed
            if textogram.max():
                textogram *= 255.0/textogram.max()
            textogram = numpy.asarray(textogram, dtype=numpy.uint8)

            lbp = bob.ip.base.LBP(neighbors=int(self.lbp_neighbors), circular=bool(self.lbp_circular),
                                  radius=int(self.lbp_radius), to_average=bool(self.lbp_to_average),
                                  uniform=bool(self.lbp_uniform), elbp_type=self.lbp_elbp_type)

            lbpimage = numpy.ndarray(lbp.lbp_shape(textogram), 'uint16')  # allocate the image of LBP codes
            lbp(textogram, lbpimage)  # compute the LBP image
            current_hist = bob.ip.base.histogram(lbpimage, (0, lbp.max_label-1), lbp.max_label)

            total = current_hist.sum()
            if total != 0:
                # histogram normalization; cast to float first so the
                # division is never an integer division
                current_hist = current_hist.astype(numpy.float64) / total

            # reduce the dimension of the features if LBP uses 16 neighbors
            if self.lbp_neighbors == 16:
                current_hist_fft = bob.sp.fft(numpy.asarray(current_hist, dtype=numpy.complex128))
                current_hist = current_hist_fft.real[0:16]  # keep only the first 16 frequencies of the real part

            histograms.append(current_hist)  # collect the per-band histograms

        return ratios, histograms
    
    def process(self, inputs, outputs):
        data = inputs["speech"].data.value.astype('float64')
        vad_labels = inputs["labels"].data.value

        # first, trim the silences from both ends,
        # provided VAD worked on this sample
        if vad_labels.size == 2 and not vad_labels.all():
            # we probably could not read the sample, so no labels were computed
            print('VAD labels for the sample are invalid!')
        else:
            # trim away silent head and tail
            # in VAD, speech frames are 1 and silence frames are 0
            speech, = numpy.nonzero(vad_labels)
            if len(speech) and len(speech) < len(vad_labels):  # trim only if necessary
                # sample index at which the first speech frame starts
                nzstart = speech[0]*int(self.rate/1000*self.win_shift_ms)
                # start of the last speech frame plus the length of one full window
                nzend = (speech[-1])*int(self.rate/1000*self.win_shift_ms) + int(self.rate/1000*self.win_length_ms)

                data = data[nzstart:nzend]

        # compute the spectrogram
        spectrogram = self.compute_spectrogram(data)

        # compute LBP histograms from the spectrogram
        ratios, histograms = self.compute_lbp_histograms_and_ratios(spectrogram)

        # feature vector: band ratios first, then the concatenated histograms
        features = numpy.append(ratios, histograms)
        features = numpy.asarray(features, dtype=numpy.float64)

        outputs["features"].write({
            'value': features
        })
        return True

The code for this algorithm in Python
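For readers without a Bob installation, the band splitting, band ratios and per-band LBP histograms can be approximated in plain NumPy. The sketch below is a simplified stand-in, not Bob's implementation: `split_bands`, `band_ratios` and `lbp8_histogram` are hypothetical helpers, the LBP is the basic 8-neighbor, radius-1 variant on a square, and the ratios are computed on the unscaled band values.

```python
import numpy as np


def split_bands(spectrogram, n_bands):
    """Split a (frames x filters) array into n_bands equal-width column bands."""
    width = spectrogram.shape[1] // n_bands
    return [spectrogram[:, i * width:(i + 1) * width] for i in range(n_bands)]


def band_ratios(bands):
    """Ratio of the first band's mean to the mean of each later band."""
    first_mean = np.mean(bands[0])
    return [first_mean / np.mean(band) for band in bands[1:]]


def lbp8_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (simplified)."""
    center = image[1:-1, 1:-1]
    rows, cols = center.shape
    # Offsets of the 8 neighbors, clockwise from the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros(center.shape, dtype=np.uint16)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = image[dy:dy + rows, dx:dx + cols]
        # Set this bit wherever the neighbor is at least as large as the center.
        codes += (neighbor >= center).astype(np.uint16) * np.uint16(1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    return hist / total if total else hist


# Toy "spectrogram": 6 frames, 40 filters, upper half twice as strong.
spectrogram = np.ones((6, 40))
spectrogram[:, 20:] = 2.0

bands = split_bands(spectrogram, 2)
ratios = band_ratios(bands)
histograms = [lbp8_histogram(band) for band in bands]
features = np.concatenate([ratios] + histograms)
print(features.shape)
```

The feature layout mirrors the listing: ratios first, then one histogram per band. The FFT-based reduction applied when `lbp_neighbors` is 16 is left out of this sketch.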

Silent start and end of a sample are trimmed using Voice Activity Detection (VAD) labels as input.
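The trimming step can be illustrated in isolation with NumPy. This is a minimal sketch under stated assumptions: `trim_silence` is a hypothetical helper, labels are per-frame with 1 for speech and 0 for silence, and the check for invalid labels done in `process()` is omitted.

```python
import numpy as np


def trim_silence(samples, vad_labels, rate, win_length_ms, win_shift_ms):
    """Hypothetical helper: cut leading and trailing silence using frame labels."""
    speech_frames = np.nonzero(vad_labels)[0]
    if len(speech_frames) == 0 or len(speech_frames) == len(vad_labels):
        return samples  # no speech at all, or no silence: nothing to trim
    shift = int(rate / 1000.0 * win_shift_ms)    # samples between frame starts
    length = int(rate / 1000.0 * win_length_ms)  # samples per frame
    start = speech_frames[0] * shift             # start of the first speech frame
    end = speech_frames[-1] * shift + length     # end of the last speech frame
    return samples[start:end]


samples = np.arange(1600, dtype=np.float64)  # 100 ms at 16 kHz
labels = np.array([0, 0, 1, 1, 1, 0, 0, 0, 0])
trimmed = trim_silence(samples, labels, 16000.0, 20.0, 10.0)
print(len(trimmed))
```

Only the outer silences are removed; silent frames between the first and last speech frame stay in the signal.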

Experiments

Updated | Name | Databases/Protocols | Analyzers
 | pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_lbp_hist_ratios_lr-fusion_lr-pa_aligned | avspoof/2@physicalaccess_verification, avspoof/2@physicalaccess_verify_train, avspoof/2@physicalaccess_verify_train_spoof, avspoof/2@physicalaccess_antispoofing, avspoof/2@physicalaccess_verification_spoof | pkorshunov/spoof-score-fusion-roc_hist/1
 | pkorshunov/pkorshunov/speech-pad-simple/1/speech-pad_lbp_hist_ratios_lr-pa_aligned | avspoof/2@physicalaccess_antispoofing | pkorshunov/simple_antispoofing_analyzer/4
[Graph: pkorshunov/lbp-hist-ratios-of-spectrogram/2, 2016 Apr 9]

This table shows the number of times this algorithm has been successfully run using the given environment. Note this does not provide sufficient information to evaluate if the algorithm will run when submitted to different conditions.

Terms of Service | Contact Information | BEAT platform version 2.2.1b0 | © Idiap Research Institute - 2013-2025