Bob 2.0-based training of the binary Logistic Regression model
Algorithms have at least one input and one output. All algorithm endpoints are organized in groups. Groups are used by the platform to indicate which inputs and outputs are synchronized together. The first group is automatically synchronized with the channel defined by the block in which the algorithm is deployed.
| Endpoint Name | Data Format | Nature |
|---|---|---|
| features | system/array_1d_floats/1 | Input |
| class | system/text/1 | Input |
| classifier | tutorial/linear_machine/1 | Output |
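On the platform, this grouping lives in the algorithm's JSON declaration. As a rough illustration only (the field layout is assumed from the general BEAT declaration format, not copied from this algorithm's actual metadata), the single group above could be expressed like this, written here as a Python dict:

```python
# Hypothetical sketch of the endpoint declaration; field names are
# assumptions, not this algorithm's actual metadata.
declaration = {
    "language": "python",
    "groups": [
        {
            # one group: these inputs and the output are synchronized
            # with the channel of the block the algorithm runs in
            "inputs": {
                "features": {"type": "system/array_1d_floats/1"},
                "class": {"type": "system/text/1"},
            },
            "outputs": {
                "classifier": {"type": "tutorial/linear_machine/1"},
            },
        },
    ],
}
```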
The code for this algorithm, in Python:

```python
import bob.learn.linear
import numpy
import math


def calc_mean(c0, c1=[]):
    """Calculates the mean of the data."""
    if len(c1) > 0:
        return (numpy.mean(c0, 0) + numpy.mean(c1, 0)) / 2.
    else:
        return numpy.mean(c0, 0)


def calc_std(c0, c1=[]):
    """Calculates the standard deviation of the data."""
    if len(c1) == 0:
        return numpy.std(c0, 0)
    # balance the two classes by replicating the smaller one before
    # computing the pooled standard deviation
    prop = float(len(c0)) / float(len(c1))
    if prop < 1:
        p0 = int(math.ceil(1 / prop))
        p1 = 1
    else:
        p0 = 1
        p1 = int(math.ceil(prop))
    return numpy.std(numpy.vstack(p0 * [c0] + p1 * [c1]), 0)


def calc_mean_std(c0, c1=[], nonStdZero=False):
    """Calculates both the mean and the standard deviation of the data.

    @param c0 data of the first class
    @param c1 data of the second class (optional)
    @param nonStdZero if the std is zero, convert it to one; this
        avoids a division by zero
    """
    mi = calc_mean(c0, c1)
    std = calc_std(c0, c1)
    if nonStdZero:
        std[std == 0] = 1
    return mi, std


def zeromean_unitvar_norm(data, mean, std):
    """Normalizes the data to zero mean and unit variance. Mean and
    standard deviation are given as numpy.ndarray."""
    return numpy.divide(data - mean, std)


class Algorithm:

    def __init__(self):
        self.positives = []
        self.negatives = []

    def process(self, inputs, outputs):
        # accumulates the input data in different containers,
        # depending on whether the sample is a hit ('real') or a miss
        feature_vector = inputs["features"].data.value
        if inputs["class"].data.text == 'real':
            self.positives.append(feature_vector)
        else:
            self.negatives.append(feature_vector)

        if not inputs.hasMoreData():
            # normalizes the accumulated data to zero mean and
            # unit variance
            self.positives = numpy.vstack(self.positives)
            self.negatives = numpy.vstack(self.negatives)
            mean, std = calc_mean_std(self.positives, self.negatives,
                                      nonStdZero=True)
            self.positives = zeromean_unitvar_norm(self.positives, mean, std)
            self.negatives = zeromean_unitvar_norm(self.negatives, mean, std)

            # trains the logistic regression model and embeds the
            # normalization parameters into the resulting machine
            trainer = bob.learn.linear.CGLogRegTrainer()
            machine = trainer.train(self.negatives, self.positives)
            machine.input_subtract = mean
            machine.input_divide = std

            # outputs the trained classifier
            outputs["classifier"].write({
                'input_subtract': machine.input_subtract,
                'input_divide': machine.input_divide,
                'weights': machine.weights,
                'biases': machine.biases,
            })

        return True
```
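Outside the platform, the class can be exercised with small stand-in objects that mimic the `inputs`/`outputs` interface used above. The mock classes below are hypothetical test scaffolding, not the platform API; the final training step requires `bob.learn.linear` to be installed:

```python
import numpy


class MockData:
    """Stand-in for the platform's data object (hypothetical)."""
    def __init__(self, value=None, text=None):
        self.value = value
        self.text = text


class MockInput:
    def __init__(self, data):
        self.data = data


class MockInputs(dict):
    """Dict of inputs plus the hasMoreData() flag used by process()."""
    def __init__(self, mapping, more):
        dict.__init__(self, mapping)
        self._more = more

    def hasMoreData(self):
        return self._more


class MockOutput:
    def write(self, data):
        print("wrote classifier with keys:", sorted(data.keys()))


# two synthetic 5-dimensional Gaussian classes
rng = numpy.random.RandomState(0)
samples = [(rng.normal(+1., 1., 5), 'real') for _ in range(20)]
samples += [(rng.normal(-1., 1., 5), 'attack') for _ in range(20)]

algorithm = Algorithm()
outputs = {"classifier": MockOutput()}
for i, (vector, label) in enumerate(samples):
    inputs = MockInputs({
        "features": MockInput(MockData(value=vector)),
        "class": MockInput(MockData(text=label)),
    }, more=(i + 1 < len(samples)))
    algorithm.process(inputs, outputs)
```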
This algorithm trains a Logistic Regression model [LR] for a binary classification problem. It takes feature vectors as input, together with a text label indicating whether each sample is a hit (the label 'real') or a miss.
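At test time, the stored fields are enough to score a new feature vector: the machine first normalizes the input with `input_subtract` and `input_divide`, applies the linear response, and the logistic (sigmoid) function turns that response into a probability of the 'real' class. A minimal numpy sketch of this use of the written `classifier` data (the `score` helper is illustrative, not part of the platform):

```python
import numpy


def score(classifier, x):
    """Probability that feature vector x is a hit ('real'), given the
    fields written by the training algorithm above (illustrative helper)."""
    # the same normalization the linear machine applies internally
    x = (numpy.asarray(x) - classifier['input_subtract']) / classifier['input_divide']
    # linear response w^T x + b (weights has one column for binary LR)
    response = numpy.dot(x, classifier['weights']) + classifier['biases']
    # logistic regression: the sigmoid maps the response to a probability
    return 1.0 / (1.0 + numpy.exp(-response))
```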
[LR]: https://en.wikipedia.org/wiki/Logistic_regression
| Updated | Name | Databases/Protocols | Analyzers |
|---|---|---|---|
| | pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_lbp_hist_ratios_lr-fusion_lr-pa_aligned | avspoof/2@physicalaccess_verification, avspoof/2@physicalaccess_verify_train, avspoof/2@physicalaccess_verify_train_spoof, avspoof/2@physicalaccess_antispoofing, avspoof/2@physicalaccess_verification_spoof | pkorshunov/spoof-score-fusion-roc_hist/1 |
| | pkorshunov/pkorshunov/speech-pad-simple/1/speech-pad_lbp_hist_ratios_lr-pa_aligned | avspoof/2@physicalaccess_antispoofing | pkorshunov/simple_antispoofing_analyzer/4 |
| | pkorshunov/pkorshunov/speech-antispoofing-baseline/1/btas2016-baseline-pa | avspoof/1@physicalaccess_antispoofing | pkorshunov/simple_antispoofing_analyzer/2 |
This table shows the number of times this algorithm has been successfully run using the given environment. Note that this does not provide sufficient information to evaluate whether the algorithm will run when submitted under different conditions.