Tools implemented in bob.pad.base

Please note that some parts of the code in this package depend on, and are reused from, the bob.bio.base package.

Summary

Base Classes

Most of the base classes are reused from bob.bio.base. Only one base class that is specific to presentation attack detection, Algorithm, is implemented in this package.

bob.pad.base.algorithm.Algorithm([…]) This is the base class for all anti-spoofing algorithms.
bob.pad.base.algorithm.Predictions(**kwargs) An algorithm that takes the precomputed predictions and uses them for scoring.

Implementations

bob.pad.base.database.PadDatabase(name[, …]) This class represents the basic API for database access.
bob.pad.base.database.PadFile(client_id, path) A simple base class that defines basic properties of File object for the use in PAD experiments

Preprocessors and Extractors

Preprocessors and Extractors from the bob.bio.base package can also be used in this package. Please see Tools implemented in bob.bio.base for more details.

Algorithms

class bob.pad.base.algorithm.Algorithm(performs_projection=False, requires_projector_training=True, **kwargs)

Bases: object

This is the base class for all anti-spoofing algorithms. It defines the minimum requirements for all derived algorithm classes.

Call the constructor in derived class implementations. If your derived algorithm performs feature projection, please register this here. If it needs training for the projector, please set this here, too.

Parameters:

performs_projection : bool
Set to True if your derived algorithm performs a projection. Also implement the project() function, and the load_projector() if necessary.
requires_projector_training : bool
Only valid when performs_projection = True. Set this flag to False when the projection is applied but the projector does not need to be trained.
kwargs : key=value pairs
A list of keyword arguments to be written in the __str__ function.
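
A derived class might register its flags as sketched below. To keep the sketch runnable without the package installed, AlgorithmStandIn is a hypothetical stand-in that only mimics the two constructor flags described above; real code would derive from bob.pad.base.algorithm.Algorithm instead. MeanScore and its behaviour are likewise illustrative assumptions, not part of the package.

```python
class AlgorithmStandIn(object):
    """Hypothetical stand-in for bob.pad.base.algorithm.Algorithm.

    It only stores the two documented flags; the real base class does more
    (for example, it uses the kwargs in __str__).
    """

    def __init__(self, performs_projection=False,
                 requires_projector_training=True, **kwargs):
        self.performs_projection = performs_projection
        # Per the parameter description, this flag is only meaningful
        # when performs_projection is True.
        self.requires_projector_training = (
            performs_projection and requires_projector_training)
        self._kwargs = kwargs


class MeanScore(AlgorithmStandIn):
    """Illustrative algorithm: projects by centering, scores by the mean."""

    def __init__(self, offset=0.0):
        # Register that we project, but that the projector needs no training.
        super(MeanScore, self).__init__(
            performs_projection=True,
            requires_projector_training=False,
            offset=offset)
        self.offset = offset

    def project(self, feature):
        # Subtract a fixed offset from every value of the feature.
        return [value - self.offset for value in feature]

    def score(self, toscore):
        # A single float score for the projected feature.
        return sum(toscore) / float(len(toscore))


algorithm = MeanScore(offset=1.0)
projected = algorithm.project([1.0, 2.0, 3.0])
print(algorithm.score(projected))
```

Passing the flags through the base class constructor, rather than setting attributes afterwards, keeps the registration in one place, which is what the constructor description above asks for.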
load_projector(projector_file)[source]

Loads the parameters required for feature projection from file. This function usually is useful in combination with the train_projector() function. In this base class implementation, it does nothing.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from.
project(feature) → projected[source]

This function will project the given feature. It must be overwritten by derived classes, as soon as performs_projection = True was set in the constructor. It is assured that the load_projector() was called once before the project function is executed.

Parameters:

feature : object
The feature to be projected.

Returns:

projected : object
The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.
read_feature(feature_file) → feature[source]

Reads the projected feature from file. In this base class implementation, it uses bob.io.base.load() to do that. If you have different format, please overwrite this function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

feature_file : str or bob.io.base.HDF5File
The file open for reading, or the file name to read from.

Returns:

feature : object
The feature that was read from file.
read_toscore_object(toscore_object_file) → toscore_object[source]

Reads the toscore_object feature from a file. By default, the toscore_object feature is identical to the projected feature. Hence, this base class implementation simply calls read_feature().

If your algorithm requires different behavior, please overwrite this function.

Parameters:

toscore_object_file : str or bob.io.base.HDF5File
The file open for reading, or the file name to read from.

Returns:

toscore_object : object
The toscore_object that was read from file.
score(toscore) → score[source]

This function will compute the score for the given object toscore. It must be overwritten by derived classes.

Parameters:

toscore : object
The object to compute the score for.

Returns:

score : float
A score value for the object toscore.
score_for_multiple_projections(toscore)[source]

score_for_multiple_projections(toscore) -> score

This function will compute the score for a list of objects in toscore. It must be overwritten by derived classes.

Parameters:

toscore : [object]
A list of objects to compute the score for.

Returns:

score : float
A score value for the object toscore.
train_projector(training_features, projector_file)[source]

This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_projector_training = True.

Parameters:

training_features : [object] or [[object]]
A list of extracted features that can be used for training the projector. Features will be provided in a single list.
projector_file : str
The file to write. This file should be readable with the load_projector() function.
write_feature(feature, feature_file)[source]

Saves the given projected feature to a file with the given name. In this base class implementation:

  • If the given feature has a save attribute, it calls feature.save(bob.io.base.HDF5File(feature_file, 'w')). In this case, the given feature_file might be either a file name or a bob.io.base.HDF5File.
  • Otherwise, it uses bob.io.base.save() to do that.

If you have a different format, please overwrite this function.
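
The dispatch described above can be sketched without the package installed. In this minimal sketch, write_feature_sketch, SelfSavingFeature and record are hypothetical names; record stands in for bob.io.base.save(), and the file name is passed to save() directly instead of being wrapped in a bob.io.base.HDF5File.

```python
def write_feature_sketch(feature, feature_file, fallback_save):
    """Mimic the two-way dispatch described above.

    fallback_save is a hypothetical callable standing in for
    bob.io.base.save(); it receives (feature, feature_file).
    """
    if hasattr(feature, "save"):
        # Features that know how to serialize themselves do so.
        feature.save(feature_file)
        return "feature.save"
    # Everything else goes through the generic save routine.
    fallback_save(feature, feature_file)
    return "fallback"


class SelfSavingFeature(object):
    """Toy feature that provides its own save() method."""

    def __init__(self):
        self.saved_to = None

    def save(self, target):
        self.saved_to = target


written = {}


def record(feature, feature_file):
    # Remember what would have been written, keyed by file name.
    written[feature_file] = feature


custom = SelfSavingFeature()
route_plain = write_feature_sketch([0.1, 0.2], "plain.hdf5", record)
route_custom = write_feature_sketch(custom, "custom.hdf5", record)
print(route_plain, route_custom)
```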

Please register performs_projection = True in the constructor to enable this function.

Parameters:

feature : object
A feature as returned by the project() function, which should be written.
feature_file : str or bob.io.base.HDF5File
The file open for writing, or the file name to write to.
class bob.pad.base.algorithm.LogRegr(C=1, frame_level_scores_flag=False, subsample_train_data_flag=False, subsampling_step=10, subsample_videos_flag=False, video_subsampling_step=3)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train a Logistic Regression classifier given Frame Containers with features of the real and attack classes. The procedure is the following:

  1. First, the input data is mean-std normalized using mean and std of the real class only.
  2. Second, the Logistic Regression classifier is trained on normalized input features.
  3. The input features are next classified using pre-trained LR machine.
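
Step 1 of the procedure above can be illustrated with NumPy alone. This is a minimal sketch, not the class's own code: mean_std_normalize is a hypothetical helper, the zero-variance guard is an added assumption, and the classifier training of steps 2 and 3 is not shown.

```python
import numpy as np


def mean_std_normalize(real, attack):
    """Normalize both classes with the mean and std of the real class only."""
    features_mean = real.mean(axis=0)
    features_std = real.std(axis=0)
    # Assumption: guard against zero variance to avoid division by zero.
    features_std[features_std == 0.0] = 1.0
    real_norm = (real - features_mean) / features_std
    attack_norm = (attack - features_mean) / features_std
    return real_norm, attack_norm, features_mean, features_std


# Rows are samples, columns are features.
real = np.array([[1.0, 10.0], [3.0, 14.0]])
attack = np.array([[5.0, 12.0]])

real_norm, attack_norm, features_mean, features_std = mean_std_normalize(
    real, attack)
print(features_mean, features_std)
print(attack_norm)
```

Because the statistics come from the real class only, the normalized real samples are centered at zero while attack samples generally are not.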

Parameters:

C : float
Inverse of regularization strength in LR classifier; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. Default: 1.0 .
frame_level_scores_flag : bool
Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
subsample_train_data_flag : bool
Uniformly subsample the training data if True. Default: False.
subsampling_step : int
Training data subsampling step, only valid if subsample_train_data_flag = True. Default: 10 .
subsample_videos_flag : bool
Uniformly subsample the training videos if True. Default: False.
video_subsampling_step : int
Training videos subsampling step, only valid if subsample_videos_flag = True. Default: 3 .
load_lr_machine_and_mean_std(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file : str
Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework.

Returns:

machine : object
The loaded LR machine. As returned by sklearn.linear_model module.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
load_projector(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

This function sets the arguments self.lr_machine, self.features_mean and self.features_std of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from, as returned by the bob.pad.base framework. See the load_lr_machine_and_mean_std method of this class for more details.
project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. First, the input data is mean-std normalized using mean and std of the real class only.
  2. The input features are next classified using pre-trained LR machine.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature : FrameContainer or 2D numpy.ndarray
Two types of inputs are accepted: a Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer, or a 2D feature array of the size (N_samples x N_features).

Returns:

scores : 1D numpy.ndarray
Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are probabilities.
save_lr_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]

Saves the LR machine, features mean and std to the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file : str
Absolute name of the file to save the data to, as returned by bob.pad.base framework.
machine : object
The LR machine to be saved. As returned by sklearn.linear_model module.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore : 1D numpy.ndarray
Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.

Returns:

score : [float]
If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.
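
The effect of frame_level_scores_flag on the return value can be sketched as follows. score_sketch is a hypothetical helper, and averaging the frame scores into the per-video score is an assumption; the text above only states that a single score per video is returned inside a list.

```python
from statistics import mean


def score_sketch(toscore, frame_level_scores_flag=False):
    """Return a list of scores, so the result is always iterable."""
    frame_scores = [float(value) for value in toscore]
    if frame_level_scores_flag:
        # One score per frame/sample.
        return frame_scores
    # Assumption: the per-video score is the mean of the frame scores.
    return [mean(frame_scores)]


frame_probabilities = [0.25, 0.75, 0.5]
print(score_sketch(frame_probabilities))
print(score_sketch(frame_probabilities, frame_level_scores_flag=True))
```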
subsample_train_videos(training_features, step)[source]

Uniformly select a subset of frame containers from the input list.

Parameters:

training_features : [FrameContainer]
A list of FrameContainers
step : int
Data selection step.

Returns:

training_features_subset : [FrameContainer]
A list with selected FrameContainers
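
One plausible reading of uniform selection with a step is to keep every step-th element. The sketch below assumes that reading and that selection starts at the first element; subsample_sketch is a hypothetical name, and strings stand in for FrameContainers.

```python
def subsample_sketch(training_features, step):
    """Keep every step-th element, starting from the first one."""
    return training_features[::step]


# Strings stand in for FrameContainers in this illustration.
videos = ["video_%02d" % index for index in range(10)]
subset = subsample_sketch(videos, 3)
print(subset)  # ['video_00', 'video_03', 'video_06', 'video_09']
```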
train_lr(real, attack, C)[source]

Train LR classifier given real and attack classes. Prior to training the data is mean-std normalized.

Parameters:

real : 2D numpy.ndarray
Training features for the real class.
attack : 2D numpy.ndarray
Training features for the attack class.
C : float
Inverse of regularization strength in LR classifier; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. Default: 1.0 .

Returns:

machine : object
A trained LR machine.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
train_projector(training_features, projector_file)[source]

Train LR for feature projection and save the result to file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features : [[FrameContainer], [FrameContainer]]
A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file : str
The file to save the trained projector to, as returned by the bob.pad.base framework.
class bob.pad.base.algorithm.OneClassGMM(n_components=1, random_state=3, frame_level_scores_flag=False)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train a OneClassGMM based PAD system. The OneClassGMM is trained using data of one class (real class) only. The procedure is the following:

  1. First, the training data is mean-std normalized using mean and std of the real class only.
  2. Second, the OneClassGMM with n_components Gaussians is trained using samples of the real class.
  3. The input features are next classified using pre-trained OneClassGMM machine.

Parameters:

n_components : int
Number of Gaussians in the OneClassGMM. Default: 1 .
random_state : int
A seed for the random number generator used in the initialization of the OneClassGMM. Default: 3 .
frame_level_scores_flag : bool
Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
load_gmm_machine_and_mean_std(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file : str
Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework.

Returns:

machine : object
The loaded OneClassGMM machine. As returned by sklearn.mixture module.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
load_projector(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

This function sets the arguments self.machine, self.features_mean and self.features_std of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from, as returned by the bob.pad.base framework. See the load_gmm_machine_and_mean_std method of this class for more details.
project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. First, the input data is mean-std normalized using mean and std of the real class only.
  2. The input features are next classified using pre-trained OneClassGMM machine.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature : FrameContainer or 2D numpy.ndarray
Two types of inputs are accepted: a Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer, or a 2D feature array of the size (N_samples x N_features).

Returns:

scores : 1D numpy.ndarray
Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are the weighted log probabilities.
save_gmm_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]

Saves the OneClassGMM machine, features mean and std to the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file : str
Absolute name of the file to save the data to, as returned by bob.pad.base framework.
machine : object
The OneClassGMM machine to be saved. As returned by sklearn.mixture module.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore : 1D numpy.ndarray
Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.

Returns:

score : [float]
If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.
train_gmm(real, n_components, random_state)[source]

Train OneClassGMM classifier given real class. Prior to the training the data is mean-std normalized.

Parameters:

real : 2D numpy.ndarray
Training features for the real class.
n_components : int
Number of Gaussians in the OneClassGMM. Default: 1 .
random_state : int
A seed for the random number generator used in the initialization of the OneClassGMM. Default: 7 .

Returns:

machine : object
A trained OneClassGMM machine.
features_mean : 1D numpy.ndarray
Mean of the features.
features_std : 1D numpy.ndarray
Standard deviation of the features.
train_projector(training_features, projector_file)[source]

Train OneClassGMM for feature projection and save it to file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features : [[FrameContainer], [FrameContainer]]
A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file : str
The file to save the trained projector to, as returned by the bob.pad.base framework.
class bob.pad.base.algorithm.Predictions(**kwargs)

Bases: bob.pad.base.algorithm.Algorithm

An algorithm that takes the precomputed predictions and uses them for scoring.

score(predictions)[source]
class bob.pad.base.algorithm.SVM(machine_type='C_SVC', kernel_type='RBF', n_samples=10000, trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, frame_level_scores_flag=False, save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train SVM given features (either numpy arrays or Frame Containers) from real and attack classes. The trained SVM is then used to classify the testing data as either real or attack. The SVM is trained in two stages. First, the best parameters for SVM are estimated using train and cross-validation subsets. The size of the subsets used in hyper-parameter tuning is defined by n_samples parameter of this class. Once best parameters are determined, the SVM machine is trained using complete training set.

Parameters:

machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘C_SVC’.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘RBF’.
n_samples : int
Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the hyper-parameter grid search.
trainer_grid_search_params : dict
Dictionary containing the hyper-parameters of the SVM to be tested in the grid-search. Default: {‘cost’: [2**p for p in range(-5, 16, 2)], ‘gamma’: [2**p for p in range(-15, 4, 2)]}.
mean_std_norm_flag : bool
Perform mean-std normalization of data if set to True. Default: False.
frame_level_scores_flag : bool
Return scores for each frame individually if True. Otherwise, return a single score per video. Should be used only when features are in Frame Containers. Default: False.
save_debug_data_flag : bool
Save the data, which might be useful for debugging, if True. Default: True.
reduced_train_data_flag : bool
Reduce the amount of final training samples if set to True. Default: False.
n_train_samples : int
Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the final training of the SVM. Default: 50000.
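
The default of trainer_grid_search_params is given both as list comprehensions and, in the class signature, as expanded values. The short check below shows that the two forms agree; grid_points is a hypothetical name for the resulting search space, and how the class iterates over it is not shown.

```python
from itertools import product

# The comprehensions quoted in the parameter description.
cost_grid = [2 ** p for p in range(-5, 16, 2)]
gamma_grid = [2 ** p for p in range(-15, 4, 2)]

# Every (cost, gamma) pair that a full grid search would visit.
grid_points = list(product(cost_grid, gamma_grid))

print(cost_grid)
print(gamma_grid)
print(len(grid_points))  # 11 cost values x 10 gamma values = 110
```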
comp_prediction_precision(machine, real, attack)[source]

This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.

Parameters:

machine : object
A pre-trained SVM machine.
real : 2D numpy.ndarray
Array of features representing the real class.
attack : 2D numpy.ndarray
Array of features representing the attack class.

Returns:

precision : float
The precision of the predictions.
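
The ratio described above can be illustrated on already-predicted labels. This sketch assumes the labels +1 for real and -1 for attack, and it takes predictions rather than a machine and feature arrays; prediction_precision_sketch is a hypothetical name.

```python
def prediction_precision_sketch(real_predictions, attack_predictions):
    """Ratio of correctly classified samples to the total number of samples.

    Assumption: +1 denotes the real class and -1 the attack class.
    """
    correct = sum(1 for label in real_predictions if label == 1)
    correct += sum(1 for label in attack_predictions if label == -1)
    total = len(real_predictions) + len(attack_predictions)
    return correct / float(total)


# Three of four real samples and three of four attacks are classified correctly.
precision = prediction_precision_sketch([1, 1, -1, 1], [-1, -1, 1, -1])
print(precision)  # 0.75
```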
load_projector(projector_file)[source]

Load the pretrained projector/SVM from file to perform a feature projection. This function usually is useful in combination with the train_projector() function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from.
project(feature)[source]

This function computes class probabilities for the input feature using pretrained SVM. The feature in this case is a Frame Container with features for each frame. The probabilities will be computed and returned for each frame.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature : object
A Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer.

Returns:

probabilities : 1D or 2D numpy.ndarray
2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class. Must be writable with the write_feature function and readable with the read_feature function.
score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore : 1D or 2D numpy.ndarray
2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

score : float or a 1D numpy.ndarray
If frame_level_scores_flag = False a single score is returned. One score per video. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a 1D array of scores is returned. One score per frame. Score is a probability of a sample being a real class.
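
The handling of the 1D and 2D inputs described above can be sketched with NumPy. svm_score_sketch is a hypothetical helper, and reducing the frame scores by their mean is an assumption that the text does not confirm.

```python
import numpy as np


def svm_score_sketch(toscore, frame_level_scores_flag=False):
    """Reduce per-frame SVM outputs to real-class scores."""
    toscore = np.asarray(toscore, dtype=float)
    if toscore.ndim == 2:
        # Two-class case: the first column holds the real-class probability.
        frame_scores = toscore[:, 0]
    else:
        # One-class case: the vector already holds one score per frame.
        frame_scores = toscore
    if frame_level_scores_flag:
        return frame_scores
    # Assumption: the per-video score is the mean of the frame scores.
    return float(np.mean(frame_scores))


two_class = [[0.75, 0.25], [0.25, 0.75]]
one_class = [1.0, 3.0]
print(svm_score_sketch(two_class))
print(svm_score_sketch(two_class, frame_level_scores_flag=True))
print(svm_score_sketch(one_class))
```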
score_for_multiple_projections(toscore)[source]

Returns a list of scores computed by the score method of this class.

Parameters:

toscore : 1D or 2D numpy.ndarray
2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

list_of_scores : [float]
A list containing the scores.
train_projector(training_features, projector_file)[source]

Train SVM feature projector and save the trained SVM to a given file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features : [[FrameContainer], [FrameContainer]]
A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file : str
The file to save the trained projector to. This file should be readable with the load_projector() function.
train_svm(training_features, n_samples=10000, machine_type='C_SVC', kernel_type='RBF', trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, projector_file='', save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)[source]

First, this function tunes the hyper-parameters of the SVM classifier using grid search on the sub-sets of training data. Train and cross-validation subsets for both classes are formed from the available input training_features.

Once the best parameters are determined, the SVM is trained on the whole training data set. The resulting machine is returned by the function.

Parameters:

training_features : [[FrameContainer], [FrameContainer]]
A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
n_samples : int
Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the hyper-parameter grid search.
machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
trainer_grid_search_params : dict
Dictionary containing the hyper-parameters of the SVM to be tested in the grid-search.
mean_std_norm_flag : bool
Perform mean-std normalization of data if set to True. Default: False.
projector_file : str
The name of the file to save the trained projector to. Only the path of this file is used in this function. The file debug_data.hdf5 will be saved in this path. This file contains information which might be useful for debugging.
save_debug_data_flag : bool
Save the data, which might be useful for debugging, if True. Default: True.
reduced_train_data_flag : bool
Reduce the amount of final training samples if set to True. Default: False.
n_train_samples : int
Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the final training of the SVM. Default: 50000.

Returns:

machine : object
A trained SVM machine.
class bob.pad.base.algorithm.SVMCascadePCA(machine_type='C_SVC', kernel_type='RBF', svm_kwargs={'cost': 1, 'gamma': 0}, N=2, pos_scores_slope=0.01, frame_level_scores_flag=False)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train the cascade of SVMs given Frame Containers with features of the real and attack classes. The procedure is the following:

  1. First, the input data is mean-std normalized.
  2. Second, the PCA is trained on normalized input features. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.
  3. The features are next projected given trained PCA machine.
  4. Prior to SVM training the features are again mean-std normalized.
  5. Next, an SVM machine is trained for each group of N projected features. The projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for projected features 1 and 2, the second SVM is trained for projected features 3 and 4, and so on.
  6. These SVMs then form a cascade of classifiers. The input feature vector is then projected using PCA machine and passed through all classifiers in the cascade. The decision is then made by majority voting.

Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.

Parameters:

machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘C_SVC’.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘RBF’.
svm_kwargs : dict
Dictionary containing the hyper-parameters of the SVM. Default: {‘cost’: 1, ‘gamma’: 0}.
N : int
The number of features to be used for training a single SVM machine in the cascade. Default: 2.
pos_scores_slope : float
The positive scores returned by SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01 .
frame_level_scores_flag : bool
Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
combine_scores_of_svm_cascade(scores_array, pos_scores_slope)[source]

First, multiply positive scores by the constant pos_scores_slope in the input 2D array. The constant is usually small, making the impact of negative scores more significant. Second, a single score per sample is obtained by averaging the pre-modified scores of the cascade.

Parameters:

scores_array : 2D numpy.ndarray
2D score array of the size (N_samples x N_scores).
pos_scores_slope : float
The positive scores returned by SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01 .

Returns:

scores : 1D numpy.ndarray
Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.
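
The two steps of the score combination can be sketched with NumPy. combine_scores_sketch is a hypothetical helper that follows the description above (scale positive scores, then average across the cascade); it is not the package's own code.

```python
import numpy as np


def combine_scores_sketch(scores_array, pos_scores_slope):
    """Scale positive scores by the slope, then average each row."""
    # Work on a float copy so the caller's array is left untouched.
    scores = np.array(scores_array, dtype=float)
    scores[scores > 0] *= pos_scores_slope
    # One combined score per sample (row).
    return scores.mean(axis=1)


# Rows are samples, columns are the scores of the SVMs in the cascade.
scores_array = np.array([[2.0, 4.0], [1.0, -1.0], [-2.0, -4.0]])
combined = combine_scores_sketch(scores_array, 0.01)
print(combined)
```

With a small slope, a single negative vote in the second row outweighs the positive one, which matches the stated intent of making negative scores more significant.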
comp_prediction_precision(machine, real, attack)[source]

This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.

Parameters:

machine : object
A pre-trained SVM machine.
real : 2D numpy.ndarray
Array of features representing the real class.
attack : 2D numpy.ndarray
Array of features representing the attack class.

Returns:

precision : float
The precision of the predictions.
get_cascade_file_names(projector_file, projector_file_name)[source]

Get the list of file-names storing the cascade of machines. The location of the files is specified in the path component of the projector_file argument.

Parameters:

projector_file : str
Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.
projector_file_name : str
The common string in the names of files storing the cascade of pretrained machines. Name without extension.

Returns:

cascade_file_names : [str]
A list of relative file-names storing the cascade of machines.
get_data_start_end_idx(data, N)[source]

Get indexes to select the subsets of data related to the cascades. The first (n_machines - 1) SVMs will be trained using N features each. The last SVM will be trained using the remaining features, whose number is less than or equal to N.

Parameters:

data : 2D numpy.ndarray
Data array containing the training features. The dimensionality is (N_samples x N_features).
N : int
Number of features per single SVM.

Returns:

idx_start : [int]
Starting indexes for data subsets.
idx_end : [int]
End indexes for data subsets.
n_machines : int
Number of SVMs to be trained.
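
The index computation can be sketched in a few lines. This version is an illustration under two assumptions: it takes the number of features directly instead of the data array, and the end indexes are exclusive, as in Python slicing. get_data_start_end_idx_sketch is a hypothetical name.

```python
def get_data_start_end_idx_sketch(n_features, N):
    """Split n_features columns into consecutive groups of at most N."""
    # Ceiling division: the last group may hold fewer than N features.
    n_machines = -(-n_features // N)
    idx_start = [index * N for index in range(n_machines)]
    # Assumption: end indexes are exclusive, as in data[:, start:end].
    idx_end = [min(start + N, n_features) for start in idx_start]
    return idx_start, idx_end, n_machines


idx_start, idx_end, n_machines = get_data_start_end_idx_sketch(5, 2)
print(idx_start, idx_end, n_machines)  # [0, 2, 4] [2, 4, 5] 3
```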
load_cascade_of_machines(projector_file, projector_file_name)[source]

Loads a cascade of machines from the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.

Parameters:

projector_file : str
Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.
projector_file_name : str
The relative name of the file to load the machine from. This name will be augmented with the number of the machine. Name without extension.

Returns:

machines : dict
A cascade of machines. The key in the dictionary is the number of the machine, value is the machine itself.
load_machine(projector_file, projector_file_name)[source]

Loads the machine from the hdf5 file. The name of the file is specified in projector_file_name string. The location is specified in the path component of the projector_file string.

Parameters:

projector_file : str
Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.
projector_file_name : str
The relative name of the file to load the machine from. Name without extension.

Returns:

machine : object
A machine loaded from file.
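
How the two arguments combine into a file location can be sketched as below. The helper name machine_file_path and the ".hdf5" extension are assumptions made for illustration; the documentation only states that the path component of projector_file is used and that projector_file_name comes without an extension.

```python
import os


def machine_file_path(projector_file, projector_file_name, extension=".hdf5"):
    """Build the path of a machine file located next to `projector_file`.

    Hypothetical helper: the extension is an assumption, not a documented value.
    """
    # Only the directory (path component) of projector_file is used.
    directory = os.path.dirname(projector_file)
    return os.path.join(directory, projector_file_name + extension)


path = machine_file_path("/tmp/experiment/Projector.hdf5", "pca_machine")
print(path)
```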
load_projector(projector_file)[source]

Load the pretrained PCA machine and a cascade of SVM classifiers from files to perform feature projection. This function sets the arguments self.pca_machine and self.svm_machines of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see load_machine and load_cascade_of_machines methods of this class for more details.
project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. Convert input array to numpy array if necessary.
  2. Project features using pretrained PCA machine.
  3. Apply the cascade of SVMs to the projected features.
  4. Compute a single score per sample by combining the scores produced by the cascade of SVMs. The combination is done using combine_scores_of_svm_cascade method of this class.

Set performs_projection = True in the constructor to enable this function. It is assured that load_projector() is called before the project function is executed.

Parameters:

feature : FrameContainer or 2D numpy.ndarray
Two types of inputs are accepted: a Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer, or a 2D feature array of the size (N_samples x N_features).

Returns:

scores : 1D numpy.ndarray
Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.
save_cascade_of_machines(projector_file, projector_file_name, machines)[source]

Saves a cascade of machines to the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.

Parameters:

projector_file : str
Absolute name of the file to save the trained projector to, as returned by bob.pad.base framework. In this function only the path component is used.
projector_file_name : str
The relative name of the file to save the machine to. This name will be augmented with the number of the machine. Name without extension.
machines : dict
A cascade of machines. The key in the dictionary is the number of the machine, value is the machine itself.
save_machine(projector_file, projector_file_name, machine)[source]

Saves the machine to the hdf5 file. The name of the file is specified in projector_file_name string. The location is specified in the path component of the projector_file string.

Parameters:

projector_file : str
Absolute name of the file to save the trained projector to, as returned by bob.pad.base framework. In this function only the path component is used.
projector_file_name : str
The relative name of the file to save the machine to. Name without extension.
machine : object
The machine to be saved.
score(toscore)[source]

Returns the probability of a sample belonging to the real class.

Parameters:

toscore : 1D or 2D numpy.ndarray
2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

score : [float]
If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.
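
The two input shapes and the two return modes can be sketched as follows. This is an illustration only: plain lists stand in for numpy arrays, frame_level_scores_flag is passed as an argument rather than read from the class, and averaging the frame scores into the single per-video score is an assumption, since the documentation does not state how that score is computed.

```python
from statistics import mean


def real_class_scores(toscore):
    """Extract one real-class score per frame from 1D or 2D input."""
    # 2D input: the first column holds the real-class probability of each frame.
    # 1D input: each value already is the real-class score of a frame.
    return [
        float(row[0]) if isinstance(row, (list, tuple)) else float(row)
        for row in toscore
    ]


def score(toscore, frame_level_scores_flag=False):
    frame_scores = real_class_scores(toscore)
    if frame_level_scores_flag:
        return frame_scores  # one score per frame/sample
    # Assumption: the per-video score is the mean of the frame scores.
    # It is wrapped in a list because the result must be an iterable.
    return [mean(frame_scores)]


two_class = [[0.75, 0.25], [0.25, 0.75]]  # columns: real, attack
one_class = [0.2, 0.4]
print(score(two_class))
print(score(one_class, frame_level_scores_flag=True))
```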
train_pca(data)[source]

Train PCA given input array of feature vectors. The data is mean-std normalized prior to PCA training.

Parameters:

data : 2D numpy.ndarray
Array of feature vectors of the size (N_samples x N_features). The features must be already mean-std normalized.

Returns:

machine : bob.learn.linear.Machine
The PCA machine that has been trained. The mean-std normalizers are also set in the machine.
eig_vals : 1D numpy.ndarray
The eigen-values of the PCA projection.
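
The mean-std normalization mentioned above can be sketched as below. This only illustrates the normalization step, not the PCA training itself; the use of the population standard deviation and the helper name mean_std_normalize are assumptions. The returned means and standard deviations correspond to the normalizers that would have to be reapplied to unseen data.

```python
from statistics import mean, pstdev


def mean_std_normalize(data):
    """Scale each feature column to zero mean and unit standard deviation.

    Hypothetical helper; assumes no column is constant (non-zero std).
    """
    columns = list(zip(*data))  # transpose: one tuple per feature
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # assumption: population std
    normalized = [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in data
    ]
    return normalized, means, stds


data = [[1.0, 10.0],
        [3.0, 30.0]]
normalized, means, stds = mean_std_normalize(data)
print(normalized, means, stds)
```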
train_pca_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]

This function is designed to train the cascade of SVMs given features of real and attack classes. The procedure is the following:

  1. First, the PCA machine is trained also incorporating mean-std feature normalization. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.
  2. The features are next projected given trained PCA machine.
  3. Next, an SVM machine is trained for each group of N projected features. Prior to SVM training the features are again mean-std normalized. The projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for projected features 1 and 2, the second SVM is trained for projected features 3 and 4, and so on.

Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.

Parameters:

real : 2D numpy.ndarray
Training features for the real class.
attack : 2D numpy.ndarray
Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict
Dictionary containing the hyper-parameters of the SVM.
N : int
The number of features to be used for training a single SVM machine in the cascade.

Returns:

pca_machine : object
A trained PCA machine.
svm_machines : dict
A cascade of SVM machines.
train_projector(training_features, projector_file)[source]

Train the PCA and the cascade of SVMs for feature projection and save them to files. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features : [[FrameContainer], [FrameContainer]]
A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file : str
The file to save the trained projector to, as returned by the bob.pad.base framework. In this class the names of the files to save the projectors to are modified, see save_machine and save_cascade_of_machines methods of this class for more details.
train_svm(real, attack, machine_type, kernel_type, svm_kwargs)[source]

A one-class or two-class SVM is trained in this method given input features. The value of the attack argument is not important in the case of a one-class SVM. Prior to training the data is mean-std normalized.

Parameters:

real : 2D numpy.ndarray
Training features for the real class.
attack : 2D numpy.ndarray
Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict
Dictionary containing the hyper-parameters of the SVM.

Returns:

machine : object
A trained SVM machine. The mean-std normalizers are also set in the machine.
train_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]

Train a cascade of SVMs, one SVM machine per N features. N is usually small N = (2, 3). So, if N = 2, the first SVM is trained for features 1 and 2, second SVM is trained for features 3 and 4, and so on.

Both one-class and two-class SVM cascades can be trained. The value of attack argument is not important in the case of one-class SVM.

The data is mean-std normalized prior to SVM cascade training.

Parameters:

real : 2D numpy.ndarray
Training features for the real class.
attack : 2D numpy.ndarray
Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str
A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str
A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict
Dictionary containing the hyper-parameters of the SVM.
N : int
The number of features to be used for training a single SVM machine in the cascade.

Returns:

machines : dict
A dictionary containing a cascade of trained SVM machines.
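
The grouping of features into a cascade can be sketched as below for the two-class case. The callable train_one is a hypothetical stand-in for the actual SVM training step, and numbering the machines from 0 is an assumption; the documentation only says that the dictionary key is the number of the machine.

```python
def train_cascade(real, attack, N, train_one):
    """Train one machine per group of N consecutive feature columns.

    `train_one(real_subset, attack_subset)` is a hypothetical callable
    standing in for the SVM training step.
    """
    n_features = len(real[0])
    machines = {}
    for machine_num, start in enumerate(range(0, n_features, N)):
        end = min(start + N, n_features)  # the last group may be smaller
        real_subset = [row[start:end] for row in real]
        attack_subset = [row[start:end] for row in attack]
        machines[machine_num] = train_one(real_subset, attack_subset)
    return machines


def record_shapes(real_subset, attack_subset):
    """Stand-in trainer that only records the shape of its inputs."""
    return (len(real_subset), len(attack_subset), len(real_subset[0]))


real = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
attack = [[0, 0, 0, 0, 0]]
machines = train_cascade(real, attack, 2, record_shapes)
print(machines)  # {0: (2, 1, 2), 1: (2, 1, 2), 2: (2, 1, 1)}
```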

Databases

class bob.pad.base.database.Client(client_id)

Bases: object

The clients of this database contain ONLY client ids. Nothing special.

class bob.pad.base.database.FileListPadDatabase(filelists_directory, name, protocol=None, pad_file_class=<class 'bob.pad.base.database.PadFile'>, original_directory=None, original_extension=None, annotation_directory=None, annotation_extension='', annotation_type=None, train_subdir=None, dev_subdir=None, eval_subdir=None, real_filename=None, attack_filename=None, keep_read_lists_in_memory=True, **kwargs)

Bases: bob.pad.base.database.PadDatabase, bob.bio.base.database.FileListBioDatabase

This class provides a user-friendly interface to databases that are given as file lists.

Keyword parameters:

filelists_directory : str
The directory that contains the filelists defining the protocol(s). If you use the protocol attribute when querying the database, it will be appended to the base directory, such that several protocols are supported by the same class instance of bob.pad.base.
name : str
The name of the database
protocol : str
The protocol of the database. This should be a folder inside filelists_directory.
pad_file_class : class
The class that should be used for returning the files. This can be PadFile, PadVoiceFile, or anything similar.
original_directory : str or None
The directory, where the original data can be found
original_extension : str or [str] or None
The filename extension of the original data, or multiple extensions
annotation_directory : str or None
The directory, where additional annotation files can be found
annotation_extension : str or None
The filename extension of the annotation files

annotation_type : str
The type of the annotation file to read, see bob.db.base.read_annotation_file for accepted formats.

train_subdir : str or None
Specify a custom subdirectory for the filelists of the training set (default is ‘train’)
dev_subdir : str or None
Specify a custom subdirectory for the filelists of the development set (default is ‘dev’)
eval_subdir : str or None
Specify a custom subdirectory for the filelists of the evaluation set (default is ‘eval’)
keep_read_lists_in_memory : bool
If set to true, the lists are read only once and stored in memory
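
One plausible way the directory parameters combine into the location of a file list is sketched below. The nesting order (protocol folder, then group subdirectory, then file name) and the file name "real.lst" are assumptions for illustration; the documentation only states that the protocol is a folder inside filelists_directory and that each group has its own subdirectory.

```python
import os

# Documented default subdirectories for the three groups.
DEFAULT_SUBDIRS = {"train": "train", "dev": "dev", "eval": "eval"}


def filelist_path(filelists_directory, group, filename, protocol=None, subdirs=None):
    """Hypothetical helper composing the path of one file list."""
    subdirs = subdirs or DEFAULT_SUBDIRS
    parts = [filelists_directory]
    if protocol is not None:
        parts.append(protocol)  # the protocol is a folder inside the base directory
    parts.append(subdirs[group])
    parts.append(filename)  # placeholder name, not a documented default
    return os.path.join(*parts)


print(filelist_path("lists", "dev", "real.lst", protocol="general"))
```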
annotations(file)[source]
client_ids(protocol=None, groups=None)[source]

Returns a list of client ids for the specific query by the user.

Keyword Parameters:

protocol : str or None
The protocol to consider
groups : str or [str] or None
The groups to which the clients belong (“dev”, “eval”, “train”).

Returns: A list containing all the client ids which have the given properties.

groups(protocol=None, add_world=False, add_subworld=False)[source]

This function returns the list of groups for this database.

protocol : str or None
The protocol for which the groups should be retrieved.

Returns: a list of groups

objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

Returns a set of PadFile objects for the specific query by the user.

Keyword Parameters:

groups : str or [str] or None
One of the groups (“dev”, “eval”, “train”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.
protocol : str or None
The protocol to consider
purposes : str or [str] or None
The purposes required to be retrieved (“real”, “attack”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.
model_ids : [various type]
This parameter is not supported in PAD databases yet

Returns: A list of PadFile objects considering all the filtering criteria.

tobjects(groups=None, protocol=None, model_ids=None, **kwargs)[source]
zobjects(groups=None, protocol=None, **kwargs)[source]
class bob.pad.base.database.HighBioDatabase(original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', db_name='', **kwargs)

Bases: bob.bio.base.database.BioDatabase

Implements verification API for querying High database.

annotations(file)[source]
model_ids_with_protocol(groups=None, protocol=None, **kwargs)[source]
objects(protocol=None, purposes=None, model_ids=None, groups=None, **kwargs)[source]

Maps the objects method of PAD databases onto the objects method of verification databases.

Parameters:
  • protocol (str) – To distinguish two vulnerability scenarios, the protocol name should have either ‘-licit’ or ‘-spoof’ appended to it. For instance, if the database has a protocol ‘general’, the name passed to this method should be ‘general-licit’ to run verification experiments on bona fide data only, or ‘general-spoof’ to run them for the spoof scenario (the probes are attacks).
  • purposes ([str]) – This parameter is passed by the bob.bio.base verification experiment
  • model_ids ([object]) – This parameter is passed by the bob.bio.base verification experiment
  • groups ([str]) – We map the groups from (‘world’, ‘dev’, ‘eval’) used in verification experiments to (‘train’, ‘dev’, ‘eval’)
  • **kwargs – The rest of the parameters valid for a given database
Returns:

Set of BioFiles that verification experiments expect.

Return type:

[object]
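
The protocol naming convention and the group mapping described above can be sketched as follows. The helper names split_protocol and map_groups are hypothetical and only illustrate the convention; they are not part of the package.

```python
def split_protocol(protocol):
    """Split e.g. 'general-licit' into ('general', 'licit')."""
    for scenario in ("licit", "spoof"):
        suffix = "-" + scenario
        if protocol.endswith(suffix):
            return protocol[: -len(suffix)], scenario
    raise ValueError("protocol name must end with '-licit' or '-spoof'")


# Verification groups mapped to the PAD groups, as described above.
GROUP_MAP = {"world": "train", "dev": "dev", "eval": "eval"}


def map_groups(groups):
    return [GROUP_MAP[group] for group in groups]


print(split_protocol("general-spoof"))  # ('general', 'spoof')
print(map_groups(["world", "dev"]))     # ['train', 'dev']
```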

class bob.pad.base.database.HighPadDatabase(original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', db_name='', **kwargs)

Bases: bob.pad.base.database.FileListPadDatabase

class bob.pad.base.database.PadDatabase(name, protocol='Default', original_directory=None, original_extension=None, **kwargs)

Bases: bob.bio.base.database.BioDatabase

This class represents the basic API for database access. Please use this class as a base class for your database access classes. Do not forget to call the constructor of this base class in your derived class.

Parameters:

name : str
A unique name for the database.
protocol : str or None
The name of the protocol that defines the default experimental setup for this database.
original_directory : str
The directory where the original data of the database are stored.
original_extension : str
The file name extension of the original data.
kwargs : key=value pairs
The arguments of the bob.bio.base.database.BioDatabase base class constructor.

all_files(groups=('train', 'dev', 'eval'), flat=False)[source]

Returns all files of the database, respecting the current protocol. The files can be limited using the all_files_options in the constructor.

Parameters:
  • groups (str or tuple or None) – The groups to get the data for. It should be some of ('train', 'dev', 'eval') or None.
  • flat (bool) – If True, the real and attack files are merged into one list.
Returns:

files – The sorted and unique list of all files of the database.

Return type:

[bob.pad.base.database.PadFile]

annotations(file)[source]

Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.

Parameters:

file : bob.pad.base.database.PadFile
The file for which annotations should be returned.

Returns:

annots : dict or None
The annotations for the file, if available.
model_ids_with_protocol(groups = None, protocol = None, **kwargs) → ids[source]

Client-based PAD is not implemented.

objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

This function returns lists of File objects, which fulfill the given restrictions.

Keyword parameters:

groups : str or [str]
The groups of which the clients should be returned. Usually, groups are one or more elements of (‘train’, ‘dev’, ‘eval’)
protocol
The protocol for which the clients should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.
purposes : str or [str]
The purposes for which File objects should be retrieved. Usually it is either ‘real’ or ‘attack’.
model_ids : [various type]
This parameter is not supported in PAD databases yet
original_file_names(files) → paths[source]

Returns the full paths of the real and attack data of the given PadFile objects.

Parameters:

files : [[bob.pad.base.database.PadFile], [bob.pad.base.database.PadFile]]
The list of lists ([real, attack]) of file objects to retrieve the original data file names for.

Returns:

paths : [str] or [[str]]
The paths extracted for the concatenated real+attack files, in the preserved order.
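
The concatenation of real and attack files into one ordered list of paths can be sketched as below. FakePadFile is a hypothetical stand-in for PadFile that carries only the attributes needed here, and passing original_directory and original_extension as arguments (rather than reading them from the database object) is a simplification for illustration.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for PadFile: only the relative `path` is used here.
FakePadFile = namedtuple("FakePadFile", ["client_id", "path"])


def original_file_names(files, original_directory, original_extension):
    """Return full paths for the concatenated real+attack files, in order."""
    real, attack = files
    return [
        os.path.join(original_directory, f.path + original_extension)
        for f in real + attack
    ]


files = [
    [FakePadFile("client1", "real/sample_a")],
    [FakePadFile("client1", "attack/sample_b")],
]
paths = original_file_names(files, "/data", ".wav")
print(paths)
```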
training_files(step = None, arrange_by_client = False) → files[source]

Returns all training File objects. This function needs to be implemented in derived class implementations.

Parameters:
The parameters are not applicable in this version of anti-spoofing experiments

Returns:

files : [bob.pad.base.database.PadFile] or [[bob.pad.base.database.PadFile]]
The (arranged) list of files used for the training.
class bob.pad.base.database.PadFile(client_id, path, attack_type=None, file_id=None)

Bases: bob.bio.base.database.BioFile

A simple base class that defines the basic properties of a File object for use in PAD experiments.

Grid Configuration

The grid-related code is reused from the bob.bio.base package. Please see the corresponding documentation.