Tools implemented in bob.bio.base

Summary

Base Classes

bob.bio.base.preprocessor.Preprocessor([…]) This is the base class for all preprocessors.
bob.bio.base.extractor.Extractor([…]) This is the base class for all feature extractors.
bob.bio.base.algorithm.Algorithm([…]) This is the base class for all biometric recognition algorithms.
bob.bio.base.grid.Grid([grid_type, …]) This class is defining the options that are required to submit parallel jobs to the SGE grid, or jobs to the local queue.
bob.bio.base.annotator.Annotator([…]) Annotator class for all annotators.
bob.bio.base.baseline.Baseline(name, …) Base class to define baselines

Implementations

bob.bio.base.preprocessor.Filename() This preprocessor is simply passing over the file name, in order to be used in an extractor that loads the data from file.
bob.bio.base.preprocessor.SequentialPreprocessor(…) A helper class which takes several preprocessors and applies them one by one sequentially.
bob.bio.base.preprocessor.ParallelPreprocessor(…) A helper class which takes several preprocessors and applies them on each processor separately and yields their outputs one by one.
bob.bio.base.preprocessor.CallablePreprocessor(…) A simple preprocessor that takes a callable and applies that callable to the input.
bob.bio.base.extractor.SequentialExtractor(…) A helper class which takes several extractors and applies them one by one sequentially.
bob.bio.base.extractor.ParallelExtractor(…) A helper class which takes several extractors and applies them on each processor separately and yields their outputs one by one.
bob.bio.base.extractor.CallableExtractor(…) A simple extractor that takes a callable and applies that callable to the input.
bob.bio.base.extractor.Linearize([dtype]) Extracts features by simply concatenating all elements of the data into one long vector.
bob.bio.base.algorithm.Distance([…]) This class defines a simple distance measure between two features.
bob.bio.base.algorithm.PCA(subspace_dimension) Performs a principal component analysis (PCA) on the given data.
bob.bio.base.algorithm.LDA([…]) Computes a linear discriminant analysis (LDA) on the given data, possibly after computing a principal component analysis (PCA).
bob.bio.base.algorithm.PLDA(…[, …]) Tool chain for computing PLDA (over PCA-dimensionality reduced) features
bob.bio.base.algorithm.BIC(comparison_function) Computes the Intrapersonal/Extrapersonal classifier using a generic feature type and feature comparison function.
bob.bio.base.database.BioFile(client_id, path) A simple base class that defines basic properties of File object for the use in verification experiments
bob.bio.base.database.BioDatabase(name[, …]) This class represents the basic API for database access.
bob.bio.base.database.ZTBioDatabase(name[, …]) This class defines another set of abstract functions that need to be implemented if your database provides the interface for computing scores used for ZT-normalization.
bob.bio.base.database.FileListBioDatabase(…) This class provides a user-friendly interface to databases that are given as file lists.
bob.bio.base.annotator.FailSafe(annotators, …) A fail-safe annotator.
bob.bio.base.annotator.Callable(callable, …) A class that wraps a callable object that annotates a sample into a bob.bio.annotator object.

Preprocessors

class bob.bio.base.preprocessor.CallablePreprocessor(callable, accepts_annotations=True, write_data=None, read_data=None, **kwargs)

Bases: bob.bio.base.preprocessor.Preprocessor

A simple preprocessor that takes a callable and applies that callable to the input.

accepts_annotations

bool – If False, annotations are not passed to the callable.

callable

object – Anything that is callable. It will be used as a preprocessor in bob.bio.base.

read_data

object – A callable object with the signature of data = read_data(data_file). If not provided, the default implementation handles numpy arrays.

write_data

object – A callable object with the signature of write_data(data, data_file). If not provided, the default implementation handles numpy arrays.

Examples

You can take any function like numpy.cast['float32'] to cast your data to float32 for example. This is useful when you want to stack several preprocessors using the SequentialPreprocessor and ParallelPreprocessor classes.
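As a rough illustration of the wrapping described above, the sketch below uses a hypothetical stand-in class, `CallableWrapper`, instead of the real `CallablePreprocessor`. It only mimics how `accepts_annotations` decides whether annotations reach the callable. The call signature `(data, annotations)` and the helper functions `to_float` and `crop` are assumptions made for the example.

```python
class CallableWrapper:
    """Hypothetical stand-in, NOT the bob.bio.base class."""

    def __init__(self, callable, accepts_annotations=True):
        self.callable = callable
        self.accepts_annotations = accepts_annotations

    def __call__(self, data, annotations=None):
        # Forward annotations only when the wrapped callable accepts them.
        if self.accepts_annotations:
            return self.callable(data, annotations)
        return self.callable(data)


def to_float(values):
    # Illustrative callable that ignores annotations.
    return [float(v) for v in values]


def crop(values, annotations):
    # Illustrative callable that uses annotations to select a slice.
    return values[annotations["start"]:annotations["stop"]]


cast = CallableWrapper(to_float, accepts_annotations=False)
cropper = CallableWrapper(crop)
result = cast(cropper([1, 2, 3, 4], {"start": 1, "stop": 3}))
```

Here `cast` can be called with or without annotations, because they are dropped before the wrapped function is invoked.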

class bob.bio.base.preprocessor.Filename

Bases: bob.bio.base.preprocessor.Preprocessor

This preprocessor is simply passing over the file name, in order to be used in an extractor that loads the data from file.

The file name that will be returned by the read_data() function will contain the path of the bob.bio.base.database.BioFile, but it might contain more paths (such as the --preprocessed-directory passed on command line).

read_data(data_file) → data[source]

Returns the name of the data file without its filename extension.

Parameters:

data_file : str
The name of the preprocessed data file.

Returns:

data : str
The preprocessed data read from file.
write_data(data, data_file)[source]

Does not write any data.

data : any
ignored.
data_file : any
ignored.
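The behavior described for read_data() (returning the file name without its extension) can be sketched with the standard library alone. The function name `strip_extension` and the example path are hypothetical; the real class may compute the name differently.

```python
import os


def strip_extension(data_file):
    """Return the file name without its filename extension."""
    return os.path.splitext(data_file)[0]


# Hypothetical path; only the trailing ".hdf5" is removed.
name = strip_extension("preprocessed/subject01/session1.hdf5")
```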
class bob.bio.base.preprocessor.ParallelPreprocessor(processors, **kwargs)

Bases: bob.extension.processors.ParallelProcessor, bob.bio.base.preprocessor.Preprocessor

A helper class which takes several preprocessors and applies them on each processor separately and yields their outputs one by one.

processors

list – A list of preprocessors to apply.

Examples

You can use this class to apply several preprocessors on your data and get all the results back. For example:

>>> import numpy as np
>>> from functools import  partial
>>> from bob.bio.base.preprocessor import ParallelPreprocessor, CallablePreprocessor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> parallel_preprocessor = ParallelPreprocessor(
...     [CallablePreprocessor(f, accepts_annotations=False) for f in
...      [np.cast['float64'], lambda x: x / 2.0]])
>>> np.allclose(list(parallel_preprocessor(raw_data)),[[[ 1.,  2.,  3.],[ 1.,  2.,  3.]], [[ 0.5,  1. ,  1.5],[ 0.5,  1. ,  1.5]]])
True

The data may be further processed using a SequentialPreprocessor:

>>> from bob.bio.base.preprocessor import SequentialPreprocessor
>>> total_preprocessor = SequentialPreprocessor(
...     [parallel_preprocessor, CallablePreprocessor(list, False),
...      CallablePreprocessor(partial(np.concatenate, axis=1), False)])
>>> np.allclose(total_preprocessor(raw_data),[[ 1. ,  2. ,  3. ,  0.5,  1. ,  1.5],[ 1. ,  2. ,  3. ,  0.5,  1. ,  1.5]])
True
class bob.bio.base.preprocessor.Preprocessor(writes_data=True, read_original_data=None, min_preprocessed_file_size=1000, **kwargs)

Bases: object

This is the base class for all preprocessors. It defines the minimum requirements for all derived preprocessor classes.

Parameters:

writes_data : bool
Select whether the preprocessor actually writes preprocessed images or simply returns values.
read_original_data: callable or None
This function is used to read the original data from file. It takes three inputs: A bob.bio.base.database.BioFile (or one of its derivatives), the original directory (as str) and the original extension (as str). If None, the default function bob.bio.base.read_original_data() is used.
min_preprocessed_file_size: int
The minimum file size of saved preprocessed data in bytes. If the saved preprocessed data file is smaller than this, it is assumed to be corrupt and the data will be processed again.
kwargs : key=value pairs
A list of keyword arguments to be written in the __str__ function.
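The size check behind min_preprocessed_file_size can be pictured as follows. This is a minimal sketch under assumptions: the helper name `needs_reprocessing` is hypothetical, and treating a missing file as stale is an assumption, not something stated above.

```python
import os
import tempfile


def needs_reprocessing(path, min_file_size=1000):
    """True if the file is missing (assumption) or smaller than the threshold in bytes."""
    return not os.path.exists(path) or os.path.getsize(path) < min_file_size


with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.hdf5")
    large = os.path.join(tmp, "large.hdf5")
    with open(small, "wb") as f:
        f.write(b"\0" * 10)      # below the 1000-byte default
    with open(large, "wb") as f:
        f.write(b"\0" * 2000)    # above the 1000-byte default
    small_is_stale = needs_reprocessing(small)
    large_is_stale = needs_reprocessing(large)
    missing_is_stale = needs_reprocessing(os.path.join(tmp, "missing.hdf5"))
```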
read_data(data_file) → data[source]

Reads the preprocessed data from file. In this base class implementation, it uses bob.bio.base.load() to do that. If you have a different format, please overwrite this function.

Parameters:

data_file : str or bob.io.base.HDF5File
The file open for reading or the name of the file to read from.

Returns:

data : object (usually numpy.ndarray)
The preprocessed data read from file.
write_data(data, data_file)[source]

Writes the given preprocessed data to a file with the given name. In this base class implementation, we simply use bob.bio.base.save() for that. If you have a different format (e.g. not images), please overwrite this function.

Parameters:

data : object
The preprocessed data, i.e., what is returned from __call__.
data_file : str or bob.io.base.HDF5File
The file open for writing, or the name of the file to write.
class bob.bio.base.preprocessor.SequentialPreprocessor(processors, read_original_data=None, **kwargs)

Bases: bob.extension.processors.SequentialProcessor, bob.bio.base.preprocessor.Preprocessor

A helper class which takes several preprocessors and applies them one by one sequentially.

processors

list – A list of preprocessors to apply.

Examples

You can use this class to apply a chain of preprocessors on your data. For example:

>>> import numpy as np
>>> from functools import  partial
>>> from bob.bio.base.preprocessor import SequentialPreprocessor, CallablePreprocessor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> seq_preprocessor = SequentialPreprocessor(
...     [CallablePreprocessor(f, accepts_annotations=False) for f in
...      [np.cast['float64'], lambda x: x / 2, partial(np.mean, axis=1)]])
>>> np.allclose(seq_preprocessor(raw_data), [ 1.,  1.])
True
>>> np.all(seq_preprocessor(raw_data) ==
...        np.mean(np.cast['float64'](raw_data) / 2, axis=1))
True
read_data(data_file)[source]
write_data(data, data_file)[source]

Extractors

class bob.bio.base.extractor.CallableExtractor(callable, write_feature=None, read_feature=None, **kwargs)

Bases: bob.bio.base.extractor.Extractor

A simple extractor that takes a callable and applies that callable to the input.

callable

object – Anything that is callable. It will be used as an extractor in bob.bio.base.

read_feature

object – A callable object with the signature of feature = read_feature(feature_file). If not provided, the default implementation handles numpy arrays.

write_feature

object – A callable object with the signature of write_feature(feature, feature_file). If not provided, the default implementation handles numpy arrays.

Examples

You can take any function like numpy.cast['float32'] to cast your data to float32 for example. This is useful when you want to stack several extractors using the SequentialExtractor and ParallelExtractor classes.

class bob.bio.base.extractor.Extractor(requires_training=False, split_training_data_by_client=False, min_extractor_file_size=1000, min_feature_file_size=1000, **kwargs)

Bases: object

This is the base class for all feature extractors. It defines the minimum requirements that a derived feature extractor class needs to implement.

If your derived class requires training, please register this here.

Parameters:

requires_training : bool
Set this flag to True if your feature extractor needs to be trained. In that case, please override the train() and load() methods.
split_training_data_by_client : bool
Set this flag to True if your feature extractor requires the training data to be split by clients. Ignored if requires_training is False.
min_extractor_file_size : int
The minimum file size of a saved extractor file for extractors that require training in bytes. If the saved file size is smaller than this, it is assumed to be a corrupt file and the extractor will be trained again.
min_feature_file_size : int
The minimum file size of extracted features in bytes. If the saved file size is smaller than this, it is assumed to be a corrupt file and the features will be extracted again.
kwargs : key=value pairs
A list of keyword arguments to be written in the __str__ function.
load(extractor_file)[source]

Loads the parameters required for feature extraction from the extractor file. This function usually is only useful in combination with the train() function. In this base class implementation, it does nothing.

Parameters:

extractor_file : str
The file to read the extractor from.
read_feature(feature_file)[source]

Reads the extracted feature from file. In this base class implementation, it uses bob.bio.base.load() to do that. If you have a different format, please overwrite this function.

Parameters:

feature_file : str or bob.io.base.HDF5File
The file open for reading or the name of the file to read from.

Returns:

feature : object (usually numpy.ndarray)
The feature read from file.
train(training_data, extractor_file)[source]

This function can be overwritten to train the feature extractor. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_training = True.

Parameters:

training_data : [object] or [[object]]
A list of preprocessed data that can be used for training the extractor. Data will be provided in a single list if split_training_data_by_client = False was specified in the constructor; otherwise the data will be split into lists, each of which contains the data of a single (training-)client.
extractor_file : str
The file to write. This file should be readable with the load() function.
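The two shapes of training_data, [object] and [[object]], can be illustrated with plain Python lists. The `(client_id, data)` pairs and the helper `group_by_client` are hypothetical; the grouping and ordering actually performed by the library are not specified here.

```python
def group_by_client(samples):
    """Group (client_id, data) pairs into one list per client,
    in order of each client's first appearance."""
    groups = {}
    for client_id, data in samples:
        groups.setdefault(client_id, []).append(data)
    return list(groups.values())


samples = [("c1", "a"), ("c2", "b"), ("c1", "c")]

# Shape received when the data is NOT split by client: [object]
flat = [data for _, data in samples]

# Shape received when the data IS split by client: [[object]]
by_client = group_by_client(samples)
```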
write_feature(feature, feature_file)[source]

Writes the given extracted feature to a file with the given name. In this base class implementation, we simply use bob.bio.base.save() for that. If you have a different format, please overwrite this function.

Parameters:

feature : object
The extracted feature, i.e., what is returned from __call__.
feature_file : str or bob.io.base.HDF5File
The file open for writing, or the name of the file to write.
class bob.bio.base.extractor.Linearize(dtype=None)

Bases: bob.bio.base.extractor.Extractor

Extracts features by simply concatenating all elements of the data into one long vector.

If a dtype is specified in the constructor, it is assured that the resulting feature vector has that data type.
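The idea of concatenating all elements into one long vector can be sketched in pure Python. The function `linearize` below is a hypothetical stand-in that works on nested lists; the real class presumably operates on numpy arrays, and using a Python type such as `float` for `dtype` is an assumption of this example.

```python
def linearize(data, dtype=None):
    """Flatten arbitrarily nested lists into one flat list,
    optionally casting every element with dtype."""
    if not isinstance(data, (list, tuple)):
        return [dtype(data) if dtype is not None else data]
    flat = []
    for item in data:
        flat.extend(linearize(item, dtype))
    return flat


vector = linearize([[1, 2, 3], [4, 5, 6]], dtype=float)
```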

load(*args, **kwargs)[source]
train(*args, **kwargs)[source]
class bob.bio.base.extractor.MultipleExtractor(requires_training=False, split_training_data_by_client=False, min_extractor_file_size=1000, min_feature_file_size=1000, **kwargs)

Bases: bob.bio.base.extractor.Extractor

Base class for SequentialExtractor and ParallelExtractor. This class is not meant to be used directly.

static get_attributes(processors)[source]
get_extractor_groups()[source]
load(extractor_file)[source]
train_one(e, training_data, extractor_file, apply=False)[source]

Trains one extractor and optionally applies the extractor on the training data after training.

Parameters:
  • e (Extractor) – The extractor to train. The extractor should be able to save itself in an opened hdf5 file.
  • training_data ([object] or [[object]]) – The data to be used for training.
  • extractor_file (bob.io.base.HDF5File) – The opened hdf5 file to save the trained extractor inside.
  • apply (bool, optional) – If True, the extractor is applied to the training data after it is trained and the data is returned.
Returns:

Returns None if apply is False. Otherwise, returns the transformed training_data.

Return type:

None or [object] or [[object]]

class bob.bio.base.extractor.ParallelExtractor(processors, **kwargs)

Bases: bob.extension.processors.ParallelProcessor, bob.bio.base.extractor.MultipleExtractor

A helper class which takes several extractors and applies them on each processor separately and yields their outputs one by one.

processors

list – A list of extractors to apply.

Examples

You can use this class to apply several extractors on your data and get all the results back. For example:

>>> import numpy as np
>>> from functools import  partial
>>> from bob.bio.base.extractor import ParallelExtractor, CallableExtractor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> parallel_extractor = ParallelExtractor(
...     [CallableExtractor(f) for f in
...      [np.cast['float64'], lambda x: x / 2.0]])
>>> np.allclose(list(parallel_extractor(raw_data)),[[[ 1.,  2.,  3.],[ 1.,  2.,  3.]], [[ 0.5,  1. ,  1.5],[ 0.5,  1. ,  1.5]]])
True

The data may be further processed using a SequentialExtractor:

>>> from bob.bio.base.extractor import SequentialExtractor
>>> total_extractor = SequentialExtractor(
...     [parallel_extractor, CallableExtractor(list),
...      CallableExtractor(partial(np.concatenate, axis=1))])
>>> np.allclose(total_extractor(raw_data),[[ 1. ,  2. ,  3. ,  0.5,  1. ,  1.5],[ 1. ,  2. ,  3. ,  0.5,  1. ,  1.5]])
True
train(training_data, extractor_file)[source]
class bob.bio.base.extractor.SequentialExtractor(processors, **kwargs)

Bases: bob.extension.processors.SequentialProcessor, bob.bio.base.extractor.MultipleExtractor

A helper class which takes several extractors and applies them one by one sequentially.

processors

list – A list of extractors to apply.

Examples

You can use this class to apply a chain of extractors on your data. For example:

>>> import numpy as np
>>> from functools import  partial
>>> from bob.bio.base.extractor import SequentialExtractor, CallableExtractor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> seq_extractor = SequentialExtractor(
...     [CallableExtractor(f) for f in
...      [np.cast['float64'], lambda x: x / 2, partial(np.mean, axis=1)]])
>>> np.allclose(seq_extractor(raw_data),[ 1.,  1.])
True
>>> np.all(seq_extractor(raw_data) ==
...        np.mean(np.cast['float64'](raw_data) / 2, axis=1))
True
read_feature(feature_file)[source]
train(training_data, extractor_file)[source]
write_feature(feature, feature_file)[source]

Algorithms

class bob.bio.base.algorithm.Algorithm(performs_projection=False, requires_projector_training=True, split_training_features_by_client=False, use_projected_features_for_enrollment=True, requires_enroller_training=False, multiple_model_scoring='average', multiple_probe_scoring='average', min_projector_file_size=1000, min_projected_file_size=1000, min_enroller_file_size=1000, min_model_file_size=1000, min_t_model_file_size=1000, **kwargs)

Bases: object

This is the base class for all biometric recognition algorithms. It defines the minimum requirements for all derived algorithm classes.

Call the constructor in derived class implementations. If your derived algorithm performs feature projection, please register this here. If it needs training for the projector or the enroller, please set this here, too.

Parameters:

performs_projection : bool
Set to True if your derived algorithm performs a projection. Also implement the project() function, and the load_projector() if necessary.
requires_projector_training : bool
Only valid, when performs_projection = True. Set this flag to False, when the projection is applied, but the projector does not need to be trained.
split_training_features_by_client : bool
Only valid, when performs_projection = True and requires_projector_training = True. If set to True, the train_projector() function will receive a double list (a list of lists) of data (sorted by identity). Otherwise, the train_projector() function will receive data in a single list.
use_projected_features_for_enrollment : bool
Only valid, when performs_projection = True. If set to False, the enrollment is performed using the original features, otherwise the features projected using the project() function are used for model enrollment.
requires_enroller_training : bool
Set this flag to True, when the enroller requires specialized training. Which kind of features are used for training depends on the use_projected_features_for_enrollment flag.
multiple_model_scoring : str or None
The way scores are fused when multiple features are stored in one model. See bob.bio.base.score_fusion_strategy() for possible values.
multiple_probe_scoring : str or None
The way scores are fused when multiple probes are available. See bob.bio.base.score_fusion_strategy() for possible values.
min_projector_file_size : int
The minimum file size of projector_file in bytes. If the saved file is smaller than this, it is assumed to be corrupt and it will be generated again.
min_projected_file_size : int
The minimum file size of projected_file in bytes. If the saved file is smaller than this, it is assumed to be corrupt and it will be generated again.
min_enroller_file_size : int
The minimum file size of enroller_file in bytes. If the saved file is smaller than this, it is assumed to be corrupt and it will be generated again.
min_model_file_size : int
The minimum file size of model_file in bytes. If the saved file is smaller than this, it is assumed to be corrupt and it will be generated again.
kwargs : key=value pairs
A list of keyword arguments to be written in the __str__ function.
enroll(enroll_features) → model[source]

This function will enroll and return the model from the given list of features. It must be overwritten by derived classes.

Parameters:

enroll_features : [object]
A list of features used for the enrollment of one model.

Returns:

model : object
The model enrolled from the enroll_features. Must be writable with the write_model() function and readable with the read_model() function.
load_enroller(enroller_file)[source]

Loads the parameters required for model enrollment from file. This function usually is only useful in combination with the train_enroller() function. This function is always called after calling load_projector(). In this base class implementation, it does nothing.

Parameters:

enroller_file : str
The file to read the enroller from.
load_projector(projector_file)[source]

Loads the parameters required for feature projection from file. This function usually is useful in combination with the train_projector() function. In this base class implementation, it does nothing.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file : str
The file to read the projector from.
project(feature) → projected[source]

This function will project the given feature. It must be overwritten by derived classes, as soon as performs_projection = True was set in the constructor. It is assured that the load_projector() was called once before the project function is executed.

Parameters:

feature : object
The feature to be projected.

Returns:

projected : object
The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.
read_feature(feature_file) → feature[source]

Reads the projected feature from file. In this base class implementation, it uses bob.io.base.load() to do that. If you have a different format, please overwrite this function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

feature_file : str or bob.io.base.HDF5File
The file open for reading, or the file name to read from.

Returns:

feature : object
The feature that was read from file.
read_model(model_file) → model[source]

Loads the enrolled model from file. In this base class implementation, it uses bob.io.base.load() to do that.

If you have a different format, please overwrite this function.

Parameters:

model_file : str or bob.io.base.HDF5File
The file open for reading, or the file name to read from.

Returns:

model : object
The model that was read from file.
score(model, probe) → score[source]

This function will compute the score between the given model and probe. It must be overwritten by derived classes.

Parameters:

model : object
The model to compare the probe with. The model was read using the read_model() function.
probe : object
The probe object to compare the model with. The probe was read using the read_feature() function (or the bob.bio.base.extractor.Extractor.read_feature() function, if this algorithm does not perform projection).

Returns:

score : float
A similarity between model and probe. Higher values define higher similarities.
score_for_multiple_models(models, probe) → score[source]

This function computes the score between the given model list and the given probe. In this base class implementation, it computes the scores for each model using the score() method, and fuses the scores using the fusion method specified in the constructor of this class. Usually this function is called from derived class score() functions.

Parameters:

models : [object]
A list of model objects.
probe : object
The probe object to compare the models with.

Returns:

score : float
The fused similarity between the given models and the probe.
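The score-then-fuse pattern described above might look like the following sketch. Only 'average' is confirmed by the constructor defaults shown earlier; the other strategy names in `FUSION`, the helper names, and the toy scoring function are assumptions for illustration. Consult bob.bio.base.score_fusion_strategy() for the values actually supported.

```python
from statistics import mean, median

# Assumed strategy names; only "average" appears in the documented defaults.
FUSION = {"average": mean, "min": min, "max": max, "median": median}


def score_for_multiple_models(models, probe, score, strategy="average"):
    """Score the probe against every model, then fuse the scores."""
    scores = [score(model, probe) for model in models]
    return FUSION[strategy](scores)


def neg_abs_diff(model, probe):
    # Toy similarity on scalars: higher (closer to zero) means more similar.
    return -abs(model - probe)


fused_avg = score_for_multiple_models([1.0, 3.0], 2.0, neg_abs_diff)
fused_max = score_for_multiple_models([1.0, 2.5], 2.0, neg_abs_diff, "max")
```

The same pattern applies to score_for_multiple_probes(), with the loop running over probes instead of models.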
score_for_multiple_probes(model, probes) → score[source]

This function computes the score between the given model and the given probe files. In this base class implementation, it computes the scores for each probe file using the score() method, and fuses the scores using the fusion method specified in the constructor of this class.

Parameters:

model : object
A model object to compare the probes with.
probes : [object]
The list of probe objects to compare the model with.

Returns:

score : float
The fused similarity between the given model and the probes.
train_enroller(training_features, enroller_file)[source]

This function can be overwritten to train the model enroller. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_enroller_training = True.

Parameters:

training_features : [object] or [[object]]
A list of extracted features that can be used for training the enroller. Features will be split into lists, each of which contains the features of a single (training-)client.
enroller_file : str
The file to write. This file should be readable with the load_enroller() function.
train_projector(training_features, projector_file)[source]

This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_projector_training = True.

Parameters:

training_features : [object] or [[object]]
A list of extracted features that can be used for training the projector. Features will be provided in a single list, if split_training_features_by_client = False was specified in the constructor, otherwise the features will be split into lists, each of which contains the features of a single (training-)client.
projector_file : str
The file to write. This file should be readable with the load_projector() function.
write_feature(feature, feature_file)[source]

Saves the given projected feature to a file with the given name. In this base class implementation:

  • If the given feature has a save attribute, it calls feature.save(bob.io.base.HDF5File(feature_file, 'w')). In this case, the given feature_file might be either a file name or a bob.io.base.HDF5File.
  • Otherwise, it uses bob.io.base.save() to do that.

If you have a different format, please overwrite this function.

Please register ‘performs_projection = True’ in the constructor to enable this function.

Parameters:

feature : object
A feature as returned by the project() function, which should be written.
feature_file : str or bob.io.base.HDF5File
The file open for writing, or the file name to write to.
write_model(model, model_file)[source]

Writes the enrolled model to the given file. In this base class implementation:

  • If the given model has a save attribute, it calls model.save(bob.io.base.HDF5File(model_file, 'w')). In this case, the given model_file might be either a file name or a bob.io.base.HDF5File.
  • Otherwise, it uses bob.io.base.save() to do that.

If you have a different format, please overwrite this function.

Parameters:

model : object
A model as returned by the enroll() function, which should be written.
model_file : str or bob.io.base.HDF5File
The file open for writing, or the file name to write to.
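The two-branch behavior of write_model() and write_feature() (use the object's own save method if it has one, otherwise fall back to a generic save function) is a duck-typing dispatch. In the sketch below a plain dict stands in for an HDF5 file, and `write_object`, `ModelWithSave` and `generic_save` are hypothetical names, not part of the library.

```python
def write_object(obj, target, fallback_save):
    """Call obj.save(target) when available; otherwise fallback_save(obj, target)."""
    if hasattr(obj, "save"):
        obj.save(target)
    else:
        fallback_save(obj, target)


class ModelWithSave:
    """Hypothetical model that knows how to serialize itself."""

    def __init__(self, values):
        self.values = values

    def save(self, target):
        target["model"] = list(self.values)


def generic_save(obj, target):
    # Stand-in for a generic save routine.
    target["data"] = obj


store_a, store_b = {}, {}
write_object(ModelWithSave([1, 2]), store_a, generic_save)  # uses obj.save
write_object([3, 4], store_b, generic_save)                 # uses the fallback
```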
class bob.bio.base.algorithm.BIC(comparison_function, maximum_training_pair_count=None, subspace_dimensions=None, uses_dffs=False, read_function=<function load>, write_function=<function save>, **kwargs)

Bases: bob.bio.base.algorithm.Algorithm

Computes the Intrapersonal/Extrapersonal classifier using a generic feature type and feature comparison function.

In this generic implementation, any distance or similarity vector that results as a comparison of two feature vectors can be used. Currently two different versions are implemented: One with [MWP98] and one without (a generalization of [GW09]) subspace projection of the features. The implementation of the BIC training is taken from bob.learn.linear.

Parameters:

comparison_function : function
The function to compare the features in the original feature space. For a given pair of features, this function is supposed to compute a vector of similarity (or distance) values. In the easiest case, it just computes the element-wise difference of the feature vectors, but more difficult functions can be applied, and the function might be specialized for the features you put in.
maximum_training_pair_count : int or None
Limit the number of training image pairs to the given value, e.g., to avoid memory issues.
subspace_dimensions : (int, int) or None
A tuple of sizes of the intrapersonal and extrapersonal subspaces. If given, subspace projection is performed (cf. [MWP98]) and the subspace projection matrices are truncated to the given sizes. If omitted, no subspace projection is performed (cf. [GW09]).
uses_dffs : bool
Only valid, when subspace_dimensions are specified. Use the Distance From Feature Space (DFFS) (cf. [MWP98]) during scoring. Use this flag with care!
read_function : function
A function to read a feature from bob.io.base.HDF5File. This function needs to be appropriate to read the type of features that you are using. By default, bob.bio.base.load() is used.
write_function : function
A function to write a feature to bob.io.base.HDF5File. This function is used to write the model and needs to be appropriate to write the type of features that you are using. By default, bob.bio.base.save() is used.
kwargs : key=value pairs
A list of keyword arguments directly passed to the Algorithm base class constructor.
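A comparison_function in the "easiest case" mentioned above could look like this sketch. The function name is hypothetical, and taking the absolute value of each difference is an illustrative choice, not something the documentation prescribes.

```python
def elementwise_difference(feature_a, feature_b):
    """Comparison vector: absolute difference of corresponding elements
    (the absolute value is an assumption of this example)."""
    if len(feature_a) != len(feature_b):
        raise ValueError("features must have the same length")
    return [abs(a - b) for a, b in zip(feature_a, feature_b)]


comparison = elementwise_difference([1.0, 4.0, 2.5], [0.5, 6.0, 2.5])
```

A function of this shape would be passed as the first constructor argument; more specialized functions can return any vector of similarity or distance values.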
enroll(enroll_features) → model[source]

Enrolls the model by storing all given input features. The features must be writable with the write_function defined in the constructor.

Parameters:

enroll_features : [object]
The list of projected features to enroll the model from.

Returns:

model : [object]
The enrolled model (which is identical to the input features).
load_enroller(enroller_file)[source]

Reads the bob.learn.linear.BICMachine from file.

The bob.learn.linear.BICMachine.use_DFFS will be overwritten by the uses_dffs value specified in this class's constructor.

Parameters:

enroller_file : str
An existing file, from which the bob.learn.linear.BICMachine will be read.
load_projector(*args, **kwargs)[source]
project(*args, **kwargs)[source]
read_feature(*args, **kwargs)[source]
read_model(model_file) → model[source]

Reads all features of the model from the given HDF5 file.

To read the features, the read_function specified in the constructor is employed.

Parameters:

model_file : str or bob.io.base.HDF5File
The file (open for reading) or the name of an existing file to read from.

Returns:

model : [object]
The read model, which is a list of features.
score(model, probe) → float[source]

Computes the BIC score between the model and the probe. First, the comparison_function is used to create the comparison vectors between all model features and the probe feature. Then, a BIC score is computed for each comparison vector, and the BIC scores are fused using the model_fusion_function defined in the bob.bio.base.algorithm.Algorithm base class.

Parameters:

model : [object]
The model storing all model features.
probe : object
The probe feature.

Returns:

score : float
A fused BIC similarity value between model and probe.
train_enroller(train_features, enroller_file)[source]

Trains the BIC by computing intra-personal and extra-personal subspaces.

First, two lists of pairs are computed, which contain intra-personal and extra-personal feature pairs, respectively. Afterward, the comparison vectors are computed using the comparison_function specified in the constructor. Finally, the bob.learn.linear.BICTrainer is used to train a bob.learn.linear.BICMachine.

Parameters:

train_features : [[object]]
A list of lists of feature vectors, which are used to train the BIC. Each sub-list contains the features of one client.
enroller_file : str
A writable file, into which the resulting bob.learn.linear.BICMachine will be written.
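
A minimal sketch of how intra-personal and extra-personal pairs could be derived from the nested train_features list. The helper make_pairs is hypothetical and enumerates all pairs exhaustively; whether the library sub-samples or orders the pairs differently is not stated here.

```python
from itertools import combinations


def make_pairs(train_features):
    """Build intra-personal and extra-personal pairs from per-client lists."""
    intra, extra = [], []
    # Intra-personal: two different features of the same client.
    for client_features in train_features:
        intra.extend(combinations(client_features, 2))
    # Extra-personal: one feature from each of two different clients.
    for client_a, client_b in combinations(train_features, 2):
        extra.extend((a, b) for a in client_a for b in client_b)
    return intra, extra


# Strings stand in for feature vectors to keep the pairing visible.
train_features = [["a1", "a2", "a3"], ["b1", "b2"]]
intra_pairs, extra_pairs = make_pairs(train_features)
```

Each pair would then be passed through the comparison_function to obtain one comparison vector per pair.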
train_projector(*args, **kwargs)[source]
write_feature(*args, **kwargs)[source]
write_model(model, model_file)[source]

Writes all features of the model into one HDF5 file.

To write the features, the write_function specified in the constructor is employed.

Parameters:

model : [object]
The model to write, which is a list of features.
model_file : str or bob.io.base.HDF5File
The file (open for writing) or a file name to write into.
class bob.bio.base.algorithm.Distance(distance_function=<function euclidean>, is_distance_function=True, **kwargs)

Bases: bob.bio.base.algorithm.Algorithm

This class defines a simple distance measure between two features. Independent of the actual shape, each feature vector is treated as a one-dimensional vector, and the specified distance function is used to compute the distance between the two features. If the given distance_function actually computes a distance, its negative value is returned (as all Algorithm classes are supposed to return similarity values). If the distance_function computes similarities, the similarity value is returned unaltered.

Parameters:

distance_function : callable
A function taking two 1D arrays and returning a float
is_distance_function : bool
Set this flag to False if the given distance_function computes a similarity value (i.e., higher values are better)
kwargs : key=value pairs
A list of keyword arguments directly passed to the Algorithm base class constructor.
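
The sign convention controlled by is_distance_function can be illustrated with a small sketch. The helpers euclidean and to_similarity are hypothetical and written with the standard library only; they are not the functions used by the class.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equally long 1D sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_similarity(value, is_distance_function=True):
    """Negate distances so that higher values always mean a better match."""
    return -value if is_distance_function else value


probe = [0.0, 0.0]
close = to_similarity(euclidean([3.0, 4.0], probe))  # distance 5
far = to_similarity(euclidean([6.0, 8.0], probe))    # distance 10
```

After negation, the closer feature receives the higher score, which is what the rest of the tool chain expects from a similarity value.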
enroll(enroll_features) → model[source]

Enrolls the model by storing all given input vectors.

Parameters:

enroll_features : [numpy.ndarray]
The list of projected features to enroll the model from.

Returns:

model : 2D numpy.ndarray
The enrolled model.
load_enroller(*args, **kwargs)[source]
load_projector(*args, **kwargs)[source]
project(*args, **kwargs)[source]
read_feature(*args, **kwargs)[source]
score(model, probe) → float[source]

Computes the distance of the model to the probe using the distance function specified in the constructor.

Parameters:

model : 2D numpy.ndarray
The model storing all enrollment features
probe : numpy.ndarray
The probe feature vector

Returns:

score : float
A similarity value between model and probe
train_enroller(*args, **kwargs)[source]
train_projector(*args, **kwargs)[source]
write_feature(*args, **kwargs)[source]
class bob.bio.base.algorithm.LDA(lda_subspace_dimension=None, pca_subspace_dimension=None, use_pinv=False, distance_function=<function euclidean>, is_distance_function=True, uses_variances=False, **kwargs)

Bases: bob.bio.base.algorithm.Algorithm

Computes a linear discriminant analysis (LDA) on the given data, possibly after computing a principal component analysis (PCA).

This algorithm computes an LDA projection (bob.learn.linear.FisherLDATrainer) on the given training features, projects the features to Fisher space and computes the distance of two projected features in Fisher space. For example, the Fisher faces algorithm as proposed by [ZKC+98] can be run with this class.

Additionally, a PCA projection matrix can be computed beforehand, to reduce the dimensionality of the input vectors. In that case, the finally stored projection matrix is the combination of the PCA and LDA projection.

Parameters:

lda_subspace_dimension : int or None
If specified, the LDA subspace will be truncated to the given number of dimensions. By default (None) it is limited to the number of classes in the training set - 1.
pca_subspace_dimension : int or float or None
If specified, a combined PCA + LDA projection matrix will be computed. If specified as int, defines the number of eigenvectors used in the PCA projection matrix. If specified as float (between 0 and 1), the number of eigenvectors is calculated such that the given percentage of variance is kept.
use_pinv : bool
Whether to use the pseudo-inverse to compute the LDA projection matrix. Sometimes the training fails because the covariance matrix cannot be inverted. In these cases, you might want to set use_pinv to True, which solves this problem but slows down the processing noticeably.
distance_function : function
A function that takes two parameters and returns a float. If uses_variances is set to True, the function is provided with a third parameter, which is the vector of variances (i.e., the eigenvalues).
is_distance_function : bool
Set this flag to False if the given distance_function computes a similarity value (i.e., higher values are better)
uses_variances : bool
If set to True, the distance_function is provided with a third argument, which is the vector of variances (i.e., the eigenvalues).
kwargs : key=value pairs
A list of keyword arguments directly passed to the Algorithm base class constructor.
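
When uses_variances is True, the distance_function receives the variances as a third argument. The sketch below shows one possible three-argument function. The name variance_weighted_distance and the argument order (model feature, probe, variances) are assumptions made for illustration.

```python
import math


def variance_weighted_distance(model_feature, probe, variances):
    """Hypothetical three-argument distance: each squared difference is
    divided by the variance (eigenvalue) of that subspace dimension."""
    return math.sqrt(
        sum((m - p) ** 2 / v for m, p, v in zip(model_feature, probe, variances))
    )


variances = [4.0, 1.0]
distance = variance_weighted_distance([2.0, 0.0], [0.0, 0.0], variances)
```

Dividing by the variance reduces the influence of dimensions in which the training data already varies strongly.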
enroll(enroll_features) → model[source]

Enrolls the model by storing all given input vectors.

Parameters:

enroll_features : [1D numpy.ndarray]
The list of projected features to enroll the model from.

Returns:

model : 2D numpy.ndarray
The enrolled model.
load_enroller(*args, **kwargs)[source]
load_projector(projector_file)[source]

Reads the projection matrix and the eigenvalues from file.

Parameters:

projector_file : str
An existing file, from which the PCA or PCA+LDA projection matrix and the eigenvalues are read.
project(feature) → projected[source]

Projects the given feature into Fisher space.

Parameters:

feature : 1D numpy.ndarray
The 1D feature to be projected.

Returns:

projected : 1D numpy.ndarray
The feature projected into Fisher space.
score(model, probe) → float[source]

Computes the distance of the model to the probe using the distance function specified in the constructor.

Parameters:

model : 2D numpy.ndarray
The model storing all enrollment features.
probe : 1D numpy.ndarray
The probe feature vector in Fisher space.

Returns:

score : float
A similarity value between model and probe
train_enroller(*args, **kwargs)[source]
train_projector(training_features, projector_file)[source]

Generates the LDA or PCA+LDA projection matrix from the given features (that are sorted by identity).

Parameters:

training_features : [[1D numpy.ndarray]]
A list of lists of 1D training arrays (vectors) to train the LDA projection matrix with. Each sub-list contains the features of one client.
projector_file : str
A writable file, into which the LDA or PCA+LDA projection matrix (as a bob.learn.linear.Machine) and the eigenvalues will be written.
class bob.bio.base.algorithm.PCA(subspace_dimension, distance_function=<function euclidean>, is_distance_function=True, uses_variances=False, **kwargs)

Bases: bob.bio.base.algorithm.Algorithm

Performs a principal component analysis (PCA) on the given data.

This algorithm computes a PCA projection (bob.learn.linear.PCATrainer) on the given training features, projects the features to eigenspace and computes the distance of two projected features in eigenspace. For example, the eigenface algorithm as proposed by [TP91] can be run with this class.

Parameters:

subspace_dimension : int or float
If specified as int, defines the number of eigenvectors used in the PCA projection matrix. If specified as float (between 0 and 1), the number of eigenvectors is calculated such that the given percentage of variance is kept.
distance_function : function
A function that takes two parameters and returns a float. If uses_variances is set to True, the function is provided with a third parameter, which is the vector of variances (i.e., the eigenvalues).
is_distance_function : bool
Set this flag to False if the given distance_function computes a similarity value (i.e., higher values are better)
uses_variances : bool
If set to True, the distance_function is provided with a third argument, which is the vector of variances (i.e., the eigenvalues).
kwargs : key=value pairs
A list of keyword arguments directly passed to the Algorithm base class constructor.
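
If subspace_dimension is given as a float, the number of eigenvectors follows from the eigenvalue spectrum. A minimal sketch of that selection is given below. The helper components_for_variance is hypothetical, and the exact threshold rule used by the library (for instance, strict versus non-strict comparison) is an assumption.

```python
def components_for_variance(eigenvalues, fraction):
    """Smallest number of leading eigenvectors whose eigenvalues sum to at
    least `fraction` of the total variance. The eigenvalues must be sorted
    in descending order."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(eigenvalues, start=1):
        cumulative += value
        if cumulative / total >= fraction:
            return count
    return len(eigenvalues)


eigenvalues = [5.0, 3.0, 1.5, 0.5]
kept = components_for_variance(eigenvalues, 0.9)
```

With these example eigenvalues, the first two components explain 80% of the variance and the first three explain 95%, so three components are kept for a requested fraction of 0.9.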
enroll(enroll_features) → model[source]

Enrolls the model by storing all given input vectors.

Parameters:

enroll_features : [1D numpy.ndarray]
The list of projected features to enroll the model from.

Returns:

model : 2D numpy.ndarray
The enrolled model.
load_enroller(*args, **kwargs)[source]
load_projector(projector_file)[source]

Reads the PCA projection matrix and the eigenvalues from file.

Parameters:

projector_file : str
An existing file, from which the PCA projection matrix and the eigenvalues are read.
project(feature) → projected[source]

Projects the given feature into eigenspace.

Parameters:

feature : 1D numpy.ndarray
The 1D feature to be projected.

Returns:

projected : 1D numpy.ndarray
The feature projected into eigenspace.
score(model, probe) → float[source]

Computes the distance of the model to the probe using the distance function specified in the constructor.

Parameters:

model : 2D numpy.ndarray
The model storing all enrollment features.
probe : 1D numpy.ndarray
The probe feature vector in eigenspace.

Returns:

score : float
A similarity value between model and probe
train_enroller(*args, **kwargs)[source]
train_projector(training_features, projector_file)[source]

Trains the PCA projection matrix on the given features and writes it, together with the eigenvalues, into the given projector_file.

Parameters:

training_features : [1D numpy.ndarray]
A list of 1D training arrays (vectors) to train the PCA projection matrix with.
projector_file : str
A writable file, into which the PCA projection matrix (as a bob.learn.linear.Machine) and the eigenvalues will be written.
class bob.bio.base.algorithm.PLDA(subspace_dimension_of_f, subspace_dimension_of_g, subspace_dimension_pca=None, plda_training_iterations=200, INIT_SEED=5489, INIT_F_METHOD='BETWEEN_SCATTER', INIT_G_METHOD='WITHIN_SCATTER', INIT_S_METHOD='VARIANCE_DATA', multiple_probe_scoring='joint_likelihood')

Bases: bob.bio.base.algorithm.Algorithm

Tool chain for computing PLDA (over PCA-dimensionality reduced) features

Todo

Add more documentation for the PLDA constructor, i.e., by explaining the parameters

enroll(enroll_features)[source]

Enrolls the model by computing an average of the given input vectors

load_enroller(projector_file)[source]

Reads the PCA projection matrix and the PLDA model from file

load_projector(*args, **kwargs)[source]
project(*args, **kwargs)[source]
read_feature(*args, **kwargs)[source]
read_model(model_file)[source]

Reads the model, which in this case is a PLDA-Machine

score(model, probe)[source]

Computes the PLDA score for the given model and probe

score_for_multiple_probes(model, probes)[source]

This function computes the score between the given model and several given probe files. In this base class implementation, it computes the scores for each probe file using the ‘score’ method, and fuses the scores using the fusion method specified in the constructor of this class.
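
The score-then-fuse behavior described above can be sketched as follows. The function toy_score and the choice of max as fusion function are hypothetical; the sketch mirrors only the description given here and does not model the multiple_probe_scoring option of this class.

```python
def score_for_multiple_probes(model, probes, score, fusion=max):
    """Score every probe against the model and fuse the results.
    `score` and `fusion` are hypothetical stand-ins."""
    return fusion([score(model, probe) for probe in probes])


def toy_score(model, probe):
    # Hypothetical similarity: negative absolute difference of two numbers.
    return -abs(model - probe)


fused = score_for_multiple_probes(5.0, [4.0, 7.0, 5.5], toy_score)
```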

train_enroller(training_features, projector_file)[source]

Generates the PLDA base model from a list of arrays (one per identity), and a set of training parameters. If PCA is requested, it is trained on the same data. Both the trained PLDABase and the PCA machine are written.

train_projector(*args, **kwargs)[source]
write_feature(*args, **kwargs)[source]

Databases

class bob.bio.base.database.BioDatabase(name, all_files_options={}, extractor_training_options={}, projector_training_options={}, enroller_training_options={}, check_original_files_for_existence=False, original_directory=None, original_extension=None, annotation_directory=None, annotation_extension='.pos', annotation_type=None, protocol='Default', training_depends_on_protocol=False, models_depend_on_protocol=False, **kwargs)

Bases: bob.db.base.FileDatabase

This class represents the basic API for database access. Please use this class as a base class for your database access classes. Do not forget to call the constructor of this base class in your derived class.

Parameters:

name : str A unique name for the database.

all_files_options : dict Dictionary of options passed to the bob.bio.base.database.BioDatabase.objects() database query when retrieving all data.

extractor_training_options : dict Dictionary of options passed to the bob.bio.base.database.BioDatabase.objects() database query used to retrieve the files for the extractor training.

projector_training_options : dict Dictionary of options passed to the bob.bio.base.database.BioDatabase.objects() database query used to retrieve the files for the projector training.

enroller_training_options : dict Dictionary of options passed to the bob.bio.base.database.BioDatabase.objects() database query used to retrieve the files for the enroller training.

check_original_files_for_existence : bool If True, the existence of the original data files is checked when querying the database.

original_directory : str The directory where the original data of the database are stored.

original_extension : str The file name extension of the original data.

annotation_directory : str The directory where the image annotations of the database are stored, if any.

annotation_extension : str The file name extension of the annotation files.

annotation_type : str The type of the annotation file to read, see bob.db.base.read_annotation_file for accepted formats.

protocol : str or None The name of the protocol that defines the default experimental setup for this database.

training_depends_on_protocol : bool Specifies whether the training set used for training the extractor and the projector depends on the protocol. This flag is used to avoid re-computation of data when running on different protocols of the same database.

models_depend_on_protocol : bool Specifies whether the models depend on the protocol. This flag is used to avoid re-computation of models when running on different protocols of the same database.

kwargs : key=value pairs The arguments of the Database base class constructor.

all_files(groups=None) → files[source]

Returns all files of the database, respecting the current protocol. The files can be limited using the all_files_options in the constructor.

Parameters:

groups : some of ('world', 'dev', 'eval') or None
The groups to get the data for. If None, data for all groups is returned.

kwargs: ignored

Returns:

files : [bob.bio.base.database.BioFile]
The sorted and unique list of all files of the database.
annotations(file)[source]

Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.

Parameters:

file : bob.bio.base.database.BioFile
The file for which annotations should be returned.

Returns:

annots : dict or None
The annotations for the file, if available.
arrange_by_client(files) → files_by_client[source]

Arranges the given list of files by client id. This function returns a list of lists of File objects.

Parameters:

files : [bob.bio.base.database.BioFile]
A list of files that should be split up by BioFile.client_id.

Returns:

files_by_client : [[bob.bio.base.database.BioFile]]
The list of lists of files, where each sub-list groups the files with the same BioFile.client_id
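
A minimal sketch of the grouping, using a hypothetical ToyFile tuple in place of BioFile. Keeping the clients in first-seen order is an assumption; the ordering used by the library is not specified here.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in for BioFile with only the attributes used here.
ToyFile = namedtuple("ToyFile", ["client_id", "path"])


def arrange_by_client(files):
    """Group files by client_id, keeping first-seen client order."""
    grouped = OrderedDict()
    for f in files:
        grouped.setdefault(f.client_id, []).append(f)
    return list(grouped.values())


files = [ToyFile(1, "a"), ToyFile(2, "b"), ToyFile(1, "c")]
files_by_client = arrange_by_client(files)
```

The result has one sub-list per client, which is the nested layout expected by training functions that take a list of lists of features.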
client_id_from_model_id(model_id, group='dev')[source]

Returns the client id associated with the given model id. In this base class implementation, it is assumed that only one model is enrolled for each client and, thus, client id and model id are identical. All keyword arguments are ignored. Please override this function in derived class implementations to change this behavior.

enroll_files(model_id, group = 'dev') → files[source]

Returns a list of File objects that should be used to enroll the model with the given model id from the given group, respecting the current protocol. If the model_id is None (the default), enrollment files for all models are returned.

Parameters:

model_id : int or str
A unique ID that identifies the model.
group : one of ('dev', 'eval')
The group to get the enrollment files for.

Returns:

files : [bob.bio.base.database.BioFile]
The list of files used to enroll the model with the given model id.
file_names(files, directory, extension) → paths[source]

Returns the full path of the given File objects.

Parameters:

files : [bob.bio.base.database.BioFile]
The list of file objects to retrieve the file names for.
directory : str
The base directory, where the files can be found.
extension : str
The file name extension to add to all files.

Returns:

paths : [str] or [[str]]
The paths extracted for the files, in the same order. If this database provides file sets, a list of lists of file names is returned, one sub-list for each file set.
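
For the simple case without file sets, the path construction can be sketched as below. ToyFile is a hypothetical stand-in for BioFile, and joining directory, relative path and extension in this way is an assumption about how the full path is assembled.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for BioFile with only the attributes used here.
ToyFile = namedtuple("ToyFile", ["client_id", "path"])


def file_names(files, directory, extension):
    """Join directory, relative path and extension for every file."""
    return [os.path.join(directory, f.path) + extension for f in files]


paths = file_names([ToyFile(1, "c1/s1"), ToyFile(2, "c2/s1")], "/data", ".png")
```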
groups(protocol=None)[source]

Returns the names of all registered groups in the database

Keyword parameters:

protocol: str
The protocol for which the groups should be retrieved. If you do not have protocols defined, just ignore this field.
model_ids(group = 'dev') → ids[source]

Returns a list of model ids for the given group, respecting the current protocol.

Parameters:

group : one of ('dev', 'eval')
The group to get the model ids for.

Returns:

ids : [int] or [str]
The list of (unique) model ids for models of the given group.
model_ids_with_protocol(groups = None, protocol = None, **kwargs) → ids[source]

Returns a list of model ids for the given groups and given protocol.

Parameters:

groups : one or more of ('world', 'dev', 'eval')
The groups to get the model ids for.

protocol: a protocol name

Returns:

ids : [int] or [str]
The list of (unique) model ids for the given groups.
object_sets(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

This function returns lists of FileSet objects, which fulfill the given restrictions.

Keyword parameters:

groups : str or [str]
The groups of which the clients should be returned. Usually, groups are one or more elements of (‘world’, ‘dev’, ‘eval’)
protocol
The protocol for which the clients should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.
purposes : str or [str]
The purposes for which File objects should be retrieved. Usually, purposes are one of (‘enroll’, ‘probe’).
model_ids : [various type]
The model ids for which the File objects should be retrieved. What defines a ‘model id’ depends on the database. When there is only one model per client, model ids and client ids are identical. When there is one model per file, model ids and file ids are identical. Other cases are possible as well.
objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

This function returns a list of bob.bio.base.database.BioFile objects or the list of objects which inherit from this class. Returned files fulfill the given restrictions.

Keyword parameters:

groups : str or [str]
The groups of which the clients should be returned. Usually, groups are one or more elements of (‘world’, ‘dev’, ‘eval’)
protocol
The protocol for which the clients should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.
purposes : str or [str]
The purposes for which File objects should be retrieved. Usually, purposes are one of (‘enroll’, ‘probe’).
model_ids : [various type]
The model ids for which the File objects should be retrieved. What defines a ‘model id’ depends on the database. When there is only one model per client, model ids and client ids are identical. When there is one model per file, model ids and file ids are identical. Other cases are possible as well.
probe_file_sets(model_id = None, group = 'dev') → files[source]

Returns a list of probe FileSet objects, respecting the current protocol. If a model_id is specified, only the probe files that should be compared with the given model id are returned (for most databases, these are all probe files of the given group). Otherwise, all probe files of the given group are returned.

Parameters:

model_id : int or str or None
A unique ID that identifies the model.
group : one of ('dev', 'eval')
The group to get the probe file sets for.

Returns:

files : [bob.bio.base.database.BioFileSet] or something similar
The list of file sets used to probe the model with the given model id.
probe_files(model_id = None, group = 'dev') → files[source]

Returns a list of probe File objects, respecting the current protocol. If a model_id is specified, only the probe files that should be compared with the given model id are returned (for most databases, these are all probe files of the given group). Otherwise, all probe files of the given group are returned.

Parameters:

model_id : int or str or None
A unique ID that identifies the model.
group : one of ('dev', 'eval')
The group to get the probe files for.

Returns:

files : [bob.bio.base.database.BioFile]
The list of files used to probe the model with the given model id.
replace_directories(replacements=None)[source]

This helper function replaces the original_directory and the annotation_directory of the database with the directories read from the given replacement file.

This function is provided for convenience, so that the database configuration files do not need to be modified. Instead, this function uses the given dictionary of replacements to change the original directory and the annotation directory (if given).

The given replacements can be of type dict, including all replacements, or a file name (as a str), in which case the file is read. The structure of the file should be:

# Comments starting with # and empty lines are ignored

[YOUR_..._DATA_DIRECTORY] = /path/to/your/data
[YOUR_..._ANNOTATION_DIRECTORY] = /path/to/your/annotations

If no annotation files are available (e.g. when they are stored inside the database), the annotation directory can be left out.

Parameters:

replacements : dict or str
A dictionary with replacements, or a name of a file to read the dictionary from. If the file name does not exist, no directories are replaced.
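
The file format shown above can be read with a few lines of Python. This is a sketch under stated assumptions: parse_replacements and resolve_directory are hypothetical helpers, the key [YOUR_TOY_DATA_DIRECTORY] is made up for the example, and matching the whole directory string against the keys is an assumption about how replacements are applied.

```python
def parse_replacements(text):
    """Parse 'KEY = value' lines, skipping comments and empty lines."""
    replacements = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        replacements[key.strip()] = value.strip()
    return replacements


def resolve_directory(directory, replacements):
    """Return the replacement if the directory is a known placeholder."""
    return replacements.get(directory, directory)


example = """
# Comments starting with # and empty lines are ignored

[YOUR_TOY_DATA_DIRECTORY] = /path/to/your/data
"""
replacements = parse_replacements(example)
resolved = resolve_directory("[YOUR_TOY_DATA_DIRECTORY]", replacements)
```

Directories that do not appear as a key are returned unchanged, so a database that is already configured with real paths is not affected.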
test_files(groups = ['dev']) → files[source]

Returns all test files (i.e., files used for enrollment and probing) for the given groups, respecting the current protocol. The files for the steps can be limited using the all_files_options defined in the constructor.

Parameters:

groups : some of ('dev', 'eval')
The groups to get the data for.

Returns:

files : [bob.bio.base.database.BioFile]
The sorted and unique list of test files of the database.
training_files(step = None, arrange_by_client = False) → files[source]

Returns all training files for the given step, and arranges them by client, if desired, respecting the current protocol. The files for the steps can be limited using the ..._training_options defined in the constructor.

Parameters:

step : one of ('train_extractor', 'train_projector', 'train_enroller') or None
The step for which the training data should be returned.
arrange_by_client : bool
Should the training files be arranged by client? If set to True, training files will be returned in [[bob.bio.base.database.BioFile]], where each sub-list contains the files of a single client. Otherwise, all files will be stored in a simple [bob.bio.base.database.BioFile].

Returns:

files : [bob.bio.base.database.BioFile] or [[bob.bio.base.database.BioFile]]
The (arranged) list of files used for the training of the given step.
uses_probe_file_sets(protocol=None)[source]

Defines whether, for the current protocol, the database uses several probe files to generate a score. Returns True if the given protocol specifies file sets for probes instead of a single probe file. This default implementation always returns False. If you need different behavior, please override this function in your derived class.

class bob.bio.base.database.BioFile(client_id, path, file_id=None, **kwargs)

Bases: bob.db.base.File

A simple base class that defines the basic properties of a File object for use in verification experiments.

Parameters:
  • client_id (object) – The id of the client this file belongs to. Its type depends on your implementation. If you use an SQL database, this should be an SQL type like Integer or String.
  • file_id (object) – see bob.db.base.File constructor
  • path (object) – see bob.db.base.File constructor
class bob.bio.base.database.BioFileSet(file_set_id, files, path=None, **kwargs)

Bases: bob.bio.base.database.BioFile

This class defines the minimum interface of a set of database files that needs to be exported. Use this class whenever the database provides several files that belong to the same probe. Each file set has an id and a list of associated files, which are of type bob.bio.base.database.BioFile and belong to the same client. The file set id can be anything hashable, but needs to be unique across the whole database.

Parameters:
  • file_set_id (str or int) – A unique ID that identifies the file set.
  • files ([bob.bio.base.database.BioFile]) – A non-empty list of BioFile objects that should be stored inside this file set. All files of that list need to have the same client ID.
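
The two constraints on files (non-empty, one shared client ID) can be expressed as a small check. ToyFile and check_file_set are hypothetical and only illustrate the constraints; they do not reproduce the constructor of this class.

```python
from collections import namedtuple

# Hypothetical stand-in for BioFile with only the attributes used here.
ToyFile = namedtuple("ToyFile", ["client_id", "path"])


def check_file_set(files):
    """Return the shared client id, or raise if the set is empty or mixed."""
    if not files:
        raise ValueError("a file set needs at least one file")
    client_ids = {f.client_id for f in files}
    if len(client_ids) != 1:
        raise ValueError("all files of a file set must share one client id")
    return client_ids.pop()


shared_id = check_file_set([ToyFile(7, "a"), ToyFile(7, "b")])
```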
class bob.bio.base.database.FileListBioDatabase(filelists_directory, name, protocol=None, bio_file_class=<class 'bob.bio.base.database.BioFile'>, original_directory=None, original_extension=None, annotation_directory=None, annotation_extension='.pos', annotation_type='eyecenter', dev_sub_directory=None, eval_sub_directory=None, world_filename=None, optional_world_1_filename=None, optional_world_2_filename=None, models_filename=None, probes_filename=None, scores_filename=None, tnorm_filename=None, znorm_filename=None, use_dense_probe_file_list=None, keep_read_lists_in_memory=True, **kwargs)

Bases: bob.bio.base.database.ZTBioDatabase

This class provides a user-friendly interface to databases that are given as file lists.

Parameters:
  • filelists_directory (str) – The directory that contains the filelists defining the protocol(s). If you use the protocol attribute when querying the database, it will be appended to the base directory, such that several protocols are supported by the same class instance of bob.bio.base.
  • name (str) – The name of the database
  • protocol (str) – The protocol of the database. This should be a folder inside filelists_directory.
  • bio_file_class (class) – The class that should be used to return the files. This can be bob.bio.base.database.BioFile, bob.bio.spear.database.AudioBioFile, bob.bio.face.database.FaceBioFile, or anything similar.
  • original_directory (str or None) – The directory, where the original data can be found.
  • original_extension (str or [str] or None) – The filename extension of the original data, or multiple extensions.
  • annotation_directory (str or None) – The directory, where additional annotation files can be found.
  • annotation_extension (str or None) – The filename extension of the annotation files.
  • annotation_type (str or None) – The type of annotation that can be read. Currently, options are 'eyecenter', 'named', 'idiap'. See bob.db.base.read_annotation_file() for details.
  • dev_sub_directory (str or None) – Specify a custom subdirectory for the filelists of the development set (default is 'dev')
  • eval_sub_directory (str or None) – Specify a custom subdirectory for the filelists of the evaluation set (default is 'eval')
  • world_filename (str or None) – Specify a custom filename for the training filelist (default is 'norm/train_world.lst')
  • optional_world_1_filename (str or None) – Specify a custom filename for the (first optional) training filelist (default is 'norm/train_optional_world_1.lst')
  • optional_world_2_filename (str or None) – Specify a custom filename for the (second optional) training filelist (default is 'norm/train_optional_world_2.lst')
  • models_filename (str or None) – Specify a custom filename for the model filelists (default is 'for_models.lst')
  • probes_filename (str or None) – Specify a custom filename for the probes filelists (default is 'for_probes.lst')
  • scores_filename (str or None) – Specify a custom filename for the scores filelists (default is 'for_scores.lst')
  • tnorm_filename (str or None) – Specify a custom filename for the T-norm scores filelists (default is 'for_tnorm.lst')
  • znorm_filename (str or None) – Specify a custom filename for the Z-norm scores filelists (default is 'for_znorm.lst')
  • use_dense_probe_file_list (bool or None) – Specify which list to use among probes_filename (dense) or scores_filename. If None, the choice is estimated from the given parameters.
  • keep_read_lists_in_memory (bool) – If set to True (the default), the lists are read only once and stored in memory. Otherwise the lists will be re-read for every query (not recommended).
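
The default names listed above imply a directory layout for the file lists. The sketch below assembles the corresponding paths. The helper default_filelist_paths, the protocol name "protoA" and the nesting of protocol folder, group sub-directory and list file are assumptions made for illustration.

```python
import posixpath


def default_filelist_paths(filelists_directory, protocol=None):
    """Assemble file list paths from the documented default names.
    The nesting <base>/<protocol>/<group>/<list> is an assumption."""
    if protocol:
        base = posixpath.join(filelists_directory, protocol)
    else:
        base = filelists_directory
    return {
        "world": posixpath.join(base, "norm/train_world.lst"),
        "dev_models": posixpath.join(base, "dev", "for_models.lst"),
        "dev_probes": posixpath.join(base, "dev", "for_probes.lst"),
        "eval_models": posixpath.join(base, "eval", "for_models.lst"),
        "eval_probes": posixpath.join(base, "eval", "for_probes.lst"),
    }


paths = default_filelist_paths("/lists", protocol="protoA")
```

Passing custom values for dev_sub_directory, eval_sub_directory or the *_filename parameters would replace the corresponding path components.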
all_files(groups=['dev'], add_zt_files=True)[source]

Returns all files for the given group. The internally stored protocol is used, throughout.

Parameters:
  • groups ([str]) – A list of groups to retrieve the files for.
  • add_zt_files (bool) – If selected, also files for ZT-norm scoring will be added. Please select this option only if this dataset provides ZT-norm files, see implements_zt().
Returns:

A list of all files that fulfill your query.

Return type:

[BioFile]

annotations(file)[source]

Reads the annotations for the given file id from file and returns them in a dictionary.

Parameters: file (BioFile) – The BioFile object for which the annotations should be read.
Returns: The annotations as a dictionary, e.g.: {'reye':(re_y,re_x), 'leye':(le_y,le_x)}
Return type: dict
client_id_from_model_id(model_id, group='dev')[source]

Returns the client id that is connected to the given model id.

Parameters:
  • model_id (str or None) – The model id for which the client id should be returned.
  • groups (str or [str] or None) – (optional) the groups, the client belongs to. Might be one or more of ('dev', 'eval', 'world', 'optional_world_1', 'optional_world_2'). If groups are given, only these groups are considered.
  • protocol (str or None) – The protocol to consider.
Returns:

The client id for the given model id, if found.

Return type:

str

client_id_from_t_model_id(t_model_id, group='dev')[source]

Returns the client id that is connected to the given T-Norm model id.

Parameters:
  • t_model_id (str or None) – The T-Norm model id for which the client id should be returned.
  • groups (str or [str] or None) – (optional) the groups, the client belongs to. Might be one or more of ('dev', 'eval'). If groups are given, only these groups are considered.
Returns:

The client id for the given model id of a T-Norm model, if found.

Return type:

str

client_ids(protocol=None, groups=None)[source]

Returns a list of client ids for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval', 'world', 'optional_world_1', 'optional_world_2').
Returns:

A list containing all the client ids which have the given properties.

Return type:

[str]

get_base_directory()[source]

Returns the base directory where the filelists defining the database are located.

groups(protocol=None, add_world=True, add_subworld=True)[source]

This function returns the list of groups for this database.

Parameters:
  • protocol (str or None) – The protocol for which the groups should be retrieved. If None, the internally stored protocol is used.
  • add_world (bool) – Add the world groups?
  • add_subworld (bool) – Add the sub-world groups? Only valid, when add_world=True
Returns:

A list of groups

Return type:

[str]

implements_zt(protocol=None, groups=None)[source]

Checks if the file lists for the ZT score normalization are available.

Parameters:
  • protocol (str or None) – The protocol for which the groups should be retrieved.
  • groups (str or [str] or None) – The groups for which the ZT score normalization file lists should be checked ('dev', 'eval').
Returns:

True if all file lists for ZT score normalization exist, otherwise False.

Return type:

bool

model_ids_with_protocol(groups=None, protocol=None, **kwargs)[source]

Returns a list of model ids for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the models belong ('dev', 'eval', 'world', 'optional_world_1', 'optional_world_2').
Returns:

A list containing all the model ids which have the given properties.

Return type:

[str]

objects(groups=None, protocol=None, purposes=None, model_ids=None, classes=None, **kwargs)[source]

Returns a set of bob.bio.base.database.BioFile objects for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • purposes (str or [str] or None) – The purposes required to be retrieved ('enroll', 'probe') or a tuple with several of them. If None is given (this is the default), it is considered the same as a tuple with all possible values. This field is ignored for the data from the 'world', 'optional_world_1', 'optional_world_2' groups.
  • model_ids (str or [str] or None) – Only retrieves the files for the provided list of model ids (claimed client id). If None is given (this is the default), no filter over the model_ids is performed.
  • groups (str or [str] or None) – One of the groups ('dev', 'eval', 'world', 'optional_world_1', 'optional_world_2') or a tuple with several of them. If None is given (this is the default), it is considered to be the existing subset of ('world', 'dev', 'eval').
  • classes (str or [str] or None) –

    The classes (types of accesses) to be retrieved ('client', 'impostor') or a tuple with several of them. If None is given (this is the default), it is considered the same as a tuple with all possible values.

    Note

    Classes are not allowed to be specified when ‘probes_filename’ is used in the constructor.

Returns:

A list of BioFile objects considering all the filtering criteria.

Return type:

[BioFile]
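
Most query parameters in this class accept "str or [str] or None". The sketch below shows one way such arguments could be normalized and applied as filters, including the rule that purposes are ignored for the world groups. It is an illustration under stated assumptions: as_tuple, select and the dictionary records are hypothetical stand-ins (real entries would be BioFile objects), and the library's own filtering may differ.

```python
WORLD_GROUPS = ("world", "optional_world_1", "optional_world_2")


def as_tuple(value, default):
    """Hypothetical helper: normalize 'str or [str] or None' to a tuple."""
    if value is None:
        return tuple(default)
    if isinstance(value, str):
        return (value,)
    return tuple(value)


def select(records, groups=None, purposes=None):
    """Return the paths of all records that match the given filters."""
    groups = as_tuple(groups, ("world", "dev", "eval"))
    purposes = as_tuple(purposes, ("enroll", "probe"))
    selected = []
    for record in records:
        if record["group"] not in groups:
            continue
        # The purposes filter is ignored for data from the world groups.
        if record["group"] not in WORLD_GROUPS and record["purpose"] not in purposes:
            continue
        selected.append(record["path"])
    return selected


# Illustrative records only.
records = [
    {"path": "a/1", "group": "world", "purpose": None},
    {"path": "b/1", "group": "dev", "purpose": "enroll"},
    {"path": "b/2", "group": "dev", "purpose": "probe"},
    {"path": "c/1", "group": "eval", "purpose": "probe"},
]

print(select(records))
print(select(records, groups="dev", purposes="probe"))
print(select(records, groups=["world", "dev"], purposes="enroll"))
```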

original_file_name(file, check_existence=True)[source]

Returns the original file name of the given file.

This interface supports several original extensions, so that file lists can contain images of different data types.

When multiple original extensions are specified, this function will check the existence of any of these file names, and return the first one that actually exists. In this case, the check_existence flag is ignored.

Parameters:
  • file (BioFile) – The BioFile object for which the file name should be returned.
  • check_existence (bool) – Should the existence of the original file be checked? (Ignored when multiple original extensions were specified in the constructor.)
Returns:

The full path of the original data file.

Return type:

str
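
The lookup over several original extensions described above can be sketched like this. The helper first_existing is hypothetical, and raising an IOError when no candidate exists is an assumption of this sketch rather than documented behavior.

```python
import os
import tempfile


def first_existing(directory, stem, extensions):
    """Hypothetical sketch: return the first candidate path that exists."""
    for extension in extensions:
        candidate = os.path.join(directory, stem + extension)
        if os.path.exists(candidate):
            return candidate
    # Assumption: signal a missing file with an exception.
    raise IOError("no file found for %r with extensions %r" % (stem, extensions))


with tempfile.TemporaryDirectory() as tmp:
    # Create only the .png variant, then search for .jpg first.
    open(os.path.join(tmp, "sample.png"), "w").close()
    found_name = os.path.basename(first_existing(tmp, "sample", [".jpg", ".png"]))
    print(found_name)
```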

set_base_directory(filelists_directory)[source]

Resets the base directory where the filelists defining the database are located.

tclient_ids(protocol=None, groups=None)[source]

Returns a list of T-Norm client ids for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval').
Returns:

A list containing all the T-Norm client ids which have the given properties.

Return type:

[str]

tmodel_ids_with_protocol(protocol=None, groups=None, **kwargs)[source]

Returns a list of T-Norm model ids for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the models belong ('dev', 'eval').
Returns:

A list containing all the T-Norm model ids belonging to the given group.

Return type:

[str]

tobjects(groups=None, protocol=None, model_ids=None, **kwargs)[source]

Returns a list of bob.bio.base.database.BioFile objects for enrolling T-norm models for score normalization.

Parameters:
  • protocol (str or None) – The protocol to consider
  • model_ids (str or [str] or None) – Only retrieves the files for the provided list of model ids (claimed client id). If None is given (this is the default), no filter over the model_ids is performed.
  • groups (str or [str] or None) – The groups to which the models belong ('dev', 'eval').
Returns:

A list of BioFile objects considering all the filtering criteria.

Return type:

[BioFile]

uses_dense_probe_file(protocol)[source]

Determines if a dense probe file list is used based on the existence of parameters.

zclient_ids(protocol=None, groups=None)[source]

Returns a list of Z-Norm client ids for the specific query by the user.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval').
Returns:

A list containing all the Z-Norm client ids which have the given properties.

Return type:

[str]

zobjects(groups=None, protocol=None, **kwargs)[source]

Returns a list of BioFile objects to perform Z-norm score normalization.

Parameters:
  • protocol (str or None) – The protocol to consider
  • groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval').
Returns:

A list of BioFile objects considering all the filtering criteria.

Return type:

[BioFile]

class bob.bio.base.database.ZTBioDatabase(name, z_probe_options={}, **kwargs)

Bases: bob.bio.base.database.BioDatabase

This class defines another set of abstract functions that need to be implemented if your database provides the interface for computing scores used for ZT-normalization.

all_files(groups=None) → files[source]

Returns all files of the database, including those for ZT norm, respecting the current protocol. The files can be limited using the all_files_options and the z_probe_options in the constructor.

Parameters:

groups : some of ('world', 'dev', 'eval') or None
The groups to get the data for. If None, data for all groups is returned.
add_zt_files: bool
If set (the default), files for ZT score normalization are added.

Returns:

files : [bob.bio.base.database.BioFile]
The sorted and unique list of all files of the database.

client_id_from_t_model_id(t_model_id, group = 'dev') → client_id[source]

Returns the client id for the given T-Norm model id. In this base class implementation, we just use the BioDatabase.client_id_from_model_id() function. Overload this function if you need another behavior.

Parameters:

t_model_id : int or str
A unique ID that identifies the T-Norm model.
group : one of ('dev', 'eval')
The group to get the client ids for.

Returns:

client_id : [int] or [str]
A unique ID that identifies the client, to which the T-Norm model belongs.

t_enroll_files(t_model_id, group = 'dev') → files[source]

Returns a list of File objects that should be used to enroll the T-Norm model with the given model id from the given group, respecting the current protocol.

Parameters:

t_model_id : int or str
A unique ID that identifies the model.
group : one of ('dev', 'eval')
The group to get the enrollment files for.

Returns:

files : [bob.bio.base.database.BioFile]
The sorted list of files used to enroll the model with the given model id.

t_model_ids(group = 'dev') → ids[source]

Returns a list of model ids of T-Norm models for the given group, respecting the current protocol.

Parameters:

group : one of ('dev', 'eval')
The group to get the model ids for.

Returns:

ids : [int] or [str]
The list of (unique) model ids for T-Norm models of the given group.

tmodel_ids_with_protocol(protocol=None, groups=None, **kwargs)[source]

This function returns the ids of the T-Norm models of the given groups for the given protocol.

Keyword parameters:

groups : str or [str]
The groups of which the model ids should be returned. Usually, groups are one or more elements of (‘dev’, ‘eval’)
protocol : str
The protocol for which the model ids should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.

tobjects(groups=None, protocol=None, model_ids=None, **kwargs)[source]

This function returns the File objects of the T-Norm models of the given groups for the given protocol and the given model ids.

Keyword parameters:

groups : str or [str]
The groups for which the File objects should be returned. Usually, groups are one or more elements of ('dev', 'eval').
protocol : str
The protocol for which the File objects should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.
model_ids : [various type]
The model ids for which the File objects should be retrieved. What defines a 'model id' depends on the database. In cases where there is only one model per client, model ids and client ids are identical. In cases where there is one model per file, model ids and file ids are identical. Other cases are possible as well.

z_probe_file_sets(group = 'dev') → files[source]

Returns a list of probe FileSet objects used to compute the Z-Norm. This function needs to be implemented in derived class implementations.

Parameters:

group : one of ('dev', 'eval')
The group to get the Z-norm probe files for.

Returns:

files : [bob.bio.base.database.BioFileSet]
The unique list of file sets used to compute the Z-norm.

z_probe_files(group = 'dev') → files[source]

Returns a list of probe files used to compute the Z-Norm, respecting the current protocol. The Z-probe files can be limited using the z_probe_options given in the constructor.

Parameters:

group : one of ('dev', 'eval')
The group to get the Z-norm probe files for.

Returns:

files : [bob.bio.base.database.BioFile]
The unique list of files used to compute the Z-norm.

zobjects(groups=None, protocol=None, **kwargs)[source]

This function returns the File objects of the Z-Norm impostor files of the given groups for the given protocol.

Keyword parameters:

groups : str or [str]
The groups for which the File objects should be returned. Usually, groups are one or more elements of ('dev', 'eval').
protocol : str
The protocol for which the File objects should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.

Grid Configuration

class bob.bio.base.grid.Grid(grid_type='sge', number_of_preprocessing_jobs=32, number_of_extraction_jobs=32, number_of_projection_jobs=32, number_of_enrollment_jobs=32, number_of_scoring_jobs=32, training_queue='8G', preprocessing_queue='default', extraction_queue='default', projection_queue='default', enrollment_queue='default', scoring_queue='default', number_of_parallel_processes=1, scheduler_sleep_time=1.0)[source]

Bases: object

This class is defining the options that are required to submit parallel jobs to the SGE grid, or jobs to the local queue.

If the given grid_type is 'sge' (the default), this configuration is set up to submit algorithms to the SGE grid. In this setup, specific SGE queues can be specified for different steps of the tool chain, and different numbers of parallel processes can be specified for each step. Currently, only the SGE at Idiap is tested and supported; compatibility with other SGE installations is not guaranteed.

If the given grid_type is 'local', this configuration is set up to run using a local scheduler on a single machine. In this case, only the number_of_parallel_processes and scheduler_sleep_time options will be taken into account.

Parameters:

grid_type : one of ('sge', 'local')
The type of submission system to use. Currently, only 'sge' and 'local' submissions are supported.
number_of_preprocessing_jobs, number_of_extraction_jobs, number_of_projection_jobs, number_of_enrollment_jobs, number_of_scoring_jobs : int
Only valid if grid_type = 'sge'. The number of parallel processes that should be executed for preprocessing, extraction, projection, enrollment or scoring.
training_queue, preprocessing_queue, extraction_queue, projection_queue, enrollment_queue, scoring_queue : str or dict
Only valid if grid_type = 'sge'. SGE queues that should be used for training, preprocessing, extraction, projection, enrollment or scoring. The queue can be defined using a dictionary of keywords that will be passed directly to the gridtk.tools.qsub() function, or one of our PREDEFINED_QUEUES, which are adapted for Idiap.
number_of_parallel_processes : int
Only valid if grid_type = 'local'. The number of parallel processes, with which the preprocessing, extraction, projection, enrollment and scoring should be executed.
scheduler_sleep_time : float
The time (in seconds) that the local scheduler will sleep between its iterations.

queue(params) → dict[source]

This helper function translates the given queue parameters to grid options. When the given params are a dictionary already, they are simply returned. If params is a string, the PREDEFINED_QUEUES are indexed with them. If params is None, or the grid_type is 'local', an empty dictionary is returned.
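
The translation rules above can be sketched in a few lines. This is an illustration only: translate_queue is a hypothetical free function, EXAMPLE_QUEUES is a made-up stand-in for PREDEFINED_QUEUES (the real entries are site-specific), the 'memfree' key is invented for the example, and raising a ValueError for other input types is an assumption.

```python
# Hypothetical stand-in for PREDEFINED_QUEUES; keys and values are invented.
EXAMPLE_QUEUES = {
    "default": {},
    "8G": {"memfree": "8G"},
}


def translate_queue(params, grid_type="sge", predefined=EXAMPLE_QUEUES):
    """Translate queue parameters into a dictionary of grid options."""
    if params is None or grid_type == "local":
        return {}
    if isinstance(params, dict):
        return params  # dictionaries are passed through unchanged
    if isinstance(params, str):
        return predefined[params]  # strings index the predefined queues
    # Assumption: anything else is rejected.
    raise ValueError("unsupported queue parameters: %r" % (params,))


print(translate_queue("8G"))
print(translate_queue({"queue": "my.q"}))
print(translate_queue("8G", grid_type="local"))
```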

is_local()[source]

Returns whether this grid setup should use the local submission or the SGE grid.

bob.bio.base.grid.PREDEFINED_QUEUES

A dictionary of predefined queue keywords, which are adapted to the Idiap SGE.

Annotators

class bob.bio.base.annotator.Annotator(read_original_data=None, **kwargs)

Bases: object

Annotator class for all annotators. This class is meant to be used in conjunction with the bob bio annotate script.

read_original_data

callable – A function that loads the samples. The syntax is like bob.bio.base.read_original_data.

annotate(sample, **kwargs)[source]

Annotates a sample and returns annotations in a dictionary.

Parameters:
  • sample (numpy.ndarray) – The sample that is being annotated.
  • **kwargs – The extra arguments that may be passed.
Returns:

A dictionary containing the annotations of the biometric sample. If the program fails to annotate the sample, it should return an empty dictionary.

Return type:

dict

class bob.bio.base.annotator.Callable(callable, **kwargs)

Bases: bob.bio.base.annotator.Annotator

A class that wraps a callable object that annotates a sample into a bob.bio.base.annotator.Annotator object.

callable

callable – A callable with the following signature: annotations = callable(sample, **kwargs). It takes a numpy array and returns the annotations for that biometric sample as a dictionary. Please see Annotator for more information.
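
A callable with this signature might look like the sketch below. The function center_annotator and the 'center' key are purely illustrative. The sketch relies only on len() and indexing, so it runs on a nested list here and should behave the same on a 2-D numpy array.

```python
def center_annotator(sample, **kwargs):
    """Illustrative callable: annotate the center of a 2-D sample."""
    height = len(sample)
    width = len(sample[0]) if height else 0
    if height == 0 or width == 0:
        # Following the Annotator contract, a failure yields an empty dict.
        return {}
    return {"center": (height / 2.0, width / 2.0)}


sample = [[0, 0, 0, 0], [0, 0, 0, 0]]  # stand-in for a 2x4 image
annotations = center_annotator(sample)
print(annotations)
```

Such a function could then be passed as the callable argument of the Callable class documented here.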

annotate(sample, **kwargs)[source]

class bob.bio.base.annotator.FailSafe(annotators, required_keys, only_required_keys=False, **kwargs)

Bases: bob.bio.base.annotator.Annotator

A fail-safe annotator. This annotator takes a list of annotators and tries them one after another until all required annotations are available. The annotations of each annotator are passed on to the next one.

annotators

list – A list of annotators to try

required_keys

list – A list of keys that should be available in annotations to stop trying different annotators.

only_required_keys

bool – If True, the annotations will only contain the required_keys.

annotate(sample, **kwargs)[source]
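
The fail-safe strategy can be sketched with plain callables instead of Annotator objects. This is a minimal illustration under stated assumptions, not the class's implementation: fail_safe_annotate and the two example detectors are hypothetical, exceptions are treated as failed attempts, and an empty dictionary is returned when the required keys are never all found.

```python
def fail_safe_annotate(sample, annotators, required_keys, only_required_keys=False):
    """Try annotators in order until all required keys are present."""
    annotations = {}
    for annotator in annotators:
        try:
            # Earlier results are handed on to the next annotator.
            found = annotator(sample, annotations=dict(annotations))
        except Exception:
            found = None  # assumption: an exception counts as a failure
        if found:
            annotations.update(found)
        if all(key in annotations for key in required_keys):
            break
    else:
        # No break happened: the required keys were never all available.
        return {}
    if only_required_keys:
        annotations = {key: annotations[key] for key in required_keys}
    return annotations


def failing_detector(sample, **kwargs):
    raise RuntimeError("no face found")


def eye_detector(sample, **kwargs):
    return {"leye": (30, 60), "reye": (30, 20), "quality": 0.9}


result = fail_safe_annotate(
    None, [failing_detector, eye_detector], ["leye", "reye"], only_required_keys=True
)
print(result)
```

With only_required_keys=True the extra 'quality' entry is dropped from the result; with the default it would be kept.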

Baselines

bob.bio.base.baseline.get_config()[source]

Returns a string containing the configuration information.

class bob.bio.base.baseline.Baseline(name, preprocessors, extractor, algorithm, **kwargs)

Bases: object

Base class to define baselines

A Baseline is composed of the triplet bob.bio.base.preprocessor.Preprocessor, bob.bio.base.extractor.Extractor, and bob.bio.base.algorithm.Algorithm.

name

str – Name of the baseline. This name will be displayed in the command line interface.

preprocessors

dict – Dictionary containing all possible preprocessors

extractor

str – Registered resource or a config file containing the feature extractor

algorithm

str – Registered resource or a config file containing the algorithm

bob.bio.base.baseline.get_available_databases()[source]

Get all the available databases through the database entry-points

bob.bio.base.baseline.search_preprocessor(db_name, keys)[source]

Wrapper that searches for preprocessors for specific databases. If none is found, the default preprocessor is returned.
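
The fallback behavior can be illustrated with a small stand-alone function. Everything here is an assumption made for the example: pick_preprocessor_key is a hypothetical name, matching by name prefix is only one plausible rule, and the dictionary entries are invented. Only the "fall back to the default" behavior is taken from the description above.

```python
def pick_preprocessor_key(db_name, keys, default="default"):
    """Hypothetical sketch: pick the key matching db_name, else the default."""
    for key in keys:
        # Assumption: a key matches when the database name starts with it.
        if db_name.startswith(key):
            return key
    return default


# Invented example of a preprocessors dictionary as used by a Baseline.
preprocessors = {"default": "generic-crop", "mobio": "mobio-crop"}

print(pick_preprocessor_key("mobio-image", preprocessors))
print(pick_preprocessor_key("atnt", preprocessors))
```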