Connecting legacy bob.bio.base to Vanilla Biometrics¶
The transition to the pipeline concept changed the way data goes from the raw sample to the extracted features, and how the biometric algorithm is applied. However, a set of tools was implemented to support the older bob implementations (designated as legacy) of database, preprocessor, extractor, and algorithms.
This adaptation consists of wrapper classes that take a legacy bob class as input and constructs a Transformer
or BiometricAlgorithm
out of it.
Warning
A temporary folder is created in case a legacy bob package needs to write on disk during its operation. However, this folder is persistent between experiments. You should remove its content before running another experiment.
Legacy FileList Database interface¶
This is a similar database interface to the CSV file interface, but takes information from a series of two- or three-column files without header instead of CSV files and returns a legacy database (use a Database Connector to create a database interface).
The files are separated into three sets: 'world'
(training; optional), 'dev'
(development; required) and 'eval'
(evaluation; optional) set to be used by the biometric verification algorithm.
The summarized complete structure of the list base directory (here denoted as basedir
) containing all the files should be like this:
filelists_directory
|
+-- norm
| |
| +-- train_world.lst
| +-- train_optional_world_1.lst
| +-- train_optional_world_2.lst
|
+-- dev
| |
| +-- for_models.lst
| +-- for_probes.lst
| +-- for_scores.lst
| +-- for_tnorm.lst
| +-- for_znorm.lst
|
+-- eval
|
+-- for_models.lst
+-- for_probes.lst
+-- for_scores.lst
+-- for_tnorm.lst
+-- for_znorm.lst
The file lists contain several information that need to be available for the biometric recognition experiment to run properly. A complete list of possible information is:
filename
: The name of the data file, relative to the common root of all data files, and without file name extension.client_id
: The name or ID of the subject the biometric traces of which are contained in the data file. These names are handled asstr
objects, so001
is different from1
.model_id
:used for model enrollment: The name or ID of the client model that should be enrolled. In most cases, the
model_id
is identical to theclient_id
.used for scoring: The name or ID of the client model that the probe file should be compared with.
claimed_client_id
:used for scoring: The
client_id
of the client model that the probe file should be compared with.
The following list files need to be created:
For training (optional):
world file, with default name
train_world.lst
, in the default sub-directorynorm
. It is a 2-column file with format:filename client_id
two world files, with default names
train_optional_world_1.lst
andtrain_optional_world_2.lst
, in default sub-directorynorm
. The format is the same as for the world file. These files are not needed for most biometric recognition algorithms, hence, they need to be specified only if the algorithm uses them.
For enrollment:
one or two model files for the development (and evaluation) set, with default name
for_models.lst
in the default sub-directoriesdev
(andeval
). They are 3-column files with format:filename model_id client_id
For scoring:
There exist two different ways to implement file lists used for scoring.
The first (and simpler) variant is to define a file list of probe files, where all probe files will be tested against all models. Hence, you need to specify one (or two) probe files for the development (and evaluation) set, with the default name
for_probes.lst
in the default sub-directorydev
(andeval
). They are 2-column files with format:filename client_id
The other option is to specify a detailed list of which probe files should be compared with which client model, i.e., one (or two) score files for the development (and evaluation) set, with the default name
for_scores.lst
in the sub-directorydev
(andeval
). These files need to be provided only if the scoring is to be done selectively, meaning by creating a sparse probe/model scoring matrix. They are 4-column files with format:filename model_id claimed_client_id client_id
Note
The verification queries will use either only the probe or only the score files, so only one of them is mandatory. If only one of the two files is available, the scoring technique will be automatically determined. In case both probe and score files are provided, the user should set the parameter
use_dense_probe_file_list
, which specifies the files to consider, when creating the object of theDatabase
class.For ZT score normalization (optional):
Optionally, file lists for ZT score normalization can be added. These are:
one or two files for t-score normalization for the development (and evaluation) set, with default name
for_tnorm.lst
in both sub-directoriesdev
(andeval
). They are 3-column files with format:filename model_id client_id
one or two files for z-score normalization for the development (and evaluation) set, with default name
for_znorm.lst
in both sub-directoriesdev
(andeval
). They are 2-column files with format:filename client_id
Note
In all files, the lines starting with any number of white space and #
will be ignored.
Legacy Database Connector¶
This legacy database wrapper is used to translate an old bob.db
package functions into a bob pipelines database interface.
It uses objects()
to retrieve a list of files for each role (world
, references
, and probes
) and specified group (dev
and eval
) and creates the matching Sample
and SampleSet
lists.
This example shows the creation of the Mobio database interface in the bob.pipelines format from the legacy bob.db:
from bob.bio.face.database import MobioBioDatabase
from bob.bio.base.pipelines.vanilla_biometrics import DatabaseConnector
from bob.extension import rc
legacy_database = MobioBioDatabase(
original_directory=rc["bob.db.mobio.directory"],
annotation_directory=rc["bob.db.mobio.annotation_directory"],
original_extension=".png",
protocol="mobile0-male",
)
# Converts to a Database interface for bob.pipelines
database = DatabaseConnector(legacy_database)
# Sets the optimization flag
database.allow_scoring_with_all_biometric_references = True
Legacy Preprocessor wrapper¶
The PreprocessorTransformer
wrapper takes a :py:class`bob.bio.base.preprocessor` from the old bob.bio.base
as input and creates a Transformer
out of it.
The __call__()
method of the :py:class`~bob.bio.base.preprocessor` class is called when the Transformer.transform()
method is called.
This example shows how to create a Transformer
out of a legacy preprocessor (FaceCrop, from bob.bio.face):
from bob.bio.face.preprocessor import FaceCrop
from bob.bio.base.transformers import PreprocessorTransformer
# Initialize the legacy Preprocessor
legacy_preprocessor = FaceCrop(
cropped_size=(80,64),
cropped_positions={'leye':'16,15', 'reye':'16,48'},
fixed_positions={'leye':'50,24', 'reye','50,64'}
)
# Create the Transformer
preprocessor_transformer = PreprocessorTransformer( legacy_preprocessor )
Legacy Extractor wrapper¶
A similar wrapper is available for the legacy bob.bio.base
Extractor. It is the ExtractorTransformer
.
It maps the Transformer.transform()
method to the __call__()
of the legacy Extractor.
Here is an example showing how to create a Transformer
from a legacy Extractor (Linearize, from bob.bio.base):
from bob.bio.base.extractor import Linearize
from bob.bio.base.transformers import ExtractorTransformer
# Create the Transformer from the legacy Extractor
extractor_transformer = ExtractorTransformer( Linearize() )
Legacy Algorithm wrappers¶
Lastly, AlgorithmTransformer
and BioAlgorithmLegacy
are available to map correctly a legacy Algorithm to a Transformer
and a BioAlgorithm
.
Those two adaptors are needed as the legacy Algorithm could consist of a projector that could be trainable (with methods project()
and train_projector()
), which correspond to a Transformer
in the new API.
The enrollment and scoring of the legacy algorithm were done using the enroll()
and score()
methods, which can be mapped to the same methods in a BioAlgorithm
.
Here is an example showing how to create the Transformer
out of a bob.bio.base Algorithm (Distance
):
from bob.bio.base.algorithm import Distance
from bob.bio.base.transformers import AlgorithmTransformer
import scipy.spatial
legacy_algorithm = Distance(
distance_function = scipy.spatial.distance.cosine,
is_distance_function = True
)
# Create the BioAlgorithm from the legacy Algorithm
algorithm_transformer = AlgorithmTransformer( legacy_algorithm )
And here is an example of the creation of the BioAlgorithm
from the bob.bio.base Algorithm (Distance
) with the BioAlgorithmLegacy
.
This will map correctly the enroll()
and score()
methods:
from bob.bio.base.algorithm import Distance
from bob.bio.base.pipelines.vanilla_biometrics.legacy import BioAlgorithmLegacy
import scipy.spatial
legacy_algorithm = Distance(
distance_function = scipy.spatial.distance.cosine,
is_distance_function = True
)
bio_algorithm = BioAlgorithmLegacy(legacy_algorithm)