Running Biometric Recognition Experiments¶
Now, you are ready to run your first biometric recognition experiment.
Running Experiments (part I)¶
To run an experiment, we provide a generic script verify.py.
By default, verify.py accepts one or more configuration files that include the parametrization of the experiment to run.
A configuration file contains one or more variables that define parts of the experiment.
When several configuration files are specified, the variables of the latter will overwrite the ones of the former.
For simplicity, here we discuss only a single configuration file.
As a start, we have implemented a shortcut to generate an empty configuration file that contains all possible variables, each of which is documented and commented out:
$ verify.py --create-configuration-file experiment.py
Note
The generated experiment.py
is a regular python file, so you can include any regular python code inside this file.
Alright, let's have a look into this file. Whoops, that's a lot of variables! But no worries, most of them have proper default values. Only five variables are required, and they are sufficient to define the complete biometric recognition experiment. These five variables are:
database: The database to run the experiments on
preprocessor: The data preprocessor
extractor: The feature extractor
algorithm: The recognition algorithm
sub_directory: A descriptive name for your experiment, which will serve as a sub-directory
The first four variables, i.e., the database, the preprocessor, the extractor and the algorithm, can be specified in several different ways.
To start with, we will use only the registered Resources.
These resources define the source code that will be used to compute the experiments, as well as all the meta-parameters of the algorithms (which we will call the configuration).
To get a list of registered resources, please call:
$ resources.py
Each package in bob.bio
defines its own resources, and the printed list of registered resources differs according to the installed packages.
If only bob.bio.base
is installed, no databases and only one preprocessor will be listed.
To see more details about the resources, i.e., the full constructor call for the respective class, use the --details
(or shortly -d
) option, and to sub-select only specific types of resources, use the --types
(or -t
) option:
$ resources.py -dt algorithm
Note
You will also find some grid resources being listed.
These types of resources will be explained later.
Before going into more details about the configurations, we will provide information about running default experiments.
One variable, which is not required, but recommended, is verbose.
By default, the algorithms are set up to execute quietly, and only errors are reported (logging.ERROR).
To change this behavior, you can set the verbose variable to show:
1. Warning messages (logging.WARN)
2. Informative messages (logging.INFO)
3. Debug messages (logging.DEBUG)
When running experiments, my personal preference is verbosity level 2
.
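As an illustration only, the relation between the verbose value and the logging levels listed above can be sketched in a few lines of Python. The helper verbosity_to_level is hypothetical and not part of bob.bio.base, and clamping out-of-range values is an assumption of this sketch.

```python
import logging

# Verbosity value -> logging level, following the list above.
# Value 0 is the quiet default, where only errors are reported.
VERBOSITY_LEVELS = {
    0: logging.ERROR,
    1: logging.WARNING,  # logging.WARN is an alias of logging.WARNING
    2: logging.INFO,
    3: logging.DEBUG,
}


def verbosity_to_level(verbose):
    """Return the logging level for a verbosity value (hypothetical helper)."""
    # Clamp to the known range; this is an assumption of the sketch.
    clamped = max(0, min(verbose, 3))
    return VERBOSITY_LEVELS[clamped]


print(logging.getLevelName(verbosity_to_level(2)))
```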
So, a minimal configuration file (say: pca_atnt.py
) would look something like:
database = 'atnt'
preprocessor = 'face-detect'
extractor = 'linearize'
algorithm = 'pca'
sub_directory = 'PCA_ATNT'
verbose = 2
Running the experiment is then as simple as:
$ verify.py pca_atnt.py
Note
To run exactly the command line from above, bob.bio.face needs to be installed.
Note
Chain loading is possible through configuration files, i.e., the variables of each config file are available during evaluation of the following config file.
This allows us to spread our experiment setup in several configuration files and have a call similar to this:
$ verify.py config_1.py config_2.py config_n.py
For more information see Chain Loading in Python-based Configuration System.
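The idea behind chain loading can be sketched as follows. This is a simplified illustration, not the loader that verify.py actually uses: the function load_configs is hypothetical, and the two configuration files are given as strings to keep the example self-contained.

```python
def load_configs(sources):
    """Evaluate configuration sources in order inside one shared namespace."""
    namespace = {}
    for source in sources:
        # Later sources see, and may overwrite, variables of earlier ones.
        exec(source, namespace)
    # Drop the builtins entry that exec adds to the namespace.
    namespace.pop("__builtins__", None)
    return namespace


# Two hypothetical configuration files.
config_1 = "algorithm = 'pca'\nsub_directory = 'first'\n"
config_2 = "sub_directory = algorithm.upper() + '_second'\n"

merged = load_configs([config_1, config_2])
print(merged)
```

Here the second file reads the algorithm variable set by the first one and overwrites sub_directory, which mirrors the two properties described above.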
Before running an experiment, it is recommended to set the variable dry_run = True, so that verify.py only prints which steps would be executed, without actually executing them.
This lets you make sure that everything works as expected.
The final result of the experiment will be one (or more) score file(s).
Usually, they will be called something like scores-dev
.
By default, you can find them in a sub-directory of the result directory, but you can change this location using the result_directory variable.
Note
At Idiap, the default result directory differs, see verify.py --help
for your directory.
Command Line Options¶
Each configuration can also directly be specified as command line option of verify.py
.
Note
Command line options have a long version starting with --
and often a short version starting with a single -
.
Here, only the long names of the arguments are listed, please refer to verify.py --help
(or short: verify.py -h
) for the abbreviations.
Usually, the (long version of the) command line parameter is identical to the variable name, where _
characters are replaced by -
, and all options start with --
.
For example, the sub_directory
variable can also be set by the --sub-directory
command line option.
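The naming rule can be written down as a pair of small helper functions. Both functions are hypothetical and only illustrate the rule stated above; they are not part of verify.py.

```python
def option_name(variable):
    """Long command line option for a configuration variable name."""
    return "--" + variable.replace("_", "-")


def variable_name(option):
    """Configuration variable name for a long command line option."""
    if not option.startswith("--"):
        raise ValueError("expected a long option starting with '--'")
    return option[2:].replace("-", "_")


print(option_name("sub_directory"))
print(variable_name("--dry-run"))
```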
Only the --verbose option differs: you can use the --verbose option several times to increase verbosity, e.g., --verbose --verbose (or short -vv) increases verbosity to 2 (alias logging.INFO).
Generally, options defined on the command line will overwrite variables inside the configuration file(s).
Note
Required options need to be specified at least in either the configuration file or on command line. If all options are given on the command line, the configuration file can be omitted completely.
The exact same experiment as above can, hence, be executed using:
$ verify.py --database atnt --preprocessor face-detect --extractor linearize --algorithm pca --sub-directory PCA_ATNT -vv
Note
When running an experiment twice, you might realize that the second execution of the same experiment is much faster than the first one.
This is due to the fact that those parts of the experiment, which have been successfully executed before (i.e., the according files already exist), are skipped.
To override this behavior, i.e., to always regenerate all parts of the experiments, you can set force = True
.
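The skipping logic amounts to a check of whether the output file already exists. The following sketch illustrates that idea under simplifying assumptions; needs_processing and the file name are hypothetical, and the real tool chain performs this check per file and per step.

```python
import os
import tempfile


def needs_processing(output_path, force=False):
    """Return True if the output has to be (re-)generated."""
    return force or not os.path.exists(output_path)


with tempfile.TemporaryDirectory() as directory:
    output = os.path.join(directory, "preprocessed.dat")  # hypothetical file
    first_run = needs_processing(output)  # nothing has been written yet
    with open(output, "w") as f:
        f.write("data")
    second_run = needs_processing(output)  # file exists, so it is skipped
    forced_run = needs_processing(output, force=True)  # force overrides

print(first_run, second_run, forced_run)
```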
While we recommend using a configuration file to declare your experiment, some variables might be faster to change on the command line, such as --dry-run
, --verbose
, --force
(see above), --parallel N
, or --skip-...
(see below).
However, to be consistent, throughout this documentation we document the options as variables.
Evaluating Experiments¶
After the experiment has finished successfully, one or more text files
containing all the scores are written. This section presents commands that help to
quickly evaluate a set of scores by generating metrics or plots.
The scripts take as input either a 4-column or 5-column data format as
specified in the documentation of
bob.bio.base.score.load.four_column()
or
bob.bio.base.score.load.five_column()
.
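To give an idea of what such a score file contains, here is a minimal sketch that splits four-column lines into negative (impostor) and positive (genuine) scores. It assumes that each line holds claimed_id, real_id, test_label and score separated by whitespace; please check the documentation of the loading functions named above for the authoritative format. The function split_four_column and the sample lines are made up for this example.

```python
def split_four_column(lines):
    """Split four-column score lines into negative and positive scores."""
    negatives, positives = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        claimed_id, real_id, test_label, score = line.split()
        # A comparison is genuine when the claimed and the real identity match.
        target = positives if claimed_id == real_id else negatives
        target.append(float(score))
    return negatives, positives


# Made-up example scores.
example = [
    "alice alice probe_1 0.9",
    "alice bob probe_2 -0.4",
    "bob bob probe_3 0.7",
    "bob alice probe_4 0.1",
]
negatives, positives = split_four_column(example)
print(negatives, positives)
```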
Please note that there exists another file called Experiment.info
inside
the result directory. This file is a pure text file and contains the complete
configuration of the experiment. With this configuration it is possible to
inspect all default parameters of the algorithms, and even to re-run the exact
same experiment.
Metrics¶
To calculate the threshold using a certain criterion (EER (default), FAR or min.HTER) on a development set and apply it on an evaluation set, just do:
$ bob bio metrics -e {dev,test}-4col.txt --legends ExpA --criterion min-hter
[Min. criterion: MIN-HTER ] Threshold on Development set `ExpA`: -4.830500e-03
======  ======================  =================
ExpA    Development dev-4col    Eval. test-4col
======  ======================  =================
FtA     0.0%                    0.0%
FMR     6.7% (35/520)           2.5% (13/520)
FNMR    6.7% (26/390)           6.2% (24/390)
FAR     6.7%                    2.5%
FRR     6.7%                    6.2%
HTER    6.7%                    4.3%
======  ======================  =================
Note
When evaluation scores are provided, the --eval option must be passed.
See metrics --help for further options.
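The rates in the table above are related in a simple way: the HTER is the average of FMR and FNMR at the chosen threshold. The sketch below recomputes the development column from the counts shown in the table and adds a hypothetical helper error_rates for raw scores. It assumes that a score greater than or equal to the threshold counts as an accepted comparison; it is an illustration, not the implementation behind bob bio metrics.

```python
def error_rates(negatives, positives, threshold):
    """Return (FMR, FNMR, HTER) at the given threshold."""
    # Assumption: a score >= threshold counts as an accepted comparison.
    false_matches = sum(1 for s in negatives if s >= threshold)
    false_non_matches = sum(1 for s in positives if s < threshold)
    fmr = false_matches / len(negatives)
    fnmr = false_non_matches / len(positives)
    return fmr, fnmr, (fmr + fnmr) / 2


# Counts taken from the development column of the table above.
fmr_dev = 35 / 520
fnmr_dev = 26 / 390
hter_dev = (fmr_dev + fnmr_dev) / 2
print(f"FMR {fmr_dev:.1%}, FNMR {fnmr_dev:.1%}, HTER {hter_dev:.1%}")
```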
You can also compute measures such as recognition rate (rr
), Cllr and
minCllr (cllr
) and minDCF (mindcf
) by passing the corresponding option.
For example:
$ bob bio metrics -e {dev,test}-4col.txt --legends ExpA --criterion cllr
======  ======================  ================
Computing Cllr and minCllr...
=======  ======================  ================
None     Development dev-4col    eval test-4col
=======  ======================  ================
Cllr     0.9%                    0.9%
minCllr  0.2%                    0.2%
=======  ======================  ================
Note
You must provide files in the correct format depending on the measure you
want to compute. For example, recognition rate takes cmc type files. See
bob.bio.base.score.load.cmc()
.
Plots¶
Customizable plotting commands are available in the bob.bio.base
module. They take a list of development and/or evaluation files and generate a
single PDF file containing the plots. Available plots are:
roc (receiver operating characteristic)
det (detection error trade-off)
epc (expected performance curve)
hist (histograms of scores with threshold line)
cmc (cumulative match characteristic)
dir (detection & identification rate)
Use the --help option on the above-cited commands to find out about more
options.
For example, to generate a CMC curve from development and evaluation datasets:
$ bob bio cmc -e -v --output 'my_cmc.pdf' dev-1.txt eval-1.txt dev-2.txt eval-2.txt
where my_cmc.pdf will contain CMC curves for the two experiments.
Note
By default, det, roc, cmc and dir plot development and evaluation curves on
different plots. You can gather everything in the same plot using the
--no-split option.
Note
The --figsize
and --style
options are two powerful options that can
dramatically change the appearance of your figures. Try them! (e.g.
--figsize 12,10 --style grayscale
)
Evaluate¶
A convenient command evaluate is provided to generate multiple metrics and plots for a list of experiments. It generates two metrics outputs with EER, HTER, Cllr, minDCF criteria along with roc, det, epc, hist plots for each experiment. For example:
$ bob bio evaluate -e -v -l 'my_metrics.txt' -o 'my_plots.pdf' {sys1,sys2}/{dev,eval}
will output metrics and plots for the two experiments (dev and eval pairs) in my_metrics.txt and my_plots.pdf, respectively.
Running in Parallel¶
One important property of the verify.py
script is that it can run in parallel, using either several processes on the local machine, or an SGE grid.
To achieve that, bob.bio
is well-integrated with our SGE grid toolkit GridTK, which we have selected as a python package in the Installation section.
The verify.py
script can submit jobs either to the SGE grid, or to a local scheduler, keeping track of dependencies between the jobs.
The GridTK keeps a list of jobs in a local database, which by default is called submitted.sql3
, but which can be overwritten with the gridtk_database_file
variable.
Please refer to the GridTK documentation for more details on how to use the Job Manager jman
.
Two different types of grid
resources are defined, which can be used with the grid
variable.
The first type of resources will submit jobs to an SGE grid.
They are mainly designed to run in the Idiap SGE grid and might need some adaptations to run on your grid.
The second type of resources will submit jobs to a local queue, which needs to be run by hand (e.g., using jman --local run-scheduler --parallel 4
), or by setting the variable run_local_scheduler = True
.
The difference between the two types of resources is that the local submission usually starts with local-
, while the SGE resource does not.
You can also re-nice the parallel jobs by setting the nice
variable accordingly.
To run an experiment in parallel on the local machine, you can also use the simple variable parallel = N
, which will run the experiments in N
parallel processes on your machine.
Here, N
can be any positive integer – but providing N
greater than the number of processor threads of your machine will rather slow down processing.
Basically, parallel = N
is a shortcut for:
grid = bob.bio.base.grid.Grid("local", number_of_parallel_processes=N)
run_local_scheduler = True
stop_on_failure = True
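Because the configuration file is regular python code, the value of N can be capped at the number of processor threads of the machine. The following fragment is a suggestion only; the name requested and its value are hypothetical, and only the parallel variable is read by the experiment.

```python
import os

requested = 8  # hypothetical number of processes you would like to use
available = os.cpu_count() or 1  # os.cpu_count() may return None

# Never ask for more processes than there are processor threads.
parallel = max(1, min(requested, available))
print(parallel)
```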
Warning
Some of the processes require a lot of memory, which is multiplied by N
when you run in N
parallel processes.
There is no check implemented to avoid that.
Variables to change Default Behavior¶
In addition to the required variables discussed above, there are several variables to modify the behavior of the experiments.
One set of variables changes the directory structure of the output.
By default, intermediate (temporary) files are written to the temp directory, which can be overridden by the temp_directory variable, which expects relative or absolute paths.
Re-using Parts of Experiments¶
If you want to re-use parts of previous experiments, you can specify the directories (which are relative to the temp_directory
, but you can also specify absolute paths):
preprocessed_directory
extracted_directory
projected_directory
models_directories (one each for the models and the ZT-norm models, see below)
or even trained extractor, projector, or enroller (i.e., the results of the extractor, projector, or enroller training):
extractor_file
projector_file
enroller_file
For that purpose, it is also useful to skip parts of the tool chain.
To do that you can set these variables to True
:
skip_preprocessing
skip_extractor_training
skip_extraction
skip_projector_training
skip_projection
skip_enroller_training
skip_enrollment
skip_score_computation
skip_concatenation
skip_calibration
although by default files that already exist are not re-created.
You can use the force
variable combined with the skip_
variables (in which case the skip is preferred).
To (re-)run just a sub-selection of the tool chain, you can also use the execute_only
variable, which takes a list of options out of: preprocessing
, extractor-training
, extraction
, projector-training
, projection
, enroller-training
, enrollment
, score-computation
, concatenation
or calibration
.
This option is particularly useful for debugging purposes.
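How such a selection narrows down the tool chain can be pictured with the following sketch. The function steps_to_run is hypothetical; in particular, the way it combines a sub-selection with skipped steps is an assumption of this example and not taken from the verify.py source.

```python
# Step names as accepted by the execute_only variable, in execution order.
STEPS = [
    "preprocessing", "extractor-training", "extraction",
    "projector-training", "projection", "enroller-training",
    "enrollment", "score-computation", "concatenation", "calibration",
]


def steps_to_run(execute_only=None, skip=()):
    """Return the steps that remain after sub-selection and skipping."""
    if execute_only is None:
        selected = list(STEPS)
    else:
        selected = [step for step in STEPS if step in execute_only]
    # Assumption of this sketch: skipped steps are always removed.
    return [step for step in selected if step not in skip]


print(steps_to_run(execute_only=["projection", "extraction"]))
print(steps_to_run(skip=["concatenation", "calibration"]))
```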
Database-dependent Variables¶
Many databases define several protocols that can be executed.
To change the protocol, you can either modify the configuration file, or simply use the protocol
variable.
Some databases define several kinds of evaluation setups.
For example, often two groups of data are defined, a so-called development set and an evaluation set.
The scores of the two groups will be concatenated into two files called scores-dev and scores-eval, which are located in the score directory (see above).
In this case, by default only the development set is employed.
To use both groups, just specify groups = ['dev', 'eval']
(of course, you can also only use the 'eval'
set by setting groups = ['eval']
).
One score normalization technique is the so-called ZT score normalization.
To enable this, simply use the zt_norm
variable.
If the ZT-norm is enabled, two sets of scores will be computed, and they will be placed in two different sub-directories of the score directory, which are by default called nonorm and ztnorm, but which can be changed using the zt_score_directories
variable.
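The score files to expect for a given choice of groups and ZT-norm setting can be listed with a small helper. This is a sketch under stated assumptions: expected_score_files is hypothetical, paths are given relative to the score directory, and it is assumed that only the first directory (nonorm) is used when ZT-norm is disabled.

```python
def expected_score_files(groups=("dev",), zt_norm=False,
                         zt_score_directories=("nonorm", "ztnorm")):
    """List score files relative to the score directory (hypothetical)."""
    # Assumption: without ZT-norm, only the first directory is used.
    if zt_norm:
        directories = zt_score_directories
    else:
        directories = zt_score_directories[:1]
    return [
        directory + "/scores-" + group
        for directory in directories
        for group in groups
    ]


print(expected_score_files())
print(expected_score_files(groups=("dev", "eval"), zt_norm=True))
```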
Other Variables¶
Calibration¶
For some applications it is interesting to get calibrated scores.
Simply set the variable calibrate_scores = True
and another set of score files will be created by training the score calibration on the scores of the 'dev'
group and applying it to all available groups.
The scores will be located in the same directory as the nonorm and ztnorm scores, and the file names are calibrated-dev (and calibrated-eval if applicable).
Unsuccessful Preprocessing¶
In some cases, the preprocessor is not able to preprocess the data (e.g., for face image processing the face detector might not detect the face).
If you expect such cases to happen, you might want to use the allow_missing_files
variable.
When this variable is set to True
, missing files will be handled correctly throughout the whole processing chain, i.e.:
the data file is not used during training (in any step of the processing tool chain)
preprocessed data is not written
feature extraction is not performed for that file
the file is exempt from model enrollment; if no enrollment file is found for a model, no model file is written
if either the model or the probe file is not found, the corresponding score will be NaN. If several probe files are combined into one score, missing probe files will be ignored; if none of the probe files is found, the score is NaN.
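The last rule can be illustrated with a short sketch in which a missing probe file is represented by a NaN score. The function combine_probe_scores is hypothetical, and averaging is only one possible way of combining scores, chosen here for simplicity.

```python
import math


def combine_probe_scores(scores):
    """Average the valid scores; return NaN if every score is missing."""
    valid = [score for score in scores if not math.isnan(score)]
    if not valid:
        return float("nan")  # all probe files were missing
    return sum(valid) / len(valid)


nan = float("nan")
print(combine_probe_scores([0.2, nan, 0.4]))  # the missing probe is ignored
print(combine_probe_scores([nan, nan]))  # no valid probe is left
```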
Warning
At the moment, combining the allow_missing_files
and zt_norm
variables might result in unexpected behavior, as the ZT-Norm computation does not handle NaN
values appropriately.