Score file conversion
Sometimes the score files generated by Bob need to be exported to a different format, e.g., to generate a plot comparing Bob’s systems with other systems. This package provides source code to convert between different types of score files.
Bob to OpenBR
One of the supported formats is the matrix format used by the National
Institute of Standards and Technology (NIST), which is also supported by OpenBR.
The scores are stored in two binary matrices: the first matrix (usually with a
.mtx filename extension) contains the raw scores, while a second mask matrix
(extension .mask) indicates which scores are positives and which are negatives.
To convert from Bob’s four-column or five-column score file to a pair of these
matrices, you can use the bob.bio.base.score.openbr.write_matrix() function.
In its simplest form, this function takes a score file 'five-column-score-file'
and writes the pair 'openbr.mtx', 'openbr.mask' of OpenBR-compatible files:
>>> import bob.bio.base.score.openbr
>>> bob.bio.base.score.openbr.write_matrix('five-column-score-file', 'openbr.mtx', 'openbr.mask', score_file_format='5column')
In this way, the score file is parsed and the matrices are written in the same
order in which the entries appear in the score file.
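The mapping from score file lines to the two matrices can be sketched in plain Python. This is an illustration only, not Bob's implementation: the rows are made-up data, the matrix orientation (one row per probe, one column per model) and the mask values 0xff, 0x7f and 0x00 are assumptions made for this sketch, and build_matrices is a hypothetical helper.

```python
import math

# Hypothetical five-column rows: claimed id, model name, real id, probe path, score.
ROWS = [
    "alice alice_m1 alice probes/a1.png 0.91",
    "alice alice_m1 bob probes/b1.png 0.12",
    "bob bob_m1 alice probes/a1.png 0.08",
    "bob bob_m1 bob probes/b1.png 0.87",
]

# Mask encoding assumed for this sketch: positive, negative, ignored.
POSITIVE, NEGATIVE, IGNORED = 0xFF, 0x7F, 0x00


def build_matrices(rows):
    """Return (model_names, probe_names, scores, mask) in order of first appearance."""
    model_names, probe_names, entries = [], [], []
    for row in rows:
        claimed_id, model, real_id, probe, score = row.split()
        if model not in model_names:
            model_names.append(model)
        if probe not in probe_names:
            probe_names.append(probe)
        # A score is a positive when the claimed and the real client id agree.
        entries.append((model, probe, claimed_id == real_id, float(score)))

    # One row per probe, one column per model; missing pairs stay NaN and ignored.
    scores = [[math.nan] * len(model_names) for _ in probe_names]
    mask = [[IGNORED] * len(model_names) for _ in probe_names]
    for model, probe, is_positive, score in entries:
        p, m = probe_names.index(probe), model_names.index(model)
        scores[p][m] = score
        mask[p][m] = POSITIVE if is_positive else NEGATIVE
    return model_names, probe_names, scores, mask


model_names, probe_names, scores, mask = build_matrices(ROWS)
```

Note that the resulting matrices carry only scores and positive/negative flags; the names collected along the way are not part of them.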
For most applications, this should be sufficient. However, as the identity
information is lost in the matrix files, no deeper analysis is possible when
using only the matrices. To enforce an order of the models and probes inside
the matrices, you can use the model_names and probe_names parameters of
bob.bio.base.score.openbr.write_matrix():
The probe_names parameter lists the path elements stored in the score files,
which are the fourth column in a 5column file and the third column in a 4column
file, see bob.bio.base.score.load.five_column() and
bob.bio.base.score.load.four_column().
The model_names parameter is a bit more complicated. In a 5column format score
file, the model names are defined by the second column of that file, see
bob.bio.base.score.load.five_column(). In a 4column format score file, the model
information is not contained, only the client information of the model. Hence,
for the 4column format, model_names actually lists the client ids found in the
first column, see bob.bio.base.score.load.four_column().
Warning
The model information is required to write the matrix files, but it is not
available in the 4column format, where client ids are used instead. Hence,
when several models exist per client, this function will not work as expected.
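A small sketch can make the difference between the two formats concrete. It assumes the column layouts described above; name_lists is a hypothetical helper written for this illustration, and the rows are made-up data in which one client has two models.

```python
def name_lists(rows, score_file_format):
    """Collect model and probe names in order of first appearance."""
    model_names, probe_names = [], []
    for row in rows:
        columns = row.split()
        if score_file_format == "5column":
            # claimed id, model name, real id, probe path, score
            model, probe = columns[1], columns[3]
        elif score_file_format == "4column":
            # claimed id, real id, probe path, score: there is no model column,
            # so the claimed client id has to stand in for the model.
            model, probe = columns[0], columns[2]
        else:
            raise ValueError("unknown format: %s" % score_file_format)
        if model not in model_names:
            model_names.append(model)
        if probe not in probe_names:
            probe_names.append(probe)
    return model_names, probe_names


# Hypothetical data: client "alice" has two models.
five = [
    "alice alice_m1 alice probes/a1.png 0.91",
    "alice alice_m2 alice probes/a1.png 0.88",
]
# The same scores with the model column dropped, i.e., in four-column form.
four = [" ".join(r.split()[:1] + r.split()[2:]) for r in five]

models_5, probes_5 = name_lists(five, "5column")
models_4, probes_4 = name_lists(four, "4column")
```

In the five-column case two distinct model names are found, while the four-column case yields a single entry for the client, so both scores would compete for the same matrix cell.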
Additionally, the matrix files contain fields that name the gallery and probe
list files used to generate the matrix. These file names can be set with the
gallery_file_name and probe_file_name keyword parameters of
bob.bio.base.score.openbr.write_matrix().
Finally, OpenBR defines a specific 'search' score file format, which is
designed for computing CMC curves. In this format, the score matrix contains,
for each probe, a list of the model scores that is sorted in descending order
and possibly truncated. To generate this special score file format, you can
specify the search parameter, which gives the number of highest scores per
probe that should be kept. If search is set to a negative value, all scores are
kept. If search is higher than the actual number of models, NaN scores are
appended, and the corresponding mask values are set to 0 (i.e., to be ignored).
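The per-probe behaviour of the search parameter can be sketched as follows. This is a minimal illustration of the rules stated above, not Bob's code: search_row is a hypothetical helper, the example values are made up, and the input scores are assumed to contain no NaN values.

```python
import math

IGNORED = 0x00  # mask value for entries that are to be ignored


def search_row(scores, mask, search):
    """Sort one probe's scores in descending order, then truncate or pad to `search`."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    sorted_scores = [scores[i] for i in order]
    sorted_mask = [mask[i] for i in order]
    if search < 0:
        # Negative value: keep all scores.
        return sorted_scores, sorted_mask
    # Pad with NaN (and "ignored" mask entries) if fewer models than `search` exist.
    padding = max(0, search - len(scores))
    return (
        sorted_scores[:search] + [math.nan] * padding,
        sorted_mask[:search] + [IGNORED] * padding,
    )


# Hypothetical scores of one probe against three models, with their mask values.
row_scores = [0.2, 0.9, 0.5]
row_mask = [0x7F, 0xFF, 0x7F]

top_two = search_row(row_scores, row_mask, 2)
everything = search_row(row_scores, row_mask, -1)
padded = search_row(row_scores, row_mask, 5)
```

The mask entries are reordered together with the scores, so each kept score still carries its positive or negative label.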
OpenBR to Bob
On the other hand, you might also want to generate a Bob-compatible
(four-column or five-column) score file from a pair of OpenBR matrix and mask
files. This is possible with the bob.bio.base.score.openbr.write_score_file()
function. In its basic form, it takes the given pair of matrix and mask files,
as well as the desired output score file:
>>> bob.bio.base.score.openbr.write_score_file('openbr.mtx', 'openbr.mask', 'four-column-score-file')
This score file is sufficient to compute a CMC curve (see bob.measure); however,
it does not contain relevant client ids or paths for models and probes.
In particular, it assumes that each client has exactly one associated model.
To add or correct this information, you can pass additional parameters to
bob.bio.base.score.openbr.write_score_file(). Client ids of models and probes
can be added using the models_ids and probes_ids keyword arguments. The lengths
of these lists must be identical to the number of models and probes given in
the matrix files, and the lists must be in the same order that was used to
compute the OpenBR matrix. In particular, the ids must produce the same
same-client and different-client pairs as indicated by the OpenBR mask, which
is checked inside the function.
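The kind of consistency check described here can be sketched in a few lines. This is not the check that Bob performs internally, only an illustration of the idea: ids_match_mask is a hypothetical helper, and the matrix orientation (rows are probes, columns are models) and the mask values are assumptions made for this sketch.

```python
POSITIVE, NEGATIVE, IGNORED = 0xFF, 0x7F, 0x00  # assumed mask encoding


def ids_match_mask(mask, models_ids, probes_ids):
    """Return True if the id lists reproduce the positives and negatives of the mask."""
    if len(mask) != len(probes_ids):
        return False
    for p, row in enumerate(mask):
        if len(row) != len(models_ids):
            return False
        for m, value in enumerate(row):
            if value == IGNORED:
                continue  # ignored entries impose no constraint
            same_client = models_ids[m] == probes_ids[p]
            if same_client != (value == POSITIVE):
                return False
    return True


# Hypothetical mask for two probes (rows) and two models (columns).
example_mask = [
    [POSITIVE, NEGATIVE],
    [NEGATIVE, POSITIVE],
]

right_order = ids_match_mask(example_mask, ["alice", "bob"], ["alice", "bob"])
wrong_order = ids_match_mask(example_mask, ["bob", "alice"], ["alice", "bob"])
too_short = ids_match_mask(example_mask, ["alice"], ["alice", "bob"])
```

Swapping the order of the model ids flips which pairs count as same-client, which is why the lists have to follow the order used when the matrix was computed.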
To add model and probe path information, use the model_names and probe_names
parameters, which need to have the same size and order as models_ids and
probes_ids. This information is simply stored in the score file, and no
further check is applied.
Note
The model_names parameter is used only when writing score files with
score_file_format='5column'; in the '4column' format, this parameter is ignored.
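How the ids and names could end up in the lines of the two score file formats is sketched below. The column order follows the format descriptions in the first part of this page; score_lines is a hypothetical helper, the number formatting and the skipping of NaN entries are assumptions of this sketch, and the values are made up.

```python
def score_lines(scores, models_ids, probes_ids, probe_names,
                model_names=None, score_file_format="4column"):
    """Turn a probes-by-models score matrix into score file lines."""
    lines = []
    for p, row in enumerate(scores):
        for m, score in enumerate(row):
            if score != score:  # NaN marks a missing score in this sketch
                continue
            if score_file_format == "5column":
                # claimed id, model name, real id, probe path
                fields = (models_ids[m], model_names[m], probes_ids[p], probe_names[p])
            else:
                # claimed id, real id, probe path; model_names is not used here
                fields = (models_ids[m], probes_ids[p], probe_names[p])
            lines.append(" ".join(fields) + " %.8f" % score)
    return lines


# Hypothetical matrix with one probe (row) and two models (columns).
example = dict(
    scores=[[0.91, 0.08]],
    models_ids=["alice", "bob"],
    probes_ids=["alice"],
    probe_names=["probes/a1.png"],
    model_names=["alice_m1", "bob_m1"],
)

four_column = score_lines(score_file_format="4column", **example)
five_column = score_lines(score_file_format="5column", **example)
```

Passing model_names in the four-column case has no effect on the output, which mirrors the behaviour described in the note above.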