Python API for bob.pad.base

Summary

Databases

bob.pad.base.pipelines.Database()

Base database class for PAD experiments.

bob.pad.base.database.FileListPadDatabase(...)

A PAD database interface from CSV files.

Score Functions

bob.pad.base.error_utils.split_csv_pad(filename)

Loads PAD scores from a CSV score file and splits them into attack and bona fide scores.

bob.pad.base.error_utils.split_csv_pad_per_pai(...)

Returns scores for Bona-Fide samples and scores for each PAI.

bob.pad.base.error_utils.calc_threshold(...)

Calculates the threshold based on the given method.

bob.pad.base.error_utils.apcer_threshold(...)

Computes the threshold given the desired APCER as the criteria.

bob.pad.base.error_utils.apcer_bpcer(...)

Computes APCER_PAI, APCER, and BPCER given the positive scores and a list of negative scores and a threshold.

Details

class bob.pad.base.pipelines.Database

Bases: object

Base database class for PAD experiments.

all_samples(groups: str | list[str] | None = None) → list[Sample]

Returns all the samples of the database in one list.

Passing groups restricts the samples obtained from predict_samples to those groups.

abstract fit_samples() → list[Sample]

Returns bob.pipelines.Sample objects to train a PAD model.

Returns:

samples – List of samples for model training.

Return type:

list

abstract predict_samples(group: str = 'dev') → list[Sample]

Returns bob.pipelines.Sample objects to be scored.

Parameters:

group (str, optional) – Limits samples to this group

Returns:

samples – List of samples to be scored.

Return type:

list
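As a rough illustration of the contract above, the sketch below implements the two abstract methods on a stand-in base class. The names PadDatabaseInterface and ToyDatabase, the plain dicts used in place of bob.pipelines.Sample, and the "train" group name are assumptions made for this example only; real code would subclass bob.pad.base.pipelines.Database instead.

```python
from abc import ABC, abstractmethod


class PadDatabaseInterface(ABC):
    """Hypothetical stand-in for the documented interface (not the bob class)."""

    @abstractmethod
    def fit_samples(self):
        """Return the samples used to train a PAD model."""

    @abstractmethod
    def predict_samples(self, group="dev"):
        """Return the samples of one group to be scored."""


class ToyDatabase(PadDatabaseInterface):
    """In-memory toy database; dicts stand in for bob.pipelines.Sample."""

    def __init__(self):
        # Assumed layout: one list of samples per group name.
        self._samples = {
            "train": [{"key": "train_1", "is_bonafide": True}],
            "dev": [
                {"key": "dev_1", "is_bonafide": True},
                {"key": "dev_2", "is_bonafide": False},
            ],
        }

    def fit_samples(self):
        return list(self._samples["train"])

    def predict_samples(self, group="dev"):
        return list(self._samples[group])


db = ToyDatabase()
print(len(db.fit_samples()), len(db.predict_samples("dev")))
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated.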

class bob.pad.base.database.FileListPadDatabase(name, dataset_protocols_path, protocol, transformer=None, **kwargs)

Bases: Database, FileListDatabase

A PAD database interface from CSV files.

fit_samples()

Returns bob.pipelines.Sample objects to train a PAD model.

Returns:

samples – List of samples for model training.

Return type:

list

predict_samples(group='dev')

Returns bob.pipelines.Sample objects to be scored.

Parameters:

group (str, optional) – Limits samples to this group

Returns:

samples – List of samples to be scored.

Return type:

list

purposes()

samples(groups=None, purposes=None)

Gets the samples of the given groups.

Parameters:

groups (str or list of str, optional) – The group or groups used to filter the samples, by default None

Returns:

A list containing the samples loaded from csv files.

Return type:

list

Utility functions for computing the EPSC curve and related measurements

bob.pad.base.error_utils.calc_threshold(method, pos, negs, all_negs, far_value=None, is_sorted=False)

Calculates the threshold based on the given method.

Parameters:
  • method (str) – One of bpcer20, eer, min-hter, apcer20, or far (which requires far_value).

  • pos (array_like) – The positive scores. They should be sorted!

  • negs (list) – A list of array_like negative scores. Each item in the list corresponds to scores of one PAI.

  • all_negs (array_like) – An array of all negative scores. This can be calculated from negs as well but we ask for it since you might have it already calculated.

  • far_value (float, optional) – If method is far, far_value and all_negs are used to calculate the threshold.

  • is_sorted (bool, optional) – If True, it means all scores are sorted and no sorting will happen.

Returns:

The calculated threshold.

Return type:

float

Raises:

ValueError – If method is unknown.

bob.pad.base.error_utils.apcer_threshold(desired_apcer, pos, *negs, is_sorted=False)

Computes the threshold given the desired APCER as the criteria.

APCER is computed as the maximum of all APCER_PAI values. The threshold is chosen such that the resulting APCER is at most the desired value.

Parameters:
  • desired_apcer (float) – The desired APCER value.

  • pos (list) – An array or list of positive scores in float.

  • *negs – A list of negative scores. Each item corresponds to the negative scores of one PAI.

  • is_sorted (bool, optional) – Set to True if ALL arrays (pos and negs) are sorted.

Returns:

The computed threshold that satisfies the desired APCER.

Return type:

float
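The idea of picking a threshold whose worst-case APCER_PAI stays at or below a target can be sketched in plain Python. This is not the library's implementation: the function names are hypothetical, the positive scores are left out because APCER depends only on attack scores, and the sketch assumes that a score greater than or equal to the threshold is accepted as bona fide.

```python
import math


def apcer_at(threshold, negs):
    """Max over PAIs of the fraction of attack scores accepted.

    Assumption: a score >= threshold is classified as bona fide.
    """
    return max(sum(s >= threshold for s in neg) / len(neg) for neg in negs)


def apcer_threshold_sketch(desired_apcer, *negs):
    """Smallest candidate threshold whose APCER is at most desired_apcer."""
    all_negs = sorted(s for neg in negs for s in neg)
    # Candidates: every attack score, plus a value just above the largest
    # one, which rejects every attack so a valid threshold always exists.
    candidates = all_negs + [math.nextafter(all_negs[-1], math.inf)]
    for threshold in candidates:  # ascending; APCER never increases
        if apcer_at(threshold, negs) <= desired_apcer:
            return threshold


# Hypothetical attack scores for two PAIs.
print_scores = [0.1, 0.2, 0.3, 0.4]
replay_scores = [0.05, 0.15, 0.25, 0.35]
threshold = apcer_threshold_sketch(0.25, print_scores, replay_scores)
print(threshold, apcer_at(threshold, [print_scores, replay_scores]))
```

Scanning candidates in ascending order returns the lowest threshold that meets the target, which keeps the rejection of bona fide samples as small as the constraint allows.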

bob.pad.base.error_utils.apcer_bpcer(threshold, pos, *negs)

Computes APCER_PAI, APCER, and BPCER given the positive scores and a list of negative scores and a threshold.

Parameters:
  • threshold (float) – The threshold to be used to compute the error rates.

  • pos (list) – An array or list of positive scores in float.

  • *negs – A list of negative scores. Each item corresponds to the negative scores of one PAI.

Returns:

A tuple of the form (list of APCER_PAI values, APCER, BPCER).

Return type:

tuple
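To make the three returned quantities concrete, here is a minimal sketch of the computation, not the library's code. The function name and the scores are hypothetical, and the sketch assumes that a score greater than or equal to the threshold is classified as bona fide.

```python
def apcer_bpcer_sketch(threshold, pos, *negs):
    """Return (list of APCER_PAI, APCER, BPCER) for one threshold."""
    # APCER_PAI: fraction of one PAI's attack scores accepted as bona fide.
    apcer_pais = [sum(s >= threshold for s in neg) / len(neg) for neg in negs]
    # APCER: the worst case over all PAIs.
    apcer = max(apcer_pais)
    # BPCER: fraction of bona fide scores rejected as attacks.
    bpcer = sum(s < threshold for s in pos) / len(pos)
    return apcer_pais, apcer, bpcer


# Hypothetical scores: four bona fide samples and two PAIs.
pos = [0.9, 0.7, 0.4, 0.8]
print_neg = [0.6, 0.3]
replay_neg = [0.2, 0.1]
apcer_pais, apcer, bpcer = apcer_bpcer_sketch(0.5, pos, print_neg, replay_neg)
print(apcer_pais, apcer, bpcer)
```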

bob.pad.base.error_utils.split_csv_pad_per_pai(filename, regexps=[], regexp_column='attack_type')

Returns scores for Bona-Fide samples and scores for each PAI. By default, the values in the attack_type column are used to identify each Presentation Attack Instrument (PAI).

For example, with default regexps and regexp_column, if you have scores like:

claimed_id, test_label,              is_bonafide, attack_type, score
001,        bona_fide_sample_1_path, True,        ,            0.9
001,        print_sample_1_path,     False,       print,       0.6
001,        print_sample_2_path,     False,       print,       0.6
001,        replay_sample_1_path,    False,       replay,      0.2
001,        replay_sample_2_path,    False,       replay,      0.2
001,        mask_sample_1_path,      False,       mask,        0.5
001,        mask_sample_2_path,      False,       mask,        0.5

this function will return 1 set of positive scores, and 3 sets of negative scores (for each print, replay, and mask PAIs).

Otherwise, you can provide a list of regular expressions that match each PAI. For example, with regexps as ['print', 'replay', 'mask'], if you have scores like:

claimed_id, test_label,              is_bonafide, attack_type, score
001,        bona_fide_sample_1_path, True,        ,            0.9
001,        print_sample_1_path,     False,       print/1,     0.6
001,        print_sample_2_path,     False,       print/2,     0.6
001,        replay_sample_1_path,    False,       replay/1,    0.2
001,        replay_sample_2_path,    False,       replay/2,    0.2
001,        mask_sample_1_path,      False,       mask/1,      0.5
001,        mask_sample_2_path,      False,       mask/2,      0.5

the function will return 3 sets of negative scores (for the print, replay, and mask PAIs given in regexps).

Parameters:
  • filename (str) – Path to the score file.

  • regexps (list, optional) – A list of regular expressions that match each PAI. If not given, the values in the column pointed by regexp_column are used to find scores for different PAIs.

  • regexp_column (str, optional) – If a list of regular expressions is given, those patterns will be matched against the values in this column. Default: attack_type

Returns:

A tuple, ([positives], {'pai_name': [negatives]}), containing the positive scores and a dict of negative scores that maps PAI names to their respective scores.

Return type:

tuple

Raises:
  • ValueError – If none of the given regular expressions match the values in regexp_column.

  • KeyError – If regexp_column is not a column of the CSV file.
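The grouping behaviour described above can be sketched with the standard library alone. This is an illustration rather than the library's implementation: the function name is hypothetical, it reads CSV text from a string instead of a file path, and the use of re.search (rather than another matching mode) is an assumption.

```python
import csv
import io
import re
from collections import defaultdict


def split_per_pai_sketch(csv_text, regexps=(), regexp_column="attack_type"):
    """Return ([positives], {pai_name: [negatives]}) from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    positives, attacks = [], []
    for row in reader:
        score = float(row["score"])
        if row["is_bonafide"].strip().lower() == "true":
            positives.append(score)
        else:
            # Raises KeyError if regexp_column is not a CSV column.
            attacks.append((row[regexp_column].strip(), score))

    negatives = defaultdict(list)
    if not regexps:
        # Default: one PAI per distinct value of regexp_column.
        for label, score in attacks:
            negatives[label].append(score)
    else:
        # One PAI per pattern; assumption: patterns are applied with re.search.
        for pattern in regexps:
            for label, score in attacks:
                if re.search(pattern, label):
                    negatives[pattern].append(score)
        if not negatives:
            raise ValueError("no regular expression matched " + regexp_column)
    return positives, dict(negatives)


SCORES = """claimed_id,test_label,is_bonafide,attack_type,score
001,bona_fide_sample_1_path,True,,0.9
001,print_sample_1_path,False,print/1,0.6
001,print_sample_2_path,False,print/2,0.6
001,replay_sample_1_path,False,replay/1,0.2
001,replay_sample_2_path,False,replay/2,0.2
001,mask_sample_1_path,False,mask/1,0.5
001,mask_sample_2_path,False,mask/2,0.5
"""

positives, negatives = split_per_pai_sketch(SCORES, ["print", "replay", "mask"])
print(positives, negatives)
```

Without regexps, the same input would yield one group per distinct attack_type value (print/1, print/2, and so on), which is why patterns are useful when attack labels carry extra detail.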

bob.pad.base.error_utils.split_csv_pad(filename)

Loads PAD scores from a CSV score file and splits them into attack and bona fide scores.

The CSV must contain an is_bonafide column in which each field is either True or False (case insensitive).

Parameters:

filename (str) – The path to a CSV file containing all the scores.

Returns:

Tuple of 1D arrays (attack, bonafide): the negative (attack) scores and the positive (bona fide) scores.

Return type:

tuple
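A minimal sketch of the split, assuming the scores live in a column named score (as in the examples for split_csv_pad_per_pai) and reading CSV text from a string instead of a file. The function name is hypothetical, and plain lists stand in for the 1D arrays the real function returns.

```python
import csv
import io


def split_csv_pad_sketch(csv_text):
    """Return (attack, bonafide) score lists from CSV text."""
    attack, bonafide = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # is_bonafide is compared case-insensitively, as documented.
        is_bonafide = row["is_bonafide"].strip().lower() == "true"
        (bonafide if is_bonafide else attack).append(float(row["score"]))
    return attack, bonafide


CSV_TEXT = """test_label,is_bonafide,score
bona_fide_sample_1_path,TRUE,0.9
print_sample_1_path,false,0.6
replay_sample_1_path,False,0.2
"""

attack, bonafide = split_csv_pad_sketch(CSV_TEXT)
print(attack, bonafide)
```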