User’s guide

This package provides the access API and descriptions for the HQ-WMCA database. It contains only the Bob accessor methods needed to use the database directly from Python with our certified protocols; the raw data itself must be downloaded separately from the original URL.

The database is presented in the following paper:

@article{heusch-tbiom-2020,
       author = {Guillaume Heusch and Anjith George and David Geissbuehler and Zoreh Mostaani and Sebastien Marcel},
       title = {Deep Models and Shortwave Infrared Information to Detect Face Presentation Attacks},
       journal = {IEEE Trans. on Biometrics, Behavior, and Identity Science},
       volume = {XX},
       issue = {YY},
       year = {2020},
   }

The Data

Each example is stored in a single HDF5 file (with the .h5 extension), named according to the following pattern:

<site_id>_<session_id>_<client_id>_<presenter_id>_<type_id>_<subtype_id>_<pai_id>-<hash>.h5

where:

<site_id> : site where the data was recorded
<session_id> : session (1, 2 or 3)
<client_id> : id of the person (either bonafide or attack)
<presenter_id> : id of the person presenting the attack
<type_id> : type of attack
<subtype_id> : subtype of the attack
<pai_id> : unique id for each PAI
<hash>
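
The naming pattern above can be split back into its fields with a few lines of Python. The following is only a minimal sketch: the names `HQWMCAName` and `parse_name` are hypothetical helpers that are not part of this package, all fields are kept as strings because their exact format is not specified here, and the file name used in the usage note is made up for illustration.

```python
from typing import NamedTuple


class HQWMCAName(NamedTuple):
    """Fields encoded in an HQ-WMCA file name (hypothetical helper type)."""
    site_id: str
    session_id: str
    client_id: str
    presenter_id: str
    type_id: str
    subtype_id: str
    pai_id: str
    file_hash: str


def parse_name(filename: str) -> HQWMCAName:
    """Split a file name following the pattern above into its fields."""
    # Drop the extension if it is present.
    stem = filename[:-3] if filename.endswith(".h5") else filename
    # The hash is separated from the other fields by the last hyphen.
    fields, _, file_hash = stem.rpartition("-")
    parts = fields.split("_")
    if len(parts) != 7:
        raise ValueError(f"unexpected file name: {filename!r}")
    return HQWMCAName(*parts, file_hash)
```

For instance, `parse_name("01_02_0003_0000_00_00_000-abcd1234.h5")` yields a tuple whose `session_id` is `"02"` and whose `file_hash` is `"abcd1234"`.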

Each file contains the following data:

BASLER_BGR               Dataset {60, 3, 1920, 1200}
BASLER_LEFT_NIR_1050nm   Dataset {20, 1, 1920, 1200}
BASLER_LEFT_NIR_735nm    Dataset {20, 1, 1920, 1200}
BASLER_LEFT_NIR_850nm    Dataset {20, 1, 1920, 1200}
BASLER_LEFT_NIR_940nm    Dataset {20, 1, 1920, 1200}
BASLER_LEFT_NIR_Dark     Dataset {40, 1, 1920, 1200}
BASLER_LEFT_NIR_stereo   Dataset {60, 1, 1920, 1200}
BASLER_RIGHT_NIR_1050nm  Dataset {20, 1, 1920, 1200}
BASLER_RIGHT_NIR_735nm   Dataset {20, 1, 1920, 1200}
BASLER_RIGHT_NIR_850nm   Dataset {20, 1, 1920, 1200}
BASLER_RIGHT_NIR_940nm   Dataset {20, 1, 1920, 1200}
BASLER_RIGHT_NIR_Dark    Dataset {40, 1, 1920, 1200}
BASLER_RIGHT_NIR_stereo  Dataset {60, 1, 1920, 1200}
BOBCAT_SWIR_1050nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_1200nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_1300nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_1450nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_1550nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_1650nm       Dataset {20, 1, 640, 512}
BOBCAT_SWIR_940nm        Dataset {20, 1, 640, 512}
BOBCAT_SWIR_Dark         Dataset {40, 1, 640, 512}
GOBI_THERMAL             Dataset {60, 1, 640, 480}
REALSENSE_D415_DEPTH     Dataset {70, 1, 720, 1280}

The shape of each Dataset is given as (number of frames, number of channels, height, width).
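
To get a feeling for how much data a single stream holds, the shapes can be multiplied out. The sketch below uses only the standard library and a few shapes copied from the listing above; the `SHAPES` dictionary and the `describe` function are illustrative names, not part of the package. It counts values rather than bytes, since the storage type of each Dataset is not stated here.

```python
from math import prod

# Shapes copied from the listing above: (frames, channels, height, width).
SHAPES = {
    "BASLER_BGR": (60, 3, 1920, 1200),
    "BOBCAT_SWIR_1050nm": (20, 1, 640, 512),
    "GOBI_THERMAL": (60, 1, 640, 480),
    "REALSENSE_D415_DEPTH": (70, 1, 720, 1280),
}


def describe(shape):
    """Name the four axes of a Dataset shape and count its values."""
    frames, channels, height, width = shape
    return {
        "frames": frames,
        "channels": channels,
        "height": height,
        "width": width,
        "values": prod(shape),  # total number of stored values
    }


for name, shape in SHAPES.items():
    print(name, describe(shape)["values"])
```

The color stream alone holds several hundred million values per example, which is why loading only the streams you need is worthwhile.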

Note

The complete database contains a total of 2904 video sequences. However, the experiments in the above-mentioned paper were performed on a subset containing 2440 sequences. The remaining sequences were added later.

Creating the database

You can directly download the database using the following script:

> bob_dbmanage.py hqwmca download

However, you may want to create the database yourself, since two versions exist: the complete one and the one used in the paper. Currently, the command above downloads the paper version.

So, to (re)create the database, you should first run the following script:

> generate_metadata.py

It will output a JSON file containing all the information needed to create the sql3 file for the database.

Note

The default execution will create metadata for the paper version. If you want the complete database, run the previous script with the --makeup option.

Then, to create the db.sql3 file, just do:

> bob_dbmanage.py hqwmca create

A note on loading data

Since each example contains a lot of data (i.e. video sequences for 22 different streams), you may not want to load everything. Consequently, you should provide a list of loading functions, which are used to actually load data from each example in the database (implemented in bob/db/hqwmca/models.py:97).

Such loading functions rely on the bob.io.stream package. A relatively simple example is given below, but you may want to check the bob.io.stream package for all possibilities.

def color(f):
  """
  Load color data from a sequence.

  Parameters
  ----------
  f: :py:class:`bob.io.stream.StreamFile`
    The stream file object

  Returns
  -------
  numpy.ndarray:
    The data corresponding to the specified stream

  """
  return f.stream('color')

The function is then used to load data as follows:

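import bob.db.hqwmca
db = bob.db.hqwmca.Database()

# Retrieve the examples of a protocol and pick the first one
objects = db.objects(protocol='grand_test')
obj = objects[0]

# Load only the color stream, using the loading function defined above
data = obj.load('path/to/database', '.h5', color)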

Warning

The returned data is a dictionary of bob.bio.video.FrameContainer objects. Note that each key corresponds to the name of the loading function (i.e. ‘color’ in this case).
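
The keying convention described in this warning can be illustrated without the database at hand. The sketch below is a toy model only: `FakeStreamFile` is a stand-in object and not the bob.io.stream API, `load_all` is a hypothetical helper that mimics the described behaviour rather than the package's actual implementation, and the `'thermal'` stream name is an assumption made for the sake of having a second loading function.

```python
class FakeStreamFile:
    """Stand-in for a stream file object (NOT the bob.io.stream API)."""

    def __init__(self, streams):
        self._streams = streams

    def stream(self, name):
        return self._streams[name]


def color(f):
    """Loading function for the color stream, as in the example above."""
    return f.stream('color')


def thermal(f):
    """Second loading function; the 'thermal' stream name is assumed."""
    return f.stream('thermal')


def load_all(f, funcs):
    """Hypothetical helper: apply each loading function to the file
    and store its result under the name of that function."""
    return {func.__name__: func(f) for func in funcs}


f = FakeStreamFile({'color': 'color frames', 'thermal': 'thermal frames'})
data = load_all(f, [color, thermal])
print(sorted(data))
```

With this convention, renaming a loading function also changes the key under which its data is found, so downstream code should refer to the function name rather than to the stream name.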