# VFPAD

## Description

The in-Vehicle Face Presentation Attack Detection (VFPAD) dataset consists of bona-fide and 2D/3D attack presentations acquired for a subject (real or fake) in the driver's seat of a car. These presentations have been captured using an NIR camera (940 nm) mounted on the steering column of the car, while NIR illuminators have been fixed on both front pillars (adjacent to the windshield) of the car. The bona-fide videos represent 24 male and 16 female subjects of various ethnicities. The PAI (presentation attack instrument) species used to construct this dataset include photo prints, digital displays (for replay attacks), rigid 3D masks, and flexible 3D masks made of silicone.
## Data Collection
The videos comprising this dataset represent bona-fide and attack presentations under a range of variations:
- Environmental variations: presentations have been recorded in four sessions, each under different environmental conditions (outdoor sunny; outdoor cloudy; indoor dimly-lit; and indoor brightly-lit)
- Different scenarios: bona-fide presentations for each subject have been captured with a variety of appearances: with/without glasses, with/without hat, etc.
- Illumination variations: two illumination conditions have been used: ‘uniform’ (both NIR illuminators switched on), and ‘non-uniform’ (only the left NIR-illuminator switched on), and
- Pose variations: two poses (‘angles’) have been used: ‘front’: the subject looks ahead at the road; and ‘below’: the subject looks straight into the camera.
As Figure 1 shows, the camera is placed on the steering column, looking up at the subject’s face.
## Structure of the Dataset

Each presentation is recorded in a separate file in HDF5 format. The HDF5 files have the following internal directory structure:
/stream_0
/stream_0/recording_0
/stream_0/recording_1
The subdirectory recording_0 contains several frames that may be used for illumination-calibration. These frames represent a video, approximately 2 seconds long, that has been captured without the NIR 940nm illumination. Therefore, these frames capture the ambient natural light.
The subdirectory recording_1 contains frames of a 10-second long video, with the appropriate NIR illuminators switched on. These are the frames that are used for PAD experiments.
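The structure above can be read with standard HDF5 tooling. The sketch below, using h5py, is one possible way to load the frames of a recording; the names of the individual frame datasets inside each recording group are not documented here, so it simply iterates over the group's keys.

```python
# Minimal sketch for loading frames from a VFPAD HDF5 file.
# Assumes each frame is stored as a separate dataset under the
# recording groups; the actual per-frame dataset names are an
# assumption, so we iterate over whatever keys the group holds.
import h5py
import numpy as np

def load_frames(path, recording="recording_1"):
    """Return the frames of one recording as a list of arrays.

    recording_0 holds the ~2 s ambient-light calibration frames;
    recording_1 holds the 10 s NIR-illuminated frames used for PAD.
    """
    with h5py.File(path, "r") as f:
        group = f["stream_0"][recording]
        # Sort keys so frames come back in a deterministic order.
        return [np.asarray(group[k]) for k in sorted(group.keys())]
```

Passing `recording="recording_0"` instead returns the ambient-light calibration frames.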
## Overall Statistics

| | Number of videos |
|---|---|
| bona-fide | 4046 |
| PA | 1790 |
| Total | 5836 |
The dataset is divided into two folders: bf and pa, each of which contains one sub-folder per client (real subject or PAI). All recordings for a given client are stored, as HDF5 files, in the corresponding sub-folder. The filename encodes information about the type of presentation recorded, in the following format:
<presentation-type>_<session-id>_<angle-id>_<illumination-id>_<client-id>_<presenter-id>_<type-id>_<sub-category-id>_<pai-id>_<trial-id>.hdf5
The description of each field is provided below:

| # | Component | Length | Description |
|---|---|---|---|
| 1 | presentation-type | 2 char | bf or pa: string indicating whether the corresponding sample is bona-fide or a PA. |
| 2 | session-id | 2 digits | 01, 02, 03, or 04: indicates the session (S1, S2, S3, or S4, respectively) in which the data was captured. |
| 3 | angle-id | 1 digit | 1 or 2: indicates the angle between camera and face (below: 1; front: 2). |
| 4 | illumination-id | 1 digit | 1 or 2: indicates the light distribution over the face (non-uniform: 1; uniform: 2). |
| 5 | client-id | 4 digits | The identity assigned to the bona-fide subject or PAI in front of the camera. For bona-fide subjects, arbitrary numerical identities from 0001 to 0040 have been used. For PAIs, arbitrary strings have been used to create identities for each PAI. |
| 6 | presenter-id | 4 digits | Redundant in the present version of the dataset. Indicates who is presenting the face (real or fake) to the camera: 0000 for bf and 0001 for pa, in every file. |
| 7 | type-id | 2 digits | 00, 01, 02, 03, or 04: indicates the main category of presentation (bona-fide, 2D print attack, 2D replay attack, 3D silicone mask, and 3D rigid mask, respectively). |
| 8 | sub-category-id | 2 digits | Indicates the sub-category of the main category given by type-id. See the table below for the sub-category-id values for each type-id. |
| 9 | pai-id | 3 digits | A unique number assigned to each presentation attack instrument. For bona-fide presentations this number is always 000. |
| 10 | trial-id | 8 digits | An arbitrary numeric string that distinguishes separate captures of the same presentation under exactly the same recording scenario. |
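Since the ten fields are underscore-separated, a filename can be decoded with a few lines of Python. The helper below is a hypothetical convenience, not part of any official dataset tooling; the field names are chosen to match the table above.

```python
# Hypothetical helper that decodes a VFPAD filename into its ten
# underscore-separated fields, following the format described above.
import os

FIELDS = [
    "presentation_type", "session_id", "angle_id", "illumination_id",
    "client_id", "presenter_id", "type_id", "sub_category_id",
    "pai_id", "trial_id",
]

def parse_vfpad_name(filename):
    """Split a VFPAD .hdf5 filename into a dict of its ten fields."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    parts = stem.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))
```

For example, a bona-fide capture from session 1 would parse as `parse_vfpad_name("bf_01_1_2_0007_0000_00_03_000_00000042.hdf5")["presentation_type"] == "bf"` (the client, pai, and trial values here are illustrative).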
The details of sub-category-id are provided in the table below:

| Type ID | Sub-Category ID | Description |
|---|---|---|
| 00 (bona-fide) | 00 | Natural (no glasses or hat) |
| | 01 | Medical glasses (wherever applicable) |
| | 02 | Clear glasses |
| | 03 | Sunglasses |
| | 04 | Hat (no glasses) |
| | 05 | Hat + clear glasses |
| | 06 | Hat + sunglasses |
| 01 (Print) | 01 | Matte paper on laser printer |
| | 02 | Glossy paper on laser printer |
| | 03 | Matte paper on inkjet printer |
| | 04 | Glossy paper on inkjet printer |
| 02 (Replay attack) | 00 | – |
| 03 (3D silicone masks) | 00 | Generic flexible mask (G-Flex-3D-Mask) |
| | 01 | Custom flexible mask (C-Flex-3D-Mask) |
| 04 (3D rigid masks) | 00 | Custom rigid mask 1 |
| | 02 | Custom rigid mask 2 |
| | 03 | Custom rigid mask 3 |
| | 04 | Custom rigid mask 4 |
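For programmatic use, the two code tables above can be transcribed into plain lookup dictionaries. These mappings are a convenience transcription of the tables, not part of any official API:

```python
# Lookup tables transcribed from the type-id / sub-category-id
# tables above, mapping numeric codes to readable descriptions.
TYPE_ID = {
    "00": "bona-fide",
    "01": "2D print attack",
    "02": "2D replay attack",
    "03": "3D silicone mask",
    "04": "3D rigid mask",
}

SUB_CATEGORY = {
    ("00", "00"): "Natural (no glasses or hat)",
    ("00", "01"): "Medical glasses",
    ("00", "02"): "Clear glasses",
    ("00", "03"): "Sunglasses",
    ("00", "04"): "Hat (no glasses)",
    ("00", "05"): "Hat + clear glasses",
    ("00", "06"): "Hat + sunglasses",
    ("01", "01"): "Matte paper on laser printer",
    ("01", "02"): "Glossy paper on laser printer",
    ("01", "03"): "Matte paper on inkjet printer",
    ("01", "04"): "Glossy paper on inkjet printer",
    ("02", "00"): "Replay attack",
    ("03", "00"): "Generic flexible mask (G-Flex-3D-Mask)",
    ("03", "01"): "Custom flexible mask (C-Flex-3D-Mask)",
    ("04", "00"): "Custom rigid mask 1",
    ("04", "02"): "Custom rigid mask 2",
    ("04", "03"): "Custom rigid mask 3",
    ("04", "04"): "Custom rigid mask 4",
}
```

Note that the rigid-mask sub-categories skip 01, mirroring the table.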
## Experimental Protocol

The reference publication considers the experimental protocol named grandtest. For frame-level evaluation, 20 frames from each video have been used, except for print attacks. Because the VFPAD dataset contains relatively few print-attack videos, the grandtest protocol uses 80 frames per print-attack video to give print attacks a fair representation during experimentation. For the grandtest protocol, the videos were divided into fixed, disjoint groups: train, dev, and eval. Each group consists of a unique subset of subjects (subjects in one group are not present in the other two).
Details of the grandtest protocol are summarized below (the split ratio is each partition's share of its own class, bona-fide or PA):

| Partition | #Videos | Split ratio (%) |
|---|---|---|
| train bona-fide | 1503 | 37.15 |
| train PA | 595 | 33.24 |
| dev bona-fide | 1247 | 30.82 |
| dev PA | 666 | 37.20 |
| eval bona-fide | 1296 | 32.03 |
| eval PA | 529 | 29.56 |
| Total | 5836 | |
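The counts in the table can be cross-checked with a few lines of Python; the split-ratio column works out as each partition's percentage of its own class:

```python
# Sanity check of the grandtest split: per-class totals and the
# percentage of each class assigned to train/dev/eval.
counts = {
    ("train", "bf"): 1503, ("train", "pa"): 595,
    ("dev",   "bf"): 1247, ("dev",   "pa"): 666,
    ("eval",  "bf"): 1296, ("eval",  "pa"): 529,
}

# Per-class totals: 4046 bona-fide and 1790 PA videos.
totals = {
    "bf": sum(n for (_, c), n in counts.items() if c == "bf"),
    "pa": sum(n for (_, c), n in counts.items() if c == "pa"),
}

# Split ratio within each class, in percent.
ratios = {k: 100.0 * n / totals[k[1]] for k, n in counts.items()}
```

The per-class totals sum to the overall 5836 videos, matching the statistics above.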
## Citation

If you use this dataset, please cite the following publication:
@article{IEEE_TBIOM_2021,
  author    = {Kotwal, Ketan and Bhattacharjee, Sushil and Abbet, Philip and Mostaani, Zohreh and Wei, Huang and Wenkang, Xu and Yaxi, Zhao and Marcel, S\'{e}bastien},
  title     = {Domain-Specific Adaptation of CNN for Detecting Face Presentation Attacks in NIR},
  journal   = {IEEE Transactions on Biometrics, Behavior, and Identity Science},
  publisher = {{IEEE}},
  year      = {2022},
  volume    = {4},
  number    = {1},
  pages     = {135--147},
  doi       = {10.1109/TBIOM.2022.3143569}
}