Multi-PIE Face Verification Protocol - Unmatched Illumination, Version 1.0
Feb 23rd 2012
Idiap Research Institute

Author: Laurent El Shafey
Citation: All documents and papers that report research results obtained using this protocol should cite: 
L. El Shafey et al., "Idiap Biometric Resources" [Online]. Available: http://www.idiap.ch/resource/biometric

Summary:
-------
The images in the world data set should be used for all system training. The development (or dev) set should be used only for tuning of system hyperparameters, model selection, etc. The evaluation (or eval, or test) set should be used only for reporting the final face authentication (verification) performance.

File formats:
------------

world/world_clients.txt:		<subject id>
world/world_files.txt:			<subject id> <image>
world/world_files_ubmsubset.txt:	<subject id> <image>

dev/dev_clients.txt:     <subject id>
dev/dev_enrol.txt:       <subject id> <image to be used to enrol the specified subject>
dev/dev_probe.txt:       <subject id> <image to be used as a probe against the specified subject's model>

eval/eval_clients.txt:   <subject id>
eval/eval_enrol.txt:     <subject id> <image to be used to enrol the specified subject>
eval/eval_probe.txt:     <subject id> <image to be used as a probe against the specified subject's model>
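
The two-column files above can be read with a few lines of Python. A minimal sketch (the function names are illustrative and the example paths in the docstrings are placeholders, not actual Multi-PIE filenames):

```python
from pathlib import Path

def read_clients(path):
    """Read a *_clients.txt file: one subject id per line."""
    return [line.strip() for line in Path(path).read_text().splitlines()
            if line.strip()]

def read_pairs(path):
    """Read a two-column file (<subject id> <image>),
    e.g. dev/dev_enrol.txt or world/world_files.txt.
    A single split suffices since the id contains no spaces."""
    pairs = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            subject_id, image = line.split(maxsplit=1)
            pairs.append((subject_id, image))
    return pairs
```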


Details:
-------

The CMU Multi-PIE Face Database consists of a set of images acquired from 337 subjects in up to four different sessions over time. Images exhibit additional variation through differences in pose, illumination and facial expression. The database can be ordered via this website: http://www.multipie.org/.

As a face authentication protocol is not publicly available for Multi-PIE, we developed our own protocol with independent world, development and evaluation sets. Information about this protocol is provided below.


The following protocol aims at studying the effect of illumination variations on face recognition performance. Only frontal face images with a neutral expression, captured by camera 05_1 in the multi-view part of the database, are considered.

A/ Division of the subjects in three disjoint groups
The three disjoint world, development and evaluation subsets are defined as follows. 
208 clients from the database did not take part in all the session recordings. Based on this fact, we split the clients into two groups:
- 'world', which contains the data from the subjects who took part in 3 or fewer session recordings. It hence consists of 208 identities.
- 'deveval' for the remaining ones. The 'deveval' subset was then randomly split into two groups of almost equal size:
  * 'dev' which consists of 64 identities
  * 'eval' which consists of 65 identities

The lists of identities for each of the three sets above are provided in the {world/world,dev/dev,eval/eval}_clients.txt files. 
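
Since the three groups must be identity-disjoint, a quick sanity check on the client lists can be worthwhile. A sketch (the helper name is illustrative):

```python
def check_disjoint(world_ids, dev_ids, eval_ids):
    """Sanity check: the three subject groups must not share identities.
    Raises if any two groups overlap; returns the group sizes."""
    w, d, e = set(world_ids), set(dev_ids), set(eval_ids)
    if (w & d) or (w & e) or (d & e):
        raise ValueError("subject groups overlap")
    return len(w), len(d), len(e)
```

Applied to the three *_clients.txt lists, this should return (208, 64, 65), summing to the 337 subjects of the database.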


B/ Selection and organization of the images in each of the groups
For this protocol, we only consider images with a neutral expression and a frontal pose. For sessions 1, 2 and 3, one recording with a neutral expression was performed, which means 20 images with various illumination conditions for each of these sessions. In contrast, for session 4, two recordings with neutral expressions were performed a few seconds or minutes apart, which means 20 x 2 images.

In addition, we noticed that, for each of these recordings, the two images taken without flash were almost identical. We therefore removed one of them (shot 19) from our protocol, leaving 19 images per session and per recording.

Taking the previous aspects into consideration, the world subset consists of 9,785 images. The corresponding list of images is given in the world/world_files.txt file, the first column being the subject id and the second one the relative path of the image in the database.
In some of our experiments, this world subset was too large. We hence defined a subset of it by randomly selecting 1,326 files (the list is given in world/world_files_ubmsubset.txt). In particular, we used it to train a Universal Background Model.

The 'dev' and 'eval' subsets require a further split, into images used to enrol models and images compared against the enrolled models. For both 'dev' and 'eval', we use:
- shot 0 from session 1 to enrol each client model
- shots 0 to 18 from the three other sessions (2, 3 and 4, the latter contributing two recordings) as probes, which means 19 x 4 = 76 images per client. These probes are trialled against all the client models.

Finally, for the 'dev' subset, we have:
- 64 images to enrol clients (one for each client model)
- 64 x 76 = 4,864 target trials and 64 x 63 x 76 = 306,432 impostor trials
and for the 'eval' subset, we have:
- 65 images to enrol
- 65 x 76 = 4,940 target trials and 65 x 64 x 76 = 316,160 impostor trials
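
The counts above follow directly from the protocol definition and can be sketched as (constant and function names are illustrative):

```python
# Shots 0-18 are kept per recording (the duplicate no-flash shot 19
# is removed), i.e. 19 probe shots per recording.
SHOTS_PER_RECORDING = 19
# Probe recordings per client: one each in sessions 2 and 3, two in session 4.
PROBES_PER_CLIENT = SHOTS_PER_RECORDING * (1 + 1 + 2)  # 76

def trial_counts(n_clients):
    """Target and impostor trial counts when every probe of a set is
    trialled against every client model of the same set."""
    target = n_clients * PROBES_PER_CLIENT
    impostor = n_clients * (n_clients - 1) * PROBES_PER_CLIENT
    return target, impostor
```

With n_clients = 64 ('dev') this gives (4,864, 306,432) and with n_clients = 65 ('eval') it gives (4,940, 316,160), matching the figures above.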

Enrolment images are listed in the {dev/dev,eval/eval}_enrol.txt files, whereas probe images are listed in the {dev/dev,eval/eval}_probe.txt files. Once again, the first column is the subject id and the second one the relative path of the image.


C/ ZT-Norm
Where ZT-norm is used to perform score normalisation, the 'dev' subset should be used to normalise the scores for the 'eval' subset, and vice versa. For T-norm, the enrolment data of the appropriate set should be used to create the set of cohort models, while for Z-norm, the cohort set should be formed from the appropriate set of probe images.
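
As an illustration of the procedure: ZT-norm first Z-normalises the raw score using the model's scores against the Z-cohort probes, then T-normalises the result against the Z-normalised scores of the T-cohort models. A minimal sketch for a single trial in pure Python (names are illustrative; a real system would vectorise this over full score matrices):

```python
import statistics

def norm(score, cohort_scores):
    """Standardise a score against a cohort: (s - mu) / sigma."""
    mu = statistics.mean(cohort_scores)
    sigma = statistics.stdev(cohort_scores)
    return (score - mu) / sigma

def zt_norm(raw, z_scores, t_scores, zt_matrix):
    """ZT-norm a single trial score s(model, probe).

    raw       -- s(model, probe)
    z_scores  -- s(model, p_j) for each Z-cohort probe p_j
                 (probes of the other set, e.g. 'dev' when scoring 'eval')
    t_scores  -- s(m_i, probe) for each T-cohort model m_i
                 (enrolled from the other set's enrolment data)
    zt_matrix -- zt_matrix[i][j] = s(m_i, p_j)
    """
    z = norm(raw, z_scores)                       # Z-norm the raw score
    zt_cohort = [norm(t_scores[i], zt_matrix[i])  # Z-norm each T-cohort score
                 for i in range(len(t_scores))]
    return norm(z, zt_cohort)                     # T-norm the Z-normed score
```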

