User’s Guide

This package contains the access API and descriptions for the YouTube Faces database. It provides only the Bob accessor methods to use the database directly from Python, together with our certified protocols. The actual raw data of the YouTube Faces database is not included and must be downloaded from the original URL (note that we were not able to contact the corresponding professor).

The Database Interface

The bob.db.youtube.Database complies with the standard biometric verification database interface described in bob.db.base, implementing the bob.db.base.SQLiteDatabase interface.

The Protocols

To use the protocol interface, you have to create an instance of the bob.db.youtube.Database:

>>> import bob.db.youtube
>>> db = bob.db.youtube.Database(YOUR_DATABASE_DIRECTORY)

where YOUR_DATABASE_DIRECTORY is the base directory in which the original images of the database can be found, e.g., '/path/to/YouTubeFaces/frame_images_DB'.

The database interface contains several functions to query the database. For example, you can retrieve the list of supported protocols:

>>> db.protocol_names()
('fold1', 'fold2', 'fold3', 'fold4', 'fold5', 'fold6', 'fold7', 'fold8', 'fold9', 'fold10')

These protocol names define the 10 different splits of the YouTube Faces protocol, for which experiments can be run. Some of the remaining query functions require a protocol to be selected.

For each protocol, the splits of the database are distributed into three different groups: ('world', 'dev', 'eval').

  • The eval group contains exactly the split that is requested for the protocol. For this group, the final evaluation should be run, e.g., by classifying the corresponding pairs to be same class or different class.
  • The dev group contains two splits, which contain different identities than the eval group. This group can be used, e.g., to select a threshold that is used to classify the pairs.
  • Finally, the world group contains up to seven splits of the database, with identities distinct from the dev and eval groups. This group can be used to train your classifier.

For the final evaluation, 10 different experiments need to be executed, e.g., by training 10 different classifiers on the corresponding world splits, selecting 10 different thresholds on the dev sets and computing 10 different classification results. Finally, the classification accuracy is reported as the average of the 10 classification results.
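The 10-fold procedure described above can be sketched as follows. Note that train_classifier, select_threshold and evaluate are hypothetical placeholders for your own code, not part of bob.db.youtube; only the protocol names come from the database, and a real experiment would fetch the objects of each group with db.objects():

```python
# Sketch of the 10-fold evaluation loop. train_classifier, select_threshold
# and evaluate are placeholders for your own code; with a real database,
# the empty lists would be filled via db.objects(protocol=..., groups=...).

def train_classifier(world_objects):
    # placeholder: train a classifier on the 'world' group
    return object()

def select_threshold(classifier, dev_objects):
    # placeholder: choose a decision threshold on the 'dev' group
    return 0.5

def evaluate(classifier, threshold, eval_objects):
    # placeholder: classification accuracy on the 'eval' group
    return 0.9

protocols = ['fold%d' % i for i in range(1, 11)]
accuracies = []
for protocol in protocols:
    # real code: db.objects(protocol=protocol, groups='world'), etc.
    world, dev, eval_set = [], [], []
    classifier = train_classifier(world)
    threshold = select_threshold(classifier, dev)
    accuracies.append(evaluate(classifier, threshold, eval_set))

# the final reported number is the average over the 10 folds
average_accuracy = sum(accuracies) / len(accuracies)
```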

The Directory Objects

The most important method of the interface is the bob.db.youtube.Database.objects() function. You can use this function to query the information for the protocols. For the YouTube database, the information consists of a list of bob.db.youtube.Directory. Each Directory contains information about a video, such as the identity of the client, the shot id and the (relative) path of the directory in the database:

>>> objects = db.objects(protocol='fold1')
>>> type(objects)
<type 'list'>
>>> d = objects[0]
>>> type(d)
<class 'bob.db.youtube.models.Directory'>
>>> d.client_id
1
>>> d.shot_id
0
>>> d.path                   
u'AJ_Cook/0'
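The path attribute is relative to the database directory given to the constructor; the interface resolves it for you, but the relationship can be illustrated with a small sketch (the base directory below is the hypothetical example path from above):

```python
import os

# Sketch: how the relative path of a Directory relates to the base
# directory passed to the Database constructor. The base directory here
# is only an example; original_file_name() performs this join for you.
base_directory = '/path/to/YouTubeFaces/frame_images_DB'
relative_path = 'AJ_Cook/0'
directory = os.path.join(base_directory, relative_path)
# on a POSIX system: '/path/to/YouTubeFaces/frame_images_DB/AJ_Cook/0'
```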

These Directory objects can be used to get the path for the image data. Since the videos are stored as a list of frames, the Directory interface will return a list of image file names, sorted by frame number:

>>> file_names = db.original_file_name(d)
>>> print (file_names[0])    
[...]/AJ_Cook/0/0.123.jpg
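The numeric ordering matters because a plain lexical sort would place, e.g., frame 10 before frame 2. A minimal sketch of such a numeric sort, assuming the '<shot>.<frame>.jpg' naming pattern visible in the output above (the interface already returns the names in the correct order):

```python
# Sketch: sort frame file names by their numeric frame index.
# The '<shot>.<frame>.jpg' pattern is taken from the example output above.
names = ['0.10.jpg', '0.2.jpg', '0.1.jpg']

def frame_number(name):
    # '0.123.jpg' -> 123
    return int(name.split('.')[1])

names.sort(key=frame_number)
# names is now ['0.1.jpg', '0.2.jpg', '0.10.jpg']
```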

Warning

Please note that – in contrast to most other bob.db database interfaces – the bob.db.youtube.Database.original_file_name() function returns a list of file names. Likewise, bob.db.youtube.Database.original_file_names() returns a list of lists of file names.

Finally, bounding boxes are annotated in the images. To get these bounding boxes for a specific (set of) images, you can use the bob.db.youtube.Database.annotations() function. In the example below, the annotations for the first 20 images are read and returned:

>>> import os
>>> file_name_stems = [os.path.basename(f) for f in file_names[:20]]
>>> annotations = db.annotations(d, file_name_stems)
>>> sorted(annotations.keys()) == file_name_stems
True
>>> bounding_box = annotations[file_name_stems[0]]
>>> print (bounding_box)
{'topleft': (56.0, 205.0), 'bottomright': (112.0, 261.0)}
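The annotation coordinates are (y, x) pairs, so the size of the box can be computed as in the following sketch, which reuses the values printed above:

```python
# Bob annotation coordinates are (y, x) pairs; values from the example above.
bounding_box = {'topleft': (56.0, 205.0), 'bottomright': (112.0, 261.0)}
height = bounding_box['bottomright'][0] - bounding_box['topleft'][0]
width = bounding_box['bottomright'][1] - bounding_box['topleft'][1]
# height == 56.0, width == 56.0
```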

The annotations for one image can, for example, be used to cut out the face region from the image, using default functionality from other Bob packages:

>>> import bob.io.base
>>> import bob.io.image
>>> import bob.ip.color
>>> color_image = bob.io.base.load(file_names[0])
>>> gray_image = bob.ip.color.rgb_to_gray(color_image)
>>> face_region = gray_image[int(bounding_box['topleft'][0]) : int(bounding_box['bottomright'][0]),
...                          int(bounding_box['topleft'][1]) : int(bounding_box['bottomright'][1])]
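Since the annotation coordinates are floats, they need to be converted to integers before they can be used as array indices. A self-contained sketch of this cropping step on a dummy image (using a zero-filled numpy array in place of an actual video frame):

```python
import numpy

# dummy gray-scale image standing in for a real video frame
gray_image = numpy.zeros((480, 640))

# annotation values from the example above; cast to int for slicing
bounding_box = {'topleft': (56.0, 205.0), 'bottomright': (112.0, 261.0)}
top, left = (int(v) for v in bounding_box['topleft'])
bottom, right = (int(v) for v in bounding_box['bottomright'])

# rows are the y range, columns the x range
face_region = gray_image[top:bottom, left:right]
# face_region.shape == (56, 56)
```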