User’s Guide¶
This package contains the access API and descriptions for the YouTube Faces database. It only contains the Bob accessor methods to use the DB directly from python, with our certified protocols. The actual raw data for the YouTube Faces database should be downloaded from the original URL (though we were not able to contact the corresponding Professor).
The Database Interface¶
The bob.db.youtube.Database
complies with the standard biometric verification database as described in bob.db.base <bob.db.base>, implementing the interface bob.db.base.SQLiteDatabase
.
The Protocols¶
To use the protocol interface, you have to create an instance of the bob.db.youtube.Database
:
>>> import bob.db.youtube
>>> db = bob.db.youtube.Database(YOUR_DATABASE_DIRECTORY)
where YOUR_DATABASE_DIRECTORY
is the base directory, where the original images from the database can be found, e.g., '/path/to/YouTubeFaces/frame_images_DB'
.
The database interface contains several functions to query the database. For example, to get the list of supported protocols, you can query the list of supported protocols:
>>> db.protocol_names()
('fold1', 'fold2', 'fold3', 'fold4', 'fold5', 'fold6', 'fold7', 'fold8', 'fold9', 'fold10')
These protocol names define the 10 different splits of the YouTube Faces protocol, for which experiments can be run. Some of the remaining query functions require a protocol to be selected.
For each protocol, the splits of the database are distributed into three different groups: ('world', 'dev', 'eval')
.
- The
eval
group contains exactly the split that is requested for the protocol. For this group, the final evaluation should be run, e.g., by classifying the corresponding pairs to be same class or different class. - The
dev
group contains two splits, which contain different identities than theeval
group. This group can be used, e.g., to select a threshold that is used to classify the pairs. - Finally, the
world
group contains up to seven splits of the database, with identities distinct from thedev
andeval
groups. This split can be used to train your classifier.
For the final evaluation it is required that 10 different experiments are executed, e.g., by training 10 different classifiers on the according world
splits, selecting 10 different thresholds on the dev
set and compute 10 different classification results.
Finally, the classification accuracy is reported as an average of the 10 classification results.
The Directory Objects¶
The most important method of the interface is the bob.db.youtube.Database.objects()
function.
You can use this function to query the information for the protocols.
For the YouTube database, the information consists of a list of bob.db.youtube.Directory
.
Each Directory
contains information about a video, such as the identity of the client, the shot id and the (relative) path of the directory in the database:
>>> objects = db.objects(protocol='fold1')
>>> type(objects)
<type 'list'>
>>> d = objects[0]
>>> type(d)
<class 'bob.db.youtube.models.Directory'>
>>> d.client_id
1
>>> d.shot_id
0
>>> d.path
u'AJ_Cook/0'
These Directory
objects can be used to get the path for the image data.
Since the videos are stored as a list of frames, the Directory
interface will return a list of image file names, sorted by frame number:
>>> file_names = db.original_file_name(d)
>>> print (file_names[0])
[...]/AJ_Cook/0/0.123.jpg
Warning
Please note that – in opposition to most other bob.db database interfaces – the bob.db.youtube.Database.original_file_name()
function returns a list of file names.
Likewise, bob.db.youtube.Database.original_file_name()
returns a list of lists of file names.
Finally, bounding boxes are annotated in the images.
To get these bounding boxes for a specific (set of) images, you can use the bob.db.youtube.Database.annotations()
function.
In the example below, the annotations for the first 20 images are read and returned:
>>> file_name_stems = [os.path.basename(f) for f in file_names[:20]]
>>> annotations = db.annotations(d, file_name_stems)
>>> sorted(annotations.keys()) == file_name_stems
True
>>> bounding_box = annotations[file_name_stems[0]]
>>> print (bounding_box)
{'topleft': (56.0, 205.0), 'bottomright': (112.0, 261.0)}
The annotations for one image can, for example, be used to cut out the face region from the image, using default functionality from other Bob packages:
>>> import bob.io.base
>>> import bob.io.image
>>> import bob.ip.color
>>> color_image = bob.io.base.load(file_names[0])
>>> gray_image = bob.ip.color.rgb_to_gray(color_image)
>>> face_region = gray_image[bounding_box['topleft'][0] : bounding_box['bottomright'][0],
bounding_box['topleft'][1] : bounding_box['bottomright'][1]]