Python API
The YouTube Faces database protocol interface. Please refer to http://www.cs.tau.ac.il/~wolf/ytfaces for information on how to obtain a copy of the original data.
Note
Errata have been published for the database. These errata are not (yet) considered in the protocols.
The YouTube Faces database consists of 10 different splits, which are called “folds” here (to be consistent with the LFW database).
In each fold, 9/10 of the database is used for training and the remaining 1/10 for evaluation.
In this implementation of the YouTube protocols, up to 7/10 of the data is used for training (groups='world'), 2/10 is used for development (to estimate a threshold; groups='dev'), and the last 1/10 is used to evaluate the system (groups='eval').
To compute recognition results, please execute experiments on all 10 protocols (protocol='fold1' … protocol='fold10') and average the resulting classification results (cf. http://vis-www.cs.umass.edu/lfw for details on scoring).
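The averaging step could be sketched as follows. This is only an illustration: `evaluate_fold` is a hypothetical callable (not part of bob.db.youtube) that runs one experiment for a given protocol name and returns its accuracy, and the intermediate protocol names 'fold2' to 'fold9' are assumed from the 'fold1' … 'fold10' pattern. The LFW page linked above remains the reference for the exact scoring definitions.

```python
import statistics

# The ten protocol names; 'fold2' ... 'fold9' are assumed from the naming pattern.
PROTOCOLS = ['fold%d' % i for i in range(1, 11)]


def average_over_folds(evaluate_fold, protocols=PROTOCOLS):
    """Run one experiment per protocol and average the accuracies.

    evaluate_fold is a hypothetical, user-supplied callable that takes a
    protocol name and returns the accuracy obtained on that fold.
    """
    accuracies = [evaluate_fold(protocol) for protocol in protocols]
    mean = statistics.mean(accuracies)
    # Standard error of the mean across the folds.
    std_error = statistics.stdev(accuracies) / len(accuracies) ** 0.5
    return mean, std_error
```

Passing the experiment as a callable keeps the sketch independent of any particular recognition pipeline.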
The design of this implementation differs slightly from the one at http://www.cs.tau.ac.il/~wolf/ytfaces. Originally, the creators of the YouTube database provide only lists of image pairs. To be consistent with other Bob databases, the lists are split here into files to be enrolled and probe files. The file to be enrolled is always the first item of a pair, while the second item is used as the probe.
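As a rough illustration of this design (not the database's actual implementation), the following sketch turns a plain list of pairs into a mapping from each enrolled item to its probes. The item names in the usage example are made up.

```python
from collections import OrderedDict


def split_pairs(pairs):
    """Map each enrolled (first) item to the probe (second) items paired with it."""
    probes_per_model = OrderedDict()
    for first, second in pairs:
        # The first item is enrolled as a model, the second one probes it.
        probes_per_model.setdefault(first, []).append(second)
    return probes_per_model


# Usage with hypothetical item names:
example = split_pairs([('a', 'b'), ('a', 'c'), ('d', 'a')])
```

Note that the same item can appear as a model in one pair and as a probe in another.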
Note
When querying probe files, please always query probe files for a specific model id: objects(..., purposes = 'probe', model_ids = (model_id,)).
In this case, you will follow the default protocols given by the database.
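A minimal sketch of this query pattern is shown below. It assumes that `db` is an instance of bob.db.youtube.Database (or any object offering the same `model_ids` and `objects` methods); the helper name `probes_by_model` is hypothetical.

```python
def probes_by_model(db, protocol, group):
    """Collect the probe files of every model, issuing one query per model id."""
    probes = {}
    for model_id in db.model_ids(protocol=protocol, groups=group):
        # Restricting the query to a single model id follows the default protocol.
        probes[model_id] = db.objects(
            protocol=protocol,
            groups=group,
            purposes='probe',
            model_ids=(model_id,),
        )
    return probes
```

A call such as `probes_by_model(db, 'fold1', 'dev')` would then return one list of probe objects per model of the development group.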
When querying training files with objects(..., groups='world'), you will automatically end up with the “image restricted configuration”.
When you want to respect the “unrestricted configuration” (cf. README on http://vis-www.cs.umass.edu/lfw), please query the files that belong to the pairs, via objects(..., groups='world', world_type='unrestricted').
If you want to stick to the original protocol and use only the pairs for training and testing, feel free to query the pairs function.
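The two ways of obtaining training data might be wrapped as sketched here. `db` is assumed to be a bob.db.youtube.Database instance, and both helper names are hypothetical. The `world_type` argument is always passed explicitly so that the result does not depend on the default value.

```python
def training_files(db, protocol, world_type, subworld='sevenfolds'):
    """Query the training ('world') files of one protocol."""
    if world_type not in ('restricted', 'unrestricted'):
        raise ValueError("world_type must be 'restricted' or 'unrestricted'")
    return db.objects(
        protocol=protocol,
        groups='world',
        subworld=subworld,
        world_type=world_type,
    )


def training_pairs(db, protocol, subworld='sevenfolds'):
    """Query the training pairs of one protocol, as in the original protocol."""
    return db.pairs(protocol=protocol, groups='world', subworld=subworld)
```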
Note
The pairs that are provided by the pairs function and the files provided by the objects function (see note above) correspond to identical model/probe pairs.
Hence, either of the two approaches should give the same recognition results.
class bob.db.youtube.Client(id, name)
Bases: sqlalchemy.ext.declarative.api.Base
Information about the clients (identities) of the YouTube Faces database.
- id
- name
class bob.db.youtube.Database(original_directory=None, original_extension='/*.jpg', annotation_extension='.labeled_faces.txt')
Bases: bob.db.base.SQLiteDatabase
The dataset class opens and maintains a connection to the database.
It provides many different ways to query the characteristics of the data and the data itself.
annotations(directory, image_names=None)
Returns the annotations for the given directory as a dictionary of dictionaries, e.g. {'1.56.jpg' : {'topleft':(y,x), 'bottomright':(y,x)}, '1.57.jpg' : {'topleft':(y,x), 'bottomright':(y,x)}, …}. Here, the key of the dictionary is the full image file name of the original image.
Keyword parameters:
- directory
The Directory object for which you want to retrieve the annotations
- image_names
If given, only the annotations for the given image names (without path, but including filename extension) are extracted and returned
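To illustrate the structure of the returned dictionary, the following sketch computes the bounding-box size of each annotated image. It relies only on the 'topleft' and 'bottomright' keys and the (y, x) coordinate order described above; the helper name and the coordinate values in the usage example are made up.

```python
def box_sizes(annotations):
    """Return {image_name: (height, width)} for a dictionary of annotations."""
    sizes = {}
    for image_name, box in annotations.items():
        # Coordinates are stored as (y, x), i.e. row first.
        top, left = box['topleft']
        bottom, right = box['bottomright']
        sizes[image_name] = (bottom - top, right - left)
    return sizes


# Usage with made-up coordinates:
example_sizes = box_sizes({'1.56.jpg': {'topleft': (10, 20), 'bottomright': (50, 80)}})
```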
clients(protocol=None, groups=None, subworld='sevenfolds', world_type='unrestricted')
Returns a list of Client objects for the specific query by the user.
Keyword Parameters:
- protocol
The protocol to consider; one of: (‘fold1’, …, ‘fold10’), or None
- groups
The groups to which the clients belong; one or several of: (‘world’, ‘dev’, ‘eval’)
- subworld
The subset of the training data. Has to be specified if groups includes ‘world’ and protocol is one of ‘fold1’, …, ‘fold10’. It must be exactly one of (‘onefolds’, ‘twofolds’, …, ‘sevenfolds’). Ignored for groups ‘dev’ and ‘eval’.
- world_type
One of (‘restricted’, ‘unrestricted’). Ignored.
Returns: A list containing all Client objects which have the desired properties.
get_client_id_from_file_id(file_id, **kwargs)
Returns the client_id (real client id) attached to the given file_id
Keyword Parameters:
- file_id
The file_id to consider
Returns: The client_id attached to the given file_id
get_client_id_from_model_id(model_id, **kwargs)
Returns the client_id (real client id) attached to the given model id
Keyword Parameters:
- model_id
The model to consider
Returns: The client_id attached to the given model
model_ids(protocol=None, groups=None)
Returns a list of model ids for the specific query by the user. For the ‘dev’ and ‘eval’ groups, the first element of each pair is extracted.
Keyword Parameters:
- protocol
The protocol to consider; one of: (‘fold1’, …, ‘fold10’), or None
- groups
The groups to which the clients belong; one or several of: (‘dev’, ‘eval’). The ‘eval’ group does not exist for protocol ‘view1’.
Returns: A list containing all model ids which have the desired properties.
models(protocol=None, groups=None)
Returns a list of Directory objects (there are multiple models per client) for the specific query by the user. For the ‘dev’ and ‘eval’ groups, the first element of each pair is extracted.
Keyword Parameters:
- protocol
The protocol to consider; one of: (‘fold1’, …, ‘fold10’), or None
- groups
The groups to which the clients belong; one or several of: (‘dev’, ‘eval’)
Returns: A list containing all Directory objects which have the desired properties.
objects(protocol=None, model_ids=None, groups=None, purposes=None, subworld='sevenfolds', world_type='unrestricted')
Returns a list of Directory objects for the specific query by the user.
Keyword Parameters:
- protocol
The protocol to consider (‘fold1’, …, ‘fold10’), or None
- groups
The groups to which the objects belong (‘world’, ‘dev’, ‘eval’)
- purposes
The purposes of the objects (‘enroll’, ‘probe’)
- subworld
The subset of the training data. Has to be specified if groups includes ‘world’ and protocol is one of ‘fold1’, …, ‘fold10’. It must be exactly one of (‘onefolds’, ‘twofolds’, …, ‘sevenfolds’).
- world_type
One of (‘restricted’, ‘unrestricted’). If ‘restricted’, only the files that are used in one of the training pairs are used. For ‘unrestricted’, all files of the training people are returned.
- model_ids
Only retrieves the objects for the provided list of model ids. If ‘None’ is given (this is the default), no filter over the model_ids is performed. Note that the combination of ‘world’ group and ‘model_ids’ should be avoided.
Returns: A list of Directory objects considering all the filtering criteria.
original_file_name(directory, check_existence=None)
Returns the list of original image names for the given directory, sorted by frame number. In contrast to other Bob databases, a list of file names is returned here.
Keyword arguments:
- directory (bob.db.youtube.Directory)
The Directory object to retrieve the list of file names for
- check_existence (bool)
Whether the existence of the files shall be checked
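The following sketch shows why sorting by frame number differs from plain alphabetical sorting. It assumes, based only on the example names in the annotations description ('1.56.jpg', '1.57.jpg'), that the frame number is the last dot-separated component before the extension; both helper functions are hypothetical and not part of bob.db.youtube.

```python
import os


def frame_number(file_name):
    """Extract the frame number, assuming names such as '1.56.jpg'."""
    stem = os.path.splitext(os.path.basename(file_name))[0]  # e.g. '1.56'
    return int(stem.rsplit('.', 1)[-1])


def sort_by_frame(file_names):
    """Sort file names numerically by their (assumed) frame number."""
    return sorted(file_names, key=frame_number)
```

With this key, '1.100.jpg' is placed after '1.57.jpg', whereas a purely alphabetical sort would place it first.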
pairs(protocol=None, groups=None, classes=None, subworld='sevenfolds')
Queries a list of Pair objects.
Keyword Parameters:
- protocol
The protocol to consider (‘fold1’, …, ‘fold10’)
- groups
The groups to which the objects belong (‘world’, ‘dev’, ‘eval’)
- classes
The classes to which the pairs belong (‘matched’, ‘unmatched’)
- subworld
The subset of the training data. Has to be specified if groups includes ‘world’ and protocol is one of ‘fold1’, …, ‘fold10’. It must be exactly one of (‘onefolds’, ‘twofolds’, …, ‘sevenfolds’).
Returns: A list of Pair objects considering all the filtering criteria.
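A returned list of pairs could be separated into matched and unmatched trials as sketched below. The sketch uses only the `enroll_directory_id`, `probe_directory_id` and `is_match` attributes listed for the Pair class; the helper name is hypothetical.

```python
def trials_from_pairs(pairs):
    """Split Pair-like objects into matched and unmatched (enroll, probe) id tuples."""
    matched, unmatched = [], []
    for pair in pairs:
        trial = (pair.enroll_directory_id, pair.probe_directory_id)
        if pair.is_match:
            matched.append(trial)
        else:
            unmatched.append(trial)
    return matched, unmatched
```

The same separation can be requested from the database directly through the `classes` parameter.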
tmodel_ids(protocol, groups=None)
Returns a list of T-norm model ids that can be used for ZT-norm. These are the model ids from two other splits of the data, specifically the last two of the training splits. Hence, to keep the training data independent of the ZT-norm data, use at most subworld='fivefolds' in the world query.
Keyword Parameters:
- protocol
The protocol to consider; one of: (‘fold1’, …, ‘fold10’), or None
- groups
Ignored.
Returns: A list containing all Directory objects which have the desired properties.
tmodels(protocol=None, groups=None)
Returns a list of T-norm models that can be used for ZT-norm. These are the models from two other splits of the data, specifically the last two of the training splits. Hence, to keep the training data independent of the ZT-norm data, use at most subworld='fivefolds' in the world query.
Keyword Parameters:
- protocol
The protocol to consider; one of: (‘fold1’, …, ‘fold10’), or None
- groups
Ignored.
Returns: A list containing all Directory objects which have the desired properties.
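A training query that respects this restriction might look like the sketch below. `db` is assumed to be a bob.db.youtube.Database instance, the helper name is hypothetical, and `world_type` is passed explicitly rather than relying on its default.

```python
def zt_independent_training_files(db, protocol, world_type='unrestricted'):
    """Query training files that do not overlap with the ZT-norm splits.

    Limiting the subworld to 'fivefolds' leaves out the last two training
    splits, which are the ones used for the T-norm models.
    """
    return db.objects(
        protocol=protocol,
        groups='world',
        subworld='fivefolds',
        world_type=world_type,
    )
```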
tobjects(protocol, model_ids=None, groups=None)
Returns a set of filenames for enrolling T-norm models for score normalization.
Keyword Parameters:
- protocol
The protocol to consider (‘fold1’, …, ‘fold10’), or None
- model_ids
Only retrieves the files for the provided list of model ids. If ‘None’ is given (this is the default), no filter over the model_ids is performed.
- groups
Ignored.
Returns: A set of Directory objects with the given properties.
zobjects(protocol, model_ids=None, groups=None)
Returns a set of filenames for Z-norm probing for score normalization.
Keyword Parameters:
- protocol
The protocol to consider (‘fold1’, …, ‘fold10’), or None
- model_ids
Only retrieves the files for the provided list of model ids. If ‘None’ is given (this is the default), no filter over the model_ids is performed.
- groups
Ignored.
Returns: A set of Directory objects with the given properties.
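For readers unfamiliar with score normalization, the sketch below shows the textbook form of Z-norm and T-norm on plain lists of scores. It is not taken from this package: the function names are hypothetical, and how the cohort scores are computed from the files returned by tobjects and zobjects depends on the recognition system.

```python
import statistics


def _normalize(raw_score, cohort_scores):
    """Shift and scale a score by the mean and standard deviation of a cohort."""
    mean = statistics.mean(cohort_scores)
    std = statistics.pstdev(cohort_scores)
    if std == 0:
        raise ValueError('cohort scores have zero variance')
    return (raw_score - mean) / std


def znorm(raw_score, model_vs_zprobe_scores):
    """Z-norm: the cohort is the same model scored against the Z-norm probe files."""
    return _normalize(raw_score, model_vs_zprobe_scores)


def tnorm(raw_score, tmodels_vs_probe_scores):
    """T-norm: the cohort is the same probe scored against the T-norm models."""
    return _normalize(raw_score, tmodels_vs_probe_scores)
```

ZT-norm combines the two steps, typically by applying T-norm to scores that have already been Z-normalized.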
class bob.db.youtube.Directory(file_id, client_id, path)
Bases: sqlalchemy.ext.declarative.api.Base, bob.db.base.File
Information about the directories of the YouTube Faces database.
- client
- client_id
- id
- path
- shot_id
class bob.db.youtube.Pair(protocol, enroll_id, probe_id, enroll_client_id, probe_client_id, is_match)
Bases: sqlalchemy.ext.declarative.api.Base
Information about the pairs (as given in the pairs.txt files) of the YouTube Faces database.
- enroll_client
- enroll_client_id
- enroll_directory
- enroll_directory_id
- id
- is_match
- probe_client
- probe_client_id
- probe_directory
- probe_directory_id
- protocol