AMIcorpus Database Interface for Bob

The AMI Meeting Corpus consists of:

  • 100 hours of meeting recordings
  • close-talking and far-field microphones
  • Individual and room-view video cameras
  • Output from a slide projector and an electronic whiteboard

The meetings were recorded in English in three different rooms with different acoustic properties, and most participants are non-native English speakers.

The AMI Meeting Corpus annotations:

  • Linguistic (covering all recordings):
      • high-quality, manually produced orthographic transcription for each individual speaker
      • word-level timings
  • Behaviors (limited coverage):
      • dialogue acts
      • topic segmentation
      • extractive and abstractive summaries
      • types of head gesture, hand gesture, and gaze direction
      • movement around the room
      • emotional state

More information about this database can be found in amicorpus.

The script generate_files_list_verif.py shipped with this package implements two protocols, called p1 and p2. They correspond, respectively, to the Full-corpus and Full-corpus-ASR partitions mentioned in the dataset section of amicorpus. To re-generate the file lists produced by this script, run the following command:

$ python generate_files_list_verif.py /PATH_TO_YOUR_AMICORPUS/

Note that the script structures the file lists as required by bob.db.bio_filelist.

Finally, before running experiments with this database, make sure the database configuration file ~/.bob_bio_databases.txt in your home directory contains the following line:

    [YOUR_AMI_DIRECTORY] = PATH_TO_YOUR_AMI_DIRECTORY
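Once this is set up, you can also query the database directly from Python. The snippet below is a minimal sketch assuming bob.db.ami exposes the standard bob.db file-list query interface (an objects() method with protocol/groups/purposes keywords, as in bob.db.bio_filelist); the exact keyword values are illustrative:

    from bob.db.ami import Database

    db = Database()

    # List a few files used to enroll models in protocol 'p1',
    # development group (names follow the bob.db filelist convention).
    for f in db.objects(protocol='p1', groups='dev', purposes='enroll')[:5]:
        print(f.client_id, f.path)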

Speaker recognition protocol on the AMIcorpus Database

To run speaker recognition experiments on amicorpus, you first need to set up a spear environment. Spear is part of the bob.bio packages, which provide open-source tools to run reproducible biometric recognition experiments. For more information about spear, please refer to the documentation: https://www.idiap.ch/software/bob/docs/latest/bob/bob.bio.spear/master/index.html

If you choose to set up the environment using buildout, follow these instructions to start a spear project: https://gitlab.idiap.ch/bob/bob/wikis/buildout#using-zcbuildout-for-production. A list summarizing what should be done is given below:

  1. Modify the buildout.cfg file according to your needs, i.e. which packages and databases you want to use. An example buildout.cfg file is given below:
[buildout]
parts = scripts
develop = src/bob.db.base
          src/bob.db.bio_filelist
          src/bob.db.ami
          src/bob.bio.base
          src/bob.bio.spear
          src/bob.bio.gmm

eggs = bob.db.base
       bob.db.bio_filelist
       bob.db.ami
       bob.bio.base
       bob.bio.spear
       bob.bio.gmm

extensions = bob.buildout
             mr.developer

auto-checkout = *
debug = true
newest = false
verbose = true

[sources]
bob.db.base = git git@gitlab.idiap.ch:bob/bob.db.base
bob.db.bio_filelist = git git@gitlab.idiap.ch:bob/bob.db.bio_filelist
bob.db.ami = git https://gitlab.idiap.ch/akomaty/bob.db.ami.git
bob.bio.base = git git@gitlab.idiap.ch:bob/bob.bio.base
bob.bio.spear = git git@gitlab.idiap.ch:bob/bob.bio.spear
bob.bio.gmm = git git@gitlab.idiap.ch:bob/bob.bio.gmm

[scripts]
recipe = bob.buildout:scripts
dependent-scripts = true
  2. If you installed Bob using conda, make sure to activate the environment first:

    $ source /idiap/group/torch5spro/conda/bin/activate bob-2.4.0-py27_0

    More details about using Bob at Idiap are available in bob.

  3. Open a terminal and type:

$ python bootstrap-buildout.py
$ ./bin/buildout

UBM-GMM

Now you're ready to run your speaker recognition experiment using bob. To do so, use the script ./bin/verify.py. An example command line is:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'gmm' -s ubm_gmm --groups {dev,eval}

In this example, the database is amicorpus, the preprocessor is an energy-based VAD, the feature extractor is MFCC-60 (19 MFCC coefficients + energy + first and second derivatives = 60 dimensions), and the algorithm is UBM-GMM modeling with 256 Gaussians.
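The name 'energy-2gauss' refers to an energy-based VAD that fits two Gaussians to the per-frame energies and keeps the high-energy frames as speech. The sketch below illustrates that idea only; it uses numpy and scikit-learn rather than spear's actual implementation, and the function name and framing are made up:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def energy_vad(frames):
        """frames: (n_frames, samples_per_frame) array of audio samples.
        Returns a boolean mask, True for frames kept as speech."""
        log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
        # One Gaussian models silence, the other speech.
        gmm = GaussianMixture(n_components=2, random_state=0)
        labels = gmm.fit_predict(log_energy.reshape(-1, 1))
        # Keep the frames assigned to the higher-energy component.
        return labels == int(np.argmax(gmm.means_.ravel()))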

The performance of the system on DEV and EVAL is:

  • The HTER of 'UBM-GMM' on the development set is 1.877%
  • The HTER of 'UBM-GMM' on the evaluation set is 8.202%

[Min. criterion: EER] Threshold on Development set: 8.567133e-01

         Development        Test
  FAR    2.703% (18/666)    5.237% (53/1012)
  FRR    2.703% (1/37)      10.870% (5/46)
  HTER   2.703%             8.053%

[Min. criterion: Min. HTER] Threshold on Development set: 8.163361e-01

         Development        Test
  FAR    3.754% (25/666)    5.534% (56/1012)
  FRR    0.000% (0/37)      10.870% (5/46)
  HTER   1.877%             8.202%
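For reference, these figures follow the standard definitions: FAR is the fraction of impostor trials accepted, FRR the fraction of genuine trials rejected, and HTER their average. The plain-Python sketch below (not the actual bob.measure code) reproduces the 'Min. HTER' Test column above:

    def far_frr_hter(false_accepts, n_impostors, false_rejects, n_genuine):
        far = false_accepts / float(n_impostors)
        frr = false_rejects / float(n_genuine)
        return far, frr, (far + frr) / 2.0

    far, frr, hter = far_frr_hter(56, 1012, 5, 46)
    print("FAR %.3f%%  FRR %.3f%%  HTER %.3f%%"
          % (far * 100, frr * 100, hter * 100))
    # FAR 5.534%  FRR 10.870%  HTER 8.202%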

JFA

To use the JFA toolchain instead of UBM-GMM:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'jfa' -s 'jfa' --groups {dev,eval}
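For background, JFA (Joint Factor Analysis) models the speaker- and session-dependent GMM mean supervector as

    s = m + Vy + Ux + Dz

where m is the UBM mean supervector, V and U span the speaker and channel subspaces, and Dz is a residual speaker term. This is the standard formulation (Kenny et al.), not anything specific to this package.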

The performance of the system on DEV and EVAL is:

  • The HTER of 'JFA' on the development set is 1.652%
  • The HTER of 'JFA' on the evaluation set is 11.561%

[Min. criterion: EER] Threshold on Development set: 7.990545e-01

         Development        Test
  FAR    3.003% (20/666)    5.632% (57/1012)
  FRR    2.703% (1/37)      19.565% (9/46)
  HTER   2.853%             12.599%

[Min. criterion: Min. HTER] Threshold on Development set: 7.902645e-01

         Development        Test
  FAR    3.303% (22/666)    5.731% (58/1012)
  FRR    0.000% (0/37)      17.391% (8/46)
  HTER   1.652%             11.561%

ISV

Another example uses the ISV toolchain. First, you need to train the ISV model on the training set (called world in bob):

$ ./bin/train_isv.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'isv' -s 'isv' --groups 'world' -o 'isv' --preprocessed-directory 'YOUR_PREPROCESSED_DIR' --extracted-directory 'YOUR_EXTRACTED_FEATURES_DIR' --gmm-directory 'YOUR_UBM_DIR'

Then you can run the script ./bin/verify.py:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'isv' -s 'isv' --groups {dev,eval} --skip-preprocessing --skip-extraction --skip-projector-training --preprocessed-directory 'YOUR_PREPROCESSED_DIR' --extracted-directory 'YOUR_EXTRACTED_FEATURES_DIR' --projector-file 'DIRECTORY_OF_PROJECTOR_FILE'
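For background, ISV (Inter-Session Variability modeling) can be seen as a simplification of JFA without the speaker subspace: a session's supervector is modeled as

    s = m + Ux + Dz

where Ux absorbs session effects (this is what the train_isv.py step above estimates on the world set) and Dz carries the speaker offset. Again, this is the standard formulation, not package-specific.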

The performance of the system on DEV and EVAL is:

  • The HTER of 'ISV' on the development set is 1.426%
  • The HTER of 'ISV' on the evaluation set is 3.360%

[Min. criterion: EER] Threshold on Development set: 2.599592e-01

         Development        Test
  FAR    2.703% (18/666)    4.941% (50/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.703%             3.557%

[Min. criterion: Min. HTER] Threshold on Development set: 3.543569e-01

         Development        Test
  FAR    0.150% (1/666)     2.372% (24/1012)
  FRR    2.703% (1/37)      4.348% (2/46)
  HTER   1.426%             3.360%

TV-cosine

To use Total Variability (TV) modeling, also known as i-vectors, with cosine distance scoring:

$ ./bin/train_ivector.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-cosine' -s 'ivec-cosine' --groups 'world'
$ ./bin/verify.py -vv -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-cosine' -s 'ivec-cosine' --groups {dev,eval} --skip-projector-training
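With TV modeling, each utterance is mapped to a single low-dimensional i-vector (supervector model s = m + Tw, with w the i-vector), and 'ivector-cosine' scores a trial by the cosine similarity between the enrollment and probe i-vectors. A minimal sketch of the scoring step (the standard definition; spear's algorithm may additionally whiten or length-normalize the i-vectors):

    import numpy as np

    def cosine_score(enroll, probe):
        # Cosine similarity between two i-vectors; higher = more similar.
        return float(np.dot(enroll, probe) /
                     (np.linalg.norm(enroll) * np.linalg.norm(probe)))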

The performance of the system on DEV and EVAL is:

  • The HTER of 'TV-cosine' on the development set is 10.586%
  • The HTER of 'TV-cosine' on the evaluation set is 17.194%

[Min. criterion: EER] Threshold on Development set: 8.452344e-01

         Development        Test
  FAR    12.763% (85/666)   7.905% (80/1012)
  FRR    13.514% (5/37)     19.565% (9/46)
  HTER   13.138%            13.735%

[Min. criterion: Min. HTER] Threshold on Development set: 8.646607e-01

         Development        Test
  FAR    7.658% (51/666)    6.126% (62/1012)
  FRR    13.514% (5/37)     28.261% (13/46)
  HTER   10.586%            17.194%

TV-PLDA

To use i-vectors with PLDA scoring:

$ ./bin/train_ivector.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-plda' -s 'ivec-plda' --groups 'world'
$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-plda' -s 'ivec-plda' --groups {dev,eval} --skip-projector-training
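For background, 'ivector-plda' replaces the cosine distance with a probabilistic back-end: in the simplified PLDA formulation an i-vector is modeled as w = mu + Fh + epsilon, with h a latent speaker factor and epsilon a residual, and a trial is scored by the log-likelihood ratio of the same-speaker versus different-speaker hypotheses. This is the standard formulation, not package-specific.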

The performance of the system on DEV and EVAL is:

  • The HTER of 'TV-PLDA' on the development set is 2.252%
  • The HTER of 'TV-PLDA' on the evaluation set is 4.545%

[Min. criterion: EER] Threshold on Development set: -6.309673e+01

         Development        Test
  FAR    2.703% (18/666)    8.202% (83/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.703%             5.188%

[Min. criterion: Min. HTER] Threshold on Development set: -5.790133e+01

         Development        Test
  FAR    1.802% (12/666)    6.917% (70/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.252%             4.545%

Getting the data

The original data can be downloaded from amicorpus.
