AMIcorpus Database Interface for Bob

The AMI Meeting Corpus consists of:

  • 100 hours of meeting recordings
  • close-talking and far-field microphones
  • Individual and room-view video cameras
  • Output from a slide projector and an electronic whiteboard

The meetings were recorded in English in three different rooms with different acoustic properties, and most participants are non-native English speakers.

The AMI Meeting Corpus annotations:

  • Linguistic (covering all recordings):
      • high-quality, manually produced orthographic transcription for each individual speaker
      • word-level timings
  • Behaviors (limited coverage):
      • dialogue acts
      • topic segmentation
      • extractive and abstractive summaries
      • types of head gesture, hand gesture, and gaze direction
      • movement around the room
      • emotional state

More information about this database can be found in amicorpus.

The script generate_files_list_verif.py shipped with this package implements two protocols, called p1 and p2. They correspond, respectively, to the Full-corpus and Full-corpus-ASR partitions mentioned in the dataset section of amicorpus. To re-generate the file lists produced by this script, run the following command:

$ python generate_files_list_verif.py /PATH_TO_YOUR_AMICORPUS/

Note that the script structures the file lists as required by bob.db.bio_filelist.

Finally, before running experiments with this database, make sure the database configuration file ~/.bob_bio_databases.txt in your home directory contains the following line:

    [YOUR_AMI_DIRECTORY] = PATH_TO_YOUR_AMI_DIRECTORY
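Once this is set up, you can also query the database directly from Python. The snippet below is a minimal sketch assuming bob.db.ami exposes the standard bob.db file-list query interface (an objects() method with protocol/groups/purposes keywords, as in bob.db.bio_filelist); the exact keyword values are illustrative:

    from bob.db.ami import Database

    db = Database()

    # List a few files used to enroll models in protocol 'p1',
    # development group (names follow the bob.db filelist convention).
    for f in db.objects(protocol='p1', groups='dev', purposes='enroll')[:5]:
        print(f.client_id, f.path)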

Speaker recognition protocol on the AMIcorpus Database

To run speaker recognition experiments on amicorpus, you first need to set up a spear environment. Spear is part of the bob.bio packages, which provide open-source tools to run reproducible biometric recognition experiments. For more information about spear, please refer to the documentation: https://www.idiap.ch/software/bob/docs/latest/bob/bob.bio.spear/master/index.html

If you choose to set up the environment using buildout, follow these instructions to start a spear project: https://gitlab.idiap.ch/bob/bob/wikis/buildout#using-zcbuildout-for-production. A list summarizing what should be done is given below:

  1. Modify the buildout.cfg file according to your needs, i.e. which packages and databases you want to use. An example buildout.cfg file is given below:
[buildout]
parts = scripts
develop = src/bob.db.base
          src/bob.db.bio_filelist
          src/bob.db.ami
          src/bob.bio.base
          src/bob.bio.spear
          src/bob.bio.gmm

eggs = bob.db.base
       bob.db.bio_filelist
       bob.db.ami
       bob.bio.base
       bob.bio.spear
       bob.bio.gmm

extensions = bob.buildout
             mr.developer

auto-checkout = *
debug = true
newest = false
verbose = true

[sources]
bob.db.base = git git@gitlab.idiap.ch:bob/bob.db.base
bob.db.bio_filelist = git git@gitlab.idiap.ch:bob/bob.db.bio_filelist
bob.db.ami = git https://gitlab.idiap.ch/akomaty/bob.db.ami.git
bob.bio.base = git git@gitlab.idiap.ch:bob/bob.bio.base
bob.bio.spear = git git@gitlab.idiap.ch:bob/bob.bio.spear
bob.bio.gmm = git git@gitlab.idiap.ch:bob/bob.bio.gmm

[scripts]
recipe = bob.buildout:scripts
dependent-scripts = true
  2. If you installed Bob using conda, make sure to activate the environment first:

    $ source /idiap/group/torch5spro/conda/bin/activate bob-2.4.0-py27_0

    More details about using Bob at Idiap are available in bob.

  3. Open a terminal and type:

$ python bootstrap-buildout.py
$ ./bin/buildout

UBM-GMM

Now you're ready to run your speaker recognition experiment using bob. To do so, use the script ./bin/verify.py. An example command line is:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'gmm' -s ubm_gmm --groups {dev,eval}

In this example, the database is amicorpus, the preprocessor is an energy-based VAD, the feature extractor is MFCC-60 (19 MFCC coefficients + energy + first and second derivatives = 60 dimensions), and the algorithm is UBM-GMM modeling with 256 Gaussians.
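The name 'energy-2gauss' refers to an energy-based VAD that fits two Gaussians to the per-frame energies and keeps the high-energy frames as speech. The sketch below illustrates that idea only; it uses numpy and scikit-learn rather than spear's actual implementation, and the function name and framing are made up:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def energy_vad(frames):
        """frames: (n_frames, samples_per_frame) array of audio samples.
        Returns a boolean mask, True for frames kept as speech."""
        log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
        # One Gaussian models silence, the other speech.
        gmm = GaussianMixture(n_components=2, random_state=0)
        labels = gmm.fit_predict(log_energy.reshape(-1, 1))
        # Keep the frames assigned to the higher-energy component.
        return labels == int(np.argmax(gmm.means_.ravel()))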

The performance of the system on DEV and EVAL is:

  • The HTER of 'UBM-GMM' on the development set is 1.877%
  • The HTER of 'UBM-GMM' on the evaluation set is 8.202%

[Min. criterion: EER] Threshold on Development set: 8.567133e-01

         Development        Test
  FAR    2.703% (18/666)    5.237% (53/1012)
  FRR    2.703% (1/37)      10.870% (5/46)
  HTER   2.703%             8.053%

[Min. criterion: Min. HTER] Threshold on Development set: 8.163361e-01

         Development        Test
  FAR    3.754% (25/666)    5.534% (56/1012)
  FRR    0.000% (0/37)      10.870% (5/46)
  HTER   1.877%             8.202%
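For reference, these figures follow the standard definitions: FAR is the fraction of impostor trials accepted, FRR the fraction of genuine trials rejected, and HTER their average. The plain-Python sketch below (not the actual bob.measure code) reproduces the 'Min. HTER' Test column above:

    def far_frr_hter(false_accepts, n_impostors, false_rejects, n_genuine):
        far = false_accepts / float(n_impostors)
        frr = false_rejects / float(n_genuine)
        return far, frr, (far + frr) / 2.0

    far, frr, hter = far_frr_hter(56, 1012, 5, 46)
    print("FAR %.3f%%  FRR %.3f%%  HTER %.3f%%"
          % (far * 100, frr * 100, hter * 100))
    # FAR 5.534%  FRR 10.870%  HTER 8.202%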

JFA

To use the JFA toolchain instead of UBM-GMM:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'jfa' -s 'jfa' --groups {dev,eval}
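For background, JFA (Joint Factor Analysis) models the speaker- and session-dependent GMM mean supervector as

    s = m + Vy + Ux + Dz

where m is the UBM mean supervector, V and U span the speaker and channel subspaces, and Dz is a residual speaker term. This is the standard formulation (Kenny et al.), not anything specific to this package.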

The performance of the system on DEV and EVAL is:

  • The HTER of 'JFA' on the development set is 1.652%
  • The HTER of 'JFA' on the evaluation set is 11.561%

[Min. criterion: EER] Threshold on Development set: 7.990545e-01

         Development        Test
  FAR    3.003% (20/666)    5.632% (57/1012)
  FRR    2.703% (1/37)      19.565% (9/46)
  HTER   2.853%             12.599%

[Min. criterion: Min. HTER] Threshold on Development set: 7.902645e-01

         Development        Test
  FAR    3.303% (22/666)    5.731% (58/1012)
  FRR    0.000% (0/37)      17.391% (8/46)
  HTER   1.652%             11.561%

ISV

Another example uses the ISV toolchain. First, you need to train the ISV model on the training set (called world in bob):

$ ./bin/train_isv.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'isv' -s 'isv' --groups 'world' -o 'isv' --preprocessed-directory 'YOUR_PREPROCESSED_DIR' --extracted-directory 'YOUR_EXTRACTED_FEATURES_DIR' --gmm-directory 'YOUR_UBM_DIR'

Then you can run the script ./bin/verify.py:

$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' --algorithm 'isv' -s 'isv' --groups {dev,eval} --skip-preprocessing --skip-extraction --skip-projector-training --preprocessed-directory 'YOUR_PREPROCESSED_DIR' --extracted-directory 'YOUR_EXTRACTED_FEATURES_DIR' --projector-file 'DIRECTORY_OF_PROJECTOR_FILE'
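For background, ISV (Inter-Session Variability modeling) can be seen as a simplification of JFA without the speaker subspace: a session's supervector is modeled as

    s = m + Ux + Dz

where Ux absorbs session effects (this is what the train_isv.py step above estimates on the world set) and Dz carries the speaker offset. Again, this is the standard formulation, not package-specific.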

The performance of the system on DEV and EVAL is:

  • The HTER of 'ISV' on the development set is 1.426%
  • The HTER of 'ISV' on the evaluation set is 3.360%

[Min. criterion: EER] Threshold on Development set: 2.599592e-01

         Development        Test
  FAR    2.703% (18/666)    4.941% (50/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.703%             3.557%

[Min. criterion: Min. HTER] Threshold on Development set: 3.543569e-01

         Development        Test
  FAR    0.150% (1/666)     2.372% (24/1012)
  FRR    2.703% (1/37)      4.348% (2/46)
  HTER   1.426%             3.360%

TV-cosine

To use Total Variability (TV) modeling, also known as i-vectors, with cosine distance scoring:

$ ./bin/train_ivector.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-cosine' -s 'ivec-cosine' --groups 'world'
$ ./bin/verify.py -vv -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-cosine' -s 'ivec-cosine' --groups {dev,eval} --skip-projector-training
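With TV modeling, each utterance is mapped to a single low-dimensional i-vector (supervector model s = m + Tw, with w the i-vector), and 'ivector-cosine' scores a trial by the cosine similarity between the enrollment and probe i-vectors. A minimal sketch of the scoring step (the standard definition; spear's algorithm may additionally whiten or length-normalize the i-vectors):

    import numpy as np

    def cosine_score(enroll, probe):
        # Cosine similarity between two i-vectors; higher = more similar.
        return float(np.dot(enroll, probe) /
                     (np.linalg.norm(enroll) * np.linalg.norm(probe)))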

The performance of the system on DEV and EVAL is:

  • The HTER of 'TV-cosine' on the development set is 10.586%
  • The HTER of 'TV-cosine' on the evaluation set is 17.194%

[Min. criterion: EER] Threshold on Development set: 8.452344e-01

         Development        Test
  FAR    12.763% (85/666)   7.905% (80/1012)
  FRR    13.514% (5/37)     19.565% (9/46)
  HTER   13.138%            13.735%

[Min. criterion: Min. HTER] Threshold on Development set: 8.646607e-01

         Development        Test
  FAR    7.658% (51/666)    6.126% (62/1012)
  FRR    13.514% (5/37)     28.261% (13/46)
  HTER   10.586%            17.194%

TV-PLDA

To use i-vectors with PLDA scoring:

$ ./bin/train_ivector.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-plda' -s 'ivec-plda' --groups 'world'
$ ./bin/verify.py -d 'ami' -p 'energy-2gauss' -e 'mfcc-60' -a 'ivector-plda' -s 'ivec-plda' --groups {dev,eval} --skip-projector-training
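For background, 'ivector-plda' replaces the cosine distance with a probabilistic back-end: in the simplified PLDA formulation an i-vector is modeled as w = mu + Fh + epsilon, with h a latent speaker factor and epsilon a residual, and a trial is scored by the log-likelihood ratio of the same-speaker versus different-speaker hypotheses. This is the standard formulation, not package-specific.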

The performance of the system on DEV and EVAL is:

  • The HTER of 'TV-PLDA' on the development set is 2.252%
  • The HTER of 'TV-PLDA' on the evaluation set is 4.545%

[Min. criterion: EER] Threshold on Development set: -6.309673e+01

         Development        Test
  FAR    2.703% (18/666)    8.202% (83/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.703%             5.188%

[Min. criterion: Min. HTER] Threshold on Development set: -5.790133e+01

         Development        Test
  FAR    1.802% (12/666)    6.917% (70/1012)
  FRR    2.703% (1/37)      2.174% (1/46)
  HTER   2.252%             4.545%

Getting the data

The original data can be downloaded from amicorpus.
