NIST-SRE04-16 Dataset¶
Dataset Description¶
This is an aggregation of the NIST-SRE datasets from 2004 to 2016.
| Group | Role | Identities | Sample count |
|---|---|---|---|
| train | | 6213 | 71728 |
| dev | references | 80 | 120 |
| dev | probes | 5 | 1207 |
| eval | references | 802 | 1202 |
| eval | probes | 5 | 9294 |
GMM¶
| | Development | Evaluation |
|---|---|---|
Command used:
$ bob bio pipeline -d nist-sre04to16 -p gmm-nist -g dev -g eval -l sge -o results/gmm_nist
On 1281 CPU nodes on the SGE Grid: TODO
ISV¶
| | Development | Evaluation |
|---|---|---|
Command used:
$ bob bio pipeline -d nist-sre04to16 -p isv-nist -g dev -g eval -l sge -o results/isv_nist
On 1281 CPU nodes on the SGE Grid: TODO
Speechbrain ECAPA-TDNN¶
| | Development | Evaluation |
|---|---|---|
| Failure to Acquire | 0.0% | 0.0% |
| False Match Rate | 28.4% (27400/96342) | 28.7% (2142253/7453619) |
| False Non Match Rate | 28.4% (62/218) | 33.7% (57/169) |
| False Accept Rate | 28.4% | 28.7% |
| False Reject Rate | 28.4% | 33.7% |
| Half Total Error Rate | 28.4% | 31.2% |
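The percentages in the table above can be reproduced from the counts given in parentheses. The following is a minimal sketch, assuming the Half Total Error Rate is the unweighted mean of the False Match Rate and the False Non Match Rate, which agrees with the reported values. The helper names `rate` and `hter` are illustrative and are not part of bob.

```python
def rate(errors: int, trials: int) -> float:
    """Return errors / trials expressed as a percentage."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return 100.0 * errors / trials


def hter(fmr: float, fnmr: float) -> float:
    """Half Total Error Rate, assumed here to be the mean of FMR and FNMR."""
    return (fmr + fnmr) / 2.0


# Counts copied from the results table:
# (false matches, comparisons), (false non-matches, comparisons).
counts = {
    "dev": ((27400, 96342), (62, 218)),
    "eval": ((2142253, 7453619), (57, 169)),
}

results = {}
for group, (fm, fnm) in counts.items():
    fmr = rate(*fm)
    fnmr = rate(*fnm)
    results[group] = (fmr, fnmr, hter(fmr, fnmr))
    print(f"{group}: FMR={fmr:.1f}% FNMR={fnmr:.1f}% HTER={results[group][2]:.1f}%")
```

Rounded to one decimal place, the computed values match the Development and Evaluation columns.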
Todo
These results are not taking into account the C_ID_X unknown identity…
In the core protocol, some probes carry the "C_ID_X" reference_id.
These samples do not all come from the same person, yet the analysis scripts treat them as a single identity.
This is incorrect, and these probes still need to be handled explicitly.
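A possible first step, sketched under assumptions, is to separate the "C_ID_X" probes from the rest so that they can be handled explicitly. The row layout, the `probe_reference_id` key, and the sample rows below are hypothetical stand-ins. They do not reflect the actual score file format or the behaviour of the analysis scripts.

```python
UNKNOWN_ID = "C_ID_X"  # reference_id shared by the unknown-identity probes


def split_unknown(rows, key="probe_reference_id"):
    """Partition score rows into (known, unknown) by the probe's reference_id.

    `key` is a hypothetical column name; adapt it to the real score layout.
    """
    known, unknown = [], []
    for row in rows:
        (unknown if row[key] == UNKNOWN_ID else known).append(row)
    return known, unknown


# Illustrative rows only, not real scores from this dataset.
rows = [
    {"probe_reference_id": "spk_a", "score": 0.8},
    {"probe_reference_id": "C_ID_X", "score": 0.3},
    {"probe_reference_id": "C_ID_X", "score": 0.6},
]

known, unknown = split_unknown(rows)
print(len(known), len(unknown))
```

How the separated probes should then be scored is left open here; the sketch only isolates them.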
Command used:
$ bob bio pipeline -d nist-sre04to16 -p speechbrain-ecapa-voxceleb -g dev -g eval -l sge -o results/speechbrain_nist
On 1281 CPU nodes on the SGE Grid: Ran in 50 minutes (no training).
Footnotes