The AVspoof Database
The AVspoof Database provides unbiased spoofing attacks so that researchers can test both their automatic speaker verification (ASV) systems and their anti-spoofing algorithms. The attacks are created from newly acquired audio recordings. Data acquisition lasted approximately two months and involved 44 persons, each participating in several sessions configured with different environmental conditions and setups. After the data were collected, three types of attacks were generated: replay, voice conversion and speech synthesis. The database was produced at the Idiap Research Institute in Switzerland.
If you use this database, please cite the following publication in your paper:
@INPROCEEDINGS{KucurErgunay_IEEEBTAS_2015,
  author = {Kucur Ergunay, Serife and Khoury, Elie and Lazaridis, Alexandros and Marcel, S{\'{e}}bastien},
  projects = {Idiap, SNSF-LOBI, BEAT},
  month = sep,
  title = {On the Vulnerability of Speaker Verification to Realistic Voice Spoofing},
  booktitle = {IEEE International Conference on Biometrics: Theory, Applications and Systems},
  year = {2015},
  pdf = {http://publications.idiap.ch/downloads/papers/2015/KucurErgunay_IEEEBTAS_2015.pdf}
}
The data acquisition process is divided into four sessions, each scheduled several days apart and held in different setups and environmental conditions (e.g. different background noise and reverberation) for each of the 31 male and 13 female participants. The first session, which is intended to be used as the training set when creating the attacks, was performed in the most controlled conditions. In contrast, the conditions for the last three sessions, dedicated to test trials, were more relaxed in order to capture more challenging scenarios. The audio data were recorded by three devices: (a) one good-quality microphone, the AT2020USB+, and two mobile phones, (b) a Samsung Galaxy S4 (phone1) and (c) an iPhone 3GS (phone2). The positioning of the devices was fixed for each session and each participant in order to standardize the recording settings.
For each session, the participant was subjected to three different data acquisition protocols, as follows:
* Reading part (read): 10/40 pre-defined sentences are read by the participant.
* Pass-phrases part (pass): 5 short prompts are read by the participant.
* Free speech part (free): The participant speaks freely about any topic for 3 to 10 minutes.
The number, length and content of the sentences for the reading and pass-phrases parts were carefully selected to satisfy constraints on readability, data acquisition and attack quality. Similarly, the minimum duration of the free speech part was determined from our preliminary investigations, mostly on the voice conversion attacks, for which the free speech data would be included in the training set.
In the spoofing attack creation phase, we created spoofing trials for the text-dependent utterances of the testing data, i.e. the reading parts of sessions 2-4 and the pass-phrases of all four sessions. As a preliminary step before creating the attacks, the speech data, originally recorded at a 44.1 kHz sampling rate, were down-sampled to 16 kHz.
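Going from 44.1 kHz to 16 kHz is a rational-rate conversion. The source does not say which resampling tool was used, so the following is only a minimal sketch of how the integer up/down factors and the resulting sample count could be derived; the helper names `resample_factors` and `resampled_length` are hypothetical.

```python
from fractions import Fraction


def resample_factors(src_rate_hz: int, dst_rate_hz: int) -> tuple[int, int]:
    """Return the smallest integer (up, down) with dst = src * up / down."""
    ratio = Fraction(dst_rate_hz, src_rate_hz)  # reduced automatically
    return ratio.numerator, ratio.denominator


def resampled_length(num_samples: int, up: int, down: int) -> int:
    """Number of output samples after resampling, rounded up."""
    return -(-num_samples * up // down)  # ceiling division on integers


up, down = resample_factors(44100, 16000)
one_second = resampled_length(44100, up, down)
print(f"up={up}, down={down}, samples for 1 s of audio: {one_second}")
```

Factors of this kind are what a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)` expects, although whether the database authors used that approach is not stated.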
There are four main types of spoofing attacks on ASV systems: impersonation, replay, speech synthesis and voice conversion. Since impersonation is known not to be a serious threat to ASV systems, we did not include it in our database. For the remaining three spoofing types, we designed ten different scenarios (see table below). We gave special attention to physical access attacks. These attacks are more realistic than logical access attacks because the attacker often has no direct access to the system. The acquisition devices (sensors) are open to anyone and are therefore more exposed to such attacks.
Attacks | Trials per speaker (male) | Trials per speaker (female) | Total trials (male) | Total trials (female) |
---|---|---|---|---|
Replay-phone1 | 50 | 50 | 1550 | 650 |
Replay-phone2 | 50 | 50 | 1550 | 650 |
Replay-laptop | 50 | 50 | 1550 | 650 |
Replay-laptop-HQ | 50 | 50 | 1550 | 650 |
Speech-Synthesis-LA | 35 | 35 | 1085 | 455 |
Speech-Synthesis-PA | 35 | 35 | 1085 | 455 |
Speech-Synthesis-PA-HQ | 35 | 35 | 1085 | 455 |
Voice-Conversion-LA | 1500 | 600 | 46500 | 7800 |
Voice-Conversion-PA | 1500 | 600 | 46500 | 7800 |
Voice-Conversion-PA-HQ | 1500 | 600 | 46500 | 7800 |
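The totals in the table follow from the per-speaker counts multiplied by the number of speakers of each gender (31 male, 13 female). A short sketch that reproduces this arithmetic, with the per-speaker figures copied from the table and the function names chosen here for illustration only:

```python
# Speaker counts stated in the acquisition description.
NUM_SPEAKERS = {"male": 31, "female": 13}

# Trials per speaker for each attack, copied from the table above.
TRIALS_PER_SPEAKER = {
    "Replay-phone1": {"male": 50, "female": 50},
    "Replay-phone2": {"male": 50, "female": 50},
    "Replay-laptop": {"male": 50, "female": 50},
    "Replay-laptop-HQ": {"male": 50, "female": 50},
    "Speech-Synthesis-LA": {"male": 35, "female": 35},
    "Speech-Synthesis-PA": {"male": 35, "female": 35},
    "Speech-Synthesis-PA-HQ": {"male": 35, "female": 35},
    "Voice-Conversion-LA": {"male": 1500, "female": 600},
    "Voice-Conversion-PA": {"male": 1500, "female": 600},
    "Voice-Conversion-PA-HQ": {"male": 1500, "female": 600},
}


def total_trials(attack: str, gender: str) -> int:
    """Total trials for one attack and gender: per-speaker count times speakers."""
    return TRIALS_PER_SPEAKER[attack][gender] * NUM_SPEAKERS[gender]


def grand_total() -> int:
    """Sum of all attack trials over every scenario and both genders."""
    return sum(
        total_trials(attack, gender)
        for attack in TRIALS_PER_SPEAKER
        for gender in NUM_SPEAKERS
    )


for attack in TRIALS_PER_SPEAKER:
    print(attack, total_trials(attack, "male"), total_trials(attack, "female"))
print("all attacks:", grand_total())
```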
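A replay attack consists of replaying pre-recorded speech to an ASV system. We assume that the ASV system has a good-quality microphone and that the replay attack targets this sensor. The four replay scenarios are those listed in the table above: Replay-phone1, Replay-phone2, Replay-laptop and Replay-laptop-HQ.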
The speech synthesis attacks were based on statistical parametric speech synthesis (SPSS). More specifically, a hidden Markov model (HMM)-based speech synthesis technique was used.
The voice conversion attacks were created using Festvox. A conversion function for each source-target speaker pair is found from GMM parameters learned on the source and target speakers' training data. We did not consider cross-gender voice conversion attacks; that is, only male-to-male and female-to-female conversions were taken into account. As in the case of speech synthesis, three scenarios are involved (LA, PA and PA-HQ in the table above).
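One reading of the voice conversion counts, offered here as an inference from the numbers rather than something the text states, is that every ordered same-gender (source, target) pair contributes the same number of converted utterances. The sketch below enumerates those pairs for 31 male and 13 female speakers and derives the per-pair count implied by the table totals. The speaker IDs are hypothetical placeholders.

```python
from itertools import permutations


def same_gender_pairs(speaker_ids):
    """All ordered (source, target) pairs with source != target."""
    return list(permutations(speaker_ids, 2))


male_ids = [f"M{i:02d}" for i in range(1, 32)]    # 31 hypothetical male IDs
female_ids = [f"F{i:02d}" for i in range(1, 14)]  # 13 hypothetical female IDs

male_pairs = same_gender_pairs(male_ids)
female_pairs = same_gender_pairs(female_ids)

# Totals per voice conversion scenario, copied from the attack table.
TOTAL_VC_TRIALS = {"male": 46500, "female": 7800}

male_per_pair = TOTAL_VC_TRIALS["male"] / len(male_pairs)
female_per_pair = TOTAL_VC_TRIALS["female"] / len(female_pairs)

print(len(male_pairs), male_per_pair)
print(len(female_pairs), female_per_pair)
```

Both quotients come out to 50, which is consistent with a fixed set of 50 utterances being converted for each source-target pair.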
Spoofing and genuine samples are each labelled with a text field that defines the class of the sample, "attack" or "real", for simple binary anti-spoofing classification systems. Code using this database view may use the class field to differentiate samples.
The view supports the following protocols, which are also available in the database: smalltest (for proof-of-concept experiments only, as a subset of only three clients is provided for each set), grandtest (data of the whole database are provided), physical_access (only replay/presentation attacks are provided), and logical_access (only logical access attacks are provided, with no replay attacks).
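As a minimal sketch of using the class field, the following partitions samples into genuine and spoofed lists. It assumes each sample is available as a plain dictionary keyed by the output names of the view; the records and the `split_by_class` helper are hypothetical and not part of any platform API.

```python
def split_by_class(samples):
    """Partition samples into (real, attack) lists using the `class` field."""
    real, attack = [], []
    for sample in samples:
        label = sample["class"]
        if label == "real":
            real.append(sample)
        elif label == "attack":
            attack.append(sample)
        else:
            # Only the two labels named in the description are expected.
            raise ValueError(f"unexpected class label: {label!r}")
    return real, attack


# Hypothetical records; only the field names come from the view's outputs.
samples = [
    {"file_id": "f1", "client_id": "c1", "class": "real"},
    {"file_id": "f2", "client_id": "c1", "class": "attack"},
    {"file_id": "f3", "client_id": "c2", "class": "attack"},
]

real, attack = split_by_class(samples)
print(len(real), "genuine,", len(attack), "spoofed")
```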
Output name | Data format |
---|---|
attack_type: | system/text/1 (Basic format containing a text) |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
class: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
attack_type: | system/text/1 (Basic format containing a text) |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
class: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
attack_type: | system/text/1 (Basic format containing a text) |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
class: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
template_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
client_id: | system/text/1 (Basic format containing a text) |
file_id: | system/text/1 (Basic format containing a text) |
probe_id: | system/text/1 (Basic format containing a text) |
template_ids: | system/array_1d_text/1 |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
template_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
client_id: | system/text/1 (Basic format containing a text) |
file_id: | system/text/1 (Basic format containing a text) |
probe_id: | system/text/1 (Basic format containing a text) |
template_ids: | system/array_1d_text/1 |
Output name | Data format |
---|---|
template_ids: | system/array_1d_text/1 |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
attack_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
template_ids: | system/array_1d_text/1 |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
attack_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
template_id: | system/text/1 (Basic format containing a text) |
Output name | Data format |
---|---|
probe_id: | system/text/1 (Basic format containing a text) |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
template_ids: | system/array_1d_text/1 |
Output name | Data format |
---|---|
template_ids: | system/array_1d_text/1 |
speech: | system/array_1d_floats/1 (Basic format containing a one-dimensional array of float values) |
file_id: | system/text/1 (Basic format containing a text) |
client_id: | system/text/1 (Basic format containing a text) |
attack_id: | system/text/1 (Basic format containing a text) |
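The verification outputs above pair each probe with a list of `template_ids`. A minimal sketch of how a consumer might expand those lists into (probe, template) comparison trials follows. It assumes plain dictionaries carrying the listed field names; the records and the `comparison_trials` helper are hypothetical and not part of the platform API.

```python
def comparison_trials(probes, templates):
    """Yield (probe_id, template_id) for every template a probe is compared to."""
    known = {t["template_id"] for t in templates}
    for probe in probes:
        for template_id in probe["template_ids"]:
            if template_id not in known:
                raise KeyError(
                    f"probe {probe['probe_id']} references "
                    f"unknown template {template_id}"
                )
            yield probe["probe_id"], template_id


# Hypothetical records; only the field names come from the output tables.
templates = [
    {"template_id": "t1", "client_id": "c1", "file_id": "f1"},
    {"template_id": "t2", "client_id": "c2", "file_id": "f2"},
]
probes = [
    {"probe_id": "p1", "client_id": "c1", "file_id": "f3",
     "template_ids": ["t1", "t2"]},
]

trials = list(comparison_trials(probes, templates))
print(trials)
```

The attack outputs also carry `template_ids`, so the same expansion would presumably apply with `attack_id` in place of `probe_id`.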