Coverage for src/bob/bio/vein/database/utfvp.py: 100%
14 statements
coverage.py v7.6.0, created at 2024-07-12 23:27 +0200
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# Victor <vbros@idiap.ch>

"""
Utfvp database implementation
"""

from clapper.rc import UserDefaults
from sklearn.pipeline import make_pipeline

import bob.io.base

from bob.bio.base.database import CSVDatabase, FileSampleLoader
from bob.bio.vein.database.roi_annotation import ROIAnnotation

rc = UserDefaults("bobrc.toml")

class UtfvpDatabase(CSVDatabase):
    """
    The University of Twente Finger Vascular Pattern dataset

    .. warning::

        To use this dataset protocol, you need the original files of the UTFVP dataset.
        Once you have downloaded them, run the following command to set the path for Bob:

        .. code-block:: sh

            bob config set bob.bio.vein.utfvp.directory [DATABASE PATH]

    The fingervein image database consists of 1440 images taken in two distinct
    sessions on two days (May 9th, 2012 and May 23rd, 2012) using a custom-built
    fingervein sensor. In each session, each of the 60 subjects in the dataset was
    asked to present 6 fingers to the sensor twice, making up separate trials. The
    six fingers are the left and right ring, middle and index fingers. Therefore,
    the database contains 60x6 = 360 unique fingers, each captured 4 times
    (360x4 = 1440 images).

    Files in the database have a strict naming convention and are organized in
    directories following their subject identifier like so:
    ``0003/0003_5_2_120509-141536``. The fields can be interpreted as
    ``<subject-id>/<subject-id>_<finger-name>_<trial>_<date>-<hour>``. The subject
    identifier is written as a 4-digit number with leading zeroes, varying from 1
    to 60. The finger name is one of the following:

    * **1**: Left ring
    * **2**: Left middle
    * **3**: Left index
    * **4**: Right index
    * **5**: Right middle
    * **6**: Right ring

    The trial identifiers vary between 1 and 4. The first two trials were
    captured during the first session and the last two during the second session.
    Given the difference in the images between trials on the same day, we assume
    users were asked to remove the finger from the device and re-position it
    afterwards.

    **Annotations**

    We provide **hand-made** region-of-interest (RoI) annotations for all images in
    this dataset. The annotations mark where the finger is on the image,
    excluding the background. Each annotation file is a text file with one
    annotated point per line in the format ``(y, x)``, respecting Bob's image
    encoding convention. Connecting these points into a polygon forms the RoI.

    .. warning::

        To use the annotations, you need to provide the RoI files.
        Once you have downloaded them, run the following command to set the path for Bob:

        .. code-block:: sh

            bob config set bob.bio.vein.utfvp.roi [ANNOTATION PATH]

    **Protocols**

    There are 15 protocols implemented in this package:

    * 1vsall
    * nom
    * nomLeftRing
    * nomLeftMiddle
    * nomLeftIndex
    * nomRightIndex
    * nomRightMiddle
    * nomRightRing
    * full
    * fullLeftRing
    * fullLeftMiddle
    * fullLeftIndex
    * fullRightIndex
    * fullRightMiddle
    * fullRightRing

    **"nom" Protocols**

    "nom" means "normal operation mode". In this set of protocols, images from
    different clients are separated into different sets that can be used for system
    training, validation and evaluation:

    * Fingers from clients in the range [1, 10] are used in the training set
    * Fingers from clients in the range [11, 28] are used in the development (or validation) set
    * Fingers from clients in the range [29, 60] are used in the evaluation (or test) set

    Data from the first session (both trials) can be used for enrolling the finger
    while data from the second session (both trials) should be used exclusively for
    probing the finger. As set up by this database interface, each sample is
    returned as a separate enrollment model. If a single score per finger is
    required, the user must manipulate the final score listings and fuse results
    themselves.

    Matching happens exhaustively between all probes and models. The variants
    named after a finger, such as "nomLeftRing", contain the data filtered by
    finger name as per the listing above. For example, "Left Ring" means all files
    named ``*/*_1_*_*-*.png``. Therefore, each such protocol contains only 1/6 of
    the files of the complete ``nom`` protocol.

    **"full" Protocols**

    "full" protocols are meant to match current practices in fingervein reporting,
    in which most published material does not use a separate evaluation set. All
    data is placed in the development (or validation) set. In these protocols, all
    images are used both for enrolling and for probing fingers. It is, of course,
    a biased setup. Matching happens exhaustively between all samples in the
    development set.

    The variants named after a finger, such as "fullLeftRing", contain the data
    filtered by finger name as per the listing above. For example, "Left Ring"
    means all files named ``*/*_1_*_*-*.png``. Therefore, each such protocol
    contains only 1/6 of the files of the complete ``full`` protocol.

    **"1vsall" Protocol**

    The "1vsall" protocol is meant as a cross-validation protocol. All data from
    the dataset is split into training and development (or validation) sets. No
    samples are allocated to a separate evaluation (or test) set. The training set
    is composed of all samples of fingers ``0001_1`` (left ring finger of subject 1),
    ``0002_2`` (left middle finger of subject 2), ``0003_3`` (left index finger of
    subject 3), ``0004_4`` (right index finger of subject 4), ``0005_5`` (right
    middle finger of subject 5), ``0006_6`` (right ring finger of subject 6),
    ``0007_1`` (left ring finger of subject 7), ``0008_2`` (left middle finger of
    subject 8) and so on, until subject 35 (inclusive). There are 140 images in
    total in this set (35 fingers with 4 images each).

    All other 1300 samples from the dataset are used as a development (or
    validation) set. Each sample generates a single model and is used as a probe
    for all other models. Matching happens exhaustively, except that a model is
    never matched against the image that generated it. So, there are 1299 probes
    per model.

    """

    name = "utfvp"
    category = "vein"
    dataset_protocols_name = "utfvp.tar.gz"
    dataset_protocols_urls = [
        "https://www.idiap.ch/software/bob/databases/latest/vein/utfvp-fe51ba85.tar.gz",
        "http://www.idiap.ch/software/bob/databases/latest/vein/utfvp-fe51ba85.tar.gz",
    ]
    dataset_protocols_hash = "fe51ba85"

    def __init__(self, protocol):
        # Both paths come from the user's Bob configuration (see the warnings
        # in the class docstring) and default to an empty string when unset.
        super().__init__(
            name=self.name,
            protocol=protocol,
            transformer=make_pipeline(
                FileSampleLoader(
                    data_loader=bob.io.base.load,
                    dataset_original_directory=rc.get(
                        "bob.bio.vein.utfvp.directory", ""
                    ),
                    extension="",
                ),
                ROIAnnotation(roi_path=rc.get("bob.bio.vein.utfvp.roi", "")),
            ),
            score_all_vs_all=True,
        )
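
The naming convention and the protocol splits described in the class docstring can be illustrated with a small standalone sketch. It is not part of bob.bio.vein: the names `UtfvpStem`, `parse_utfvp_stem`, `nom_group`, `nom_role` and `in_1vsall_training` are hypothetical helpers introduced only for illustration. The "1vsall" rule is an assumption that extends the pattern the docstring lists for subjects 1 to 8 up to subject 35.

```python
from typing import NamedTuple

# Finger codes as listed in the UtfvpDatabase docstring.
FINGER_NAMES = {
    1: "Left ring",
    2: "Left middle",
    3: "Left index",
    4: "Right index",
    5: "Right middle",
    6: "Right ring",
}


class UtfvpStem(NamedTuple):
    """Fields of one file stem (hypothetical helper type)."""

    subject: int
    finger: int
    trial: int
    timestamp: str


def parse_utfvp_stem(stem: str) -> UtfvpStem:
    """Split ``<subject-id>/<subject-id>_<finger-name>_<trial>_<date>-<hour>``."""
    directory, _, basename = stem.partition("/")
    subject, finger, trial, timestamp = basename.split("_")
    if directory != subject:
        raise ValueError(f"directory {directory!r} != subject {subject!r}")
    parsed = UtfvpStem(int(subject), int(finger), int(trial), timestamp)
    if not 1 <= parsed.subject <= 60:
        raise ValueError(f"subject out of range: {parsed.subject}")
    if parsed.finger not in FINGER_NAMES:
        raise ValueError(f"unknown finger code: {parsed.finger}")
    if not 1 <= parsed.trial <= 4:
        raise ValueError(f"trial out of range: {parsed.trial}")
    return parsed


def nom_group(subject: int) -> str:
    """Client ranges of the "nom" protocols, as given in the docstring."""
    if 1 <= subject <= 10:
        return "train"
    if 11 <= subject <= 28:
        return "dev"
    if 29 <= subject <= 60:
        return "eval"
    raise ValueError(f"subject out of range: {subject}")


def nom_role(trial: int) -> str:
    """Trials 1-2 (first session) enroll; trials 3-4 (second session) probe."""
    if trial in (1, 2):
        return "enroll"
    if trial in (3, 4):
        return "probe"
    raise ValueError(f"trial out of range: {trial}")


def in_1vsall_training(subject: int, finger: int) -> bool:
    """Assumed rule: subject s contributes finger ((s - 1) % 6) + 1, for s <= 35."""
    return 1 <= subject <= 35 and finger == (subject - 1) % 6 + 1


sample = parse_utfvp_stem("0003/0003_5_2_120509-141536")
print(FINGER_NAMES[sample.finger], "/", nom_group(sample.subject))
```

Under the assumed "1vsall" rule, 35 (subject, finger) pairs are selected, and 35 fingers with 4 images each give the 140 training images stated in the docstring.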