Coverage for src/bob/bio/vein/database/utfvp.py: 100%

#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# Victor <vbros@idiap.ch>

"""
  UTFVP database implementation
"""

from clapper.rc import UserDefaults
from sklearn.pipeline import make_pipeline

import bob.io.base

from bob.bio.base.database import CSVDatabase, FileSampleLoader
from bob.bio.vein.database.roi_annotation import ROIAnnotation

rc = UserDefaults("bobrc.toml")


class UtfvpDatabase(CSVDatabase):
    """
    The University of Twente Finger Vascular Pattern dataset

    .. warning::

      To use this dataset protocol, you need the original files of the UTFVP dataset.
      Once you have downloaded them, run the following command to set the path for Bob:

      .. code-block:: sh

          bob config set bob.bio.vein.utfvp.directory [DATABASE PATH]

    The fingervein image database consists of 1440 images taken in 2 distinct
    sessions on two days (May 9th, 2012 and May 23rd, 2012) using a custom-built
    fingervein sensor. In each session, each of the 60 subjects in the dataset was
    asked to present 6 fingers to the sensor twice, making up separate tries. The
    six fingers are the left and right ring, middle and index fingers. Therefore,
    the database contains 60x6 = 360 unique fingers, each captured 4 times.

    Files in the database have a strict naming convention and are organized in
    directories following their subject identifier like so:
    ``0003/0003_5_2_120509-141536``. The fields can be interpreted as
    ``<subject-id>/<subject-id>_<finger-name>_<trial>_<date>-<hour>``. The subject
    identifier is written as a 4-digit number with leading zeroes, varying from
    0001 to 0060. The finger name is one of the following:

    * **1**: Left ring
    * **2**: Left middle
    * **3**: Left index
    * **4**: Right index
    * **5**: Right middle
    * **6**: Right ring

    The trial identifiers vary between 1 and 4. The first two tries were
    captured during the first session and the last two during the second session.
    Given the difference in the images between trials on the same day, we assume
    users were asked to remove the finger from the device and re-position it
    afterwards.

    **Annotations**

    We provide **hand-made** region-of-interest (RoI) annotations for all images in
    this dataset. The annotations mark where the finger is in the image,
    excluding the background. The annotation file is a text file with one
    annotation per line in the format ``(y, x)``, respecting Bob's image encoding
    convention. Connecting these points into a polygon forms the RoI.

    .. warning::

      To use the annotations, you need to provide the RoI files.
      Once you have downloaded them, run the following command to set the path for Bob:

      .. code-block:: sh

          bob config set bob.bio.vein.utfvp.roi [ANNOTATION PATH]

    **Protocols**

    There are 15 protocols implemented in this package:

    * 1vsall
    * nom
    * nomLeftRing
    * nomLeftMiddle
    * nomLeftIndex
    * nomRightIndex
    * nomRightMiddle
    * nomRightRing
    * full
    * fullLeftRing
    * fullLeftMiddle
    * fullLeftIndex
    * fullRightIndex
    * fullRightMiddle
    * fullRightRing

    **"nom" Protocols**

    "nom" means "normal operation mode". In this set of protocols, images from
    different clients are separated into different sets that can be used for system
    training, validation and evaluation:

    * Fingers from clients in the range [1, 10] are used in the training set
    * Fingers from clients in the range [11, 28] are used in the development (or validation) set
    * Fingers from clients in the range [29, 60] are used in the evaluation (or test) set

    Data from the first session (both trials) can be used for enrolling the finger,
    while data from the last session (both trials) should be used exclusively for
    probing the finger. As set up by this database interface, each sample is
    returned as a separate enrollment model. If a single score per finger is
    required, the user must manipulate the final score listings and fuse results
    themselves.

    Matching happens exhaustively between all probes and models. Variants such as
    "nomLeftRing" contain the data filtered by finger name as per the listing
    above. For example, "Left Ring" means all files named
    ``*/*_1_*_*-*.png``. Therefore, the corresponding protocol contains only 1/6 of
    the files of its complete ``nom`` version.

    **"full" Protocols**

    "full" protocols are meant to match current practices in fingervein reporting,
    in which most published material doesn't use a separate evaluation set. All data
    is placed in the development (or validation) set. In these protocols, all
    images are used both for enrolling and for probing fingers. It is, of course,
    a biased setup. Matching happens exhaustively between all samples in the
    development set.

    Variants such as "fullLeftRing" contain the data filtered by finger name as
    per the listing above. For example, "Left Ring" means all files named
    ``*/*_1_*_*-*.png``. Therefore, the corresponding protocol contains only 1/6
    of the files of its complete ``full`` version.

    **"1vsall" Protocol**

    The "1vsall" protocol is meant as a cross-validation protocol. All data from
    the dataset is split into training and development (or validation) sets. No
    samples are allocated for a separate evaluation (or test) set. The training set
    is composed of all samples of fingers ``0001_1`` (left ring finger of subject 1),
    ``0002_2`` (left middle finger of subject 2), ``0003_3`` (left index finger of
    subject 3), ``0004_4`` (right index finger of subject 4), ``0005_5`` (right
    middle finger of subject 5), ``0006_6`` (right ring finger of subject 6),
    ``0007_1`` (left ring finger of subject 7), ``0008_2`` (left middle finger of
    subject 8) and so on, until subject 35 (inclusive). There are 140 images in
    total in this set (35 fingers with 4 images each).

    All other 1300 samples from the dataset are used as a development (or
    validation) set. Each sample generates a single model and is used as a probe
    for all other models. Matching happens exhaustively, except that a model is
    never matched against the image that generated it. So, there are 1299 probes
    per model.
    """

    name = "utfvp"
    category = "vein"
    dataset_protocols_name = "utfvp.tar.gz"
    dataset_protocols_urls = [
        "https://www.idiap.ch/software/bob/databases/latest/vein/utfvp-fe51ba85.tar.gz",
        "http://www.idiap.ch/software/bob/databases/latest/vein/utfvp-fe51ba85.tar.gz",
    ]
    dataset_protocols_hash = "fe51ba85"

    def __init__(self, protocol):
        super().__init__(
            name=self.name,
            protocol=protocol,
            transformer=make_pipeline(
                FileSampleLoader(
                    data_loader=bob.io.base.load,
                    dataset_original_directory=rc.get(
                        "bob.bio.vein.utfvp.directory", ""
                    ),
                    extension="",
                ),
                ROIAnnotation(roi_path=rc.get("bob.bio.vein.utfvp.roi", "")),
            ),
            score_all_vs_all=True,
        )

183 )