Python API

This section includes information for using the pure Python API of bob.ip.base.
Classes

bob.ip.base.GeomNorm
  Objects of this class, after configuration, can perform a geometric normalization of images
bob.ip.base.FaceEyesNorm
  Objects of this class, after configuration, can perform a geometric normalization of facial images based on their eye positions
bob.ip.base.LBP
  A class that extracts local binary patterns in various types
bob.ip.base.LBPTop
  A class that extracts local binary patterns (LBP) in three orthogonal planes
bob.ip.base.DCTFeatures
  Objects of this class, after configuration, can extract DCT features
bob.ip.base.TanTriggs
  Objects of this class, after configuration, can preprocess images …
bob.ip.base.Gaussian
  Objects of this class, after configuration, can perform Gaussian filtering (smoothing) on images
bob.ip.base.Wiener
  A Wiener filter
bob.ip.base.MultiscaleRetinex
  This class allows, after configuration, to apply the Multiscale Retinex algorithm
bob.ip.base.WeightedGaussian
  This class performs weighted Gaussian smoothing (anisotropic filtering)
bob.ip.base.SelfQuotientImage
  This class allows, after configuration, to apply the Self Quotient Image algorithm
bob.ip.base.GaussianScaleSpace
  This class allows, after configuration, the generation of Gaussian pyramids that can be used to extract SIFT features
bob.ip.base.GSSKeypoint
  Structure to describe a keypoint on the bob.ip.base.GaussianScaleSpace
bob.ip.base.GSSKeypointInfo
  This is a companion structure to the bob.ip.base.GSSKeypoint
bob.ip.base.SIFT
  This class allows, after configuration, the extraction of SIFT features
bob.ip.base.VLSIFT
  Computes SIFT features using the VLFeat library
bob.ip.base.VLDSIFT
  Computes dense SIFT features using the VLFeat library
bob.ip.base.GradientMagnitude
  Gradient 'magnitude' used …
bob.ip.base.BlockNorm
  Enumeration that defines the norm that is used for normalizing the descriptor blocks
bob.ip.base.HOG
  Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors
bob.ip.base.GLCMProperty
  Enumeration that defines the properties of the GLCM, to be used in bob.ip.base.GLCM.properties_by_name()
Functions

Detailed Information
class bob.ip.base.BlockNorm
  Bases: object

  Enumeration that defines the norm that is used for normalizing the descriptor blocks.

  Possible values are:
  - L2: Euclidean norm
  - L2Hys: L2 norm with clipping of high values
  - L1: L1 norm (Manhattan distance)
  - L1sqrt: square root of the L1 norm
  - Nonorm: no norm used

  Class Members:
    L1 = 2
    L1sqrt = 3
    L2 = 0
    L2Hys = 1
    Nonorm = 4
    entries = {'Nonorm': 4, 'L2Hys': 1, 'L2': 0, 'L1sqrt': 3, 'L1': 2}
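The five options can be illustrated with a small numpy sketch. The `block_norm` helper below is hypothetical (it is not part of bob.ip.base), and the `eps` and `clip` values are assumptions of this sketch, not values taken from the library:

```python
import numpy as np

def block_norm(block, norm="L2", eps=1e-10, clip=0.2):
    """Apply one of the five BlockNorm options to a descriptor block."""
    v = np.asarray(block, dtype=float)
    if norm == "L2":       # Euclidean norm
        return v / (np.linalg.norm(v) + eps)
    if norm == "L2Hys":    # L2 norm, clip high values, then re-normalize
        v = v / (np.linalg.norm(v) + eps)
        v = np.minimum(v, clip)
        return v / (np.linalg.norm(v) + eps)
    if norm == "L1":       # L1 norm (Manhattan distance)
        return v / (np.abs(v).sum() + eps)
    if norm == "L1sqrt":   # square root of the L1-normalized block
        return np.sqrt(v / (np.abs(v).sum() + eps))
    return v               # Nonorm: no normalization applied

descriptor = np.array([3.0, 4.0])
print(block_norm(descriptor, "L2"))   # approximately [0.6, 0.8]
```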
class bob.ip.base.DCTFeatures
  Bases: object

  Objects of this class, after configuration, can extract DCT features.

  The DCT feature extraction is described in more detail in [Sanderson2002]. This class also supports block normalization and DCT coefficient normalization.

  Constructor Documentation:
    bob.ip.base.DCTFeatures (coefficients, block_size, [block_overlap], [normalize_block], [normalize_dct], [square_pattern])
    bob.ip.base.DCTFeatures (dct_features)

  Constructs a new DCT features extractor.

  Todo: Explain the DCTFeatures constructor in more detail.

  Parameters:
    coefficients : int
      The number of DCT coefficients.
      Note: the real number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled by setting normalize_block=True (as the first coefficient is always 0 in this case)
    block_size : (int, int)
      The size of the blocks, in which the image is decomposed
    block_overlap : (int, int) [default: (0, 0)]
      The overlap of the blocks
    normalize_block : bool [default: False]
      Normalize each block to zero mean and unit variance before extracting DCT coefficients? In this case, the first coefficient will always be zero and hence will not be returned
    normalize_dct : bool [default: False]
      Normalize DCT coefficients to zero mean and unit variance after the DCT extraction?
    square_pattern : bool [default: False]
      Select whether a zigzag pattern or a square pattern is used for the DCT extraction; for a square pattern, the number of DCT coefficients must be a square integer
    dct_features : bob.ip.base.DCTFeatures
      The DCTFeatures object to use for copy-construction

  Class Members:
  block_overlap : (int, int)
    The block overlap in both vertical and horizontal direction of the Multi-Block-DCTFeatures extractor, with read and write access.
    Note: the block_overlap must be smaller than the block_size.

  block_size : (int, int)
    The size of each block for the block decomposition, with read and write access

  coefficients : int
    The number of DCT coefficients, with read and write access.
    Note: the real number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled (as the first coefficient is always 0 in this case)
  extract(input, [flat]) → output
  extract(input, output) → None

    Extracts DCT features from uint8, uint16 or double arrays.

    The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D or 3D array of type float64, allocated with the correct dimensions (see output_shape()). If the destination array is not given (first version), it is generated in the required size. The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.

    Note: the __call__() function is an alias for this method.

    Parameters:
      input : array_like (2D)
        The input image for which DCT features should be extracted
      flat : bool [default: True]
        Decides whether a 2D (flat=True) or 3D (flat=False) output shape is generated
      output : array_like (2D, float)
        The output array, which needs to be of shape output_shape()

    Returns:
      output : array_like (2D, float)
        The resulting DCT features
  normalization_epsilon : float
    The epsilon value to avoid division-by-zero when performing block or DCT coefficient normalization (read and write access).
    The default value for this epsilon is 10 * sys.float_info.min, and usually there is little necessity to change it.

  normalize_block : bool
    Normalize each block to zero mean and unit variance before extracting DCT coefficients (read and write access).
    Note: in case normalize_block is set to True, the first coefficient will always be zero and, hence, will not be returned.

  normalize_dct : bool
    Normalize DCT coefficients to zero mean and unit variance after the DCT extraction (read and write access)

  output_shape(input, [flat]) → dct_shape
  output_shape(shape, [flat]) → dct_shape

    Returns the shape of the DCT output for the given input.

    The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.

    Parameters:
      input : array_like (2D)
        The input image for which DCT features should be extracted
      shape : (int, int)
        The shape of the input image for which DCT features should be extracted
      flat : bool [default: True]
        Decides whether a 2D (flat=True) or 3D (flat=False) output shape is generated

    Returns:
      dct_shape : (int, int) or (int, int, int)
        The shape of the DCT features array that is required in a call to extract()

  square_pattern : bool
    Tells whether a zigzag pattern or a square pattern is used for the DCT extraction (read and write access).
    Note: for a square pattern, the number of DCT coefficients must be a square integer.
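The block-DCT extraction described above can be sketched in numpy. This is an illustration only, assuming non-overlapping square blocks, the default zigzag pattern, and no block or coefficient normalization; the actual bob.ip.base.DCTFeatures implementation may differ in such details:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (d @ d.T is the identity)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0] /= np.sqrt(2.0)
    return d

def zigzag(n):
    # (y, x) block positions ordered along anti-diagonals, JPEG-style
    return sorted(((y, x) for y in range(n) for x in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def dct_features(image, coefficients, block_size):
    # Non-overlapping square blocks (block_overlap = (0, 0)),
    # zigzag coefficient selection (square_pattern = False)
    n = block_size
    d = dct_matrix(n)
    order = zigzag(n)[:coefficients]
    features = []
    for y in range(0, image.shape[0] - n + 1, n):
        for x in range(0, image.shape[1] - n + 1, n):
            block = image[y:y + n, x:x + n].astype(float)
            c = d @ block @ d.T          # separable 2D DCT-II
            features.append([c[i, j] for i, j in order])
    return np.array(features)            # flat=True shape: (block_index, coefficients)

# 8x8 constant image, 4x4 blocks: only the DC coefficient is non-zero
feats = dct_features(np.full((8, 8), 10.0), coefficients=3, block_size=4)
```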
class bob.ip.base.FaceEyesNorm
  Bases: object

  Objects of this class, after configuration, can perform a geometric normalization of facial images based on their eye positions.

  The geometric normalization is a combination of rotating, scaling and cropping an image. The underlying implementation relies on a bob.ip.base.GeomNorm object to perform the actual geometric normalization.

  Constructor Documentation:
    bob.ip.base.FaceEyesNorm (crop_size, eyes_distance, eyes_center)
    bob.ip.base.FaceEyesNorm (crop_size, right_eye, left_eye)
    bob.ip.base.FaceEyesNorm (other)

  Constructs a FaceEyesNorm object.

  Basically, there exist two ways to define a FaceEyesNorm. Both ways require the resulting crop_size. The first constructor takes the inter-eye distance and the center between the eyes, which will be used as the transformation center. The second version takes the image resolution and two arbitrary positions in the face, with which the image will be aligned. Usually these positions are the eyes, but any other pair (like mouth and eye for profile faces) can be specified.

  Parameters:
    crop_size : (int, int)
      The resolution of the normalized face
    eyes_distance : float
      The inter-eye distance in the normalized face
    eyes_center : (float, float)
      The center point between the eyes in the normalized face
    right_eye : (float, float)
      The location of the right eye (or another fixed point) in the normalized image
    left_eye : (float, float)
      The location of the left eye (or another fixed point) in the normalized image
    other : FaceEyesNorm
      Another FaceEyesNorm object to copy

  Class Members:
  crop_offset : (float, float)
    The transformation center in the processed image, which is usually the center between the eyes; with read and write access

  crop_size : (int, int)
    The size of the normalized image, with read and write access

  extract(input, right_eye, left_eye) → output
  extract(input, output, right_eye, left_eye) → None
  extract(input, input_mask, output, output_mask, right_eye, left_eye) → None

    Extracts and normalizes the facial image.

    This function extracts the facial image based on the eye locations (or the locations of other fixed points, see note below). The geometric normalization is applied such that the eyes are placed at fixed positions in the normalized image. The image is cropped at the same time, so that no unnecessary operations are executed.

    Note: instead of the eyes, any two fixed positions can be used to normalize the face. This can simply be achieved by selecting two other landmarks in the constructor (see FaceEyesNorm) and in this function. Just make sure that 'right' and 'left' refer to the same landmarks in both functions.

    Note: the __call__() function is an alias for this method.

    Parameters:
      input : array_like (2D)
        The input image to which FaceEyesNorm should be applied
      output : array_like (2D, float)
        The output image, which must be of size crop_size
      right_eye : (float, float)
        The position of the right eye (or another landmark) in input image coordinates
      left_eye : (float, float)
        The position of the left eye (or another landmark) in input image coordinates
      input_mask : array_like (2D, bool)
        An input mask of valid pixels before geometric normalization, must be of same size as input
      output_mask : array_like (2D, bool)
        The output mask of valid pixels after geometric normalization, must be of same size as output

    Returns:
      output : array_like (2D, float)
        The resulting normalized face image, which is of size crop_size
  eyes_angle : float
    The angle between the eyes in the normalized image (relative to the horizontal line), with read and write access

  eyes_distance : float
    The distance between the eyes in the normalized image, with read and write access

  geom_norm : bob.ip.base.GeomNorm
    The geometric normalization class that was used to compute the last normalization, read access only

  last_angle : float
    The rotation angle that was applied on the latest normalized image, read access only

  last_offset : (float, float)
    The original transformation offset (eye center) in the normalization process, read access only

  last_scale : float
    The scale that was applied on the latest normalized image, read access only
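The geometry behind this normalization can be sketched with a small hypothetical helper (not part of the bob.ip.base API): from the two detected eye positions it derives the rotation angle, scaling factor and transformation center that a GeomNorm-style transform would use. Positions are taken as (y, x), and the sign conventions here are assumptions of this sketch:

```python
import math

def eye_alignment(right_eye, left_eye, eyes_distance):
    # right_eye, left_eye: detected (y, x) positions in the input image;
    # eyes_distance: desired inter-eye distance in the normalized image
    ry, rx = right_eye
    ly, lx = left_eye
    angle = math.degrees(math.atan2(ly - ry, lx - rx))    # rotation that levels the eye line
    scale = eyes_distance / math.hypot(ly - ry, lx - rx)  # scaling to the target distance
    center = ((ry + ly) / 2.0, (rx + lx) / 2.0)           # transformation center (eye center)
    return angle, scale, center

# Eyes level and 80 pixels apart, target inter-eye distance of 40 pixels:
angle, scale, center = eye_alignment((100, 60), (100, 140), 40.0)
# angle 0.0, scale 0.5, center (100.0, 100.0)
```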
class bob.ip.base.GLCM(*args, **kwargs)[source]
  Bases: bob.ip.base.GLCM

  Objects of this class, after configuration, can compute the Grey-Level Co-occurrence Matrix of an image.

  This class allows to extract a Grey-Level Co-occurrence Matrix (GLCM) [Haralick1973]. A thorough tutorial about the GLCM and the textural (so-called Haralick) properties that can be derived from it can be found at http://www.fp.ucalgary.ca/mhallbey/tutorial.htm. A MATLAB implementation can be found at http://www.mathworks.ch/ch/help/images/ref/graycomatrix.html.

  Constructor Documentation:
    bob.ip.base.GLCM ([levels], [min_level], [max_level], [dtype])
    bob.ip.base.GLCM (quantization_table)
    bob.ip.base.GLCM (glcm)

  Constructor.

  Todo: The parameter(s) 'levels, max_level, min_level, quantization_table' are used, but not documented.

  Parameters:
    dtype : numpy.dtype [default: numpy.uint8]
      The data type for the GLCM class
    glcm : bob.ip.base.GLCM
      The GLCM object to use for copy-construction

  Class Members:
  angular_second_moment(input) → property
    Computes the angular_second_moment property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'angular_second_moment' property

  auto_correlation(input) → property
    Computes the auto_correlation property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'auto_correlation' property

  cluster_prominence(input) → property
    Computes the cluster_prominence property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'cluster_prominence' property

  cluster_shade(input) → property
    Computes the cluster_shade property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'cluster_shade' property

  contrast(input) → property
    Computes the contrast property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'contrast' property

  correlation(input) → property
    Computes the correlation property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'correlation' property

  correlation_matlab(input) → property
    Computes the correlation_matlab property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'correlation_matlab' property

  difference_entropy(input) → property
    Computes the difference_entropy property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'difference_entropy' property

  difference_variance(input) → property
    Computes the difference_variance property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'difference_variance' property

  dissimilarity(input) → property
    Computes the dissimilarity property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'dissimilarity' property

  dtype : numpy.dtype
    The data type which was used in the constructor. Only images of this data type can be processed in the extract() function.

  energy(input) → property
    Computes the energy property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'energy' property

  entropy(input) → property
    Computes the entropy property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'entropy' property
  extract(input, [output]) → output
    Extracts the GLCM matrix from the given input image.
    If given, the output array should have the expected type (numpy.float64) and the size defined by output_shape().
    Note: the __call__() function is an alias for this method.
    Parameters:
      input : array_like (2D)
        The input image to extract GLCM features from
      output : array_like (3D, float) [default: None]
        If given, the output will be saved into this array; must be of the shape as output_shape()
    Returns:
      output : array_like (3D, float)
        The resulting output data, which is the same as the parameter output (if given)
  homogeneity(input) → property
    Computes the homogeneity property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'homogeneity' property

  information_measure_of_correlation_1(input) → property
    Computes the information_measure_of_correlation_1 property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'information_measure_of_correlation_1' property

  information_measure_of_correlation_2(input) → property
    Computes the information_measure_of_correlation_2 property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'information_measure_of_correlation_2' property

  inverse_difference(input) → property
    Computes the inverse_difference property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'inverse_difference' property

  inverse_difference_moment(input) → property
    Computes the inverse_difference_moment property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'inverse_difference_moment' property

  inverse_difference_moment_normalized(input) → property
    Computes the inverse_difference_moment_normalized property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'inverse_difference_moment_normalized' property

  inverse_difference_normalized(input) → property
    Computes the inverse_difference_normalized property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'inverse_difference_normalized' property

  levels : int
    Specifies the number of gray levels to use when scaling the gray values in the input image. This is the number of values in the first and second dimension of the GLCM matrix. The default is the total number of gray values permitted by the type of the input image.

  max_level : int
    Gray values greater than or equal to this value are scaled to levels. The default is the maximum gray level permitted by the type of the input image.

  maximum_probability(input) → property
    Computes the maximum_probability property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'maximum_probability' property

  min_level : int
    Gray values smaller than or equal to this value are scaled to 0. The default is the minimum gray level permitted by the type of the input image.
  normalized : bool
    Tells whether the GLCM matrix is normalized, i.e., whether each entry is divided by the total number of co-occurrences (read and write access)
  offset : array_like (2D, int)
    The offsets specifying the column and row distance between pixel pairs. The shape of this array is (num_offsets, 2), where num_offsets is the total number of offsets to be taken into account when computing the GLCM.

  output_shape() → shape
    Gets the shape of the GLCM matrix given the input image.
    The shape has 3 dimensions: two for the number of gray levels, and one for the number of offsets.
    Returns: shape : (int, int, int), the shape of the output array required to call extract()

  properties_by_name(glcm_matrix, prop_names) → prop_values
    Queries the properties of the GLCM by name.
    Returns a list of numpy arrays with the queried properties. Please see the documentation of bob.ip.base.GLCMProperty for details on the possible properties.
    Parameters:
      glcm_matrix : array_like (3D, float)
        The result of the GLCM extraction
      prop_names : [bob.ip.base.GLCMProperty] [default: None]
        A list of GLCM properties; either by value (int) or by name (str)
    Returns:
      prop_values : [array_like (1D, float)]
        The GLCM properties for the given prop_names

  quantization_table : array_like (1D)
    The thresholds of the quantization. Each element corresponds to the lower boundary of the particular quantization level. E.g., array([0, 5, 10]) means quantization into 3 levels: input values in the range [0, 4] will be quantized to level 0, input values in the range [5, 9] to level 1, and input values in the range [10, max_level] to level 2.
  sum_average(input) → property
    Computes the sum_average property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'sum_average' property

  sum_entropy(input) → property
    Computes the sum_entropy property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'sum_entropy' property

  sum_variance(input) → property
    Computes the sum_variance property.
    Parameters: input : array_like (3D, float), the result of the extract() function
    Returns: property : array_like (1D, float), the resulting 'sum_variance' property
  symmetric : bool
    Tells whether the GLCM matrix is computed symmetrically, i.e., whether each pixel pair is counted in both orders (read and write access)
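The co-occurrence counting that extract() performs can be sketched in numpy for a single offset. This is an illustration only: the offset here is taken as a (row, column) distance, and bob.ip.base.GLCM additionally supports quantization tables and multiple offsets at once, so treat the conventions below as assumptions of the sketch:

```python
import numpy as np

def glcm(image, offset=(0, 1), levels=None, symmetric=False, normalized=False):
    """Count co-occurrences of gray levels for one (row, column) offset."""
    img = np.asarray(image)
    if levels is None:
        levels = int(img.max()) + 1
    dy, dx = offset
    h, w = img.shape
    m = np.zeros((levels, levels))
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:      # count pairs that stay inside the image
                m[img[y, x], img[yy, xx]] += 1
    if symmetric:                                 # count each pair in both orders
        m = m + m.T
    if normalized:                                # entries become joint probabilities
        m = m / m.sum()
    return m

img = np.array([[0, 0, 1],
                [0, 1, 1],
                [1, 1, 2]])
m = glcm(img, offset=(0, 1), levels=3)   # horizontal neighbor pairs
```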
class bob.ip.base.GLCMProperty
  Bases: object

  Enumeration that defines the properties of the GLCM, to be used in bob.ip.base.GLCM.properties_by_name().

  Possible values are:
  - 'angular_second_moment' [1] / energy [6]
  - 'energy' [4]
  - 'variance' (sum of squares) [1]
  - 'contrast' [1], [6]
  - 'auto_correlation' [2]
  - 'correlation' [1]
  - 'correlation_matlab' as in the MATLAB Image Processing Toolbox method graycoprops() [6]
  - 'inverse_difference_moment' [1] = homogeneity [2], homop [5]
  - 'sum_average' [1]
  - 'sum_variance' [1]
  - 'sum_entropy' [1]
  - 'entropy' [1]
  - 'difference_variance' [4]
  - 'difference_entropy' [1]
  - 'dissimilarity' [4]
  - 'homogeneity' [6]
  - 'cluster_prominence' [2]
  - 'cluster_shade' [2]
  - 'maximum_probability' [2]
  - 'information_measure_of_correlation_1' [1]
  - 'information_measure_of_correlation_2' [1]
  - 'inverse_difference' (INV) is homom [3]
  - 'inverse_difference_normalized' (INN) [3]
  - 'inverse_difference_moment_normalized' [3]

  The references from above are as follows:

  [1] R. M. Haralick, K. Shanmugam and I. Dinstein, "Textural Features for Image Classification", IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-3, no. 6, pp. 610-621.
  [2] L. Soh and C. Tsatsoulis, "Texture Analysis of SAR Sea Ice Imagery Using Gray Level Co-Occurrence Matrices", IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 2, March 1999.
  [3] D. A. Clausi, "An analysis of co-occurrence texture statistics as a function of grey level quantization", Can. J. Remote Sensing, vol. 28, no. 1, pp. 45-62, 2002.
  [4] http://murphylab.web.cmu.edu/publications/boland/boland_node26.html
  [5] http://www.mathworks.com/matlabcentral/fileexchange/22354-glcmfeatures4-m-vectorized-version-of-glcmfeatures1-m-with-code-changes
  [6] http://www.mathworks.ch/ch/help/images/ref/graycoprops.html

  Class Members:
    angular_second_moment = 0
    auto_correlation = 4
    cluster_prominence = 16
    cluster_shade = 17
    contrast = 3
    correlation = 5
    correlation_matlab = 6
    difference_entropy = 13
    difference_variance = 12
    dissimilarity = 14
    energy = 1
    entries = {'cluster_prominence': 16, 'energy': 1, 'homogeneity': 15, 'entropy': 11, 'difference_variance': 12, 'inverse_difference_normalized': 22, 'inverse_difference_moment': 7, 'sum_entropy': 10, 'angular_second_moment': 0, 'difference_entropy': 13, 'correlation_matlab': 6, 'sum_variance': 9, 'contrast': 3, 'cluster_shade': 17, 'auto_correlation': 4, 'maximum_probability': 18, 'inverse_difference_moment_normalized': 23, 'information_measure_of_correlation_1': 19, 'dissimilarity': 14, 'sum_average': 8, 'correlation': 5, 'inverse_difference': 21, 'variance': 2, 'information_measure_of_correlation_2': 20}
    entropy = 11
    homogeneity = 15
    information_measure_of_correlation_1 = 19
    information_measure_of_correlation_2 = 20
    inverse_difference = 21
    inverse_difference_moment = 7
    inverse_difference_moment_normalized = 23
    inverse_difference_normalized = 22
    maximum_probability = 18
    sum_average = 8
    sum_entropy = 10
    sum_variance = 9
    variance = 2
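A few of the listed properties can be sketched in numpy, using the standard Haralick-style formulas on a normalized GLCM (illustrative only; the `haralick` helper is hypothetical, the exact formulas used by bob.ip.base may differ in details such as logarithm base):

```python
import numpy as np

def haralick(p):
    """Compute a few Haralick properties from a single GLCM slice."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                       # work on the normalized matrix
    i, j = np.indices(p.shape)            # gray-level index grids
    asm = (p ** 2).sum()                  # angular second moment [1]
    return {
        "angular_second_moment": asm,
        "energy": np.sqrt(asm),           # energy as sqrt(ASM), following [6]
        "contrast": ((i - j) ** 2 * p).sum(),
        # inverse difference moment [1], also called homogeneity in [2]:
        "inverse_difference_moment": (p / (1.0 + (i - j) ** 2)).sum(),
        "entropy": -(p[p > 0] * np.log(p[p > 0])).sum(),
        "maximum_probability": p.max(),
    }

props = haralick(np.full((2, 2), 0.25))   # uniform 2x2 joint distribution
```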
class bob.ip.base.GSSKeypoint
  Bases: object

  Structure to describe a keypoint on the bob.ip.base.GaussianScaleSpace.
  It consists of a scale sigma, a location (y, x) and an orientation.

  Constructor Documentation:
    bob.ip.base.GSSKeypoint (sigma, location, [orientation])

  Creates a GSS keypoint.

  Parameters:
    sigma : float
      The floating point value describing the scale of the keypoint
    location : (float, float)
      The location of the keypoint
    orientation : float [default: 0]
      The orientation of the keypoint (in degrees)

  Class Members:
    location : (float, float)
      The location (y, x) of the keypoint, with read and write access
    orientation : float
      The orientation of the keypoint (in degrees), with read and write access
    sigma : float
      The floating point value describing the scale of the keypoint, with read and write access
class
bob.ip.base.
GSSKeypointInfo
¶ Bases:
object
This is a companion structure to the
bob.ip.base.GSSKeypoint
It provides additional and practical information such as the octave and scale indices, the integer location
location = (y,x)
, and eventually the scores associated to the detection step (peak_score
andedge_score
)Constructor Documentation:
bob.ip.base.GSSKeypointInfo ([octave_index], [scale_index], [location], [peak_score], [edge_score])
Creates a GSS keypoint
Parameters:
octave_index
: int[default: 0] The octave index associated with the keypoint in thebob.ip.base.GaussianScaleSpace
objectscale_index
: int[default: 0] The scale index associated with the keypoint in thebob.ip.base.GaussianScaleSpace
objectlocation
: (int, int)[default:(0, 0)
] The integer unnormalized location (y,x) of the keypointpeak_score
: float[default: 0] The orientation of the keypoint (in degrees)edge_score
: float[default: 0] The orientation of the keypoint (in degrees)Class Members:
-
edge_score
¶ float <– The edge score of the keypoint during the SIFT-like detection step, with read and write access
-
location
¶ (int, int) <– The integer unnormalized location (y, x) of the keypoint, with read and write access
-
octave_index
¶ int <– The octave index associated with the keypoint in the
bob.ip.base.GaussianScaleSpace
object, with read and write access
-
peak_score
¶ float <– The peak score of the keypoint during the SIFT-like detection step, with read and write access
-
scale_index
¶ int <– The scale index associated with the keypoint in the
bob.ip.base.GaussianScaleSpace
object, with read and write access
-
-
class
bob.ip.base.
Gaussian
¶ Bases:
object
Objects of this class, after configuration, can perform Gaussian filtering (smoothing) on images
The Gaussian smoothing is done by convolving the image with a vertical and a horizontal smoothing filter.
Constructor Documentation:
- bob.ip.base.Gaussian (sigma, [radius], [border])
- bob.ip.base.Gaussian (gaussian)
Constructs a new Gaussian filter
The Gaussian kernel is generated in both directions independently, using the given standard deviation and the given radius, where the size of the kernels is actually
2*radius+1
. When the radius is not given or negative, it will be automatically computed ad3*sigma
.Note
Since the Gaussian smoothing is done by convolution, a larger radius will lead to longer execution time.
Parameters:
sigma
: (double, double)The standard deviation of the Gaussian along the y- and x-axes in pixelsradius
: (int, int)[default: (-1, -1) ->3*sigma
] The radius of the Gaussian in both directions – the size of the kernel is2*radius+1
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordergaussian
:bob.ip.base.Gaussian
The Gaussian object to use for copy-constructionClass Members:
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border, with read and write access
-
filter
(src[, dst]) → dst¶ Smooths an image (2D/grayscale or 3D/color)
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be smootheddst
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape assrc
Returns:
dst
: array_like (2D, float)The resulting output image, which is the same asdst
(if given)
-
kernel_x
¶ array_like (1D, float) <– The values of the kernel in horizontal direction; read only access
-
kernel_y
¶ array_like (1D, float) <– The values of the kernel in vertical direction; read only access
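The 1D kernels described above (size 2*radius+1, radius defaulting to 3*sigma) can be sketched in numpy for one direction. This is an illustration only; the rounding of the default radius and the normalization are assumptions of this sketch, and the exact values in bob.ip.base.Gaussian's kernel_x/kernel_y may differ:

```python
import numpy as np

def gaussian_kernel(sigma, radius=-1):
    """Build a 1D Gaussian smoothing kernel of size 2*radius+1."""
    if radius < 0:
        radius = int(3 * sigma)            # default radius: 3*sigma (rounding assumed)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()                     # normalize to unit sum

# sigma = 1.0 gives a 7-tap kernel (2*3+1); the 2D smoothing is the
# separable convolution with this kernel along each axis in turn.
k = gaussian_kernel(1.0)
```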
-
class
bob.ip.base.
GaussianScaleSpace
¶ Bases:
object
This class allows after configuration the generation of Gaussian Pyramids that can be used to extract SIFT features
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.GaussianScaleSpace (size, scales, octaves, octave_min, [sigma_n], [sigma0], [kernel_radius_factor], [border])
- bob.ip.base.GaussianScaleSpace (gss)
Constructs a new DCT features extractor
Todo
Explain GaussianScaleSpace constructor in more detail.
Warning
The order of the parameters
scales
andoctaves
has changed compared to the old implementation, in order to keep it consistent withbob.ip.base.VLSIFT
!Parameters:
size
: (int, int)The height and width of the images to processscales
: intThe number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT featuresoctaves
: intThe number of octaves of the pyramidoctave_min
: intThe index of the minimum octavesigma_n
: float[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scalesigma0
: float[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scalekernel_radius_factor
: float[default: 4.] Factor used to determine the kernel radii:size=2*radius+1
. For each Gaussian kernel, the radius is equal toceil(kernel_radius_factor*sigma_{octave,scale})
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordergss
:bob.ip.base.GaussianScaleSpace
The GaussianScaleSpace object to use for copy-constructionClass Members:
-
allocate_output
() → pyramid¶ Allocates a python list of arrays for the Gaussian pyramid
Returns:
pyramid
: [array_like(3D, float)]A list of output arrays in the size required to call :py:func`process`
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border; with read and write access
-
get_gaussian
(index) → gaussian¶ Returns the Gaussian at index/interval/scale i
Parameters:
index
: intThe index of the scale for which the Gaussian should be retrievedReturns:
gaussian
:bob.ip.base.Gaussian
The Gaussian at the given index
-
kernel_radius_factor
¶ float <– Factor used to determine the kernel radii
size=2*radius+1
For each Gaussian kernel, the radius is equal to
ceil(kernel_radius_factor*sigma_{octave,scale})
-
octave_max
¶ int <– The index of the minimum octave, read only access
This is equal to octave_min+n_octaves-1.
-
octave_min
¶ int <– The index of the minimum octave, with read and write access
-
octaves
¶ int <– The number of octaves of the pyramid, with read and write access
-
process
(src[, dst]) → dst¶ Computes a Gaussian Pyramid for an input 2D image
If given, the results are put in the output
dst
, which output should already be allocated and of the correct size (using theallocate_output()
method).Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be processeddst
: [array_like (3D, float)]The Gaussian pyramid that should have been allocated withallocate_output()
Returns:
dst
: [array_like (3D, float)]The resulting Gaussian pyramid, if given it will be the same as thedst
parameter
-
scales
¶ int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting SIFT features
-
set_sigma0_no_init_smoothing
() → None¶ Sets sigma0 such that there is no smoothing at the first scale of octave_min
-
sigma0
¶ float <– The value sigma0 of the standard deviation for the image of the first octave and first scale
-
sigma_n
¶ float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access
-
size
¶ (int, int) <– The shape of the images to process, with read and write access
-
class
bob.ip.base.
GeomNorm
¶ Bases:
object
Objects of this class, after configuration, can perform a geometric normalization of images
The geometric normalization is a combination of rotation, scaling and cropping an image.
Constructor Documentation:
- bob.ip.base.GeomNorm (rotation_angle, scaling_factor, crop_size, crop_offset)
- bob.ip.base.GeomNorm (other)
Constructs a GeomNorm object with the given scale, angle, size of the new image and transformation offset in the new image
When the GeomNorm is applied to an image, the image is rotated and scaled such that the result is visually rotated counter-clockwise (mathematically positive) by the given angle, i.e., to mimic the behavior of ImageMagick. Since the origin of the image is in the top-left corner, the rotation is actually clockwise (mathematically negative). This also applies to positions transformed with the second version of process, which are rotated mathematically negatively as well, to keep them consistent with the image.
Warning
The behavior of the landmark rotation has changed from Bob version 1.x, where the landmarks were mistakenly rotated mathematically positive.
Parameters:
rotation_angle
: floatThe rotation angle in degrees that should be appliedscaling_factor
: floatThe scale factor to applycrop_size
: (int, int)The resolution of the processed imagescrop_offset
: (float, float)The transformation offset in the processed imagesother
:GeomNorm
Another GeomNorm object to copyClass Members:
-
crop_offset
¶ (float, float) <– The transformation center in the processed image, with read and write access
-
crop_size
¶ (int, int) <– The size of the processed image, with read and write access
-
process
()¶ - process(input, output, center) -> None
- process(input, input_mask, output, output_mask, center) -> None
- process(position, center) -> transformed
This function geometrically normalizes an image or a position in the image
The function rotates and scales the given image, or a position in image coordinates, such that the result is visually rotated and scaled with the
rotation_angle
andscaling_factor
.Note
The
__call__()
function is an alias for this method.Parameters:
input
: array_like (2D or 3D)The input image to which GeomNorm should be appliedoutput
: array_like (2D or 3D, float)The output image, which must be of sizecrop_size
center
: (float, float)The transformation center in the given image; this will be placed tocrop_offset
in the output imageinput_mask
: array_like (bool, 2D or 3D)An input mask of valid pixels before geometric normalization, must be of same size asinput
output_mask
: array_like (bool, 2D or 3D)The output mask of valid pixels after geometric normalization, must be of same size asoutput
position
: (float, float)A position in input image space that will be transformed to output image space (might be outside of the crop area)Returns:
transformed
: (float, float)The resulting position in the output image space
-
rotation_angle
¶ float <– The rotation angle, with read and write access
-
scaling_factor
¶ float <– The scale factor, with read and write access
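The point-transform overload of process can be sketched as follows. This is an illustrative reimplementation: the position is shifted to the transformation center, rotated and scaled, and placed at crop_offset in the output image. The exact rotation sign convention in (y, x) array coordinates is an assumption based on the class description above.

```python
import math

def geom_norm_point(position, center, rotation_angle, scaling_factor, crop_offset):
    # Transform a (y, x) position the way process(position, center) would.
    # Sign convention (visually counter-clockwise = mathematically negative
    # in top-left-origin coordinates) is an assumption.
    a = math.radians(rotation_angle)
    dy, dx = position[0] - center[0], position[1] - center[1]
    ry = math.cos(a) * dy - math.sin(a) * dx  # rotate
    rx = math.sin(a) * dy + math.cos(a) * dx
    return (crop_offset[0] + scaling_factor * ry,   # scale and re-center
            crop_offset[1] + scaling_factor * rx)

# identity rotation, scale 2: (3, 4) around center (1, 1) -> offset + (4, 6)
print(geom_norm_point((3, 4), (1, 1), 0.0, 2.0, (10, 10)))  # (14.0, 16.0)
```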
-
class
bob.ip.base.
GradientMagnitude
¶ Bases:
object
Gradient ‘magnitude’ used
Possible values are:
Magnitude
: L2 magnitude over X and YMagnitudeSquare
: Square of the L2 magnitudeSqrtMagnitude
: Square root of the L2 magnitude
Class Members:
-
Magnitude
= 0¶
-
MagnitudeSquare
= 1¶
-
SqrtMagnitude
= 2¶
-
entries
= {'MagnitudeSquare': 1, 'Magnitude': 0, 'SqrtMagnitude': 2}¶
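The three enumeration values correspond to simple functions of the X and Y gradient components, which can be sketched as:

```python
import math

def gradient_magnitude(gx, gy, magnitude_type="Magnitude"):
    # Variants of the L2 magnitude over the X and Y gradient components
    sq = gx * gx + gy * gy          # MagnitudeSquare
    mag = math.sqrt(sq)             # Magnitude (L2)
    if magnitude_type == "Magnitude":
        return mag
    if magnitude_type == "MagnitudeSquare":
        return sq
    if magnitude_type == "SqrtMagnitude":
        return math.sqrt(mag)
    raise ValueError(magnitude_type)

print(gradient_magnitude(3.0, 4.0))                     # 5.0
print(gradient_magnitude(3.0, 4.0, "MagnitudeSquare"))  # 25.0
```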
-
class
bob.ip.base.
HOG
¶ Bases:
object
Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors.
This implementation relies on the article of [Dalal2005]. A few remarks:
- Only single channel inputs (a.k.a. grayscale) are considered. Therefore, it does not take the maximum gradient over several channels as proposed in the above article.
- Gamma/Color normalization is not part of the descriptor computation. However, this can easily be done (using this library) before extracting the descriptors.
- Gradients are computed using the standard 1D centered gradient (except at the borders, where the uncentered gradient [-1 1] is used). This is the method that achieved the best performance reported in the article. To avoid too many uncentered gradients from being used, the gradients are computed on the full image prior to the cell decomposition. This implies that extra pixels at each boundary of a cell contribute to its gradients, although these pixels are not located inside the cell.
- R-HOG blocks (rectangular) normalization is supported, but not C-HOG blocks (circular).
- Due to the similarity with the SIFT descriptors, this can also be used to extract dense-SIFT features.
- The first bin of each histogram is always centered around 0. This
implies that the orientations are in
[0-e,180-e]
rather than [0,180], withe
being half the angle size of a bin (same with [0,360]).
Constructor Documentation:
- bob.ip.base.HOG (image_size, [bins], [full_orientation], [cell_size], [cell_overlap], [block_size], [block_overlap])
- bob.ip.base.HOG (hog)
Constructs a new HOG extractor
Parameters:
image_size
: (int, int)The size of the input image to process.bins
: int[default: 8] Dimensionality of a cell descriptor (i.e. the number of bins)full_orientation
: bool[default:False
] Whether the range[0,360]
is used or only[0,180]
cell_size
: (int, int)[default:(4,4)
] The size of a cell.cell_overlap
: (int, int)[default:(0,0)
] The overlap between cells.block_size
: (int, int)[default:(4,4)
] The size of a block (in terms of cells).block_overlap
: (int, int)[default:(0,0)
] The overlap between blocks (in terms of cells).hog
:bob.ip.base.HOG
Another HOG object to copyClass Members:
-
bins
¶ int <– Dimensionality of a cell descriptor (i.e. the number of bins), with read and write access
-
block_norm
¶ bob.ip.base.BlockNorm
<– The type of norm used for normalizing blocks, with read and write access
-
block_norm_eps
¶ float <– Epsilon value used to avoid division by zeros when normalizing the blocks, read and write access
-
block_norm_threshold
¶ float <– Threshold used to perform the clipping during the block normalization, with read and write access
-
block_overlap
¶ (int, int) <– Overlap between blocks (in terms of cells), with read and write access
-
block_size
¶ (int, int) <– Size of a block (in terms of cells), with read and write access
-
cell_overlap
¶ (int, int) <– Overlap between cells, with read and write access
-
cell_size
¶ (int, int) <– Size of a cell, with read and write access
-
compute_histogram
(magnitude, orientation[, histogram]) → histogram¶ Computes a Histogram of Gradients for a given ‘cell’
The inputs are the gradient magnitudes and the orientations for each pixel of the cell
Parameters:
magnitude
: array_like (2D, float)The input array with the gradient magnitudesorientation
: array_like (2D, float)The input array with the orientationshistogram
: array_like (1D, float)[default = None] If given, the result will be written to this histogram; must be of sizebins
Returns:
histogram
: array_like (1D, float)The resulting histogram; same as inputhistogram
, if given
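The cell histogram computation can be sketched in numpy as below. This is an illustrative reimplementation; the real compute_histogram may differ in binning details (e.g. interpolation between neighboring bins). The half-bin shift reflects the remark above that the first bin is centered around 0.

```python
import numpy as np

def cell_histogram(magnitude, orientation, bins=8, full_orientation=False):
    # Accumulate gradient magnitudes into orientation bins for one cell
    period = 2 * np.pi if full_orientation else np.pi
    width = period / bins
    # first bin centered at 0 -> shift by half a bin before flooring
    idx = np.floor((orientation % period) / width + 0.5).astype(int) % bins
    hist = np.zeros(bins)
    np.add.at(hist, idx.ravel(), magnitude.ravel())
    return hist

mag = np.ones((4, 4))
ori = np.zeros((4, 4))           # all gradients at orientation 0
print(cell_histogram(mag, ori))  # all 16 magnitudes fall into bin 0
```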
-
disable_block_normalization
() → None¶ Disable block normalization
This is performed by setting parameters such that the cells are not further processed, i.e.:
block_size
= (1, 1)
block_overlap
= (0, 0)
block_norm
=
bob.ip.base.BlockNorm.Nonorm
-
extract
(input[, output]) → output¶ Extract the HOG descriptors
This extracts HOG descriptors from the input image. The output is 3D, the first two dimensions being the y- and x- indices of the block, and the last one the index of the bin (among the concatenated cell histograms for this block).
Note
The
__call__()
function is an alias for this method.Parameters:
input
: array_like (2D)The input image to extract HOG features fromoutput
: array_like (3D, float)[default:None
] If given, the container to extract the HOG features to; must be of sizeoutput_shape()
Returns:
output
: array_like (3D, float)The resulting HOG features, same as parameteroutput
, if given
-
full_orientation
¶ bool <– Whether the range [0,360] is used or not ([0,180] otherwise), with read and write access
-
image_size
¶ (int, int) <– The size of the input image to process, with read and write access
-
magnitude_type
¶ bob.ip.base.GradientMagnitude
<– Type of the magnitude to consider for the descriptors, with read and write access
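The output shape of extract follows from the cell and block geometry. The sliding-window count formula below is an assumption (the usual formula for non-degenerate configurations), not taken from this documentation; the authoritative value is whatever output_shape() returns.

```python
def num_windows(length, window, overlap):
    # sliding-window positions along one axis (assumed formula)
    step = window - overlap
    return (length - window) // step + 1

def hog_output_shape(image_size, cell_size=(4, 4), cell_overlap=(0, 0),
                     block_size=(4, 4), block_overlap=(0, 0), bins=8):
    cells = [num_windows(image_size[i], cell_size[i], cell_overlap[i]) for i in (0, 1)]
    blocks = [num_windows(cells[i], block_size[i], block_overlap[i]) for i in (0, 1)]
    # one concatenated histogram per block: bins per cell, cells per block
    return (blocks[0], blocks[1], bins * block_size[0] * block_size[1])

print(hog_output_shape((64, 64)))  # (4, 4, 128) with the default parameters
```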
-
class
bob.ip.base.
LBP
¶ Bases:
object
A class that extracts local binary patterns in various types
The implementation is based on [Atanasoaei2012], where all the different types of LBP features are defined in more detail.
Constructor Documentation:
- bob.ip.base.LBP (neighbors, [radius], [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (neighbors, radius_y, radius_x, [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (neighbors, block_size, [block_overlap], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (lbp)
- bob.ip.base.LBP (hdf5)
Creates an LBP extractor with the given parametrization
Basically, the LBP configuration can be split into three parts.
- Which pixels are compared how:
- The number of neighbors (might be 4, 8 or 16)
- Circular or rectangular offset positions around the center, or even Multi-Block LBP (MB-LBP)
- Compare the pixels to the center pixel or to the average
- How to generate the bit strings from the pixels (this is handled
by the
elbp_type
parameter):'regular'
: Choose one bit for each comparison of the neighboring pixel with the central pixel'transitional'
: Compare only the neighboring pixels and skip the central one'direction-coded'
: Compute a 2-bit code for four directions
- How to cluster the generated bit strings to compute the final
LBP code:
uniform
: Only uniform LBP codes (with less than two bit-changes between 0 and 1) are considered; all other strings are combined into one LBP coderotation_invariant
: Rotation invariant LBP codes are generated, e.g., bit strings00110000
and00000110
will lead to the same LBP code
This clustering is done using a look-up-table, which you can also set yourself using the
look_up_table
attribute. The maximum code that will be generated can be read from themax_label
attribute.Finally, the border handling of the image can be selected. With the
'shrink'
option, no LBP code is computed for the border pixels and the resulting image is 2*radius
or 3*block_size
-1 pixels smaller in both directions, seelbp_shape()
. The'wrap'
option will wrap around the border and no truncation is performed.Note
To compute MB-LBP features, it is possible to compute an integral image beforehand to speed up the calculation.
Parameters:
neighbors
: intThe number of neighboring pixels that should be taken into account; possible values: 4, 8, 16radius
: float[default: 1.] The radius of the LBP in both vertical and horizontal direction togetherradius_y, radius_x
: floatThe radius of the LBP in both vertical and horizontal direction separatelyblock_size
: (int, int)If set, multi-block LBP’s with the given block size will be extractedblock_overlap
: (int, int)[default:(0, 0)
] Multi-block LBP’s with the given block overlap will be extractedcircular
: bool[default:False
] Extract neighbors on a circle or on a square?to_average
: bool[default:False
] Compare the neighbors to the average of the pixels instead of the central pixel?add_average_bit
: bool[default: False] (only useful if to_average is True) Add another bit to compare the central pixel to the average of the pixels?uniform
: bool[default:False
] Extract uniform LBP features?rotation_invariant
: bool[default:False
] Extract rotation invariant LBP features?elbp_type
: str[default:'regular'
] Which type of LBP codes should be computed; possible values: (‘regular’, ‘transitional’, ‘direction-coded’), seeelbp_type
border_handling
: str[default:'shrink'
] How should the borders of the image be treated; possible values: (‘shrink’, ‘wrap’), seeborder_handling
lbp
:bob.ip.base.LBP
Another LBP object to copyhdf5
:bob.io.base.HDF5File
An HDF5 file to read the LBP configuration fromClass Members:
-
add_average_bit
¶ bool <– Should the bit for the comparison of the central pixel with the average be added as well (read and write access)?
-
block_overlap
¶ (int, int) <– The block overlap in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access
Note
The
block_overlap
must be smaller than theblock_size
. To set both the block size and the block overlap at the same time, use theset_block_size_and_overlap()
function.
-
block_size
¶ (int, int) <– The block size in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access
-
border_handling
¶ str <– The type of border handling that should be applied (read and write access)
Possible values are: (‘shrink’, ‘wrap’)
-
circular
¶ bool <– Should circular or rectangular LBP’s be extracted (read and write access)?
-
elbp_type
¶ str <– The type of LBP bit string that should be extracted (read and write access)
Possible values are: (‘regular’, ‘transitional’, ‘direction-coded’)
-
extract
()¶ - extract(input, [is_integral_image]) -> output
- extract(input, position, [is_integral_image]) -> code
- extract(input, output, [is_integral_image]) -> None
This function extracts the LBP features from an image
LBP features can be extracted either for the whole image, or at a single location in the image. When MB-LBP features are extracted, an integral image is computed internally to speed up the calculation. The integral image can also be computed before this function is called and passed to this function directly. In this case, please set the
is_integral_image
parameter toTrue
.Note
The
__call__()
function is an alias for this method.Parameters:
input
: array_like (2D)The input image for which LBP features should be extractedposition
: (int, int)The position in theinput
image, where the LBP code should be extracted; assure that you don’t try to provide positions outside of theoffset
output
: array_like (2D, uint16)The output image that needs to be of shapelbp_shape()
is_integral_image
: bool[default:False
] Is the giveninput
image an integral image?Returns:
output
: array_like (2D, uint16)The resulting image of LBP codescode
: uint16The resulting LBP code at the given position in the image
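The single-position code extraction can be sketched as below for the simplest configuration (8 rectangular neighbors, radius 1, 'regular' comparison, 'shrink' borders). This is an illustrative reimplementation; the starting neighbor and bit order are implementation details that may differ from bob.ip.base.LBP.

```python
import numpy as np

def lbp_code(image, y, x):
    # Regular 8-neighbor LBP code at (y, x): each neighbor >= center
    # contributes one bit, read clockwise from the top-left (assumed order)
    c = image[y, x]
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                 (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dy, dx in neighbors:
        code = (code << 1) | int(image[y + dy, x + dx] >= c)
    return code

img = np.array([[9, 9, 9],
                [0, 5, 0],
                [0, 0, 0]], dtype=np.uint8)
print(lbp_code(img, 1, 1))  # top row set -> bits 11100000 = 224
```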
-
is_multi_block_lbp
¶ bool <– Is the current configuration of the LBP extractor set up to extract Multi-Block LBP’s (read access only)?
-
lbp_shape
()¶ - lbp_shape(input, is_integral_image) -> lbp_shape
- lbp_shape(shape, is_integral_image) -> lbp_shape
This function returns the shape of the LBP image for the given image
In case the
border_handling
is'shrink'
the image resolution will be reduced, depending on the LBP configuration. This function will return the desired output shape for the given input image or input shape.Parameters:
input
: array_like (2D)The input image for which LBP features should be extractedshape
: (int, int)The shape of the input image for which LBP features should be extractedis_integral_image
: bool[default:False
] Is the given image (shape) an integral image?Returns:
lbp_shape
: (int, int)The shape of the LBP image that is required in a call toextract()
-
load
(hdf5) → None¶ Loads the parametrization of the LBP extractor from the given HDF5 file
Parameters:
hdf5
:bob.io.base.HDF5File
An HDF5 file opened for reading
-
look_up_table
¶ array_like (1D, uint16) <– The look up table that defines, which bit string is converted into which LBP code (read and write access)
Depending on the values of
uniform
androtation_invariant
, bit strings might be converted into different LBP codes. Since this attribute is writable, you can define a look-up-table for LBP codes yourself.Warning
For the time being, the look up tables are not saved by the
save()
function!
-
max_label
¶ int <– The number of different LBP codes that are extracted (read access only)
The codes themselves are uint16 numbers in the range
[0, max_label - 1]
. Depending on the values ofuniform
androtation_invariant
, bit strings might be converted into different LBP codes.
-
offset
¶ (int, int) <– The offset in the image, where the first LBP code can be extracted (read access only)
Note
When extracting LBP features from an image with a specific
shape
, positions might be in range[offset, shape - offset[
only. Otherwise, an exception will be raised.
-
points
¶ int <– The number of neighbors (usually 4, 8 or 16), with read and write access
-
radii
¶ (float, float) <– The radii in both vertical and horizontal direction of the elliptical or rectangular LBP extractor, with read and write access
-
radius
¶ float <– The radius of the round or square LBP extractor, with read and write access
-
relative_positions
¶ array_like (2D, float) <– The list of neighbor positions, with which the central pixel is compared (read access only)
The list is defined as relative positions, where the central pixel is considered to be at
(0, 0)
.
-
rotation_invariant
¶ bool <– Should rotation invariant LBP patterns be extracted (read and write access)?
Rotation invariant LBP codes collect all patterns whose bit strings are circular shifts of each other. Hence,
00111000
and10000011
will result in the same LBP code.
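The rotation-invariant clustering can be sketched by mapping each pattern to the minimum over all circular bit shifts, so that shifted versions of the same pattern share one code (the actual look-up-table construction in bob.ip.base may differ):

```python
def rotation_invariant_code(code, bits=8):
    # minimum over all circular shifts of the bit string
    mask = (1 << bits) - 1
    return min(((code >> s) | (code << (bits - s))) & mask for s in range(bits))

# 00111000 and 10000011 are circular shifts of each other
print(rotation_invariant_code(0b00111000) == rotation_invariant_code(0b10000011))
```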
-
save
(hdf5) → None¶ Saves the parametrization of the LBP extractor to the given HDF5 file
Warning
For the time being, the look-up-table is not saved. If you have set the
look_up_table
by hand, it is lost.Parameters:
hdf5
:bob.io.base.HDF5File
An HDF5 file open for writing
-
set_block_size_and_overlap
(block_size, block_overlap) → None¶ This function sets the block size and the block overlap for MB-LBP features at the same time
Parameters:
block_size
: (int, int)Multi-block LBP’s with the given block size will be extractedblock_overlap
: (int, int)Multi-block LBP’s with the given block overlap will be extracted
-
to_average
¶ bool <– Should the neighboring pixels be compared with the average of all pixels, or to the central one (read and write access)?
-
uniform
¶ bool <– Should uniform LBP patterns be extracted (read and write access)?
Uniform LBP patterns are those bit strings, where only up to two changes from 0 to 1 and vice versa are allowed. Hence,
00111000
is a uniform pattern, while00110011
is not. All non-uniform bit strings will be collected in a single LBP code.
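The uniformity criterion described above (at most two 0↔1 transitions in the circular bit string) can be sketched as:

```python
def is_uniform(code, bits=8):
    # uniform = at most two 0<->1 transitions in the circular bit string
    string = [(code >> i) & 1 for i in range(bits)]
    transitions = sum(string[i] != string[(i + 1) % bits] for i in range(bits))
    return transitions <= 2

print(is_uniform(0b00111000))  # True  (two transitions)
print(is_uniform(0b00110011))  # False (four transitions)
```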
-
class
bob.ip.base.
LBPTop
¶ Bases:
object
A class that extracts local binary patterns (LBP) in three orthogonal planes (TOP)
The LBPTop class is designed to calculate the LBP-Top coefficients given a set of images. The workflow is as follows:
Todo
UPDATE as this is not true
- You initialize the class, defining the radius and number of points in each of the three directions: XY, XT, YT for the LBP calculations
- For each image you have in the frame sequence, you push into the class
- An internal FIFO queue (length = radius in T direction) keeps track of the current image and their order. As a new image is pushed in, the oldest on the queue is pushed out.
- After pushing an image, you read the current LBP-Top coefficients and may save it somewhere.
Constructor Documentation:
bob.ip.base.LBPTop (xy, xt, yt)
Constructs a new LBPTop object
For all three directions, the LBP objects need to be specified. The radii for the three LBP classes must be consistent, i.e.,
xy.radii[1] == xt.radii[1]
,xy.radii[0] == yt.radii[1]
andxt.radii[0] == yt.radii[0]
.Warning
The order of the
radius_x
andradius_y
parameters are not(radius_x, radius_y)
in theLBP
constructor, but(radius_y, radius_x)
. Hence, to get anx
radius 2 andy
radius 3, you need to usexy = bob.ip.base.LBP(8, 3, 2)
or more specificallyxy = bob.ip.base.LBP(8, radius_x=2, radius_y=3)
. The same applies forxt
andyt
.Parameters:
xy
:bob.ip.base.LBP
The 2D LBP-XY plane configurationxt
:bob.ip.base.LBP
The 2D LBP-XT plane configurationyt
:bob.ip.base.LBP
The 2D LBP-YT plane configurationClass Members:
-
process
(input, xy, xt, yt) → None¶ This function processes the given set of images and extracts the three orthogonal planes
The given 3D input array represents a set of gray-scale images and returns (by argument) the three LBP planes calculated. The 3D array has to be arranged in this way:
- First dimension: time
- Second dimension: frame height
- Third dimension: frame width
The central pixel is the point where the LBP planes intersect/have to be calculated from.
Parameters:
input
: array_like (3D)The input set of gray-scale images for which LBPTop features should be extractedxy, xt, yt
: array_like (3D, uint16)The result of the LBP operator in the XY, XT and YT plane (frame), for the central frame of the input array
-
xt
¶ bob.ip.base.LBP
<– The 2D LBP-XT plane configuration
-
xy
¶ bob.ip.base.LBP
<– The 2D LBP-XY plane configuration
-
yt
¶ bob.ip.base.LBP
<– The 2D LBP-YT plane configuration
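The radii consistency constraints from the constructor documentation can be checked with plain (radius_a, radius_b) tuples standing in for the bob.ip.base.LBP objects. The mapping of tuple indices to axes (xy = (y, x), xt = (t, x), yt = (t, y)) is inferred from the stated constraints; the radius values are hypothetical.

```python
# hypothetical radii: y=3, x=2, t=1
xy_radii = (3.0, 2.0)   # (radius_y, radius_x)
xt_radii = (1.0, 2.0)   # (radius_t, radius_x), per the inferred axis mapping
yt_radii = (1.0, 3.0)   # (radius_t, radius_y)

assert xy_radii[1] == xt_radii[1]   # same x radius in XY and XT planes
assert xy_radii[0] == yt_radii[1]   # same y radius in XY and YT planes
assert xt_radii[0] == yt_radii[0]   # same t radius in XT and YT planes
print("radii are consistent")
```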
-
class
bob.ip.base.
MultiscaleRetinex
¶ Bases:
object
This class allows after configuration to apply the Multiscale Retinex algorithm to images
More information about this algorithm can be found in [Jobson1997].
Constructor Documentation:
- bob.ip.base.MultiscaleRetinex ([scales], [size_min], [size_step], [sigma], [border])
- bob.ip.base.MultiscaleRetinex (msrx)
Creates a MultiscaleRetinex object
Todo
Add documentation for MultiscaleRetinex
Parameters:
scales
: int[default: 1] The number of scales (bob.ip.base.Gaussian
)size_min
: int[default: 1] The radius of the kernel of the smallestbob.ip.base.Gaussian
size_step
: int[default: 1] The step used to set the kernel size of other weighted Gaussians:size_s = 2 * (size_min + s * size_step) + 1
sigma
: double[default: 2.] The standard deviation of the kernel of the smallest weighted Gaussian; other sigmas:sigma_s = sigma * (size_min + s * size_step) / size_min
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordermsrx
:bob.ip.base.MultiscaleRetinex
The MultiscaleRetinex object to use for copy-constructionClass Members:
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border; with read and write access
-
process
(src[, dst]) → dst¶ Applies the Multiscale Retinex algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double
If given, the
dst
array should have the type float and the same size as thesrc
array.Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be processeddst
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape assrc
Returns:
dst
: array_like (2D, float)The resulting output image, which is the same asdst
(if given)
-
scales
¶ int <– The number of scales (Gaussian); with read and write access
-
sigma
¶ float <– The standard deviation of the kernel of the smallest weighted Gaussian (sigma_s = sigma * (size_min+s*size_step)/size_min); with read and write access
-
size_min
¶ int <– The radius (size=2*radius+1) of the kernel of the smallest weighted Gaussian; with read and write access
-
size_step
¶ int <– The step used to set the kernel size of other Weighted Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access
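The per-scale kernel sizes and sigmas follow directly from the documented formulas (size_s = 2*(size_min + s*size_step) + 1 and sigma_s = sigma * (size_min + s*size_step) / size_min):

```python
def retinex_scales(scales=3, size_min=1, size_step=1, sigma=2.0):
    # (kernel size, sigma) per scale, using the documented formulas
    out = []
    for s in range(scales):
        radius = size_min + s * size_step
        out.append((2 * radius + 1, sigma * radius / size_min))
    return out

print(retinex_scales())  # [(3, 2.0), (5, 4.0), (7, 6.0)]
```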
-
class
bob.ip.base.
SIFT
¶ Bases:
object
This class allows after configuration the extraction of SIFT descriptors
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.SIFT (size, scales, octaves, octave_min, [sigma_n], [sigma0], [contrast_thres], [edge_thres], [norm_thres], [kernel_radius_factor], [border])
- bob.ip.base.SIFT (sift)
Creates an object that allows the extraction of SIFT descriptors
Todo
Explain SIFT constructor in more detail.
Warning
The order of the parameters
scales
andoctaves
has changed compared to the old implementation, in order to keep it consistent withbob.ip.base.VLSIFT
!Parameters:
size
: (int, int)The height and width of the images to processscales
: intThe number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT featuresoctaves
: intThe number of octaves of the pyramidoctave_min
: intThe index of the minimum octavesigma_n
: float[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scalesigma0
: float[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scalecontrast_thres
: float[default: 0.03] The contrast threshold used during keypoint detectionedge_thres
: float[default: 10.] The edge threshold used during keypoint detectionnorm_thres
: float[default: 0.2] The norm threshold used during descriptor normalizationkernel_radius_factor
: float[default: 4.] Factor used to determine the kernel radii:size=2*radius+1
. For each Gaussian kernel, the radius is equal toceil(kernel_radius_factor*sigma_{octave,scale})
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordersift
:bob.ip.base.SIFT
The SIFT object to use for copy-constructionClass Members:
-
bins
¶ int <– The number of bins for the descriptor, with read and write access
-
blocks
¶ int <– The number of blocks for the descriptor, with read and write access
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border; with read and write access
-
compute_descriptor
(src, keypoints[, dst]) → dst¶ Computes SIFT descriptors for a 2D/grayscale image at the given keypoints
If given, the results are put in the output
dst
, which should be of type float and allocated with the shape returned by theoutput_shape()
method.Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be processedkeypoints
: [bob.ip.base.GSSKeypoint
]The keypoints at which the descriptors should be computeddst
: [array_like (4D, float)]The descriptors that should have been allocated in sizeoutput_shape()
Returns:
dst
: [array_like (4D, float)]The resulting descriptors, if given it will be the same as thedst
parameter
-
contrast_threshold
¶ float <– The contrast threshold used during keypoint detection
-
edge_threshold
¶ float <– The edge threshold used during keypoint detection
-
gaussian_window_size
¶ float <– The Gaussian window size for the descriptor
-
kernel_radius_factor
¶ float <– Factor used to determine the kernel radii
size=2*radius+1
For each Gaussian kernel, the radius is equal to
ceil(kernel_radius_factor*sigma_{octave,scale})
-
magnif
¶ float <– The magnification factor for the descriptor
-
norm_epsilon
¶ float <– The epsilon value used during descriptor normalization
-
norm_threshold
¶ float <– The norm threshold used during descriptor normalization
-
octave_max
¶ int <– The index of the maximum octave, read only access
This is equal to
octave_min+octaves-1
.
-
octave_min
¶ int <– The index of the minimum octave, with read and write access
-
octaves
¶ int <– The number of octaves of the pyramid, with read and write access
-
output_shape
(keypoints) → shape¶ Returns the output shape for the given number of input keypoints
Parameters:
keypoints
: intThe number of keypoints that you want to retrieve SIFT features forReturns:
shape
: (int, int, int, int)The shape of the output array required to callcompute_descriptor()
-
scales
¶ int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting SIFT features
-
set_sigma0_no_init_smoothing
() → None¶ Sets sigma0 such that there is no smoothing at the first scale of octave_min
-
sigma0
¶ float <– The value sigma0 of the standard deviation for the image of the first octave and first scale
-
sigma_n
¶ float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access
-
size
¶ (int, int) <– The shape of the images to process, with read and write access
-
class
bob.ip.base.
SelfQuotientImage
¶ Bases:
object
This class allows after configuration to apply the Self Quotient Image algorithm to images
Details of the Self Quotient Image algorithm are described in [Wang2004].
Constructor Documentation:
- bob.ip.base.SelfQuotientImage ([scales], [size_min], [size_step], [sigma], [border])
- bob.ip.base.SelfQuotientImage (sqi)
Creates an object to preprocess images with the Self Quotient Image algorithm
Todo
explain SelfQuotientImage constructor
Warning
Compared to the previous Bob version, here the sigma parameter is the standard deviation and not the variance. As a consequence, the resulting
WeightedGaussian
pyramid is different, see https://github.com/bioidiap/bob.ip.base/issues/1.Parameters:
scales
: int[default: 1] The number of scales (bob.ip.base.WeightedGaussian
)size_min
: int[default: 1] The radius of the kernel of the smallestbob.ip.base.WeightedGaussian
size_step
: int[default: 1] The step used to set the kernel size of other weighted Gaussians:size_s = 2 * (size_min + s * size_step) + 1
sigma
: double[default:math.sqrt(2.)
] The standard deviation of the kernel of the smallest weighted Gaussian; other sigmas:sigma_s = sigma * (size_min + s * size_step) / size_min
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordersqi
:bob.ip.base.SelfQuotientImage
TheSelfQuotientImage
object to use for copy-constructionClass Members:
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border; with read and write access
-
process
(src[, dst]) → dst¶ Applies the Self Quotient Image algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double
If given, the
dst
array should have the type float and the same size as thesrc
array.Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be processeddst
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape assrc
Returns:
dst
: array_like (2D, float)The resulting output image, which is the same asdst
(if given)
-
scales
¶ int <– The number of scales (Weighted Gaussian); with read and write access
-
sigma
¶ float <– The standard deviation of the kernel of the smallest weighted Gaussian (sigma_s = sigma * (size_min+s*size_step)/size_min); with read and write access
-
size_min
¶ int <– The radius (size=2*radius+1) of the kernel of the smallest weighted Gaussian; with read and write access
-
size_step
¶ int <– The step used to set the kernel size of other Weighted Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access
-
class
bob.ip.base.
TanTriggs
¶ Bases:
object
Objects of this class, after configuration, can preprocess images
It does this using the method described by Tan and Triggs in the paper [TanTriggs2007].
Constructor Documentation:
- bob.ip.base.TanTriggs ([gamma], [sigma0], [sigma1], [radius], [threshold], [alpha], [border])
- bob.ip.base.TanTriggs (tan_triggs)
Constructs a new Tan and Triggs filter
Todo
Explain TanTriggs constructor in more detail.
Parameters:
gamma
: float[default:0.2
] The value of gamma for the gamma correctionsigma0
: float[default:1.
] The standard deviation of the inner Gaussiansigma1
: float[default:2.
] The standard deviation of the outer Gaussianradius
: int[default:2
] The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1)threshold
: float[default:10.
] The threshold used for the contrast equalizationalpha
: float[default:0.1
] The alpha value used for the contrast equalizationborder
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the bordertan_triggs
:bob.ip.base.TanTriggs
The TanTriggs object to use for copy-constructionClass Members:
-
alpha
¶ float <– The alpha value used for the contrast equalization, with read and write access
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border, with read and write access
-
gamma
¶ float <– The value of gamma for the gamma correction, with read and write access
-
kernel
¶ array_like (2D, float) <– The values of the DoG filter; read only access
-
process
(input[, output]) → output¶ Preprocesses a 2D/grayscale image using the algorithm from Tan and Triggs.
The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D array of type float64 and allocated in the same size as the input. If the destination array is not given, it is generated in the required size.
Note
The
__call__()
function is an alias for this method.Parameters:
input
: array_like (2D)The input image which should be normalizedoutput
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape asinput
Returns:
output
: array_like (2D, float)The resulting output image, which is the same asoutput
(if given)
-
radius
¶ int <– The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1), with read and write access
-
sigma0
¶ float <– The standard deviation of the inner Gaussian, with read and write access
-
sigma1
¶ float <– The standard deviation of the outer Gaussian, with read and write access
-
threshold
¶ float <– The threshold used for the contrast equalization, with read and write access
-
class
bob.ip.base.
VLDSIFT
¶ Bases:
object
Computes dense SIFT features using the VLFeat library
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.VLDSIFT (size, [step], [block_size])
- bob.ip.base.VLDSIFT (sift)
Creates an object that allows the extraction of VLDSIFT descriptors
Todo
Explain VLDSIFT constructor in more detail.
Parameters:
size
: (int, int)The height and width of the images to processstep
: (int, int)[default:(5, 5)
] The step along the y- and x-axesblock_size
: (int, int)[default:(5, 5)
] The block size along the y- and x-axessift
:bob.ip.base.VLDSIFT
The VLDSIFT object to use for copy-constructionClass Members:
-
block_size
¶ (int, int) <– The block size in both directions, with read and write access
-
extract
(src[, dst]) → dst¶ Computes the dense SIFT features from an input image, using the VLFeat library
If given, the results are put in the output
dst
, which should be of type float and allocated in the shape returned by theoutput_shape()
method.Todo
Describe the output of the
VLDSIFT.extract()
method in more detail.Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D, float32)The input image which should be processeddst
: [array_like (2D, float32)]The descriptors that should have been allocated in sizeoutput_shape()
Returns:
dst
: array_like (2D, float32)The resulting descriptors, if given it will be the same as thedst
parameter
-
output_shape
() → shape¶ Returns the output shape for the current setup
The output shape is a 2-element tuple consisting of the number of keypoints for the current size, and the size of the descriptors
Returns:
shape
: (int, int)The shape of the output array required to callextract()
-
size
¶ (int, int) <– The shape of the images to process, with read and write access
-
step
¶ (int, int) <– The step along both directions, with read and write access
-
use_flat_window
¶ bool <– Whether to use a flat window or not (to boost the processing time), with read and write access
-
window_size
¶ float <– The window size, with read and write access
-
class
bob.ip.base.
VLSIFT
¶ Bases:
object
Computes SIFT features using the VLFeat library
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.VLSIFT (size, scales, octaves, octave_min, [peak_thres], [edge_thres], [magnif])
- bob.ip.base.VLSIFT (sift)
Creates an object that allows the extraction of VLSIFT descriptors
Todo
Explain VLSIFT constructor in more detail.
Parameters:
size
: (int, int)The height and width of the images to processscales
: intThe number of intervals in each octaveoctaves
: intThe number of octaves of the pyramidoctave_min
: intThe index of the minimum octavepeak_thres
: float[default: 0.03] The peak threshold (minimum amount of contrast to accept a keypoint)edge_thres
: float[default: 10.] The edge rejection threshold used during keypoint detectionmagnif
: float[default: 3.] The magnification factor (descriptor size is determined by multiplying the keypoint scale by this factor)sift
:bob.ip.base.VLSIFT
The VLSIFT object to use for copy-constructionClass Members:
-
edge_threshold
¶ float <– The edge rejection threshold used during keypoint detection, with read and write access
-
extract
(src[, keypoints]) → dst¶ Computes the SIFT features from an input image
A keypoint is specified by a 3- or 4-tuple (y, x, sigma, [orientation]), stored as one row of the given
keypoints
parameter. If thekeypoints
are not given, they are detected first. It returns a list of descriptors, one for each keypoint and orientation. The first four values are the x, y, sigma and orientation of the keypoint. The 128 remaining values define the descriptor.Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D, uint8)The input image which should be processedkeypoints
: array_like (2D, float)The keypoints at which the descriptors should be computedReturns:
dst
: [array_like (1D, float)]The resulting descriptors; the first four values are the x, y, sigma and orientation of the keypoints, the 128 remaining values define the descriptor
-
magnif
¶ float <– The magnification factor for the descriptor
-
octave_max
¶ int <– The index of the maximum octave, read only access
This is equal to
octave_min+octaves-1
.
-
octave_min
¶ int <– The index of the minimum octave, with read and write access
-
octaves
¶ int <– The number of octaves of the pyramid, with read and write access
-
peak_threshold
¶ float <– The peak threshold (minimum amount of contrast to accept a keypoint), with read and write access
-
scales
¶ int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting VLSIFT features
-
size
¶ (int, int) <– The shape of the images to process, with read and write access
-
class
bob.ip.base.
WeightedGaussian
¶ Bases:
object
This class performs weighted Gaussian smoothing (anisotropic filtering)
In particular, it is used by the Self Quotient Image (SQI) algorithm
bob.ip.base.SelfQuotientImage
.Constructor Documentation:
- bob.ip.base.WeightedGaussian (sigma, [radius], [border])
- bob.ip.base.WeightedGaussian (weighted_gaussian)
Constructs a new weighted Gaussian filter
Todo
explain WeightedGaussian constructor
Warning
Compared to the last Bob version, here the sigma parameter is the standard deviation and not the variance.
Parameters:
sigma
: (double, double)The standard deviation of the WeightedGaussian along the y- and x-axes in pixelsradius
: (int, int)[default: (-1, -1) ->3*sigma
] The radius of the Gaussian in both directions – the size of the kernel is2*radius+1
border
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the borderweighted_gaussian
:bob.ip.base.WeightedGaussian
The weighted Gaussian object to use for copy-constructionClass Members:
-
border
¶ bob.sp.BorderType
<– The extrapolation method used by the convolution at the border, with read and write access
-
filter
(src[, dst]) → dst¶ Smooths an image (2D/grayscale or 3D/color)
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be smootheddst
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape assrc
Returns:
dst
: array_like (2D, float)The resulting output image, which is the same asdst
(if given)
-
class
bob.ip.base.
Wiener
¶ Bases:
object
A Wiener filter
The Wiener filter is implemented after the description in Part 3.4.3 of [Szeliski2010]
Constructor Documentation:
- bob.ip.base.Wiener (size, Pn, [variance_threshold])
- bob.ip.base.Wiener (Ps, Pn, [variance_threshold])
- bob.ip.base.Wiener (data, [variance_threshold])
- bob.ip.base.Wiener (filter)
- bob.ip.base.Wiener (hdf5)
Constructs a new Wiener filter
Several constructor variants are possible for constructing a Wiener filter. They are:
- Constructs a new Wiener filter dedicated to images of the given
size
. The filter is initialized with zero values - Constructs a new Wiener filter from a set of variance estimates
Ps
and a noise levelPn
- Trains the new Wiener filter with the given
data
- Copy constructs the given Wiener filter
- Reads the Wiener filter from
bob.io.base.HDF5File
Parameters:
Ps
: array_like<float, 2D>Variance Ps estimated at each frequencyPn
: floatNoise level Pnsize
: (int, int)The shape of the newly created empty filterdata
: array_like<float, 3D>The training data, with dimensions(#data, height, width)
variance_threshold
: float[default:1e-8
] Variance flooring threshold (i.e., the minimum variance value)filter
:bob.ip.base.Wiener
The Wiener filter object to use for copy-constructionhdf5
:bob.io.base.HDF5File
The HDF5 file object to read the Wiener filter fromClass Members:
-
Pn
¶ float <– Noise level Pn
-
Ps
¶ array_like <float, 2D> <– Variance Ps estimated at each frequency
-
filter
(src[, dst]) → dst¶ Filters the input image
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The
__call__()
function is an alias for this method.Parameters:
src
: array_like (2D)The input image which should be smootheddst
: array_like (2D, float)[default:None
] If given, the output will be saved into this image; must be of the same shape assrc
Returns:
dst
: array_like (2D, float)The resulting output image, which is the same asdst
(if given)
-
is_similar_to
(other[, r_epsilon][, a_epsilon]) → None¶ Compares this Wiener filter with the
other
one to be approximately the sameThe optional values
r_epsilon
anda_epsilon
refer to the relative and absolute precision, similarly tonumpy.allclose()
.Parameters:
other
:bob.ip.base.Wiener
The other Wiener filter to compare withr_epsilon
: float[Default:1e-5
] The relative precisiona_epsilon
: float[Default:1e-8
] The absolute precision
-
load
(hdf5) → None¶ Loads the configuration of the Wiener filter from the given HDF5 file
Parameters:
hdf5
:bob.io.base.HDF5File
An HDF5 file opened for reading
-
save
(hdf5) → None¶ Saves the configuration of the Wiener filter to the given HDF5 file
Parameters:
hdf5
:bob.io.base.HDF5File
An HDF5 file open for writing
-
size
¶ (int, int) <– The size of the filter
-
variance_threshold
¶ float <– Variance flooring threshold
-
w
¶ array_like<2D, float> <– The Wiener filter W (W=1/(1+Pn/Ps)) (read-only)
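The w attribute above is the complete filter, W = 1/(1+Pn/Ps). As a rough sketch only (not the library's implementation — bob may apply the filter over a different transform and handle training separately), the formula can be illustrated in numpy, here applied in the DFT domain:

```python
import numpy as np

def wiener_filter(src, Ps, Pn):
    # The filter as documented for the ``w`` attribute: W = 1/(1 + Pn/Ps).
    W = 1.0 / (1.0 + Pn / Ps)
    # Apply it in the frequency domain and transform back.
    return np.real(np.fft.ifft2(W * np.fft.fft2(src)))

# With a zero noise level the filter degenerates to the identity:
img = np.random.rand(8, 8)
Ps = np.ones((8, 8))        # hypothetical flat variance estimate
out = wiener_filter(img, Ps, 0.0)
```

With Pn = 0 every frequency passes unattenuated, so the output equals the input up to floating-point error; larger Pn suppresses frequencies whose estimated signal variance Ps is small.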
-
bob.ip.base.
angle_to_horizontal
(right, left) → angle[source]¶ Get the angle needed to level out (horizontally) two points.
Parameters
right
,left
: (float, float)- The two points to level out horizontally.
Returns
- angle : float
- The angle in degrees between the left and the right point
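A minimal numpy-free sketch of this computation (an assumption about the (y, x) point convention, not the library's exact source):

```python
import math

def angle_to_horizontal(right, left):
    # Points are assumed to be (y, x) tuples, as elsewhere in this API.
    dy = left[0] - right[0]
    dx = left[1] - right[1]
    return math.degrees(math.atan2(dy, dx))

# Two points already on the same row need no rotation:
print(angle_to_horizontal((10.0, 10.0), (10.0, 30.0)))  # → 0.0
```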
-
bob.ip.base.
block
(input, block_size[, block_overlap][, output][, flat]) → output¶ Performs a block decomposition of a 2D array/image
If given, the output 3D or 4D destination array should be allocated and of the correct size, see
bob.ip.base.block_output_shape()
.Parameters:
input
: array_like (2D)The source image to decompose into blocksblock_size
: (int, int)The size of the blocks in which the image is decomposedblock_overlap
: (int, int)[default:(0, 0)
] The overlap of the blocksoutput
: array_like(3D or 4D)[default:None
] If given, the resulting blocks will be saved into this parameter; must be initialized in the correct size (seeblock_output_shape()
)flat
: bool[default:False
] Ifoutput
is not specified, theflat
parameter is used to decide whether 3D (flat = True
) or 4D (flat = False
) output is generatedReturns:
output
: array_like(3D or 4D)The resulting blocks that the image is decomposed into; the same array as theoutput
parameter, when given.
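The block decomposition described above can be sketched in pure numpy (an illustration of the documented behaviour, not the library's implementation):

```python
import numpy as np

def block(image, block_size, block_overlap=(0, 0), flat=False):
    bh, bw = block_size
    oh, ow = block_overlap
    sy, sx = bh - oh, bw - ow                  # step between block origins
    h, w = image.shape
    ny = (h - oh) // sy                        # number of blocks per axis
    nx = (w - ow) // sx
    out = np.empty((ny, nx, bh, bw), dtype=image.dtype)
    for y in range(ny):
        for x in range(nx):
            out[y, x] = image[y * sy:y * sy + bh, x * sx:x * sx + bw]
    # ``flat`` collapses the two block indices into one, as documented.
    return out.reshape(ny * nx, bh, bw) if flat else out

img = np.arange(16).reshape(4, 4)
blocks = block(img, (2, 2))        # four non-overlapping 2x2 blocks
```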
-
bob.ip.base.
block_output_shape
(input, block_size[, block_overlap][, flat]) → shape¶ Returns the shape of the output image that is required to compute the
bob.ip.base.block()
functionParameters:
input
: array_like (2D)The source image to decompose into blocksblock_size
: (int, int)The size of the blocks in which the image is decomposedblock_overlap
: (int, int)[default:(0, 0)
] The overlap of the blocksflat
: bool[default:False
] Theflat
parameter is used to decide whether 3D (flat = True
) or 4D (flat = False
) output is generatedReturns:
shape
: (int, int, int) or (int, int, int, int)The shape of the blocks.
-
bob.ip.base.
crop
(src, crop_offset, crop_size[, dst][, src_mask][, dst_mask][, fill_pattern]) → dst[source]¶ Crops the given image
src
image to the given offset (might be negative) and to the given size (might be greater thansrc
image). Either crop_size or dst needs to be specified. When masks are given, they need to be of the same size as the
src
anddst
parameters. When crop regions are outside the image, the cropped image will containfill_pattern
and the mask will be set toFalse
Parameters
src
: array_like (2D or 3D)- The source image to crop.
crop_offset
: (int, int)- The position in
src
coordinates to start cropping; might be negative crop_size
: (int, int)- The size of the cropped image; might be omitted when the
dst
is given dst
: array_like (2D or 3D)- If given, the destination to crop
src
to. src_mask
,dst_mask
: array_like(bool, 2D or 3D)- Masks that define, where
src
anddst
are valid fill_pattern
: number- [default: 0] The value to set outside the croppable area
Returns
dst
: array_like (2D or 3D)- The cropped image
-
bob.ip.base.
extrapolate_mask
()¶ - extrapolate_mask(mask, img) -> None
- extrapolate_mask(mask, img, random_sigma, [neighbors], [rng]) -> None
Extrapolate a 2D array/image, taking a boolean mask into account
The
img
argument is used both as an input and an output. Only values where the mask is set to False are extrapolated. The region where the mask is set to True is expected to be convex.This function can be called in two ways:
The first way is by giving only the mask and the image. Then a nearest neighbor technique is used as:
- The columns of the image are first extrapolated with respect to the nearest neighbor on the same column.
- The rows of the image are then extrapolated with respect to the closest neighbor on the same row.
In the second way, the image is extrapolated by adding random values to the border pixels. The image is scanned in a spiral way, starting at the center of the masked area. When a pixel of the unmasked area is reached:
- The next pixel of the masked area is searched perpendicular to the current spiral direction
- From that pixel,
neighbors
pixels are extracted from the image on both sides of the current spiral direction, and a random value is chosen - A normally distributed random value with mean 1 and standard deviation
random_sigma
is added to the pixel value - The resulting pixel value is written to the image at the current position
Any action considering a random number will use the given
rng
to create random numbers.Note
For the second variant, images of type
float
are preferred.Parameters:
mask
: array_like (2D, bool)The mask which has the valid pixel set toTrue
and the invalid pixel set toFalse
img
: array_like (2D, bool)The image that will be filled; must have the same shape asmask
random_sigma
: floatThe standard deviation of the random factor to multiply the valid pixel value from the border with; must be greater than or equal to 0neighbors
: int[Default: 5] The number of neighbors of valid border pixels to choose one from; setneighbors=0
to disable random selectionrng
:bob.core.random.mt19937
[Default: rng initialized with the system time] The random number generator to consider
-
bob.ip.base.
flip
(src[, dst]) → dst[source]¶ Flip a 2D or 3D array/image upside-down. If given, the destination array
dst
should have the same size and type as the source array.Parameters
src
: array_like (2D or 3D)- The source image to flip.
dst
: array_like (2D or 3D)- If given, the destination to flip
src
to.
Returns
dst
: array_like (2D or 3D)- The flipped image
-
bob.ip.base.
flop
(src[, dst]) → dst[source]¶ Flip a 2D or 3D array/image left-right. If given, the destination array
dst
should have the same size and type as the source array.Parameters
src
: array_like (2D or 3D)- The source image to flip.
dst
: array_like (2D or 3D)- If given, the destination to flip
src
to.
Returns
dst
: array_like (2D or 3D)- The flipped image
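For plain numpy arrays the effect of flip and flop can be sketched with simple indexing; since color (3D) images keep the two spatial axes last in this convention, the same expressions cover both cases (an illustration, not the library's code):

```python
import numpy as np

def flip(src):   # upside-down: reverse the row (second-to-last) axis
    return src[..., ::-1, :]

def flop(src):   # left-right: reverse the column (last) axis
    return src[..., :, ::-1]

img = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
```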
-
bob.ip.base.
gamma_correction
(src, gamma[, dst]) → dst¶ Performs a power-law gamma correction of a given 2D image
Todo
Explain gamma correction in more detail
Parameters:
src
: array_like (2D)The source image to apply the gamma correction togamma
: floatThe gamma value to applydst
: array_like (2D, float)The gamma-corrected image to write; if not specified, it will be created in the desired sizeReturns:
dst
: array_like (2D, float)The gamma-corrected image; the same as thedst
parameter, if specified
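A power-law gamma correction maps each pixel through x → x**gamma. A minimal numpy sketch (assuming non-negative input values; not the library's implementation, which may normalize differently):

```python
import numpy as np

def gamma_correction(src, gamma):
    # Raise every (non-negative) pixel value to the power ``gamma``.
    return np.power(src.astype(np.float64), gamma)

img = np.array([[0.0, 1.0], [4.0, 9.0]])
out = gamma_correction(img, 0.5)    # square root of each pixel
```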
-
bob.ip.base.
histogram
()¶ - histogram(src, [bin_count]) -> hist
- histogram(src, hist) -> None
- histogram(src, min_max, bin_count) -> hist
- histogram(src, min_max, hist) -> None
Computes a histogram of the given input image
This function computes a histogram of the given input image, in several ways.
- (version 1 and 2, only valid for uint8 and uint16 types – and uint32
and uint64 when
bin_count
is specified orhist
is given as parameter): For each pixel value of thesrc
image, a histogram bin is computed, using a fast implementation. The number of bins can be limited, and there will be a check that the source image pixels are actually in the desired range(0, bin_count-1)
- (version 3 and 4, valid for many data types): The histogram is computed by defining regular bins between the provided minimum and maximum values.
Parameters:
src
: array_like (2D)The source image to compute the histogram forhist
: array_like (1D, uint64)The histogram with the desired number of bins; the histogram will be cleaned before running the extractionmin_max
: (scalar, scalar)The minimum value and the maximum value in the source imagebin_count
: int[default: 256 or 65536] The number of bins in the histogram to create, defaults to the maximum number of valuesReturns:
hist
: array_like (1D, uint64)The histogram with the desired number of bins, which is filled with the histogrammed source data
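The min_max variant (versions 3 and 4) with regular bins can be sketched with numpy's bincount (an illustration of the documented behaviour; the library's range-checking details may differ):

```python
import numpy as np

def histogram(src, min_max, bin_count):
    lo, hi = min_max
    # Map each value to its regular-bin index, clipping into range.
    idx = ((src.astype(np.float64) - lo) / (hi - lo) * bin_count).astype(int)
    idx = np.clip(idx, 0, bin_count - 1)
    return np.bincount(idx.ravel(), minlength=bin_count).astype(np.uint64)

img = np.array([[0, 0, 1], [2, 3, 3]], dtype=np.uint8)
hist = histogram(img, (0, 4), 4)    # one bin per value 0..3
```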
-
bob.ip.base.
histogram_equalization
()¶ - histogram_equalization(src) -> None
- histogram_equalization(src, dst) -> None
Performs a histogram equalization of a given 2D image
The first version computes the normalization in-place (in opposition to the old implementation, which returned an equalized image), while the second version fills the given
dst
array and leaves the input untouched.Parameters:
src
: array_like (2D, uint8 or uint16)The source image to compute the histogram fordst
: array_like (2D, uint8, uint16, uint32 or float)The histogram-equalized image to write; if not specified, the equalization is computed in-place.
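Histogram equalization remaps intensities through the normalized cumulative histogram. A standard uint8 sketch (returning the result rather than working in-place, and possibly differing from the library's exact rounding):

```python
import numpy as np

def histogram_equalization(src):
    # Cumulative histogram, normalized to [0, 1], used as a lookup table.
    hist = np.bincount(src.ravel(), minlength=256)
    cdf = hist.cumsum() / hist.sum()
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[src]

img = np.array([[0, 0, 128], [128, 255, 255]], dtype=np.uint8)
out = histogram_equalization(img)
```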
-
bob.ip.base.
integral
(src, dst[, sqr][, add_zero_border]) → None¶ Computes an integral image for the given input image
It is the responsibility of the user to select an appropriate type for the numpy array
dst
(andsqr
), which will contain the integral image. By default,src
anddst
should have the same size. When thesqr
matrix is given as well, it will be filled with the squared integral image (useful to compute variances of pixels).Note
The
sqr
image is expected to have the same data type as thedst
image.If
add_zero_border
is set toTrue
,dst
(andsqr
) should be one pixel larger thansrc
in each dimension. In this case, an extra zero pixel will be added at the beginning of each row and column.Parameters:
src
: array_like (2D)The source imagedst
: array_like (2D)The resulting integral imagesqr
: array_like (2D)The resulting squared integral image with the same data type asdst
add_zero_border
: boolIf enabled, an extra zero pixel will be added at the beginning of each row and column
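An integral image is just a double cumulative sum, and the zero border prepends one zero row and column. A numpy sketch of the documented behaviour (without the user-supplied dst/sqr arrays):

```python
import numpy as np

def integral(src, add_zero_border=False):
    # ii[y, x] = sum of the (y+1) x (x+1) top-left block of ``src``.
    ii = src.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    if add_zero_border:
        # One extra zero pixel at the beginning of each row and column.
        ii = np.pad(ii, ((1, 0), (1, 0)))
    return ii

img = np.ones((3, 3), dtype=np.uint8)
ii = integral(img)
```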
-
bob.ip.base.
lbphs
(input, lbp, block_size[, block_overlap][, output]) → output¶ Computes an local binary pattern histogram sequences from the given image
Warning
This is a re-implementation of the old bob.ip.LBPHSFeatures class, but with a different handling of blocks. Before, the blocks were extracted from the image, and LBPs were extracted in the blocks. Hence, in each block, the border pixels were not taken into account, and the histogram contained far fewer elements. Now, the LBPs are extracted first, and then the image is split into blocks.
This function computes the LBP features for the whole image, using the given
bob.ip.base.LBP
instance. Afterwards, the resulting image is split into several blocks with the given block size and overlap, and local LBH histograms are extracted from each region.Note
To get the required output shape, you can use
lbphs_output_shape()
function.Parameters:
input
: array_like (2D)The source image to compute the LBPHS forlbp
:bob.ip.base.LBP
The LBP class to be used for feature extractionblock_size
: (int, int)The size of the blocks in which the LBP histograms are splitblock_overlap
: (int, int)[default:(0, 0)
] The overlap of the blocks in which the LBP histograms are splitoutput
: array_like(2D, uint64)If given, the resulting LBPHS features will be written to this array; must have the size #output-blocks, #LBP-labels (seelbphs_output_shape()
)Returns:
output
: array_like(2D, uint64)The resulting LBPHS features of the size #output-blocks, #LBP-labels; the same array as theoutput
parameter, when given.
-
bob.ip.base.
lbphs_output_shape
(input, lbp, block_size[, block_overlap]) → shape¶ Returns the shape of the output image that is required to compute the
bob.ip.base.lbphs()
functionParameters:
input
: array_like (2D)The source image to compute the LBPHS forlbp
:bob.ip.base.LBP
The LBP class to be used for feature extractionblock_size
: (int, int)The size of the blocks in which the LBP histograms are splitblock_overlap
: (int, int)[default:(0, 0)
] The overlap of the blocks in which the LBP histograms are splitReturns:
shape
: (int, int)The shape of the LBP histogram sequences, which is(#blocks, #labels)
.
-
bob.ip.base.
max_rect_in_mask
(mask) → rect¶ Given a 2D mask (a 2D blitz array of booleans), compute the maximum rectangle which only contains true values.
The resulting rectangle contains the coordinates in the following order:
- The y-coordinate of the top left corner
- The x-coordinate of the top left corner
- The height of the rectangle
- The width of the rectangle
Parameters:
mask
: array_like (2D, bool)The mask of boolean values, e.g., as a result ofbob.ip.base.GeomNorm.process()
Returns:
rect
: (int, int, int, int)The resulting rectangle: (top, left, height, width)
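Finding the largest all-True rectangle is the classic "maximal rectangle" problem; one standard approach treats each row as the base of a histogram of consecutive True values and finds the best rectangle under each histogram with a monotonic stack. A sketch of that technique (not necessarily the library's algorithm), returning (top, left, height, width):

```python
import numpy as np

def max_rect_in_mask(mask):
    h, w = mask.shape
    heights = np.zeros(w, dtype=int)
    best, best_area = (0, 0, 0, 0), 0
    for y in range(h):
        # Height of the run of True values ending at row ``y``.
        heights = np.where(mask[y], heights + 1, 0)
        stack = []                        # column indices, heights increasing
        for x in range(w + 1):
            cur = heights[x] if x < w else 0
            while stack and heights[stack[-1]] >= cur:
                top = stack.pop()
                height = heights[top]
                left = stack[-1] + 1 if stack else 0
                width = x - left
                if height * width > best_area:
                    best_area = int(height * width)
                    best = (int(y - height + 1), int(left),
                            int(height), int(width))
            stack.append(x)
    return best

mask = np.array([[0, 1, 1],
                 [1, 1, 1],
                 [0, 1, 1]], dtype=bool)
print(max_rect_in_mask(mask))   # → (0, 1, 3, 2)
```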
-
bob.ip.base.
median
(src, radius[, dst]) → dst¶ Performs a median filtering of the input image with the given radius
This function performs a median filtering of the given
src
image with the given radius and writes the result to the givendst
image. Both gray-level and color images are supported, and the input and output datatype must be identical.Median filtering iterates with a mask of size
(2*radius[0]+1, 2*radius[1]+1)
over the input image. For each input region, the pixels under the mask are sorted and the median value (the middle element of the sorted list) is written into thedst
image. Therefore, thedst
is smaller than thesrc
image, i.e., by2*radius
pixels.Parameters:
src
: array_like (2D or 3D)The source image to filter, might be a gray level image or a color imageradius
: (int, int)The radius of the median filter; the final filter will have the size(2*radius[0]+1, 2*radius[1]+1)
dst
: array_like (2D or 3D)The median-filtered image to write; need to be of sizesrc.shape - 2*radius
; if not specified, it will be createdReturns:
dst
: array_like (2D or 3D)The median-filtered image; the same as thedst
parameter, if specified
-
bob.ip.base.
rotate
()¶ - rotate(src, rotation_angle) -> dst
- rotate(src, dst, rotation_angle) -> None
- rotate(src, src_mask, dst, dst_mask, rotation_angle) -> None
Rotates an image.
This function rotates an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:
- Given a source image and a rotation angle, the rotated image is
returned in the size
bob.ip.base.rotated_output_shape()
- Given source and destination image and the rotation angle, the source image is rotated and filled into the destination image.
- Same as 2., but additionally boolean masks will be read and filled with according values.
Note
Since the implementation uses a different interpolation style than before, results might slightly differ.
Parameters:
src
: array_like (2D or 3D)The input image (gray or colored) that should be rotateddst
: array_like (2D or 3D, float)The resulting scaled gray or color image, should be in sizebob.ip.base.rotated_output_shape()
src_mask
: array_like (bool, 2D or 3D)An input mask of valid pixels before geometric normalization, must be of same size assrc
dst_mask
: array_like (bool, 2D or 3D)The output mask of valid pixels after geometric normalization, must be of same size asdst
rotation_angle
: floatthe rotation angle that should be applied to the imageReturns:
dst
: array_like (2D, float)The resulting rotated image
-
bob.ip.base.
rotated_output_shape
(src, angle) → rotated_shape¶ This function returns the shape of the rotated image for the given image and angle
Parameters:
src
: array_like (2D,3D)The src image which should be rotatedangle
: floatThe rotation angle in degrees to rotate the src image withReturns:
rotated_shape
: (int, int) or (int, int, int)The shape of the rotateddst
image required in a call tobob.ip.base.rotate()
-
bob.ip.base.
scale
()¶ - scale(src, scaling_factor) -> dst
- scale(src, dst) -> None
- scale(src, src_mask, dst, dst_mask) -> None
Scales an image.
This function scales an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:
- Given a source image and a scale factor, the scaled image is
returned in the size
bob.ip.base.scaled_output_shape()
- Given source and destination image, the source image is scaled such that it fits into the destination image.
- Same as 2., but additionally boolean masks will be read and filled with according values.
Note
For 2. and 3., scale factors are computed for both directions independently. Factually, this means that the image might be stretched in either direction, i.e., the aspect ratio is not identical for the horizontal and vertical direction. Even for 1. this might apply, e.g., when
src.shape * scaling_factor
does not result in integral values.Parameters:
src
: array_like (2D or 3D)The input image (gray or colored) that should be scaleddst
: array_like (2D or 3D, float)The resulting scaled gray or color imagesrc_mask
: array_like (bool, 2D or 3D)An input mask of valid pixels before geometric normalization, must be of same size assrc
dst_mask
: array_like (bool, 2D or 3D)The output mask of valid pixels after geometric normalization, must be of same size asdst
scaling_factor
: floatthe scaling factor that should be applied to the imageReturns:
dst
: array_like (2D, float)The resulting scaled image
-
bob.ip.base.
scaled_output_shape
(src, scaling_factor) → scaled_shape¶ This function returns the shape of the scaled image for the given image and scale
The function tries its best to compute an integral-valued shape given the shape of the input image and the given scale factor. Nevertheless, for non-round scale factors this might not work out perfectly.
Parameters:
src
: array_like (2D,3D)The src image which should be scaledscaling_factor
: floatThe scaling factor to scale the src image withReturns:
scaled_shape
: (int, int) or (int, int, int)The shape of the scaleddst
image required in a call tobob.ip.base.scale()
-
bob.ip.base.
shift
(src, offset[, dst][, src_mask][, dst_mask][, fill_pattern]) → dst[source]¶ Shifts the given
src
image with the given offset (might be negative).If
dst
is specified, the image is shifted into thedst
image. Ideally,dst
should have the same size assrc
, but other sizes work as well. Whendst
isNone
(the default), it is created in the same size assrc
. When masks are given, they need to be of the same size as the
anddst
parameters. When the shift moves regions outside the image, the shifted image will contain
and the mask will be set toFalse
Parameters
src
: array_like (2D or 3D)- The source image to shift.
offset
: (int, int)- The position in
src
coordinates to shift to; might be negative dst
: array_like (2D or 3D)- If given, the destination to shift
src
to.
,dst_mask
: array_like(bool, 2D or 3D)- Masks that define, where
src
anddst
are valid fill_pattern
: number- [default: 0] The value to set outside the croppable area
Returns
dst
: array_like (2D or 3D)- The shifted image
-
bob.ip.base.
sobel
(src[, border][, dst]) → dst¶ Performs a Sobel filtering of the input image
This function will perform a Sobel filtering with both the vertical and the horizontal filter. A Sobel filter is an edge detector, which will detect either horizontal or vertical edges. The two filters are given as:
S_y = \left\lgroup\begin{array}{ccc} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{array}\right\rgroup \qquad S_x = \left\lgroup\begin{array}{ccc} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{array}\right\rgroup
If given, the dst array should have the expected type (numpy.float64) and two layers of the same size as the input image. Finally, the result of the vertical filter will be put into the first layer of
dst[0]
, while the result of the horizontal filter will be written todst[1]
.Parameters:
src
: array_like (2D, float)The source image to filterborder
:bob.sp.BorderType
[default:bob.sp.BorderType.Mirror
] The extrapolation method used by the convolution at the borderdst
: array_like (3D, float)The Sobel-filtered image to write; need to be of size[2] + src.shape
; if not specified, it will be createdReturns:
dst
: array_like (3D, float)The Sobel-filtered image; the same as thedst
parameter, if specified
-
bob.ip.base.
zigzag
(src, dst, right_first) → None¶ Extracts a 1D array using a zigzag pattern from a 2D array
This function extracts a 1D array using a zigzag pattern from a 2D array. If right_first is set to True, the second element of the pattern is taken at the right of the upper left element, otherwise it is taken at the bottom of the upper left element. The input is expected to be a 2D array. The output is expected to be a 1D array. This method only supports arrays of the following data types:
numpy.uint8
numpy.uint16
numpy.float64
(or the native pythonfloat
)
To create an object with a scalar type that will be accepted by this method, use a construction like the following:
>>> import numpy >>> input_righttype = input_wrongtype.astype(numpy.float64)
Parameters:
src
: array_like (uint8|uint16|float64, 2D)The source matrix.dst
: array_like (uint8|uint16|float64, 1D)The destination matrix.right_first
: scalar (bool)Tells whether the zigzag pattern start to move to the right or not