This section includes information for using the pure Python API of bob.ip.base.
bob.ip.base.GeomNorm | Objects of this class, after configuration, can perform a geometric |
bob.ip.base.FaceEyesNorm | Objects of this class, after configuration, can perform a geometric |
bob.ip.base.LBP | A class that extracts local binary patterns in various types |
bob.ip.base.LBPTop | A class that extracts local binary patterns (LBP) in three orthogonal |
bob.ip.base.DCTFeatures | Objects of this class, after configuration, can extract DCT features. |
bob.ip.base.TanTriggs | Objects of this class, after configuration, can preprocess images |
bob.ip.base.Gaussian | Objects of this class, after configuration, can perform Gaussian |
bob.ip.base.Wiener | A Wiener filter |
bob.ip.base.MultiscaleRetinex | This class allows after configuration to apply the Multiscale Retinex |
bob.ip.base.WeightedGaussian | This class performs weighted Gaussian smoothing (anisotropic filtering) |
bob.ip.base.SelfQuotientImage | This class allows after configuration to apply the Self Quotient Image |
bob.ip.base.GaussianScaleSpace | This class allows after configuration the generation of Gaussian |
bob.ip.base.GSSKeypoint | Structure to describe a keypoint on the |
bob.ip.base.GSSKeypointInfo | This is a companion structure to the |
bob.ip.base.SIFT | This class allows after configuration the extraction of SIFT |
bob.ip.base.VLSIFT | Computes SIFT features using the VLFeat library |
bob.ip.base.VLDSIFT | Computes dense SIFT features using the VLFeat library |
bob.ip.base.GradientMagnitude | Gradient ‘magnitude’ used |
bob.ip.base.BlockNorm | Enumeration that defines the norm that is used for normalizing the |
bob.ip.base.HOG | Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors. |
bob.ip.base.GLCMProperty | Enumeration that defines the properties of GLCM, to be used in |
bob.ip.base.GLCM(*args, **kwargs) | Objects of this class, after configuration, can compute Grey-Level |
bob.ip.base.scale |
bob.ip.base.scaled_output_shape(src, ...) | This function returns the shape of the scaled image for the given |
bob.ip.base.rotate |
bob.ip.base.rotated_output_shape(src, ...) | This function returns the shape of the rotated image for the given |
bob.ip.base.block(input, block_size, ...) | Performs a block decomposition of a 2D array/image |
bob.ip.base.block_output_shape(input, ...) | Returns the shape of the output image that is required to compute the |
bob.ip.base.crop(src, crop_offset, ...) | Crops the given src image to the given offset (might be negative) and to the given size (might be greater than the src image). |
bob.ip.base.shift(src, offset, [dst], ...) | Shifts the given src image by the given offset (might be negative). |
bob.ip.base.max_rect_in_mask(mask) -> rect | Given a 2D mask (a 2D blitz array of booleans), compute the maximum rectangle which only contains true values. |
bob.ip.base.angle_to_horizontal(right, ...) | Get the angle needed to level out (horizontally) two points. |
bob.ip.base.histogram |
bob.ip.base.lbphs(input, lbp, block_size, ...) | Computes a local binary pattern histogram sequence from the given |
bob.ip.base.lbphs_output_shape(input, lbp, ...) | Returns the shape of the output image that is required to compute the |
bob.ip.base.histogram_equalization |
bob.ip.base.gamma_correction(src, gamma, ...) | Performs a power-law gamma correction of a given 2D image |
bob.ip.base.integral(src, dst, [sqr], ...) | Computes an integral image for the given input image |
bob.ip.base.zigzag(src, dst, ...) | Extracts a 1D array using a zigzag pattern from a 2D array |
bob.ip.base.median(src, radius, [dst]) -> dst | Performs a median filtering of the input image with the given radius |
bob.ip.base.sobel(src, [border], [dst]) -> dst | Performs a Sobel filtering of the input image |
Bases: object
Enumeration that defines the norm that is used for normalizing the descriptor blocks
Possible values are:
Class Members:
Bases: object
Objects of this class, after configuration, can extract DCT features.
The DCT feature extraction is described in more detail in [Sanderson2002]. This class also supports block normalization and DCT coefficient normalization.
Constructor Documentation:
- bob.ip.base.DCTFeatures (coefficients, block_size, [block_overlap], [normalize_block], [normalize_dct], [square_pattern])
- bob.ip.base.DCTFeatures (dct_features)
Constructs a new DCT features extractor
Todo
Explain DCTFeatures constructor in more detail.
Parameters:
coefficients : int
The number of DCT coefficients
Note
The actual number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled by setting normalize_block=True (as the first coefficient is always 0 in this case).
block_size : (int, int)
The size of the blocks into which the image is decomposed
block_overlap : (int, int)
[default: (0, 0)] The overlap of the blocks
normalize_block : bool
[default: False] Normalize each block to zero mean and unit variance before extracting DCT coefficients? In this case, the first coefficient will always be zero and hence will not be returned
normalize_dct : bool
[default: False] Normalize DCT coefficients to zero mean and unit variance after the DCT extraction?
square_pattern : bool
[default: False] Select whether a zigzag pattern or a square pattern is used for the DCT extraction; for a square pattern, the number of DCT coefficients must be a square integer
dct_features : bob.ip.base.DCTFeatures
The DCTFeatures object to use for copy-construction
Class Members:
(int, int) <– The block overlap in both vertical and horizontal direction of the Multi-Block-DCTFeatures extractor, with read and write access
Note
The block_overlap must be smaller than the block_size.
(int, int) <– The size of each block for the block decomposition, with read and write access
int <– The number of DCT coefficients, with read and write access
Note
The actual number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled (as the first coefficient is always 0 in this case).
Extracts DCT features from either uint8, uint16 or double arrays
The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D or 3D array of type float64 and allocated with the correct dimensions (see output_shape()). If the destination array is not given (first version), it is generated in the required size. The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image for which DCT features should be extracted
flat : bool
[default: True] The flat parameter is used to decide whether 2D (flat = True) or 3D (flat = False) output shape is generated
output : array_like (2D or 3D, float)
The output array, which needs to be of shape output_shape()
Returns:
output : array_like (2D or 3D, float)
The resulting DCT features
float <– The epsilon value to avoid division-by-zero when performing block or DCT coefficient normalization (read and write access)
The default value for this epsilon is 10 * sys.float_info.min, and usually there is little necessity to change that.
bool <– Normalize each block to zero mean and unit variance before extracting DCT coefficients (read and write access)
Note
In case normalize_block is set to True the first coefficient will always be zero and, hence, will not be returned.
bool <– Normalize DCT coefficients to zero mean and unit variance after the DCT extraction (read and write access)
This function returns the shape of the DCT output for the given input
The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.
Parameters:
input : array_like (2D)
The input image for which DCT features should be extracted
shape : (int, int)
The shape of the input image for which DCT features should be extracted
flat : bool
[default: True] The flat parameter is used to decide whether 2D (flat = True) or 3D (flat = False) output shape is generated
Returns:
dct_shape : (int, int) or (int, int, int)
The shape of the DCT features image that is required in a call to extract()
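The relation between the block settings and the two output layouts can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name dct_output_shape is hypothetical, and the block-count formula (the usual sliding-window count for overlapping blocks) is an assumption. Only the flat and non-flat layouts and the coefficients-1 rule are taken from the description above.

```python
def dct_output_shape(image_shape, coefficients, block_size,
                     block_overlap=(0, 0), normalize_block=False, flat=True):
    """Hypothetical helper mirroring the documented output layouts."""
    (height, width), (bh, bw), (oh, ow) = image_shape, block_size, block_overlap
    if not (0 <= oh < bh and 0 <= ow < bw):
        raise ValueError("block_overlap must be smaller than block_size")
    # Assumed sliding-window count: each new block advances by (size - overlap).
    blocks_y = (height - oh) // (bh - oh)
    blocks_x = (width - ow) // (bw - ow)
    # With block normalization the first coefficient is always 0 and is dropped.
    n_coef = coefficients - 1 if normalize_block else coefficients
    if flat:
        return (blocks_y * blocks_x, n_coef)
    return (blocks_y, blocks_x, n_coef)


print(dct_output_shape((80, 64), 15, (8, 8)))
print(dct_output_shape((80, 64), 15, (8, 8), (4, 4),
                       normalize_block=True, flat=False))
```

With the real extractor, output_shape() remains the authoritative way to obtain these dimensions.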
bool <– Tells whether a zigzag pattern or a square pattern is used for the DCT extraction (read and write access)
Note
For a square pattern, the number of DCT coefficients must be a square integer.
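The difference between the two patterns can be illustrated with a small stand-alone sketch. It assumes a JPEG-style zigzag order starting to the right and a top-left square of side sqrt(n); the function names are hypothetical and the traversal direction used by the library may differ.

```python
import math


def zigzag_indices(rows, cols):
    """(row, col) pairs of a rows x cols block in JPEG-style zigzag order."""
    def key(rc):
        r, c = rc
        s = r + c  # index of the anti-diagonal
        # Odd diagonals are walked downwards, even diagonals upwards.
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(rows) for c in range(cols)), key=key)


def select_coefficients(block, n, square_pattern=False):
    """Pick n entries from a 2D block of DCT coefficients (list of lists)."""
    if square_pattern:
        k = math.isqrt(n)
        if k * k != n:
            raise ValueError("for a square pattern, n must be a square integer")
        positions = [(r, c) for r in range(k) for c in range(k)]
    else:
        positions = zigzag_indices(len(block), len(block[0]))[:n]
    return [block[r][c] for r, c in positions]


block = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(select_coefficients(block, 4))                       # zigzag pattern
print(select_coefficients(block, 4, square_pattern=True))  # 2 x 2 square
```

Both patterns start at the top-left (lowest-frequency) coefficient; they differ in which higher-frequency coefficients are kept.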
Bases: object
Objects of this class, after configuration, can perform a geometric normalization of facial images based on their eye positions
The geometric normalization is a combination of rotation, scaling and cropping an image. The underlying implementation relies on a bob.ip.base.GeomNorm object to perform the actual geometric normalization.
Constructor Documentation:
- bob.ip.base.FaceEyesNorm (crop_size, eyes_distance, eyes_center)
- bob.ip.base.FaceEyesNorm (crop_size, right_eye, left_eye)
- bob.ip.base.FaceEyesNorm (other)
Constructs a FaceEyesNorm object.
Basically there exist two ways to define a FaceEyesNorm. Both ways require the resulting crop_size. The first constructor takes the inter-eye-distance and the center of the eyes, which will be used as transformation center. The second version takes the image resolution and two arbitrary positions in the face, with which the image will be aligned. Usually, these positions are the eyes, but any other pair (like mouth and eye for profile faces) can be specified.
Parameters:
crop_size : (int, int)
The resolution of the normalized face
eyes_distance : float
The inter-eye-distance in the normalized face
eyes_center : (float, float)
The center point between the eyes in the normalized face
right_eye : (float, float)
The location of the right eye (or another fixed point) in the normalized image
left_eye : (float, float)
The location of the left eye (or another fixed point) in the normalized image
other : FaceEyesNorm
Another FaceEyesNorm object to copy
Class Members:
(float, float) <– The transformation center in the processed image, which is usually the center between the eyes; with read and write access
(int, int) <– The size of the normalized image, with read and write access
This function extracts and normalizes the facial image
This function extracts the facial image based on the eye locations (or the locations of other fixed points, see note below). The geometric normalization is applied such that the eyes are placed at fixed positions in the normalized image. The image is cropped at the same time, so that no unnecessary operations are executed.
Note
Instead of the eyes, any two fixed positions can be used to normalize the face. This can simply be achieved by selecting two other nodes in the constructor (see FaceEyesNorm) and in this function. Just make sure that ‘right’ and ‘left’ refer to the same landmarks in both functions.
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image to which FaceEyesNorm should be applied
output : array_like (2D, float)
The output image, which must be of size crop_size
right_eye : (float, float)
The position of the right eye (or another landmark) in input image coordinates.
left_eye : (float, float)
The position of the left eye (or another landmark) in input image coordinates.
input_mask : array_like (2D, bool)
An input mask of valid pixels before geometric normalization, must be of same size as input
output_mask : array_like (2D, bool)
The output mask of valid pixels after geometric normalization, must be of same size as output
Returns:
output : array_like (2D, float)
The resulting normalized face image, which is of size crop_size
float <– The angle between the eyes in the normalized image (relative to the horizontal line), with read and write access
float <– The distance between the eyes in the normalized image, with read and write access
bob.ip.base.GeomNorm <– The geometric normalization class that was used to compute the last normalization, read access only
float <– The rotation angle that was applied on the latest normalized image, read access only
(float, float) <– The original transformation offset (eye center) in the normalization process, read access only
float <– The scale that was applied on the latest normalized image, read access only
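The rotation angle and scale that such a normalization has to apply can be derived from the two landmarks alone. The following is a minimal sketch under stated assumptions: positions are (y, x) tuples, the angle is measured from the right to the left landmark relative to the horizontal, and the function name eye_alignment is hypothetical. The sign convention used by the library is not stated here and may differ.

```python
import math


def eye_alignment(right_eye, left_eye, eyes_distance):
    """Angle (degrees), scale and center needed to align two (y, x) landmarks."""
    dy = left_eye[0] - right_eye[0]
    dx = left_eye[1] - right_eye[1]
    angle = math.degrees(math.atan2(dy, dx))    # 0 when the landmarks are level
    scale = eyes_distance / math.hypot(dy, dx)  # target distance / measured distance
    center = ((right_eye[0] + left_eye[0]) / 2.0,
              (right_eye[1] + left_eye[1]) / 2.0)
    return angle, scale, center


# Level eyes that are 40 pixels apart, normalized to a distance of 20 pixels.
print(eye_alignment((100, 80), (100, 120), 20.0))
```

The returned center corresponds to the transformation center that ends up at the configured eye center in the cropped image.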
Bases: bob.ip.base.GLCM
Objects of this class, after configuration, can compute the Grey-Level Co-occurrence Matrix of an image
This class extracts a Grey-Level Co-occurrence Matrix (GLCM) [Haralick1973]. A thorough tutorial about GLCM and the textural (so-called Haralick) properties that can be derived from it can be found at: http://www.fp.ucalgary.ca/mhallbey/tutorial.htm. A MATLAB implementation can be found at: http://www.mathworks.ch/ch/help/images/ref/graycomatrix.html
Constructor Documentation:
- bob.ip.base.GLCM ([levels], [min_level], [max_level], [dtype])
- bob.ip.base.GLCM (quantization_table)
- bob.ip.base.GLCM (glcm)
Constructor
Todo
The parameter(s) ‘levels, max_level, min_level, quantization_table’ are used, but not documented.
Parameters:
dtype : numpy.dtype
[default: numpy.uint8] The data-type for the GLCM class
glcm : bob.ip.base.GLCM
The GLCM object to use for copy-construction
Class Members:
Computes the angular_second_moment property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘angular_second_moment’ property
Computes the auto_correlation property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘auto_correlation’ property
Computes the cluster_prominence property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘cluster_prominence’ property
Computes the cluster_shade property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘cluster_shade’ property
Computes the contrast property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘contrast’ property
Computes the correlation property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘correlation’ property
Computes the correlation_matlab property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘correlation_matlab’ property
Computes the difference_entropy property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘difference_entropy’ property
Computes the difference_variance property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘difference_variance’ property
Computes the dissimilarity property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘dissimilarity’ property
numpy.dtype <– The data type, which was used in the constructor
Only images of this data type can be processed in the extract() function.
Computes the energy property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘energy’ property
Computes the entropy property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘entropy’ property
Extracts the GLCM matrix from the given input image
If given, the output array should have the expected type (numpy.float64) and the size as defined by output_shape().
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image to extract GLCM features from
output : array_like (3D, float)
[default: None] If given, the output will be saved into this array; must be of the shape as output_shape()
Returns:
output : array_like (3D, float)
The resulting output data, which is the same as the parameter output (if given)
Computes the homogeneity property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘homogeneity’ property
Computes the information_measure_of_correlation_1 property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘information_measure_of_correlation_1’ property
Computes the information_measure_of_correlation_2 property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘information_measure_of_correlation_2’ property
Computes the inverse_difference property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘inverse_difference’ property
Computes the inverse_difference_moment property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘inverse_difference_moment’ property
Computes the inverse_difference_moment_normalized property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘inverse_difference_moment_normalized’ property
Computes the inverse_difference_normalized property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘inverse_difference_normalized’ property
int <– Specifies the number of gray-levels to use when scaling the gray values in the input image
This is the number of the values in the first and second dimension in the GLCM matrix. The default is the total number of gray values permitted by the type of the input image.
int <– Gray values greater than or equal to this value are scaled to levels
The default is the maximum gray-level permitted by the type of input image.
Computes the maximum_probability property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘maximum_probability’ property
int <– Gray values smaller than or equal to this value are scaled to 0
The default is the minimum gray-level permitted by the type of input image.
bool <– Tells whether the computed GLCM is normalized (read and write access)
array_like (2D, int) <– The offset specifying the column and row distance between pixel pairs
The shape of this array is (num_offsets, 2), where num_offsets is the total number of offsets to be taken into account when computing GLCM.
Get the shape of the GLCM matrix given the input image
The shape has 3 dimensions: two for the number of gray levels, and one for the number of offsets
Returns:
shape : (int, int, int)
The shape of the output array required to call extract()
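How a matrix of this shape comes about can be sketched with plain Python lists. This is an illustrative counting loop, not the library's implementation: the function name cooccurrence_counts is hypothetical, the image is assumed to be already quantized to integer levels, and each offset is assumed to be given as (column distance, row distance) as described for the offset member.

```python
def cooccurrence_counts(image, levels, offsets):
    """Count co-occurring gray levels; result[i][j][k] belongs to offsets[k]."""
    height, width = len(image), len(image[0])
    result = [[[0] * len(offsets) for _ in range(levels)] for _ in range(levels)]
    for k, (dx, dy) in enumerate(offsets):  # assumed order: (column, row) distance
        for y in range(height):
            for x in range(width):
                y2, x2 = y + dy, x + dx
                # Only count pixel pairs that lie fully inside the image.
                if 0 <= y2 < height and 0 <= x2 < width:
                    result[image[y][x]][image[y2][x2]][k] += 1
    return result


image = [[0, 0, 1],
         [1, 2, 2]]
counts = cooccurrence_counts(image, levels=3, offsets=[(1, 0), (0, 1)])
print(len(counts), len(counts[0]), len(counts[0][0]))  # levels, levels, num_offsets
```

The first two dimensions index the gray levels of the pixel pair, and the third dimension selects the offset.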
Query the properties of GLCM by specifying a name
Returns a list of numpy.array of the queried properties. Please see the documentation of bob.ip.base.GLCMProperty for details on the possible properties.
Parameters:
glcm_matrix : array_like (3D, float)
The result of the GLCM extraction
prop_names : [bob.ip.base.GLCMProperty]
[default: None] A list of GLCM properties; either by value (int) or by name (str)
Returns:
prop_values : [array_like (1D, float)]
The GLCM properties for the given prop_names
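Each property reduces the 3D GLCM to one value per offset. As a rough illustration, the sketch below computes two of them using the standard Haralick definitions (angular second moment as the sum of squared probabilities, contrast as the sum of (i-j)^2 weighted probabilities). The helper names are hypothetical, and normalizing each offset slice to probabilities first is an assumption about preprocessing, not a statement about the library's behavior.

```python
def _probabilities(glcm, k):
    """Return the k-th offset slice scaled so that its entries sum to 1."""
    levels = len(glcm)
    total = sum(glcm[i][j][k] for i in range(levels) for j in range(levels))
    return [[glcm[i][j][k] / total for j in range(levels)] for i in range(levels)]


def angular_second_moment(glcm):
    """Sum of squared probabilities, one value per offset."""
    return [sum(v * v for row in _probabilities(glcm, k) for v in row)
            for k in range(len(glcm[0][0]))]


def contrast(glcm):
    """Sum of (i - j)^2 * p(i, j), one value per offset."""
    return [sum((i - j) ** 2 * v
                for i, row in enumerate(_probabilities(glcm, k))
                for j, v in enumerate(row))
            for k in range(len(glcm[0][0]))]


# Two gray levels, two offsets: the first slice is diagonal, the second anti-diagonal.
glcm = [[[2, 0], [0, 1]],
        [[0, 1], [2, 0]]]
print(angular_second_moment(glcm), contrast(glcm))
```

A diagonal slice (identical neighboring gray levels) has zero contrast, while the anti-diagonal slice yields the maximum contrast for two levels.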
array_like (1D) <– The thresholds of the quantization
Each element corresponds to the lower boundary of the particular quantization level. For example, array([ 0, 5, 10]) means quantization into 3 levels: input values in the range [0, 4] are quantized to level 0, input values in the range [5, 9] to level 1, and input values in the range [10, max_level] to level 2.
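The lookup described for the quantization table can be reproduced with a binary search over the lower boundaries. This is a minimal sketch; the function name quantize is hypothetical, and mapping values below the first boundary to level 0 is an assumption.

```python
from bisect import bisect_right


def quantize(value, thresholds):
    """Map a gray value to its level; thresholds are the lower level boundaries."""
    level = bisect_right(thresholds, value) - 1
    return max(level, 0)  # assumption: values below the first boundary go to level 0


thresholds = [0, 5, 10]
print([quantize(v, thresholds) for v in (0, 4, 5, 9, 10, 255)])
```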
Computes the sum_average property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘sum_average’ property
Computes the sum_entropy property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘sum_entropy’ property
Computes the sum_variance property
Parameters
input : array_like (3D, float)
The result of the extract() function
Returns
property : array_like (1D, float)
The resulting ‘sum_variance’ property
bool <– Tells whether the computed GLCM is symmetric (read and write access)
Bases: object
Enumeration that defines the properties of GLCM, to be used in bob.ip.base.GLCM.properties_by_name()
Possible values are:
The references from above are as follows:
Class Members:
Bases: object
Structure to describe a keypoint on the bob.ip.base.GaussianScaleSpace
It consists of a scale sigma, a location (y,x) and an orientation.
Constructor Documentation:
bob.ip.base.GSSKeypoint (sigma, location, [orientation])
Creates a GSS keypoint
Parameters:
sigma : float
The floating point value describing the scale of the keypoint
location : (float, float)
The location of the keypoint
orientation : float
[default: 0] The orientation of the keypoint (in degrees)
Class Members:
(float, float) <– The location (y, x) of the keypoint, with read and write access
float <– The orientation of the keypoint (in degrees), with read and write access
float <– The floating point value describing the scale of the keypoint, with read and write access
Bases: object
This is a companion structure to the bob.ip.base.GSSKeypoint
It provides additional practical information such as the octave and scale indices, the integer location (y, x), and possibly the scores associated with the detection step (peak_score and edge_score)
Constructor Documentation:
bob.ip.base.GSSKeypointInfo ([octave_index], [scale_index], [location], [peak_score], [edge_score])
Creates a GSS keypoint info structure
Parameters:
octave_index : int
[default: 0] The octave index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object
scale_index : int
[default: 0] The scale index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object
location : (int, int)
[default: (0, 0)] The integer unnormalized location (y, x) of the keypoint
peak_score : float
[default: 0] The peak score of the keypoint during the SIFT-like detection step
edge_score : float
[default: 0] The edge score of the keypoint during the SIFT-like detection step
Class Members:
float <– The edge score of the keypoint during the SIFT-like detection step, with read and write access
(int, int) <– The integer unnormalized location (y, x) of the keypoint, with read and write access
int <– The octave index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object, with read and write access
float <– The peak score of the keypoint during the SIFT-like detection step, with read and write access
int <– The scale index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object, with read and write access
Bases: object
Objects of this class, after configuration, can perform Gaussian filtering (smoothing) on images
The Gaussian smoothing is done by convolving the image with a vertical and a horizontal smoothing filter.
Constructor Documentation:
- bob.ip.base.Gaussian (sigma, [radius], [border])
- bob.ip.base.Gaussian (gaussian)
Constructs a new Gaussian filter
The Gaussian kernel is generated in both directions independently, using the given standard deviation and the given radius, where the size of the kernels is actually 2*radius+1. When the radius is not given or negative, it will be automatically computed as 3*sigma.
Note
Since the Gaussian smoothing is done by convolution, a larger radius will lead to longer execution time.
Parameters:
sigma : (double, double)
The standard deviation of the Gaussian along the y- and x-axes in pixels
radius : (int, int)
[default: (-1, -1) -> 3*sigma ] The radius of the Gaussian in both directions – the size of the kernel is 2*radius+1
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
gaussian : bob.ip.base.Gaussian
The Gaussian object to use for copy-construction
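The relation between sigma, radius and kernel size can be illustrated for one direction with a short stand-alone sketch. It is not the library's code: the function name gaussian_kernel is hypothetical, truncating 3*sigma to an integer and normalizing the weights to sum to 1 are assumptions.

```python
import math


def gaussian_kernel(sigma, radius=-1):
    """1D Gaussian kernel of size 2*radius+1, normalized to sum to 1."""
    if radius < 0:
        radius = int(3 * sigma)  # assumed rounding of the documented 3*sigma default
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


kernel = gaussian_kernel(1.0)      # default radius 3 -> 7 taps
print(len(kernel), len(gaussian_kernel(2.0, radius=2)))
```

Because the 2D Gaussian is separable, one such kernel is applied along the y-axis and one along the x-axis, which is why a larger radius directly increases the cost of each of the two convolutions.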
Class Members:
bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access
Smooths an image (2D/grayscale or 3D/color)
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be smoothed
dst : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as src
Returns:
dst : array_like (2D, float)
The resulting output image, which is the same as dst (if given)
array_like (1D, float) <– The values of the kernel in horizontal direction; read only access
array_like (1D, float) <– The values of the kernel in vertical direction; read only access
Bases: object
This class allows after configuration the generation of Gaussian Pyramids that can be used to extract SIFT features
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.GaussianScaleSpace (size, scales, octaves, octave_min, [sigma_n], [sigma0], [kernel_radius_factor], [border])
- bob.ip.base.GaussianScaleSpace (gss)
Constructs a new Gaussian scale space
Todo
Explain GaussianScaleSpace constructor in more detail.
Warning
The order of the parameters scales and octaves has changed compared to the old implementation, in order to keep it consistent with bob.ip.base.VLSIFT!
Parameters:
size : (int, int)
The height and width of the images to process
scales : int
The number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT features
octaves : int
The number of octaves of the pyramid
octave_min : int
The index of the minimum octave
sigma_n : float
[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scale
sigma0 : float
[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scale
kernel_radius_factor : float
[default: 4.] Factor used to determine the kernel radii: size=2*radius+1. For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
gss : bob.ip.base.GaussianScaleSpace
The GaussianScaleSpace object to use for copy-construction
Class Members:
Allocates a python list of arrays for the Gaussian pyramid
Returns:
pyramid : [array_like (3D, float)]
A list of output arrays in the size required to call process()
bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access
Returns the Gaussian at index/interval/scale i
Parameters:
index : int
The index of the scale for which the Gaussian should be retrieved
Returns:
gaussian : bob.ip.base.Gaussian
The Gaussian at the given index
float <– Factor used to determine the kernel radii: size=2*radius+1
For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})
int <– The index of the maximum octave, read only access
This is equal to octave_min+n_octaves-1.
int <– The index of the minimum octave, with read and write access
int <– The number of octaves of the pyramid, with read and write access
Computes a Gaussian Pyramid for an input 2D image
If given, the results are put in the output dst, which should already be allocated and of the correct size (using the allocate_output() method).
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be processed
dst : [array_like (3D, float)]
The Gaussian pyramid that should have been allocated with allocate_output()
Returns:
dst : [array_like (3D, float)]
The resulting Gaussian pyramid, if given it will be the same as the dst parameter
int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting SIFT features
Sets sigma0 such that there is no smoothing at the first scale of octave_min
float <– The value sigma0 of the standard deviation for the image of the first octave and first scale
float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access
(int, int) <– The shape of the images to process, with read and write access
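The bookkeeping behind these members (scales+3 images per octave, octave indices from octave_min to octave_min+n_octaves-1, and the kernel radius rule) can be sketched as follows. The function name pyramid_layout is hypothetical, and the per-scale standard deviation sigma0 * 2^(s/scales) follows the usual convention from [Lowe2004]; it is an assumption here, since the exact formula is not given above.

```python
import math


def pyramid_layout(scales, octaves, octave_min, sigma0=1.6, kernel_radius_factor=4.0):
    """Per-octave list of (sigma, kernel_radius) pairs for a SIFT-style pyramid."""
    layout = {}
    for octave in range(octave_min, octave_min + octaves):  # last index is octave_max
        levels = []
        for s in range(scales + 3):  # three additional scales per octave
            # Assumed Lowe-style spacing, relative to the octave's own pixel grid.
            sigma = sigma0 * 2.0 ** (s / scales)
            radius = math.ceil(kernel_radius_factor * sigma)  # kernel size is 2*radius+1
            levels.append((sigma, radius))
        layout[octave] = levels
    return layout


layout = pyramid_layout(scales=3, octaves=4, octave_min=-1)
print(sorted(layout), len(layout[0]), layout[0][0])
```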
Bases: object
Objects of this class, after configuration, can perform a geometric normalization of images
The geometric normalization is a combination of rotation, scaling and cropping an image.
Constructor Documentation:
- bob.ip.base.GeomNorm (rotation_angle, scaling_factor, crop_size, crop_offset)
- bob.ip.base.GeomNorm (other)
Constructs a GeomNorm object with the given scale, angle, size of the new image and transformation offset in the new image
When the GeomNorm is applied to an image, it is rotated and scaled such that it is visually rotated counter-clockwise (mathematically positive) by the given angle, i.e., to mimic the behavior of ImageMagick. Since the origin in the image is in the top-left corner, this means that the rotation is actually clockwise (mathematically negative). This also applies to the second version, which transforms landmarks: they will be rotated mathematically negative as well, to keep them consistent with the image.
Warning
The behavior of the landmark rotation has changed from Bob version 1.x, where the landmarks were mistakenly rotated mathematically positive.
Parameters:
rotation_angle : float
The rotation angle in degrees that should be applied
scaling_factor : float
The scale factor to apply
crop_size : (int, int)
The resolution of the processed images
crop_offset : (float, float)
The transformation offset in the processed images
other : GeomNorm
Another GeomNorm object to copy
Class Members:
(float, float) <– The transformation center in the processed image, with read and write access
(int, int) <– The size of the processed image, with read and write access
This function geometrically normalizes an image or a position in the image
The function rotates and scales the given image, or a position in image coordinates, such that the result is visually rotated and scaled with the rotation_angle and scaling_factor.
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D or 3D)
The input image to which GeomNorm should be applied
output : array_like (2D or 3D, float)
The output image, which must be of size crop_size
center : (float, float)
The transformation center in the given image; this will be placed to crop_offset in the output image
input_mask : array_like (bool, 2D or 3D)
An input mask of valid pixels before geometric normalization, must be of same size as input
output_mask : array_like (bool, 2D or 3D)
The output mask of valid pixels after geometric normalization, must be of same size as output
position : (float, float)
A position in input image space that will be transformed to output image space (might be outside of the crop area)
Returns:
transformed : (float, float)
The given position transformed to output image space
float <– The rotation angle, with read and write access
float <– The scale factor, with read and write access
Bases: object
Gradient ‘magnitude’ used
Possible values are:
Class Members:
Bases: object
Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors.
This implementation relies on the article of [Dalal2005]. A few remarks:
Constructor Documentation:
- bob.ip.base.HOG (image_size, [bins], [full_orientation], [cell_size], [cell_overlap], [block_size], [block_overlap])
- bob.ip.base.HOG (hog)
Constructs a new HOG extractor
Parameters:
image_size : (int, int)
The size of the input image to process.
bins : int
[default: 8] Dimensionality of a cell descriptor (i.e. the number of bins)
full_orientation : bool
[default: False] Whether the range [0,360] is used or only [0,180]
cell_size : (int, int)
[default: (4,4)] The size of a cell.
cell_overlap : (int, int)
[default: (0,0)] The overlap between cells.
block_size : (int, int)
[default: (4,4)] The size of a block (in terms of cells).
block_overlap : (int, int)
[default: (0,0)] The overlap between blocks (in terms of cells).
hog : bob.ip.base.HOG
Another HOG object to copy
Class Members:
int <– Dimensionality of a cell descriptor (i.e. the number of bins), with read and write access
bob.ip.base.BlockNorm <– The type of norm used for normalizing blocks, with read and write access
float <– Epsilon value used to avoid division by zero when normalizing the blocks, with read and write access
float <– Threshold used to perform the clipping during the block normalization, with read and write access
(int, int) <– Overlap between blocks (in terms of cells), with read and write access
(int, int) <– Size of a block (in terms of cells), with read and write access
(int, int) <– Overlap between cells, with read and write access
(int, int) <– Size of a cell, with read and write access
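The epsilon and clipping-threshold members listed above play the roles they have in the block normalization schemes of [Dalal2005]. The following is a minimal sketch of the L2 and L2-Hys schemes, not the library's code; the function names and the default values of `eps` and `threshold` are assumptions chosen for illustration.

```python
import math

def l2_normalize(block, eps=1e-10):
    """L2 normalization; eps avoids a division by zero for empty blocks."""
    norm = math.sqrt(sum(v * v for v in block) + eps * eps)
    return [v / norm for v in block]

def l2hys_normalize(block, eps=1e-10, threshold=0.2):
    """L2-Hys: L2-normalize, clip at `threshold`, then L2-normalize again."""
    clipped = [min(v, threshold) for v in l2_normalize(block, eps)]
    return l2_normalize(clipped, eps)

# The two dominant entries are clipped to the same value before the
# second normalization, so they end up equal.
normalized = l2hys_normalize([3.0, 4.0, 0.0, 0.0])
```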
Computes a Histogram of Gradients for a given ‘cell’
The inputs are the gradient magnitudes and the orientations for each pixel of the cell
Parameters:
magnitude : array_like (2D, float)
The input array with the gradient magnitudes
orientation : array_like (2D, float)
The input array with the orientations
histogram : array_like (1D, float)
[default = None] If given, the result will be written to this histogram; must be of size bins
Returns:
histogram : array_like (1D, float)
The resulting histogram; same as input histogram, if given
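As an illustration of what a cell histogram contains, the sketch below accumulates gradient magnitudes into orientation bins. It is a simplified version with hard bin assignment; the helper name `cell_histogram`, the use of degrees for the orientations, and the absence of interpolation between neighboring bins are all assumptions and may differ from the actual implementation.

```python
def cell_histogram(magnitude, orientation, bins=8, full_orientation=False):
    """Hypothetical helper: magnitude-weighted orientation histogram of a cell.
    Orientations are assumed to be given in degrees."""
    span = 360.0 if full_orientation else 180.0
    histogram = [0.0] * bins
    for mag_row, ori_row in zip(magnitude, orientation):
        for mag, ori in zip(mag_row, ori_row):
            # Map the orientation into [0, span) and pick the bin it falls into.
            index = int((ori % span) / span * bins) % bins
            histogram[index] += mag
    return histogram

hist = cell_histogram([[1.0, 2.0], [0.5, 0.5]],
                      [[0.0, 100.0], [179.0, 10.0]])
```

Each pixel contributes its gradient magnitude to exactly one bin, so the histogram entries sum to the total magnitude of the cell.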
Disable block normalization
This is performed by setting parameters such that the cells are not further processed, i.e.:
Extract the HOG descriptors
This extracts HOG descriptors from the input image. The output is 3D, the first two dimensions being the y- and x- indices of the block, and the last one the index of the bin (among the concatenated cell histograms for this block).
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image to extract HOG features from
output : array_like (3D, float)
[default: None] If given, the container to extract the HOG features to; must be of size output_shape()
Returns:
output : array_like (3D, float)
The resulting HOG features, same as parameter output, if given
bool <– Whether the range [0,360] is used or not ([0,180] otherwise), with read and write access
(int, int) <– The size of the input image to process, with read and write access
bob.ip.base.GradientMagnitude <– Type of the magnitude to consider for the descriptors, with read and write access
Bases: object
A class that extracts local binary patterns in various types
The implementation is based on [Atanasoaei2012], where all the different types of LBP features are defined in more detail.
Constructor Documentation:
- bob.ip.base.LBP (neighbors, [radius], [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (neighbors, radius_y, radius_x, [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (neighbors, block_size, [block_overlap], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
- bob.ip.base.LBP (lbp)
- bob.ip.base.LBP (hdf5)
Creates an LBP extractor with the given parametrization
Basically, the LBP configuration can be split into three parts.
- Which pixels are compared how:
- The number of neighbors (might be 4, 8 or 16)
- Circular or rectangular offset positions around the center, or even Multi-Block LBP (MB-LBP)
- Compare the pixels to the center pixel or to the average
- How to generate the bit strings from the pixels (this is handled by the elbp_type parameter):
- 'regular': Choose one bit for each comparison of the neighboring pixel with the central pixel
- 'transitional': Compare only the neighboring pixels and skip the central one
- 'direction-coded': Compute a 2-bit code for four directions
- How to cluster the generated bit strings to compute the final LBP code:
- uniform: Only uniform LBP codes (with at most two bit changes between 0 and 1) are considered; all other strings are combined into one LBP code
- rotation_invariant: Rotation invariant LBP codes are generated, e.g., bit strings 00110000 and 00000110 will lead to the same LBP code
This clustering is done using a look-up-table, which you can also set yourself using the look_up_table attribute. The maximum code that will be generated can be read from the max_label attribute.
Finally, the border handling of the image can be selected. With the 'shrink' option, no LBP code is computed for the border pixels and the resulting image is 2 * radius or 3 * block_size - 1 pixels smaller in both directions, see lbp_shape(). The 'wrap' option will wrap around the border and no truncation is performed.
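The size reduction stated above can be written down as a short sketch. The helper `lbp_shape_sketch` is hypothetical and only encodes the two formulas from the text; it ignores block overlap and integral-image inputs, which the real lbp_shape() may take into account.

```python
def lbp_shape_sketch(image_shape, radius=1, block_size=None, border_handling="shrink"):
    """Hypothetical helper: output shape of the LBP image for a given input shape."""
    if border_handling == "wrap":
        # Wrapping around the border keeps the full resolution.
        return tuple(image_shape)
    if block_size is not None:
        # Multi-block LBP: 3 * block_size - 1 pixels are lost per direction.
        return tuple(s - (3 * b - 1) for s, b in zip(image_shape, block_size))
    # Regular LBP: 2 * radius pixels are lost per direction.
    return tuple(s - 2 * radius for s in image_shape)

regular = lbp_shape_sketch((10, 12), radius=1)
multi_block = lbp_shape_sketch((10, 12), block_size=(2, 2))
wrapped = lbp_shape_sketch((10, 12), border_handling="wrap")
```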
Note
To compute MB-LBP features, it is possible to compute an integral image beforehand to speed up the calculation.
Parameters:
neighbors : int
The number of neighboring pixels that should be taken into account; possible values: 4, 8, 16
radius : float
[default: 1.] The radius of the LBP in both vertical and horizontal direction together
radius_y, radius_x : float
The radius of the LBP in both vertical and horizontal direction separately
block_size : (int, int)
If set, multi-block LBP’s with the given block size will be extracted
block_overlap : (int, int)
[default: (0, 0)] Multi-block LBP’s with the given block overlap will be extracted
circular : bool
[default: False] Extract neighbors on a circle or on a square?
to_average : bool
[default: False] Compare the neighbors to the average of the pixels instead of the central pixel?
add_average_bit : bool
[default: False] (only useful if to_average is True) Add another bit to compare the central pixel to the average of the pixels?
uniform : bool
[default: False] Extract uniform LBP features?
rotation_invariant : bool
[default: False] Extract rotation invariant LBP features?
elbp_type : str
[default: 'regular'] Which type of LBP codes should be computed; possible values: (‘regular’, ‘transitional’, ‘direction-coded’), see elbp_type
border_handling : str
[default: 'shrink'] How should the borders of the image be treated; possible values: (‘shrink’, ‘wrap’), see border_handling
lbp : bob.ip.base.LBP
Another LBP object to copy
hdf5 : bob.io.base.HDF5File
An HDF5 file to read the LBP configuration from
Class Members:
bool <– Should the bit for the comparison of the central pixel with the average be added as well (read and write access)?
(int, int) <– The block overlap in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access
Note
The block_overlap must be smaller than the block_size. To set both the block size and the block overlap at the same time, use the set_block_size_and_overlap() function.
(int, int) <– The block size in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access
str <– The type of border handling that should be applied (read and write access)
Possible values are: (‘shrink’, ‘wrap’)
bool <– Should circular or rectangular LBP’s be extracted (read and write access)?
str <– The type of LBP bit string that should be extracted (read and write access)
Possible values are: (‘regular’, ‘transitional’, ‘direction-coded’)
This function extracts the LBP features from an image
LBP features can be extracted either for the whole image, or at a single location in the image. When MB-LBP features are extracted, an integral image is computed to speed up the calculation. The integral image can also be computed before this function is called and passed to this function directly. In this case, please set the is_integral_image parameter to True.
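The benefit of an integral image for multi-block features is that the sum over any rectangular block can be read from four table entries. The sketch below shows the general technique in plain Python; the helper names and the layout with an extra leading row and column of zeros are assumptions and need not match the integral image format that the library expects.

```python
def integral_image(image):
    """Summed-area table with one extra leading row and column of zeros."""
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def block_sum(ii, top, left, height, width):
    """Sum of the pixels in a block, using four look-ups in the table."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
total = block_sum(ii, 1, 1, 2, 2)  # sum of the lower-right 2x2 block
```

Dividing such block sums by the block area gives the block averages that take the place of single pixel values in MB-LBP.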
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image for which LBP features should be extracted
position : (int, int)
The position in the input image where the LBP code should be extracted; make sure that the position lies inside the valid range defined by offset
output : array_like (2D, uint16)
The output image, which needs to be of shape lbp_shape()
is_integral_image : bool
[default: False] Is the given input image an integral image?
Returns:
output : array_like (2D, uint16)
The resulting image of LBP codes
code : uint16
The resulting LBP code at the given position in the image
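To make the notion of an LBP code at a single position concrete, here is a minimal sketch of a 'regular' 8-neighbor code with a rectangular radius of 1. The helper name `lbp_code`, the neighbor ordering, and the use of `>=` for the comparison are assumptions; the library may order the bits differently.

```python
def lbp_code(image, y, x):
    """Hypothetical helper: 8-neighbor LBP code at position (y, x)."""
    # Neighbor offsets (dy, dx), starting at the top-left and going clockwise;
    # this ordering is an assumption and determines the bit positions.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[y][x]
    code = 0
    for dy, dx in offsets:
        bit = 1 if image[y + dy][x + dx] >= center else 0
        code = (code << 1) | bit
    return code

image = [[9, 1, 9],
         [1, 5, 1],
         [9, 1, 9]]
code = lbp_code(image, 1, 1)  # bit string 10101010
```

The position (1, 1) is the only valid one in this 3x3 image, which corresponds to the 'shrink' border handling.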
bool <– Is the current configuration of the LBP extractor set up to extract Multi-Block LBP’s (read access only)?
This function returns the shape of the LBP image for the given image
In case the border_handling is 'shrink' the image resolution will be reduced, depending on the LBP configuration. This function will return the desired output shape for the given input image or input shape.
Parameters:
input : array_like (2D)
The input image for which LBP features should be extracted
shape : (int, int)
The shape of the input image for which LBP features should be extracted
is_integral_image : bool
[default: False] Is the given image (shape) an integral image?
Returns:
lbp_shape : (int, int)
The shape of the LBP image that is required in a call to extract()
Loads the parametrization of the LBP extractor from the given HDF5 file
Parameters:
hdf5 : bob.io.base.HDF5File
An HDF5 file opened for reading
array_like (1D, uint16) <– The look up table that defines, which bit string is converted into which LBP code (read and write access)
Depending on the values of uniform and rotation_invariant, bit strings might be converted into different LBP codes. Since this attribute is writable, you can define a look-up-table for LBP codes yourself.
Warning
For the time being, the look up tables are not saved by the save() function!
int <– The number of different LBP codes that are extracted (read access only)
The codes themselves are uint16 numbers in the range [0, max_label - 1]. Depending on the values of uniform and rotation_invariant, bit strings might be converted into different LBP codes.
(int, int) <– The offset in the image, where the first LBP code can be extracted (read access only)
Note
When extracting LBP features from an image with a specific shape, positions must lie in the half-open range [offset, shape - offset). Otherwise, an exception will be raised.
int <– The number of neighbors (usually 4, 8 or 16), with read and write access
(float, float) <– The radii in both vertical and horizontal direction of the elliptical or rectangular LBP extractor, with read and write access
float <– The radius of the round or square LBP extractor, with read and write access
array_like (2D, float) <– The list of neighbor positions, with which the central pixel is compared (read access only)
The list is defined as relative positions, where the central pixel is considered to be at (0, 0).
bool <– Should rotation invariant LBP patterns be extracted (read and write access)?
Rotation invariant LBP codes collect all patterns that have the same bit string up to circular shifts. Hence, 00111000 and 10000011 will result in the same LBP code.
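One common way to group bit strings that differ only by a circular shift is to map each of them to the smallest value among all of its rotations. The sketch below does this for illustration; the helper name is hypothetical, and the library assigns its own labels through the look-up table rather than using this minimum directly.

```python
def rotation_invariant_code(bits, neighbors=8):
    """Hypothetical helper: smallest value among all circular shifts of `bits`."""
    mask = (1 << neighbors) - 1
    best = bits
    for _ in range(neighbors - 1):
        # Rotate right by one position within `neighbors` bits.
        bits = ((bits >> 1) | (bits << (neighbors - 1))) & mask
        best = min(best, bits)
    return best

a = rotation_invariant_code(0b00111000)
b = rotation_invariant_code(0b10000011)
```

Both example patterns from the text contain three adjacent ones, so they share the representative 00000111.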
Saves the parametrization of the LBP extractor to the given HDF5 file
Warning
For the time being, the look-up-table is not saved. If you have set the look_up_table by hand, it is lost.
Parameters:
hdf5 : bob.io.base.HDF5File
An HDF5 file open for writing
This function sets the block size and the block overlap for MB-LBP features at the same time
Parameters:
block_size : (int, int)
Multi-block LBP’s with the given block size will be extracted
block_overlap : (int, int)
Multi-block LBP’s with the given block overlap will be extracted
bool <– Should the neighboring pixels be compared with the average of all pixels, or to the central one (read and write access)?
bool <– Should uniform LBP patterns be extracted (read and write access)?
Uniform LBP patterns are those bit strings, where only up to two changes from 0 to 1 and vice versa are allowed. Hence, 00111000 is a uniform pattern, while 00110011 is not. All non-uniform bit strings will be collected in a single LBP code.
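The uniformity criterion can be checked by counting the bit changes along the circular bit string. The following sketch does so by comparing the pattern with a copy of itself rotated by one position; the helper name `is_uniform` is chosen for illustration only.

```python
def is_uniform(bits, neighbors=8):
    """Hypothetical helper: True if the circular bit string has at most
    two changes between 0 and 1."""
    mask = (1 << neighbors) - 1
    rotated = ((bits >> 1) | (bits << (neighbors - 1))) & mask
    # Each set bit in the XOR marks one position where neighboring bits differ.
    transitions = bin(bits ^ rotated).count("1")
    return transitions <= 2

uniform_example = is_uniform(0b00111000)
non_uniform_example = is_uniform(0b00110011)
```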
Bases: object
A class that extracts local binary patterns (LBP) in three orthogonal planes (TOP)
The LBPTop class is designed to calculate the LBP-Top coefficients given a set of images. The workflow is as follows:
Todo
UPDATE as this is not true
Constructor Documentation:
bob.ip.base.LBPTop (xy, xt, yt)
Constructs a new LBPTop object starting from the algorithm configuration
Parameters:
xy : bob.ip.base.LBP
The 2D LBP-XY plane configuration
xt : bob.ip.base.LBP
The 2D LBP-XT plane configuration
yt : bob.ip.base.LBP
The 2D LBP-YT plane configuration
Class Members:
This function processes the given set of images and extracts the three orthogonal planes
The given 3D input array represents a set of gray-scale images; the function returns (by argument) the three calculated LBP planes. The 3D array has to be arranged in this way:
The central pixel is the point where the LBP planes intersect/have to be calculated from.
Parameters:
input : array_like (3D)
The input set of gray-scale images for which LBPTop features should be extracted
xy, xt, yt : array_like (3D, uint16)
The result of the LBP operator in the XY, XT and YT plane (frame), for the central frame of the input array
bob.ip.base.LBP <– The 2D LBP-XT plane configuration
bob.ip.base.LBP <– The 2D LBP-XY plane configuration
bob.ip.base.LBP <– The 2D LBP-YT plane configuration
Bases: object
This class allows, after configuration, applying the Multiscale Retinex algorithm to images
More information about this algorithm can be found in [Jobson1997].
Constructor Documentation:
- bob.ip.base.MultiscaleRetinex ([scales], [size_min], [size_step], [sigma], [border])
- bob.ip.base.MultiscaleRetinex (msrx)
Creates a MultiscaleRetinex object
Todo
Add documentation for MultiscaleRetinex
Parameters:
scales : int
[default: 1] The number of scales (bob.ip.base.Gaussian)
size_min : int
[default: 1] The radius of the kernel of the smallest bob.ip.base.Gaussian
size_step : int
[default: 1] The step used to set the kernel size of other weighted Gaussians: size_s = 2 * (size_min + s * size_step) + 1
sigma : double
[default: 2.] The standard deviation of the kernel of the smallest weighted Gaussian; other sigmas: sigma_s = sigma * (size_min + s * size_step) / size_min
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
msrx : bob.ip.base.MultiscaleRetinex
The MultiscaleRetinex object to use for copy-construction
Class Members:
bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access
Applies the Multiscale Retinex algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double
Todo
Check if this documentation is correct (seems to be copied from bob.ip.base.SelfQuotientImage)
If given, the dst array should have the type float and the same size as the src array.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be processed
dst : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as src
Returns:
dst : array_like (2D, float)
The resulting output image, which is the same as dst (if given)
int <– The number of scales (Gaussian); with read and write access
float <– The variance of the kernel of the smallest weighted Gaussian (variance_s = sigma2 * (size_min+s*size_step)/size_min); with read and write access
int <– The radius (size=2*radius+1) of the kernel of the smallest weighted Gaussian; with read and write access
int <– The step used to set the kernel size of other Weighted Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access
Bases: object
This class allows after configuration the extraction of SIFT descriptors
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.SIFT (size, scales, octaves, octave_min, [sigma_n], [sigma0], [contrast_thres], [edge_thres], [norm_thres], [kernel_radius_factor], [border])
- bob.ip.base.SIFT (sift)
Creates an object that allows the extraction of SIFT descriptors
Todo
Explain SIFT constructor in more detail.
Warning
The order of the parameters scales and octaves has changed compared to the old implementation, in order to keep it consistent with bob.ip.base.VLSIFT!
Parameters:
size : (int, int)
The height and width of the images to process
scales : int
The number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT features
octaves : int
The number of octaves of the pyramid
octave_min : int
The index of the minimum octave
sigma_n : float
[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scale
sigma0 : float
[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scale
contrast_thres : float
[default: 0.03] The contrast threshold used during keypoint detection
edge_thres : float
[default: 10.] The edge threshold used during keypoint detection
norm_thres : float
[default: 0.2] The norm threshold used during descriptor normalization
kernel_radius_factor : float
[default: 4.] Factor used to determine the kernel radii: size=2*radius+1. For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
sift : bob.ip.base.SIFT
The SIFT object to use for copy-construction
Class Members:
int <– The number of bins for the descriptor, with read and write access
int <– The number of blocks for the descriptor, with read and write access
bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access
Computes SIFT descriptor for a 2D/grayscale image, at the given keypoints
If given, the results are put in the output dst, which should be of type float and allocated with the shape returned by the output_shape() method.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be processed
keypoints : [bob.ip.base.GSSKeypoint]
The keypoints at which the descriptors should be computed
dst : [array_like (4D, float)]
The descriptors that should have been allocated in size output_shape()
Returns:
dst : [array_like (4D, float)]
The resulting descriptors, if given it will be the same as the dst parameter
float <– The contrast threshold used during keypoint detection
float <– The edge threshold used during keypoint detection
float <– The Gaussian window size for the descriptor
float <– Factor used to determine the kernel radii size=2*radius+1
For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})
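The relation between the standard deviation, the kernel radius and the kernel size can be computed directly from the two formulas given above. The helper name `gaussian_kernel_size` is hypothetical; the default factor of 4 is the documented default of kernel_radius_factor.

```python
import math

def gaussian_kernel_size(sigma, kernel_radius_factor=4.0):
    """Hypothetical helper: radius and size of the Gaussian kernel for `sigma`."""
    radius = int(math.ceil(kernel_radius_factor * sigma))
    size = 2 * radius + 1
    return radius, size

# With the documented default sigma0 of 1.6, the product 4 * 1.6 = 6.4
# is rounded up to a radius of 7.
radius, size = gaussian_kernel_size(1.6)
```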
float <– The magnification factor for the descriptor
float <– The magnification factor for the descriptor
float <– The norm threshold used during descriptor normalization
int <– The index of the maximum octave, read only access
This is equal to octave_min+octaves-1.
int <– The index of the minimum octave, with read and write access
int <– The number of octaves of the pyramid, with read and write access
Returns the output shape for the given number of input keypoints
Parameters:
keypoints : int
The number of keypoints that you want to retrieve SIFT features for
Returns:
shape : (int, int, int, int)
The shape of the output array required to call compute_descriptor()
int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting SIFT features
Sets sigma0 such that there is no smoothing at the first scale of octave_min
float <– The value sigma0 of the standard deviation for the image of the first octave and first scale
float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access
(int, int) <– The shape of the images to process, with read and write access
Bases: object
This class allows after configuration to apply the Self Quotient Image algorithm to images
Details of the Self Quotient Image algorithm are described in [Wang2004].
Constructor Documentation:
- bob.ip.base.SelfQuotientImage ([scales], [size_min], [size_step], [sigma], [border])
- bob.ip.base.SelfQuotientImage (sqi)
Creates an object to preprocess images with the Self Quotient Image algorithm
Todo
explain SelfQuotientImage constructor
Warning
Compared to the last Bob version, here the sigma parameter is the standard deviation and not the variance. This implies that the WeightedGaussian pyramid is different, see https://github.com/bioidiap/bob.ip.base/issues/1.
Parameters:
scales : int
[default: 1] The number of scales (bob.ip.base.WeightedGaussian)
size_min : int
[default: 1] The radius of the kernel of the smallest bob.ip.base.WeightedGaussian
size_step : int
[default: 1] The step used to set the kernel size of other weighted Gaussians: size_s = 2 * (size_min + s * size_step) + 1
sigma : double
[default: math.sqrt(2.)] The standard deviation of the kernel of the smallest weighted Gaussian; other sigmas: sigma_s = sigma * (size_min + s * size_step) / size_min
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
sqi : bob.ip.base.SelfQuotientImage
The SelfQuotientImage object to use for copy-construction
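The two formulas for size_s and sigma_s define the kernel of every scale in the pyramid. The sketch below evaluates them for each scale; the helper name `scale_parameters` and the assumption that the scale index s starts at 0 are not taken from the documentation.

```python
import math

def scale_parameters(scales=1, size_min=1, size_step=1, sigma=math.sqrt(2.0)):
    """Hypothetical helper: (kernel size, standard deviation) for each scale s."""
    parameters = []
    for s in range(scales):  # assumption: s runs over 0, 1, ..., scales - 1
        radius = size_min + s * size_step
        size_s = 2 * radius + 1
        sigma_s = sigma * radius / size_min
        parameters.append((size_s, sigma_s))
    return parameters

parameters = scale_parameters(scales=3, size_min=1, size_step=1, sigma=2.0)
```

Under these assumptions the smallest scale uses exactly the given sigma, and the standard deviation grows in proportion to the kernel radius.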
Class Members:
bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access
Applies the Self Quotient Image algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double
If given, the dst array should have the type float and the same size as the src array.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be processed
dst : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as src
Returns:
dst : array_like (2D, float)
The resulting output image, which is the same as dst (if given)
int <– The number of scales (Weighted Gaussian); with read and write access
float <– The standard deviation of the kernel of the smallest weighted Gaussian (sigma_s = sigma * (size_min+s*size_step)/size_min); with read and write access
int <– The radius (size=2*radius+1) of the kernel of the smallest weighted Gaussian; with read and write access
int <– The step used to set the kernel size of other Weighted Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access
Bases: object
Objects of this class, after configuration, can preprocess images
It does this using the method described by Tan and Triggs in the paper [TanTriggs2007].
Constructor Documentation:
- bob.ip.base.TanTriggs ([gamma], [sigma0], [sigma1], [radius], [threshold], [alpha], [border])
- bob.ip.base.TanTriggs (tan_triggs)
Constructs a new Tan and Triggs filter
Todo
Explain TanTriggs constructor in more detail.
Parameters:
gamma : float
[default: 0.2] The value of gamma for the gamma correction
sigma0 : float
[default: 1.] The standard deviation of the inner Gaussian
sigma1 : float
[default: 2.] The standard deviation of the outer Gaussian
radius : int
[default: 2] The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1)
threshold : float
[default: 10.] The threshold used for the contrast equalization
alpha : float
[default: 0.1] The alpha value used for the contrast equalization
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
tan_triggs : bob.ip.base.TanTriggs
The TanTriggs object to use for copy-construction
Class Members:
float <– The alpha value used for the contrast equalization, with read and write access
bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access
float <– The value of gamma for the gamma correction, with read and write access
array_like (2D, float) <– The values of the DoG filter; read only access
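A Difference of Gaussians kernel of the documented size 2*radius+1 can be sketched as the difference between an inner and an outer Gaussian. The helper names below are hypothetical, and normalizing each Gaussian to unit sum before subtracting is an assumption; the values stored by the library may be scaled differently.

```python
import math

def gaussian_kernel(radius, sigma):
    """2D Gaussian of size 2*radius+1, normalized to sum to one."""
    values = [[math.exp(-(y * y + x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in values)
    return [[v / total for v in row] for row in values]

def dog_kernel(radius=2, sigma0=1.0, sigma1=2.0):
    """Inner Gaussian (sigma0) minus outer Gaussian (sigma1)."""
    inner = gaussian_kernel(radius, sigma0)
    outer = gaussian_kernel(radius, sigma1)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(inner, outer)]

kernel = dog_kernel()  # 5x5 kernel for the documented default parameters
```

With this normalization the kernel entries sum to zero, so constant image regions are mapped to zero and only local contrast remains.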
Preprocesses a 2D/grayscale image using the algorithm from Tan and Triggs.
The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D array of type float64 and allocated in the same size as the input. If the destination array is not given, it is generated in the required size.
Note
The __call__() function is an alias for this method.
Parameters:
input : array_like (2D)
The input image which should be normalized
output : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as input
Returns:
output : array_like (2D, float)
The resulting output image, which is the same as output (if given)
int <– The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1)
float <– The standard deviation of the inner Gaussian, with read and write access
float <– The standard deviation of the outer Gaussian, with read and write access
float <– The threshold used for the contrast equalization, with read and write access
Bases: object
Computes dense SIFT features using the VLFeat library
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.VLDSIFT (size, [step], [block_size])
- bob.ip.base.VLDSIFT (sift)
Creates an object that allows the extraction of VLDSIFT descriptors
Todo
Explain VLDSIFT constructor in more detail.
Parameters:
size : (int, int)
The height and width of the images to process
step : (int, int)
[default: (5, 5)] The step along the y- and x-axes
block_size : (int, int)
[default: (5, 5)] The block size along the y- and x-axes
sift : bob.ip.base.VLDSIFT
The VLDSIFT object to use for copy-construction
Class Members:
(int, int) <– The block size in both directions, with read and write access
Computes the dense SIFT features from an input image, using the VLFeat library
If given, the results are put in the output dst, which should be of type float and allocated with the shape returned by the output_shape() method.
Todo
Describe the output of the VLDSIFT.extract() method in more detail.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D, float32)
The input image which should be processed
dst : [array_like (2D, float32)]
The descriptors that should have been allocated in size output_shape()
Returns:
dst : array_like (2D, float32)
The resulting descriptors, if given it will be the same as the dst parameter
Returns the output shape for the current setup
The output shape is a 2-element tuple consisting of the number of keypoints for the current size, and the size of the descriptors
Returns:
shape : (int, int)
The shape of the output array required to call extract()
(int, int) <– The shape of the images to process, with read and write access
(int, int) <– The step along both directions, with read and write access
bool <– Whether to use a flat window or not (to boost the processing time), with read and write access
float <– The window size, with read and write access
Bases: object
Computes SIFT features using the VLFeat library
For details, please read [Lowe2004].
Constructor Documentation:
- bob.ip.base.VLSIFT (size, scales, octaves, octave_min, [peak_thres], [edge_thres], [magnif])
- bob.ip.base.VLSIFT (sift)
Creates an object that allows the extraction of VLSIFT descriptors
Todo
Explain VLSIFT constructor in more detail.
Parameters:
size : (int, int)
The height and width of the images to process
scales : int
The number of intervals in each octave
octaves : int
The number of octaves of the pyramid
octave_min : int
The index of the minimum octave
peak_thres : float
[default: 0.03] The peak threshold (minimum amount of contrast to accept a keypoint)
edge_thres : float
[default: 10.] The edge rejection threshold used during keypoint detection
magnif : float
[default: 3.] The magnification factor (descriptor size is determined by multiplying the keypoint scale by this factor)
sift : bob.ip.base.VLSIFT
The VLSIFT object to use for copy-construction
Class Members:
float <– The edge rejection threshold used during keypoint detection, with read and write access
Computes the SIFT features from an input image
A keypoint is specified by a 3- or 4-tuple (y, x, sigma, [orientation]), stored as one row of the given keypoints parameter. If the keypoints are not given, they are detected first. It returns a list of descriptors, one for each keypoint and orientation. The first four values are the x, y, sigma and orientation of the keypoint. The 128 remaining values define the descriptor.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D, uint8)
The input image which should be processed
keypoints : array_like (2D, float)
The keypoints at which the descriptors should be computed
Returns:
dst : [array_like (1D, float)]
The resulting descriptors; the first four values are the x, y, sigma and orientation of the keypoints, the 128 remaining values define the descriptor
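Since each returned array packs the keypoint and the descriptor into one vector, it is usually convenient to separate the two parts. The sketch below does this for a plain sequence of 132 values; the helper name `split_descriptor` and the dictionary layout are chosen for illustration and are not part of the library.

```python
def split_descriptor(descriptor):
    """Hypothetical helper: split one result vector into keypoint and descriptor."""
    if len(descriptor) != 4 + 128:
        raise ValueError("expected 4 keypoint values followed by 128 descriptor values")
    x, y, sigma, orientation = descriptor[:4]
    return {
        "x": x,
        "y": y,
        "sigma": sigma,
        "orientation": orientation,
        "descriptor": list(descriptor[4:]),
    }

# Dummy vector standing in for one entry of the returned list.
example = [12.0, 7.5, 1.6, 0.0] + [0.0] * 128
parts = split_descriptor(example)
```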
float <– The magnification factor for the descriptor
int <– The index of the maximum octave, read only access
This is equal to octave_min+octaves-1.
int <– The index of the minimum octave, with read and write access
int <– The number of octaves of the pyramid, with read and write access
float <– The peak threshold (minimum amount of contrast to accept a keypoint), with read and write access
int <– The number of intervals of the pyramid, with read and write access
Three additional scales will be computed in practice, as this is required for extracting VLSIFT features
(int, int) <– The shape of the images to process, with read and write access
Bases: object
This class performs weighted gaussian smoothing (anisotropic filtering)
In particular, it is used by the Self Quotient Image (SQI) algorithm bob.ip.base.SelfQuotientImage.
Constructor Documentation:
- bob.ip.base.WeightedGaussian (sigma, [radius], [border])
- bob.ip.base.WeightedGaussian (weighted_gaussian)
Constructs a new weighted Gaussian filter
Todo
explain WeightedGaussian constructor
Warning
Compared to the last Bob version, here the sigma parameter is the standard deviation and not the variance.
Parameters:
sigma : (double, double)
The standard deviation of the WeightedGaussian along the y- and x-axes in pixels
radius : (int, int)
[default: (-1, -1) -> 3*sigma ] The radius of the Gaussian in both directions; the size of the kernel is 2*radius+1
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
weighted_gaussian : bob.ip.base.WeightedGaussian
The weighted Gaussian object to use for copy-construction
Class Members:
bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access
Smooths an image (2D/grayscale or 3D/color)
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be smoothed
dst : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as src
Returns:
dst : array_like (2D, float)
The resulting output image, which is the same as dst (if given)
Bases: object
A Wiener filter
The Wiener filter is implemented after the description in Part 3.4.3 of [Szeliski2010]
Constructor Documentation:
- bob.ip.base.Wiener (size, Pn, [variance_threshold])
- bob.ip.base.Wiener (Ps, Pn, [variance_threshold])
- bob.ip.base.Wiener (data, [variance_threshold])
- bob.ip.base.Wiener (filter)
- bob.ip.base.Wiener (hdf5)
Constructs a new Wiener filter
A Wiener filter can be constructed in several ways, one for each of the signatures above:
- Constructs a new Wiener filter dedicated to images of the given size. The filter is initialized with zero values
- Constructs a new Wiener filter from a set of variance estimates Ps and a noise level Pn
- Trains the new Wiener filter with the given data
- Copy constructs the given Wiener filter
- Reads the Wiener filter from bob.io.base.HDF5File
Parameters:
Ps : array_like<float, 2D>
Variance Ps estimated at each frequency
Pn : float
Noise level Pn
size : (int, int)
The shape of the newly created empty filter
data : array_like<float, 3D>
The training data, with dimensions (#data, height, width)
variance_threshold : float
[default: 1e-8] Variance flooring threshold (i.e., the minimum variance value)
filter : bob.ip.base.Wiener
The Wiener filter object to use for copy-construction
hdf5 : bob.io.base.HDF5File
The HDF5 file object to read the Wiener filter from
Class Members:
float <– Noise level Pn
array_like <float, 2D> <– Variance Ps estimated at each frequency
Filters the input image
If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.
Note
The __call__() function is an alias for this method.
Parameters:
src : array_like (2D)
The input image which should be filtered
dst : array_like (2D, float)
[default: None] If given, the output will be saved into this image; must be of the same shape as src
Returns:
dst : array_like (2D, float)
The resulting output image, which is the same as dst (if given)
Checks whether this Wiener filter is approximately equal to the other one
The optional values r_epsilon and a_epsilon refer to the relative and absolute precision, similarly to numpy.allclose().
Parameters:
other : bob.ip.base.Wiener
The other Wiener filter to compare with
r_epsilon : float
[Default: 1e-5] The relative precision
a_epsilon : float
[Default: 1e-8] The absolute precision
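A minimal plain-Python sketch of such a tolerance check, assuming the same rule as numpy.allclose, i.e. |a - b| <= a_epsilon + r_epsilon * |b|. The helper names values_close and grids_close are hypothetical, and the sketch compares nested lists instead of calling bob.ip.base; whether the library applies exactly this rule is an assumption.

```python
def values_close(a, b, r_epsilon=1e-5, a_epsilon=1e-8):
    """Tolerance rule of numpy.allclose for a single pair of values."""
    return abs(a - b) <= a_epsilon + r_epsilon * abs(b)


def grids_close(first, second, r_epsilon=1e-5, a_epsilon=1e-8):
    """Compare two 2D lists element by element; shapes must match."""
    if len(first) != len(second):
        return False
    for row_a, row_b in zip(first, second):
        if len(row_a) != len(row_b):
            return False
        for a, b in zip(row_a, row_b):
            if not values_close(a, b, r_epsilon, a_epsilon):
                return False
    return True


# A tiny deviation passes with the default tolerances, a large one does not.
same = grids_close([[0.5, 0.25]], [[0.5 + 1e-7, 0.25]])
different = grids_close([[0.5, 0.25]], [[0.6, 0.25]])
```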
Loads the configuration of the Wiener filter from the given HDF5 file
Parameters:
hdf5 : bob.io.base.HDF5File
An HDF5 file opened for reading
Saves the configuration of the Wiener filter to the given HDF5 file
Parameters:
hdf5 : bob.io.base.HDF5File
An HDF5 file open for writing
(int, int) <– The size of the filter
float <– Variance flooring threshold
array_like<2D, float> <– The Wiener filter W (W=1/(1+Pn/Ps)) (read-only)
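The relation W=1/(1+Pn/Ps) can be illustrated with a short plain-Python sketch. It assumes that the variance flooring threshold is applied by raising every Ps value below the threshold to the threshold before the division; the function name wiener_weights is hypothetical and the code does not call bob.ip.base.

```python
def wiener_weights(ps, pn, variance_threshold=1e-8):
    """Compute W = 1 / (1 + Pn / Ps) for every entry of a 2D list Ps."""
    weights = []
    for row in ps:
        out_row = []
        for value in row:
            # Assumed flooring: variances below the threshold are raised to it,
            # which also avoids a division by zero.
            floored = max(value, variance_threshold)
            out_row.append(1.0 / (1.0 + pn / floored))
        weights.append(out_row)
    return weights


ps = [[4.0, 1.0], [0.25, 0.0]]
w = wiener_weights(ps, pn=1.0)
```

Frequencies with a large signal variance relative to the noise level receive weights close to 1, while frequencies dominated by noise are attenuated towards 0.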
Get the angle needed to level out (horizontally) two points.
Parameters
Returns
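Since the parameter list is not reproduced here, the following is only a sketch of the underlying geometry, not the library's signature. It assumes that points are given as (y, x) pairs and that the angle is reported in degrees; the function name leveling_angle, the argument order and the sign convention are all assumptions.

```python
import math


def leveling_angle(left, right):
    """Angle in degrees of the line from `left` to `right`; points are (y, x)."""
    dy = right[0] - left[0]
    dx = right[1] - left[1]
    return math.degrees(math.atan2(dy, dx))


# Two points on the same row are already level.
flat = leveling_angle((10, 20), (10, 60))
# Equal offsets in y and x give a line tilted by 45 degrees.
tilted = leveling_angle((10, 20), (50, 60))
```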
Performs a block decomposition of a 2D array/image
If given, the output 3D or 4D destination array should be allocated and of the correct size, see bob.ip.base.block_output_shape().
Parameters:
input : array_like (2D)
The source image to decompose into blocks
block_size : (int, int)
The size of the blocks in which the image is decomposed
block_overlap : (int, int)
[default: (0, 0)] The overlap of the blocks
output : array_like(3D or 4D)
[default: None] If given, the resulting blocks will be saved into this parameter; must be initialized in the correct size (see block_output_shape())
flat : bool
[default: False] If output is not specified, the flat parameter is used to decide whether 3D (flat = True) or 4D (flat = False) output is generated
Returns:
output : array_like(3D or 4D)
The resulting blocks that the image is decomposed into; the same array as the output parameter, when given.
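A plain-Python sketch of such a block decomposition on nested lists. It assumes that the number of blocks per axis is (size - overlap) // (block_size - overlap) and that incomplete blocks at the borders are dropped; the function name block_decompose is hypothetical and the code does not call bob.ip.base.

```python
def block_decompose(image, block_size, block_overlap=(0, 0), flat=False):
    """Split a 2D list into blocks; returns a 4D (grid) or 3D (flat) nested list."""
    block_h, block_w = block_size
    step_y = block_h - block_overlap[0]
    step_x = block_w - block_overlap[1]
    height, width = len(image), len(image[0])
    # Assumed block count: only blocks that fit completely are kept.
    n_y = (height - block_overlap[0]) // step_y
    n_x = (width - block_overlap[1]) // step_x
    grid = []
    for by in range(n_y):
        row_of_blocks = []
        for bx in range(n_x):
            top, left = by * step_y, bx * step_x
            block = [r[left:left + block_w] for r in image[top:top + block_h]]
            row_of_blocks.append(block)
        grid.append(row_of_blocks)
    if flat:
        # 3D variant: blocks are listed row by row.
        return [block for row_of_blocks in grid for block in row_of_blocks]
    return grid


image = [[10 * y + x for x in range(6)] for y in range(4)]
blocks = block_decompose(image, (2, 3))
flat_blocks = block_decompose(image, (2, 3), flat=True)
overlapping = block_decompose(image, (2, 3), block_overlap=(1, 1))
```

For the 4x6 example, the 4D result has 2x2 blocks of shape 2x3, and the flat result lists the same four blocks in a single sequence.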
Returns the shape of the output image that is required to compute the bob.ip.base.block() function
Parameters:
input : array_like (2D)
The source image to decompose into blocks
block_size : (int, int)
The size of the blocks in which the image is decomposed
block_overlap : (int, int)
[default: (0, 0)] The overlap of the blocks
flat : bool
[default: False] The flat parameter is used to decide whether 3D (flat = True) or 4D (flat = False) output is generated
Returns:
shape : (int, int, int) or (int, int, int, int)
The shape of the blocks.
Crops the given src image to the given offset (might be negative) and to the given size (might be greater than the src image).
Either crop_size or dst needs to be specified. When masks are given, they need to be of the same size as the src and dst parameters. Where the crop region lies outside the image, the cropped image will contain fill_pattern and the mask will be set to False.
Parameters
Returns
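The behavior described above, including negative offsets and crop regions that extend beyond the image, can be sketched in plain Python as follows. The function name crop_with_mask and its argument names are hypothetical, the offset is assumed to be given as (top, left), and the code works on nested lists rather than calling bob.ip.base.

```python
def crop_with_mask(src, offset, size, fill_pattern=0):
    """Crop a 2D list; pixels outside src get fill_pattern and mask value False."""
    top, left = offset
    height, width = size
    src_h, src_w = len(src), len(src[0])
    dst, dst_mask = [], []
    for y in range(top, top + height):
        row, mask_row = [], []
        for x in range(left, left + width):
            inside = 0 <= y < src_h and 0 <= x < src_w
            row.append(src[y][x] if inside else fill_pattern)
            mask_row.append(inside)
        dst.append(row)
        dst_mask.append(mask_row)
    return dst, dst_mask


# Start one row above the image and request one column more than it has.
cropped, cropped_mask = crop_with_mask([[1, 2], [3, 4]], offset=(-1, 0), size=(2, 3))
```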
Extrapolate a 2D array/image, taking a boolean mask into account
The img argument is used both as an input and an output. Only values where the mask is set to False are extrapolated. The region where the mask is set to True is expected to be convex.
This function can be called in two ways:
The first way is by giving only the mask and the image. Then a nearest neighbor technique is used as:
In the second way, the mask is interpolated by adding random values to the border pixels. The image is scanned in a spiral way, starting at the center of the masked area. When a pixel of the unmasked area is reached:
Any action considering a random number will use the given rng to create random numbers.
Note
For the second variant, images of type float are preferred.
Parameters:
mask : array_like (2D, bool)
The mask, in which valid pixels are set to True and invalid pixels are set to False
img : array_like (2D, bool)
The image that will be filled; must have the same shape as mask
random_sigma : float
The standard deviation of the random factor to multiply the valid pixel value from the border with; must be greater than or equal to 0
neighbors : int
[Default: 5] The number of neighbors of valid border pixels to choose one from; set neighbors=0 to disable random selection
rng : bob.core.random.mt19937
[Default: rng initialized with the system time] The random number generator to consider
Flip a 2D or 3D array/image upside-down. If given, the destination array dst should have the same size and type as the source array.
Parameters
Returns
Flip a 2D or 3D array/image left-right. If given, the destination array dst should have the same size and type as the source array.
Parameters
Returns
Performs a power-law gamma correction of a given 2D image
Todo
Explain gamma correction in more detail
Parameters:
src : array_like (2D)
The source image to apply the gamma correction to
gamma : float
The gamma value to apply
dst : array_like (2D, float)
The gamma-corrected image to write; if not specified, it will be created in the desired size
Returns:
dst : array_like (2D, float)
The gamma-corrected image; the same as the dst parameter, if specified
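As a minimal sketch, a power-law gamma correction can be written as dst = src ** gamma applied to every pixel. The function name gamma_correct is hypothetical, and the assumption that no additional scaling or normalization is applied is not confirmed by the text above.

```python
def gamma_correct(src, gamma):
    """Raise every pixel of a 2D list to the power gamma and return floats."""
    return [[float(value) ** gamma for value in row] for row in src]


# With gamma = 0.5 every pixel is replaced by its square root.
corrected = gamma_correct([[0.0, 0.25], [1.0, 4.0]], gamma=0.5)
```

For pixel values in the range [0, 1], a gamma below 1 brightens the image and a gamma above 1 darkens it.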
Computes a histogram of the given input image
This function computes a histogram of the given input image, in several ways.
Parameters:
src : array_like (2D)
The source image to compute the histogram for
hist : array_like (1D, uint64)
The histogram with the desired number of bins; the histogram will be cleared (reset to zero) before running the extraction
min_max : (scalar, scalar)
The minimum value and the maximum value in the source image
bin_count : int
[default: 256 or 65536] The number of bins in the histogram to create, defaults to the maximum number of values
Returns:
hist : array_like (1D, uint64)
The histogram with the desired number of bins, which is filled with the histogrammed source data
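A plain-Python sketch of a histogram with explicit min_max and bin_count values. It assumes equally wide bins over [min, max], with the maximum value counted in the last bin, and that all pixel values lie inside that range with max > min. The function name image_histogram is hypothetical and the exact binning rule of the library may differ.

```python
def image_histogram(src, min_value, max_value, bin_count):
    """Count the pixels of a 2D list in bin_count equally wide bins."""
    hist = [0] * bin_count
    span = max_value - min_value
    for row in src:
        for value in row:
            index = int((value - min_value) * bin_count / span)
            # The maximum value would land one past the end; keep it in the last bin.
            index = min(index, bin_count - 1)
            hist[index] += 1
    return hist


hist = image_histogram([[0, 1, 2], [3, 3, 4]], min_value=0, max_value=4, bin_count=2)
```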
Performs a histogram equalization of a given 2D image
The first version computes the equalization in-place (unlike the old implementation, which returned an equalized image), while the second version fills the given dst array and leaves the input untouched.
Parameters:
src : array_like (2D, uint8 or uint16)
The source image to equalize
dst : array_like (2D, uint8, uint16, uint32 or float)
The histogram-equalized image to write; if not specified, the equalization is computed in-place.
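For illustration, the textbook form of histogram equalization maps every gray level through the normalized cumulative histogram. The sketch below follows that textbook formula and returns a new nested list; it is not taken from the library's implementation, and the function name equalize and the levels argument are hypothetical.

```python
def equalize(src, levels=256):
    """Textbook histogram equalization of a 2D list of integers in [0, levels)."""
    counts = [0] * levels
    for row in src:
        for value in row:
            counts[value] += 1
    # Cumulative histogram.
    cdf, running = [], 0
    for count in counts:
        running += count
        cdf.append(running)
    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to spread out.
        return [list(row) for row in src]
    scale = levels - 1
    return [
        [int((cdf[v] - cdf_min) * scale / (total - cdf_min) + 0.5) for v in row]
        for row in src
    ]


equalized = equalize([[0, 1], [1, 2]], levels=4)
```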
Computes an integral image for the given input image
It is the responsibility of the user to select an appropriate type for the numpy array dst (and sqr), which will contain the integral image. By default, src and dst should have the same size. When the sqr matrix is given as well, it will be filled with the squared integral image (useful to compute variances of pixels).
Note
The sqr image is expected to have the same data type as the dst image.
If add_zero_border is set to True, dst (and sqr) should be one pixel larger than src in each dimension. In this case, an extra zero pixel will be added at the beginning of each row and column.
Parameters:
src : array_like (2D)
The source image
dst : array_like (2D)
The resulting integral image
sqr : array_like (2D)
The resulting squared integral image with the same data type as dst
add_zero_border : bool
If enabled, an extra zero pixel will be added at the beginning of each row and column
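A plain-Python sketch of the integral image and of the effect of add_zero_border. Each output pixel holds the sum of all input pixels above and to the left of it, inclusive. The function name integral_image is hypothetical here and operates on nested lists; the squared integral image is obtained in this sketch by squaring the input first.

```python
def integral_image(src, add_zero_border=False):
    """Return the integral image of a 2D list, optionally with a zero border."""
    height, width = len(src), len(src[0])
    # Work on a padded table with a leading zero row and column.
    padded = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += src[y][x]
            padded[y + 1][x + 1] = padded[y][x + 1] + row_sum
    if add_zero_border:
        return padded
    # Without the border, the result has the same size as src.
    return [row[1:] for row in padded[1:]]


src = [[1, 2], [3, 4]]
plain = integral_image(src)
bordered = integral_image(src, add_zero_border=True)
squared = integral_image([[v * v for v in row] for row in src])
```

With the zero border, the sum over any rectangle can be read from four table entries without special cases for the first row and column.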
Computes a local binary pattern histogram sequence (LBPHS) from the given image
Warning
This is a re-implementation of the old bob.ip.LBPHSFeatures class, but with a different handling of blocks. Previously, the blocks were extracted from the image first, and the LBPs were then extracted within each block. Hence, the border pixels of each block were not taken into account, and the histograms contained far fewer elements. Now, the LBPs are extracted first, and then the resulting LBP image is split into blocks.
This function computes the LBP features for the whole image, using the given bob.ip.base.LBP instance. Afterwards, the resulting image is split into several blocks with the given block size and overlap, and local LBP histograms are extracted from each block.
Note
To get the required output shape, you can use the lbphs_output_shape() function.
Parameters:
input : array_like (2D)
The source image to compute the LBPHS for
lbp : bob.ip.base.LBP
The LBP class to be used for feature extraction
block_size : (int, int)
The size of the blocks in which the LBP histograms are split
block_overlap : (int, int)
[default: (0, 0)] The overlap of the blocks in which the LBP histograms are split
output : array_like(2D, uint64)
If given, the resulting LBPHS features will be written to this array; must have the size #output-blocks, #LBP-labels (see lbphs_output_shape())
Returns:
output : array_like(2D, uint64)
The resulting LBPHS features of the size #output-blocks, #LBP-labels; the same array as the output parameter, when given.
Returns the shape of the output image that is required to compute the bob.ip.base.lbphs() function
Parameters:
input : array_like (2D)
The source image to compute the LBPHS for
lbp : bob.ip.base.LBP
The LBP class to be used for feature extraction
block_size : (int, int)
The size of the blocks in which the LBP histograms are split
block_overlap : (int, int)
[default: (0, 0)] The overlap of the blocks in which the LBP histograms are split
Returns:
shape : (int, int)
The shape of the LBP histogram sequences, which is (#blocks, #labels).
Given a 2D mask (a 2D array of booleans), computes the maximum rectangle that contains only True values.
The resulting rectangle contains the coordinates in the following order: top, left, height, width.
Parameters:
mask : array_like (2D, bool)
The mask of boolean values, e.g., as a result of bob.ip.base.GeomNorm.process()
Returns:
rect : (int, int, int, int)
The resulting rectangle: (top, left, height, width)
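One generic way to find the largest all-True rectangle in a boolean mask is the row-wise "largest rectangle in a histogram" technique, sketched below in plain Python. This is a standard algorithm swapped in for illustration, not the library's implementation; the function name largest_true_rectangle is hypothetical.

```python
def largest_true_rectangle(mask):
    """Return (top, left, height, width) of the largest all-True rectangle."""
    best, best_area = (0, 0, 0, 0), 0
    width = len(mask[0])
    heights = [0] * width  # number of consecutive True values ending at this row
    for y, row in enumerate(mask):
        for x in range(width):
            heights[x] = heights[x] + 1 if row[x] else 0
        stack = []  # column indices whose heights are strictly increasing
        for x in range(width + 1):
            current = heights[x] if x < width else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= current:
                h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                w = x - left
                if h * w > best_area:
                    best_area = h * w
                    best = (y - h + 1, left, h, w)
            stack.append(x)
    return best


T, F = True, False
mask = [
    [F, T, T, F],
    [F, T, T, T],
    [F, T, T, F],
]
rect = largest_true_rectangle(mask)
```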
Performs a median filtering of the input image with the given radius
This function performs a median filtering of the given src image with the given radius and writes the result to the given dst image. Both gray-level and color images are supported, and the input and output datatype must be identical.
Median filtering iterates with a mask of size (2*radius[0]+1, 2*radius[1]+1) over the input image. For each input region, the pixels under the mask are sorted and the median value (the middle element of the sorted list) is written into the dst image. Therefore, the dst is smaller than the src image, i.e., by 2*radius pixels.
Parameters:
src : array_like (2D or 3D)
The source image to filter, might be a gray level image or a color image
radius : (int, int)
The radius of the median filter; the final filter will have the size (2*radius[0]+1, 2*radius[1]+1)
dst : array_like (2D or 3D)
The median-filtered image to write; needs to be of size src.shape - 2*radius; if not specified, it will be created
Returns:
dst : array_like (2D or 3D)
The median-filtered image; the same as the dst parameter, if specified
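The description above can be sketched for a gray-level image in plain Python. Only positions where the full mask fits into the image are evaluated, which is why the output is smaller than the input by 2*radius in each direction. The function name median_filter is used here for a stand-alone helper on nested lists, not for the library function.

```python
def median_filter(src, radius):
    """Median-filter a 2D list; the result shrinks by 2*radius per dimension."""
    ry, rx = radius
    height, width = len(src), len(src[0])
    dst = []
    for y in range(ry, height - ry):
        row = []
        for x in range(rx, width - rx):
            # Sort all pixels under the (2*ry+1, 2*rx+1) mask ...
            window = sorted(
                src[yy][xx]
                for yy in range(y - ry, y + ry + 1)
                for xx in range(x - rx, x + rx + 1)
            )
            # ... and keep the middle element.
            row.append(window[len(window) // 2])
        dst.append(row)
    return dst


src = [
    [1, 9, 3, 4],
    [5, 6, 7, 8],
    [9, 2, 100, 0],
]
filtered = median_filter(src, radius=(1, 1))
```

The 3x4 input with radius (1, 1) yields a 1x2 output, and the outlier value 100 does not appear in the result.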
Rotates an image.
This function rotates an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:
Note
Since the implementation uses a different interpolation style than before, results might slightly differ.
Parameters:
src : array_like (2D or 3D)
The input image (gray or colored) that should be rotated
dst : array_like (2D or 3D, float)
The resulting rotated gray or color image; its shape should be the one returned by bob.ip.base.rotated_output_shape()
src_mask : array_like (bool, 2D or 3D)
An input mask of valid pixels before geometric normalization, must be of same size as src
dst_mask : array_like (bool, 2D or 3D)
The output mask of valid pixels after geometric normalization, must be of same size as dst
rotation_angle : float
The rotation angle that should be applied to the image
Returns:
dst : array_like (2D or 3D, float)
The resulting rotated image
This function returns the shape of the rotated image for the given image and angle
Parameters:
src : array_like (2D,3D)
The src image which should be rotated
angle : float
The rotation angle in degrees to rotate the src image with
Returns:
rotated_shape : (int, int) or (int, int, int)
The shape of the rotated dst image required in a call to bob.ip.base.rotate()
Scales an image.
This function scales an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:
Note
For 2. and 3., scale factors are computed for both directions independently. In practice, this means that the image might be stretched in either direction, i.e., the aspect ratio is not necessarily preserved. Even for 1. this might apply, e.g., when src.shape * scaling_factor does not result in integral values.
Parameters:
src : array_like (2D or 3D)
The input image (gray or colored) that should be scaled
dst : array_like (2D or 3D, float)
The resulting scaled gray or color image
src_mask : array_like (bool, 2D or 3D)
An input mask of valid pixels before geometric normalization, must be of same size as src
dst_mask : array_like (bool, 2D or 3D)
The output mask of valid pixels after geometric normalization, must be of same size as dst
scaling_factor : float
the scaling factor that should be applied to the image
Returns:
dst : array_like (2D, float)
The resulting scaled image
This function returns the shape of the scaled image for the given image and scale
The function tries its best to compute an integral-valued shape given the shape of the input image and the given scale factor. Nevertheless, for non-round scale factors this might not work out perfectly.
Parameters:
src : array_like (2D,3D)
The src image which should be scaled
scaling_factor : float
The scaling factor to scale the src image with
Returns:
scaled_shape : (int, int) or (int, int, int)
The shape of the scaled dst image required in a call to bob.ip.base.scale()
Shifts the given src image by the given offset (might be negative).
If dst is specified, the image is shifted into the dst image. Ideally, dst should have the same size as src, but other sizes work as well. When dst is None (the default), it is created in the same size as src. When masks are given, they need to be of the same size as the src and dst parameters. Where the shift reaches regions outside the image, the shifted image will contain fill_pattern and the mask will be set to False.
Parameters
Returns
Performs a Sobel filtering of the input image
This function performs Sobel filtering with both the vertical and the horizontal filter. A Sobel filter is an edge detector that detects either horizontal or vertical edges. The two filters are given as:
S_y = \left\lgroup\begin{array}{ccc} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{array}\right\rgroup \qquad S_x = \left\lgroup\begin{array}{ccc} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{array}\right\rgroup
If given, the dst array should have the expected type (numpy.float64) and two layers of the same size as the input image. The result of the vertical filter will be written to the first layer, dst[0], while the result of the horizontal filter will be written to dst[1].
Parameters:
src : array_like (2D, float)
The source image to filter
border : bob.sp.BorderType
[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border
dst : array_like (3D, float)
The Sobel-filtered image to write; needs to be of size [2] + src.shape; if not specified, it will be created
Returns:
dst : array_like (3D, float)
The Sobel-filtered image; the same as the dst parameter, if specified
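A plain-Python sketch of applying the two kernels S_y and S_x and stacking the results into two layers. Two points are assumptions: the kernels are applied as a correlation (a true convolution would flip the sign of both responses), and the mirror border repeats the edge pixel. The helper names are hypothetical and the code does not call bob.ip.base.

```python
S_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
S_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def mirror_index(index, size):
    """Reflect an out-of-range index back into [0, size), repeating the edge."""
    if index < 0:
        return -index - 1
    if index >= size:
        return 2 * size - index - 1
    return index


def sobel_layers(src):
    """Return [response of S_Y, response of S_X], each the size of src."""
    height, width = len(src), len(src[0])
    dst = [[[0.0] * width for _ in range(height)] for _ in range(2)]
    for layer, kernel in enumerate((S_Y, S_X)):
        for y in range(height):
            for x in range(width):
                total = 0.0
                for ky in range(3):
                    for kx in range(3):
                        yy = mirror_index(y + ky - 1, height)
                        xx = mirror_index(x + kx - 1, width)
                        total += kernel[ky][kx] * src[yy][xx]
                dst[layer][y][x] = total
    return dst


# An intensity step between the second and third column.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
layers = sobel_layers(step)
```

For this input, only the S_x layer responds, and only in the two columns next to the step; the S_y layer stays zero because all rows are identical.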
Extracts a 1D array using a zigzag pattern from a 2D array
This function extracts a 1D array using a zigzag pattern from a 2D array. If right_first is set to True, the second element of the pattern is the one to the right of the upper-left element; otherwise it is the one below the upper-left element. The input is expected to be a 2D array, and the output a 1D array. This method only supports arrays of the following data types:
- numpy.uint8
- numpy.uint16
- numpy.float64 (or the native python float)
To create an object with a scalar type that will be accepted by this method, use a construction like the following:
>>> import numpy
>>> input_righttype = input_wrongtype.astype(numpy.float64)
Parameters:
src : array_like (uint8|uint16|float64, 2D)
The source matrix.
dst : array_like (uint8|uint16|float64, 1D)
The destination matrix.
right_first : scalar (bool)
Tells whether the zigzag pattern starts by moving to the right or not
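The zigzag traversal can be sketched in plain Python by walking the anti-diagonals of the 2D array and alternating the direction on each one. The function name zigzag_scan and the n_coefficients argument (standing in for the length of dst) are hypothetical; the sketch returns a new list instead of filling dst.

```python
def zigzag_scan(src, n_coefficients, right_first=False):
    """Return the first n_coefficients values of src in zigzag order."""
    height, width = len(src), len(src[0])
    out = []
    for d in range(height + width - 1):
        # All positions (y, x) with y + x == d, ordered from top to bottom.
        first_y = max(0, d - width + 1)
        last_y = min(d, height - 1)
        coords = [(y, d - y) for y in range(first_y, last_y + 1)]
        # Alternate the direction; right_first decides which diagonals go upwards.
        if (d % 2 == 0) == right_first:
            coords.reverse()
        out.extend(src[y][x] for y, x in coords)
    return out[:n_coefficients]


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
to_the_right = zigzag_scan(matrix, 9, right_first=True)
to_the_bottom = zigzag_scan(matrix, 9, right_first=False)
first_six = zigzag_scan(matrix, 6, right_first=True)
```

With right_first=True the order matches the zigzag scan commonly used for DCT coefficients, where low-frequency entries near the upper-left corner come first.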