bob.ip.common.data.transforms
Image transformations for our pipelines
The difference between the methods here and those from torchvision.transforms is that ours support multiple simultaneous image inputs, which are required to feed segmentation networks (e.g. an image and its labels or masks). We also take care of data augmentation, in which random flipping and rotation need to be applied across all input images, while color jittering, for example, is applied only to the input image.
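As an illustration of the synchronized behaviour described above, here is a minimal sketch using plain nested lists in place of PIL images; `random_hflip_pair` is a hypothetical helper written for this example, not part of this module:

```python
import random


def random_hflip_pair(image, mask, p=0.5):
    """Flip an image and its mask together so labels stay aligned.

    A single random draw gates both flips, which is the point of
    tuple-aware augmentations: the geometry of image and mask must
    never diverge.
    """
    if random.random() < p:
        image = [row[::-1] for row in image]
        mask = [row[::-1] for row in mask]
    return image, mask
```

With `p=1.0` the flip always happens, and both outputs are mirrored in lockstep; with `p=0.0` both pass through untouched.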
Classes

- TupleMixin: Adds support to work with tuples of objects to torchvision transforms
- CenterCrop, Pad, Resize, ToTensor: Tuple-aware versions of the corresponding torchvision transforms
- SingleCrop: Crops one image at the given coordinates.
- Crop: Crops multiple images at the given coordinates.
- SingleAutoLevel16to8: Converts a 16-bit image to 8-bit representation using "auto-level"
- AutoLevel16to8: Converts multiple 16-bit images to 8-bit representations using "auto-level"
- SingleToRGB: Converts from any input format to RGB, using an ADAPTIVE conversion.
- ToRGB: Converts from any input format to RGB, using an ADAPTIVE conversion.
- RandomHorizontalFlip: Randomly flips all input images horizontally
- RandomVerticalFlip: Randomly flips all input images vertically
- RandomRotation: Randomly rotates all input images by the same amount
- ColorJitter: Randomly applies a color jitter transformation on the first image
- GaussianBlur: Randomly applies a gaussian blur transformation on the first image
- Returns image tensor and its corresponding target dict given a mask.
- ShrinkIntoSquare: Crops black borders and then resizes to a square with minimal padding
- class bob.ip.common.data.transforms.TupleMixin
Bases: object
Adds support to work with tuples of objects to torchvision transforms
- class bob.ip.common.data.transforms.CenterCrop(size)
Bases: TupleMixin, CenterCrop
- class bob.ip.common.data.transforms.Pad(padding, fill=0, padding_mode='constant')
Bases: TupleMixin, Pad
- class bob.ip.common.data.transforms.Resize(size, interpolation=InterpolationMode.BILINEAR, max_size=None, antialias=None)
Bases: TupleMixin, Resize
- class bob.ip.common.data.transforms.ToTensor
Bases: TupleMixin, ToTensor
- class bob.ip.common.data.transforms.SingleCrop(i, j, h, w)
Bases: object
Crops one image at the given coordinates.
- class bob.ip.common.data.transforms.Crop(i, j, h, w)
Bases: TupleMixin, SingleCrop
Crops multiple images at the given coordinates.
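The cropping itself amounts to slicing an h-by-w window starting at row i, column j; a sketch over nested lists, assuming the (top, left, height, width) argument order that torchvision's functional crop uses:

```python
def single_crop(img, i, j, h, w):
    """Return the h-by-w window whose top-left corner sits at
    row i, column j (assumed (top, left, height, width) order)."""
    return [row[j:j + w] for row in img[i:i + h]]
```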
- class bob.ip.common.data.transforms.SingleAutoLevel16to8
Bases: object
Converts a 16-bit image to 8-bit representation using "auto-level"
This transform assumes that the input image is gray-scaled.
To auto-level, we compute the minimum and the maximum of the image and map that range linearly onto the [0, 255] range of the destination image.
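The mapping described above can be sketched like this, with plain lists standing in for image arrays; `auto_level_16to8` is an illustrative helper, not the class itself:

```python
def auto_level_16to8(pixels):
    """Linearly map the [min, max] range of a 16-bit pixel list
    onto [0, 255]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # flat image: avoid division by zero
        return [0 for _ in pixels]
    scale = 255.0 / (hi - lo)
    return [round((p - lo) * scale) for p in pixels]
```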
- class bob.ip.common.data.transforms.AutoLevel16to8
Bases: TupleMixin, SingleAutoLevel16to8
Converts multiple 16-bit images to 8-bit representations using "auto-level"
This transform assumes that the input images are gray-scaled.
To auto-level, we compute the minimum and the maximum of each image and map that range linearly onto the [0, 255] range of the destination image.
- class bob.ip.common.data.transforms.SingleToRGB
Bases: object
Converts from any input format to RGB, using an ADAPTIVE conversion.
This transform takes the input image and converts it to RGB using PIL.Image.Image.convert, with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further considerations.
- class bob.ip.common.data.transforms.ToRGB
Bases: TupleMixin, SingleToRGB
Converts from any input format to RGB, using an ADAPTIVE conversion.
This transform takes the input images and converts them to RGB using PIL.Image.Image.convert, with mode='RGB' and all other defaults. This may be aggressive if applied to 16-bit images without further considerations.
- class bob.ip.common.data.transforms.RandomHorizontalFlip(p=0.5)
Bases: RandomHorizontalFlip
Randomly flips all input images horizontally
- class bob.ip.common.data.transforms.RandomVerticalFlip(p=0.5)
Bases: RandomVerticalFlip
Randomly flips all input images vertically
- class bob.ip.common.data.transforms.RandomRotation(p=0.5, **kwargs)
Bases: RandomRotation
Randomly rotates all input images by the same amount
Unlike the current torchvision implementation, we also accept a probability for applying the rotation.
Parameters:
p (float): probability at which the rotation is applied (default: 0.5). Remaining keyword arguments are forwarded to torchvision's RandomRotation.
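The probability gate mentioned above can be sketched as a generic wrapper; `RandomApplySketch` is a hypothetical stand-in written for this example (the real class inherits from torchvision's RandomRotation instead of wrapping it):

```python
import random


class RandomApplySketch:
    """Apply a wrapped transform to all inputs with probability p,
    otherwise pass them through unchanged."""

    def __init__(self, transform, p=0.5):
        self.transform = transform
        self.p = p

    def __call__(self, *samples):
        if random.random() < self.p:
            # one draw decides for every input, so all images in the
            # tuple receive the same treatment
            return tuple(self.transform(s) for s in samples)
        return samples
```

Setting `p=1.0` or `p=0.0` makes the behaviour deterministic, which is handy when testing pipelines.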
- class bob.ip.common.data.transforms.ColorJitter(p=0.5, **kwargs)
Bases: ColorJitter
Randomly applies a color jitter transformation on the first image
Notice this transform extension, unlike others in this module, only affects the first image passed as input argument. Unlike the current torchvision implementation, we also accept a probability for applying the jitter.
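The "first image only" behaviour reduces to something like the following; `apply_to_first` is a hypothetical helper for illustration:

```python
def apply_to_first(transform, *samples):
    """Transform only the first input (the image); pass the
    remaining inputs (labels, masks) through untouched, since
    photometric changes must not alter ground truth."""
    return (transform(samples[0]),) + samples[1:]
```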
- class bob.ip.common.data.transforms.ShrinkIntoSquare(reference=0, threshold=0)
Bases: object
Crops black borders and then resizes to a square with minimal padding
This transform crops each image by removing black rows and columns from the borders until a non-black pixel is found, then pads the result back into a square of minimal size.
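A sketch of the crop-then-square logic on a nested-list "image". The threshold semantics (pixels at or below threshold count as black) and the padding side are assumptions for this example; the real class may place the content differently:

```python
def shrink_into_square(img, threshold=0):
    """Drop all-black border rows/columns, then zero-pad the
    remainder into the smallest enclosing square."""
    # indices of rows/columns containing at least one non-black pixel
    rows = [i for i, row in enumerate(img)
            if any(p > threshold for p in row)]
    cols = [j for j in range(len(img[0]))
            if any(row[j] > threshold for row in img)]
    cropped = [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]
    h, w = len(cropped), len(cropped[0])
    side = max(h, w)
    # pad right/bottom with zeros until the result is side x side
    out = [row + [0] * (side - w) for row in cropped]
    out += [[0] * side for _ in range(side - h)]
    return out
```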
- class bob.ip.common.data.transforms.GaussianBlur(p=0.5, **kwargs)
Bases: GaussianBlur
Randomly applies a Gaussian blur transformation on the first image
Notice this transform extension, unlike others in this module, only affects the first image passed as input argument. Unlike the current torchvision implementation, we also accept a probability for applying the blur.