Python API¶
Package Documentation¶
This is the bob.io.stream package.
This package provides a way to define efficient processing pipelines, based on the concept of “streams”, to load and process video data stored in HDF5 files. The interface with the HDF5 files is implemented in StreamFile. Users can define loading and processing pipelines through the Stream class.
The stream implementation is designed to be extended by implementing filters with the StreamFilter class and decorating them with stream_filter(). The decorator adds the filter to the Stream class, so it can be used as a member of any stream.
- class bob.io.stream.Stream(name=None, parent=None)¶
Bases:
object
Base class implementing methods to load/write, process and use data from hdf5 file with a “numpy-like” api.
This class is designed to provide the following functionalities:
- Easily define chains of processing and loading data. When data is accessed through a stream, the stream recursively calls its parent’s load() function before its own. For instance, if a stream’s parent is a StreamFile, loading data through the stream first loads the data (from the dataset specified by the stream’s name) from the HDF5 file through the StreamFile, then applies the stream’s own processing.
- Provide an easy syntax to implement this chain processing. This is achieved through the stream_filter() decorator, which adds filters to the Stream class as members, allowing them to be used in the following fashion:
  example_stream = Stream("cam1").normalize().stack(Stream("cam2").normalize())
  Accessing data through example_stream thus loads data from “cam1” and normalizes it, then loads data from “cam2” and normalizes it, and finally stacks the two together.
- In a similar fashion, processing can be applied in the reverse order to write data to an HDF5 file. This is implemented in the put() method and uses the child attribute.
- The API is designed to be similar to numpy arrays:
  - Data access (processing and loading) is done using [].
  - Taking a slice of a stream returns a new stream with the sliced data. (This is implemented with the StreamView filter.)
- To reduce disk access, the result of loading or processing is buffered.
- The class was initially designed to work with video streams, therefore StreamArray members are available to provide an easy way to use bounding boxes or landmarks for each frame in the stream. Additionally, the timestamps member holds the timestamp of each frame in the stream.
- name¶
Name of the stream. If parent is a StreamFile, the name determines which dataset in the HDF5 file the data is taken from. Otherwise it identifies the Stream (or StreamFilter) functionality (e.g. “adjust”, “normalize”, …).
- Type
- parent¶
The element before this instance in the chain of processing for loading data. The parent’s “load” function will recursively be used before this instance’s one.
- Type
Stream or StreamFile
- child¶
The element after this instance in the chain of processing for writing data. When put() is called, this instance performs its own function, then recursively calls its child’s put().
- Type
Stream or StreamFile
- _data¶
Buffered data.
- Type
- _shape¶
Shape of the stream’s data. This member is mostly used when writing data; when reading, the shape property is used instead.
- Type
tuple of int
- adjust(*args, **kwargs)¶
- astype(*args, **kwargs)¶
- property bounding_box¶
Bounding box at each frame in the stream.
A StreamArray member is provided to let the user easily store bounding boxes alongside the stream’s data.
- Returns
Bounding boxes.
- Return type
- clean(*args, **kwargs)¶
- colormap(*args, **kwargs)¶
- property config¶
Configuration dictionary to access the data in the hdf5 file.
- Returns
Config.
- Return type
- filter(*args, **kwargs)¶
- filters = ['filter', 'view', 'save', 'astype', 'adjust', 'select', 'colormap', 'normalize', 'clean', 'stack', 'subtract']¶
- get_available_filters()[source]¶
Get a list of the available filters to use with the Stream class.
Note: Stream.filters is filled in with the names of the filters by the stream_filter() decorator, each time a class is decorated.
- get_parent()[source]¶
Return this stream’s parent (None if parent is not set)
- Returns
This stream’s parent.
- Return type
- property image_points¶
Landmarks at each frame in the stream.
A StreamArray member is provided to let the user easily store landmark points alongside the stream’s data.
- Returns
Landmarks.
- Return type
- load(index=None)[source]¶
Load data directly.
Unlike accessing stream data through brackets [], this method always returns the data, not a Stream. This method is overloaded in StreamFilter in order to call the parent’s load method first and apply processing to the result.
The loaded data is buffered to reduce disk access.
- Parameters
index (int or list) – Indices of the frames to load, by default None.
- Returns
Data at index.
- Return type
- property ndim¶
Number of dimensions in the stream’s data.
- Returns
Number of dimensions.
- Return type
- normalize(*args, **kwargs)¶
- put(data, timestamp=None)[source]¶
Recursively pass data down to child to write it to the HDF5 file.
StreamFilter overloads this method to process data with the filter function before passing it down to child.
- Parameters
data (numpy.ndarray) – Data to write to file.
timestamp (int or float) – Timestamp of data, by default None.
- Raises
ValueError – If the shape of data does not match the shape of previous frames or the stream’s shape.
- save(*args, **kwargs)¶
- select(*args, **kwargs)¶
- set_source(src)[source]¶
Recursively set source of self and parent.
- Parameters
src (StreamFile) – The file containing the raw data of this stream and its parents.
- property shape¶
Shape of the stream’s data.
When reading data, the shape of the stream is typically defined by the shape of the data in source, therefore the shape is recursively set to parent as well. However, when writing data, the shape is defined by the user, and the stream’s parent might not be set. In this case, we store the shape in _shape.
- Raises
Exception – If trying to set the shape when it is already defined (by a parent StreamFile).
ValueError – If setting the shape with an invalid type.
- Returns
Shape.
- Return type
tuple of int
- property source¶
Source file of the Stream’s data.
While parent points to the previous stream in the chain of processing, source points directly to the data file.
- Returns
File containing the stream’s data, before processing.
- Return type
- stack(*args, **kwargs)¶
- subtract(*args, **kwargs)¶
- property timestamps¶
Timestamp of each frame in the stream’s data.
- Returns
Timestamps.
- Return type
- view(*args, **kwargs)¶
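The chained, buffered loading described for Stream can be pictured with a small stand-alone sketch. It does not use bob.io.stream: `ToySource` and `ToyStream` are hypothetical stand-ins, and keying the buffer by the requested indices is an assumption, since this page does not say how the real class organizes its buffer.

```python
class ToySource:
    """Hypothetical stand-in for a StreamFile: maps dataset names to frames."""

    def __init__(self, datasets):
        self.datasets = datasets
        self.reads = 0  # counts simulated disk accesses

    def load_stream_data(self, name, index):
        self.reads += 1
        return [self.datasets[name][i] for i in index]


class ToyStream:
    """Hypothetical stream: loads from its parent, then applies its own step."""

    def __init__(self, name, parent, func=None):
        self.name = name
        self.parent = parent
        self.func = func
        self._cache = {}  # assumption: buffer keyed by the requested indices

    def load(self, index):
        key = tuple(index)
        if key not in self._cache:
            if isinstance(self.parent, ToySource):
                # First element of the chain: read from the file by name.
                data = self.parent.load_stream_data(self.name, index)
            else:
                # Otherwise recurse up the chain before doing our own work.
                data = self.parent.load(index)
            if self.func is not None:
                data = [self.func(frame) for frame in data]
            self._cache[key] = data
        return self._cache[key]


source = ToySource({"cam1": [10, 20, 30, 40]})
raw = ToyStream("cam1", source)
halved = ToyStream("halve", raw, func=lambda x: x // 2)
shifted = ToyStream("shift", halved, func=lambda x: x + 1)

first = shifted.load([0, 2])   # reads the file once, then halves and shifts
second = shifted.load([0, 2])  # served from the buffer, no new read
```

Each element only knows its parent, so a pipeline is built by nesting, and the last element of the chain is the one the user queries.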
- class bob.io.stream.StreamAdjust(adjust_to, name, parent)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter that allows two streams with different timestamps to be used together seamlessly, by taking the closest neighbors in time.
Frames of different streams are not necessarily simultaneous: some streams may be delayed, some might have fewer frames, and so on. However, the timestamp of each frame is available. Given the timestamps of the parent stream, this filter implements a nearest neighbor search in the timestamps of the adjust_to stream to load the closest frame.
This stream emulates the number of frames and the timestamps of adjust_to to facilitate operations on streams.
- adjust_to¶
Stream relative to which the timestamps are adjusted.
- Type
- set_source(src)[source]¶
Set self and adjust_to sources to src.
- Parameters
src (Stream or StreamFile) – Source Stream or StreamFile.
- property shape¶
Stream’s data shape. The number of frames is equal to that of adjust_to.
- Returns
Shape of the Stream’s data.
- Return type
tuple of int
- property timestamps¶
Stream’s timestamps, equal to those of adjust_to after adjustment.
- Returns
Timestamps of the frames in the stream.
- Return type
- load(index)[source]¶
Load frame(s) at index.
index is the index of a frame in adjust_to. The closest frame in self is found using nearest neighbor search, then the data is loaded.
- Parameters
index (int or list of int or slice) – Indices of the frames to load.
- Returns
Stream’s data at index.
- Return type
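As an illustration of the nearest-neighbor lookup described for StreamAdjust.load(), here is a minimal sketch using only the standard library. The helper names `nearest_index` and `adjusted_indices` are hypothetical, the timestamps are assumed to be sorted, and the search strategy (binary search) is a choice made for this sketch; this page does not say how the real filter performs the search.

```python
from bisect import bisect_left


def nearest_index(sorted_timestamps, target):
    """Index of the timestamp closest to target (ties go to the earlier one)."""
    pos = bisect_left(sorted_timestamps, target)
    if pos == 0:
        return 0
    if pos == len(sorted_timestamps):
        return len(sorted_timestamps) - 1
    before, after = sorted_timestamps[pos - 1], sorted_timestamps[pos]
    return pos - 1 if target - before <= after - target else pos


def adjusted_indices(parent_timestamps, adjust_to_timestamps, index):
    """Map frame indices of the adjust_to stream to the closest parent frames."""
    return [nearest_index(parent_timestamps, adjust_to_timestamps[i]) for i in index]


parent_ts = [0, 33, 66, 100, 133]  # stream being adjusted
adjust_to_ts = [5, 45, 95, 140]    # reference stream, with fewer frames
matches = adjusted_indices(parent_ts, adjust_to_ts, [0, 1, 2, 3])
# matches == [0, 1, 3, 4]: parent frame 2 is never the closest one
```

The result has one entry per requested frame of the reference stream, which is why the adjusted stream reports the frame count and timestamps of adjust_to.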
- class bob.io.stream.StreamArray(stream)[source]¶
Bases:
object
Class to associate data with a Stream, for instance bounding boxes with a video stream.
This class allows the value of the data array to be set (e.g. the bounding box at some or all frames of a stream) without having to care about the shape of the stream. If the data is not initialized, it returns None.
- class bob.io.stream.StreamAsType(name, parent, dtype)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to cast the data to a different numpy dtype.
- dtype¶
The dtype to which to cast the data.
- Type
- process(data, indices)[source]¶
Cast data to dtype.
- Parameters
data (numpy.ndarray) – Data to cast.
indices (int or list of int) – Not used. Present for compatibility with other filters.
- Returns
data cast to dtype.
- Return type
- class bob.io.stream.StreamClean(name, parent)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to fill in dead pixels through inpainting, then blurring.
- process_frame(data, data_index, stream_index)[source]¶
Fill in dead pixels in data.
- Parameters
data (numpy.ndarray) – Parent stream’s data to clean.
data_index (int or list of int) – Not used. Present for compatibility with other filters.
stream_index (int or list of int) – Not used. Present for compatibility with other filters.
- Returns
Cleaned data.
- Return type
- class bob.io.stream.StreamColorMap(name, parent, colormap='gray')[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to map 1-channel images to RGB images, useful for visualization, e.g. of depth maps.
- colormap¶
The colormap used to represent the data. Can be “gray” for grayscale, or an OpenCV colormap.
- Type
- property shape¶
Shape of the stream’s data. The stream parent must have 1 channel, and this stream has mapped it to 3 (RGB).
- Returns
Shape of the stream’s data.
- Return type
tuple of int
- process_frame(data, data_index, stream_index)[source]¶
Maps a 1-channel frame to an RGB frame using the filter’s colormap.
- Parameters
data (numpy.ndarray) – Parent stream’s data. Must have only 1 channel.
data_index (int) – Not used. Present for compatibility with other streams.
stream_index (int) – Not used. Present for compatibility with other filters.
- Returns
Stream’s data, mapped to RGB using the filter’s colormap.
- Return type
- Raises
ValueError – If the parent’s stream does not have only 1 channel: this stream maps 1 channel images to RGB.
- class bob.io.stream.StreamFile(hdf5_file=None, data_format_config_file_path=None, mode='r')¶
Bases:
object
File class to read and write from HDF5 files.
Exposes methods to read a stream’s data and meta-data. The format of the data in the hdf5 file is defined through a configuration dictionary.
The class can also be used to write an HDF5 file, through the put_frame() method. This operates by appending data to a file, one frame at a time.
- hdf5_file¶
HDF5 file containing the streams data.
- Type
- data_format_config¶
Path to configuration json with the streams data meta-data (names, shape, etc…)
- Type
- get_stream_config(stream_name)[source]¶
Get the stream_name configuration: stream name, data format, etc…
- get_stream_timestamps(stream_name)[source]¶
Return the timestamps of each frame in stream_name.
- Parameters
stream_name (str) – Name of the stream whose timestamps are requested.
- Returns
Timestamps of each frame in stream_name
- Return type
- load_stream_data(stream_name, index)[source]¶
Load the index frame(s) of data from stream_name.
Loads only the requested indices from the file. If the stream’s data configuration requests it, some axes of the loaded data are flipped.
- Parameters
- Returns
Stream’s data at frames index.
- Return type
- Raises
ValueError – If index does not have a valid type.
- put_frame(name, data, timestamp=None)[source]¶
Appends data (a frame of a stream) to the hdf5 file.
- Parameters
name (str) – Path to the dataset to append to.
data (numpy.ndarray) – Data frame to append.
- set_source(hdf5_file=None, data_format_config_file_path=None, mode='r')[source]¶
Open the HDF5 file and load data config.
- Parameters
hdf5_file (bob.io.base.HDF5File or str or None) – File handle or path to the streams HDF5 File, by default None.
data_format_config_file_path (str or None) – Path to the data config file, by default None.
mode (str) – File opening mode, by default “r”.
- class bob.io.stream.StreamFilter(name, parent, process_frame=None)¶
Bases:
bob.io.stream.Stream
Base filter class: overloads the bob.io.stream.Stream.load() and bob.io.stream.Stream.put() methods to insert the filter processing.
This class implements the process() and bob.io.stream.StreamFilter.process_frame() methods, which define the processing performed by the filter. A “process_frame” function can be received as an argument, in which case it is applied to each frame of data in process_frame(). If it is not provided, this filter doesn’t perform any processing; it still provides the definition of the processing methods, which can be overloaded in inheriting classes. See for example the StreamView filter.
The bob.io.stream.Stream.load() method is overloaded to first perform the filter’s parent processing (or loading, if parent is not a filter). The bob.io.stream.Stream.put() method is overloaded to first perform the processing of the filter, then pass the data down to child for further processing or writing to disk.
- filter_name¶
The name of this filter. name (from class bob.io.stream.Stream) is kept separate because it determines which dataset in the HDF5 file the data is loaded from.
- Type
- load(index=None)[source]¶
Overload bob.io.stream.Stream.load() to apply the filter processing to what parent loaded.
- Parameters
index (int or list of int) – Indices of the frames to load, by default None.
- Returns
The processed data.
- Return type
- process(data, indices)[source]¶
Apply the filter on each frame of data, and stack the results back in one array.
- Parameters
data (numpy.ndarray) – Data to process.
indices (list of int) – Indices of data in the stream. Unused here, but useful for instance for filters that combine two streams together.
- Returns
Processed data.
- Return type
- Raises
ValueError – If indices is not a list.
- process_frame(data, data_index, stream_index)[source]¶
Apply self.__process_frame if possible, otherwise simply return data.
- Parameters
data (numpy.ndarray) – Data (one frame) to process.
data_index (int) – Not used. Index of data in the stream.
stream_index (int) – Not used. Index of the stream from which data comes, to be used by filters that combine several streams.
- Returns
Processed frame.
- Return type
- put(data, timestamp=None)[source]¶
Apply the filter’s processing, then pass data down to child for further processing or saving to disk.
- Parameters
data (numpy.ndarray) – Data (one frame) to process.
timestamp (int or float) – Timestamp of data in the stream, by default None.
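The split between process() and process_frame() described for StreamFilter can be sketched as follows. `ToyFilter` is a hypothetical stand-in operating on plain lists rather than numpy arrays, and the way the two index arguments are filled in (position within the batch, then index in the stream) is an assumption made for this sketch.

```python
class ToyFilter:
    """Hypothetical filter that applies an optional per-frame function."""

    def __init__(self, process_frame=None):
        self._process_frame = process_frame

    def process_frame(self, data, data_index, stream_index):
        # Without a user-supplied function the frame passes through unchanged.
        if self._process_frame is None:
            return data
        return self._process_frame(data)

    def process(self, data, indices):
        if not isinstance(indices, list):
            raise ValueError("indices must be a list")
        # Process frame by frame, then gather the results in one sequence.
        return [
            self.process_frame(data[i], i, indices[i])
            for i in range(len(indices))
        ]


frames = [[1, 2], [3, 4], [5, 6]]
doubler = ToyFilter(process_frame=lambda frame: [2 * v for v in frame])
passthrough = ToyFilter()

doubled = doubler.process(frames, [0, 1, 2])
unchanged = passthrough.process(frames, [0, 1, 2])
```

A subclass that needs the whole batch at once would override process(), while one that works frame by frame only needs to override process_frame().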
- class bob.io.stream.StreamNormalize(name, parent, tmin=None, tmax=None, dtype='uint8')[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to normalize the data range of images.
- tmin¶
minimal threshold: values below tmin will be clipped to 0.
- Type
- tmax¶
maximum threshold: values over tmax will be clipped to the maximum value allowed by the dtype
- Type
- dtype¶
Data type of the images.
- Type
str or numpy.dtype
- process(data, indices)[source]¶
Normalize data.
- Parameters
data (numpy.ndarray) – The parent stream’s data, to be normalized.
indices (int or list of int) – Not used. Present for compatibility with other filters. The indices of data in the stream.
- Returns
The normalized data.
- Return type
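To make the role of tmin and tmax concrete, the following sketch normalizes a row of values to the uint8 range. It is not the library’s code: `normalize_frame` is a hypothetical helper, and the linear rescaling between the thresholds, the fallback to the data’s own minimum and maximum when a threshold is None, and the rounding by floor division are all assumptions.

```python
def normalize_frame(values, tmin=None, tmax=None, max_value=255):
    """Clip values to [tmin, tmax] and rescale them linearly to [0, max_value]."""
    # Assumption: missing thresholds fall back to the data's own extremes.
    low = min(values) if tmin is None else tmin
    high = max(values) if tmax is None else tmax
    if high <= low:
        raise ValueError("tmax must be greater than tmin")
    span = high - low
    out = []
    for value in values:
        clipped = min(max(value, low), high)
        # Floor division keeps the arithmetic exact for integer inputs.
        out.append(int((clipped - low) * max_value // span))
    return out


depth_row = [100, 500, 1000, 1500, 3000]
normalized = normalize_frame(depth_row, tmin=500, tmax=1500)
# normalized == [0, 0, 127, 255, 255]
```

Values below tmin end up at 0 and values above tmax at the maximum of the target type, which matches the clipping behaviour described for the two attributes.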
- class bob.io.stream.StreamSave(file, name, parent)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to save frames of data to a StreamFile.
Saving is performed by appending to the StreamFile.
- file¶
StreamFile into which the data will be appended.
- Type
- put(data, timestamp=None)[source]¶
Pass data and timestamp to the StreamFile to write to disk.
- Parameters
data (numpy.ndarray) – Data to write to file.
timestamp (int or float) – Timestamp of data, by default None.
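The writing direction, where put() processes a frame and hands it to child until a saving stage appends it to a file, can be sketched with toy classes. `ToyFile`, `ToyWriteStage`, `ToySave` and the `then()` helper are hypothetical and exist only for this illustration; how the real classes link parent and child is not shown on this page.

```python
class ToyFile:
    """Hypothetical stand-in for a StreamFile opened for writing."""

    def __init__(self):
        self.datasets = {}
        self.timestamps = {}

    def put_frame(self, name, data, timestamp=None):
        # Append one frame (and its timestamp) to the named dataset.
        self.datasets.setdefault(name, []).append(data)
        self.timestamps.setdefault(name, []).append(timestamp)


class ToyWriteStage:
    """Hypothetical stage: processes a frame, then hands it to its child."""

    def __init__(self, func=None):
        self.func = func
        self.child = None

    def then(self, child):
        self.child = child
        return child

    def put(self, data, timestamp=None):
        if self.func is not None:
            data = self.func(data)
        if self.child is not None:
            self.child.put(data, timestamp)


class ToySave(ToyWriteStage):
    """Hypothetical terminal stage: appends each frame to a file."""

    def __init__(self, file, name):
        super().__init__()
        self.file = file
        self.name = name

    def put(self, data, timestamp=None):
        self.file.put_frame(self.name, data, timestamp)


out_file = ToyFile()
head = ToyWriteStage()
head.then(ToyWriteStage(func=lambda f: [v * 2 for v in f])).then(ToySave(out_file, "cam1_x2"))

head.put([1, 2, 3], timestamp=0.0)
head.put([4, 5, 6], timestamp=0.04)
```

Data enters at the head of the chain one frame at a time, so the file grows by one frame per put() call.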
- class bob.io.stream.StreamSelect(name, parent, channel)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to select a channel in a color stream (in bob’s format).
This could also be performed by slicing the channel in the parent.
- property shape¶
Shape of the stream’s data.
Because 1 channel is selected, the dimension is 1 on the channel axis.
- Returns
Shape of the stream’s data.
- Return type
tuple of int
- process(data, indices)[source]¶
Select the required channel in data.
- Parameters
data (numpy.ndarray) – Color data, from which a channel is selected.
indices (int) – Not used. Present for compatibility with other filters.
- Returns
Selected channel in data.
- Return type
- class bob.io.stream.StreamStacked(stack_stream, name, parent)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to stack streams along the channel dimension.
The stream stacks its parent Stream with its stack_stream.
- stack_stream¶
The stream to stack with parent.
- Type
- set_source(src)[source]¶
Set self and stack_stream source to src.
- Parameters
src (Stream or StreamFile) – Source Stream or StreamFile.
- property shape¶
Shape of the stream’s data. The number of channels is the sum of the parent’s and the stacked stream’s.
- Returns
Shape of the stream’s data.
- Return type
tuple of int
- process(data, indices)[source]¶
Stacks data from stack_stream with data (which comes from parent).
data comes from parent with shape (n, c1, …), this method loads the data of stack_stream at the same indices, which has shape (n, c2, …), then stacks them to output an array of shape (n, c1 + c2, …)
parent and stack_stream must have the same dimensions, except in the channel axis.
- Parameters
data (numpy.ndarray) – Parent stream’s data at indices.
indices (int or list of int) – Indices of data.
- Returns
data from parent stacked with data at indices from stack_stream along the channel dimension.
- Return type
- process_frame(data, data_index, stream_index)[source]¶
Concatenate frame from parent and stack_stream along channel axis.
- Parameters
data (numpy.ndarray) – Parent frame at data_index.
data_index (int) – Index of the frames to stack in the streams.
stream_index (int) – Not used. Present for compatibility with other filters.
- Returns
Concatenated frames from parent and stack_stream streams.
- Return type
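The shape rule given for StreamStacked, (n, c1, …) stacked with (n, c2, …) giving (n, c1 + c2, …), can be checked with a small sketch on nested lists. `stack_channels` and `shape_of` are hypothetical helpers; the real filter works on numpy arrays.

```python
def stack_channels(parent_data, stack_data):
    """Concatenate two batches frame by frame along the channel axis."""
    if len(parent_data) != len(stack_data):
        raise ValueError("both batches must contain the same number of frames")
    return [list(a) + list(b) for a, b in zip(parent_data, stack_data)]


def shape_of(nested):
    """Shape of a regular, non-empty nested list, as a tuple of int."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Two frames each; the color batch has 3 channels, the depth batch has 1.
# Every channel is a 2x2 image filled with a single value.
color = [[[[f * 10 + c] * 2] * 2 for c in range(3)] for f in range(2)]
depth = [[[[f + 100] * 2] * 2] for f in range(2)]

stacked = stack_channels(color, depth)
# shape_of(color) == (2, 3, 2, 2), shape_of(depth) == (2, 1, 2, 2)
# shape_of(stacked) == (2, 4, 2, 2)
```

Only the channel axis grows; the frame count and the image dimensions must already agree between the two streams.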
- class bob.io.stream.StreamSubtract(subtrahend, name, parent)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to subtract subtrahend from parent, clipping the resulting values to be non-negative.
- subtrahend¶
The stream whose data will be subtracted.
- Type
- set_source(src)[source]¶
Set self and subtrahend sources to src.
- Parameters
src (Stream or StreamFile) – Source stream or stream file.
- process(data, indices)[source]¶
Subtract subtrahend’s data from data.
- Parameters
data (numpy.ndarray) – Parent data at indices.
indices (int) – Indices of data.
- Returns
data minus subtrahend’s data.
- Return type
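The clipped subtraction described for StreamSubtract amounts to the following, shown here on plain lists with a hypothetical helper `subtract_clipped`. One plausible motivation for the clipping, stated here as an assumption, is that unsigned integer image data would otherwise wrap around when the subtrahend is larger than the parent value.

```python
def subtract_clipped(minuend, subtrahend):
    """Element-wise difference of two frames, with negative results set to 0."""
    if len(minuend) != len(subtrahend):
        raise ValueError("frames must have the same number of elements")
    return [max(a - b, 0) for a, b in zip(minuend, subtrahend)]


lit = [120, 80, 30, 200]      # e.g. a frame captured with illumination
ambient = [100, 90, 30, 50]   # e.g. a frame captured without it
difference = subtract_clipped(lit, ambient)
# difference == [20, 0, 0, 150]
```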
- class bob.io.stream.StreamView(name, parent, view_indices=None)[source]¶
Bases:
bob.io.stream.StreamFilter
Filter to implement “slicing” functionality for the bob.io.stream.Stream class.
Similarly to numpy’s “view”, this filter allows a slice of a stream to be taken without creating a copy of the data.
- frame_view¶
Slice value in the first dimension of the stream (along the frame’s axis). None means no slicing: take the whole array.
- Type
slice or None
- property shape¶
Shape of the stream’s data.
The shape is computed with respect to the parent’s shape, because source might not be set, in which case the shape of the data cannot be known. If the requested slice has an integer index along one axis, this dimension is dropped. However, taking an integer along the first axis is not allowed (Exception raised in __init__).
- Returns
Shape of the stream’s data.
- Return type
tuple of int
- property ndim¶
Number of dimensions of the stream’s data.
If the requested slice has an integer along an axis, this dimension is collapsed, otherwise the number of dimensions is the same as the parent’s.
- Returns
Number of dimensions.
- Return type
- load(index=None)[source]¶
Load stream’s data at the corresponding indices.
Maps index to indices in parent and delegates loading.
- process(data, indices)[source]¶
Apply slicing on each frame of data.
The slicing along the frame axis is performed in bob.io.stream.StreamView.load(), so that data only contains the requested frames. It remains to apply the slicing along the other axes of data, which is delegated to process_frame() (by slicing into the numpy arrays). Here we only store the requested slice in full format (a value along every axis).
- Parameters
data (numpy.ndarray) – Data to slice. Slicing on the first axis is already performed.
indices (int or list of int) – Indices of data in the stream.
- Returns
Sliced data.
- Return type
- process_frame(data, data_index, stream_index)[source]¶
Apply the slicing on a frame of data.
Apply the frame slicing computed in process() on a frame.
- Parameters
data (numpy.ndarray) – Frame of data.
data_index (int) – Not used. Present for compatibility with other filters.
stream_index (int) – Not used. Present for compatibility with other filters.
- Returns
Sliced data.
- Return type
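The shape rule described for StreamView (slices keep an axis, integer indices drop it, and an integer on the frame axis is rejected) can be computed from the parent’s shape alone, without touching any data. The helper `view_shape` below is hypothetical and only mirrors the behaviour described above; the exception type it raises is an assumption.

```python
def view_shape(parent_shape, view_indices=None):
    """Shape obtained by applying view_indices (slices or ints) to parent_shape."""
    view_indices = tuple(view_indices or ())
    if view_indices and isinstance(view_indices[0], int):
        raise ValueError("an integer index on the frame axis is not allowed")
    shape = []
    for axis, dim in enumerate(parent_shape):
        if axis >= len(view_indices):
            shape.append(dim)  # axis not mentioned: kept whole
            continue
        idx = view_indices[axis]
        if isinstance(idx, int):
            continue  # integer index: this dimension is dropped
        # slice.indices() resolves None and negative bounds for this axis.
        shape.append(len(range(*idx.indices(dim))))
    return tuple(shape)


parent_shape = (100, 3, 480, 640)  # frames, channels, height, width
first_channel = view_shape(parent_shape, (slice(0, 10), 0))
cropped = view_shape(parent_shape, (slice(None, None, 2), slice(None), slice(100, 200)))
# first_channel == (10, 480, 640)
# cropped == (50, 3, 100, 640)
```

Because only shapes and slices are involved, the result is available even when no source file has been attached to the stream yet.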
- bob.io.stream.stream_filter(name)¶
Adds the filter with name to the Stream class.
This decorator function is meant to be used on a filter class that inherits from the Stream class. It adds this filter to the Stream class so it can be used directly as a member. It also adds it to the filters list.
For example, see the StreamView filter.
- Parameters
name (str) – Name of the filter.
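The registration mechanism described for stream_filter() can be sketched as a class decorator that attaches a factory method to the base class. Everything in the block is hypothetical (`ToyStream`, `toy_stream_filter`, `ToyScale`), and passing name and parent as keyword arguments is an assumption made so that the sketch works with constructors shaped like those listed on this page.

```python
class ToyStream:
    """Hypothetical base class that collects registered filters."""

    filters = []

    def __init__(self, name=None, parent=None):
        self.name = name
        self.parent = parent


def toy_stream_filter(name):
    """Register the decorated class on ToyStream under the given name."""

    def wrapper(filter_class):
        def method(self, *args, **kwargs):
            # Calling stream.<name>(...) builds the filter with `self` as parent.
            return filter_class(*args, name=name, parent=self, **kwargs)

        setattr(ToyStream, name, method)
        ToyStream.filters.append(name)
        return filter_class

    return wrapper


@toy_stream_filter("scale")
class ToyScale(ToyStream):
    def __init__(self, factor, name, parent):
        super().__init__(name=name, parent=parent)
        self.factor = factor


chained = ToyStream("cam1").scale(2).scale(3)
```

Since filters inherit from the base class, they receive every registered method too, which is what makes chained calls such as `.scale(2).scale(3)` possible.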