Extending packages as frameworks
It is often necessary to extend your package into a framework that other packages build upon; bob.bio.base is a good example: it provides an API on top of which other packages are built. The utilities described on this page help in creating such framework packages and in building complex toolchains/pipelines.
Python-based Configuration System
This package also provides a configuration system that packages in the Bob ecosystem can use to load run-time configuration for applications (for package-level static variable configuration, use the Global Configuration System). It can be used to accept complex configurations from users through the command line. The run-time configuration system is deliberately simple: it uses Python itself to load and validate input files, making no a priori assumptions about the amount or complexity of data that needs to be configured.
The configuration system is centered around a single function, bob.extension.config.load(). You call it to load configuration objects from one or more configuration files, like this:
>>> import os
>>> from bob.extension.config import load
>>> # the variable `path` points to <path-to-bob.extension's root>/data
>>> configuration = load([os.path.join(path, 'basic_config.py')])
If the function bob.extension.config.load() succeeds, it returns an object whose attributes hold the loaded configuration values (of any kind). For example, if the file basic_config.py contained:
a = 1
b = a + 2
Then, the object configuration would look like this:
>>> print("a = %d\nb = %d"%(configuration.a, configuration.b))
a = 1
b = 3
The configuration file is not limited to simple Python operations; you can import modules, define functions, and more.
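To illustrate, here is a hypothetical configuration file that imports a module and defines a function, together with a minimal simulation of what a Python-based loader does (using exec in a fresh namespace; the real bob.extension.config.load() is more elaborate, and the variable names below are purely illustrative):

```python
import types

# A hypothetical configuration file: beyond simple assignments, it may
# import modules and define functions.
CONFIG_SOURCE = """
import math

n_angles = 4

def angles():
    # derive a list of angles from the configured count
    return [k * math.pi / n_angles for k in range(n_angles)]

samples = angles()
"""

# Simulate what a Python-based loader does: execute the file's source in a
# fresh namespace and keep the resulting variables as attributes.
namespace = {}
exec(compile(CONFIG_SOURCE, "complex_config.py", "exec"), namespace)
config = types.SimpleNamespace(**{k: v for k, v in namespace.items()
                                  if not k.startswith("__")})

print(config.n_angles)      # 4
print(len(config.samples))  # 4
```

Because the configuration is plain Python, any value computable at load time (derived lists, callables, imported objects) can become part of the configuration.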
Chain Loading
It is possible to implement chain configuration loading and overriding by passing an iterable with more than one filename to bob.extension.config.load(). Suppose we have two configuration files which must be loaded in sequence:
basic_config.py:

a = 1
b = a + 2
load_config.py:

# the b variable from the last config file is available here
c = b + 1
b = b + 3
Then, one can chain-load them like this:
>>> #the variable `path` points to <path-to-bob.extension's root>/data
>>> file1 = os.path.join(path, 'basic_config.py')
>>> file2 = os.path.join(path, 'load_config.py')
>>> configuration = load([file1, file2])
>>> print("a = %d \nb = %d"%(configuration.a, configuration.b))
a = 1
b = 6
When overriding values this way, the user is responsible for the order in which the files are listed, since values from later files override those from earlier ones.
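The override semantics above can be sketched with plain Python: executing each file in the same shared namespace lets later files read and override earlier values (a simplified simulation; the real loader adds error handling and entry-point resolution):

```python
import types

# Two configuration files loaded in sequence, with contents matching the
# basic_config.py / load_config.py example above.
FILE1 = "a = 1\nb = a + 2\n"
FILE2 = "# the b variable from the last config file is available here\nc = b + 1\nb = b + 3\n"

# Sketch of chain loading: every file is executed in the SAME namespace,
# so the second file sees b == 3 and then overrides it.
namespace = {}
for source, name in [(FILE1, "basic_config.py"), (FILE2, "load_config.py")]:
    exec(compile(source, name, "exec"), namespace)

config = types.SimpleNamespace(**{k: v for k, v in namespace.items()
                                  if not k.startswith("__")})
print(config.a, config.b, config.c)  # 1 6 4
```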
Entry Points
The function bob.extension.config.load() can also load config files through Setuptools entry points and module names. You only need to provide the group name of the entry points:
>>> group = 'bob.extension.test_config_load' # the group name of entry points
>>> file1 = 'basic_config' # an entry point name
>>> file2 = 'bob.extension.data.load_config' # module name
>>> configuration = load([file1, file2], entry_point_group=group)
>>> print("a = %d \nb = %d"%(configuration.a, configuration.b))
a = 1
b = 6
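For reference, an entry point in that group could be declared in a package's setup.py roughly as follows. This is a hedged sketch: the group and entry point names mirror the example above, but the module path on the right-hand side is an assumption, not necessarily how bob.extension itself declares it:

```python
from setuptools import setup

setup(
    name='bob.extension',
    # ... other metadata elided ...
    entry_points={
        # group name -> list of 'name = module.path' declarations
        'bob.extension.test_config_load': [
            'basic_config = bob.extension.data.basic_config',
        ],
    },
)
```

With such a declaration installed, passing entry_point_group='bob.extension.test_config_load' lets users refer to the configuration simply as 'basic_config'.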
Stacked Processing
bob.extension.processors.SequentialProcessor and bob.extension.processors.ParallelProcessor are provided to help you build complex processing mechanisms. You can use these processors to apply a chain of processes to your data. For example, bob.extension.processors.SequentialProcessor accepts a list of callables and applies them to the data one by one, sequentially:
>>> import numpy as np
>>> from functools import partial
>>> from bob.extension.processors import SequentialProcessor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> seq_processor = SequentialProcessor(
... [np.cast['float64'], lambda x: x / 2, partial(np.mean, axis=1)])
>>> seq_processor(raw_data)
array([ 1., 1.])
>>> np.all(seq_processor(raw_data) ==
... np.mean(np.cast['float64'](raw_data) / 2, axis=1))
True
bob.extension.processors.ParallelProcessor accepts a list of callables, applies each of them to the data independently, and returns all the results. For example:
>>> import numpy as np
>>> from functools import partial
>>> from bob.extension.processors import ParallelProcessor
>>> raw_data = np.array([[1, 2, 3], [1, 2, 3]])
>>> parallel_processor = ParallelProcessor(
... [np.cast['float64'], lambda x: x / 2.0])
>>> list(parallel_processor(raw_data))
[array([[ 1., 2., 3.],
[ 1., 2., 3.]]), array([[ 0.5, 1. , 1.5],
[ 0.5, 1. , 1.5]])]
The data may be further processed using a bob.extension.processors.SequentialProcessor:
>>> from bob.extension.processors import SequentialProcessor
>>> total_processor = SequentialProcessor(
... [parallel_processor, list, partial(np.concatenate, axis=1)])
>>> total_processor(raw_data)
array([[ 1. , 2. , 3. , 0.5, 1. , 1.5],
[ 1. , 2. , 3. , 0.5, 1. , 1.5]])
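The behavior shown above can be summarized with minimal sketches of the two processors, assuming only that each wraps a list of callables (the real classes live in bob.extension.processors and may differ in detail):

```python
class SequentialProcessor:
    """Applies callables one after another, feeding each the previous result."""

    def __init__(self, callables):
        self.callables = list(callables)

    def __call__(self, data, **kwargs):
        for func in self.callables:
            data = func(data, **kwargs)
        return data


class ParallelProcessor:
    """Applies each callable to the same input independently."""

    def __init__(self, callables):
        self.callables = list(callables)

    def __call__(self, data, **kwargs):
        # return a generator, so results are produced lazily
        return (func(data, **kwargs) for func in self.callables)


# Usage with plain Python callables (no NumPy needed for the sketch):
seq = SequentialProcessor([lambda x: x + 1, lambda x: x * 2])
par = ParallelProcessor([lambda x: x + 1, lambda x: x * 2])
print(seq(3))        # (3 + 1) * 2 = 8
print(list(par(3)))  # [4, 6]
```

This also shows why list appears in the total_processor example above: the parallel stage yields an iterable of results, which must be materialized before np.concatenate can combine them.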