Python API¶
This section includes information for using the pure Python API of bob.learn.em.
Classes¶
bob.learn.em.KMeansMachine – Stores the k-means clusters parameters (centroid of each cluster).
bob.learn.em.GMMStats – Stores accumulated statistics of a GMM.
bob.learn.em.GMMMachine – Transformer that stores a Gaussian Mixture Model (GMM) parameters.
bob.learn.em.WCCN – Trains a linear machine to perform Within-Class Covariance Normalization (WCCN).
bob.learn.em.Whitening – Trains an Estimator to perform Cholesky whitening.
Functions¶
bob.learn.em.linear_scoring
(models_means, ...) – Estimation of the LLR between a target model and the UBM for a test instance.
Detailed Information¶
- class bob.learn.em.GMMMachine(n_gaussians: int, trainer: str = 'ml', ubm: Optional[GMMMachine] = None, convergence_threshold: float = 1e-05, max_fitting_steps: Optional[int] = 200, random_state: Union[int, RandomState] = 0, weights: Optional[ndarray['n_gaussians', float]] = None, k_means_trainer: Optional[KMeansMachine] = None, update_means: bool = True, update_variances: bool = False, update_weights: bool = False, mean_var_update_threshold: float = 2.220446049250313e-16, map_alpha: float = 0.5, map_relevance_factor: Union[None, float] = 4, **kwargs)¶
Bases:
BaseEstimator
Transformer that stores a Gaussian Mixture Model (GMM) parameters.
This class implements the statistical model for multivariate diagonal mixture Gaussian distribution (GMM), as well as ways to train a model on data.
A GMM is defined as \(\sum_{c=0}^{C} \omega_c \mathcal{N}(x | \mu_c, \sigma_c)\), where \(C\) is the number of Gaussian components and \(\mu_c\), \(\sigma_c\), and \(\omega_c\) are respectively the mean, the variance, and the weight of each Gaussian component \(c\). See Section 2.3.9 of Bishop, "Pattern Recognition and Machine Learning", 2006.
Two types of training are available, MLE and MAP, selected with the trainer argument.
Maximum Likelihood Estimation (MLE, ML)
The mixtures are initialized (with k-means by default). The means, variances, and weights of the mixtures are then trained on the data to increase the likelihood value.
Maximum a Posteriori (MAP)
The MAP machine takes another GMM machine as a prior, called the Universal Background Model (UBM). The means, variances, and weights of the MAP mixtures are then trained on the data as an adaptation of the UBM.
Both training methods use an Expectation-Maximization (EM) algorithm to iteratively train the GMM.
Note
When setting manually any of the means, variances or variance thresholds, the k-means initialization will be skipped in fit.
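A minimal usage sketch (the data and hyper-parameter values below are illustrative, not taken from the package documentation)::

    import numpy as np
    from bob.learn.em import GMMMachine

    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 3))  # illustrative training features

    # MLE training: k-means initialization followed by EM updates of the means.
    prior = GMMMachine(n_gaussians=2, trainer="ml", max_fitting_steps=50)
    prior = prior.fit(data)

    # MAP training: adapt the prior (UBM) towards a smaller, client-specific set.
    adapted = GMMMachine(n_gaussians=2, trainer="map", ubm=prior).fit(data[:50])
    print(adapted.means.shape)  # (n_gaussians, n_features) -> (2, 3)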
- means, variances, variance_thresholds
The parameters of the Gaussians.
- classmethod from_hdf5(hdf5, ubm=None)[source]¶
Creates a new GMMMachine object from an HDF5File object.
- property g_norms¶
Precomputed g_norms (depends on variances and feature shape).
- initialize_gaussians(data: Optional[ndarray['n_samples', 'n_features', float]] = None)[source]¶
Populates the Gaussian parameters with either the k-means or the UBM values.
- is_similar_to(other, rtol=1e-05, atol=1e-08)[source]¶
Returns True if other has the same gaussians (within a tolerance).
- log_likelihood(data: ndarray['n_samples', 'n_features', float])[source]¶
Returns the current log likelihood for a set of data in this Machine.
- Parameters
data – Data to compute the log likelihood on.
- Returns
The log likelihood of each sample.
- Return type
array of shape (n_samples)
- log_weighted_likelihood(data: ndarray['n_samples', 'n_features', float])[source]¶
Returns the weighted log likelihood for each Gaussian for a set of data.
- Parameters
data – Data to compute the log likelihood on.
- Returns
The weighted log likelihood of each sample of each Gaussian.
- Return type
array of shape (n_gaussians, n_samples)
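For example, per-sample and per-Gaussian likelihoods can be obtained as follows (random data for illustration)::

    import numpy as np
    from bob.learn.em import GMMMachine

    data = np.random.default_rng(1).normal(size=(50, 2))
    machine = GMMMachine(n_gaussians=3).fit(data)

    ll = machine.log_likelihood(data)                     # shape (50,): one value per sample
    per_gaussian = machine.log_weighted_likelihood(data)  # shape (3, 50)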
- property log_weights¶
Retrieve the logarithm of the weights.
- property means¶
The means of each Gaussian.
- property shape¶
Shape of the gaussians in the GMM machine.
- property variance_thresholds¶
Threshold below which variances are clamped to prevent precision losses.
- property variances¶
The (diagonal) variances of the gaussians.
- property weights¶
The weights of each Gaussian mixture.
- class bob.learn.em.GMMStats(n_gaussians: int, n_features: int, like=None, **kwargs)¶
Bases:
object
Stores accumulated statistics of a GMM.
- n¶
Sum of responsibility.
- Type
array of shape (n_gaussians,)
- sum_px¶
First order statistic
- Type
array of shape (n_gaussians, n_features)
- sum_pxx¶
Second order statistic
- Type
array of shape (n_gaussians, n_features)
- init_fields(log_likelihood=0.0, t=0, n=None, sum_px=None, sum_pxx=None)[source]¶
Initializes the statistics values to a defined value, or zero by default.
- is_similar_to(other, rtol=1e-05, atol=1e-08)[source]¶
Returns True if other has the same values (within a tolerance).
- property nbytes¶
The number of bytes used by the statistics n, sum_px, sum_pxx.
- property shape¶
The number of gaussians and their dimensionality.
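A small sketch of the documented attributes, assuming the statistics arrays are allocated at construction::

    from bob.learn.em import GMMStats

    stats = GMMStats(n_gaussians=2, n_features=3)
    stats.init_fields()            # reset the accumulators to zero
    print(stats.shape)             # (2, 3)
    print(stats.n.shape)           # (2,)    sum of responsibilities
    print(stats.sum_px.shape)      # (2, 3)  first-order statistics
    print(stats.sum_pxx.shape)     # (2, 3)  second-order statistics
    print(stats.nbytes)            # memory used by n, sum_px and sum_pxx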
- class bob.learn.em.ISVMachine(r_U, em_iterations=10, relevance_factor=4.0, random_state=0, ubm=None, ubm_kwargs=None, **kwargs)¶
Bases:
FactorAnalysisBase
Implements the Inter-Session Variability (ISV) modeling hypothesis on top of GMMs.
Inter-Session Variability (ISV) modeling is a session variability modeling technique built on top of the Gaussian mixture modeling approach. It hypothesizes that within-class variations are embedded in a linear subspace of the GMM mean space, and that these variations can be suppressed by an offset w.r.t. each mean during the MAP adaptation. For more information, check [McCool2013].
- Parameters
r_U (int) – Dimension of the subspace U
em_iterations (int) – Number of EM iterations
relevance_factor (float) – Factor analysis relevance factor
random_state (int) – random_state for the random number generator
ubm (
bob.learn.em.GMMMachine
or None) – A trained UBM (Universal Background Model). If None, the UBM is trained with a new bob.learn.em.GMMMachine
when fit is called, with ubm_kwargs as parameters.
- enroll(X)[source]¶
Enrolls a new client. In ISV, the enrolment is defined as \(m + Dz\), with the latent variable \(z\) representing the enrolled model.
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics to be enrolled
- Returns
z – The enrolled model (latent variable z)
- Return type
array
- fit(X, y)[source]¶
Trains the U matrix (session variability matrix)
- Parameters
X (numpy.ndarray) – Nxd features of N GMM statistics
y (numpy.ndarray) – The input labels, a 1D numpy array of shape (number of samples, )
- Returns
self – Returns self.
- Return type
- m_step(acc_U_A1_acc_U_A2_list)[source]¶
ISV M-step. This updates the U matrix.
- Parameters
acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)
- score(latent_z, data)[source]¶
Computes the ISV score
- Parameters
latent_z (numpy.ndarray) – Latent representation of the client (E[z_i])
data (list of
bob.learn.em.GMMStats
) – List of statistics to be scored
- Returns
score – The linear score
- Return type
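A sketch of a typical fit → enroll → score flow. It assumes that fit accepts per-sample features with integer labels (as the signature above suggests) and that the UBM exposes an acc_stats helper (not documented in this section) to turn features into GMMStats::

    import numpy as np
    from bob.learn.em import GMMMachine, ISVMachine

    rng = np.random.default_rng(0)
    features = rng.normal(size=(40, 3))      # illustrative background features
    labels = np.repeat([0, 1], 20)           # one integer class label per sample

    ubm = GMMMachine(n_gaussians=2).fit(features)
    isv = ISVMachine(r_U=2, em_iterations=5, ubm=ubm).fit(features, labels)

    # acc_stats is assumed here to turn raw features into GMMStats.
    enroll_stats = [ubm.acc_stats(rng.normal(size=(10, 3)))]
    model = isv.enroll(enroll_stats)         # latent variable z of the client
    probe_stats = [ubm.acc_stats(rng.normal(size=(10, 3)))]
    print(isv.score(model, probe_stats))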
- class bob.learn.em.JFAMachine(r_U, r_V, em_iterations=10, relevance_factor=4.0, random_state=0, ubm=None, ubm_kwargs=None, **kwargs)¶
Bases:
FactorAnalysisBase
Joint Factor Analysis (JFA) is an extension of ISV. Besides the within-class assumption (modeled with \(U\)), it also hypothesizes that between-class variations are embedded in a low-rank rectangular matrix \(V\). In the supervector notation, this modeling has the following shape: \(\mu_{i, j} = m + Ux_{i, j} + Vy_{i} + Dz_{i}\).
For more information check [McCool2013]
- Parameters
ubm (
bob.learn.em.GMMMachine
) – A trained UBM (Universal Background Model)
r_U (int) – Dimension of the subspace U
r_V (int) – Dimension of the subspace V
em_iterations (int) – Number of EM iterations
relevance_factor (float) – Factor analysis relevance factor
random_state (int) – random_state for the random number generator
- e_step_d(X, y, n_samples_per_class, latent_x, latent_y, n_acc, f_acc)[source]¶
E-step for the D matrix.
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
latent_x (array) – E(x) latent variable
latent_y (array) – E(y) latent variable
latent_z (array) – E(z) latent variable
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics
- Returns
acc_D_A1 (array) – Accumulated statistics for D_A1, of shape (n_gaussians * feature_dimension,)
acc_D_A2 (array) – Accumulated statistics for D_A2, of shape (n_gaussians * feature_dimension,)
- e_step_u(X, y, n_samples_per_class, latent_y)[source]¶
ISV E-step for the U matrix.
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
y (list of int) – List of labels
latent_y (array) – E(y) latent variable
- Returns
acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)
- e_step_v(X, y, n_samples_per_class, n_acc, f_acc)[source]¶
ISV E-step for the V matrix.
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics
- Returns
acc_V_A1 (array) – Accumulated statistics for V_A1, of shape (n_gaussians, r_V, r_V)
acc_V_A2 (array) – Accumulated statistics for V_A2, of shape (n_gaussians * feature_dimension, r_V)
- enroll(X)[source]¶
Enrolls a new client. In JFA the enrolment is defined as: \(m + Vy + Dz\) with the latent variables y and z representing the enrolled model.
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
- Returns
z, y – The enrolled latent variables
- Return type
array
- finalize_u(X, y, n_samples_per_class, latent_y)[source]¶
Computes the final estimate of E[x].
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
latent_y (array) – E[y] latent variable
- Returns
latent_x – E[x]
- Return type
array
- finalize_v(X, y, n_samples_per_class, n_acc, f_acc)[source]¶
Computes the final estimate of E[y].
- Parameters
X (list of
bob.learn.em.GMMStats
) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics
- Returns
latent_y – E[y]
- Return type
array
- fit(X, y)[source]¶
Trains the U matrix (session variability matrix)
- Parameters
X (numpy.ndarray) – Nxd features of N GMM statistics
y (numpy.ndarray) – The input labels, a 1D numpy array of shape (number of samples, )
- Returns
self – Returns self.
- Return type
- m_step_d(acc_D_A1_acc_D_A2_list)[source]¶
D Matrix M-step. This updates the D matrix
- Parameters
acc_D_A1 (array) – Accumulated statistics for D_A1, of shape (n_gaussians * feature_dimension,)
acc_D_A2 (array) – Accumulated statistics for D_A2, of shape (n_gaussians * feature_dimension,)
- m_step_u(acc_U_A1_acc_U_A2_list)[source]¶
U Matrix M-step. This updates the U matrix
- Parameters
acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)
- m_step_v(acc_V_A1_acc_V_A2_list)[source]¶
V Matrix M-step. This updates the V matrix
- Parameters
acc_V_A1 (array) – Accumulated statistics for V_A1, of shape (n_gaussians, r_V, r_V)
acc_V_A2 (array) – Accumulated statistics for V_A2, of shape (n_gaussians * feature_dimension, r_V)
- score(model, data)[source]¶
Computes the JFA score
- Parameters
model (array) – The enrolled client model (the latent variables returned by enroll)
data (list of
bob.learn.em.GMMStats
) – List of statistics to be scored
- Returns
score – The linear score
- Return type
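A similar sketch for JFA, under the same assumptions as the ISV example above (per-sample features and integer labels for fit, and an assumed acc_stats helper on the UBM for computing GMMStats)::

    import numpy as np
    from bob.learn.em import GMMMachine, JFAMachine

    rng = np.random.default_rng(0)
    features = rng.normal(size=(60, 3))
    labels = np.repeat([0, 1, 2], 20)

    ubm = GMMMachine(n_gaussians=2).fit(features)
    jfa = JFAMachine(r_U=2, r_V=2, em_iterations=5, ubm=ubm).fit(features, labels)

    # acc_stats is assumed here to turn raw features into GMMStats.
    model = jfa.enroll([ubm.acc_stats(rng.normal(size=(10, 3)))])  # latent y, z
    print(jfa.score(model, [ubm.acc_stats(rng.normal(size=(10, 3)))]))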
- class bob.learn.em.KMeansMachine(n_clusters: int, init_method: Union[str, ndarray] = 'k-means||', convergence_threshold: float = 1e-05, max_iter: int = 20, random_state: Union[int, RandomState] = 0, init_max_iter: Optional[int] = 5, oversampling_factor: float = 2, **kwargs)¶
Bases:
BaseEstimator
Stores the k-means clusters parameters (centroid of each cluster).
Allows the clustering of data with the fit method.
- The training works in two phases:
An initialization (setting the initial values of the centroids)
An e-m loop reducing the total distance between the data points and their closest centroid.
The initialization can use an iterative process to find the best set of coordinates, use random starting points, or take specified coordinates. The init_method parameter specifies which of these behaviors is used.
- centroids_¶
The current clusters centroids. Available after fitting.
- Type
ndarray of shape (n_clusters, n_features)
- get_variances_and_weights_for_each_cluster(data: ndarray)[source]¶
Returns the clusters variance and weight for data clustered by the machine.
For each cluster, finds the subset of the samples that is closest to that centroid, and calculates: 1) the variance of that subset (the cluster variance) 2) the proportion of samples represented by that subset (the cluster weight)
- Parameters
data – The data to compute the variance of.
- Returns
- variances: ndarray of shape (n_clusters, n_features)
For each cluster, the variance in each dimension of the data.
- weights: ndarray of shape (n_clusters, )
Weight (proportion of the data points) of each cluster.
- Return type
Tuple of arrays
- initialize(data: ndarray)[source]¶
Assigns the means to an initial value using a specified method or randomly.
- predict(X)[source]¶
Returns the labels of the closest cluster centroid to the data.
- Parameters
X (ndarray of shape (n_samples, n_features)) – Series of data points.
- Returns
indices – The indices of the closest cluster for each data point.
- Return type
ndarray of shape (n_samples)
- transform(X)[source]¶
Returns all the distances between the data and each cluster’s mean.
- Parameters
X (ndarray of shape (n_samples, n_features)) – Series of data points.
- Returns
distances – For each mean, for each point, the squared Euclidean distance between them.
- Return type
ndarray of shape (n_clusters, n_samples)
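A minimal usage sketch with random data (cluster count and shapes are illustrative)::

    import numpy as np
    from bob.learn.em import KMeansMachine

    data = np.random.default_rng(0).normal(size=(100, 2))
    km = KMeansMachine(n_clusters=3, max_iter=20).fit(data)

    print(km.centroids_.shape)    # (3, 2)
    labels = km.predict(data)     # index of the closest centroid, shape (100,)
    dists = km.transform(data)    # squared distances, shape (3, 100)
    variances, weights = km.get_variances_and_weights_for_each_cluster(data)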
- class bob.learn.em.WCCN(pinv=False, **kwargs)¶
Bases:
TransformerMixin
,BaseEstimator
Trains a linear machine to perform Within-Class Covariance Normalization (WCCN). WCCN finds the projection matrix W that allows us to linearly project the data matrix X to another (sub)space such that:
\[(1/N) S_{w} = W W^T\]
where \(W\) is an upper triangular matrix computed using Cholesky decomposition:
\[W = cholesky([(1/K) S_{w} ]^{-1})\]
where:
\(K\) is the number of classes,
\(S_w\) is the within-class scatter; it has dimensions (X.shape[0], X.shape[0]) and is defined as \(S_w = \sum_{k=1}^K \sum_{n \in C_k} (x_n-m_k)(x_n-m_k)^T\), with \(C_k\) being the set of all samples of class k,
\(m_k\) is the empirical mean of class k, defined as \(m_k = \frac{1}{N_k}\sum_{n \in C_k} x_n\).
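A minimal usage sketch, assuming the scikit-learn style fit(X, y) and transform(X) API with integer class labels (random data for illustration)::

    import numpy as np
    from bob.learn.em import WCCN

    X = np.random.default_rng(0).normal(size=(20, 4))
    y = np.repeat([0, 1], 10)      # class label of each sample (assumed fit(X, y) API)

    wccn = WCCN().fit(X, y)
    projected = wccn.transform(X)  # X projected with the learned matrix W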
References
Within-class covariance normalization for SVM-based speaker recognition, Andrew O. Hatch, Sachin Kajarekar, and Andreas Stolcke, In INTERSPEECH, 2006.
- class bob.learn.em.Whitening(pinv: bool = False, **kwargs)¶
Bases:
TransformerMixin
,BaseEstimator
Trains an Estimator to perform Cholesky whitening.
The whitening transformation is a decorrelation method that converts the covariance matrix of a set of samples into the identity matrix \(I\). This effectively linearly transforms random variables such that the resulting variables are uncorrelated and have unit variance.
This transformation is invertible. The method is called the whitening transform because it transforms the input matrix \(X\) closer towards white noise (let’s call it \(\tilde{X}\)):
\[Cov(\tilde{X}) = I\]
with:
\[\tilde{X} = X W\]
where \(W\) is the projection matrix that allows us to linearly project the data matrix \(X\) to another (sub)space such that:
\[Cov(X) = W W^T\]
\(W\) is computed using Cholesky decomposition:
\[W = cholesky([Cov(X)]^{-1})\]
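A minimal usage sketch (random data; transform is assumed from the TransformerMixin API)::

    import numpy as np
    from bob.learn.em import Whitening

    X = np.random.default_rng(0).normal(size=(500, 3))
    white = Whitening().fit(X)      # learns W from Cov(X)
    Xw = white.transform(X)         # whitened samples: Cov(Xw) is close to identity
    print(np.cov(Xw, rowvar=False).round(2))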
- bob.learn.em.linear_scoring(models_means: Union[list[bob.learn.em.GMMMachine], ndarray['n_models', 'n_gaussians', 'n_features', float]], ubm: GMMMachine, test_stats: Union[list[bob.learn.em.GMMStats], GMMStats], test_channel_offsets: ndarray['n_test_stats', 'n_gaussians', float] = 0, frame_length_normalization: bool = False) ndarray['n_models', 'n_test_stats', float] [source]¶
Estimation of the LLR between a target model and the UBM for a test instance.
The Linear scoring is an approximation to the log-likelihood ratio (LLR) that was shown to be as accurate and up to two orders of magnitude more efficient to compute. [Glembek2009]
- Parameters
models_means – The model(s) to score against. If a list of GMMMachine is given, the means of each model are considered.
ubm – The Universal Background Model. Accepts a GMMMachine object. If the GMMMachine uses MAP, its ubm attribute is used.
test_stats – The instances to score.
test_channel_offsets – Offset values added to the test instances.
- Returns
The scores of each probe against each model.
- Return type
Array of shape (n_models, n_probes)
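A hedged usage sketch: the models are MAP-adapted GMMMachine objects, and an acc_stats helper (not documented in this section) is assumed for computing the probe GMMStats::

    import numpy as np
    from bob.learn.em import GMMMachine, linear_scoring

    rng = np.random.default_rng(0)
    ubm = GMMMachine(n_gaussians=2).fit(rng.normal(size=(100, 3)))

    # Two MAP-adapted client models; acc_stats is assumed to compute probe GMMStats.
    models = [
        GMMMachine(n_gaussians=2, trainer="map", ubm=ubm).fit(rng.normal(size=(20, 3)))
        for _ in range(2)
    ]
    probe_stats = ubm.acc_stats(rng.normal(size=(10, 3)))

    scores = linear_scoring(models, ubm, probe_stats)
    print(scores.shape)   # (n_models, n_test_stats) -> (2, 1)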