bob.ip.binseg.engine.evaluator
Defines functionality for the evaluation of predictions
Functions

compare_annotators(baseline, other, name, output_folder, overlayed_folder=None) – Compares annotations on the same dataset
run(dataset, name, predictions_folder, output_folder=None, overlayed_folder=None, threshold=None, steps=1000) – Runs inference and calculates measures
sample_measures_for_threshold(pred, gt, mask, threshold) – Calculates counts on one single sample, for a specific threshold
- bob.ip.binseg.engine.evaluator.sample_measures_for_threshold(pred, gt, mask, threshold)[source]
Calculates counts on one single sample, for a specific threshold
- Parameters
pred (torch.Tensor) – pixel-wise predictions
gt (torch.Tensor) – ground-truth (annotations)
mask (torch.Tensor) – region mask (used only if available). May be set to None
threshold (float) – the threshold at which to calculate the performance measures
- Returns
tp (int) – true positives
fp (int) – false positives
tn (int) – true negatives
fn (int) – false negatives
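These four counts are the raw ingredients for precision, recall and F1. A minimal sketch of calling this function on synthetic tensors follows; the 2×2 values are illustrative only, and it is assumed here that the four counts come back as a plain tuple in the documented order:

    import torch

    from bob.ip.binseg.engine.evaluator import sample_measures_for_threshold

    # Toy pixel-wise probabilities and binary annotations (real values come
    # from a trained model and from a dataset sample, respectively).
    pred = torch.tensor([[0.9, 0.2], [0.7, 0.1]])
    gt = torch.tensor([[1.0, 0.0], [0.0, 0.0]])

    tp, fp, tn, fn = sample_measures_for_threshold(pred, gt, None, 0.5)

    # Thresholding at 0.5 binarizes ``pred`` to [[1, 0], [1, 0]], so for this
    # toy input we expect tp=1, fp=1, tn=2, fn=0, from which, e.g.,
    # precision = tp / (tp + fp) and recall = tp / (tp + fn) follow.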
- bob.ip.binseg.engine.evaluator.run(dataset, name, predictions_folder, output_folder=None, overlayed_folder=None, threshold=None, steps=1000)[source]
Runs inference and calculates measures
- Parameters
dataset (torch.utils.data.Dataset) – a dataset to iterate on
name (str) – the local name of this dataset (e.g. train or test), to be used when saving measures files
predictions_folder (str) – folder where predictions for the dataset images have been previously stored
output_folder (str, Optional) – folder where to store results. If not provided, then no analysis is stored (useful for quickly calculating overlay thresholds)
overlayed_folder (str, Optional) – if not None, then it should be the name of a folder where to store overlayed versions of the images and ground-truths
threshold (float, Optional) – if overlayed_folder is set, then this should be the threshold (floating point) to apply to prediction maps to decide on positives and negatives for overlaying analysis (graphical output). This number should come from the training set or a separate validation set; using a test set value may bias your analysis. This number is also used to print the a priori F1-score on the evaluated set.
steps (int, Optional) – number of threshold steps to consider when evaluating thresholds
- Returns
threshold – Threshold to achieve the highest possible F1-score for this dataset
- Return type
float
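A minimal sketch of driving run() for a split whose predictions were already saved to disk. The dataset object and all folder names below are placeholders for whatever your experiment actually uses; only the argument order and keyword names follow the signature above:

    from bob.ip.binseg.engine.evaluator import run

    # ``train_dataset`` is a placeholder: any torch.utils.data.Dataset
    # delivering the samples whose prediction maps were previously written
    # to ``results/predictions`` by the prediction step.
    best_threshold = run(
        train_dataset,
        "train",                           # split name used in the measures files
        "results/predictions",             # where prediction maps were stored
        output_folder="results/analysis",  # analysis results are written here
        overlayed_folder=None,             # no graphical overlays
        threshold=None,
        steps=1000,                        # thresholds scanned between 0 and 1
    )

    # Re-using ``best_threshold`` on a separate test split keeps the reported
    # measures free from the bias of tuning the threshold on the test data.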
- bob.ip.binseg.engine.evaluator.compare_annotators(baseline, other, name, output_folder, overlayed_folder=None)[source]
Compares annotations on the same dataset
- Parameters
baseline (torch.utils.data.Dataset) – a dataset to iterate on, containing the baseline annotations
other (torch.utils.data.Dataset) – a second dataset, with the same samples as baseline, but annotated by a different annotator than in the first dataset. The key values must match between baseline and this dataset.
name (str) – the local name of this dataset (e.g. train-second-annotator or test-second-annotator), to be used when saving measures files
output_folder (str) – folder where to store results
overlayed_folder (str, Optional) – if not None, then it should be the name of a folder where to store overlayed versions of the images and ground-truths
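A minimal sketch of an inter-annotator comparison; baseline_dataset and second_annotator_dataset are placeholders for two datasets that return the same samples (same keys) annotated by different people:

    from bob.ip.binseg.engine.evaluator import compare_annotators

    compare_annotators(
        baseline_dataset,                # reference ("baseline") annotations
        second_annotator_dataset,        # same samples, second annotator's masks
        "test-second-annotator",         # name used when saving measures files
        "results/annotator-comparison",  # where comparison results are stored
        overlayed_folder=None,           # set to a path to also save overlay images
    )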