bob.med.tb.engine.evaluator

Defines functionality for the evaluation of predictions

Functions

`posneg`(pred, gt, threshold)	Calculates true and false positives and negatives
`run`(dataset, name, predictions_folder[, …])	Runs inference and calculates measures
`sample_measures_for_threshold`(pred, gt, …)	Calculates measures on one single sample, for a specific threshold

bob.med.tb.engine.evaluator.posneg(pred, gt, threshold)

Calculates true and false positives and negatives
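To make the four counts concrete, here is a minimal sketch of thresholding scores and tallying true/false positives and negatives. It is not the library's implementation: `count_posneg` is a hypothetical helper, it works on plain Python lists rather than `torch.Tensor` objects so that it has no dependencies, and treating a score greater than or equal to the threshold as a positive is an assumption.

```python
def count_posneg(pred, gt, threshold):
    """Hypothetical helper: count (tp, fp, tn, fn) for flat lists.

    pred: scores in [0, 1]; gt: binary ground-truth labels (0 or 1).
    """
    tp = fp = tn = fn = 0
    for score, label in zip(pred, gt):
        # Assumption: a score at or above the threshold is a positive call.
        predicted_positive = score >= threshold
        actual_positive = label > 0.5
        if predicted_positive and actual_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actual_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


pred = [0.9, 0.8, 0.3, 0.1]
gt = [1, 0, 1, 0]
print(count_posneg(pred, gt, 0.5))  # (1, 1, 1, 1)
```

Raising the threshold trades false positives for false negatives: on the same data, a threshold of 0.85 turns the 0.8 score into a true negative.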

bob.med.tb.engine.evaluator.sample_measures_for_threshold(pred, gt, threshold)

Calculates measures on one single sample, for a specific threshold

Parameters

pred (torch.Tensor) – pixel-wise predictions
gt (torch.Tensor) – ground-truth (annotations)
threshold (float) – the threshold at which to calculate the performance measures

Returns

precision (float)
recall (float)
specificity (float)
accuracy (float)
jaccard (float)
f1_score (float)
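The six returned measures can all be derived from the true/false positive and negative counts at the chosen threshold. The sketch below uses the standard textbook definitions; `measures_from_counts` is a hypothetical helper, and returning 0.0 when a denominator is zero is an assumption, since this page does not say how the library handles that case.

```python
def measures_from_counts(tp, fp, tn, fn):
    """Hypothetical helper: standard measures from confusion counts."""

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    jaccard = ratio(tp, tp + fp + fn)
    f1_score = ratio(2 * tp, 2 * tp + fp + fn)
    return precision, recall, specificity, accuracy, jaccard, f1_score


measures = measures_from_counts(tp=6, fp=2, tn=10, fn=2)
print(measures)
```

With 6 true positives, 2 false positives, 10 true negatives and 2 false negatives, precision, recall and F1-score are all 0.75, accuracy is 0.8 and the Jaccard index is 0.6.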

bob.med.tb.engine.evaluator.run(dataset, name, predictions_folder, output_folder=None, f1_thresh=None, eer_thresh=None, steps=1000)

Runs inference and calculates measures

Parameters

dataset (torch.utils.data.Dataset) – a dataset to iterate on
name (str) – the local name of this dataset (e.g. train, or test), to be used when saving measures files.
predictions_folder (str) – folder where predictions for the dataset images have been previously stored
output_folder (str, Optional) – folder where to store results.
f1_thresh (float, Optional) – a priori threshold. It should come from the training set or a separate validation set; using a value chosen on the test set may bias your analysis. It is also used to print the a priori F1-score on the evaluated set.
eer_thresh (float, Optional) – a priori threshold. It should come from the training set or a separate validation set; using a value chosen on the test set may bias your analysis. It is used to print the a priori EER.
steps (float, Optional) – number of threshold steps to consider when evaluating thresholds.

Returns

f1_threshold (float) – Threshold to achieve the highest possible F1-score for this dataset
eer_threshold (float) – Threshold achieving Equal Error Rate for this dataset
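As an illustration of how the two returned thresholds relate to `steps`, the sketch below sweeps a grid of candidate thresholds and keeps the one with the highest F1-score and the one where the false-positive and false-negative rates are closest. This is a sketch under stated assumptions, not the code behind `run`: `sweep_thresholds` is a hypothetical helper, the uniform grid over [0, 1) is an assumption, and inputs are plain Python lists rather than stored predictions.

```python
def sweep_thresholds(pred, gt, steps=1000):
    """Hypothetical helper: return (f1_threshold, eer_threshold)."""
    positives = sum(1 for label in gt if label)
    negatives = len(gt) - positives
    best_f1, f1_threshold = -1.0, 0.0
    best_gap, eer_threshold = float("inf"), 0.0
    for i in range(int(steps)):
        threshold = i / steps  # assumption: uniform grid over [0, 1)
        tp = sum(1 for s, l in zip(pred, gt) if s >= threshold and l)
        fp = sum(1 for s, l in zip(pred, gt) if s >= threshold and not l)
        fn = positives - tp
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        fpr = fp / negatives if negatives else 0.0
        fnr = fn / positives if positives else 0.0
        gap = abs(fpr - fnr)  # the EER is where both error rates meet
        if f1 > best_f1:
            best_f1, f1_threshold = f1, threshold
        if gap < best_gap:
            best_gap, eer_threshold = gap, threshold
    return f1_threshold, eer_threshold


pred = [0.15, 0.35, 0.55, 0.45, 0.65, 0.85]
gt = [0, 0, 0, 1, 1, 1]
print(sweep_thresholds(pred, gt, steps=10))  # (0.4, 0.5)
```

On this toy data the two thresholds differ: 0.4 maximizes the F1-score, while 0.5 is the point where one of three negatives and one of three positives are misclassified. Thresholds found this way on a training or validation set are the kind of value the `f1_thresh` and `eer_thresh` parameters expect.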