bob.ip.binseg.utils.measure¶

Functions

- auc: Calculates the area under the precision-recall curve (AUC)
- base_measures: Calculates frequentist measures from true/false positive and negative counts
- bayesian_measures: Calculates mean and mode from true/false positive and negative counts with credible regions
- beta_credible_region: Returns the mode, upper and lower bounds of the equal-tailed credible region of a probability estimate following Bernoulli trials
- tricky_division: Divides n by d

Classes

- SmoothedValue: Track a series of values and provide access to smoothed values over a window or the global series average
class bob.ip.binseg.utils.measure.SmoothedValue(window_size=20)[source]¶

Bases: object

Track a series of values and provide access to smoothed values over a window or the global series average.
property median¶

property avg¶

property
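A minimal sketch of how a SmoothedValue-like tracker can work, assuming the documented signature (window_size=20). The update method and the global_avg property are hypothetical names introduced here for illustration; only the class name, window_size, median, and avg appear in this page.

```python
from collections import deque

class SmoothedValue:
    """Sketch: windowed and global averaging of a value series."""

    def __init__(self, window_size=20):
        self.deque = deque(maxlen=window_size)  # last window_size values
        self.total = 0.0  # running sum over the whole series
        self.count = 0    # number of values ever recorded

    def update(self, value):  # hypothetical name, not from this page
        self.deque.append(value)
        self.total += value
        self.count += 1

    @property
    def median(self):
        # Median over the current window
        s = sorted(self.deque)
        n = len(s)
        return (s[(n - 1) // 2] + s[n // 2]) / 2.0

    @property
    def avg(self):
        # Average over the current window only
        return sum(self.deque) / len(self.deque)

    @property
    def global_avg(self):  # hypothetical name for the "global series average"
        return self.total / self.count
```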
bob.ip.binseg.utils.measure.tricky_division(n, d)[source]¶

Divides n by d. Returns 0.0 in case of a division by zero.
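A one-line sketch of what the documented behaviour implies (not the library's actual code):

```python
def tricky_division(n, d):
    """Divides n by d; returns 0.0 in case of a division by zero."""
    return n / d if d != 0 else 0.0
```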
bob.ip.binseg.utils.measure.base_measures(tp, fp, tn, fn)[source]¶

Calculates frequentist measures from true/false positive and negative counts

This function can return (frequentist versions of) standard machine learning measures from true and false positive counts of positives and negatives. For a thorough look into these and alternate names for the returned values, please check Wikipedia's entry on Precision and Recall.
Parameters

tp (int) – True positive count, AKA "hit"

fp (int) – False positive count, AKA "false alarm", or "Type I error"

tn (int) – True negative count, AKA "correct rejection"

fn (int) – False Negative count, AKA "miss", or "Type II error"
Returns

precision (float) – P, AKA positive predictive value (PPV). It corresponds arithmetically to tp/(tp+fp). In the case tp+fp == 0, this function returns zero for precision.

recall (float) – R, AKA sensitivity, hit rate, or true positive rate (TPR). It corresponds arithmetically to tp/(tp+fn). In the special case where tp+fn == 0, this function returns zero for recall.

specificity (float) – S, AKA selectivity or true negative rate (TNR). It corresponds arithmetically to tn/(tn+fp). In the special case where tn+fp == 0, this function returns zero for specificity.

accuracy (float) – A, see Accuracy. It is the proportion of correct predictions (both true positives and true negatives) among the total number of pixels examined. It corresponds arithmetically to (tp+tn)/(tp+tn+fp+fn). This measure includes both true negatives and true positives in the numerator, which makes it sensitive to data or regions without annotations.

jaccard (float) – J, see Jaccard Index or Similarity. It corresponds arithmetically to tp/(tp+fp+fn). In the special case where tp+fp+fn == 0, this function returns zero for the Jaccard index. The Jaccard index depends on a TP-only numerator, similarly to the F1 score. For regions where there are no annotations, the Jaccard index will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.

f1_score (float) – F1, see F1-score. It corresponds arithmetically to 2*P*R/(P+R) or 2*tp/(2*tp+fp+fn). In the special case where P+R == (2*tp+fp+fn) == 0, this function returns zero for the F1-score. The F1 or Dice score depends on a TP-only numerator, similarly to the Jaccard index. For regions where there are no annotations, the F1-score will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.
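The formulas above can be sketched directly; this is a hedged re-implementation of the documented arithmetic (divisions by zero yield 0.0, as stated), not the library's own code:

```python
def base_measures(tp, fp, tn, fn):
    """Sketch of the frequentist measures described above."""
    def safe(n, d):
        # Documented convention: return 0.0 on division by zero
        return n / d if d != 0 else 0.0
    precision = safe(tp, tp + fp)
    recall = safe(tp, tp + fn)
    specificity = safe(tn, tn + fp)
    accuracy = safe(tp + tn, tp + tn + fp + fn)
    jaccard = safe(tp, tp + fp + fn)
    f1_score = safe(2 * tp, 2 * tp + fp + fn)
    return precision, recall, specificity, accuracy, jaccard, f1_score
```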
bob.ip.binseg.utils.measure.beta_credible_region(k, l, lambda_, coverage)[source]¶

Returns the mode, upper and lower bounds of the equal-tailed credible region of a probability estimate following Bernoulli trials.
This implementation is based on [GOUTTE-2005]. It assumes \(k\) successes and \(l\) failures (\(n = k+l\) total trials) are issued from a series of Bernoulli trials (the likelihood is binomial). The posterior is derived using Bayes' theorem with a beta prior. As there is no reason to favour high vs. low precision, we use a symmetric Beta prior (\(\alpha=\beta\)):
\[\begin{split}P(p|k,n) &= \frac{P(k,n|p)P(p)}{P(k,n)} \\ P(p|k,n) &= \frac{\frac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}P(p)}{P(k)} \\ P(p|k,n) &= \frac{1}{B(k+\alpha, n-k+\beta)}p^{k+\alpha-1}(1-p)^{n-k+\beta-1} \\ P(p|k,n) &= \frac{1}{B(k+\alpha, n-k+\alpha)}p^{k+\alpha-1}(1-p)^{n-k+\alpha-1}\end{split}\]

The mode of this posterior (also the maximum a posteriori estimate) is:
\[\text{mode}(p) = \frac{k+\lambda-1}{n+2\lambda-2}\]

Concretely, the prior may be flat (all rates are equally likely, \(\lambda=1\)) or we may use Jeffreys' prior (\(\lambda=0.5\)), which is invariant under re-parameterisation. Jeffreys' prior indicates that rates close to zero or one are more likely.
The mode above works if \(k+\alpha, n-k+\alpha > 1\), which is usually the case for a reasonably well-tuned system, with more than a few samples for analysis. In the limit of the system performance, \(k\) may be 0, which makes the mode become zero.
For our purposes, it may be more suitable to represent \(n = k + l\), with \(k\) the number of successes and \(l\) the number of failures in the binomial experiment, leading to this alternative representation:
\[\begin{split}P(p|k,l) &= \frac{1}{B(k+\alpha, l+\alpha)}p^{k+\alpha-1}(1-p)^{l+\alpha-1} \\ \text{mode}(p) &= \frac{k+\lambda-1}{k+l+2\lambda-2}\end{split}\]

This can be mapped to most rates calculated in the context of binary classification this way:
- Precision or Positive-Predictive Value (PPV): p = TP/(TP+FP), so k=TP, l=FP
- Recall, Sensitivity, or True Positive Rate: r = TP/(TP+FN), so k=TP, l=FN
- Specificity or True Negative Rate: s = TN/(TN+FP), so k=TN, l=FP
- F1-score: f1 = 2TP/(2TP+FP+FN), so k=2TP, l=FP+FN
- Accuracy: acc = (TP+TN)/(TP+TN+FP+FN), so k=TP+TN, l=FP+FN
- Jaccard: j = TP/(TP+FP+FN), so k=TP, l=FP+FN
Contrary to frequentist approaches, in which one can only say that if the test were repeated an infinite number of times, and one constructed a confidence interval each time, then X% of the confidence intervals would contain the true rate, here we can say that given our observed data, there is a X% probability that the true value of \(k/n\) falls within the provided interval.
Note
For a disambiguation with Confidence Interval, read https://en.wikipedia.org/wiki/Credible_interval.
Parameters

k (int) – Number of successes observed in the experiment

l (int) – Number of failures observed in the experiment

lambda_ (float, Optional) – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys' prior (the default).

coverage (float, Optional) – A floating-point number between 0 and 1.0 indicating the coverage you're expecting. A value of 0.95 will ensure 95% of the area under the probability density of the posterior is covered by the returned equal-tailed interval.
Returns

mean (float) – The mean of the posterior distribution

mode (float) – The mode of the posterior distribution

lower, upper (float) – The lower and upper bounds of the credible region
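Since the posterior is \(\text{Beta}(k+\lambda, l+\lambda)\), the returned quantities can be sketched with scipy's beta distribution (assumed available here; this is an illustration of the math above, not the library's actual code):

```python
from scipy.stats import beta

def beta_credible_region(k, l, lambda_=0.5, coverage=0.95):
    """Sketch: mean, mode and equal-tailed credible region of the
    Beta(k + lambda_, l + lambda_) posterior described above."""
    a = k + lambda_
    b = l + lambda_
    mean = a / (a + b)
    # Mode formula from the text; well-defined when a > 1 and b > 1
    mode = (k + lambda_ - 1) / (k + l + 2 * lambda_ - 2)
    # Equal-tailed region: split the uncovered probability between both tails
    tail = (1.0 - coverage) / 2.0
    lower = beta.ppf(tail, a, b)
    upper = beta.ppf(1.0 - tail, a, b)
    return mean, mode, lower, upper
```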
bob.ip.binseg.utils.measure.bayesian_measures(tp, fp, tn, fn, lambda_, coverage)[source]¶

Calculates mean and mode from true/false positive and negative counts with credible regions

This function can return Bayesian estimates of standard machine learning measures from true and false positive counts of positives and negatives. For a thorough look into these and alternate names for the returned values, please check Wikipedia's entry on Precision and Recall. See beta_credible_region() for details on the calculation of returned values.

Parameters
tp (int) – True positive count, AKA “hit”
fp (int) – False positive count, AKA “false alarm”, or “Type I error”
tn (int) – True negative count, AKA “correct rejection”
fn (int) – False Negative count, AKA “miss”, or “Type II error”
lambda_ (float) – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys' prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the coverage you’re expecting. A value of 0.95 will ensure 95% of the area under the probability density of the posterior is covered by the returned equal-tailed interval.
Returns

precision ((float, float, float, float)) – P, AKA positive predictive value (PPV): mean, mode and credible intervals (95% CI). It corresponds arithmetically to tp/(tp+fp).

recall ((float, float, float, float)) – R, AKA sensitivity, hit rate, or true positive rate (TPR): mean, mode and credible intervals (95% CI). It corresponds arithmetically to tp/(tp+fn).

specificity ((float, float, float, float)) – S, AKA selectivity or true negative rate (TNR): mean, mode and credible intervals (95% CI). It corresponds arithmetically to tn/(tn+fp).

accuracy ((float, float, float, float)) – A: mean, mode and credible intervals (95% CI). See Accuracy. It is the proportion of correct predictions (both true positives and true negatives) among the total number of pixels examined. It corresponds arithmetically to (tp+tn)/(tp+tn+fp+fn). This measure includes both true negatives and true positives in the numerator, which makes it sensitive to data or regions without annotations.

jaccard ((float, float, float, float)) – J: mean, mode and credible intervals (95% CI). See Jaccard Index or Similarity. It corresponds arithmetically to tp/(tp+fp+fn). The Jaccard index depends on a TP-only numerator, similarly to the F1 score. For regions where there are no annotations, the Jaccard index will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.

f1_score ((float, float, float, float)) – F1: mean, mode and credible intervals (95% CI). See F1-score. It corresponds arithmetically to 2*P*R/(P+R) or 2*tp/(2*tp+fp+fn). The F1 or Dice score depends on a TP-only numerator, similarly to the Jaccard index. For regions where there are no annotations, the F1-score will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.
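The rate-to-(k, l) mapping listed under beta_credible_region can be sketched as follows. To keep the sketch self-contained, the inner beta_credible_region here is a hypothetical stand-in that returns only the posterior mean \((k+\lambda)/(k+l+2\lambda)\); the real function also returns the mode and interval bounds:

```python
def beta_credible_region(k, l, lambda_=0.5, coverage=0.95):
    # Stand-in: posterior mean of Beta(k + lambda_, l + lambda_) only
    return (k + lambda_) / (k + l + 2 * lambda_)

def bayesian_measures(tp, fp, tn, fn, lambda_=0.5, coverage=0.95):
    """Sketch: map confusion-matrix counts to (k, l) pairs per measure."""
    return {
        "precision": beta_credible_region(tp, fp, lambda_, coverage),
        "recall": beta_credible_region(tp, fn, lambda_, coverage),
        "specificity": beta_credible_region(tn, fp, lambda_, coverage),
        "accuracy": beta_credible_region(tp + tn, fp + fn, lambda_, coverage),
        "jaccard": beta_credible_region(tp, fp + fn, lambda_, coverage),
        "f1_score": beta_credible_region(2 * tp, fp + fn, lambda_, coverage),
    }
```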
bob.ip.binseg.utils.measure.auc(x, y)[source]¶

Calculates the area under the precision-recall curve (AUC)

This function requires a minimum of 2 points and will use the trapezoidal method to calculate the area under a curve bound between [0.0, 1.0]. It interpolates missing points if required. The input x should be continuously increasing or decreasing.

Parameters
x (numpy.ndarray) – A 1D numpy array containing continuously increasing or decreasing values for the X coordinate.
y (numpy.ndarray) – A 1D numpy array containing the Y coordinates of the X values provided in x.
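The trapezoidal method the docstring refers to can be sketched in a few lines; a decreasing x (as in a precision-recall sweep) yields a negative signed area, so the absolute value is taken. This is a simplified sketch — it does not reproduce the library's interpolation of missing points:

```python
import numpy as np

def auc(x, y):
    """Sketch: trapezoidal area under (x, y), for monotonic x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.size < 2:
        raise ValueError("auc requires a minimum of 2 points")
    # Trapezoidal rule: sum of (x[i+1]-x[i]) * (y[i]+y[i+1]) / 2
    area = np.sum(np.diff(x) * (y[:-1] + y[1:]) / 2.0)
    return float(abs(area))
```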