COVD- and COVD-SSL Results
In addition to the M2U-Net architecture, we also evaluated the larger DRIU network and a variant of it with batch normalization (DRIU BN) on COVD- and COVD-SSL. Perhaps surprisingly, for the majority of combinations, the performance of the DRIU variants is roughly equal to or worse than that of M2U-Net. We suspect that one reason for this could be the overparameterization of the large VGG16-based models, which are pretrained on ImageNet.
F1 Scores
Comparison of micro-level F1 scores of DRIU and M2U-Net on COVD- and COVD-SSL. The standard deviation across test images is given in parentheses.
F1 score
| 0.788 (0.018) | 0.797 (0.019) |
| 0.785 (0.018) | 0.783 (0.019) |
| 0.778 (0.117) | 0.778 (0.122) |
| 0.788 (0.102) | 0.811 (0.074) |
| 0.796 (0.027) | 0.791 (0.025) |
| 0.796 (0.024) | 0.798 (0.025) |
| 0.799 (0.044) | 0.800 (0.045) |
| 0.799 (0.044) | 0.784 (0.048) |
| 0.791 (0.021) | 0.777 (0.032) |
| 0.797 (0.017) | 0.811 (0.074) |
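As a rough illustration of how a "score (std)" entry of this kind could be produced, the sketch below computes one F1 score per test image and then reports the mean and standard deviation across images. It assumes that "micro" here means per-image F1 averaged over the test set, which is what the per-image standard deviation suggests but is not confirmed on this page. The function names `f1_score` and `summarize_f1` are hypothetical, and the use of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def f1_score(pred, truth):
    """F1 for a single image; pred and truth are flat sequences of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    denom = 2 * tp + fp + fn
    # An image with no vessel pixels in either mask gets 0.0 here (a convention, not a rule).
    return 2 * tp / denom if denom else 0.0


def summarize_f1(preds, truths):
    """Mean and standard deviation of the per-image F1 scores.

    Assumption: population standard deviation (pstdev); the evaluation
    behind the table may use the sample standard deviation instead.
    """
    scores = [f1_score(p, t) for p, t in zip(preds, truths)]
    return mean(scores), pstdev(scores)


if __name__ == "__main__":
    # Two tiny made-up "images" with four pixels each, for illustration only.
    preds = [[1, 1, 0, 0], [1, 1, 0, 0]]
    truths = [[1, 0, 1, 0], [1, 1, 0, 0]]
    avg, std = summarize_f1(preds, truths)
    print(f"{avg:.3f} ({std:.3f})")
```

Binarized predictions are assumed here; in practice the probability maps would first be thresholded.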
M2U-Net Precision vs. Recall Curves
Precision vs. recall curves for each evaluated dataset. Note that here the F1-score is calculated on a macro level (see paper for more details).
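To make the macro-level calculation more concrete, here is a minimal sketch of a threshold sweep over probability maps. It assumes that "macro" means precision and recall are first averaged across the test images at each threshold and the F1 score is then computed from those averages; the paper is the authoritative reference for the exact definition. The function names `precision_recall` and `macro_pr_curve` are hypothetical.

```python
def precision_recall(probs, truth, threshold):
    """Precision and recall for one image at one threshold.

    probs: flat sequence of vessel probabilities in [0, 1]
    truth: flat sequence of 0/1 ground-truth labels
    """
    tp = fp = fn = 0
    for p, t in zip(probs, truth):
        predicted = p >= threshold
        if predicted and t:
            tp += 1
        elif predicted and not t:
            fp += 1
        elif not predicted and t:
            fn += 1
    # Conventions for empty denominators (assumptions of this sketch).
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_pr_curve(prob_maps, truths, thresholds):
    """One (threshold, precision, recall, f1) tuple per threshold.

    Precision and recall are averaged over images first; F1 is then
    derived from the averaged values (the assumed macro-level definition).
    """
    curve = []
    for th in thresholds:
        pairs = [precision_recall(p, t, th) for p, t in zip(prob_maps, truths)]
        precision = sum(pr for pr, _ in pairs) / len(pairs)
        recall = sum(rc for _, rc in pairs) / len(pairs)
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        curve.append((th, precision, recall, f1))
    return curve


if __name__ == "__main__":
    # Two tiny made-up probability maps, for illustration only.
    prob_maps = [[0.9, 0.6, 0.4, 0.1], [0.8, 0.3, 0.7, 0.2]]
    truths = [[1, 1, 0, 0], [1, 0, 0, 0]]
    for th, pr, rc, f1 in macro_pr_curve(prob_maps, truths, [0.25, 0.5, 0.75]):
        print(f"threshold={th:.2f} precision={pr:.3f} recall={rc:.3f} f1={f1:.3f}")
```

Plotting the recall and precision columns against each other gives a curve of the kind shown in the figures below, and the largest F1 value along the sweep would be the natural score to report next to it.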

Fig. 1 CHASE_DB1: Precision vs Recall curve and F1 scores

Fig. 2 DRIVE: Precision vs Recall curve and F1 scores

Fig. 3 HRF: Precision vs Recall curve and F1 scores

Fig. 4 IOSTAR: Precision vs Recall curve and F1 scores

Fig. 5 STARE: Precision vs Recall curve and F1 scores