TP TN FP FN
The following four basic terms are used throughout.
True Positives (TP): the actual value is Positive and the prediction is also Positive.
True Negatives (TN): the actual value is Negative and the prediction is also Negative.
False Positives (FP): the actual value is Negative but the prediction is Positive. Also known as a Type I error.
False Negatives (FN): the actual value is Positive but the prediction is Negative. Also known as a Type II error.
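As a minimal sketch of these four definitions, the snippet below counts them for a small binary example. The lists `actual` and `predicted` and the helper `count_outcomes` are made up for illustration; 1 stands for Positive and 0 for Negative.

```python
def count_outcomes(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = Positive, 0 = Negative)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 0:
            tn += 1
        elif a == 0 and p == 1:
            fp += 1  # Type I error
        else:
            fn += 1  # Type II error
    return tp, tn, fp, fn


# Hypothetical labels, for illustration only
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = count_outcomes(actual, predicted)
print(tp, tn, fp, fn)  # 3 3 1 1
```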
OA, Overall Accuracy
Measures the classification accuracy over all samples.
$$
\mathrm{OA}
= \frac{\text{Number of correctly predicted samples}}{\text{Total number of samples}}
= \frac{\sum_c TP_c}{\sum_c (TP_c + FN_c)}
$$
Precision
$$
\mathrm{Precision}_c
= \frac{\text{Number of samples correctly predicted as class } c}{\text{Total number of samples predicted as class } c}
= \frac{TP_c}{TP_c + FP_c}
$$
Accuracy (per class)
Measures the classification accuracy of each class.
$$
\mathrm{Accuracy}_c
= \frac{\text{Number of correctly predicted samples in class } c}{\text{Total number of samples in class } c}
= \frac{TP_c}{TP_c + FN_c}
$$
What counts as a sample depends on the task: in image classification a sample is one image, while in image segmentation a sample is one pixel.
Recall
The per-class Accuracy is also called Recall; the two share the same formula.
$$
\mathrm{Recall}_c
= \frac{\text{Number of correctly predicted samples in class } c}{\text{Total number of samples in class } c}
= \frac{TP_c}{TP_c + FN_c}
$$
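The three formulas so far (OA, Precision, Recall) can be sketched directly from per-class counts. The counts below and the helper `per_class_metrics` are hypothetical; the sketch also assumes every class has at least one predicted and one actual sample, so no division by zero occurs.

```python
def per_class_metrics(tp, fp, fn):
    """Return (OA, precision list, recall list) from per-class TP/FP/FN counts."""
    # Every sample is either a TP or an FN of its own actual class,
    # so summing TP_c + FN_c over all classes gives the total sample count.
    total = sum(t + f for t, f in zip(tp, fn))
    oa = sum(tp) / total
    precision = [t / (t + f) for t, f in zip(tp, fp)]
    recall = [t / (t + f) for t, f in zip(tp, fn)]
    return oa, precision, recall


# Hypothetical counts for three classes
TP = [50, 30, 10]
FP = [5, 10, 5]
FN = [10, 5, 5]

oa, precision, recall = per_class_metrics(TP, FP, FN)
print(f"OA = {oa:.4f}")
print("precision =", [round(p, 4) for p in precision])
print("recall    =", [round(r, 4) for r in recall])
```

Note that the FP counts and the FN counts sum to the same value (20 here): each misclassified sample is one FN for its actual class and one FP for its predicted class.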
F1 Score
$$
\begin{gather*}
\frac{1}{F1_c} = \frac{1}{2}\left(\frac{1}{\mathrm{Precision}_c}+\frac{1}{\mathrm{Recall}_c}\right)\\
\Rightarrow F1_c = 2 \times \frac{\mathrm{Precision}_c \times \mathrm{Recall}_c}{\mathrm{Precision}_c + \mathrm{Recall}_c}
= \frac{2TP_c}{2TP_c + FP_c + FN_c}
\end{gather*}
$$
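The last equality can be spelled out by substituting the definitions of Precision and Recall given earlier. This is a short derivation sketch and assumes $TP_c > 0$ so that both reciprocals exist:

```latex
\begin{aligned}
\frac{1}{Precision_c} + \frac{1}{Recall_c}
&= \frac{TP_c + FP_c}{TP_c} + \frac{TP_c + FN_c}{TP_c}
 = \frac{2TP_c + FP_c + FN_c}{TP_c} \\
\Rightarrow\quad F1_c
&= \frac{2}{\frac{1}{Precision_c} + \frac{1}{Recall_c}}
 = \frac{2TP_c}{2TP_c + FP_c + FN_c}
\end{aligned}
```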
Dice, Dice similarity coefficient, DSC
The Sørensen-Dice index is also known as the Dice similarity coefficient (DSC).
$$
\mathrm{Dice}_c = \frac{2TP_c}{2TP_c + FP_c + FN_c}
$$
This is identical to the formula for the F1 score.
IoU, Intersection over Union
$$
\mathrm{IoU}_c = \frac{TP_c}{TP_c + FP_c + FN_c}
$$
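Because Dice and IoU are built from the same three counts, one can be converted into the other: Dice = 2·IoU / (1 + IoU). The sketch below checks this numerically; the counts and the helper names `dice` and `iou` are chosen only for illustration.

```python
def dice(tp, fp, fn):
    """Dice similarity coefficient (same formula as the F1 score)."""
    return 2 * tp / (2 * tp + fp + fn)


def iou(tp, fp, fn):
    """Intersection over Union."""
    return tp / (tp + fp + fn)


# Hypothetical counts for one class
tp, fp, fn = 50, 5, 10
d, j = dice(tp, fp, fn), iou(tp, fp, fn)
print(f"Dice = {d:.4f}, IoU = {j:.4f}")

# Dice can be recovered from IoU alone
dice_from_iou = 2 * j / (1 + j)
print(abs(d - dice_from_iou) < 1e-12)  # True
```

Since the mapping is monotonic, both metrics rank a set of predictions for a single class in the same order; Dice is never smaller than IoU.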
Kappa coefficient
$$
\mathrm{Kappa} = \frac{p_o-p_e}{1-p_e}
$$
Here $p_o = \mathrm{OA}$, and $p_e$ is computed as follows:
$$
p_e = \sum_{c} \frac{\text{number of actual pixels of class } c}{\text{total number of pixels}}\times\frac{\text{number of predicted pixels of class } c}{\text{total number of pixels}}
$$
For a segmentation task with only two classes, foreground and background, $p_e$ is computed as follows, where $N = TP + TN + FP + FN$ is the total number of pixels:
$$
p_e = \underset{\text{actual foreground}}{\frac{TP+FN}{N}} \times \underset{\text{predicted foreground}}{\frac{TP+FP}{N}} + \underset{\text{actual background}}{\frac{TN+FP}{N}} \times \underset{\text{predicted background}}{\frac{TN+FN}{N}}
$$
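The two-class case can be sketched in a few lines. The function name `binary_kappa` and the counts are hypothetical; foreground is treated as the positive class.

```python
def binary_kappa(tp, tn, fp, fn):
    """Kappa for a foreground/background task from the four basic counts."""
    n = tp + tn + fp + fn
    p_o = (tp + tn) / n  # observed agreement, equal to OA
    # chance agreement: actual share times predicted share, summed over both classes
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pixel counts: N = 100, p_o = 0.85, p_e = 0.5
kappa = binary_kappa(tp=40, tn=45, fp=5, fn=10)
print(f"{kappa:.2f}")  # 0.70
```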
For a multi-class classification task, for example
$$
\begin{array}{c|ccc|c}
\frac{\text{Predicted Class}\rightarrow}{\underset{\downarrow}{\text{Actual Class}}} & A & B & C & \text{Actual Class Num}\\
\hline
A & N_{aa} & N_{ab} & N_{ac} & N_{a\cdot} \\
B & N_{ba} & N_{bb} & N_{bc} & N_{b\cdot} \\
C & N_{ca} & N_{cb} & N_{cc} & N_{c\cdot} \\
\hline
\text{Predicted Class Num} & N_{\cdot a} & N_{\cdot b} & N_{\cdot c} & N \\
\end{array}
$$
$p_o$ and $p_e$ are computed as follows:
$$
p_o = \mathrm{OA} = \frac{N_{aa}+N_{bb}+N_{cc}}{N}
$$
$$
\begin{split}
p_e &= \frac{N_{a\cdot}}{N}\cdot \frac{N_{\cdot a}}{N} +
\frac{N_{b\cdot}}{N}\cdot \frac{N_{\cdot b}}{N} +
\frac{N_{c\cdot}}{N}\cdot \frac{N_{\cdot c}}{N}\\
&= \frac{N_{a\cdot} \cdot N_{\cdot a} + N_{b\cdot} \cdot N_{\cdot b} + N_{c\cdot} \cdot N_{\cdot c}}{N^2}
\end{split}
$$
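The multi-class computation of $p_o$, $p_e$ and Kappa can be sketched as below. The 3×3 matrix is made-up data and `cohen_kappa` is an illustrative helper, with rows as actual classes and columns as predicted classes.

```python
def cohen_kappa(cm):
    """Kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(k)]                       # N_a., N_b., N_c.
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # N_.a, N_.b, N_.c
    p_o = sum(cm[i][i] for i in range(k)) / n
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical confusion matrix for classes A, B, C
cm = [[50, 7, 3],
      [3, 30, 2],
      [2, 3, 10]]

kappa = cohen_kappa(cm)
print(f"{kappa:.4f}")  # 0.6934
```

For this matrix, $p_o = 90/110$ and $p_e = (60 \cdot 55 + 35 \cdot 40 + 15 \cdot 15)/110^2 = 4925/12100$, which gives Kappa $= 4975/7175 \approx 0.6934$.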
Confusion Matrix
A confusion matrix is a matrix of size (class_num × class_num). Rows correspond to the actual class and columns to the predicted class, so $N_{ab}$ is the number of samples of actual class A that were predicted as class B.
$$
\begin{array}{c|ccc|c}
\frac{\text{Predicted Class}\rightarrow}{\underset{\downarrow}{\text{Actual Class}}} & A & B & C & \text{Actual Class Num}\\
\hline
A & N_{aa} & N_{ab} & N_{ac} & N_{a\cdot} \\
B & N_{ba} & N_{bb} & N_{bc} & N_{b\cdot} \\
C & N_{ca} & N_{cb} & N_{cc} & N_{c\cdot} \\
\hline
\text{Predicted Class Num} & N_{\cdot a} & N_{\cdot b} & N_{\cdot c} & N \\
\end{array}
$$
How to get TP, TN, FP, FN
For class A, TP is $N_{aa}$, TN is $N_{bb}+N_{cc}$, FP is $N_{ba}+N_{ca}$, and FN is $N_{ab}+N_{ac}$. Note that TN here counts only the correctly classified samples of the other classes; the usual one-vs-rest definition would also include the uncolored entries $N_{bc}$ and $N_{cb}$.
$$
\begin{array}{|c|}
\hline
\textcolor{red}{TP} \quad \textcolor{green}{TN} \quad \textcolor{blue}{FP} \quad \textcolor{gold}{FN}\\
\hline
\begin{matrix}
\textcolor{red}{N_{aa}} & \textcolor{gold}{N_{ab}} & \textcolor{gold}{N_{ac}}\\
\textcolor{blue}{N_{ba}} & \textcolor{green}{N_{bb}} & N_{bc}\\
\textcolor{blue}{N_{ca}} & N_{cb} & \textcolor{green}{N_{cc}}
\end{matrix}
\\ \hline
\end{array}
$$
For class B, TP is $N_{bb}$, TN is $N_{aa}+N_{cc}$, FP is $N_{ab}+N_{cb}$, and FN is $N_{ba}+N_{bc}$.
$$
\begin{array}{|c|}
\hline
\textcolor{red}{TP} \quad \textcolor{green}{TN} \quad \textcolor{blue}{FP} \quad \textcolor{gold}{FN}\\
\hline
\begin{matrix}
\textcolor{green}{N_{aa}} & \textcolor{blue}{N_{ab}} & N_{ac}\\
\textcolor{gold}{N_{ba}} & \textcolor{red}{N_{bb}} & \textcolor{gold}{N_{bc}}\\
N_{ca} & \textcolor{blue}{N_{cb}} & \textcolor{green}{N_{cc}}
\end{matrix}
\\ \hline
\end{array}
$$
For class C, TP is $N_{cc}$, TN is $N_{aa}+N_{bb}$, FP is $N_{ac}+N_{bc}$, and FN is $N_{ca}+N_{cb}$.
$$
\begin{array}{|c|}
\hline
\textcolor{red}{TP} \quad \textcolor{green}{TN} \quad \textcolor{blue}{FP} \quad \textcolor{gold}{FN}\\
\hline
\begin{matrix}
\textcolor{green}{N_{aa}} & N_{ab} & \textcolor{blue}{N_{ac}}\\
N_{ba} & \textcolor{green}{N_{bb}} & \textcolor{blue}{N_{bc}}\\
\textcolor{gold}{N_{ca}} & \textcolor{gold}{N_{cb}} & \textcolor{red}{N_{cc}}
\end{matrix}
\\ \hline
\end{array}
$$
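The three cases above follow one pattern: TP is the diagonal entry, FP is the rest of the column, and FN is the rest of the row. A minimal sketch with a made-up matrix is shown below. The helper `one_vs_rest_counts` and the key names `TN_diag` and `TN_rest` are hypothetical: `TN_diag` follows the definition used in this section (diagonal entries of the other classes), while `TN_rest` is the usual one-vs-rest count, which also includes confusions among the other classes.

```python
def one_vs_rest_counts(cm):
    """Per-class counts from a square confusion matrix.

    cm[i][j] = number of samples of actual class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    trace = sum(cm[i][i] for i in range(k))
    result = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[i][c] for i in range(k)) - tp  # column sum minus diagonal
        fn = sum(cm[c]) - tp                       # row sum minus diagonal
        tn_diag = trace - tp                       # TN as defined in this section
        tn_rest = total - tp - fp - fn             # usual one-vs-rest TN
        result.append({"TP": tp, "FP": fp, "FN": fn,
                       "TN_diag": tn_diag, "TN_rest": tn_rest})
    return result


# Hypothetical confusion matrix for classes A, B, C
cm = [[50, 7, 3],
      [3, 30, 2],
      [2, 3, 10]]

for name, counts in zip("ABC", one_vs_rest_counts(cm)):
    print(name, counts)
```

None of the metrics defined earlier (Precision, Recall, F1, Dice, IoU) use TN, so the choice between the two TN variants does not affect them.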
How to compute the confusion matrix for semantic segmentation in Python.
```python
import numpy as np

class_num = 3
eps = 1e-8  # small constant that guards against division by zero

ground_truth = np.random.randint(low=0, high=class_num, size=(5, 5))
prediction = np.random.randint(low=0, high=class_num, size=(5, 5))

'''
[0, 1, 2]*3 + [0, 1, 2]
By doing this, we will get 3*3=9 kinds of results:
0+0, 0+1, 0+2
3+0, 3+1, 3+2
6+0, 6+1, 6+2
'''
labels = class_num * ground_truth + prediction
counts = np.bincount(labels.flatten(), minlength=class_num ** 2)
# rows: actual class, columns: predicted class
confusion_matrix = counts.reshape(class_num, class_num)

TP = np.diag(confusion_matrix)
TN = np.diag(confusion_matrix).sum() - np.diag(confusion_matrix)
FP = confusion_matrix.sum(axis=0) - np.diag(confusion_matrix)
FN = confusion_matrix.sum(axis=1) - np.diag(confusion_matrix)

'''Overall Accuracy'''
OA = TP.sum() / (confusion_matrix.sum() + eps)

'''Precision or User's Accuracy (accuracy from the aspect of the prediction result)'''
Precision = TP / (TP + FP + eps)

'''Recall or Producer's Accuracy (accuracy from the aspect of the ground truth)'''
Recall = TP / (TP + FN + eps)

'''F1 score'''
F1 = (2.0 * Precision * Recall) / (Precision + Recall + eps)

'''Intersection over Union'''
IoU = TP / (TP + FN + FP + eps)
```
Another way to calculate TP, TN, FP, FN
In segmentation, the inputs are a Prediction of shape (Batch_size, Class_num, Height, Width) and a Mask (Ground Truth) of shape (Batch_size, Height, Width). For every pixel of every image in the batch, Prediction holds Class_num values, the predicted probabilities of the respective classes.
For category c, TP, TN, FP and FN are calculated as follows:
$$
\begin{split}
TP_c &= \sum_{h,w}p_{c,h,w}g_{c,h,w}\\
TN_c &= \sum_{h,w}\sum_{i\neq c}p_{i,h,w}g_{i,h,w}\\
FP_c &= \sum_{h,w}p_{c,h,w}(1-g_{c,h,w})\\
FN_c &= \sum_{h,w}(1-p_{c,h,w})g_{c,h,w}
\end{split}
$$
where $g_{h,w}$ is the one-hot encoding of the ground-truth label of pixel (h,w), $g_{c,h,w}$ is the c-th element of $g_{h,w}$, and $p_{c,h,w}\in[0,1]$ is the predicted probability that pixel (h,w) belongs to label c.
$g_{h,w} = (0_0,\cdots,0_{i-1},1_{i},0_{i+1},\cdots,0_{C-1})$, where i is the label of pixel (h,w).
E.g. prediction = [p0, p1, p2, p3], ground-truth = [0, 0, 1, 0], which has 4 categories.
For 0th category:
TP0 = p0 × g0 = 0,
TN0 = p1 × g1 + p2 × g2 + p3 × g3 = p2,
FP0 = p0 × (1 - g0) = p0,
FN0 = (1 - p0) × g0 = 0.
For 2nd category:
TP2 = p2 × g2 = p2,
TN2 = p0 × g0 + p1 × g1 + p3 × g3 = 0,
FP2 = p2 × (1 - g2) = 0,
FN2 = (1 - p2) × g2 = 1 - p2.
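The worked example above can be reproduced with concrete numbers. In the sketch below the probabilities are made up and `soft_counts` is a hypothetical helper; it sums the four soft counts for one category over a list of (probabilities, one-hot label) pixel pairs.

```python
def soft_counts(pixels, c):
    """Sum soft TP, TN, FP, FN for category c over (probabilities, one_hot) pairs."""
    tp = tn = fp = fn = 0.0
    for p, g in pixels:
        tp += p[c] * g[c]
        tn += sum(p[i] * g[i] for i in range(len(p)) if i != c)
        fp += p[c] * (1 - g[c])
        fn += (1 - p[c]) * g[c]
    return tp, tn, fp, fn


# One pixel with four categories; its true label is category 2
pixels = [([0.1, 0.2, 0.6, 0.1], [0, 0, 1, 0])]

print(soft_counts(pixels, c=0))  # TP0 = 0, TN0 = p2, FP0 = p0, FN0 = 0
print(soft_counts(pixels, c=2))  # TP2 = p2, TN2 = 0, FP2 = 0, FN2 = 1 - p2
```

Unlike the confusion-matrix approach, these counts are computed from probabilities rather than hard argmax labels, so they vary smoothly with the prediction.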
Code
```python
import torch
from torch import Tensor


class Metric(object):
    def __init__(self, num_classes: int, device):
        self.num_classes = num_classes
        self.device = device
        # rows: actual class, columns: predicted class
        self.confusion_matrix = torch.zeros(size=(num_classes, num_classes), dtype=torch.long, device=device)
        self.eps = 1e-8  # guards against division by zero
        self.tp = None
        self.fp = None
        self.tn = None
        self.fn = None
        self.num = None

    def calculate_tp_fp_tn_fn_num(self) -> None:
        """calculate true positive, false positive, true negative and false negative of each class"""
        self.tp = torch.diag(self.confusion_matrix)
        self.fp = self.confusion_matrix.sum(dim=0) - self.tp
        self.fn = self.confusion_matrix.sum(dim=1) - self.tp
        self.tn = self.tp.sum() - self.tp
        # number of ground-truth samples per class
        self.num = self.confusion_matrix.sum(dim=1)

    def get_precision(self) -> Tensor:
        """calculate precision for each class"""
        precision = self.tp / (self.tp + self.fp + self.eps)
        return precision

    def get_recall(self) -> Tensor:
        """calculate recall for each class"""
        recall = self.tp / (self.tp + self.fn + self.eps)
        return recall

    def get_f1(self) -> Tensor:
        """calculate f1 score for each class"""
        precision = self.get_precision()
        recall = self.get_recall()
        f1 = (2.0 * precision * recall) / (precision + recall + self.eps)
        return f1

    def get_mf1(self) -> Tensor:
        """calculate the mean of all F1 scores"""
        return self.get_f1().mean()

    def get_fwf1(self) -> Tensor:
        """Frequency Weighted F1 score, the weighted average of all F1 scores"""
        f1 = self.get_f1()
        return (self.num * f1).sum() / self.num.sum()

    def get_iou(self) -> Tensor:
        """calculate Intersection over Union"""
        iou = self.tp / (self.tp + self.fn + self.fp + self.eps)
        return iou

    def get_miou(self) -> Tensor:
        """calculate the mean of all IoU"""
        return self.get_iou().mean()

    def get_fwiou(self) -> Tensor:
        """Frequency Weighted IoU, the weighted average of all IoU"""
        iou = self.get_iou()
        return (self.num * iou).sum() / self.num.sum()

    def get_dice(self) -> Tensor:
        """calculate dice for each class"""
        dice = 2 * self.tp / ((self.tp + self.fp) + (self.tp + self.fn) + self.eps)
        return dice

    def get_accuracy(self) -> Tensor:
        """calculate accuracy for each class"""
        acc = self.tp / (self.tp + self.fn + self.eps)
        return acc

    def get_overall_accuracy(self) -> Tensor:
        """calculate the overall accuracy"""
        return self.tp.sum() / self.confusion_matrix.sum()

    def get_average_accuracy(self) -> Tensor:
        """calculate the average accuracy"""
        return self.get_accuracy().mean()

    def _get_confusion_matrix(self, labels: Tensor, predictions: Tensor) -> Tensor:
        """
        calculate confusion matrix for one result or a batch of results
        labels: [height, width], predictions: [height, width];
        labels: [batch, height, width], predictions: [batch, height, width];
        """
        '''
        0: impervious surfaces, 1: building, 2: low vegetation, 3: tree, 4: car, 5: background
        6 * (0,1,2,3,4,5) + (0,1,2,3,4,5)
        ---------------------------------
        0:  0  1  2  3  4  5
        1:  6  7  8  9 10 11
        2: 12 13 14 15 16 17
        3: 18 19 20 21 22 23
        4: 24 25 26 27 28 29
        5: 30 31 32 33 34 35
        '''
        assert labels.shape == predictions.shape, "shape should be same"
        index = self.num_classes * labels + predictions
        count = torch.bincount(input=index.flatten(), minlength=self.num_classes ** 2)
        confusion_matrix = count.reshape(self.num_classes, self.num_classes)
        return confusion_matrix

    def add_batch(self, labels: Tensor, predictions: Tensor) -> None:
        """
        labels: [height, width], predictions: [height, width]
        labels: [batch, height, width], predictions: [batch, height, width]
        """
        assert labels.shape == predictions.shape, "shape should be same"
        self.confusion_matrix += self._get_confusion_matrix(labels, predictions)
        self.calculate_tp_fp_tn_fn_num()

    def reset_confusion_matrix(self) -> None:
        # keep the dtype and device used in __init__
        self.confusion_matrix = torch.zeros(size=(self.num_classes, self.num_classes),
                                            dtype=torch.long, device=self.device)


if __name__ == '__main__':
    num_classes = 6
    labels = torch.randint(low=0, high=num_classes, size=(2, 224, 224))
    predictions = torch.randint(low=0, high=num_classes, size=(2, 224, 224))
    metric = Metric(num_classes=num_classes, device="cpu")
    metric.add_batch(labels=labels, predictions=predictions)
    print(
        f"num: {metric.num} \n"
        f"oa: {metric.get_overall_accuracy()} \n"
        f"aa: {metric.get_average_accuracy()} \n"
        f"accuracy: {metric.get_accuracy()} \n"
        f"iou: {metric.get_iou()} \n"
        f"miou: {metric.get_miou()} \n"
        f"fwiou: {metric.get_fwiou()} \n"
        f"f1: {metric.get_f1()} \n"
        f"mf1: {metric.get_mf1()} \n"
        f"fwf1: {metric.get_fwf1()} "
    )
    print(metric.confusion_matrix)
```
Using sklearn
```shell
pip install scikit-learn
```
```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, jaccard_score

if __name__ == "__main__":
    num_classes = 6
    B, H, W = 8, 224, 224
    labels = ["0", "1", "2", "3", "4", "5"]  # class names (not used below)
    y_true = np.random.randint(low=0, high=num_classes, size=(B, H, W))
    y_pred = np.random.randint(low=0, high=num_classes, size=(B, H, W))

    # reshape(-1) and flatten() do the same thing here:
    # both convert a multidimensional array into a 1D array.
    y_true_1d = y_true.reshape(-1)
    y_pred_1d = y_pred.flatten()

    confusion = confusion_matrix(y_true=y_true_1d, y_pred=y_pred_1d)
    print(f"{confusion = }")

    accuracy = accuracy_score(y_true=y_true_1d, y_pred=y_pred_1d)
    # average=None returns one score per class
    precision = precision_score(y_true=y_true_1d, y_pred=y_pred_1d, average=None)
    recall = recall_score(y_true=y_true_1d, y_pred=y_pred_1d, average=None)
    f1 = f1_score(y_true=y_true_1d, y_pred=y_pred_1d, average=None)
    iou = jaccard_score(y_true=y_true_1d, y_pred=y_pred_1d, average=None)

    print(f"{accuracy = :>.2%}")
    np.set_printoptions(precision=4)
    print(f"{precision = }")
    print(f"{recall = }")
    print(f"{f1 = }")
    print(f"{iou = }")
```
Designing an evaluator
```python
import logging
from typing import Dict

import torch
from torch import Tensor, nn
from tqdm import tqdm


def calculate_confusion_matrix(y_true: Tensor, y_pred: Tensor, num_classes: int) -> Tensor:
    """
    calculate confusion matrix
    The shape of passed tensor should be [height * width], [height, width] or [batch, height, width]
    """
    '''
    0: impervious surfaces, 1: building, 2: low vegetation, 3: tree, 4: car, 5: background
    ---------------------------------
    num_classes * y_true + y_pred
    6 * (0,1,2,3,4,5) + (0,1,2,3,4,5)
    ---------------------------------
    0:  0  1  2  3  4  5
    1:  6  7  8  9 10 11
    2: 12 13 14 15 16 17
    3: 18 19 20 21 22 23
    4: 24 25 26 27 28 29
    5: 30 31 32 33 34 35
    '''
    assert y_pred.shape == y_true.shape, "shape should be same"
    index = num_classes * y_true + y_pred
    counts = torch.bincount(input=index.flatten(), minlength=num_classes ** 2)
    confusion_matrix = counts.reshape(num_classes, num_classes)
    return confusion_matrix


def evaluator_potsdam(cfg, model, testloader, device) -> Dict:
    """evaluate the model over testset"""
    model.eval()
    model.to(device)

    # initialize confusion matrix
    confusion_matrix = torch.zeros(size=[cfg.num_classes, cfg.num_classes], dtype=torch.int64, device=device)

    # create a progress bar with tqdm
    testloader_bar = tqdm(testloader)
    testloader_bar.set_description(desc="val")

    # gradients are not needed during evaluation
    with torch.no_grad():
        for batch in testloader_bar:
            images, labels = batch['img'].to(device), batch['ann'].to(device)

            # raw_predictions: [B, Classes, Height, Width]
            raw_predictions = model(images)
            raw_predictions = nn.Softmax(dim=1)(raw_predictions)

            # [B, Classes, Height, Width] -argmax(dim=1)-> [B, Height, Width]
            predictions = raw_predictions.argmax(dim=1)

            confusion_matrix += calculate_confusion_matrix(y_true=labels, y_pred=predictions,
                                                           num_classes=cfg.num_classes)
    testloader_bar.close()

    eps = 1e-8  # guards against division by zero
    proportion_per_class = confusion_matrix.sum(dim=1) / confusion_matrix.sum()

    # true positive, false positive, true negative and false negative for each class
    tp = torch.diag(confusion_matrix)
    fp = confusion_matrix.sum(dim=0) - tp
    tn = tp.sum() - tp
    fn = confusion_matrix.sum(dim=1) - tp

    # overall accuracy
    oa = tp.sum() / confusion_matrix.sum()

    # intersection over union
    iou_per_class = tp / (tp + fn + fp + eps)
    # mean iou
    miou = iou_per_class.mean()
    # frequency weighted iou
    fwiou = (iou_per_class * proportion_per_class).sum()

    # f1 score
    precision, recall = tp / (tp + fp + eps), tp / (tp + fn + eps)
    f1_per_class = 2.0 * precision * recall / (precision + recall + eps)
    # mean f1 score
    mf1 = f1_per_class.mean()
    # frequency weighted f1 score
    fwf1 = (f1_per_class * proportion_per_class).sum()

    logging.info(f"OA:{oa:06.2%}, mF1:{mf1:06.2%}, fwF1:{fwf1:06.2%}, mIoU:{miou:06.2%}, fwIoU:{fwiou:06.2%}")
    for class_name, portion, iou, f1 in zip(cfg.class_names, proportion_per_class, iou_per_class, f1_per_class):
        logging.info(f"{class_name:>9}({portion:06.2%}): f1={f1:>06.2%}, iou={iou:>06.2%}")

    return {"oa": oa.item(), "mf1": mf1.item(), "fwf1": fwf1.item(), "miou": miou.item(), "fwiou": fwiou.item()}


if __name__ == "__main__":
    from ml_collections import ConfigDict
    from torch.utils.data import Dataset, DataLoader

    config = ConfigDict()
    config.num_classes = 6
    config.class_names = ('ImSurf', 'Building', 'LowVeg', 'Tree', 'Car', 'Clutter')

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s",
                        datefmt="%Y-%m-%d %H:%M:%S")

    class CustomDataset(Dataset):
        """random images and labels, only for testing the evaluator"""

        def __init__(self):
            self.images = torch.rand(size=[80, 3, 512, 512], dtype=torch.float32)
            self.labels = torch.randint(low=0, high=6, size=[80, 512, 512], dtype=torch.long)

        def __getitem__(self, index):
            return {"name": "name", "img": self.images[index], "ann": self.labels[index]}

        def __len__(self):
            return len(self.images)

    dataset = CustomDataset()
    dataloader = DataLoader(dataset, batch_size=8)
    model = nn.Sequential(
        nn.Conv2d(in_channels=3, out_channels=6, kernel_size=1)
    ).to(device)

    evaluator_potsdam(config, model, dataloader, device)
```