本文总结了二元分类评价指标中分子为 TP (number of true positives) 的指标, 并给出了它们之间的关系.
基础概念
- TP: number of true positives (实际正样本被正确预测为正样本的数目)
- FN: number of false negatives (实际正样本被错误预测为负样本的数目)
- FP: number of false positives (实际负样本被错误预测为正样本的数目)
- TN: number of true negatives (实际负样本被正确预测为负样本的数目)
分子为 TP 的指标
指标 | 公式 |
---|---|
Precision, PPV (Positive predictive value) | $\mathrm{Precision} = \frac{TP}{TP + FP}$ |
Recall, TPR (True Positive Rate) | $\mathrm{Recall} = \frac{TP}{TP + FN}$ |
Dice coefficient, Sørensen–Dice index, F1 score | $\mathrm{Dice} = \frac{2TP}{2TP + FP + FN}$ |
Fβ score | $F_\beta = \frac{(1 + \beta ^2)TP}{(1 + \beta ^2)TP + \beta ^2FP + FN}$ |
Jaccard index, IoU | $\mathrm{Jaccard} = \frac{TP}{TP + FP + FN}$ |
Tversky index | $\mathrm{Tversky}_{\mu ,\nu} = \frac{TP}{TP + \mu FN + \nu FP}$ |
注意到这些指标定义式中分子分母中均有 TP, 且分子分母中均无 TN.
Dice coefficient 与 Precision 和 Recall 之间的关系
$$\mathrm{Dice} = \frac{2\mathrm{Precision} \times \mathrm{Recall} }{\mathrm{Precision} + \mathrm{Recall} }$$
可见 Dice coefficient 是 Precision 和 Recall 的调和均值.
Fβ score 与 Precision 和 Recall 之间的关系
$$F_\beta = \frac{\left( 1 + \beta ^2 \right)\mathrm{Precision} \times \mathrm{Recall}}{{\beta ^2\mathrm{Precision} + \mathrm{Recall}}}$$
当 $\beta = 1$ 时, Fβ score 退化为 Dice coefficient.
当 $\beta \to + \infty$ 时, Fβ score 退化为 Recall.
当 $\beta = 0$ 时, Fβ score 退化为 Precision.
Dice coefficient 与 Jaccard index 之间的关系
$$\mathrm{Dice} = \frac{2\mathrm{Jaccard}}{\mathrm{Jaccard} + 1}$$
$$\mathrm{Jaccard} = \frac{\mathrm{Dice}}{2 - \mathrm{Dice}}$$
import numpy as np
import matplotlib.pyplot as plt
if __name__ == '__main__':
iou = np.linspace(0, 1, 100)
dice = 2 * iou / (iou + 1)
plt.plot(iou, dice, iou, iou)
plt.show()
由函数图形可见 Dice 始终大于等于 Jaccard.
Tversky index 与其他指标的关系
当 $\mu = 0$ 且 $\nu = 1$ 时, $\mathrm{Tversky}_{0,1} = \frac{TP}{TP + FP} = \mathrm{Precision}$, 即 Tversky index 退化为 Precision.
当 $\mu = 1$ 且 $\nu = 0$ 时, $\mathrm{Tversky}_{1,0} = \frac{TP}{TP + FN} = \mathrm{Recall}$, 即 Tversky index 退化为 Recall.
当 $\mu = \nu = 1$ 时, $\mathrm{Tversky}_{1,1} = \frac{TP}{TP + FN + FP} = \mathrm{Jaccard}$, 即 Tversky index 退化为 Jaccard index.
当 $\mu = \nu = 0.5$ 时, $\mathrm{Tversky}_{0.5,0.5} = \frac{2TP}{2TP + FN + FP} = \mathrm{Dice}$, 即 Tversky index 退化为 Dice coefficient.
当 $\mu = \frac{\beta ^2}{\beta ^2 + 1}$ 且 $\nu = \frac{1}{\beta^2 + 1}$ 时, $\mathrm{Tversky}_{\mu, \nu} = \frac{(1 + \beta ^2)TP}{(1 + \beta^2)TP + \beta^2 FP + FN}$, 即 Tversky index 退化为 Fβ Score.
参考
- https://handwiki.org/wiki/Sørensen–Dice_coefficient
- https://handwiki.org/wiki/Jaccard_index
- https://handwiki.org/wiki/Tversky_index
- https://stats.stackexchange.com/questions/221997/why-f-beta-score-define-beta-like-that
- https://stats.stackexchange.com/questions/273537/f1-dice-score-vs-iou
更新记录
- 20211221, 发布