Evaluating Multi-Class Segmentation Errors with Anatomical Priors

Acquiring large-scale annotations is challenging in medical image analysis because qualified annotators are scarce. It is therefore essential to achieve high performance with a small amount of labeled data, and the key lies in mining the most informative samples to annotate. In this paper, we propose two metrics that leverage anatomical priors to evaluate multi-class segmentation methods without ground truth (GT). Together with our smooth margin loss, these metrics help to mine the most informative samples for training. In experiments, we first demonstrate that the proposed metrics clearly distinguish samples with different degrees of error in the task of pulmonary lobe segmentation. We then show that our metrics, combined with the proposed loss function, reach a Pearson correlation coefficient (PCC) of 0.7447 with mean surface distance (MSD) and -0.5976 with Dice score, which implies that the proposed metrics can be used to evaluate segmentation methods. Finally, we use our metrics as sample selection criteria in an active learning setting, where the model trained with our anatomy-based query achieves performance comparable to models trained with random query and uncertainty-based query that use more annotated training data.
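
The correlation analysis described in the abstract can be illustrated with a minimal sketch. It assumes integer label maps stored as NumPy arrays; the function names `dice_score` and `pearson` are hypothetical and are not taken from the paper. It shows only how a per-class Dice score and a Pearson correlation between two lists of per-sample scores are computed, not the proposed anatomical-prior metrics themselves.

```python
import numpy as np

def dice_score(pred, gt, label):
    """Dice overlap for one class label between two integer label maps."""
    p = (pred == label)
    g = (gt == label)
    denom = p.sum() + g.sum()
    if denom == 0:
        # Assumed convention: a class absent from both maps counts as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(p, g).sum() / denom)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))
```

A GT-free metric that grows with segmentation error would be expected to correlate positively with MSD and negatively with Dice, which matches the signs of the PCC values reported above.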
