Efficient Detection of EMVI in Rectal Cancer via Richer Context Information and Feature Fusion

It is vital to automatically detect Extramural Vascular Invasion (EMVI) in rectal cancer before surgery, as this helps guide the patient's treatment planning. Nevertheless, there are few studies on EMVI detection from magnetic resonance imaging (MRI). Moreover, EMVI has three main characteristics: highly variable appearances, relatively small sizes, and shapes similar to surrounding tissues; consequently, current deep-learning-based methods cannot be applied directly. In this paper, we propose a novel and efficient EMVI detection framework that makes three main contributions. First, we introduce a self-attention module to capture dependencies ranging from local to global. Second, we design a parallel atrous convolution (PAC) block and a global pyramid pooling (GPP) module to encode richer context information at multiple scales. Third, we fuse whole-scene and local-region information to improve the feature representation ability. Experimental results show that our framework significantly improves detection accuracy and outperforms other state-of-the-art methods.
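The abstract names two multi-scale context mechanisms: a parallel atrous convolution (PAC) block, which applies convolutions with several dilation rates in parallel, and a global pyramid pooling (GPP) module, which averages features over grids of increasing resolution. The paper's actual modules operate on multi-channel CNN feature maps; the following is only a minimal single-channel NumPy sketch of the two ideas, with all function names, kernel sizes, dilation rates, and bin sizes chosen for illustration rather than taken from the paper.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """Valid-mode 2D convolution with dilation (atrous) rate.

    x: (H, W) input map; kernel: (k, k) weights.
    A rate-r kernel samples the input with gaps of r-1 pixels,
    enlarging the receptive field without adding parameters.
    """
    k = kernel.shape[0]
    eff = rate * (k - 1) + 1          # effective receptive-field size
    H, W = x.shape
    out = np.zeros((H - eff + 1, W - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + eff:rate, j:j + eff:rate]
            out[i, j] = np.sum(patch * kernel)
    return out

def pac_block(x, kernels, rates):
    """Parallel atrous branches, cropped to a common size and stacked."""
    outs = [dilated_conv2d(x, k, r) for k, r in zip(kernels, rates)]
    h = min(o.shape[0] for o in outs)
    w = min(o.shape[1] for o in outs)
    return np.stack([o[:h, :w] for o in outs], axis=0)  # (branches, h, w)

def pyramid_pool(x, bins):
    """Average-pool x over b x b grids for each b in bins, then flatten."""
    feats = []
    for b in bins:
        for row in np.array_split(x, b, axis=0):
            for cell in np.array_split(row, b, axis=1):
                feats.append(cell.mean())
    return np.array(feats)

# Toy demo: three parallel branches, then a two-level pooling pyramid.
x = np.ones((9, 9))
kernels = [np.ones((3, 3))] * 3
pac = pac_block(x, kernels, rates=(1, 2, 3))
gpp = pyramid_pool(np.arange(16.0).reshape(4, 4), bins=(1, 2))
```

In this toy setting, each branch sees the same 3x3 kernel at a different dilation rate, so the stacked output mixes fine and coarse context at every position, which is the intuition behind the PAC block; the pooling pyramid likewise summarizes the whole scene (b=1) alongside coarse regions (b=2).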
