边缘概率分布引导的结直肠息肉高分辨率分割网络
Edge-distribution-guided high-resolution network for colorectal polyp segmentation
2023, Vol. 28, No. 12, pp. 3897-3910
Received: 2023-01-13,
Revised: 2023-04-08,
Published in print: 2023-12-16
DOI: 10.11834/jig.230015
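Objective
Detecting colorectal polyps can effectively prevent malignant transformation, but manual diagnosis often suffers from a high miss rate. Deep learning can supply fine-grained information that aids diagnosis and assists physicians in screening. In practice, the varied morphology and blurred edges of polyps severely degrade algorithm accuracy. To address this problem, an edge-probability-distribution-guided colorectal polyp segmentation network (edge distribution guided high-resolution network, HRNetED) is proposed.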
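Method
The proposed HRNetED uses the HRNet structure as its backbone and introduces a stacked residual convolution module that significantly reduces the number of model parameters while improving model performance. In addition, an edge probability distribution model is used to describe polyp edges, which makes edge detection more stable. Finally, an edge detection task is introduced into the multi-scale decoder to strengthen the model's perception of polyp edges.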
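Result
The method is tested on five datasets: Kvasir-Seg (Kvasir segmentation dataset), ETIS (ETIS larib polyp database), CVC-ColonDB (colonoscopy videos challenge colon database), CVC-ClinicDB (colonoscopy videos challenge clinic database), and CVC-300 (colonoscopy videos challenge 300). On CVC-ClinicDB and CVC-300, HRNetED outperforms the compared algorithms in both the Dice similarity coefficient (Dice) and the mean intersection over union (mIoU); on CVC-ClinicDB it improves on the best compared model by 1.25% and 1.37%, respectively. On ETIS, its Dice coefficient exceeds that of the best compared algorithm, and on CVC-ColonDB its Dice and mIoU are at a relatively good level. In addition, the HD95 distance of HRNetED on Kvasir-Seg, ETIS, and CVC-ColonDB is 0.315%, 29.19%, and 2.95% lower, respectively, than that of the best compared algorithm; on CVC-ClinicDB and CVC-300 it ranks second, which still indicates good performance.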
Conclusion
The proposed HRNetED performs stably across multiple datasets, perceives small and blurry polyps well, and detects polyp contours more accurately.
Objective
Colorectal cancer is a harmful, high-prevalence disease that seriously threatens human life and health. Nearly 95% of colorectal cancer cases develop from early colon polyps. If colorectal polyps are detected in time and closely monitored by specialists, the incidence of colorectal cancer can therefore be effectively reduced. However, manual diagnosis often has a high polyp miss rate. Deep learning can provide fine-grained information that is helpful for diagnosis, such as the location and shape of polyps, and can assist doctors in screening, which is of great value for the prevention and treatment of colorectal cancer. The rapid development of deep learning in recent years has brought major breakthroughs in computer-aided diagnosis in the medical field. Models such as convolutional neural networks and the vision Transformer (ViT) have demonstrated excellent capabilities on medical tasks, and computer-assisted diagnosis has gradually become a trend. Because colorectal polyp images show large morphological differences and unclear edges, we propose an edge-probability-distribution-guided high-resolution network for colorectal polyp segmentation, called HRNetED, which performs well on multiple colorectal polyp datasets and has good clinical application significance.
Method
The proposed HRNetED network takes the HRNet structure as its backbone to ensure a full exchange of multi-scale features and preserves output accuracy by maintaining a high-resolution convolutional branch. A stacked residual convolution (SRC) module is also designed: it splits a single convolution into four sub-convolutions connected in series and collects the output of each, which yields features with multiple receptive fields. Pointwise convolution is then applied for feature fusion, and a residual connection is introduced to avoid model performance degradation. To a certain extent, SRC overcomes the insufficient receptive field of a single convolution operation, and the convolution splitting significantly reduces the number of model parameters while improving model performance. Given the varying sizes and shapes, large color differences, and inconsistent imaging quality of colorectal polyp images, we design a multi-scale decoder that supervises the outputs at different scales simultaneously, and we introduce an edge detection task into this structure to strengthen the perception of polyp edges. To address the unclear edges of polyps, we describe the polyp edge with an edge probability distribution model based on the Gaussian distribution, so that the model does not need to regress the exact edge position but only needs to predict a heat map of the edge distribution. This effectively reduces the difficulty of model convergence and improves the perception ability and robustness of the model in regions where the edge semantics are ambiguous. For the dataset configuration, we follow the experimental protocol of mainstream networks such as PraNet. Specifically, we use 900 images from the Kvasir-Seg dataset and 550 images from CVC-ClinicDB as the training set, amounting to 1,450 images. All images from ETIS, CVC-ColonDB, and CVC-300 and the remaining images from Kvasir-Seg and CVC-ClinicDB are used as test sets. All images are scaled to 256 × 256 pixels. During training, we use FocalLoss and BCELoss to supervise the edge detection and polyp segmentation tasks, respectively, together with a cosine annealing learning rate schedule and the Adam optimizer. In the testing phase, we evaluate our model using the Dice coefficient and the mean intersection over union (mIoU) metric.
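The Gaussian edge heat map described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names `mask_boundary` and `edge_heatmap`, the 4-neighbourhood boundary definition, the value of `sigma`, and the choice of `exp(-d^2 / (2 * sigma^2))` with `d` the distance to the nearest boundary pixel.

```python
import numpy as np


def mask_boundary(mask: np.ndarray) -> np.ndarray:
    """Boolean map of foreground pixels that touch the background (4-neighbourhood).

    Foreground pixels on the image border count as boundary because the
    mask is padded with background.
    """
    fg = mask.astype(bool)
    padded = np.pad(fg, 1, mode="constant", constant_values=False)
    # A pixel is interior when its up, down, left and right neighbours are all foreground.
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return fg & ~interior


def edge_heatmap(mask: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Soft edge target: 1.0 on the boundary, decaying as a Gaussian of the distance to it."""
    ys, xs = np.nonzero(mask_boundary(mask))
    if ys.size == 0:
        # No foreground, hence no edge anywhere.
        return np.zeros(mask.shape, dtype=np.float32)
    grid_y, grid_x = np.indices(mask.shape)
    # Squared distance from every pixel to every boundary pixel, shape (H, W, K).
    # Brute force keeps the sketch dependency-free; it is only meant for small masks.
    d2 = (grid_y[..., None] - ys) ** 2 + (grid_x[..., None] - xs) ** 2
    nearest_d2 = d2.min(axis=-1)
    return np.exp(-nearest_d2 / (2.0 * sigma ** 2)).astype(np.float32)


if __name__ == "__main__":
    demo = np.zeros((9, 9), dtype=np.uint8)
    demo[3:6, 3:6] = 1  # a 3 x 3 square "polyp"
    print(np.round(edge_heatmap(demo, sigma=1.0), 2))
```

A target of this kind is graded rather than a one-pixel-wide binary contour, which is one plausible reading of why predicting a heat map is easier than regressing exact edge positions. For 256 × 256 masks, a distance transform such as `scipy.ndimage.distance_transform_edt` would be a more efficient way to obtain the distances.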
Result
We test our method on five publicly available colorectal polyp datasets, namely, Kvasir-Seg, ETIS, CVC-ColonDB, CVC-ClinicDB, and CVC-300, and compare its performance with that of existing colorectal polyp segmentation algorithms, including HRNetv2, PraNet, UACANet, MSRF-Net, BDG-Net, SSFormer, and ESFPNet. The comparison shows that the Dice coefficient and mIoU of HRNetED on the CVC-ClinicDB and CVC-300 datasets are higher than those of the other algorithms. Compared with the previous best model on the CVC-ClinicDB dataset, HRNetED achieves 1.25% and 1.37% improvements in Dice and mIoU, respectively. On the ETIS dataset, the Dice and mIoU of HRNetED are 82.41% and 71.21%, respectively, with the former being higher than that of the existing best algorithm. On the CVC-ColonDB dataset, the Dice and mIoU of HRNetED are 80.55% and 71.56%, respectively. In addition, the HD95 distance of HRNetED on the Kvasir-Seg, ETIS, and CVC-ColonDB datasets is 0.315%, 29.19%, and 2.95% lower, respectively, than that of the existing best algorithms. On the CVC-ClinicDB and CVC-300 datasets, the HD95 distance of HRNetED is second best, which still indicates good performance.
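For readers unfamiliar with the two overlap metrics reported above, the following sketch computes them for binary masks using the standard definitions, Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B|. The function names, the smoothing constant `eps`, and the choice to average per image are assumptions; the abstract does not state the averaging convention behind the reported mIoU.

```python
import numpy as np


def dice_and_iou(pred, target, eps: float = 1e-7):
    """Dice coefficient and IoU of two binary masks of equal shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    pred_sum = pred.sum()
    target_sum = target.sum()
    union = pred_sum + target_sum - inter
    # eps avoids division by zero; two empty masks score 1.0 under this convention.
    dice = (2.0 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)


def mean_scores(preds, targets):
    """Average Dice and IoU over a collection of (prediction, ground truth) pairs."""
    scores = [dice_and_iou(p, t) for p, t in zip(preds, targets)]
    mean_dice = float(np.mean([s[0] for s in scores]))
    mean_iou = float(np.mean([s[1] for s in scores]))
    return mean_dice, mean_iou


if __name__ == "__main__":
    pred = np.zeros((4, 4), dtype=np.uint8)
    pred[0:2, :] = 1    # rows 0-1 predicted as polyp
    target = np.zeros((4, 4), dtype=np.uint8)
    target[1:3, :] = 1  # rows 1-2 are polyp in the ground truth
    print(dice_and_iou(pred, target))
```

In the toy example the masks share 4 of their 8 pixels each, so Dice is 8/16 and IoU is 4/12, up to the small `eps` term. Dice is never lower than IoU for the same pair of masks, which matches the pattern in the figures quoted above.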
Conclusion
The proposed HRNetED network performs well in colorectal polyp segmentation tasks. The subjective segmentation results show that the network performs stably on multiple datasets, perceives small and blurry polyps well, and has a strong ability to detect polyp contours. The ablation experiments show that the proposed stacked residual convolution module greatly reduces the number of model parameters while improving model performance, and that the edge probability distribution model proposed for blurred-edge regions effectively improves the performance of the network.
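The parameter saving attributed to the stacked residual convolution module can be made concrete with a back-of-envelope count. The abstract gives no channel widths or kernel sizes, so the layout below is purely an assumption: a 3 × 3 convolution with C input and C output channels is replaced by four serial 3 × 3 sub-convolutions of C/4 output channels each, whose outputs are concatenated and fused by a 1 × 1 pointwise convolution. Biases and normalization layers are ignored, and the function names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Number of weights in a k x k convolution without bias."""
    return c_in * c_out * k * k


def plain_conv_params(channels: int, k: int = 3) -> int:
    """Reference case: one k x k convolution mapping C channels to C channels."""
    return conv_params(channels, channels, k)


def stacked_split_params(channels: int, k: int = 3, splits: int = 4) -> int:
    """Assumed stacked layout: serial sub-convolutions plus pointwise fusion."""
    if channels % splits != 0:
        raise ValueError("channels must be divisible by splits")
    sub = channels // splits
    total = conv_params(channels, sub, k)             # first sub-convolution reads the full input
    total += (splits - 1) * conv_params(sub, sub, k)  # each later one reads the previous sub-output
    total += conv_params(channels, channels, 1)       # pointwise fusion of the concatenated outputs
    return total


if __name__ == "__main__":
    c = 64
    print(plain_conv_params(c), stacked_split_params(c))
```

Under these assumptions and with C = 64, the plain convolution has 36,864 weights and the stacked variant 20,224, roughly 55% of the original. The actual saving in HRNetED depends on its real channel configuration, which is not stated in this abstract.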
Bernal J , Sánchez F J , Fernández-Esparrach G , Gil D , Rodríguez C and Vilariño F . 2015 . WM-DOVA maps for accurate polyp highlighting in colonoscopy: validation vs. saliency maps from physicians . Computerized Medical Imaging and Graphics , 43 : 99 - 111 [ DOI: 10.1016/j.compmedimag.2015.02.007 http://dx.doi.org/10.1016/j.compmedimag.2015.02.007 ]
Cao H , Wang Y Y , Chen J , Jiang D S , Zhang X P , Tian Q Q and Wang M N . 2023 . Swin-UNet: UNet-like pure Transformer for medical image segmentation // Proceedings of 2022 European Conference on Computer Vision . Tel Aviv, Israel : Springer: 205 - 218 [ DOI: 10.1007/978-3-031-25066-8_9 http://dx.doi.org/10.1007/978-3-031-25066-8_9 ]
Chang Q , Ahmad D , Toth J , Bascom R and Higgins W E . 2023 . ESFPNet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video // Proceedings of 2023 Biomedical Applications in Molecular, Structural, and Functional Imaging . San Diego, USA : SPIE: #12468 [ DOI: 10.1117/12.2647897 http://dx.doi.org/10.1117/12.2647897 ]
Chen L C , Papandreou G , Kokkinos I , Murphy K and Yuille A L . 2016 . Semantic image segmentation with deep convolutional nets and fully connected CRFs [EB/OL]. [ 2016-06-07 ]. https://arxiv.org/pdf/1412.7062v4.pdf
Chen L C , Papandreou G , Schroff F and Adam H . 2017 . Rethinking atrous convolution for semantic image segmentation [EB/OL]. [ 2017-12-05 ]. https://arxiv.org/pdf/1706.05587.pdf
Ding X H , Zhang X Y , Han J G and Ding D G . 2022 . Scaling up your kernels to 31 x 31 : revisiting large kernel design in CNNs //Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA : IEEE: 11953 - 11965 [ DOI: 10.1109/CVPR52688.2022.01166 http://dx.doi.org/10.1109/CVPR52688.2022.01166 ]
Fan D P , Ji G P , Zhou T , Chen G , Fu H Z , Shen J B and Shao L . 2020 . PraNet: parallel reverse attention network for polyp segmentation // Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention . Lima, Peru : Springer: 263 - 273 [ DOI: 10.1007/978-3-030-59725-2_26 http://dx.doi.org/10.1007/978-3-030-59725-2_26 ]
He K M , Zhang X Y , Ren S Q and Sun J . 2016 . Deep residual learning for image recognition // Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition . Las Vegas, USA : IEEE: 770 - 778 [ DOI: 10.1109/CVPR.2016.90 http://dx.doi.org/10.1109/CVPR.2016.90 ]
Huang G , Liu Z , van der Maaten L and Weinberger K Q . 2017 . Densely connected convolutional networks // Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition . Honolulu, USA : IEEE: 2261 - 2269 [ DOI: 10.1109/CVPR.2017.243 http://dx.doi.org/10.1109/CVPR.2017.243 ]
Jha D , Smedsrud P H , Riegler M A , Halvorsen P , De Lange T , Johansen D and Johansen H D . 2020 . Kvasir-SEG: a segmented polyp dataset // Proceedings of the 26th International Conference on Multimedia Modeling . Daejeon, Korea (South) : Springer: 451 - 462 [ DOI: 10.1007/978-3-030-37734-2_37 http://dx.doi.org/10.1007/978-3-030-37734-2_37 ]
Jha D , Smedsrud P H , Riegler M A , Johansen D , De Lange T , Halvorsen P and Johansen H D . 2019 . ResUNet++: an advanced architecture for medical image segmentation // 2019 IEEE International Symposium on Multimedia . San Diego, USA : IEEE: 225 - 2255 [ DOI: 10.1109/ISM46123.2019.00049 http://dx.doi.org/10.1109/ISM46123.2019.00049 ]
Kim T , Lee H and Kim D . 2021 . UACANet: uncertainty augmented context attention for polyp segmentation // Proceedings of the 29th ACM International Conference on Multimedia . Virtual Event, China : ACM: 2167 - 2175 [ DOI: 10.1145/3474085.3475375 http://dx.doi.org/10.1145/3474085.3475375 ]
Li J X , Sun J , Li C and Ahmad B . 2022 . A MHA-based integrated diagnosis and segmentation method for COVID-19 pandemic . Journal of Image and Graphics , 27 ( 12 ): 3651 - 3662 [ DOI: 10.11834/jig.211015 http://dx.doi.org/10.11834/jig.211015 ]
Lin T Y , Goyal P , Girshick R , He K M and Dollár P . 2017 . Focal loss for dense object detection // Proceedings of 2017 IEEE International Conference on Computer Vision . Venice, Italy : IEEE: 2999 - 3007 [ DOI: 10.1109/ICCV.2017.324 http://dx.doi.org/10.1109/ICCV.2017.324 ]
Liu Z , Lin Y T , Cao Y , Hu H , Wei Y X , Zhang Z , Lin S and Guo B N . 2021 . Swin Transformer: hierarchical vision Transformer using shifted windows // Proceedings of 2021 IEEE/CVF International Conference on Computer Vision . Montreal, Canada : IEEE: 9992 - 10002 [ DOI: 10.1109/ICCV48922.2021.00986 http://dx.doi.org/10.1109/ICCV48922.2021.00986 ]
Liu Z , Mao H Z , Wu C Y , Feichtenhofer C , Darrell T and Xie S N . 2022 . A ConvNet for the 2020s // Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition . New Orleans, USA : IEEE: 11966 - 11976 [ DOI: 10.1109/CVPR52688.2022.01167 http://dx.doi.org/10.1109/CVPR52688.2022.01167 ]
Milletari F , Navab N and Ahmadi S A . 2016 . V-Net: fully convolutional neural networks for volumetric medical image segmentation // Proceedings of 2016 International Conference on 3D Vision (3DV) . Stanford, USA : IEEE: 565 - 571 [ DOI: 10.1109/3DV.2016.79 http://dx.doi.org/10.1109/3DV.2016.79 ]
Nisha J S , Gopi V P and Palanisamy P . 2022 . Automated colorectal polyp detection based on image enhancement and dual-path CNN architecture . Biomedical Signal Processing and Control , 73 : # 103465 [ DOI: 10.1016/j.bspc.2021.103465 http://dx.doi.org/10.1016/j.bspc.2021.103465 ]
Qiu Z H , Wang Z C , Zhang M M , Xu Z Y , Fan J and Xu L F . 2022 . BDG-Net: boundary distribution guided network for accurate polyp segmentation // Proceedings Volume 12032 , Medical Imaging 2022: Image Processing. San Diego, USA : SPIE: 792 - 799 [ DOI: 10.1117/12.2606785 http://dx.doi.org/10.1117/12.2606785 ]
Ronneberger O , Fischer P and Brox T . 2015 . U-net: convolutional networks for biomedical image segmentation // Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention . Munich, Germany : Springer: 234 - 241 [ DOI: 10.1007/978-3-319-24574-4_28 http://dx.doi.org/10.1007/978-3-319-24574-4_28 ]
Silva J , Histace A , Romain O , Dray X and Granado B . 2014 . Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer . International Journal of Computer Assisted Radiology and Surgery , 9 ( 2 ): 283 - 293 [ DOI: 10.1007/s11548-013-0926-3 http://dx.doi.org/10.1007/s11548-013-0926-3 ]
Srivastava A , Jha D , Chanda S , Pal U , Johansen H D , Johansen D , Riegler M A , Ali S and Halvorsen P . 2022 . MSRF-net: a multi-scale residual fusion network for biomedical image segmentation . IEEE Journal of Biomedical and Health Informatics , 26 ( 5 ): 2252 - 2263 [ DOI: 10.1109/JBHI.2021.3138024 http://dx.doi.org/10.1109/JBHI.2021.3138024 ]
Sun K , Xiao B , Liu D and Wang J D . 2019 . Deep high-resolution representation learning for human pose estimation // Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach, USA : IEEE: 5686 - 5796 [ DOI: 10.1109/CVPR.2019.00584 http://dx.doi.org/10.1109/CVPR.2019.00584 ]
Tajbakhsh N , Gurudu S R and Liang J M . 2016 . Automated polyp detection in colonoscopy videos using shape and context information . IEEE Transactions on Medical Imaging , 35 ( 2 ): 630 - 644 [ DOI: 10.1109/TMI.2015.2487997 http://dx.doi.org/10.1109/TMI.2015.2487997 ]
Vázquez D , Bernal J , Sánchez F J , Fernández-Esparrach G , López A M , Romero A , Drozdzal M and Courville A . 2017 . A benchmark for endoluminal scene segmentation of colonoscopy images . Journal of Healthcare Engineering , 2017 : # 4037190 [ DOI: 10.1155/2017/4037190 http://dx.doi.org/10.1155/2017/4037190 ]
Wang H N , Cao P , Wang J Q and Zaiane O R . 2022a . Uctransnet: rethinking the skip connections in U-Net from a channel-wise perspective with Transformer . Proceedings of the AAAI Conference on Artificial Intelligence , 36 ( 3 ): 2441 - 2449 [ DOI: 10.1609/aaai.v36i3.20144 http://dx.doi.org/10.1609/aaai.v36i3.20144 ]
Wang J D , Sun K , Cheng T H , Jiang B R , Deng C R , Zhao Y , Liu D , Mu Y D , Tan M K , Wang X G , Liu W Y and Xiao B . 2021 . Deep high-resolution representation learning for visual recognition . IEEE Transactions on Pattern Analysis and Machine Intelligence , 43 ( 10 ): 3349 - 3364 [ DOI: 10.1109/TPAMI.2020.2983686 http://dx.doi.org/10.1109/TPAMI.2020.2983686 ]
Wang J F , Huang Q M , Tang F L , Meng J , Su J L and Song S F . 2022b . Stepwise feature fusion: local guides global // Proceedings of the 25th International Conference on Medical Image Computing and Computer Assisted Intervention . Singapore, Singapore : Springer: 110 - 120 [ DOI: 10.1007/978-3-031-16437-8_11 http://dx.doi.org/10.1007/978-3-031-16437-8_11 ]
Wei T Q and Xiao Z Y . 2022 . Dual encoded-decoded polyp segmentation method for gastroscopic images architecture . Journal of Image and Graphics , 27 ( 12 ): 3637 - 3650 [ DOI: 10.11834/jig.210966 http://dx.doi.org/10.11834/jig.210966 ]
Zhou Z W , Siddiquee M M R , Tajbakhsh N and Liang J M . 2018 . U-Net++: a nested u-net architecture for medical image segmentation // Proceedings of 2018 Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support . Granada, Spain : Springer: 3 - 11 [ DOI: 10.1007/978-3-030-00889-5_1 http://dx.doi.org/10.1007/978-3-030-00889-5_1 ]