Edge-distribution-guided high-resolution network for colorectal polyp segmentation
2023, Vol. 28, No. 12, Pages 3897-3910
Print publication date: 2023-12-16
DOI: 10.11834/jig.230015
Lin Jiali, Li Yongqiang, Xu Xizhou, Feng Yuanjing. 2023. Edge-distribution-guided high-resolution network for colorectal polyp segmentation. Journal of Image and Graphics, 28(12):3897-3910
Objective
Colorectal polyp detection can effectively prevent cancerization, yet manual diagnosis often suffers from a high miss rate. Deep learning techniques can provide fine-grained information that supports diagnosis and assists physicians in screening. In real scenarios, the varied morphology of polyps and their blurred edges severely affect algorithm accuracy. To address this problem, an edge-probability-distribution-guided colorectal polyp segmentation network (edge distribution guided high-resolution network, HRNetED) is proposed.
Method
The proposed HRNetED uses the HRNet structure as its backbone. A stacked residual convolution module is designed, which significantly reduces the number of model parameters while improving model performance. In addition, an edge probability distribution model is used to describe polyp edges, improving the stability of edge detection. Finally, an edge detection task is introduced into the multi-scale decoder to strengthen the model's perception of polyp edges.
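To make the stacked residual convolution idea concrete, the following is a minimal PyTorch sketch under our own assumptions about channel widths, normalization, and activations; it illustrates the splitting scheme described above and is not the paper's implementation.

```python
# Sketch of a stacked residual convolution block: one convolution is split
# into four serial 3 x 3 sub-convolutions, every intermediate output is kept
# to gather features with different receptive fields, a pointwise (1 x 1)
# convolution fuses the concatenation, and a residual connection preserves
# the input signal. The channel arithmetic here is an assumption.
import torch
import torch.nn as nn

class StackedResidualConv(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        branch = channels // 4  # assumed sub-convolution width
        self.subconvs = nn.ModuleList()
        in_ch = channels
        for _ in range(4):
            self.subconvs.append(nn.Sequential(
                nn.Conv2d(in_ch, branch, 3, padding=1, bias=False),
                nn.BatchNorm2d(branch),
                nn.ReLU(inplace=True)))
            in_ch = branch
        # Pointwise convolution fuses the four multi-receptive-field outputs.
        self.fuse = nn.Conv2d(4 * branch, channels, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats, h = [], x
        for conv in self.subconvs:
            h = conv(h)  # the i-th output sees a (3 + 2i) receptive field
            feats.append(h)
        # Residual connection avoids degradation as the network deepens.
        return x + self.fuse(torch.cat(feats, dim=1))
```

Because the four sub-convolutions operate at a reduced width, the parameter count stays far below that of a single full-width convolution with a comparable receptive field, which is consistent with the abstract's claim of fewer parameters.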
Result
The method is tested on five datasets: Kvasir-Seg (Kvasir segmentation dataset), ETIS (ETIS Larib polyp database), CVC-ColonDB (colonoscopy videos challenge colon database), CVC-ClinicDB (colonoscopy videos challenge clinic database), and CVC-300 (colonoscopy videos challenge 300). HRNetED outperforms the compared algorithms in the Dice similarity coefficient and mean intersection over union (mIoU) on the CVC-ClinicDB and CVC-300 datasets, with improvements of 1.25% and 1.37%, respectively, over the best compared model on CVC-ClinicDB. On the ETIS dataset, its Dice coefficient exceeds that of the best compared algorithm; on CVC-ColonDB, its Dice and mIoU are at a competitive level. In addition, the HD95 distance of HRNetED on the Kvasir-Seg, ETIS, and CVC-ColonDB datasets is 0.315%, 29.19%, and 2.95% lower, respectively, than that of the best compared algorithms, and it ranks second on CVC-ClinicDB and CVC-300 while still performing well.
Conclusion
The proposed HRNetED network performs stably across multiple datasets, perceives small and blurred polyps well, and detects polyp contours more accurately.
Objective
As a harmful, high-prevalence disease, colorectal cancer seriously threatens human life and health. Nearly 95% of colorectal cancer cases develop from early colon polyps. Therefore, if colorectal polyps can be detected in time and closely observed by specialists, the incidence of colorectal cancer can be effectively reduced. However, manual diagnosis often has a high rate of missed polyps. Deep learning technology can provide fine-grained information that is helpful for diagnosis, such as the location and shape of polyps, and assist doctors in screening, thus providing great value for the prevention and treatment of colorectal cancer. The rapid development of deep learning in recent years has introduced great breakthroughs in computer-aided diagnosis in the medical field. Several models, such as convolutional neural networks and the vision Transformer (ViT), have demonstrated excellent performance on medical tasks, and computer-aided diagnosis has gradually become a trend. In view of the characteristics of colorectal polyp images, such as their large morphological differences and unclear edges, we propose an edge-probability-distribution-guided high-resolution network for colorectal polyp segmentation called HRNetED, which performs well on multiple colorectal polyp datasets and has good clinical application value.
Method
The proposed HRNetED network takes the HRNet structure as its backbone to ensure a full exchange of multi-scale features and to guarantee the accuracy of the model output by maintaining a high-resolution convolutional branch. A stacked residual convolution (SRC) module is also designed: a single convolution is split into four sub-convolutions connected serially, and the output of each sub-convolution is retained so as to obtain multi-receptive-field features. Pointwise convolution is then applied for feature fusion, and a residual connection is introduced to avoid performance degradation. To a certain extent, SRC overcomes the insufficient receptive field of a single convolution, and the convolution splitting significantly reduces the number of model parameters while improving model performance. Given the varied morphological sizes, large color differences, and inconsistent imaging quality of colorectal polyp images, we design a multi-scale decoder to simultaneously supervise and learn the outputs at different scales and introduce an edge detection task into this structure to strengthen the perception of polyp edges. To address the unclear edges of polyps, we use an edge probability distribution model based on the Gaussian distribution to describe the polyp edge, so the model does not need to regress the exact edge position but only needs to predict a heat map of the edge distribution, thus effectively reducing the difficulty of model convergence and improving the perception ability and robustness of the model in regions where the edge semantics are ambiguous. In the dataset configuration, we follow the experimental protocol of mainstream networks such as PraNet. Specifically, we use 900 images from the Kvasir-Seg dataset and 550 images from CVC-ClinicDB as the training set, amounting to 1 450 images. All images from ETIS, CVC-ColonDB, and CVC-300 and the remaining images from Kvasir-Seg and CVC-ClinicDB are combined as test sets. All images are scaled to 256 × 256 pixels. In model training, we use FocalLoss and BCELoss for the supervised training of the edge detection and polyp segmentation tasks, respectively, together with a cosine annealing learning-rate schedule and the Adam optimizer. In the model testing phase, we evaluate our model using the Dice coefficient and the mean intersection over union (mIoU) metric.
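As an illustration of the Gaussian edge-distribution target, the sketch below builds a heat map from a binary polyp mask: each pixel's value decays with its distance to the true contour, so the network only has to predict a distribution over edge locations rather than an exact edge. The contour extraction via a morphological gradient and the sigma value are our assumptions, not values reported in the paper.

```python
# Build a Gaussian edge-probability heat map from a binary polyp mask.
import numpy as np
from scipy import ndimage

def edge_heatmap(mask: np.ndarray, sigma: float = 3.0) -> np.ndarray:
    """mask: binary H x W array (1 = polyp). Returns an H x W map in [0, 1]."""
    m = mask.astype(bool)
    # Contour pixels via a morphological gradient of the mask.
    edge = ndimage.binary_dilation(m) ^ ndimage.binary_erosion(m)
    # Distance of every pixel to its nearest contour pixel.
    dist = ndimage.distance_transform_edt(~edge)
    # Gaussian fall-off turns distances into edge probabilities.
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
```

Under the training setup described above, FocalLoss would supervise the predicted edge heat map while BCELoss supervises the polyp segmentation map.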
Result
We test our method on five publicly available colorectal polyp datasets, namely, Kvasir-Seg, ETIS, CVC-ColonDB, CVC-ClinicDB, and CVC-300, and compare its performance with that of existing colorectal polyp segmentation algorithms, including HRNetv2, PraNet, UACANet, MSRF-Net, BDG-Net, SSFormer, and ESFPNet. The comparison reveals that the Dice coefficient and mIoU of HRNetED on the CVC-ClinicDB and CVC-300 datasets are higher than those of the other algorithms. Compared with the previously best model on the CVC-ClinicDB dataset, HRNetED achieves improvements of 1.25% in Dice and 1.37% in mIoU. On the ETIS dataset, the Dice and mIoU of HRNetED are 82.41% and 71.21%, respectively, with the former exceeding that of the existing best algorithm. On the CVC-ColonDB dataset, the Dice and mIoU of HRNetED are 80.55% and 71.56%, respectively. In addition, the HD95 distance of HRNetED on the Kvasir-Seg, ETIS, and CVC-ColonDB datasets is 0.315%, 29.19%, and 2.95% lower, respectively, than that of the existing best algorithms. On the CVC-ClinicDB and CVC-300 datasets, HRNetED ranks second in HD95 while still showing good performance.
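For reference, the following is a simplified sketch of the reported metrics on binary masks. This HD95 variant pools per-pixel distances between the two foreground regions, whereas evaluation code often restricts the computation to boundary points, so it is illustrative rather than the paper's exact protocol.

```python
# Dice coefficient, IoU (averaged over a test set to give mIoU), and a
# simplified 95th-percentile Hausdorff distance (HD95) for binary masks.
import numpy as np
from scipy import ndimage

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    inter = np.logical_and(pred, gt).sum()
    return (inter + eps) / (np.logical_or(pred, gt).sum() + eps)

def hd95(pred: np.ndarray, gt: np.ndarray) -> float:
    p, g = pred.astype(bool), gt.astype(bool)
    # Distance of every pixel to the nearest foreground pixel of each mask.
    d_to_pred = ndimage.distance_transform_edt(~p)
    d_to_gt = ndimage.distance_transform_edt(~g)
    # Pool distances from gt pixels to pred and from pred pixels to gt,
    # then take the 95th percentile.
    dists = np.concatenate([d_to_pred[g], d_to_gt[p]])
    return float(np.percentile(dists, 95)) if dists.size else 0.0
```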
Conclusion
The proposed HRNetED network performs well in colorectal polyp segmentation tasks. The subjective segmentation results show that the network performs stably across multiple datasets, perceives small targets and fuzzy polyps well, and detects polyp contours strongly. Ablation experiments show that the proposed stacked residual convolution module greatly reduces the number of model parameters while improving model performance, and that the edge probability distribution model proposed for the edge-blur problem effectively improves network performance.
medical image processing; polyp segmentation; deep learning; high-resolution network; edge detection
Bernal J, Sánchez F J, Fernández-Esparrach G, Gil D, Rodríguez C and Vilariño F. 2015. WM-DOVA maps for accurate polyp highlighting in colonoscopy: validation vs. saliency maps from physicians. Computerized Medical Imaging and Graphics, 43: 99-111 [DOI: 10.1016/j.compmedimag.2015.02.007]
Cao H, Wang Y Y, Chen J, Jiang D S, Zhang X P, Tian Q Q and Wang M N. 2023. Swin-UNet: UNet-like pure Transformer for medical image segmentation//Proceedings of 2022 European Conference on Computer Vision. Tel Aviv, Israel: Springer: 205-218 [DOI: 10.1007/978-3-031-25066-8_9]
Chang Q, Ahmad D, Toth J, Bascom R and Higgins W E. 2023. ESFPNet: efficient deep learning architecture for real-time lesion segmentation in autofluorescence bronchoscopic video//Proceedings of 2023 Biomedical Applications in Molecular, Structural, and Functional Imaging. San Diego, USA: SPIE: #12468 [DOI: 10.1117/12.2647897]
Chen L C, Papandreou G, Kokkinos I, Murphy K and Yuille A L. 2016. Semantic image segmentation with deep convolutional nets and fully connected CRFs [EB/OL]. [2016-06-07]. https://arxiv.org/pdf/1412.7062v4.pdf
Chen L C, Papandreou G, Schroff F and Adam H. 2017. Rethinking atrous convolution for semantic image segmentation [EB/OL]. [2017-12-05]. https://arxiv.org/pdf/1706.05587.pdf
Ding X H, Zhang X Y, Han J G and Ding D G. 2022. Scaling up your kernels to 31x31: revisiting large kernel design in CNNs//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 11953-11965 [DOI: 10.1109/CVPR52688.2022.01166]
Fan D P, Ji G P, Zhou T, Chen G, Fu H Z, Shen J B and Shao L. 2020. PraNet: parallel reverse attention network for polyp segmentation//Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention. Lima, Peru: Springer: 263-273 [DOI: 10.1007/978-3-030-59725-2_26]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Huang G, Liu Z, van der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2261-2269 [DOI: 10.1109/CVPR.2017.243]
Jha D, Smedsrud P H, Riegler M A, Halvorsen P, De Lange T, Johansen D and Johansen H D. 2020. Kvasir-SEG: a segmented polyp dataset//Proceedings of the 26th International Conference on Multimedia Modeling. Daejeon, Korea (South): Springer: 451-462 [DOI: 10.1007/978-3-030-37734-2_37]
Jha D, Smedsrud P H, Riegler M A, Johansen D, De Lange T, Halvorsen P and Johansen H D. 2019. ResUNet++: an advanced architecture for medical image segmentation//2019 IEEE International Symposium on Multimedia. San Diego, USA: IEEE: 225-2255 [DOI: 10.1109/ISM46123.2019.00049]
Kim T, Lee H and Kim D. 2021. UACANet: uncertainty augmented context attention for polyp segmentation//Proceedings of the 29th ACM International Conference on Multimedia. Virtual Event, China: ACM: 2167-2175 [DOI: 10.1145/3474085.3475375]
Li J X, Sun J, Li C and Ahmad B. 2022. A MHA-based integrated diagnosis and segmentation method for COVID-19 pandemic. Journal of Image and Graphics, 27(12): 3651-3662 [DOI: 10.11834/jig.211015]
Lin T Y, Goyal P, Girshick R, He K M and Dollár P. 2017. Focal loss for dense object detection//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2999-3007 [DOI: 10.1109/ICCV.2017.324]
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin Transformer: hierarchical vision Transformer using shifted windows//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 9992-10002 [DOI: 10.1109/ICCV48922.2021.00986]
Liu Z, Mao H Z, Wu C Y, Feichtenhofer C, Darrell T and Xie S N. 2022. A ConvNet for the 2020s//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 11966-11976 [DOI: 10.1109/CVPR52688.2022.01167]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of 2016 International Conference on 3D Vision (3DV). Stanford, USA: IEEE: 565-571 [DOI: 10.1109/3DV.2016.79]
Nisha J S, Gopi V P and Palanisamy P. 2022. Automated colorectal polyp detection based on image enhancement and dual-path CNN architecture. Biomedical Signal Processing and Control, 73: #103465 [DOI: 10.1016/j.bspc.2021.103465]
Qiu Z H, Wang Z C, Zhang M M, Xu Z Y, Fan J and Xu L F. 2022. BDG-Net: boundary distribution guided network for accurate polyp segmentation//Proceedings Volume 12032, Medical Imaging 2022: Image Processing. San Diego, USA: SPIE: 792-799 [DOI: 10.1117/12.2606785]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Silva J, Histace A, Romain O, Dray X and Granado B. 2014. Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer. International Journal of Computer Assisted Radiology and Surgery, 9(2): 283-293 [DOI: 10.1007/s11548-013-0926-3]
Srivastava A, Jha D, Chanda S, Pal U, Johansen H D, Johansen D, Riegler M A, Ali S and Halvorsen P. 2022. MSRF-Net: a multi-scale residual fusion network for biomedical image segmentation. IEEE Journal of Biomedical and Health Informatics, 26(5): 2252-2263 [DOI: 10.1109/JBHI.2021.3138024]
Sun K, Xiao B, Liu D and Wang J D. 2019. Deep high-resolution representation learning for human pose estimation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5686-5796 [DOI: 10.1109/CVPR.2019.00584]
Tajbakhsh N, Gurudu S R and Liang J M. 2016. Automated polyp detection in colonoscopy videos using shape and context information. IEEE Transactions on Medical Imaging, 35(2): 630-644 [DOI: 10.1109/TMI.2015.2487997]
Vázquez D, Bernal J, Sánchez F J, Fernández-Esparrach G, López A M, Romero A, Drozdzal M and Courville A. 2017. A benchmark for endoluminal scene segmentation of colonoscopy images. Journal of Healthcare Engineering, 2017: #4037190 [DOI: 10.1155/2017/4037190]
Wang H N, Cao P, Wang J Q and Zaiane O R. 2022a. UCTransNet: rethinking the skip connections in U-Net from a channel-wise perspective with Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3): 2441-2449 [DOI: 10.1609/aaai.v36i3.20144]
Wang J D, Sun K, Cheng T H, Jiang B R, Deng C R, Zhao Y, Liu D, Mu Y D, Tan M K, Wang X G, Liu W Y and Xiao B. 2021. Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10): 3349-3364 [DOI: 10.1109/TPAMI.2020.2983686]
Wang J F, Huang Q M, Tang F L, Meng J, Su J L and Song S F. 2022b. Stepwise feature fusion: local guides global//Proceedings of the 25th International Conference on Medical Image Computing and Computer Assisted Intervention. Singapore, Singapore: Springer: 110-120 [DOI: 10.1007/978-3-031-16437-8_11]
Wei T Q and Xiao Z Y. 2022. Dual encoded-decoded polyp segmentation method for gastroscopic images architecture. Journal of Image and Graphics, 27(12): 3637-3650 [DOI: 10.11834/jig.210966]
Zhou Z W, Siddiquee M M R, Tajbakhsh N and Liang J M. 2018. U-Net++: a nested U-Net architecture for medical image segmentation//Proceedings of 2018 Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Granada, Spain: Springer: 3-11 [DOI: 10.1007/978-3-030-00889-5_1]