Dual encoder-decoder architecture-based polyp segmentation method for gastroscopic images
2022, Vol. 27, No. 12, Pages 3637-3650
Received: 2021-10-11
Revised: 2022-01-27
Accepted: 2022-02-03
Published in print: 2022-12-16
DOI: 10.11834/jig.210966
Objective
Gastrointestinal endoscopy has long been regarded as the gold standard for detecting and preventing colorectal cancer, but a certain probability of missed diagnosis remains in current clinical examination. Deep-learning-based gastrointestinal endoscopic segmentation methods can help physicians assess precancerous lesions accurately, which benefits both diagnosis and interventional treatment. However, improving the accuracy of target segmentation remains a challenging task. To address this problem, this paper proposes an algorithm based on a double-layer encoder-decoder structure.
Method
The proposed algorithm consists of an upstream network and a downstream network. The upstream network is trained to produce an attention weight map, which guides the feature maps in the decoding stage of the downstream network so that the segmentation model focuses more on the target region. A subspace channel attention structure is proposed to extract cross-channel information at multiple resolutions in the skip connections, which effectively refines the segmentation edges. A residual structure is added to the final output to prevent network degradation.
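To illustrate the overall design, a minimal PyTorch sketch of the dual encoder-decoder idea is given below: an upstream network yields an attention weight map, the downstream decoder output is re-weighted by that map, and a residual-style addition is applied at the output. All module sizes and the exact fusion points are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn


class TinyEncoderDecoder(nn.Module):
    """Placeholder encoder-decoder standing in for the upstream/downstream networks."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(16, out_ch, 3, padding=1))

    def forward(self, x: torch.Tensor, attention: torch.Tensor = None) -> torch.Tensor:
        out = self.decoder(self.encoder(x))
        if attention is not None:
            out = out * attention  # attention-guided decoding
        return out


class DualEncoderDecoder(nn.Module):
    """Upstream network produces an attention map that guides the downstream decoder."""

    def __init__(self):
        super().__init__()
        self.upstream = TinyEncoderDecoder(3, 1)
        self.downstream = TinyEncoderDecoder(3, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.upstream(x))    # attention weight map from the upstream pass
        seg = self.downstream(x, attention=attn)  # downstream decoding guided by the map
        return torch.sigmoid(seg + attn)          # residual-style fusion at the output (illustrative)


mask = DualEncoderDecoder()(torch.randn(1, 3, 224, 224))  # (1, 1, 224, 224) segmentation map
```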
Result
Tests are conducted on the public CVC-ClinicDB (Colonoscopy Videos Challenge-ClinicDataBase) and Kvasir-Capsule datasets, with the Dice similarity coefficient (DSC), mean intersection over union (mIoU), precision, and recall as evaluation metrics. The DSC reaches 94.22% and 96.02% on the two datasets, respectively. The two datasets are further mixed to test the robustness of the algorithm on cross-device images, where the DSC improvements reach 17%-20%. Without any post-processing, compared with state-of-the-art (SOTA) models, the proposed method improves DSC, mIoU, and recall by 1.64%, 1.41%, and 2.54% over U-Net, respectively, and improves DSC and recall by 2.23% and 9.87% over ResUNet++; significant improvements on these metrics are also obtained over SFA (selective feature aggregation network), PraNet, and TransFuse.
Conclusion
The proposed algorithm effectively improves medical image segmentation and achieves higher accuracy for small-target and edge segmentation.
Objective
Adenomatous polyps are an early manifestation of colorectal cancer, and early intervention is an effective way to prevent it. Gastrointestinal endoscopy is currently regarded as the "gold standard" for the detection and prevention of colorectal cancer. However, a certain probability of missed diagnosis still exists in clinical examination. Deep-learning-based gastrointestinal endoscopy segmentation methods can help clinicians assess precancerous lesions efficiently, which has a positive effect on diagnosis and clinical intervention. Intestinal polyps are typically small and round with blurred edges, which greatly increases the difficulty of semantic segmentation. Our research focuses on developing an improved algorithm based on a double-layer encoder-decoder structure.
Method
Our algorithm comprises an upstream network and a downstream network. The attention weight map generated by training the upstream network is fed into the decoding part of the downstream network. 1) To make the network focus on the target area of the image, the generated attention map guides the feature maps during decoding. By suppressing background regions, the model pays more attention to the segmentation targets, which has a significant effect on small-target recognition in semantic segmentation. 2) Edge extraction is addressed as well. Because the intestinal wall and the polyp mucous membrane look similar, the edges of the segmentation target are blurred, so it is essential to strengthen the edge-extraction ability of the model to obtain more accurate segmentation results. To improve the segmentation of polyp boundaries, subspace channel attention is integrated into the skip connections of the downstream network to extract cross-channel information at multiple resolutions and refine the edges. Unlike convolution, a self-attention mechanism can model long-range dependencies and thus provides a global receptive field for vision models; however, traditional attention mechanisms bring a large amount of additional computational overhead. The lightweight subspace channel attention mechanism divides the feature channels into subspaces, learns an attention map within each subspace, and fuses the resulting attention maps, which refines the edges while keeping the computational cost low.
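As a concrete illustration, the following PyTorch sketch implements a lightweight subspace channel attention block in the spirit of ULSAM (Saini et al., 2020): the channels are split into a few subspaces, each subspace learns its own single-channel spatial attention map with depthwise and pointwise convolutions, and the re-weighted subspaces are concatenated. The number of subspaces, kernel sizes, and the normalization of the attention map are assumptions; the paper's exact configuration may differ.

```python
import torch
import torch.nn as nn


class SubspaceChannelAttention(nn.Module):
    """Sketch of a lightweight subspace channel attention block.

    The input channels are split into `groups` subspaces; each subspace produces
    its own single-channel spatial attention map (depthwise conv followed by a
    pointwise conv), which re-weights that subspace before concatenation.
    """

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by groups"
        self.groups = groups
        sub = channels // groups
        self.depthwise = nn.ModuleList(
            [nn.Conv2d(sub, sub, 3, padding=1, groups=sub) for _ in range(groups)])
        self.pointwise = nn.ModuleList(
            [nn.Conv2d(sub, 1, kernel_size=1) for _ in range(groups)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outputs = []
        for sub_x, dw, pw in zip(torch.chunk(x, self.groups, dim=1),
                                 self.depthwise, self.pointwise):
            attn = torch.sigmoid(pw(dw(sub_x)))   # (B, 1, H, W) attention map for this subspace
            outputs.append(sub_x * attn + sub_x)  # residual re-weighting
        return torch.cat(outputs, dim=1)


# Example: refine a skip-connection feature map with 64 channels.
feat = torch.randn(2, 64, 56, 56)
refined = SubspaceChannelAttention(channels=64, groups=4)(feat)
assert refined.shape == feat.shape
```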
We conduct tests on the public datasets Colonoscopy Videos Challenge-ClinicDataBase (CVC-ClinicDB) and Kvasir-Capsule. CVC-ClinicDB contains 612 images of intestinal polyps collected by conventional colonoscopy, while Kvasir-Capsule contains 55 polyp images collected by capsule gastroscopy; although both depict the same kind of target, there is a large gap between the two imaging modalities. To further demonstrate the robustness of the algorithm, tests are also carried out on the ultrasound nerve segmentation dataset, which contains 5 633 ultrasound images of the brachial plexus acquired by imaging specialists. All images are resized to 224×224 pixels, randomly shuffled, and divided into training, validation, and test sets at a ratio of 6∶2∶2, and the model is trained on a single GTX 1080Ti GPU. Our network is implemented in PyTorch.
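A brief sketch of this data preparation is given below, using in-memory tensors as placeholders for the actual datasets; the placeholder sizes and the random seed are illustrative assumptions.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder tensors standing in for images resized to 224x224 and their binary masks
# (CVC-ClinicDB, for example, has 612 image/mask pairs).
images = torch.randn(100, 3, 224, 224)
masks = torch.randint(0, 2, (100, 1, 224, 224)).float()
dataset = TensorDataset(images, masks)

# Randomly shuffled 6:2:2 split into training, validation, and test sets.
n = len(dataset)
n_train, n_val = int(0.6 * n), int(0.2 * n)
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(0))
```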
In the experiments, the binary cross-entropy loss (BCE loss) and the Dice loss are mixed in proportion to construct a new loss function, which performs better for binary semantic segmentation. The Adam optimizer is used, with an initial learning rate of 0.000 3 and learning-rate decay.
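A minimal sketch of this loss and optimizer setup follows, assuming an equal-weight mix of the BCE and Dice terms (the mixing ratio is not specified here) and a step decay as a stand-in for the unspecified learning-rate schedule; the placeholder model is purely illustrative.

```python
import torch
import torch.nn as nn


class BCEDiceLoss(nn.Module):
    """Proportional mix of binary cross-entropy and Dice loss for binary segmentation."""

    def __init__(self, bce_weight: float = 0.5, smooth: float = 1.0):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.bce_weight = bce_weight  # assumed mixing ratio
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        bce = self.bce(logits, target)
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
        dice = 1.0 - (2.0 * inter + self.smooth) / (union + self.smooth)
        return self.bce_weight * bce + (1.0 - self.bce_weight) * dice.mean()


criterion = BCEDiceLoss()
model = nn.Conv2d(3, 1, kernel_size=1)  # placeholder for the dual encoder-decoder network
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)  # initial learning rate 0.000 3
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)  # assumed decay
```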
Result
The Dice similarity coefficient (DSC), mean intersection over union (mIoU), precision, and recall are used as quantitative evaluation metrics. All four metrics lie between 0 and 1, and higher values indicate better segmentation performance.
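For reference, a simple PyTorch implementation of the four metrics for a single binary mask is sketched below; the 0.5 threshold and the smoothing constant are assumptions, and mIoU is obtained by averaging the per-image IoU.

```python
import torch


def segmentation_metrics(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> dict:
    """DSC, IoU, precision and recall for binary masks with values in {0, 1}."""
    pred, target = pred.float().flatten(), target.float().flatten()
    tp = (pred * target).sum()
    fp = (pred * (1 - target)).sum()
    fn = ((1 - pred) * target).sum()
    return {
        "DSC": ((2 * tp + eps) / (2 * tp + fp + fn + eps)).item(),
        "IoU": ((tp + eps) / (tp + fp + fn + eps)).item(),
        "precision": ((tp + eps) / (tp + fp + eps)).item(),
        "recall": ((tp + eps) / (tp + fn + eps)).item(),
    }


# Example: binarize a predicted probability map at 0.5 before scoring.
prob = torch.rand(1, 1, 224, 224)
truth = (torch.rand(1, 1, 224, 224) > 0.5).float()
print(segmentation_metrics((prob > 0.5).float(), truth))
```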
The experimental results show that the DSC of our model reaches 94.22% on the CVC-ClinicDB dataset and 96.02% on the Kvasir-Capsule dataset. Compared with U-Net, our DSC, mIoU, precision, and recall increase by 1.89%, 2.42%, 1.04%, and 1.87%, respectively, on CVC-ClinicDB, and by 1.06%, 1.9%, 0.4%, and 1.58% on Kvasir-Capsule. The robustness of the algorithm on cross-device images is tested further by mixing the two datasets, where the DSC increases by 17% to 20%; compared with U-Net, the DSC of our model increases by 16.73% on the CVC-KC dataset (trained on CVC-ClinicDB and tested on Kvasir-Capsule) and by 1% on the KC-CVC dataset (trained on Kvasir-Capsule and tested on CVC-ClinicDB).
Conclusion
We propose an attention segmentation model with a dual encoder-decoder architecture. The algorithm effectively improves medical image segmentation and achieves higher accuracy for small-target and edge segmentation, which can help improve colorectal cancer screening strategies.
Ahn S B, Han D S, Bae J H, Byun T J, Kim J P and Eun C S. 2012. The miss rate for colorectal adenoma determined by quality-adjusted, back-to-back colonoscopies. Gut and Liver, 6(1): 64-70 [DOI: 10.5009/gnl.2012.6.1.64]
Bernal J, Tajkbaksh N, Sánchez F J, Matuszewski B J, Chen H, Yu L Q, Angermann Q, Romain O, Rustad B, Balasingham I, Pogorelov K, Choi S, Debard Q, Maier-Hein L, Speidel S, Stoyanov D, Brandao P, Córdova H, Sánchez-Montes C, Gurudu S R, Fernández-Esparrach G, Dray X, Liang J M and Histace A. 2017. Comparative validation of polyp detection methods in video colonoscopy: results from the MICCAI 2015 endoscopic vision challenge. IEEE Transactions on Medical Imaging, 36(6): 1231-1249 [DOI: 10.1109/TMI.2017.2664042]
Bernal J, Sánchez F J, Fernández-Esparrach G, Gil D, Rodríguez C and Vilariño F. 2015. WM-DOVA maps for accurate polyp highlighting in colonoscopy: validation vs. saliency maps from physicians. Computerized Medical Imaging and Graphics, 43: 99-111 [DOI: 10.1016/j.compmedimag.2015.02.007]
Chen J N, Lu Y Y, Yu Q H, Luo X D, Adeli E, Wang Y, Lu L, Yuille A L and Zhou Y Y. 2021. TransUNet: transformers make strong encoders for medical image segmentation [EB/OL]. [2021-02-08]. https://arxiv.org/pdf/2102.04306.pdf
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848]
Fan D P, Ji G P, Zhou T, Chen G, Fu H Z, Shen J B and Shao L. 2020. PraNet: parallel reverse attention network for polyp segmentation//Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention—MICCAI 2020. Lima, Peru: Springer: 263-273 [DOI: 10.1007/978-3-030-59725-2_26]
Fang Y Q, Chen C, Yuan Y X and Tong K Y. 2019. Selective feature aggregation network with area-boundary constraints for polyp segmentation//Proceedings of the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention—MICCAI 2019. Shenzhen, China: Springer: 302-310 [DOI: 10.1007/978-3-030-32239-7_34]
Finlay A M, Parikh A R and Ricciardi R. 2021. Clinical presentation, diagnosis, and staging of colorectal cancer [EB/OL]. [2021-07-27]. https://www.uptodate.com/contents/zh-Hans/clinical-presentation-diagnosis-and-staging-of-colorectal-cancer
Fu J, Liu J, Tian H J, Li Y, Bao Y J, Fang Z W and Lu H Q. 2019. Dual attention network for scene segmentation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3146-3154 [DOI: 10.1109/CVPR.2019.00326]
He K H and Xiao Z Y. 2021. LRUNet: a lightweight rapid semantic segmentation network for brain tumors. Journal of Image and Graphics, 26(9): 2233-2242 [DOI: 10.11834/jig.200436]
Huang C H, Wu Y H and Lin L Y. 2021. HarDNet-MSEG: a simple encoder-decoder polyp segmentation neural network that achieves over 0.9 mean Dice and 86 FPS [EB/OL]. [2021-01-20]. https://arxiv.org/pdf/2101.07172.pdf
Jha D, Riegler M A, Johansen D, Halvorsen P and Johansen H D. 2020. DoubleU-Net: a deep convolutional neural network for medical image segmentation//Proceedings of the 33rd IEEE International Symposium on Computer-Based Medical Systems (CBMS). Rochester, USA: IEEE: 558-564 [DOI: 10.1109/CBMS49503.2020.00111]
Jha D, Smedsrud P H, Riegler M A, Johansen D, Lange T D, Halvorsen P and Johansen H D. 2019. ResUNet++: an advanced architecture for medical image segmentation//Proceedings of 2019 IEEE International Symposium on Multimedia (ISM). San Diego, USA: IEEE: #49 [DOI: 10.1109/ISM46123.2019.00049]
Jha D, Tomar N K, Ali S, Riegler M A, Johansen H D, Johansen D, de Lange T and Halvorsen P. 2021. NanoNet: real-time polyp segmentation in video capsule endoscopy and colonoscopy//Proceedings of the 34th IEEE International Symposium on Computer-Based Medical Systems. Aveiro, Portugal: IEEE: 37-43 [DOI: 10.1109/CBMS52027.2021.00014]
Ji S Y and Xiao Z Y. 2021. Integrated context and multi-scale features in thoracic organs segmentation. Journal of Image and Graphics, 26(9): 2135-2145 [DOI: 10.11834/jig.200558]
Kim T, Lee H and Kim D. 2021. UACANet: uncertainty augmented context attention for polyp segmentation//Proceedings of the 29th ACM International Conference on Multimedia. [s. l.]: ACM: 2167-2175 [DOI: 10.1145/3474085.3475375]
Liu J W, Liu Q H, Li X O, Ling C and Liu C J. 2021. Improved colonic polyp segmentation method based on double U-shaped network. Acta Optica Sinica, 41(18): #1810001 [DOI: 10.3788/AOS202141.1810001]
Mamonov A V, Figueiredo I N, Figueiredo P N and Tsai Y H R. 2014. Automated polyp detection in colon capsule endoscopy. IEEE Transactions on Medical Imaging, 33(7): 1488-1502 [DOI: 10.1109/TMI.2014.2314959]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of the 4th International Conference on 3D Vision (3DV). Stanford, USA: IEEE: 565-571 [DOI: 10.1109/3DV.2016.79]
Oktay O, Schlemper J, Folgoc L L, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla N Y, Kainz B, Glocker B and Rueckert D. 2018. Attention U-Net: learning where to look for the pancreas [EB/OL]. [2021-05-20]. https://arxiv.org/pdf/1804.03999.pdf
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Saini R, Jha N K, Das B, Mittal S and Mohan C K. 2020. ULSAM: ultra-lightweight subspace attention module for compact convolutional neural networks//Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Snowmass, USA: IEEE: 1616-1625 [DOI: 10.1109/WACV45572.2020.9093341]
Siegel R L, Miller K D and Jemal A. 2019. Cancer statistics, 2019. CA: A Cancer Journal for Clinicians, 69(1): 7-34 [DOI: 10.3322/caac.21551]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: [s. n.]
Smedsrud P H, Thambawita V, Hicks S A, Gjestang H, Nedrejord O O, Næss E, Borgli H, Jha D, Berstad T J D, Eskeland S L, Lux M, Espeland H, Petlund A, Nguyen D T D, Garcia-Ceja E, Johansen D, Schmidt P T, Toth E, Hammer H L, de Lange T, Riegler M A and Halvorsen P. 2021. Kvasir-Capsule, a video capsule endoscopy dataset. Scientific Data, 8(1): #142 [DOI: 10.6084/m9.figshare.14178905]
Tian C X and Zhao L. 2021. Epidemiological characteristics of colorectal cancer and colorectal liver metastasis. Chinese Journal of Cancer Prevention and Treatment, 28(13): 1033-1038 [DOI: 10.16073/j.cnki.cjcpt.2021.13.12]
Tomar N K, Jha D, Riegler M A, Johansen H D, Johansen D, Rittscher J, Halvorsen P and Ali S. 2022. FANet: a feedback attention network for improved biomedical image segmentation. IEEE Transactions on Neural Networks and Learning Systems: #3159394 [DOI: 10.1109/TNNLS.2022.3159394]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser Ł and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc: 5998-6008
Wang D C, Hao M J, Xia R R, Zhu J H, Li S and He X X. 2021. MSB-Net: multi-scale boundary net for polyp segmentation//Proceedings of the 10th IEEE Data Driven Control and Learning Systems Conference (DDCLS). Suzhou, China: IEEE: 88-93 [DOI: 10.1109/DDCLS52934.2021.9455514]
Xiao X, Lian S, Luo Z M and Li S Z. 2018. Weighted res-UNet for high-quality retina vessel segmentation//Proceedings of the 9th International Conference on Information Technology in Medicine and Education (ITME). Hangzhou, China: IEEE: 327-331 [DOI: 10.1109/ITME.2018.00080]
Zhang Y D, Liu H Y and Hu Q. 2021. TransFuse: fusing transformers and CNNs for medical image segmentation//Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention. Strasbourg, France: Springer: 14-24 [DOI: 10.1007/978-3-030-87193-2_2]
Zhang Z X, Liu Q J and Wang Y H. 2018. Road extraction by deep residual U-Net. IEEE Geoscience and Remote Sensing Letters, 15(5): 749-753 [DOI: 10.1109/LGRS.2018.2802944]