Optic disk segmentation by combining UNet and residual attention mechanism
2020, Vol. 25, No. 9, Pages 1915-1929
Received: 2019-10-30
Revised: 2020-03-14
Accepted: 2020-03-21
Published in print: 2020-09-16
DOI: 10.11834/jig.190527
Objective
Glaucoma, pathologic myopia, and similar conditions cause irreversible damage to vision, and early diagnosis of ophthalmic diseases can greatly reduce their incidence. Owing to the complexity of fundus images, optic disk segmentation is easily affected by regions such as blood vessels and lesions, so traditional methods cannot segment the optic disk accurately. To address this problem, we propose a deep learning-based optic disk segmentation method, RA-UNet (residual attention UNet), which improves segmentation accuracy and achieves automatic, end-to-end segmentation.
Method
The proposed method improves on the original UNet. A ResNet34 fused with an attention mechanism is used as the downsampling path to strengthen image feature extraction, and pretrained weights are loaded, which helps alleviate the overfitting caused by limited training samples. The attention mechanism introduces global context information, enhancing useful features and suppressing useless feature responses. The upsampling path of UNet is modified to reduce the number of model parameters and ease training. The segmentation map output by the network is postprocessed to remove erroneous predictions. Meanwhile, DiceLoss replaces the ordinary cross-entropy loss function for optimizing the network parameters.
Result
The proposed method is compared with other methods on four datasets. On the RIM-ONE (retinal image database for optic nerve evaluation)-R1 dataset, the F score and overlap rate reach 0.957 4 and 0.918 2, which are 2.89% and 5.17% higher than those of UNet; on RIM-ONE-R3, they reach 0.969 and 0.939 8, 1.5% and 2.78% higher than UNet; on Drishti-GS1, they reach 0.966 2 and 0.934 5, 1.65% and 3.04% higher than UNet; on the iChallenge-PM pathologic myopia challenge dataset, they reach 0.942 4 and 0.891 1, 3.59% and 6.22% higher than UNet. Ablation experiments on RIM-ONE-R1 and Drishti-GS1 further verify that each module of the improved algorithm contributes to the optic disk segmentation performance.
Conclusion
The proposed RA-UNet improves optic disk segmentation accuracy, maintains good segmentation performance on images containing lesion regions, and generalizes well.
Objective
Glaucoma and pathologic myopia are two important causes of irreversible damage to vision, and the early detection of these diseases is crucial for subsequent treatment. The optic disk, which is the starting point of blood vessel convergence, is approximately elliptical in normal fundus images, and its accurate, automatic segmentation from fundus images is a basic task. Doctors often diagnose eye diseases on the basis of colored fundus images of patients; browsing the images repeatedly to make appropriate diagnoses is tedious and arduous, and tired doctors are likely to miss subtle changes in an image, resulting in missed diagnoses. Therefore, using computers to segment the optic disk automatically can help doctors diagnose these diseases. Glaucoma, pathologic myopia, and other eye diseases are reflected in the shape of the optic disk; thus, an accurate segmentation of the optic disk can assist doctors in diagnosis. However, achieving such a segmentation is challenging because of the complexity of fundus images, and many existing deep learning-based methods are susceptible to pathologic regions. UNet has been widely used in medical image segmentation tasks but performs poorly in optic disk segmentation. Convolution is the core of convolutional neural networks, yet the importance of the information contained in different spatial locations and channels varies; attention mechanisms, which address this issue, have received increasing attention over the past few years. In this study, we present a new automatic optic disk segmentation network based on UNet to improve segmentation accuracy.
Method
Following the design idea of UNet, the proposed model consists of an encoder and a decoder and can be trained end to end. The ability of the encoder to extract discriminative representations directly affects segmentation performance. Acquiring pixel-wise labeled data is expensive, especially in medical image analysis; thus, transfer learning is adopted to train the model. Given the strong feature extraction capability of ResNet, the encoder adopts a modified, pretrained ResNet34 as the backbone to obtain hierarchical features and integrates squeeze-and-excitation (SE) blocks at appropriate positions to further enhance performance. The final average pooling layer and the fully connected layer of ResNet34 are removed, and the rest is kept. An SE block boosts feature discriminability by modeling the relationships among feature map channels and adaptively recalibrating channel-wise feature responses. In the encoder, all modules except the four SE blocks are initialized with weights pretrained on ImageNet (ImageNet Large-Scale Visual Recognition Challenge), which speeds up convergence and prevents overfitting. The input image is downsampled five times in total to extract abstract semantic features.
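A minimal PyTorch sketch of this encoder design follows; it is an illustration under stated assumptions, not the authors' exact code. In particular, the SE reduction ratio of 16 and the placement of one SE block after each residual stage are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class SEBlock(nn.Module):
    """Squeeze-and-excitation block (Hu et al., 2018): recalibrates
    channel-wise responses using global context."""
    def __init__(self, channels, reduction=16):  # reduction ratio is assumed
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)       # squeeze: global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                          # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                               # reweight the channels

# Hypothetical encoder: ImageNet-pretrained ResNet34 stages with SE blocks appended;
# the classifier head (average pooling + fully connected layer) is simply unused.
backbone = resnet34(pretrained=True)
encoder_stages = [
    nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu),   # 1/2 resolution
    nn.Sequential(backbone.maxpool, backbone.layer1, SEBlock(64)),
    nn.Sequential(backbone.layer2, SEBlock(128)),
    nn.Sequential(backbone.layer3, SEBlock(256)),
    nn.Sequential(backbone.layer4, SEBlock(512)),                 # 1/32 resolution
]
```

Note that the five stride-2 operations (the initial convolution, the max pooling, and the three strided residual stages) account for the five downsampling steps described above.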
In the decoder, 2×2 deconvolution with stride 2 is used for upsampling, and five upsampling operations are conducted. In contrast to the original UNet decoder, each deconvolution except the last outputs a 128-channel feature map, which reduces the number of model parameters. Shallow feature maps preserve more detailed spatial information, whereas deep feature maps carry more high-level semantic information; the stack of downsampling layers enlarges the receptive field of the network but loses detailed location information. The skip connections between the encoder and decoder therefore combine high-level semantic information with low-level detail for fine-grained segmentation: each encoder feature map first passes through a 1×1 convolution layer, and the output is concatenated with the corresponding feature map in the decoder. These skip connections are crucial for restoring image details in the decoder layers.
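The following sketch shows one decoder stage consistent with this description. The width of the 1×1 projection and the 3×3 fusion convolution after concatenation are our assumptions; only the stride-2 deconvolution, the 128-channel output, and the 1×1 skip projection come from the text.

```python
import torch
import torch.nn as nn

class DecoderStage(nn.Module):
    """One upsampling stage: a 2x2 stride-2 deconvolution, then concatenation
    with a 1x1-projected encoder feature map (the skip connection)."""
    def __init__(self, in_ch, skip_ch, out_ch=128, proj_ch=64):  # proj_ch is assumed
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.proj = nn.Conv2d(skip_ch, proj_ch, kernel_size=1)   # 1x1 conv on the encoder map
        self.fuse = nn.Sequential(                               # fusion conv is assumed
            nn.Conv2d(out_ch + proj_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                                 # double the spatial resolution
        x = torch.cat([x, self.proj(skip)], dim=1)     # skip connection
        return self.fuse(x)
```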
Lastly, the network outputs a two-channel probability map for the background and the optic disk with the same size as the input image: the last deconvolution has two output channels and is followed by softmax activation, generating the probability maps of the background and the optic disk simultaneously. The segmentation map predicted by the network is rough; thus, postprocessing is applied to reduce false positives.
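The abstract does not specify the postprocessing step. A common choice for optic disk masks, shown here purely as an assumption, is to keep only the largest connected component and fill its holes:

```python
import numpy as np
from skimage import measure
from scipy import ndimage

def postprocess(mask: np.ndarray) -> np.ndarray:
    """Keep the largest connected component and fill holes.
    `mask` is a binary (H, W) array from thresholding the probability map."""
    labels = measure.label(mask)
    if labels.max() == 0:                  # nothing was segmented
        return mask
    sizes = np.bincount(labels.ravel())[1:]          # component sizes, background excluded
    largest = labels == (np.argmax(sizes) + 1)       # mask of the largest component
    return ndimage.binary_fill_holes(largest)
```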
In addition, DiceLoss is used in place of the traditional cross-entropy loss function.
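A minimal sketch of DiceLoss on the two-channel softmax output, following the formulation of Milletari et al. (2016); the smoothing constant `eps` is an assumption.

```python
import torch

def dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """probs: (N, 2, H, W) softmax output; target: (N, 2, H, W) one-hot labels.
    Dice = 2|A∩B| / (|A| + |B|); the loss is 1 - Dice, averaged over channels."""
    dims = (0, 2, 3)                                  # sum over batch and spatial axes
    inter = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    dice = (2 * inter + eps) / (denom + eps)
    return 1 - dice.mean()
```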
Considering that the training images are limited, we first perform data augmentation, including random horizontal, vertical, and diagonal flips, to prevent overfitting. An NVIDIA GeForce GTX 1080Ti GPU is used to accelerate network training, and we adopt Adam optimization with an initial learning rate of 0.001.
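The flip-based augmentation can be sketched as follows; interpreting the "diagonal flip" as a transpose about the main diagonal is our assumption.

```python
import random
import numpy as np

def augment(image: np.ndarray, mask: np.ndarray):
    """Random horizontal, vertical, and diagonal flips, applied identically
    to the image (H, W, C) and its label mask (H, W)."""
    if random.random() < 0.5:                          # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if random.random() < 0.5:                          # vertical flip
        image, mask = image[::-1], mask[::-1]
    if random.random() < 0.5:                          # diagonal flip (assumed: transpose)
        image, mask = image.transpose(1, 0, 2), mask.T
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```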
Result
To verify the effectiveness of our method, we conduct experiments on four public datasets, namely, RIM-ONE (retinal image database for optic nerve evaluation)-R1, RIM-ONE-R3, Drishti-GS1, and iChallenge-PM. Two evaluation metrics, namely, the F score and the overlap rate, are computed. We also provide segmentation results to compare the different methods visually.
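Assuming the standard pixel-wise definitions used in this literature, with precision P and recall R computed over disk pixels and with S and G denoting the predicted and ground-truth disk regions, the two metrics are:

```latex
F = \frac{2PR}{P + R}, \qquad
\mathit{Overlap} = \frac{|S \cap G|}{|S \cup G|}
```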
The extensive experiments demonstrate that our method outperforms several other deep learning-based methods, namely, UNet, DRIU, DeepDisc, and CE-Net, on the four public datasets, and the visual segmentation results produced by our method are closer to the ground truth labels. Compared with UNet on RIM-ONE-R1, RIM-ONE-R3, Drishti-GS1, and iChallenge-PM, the F score (higher is better) increases by 2.89%, 1.5%, 1.65%, and 3.59%, and the overlap rate (higher is better) increases by 5.17%, 2.78%, 3.04%, and 6.22%, respectively. Compared with DRIU on the same four datasets, the F score increases by 1.89%, 1.85%, 1.14%, and 2.01%, and the overlap rate increases by 3.41%, 3.42%, 2.1%, and 3.53%, respectively. Compared with DeepDisc, the F score increases by 0.24%, 0.01%, 0.18%, and 1.44%, and the overlap rate increases by 0.42%, 0.01%, 0.33%, and 2.55%, respectively. Compared with CE-Net, the F score increases by 0.42%, 0.2%, 0.43%, and 1.07%, and the overlap rate increases by 0.77%, 0.36%, 0.79%, and 1.89%, respectively. We also conduct ablation experiments on RIM-ONE-R1 and Drishti-GS1; the results demonstrate the effectiveness of each part of our algorithm.
Conclusion
In this study, we propose a new end-to-end convolutional network model based on UNet and apply it to the optic disk segmentation problem in practical medical image analysis. Extensive experiments show that our method outperforms other state-of-the-art deep learning-based optic disk segmentation approaches and has excellent generalization performance. In future work, we intend to introduce recent loss functions that focus on the segmentation of the optic disk boundary.
Al-Bander B, Williams B M, Al-Nuaimy W, Al-Taee M A, Pratt H and Zheng Y L. 2018. Dense fully convolutional segmentation of the optic disc and cup in colour fundus for glaucoma diagnosis. Symmetry, 10(4):#87[DOI:10.3390/sym10040087]
Aquino A, Gegúndez-Arias M E and Marín D. 2010. Detecting the optic disc boundary in digital fundus images using morphological, edge detection, and feature extraction techniques. IEEE Transactions on Medical Imaging, 29(11):1860-1869[DOI:10.1109/TMI.2010.2053042]
Cao X R, Xue L Y, Lin J W and Yu L. 2018. A novel method of optic disk segmentation based on visual saliency and rotary scanning. Journal of Biomedical Engineering, 35(2):229-236[DOI:10.7507/1001-5515.201706013]
Cheng J, Liu J, Xu Y W, Yin F S, Wong D W K, Tan N M, Tao D C, Cheng C Y, Aung T and Wong T Y. 2013. Superpixel classification based optic disc and optic cup segmentation for glaucoma screening. IEEE Transactions on Medical Imaging, 32(6):1019-1032[DOI:10.1109/TMI.2013.2247770]
Edupuganti V G, Chawla A and Kale A. 2018. Automatic optic disk and cup segmentation of fundus images using deep learning//Proceedings of the 25th IEEE International Conference on Image Processing. Athens: IEEE: 2227-2231[DOI:10.1109/ICIP.2018.8451753]
Fu H Z, Cheng J, Xu Y W, Wong D W K, Liu J and Cao X C. 2018. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Transactions on Medical Imaging, 37(7):1597-1605[DOI:10.1109/TMI.2018.2791488]
Fu H Z, Li F, Orlando J I, Bogunović H, Sun X, Liao J G, Xu Y W, Zhang S C and Zhang X L. 2019. PALM: pathologic myopia challenge[EB/OL].[2019-10-09]. http://dx.doi.org/10.21227/55pk-8z03
Fumero F, Alayon S, Sanchez J L, Sigut J and Gonzalez-Hernandez M. 2011. RIM-ONE: an open retinal image database for optic nerve evaluation//Proceedings of the 24th International Symposium on Computer-Based Medical Systems. Bristol: IEEE: 1-6[DOI:10.1109/CBMS.2011.5999143]
Gu Z W, Cheng J, Fu H Z, Zhou K, Hao H Y, Zhao Y T, Zhang T Y, Gao S H and Liu J. 2019. CE-Net: context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging, 38(10):2281-2292[DOI:10.1109/TMI.2019.2903562]
Gu Z W, Liu P, Zhou K, Jiang Y M, Mao H Y, Cheng J and Liu J. 2018. DeepDisc: optic disc segmentation based on atrous convolution and spatial pyramid pooling//Proceedings of the 1st International Workshop on Computational Pathology and Ophthalmic Medical Image Analysis. Granada: Springer: 253-260[DOI:10.1007/978-3-030-00949-6_30]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE: 770-778[DOI:10.1109/CVPR.2016.90]
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE: 7132-7141[DOI:10.1109/CVPR.2018.00745]
Huang G, Liu Z, van der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE: 2261-2269[DOI:10.1109/CVPR.2017.243]
Ioffe S and Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift[EB/OL]. 2015-03-02[2019-10-09]. https://arxiv.org/pdf/1502.03167.pdf
Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE: 3431-3440[DOI:10.1109/CVPR.2015.7298965]
Lowell J, Hunter A, Steel D, Basu A, Ryder R, Fletcher E and Kennedy L. 2004. Optic nerve head segmentation. IEEE Transactions on Medical Imaging, 23(2):256-264[DOI:10.1109/TMI.2003.823261]
Maninis K K, Pont-Tuset J, Arbeláez P and Van Gool L. 2016. Deep retinal image understanding//Proceedings of the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention. Athens: Springer: 140-148[DOI:10.1007/978-3-319-46723-8_17]
Mary M C V S, Rajsingh E B and Naik G R. 2016. Retinal fundus image analysis for diagnosis of glaucoma: a comprehensive survey. IEEE Access, 4:4327-4354[DOI:10.1109/ACCESS.2016.2596761]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of the 4th International Conference on 3D Vision (3DV). Stanford: IEEE: 565-571[DOI:10.1109/3DV.2016.79]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich: Springer: 234-241[DOI:10.1007/978-3-319-24574-4_28]
Sevastopolsky A. 2017. Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network. Pattern Recognition and Image Analysis, 27(3):618-624[DOI:10.1134/S1054661817030269]
Shankaranarayana S M, Ram K, Mitra K and Sivaprakasam M. 2017. Joint optic disc and cup segmentation using fully convolutional and adversarial networks//Proceedings of the FIFI 2017 and OMIA 2017 Workshops, Held in Conjunction with MICCAI 2017: Fetal, Infant and Ophthalmic Medical Image Analysis. Québec City: Springer: 168-176[DOI:10.1007/978-3-319-67561-9_19]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition[EB/OL]. 2015-04-10[2019-10-09]. https://arxiv.org/pdf/1409.1556.pdf
Sivaswamy J, Krishnadas S R, Chakravarty A, Joshi G D, Ujjwal and Syed T A. 2015. A comprehensive retinal image dataset for the assessment of glaucoma from the optic nerve head analysis. JSM Biomedical Imaging Data Papers, 2(1):#1004
Son J, Park S J and Jung K H. 2018. Towards accurate segmentation of retinal vessels and the optic disc in fundoscopic images with generative adversarial networks. Journal of Digital Imaging, 32(3):499-512[DOI:10.1007/s10278-018-0126-3]
Wang S J, Yu L Q, Yang X, Fu C W and Heng P A. 2019. Patch-based output space adversarial learning for joint optic disc and cup segmentation. IEEE Transactions on Medical Imaging, 38(11):2485-2495[DOI:10.1109/TMI.2019.2899910]
Wu X X and Xiao Z Y. 2018. Automatic algorithm for fast parting optical fundus disc based on multi-circle. Optical Technique, 44(5):586-591[DOI:10.13741/j.cnki.11-1879/o4.2018.05.012]
Yu F and Koltun V. 2016. Multi-scale context aggregation by dilated convolutions[EB/OL]. 2016-04-30[2019-10-09]. https://arxiv.org/pdf/1511.07122.pdf
Yu S, Xiao D, Frost S and Kanagasingam Y. 2019. Robust optic disc and cup segmentation with deep learning for glaucoma detection. Computerized Medical Imaging and Graphics, 74:61-71[DOI:10.1016/j.compmedimag.2019.02.005]
Zhao H S, Shi J P, Qi X J, Wang X G and Jia J Y. 2017. Pyramid scene parsing network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE: 6230-6239[DOI:10.1109/CVPR.2017.660]
Zilly J, Buhmann J M and Mahapatra D. 2017. Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation. Computerized Medical Imaging and Graphics, 55:28-41[DOI:10.1016/j.compmedimag.2016.07.012]