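Optic cup and optic disc segmentation fusing residual context encoding and path augmentation
Mei Huawei, Shang Honglin, Su Pan, Liu Yanping (North China Electric Power University, Baoding, Hebei Province) Abstract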
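Objective Segmenting the optic disc and optic cup from fundus images is an important task for the intelligent diagnosis of eye diseases, and U-Net and its variants have been widely applied to optic cup and disc segmentation. Because successive convolution and pooling operations tend to lose spatial information, the segmentation of the optic disc and optic cup suffers from poor accuracy and low efficiency. This paper proposes a deep learning network that fuses residual context encoding and path augmentation (residual context path augmentation U-Net, RCPA-Net), which improves the accuracy and continuity of the segmentation results. Method Contrast-limited adaptive histogram equalization is applied to the input images to enhance contrast and enrich image information. The feature encoding module uses ResNet34 (residual neural network) as its backbone and introduces residual recurrence and an attention mechanism so that the model focuses more on the region of interest. A residual atrous convolution module captures deeper semantic feature information, and a path augmentation module obtains precise localization information from shallow features to strengthen the entire feature hierarchy. This paper also proposes a new multi-label loss function that improves the pixel ratio between the optic disc and cup regions and the background region and generates the final segmentation map. Result The experiments compare the method with a variety of segmentation methods on four datasets. On the ORIGA dataset, the Jaccard (JC) index for optic disc segmentation is 0.9391 and the F-measure is 0.9686, while the JC and F-measure for optic cup segmentation are 0.7948 and 0.8855, respectively. On the Drishti-GS1 dataset, the JC and F-measure are 0.9513 and 0.9750 for optic disc segmentation and 0.8633 and 0.9266 for optic cup segmentation. On the REFUGE dataset, the JC and F-measure are 0.9298 and 0.9636 for optic disc segmentation and 0.8288 and 0.9063 for optic cup segmentation. On the RIM-ONE-R1 dataset, the JC and F-measure for optic disc segmentation are 0.9290 and 0.9628. On all four datasets the results are better than those of the compared algorithms, with a significant performance improvement. In addition, ablation experiments on each module proposed in the network verify the effectiveness of the individual modules in RCPA-Net. Conclusion Comparative experiments against several recently proposed algorithms show that RCPA-Net improves the segmentation accuracy of the optic disc and optic cup and that its predicted images are closer to the ground-truth labels. Cross-dataset tests also show that RCPA-Net generalizes well.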
Optic disc and cup segmentation with combination of residual context encoding and path augmentation
Mei Huawei, Shang Honglin, Su Pan, Liu Yanping (North China Electric Power University) Abstract
Objective Ophthalmic image segmentation is an important part of medical image analysis. Within it, optic disc (OD) and optic cup (OC) segmentation is a key technology for the intelligent diagnosis of glaucoma, the second leading cause of blindness in the world, which can cause irreversible damage to the eyes. The main glaucoma screening method is the evaluation of the OD and OC in fundus images. The cup-to-disc ratio (CDR) is considered one of the most representative features for glaucoma detection; generally, eyes with a CDR greater than 0.65 are considered glaucomatous. With the continuous development of deep learning, U-Net and its variant models, including those based on superpixel classification and edge segmentation, are widely used in OD and OC segmentation tasks. However, because continuous convolution and pooling operations lose spatial information, the segmentation accuracy of the OD and OC is limited and training is inefficient. To improve the accuracy and training efficiency of OD and OC segmentation, we propose the Residual Context Path Augmentation U-Net (RCPA-Net), which captures deeper semantic feature information and addresses the problem of unclear edge localization. Method RCPA-Net mainly includes three modules: a feature coding module (FCM), a residual atrous convolution (RAC) module, and a path augmentation module (PAM). First, the FCM block takes the ResNet34 network as its backbone. By introducing a residual module and an attention mechanism, the model is enabled to focus on the region of interest, and efficient channel attention (ECA) is applied to the squeeze-and-excitation (SE) module. The ECA module avoids dimensionality reduction and captures cross-channel features effectively. Second, the RAC block is used to obtain context feature information over a wider range.
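The channel attention idea described above can be illustrated with a minimal sketch. It assumes the commonly described ECA layout: global average pooling per channel, a shared one-dimensional convolution across neighbouring channels, a sigmoid, and channel-wise rescaling. The function names `eca_weights` and `eca`, the plain nested-list tensor layout, and the kernel values are hypothetical choices for illustration, not the configuration used in RCPA-Net.

```python
import math


def eca_weights(channel_means, kernel):
    """Per-channel attention weights from a shared, zero-padded 1D convolution.

    channel_means: one pooled value per channel.
    kernel: odd-length list standing in for the learned 1D kernel (assumption).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        # Mix each channel descriptor with its neighbours; no channel reduction.
        s = sum(kernel[j] * padded[c + j] for j in range(len(kernel)))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights


def eca(feature_maps, kernel):
    """Rescale each channel (a 2D list) by its attention weight."""
    # Global average pooling: one scalar descriptor per channel.
    means = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    w = eca_weights(means, kernel)
    return [[[v * w[c] for v in row] for row in ch]
            for c, ch in enumerate(feature_maps)]


if __name__ == "__main__":
    fm = [[[1.0, 3.0]], [[2.0, 2.0]]]  # two channels, each 1x2
    print(eca(fm, [0.2, 0.6, 0.2]))    # hypothetical kernel values
```

Because the same small kernel slides over the channel descriptors, the number of attention parameters equals the kernel length regardless of the channel count, which is the sense in which the module avoids dimensionality reduction.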
Inspired by Inception-v4 and CE-Net, we fuse atrous convolution into the Inception-series network and stack the convolution blocks. Traditional convolution is replaced by atrous convolution, so that the receptive field is increased while the number of parameters remains the same. Finally, to shorten the information path between the low-level and top-level features, the PAM block uses accurate low-level localization signals and lateral connections to enhance the entire feature hierarchy. To address the severe pixel imbalance and generate the final segmentation map, we propose a new multi-label loss function based on the Dice coefficient and focal loss, which is used to improve the pixel ratio between the OD/OC region and the background region. Additionally, we augment the training data by flipping the images and adjusting their aspect ratio. The input images are then processed by the contrast-limited adaptive histogram equalization (CLAHE) method, and each resulting image is fused with its original and averaged to form a new three-channel image, which enhances image contrast and enriches image information. In the experimental stage, we use the Adam optimizer instead of stochastic gradient descent to optimize the model. The batch size is 8 and the weight decay is 0.0001. During training, the learning rate is adjusted adaptively according to the number of samples selected each time. When the prediction results are output, the largest connected region in the OD and OC is selected to obtain the final segmentation result. Result Four datasets (ORIGA, Drishti-GS1, REFUGE, and RIM-ONE-R1) are used to validate the performance of the proposed method, and the results are compared with a variety of state-of-the-art methods, including U-Net, M-Net, and CE-Net. The ORIGA dataset contains 650 color fundus images of 3072×2048 pixels, and the ratio of training set to test set is 1:1 in the experiment.
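The post-processing step mentioned in the Method part, keeping only the largest connected region of a predicted OD or OC mask, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `largest_connected_region` is hypothetical, the mask is assumed to be a non-empty rectangular nested list of 0/1 values, and 4-connectivity is an assumption, since the abstract does not state which connectivity is used.

```python
from collections import deque


def largest_connected_region(mask):
    """Return a new binary mask keeping only the largest 4-connected region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []  # pixel coordinates of the largest region found so far
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out


if __name__ == "__main__":
    predicted = [
        [1, 1, 0, 0],
        [1, 0, 0, 1],  # the isolated pixel on the right is a spurious blob
        [0, 0, 0, 0],
    ]
    for row in largest_connected_region(predicted):
        print(row)
```

Small isolated blobs away from the optic nerve head are discarded this way, which is consistent with the anatomical expectation of a single disc and a single cup per image.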
The Drishti-GS1 dataset contains a total of 101 images, including 31 normal images and 70 diseased images. The fundus images are divided into two groups: Group A contains 50 training samples and Group B contains 51 testing samples. The 400 fundus images in the REFUGE dataset are likewise divided into two groups: Group A contains 320 training samples and Group B contains 80 testing samples. The Jaccard index and F-measure are used to evaluate the OD and OC segmentation results. On the ORIGA dataset, the Jaccard index of the proposed method for OD/OC segmentation is 0.9391/0.7948 and the F-measure is 0.9686/0.8855. On the Drishti-GS1 dataset, the Jaccard index for OD/OC segmentation is 0.9513/0.8633 and the F-measure is 0.9750/0.9266. On the REFUGE dataset, the Jaccard index for OD/OC segmentation is 0.9298/0.8288 and the F-measure is 0.9636/0.9063. On the RIM-ONE-R1 dataset, the Jaccard index and F-measure for OD segmentation are 0.9290 and 0.9628. On all four datasets the proposed method outperforms its counterparts, and the performance of the network is significantly improved. In addition, we conducted ablation experiments on the main modules proposed in the network, comparing the location of the modules, the parameters in the model, and other factors. The ablation results demonstrate the effectiveness of each proposed module in RCPA-Net. Conclusion In this study, we proposed RCPA-Net, which combines the advantages of deep segmentation models. The images predicted by RCPA-Net are closer to the ground truth, giving more accurate segmentation of the OD and OC than several state-of-the-art methods. The experiments demonstrate the effectiveness and generalization ability of RCPA-Net.
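For readers who want to reproduce the two evaluation measures, the sketch below computes the Jaccard index and F-measure for one pair of binary masks. It assumes the usual pixel-wise definitions, JC = TP / (TP + FP + FN) and F = 2TP / (2TP + FP + FN); the function name `jaccard_and_f_measure`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are illustrative assumptions rather than details given in the abstract.

```python
def jaccard_and_f_measure(pred, truth):
    """Return (Jaccard index, F-measure) for two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1   # pixel correctly labelled as foreground
            elif p:
                fp += 1   # predicted foreground, actually background
            elif t:
                fn += 1   # missed foreground pixel
    union = tp + fp + fn
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    jaccard = tp / union
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return jaccard, f_measure


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    jc, f = jaccard_and_f_measure(pred, truth)
    print(f"JC = {jc:.4f}, F-measure = {f:.4f}")
```

For a single pair of masks the two measures are linked by F = 2·JC / (1 + JC). Scores averaged over many test images need not satisfy this identity exactly.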
Keywords
optic disc and optic cup segmentation; deep learning; attention mechanism; residual atrous convolution; path augmentation