Lung image segmentation using dense atrous convolution with an attention mechanism

Guo Ning, Bai Zhengyao (School of Information Science and Engineering, Yunnan University, Kunming 650500)

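Abstract
Objective Convolutional neural networks (CNNs) are widely used in computer-aided diagnosis (CAD) of lung diseases, where the main tasks are lung parenchyma segmentation, pulmonary nodule detection, and lesion analysis. Precise segmentation of the lung parenchyma is the key to nodule detection and lung disease diagnosis. To better meet the requirements of CAD systems, we propose an encoder-decoder convolutional neural network that fuses an attention mechanism with dense atrous convolution for lung segmentation. Method The attention mechanism is introduced into the decoder of the network; by increasing the weight of key information, it highlights the target region and suppresses interference from background pixels. To obtain broader and deeper semantic information, a dense atrous convolution module is deployed in the middle of the network. This module combines the advantages of Inception, residual structures, and multi-scale atrous convolution, and obtains deeper feature information without causing exploding or vanishing gradients. To address the feature loss common in segmentation networks, the up-sampling and down-sampling modules are improved: convolution kernels of several different scales are cascaded to widen the network, which effectively avoids feature loss. Result Comparison experiments against five existing mainstream segmentation networks, together with ablation experiments, were carried out on the LUNA (lung nodule analysis) dataset. The results show that the prediction maps produced by the proposed model are closer to the label images. The Dice similarity coefficient, intersection over union (IoU), accuracy (ACC), and sensitivity (SE) are all better than those of the compared methods; relative to the second-best model, they improve by 0.443%, 0.272%, 0.512%, and 0.374%, respectively. Conclusion We propose a lung segmentation network that fuses an attention mechanism with dense atrous convolution, and it achieves better segmentation results than the other segmentation networks.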
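The idea behind the dense atrous convolution module can be illustrated with a small sketch. This is not the authors' module. It is a one-dimensional, pure-Python toy that assumes three parallel branches with dilation rates 1, 3, and 5 (the rates, the shared kernel, and the function names are hypothetical) and adds their outputs to the input as a residual connection. It only shows how dilation widens the receptive field without adding weights.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D atrous convolution (cross-correlation form)."""
    k = len(kernel)
    # Kernel taps sit `dilation` samples apart; centre the kernel on each output.
    half = dilation * (k - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * dilation
            if 0 <= idx < len(signal):  # samples outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: dilation * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def dac_like_block(signal, kernel, rates=(1, 3, 5)):
    """Toy multi-branch block: residual input plus one dilated branch per rate.

    Assumption: the rates and the single shared kernel are illustrative only;
    the source does not state the module's exact configuration.
    """
    out = list(signal)  # residual (identity) path
    for rate in rates:
        branch = dilated_conv1d(signal, kernel, rate)
        out = [o + b for o, b in zip(out, branch)]
    return out
```

With a 3-tap kernel, the three branches in this sketch cover spans of 3, 7, and 11 samples, which is the sense in which parallel dilation rates gather multi-scale context.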
Keywords
The integration of attention mechanism and dense atrous convolution for lung image segmentation

Guo Ning, Bai Zhengyao(School of Information Science and Engineering, Yunnan University, Kunming 650500, China)

Abstract
Objective Chest computed tomography (CT) is an important basis for diagnosing early-stage lung cancer: pulmonary nodules are detected by observing the location, extent, and shape of lesions. A chest CT image contains the lung parenchyma together with surrounding structures such as hydrops, the trachea, the bronchi, and the ribs, which makes the lung parenchyma hard to delineate automatically and precisely. Precise extraction of the lung parenchyma nevertheless plays a vital role in the analysis of lung diseases. Most earlier lung segmentation methods rely on conventional image processing algorithms such as thresholding or morphological operations. Convolutional neural networks (CNNs) are now used in computerized pulmonary disease analysis, and CNN-driven lung segmentation algorithms have been adopted in computer-aided diagnosis (CAD). The U-shaped architecture, built on the end-to-end fully convolutional network (FCN), was designed for medical image segmentation, and its symmetric encoder-decoder structure has proven reliable for biomedical image segmentation. We propose a novel convolutional neural network based on the U-Net architecture that integrates an attention mechanism and dense atrous convolution (DAC). Method The network consists of an encoder and a decoder. The encoder is composed of convolution and down-sampling operations, which reduce the spatial dimension of the feature maps so that more semantic information can be learned. The decoder, equipped with the attention mechanism, uses deconvolution and up-sampling to restore the spatial dimension of the feature maps. Decoding with the attention mechanism helps the network highlight the target region in its output.
Meanwhile, the attention mechanism directs the network's attention to the target by increasing the weight of salient features passed through the skip connections. Consecutive pooling and strided convolution operations reduce the feature resolution, which works against the detailed spatial information that dense prediction requires. The DAC block is therefore deployed between the encoder and the decoder to extract multi-scale contextual information. The block inherits the advantages of Inception, ResNet, and atrous convolution, and can therefore capture features of multiple sizes. In the classic U-Net framework, max-pooling and up-sampling operators reduce and then increase the resolution of the feature maps, which can lead to feature loss and reduced accuracy during training. We replace the original max-pooling and up-sampling operators with down-sampling and up-sampling blocks that use an Inception structure, widening the network with multiple filters and avoiding feature loss. The Dice coefficient loss is used instead of the cross-entropy loss to measure the gap between the prediction and the ground truth. The model is implemented in the deep learning framework PyTorch on a server with two NVIDIA GeForce RTX 2080Ti graphics cards, each with 11 GB of memory. In the experiments, the original images are resized to 256×256 pixels; 80% of them are used for training and the remainder for testing. The proposed model is trained for 120 epochs with the Adam optimizer and an initial learning rate of 0.0001. Result To verify the effectiveness of the proposed method, we compare it with five segmentation networks: FCN-8s, U-Net, UNet++, ResU-Net, and CE-Net (context encoder network). Four segmentation metrics are adopted to assess the segmentation.
These metrics are the Dice similarity coefficient (DSC), intersection over union (IoU), sensitivity (SE), and accuracy (ACC). Experimental results on the LUNA16 dataset show that the proposed model performs best on all four metrics. The average Dice similarity coefficient reaches 0.9859, which is 0.443% higher than that of the second-best model, CE-Net. The model achieves 0.9722, 0.9938, and 0.9822 in terms of IoU, ACC, and SE, respectively, which are 0.272%, 0.512%, and 0.374% higher than those of the second-best model. Compared with the other algorithms, the predictions of the proposed model are closer to the ground-truth labels. In particular, adhesion between the left and right lungs is handled well. Conclusion We propose a novel convolutional neural network with an encoder-decoder structure that integrates an attention mechanism and dense atrous convolution for lung segmentation. The experimental results show that the framework segments the lung parenchyma effectively and outperforms the compared segmentation networks.
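The four evaluation metrics and the Dice loss named above can be sketched as follows. This is a minimal illustration using the standard definitions, not the authors' evaluation code. It assumes binary masks given as flat sequences of 0/1 values (a 2-D mask would be flattened first). The function names, the smoothing constant `eps`, and the convention of returning 1.0 for an empty denominator are assumptions made for the example.

```python
def confusion_counts(pred, target):
    """Count true/false positives and negatives for two flat binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    # Assumed convention: an empty denominator counts as a perfect score.
    return num / den if den else 1.0


def segmentation_metrics(pred, target):
    """Dice, IoU, accuracy (ACC), and sensitivity (SE) from binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "iou": _ratio(tp, tp + fp + fn),
        "acc": _ratio(tp + tn, tp + fp + fn + tn),
        "se": _ratio(tp, tp + fn),
    }


def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss: 1 - Dice computed on predicted probabilities.

    `eps` is an assumed smoothing term that avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    total = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)
```

For a single mask, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU), so the two metrics always rank one prediction the same way. Averages over a dataset need not satisfy the identity exactly.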
Keywords
