
发布时间: 2021-07-16
DOI: 10.11834/jig.200501
2021 | Volume 26 | Number 7













3D多尺度深度卷积神经网络肺结节检测
孙华聪, 彭延军, 郭燕飞, 张晓庆
山东科技大学, 青岛 266590

摘要

目的 肺结节是肺癌的早期存在形式。低剂量CT(computed tomography)扫描作为肺癌筛查的重要检查手段,已经大规模应用于健康体检,但巨大的CT数据带来了大量工作,随着人工智能技术的快速发展,基于深度学习的计算机辅助肺结节检测引起了关注。由于肺结节尺寸差别较大,在多个尺度上表示特征对结节检测任务至关重要。针对结节尺寸差别较大导致的结节检测困难问题,提出一种基于深度卷积神经网络的胸部CT序列图像3D多尺度肺结节检测方法。方法 包括两阶段:1)尽可能提高敏感度的结节初检网络;2)尽可能减少假阳性结节数量的假阳性降低网络。在结节初检网络中,以组合了压缩激励单元的Res2Net网络为骨干结构,使同一层卷积具有多种感受野,提取肺结节的多尺度特征信息,并使用引入了上下文增强模块和空间注意力模块的区域推荐网络结构,确定候选区域;在由Res2Net网络模块和压缩激励单元组成的假阳性降低网络中对候选结节进一步分类,以降低假阳性,获得最终结果。结果 在公共数据集LUNA16(lung nodule analysis 16)上进行实验,实验结果表明,对于结节初检网络阶段,当平均每例假阳性个数为22时,敏感度可达到0.983,相比基准ResNet + FPN(feature pyramid network)方法,平均敏感度和最高敏感度分别提高了2.6%和0.8%;对于整个3D多尺度肺结节检测网络,当平均每例假阳性个数为1时,敏感度为0.924。结论 与现有主流方案相比,该检测方法不但提高了肺结节检测的敏感度,还有效地控制了假阳性,取得了更优的性能。

关键词

肺结节检测; 卷积神经网络(CNN); 多尺度; 区域推荐网络; 上下文增强; 空间注意力; 假阳性降低

3D multi-scale deep convolutional neural networks in pulmonary nodule detection
Sun Huacong, Peng Yanjun, Guo Yanfei, Zhang Xiaoqing
Shandong University of Science and Technology, Qingdao 266590, China
Supported by: National Natural Science Foundation of China(61976126)

Abstract

Objective Pulmonary nodules are early forms of lung cancer, one of the most threatening malignancies for human health and life. As an important means of lung cancer screening, low-dose computerized tomographic scanning has been widely used in health examinations. However, the large amount of computed tomography (CT) data brings a heavy workload to doctors and radiologists, and high-intensity work can result in misdiagnosis. With the rapid development of artificial intelligence technology, computer-aided lung-nodule detection based on deep learning has attracted much attention. As the size of pulmonary nodules varies greatly, representing features on multiple scales is critical for nodule detection tasks. To solve the detection difficulty caused by the large variation in nodule size, this paper proposes a 3D multi-scale pulmonary nodule detection method for chest CT sequence images based on a deep convolutional neural network. Method The method consists of two stages: 1) a nodule candidate detection stage that maximizes system sensitivity, and 2) a false positive reduction stage that minimizes the number of false positive nodules. Specifically, a series of preprocessing operations is first performed on the original CT images, and the regions of interest (ROIs) of lung nodules are obtained by cropping. In the training phase of the nodule candidate detection network, after the preprocessing steps, data augmentation is performed by random rotation, flipping, and scaling. Then, nodule cubes and non-nodule cubes with a size of 128×128×128 are randomly cropped out and input to the network. The nodule candidate detection network uses the combination of squeeze-and-excitation units and Res2Net modules as the backbone structure, so that the convolutions of the same layer have a variety of receptive fields. Thus, the network can extract the multi-scale feature information of pulmonary nodules. In addition, the nodule candidate detection network uses a region proposal network structure that introduces a context enhancement module and a spatial attention module to identify region candidates. In the test phase of the nodule candidate detection network, the preprocessed CT image is divided into several small patches of size 208×208×208, which are used as the inputs of the network, and adjacent patches overlap by 32 pixels. For each CT image, the nodule candidates obtained from all patches are aggregated, and nodules with high overlap are merged through non-maximum suppression with an intersection over union (IoU) threshold of 0.1 to obtain the detection results. In the training phase of the false positive reduction network, because experiments show that the nodule candidate detection network produces an average of 22 false positive nodules per scan, the positive samples are augmented by 22 times to balance the numbers of positive and negative samples. The augmentation methods are consistent with those used in the training phase of the nodule candidate detection network. The false positive reduction network, consisting mainly of Res2Net modules and squeeze-and-excitation units, further classifies nodule candidates to reduce the number of false positives. In the testing phase of the false positive reduction network, cubes of size 48×48×48 centered at the nodule candidate coordinates obtained by the nodule candidate detection network are cropped as the inputs of the false positive reduction network. The outputs of the false positive reduction network are the confidences of the nodule candidate cubes. Among these components, the squeeze-and-excitation unit captures channel dependence comprehensively, assigning large weights to channels that contain abundant nodule information and small weights to channels without nodule information. The Res2Net module increases the receptive field of each output feature map without increasing the computational load, giving the network stronger multi-scale representation ability. The region proposal network can take images of any scale as input and outputs a series of scored, robust region candidates. The context enhancement module fuses high-level semantic information and low-level position information; its structure is simple, easy to implement, and computationally cheap, yet it performs well. The spatial attention module makes the network pay more attention to the ROIs, which reduces the difficulty of distinguishing pulmonary nodules from visually similar surrounding structures such as blood vessels and shadows. The effectiveness of this method is validated on the publicly available LUNA16 (lung nodule analysis 16) dataset, and extensive ablation experiments are conducted to demonstrate the contribution of each key component of the proposed framework. The LUNA16 dataset is a subset of LIDC-IDRI (lung image database consortium and image database resource initiative), the largest public dataset of lung nodules. The LUNA16 dataset excludes CT images with slice thickness greater than 2.5 mm from the LIDC-IDRI dataset. A total of 888 CT images remain, with slice thickness of 0.6-2.5 mm, spatial resolution of 0.46-0.98 mm, and an average nodule diameter of 8.3 mm. The criterion for a positive nodule in the LUNA16 dataset is that at least three of four radiologists judged its diameter to be greater than 3 mm; a total of 1 186 positive nodules are annotated in the dataset. The evaluation metric, FROC (free-response receiver operating characteristic), is the average recall rate at 0.125, 0.25, 0.5, 1, 2, 4, and 8 false positive nodules per scan, which is the official evaluation metric of the LUNA16 challenge. Result The experimental results show that in the nodule candidate detection stage, the sensitivity can reach 0.983 when the average number of false positives per scan is 22. Compared with the benchmark ResNet + FPN (feature pyramid network) method, the average sensitivity and the maximum sensitivity are increased by 2.6% and 0.8%, respectively. For the entire 3D multi-scale pulmonary nodule detection network, when the average number of false positives per scan is 1, the sensitivity is 0.924. Conclusion Compared with state-of-the-art methods, our method not only improves the sensitivity of pulmonary nodule detection but also effectively controls the number of false positives, achieving better performance. This method only outputs the position information of nodules, whereas in actual lung cancer screening the growth position, edge shape, and internal structure of nodules are all significant for clinical diagnosis; analyzing these characteristics would make the method more practical.

Key words

pulmonary nodule detection; convolutional neural network(CNN); multi-scale; region proposal network; context enhancement; spatial attention; false positive reduction

0 引言

肺癌是对人类健康和生命威胁最大的恶性肿瘤之一(Bray等,2018)。根据医学临床经验,一旦肺癌的临床症状显现,治愈率就非常低,因此尽早发现肺结节对降低肺癌死亡率具有重要意义(Wood等,2018)。低剂量CT(computed tomography)扫描作为对高危人群进行肺癌筛查的重要检查手段已大规模用于健康体检,但巨大的CT数据给医生和放射学家带来大量工作,高强度的工作容易造成医生误诊。随着人工智能技术的快速发展,基于深度学习的计算机辅助检测引起了人们关注(Murphy等,2018)。

深度学习在胸部CT序列图像肺结节检测领域取得了优异成绩。Zhu等人(2018)使用具有双路径块和U-Net型编解码器结构的Faster R-CNN(region convolutional neural network)(Ren等,2017)进行候选结节检测,在LUNA16(lung nodule analysis 16)数据集(Setio等,2017)上,7个假阳性数(0.125,0.25,0.5,1,2,4,8)下的平均敏感度为0.842。Dou等人(2017a)使用3维全卷积网络(Shelhamer等,2017)进行肺结节初始检测,在LUNA16数据集上的平均敏感度为0.839,利用残差网络(He等,2016)进行假阳性降低,在平均每例假阳性个数为1时,敏感度达到0.905。Khosravan和Bagci(2018)提出一种名为S4ND(single-shot single-scale lung nodule detection)的肺结节检测网络,由密集连接卷积块组成,以端到端的方式进行训练,不需要任何后处理来完善检测结果,在LUNA16数据集上达到0.897的平均敏感度。Xie等人(2019)通过两个区域推荐网络和一个反卷积层对2D Faster R-CNN结构进行调整,检测候选结节,在LUNA16数据集上最高敏感度可达0.864,并用3种2D模型分别训练3种位置不同的切片,降低假阳性,平均敏感度为0.790。Dou等人(2017b)通过将不同大小的CT图像立方体作为输入,提出了用于假阳性降低的多级上下文3D卷积神经网络,在LUNA16数据集上平均敏感度为0.827。Wang等人(2019)合并3个相邻的轴向切片,构建3D RGB图像用于结节检测,在LUNA16数据集上,当平均每例候选结节数为60.23时,最高敏感度可达0.968,通过感受野不同的两种Inception-v4网络(Szegedy等,2017)降低假阳性,平均敏感度为0.903。以上这些方法虽然对于特定大小的肺结节检测已取得较好效果,但在肺结节多尺度检测提高敏感度和降低假阳性等方面仍存在改善空间,如何利用CT序列图像数据特性设计更高效的网络结构是提高计算机辅助检测系统性能的关键。

本文基于深度卷积神经网络(deep convolution neural network,DCNN),以提高系统敏感度和降低假阳性为目标,首先使用结节初检网络检测候选结节,然后利用假阳性降低网络对候选结节进一步分类,获得最终结果。本文的主要贡献有:1)CT图像由连续的序列切片组成,3D CNN能更好地捕获CT序列图像的空间信息,提取到更丰富的特征。因此,设计两种3D DCNN,分别用于检测候选结节与去除假阳性结节。2)通过将压缩激励单元(Hu等,2020)嵌入到多个感受野在同一粒度水平的多尺度Res2Net(Gao等,2021)网络模块,创建了一种3D多尺度肺结节检测网络,并引入融合多尺度特征的上下文增强模块和使网络更加关注感兴趣区域的空间注意力模块(Qin等,2019),提高检测性能。3)基于多尺度Res2Net网络模块与压缩激励单元,创建了一种3D假阳性降低网络,将假阳性降低网络得到的预测概率与检测网络得到的预测概率加权平均,得到最终结果。4)在公共大型LUNA16数据集上进行验证,实验结果表明:所提出的方法与几种先进的网络相比具有一定的竞争力。此外,进行了多种消融验证实验,证明算法的有效性。

1 3D多尺度肺结节检测网络

自动肺结节检测可以看做是一项输入为CT影像$\boldsymbol{I}$,输出为肺结节位置$[x, y, z, d]$的目标检测任务。其中,$x$、$y$、$z$表示肺结节立方体包围盒的中心坐标,$d$表示肺结节的直径。实质上,就是构造一个从$\boldsymbol{I}$到$[x, y, z, d]$的映射$F$。为了达到这一目标,提出一种3D多尺度肺结节检测网络,如图 1所示(图中,$\otimes$表示矩阵对应元素相乘)。该网络由基本模块Bottle2SEneck构成,包括结节初检网络和假阳性降低网络两部分。

图 1 3D多尺度肺结节检测网络结构
Fig. 1 3D multi-scale pulmonary nodule detection network structure
((a)nodule candidate detection network; (b)false positive reduction network)

1.1 Bottle2SEneck

Bottle2SEneck由Res2Net网络模块与压缩激励单元组合而成,模块结构如图 2所示(图中,$\oplus $表示矩阵元素相加),其中,$\boldsymbol{x}_{i}, \boldsymbol{y}_{i}(i=1, 2, 3, 4) $表示拆分特征图,3 × 3 × 3表示卷积核大小为3 × 3 × 3的卷积层,每个卷积层后面都有一个批归一化层和一个ReLU层。

图 2 Bottle2SEneck结构
Fig. 2 The structure of Bottle2SEneck module

Bottle2SEneck首先利用一组3 × 3 × 3的滤波器从输入特征图$\boldsymbol{X}$中提取特征,并将输出特征图根据通道维度平均拆分为4个组,在图 2中表示为$\boldsymbol{x}_{1}, \boldsymbol{x}_{2}, \boldsymbol{x}_{3}, \boldsymbol{x}_{4}$($\boldsymbol{x}_{1}, \boldsymbol{x}_{2}, \boldsymbol{x}_{3}, \boldsymbol{x}_{4}$的空间尺寸相同)。然后将特征图子集$\boldsymbol{x}_{i}$与上一组滤波器$K_{i-1}$的输出$\boldsymbol{y}_{i-1}$相加,送入滤波器$K_{i}$,得到输出特征图$\boldsymbol{y}_{i}$,具体为

$ \boldsymbol{y}_{i}= \begin{cases}{K}_{i} \boldsymbol{x}_{i} & i=1 \\ K_{i}\left(\boldsymbol{y}_{i-1}+\boldsymbol{x}_{i}\right) & 2 \leqslant i \leqslant 3 \\ \boldsymbol{x}_{i} & i=4\end{cases} $ (1)

然后,再将$\boldsymbol{y}_{1}, \boldsymbol{y}_{2}, \boldsymbol{y}_{3}, \boldsymbol{y}_{4} $在通道维度拼接起来。在Bottle2SEneck中,不需对第4个分块进行卷积,这可使特征得到重用;使用多个卷积核大小为3×3×3、通道数为${C}$/4的小滤波器替代一个卷积核大小为3 × 3 × 3、通道数为${C}$的大滤波器,增加了每个输出特征图的感受野范围,使网络能够充分提取全局和局部特征,具有更强的多尺度表示能力,同时还能保持与大滤波器相似的计算负载。拆分和级联策略,有利于使卷积更有效地处理特征。
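为便于理解,下面给出式(1)拆分、卷积再拼接过程的一个最小PyTorch示意(按通道均分为4组;类名、参数等均为示意性假设,并非原文实现):

```python
import torch
import torch.nn as nn

class Res2NetSplit3D(nn.Module):
    """式(1)的示意实现:将输入特征图按通道均分为4组,前3组依次卷积并与
    上一组输出相加,第4组直接重用(不做卷积),最后在通道维度拼接。"""

    def __init__(self, channels, scale=4):
        super().__init__()
        assert channels % scale == 0
        self.scale = scale
        width = channels // scale
        # 对应式(1)中的滤波器K_1、K_2、K_3(第4组不卷积)
        self.convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv3d(width, width, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm3d(width),
                nn.ReLU(inplace=True),
            )
            for _ in range(scale - 1)
        ])

    def forward(self, x):
        xs = torch.chunk(x, self.scale, dim=1)    # x_1, ..., x_4
        ys = [self.convs[0](xs[0])]               # y_1 = K_1(x_1)
        for i in range(1, self.scale - 1):        # y_i = K_i(y_{i-1} + x_i), i = 2, 3
            ys.append(self.convs[i](ys[-1] + xs[i]))
        ys.append(xs[-1])                         # y_4 = x_4,特征重用
        return torch.cat(ys, dim=1)               # 在通道维度拼接
```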

压缩激励单元结构如图 3所示,由压缩和激励两个过程组成。压缩过程通过自适应平均池化来整合全局特征;激励过程通过全连接层FC1—ReLU—全连接层FC2—Sigmoid结构实现,其中,$ r$是一个缩放参数,取值为16。激励过程可根据压缩过程中汇聚的信息,全面捕获通道依赖性,使得含有丰富结节信息的通道权重大,不含有结节信息的通道权重小。最后,将激励过程产生的输出(每个通道的权重)与最初的输入中对应通道的特征图相乘,以强调肺结节的特征。

图 3 压缩激励单元的结构
Fig. 3 The structure of squeeze-and-excitation unit
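作为补充,压缩激励单元可示意实现如下(一个最小PyTorch草图,张量形状与实现细节为假设):

```python
import torch.nn as nn

class SEUnit3D(nn.Module):
    """压缩激励单元示意:压缩(自适应平均池化)整合全局特征,
    激励(FC1-ReLU-FC2-Sigmoid)捕获通道依赖性,缩放参数r取16;
    输出为按通道权重重标定后的特征图。"""

    def __init__(self, channels, r=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)   # 压缩过程
        self.excite = nn.Sequential(             # 激励过程
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        n, c = x.shape[:2]
        w = self.squeeze(x).view(n, c)           # 每个通道的全局描述
        w = self.excite(w).view(n, c, 1, 1, 1)   # 每个通道的权重
        return x * w                             # 与原输入逐通道相乘,强调结节特征
```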

1.2 结节初检网络

1.2.1 网络结构

本文提出的用于在低剂量CT扫描中检测候选结节的初检网络,如图 1(a)所示,采用区域推荐网络结构(Yuan等,2019;Tong等,2019;Yang等,2019),并根据检测任务的特点,将网络候选区域尺度设置为5,10,20三种。具体来说,该网络由骨干部分Res2SENet和检测部分两部分组成,输入是裁剪的CT图像立方体,维度(长×宽×高×通道数)为128 × 128 × 128 × 1。

骨干部分Res2SENet包括5个阶段,第1阶段包括两个卷积层,第2—5阶段分别包括1个最大池化层和若干个Bottle2SEneck模块,具体模块个数如图 1(a)所示。最大池化层用来下采样,减小特征图的尺寸。Bottle2SEneck模块用来更改通道数,并不改变特征图尺寸。此处,用$\boldsymbol{c}_{i} $表示阶段$i$的输出特征图。

由于特征金字塔网络(feature pyramid network,FPN)(Lin等,2017)结构涉及许多额外的卷积和检测分支,增加了计算成本并导致巨大的运行时间延迟。因此,在网络的检测部分,引入上下文增强模块(context enhancement module,CEM)和空间注意力模块(spatial attention module,SAM)。CEM可聚合多尺度特征信息,增强特征的区分性。在CEM中,将$\boldsymbol{c}_{4}$和$\boldsymbol{c}_{5}$分别通过反卷积进行上采样得到的特征图与$\boldsymbol{c}_{3}$在通道维度进行合并。与先前的FPN结构相比较,CEM仅涉及2个反卷积层和1个特征图拼接操作,在保证网络效果的同时降低了计算成本。SAM对从高层得到的特征图做softmax,得到空间注意力图,并将此注意力图与低层特征图相乘,使得网络对感兴趣区域有更多的关注。在SAM后添加2个Bottle2SEneck模块,并设置丢弃(dropout)层(Lim,2021)以防止过拟合现象发生。最后,将dropout层的输出作为区域推荐网络的输入,区域推荐网络的输出包括候选结节包围盒中包含结节的概率$p$以及候选结节的空间信息,即坐标$(x, y, z)$和直径$d$。
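下面给出CEM与SAM核心操作的一个简化PyTorch示意(上采样倍数、通道数以及注意力图的具体形式均为示意性假设,可能与原文实现有差异):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CEM3D(nn.Module):
    """上下文增强模块示意:将c4、c5分别反卷积上采样到c3的空间尺寸,
    再与c3在通道维度拼接(此处假设c4、c5相对c3分别下采样2倍、4倍)。"""

    def __init__(self, c3_ch, c4_ch, c5_ch):
        super().__init__()
        self.up4 = nn.ConvTranspose3d(c4_ch, c3_ch, kernel_size=2, stride=2)
        self.up5 = nn.ConvTranspose3d(c5_ch, c3_ch, kernel_size=4, stride=4)

    def forward(self, c3, c4, c5):
        return torch.cat([c3, self.up4(c4), self.up5(c5)], dim=1)

def spatial_attention(low_feat, high_feat):
    """空间注意力模块示意:对高层特征图的空间位置做softmax得到注意力图,
    再与低层特征图相乘(假设二者空间尺寸一致)。"""
    n, c, d, h, w = high_feat.shape
    attn = F.softmax(high_feat.view(n, c, -1), dim=-1).view(n, c, d, h, w)
    attn = attn.mean(dim=1, keepdim=True)   # 汇聚为单通道空间注意力图(示意)
    return low_feat * attn
```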

1.2.2 损失函数

每个候选区域的二进制类标签根据其与目标结节的交并比(intersection over union,IoU)分配。如果IoU>0.5,则将该候选区域标记为正样本;如果IoU<0.02,则将该候选区域标记为负样本;其他既不属于正样本也不属于负样本的候选区域在训练过程中忽略。损失函数由分类损失和回归损失共同组成。对标记的每个候选结节,多任务损失函数定义为

$ L\left(p_{i}, t_{i}\right)=\lambda L_{\mathrm{cls}}\left(p_{i}, p_{i}^{*}\right)+p_{i}^{*} L_{\mathrm{reg}}\left(t_{i}, t_{i}^{*}\right) $ (2)

式中,$\lambda $是权重参数,设置为0.5。$L_{\mathrm{cls}}\left(p_{i}, p_{i}^{*}\right)$是用二值交叉熵损失函数($CrossEntropy_{\text{binary}}$)计算的分类损失,$L_{\mathrm{reg}}\left(t_{i}, t_{i}^{*}\right)$是用$smooth_{l_{1}}$损失函数计算的回归损失,分别定义为

$ L_{\mathrm{cls}}\left(p_{i}, p_{i}^{*}\right)=-\left[p_{i}^{*} \log _{2}\left(p_{i}\right)+\left(1-p_{i}^{*}\right) \log _{2}\left(1-p_{i}\right)\right] $ (3)

$ L_{\mathrm{reg}}\left(t_{i}, t_{i}^{*}\right)= \begin{cases}0.5\left(t_{i}-t_{i}^{*}\right)^{2} & \left|t_{i}-t_{i}^{*}\right|<1 \\ \left|t_{i}-t_{i}^{*}\right|-0.5 & \text { 其他 }\end{cases} $ (4)

式中,$p_{i} $$p_{i}^{*}$分别表示候选区域的预测概率和分类标签。正样本的标签$p_{i}^{*}$=1,负样本的标签$p_{i}^{*}$=0。可以看出,只有标记$p_{i}^{*}$=1的正样本才考虑回归损失。$t_{i}$$t_{i}^{*}$分别表示候选区域的预测相对坐标和回归标签,定义为

$ t_{i}=\left(\frac{x-x_{\alpha}}{d_{\alpha}}, \frac{y-y_{\alpha}}{d_{\alpha}}, \frac{z-z_{\alpha}}{d_{\alpha}}, \log _{2}\left(\frac{d}{d_{\alpha}}\right)\right) $ (5)

$ t_{i}^{*}=\left(\frac{x^{*}-x_{\alpha}}{d_{\alpha}}, \frac{y^{*}-y_{\alpha}}{d_{\alpha}}, \frac{z^{*}-z_{\alpha}}{d_{\alpha}}, \log _{2}\left(\frac{d^{*}}{d_{\alpha}}\right)\right) $ (6)

式中,$(x, y, z, d) $为预测得到的结节在原始空间的坐标和尺寸,$\left(x^{*}, y^{*}, z^{*}, d^{*}\right)$为真实结节在原始空间的坐标和尺寸,$\left(x_{\alpha}, y_{\alpha}, z_{\alpha}, d_{\alpha}\right)$为对应候选区域的坐标和尺寸。
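式(2)至式(4)定义的多任务损失可按如下方式示意实现(采用常见实现中以自然对数为底的二值交叉熵,与式(3)以2为底仅相差常数因子;张量形状为示意性假设):

```python
import torch
import torch.nn.functional as F

def rpn_loss(p, p_star, t, t_star, lam=0.5):
    """多任务损失示意:p、p_star为形状(N,)的预测概率与分类标签(浮点),
    t、t_star为形状(N, 4)的预测相对坐标与回归标签;
    分类项用二值交叉熵,回归项用smooth L1,且仅对正样本计算回归损失。"""
    cls_loss = F.binary_cross_entropy(p, p_star)                          # L_cls
    reg_each = F.smooth_l1_loss(t, t_star, reduction='none').sum(dim=-1)  # 式(4)
    reg_loss = (p_star * reg_each).sum() / p_star.sum().clamp(min=1)      # 仅正样本
    return lam * cls_loss + reg_loss                                      # λ取0.5
```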

1.3 假阳性降低网络

在结节初检阶段,通常会产生很多假阳性,为了从大量候选结节中准确区分真实结节,设计了一个3维深度卷积神经网络,对初检阶段产生的候选结节进一步分类以降低假阳性,网络结构如图 1(b)所示。

该网络由5个阶段组成,将阶段$i$的输出特征图用$\boldsymbol{m}_{i} $表示,则$\boldsymbol{m}_{i} $的尺寸(长×宽×高×通道数)显示在图中阶段$i$的下方。在该网络中,使用卷积层和Bottle2SEneck模块更改通道数;使用最大池化层进行下采样,减小特征图的尺寸;使用dropout层避免过拟合;使用二值交叉熵损失函数进行优化。

2 实验

2.1 数据集

LUNA16数据集是最大的肺结节公开数据集LIDC-IDRI(lung image database consortium and image database resource initiative)(Armato Ⅲ等,2011)的子集。LUNA16数据集从LIDC-IDRI数据集中剔除了切片厚度大于2.5 mm的CT图像,剩下的CT图像共888幅,切片厚度为0.6~2.5 mm,空间分辨率为0.46~0.98 mm,结节平均直径为8.3 mm。LUNA16数据集中结节的判定标准为4名放射科专家中至少3名认为该结节直径大于3 mm,该数据集中共注释了1 186个阳性结节。

2.2 预处理

对输入的CT影像采用自动预处理,具体步骤如下:

1) 为便于神经网络从中抽取有效的图像特征,将肺结节可能出现的体素值范围从原来的[-1 200,600]归一化到[0,1],具体为

$ \overline{val}= \begin{cases}0 & val<-1\ 200 \\ \frac{val-(-1\ 200)}{600-(-1\ 200)} & -1\ 200 \leqslant val \leqslant 600 \\ 1 & val>600\end{cases} $ (7)

式中,$val$表示变换前的CT值,$\overline{val}$表示变换后的CT值。

2) 根据LUNA16数据集给出的原始CT肺实质区域分割文件,去除背景。

3) 将CT图像${X}$${Y}$${Z}$方向的像素间距统一为[1,1,1] mm。

4) 根据CT肺实质区域自动截取肺结节感兴趣区域。
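下面以步骤1)的体素值归一化(式(7))和步骤3)的像素间距重采样为例给出示意代码(函数名、插值方式均为示意性假设):

```python
import numpy as np
from scipy.ndimage import zoom

def normalize_hu(volume, lo=-1200.0, hi=600.0):
    """式(7):将体素值从[-1 200, 600]线性归一化到[0, 1],越界值截断。"""
    vol = (volume.astype(np.float32) - lo) / (hi - lo)
    return np.clip(vol, 0.0, 1.0)

def resample_to_1mm(volume, spacing):
    """步骤3):将X、Y、Z方向的像素间距重采样为[1, 1, 1] mm。
    spacing为原始体素间距(mm),顺序需与volume各轴一致(此处为假设)。"""
    factors = np.asarray(spacing, dtype=np.float32) / 1.0
    return zoom(volume, factors, order=1)   # 线性插值
```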

预处理前后的CT图像如图 4所示。

图 4 预处理前后的CT图像
Fig. 4 CT images before and after preprocessing
((a)CT images before preprocessing; (b)CT images after preprocessing)

2.3 实验过程

实验所用训练平台使用8块主频为2.20 GHz的Intel(R) Xeon(R) Silver 4210 CPU,内存为64 GB。所有实验均采用2.2节所述的预处理操作。所有网络模型使用Python 2.7搭建,通过PyTorch并行计算框架在2块NVIDIA GeForce RTX 2080Ti显卡上进行加速。网络训练均使用随机梯度下降(stochastic gradient descent,SGD)优化方法,其中,初始学习率设置为0.01,动量参数设置为0.9,权重衰减设置为0.000 1。对LUNA16数据集进行10折交叉验证。

在结节初检网络的训练阶段,经预处理后,通过随机旋转、翻转和比例在0.75~1.25之间的缩放进行数据增强,并最终随机裁剪为128 × 128 × 128像素尺寸的结节和非结节立方体输入到网络中进行训练。批处理大小设置为8,共训练150次,每经过50次训练,将学习率调整为原来的1/10。在结节初检网络的测试阶段,将预处理之后的CT图像分割成208 × 208 × 208的小块作为网络输入,相邻小块间重叠32个像素。对每个CT图像,汇总所有小块得到候选结节,并通过IoU阈值为0.1的非极大值抑制(non-maximum suppression,NMS)(Rothe等,2015)将重叠度较高的结节合并,从而得到检测结果。
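测试阶段合并候选结节所用的非极大值抑制可示意如下(以中心坐标和直径定义的立方体包围盒计算IoU;实现细节为示意性假设):

```python
import numpy as np

def cube_iou(a, b):
    """计算两个候选结节立方体包围盒(中心x、y、z与边长d)的交并比。"""
    half_a, half_b = a[3] / 2.0, b[3] / 2.0
    lo = np.maximum(a[:3] - half_a, b[:3] - half_b)
    hi = np.minimum(a[:3] + half_a, b[:3] + half_b)
    inter = np.prod(np.clip(hi - lo, 0, None))
    union = a[3] ** 3 + b[3] ** 3 - inter
    return inter / union

def nms_3d(boxes, scores, iou_thr=0.1):
    """非极大值抑制示意:按置信度从高到低保留候选结节,
    去除与已保留结节IoU大于阈值(本文取0.1)的重叠候选,返回保留索引。"""
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(cube_iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep
```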

在假阳性降低网络的训练阶段,由于结节初检网络的平均每例假阳性个数为22,为了平衡正负样本的数量,对正样本进行22倍数据扩增,扩增方法与结节初检网络中的数据增强方法一致。批处理大小设置为128,共训练40次,在训练10次后,将学习率设置为0.001,在训练20次后,将学习率设置为0.000 1。在假阳性降低网络的测试阶段,将假阳性降低网络得到的预测概率与结节初检网络得到的预测概率加权平均,得到最后的分类结果,计算为

$ p=\sum\limits_{i=1}^{2} \omega_{i} p_{i} $ (8)

式中,$ \omega_{i}$为网络预测概率$p_{i} $的权重,将结节初检网络的预测概率所占权重设置为0.2,假阳性降低网络的预测概率所占权重设置为0.8。

2.4 评价标准

将FROC(free-response receiver operating characteristic curves)曲线在7个不同误报数(0.125,0.25,0.5,1,2,4,8)下对应的敏感度(sensitivity)平均值作为算法性能的评价结果。敏感度计算为

$ S=\frac{T P}{T P+F N} $ (9)

式中,$ TP$代表真阳性结节数量,真阳性结节的判定标准是预测出的结节坐标位于真实结节半径范围内。$ FN$代表假阴性结节数量,假阴性结节即未检测出的真实结节。
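FROC平均敏感度的计算可示意如下(对FROC曲线在7个误报数处取插值敏感度并求平均;数组形状等为示意性假设):

```python
import numpy as np

def froc_average_sensitivity(fp_per_scan, sensitivity,
                             points=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """对按误报数排序后的FROC曲线在7个规定误报数处做线性插值,
    返回各点敏感度及其平均值(即平均敏感度)。"""
    fp_per_scan = np.asarray(fp_per_scan, dtype=float)
    sensitivity = np.asarray(sensitivity, dtype=float)
    order = np.argsort(fp_per_scan)
    interp = np.interp(points, fp_per_scan[order], sensitivity[order])
    return interp, float(interp.mean())
```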

2.5 实验结果

利用FROC曲线、平均敏感度、最高敏感度以及平均每例假阳性个数对结节初检网络和整个3D多尺度肺结节检测网络的性能进行评价。网络的FROC曲线如图 5所示,其中,曲线是对真实预测值进行插值得到的。

图 5 网络的FROC曲线
Fig. 5 FROC curves

为了验证所提出的网络结构中CEM、SAM以及Res2SENet骨架网络的有效性,在LUNA16数据集上,对不同3D结节初检网络的平均敏感度、最高敏感度以及达到最高敏感度时的平均每例假阳性个数进行了相应的对比实验,实验结果如表 1所示。为了验证CEM的有效性,实验中分别将FPN、CEM与残差网络(residual network,ResNet)(Xie等,2019)组合进行比较,结果表明,ResNet与CEM的组合比ResNet与FPN的组合最高敏感度低,但平均敏感度高,表明结构简单的CEM与FPN具有相当的性能。为了验证SAM的有效性,在CEM之后添加SAM,实验结果表明,使用SAM的结节初检网络平均敏感度提高了0.7%,最高敏感度提高了2.5%,且平均每例假阳性个数约减少了5个,说明SAM能有效提升结节检测性能。为了验证Res2SENet骨架网络的性能,用Res2SENet替代ResNet,与CEM和SAM进行组合,实验结果表明,使用Res2SENet为骨架网络的结节初检网络比使用ResNet为骨架网络的结节初检网络平均敏感度提高了1.6%,且平均每例假阳性个数约减少了4个,证明了Res2SENet骨架网络的有效性。

表 1 不同3D结节初检网络性能比较
Table 1 Performance comparison of different nodule candidate detection networks

网络模型 平均敏感度 最高敏感度 达到最高敏感度时的平均每例假阳性个数
ResNet+FPN 0.846 0.975 18
ResNet+CEM 0.849 0.958 31
ResNet+CEM+SAM 0.856 0.983 26
本文 0.872 0.983 22
注:加粗字体表示各列最优结果。

为了验证提出的整个肺结节检测网络的有效性,在LUNA16数据集中,将本文方法与现有主流方法在平均敏感度方面进行比较,结果如表 2所示。可以看出,本文方法的平均敏感度为0.923,高于现有主流方法,说明所提方法的优越性。

表 2 整个肺结节检测网络性能比较
Table 2 Performance comparison of different methods for pulmonary nodules detection

网络模型 平均敏感度
Khosravan和Bagci(2018) 0.897
Xie等人(2019) 0.790
Wang等人(2019) 0.903
Ding等人(2017) 0.891
Pezeshk等人(2019) 0.832 ± 0.011
Li等人(2019) 0.912
本文 0.923
注:加粗字体表示最优结果。

图 6对检测结果进行了展示。由于CT图像的3维特性,只能显示检测中心所在的横断面,又因为肺结节在横断面中相对较小,所以只截取以检测中心为中心、边长为64像素的正方形区域进行可视化。其中,图 6(a)是检测到的真阳性结节(绿色圆圈);图 6(b)是检测到的假阳性结节(红色圆圈),具有与真阳性结节非常相似的特征;图 6(c)是未检测到的真实结节(黄色圆圈),即假阴性结节,其尺寸大多极小,对其进行特殊的数据增强可提高检测性能。本文方法不仅对实性结节检测效果良好,对磨玻璃结节的效果也很好。

图 6 检测结果
Fig. 6 Detection results((a)true positive nodules; (b)false positive nodules; (c)false negative nodules)

3 结论

本文提出一种基于深度卷积神经网络的3D多尺度肺结节检测方法,由结节初检和假阳性降低两个阶段组成。为了能够充分提取肺结节的多尺度特征,将Res2Net网络模块与压缩激励单元组合,搭建结节初检网络和假阳性降低网络。此外,在结节初检网络中,为了融合深层次的语义信息与低层次的位置信息,提出一种结构简单但性能优异的上下文增强模块。同时,为了使网络对感兴趣区域有更多关注,在上下文增强模块之后引入了空间注意力模块。与现有主流肺结节检测方法相比,本文算法平均敏感度较高,假阳性较低,在胸部CT序列图像肺结节检测领域具有较高的实用价值。

由于提出的3D多尺度肺结节检测方法仍然存在少数尺寸极小结节漏诊情况,未来需对其进一步优化,以提高系统的检测性能,例如对尺寸极小的结节进行特殊的数据增强。此外,该系统只能输出结节的位置信息,但在实际的肺癌筛查中,结节的生长部位、边缘形态及内部结构等都对临床诊断具有重要意义,今后可对结节的大小、类型和特征等进行分析,为后续工作提供建议。最后,实验直接使用LUNA16数据集提供的肺实质区域分割文件去除背景,但对于原始CT图像,需要先进行肺实质区域分割预处理才能得到该文件。因此,未来还需对本文方法进一步完善,使其具有更强的实践意义。

参考文献

  • Armato Ⅲ S G, McLennan G, Bidaut L, McNitt-Gray M F, Meyer C R, Reeves A P, Zhao B S, Aberle D R, Henschke C I, Hoffman E A, Kazerooni E A, MacMahon H, van Beek E J R, Yankelevitz D, Biancardi A M, Bland P H, Brown M S, Engelmann R M, Laderach G E, Max D, Pais R C, Qing D P Y, Roberts R Y, Smith A R, Starkey A, Batra P, Caligiuri P, Farooqi A, Gladish G W, Jude C M, Munden R F, Petkovska I, Quint L E, Schwartz L H, Sundaram B, Dodd L E, Fenimore C, Gur D, Petrick N, Freymann J, Kirby J, Hughes B, Casteele A V, Gupte S, Sallam M, Heath M D, Kuhn M H, Dharaiya E, Burns R, Fryd D S, Salganicoff M, Anand V, Shreter U, Vastagh S, Croft B Y, Clarke L P. 2011. The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical Physics, 38(2): 915-931 [DOI:10.1118/1.3528204]
  • Bray F, Ferlay J, Soerjomataram I, Siegel R L, Torre L A, Jemal A. 2018. Global cancer statistics 2018:GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians, 68(6): 394-424 [DOI:10.3322/caac.21492]
  • Ding J, Li A X, Hu Z Q and Wang L W. 2017. Accurate pulmonary nodule detection in computed tomography images using deep convolutional neural networks//Proceedings of the 20th International Conference on Medical Image Computing and Computer Assisted Intervention. Quebec City, Canada: Springer: 559-567[DOI: 10.1007/978-3-319-66179-7_64]
  • Dou Q, Chen H, Jin Y M, Lin H J, Qin J and Heng P A. 2017a. Automated pulmonary nodule detection via 3D ConvNets with online sample filtering and hybrid-loss residual learning//Proceedings of the 20th International Conference on Medical Image Computing and Computer Assisted Intervention. Quebec City, Canada: Springer: 630-638[DOI: 10.1007/978-3-319-66179-7_72]
  • Dou Q, Chen H, Yu L Q, Qin J, Heng P A. 2017b. Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection. IEEE Transactions on Biomedical Engineering, 64(7): 1558-1567 [DOI:10.1109/TBME.2016.2613502]
  • Gao S H, Cheng M M, Zhao K, Zhang X Y, Yang M H, Torr P. 2021. Res2Net: a new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(2): 652-662 [DOI:10.1109/TPAMI.2019.2938758]
  • He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778[DOI: 10.1109/CVPR.2016.90]
  • Hu J, Shen L, Albanie S, Sun G, Wu E H. 2020. Squeeze-and-excitation networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(8): 2011-2023 [DOI:10.1109/TPAMI.2019.2913372]
  • Khosravan N and Bagci U. 2018. S4ND: single-shot single-scale lung nodule detection//Proceedings of the 21st International Conference on Medical Image Computing and Computer Assisted Intervention. Granada, Spain: Springer: 794-802[DOI: 10.1007/978-3-030-00934-2_88]
  • Li F, Huang H Y, Wu Y W, Cai C B, Huang Y and Ding X H. 2019. Lung nodule detection with a 3D ConvNet via IoU self-normalization and maxout unit//Proceedings of 2019 International Conference on Acoustics, Speech and Signal Processing. Brighton, UK: IEEE: 1214-1218[DOI: 10.1109/ICASSP.2019.8683537]
  • Lim H I. 2021. A study on dropout techniques to reduce overfitting in deep neural networks. Lecture Notes in Electrical Engineering, 716: 133-139 [DOI:10.1007/978-981-15-9309-3_20]
  • Lin T Y, Dollár P, Girshick R, He K M, Hariharan B and Belongie S. 2017. Feature pyramid networks for object detection//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 936-944[DOI: 10.1109/CVPR.2017.106]
  • Murphy A, Skalski M, Gaillard F. 2018. The utilisation of convolutional neural networks in detecting pulmonary nodules: a review. The British Journal of Radiology, 91(1090): #20180028 [DOI:10.1259/bjr.20180028]
  • Pezeshk A, Hamidian S, Petrick N, Sahiner B. 2019. 3-D convolutional neural networks for automatic detection of pulmonary nodules in chest CT. IEEE Journal of Biomedical and Health Informatics, 23(5): 2080-2090 [DOI:10.1109/JBHI.2018.2879449]
  • Qin Z, Li Z M, Zhang Z N, Bao Y P, Yu G, Peng Y X and Sun J. 2019. ThunderNet: towards real-time generic object detection on mobile devices//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6717-6726[DOI: 10.1109/ICCV.2019.00682]
  • Ren S Q, He K M, Girshick R, Sun J. 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149 [DOI:10.1109/TPAMI.2016.2577031]
  • Rothe R, Guillaumin M and Van Gool L. 2015. Non-maximum suppression for object detection by passing messages between windows//Proceedings of the 12th Asian Conference on Computer Vision. Singapore, Singapore: Springer: 290-306[DOI: 10.1007/978-3-319-16865-4_19]
  • Setio A A A, Traverso A, de Bel T, Berens M S N, van den Bogaard C, Cerello P, Chen H, Dou Q, Fantacci M E, Geurts B, van der Gugten R, Heng P A, Jansen B, de Kaste M M, Kotov V, Lin J Y H, Manders J T M C, Sóñora-Mengana A, García-Naranjo J, Papavasileiou E, Prokop M, Saletta M, Schaefer-Prokop C M, Scholten E T, Scholten L, Snoeren M M, Torres E L, Vandemeulebroucke J, Walasek N, Zuidhof G C A, van Ginneken B, Jacobs C. 2017. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Medical Image Analysis, 42: 1-13 [DOI:10.1016/j.media.2017.06.015]
  • Shelhamer E, Long J, Darrell T. 2017. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4): 640-651 [DOI:10.1109/TPAMI.2016.2572683]
  • Szegedy C, Ioffe S, Vanhoucke V and Alemi A A. 2017. Inception-v4, inception-ResNet and the impact of residual connections on learning//Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI: 4278-4284
  • Tong G F, Chen H R, Li Y, Du X C, Zhang Q C. 2019. Object detection for panoramic images based on MS-RPN structure in traffic road scenes. IET Computer Vision, 13(5): 500-506 [DOI:10.1049/iet-cvi.2018.5304]
  • Wang J, Wang J W, Wen Y F, Lu H B, Niu T Y, Pan J F, Qian D H. 2019. Pulmonary nodule detection in volumetric chest CT scans using CNNS-based nodule-size-adaptive detection and classification. IEEE Access, 7: 46033-46044 [DOI:10.1109/ACCESS.2019.2908195]
  • Wood D E, Kazerooni E A, Baum S L, Eapen G A, Ettinger D S, Hou L F, Jackman D M, Klippenstein D, Kumar R, Lackner R P, Leard L E, Lennes I T, Leung A N C, Makani S S, Massion P P, Mazzone P, Merritt R E, Meyers B F, Midthun D E, Pipavath S, Pratt C, Reddy C, Reid M E, Rotter A J, Sachs P B, Schabath M B, Schiebler M L, Tong B C, Travis W D, Wei B, Yang S C, Gregory K M, Hughes M. 2018. Lung cancer screening, version 3. 2018, NCCN clinical practice guidelines in oncology. Journal of the National Comprehensive Cancer Network, 16(4): 412-441 [DOI:10.6004/jnccn.2018.0020]
  • Xie H T, Yang D B, Sun N N, Chen Z N, Zhang Y D. 2019. Automated pulmonary nodule detection in CT images using deep convolutional neural networks. Pattern Recognition, 85: 109-119 [DOI:10.1016/j.patcog.2018.07.031]
  • Yang D M, Zou Y X, Zhang J, Li G. 2019. C-RPNs: promoting object detection in real world via a cascade structure of region proposal networks. Neurocomputing, 367: 20-30 [DOI:10.1016/j.neucom.2019.08.016]
  • Yuan J R, Xue B, Zhang W S, Xu L, Sun H Y, Zhou J H. 2019. RPN-FCN based rust detection on power equipment. Procedia Computer Science, 147: 349-353 [DOI:10.1016/j.procs.2019.01.236]
  • Zhu W T, Liu C C, Fan W and Xie X H. 2018. DeepLung: deep 3D dual path nets for automated pulmonary nodule detection and classification//Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe, USA: IEEE: 673-681[DOI: 10.1109/WACV.2018.00079]