Pulmonary nodule detection with 3D multi-scale deep convolutional neural networks

Sun Huacong, Peng Yanjun, Guo Yanfei, Zhang Xiaoqing (Shandong University of Science and Technology, Qingdao 266590)

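Abstract
Objective Pulmonary nodules are an early form of lung cancer. Low-dose computed tomography (CT) scanning, an important means of lung cancer screening, is now widely used in health examinations, but the huge volume of CT data creates a heavy workload. With the rapid development of artificial intelligence, computer-aided pulmonary nodule detection based on deep learning has attracted attention. Because nodule sizes vary greatly, representing features at multiple scales is critical for nodule detection. To address the detection difficulty caused by this size variation, a 3D multi-scale pulmonary nodule detection method for chest CT sequence images based on deep convolutional neural networks is proposed. Method The method has two stages: 1) a nodule candidate detection network that raises sensitivity as far as possible; 2) a false positive reduction network that reduces the number of false positive nodules as far as possible. The candidate detection network uses a Res2Net network combined with squeeze-and-excitation units as its backbone, so that convolutions in the same layer have multiple receptive fields and extract multi-scale feature information of pulmonary nodules; a region proposal network incorporating a context enhancement module and a spatial attention module then determines the candidate regions. The false positive reduction network, composed of Res2Net modules and squeeze-and-excitation units, further classifies the nodule candidates to reduce false positives and produce the final result. Result Experiments on the public dataset LUNA16 (lung nodule analysis 16) show that, in the candidate detection stage, the sensitivity reaches 0.983 at an average of 22 false positives per scan; compared with the baseline ResNet + FPN (feature pyramid network) method, the average sensitivity and the maximum sensitivity improve by 2.6% and 0.8%, respectively. For the entire 3D multi-scale pulmonary nodule detection network, the sensitivity is 0.924 at an average of 1 false positive per scan. Conclusion Compared with existing mainstream methods, the proposed method not only improves the sensitivity of pulmonary nodule detection but also effectively controls false positives, achieving better performance.
Keywords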
3D multi-scale deep convolutional neural networks in pulmonary nodule detection

Sun Huacong, Peng Yanjun, Guo Yanfei, Zhang Xiaoqing(Shandong University of Science and Technology, Qingdao 266590, China)

Abstract
Objective Pulmonary nodules are an early form of lung cancer, one of the malignancies most threatening to human health and life. As an important means of lung cancer screening, low-dose computed tomography (CT) scanning has been widely used in health examinations. However, the large volume of CT data imposes a heavy workload on doctors and radiologists, and high-intensity work can lead to misdiagnosis. With the rapid development of artificial intelligence technology, computer-aided pulmonary nodule detection based on deep learning has attracted much attention. Because the size of pulmonary nodules varies greatly, representing features at multiple scales is critical for nodule detection. To address the detection difficulty caused by this large variation in nodule size, this paper proposes a 3D multi-scale pulmonary nodule detection method for chest CT sequence images based on a deep convolutional neural network. Method The method consists of two main stages: 1) a nodule candidate detection stage that maximizes system sensitivity, and 2) a false positive reduction stage that minimizes the number of false positive nodules. Specifically, a series of preprocessing operations is first performed on the original CT images, and the regions of interest (ROIs) of lung nodules are obtained by cropping. In the training phase of the nodule candidate detection network, the preprocessed data are augmented by random rotation, flipping, and scaling. Then, nodule cubes and non-nodule cubes of size 128×128×128 are randomly cropped and fed to the network. The nodule candidate detection network uses a combination of squeeze-and-excitation units and Res2Net modules as its backbone, so that convolutions in the same layer have a variety of receptive fields. Thus, the network can extract multi-scale feature information of pulmonary nodules.
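To illustrate how one Res2Net-style block can give "convolutions of the same layer" several receptive fields, the following is a minimal sketch of the receptive-field arithmetic. It assumes the hierarchical split connection of the published Res2Net design (y1 = x1, y2 = K2(x2), yi = Ki(xi + y(i-1))), with 4 channel groups and kernel size 3 per axis; the abstract states neither value, and the function name is hypothetical.

```python
def res2net_receptive_fields(scale=4, kernel=3):
    """Per-axis receptive field of each channel group's output in one
    Res2Net-style block, measured relative to the block's split input.

    Assumed connection pattern: y1 = x1, y2 = K2(x2), yi = Ki(xi + y(i-1)).
    `scale` and `kernel` are illustrative assumptions, not values from the paper.
    """
    fields = [1]  # group 1 is passed through without a convolution
    for _ in range(2, scale + 1):
        # each later group applies one more kernel-sized convolution
        # on top of the previous group's output
        fields.append(fields[-1] + (kernel - 1))
    return fields


if __name__ == "__main__":
    print(res2net_receptive_fields())  # one value per channel group
```

Under these assumptions, the groups of a single block see progressively larger neighborhoods, which is one way to read the multi-scale claim above.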
In addition, the nodule candidate detection network uses a region proposal network that incorporates a context enhancement module and a spatial attention module to identify region candidates. In the test phase of the nodule candidate detection network, the preprocessed CT image is divided into small patches of size 208×208×208, which serve as the network inputs; adjacent patches overlap by 32 pixels. For each CT image, the nodule candidates obtained from all patches are pooled, and highly overlapping candidates are merged through non-maximum suppression with an intersection over union (IoU) threshold of 0.1 to obtain the detection results. In the training phase of the false positive reduction network, because experiments with the nodule candidate detection network yield an average of 22 false positive nodules per scan, the positive samples are augmented 22-fold to balance the numbers of positive and negative samples. The augmentation methods are the same as those used in the training phase of the nodule candidate detection network. The false positive reduction network, which mainly consists of Res2Net modules and squeeze-and-excitation units, further classifies nodule candidates to reduce the number of false positives. In its testing phase, cubes of size 48×48×48 centered on the candidate coordinates produced by the nodule candidate detection network are cropped and used as inputs. The outputs of the false positive reduction network are the confidences of the nodule candidate cubes. Among these components, the squeeze-and-excitation unit captures channel dependencies comprehensively, assigning large weights to channels rich in nodule information and small weights to channels without nodule information.
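The test-phase tiling and candidate merging described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the patch size (208), overlap (32), and IoU threshold (0.1) come from the abstract, while the function names, the cube representation (score, z, y, x, side), the greedy suppression order, and the handling of the last patch at the volume edge are assumptions.

```python
def tile_starts(length, patch=208, overlap=32):
    """Start indices of overlapping patches covering one axis of a volume."""
    stride = patch - overlap
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)  # assumption: align the last patch to the edge
    return starts


def cube_iou(a, b):
    """IoU of two axis-aligned cubes given as (z, y, x, side)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i] - a[3] / 2, b[i] - b[3] / 2)
        hi = min(a[i] + a[3] / 2, b[i] + b[3] / 2)
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    union = a[3] ** 3 + b[3] ** 3 - inter
    return inter / union


def nms(candidates, iou_threshold=0.1):
    """Greedy non-maximum suppression over (score, z, y, x, side) candidates."""
    kept = []
    for cand in sorted(candidates, key=lambda c: c[0], reverse=True):
        # keep a candidate only if it does not overlap any higher-scored one
        if all(cube_iou(cand[1:], k[1:]) <= iou_threshold for k in kept):
            kept.append(cand)
    return kept


if __name__ == "__main__":
    print(tile_starts(512))
    pooled = [(0.9, 50, 50, 50, 10), (0.8, 52, 50, 50, 10), (0.7, 100, 100, 100, 8)]
    print(nms(pooled))
```

In this sketch the two nearby candidates collapse into the higher-scored one, while the distant candidate is kept.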
The Res2Net module enlarges the receptive field of each output feature map without increasing the computational load, which gives the network a stronger multi-scale representation ability. The region proposal network accepts images of any scale as input, outputs a series of scored region candidates, and is robust. The context enhancement module fuses high-level semantic information with low-level position information; its structure is simple, it is easy to implement, and its computational cost is low, yet it performs well. The spatial attention module makes the network pay more attention to the ROIs, which eases the difficulty of distinguishing pulmonary nodules from visually similar surrounding structures such as blood vessels and shadows. The effectiveness of the method is validated on the publicly available LUNA16 (lung nodule analysis 16) dataset, and extensive ablation experiments demonstrate the contribution of each key component of the proposed framework. The LUNA16 dataset is a subset of LIDC-IDRI (lung image database consortium and image database resource initiative), the largest public dataset of lung nodules. LUNA16 excludes CT images with a slice thickness greater than 2.5 mm from LIDC-IDRI. A total of 888 CT images remain, with a slice thickness of 0.6 to 2.5 mm, a spatial resolution of 0.46 to 0.98 mm, and an average nodule diameter of 8.3 mm. The criterion for a nodule in the LUNA16 dataset is that at least three of the four radiologists judge its diameter to be greater than 3 mm. Under this criterion, a total of 1 186 positive nodules are annotated in the dataset.
The evaluation metric, derived from the free-response receiver operating characteristic (FROC) curve, is the average recall at an average of 0.125, 0.25, 0.5, 1, 2, 4, and 8 false positives per scan, which is the official evaluation metric of the LUNA16 dataset. Result The experimental results show that in the nodule candidate detection stage, the sensitivity reaches 0.983 when the average number of false positives per scan is 22. Compared with the baseline ResNet + FPN (feature pyramid network) method, the average sensitivity and the maximum sensitivity increase by 2.6% and 0.8%, respectively. For the entire 3D multi-scale pulmonary nodule detection network, the sensitivity is 0.924 when the average number of false positives per scan is 1. Conclusion Compared with state-of-the-art methods, our method not only improves the sensitivity of pulmonary nodule detection but also effectively controls the number of false positives, achieving better performance. However, the method outputs only the positions of nodules, whereas in actual lung cancer screening the growth position, edge shape, and internal structure of nodules are all significant for clinical diagnosis. Analyzing these nodule characteristics would make the method more practical.
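The evaluation metric named in this abstract, the mean sensitivity at seven false-positive rates per scan, can be sketched as below. This is a simplified illustration under stated assumptions: each candidate is already labeled as a hit on a distinct annotated nodule or as a false positive, ties in score are ignored, and no curve interpolation is performed, so the result need not match the official LUNA16 evaluation script exactly. The function name and input format are hypothetical.

```python
FP_RATES = (0.125, 0.25, 0.5, 1, 2, 4, 8)  # false positives per scan, from the abstract


def average_sensitivity(candidates, n_nodules, n_scans, fp_rates=FP_RATES):
    """Mean sensitivity over the given false-positive rates.

    candidates: list of (score, is_hit); is_hit is True when the candidate
    detects a distinct annotated nodule, False when it is a false positive.
    Returns (mean sensitivity, list of per-rate sensitivities).
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    per_rate = []
    for rate in fp_rates:
        budget = rate * n_scans  # total false positives allowed at this rate
        tp = fp = 0
        for _, is_hit in ranked:
            if is_hit:
                tp += 1
            elif fp + 1 > budget:
                break  # accepting this false positive would exceed the budget
            else:
                fp += 1
        per_rate.append(tp / n_nodules)
    return sum(per_rate) / len(per_rate), per_rate


if __name__ == "__main__":
    toy = [(0.9, True), (0.8, False), (0.7, True), (0.6, False),
           (0.5, True), (0.4, False), (0.3, True)]
    print(average_sensitivity(toy, n_nodules=4, n_scans=8))
```

The toy input is invented purely to exercise the function and does not reflect any result reported in the paper.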
Keywords
