Multiorgan lesion detection and segmentation based on deep learning
- 2021, Vol. 26, No. 11, pp. 2723-2731
Received: 2020-07-17; Revised: 2020-11-14; Accepted: 2020-11-21; Published in print: 2021-11-16
DOI: 10.11834/jig.200353
Objective
Lesions in multiple body parts vary widely in size and type, which makes their accurate detection and segmentation difficult. We therefore design a 2.5D deep convolutional neural network model that detects and segments lesions of multiple types in computed tomography (CT) images.
Method
A backbone composed of densely connected convolutional networks and a bidirectional feature pyramid network extracts multi-scale, multi-dimensional information from the images. The input is a group of CT slices formed by combining the annotated central slice with neighboring slices that provide spatial context. Feature maps fused with this spatial information are fed into a region proposal network to generate candidate samples, and a Cascade R-CNN (region convolutional neural network) built from cascaded stages with increasing thresholds selects high-quality samples for training the detection and segmentation branches.
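As a minimal sketch of the 2.5D input described above (an assumption about the exact assembly, not the authors' released code), the annotated central slice and its neighbors can be stacked along the channel axis, with indices clamped at volume boundaries:

```python
# Sketch: build a 2.5D slice group by stacking the key (annotated) slice
# with its neighboring slices along the channel axis.
import numpy as np

def make_slice_group(volume, key_idx, num_neighbors=1):
    """Stack the key slice and num_neighbors slices on each side.

    volume: (D, H, W) CT volume. Edge indices are clamped so the group
    always has 2 * num_neighbors + 1 channels.
    """
    depth = volume.shape[0]
    idxs = [min(max(key_idx + o, 0), depth - 1)
            for o in range(-num_neighbors, num_neighbors + 1)]
    return np.stack([volume[i] for i in idxs], axis=0)  # (C, H, W)

vol = np.random.rand(10, 64, 64).astype(np.float32)
group = make_slice_group(vol, key_idx=5, num_neighbors=1)
print(group.shape)  # (3, 64, 64)
```

The network then treats the group as a multi-channel 2D input, which is how a 2D backbone can still see 3D context.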
Result
The model is validated on the DeepLesion dataset. The average detection accuracy on the test set is 83.15%, the average endpoint distance error between the segmentation predictions and the ground-truth labels is 1.27 mm, and the average diameter error is 1.69 mm. Segmentation performance surpasses MULAN (multitask universal lesion analysis network for joint lesion detection, tagging and segmentation) and Auto RECIST (response evaluation criteria in solid tumors), and inference takes only 91.7 ms per image on average.
Conclusion
The model achieves good detection and segmentation performance on CT images of multiple body parts with low prediction cost, and is suitable for detecting and segmenting lesions in CT images whose lesion types resemble those in DeepLesion. To a certain extent, it meets the needs of medical staff who analyze multi-part CT images by computer.
Objective
Most deep learning based computed tomography (CT) image analysis networks are designed for a single lesion type and are therefore incapable of detecting multiple types of lesions. A general CT image analysis network is urgently needed for the accurate and timely diagnosis and treatment of patients. Public medical image datasets are difficult to build, yet doctors and researchers must process existing CT images more efficiently and diagnose diseases more accurately. To improve the performance of CT image analysis networks, several scholars have constructed 3D convolutional neural networks (3D CNNs), which extract rich spatial features and outperform 2D CNNs. However, the high computational complexity of 3D CNNs restricts the depth of the designed networks, resulting in performance bottlenecks. Recently, DeepLesion, a CT image dataset with multiple lesion types, has enabled the construction of universal networks for lesion detection and segmentation on CT images. The varied scales and types of lesions place a large burden on lesion detection and segmentation. To address these problems and improve the performance of CT image analysis networks, we propose a model based on deep convolutional networks that accomplishes multi-organ lesion detection and segmentation on CT images, helping doctors diagnose diseases quickly and accurately.
Method
The proposed model consists of two main parts. 1) Backbone network. To extract multi-dimensional, multi-scale features, we integrate bidirectional feature pyramid networks and densely connected convolutional networks into the backbone. The model's input is the combination of the CT key slice and its neighboring slices, where the former carries the ground-truth annotation and the latter provide the 3D context. Combining the backbone with a feature fusion method enables the 2D network to extract spatial information from adjacent slices; the network can thus exploit features of both the adjacent slices and the key slice, and its performance improves by utilizing the 3D context of the CT slices. Moreover, we simplify and fine-tune the network structure so that our model achieves better performance with lower computational complexity than the original architecture. 2) Detection and segmentation branches. To produce high-quality, representative proposals, we feed the features fused with 3D context into the region proposal network. A cascaded R-CNN (region convolutional neural network) with gradually increasing intersection-over-union thresholds resamples the generated proposals, and the high-quality proposals are fed into the detection and segmentation branches. We set the anchor ratios to 1:2, 1:1, and 2:1, and the anchor sizes in the region proposal network to 16, 24, 32, 48, and 96 to cover the different lesion scales. We test cascaded stages with different intersection-over-union thresholds, such as 0.5, 0.6, and 0.7, to find a suitable number of stages. The original region of interest (ROI) pooling is replaced with ROI align for better performance.
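To make the stated hyper-parameters concrete, the following sketch (an illustration of the described settings, not the released implementation) enumerates the anchor shapes implied by the three ratios and five sizes, alongside the rising per-stage IoU thresholds of the cascade:

```python
# Sketch: RPN anchor shapes from the stated ratios/sizes, and the
# cascade's per-stage IoU thresholds for proposal resampling.
import itertools

ANCHOR_RATIOS = [0.5, 1.0, 2.0]      # h:w ratios of 1:2, 1:1, 2:1
ANCHOR_SIZES = [16, 24, 32, 48, 96]  # anchor edge lengths in pixels
CASCADE_IOUS = [0.5, 0.6, 0.7]       # IoU threshold per cascaded stage

def anchor_shapes(sizes, ratios):
    """(width, height) pairs with area size**2 and aspect ratio h/w = ratio."""
    shapes = []
    for size, ratio in itertools.product(sizes, ratios):
        w = size / ratio ** 0.5
        h = size * ratio ** 0.5
        shapes.append((w, h))
    return shapes

shapes = anchor_shapes(ANCHOR_SIZES, ANCHOR_RATIOS)
print(len(shapes))  # 15 anchor shapes per feature-map location
```

Each later cascade stage keeps only proposals that overlap the ground truth more tightly, which is what progressively raises proposal quality.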
Result
We validate the network on DeepLesion, a dataset containing 32 120 CT images with different types of lesions. We split the dataset into training, validation, and test sets in proportions of 70%, 15%, and 15%, respectively. We train the model with stochastic gradient descent at an initial learning rate of 0.001, dropped to 1/10 of its value at the fourth and sixth of eight training epochs. Four groups of comparative experiments explore the effects of different network designs on detection and segmentation performance, covering feature pyramid networks (FPN), bidirectional feature pyramid networks (BiFPN), feature fusion, and different numbers of cascade stages with and without a segmentation branch. Experimental results show that BiFPN outperforms FPN on the detection task, and the feature fusion method greatly improves detection performance. As the number of cascaded stages increases, detection accuracy drops slightly while segmentation performance improves substantially. In addition, networks without a segmentation branch detect lesions more accurately than those with one, indicating a trade-off between the detection and segmentation tasks. Different structures can therefore be selected to meet distinct requirements for detection or segmentation accuracy: the baseline network without a segmentation branch suits doctors or researchers who need more accurate lesion detection, while the baseline with three cascaded stages yields more accurate segmentation. We report the results of the three-stage cascaded network. The average detection accuracy of our model on the DeepLesion test set is 83.15%, the average distance error between the predicted segmentation and the endpoints of the weak response evaluation criteria in solid tumors (RECIST) labels is 1.27 mm, and the average diameter error is 1.69 mm. Our network's segmentation performance is superior to MULAN (multitask universal lesion analysis network for joint lesion detection, tagging, and segmentation) and Auto RECIST. Inference takes 91.7 ms per image.
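The stated training schedule can be sketched as a step decay (the exact epoch boundaries are an assumption about how "fourth and sixth epoch" maps to zero- or one-based counting):

```python
# Sketch: SGD step learning-rate schedule -- initial rate 0.001,
# divided by 10 at the 4th and 6th of 8 training epochs.
def learning_rate(epoch, base_lr=1e-3, drop_epochs=(4, 6), factor=0.1):
    """Return the learning rate in effect during the given epoch."""
    lr = base_lr
    for boundary in drop_epochs:
        if epoch >= boundary:
            lr *= factor
    return lr

schedule = [learning_rate(e) for e in range(8)]
print(schedule)
```

In a framework such as PyTorch the same schedule would typically be expressed with a multi-step scheduler rather than written by hand.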
Conclusion
The proposed model achieves good detection and segmentation performance on CT images and takes little time to predict. It is suitable for lesion detection and segmentation in CT images whose lesion types are similar to those in the DeepLesion dataset. Our model trained on DeepLesion can help doctors diagnose lesions in multiple organs with the aid of a computer.
Cai Z W and Vasconcelos N. 2018. Cascade R-CNN: delving into high quality object detection//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 6154-6162[DOI: 10.1109/CVPR.2018.00644]
Eisenhauer E A, Therasse P, Bogaerts J, Schwartz L H, Sargent D, Ford R, Dancey J, Arbuck S, Gwyther S, Mooney M, Rubinstein L, Shankar L, Dodd L, Kaplan R, Lacombe D and Verweij J. 2009. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). European Journal of Cancer, 45(2): 228-247[DOI: 10.1016/j.ejca.2008.10.026]
He K M, Gkioxari G, Dollár P and Girshick R. 2017. Mask R-CNN//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2980-2988[DOI: 10.1109/ICCV.2017.322]
Huang G, Liu Z, Van Der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2261-2269[DOI: 10.1109/CVPR.2017.243]
Lin T Y, Dollár P, Girshick R, He K M, Hariharan B and Belongie S. 2017. Feature pyramid networks for object detection//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 936-944[DOI: 10.1109/CVPR.2017.106]
Ren S Q, He K M, Girshick R and Sun J. 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149[DOI: 10.1109/TPAMI.2016.2577031]
Roth H R, Shen C, Oda H, Oda M, Hayashi Y, Misawa K and Mori K. 2018. Deep learning and its application to medical image segmentation. Medical Imaging Technology, 36(2): 63-71[DOI: 10.11409/mit.36.63]
Setio A A A, Traverso A, De Bel T, Berens M S N, Van Den Bogaard C, Cerello P, Chen H, Dou Q, Fantacci M E, Geurts B, Van Der Gugten R, Heng P A, Jansen B, De Kaste M M J, Kotov V, Lin J Y H, Manders J T M C, Sóñora-Mengana A, García-Naranjo J C, Papavasileiou E, Prokop M, Saletta M, Schaefer-Prokop C M, Scholten E T, Scholten L, Snoeren M M, Torres E L, Vandemeulebroucke J, Walasek N, Zuidhof G C A, Van Ginneken B and Jacobs C. 2017. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Medical Image Analysis, 42: 1-13[DOI: 10.1016/j.media.2017.06.015]
Taha A, Lo P, Li J N and Zhao T. 2018. Kid-Net: convolution networks for kidney vessels segmentation from CT-volumes//Proceedings of the 21st International Conference on Medical Image Computing and Computer Assisted Intervention. Granada, Spain: Springer: 463-471[DOI: 10.1007/978-3-030-00937-3_53]
Tan M X, Pang R M and Le Q V. 2020. EfficientDet: scalable and efficient object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 10778-10787[DOI: 10.1109/CVPR42600.2020.01079]
Tang Y, Harrison A P, Bagheri M, Xiao J and Summers R M. 2018. Semi-automatic RECIST labeling on CT scans with cascaded convolutional neural networks//Proceedings of the 21st International Conference on Medical Image Computing and Computer-Assisted Intervention. Granada, Spain: Springer: 405-413[DOI: 10.1007/978-3-030-00937-3_47]
Tang Y B, Yan K, Tang Y X, Liu J M, Xiao J and Summers R M. 2019. ULDor: a universal lesion detector for CT scans with pseudo masks and hard negative example mining//The 16th IEEE International Symposium on Biomedical Imaging (ISBI 2019). Venice, Italy: IEEE: 833-836[DOI: 10.1109/ISBI.2019.8759478]
Xie S N, Girshick R, Dollár P, Tu Z W and He K M. 2017. Aggregated residual transformations for deep neural networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5987-5995[DOI: 10.1109/CVPR.2017.634]
Xie W Y, Chen Y B, Wang J Y, Li Q and Chen Q. 2019. Detection of pulmonary nodules in CT images based on convolutional neural networks. Computer Engineering and Design, 40(12): 3575-3581[DOI: 10.16208/j.issn1000-7024.2019.12.035]
Yan K, Bagheri M and Summers R M. 2018a. 3D context enhanced region-based convolutional neural network for end-to-end lesion detection//Proceedings of the 21st International Conference on Medical Image Computing and Computer-Assisted Intervention. Granada, Spain: Springer: 511-519[DOI: 10.1007/978-3-030-00928-1_58]
Yan K, Tang Y, Peng Y, Sandfort V, Bagheri M, Lu Z Y and Summers R M. 2019. MULAN: multitask universal lesion analysis network for joint lesion detection, tagging, and segmentation//Proceedings of the 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention. Shenzhen, China: Springer: 194-202[DOI: 10.1007/978-3-030-32226-7_22]
Yan K, Wang X S, Lu L and Summers R M. 2018b. DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. Journal of Medical Imaging, 5(3): #036501[DOI: 10.1117/1.JMI.5.3.036501]
Zlocha M, Dou Q and Glocker B. 2019. Improving RetinaNet for CT lesion detection with dense masks from weak RECIST labels//Proceedings of the 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention. Shenzhen, China: Springer: 402-410[DOI: 10.1007/978-3-030-32226-7_45]