Multi-scale bone lesion detection based on attention mechanism and deformable convolution
2021, Vol. 26, No. 9, pp. 2181-2192
Received: 2020-08-20
Revised: 2020-10-22
Accepted: 2020-10-29
Published in print: 2021-09-16
DOI: 10.11834/jig.200476
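Objective
Automatic analysis and detection of bone tissue in computed tomography (CT) images is important for the early diagnosis of orthopedic diseases. However, diagnosis based on manual analysis is inefficient, and its accuracy and objective consistency cannot be guaranteed. This paper therefore builds a cascaded neural network model for bone tissue lesion detection, with the aim of supporting orthopedic surgeons' diagnosis.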
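Method
In the image preprocessing stage, an improved enhancement method is used to increase the contrast of the CT images and to extract the valid body region in each image. Threshold segmentation based on the distribution range of the CT values (Hounsfield unit, HU) of bone tissue then yields the approximate bone tissue region. With a cascaded object detection model as the baseline, an attention mechanism and deformable convolution are combined to strengthen the global contextual relevance of the feature maps and to adapt to bone lesions of variable shape. A feature fusion module promotes the fusion of feature information across scales, and bone lesion training and prediction are carried out separately on feature maps at multiple scales.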
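The abstract does not give layer-level details of the deformable convolution used. As a rough illustration of the general idea only (each kernel tap samples the feature map at a fractional offset through bilinear interpolation, so the sampling grid can follow irregular lesion shapes), the NumPy sketch below computes a single output value of a 3x3 deformable convolution. The function names, the averaging kernel, and the hand-set offsets are assumptions made for this example; in a trained network the offsets would be predicted by an additional convolution layer.

```python
import numpy as np


def bilinear_sample(feat, y, x):
    """Bilinearly interpolate a 2-D array at fractional (y, x); outside is zero."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feat[yi, xi]
    return value


def deformable_conv3x3_at(feat, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets has shape (3, 3, 2): a hypothetical (dy, dx) shift per kernel tap.
    """
    out = 0.0
    for i, dy in enumerate((-1, 0, 1)):
        for j, dx in enumerate((-1, 0, 1)):
            oy, ox = offsets[i, j]
            out += kernel[i, j] * bilinear_sample(feat, cy + dy + oy, cx + dx + ox)
    return out


# Toy feature map whose value grows linearly with row and column index.
feat = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.full((3, 3), 1.0 / 9.0)          # simple averaging kernel
zero = np.zeros((3, 3, 2))                   # zero offsets: ordinary convolution
shift = zero.copy()
shift[..., 1] = 0.5                          # move every tap half a pixel right

regular = deformable_conv3x3_at(feat, kernel, zero, 2, 2)
shifted = deformable_conv3x3_at(feat, kernel, shift, 2, 2)
```

With all offsets set to zero the result equals an ordinary 3x3 convolution at the same location, which is a convenient sanity check for any implementation of this idea.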
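Result
Experiments on the DeepLesion dataset show that the proposed network achieves a recall of 0.85, a precision of 0.613, an F1 score of 0.712, and an average precision (AP) of 0.816 for bone lesion detection. Compared with the best-performing general-purpose CT lesion detection network in the control group, recall for bone lesion detection improves by 0.15.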
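Conclusion
The proposed network model detects bone tissue lesions in CT images well. It can provide auxiliary support for the differential diagnosis of bone tissue lesions, improve diagnostic efficiency, and reduce the risk of missed diagnoses.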
Objective
Because common orthopedic diseases cause serious harm to the human body, automatic analysis and detection of bone tissue in computed tomography (CT) images has crucial clinical significance for the early diagnosis of orthopedic diseases. Manual analysis and diagnosis of bone tissue in CT images is inefficient, and the accuracy and objective consistency of the diagnosis cannot be guaranteed. Therefore, a cascaded neural network model for bone tissue lesion detection is proposed to provide decision support for orthopedic surgeons' diagnosis.
Method
The proposed bone lesion detection algorithm consists of four main steps. First, in the preprocessing stage, the original data are converted into CT values using the conversion formula and related files provided by the National Institutes of Health Clinical Center (NIHCC), USA. The image is then segmented with the mean of the CT values as the threshold in order to filter out most of the non-body parts of the image. This segmentation cannot remove the bed plate of the CT equipment, because the bed plate material has a high CT value, so a morphological opening operation with a rectangular kernel (RK) is applied to filter the bed plate out of the CT image. According to the characteristics of bone tissue in CT images, a contrast enhancement method based on the Gamma transform is proposed to enhance the contrast of the CT images. Next, the approximate bone tissue area in the enhanced CT images is obtained by thresholding based on the distribution range of the Hounsfield unit (HU) values of bone tissue. Then, with a cascaded object detection model as the baseline, an attention mechanism and deformable convolution are integrated to increase the global contextual relevance of the feature map and to adapt to bone lesions with variable shapes. Finally, a feature fusion module is used to strengthen the fusion of feature information across scales, and bone tissue lesions are trained and predicted on feature maps at multiple scales.
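The preprocessing steps described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the offset of 32768 used to recover HU values is assumed from the commonly described DeepLesion PNG convention, and the kernel size, HU window, gamma value, and bone HU range are hypothetical placeholders rather than values taken from the paper. The plain Gamma transform shown here also stands in for the paper's improved enhancement method, whose details the abstract does not give.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def to_hu(raw_png):
    """Assumed DeepLesion convention: 16-bit PNG intensity minus 32768 gives HU."""
    return raw_png.astype(np.int32) - 32768


def _morph(mask, kh, kw, reducer):
    """Erosion (reducer=np.all) or dilation (reducer=np.any), odd kh x kw rectangle."""
    padded = np.pad(mask, ((kh // 2, kh // 2), (kw // 2, kw // 2)),
                    mode="constant", constant_values=False)
    windows = sliding_window_view(padded, (kh, kw))
    return reducer(windows, axis=(-2, -1))


def body_mask(hu, kernel=(5, 5)):
    """Mean-value threshold, then a rectangular opening to drop thin structures."""
    mask = hu > hu.mean()
    eroded = _morph(mask, kernel[0], kernel[1], np.all)
    return _morph(eroded, kernel[0], kernel[1], np.any)


def gamma_enhance(hu, window=(-200.0, 1500.0), gamma=0.8):
    """Clip to an HU window, scale to [0, 1], apply a plain Gamma transform."""
    lo, hi = window
    scaled = (np.clip(hu, lo, hi) - lo) / (hi - lo)
    return scaled ** gamma


def bone_mask(hu, body, hu_range=(150, 2000)):
    """Keep body pixels whose HU value falls in a hypothetical bone range."""
    lo, hi = hu_range
    return body & (hu >= lo) & (hu <= hi)


# Synthetic slice: air, a soft-tissue square, a bone block, a thin bed plate.
hu_true = np.full((64, 64), -1000, dtype=np.int32)
hu_true[10:50, 10:50] = 40        # soft tissue
hu_true[25:35, 25:35] = 700       # bone
hu_true[58:60, 5:60] = 300        # bed plate, only two rows thick
raw = (hu_true + 32768).astype(np.uint16)

hu = to_hu(raw)
body = body_mask(hu)              # bed plate removed by the opening
bone = bone_mask(hu, body)        # approximate bone region
enhanced = gamma_enhance(hu)      # contrast-enhanced image in [0, 1]
```

The opening removes the bed plate here only because it is thinner than the kernel; on real scans the kernel size would need to be tuned to the bed plate thickness and the image resolution.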
Result
Four groups of comparative experiments were designed around the network structure to compare modeling capability, with average precision (AP) as the main evaluation metric. 1) ResNet50, ResNet101 and ResNeXt101 were each used as the feature extraction network of a naive Cascade R-CNN (region-based convolutional neural network) model for training and testing, in order to establish the baseline. ResNeXt101 performed best, with an AP of 0.543, and was taken as the baseline. 2) Feature pyramid networks (FPN), path aggregation feature pyramid networks (PAFPN), neural architecture search feature pyramid networks (NAS-FPN) and a naive structure were each trained and tested on this Cascade R-CNN baseline to select the best feature fusion module. PAFPN performed best, raising the AP to 0.721, and was chosen as the feature fusion module. 3) Two sub-groups of comparative experiments were conducted. First, batch normalization (BN) and group normalization (GN) modules were each used in the R-CNN head for training and testing; GN outperformed BN, with an AP of 0.723. Next, an attention mechanism block and a deformable convolution block were embedded in the model to verify their effectiveness; both proved effective, and the AP improved to 0.816. 4) The trained model was compared with other object detection network models on the test data. All experiments were conducted on the DeepLesion dataset. The proposed model achieves a recall of 0.85, a precision of 0.613, an F1-score of 0.712, and an AP of 0.816. This is a significant improvement over existing universal CT lesion detection models, whose recall rates are 0.574 and 0.70.
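As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall stated in the abstract; the function names are illustrative, and the counts in the second helper are generic inputs, not data from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Values reported in the abstract for the proposed model.
reported_f1 = f1_score(precision=0.613, recall=0.85)

# Recall gain over the stronger of the two universal CT lesion detectors.
recall_gain = 0.85 - 0.70
```

Rounded to three decimals, `reported_f1` agrees with the stated F1-score of 0.712, and `recall_gain` matches the 0.15 improvement in recall.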
Conclusion
In the image preprocessing stage, HU value threshold segmentation and morphological operations filter out most of the non-bone tissue area in the CT image, and contrast enhancement further highlights the bone tissue area. This accelerates model training and reduces the interference of noise. The second group of experiments shows that embedding the multi-scale feature pyramid fusion module in the network structure improves the fusion of low-level and high-level feature information, enhancing the location information of high-level features and the semantic information of low-level features, and significantly improves detection performance. The third group of experiments shows that the attention mechanism improves the network's adaptability and reduces the weight given to irrelevant information, and that the deformable convolution module further helps the network adapt by using convolution kernels of multiple shapes and sizes. The fourth group of experiments compares the proposed model with other object detection models, evaluated mainly by recall, precision, F1-score and AP. The experimental results demonstrate that the proposed model detects bone tissue lesions in CT images well: it can improve diagnostic efficiency, reduce missed diagnoses, support timely diagnosis and treatment, and effectively aid the differential diagnosis of bone tissue lesions. Real-time detection capability could be strengthened by reducing the number of model parameters and the time required for training and inference.