Thoracic multi-organ segmentation integrating context and multi-scale features
Integrated context and multi-scale features in thoracic organs segmentation
2021, Vol. 26, No. 9, Pages 2135-2145
Published in print: 2021-09-16
Accepted: 2021-01-28
DOI: 10.11834/jig.200558
Shuying Ji, Zhiyong Xiao. Integrated context and multi-scale features in thoracic organs segmentation[J]. Journal of Image and Graphics, 2021, 26(9): 2135-2145.
Objective
Accurate segmentation of the organs at risk surrounding a tumor is a key step in image-guided radiotherapy and an important component of planning effective treatment strategies against lung and esophageal cancers. To handle the complex variations in organ shape and position across patients and the low soft-tissue contrast between neighboring organs in computed tomography (CT) images, this paper proposes a deep learning algorithm to segment the organs at risk in thoracic CT images.
Method
Building on the U-Net architecture, the network takes a 2.5D (2.5-dimension) input, i.e., a sequence of three consecutive coronal slices, to capture inter-slice relationships. An efficient global context module performs cross-channel interaction without dimensionality reduction, captures long-range dependencies between slices within a single view, strengthens channel relationships, and fuses global spatial context information. In the encoder, an integration of pyramid convolution and dense connections extracts multi-scale information and enlarges the receptive field of the convolutional layers, and the decoder is connected to every encoder layer to make full use of multi-scale features and enhance the discriminability of the feature maps. Considering that the organs in CT images are irregular in shape and closely adjacent, deep supervision is added to learn feature representations at different layers, so as to locate organs accurately and refine organ boundaries.
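The 2.5D input described above (three consecutive coronal slices stacked as channels) can be sketched as follows; `make_25d_input`, the axis layout, and the rescaling to [0, 1] after the [-384, 384] HU truncation mentioned in the Method section are illustrative assumptions, not the authors' code:

```python
import numpy as np

def make_25d_input(volume, y, hu_min=-384.0, hu_max=384.0):
    """Stack three adjacent coronal slices (y-1, y, y+1) into a
    3-channel 2.5D input, after clipping intensities to [hu_min, hu_max]
    and rescaling to [0, 1]. Edge slices are clamped to the volume."""
    depth = volume.shape[1]                  # coronal axis assumed to be axis 1
    idx = [max(y - 1, 0), y, min(y + 1, depth - 1)]
    slices = volume[:, idx, :].transpose(1, 0, 2)        # (3, H, W)
    slices = np.clip(slices.astype(np.float32), hu_min, hu_max)
    return (slices - hu_min) / (hu_max - hu_min)

# toy CT volume with axes (z, y, x)
vol = np.random.randint(-1000, 1000, size=(8, 16, 16))
x = make_25d_input(vol, y=0)
print(x.shape)        # (3, 8, 16)
```

At the volume border the missing neighbor is replaced by the edge slice itself, one common choice; padding with a zero slice would work equally well.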
Result
On the 40 thoracic multi-organ training samples of the ISBI (International Symposium on Biomedical Imaging) 2019 SegTHOR (segmentation of thoracic organs at risk in CT images) challenge, with the Dice coefficient and Hausdorff distance (HD) as the main evaluation criteria, the method achieved Dice coefficients of 0.855 1, 0.945 7, 0.923 0 and 0.938 3 for the esophagus, heart, trachea and aorta on the test samples, with HD distances of 0.302 3, 0.180 5, 0.212 2 and 0.191 8, respectively.
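The Dice coefficient used as an evaluation criterion measures the overlap of two binary masks, 2|A ∩ B| / (|A| + |B|). A minimal NumPy sketch (the function name and the smoothing term `eps` are illustrative assumptions):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2 * |intersection| / (|pred| + |target|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1   # 4 voxels
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1   # 6 voxels, 4 shared
print(round(dice_coefficient(a, b), 3))            # 2*4/(4+6) = 0.8
```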
Conclusion
The proposed algorithm, which fuses global context and multi-scale features, is more competitive in thoracic multi-organ segmentation and can help clinicians achieve efficient diagnosis and treatment.
Objective
Automatic segmentation of organs at risk (OAR) in computed tomography (CT) is an essential part of implementing effective treatment strategies against lung and esophageal cancers. Accurate segmentation of the organs surrounding a tumor helps characterize the inherent positional and morphological variations across patients and facilitates adaptive, computer-assisted radiotherapy. Manual delineation of OAR, which relies on intensity levels and anatomical knowledge to outline structures such as the esophagus behind the heart and the upper trachea near the spinal cord, is time-consuming and error-prone, because of the complicated variations in the shape and position of organs and the low soft-tissue contrast between neighboring organs in CT images. Deep convolutional neural networks (DCNNs), with their strong nonlinear modeling capability, have shown tremendous potential in medical image segmentation, and deep learning methods have been applied to multi-organ segmentation in abdominal CT images. However, automatic segmentation of the esophagus still lags behind that of larger organs because of its small size and irregular shape. Existing approaches to 3D medical image segmentation either segment each slice independently, which ignores inter-slice dependencies, or apply 3D convolution to aggregate information across all slices of the CT image, which incurs a high computational cost. A 2.5D deep learning framework is therefore proposed to locate organs robustly and refine the boundary of each organ accurately.
Method
The network takes 2.5D slice sequences in the coronal plane, each composed of three adjacent slices, as input, so that it can learn the most distinctive features of a single slice while preserving the connections between slices. As a preprocessing step, the intensity values of all scans are truncated to the range of [-384, 384] HU to remove irrelevant information. On top of the U-Net backbone, an attention module called efficient global context is introduced, which integrates efficient channel attention with a global context module. Global context information is modeled by computing the response at each position as a weighted sum of the features at all positions in the input feature map; correlations between channels are modeled so that the feature map retains useful information and suppresses useless information, and long-range dependencies between slices within a single view are captured. The module consists of three parts: a context modeling module, a feature conversion module and a fusion module. Unlike the conventional global context module, the feature conversion module realizes cross-channel interaction without dimensionality reduction, obtaining channel attention efficiently via a one-dimensional convolution. In the encoder, pyramid convolution combined with dense connections extracts multi-scale information and enlarges the receptive field of the convolutional layers. Pyramid convolution employs convolution kernels of different scales and depths in parallel to process the input and capture different levels of information: the feature transformation is performed in multiple parallel branches, and the output of each branch is integrated into the final output. Multi-scale feature extraction is thus achieved by adjusting the kernel size rather than by down-sampling the resolution of the receptive field. Multi-layer dense connections realize feature reuse, ensure maximum information transmission, and make the backward gradient flow smoother. The integration of pyramid convolution and dense connections therefore captures a wider range of well-integrated multi-scale information. Because accurate multi-organ segmentation requires the fusion of local and global information, the decoder is connected to every encoder layer, combining the low-level details of feature maps at different levels with high-level semantics to make full use of multi-scale features and enhance the discriminability of the feature maps. Finally, to handle the irregular and closely adjacent shapes of the organs in CT images, deep supervision is added on the aggregated feature maps to learn feature representations at different layers, which sharpens organ boundaries, reduces over-segmentation in non-organ regions, and stabilizes network training, producing more accurate segmentation results.
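The efficient-channel-attention step described above (a one-dimensional convolution over per-channel descriptors, avoiding dimensionality reduction) can be sketched in NumPy. The function name, the fixed example kernel, and the edge padding are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eca_attention(feat, kernel):
    """ECA-style channel gating: global average pooling gives one
    descriptor per channel, a 1-D convolution of size k mixes
    neighbouring channels without reducing the channel dimension,
    and a sigmoid yields per-channel weights that rescale the map.
    `feat` has shape (C, H, W); `kernel` has shape (k,)."""
    c = feat.shape[0]
    k = kernel.shape[0]
    desc = feat.mean(axis=(1, 2))                 # (C,) global descriptor per channel
    padded = np.pad(desc, k // 2, mode="edge")    # same-length 1-D convolution
    mixed = np.array([np.dot(padded[i:i + k], kernel) for i in range(c)])
    weights = sigmoid(mixed)                      # (C,) channel attention in (0, 1)
    return feat * weights[:, None, None]

feat = np.random.rand(8, 4, 4).astype(np.float32)
out = eca_attention(feat, kernel=np.array([0.2, 0.6, 0.2]))
print(out.shape)   # (8, 4, 4)
```

In a trained network the 1-D kernel would be a learned `Conv1d` weight; here it is fixed only to make the sketch runnable.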
Result
On the public dataset of the SegTHOR (segmentation of thoracic organs at risk in CT images) 2019 challenge, CT scans of four thoracic organs (esophagus, heart, trachea and aorta) were segmented, with the Dice similarity coefficient (DSC) and Hausdorff distance (HD) as the main criteria. On the test samples, the Dice coefficients of the esophagus, heart, trachea and aorta reached 0.855 1, 0.945 7, 0.923 0 and 0.938 3, and the HD distances were 0.302 3, 0.180 5, 0.212 2 and 0.191 8, respectively.
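The Hausdorff distance above quantifies the worst boundary disagreement between prediction and ground truth (the challenge's reported values may use a normalized or averaged variant). A plain symmetric Hausdorff distance between two point sets can be sketched as follows (the function name is an illustrative assumption):

```python
import numpy as np

def hausdorff_distance(a_pts, b_pts):
    """Symmetric Hausdorff distance between two point sets: the larger,
    over both directions, of the farthest nearest-neighbour distance."""
    # pairwise distance matrix, shape (len(a_pts), len(b_pts))
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 0.0], [0.0, 2.0]])
print(hausdorff_distance(a, b))   # 2.0
```

For real CT masks the point sets would be the boundary voxels of the two segmentations; `scipy.spatial.distance.directed_hausdorff` offers an equivalent one-directional primitive.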
Conclusion
Low-level detailed feature maps capture rich spatial information that highlights organ boundaries, while high-level semantic features reflect positional information and locate the organs; integrating multi-scale features with global context is therefore the key to accurate segmentation. The heart and aorta obtained the highest average DSC and best HD values owing to their high contrast, regular shape, and larger size compared with the other organs. The esophagus had the lowest average DSC and HD values because its irregular shape and low contrast make it more difficult to identify within CT volumes; nevertheless, the method achieved a DSC of 85.5% for the esophagus on the test dataset. The experimental results show that the proposed method is beneficial for segmenting organs at risk and can strengthen radiation therapy planning.
Keywords: multi-organ segmentation; pseudo three dimension (2.5D); efficient global context; pyramid convolution; multi-scale features