Integrating context and multi-scale features for thoracic multi-organ segmentation

Ji Shuying, Xiao Zhiyong (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Accurate segmentation of the organs at risk around a tumor is a key step in image-guided radiotherapy and an important part of planning effective treatment strategies against lung and esophageal cancers. To address the complex variations in organ shape and position across patients and the low soft-tissue contrast between neighboring organs in computed tomography (CT) images, this paper proposes a deep learning algorithm for segmenting organs at risk in thoracic CT images. Method Based on the U-Net architecture, sequences of three consecutive coronal slices (2.5D, 2.5-dimensional data) are taken as network input to capture inter-slice relationships. An efficient global context module is used to realize cross-channel interaction without dimensionality reduction, capture long-range dependencies between slice sequences within a single view, strengthen channel relationships, and fuse global spatial context information. In the encoder, an integration of pyramid convolution and dense connections extracts multi-scale information and enlarges the receptive field of the convolutional layers, and each decoder layer is connected to the corresponding encoder layer to make full use of multi-scale features and enhance the discriminability of the feature maps. Considering that organs in CT images are irregular in shape and lie close together, deep supervision is added to learn feature representations at different layers, so that organs are localized precisely and their boundaries refined. Result In the ISBI (International Symposium on Biomedical Imaging) 2019 SegTHOR (segmentation of thoracic organs at risk in CT images) challenge, the method was trained on 40 thoracic multi-organ samples. With the Dice coefficient and the HD (Hausdorff distance) as the main criteria, the Dice coefficients of the esophagus, heart, trachea, and aorta on the test samples reached 0.855 1, 0.945 7, 0.923 0, and 0.938 3, and the HD values were 0.302 3, 0.180 5, 0.212 2, and 0.191 8, respectively. Conclusion The algorithm fusing global context and multi-scale features is more competitive in thoracic multi-organ segmentation and can help clinicians achieve efficient diagnosis and treatment.
Integrating context and multi-scale features for thoracic multi-organ segmentation

Ji Shuying, Xiao Zhiyong (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Automatic segmentation of organs at risk (OAR) in computed tomography (CT) is an essential part of implementing effective treatment strategies against lung and esophageal cancers. Accurate segmentation of the organs around a tumor helps track position and morphological changes in each patient and facilitates adaptive, computer-assisted radiotherapy. In clinical practice, OAR such as the esophagus behind the heart and the trachea above the spinal cord are still delineated manually on the basis of intensity levels and anatomical knowledge. Complicated variations in the shape and position of organs across patients and the low soft-tissue contrast between neighboring organs in CT images make such delineation error-prone, and manual segmentation of thoracic organs in CT images is time-consuming. Deep convolutional neural networks (DCNNs), with their strong nonlinear modeling capability, have shown tremendous promise in medical image segmentation, and deep learning has already been applied to multi-organ segmentation in abdominal CT images. However, automatic segmentation of the esophagus remains difficult because of its small size and irregular shape compared with larger organs. Two common strategies exist for 3D medical image segmentation: segmenting each slice independently, or applying 3D convolutions that aggregate information between slices and segment all slices of the CT volume at once. Slice-by-slice segmentation ignores inter-slice dependencies, whereas full 3D segmentation that aggregates all layers incurs a high computational cost. We therefore propose a 2.5D deep learning framework that locates organs robustly and refines the boundary of each organ accurately.
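The 2.5D input described above can be sketched in a few lines. The function name, the choice of coronal axis, and the normalization step are illustrative assumptions; the HU truncation range [-384, 384] is taken from the method description.

```python
import numpy as np

def make_25d_input(volume, index, hu_range=(-384, 384)):
    """Build a 2.5D sample: three adjacent coronal slices of a CT volume.

    volume: (D, H, W) array of HU values; the coronal direction is assumed
    to be axis 1 here (an illustrative choice, not fixed by the paper).
    """
    lo, hi = hu_range
    # Truncate HU values to suppress irrelevant intensities, then scale to [0, 1]
    vol = np.clip(volume.astype(np.float32), lo, hi)
    vol = (vol - lo) / (hi - lo)
    # Clamp neighbor indices at the volume border so every sample has 3 channels
    n = vol.shape[1]
    idx = [max(index - 1, 0), index, min(index + 1, n - 1)]
    return np.stack([vol[:, i, :] for i in idx], axis=0)  # shape (3, D, W)
```

Feeding three adjacent slices as channels lets a 2D network exploit inter-slice context at a fraction of the cost of full 3D convolution.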
Method The network takes 2.5D slice sequences in the coronal plane, composed of three adjacent slices, as input, so that it learns the most discriminative features of each single slice while exploiting the connections between slices. As a preprocessing step, the image intensity values of all scans were truncated to the range [-384, 384] HU to remove irrelevant information. On top of the U-Net backbone, a new attention module called efficient global context is introduced, which integrates efficient channel attention with the global context module. Global context information is modeled by computing the response at each position as a weighted sum of the features at all positions of the input feature map. The module captures the correlations between channels, so that informative features in the feature map are emphasized and useless information is suppressed, and it captures the long-range dependencies between slice sequences within a single view. The attention is divided into three parts: a context modeling module, a feature transform module, and a fusion module. Unlike the conventional global context block, the feature transform module requires no dimensionality reduction to realize cross-channel interaction; channel attention is obtained efficiently via a one-dimensional convolution. In the encoder, pyramid convolution is used together with dense connections to extract multi-scale information and enlarge the receptive field of the convolutional layers. Pyramid convolution applies convolution kernels of different scales and depths; these kernels process the input in parallel and capture information at different levels. Feature transformation is performed uniformly yet independently in multiple parallel branches, and the outputs of all branches are merged into the final output.
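The channel-attention idea described above (no dimensionality reduction; cross-channel interaction via a 1-D convolution over channel descriptors) can be illustrated with a minimal NumPy sketch. The function name and the uniform kernel standing in for learned weights are assumptions for illustration only.

```python
import numpy as np

def efficient_channel_attention(feat, kernel_size=3):
    """ECA-style channel attention without dimensionality reduction.

    feat: (C, H, W) feature map. A learned 1-D convolution over the channel
    descriptors is replaced here by a fixed averaging kernel for illustration.
    """
    # Global average pooling: one descriptor per channel
    desc = feat.mean(axis=(1, 2))
    # 1-D convolution across channels models local cross-channel interaction
    pad = kernel_size // 2
    padded = np.pad(desc, pad, mode="edge")
    kernel = np.full(kernel_size, 1.0 / kernel_size)  # stand-in for learned weights
    mixed = np.convolve(padded, kernel, mode="valid")
    weights = 1.0 / (1.0 + np.exp(-mixed))            # sigmoid gate per channel
    return feat * weights[:, None, None]
```

Because the gate is computed directly on the C channel descriptors, the module avoids the bottleneck (reduce-then-expand) projection used by conventional channel attention.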
Multi-scale feature extraction is thus achieved by adjusting the kernel size rather than by downsampling the resolution of the receptive field. The multi-layer dense connections realize feature reuse and ensure maximum information flow, so the integration of pyramid convolution and dense connections gathers a wider range of information, fuses multi-scale features well, and makes the backward gradient flow smoother. Accurate multi-organ segmentation requires fusing local and global information; accordingly, each decoder layer is connected to the corresponding encoder layer, combining the low-level details of shallow feature maps with high-level semantics in order to make full use of multi-scale features and enhance the discriminability of the feature maps. To handle the irregular and closely adjacent shapes of organs in CT images, deep supervision is added on top of the aggregated feature maps to learn feature representations at different layers; this sharpens organ boundaries, reduces over-segmentation in non-organ regions, stabilizes network training, and produces more accurate segmentation results. Result On the public dataset of the segmentation of thoracic organs at risk in CT images (SegTHOR) 2019 challenge, CT scans of four thoracic organs (esophagus, heart, trachea, and aorta) were segmented, with the Dice similarity coefficient (DSC) and Hausdorff distance (HD) as the main criteria. The Dice coefficients of the esophagus, heart, trachea, and aorta on the test samples reached 0.855 1, 0.945 7, 0.923 0, and 0.938 3, and the HD values were 0.302 3, 0.180 5, 0.212 2, and 0.191 8, respectively. Conclusion Low-level detailed feature maps capture rich spatial information that highlights organ boundaries, while high-level semantic features reflect position information and locate the organs.
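The DSC used as the main evaluation criterion above is the standard overlap measure between a predicted and a reference binary mask; a minimal sketch (function name and epsilon smoothing are illustrative assumptions) is:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks.

    Returns 2|A ∩ B| / (|A| + |B|); eps avoids division by zero
    when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

A DSC of 1.0 means perfect overlap and 0.0 means no overlap, so the reported 0.855 1 for the esophagus indicates roughly 86% overlap with the expert contour.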
Integrating multi-scale features with global context is the key to accurate segmentation. The heart and aorta obtained the best average DSC and HD values owing to their high contrast, regular shape, and larger size compared with the other organs; the esophagus had the worst average DSC and HD values because its irregular shape and low contrast make it more difficult to identify within CT volumes. Nevertheless, the proposed method achieved a DSC of 85.5% for the esophagus on the test dataset. Experimental results show that the proposed method is beneficial for segmenting organs at risk and can strengthen radiation therapy planning.
