Published: 2020-10-16
DOI: 10.11834/jig.200281
2020 | Volume 25 | Number 10

Computed tomography images

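Automatic segmentation of organs at risk in radiotherapy using 2D cascade-CNN model

Shi Jun¹, Zhao Minfan¹, Xue Xudong²,³, Hao Xiaoyu¹, Jin Xu¹, An Hong¹, Zhang Hongyan²

1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026;

2. Department of Radiation Oncology, The First Affiliated Hospital of University of Science and Technology of China, Hefei 230001;

3. Department of Radiation Oncology, Hubei Cancer Hospital, Wuhan 430079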

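Received: 2020-06-10; Revised: 2020-07-04; Preprint: 2020-07-11

Funding: National Key Research and Development Program of China (2016YFB1000403); Fundamental Research Funds for the Central Universities

First author: Shi Jun, born in 1995, male, master's student. Research interests: cognitive computing for medical images and deep learning. E-mail:shijun18@mail.ustc.edu.cn;
Zhao Minfan, male, master's student. Research interests: cognitive computing for medical images and deep learning. E-mail:zmf@mail.ustc.edu.cn;
Xue Xudong, male, engineer. Research interests: radiation physics in oncology and applications of artificial intelligence in medicine. E-mail:xuexudong511@163.com;
Hao Xiaoyu, male, master's student. Research interests: artificial intelligence and medical big data. E-mail:hxy2018@mail.ustc.edu.cn;
Jin Xu, male, PhD student. Research interests: intelligent pathology and medical big data. E-mail:jinxu@mail.ustc.edu.cn;
Zhang Hongyan, female, chief physician. Research interests: tumor radiotherapy. E-mail:874595024@qq.com.

CLC number: TP391

Document code: A

Article ID: 1006-8961(2020)10-2110-09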

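Abstract

Objective Accurate delineation of organs at risk (OARs) is a key step in tumor radiotherapy. Manual delineation is time consuming and labor intensive, and its accuracy is easily affected by image quality and the physician's subjective experience. This paper proposes a 2D cascaded convolutional neural network (CNN) model for automatic segmentation of OARs in radiotherapy. Method The model consists of two parts, a classifier and a segmentation network. The classifier uses VGG (visual geometry group) 16 as its backbone and greatly reduces the parameter count and computational complexity by removing convolutional layers and adding global pooling. The segmentation network is based on U-Net; it replaces deconvolution with bilinear interpolation to upsample the feature maps and introduces a Dropout layer to mitigate overfitting. At prediction time, the classifier first selects the slices that contain the target organ from the input image, the segmentation network then segments the selected slices, and finally the segmentation results are refined with methods such as removing small connected components. Result The dataset contains abdominopelvic CT (computed tomography) images of 89 cervical cancer patients, and manual delineations provided by several radiologists at The First Affiliated Hospital of University of Science and Technology of China serve as the gold standard for evaluation. In the experiments, the proposed classifier achieves an average classification accuracy, precision, recall, and F1-score of 98.36%, 96.64%, 94.1%, and 95.34%, respectively, on six OARs (left and right femurs, left and right femoral heads, bladder, and rectum). Building on this classification performance, the proposed segmentation method achieves an average Dice coefficient of 92.94% on the test set. Conclusion Compared with existing CNN segmentation models, the proposed method achieves the best segmentation performance, and the classify-then-segment strategy avoids the sparse-annotation problem and reduces false-positive segmentations. In addition, the segmentations agree well with those of professional radiologists, which can help make OAR segmentation in clinical practice more accurate and faster.

Keywords

segmentation of organs at risk; convolutional neural network; cascade model; radiation therapy; cervical cancer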

Automatic segmentation of organs at risk in radiotherapy using 2D cascade-CNN model

Shi Jun¹, Zhao Minfan¹, Xue Xudong²,³, Hao Xiaoyu¹, Jin Xu¹, An Hong¹, Zhang Hongyan²

1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China;

2. Department of Radiation Oncology, The First Affiliated Hospital of University of Science and Technology of China, Hefei 230001, China;

3. Department of Radiation Oncology, Hubei Cancer Hospital, Wuhan 430079, China

Supported by: National Key Research and Development Program of China(2016YFB1000403); Fundamental Research Funds for the Central Universities of China

Abstract

Objective Accurate delineation of organs at risk (OARs) is an essential step in radiation therapy for cancer. However, this procedure is often time consuming and error prone because of the large anatomical variation across patients, the differing experience of observers, and the poor soft-tissue contrast of computed tomography (CT) scans. A computer-aided analysis system for OAR auto-segmentation from CT images would reduce the burden on doctors, reduce subjective errors, and improve the effect of radiotherapy. In early work, atlas-based methods were popular and widely used for anatomical segmentation. However, the performance of atlas-based segmentation is easily affected by factors such as the quality of the atlas and the registration method. Recently, benefiting from the rapid growth of computing power and available data, deep learning, especially deep convolutional neural networks (CNNs), has shown great potential in image analysis. For most medical image segmentation tasks, CNN-based algorithms outperform traditional methods. As a special fully convolutional network, U-Net adopts an encoder-decoder design and fuses high- and low-level features through skip connections to realize pixelwise segmentation. Given the outstanding performance of U-Net, numerous derivatives of its architecture have been developed for various organ segmentation tasks. V-Net was proposed as an improvement of U-Net to address the difficulties of processing 3D data. V-Net can fully use the 3D characteristics of images, but it is unsuitable for 3D medical image datasets with few samples and for the segmentation of small organs. Therefore, a two-step 2D CNN model is proposed for automatic and accurate segmentation of OARs in radiotherapy.

Method In this study, we propose a novel cascade-CNN model that mainly includes a slice classifier and a 2D organ segmentation network. Visual geometry group (VGG)16 is used as the backbone of the classifier and modified to account for the time overhead and the size of the available data. To reduce the parameter count and computational complexity, three convolutional layers are removed, and an additional global max pooling layer is used as the bridging module between feature extraction and the fully connected layers. The segmentation network is built upon U-Net by replacing the deconvolutional layers with bilinear interpolation layers for feature upsampling. A dropout layer and data augmentation are used to avoid overfitting. The classifier and the segmentation network for each organ are implemented in Keras and trained independently with the binary cross-entropy and Dice losses, respectively. We use adaptive moment estimation to stabilize gradient descent during the training of the two models. In the inference stage, the slices containing the target organs are first selected by the classifier from the entire CT scan. These slices are then used as the input of the segmentation network, and simple post-processing methods are applied to refine the segmentation results.

Result The dataset contains CT scans of 89 cervical cancer patients, and the manual segmentation results provided by multiple radiologists in The First Affiliated Hospital of University of Science and Technology of China (USTC) are used as the gold standard for evaluation. In the experiments, the average classification accuracy, precision, recall, and F1-score of the classifier on the six organs (left femoral head, right femoral head, left femur, right femur, bladder, and rectum) are 98.36%, 96.64%, 94.1%, and 95.34%, respectively. On the basis of this classifier performance, the proposed method achieves high segmentation accuracy on the bladder, left femoral head, right femoral head, left femur, and right femur, with Dice coefficients of 94.16%, 93.69%, 95.09%, 96.14%, and 96.57%, respectively. Compared with the single U-Net and cascaded V-Net, the Dice coefficient increases by 4.1% to 6.6%. For the rectum, all methods perform poorly because of its irregular shape and low contrast. The proposed method achieves a Dice coefficient of 72.99% on the rectum, which is approximately 20% higher than the other methods. The comparison experiments demonstrate that the classifiers can effectively improve the overall segmentation accuracy.

Conclusion In this work, we propose a novel 2D cascade-CNN model, composed of a classifier and a segmentation network, for the automatic segmentation of OARs in radiation therapy. The experiments demonstrate that the proposed method effectively alleviates the sparse-annotation problem and reduces false-positive segmentation results. For organs that vary greatly in shape and size, the proposed method obtains a significant improvement in segmentation accuracy. In comparison with existing neural network methods, the proposed method achieves state-of-the-art results in the segmentation of OARs for cervical cancer and better consistency with experienced radiologists.

Key words

segmentation of organs at risk; convolutional neural network(CNN); cascade model; radiation therapy; cervical cancer

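0 Introduction

In clinical medicine, radiation therapy (radiotherapy) is one of the main treatments for malignant tumors; it irradiates the tumor target volume with high-dose radiation in order to eliminate cancer cells (Miller et al., 2019). Radiotherapy typically involves the following steps: 1) image acquisition; 2) target delineation; 3) treatment planning and optimization; 4) treatment delivery. The key to treatment planning is accurate delineation of the tumor target volume and of the surrounding normal tissues and organs that may be damaged by radiation (organs at risk, OARs) (Ibragimov and Xing, 2017), so that the tumor target receives the maximum dose while normal tissues and organs are protected. However, because there are many kinds of OARs, fully manual delineation is time consuming and labor intensive, and its accuracy is easily affected by factors such as the low contrast of radiological images, the physician's subjective experience, and anatomical differences between patients. Building a computer-aided analysis system that segments OARs accurately and quickly is therefore of great clinical significance: it reduces the physician's workload and streamlines the radiotherapy workflow, and it also lowers delineation error and improves treatment outcomes.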

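Many researchers have explored organ segmentation algorithms for medical images. In early studies, atlas-based methods were widely applied to organ segmentation in CT (computed tomography) and MR (magnetic resonance) images (Zhou and Bai, 2007). These methods involve several steps, including preprocessing, atlas construction, image registration, and label fusion, so their segmentation performance is highly sensitive to factors such as atlas quality and the registration method (Han et al., 2008). In addition, atlas-based methods generally cannot handle anatomical differences between patients, and image registration is often slow. By contrast, methods based on traditional machine learning are more direct: once trained, the algorithm can be used to segment images. However, most of these algorithms rely on cumbersome preprocessing and hand-crafted image features (Tam et al., 2018), which makes them less robust. Overall, none of these approaches meets current clinical requirements for both segmentation accuracy and runtime efficiency.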

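Owing to the rapid growth of computing power and available data, deep learning, and convolutional neural networks (CNNs) in particular, has shown great potential in image analysis (Krizhevsky et al., 2012). For most medical image segmentation tasks, CNN-based algorithms outperform traditional methods (Litjens et al., 2017), and the U-Net proposed by Ronneberger et al. (2015) is the most representative architecture. As a special fully convolutional network (Shelhamer et al., 2017), U-Net adopts an encoder-decoder design and uses skip connections to fuse high-level and low-level features, achieving pixel-level semantic segmentation. Given its outstanding performance, a series of U-Net variants are now widely used in medical image segmentation. Trullo et al. (2017) added a trainable fully connected conditional random field (Zheng et al., 2015) to U-Net to segment OARs in thoracic CT images; the method produces finer boundary information and further refines the segmentation. For the same task, Vesal et al. (2019) built the U-Net bottleneck from consecutive dilated convolutions (Wang et al., 2018) to capture multi-scale context and combined it with the Tversky loss (Salehi et al., 2017), which markedly improved segmentation accuracy. He and Chen (2020) segmented tumors in PET (positron emission tomography) images with a pre-trained U-Net and obtained good results. However, processing 3D data such as CT (or MR) images with a single U-Net faces two challenges: first, the sparse-annotation problem, in which the large number of unannotated slices hampers network convergence; second, the low contrast between adjacent tissues and organs, which tends to produce false-positive segmentations.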

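To exploit the full context of the image, Milletari et al. (2016) proposed V-Net as a 3D extension of U-Net and segmented the prostate across the whole volume of MR images. V-Net adds residual connections (He et al., 2016) and replaces the cross-entropy loss with the Dice loss to increase sensitivity to the target region. Zhu et al. (2019) added a channel attention mechanism (Hu et al., 2018) to V-Net to form AnatomyNet, which segments multiple OARs in head and neck CT images. AnatomyNet combines the Dice loss and the Focal loss and uses only a single downsampling operation in the encoder to alleviate the voxel imbalance of small organs. Han et al. (2019) proposed a multi-stage cascaded V-Net that first obtains organ locations from low-resolution images and then uses them for fine segmentation of high-resolution images; this method took first place in the 2019 SegTHOR (Segmentation of Thoracic Organs at Risk) challenge. Although 3D networks such as V-Net can fully exploit the 3D nature of the images, they have two problems. On the one hand, medical image datasets are usually small and GPU memory is limited, so training on 3D data is prone to overfitting and unstable training. On the other hand, the severe voxel imbalance of small organs makes it hard to improve segmentation performance.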

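For these reasons, this paper proposes a new 2D cascaded model for automatic segmentation of OARs in CT images. The model consists of a classifier and a segmentation network. The classifier selects the slices that contain the target organ from the input image, and the segmentation network segments the selected slices. Adding the classifier reduces false-positive segmentations and thus improves overall accuracy; in addition, training the segmentation network only on annotated data avoids the sparse-annotation problem. In the experiments, the segmentation performance of the proposed method is evaluated with metrics such as the Dice coefficient and runtime and compared with existing algorithms.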

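1 Proposed method

The workflow of the proposed method is shown in Fig. 1. The main steps are: 1) Data preprocessing. The original images undergo intensity truncation, normalization, and background removal to increase local contrast. 2) Slice selection. A CNN classifier selects all slices that contain the target organ from the input image. 3) Organ segmentation. A CNN segmentation network trained on annotated data segments the target organ in the selected slices. 4) Post-processing. Simple image processing methods, such as removing small connected components, further refine the segmentation results.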
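The classify-then-segment control flow of steps 2) and 3) can be sketched as follows. This is only an illustrative sketch: `cascade_segment`, the `classify` and `segment` callables, and the 0.5 decision threshold are hypothetical names and values chosen here, not taken from the paper, and the trained networks are replaced by plain Python functions.

```python
from typing import Callable, List, Sequence

Slice = List[List[float]]  # one 2D CT slice as nested lists
Mask = List[List[int]]     # binary mask with the same height and width


def empty_mask(slice_2d: Slice) -> Mask:
    """All-background mask for a slice the classifier rejects."""
    return [[0 for _ in row] for row in slice_2d]


def cascade_segment(
    volume: Sequence[Slice],
    classify: Callable[[Slice], float],  # hypothetical: probability that the organ is present
    segment: Callable[[Slice], Mask],    # hypothetical: per-slice organ mask
    threshold: float = 0.5,              # assumed decision threshold
) -> List[Mask]:
    """Run the segmenter only on slices accepted by the classifier."""
    masks = []
    for slice_2d in volume:
        if classify(slice_2d) >= threshold:
            masks.append(segment(slice_2d))
        else:
            # Rejected slices cannot contribute false-positive pixels.
            masks.append(empty_mask(slice_2d))
    return masks
```

Because rejected slices receive an empty mask, a slice that truly contains the organ but is missed by the classifier is lost entirely, which is why the recall of the classifier matters for the final segmentation.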

Fig. 1 Overview of the proposed method for automatic segmentation of organs at risk in radiotherapy

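1.1 Data preprocessing

CT images store pixel values in Hounsfield units (HU), and different tissues and organs usually correspond to different HU ranges. To increase local contrast, the original image is first truncated with a specific HU window (window width 600 HU, window level 100 HU) and then normalized; finally, threshold segmentation and advanced morphological methods remove irrelevant background from the image. The result is shown in Fig. 2(c): the image contrast is markedly improved.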
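A window of width 600 HU centered at level 100 HU corresponds to the intensity range [-200, 400] HU. The sketch below illustrates the truncation and normalization steps with NumPy. It assumes min-max scaling to [0, 1], which is one common choice; the text does not state which normalization is used, and the function name is hypothetical. Background removal is not covered.

```python
import numpy as np


def window_and_normalize(hu, level=100.0, width=600.0):
    """Clip HU values to the window, then rescale to [0, 1] (assumed normalization)."""
    low = level - width / 2.0    # -200 HU for the window in the text
    high = level + width / 2.0   # 400 HU for the window in the text
    clipped = np.clip(np.asarray(hu, dtype=np.float32), low, high)
    return (clipped - low) / (high - low)
```

With this mapping, air (about -1000 HU) and dense bone above 400 HU saturate at 0 and 1, so the available intensity range is spent on soft tissue and bone boundaries.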

Fig. 2 Data preprocessing((a) original image; (b) grayscale truncation and normalization; (c) background removal)

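1.2 Classifier architecture

The classifier uses VGG (visual geometry group) 16 (Simonyan and Zisserman, 2014) as its backbone, adapted to the time budget and the amount of available data. As shown in Fig. 3, the modified classifier uses only 8 convolutional layers, all with 3×3 kernels, and the maximum number of feature channels is reduced to 1/4 of the original. To further reduce the parameter count and computational complexity, a global max pooling layer bridges feature extraction and the fully connected layers. Finally, a Softmax layer at the end of the network outputs the predicted probabilities ($P_{1}$ and $P_{2}$). Table 1 lists the parameter distribution; the convolutional and fully connected layers account for about 99.61 % of the parameters, and the total parameter count is only about 0.35 % of that of VGG16 (about 138 M parameters).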

Fig. 3 Architecture of the classifier

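Table 1 Network parameter distribution of the classifier

Layer	Feature maps	Feature map size/pixels	Parameters
Convolutional layer 1	16	208×208	160
Convolutional layer 2	16	208×208	2 320
Convolutional layer 3	32	104×104	4 640
Convolutional layer 4	32	104×104	9 248
Convolutional layer 5	64	52×52	18 496
Convolutional layer 6	64	52×52	36 928
Convolutional layer 7	128	26×26	73 856
Convolutional layer 8	128	26×26	147 584
Fully connected layer 1	-	512	66 048
Fully connected layer 2	-	256	131 328
Fully connected layer 3	-	2	514
Total	-	-	491 122
Note: "-" indicates no data.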
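The per-layer counts in Table 1 can be reproduced with the usual formulas for 3×3 convolutions and fully connected layers with biases. The sketch below assumes a single-channel input (inferred from the 160 parameters of the first layer) and a 128-dimensional vector after global max pooling. The figure of 138 357 544 parameters for VGG16 is the commonly cited value for the ImageNet configuration, not a number given in the text.

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of a k x k convolution."""
    return k * k * in_ch * out_ch + out_ch


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Channel progression of the 8 convolutional layers (input assumed single-channel).
conv_channels = [1, 16, 16, 32, 32, 64, 64, 128, 128]
conv_counts = [conv_params(i, o) for i, o in zip(conv_channels, conv_channels[1:])]

# Global max pooling leaves one value per channel (128), followed by 512, 256, 2 units.
dense_sizes = [128, 512, 256, 2]
dense_counts = [dense_params(i, o) for i, o in zip(dense_sizes, dense_sizes[1:])]

total = sum(conv_counts) + sum(dense_counts)
vgg16_params = 138_357_544  # commonly cited VGG16 size (assumption, not from the text)
ratio_percent = 100.0 * total / vgg16_params
```

Under these assumptions `total` equals the 491 122 parameters listed in the table, and `ratio_percent` comes out at roughly 0.35.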

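1.3 Modified U-Net architecture

The U-Net architecture proposed by Ronneberger et al. (2015) achieved SOTA (state-of-the-art) results in the cell segmentation challenge of the time; because it works well with small datasets, it has since been widely used in medical image segmentation. Fig. 4 shows the modified U-Net used here. Its structure is clear and symmetric, U-shaped overall, and, following FCN (fully convolutional networks), it consists entirely of convolutional layers. The network has two main parts, the contracting path (upper) and the expanding path (lower). The contracting path contains 4 encoder blocks. Each encoder block contains two 3×3 convolutional layers with ReLU activation, followed by a max pooling layer for downsampling; each downsampling halves the feature map size. The expanding path mirrors the contracting path and contains 4 decoder blocks. Each decoder block likewise contains two ReLU-activated 3×3 convolutional layers, followed by an upsampling operation that doubles the feature map size. Skip connections between the contracting and expanding paths fuse features from different levels. The output block contains two 3×3 convolutional layers and a Dropout layer added to prevent overfitting; finally, a Softmax-activated 1×1 convolution maps all feature maps to the target classes to produce the final segmentation.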

Fig. 4 Architecture of the modified U-Net

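Compared with the original U-Net, the modified U-Net differs in the following ways: 1) To avoid vanishing gradients during training, a batch normalization layer is added after every convolutional layer except the last. 2) Bilinear interpolation replaces the deconvolution of the original network for upsampling. Upsampling then changes only the spatial size of the feature maps and not their channel dimension, so more features are retained for the subsequent feature fusion. Bilinear interpolation also has no trainable parameters, so it requires less computation and is more efficient. 3) A Dropout layer is added before the output layer to prevent overfitting.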
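The second point can be illustrated with a small NumPy sketch of 2× bilinear upsampling: the height and width double, the channel dimension is untouched, and there is nothing to learn. The function below is a hypothetical illustration that uses corner-aligned sampling; the interpolation kernel of a deep learning framework may place its sample points differently, so this is not a reproduction of the layer used in the paper.

```python
import numpy as np


def bilinear_upsample_2x(fmap):
    """Upsample an (H, W, C) feature map to (2H, 2W, C) with corner-aligned bilinear interpolation."""
    h, w, _ = fmap.shape
    rows = np.linspace(0, h - 1, 2 * h)  # fractional source row for each output row
    cols = np.linspace(0, w - 1, 2 * w)  # fractional source column for each output column
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    wr = (rows - r0)[:, None, None]      # vertical blend weights
    wc = (cols - c0)[None, :, None]      # horizontal blend weights
    top = fmap[r0][:, c0] * (1 - wc) + fmap[r0][:, c1] * wc
    bottom = fmap[r1][:, c0] * (1 - wc) + fmap[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr
```

A transposed convolution with the same output size would add a weight tensor per layer and would typically also change the number of channels, which is the contrast the text draws.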

1.4 Loss functions

In the proposed method, the classifier for each organ class is trained independently with the binary cross-entropy loss, defined as

${\zeta _{{\rm{bce}}}} = \frac{1}{N}\sum\limits_i { - [{t_i}{\rm{log}}\left({{p_i}} \right) + \left({1 - {t_i}} \right){\rm{log}}(1 - {p_i})]} $

(1)

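where $N$ is the number of samples; $t_{i}$ is the ground-truth label (0 or 1) of sample $i$, with $t_{i}=1$ indicating that the slice contains the target organ and $t_{i}=0$ that it does not; and $p_{i}$ is the probability predicted by the classifier, which lies in $[0, 1]$.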
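Eq. (1) can be written directly in plain Python, as in the minimal sketch below. The clipping constant `eps` is an assumption added here to keep the logarithm finite when a prediction is exactly 0 or 1; it is not part of the definition above.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over N samples, following Eq. (1)."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # assumed clipping for numerical stability
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)
```

For example, a prediction of 0.5 for every slice gives a loss of log 2 regardless of the labels, while confident correct predictions drive the loss toward 0.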

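Because organs occupy a small area of a CT slice, using the cross-entropy loss here as well would bias pixel classification toward the background. The segmentation network is therefore trained with the Dice loss, which increases the network's sensitivity to the target region and suits imbalanced pixel classes. It is defined as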

$ {\zeta _{{\rm{dc}}}} = 1 - \frac{1}{{\left| \mathit{\boldsymbol{C}} \right|}}\sum\limits_{c \in \mathit{\boldsymbol{C}}} {\frac{{2\sum {y_c}{{\hat y}_c}}}{{\sum \left({{y_c}{ + {\hat y}_c}} \right)}}} $

(2)

where $\boldsymbol{C}$ is the set of classes to be segmented, and $y_{c}$ and $ {\hat y}_{c}$ are the ground truth and the prediction, respectively.
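A NumPy sketch of Eq. (2) follows. It assumes the inputs are arrays of shape (number of classes, height, width) and adds a small constant `eps` so that a class absent from both arrays does not cause a division by zero; both choices are assumptions for illustration, and the function name is hypothetical.

```python
import numpy as np


def dice_loss(y_true, y_pred, eps=1e-7):
    """Dice loss of Eq. (2): one minus the mean per-class Dice overlap."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    pixel_axes = tuple(range(1, y_true.ndim))            # sum over everything except the class axis
    intersection = (y_true * y_pred).sum(axis=pixel_axes)
    denominator = (y_true + y_pred).sum(axis=pixel_axes)
    per_class = (2.0 * intersection + eps) / (denominator + eps)  # eps is an assumed smoothing term
    return 1.0 - per_class.mean()
```

Because only foreground pixels enter the numerator and denominator, the large background area of a slice does not dominate the loss, which is the property the text relies on.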

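2 Experimental results

All models are implemented in Keras with the TensorFlow backend and trained on an NVIDIA Tesla V100 GPU.

2.1 Dataset

The method is validated on a real clinical dataset provided by the Department of Radiation Oncology of The First Affiliated Hospital of University of Science and Technology of China. The dataset contains abdominopelvic CT images of 89 cervical cancer patients. Manual delineations by several professional radiologists serve as the gold standard and cover six OARs: left and right femurs, left and right femoral heads, bladder, and rectum.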

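Slice thickness and pixel spacing differ between images in the original dataset. To remove the effect of these differences in spatial resolution, the whole dataset is resampled to a uniform isotropic resolution. As shown in Fig. 5, for more than half of the CT images the number of slices doubles after resampling.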
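The change in slice count follows from the ratio of the original spacing to the target spacing. The sketch below only computes the resampled grid size; the function name and the spacings in the usage note are hypothetical, since the text does not report the target resolution or the original slice thicknesses.

```python
def resampled_shape(shape, spacing, target_spacing):
    """Grid size after resampling each axis to an isotropic target spacing (same unit as spacing)."""
    return tuple(
        max(1, round(n * s / target_spacing))  # physical extent divided by the new spacing
        for n, s in zip(shape, spacing)
    )
```

For instance, a hypothetical scan of 100 slices with 5 mm slice thickness resampled to 2.5 mm would have 200 slices, which is the kind of doubling shown in Fig. 5.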

Fig. 5 Number of the slices before and after data resampling

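After resampling, the 89 CT images contain 17 070 2D slices in total, and for every OAR the slices carrying its annotation make up no more than 27 % of the total. As shown in Fig. 6, slices annotated with the left or right femoral head account for only about 8.5 %. In the experiments, the classifier and segmentation network for each OAR are trained independently, and the dataset is randomly split into training, validation, and test sets at a ratio of 8 : 1 : 1.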

Fig. 6 Distribution of all slices

2.2 Evaluation metrics

The classifier is evaluated with accuracy ($Acc$), precision ($P$), recall ($R$), and F1-score, defined as

${Acc = \frac{{TP + TN}}{{TP + FN + FP + TN}}} $

(3)

$ {P = \frac{{TP}}{{TP + FP}}} $

(4)

$ {R = \frac{{TP}}{{TP + FN}}} $

(5)

$ {{F_1} = \frac{{2 \times P \times R}}{{P + R}}} $

(6)

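where $TP$, $FN$, $FP$, and $TN$ are the numbers of true positives, false negatives, false positives, and true negatives, respectively; the four counts sum to the total number of test samples.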
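Eqs. (3) to (6) translate into a few lines of Python. The sketch below assumes that at least one positive prediction and one positive label exist, so that no denominator is zero; the function name is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1-score from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```

As a consistency check of Eq. (6), the bladder row of Table 2 reports a precision of 96.85 % and a recall of 98.40 %, which combine to an F1-score of about 97.62 %, the value listed there.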

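Segmentation accuracy is evaluated with the Dice coefficient (Dice similarity coefficient), the most commonly used metric in medical image segmentation. Let $\boldsymbol{G}$ be the ground truth and $\boldsymbol{P}$ the prediction; the Dice coefficient is computed as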

$ {f_{{\rm{Dice}}}} = \frac{{2\left| {\mathit{\boldsymbol{G}} \cap \mathit{\boldsymbol{P}}} \right|}}{{\left| \mathit{\boldsymbol{G}} \right| + \left| \mathit{\boldsymbol{P}} \right|}} $

(7)

where $\boldsymbol{G} \cap \boldsymbol{P}$ is the intersection of $\boldsymbol{G}$ and $\boldsymbol{P}$. The Dice coefficient lies in $[0, 1]$, and values closer to 1 indicate a more accurate segmentation.
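For binary masks, Eq. (7) can be computed as in the sketch below. Returning 1.0 when both masks are empty is a convention assumed here, since the formula is undefined in that case; the function name is hypothetical.

```python
import numpy as np


def dice_coefficient(gt, pred):
    """Dice coefficient of Eq. (7) for two binary masks of equal shape."""
    gt = np.asarray(gt, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    total = gt.sum() + pred.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(gt, pred).sum() / total
```

A prediction that covers one of two ground-truth pixels and nothing else scores 2 × 1 / (2 + 1) ≈ 0.667.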

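2.3 Experimental setup

A classifier and a segmentation network are trained independently for each OAR. Both use the Adam optimizer, with initial learning rates of 0.0001 and 0.001, respectively, and default settings for the learning-rate adjustment parameters. Given the GPU memory capacity (16 GB), the classifier takes 208×208-pixel inputs with a batch size of 16 and is trained for 50 epochs (one epoch is one pass over the training set); the segmentation network takes 512×512-pixel inputs with a batch size of 12 and is trained for 100 epochs. To avoid overfitting, the training data are augmented online with random flipping, scaling, translation, and rotation.

2.4 Classifier performance

The classification test set contains 1 766 CT slices. Table 2 reports the results in terms of accuracy, precision, recall, and F1-score. The classifier performs best on the left and right femurs and femoral heads, because bone has high contrast and a fairly regular shape in CT images. In contrast, for an organ such as the rectum, whose shape varies widely and whose local contrast is low, the classifier performs worse, with a recall of only 81.36 %. Classifier performance directly affects the subsequent segmentation: the segmentation network segments the target organ only in the slices selected by the classifier, so the lower the false negative rate, that is, the higher the recall, the better.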

Table 2 The results of the classifier on the test set (%)

Organ	Accuracy	Precision	Recall	F1-score
Bladder	99.15	96.85	98.40	97.62
Rectum	93.93	87.01	81.36	84.09
Left femoral head	98.58	98.40	93.75	96.02
Right femoral head	99.38	97.83	94.41	96.09
Left femur	99.43	99.77	98.00	98.88
Right femur	99.66	100.00	98.69	99.34
Average	98.36	96.64	94.10	95.34

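Fig. 7 plots the ROC (receiver operating characteristic) curves of the six OARs. Except for the rectum, the area under the ROC curve (AUC) of the other five OARs is close to 1, which indicates strong classifier performance.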

Fig. 7 ROC curves

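2.5 Segmentation performance

The segmentation test set contains abdominopelvic CT images of 17 cervical cancer patients. Segmentation accuracy is measured with the Dice coefficient, and Table 3 lists the Dice coefficients of the different methods. DDCNN (deep dilated convolutional neural network) was proposed by Men et al. (2017) to segment OARs for rectal cancer radiotherapy; its targets partly overlap with ours (bladder and left and right femoral heads), but because the datasets differ, it is included only for reference. Adding the proposed classifier to the cascaded V-Net gives the C-cascade V-Net.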

Table 3 Dice coefficient comparison of different approaches (%)

Organ	DDCNN	U-Net	Cascade V-Net	C-cascade V-Net	Ours
Bladder	93.4	89.28	92.15	91.91	94.16
Rectum	-	52.36	51.92	58.23	72.99
Left femoral head	92.1	89.64	92.85	92.86	93.69
Right femoral head	92.3	89.23	90.83	92.45	95.09
Left femur	-	90.37	92.35	92.25	96.14
Right femur	-	89.97	91.85	91.87	96.57
Note: "-" indicates no data available. The best result for every organ is in the "Ours" column.

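Building on the classifier performance above, the proposed method achieves high segmentation accuracy on five OARs (bladder, left and right femoral heads, and left and right femurs); compared with the single U-Net and the cascaded V-Net, the Dice coefficient increases by 4.1 %~6.6 %. For the rectum, all methods perform poorly because of its irregular shape and low contrast; the proposed method achieves a Dice coefficient of 72.99 % on the rectum, about 20 % higher than the other methods. The comparison also shows that the classifier improves overall segmentation accuracy: after the proposed classifier is added to the cascaded V-Net, the Dice coefficient improves on four of the six organs in Table 3 and changes by less than 0.3 percentage points on the other two (bladder and left femur), with the largest gain, 6.31 %, on the rectum. These results show the strong segmentation performance of the proposed method on the six target OARs and confirm that the cascade of a classifier and a segmentation network is effective for this segmentation task.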

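Besides segmentation accuracy, the average runtime per case is compared across methods in Table 4. At test time, the time spent on each case consists mainly of segmenting the six OARs and post-processing the results. Table 4 shows that the single U-Net is the fastest. Compared with the other cascaded networks, the proposed method is faster but still slow, because each organ has its own classifier and segmentation network; in future work, the classifiers could be merged to improve efficiency.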

Table 4 Runtime comparison of different approaches

Model	DDCNN	U-Net	Cascade V-Net	C-cascade V-Net	Ours
Runtime/s	-	90.5	183.6	200.2	175.2
Note: "-" indicates no data available.

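Fig. 8 visually compares the segmentations of the different methods on the same sample. Compared with the other methods, the proposed method agrees better with the gold standard and is more robust in contour details.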

Fig. 8 Visualization of segmentation results ((a) ground truth; (b) ours; (c) cascade V-Net result; (d) C-cascade V-Net result)

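3 Conclusion

For automatic segmentation of radiotherapy OARs in CT images, this paper proposes a new 2D cascaded CNN model whose classify-then-segment strategy achieves strong segmentation performance. The experiments show that the method avoids the sparse-annotation problem and reduces false-positive segmentations, with the clearest benefit for organs such as the rectum that vary widely in shape and have low image contrast. Although the method clearly improves segmentation accuracy over existing methods, it still has limitations. First, the dataset is small, so the generalization ability of the method has not been adequately evaluated. Second, owing to the structure of the classifier and segmentation network, the segmentation accuracy for special organs such as the rectum is still far from the gold standard. Finally, the runtime efficiency can be further improved by merging the classifiers. Future work will address these issues: on the one hand, more data will be collected to improve generalization; on the other hand, the classifier and segmentation network will be adapted to the specific characteristics of organs such as the rectum to further improve overall accuracy and meet the requirements of clinical application.

Acknowledgments The collection of the experimental data was supported by several physicians of the Department of Radiation Oncology, The First Affiliated Hospital of University of Science and Technology of China, to whom we express our sincere thanks.

References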

Han M F, Yao G, Zhang W H, Mu G R, Zhan Y Q, Zhou X and Gao Y Z. 2019. Segmentation of CT thoracic organs by multi-resolution VB-nets[EB/OL].[2020-05-30]. http://ceur-ws.org/Vol-2349/SegTHOR2019_paper_1.pdf

Han X, Hoogeman M S, Levendag P C, Hibbard L S, Teguh D N, Voet P, Cowen A C and Wolf T K. 2008. Atlas-based auto-segmentation of head and neck CT images//Proceedings of the 11th International Conference on Medical Image Computing and Computer-Assisted Intervention. New York: Springer: 434-441[DOI:10.1007/978-3-540-85990-1_52]

He H, Chen S. 2020. Automatic tumor segmentation in PET by deep convolutional U-Net with pre-trained encoder. Journal of Image and Graphics, 25(1): 171-179 (何慧, 陈胜. 2020. 改进预训练编码器U-Net模型的PET肿瘤自动分割. 中国图象图形学报, 25(1): 171-179) [DOI:10.11834/jig.190058]

He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE: 770-778[DOI:10.1109/CVPR.2016.90]

Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE: 7132-7141[DOI:10.1109/CVPR.2018.00745]

Ibragimov B, Xing L. 2017. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Medical Physics, 44(2): 547-557 [DOI:10.1002/mp.12045]

Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe: ACM: 1097-1105

Litjens G, Kooi T, Ehteshami Bejnordi B, Setio A A A, Ciompi F, Ghafoorian M, van der Laak J A W M, van Ginneken B, Sánchez C I. 2017. A survey on deep learning in medical image analysis. Medical Image Analysis, 42: 60-88 [DOI:10.1016/j.media.2017.07.005]

Men K, Dai J R, Li Y X. 2017. Automatic segmentation of the clinical target volume and organs at risk in the planning CT for rectal cancer using deep dilated convolutional neural networks. Medical Physics, 44(12): 6377-6389 [DOI:10.1002/mp.12602]

Miller K D, Nogueira L, Mariotto A B, Rowland J H, Yabroff K R, Alfano C M, Jemal A, Kramer J L, Siegel R L. 2019. Cancer treatment and survivorship statistics, 2019. CA: A Cancer Journal for Clinicians, 69(5): 363-385 [DOI:10.3322/caac.21565]

Milletari F, Navab N and Ahmadi S A. 2016. V-net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of the 4th International Conference on 3D Vision. Stanford: IEEE: 565-571[DOI:10.1109/3DV.2016.79]

Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich: Springer: 234-241[DOI:10.1007/978-3-319-24574-4_28]

Salehi S M, Erdogmus D and Gholipour A. 2017. Tversky loss function for image segmentation using 3D fully convolutional deep networks//Proceedings of the 8th International Workshop on Machine Learning in Medical Imaging. Quebec City: Springer: 379-387[DOI:10.1007/978-3-319-67389-9_44]

Shelhamer E, Long J, Darrell T. 2017. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4): 640-651 [DOI:10.1109/TPAMI.2016.2572683]

Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition[EB/OL].[2020-05-30]. https://arxiv.org/pdf/1409.1556v6.pdf

Tam C M, Yang X, Tian S, Jiang X, Beitler J J and Li S. 2018. Automated delineation of organs-at-risk in head and neck CT images using multi-output support vector regression//Proceedings of SPIE 10578, Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging. Houston: SPIE: 1-10[DOI:10.1117/12.2292556]

Trullo R, Petitjean C, Ruan S, Dubray B, Nie D and Shen D. 2017. Segmentation of organs at risk in thoracic CT images using a sharpmask architecture and conditional random fields//Proceedings of the 14th IEEE International Symposium on Biomedical Imaging. Melbourne: IEEE: 1003-1006[DOI:10.1109/ISBI.2017.7950685]

Vesal S, Ravikumar N and Maier A. 2019. A 2D dilated residual U-Net for multi-organ segmentation in thoracic CT[EB/OL].[2020-05-30]. https://arxiv.org/pdf/1905.07710.pdf

Wang P Q, Chen P F, Yuan Y, Liu D, Huang Z H, Hou X D and Cottrell G. 2018. Understanding convolution for semantic segmentation//Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE: 1451-1460[DOI:10.1109/WACV.2018.00163]

Zheng S, Jayasumana S, Romera-Paredes B, Vineet V, Su Z Z, Du D L, Huang C and Torr P H S. 2015. Conditional random fields as recurrent neural networks//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago: IEEE: 1529-1537[DOI:10.1109/ICCV.2015.179]

Zhou Y X, Bai J. 2007. Multiple abdominal organ segmentation: an atlas-based fuzzy connectedness approach. IEEE Transactions on Information Technology in Biomedicine, 11(3): 348-352 [DOI:10.1109/titb.2007.892695]

Zhu W T, Huang Y F, Zeng L, Chen X M, Liu Y, Qian Z, Du N, Fan W, Xie X H. 2019. AnatomyNet: deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy. Medical Physics, 46(2): 576-589 [DOI:10.1002/mp.13300]