Published: 2020-01-16
DOI: 10.11834/jig.190058
2020 | Volume 25 | Number 1

Automatic tumor segmentation in PET by deep convolutional U-Net with pre-trained encoder
He Hui, Chen Sheng
School of Optical Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Supported by: National Natural Science Foundation of China (81101116)

Abstract

Objective Positron emission tomography (PET) is an established technique for patient management in neurology, oncology, and cardiology. In clinical oncology in particular, fluorodeoxyglucose PET is widely applied in diagnosis, staging, radiotherapy planning, therapy monitoring, and follow-up. Adaptive radiation therapy aims to tailor treatment to the individual patient and target tumor and to re-optimize the treatment plan as early as possible during the course of therapy, and PET greatly benefits such adaptive strategies. Manual delineation is time-consuming and highly observer-dependent. Previous studies have shown that automatic computer-generated segmentations are more reproducible than manual delineations, especially for radiomic analysis. Therefore, automatic and accurate tumor delineation is in high demand for the subsequent determination of therapeutic options and the achievement of an improved prognosis. Over the past decade, dozens of methods have been proposed that rely on various image segmentation approaches or combinations thereof, including thresholding, region-based, contour-based, and graph-based methods as well as clustering, statistical techniques, and machine learning. However, these methods depend on hand-crafted features and have a limited capability to represent them. Convolutional neural networks (CNNs) have demonstrated competitive performance in medical image segmentation. Nevertheless, many CNN-based methods perform region-based classification: the input image is split into small patches, and the CNN predicts, for each patch, whether it belongs to the target (foreground). Because each patch represents only a partial area of the image, such algorithms exploit only the limited contextual knowledge contained within the patch. The U-Net, widely considered an optimal segmentation network for medical imaging, is trained end to end. It comprises a contracting path and an expansive path built from convolutional, up-sampling, and pooling layers, and this architecture has proven highly effective for segmentation problems with limited amounts of data. Motivated by recent achievements in deep learning, we develop an automatic tumor segmentation method based on a deep convolutional U-Net with a pre-trained encoder. Method We present a fully automatic method for tumor segmentation that uses a 14-layer U-Net model whose encoder consists of the first two blocks of a VGG19 network pre-trained on ImageNet. The pre-trained VGG19 encoder contains 260 160 trainable parameters, and the remainder of the network contains 1 605 961 trainable parameters; the stride is fixed at 2. We adopt a three-step strategy to ensure effective and efficient learning with limited training data. First, we use the first two blocks of VGG19 as the contracting path and apply rectified linear units (ReLUs) as the activation function of each convolutional layer. In the symmetric expansive path, each convolutional layer is followed by a ReLU and batch normalization. The loss of boundary pixels in each convolutional layer necessitates cropping. In the last layer, a 1×1 convolution maps each 64-channel feature vector to a value expressing the probability that the corresponding input pixel lies within a target tumor. Second, a tumor occupies only a small portion of an entire PET image.
Therefore, pixel-wise classification tends to be biased toward the outside of targets, leading to a high probability of partially segmenting or missing tumors. A loss function based on the Jaccard distance is applied to eliminate the need for sample re-weighting, which is a typical procedure when cross-entropy is used as the loss function for image segmentation owing to the strong imbalance between the numbers of foreground and background pixels. Third, we introduce the DropBlock technique in place of the usual dropout regularization because it helps the U-Net avoid overfitting more efficiently. DropBlock is a structured form of dropout in which units in a contiguous region of a feature map are dropped together. In addition to the convolutional layers, applying DropBlock in the skip connections increases accuracy. Result A database containing 1 309 PET images is used to train and test the proposed segmentation model. We split the database into a before-radiotherapy (BR) sub-database and an after-radiotherapy (AR) sub-database. The mask, contour, and smoothed contour of each tumor, provided by an expert radiologist, serve as ground truths for training the proposed model. Experimental results on the BR sub-database show that our method achieves relatively high tumor segmentation performance in PET images. The Dice coefficient (DI), Hausdorff distance, Jaccard index, sensitivity (SE), and positive predictive value (PPV) are 0.862, 1.735, 0.769, 0.894, and 0.899, respectively. In the test stage, the total processing time for the testing dataset of the BR sub-database averages 1.39 s, which meets clinical real-time requirements. We then fine-tune the weights of the model selected on the BR sub-database by training the network further on the AR sub-database. Experimental results indicate good segmentation performance, with a DI of 0.852, SE of 0.840, and PPV of 0.893; compared with the traditional U-Net, these values represent improvements of 5.9%, 15.1%, and 1.9%, respectively. Finally, the volume of the segmented tumors in the PET images is presented, enabling the accurate automated identification and serial measurement of tumor volumes in PET images. Conclusion This study uses a 14-layer U-Net architecture with a pre-trained VGG19 encoder for tumor segmentation in PET images. We demonstrate how to improve the performance of the U-Net through fine-tuning, that is, by initializing the encoder of the network with pre-trained weights. Although fine-tuning has been widely applied in image classification tasks, it has rarely been applied to U-Net-like architectures for medical image segmentation tasks. We use the Jaccard distance as the loss function to improve segmentation performance. Overall, the results show that our approach suits various tumors with minimal post-processing and without pre-processing. We believe this method could generalize effectively to other medical image segmentation tasks.

Key words

deep learning; positron emission tomography (PET); segmentation; tumor; U-Net

0 Introduction

Positron emission tomography (PET) has become an important tool for patient management in oncology, cardiology, and neurology. In oncology in particular, PET is widely used for diagnosis, staging, radiotherapy planning, therapy monitoring, and follow-up (Bai et al., 2013). Adaptive radiation therapy aims to improve radiotherapy by tailoring it to the individual patient and target tumor and by re-optimizing the treatment plan as early as possible during treatment (Yan et al., 1997); PET plays a major role in its application (Thorwarth et al., 2010). More recently, the rapidly developing field of radiomics has called for accurate, stable, and reproducible tumor segmentation so that large numbers of additional features can be extracted, such as 3D descriptors, intensity, histogram metrics, and second- or higher-order texture features (Hatt et al., 2017b).

Research on PET image segmentation (Bi and Xiao, 2011) still faces many challenges. On the one hand, PET images are grayscale images with a non-uniform background, low spatial resolution, a low signal-to-noise ratio, and no distinct gray-level differences. On the other hand, tumor lesions image irregularly owing to inter-individual variability. Manual segmentation in medical imaging is generally considered poorly reproducible, tedious, and time-consuming, especially for PET images (Hatt et al., 2017a). These factors have driven the development of automatic segmentation methods.

Researchers have segmented tumors in PET images with a variety of methods, including thresholding (Moussallem et al., 2012), contour-based, region-based (Day et al., 2009), graph-based (Onoma et al., 2012), clustering (Lelandais et al., 2014), statistical, and machine-learning (Foster et al., 2014) approaches. Moussallem et al. (2012) proposed an adaptive thresholding strategy derived directly from the patient's PET/CT images; it is fairly accurate for lesions with a maximum axis between 2 and 4.5 cm but inaccurate for small lesions. The region-growing method of Day et al. (2009) relies on statistics of the tumor region, and its segmentation performance depends strongly on the initialization of the segments. Onoma et al. (2012) used adaptive parameters to improve a 3D random walk (RW) model for segmenting inhomogeneous or small tumors. Lelandais et al. (2014) built on the evidential C-means (ECM) algorithm and exploited neighboring voxels to handle the uncertainty and imprecision in multi-tracer PET images, but assigning voxels to clusters by intensity values alone tends to ignore the textural features of the spatial context around each voxel.

Convolutional neural networks (CNNs) have shown excellent performance in medical image segmentation (Lecun et al., 1998). Cha et al. (2016) trained a CNN on regions of interest (ROIs) in CT urography (CTU) images to distinguish the inside and outside of the bladder. Pereira et al. (2016) used multiple small 3×3 convolution kernels in a CNN to reduce the number of network parameters and speed up training, while deepening the network to improve accuracy. Because each pixel patch represents only a local region of the image, these methods capture only the limited contextual information within the patch. To address this problem, Long et al. (2015) proposed a fully convolutional network that segments pixels directly without relying on patches. Building on this idea, Ronneberger et al. (2015) proposed the U-Net, regarded as the best available network for medical image segmentation (Zhong et al., 2018); its architecture of a contracting path and an expansive path is very useful for segmentation problems with limited datasets. Chen et al. (2016) proposed a multi-resolution massive-training artificial neural network (MTANN) that uses multi-resolution decomposition and composition as a down-sampling contracting path and an up-sampling expansive path. This multi-resolution approach forms a U-shaped architecture in MTANN deep learning, equivalent to a U-Net, and performs well in bone separation. Accordingly, this paper proposes a new U-Net-based framework that uses two VGG19 convolutional blocks pre-trained on the ImageNet dataset as the U-Net encoder to automatically segment tumors in PET images. The model is trained end to end. We further improve the network with strategies such as DropBlock and a Jaccard-distance loss function to ensure effective and efficient learning from limited training data and to increase the robustness and generalization ability of the model.

1 Proposed method

1.1 Image preprocessing

Each pixel of a PET image is stored in the computer with 16 bits. We first normalize the grayscale of the PET images, uniformly compressing the gray value of every pixel into the range 0 to 255.
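As an illustration, this normalization can be written in a few lines of NumPy. This is a minimal sketch, assuming the slice arrives as a 16-bit array; the function and variable names are ours, not from the paper's code.

```python
import numpy as np

def normalize_to_uint8(pet_slice):
    """Linearly rescale a 16-bit PET slice to the 0-255 gray range."""
    lo, hi = float(pet_slice.min()), float(pet_slice.max())
    if hi == lo:  # constant image: map everything to 0
        return np.zeros_like(pet_slice, dtype=np.uint8)
    scaled = (pet_slice.astype(np.float64) - lo) / (hi - lo) * 255.0
    return scaled.astype(np.uint8)
```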

1.2 Network architecture

We propose a shallow U-Net whose encoder is replaced by a VGG19 encoder pre-trained on ImageNet; the architecture is shown in Fig. 1. The model comprises two paths, a contracting path (left) and an expansive path (right): contextual information is aggregated through the convolutions and pooling of the contracting path, and full image resolution is recovered through the convolutions and up-sampling of the expansive path. The network has 14 layers in total, with 1 605 961 trainable parameters, and the stride is fixed at 2. The first two VGG19 convolutional blocks serve as the contracting path, with rectified linear units (ReLUs) as the activation function of each convolutional layer. In the symmetric expansive path, each convolutional layer is followed by a ReLU and batch normalization (BN). Because boundary pixels are lost after each convolution, cropping is necessary. Through successive convolutions and pooling, the convolutional layers integrate contextual information from regional to global scales, which lowers the resolution of the output layers. To resolve the conflict between multi-scale information fusion and full-resolution pixel classification, pooling is replaced by up-sampling in the expansive path, and high-resolution features from the contracting path are combined with the up-sampled outputs. In the last layer, a 1×1 convolution maps each 64-component feature vector, and each element expresses the probability that the corresponding input pixel belongs to a tumor. The improvements made to the network are detailed below.

Fig. 1 Architecture of the proposed convolutional network

1.2.1 VGG19 pre-trained encoder

A U-Net is usually trained from randomly initialized weights. It is well known that to keep the network from overfitting, the dataset should be relatively large, typically containing millions of images. Networks trained on the ImageNet dataset are therefore widely used to initialize the weights of networks for other image-processing tasks, with learning then restricted to the layers without pre-training (sometimes only the last layer). In this paper, only block1 and block2 of VGG19 are used, each containing two convolutional layers.
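The Keras sketch below shows one way to wire the two pre-trained VGG19 blocks into a compact U-Net of this kind. The decoder widths, the "same" padding, the single bottleneck convolution, and the replication of the grayscale PET slice into three channels (required by the ImageNet weights) are our assumptions; the paper specifies only blocks 1 and 2 of VGG19, ReLU activations, batch normalization in the expansive path, and a final 1×1 sigmoid convolution.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19

def build_model(input_shape=(128, 128, 3)):
    # Encoder: blocks 1 and 2 of VGG19, pre-trained on ImageNet.
    vgg = VGG19(weights="imagenet", include_top=False, input_shape=input_shape)
    skip1 = vgg.get_layer("block1_conv2").output   # 128 x 128 x 64
    skip2 = vgg.get_layer("block2_conv2").output   # 64 x 64 x 128

    x = layers.MaxPooling2D(2)(skip2)              # 32 x 32 bottleneck
    x = layers.Conv2D(256, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    # Expansive path: up-sampling plus skip connections from the encoder.
    x = layers.UpSampling2D(2)(x)
    x = layers.Concatenate()([x, skip2])
    x = layers.Conv2D(128, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.UpSampling2D(2)(x)
    x = layers.Concatenate()([x, skip1])
    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    # 1x1 convolution: per-pixel probability of belonging to a tumor.
    out = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return models.Model(vgg.input, out)
```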

1.2.2 Loss function

For any pixel ${x_{mn}}$ in the input image, the corresponding U-Net output $p(w|{x_{mn}})$ expresses the posterior probability that the pixel belongs to a tumor, where $w$ denotes the weights. Since the U-Net essentially performs pixel-wise classification, cross-entropy is commonly used as the loss function, namely

$ L\left( w \right) = - \frac{1}{{N_m} \times {N_n}}\sum\limits_{m,n} \left[ {t_{mn}}\ln p\left( {w|{x_{mn}}} \right) + \left( {1 - {t_{mn}}} \right)\ln \left( {1 - p\left( {w|{x_{mn}}} \right)} \right) \right] $ (1)

where ${t_{mn}} \in \left\{ {0, 1} \right\}$ is the class of pixel ${x_{mn}}$: ${t_{mn}}$=1 denotes tumor and ${t_{mn}}$=0 denotes background.

In PET, a tumor usually occupies a small region of the whole image, so pixel-wise classification is biased toward the background, and there is a high probability that tumors are partially lost in the segmentation result (Yuan et al., 2017). The typical remedy is to assign a weight to each pixel during training to compensate for the different pixel frequencies of each class and rebalance tumor and background. However, this pixel-wise re-weighting incurs additional computational cost, especially when image augmentation is used during training.

This paper introduces a new loss function based on the Jaccard distance, which measures the dissimilarity between samples and is the complement of the Jaccard index. Let $\mathit{\boldsymbol{G}}$ denote the manually segmented tumor region and $\mathit{\boldsymbol{M}}$ the computer-generated mask. The Jaccard distance is defined as

$ {d_J}\left( {\mathit{\boldsymbol{G}},\mathit{\boldsymbol{M}}} \right) = 1 - J\left( {\mathit{\boldsymbol{G}},\mathit{\boldsymbol{M}}} \right) = 1 - \frac{{\left| {\mathit{\boldsymbol{G}} \cap \mathit{\boldsymbol{M}}} \right|}}{{\left| \mathit{\boldsymbol{G}} \right| + \left| \mathit{\boldsymbol{M}} \right| - \left| {\mathit{\boldsymbol{G}} \cap \mathit{\boldsymbol{M}}} \right|}} $ (2)

${d_J}\left({\mathit{\boldsymbol{G}}, \mathit{\boldsymbol{M}}} \right)$ itself is not differentiable and is therefore difficult to use in backpropagation. We design the following loss function

$ {L_{{d_J}}} = 1 - \frac{{\sum\limits_{m,n} {{t_{mn}}{p_{mn}}} }}{{\sum\limits_{m,n} {t_{mn}^2} + \sum\limits_{m,n} {p_{mn}^2} - \sum\limits_{m,n} {{t_{mn}}{p_{mn}}} }} $ (3)

where ${p_{mn}}$ is the posterior probability that the pixel belongs to a tumor.

This loss function requires no weight map to rebalance pixels from the tumor region and the background. Moreover, the proposed loss function is differentiable:

$ \frac{{\partial {L_{{d_J}}}}}{{\partial {p_{mn}}}} = - \frac{{{t_{mn}}\left[ {\sum\limits_{m,n} {t_{mn}^2} + \sum\limits_{m,n} {p_{mn}^2} - \sum\limits_{m,n} {{t_{mn}}{p_{mn}}} } \right] - \left( {2{p_{mn}} - {t_{mn}}} \right)\sum\limits_{m,n} {{t_{mn}}{p_{mn}}} }}{{{{\left[ {\sum\limits_{m,n} {t_{mn}^2} + \sum\limits_{m,n} {p_{mn}^2} - \sum\limits_{m,n} {{t_{mn}}{p_{mn}}} } \right]}^2}}} $ (4)

Eq. (4) can be efficiently integrated into backpropagation when training the network.
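A minimal Keras implementation of the loss in Eq. (3) might look as follows; the small constant eps is our addition to guard against division by zero on empty masks, and the gradient of Eq. (4) is left to automatic differentiation.

```python
import tensorflow as tf

def jaccard_distance_loss(y_true, y_pred, eps=1e-7):
    # The sums run over all pixels in the batch, as in Eq. (3).
    intersection = tf.reduce_sum(y_true * y_pred)
    denom = (tf.reduce_sum(tf.square(y_true))
             + tf.reduce_sum(tf.square(y_pred))
             - intersection)
    return 1.0 - intersection / (denom + eps)
```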

1.2.3 DropBlock

In most cases, dropout is applied mainly to the fully connected layers of a convolutional network. This paper introduces DropBlock (Ghiasi et al., 2018), a structured form of dropout. The key difference from dropout is that DropBlock removes contiguous regions from a layer's feature maps rather than dropping independent random units. Because DropBlock deactivates features in correlated regions, the network must look elsewhere for features that fit the data. The dropped feature subset is drawn independently for each mini-batch and forms a different network architecture each time; DropBlock thus trains an ensemble of sub-networks with different architectures but shared weights within a single epoch. In this way, fewer dependencies arise between feature blocks and stronger features are learned across different subsets, making the trained model more robust and better at generalizing.

DropBlock has two main parameters, ${b_s}$ and $β$: ${b_s}$ is the size of the block to be dropped, and $β$ controls the number of activation units to drop. In our experiments, $β$ is computed as

$ \beta = \frac{{1 - {k_p}}}{{b_s^2}} \times \frac{{f_s^2}}{{{{\left( {{f_s} - {b_s} + 1} \right)}^2}}} $ (5)

where ${k_p}$ is the probability of keeping a unit, as in traditional dropout. The size of the valid seed region is ${({f_s} - {b_s} + 1)^2}$, where ${f_s}$ is the size of the feature map; Eq. (5) is only an approximation. We first estimate ${k_p}$ to lie between 0.75 and 0.95 and then compute $β$ from Eq. (5).
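An illustrative NumPy sketch of Eq. (5), and of the block-wise dropping it controls, is given below. The function names are ours, and an actual DropBlock layer would run inside the network on the GPU (e.g., as a custom Keras layer active only during training).

```python
import numpy as np

def drop_rate(k_p, b_s, f_s):
    """Eq. (5): seeding probability for dropping b_s x b_s blocks."""
    return (1.0 - k_p) / b_s ** 2 * f_s ** 2 / (f_s - b_s + 1) ** 2

def dropblock_mask(f_s, b_s, k_p, rng=np.random):
    """Sample a boolean keep-mask for one f_s x f_s feature map (odd b_s)."""
    beta = drop_rate(k_p, b_s, f_s)
    half = b_s // 2
    # Block centers ("seeds") may fall only in the valid region, so every
    # dropped block lies fully inside the feature map.
    valid = np.zeros((f_s, f_s), dtype=bool)
    valid[half:f_s - half, half:f_s - half] = True
    seeds = (rng.random_sample((f_s, f_s)) < beta) & valid
    keep = np.ones((f_s, f_s), dtype=bool)
    for i, j in zip(*np.nonzero(seeds)):
        keep[i - half:i + half + 1, j - half:j + half + 1] = False
    return keep
```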

2 Experimental results and analysis

The model was implemented in Python with the Keras package. Experiments were run on the Google Cloud Platform (GCP), with an Nvidia Tesla K80 GPU accelerating image processing. Running time: in the test stage, the total processing time for the test dataset averaged about 1.39 s, which meets clinical real-time requirements.

2.1 Dataset

The proposed method was validated on a database provided by the University of Chicago Medical Center. The database contains real clinical images of 25 patients; each patient is treated as one case, and each case has two sets of images, acquired before and after radiotherapy. We split the database into a before-radiotherapy (BR) sub-database and an after-radiotherapy (AR) sub-database. The BR sub-database contains 769 images (24 cases for training and 1 case for testing), and the AR sub-database contains 540 images. Every image has a specific gold standard manually delineated by a radiologist. Digitization produced 8-bit PET images with a resolution of 128×128 pixels and a pixel size of 4.5 mm×4.5 mm. The model was trained, validated, and initially evaluated on the BR sub-database and further evaluated on the AR sub-database.

2.2 Evaluation metrics

1) The Dice coefficient and the Jaccard index are used to evaluate the segmentation results; they are defined as

$ D = \frac{{2\left| {\mathit{\boldsymbol{A}} \cap \mathit{\boldsymbol{B}}} \right|}}{{\left| \mathit{\boldsymbol{A}} \right| + \left| \mathit{\boldsymbol{B}} \right|}} $ (6)

$ J = \frac{{\left| {\mathit{\boldsymbol{A}} \cap \mathit{\boldsymbol{B}}} \right|}}{{\left| {\mathit{\boldsymbol{A}} \cup \mathit{\boldsymbol{B}}} \right|}} $ (7)

where $\mathit{\boldsymbol{A}}$ denotes the segmentation result, $\mathit{\boldsymbol{B}}$ the corresponding gold standard, and $\mathit{\boldsymbol{A}} \cap \mathit{\boldsymbol{B}}$ the tumor portion of the segmentation map; the closer the Dice coefficient is to 1, the more accurate the segmentation.

2) The Hausdorff distance (HD) is defined as

$ H\left( {\mathit{\boldsymbol{A}},\mathit{\boldsymbol{B}}} \right) = \max \left\{ {\mathop {\max }\limits_{a \in \mathit{\boldsymbol{A}}} \mathop {\min }\limits_{b \in \mathit{\boldsymbol{B}}} d\left( {a,b} \right),\mathop {\max }\limits_{b \in \mathit{\boldsymbol{B}}} \mathop {\min }\limits_{a \in \mathit{\boldsymbol{A}}} d\left( {a,b} \right)} \right\} $ (8)

式中,$a$$b$分别是$\mathit{\boldsymbol{A}}$$\mathit{\boldsymbol{B}}$上的像素点,$d$为欧氏距离。

3) Sensitivity (SE) and positive predictive value (PPV) are used to evaluate the segmentation results; they are defined as

$ S = \frac{{{T_P}}}{{{T_P} + {F_N}}} $ (9)

$ P = \frac{{{T_P}}}{{{T_P} + {F_P}}} $ (10)

where ${T_P}$ denotes the tumor pixels predicted correctly, ${F_P}$ the background pixels predicted as tumor, and ${F_N}$ the tumor pixels predicted as background.
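For reference, the five metrics can be computed from binary masks with NumPy and SciPy as sketched below. The helper names are ours; Eq. (8) is evaluated with scipy.spatial.distance.directed_hausdorff over the pixel coordinates of each mask, and both masks are assumed non-empty.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def evaluate(a, b):
    """Metrics for a binary segmentation `a` against gold standard `b`."""
    a, b = a.astype(bool), b.astype(bool)
    tp = np.logical_and(a, b).sum()   # tumor predicted as tumor
    fp = np.logical_and(a, ~b).sum()  # background predicted as tumor
    fn = np.logical_and(~a, b).sum()  # tumor predicted as background
    pa, pb = np.argwhere(a), np.argwhere(b)
    hd = max(directed_hausdorff(pa, pb)[0], directed_hausdorff(pb, pa)[0])
    return {
        "Dice":    2 * tp / (a.sum() + b.sum()),
        "Jaccard": tp / np.logical_or(a, b).sum(),
        "SE":      tp / (tp + fn),
        "PPV":     tp / (tp + fp),
        "HD":      hd,
    }
```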

2.3 Parameter settings during training

Because the sigmoid and tanh activation functions are prone to vanishing gradients, the network uses ReLU as the activation function. Batch normalization is added to the output of every convolutional layer except the last, the batch size is set to 4, and a conventional sigmoid is used for the final classification. We adopt the Adam optimization algorithm, which adjusts the learning rate based on first- and second-order moments as $\Delta_{t}=\alpha \cdot \hat{m}_{t} / \sqrt{\hat{v}_{t}}$, where $\hat{m}_{t}$ is the exponential moving average of the gradient and $\hat{v}_{t}$ that of the squared gradient; a smaller value indicates greater uncertainty about whether the direction of $\hat{m}_{t}$ corresponds to the direction of the true gradient. When the Jaccard distance is used as the loss function, the learning rate $α$ is set to 0.001 to accelerate training; in the comparison experiments with cross-entropy, $α$ is set to 0.000 1. Each experiment is trained for 100 epochs (one epoch being one pass over all images in the dataset).
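Under these settings, the compile-and-fit step might look as follows, reusing the build_model and jaccard_distance_loss sketches from Section 1; the training and validation tensors are placeholders.

```python
from tensorflow.keras.optimizers import Adam

model = build_model()
model.compile(optimizer=Adam(learning_rate=1e-3),  # alpha = 0.001 for the Jaccard loss
              loss=jaccard_distance_loss)
model.fit(x_train, y_train, batch_size=4, epochs=100,
          validation_data=(x_val, y_val))
```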

2.4 Experimental analysis

This section analyzes the effects of the VGG19 pre-trained encoder, the loss function, and the regularization method on the proposed segmentation model. The first experiment compares results with and without the VGG19 pre-trained encoder; the second compares the Jaccard-distance loss with the cross-entropy loss; the third evaluates the DropBlock and dropout regularization methods.

2.4.1 VGG19 pre-trained encoder

Fig. 2 compares the training loss of the model with and without the pre-trained encoder. The model with the pre-trained encoder converges faster and reaches a lower loss value.

Fig. 2 Comparison of training losses between models with and without pre-trained encoders

2.4.2 Loss function based on the Jaccard distance

The constructed network was trained with the two different loss functions. The PPV is 0.923 with cross-entropy versus 0.899 with the Jaccard distance, because the cross-entropy model considers both foreground and background pixels, whereas the Jaccard distance focuses on the foreground pixels, which is more relevant to the segmentation task. As Table 1 shows, the Jaccard-distance loss achieves higher SE and Dice.

Table 1 Results of experiments with different loss functions

Loss function     SE     PPV    Dice   Jaccard  HD
Cross-entropy     0.715  0.923  0.801  0.684    2.375
Jaccard distance  0.894  0.899  0.862  0.769    1.735
Note: bold indicates the best result.

2.4.3 Regularization based on DropBlock

To verify the effectiveness of DropBlock, the loss function was fixed to the Jaccard distance and DropBlock (DB) was compared with dropout (DP). ${b_s}$ was set to 3, 5, 7, and 9, and ${k_p}$ to 0.5 and 0.8. As Table 2 shows, the combination ${b_s}$ = 5 and ${k_p}$ = 0.5 ranks among the top results on all metrics. The following experiments therefore use DropBlock with ${b_s}$ = 5 and ${k_p}$ = 0.5.

Table 2 Results of experiments with different regularization methods

Regularization  SE     PPV    Dice   Jaccard  HD
None            0.831  0.893  0.857  0.769    2.283
DP[0.5]         0.847  0.876  0.856  0.775    4.435
DP[0.8]         0.842  0.881  0.881  0.800    2.005
DB[9, 0.5]      0.851  0.859  0.849  0.759    5.794
DB[9, 0.8]      0.881  0.877  0.868  0.782    3.918
DB[7, 0.5]      0.808  0.911  0.849  0.751    2.667
DB[7, 0.8]      0.890  0.848  0.864  0.782    8.463
DB[5, 0.5]      0.894  0.899  0.862  0.769    1.735
DB[5, 0.8]      0.783  0.902  0.835  0.741    2.936
DB[3, 0.5]      0.797  0.891  0.837  0.746    2.029
DB[3, 0.8]      0.803  0.910  0.850  0.759    2.586
Note: bold indicates the top four results.

2.4.4 Comparison with the traditional U-Net

After validating the key components of the model, the trained network was applied to the test dataset of the BR sub-database, with a 13-layer U-Net as the baseline. Fig. 3 shows several typical automatic segmentation examples. Compared with the traditional U-Net, the proposed model produces more complete segmentations that retain more boundary detail; it delineates tumors accurately and remains stable under various image acquisition conditions. Fig. 4 plots the distribution of the segmentation results (tumor mask, tumor contour, and Gaussian-smoothed tumor contour) for case 6 (30 images). The Jaccard index ranges from 0.227 to 0.875 for tumor masks, with a mean of 0.769; from 0.196 to 0.806 for tumor contours, with a mean of 0.627; and from 0.347 to 0.629 for Gaussian-smoothed tumor contours, with a mean of 0.500.

Fig. 3 Segmentation results for three examples ((a) original images; (b) the truth references; (c) U-Net model; (d) U-Net with Jaccard distance; (e) ours)
Fig. 4 Distribution of segmentation performance in terms of Jaccard index

2.4.5 Further evaluation on the AR sub-database

To further evaluate the model, two experiments were conducted on the AR sub-database. The first experiment applied the best-performing model from the BR sub-database directly to the AR segmentation task. The second experiment fine-tuned the weights of the model selected in the first experiment by training the network further on the AR sub-database, with 398 images for training, 100 for validation, and the remaining 42 for testing. After 100 epochs of training, the model performing best on the validation images was selected.
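Conceptually, the fine-tuning step restores the weights selected on the BR sub-database and simply continues training on the AR images, as in the sketch below; the file name and data tensors are placeholders.

```python
model.load_weights("best_br_model.h5")  # model selected on the BR sub-database
model.fit(x_ar_train, y_ar_train, batch_size=4, epochs=100,
          validation_data=(x_ar_val, y_ar_val))
```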

The performance of the model on the AR sub-database is shown in Table 3. The proposed model achieves fairly good segmentation results, and further training of the model yields the best segmentation performance.

Table 3 Performance evaluation on the AR sub-database

Model             Dice   SE     PPV
13-layer U-Net    0.831  0.760  0.894
Ours              0.852  0.840  0.893
After retraining  0.879  0.875  0.910
Note: bold indicates the best result.

2.4.6 3D volume computation

The tumor of case 6 was segmented automatically with the proposed model and visualized in 3D (Fig. 5), and the result was compared with the 3D visualization of the gold standard provided by the physician (Fig. 6). The volume of the automatically segmented tumor is comparable to that of the true tumor provided by the expert: the segmented volume reaches 88.5% of the gold standard, higher than that of the MTANN model proposed by Chen et al. (2016).

Fig. 5 3D visualization of case 6 in the BR sub-database
Fig. 6 3D volume results for case 6 in the BR sub-database ((a) truth masks; (b) ours)

3 Conclusion

This paper proposes an improved 14-layer U-Net tumor segmentation model with a VGG19 pre-trained encoder and implements a three-step strategy that mitigates, to a certain extent, the overfitting caused by small training sets. The model segments tumors in PET images fully automatically and fairly accurately. The U-Net parameters are initialized with part of the pre-trained VGG19 parameters; a loss function based on the Jaccard distance improves segmentation performance; and a novel DropBlock-based regularization method effectively regularizes the convolutional network, making the trained model more robust and better at generalizing than dropout. Experiments show that, with these improvements combined, the proposed model segments tumors in PET images stably and accurately.

References

  • Bai B, Bading J, Conti P S. 2013. Tumor quantification in clinical positron emission tomography. Theranostics, 3(10): 787-801 [DOI:10.7150/thno.5629]
  • Bi X J, Xiao J. 2011. Application of DE algorithm and improved GVF Snake model in segmentation of PET image. Journal of Image and Graphics, 16(3): 382-388 (毕晓君, 肖婧. 2011. 差分进化算法GVF Snake模型在PET图像分割中的应用. 中国图象图形学报, 16(3): 382-388) [DOI:10.11834/jig.20110320]
  • Cha K H, Hadjiiski L, Samala R K, Chan H P, Caoili E M, Cohan R H. 2016. Urinary bladder segmentation in CT urography using deep-learning convolutional neural network and level sets. Medical Physics, 43(4): 1882-1896 [DOI:10.1118/1.4944498]
  • Chen S, Zhong S, Yao L, Shang Y, Suzuki K. 2016. Enhancement of chest radiographs obtained in the intensive care unit through bone suppression and consistent processing. Physics in Medicine & Biology, 61(6): 2283-2301 [DOI:10.1088/0031-9155/61/6/2283]
  • Day E, Betler J, Parda D, Reitz B, Kirichenko A, Mohammadi S, Miften M. 2009. A region growing method for tumor volume segmentation on PET images for rectal and anal cancer patients. Medical Physics, 36(10): 4349-4358 [DOI:10.1118/1.3213099]
  • Foster B, Bagci U, Mansoor A, Xu Z, Mollura D J. 2014. A review on segmentation of positron emission tomography images. Computers in Biology and Medicine, 50: 76-96 [DOI:10.1016/j.compbiomed.2014.04.014]
  • Ghiasi G, Lin T Y and Le Q V. 2018. DropBlock: a regularization method for convolutional networks[EB/OL]. (2018-10-30)[2019-07-19]. https://arxiv.org/pdf/1810.12890.pdf
  • Hatt M, Lee J A, Schmidtlein C R, Naqa I E, Caldwell C, De Bernardi E, Lu W, Das S, Geets X, Gregoire V, Jeraj R, Macmanus M P, Mawlawi O R, Nestle U, Pugachev A B, Schöder H, Shepherd T, Spezi E, Visvikis D, Zaidi H and Kirov A S. 2017a. Classification and evaluation strategies of auto-segmentation approaches for PET: report of AAPM task group No. 211. Medical Physics, 44(6): e1-e42[DOI: 10.1002/mp.12124]
  • Hatt M, Tixier F, Pierce L, Kinahan P E, Le Rest C C, Visvikis D. 2017b. Characterization of PET/CT images using texture analysis:the past, the present… any future?. European Journal of Nuclear Medicine and Molecular Imaging, 44(1): 151-165 [DOI:10.1007/s00259-016-3427-0]
  • Lecun Y, Bottou L, Bengio Y, Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278-2324 [DOI:10.1109/5.726791]
  • Lelandais B, Ruan S, Denoeux T, Vera P, Gardin I. 2014. Fusion of multi-tracer PET images for dose painting. Medical Image Analysis, 18(7): 1247-1259 [DOI:10.1016/j.media.2014.06.014]
  • Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 3431-3440[DOI: 10.1109/CVPR.2015.7298965]
  • Moussallem M, Valette P J, Traverse-Glehen A, Houzard C, Jegou C, Giammarile F. 2012. New strategy for automatic tumor segmentation by adaptive thresholding on PET/CT images. Journal of Applied Clinical Medical Physics, 13(5): 236-251 [DOI:10.1120/jacmp.v13i5.3875]
  • Onoma D P, Ruan S, Gardin I, Monnehan G A, Modzelewski R and Vera P. 2012. 3D random walk based segmentation for lung tumor delineation in PET imaging//The 9th IEEE International Symposium on Biomedical Imaging. Barcelona, Spain: IEEE, 1260-1263[DOI: 10.1109/ISBI.2012.6235791]
  • Pereira S, Pinto A, Alves V, Silva C A. 2016. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Medical Imaging, 35(5): 1240-1251 [DOI:10.1109/TMI.2016.2538465]
  • Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer, 234-241[DOI: 10.1007/978-3-319-24574-4_28]
  • Thorwarth D, Geets X, Paiusco M. 2010. Physical radiotherapy treatment planning based on functional PET/CT data. Radiotherapy and Oncology, 96(3): 317-324 [DOI:10.1016/j.radonc.2010.07.012]
  • Yan D, Vicini F, Wong J, Martinez A. 1997. Adaptive radiation therapy. Physics in Medicine & Biology, 42(1): 123-132 [DOI:10.1088/0031-9155/42/1/008]
  • Yuan Y, Chao M, Lo Y C. 2017. Automatic skin lesion segmentation using deep fully convolutional networks with jaccard distance. IEEE Transactions on Medical Imaging, 36(9): 1876-1886 [DOI:10.1109/TMI.2017.2695227]
  • Zhong Z, Kim Y, Zhou L, Plichta K, Allen B, Buatti J and Wu X. 2018. 3D fully convolutional networks for co-segmentation of tumors on PET-CT images//Proceedings of the 15th IEEE International Symposium on Biomedical Imaging. Washington DC, USA: IEEE, 228-231[DOI: 10.1109/ISBI.2018.8363561]