Published: 2020-04-16
Medical Image Processing
Received: 2019-04-12; revised: 2019-09-16; preprint: 2019-09-23
Supported by: Natural Science Foundation of Shanxi Province, China (201701D121062)
About the authors:
Wang Lifang, born in 1977, female, associate professor; research interests: machine vision, big data processing, medical image processing. E-mail: wsm2004@nuc.edu.cn
Qin Pinle, male, associate professor; research interests: machine vision, big data processing, 3D reconstruction. E-mail: QPL@nuc.edu.cn. Lin Suzhen, female, professor; research interests: image processing, virtual restoration of cultural relics. E-mail: lsz@nuc.edu.cn. Gao Yuan, female, associate professor; research interests: machine vision, big data processing, 3D reconstruction. E-mail: yuan_g@126.com. Dou Jieliang, male, master's student; research interests: medical image fusion and machine learning. E-mail: liangdjsha@163.com
CLC number: TP391.4
Document code: A
Article ID: 1006-8961(2020)04-0745-14
|
Abstract
Objective To address the poor robustness of synthesis models in synthesis-based registration algorithms and the low registration accuracy caused by insufficient feature information in synthesized images, a multimodal brain image registration method based on a residual dense-relativistic average conditional generative adversarial network (RD-RaCGAN) is proposed. Method The relativistic average discriminator in the relativistic average GAN enhances model stability, and the conditional variable added in the conditional GAN improves the quality of the generated data. Combining the characteristics of these two networks with the ability of residual dense blocks to fully extract deep network features, the RD-RaCGAN synthesis model is constructed. The reference CT (computed tomography) and floating MR (magnetic resonance) images to be registered are then bidirectionally synthesized into the corresponding reference MR and floating CT images by the trained RD-RaCGAN model. Using a region-adaptive registration algorithm, key points carrying bone information are selected from the reference and floating CT images, and key points carrying soft-tissue information are selected from the floating and reference MR images; the extracted key points guide the estimation of the deformation field. One deformation field is estimated from the floating CT image to the reference CT image and, analogously, another from the floating MR image to the reference MR image. In addition, the idea of hierarchical symmetry is adopted to further optimize the two deformation fields: when the difference between them reaches a minimum, the two fields are fused into the final deformation field, which is applied to the floating image to complete registration. Result Experimental results show that, compared with six other image synthesis methods, the target images synthesized by the proposed model are superior in both visual effect and objective evaluation indices. Compared with the Powell-optimized MI (mutual information) method, ANTs-SyN (advanced normalization toolbox-symmetric normalization), D.Demons (diffeomorphic demons), Cue-Aware Net (cue-aware deep regression network), and I-SI (intensity and spatial information) registration methods, the normalized mutual information increased by 43.71%, 12.87%, 10.59%, 0.47%, and 5.59%, respectively, and the mean root-mean-square error decreased by 39.80%, 38.67%, 15.68%, 4.38%, and 2.61%, respectively. Conclusion The proposed multimodal brain image registration method is highly robust and completes registration tasks stably and accurately.
Keywords
medical image registration; image synthesis; relativistic average generative adversarial network; residual dense block; least squares; conditional generative adversarial network (CGAN)
Abstract
Objective Multimodal medical image registration is a key step in medical image analysis and processing, as it combines complementary information from images of different modalities and provides doctors with varied information about diseased tissues or organs, enabling accurate diagnosis and treatment planning. Image registration based on image synthesis is the main route to high-precision registration: a high-quality synthetic image indicates a good registration result. However, current synthesis-based registration algorithms have poor robustness in their synthesis models and represent synthetic image feature information insufficiently, resulting in low registration accuracy. In recent years, owing to the success of deep learning in many fields, medical image registration based on deep learning has become a focus of research. The synthesis model is trained according to the modality of the images to be registered, and the images it synthesizes bidirectionally are used to guide the subsequent registration; the anatomical information of both modalities guides the registration and improves the accuracy of multimodal image registration. Therefore, a multimodal brain image registration method based on a residual dense-relativistic average conditional generative adversarial network (RD-RaCGAN) is proposed in this study. Method First, the RD-RaCGAN image synthesis model is constructed by combining the advantage of the relativistic average discriminator in the relativistic average generative adversarial network, which enhances model stability, with the advantage of the conditional generative adversarial network, which improves the quality of the generated data, and the ability of residual dense blocks to fully extract the features of deep networks. Residual dense blocks are used as the core components of the generator.
The purpose of this generator is to capture the law of the sample distribution and generate a target image with specific meaning, that is, to take a floating magnetic resonance (MR) or reference computed tomography (CT) image as input and generate the corresponding synthetic CT or synthetic MR image. A convolutional neural network serves as the relativistic average discriminator, which distinguishes images produced by the generator from real images. The generator and relativistic average discriminator are trained adversarially: first, the generator is fixed to train the relativistic average discriminator; then, the relativistic average discriminator is fixed to train the generator, and the loop continues. During training, the least-squares function, which is more stable and saturates less than the cross-entropy function, is chosen to optimize the generator and relativistic average discriminator. The abilities of the generator and the relativistic average discriminator improve until the images produced by the generator can pass for real, at which point training of the synthesis model is complete. Subsequently, the CT and MR images to be registered are bidirectionally synthesized into the corresponding reference MR and floating CT images by the trained RD-RaCGAN synthesis model. The four images obtained by bidirectional synthesis are registered by a region-adaptive registration algorithm. Specifically, key points carrying bone information are selected from the reference and floating CT images, key points carrying soft-tissue information are selected from the floating and reference MR images, and the extracted key points guide the estimation of the deformation field. In other words, one deformation field is estimated from the floating CT image to the reference CT image, and another from the floating MR image to the reference MR image.
At the same time, the idea of hierarchical symmetry further guides the registration: the number of key points in the images is gradually increased as the reference and floating images approach each other. Moreover, anatomical information is used to optimize the two deformation fields continuously until the difference between them reaches a minimum. The two deformation fields are then fused into the deformation field between the reference CT image and the floating MR image, which is finally applied to the floating image to complete registration. Given that synthesizing a target image from each of the two images to be registered through the synthesis model takes time, the efficiency of the proposed algorithm is slightly lower than that of D.Demons (diffeomorphic demons) and ANTs-SyN (advanced normalization toolbox-symmetric normalization). Result Given that the quality of the synthesized images directly affects registration accuracy, three sets of comparison experiments were designed to verify the effectiveness of the proposed algorithm: different algorithms were compared on synthesizing CT from MR, different algorithms were compared on synthesizing MR from CT, and different registration algorithms were compared. The experimental results show that the target images synthesized by the proposed model are superior to those of the other methods in both visual effect and objective evaluation indices. The target images synthesized by RD-RaCGAN are similar to the real images and contain less noise than those generated by the other synthesis methods. As can be seen from the bones of the synthesized brain images and the regions near the air interface, the proposed synthesis model visually shows realistic texture details.
Compared with the Powell-optimized MI (mutual information) method, ANTs-SyN, D.Demons, Cue-Aware Net (cue-aware deep regression network), and I-SI (intensity and spatial information) registration methods, the normalized mutual information increased by 43.71%, 12.87%, 10.59%, 0.47%, and 5.59%, respectively, and the root-mean-square error decreased by 39.80%, 38.67%, 15.68%, 4.38%, and 2.61%, respectively. The results of the proposed registration algorithm are close to the reference image, and the registration effect diagrams show that the difference between the registered image and the reference image is smaller for the proposed algorithm than for the other methods; a small difference between the two images means a good registration result. Conclusion This study proposes a multimodal brain image registration method based on RD-RaCGAN, which solves the problem that the poor robustness of synthesis models in synthesis-based registration leads to inaccurate synthetic images and poor registration results.
Key words
medical image registration; image synthesis; RaGAN (relativistic average generative adversarial network); residual dense blocks; least squares; CGAN (conditional generative adversarial network)
0 Introduction
In clinical medicine, registering images of different modalities provides doctors with complementary information about lesion sites and improves diagnostic accuracy. Multimodal medical image registration therefore plays an important role in image fusion, tumor-growth monitoring, image-guided surgery, and radiotherapy planning (Sotiras et al., 2013).
Multimodal brain image registration methods fall into three main classes: feature-based, intensity-based, and synthesis-based. 1) Feature-based methods establish a mapping between images from their salient features. They are computationally efficient, but registration accuracy depends on the accuracy of feature extraction (Oliveira and Tavares, 2014); examples include RAMMS, Hammer, and SURF (speeded up robust feature). 2) Intensity-based methods register images by globally optimizing a measure of the gray-level characteristics of the two modalities. They achieve high accuracy but are computationally expensive; examples include the Powell-optimized MI (mutual information) method (Maes et al., 1996), ANTs-SyN (advanced normalization toolbox-symmetric normalization) (Avants et al., 2008), and D.Demons (diffeomorphic demons) (Vercauteren et al., 2009). 3) Synthesis-based methods are the main route to high-precision registration. Because the gray-level intensities of different modalities are inconsistent, synthesizing one modality from another converts the difficult multimodal registration problem into a relatively easy single-modality one, which gives these methods a clear advantage over the other two classes when large deformations exist between the images.
Methods for medical image synthesis fall roughly into traditional methods and deep-learning-based methods. 1) Traditional methods include atlas-based methods (Burgos et al., 2015), sparse representation (SR) methods (Roy et al., 2010; Ye et al., 2013), and random forest (RF) methods (Huynh et al., 2016). Atlas-based methods register every image in the atlas set to the source image and synthesize the target image from the mapping between the deformation fields and the atlas labels; they are very sensitive to registration accuracy, and the synthesized images have blurry edge contours. Sparse representation methods use sparse-coding theory to represent source image patches and the corresponding target patches as dictionaries and then learn the mapping between the dictionaries to generate the target image for a given source image, which is time-consuming. Random forest methods overfit when the sample data contain substantial noise (Ye et al., 2015). 2) Deep-learning-based methods. In recent years, deep learning has achieved remarkable results in image processing; in a large number of tasks, features learned by deep networks have proved more expressive than those obtained by traditional methods (Krizhevsky et al., 2012). Han (2017) used a deep convolutional neural network (DCNN) to synthesize CT (computed tomography) images from MR (magnetic resonance) images with less computation than regression-based methods (Huynh et al., 2016), but neighborhood information in the predicted target image is easily lost when the input image is small. Xie et al. (2016) applied fully convolutional networks (FCN), which preserve structural information, to image synthesis, but training a robust and accurate model requires a large dataset and takes a long time. Later, Nie et al. (2018) proposed a synthesis algorithm based on generative adversarial networks (GAN) (Goodfellow et al., 2014), building the synthesis model through adversarial training of a generator and a discriminator. Experiments show that, compared with other deep learning algorithms (Han, 2017; Xie et al., 2016), adversarial learning generates more realistic images and the model structure is easy to extend. However, because the generation process of a GAN is too unconstrained, generated samples may be inconsistent with the target when the dataset is large and the image content complex, and training is prone to mode collapse and vanishing gradients (Salimans et al., 2016). Moreover, all of the above synthesis methods work in a single direction, so only one image's anatomical information guides the registration, which easily introduces bias.
To address these problems, this paper proposes a multimodal brain image registration method based on a residual dense-relativistic average conditional generative adversarial network (RD-RaCGAN), which improves the modeling of brain images by combining the relativistic average GAN (RaGAN) (Jolicoeur-Martineau, 2018) and the conditional GAN (CGAN) (Mirza and Osindero, 2014). The structure of the generator strongly affects the quality of generated samples; Zhang et al. (2018) showed that residual dense blocks make full use of the hierarchical features of all convolutional layers in a deep network and effectively alleviate network degradation. The generator in this paper therefore takes residual dense blocks as its core components, a convolutional neural network serves as the relativistic average discriminator, and the synthesis model is trained by adversarial learning to perform bidirectional CT-to-MR and MR-to-CT image synthesis. A region-adaptive registration algorithm then jointly estimates the deformation field, which is applied to the floating image to complete registration.
1 Related work
1.1 Relativistic average generative adversarial network
To address the instability of GANs, the relativistic average GAN (RaGAN) starts from the discriminator: instead of estimating the probability that a single input is real, the relativistic average discriminator estimates the probability that real data are, on average, more realistic than the generated data, which stabilizes adversarial training.
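The behavior of the relativistic average discriminator can be illustrated with a small NumPy sketch; the scalar critic scores below are toy values, not outputs of a trained network. Each real sample is scored against the average critic output of the fake batch, and vice versa:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def d_ra(critic_real, critic_fake):
    """Relativistic average discriminator outputs for a batch.

    D_Ra(x_r, x_f) = sigmoid(C(x_r) - mean(C(x_f))): how much more
    realistic a real sample looks than the average fake sample;
    D_Ra(x_f, x_r) is the symmetric quantity for fakes.
    """
    d_real = sigmoid(critic_real - critic_fake.mean())
    d_fake = sigmoid(critic_fake - critic_real.mean())
    return d_real, d_fake

# Toy critic scores: real samples score higher than fakes on average.
c_real = np.array([2.0, 1.5, 2.5])
c_fake = np.array([-1.0, -0.5, 0.0])
d_real, d_fake = d_ra(c_real, c_fake)
print(d_real.mean() > 0.5, d_fake.mean() < 0.5)  # True True
```

Relative scoring keeps the discriminator informative even late in training, when an ordinary discriminator would confidently reject all fakes and starve the generator of gradient.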
1.2 Conditional generative adversarial network
As an unsupervised method, a GAN is difficult to control when the image content in the dataset is complex. The conditional GAN (CGAN) constrains the generation process by feeding a conditional variable into both the generator and the discriminator, so that the conditional variable directly guides the generator and improves the quality of the generated data.
1.3 Residual dense block
The residual dense block (RDB) combines the skip connections of residual networks (He et al., 2016) with the dense connections of dense networks (Huang et al., 2017), so the network both alleviates the degradation of deep networks, as residual networks do, and strengthens feature representation, as dense networks do. A basic residual dense block is shown in Fig. 2.
In Fig. 2, each RDB contains three convolutional (conv) layers, each followed by a ReLU activation function; every layer receives the concatenated outputs of all preceding layers, and a local residual connection adds the block input to the fused block output.
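The data flow of an RDB can be sketched in NumPy on a per-pixel channel vector; the random linear maps below stand in for the 3×3 convolutions of the real block, and the layer count and growth rate are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def residual_dense_block(x, n_layers=3, growth=4):
    """Schematic residual dense block on a channel vector.

    Dense connections: every layer sees the concatenation of all
    previous outputs. A final 1x1-style fusion maps back to the input
    width, and a residual connection adds the block input.
    """
    feats = [x]
    for _ in range(n_layers):
        w = rng.standard_normal((growth, sum(f.size for f in feats)))
        feats.append(relu(w @ np.concatenate(feats)))   # dense connection
    w_fuse = rng.standard_normal((x.size, sum(f.size for f in feats)))
    fused = w_fuse @ np.concatenate(feats)              # local feature fusion
    return x + fused                                    # local residual learning

x = rng.standard_normal(8)
y = residual_dense_block(x)
print(y.shape)  # (8,)
```

The fusion step keeps the output width equal to the input width, so RDBs can be stacked arbitrarily deep inside the generator.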
2 Brain image registration algorithm based on RD-RaCGAN
2.1 Network structure of RD-RaCGAN
Combining the advantage of CGAN, whose conditional variable directly guides the generator and improves the quality of generated images, with the advantage of RaGAN, whose relativistic average discriminator improves network stability, and the ability of residual dense blocks to fully extract deep hierarchical features, the RD-RaCGAN network is constructed.
RD-RaCGAN (residual dense-relativistic average conditional generative adversarial network) consists of a generator (G) and a relativistic average discriminator (D_Ra).
2.1.1 Generator
The generator G takes residual dense blocks as its core components. Its purpose is to capture the law of the sample distribution and generate a target image with specific meaning, i.e., to take a floating MR or reference CT image as input and output the corresponding synthetic CT or synthetic MR image.
2.1.2 Relativistic average discriminator
The relativistic average discriminator (D_Ra) is a convolutional neural network whose task is to distinguish the images produced by the generator from real images; it estimates how much more realistic a real image is, on average, than a generated one.
2.1.3 Least-squares loss function
The choice of loss function strongly affects network performance; GAN, CGAN, and RaGAN all use cross entropy as the loss function. Arjovsky et al. (2017) pointed out that GAN training instability arises because cross entropy is unsuitable for measuring the distance between distributions with disjoint supports, and Mao et al. (2016) noted that part of the instability in adversarial learning is caused by discriminator saturation. As shown in Fig. 6, the saturation region of the least-squares function is far smaller than that of the cross-entropy function, so for adversarial learning the least-squares loss is more stable and yields higher-quality generated images. This paper therefore adopts least squares as the loss function, treating it as an optimization problem: minimizing the objective value drives the synthesized image to resemble the real image.
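The saturation argument can be checked numerically: for a generated sample that the discriminator confidently rejects, the gradient of the original cross-entropy generator loss nearly vanishes, while the least-squares gradient stays large. The score value below is an arbitrary illustration:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Discriminator score of a fake sample that is easily rejected.
s = -8.0

# Original GAN generator loss log(1 - sigmoid(s)); its gradient in s
# is -sigmoid(s), which vanishes for very negative scores (saturation).
grad_ce = -sigmoid(s)

# Least-squares generator loss (s - 1)^2; its gradient 2*(s - 1) grows
# with the distance from the target value 1, so learning continues.
grad_ls = 2.0 * (s - 1.0)

print(abs(grad_ce) < 1e-3, abs(grad_ls) > 1.0)  # True True
```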
In a CGAN with the least-squares loss, the generator and discriminator losses are
$ L_{G}^{\mathrm{CGAN}}=\frac{1}{2} E_{c \sim p(c)}\left[D\left(\boldsymbol{x}_{\mathrm{f}}\right)-1\right]^{2} $ | (1) |
$ \begin{aligned} L_{D}^{\mathrm{CGAN}}=& \frac{1}{2} E_{x_{\mathrm{r}} \sim p\left(x_{\mathrm{r}}\right)}\left[\left(D\left(\boldsymbol{x}_{\mathrm{r}}\right)-1\right)^{2}\right]+\\ & \frac{1}{2} E_{c \sim p(c)}\left[D(c, G(c, \boldsymbol{z}))^{2}\right] \end{aligned} $ | (2) |
式中, $c$ is the conditional variable, $\boldsymbol{z}$ is the noise input, $\boldsymbol{x}_{\mathrm{f}}=G(c, \boldsymbol{z})$ is the generated image, $\boldsymbol{x}_{\mathrm{r}}$ is the real image, $D$ is the discriminator, and $E$ denotes expectation. To draw the synthesized image closer to the real image at the pixel level, an $L_1$ distance term is added
$ L_{1}(G)=E_{x_{\mathrm{r}}, x_{\mathrm{f}} \sim p\left(x_{\mathrm{r}}, x_{\mathrm{f}}\right)}\left[\left\|\boldsymbol{x}_{\mathrm{r}}-\boldsymbol{x}_{\mathrm{f}}\right\|_{1}\right] $ | (3) |
Incorporating the gradient penalty factor of the relativistic average discriminator into Eqs. (1) and (2) yields the loss function of RD-RaCGAN:
$ L_{\mathrm{RD}-\mathrm{RaCGAN}}=\left\{\min \left(L_{D_{\mathrm{Ra}}}^{\mathrm{RaCGAN}}\right), \min \left(L_{G}^{\mathrm{RaCGAN}}\right)\right\} $ | (4) |
$ L_{D_{\mathrm{Ra}}}^{\mathrm{RaCGAN}}=\frac{1}{2} E_{x_{\mathrm{r}}, x_{\mathrm{f}} \sim p\left(x_{\mathrm{r}}, x_{\mathrm{f}}\right)}\left\{\left[D_{\mathrm{Ra}}\left(\boldsymbol{x}_{\mathrm{r}}, \boldsymbol{x}_{\mathrm{f}}\right)-1\right]^{2}+\left[D_{\mathrm{Ra}}\left(\boldsymbol{x}_{\mathrm{f}}, \boldsymbol{x}_{\mathrm{r}}\right)\right]^{2}\right\} $ | (5) |
$ L_{G}^{\mathrm{RaCGAN}}=\frac{1}{2} E_{x_{\mathrm{r}}, x_{\mathrm{f}} \sim p\left(x_{\mathrm{r}}, x_{\mathrm{f}}\right)}\left[D_{\mathrm{Ra}}\left(\boldsymbol{x}_{\mathrm{f}}, \boldsymbol{x}_{\mathrm{r}}\right)-1\right]^{2}+\lambda L_{1}(G) $ | (6) |
式中, $D_{\mathrm{Ra}}(\cdot, \cdot)$ denotes the output of the relativistic average discriminator for the first argument relative to the batch average of the second, and $\lambda$ is the weight of the $L_1$ term $L_1(G)$.
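A batch-level NumPy sketch of the least-squares losses above; critic scores, images, and the weight λ are all toy values chosen for illustration:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def racgan_losses(critic_real, critic_fake, x_real, x_fake, lam=10.0):
    """Least-squares relativistic average losses over a batch (schematic).

    d_real = D_Ra(x_r, x_f), d_fake = D_Ra(x_f, x_r). The discriminator
    pushes d_real toward 1 and d_fake toward 0; the generator pushes
    d_fake toward 1 and adds an L1 term between real and synthesized
    images (lam is an assumed weight).
    """
    d_real = sigmoid(critic_real - critic_fake.mean())
    d_fake = sigmoid(critic_fake - critic_real.mean())
    loss_d = 0.5 * np.mean((d_real - 1.0) ** 2 + d_fake ** 2)
    l1 = np.mean(np.abs(x_real - x_fake))
    loss_g = 0.5 * np.mean((d_fake - 1.0) ** 2) + lam * l1
    return loss_d, loss_g

rng = np.random.default_rng(1)
c_r, c_f = rng.standard_normal(4), rng.standard_normal(4)
x_r, x_f = rng.random((4, 8, 8)), rng.random((4, 8, 8))
ld, lg = racgan_losses(c_r, c_f, x_r, x_f)
print(ld >= 0.0 and lg >= 0.0)  # True
```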
2.2 Region-adaptive registration algorithm
With CT as the reference image and MR as the floating image, the trained RD-RaCGAN synthesis model synthesizes the corresponding reference S-MR and floating S-CT images. To avoid the bias introduced when unidirectionally synthesized image information guides multimodal registration, the anatomical information of both S-MR and S-CT is used when registering MR and CT, and a region-adaptive registration algorithm (Cao et al., 2018a) is adopted. The procedure has three stages:
1) Key-point sampling. To obtain the deformation field between CT and MR, key points are sampled according to a gradient-based probability
$P=\frac{|\nabla x|+|\nabla y|+|\nabla z|}{|\nabla G|}$ | (7) |
式中, $\nabla x$, $\nabla y$, and $\nabla z$ are the image gradients along the three coordinate axes, $|\nabla G|$ is the normalizing gradient magnitude, and $P$ is the probability that a voxel is selected as a key point.
2) Region-adaptive selection of key points. A truncation threshold is applied to Eq. (7): key points carrying bone information are selected from the CT pair (CT and S-CT), key points carrying soft-tissue information are selected from the MR pair (MR and S-MR), and the matching relations between key points are detected by normalized cross correlation (NCC).
3) Hierarchical symmetric registration. Through bidirectional image synthesis, the registration of MR and CT is converted into single-modality registration of the CT pair and the MR pair, i.e.
$\boldsymbol{\varphi}=\arg \min M\left(\boldsymbol{I}_{\mathrm{CT}}, D\left(\boldsymbol{I}_{\mathrm{S}-\mathrm{CT}}, \boldsymbol{\varphi}\right)\right)+$ $M\left(\boldsymbol{I}_{\mathrm{S}-\mathrm{MR}}, D\left(\boldsymbol{I}_{\mathrm{MR}}, \boldsymbol{\varphi}\right)\right)+\alpha R(\boldsymbol{\varphi})$ | (8) |
式中, $M(\cdot, \cdot)$ is the similarity metric between two images, $D(\boldsymbol{I}, \boldsymbol{\varphi})$ denotes warping image $\boldsymbol{I}$ with deformation field $\boldsymbol{\varphi}$, $R(\boldsymbol{\varphi})$ is the regularization term, and $\alpha$ is its weight. To solve Eq. (8) symmetrically, the template and subject images are both deformed toward a middle space
$\begin{aligned}\left\{\boldsymbol{\varphi}_{1}, \boldsymbol{\varphi}_{2}\right\}=& \arg \min\limits_{\boldsymbol{\varphi}_{1}, \boldsymbol{\varphi}_{2}} M\left(D\left(\boldsymbol{T}, \boldsymbol{\varphi}_{1}\right), D\left(\boldsymbol{S}, \boldsymbol{\varphi}_{2}\right)\right)+\\ & \alpha\left(R\left(\boldsymbol{\varphi}_{1}\right)+R\left(\boldsymbol{\varphi}_{2}\right)\right) \end{aligned}$ | (9) |
Taking CT and S-MR as the template space $\boldsymbol{T}$ and MR and S-CT as the subject space $\boldsymbol{S}$, the two half-way deformation fields $\boldsymbol{\varphi}_1$ and $\boldsymbol{\varphi}_2$ are estimated by Eq. (9), and the final forward and inverse deformation fields are obtained by composition
$\begin{aligned} &\boldsymbol{\varphi}=\boldsymbol{\varphi}_{2} \circ \boldsymbol{\varphi}_{1}^{-1}\\ &\boldsymbol{\varphi}^{-1}=\boldsymbol{\varphi}_{1} \circ \boldsymbol{\varphi}_{2}^{-1} \end{aligned}$ | (10) |
式中, "$\circ$" denotes the composition of deformation fields, and $\boldsymbol{\varphi}^{-1}$ is the inverse of $\boldsymbol{\varphi}$.
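The composition and inversion in Eq. (10) can be checked with toy 1-D deformations treated as functions; pure translations are used here for illustration, whereas real deformation fields require interpolation when composed on a grid:

```python
import numpy as np

def compose(phi_a, phi_b):
    """Composition of deformations: (phi_a ∘ phi_b)(x) = phi_a(phi_b(x))."""
    return lambda x: phi_a(phi_b(x))

# Two half-way deformations toward a middle space (toy translations).
phi1 = lambda x: x + 2.0          # template -> middle space
phi1_inv = lambda x: x - 2.0
phi2 = lambda x: x - 3.0          # subject -> middle space
phi2_inv = lambda x: x + 3.0

phi = compose(phi2, phi1_inv)      # Eq. (10): phi = phi2 ∘ phi1^{-1}
phi_inv = compose(phi1, phi2_inv)  # phi^{-1} = phi1 ∘ phi2^{-1}

x = np.linspace(0.0, 10.0, 5)
print(np.allclose(phi_inv(phi(x)), x))  # True
```

Building the full field from two half-way fields keeps the formulation symmetric: swapping template and subject simply swaps φ and φ⁻¹.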
2.3 Brain image registration algorithm based on RD-RaCGAN
The flow of the proposed RD-RaCGAN-based multimodal brain image registration algorithm is shown in Fig. 9. The steps are:
1) Construct the RD-RaCGAN synthesis model. The conditional variable in CGAN directly guides the generator and improves the quality of the generated images, the relativistic average discriminator in RaGAN improves network stability, and residual dense blocks fully extract deep hierarchical features; combining these, the synthesis model is trained with the least-squares loss.
2) Generate synthetic images. Two brain images of the same size to be registered, e.g., a reference CT and a floating MR image, are input, and the RD-RaCGAN synthesis model performs CT-to-S-MR and MR-to-S-CT synthesis.
3) Region-adaptive registration. Bidirectional synthesis yields four images: the CT pair (CT and S-CT) and the MR pair (MR and S-MR). Key points carrying bone information are selected from the CT pair and key points carrying soft-tissue information from the MR pair; the region-adaptive registration algorithm gradually increases the number of key points during registration and guides the estimation of the deformation field hierarchically and symmetrically.
4) Apply the final deformation field to the floating image to complete registration.
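The gradient-driven key-point sampling of Eq. (7), together with the truncation-threshold selection of stage 2) above, can be sketched in 2-D with NumPy; the threshold value is illustrative:

```python
import numpy as np

def keypoint_probability(img, thresh=0.5):
    """Gradient-based key-point sampling map (2-D analogue of Eq. (7)).

    P is proportional to the local gradient magnitude so that
    structure-rich voxels (bone edges in CT, tissue boundaries in MR)
    are sampled more often; a truncation threshold then keeps only the
    strongest responses.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.abs(gx) + np.abs(gy)
    p = mag / (mag.sum() + 1e-12)                      # probability map
    p_trunc = np.where(p >= thresh * p.max(), p, 0.0)  # truncated threshold
    return p, p_trunc

img = np.zeros((16, 16))
img[:, 8:] = 1.0                    # a single vertical edge
p, p_trunc = keypoint_probability(img)
print(np.isclose(p.sum(), 1.0), (p_trunc > 0).sum() > 0)  # True True
```

In the toy image, all surviving key points sit on the single edge, which mirrors how the method concentrates samples on bone or tissue boundaries.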
3 Experimental results and analysis
3.1 Datasets and experimental environment
The training data come from the RIRE (retrospective image registration evaluation) database (http://insight-journal.org/rire) and the Atlas database (Johnson and Becker, 1999). From 18 RIRE patients and the acute stroke, hypertensive encephalopathy, and intracranial tumor patients in the Atlas database, 518 MR/CT image pairs with high clarity, rich texture, and complex detail were selected as the dataset. The voxel size of the RIRE CT images ranges from 0.40×0.40×3.00 mm³ to 0.65×0.65×4.00 mm³ and that of the MR images from 0.82×0.82×3.00 mm³ to 1.25×1.25×4.00 mm³; the Atlas CT images are 512×512 pixels and the MR images 256×256 pixels. First, the DICOM (digital imaging and communications in medicine) data from RIRE and the GIF (graphics interchange format) data from Atlas were converted to PNG (portable network graphics) format. Next, the N4 bias correction algorithm (Tustison et al., 2010) was applied to remove the intensity inhomogeneity of the MR images. Finally, all images were resampled to 256×256 pixels with a slice thickness of 1 mm. The dataset is split into three parts: a training set for fitting the network model (80% of the data, 414 pairs), a validation set for preliminary accuracy evaluation and the decision of when to stop training (10%, 52 pairs), and a test set for evaluating the generalization of the final model (10%, 52 pairs). Because deep models usually benefit from large amounts of data, Albumentations (Buslaev et al., 2018) was used to augment the training set: each image was rotated by 90°, 180°, and 270°, flipped horizontally and vertically, transposed, and subjected to elastic transformation, grid distortion, and optical distortion, yielding 4×6−1 = 23 times more data, i.e., 9 936 pairs for training.
The hardware platform is an Intel Xeon server with two NVIDIA Tesla M40 GPUs, each with 12 GB of memory. The software platform is 64-bit Windows 7 with PyCharm64, TensorFlow V1.2, and MATLAB R2016a.
3.2 Training process
The generator G and the relativistic average discriminator D_Ra are trained alternately in an adversarial manner:
1) Fix the generator G and train the relativistic average discriminator D_Ra to distinguish the images produced by G from real images.
2) Use the updated D_Ra to score the images generated by G and compute the generator loss.
3) Fix D_Ra and train the generator G.
4) Repeat steps 1)-3), evaluating the model with the validation set, until the target images produced by G can pass for real.
5) Evaluate the performance of the final model and the quality of the synthesized images on the test set.
This paper uses the Adam optimization algorithm (Kingma and Ba, 2014) to update the network parameters during training by driving the discriminator and generator losses to a minimum. The initial learning rate is 1×10⁻⁴, the momentum parameter 0.9, and the weight decay 5×10⁻⁴. Convolutional and deconvolutional layers are initialized from a normal distribution with mean 0 and standard deviation 0.01. Limited by GPU memory, mini-batch training is used with a batch size of 16. Because MR images carry richer anatomical information than CT images, synthesizing S-CT from MR goes from "complex" to "simple", while synthesizing S-MR from CT goes from "simple" to "complex"; during training, the CT-to-S-MR synthesis model therefore takes relatively longer to train. After 200 epochs of training, the model converges.
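For reference, a minimal NumPy version of the Adam update, with the stated learning rate 1×10⁻⁴ and momentum 0.9 (weight decay is omitted for brevity), applied to a toy quadratic loss:

```python
import numpy as np

def adam(grad_fn, w, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimal Adam loop (learning rate and beta1 as stated in the text)."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)             # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy loss 0.5*||w||^2 with gradient w; Adam should shrink w toward 0.
w0 = np.array([0.5, -0.3])
w_final = adam(lambda w: w, w0)
print(np.linalg.norm(w_final) < np.linalg.norm(w0))  # True
```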
As can be seen from Fig. 10, as the number of iterations increases, the losses of the generator and the relativistic average discriminator gradually decrease and stabilize, indicating that training converges.
3.3 Comparison and analysis of results
Because the quality of the synthesized images directly affects registration accuracy, three groups of comparison experiments were designed to verify the registration performance of the proposed algorithm.
1) Comparison of algorithms for synthesizing CT from MR. Fig. 11 shows three groups of brain MR-T2 images from the test set and their corresponding CT images. This experiment compares the synthesis results of five methods: FCN-GAN (Nie et al., 2018), Atlas, DCNN (Han, 2017), SR, and RF. As Fig. 11 shows, the CT images synthesized by the Atlas algorithm have unclear, incomplete contours; the images synthesized by SR look blurry overall; the images obtained by DCNN contain considerable noise; RF and FCN-GAN perform well in sharpness and visual effect but still handle details less well than RD-RaCGAN. For example, the magnified view in the first row shows that complex details such as the ear canal, finer bones, and the back of the skull are synthesized inaccurately by these methods. The images synthesized by RD-RaCGAN show visually more realistic texture details, are more similar to the real CT images, and contain less noise; the second row shows that the regions near the bone-air interface are synthesized more accurately than by the other methods.
To evaluate the synthesis results objectively, the mean absolute error (MAE) and peak signal-to-noise ratio (PSNR) are used to measure the accuracy of the synthesized images:
$ M_{{\rm A E}}=\frac{\sum\limits_{i=1}^{H}\left|\boldsymbol{C T}_{{\rm R}}(i)-\boldsymbol{C T}_{{\rm S}}(i)\right|}{H} $ | (11) |
$ P_{\mathrm{SNR}}=10 \lg \left(\frac{Q^{2}}{M_{\mathrm{SE}}}\right) $ | (12) |
式中, $\boldsymbol{CT}_{\mathrm{R}}$ and $\boldsymbol{CT}_{\mathrm{S}}$ are the real and synthesized CT images, $H$ is the number of pixels, $Q$ is the maximum gray value of the image, and $M_{\mathrm{SE}}$ is the mean squared error between the two images.
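Eqs. (11) and (12) translate directly into NumPy; Q = 255 is assumed here for 8-bit images:

```python
import numpy as np

def mae(real, synth):
    """Mean absolute error, Eq. (11)."""
    return np.mean(np.abs(real.astype(float) - synth.astype(float)))

def psnr(real, synth, q=255.0):
    """Peak signal-to-noise ratio in dB, Eq. (12); Q = 255 assumed."""
    mse = np.mean((real.astype(float) - synth.astype(float)) ** 2)
    return 10.0 * np.log10(q * q / mse)

real = np.full((8, 8), 100.0)
synth = real + 5.0                  # constant error of 5 gray levels
print(mae(real, synth))             # 5.0
print(round(psnr(real, synth), 2))  # 34.15
```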
Table 1 lists the mean and standard deviation (std) of the MAE and PSNR for CT images synthesized by the six algorithms. To better assess synthesis performance, Table 1 also summarizes the MAE in different brain regions: air, bone, and soft tissue. Judging from the registered images and the indices in Table 1, the proposed synthesis model achieves the best synthesis results. Computing the MAE by region shows that all synthesis methods produce larger errors in the bone and air regions and smaller errors in the remaining tissue regions.
Table 1
MAE and PSNR values of CT images synthesized by different synthesis algorithms
Algorithm | Whole brain | Air | Bone | Tissue | | |
MAE | PSNR/dB | MAE | MAE | MAE | |||
Atlas | mean | 169.5 | 20.9 | 294.21 | 271.39 | 55.6 | |
std | 35.6 | 1.6 | 65.5 | 64.95 | 23.34 | ||
SR | mean | 166 | 21.2 | 302.48 | 283.96 | 52.18 | |
std | 37.6 | 1.7 | 68.99 | 60.5 | 22.84 | ||
RF | mean | 99.9 | 26.3 | 240.94 | 264.1 | 42.61 | |
std | 14.2 | 1.4 | 57.47 | 55.7 | 14.32 | ||
DCNN | mean | 102.4 | 26.2 | 256.5 | 255.22 | 43.23 | |
std | 11.3 | 1.22 | 57.32 | 55.32 | 16.9 | ||
FCN-GAN | mean | 93.5 | 27.9 | 238.5 | 245.3 | 39.09 | |
std | 13.9 | 1.18 | 52.5 | 48.97 | 11.2 | ||
Proposed | mean | 88.3 | 28.15 | 235.44 | 240.2 | 38.65 |
std | 10.25 | 1.17 | 50.21 | 47.74 | 8.58 |
2) Comparison of algorithms for synthesizing MR from CT. Fig. 12 shows two groups of brain MR-T2 images from the test set and their corresponding CT images, together with the MR images synthesized from CT by the MT-RF (multi-target regression forest) algorithm (Cao et al., 2018a) and by the proposed RD-RaCGAN algorithm. As Fig. 12 shows, MT-RF handles rich textures less finely than RD-RaCGAN: its synthesized images are blurrier and visually over-smoothed, whereas RD-RaCGAN produces visually more realistic texture details.
Table 2 lists the evaluation indices of the two synthesis algorithms; the proposed algorithm outperforms MT-RF on every index. Note that the MAE values in Table 2 are slightly larger, and the PSNR values slightly smaller, than those in Table 1: synthesizing anatomically complex MR images from the simpler CT images is computationally harder than the reverse, so the synthesis error is correspondingly slightly higher.
Table 2
MAE and PSNR values of MR images synthesized by different synthesis algorithms
Algorithm | Whole brain | Air | Bone | Tissue | | |
MAE | PSNR/dB | MAE | MAE | MAE | |||
MT-RF | mean | 124.53 | 26.52 | 238.45 | 246.96 | 40.52 | |
std | 7.88 | 0.78 | 53.24 | 52.37 | 14.33 | ||
Proposed | mean | 120.55 | 27.21 | 225.93 | 241.78 | 39.02 |
std | 6.43 | 0.82 | 51.85 | 48.02 | 7.96 |
3) Comparison of registration algorithms. To evaluate the registration performance of the RD-RaCGAN-based algorithm, CT/MR images of acute stroke patients were selected from the Atlas database, all 256×256 pixels. The experiment was divided into two groups, each drawing 8 pairs at random from 24 slice pairs, with CT as the reference image and MR as the floating image; the MR images in both groups were randomly subjected to translations of ±20 pixels, rotations of ±10°, and 20% ripple deformation. The proposed method is compared with registration methods commonly used for multimodal medical images: the Powell-optimized MI method, D.Demons, ANTs-SyN, Cue-Aware Net (Cao et al., 2018b), and I-SI (Öfverstedt et al., 2019). Fig. 13 shows the registration results of the six methods and the image differences after registration.
This experiment uses normalized mutual information (NMI) and root-mean-square error (RMSE) to evaluate registration. NMI measures the statistical dependence between two images: the larger the NMI between the registered image and the reference image, the more similar the two images and the more accurate the registration. RMSE measures the deviation of the registered image from the reference image: the smaller the RMSE, the better the registration. NMI and RMSE are defined as
$ N_{\mathrm{MI}}=\frac{H(R)+H(M(\boldsymbol{\varphi}(n)))}{H(R, M(\boldsymbol{\varphi}(n)))} $ | (13) |
$ R_{\mathrm{MSE}}=\sqrt{\frac{1}{n} \sum\limits_{i=1}^{n}\left(M_{i}-R_{i}\right)^{2}} $ | (14) |
式中, $H(R)$ and $H(M(\boldsymbol{\varphi}(n)))$ are the marginal entropies of the reference image $R$ and the registered floating image $M(\boldsymbol{\varphi}(n))$, $H(R, M(\boldsymbol{\varphi}(n)))$ is their joint entropy, $n$ is the number of pixels, and $M_i$ and $R_i$ are the gray values of the $i$-th pixel of the registered and reference images.
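Eqs. (13) and (14) can be computed from a joint histogram; the bin count below is an arbitrary choice:

```python
import numpy as np

def nmi(a, b, bins=32):
    """Normalized mutual information (H(A)+H(B))/H(A,B), Eq. (13)."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))
    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())

def rmse(a, b):
    """Root-mean-square error, Eq. (14)."""
    return np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2))

rng = np.random.default_rng(2)
ref = rng.random((32, 32))
other = rng.random((32, 32))
print(rmse(ref, ref))                    # 0.0
print(nmi(ref, ref) > nmi(ref, other))   # True
```

For a perfectly registered pair the joint histogram collapses onto a curve, so NMI approaches 2 and RMSE approaches 0, matching the interpretation in the text.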
Table 3 lists the average NMI, RMSE, and registration time of the proposed and comparison algorithms. As Table 3 shows, compared with Powell-MI, ANTs-SyN, D.Demons, Cue-Aware Net, and I-SI, the mean NMI of the proposed algorithm increased by 43.71%, 12.87%, 10.59%, 0.47%, and 5.59%, respectively, and the mean RMSE decreased by 39.80%, 38.67%, 15.68%, 4.38%, and 2.61%, respectively, so the RD-RaCGAN-based registration algorithm is more effective and more accurate. In terms of registration time, however, the proposed algorithm is slightly slower than D.Demons and ANTs-SyN, mainly because the method is synthesis-based: the synthesis model trained by RD-RaCGAN must first bidirectionally synthesize the CT/MR images to be registered, so generating the corresponding S-CT/S-MR images takes additional time.
Table 3
Average evaluation index values and registration times of images registered by different algorithms
Registration method | NMI | RMSE | Registration time/s |
Powell-MI | 1.025 | 1.304 | 208.86 |
ANTs-SyN | 1.305 | 1.280 | 148.52 |
D.Demons | 1.332 | 0.931 | 136.48 |
Cue-Aware Net | 1.466 | 0.821 | 300.80 |
I-SI | 1.395 | 0.806 | 209.40 |
Proposed | 1.473 | 0.785 | 181.32 |
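The relative gains quoted above can be reproduced directly from the NMI and RMSE columns of Table 3:

```python
# NMI (higher is better) and RMSE (lower is better) from Table 3.
nmi = {"Powell-MI": 1.025, "ANTs-SyN": 1.305, "D.Demons": 1.332,
       "Cue-Aware Net": 1.466, "I-SI": 1.395, "Proposed": 1.473}
rmse = {"Powell-MI": 1.304, "ANTs-SyN": 1.280, "D.Demons": 0.931,
        "Cue-Aware Net": 0.821, "I-SI": 0.806, "Proposed": 0.785}

# Percentage NMI increase and RMSE decrease of the proposed method
# relative to each comparison method.
nmi_gain = {k: 100.0 * (nmi["Proposed"] - v) / v
            for k, v in nmi.items() if k != "Proposed"}
rmse_drop = {k: 100.0 * (v - rmse["Proposed"]) / v
             for k, v in rmse.items() if k != "Proposed"}

print(round(nmi_gain["Powell-MI"], 2))   # 43.71
print(round(rmse_drop["Powell-MI"], 2))  # 39.8
```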
Fig. 13 shows the registration results and post-registration image differences of the different algorithms. From the texture and edge regions indicated by the arrows: 1) for floating images that have been warped and deformed, the result of the proposed registration algorithm is closer to the reference image; 2) the difference between the registered image and the reference image is smaller for the proposed algorithm than for the other five methods. The smaller the difference between the two images after registration, the better the registration.
4 Conclusion
This paper proposes a multimodal brain image registration method based on a residual dense-relativistic average conditional generative adversarial network (RD-RaCGAN), which addresses the poor robustness of synthesis models in synthesis-based registration that leads to inaccurate synthetic images and poor registration results. Experiments show that the method synthesizes S-CT and S-MR images from the MR and CT images to be registered through the RD-RaCGAN model and uses them to jointly estimate the deformation field. The algorithm has the following characteristics: 1) the RD-RaCGAN synthesis model combines the advantage of CGAN, whose conditional variable directly guides the generator and improves the quality of the generated images, with the advantage of RaGAN, whose relativistic average discriminator improves network stability, giving strong robustness and good synthesis quality; 2) using bidirectionally synthesized image information to guide registration lowers the difficulty of multimodal registration and improves registration accuracy.
The algorithm nevertheless has limitations and considerable room for improvement. 1) The two images to be registered must first be passed through the pre-trained synthesis model before registration, which takes more time than traditional methods; future work will focus on reducing the execution time of the algorithm. 2) The image synthesis model only handles single-domain synthesis: synthesizing across multiple domains requires training a separate generator for every pair of domains, which is computationally expensive. Designing a multi-domain synthesis model suitable for multimodal registration tasks would improve registration efficiency.
References
-
Arjovsky M, Chintala S and Bottou L. 2017. Wasserstein GAN[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1701.07875.pdf
-
Avants B B, Epstein C L, Grossman M, Gee J C. 2008. Symmetric diffeomorphic image registration with cross-correlation:evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis, 12(1): 26-41 [DOI:10.1016/j.media.2007.06.004]
-
Bookstein F L. 1989. Principal warps:thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6): 567-585 [DOI:10.1109/34.24792]
-
Burgos N, Cardoso M J, Guerreiro F, Veiga C, Modat M, McClelland J, Knopf A C, Punwani S, Atkinson D, Arridge S R, Hutton B F and Ourselin S. 2015. Robust CT synthesis for radiotherapy planning: application to the head and neck region//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich: Springer: 476-484[DOI: 10.1007/978-3-319-24571-3_57]
-
Buslaev A, Parinov A, Khvedchenya E, Iglovikov V I and Kalinin A A. 2018. Albumentations: fast and flexible image augmentations[EB/OL].[2019-04-10].https://arxiv.org/pdf/1809.06839v1.pdf
-
Cao X H, Yang J H, Gao Y Z, Wang Q, Shen D G. 2018a. Region-adaptive deformable registration of CT/MRI pelvic images via learning-based image synthesis. IEEE Transactions on Image Processing, 27(7): 3500-3512 [DOI:10.1109/TIP.2018.2820424]
-
Cao X H, Yang J H, Zhang J, Wang Q, Yap P T, Shen D G. 2018b. Deformable image registration using a cue-aware deep regression network. IEEE Transactions on Biomedical Engineering, 65(9): 1900-1911 [DOI:10.1109/TBME.2018.2822826]
-
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial nets//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal: MIT Press: 2672-2680
-
Han X. 2017. MR-based synthetic CT generation using a deep convolutional neural network method. Medical Physics, 44(4): 1408-1419 [DOI:10.1002/mp.12155]
-
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE: 770-778[DOI: 10.1109/CVPR.2016.90]
-
Huang G, Liu Z, van der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE: 4700-4708[DOI: 10.1109/CVPR.2017.243]
-
Huynh T, Gao Y Z, Kang J Y, Wang L, Zhang P, Lian J, Shen D G. 2016. Estimating CT image from MRI data using structured random forest and auto-context model. IEEE Transactions on Medical Imaging, 35(1): 174-183 [DOI:10.1109/TMI.2015.2461533]
-
Johnson K A and Becker J A. 1999. The whole brain atlas[EB/OL].[2019-04-10]. http://www.med.harvard.edu/aanlib/home.html
-
Jolicoeur-Martineau A. 2018. The relativistic discriminator: a key element missing from standard GAN[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1807.00734.pdf
-
Kingma D P and Ba J. 2014. Adam: a method for stochastic optimization[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1412.6980.pdf
-
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe: Curran Associates Inc: 84-90[DOI: 10.1145/3065386]
-
Maes F, Collignon A, Vandermeulen D, Marchal G and Suetens P. 1996. Multi-modality image registration by maximization of mutual information//Proceedings of 1996 Workshop on Mathematical Methods in Biomedical Image Analysis. San Francisco: IEEE: 14-22[DOI: 10.1109/MMBIA.1996.534053]
-
Mao X D, Li Q, Xie H R, Lau R Y K and Wang Z. 2016. Multi-class generative adversarial networks with the L2 loss function[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1611.04076v1.pdf
-
Mirza M and Osindero S. 2014. Conditional generative adversarial nets[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1411.1784.pdf
-
Nie D, Trullo R, Lian J, Wang L, Petitjean C, Ruan S, Wang Q, Shen D G. 2018. Medical image synthesis with deep convolutional adversarial networks. IEEE Transactions on Biomedical Engineering, 65(12): 2720-2730 [DOI:10.1109/TBME.2018.2814538]
-
Öfverstedt J, Lindblad J, Sladoje N. 2019. Fast and robust symmetric image registration based on distances combining intensity and spatial information. IEEE Transactions on Image Processing, 28(7): 3584-3597 [DOI:10.1109/TIP.2019.2899947]
-
Oliveira F P M, Tavares J M R S. 2014. Medical image registration:a review. Computer Methods in Biomechanics and Biomedical Engineering, 17(2): 73-93 [DOI:10.1080/10255842.2012.670855]
-
Ratliff L J, Burden S A and Sastry S S. 2013. Characterization and computation of local Nash equilibria in continuous games//Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton). Monticello: IEEE: 917-924[DOI: 10.1109/Allerton.2013.6736623]
-
Roy S, Carass A, Shiee N, Pham D L and Prince J L. 2010. MR contrast synthesis for lesion segmentation//Proceedings of 2010 IEEE International Symposium on Biomedical Imaging: from Nano to Macro. Rotterdam: IEEE: 932-935[DOI: 10.1109/ISBI.2010.5490140]
-
Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A and Chen X. 2016. Improved techniques for training GANs[EB/OL].[2019-04-10]. https://arxiv.org/pdf/1606.03498.pdf
-
Sotiras A, Davatzikos C, Paragios N. 2013. Deformable medical image registration:a survey. IEEE Transactions on Medical Imaging, 32(7): 1153-1190 [DOI:10.1109/TMI.2013.2265603]
-
Tai Y, Yang J, Liu X M and Xu C Y. 2017. MemNet: a persistent memory network for image restoration//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice: IEEE: 4549-4557[DOI: 10.1109/ICCV.2017.486]
-
Tustison N J, Avants B B, Cook P A, Zheng Y J, Egan A, Yushkevich P A, Gee J C. 2010. N4ITK:improved N3 bias correction. IEEE Transactions on Medical Imaging, 29(6): 1310-1320 [DOI:10.1109/TMI.2010.2046908]
-
Vercauteren T, Pennec X, Perchant A, Ayache N. 2009. Diffeomorphic demons:efficient non-parametric image registration. NeuroImage, 45(S1): S61-S72 [DOI:10.1016/j.neuroimage.2008.10.040]
-
Xie D, Cao X H, Gao Y Z, Wang L and Shen D G. 2016. Estimating CT image from MRI data using 3D fully convolutional networks//Proceedings of the 1st International Workshop, LABELS 2016, and Second International Workshop on Deep Learning and Data Labeling for Medical Applications. Athens: Springer: 170-178[DOI: 10.1007/978-3-319-46976-8_18]
-
Ye C Y, Yang Z, Ying S H, Prince J L. 2015. Segmentation of the cerebellar peduncles using a random forest classifier and a multi-object geometric deformable model:application to spinocerebellar ataxia type 6. Neuroinformatics, 13(3): 367-381 [DOI:10.1007/s12021-015-9264-7]
-
Ye D H, Zikic D, Glocker B, Criminisi A and Konukoglu E. 2013. Modality propagation: coherent synthesis of subject-specific scans with data-driven regularization//Proceedings of the 16th International Conference on Medical Image Computing and Computer-Assisted Intervention. Nagoya: Springer: 606-613[DOI: 10.1007/978-3-642-40811-3_76]
-
Zhang Y L, Tian Y P, Kong Y, Zhong B N and Fu Y. 2018. Residual dense network for image super-resolution//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE: 2472-2481[DOI: 10.1109/CVPR.2018.00262]