Published: 2019-11-16    DOI: 10.11834/jig.190043    2019 | Volume 24 | Number 11    Medical Image Processing

Received: 2019-02-18; Revised: 2019-03-26; Preprint: 2019-04-02. Funding: National Natural Science Foundation of China (61871106). First author: Guo Tongyu, born 1995, male, master's student; research interests: MR brain image segmentation and deep learning. E-mail: 2389423323@qq.com. Wang Bo, male, master's student; research interest: MR brain image segmentation. E-mail: 1719428250@qq.com. Liu Yue, female, doctoral student; research interests: brain image segmentation, computer-aided diagnosis, and machine learning. E-mail: 18512478164@163.com. CLC number: TP751.1; Document code: A; Article ID: 1006-8961(2019)11-2009-12


Multi-channel fusion separable convolution neural networks for brain magnetic resonance image segmentation
Guo Tongyu1, Wang Bo1, Liu Yue1, Wei Ying1,2
1. College of Information Science and Engineering, Northeastern University, Shenyang 110004, China;
2. Key Laboratory of Medical Imaging Calculation of the Ministry of Education, Shenyang 110004, China
Supported by: National Natural Science Foundation of China (61871106)

# Abstract

Objective Convolutional neural networks (CNNs) perform excellently in brain magnetic resonance (MR) image segmentation because of their ability to extract deep feature information from images. However, most deep learning methods suffer from excessive parameter counts and inaccurate edge segmentation. To overcome these problems, this study proposes a multi-channel fusion separable convolutional neural network (MFSCNN). Method First, the weights of the brain structures and their edge pixels are increased in the training set so that the network acquires abundant features of the brain structures and their edges during training. The network is thereby forced to learn to segment the edge regions of the brain structures, which improves the segmentation accuracy of the entire structure. Second, residual units are introduced so that derivatives can be passed back through skip connections between the layers of the residual network. The network can thus be deepened while gradient vanishing is avoided, compensating for information lost during transmission. Depthwise separable convolutions replace the original convolution layers, trading width for depth: without changing the number of feature channels at each stage, the number of network parameters, the training cost, and the training time are all reduced. Finally, the feature information of different stages is merged and the channels are shuffled to obtain enhanced features containing both deep and shallow information, which are then fed into the network for training. The input feature information at each stage is richer, features are learned faster, and convergence is quicker, so the segmentation performance of the network is markedly improved.
Result On the IBSR dataset, the results of MFSCNN are compared with those of an ordinary convolutional neural network (CNN), a neural network with residual units (ResCNN), and a neural network with locally fully connected modules (DenseCNN). The network structure is divided into four stages, each of which is a specific unit. For training and testing, 75% of the samples are used as the training set and 25% as the test set. Dice and IOU (intersection over union) values measure segmentation accuracy: the Dice value measures the similarity between the segmentation result and the gold standard, and the IOU value reflects their degree of overlap. The results of MFSCNN are significantly better than those of CNN; in complex edge regions, segmentation improves markedly, with Dice and IOU increased by 0.9% to 6.6% and 1.3% to 9.7%, respectively. On smooth edge regions, MFSCNN outperforms the deeper networks ResCNN and DenseCNN in segmentation quality. Moreover, MFSCNN has only 50% of the parameters of ResCNN and 28% of those of DenseCNN, so it not only improves segmentation performance but also reduces computational cost and training time. Compared with existing methods on the IBSR, Hammers67n20, and LPBA40 datasets, the segmentation results of MFSCNN are better than those of other published methods, most notably for the hippocampus. Compared with the commonly used segmentation tools FIRST and FreeSurfer, the average Dice values of the putamen and caudate nucleus increase by 3.4% and 8%, respectively. Compared with the popular BrainSegNet and MS-CNN+LC (label consistency) methods, the values increase by 1.6% to 4.4% and 2.6% to 2.7%, respectively.
Conclusion The proposed MFSCNN forms a favorable initial training set for brain structure segmentation by increasing the weights of the brain structures of interest and their edge pixels. During training, depthwise separable convolution structures replace the original convolution layers, reducing the number of trainable parameters. The feature maps of each stage are merged and the channels shuffled to obtain enhanced features containing both deep and shallow information, improving the accuracy of the segmentation model. MFSCNN not only resolves the inaccurate segmentation of complex brain-structure edges by traditional CNNs but also improves on the inaccurate lateral-edge segmentation of ResCNN and DenseCNN. In addition, accurate segmentation results of MR brain images are obtained on different datasets. Because the regional contrast of MR images is low and the gray values of the structures are similar, the fused information extracted directly from MR images by MFSCNN can be further applied to other MR image segmentation tasks. Although MFSCNN achieves good results for deep brain structure segmentation, accuracy on discontinuous parts of the brain structures still needs improvement, mainly because the pixel types on the edges of these parts are complex and discontinuous. How to extract features that can segment complex edge contours with a deep convolutional network therefore remains a problem for future study.

# Key words

subcortical brain MR image segmentation; convolutional neural network (CNN); depthwise separable convolution; multi-channel fusion; channel shuffle

# 0 Introduction

Ronneberger et al. [6] proposed the U-Net architecture on the basis of the FCN (fully convolutional network); the method entered the ISBI medical image segmentation challenge and achieved very good segmentation results. Yoo et al. [7] used deep learning for feature learning and random forests for supervised classification, proposed the CEN (convolutional encoder networks) architecture on that basis, and combined it with U-Net [8] to segment multiple sclerosis lesions in the brain, improving segmentation accuracy.

Within the CNN (convolutional neural network) framework, Mehta et al. [9] proposed a method that fuses image patches at different scales to segment MR brain images. By combining axial, coronal, and sagittal views of the 3D brain with 3D image patches, global and local features are integrated, allowing MR brain images to be segmented more effectively. Mehta et al. [10] then proposed the M-Net network, which processes 3D data with 2D convolutions, making it faster than other deep learning algorithms while also improving accuracy.

1) Depthwise separable convolutions replace the ordinary convolution layers of the original CNN, reducing the number of network parameters and speeding up training. Residual modules are also incorporated to counter the vanishing-gradient problem that appears as the number of layers increases.

2) The output feature maps of each convolution unit are concatenated along the channel dimension and used as the input of subsequent convolution units, combining shallow and deep features. The concatenated feature maps are then channel-shuffled to increase the randomness of the feature input and avoid boundary effects.
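The parameter saving of point 1) can be sketched numerically. The following minimal Python illustration (channel counts and kernel size are hypothetical examples, biases omitted) compares a standard convolution against a depthwise separable one, which factors it into a per-channel spatial filter plus a 1×1 pointwise convolution:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution layer (no bias)."""
    return c_in * c_out * k * k

def sep_conv_params(c_in, c_out, k):
    """Depthwise separable convolution: one k x k filter per input
    channel (depthwise), then a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

# Hypothetical stage: 64 -> 128 channels, 3 x 3 kernels
std = conv_params(64, 128, 3)       # 64*128*9  = 73728
sep = sep_conv_params(64, 128, 3)   # 64*9 + 64*128 = 8768
print(std, sep, round(std / sep, 1))  # roughly 8.4x fewer parameters
```

The ratio grows with the number of output channels, which is why the replacement shrinks the network without changing the channel counts of each stage.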

# 1.1 Residual model

 $x_{L}=x_{l}+\sum\limits_{i=l}^{L-1} \mathrm{ReLU}\left(F\left(x_{i}, W_{i}\right)\right)$ (1)

 $\frac{\partial \varepsilon}{\partial x_{l}}=\frac{\partial \varepsilon}{\partial x_{L}}\left(1+\frac{\partial}{\partial x_{l}} \sum\limits_{i=l}^{L-1} F\left(x_{i}, W_{i}\right)\right)$ (2)
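The unrolled sum in Eq. (1) can be checked numerically: iterating the residual update $x_{i+1} = x_i + \mathrm{ReLU}(F(x_i, W_i))$ is the same as adding all branch outputs to $x_l$ at once. A small sketch, with $F$ taken as a hypothetical linear map purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda v: np.maximum(v, 0)

x0 = rng.standard_normal(4)                       # x_l
Ws = [rng.standard_normal((4, 4)) for _ in range(3)]

# Iterate the residual units: x_{i+1} = x_i + ReLU(W_i x_i)
x, branches = x0, []
for W in Ws:
    b = relu(W @ x)        # residual branch output
    branches.append(b)
    x = x + b              # identity skip connection

# Unrolled form of Eq. (1): x_L = x_l + sum of all branch outputs
assert np.allclose(x, x0 + sum(branches))
```

Because the identity term survives differentiation, the gradient in Eq. (2) always contains the additive constant 1, which is what prevents it from vanishing as the stack deepens.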

# 1.3 Multi-channel fusion network model

Although the residual structure in Section 1.1 enriches the feature information at each stage, globally the features of the different stages are barely related. As the network deepens, increasingly complex features can indeed distinguish details the human eye cannot, but this also means the original, directly interpretable image features are neglected. This paper therefore proposes a multi-channel fusion network model that concatenates the feature information of different stages along the channel dimension. The merged features are then channel-shuffled; the process is shown in Fig. 3. After shuffling, every sub-region of the feature maps contains features from different stages, achieving a true fusion of feature information across all stages while avoiding the limitations of network learning and improving the robustness of the network.
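The shuffle operation itself is a cheap reshape-transpose-reshape, as in ShuffleNet-style networks. A minimal NumPy sketch (the `(C, H, W)` layout and group count are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np

def channel_shuffle(x, groups):
    """Shuffle the channel dimension of a (C, H, W) feature map:
    view C as (groups, C // groups), swap the two group axes,
    and flatten back, interleaving channels from different groups."""
    c, h, w = x.shape
    assert c % groups == 0, "channel count must divide evenly into groups"
    return (x.reshape(groups, c // groups, h, w)
             .transpose(1, 0, 2, 3)
             .reshape(c, h, w))

# 6 channels tagged 0..5 (e.g. two concatenated 3-channel stages)
x = np.arange(6).reshape(6, 1, 1).repeat(2, 1).repeat(2, 2)
print(channel_shuffle(x, 2)[:, 0, 0])  # [0 3 1 4 2 5]
```

After the shuffle, channels from the two concatenated stages alternate, so any subsequent convolution sees a mix of deep and shallow features in every channel group.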

# 2.1 Dataset and preprocessing

 $D=\frac{2 T P}{2 T P+F P+F N}$ (3)

 $I=\frac{T P}{T P+F P+F N}$ (4)
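Both metrics in Eqs. (3) and (4) follow directly from set overlap: $2TP/(2TP+FP+FN) = 2|A \cap B|/(|A|+|B|)$ and $TP/(TP+FP+FN) = |A \cap B|/|A \cup B|$. A small sketch for binary masks (the toy 4-pixel masks are illustrative only):

```python
import numpy as np

def dice(pred, gt):
    """D = 2*TP / (2*TP + FP + FN), i.e. 2|A ∩ B| / (|A| + |B|)."""
    tp = np.logical_and(pred, gt).sum()
    return 2.0 * tp / (pred.sum() + gt.sum())

def iou(pred, gt):
    """I = TP / (TP + FP + FN), i.e. |A ∩ B| / |A ∪ B|."""
    inter = np.logical_and(pred, gt).sum()
    return inter / np.logical_or(pred, gt).sum()

pred = np.array([1, 1, 0, 0], dtype=bool)   # toy segmentation mask
gt   = np.array([1, 0, 1, 0], dtype=bool)   # toy gold standard
print(dice(pred, gt), iou(pred, gt))        # 0.5 0.333...
```

Dice is always at least as large as IOU for the same pair of masks, which is why the Dice columns in the tables below sit above the corresponding IOU columns.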

# 2.3 Comparison experiments

1) Ordinary convolutional neural network model (CNN). Four stages are set, each containing one convolution layer and one BN/ReLU layer. An ordinary convolutional neural network cannot be made very deep, because as the number of layers increases, the vanishing-gradient problem appears and the performance of the network instead degrades.

2) Neural network model with residual blocks (ResCNN). Residual blocks are introduced mainly to solve the vanishing-gradient problem; they also allow the network to be deepened, yielding deeper features. Each module of the CNN structure is replaced by a combination of a convolution layer and a residual unit, making the network deeper and the feature information at each stage richer, as shown in Fig. 9.

ResCNN has four stages, each consisting of one convolution layer, one residual module, and a BN/ReLU layer; each residual module contains two convolution layers.

3) Neural network model with locally fully connected modules (DenseCNN). Building on ResCNN, this model increases the number of convolution layers inside each unit and skip-connects the shallow outputs to every later layer, adding them to the deep outputs to form the input of the next layer, so that the input of every layer in a unit contains the output information of all preceding layers. Compared with the residual module, the locally fully connected module has more convolution layers and extracts more and finer features, but its drawback is an excessively large number of parameters, as shown in Fig. 10.

# 2.4 Analysis of experimental results

Table 1 Number of network model parameters (×10⁶)

| | CNN | ResCNN | DenseCNN | MFSCNN |
| --- | --- | --- | --- | --- |
| Parameters | 3 | 5.99 | 10.69 | 3.03 |

Table 2 Segmentation results of MFSCNN compared with the other network models

| Structure | Dice (CNN) | Dice (ResCNN) | Dice (DenseCNN) | Dice (MFSCNN) | IOU (CNN) | IOU (ResCNN) | IOU (DenseCNN) | IOU (MFSCNN) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Left hippocampus | 0.825 | 0.827 | 0.831 | **0.834** | 0.702 | 0.705 | 0.711 | **0.715** |
| Right hippocampus | 0.817 | **0.827** | 0.813 | **0.827** | 0.691 | 0.685 | 0.684 | **0.705** |
| Left putamen | 0.884 | 0.881 | 0.886 | **0.900** | 0.794 | 0.805 | 0.812 | **0.818** |
| Right putamen | 0.889 | 0.887 | 0.900 | **0.901** | 0.802 | 0.798 | **0.817** | **0.817** |
| Left caudate nucleus | 0.848 | 0.853 | 0.861 | **0.890** | 0.738 | 0.744 | 0.756 | **0.800** |
| Right caudate nucleus | 0.824 | 0.843 | 0.827 | **0.890** | 0.705 | 0.731 | 0.708 | **0.802** |
| Average | 0.845 | 0.852 | 0.853 | **0.874** | 0.739 | 0.745 | 0.748 | **0.776** |

Note: bold values indicate the best results.

Although DenseCNN is less accurate than ResCNN on the right hippocampus and right caudate nucleus, it is more accurate than ResCNN on all other brain structures: DenseCNN has more layers and more parameters than ResCNN, learns richer feature information, and its segmentation accuracy improves accordingly. However, the excessive number of parameters makes training harder, and the model is more difficult to train to the optimum than an ordinary network.

# 2.5 Comparison with existing mainstream methods

1) Hammers dataset. The average Dice value of MFSCNN for the brain structures (putamen, caudate nucleus, hippocampus) is 0.898, better than the other methods reported on this dataset in recent years. The compared methods are Nonlocal-PBM [11], Sparse-PBM [12], the multi-scale-feature atlas fusion method of Wu et al. [13], the similarity-estimation method of Cardoso et al. [14], and BrainSegNet [15]. Table 3 lists the Dice values of each method. The mean Dice of the proposed method is higher than that of the other five methods for every brain structure; the caudate nucleus result is close to that of BrainSegNet, but the hippocampus and putamen results are both better.

Table 3 Comparison of segmentation results on the Hammers67n20 dataset

| Structure | Nonlocal-PBM | Sparse-PBM | Wu | Cardoso | BrainSegNet | Ours |
| --- | --- | --- | --- | --- | --- | --- |
| Hippocampus | 0.823 | 0.840 | 0.846 | 0.842 | 0.840 | **0.868** |
| Putamen | 0.874 | 0.888 | 0.895 | 0.891 | 0.890 | **0.919** |
| Caudate nucleus | 0.885 | 0.889 | 0.892 | 0.892 | 0.900 | **0.906** |
| Average | 0.861 | 0.872 | 0.878 | 0.875 | 0.876 | **0.898** |

Note: bold values indicate the best results.

2) LPBA40 dataset. The average Dice value of MFSCNN for the brain structures (putamen, caudate nucleus, hippocampus) on the LPBA40 dataset is 0.877, better than the other methods on this dataset. The compared methods are the feature-sensitive label fusion method with random walker for atlas-based image segmentation proposed by Bao et al. [16], the HLAF (hierarchical learning of atlas forests) method of Zhang et al. [17], the MS-CNN (multi-scale structured CNN) and MS-CNN+LC (multi-scale structured CNN with label consistency) methods [18], the method of Prasad based on a PCA atlas and nonrigid registration [19], and BrainSegNet [15]. Table 4 lists the Dice values of each method.

Table 4 Comparison of segmentation results on the LPBA40 dataset

| Structure | Bao | Zhang | MS-CNN | MS-CNN+LC | Prasad | BrainSegNet | Ours |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Hippocampus | 0.849 | 0.810 | 0.827 | 0.839 | 0.828 | 0.830 | **0.885** |
| Putamen | 0.858 | 0.817 | 0.850 | **0.860** | 0.842 | 0.840 | 0.858 |
| Caudate nucleus | 0.867 | 0.806 | 0.851 | 0.851 | 0.823 | 0.840 | **0.887** |
| Average | 0.858 | 0.811 | 0.843 | 0.850 | 0.831 | 0.837 | **0.877** |

Note: bold values indicate the best results.

3) IBSR dataset. The average Dice value of MFSCNN for the brain structures (putamen, caudate nucleus, hippocampus) on the IBSR dataset is 0.872. The compared methods are MS-CNN [18], MS-CNN+LC (label consistency) [18], M-net [10], BrainSegNet [15], HLAF [17], and the feature-sensitive label fusion method with random walker [16]. Table 5 lists the Dice values of each method.

Table 5 Comparison of segmentation results on the IBSR dataset

| Structure | FIRST | FreeSurfer | MS-CNN | BrainSegNet | MS-CNN+LC | M-net | Bao | Ours |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hippocampus | 0.811 | 0.764 | 0.788 | 0.820 | 0.817 | 0.820 | 0.814 | **0.830** |
| Putamen | 0.875 | 0.809 | 0.875 | **0.910** | 0.882 | 0.900 | 0.887 | 0.900 |
| Caudate nucleus | 0.827 | 0.803 | 0.849 | 0.870 | 0.870 | 0.870 | 0.849 | **0.890** |
| Average | 0.838 | 0.792 | 0.837 | 0.867 | 0.856 | 0.863 | 0.850 | **0.872** |

Note: bold values indicate the best results.

# References

• [1] Geuze E, Vermetten E, Bremner J D. MR-based in vivo hippocampal volumetrics:2. Findings in neuropsychiatric disorders[J]. Molecular Psychiatry, 2005, 10(2): 160–184. [DOI:10.1038/sj.mp.4001579]
• [2] Duan H Q, Shu X H, Xu J, et al. A novel computer aided Alzheimer's analysis approach based on regions of interests of PiB PET images[J]. Chinese Journal of Biomedical Engineering, 2016, 35(6): 641–647. [DOI:10.3969/j.issn.0258-8021.2016.06.001]
• [3] Jiang X L, Zhou Z Z, Ding X K, et al. Level set based hippocampus segmentation in MR images with improved initialization using region growing[J]. Computational and Mathematical Methods in Medicine, 2017, 2017: #5256346.
• [4] Tang S Y, Xing J F, Yang M. New method for medical image segmentation based on BP neural network[J]. Computer Science, 2017, 44(S1): 240–243.
• [5] Scherrer B, Forbes F, Garbay C, et al. Distributed local MRF models for tissue and structure brain segmentation[J]. IEEE Transactions on Medical Imaging, 2009, 28(8): 1278–1295. [DOI:10.1109/TMI.2009.2014459]
• [6] Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation[C]//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer, 2015: 234-241.[DOI: 10.1007/978-3-319-24574-4_28]
• [7] Yoo Y, Brosch T, Traboulsee A, et al. Deep learning of image features from unlabeled data for multiple sclerosis lesion segmentation[C]//Proceedings of the 5th International Workshop Machine Learning in Medical Imaging. Boston, MA, USA: Springer, 2014: 117-124.[DOI: 10.1007/978-3-319-10581-9_15]
• [8] Brosch T, Tang L Y W, Yoo Y, et al. Deep 3D convolutional encoder networks with shortcuts for multiscale feature integration applied to multiple sclerosis lesion segmentation[J]. IEEE Transactions on Medical Imaging, 2016, 35(5): 1229–1239. [DOI:10.1109/TMI.2016.2528821]
• [9] Mehta R, Majumdar A, Sivaswamy J. BrainSegNet:a convolutional neural network architecture for automated segmentation of human brain structures[J]. Journal of Medical Imaging, 2017, 4(2): 024003. [DOI:10.1117/1.JMI.4.2.024003]
• [10] Mehta R, Sivaswamy J. M-net: a convolutional neural network for deep brain structure segmentation[C]//Proceedings of 2017 IEEE 14th International Symposium on Biomedical Imaging. Melbourne, VIC, Australia: IEEE, 2017.[DOI: 10.1109/ISBI.2017.7950555]
• [11] Hammers A, Allom R, Koepp M J, et al. Three-dimensional maximum probability atlas of the human brain, with particular reference to the temporal lobe[J]. Human Brain Mapping, 2003, 19(4): 224–247. [DOI:10.1002/hbm.10123]
• [12] Hammers A, Chen C H, Lemieux L, et al. Statistical neuroanatomy of the human inferior frontal gyrus and probabilistic atlas in a standard stereotaxic space[J]. Human Brain Mapping, 2007, 28(1): 34–48. [DOI:10.1002/hbm.20254]
• [13] Wu G R, Shen D G. Hierarchical label fusion with multiscale feature representation and label-specific patch partition[C]//Proceedings of the 17th International Conference on Medical Image Computing and Computer-Assisted Intervention. Boston, MA, USA: Springer, 2014: 299-306.[DOI: 10.1007/978-3-319-10404-1_38]
• [14] Cardoso M J, Leung K, Modat M, et al. STEPS:similarity and truth estimation for propagated segmentations and its application to hippocampal segmentation and brain parcelation[J]. Medical Image Analysis, 2013, 17(6): 671–684. [DOI:10.1016/j.media.2013.02.006]
• [15] Mehta R, Majumdar A, Sivaswamy J. BrainSegNet:a convolutional neural network architecture for automated segmentation of human brain structures[J]. Journal of Medical Imaging, 2017, 4(2): 024003. [DOI:10.1117/1.JMI.4.2.024003]
• [16] Bao S Q, Chung A C S. Feature sensitive label fusion with random walker for atlas-based image segmentation[J]. IEEE Transactions on Image Processing, 2017, 26(6): 2797–2810. [DOI:10.1109/TIP.2017.2691799]
• [17] Zhang L C, Wang Q, Gao Y Z, et al. Automatic labeling of MR brain images by hierarchical learning of atlas forests[J]. Medical Physics, 2016, 43(3): 1175–1186. [DOI:10.1118/1.4941011]
• [18] Bao S Q, Chung A C S. Multi-scale structured CNN with label consistency for brain MR image segmentation[J]. Computer Methods in Biomechanics and Biomedical Engineering:Imaging & Visualization, 2018, 6(1): 113–117. [DOI:10.1080/21681163.2016.1182072]
• [19] Prasad G. Segmentation of 3D MR images of the brain using a PCA atlas and nonrigid registration[D]. Los Angeles: University of California, 2010.