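Deep learning techniques for medical images: the development from convolution to graph convolution

Tang Chaosheng1,2, Hu Chaochao1, Sun Junding1, Sima Haifeng1 (1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000; 2. Jiangsu Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou 215006)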

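Abstract
Deep learning techniques, represented by convolutional neural networks, have driven a series of breakthroughs in medical image research. However, theoretical assumptions such as translation invariance limit the expressive power of convolutional neural networks on non-Euclidean data, a bottleneck that deep learning for medical images urgently needs to overcome. Graph convolution not only solves the problem of modeling the topology of non-Euclidean data but also extracts spatial features, making it a new research direction for deep learning. This paper reviews the theory and applications of graph convolutional networks in medical imaging, aiming to systematically summarize the latest graph convolution theories, methods, and practices in the field. It covers: the specialized acquisition of medical images from a graph-structure perspective, pruning-based conversion of data structures, and feature clustering and reconstruction methods; the theoretical origins of graph convolutional networks, their major architectures, and their line of development; optimization directions for graph convolutional networks and the important mechanisms derived from them, such as skip connections, inception, and graph attention; and practical applications of graph convolutional networks in medical image segmentation, disease detection, and image reconstruction. Finally, it identifies bottlenecks that graph convolutional networks have yet to overcome in medical image analysis: 1) constructing heterogeneous graphs and optimizing learning tasks in multimodal medical image learning; 2) combining graph construction algorithms with neural architecture search during feature reconstruction and pooling, so that finding the optimal graph structure becomes a learnable process; 3) generating high-quality graph-structured annotated medical data at large scale and low cost, and designing generative adversarial network algorithms for that purpose. As artificial intelligence continues to advance and medical imaging datasets keep growing, deep learning methods represented by graph convolution are bound to achieve greater breakthroughs in computer-aided diagnosis.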
Keywords
Deep learning-based medical images analysis evolved from convolution to graph convolution

Tang Chaosheng1,2, Hu Chaochao1, Sun Junding1, Sima Haifeng1(1.School of Computer Science & Technology, Henan Polytechnic University, Jiaozuo 454000, China;2.Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou 215006, China)

Abstract
Convolutional neural networks (CNNs) have driven sustained progress in deep learning-based medical image research. However, the translation-invariance assumption limits the expressive power of CNNs on non-Euclidean data. Graph convolution addresses the topology modeling problem for non-Euclidean data while still extracting spatial features. This paper reviews the latest theories and applications of graph convolutional networks (GCNs) for medical image analysis in four parts: 1) graph-structure-based data transformation of medical images; 2) the theoretical development and network architecture of GCNs; 3) optimizations of GCNs and the mechanisms derived from them; 4) applications of GCNs in medical image segmentation, disease detection, and image reconstruction. First, graph-structure-based transformation of medical images is reviewed in terms of graph data acquisition, conversion, and reconstruction. Graph-structured medical data can be acquired directly with professional medical equipment, obtained by sparse pruning, or rebuilt with the K-nearest neighbor (KNN) algorithm. Graph reconstruction algorithms based on medical image features perform better than graph conversion algorithms that operate on the image data itself. Next, the key components of the GCN architecture are summarized: the graph convolutional layer, the graph regularization layer, the graph pooling layer, and the graph readout layer. The graph convolutional layer updates the nodes or edges of the graph, the graph regularization layer improves the generalization of the GCN, the graph pooling layer reduces the number of parameters, and the graph readout layer generates the representation of the whole graph.
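The KNN-based graph reconstruction, the graph convolutional layer, and the readout layer described above can be illustrated with a minimal sketch. It is not the implementation of any surveyed method: the function names, the widely used symmetric normalization D^{-1/2}(A + I)D^{-1/2}, the ReLU activation, and the mean readout are all assumptions chosen for brevity, and only the Python standard library is used.

```python
import math

def knn_adjacency(features, k):
    """Build a symmetric 0/1 adjacency matrix linking each node to its k nearest neighbours."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other nodes by Euclidean distance to node i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(features[i], features[j]))
        for j in order[:k]:
            adj[i][j] = adj[j][i] = 1.0  # symmetrise the edge
    return adj

def normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} (assumed normalization)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def mean_readout(h):
    """Average node embeddings into a single graph-level vector."""
    n = len(h)
    return [sum(col) / n for col in zip(*h)]

# Hypothetical toy features: two tight clusters of two nodes each.
feats = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
adj = knn_adjacency(feats, k=1)
a_norm = normalize_with_self_loops(adj)
identity_weights = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights
graph_vector = mean_readout(gcn_layer(a_norm, feats, identity_weights))
print(graph_vector)
```

In practice the weight matrix would be learned and the features would come from a CNN or from the imaging device; the toy values here only show how the three steps connect.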
Graph convolution falls into two categories: a) spectral graph convolution, which is defined through spectral graph theory; b) spatial graph convolution, which is defined through the connectivity of each node. Spectral graph convolution relies on the eigendecomposition of the graph Laplacian matrix and therefore suffers from high time complexity, poor portability, and a narrow range of application; it can be made more efficient with a Chebyshev polynomial approximation. The graph pooling layer effectively reduces the parameter size. The graph regularization layer improves the generalization of the model and alleviates over-fitting and over-smoothing. The graph convolutional layer extracts structural features, node features, and edge features, and all of these features must be aggregated to complete classification (this aggregation is called the readout operation, and its role is similar to that of the fully connected layer in a CNN). Third, the optimizations of GCNs and the mechanisms derived from them are summarized. For instance, the skip connection mechanism of DeepGCN alleviates the over-smoothing problem. The outputs of multiple GCNs can be integrated in an inception-style architecture to improve the representation ability of the model. The graph attention mechanism aggregates information from neighboring nodes with different weights. Adjacency matrix reconstruction improves GCN performance by learning the hidden structure of an unknown graph adjacency matrix. Fourth, the main applications of GCNs in medical image analysis are reviewed. For medical image segmentation, the usual graph construction algorithm takes each region of interest (ROI) as a node and the existence of a connection between ROIs as an edge.
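To make the spectral-domain point concrete, the following sketch applies a polynomial filter to a node signal using the Chebyshev recurrence T_k = 2 L~ T_{k-1} - T_{k-2}, so no eigendecomposition of the Laplacian is needed. The function names and the default `lambda_max=2.0` (a common approximation of the largest eigenvalue of the normalized Laplacian) are assumptions made for illustration; the coefficients `theta` would normally be learned.

```python
import math

def scaled_laplacian(adj, lambda_max=2.0):
    """Return L~ = (2 / lambda_max) * L - I, with L = I - D^{-1/2} A D^{-1/2}."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[(1.0 if i == j else 0.0)
            - (adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0)
            for j in range(n)] for i in range(n)]
    return [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def matvec(m, x):
    """Matrix-vector product for a list of rows and a list of values."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def chebyshev_filter(adj, x, theta):
    """Apply sum_k theta[k] * T_k(L~) x using the Chebyshev recurrence."""
    l_tilde = scaled_laplacian(adj)
    t_prev, t_curr = x, matvec(l_tilde, x)  # T_0 x and T_1 x
    out = [theta[0] * v for v in t_prev]
    if len(theta) > 1:
        out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        # T_k x = 2 * L~ * T_{k-1} x - T_{k-2} x
        t_next = [2.0 * a - b for a, b in zip(matvec(l_tilde, t_curr), t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

# Hypothetical two-node graph with a single edge and a one-hot signal.
adj = [[0.0, 1.0], [1.0, 0.0]]
print(chebyshev_filter(adj, [1.0, 0.0], theta=[1.0, 1.0, 1.0]))
```

A filter with K coefficients only mixes information from nodes at most K-1 hops apart, which is why this approximation is both cheaper and more localized than filtering in the eigenbasis.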
Some special imaging data, such as brain voxel data and coronary artery surface mesh data, are converted into a graph structure with the KNN algorithm. Model architectures have evolved from simple stacks of CNNs and GCNs to complex combinations of various models. Applications of GCNs to disease detection in medical images have so far focused mainly on brain images, where detection exploits the various relationships between objects. Current disease detection pipelines generally consist of three steps: 1) CNN models extract features from the original medical images; 2) the KNN algorithm or a graph attention algorithm reconstructs the features as a graph; 3) graph convolution mines the latent relationships between features for classification. In addition, GCNs have been used for brain magnetic resonance imaging (MRI) reconstruction, liver image reconstruction, heart image reconstruction, and other diagnostic tasks. In summary, GCNs effectively mine the generalized topological structure in image data for medical image segmentation, disease detection, and image reconstruction. The integrated deep learning architecture that uses a pre-trained CNN as the feature extractor and a GCN as the classifier addresses the shortage of graph-structured medical training samples and significantly improves the performance of deep learning in medical image analysis.
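The graph attention step mentioned for feature reconstruction can be sketched as a softmax-weighted aggregation over each node's neighborhood. This is a simplified illustration, not a surveyed model: the function name is hypothetical, and plain dot products stand in for the learned scoring function that graph attention networks use.

```python
import math

def attention_aggregate(h, neighbors):
    """Update each node as a softmax-weighted average of itself and its neighbours.

    Scores are plain dot products (an assumption for brevity); attention-based
    GCN variants learn the scoring function instead.
    """
    out = []
    for i, h_i in enumerate(h):
        idx = [i] + [j for j in neighbors[i] if j != i]
        scores = [sum(a * b for a, b in zip(h_i, h[j])) for j in idx]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * h[j][d] for w, j in zip(weights, idx))
                    for d in range(len(h_i))])
    return out

# Hypothetical example: two connected nodes with orthogonal features.
features = [[1.0, 0.0], [0.0, 1.0]]
neighbors = [[1], [0]]  # neighbors[i] lists the nodes adjacent to node i
print(attention_aggregate(features, neighbors))
```

Because the weights depend on feature similarity, more similar neighbors contribute more to the update, which is the "differentiated" aggregation that distinguishes attention from uniform neighborhood averaging.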
Keywords
