SA-TF-UNet: a Transformer and spatial attention mechanisms based hippocampus segmentation network
2023, Vol. 28, No. 10, pp. 3191-3202
Print publication date: 2023-10-16
DOI: 10.11834/jig.220567
Ou Yuxuan, Gao Min, Zhao Di, Liu Jun. 2023. SA-TF-UNet: a Transformer and spatial attention mechanisms based hippocampus segmentation network. Journal of Image and Graphics, 28(10): 3191-3202
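Objective
The hippocampal entorhinal cortex occupies only a small pixel volume, and such characteristics pose a major challenge for medical image segmentation. Taking into account both the morphology of the hippocampus and the workflow physicians follow when segmenting it, we propose a new hippocampus segmentation method that aims to segment the hippocampus accurately in clinical medical image processing and to assist the early diagnosis of Alzheimer's disease.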
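Method
We propose SA-TF-UNet (hippocampus segmentation network based on Transformer and spatial attention mechanisms), a U-shaped network model built on self-attention and spatial attention mechanisms. It is an end-to-end prediction network that takes a 3D MRI (magnetic resonance imaging) volume of arbitrary size as input and outputs class labels. SA-TF-UNet uses an encoder-decoder structure. The encoder consists purely of Transformer blocks and contains no convolution modules. Multi-head self-attention serves as the feature extractor in each Transformer block: the self-attention module models global information and extracts features from it. Extracting features with a Transformer is therefore consistent with the basic way physicians segment the hippocampus. The decoder uses simple convolution modules for upsampling. AG (attention gate) modules serve as the skip connections, automatically increasing the weight of the foreground and replacing the direct connections of conventional networks. To verify the effectiveness of AG, we ran experiments that add an AG to a single layer only and compared the results with those of adding AGs to all four layers of the network. To further investigate the source of the gating signal in the AG module, we designed two SA-TF-UNet variants in which the AG gating signal is the output of the Transformer block two layers deeper and three layers deeper, respectively, than the feature map entering the AG.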
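The role of the attention gate in a skip connection can be sketched as follows. The abstract gives no equations for the AG, so this sketch assumes the commonly used additive form: the encoder feature and the gating signal are each projected linearly, summed, passed through ReLU, collapsed to one score, and squashed by a sigmoid into a weight between 0 and 1 that rescales the encoder feature. The function name `attention_gate`, the weight arguments and the flattened list-of-positions layout are all hypothetical, and the gating signal is assumed to have already been resampled to the resolution of the skip feature.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attention_gate(skip, gate, w_x, w_g, psi, bias=0.0):
    """Additive attention gate over a flattened list of positions (assumed form).

    skip : encoder features, one channel vector per position
    gate : gating signal from a deeper block, one channel vector per position,
           assumed already resampled to the same positions as `skip`
    w_x, w_g : per-position linear maps (stand-ins for 1x1x1 convolutions),
               given as lists of rows, one row per hidden channel
    psi  : vector that collapses the hidden channels to a single score
    Returns (gated_features, attention_weights).
    """
    if len(skip) != len(gate):
        raise ValueError("skip and gate must cover the same positions")
    gated, alphas = [], []
    for x, g in zip(skip, gate):
        # ReLU(W_x x + W_g g), one value per hidden channel
        hidden = [max(0.0, _dot(rx, x) + _dot(rg, g)) for rx, rg in zip(w_x, w_g)]
        # Scalar attention weight in (0, 1) for this position
        alpha = _sigmoid(_dot(psi, hidden) + bias)
        alphas.append(alpha)
        # Rescale the encoder feature before it is passed to the decoder
        gated.append([alpha * xi for xi in x])
    return gated, alphas
```

Positions where the gating signal responds strongly receive a weight near 1 and pass through almost unchanged, while the rest are suppressed, which is how a gate of this kind can raise the relative weight of the foreground.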
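Results
To verify the effectiveness of SA-TF-UNet for hippocampus segmentation on a clinical dataset, experiments were conducted on a brain MRI dataset composed of MRI scans from patients with Alzheimer's disease. The best segmentation was obtained by the SA-TF-UNet model with AGs in all four layers, in which each AG takes its gating signal from the Transformer block one layer deeper than the AG feature map. SA-TF-UNet achieved Dice coefficients of 0.9001 and 0.9091 for the left and right hippocampus, respectively, a significant improvement over the semantic segmentation networks used for comparison, with Dice gains of 2.82% and 3.43%, respectively.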
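For readers unfamiliar with the metric, the Dice coefficient between a predicted mask and a reference mask can be computed as in the minimal sketch below. It assumes the standard definition, 2|P ∩ T| / (|P| + |T|), on binary masks flattened to sequences of 0/1. The abstract does not state how the metric was implemented, and the handling of two empty masks is a convention chosen here.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P & T| / (|P| + |T|) for two binary masks.

    pred, target: equal-length sequences of 0/1 (flattened masks).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention)
        return 1.0
    return 2.0 * intersection / total


# One overlapping voxel, two predicted voxels, one reference voxel
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))  # 2*1 / (2+1)
```

Because the hippocampus occupies so few voxels, an overlap measure of this kind is more informative than plain voxel accuracy, which would be dominated by the background.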
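Conclusion
A segmentation network that uses pure Transformer blocks as the encoder and incorporates a spatial attention mechanism effectively improves the accuracy of hippocampus segmentation in brain MRI.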
Objective
Early diagnosis of and intervention in Alzheimer's disease (AD) have high clinical and social value. The hippocampus is one of the earliest brain regions affected by AD, and its dysfunction is recognized as underlying a core feature of the disease: memory impairment. Manual analysis of magnetic resonance imaging (MRI) scans for AD is labor-intensive and time-consuming. Artificial intelligence (AI) techniques can help segment the hippocampus on MRI scans accurately and efficiently, and deep learning methods based on convolutional neural networks (CNNs) are commonly employed for this task. In the encoder, which performs down-sampling, convolutions of various kernel sizes contract the image and extract image features. The decoder then expands the encoded feature map back to the original spatial size of the input image, using transposed convolutions or bilinear interpolation for up-sampling. However, a convolution integrates context information only within its receptive field. All pixels outside the receptive field are ignored, even when they are correlated with pixels inside it, and redundant information is produced as a result. To design a better hippocampus segmentation network, we consider the natural characteristics of the hippocampus and the way it is segmented in clinical practice. Two characteristics matter: first, the shape of the hippocampus is irregular; second, it is very small, occupying only 0.0002 of all pixels in an MRI scan. Regarding the first, convolutions have difficulty extracting features from irregularly shaped objects because they extract only local features.
Regarding the second, an encoder may contain many feature extraction layers, so information about the hippocampus tends to be lost because it occupies so few pixels in the original image. A common way to segment small objects is to place a detection network in front of the segmentation network: the detector first locates a region of interest containing the hippocampus, and the semantic segmentation network is then applied only inside the bounding box. However, this approach repeats the same feature-learning process in both networks, so redundant computation is inevitable.
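The locality limit of convolutions described above can be illustrated with the standard receptive-field recurrence for stacked convolution and pooling layers. This is only an illustrative sketch: the layer stack `hypothetical_encoder` is an assumption chosen for the example and is not the configuration of any network discussed here.

```python
def receptive_field(layers):
    """Receptive-field width (in input pixels, along one axis) of the last
    layer in a stack of convolution/pooling layers.

    layers: iterable of (kernel_size, stride) pairs, ordered input to output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between neighbouring outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical encoder: two 3-wide convolutions followed by a stride-2
# pooling layer, repeated twice. Not taken from the paper.
hypothetical_encoder = [(3, 1), (3, 1), (2, 2)] * 2

print(receptive_field(hypothetical_encoder))  # 16
```

Each output position of this stack depends on a window only 16 pixels wide along each axis. Anything outside that window cannot influence it, however strongly it is correlated with the content inside.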
Method
To extract features effectively from irregularly shaped targets and to highlight target regions automatically, we recast medical image segmentation as a sequence-to-sequence prediction task. We develop SA-TF-UNet, a U-shaped network based on self-attention and spatial attention mechanisms. SA-TF-UNet has an encoder-decoder architecture in which the encoder consists purely of Transformer blocks, whose self-attention mechanisms enable global modeling. Attention gates (AGs) replace the plain concatenation in the skip connections of U-Net: each AG takes its gating signal from a deeper Transformer layer and automatically assigns larger weights to the target regions. To validate the effectiveness of the AGs, we compare networks that contain a single AG with a network that applies AGs to all four layers. To determine the best source of the gating signal for each AG, we also examine two variant structures, in which the gating signal is the output of the Transformer block two layers deeper and three layers deeper, respectively.
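The global modeling attributed to self-attention can be made concrete with a minimal single-head scaled dot-product sketch. Here every token (for instance an embedded image patch) is updated as a weighted mix of all tokens in the sequence. The function name `self_attention` and the projection matrices are hypothetical, and the sketch omits the multi-head split, normalization layers and learned parameters of a real Transformer block.

```python
import math


def _matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def _softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: sequence of token vectors (one row per token)
    w_q, w_k, w_v: projection matrices (hypothetical, normally learned)
    Returns (output_tokens, attention_weights).
    """
    q, k, v = _matmul(tokens, w_q), _matmul(tokens, w_k), _matmul(tokens, w_v)
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = _matmul(q, k_transposed)  # similarity of every token pair
    weights = [_softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return _matmul(weights, v), weights  # each output mixes all tokens
```

Each row of the attention weights spans the entire sequence, so a token from one part of the volume can draw on any other part in a single layer, unlike a convolution whose reach is bounded by its receptive field.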
Results
The proposed models are evaluated on a dataset of 54 clinical MRI scans from AD patients. The dataset is randomly divided into training and test sets at a ratio of 8:1. Three independent experiments are carried out and their results are averaged to reduce the effect of chance. Averaged over the three experiments, SA-TF-UNet achieves Dice coefficients of 0.9001 on the left hippocampus and 0.9091 on the right hippocampus, improvements of 2.82% and 3.37%, respectively. The two variant structures also reach Dice coefficients above 0.88.
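The split arithmetic can be checked with a short sketch: 54 scans at 8:1 give 48 training scans and 6 test scans. How the split was actually drawn is not stated, so the seeds, the helper name `split_scans`, and the choice to re-draw the split for each of the three experiments are assumptions made for illustration.

```python
import random


def split_scans(scan_ids, train_parts=8, test_parts=1, seed=0):
    """Randomly split scan identifiers into (train, test) at train_parts:test_parts."""
    ids = list(scan_ids)
    random.Random(seed).shuffle(ids)  # seeded so each run is reproducible
    n_test = len(ids) * test_parts // (train_parts + test_parts)
    return ids[n_test:], ids[:n_test]


# Assumption: each of the three experiments re-draws the split with its own seed.
for seed in (0, 1, 2):
    train_ids, test_ids = split_scans(range(54), seed=seed)
    print(seed, len(train_ids), len(test_ids))  # 48 training, 6 test scans
```

With only 6 test scans per run, a single split can be noticeably lucky or unlucky, which is why averaging over several independent runs is a sensible way to report the Dice coefficient.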
Conclusion
Combining self-attention with spatial attention improves the precision of hippocampus segmentation. The AG is most effective when its gating signal is the output of the Transformer block just one layer deeper.
Keywords: hippocampus; medical image processing; Transformer; spatial attention mechanism; semantic segmentation