基于深度学习的医学图像分割方法综述
A review of deep learning-based medical image segmentation methods
2024年, 页码: 1-26
网络出版日期: 2024-12-23
DOI: 10.11834/jig.240467
石军, 王天同, 朱子琦, 等. 基于深度学习的医学图像分割方法综述[J]. 中国图象图形学报,
Shi Jun, Wang Tiantong, Zhu Ziqi, et al. A review of deep learning-based medical image segmentation methods[J]. Journal of Image and Graphics,
医学图像分割是临床医学图像分析的重要组成部分,其目标是准确识别和分割医学图像中的人体解剖结构或病灶等感兴趣区域,从而为临床疾病的诊断、治疗规划以及术后评估等应用场景提供客观、量化的决策依据。近年来,随着可用标注数据规模的不断增长,基于深度学习的医学图像分割方法得以迅速发展,并展现出远超传统图像分割方法的精度和鲁棒性,目前已成为该领域的主流技术。为了进一步提高分割精度,大量的研究集中在对分割模型的结构改进上,产生了一系列结构迥异的分割方法。总的来说,现有的基于深度学习的医学图像分割方法从模型结构上可以分为三类:基于卷积神经网络(convolutional neural network, CNN)、基于视觉Transformer以及基于视觉Mamba。其中,以U-Net为代表的基于CNN的方法最早被广泛应用于各类医学图像分割任务。这类方法一般以卷积操作为核心,能够有效地提取图像的局部特征。相比之下,基于视觉Transformer的方法则更擅长捕捉全局信息和长距离依赖关系,从而能够更好地处理复杂的上下文信息。基于视觉Mamba的方法作为一种新兴架构,因其具有全局感受野和线性计算复杂度的特点,表现出了巨大的应用潜力。为了深入了解基于深度学习的医学图像分割方法的发展脉络、优势与不足,本文对现有方法进行了系统梳理和综述。首先,简要回顾了上述三类主流分割方法的结构演进历程,分析了不同方法的结构特点、优势与局限性。然后,从算法结构、学习方法和任务范式等多个方面深入探讨了医学图像分割领域面临的主要挑战及机遇。最后,对基于深度学习的医学图像分割方法未来的发展方向和应用前景进行了深入分析和讨论。
Medical image segmentation is a crucial component of clinical medical image analysis, aimed at accurately identifying and delineating anatomical structures or regions of interest, such as lesions, within medical images. This provides objective and quantitative support for decision-making in disease diagnosis, treatment planning, and postoperative evaluation. In recent years, the rapid growth of available annotated data has facilitated the swift development of deep learning-based medical image segmentation methods, which demonstrate superior accuracy and robustness compared to traditional segmentation techniques, thereby becoming the mainstream technology in the field. To further enhance segmentation accuracy, extensive research has focused on improving the structural designs of segmentation models, resulting in a variety of distinct segmentation approaches. Current deep learning-based medical image segmentation methods can be classified into three main structural categories: Convolutional Neural Networks (CNNs), Vision Transformers, and Vision Mamba. As a representative neural network architecture, CNNs effectively capture spatial features in images through their unique local receptive fields and weight-sharing mechanisms, making them particularly suitable for image analysis and processing tasks. Since 2015, CNN-based methods, exemplified by U-Net, have dominated the field of medical image segmentation, consistently achieving state-of-the-art performance across various downstream segmentation tasks. To further improve segmentation accuracy, many studies have focused on modifying and innovating the U-Net structure, leading to a series of derived segmentation methods. However, the inherent limitations of convolutional operators, particularly their local receptive fields, restrict these methods' ability to capture global contextual dependencies, especially when handling complex medical images and fine-grained segmentation targets.
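The U-Net design described above — a convolutional encoder, a downsampling path, and a decoder whose stages are linked to the encoder by skip connections — can be illustrated with a minimal sketch. This is not code from any surveyed method; the class, layer widths, and input sizes are illustrative assumptions, reduced to a single resolution level for brevity:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, as in a typical U-Net encoder/decoder stage
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """A single-level U-Net sketch: one downsampling step, one skip connection."""
    def __init__(self, in_ch=1, num_classes=2, base=16):
        super().__init__()
        self.enc = conv_block(in_ch, base)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base, base * 2)
        self.up = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec = conv_block(base * 2, base)   # consumes upsampled + skip features
        self.head = nn.Conv2d(base, num_classes, kernel_size=1)

    def forward(self, x):
        e = self.enc(x)                         # local features at full resolution
        b = self.bottleneck(self.pool(e))       # coarser, more semantic features
        d = self.up(b)                          # restore spatial resolution
        d = self.dec(torch.cat([d, e], dim=1))  # skip connection reinjects detail
        return self.head(d)                     # per-pixel class logits

model = TinyUNet()
out = model(torch.randn(1, 1, 64, 64))
print(tuple(out.shape))  # (1, 2, 64, 64): one logit map per class
```

The convolutions here make the local-receptive-field limitation concrete: each output pixel depends only on a small neighborhood of the input, which is precisely what the global-context architectures discussed next aim to overcome.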
While techniques such as attention mechanisms and specialized convolutions have somewhat alleviated this issue and enhanced the model's focus on global information, their effectiveness remains limited. Since 2020, researchers have begun to introduce Transformer architectures, originally developed in the natural language processing (NLP) domain, into computer vision tasks, including medical image segmentation. Vision Transformers utilize self-attention mechanisms to effectively model global dependencies, significantly improving the quality of semantic feature extraction and facilitating the segmentation of complex medical images. Transformer-based methods for medical image segmentation mainly include hybrid approaches that combine Transformers with CNNs and pure Transformer methods, each with its own advantages and disadvantages. Hybrid approaches leverage CNNs' strengths in local feature extraction alongside Transformers' capabilities in modeling global context, thereby enhancing segmentation accuracy while maintaining computational efficiency. However, these methods remain dependent on CNN structures, which may limit their performance in complex scenarios. In contrast, pure Transformer methods excel in capturing long-range dependencies and multiscale features, significantly improving segmentation accuracy and generalization. Nevertheless, pure Transformer architectures typically require substantial computational resources and high-quality training data, which poses a challenge given the difficulty of obtaining large-scale annotated datasets in the medical field. Despite the notable advantages of Transformer structures in capturing long-range dependencies and global contextual information, their computational complexity grows quadratically with the length of the input sequence, limiting their applicability in resource-constrained environments. To overcome this challenge, researchers are developing new methods capable of modeling global dependencies with linear time complexity.
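The quadratic complexity noted above can be seen directly in a bare-bones sketch of scaled dot-product self-attention. This is a simplification, not a ViT implementation: the learned query/key/value projections are replaced by the identity for brevity, and the token count and embedding size are arbitrary assumptions:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over n tokens of dimension d.

    The n x n score matrix is the source of the quadratic cost: for an
    image split into n patches, time and memory grow as O(n^2 * d).
    """
    n, d = X.shape
    # Real Vision Transformers learn projections Wq, Wk, Wv; identity here.
    Q, K, V = X, X, X
    scores = Q @ K.T / np.sqrt(d)            # (n, n): every token attends to every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable row softmax
    return weights @ V                        # (n, d): global mixture of all tokens

X = np.random.randn(16, 8)   # e.g., 16 patch tokens with 8-dim embeddings
out = self_attention(X)
print(out.shape)  # (16, 8)
```

Every output token is a weighted sum over all input tokens, which is what gives Transformers their global receptive field, and also why doubling the number of patches quadruples the attention cost.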
Mamba introduces a novel selective state-space model that employs a selection mechanism, hardware-aware algorithms, and a simpler architecture to significantly reduce computational complexity while maintaining efficient long-sequence modeling performance. Consequently, since 2024, numerous studies have begun to apply the Mamba structure to medical image segmentation tasks, achieving promising results and showing potential to replace Transformer structures. Hybrid methods combining Mamba with CNNs can effectively enhance segmentation accuracy and robustness by integrating CNNs' feature extraction capabilities with Mamba's handling of long-range dependencies, although the integration may increase computational complexity. Pure Mamba methods, in turn, are better suited to segmentation tasks requiring global contextual information, but they still face limitations in capturing spatial image features and may demand greater computational resources during training. In summary, this paper presents the first systematic review and analysis, from a structural perspective, of the development trajectory, advantages, and limitations of deep learning-based medical image segmentation methods. First, we categorize all surveyed methods into three structural classes. We then provide a brief overview of the structural evolution of different segmentation methods, analyzing their structural characteristics, strengths, and weaknesses. Subsequently, we delve into the major challenges and opportunities currently facing the field of medical image segmentation from multiple perspectives, including algorithm structure, learning methods, and task paradigms. Finally, we conduct an in-depth analysis and discussion of future development directions and application prospects.
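The linear-time behavior that distinguishes state-space models from attention can be sketched as a discretized recurrence. This is a deliberately simplified illustration, not Mamba itself: the matrices A, B, C are fixed here, whereas Mamba's selection mechanism makes them input-dependent, and all shapes and values below are assumptions for demonstration:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Discretized state-space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.

    A single pass over the sequence updates a fixed-size hidden state, so the
    cost is O(n * d_state) -- linear in sequence length n, in contrast to the
    O(n^2) attention matrix of a Transformer.
    """
    d_state = A.shape[0]
    h = np.zeros(d_state)
    ys = []
    for x_t in x:                    # one linear scan over the sequence
        h = A @ h + B * x_t          # the state carries a compressed history
        ys.append(C @ h)             # readout from the current state
    return np.array(ys)

n, d_state = 100, 4
A = 0.9 * np.eye(d_state)            # stable, decaying dynamics
B = np.ones(d_state)
C = np.ones(d_state) / d_state
y = ssm_scan(np.random.randn(n), A, B, C)
print(y.shape)  # (100,)
```

Because each step touches only the fixed-size state, doubling the sequence length merely doubles the cost, which is the property that makes Mamba-style models attractive for long 3D medical volumes.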
深度学习; 医学图像分割; 卷积神经网络; 视觉Transformer; 视觉Mamba
deep learning; medical image segmentation; convolutional neural network; vision Transformer; vision Mamba
Almajalid R, Shan J, Du Y, and Zhang M. 2018. Development of a deep-learning-based method for breast ultrasound image segmentation.//2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE:1103-1108.[DOI:10.1109/ICMLA.2018.00179http://dx.doi.org/10.1109/ICMLA.2018.00179]
Alom M Z, Hasan M, Yakopcic C, Taha T M, and Asari V K. 2018. Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/1802.06955https://arxiv.org/abs/1802.06955
Antonelli M, Reinke A, Bakas S, Farahani K, Kopp-Schneider A, Landman B A and Cardoso M J. 2022. The medical segmentation decathlon. Nature communications, 13(1): 4128. [DOI:10.1038/s41467-022-30695-9http://dx.doi.org/10.1038/s41467-022-30695-9]
Azad R, Heidari M, and Shariatnia M. 2022. Transdeeplab: Convolution-free transformer-based deeplab v3+ for medical image segmentation. International Workshop on PRedictive Intelligence In MEdicine. 91-102.[DOI: 10.1007/978-3-031-16919-9_9http://dx.doi.org/10.1007/978-3-031-16919-9_9]
Bahdanau D, Cho K, and Bengio Y. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/1409.0473https://arxiv.org/abs/1409.0473
Butoi V I, Ortiz J J G, Ma T, Sabuncu M R, Guttag J, and Dalca A V. 2023. Universeg: Universal medical image segmentation.//Proceedings of the IEEE/CVF International Conference on Computer Vision:21438-21451. [DOI:10.1109/ICCV51070.2023.01960http://dx.doi.org/10.1109/ICCV51070.2023.01960]
Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q and Wang M. 2022. Swin-unet: Unet-like pure transformer for medical image segmentation//Proceedings of the European conference on computer vision. Cham: Springer Nature Switzerland: 205-218 [DOI: 10.1007/978-3-031-19815-2_13http://dx.doi.org/10.1007/978-3-031-19815-2_13]
Chang Y, Menghan H, Guangtao Z, and Xiao-Ping Z. 2021. Transclaw u-net: Claw u-net with transformers for medical image segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2107.05188https://arxiv.org/abs/2107.05188
Chen L C, Papandreou G, Kokkinos I, Murphy K, and Yuille A L. 2017. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848.[DOI: 10.1109/TPAMI.2017.2699184http://dx.doi.org/10.1109/TPAMI.2017.2699184]
Chen L C, Zhu Y, Papandreou G, Schroff F, and Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), 801-818. [DOI: 10.1007/978-3-030-01234-2_49http://dx.doi.org/10.1007/978-3-030-01234-2_49]
Chen J J, Lu Y Y, Yu Q H, Luo X D, Adeli E, Wang Y, Lu L, Yuille A L, and Zhou Y Y. 2021. Transunet: Transformers make strong encoders for medical image segmentation[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2102.04306https://arxiv.org/abs/2102.04306
Chen Y, Yin M, Li Y and Cai Q. CSU-Net: 2022. A CNN-transformer parallel network for multimodal brain tumour segmentation. Electronics, 11(14): 2226.[DOI:/10.3390/electronics11142226http://dx.doi.org//10.3390/electronics11142226]
Chen Y P, Dai X Y, Chen D D, Liu M C, Dong X Y, Yuan L and Liu Z C. 2022. Mobile-former: bridging mobilenet and Transformer//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE:5260-5269[DOI:10.1109/CVPR52688.2022.00520http://dx.doi.org/10.1109/CVPR52688.2022.00520]
Çiçek Ö, Abdulkadir A, Lienkamp SS. 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Lecture Notes in Computer Science. 424-432. [DOI:10.1007/978-3-319-46723-8_49http://dx.doi.org/10.1007/978-3-319-46723-8_49]
Dai J F, Qi H Z, Xiong Y W, Li Y, Zhang G D, Hu H, and Wei Y C. 2017. Deformable convolutional networks//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE: 764-773 [DOI: 10.1109/ICCV.2017.89http://dx.doi.org/10.1109/ICCV.2017.89]
Dao T, Fu D, Ermon S, Rudra A and Ré C. 2022. Flashattention: Fast and memory-efficient exact attention with io-awareness. Advances in Neural Information Processing Systems, 35: 16344-16359 [DOI: 10.48550/arXiv.2205.14135http://dx.doi.org/10.48550/arXiv.2205.14135]
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, and Houlsby N. 2020. An image is worth16x16 words: Transformers for image recognition at scale[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2010.11929https://arxiv.org/abs/2010.11929
Du G T, Cao X, Liang J M, Chen X L, and Zhan Y H. 2020. Medical Image Segmentation based on U-Net: A Review. Journal of Imaging Science & Technology, 64(2). [DOI:/10.3390/su13031224http://dx.doi.org//10.3390/su13031224]
Fu X, Bi L, Kumar A, Fulham M., and Kim J. 2021. Multimodal spatial attention module for targeting multimodal PET-CT lung tumor segmentation. IEEE Journal of Biomedical and Health Informatics, 25(9), 3507-3516. [DOI: 10.1109/JBHI.2021.3059453http://dx.doi.org/10.1109/JBHI.2021.3059453]
Fu Z J, Li J J and Hua Z. 2023 MSA-Net: Multiscale spatial attention network for medical image segmentation. Alexandria Engineering Journal, 70: 453-473.[DOI:/10.1016/j.aej.2023.02.039http://dx.doi.org//10.1016/j.aej.2023.02.039]
Gan X L, Wang L D, Chen Q, Ge Y J and Duan S K. 2021. GAU-Net: U-Net based on global attention mechanism for brain tumor segmentation//Journal of Physics: Conference Series. IOP Publishing, 1861(1): 012041
Gao Y, Zhou M, and Metaxas D N. 2021. UTNet: a hybrid transformer architecture for medical image segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, Part III24, 61-71. [DOI: 10.1007/978-3-030-87234-2_6http://dx.doi.org/10.1007/978-3-030-87234-2_6]
Gu A, Goel K, and Ré C. 2021. Efficiently modeling long sequences with structured state spaces[EB/OL]. [2021-11-01]. https://arxiv.org/abs/2111.00396https://arxiv.org/abs/2111.00396
Gu A, and Dao T. 2023. Mamba: Linear-time sequence modeling with selective state spaces[EB/OL]. [2023-12-01]. https://arxiv.org/abs/2312.00752https://arxiv.org/abs/2312.00752
Gu Z W, Cheng J, Fu H Z, Zhou K, Hao H Y, Zhao Y T, Zhang T Y, Gao S H, and Liu J. 2019. Ce-net: Context encoder network for 2d medical image segmentation. IEEE Transactions on Medical Imaging, 38(10), 2281-2292. [DOI: 10.1109/TMI.2019.2912165http://dx.doi.org/10.1109/TMI.2019.2912165]
Guo C, Szemenyei M, and Yi Y. 2021. Sa-unet: Spatial attention u-net for retinal vessel segmentation. 2020 25th International Conference on Pattern Recognition (ICPR), 1236-1242. IEEE. [DOI: 10.1109/ICPR48806.2021.9412815http://dx.doi.org/10.1109/ICPR48806.2021.9412815]
Guo J, Zhou H Y, Wang L, and Yu Y. 2022. UNet-2022: Exploring dynamics in non-isomorphic architecture. In International Conference on Medical Imaging and Computer-Aided Diagnosis, 465-476. Singapore: Springer Nature Singapore. [DOI: 10.1007/978-981-16-7954-0_42http://dx.doi.org/10.1007/978-981-16-7954-0_42]
Guo Y R, Jiang J G, Hao S J, Zhan S, and Li H. 2013. Medical image segmentation based on statistical similarity feature[J]. Journal of Image and Graphics, 18(2):225-234
郭艳蓉, 蒋建国, 郝世杰, 詹曙, 李鸿. 2013. 统计相似度特征的医学图像分割. 中国图象图形学报, 18(2):225-234 [DOI: 10.11834/jig.20130215http://dx.doi.org/10.11834/jig.20130215]
Hatamizadeh A, Tang Y, Nath V, Yang D, Myronenko A, Landman B, ... and Xu D. 2022. Unetr: Transformers for 3d medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 574-584. [DOI: 10.1109/WACV51458.2022.00150http://dx.doi.org/10.1109/WACV51458.2022.00150]
Hatamizadeh A, Nath V, Tang Y, Yang D, Roth H R, and Xu D. 2021. Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images.//International MICCAI brainlesion workshop. Cham: Springer International Publishing: 272-284.[DOI:10.1007/978-3-031-08999-2_22http://dx.doi.org/10.1007/978-3-031-08999-2_22]
Hao J, He L, and Hung K F. 2024. T-mamba: Frequency-enhanced gated long-range dependency for tooth 3d cbct segmentation. https://arxiv.org/abs/2404.01065https://arxiv.org/abs/2404.01065
He K M, Zhang X Y, Ren S Q, and Sun J. 2016. Deep Residual Learning for Image Recognition//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA. [DOI:10.1109/cvpr.2016.90http://dx.doi.org/10.1109/cvpr.2016.90]
Ho J, Kalchbrenner N, Weissenborn D, and Salimans T. 2019. Axial attention in multidimensional transformers. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/1912.12180https://arxiv.org/abs/1912.12180
Howard A G, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, and Adam H. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/1704.04861https://arxiv.org/abs/1704.04861
Hu J, Shen L, and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR),7132-7141. [DOI: 10.1109/CVPR.2018.00745http://dx.doi.org/10.1109/CVPR.2018.00745]
Huang G, Liu Z, Van Der Maaten L, and Weinberger K Q. 2017. Densely connected convolutional networks.//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, pp. 2261-2269, [DOI: 10.1109/CVPR.2017.243http://dx.doi.org/10.1109/CVPR.2017.243]
Huang G H, Zhu J W, Li J J, Wang Z W, Cheng L L, Liu L Z, Li H J, and Zhou J. 2020. Channel-attention U-Net: Channel attention mechanism for semantic segmentation of esophagus and esophageal cancer //IEEE Access, 2020, 8: 122798-122810. [DOI: 10.1109/ACCESS.2020.3007719http://dx.doi.org/10.1109/ACCESS.2020.3007719]
Huang H M, Lin L F, Tong R F, Hu H J, Zhang Q W, Iwamoto Y, Han X H, Chen Y W, and Wu J. 2020. Unet3+: A full-scale connected unet for medical image segmentation. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1055-1059. IEEE. [DOI: 10.1109/ICASSP40776.2020.9053405http://dx.doi.org/10.1109/ICASSP40776.2020.9053405]
Huang X H, Deng Z F, Li D D, and Yuan X G. 2021. Missformer: An effective medical image segmentation transformer.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2109.07162https://arxiv.org/abs/2109.07162
Ibtehaz N, and Rahman M S. 2020. MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Networks, 121, 74-87. [DOI: 10.1016/j.neunet.2019.08.025http://dx.doi.org/10.1016/j.neunet.2019.08.025]
Isensee F, Jaeger P F, Kohl S A, Petersen J, and Maier-Hein K H. 2021. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 18(2), 203-211. [DOI: 10.1038/s41592-020-01008-zhttp://dx.doi.org/10.1038/s41592-020-01008-z]
Jin Q G, Meng Z P, Pham T D, Chen Q, Wei L Y, and Su R. 2019. DUNet: A deformable network for retinal vessel segmentation. Knowledge-Based Systems, 178, 149-162. [DOI: 10.1016/j.knosys.2019.04.029http://dx.doi.org/10.1016/j.knosys.2019.04.029]
Kan H, Shi J, Zhao M, Wang Z, Han W, An H, and Wang S. 2022. Itunet: Integration of transformers and unet for organs-at-risk segmentation. In 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society,2123-2127.[DOI: 10.1109/EMBC48229.2022.9871945http://dx.doi.org/10.1109/EMBC48229.2022.9871945]
Kaul C, Manandhar S and Pears N. 2019. Focusnet: An attention-based fully convolutional network for medical image segmentation/2019 IEEE 16th international symposium on biomedical imaging . IEEE: 455-458. [DOI: 10.1109/ISBI.2019.8759477http://dx.doi.org/10.1109/ISBI.2019.8759477]
Krizhevsky A, Sutskever I, and Hinton G E. 2012. Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097-1105. [DOI: 10.1145/3065386http://dx.doi.org/10.1145/3065386]
LeCun Y., Bottou L., and Bengio Y. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. [DOI: 10.1109/5.726791http://dx.doi.org/10.1109/5.726791]
Lei T, Zhou W Z, Zhang Y X, Wang R S, Meng H Y, and Nandi A K. 2020. Lightweight V-Net for liver segmentation.//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE: 1379-1383. [DOI: 10.1109/ICASSP40776.2020.9053454http://dx.doi.org/10.1109/ICASSP40776.2020.9053454]
Li R, Li M and Li J.2019. Connection sensitive attention U-net for accurate retinal vessel segmentation.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/1903.05558https://arxiv.org/abs/1903.05558
Li X M, Chen H, Qi X J, Dou Q, Fu C W, and Heng P A. 2018. H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Transactions on Medical Imaging, 37(12), 2663-2674. [DOI:10.1109/TMI.2018.2833502http://dx.doi.org/10.1109/TMI.2018.2833502]
Liang M, and Hu X L. 2015. Recurrent convolutional neural network for object recognition.//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition: 3367-3375.[DOI:10.1109/cvpr.2015.7298958http://dx.doi.org/10.1109/cvpr.2015.7298958]
Liao W B, Zhu Y H, Wang X Y, Pan C W, Wang Y S, and Ma L T. 2024. Lightm-unet: Mamba assists in lightweight unet for medical image segmentation.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2403.05246https://arxiv.org/abs/2403.05246
Lin X, Yan Z, Deng X, Zheng C, and Yu L. 2023. ConvFormer: Plug-and-play CNN-style transformers for improving medical image segmentation//Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), 642-651. [DOI:10.1007/978-3-031-43901-8_61http://dx.doi.org/10.1007/978-3-031-43901-8_61]
Liu J R, Yang H, Zhou H Y, Xi Y, Yu L Q, Yu Y Z, Liang Y, Shi G M, Zhang S T, and Zheng H R. 2024. Swin-umamba: Mamba-based unet with imagenet-based pretraining. International Conference on Medical Image Computing and Computer-Assisted Intervention: 615-625. [DOI:10.1007/978-3-031-72114-4_59http://dx.doi.org/10.1007/978-3-031-72114-4_59]
Liu W, Luo J, Yang Y, Wang W, Deng J, and Yu L. 2022. Automatic lung segmentation in chest X-ray images using improved U-Net. Scientific Reports, 12(1), 8649. [DOI: 10.1038/s41598-022-12743-yhttp://dx.doi.org/10.1038/s41598-022-12743-y]
Liu Y, Tian Y J, Zhao Y Z, Yu H T, Xie L X, Wang Y W, Ye Q X, and Liu Y F. 2024. VMamba: Visual State Space Model.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2401.10166https://arxiv.org/abs/2401.10166
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin Transformer: hierarchical vision Transformer using shifted windows//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 999210002 [DOI:10.1109/ICCV48922.2021.00986]
Liu Z, Mao H, Wu C, Feichtenhofer C, Darrell T, and Xie S. 2022. A convnet for the2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),11976-11986.[DOI:10.1109/CVPR52688.2022.01167http://dx.doi.org/10.1109/CVPR52688.2022.01167]
Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3431-3440 [DOI:10.1109/CVPR.2015.7298965http://dx.doi.org/10.1109/CVPR.2015.7298965]
Ma J, Li F F, and Wang B. 2024. U-mamba: Enhancing long-range dependency for biomedical image segmentation.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2401.04722https://arxiv.org/abs/2401.04722
Ma J, He Y T, Li F F, Han L, You C Y, and Wang B. 2024. Segment anything in medical images. Nature Communications, 15(1), 654. [DOI:10.1038/s41467-024-44824-zhttp://dx.doi.org/10.1038/s41467-024-44824-z]
Mei H, Lei W, Gu R, Ye S, Sun Z, Zhang S, and Wang G. 2021. Automatic segmentation of gross target volume of nasopharynx cancer using ensemble of multiscale deep neural networks with spatial attention. Neurocomputing, 438, 211-222. [DOI: 10.1016/j.neucom.2020.06.146http://dx.doi.org/10.1016/j.neucom.2020.06.146]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: Fully convolutional neural networks for volumetric medical image segmentation//Proceedings of 2016 International Conference on 3D Vision. Stanford, USA: IEEE: 565-571
Moradi S, Ghelich O M, Alizadehasl A, Isaac S, Nik O and Mehrdad O. 2019. MFP-Unet: A novel deep learning based approach for left ventricle segmentation in echocardiography. Physica Medica-European Journal of Medical Physics, 67: 58-69.[DOI:10.1016/j.ejmp.2019.10.001http://dx.doi.org/10.1016/j.ejmp.2019.10.001]
Oktay O, Schlemper J, Folgoc L L, Lee M, Heinrich M, Misawa K, ... and Rueckert D. 2018. Attention u-net: Learning where to look for the pancreas.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/1804.03999https://arxiv.org/abs/1804.03999
Petit O, Thome N, Rambour C, Themyr L, Collins T, and Soler L. 2021. U-net transformer: Self and cross attention for medical image segmentation.//Machine Learning in Medical Imaging: 12th International Workshop, MLMI 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, Proceedings 12. Springer International Publishing: 267-276. [DOI:10.1007/978-3-030-87589-3_28http://dx.doi.org/10.1007/978-3-030-87589-3_28]
Qi K H, Yang H, Li C, Liu Z Y, Wang M Y, Liu Q G, and Wang S S. 2019. X-net: Brain stroke lesion segmentation based on depthwise separable convolution and long-range dependencies. //Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part III 22. Springer International Publishing, 2019: 247-255. [DOI:10.1007/978-3-030-32248-9_28http://dx.doi.org/10.1007/978-3-030-32248-9_28]
Qian Y, and Zhang Y J. 2008. Level Set Methods and Its Application on Image Segmentation. Journal of Image and Graphics, 13(1):7
钱芸, 张英杰. 2008. 水平集的图像分割方法综述. 中国图象图形学报, 13(1):7 [DOI: 10.11834/jig.20080102http://dx.doi.org/10.11834/jig.20080102]
Qiao L, Liu Q, Shi J, Zhao M, Kan H, Wang Z, and Wang S. 2022. Fctc-unet: Fine-grained combination of transformer and cnn for thoracic organs segmentation. In 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society. 4749-4753. [DOI: 10.1109/EMBC48229.2022.9870880http://dx.doi.org/10.1109/EMBC48229.2022.9870880]
Ruan J C, and Xiang S C. 2024. Vm-unet: Vision mamba unet for medical image segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2402.02491https://arxiv.org/abs/2402.02491
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention, 234-241. [DOI:10.1007/978-3-319-24574-4_28http://dx.doi.org/10.1007/978-3-319-24574-4_28]
Samarasinghe G, Jameson M, Vinod S, Field M, Dowling J, Sowmya A, and Holloway L. 2021. Deep learning for segmentation in radiation therapy planning: a review. Journal of Medical Imaging and Radiation Oncology, 65(5), 578-595. [DOI: 10.1111/1754-9485.13286http://dx.doi.org/10.1111/1754-9485.13286]
Sanjid K S, Hossain M T, Junayed M S S, and Uddin D M M. 2024. Integrating mamba sequence model and hierarchical upsampling network for accurate semantic segmentation of multiple sclerosis legion.[EB/OL]. [2024-07-20]. https://arxiv.org/abs/2403.17432https://arxiv.org/abs/2403.17432
Shi J, Kan H, Ruan S, Zhu Z, Zhao M, Qiao L, and Xue X. 2023. H-denseformer: An efficient hybrid densely connected transformer for multimodal tumor segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 692-702.[DOI: 10.1007/978-3-031-43901-8_66http://dx.doi.org/10.1007/978-3-031-43901-8_66]
Siddique N, Paheding S, Elkin C P, and Devabhaktuni V. 2021. U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access, 9: 82031-82057.[DOI: 10.1109/ACCESS.2021.3086020http://dx.doi.org/10.1109/ACCESS.2021.3086020]
Strudel R, Garcia R, Laptev I and Schmid C. 2021. Segmenter: Transformer for semantic segmentation//Proceedings of the IEEE/CVF international conference on computer vision: 7262-7272. [DOI: 10.1109/ICCV48922.2021.00717http://dx.doi.org/10.1109/ICCV48922.2021.00717]
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2015. Going deeper with convolutions//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition,s 1-9 [DOI:10.1109/CVPR.2015.7298594http://dx.doi.org/10.1109/CVPR.2015.7298594]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser Ł and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc.: 6000-6010.
Valanarasu J M J, Oza P, Hacihaliloglu I, and Patel V M. 2021. Medical transformer: Gated axial-attention for medical image segmentation//Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, proceedings, part I 24: 36-46.[DOI:10.1007/978-3-030-87193-2_4http://dx.doi.org/10.1007/978-3-030-87193-2_4]
Wan Q, Huang Z L, Lu J C, Yu G and Zhang L. 2023. SeaFormer: squeeze-enhanced axial Transformer for mobile semantic segmentation [EB/OL]. [2024-07-20]. https://arxiv.org/pdf/2301.13156.pdfhttps://arxiv.org/pdf/2301.13156.pdf
Wang C, He Y and Liu Y. 2019. ScleraSegNet: An improved U-net model with attention for accurate sclera segmentation. In: Proc. of the Int'l Conf. on Biometrics.1-8.[DOI: 10.1109/ICB45273.2019.8987270http://dx.doi.org/10.1109/ICB45273.2019.8987270]
Wang H, Xie S, Lin L, Iwamoto Y, Han X H, Chen Y W, and Tong R. 2022. Mixed transformer u-net for medical image segmentation. In IEEE international conference on acoustics, speech and signal processing (ICASSP), 2390-2394. [DOI: 10.1109/ICASSP43922.2022.9746172http://dx.doi.org/10.1109/ICASSP43922.2022.9746172]
Wang H N, Cao P, Wang J Q, and Zaiane O R. 2022. Uctransnet: rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proceedings of the AAAI Conference on Artificial Intelligence: 2441-2449. [DOI:10.1609/aaai.v36i3.20144http://dx.doi.org/10.1609/aaai.v36i3.20144.]
Wang J and Liu X P. 2021. Medical image recognition and segmentation of pathological slices of gastric cancer based on Deeplab v3+ neural network. Computer Methods and Programs in Biomedicine, 207, 106210. [DOI:10.1016/j.cmpb.2021.106210http://dx.doi.org/10.1016/j.cmpb.2021.106210]
Wang J H, Chen J T, Chen D, and Wu J. 2024. Large window-based mamba unet for medical image segmentation: Beyond convolution and self-attention. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2403.07332https://arxiv.org/abs/2403.07332
Wang L, Pan L R, Wang H T, Liu M T, Feng Z C, Rong P F, Chen Z, and Peng S L. 2023. DHUnet: Dual-branch hierarchical global–local fusion network for whole slide image segmentation. Biomedical Signal Processing and Control, 85, 104976.
Wang W J, Chen J X, Zhao J, Chi Y, Xie X S and Zhang L. 2019. Automated segmentation of pulmonary lobes using coordination-guided deep neural networks. In: Proc. of the Int'l Symp. on Biomedical Imaging. 1353-1357.
Wang X C, Li W, Miao B Y, He J, Jiang Z W, Xu W, Ji Z Y, Hong G, and Shen Z M. 2018. Retina blood vessel segmentation using a U-net based Convolutional neural network. Procedia Computer Science: International Conference on Data Science ICDS 2018: 8-9.
Wang Z Y, Zheng J Q, Zhang Y C, Cui G, and Li L. 2024. Mamba-unet: Unet-like pure visual mamba for medical image segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2402.05079https://arxiv.org/abs/2402.05079
Wu R K, Liu Y H, Liang P C, and Chang Q. 2024. H-vmunet: High-order vision mamba unet for medical image segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2403.13642https://arxiv.org/abs/2403.13642
Xiao H G, Li L, Liu Q Y, Zhu X H, and Zhang Q H. 2023. Transformers in medical image segmentation: A review. Biomedical Signal Processing and Control, 84, 104791. [DOI:10.1016/j.bspc.2023.104791http://dx.doi.org/10.1016/j.bspc.2023.104791]
Xiao X, Lian S, Luo Z M, and Li S Z. 2018. Weighted res-unet for high-quality retina vessel segmentation. 2018 9th International Conference on Information Technology in Medicine and Education (ITME), Hangzhou. [DOI:10.1109/itme.2018.00080http://dx.doi.org/10.1109/itme.2018.00080]
Xie J H, Liao R F, Zhang Z A, Yi S D, Zhu Y S, and Luo G B. 2024. Promamba: Prompt-mamba for polyp segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2403.13660https://arxiv.org/abs/2403.13660
Xie Y T, Zhang J P, Shen C H, and Xia Y. 2021. Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, Lecture Notes in Computer Science. 171-180. [DOI:10.1007/978-3-030-87199-4_16http://dx.doi.org/10.1007/978-3-030-87199-4_16]
Xing Z H, Ye T, Yang Y J, Liu G, and Zhu L. 2024. Segmamba: Long-range sequential modeling mamba for 3d medical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention: 578-588. [DOI:10.1007/978-3-031-72111-3_54http://dx.doi.org/10.1007/978-3-031-72111-3_54]
Xu H W, Yan P X, Wu M, Xu Z Y, and Sun Y B. 2020. Automated segmentation of cystic kidney in CT images using residual double attention motivated U-Net model. Application Research of Computers, 37(7): 2237-2240.
Xu G P, Zhang X, He X W, and Wu X L. 2023. Levit-unet: Make faster encoders with transformer for medical image segmentation. Chinese Conference on Pattern Recognition and Computer Vision (PRCV). Singapore: Springer Nature Singapore: 42-53. [DOI: 10.1007/978-981-99-8543-2_4]
Yang B, Liu X F, and Zhang J. 2020. Medical image segmentation based on deep feature aggregation network. Computer Engineering.
Yang X N and Tian X L. 2022. Transnunet: Using attention mechanism for whole heart segmentation. 2022 IEEE 2nd International Conference on Power, Electronics and Computer Applications (ICPECA). IEEE: 553-556. [DOI: 10.1109/ICPECA53709.2022.9719101]
Yu F and Koltun V. 2015. Multi-scale context aggregation by dilated convolutions. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/1511.07122
Yu X, Yang Q, Zhou Y C, Cai L Y, Gao R Q, Lee H H, Li T, Bao S X, and Xu Z B. 2023. Unest: local spatial representation learning with hierarchical transformer for efficient medical segmentation. Medical Image Analysis, 90: 102939. [DOI: 10.1016/j.media.2023.102939]
Zhang L, Zhang K J, and Pan H W. 2023. SUNet++: A deep network with channel attention for small-scale object segmentation on 3D medical images. Tsinghua Science and Technology, 28(4): 628-638. [DOI: 10.26599/TST.2022.9010023]
Zhang X F, Zhang S, Zhang D H, and Liu R. 2023. Group attention-based medical image segmentation model. Journal of Image and Graphics, 28(10): 3231-3242. [DOI: 10.11834/jig.220748]
Zhang Y D, Liu H Y, and Hu Q. 2021. Transfuse: Fusing transformers and cnns for medical image segmentation. Medical Image Computing and Computer Assisted Intervention (MICCAI). 14-24. [DOI: 10.1007/978-3-030-87193-2_2]
Zheng S X, Lu J C, Zhao H S, Zhu X T, Luo Z K, Wang Y B, Fu Y W, Feng J F, Xiang T, and Torr P H S. 2021. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA. [DOI: 10.1109/cvpr46437.2021.00681]
Zhou H Y, Guo J, Zhang Y, Yu L, Wang L, and Yu Y. 2021. nnformer: Interleaved transformer for volumetric segmentation. [EB/OL]. [2024-07-20]. https://arxiv.org/abs/2109.03201
Zhou T, Dong Y L, Huo B Q, Liu S, and Ma Z J. 2021. U-Net and its applications in medical image segmentation: a review. Journal of Image and Graphics, 26(9): 2058-2077. [DOI: 10.11834/jig.200704]
Zhou X Y, Zheng J Q, Li P, and Yang G Z. 2020. Acnn: a full resolution dcnn for medical image segmentation. 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE: 8455-8461. [DOI: 10.1109/ICRA40945.2020.9197328]
Zhou Z W, Siddiquee M R, Tajbakhsh N, and Liang J M. 2018. UNet++: A nested u-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Lecture Notes in Computer Science. 3-11. [DOI: 10.1007/978-3-030-00889-5_1]
Zhu W, Huang Y, Zeng L, Chen X, Liu Y, Qian Z, and Xie X. 2019. AnatomyNet: deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy. Medical Physics, 46(2): 576-589. [DOI: 10.1002/mp.13300]