采用多方向蛇形卷积和视觉残差Mamba的两阶段冠状动脉分割方法
Two-stage Coronary Artery Segmentation via Multi-Direction Snake Convolution and Vision Mamba
2024, pp. 1-12
Online publication date: 2024-12-23
DOI: 10.11834/jig.240538
刘建明,唐煜城.采用多方向蛇形卷积和视觉残差Mamba的两阶段冠状动脉分割方法[J].中国图象图形学报,
Liu Jianming, Tang Yucheng. Two-stage Coronary Artery Segmentation via Multi-Direction Snake Convolution and Vision Mamba[J]. Journal of Image and Graphics,
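Objective
Coronary artery segmentation from computed tomography angiography (CTA) images has important clinical value. Coronary arteries are tubular structures with many branches and fine branches, and the foreground and background classes are severely imbalanced. Traditional coronary artery segmentation networks based on convolutional neural networks (CNNs) struggle to model long-range dependencies between vessels, while Vision Transformer (ViT) models are difficult to deploy in resource-constrained real-world settings because of their high complexity. Recent studies show that state space models (SSMs), represented by Mamba, can effectively model long-range dependencies while maintaining linear complexity.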
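Method
For these reasons, visual Mamba is applied to coronary artery segmentation for the first time, and a two-stage coronary artery segmentation method using multi-direction snake convolution and visual residual Mamba, MDSVM-Unet++, is proposed. MDSVM-Unet++ adopts a conventional encoder-decoder architecture. In the encoding stage, to accurately capture the slender, tortuous tubular structure of the vessels, a new multi-direction snake convolution module is proposed that performs multi-view fusion learning over the sagittal, coronal, and axial planes, so that the model adaptively and more comprehensively focuses on the slender local structures of the coronary arteries. In the decoding stage, to model long-range dependencies between vessel slices while maintaining linear complexity, an upsampling decoder block based on residual visual Mamba is designed: the block first fuses features by addition, then feeds the result into a residual visual Mamba layer to model long-range dependencies, and finally upsamples the feature maps by trilinear interpolation. To segment fine branch vessels more accurately, a two-stage segmentation model is further proposed. In the first stage, MDSVM-Unet++ directly segments the entire CT image, and the result guides the partitioning of the original image into blocks; the blocks are then fed back into MDSVM-Unet++ for second-stage segmentation, and the segmentation results of all blocks are finally merged. While preserving the continuity of the segmentation, this further reduces false-positive points and improves the continuity and smoothness of the coronary arteries.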
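Result
Experimental results show that the proposed two-stage MDSVM-Unet++ method outperforms the latest baseline network, ImageCAS, on the IMAGECAS dataset by 5.41%, 8.5456, and 0.8093 in Dice similarity coefficient, Hausdorff distance, and average Hausdorff distance, respectively.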
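The three metrics reported above can be illustrated with a minimal Python sketch on sets of voxel coordinates. This is not the authors' evaluation code: the function names are hypothetical, distances are in voxel units, both masks are assumed non-empty for the distance metrics, and the average Hausdorff distance uses one common definition (the mean of the two directed average distances), which may differ from the one used in the paper.

```python
import math


def dice(pred, truth):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def directed_distances(a, b):
    """For each point in a, the Euclidean distance to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff(a, b):
    """Symmetric Hausdorff distance: worst nearest-neighbour distance in either direction."""
    return max(max(directed_distances(a, b)), max(directed_distances(b, a)))


def average_hausdorff(a, b):
    """Average Hausdorff distance (assumed definition: mean of the two directed means)."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)) / 2


# Toy masks: three voxels each along one axis, two of them shared.
pred = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}
truth = {(0, 0, 1), (0, 0, 2), (0, 0, 5)}
print(dice(pred, truth), hausdorff(pred, truth), average_hausdorff(pred, truth))
```

In this toy case the single stray voxel at (0, 0, 5) dominates the Hausdorff distance but has a much smaller effect on the average Hausdorff distance, which is why the two are usually reported together. The brute-force nearest-neighbour search is quadratic and only suitable for small examples.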
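Conclusion
This paper proposes a two-stage coronary artery segmentation method using multi-direction snake convolution and visual residual Mamba. On one hand, a multi-direction snake convolution is proposed to capture vascular structural features more comprehensively and accurately; on the other hand, a decoder module based on residual visual Mamba is designed to model long-range dependencies between vessel slices at linear complexity, ultimately achieving more accurate coronary artery segmentation in low-resource environments.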
Objective
Cardiovascular disease (CVD) accounts for approximately half of non-communicable diseases. Coronary artery stenosis is considered a major risk factor for CVD. Computed Tomography Angiography (CTA) has become one of the widely used non-invasive imaging methods for coronary diagnosis due to its excellent image resolution. Clinically, coronary artery segmentation is crucial for the diagnosis and quantification of coronary artery diseases. Coronary arteries are characterized by their multi-branching and tubular structures, and there is a severe imbalance between foreground and background classes. Traditional coronary artery segmentation networks based on Convolutional Neural Networks (CNNs) struggle to model long-range dependencies between vessels, while Vision Transformer (ViT) models are difficult to deploy in resource-constrained real-world environments due to their high complexity. Recent studies have shown that State Space Models (SSMs), such as Mamba, can effectively model long-range dependencies while maintaining linear complexity.
Method
For these reasons, this paper applies visual Mamba to coronary artery segmentation for the first time and proposes MDSVM-Unet++, a two-stage coronary artery segmentation method based on multi-direction snake convolution and visual residual Mamba. MDSVM-Unet++ adopts a traditional encoder-decoder architecture. In the encoding stage, dynamic snake convolution replaces standard convolution to accurately capture the elongated, tortuous tubular structure of the vessels: it introduces deformable offsets into the convolution kernel and learns them with an iterative strategy that keeps the kernel from drifting away from the target, ensuring continuity of attention. Additionally, a Multi-direction Snake Convolution Layer (MDSConv Layer) is proposed that extracts features along the three axes (x, y, z) of the 3D image, retaining attention from multiple directions and further improving segmentation accuracy, so that the model focuses more on slender, complex tubular structures. In the decoding stage, to model long-range dependencies between vascular slices while maintaining linear complexity, an upsampling decoder block based on residual visual Mamba is designed. The block employs dense spatial pooling to generate finer multi-scale contexts. It first fuses features by addition, then feeds the result into the residual visual Mamba layer for long-range dependency modeling, and finally upsamples the feature maps by trilinear interpolation. To segment small branch vessels more accurately, a two-stage segmentation model is further proposed. In the first stage, the original CTA images are downscaled to 128×128×64, and MDSVM-Unet++ is applied directly to segment the entire CTA image. The results are used to guide the partitioning of the original image into a set of 64×64×64 voxel blocks, so that the blocks contain more coronary artery information. Subsequently, the partitioned blocks are fed back into the MDSVM-Unet++ network for the second stage of segmentation, and all block-level results are merged at the end. This approach reduces false-positive points in the segmentation results while preserving continuity and improving the smoothness of the coronary arteries.
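The abstract does not say how the first-stage result guides the partitioning, so the following Python sketch shows only one plausible reading: tile the original volume on a regular grid of 64-voxel blocks and keep the blocks that contain at least one voxel predicted as vessel in the downscaled first-stage mask. The function name `patch_origins`, its parameters, and the grid-snapping rule are all assumptions made for illustration, not the authors' procedure.

```python
PATCH = 64  # block edge length in voxels, as stated in the abstract


def patch_origins(coarse_voxels, scale, volume_shape, patch=PATCH):
    """Map stage-1 foreground voxels to the full-resolution blocks that contain them.

    coarse_voxels: iterable of (z, y, x) foreground indices in the downscaled mask.
    scale: per-axis factor from downscaled to original resolution.
    volume_shape: (Z, Y, X) shape of the original volume.
    Returns a sorted list of unique (z, y, x) block origins.
    """
    origins = set()
    for voxel in coarse_voxels:
        origin = []
        for idx, s, size in zip(voxel, scale, volume_shape):
            full = int(idx * s)              # position in the original volume
            start = (full // patch) * patch  # snap to the regular block grid
            # Shift the last block inward so it stays inside the volume
            # (assumes each axis is at least `patch` voxels long).
            start = min(start, max(size - patch, 0))
            origin.append(start)
        origins.add(tuple(origin))
    return sorted(origins)


# Hypothetical case: a 256x512x512 volume downscaled by 4 along each axis.
coarse = [(10, 20, 30), (10, 21, 31), (63, 127, 127)]
print(patch_origins(coarse, scale=(4, 4, 4), volume_shape=(256, 512, 512)))
# -> [(0, 64, 64), (192, 448, 448)]
```

Only blocks that the first stage marks as containing vessel are passed to the second stage, which is one way the blocks could "contain more coronary artery information" and largely skip empty background.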
Result
In the experiments, we implemented the model in PyTorch and trained it on an NVIDIA GeForce RTX 3090. We selected 750 CTA images from the IMAGECAS dataset as the training set, with the remaining 250 CTA images used as the validation and test sets. The first-stage segmentation network was trained for 25 epochs using the Adam optimizer with a learning rate of 0.001. The second-stage segmentation network underwent 50 iterations, with the learning rate decayed by a factor of 0.1 at the 30th and 40th iterations. The Dice similarity coefficient (DSC) was used as the evaluation metric, while the Hausdorff distance (HD) (Huttenlocher et al., 1993) and average Hausdorff distance (AHD) served as auxiliary metrics. The experimental results indicate that the proposed MDSVM-Unet++ method outperformed the latest baseline network, ImageCAS, on the IMAGECAS dataset, with improvements of 5.41% in Dice coefficient, 8.5456 in Hausdorff distance, and 0.8093 in average Hausdorff distance.
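The second-stage schedule described above is a step decay, which can be written out as a small pure-Python sketch. Two points are assumptions rather than facts from the abstract: the second stage is taken to start from the same learning rate of 0.001 as the first stage, and the 30th and 40th "iterations" are treated as 0-based epoch milestones. The function name `stage2_learning_rate` is hypothetical.

```python
def stage2_learning_rate(epoch, base_lr=1e-3, milestones=(30, 40), gamma=0.1):
    """Learning rate at a 0-based epoch under step decay.

    The rate is multiplied by `gamma` once for every milestone already reached.
    base_lr=1e-3 is an assumption: the abstract states it only for stage one.
    """
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


# Print the rate just before and just after each milestone.
for epoch in (0, 29, 30, 39, 40, 49):
    print(epoch, stage2_learning_rate(epoch))
```

In PyTorch, the same behaviour is normally obtained with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 40], gamma=0.1)` rather than a hand-written function.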
Conclusion
Given the tubular structural characteristics of coronary arteries, this paper applies visual Mamba to coronary artery segmentation for the first time, proposing a two-stage coronary artery segmentation method based on dynamic snake convolution and visual residual Mamba. On one hand, dynamic snake convolution is utilized to more accurately capture vascular structural features; on the other hand, a decoding module based on visual residual Mamba is designed to model long-range dependencies between vascular slices while maintaining linear complexity, ultimately achieving more accurate coronary artery segmentation in resource-limited environments.
Computed Tomography Angiography; coronary artery segmentation; two-stage segmentation method; dynamic snake convolution; state space models
Dong C, Xu S, Dai D, Zhang Y, Zhang C and Li Z. 2023. A novel multi-attention, multiscale 3D deep network for coronary artery segmentation[J]. Medical Image Analysis, 85: 102745. [DOI: 10.1016/j.media.2023.102745]
Çiçek Ö, Abdulkadir A, Lienkamp S S, Brox T and Ronneberger O. 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Athens: Springer: 424-432. [DOI: 10.1007/978-3-319-46723-8_49]
Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q and Wang M. 2022. Swin-Unet: Unet-like pure transformer for medical image segmentation[C]//European Conference on Computer Vision: 205-218. [DOI: 10.1007/978-3-031-25066-8_9]
Dai J, Qi H, Xiong Y, Zhang G, Hu H and Wei Y. 2017. Deformable Convolutional Networks[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice: IEEE: 764-773. [DOI: 10.1109/ICCV.2017.89]
Gharleghi R, Chen N, Sowmya A and Beier S. 2022. Towards automated coronary artery segmentation: A systematic review[J]. Computer Methods and Programs in Biomedicine, 225: 107015. [DOI: 10.1016/j.cmpb.2022.107015]
Gu A and Dao T. 2023. Mamba: Linear-time sequence modeling with selective state spaces[EB/OL]. [2023-12-1]. https://doi.org/10.48550/arXiv.2312.00752.pdf
Hatamizadeh A, Tang Y, Nath V, Yang D, Myronenko A, Landman B, Roth H R and Xu D. 2022. UNETR: Transformers for 3D medical image segmentation[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision: 574-584. [DOI: 10.1109/WACV51458.2022.00181]
Huttenlocher D P, Klanderman G A and Rucklidge W J. 1993. Comparing images using the Hausdorff distance[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9): 850-863. [DOI: 10.1109/34.232073]
Huang W, Huang L, Lin Z, Huang S, Chi Y, Zhou J, Zhang J, Tan R and Zhong L. 2018. Coronary Artery Segmentation by Deep Learning Neural Networks on Computed Tomographic Coronary Angiographic Images[C]//2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE. [DOI: 10.1109/EMBC.2018.8512328]
Isensee F, Petersen J, Klein A, Zimmerer D, Jaeger P F, Kohl S, ... and Maier-Hein K H. 2021. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation[J]. Nature Methods, 18(2): 203-211. [DOI: 10.1038/s41592-020-01008-z]
Kirişli H A, Schaap M, Metz C T, Dharampal A S, Meijboom W B, Papadopoulou S L, Dedic A, Nieman K, Graaf M A, Meijs M F L, Cramer M J, Broersen A, Centin S, Eslami A, Flórez-Valencia L, Lor K L, Matuszewski B, Melki I, Mohr B, Öksüz I and Walsum T. 2013. Standardized evaluation framework for evaluating coronary artery stenosis detection, stenosis quantification and lumen segmentation algorithms in computed tomography angiography[J]. Medical Image Analysis, 17(8): 859-876. [DOI: 10.1016/j.media.2013.05.007]
Lei Y, Guo B, Fu Y, Wang T, Liu T, Curran W, Zhang L and Yang X. 2020. Automated coronary artery segmentation in Coronary Computed Tomography Angiography (CCTA) using deep learning neural networks[J]. Proceedings of SPIE - The International Society for Optical Engineering: 34. [DOI: 10.1117/12.2550368]
Li X, Sun X, Meng Y, Liang J, Wu F and Li J. 2020. Dice Loss for Data-imbalanced NLP Tasks[J]. Computation and Language: 465-476. [DOI: 10.18653/v1/2020.acl-main.45]
Liu J, Yang H, Zhou H Y, Xi Y, Yu L, Li C, ... and Wang S. 2024. Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland: 615-625. [DOI: 10.1007/978-3-031-72114-4_59]
Liu Y, Tian Y, Zhao Y, Yu H, Xie L, Wang Y, Ye Q and Liu Y. 2024. VMamba: Visual state space model[C]//NeurIPS 2024. [DOI: 10.48550/arXiv.2401.10166]
Long J, Shelhamer E and Darrell T. 2014. Fully Convolutional Networks for Semantic Segmentation[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence. [DOI: 10.1109/TPAMI.2016.2572683]
Lou Lufei, Ying Junjie, Cai Kaijun and Xin Yu. 2024. Review of various vessels and airway segmentation in medical imaging[J]. Journal of Image and Graphics, 29(09): 2692-2715. [DOI: 10.11834/jig.230240]
楼陆飞, 应俊杰, 蔡凯俊, 辛宇. 2024. 医学影像多血管和气道分割方法综述. 中国图象图形学报, 29(09):2692-2715 DOI: 10.11834/jig.230240.
Qi Y, He Y, Qi X, Zhang Y and Yang G. 2023. Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation[C]//2023 IEEE International Conference on Computer Vision (ICCV). Paris: IEEE: 6047-6056.