Ultrasound thyroid segmentation combining segmented frequency domain and local attention

Hu Yishan; Qin Pinle; Zeng Jianchao; Chai Rui; Wang Lifang

Published: 2020-10-16
DOI: 10.11834/jig.200230
2020 | Volume 25 | Number 10

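Hu Yishan^1,2,3, Qin Pinle^1,2,3, Zeng Jianchao^1,2,3, Chai Rui^1,2,3, Wang Lifang^1,2,3 (1. Shanxi Medical Imaging and Data Analysis Engineering Research Center, North University of China, Taiyuan 030051; 2. College of Big Data, North University of China, Taiyuan 030051; 3. Shanxi Medical Imaging Artificial Intelligence Engineering Technology Research Center, North University of China, Taiyuan 030051)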

Abstract

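Objective Ultrasound examination is one of the main imaging methods for diagnosing thyroid disease. However, because speckle intensity in ultrasound images is random and the surrounding tissues and organs are complex, the shape, size, and texture of the thyroid vary widely across data sources, which easily causes visual fatigue in observers. To address the random speckle intensity and the complexity of the surrounding tissue in thyroid ultrasound imaging, and to delineate the anatomical boundaries of organs and pathological lesions more accurately, a thyroid ultrasound segmentation network based on frequency-domain enhancement and a local attention mechanism is proposed. Method High-pass and low-pass filters are applied to the original data to obtain image information in the high- and low-frequency bands. The detail features of the high-frequency band and the edge features of the low-frequency band are integrated to enhance foreground-background contrast and reduce the differences between images. Because the amount of feature information extracted varies with network depth in a convolutional network, a local attention mechanism adaptively activates the high- and low-dimensional feature information. It enhances the detail information of low-dimensional features and weakens attention to non-target regions, and it enhances the global information of high-dimensional features and weakens the interference of redundant information on the network, which strengthens foreground-background classification and the detection of non-salient targets. Pyramid cascaded dilated (atrous) convolution is used to obtain feature information at different receptive fields, addressing the large image differences between data sources. Result On a public thyroid ultrasound dataset comprising 16 hand-drawn sets acquired at 11 to 16 MHz, 10-fold cross-validation gives an accuracy of 0.989, a recall of 0.849, a precision of 0.940, and a Dice coefficient of 0.812, which is better than other current medical image segmentation networks. Ablation experiments show that each of the proposed modules provides some improvement in ultrasound image segmentation. Conclusion The proposed segmentation network combines the advantages of deep learning models and traditional image processing models. It handles random speckle in ultrasound images well and improves the segmentation of non-salient tissue.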
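The effect of cascading dilated convolutions on the receptive field, mentioned in the Method part of the abstract, can be illustrated with a small sketch. The abstract does not state kernel sizes or dilation rates, so the 3-tap kernel, the rates 1, 2, 4, and the function names `dilated_conv1d` and `receptive_field` are assumptions chosen only for illustration. This is a 1-D toy, not the network's actual layers.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form).

    Kernel taps are spaced `dilation` samples apart, so a short kernel
    covers a wider stretch of the input without extra weights.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical pyramid: branches of increasing depth with assumed rates 1, 2, 4.
branches = [[1], [1, 2], [1, 2, 4]]
fields = [receptive_field(3, b) for b in branches]
print(fields)  # [3, 7, 15]

# A 3-tap difference kernel with dilation 2 compares samples 4 positions apart.
print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], 2))  # [-4, -4, -4]
```

Each branch sees a different amount of context (3, 7, and 15 samples here), which is the sense in which a pyramid of cascaded dilated convolutions gathers features at several receptive fields.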

Keywords

image segmentation; frequency domain analysis; attention mechanism; dilated convolution; ultrasound image

Ultrasound thyroid segmentation based on segmented frequency domain and local attention

Hu Yishan^1,2,3, Qin Pinle^1,2,3, Zeng Jianchao^1,2,3, Chai Rui^1,2,3, Wang Lifang^1,2,3 (1. Shanxi Medical Imaging and Data Analysis Engineering Research Center, North University of China, Taiyuan 030051, China; 2. College of Big Data, North University of China, Taiyuan 030051, China; 3. Shanxi Medical Imaging Artificial Intelligence Engineering Technology Research Center, North University of China, Taiyuan 030051, China)

Abstract

Objective Ultrasound is one of the main imaging methods for diagnosing thyroid disease. It supports diagnosis through real-time examination of the internal anatomical structure of the thyroid. In computer vision, segmenting tissues and organs amounts to classifying each pixel of the image as foreground or background, and the final segmentation boundary is formed by the set of target pixels. Research on medical image segmentation follows two main approaches. The first obtains the target area by analyzing the pixel values of a given image with traditional computer vision techniques. However, this approach generalizes poorly, and its segmentation results are unsatisfactory because of random noise in ultrasound images. The second uses deep learning to obtain the target area through foreground-background classification with deep convolutional networks. However, because the tissues and organs are complex, the surrounding tissues are prominent, and the image carries little foreground-background information, the target area may not be salient. The abstract features extracted by a deep network then mostly describe the surrounding non-target area, and the target itself is segmented poorly. In addition, thyroid images differ in shape, size, and texture across data sources. To address random noise interference and non-salient targets, a thyroid ultrasound segmentation network based on frequency-domain enhancement and a local attention mechanism is proposed. Method First, high-pass and low-pass filters are applied to obtain image information in the high- and low-frequency bands, and the detail features of the high-frequency band and the edge features of the low-frequency band are integrated to enhance foreground-background contrast and reduce the differences between images.
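The band-splitting step described above can be sketched as follows. The abstract does not specify the filter type, the cutoff, or how the two bands are integrated, so the ideal (brick-wall) mask, the `cutoff` value, the `high_gain` weighting, and the function names `dft2`, `split_bands`, and `enhance` are all assumptions for illustration. A naive DFT is used so that the example needs only the standard library; it is practical only for tiny images.

```python
import cmath
import math


def dft2(img, inverse=False):
    """Naive 2-D DFT (or inverse DFT) of a list-of-lists image."""
    rows, cols = len(img), len(img[0])
    sign = 1 if inverse else -1

    def dft1(vec):
        n = len(vec)
        out = []
        for k in range(n):
            s = sum(vec[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t in range(n))
            out.append(s / n if inverse else s)
        return out

    by_rows = [dft1(row) for row in img]  # transform every row
    by_cols = [dft1([by_rows[r][c] for r in range(rows)]) for c in range(cols)]
    return [[by_cols[c][r] for c in range(cols)] for r in range(rows)]


def split_bands(img, cutoff):
    """Split an image into low- and high-frequency parts (assumed ideal mask)."""
    rows, cols = len(img), len(img[0])
    spec = dft2(img)
    low_spec = [[0j] * cols for _ in range(rows)]
    high_spec = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            # Distance from the zero frequency, with wrap-around indices.
            radius = math.hypot(min(u, rows - u), min(v, cols - v))
            if radius <= cutoff:
                low_spec[u][v] = spec[u][v]
            else:
                high_spec[u][v] = spec[u][v]
    low = [[z.real for z in row] for row in dft2(low_spec, inverse=True)]
    high = [[z.real for z in row] for row in dft2(high_spec, inverse=True)]
    return low, high


def enhance(img, cutoff, high_gain):
    """Recombine the bands; `high_gain` > 1 boosts fine detail (assumed fusion)."""
    low, high = split_bands(img, cutoff)
    return [[l + high_gain * h for l, h in zip(lr, hr)]
            for lr, hr in zip(low, high)]


image = [[float((r * c) % 5) for c in range(6)] for r in range(6)]
low_band, high_band = split_bands(image, cutoff=1.5)
sharper = enhance(image, cutoff=1.5, high_gain=1.5)
```

Because the two masks partition the spectrum, the low and high bands sum back to the original image, and a gain of 1 leaves the image unchanged. Any other gain changes the balance between coarse structure and fine detail.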
Second, because the amount of feature information extracted varies with network depth in a convolutional network, a local attention mechanism is used to adaptively activate the high- and low-dimensional feature information. This mechanism enhances the detail information of low-dimensional features and weakens attention to non-target areas, and it enhances the global information of high-dimensional features and weakens the interference of redundant information on the network, thereby improving foreground-background classification and the detection of non-salient targets. Finally, pyramid cascaded dilated (atrous) convolution is used to obtain feature information at different receptive fields and to address the large image differences between data sources. During training, a mixed loss function guides the network: a pixel-level loss (binary cross entropy) and an image similarity loss (structural similarity) are combined to better evaluate the predicted segmentation. A pretrained ResNet34 network is fine-tuned to train the model. The training set is taken from a publicly available online dataset, from which approximately 3 500 suitable images are selected by screening. Training runs on a server with one NVIDIA P100 graphics processing unit (GPU). Approximately 10 epochs are enough to reach good, stable results, and the total training time is approximately 120 min. Result Experimental results show that the proposed method achieves an accuracy of 0.989, a recall of 0.849, a specificity of 0.94, and a Dice coefficient of 0.812. These results are better than those of current medical image segmentation networks such as U-Net and CE-Net, with higher accuracy and specificity on ultrasound thyroid image segmentation.
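A loss that mixes binary cross entropy with a structural similarity term, as named in the Method part above, might look like the sketch below. The abstract gives neither the weighting between the two terms nor the SSIM window, so the equal weighting (`ssim_weight=1.0`), the single global SSIM window, and the function names are assumptions. The constants `c1` and `c2` follow the common SSIM defaults for a dynamic range of 1.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross entropy over flattened probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM computed over the whole image as one window (a simplification)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )


def mixed_loss(pred, target, ssim_weight=1.0):
    """Pixel-level BCE plus (1 - SSIM); the weighting is an assumption."""
    return bce(pred, target) + ssim_weight * (1.0 - global_ssim(pred, target))


mask = [0, 0, 1, 1]
close = [0.1, 0.2, 0.8, 0.9]     # prediction that agrees with the mask
inverted = [0.9, 0.8, 0.2, 0.1]  # prediction that contradicts it
print(mixed_loss(close, mask) < mixed_loss(inverted, mask))  # True
```

The cross entropy term penalizes each pixel independently, while the SSIM term compares means, variances, and covariance of the whole prediction against the mask, so the two terms respond to different kinds of error.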
Specificity improves markedly, and the results are better than those reported for other networks evaluated on the same dataset, such as sumNet. The ablation experiments also show that each proposed module contributes some improvement to ultrasound image segmentation. Conclusion The proposed segmentation model combines the advantages of deep learning models and traditional image processing models. It handles random speckle in ultrasound images better and improves the segmentation of non-salient tissue.
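The evaluation measures named in the Result part (accuracy, recall, specificity, and the Dice coefficient) can be computed from a predicted mask and a reference mask as sketched below. These are the standard confusion-matrix definitions, not code from the paper. The function names and the tiny example masks are made up for illustration, and precision is included for comparison.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, TN, FN over two flattened binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def segmentation_metrics(pred, truth):
    """Standard pixel-wise metrics; a ratio with a zero denominator is reported as 0."""
    tp, fp, tn, fn = confusion_counts(pred, truth)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 0, 0, 0, 1, 1, 0, 0]
print(segmentation_metrics(predicted, reference))
```

When the target occupies a small part of the image, accuracy and specificity are dominated by background pixels and can stay high even if the Dice coefficient is modest, which is why overlap measures are usually reported alongside them.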

Keywords

image segmentation; frequency domain analysis; attention mechanism; dilated convolution; ultrasound image
