Single-domain generalized X-ray breast tumor detection

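Shi Caijuan1,2, Zheng Yuanfan1,2, Ren Bijuan1,2, Kong Fanyue1,2, Duan Changyu1 (1. College of Artificial Intelligence, North China University of Science and Technology, Tangshan 063210; 2. Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan 063210)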

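Abstract
Objective Because breast tumor lesions are highly concealed and prone to metastasis, computer-aided diagnosis (CAD) is currently used to detect and diagnose tumors as early as possible. However, medical image data are scarce and costly to annotate, so deep learning-based X-ray breast tumor detection methods in fully supervised settings achieve very limited performance and generalize poorly. In addition, the domain shift caused by noise degrades tumor detection performance across different environments. To address these challenges, a single-domain generalization method for X-ray breast tumor detection is proposed. Method A single-domain generalization model (SDGM) is proposed for X-ray breast tumor detection, with ResNet-50 (residual network-50) as the backbone feature extraction network. A domain feature enhancement module (DFEM) is designed to effectively fuse the global information from up-sampling and down-sampling in order to suppress noise. An instance generalization module (IGM) is then designed at the detection head; it normalizes and whitens the category semantic information of each instance to improve generalization performance. By learning from a small number of labeled medical images and transferring to unforeseen noisy images, the model alleviates the weak generalization caused by the scarcity of labeled medical images; it also avoids redundant training and further strengthens robustness across different environments. Result To verify the intra-domain generalization performance of SDGM, single-domain X-ray images from INbreast are used as the training set and images with multiple kinds of domain shift as the test set. The experimental results show that, in the intra-domain generalization scenario, SDGM outperforms FCOS (fully convolutional one-stage object detection), Cascade-RCNN, FoveaBox, ATSS, TOOD (task-aligned one-stage object detection), PVTv2-Transformer, and other methods, and improves the mAP (mean average precision) of the baseline method by 9.7%. With a smaller amount of training data, its single-domain generalization performance exceeds that of the baseline method in the fully supervised INbreast scenario. In addition, to further verify the inter-domain generalization performance of SDGM across datasets, the CBIS-DDSM (curated breast imaging subset of DDSM) dataset is used as the training set and the INbreast dataset with multiple kinds of domain shift as the test set; SDGM improves on the baseline method by 5.8%. Conclusion The proposed single-domain generalization model SDGM effectively alleviates the impact of domain shift on model performance, generalizes under the conditions typical of medical data (unknown domains and few samples), and can be transferred fairly flexibly to noisy scenarios in unknown domains in clinical practice.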
Keywords
Single-domain generalized breast tumor detection in X-ray images

Shi Caijuan1,2, Zheng Yuanfan1,2, Ren Bijuan1,2, Kong Fanyue1,2, Duan Changyu1(1.College of Artificial Intelligence, North China University of Science and Technology, Tangshan 063210, China;2.Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan 063210, China)

Abstract
Objective Breast tumor detection in X-ray images is a major challenge in medical image analysis, primarily because lesions are highly concealed and prone to metastasis, which makes them intrinsically difficult to discern. Computer-aided diagnosis (CAD) currently plays a pivotal role in early tumor detection and diagnosis. Deep learning-based object detection methods have made remarkable progress in detecting breast tumors in X-ray images when the training and testing data are of the same modality. However, the limited availability of medical image data and the labor-intensive, expertise-dependent nature of data annotation constrain the detection performance and generalization ability of these models. In addition, the domain shift that noise introduces in unseen domains impairs breast tumor detection across diverse environments. To address these issues, existing studies have proposed domain adaptation and domain generalization methods. However, domain adaptation requires a partition between the target and source domains, whereas domain generalization requires training the model on multiple domains, and dividing data into domains is a formidable challenge when medical data are scarce. In response, single-domain generalization methods, which train a model on a single domain and then generalize it to unseen domains, have been proposed in recent years. These methods are well suited to medical data and help mitigate domain shift. Although single-domain generalization has been widely applied to classification tasks, its application to object detection remains relatively nascent because of the inherent differences between the two tasks. Our analysis shows that, in classification tasks, domain alignment operates on the holistic image as a single instance.
In contrast, object detection must consider multiple objects within each image simultaneously, which leads to a mismatch of instances. Thus, we propose a novel instance alignment paradigm to facilitate single-domain generalization for breast tumor detection.
Method To improve the generalization performance for robust breast tumor detection in X-ray images, we propose a novel model called the single-domain generalization model (SDGM). The SDGM is built on the baseline (RetinaNet) and employs ResNet-50 as its backbone. Two pivotal modules are developed: the instance generalization module (IGM) and the domain feature enhancement module (DFEM). First, the IGM is positioned at the detection head to enhance generalization performance by normalizing and whitening the category semantic information of each instance. The IGM comprises N sets of 3 × 3 convolutions and a switchable whitening sub-module, which is widely recognized for its effectiveness in extracting instance-level domain-invariant features in classification tasks. The IGM is therefore integrated into the classification branch of the detection head. Second, the DFEM is designed to efficiently merge the global information from the up-sampling and down-sampling processes while mitigating the impact of noise in medical images. To counteract the noise that conventional convolution introduces into spatial features, a 3 × 3 convolution generates a foreground mask image, which serves as the convolution offset that guides the sampling of a deformable convolution. Channel-wise attention is then used to selectively suppress noise within each channel. The DFEM is incorporated into the feature pyramid network to attenuate noise during the fusion of feature maps at different scales, thereby supporting the subsequent extraction of domain-invariant features.
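As a rough illustration of what whitening the features of one instance means, the sketch below centers a small set of channels and decorrelates them with a Cholesky factor of their covariance. It is only a minimal sketch under stated assumptions: it uses plain Cholesky whitening rather than the switchable whitening sub-module named above, it ignores the 3 × 3 convolutions, and the function names (`whiten_instance`, `cholesky`, and so on) and the `eps` value are hypothetical choices made for this example.

```python
import math


def mean_center(features):
    """Subtract the per-channel mean; features is a list of C channels of N values."""
    centered = []
    for channel in features:
        mu = sum(channel) / len(channel)
        centered.append([v - mu for v in channel])
    return centered


def covariance(centered, eps):
    """C x C channel covariance, with eps added on the diagonal for stability."""
    c, n = len(centered), len(centered[0])
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / n
            for j in range(c)] for i in range(c)]
    for i in range(c):
        cov[i][i] += eps
    return cov


def cholesky(m):
    """Lower-triangular factor low such that low @ low^T equals m."""
    c = len(m)
    low = [[0.0] * c for _ in range(c)]
    for i in range(c):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def whiten_instance(features, eps=1e-5):
    """Return channels with zero mean and (approximately) identity covariance.

    Assumption: a generic Cholesky whitening, not the paper's IGM.
    """
    centered = mean_center(features)
    low = cholesky(covariance(centered, eps))
    c, n = len(centered), len(centered[0])
    out = [[0.0] * n for _ in range(c)]
    # Solve low @ y = x by forward substitution at every spatial position.
    for p in range(n):
        for i in range(c):
            s = sum(low[i][k] * out[k][p] for k in range(i))
            out[i][p] = (centered[i][p] - s) / low[i][i]
    return out
```

Calling `whiten_instance([[1.0, 2.0, 3.0, 4.0, 5.0, 7.0], [2.0, 1.0, 4.0, 3.0, 6.0, 9.0]])` returns two channels that are no longer correlated with each other, which is the property that makes whitened features less sensitive to domain-specific statistics.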
Result To assess the effectiveness of the proposed SDGM, we conduct extensive experiments on the CBIS-DDSM and INbreast datasets. For the intra-domain setting, the model is trained on single-domain INbreast images and tested on images from multiple shifted domains. We compare the SDGM against several state-of-the-art methods and also evaluate the inter-domain generalization performance between the CBIS-DDSM and INbreast datasets. In the intra-domain single-domain generalization scenarios, the SDGM outperforms the baseline method (RetinaNet) by 9.7% in mean average precision. It also surpasses one-stage anchor-free methods (e.g., FCOS and FoveaBox), one-stage anchor-based methods (e.g., ATSS and TOOD), two-stage methods (e.g., Faster R-CNN and Cascade-RCNN), and even the transformer-based method PVTv2. In the supervised learning scenarios, the SDGM trained with only 728 images surpasses RetinaNet, Cascade-RCNN, FoveaBox, and FCOS trained with 5,148 images. This result demonstrates that the SDGM generalizes remarkably well, outperforming supervised methods with substantially less training data. We further assess the impact of the attention mechanism on model performance. Compared with TOOD, which does not use attention, the SDGM alleviates domain shift and achieves an improvement of at least 3.6% in the single-domain generalization scenario. Compared with PVTv2 and ResNeSt, which employ different attention mechanisms, the SDGM achieves improvements of 21.1% and 2.8%, respectively, in the single-domain generalization scenarios. In the inter-domain single-domain generalization scenarios, the SDGM improves on the baseline by 5.8%. These results indicate that the proposed SDGM not only mitigates performance degradation but also remains robust and generalizes across different datasets.
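For readers unfamiliar with the metric behind these figures, the following minimal sketch computes average precision for one class from ranked detections and averages it over classes. It is an assumption-laden illustration: the abstract does not state the IoU thresholds or the interpolation scheme used, so the sketch applies generic all-point interpolation, takes the true-positive flags as given, and uses hypothetical function names (`average_precision`, `mean_average_precision`).

```python
def average_precision(scored_hits, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    num_ground_truth: number of annotated objects of this class.
    Assumption: matching detections to ground truth has already been done.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """Average the per-class AP values; per_class maps class name -> (hits, num_gt)."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps)
```

With two annotated tumors and three detections ranked as hit, miss, hit, `average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)` evaluates to 5/6, because the first half of the recall range is covered at precision 1 and the second half at precision 2/3.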
Conclusion In this study, we develop the SDGM for detecting breast tumors in X-ray images and focus on designing two important components: the DFEM and the IGM. The DFEM improves the performance of the SDGM by effectively suppressing noise in the global information. The IGM is positioned at the detection head to enhance generalization ability by normalizing and whitening the category information of each object. We evaluate the SDGM on the INbreast and CBIS-DDSM datasets against multiple benchmarks to assess its effectiveness. The SDGM handles domain shift and performs well even with limited labeled medical data, which mitigates key challenges in medical image analysis. The SDGM is also robust across different environmental conditions. In summary, the SDGM offers a promising solution for improving breast tumor detection in X-ray images, with potential value for clinical practice.
Keywords
