Boundary-guided dual attention network for image splicing detection
(1. Hangzhou Normal University; 2. Hangzhou InsVision Technology Co., Ltd.) Abstract
Objective Forged images pose hidden risks to many industries and can cause substantial potential economic losses. Recent studies on image splicing detection have demonstrated the effectiveness of convolutional neural networks for improving localization performance, but these methods generally ignore the fusion of features at different scales, which is essential for locating tampered regions of various sizes. A well-designed forgery detection method is therefore needed to address this problem. Method In this paper, we propose a Boundary-guided Dual Attention Network (BDA-Net) that produces its predictions by integrating spatial-channel dependencies and boundary prediction into the features extracted by the network. Specifically, we first present an encoder-decoder model, called the prediction branch, which serves as the backbone of the splicing detection network and extracts and fuses feature maps at different resolutions. Second, to capture dependencies along different dimensions and strengthen the network's focus on regions of interest, we design a Coordinate-Spatial Attention Module (CSAM) that encodes features along multiple dimensions. Finally, we design a boundary-guided branch to capture the subtle boundary artifacts between tampered and untampered regions, and model it as a binary segmentation task that helps the prediction branch produce more accurate segmentations. Result We adopt the F1 measure and intersection over union (IoU) as evaluation metrics. Compared with several methods on four datasets, the proposed approach achieves the best performance on all four. To verify the robustness of the model, we further apply five attacks to the test images: JPEG compression, Gaussian blur, sharpening, Gaussian noise, and salt-and-pepper noise. The results show that the robustness of the proposed model is significantly better than that of the other models. Conclusion The proposed image splicing detection method fully exploits the advantages of deep learning models and domain knowledge of image splicing detection, effectively improving model performance. Compared with existing splicing detection methods, it offers stronger detection capability and better stability.
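Since the full architecture is detailed in the body of the paper, the following is only a minimal PyTorch-style sketch of how a coordinate-spatial attention block of the kind described above could look. The module name CSAM comes from the abstract, but the layer sizes, the reduction ratio, and the exact way the directional encodings and the spatial map are fused are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a coordinate-spatial attention block.
# Assumption: directional (one-dimensional) pooling along height and width
# plus a channel-squeezed spatial map; the paper's actual CSAM may differ.
import torch
import torch.nn as nn

class CSAM(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        # one-dimensional encodings along the height and width axes
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)
        # spatial attention from channel-wise statistics
        self.conv_s = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, h, w = x.size()
        # coordinate attention: encode along H and W separately
        x_h = self.pool_h(x)                          # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        out = x * a_h * a_w
        # spatial attention: emphasize likely tampered regions
        s = torch.cat([out.mean(dim=1, keepdim=True),
                       out.max(dim=1, keepdim=True)[0]], dim=1)
        return out * torch.sigmoid(self.conv_s(s))

if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)
    print(CSAM(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```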
Keywords
BDA-Net: Boundary-guided dual attention network for image splicing detection
Wu Jinghui, Yan Caiping1, Li Hong2, Liu Renhai3 (1. Hangzhou Normal University; 2. Hangzhou InsVision Technology Co., Ltd.; 3. Hangzhou Normal University) Abstract
Objective Due to the rapid development of the Internet and the proliferation of effective, user-friendly image editing software, manipulated images have proliferated across the Internet. While such images can bring some benefits (e.g., landscape beautification and face photo enhancement), they also have many negative effects on people's lives, such as falsified transaction records, fake news, and even fabricated evidence presented in court. Tampered images exploited maliciously can cause immeasurable damage to individuals and organizations. Recent studies on image splicing detection have demonstrated the effectiveness of convolutional neural networks (CNNs) for improving localization performance, but they generally ignore multi-scale information fusion, which is essential for locating tampered regions of various sizes. At the same time, the performance of most existing detection methods remains unsatisfactory. A well-designed splicing detection method is therefore needed. Method In this paper, we propose a novel Boundary-guided Dual Attention Network (BDA-Net) that integrates spatial-channel dependencies and boundary prediction into the features extracted by the network. Specifically, we present a new encoder-decoder model, named the prediction branch, to extract and fuse feature maps at different resolutions; it constitutes the backbone of BDA-Net. To capture long-range dependencies, a Coordinate-Spatial Attention Module (CSAM) is designed and embedded into the deep layers of the feature extractor. In this way, the representations of regions of interest are augmented while the computational complexity remains limited, because features are aggregated with three one-dimensional encodings. In addition, we present a boundary-guided branch to capture the tiny border artifacts between tampered and non-tampered regions, and model it as a binary segmentation task to enhance the detail of our network's predictions. Result We used four image splicing datasets in our experiments: the Columbia dataset, the NIST16 Splicing dataset, the CASIA2.0 Splicing dataset, and the IMD2020 dataset. To compare the performance of the proposed BDA-Net, we chose three deep learning-based detection methods. The quantitative evaluation metrics are the F-measure and intersection over union (IoU). On the Columbia dataset, compared with the second-ranked model, the F-measure increased by 7.9% and the IoU by 11.3%. On the NIST16 Splicing dataset, the F-measure and IoU increased by 4.3% and 5.5%, respectively, over the second-ranked model. On the CASIA2.0 Splicing dataset, the F-measure increased by 10.1% and the IoU by 10.4% over the second-ranked model. On the IMD2020 dataset, the F-measure increased by 11.1% and the IoU by 7.5% over the second-ranked model. Meanwhile, to verify the robustness of the proposed model, we also applied five attacks to the images: JPEG compression, Gaussian blur, sharpening, Gaussian noise, and salt-and-pepper noise. Experiments show that the robustness of our model is significantly better than that of the other models. Conclusion The image splicing detection method proposed in this paper makes full use of the advantages of deep learning models and domain expertise in image forgery, which effectively improves the performance of the model.
The experimental results on four splicing datasets illustrate that our model has stronger detection capability and better stability compared to existing splicing detection methods.
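To make the boundary-guided idea concrete, here is a hypothetical training-step sketch in which a shared encoder feeds two heads: one predicting the splicing mask (the prediction branch) and one trained on tampered-region boundaries as a binary segmentation target (the boundary-guided branch). The module names, the loss formulation, and the weighting factor lam are illustrative assumptions and do not come from the paper.

```python
# Hypothetical dual-branch training step with a boundary segmentation loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualBranchDetector(nn.Module):
    def __init__(self, encoder: nn.Module, feat_ch: int):
        super().__init__()
        self.encoder = encoder                         # feature extractor (assumed)
        self.mask_head = nn.Conv2d(feat_ch, 1, 1)      # prediction branch output
        self.boundary_head = nn.Conv2d(feat_ch, 1, 1)  # boundary-guided branch output

    def forward(self, img):
        feat = self.encoder(img)
        return self.mask_head(feat), self.boundary_head(feat)

def training_step(model, img, mask_gt, boundary_gt, lam=0.5):
    mask_logit, boundary_logit = model(img)
    size = mask_gt.shape[-2:]
    # both branches are supervised as binary segmentation tasks
    loss_mask = F.binary_cross_entropy_with_logits(
        F.interpolate(mask_logit, size=size, mode="bilinear", align_corners=False),
        mask_gt)
    loss_boundary = F.binary_cross_entropy_with_logits(
        F.interpolate(boundary_logit, size=size, mode="bilinear", align_corners=False),
        boundary_gt)
    return loss_mask + lam * loss_boundary

if __name__ == "__main__":
    # toy encoder only to keep the sketch self-contained
    enc = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
    model = DualBranchDetector(enc, feat_ch=32)
    img = torch.randn(1, 3, 64, 64)
    mask = torch.randint(0, 2, (1, 1, 64, 64)).float()
    edge = torch.randint(0, 2, (1, 1, 64, 64)).float()
    print(training_step(model, img, mask, edge).item())
```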
Keywords