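Face forgery detection method combining image patch comparison and residual map estimation
Feng Caibo (冯才博), Liu Chunxiao (刘春晓), Wang Yuye (王昱烨), Zhou Qidang (周其当) (Zhejiang Gongshang University)
Abstract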
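Objective Face forgery techniques based on deep learning are developing rapidly, and forged face images that pass for genuine ones pose serious security challenges to people's work and daily life. However, because the data distributions of samples from different forgery types differ substantially, existing face forgery detection methods are not accurate enough and generalize poorly. To address this, we introduce the concepts of “patch attribution purity” and “residual map estimation reliability” and propose a face forgery detection method based on image patch comparison and residual map estimation. Methods Apart from the backbone network, the proposed face forgery detection network consists mainly of two parts: a pure image patch comparison module and a reliable residual map estimation module. Image patches that contain both face and background pixels yield mixed features that interfere with patch comparison. To avoid this, the pure image patch comparison module selects pure face patches, which contain only face pixels, and pure background patches, which contain only background pixels, and detects forged images by comparing the differences between the pure features of the two kinds of patches. The purity of the patches ensures the purity of the extracted features, which improves the robustness of feature comparison. Considering that pixels close to the forgery edge have higher residual estimation accuracy than pixels far from it, the reliable residual map estimation module uses a distance-field-weighted residual loss, designed according to each pixel's distance to the forgery edge, to guide network training, so that the network focuses on the differences between the input image and the corresponding real image near the forgery edge. This attention to reliable information further strengthens the robustness of forgery detection. Results On the FF++ dataset, compared with the best-performing existing method, the accuracy and AUC (area under the ROC curve) of our method improve by 2.49% and 3.31%, respectively, and the accuracy on the FS and F2F forgery types improves by 6.01% and 3.99%, respectively. In terms of generalization, cross-dataset tests against 11 existing methods show that, compared with the best-performing existing method, our method improves the video-level and image-level AUC on the CDF dataset by 1.85% and 1.03%, respectively. Conclusion Because it improves the purity and reliability of feature information, the proposed face image forgery detection model outperforms existing methods in both generalization ability and accuracy.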
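The pure patch selection described in the abstract can be illustrated with a minimal sketch. It assumes a binary face mask is already available and that patches lie on a non-overlapping grid; the function name `classify_patches`, the label encoding, and the grid layout are hypothetical choices for illustration and are not taken from the paper. How the features of the selected patches are then compared is not specified here and is left out.

```python
import numpy as np

def classify_patches(face_mask: np.ndarray, patch: int) -> np.ndarray:
    """Label each non-overlapping patch of a binary face mask.

    Returns a (H // patch, W // patch) array with
      1  = pure face patch (every pixel is a face pixel),
      0  = pure background patch (no pixel is a face pixel),
     -1  = mixed patch (discarded before feature comparison).
    """
    h, w = face_mask.shape
    gh, gw = h // patch, w // patch
    # Crop to a whole number of patches and view as (gh, patch, gw, patch).
    blocks = face_mask[:gh * patch, :gw * patch].astype(bool)
    blocks = blocks.reshape(gh, patch, gw, patch)
    all_face = blocks.all(axis=(1, 3))
    no_face = ~blocks.any(axis=(1, 3))
    labels = np.full((gh, gw), -1, dtype=np.int8)
    labels[all_face] = 1
    labels[no_face] = 0
    return labels

# Toy 8x8 mask: the face covers rows 0-3 and columns 0-5.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[0:4, 0:6] = 1
labels = classify_patches(mask, patch=4)
# labels is [[1, -1], [0, 0]]: the top-right patch straddles the face
# boundary, so it is marked as mixed and excluded.
```

Only the patches labelled 1 and 0 would feed the comparison step; excluding the mixed patches is what keeps the compared features free of blended face and background information.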
Keywords
Combining image patch comparison and residual map estimation for face forgery detection
Feng Caibo, Liu Chunxiao, Wang Yuye, Zhou Qidang (Zhejiang Gongshang University)
Abstract
Objective In recent years, face recognition has become part of daily life. However, the rapid development of deep-learning-based face forgery techniques has greatly reduced the cost of face forgery and exposed face recognition systems to new risks. If an attacker uses a forged face image to defeat a face recognition system, personal information and property can easily be stolen. It is also difficult for the human eye to tell whether the face in an image has been forged. Moreover, because the data distributions of different forgery samples differ widely, existing face forgery detection methods generalize poorly and struggle to defend against unknown attack samples. A reliable and general face forgery detection method is therefore urgently needed. To this end, we introduce the concepts of “Patch Attribution Purity” and “Residual Estimation Reliability” and propose a novel multi-task learning network (PuRe), based on Pure Image Patch Comparison and Reliable Residual Map Estimation, to detect forged face images. Methods Apart from the network backbone, our neural network consists mainly of the Pure Image Patch Comparison (PIPC) module and the Reliable Residual Map Estimation (RRME) module, both of which improve face forgery detection performance. On the one hand, if the face in an image is forged, the features extracted from face patches and background patches should be inconsistent, so the PIPC module detects forgery by comparing the feature discrepancy between face patches and background patches. However, if an image patch contains both face and background pixels, the features extracted from it mix face and background information, which disturbs the feature comparison between face and background patches and leads to over-fitting on the training dataset.
To address this problem, the PIPC module uses only pure image patches, which contain only face pixels (pure face image patches) or only background pixels (pure background image patches). The purity of the image patches ensures the purity of the extracted features and thus improves the robustness of feature comparison. On the other hand, the residual map estimation task is designed to predict the difference between the input image and the corresponding real image, which encourages the network backbone to extract more generalizable image features and improves the accuracy of face forgery detection. However, for pixels far from the forged edges between the forged region and the real region, less known information is available for estimating the residuals, which makes the residual estimates unreliable. To address this, a loss function called the Distance Field Weighted Residual Loss (DWRLoss) is designed in the RRME module to make the neural network pay more attention to estimating the residuals near the forged edges. In the face region (i.e., the forged region), a pixel that is far from the background region is assigned a smaller loss weight. This attention to reliable residual information improves the robustness of face forgery detection. Finally, we adopt a multi-task learning strategy to train the proposed neural network: the two learning tasks jointly guide the network backbone to extract effective and generalizable features for face forgery detection. Results Extensive experiments are conducted to demonstrate the superiority of our method. On the FF++ dataset, compared with the best-performing existing methods, the proposed method improves the ACC (accuracy) and AUC (area under the ROC curve) of face forgery detection by 2.49% and 3.31%, respectively.
In addition, our method improves the face forgery detection ACC on the FS and F2F forgery types of the FF++ dataset by 6.01% and 3.99%, respectively. In the cross-dataset test against 11 existing representative methods, our method increases the video-level and image-level AUC on the CDF dataset by 1.85% and 1.03%, respectively. Conclusion Owing to the purity and reliability of the extracted features, the proposed neural network (PuRe), based on the Pure Image Patch Comparison (PIPC) and Reliable Residual Map Estimation (RRME) modules, shows strong generalization ability and performs better than existing methods.
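The idea behind the distance-field-weighted residual loss can be sketched as follows. The abstract states only that pixels in the forged region farther from the background receive smaller loss weights; the exponential decay, the temperature `tau`, the L1 residual error, the weight of 1 on background pixels, and the function names below are all assumptions made for illustration, not the paper's definition of DWRLoss. The distance field is computed by brute force to keep the sketch dependent on NumPy only; for real image sizes, a dedicated routine such as `scipy.ndimage.distance_transform_edt` would be the practical choice.

```python
import numpy as np

def distance_to_background(forged_mask: np.ndarray) -> np.ndarray:
    """Euclidean distance from every pixel to the nearest background pixel (mask == 0)."""
    ys, xs = np.nonzero(forged_mask == 0)  # coordinates of background pixels
    if ys.size == 0:
        raise ValueError("mask contains no background pixels")
    gy, gx = np.indices(forged_mask.shape)
    # Brute force: (H, W, 1) against (N_bg,) broadcasts to (H, W, N_bg).
    d2 = (gy[..., None] - ys) ** 2 + (gx[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=-1))

def distance_field_weights(forged_mask: np.ndarray, tau: float = 2.0) -> np.ndarray:
    """Assumed weighting: 1 on the background, decaying with depth inside the forged region."""
    return np.exp(-distance_to_background(forged_mask) / tau)

def weighted_residual_loss(pred, target, forged_mask, tau: float = 2.0) -> float:
    """Weighted mean absolute error between predicted and reference residual maps (H x W)."""
    weights = distance_field_weights(forged_mask, tau)
    return float((weights * np.abs(pred - target)).sum() / weights.sum())

# Toy example: a 3x3 forged region centred in a 5x5 image.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
weights = distance_field_weights(mask)
# The centre pixel lies 2 pixels from the background and the forged pixels
# on the rim lie 1 pixel away, so the centre receives the smallest weight.
```

Under these assumptions, an error of the same magnitude costs less deep inside the forged region than next to the forged edge, which mirrors the stated goal of concentrating training on the residuals that can be estimated reliably.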
Keywords
face forgery detection; deepfake; multi-task learning; generalization; pixel-wise supervision; convolutional neural network