
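Li Xiaolong1, Yu Nenghai2, Zhang Xinpeng3, Zhang Weiming2, Li Bin4, Lu Wei5, Wang Wei6, Liu Xiaolong1 (1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044; 2. School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230027; 3. School of Communication and Information Engineering, Shanghai University, Shanghai 200444; 4. College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060; 5. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006; 6. Institute of Automation, Chinese Academy of Sciences, Beijing 100190)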

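Abstract
Millions of multimedia items spread across the Internet every day. Which of them are authentic and trustworthy, and what tampering lies behind the fake ones? Digital forensics technology provides the answer. Rather than embedding a watermark in advance, it directly analyzes the content of multimedia data to judge authenticity. The inherent characteristics of original multimedia data are consistent and unique, so they can serve as an “intrinsic fingerprint”; any tampering or forgery damages the integrity of these characteristics to some extent, which makes them usable for identifying tampered files. As the volume of tampered media grows day by day, social stability and even national security are seriously threatened. In particular, with the rapid development of deep learning, the perceptual gap between fake and real media keeps shrinking, which poses a great challenge to media forensics research and has made multimedia forensics an important research direction in information security. Technologies and tools that can detect fake multimedia content and prevent the spread of dangerous false information are therefore urgently needed. This paper summarizes the notable detection and forensics algorithms proposed in the multimedia forensics field. In addition to reviewing traditional media forensics methods, it introduces deep-learning-based methods. The paper covers today's mainstream tampering targets, namely images, video, and speech. For each media form it presents traditional tampering methods and AI (artificial intelligence)-generated tampering methods. It also introduces publicly available large-scale datasets and related applications, and discusses possible future directions for the field.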
Overview of digital media forensics technology

Li Xiaolong1, Yu Nenghai2, Zhang Xinpeng3, Zhang Weiming2, Li Bin4, Lu Wei5, Wang Wei6, Liu Xiaolong1(1.Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China;2.School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China;3.School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;4.School of Information Engineering, Shenzhen University, Shenzhen 518060, China;5.School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China;6.Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)

The Internet and social networks have become the main platforms for people to access and share digital media. Among these, images, video, and audio carry the most information and attract the most attention. With the rapid development of computer technology, image and video editing tools such as Photoshop, Adobe Premiere Pro, and VideoStudio have appeared one after another, and they make modifying media faster and easier. Forged images look realistic, and edited or synthesized video looks natural and smooth. Image generation technology has also advanced greatly in recent years, and generated images can be visually convincing even though they are fake. As a result, multimedia forgery has drawn public attention. A forgery may be made for entertainment (such as beautifying images), to maliciously alter the content of images and videos (such as deliberately modifying photos of political figures or exaggerating the severity of news events), or to copy content maliciously. Image forgery incidents in recent years are also a reminder to pay attention to the security of media content: the credibility of visual media is declining and is increasingly questioned. At present, millions of multimedia items are transmitted over the Internet every day. Which content is authentic? What tampering lies behind the fake content? Digital forensics technology, proposed in recent years, provides the answer. It does not embed a watermark in advance but directly analyzes the content of multimedia data to determine authenticity. The basic principle is that the inherent characteristics of original multimedia data are consistent and unique and can therefore serve as an “intrinsic fingerprint”; any tampering or forgery damages the integrity of that fingerprint to some extent.
In recent years, media tampering has increased and now seriously threatens social stability and even national security. In particular, with the rapid development of deep learning, the perceptual gap between fake and real media keeps shrinking. This trend poses a serious challenge to media forensics research and has made multimedia forensics an important research direction in information security. Technologies and tools that can detect fake multimedia content and prevent the spread of dangerous false information are therefore urgently required. This article summarizes the notable detection and forensics algorithms proposed in the multimedia forensics field. In addition to reviewing traditional media forensics methods, we introduce methods based on deep learning. The article covers the current mainstream targets of multimedia tampering, namely images, video, and audio, and for each media form discusses both traditional tampering methods and artificial intelligence (AI)-based tampering methods. Video tampering is mainly divided into intraframe tampering and interframe tampering. Intraframe tampering operates within individual frames, deleting objects from the picture or performing “copy and move” operations, whereas interframe tampering operates on the frame sequence, adding or deleting frames. Traditional methods for detecting fake videos can be divided into video coding trace detection, video content inconsistency detection, video frame duplication detection, and copy-paste detection. AI-based fake video detection focuses on the artifacts left by the generative network, whose generation process differs from the imaging process of a real camera. The purpose of digital image forensics is to verify the integrity and authenticity of digital images. Image forensic methods can be divided into active and passive methods.
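As a rough illustration of the interframe tampering described above, the following Python sketch flags frames that reappear later in a sequence, which is the trace a frame-duplication forgery leaves. It is a minimal sketch under stated assumptions, not a method taken from the surveyed literature: frames are assumed to be small grayscale arrays given as lists of rows of 0-255 integers, the names `frame_digest` and `find_duplicated_frames` and the `min_gap` parameter are hypothetical, and matching is exact, whereas a practical detector would need a similarity measure that survives re-encoding.

```python
import hashlib
from collections import defaultdict


def frame_digest(frame):
    """Hash a grayscale frame given as rows of 0-255 integers (assumed format)."""
    h = hashlib.sha256()
    for row in frame:
        # Include the row length so frames of different widths cannot collide.
        h.update(len(row).to_bytes(4, "big"))
        h.update(bytes(row))
    return h.hexdigest()


def find_duplicated_frames(frames, min_gap=2):
    """Return (earlier_index, later_index) pairs of identical frames.

    Pairs closer than min_gap are ignored so that a static scene, where
    adjacent frames can legitimately be identical, is not flagged.
    """
    seen = defaultdict(list)  # digest -> indices where it occurred
    pairs = []
    for idx, frame in enumerate(frames):
        digest = frame_digest(frame)
        for earlier in seen[digest]:
            if idx - earlier >= min_gap:
                pairs.append((earlier, idx))
        seen[digest].append(idx)
    return pairs


if __name__ == "__main__":
    # Six distinct 4x4 frames; frames 1 and 2 are re-inserted after frame 4.
    frames = [[[i] * 4 for _ in range(4)] for i in range(6)]
    video = frames[:5] + frames[1:3] + frames[5:]
    print(find_duplicated_frames(video))  # prints [(1, 5), (2, 6)]
```

Hashing keeps the comparison linear in the number of frames. The exact-match assumption is the main limitation: once a forged video is recompressed, duplicated frames are no longer bit-identical.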
Active image forensics embeds watermarks or signatures in digital images in advance. Passive blind forensics needs no such pre-embedded information; it assesses an image by detecting the traces that tampering leaves in it. Common image forgery and tampering operations include enhancement, modification, region duplication, splicing, and synthesis. Detection of partially replaced images is divided into the following: 1) Region copy tamper detection, in which part of the image has been copied and pasted to another area of the image. During copying, the copied region may undergo various geometric transformations and postprocessing. 2) Image processing fingerprint detection. The visual differences caused by simple region copying, splicing, and tampering remain evident, so the forger applies postprocessing such as scaling, rotating, and blurring to hide these traces, and this postprocessing in turn leaves detectable fingerprints. 3) Recompression fingerprint detection. Tampered images inevitably undergo recompression; thus, recompression detection provides strong supporting evidence for digital image forensics. For source tracing of forged images, most images are captured by cameras. The general physical structure of a camera and the physical differences between cameras leave traces on captured images. These traces (camera fingerprints) appear as a series of features in the image, and the acquisition device can be identified by examining the device fingerprint embedded in the image. Detection of images generated entirely by AI likewise focuses on the artifacts left by the generative network. Over the past decades, digital audio forensic studies have focused on detecting various forms of audio tampering. Some of these methods check the metadata of audio files.
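The region copy detection in item 1) above can be sketched with a simple block-matching scheme: collect every small block of the image, find blocks that occur at more than one position, and let the matching pairs vote for the displacement between them. This is only an illustrative sketch under stated assumptions, not an algorithm from the surveyed work. The image is assumed to be a grayscale list of rows, the names `detect_copy_move` and `make_forged_example` and the `block` and `min_votes` parameters are hypothetical, and blocks are compared exactly, so the sketch does not handle the geometric transformations and postprocessing mentioned above, and large uniform regions would produce spurious matches.

```python
import random
from collections import Counter, defaultdict


def detect_copy_move(image, block=4, min_votes=3):
    """Return [((dy, dx), votes), ...] for shifts shared by many identical blocks.

    image is assumed to be a list of equal-length rows of integers.
    """
    height, width = len(image), len(image[0])
    positions = defaultdict(list)  # block content -> top-left positions
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            key = tuple(tuple(image[y + r][x:x + block]) for r in range(block))
            positions[key].append((y, x))

    votes = Counter()
    for locs in positions.values():
        # Every pair of positions holding the same block votes for its shift.
        for i in range(len(locs)):
            for j in range(i + 1, len(locs)):
                (y1, x1), (y2, x2) = locs[i], locs[j]
                votes[(y2 - y1, x2 - x1)] += 1
    return [(shift, n) for shift, n in votes.most_common() if n >= min_votes]


def make_forged_example(size=16, seed=0):
    """Build a random test image and paste a 6x6 region from (2, 2) to (9, 8)."""
    rng = random.Random(seed)
    image = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
    for r in range(6):
        for c in range(6):
            image[9 + r][8 + c] = image[2 + r][2 + c]
    return image


if __name__ == "__main__":
    print(detect_copy_move(make_forged_example()))
```

On the synthetic example, the dominant shift should be (7, 6), the displacement between the source and pasted regions. Voting on shifts rather than reporting single block matches is a design choice that suppresses isolated coincidental matches.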
In addition, publicly available large-scale datasets and related applications are introduced, and possible future directions for the multimedia forensics field are discussed.