Recompression detection for HEVC videos with the same coding parameters

Pan Pengfei, Yao Ye, Wang Hui (School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018)

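Abstract
Objective Video recompression detection is an important auxiliary tool in video forensics. Recompression detection for high efficiency video coding (HEVC) video already achieves high accuracy when the two compressions use different coding parameters, but when the same coding parameters are used both times, the traces left by recompression are very slight and detection is difficult. We therefore propose a recompression detection algorithm for the same-coding-parameter case, based on the video quality degradation mechanism. Method After repeated compression with the same coding parameters, video quality is observed to tend toward a stable level, so the degree of quality degradation can be used to distinguish singly compressed video from recompressed video. This paper proposes two kinds of video features, the I-frame prediction unit mode (intra-coded picture prediction unit mode, IPUM) and the P-frame prediction unit mode (predicted picture prediction unit mode, PPUM), that is, the modes of the prediction units (PUs) extracted from the luminance component (Y) of I frames and P frames, respectively. The IPUM and PPUM features are extracted from the video under test; the HEVC video is then compressed three times with the same coding parameters, and the features are extracted after each compression. Because the numbers of PUs of different sizes in I frames and P frames differ widely, the PU sizes that occur in larger numbers should be chosen as the statistical features. The number of PUs at the same position whose mode differs between the nth and the (n+1)th compression is counted and averaged per I frame and per P frame, forming a 6-dimensional feature set that is fed to a support vector machine (SVM) for classification. Result The average detection accuracies of the proposed method on the CIF (common intermediate format), 720p, and 1 080p datasets are 95.45%, 94.8%, and 95.53%, respectively. The method also performs well under different group of pictures (GOP) settings and under frame deletion. Conclusion The proposed method uses the number of PUs whose mode differs between two consecutive compressions at the same position to reveal the pattern of video quality degradation; it achieves high accuracy and performs well under different conditions.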
Detection of double compression for HEVC videos with the same coding parameters

Pan Pengfei, Yao Ye, Wang Hui (School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018, China)

Abstract
Objective Multimedia forensics and copyright protection have become pressing concerns. The widespread use of portable cameras, mobile phones, and surveillance cameras has led to explosive growth in the amount of digital video data. Although people enjoy the convenience of ubiquitous digital multimedia, they also face considerable security problems. Recompressing a digital video file is a necessary step in any malicious modification of video content, so detecting double compression is an important auxiliary measure for video content forensics. Content-tampered video inevitably undergoes two or more compression operations; if a video under test is judged to have been recompressed, it is more likely to have undergone content tampering. At present, double compression detection for high efficiency video coding (HEVC) video achieves high accuracy when the two compressions use different coding parameters. However, when the same coding parameters are used, the traces of HEVC double compression are very slight and detection is considerably more difficult. Most attackers are concerned mainly with modifying the video content. Because the video stream carries the video parameter set and the picture parameter set, video editing software generally re-compresses with the same parameters by default. This study proposes a detection algorithm for video double compression with the same coding parameters, based on the video quality degradation mechanism. Method After multiple compressions with the same coding parameters, the video quality tends to stop changing, so single-compressed and double-compressed videos can be distinguished by the degree of video quality degradation. Video coding relies on rate-distortion optimization, which balances bitrate against distortion to choose the optimal parameters for the encoder.
When a video is compressed again with the same coding parameters, the traces of recompression are extremely slight: the partitioning of coding units into prediction units (PUs) changes only a little, and the distribution of PU size types is barely affected. Thus, double compression with the same parameters is more difficult to detect. Because the transform, quantization, and coding of each coding unit are independent, the quantization error and its distribution characteristics are independent, too. The resulting discontinuities at the boundaries of adjacent blocks affect the mode selection of intra-prediction. In motion-compensated prediction, the predicted values of adjacent blocks come from different positions in different pictures, which makes the prediction residual numerically discontinuous at block boundaries. This in turn affects the selection of motion vectors and reference pictures in inter-frame prediction. This study proposes a detection algorithm based on two kinds of video features: the I frame PU mode (intra-coded picture prediction unit mode, IPUM) and the P frame PU mode (predicted picture prediction unit mode, PPUM). These features are extracted from the luminance component (Y) of I frames and P frames, respectively. First, the IPUM and PPUM features are extracted from the HEVC video under test. Then, the video is compressed three times with the same coding parameters, and the features are extracted again after each compression. Because the numbers of PUs of different sizes in I frames and P frames differ widely, the PU sizes that occur in larger numbers should be selected as the statistical features. Finally, the number of PUs at the same position whose mode differs between the nth and the (n+1)th compression is counted and averaged over the I frames and over the P frames, forming a 6-dimensional feature set that is sent to a support vector machine (SVM) for classification.
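The feature construction described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not spell out: the six dimensions are read here as three compression-to-compression transitions (video under test, then three recompressions) for each of the two frame types, and each frame is represented as a flat list of PU mode labels on a fixed grid of positions. The names `Frame`, `Version`, `mean_changed_modes`, and `feature_vector` are hypothetical, and the HEVC decoding and PU extraction steps are not shown.

```python
from typing import List, Sequence

# Assumed representation (not from the paper): one frame is a flat list of
# PU mode labels, one entry per fixed grid position in the luminance plane.
Frame = Sequence[int]
# One compressed version of the video: all of its I frames (or all P frames).
Version = Sequence[Frame]


def mean_changed_modes(prev: Version, curr: Version) -> float:
    """Average number of co-located positions per frame whose PU mode differs."""
    if len(prev) == 0 or len(prev) != len(curr):
        raise ValueError("versions must hold the same, non-zero number of frames")
    changed = 0
    for frame_a, frame_b in zip(prev, curr):
        if len(frame_a) != len(frame_b):
            raise ValueError("frames must cover the same positions")
        changed += sum(1 for m, n in zip(frame_a, frame_b) if m != n)
    return changed / len(prev)


def feature_vector(ipum: Sequence[Version], ppum: Sequence[Version]) -> List[float]:
    """Build the 6 values: 3 transitions for I frames, then 3 for P frames.

    Each argument holds 4 versions: the video under test followed by its
    three recompressions (an assumed reading of the 6-dimensional feature).
    """
    if len(ipum) != 4 or len(ppum) != 4:
        raise ValueError("expected the tested video plus three recompressions")
    features: List[float] = []
    for versions in (ipum, ppum):
        for n in range(3):
            features.append(mean_changed_modes(versions[n], versions[n + 1]))
    return features
```

Under this reading, a video that has already been compressed several times should yield small values in every dimension, because its PU modes have largely stabilized, whereas a singly compressed video should show a larger first-transition count.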
Result The experiments use video sets at three resolutions: common intermediate format (CIF) (352×288 pixels), 720p (1 280×720 pixels), and 1 080p (1 920×1 080 pixels). To increase the number of video samples, each test sequence is cut into shorter clips of 100 frames each. If a sequence exceeds 1 000 frames, only its first 1 000 frames are used to generate samples. Accordingly, a total of 132 CIF video segments, 87 720p video segments, and 98 1 080p video segments are obtained. For each set, 4/5 of the positive samples and their corresponding negative samples are randomly selected as the training set, and the rest are used as the test set. Binary classification is performed with an SVM classifier with a radial basis function kernel. The parameters gamma and cost are optimized by grid search with fivefold cross validation. The final detection accuracy is the average over 30 repetitions of the experiment, with the training and test data randomly re-selected each time. Given the computational cost of the experiment, the number of recompressions per test is an important choice. As the number of recompressions increases, the computational cost of video encoding and decoding increases linearly, whereas the classification accuracy does not improve significantly. Thus, the number of recompressions is set to three. The average detection accuracies on the CIF, 720p, and 1 080p datasets are 95.45%, 94.8%, and 95.53%, respectively. In addition, video compression is affected by the coding parameters. The group of pictures (GOP) is the basic coding unit of video compression, and the GOP length has a significant impact on video quality owing to error propagation in inter coding.
On the CIF dataset, the detection accuracy of the method exceeds 90% under different GOP settings, declining slightly as the GOP length increases. The final experiment concerns frame deletion, a common video tampering operation. On the CIF dataset, the accuracy of recompression detection for videos with 10 consecutive frames deleted remains above 88%, which indicates that the proposed method is robust to frame deletion. Conclusion This study reveals the pattern of video quality degradation through the number of PUs whose mode changes at the same position in I frames and P frames. The proposed method performs well across the tested situations, maintaining high detection accuracy under different GOP settings, video resolutions, and frame deletion.
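The evaluation protocol in the Result part (a random 4/5 split that keeps each positive sample together with its corresponding negative sample, repeated 30 times and averaged) might be sketched as follows. This is an illustration under assumptions, not the authors' code: `paired_split` and `repeated_accuracy` are hypothetical names, and the SVM training with grid search is left to a caller-supplied `train_and_score` function, which could be backed by any SVM library.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)
Pair = Tuple[Sample, Sample]          # a positive sample and its corresponding negative


def paired_split(
    pairs: Sequence[Pair], train_fraction: float, rng: random.Random
) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle the pairs and split them so both members of a pair stay together."""
    order = list(pairs)
    rng.shuffle(order)
    cut = int(round(train_fraction * len(order)))
    train = [sample for pair in order[:cut] for sample in pair]
    test = [sample for pair in order[cut:] for sample in pair]
    return train, test


def repeated_accuracy(
    pairs: Sequence[Pair],
    train_and_score: Callable[[List[Sample], List[Sample]], float],
    repetitions: int = 30,
    train_fraction: float = 0.8,
    seed: int = 0,
) -> float:
    """Average the test accuracy over several random paired splits.

    train_and_score is a hypothetical hook: it should fit a classifier on the
    training samples (for example an RBF-kernel SVM tuned by grid search) and
    return its accuracy on the test samples.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(repetitions):
        train, test = paired_split(pairs, train_fraction, rng)
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)
```

Splitting by pair rather than by individual sample keeps the two versions of the same clip from landing on opposite sides of the split, which would otherwise let the classifier see test content during training.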
