Improving deep learning-based video steganalysis with motion vector differences

Yongjian Hu; Xiongbo Huang; Yufei Wang; Beibei Liu; Shuowei Liu

doi:10.11834/jig.220328

Information Hiding


Vol. 28, Issue 3, Pages 702-715 (2023)
Published: 16 March 2023
Accepted: 14 June 2022

Yongjian Hu, Xiongbo Huang, Yufei Wang, Beibei Liu, Shuowei Liu. Improving deep learning-based video steganalysis with motion vector differences [J]. Journal of Image and Graphics, 28(3): 702-715 (2023). DOI: 10.11834/jig.220328

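Abstract (translated from the Chinese)

Objective

To address the insufficient accuracy of existing deep learning networks for video steganalysis, this paper starts from the principles of video compression coding, explores the relationship between the coding parameters used for embedding and other coding parameters, and improves the detection performance of existing networks by extending the detection space and constructing new detection channels.

Method

Taking H.265/HEVC (high efficiency video coding) compressed video as an example, we first analyze how embedding modifications to motion vectors affect motion vector differences and show that motion vector differences can serve as an additional sampling object (or detection object). We then propose a method for constructing the motion vector difference detection matrix, which solves the problems of sparse samples in the spatial domain and spatially misaligned samples in the temporal domain. Finally, the motion vector difference matrices are applied directly to improve three existing H.265/HEVC deep learning video steganalysis networks, VSRNet (video steganalysis residual network), SCA-VSRNet (selection-channel-aware VSRNet) and Q-VSRNet (quantitative VSRNet), yielding IVSRNet (improved VSRNet), SCA-IVSRNet (selection-channel-aware improved VSRNet) and Q-IVSRNet (quantitative improved VSRNet), respectively.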
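The effect of MV embedding on MVDs can be illustrated with a minimal sketch, assuming the standard relation MVD = MV − MVP (MVP being the motion vector predictor) and assuming the predictor stays fixed during embedding. The Laplacian-shaped toy data, the 20% modification rate and the helper `mvd_histogram` are illustrative assumptions, not the paper's data or code.

```python
import numpy as np

def mvd_histogram(mvd, max_abs=3):
    """Count MVD values in bins -max_abs..max_abs (outliers are clipped)."""
    clipped = np.clip(mvd, -max_abs, max_abs)
    return np.bincount(clipped + max_abs, minlength=2 * max_abs + 1)

rng = np.random.default_rng(0)
n = 100_000

# Toy cover data (assumption): integer MVDs peaked at zero, arbitrary predictors.
cover_mvd = np.rint(rng.laplace(0.0, 1.5, n)).astype(np.int64)
mvp = rng.integers(-16, 17, n)
cover_mv = mvp + cover_mvd  # MV = MVP + MVD

# Toy embedding (assumption): add +1 or -1 to roughly 20% of the MVs.
modify = rng.random(n) < 0.2
delta = np.where(modify, rng.choice(np.array([-1, 1]), n), 0)
stego_mv = cover_mv + delta

# With the predictor held fixed, every MV change shows up one-to-one in the MVD.
stego_mvd = stego_mv - mvp

cover_hist = mvd_histogram(cover_mvd)
stego_hist = mvd_histogram(stego_mvd)
print("bins :", list(range(-3, 4)))
print("cover:", cover_hist.tolist())
print("stego:", stego_hist.tolist())
```

In this toy setting the zero bin shrinks and mass moves to neighbouring bins, which is the kind of bin-height change an MVD-based detector could pick up. Real encoders re-derive predictors from neighbouring blocks, so the actual effect is more involved.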

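Results

Tests were carried out on 5 steganographic methods. Comparisons were made with 4 steganalysis methods: the classical handcrafted-feature video steganalysis methods AoSO (adding or subtracting one), MVRB (motion vector reversion-based) and NPEFLO (near-perfect estimation for local optimality), all transplanted to H.265/HEVC video, and LOCL (local optimality in candidate list), a recent steganalysis method designed directly for H.265/HEVC video. In the qualitative steganalysis tests, taking an embedding rate of 0.2 bpmv as an example, the accuracies of IVSRNet and SCA-IVSRNet consistently exceed those of VSRNet and SCA-VSRNet, respectively, at all tested bit rates; the accuracy of SCA-IVSRNet consistently exceeds that of AoSO and MVRB and is in some cases better than that of the more recent LOCL method. In the quantitative steganalysis tests, Q-IVSRNet consistently outperforms Q-VSRNet on samples with 6 different embedding rates.

Conclusion

The proposed strategy of extending the detection space rests on a clear principle, and the method for constructing the input matrices is simple and general. It can be readily extended to other deep learning video steganalysis networks and points to a way of designing more effective video steganalysis networks.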

Abstract

Objective

Video steganography and video steganalysis have been widely studied because video is an ideal cover medium for achieving high embedding capacity. Deep learning has recently been introduced to video steganalysis, and a few deep neural networks have been published for detecting secret embedding in motion vectors (MVs). However, the current deep neural networks (DNNs) for video steganalysis report only mediocre detection accuracies compared with traditional handcrafted feature-based steganalysis approaches. It is conjectured that this performance limitation is due to the inadequate information provided to the network. Starting from the principles of video encoding, we explore the impact of steganographic embedding on different encoding parameters. Our aim is to extend the detection space by searching for abnormalities in coding parameters caused by steganography, so that multiple input channels can be constructed to improve the detection performance of steganalysis networks.

Method

We first analyze how the motion vector differences (MVDs) are influenced by secret embedding in motion vectors (MVs). It is shown that the histogram of MVDs exhibits visible changes in bin height after MV embedding. Because the MVDs convey critical information for revealing MV alteration, we propose to treat the MVDs as an extra sampling space for the video steganalysis network, in addition to the existing MV and prediction residual spaces. However, the MVDs are irregularly and sparsely distributed in individual frames and are therefore difficult to align across consecutive frames. We design a method for constructing input channels of MVD samples that is compatible with the existing network architecture. Specifically, two matrices are adopted to record the vertical and horizontal components of the MVD. Since the prediction unit (PU) partition varies from frame to frame, we take the minimum 4×4 block as the basic sampling unit. The vertical and horizontal components of the MVD of each 4×4 block are recorded as one element in the vertical MVD matrix and the horizontal MVD matrix, respectively. In the H.265/HEVC (high efficiency video coding) video format, some blocks do not involve inter-frame prediction and thus have neither MVs nor MVDs. Other blocks use inter-frame prediction but adopt the Merge or Skip mode, and therefore have MVs but no MVDs. For these two types of blocks, the corresponding elements in the MVD matrices are set to zero. The newly introduced MVD channels can work alone or together with other channels such as MVs and prediction residuals. By incorporating the MVD channels into current video steganalysis networks, we obtain improved networks for various tasks, including the improved VSRNet (IVSRNet), the selection-channel-aware improved VSRNet (SCA-IVSRNet) and the quantitative improved VSRNet (Q-IVSRNet).
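The construction of the two MVD matrices described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the PU record layout (keys `x`, `y`, `w`, `h`, `mode`, `mvd`), the mode labels and the function name `build_mvd_matrices` are hypothetical, and the sketch assumes that every 4×4 unit inside a PU inherits that PU's MVD.

```python
import numpy as np

UNIT = 4  # smallest block size, used as the basic sampling unit

def build_mvd_matrices(height, width, pus):
    """Return (horizontal, vertical) MVD matrices for one frame.

    `pus` is a hypothetical list of dicts with keys x, y, w, h (in pixels),
    mode ("amvp", "merge", "skip" or "intra") and mvd as (dx, dy) or None.
    """
    rows, cols = height // UNIT, width // UNIT
    mvd_h = np.zeros((rows, cols), dtype=np.int32)
    mvd_v = np.zeros((rows, cols), dtype=np.int32)
    for pu in pus:
        # Intra, Merge and Skip blocks carry no MVD: their cells stay zero.
        if pu["mode"] != "amvp" or pu["mvd"] is None:
            continue
        r0, c0 = pu["y"] // UNIT, pu["x"] // UNIT
        r1, c1 = (pu["y"] + pu["h"]) // UNIT, (pu["x"] + pu["w"]) // UNIT
        dx, dy = pu["mvd"]
        # Assumption: each 4x4 unit covered by the PU gets the PU's MVD.
        mvd_h[r0:r1, c0:c1] = dx
        mvd_v[r0:r1, c0:c1] = dy
    return mvd_h, mvd_v

# Toy 16x32 frame with four PUs of different sizes and modes.
frame_pus = [
    {"x": 0, "y": 0, "w": 16, "h": 8, "mode": "amvp", "mvd": (3, -1)},
    {"x": 16, "y": 0, "w": 16, "h": 16, "mode": "merge", "mvd": None},
    {"x": 0, "y": 8, "w": 8, "h": 8, "mode": "intra", "mvd": None},
    {"x": 8, "y": 8, "w": 8, "h": 8, "mode": "amvp", "mvd": (-2, 4)},
]
mvd_h, mvd_v = build_mvd_matrices(16, 32, frame_pus)

# Two fixed-size channels per frame, regardless of the PU partition.
mvd_channels = np.stack([mvd_h, mvd_v])
print(mvd_h)
print(mvd_v)
```

Sampling on a fixed 4×4 grid gives every frame matrices of the same shape, so the channels of consecutive frames line up position by position even though the PU partition changes from frame to frame.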

Result

We conduct extensive experiments against 5 target steganographic methods with varying resolutions, bit rates and embedding rates. All embedding and detection are performed on H.265/HEVC videos. Two of the target methods are classical methods originally designed for H.264 videos and transplanted to H.265/HEVC videos. The remaining three targets are recently published H.265/HEVC-specific steganographic methods. We first evaluate the performance of the MVD-VSRNet, which uses only the MVD and prediction residual channels without the MV channels. The MVD-VSRNet achieves higher accuracies than the baseline VSRNet, which employs the MV and prediction residual channels. The discriminating capability of MVDs for stego videos is thus verified. The IVSRNet, which adopts the MV, prediction residual and MVD channels, achieves an even better result. We then evaluate the SCA-IVSRNet, which integrates the IVSRNet with an embedding probability channel. It is shown that the SCA-IVSRNet outperforms both the IVSRNet and the SCA-VSRNet. We conduct comparisons with several milestone handcrafted feature-based video steganalysis approaches for MV-based steganography, including the adding or subtracting one (AoSO), motion vector reversion-based (MVRB) and near-perfect estimation for local optimality (NPEFLO) algorithms. We also include the local optimality in candidate list (LOCL) method, the latest state-of-the-art (SOTA) steganalysis method that exploits specific features of the H.265/HEVC standard. It is shown that the SCA-IVSRNet surpasses all the other methods on the two transplanted target steganographic methods. On the H.265/HEVC-specific steganographic methods, the SCA-IVSRNet loses marginally to the NPEFLO and LOCL methods, by less than 2%, but exceeds the remaining methods by around 10%. Among the five targets, the most challenging one does not directly change the MV values. In this case, the SCA-IVSRNet reports accuracies around 67%, only 0.3% behind the first-place LOCL. It is worth noting that the IVSRNet also reaches 63% in this case, verifying again the important role of the proposed MVD channels. Finally, we assess the performance of the Q-IVSRNet on the quantitative steganalysis task. The mean absolute errors (MAEs) obtained with the Q-IVSRNet are consistently lower than those obtained with the Q-VSRNet, which can be attributed to the effectiveness of the MVD channels.
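For reference, the mean absolute error used to score quantitative steganalysis can be computed as in the following sketch. The function name and the numbers are illustrative assumptions only; they are not results from the paper.

```python
import numpy as np

def mean_absolute_error(true_rates, predicted_rates):
    """Average absolute gap between true and predicted embedding rates."""
    true_rates = np.asarray(true_rates, dtype=float)
    predicted_rates = np.asarray(predicted_rates, dtype=float)
    if true_rates.shape != predicted_rates.shape:
        raise ValueError("inputs must have the same shape")
    return float(np.mean(np.abs(true_rates - predicted_rates)))

# Made-up embedding rates and predictions, for illustration only.
true_rates = [0.0, 0.1, 0.2, 0.3]
predicted_rates = [0.05, 0.1, 0.25, 0.2]
mae = mean_absolute_error(true_rates, predicted_rates)
print(f"MAE = {mae:.3f}")
```

A lower MAE means the predicted embedding rates sit closer to the true ones, which is the sense in which one quantitative network is compared with another.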

Conclusion

In this work, we aim to improve the detection accuracy of convolutional neural network (CNN)-based steganalyzers for MV-based video steganography. We point out that the current input spaces of MVs and prediction residuals do not convey adequate steganalytic information. To solve this problem, we propose to extend the detection space to MVDs. The newly introduced MVD channel is fully compatible with current CNN-based video steganalyzers, leading to several improved steganalysis networks. Extensive experiments are conducted to evaluate the effectiveness of adopting MVD channels. Results show that the improved detection networks not only surpass their predecessors by a large margin, but also match or even exceed some popular handcrafted feature-based steganalyzers. This work shows how to extend the detection space and how to handle highly unstructured data when constructing the input matrix for CNN-based video steganalysis, which paves the way for designing more effective deep learning networks for video steganalysis.

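Keywords (translated from the Chinese)

video steganalysis; deep learning; motion vector (MV); motion vector difference (MVD); detection space; sparse data; signal sampling; input matrix construction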

Keywords

video steganalysis; deep learning; motion vector (MV); motion vector difference (MVD); detection space; sparse data; data sampling; input matrix construction

References

Aly H A. 2011. Data hiding in motion vectors of compressed video based on their associated prediction error. IEEE Transactions on Information Forensics and Security, 6(1): 14-18 [DOI: 10.1109/TIFS.2010.2090520]

Boroumand M, Chen M and Fridrich J. 2019. Deep residual network for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 14(5): 1181-1193 [DOI: 10.1109/TIFS.2018.2871749]

Cao Y, Zhao X F and Feng D G. 2012. Video steganalysis exploiting motion vector reversion-based features. IEEE Signal Processing Letters, 19(1): 35-38 [DOI: 10.1109/LSP.2011.2176116]

Duan R and Chen D. 2018. Video steganography algorithm uses motion vector difference as carrier. Journal of Image and Graphics, 23(2): 163-173 [DOI: 10.11834/jig.170278]

Gao W, Zhao D B and Ma S W. 2018. Principles of digital video coding technology. 2nd ed. Beijing: Science Press

Guo M Y, Sun T F, Jiang X H, Dong Y and Xu K. 2020. A motion vector-based steganographic algorithm for HEVC with MTB mapping strategy//Proceedings of the 18th International Workshop on Digital Watermarking. Chengdu, China: Springer: 293-306 [DOI: 10.1007/978-3-030-43575-2_25]

Hu Y J, Gong W B, Liu B B, Liu S W and Zhu M N. 2018. Large-capacity lossless HEVC information hiding based on index parameter modification. Journal of South China University of Technology (Natural Science Edition), 46(5): 1-8 [DOI: 10.3969/j.issn.1000-565X.2018.05.001]

Huang X B, Hu Y J and Wang Y F. 2020. A detection method with deep neural networks for video motion vector steganography. Journal of South China University of Technology (Natural Science Edition), 48(8): 1-9 [DOI: 10.12141/j.issn.1000-565X.190917]

Huang X B, Hu Y J, Wang Y F, Liu B B and Liu S W. 2020a. Selection-channel-aware deep neural network to detect motion vector embedding of HEVC videos//Proceedings of 2020 IEEE International Conference on Signal Processing, Communications and Computing. Macau, China: IEEE: 1-6 [DOI: 10.1109/ICSPCC50002.2020.9259551]

Huang X B, Hu Y J, Wang Y F, Liu B B and Liu S W. 2020b. Deep learning-based quantitative steganalysis to detect motion vector embedding of HEVC videos//Proceedings of the 5th IEEE International Conference on Data Science in Cyberspace. Hong Kong, China: IEEE: 150-155 [DOI: 10.1109/DSC50466.2020.00030]

Liu S W, Hu Y J, Liu B B and Li C T. 2021a. An HEVC steganalytic approach against motion vector modification using local optimality in candidate list. Pattern Recognition Letters, 146: 23-30 [DOI: 10.1016/j.patrec.2021.02.018]

Liu S W, Liu B B, Hu Y J and Zhao X F. 2021b. Non-degraded adaptive HEVC steganography by advanced motion vector prediction. IEEE Signal Processing Letters, 28: 1843-1847 [DOI: 10.1109/LSP.2021.3111565]

Tan S Q, Wu W L, Shao Z L, Li Q S, Li B and Huang J W. 2021. CALPA-NET: channel-pruning-assisted deep residual network for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 16: 131-146 [DOI: 10.1109/TIFS.2020.3005304]

Tasdemir K, Kurugollu F and Sezer S. 2016. Spatio-temporal rich model-based video steganalysis on cross sections of motion vector planes. IEEE Transactions on Image Processing, 25(7): 3316-3328 [DOI: 10.1109/TIP.2016.2567073]

Wang K R, Zhao H and Wang H X. 2014. Video steganalysis against motion vector-based steganography by adding or subtracting one motion vector value. IEEE Transactions on Information Forensics and Security, 9(5): 741-751 [DOI: 10.1109/TIFS.2014.2308633]

Wang P P, Cao Y, Zhao X F and Wu B. 2015. Motion vector reversion-based steganalysis revisited//2015 IEEE China Summit and International Conference on Signal and Information Processing. Chengdu, China: IEEE: 463-467 [DOI: 10.1109/ChinaSIP.2015.7230445]

Wu H T, Liu Y, Huang J W and Yang X Y. 2014. Improved steganalysis algorithm against motion vector based video steganography//Proceedings of 2014 IEEE International Conference on Image Processing. Paris, France: IEEE: 5512-5516 [DOI: 10.1109/ICIP.2014.7026115]

Xu C Y, Ping X J and Zhang T. 2006. Steganography in compressed video stream//Proceedings of the 1st International Conference on Innovative Computing, Information and Control. Beijing, China: IEEE: 269-272 [DOI: 10.1109/ICICIC.2006.158]

Xu G S, Wu H Z and Shi Y Q. 2016. Structural design of convolutional neural networks for steganalysis. IEEE Signal Processing Letters, 23(5): 708-712 [DOI: 10.1109/LSP.2016.2548421]

Yang J and Li S B. 2018. An efficient information hiding method based on motion vector space encoding for HEVC. Multimedia Tools and Applications, 77(10): 11979-12001 [DOI: 10.1007/s11042-017-4844-1]

Ye J, Ni J Q and Yi Y. 2017. Deep learning hierarchical representations for image steganalysis. IEEE Transactions on Information Forensics and Security, 12(11): 2545-2557 [DOI: 10.1109/TIFS.2017.2710946]

You W K, Zhang H and Zhao X F. 2021. A Siamese CNN for image steganalysis. IEEE Transactions on Information Forensics and Security, 16: 291-306 [DOI: 10.1109/TIFS.2020.3013204]

Zhang H, Cao Y and Zhao X F. 2017. A steganalytic approach to detect motion vector modification using near-perfect estimation for local optimality. IEEE Transactions on Information Forensics and Security, 12(2): 465-478 [DOI: 10.1109/TIFS.2016.2623587]
