Published: 2018-10-16  DOI: 10.11834/jig.180037  2018 | Volume 23 | Number 10  Image Processing and Coding

Received: 2018-01-24; revised: 2018-04-23. Supported by: National Natural Science Foundation of China (61602511, 61572518, U1636202). First author: Wang Ran, born in 1985, female, lecturer, received her Ph.D. in signal and information processing from the PLA Information Engineering University in 2014; her research interests are image processing and multimedia information security. E-mail: nemo2007@163.com. Xue Xiaoyan, female, researcher; her research interests are big data and artificial intelligence. E-mail: npxxy1@163.com. Ping Xijian, male, professor; his research interests include image processing, pattern recognition, and multimedia information security. E-mail: pingxijian@163.com. Niu Shaozhang, male, professor; his research interests include computer security and multimedia information security. E-mail: szniu@bupt.edu.cn. Zhang Tao, male, professor; his research interests include image processing, pattern recognition, and multimedia information security. E-mail: brunda@163.com. CLC number: TP391. Document code: A. Article ID: 1006-8961(2018)10-1472-11


Steganalysis of JPEG images based on image classification and segmentation
Wang Ran1,2, Xue Xiaoyan3, Ping Xijian2,4, Niu Shaozhang1, Zhang Tao2
1. Beijing University of Posts and Telecommunications, Beijing 100876, China;
2. Zhengzhou Information Science and Technology Institute, Zhengzhou 450001, China;
3. Jiangnan Institute of Computing Technology, Wuxi 214083, China;
4. Zhengzhou Shengda University of Economics, Business & Management, Zhengzhou 451191, China
Supported by: National Natural Science Foundation of China (61602511, 61572518, U1636202)

# Abstract

Objective Image steganalysis is the counter-technology of steganography; it aims to detect, extract, restore, or destroy secret messages embedded in cover images. As an important technical tool for image information security, image steganalysis has attracted researchers in multimedia information security all over the world. The basic approach of current image steganalysis is to analyze the embedding mechanism and the statistical changes in image data caused by embedding secret messages. Image steganalysis is thus treated as a binary classification problem with two image categories, cover and stego. The performance of steganalysis methods depends on feature extraction, and steganalysis features are expected to have small within-class and large between-class scatter distances. However, embedding changes are correlated not only with the steganography method but also with image content and local statistical characteristics. The changes in steganalysis features caused by secret embedding are subtle, especially when the embedding ratio is low. The contents and statistical characteristics of images have a stronger impact on the distribution of steganalysis features than the embedding process does. Thus, the steganalysis features of cover and stego images can be inseparable owing to the differences in image statistical characteristics. Consequently, image steganalysis becomes a classification problem with large within-class and small between-class scatter distances. To solve this problem, a new steganalysis framework for JPEG images, which aims to reduce the within-class scatter distances, is proposed.
Method Embedded secret messages affect the characteristics of images with different content complexities differently, whereas the steganalysis features of images with the same content complexity are similar. This study therefore focuses on reducing the differences in image statistical characteristics caused by various contents and processing methods. The motivation of the new model is introduced by analyzing Fisher linear discriminant analysis, which is the basis of the ensemble classifier, the classifier most used in steganalysis applications, and a new steganalysis model of JPEG images based on image classification and segmentation is proposed. We define a content complexity evaluation feature for each image, and the given images are first classified according to their content, so that images assigned to the same sub-class have similar content complexity. Then, each image is segmented into several sub-images according to the evaluated texture features and the complexity of each sub-block. During segmentation, we first categorize the image blocks according to texture complexity and then amalgamate the adjacent block categories. After the combined classification and segmentation process, the content texture within each class of image regions is more similar, and the steganalysis features are more concentrated. The steganalysis features are extracted separately from each subset with the same or similar texture complexity to build a classifier. When deciding which steganalysis feature set to extract, we mainly consider performance. In our prior work, we found that when extracting a low-dimensional steganalysis feature set, the performance of the method based on classification or segmentation improves markedly. However, when extracting high-dimensional steganalysis features, such as the JPEG rich model (JRM), the gain is smaller, because the rich model is built on the residual of the given image and already suppresses the effect of image content. The JRM feature set is sensitive to subtle image details, and its steganalysis results are good. We nevertheless extract the JRM feature set, the most representative high-dimensional feature set in the JPEG domain, to prove the validity of the proposed model. In the testing phase, the steganalysis features of each segmented sub-image in each sub-class are sent to the corresponding classifier. The final steganalysis result is obtained through a weighted fusing process. Result In the experiments, we compute two kinds of separability criteria for the tested steganalysis feature set: the criterion based on within- and between-class distances, and the Bhattacharyya distance. The Bhattacharyya distance is one of the most used separability criteria based on the probability density of the classified samples. Both separability criteria of the proposed method are clearly improved, which means that the proposed classification- and segmentation-based steganalysis features can be categorized more easily, thereby verifying the validity of the proposed steganalysis model. We also compare the classification performance of the proposed method and of prior work in various experimental settings, including the use of the same and of different training and testing image databases. We compute the detection results for the original feature set, for the features extracted from the classified images and from the segmented images, and for the combined classification and segmentation. Experimental results show that in both settings, the combined classification and segmentation process can improve performance by up to 10%. The improvement is considerably higher when the training and testing images have different statistical features, which implies that the proposed method is suitable for practical application to images from the Internet, with their considerable diversity in sources, processing methods, and contents.
Conclusion In this paper, a new steganalysis model for JPEG images is proposed. The differences in image statistical characteristics caused by various contents and processing methods are reduced through image classification and segmentation, and the JRM feature set is extracted. Theoretical analysis and experimental results on several diverse image databases and settings demonstrate the validity of the framework. When considerable diversity exists in image sources and contents, such as with different training and testing images, the performance improvement of the proposed method is obvious, indicating that the performance of the proposed method does not depend heavily on image content. Furthermore, the proposed steganalysis model is suitable for practical application in complex network environments.

# Key words

steganalysis; image statistical characteristics; image classification; image segmentation; weighted fusing

# 1 Model principle

The idea of Fisher linear discriminant analysis is to project all samples onto one direction such that the projected values of features within the same class are as concentrated as possible, while the projected values of features from different classes are as far apart as possible. Let the within-class scatter of the two projected classes be $\mathit{\boldsymbol{S}}_i^2$, $i = 1, 2$; the total within-class scatter is then ${\mathit{\boldsymbol{S}}_{\rm{W}}} = \mathit{\boldsymbol{S}}_1^2 + \mathit{\boldsymbol{S}}_2^2$, and the between-class scatter is ${\mathit{\boldsymbol{S}}_{\rm{b}}}$. The Fisher criterion is then

 $\max J\left( \omega \right) = \frac{{{\mathit{\boldsymbol{S}}_{\rm{b}}}}}{{{\mathit{\boldsymbol{S}}_{\rm{W}}}}}$ (1)
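As a concrete illustration of Eq. (1), the following minimal numpy sketch computes the criterion for two classes of 1-D projected features (an illustration only, not the paper's ensemble-classifier implementation; function and variable names are hypothetical):

```python
import numpy as np

def fisher_criterion(x1, x2):
    """Fisher criterion J of Eq. (1) for 1-D projected features:
    between-class scatter divided by total within-class scatter."""
    s_w = np.sum((x1 - x1.mean()) ** 2) + np.sum((x2 - x2.mean()) ** 2)  # S_W = S_1^2 + S_2^2
    s_b = (x1.mean() - x2.mean()) ** 2                                   # S_b
    return s_b / s_w
```

Well-separated classes yield a large J, while overlapping classes yield a J close to zero, which is exactly why steganalysis (large within-class, small between-class scatter) is hard.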

# 2.1 Image classification

 $\begin{array}{l} c_{kl}^*\left( {x, y, \Delta x, \Delta y} \right) = \frac{1}{Z} \times \\ \sum\limits_{i, j} {\left| {\left\{ {R_{xy}^{\left( {i, j} \right)}\left| \begin{array}{l} \mathit{\boldsymbol{R}} = {\rm{t}}{{\rm{r}}_T}\left( {{\mathit{\boldsymbol{A}}^*}} \right)\\ R_{xy}^{\left( {i, j} \right)} = k\\ R_{x + \Delta x, y + \Delta y}^{\left( {i, j} \right)} = l \end{array} \right.} \right\}} \right|} \end{array}$ (4)

 ${\rm{t}}{{\rm{r}}_T}\left( x \right) = \left\{ \begin{array}{ll} T \cdot {\mathop{\rm sgn}} \left( x \right) & \left| x \right| > T\\ x & {\text{otherwise}} \end{array} \right.$ (5)
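To make Eqs. (4) and (5) concrete, here is a minimal numpy sketch of the truncation and of counting co-occurrences at one offset $(\Delta x, \Delta y)$ over a single coefficient array (a simplified single-block illustration, not the paper's implementation; names are hypothetical):

```python
import numpy as np

def tr(x, T):
    """Truncation of Eq. (5): clamp values to [-T, T]."""
    return np.clip(x, -T, T)

def cooccurrence(coeffs, dx, dy, T=3):
    """Simplified single-array version of Eq. (4): normalized counts of
    truncated coefficient pairs (k, l) at offset (dx, dy)."""
    R = tr(coeffs, T)
    size = 2 * T + 1
    C = np.zeros((size, size))
    h, w = R.shape
    # iterate over all positions where both (x, y) and (x+dx, y+dy) are in range
    for x in range(max(0, -dx), min(h, h - dx)):
        for y in range(max(0, -dy), min(w, w - dy)):
            C[R[x, y] + T, R[x + dx, y + dy] + T] += 1
    C /= C.sum()  # the factor 1/Z normalizes counts to a probability
    return C
```

Table 1 below then lists which matrix coordinates $(x, y)$ are retained for each region type.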

Table 1 The features of the sub-images with different complexities

| $(\Delta x, \Delta y)$ | Flat region $(x, y)$ | Medium region $(x, y)$ |
| --- | --- | --- |
| (0, 1), (0, 8), $(y-x, x-y+8)$ | {(1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (4, 1)} | {(1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (4, 1), (4, 2), (5, 1)} |
| (1, 1) | {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3)} | {(1, 2), (1, 3), (1, 4), (1, 5), (2, 2), (2, 3), (2, 4), (3, 3)} |
| (-1, 1) | {(2, 1), (2, 2), (2, 3), (3, 2)} | {(2, 1), (2, 2), (2, 3), (2, 4), (3, 2), (3, 3), (4, 3)} |
| (0, 2) | {(1, 2), (1, 3), (2, 1), (2, 2), (3, 1)} | {(1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (4, 1)} |
| $(y-x, x-y)$ | {(1, 2), (1, 3), (1, 4), (2, 3)} | {(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (3, 3)} |
| (2, 2) | {(1, 2), (1, 3), (2, 2)} | {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (3, 2)} |
| (-2, 2) | {(3, 1), (3, 2), (4, 2)} | {(3, 1), (3, 2), (3, 3), (3, 4), (4, 2), (4, 3), (5, 3)} |
| (-1, 2) | {(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (4, 1)} | {(2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (4, 1), (4, 2), (5, 1)} |
| (8, 8), (-8, 8) | {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (3, 3)} | {(1, 2), (1, 3), (1, 4), (1, 5), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4)} |

# 2.4 Training and testing

 ${P_{\rm{E}}} = {\min _{{P_{{\rm{FA}}}}}}\frac{{{P_{{\rm{FA}}}} + {P_{{\rm{MD}}}}}}{2}$ (6)

 ${w_i} = \frac{{{a_i}-0.5}}{{\sum\limits_{i = 1}^3 {\left( {{a_i}-0.5} \right)} }}$ (7)

 $P = \sum\limits_{i = 1}^3 {{p_i} \cdot {w_i}}$ (8)

When $P \le 0.5$, the image is judged to be a stego image; when $P > 0.5$, it is judged to be a cover image. This yields the decision for the entire image.

 ${P_{{\rm{Err}}}} = \sum\limits_{j = 1}^N {{\theta _j}P_{{\rm{Err}}}^j}$ (9)
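The weighting of Eq. (7), the fusion of Eq. (8), and the decision rule above can be sketched as follows (an illustration under the assumption that `acc` holds the three sub-classifier accuracies $a_i$ and `probs` their per-region outputs $p_i$; names are hypothetical):

```python
def fusion_weights(acc):
    """Eq. (7): weight each sub-classifier by how far its accuracy a_i
    exceeds random guessing (0.5); weights sum to 1."""
    excess = [a - 0.5 for a in acc]
    total = sum(excess)
    return [e / total for e in excess]

def fuse(probs, acc):
    """Eq. (8) plus the decision rule: fused score P = sum(p_i * w_i);
    P <= 0.5 is judged stego, otherwise cover."""
    weights = fusion_weights(acc)
    P = sum(p * w for p, w in zip(probs, weights))
    return P, ("stego" if P <= 0.5 else "cover")
```

A more accurate sub-classifier thus contributes more to the final decision, while one near random guessing contributes almost nothing.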

# 3.2.2 Separability criterion based on within- and between-class distances

 $J = \ln \frac{{\left| {{\mathit{\boldsymbol{S}}_{\rm{b}}}} \right|}}{{\left| {{\mathit{\boldsymbol{S}}_{\rm{W}}}} \right|}}$ (10)

Table 2 The separability criterion based on within- and between-class distances of the JRM feature set (embedding ratio 0.1 bpnc)

| Feature | F5 | MB1 | MME2 | J-Uniward |
| --- | --- | --- | --- | --- |
| Original image | -12.1833 | -10.0056 | -17.5084 | -15.6094 |
| Flat class / flat region | -8.2683 | -10.3359 | -11.1205 | -13.5742 |
| Flat class / medium region | -7.3541 | -8.6960 | -10.7440 | -12.6129 |
| Flat class / complex region | -8.8756 | -9.2345 | -10.5752 | -13.5357 |
| Medium class / flat region | -11.8699 | -8.8763 | -11.1509 | -16.9353 |
| Medium class / medium region | -10.7151 | -9.0498 | -10.9257 | -15.4402 |
| Medium class / complex region | -10.7726 | -9.0837 | -11.4131 | -14.2828 |
| Complex class / flat region | -11.8889 | -8.9159 | -9.4901 | -17.4935 |
| Complex class / medium region | -11.7792 | -9.6728 | -8.8015 | -15.1743 |
| Complex class / complex region | -12.2183 | -9.9367 | -9.5256 | -13.8071 |

# 3.2.3 Bhattacharyya distance

 $B\left( {{p_{\rm{C}}}, {p_{\rm{S}}}} \right) = -\ln \sum\limits_{\mathit{\boldsymbol{x}} \in X} {\sqrt {{p_{\rm{C}}}\left( \mathit{\boldsymbol{x}} \right){p_{\rm{S}}}\left( \mathit{\boldsymbol{x}} \right)} }$ (11)
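For discrete feature distributions, Eq. (11) can be computed directly. The sketch below (illustrative; it assumes `p_c` and `p_s` are aligned probability vectors for the cover and stego feature distributions) returns 0 for identical distributions and grows as they separate:

```python
import numpy as np

def bhattacharyya(p_c, p_s):
    """Bhattacharyya distance of Eq. (11) for discrete distributions:
    B = -ln( sum_x sqrt(p_c(x) * p_s(x)) )."""
    return -np.log(np.sum(np.sqrt(p_c * p_s)))
```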

Table 3 The Bhattacharyya distances of the JRM feature set (embedding ratio 0.1 bpnc)

| Feature | F5 | MB1 | MME2 | J-Uniward |
| --- | --- | --- | --- | --- |
| Original image | 0.0195 | 0.0224 | 0.0203 | 0.0170 |
| Flat class / flat sub-region | 0.0770 | 0.0497 | 0.1001 | 0.0396 |
| Flat class / medium sub-region | 0.0753 | 0.0655 | 0.0988 | 0.0460 |
| Flat class / complex sub-region | 0.0884 | 0.0862 | 0.0870 | 0.0687 |
| Medium class / flat sub-region | 0.0437 | 0.0431 | 0.0310 | 0.0246 |
| Medium class / medium sub-region | 0.0427 | 0.0442 | 0.0338 | 0.0319 |
| Medium class / complex sub-region | 0.0438 | 0.0457 | 0.0324 | 0.0386 |
| Complex class / flat sub-region | 0.0789 | 0.0819 | 0.0522 | 0.0462 |
| Complex class / medium sub-region | 0.0727 | 0.0785 | 0.0568 | 0.0636 |
| Complex class / complex sub-region | 0.0717 | 0.0770 | 0.0618 | 0.0696 |

# 3.3 Classification performance

Table 4 Comparison of detection error rates when the training and testing databases are the same (values in %)

| Steganography method | Embedding ratio | JRM | Classified JRM | Segmented JRM | Classified + segmented JRM |
| --- | --- | --- | --- | --- | --- |
| nsF5 | 0.02 | 44.98 | 44.10 | 44.02 | **43.94** |
| nsF5 | 0.05 | 35.77 | 35.17 | 35.18 | **35.04** |
| nsF5 | 0.1 | 20.77 | 20.25 | 20.01 | **19.70** |
| F5 | 0.02 | 28.09 | 28.03 | 27.94 | **27.72** |
| F5 | 0.05 | 22.33 | 22.11 | **21.98** | 22.18 |
| F5 | 0.1 | 13.54 | 11.75 | 12.32 | **11.38** |
| MB1 | 0.02 | 25.28 | 24.87 | 23.92 | **23.44** |
| MB1 | 0.05 | 6.88 | 6.66 | 6.01 | **5.64** |
| MB1 | 0.1 | 1.21 | 1.05 | 0.87 | **0.66** |
| MME2 | 0.1 | 27.86 | 26.14 | 27.06 | **26.08** |
| MME2 | 0.15 | 25.11 | 23.09 | 24.11 | **23.06** |
| MME2 | 0.2 | 11.43 | 10.74 | **10.13** | 10.84 |
| MME3 | 0.1 | 29.93 | 29.51 | 29.76 | **28.42** |
| MME3 | 0.15 | 26.46 | 24.37 | 25.33 | **23.98** |
| MME3 | 0.2 | 14.91 | 13.45 | 13.86 | **12.67** |
| J-Uniward | 0.1 | 47.10 | 47.02 | 46.88 | **46.64** |
| J-Uniward | 0.2 | 44.06 | 43.98 | 44.01 | **43.88** |
| J-Uniward | 0.3 | 39.82 | 39.75 | 39.52 | **39.40** |
| UED | 0.1 | 27.48 | 25.23 | 24.36 | **21.56** |
| UED | 0.2 | 25.46 | 22.37 | 20.83 | **17.70** |
| UED | 0.3 | 14.98 | 12.36 | 10.39 | **8.54** |

Note: bold indicates the best performance under the same experimental conditions.

Table 5 Comparison of detection error rates when the training and testing databases are different (values in %)

| Steganography method | Embedding ratio | JRM | Classified JRM | Segmented JRM | Classified + segmented JRM |
| --- | --- | --- | --- | --- | --- |
| nsF5 | 0.02 | 47.49 | 46.16 | 46.32 | **46.08** |
| nsF5 | 0.05 | 41.82 | 39.04 | 39.13 | **38.76** |
| nsF5 | 0.1 | 31.62 | 28.06 | 27.55 | **26.66** |
| F5 | 0.02 | 35.74 | 32.99 | 33.76 | **32.84** |
| F5 | 0.05 | 30.03 | 27.74 | 28.33 | **27.50** |
| F5 | 0.1 | 19.62 | **18.01** | 18.51 | 18.12 |
| MB1 | 0.02 | 29.69 | 27.29 | 26.93 | **26.58** |
| MB1 | 0.05 | 15.84 | 12.32 | 10.02 | **9.20** |
| MB1 | 0.1 | 9.89 | 5.01 | 5.82 | **2.24** |
| MME2 | 0.1 | 39.47 | 37.54 | 38.02 | **35.68** |
| MME2 | 0.15 | 30.78 | 27.83 | 28.59 | **26.76** |
| MME2 | 0.2 | 20.24 | 17.34 | 18.54 | **15.96** |
| MME3 | 0.1 | 42.56 | 39.42 | 40.08 | **37.50** |
| MME3 | 0.15 | 40.79 | 37.21 | 38.26 | **34.20** |
| MME3 | 0.2 | 26.98 | 24.12 | 24.82 | **22.02** |
| J-Uniward | 0.1 | 49.58 | 49.12 | 48.98 | **48.16** |
| J-Uniward | 0.2 | 48.54 | 47.92 | 47.23 | **46.58** |
| J-Uniward | 0.3 | 47.04 | 45.83 | 44.98 | **44.12** |
| UED | 0.1 | 46.06 | 43.08 | 42.12 | **37.46** |
| UED | 0.2 | 44.46 | 40.68 | 38.59 | **34.52** |
| UED | 0.3 | 36.50 | 35.21 | 33.86 | **30.12** |

Note: bold indicates the best performance under the same experimental conditions.

# References

• [1] Pevný T, Fridrich J. Merging Markov and DCT features for multi-class JPEG steganalysis[C]//Proceedings of SPIE 6505, Security, Steganography, and Watermarking of Multimedia Contents IX. San Jose, CA, USA: SPIE, 2007: #650503. [DOI:10.1117/12.696774]
• [2] Kodovský J, Fridrich J. Calibration revisited[C]//Proceedings of the 11th ACM Workshop on Multimedia and Security. Princeton, New Jersey: ACM, 2009: 63-74.
• [3] Pevný T, Bas P, Fridrich J. Steganalysis by subtractive pixel adjacency matrix[J]. IEEE Transactions on Information Forensics and Security, 2010, 5(2): 215-224. [DOI:10.1109/TIFS.2010.2045842]
• [4] Goljan M, Fridrich J, Holotyak T. New blind steganalysis and its implications[C]//Proceedings of SPIE 6072, Security, Steganography, and Watermarking of Multimedia Contents VIII. San Jose, California, USA: SPIE, 2006: #607201. [DOI:10.1117/12.643254]
• [5] Wang Y, Moulin P. Optimized feature extraction for learning-based image steganalysis[J]. IEEE Transactions on Information Forensics and Security, 2007, 2(1): 31-45. [DOI:10.1109/TIFS.2006.890517]
• [6] Kodovský J, Fridrich J, Holub V. Ensemble classifiers for steganalysis of digital media[J]. IEEE Transactions on Information Forensics and Security, 2012, 7(2): 432-444. [DOI:10.1109/TIFS.2011.2175919]
• [7] Kodovský J, Fridrich J. Steganalysis of JPEG images using rich models[C]//Proceedings of SPIE 8303, Media Watermarking, Security, and Forensics 2012. Burlingame, California, USA: SPIE, 2012: #83030A. [DOI:10.1117/12.907495]
• [8] Holub V, Fridrich J, Denemark T. Random projections of residuals as an alternative to co-occurrences in steganalysis[C]//Proceedings of SPIE 8665, Media Watermarking, Security, and Forensics 2013. Burlingame, California, USA: SPIE, 2013: #86650L. [DOI:10.1117/12.1000330]
• [9] Amirkhani H, Rahmati M. New framework for using image contents in blind steganalysis systems[J]. Journal of Electronic Imaging, 2011, 20(1): #013016. [DOI:10.1117/1.3554413]
• [10] Cho S, Cha B H, Gawecki M, et al. Block-based image steganalysis: algorithm and performance evaluation[J]. Journal of Visual Communication and Image Representation, 2013, 24(7): 846-856. [DOI:10.1016/j.jvcir.2013.05.007]
• [11] Wang R, Xu M K, Ping X J, et al. Steganalysis of JPEG images by block texture based segmentation[J]. Multimedia Tools and Applications, 2015, 74(15): 5725-5746. [DOI:10.1007/s11042-014-1880-y]
• [12] Wang R, Niu S Z, Ping X J, et al. Steganalysis based on reducing the differences of image statistical characteristics[C]//Proceedings of SPIE 10615, Ninth International Conference on Graphic and Image Processing. Qingdao, China: SPIE, 2017: #106151J. [DOI:10.1117/12.2304572]
• [13] Fisher R A. The use of multiple measurements in taxonomic problems[J]. Annals of Eugenics, 1936, 7(2): 179-188. [DOI:10.1111/j.1469-1809.1936.tb02137.x]
• [14] Filler T, Pevný T, Bas P. BOSS[EB/OL]. [2007-07-01]. http://agents.fel.cvut.cz/stegodata/
• [15] Bas P, Furon T. BOWS-2[EB/OL]. [2007-07-01]. http://bows2.ec-lille.fr/
• [16] The USDA NRCS Photo Gallery[EB/OL]. [2008-09-14]. http://photogallery.nrcs.usda.gov.
• [17] Schaefer G, Stich M. UCID - an uncompressed colour image database[R]. UK: Nottingham Trent University, 2003.
• [18] Fridrich J, Pevný T, Kodovský J. Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities[C]//Proceedings of the 9th Workshop on Multimedia & Security. Dallas, Texas, USA: ACM, 2007: 3-14. [DOI:10.1145/1288869.1288872]
• [19] Westfeld A. High capacity despite better steganalysis (F5 - a steganographic algorithm)[C]//Proceedings of the 4th International Workshop on Information Hiding. Pittsburgh, PA: Springer-Verlag, 2001, 2137: 289-302.
• [20] Sallee P. Model-based steganography[C]//Proceedings of the International Workshop on Digital Watermarking. Seoul, Korea: Springer-Verlag, 2003: 154-167. [DOI:10.1007/978-3-540-24624-4_12]
• [21] Huang F J, Huang J W, Shi Y Q. New channel selection rule for JPEG steganography[J]. IEEE Transactions on Information Forensics and Security, 2012, 7(4): 1181-1191. [DOI:10.1109/TIFS.2012.2198213]
• [22] Holub V, Fridrich J, Denemark T. Universal distortion function for steganography in an arbitrary domain[J]. EURASIP Journal on Information Security, 2014, 2014: #1. [DOI:10.1186/1687-417X-2014-1]
• [23] Guo L J, Ni J Q, Shi Y Q. Uniform embedding for efficient JPEG steganography[J]. IEEE Transactions on Information Forensics and Security, 2014, 9(5): 814-825. [DOI:10.1109/TIFS.2014.2312817]