Just noticeable distortion threshold estimation on natural images using fusion of structured and unstructured information
2019, Vol. 24, No. 9, pp. 1546-1557
Received: 2018-11-22; Revised: 2019-04-02; Published in print: 2019-09-16
DOI: 10.11834/jig.180631

Objective
Research has shown that the just noticeable distortion (JND) threshold of an image is related mainly to the luminance adaptation of the visual system, contrast masking, pattern masking, and image structure. To better study the influence of image structure on the JND threshold, a sparse representation-based model for separating structured and unstructured information is proposed and applied to JND threshold estimation for natural images, so that the JND threshold model agrees more closely with the human visual system.
Method
First, an over-complete visual dictionary is obtained with the K-means singular value decomposition (K-SVD) algorithm. The input natural image is then sparsely represented and reconstructed over this dictionary, yielding the structural layer and the non-structural layer of the image. For the two layers, a structural-layer JND estimation model based on luminance adaptation and contrast masking and a non-structural-layer JND estimation model based on luminance contrast and information uncertainty are designed. Finally, the two component JND estimates are fused with a nonlinear additive model that characterizes the masking effect between them.
Result
The proposed JND estimation model separates the structured and unstructured information of a natural image by sparse representation and then computes each component with a JND model matched to its characteristics, in close agreement with the mechanism of visual perception. Experimental results show that the model effectively predicts the JND threshold of natural images: the peak signal-to-noise ratio (PSNR) of the noise-contaminated image is 3 to 5 dB higher than those of the three competing JND models.
Conclusion
Compared with existing models, the proposed model is more consistent with subjective human visual perception and predicts the JND threshold of natural images more effectively.
Objective
Neuroscientists have studied the Bayesian brain perception theory, which indicates that the human visual system processes input signals indirectly: a complete set of intrinsic inference mechanisms actively predicts and interprets the input image while attempting to ignore its uncertain content. In other words, given an input image, the brain does not fully process all the input visual information; instead, its intrinsic inference mechanism actively predicts the gross structure of the image, i.e., its certain (structured) information, while uncertain (unstructured) information, such as residual clutter, is ignored in realizing the understanding and perception of the image. Considering the role of structured information in just noticeable distortion (JND) estimation on natural images, a sparse representation-based structured/unstructured information separation model is proposed and applied to JND threshold estimation. The proposed method achieves strong consistency with the human visual system in terms of the perceived JND threshold.
Method
Initially, 90 natural images are selected for dictionary learning. The training images are pre-processed, and each image is divided into 8×8 non-overlapping blocks. The variance of each block is computed, and the blocks with high variances are selected as training samples. An over-complete dictionary is then learned from these samples with the classical K-singular value decomposition (K-SVD) algorithm. Next, the input natural image is reconstructed by sparse representation over the learned dictionary via the orthogonal matching pursuit (OMP) algorithm; the corresponding structural and non-structural layers of the input image are obtained by setting an appropriate number of iterations in the OMP algorithm. Subsequently, we design a separate JND estimation model for each layer. 1) Luminance adaptation and contrast masking-based JND model for the structural layer. The JND threshold of an image is related mainly to the luminance adaptation of the visual system, contrast masking, pattern masking, and image structure. Accordingly, the luminance adaptation function and the contrast masking equation are derived under the experimental condition of a regular structure, and the structural-layer JND model is obtained by fusing the two. 2) Luminance contrast and information uncertainty-based JND model for the non-structural layer. The pattern masking effect alters the visibility of stimuli in the visual system because of the interaction or interference among visual stimuli in the input scene. When the visual content is ordered and the background is uniform, pattern masking is extremely weak and a spatial object is easily detected; conversely, when the visual content is disordered and uncertain, pattern masking is enhanced, that is, the detection of spatial objects is suppressed. Pattern masking is therefore related not only to luminance contrast but also to information uncertainty, so the non-structural-layer JND model is constructed from the pattern masking effect combined with information uncertainty and luminance contrast. Finally, because the structural-layer and non-structural-layer JND thresholds overlap, a simple linear sum cannot fuse the two layers; the overlapping part must be removed. A nonlinear additive model that describes the masking effect between different components is therefore used to fuse the two JND estimates.
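The separation step described above can be sketched as follows. This is a minimal illustration only: for brevity, a fixed over-complete DCT dictionary stands in for the K-SVD-learned one, the `omp` routine is a plain implementation of orthogonal matching pursuit, and the iteration count `n_iter` plays the role the abstract assigns to it, controlling how much structure the sparse reconstruction captures. The structural layer is the block-wise sparse reconstruction and the non-structural layer is the residual.

```python
import numpy as np

def dct_dictionary(patch=8, atoms=16):
    # Over-complete 2-D DCT-like dictionary; an assumption standing in
    # for the K-SVD dictionary learned from 90 natural images.
    D1 = np.zeros((patch, atoms))
    for k in range(atoms):
        v = np.cos(np.arange(patch) * k * np.pi / atoms)
        if k > 0:
            v -= v.mean()
        D1[:, k] = v / np.linalg.norm(v)
    return np.kron(D1, D1)  # shape (patch*patch, atoms*atoms)

def omp(D, x, n_iter):
    # Orthogonal matching pursuit, stopped after n_iter atom selections;
    # the stopping point decides how much "structure" is extracted.
    residual, support = x.copy(), []
    for _ in range(n_iter):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    out = np.zeros(D.shape[1])
    out[support] = coef
    return out

def separate(image, n_iter=4, patch=8):
    # Structural layer = sparse reconstruction of each 8x8 block;
    # non-structural layer = image minus structural layer.
    D = dct_dictionary(patch)
    H, W = image.shape
    structural = np.zeros_like(image, dtype=float)
    for i in range(0, H - H % patch, patch):
        for j in range(0, W - W % patch, patch):
            x = image[i:i + patch, j:j + patch].astype(float).ravel()
            a = omp(D, x, n_iter)
            structural[i:i + patch, j:j + patch] = (D @ a).reshape(patch, patch)
    return structural, image - structural
```

By construction the two layers sum back to the input image, which is the property the subsequent per-layer JND models rely on.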
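The per-layer thresholds and the nonlinear additive fusion can likewise be sketched. The constants below follow the classic spatial-domain JND formulation (Chou-Li-style luminance adaptation and gradient-based contrast masking, and a NAMM-style overlap deduction with gain `c_gain`); the paper derives its own variants under regular-structure stimuli, so every constant here is an illustrative assumption, not the authors' model.

```python
import numpy as np

def luminance_adaptation(bg):
    # Visibility threshold vs. background luminance bg in [0, 255]:
    # high in dark regions, minimal near mid-gray, rising again in
    # bright regions (classic piecewise form, assumed constants).
    bg = np.asarray(bg, dtype=float)
    return np.where(bg <= 127,
                    17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                    3.0 / 128.0 * (bg - 127.0) + 3.0)

def contrast_masking(bg, grad):
    # Masking grows with local gradient magnitude `grad`; slope and
    # offset depend weakly on background luminance (assumed constants).
    bg = np.asarray(bg, dtype=float)
    return (0.0001 * bg + 0.115) * np.asarray(grad, dtype=float) + (0.5 - 0.01 * bg)

def structural_jnd(bg, grad):
    # Structural-layer JND: keep the dominant of the two masking
    # effects at each pixel (one simple fusion choice).
    return np.maximum(luminance_adaptation(bg), contrast_masking(bg, grad))

def namm_fuse(jnd_a, jnd_b, c_gain=0.3):
    # Nonlinear additive masking model: sum the two layer thresholds
    # and subtract the overlapping part so it is not counted twice.
    return jnd_a + jnd_b - c_gain * np.minimum(jnd_a, jnd_b)
```

The `namm_fuse` step implements the abstract's point that a plain linear sum over-counts the overlap between the structural-layer and non-structural-layer thresholds.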
Result
Three existing JND models are selected for comparison. For a fair comparison, the same amount of noise is injected into the original image under the guidance of each JND model, and the visual quality of the contaminated images is then compared. The subjective results show that, for the same injected noise, the proposed JND model guides the noise distribution better than the other JND models and avoids the regions to which human vision is sensitive; it is also consistent with subjective visual perception. For further objective verification, the four JND models are compared with the classical peak signal-to-noise ratio (PSNR) on the contaminated Goddess and Lena images. The objective results show that the PSNRs of the proposed model are significantly higher than those of the other three JND models. The proposed JND estimation model uses sparse representation to separate the structured and unstructured information of the input natural image and then computes the JND threshold according to the characteristics of each component, a process consistent with the mechanism of human visual perception. Therefore, the proposed JND estimation model can effectively and accurately predict the JND threshold of natural images.
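The evaluation protocol above can be sketched as follows. This simplified version perturbs each pixel by exactly its estimated threshold with a random sign; the paper's protocol additionally equalizes the total injected noise energy across the compared models, which is omitted here for brevity.

```python
import numpy as np

def inject_jnd_noise(image, jnd_map, rng=None):
    # Perturb each pixel by +/- its JND threshold: a better JND map
    # hides more of this noise from the human observer.
    rng = np.random.default_rng(rng)
    sign = rng.choice([-1.0, 1.0], size=image.shape)
    return np.clip(image + sign * jnd_map, 0, 255)

def psnr(reference, distorted, peak=255.0):
    # Peak signal-to-noise ratio in dB for 8-bit images.
    diff = np.asarray(reference, float) - np.asarray(distorted, float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

Under this protocol, a higher PSNR at equal perceptual quality indicates a JND model that places noise where the eye is least sensitive, which is the criterion used in the comparison.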
Conclusion
Compared with the existing relevant models, the proposed JND model can effectively predict the JND threshold of natural images, and it is much more consistent with human visual perception.