Multi-layer perceptual decomposition based full reference image quality assessment
2019, Vol. 24, No. 1, pp. 149-158
Received: 2018-07-06; Revised: 2018-08-23; Published in print: 2019-01-16
DOI: 10.11834/jig.180438
Objective
Image quality assessment is one of the fundamental research topics in computer vision and image processing. Traditional assessment methods are usually based on low-level visual features of images and ignore high-level semantic information, which to some extent weakens the consistency between objective scores and subjective visual quality. In recent years, perceptual loss has been widely used in research on image stylization, image restoration, and related problems: by using a pre-trained deep network to decompose an image into multiple semantic levels, it has achieved good results on these tasks. Inspired by perceptual loss, this paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition.
Method
First, a pre-trained deep network is used to perform multi-layer semantic decomposition of the image and obtain multi-layer feature maps. Then the similarity between the distorted image and the reference image is computed, together with the similarities between their feature maps at different levels, finally yielding an image quality score that takes high-level semantic information into account.
Result
Experiments were conducted on the traditional methods PSNR (peak signal-to-noise ratio), SSIM (structure similarity), MS-SSIM (multi-scale structure similarity), and FSIM (feature similarity). The results show that the proposed method can effectively improve the performance of traditional image quality assessment methods, with corresponding gains in the objective criteria SRCC (Spearman rank order correlation coefficient), KRCC (Kendall rank order correlation coefficient), PLCC (Pearson linear correlation coefficient), and RMSE (root mean squared error). With the proposed framework, the SRCC values of PSNR, SSIM, MS-SSIM, and FSIM on the TID2013 database increased by 0.02, 0.07, 0.06, and 0.04, respectively.
Conclusion
This paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition. By combining traditional methods with deep learning and considering both low-level visual features and high-level semantic information, it effectively improves the evaluation performance of traditional methods and makes objective results more consistent with subjective visual perception. Moreover, the proposed framework can be applied to improve the performance of a variety of traditional methods.
Objective
IQA (image quality assessment) is one of the fundamental research topics in the fields of computer vision and image processing. Traditional IQA methods mainly rely on single pixel intensities or low-level visual features, such as image contrast and image edges, to assess images, and generally ignore high-level semantic information. PSNR (peak signal-to-noise ratio) is a basic and commonly used tool that directly compares the differences in pixel intensities between the test image and the reference image. By contrast, the human visual system extracts structural information from visual scenes, so the PSNR cannot accurately measure subjective visual quality. To extract structural information and attain a better evaluation, various improved IQA methods have been proposed. Many of them first decompose an image into different aspects to extract information that effectively measures image quality. However, these traditional methods still omit high-level semantic information. With the rapid development of deep learning algorithms, high-level semantic information can be effectively extracted by deep networks. Given their hierarchical structure, deep networks can analyze and understand images at different levels. In recent years, perceptual loss based on deep networks has been widely used in many computer vision applications, such as image style transfer, non-photorealistic rendering, and image restoration. By utilizing a pre-trained deep network to decompose an image into different semantic levels, satisfactory results can be produced for related tasks. Inspired by perceptual loss, we propose a multi-layer perceptual decomposition-based full-reference image quality assessment method.
Method
First, a pre-trained deep network is used to decompose the input image and extract multi-layer feature maps. Many pre-trained deep networks could be employed for this purpose; on the basis of previous studies on perceptual loss, the VGG-19 network was selected for its effectiveness. VGG-19 is composed of several types of layers, including convolutional, activation, pooling, dropout, fully connected, and softmax layers, stacked in a specific order to form a complete network model. The network has been widely applied because it achieves impressive results in many recognition tasks. To reduce complexity, several layers were designated as abstraction layers for extracting feature maps. Second, the proposed method calculates not only the similarity between the test image and the reference image but also the similarity between their multi-level feature maps. Lower-level feature maps reflect differences in edges, details, textures, and other low-level features, whereas higher-level feature maps reflect saliency and semantic differences in the regions of interest. Finally, an image quality score that accounts for high-level semantic similarity is obtained. In contrast to existing DNN (deep neural network)-based IQA methods, the pre-trained deep network is used merely to decompose the image rather than to fit subjective mean opinion scores; thus, the proposed method does not need to train a new IQA network. Moreover, the proposed method is an open and elastic framework that improves the performance of traditional methods by extracting additional high-level semantic information, so numerous traditional full-reference IQA methods can be further improved by exploiting it. In this paper, several typical and efficient traditional IQA methods were improved and evaluated with the proposed framework: PSNR, SSIM (structure similarity), and two effective SSIM variants, namely MS-SSIM (multi-scale structure similarity) and FSIM (feature similarity). Other full-reference IQA methods can also be improved by the proposed semantic decomposition-based framework.
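The overall pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the stand-in `decompose` uses simple downsampling in place of VGG-19 abstraction layers, the per-layer comparison is a generic SSIM-style similarity rather than the specific base metric (PSNR, SSIM, MS-SSIM, or FSIM), and the uniform layer weights are an assumption.

```python
import numpy as np

def layer_similarity(a, b, c=1e-3):
    # SSIM-style similarity between two feature maps; a stand-in for
    # whichever per-layer comparison the framework is paired with.
    # Equals 1 iff a == b elementwise, and is strictly below 1 otherwise.
    num = 2.0 * np.mean(a * b) + c
    den = np.mean(a ** 2) + np.mean(b ** 2) + c
    return num / den

def decompose(img, n_layers=3):
    # Stand-in for the abstraction layers of a pre-trained VGG-19:
    # each "feature map" here is just a 2x-downsampled copy. In the
    # actual method the maps come from selected network layers.
    maps = [img]
    for _ in range(n_layers - 1):
        img = img[::2, ::2]
        maps.append(img)
    return maps

def perceptual_quality(test, ref, weights=None):
    # Combine the image-level and multi-level feature-map similarities
    # into one score; uniform weights are an assumption of this sketch.
    t_maps = decompose(test)
    r_maps = decompose(ref)
    sims = [layer_similarity(t, r) for t, r in zip(t_maps, r_maps)]
    if weights is None:
        weights = [1.0 / len(sims)] * len(sims)
    return float(sum(w * s for w, s in zip(weights, sims)))
```

An undistorted image scores exactly 1 against itself, and any distortion lowers the score at every level; swapping `layer_similarity` for a traditional metric reproduces the framework's role as a wrapper around existing methods.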
Result
The experimental data were derived from the TID2013 dataset, which includes 25 reference images and 3 000 distorted images. Compared with other existing databases, TID2013 has more images and distortion types, guaranteeing more reliable results. The experimental results of the selected traditional methods, namely PSNR, SSIM, MS-SSIM, and FSIM, show that the proposed method can effectively improve the performance of traditional image quality assessment methods and achieve corresponding gains in several objective criteria: SRCC (Spearman rank order correlation coefficient), KRCC (Kendall rank order correlation coefficient), PLCC (Pearson linear correlation coefficient), and RMSE (root mean squared error). On the TID2013 dataset, the SRCC values increased by 0.02, 0.07, 0.06, and 0.04 for PSNR, SSIM, MS-SSIM, and FSIM, respectively. SRCC and KRCC measure prediction monotonicity, PLCC measures prediction accuracy, and RMSE measures prediction consistency. The traditional assessments attain higher SRCC, KRCC, and PLCC values with the proposed method, and for the RMSE the proposed methods achieve much lower values than the corresponding conventional IQA methods. In addition, the results for different distortion types demonstrate that the proposed method effectively improves performance.
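The four evaluation criteria above are standard and can be computed directly; the sketch below assumes no tied values in the rank transforms and omits the nonlinear regression that is often applied to objective scores before computing PLCC and RMSE in IQA benchmarks.

```python
import numpy as np

def _ranks(x):
    # Simple rank transform (assumes no tied values).
    order = np.argsort(x)
    r = np.empty(len(x))
    r[order] = np.arange(1, len(x) + 1)
    return r

def plcc(x, y):
    # Pearson linear correlation coefficient: prediction accuracy.
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def srcc(x, y):
    # Spearman rank correlation: prediction monotonicity
    # (Pearson correlation computed on the ranks).
    return plcc(_ranks(np.asarray(x, float)), _ranks(np.asarray(y, float)))

def krcc(x, y):
    # Kendall rank correlation over all index pairs (O(n^2), fine for
    # dataset-sized score lists).
    n, s = len(x), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign((x[i] - x[j]) * (y[i] - y[j]))
    return float(2.0 * s / (n * (n - 1)))

def rmse(x, y):
    # Root mean squared error: prediction consistency.
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

For a perfectly monotone and linear relationship between objective scores and subjective opinions, SRCC, KRCC, and PLCC all equal 1 and RMSE of identical score lists is 0, which is the behavior an improved IQA method is pushed toward.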
Conclusion
This paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition that combines the benefits of traditional methods and deep learning. By simultaneously considering low-level visual features and high-level semantic information, the proposed method effectively improves the evaluation performance of traditional methods, and by incorporating the additional high-level semantic information, the IQA results become more consistent with subjective visual perception. Furthermore, the proposed evaluation framework can also be applied to other traditional full-reference IQA methods.