Multi-layer perceptual decomposition based full reference image quality assessment
2019, Vol. 24, No. 1, pp. 149-158
Received: 2018-07-06; Revised: 2018-08-23; Published in print: 2019-01-16
DOI: 10.11834/jig.180438
Objective
Image quality assessment is one of the fundamental research topics in computer vision and image processing. Traditional assessment methods are usually based on low-level visual features of images and ignore high-level semantic information, which to some extent weakens the consistency between objective scores and subjective visual quality. In recent years, perceptual loss has been widely used in research on image stylization, image restoration, and related problems: by using a pre-trained deep network to decompose an image into multiple semantic levels, it has achieved good results on these tasks. Inspired by perceptual loss, this paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition.
Method
First, a pre-trained deep network is used to perform multi-layer semantic decomposition of the image and obtain multi-layer feature maps. Then the similarity between the distorted image and the reference image is computed, together with the similarities between their feature maps at different levels, finally yielding an image quality score that takes high-level semantic information into account.
Result
Experiments were conducted on the traditional methods PSNR (peak signal-to-noise ratio), SSIM (structure similarity), MS-SSIM (multi-scale structure similarity), and FSIM (feature similarity). The results show that the proposed method can effectively improve the performance of traditional image quality assessment methods, with corresponding gains in the objective criteria SRCC (Spearman rank order correlation coefficient), KRCC (Kendall rank order correlation coefficient), PLCC (Pearson linear correlation coefficient), and RMSE (root mean squared error). With the proposed framework, the SRCC values of PSNR, SSIM, MS-SSIM, and FSIM on the TID2013 database increased by 0.02, 0.07, 0.06, and 0.04, respectively.
Conclusion
This paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition. By combining traditional methods with deep learning and considering both low-level visual features and high-level semantic information, it effectively improves the evaluation performance of traditional methods and makes objective results more consistent with subjective visual perception. Moreover, the proposed framework can be applied to improve the performance of a variety of traditional methods.
Objective
IQA (image quality assessment) is one of the fundamental research topics in the fields of computer vision and image processing. Traditional IQA methods mainly rely on single pixel intensities or low-level visual features, such as image contrast and image edges, to assess images, and generally ignore high-level semantic information. PSNR (peak signal-to-noise ratio) is a basic and commonly used tool that directly compares the differences in pixel intensities between the test image and the reference image. By contrast, the human visual system extracts structural information from visual scenes, so the PSNR cannot accurately measure subjective visual quality. To extract structural information and attain a better evaluation, various improved IQA methods have been proposed. Many of them first decompose an image into different aspects to extract information that effectively measures image quality. However, these traditional methods still omit high-level semantic information. With the rapid development of deep learning algorithms, high-level semantic information can be effectively extracted by deep networks. Given their hierarchical structure, deep networks can analyze and understand images at different levels. In recent years, perceptual loss based on deep networks has been widely used in many computer vision applications, such as image style transfer, non-photorealistic rendering, and image restoration. By utilizing a pre-trained deep network to decompose an image into different semantic levels, satisfactory results can be produced for related tasks. Inspired by perceptual loss, we propose a multi-layer perceptual decomposition-based full-reference image quality assessment method.
Method
First, a pre-trained deep network is used to decompose the input image and extract multi-layer feature maps. Many pre-trained deep networks could be employed for this purpose; on the basis of previous studies on perceptual loss, the VGG-19 network was selected for its effectiveness. VGG-19 is composed of several types of layers, including convolutional, activation, pooling, dropout, fully connected, and softmax layers, stacked in a specific order to form a complete network model. The network has been widely applied because it achieves impressive results in many recognition tasks. To reduce complexity, several layers were designated as abstraction layers for extracting feature maps. Second, the proposed method calculates not only the similarity between the test image and the reference image but also the similarity between their multi-level feature maps. Lower-level feature maps reflect differences in edges, details, textures, and other low-level features, whereas higher-level feature maps reflect saliency and semantic differences in the regions of interest. Finally, an image quality score that accounts for high-level semantic similarity is obtained. In contrast to existing DNN (deep neural network)-based IQA methods, the pre-trained deep network is used merely to decompose the image rather than to fit subjective mean opinion scores; thus, the proposed method does not need to train a new IQA network. Moreover, the proposed method is an open and elastic framework that improves the performance of traditional methods by extracting additional high-level semantic information, so numerous traditional full-reference IQA methods can be further improved by exploiting it. In this paper, several typical and efficient traditional IQA methods were improved and evaluated with the proposed framework: PSNR, SSIM (structure similarity), and two effective SSIM variants, namely MS-SSIM (multi-scale structure similarity) and FSIM (feature similarity). Other full-reference IQA methods can also be improved by the proposed semantic decomposition-based framework.
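The overall pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the stand-in `decompose` uses simple downsampling in place of VGG-19 abstraction layers, the per-layer comparison is a generic SSIM-style similarity rather than the specific base metric (PSNR, SSIM, MS-SSIM, or FSIM), and the uniform layer weights are an assumption.

```python
import numpy as np

def layer_similarity(a, b, c=1e-3):
    # SSIM-style similarity between two feature maps; a stand-in for
    # whichever per-layer comparison the framework is paired with.
    # Equals 1 iff a == b elementwise, and is strictly below 1 otherwise.
    num = 2.0 * np.mean(a * b) + c
    den = np.mean(a ** 2) + np.mean(b ** 2) + c
    return num / den

def decompose(img, n_layers=3):
    # Stand-in for the abstraction layers of a pre-trained VGG-19:
    # each "feature map" here is just a 2x-downsampled copy. In the
    # actual method the maps come from selected network layers.
    maps = [img]
    for _ in range(n_layers - 1):
        img = img[::2, ::2]
        maps.append(img)
    return maps

def perceptual_quality(test, ref, weights=None):
    # Combine the image-level and multi-level feature-map similarities
    # into one score; uniform weights are an assumption of this sketch.
    t_maps = decompose(test)
    r_maps = decompose(ref)
    sims = [layer_similarity(t, r) for t, r in zip(t_maps, r_maps)]
    if weights is None:
        weights = [1.0 / len(sims)] * len(sims)
    return float(sum(w * s for w, s in zip(weights, sims)))
```

An undistorted image scores exactly 1 against itself, and any distortion lowers the score at every level; swapping `layer_similarity` for a traditional metric reproduces the framework's role as a wrapper around existing methods.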
Result
The experimental data were derived from the TID2013 dataset, which includes 25 reference images and 3 000 distorted images. Compared with other existing databases, TID2013 has more images and distortion types, guaranteeing more reliable results. The experimental results of the selected traditional methods, namely PSNR, SSIM, MS-SSIM, and FSIM, show that the proposed method can effectively improve the performance of traditional image quality assessment methods and achieve corresponding gains in several objective criteria: SRCC (Spearman rank order correlation coefficient), KRCC (Kendall rank order correlation coefficient), PLCC (Pearson linear correlation coefficient), and RMSE (root mean squared error). On the TID2013 dataset, the SRCC values increased by 0.02, 0.07, 0.06, and 0.04 for PSNR, SSIM, MS-SSIM, and FSIM, respectively. SRCC and KRCC measure prediction monotonicity, PLCC measures prediction accuracy, and RMSE measures prediction consistency. The traditional assessments attain higher SRCC, KRCC, and PLCC values with the proposed method, and for the RMSE the proposed methods achieve much lower values than the corresponding conventional IQA methods. In addition, the results for different distortion types demonstrate that the proposed method effectively improves performance.
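The four evaluation criteria above are standard and can be computed directly; the sketch below assumes no tied values in the rank transforms and omits the nonlinear regression that is often applied to objective scores before computing PLCC and RMSE in IQA benchmarks.

```python
import numpy as np

def _ranks(x):
    # Simple rank transform (assumes no tied values).
    order = np.argsort(x)
    r = np.empty(len(x))
    r[order] = np.arange(1, len(x) + 1)
    return r

def plcc(x, y):
    # Pearson linear correlation coefficient: prediction accuracy.
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def srcc(x, y):
    # Spearman rank correlation: prediction monotonicity
    # (Pearson correlation computed on the ranks).
    return plcc(_ranks(np.asarray(x, float)), _ranks(np.asarray(y, float)))

def krcc(x, y):
    # Kendall rank correlation over all index pairs (O(n^2), fine for
    # dataset-sized score lists).
    n, s = len(x), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign((x[i] - x[j]) * (y[i] - y[j]))
    return float(2.0 * s / (n * (n - 1)))

def rmse(x, y):
    # Root mean squared error: prediction consistency.
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

For a perfectly monotone and linear relationship between objective scores and subjective opinions, SRCC, KRCC, and PLCC all equal 1 and RMSE of identical score lists is 0, which is the behavior an improved IQA method is pushed toward.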
Conclusion
This paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition that combines the benefits of traditional methods and deep learning. By simultaneously considering low-level visual features and high-level semantic information, the proposed method effectively improves the evaluation performance of traditional methods, and by incorporating the additional high-level semantic information, the IQA results become more consistent with subjective visual perception. Furthermore, the proposed evaluation framework can also be applied to other traditional full-reference IQA methods.