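Objective Image quality assessment is one of the fundamental research topics in computer vision, image processing, and related fields. Traditional assessment methods are usually based on low-level visual features and ignore high-level semantic information, which to some extent weakens the consistency between objective metrics and subjective visual quality. In recent years, perceptual loss has been widely used in research on image stylization, image restoration, and related tasks; by using a pre-trained deep network to decompose an image into multiple semantic levels, it has achieved good results on these problems. Inspired by perceptual loss, this paper proposes a full-reference image quality assessment method based on multi-layer perceptual decomposition.
Method The proposed method first uses a pre-trained deep network to decompose the image into multiple semantic levels and obtain multi-layer feature maps. It then computes the similarity between the distorted image and the reference image, as well as the similarity between their feature maps at different levels, and finally produces an image quality score that takes high-level semantic information into account.
Result Experiments were conducted on the traditional methods peak signal-to-noise ratio (PSNR), structure similarity (SSIM), multi-scale structure similarity (MS-SSIM), and feature similarity (FSIM). The results show that the proposed method effectively improves the performance of traditional image quality assessment methods, with corresponding improvements on the objective criteria Spearman rank order correlation coefficient (SRCC), Kendall rank order correlation coefficient (KRCC), Pearson linear correlation coefficient (PLCC), and root mean squared error (RMSE). With the proposed framework, the SRCC of PSNR, SSIM, MS-SSIM, and FSIM on the TID2013 database improves by 0.02, 0.07, 0.06, and 0.04, respectively.
Conclusion The proposed multi-layer perceptual decomposition-based full-reference image quality assessment method combines traditional methods with deep learning methods and takes both low-level visual features and high-level semantic information into account. It thereby effectively improves the evaluation performance of traditional methods and makes objective assessment results more consistent with subjective visual perception. In addition, the proposed framework can be applied to improve the performance of a variety of traditional methods.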
Multi-layer Perceptual Decomposition Based Full-Reference Image Quality Assessment
李 国庆, 赵 洋, 刘 青萌, 殷 翔宇, 王 业南 (School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; Key Laboratory of Industrial Safety and Applied Technology of Anhui Province, Hefei 230009, China)
Objective Image quality assessment (IQA) is one of the basic research topics in computer vision and image processing. Traditional quality assessment methods are mainly based on low-level visual features and often ignore high-level semantic information. They rely on single-pixel intensities or low-level visual features such as image contrast and image edges. Peak signal-to-noise ratio (PSNR) is a basic and commonly used method that directly compares the pixel intensities of the test image and the reference image. However, the human visual system (HVS) focuses on extracting structural information from visual scenes, so PSNR cannot accurately measure subjective visual quality. To extract structural information and produce better evaluations, various improved IQA methods have been proposed. Many of them first decompose the image into different aspects to extract information that effectively measures image quality. However, these traditional methods still omit high-level semantic information. With the rapid development of deep learning, high-level semantic information can be extracted effectively by deep networks. Owing to their hierarchical structure, deep networks can analyze and understand images at different levels. In recent years, perceptual loss based on deep networks has been widely used in computer vision applications such as image style transfer, non-photorealistic rendering, and image restoration. By using a pre-trained deep network to decompose an image into different semantic levels, satisfactory results can be produced on related tasks. Inspired by perceptual loss, we propose a multi-layer perceptual decomposition-based full-reference image quality assessment method.
Method First, a pre-trained deep network is used to decompose the input image and extract multi-layer feature maps.
Note that many pre-trained deep networks could be employed. In previous studies on perceptual loss, the VGG-19 network has often been chosen because of its effectiveness. VGG-19 is composed of several kinds of layers, including convolutional layers, activation function layers, pooling layers, dropout layers, fully connected layers, and a softmax layer. These elements are stacked in a fixed order to form a complete network model, which has been widely applied because it achieves impressive results in many recognition tasks. To reduce complexity, several layers are handpicked in this paper as abstraction layers from which feature maps are extracted. Second, the proposed method calculates not only the similarity between the test image and the reference image, but also the similarity between their multi-level feature maps. The lower-level feature maps reflect differences in edges, details, textures, and other low-level features, while the higher-level feature maps reflect saliency and semantic differences in the regions of interest. Finally, an image quality score that takes the similarity of high-level semantics into account is obtained. Unlike existing deep neural network (DNN) based IQA methods, this paper uses the pre-trained deep network only to decompose the image, not to fit subjective mean opinion scores. Hence, the proposed method does not need to train a new IQA network as other DNN-based methods do. Moreover, the proposed method is an open and flexible framework that improves the performance of traditional methods by extracting additional high-level semantic information. Many traditional full-reference IQA methods can thus be further improved by means of the proposed framework.
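The decomposition-and-compare procedure described above can be illustrated with a minimal, dependency-free sketch. Several parts are assumptions rather than the paper's implementation: the abstract does not state how the per-level similarities are fused, so a weighted mean is assumed here; VGG-19 is replaced by a toy 2x2 average-pooling pyramid so the example runs without a deep learning library; and the names `toy_decompose`, `multilayer_score`, and the weights are hypothetical. PSNR serves as the base metric, but any full-reference metric with the same call shape could be passed in.

```python
import math
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image or feature map as rows of values


def psnr(ref: Image, test: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized maps."""
    diffs = [(r - t) ** 2 for rr, tr in zip(ref, test) for r, t in zip(rr, tr)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def avg_pool2(img: Image) -> Image:
    """2x2 average pooling; a toy stand-in for one level of a deep network."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def toy_decompose(img: Image, levels: int = 2) -> List[Image]:
    """Hypothetical decomposition: `levels` progressively coarser maps.

    In the paper this role is played by handpicked layers of a pre-trained
    VGG-19; the pooling pyramid here only mimics the multi-level structure.
    """
    maps, cur = [], img
    for _ in range(levels):
        cur = avg_pool2(cur)
        maps.append(cur)
    return maps


def multilayer_score(
    ref: Image,
    test: Image,
    metric: Callable[[Image, Image], float],
    decompose: Callable[[Image], List[Image]],
    weights: Sequence[float],
) -> float:
    """Weighted mean (an assumed fusion rule) of the metric on the image pair
    and on every pair of corresponding feature maps."""
    scores = [metric(ref, test)]
    scores += [metric(fr, ft) for fr, ft in zip(decompose(ref), decompose(test))]
    if len(weights) != len(scores):
        raise ValueError("need one weight per level, including the image level")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


if __name__ == "__main__":
    reference = [[100.0] * 4 for _ in range(4)]
    distorted = [row[:] for row in reference]
    distorted[0][0] += 16.0  # a single corrupted pixel
    print(multilayer_score(reference, distorted, psnr, toy_decompose, [1, 1, 1]))
```

Passing the metric and the decomposition as callables mirrors the open-framework idea: swapping in SSIM or a real network changes only the arguments, not the fusion code.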
In this paper, several typical and efficient traditional IQA methods are improved and evaluated by means of the proposed method: the basic PSNR, structure similarity (SSIM), and its two effective variants, multi-scale structure similarity (MS-SSIM) and feature similarity (FSIM). Note that other full-reference IQA methods can also be improved by means of the proposed semantic decomposition-based framework.
Result The experimental data come from the TID2013 dataset, which includes 25 reference images and 3000 distorted images. Compared with other existing databases, TID2013 has more images and distortion types, which makes the results more reliable. Experimental results for the selected traditional methods (PSNR, SSIM, MS-SSIM, and FSIM) show that the proposed method effectively improves the performance of traditional image quality assessment methods and achieves corresponding improvements on objective criteria such as the Spearman rank order correlation coefficient (SRCC), Kendall rank order correlation coefficient (KRCC), Pearson linear correlation coefficient (PLCC), and root mean squared error (RMSE). On the TID2013 dataset, SRCC increases by 0.02, 0.07, 0.06, and 0.04 for PSNR, SSIM, MS-SSIM, and FSIM, respectively. SRCC and KRCC measure prediction monotonicity, PLCC measures prediction accuracy, and RMSE measures prediction consistency. The traditional methods obtain higher SRCC, KRCC, and PLCC values by means of the proposed method, and their RMSE values are much lower than those of the corresponding conventional IQA methods. In addition, the results on different distortion types demonstrate that the proposed method effectively improves performance.
Conclusion This paper proposes a full-reference image quality assessment method based on perceptual decomposition, which combines the benefits of both traditional methods and deep learning methods.
By taking low-level visual features and high-level semantic information into account simultaneously, the proposed method can effectively improve the evaluation performance of traditional methods. With the additional high-level semantic information, the IQA results become more consistent with subjective visual perception. Furthermore, the evaluation framework proposed in this paper can also be applied to other traditional full-reference IQA methods.
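For readers unfamiliar with the four evaluation criteria named in the Result section, the following is a minimal sketch of how SRCC, KRCC, PLCC, and RMSE can be computed between objective scores and subjective mean opinion scores. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function names are hypothetical, KRCC is computed as the simple tau-a variant, and no nonlinear regression is applied before PLCC and RMSE (the abstract does not say whether the authors used one).

```python
import math
from typing import List, Sequence


def plcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(v: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    out = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def srcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return plcc(ranks(x), ranks(y))


def krcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # tied pairs contribute zero
    return s / (n * (n - 1) / 2)


def rmse(x: Sequence[float], y: Sequence[float]) -> float:
    """Root mean squared error between predictions and targets."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical objective scores
    mos = [1.0, 3.0, 2.0, 5.0, 4.0]        # hypothetical subjective scores
    print(srcc(predicted, mos), krcc(predicted, mos),
          plcc(predicted, mos), rmse(predicted, mos))
```

SRCC and KRCC depend only on the ordering of the scores, which is why they are read as monotonicity measures, whereas PLCC and RMSE depend on the score values themselves.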