发布时间: 2021-07-16
图像分析和识别
收稿日期: 2020-09-04; 修回日期: 2021-01-25; 预印本日期: 2021-02-01
基金项目: 国家自然科学基金项目(61671283)
作者简介:
夏雨蒙, 1996年生, 女, 硕士研究生, 主要研究方向为图像视频的质量评估。E-mail: xym_96@shu.edu.cn
王永芳, 通信作者, 女, 教授, 主要研究方向为智能多媒体处理与分析、图像视频质量编码与评估。E-mail: yfw@shu.edu.cn
王闯, 男, 硕士研究生, 主要研究方向为感知模型及感知视频编码。E-mail: chuangwang@shu.edu.cn
中图法分类号: TP391.41
文献标识码: A
文章编号: 1006-8961(2021)07-1625-12
摘要
目的 全景图像的质量评价与其传输、处理过程并不在同一个空间进行,传统的评价算法无法准确反映用户观察球面场景时的真实感受。针对观察空间与处理空间不一致的问题,本文提出一种基于相位一致性的全参考全景图像质量评价模型。方法 首先将平面图像进行全景加权,使平面上的特征能准确反映球面空间的质量畸变。然后采用相位一致性互信息的相似度获取参考图像和失真图像的结构相似度。接着,利用相位一致性局部熵的相似度反映参考图像和失真图像的纹理相似度。最后将两部分相似度融合,得到全景图像的客观质量分数。结果 实验在全景质量评价数据集OIQA(omnidirectional image quality assessment)上进行,在原始图像中引入4种不同类型的失真,将提出的算法与6种主流算法进行性能对比,并单独比较了基于相位一致性互信息和相位一致性局部熵的性能,评价标准采用4项统计指标。实验结果表明,相比于现有的6种全景图像质量评估算法,本文算法在PLCC(Pearson linear correlation coefficient)和SRCC(Spearman rank order correlation coefficient)指标上比WS-SSIM(weighted-to-spherically-uniform structural similarity)算法高出0.4左右,在RMSE(root of mean square error)上低0.9左右,4项指标均为最优,能够获得更好的拟合效果。结论 本文算法解决了观察空间和映射空间不一致的问题,并融合了基于人眼感知的多尺度互信息相似度和局部熵相似度,获得与人眼感知更为一致的客观分数,评价效果更准确,更符合人眼视觉特征。
关键词
全景图像/视频; 质量评价; 人类视觉系统; 相位一致性; 结构相似度(SSIM); 纹理相似度
Abstract
Objective Panoramic images suffer distortion during acquisition, compression, and transmission. To provide viewers with an immersive experience, the resolution of a panoramic image is higher than that of a traditional image. The higher the resolution, the more bandwidth is needed for transmission and the more space is needed for storage. Image compression technology is therefore conducive to improving transmission efficiency, but it also introduces compression distortion. With viewers' increasing demand for panoramic image/video visual experience, research on virtual reality visual systems becomes increasingly important, and the quality evaluation of panoramic images/videos is an indispensable part of it. Traditional subjective observation of images is conducted on a screen, and objective quality assessment algorithms are designed for 2D planes. When assessing the quality of panoramic images, viewers freely switch perspectives to observe the whole spherical scene with the help of head-mounted equipment. However, transmission, storage, and processing are all performed on the projection format of the panoramic image, which causes an inconsistency between the observation and processing spaces. As a result, traditional assessment algorithms cannot accurately reflect viewers' real feelings when observing the sphere and cannot directly reflect the distortion degree of the spherical scene. To solve the problem of inconsistency between the observation and processing spaces, this study proposes a phase-consistency based panoramic image quality assessment (PC-PIQA) algorithm. Method Structure and texture information is rich in high-resolution panoramic images, and both are important features used by the human visual system to understand scene content. The proposed PC-PIQA model exploits these features to resolve the inconsistency between the observation space and the processing plane.
First, the equirectangular projection format is mapped to the cube map projection (CMP) format, and the panoramic weight under the CMP format is used to solve the problem of inconsistent observation and processing spaces. Then, the high-order phase-consistency mutual information of a single plane in the CMP format is calculated to describe the similarity of structural information between the reference and distorted images at different orders. Next, the texture similarity is calculated by using the similarity of the first-order phase-congruency local entropy. Finally, the visual quality of a single plane is obtained by fusing the two parts. According to the human eye's attention to panoramic content, different perceptual weights are assigned to the six planes to obtain the overall quality score. Result Experiments are conducted on the panoramic evaluation data set called omnidirectional image quality assessment (OIQA). Four different types of distortion are added to the original images, including JPEG compression, JPEG2000 compression, Gaussian blur, and Gaussian noise. The proposed algorithm is compared with six mainstream algorithms, including peak signal-to-noise ratio (PSNR), structural similarity (SSIM), craster parabolic projection PSNR (CPP-PSNR), weighted-to-spherically-uniform PSNR (WS-PSNR), spherical PSNR (S-PSNR), and weighted-to-spherically-uniform SSIM (WS-SSIM). The assessment criteria contain four indicators: Pearson linear correlation coefficient (PLCC), Spearman rank-order correlation coefficient (SRCC), Kendall rank-order correlation coefficient (KRCC), and root of mean square error (RMSE).
In addition, we also list the performance obtained separately by the structural similarity based on panoramic weighted mutual information (PW-MI) and the texture similarity based on panoramic weighted local entropy (PW-LE), which shows that each factor plays a significant role in improving the performance. The experimental results show that the PLCC and SRCC indexes of the proposed algorithm are approximately 0.4 higher than those of the WS-SSIM algorithm, and the RMSE index is approximately 0.9 lower. All four indexes are the best compared with the six existing panoramic image-quality assessment algorithms. Meanwhile, the individual performance of PW-MI and PW-LE is also better than that of the reference panoramic algorithms. The algorithm not only solves the problem of inconsistency between the observation and processing spaces, but is also robust to different distortion types and achieves the best fitting effect. The human visual system has different sensitivities to different image scales, and experimental results show that the sampling scales with parameters of 2 and 4 perform better. Therefore, the mutual information of each order of phase consistency on the two scales and the local entropy of the first-order phase consistency are finally fused. High-order phase consistency has a negative effect on the calculation of similarity, and the proposed model performs best when using the local entropy of the first-order phase consistency. Conclusion The proposed algorithm solves the problem of inconsistency between the observation and processing spaces, and combines multi-scale mutual information similarity and local entropy similarity based on human eye perception to obtain an objective score that is more consistent with human perception. The assessment result is more accurate and consistent with the human visual system. The panoramic quality evaluation model proposed in this paper is a traditional (non-learning) algorithm.
With the development of deep learning, frameworks implemented by neural networks can also achieve high accuracy. Further experiments are needed to determine whether our model can be integrated into neural network-based panoramic quality assessment.
Key words
panoramic image/video; quality assessment; human visual system; phase consistency; structural similarity(SSIM); texture similarity
0 引言
全景图像/视频能带来全新的视觉体验,但是在采集、传输以及存储过程中难免引入失真。对全景图像/视频进行主观评价能够准确反映其质量,但是需要大量人力物力。因此,对于图像的质量评价,快速而又准确的客观质量评价模型具有重要作用。
一些客观的全景图像质量评价方法将传统的峰值信噪比(peak signal to noise ratio, PSNR)和结构相似性(structural similarity, SSIM)与全景图像的特性相结合。S-PSNR(spherical PSNR)(Yu等,2015)将球上一点s投影到参考图像和失真图像上,分别找到对应点并计算这两点之间的PSNR作为失真全景图像的质量。Zakharchenko等人(2017)使用CPP-PSNR (craster parabolic projection PSNR)将参考图像与失真图像同时投影到CPP(craster’s parabolic projection)面上,再进行对应点PSNR的计算。Sun等人(2017)提出WS-PSNR (weighted-to-spherically-uniform PSNR),利用球面与投影平面之间的映射关系改进PSNR。
全景图像显著性的研究也为全景图像质量评价提供了新的思路。Upenik和Ebrahimi(2019),Upenik等人(2016)利用视觉注意力机制,提出了基于关注度的全参考质量评估模型(visual attention based PSNR, VA-PSNR),将得到的全景显著性图像与传统PSNR进行结合。Yang等人(2017)提出了基于反向传播的全参考质量评估模型(back propagation-based quality assessment of panoramic videos in VR system, BP-QAVR)以衡量全景视频的质量。
Xu等人(2019b)提出基于非内容和基于内容的两种全参考全景视频质量评价模型,前者认为不同位置的像素产生的失真与人眼的关注区有关,后者将对视频内容预测的可能观看方向作为权重来衡量质量损失。Zhou等人(2018)采用SSIM,在考虑亮度、对比度和结构特征的基础上,将像素从球面映射到投影平面时的面积拉伸比作为权重,扩展成全景质量评价模型WS-SSIM (weighted-to-spherically-uniform SSIM)。
通过拼接实现的全景图像重建需要将多个视点图通过拼接算法合成为广角视图,因此这类全景图像的失真主要是几何失真和结构失真。Cheung等人(2017)利用光流来建立像素点之间的对应关系,将几何误差和畸变程度两部分特征进行融合来评估拼接图像的失真。Xu等人(2019a)提出了一种立体全景图像的全参考评价模型。许欣等人(2018)利用小波域的特征设计了一种半参考全景图像质量评价模型。上述几种模型的计算均在全景图像的投影平面上进行,没有考虑到处理平面与观察空间之间的非线性关系。
全景图像主观质量评价时通过辅助设备自由切换视角以观察整个球面场景,但传输、存储与处理过程都是对全景图像的投影格式进行处理,这就造成观察空间与处理空间不一致的问题,从而导致传统评价算法无法直接反映球面场景的失真程度。因此,利用观察空间与处理空间之间的映射关系,更有利于提升全景质量评价算法的准确性。
高分辨率的全景图像中结构和纹理信息非常丰富,且结构和纹理信息是人眼视觉系统理解场景内容的重要特征。因此,针对观察空间与处理平面不一致的问题,本文提出一种基于相位一致性的全参考全景图像质量评价模型(phase consistency based panoramic image quality assessment, PC-PIQA)。依据视觉系统对结构和纹理的敏感性,计算参考图像与失真图像四阶相位一致性之间互信息的相似度,来衡量结构信息的相似度;计算一阶相位一致性局部熵的相似度,来衡量纹理相似度,将两部分融合得到最终的质量分数。
1 基于相位一致性的全参考全景图像质量评价算法
本文提出基于相位一致性的全参考全景图像评价模型,其框图如图 1所示。首先将经纬图投影(equirectangular projection, ERP)格式(艾达等,2018)映射为立方体投影(cube map projection, CMP)格式(Greene,1986),利用CMP格式下的全景权重解决观察空间和处理空间不一致的问题。然后,对CMP中单个平面计算高阶相位一致性之间的互信息来描述参考图像与失真图像不同阶之间结构信息传递的相似度。此外,利用一阶相位一致性局部熵的相似度反映纹理的相似度。将两部分质量融合可得单个平面的视觉质量。最后,根据人眼对全景内容的关注度,分配给6个平面不同的感知权重得到整体的质量分数。
1.1 基于投影格式的全景权重
如图 2所示,全景投影格式ERP应用广泛,但其两极点处像素拉伸变形非常严重,将ERP格式转换成CMP格式可以有效减轻畸变,并且CMP的单个平面更接近人眼视觉系统的结构(Dedhia等,2019)。为了解决观察空间与处理平面之间的非线性映射关系,利用CMP格式下像素面积由球面投影为平面时产生的拉伸比作为全景权重。设CMP单个平面的边长为A,平面上像素点的坐标为(i, j),则该点的全景权重ω_cmp(i, j)定义为
$ \omega_{\mathrm{cmp}}(i, j)=\left(1+\frac{d^{2}(i, j)}{r^{2}}\right)^{-3 / 2} $ | (1) |
$ \begin{gathered} d^{2}(i, j)=(i+0.5-A / 2)^{2}+ \\ (j+0.5-A / 2)^{2} \end{gathered} $ | (2) |
式中,(i, j)为CMP平面上像素点的坐标,A为平面的边长,r为球体半径(r = A/2),d(i, j)为该像素点中心到平面中心的距离。
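原文未给出权重生成的实现细节,以下给出式(1)(2)的一个最小Python示意(其中假设球半径 r = A/2,平面边长 A 为示例参数),仅作原理说明,并非原文实现:

```python
import numpy as np

def cmp_weight(A, r=None):
    """按式(1)(2)生成CMP单个平面(边长A)的全景权重图。
    r为球半径,此处假设 r = A/2。"""
    if r is None:
        r = A / 2.0
    idx = np.arange(A)
    ii, jj = np.meshgrid(idx, idx, indexing='ij')
    # 式(2): 像素中心到平面中心的距离平方
    d2 = (ii + 0.5 - A / 2.0) ** 2 + (jj + 0.5 - A / 2.0) ** 2
    # 式(1): 球面投影到平面时的面积拉伸比
    return (1.0 + d2 / r ** 2) ** (-1.5)
```

权重在平面中心处最大、四角处最小,与球面到立方体平面投影时采样密度的变化一致。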
1.2 基于高阶相位一致性互信息的结构相似度
图像的边缘信息对于人眼视觉系统至关重要,人眼容易感受到图像结构信息的变化。传统的边缘检测大多数都是通过Sobel、Roberts、Canny和Laplacian等算子实现的,可以提取图像的结构特征,但是失真的加入则会改变这些结构信息,因此测量结构的失真程度是质量评价的重要方法。但这些梯度函数的计算与人眼对边缘信息的处理过程不同,不符合人类的感知视觉特性。Kovesi(1999)提出一种基于人类视觉特性的相位一致性图像边缘检测算子,表明人类视觉系统感知的图像特征集中在图像中各谐波分量相位最一致的点,且该算法可以不受图像局部光线明暗变化的影响。
将CMP单个平面记为I(x, y),利用全景权重对其加权,得到加权后的平面I′(x, y),即
$ \boldsymbol{I}^{\prime}(x, y)=\boldsymbol{I}(x, y) \cdot \omega_{\mathrm{cmp}} $ | (3) |
对加权后的平面计算相位一致性,位置x处的相位一致性P(x)定义为
$ P(x)=\frac{\sum\limits_{n=1}^{N} W(x)\left\lfloor A_{n}(x) \Delta \phi_{n}(x)-T\right\rfloor}{\sum\limits_{n=1}^{N} A_{n}(x)+\varepsilon} $ | (4) |
式中,A_n(x)和φ_n(x)分别为第n个频率分量在x处的幅值和相位,N为分量个数,W(x)为频率扩展的权重,T为噪声阈值,ε为防止分母为零的极小正数,⌊·⌋表示取值为正时保留原值、为负时置零。相位偏差函数Δφ_n(x)定义为
$ \begin{array}{c} \Delta \phi_{n}(x)=\cos \left(\phi_{n}(x)-\overline{\phi_{n}}(x)\right)- \\ \left|\sin \left(\phi_{n}(x)-\overline{\phi_{n}}(x)\right)\right| \end{array} $ | (5) |
式中,φ_n(x)为第n个分量在x处的局部相位,φ̄_n(x)为该处所有分量的加权平均相位。
如图 5所示,对比原始图像与失真图像的一阶相位一致性图可以看出,不同失真类型会导致相位一致性产生不同形式的改变。
计算单个平面的原始图像和失真图像的四阶相位一致性图,每一阶相位一致性图描述不同程度上的图像结构信息,这里使用各阶之间的互信息来表达基于相位一致性的特征。以第一阶和第二阶之间的互信息为例,设第一阶相位一致性图为P^1st,第二阶相位一致性图为P^2nd,二者的信息熵分别为
$ H\left(P^{1 \mathrm{st}}\right)=-\sum\limits_{m} P_{P ^{\mathrm{1st}}}(m) \log P_{P^{1 \mathrm{st}}}(m) $ | (6) |
$ H\left(P^{2 \mathrm{nd}}\right)=-\sum\limits_{n} P_{P ^{\mathrm{2nd}}}(n) \log P_{P^{2 \mathrm{nd}}}(n) $ | (7) |
式中,P_{P^1st}(m)和P_{P^2nd}(n)分别为P^1st和P^2nd的边缘概率分布。二者的联合熵为
$ \begin{gathered} H\left(P^{1 \mathrm{st}}, P^{2 \mathrm{nd}}\right)=-\sum\limits_{m, n} P_{P^{\mathrm{1st}}, {P}^{2 \mathrm{nd}}}(m, n) \cdot \\ \log P_{P^{1 \mathrm{st}}, P ^{2 \mathrm{nd}}}(m, n) \end{gathered} $ | (8) |
式中,P_{P^1st, P^2nd}(m, n)为P^1st和P^2nd的联合概率分布。由此,第一阶与第二阶相位一致性图之间的互信息为
$ \begin{gathered} M^{1 \mathrm{st}, 2 \mathrm{nd}}=H\left(P^{1 \mathrm{st}}\right)+H\left(P^{2 \mathrm{nd}}\right)- \\ H\left(P^{1 \mathrm{st}}, P^{2 \mathrm{nd}}\right) \end{gathered} $ | (9) |
同理可计算其他阶数之间的互信息,在权衡复杂度和性能的基础上仅使用3个互信息特征。参考图像单个平面的互信息特征集合记为G_r = {M_R^{1st,2nd}, M_R^{2nd,3rd}, M_R^{3rd,4th}},失真图像的互信息特征集合记为G_d = {M_D^{1st,2nd}, M_D^{2nd,3rd}, M_D^{3rd,4th}},二者的相似度定义为
$ Q_{\mathrm{MI}}=\frac{\sum\limits_{\boldsymbol{R} \in \boldsymbol{G}_{r}} \sum\limits_{\boldsymbol{D} \in \boldsymbol{G}_{d}}\left(\boldsymbol{M}_{\mathrm{R}}-\overline{\boldsymbol{M}_{\mathrm{R}}}\right)\left(\boldsymbol{M}_{\mathrm{D}}-\overline{\boldsymbol{M}_{\mathrm{D}}}\right)}{\sqrt{\left(\sum\limits_{\boldsymbol{R} \in \boldsymbol{G}_{r}}\left(\boldsymbol{M}_{\mathrm{R}}-\overline{\boldsymbol{M}_{\mathrm{R}}}\right)^{2}\right)\left(\sum\limits_{\boldsymbol{D} \in \boldsymbol{G}_{d}}\left(\boldsymbol{M}_{\mathrm{D}}-\overline{\boldsymbol{M}_{\mathrm{D}}}\right)^{2}\right)}} $ | (10) |
$ \overline{\boldsymbol{M}_{\mathrm{R}}}={mean}\left(\boldsymbol{M}_{\mathrm{R}}^{1 \mathrm{st}, 2 \mathrm{nd}}, \boldsymbol{M}_{\mathrm{R}}^{2 \mathrm{nd}, 3 \mathrm{rd}}, \boldsymbol{M}_{\mathrm{R}}^{3 \mathrm{rd}, 4 \mathrm{th}}\right) $ | (11) |
$ \overline{\boldsymbol{M}_{\mathrm{D}}}={mean}\left(\boldsymbol{M}_{\mathrm{D}}^{1 \mathrm{st}, 2 \mathrm{nd}}, \boldsymbol{M}_{\mathrm{D}}^{2 \mathrm{nd}, 3 \mathrm{rd}}, \boldsymbol{M}_{\mathrm{D}}^{3 \mathrm{rd}, 4 \mathrm{th}}\right) $ | (12) |
式中,M̄_R和M̄_D分别为参考图像和失真图像3个互信息特征的均值,G_r和G_d分别为二者的互信息特征集合。
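式(6)—(9)中各阶相位一致性图之间的互信息可由联合直方图估计得到,以下为一个最小Python示意(直方图bin数为示例假设):

```python
import numpy as np

def mutual_information(pc_a, pc_b, bins=64):
    """由联合直方图估计两幅相位一致性图之间的互信息,对应式(9)。"""
    joint, _, _ = np.histogram2d(pc_a.ravel(), pc_b.ravel(), bins=bins)
    p_joint = joint / joint.sum()          # 联合概率分布
    p_a = p_joint.sum(axis=1)              # 边缘概率分布
    p_b = p_joint.sum(axis=0)

    def entropy(p):                        # 式(6)—(8)中的熵
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    # 式(9): M = H(P^1st) + H(P^2nd) - H(P^1st, P^2nd)
    return entropy(p_a) + entropy(p_b) - entropy(p_joint.ravel())
```

两幅完全相同的图,互信息等于其自身的熵;两幅相互独立的图,互信息趋近于0。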
应用式(10)计算出CMP格式下6个平面的质量Q_MI^k(k = 1, 2, …, 6),加权融合得到基于互信息的结构相似度质量
$ \begin{gathered} Q_{\mathrm{M}}=\omega_{1} \cdot Q_{\mathrm{MI}}^{1}+\omega_{2} \cdot Q_{\mathrm{MI}}^{2}+\omega_{3} \cdot Q_{\mathrm{MI}}^{3}+ \\ \omega_{4} \cdot Q_{\mathrm{MI}}^{4}+\omega_{5} \cdot Q_{\mathrm{MI}}^{5}+\omega_{6} \cdot Q_{\mathrm{MI}}^{6} \end{gathered} $ | (13) |
式中,ω_1—ω_6为依据人眼对全景内容的关注度分配给6个平面的感知权重。
在不同尺度上观察图像时,人眼视觉系统会捕捉不同的内容,尺度变小时会更加关注图像的整体概貌,反之则更关注图像中的细节(Li等,2016)。为了获取不同尺度上的特征来模拟人眼对于尺度的感知特性,以参数2和4对图像进行下采样,两次下采样获取的不同尺度的相位一致性互信息相似度质量分别记为Q_M^{1/2}和Q_M^{1/4}。
1.3 基于相位一致性的局部熵的纹理相似度
图像信息熵(Ren等,2017)在图像恢复、边缘检测、目标检测和图像匹配等领域应用广泛。全局熵的大小反映了整幅图像包含的信息量,局部熵则反映了图像灰度的离散程度、图像的纹理分布情况。失真的引入会破坏图像中的纹理信息,因此使用局部熵的变化来衡量失真程度。
构建一阶相位一致性图的局部熵。设相位一致性图中像素点(i, j)处的取值为Θ(i, j),以点(x, y)为中心、大小为n×n(n为奇数)的窗口内的局部熵E(x, y)定义为
$ \begin{aligned} E(x, y)=-& \sum\limits_{i=x-(n-1) / 2}^{x+(n-1) / 2} \sum\limits_{j=y-(n-1) / 2}^{y+(n-1) / 2} p(\varTheta(i, j)) \times \\ & \log p(\varTheta(i, j)) \end{aligned} $ | (14) |
$ p(\varTheta(i, j))=\frac{\varTheta(i, j)}{\sum\limits_{i=x-(n-1) / 2}^{x+(n-1) / 2} \sum\limits_{j=y-(n-1) / 2}^{y+(n-1) / 2} \varTheta(i, j)} $ | (15) |
将局部熵算子在相位一致性图上遍历,获得一阶相位一致性的局部熵图。参考图像和失真图像的局部熵图分别记为E_R和E_D,二者的相似度定义为
$ Q_{\mathrm{LE}}=\frac{\sum\limits_{R \in \boldsymbol{G}_{r}} \sum\limits_{D \in \boldsymbol{G}_{d}}\left(E_{\mathrm{R}}-\overline{E_{\mathrm{R}}}\right)\left(E_{\mathrm{D}}-\overline{E_{\mathrm{D}}}\right)}{\sqrt{\left(\sum\limits_{R \in \boldsymbol{G}_{r}}\left(E_{\mathrm{R}}-\overline{E_{\mathrm{R}}}\right)^{2}\right)\left(\sum\limits_{D \in \boldsymbol{G}_{d}}\left(E_{\mathrm{D}}-\overline{E_{\mathrm{D}}}\right)^{2}\right)}} $ | (16) |
式中,Ē_R和Ē_D分别为参考图像和失真图像局部熵图的均值,G_r和G_d为对应的局部熵集合。
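式(14)—(16)的局部熵图及其相似度可示意如下(窗口大小n为示例假设;此处用双重循环直接实现,实际中可用滑动窗口技巧加速):

```python
import numpy as np

def local_entropy_map(pc, n=9):
    """式(14)(15): 以每个像素为中心的n×n窗口内,
    将窗口取值归一化为概率后计算局部熵。"""
    pad = n // 2
    padded = np.pad(pc, pad, mode='reflect')
    ent = np.zeros(pc.shape, dtype=float)
    eps = 1e-12
    for x in range(pc.shape[0]):
        for y in range(pc.shape[1]):
            w = padded[x:x + n, y:y + n]
            p = w / (w.sum() + eps)             # 式(15): 窗口内归一化为概率
            p = p[p > 0]
            ent[x, y] = -np.sum(p * np.log(p))  # 式(14)
    return ent

def entropy_similarity(e_r, e_d):
    """式(16): 参考与失真局部熵图的相关性相似度。"""
    a = e_r - e_r.mean()
    b = e_d - e_d.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2) + 1e-12)
```

当参考与失真局部熵图完全一致时相似度为1;失真破坏纹理分布后,相似度随之下降。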
应用式(16)计算出CMP格式下每个平面的质量Q_LE^k(k = 1, 2, …, 6),加权融合得到基于局部熵的纹理相似度质量
$ \begin{gathered} Q_{\mathrm{E}}=\omega_{1} \cdot Q_{\mathrm{LE}}^{1}+\omega_{2} \cdot Q_{\mathrm{LE}}^{2}+\omega_{3} \cdot Q_{\mathrm{LE}}^{3}+ \\ \omega_{4} \cdot Q_{\mathrm{LE}}^{4}+\omega_{5} \cdot Q_{\mathrm{LE}}^{5}+\omega_{6} \cdot Q_{\mathrm{LE}}^{6} \end{gathered} $ | (17) |
式中,ω_1—ω_6与式(13)中相同,为分配给6个平面的感知权重。
最后,将多尺度的基于高阶相位一致性互信息的质量与基于一阶相位一致性的局部熵的质量进行融合,即联合式(13)和(17)获得最终图像质量Q_all
$ \begin{gathered} Q_{\text {all }}=\omega_{7} \cdot\left(\omega_{9} \cdot Q_{\mathrm{M}}^{1 / 2}+\omega_{9} \cdot Q_{\mathrm{E}}^{1 / 2}\right)+ \\ \omega_{8} \cdot\left(\omega_{9} \cdot Q_{\mathrm{M}}^{1 / 4}+\omega_{9} \cdot Q_{\mathrm{E}}^{1 / 4}\right) \end{gathered} $ | (18) |
式中,参数ω_7和ω_8为1/2和1/4两个下采样尺度的权重,ω_9为融合权重,均由实验确定。
2 实验结果与分析
实验在全景图像质量评价(omnidirectional image quality assessment, OIQA)数据集(Cheung等,2017)上进行,共含336幅图像,包括16幅原始图像和320幅失真图像。16幅原始参考图像的分辨率在11 332×5 666像素与13 320×6 660像素之间,场景多样,如图 7所示。在原始全景图像中引入JPEG压缩、JPEG2000压缩、高斯模糊(Gaussian blur, GB)和高斯白噪声(Gaussian white noise, GN)4种不同类型的失真,每一种失真存在4种不同程度的失真情况,生成320幅失真图像。
表 1列出本文提出的PC-PIQA算法与主流算法的性能对比,包括两个传统算法PSNR(Horé和Ziou,2010)、SSIM(Wang等,2004)和4种主流全景算法CPP-PSNR(Zakharchenko等,2017)、WS-PSNR(Sun等,2017)、S-PSNR(Yu等,2015)、WS-SSIM(Zhou等,2018),最佳性能用粗体表示。除此之外,还展示了基于高阶相位一致性互信息(panoramic weighted-mutual information, PW-MI)的结构相似度与基于一阶相位一致性局部熵(panoramic weighted-local entropy, PW-LE)的纹理相似度单独获得的性能。
表 1
整体性能对比
Table 1
The comparison of overall performance
评价模型 | PLCC | SRCC | KRCC | RMSE |
PSNR(Horé和Ziou,2010) | 0.508 0 | 0.497 9 | 0.338 2 | 1.821 1 |
SSIM(Wang等,2004) | 0.249 2 | 0.348 3 | 0.437 3 | 1.901 4 |
CPP-PSNR(Zakharchenko等,2017) | 0.350 2 | 0.518 2 | 0.518 6 | 1.807 8 |
WS-PSNR(Sun等,2017) | 0.504 4 | 0.503 2 | 0.341 4 | 1.825 6 |
S-PSNR(Yu等,2015) | 0.531 9 | 0.530 3 | 0.358 8 | 1.790 4 |
WS-SSIM(Zhou等,2018) | 0.459 1 | 0.431 1 | 0.295 5 | 1.878 3 |
PW-MI | 0.627 6 | 0.622 1 | 0.439 1 | 1.646 0 |
PW-LE | 0.890 3 | 0.885 6 | 0.693 1 | 0.962 8 |
PC-PIQA | 0.892 2 | 0.889 3 | 0.697 1 | 0.954 7 |
注:加粗字体为每列最优结果。 |
采用4个常用的客观质量评价统计学指标:皮尔森线性相关系数(Pearson linear correlation coefficient, PLCC)、斯皮尔曼秩序相关系数(Spearman rank order correlation coefficient, SRCC)、肯德尔秩序相关系数(Kendall rank order correlation coefficient, KRCC)和均方根误差(root of mean square error, RMSE),计算式分别为
$ M_{\mathrm{PLCC}}=\frac{\sum\limits_{i=1}^{N}\left(s_{i}-\bar{s}\right)\left(p_{i}-\bar{p}\right)}{\sqrt{\sum\limits_{i=1}^{N}\left(s_{i}-\bar{s}\right)^{2}} \sqrt{\sum\limits_{i=1}^{N}\left(p_{i}-\bar{p}\right)^{2}}} $ | (19) |
式中,s_i为第i幅图像的主观分数,p_i为对应的客观预测分数,s̄和p̄分别为主观分数和客观分数的均值,N为图像数量。
$ M_{\mathrm{SRCC}}=1-\frac{6 \sum\limits_{i=1}^{N} d_{i}^{2}}{N\left(N^{2}-1\right)} $ | (20) |
式中,d_i为第i幅图像的主观分数与客观分数的秩之差,N为图像数量。
$ M_{\mathrm{KRCC}}=\frac{N_{c}-N_{d}}{0.5 N(N-1)} $ | (21) |
式中,N_c为一致对的数量,N_d为不一致对的数量,N为图像数量。
$ M_{\mathrm{RMSE}}=\sqrt{\frac{1}{N} \sum\limits_{i=1}^{N}\left(X_{i}-Y_{i}\right)^{2}} $ | (22) |
式中,X_i和Y_i分别为第i幅图像的主观分数和客观预测分数,N为图像数量。
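式(19)—(22)的4项统计指标可按如下Python代码计算(简化实现,未处理并列秩;实际评测中PLCC与RMSE通常在客观分数经非线性回归拟合后计算):

```python
import numpy as np

def rank(x):
    """无并列值时的秩(从1开始)。"""
    x = np.asarray(x, float)
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(1, len(x) + 1)
    return r

def plcc(s, p):                            # 式(19)
    s, p = np.asarray(s, float), np.asarray(p, float)
    a, b = s - s.mean(), p - p.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

def srcc(s, p):                            # 式(20)
    d = rank(s) - rank(p)
    n = len(d)
    return 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))

def krcc(s, p):                            # 式(21)
    s, p = np.asarray(s, float), np.asarray(p, float)
    n = len(s)
    nc = nd = 0
    for i in range(n):                     # 统计一致对N_c与不一致对N_d
        for j in range(i + 1, n):
            sign = (s[i] - s[j]) * (p[i] - p[j])
            if sign > 0:
                nc += 1
            elif sign < 0:
                nd += 1
    return (nc - nd) / (0.5 * n * (n - 1))

def rmse(s, p):                            # 式(22)
    s, p = np.asarray(s, float), np.asarray(p, float)
    return np.sqrt(np.mean((s - p) ** 2))
```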
从表 1可以得知:由于解决了观察空间和映射空间不一致的问题,并且融合了基于人眼感知的多尺度互信息相似度和局部熵相似度,提出的基于相位一致性的全景算法在4个指标上都达到了最佳;同时PW-MI和PW-LE的单独性能也高于这几种全景算法。此外,与基于结构信息的SSIM和WS-SSIM相比,提出的模型性能更优。图 8进一步分别给出平均意见分数(mean opinion score, MOS)与7个模型预测分数的拟合散点图。可以看出,图 8(g)代表的PC-PIQA相较于其他6种模型拟合得更好。
为了验证所提出的全参考全景图像质量评价算法PC-PIQA对不同失真类型具有鲁棒性,表 2中列出了4种全景算法和PC-PIQA在4种不同失真类型图像上4种指标的比较。
表 2
不同失真类型的性能对比
Table 2
Performance comparison of different distortion types
指标 | 失真类型 | CPP-PSNR | WS-PSNR | S-PSNR | WS-SSIM | PC-PIQA |
PLCC | JPEG | 0.568 3 | 0.736 8 | 0.752 6 | 0.790 3 | 0.886 8 |
PLCC | JPEG2000 | 0.704 0 | 0.717 7 | 0.726 4 | 0.740 1 | 0.900 4 |
PLCC | GN | 0.950 3 | 0.948 3 | 0.954 0 | 0.938 3 | 0.901 0 |
PLCC | GB | 0.498 0 | 0.496 1 | 0.521 2 | 0.423 2 | 0.921 2 |
SRCC | JPEG | 0.223 0 | 0.705 0 | 0.719 9 | 0.787 5 | 0.874 5 |
SRCC | JPEG2000 | 0.749 6 | 0.763 6 | 0.763 3 | 0.736 9 | 0.896 9 |
SRCC | GN | 0.922 8 | 0.930 1 | 0.918 2 | 0.922 3 | 0.891 8 |
SRCC | GB | 0.525 1 | 0.523 2 | 0.543 2 | 0.409 8 | 0.907 4 |
KRCC | JPEG | 0.456 0 | 0.509 5 | 0.524 1 | 0.595 7 | 0.681 9 |
KRCC | JPEG2000 | 0.557 3 | 0.572 5 | 0.574 4 | 0.549 7 | 0.716 9 |
KRCC | GN | 0.740 8 | 0.763 0 | 0.733 2 | 0.746 5 | 0.707 9 |
KRCC | GB | 0.362 6 | 0.363 8 | 0.377 8 | 0.280 1 | 0.733 7 |
RMSE | JPEG | 1.890 1 | 1.553 2 | 1.512 7 | 1.407 5 | 1.061 7 |
RMSE | JPEG2000 | 1.569 5 | 1.539 0 | 1.518 8 | 1.486 2 | 0.961 5 |
RMSE | GN | 0.586 0 | 0.597 2 | 0.563 8 | 0.650 7 | 0.816 4 |
RMSE | GB | 1.693 1 | 1.695 1 | 1.666 2 | 1.769 1 | 0.759 6 |
注:加粗字体为每列最优结果。 |
从表 2中可以看出,在高斯噪声这一类失真图像上,PC-PIQA所获得的性能比WS-PSNR和S-PSNR略低,但这4种全景算法在其他3种失真类型上的性能远差于提出的PC-PIQA。因此,综合4种失真类型的实验结果,提出的PC-PIQA获得的综合性能更好。
人眼视觉系统对不同尺度的图像具有不同的敏感度。如果尺度过大,容易忽略人眼对全局的感知,如果尺度过小,容易忽视局部细节。因此本文进行了下采样的实验,不同尺度下获得的性能对比如表 3所示。通过实验发现, 以参数为2和4采样的尺度获得的性能较佳,因此最终将两个尺度上的各阶相位一致性的互信息和一阶相位一致性的局部熵进行了融合。
表 3
不同尺度的性能对比
Table 3
Performance comparison of different scales
尺度 | PLCC | SRCC | KRCC | RMSE |
原尺度 | 0.835 0 | 0.830 9 | 0.628 8 | 1.163 4 |
1/2 | 0.871 9 | 0.868 0 | 0.670 2 | 1.035 4 |
1/4 | 0.882 1 | 0.879 8 | 0.685 2 | 0.996 0 |
1/8 | 0.805 2 | 0.803 3 | 0.597 0 | 1.253 7 |
1/16 | 0.565 2 | 0.567 2 | 0.393 6 | 1.744 2 |
注:加粗字体为每列最优结果。 |
最后,表 4展示了从一阶到五阶的基于局部熵相似度的质量的实验结果,可以看出,获得的性能随着阶数的不断增加在逐渐下降。因此可以认为,虽然高阶的相位一致性能够获得更加清晰的结构特征和纹理信息,但随着阶数的升高,被失真破坏的结构特征和纹理信息会对相似度的计算产生不利的影响,且局部熵会进一步放大这种不利影响,所以最终使用了一阶相位一致性的局部熵。
表 4
各阶相位一致性局部熵的影响
Table 4
Influence of phase consistency local entropy of each order
阶数 | PLCC | SRCC | KRCC | RMSE |
一阶 | 0.834 8 | 0.830 6 | 0.629 2 | 1.163 9 |
二阶 | 0.824 6 | 0.821 2 | 0.619 9 | 1.196 1 |
三阶 | 0.820 6 | 0.817 2 | 0.615 3 | 1.208 3 |
四阶 | 0.818 4 | 0.818 5 | 0.613 6 | 1.125 0 |
五阶 | 0.812 0 | 0.811 0 | 0.609 7 | 1.234 0 |
注:加粗字体为每列最优结果。 |
3 结论
本文提出一种基于相位一致性的全参考全景图像质量评价算法。首先采用基于人类视觉特性的相位一致性算子提取参考图像和失真图像结构特征相似度,然后利用一阶相位一致性局部熵的相似度反映参考图像和失真图像纹理的相似度,将两部分质量融合可得全景图像的客观质量分数。
在OIQA全景图像数据集上的实验结果表明,本文算法在4项评价指标都达到最佳结果,优于对比的全参考图像质量评估算法,与主观感受具有较高的一致性。该方法不但解决了观察空间和处理空间不一致的问题,而且对不同失真类型具有很好的鲁棒性,能够获得更好的拟合效果。
全景图像和视频的质量评价在虚拟现实(virtual reality, VR)技术及其应用的发展和普及中有着非常关键的作用,近年来成为多媒体技术领域研究的热点。随着深度学习的发展,深度学习网络所实现的框架同样也能获得较高的准确性。本文提出的全景质量评价模型属于传统算法,并未与深度学习方法进行比较。此外,该模型是否可以进一步融合到基于神经网络的全景质量评价中,还需要进一步论证和实验。
参考文献
- Ai D, Dong J J, Lin N and Liu Y. 2018. Advance of 360-degree video coding for virtual reality: a survey. Application Research of Computers, 35(6): 1606-1612 (艾达, 董久军, 林楠, 刘颖. 2018. 用于虚拟现实的360度视频编码技术新进展. 计算机应用研究, 35(6): 1606-1612) [DOI: 10.3969/j.issn.1001-3695.2018.06.002]
- Cheung G, Yang L Y, Tan Z G and Huang Z. 2017. A content-aware metric for stitched panoramic image quality assessment//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops (ICCVW). Venice, Italy: IEEE: 2487-2494 [DOI: 10.1109/ICCVW.2017.293]
- Dedhia B, Chiang J C and Char Y F. 2019. Saliency prediction for omnidirectional images considering optimization on sphere domain//Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, UK: IEEE: 2142-2146 [DOI: 10.1109/ICASSP.2019.8683125]
- Greene N. 1986. Environment mapping and other applications of world projections. IEEE Computer Graphics and Applications, 6(11): 21-29 [DOI: 10.1109/MCG.1986.276658]
- Horé A and Ziou D. 2010. Image quality metrics: PSNR vs. SSIM//Proceedings of the 20th IEEE International Conference on Pattern Recognition (ICPR). Istanbul, Turkey: IEEE: 2366-2369 [DOI: 10.1109/ICPR.2010.579]
- Kovesi P. 1999. Image features from phase congruency. Videre: Journal of Computer Vision Research, 1(3): 1-26
- Lebreton P and Raake A. 2018. GBVS360, BMS360, ProSal: extending existing saliency prediction models from 2D to omnidirectional images. Signal Processing: Image Communication, 69: 69-78 [DOI: 10.1016/j.image.2018.03.006]
- Li Q H, Lin W S and Fang Y M. 2016. No-reference quality assessment for multiply-distorted images in gradient domain. IEEE Signal Processing Letters, 23(4): 541-545 [DOI: 10.1109/LSP.2016.2537321]
- Ren Y F, Sun L, Wu G W and Huang W Z. 2017. DIBR-synthesized image quality assessment based on local entropy analysis//Proceedings of 2017 IEEE International Conference on the Frontiers and Advances in Data Science (FADS). Xi'an, China: IEEE: 86-90 [DOI: 10.1109/FADS.2017.8253200]
- Sun Y L, Lu A and Yu L. 2017. Weighted-to-spherically-uniform quality evaluation for omnidirectional video. IEEE Signal Processing Letters, 24(9): 1408-1412 [DOI: 10.1109/LSP.2017.2720693]
- Upenik E and Ebrahimi T. 2019. Saliency driven perceptual quality metric for omnidirectional visual content//Proceedings of 2019 IEEE International Conference on Image Processing (ICIP). Taipei, China: IEEE: 4335-4339 [DOI: 10.1109/ICIP.2019.8803637]
- Upenik E, Řeřábek M and Ebrahimi T. 2016. Testbed for subjective evaluation of omnidirectional visual content//Proceedings of 2016 IEEE Picture Coding Symposium (PCS). Nuremberg, Germany: IEEE: 1-5 [DOI: 10.1109/PCS.2016.7906378]
- Wang Z, Bovik A C, Sheikh H R and Simoncelli E P. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612 [DOI: 10.1109/TIP.2003.819861]
- Xu J H, Luo Z Y, Zhou W, Zhang W Y and Chen Z B. 2019a. Quality assessment of stereoscopic 360-degree images from multi-viewports//Proceedings of 2019 IEEE Picture Coding Symposium (PCS). Ningbo, China: IEEE: 1-5 [DOI: 10.1109/PCS48520.2019.8954555]
- Xu M, Li C, Chen Z Z, Wang Z L and Guan Z Y. 2019b. Assessing visual quality of omnidirectional videos. IEEE Transactions on Circuits and Systems for Video Technology, 29(12): 3516-3530 [DOI: 10.1109/TCSVT.2018.2886277]
- Xu X, Zhang H Q and Xia Z F. 2018. Quality assessment of 360-degree spherical images based on feature extraction in the wavelet domain. Video Engineering, 42(4): 36-40 (许欣, 张会清, 夏志方. 2018. 基于小波域特征提取的360度全景图像质量评价. 电视技术, 42(4): 36-40) [DOI: 10.16280/j.videoe.2018.04.007]
- Yang S, Zhao J Z, Jiang T T, Wang J, Rahim T, Zhang B, Xu Z J and Fei Z S. 2017. An objective assessment method based on multi-level factors for panoramic videos//Proceedings of 2017 IEEE Visual Communications and Image Processing (VCIP). St. Petersburg, USA: IEEE: 1-4 [DOI: 10.1109/VCIP.2017.8305133]
- Yu M, Lakshman H and Girod B. 2015. A framework to evaluate omnidirectional video coding schemes//Proceedings of 2015 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Fukuoka, Japan: IEEE: 31-36 [DOI: 10.1109/ISMAR.2015.12]
- Zakharchenko V, Choi K P, Alshina E and Park J H. 2017. Omnidirectional video quality metrics and evaluation process//Proceedings of 2017 IEEE Data Compression Conference (DCC). Snowbird, USA: IEEE: #472 [DOI: 10.1109/DCC.2017.90]
- Zhou Y F, Yu M, Ma H L, Shao H and Jiang G Y. 2018. Weighted-to-spherically-uniform SSIM objective quality evaluation for panoramic video//Proceedings of 2018 IEEE International Conference on Signal Processing (ICSP). Beijing, China: IEEE: 54-57 [DOI: 10.1109/ICSP.2018.8652269]
- Zhu Y C, Zhai G T and Min X K. 2018. The prediction of head and eye movement for 360 degree images. Signal Processing: Image Communication, 69: 15-25 [DOI: 10.1016/j.image.2018.05.010]