Published: 2017-04-16
DOI: 10.11834/jig.20170405
2017 | Volume 22 | Number 4




Wang Ying1, Yu Mei1,2, Ying Hongwei1, Jiang Gangyi1,2
1. Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China;
2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China


Objective Aiming at the visual discomfort that viewers may experience when watching stereoscopic image content, and based on the influence of disparity on the visual comfort of stereoscopic images, a visual comfort enhancement method for stereoscopic images combining global linear and local nonlinear disparity remapping is proposed. Method First, considering the binocular fusion limit and the visual attention mechanism, global and local disparity statistical features of stereoscopic images are extracted in combination with spatial frequency and stereoscopic saliency, respectively, and support vector regression is used to construct an objective visual comfort prediction model that constrains the degree of disparity remapping. Then, the visual comfort of the input stereoscopic image is analyzed with the constructed prediction model, and a two-stage disparity remapping strategy is designed for less-comfortable stereoscopic images: a global linear remapping of the disparity range, followed by a local nonlinear remapping of the disparity within the extracted potential less-comfortable regions. Finally, the comfort-enhanced stereoscopic image is rendered from the remapped disparity map. Result Experimental results on the IVY Lab stereoscopic image database for visual comfort show that, compared with representative visual comfort enhancement methods, the proposed method enhances the visual comfort of less-comfortable stereoscopic images more effectively while maintaining the overall 3D sense of the scene. Conclusion The proposed method automatically performs the global linear and local nonlinear disparity remapping processes according to visual comfort prediction models built from different stereoscopic image features, improving the visual comfort of stereoscopic images while minimizing the weakening of 3D sense caused by disparity changes, thereby enhancing the overall 3D experience of stereoscopic images.


stereoscopic image; visual comfort enhancement; objective prediction model; disparity remapping; three-dimensional sense

Visual comfort enhancement for stereoscopic images based on disparity remapping
Wang Ying1, Yu Mei1,2, Ying Hongwei1, Jiang Gangyi1,2
1. Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China;
2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
Supported by: National Natural Science Foundation of China (U1301257, 61671258); Natural Science Foundation of Zhejiang Province, China (LY15F010005)


Objective At present, 3D videos have become extensively integrated into the daily lives of people due to the immersive visual experience that they provide to users. However, viewers can experience visual discomfort when watching 3D videos, and may even suffer from eye fatigue, headache, nausea, and other symptoms due to defects in 3D imaging technology. Therefore, the study of visual comfort enhancement methods for stereoscopic images or videos is highly significant to improve stereoscopic display technology and provide users with higher-quality 3D vision services. The factors that can cause visual discomfort when people watch stereoscopic images or videos include the following: vergence-accommodation conflict, excessive cross and non-cross disparities, disparity distribution, spatial frequency, mismatch between left and right images, and object movement. Vergence-accommodation conflict is the fundamental cause of visual discomfort and is characterized by a large disparity in 3D space. If the disparity is outside the fusion range, then the viewer cannot fuse the left and right images into a stereoscopic image and instead sees an unclear crosstalk image, thereby resulting in severe visual fatigue. Disparity distribution is also one of the main factors that affect visual comfort. Excessive cross disparity is more likely to cause visual discomfort than excessive non-cross disparity: when the entire image is located in front of the screen, visual comfort is lower than when the entire image is positioned behind the screen. The more concentrated the disparity distribution of an image is around the zero-disparity plane, the more comfortable the image is to view; as dispersion decreases, viewing becomes more comfortable. Spatial frequency also influences visual comfort by affecting the binocular fusion limit.
An image with high spatial frequency causes a higher degree of visual discomfort than an image with low spatial frequency. Disparity adjustment is the main approach to enhancing the visual comfort of stereoscopic images because the vergence-accommodation conflict caused by excessive disparity is the main factor that leads to visual discomfort. Disparity adjustment methods can be divided into two categories: disparity shifting and disparity scaling. A disparity shifting method adjusts disparity by shifting the zero-disparity plane of the original image, thereby keeping the disparity range unchanged. Although this method has low computational complexity, when the original disparity range exceeds the comfortable viewing zone, no amount of shifting can simultaneously keep the maximum cross disparity and non-cross disparity within the comfort zone. Thus, visual discomfort remains unavoidable in this case. By contrast, the disparity range of the original scene can be linearly or nonlinearly adjusted into the comfort zone by using a disparity scaling method. In general, excessive vergence-accommodation conflict can be avoided effectively by reducing the disparity range of the scene. However, when a large-scale disparity reduction is performed, the overall perceived depth of the stereoscopic image is significantly decreased, and an unnatural visual effect occurs due to the limited range of the comfortable viewing zone. A new visual comfort enhancement method for stereoscopic images is proposed by combining global linear and local nonlinear disparity remapping based on the effect of disparity on visual comfort. This method can prevent visual discomfort when viewing stereoscopic images; it also balances the improvement of visual comfort of stereoscopic images against the weakening of the 3D sense of scenes.
Method First, an objective visual comfort assessment model is constructed to automatically predict the visual comfort of stereoscopic images and to judge the improvement of visual comfort during disparity adjustment. On the one hand, considering the binocular fusion limit, the global visual comfort features of stereoscopic images are extracted by combining spatial frequency and disparity. On the other hand, we perform a disparity statistical analysis on stereoscopically salient regions and obtain local visual comfort features based on the hypothesis that the human eye pays more attention to perceptually salient regions. Support vector regression is adopted in this study to construct the objective visual comfort prediction model for stereoscopic images by establishing the mapping relationship between features and subjective scores. Then, the visual comfort of the input stereoscopic image is analyzed using the constructed prediction model. A two-stage disparity remapping strategy is designed for less-comfortable stereoscopic images. This strategy consists of the global linear adjustment of the disparity range and the local nonlinear adjustment of the disparity in the extracted potentially less-comfortable regions. The global disparity remapping of the input disparity map is performed during the first stage to adjust the uncomfortable stereoscopic image to a relatively comfortable degree. The global disparity linear iterative adjustment process is performed while the predicted visual comfort objective score is less than the preset threshold. Only the global features are applied at this point to construct the visual comfort prediction function. Local nonlinear disparity remapping is then performed during the second stage to further enhance the viewing comfort of the stereoscopic image and maintain the 3D sense of the scene.
The disparity of the potentially less-comfortable regions extracted from the disparity map after global linear remapping is adjusted via nonlinear iteration until the predicted visual comfort objective score is higher than the preset target threshold. The visual comfort of the adjusted stereoscopic image is predicted in conjunction with global and local features at this point. Lastly, an updated comfortable stereoscopic image is reconstructed via a rendering technique according to the remapped disparity map. Result A subjective evaluation experiment is designed on the IVY Lab stereoscopic image database to verify the effectiveness of the proposed method in improving the visual comfort and maintaining the 3D sense of stereoscopic images. Experimental results show that the proposed method can more effectively enhance the visual comfort of less-comfortable stereoscopic images while maintaining the 3D sense of scenes compared with state-of-the-art stereoscopic image visual comfort enhancement methods. Conclusion The proposed method can automatically implement global linear and local nonlinear disparity remapping processes based on the visual comfort prediction model constructed with different features of stereoscopic images. The proposed method can realize the purpose of improving the visual comfort of stereoscopic images under the premise of ensuring 3D sense, which enhances the overall 3D experience of stereoscopic images.

Key words

stereoscopic image; visual comfort enhancement; objective prediction model; disparity remapping; three-dimensional sense

0 Introduction


The main factors that may induce visual discomfort when viewing stereoscopic images/videos include vergence-accommodation conflict, excessive cross and non-cross disparities, disparity distribution, spatial frequency, mismatch between the left and right images, and object motion [2-3]. Objective visual comfort assessment of stereoscopic images has already received considerable research attention. Kim et al. [4] used additive first-order linear regression on excessive horizontal and vertical disparities to predict the degree of visual fatigue experienced when viewing stereoscopic images. Sohn et al. [5] considered relative disparity and object thickness and proposed quantifying visual comfort with object-dependent disparity features. Jung et al. [6] extracted disparity features based on a visual attention map and used support vector regression (SVR) to predict visual comfort scores of stereoscopic images. Park et al. [7] analyzed the horizontal disparity distribution and the neural activity of the middle temporal region related to horizontal disparity, and proposed a visual comfort prediction model based on a neural statistical framework. Jiang et al. [8], starting from the perspective of preference selection, built a robust visual comfort assessment model using a learned preference classification model.

For stereoscopic images that cause visual discomfort, effective theories and methods for comfort enhancement are needed. Because the large vergence-accommodation conflict caused by increased disparity is the main source of visual discomfort, existing disparity-adjustment-based visual comfort enhancement methods for stereoscopic images fall mainly into two categories: disparity shifting [9-10] and disparity scaling [11-14]. Disparity shifting methods adjust disparity by shifting the left and right images, i.e., translating the zero-disparity plane of the original image while keeping the disparity range unchanged [9]. Excessive screen disparity can thus be reduced by shifting, which reduces the vergence-accommodation conflict. Although such methods have low computational complexity, when the original disparity range exceeds the comfortable viewing zone, no amount of shifting can keep both the maximum cross disparity and the non-cross disparity within the comfort zone, so viewer discomfort remains. Lei et al. [10] used a visual attention model to determine the zero-disparity plane for disparity control, alleviating visual fatigue for multiview images. Disparity scaling methods, by contrast, linearly or nonlinearly scale the disparity range of the original scene into the comfort zone [11]. In general, reducing the disparity range of the scene effectively avoids excessive vergence-accommodation conflict, but a large-scale disparity reduction, given the limited comfortable viewing zone (e.g., ±1° of angular disparity), greatly weakens the overall perceived depth of the stereoscopic image and produces unnatural visual effects. Sohn et al. [12] used visual discomfort factors extracted from global and local images to guide linear and nonlinear remapping of the overall disparity, improving visual comfort while preserving the naturalness of the scene. Jung et al. [13] built a sigmoid-based saliency-adaptive nonlinear disparity remapping function to enhance the visual comfort of stereoscopic images and reduce disparity distortion. Oh et al. [14] constructed a nonlinear remapping operator based on an overall visual fatigue score to compress the disparity range of potential problem regions and stretch that of comfortable regions.

To balance the improvement of visual comfort against the weakening of the 3D sense of the scene, this paper proposes a visual comfort enhancement method for stereoscopic images that combines global linear and local nonlinear disparity remapping. Its main contributions are: 1) disparity statistical features are extracted jointly with spatial frequency and stereoscopic saliency, and a visual comfort prediction model trained with SVR automatically guides the disparity adjustment process; 2) a two-stage, global-linear plus local-nonlinear disparity adjustment strategy is proposed; in particular, the disparity within the potential less-comfortable regions remaining after the global adjustment is nonlinearly adjusted in a local manner, enhancing visual comfort while preserving the 3D sense of the scene as much as possible; 3) to demonstrate the effectiveness of the proposed method, a subjective evaluation experiment is designed to assess both the visual comfort and the 3D sense of the processed stereoscopic images.

1 Visual comfort enhancement for stereoscopic images by combining global linear and local nonlinear disparity remapping

The production of stereoscopic image/video content must satisfy visual comfort in addition to providing a 3D sense. For potentially uncomfortable stereoscopic content, improving visual comfort must not come at the cost of severely weakening the 3D sense of the scene through excessive disparity scaling. To this end, this paper proposes a visual comfort enhancement method with a two-stage disparity remapping strategy, consisting of a coarse global disparity adjustment stage and a fine local disparity adjustment stage, as shown in Fig. 1. First, to control the amount of disparity-range adjustment in each stage, global and local visual comfort features are extracted considering the binocular fusion limit and the visual attention mechanism, respectively, and an objective visual comfort prediction function is introduced via an SVR model. Then, in the first stage, to bring an uncomfortable stereoscopic image to a relatively comfortable level after an overall adjustment, global linear disparity remapping is applied to the input disparity map, iteratively and linearly adjusting the global disparity until the predicted global visual comfort score exceeds a preset threshold. In the second stage, to further improve comfort while preserving the 3D sense of the scene, local nonlinear disparity remapping is performed: the potential less-comfortable regions are extracted from the globally remapped disparity map, and the disparity range within these regions is iteratively and nonlinearly adjusted until the combined global-local visual comfort prediction reaches a preset target threshold. Finally, an updated stereoscopic image with improved comfort is obtained from the remapped disparity map via a rendering technique.

Fig. 1 Diagram of the proposed visual comfort enhancement method by combining global linear and local nonlinear disparity remapping for stereoscopic images

1.1 Visual comfort prediction model for stereoscopic images


1.1.1 Global visual comfort features


$ PC\left( p \right) = \frac{{\sum\limits_j {{E_{{\theta _j}}}(p)} }}{{\alpha + \sum\limits_n {\sum\limits_j {{A_{n, {\theta _j}}}(p)} } }} $ (1)

where $ {{A}_{n, {{\theta }_{j}}}}(p) $ is the local amplitude at scale $n $ and orientation $ {{\theta }_{j}} $, $ {{E}_{{{\theta }_{j}}}}(p) $ is the local energy at orientation ${{\theta }_{j}} $, and α is a small positive constant that keeps the denominator nonzero, set to 0.0001 in this paper. The spatial frequency at each pixel p is thus obtained directly from phase congruency as MSF(p)=PC(p) [14].
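As a concrete illustration of Eq. (1), a minimal NumPy sketch is given below. It assumes the local energies E and local amplitudes A have already been computed by a log-Gabor filter bank (not shown); the array layout and function name are our own, not from the paper.

```python
import numpy as np

def phase_congruency(E, A, alpha=1e-4):
    """Eq. (1): PC(p) = sum_j E_theta_j(p) / (alpha + sum_n sum_j A_{n,theta_j}(p)).

    E: array of shape (J, H, W), local energy per orientation theta_j
    A: array of shape (N, J, H, W), local amplitude per scale n and orientation theta_j
    alpha: small positive constant keeping the denominator nonzero
    """
    num = E.sum(axis=0)               # sum of local energies over orientations
    den = alpha + A.sum(axis=(0, 1))  # sum of amplitudes over scales and orientations
    return num / den                  # PC(p), typically in [0, 1]; used as MSF(p)
```

The returned map is used directly as the spatial frequency map MSF in the next step.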

To reflect the influence of spatial frequency on visual comfort more effectively, the spatial frequency map MSF is used as a weight on the disparity map D, and their product yields the spatial-frequency-weighted disparity map MSFD, because regions of high spatial frequency combined with large disparity induce more severe visual discomfort. Then the mean magnitude $ {{f}_{1}} $, mean ${{f}_{2}} $, variance $ {{f}_{3}} $, mean of the largest $k $% ${{f}_{4}} $, mean of the smallest $k $% ${{f}_{5}} $, and range $ {{f}_{6}} $ of MSFD are extracted, forming the 6-D global feature vector $ {{\mathit{\boldsymbol{F}}}_{\rm{G}}}=[{{f}_{1}}, ~{{f}_{2}}, ~{{f}_{3}}, ~{{f}_{4}}, ~{{f}_{5}}, ~{{f}_{6}}] $.
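The global feature extraction can be sketched as follows. This is an illustrative implementation under our own assumptions: the "mean magnitude" is taken as the mean absolute value, and the largest/smallest k% statistics are computed from the sorted, flattened weighted disparity map; the paper does not fix these details in code.

```python
import numpy as np

def global_features(D, MSF, k=10):
    """6-D global feature vector F_G from the spatial-frequency-weighted
    disparity map M_SFD = MSF * D (element-wise), per Sec. 1.1.1."""
    sfd = (MSF * D).ravel()
    s = np.sort(sfd)
    m = max(1, int(len(s) * k / 100))  # number of samples in the top/bottom k%
    f1 = np.mean(np.abs(sfd))          # mean magnitude
    f2 = sfd.mean()                    # mean
    f3 = sfd.var()                     # variance
    f4 = s[-m:].mean()                 # mean of the largest k%
    f5 = s[:m].mean()                  # mean of the smallest k%
    f6 = s[-1] - s[0]                  # range
    return np.array([f1, f2, f3, f4, f5, f6])
```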

1.1.2 Local visual comfort features

According to the visual attention mechanism, when viewing a stereoscopic image the human eye pays more attention to perceptually important regions (e.g., salient regions); that is, features in visually salient regions affect the visual comfort of the whole stereoscopic image to a greater extent, whereas features in non-salient regions have little effect on overall comfort [17]. Disparity statistical features can therefore be extracted from stereoscopically salient regions as local visual comfort features.

The stereoscopic saliency map MSS is obtained by directly weighting the 2D saliency map MSI with the disparity map D, i.e., ${\mathit{\boldsymbol{M}}_{{\rm{SS}}}}\left( {x, y} \right) = \mathit{\boldsymbol{D}}(x, y) \times {\mathit{\boldsymbol{M}}_{{\rm{SI}}}}(x, y) $.


$ {\mathit{\boldsymbol{M}}_{{\rm{Seg}}}}\left( {x, y} \right) = \left\{ \begin{array}{l} 1\;\;\;{\mathit{\boldsymbol{M}}_{{\rm{SS}}}}\left( {x, y} \right) > {T_r}\\ 0\;\;\;{\rm{otherwise}} \end{array} \right. $ (2)

where the threshold $ {T_r} $ is the value at the top 10% of the MSS values sorted in descending order [18]. The regions where MSeg equals 1 constitute the potential salient discomfort region MSR. Combining MSR with the disparity map D yields the disparity map of the potential salient discomfort region, MSDR.

By statistically analyzing the disparities in MSDR, the disparity magnitude $ {f_7} $, mean $ {f_8} $, variance $ {f_9} $, mean of the largest $k $% $ {f_{10}} $, mean of the smallest $k $% $ {f_{11}} $, range $ {f_{12}} $, dispersion $ {f_{13}} $, and skewness $ {f_{14}} $ are extracted, forming the 8-D local feature vector ${\mathit{\boldsymbol{F}}_{\rm{L}}} = [{f_7}, {f_8}, {f_9}, {f_{10}}, {f_{11}}, {f_{12}}, {f_{13}}, {f_{14}}] $. Disparity dispersion measures how spread out the disparities are relative to the zero-disparity level; the more the disparity distribution spreads away from zero, the more likely it is to cause vergence-accommodation conflict and hence visual discomfort. Disparity skewness measures the skew of the disparity distribution; a distribution skewed toward the front of the screen causes discomfort more easily than one skewed toward the back. Here, $k $=10 [5], and disparities are expressed in degrees.
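The local feature extraction of Eqs. (2) and the paragraph above can be sketched as follows. The exact definitions of f7, f13, and f14 are not spelled out in the text, so plausible stand-ins are used here (magnitude as mean |d|, dispersion as the RMS about zero disparity, skewness as the standardized third moment); treat them as assumptions.

```python
import numpy as np

def local_features(D, M_SI, k=10):
    """8-D local feature vector F_L from the potential salient discomfort
    region (Sec. 1.1.2). M_SI is a 2-D saliency map; disparities in degrees."""
    M_SS = D * M_SI                          # stereoscopic saliency map
    T_r = np.percentile(M_SS, 90)            # top-10% threshold of Eq. (2)
    d = D[M_SS > T_r]                        # disparities in region M_SDR
    s = np.sort(d)
    m = max(1, int(len(s) * k / 100))
    mu, sig = d.mean(), d.std()
    f7 = np.mean(np.abs(d))                  # disparity magnitude (assumed mean |d|)
    f8 = mu                                  # mean
    f9 = d.var()                             # variance
    f10 = s[-m:].mean()                      # mean of the largest k%
    f11 = s[:m].mean()                       # mean of the smallest k%
    f12 = s[-1] - s[0]                       # range
    f13 = np.sqrt(np.mean(d ** 2))           # dispersion about zero disparity (assumed RMS)
    f14 = np.mean((d - mu) ** 3) / sig ** 3 if sig > 0 else 0.0  # skewness
    return np.array([f7, f8, f9, f10, f11, f12, f13, f14])
```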

1.1.3 Construction of the visual comfort prediction function


$ \Psi \left( \mathit{\boldsymbol{F}} \right) = \sum\limits_{i = 1}^n {{\omega _i} \times K\left( {\mathit{\boldsymbol{F}}, {\mathit{\boldsymbol{F}}^i}} \right)} + b $ (3)

where F is the feature vector, ${\omega _i} $ are the weight coefficients, $b $ is a constant term, $ n $ is the number of stereoscopic image pairs in the training set, and $ K(\mathit{\boldsymbol{F}}, {\mathit{\boldsymbol{F}}^i}) $ is the radial basis kernel function.

In the linear disparity remapping stage, only the global visual comfort features FG are used to predict the global visual comfort score Ψ(FG), whereas in the nonlinear disparity remapping stage the global features FG and the local features FL are combined to quantify visual comfort as Ψ(FG, FL).
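Once the SVR has been trained, evaluating Eq. (3) is straightforward. The sketch below computes Ψ with an RBF kernel using toy support vectors, weights, and bias in place of a trained model (e.g., from LIBSVM); those values and the parameter gamma are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(F, Fi, gamma=0.5):
    # Radial basis kernel: K(F, F^i) = exp(-gamma * ||F - F^i||^2)
    return np.exp(-gamma * np.sum((np.asarray(F) - np.asarray(Fi)) ** 2))

def psi(F, support, omega, b, gamma=0.5):
    """Eq. (3): Psi(F) = sum_i omega_i * K(F, F^i) + b.

    'support', 'omega', and 'b' would come from SVR training on
    (feature vector, MOS) pairs; toy values are used in the test."""
    return sum(w * rbf_kernel(F, Fi, gamma) for w, Fi in zip(omega, support)) + b
```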

1.2 Global linear disparity remapping



$ {\mathit{\boldsymbol{D}}^\prime } = \varepsilon \times \mathit{\boldsymbol{D}} $ (4)

where D′ is the disparity map after global linear remapping, and the scale factor ε∈(0, 1].

This linear disparity remapping is implemented by a search algorithm: given the disparity map D and the initial scale factor ε=1, a new disparity-reducing scale factor is found such that the predicted global visual comfort score Ψ(FG) exceeds the preset threshold VCGLT. The disparity map is shrunk in steps of Δε=0.05, i.e., the scale factor ε is decreased by 0.05 at each step. After each iteration, the global features are extracted from the newly adjusted disparity map to re-predict the objective visual comfort score; if the score is below the target value, the disparity reduction is repeated until the target is reached, at which point the algorithm terminates. The purpose of this iterative disparity adjustment is to prevent over-reduction of disparity, which would impair the 3D sense of the scene. Fig. 2 shows an example of the global linear disparity remapping process: Fig. 2(a) is the anaglyph of the original left and right views; Fig. 2(b) is the anaglyph of the left and right views after global linear disparity adjustment; Fig. 2(c) shows the disparity range of the original input after global linear adjustment. The original stereoscopic image has excessive cross disparity, which would cause strong visual discomfort; after remapping, the overall disparity is linearly compressed and the cross disparity is reduced, alleviating the visual discomfort.

Fig. 2 Example of the global linear disparity remapping ((a) original stereoscopic image; (b) stereoscopic image after linear remapping; (c) disparity range after linear remapping)
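The stage-1 search loop can be sketched as follows. Here predict_vc is a stand-in for global feature extraction plus the trained model Ψ(FG); the function name and the stopping guard on ε are our own.

```python
import numpy as np

def global_linear_remap(D, predict_vc, vc_glt=3.5, step=0.05):
    """Stage 1 (Sec. 1.2): shrink the disparity map by Eq. (4), D' = eps * D,
    decreasing eps from 1 in steps of 0.05 until the predicted global comfort
    score Psi(F_G) reaches the threshold VC_GLT."""
    eps = 1.0
    D_prime = eps * D
    while predict_vc(D_prime) < vc_glt and eps - step > 0:
        eps -= step              # shrink disparities a little further
        D_prime = eps * D
    return D_prime, eps
```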

1.3 Local nonlinear disparity remapping


Given the disparity map D′ after global linear remapping, the K-means clustering algorithm is used to segment it into two regions: the region with the larger mean disparity is taken as the potential less-comfortable region DZ, and the other as the comfortable region DNZ. The motivation is that excessive binocular disparity causes a high degree of vergence-accommodation conflict, which is the main cause of visual discomfort. Fig. 3 shows the right views of stereoscopic images together with their globally remapped disparity maps, the binary maps of the extracted potential less-comfortable regions, and the color images of those regions.

Fig. 3 Potential less-comfortable regions extraction ((a) right view images; (b) disparity maps after global linear remapping; (c) binary images in potential less-comfortable regions; (d) color images in potential less-comfortable regions)


$ {\mathit{\boldsymbol{D}}^{''}}_Z = \eta \times {\rm{ln}}({\mathit{\boldsymbol{D}}^\prime }_Z-{\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{min}}}}} + 1) + {\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{min}}}}} $ (5)

where $ \eta \in [1, {\eta _m}] $ is an adjustment factor that controls the disparity range after nonlinear remapping, and $ {\eta _m} $ is the preset initial adjustment factor, computed as

$ {\eta _m} = \frac{{{\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{max}}}}}-{\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{min}}}}}}}{{{\rm{ln}}({\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{max}}}}}-{\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{min}}}}} + 1)}} $ (6)

where $ {\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{max}}}}} $ and $ {\mathit{\boldsymbol{D}}^\prime }_{{z_{{\rm{min}}}}} $ denote the maximum and minimum disparity in the potential less-comfortable region DZ, respectively.


$ {\mathit{\boldsymbol{D}}^{''}} = {\mathit{\boldsymbol{D}}^{''}}_Z \cup {\mathit{\boldsymbol{D}}^\prime }_{NZ} $ (7)


This nonlinear disparity remapping is likewise implemented by a search algorithm: given the globally remapped disparity map D′ and the initial adjustment factor $ {\eta _m} $, the disparity within the potential less-comfortable region is nonlinearly and iteratively reduced in steps of Δη=0.5, until the optimal new adjustment factor is found at which the predicted visual comfort score Ψ(FG, FL) is above and closest to the preset threshold VCCNLT. Fig. 4 shows an example of the local nonlinear disparity remapping process. This stage nonlinearly compresses the disparity within the potential less-comfortable regions remaining after the first stage, while preserving disparity continuity between regions and limiting the overall shrinkage of the disparity range, thereby improving visual comfort while preserving the 3D sense.

Fig. 4 Examples of the local nonlinear disparity remapping process ((a) disparity after linear remapping; (b) stereoscopic image after linear remapping; (c) locally enlarged region of (b); (d) disparity after nonlinear remapping; (e) stereoscopic image after nonlinear remapping; (f) locally enlarged region of (e))

1.4 View rendering and stereoscopic image reconstruction

The left view is rendered from the remapped disparity map and the original right view. Holes may appear during rendering because of abrupt changes in edge strength in the disparity map, so hole filling is then performed to obtain the virtual left view, which together with the original right view forms the reconstructed stereoscopic image. The hole regions can be filled using the relevant algorithm in the View Synthesis Reference Software (VSRS) [19]. Reconstructing and refining the left view from the disparity map thus produces a new stereoscopic image with improved visual comfort.

2 Experimental results and analysis

To demonstrate the effectiveness of the proposed method in enhancing visual comfort while preserving the 3D sense, verification experiments were designed. The IVY Lab stereoscopic image database for visual comfort from the Korea Advanced Institute of Science and Technology was used for testing [20]. The database contains 120 stereoscopic images of different scenes at a resolution of 1 920×1 080, and provides the mean opinion score (MOS) of each stereoscopic image and disparity maps obtained with the Depth Estimation Reference Software (DERS). The MOS values were obtained by subjective five-level ratings of visual comfort (5: very comfortable, 4: comfortable, 3: mildly uncomfortable, 2: uncomfortable, 1: very uncomfortable). In the experiments, an objective visual comfort prediction model is first trained on this database, anaglyphs of the processed left and right views are then composed to verify the effect, and finally a subjective evaluation experiment is designed to further demonstrate the performance of the proposed method.

2.1 Construction of the visual comfort prediction function

To construct the visual comfort prediction function, the 120 stereoscopic images of the IVY Lab database, whose MOS values are widely distributed, were used as the training set of the regression model. The extracted visual comfort features of the stereoscopic images and the MOS values were fed into the SVR model to train an objective visual comfort prediction function. In the first-stage global linear disparity remapping, only the global features FG are used to build an objective visual comfort assessment model Ψ(FG), and the adjustment target threshold VCGLT is set to 3.5, a value between mildly uncomfortable and comfortable on the MOS scale; that is, the goal of this stage is to bring the stereoscopic image as a whole to a relatively comfortable level. For the second-stage local nonlinear disparity remapping, the global features FG and local features FL are combined to predict the visual comfort score Ψ(FG, FL) of the adjusted stereoscopic image, and the threshold VCCNLT is set to 4, which corresponds to comfortable in the subjective rating; the goal of this stage is thus to further improve the visual comfort of the stereoscopic image.

To assess the prediction accuracy of the constructed visual comfort assessment model, 100 runs of ten-fold cross-validation were conducted, and the proposed method was compared with representative stereoscopic image visual comfort assessment methods, namely Kim's method [4], Sohn's method [5], and Jung's method [6]. Table 1 compares the performance indices of the proposed method and the three representative methods on the IVY Lab database, with the two best values of each index shown in bold. As Table 1 shows, in the linear disparity remapping stage, with a 6-D global feature vector extracted per stereoscopic image, the proposed method achieves a Pearson linear correlation coefficient (PLCC) of 0.8642, a Spearman rank-order correlation coefficient (SROCC) of 0.8291, and a root mean square error (RMSE) of 0.3729; in the nonlinear disparity remapping stage, with 14-D global and local features, it achieves a PLCC of 0.8694, an SROCC of 0.8301, and an RMSE of 0.3669, clearly outperforming the other methods. The results show that the constructed visual comfort prediction model agrees well with the subjective scores and can serve as the criterion for visual comfort assessment during the enhancement process.

Table 1 Performance comparison of different assessment metrics on the IVY Lab stereoscopic image database for visual comfort test

Method  PLCC  SROCC  RMSE
Kim [4]  0.8140  0.8040  0.4840
Sohn [5]  0.8380  —  0.4220
Jung [6]  0.8490  0.8110  0.4400
Proposed (6-D)  0.8642  0.8291  0.3729
Proposed (14-D)  0.8694  0.8301  0.3669

2.2 Visual comfort enhancement results and analysis

To show the enhancement effect of the proposed method more intuitively, ten uncomfortable (MOS < 4) stereoscopic images of different scenes were selected from the IVY Lab database for testing; their right views are shown in Fig. 5. The original and processed left-right view pairs were then visualized stereoscopically as anaglyphs. In addition, to further verify the effectiveness and advancement of the proposed method, it was compared with representative visual comfort enhancement methods for stereoscopic images, namely Lei's method [10] and Jung's method [13].

Fig. 5 Right views of ten stereoscopic images in the IVY Lab stereoscopic image database for visual comfort test

Fig. 6 shows examples of the comfort enhancement results of the processed stereoscopic images. As seen in Fig. 6(a), the original stereoscopic images have excessive disparity, causing visual discomfort when viewed; their MOS values are 2.82, 1.35, 1.71, 2.35, and 2.24, respectively, whereas the images in Figs. 6(b)-(d) all become relatively comfortable after disparity remapping. Comparing the visualized stereoscopic images of the three methods, although Fig. 6(b) shows a stronger 3D sense than Figs. 6(c) and 6(d), its visual comfort is far lower, and mild visual discomfort remains (quantitatively confirmed by the subjective evaluation experiment in Section 2.3). Fig. 7 shows enlarged details of the black rectangle in the fifth test stereoscopic image of Fig. 6. Comparing the results of Figs. 6(c) and 6(d), the enlarged details in Fig. 7 show that their visual comfort levels are similar, whereas the 3D sense of Fig. 6(d) is stronger than that of Fig. 6(c) (likewise quantitatively confirmed in Section 2.3). The reason is that Lei's method [10] shifts the disparity of the whole scene according to a computed zero-disparity plane while keeping the overall disparity range unchanged, and thus preserves a good 3D sense, whereas Jung's method [13] applies a large-scale nonlinear mapping to the scene disparity, markedly reducing excessive cross and non-cross disparities and improving visual comfort at the expense of the 3D sense. Considered comprehensively, the proposed method effectively enhances visual comfort while maintaining the 3D sense of the scene.

Fig. 6 Examples of visualization results of the processed stereoscopic images ((a) original stereoscopic images; (b) results of Lei's method[10]; (c) results of Jung's method[13]; (d) results of the proposed method)
Fig. 7 The locally enlarged regions of the fifth test stereoscopic image in Fig. 6 ((a) original stereoscopic image; (b) result of Lei's method[10]; (c) result of Jung's method[13]; (d) result of the proposed method)

2.3 Subjective evaluation experiment and analysis

In addition to the comparisons above, the effectiveness of the proposed method in enhancing visual comfort and preserving the 3D sense was further verified from subjective perception, with a subjective evaluation experiment designed according to the ITU-R BT.500-11 [21] and ITU-R BT.1438 [22] standards. A 65-inch Samsung UA65F9000 ultra-high-definition 3D LED display with a low crosstalk level was used, with the maximum luminance adjusted to 50 cd/m2. The viewing distance between the display screen and the subjects was fixed at three times the screen height. The double-stimulus continuous quality scale (DSCQS) method of ITU-R BT.500-11 was adopted. DSCQS requires both the original and the processed stereoscopic images, presented in random order. Each stereoscopic image was displayed for 10 s, with a 5 s gray image played before switching to a different stereoscopic image. The stereoscopic images numbered NO.0-NO.9 selected in Section 2.2 were used in the experiment. Seventeen subjects participated, all with normal or corrected-to-normal vision and a minimum stereo acuity of 60″. Consistent with the subjective scoring of the IVY Lab database, subjects rated the visual comfort and the 3D sense of each stereoscopic image on five-level scales, i.e., 5: very comfortable (very strong), 4: comfortable (strong), 3: mildly uncomfortable (moderate), 2: uncomfortable (weak), 1: very uncomfortable (very weak). All experimental data were then screened; after discarding two outlier score sets, the remaining 15 were used to compute the final MOS of visual comfort (VC), the MOS of 3D sense (DS), and their combined MOS (Ave), i.e., Ave = 0.5(VC + DS).

Table 2 gives the subjective evaluation results of the proposed method and the two representative methods, with the best VC and Ave values shown in bold. Clearly, for visual comfort, Lei's method scores markedly lower than Jung's method and the proposed method, with scores fluctuating around 3, indicating that after Lei's adjustment visual comfort improves slightly over the original but does not reach a comfortable level, and mild discomfort remains during viewing. Jung's method and the proposed method both raise the uncomfortable stereoscopic images to a comfortable level, with scores around 4 and comparable degrees of improvement. Meanwhile, the subjective results for the 3D sense show that Jung's method is clearly weaker than Lei's method and the proposed method; subjects could clearly perceive a weakened 3D sense after Jung's adjustment, whereas Lei's method preserves the strongest 3D sense. Evaluated comprehensively on both visual comfort and 3D sense, the proposed method is clearly superior to Lei's and Jung's methods, achieving a balance between the two.

Table 2 Subjective assessment results of different adjustment metrics for uncomfortable stereoscopic images

Index  Method  NO.0  NO.1  NO.2  NO.3  NO.4  NO.5  NO.6  NO.7  NO.8  NO.9
MOS  Original  2.82  2.35  1.71  2.35  2.35  3.12  1.35  1.76  2.24  3.35
VC  Lei[10]  3.13  3.07  2.80  3.20  3.13  3.67  2.27  3.53  3.07  3.67
  Jung[13]  4.13  4.00  3.73  3.87  3.93  4.27  3.73  4.07  3.80  4.20
  Proposed (subjective)  4.07  4.00  4.00  3.80  4.07  4.13  3.87  4.00  4.07  4.13
  Proposed (objective)  3.95  3.95  3.95  3.95  4.04  4.09  3.95  4.20  3.95  4.07
DS  Lei[10]  4.00  4.07  4.13  4.13  4.20  3.93  4.20  3.93  4.00  4.07
  Jung[13]  3.47  3.53  3.67  3.60  3.60  3.40  3.73  3.40  3.47  3.60
  Proposed (subjective)  3.87  4.00  4.07  3.93  4.00  3.93  4.20  3.73  3.87  3.93
Ave  Lei[10]  3.57  3.57  3.47  3.67  3.67  3.80  3.24  3.73  3.54  3.87
  Jung[13]  3.80  3.77  3.70  3.74  3.77  3.84  3.73  3.74  3.64  3.90
  Proposed (subjective)  3.97  4.00  4.04  3.87  4.04  4.03  4.04  3.87  3.97  4.03

Fig. 8 shows the comparison with Lei's and Jung's methods more intuitively, where Figs. 8(a)-(c) present the subjective comparison results for visual comfort (VC), 3D sense (DS), and their average (Ave), respectively. It is evident that the proposed method achieves the goal of improving visual comfort while preserving the 3D sense, whereas Lei's method improves visual comfort only moderately, and Jung's method improves visual comfort effectively but weakens the 3D sense of the image.

Fig. 8 Comparison of subjective assessment results of different adjustment metrics for uncomfortable stereoscopic images ((a) VC; (b) DS; (c) Ave)

3 Conclusion



  • [1] Oh H, Lee S, Bovik A C. Stereoscopic 3D visual discomfort prediction:a dynamic accommodation and vergence interaction model[J]. IEEE Transactions on Image Processing, 2016, 25(2): 615–629. [DOI:10.1109/TIP.2015.2506340]
  • [2] Li J, Wang A N, Wang J L, et al. Visual discomfort induced by three-dimensional display technology[J]. Laser & Optoelectronics Progress, 2015, 52(3): # 030009. [DOI:10.3788/LOP52.030009]
  • [3] Cho S H, Kang H B. An analysis of visual discomfort caused by watching stereoscopic 3D content in terms of depth, viewing time and display size[J]. Journal of Imaging Science and Technology, 2015, 59(2): #20503. [DOI:10.2352/J.ImagingSci.Technol.2015.59.2.020503]
  • [4] Kim D, Sohn K. Visual fatigue prediction for stereoscopic image[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2011, 21(2): 231–236. [DOI:10.1109/TCSVT.2011.2106275]
  • [5] Sohn H, Jung Y J, Lee S I, et al. Predicting visual discomfort using object size and disparity information in stereoscopic images[J]. IEEE Transactions on Broadcasting, 2013, 59(1): 28–37. [DOI:10.1109/TBC.2013.2238413]
  • [6] Jung Y J, Sohn H, Lee S I, et al. Predicting visual discomfort of stereoscopic images using human attention model[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2013, 23(12): 2077–2082. [DOI:10.1109/TCSVT.2013.2270394]
  • [7] Park J, Lee S, Bovik A C. 3D visual discomfort prediction:vergence, foveation, and the physiological optics of accommodation[J]. IEEE Journal of Selected Topics in Signal Processing, 2014, 8(3): 415–427. [DOI:10.1109/JSTSP.2014.2311885]
  • [8] Jiang Q P, Shao F, Jiang G Y, et al. Three-dimensional visual comfort assessment via preference learning[J]. Journal of Electronic Imaging, 2015, 24(4): #043002. [DOI:10.1117/1.JEI.24.4.043002]
  • [9] Xu D, Coria L E, Nasiopoulos P. Quality of experience for the horizontal pixel parallax adjustment of stereoscopic 3D videos[C]//Proceedings of 2012 IEEE International Conference on Consumer Electronics. Las Vegas, NV, USA:IEEE, 2012:394-395.[DOI:10.1109/ICCE.2012.6161918]
  • [10] Lei J J, Li S Q, Wang B R, et al. Stereoscopic visual attention guided disparity control for multiview images[J]. Journal of Display Technology, 2014, 10(5): 373–379. [DOI:10.1109/JDT.2014.2312648]
  • [11] Holliman N S. Mapping perceived depth to regions of interest in stereoscopic images[C]//Proceedings of the SPIE 5291, Stereoscopic Displays and Virtual Reality Systems XI. San Jose, CA:SPIE, 2004, 5291:#117.[DOI:10.1117/12.525853]
  • [12] Sohn H, Jung Y J, Lee S I, et al. Visual comfort amelioration technique for stereoscopic images:Disparity remapping to mitigate global and local discomfort causes[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2014, 24(5): 745–758. [DOI:10.1109/TCSVT.2013.2291281]
  • [13] Jung C, Cao L H, Liu H M, et al. Visual comfort enhancement in stereoscopic 3D images using saliency-adaptive nonlinear disparity mapping[J]. Displays, 2015, 40: 17–23. [DOI:10.1016/j.displa.2015.05.006]
  • [14] Oh C, Ham B, Choi S, et al. Visual fatigue relaxation for stereoscopic video via nonlinear disparity remapping[J]. IEEE Transactions on Broadcasting, 2015, 61(2): 142–153. [DOI:10.1109/TBC.2015.2402471]
  • [15] Schor C, Wood I, Ogawa J. Binocular sensory fusion is limited by spatial resolution[J]. Vision Research, 1984, 24(7): 661–665. [DOI:10.1016/0042-6989(84)90207-4]
  • [16] Henriksson L, Hyvärinen A, Vanni S. Representation of cross-frequency spatial phase relationships in human visual cortex[J]. Journal of Neuroscience, 2009, 29(45): 14342–14351. [DOI:10.1523/JNEUROSCI.3136-09.2009]
  • [17] Sohn H, Jung Y J, Lee S I, et al. Attention model-based visual comfort assessment for stereoscopic depth perception[C]//Proceedings of the 17th International Conference on Digital Signal Processing. Corfu:IEEE, 2011:1-6.[DOI:10.1109/ICDSP.2011.6004985]
  • [18] Jung C, Liu H M, Cui Y. Visual comfort assessment for stereoscopic 3D images based on salient discomfort regions[C]//Proceedings of 2015 IEEE International Conference on Image Processing. Quebec City, Canada:IEEE, 2015:4047-4051.[DOI:10.1109/ICIP.2015.7351566]
  • [19] Tanimoto M, Fujii T, Suzuki K. Document M16090, ISO/IEC JTC1/SC29/WG11 View synthesis algorithm in view synthesis reference software 3.0 (VSRS 3.0)[S]. Lausanne, Switzerland:International Organization for Standardization, 2009.
  • [20] Sohn H, Jung Y J, Lee S, et al. Korea Advanced Institute of Science and Technology. IVY LAB stereoscopic image database[EB/OL].[2016-10-11].
  • [21] ITU-R BT.500-11 Methodology for the subjective assessment of the quality of television pictures[S]. Geneva, Switzerland:International Telecommunication Union, 2002.
  • [22] ITU-R BT.1438 Subjective assessment of stereoscopic television pictures[S]. Geneva, Switzerland:International Telecommunication Union, 2000.