Published: 2017-04-16 | DOI: 10.11834/jig.20170405 | 2017, Volume 22, Number 4 | Section: Image Processing and Coding

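1. Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211;
2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023

Received: 2016-10-24; Revised: 2017-01-03

Funding: National Natural Science Foundation of China (U1301257, 61671258); Natural Science Foundation of Zhejiang Province (LY15F010005)

First author: Wang Ying (b. 1991), female, is a master's student in Signal and Information Processing at Ningbo University. Her main research interests are image/video processing and 3D visual comfort. E-mail: wangying2524@126.com

CLC number: TP391.4; Document code: A; Article ID: 1006-8961(2017)04-0452-11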


Visual comfort enhancement for stereoscopic images based on disparity remapping
Wang Ying1, Yu Mei1,2, Ying Hongwei1, Jiang Gangyi1,2
1. Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China;
2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
Supported by: National Natural Science Foundation of China (U1301257, 61671258); Natural Science Foundation of Zhejiang Province, China (LY15F010005)

# Abstract

Objective 3D video has become part of daily life because of the immersive visual experience it provides. However, owing to defects in 3D imaging technology, viewers can experience visual discomfort when watching 3D video and may even suffer eye fatigue, headache, nausea, and other symptoms. Studying visual comfort enhancement for stereoscopic images and videos is therefore important for improving stereoscopic display technology and providing users with higher-quality 3D visual services. Factors that can cause visual discomfort when viewing stereoscopic images or videos include the following: vergence-accommodation conflict, excessive crossed and uncrossed disparities, disparity distribution, spatial frequency, mismatch between the left and right images, and object motion. Vergence-accommodation conflict is the fundamental cause of visual discomfort and is associated with large disparities in the 3D scene. If the disparity lies outside the fusion range, the viewer cannot fuse the left and right images into a single stereoscopic image and instead sees an unclear, crosstalk-like image, which results in severe visual fatigue. Disparity distribution is another main factor affecting visual comfort. Excessive crossed disparity is more likely to cause discomfort than excessive uncrossed disparity: when the entire image is located in front of the screen, visual comfort is lower than when the entire image is positioned behind the screen. The more the disparity distribution is concentrated around the zero-disparity plane, that is, the smaller its dispersion, the more comfortable the image is to view. Spatial frequency also influences visual comfort by affecting the binocular fusion limit.
An image with high spatial frequency causes more visual discomfort than an image with low spatial frequency. Because the vergence-accommodation conflict caused by increased disparity is the main source of visual discomfort, disparity adjustment is the main way to enhance the visual comfort of stereoscopic images. Disparity adjustment methods fall into two categories: disparity shifting and disparity scaling. Disparity shifting moves the zero-disparity plane of the original image and leaves the disparity range unchanged. Although its computational complexity is low, when the original disparity range is wider than the comfortable viewing zone, no shift can bring both the maximum crossed disparity and the maximum uncrossed disparity inside the comfort zone, so visual discomfort remains unavoidable. By contrast, disparity scaling adjusts the disparity range of the original scene, linearly or nonlinearly, into the comfort zone. In general, reducing the disparity range of the scene effectively avoids excessive vergence-accommodation conflict. However, because the comfortable viewing zone is limited, a large reduction significantly decreases the overall perceived depth of the stereoscopic image and produces an unnatural visual effect. Based on the effect of disparity on visual comfort, this paper proposes a new visual comfort enhancement method for stereoscopic images that combines global linear and local nonlinear disparity remapping. The method prevents visual discomfort when viewing stereoscopic images while balancing the gain in visual comfort against the loss of 3D sense.
Method First, an objective visual comfort assessment model is constructed to predict the visual comfort of stereoscopic images automatically and to judge how much visual comfort improves during disparity adjustment. On the one hand, taking the binocular fusion limit into account, global visual comfort features are extracted by combining spatial frequency and disparity. On the other hand, based on the hypothesis that the human eye pays more attention to perceptually salient regions, a statistical analysis of disparity is performed on stereoscopic salient regions to obtain local visual comfort features. Support vector regression is used to build the objective visual comfort prediction model by learning the mapping between features and subjective scores. The visual comfort of the input stereoscopic image is then analyzed with this prediction model. For less-comfortable stereoscopic images, a two-stage disparity remapping strategy is designed: a global linear adjustment of the disparity range, followed by a local nonlinear adjustment of the disparity in the extracted potentially less-comfortable regions. In the first stage, global disparity remapping is applied to the input disparity map to bring uncomfortable stereoscopic images to a relatively comfortable level. The global linear adjustment is iterated while the predicted objective visual comfort score is below a preset threshold; only the global features are used in the prediction function at this stage. In the second stage, local nonlinear disparity remapping further enhances viewing comfort while maintaining the 3D sense of the scene.
The disparity of the potentially less-comfortable regions, extracted from the disparity map after global linear remapping, is adjusted by nonlinear iteration until the predicted objective visual comfort score exceeds a preset target threshold. At this stage, the visual comfort of the adjusted stereoscopic image is predicted from both global and local features. Finally, a comfortable stereoscopic image is reconstructed from the remapped disparity map with a rendering technique.

Result A subjective evaluation experiment on the IVY Lab stereoscopic image database verifies the effectiveness of the proposed method in improving visual comfort and maintaining the 3D sense of stereoscopic images. Experimental results show that, compared with state-of-the-art visual comfort enhancement methods for stereoscopic images, the proposed method more effectively enhances the visual comfort of less-comfortable stereoscopic images while maintaining the 3D sense of the scene.

Conclusion The proposed method automatically performs global linear and local nonlinear disparity remapping, guided by a visual comfort prediction model built from different features of stereoscopic images. It improves the visual comfort of stereoscopic images while preserving 3D sense, which enhances the overall 3D viewing experience.

# Key words

stereoscopic image; visual comfort enhancement; objective prediction model; disparity remapping; three-dimensional sense

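# 0 Introduction

As a research hotspot in multimedia, 3D video has become widely integrated into daily life. While 3D video gives viewers a striking visual experience, defects in stereoscopic imaging technology may also cause uncomfortable symptoms such as eye fatigue, headache, and nausea[1]. Studying the visual comfort of stereoscopic images and videos, and methods for improving it, is therefore of real practical value for improving stereoscopic display technology and providing users with higher-quality 3D visual services.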

# 1.1.1 Global visual comfort features

 $PC(p) = \frac{\sum\limits_j E_{\theta_j}(p)}{\alpha + \sum\limits_n \sum\limits_j A_{n,\theta_j}(p)}$ (1)
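As a minimal sketch of the arithmetic in Eq. (1) at a single pixel $p$, the function below divides the summed local energy over orientations by the summed amplitudes over scales and orientations. How $E_{\theta_j}(p)$ and $A_{n,\theta_j}(p)$ are obtained (for example from a bank of oriented band-pass filters) is not shown; the function name, the input layout, and the default value of `alpha` are assumptions made for illustration only.

```python
def phase_congruency(energy_by_orientation, amplitude_by_scale_orientation, alpha=1e-4):
    """Evaluate Eq. (1) at one pixel.

    energy_by_orientation: E_{theta_j}(p), one value per orientation j.
    amplitude_by_scale_orientation: A_{n,theta_j}(p); each row is a scale n,
        each column an orientation j.
    alpha: small positive constant (assumed value) that keeps the
        denominator away from zero.
    """
    total_energy = sum(energy_by_orientation)
    total_amplitude = sum(sum(row) for row in amplitude_by_scale_orientation)
    return total_energy / (alpha + total_amplitude)


# Hypothetical values for two orientations and two scales.
energies = [0.6, 0.2]
amplitudes = [[0.5, 0.2],
              [0.4, 0.1]]
pc = phase_congruency(energies, amplitudes)
```

With these toy inputs the ratio is roughly two thirds; real inputs would come from the filter responses of the image.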

# 1.1.2 Local visual comfort features

 $\boldsymbol{M}_{\rm Seg}(x, y) = \begin{cases} 1, & \boldsymbol{M}_{\rm SS}(x, y) > T_r \\ 0, & \text{otherwise} \end{cases}$ (2)
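Eq. (2) is a per-pixel threshold. A minimal sketch follows, assuming $\boldsymbol{M}_{\rm SS}$ is a stereoscopic saliency map stored as a list of rows and $T_r$ is a scalar threshold; the function name and the sample values are hypothetical.

```python
def segment_salient_region(saliency_map, threshold):
    """Binary mask of Eq. (2): 1 where the saliency exceeds T_r, else 0."""
    return [[1 if value > threshold else 0 for value in row]
            for row in saliency_map]


# Hypothetical 2x2 saliency map with values in [0, 1].
saliency = [[0.1, 0.8],
            [0.5, 0.9]]
mask = segment_salient_region(saliency, threshold=0.5)
```

The comparison is strict, as in Eq. (2), so a pixel whose saliency equals the threshold is left out of the mask.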

# 1.1.3 Construction of the visual comfort prediction function

 $\Psi(\boldsymbol{F}) = \sum\limits_{i=1}^{n} \boldsymbol{\omega} \times K(\boldsymbol{F}, \boldsymbol{F}^i) + b$ (3)
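The form of Eq. (3) is the usual support vector regression decision function: a weighted sum of kernel evaluations between the input feature vector and the stored training vectors, plus a bias. The sketch below evaluates it directly. The RBF kernel, the value of `gamma`, the reading of $\boldsymbol{\omega}$ as one weight per stored vector, and all sample numbers are assumptions, not details confirmed by the text.

```python
import math


def rbf_kernel(f_a, f_b, gamma):
    """K(F, F^i) as a Gaussian RBF kernel (assumed kernel choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(f_a, f_b))
    return math.exp(-gamma * squared_distance)


def predict_comfort(features, support_vectors, weights, bias, gamma=0.5):
    """Evaluate Psi(F) = sum_i w_i * K(F, F^i) + b."""
    return sum(w * rbf_kernel(features, sv, gamma)
               for w, sv in zip(weights, support_vectors)) + bias


# Hypothetical trained model with two stored feature vectors.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, -0.5]
bias = 3.0
score = predict_comfort([0.0, 0.0], support_vectors, weights, bias)
```

In practice the weights, bias, and stored vectors would come from training on feature vectors and subjective scores rather than being set by hand.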

# 1.2 Global linear disparity remapping

 $\boldsymbol{D}' = \varepsilon \times \boldsymbol{D}$ (4)
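Eq. (4) scales every disparity by a factor $\varepsilon$. The abstract describes iterating this adjustment while the predicted comfort score is below a preset threshold; the sketch below shows one way such a loop could look. The step size, the lower bound on the scale, and the toy predictor are hypothetical stand-ins, and the real predictor would be the global-feature model of Eq. (3).

```python
def global_linear_remap(disparity, predict_score, threshold,
                        step=0.05, min_scale=0.1):
    """Shrink the scale factor of Eq. (4) until the score reaches the threshold.

    disparity: disparity map D as a list of rows.
    predict_score: callable returning a comfort score for a disparity map
        (hypothetical stand-in for the learned prediction function).
    step, min_scale: assumed iteration step and lower bound for the scale.
    """
    scale = 1.0
    remapped = [row[:] for row in disparity]
    while predict_score(remapped) < threshold and scale - step >= min_scale:
        scale -= step
        remapped = [[scale * d for d in row] for row in disparity]
    return remapped, scale


def toy_predictor(disparity_map):
    """Purely illustrative: comfort drops as the largest |disparity| grows."""
    largest = max(abs(d) for row in disparity_map for d in row)
    return 5.0 - 0.1 * largest


disparity = [[-30.0, 10.0],
             [20.0, 0.0]]
remapped, scale = global_linear_remap(disparity, toy_predictor, threshold=3.0)
```

With the toy predictor the loop stops at a scale of about 0.65, the first step at which the largest absolute disparity falls below 20.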

# 1.3 Local nonlinear disparity remapping

 $\boldsymbol{D}''_Z = \eta \times \ln(\boldsymbol{D}' - \boldsymbol{D}'_{z_{\min}} + 1) + \boldsymbol{D}'_{z_{\min}}$ (5)

 $\eta_m = \frac{\boldsymbol{D}'_{z_{\max}} - \boldsymbol{D}'_{z_{\min}}}{\ln(\boldsymbol{D}'_{z_{\max}} - \boldsymbol{D}'_{z_{\min}} + 1)}$ (6)

 $\boldsymbol{D}'' = \boldsymbol{D}''_Z \cup \boldsymbol{D}'_{NZ}$ (7)
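Eqs. (5) to (7) can be read as follows: inside the region $Z$ the disparity is passed through a logarithmic curve, outside it the globally remapped value is kept, and $\eta_m$ is the value of $\eta$ at which the region's maximum disparity maps back to itself (substituting Eq. (6) into Eq. (5) at $\boldsymbol{D}'_{z_{\max}}$ gives $\boldsymbol{D}'_{z_{\max}}$). The sketch below implements that reading on a list-of-rows disparity map; the mask layout, function names, and sample values are assumptions.

```python
import math


def eta_max(d_min, d_max):
    """Eq. (6): the eta that maps the region maximum onto itself."""
    if d_max <= d_min:
        raise ValueError("region must span a non-zero disparity range")
    return (d_max - d_min) / math.log(d_max - d_min + 1)


def local_nonlinear_remap(disparity, region_mask, eta):
    """Eqs. (5) and (7): log remapping inside the mask, identity outside."""
    region_values = [d for row, mask_row in zip(disparity, region_mask)
                     for d, m in zip(row, mask_row) if m]
    if not region_values:
        return [row[:] for row in disparity]
    d_min = min(region_values)
    return [[eta * math.log(d - d_min + 1) + d_min if m else d
             for d, m in zip(row, mask_row)]
            for row, mask_row in zip(disparity, region_mask)]


# Hypothetical 2x2 map; the bottom-right pixel lies outside the region Z.
disparity = [[0.0, 5.0],
             [10.0, 20.0]]
mask = [[1, 1],
        [1, 0]]
eta_m = eta_max(0.0, 10.0)
kept = local_nonlinear_remap(disparity, mask, eta_m)
compressed = local_nonlinear_remap(disparity, mask, 0.5 * eta_m)
```

Choosing $\eta$ below $\eta_m$ compresses the region's range; here halving it maps the region maximum from 10 to 5 while the pixel outside the mask stays at 20.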

# 2.1 Construction of the visual comfort prediction function

Table 1 Performance comparison of different assessment metrics on the IVY Lab stereoscopic image database for visual comfort test

| Method | PLCC | SROCC | RMSE |
| --- | --- | --- | --- |
| Kim[4] | 0.8140 | 0.8040 | 0.4840 |
| Sohn[5] | 0.8380 | - | 0.4220 |
| Jung[6] | 0.8490 | 0.8110 | 0.4400 |
| Proposed (6-D features) | 0.8642 | 0.8291 | 0.3729 |
| Proposed (14-D features) | 0.8694 | 0.8301 | 0.3669 |
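The three columns of Table 1 are standard agreement measures between predicted and subjective scores: the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SROCC), and the root-mean-square error (RMSE). A minimal sketch of their textbook definitions follows. Whether the paper applies a regression fit to the predictions before computing PLCC and RMSE is not stated in this section, and the sample scores are made up.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root-mean-square error between two score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Made-up subjective scores and predictions for five images.
subjective = [1.5, 2.0, 3.0, 4.0, 4.5]
predicted = [1.7, 2.4, 2.9, 3.8, 4.6]
plcc = pearson(subjective, predicted)
srocc = spearman(subjective, predicted)
error = rmse(subjective, predicted)
```

Higher PLCC and SROCC and lower RMSE indicate closer agreement with the subjective scores, which is the direction in which the proposed rows of Table 1 differ from the compared methods.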

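# 2.3 Subjective evaluation experiment and analysis

Table 2 Subjective assessment results of different adjustment methods for uncomfortable stereoscopic images

| Indicator | Method | NO.0 | NO.1 | NO.2 | NO.3 | NO.4 | NO.5 | NO.6 | NO.7 | NO.8 | NO.9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MOS | Original image | 2.82 | 2.35 | 1.71 | 2.35 | 2.35 | 3.12 | 1.35 | 1.76 | 2.24 | 3.35 |
| VC | Lei[10] | 3.13 | 3.07 | 2.80 | 3.20 | 3.13 | 3.67 | 2.27 | 3.53 | 3.07 | 3.67 |
| VC | Jung[13] | 4.13 | 4.00 | 3.73 | 3.87 | 3.93 | 4.27 | 3.73 | 4.07 | 3.80 | 4.20 |
| VC | Proposed (subjective) | 4.07 | 4.00 | 4.00 | 3.80 | 4.07 | 4.13 | 3.87 | 4.00 | 4.07 | 4.13 |
| VC | Proposed (objective) | 3.95 | 3.95 | 3.95 | 3.95 | 4.04 | 4.09 | 3.95 | 4.20 | 3.95 | 4.07 |
| DS | Lei[10] | 4.00 | 4.07 | 4.13 | 4.13 | 4.20 | 3.93 | 4.20 | 3.93 | 4.00 | 4.07 |
| DS | Jung[13] | 3.47 | 3.53 | 3.67 | 3.60 | 3.60 | 3.40 | 3.73 | 3.40 | 3.47 | 3.60 |
| DS | Proposed (subjective) | 3.87 | 4.00 | 4.07 | 3.93 | 4.00 | 3.93 | 4.20 | 3.73 | 3.87 | 3.93 |
| Ave | Lei[10] | 3.57 | 3.57 | 3.47 | 3.67 | 3.67 | 3.80 | 3.24 | 3.73 | 3.54 | 3.87 |
| Ave | Jung[13] | 3.80 | 3.77 | 3.70 | 3.74 | 3.77 | 3.84 | 3.73 | 3.74 | 3.64 | 3.90 |
| Ave | Proposed (subjective) | 3.97 | 4.00 | 4.04 | 3.87 | 4.04 | 4.03 | 4.04 | 3.87 | 3.97 | 4.03 |

Note: in the source, bold type marks the best and second-best results; the bold formatting is not reproduced here.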
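The Ave rows of Table 2 appear to be the arithmetic mean of the VC and DS scores for each image and method. The short check below recomputes that mean from the tabulated values and compares it with the reported Ave rows, allowing for two-decimal rounding. The variable names are illustrative, and the interpretation of Ave is inferred from the numbers rather than stated in this section.

```python
# Scores copied from Table 2 (images NO.0 to NO.9).
vc = {
    "Lei[10]": [3.13, 3.07, 2.80, 3.20, 3.13, 3.67, 2.27, 3.53, 3.07, 3.67],
    "Jung[13]": [4.13, 4.00, 3.73, 3.87, 3.93, 4.27, 3.73, 4.07, 3.80, 4.20],
    "Proposed": [4.07, 4.00, 4.00, 3.80, 4.07, 4.13, 3.87, 4.00, 4.07, 4.13],
}
ds = {
    "Lei[10]": [4.00, 4.07, 4.13, 4.13, 4.20, 3.93, 4.20, 3.93, 4.00, 4.07],
    "Jung[13]": [3.47, 3.53, 3.67, 3.60, 3.60, 3.40, 3.73, 3.40, 3.47, 3.60],
    "Proposed": [3.87, 4.00, 4.07, 3.93, 4.00, 3.93, 4.20, 3.73, 3.87, 3.93],
}
reported_ave = {
    "Lei[10]": [3.57, 3.57, 3.47, 3.67, 3.67, 3.80, 3.24, 3.73, 3.54, 3.87],
    "Jung[13]": [3.80, 3.77, 3.70, 3.74, 3.77, 3.84, 3.73, 3.74, 3.64, 3.90],
    "Proposed": [3.97, 4.00, 4.04, 3.87, 4.04, 4.03, 4.04, 3.87, 3.97, 4.03],
}


def mean_of_vc_and_ds(vc_scores, ds_scores):
    """Per-image mean of the visual comfort and depth sense scores."""
    return [(v + d) / 2 for v, d in zip(vc_scores, ds_scores)]


computed_ave = {method: mean_of_vc_and_ds(vc[method], ds[method]) for method in vc}

# Largest gap between the recomputed mean and the reported Ave value.
max_gap = max(abs(c - r)
              for method in vc
              for c, r in zip(computed_ave[method], reported_ave[method]))

# Mean of the reported Ave row over the ten images, per method.
overall = {method: sum(scores) / len(scores)
           for method, scores in reported_ave.items()}
```

On the tabulated values the recomputed means match the reported Ave rows to within rounding, and the per-method means of the Ave rows come out at about 3.61 for Lei[10], 3.76 for Jung[13], and 3.99 for the proposed method.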

# References

• [1] Oh H, Lee S, Bovik A C. Stereoscopic 3D visual discomfort prediction: a dynamic accommodation and vergence interaction model[J]. IEEE Transactions on Image Processing, 2016, 25(2): 615–629. [DOI:10.1109/TIP.2015.2506340]
• [2] Li J, Wang A N, Wang J L, et al. Visual discomfort induced by three-dimensional display technology[J]. Laser & Optoelectronics Progress, 2015, 52(3): 030009. [DOI:10.3788/LOP52.030009]
• [3] Cho S H, Kang H B. An analysis of visual discomfort caused by watching stereoscopic 3D content in terms of depth, viewing time and display size[J]. Journal of Imaging Science and Technology, 2015, 59(2): 020503. [DOI:10.2352/J.ImagingSci.Technol.2015.59.2.020503]
• [4] Kim D, Sohn K. Visual fatigue prediction for stereoscopic image[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2011, 21(2): 231–236. [DOI:10.1109/TCSVT.2011.2106275]
• [5] Sohn H, Jung Y J, Lee S I, et al. Predicting visual discomfort using object size and disparity information in stereoscopic images[J]. IEEE Transactions on Broadcasting, 2013, 59(1): 28–37. [DOI:10.1109/TBC.2013.2238413]
• [6] Jung Y J, Sohn H, Lee S I, et al. Predicting visual discomfort of stereoscopic images using human attention model[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2013, 23(12): 2077–2082. [DOI:10.1109/TCSVT.2013.2270394]
• [7] Park J, Lee S, Bovik A C. 3D visual discomfort prediction: vergence, foveation, and the physiological optics of accommodation[J]. IEEE Journal of Selected Topics in Signal Processing, 2014, 8(3): 415–427. [DOI:10.1109/JSTSP.2014.2311885]
• [8] Jiang Q P, Shao F, Jiang G Y, et al. Three-dimensional visual comfort assessment via preference learning[J]. Journal of Electronic Imaging, 2015, 24(4): 043002. [DOI:10.1117/1.JEI.24.4.043002]
• [9] Xu D, Coria L E, Nasiopoulos P. Quality of experience for the horizontal pixel parallax adjustment of stereoscopic 3D videos[C]//Proceedings of 2012 IEEE International Conference on Consumer Electronics. Las Vegas, NV, USA: IEEE, 2012: 394-395. [DOI:10.1109/ICCE.2012.6161918]
• [10] Lei J J, Li S Q, Wang B R, et al. Stereoscopic visual attention guided disparity control for multiview images[J]. Journal of Display Technology, 2014, 10(5): 373–379. [DOI:10.1109/JDT.2014.2312648]
• [11] Holliman N S. Mapping perceived depth to regions of interest in stereoscopic images[C]//Proceedings of the SPIE 5291, Stereoscopic Displays and Virtual Reality Systems XI. San Jose, CA: SPIE, 2004, 5291: 117. [DOI:10.1117/12.525853]
• [12] Sohn H, Jung Y J, Lee S I, et al. Visual comfort amelioration technique for stereoscopic images: disparity remapping to mitigate global and local discomfort causes[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2014, 24(5): 745–758. [DOI:10.1109/TCSVT.2013.2291281]
• [13] Jung C, Cao L H, Liu H M, et al. Visual comfort enhancement in stereoscopic 3D images using saliency-adaptive nonlinear disparity mapping[J]. Displays, 2015, 40: 17–23. [DOI:10.1016/j.displa.2015.05.006]
• [14] Oh C, Ham B, Choi S, et al. Visual fatigue relaxation for stereoscopic video via nonlinear disparity remapping[J]. IEEE Transactions on Broadcasting, 2015, 61(2): 142–153. [DOI:10.1109/TBC.2015.2402471]
• [15] Schor C, Wood I, Ogawa J. Binocular sensory fusion is limited by spatial resolution[J]. Vision Research, 1984, 24(7): 661–665. [DOI:10.1016/0042-6989(84)90207-4]
• [16] Henriksson L, Hyvärinen A, Vanni S. Representation of cross-frequency spatial phase relationships in human visual cortex[J]. Journal of Neuroscience, 2009, 29(45): 14342–14351. [DOI:10.1523/JNEUROSCI.3136-09.2009]
• [17] Sohn H, Jung Y J, Lee S I, et al. Attention model-based visual comfort assessment for stereoscopic depth perception[C]//Proceedings of the 17th International Conference on Digital Signal Processing. Corfu: IEEE, 2011: 1-6. [DOI:10.1109/ICDSP.2011.6004985]
• [18] Jung C, Liu H M, Cui Y. Visual comfort assessment for stereoscopic 3D images based on salient discomfort regions[C]//Proceedings of 2015 IEEE International Conference on Image Processing. Quebec City, Canada: IEEE, 2015: 4047-4051. [DOI:10.1109/ICIP.2015.7351566]
• [19] Tanimoto M, Fujii T, Suzuki K. Document M16090, ISO/IEC JTC1/SC29/WG11 View synthesis algorithm in view synthesis reference software 3.0 (VSRS 3.0)[S]. Lausanne, Switzerland: International Organization for Standardization, 2009.
• [20] Sohn H, Jung Y J, Lee S, et al. Korea Advanced Institute of Science and Technology. IVY LAB stereoscopic image database[EB/OL]. [2016-10-11]. http://ivylab.kaist.ac.kr/demo/3DVCA/3DVCA.htm.
• [21] ITU-R BT.500-11 Methodology for the subjective assessment of the quality of television pictures[S]. Geneva, Switzerland: International Telecommunication Union, 2002.
• [22] ITU-R BT.1438 Subjective assessment of stereoscopic television pictures[S]. Geneva, Switzerland: International Telecommunication Union, 2000.