观看经度联合加权全景图显著性检测算法
Saliency detection algorithm of panoramic images using joint weighting with observer's attention longitude
2022, Vol. 27, No. 4, Pages 1322-1334
Published in print: 2022-04-16
Accepted: 2021-01-13
DOI: 10.11834/jig.200682
孙耀, 陈纯毅, 胡小娟, 李凌, 邢琦玮. 观看经度联合加权全景图显著性检测算法[J]. 中国图象图形学报, 2022,27(4):1322-1334.
Yao Sun, Chunyi Chen, Xiaojuan Hu, Ling Li, Qiwei Xing. Saliency detection algorithm of panoramic images using joint weighting with observer's attention longitude[J]. Journal of Image and Graphics, 2022, 27(4): 1322-1334.
Objective
Research on saliency detection for panoramic images has made considerable progress, but with respect to the positional characteristics of panoramic images, most studies have only examined the influence of latitude on saliency detection. When people view a panoramic image, their field of view is limited, so the saliency at different longitude positions also varies greatly, and the predicted salient regions are therefore often not accurate enough. Starting from the longitudinal position characteristics of panoramic images, this paper proposes a saliency detection algorithm for panoramic images based on joint weighting with the observer's attention longitude.
Method
A spatial saliency prediction network is used to obtain a preliminary saliency map, and equator bias is applied as preprocessing to improve the detection results at different latitudes. The saliency map is then weighted by the attention longitude, combining the observer's habits when viewing panoramas with the saliency map. Next, the panorama is processed by double-cube projection and segmentation, its brightness and depth features are extracted, and the longitude weights of the different viewports are computed from them. After these two weightings, the final saliency map is obtained.
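As a minimal sketch of how the two weighting stages could be chained after the network prediction (the function name and the normalization step are illustrative assumptions, not the paper's exact implementation):

    import numpy as np

    def joint_weighted_saliency(s_pred, w_equator, w_longitude, w_viewport):
        """Sketch of the joint weighting pipeline: a preliminary saliency map from the
        prediction network is reweighted by equator bias, by the attention longitude
        weight, and by the per-viewport longitude weight (all H x W arrays)."""
        s = s_pred * w_equator        # latitude preprocessing (equator bias)
        s = s * w_longitude           # attention longitude weighting
        s = s * w_viewport            # weighting of different viewports and longitudes
        return s / (s.max() + 1e-12)  # normalize the final saliency map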
Result
The proposed algorithm was compared experimentally with several other algorithms on the dataset provided by the Salient360! Grand Challenge. The results show that it achieves good saliency detection performance. In tests of its overall performance, it reached 1.979 3, 0.806 2, 0.709 5, and 0.323 9 on the normalized scanpath saliency, correlation coefficient, similarity, and Kullback-Leibler divergence metrics, respectively, outperforming the other algorithms on all of them.
Conclusion
The proposed panoramic saliency detection algorithm addresses the problem in previous panoramic saliency detection work that the results at different longitude positions were not accurate enough.
Objective
Immersive media technologies have developed considerably with the aim of providing users with a complete audiovisual experience, especially the sense of being present in the visualized scene, and are now used in many fields such as entertainment, tourism, and exhibitions. The resolution of virtual reality (VR) panoramic images is much higher than that of traditional images, which makes their storage and transmission difficult. The human visual attention mechanism, however, is selective: when faced with a scene, humans automatically attend to the areas of interest and ignore the areas of no interest. In daily tasks, humans face far more information than they can handle, and selective visual attention enables them to process this large amount of information by prioritizing certain parts of it while ignoring others. It is therefore necessary to detect the saliency of panoramic images so that the redundant information in them can be reduced reasonably. Current research on panoramic saliency detection can be divided into two directions: 1) improved traditional saliency detection algorithms and 2) panoramic saliency algorithms based on deep learning. The improved traditional algorithms involve two aspects, projection conversion and equator bias. Because VR panoramic images can be represented in multiple projection modes, their saliency can be detected in different projection domains. Equator bias refers to the phenomenon that the saliency of panoramic images tends to concentrate near the equator because of human observation habits, so a saliency detection algorithm can weight the saliency according to the latitude of each pixel. Deep-learning-based algorithms use neural networks to extract image features and detect the image's saliency; because the contents of current panoramic image datasets are still insufficient, such networks also need additional cues, such as equator bias, to improve the detection results. Although existing algorithms account for the influence of the latitude location attribute by combining equator bias, no research has focused on the influence of the longitude location attribute on saliency. Hence, this study proposes a saliency detection algorithm of panoramic images using joint weighting with the observer's attention longitude.
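As context for the equator bias mentioned above, latitude-based weighting can be sketched roughly as follows; the Gaussian form and its width are assumptions made for this example, not the scheme of any specific cited algorithm:

    import numpy as np

    def equator_bias_weight(height, width, sigma_deg=25.0):
        # Weight each pixel row of an equirectangular image by its latitude:
        # rows near the equator (latitude 0) keep a weight close to 1,
        # rows near the poles are suppressed.
        latitudes = np.linspace(90.0, -90.0, height)              # degrees, top row = +90
        row_weight = np.exp(-0.5 * (latitudes / sigma_deg) ** 2)  # Gaussian equator bias
        return np.tile(row_weight[:, None], (1, width))           # broadcast to H x W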
Method
First, a spatial saliency prediction network is used to obtain a preliminary saliency map, and equator bias is then used to increase the accuracy of saliency detection at different latitudes. The saliency map is next weighted by the attention longitude weighting, which combines the observer's viewing behavior with the saliency map. The saliency values at each longitude of the reference saliency maps in the dataset are first accumulated to obtain the prime attention longitude weight map. This weight map is then translated so that its center is aligned with the prime observation center of the original panorama, and its weights are multiplied with the saliency values. If a strongly salient area is observed outside the prime observation viewport, the most salient part of the predicted panorama saliency map is used as the secondary observation center and the converted attention longitude weighting is applied instead. The prime and converted attention longitude weightings differ in two respects: they are built from different data, with the converted weighting derived from images that better match human viewing habits, and the effect of the converted weighting is weaker than that of the prime one.
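A minimal sketch of the prime attention longitude weighting described above; the normalization and the circular shift used for alignment are assumptions made for illustration:

    import numpy as np

    def prime_longitude_weight(reference_maps, obs_center_col):
        """Accumulate saliency over each longitude (image column) of the reference
        saliency maps, then shift the resulting weight curve so that its peak is
        aligned with the prime observation center of the panorama."""
        profile = np.zeros(reference_maps[0].shape[1])
        for ref in reference_maps:
            profile += ref.sum(axis=0)                    # sum saliency along each longitude
        profile /= profile.max()                          # normalize to [0, 1]
        shift = obs_center_col - int(np.argmax(profile))  # align peak with observation center
        return np.roll(profile, shift)                    # circular shift over longitude

    def apply_longitude_weight(saliency, weight_per_column):
        # Multiply every column of the saliency map by its longitude weight.
        return saliency * weight_per_column[None, :]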
The second step is the weighting of different viewports and longitudes. The panoramic image is double-cube projected: the panorama in equirectangular projection (ERP) format is first cube projected into six faces, then shifted by 45 degrees in longitude and cube projected again. The RGB image is converted into the LAB color space to extract the brightness feature of the panorama, and mrharicot-monodepth2 is used to obtain the depth feature. The longitude weight of each viewport is calculated from the difference between its features and those of the other viewports, and the longitude weight of each pixel is calculated from the difference between its features and those of the other pixels. Combining these two weights yields the longitude weights of the different viewports, which are used to weight the saliency map. Finally, combining the saliency map weighted by the prime attention longitude weighting with the weighting of different viewports and longitudes gives the final saliency map.
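A sketch of how per-viewport longitude weights could be derived from feature contrast; the use of mean absolute differences and the equal treatment of brightness and depth features are illustrative assumptions rather than the paper's exact formulation:

    import numpy as np

    def viewport_longitude_weights(viewport_features):
        """viewport_features: list of per-viewport feature vectors, e.g. the mean LAB
        brightness and the mean monodepth2 depth of each cube face. A viewport whose
        features differ more from the other viewports receives a larger weight."""
        feats = np.asarray(viewport_features, dtype=float)   # shape: (num_viewports, num_features)
        diffs = np.abs(feats[:, None, :] - feats[None, :, :]).sum(axis=(1, 2))
        return diffs / (diffs.max() + 1e-12)                 # normalize weights to [0, 1]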
Result
This study compared our results with those of other algorithms on the dataset provided by the International Conference on Multimedia and Expo (ICME) 2017 Salient360! Grand Challenge. The compared algorithms include a saliency prediction model based on sparse representation and human-acuity-weighted center-surround differences (CDSR), a deep autoencoder-based reconstruction network (AER), and the panoramic-CNN-360-saliency (PC3S) algorithm. CDSR is an improved traditional algorithm, while AER and PC3S are deep learning algorithms. For evaluation, we use several metrics for eye-fixation prediction, namely normalized scanpath saliency, correlation coefficient, similarity, and Kullback-Leibler (KL) divergence, on which the proposed algorithm reaches 1.979 3, 0.806 2, 0.709 5, and 0.323 9, respectively. The results show that the proposed algorithm is superior to the other algorithms on all four metrics, produces better saliency detection results overall, and is more accurate at different longitude positions.
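The four metrics can be computed roughly as follows (normalization details vary between toolboxes; this is a sketch under common definitions, not the official Salient360! evaluation code):

    import numpy as np

    def nss(pred, fixations):
        # Normalized scanpath saliency: mean of the standardized prediction at fixated pixels.
        z = (pred - pred.mean()) / (pred.std() + 1e-12)
        return z[fixations > 0].mean()

    def cc(pred, gt):
        # Pearson correlation coefficient between predicted and ground-truth saliency maps.
        return np.corrcoef(pred.ravel(), gt.ravel())[0, 1]

    def sim(pred, gt):
        # Similarity: sum of the pixel-wise minimum of the two maps, each normalized to sum to 1.
        p, g = pred / pred.sum(), gt / gt.sum()
        return np.minimum(p, g).sum()

    def kld(pred, gt, eps=1e-12):
        # Kullback-Leibler divergence of the predicted distribution from the ground truth.
        p, g = pred / pred.sum(), gt / gt.sum()
        return (g * np.log(eps + g / (p + eps))).sum()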
Conclusion
In this study, we propose a saliency detection algorithm for panoramic images using joint weighting with the observer's attention longitude. The algorithm improves the accuracy of saliency detection at different longitude positions. Experiments show that it is superior to current algorithms, and in particular that the detection accuracy of saliency at different longitudes is improved.
saliency detection; panoramic image; attention longitude weighting; double-cube projection; weighting of different viewports and longitude
Azam S, Gilani S O, Jeon M, Yousaf R and Kim J B. 2016. A benchmark of computational models of saliency to predict human fixations in videos//Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. Rome, Italy: [s.n.]: 134-142 [DOI: 10.5220/0005678701340142]
Battisti F, Baldoni S, Brizzi M and Carli M. 2018. A feature-based approach for saliency estimation of omni-directional images. Signal Processing: Image Communication, 69: 53-59 [DOI: 10.1016/j.image.2018.03.008]
Carrasco M. 2011. Visual attention: the past 25 years. Vision Research, 51(13): 1484-1525 [DOI: 10.1016/j.visres.2011.04.012]
Chao F Y, Zhang L, Hamidouche W and Deforges O. 2018. Salgan360: visual saliency prediction on 360 degree images with generative adversarial networks//Proceedings of 2018 IEEE International Conference on Multimedia and Expo Workshops (ICMEW). San Diego, USA: IEEE: 1-4 [DOI: 10.1109/ICMEW.2018.8551543]
De Abreu A, Ozcinar C and Smolic A. 2017. Look around you: saliency maps for omnidirectional images in VR applications//Proceedings of the 9th International Conference on Quality of Multimedia Experience. Erfurt, Germany: IEEE: 1-6 [DOI: 10.1109/QoMEX.2017.7965634]
Ding Y, Liu Y W, Liu J X, Liu K D, Wang L M and Xu Z. 2018. Panoramic image saliency detection by fusing visual frequency feature and viewing behavior pattern//Proceedings of the 19th Pacific-Rim Conference on Multimedia Advances in Multimedia Information Processing. Hefei, China: Springer: 418-429 [DOI: 10.1007/978-3-030-00767-6_39]
Ding Y, Liu Y W, Liu J X, Liu K D, Wang L M and Xu Z. 2019. An overview of research progress on saliency detection of panoramic VR images. Acta Electronica Sinica, 47(7): 1575-1583 (丁颖, 刘延伟, 刘金霞, 刘科栋, 王利明, 徐震. 2019. 虚拟现实全景图像显著性检测研究进展综述. 电子学报, 47(7): 1575-1583) [DOI: 10.3969/j.issn.0372-2112.2019.07.024]
Ebner M. 2007. Color Constancy. Chichester: John Wiley and Sons [DOI: 10.1002/9780470510490]
Godard C, Mac Aodha O, Firman M and Brostow G J. 2019. Digging into self-supervised monocular depth estimation//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 3828-3838 [DOI: 10.1109/ICCV.2019.00393]
Gutiérrez J, David E, Rai Y and Le Callet P. 2018. Toolbox and dataset for the development of saliency and scanpath models for omnidirectional/360° still images. Signal Processing: Image Communication, 69: 35-42 [DOI: 10.1016/j.image.2018.05.003]
Jost T, Ouerhani N, Von Wartburg R, Müri R and Hügli H. 2005. Assessing the contribution of color in visual attention. Computer Vision and Image Understanding, 100(1/2): 107-123 [DOI: 10.1016/j.cviu.2004.10.009]
Lebreton P and Raake A. 2018. GBVS360, BMS360, ProSal: extending existing saliency prediction models from 2D to omnidirectional images. Signal Processing: Image Communication, 69: 69-78 [DOI: 10.1016/j.image.2018.03.006]
Ling J, Zhang K, Zhang Y X, Yang D Q and Chen Z Z. 2018. A saliency prediction model on 360 degree images using color dictionary based sparse representation. Signal Processing: Image Communication, 69: 60-68 [DOI: 10.1016/j.image.2018.03.007]
Martin D, Serrano A and Masia B. 2020. Panoramic convolutions for 360° single-image saliency prediction [EB/OL]. [2020-06-22]. https://paper.nweon.com/2706
Maugey T, Le Meur O and Liu Z. 2017. Saliency-based navigation in omnidirectional image//Proceedings of the 19th IEEE International Workshop on Multimedia Signal Processing (MMSP). Luton, UK: IEEE: 1-6 [DOI: 10.1109/MMSP.2017.8122229]
Peters R J, Iyer A, Koch C and Itti L. 2005. Components of bottom-up gaze allocation in natural scenes. Journal of Vision, 5(8): #692 [DOI: 10.1167/5.8.692]
Rai Y, Gutiérrez J and Le Callet P. 2017. A dataset of head and eye movements for 360 degree images//Proceedings of the 8th ACM on Multimedia Systems Conference. Taipei, China: ACM: 205-210 [DOI: 10.1145/3083187.3083218]
Řeřábek M, Upenik E and Ebrahimi T. 2016. JPEG backward compatible coding of omnidirectional images//Proceedings of SPIE 9971, Applications of Digital Image Processing XXXIX. San Diego, USA: SPIE: #99710 [DOI: 10.1117/12.2240281]
Shafieyan F, Karimi N, Mirmahboub B, Samavi S and Shirani S. 2014. Image seam carving using depth assisted saliency map//Proceedings of 2014 IEEE International Conference on Image Processing (ICIP). Paris, France: IEEE: 1155-1159 [DOI: 10.1109/ICIP.2014.7025230]
Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2020-11-23]. https://arxiv.org/pdf/1409.1556.pdf
Su Q, Lin C Y, Zhao Y, Li Y R and Liu M Q. 2018. Salient detection of 360 panorama based on multi-angle segmentation. Journal of Graphics, 39(6): 1055-1061 (苏群, 林春雨, 赵耀, 李雅茹, 刘美琴. 2018. 基于多角度分割的360全景图的显著性检测. 图学学报, 39(6): 1055-1061) [DOI: 10.11996/JG.j.2095-302X.2018061055]
Tatler B W, Baddeley R J and Gilchrist I D. 2005. Visual correlates of fixation selection: effects of scale and time. Vision Research, 45(5): 643-659 [DOI: 10.1016/j.visres.2004.09.017]
Xia C, Qi F and Shi G M. 2016. Bottom-up visual saliency estimation with deep autoencoder-based sparse reconstruction. IEEE Transactions on Neural Networks and Learning Systems, 27(6): 1227-1240 [DOI: 10.1109/TNNLS.2015.2512898]
Zhang K and Chen Z Z. 2019. Video saliency prediction based on spatial-temporal two-stream network. IEEE Transactions on Circuits and Systems for Video Technology, 29(12): 3544-3557 [DOI: 10.1109/TCSVT.2018.2883305]
Zhu Y C, Zhai G T and Min X K. 2018. The prediction of head and eye movement for 360 degree images. Signal Processing: Image Communication, 69: 15-25 [DOI: 10.1016/j.image.2018.05.010]