稀疏深度特征对传统显著性检测的优化
Optimization of traditional saliency detection by sparse deep features
2019, Vol. 24, No. 9, pp. 1493-1503
Received: 2018-11-19; Revised: 2019-03-13; Published in print: 2019-09-16
DOI: 10.11834/jig.180626
Objective
Salient object detection algorithms fall into two broad categories: traditional methods based on low-level features and newer methods based on deep learning. Traditional methods have difficulty capturing the high-level semantic information of objects, while deep learning methods capture high-level semantics but tend to neglect edge features. To exploit the strengths of both, and taking advantage of the way sparsity concentrates responses on the salient object, this paper combines the two and proposes a method based on sparse autoencoding and saliency-result optimization.
Method
The feature maps from the fourth pooling layer of a VGG (visual geometry group) network are processed by a sparse autoencoder to obtain five sparse saliency feature maps, which are then fed, together with the saliency map produced by a traditional method, into a convolutional neural network that optimizes the final saliency result.
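As a concrete illustration of the feature-extraction step, the sketch below (assuming PyTorch and torchvision, which the paper does not specify, and the VGG16 variant of the network) extracts the feature maps up to the fourth pooling layer:

```python
import torch
from torchvision import models

# Pretrained VGG16 backbone; the exact VGG variant used in the paper is an assumption.
vgg = models.vgg16(pretrained=True).features.eval()

# In torchvision's VGG16, modules 0..23 of `features` end at the 4th max-pooling layer.
pool4_extractor = torch.nn.Sequential(*list(vgg.children())[:24])

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)   # placeholder for a preprocessed input image
    pool4_maps = pool4_extractor(image)   # shape: (1, 512, 14, 14)
```

The 512 channels of `pool4_maps` are what the sparse autoencoding stage would then compress into the five sparse saliency feature maps.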
Result
Experiments were conducted with four traditional methods, DRFI (discriminative regional feature integration), HDCT (high dimensional color transform), RRWR (regularized random walks ranking), and CGVS (contour-guided visual search), on the public datasets DUT-OMRON, ECSSD, HKU-IS, and MSRA. The results show that the proposed algorithm effectively improves both the F-measure and the MAE (mean absolute error) of the detected salient objects. The largest F-measure gain was obtained with the optimized DRFI method, which improved by 24.53% on the HKU-IS dataset. For MAE, the CGVS method showed the smallest reduction, 12.78% on the ECSSD dataset, while the largest reduction was close to 50%. Moreover, the model has a simple structure, few parameters, and high computational efficiency: training takes about 5 h and the average per-image test time is about 3 s, giving it strong practical applicability.
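For reference, the two evaluation measures are the standard ones in salient object detection: MAE is the mean absolute difference between the predicted saliency map S and the binary ground truth G, and the F-measure combines precision P and recall R as F_β = (1 + β²)PR / (β²P + R), conventionally with β² = 0.3. A minimal NumPy sketch (the fixed 0.5 binarization threshold is an illustrative assumption, not taken from the paper):

```python
import numpy as np

def mae(saliency, gt):
    """Mean absolute error between a saliency map and a binary mask, both in [0, 1]."""
    return np.abs(saliency.astype(float) - gt.astype(float)).mean()

def f_measure(saliency, gt, beta_sq=0.3, threshold=0.5):
    """F-measure after thresholding the saliency map; beta_sq = 0.3 is the common convention."""
    pred = saliency >= threshold
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    denom = beta_sq * precision + recall
    return (1 + beta_sq) * precision * recall / denom if denom > 0 else 0.0
```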
Conclusion
This paper proposes a saliency-result optimization algorithm. Experiments show that it effectively improves the F-measure and MAE of salient object detection, suggesting good adaptability and application prospects for tasks such as object recognition, where ever more accurate salient object detection is required.
Objective
Saliency detection, as a preprocessing component of computer vision, has received increasing attention in areas such as object relocation, scene classification, semantic segmentation, and visual tracking. Although salient object detection has developed considerably, it remains challenging because of realistic factors such as background complexity and the attention mechanism itself. Many salient object detection methods have been proposed, and they fall mainly into traditional methods and newer methods based on deep learning. The traditional approach finds salient objects through low-level handcrafted features such as contrast, color, and texture. These general techniques have proven effective at preserving image structure and reducing computational effort. However, low-level features make it difficult to capture high-level semantic knowledge about objects and their surroundings, so methods based on them do not achieve good results when salient objects must be separated from cluttered backgrounds. Deep learning-based saliency detection methods instead seek salient objects by automatically extracting high-level features. Most of these models, however, focus on nonlinear combinations of high-level features extracted from the final convolutional layer, so the boundaries of salient objects are often extremely blurry owing to the lack of low-level visual information such as edges. In these works, convolutional neural network (CNN) features are fed into the model without any processing; because features extracted from a CNN are generally high-dimensional and contain a large amount of noise, this reduces the efficiency with which the features are used and can even have the opposite effect. Sparse methods can effectively concentrate the salient objects in a feature map and eliminate some of the noise interference, and the sparse autoencoder is one such method. To solve these problems, we propose optimizing traditional saliency detection through sparse autoencoding and image fusion, combining background priors and contrast analysis with VGG (visual geometry group) saliency computation.
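To make the sparsity idea concrete, the following is a minimal sparse-autoencoder sketch (assuming PyTorch; the layer sizes, target activation rho, and penalty weight are illustrative assumptions rather than the paper's configuration). The KL-divergence penalty drives most hidden units toward inactivity, which is what concentrates the representation on salient structure:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Single-hidden-layer autoencoder with a KL-divergence sparsity penalty."""
    def __init__(self, n_in=512, n_hidden=64):
        super().__init__()
        self.encoder = nn.Linear(n_in, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_in)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))   # sparse hidden code
        return self.decoder(h), h

def sparsity_penalty(h, rho=0.05):
    """KL divergence between target activation rho and mean hidden activation."""
    rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)
    return (rho * torch.log(rho / rho_hat)
            + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()

# One training step on feature vectors x (e.g., per-location 512-d pool4 features).
model = SparseAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(256, 512)                     # placeholder batch of feature vectors
opt.zero_grad()
recon, h = model(x)
loss = F.mse_loss(recon, x) + 0.1 * sparsity_penalty(h)  # 0.1: illustrative weight
loss.backward()
opt.step()
```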
Method
The proposed algorithm consists of four stages: traditional saliency map extraction, VGG feature extraction, sparse autoencoding, and saliency result optimization. First, the traditional method to be improved is selected and its saliency map is computed. In this experiment we select four traditional methods with strong results, namely discriminative regional feature integration (DRFI), high-dimensional color transform (HDCT), regularized random walks ranking (RRWR), and contour-guided visual search (CGVS). The VGG network is then used to extract feature maps, and the feature maps produced by each pooling layer are sparsely autoencoded, yielding 25 sparse saliency feature maps in total. In choosing among them, the first three pooling layers are discarded: their features are mainly low-level and retain excessive edge and texture information, duplicating the feature maps already obtained by the traditional method. A comparison of the fourth and fifth layers shows that the fifth pooling layer loses too much feature information, and experimental verification confirms that its feature maps have an interfering effect. We therefore use the feature maps extracted from the fourth pooling layer. These are passed through the sparse autoencoder, producing five sparse feature maps, each of which is combined with the corresponding saliency map obtained by the traditional method. Finally, a neural network performs the fusion and computes the final saliency map, as sketched below.
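The exact fusion network is not specified in this abstract, so the following is only a plausible sketch (PyTorch; channel counts and depth are assumptions): the five sparse feature maps and the traditional saliency map are stacked into a six-channel input, and a small CNN regresses the optimized saliency map.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Small CNN fusing 5 sparse feature maps with 1 traditional saliency map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, sparse_maps, trad_map):
        # sparse_maps: (N, 5, H, W); trad_map: (N, 1, H, W)
        x = torch.cat([sparse_maps, trad_map], dim=1)
        return self.net(x)  # optimized saliency map in [0, 1]

# Usage with placeholder inputs:
fused = FusionNet()(torch.rand(1, 5, 224, 224), torch.rand(1, 1, 224, 224))
```

A shallow network of this kind keeps the parameter count low, which is consistent with the short training and test times the paper reports.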
Result
Our experiments involve four open datasets: DUT-OMRON, ECSSD, HKU-IS, and MSRA. Half of the images from each of the four datasets form the training set, and the remaining halves serve as four test sets, so that training and test images are disjoint and the results are credible. The following conclusions are drawn from the experiments. 1) The proposed model greatly improves the F-measure of all four methods on all four datasets, with the DRFI method improving by 24.53% on the HKU-IS dataset. 2) The MAE (mean absolute error) is also greatly reduced; the smallest reduction is 12.78%, for the CGVS method on the ECSSD dataset, and the largest is nearly 50%. 3) The proposed network has few layers, few parameters, and a short computation time: training takes approximately 2 h, and the average per-image test time is approximately 0.2 s. By contrast, the adaptive-fusion saliency optimization scheme of Liu et al. requires approximately 47 h of training and an average of 56.95 s per test image, so the proposed model greatly improves computational efficiency. 4) The proposed model achieves significant improvement on all four datasets, especially HKU-IS and MSRA, which contain difficult images, confirming the effectiveness of the proposed method.
Conclusion
We propose optimizing saliency results by combining the low-level feature maps of traditional models, such as texture, with the high-level feature maps of a sparsely autoencoded VGG network, which greatly improves salient object recognition. The traditional methods DRFI, HDCT, RRWR, and CGVS are tested on the public salient object detection datasets DUT-OMRON, ECSSD, HKU-IS, and MSRA. The resulting F-measure and MAE values improve significantly, confirming the effectiveness of the proposed method. Moreover, the method's steps and network structure are simple and easy to understand, training takes little time, and the approach can be readily adopted. A limitation of the study is that some of the extracted feature maps go unused: in practice only the fourth-layer VGG feature maps are selected, so not all useful information is fully exploited.
Itti L, Koch C, Niebur E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11):1254-1259.[DOI:10.1109/34.730558]
Harel J, Koch C, Perona P. Graph-based visual saliency[C]//Proceedings of Neural Information Processing Systems. Canada: MIT Press, 2007: 545-552.
Perazzi F, Krähenbühl P, Pritch Y, et al. Saliency filters: contrast based filtering for salient region detection[C]//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012: 733-740.[DOI:10.1109/CVPR.2012.6247743]
Li Y, Hou X D, Koch C, et al. The secrets of salient object segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014: 280-287.[DOI:10.1109/CVPR.2014.43]
Borji A. What is a salient object? A dataset and a baseline model for salient object detection[J]. IEEE Transactions on Image Processing, 2015, 24(2):742-756.[DOI:10.1109/TIP.2014.2383320]
Liu L, Kuang G Y. Overview of image textural feature extraction methods[J]. Journal of Image and Graphics, 2009, 14(4):622-635.[DOI:10.11834/jig.20090409]
Li G B, Yu Y Z. Visual saliency detection based on multiscale deep CNN features[J]. IEEE Transactions on Image Processing, 2016, 25(11):5012-5024.[DOI:10.1109/TIP.2016.2602079]
Zhao R, Ouyang W L, Li H S, et al. Saliency detection by multi-context deep learning[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 1265-1274.[DOI:10.1109/CVPR.2015.7298731]
Fang Z, Cao T Y, Hong S Z, et al. Saliency detection via fusion of deep model and traditional model[J]. Journal of Image and Graphics, 2018, 23(12):1864-1873.[DOI:10.11834/jig.180073]
Lee G, Tai Y W, Kim J. Deep saliency with encoded low level distance map and high level features[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 660-668.[DOI:10.1109/CVPR.2016.78]
Liu N, Han J W. DHSNet: deep hierarchical saliency network for salient object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 678-686.[DOI:10.1109/CVPR.2016.80]
Zhang W, Jiang G Y, Wang Z F, et al. Research on image multiple description coding[J]. Journal of Image and Graphics, 2004, 9(3):257-264.[DOI:10.11834/jig.20040347]
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL].[2018-11-18]. https://arxiv.org/pdf/1409.1556.pdf
Yang K F, Li H, Li C Y, et al. A unified framework for salient structure detection by contour-guided visual search[J]. IEEE Transactions on Image Processing, 2016, 25(8):3475-3488.[DOI:10.1109/TIP.2016.2572600]
Jiang H Z, Wang J D, Yuan Z J, et al. Salient object detection: a discriminative regional feature integration approach[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE, 2013: 2083-2090.[DOI:10.1109/CVPR.2013.271]
Kim J, Han D, Tai Y W, et al. Salient region detection via high-dimensional color transform[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014: 883-890.[DOI:10.1109/CVPR.2014.118]
Li C Y, Yuan Y C, Cai W D, et al. Robust saliency detection via regularized random walks ranking[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 2710-2717.[DOI:10.1109/CVPR.2015.7298887]
Liu T, Yuan Z J, Sun J, et al. Learning to detect a salient object[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(2):353-367.[DOI:10.1109/TPAMI.2010.70]
Khuwuthyakorn P, Robles-Kelly A, Zhou J. Object of interest detection by saliency learning[C]//Proceedings of the 11th European Conference on Computer Vision. Berlin, Germany: Springer-Verlag, 2010: 636-649.[DOI:10.1007/978-3-642-15552-9_46]
Yang J M, Yang M H. Top-down visual saliency via joint CRF and dictionary learning[C]//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012: 2296-2303.[DOI:10.1109/CVPR.2012.6247940]
Tong N, Lu H C, Ruan X, et al. Salient object detection via bootstrap learning[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 1884-1892.[DOI:10.1109/CVPR.2015.7298798]
Yan Q, Xu L, Shi J P, et al. Hierarchical saliency detection[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE, 2013: 1155-1162.[DOI:10.1109/CVPR.2013.153]
Zhou X F, Liu Z, Sun G L, et al. Improving saliency detection via multiple kernel boosting and adaptive fusion[J]. IEEE Signal Processing Letters, 2016, 23(4):517-521.[DOI:10.1109/LSP.2016.2536743]