融合深度特征和多核增强学习的显著目标检测
Salient object detection via deep features and multiple kernel boosting learning
2019年第24卷第7期,页码:1096-1105
收稿:2018-04-09
修回:2019-01-07
纸质出版:2019-07-16
DOI: 10.11834/jig.180224

目的
针对现有基于手工特征的显著目标检测算法对于显著性物体尺寸较大、背景杂乱以及多显著目标的复杂图像尚不能有效抑制无关背景区域且完整均匀高亮显著目标的问题,提出了一种利用深度语义信息和多核增强学习的显著目标检测算法。
方法
首先对输入图像进行多尺度超像素分割计算,利用基于流形排序的算法构建弱显著性图。其次,利用已训练的经典卷积神经网络对多尺度序列图像提取蕴含语义信息的深度特征,结合弱显著性图从多尺度序列图像内获得可靠的训练样本集合,采用多核增强学习方法得到强显著性检测模型。然后,将该强显著性检测模型应用于多尺度序列图像的所有测试样本中,线性加权融合多尺度的检测结果得到区域级的强显著性图。最后,根据像素间的位置和颜色信息对强显著性图进行像素级的更新,以进一步提高显著图的准确性。
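上述弱显著性图构建所依赖的基于图的流形排序存在闭式解 f = (I − αS)⁻¹y。下面给出一个最小化的示意实现(仅作演示:affinity 矩阵的构建方式与参数 α 均为此处的假设,并非论文的实际设置):

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.5):
    """基于图的流形排序闭式解 f = (I - alpha*S)^(-1) y。
    W: 超像素图的相似度(affinity)矩阵; y: 查询指示向量。"""
    d = W.sum(axis=1)                                   # 节点度
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))    # 避免孤立节点除零
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # 对称归一化 D^(-1/2) W D^(-1/2)
    n = len(y)
    return np.linalg.solve(np.eye(n) - alpha * S, y)    # 各节点的排序得分
```

例如在一条三节点链上以端点为查询时,排序得分沿链单调递减,反映了各节点与查询节点在图流形上的接近程度。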
结果
在常用的MSRA5K、ECSSD和SOD数据集上,与9种主流且相关的算法就准确率、查全率、F-measure值、准确率—召回率(PR)曲线、加权F-measure值、覆盖率(OR)值和平均绝对误差(MAE)值等指标以及直观的视觉检测效果进行了比较。相较于性能第2的非端到端深度神经网络模型,本文算法在3个数据集上的平均F-measure值、加权F-measure值和OR值分别提高了1.6%、22.1%和5.6%,MAE值降低了22.9%。
结论
相较于基于手工特征的显著性检测算法,本文算法利用图像蕴含的语义信息并结合多个单核支持向量机(SVM)分类器组成强分类器,在复杂图像上取得了较好的检测效果。
Objective
Salient object detection identifies the most conspicuous and eye-catching objects or regions in an image. Results are often expressed as saliency maps, in which the intensity of each pixel represents the probability that the pixel belongs to a salient region. Visual saliency detection has been used as a pre-processing step to facilitate a wide range of vision applications, including image and video compression, image retargeting, visual tracking, and robot navigation. Although the performance of salient object detection approaches has improved dramatically in the last few years, the task remains challenging in computer vision. Most existing methods focus on handcrafted features and use distinct prior knowledge, such as contrast, center, background, and objectness priors, to enhance performance. Recently, convolutional neural network (CNN)-based approaches have proven remarkably effective, overcoming the disadvantages of traditional handcrafted feature-based methods and greatly enhancing the performance of saliency detection. These CNN-based models, especially the end-to-end ones, have shown their superiority in feature extraction and efficiently capture high-level information about objects and their cluttered surroundings. By contrast, existing handcrafted feature-based salient object detection algorithms cannot effectively suppress irrelevant background regions or uniformly highlight the entire salient object in complicated images with large salient objects, cluttered backgrounds, or multiple salient objects. We propose a salient object detection scheme based on deep semantic information and multiple kernel boosting learning to overcome this drawback.
Method
First, we segment the input image into multiscale superpixels and compute weak saliency maps through graph-based manifold ranking. Second, we extract deep features involving semantic information with a classic pre-trained CNN and use the multiscale weak saliency maps to collect reliable training sets, from which a strong salient object detection model is learned by multiple kernel boosting. Then, the strong model predicts saliency for all test samples of the multiscale superpixel images, and the multiscale results are fused by linear weighting into a region-level strong saliency map. Finally, the saliency map is refined at the pixel level in accordance with color and position information to further improve detection performance.
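The multiple kernel boosting step can be sketched as AdaBoost-style selection among single-kernel weak learners: each round fits one learner per candidate kernel, keeps the one with the lowest weighted error, and reweights the samples. The sketch below is illustrative only; it substitutes kernel regularized least-squares classifiers for the paper's single-kernel SVMs to stay dependency-free, and the kernel set, regularization, and round count are assumptions rather than the paper's configuration.

```python
import numpy as np

def kernel(name, A, B):
    """A few illustrative kernels over row-vector sample matrices."""
    if name == "linear":
        return A @ B.T
    if name == "rbf":
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2)
    if name == "poly":
        return (A @ B.T + 1.0) ** 2
    raise ValueError(name)

class KernelRLS:
    """Weighted kernel regularized least-squares classifier,
    standing in for a single-kernel SVM in this sketch."""
    def __init__(self, name, lam=1e-3):
        self.name, self.lam = name, lam
    def fit(self, X, y, w):
        self.X = X
        K = kernel(self.name, X, X)
        # minimize sum_i w_i (f(x_i) - y_i)^2 + lam * ||f||^2
        self.a = np.linalg.solve(w[:, None] * K + self.lam * np.eye(len(y)), w * y)
        return self
    def decision(self, X):
        return kernel(self.name, X, self.X) @ self.a

def mkb_train(X, y, names=("linear", "rbf", "poly"), rounds=3):
    """Boosting over single-kernel learners (labels y in {-1, +1}):
    each round keeps the kernel with the lowest weighted error."""
    w = np.full(len(y), 1.0 / len(y))       # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        cands = [KernelRLS(n).fit(X, y, w) for n in names]
        errs = [np.sum(w * (np.sign(c.decision(X)) != y)) for c in cands]
        best = int(np.argmin(errs))
        err = min(max(errs[best], 1e-10), 0.499)
        alpha = 0.5 * np.log((1 - err) / err)    # learner weight
        w = w * np.exp(-alpha * y * np.sign(cands[best].decision(X)))
        w = w / w.sum()                          # renormalize
        ensemble.append((alpha, cands[best]))
    return ensemble

def mkb_score(X, ensemble):
    """Real-valued score: alpha-weighted sum of weak-learner outputs."""
    return sum(alpha * clf.decision(X) for alpha, clf in ensemble)
```

In the paper's pipeline, the training samples would be superpixel-level deep features labeled by the weak saliency maps, and `mkb_score` would provide the region-level strong saliency values before fusion.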
Result
The proposed model is compared with 11 state-of-the-art methods to evaluate its performance in terms of precision, recall, F-measure, PR (precision-recall) curve, weighted F-measure, OR (overlapping ratio) and MAE (mean absolute error) scores, and visual effect on three popular public datasets, namely, MSRA5K, ECSSD, and SOD. Experimental results show improvements over the state-of-the-art methods. Compared with the saliency results produced by the non-end-to-end deep learning model whose performance ranks second, on MSRA5K, ECSSD, and SOD respectively, the F-measure score of our algorithm increased by 0.7%, 2.0%, and 2.1%; the weighted F-measure increased by 18.9%, 27.6%, and 19.8%; the OR scores increased by 2.9%, 6.8%, and 7.2%; and the MAE scores decreased by 34.5%, 26.9%, and 7.5%. The experiments on visual effect show that our method performs well on various complex images, such as those with salient objects and backgrounds sharing similar appearance, multiple salient objects, salient objects with complex texture and structure, and cluttered backgrounds. The proposed approach not only uniformly highlights entire salient objects but also efficiently preserves their contours under various scenarios. Moreover, we conduct experiments on the three datasets in terms of PR curves to evaluate the contribution of each component of the proposed algorithm. We also report the average running time of our algorithm and of the methods based on non-end-to-end CNNs. The implementation runs on the ECSSD dataset using MATLAB or C, and most of the test images have a resolution of 300×400 pixels. An efficient C/C++ implementation based on parallelized components would decrease our model's computation time and render it feasible for real-world applications.
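For reference, the F-measure and MAE scores reported above can be computed from a saliency map and a binary ground-truth mask as follows. This is a generic sketch: the adaptive threshold (twice the mean saliency) and β² = 0.3 are common conventions in the salient object detection literature, assumed here rather than confirmed as this paper's exact protocol.

```python
import numpy as np

def f_measure(sal, gt, beta2=0.3):
    """F-measure of a saliency map `sal` in [0, 1] against a boolean
    mask `gt`, binarized at an adaptive threshold (2 * mean saliency)."""
    t = min(2.0 * sal.mean(), 1.0)
    pred = sal >= t
    tp = float(np.logical_and(pred, gt).sum())
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0.0:
        return 0.0
    return (1.0 + beta2) * precision * recall / (beta2 * precision + recall)

def mae(sal, gt):
    """Mean absolute error between the saliency map and the ground truth."""
    return float(np.abs(sal - gt.astype(float)).mean())
```

A perfect saliency map yields an F-measure of 1.0 and an MAE of 0.0, which is why F-measure improvements and MAE reductions are reported together.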
Conclusion
Compared with salient object detection methods based on handcrafted features, the proposed model, which learns a strong classifier from four single-kernel SVMs (support vector machines) and exploits semantic information from a classic CNN, demonstrates good performance on complicated images. Further improvement of salient object detection on datasets with complex and confusing background images remains worth pursuing. In future research, we plan to exploit additional features from a CNN and construct an end-to-end model, which would improve performance and save computation cost. Moreover, our future work will address small salient object detection in videos.