Semantic assistance and edge feature based salient object detection
Vol. 27, Issue 11, Pages 3243-3256 (2022)
Published: 16 November 2022
Accepted: 27 November 2021
DOI: 10.11834/jig.210534
Shengxuan Dai, Linfeng Xu, Fangyu Liu, Bin He. 2022. Semantic assistance and edge feature based salient object detection [J]. Journal of Image and Graphics, 27(11): 3243-3256.
Objective
Existing salient object detection models can locate salient objects well, but they fall short at producing complete, uniform objects and preserving clear edges. To obtain salient objects that are uniform overall and have clear edges, this paper proposes a salient object detection model that combines semantic assistance and edge features.
Method
The model uses a designed semantic-assisted feature fusion module to optimize the lateral output features of the backbone network. With semantic assistance, the features of each layer selectively fuse the adjacent low-level features to obtain sufficient structural information and strengthen the features of salient regions, so that uniformly complete salient objects can be detected. A designed edge branch network, together with the salient-object features, produces accurate edge features; fusing these edge features into the object features enhances the distinguishability of the object's edge regions, so that clear edges can be detected. In addition, a bidirectional multi-scale module is designed to extract multi-scale information in the network.
Result
On four widely used datasets, ECSSD (extended complex scene saliency dataset), DUT-O (Dalian University of Technology and OMRON Corporation), HKU-IS, and DUTS, the model is compared with 12 popular saliency models. Its maximum F-measure (MaxF) values on the four datasets are 0.940, 0.795, 0.929, and 0.870, and its mean absolute error (MAE) values are 0.041, 0.057, 0.034, and 0.043, respectively. The saliency maps produced by our method are closer to the ground truth, and our method achieves the best MaxF and MAE more often than the other 12 methods.
Conclusion
The proposed salient object detection model combining semantic assistance and edge features is effective. Semantic-assisted feature fusion and the introduction of edge features make the detected salient objects more complete and uniform and their edges more distinguishable, and multi-scale feature extraction further improves the detection of salient objects.
Objective
The human visual system extracts features from regions of interest when processing images or videos. Salient object detection in computer vision aims to improve visual interpretation for image preprocessing, and the quality of the generated saliency map directly affects the performance of subsequent vision tasks. Current deep-learning-based salient object detection methods can locate salient objects well because they extract effective semantic features, but extracting objects with clear edges remains essential for improving downstream visual tasks. In recent years, improving the edge accuracy of objects in complex scenes has received growing attention. Existing models rely on multiple indirect edge losses to supervise the edges of salient objects. To improve the edge details of objects, some models simply fuse the complementary object features and edge features; these models do not make full use of the edge features, resulting in insufficient edge enhancement. Furthermore, because salient objects vary in position and scale across visual scenes, multi-scale information is needed to extract object features. To obtain saliency maps with clear edges, we propose a salient object detection model based on semantic assistance and edge features.
Method
We use a semantic-assisted feature fusion module to optimize the lateral output features of the backbone network. With semantic assistance, the features of each layer selectively fuse the adjacent low-level features to obtain sufficient structural information and enhance the feature strength of the salient region, which helps generate a uniform saliency map covering the entire salient object. We design an edge branch network to obtain accurate edge features; to enhance the distinguishability of the edge regions of salient objects, these edge features are integrated into the object features. In addition, a bidirectional multi-scale module extracts multi-scale information: through dense connections and feature fusion, it gradually fuses the multi-scale features of adjacent layers, which benefits the detection of multi-scale objects in the scene. Our experiments run on a single NVIDIA GTX 1080Ti graphics processing unit (GPU) for training and testing. We train the model on the DUTS-train dataset, which contains 10 553 images, until convergence, with no validation set. VGG16 (Visual Geometry Group) serves as the backbone network, implemented in the PyTorch deep learning framework. The backbone is initialized from a model pre-trained on ImageNet, and all newly added convolutional layers are randomly initialized with a variance of 0.01 and a bias of 0. The learning rate, weight decay, and momentum are set to 5E-5, 0.000 5, and 0.9, respectively. We use the Adam optimizer and perform back-propagation every ten images. Input images are scaled to 256 × 256 pixels, and random flipping is the only data augmentation. The model is trained for 100 epochs in total, and the learning rate decays by a factor of 10 after 60 epochs.
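The edge-to-object fusion step described above can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the paper's actual layer design: the channel count, the concatenate-then-convolve fusion, and the module name `EdgeFusion` are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class EdgeFusion(nn.Module):
    """Hypothetical sketch: fuse edge features into salient-object features
    by channel concatenation followed by a 3x3 convolution. The paper's
    exact fusion design may differ; channel sizes here are illustrative."""

    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Sequential(
            # 2*channels in (object + edge), channels out
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, obj_feat, edge_feat):
        # concatenate along the channel dimension, then mix
        return self.fuse(torch.cat([obj_feat, edge_feat], dim=1))

# toy feature maps: batch 1, 64 channels, 64x64 spatial resolution
obj = torch.randn(1, 64, 64, 64)
edge = torch.randn(1, 64, 64, 64)
out = EdgeFusion()(obj, edge)
print(out.shape)  # torch.Size([1, 64, 64, 64])
```

The intent is that the convolution learns to sharpen the object features near edge locations, which is one simple way to realize "integrating edge features into object features".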
Result
Our model is compared with twelve popular saliency models on four commonly used datasets: the extended complex scene saliency dataset (ECSSD), DUT-O (Dalian University of Technology and OMRON Corporation), HKU-IS, and DUTS. The maximum F-measure (MaxF) values of our model on the four datasets are 0.940, 0.795, 0.929, and 0.870, and the mean absolute error (MAE) values are 0.041, 0.057, 0.034, and 0.043, respectively. The saliency maps our model produces are closer to the ground truth.
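The two evaluation metrics above can be computed as follows. This is a generic sketch of the standard definitions used in salient object detection, not the paper's evaluation code; the helper names and the tiny toy arrays are made up for illustration.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth,
    both with values in [0, 1]."""
    return np.abs(pred - gt).mean()

def f_measure(pred, gt, threshold=0.5, beta2=0.3):
    """F-measure of the binarized saliency map at one threshold.
    MaxF sweeps the threshold over [0, 1] and takes the maximum.
    beta2 = 0.3 is the weighting conventionally used in this task."""
    binary = pred >= threshold
    positives = gt > 0.5
    tp = np.logical_and(binary, positives).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(positives.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)

# toy 2x2 example: top row is salient, bottom row is background
pred = np.array([[0.9, 0.8], [0.1, 0.2]])
gt = np.array([[1.0, 1.0], [0.0, 0.0]])
print(mae(pred, gt))        # ~0.15
print(f_measure(pred, gt))  # 1.0 (perfect precision and recall here)
```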
Conclusion
We develop a model to detect salient objects. Fusing semantic-assisted features and edge features helps the model generate uniform saliency maps with clear object edges, and multi-scale feature extraction further improves the performance of salient object detection.
Keywords: salient object detection; fully convolutional neural network; semantic assistance; edge feature fusion; multi-scale extraction