Salient object detection based on an attention-guided network
2022, Vol. 27, No. 4: 1176-1190
Received: 2020-11-13
Revised: 2021-02-07
Accepted: 2021-02-14
Published in print: 2022-04-16
DOI: 10.11834/jig.200658
Objective
Most fully convolutional models for salient object detection work by aggregating features from different levels, and how to extract and aggregate these features well remains a research challenge. Common multi-level fusion strategies include element-wise addition and concatenation, but they ignore the differing receptive field sizes of convolutional layers and the unequal contributions of their feature maps to the final saliency map. To address this, we combine channel attention and spatial attention to selectively and progressively aggregate deep and shallow feature information, handling the transfer and fusion of multi-level features more effectively, and propose a new saliency detection model, AGNet (attention-guided network), which exploits several attention mechanisms to weight different feature information.
Method
The network consists mainly of a feature extraction module (FEM), a channel-spatial attention aggregation module (C-SAAM), and an attention residual refinement module (ARRM), and is trained by minimizing the pixel position aware (PPA) loss. C-SAAM selectively aggregates shallow edge information with deep abstract semantic features, using channel attention and spatial attention to keep redundant background information from corrupting the saliency map; ARRM further refines the fused output and enhances the input to the next stage.
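C-SAAM fuses shallow edge features with deep semantic features under channel and spatial attention. As a rough illustration only, the following NumPy sketch shows one common way such gating can work; the functions `channel_attention`, `spatial_attention`, and `csaam_fuse` are our hypothetical simplification, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Squeeze each channel by global average pooling,
    # then reweight channels with a sigmoid gate (SE-style; the paper's
    # exact gating layers are not specified here).
    squeeze = feat.mean(axis=(1, 2))            # (C,)
    return feat * sigmoid(squeeze)[:, None, None]

def spatial_attention(feat):
    # Pool across channels to a single map, then gate each pixel.
    pooled = feat.mean(axis=0)                  # (H, W)
    return feat * sigmoid(pooled)[None, :, :]

def csaam_fuse(shallow, deep):
    # Hypothetical C-SAAM-style fusion: channel-gate the deep semantic
    # features, spatially gate the shallow edge features, then combine.
    return channel_attention(deep) + spatial_attention(shallow)

shallow = np.random.rand(64, 32, 32)
deep = np.random.rand(64, 32, 32)
fused = csaam_fuse(shallow, deep)
print(fused.shape)  # (64, 32, 32)
```

The gates let the fusion suppress feature channels and spatial positions that carry redundant background information, instead of mixing everything as plain addition or concatenation would.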
Result
Experiments on five public datasets show that AGNet achieves the best performance on multiple evaluation metrics. On the DUT-OMRON (Dalian University of Technology-OMRON) dataset in particular, the F-measure improves by 1.9% over the second-ranked saliency detection model and the MAE (mean absolute error) drops by 1.9%. The network is also fast, achieving real-time performance.
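F-measure and MAE are the standard saliency metrics used above; a minimal NumPy sketch of both follows. The adaptive threshold of twice the mean saliency value and the weighting β² = 0.3 are common practice in the field, not details stated in this abstract.

```python
import numpy as np

def mae(pred, gt):
    # Mean absolute error between a saliency map and ground truth,
    # both with values in [0, 1].
    return np.abs(pred - gt).mean()

def f_measure(pred, gt, beta2=0.3):
    # F-measure with beta^2 = 0.3 at an adaptive threshold of twice
    # the mean saliency value (a common convention, capped at 1).
    thresh = min(2 * pred.mean(), 1.0)
    binary = pred >= thresh
    gt = gt >= 0.5
    tp = np.logical_and(binary, gt).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0

gt = np.zeros((8, 8)); gt[2:6, 2:6] = 1.0
pred = gt * 0.9  # near-perfect prediction
print(round(mae(pred, gt), 3))        # 0.025
print(round(f_measure(pred, gt), 3))  # 1.0
```

Lower MAE and higher F-measure are better, which is why a 1.9% drop in MAE and a 1.9% gain in F-measure both indicate improvement.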
Conclusion
The proposed saliency detection model accurately segments salient object regions and provides clear local details.
Objective
Salient object detection aims to locate the most attention-grabbing parts of an image and to segment the shapes of the salient objects. Selective attention lets humans allocate the brain's limited resources to the most important information in a visual scene, giving the visual system both efficiency and precision, and salient object detection simulates this attention mechanism of the human brain. It is widely applied in image editing, visual tracking, and robot navigation. Traditional methods detect salient objects from hand-crafted visual cues such as brightness, color, and motion, but the lack of high-level semantic information limits their ability to handle complex scenes. The pyramid structure of deep convolutional neural networks (DCNNs) extracts both low-level detail and high-level semantic information through stacked convolution and pooling operations, and the feature extraction capability of convolutional networks has been widely exploited in computer vision; in particular, fully convolutional networks (FCNs) have been applied to salient object detection. Multi-level feature fusion strategies such as addition and concatenation are commonly used, but they ignore the different contributions of different features to the salient objects and lead to sub-optimal solutions: noisy low-level details and fuzzy high-level boundaries reduce detection accuracy. Hence, we design a new model for salient object detection that assigns different weights to attention features and uses a variety of attention mechanisms to guide the fusion of feature information block by block.
Method
We develop a feature aggregation network based on attention mechanisms for salient object detection. The proposed network uses several attention mechanisms to assign different weights to the information in different feature maps, enabling effective aggregation of deep and shallow features. The network is mainly composed of a feature extraction module (FEM), a channel-spatial attention aggregation module (C-SAAM), and an attention residual refinement module (ARRM), and is trained by minimizing the pixel position aware (PPA) loss. FEM obtains rich context information through multi-scale feature extraction. C-SAAM selectively aggregates the edge information of shallow features with semantically rich high-level features; unlike addition and concatenation, it uses channel attention and spatial attention to aggregate multi-layer features while suppressing the fusion of redundant information. ARRM is a residual refinement module that further refines the fused output and enhances the input to the next stage. We use ResNet-50 as the backbone of the encoder and initialize it via transfer learning with parameters pre-trained on ImageNet. The network is trained on the DUTS-TR dataset. In the training stage, the input images and ground-truth masks are resized to 288 × 288 pixels, and an NVIDIA RTX 2080Ti GPU is used. Mini-batch stochastic gradient descent (SGD) optimizes the network with a learning rate of 0.05, momentum of 0.9, weight decay of 5E-4, and batch size of 24. Without a validation set, the model is trained for 30 epochs, and the whole training process takes 3 hours. At test time, inference on a 320 × 320 pixels image takes 0.02 s (50 frame/s), which meets real-time requirements.
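The reported optimizer settings correspond to the standard SGD update with momentum and weight decay. One parameter step can be sketched as follows (plain NumPy, illustrative only, not the authors' training code):

```python
import numpy as np

# Hyperparameters as reported: lr 0.05, momentum 0.9, weight decay 5e-4.
LR, MOMENTUM, WEIGHT_DECAY = 0.05, 0.9, 5e-4

def sgd_step(param, grad, velocity):
    # Weight decay adds the gradient of an L2 penalty on the weights;
    # momentum accumulates a running direction of past gradients.
    grad = grad + WEIGHT_DECAY * param
    velocity = MOMENTUM * velocity + grad
    param = param - LR * velocity
    return param, velocity

param = np.ones(3)
velocity = np.zeros(3)
grad = np.array([0.2, -0.1, 0.0])
param, velocity = sgd_step(param, grad, velocity)
print(param)  # [0.989975 1.004975 0.999975]
```

Momentum smooths the noisy per-batch gradients, while weight decay keeps the weights small, which is a common regularizer when training without a validation set.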
Result
We compare our model with 13 models on five public datasets. To evaluate the proposed model comprehensively, we adopt the precision-recall (PR) curve, the F-measure score and curve, the mean absolute error (MAE), and the E-measure. On the challenging DUT-OMRON dataset, the F-measure increases by 1.9% and the MAE decreases by 1.9% compared with the second-best model. We also plot the PR and F-measure curves on the five datasets to assess the segmented salient objects; compared with other methods, our F-measure curve stays higher across different thresholds, which demonstrates the effectiveness of the model. Visualized examples show that our model predicts high-quality saliency maps and filters out non-salient areas.
Conclusion
The proposed aggregation network, guided by channel-spatial attention, effectively extracts and exploits both high-level and low-level features of the input image.