Adaptive weighted object tracking algorithm with continuous convolution operator
2019, Vol. 24, No. 7, pp. 1106-1115
Received: 2018-10-24; Revised: 2019-01-08; Published in print: 2019-07-16
DOI: 10.11834/jig.180586
Objective
In the field of visual tracking, efficient feature representation is the key to robust tracking. In correlation filter tracking, different convolutional layers represent different aspects of the target. Based on this observation, an adaptive weighted object tracking algorithm with a continuous convolution operator is proposed.
Method
To address inaccurate target localization, a continuous convolution operator is introduced that converts discrete position estimates into continuous ones, making localization more precise. The feature representations of different convolutional layers are then leveraged to improve tracking. The layers of a deep convolutional neural network differ in expressive power: shallow features carry substantial positional information, whereas deep features carry rich semantic information. Combining them for feature expression and tracking therefore yields better results than using only deep or only shallow features.
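As an illustration of the continuous formulation (a sketch following the general continuous-convolution-operator idea that this method builds on; the symbols $$x_d$$, $$b_d$$, $$N_d$$, $$T$$, and $$f_d$$ are notational assumptions rather than the paper's own), each discrete feature channel $$x_d$$ with $$N_d$$ samples can be interpolated into the continuous interval $$[0, T)$$ and scored by a continuous filter $$f_d$$:

$$J_d\{x_d\}(t)=\sum_{n=0}^{N_d-1}x_d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right),\qquad s(t)=\sum_{d=1}^{D}\big(f_d * J_d\{x_d\}\big)(t)$$

Because the confidence score $$s(t)$$ is defined on a continuous domain, its maximizer is not restricted to the discrete feature grid, which is what allows localization finer than the grid spacing.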
First, multi-layer convolutional features are extracted with a deep convolutional network, and the weight of each layer in the next frame's feature fusion is determined from the magnitude of its correlation response, highlighting the dominant features and making the target more distinguishable from the background and from distractors. Then, the correlation filters trained on the individual layers are correlated with the extracted features to obtain the final response map, and the position of its maximum value gives the target's position and scale. The weights of the convolutional feature layers are adaptively updated according to the correlation filtering performance of each layer, so that the expressive power of the different layers is fully exploited and the expression scheme adapts to the conditions of each frame, improving tracking performance.
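The following is a minimal Python/NumPy sketch of the adaptive weighting step described above, not the paper's implementation: per-layer correlation response maps are fused with the current weights, the target is placed at the peak of the fused map, and the weights for the next frame are recomputed from each layer's peak response. The Fourier-domain correlation helper, the use of the peak value as the per-layer quality measure, and the smoothing factor eta are all assumptions.

```python
import numpy as np

def correlate(filter_freq, feat):
    """Correlate one layer's learned filter (stored in the Fourier domain)
    with that layer's feature map, returning a spatial response map."""
    return np.real(np.fft.ifft2(np.conj(filter_freq) * np.fft.fft2(feat)))

def fuse_and_locate(responses, weights):
    """Fuse per-layer response maps with the current weights and find the peak."""
    fused = sum(w * r for w, r in zip(weights, responses))
    row, col = np.unravel_index(np.argmax(fused), fused.shape)
    return fused, (row, col)

def update_weights(responses, weights, eta=0.1):
    """Adapt per-layer weights from each layer's peak response: a layer whose
    filter responds strongly on the current frame is treated as more reliable
    and receives a larger weight in the next frame (eta is an assumed
    smoothing factor, not a value from the paper)."""
    peaks = np.array([r.max() for r in responses])
    new_w = peaks / peaks.sum()              # normalize the peak responses
    weights = (1.0 - eta) * weights + eta * new_w
    return weights / weights.sum()           # keep the weights summing to 1
```

In an actual tracker, the responses would come from correlating each layer's filter with the features of the search region; here they are taken as given.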
Result
The proposed algorithm is evaluated against three state-of-the-art trackers on the 50 video sequences of the object tracking benchmark (OTB-2013) dataset, achieving an average success rate of 85.4%.
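For reference, the OTB success measure counts the frames whose predicted bounding box overlaps the ground truth by more than a threshold. A minimal sketch of that metric follows (the 0.5 threshold and the (x, y, w, h) box format are assumptions; OTB also reports the area under the success curve over all thresholds):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose overlap with the ground truth exceeds the threshold."""
    return float(np.mean([iou(p, g) > threshold
                          for p, g in zip(pred_boxes, gt_boxes)]))
```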
Conclusion
Experimental results show that the proposed tracker performs well and remains robust in complicated situations such as illumination variation, scale variation, background clutter, object rotation, occlusion, and complex environments.