Adaptive weighted object tracking algorithm with continuous convolution operator
2019, Vol. 24, No. 7, pp. 1106-1115
Received: 2018-10-24; Revised: 2019-01-08; Published in print: 2019-07-16
DOI: 10.11834/jig.180586
Objective
In the field of visual tracking, efficient feature representation is the key to robust tracking. In correlation filter tracking, different convolutional layers represent different aspects of the target. Based on this observation, an adaptive weighted object tracking algorithm with a continuous convolution operator is proposed.
Method
To address inaccurate target localization, a continuous convolution operator is introduced that converts discrete position estimates into continuous ones, making localization more precise. The feature representations of different convolutional layers are then leveraged to improve tracking. The layers of a deep convolutional neural network differ in expressive power: shallow features carry substantial positional information, whereas deep features carry rich semantic information. Combining them for feature expression and tracking therefore yields better results than using only deep or only shallow features.
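As an illustration of the continuous formulation (a sketch following the general continuous-convolution-operator idea that this method builds on; the symbols $$x_d$$, $$b_d$$, $$N_d$$, $$T$$, and $$f_d$$ are notational assumptions rather than the paper's own), each discrete feature channel $$x_d$$ with $$N_d$$ samples can be interpolated into the continuous interval $$[0, T)$$ and scored by a continuous filter $$f_d$$:

$$J_d\{x_d\}(t)=\sum_{n=0}^{N_d-1}x_d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right),\qquad s(t)=\sum_{d=1}^{D}\big(f_d * J_d\{x_d\}\big)(t)$$

Because the confidence score $$s(t)$$ is defined on a continuous domain, its maximizer is not restricted to the discrete feature grid, which is what allows localization finer than the grid spacing.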
First, multi-layer convolutional features are extracted with a deep convolutional network, and the weight of each layer in the next frame's feature fusion is determined from the magnitude of its correlation response, highlighting the dominant features and making the target more distinguishable from the background and from distractors. Then, the correlation filters trained on the individual layers are correlated with the extracted features to obtain the final response map, and the position of its maximum value gives the target's position and scale. The weights of the convolutional feature layers are adaptively updated according to the correlation filtering performance of each layer, so that the expressive power of the different layers is fully exploited and the expression scheme adapts to the conditions of each frame, improving tracking performance.
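The following is a minimal Python/NumPy sketch of the adaptive weighting step described above, not the paper's implementation: per-layer correlation response maps are fused with the current weights, the target is placed at the peak of the fused map, and the weights for the next frame are recomputed from each layer's peak response. The Fourier-domain correlation helper, the use of the peak value as the per-layer quality measure, and the smoothing factor eta are all assumptions.

```python
import numpy as np

def correlate(filter_freq, feat):
    """Correlate one layer's learned filter (stored in the Fourier domain)
    with that layer's feature map, returning a spatial response map."""
    return np.real(np.fft.ifft2(np.conj(filter_freq) * np.fft.fft2(feat)))

def fuse_and_locate(responses, weights):
    """Fuse per-layer response maps with the current weights and find the peak."""
    fused = sum(w * r for w, r in zip(weights, responses))
    row, col = np.unravel_index(np.argmax(fused), fused.shape)
    return fused, (row, col)

def update_weights(responses, weights, eta=0.1):
    """Adapt per-layer weights from each layer's peak response: a layer whose
    filter responds strongly on the current frame is treated as more reliable
    and receives a larger weight in the next frame (eta is an assumed
    smoothing factor, not a value from the paper)."""
    peaks = np.array([r.max() for r in responses])
    new_w = peaks / peaks.sum()              # normalize the peak responses
    weights = (1.0 - eta) * weights + eta * new_w
    return weights / weights.sum()           # keep the weights summing to 1
```

In an actual tracker, the responses would come from correlating each layer's filter with the features of the search region; here they are taken as given.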
Result
The proposed algorithm is evaluated against three state-of-the-art trackers on the 50 video sequences of the object tracking benchmark (OTB-2013) dataset, achieving an average success rate of 85.4%.
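For reference, the OTB success measure counts the frames whose predicted bounding box overlaps the ground truth by more than a threshold. A minimal sketch of that metric follows (the 0.5 threshold and the (x, y, w, h) box format are assumptions; OTB also reports the area under the success curve over all thresholds):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose overlap with the ground truth exceeds the threshold."""
    return float(np.mean([iou(p, g) > threshold
                          for p, g in zip(pred_boxes, gt_boxes)]))
```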
Conclusion
Experimental results show that the proposed tracker performs well and remains robust in complicated situations such as illumination variation, scale variation, background clutter, object rotation, occlusion, and complex environments.