Correlation filter target tracking algorithm based on adaptive multifeature fusion
2020, Vol. 25, No. 6: 1160-1170
Received: 2019-06-21; revised: 2019-11-13; accepted: 2019-11-20; published in print: 2020-06-16
DOI: 10.11834/jig.190304
Objective
To address background clutter, illumination variation, fast motion, rotation, and other challenges faced when tracking targets in real-world scenes, a correlation filter tracking algorithm based on adaptive multi-feature fusion is proposed.
Method
The HOG (histogram of oriented gradients) feature of the target is extracted, and high- and low-level convolutional features are extracted with a convolutional neural network. An adaptive threshold segmentation method is used to evaluate the validity of each feature, yielding the weights for feature fusion. The response maps of the features are fused according to these weights to obtain the new estimate of the target position, and a scale correlation filter then computes the target scale to complete tracking.
Result
Experiments are conducted on the public OTB (object tracking benchmark)-2013 dataset. Building on an analysis of multi-feature fusion, the tracking performance of the proposed algorithm is tested under 11 different attributes and compared with seven popular algorithms. The results show that the proposed algorithm ranks first in both success rate and precision; compared with the baseline algorithm DSST (discriminative scale space tracking), tracking precision improves by 4% and success rate by 6%. The algorithm is more robust than other mainstream algorithms in complex scenes.
Conclusion
The proposed algorithm takes the DSST correlation filter tracker as its baseline, evaluates the validity of each feature with an adaptive threshold segmentation method, and adaptively fuses two layers of convolutional features with the HOG feature, so that more discriminative individual features receive larger fusion weights. This represents the target appearance model well and yields strong tracking accuracy under background clutter, target disappearance, illumination variation, fast motion, and rotation.
Objective
Target tracking is one of the basic problems in the field of computer vision. It is widely used in security monitoring, military operations, and automatic driving, among others. Tracking algorithms based on correlation filtering have developed rapidly in recent years because they are fast and efficient. However, designing a robust tracking algorithm remains challenging due to background clutter, illumination variation, fast motion, rotation, and other complex factors. Building an effective appearance model is a key factor in the success of a tracking algorithm. Current appearance models fall into two major types. In the first type, the appearance model is based on manual design. The most common hand-designed appearance model is the histogram of oriented gradients (HOG) feature, which efficiently describes the contour and shape of a target by computing directional gradients over local regions of the detected image. In the second type, the appearance model is based on deep learning. Low-level convolutional features contain rich texture information but adapt poorly to background changes. High-level convolutional features contain rich semantic information that distinguishes targets from backgrounds even in complex scenes. Because different features describe different aspects of an image, this study proposes a correlation filter tracking method to achieve adaptive multi-feature fusion.
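As background for the HOG feature discussed above, the descriptor can be sketched in a few lines of NumPy: per-pixel gradients are computed, and a histogram of gradient orientations weighted by gradient magnitude is accumulated per cell. This is a simplified illustration only; the full Dalal-Triggs descriptor additionally uses block normalization and bilinear interpolation, which are omitted here.

```python
import numpy as np

def simple_hog(img, cell=8, bins=9):
    """Minimal HOG sketch: per-cell histograms of gradient orientation.

    img: 2-D grayscale array. Omits the block normalization and
    interpolation of the full Dalal-Triggs descriptor.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    # unsigned orientation in [0, 180) degrees
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = img.shape
    ch, cw = h // cell, w // cell
    hist = np.zeros((ch, cw, bins))
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    for i in range(ch * cell):
        for j in range(cw * cell):
            hist[i // cell, j // cell, bin_idx[i, j]] += mag[i, j]
    return hist

# a vertical step edge concentrates energy in one orientation bin
img = np.zeros((16, 16))
img[:, 8:] = 1.0
feat = simple_hog(img)
print(feat.shape)  # (2, 2, 9)
```

Each cell's 9-bin histogram summarizes local edge direction, which is why HOG captures contour and shape information compactly.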
Method
In this work, the DSST (discriminative scale space tracking) correlation filter is adopted as the benchmark algorithm, and the conv1 and conv5 layers of the convolutional neural network (CNN) imagenet-vgg-2048 (VGG: visual geometry group) are used. First, the HOG feature of the target is extracted, and the high- and low-level convolutional features are extracted using the CNN; a response map is obtained for each feature. The maximum peak and the shape of a response map reflect the reliability of the tracking result. Second, to evaluate the validity of each feature, the area ratio of the peak is proposed as a new index for judging the confidence of a correlation response map, and the validity of each feature is evaluated using an adaptive threshold segmentation method. If the peak of the response map is sharp and its surroundings are smooth, the tracking result is reliable, and the fusion coefficient of that feature is increased; in this way the weights for feature fusion are obtained. Lastly, the response maps of the features are fused in accordance with the fusion coefficients, the final response output is computed, and the new target position is determined from the maximum value in the fused response map. A scale correlation filter is then used to estimate the target scale, achieving adaptive target tracking.
Result
To effectively evaluate the performance of the proposed method, the algorithm is tested on the public OTB (object tracking benchmark)-2013 dataset. Its 50 videos cover 11 different challenges encountered in target tracking (including background clutter, deformation, out-of-view, and scale variation). This study compares the algorithm with seven mainstream algorithms, using precision and success rate as the evaluation indicators. These algorithms fall into two major categories. Representative, top-ranked traditional tracking algorithms: ASLA (adaptive structural local sparse appearance), SCM (sparsity-based collaborative model), and TLD (tracking-learning-detection). Correlation filtering algorithms: CFNet (correlation filter networks), KCF (kernel correlation filter), DSST, and SAMF (scale adaptive with multiple features). The experimental results show that the proposed algorithm achieves the highest success rate and precision among the compared algorithms. The precision of the proposed method on OTB-2013 is 77.8%; the other algorithms achieve CFNet 76.1%, DSST 74.6%, KCF 73.5%, SAMF 72.5%, and the traditional algorithm SCM 67.8%. The success rate of the proposed algorithm is 71.5%, against CFNet 71.4%, DSST 67.5%, SAMF 66.2%, and KCF 61.1%. Relative to the baseline, the proposed algorithm thus increases tracking precision by 4% and success rate by 6%, and the analysis of these experimental data shows that the method effectively improves tracking performance. The proposed method ranks first in precision compared with CFNet, DSST, SAMF, and KCF on seven attributes: background clutter, deformation, out-of-view, illumination variation, in-plane rotation, out-of-plane rotation, and fast motion. Compared with the other algorithms, it achieves the highest success rate on the scenes of nine attributes.
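The two OTB metrics used above can be reproduced with a short sketch: precision counts the frames whose predicted box center lies within 20 pixels of the ground-truth center, and success is based on bounding-box overlap (the benchmark reports the area under a curve over all overlap thresholds; a single IoU threshold of 0.5 is used here for brevity).

```python
import numpy as np

def center_error(pred, gt):
    """Euclidean distance between box centers; boxes are (x, y, w, h)."""
    pc = pred[:, :2] + pred[:, 2:] / 2.0
    gc = gt[:, :2] + gt[:, 2:] / 2.0
    return np.linalg.norm(pc - gc, axis=1)

def iou(pred, gt):
    """Intersection-over-union of axis-aligned (x, y, w, h) boxes."""
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / union

# two illustrative frames: one accurate prediction, one lost track
pred = np.array([[10., 10., 50., 50.], [100., 100., 40., 40.]])
gt   = np.array([[12., 10., 50., 50.], [160., 160., 40., 40.]])
precision = (center_error(pred, gt) <= 20).mean()   # OTB threshold: 20 px
success   = (iou(pred, gt) > 0.5).mean()            # single overlap threshold
print(precision, success)  # 0.5 0.5
```

In the benchmark itself these fractions are computed per sequence over all frames and then aggregated, which is what the percentage figures above summarize.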
Conclusion
Because different features describe different aspects of an image, this study proposes a correlation filter tracking method to achieve adaptive multi-feature fusion. A CNN is used to extract high- and low-layer convolutional features alongside the HOG feature. An adaptive threshold segmentation method is proposed to evaluate the validity of each feature, and the two layers of convolutional features are adaptively fused with the HOG feature: the response maps are combined in accordance with fusion coefficients derived from the validity analysis. Compared with most feature fusion methods, which concatenate features serially or in parallel, this algorithm increases the fusion weight of the single feature with the strongest discriminability, so the target appearance model can be represented accurately. As a result, the proposed algorithm exhibits strong robustness and tracking accuracy in scenarios with low resolution and scale variation. The presented target tracking method will be studied further in future work under occlusion, motion blur, and fast motion conditions.
Bolme D S, Beveridge J R, Draper B A and Lui Y M. 2010. Visual object tracking using adaptive correlation filters//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE: 2544-2550 [DOI: 10.1109/CVPR.2010.5539960]
Dalal N and Triggs B. 2005. Histograms of oriented gradients for human detection//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE: 886-893 [DOI: 10.1109/CVPR.2005.177]
Danelljan M, Häger G, Khan F S and Felsberg M. 2014a. Accurate scale estimation for robust visual tracking//Proceedings of British Machine Vision Conference 2014. Nottingham, UK: BMVA Press: 1-11 [DOI: 10.5244/C.28.65]
Danelljan M, Khan F S, Felsberg M and van de Weijer J. 2014b. Adaptive color attributes for real-time visual tracking//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE: 1090-1097 [DOI: 10.1109/CVPR.2014.143]
Danelljan M, Häger G, Khan F S and Felsberg M. 2015a. Convolutional features for correlation filter based visual tracking//Proceedings of 2015 IEEE International Conference on Computer Vision Workshop. Santiago, Chile: IEEE: 621-629 [DOI: 10.1109/ICCVW.2015.84]
Danelljan M, Häger G, Khan F S and Felsberg M. 2015b. Learning spatially regularized correlation filters for visual tracking//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 4310-4318 [DOI: 10.1109/ICCV.2015.490]
Henriques J F, Caseiro R, Martins P and Batista J. 2015. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3): 583-596 [DOI: 10.1109/TPAMI.2014.2345390]
Jia X, Lu H C and Yang M H. 2012. Visual tracking via adaptive structural local sparse appearance model//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE: 1822-1829 [DOI: 10.1109/CVPR.2012.6247880]
Kalal Z, Matas J and Mikolajczyk K. 2010. P-N learning: bootstrapping binary classifiers by structural constraints//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE: 49-56 [DOI: 10.1109/CVPR.2010.5540231]
Li Y and Zhu J K. 2014. A scale adaptive kernel correlation filter tracker with feature integration//Proceedings of European Conference on Computer Vision. Zurich, Switzerland: Springer: 254-265 [DOI: 10.1007/978-3-319-16181-5_18]
Lu G Z, Peng D L and Gu Y. 2018. Robust correlation filtering-based tracking by multifeature hierarchical fusion. Journal of Image and Graphics, 23(5): 662-673 [DOI: 10.11834/jig.170472]
Ma C, Huang J B, Yang X K and Yang M H. 2015. Hierarchical convolutional features for visual tracking//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 3074-3082 [DOI: 10.1109/ICCV.2015.352]
Mao N, Yang D D, Yang F C and Cai Y Z. 2016. Adaptive object tracking based on hierarchical convolution features. Laser and Optoelectronics Progress, 53(12): 195-207 [DOI: 10.3788/LOP53.121502]
Otsu N. 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1): 62-66 [DOI: 10.1109/TSMC.1979.4310076]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2020-03-22]. https://arxiv.org/pdf/1409.1556.pdf
Tuzel O, Porikli F and Meer P. 2006. Region covariance: a fast descriptor for detection and classification//Proceedings of the 9th European Conference on Computer Vision. Graz, Austria: Springer: 589-600 [DOI: 10.1007/11744047_45]
Valmadre J, Bertinetto L, Henriques J F, Vedaldi A and Torr P H S. 2017. End-to-end representation learning for correlation filter based tracking//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 5000-5008 [DOI: 10.1109/CVPR.2017.531]
Viola P and Jones M J. 2004. Robust real-time face detection. International Journal of Computer Vision, 57(2): 137-154 [DOI: 10.1023/b:visi.0000013087.49260.fb]
Wang L J, Ouyang W L, Wang X G and Lu H C. 2015. Visual tracking with fully convolutional networks//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 3119-3127 [DOI: 10.1109/ICCV.2015.357]
Wu Y, Lim J and Yang M H. 2013. Online object tracking: a benchmark//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE: 2411-2418 [DOI: 10.1109/CVPR.2013.312]
Xiong C Z, Che M Q and Wang R L. 2018. Adaptive convolutional feature selection for real-time visual tracking. Journal of Image and Graphics, 23(11): 1742-1750 [DOI: 10.11834/jig.180252]
Zhong W, Lu H C and Yang M H. 2012. Robust object tracking via sparsity-based collaborative model//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE: 1838-1845 [DOI: 10.1109/CVPR.2012.6247882]