Real-time visual tracking via weighted multi-feature fusion on an appearance model
2019, Vol. 24, No. 2, pp. 291-301
Received: 2018-06-22
Revised: 2018-08-10
Published in print: 2019-02-16
DOI: 10.11834/jig.180398
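Objective
Object tracking is one of the key research directions in computer vision and is widely applied in intelligent transportation, human-computer interaction, and other areas. Although correlation filter-based methods have made remarkable progress in this field owing to their efficiency and robustness, the selection and representation of features remain the primary consideration when building the target appearance model during tracking. To improve the robustness of the appearance model, a growing number of trackers replace the single raw grayscale feature with gradient features, color features, or other combined features. However, these methods do not weigh each feature in the model according to the properties of the feature itself.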
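Method
This study focuses on the selection and fusion of features. A weight vector is introduced to fuse the features, and a tracker based on a weighted multi-feature appearance model is designed. According to how each feature is computed, a linear equation in two unknowns is constructed, which turns the problem of solving for the weight vector into that of determining the proportional coefficients of the features. Combined with the dimension information of the features, the equation yields a finite set of integer solutions. The final proportional coefficients are then selected experimentally and normalized to obtain the weight vector, from which a new weighted hybrid feature model of the target appearance is built.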
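The abstract does not state the equation itself, so the following Python sketch uses a placeholder constraint a·x + b·y = c with made-up coefficients. It only illustrates the two mechanical steps described above: enumerating a finite set of positive-integer solutions, and normalizing a chosen set of proportional coefficients into a weight vector. The function names and every numeric value are assumptions, not taken from the paper.

```python
def positive_integer_solutions(a, b, c):
    """Return all positive integer pairs (x, y) with a*x + b*y == c.

    a, b and c must be positive integers; the values used below are
    placeholders, not the coefficients of the paper's equation.
    """
    solutions = []
    for x in range(1, c // a + 1):
        rest = c - a * x
        if rest > 0 and rest % b == 0:
            solutions.append((x, rest // b))
    return solutions


def normalize(coefficients):
    """Scale proportional coefficients so that they sum to 1."""
    total = sum(coefficients)
    return [k / total for k in coefficients]


# Hypothetical constraint x + 2*y = 7 (an assumption for illustration only).
candidates = positive_integer_solutions(1, 2, 7)

# In the paper the final ratio is chosen by experiment; here the first
# candidate is taken arbitrarily just to show the normalization step.
weights = normalize(candidates[0])
```

Because the solution set is finite, each candidate ratio can be evaluated on test sequences and the best one kept, which matches the experimental selection step the abstract describes.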
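Result
Experiments use the 100 video sequences of OTB-100 to compare the proposed algorithm with seven other mainstream algorithms, five of which are correlation filter-based methods, with precision, average center error, and real-time performance as evaluation metrics. While maintaining real-time performance, the proposed algorithm achieves better tracking results on several sequences, including Basketball, DragonBaby, Panda, and Lemming. Averaged over the 100 sequences, its precision is 1.2% higher than that of the scale-adaptive tracker with multi-feature fusion.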
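Conclusion
Within a correlation filter-based tracking framework, this study introduces a weight vector into the description of the target appearance and proposes a weighted multi-feature fusion tracker, which tracks targets for longer in complex dynamic scenes and improves the robustness of the algorithm.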
Objective
Visual tracking is an important research direction in computer vision and is widely applied in intelligent transportation, human-computer interaction, and other areas. Correlation filter-based trackers (CFTs) have achieved excellent performance in tracking owing to their efficiency and robustness. However, designing a robust tracking algorithm for complex dynamic scenes remains challenging because of illumination changes, fast motion, background interference, target rotation, scale change, occlusion, and other factors. In addition, the selection and representation of features are the primary considerations in establishing a target appearance model during tracking. To improve the robustness of the appearance model, many trackers use gradient features, color features, or other combined features rather than a single gray feature. However, they do not discuss the role of each feature or the relationships among the features in the model.
Method
Research on correlation filter theory has achieved remarkable improvements. Within this line of work, the appearance model is used to represent the target and verify the observation, which is the most important part of any tracking algorithm. Features are both fundamental and difficult to choose in appearance representation, so this study focuses on the selection and combination of features. Gradient features, color features, and raw pixels have been discussed in previous works. As a common descriptor of shape and edge, the gradient feature is invariant to translation and illumination and performs well in tracking scenes with deformation, illumination change, and partial occlusion. However, the gradient feature of the target is not evident, and its descriptive capability is weakened, under considerable background noise, target rotation, or target blur. In such cases the target can still be distinguished from the background by color, because their colors usually differ. On this basis, a new tracking method called the weighted multi-feature fusion (WMFF) tracker is proposed, which introduces a weight vector to fuse multiple features in the appearance model. The model is dominated by gradient features and supplemented by color features and original pixels, which compensates for the inadequacies of a single gradient feature and exploits the color information of the target, so that the features complement one another. Specifically, this study constructs a linear equation in three variables on the weights based on how each feature is computed. The proportional relationships among the weights are solved for rather than their specific values. Taking the gradient feature as the reference turns the problem of solving for the weight vector into that of determining the proportional coefficients of each feature, so the equation reduces to a linear equation in two unknowns. Given the dimension information of the feature computation, this equation has a finite set of integer solutions, and the final proportional coefficients are determined by experimental verification on test sequences. The proportional coefficients are normalized to form the weight vector, and a new weighted hybrid feature model of the target appearance is built. The WMFF tracker adopts a detection-based tracking framework, which includes feature extraction, model construction, filter training, target center detection, and model update.
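The pipeline steps listed above (feature extraction, model construction, filter training, target center detection, model update) can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: it uses a linear, MOSSE-style multi-channel correlation filter, which may differ from the filter WMFF actually uses; random arrays stand in for real gradient, color, and gray features; and the weights, channel counts, and function names (`gaussian_label`, `fuse`, `train`, `detect`, `update`) are placeholders rather than the authors' implementation.

```python
import numpy as np


def gaussian_label(h, w, sigma=2.0):
    """Desired response: a Gaussian peak at index (0, 0), wrapped circularly."""
    dy = np.minimum(np.arange(h), h - np.arange(h))
    dx = np.minimum(np.arange(w), w - np.arange(w))
    return np.exp(-0.5 * (dy[:, None] ** 2 + dx[None, :] ** 2) / sigma ** 2)


def fuse(feature_maps, weights):
    """Model construction: scale each (H, W, C_i) feature map by its weight
    and stack all channels into one (H, W, sum of C_i) array."""
    return np.concatenate([w * f for f, w in zip(feature_maps, weights)], axis=2)


def train(x, y, lam=1e-2):
    """Filter training: ridge regression solved per frequency in the Fourier domain."""
    X = np.fft.fft2(x, axes=(0, 1))
    Y = np.fft.fft2(y)[:, :, None]
    numerator = np.conj(Y) * X
    denominator = np.sum(np.abs(X) ** 2, axis=2) + lam
    return numerator, denominator


def detect(model, z):
    """Target center detection: index of the peak of the filter response."""
    numerator, denominator = model
    Z = np.fft.fft2(z, axes=(0, 1))
    response = np.fft.ifft2(np.sum(np.conj(numerator) * Z, axis=2) / denominator).real
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)


def update(model, new_model, lr=0.02):
    """Model update: linear interpolation between old and new filter terms."""
    return tuple((1 - lr) * old + lr * new for old, new in zip(model, new_model))


rng = np.random.default_rng(0)
weights = [0.5, 0.25, 0.25]  # hypothetical weight vector (gradient, color, gray)

# Feature extraction stand-ins: random maps with placeholder channel counts.
frame1 = [rng.standard_normal((32, 32, c)) for c in (4, 3, 1)]
# The second frame is the first one shifted circularly by 3 rows and 5 columns.
frame2 = [np.roll(f, shift=(3, 5), axis=(0, 1)) for f in frame1]

label = gaussian_label(32, 32)
model = train(fuse(frame1, weights), label)
shift = detect(model, fuse(frame2, weights))

# A real tracker would re-crop the patch around the detected center before
# retraining; this line only illustrates the interpolation step.
model = update(model, train(fuse(frame2, weights), label))
```

Scaling each feature group before stacking is one simple way to let a weight vector control how much each feature contributes to the filter response; the abstract does not specify where in the pipeline WMFF applies its weights.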
Result
A total of 100 video sequences from the object tracking benchmark (OTB-100) dataset are used in the experiments to compare the proposed tracker with seven other state-of-the-art trackers, five of which are CFTs. The video sequences are annotated with 11 attributes, such as illumination variation, occlusion, and scale variation. The trackers are compared and analyzed using precision, average center error, average Pascal VOC overlap ratio, and median frames per second as evaluation metrics. Precision and success plots on different datasets are also presented, and performance under different attributes is discussed. Experimental results on OTB-100 demonstrate that the proposed tracker runs in real time and outperforms the other methods, especially on the Basketball, DragonBaby, Panda, and Lemming sequences. The edge contours, especially the gradient information of the target, are unremarkable when the scene is subject to motion blur due to occlusion or deformation, so an appearance model constructed from the gradient feature alone cannot distinguish the target accurately and tracking failure easily occurs. In contrast, when the gradient feature becomes ineffective, the WMFF tracker can use the color feature as a timely supplement in constructing the appearance model and thus obtain robust tracking. The color feature has the same level of importance as the gradient feature and achieves an ideal feature combination effect. The proposed method outperforms the other algorithms on multiple datasets, and the average results on OTB-100 show that its precision is 1.2% higher than that of the scale-adaptive kernel correlation filter tracker with feature integration.
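For reference, the evaluation metrics named above can be computed as in the following sketch. It assumes boxes in (x, y, width, height) format and the 20-pixel precision threshold that is conventional for OTB; the function names and sample boxes are illustrative, and frames per second is omitted because it depends on the runtime environment.

```python
import numpy as np


def center_errors(pred, gt):
    """Euclidean distance between predicted and ground-truth box centers, per frame."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    pred_centers = pred[:, :2] + pred[:, 2:] / 2
    gt_centers = gt[:, :2] + gt[:, 2:] / 2
    return np.linalg.norm(pred_centers - gt_centers, axis=1)


def precision(pred, gt, threshold=20.0):
    """Fraction of frames whose center error is within the threshold (pixels)."""
    return float(np.mean(center_errors(pred, gt) <= threshold))


def overlap_ratios(pred, gt):
    """Pascal VOC overlap (intersection over union) per frame."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    left = np.maximum(pred[:, 0], gt[:, 0])
    top = np.maximum(pred[:, 1], gt[:, 1])
    right = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    bottom = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(right - left, 0, None) * np.clip(bottom - top, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / union


# Two illustrative frames: a perfect hit, then a box 30 pixels off target.
pred = [[0, 0, 10, 10], [30, 0, 10, 10]]
gt = [[0, 0, 10, 10], [0, 0, 10, 10]]

errors = center_errors(pred, gt)
mean_error = float(errors.mean())
prec = precision(pred, gt)
overlaps = overlap_ratios(pred, gt)
```

Averaging `errors` and `overlaps` over all frames of a sequence gives the average center error and the average overlap ratio for that sequence.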
Conclusion
In this study, a weight vector is introduced to combine features in describing the appearance of the target, and a WMFF tracker is proposed based on a CFT framework. A new hybrid feature, HCG, which is dominated by the gradient feature and supplemented by color and gray features, is used to model the appearance of the target. This model compensates for the deficiencies of any single feature and lets each feature play its role. It not only makes the features complement one another but also allows the appearance model to adapt to multiple complex scenes. The WMFF tracker tracks targets for longer than other trackers in complex dynamic scenes and improves the robustness of the algorithm.