Robust visual tracking via adaptive fusion of multiple correlation filters
2018, Vol. 23, No. 2: 269-276
Received: 2017-07-18
Revised: 2017-09-17
Published in print: 2018-02-16
DOI: 10.11834/jig.170387
Objective
Visual object tracking remains a challenging task because the target may undergo pose variation, occlusion, and background clutter in complex scenes. Recently, discriminative correlation filter methods have been applied successfully and widely to visual tracking. The standard correlation filter method obtains a large number of training samples by cyclic shifts and solves for the filter with the fast Fourier transform, which makes it fast and robust; however, the poor training samples caused by boundary shifts degrade the tracking performance. The spatially regularized correlation filter tracker introduces a spatial weight function that strengthens the filter over the target region and makes the difference between positive and negative samples more distinct. It enlarges the target search area, but it also increases the computation time; moreover, in complex scenes where the target deforms irregularly or the background resembles the target, the background filter coefficients are strengthened as well, which can lead to tracking failure.
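As an illustration of the cyclic-shift/FFT machinery described above, here is a minimal single-channel, MOSSE/KCF-style sketch in NumPy. It is not the paper's implementation; the function names, the Gaussian label construction, and the regularization value are my own assumptions.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    # Desired response: a 2-D Gaussian, rolled so its peak sits at (0, 0),
    # corresponding to a zero-displacement target.
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2.0 * sigma ** 2))
    return np.roll(g, (-(h // 2), -(w // 2)), axis=(0, 1))

def train_filter(patch, label, lam=1e-2):
    # Ridge regression over all cyclic shifts of the patch, solved in the
    # Fourier domain: H = (G * conj(F)) / (F * conj(F) + lambda).
    F = np.fft.fft2(patch)
    G = np.fft.fft2(label)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H, patch):
    # Correlation response of a new patch; the response peak gives the
    # estimated translation (wrapped to signed displacements).
    resp = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(resp), resp.shape)
    h, w = resp.shape
    return (dy - h if dy > h // 2 else dy,
            dx - w if dx > w // 2 else dx)

# Toy check: the filter trained on a patch should recover a cyclic shift.
rng = np.random.default_rng(0)
patch = rng.standard_normal((64, 64))
H = train_filter(patch, gaussian_label(64, 64))
print(detect(H, np.roll(patch, (5, -3), axis=(0, 1))))  # ~ (5, -3)
```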
Method
To address these problems, an adaptive fusion of multiple correlation filters is proposed in this paper. The unconstrained correlation filter tracking problem is transformed into a constrained problem with two subproblems via the alternating direction method of multipliers (ADMM), and the two subproblems are solved with different correlation filter methods. First, standard correlation filters are used to locate the target coarsely; the target is then relocated with spatially regularized correlation filters, which fine-tunes both the target position and the filter template and improves the tracking performance.
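The splitting into two subproblems follows the generic scaled-form ADMM template shown below, where f stands for the correlation filter data term and g for the spatial regularizer; this is the standard ADMM formulation, not the paper's exact objective. In the scheme above, the two subproblems are handled by the standard and the spatially regularized correlation filters, respectively.

$$
\begin{aligned}
&\min_{h,\,z}\ f(h) + g(z) \quad \text{s.t.}\ h = z,\\
&h^{k+1} = \arg\min_{h}\ f(h) + \frac{\rho}{2}\left\lVert h - z^{k} + u^{k}\right\rVert_2^2,\\
&z^{k+1} = \arg\min_{z}\ g(z) + \frac{\rho}{2}\left\lVert h^{k+1} - z + u^{k}\right\rVert_2^2,\\
&u^{k+1} = u^{k} + h^{k+1} - z^{k+1}.
\end{aligned}
$$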
Result
The proposed algorithm is evaluated on the 100 videos of the OTB-2015 benchmark dataset and compared with other state-of-the-art trackers, with the center location error and the bounding-box overlap rate as evaluation criteria. The algorithm handles scale variation, pose variation, and background clutter well, and it achieves the best results on the CarScale, Freeman4, and Girl sequences, among others. Over the 100 videos, the average center location error is 28.55 pixels and the average overlap rate is 61%, better than the other compared trackers that use hand-crafted features. Compared with a correlation filter tracker that uses deep CNN features, the average center location error of our algorithm is 6 pixels larger, but the average overlap rate is 4% higher.
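For reference, the two OTB-2015 evaluation criteria used above are easy to state in code. A minimal sketch, assuming boxes in (x, y, width, height) format:

```python
import numpy as np

def center_error(box_a, box_b):
    # Euclidean distance between the two box centers, in pixels.
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return float(np.hypot((ax + aw / 2) - (bx + bw / 2),
                          (ay + ah / 2) - (by + bh / 2)))

def overlap_rate(box_a, box_b):
    # Intersection-over-union of two axis-aligned boxes.
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

print(center_error((0, 0, 10, 10), (3, 4, 10, 10)))  # 5.0
print(overlap_rate((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0
```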
Conclusion
Extensive experimental results show that the proposed algorithm achieves good accuracy and robustness under appearance changes such as pose variation and scale variation.
Danelljan M, Häger G, Shahbaz Khan F, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 4310-4318. [DOI: 10.1109/ICCV.2015.490]
Wang N Y, Yeung D Y. Learning a deep compact image representation for visual tracking[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada: Curran Associates Inc., 2013: 809-817.
Wang L J, Ouyang W L, Wang X G, et al. Visual tracking with fully convolutional networks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 3119-3127. [DOI: 10.1109/ICCV.2015.357]
Ma C, Huang J B, Yang X K, et al. Hierarchical convolutional features for visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 3074-3082. [DOI: 10.1109/ICCV.2015.352]
Bolme D S, Beveridge J R, Draper B A, et al. Visual object tracking using adaptive correlation filters[C]//Proceedings of 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 2544-2550. [DOI: 10.1109/CVPR.2010.5539960]
Henriques J F, Caseiro R, Martins P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer-Verlag, 2012: 702-715. [DOI: 10.1007/978-3-642-33765-9_50]
Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. [DOI: 10.1109/TPAMI.2014.2345390]
Danelljan M, Shahbaz Khan F, Felsberg M, et al. Adaptive color attributes for real-time visual tracking[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 1090-1097. [DOI: 10.1109/CVPR.2014.143]
Zhang K H, Zhang L, Liu Q S, et al. Fast visual tracking via dense spatio-temporal context learning[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 127-141. [DOI: 10.1007/978-3-319-10602-1_9]
Danelljan M, Häger G, Shahbaz Khan F, et al. Accurate scale estimation for robust visual tracking[C]//Proceedings of 2014 British Machine Vision Conference. Nottingham, United Kingdom: BMVA Press, 2014. [DOI: 10.5244/C.28.65]
Wu Y, Lim J, Yang M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848. [DOI: 10.1109/TPAMI.2014.2388226]
Jia X, Lu H C, Yang M H. Visual tracking via adaptive structural local sparse appearance model[C]//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE, 2012: 1822-1829. [DOI: 10.1109/CVPR.2012.6247880]
Zhang J M, Ma S G, Sclaroff S. MEEM: robust tracking via multiple experts using entropy minimization[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 188-203. [DOI: 10.1007/978-3-319-10599-4_13]
Kalal Z, Mikolajczyk K, Matas J. Tracking-learning-detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422. [DOI: 10.1109/TPAMI.2011.239]
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014. http://arxiv.org/abs/1409.1556