

Abstract
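Objective To address the complex backgrounds, illumination changes, fast motion, rotation, and similar problems that tracked targets face in real-world scenes, a correlation filter tracking algorithm based on adaptive multi-feature fusion is proposed. Method The HOG features of the target are extracted, and a convolutional neural network is used to extract high-level and low-level convolutional features. An adaptive threshold segmentation method evaluates the effectiveness of each feature to obtain the fusion weights. The response maps of the features are fused according to these weight coefficients, and the new estimated target position is obtained from the fused response map. A scale correlation filter is then used to compute the target scale, which completes the tracking step. Result Experiments are carried out on the public OTB-2013 dataset. On the basis of an analysis of the multi-feature fusion, the tracking performance of the algorithm is tested under 11 different attributes and compared with 7 currently popular algorithms. The results show that the proposed algorithm ranks first in both success rate and accuracy; compared with the baseline algorithm DSST, tracking accuracy improves by 4% and success rate by 6%. The algorithm is also more robust than other mainstream algorithms in complex scenes. Conclusion The proposed algorithm takes the DSST correlation filter tracker as its baseline. Because different features describe different characteristics of the image, an adaptive threshold segmentation method is used to evaluate the effectiveness of each feature, and the two layers of convolutional features and the HOG features are fused adaptively, so that a single feature with stronger discriminative power receives a larger fusion weight. The appearance model of the target is thus represented well, and the algorithm shows high tracking accuracy in scenes with complex backgrounds, target disappearance, illumination changes, fast motion, and rotation.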
Correlation filter target tracking algorithm based on adaptive multi-feature fusion


Objective Target tracking is one of the basic problems in computer vision and is widely used in security monitoring, military applications, autonomous driving, and other fields. Tracking algorithms based on correlation filtering have developed rapidly in recent years because they are fast and efficient. However, because of background clutter, illumination variation, fast motion, rotation, and other complicating factors, designing a robust tracking algorithm remains very challenging. Building an effective appearance model is a key factor in the success of a tracking algorithm. Current appearance models fall into two main types. The first is based on hand-crafted features. The most common hand-crafted feature is the histogram of oriented gradients (HOG), which describes the contour and shape of the target well by computing the gradient orientations in local regions of the image. The second is based on deep learning. Low-level convolutional features carry rich texture information but adapt poorly to background changes. High-level convolutional features contain rich semantic information that distinguishes the target from the background even in complex scenes. Because different features describe different characteristics of the image, this paper proposes a correlation filter tracking method with adaptive multi-feature fusion. Method The DSST correlation filter is used as the baseline algorithm, and the conv1 and conv5 layers of the convolutional neural network (CNN) imagenet-vgg-2048 are used. First, the HOG features of the target are extracted, the high-level and low-level convolutional features of the target are extracted with the CNN, and the response map of each feature is obtained. The peak value and shape of a response map reflect how accurate the tracking result is.
Second, to evaluate the validity of each feature, the peak area ratio is proposed as a new index of the confidence of a correlation response map, and the validity of each feature is evaluated with an adaptive threshold segmentation method. The sharper the peak of the response map and the smoother its surroundings, the more reliable the tracking result. The fusion weights of the features are then obtained, so that a more effective feature receives a larger fusion coefficient. Finally, the response maps of the features are fused according to the fusion coefficients, and the target position is determined by the maximum value of the fused response map. A scale correlation filter is then applied to estimate the target scale, which makes the tracking scale-adaptive. Result To evaluate the proposed method, the algorithm is tested on the public dataset OTB-2013. Its 50 videos cover 11 different challenges encountered in target tracking (including background clutter, deformation, target disappearance, and scale variation). The algorithm is compared with seven mainstream algorithms, with accuracy and success rate as the performance indicators. These algorithms fall into two categories: the first comprises the representative, top-ranking traditional tracking algorithms ASLA, SCM, and TLD; the second comprises the correlation filtering algorithms CFNet, KCF, DSST, and SAMF. The experimental results show that the proposed algorithm achieves the highest success rate and accuracy among the compared algorithms.
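The two indicators named above can be illustrated with a minimal sketch. It assumes the usual OTB benchmark conventions, which the abstract does not state explicitly: accuracy is taken as the fraction of frames whose center location error is within 20 pixels, and success rate as the area under the curve of overlap success over thresholds from 0 to 1. The function names, the `(x, y, width, height)` box layout, and the 21-step threshold grid are illustrative assumptions.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), assumed layout


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision(preds: List[Box], gts: List[Box], pixel_threshold: float = 20.0) -> float:
    """Fraction of frames whose center error is within the pixel threshold.

    The 20-pixel default is the common OTB choice, assumed here.
    """
    hits = sum(1 for p, g in zip(preds, gts) if center_error(p, g) <= pixel_threshold)
    return hits / len(preds)


def success_auc(preds: List[Box], gts: List[Box], steps: int = 21) -> float:
    """Mean success rate over overlap thresholds 0, 0.05, ..., 1 (assumed grid)."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(1 for o in overlaps if o > t) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)
```

Given per-frame predicted and ground-truth boxes for a sequence, `precision(preds, gts)` and `success_auc(preds, gts)` would give the two scores for that sequence under these assumptions.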
The accuracy of the proposed method on the OTB-2013 dataset is 77.8%, compared with CFNet (76.1%), DSST (74.6%), KCF (73.5%), SAMF (72.5%), and the traditional algorithm SCM (67.8%). The success rate of the proposed method is 71.5%, compared with CFNet (71.4%), DSST (67.5%), SAMF (66.2%), and KCF (61.1%). Compared with the baseline DSST, the proposed algorithm improves tracking accuracy by 4% and success rate by 6%. These results indicate that the method effectively improves tracking performance. Under seven attributes, namely background clutter (BC), deformation (DEF), out-of-view (OV), illumination variation (IV), in-plane rotation (IPR), out-of-plane rotation (OPR), and fast motion (FM), the proposed method ranks first in accuracy compared with CFNet, DSST, SAMF, and KCF. It also has the highest success rate among the compared algorithms under nine attributes. Conclusion Because different features describe different characteristics of the image, this paper proposes a correlation filter tracking method with adaptive multi-feature fusion. HOG features are extracted, and a convolutional neural network is used to extract high-level and low-level convolutional features. An adaptive threshold segmentation method is proposed to evaluate the validity of each feature, and the two layers of convolutional features and the HOG features are fused adaptively. Based on this validity analysis, the response maps are fused according to the fusion coefficients. Unlike most feature fusion methods, which combine features serially or in parallel, this algorithm gives a larger fusion weight to a single feature with stronger discriminative power, so the appearance model of the target is represented better. The experimental results show that the proposed algorithm exhibits strong robustness and tracking accuracy in scenarios with low resolution and scale variation.
Tracking under occlusion, motion blur, and fast motion will be studied further in future work.
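The adaptive fusion step described in the Method part can be sketched as follows. This is a minimal illustration and not the paper's formulation: the abstract does not give the threshold rule or the weight formula, so the threshold (halfway between the mean and the maximum of each map) and the weighting (inversely proportional to the peak area ratio) are assumptions, as are all function names. The sketch only captures the stated idea that a sharper peak, meaning a smaller area above the threshold, should earn a larger fusion weight.

```python
from typing import List, Tuple

Map = List[List[float]]  # a 2-D response map stored as rows of floats


def peak_area_ratio(resp: Map) -> float:
    """Fraction of cells at or above an adaptive threshold (smaller = sharper peak)."""
    values = [v for row in resp for v in row]
    mean = sum(values) / len(values)
    # Hypothetical adaptive threshold: halfway between the mean and the maximum.
    thresh = 0.5 * (mean + max(values))
    above = sum(1 for v in values if v >= thresh)  # the maximum always counts
    return above / len(values)


def fusion_weights(maps: List[Map]) -> List[float]:
    """Assumed rule: weight inversely proportional to peak area ratio, normalized."""
    inv = [1.0 / peak_area_ratio(m) for m in maps]
    total = sum(inv)
    return [x / total for x in inv]


def fuse(maps: List[Map], weights: List[float]) -> Map:
    """Weighted sum of same-sized response maps."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, maps)) for c in range(cols)]
        for r in range(rows)
    ]


def argmax2d(resp: Map) -> Tuple[int, int]:
    """Row and column of the largest response (first one on ties)."""
    cells = [(r, c) for r in range(len(resp)) for c in range(len(resp[0]))]
    return max(cells, key=lambda rc: resp[rc[0]][rc[1]])


# Toy example: one sharp map and one broad map that disagree on the position.
sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
broad = [[0.7, 0.7, 0.5], [0.7, 0.7, 0.5], [0.5, 0.5, 0.5]]
weights = fusion_weights([sharp, broad])
position = argmax2d(fuse([sharp, broad], weights))
```

In this toy case the sharp map has one cell above its threshold and the broad map has four, so the sharp map receives the larger weight and the fused maximum stays at its peak. In the tracker described above there would be three maps (HOG, conv1, conv5) instead of two.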