Published: 2019-11-16
DOI: 10.11834/jig.190059
2019 | Volume 24 | Number 11

Image Analysis and Recognition

Long-term target tracking based on perceptual model

Zhang Bo¹, Jiang Feibo², Liu Gang^1,3

1. College of Information Science and Engineering, Changsha Normal University, Changsha 410100, China;

2. College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China;

3. College of Physical Science and Electronics, Central South University, Changsha 410083, China

Received: 2019-03-01; Revised: 2019-06-05; Preprint: 2019-06-12

Supported by: Young Scientists Fund of National Natural Science Foundation of China (41604117); Ministry of Education Industry-University Cooperative Education Project (201801097009)

First author: Zhang Bo, born in 1980, male, senior experimentalist. His research interests include intelligent perception and control, and target detection and tracking. E-mail:zb801121@126.com;
Jiang Feibo, male, associate professor. His research interests include image processing and information security, and engineering applications of artificial intelligence. E-mail:156151245@qq.com;
Liu Gang, male, lecturer. His research interests include compressed sensing and channel estimation. E-mail:125897360@qq.com.

CLC number: TP391

Document code: A

Article ID: 1006-8961(2019)11-1906-12

Abstract

Objective Visual target tracking is an important problem in machine vision. Its core task is to locate the target in a continuous video sequence and estimate its motion trajectory. It has been widely used in human-computer interaction, security monitoring, autonomous driving, navigation, and positioning. Through extensive research in recent years, visual target-tracking technology has gradually matured. However, tracking targets accurately in complex scenes, such as intense illumination change, occlusion, deformation, scale change, and background clutter, remains a challenging task. Visual target-tracking algorithms can be divided into generative and discriminative methods. Generative tracking converts the tracking problem into a nearest-neighbor search for the target model, constructs the target model by using templates or sparse representation in a subspace, and tracks the target by searching for the region most similar to the target model. Discriminative tracking treats tracking as a binary classification problem and separates the target from the background by training a classifier. Because generative trackers need to construct a complex target appearance model, their computational complexity is high and their real-time performance is poor. Discriminative trackers train a classifier online with samples of the target and the surrounding background and track by detection. The classifier obtains considerable background information during training, so it distinguishes foreground from background better, and its performance is generally better than that of generative trackers. Among discriminative trackers, correlation-filter-based algorithms perform particularly well. The traditional correlation-filtering algorithm introduces dense sampling and uses cyclically shifted versions of the base sample as training samples, which greatly improves the classification ability of the filter. A kernel strategy maps the ridge regression problem from linear space to nonlinear space, and the discrete Fourier transform moves the computation from the time domain to the frequency domain, which greatly reduces algorithm complexity. Despite these advantages, the traditional correlation-filtering algorithm has two shortcomings. First, it trains the classifier with synthetic negative samples generated by cyclic shifts rather than real ones, which limits the classifier's classification ability. Second, when the target is severely occluded, incorrect samples (predicted target images) caused by the occlusion are used to update the classifier. As the occlusion continues, the classifier accumulates noise and gradually loses its discriminative power, which causes tracking failure. To address these problems, this study proposes a long-term target-tracking algorithm based on a perceptual model. The algorithm introduces a background-aware strategy to solve the lack of real negative samples in traditional correlation filtering and an occlusion-aware strategy to track occluded targets effectively.

Method The proposed algorithm first increases the number of training samples by enlarging the sampling area. A cropping matrix is then introduced to crop the shifted samples and obtain complete and valid samples, which also overcomes the boundary effect caused by cyclically shifted samples. A classification pool is subsequently constructed from the classifiers corresponding to a certain number of unoccluded frames. Finally, under severe occlusion, the optimal classifier is selected from the classification pool by minimizing an energy function and is used for redetection, which achieves long-term target tracking.

Result The performance of the proposed algorithm is evaluated on a public data set. The proposed algorithm has a success rate of 0.990 and an accuracy of 0.988. These values are respectively 2.7% and 2.5% higher than those of the background-aware correlation filter (BACF) algorithm. Because of the background-aware and occlusion-aware strategies, the overall success rate and accuracy of the proposed algorithm are considerably higher than those of the other algorithms, and the tracking accuracy on single sequences is also high. However, other algorithms have advantages in specific scenarios, and the proposed algorithm does not rank first in accuracy and success rate on every sequence. The perception modules also raise the time cost slightly, so real-time performance is insufficient.

Conclusion Experiments show that the proposed algorithm can still track a target fairly accurately under occlusion, deformation, scale change, and complex background, with high accuracy and robustness.

Key words

target tracking; circular convolution; background perception; heavy occlusion; classification pool

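0 Introduction

Visual target tracking is an important problem in machine vision. Its core task is to locate the target in a continuous video sequence and estimate its motion trajectory. It has been widely applied in human-computer interaction, security monitoring, autonomous driving, navigation and positioning, and many other fields. In recent years, through in-depth research by many scholars at home and abroad, visual target-tracking technology has gradually matured. However, accurately tracking a target in complex scenes such as drastic illumination change, occlusion, deformation, scale change, and background clutter remains a challenging task^[1-2].

Existing visual target-tracking algorithms can be roughly divided into two categories: generative and discriminative methods. Generative methods convert tracking into a nearest-neighbor search for the target model, build the target model from templates or sparse representation in a subspace, and track the target by searching for the region most similar to the target model^[3-5]. Discriminative methods treat tracking as a binary classification problem and separate the target from the background by training a classifier^[6-10]. Generative trackers need to construct a complex target appearance model, so their computational complexity is high and their real-time performance is poor. Discriminative trackers train a classifier online with samples of the target and the surrounding background and track by detection. Because the classifier obtains a large amount of background information during training, it distinguishes foreground from background well, and its performance is generally better than that of generative methods.

Among existing discriminative tracking algorithms, correlation-filter-based methods are widely used in visual target tracking because of their good tracking performance and high running speed. Bolme et al.^[11] proposed an adaptive correlation-filter tracking algorithm, which first introduced the correlation filter (CF) into target tracking and exploited the fast discrete Fourier transform to convert convolution in the time domain into element-wise multiplication in the frequency domain, significantly improving running efficiency. Henriques et al.^[12] proposed a high-speed tracking algorithm with kernelized correlation filters, which uses cyclic shifts to augment the training samples and greatly improves tracking precision. Li et al.^[13] proposed a scale-adaptive correlation-filter algorithm with feature integration, which fuses color features with histogram of oriented gradient features to improve feature representation and further raise tracking precision. Zhang et al.^[14] introduced the Hellinger distance into the original compressive tracking algorithm to measure feature reliability, selected highly reliable features to build a Bayesian classifier, and combined it with a covariance matrix to improve target representation, which strengthened the algorithm under drastic scale and illumination changes. For long-term tracking, Ma et al.^[15] proposed a long-term target-tracking algorithm that redetects the target with an online random fern detector when tracking fails, which improves, to some extent, the tracker's ability to handle complex scenes such as occlusion and out-of-view. Choi et al.^[16] proposed visual tracking with attention-modulated disintegration and integration, which divides the tracker into a disintegration stage and an integration stage. In the disintegration stage, the target is trained by multiple elementary trackers with various types of features and kernels, which improves the ability to distinguish the target from cluttered background. In the integration stage, the responses of the elementary trackers are combined according to memorized priority and reliability so that the tracker adapts quickly to abrupt appearance changes, enabling long-term tracking. Ge et al.^[17] proposed a long-term tracking algorithm based on feature fusion, which fuses HOG (histogram of oriented gradient), CN (color name), and locality-sensitive histogram features to improve feature representation, and combines the Edge Boxes method with a structured support vector machine to redetect the target after tracking failure, giving the algorithm high robustness in long-term tracking.

All of the above correlation-filter-based trackers perform well in applications. In long-term tracking, however, large deformation between adjacent frames or severe occlusion of the target easily causes noisy updates and therefore tracking failure. To address this problem, this paper proposes a long-term target-tracking algorithm based on a perceptual model. A cropping matrix is introduced into the traditional correlation filter to build a background-aware correlation filter, which provides a large number of real samples and makes the negative samples more realistic. Under severe occlusion, occlusion awareness is achieved by using an energy function to select the best classifier from a constructed classification pool for redetection, which effectively improves precision and robustness in long-term tracking. Finally, the proposed algorithm is tested on the 2015 object tracking benchmark data set.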

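1 Related work

The kernelized correlation filter (KCF) tracker is one of the classic traditional correlation-filter trackers. It introduces dense sampling and uses the cyclic shifts of a base sample as training samples, which greatly improves the classification ability of the trained filter. It also introduces a kernel strategy that maps the ridge regression problem from linear space to nonlinear space, and uses the discrete Fourier transform to move the computation from the time domain to the frequency domain. This greatly reduces algorithm complexity and meets the real-time requirements of practical applications.

The proposed long-term tracking algorithm is built on the kernelized correlation filter, which treats tracking as a binary classification problem and trains the filter with a ridge regression model. The ridge regression objective can be written as

$ \mathop {\min }\limits_\mathit{\boldsymbol{h}} \sum\limits_{i=1}^m {{{\left( {f\left( {{\mathit{\boldsymbol{x}}_i}} \right) - {\mathit{\boldsymbol{y}}_i}} \right)}^2}} + \lambda \parallel \mathit{\boldsymbol{h}}\parallel^2 $

(1)

where ${\mathit{\boldsymbol{x}}_i}$ is a training sample, ${\mathit{\boldsymbol{y}}_i}$ is the desired output response, $\mathit{\boldsymbol{h}}$ is the filter parameter, $\lambda$ is the regularization parameter that controls overfitting, and $f$ is the classification function, which can be written as a linear combination of the base samples: $f\left( \mathit{\boldsymbol{x}} \right) = {\mathit{\boldsymbol{h}}^{\rm{T}}}\mathit{\boldsymbol{x}}$. This ridge regression has the closed-form solution $\boldsymbol{h}=\left(\boldsymbol{X}^{\mathrm{T}} \boldsymbol{X}+\lambda \boldsymbol{I}\right)^{-1} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{y}$, where $\boldsymbol{X}$ is the matrix of cyclically shifted samples, whose $i$th row is the $i$th sample, and $\boldsymbol{I}$ is the identity matrix.

Because the training samples of the kernelized correlation filter are generated by cyclically shifting the base sample, $\boldsymbol{X}$ can be written in the circulant form $\boldsymbol{X}=\boldsymbol{F}^{\mathrm{H}} \operatorname{diag}(\boldsymbol{F} \boldsymbol{x}) \boldsymbol{F}$, where $\boldsymbol{F}$ is the discrete Fourier transform matrix and $\boldsymbol{F}^{\mathrm{H}}$ is the Hermitian transpose of $\boldsymbol{F}$. Substituting this into the ridge regression formula gives the closed-form solution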

$ \hat{\boldsymbol{h}}^{*}=\frac{\hat{\boldsymbol{x}}^{*} \odot \hat{\boldsymbol{y}}^{*}}{\hat{\boldsymbol{x}}^{*} \odot \hat{\boldsymbol{x}}+\lambda} $

(2)

where $\hat{\boldsymbol{x}} = \boldsymbol{F}\boldsymbol{x}$ is the discrete Fourier transform of $\boldsymbol{x}$, $\hat{\boldsymbol{x}}^{*}$ is the complex conjugate of $\hat{\boldsymbol{x}}$, and $\odot$ is the element-wise product. This solution holds for the linear case. To make nonlinear inputs linearly separable, the algorithm introduces a kernel strategy that maps the input features from a low-dimensional space to a high-dimensional space. Introducing the kernel function into the classification function gives $f(\boldsymbol{z}) = \boldsymbol{h}^{\rm{T}}\boldsymbol{z} = \sum\limits_{i = 1}^n {\boldsymbol{\alpha}_i} \boldsymbol{\kappa}\left( \boldsymbol{z}, \boldsymbol{x}_i \right)$, and the filter parameters in the dual space are then

$ {{\mathit{\boldsymbol{\hat h}}}^*} = \frac{{\mathit{\boldsymbol{\hat y}}}}{{{{\mathit{\boldsymbol{\hat \kappa }}}^{\mathit{\boldsymbol{xx}}}} + \lambda }} $

(3)

where $\hat{\boldsymbol{\kappa}}^{\boldsymbol{xx}}$ is the kernel correlation, for which a Gaussian kernel is usually used. Finally, correlating the kernel-mapped features with the filter parameters gives the target response score

$ \mathit{\boldsymbol{\hat v}}(\mathit{\boldsymbol{z}}) = {\left( {{{\mathit{\boldsymbol{\hat \kappa }}}^{\mathit{\boldsymbol{xz}}}}} \right)^*} \otimes \mathit{\boldsymbol{\hat h}} $

(4)

where $\boldsymbol{x}$ is the training sample, $\boldsymbol{z}$ is the current input image, and $\otimes$ denotes the correlation operation. The target center is the position of the maximum response score.
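The frequency-domain training of Eq. (2) and the kernel correlation that Eqs. (3) and (4) rely on can be checked numerically on a short 1-D signal. The following is a minimal sketch, not the authors' implementation: the helper names (`dft`, `idft`, `train_linear_filter`, `gaussian_kernel_correlation`), the signal values, and the bandwidth `sigma` are all assumptions made for illustration. It assumes the shift convention X[i][j] = x[(j - i) mod n] and an unnormalized forward DFT; under other conventions the conjugates in Eq. (2) fall on different factors.

```python
import cmath
import math


def dft(v):
    """Unnormalized forward DFT of a real or complex sequence."""
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def idft(vf):
    """Inverse DFT matching dft() above (includes the 1/n factor)."""
    n = len(vf)
    return [sum(vf[k] * cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)) / n
            for m in range(n)]


def train_linear_filter(x, y, lam):
    """Ridge regression over all cyclic shifts of x, solved per frequency."""
    xf, yf = dft(x), dft(y)
    hf = [a * b / (a.conjugate() * a + lam) for a, b in zip(xf, yf)]
    return [c.real for c in idft(hf)]


def normal_equation_residual(x, y, h, lam):
    """Largest entry of |(X^T X + lam I) h - X^T y| for the explicit circulant X."""
    n = len(x)
    X = [[x[(j - i) % n] for j in range(n)] for i in range(n)]
    Xh = [sum(X[i][j] * h[j] for j in range(n)) for i in range(n)]
    left = [sum(X[i][j] * Xh[i] for i in range(n)) + lam * h[j] for j in range(n)]
    right = [sum(X[i][j] * y[i] for i in range(n)) for j in range(n)]
    return max(abs(a - b) for a, b in zip(left, right))


def gaussian_kernel_correlation(x, z, sigma):
    """Gaussian kernel between x and every cyclic shift of z, computed via the DFT."""
    xf, zf = dft(x), dft(z)
    cross = [c.real for c in idft([a.conjugate() * b for a, b in zip(xf, zf)])]
    xx = sum(v * v for v in x)
    zz = sum(v * v for v in z)
    return [math.exp(-max(xx + zz - 2.0 * c, 0.0) / sigma ** 2) for c in cross]


def gaussian_kernel_direct(x, z, sigma):
    """Same kernel values computed shift by shift, for comparison."""
    n = len(x)
    return [math.exp(-sum((x[m] - z[(m + s) % n]) ** 2 for m in range(n)) / sigma ** 2)
            for s in range(n)]


x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0]   # base sample (illustrative values)
y = [1.0, 0.6, 0.1, 0.0, 0.1, 0.6]    # desired response peaked at shift 0
z = [0.0, 3.0, 1.0, 2.0, 0.5, -1.0]   # x delayed by two positions

h = train_linear_filter(x, y, lam=0.01)
residual = normal_equation_residual(x, y, h, lam=0.01)
k_fast = gaussian_kernel_correlation(x, z, sigma=2.0)
k_slow = gaussian_kernel_direct(x, z, sigma=2.0)
peak_shift = max(range(len(k_fast)), key=k_fast.__getitem__)
```

Here `residual` measures how well the frequency-domain solution satisfies the normal equations of the explicit circulant system, and `peak_shift` is the shift at which the kernel correlation peaks, which is how a displacement between `x` and `z` would be read off a response map.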

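2 Proposed algorithm

Traditional correlation-filter algorithms train the classifier with synthetic negative samples generated by cyclic shifts, which limits its classification ability. The proposed algorithm uses a background-aware correlation filter to obtain a large number of real negative samples for training and thereby improves classification ability. In addition, trackers based on traditional correlation filters lack an occlusion-aware module for long-term tracking and easily fail under severe occlusion. The proposed algorithm builds a classification pool that stores classifiers free of noisy updates and, under severe occlusion, uses an energy function to select the best classifier from the pool to redetect the target.

2.1 Improving the correlation filter with a background-aware strategy

Traditional correlation-filter tracking generates a large number of training samples by cyclic shifts. Such samples have two shortcomings. First, most negative samples produced by cyclic shifts are synthetic rather than real. Second, some of the shifted samples are discontinuous, which easily leads to the boundary effect. To address these problems, the proposed algorithm enlarges the sampling area to increase the number of training samples and introduces a cropping matrix that crops the shifted samples to obtain complete and valid samples, which also overcomes the boundary effect caused by cyclically shifted samples. The loss function used to train the filter after adding the cropping matrix is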

$ E(\boldsymbol{h})=\frac{1}{2} \sum\limits_{i=1}^{M}\left\|\boldsymbol{y}(i)-\sum\limits_{j=1}^{N} \boldsymbol{h}_{j}^{\mathrm{T}} \boldsymbol{P} \boldsymbol{x}_{j}\left[\Delta \boldsymbol{\psi}_{i}\right]\right\|_{2}^{2}+\frac{\lambda}{2} \sum\limits_{j=1}^{N}\left\|\boldsymbol{h}_{j}\right\|_{2}^{2} $

(5)

where $\boldsymbol{P}$ is a $D \times M$ binary matrix, namely the introduced cropping matrix, which crops the central part of the enlarged cyclic sample $\boldsymbol{x}_j$; $\boldsymbol{x}_j \in \boldsymbol{C}^M$, $\boldsymbol{y} \in \boldsymbol{C}^M$, $\boldsymbol{h}_j \in \boldsymbol{C}^D$; $M$ is the length of the sample $\boldsymbol{x}$ after the search region is enlarged, and $D$ is its length after cropping, so $M \gg D$. $\boldsymbol{y}\left( i \right)$ is the $i$th element of $\boldsymbol{y}$, $\Delta \boldsymbol{\psi}_i$ is the cyclic shift operator, and $\boldsymbol{x}_j\left[ \Delta \boldsymbol{\psi}_i \right]$ denotes the sample $\boldsymbol{x}_j$ after an $i$-step cyclic shift.
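The effect of the cropping matrix can be seen by counting how many training samples are real, contiguous windows of the signal. The sketch below is illustrative only and works on a toy 1-D signal: the helper names (`cyclic_shift`, `crop_center`, `is_real_window`) and the sizes M = 12 and D = 4 are assumptions, and `crop_center` stands in for multiplying by a binary matrix that keeps the central D entries.

```python
def cyclic_shift(x, s):
    """Return x cyclically shifted by s positions."""
    n = len(x)
    return [x[(t + s) % n] for t in range(n)]


def crop_center(x, d):
    """Stand-in for the binary cropping matrix P: keep the central d entries."""
    start = (len(x) - d) // 2
    return x[start:start + d]


def is_real_window(sample, signal):
    """True if sample appears as a contiguous, non-wrapped window of signal."""
    d = len(sample)
    return any(signal[k:k + d] == sample for k in range(len(signal) - d + 1))


signal = list(range(12))   # enlarged search region, M = 12 (distinct values)
D = 4                      # filter / target length after cropping

# With cropping: shift the whole enlarged sample, then crop its centre.
cropped = [crop_center(cyclic_shift(signal, s), D) for s in range(len(signal))]
real_with_crop = sum(is_real_window(c, signal) for c in cropped)

# Without cropping: cyclically shift only the target-sized patch.
patch = crop_center(signal, D)
plain = [cyclic_shift(patch, s) for s in range(D)]
real_without_crop = sum(is_real_window(p, signal) for p in plain)
```

In this toy setting, `real_with_crop` equals M - D + 1 = 9 of the 12 cropped samples, whereas only the unshifted patch is real when the target-sized patch itself is shifted (`real_without_crop` is 1).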

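As in traditional correlation-filter tracking, Eq. (5) can be transformed into the frequency domain to improve computational efficiency. The result is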

$ \begin{aligned} E(\boldsymbol{h}, \hat{\boldsymbol{b}}) &=\frac{1}{2}\|\hat{\boldsymbol{y}}-\hat{\boldsymbol{X}} \hat{\boldsymbol{b}}\|_{2}^{2}+\frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2} \\ \text { s. t. } & \hat{\boldsymbol{b}}=\sqrt{M}\left(\boldsymbol{F P}^{\mathrm{T}} \circ \boldsymbol{I}_{N}\right) \boldsymbol{h} \end{aligned} $

(6)

where $\hat{\boldsymbol{b}}$ is an auxiliary variable, $\hat{\boldsymbol{X}}=\left[\operatorname{diag}\left(\hat{\boldsymbol{x}}_{1}\right)^{\mathrm{T}}, \cdots, \operatorname{diag}\left(\hat{\boldsymbol{x}}_{N}\right)^{\mathrm{T}}\right]$, $\boldsymbol{h}=\left[\boldsymbol{h}_{1}^{\mathrm{T}}, \cdots, \boldsymbol{h}_{N}^{\mathrm{T}}\right]^{\mathrm{T}}$, $\hat{\boldsymbol{b}}=\left[\hat{\boldsymbol{b}}_{1}^{\mathrm{T}}, \cdots, \hat{\boldsymbol{b}}_{N}^{\mathrm{T}}\right]^{\mathrm{T}}$, $\boldsymbol{I}_N$ is the $N \times N$ identity matrix, and $\circ$ denotes the Kronecker product.

A Lagrangian is constructed for Eq. (6), and the correlation filter parameters are solved by alternating iterative optimization:

$ \mathit{\boldsymbol{h}} = {\left( {\mu + \frac{\lambda }{{\sqrt M }}} \right)^{ - 1}}(\mu \mathit{\boldsymbol{b}} + \mathit{\boldsymbol{\xi }}) $

(7)

where $\mu$ is the penalty factor and $\boldsymbol{\xi}$ is the vector of Lagrange multipliers, with $\boldsymbol{b}=\frac{1}{\sqrt{M}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \circ \boldsymbol{I}_{N}\right) \hat{\boldsymbol{b}}$ and $\boldsymbol{\xi}=\frac{1}{\sqrt{M}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \circ \boldsymbol{I}_{N}\right) \hat{\boldsymbol{\xi}}$. The auxiliary variable is solved as

$ \hat{\boldsymbol{b}}^{*}=\frac{1}{\mu}\left(M \hat{\boldsymbol{y}} \hat{\boldsymbol{x}}-\hat{\boldsymbol{\xi}}+\mu \hat{\boldsymbol{h}}\right)-\frac{\hat{\boldsymbol{x}}}{\mu q}\left(M \hat{\boldsymbol{y}} \hat{\boldsymbol{s}}_{x}-\hat{\boldsymbol{s}}_{\xi}+\mu \hat{\boldsymbol{s}}_{h}\right) $

(8)

where $\hat{\boldsymbol{s}}_{x}=\hat{\boldsymbol{x}}^{\mathrm{T}} \hat{\boldsymbol{x}}$, $\hat{\boldsymbol{s}}_{\xi}=\hat{\boldsymbol{x}}^{\mathrm{T}} \hat{\boldsymbol{\xi}}$, $\hat{\boldsymbol{s}}_{h}=\hat{\boldsymbol{x}}^{\mathrm{T}} \hat{\boldsymbol{h}}$, and $q$ is a scalar with value $\hat{\boldsymbol{s}}_{x}+M \mu$. In the proposed algorithm, the Lagrange multipliers are updated as

$ \hat{\boldsymbol{\xi}}_{i+1}=\hat{\boldsymbol{\xi}}_{i}+\mu\left(\hat{\boldsymbol{b}}_{i+1}-\hat{\boldsymbol{h}}_{i+1}\right) $

(9)

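2.2 Occlusion discrimination

When the target is severely occluded, incorrect samples (predicted target images) caused by the occlusion are used to update the classifier. As the occlusion lasts longer, the classifier accumulates noise and gradually loses its discriminative power, so tracking fails. To solve this problem, the proposed algorithm uses an occlusion-aware strategy to track occluded targets. The filter with the background-aware strategy first tracks the target without occlusion; for each frame $t$, a classifier $\boldsymbol{h}_t$ and a target image are obtained. A classification pool can therefore be built by saving the classifiers and target images of the previous $k$ unoccluded frames, and when severe occlusion occurs, a suitable strategy selects the best classifier and target image from the pool for tracking. Because classifiers far from the current frame are of little use for the current pool, only the classifiers and target images of the $k$ unoccluded frames nearest to the current frame are kept. The pool is updated gradually as new frames arrive and is not updated under partial or full occlusion. Fig. 1 illustrates the occlusion-aware strategy.

Fig. 1 Schematic diagram of the occlusion-aware strategy

Because traditional filters already perform well under partial occlusion, only full occlusion needs to be considered. Full occlusion generally has three stages: the target approaches the occluder; the target is fully or severely occluded; the target leaves the occluder. In stage 1 the target is still only partially occluded, so the update strategy of the background-aware filter tracks it fairly accurately. In stage 2 the best classifier is selected from the previously saved classification pool to replace the current classifier and track the target in the current frame. Stage 3 is again partial occlusion, so the classifier is still updated in the same way as the background-aware filter.

How to detect occlusion is the first problem in realizing this strategy, so the proposed algorithm introduces two distances to discriminate occlusion. The squared Euclidean distance is used as the distance measure: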

$ D\left( {{\mathit{\boldsymbol{X}}_1}, {\mathit{\boldsymbol{X}}_2}} \right) = \frac{1}{n}{\mathop{\rm tr}\nolimits} \left( {{{\left( {{\mathit{\boldsymbol{X}}_1} - {\mathit{\boldsymbol{X}}_2}} \right)}^{\rm{T}}}\left( {{\mathit{\boldsymbol{X}}_1} - {\mathit{\boldsymbol{X}}_2}} \right)} \right) $

(10)

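where ${\mathop{\rm tr}\nolimits}$ denotes the trace of a matrix and $\boldsymbol{X}_{1}, \boldsymbol{X}_{2} \in {\bf R}^{n \times n}$ are the features of two image patches. The first distance is defined as the occlusion threshold, which is determined from the distances between the target image patch and its eight surrounding image patches:

$ T_t^0 = \left\{ {\begin{array}{*{20}{l}} {(1 - \gamma )T_{t - 1}^0 + 0.95\gamma D_t^0}&{\text{frame } t \text{ is not occluded}}\\ {T_{t - 1}^0}&{\text{otherwise}} \end{array}} \right. $

(11)

where $\gamma$ is the learning rate and $D_t^0$ is the minimum squared Euclidean distance between the target image patch and its eight surrounding patches in frame $t$. The second distance is the target distance, which equals the minimum distance $D_{t-1}^{\min}$ between the target image patch of the previous frame and the image patches in the classification pool. Comparing these two distances is enough to decide whether the target is occluded. At frame $t$, if $D_{t-1}^{\min} > T_{t-1}^{0}$, the target was fully occluded in the previous frame, and the best classifier must be selected from the classification pool to track the target in the current frame. If $D_{t-1}^{\min} \le T_{t-1}^{0}$, the target in the previous frame was partially occluded or unoccluded, and the classifier is still updated with the target image of the previous frame. After the target position in the current frame is detected, the minimum distance $D_t^{\min}$ of the current frame is computed to judge occlusion in the current frame. If $D_{t}^{\min} < \eta T_{t-1}^{0}$, the target is not occluded, and the classifier used in the current frame is added to the classification pool. If $D_t^{\min} \ge \eta T_{t-1}^0$, the target is partially or fully occluded. Here $\eta$ is an adaptive parameter that adjusts the partial-occlusion threshold between adjacent frames.

The parameter $\eta$ is set according to the situation. Targets in some videos change greatly between two adjacent frames, so $\eta$ should be larger. Targets in other videos change only slightly between adjacent frames, so $\eta$ should be smaller. Specifically, $\eta$ is adjusted by the response score: if the distance between adjacent responses satisfies $\left|\hat{v}_{t}(\boldsymbol{x})-\hat{v}_{t-1}(\boldsymbol{x})\right| < 0.2$ and $\hat{v}_{t}(\boldsymbol{x}) > 0.3$, $\eta$ is set to 0.5; otherwise, $\eta$ is set to 0.8.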
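The decision rules of this subsection (Eqs. (10) and (11) and the choice of $\eta$) can be sketched in a few lines. This is a hedged illustration rather than the authors' code: the function names and the example numbers are assumptions, and feature blocks are represented as plain nested lists.

```python
def squared_distance(x1, x2):
    """Eq. (10): (1/n) * tr((X1 - X2)^T (X1 - X2)) for two n x n feature blocks.

    The trace equals the sum of all squared entry-wise differences.
    """
    n = len(x1)
    total = sum((a - b) ** 2 for r1, r2 in zip(x1, x2) for a, b in zip(r1, r2))
    return total / n


def update_threshold(t_prev, d0_t, gamma, unoccluded):
    """Eq. (11): update the occlusion threshold only on unoccluded frames."""
    if unoccluded:
        return (1.0 - gamma) * t_prev + 0.95 * gamma * d0_t
    return t_prev


def select_eta(v_t, v_prev):
    """Choose eta from the peak responses of the current and previous frames."""
    if abs(v_t - v_prev) < 0.2 and v_t > 0.3:
        return 0.5
    return 0.8


def previous_frame_fully_occluded(d_min_prev, t_prev):
    """Full occlusion in frame t-1 if the target distance exceeds the threshold."""
    return d_min_prev > t_prev


def current_frame_unoccluded(d_min_t, t_prev, eta):
    """Frame t counts as unoccluded (pool may be updated) if D_t^min < eta * T."""
    return d_min_t < eta * t_prev


# Illustrative values only.
target = [[1.0, 2.0], [3.0, 4.0]]
neighbour = [[1.0, 0.0], [0.0, 4.0]]
d = squared_distance(target, neighbour)                        # (4 + 9) / 2
t_new = update_threshold(1.0, 2.0, gamma=0.015, unoccluded=True)
eta = select_eta(v_t=0.9, v_prev=0.85)
```

With the learning rate 0.015 reported in Section 3.2, the threshold moves slowly, so a single unusual frame has little effect on it.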

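2.3 Selecting the best classifier with an energy function

When severe occlusion is detected, the best classifier must be selected from the classification pool for redetection. The proposed algorithm uses an energy function to measure the performance of the classifiers $\left\{\boldsymbol{h}_{t, k}^{p}\right\}_{k=1}^{K}$ in the pool and selects the best one. To simplify notation, $\boldsymbol{h}_{t, k}^p$ is written as $\boldsymbol{h}_k^p$ in the following formulas. The energy function is defined on a probabilistic model, so the probabilistic interpretation of a classifier is given first:

$ {P_k}\left( {{\rho _1}|{\mathit{\boldsymbol{x}}_{m, n}}} \right) = \left\{ {\begin{array}{*{20}{l}} {R_k^p\left( {{\mathit{\boldsymbol{x}}_{m, n}}} \right)}&{0 \le R_k^p\left( {{\mathit{\boldsymbol{x}}_{m, n}}} \right) \le 1}\\ 0&{R_k^p\left( {{\mathit{\boldsymbol{x}}_{m, n}}} \right) < 0}\\ 1&{R_k^p\left( {{\mathit{\boldsymbol{x}}_{m, n}}} \right) > 1} \end{array}} \right. $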

(12)

$ P_{k}\left(\rho_{2} | \boldsymbol{x}_{m, n}\right)=1-P_{k}\left(\rho_{1} | \boldsymbol{x}_{m, n}\right) $

(13)

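where the response $R_k^p\left( \boldsymbol{x}_{m, n} \right)$ is the likelihood of the sample $\boldsymbol{x}_{m, n}$, Eq. (12) is the normalized definition of this likelihood, $\rho_1$ is the label of target samples, and $\rho_2$ is the label of non-target samples. Based on these definitions, the energy function is defined as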

$ E\left(\boldsymbol{h}_{k}^{p}\right)=-L\left(\boldsymbol{X} ; \boldsymbol{h}_{k}^{p}\right)+\lambda_{a} \boldsymbol{Z}\left(\boldsymbol{I} | \boldsymbol{X} ; \boldsymbol{h}_{k}^{p}\right) $

(14)

where $\boldsymbol{X} = \left\{ \boldsymbol{x}_{m, n} | (m, n) \in \{ 0, \cdots , M - 1\} \times \{ 0, \cdots , N - 1 \} \right\}$ is the set of samples obtained by cyclic shifts and $\boldsymbol{I} = \left\{ \rho_1, \rho_2 \right\}$ is the set of sample labels. $L\left( \boldsymbol{X};\boldsymbol{h}_k^p \right)$ is the log-likelihood of the classifier and the samples. It keeps the energy function consistent with the classifier response score: the larger the maximum response, the better the classifier and the smaller the corresponding energy. It is defined as

$ L\left(\boldsymbol{X} ; \boldsymbol{h}_{k}^{p}\right)=\max\limits_{(m, n)} \lg P_{k}\left(\rho_{1} | \boldsymbol{x}_{m, n}\right) $

(15)

In Eq. (14), $\lambda_a \boldsymbol{Z}\left( \boldsymbol{I}|\boldsymbol{X};\boldsymbol{h}_k^p \right)$ is an entropy regularization term, which favors classifiers with low ambiguity. For a sample to be assigned label $\rho_1$ or label $\rho_2$, a classifier that gives high confidence to one label and low confidence to the other is preferred over a classifier that gives similar confidence to both labels. The term is computed as

$ \boldsymbol{Z}\left(\boldsymbol{I} | \boldsymbol{X} ; \boldsymbol{h}_{k}^{p}\right)=-\frac{1}{M N} \sum\limits_{m=1}^{M} \sum\limits_{n=1}^{N} \sum\limits_{i=1}^{2} P_{k}\left(\rho_{i} | \boldsymbol{x}_{m, n}\right) \lg P_{k}\left(\rho_{i} | \boldsymbol{x}_{m, n}\right) $

(16)

Finally, the best classifier $\boldsymbol{h}_k^p$ is selected from the classification pool by

$ {k^*} = \arg \mathop {\min }\limits_{k = 1}^K E\left( {\mathit{\boldsymbol{h}}_k^p} \right) $

(17)

The energy $E\left( \boldsymbol{h}_k^p \right)$ of each classifier $\boldsymbol{h}_k^p$ is computed as described above, and the classifier with the minimum energy is selected as the best classifier $\boldsymbol{h}_{k}^{p'}$.
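The selection rule in Eqs. (12) to (17) can be sketched as follows. This is only an illustrative reading of the formulas, not the authors' implementation: the function names are assumptions, each classifier is represented simply by its response map (a nested list), "lg" is taken as the base-10 logarithm, and the small constant `eps` that guards against log(0) is an added assumption.

```python
import math


def target_probability(r):
    """Eq. (12): clamp a raw response into [0, 1] to read it as P(rho_1 | x)."""
    return min(max(r, 0.0), 1.0)


def energy(responses, lambda_a, eps=1e-12):
    """Eq. (14): negative log-likelihood plus weighted entropy of one classifier."""
    probs = [target_probability(r) for row in responses for r in row]

    # Eq. (15): log-likelihood is the log of the largest target probability.
    log_likelihood = max(math.log10(max(p, eps)) for p in probs)

    # Eq. (16): mean binary entropy over all samples (0 * log 0 treated as 0).
    entropy = 0.0
    for p in probs:
        for q in (p, 1.0 - p):      # Eq. (13): P(rho_2 | x) = 1 - P(rho_1 | x)
            if q > 0.0:
                entropy -= q * math.log10(q)
    entropy /= len(probs)

    return -log_likelihood + lambda_a * entropy


def select_best(pool_responses, lambda_a):
    """Eq. (17): index of the classifier with the minimum energy."""
    energies = [energy(r, lambda_a) for r in pool_responses]
    return min(range(len(energies)), key=energies.__getitem__)


# Two hypothetical classifiers evaluated on the same 2 x 2 set of shifted samples.
pool_responses = [
    [[0.5, 0.5], [0.5, 0.5]],   # ambiguous: similar confidence everywhere
    [[0.0, 0.0], [0.0, 1.0]],   # confident: one clear peak
]
best = select_best(pool_responses, lambda_a=10.0)
```

The confident classifier has both a higher peak response and zero entropy, so it receives the lower energy and is selected; the value 10 for `lambda_a` matches the setting reported in Section 3.2.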

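2.4 Scale adaptation

To handle the challenge that drastic scale change poses to tracking, a scale pool strategy is used to select the best target scale. A scale pool $\boldsymbol{S} = \left\{ \boldsymbol{S}_1, \cdots , \boldsymbol{S}_n \right\}$ is built first, in which each scale has the form $\boldsymbol{s}_t = \left( \boldsymbol{s}_x, \boldsymbol{s}_y \right)$. The target center is then predicted with the method described above, and multiple target scales $s_{x}^{n} G \times s_{y}^{n} J$ are extracted with the scale pool, where $G$ and $J$ are the width and height of the target in the previous frame. The responses of the target at the different sizes are computed, and the scale with the maximum response is the best scale: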

$ \boldsymbol{S}^{\prime}=\arg \max \limits_{s}\left\{\max \left(\hat{\boldsymbol{v}}_{1}\right), \max \left(\hat{\boldsymbol{v}}_{2}\right), \cdots, \max \left(\hat{\boldsymbol{v}}_{n}\right)\right\} $

(18)
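Eq. (18) amounts to taking the peak of each per-scale response map and keeping the scale whose peak is largest. A minimal sketch follows; the function names, the three-entry scale pool, and the response values are assumptions for illustration (the experiments in Section 3.2 report a pool size of 29).

```python
def best_scale(scale_pool, response_maps):
    """Eq. (18): return the scale whose response map has the largest peak."""
    peaks = [max(max(row) for row in resp) for resp in response_maps]
    best = max(range(len(peaks)), key=peaks.__getitem__)
    return scale_pool[best], peaks[best]


def scaled_size(scale, width, height):
    """Target size for a scale (s_x, s_y) given the previous width G and height J."""
    s_x, s_y = scale
    return s_x * width, s_y * height


# Hypothetical scale pool and one response map per scale.
scale_pool = [(0.95, 0.95), (1.0, 1.0), (1.05, 1.05)]
response_maps = [
    [[0.10, 0.42], [0.31, 0.20]],
    [[0.12, 0.55], [0.40, 0.18]],
    [[0.09, 0.71], [0.33, 0.25]],
]

scale, peak = best_scale(scale_pool, response_maps)
new_size = scaled_size(scale, width=40, height=80)
```

In this example the third scale wins because its response map has the highest peak, so the target box would grow by 5% in each dimension.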

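3 Experiments

3.1 Comparison setup

To demonstrate the effectiveness of the proposed method, all tests were run in MATLAB 2015b on a computer with an Intel i5 processor and 8 GB of memory running Windows 10. The proposed algorithm is compared quantitatively and qualitatively with KCF (kernelized correlation filter)^[12], DSST (discriminative scale space tracking)^[18], SAMF (scale-adaptive tracking with multiple feature fusion)^[13], SRDCF (spatially regularized correlation filter)^[19], BACF (background-aware correlation filter)^[20], MEEM (multiple experts using entropy minimization)^[21], Staple (complementary learners)^[22], LCT (long-term correlation tracking)^[15], and MUSTER (multi-store tracker)^[23] to test its performance comprehensively. The test videos are 10 sequences from the 2015 object tracking benchmark data set; the attributes of the test sequences are listed in Table 1.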

Table 1 Attributes of test video sequence

Sequence	Length	Resolution/pixels	Attributes
twinnings	472	352×288	scale variation, out-of-plane rotation
walking	412	768×576	scale variation, occlusion, deformation, low resolution
bolt	350	640×360	out-of-plane rotation, occlusion, deformation, in-plane rotation
subway	175	352×288	occlusion, deformation, background clutter
suv	945	320×240	occlusion, in-plane rotation, out-of-view
bolt2	293	480×270	deformation, background clutter
boy	602	640×480	scale variation, motion blur, fast motion, in-plane rotation, out-of-plane rotation
girl	500	128×96	scale variation, occlusion, in-plane rotation, out-of-plane rotation
faceocc1	892	352×288	occlusion
basketball	725	576×432	out-of-plane rotation, occlusion, deformation, background clutter, illumination variation
carDark	393	320×240	illumination variation, background clutter
ironman	166	720×304	illumination variation, scale variation, occlusion, motion blur, fast motion, in-plane rotation, out-of-plane rotation, out-of-view, background clutter, low resolution
motorRolling	164	640×360	illumination variation, scale variation, motion blur, fast motion, in-plane rotation, background clutter, low resolution

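3.2 Parameter settings

The experiments use 31-dimensional HOG features with a cell size of 4×4. The parameters were tuned through extensive experiments and set as follows: the regularization parameter $\lambda$ is 0.001, the penalty factor $\mu$ is 10³, the learning rate $\gamma$ is 0.015, the classification pool size $K$ is set to 0.05, the penalty coefficient $\lambda_a$ of the entropy regularization term is 10, and the scale pool size $n$ is 29.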

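3.3 Quantitative analysis

Fig. 2 shows the overall success rate and precision of each algorithm on the test sequences. The proposed perceptual-model-based tracker has an overall precision of 0.990 and a success rate of 0.988, which are 2.7% and 2.5% higher than those of BACF, respectively. Tables 2 and 3 give the precision and success rate of each algorithm on the individual sequences. The data show that the proposed algorithm performs well on all sequences and achieves particularly high precision and success rate on sequences with the occlusion attribute.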
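The paper does not spell out how distance precision (DP) and success rate (SR) are computed. The sketch below shows one common way to compute them and is an assumption, not the authors' evaluation code: it takes DP as the fraction of frames whose center-location error is at most 20 pixels and SR as the fraction of frames whose bounding-box overlap (intersection over union) exceeds 0.5. The thresholds, function names, and boxes are all illustrative.

```python
import math


def center_error(a, b):
    """Euclidean distance between the centres of two boxes given as (x, y, w, h)."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return math.hypot(ax - bx, ay - by)


def overlap(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    inter = max(iw, 0.0) * max(ih, 0.0)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def distance_precision(pred, truth, threshold=20.0):
    """Fraction of frames with centre error <= threshold (assumed 20 pixels)."""
    hits = sum(center_error(p, g) <= threshold for p, g in zip(pred, truth))
    return hits / len(truth)


def success_rate(pred, truth, threshold=0.5):
    """Fraction of frames with overlap > threshold (assumed 0.5)."""
    hits = sum(overlap(p, g) > threshold for p, g in zip(pred, truth))
    return hits / len(truth)


# Hypothetical four-frame sequence: the tracker drifts in the last two frames.
truth = [(10, 10, 40, 40), (12, 10, 40, 40), (14, 10, 40, 40), (16, 10, 40, 40)]
pred = [(10, 10, 40, 40), (15, 12, 40, 40), (40, 10, 40, 40), (90, 60, 40, 40)]

dp = distance_precision(pred, truth)
sr = success_rate(pred, truth)
```

Under these assumed thresholds, a sequence-level score of 1.000 means that every frame met the criterion, which is how the per-sequence entries in the tables can be read.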

Fig. 2 The overall accuracy and the overall success rate of 10 tracking algorithms ((a) accuracy; (b) success rate)

Table 2 Distance precision (DP)

Sequence	Ours	KCF	BACF	Staple	SAMF	LCT	MUSTER	MEEM	DSST	SRDCF
twinnings	1.000	0.907	1.000	0.998	1.000	0.850	0.413	0.924	1.000	0.513
walking	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
bolt	1.000	0.989	1.000	1.000	1.000	1.000	1.000	0.974	1.000	0.017
subway	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	0.246	1.000
suv	0.978	0.979	0.979	0.978	0.981	0.980	0.977	0.663	0.978	0.975
bolt2	0.990	0.017	0.993	0.997	0.017	0.017	0.017	0.017	0.020	0.017
boy	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
girl	1.000	0.864	0.948	0.868	1.000	1.000	0.990	1.000	0.928	0.994
faceocc1	0.955	0.730	0.657	0.918	0.923	0.906	0.871	0.648	0.895	0.831
basketball	0.979	0.923	0.828	0.879	0.989	1.000	1.000	0.996	0.806	0.996

Table 3 Success rate (SR)

Sequence	Ours	KCF	BACF	Staple	SAMF	LCT	MUSTER	MEEM	DSST	SRDCF
twinnings	0.987	0.542	0.981	0.975	0.964	0.742	0.407	0.434	0.998	0.434
walking	0.998	0.515	0.998	0.990	0.998	0.983	0.998	0.527	0.998	0.998
bolt	1.000	0.943	1.000	1.000	0.997	0.989	1.000	0.191	1.000	0.014
subway	1.000	1.000	1.000	1.000	1.000	1.000	1.000	0.966	0.223	0.994
suv	0.986	0.984	0.986	0.984	0.983	0.984	0.984	0.638	0.984	0.984
bolt2	0.966	0.007	0.959	0.911	0.007	0.007	0.010	0.007	0.010	0.010
boy	1.000	0.992	0.998	1.000	1.000	1.000	0.992	0.990	1.000	1.000
girl	0.998	0.742	0.896	0.644	1.000	0.976	0.580	0.958	0.306	0.776
faceocc1	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
basketball	0.968	0.898	0.814	0.870	0.967	0.992	0.994	0.981	0.680	0.412

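3.4 Qualitative analysis

To show the effect of the proposed tracker more directly, the actual tracking results of the 10 algorithms are analyzed under three conditions: occlusion, scale change, and deformation. Fig. 3 shows the actual tracking results of the algorithms.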

Fig. 3 Actual effect of the algorithm ((a) bolt; (b) faceocc1; (c) subway; (d) suv; (e) boy; (f) twinnings; (g) girl; (h) basketball; (i) bolt2; (j) walking)

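3.4.1 Severe occlusion

In Fig. 3(a)-(c) the target is partially occluded, and most trackers follow it well. In Fig. 3(d) the target is severely occluded. Most trackers update the classifier with target images captured under occlusion, so the classifier discriminates poorly under severe or full occlusion and tracking fails. For example, in frames 517 and 683 of the suv sequence in Fig. 3(d), the car is completely occluded by trees and the other trackers cannot track it accurately. The proposed algorithm uses the occlusion-aware strategy: it detects the occlusion with the Euclidean distance and uses the energy function to select the best classifier from the classification pool to track the target in this situation. It therefore handles severe occlusion well and still tracks the target accurately.

3.4.2 Scale change

In Fig. 3(e)-(f) the main challenge is scale change. Algorithms with a scale-adaptive strategy, such as DSST, BACF, and the proposed algorithm, follow the target well, whereas algorithms without one, such as KCF and MUSTER, drift. This shows that multi-scale estimation improves tracking accuracy under scale change. In the girl sequence in Fig. 3(g), the main challenges are scale change, occlusion, and interference from a similar target. Most algorithms, including BACF, fail here because they lack an occlusion discrimination mechanism. For example, in frames 442 and 472 of the girl sequence, the man's face occludes the girl's face and then slowly moves away; BACF, DSST, and other algorithms fail even though they have a scale-adaptive strategy, whereas the proposed algorithm has both the scale-adaptive and occlusion-aware strategies and tracks the target successfully.

3.4.3 Target deformation

In Fig. 3(h)-(j) the main challenge is deformation. Algorithms such as SRDCF and DSST train the classifier with synthetic negative samples produced by traditional cyclic shifts, which limits its classification ability, so they cannot discriminate the deformed target well. For example, in frames 116 and 249 of the bolt2 sequence in Fig. 3(i), SRDCF, MEEM, SAMF, MUSTER, and other algorithms fail under target deformation. The proposed algorithm uses the background-aware strategy to obtain a large number of real and valid samples for training, which further strengthens the classifier's discriminative ability, so it tracks the deformed target well.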

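3.5 Performance under complex background

To further analyze the robustness of the proposed algorithm, three sequences with relatively complex backgrounds were selected from the 2015 object tracking benchmark data set. Tables 4 and 5 give the precision and success rate of each algorithm on these sequences. The data show that, thanks to its background-aware and occlusion-aware modules, the proposed algorithm achieves precision and success rates on the three complex-background sequences that are at least as high as those of the other algorithms.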

Table 4 Accuracy of algorithms in complex background

Sequence	Ours	KCF	BACF	Staple	SAMF	LCT	MUSTER	MEEM	DSST	SRDCF
carDark	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
ironman	0.553	0.217	0.133	0.145	0.133	0.145	0.145	0.548	0.151	0.030
motorRolling	0.607	0.049	0.067	0.055	0.043	0.043	0.238	0.055	0.049	0.043
Overall	0.929	0.744	0.816	0.834	0.776	0.765	0.742	0.753	0.698	0.647

Table 5 Success rate of algorithms in complex background

Sequence	Ours	KCF	BACF	Staple	SAMF	LCT	MUSTER	MEEM	DSST	SRDCF
carDark	1.000	0.692	0.995	1.000	0.583	0.992	1.000	1.000	1.000	1.000
ironman	0.512	0.151	0.127	0.096	0.114	0.096	0.127	0.470	0.133	0.030
motorRolling	0.516	0.079	0.091	0.067	0.079	0.061	0.274	0.085	0.067	0.073
Overall	0.915	0.657	0.835	0.811	0.746	0.756	0.720	0.634	0.646	0.594

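To analyze the tracking results under complex background more directly, the actual tracking of each algorithm was recorded, as shown in Fig. 4. In the carDark sequence, the night driving background is relatively complex. The MUSTER tracker is easily affected by background interference, so it shows large errors from frame 316 onward, whereas the proposed algorithm resists background interference well because of its background-aware module and tracks the target accurately. The motorRolling and ironman sequences have extremely complex backgrounds and many interfering attributes, so the other algorithms cannot track the target accurately, whereas the proposed algorithm, which combines the background-aware module, the occlusion-aware module, and scale estimation, still tracks the target accurately.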

Fig. 4 Actual effect of the algorithm in complex background

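3.6 Time complexity

In addition to precision, the running speed of the proposed algorithm was recorded, as shown in Table 6. The average speed of the proposed algorithm in the experiments is 19.8 frames/s. Because of the added perception modules, it is somewhat slower than BACF, LCT, and some other algorithms, and its real-time performance is slightly insufficient. Future work will therefore study how to improve efficiency while maintaining precision.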

Table 6 Time complexity of the algorithm

	Ours	KCF	BACF	LCT	MUSTER	DSST	SRDCF
Speed/(frames/s)	19.8	243.5	25.2	21.6	2.7	53.1	4.5

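3.7 Boundary effect

Traditional correlation-filter algorithms generate samples by cyclic shifts. Only one of these samples is real; the others are virtual samples produced by cyclic shifts, and they all suffer from the boundary effect. Introducing the cropping matrix increases the number of real samples, lowers the proportion of samples with the boundary effect, significantly increases the total number of samples, and improves the classification ability of the trained classifier. As shown in Fig. 5, the KCF response map has two peaks because of the boundary effect, which leads to tracking failure. The proposed algorithm lowers the proportion of boundary-effect samples and obtains a stronger classifier, so its response map has a single sharp peak and its tracking result is better.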

Fig. 5 Effect of the response map ((a) original image; (b) KCF response map; (c) ours)

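4 Conclusion

The negative samples that traditional correlation-filter algorithms obtain by cyclic shifts are not realistic enough, and cyclically shifted samples may cause the boundary effect. The proposed algorithm therefore enlarges the sampling area to increase the number of training samples and introduces a cropping matrix that crops the shifted samples to obtain complete, valid, and real samples. Because the cropped samples are free of discontinuities, the boundary effect is greatly reduced. When the target is severely or fully occluded, traditional correlation-filter algorithms track the current frame with a classifier updated by the target image of the previous frame, which easily causes noisy updates and tracking failure. To address this problem, the proposed algorithm discriminates occlusion with the Euclidean distance. When the target in the previous frame is fully occluded, it uses an energy function to select the best classifier from the pool of unoccluded classifiers to detect the target position in the current frame, and then judges whether the current frame is unoccluded to decide whether to update the classification pool. This handles tracking under severe occlusion well.

In the experiments, the success rate, precision, and time complexity of the algorithms were compared in detail, with the following results. 1) Precision and success rate: because of the background-aware and occlusion-aware strategies, the overall success rate and precision of the proposed algorithm are significantly higher than those of the other algorithms, and high tracking precision is also obtained on single sequences. Admittedly, other algorithms have advantages in specific scenarios, and the proposed algorithm does not rank first in precision and success rate on every sequence. 2) Time complexity: the perception modules raise the time cost slightly, so real-time performance is insufficient.

The experiments show that the proposed algorithm tracks the target accurately under complex conditions such as severe occlusion, scale change, and target deformation, and has some research value. However, its gains in precision and success rate come with insufficient real-time performance, so future work will focus on strategies that reduce the time cost and improve real-time performance.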

References

[1] Gültekın O, Günsel B. Robust object tracking by variable rate kernel particle filter[C]//Proceedings of the 26th Signal Processing and Communications Applications Conference. Izmir, Turkey: IEEE, 2018: 1-4.[DOI: 10.1109/SIU.2018.8404479]

[2] Gao M F, Zhang X X. Scale adaptive kernel correlation filtering for target tracking[J]. Laser & Optoelectronics Progress, 2018, 55(4): 041501. [高美凤, 张晓玄. 尺度自适应核相关滤波目标跟踪[J]. 激光与光电子学进展, 2018, 55(4): 041501. ] [DOI:10.3788/LOP55.041501]

[3] Nai K, Li Z Y, Li G J, et al. Robust object tracking via local sparse appearance model[J]. IEEE Transactions on Image Processing, 2018, 27(10): 4958–4970. [DOI:10.1109/TIP.2018.2848465]

[4] Qi Y K, Qin L, Zhang J, et al. Structure-aware local sparse coding for visual tracking[J]. IEEE Transactions on Image Processing, 2018, 27(8): 3857–3869. [DOI:10.1109/TIP.2018.2797482]

[5] Li Z T, Zhang J, Zhang K H, et al. Visual tracking with weighted adaptive local sparse appearance model via spatio-temporal context learning[J]. IEEE Transactions on Image Processing, 2018, 27(9): 4478–4489. [DOI:10.1109/TIP.2018.2839916]

[6] Chen Z H, Guo Q, Wang L, et al. Background-suppressed correlation filters for visual tracking[C]//Proceedings of 2018 IEEE International Conference on Multimedia and Expo. San Diego, CA, USA: IEEE, 2018: 1-6.[DOI: 10.1109/ICME.2018.8486453]

[7] Li Z K, Wan C S. Visual tracking with re-detection based on feature combination[C]//Proceedings of the 10th International Conference on Advanced Computational Intelligence. Xiamen, China: IEEE, 2018: 655-660.[DOI: 10.1109/ICACI.2018.8377537]

[8] Xiao Y F, Li J, Chang J, et al. Correlation filter tracking with multiscale spatial view[C]//Proceedings of 2018 International Joint Conference on Neural Networks. Rio de Janeiro, Brazil: IEEE, 2018: 1-6.[DOI: 10.1109/IJCNN.2018.8489278]

[9] Li J W, Zhou X L, Chan S X, et al. Robust object tracking via large margin and scale-adaptive correlation filter[J]. IEEE Access, 2018, 6: 12642–12655. [DOI:10.1109/ACCESS.2017.2778740]

[10] Xu T Y, Wu X J, Feng F. Fast visual object tracking via correlation filter and binary descriptors[C]//Proceedings of 2017 International Smart Cities Conference. Wuxi, China: IEEE, 2017: 1-4.[DOI: 10.1109/ISC2.2017.8090855]

[11] Bolme D S, Beveridge J R, Draper B A, et al. Visual object tracking using adaptive correlation filters[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 2544-2550.[DOI: 10.1109/CVPR.2010.5539960]

[12] Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 37(3): 583–596. [DOI:10.1109/TPAMI.2014.2345390]

[13] Li Y, Zhu J K. A scale adaptive kernel correlation filter tracker with feature integration[C]//Proceedings of European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 254-265.[DOI: 10.1007/978-3-319-16181-5_18]

[14] Zhang H Y, Li C F. Compressive tracking algorithm combining online feature selection with covariance matrix[J]. Optics and Precision Engineering, 2017, 25(4): 519–527. [张红颖, 李灿锋. 结合特征在线选择与协方差矩阵的压缩跟踪算法[J]. 光学精密工程, 2017, 25(4): 519–527. ] [DOI:10.3788/OPE.20172504.1051]

[15] Ma C, Yang X K, Zhang C Y, et al. Long-term correlation tracking[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015: 5388-5396.[DOI: 10.1109/CVPR.2015.7299177]

[16] Choi J, Chang H J, Jeong J, et al. Visual tracking using attention-modulated disintegration and integration[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 4321-4330.[DOI: 10.1109/CVPR.2016.468]

[17] Ge B Y, Zuo X Z, Hu Y J. Long-term object tracking based on feature fusion[J]. Acta Optica Sinica, 2018, 38(11): 1115002. [葛宝义, 左宪章, 胡永江. 基于特征融合的长时目标跟踪算法[J]. 光学学报, 2018, 38(11): 1115002. ] [DOI:10.3788/AOS201838.1115002]

[18] Danelljan M, Häger G, Khan F S, et al. Accurate scale estimation for robust visual tracking[C]//Proceedings of the British Machine Vision Conference. Nottingham, UK: BMVC, 2014: 1-11.[DOI: 10.5244/C.28.65]

[19] Danelljan M, Häger G, Khan F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 4310-4318.[DOI: 10.1109/ICCV.2015.490]

[20] Galoogahi H K, Fagg A, Lucey S. Learning background-aware correlation filters for visual tracking[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 1144-1152.[DOI: 10.1109/ICCV.2017.129]

[21] Zhang J M, Ma S G, Sclaroff S. MEEM: robust tracking via multiple experts using entropy minimization[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 188-203.[DOI: 10.1007/978-3-319-10599-4_13]

[22] Bertinetto L, Valmadre J, Golodetz S, et al. Staple: complementary learners for real-time tracking[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 1401-1409.[DOI: 10.1109/CVPR.2016.156]

[23] Hong Z B, Chen Z, Wang C H, et al. Multi-store tracker (MUSTer): a cognitive psychology inspired approach to object tracking[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 749-758.[DOI: 10.1109/CVPR.2015.7298675]