Published: 2021-03-16
DOI: 10.11834/jig.200139
2021 | Volume 26 | Number 3

图像分析和识别

背景与方向感知的相关滤波跟踪

姜文涛¹, 涂潮², 刘万军¹

1. 辽宁工程技术大学软件学院, 葫芦岛 125105;

2. 辽宁工程技术大学研究生院, 葫芦岛 125105

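Received: 2020-05-12; Revised: 2020-07-13; Preprint: 2020-07-20

Funding: National Natural Science Foundation of China (61172144); Natural Science Foundation of Liaoning Province (20170540426); Foundation of the Education Department of Liaoning Province (LJYL049)

About the authors: Jiang Wentao, born in 1986, male, associate professor; research interests include image and visual computing, pattern recognition and artificial intelligence. E-mail: lntuwulue@sina.com;
Tu Chao, corresponding author, male, master's student; research interests include image and visual computing, pattern recognition and artificial intelligence. E-mail: 745700558@qq.com;
Liu Wanjun, male, professor; research interests include software engineering theory, image and visual information computing, pattern recognition and artificial intelligence. E-mail: liuwanjun@Intu.edu.cn

CLC number: TP391.4

Document code: A

Article ID: 1006-8961(2021)03-0527-15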

摘要

目的针对相关滤波跟踪算法，目标与周围背景进行等值权重训练滤波器导致目标与背景信息相似时，易出现目标漂移问题，本文提出一种基于背景与方向感知的相关滤波跟踪算法。方法将目标周围的背景信息学习到滤波器中，利用卡尔曼滤波预测目标的运动状态和运动方向，提取目标运动方向上的背景信息，将目标运动方向上与非运动方向上的背景信息进行滤波器训练，保证分配给目标运动方向上背景信息的训练权重高于非运动方向上的权重，增加滤波器对目标和背景信息的分辨能力，采用线性插值法得到最大响应值，用于确定目标位置；构造辅助因子g，利用增广拉格朗日乘子法（augmented Lagrange method，ALM）将约束项放到优化函数里，采用交替求解算法（alternating direction method of multipliers，ADMM）将求解目标问题转化为求滤波器和辅助因子的最优解，降低计算复杂度；采用多分辨率搜索方法来估计目标变换的尺度。结果在数据集OTB50（object tracking benchmark）和OTB100上的平均精确率和平均成功率分别为0.804和0.748，相比BACF（background-aware correlation filters）算法分别提高了7%和16%；在数据集LaSOT上本文算法精确率为0.329，相比BACF（0.239）的精确率得分，更能体现本文算法的鲁棒性。结论与其他主流算法相比，本文算法在运动模糊、背景杂乱和形变等复杂条件下跟踪效果更加鲁棒。

关键词

计算机视觉; 目标跟踪; 相关滤波; 背景感知; 卡尔曼滤波; 交替求解算法(ADMM)

Background and direction-aware correlation filter tracking

Jiang Wentao¹, Tu Chao², Liu Wanjun¹

1. School of Software, Liaoning Technical University, Huludao 125105, China;

2. Graduate School, Liaoning Technical University, Huludao 125105, China

Supported by: National Natural Science Foundation of China (61172144)

Abstract

Objective Although the background-aware correlation filters (BACF) algorithm increases the number of samples and guarantees sample quality, it trains on the background information with equal weights, so the target tends to drift in complex scenes where the target resembles the background. Equal-weight training ignores both the priority of collecting samples along the target's direction of motion and the importance of weight allocation. If samples are collected along the direction of motion and the sample weights are allocated reasonably, the tracking result improves and target drift is effectively reduced. This paper therefore adds Kalman filtering to the BACF framework. Method For the single-target tracking problem, the proposed algorithm takes only the motion vector from the predicted value and does not locate the target from a constant-velocity or constant-acceleration model. The target position is still determined by the response peak, and the maximum response value is obtained by linear interpolation. When the speed is zero, the response peak that located the target in the previous frame is still used to determine the target position in the current frame. Kalman filtering predicts the target's motion state and direction, and the filter is trained on background information from both the motion and non-motion directions, with the training weight assigned to background information in the motion direction kept higher than the weights in the non-motion directions. 
To solve the objective function, an auxiliary factor g is constructed, the augmented Lagrange method (ALM) is used to move the constraint into the optimization function, and the alternating direction method of multipliers (ADMM) is used to solve for the filter and the auxiliary factor, which reduces computational complexity. Result The standard datasets OTB50 (object tracking benchmark) and OTB100 are selected to allow comparison with current mainstream algorithms. OTB50 is a commonly used tracking dataset that contains 50 video sequences with 11 different attributes, such as illumination variation and occlusion. OTB100 adds a further 50 test sequences to OTB50. Each sequence may have several video attributes, which makes tracking difficult. The proposed algorithm is analyzed with one-pass evaluation (OPE), using tracking precision and success rate as the evaluation criteria. In video sequence Board_1, the proposed algorithm, ECO (efficient convolution operators), SRDCF (spatially regularized correlation filters), and DeepSTRCF (deep spatial-temporal regularized correlation filters) all track accurately, but the proposed algorithm is substantially faster than the other three. In video sequence Panda_1, the proposed algorithm tracks stably at low resolution. In video sequence Box_1, only the proposed algorithm tracks the target accurately from the initial frame to the last frame, because the Kalman filter predicts the direction of the target and effectively separates the target from the background, which prevents the tracker from locking onto similar background. 
Experimental results show that the average precision and average success rate of the algorithm on OTB50 and OTB100 are 0.804 and 0.748, respectively, which are 7% and 16% higher than those of BACF. On the experimental sequences, the success rate and precision of the proposed algorithm are high, and the algorithm meets real-time requirements. Conclusion This paper uses Kalman filtering to predict the direction and state of the target, assigns different weights to the background information in different directions when training the filter, and obtains the maximum response value by linear interpolation to determine the target position. ADMM is used to transform the problem of solving the target model into two subproblems that each have an optimal solution. An online adaptive method is used in the model update to handle target deformation. Extensive comparative experiments are performed on OTB50 and OTB100. On OTB50, the success rate and precision of the proposed algorithm are 0.720 and 0.777, respectively. On OTB100, they are 0.773 and 0.828, respectively. These results compare favorably with current mainstream algorithms and indicate good accuracy and robustness. In background-aware tracking, the sampling method and the weight allocation directly affect tracking performance. The next step is an in-depth study of a speed-adaptive sample collection model.

Key words

computer vision; target tracking; correlation filter; background-aware; Kalman filters; alternating direction method of multipliers (ADMM)

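0 Introduction

Object tracking is an important research topic in computer vision and is widely applied in fields such as medical imaging and modern military systems (Lu et al., 2018; Meng and Yang, 2019). It still faces many difficulties, such as target occlusion and illumination variation, so tracking accurately and stably under complex conditions remains an active research problem (Li et al., 2018; Liu et al., 2019).

Correlation filter algorithms have attracted wide attention because they are both fast and accurate (Song et al., 2018). Bolme et al. (2010) were the first to apply the correlation filter (CF) to tracking, proposing the minimum output sum of squared error (MOSSE) filter. MOSSE correlates the target with the detection region and takes the location of the maximum response as the target position; using the Fourier transform, it moves the computation from the spatial domain to the frequency domain, which reduces computational complexity and greatly increases tracking speed. To address the shortage of samples in MOSSE, Henriques et al. (2012) proposed the CSK algorithm (exploiting the circulant structure of tracking-by-detection with kernels), which uses cyclic dense sampling to cover the features of the whole image and improves tracking. Building on CSK, Henriques et al. (2015) proposed kernel correlation filters (KCF), which adopt histogram of oriented gradient (HOG) features and introduce a kernel function to turn a nonlinear problem in a low-dimensional space into a linear problem in a high-dimensional space, improving performance. Although many improvements to kernelized correlation filter tracking exist, few of them improve the correlation filter framework itself. Danelljan et al. (2014) proposed the discriminative scale space tracker (DSST), which locates the target with a discriminative correlation filter and determines the scale with a separate scale estimation step, effectively improving scale adaptivity. Correlation filters are template-based methods and track poorly when the target moves quickly or deforms. To address this, Danelljan et al. (2015) proposed spatially regularized correlation filters (SRDCF), which enlarge the background region used for tracking and add a spatial regularization constraint; tracking accuracy improves, but tracking speed is low. Bertinetto et al. (2016) proposed Staple, which fuses HOG features with color histogram features (color name, CN) and improves tracking performance, but it is not robust at low resolution or when the target leaves the field of view, and cannot track the target effectively in real time in those cases. Galoogahi et al. (2017) proposed background-aware correlation filters (BACF), which enlarge the region sampled by the circulant matrix within the conventional correlation filter framework, increasing the number of samples, and crop each sample to its valid region to guarantee sample quality; BACF tracks better than KCF and runs at 33.9 frame/s. Mueller et al. (2017) proposed context-aware correlation filter tracking (CACF), which adds background information around the target during filter training: the target is the positive sample, and one background patch is taken in each of the four directions (above, below, left and right) as negative samples, which adds a background constraint around the target. Because CACF modifies the correlation filter framework itself, it applies to all correlation filter algorithms, such as SAMF (scale adaptive multiple feature) (Li and Zhu, 2014), and it markedly improves success rate and precision over earlier algorithms. To address the shortcomings of SRDCF, Li et al. (2018) proposed spatial-temporal regularized correlation filters (STRCF), which add temporal regularization to SRDCF and no longer keep the background samples from the initial frame through the current frame, preserving tracking accuracy while increasing tracking speed.

With the wide adoption of deep learning, researchers have trained multi-layer convolutional neural networks on large amounts of data and effectively improved tracking accuracy. To address the shortage of training samples during tracking, Wang and Yeung (2013) proposed DLT (deep compact image representation), the first algorithm to apply deep learning to object tracking; it combines offline pre-training with online fine-tuning, which guarantees a sufficient number of training samples and improves tracking accuracy. Danelljan et al. (2016) proposed C-COT (continuous convolution operators), a feature point tracking algorithm based on discriminative learning that introduces a continuous convolution operator framework for accurate sub-pixel localization. To address the shortcomings of C-COT, Danelljan et al. (2017) proposed ECO (efficient convolution operators), which keeps only the filters that contribute most, simplifies the training set with a Gaussian mixture model, and updates the model only every 6 frames, reducing the number of model updates and making tracking more robust. DeepSTRCF adds deep features to STRCF so that it can still track accurately and stably in complex situations.

Although BACF increases the number of samples and guarantees sample quality, it trains on the background information with equal weights, so the target tends to drift in complex scenes where the target and the background look similar. Equal-weight training ignores both the priority of collecting samples along the target's direction of motion and the importance of weight allocation. If samples are collected along the direction of motion and sample weights are allocated sensibly, tracking can be further improved and target drift effectively reduced. This paper therefore adds Kalman filtering (Welch and Bishop, 2001; Xu et al., 2018) to the BACF framework: 1) Kalman filtering predicts the target's motion state and direction, and the filter is trained on background information from both the motion and non-motion directions, with background information in the motion direction assigned a higher training weight than background information in the non-motion directions; 2) the objective function is optimized by constructing an auxiliary factor $ g$, using the augmented Lagrange method (ALM) to move the constraint into the optimization function, and using the alternating direction method of multipliers (ADMM) (Wang et al., 2019) to solve for the filter and the auxiliary factor, which reduces computational complexity. The algorithm is evaluated on OTB50 and OTB100; the experiments show higher tracking precision and success rate than current mainstream algorithms at a tracking speed of 29.9 frame/s.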

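1 Principle of the background-aware correlation filter

In correlation filter algorithms, the training set first samples the target once, and the remaining tracking samples are generated from this initial sample by circular shifts, so the initial sample is the only true positive sample. When the target deforms or rotates in or out of the image plane, boundary effects appear and tracking performance degrades. Some mainstream algorithms reduce boundary effects by applying a cosine window, but the cosine window filters out part of the background information and so weakens the discriminative power of the classifier (Yin et al., 2019). To address boundary effects, Galoogahi et al. (2015) proposed correlation filters with limited boundaries (CFLB), which minimize the ridge regression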

$ \begin{array}{c} E(\boldsymbol{h})=\frac{1}{2} \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{T}\left\|y_{i}(j)-\boldsymbol{h}^{\mathrm{T}} \boldsymbol{P} \boldsymbol{x}_{i}\left[\Delta \boldsymbol{\tau}_{j}\right]\right\|_{2}^{2}+ \\ \frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2} \end{array} $

(1)

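where $\boldsymbol{x}_{i} \in \mathbf{R}^{T}$ and $\boldsymbol{h} \in \mathbf{R}^{D}$; $N$ is the number of training images; $T$ is the size of the whole image; $\Delta \boldsymbol{\tau}_{j}$ is the circular shift operator; $\boldsymbol{h}$ is the filter being trained; $\boldsymbol{P}$ is a $D \times T$ matrix that extracts the middle $D$ elements of the signal $\boldsymbol{x}$; and $\lambda$ is the regularization factor. In general $T \gg D$, because $\boldsymbol{P}$ crops a small patch of $D$ elements out of a signal of $T$ elements. $\boldsymbol{P}$ is a constant matrix and can be precomputed. The algorithm uses large training and detection image patches and uses $\boldsymbol{P}$ to crop out real small-size samples, which raises the proportion of real samples, effectively mitigates boundary effects, and gives good tracking results.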
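The role of the shift operator $\Delta \boldsymbol{\tau}_{j}$ and the cropping matrix $\boldsymbol{P}$ in Eq. (1) can be illustrated on a 1D signal. The following is a minimal sketch, not the authors' implementation; the function names `circular_shift`, `crop_center` and `cropped_shifts` are hypothetical, and the signal is a plain Python list rather than image features.

```python
def circular_shift(x, j):
    """Return x shifted circularly by j positions (the x[delta tau_j] of Eq. (1))."""
    T = len(x)
    j %= T
    return x[-j:] + x[:-j] if j else list(x)


def crop_center(x, D):
    """Apply P: keep the middle D elements of a length-T signal."""
    start = (len(x) - D) // 2
    return x[start:start + D]


def cropped_shifts(x, D):
    """All T training samples P x[delta tau_j], j = 0..T-1."""
    return [crop_center(circular_shift(x, j), D) for j in range(len(x))]


signal = [0, 1, 2, 3, 4, 5, 6, 7]      # T = 8
patches = cropped_shifts(signal, 4)     # D = 4
```

Because the crop is taken from the middle of a larger shifted signal, most of the resulting patches contain no wrap-around boundary, which is the property CFLB and BACF rely on.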

The background-aware correlation filter (BACF) algorithm applies HOG features to CFLB, that is,

$ \begin{array}{c} E(\boldsymbol{h})=\frac{1}{2} \sum\limits_{j=1}^{T}\left\|\boldsymbol{y}(j)-\sum\limits_{k=1}^{K} \boldsymbol{h}_{k}^{\mathrm{T}} \boldsymbol{P} \boldsymbol{x}_{k}\left[\Delta \boldsymbol{\tau}_{j}\right]\right\|_{2}^{2}+ \\ \frac{\lambda}{2} \sum\limits_{k=1}^{K}\left\|\boldsymbol{h}_{k}\right\|_{2}^{2} \end{array} $

(2)

where $K$ is the number of feature channels, $\boldsymbol{y}(j)$ is the $j$-th element of $\boldsymbol{y}$, $\boldsymbol{h}_{k}$ is the $k$-th channel of the multi-channel filter, $\lambda$ is the regularization factor, $\boldsymbol{x}_{k} \in \mathbf{R}^{T}$, $\boldsymbol{y} \in \mathbf{R}^{T}$ and $\boldsymbol{h} \in \mathbf{R}^{D}$. BACF sets $N$ in CFLB to 1 and then adds multi-channel (HOG) features. Within the conventional correlation filter framework, it enlarges the region sampled by the circulant matrix, which increases the number of samples, and crops each sample to its valid region, which guarantees sample quality; it runs at 33.9 frame/s.

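2 Overall framework

BACF increases the number of samples relative to conventional correlation filter algorithms and improves sample quality by cropping, and its implementation uses optimization and simplification techniques. However, it trains on the target and the background with equal weights, which over-smooths the filtering of background information, so the target tends to drift when it resembles the background.

Building on BACF, this paper designs the sampling scheme and allocates sample weights so as to improve the algorithm. The proposed algorithm has two stages, and its overall framework is shown in Fig. 1: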

图 1 算法整体框架示意图

Fig. 1 Overall framework of algorithm

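1) Modeling stage. The tracking target is determined in the initial frame, its appearance features are extracted to train the filter, and the target model is built.

2) Prediction and update stage. Kalman filtering predicts the target's motion state and direction. Background blocks are extracted in different directions and used to train the filter and update the filter template, with background information in different directions relative to the target's motion given different weights, which improves the filter's ability to separate the target from the background. For localization, the maximum response value $ F_ {\rm{max}}$ is obtained by linear interpolation and used to determine the target position; the model is then updated with the located target, and the process repeats until tracking is complete.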

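3 Background and direction-aware correlation filter tracking

3.1 Kalman filter prediction model under background awareness

The proposed algorithm uses Kalman filtering to predict the target's direction of motion and motion state. The system state equations are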

$ \begin{array}{c} \boldsymbol{X}_{a}=\boldsymbol{A} \boldsymbol{X}_{a-1}+\boldsymbol{B} \boldsymbol{U}_{a-1}+\boldsymbol{w}_{a-1} \\ \boldsymbol{Z}_{a}=\boldsymbol{H} \boldsymbol{X}_{a}+\boldsymbol{V}_{a} \end{array} $

(3)

where $\boldsymbol{X}_{a}$ is the system state matrix, $\boldsymbol{A}$ is the state transition matrix, $\boldsymbol{B}$ is the control input matrix, $\boldsymbol{U}$ is the control input, and $\boldsymbol{H}$ is the state observation matrix; $\boldsymbol{A}$, $\boldsymbol{B}$ and $\boldsymbol{H}$ are system parameters. $\boldsymbol{w}_{a-1}$ is the process noise with covariance $\boldsymbol{Q}$, $\boldsymbol{Z}_{a}$ is the observation of the state, and $\boldsymbol{V}_{a}$ is the measurement noise, which is Gaussian white noise with covariance $\boldsymbol{R}$. Combined with Eq. (3), the remaining five Kalman filter recursions are used for state estimation:

$ \hat{\boldsymbol{X}}_{a / a-1}=\boldsymbol{A} \hat{\boldsymbol{X}}_{a-1}+\boldsymbol{B} \boldsymbol{U}_{a-1} $

(4)

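where $\hat{\boldsymbol{X}}_{a-1}$ is the state estimate at time $a-1$ and $\hat{\boldsymbol{X}}_{a/a-1}$ is the prediction for the next frame. $\boldsymbol{U}_{a-1}$ is the control input at time $a-1$; the proposed algorithm has no control input, so $\boldsymbol{U}_{a-1}$ is set to 0.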

$ \boldsymbol{P}_{a / a-1}=\boldsymbol{A} \boldsymbol{P}_{a-1} \boldsymbol{A}^{\mathrm{T}}+\boldsymbol{Q} $

(5)

where $\boldsymbol{P}_{a}$ is the covariance matrix and $\boldsymbol{P}_{a/a-1}$ is the covariance matrix predicted for the next frame.

$ \boldsymbol{K}_{g}=\boldsymbol{P}_{a / a-1} \boldsymbol{H}^{\mathrm{T}} /\left(\boldsymbol{H} \boldsymbol{P}_{a / a-1} \boldsymbol{H}^{\mathrm{T}}+\boldsymbol{R}\right) $

(6)

where $\boldsymbol{K}_{g}$ is the Kalman gain.

$ \hat{\boldsymbol{X}}_{a}=\hat{\boldsymbol{X}}_{a / a-1}+\boldsymbol{K}_{g}\left(\boldsymbol{Z}_{a}-\boldsymbol{H} \hat{\boldsymbol{X}}_{a / a-1}\right) $

(7)

where $\hat{\boldsymbol{X}}_{a}$ is the estimate at the current time.

$ \boldsymbol{P}_{a}=\left(\boldsymbol{I}-\boldsymbol{K}_{\mathrm{g}} \boldsymbol{H}\right) \boldsymbol{P}_{a / a-1} $

(8)

where $\boldsymbol{I}$ is the identity matrix.

The initial values of the state transition matrix $\boldsymbol{A}$, the state observation matrix $\boldsymbol{H}$, and the covariances $\boldsymbol{Q}$ and $\boldsymbol{R}$ introduced above are

$ \begin{array}{c} \boldsymbol{A}=\left[\begin{array}{cccc} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \\ \boldsymbol{H}=\left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right] \\ \boldsymbol{Q}=\left[\begin{array}{cccc} 0.01 & 0 & 0 & 0 \\ 0 & 0.01 & 0 & 0 \\ 0 & 0 & 0.01 & 0 \\ 0 & 0 & 0 & 0.01 \end{array}\right] \\ \boldsymbol{R}=\left[\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right] \end{array} $

(9)
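Eqs. (4) to (8), with the matrices of Eq. (9), can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' MATLAB code: the state is the column vector $(C_x, C_y, V_x, V_y)^{\mathrm{T}}$, the initial covariance is taken to be the identity (the paper does not state it), the helper names are hypothetical, and the measurements are synthetic noise-free centers moving by (2, 1) pixels per frame.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n, scale=1.0):
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Matrices from Eq. (9).
A = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
H = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
Q = identity(4, 0.01)
R = identity(2)

def predict(x, P):
    x_pred = matmul(A, x)                                  # Eq. (4), U = 0
    P_pred = add(matmul(matmul(A, P), transpose(A)), Q)    # Eq. (5)
    return x_pred, P_pred

def update(x_pred, P_pred, z):
    S = add(matmul(matmul(H, P_pred), transpose(H)), R)
    K = matmul(matmul(P_pred, transpose(H)), inv2(S))      # Eq. (6)
    x = add(x_pred, matmul(K, sub(z, matmul(H, x_pred))))  # Eq. (7)
    P = matmul(sub(identity(4), matmul(K, H)), P_pred)     # Eq. (8)
    return x, P

def track(centers):
    """Run the filter over a list of measured (cx, cy) centers."""
    x = [[centers[0][0]], [centers[0][1]], [0.0], [0.0]]
    P = identity(4)  # assumed initial covariance
    for cx, cy in centers[1:]:
        x_pred, P_pred = predict(x, P)
        x, P = update(x_pred, P_pred, [[cx], [cy]])
    return x

centers = [(2.0 * k, 1.0 * k) for k in range(31)]
state = track(centers)
vx, vy = state[2][0], state[3][0]
```

On this synthetic track the velocity estimate `(vx, vy)` settles close to the true per-frame displacement, which is the quantity the tracker uses as its motion vector.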

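The proposed algorithm targets single-object tracking. It takes only the motion vector from the Kalman prediction and does not localize the target from a constant-velocity or constant-acceleration model; the target position is still determined by the response peak, with the maximum response value $F_{{\rm{max}}} $ obtained by linear interpolation. When the velocity is 0, the response peak that located the target in the previous frame is still used to determine the target position in the current frame. When the velocity is nonzero, the state vector of the target is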

$ \boldsymbol{O}=\left(C_{x}, C_{y}, V_{x}, V_{y}\right)^{\mathrm{T}} $

(10)

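where $C_{x}$ and $C_{y}$ are the coordinates of the target center, and $V_{x}$ and $V_{y}$ are the velocities along the $C_{x}$ and $C_{y}$ directions, respectively. The unit vector, whose components are the signs of the two velocity components, is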

$ \boldsymbol{V}=\left(V_{x} /\left|V_{x}\right|, V_{y} /\left|V_{y}\right|\right)^{\mathrm{T}} $

(11)
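Eq. (11) divides by $|V_x|$ and $|V_y|$, so an implementation needs a rule for zero components. The sketch below is one possible reading, not the authors' code: `motion_direction` and the tolerance `eps` are hypothetical, and a component smaller than `eps` is reported as 0, matching the paper's separate handling of the zero-velocity case.

```python
def motion_direction(vx, vy, eps=1e-6):
    """Component-wise direction of Eq. (11): V_x/|V_x| and V_y/|V_y|.

    A component with magnitude below eps is treated as stationary (0),
    which avoids the division by zero that Eq. (11) would otherwise hit.
    """
    def sign(v):
        if abs(v) < eps:
            return 0
        return 1 if v > 0 else -1

    return (sign(vx), sign(vy))


direction = motion_direction(2.0, -0.5)
stationary = motion_direction(0.0, 0.0)
```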

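Background information in the direction of motion is given a higher weight during filter training, which improves the filter's ability to separate the target from the background. Eq. (2) then becomes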

$ \begin{aligned} E(\boldsymbol{h})=& \frac{1}{2} \sum\limits_{j=1}^{T}\left\|\boldsymbol{y}(j)-\sum\limits_{k=1}^{K} \boldsymbol{h}_{k}^{\mathrm{T}} \boldsymbol{P} \boldsymbol{x}_{k}\left[\Delta \boldsymbol{\tau}_{j}\right]\right\|_{2}^{2}+\\ & \frac{\lambda}{2} \sum\limits_{k=1}^{K}\left\|\boldsymbol{h}_{k}\right\|_{2}^{2}+\frac{\beta}{2} \sum\limits_{k=1}^{K}\left\|\boldsymbol{O}_{\delta} \omega\right\|_{2}^{2} \end{aligned} $

(12)

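where $T$ is the size of the whole image, $\boldsymbol{y}(j)$ is the $j$-th element of $\boldsymbol{y}$, $K$ is the number of feature channels, $\boldsymbol{h}_{k}$ is the $k$-th channel of the multi-channel filter, $\boldsymbol{P}$ is a $D \times T$ matrix, $\Delta \boldsymbol{\tau}_{j}$ is the circular shift operator, $\boldsymbol{h}$ is the filter being trained, $\lambda$ and $\beta$ are regularization factors, $\boldsymbol{x}_{k} \in \mathbf{R}^{T}$, $\boldsymbol{y} \in \mathbf{R}^{T}$ and $\boldsymbol{h} \in \mathbf{R}^{D}$. $\omega$ is the weight coefficient, $\omega \in (0, 1)$, and depends on the number of background blocks selected. $\boldsymbol{O}_{\delta}$ ($\delta \in [1, \partial]$) is the background information along the target's motion, and $\partial$ is the number of background blocks selected; the more blocks are selected, the closer $\omega$ is to 1. In the proposed algorithm, an analysis of the parameter $\partial$ gives three groups: $\partial_{1}$ is the number of background blocks in the direction opposite to the target's motion, for which the weight coefficient $\omega_{1}$ is the smallest; $\partial_{2}$ is the number of background blocks in the non-motion directions, for which the weight coefficient $\omega_{2}$ is slightly larger than $\omega_{1}$; and $\partial_{3}$ is the number of background blocks in the direction of motion, for which the weight coefficient $\omega_{3}$ is the largest. The number of blocks selected thus yields different weight coefficients, which determine how background information in different directions relative to the target's motion is extracted to train the filter. The blue region in Fig. 2 is the background information around the target.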

图 2 背景样本

Fig. 2 Background sample

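Fig. 3 shows how the background information is classified by direction relative to the target's motion: the black region $ ∂_1$ is background information in the direction opposite to the motion, the red region $ ∂_2$ is background information in the non-motion directions, and the blue region $ ∂_3$ is background information in the direction of motion.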

图 3 通过方向分类的背景样本

Fig. 3 Background classification by direction
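The three-way split into opposite, non-motion and motion directions can be sketched as follows. This is only an illustration of the idea under explicit assumptions: background blocks are represented by the 8 neighbor offsets around the target, the class of a block is decided by the sign of its dot product with the motion direction, and the numeric weights are placeholders chosen to satisfy $\omega_1 < \omega_2 < \omega_3$ within (0, 1). The paper's own block layout (1, 4 and 7 blocks) and weight values are not reproduced here.

```python
# Hypothetical layout: the 8 blocks adjacent to the target.
NEIGHBOR_OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]

# Placeholder weights with omega_1 < omega_2 < omega_3, all in (0, 1).
WEIGHTS = {"opposite": 0.2, "lateral": 0.5, "motion": 0.8}


def classify_blocks(direction, offsets=NEIGHBOR_OFFSETS):
    """Group block offsets by their position relative to the motion direction."""
    groups = {"opposite": [], "lateral": [], "motion": []}
    for dx, dy in offsets:
        dot = dx * direction[0] + dy * direction[1]
        if dot > 0:
            groups["motion"].append((dx, dy))
        elif dot < 0:
            groups["opposite"].append((dx, dy))
        else:
            groups["lateral"].append((dx, dy))
    return groups


def block_weights(direction):
    """Map every block offset to the training weight of its group."""
    groups = classify_blocks(direction)
    return {offset: WEIGHTS[name]
            for name, members in groups.items() for offset in members}


groups = classify_blocks((1, 0))      # target moving in +x
weights = block_weights((1, 0))
```

With a zero direction vector every block falls into the lateral group, so all background blocks receive the same weight, which mirrors equal-weight training.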

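Fig. 4 is a 3D visualization of the Gaussian mixture model of the background information. In Figs. 2 to 4 the background sample is a 2D matrix of the same size as ${\mathit{\boldsymbol{P}}} $; the dots in the three colored regions only indicate background information in different directions relative to the target's motion and do not mark the centers of background blocks.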

图 4 背景信息3维可视化图

Fig. 4 Background information 3D visualization

Transforming Eq. (12) to the frequency domain gives

$ \begin{array}{c} E(\boldsymbol{h}, \hat{\boldsymbol{g}})=\frac{1}{2}\|\hat{\boldsymbol{y}}-\hat{\boldsymbol{X}} \hat{\boldsymbol{g}}\|_{2}^{2}+\\ \frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2} \\ \text { s. t. } \hat{\boldsymbol{g}}=\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h} \end{array} $

(13)

where $\hat{\boldsymbol{g}}$ is the auxiliary factor; $\hat{\boldsymbol{g}}=\left[\hat{\boldsymbol{g}}_{1}^{\mathrm{T}}, \cdots, \hat{\boldsymbol{g}}_{K}^{\mathrm{T}}\right]^{\mathrm{T}}$ is a $K \times T$-dimensional column vector formed by concatenating the $K$ channels $\hat{\boldsymbol{g}}_{k}$; $\otimes$ is the Kronecker product; $\boldsymbol{I}_{K}$ is the identity matrix of order $K$; $\boldsymbol{h}=\left[\boldsymbol{h}_{1}^{\mathrm{T}}, \cdots, \boldsymbol{h}_{K}^{\mathrm{T}}\right]^{\mathrm{T}}$ is a $K \times D$-dimensional column vector formed by concatenating the $K$ channels $\boldsymbol{h}_{k}$; $\boldsymbol{F}$ is the $T \times T$ orthonormal Fourier transform matrix; $\boldsymbol{P}$ is a $D \times T$ matrix; and $\hat{\boldsymbol{g}}(t)$ is the state estimate of the target at time $t$ predicted by the Kalman filter.

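3.2 Solving the model

To solve Eq. (13), the augmented Lagrange method (ALM) (Pu and Jin, 2010) is used to move the constraint into the optimization function, that is,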

$ \begin{array}{c} \zeta(\hat{\boldsymbol{g}}, \boldsymbol{h}, \hat{\boldsymbol{\xi}})=\frac{1}{2}\|\hat{\boldsymbol{y}}-\hat{\boldsymbol{x}} \hat{\boldsymbol{g}}\|_{2}^{2}+\frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+ \\ \hat{\boldsymbol{\xi}}\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)+ \\ \frac{\mu}{2}\left\|\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)\right\|_{2}^{2}+ \\ \frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2} \end{array} $

(14)

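where $\mu$ is the penalty factor and $\hat{\boldsymbol{\xi}}=\left[\hat{\boldsymbol{\xi}}_{1}^{\mathrm{T}}, \cdots, \hat{\boldsymbol{\xi}}_{K}^{\mathrm{T}}\right]^{\mathrm{T}}$ is the Lagrangian vector. Eq. (14) is solved with ADMM, which turns the objective into two subproblems that are easy to solve. Because both subproblems, for $\hat{\boldsymbol{g}}^{*}$ and $\boldsymbol{h}^{*}$, are convex, smooth and differentiable, each has a closed-form solution that is globally optimal.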

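1) Subproblem 1: solving for $\boldsymbol{h}^{*}$. Treating $\hat{\boldsymbol{y}}, \hat{\boldsymbol{g}}, \lambda, \mu, \beta, \omega, \boldsymbol{P}$ in Eq. (14) as known, the optimization problem in Eq. (14) reduces to a subproblem with a closed-form solution (see the appendix for the detailed derivation), that is,

$ \begin{array}{c} \boldsymbol{h}^{*}=\arg {\min \limits_\boldsymbol{h}}\left\{\frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+\hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)+\right. \\ \left.\frac{\mu}{2}\left\|\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right\|_{2}^{2}\right\} \end{array} $

(15)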

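2) Subproblem 2: solving for $\hat{\boldsymbol{g}}^{*}$. Treating $\hat{\boldsymbol{y}}, \boldsymbol{h}, \lambda, \mu, \beta, \omega, \boldsymbol{P}$ in Eq. (14) as known, the optimization problem in Eq. (14) reduces to a subproblem with a closed-form solution (see the appendix for the detailed derivation), that is,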

$ \begin{array}{c} \boldsymbol{g}^{*}=\arg {\min \limits_\boldsymbol{g}}\left\{\frac{1}{2}\|\hat{\boldsymbol{y}}-\hat{\boldsymbol{x}} \hat{\boldsymbol{g}}\|_{2}^{2}+\right. \\ \hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)+ \\ \frac{\mu}{2}\left\|\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right\|_{2}^{2}+ \\ \left.\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2}\right\} \end{array} $

(16)

Solving for $\boldsymbol{g}^{*}$ directly is computationally expensive and cannot reach real-time tracking, so the problem is split into $L$ independent objective functions; Eq. (16) then decomposes into $L$ independent $K \times K$ linear subproblems. Each element $\hat{\boldsymbol{y}}(t)$, $t=1, \cdots, L$, depends only on the $K$ values of $\hat{\boldsymbol{x}}(t)=\left[\hat{\boldsymbol{x}}_{1}(t), \cdots, \hat{\boldsymbol{x}}_{K}(t)\right]^{\mathrm{T}}$ and $\hat{\boldsymbol{g}}(t)=\left[\operatorname{conj}\left(\hat{\boldsymbol{g}}_{1}(t)\right), \cdots, \operatorname{conj}\left(\hat{\boldsymbol{g}}_{K}(t)\right)\right]^{\mathrm{T}}$, where $\operatorname{conj}(\cdot)$ denotes complex conjugation. Eq. (16) can therefore be rewritten as

$ \begin{array}{c} \boldsymbol{g}^{*}(t)=\arg {\min \limits_\boldsymbol{g}}\left\{\frac{1}{2}\left\|\hat{\boldsymbol{y}}(t)-\hat{\boldsymbol{x}}^{\mathrm{T}}(t) \hat{\boldsymbol{g}}(t)\right\|_{2}^{2}+\right.\\ \hat{\boldsymbol{\xi}}^{\mathrm{T}}(t)(\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t))+\\ \frac{\mu}{2}\left\|\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t)\right\|_{2}^{2}+\\ \left.\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2}\right\} \end{array} $

(17)

where $\hat{\boldsymbol{h}}(t)=\left[\hat{\boldsymbol{h}}_{1}(t), \cdots, \hat{\boldsymbol{h}}_{K}(t)\right]$ and $\hat{\boldsymbol{h}}_{k}=\sqrt{D} \boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \boldsymbol{h}_{k}$. By Parseval's theorem, Eq. (17) is solved as

$ \begin{array}{l} \hat{\boldsymbol{g}}(t)^{*}=\left(\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}^{\mathrm{T}}(t)+T(\mu+\lambda) \boldsymbol{I}_{K}\right)^{-1} \\ (\hat{\boldsymbol{y}}(t) \hat{\boldsymbol{x}}(t)-T \hat{\boldsymbol{\xi}}(t)+T \mu \hat{\boldsymbol{h}}(t)+T \beta \omega) \end{array} $

(18)

Eq. (18) has time complexity $ \mathrm{O}\left(T K^{3}\right)$, which is expensive. By the Sherman-Morrison formula, $ {\left({\mathit{\boldsymbol{A}} + \mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{T}}}} \right)^{ - 1}} = {\mathit{\boldsymbol{A}}^{ - 1}} - \frac{{{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{T}}}{\mathit{\boldsymbol{A}}^{ - 1}}}}{{1 + {\mathit{\boldsymbol{v}}^{\rm{T}}}{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}}}$, the inverse $ \left(\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}}+T \mu \boldsymbol{I}_{K}\right)^{-1}$ can be evaluated without an explicit matrix inversion, giving

$ \begin{array}{c} \hat{\boldsymbol{g}}(t)^{*}=\frac{1}{(\mu+\beta)}(T \hat{\boldsymbol{y}}(t) \hat{\boldsymbol{x}}(t)-\hat{\boldsymbol{\xi}}(t)+\mu \hat{\boldsymbol{h}}(t))- \\ \frac{\hat{\boldsymbol{x}}(t)}{(\mu+\beta) b}\left(T \hat{\boldsymbol{y}}(t) \hat{\boldsymbol{s}}_{x}(t)-\hat{\boldsymbol{s}}_{\xi}(t)+\mu \hat{\boldsymbol{s}}_{h}(t)+\beta \hat{\boldsymbol{s}}_{\omega}(t)\right) \end{array} $

(19)

where $\hat{\boldsymbol{s}}_{x}(t)=\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{x}}$, $\hat{\boldsymbol{s}}_{\xi}(t)=\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{\xi}}$, $\hat{\boldsymbol{s}}_{h}(t)=\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{h}}$ and $\hat{\boldsymbol{s}}_{\omega}(t)=\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \omega$. Evaluating Eq. (19) has time complexity $ \mathrm{O}\left(T K\right)$, which meets the requirements of real-time tracking.
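The saving from the Sherman-Morrison formula comes from the fact that the matrix to invert is a scaled identity plus a rank-one term. The sketch below shows the generic identity on a real-valued system $(\boldsymbol{x}\boldsymbol{x}^{\mathrm{T}} + c\boldsymbol{I})\boldsymbol{g} = \boldsymbol{v}$; it is not Eq. (19) itself, which works on complex Fourier coefficients with conjugates. The function name `solve_rank_one` and the scalar `c` (standing in for the $T\mu$-type term) are assumptions for illustration.

```python
def solve_rank_one(x, c, v):
    """Solve (x x^T + c I) g = v in O(K) using Sherman-Morrison.

    (c I + x x^T)^{-1} = (1/c) * (I - x x^T / (c + x^T x)),
    so only two inner products are needed instead of a K x K inversion.
    """
    s_x = sum(xi * xi for xi in x)              # x^T x
    s_v = sum(xi * vi for xi, vi in zip(x, v))  # x^T v
    scale = s_v / (c + s_x)
    return [(vi - xi * scale) / c for xi, vi in zip(x, v)]


x = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]
g = solve_rank_one(x, 2.0, v)

# Multiply back to check: (x x^T + c I) g should reproduce v.
x_dot_g = sum(xi * gi for xi, gi in zip(x, g))
reconstructed = [xi * x_dot_g + 2.0 * gi for xi, gi in zip(x, g)]
```

Applied once per frequency, this per-element cost of O(K) is what reduces the overall cost from O(TK^3) to O(TK).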

3) Subproblem 3: updating $\hat{\boldsymbol{\xi}}$. The Lagrange multiplier vector is updated as

$ \hat{\boldsymbol{\xi}}^{(i+1)} \leftarrow \hat{\boldsymbol{\xi}}^{(i)}+\mu\left(\hat{\boldsymbol{g}}^{(i+1)}-\hat{\boldsymbol{h}}^{(i+1)}\right) $

(20)

where $\hat{\boldsymbol{h}}^{(i+1)}$ and $\hat{\boldsymbol{g}}^{(i+1)}$ are the results of ADMM iteration $i+1$ when solving for $\hat{\boldsymbol{h}}$ and $\hat{\boldsymbol{g}}$, and the parameter $\mu$ is updated as $\mu^{(i+1)}=\min \left(\mu_{\max }, \vartheta \mu^{(i)}\right)$, where $\vartheta$ is a constant.
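The multiplier update of Eq. (20) and the penalty schedule are both one-line rules. The following sketch uses real-valued lists for readability (the actual quantities are complex vectors), and the names `update_multiplier` and `penalty_schedule` are hypothetical. The values $\mu^{(0)}=1$, $\vartheta=10$ and $\mu_{\max}=10^{3}$ are the settings reported in the experimental section.

```python
def update_multiplier(xi, g, h, mu):
    """Eq. (20): xi <- xi + mu * (g - h), element by element."""
    return [x + mu * (gi - hi) for x, gi, hi in zip(xi, g, h)]


def penalty_schedule(mu0, factor, mu_max, iterations):
    """mu^(i+1) = min(mu_max, factor * mu^(i)), returned for every iteration."""
    mus = [mu0]
    for _ in range(iterations):
        mus.append(min(mu_max, factor * mus[-1]))
    return mus


mus = penalty_schedule(1, 10, 1000, 4)
xi_next = update_multiplier([0.0, 0.0], [1.0, 2.0], [0.5, 2.5], mu=1)
```

The cap at `mu_max` keeps the penalty from growing without bound when more ADMM iterations are run than the schedule needs to saturate.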

3.3 Target model update

An online adaptive strategy is used to improve tracking robustness when the target deforms or changes scale.

$ \hat{\boldsymbol{x}}_{\text {model }}^{(f)}=(1-\eta) \hat{\boldsymbol{x}}_{\text {model }}^{(f-1)}+\eta \hat{\boldsymbol{x}}^{(f)} $

(21)

where $\eta$ is the online adaptation rate, and $\hat{\boldsymbol{x}}_{\text {model }}^{(f)}$ replaces $\hat{\boldsymbol{x}}(t)$ in Eq. (19) when computing $\hat{\boldsymbol{g}}(t)^{*}$ and the inner-product terms $\hat{\boldsymbol{s}}(t)$ defined there. As in BACF, a multi-resolution search is used to estimate changes in target scale.
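Eq. (21) is a running linear interpolation between the previous model and the features of the current frame. A minimal sketch, assuming the model is a flat list of real numbers and using a hypothetical function name `update_model`; the value of $\eta$ below is illustrative only, since the paper does not report it.

```python
def update_model(model, current, eta):
    """Eq. (21): model^(f) = (1 - eta) * model^(f-1) + eta * current^(f)."""
    return [(1.0 - eta) * m + eta * c for m, c in zip(model, current)]


model = [1.0, 0.0]
model = update_model(model, [0.0, 1.0], eta=0.25)  # eta chosen for illustration
```

A small $\eta$ makes the model change slowly and resist occlusions, while a large $\eta$ adapts faster to deformation at the risk of absorbing background.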

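3.4 Algorithm procedure

The main steps of the proposed algorithm are as follows.

1) Modeling stage. The target center coordinates ($ C_{x}, C_{y}$) in the initial frame and the background region $ \boldsymbol{O}_{\delta}$ containing the target are specified manually.

2) Prediction and update stage.

(1) Kalman filter prediction. Kalman filtering predicts the target's motion state ($V=0$ or $V \neq 0$) and direction of motion ($V_{x} /\left|V_{x}\right|, V_{y} /\left|V_{y}\right|$). Background blocks in the direction of motion, the opposite direction and the non-motion directions are extracted to train the filter and update the filter template.

(2) Target localization. The maximum response value $F_{{\rm{max}}} $ is obtained by linear interpolation and used to determine the target position. Because the camera and the target move relative to each other during capture, the two can be relatively static when their speeds match, in which case the target speed is 0; the response peak that located the target in the previous frame is then used to determine the target position in the current frame.

(3) Model update. The target tracked in step (2) is used to update the model with Eq. (21).

(4) Output the tracking result. Return to step (1) to train and update the filter on the next frame.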

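4 Experimental results

4.1 Experimental environment and parameter settings

The algorithm is implemented in MATLAB 2016a on a machine with an Intel i7-8565U CPU and 8 GB of memory, running Windows 10. The main parameter settings are as follows: the number of HOG feature channels $K$ is 31; the regularization factors $\lambda$ and $\beta$ are 0.01 and 1; the number of scales $S$ and the scale step $\alpha$ are 5 and 1.01; the number of iterations is $L=2$; the penalty factor is $\mu=1$ and is updated as $ {\mu ^{i + 1}} = \min \left({{\mu _{\max }}, \vartheta {\mu ^{(i)}}} \right)$ with $ {\mu _{\max }} = {10^3}$ and $\vartheta=10$; and the number of background blocks is $\partial=12$.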
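With $S=5$ scales and a scale step of $\alpha=1.01$, a multi-resolution search typically evaluates the filter on patches resized by powers of $\alpha$ centered on the current scale. The sketch below assumes that common convention (exponents $-2, \dots, 2$), which the paper does not spell out; `scale_factors` is a hypothetical helper name.

```python
def scale_factors(num_scales, step):
    """Scale multipliers step**n for n symmetric around 0 (assumed convention)."""
    half = (num_scales - 1) // 2
    return [step ** n for n in range(-half, half + 1)]


factors = scale_factors(5, 1.01)
```

The scale whose response map has the highest peak would then be taken as the new target scale.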

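To allow comparison with current mainstream algorithms, the standard datasets OTB50 (object tracking benchmark) (Wu et al., 2013) and OTB100 (Wu et al., 2015) are used. OTB50 is a commonly used tracking dataset that contains 50 video sequences with 11 different attributes, such as illumination variation and occlusion. OTB100 adds a further 50 test sequences to OTB50; each sequence may have several video attributes, which makes tracking difficult.

Performance is analyzed with one-pass evaluation (OPE), using tracking precision and success rate as the evaluation criteria. To better compare the tracking accuracy of the proposed algorithm with that of current mainstream trackers, comparison tests are also run on the recently released LaSOT (large-scale single object tracking) dataset (Fan et al., 2019).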
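The two OPE criteria can be computed from predicted and ground-truth boxes as sketched below. The sketch follows the usual OTB conventions, which are assumed here rather than taken from the paper: precision is the fraction of frames whose center location error is within 20 pixels, and success is the fraction of frames whose overlap (IoU) exceeds a threshold (the reported success score is usually the area under the curve over all thresholds). Boxes are `(x, y, w, h)` tuples and all function names are hypothetical.

```python
import math


def center_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision(preds, gts, pixel_threshold=20.0):
    """Fraction of frames with center error within the pixel threshold."""
    hits = sum(center_error(p, g) <= pixel_threshold for p, g in zip(preds, gts))
    return hits / len(gts)


def success_rate(preds, gts, overlap_threshold=0.5):
    """Fraction of frames whose IoU exceeds the overlap threshold."""
    hits = sum(iou(p, g) > overlap_threshold for p, g in zip(preds, gts))
    return hits / len(gts)


gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (30, 0, 10, 10)]   # second frame is a miss
p = precision(preds, gts)
s = success_rate(preds, gts)
```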

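4.2 Analysis of experimental results

To verify the robustness of the proposed algorithm, nine mainstream trackers that have performed well in recent years are selected for comparison: KCF (Henriques et al., 2015), SRDCF (Danelljan et al., 2015), BACF (Galoogahi et al., 2017), DSST (Danelljan et al., 2014), Staple (Bertinetto et al., 2016), DCF_CA (Mueller et al., 2017), SAMF_CA (Li and Zhu, 2014), ECO (Danelljan et al., 2017) and DeepSTRCF (Li et al., 2018). This section analyzes the experiments from two angles, quantitative comparison and qualitative comparison.

4.2.1 Qualitative comparison with other algorithms

Fig. 5 shows the tracking results of the 10 trackers on selected sequences that cover several video attributes, including occlusion (OCC), illumination variation (IV), background clutter (BC), fast motion (FM), motion blur (MB) and low resolution (LR). Fig. 5 shows that the proposed algorithm tracks better than the other algorithms.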

图 5 10种跟踪方法在部分序列的跟踪结果

Fig. 5 Tracking result of 10 tracking algorithms in partial sequences

((a) Girl2_1; (b) Box_1; (c) Biker_1; (d) Board_1; (e) Panda_1)

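1) Occlusion and deformation (DEF). Occlusion is one of the most common challenges in object tracking. Deformation means that the target's appearance keeps changing, which can eventually cause tracking failure. As Fig. 5(a) shows, in sequence Girl2_1 all algorithms track the target before frame 105. From frame 107 to frame 298, as the occluder appears and then disappears completely, SRDCF, ECO, SAMF_CA, BACF, DSST, Staple, KCF and DCF_CA drift to varying degrees. By frame 358, only DeepSTRCF among the comparison algorithms still tracks the target and the other eight fail; however, because the target keeps moving, DeepSTRCF does not adapt well to its appearance changes. From the first frame to the last, only the proposed algorithm tracks accurately.

2) Illumination variation (IV) and background clutter (BC). Illumination variation means that the lighting of the target region changes noticeably, which affects tracking. Background clutter means that the surroundings contain background very similar to the target, which interferes with tracking. As Fig. 5(b) shows, in sequence Box_1 all algorithms track the target up to frame 458. From frame 458 to frame 509, because the lighting changes and the background is cluttered, SAMF_CA, SRDCF, ECO, BACF, DSST, Staple, KCF and DCF_CA fail one after another, mistaking similar background for the target from the initial frame. From frame 517 to frame 518, DeepSTRCF does not adapt well to the target's appearance and tracks only part of the target, so it tracks less well than the proposed algorithm, which is also clearly faster than DeepSTRCF. From the first frame to the last, only the proposed algorithm tracks the target accurately: the Kalman filter predicts the target's direction of motion and separates the target from the background, which keeps the tracker from locking onto similar background under illumination variation and background clutter.

3) In-plane rotation (IPR) and out-of-plane rotation (OPR). These refer to the target rotating in or out of the image plane so that it is occluded by itself or by other objects. As Fig. 5(c) shows, in sequence Biker_1 all algorithms track the target accurately before frame 60. From frame 60 to frame 67, the target winds up and turns, so it begins to deform; SRDCF, DCF_CA, Staple, DSST and KCF fail first, BACF and SAMF_CA drift, and only the proposed algorithm, DeepSTRCF and ECO keep tracking the target. From frame 69 to frame 89, the target keeps turning and occludes itself; DeepSTRCF and ECO fail, and only the proposed algorithm tracks the target accurately, while over the whole sequence all the other mainstream trackers fail. The proposed algorithm tracks accurately from the first frame to the last because it uses real background information and a multi-resolution search strategy to handle target loss caused by rotation and scale change.

4) Motion blur (MB) and fast motion (FM). Motion blur is image blur caused by relative motion between the camera and the scene during capture. Fast motion means that the target moves more than 20 pixels between consecutive frames. As Fig. 5(d) shows, in sequence Board_1 all algorithms track the target from the initial frame to frame 363, but because relative motion between the camera and the target blurs the image, Staple, DCF_CA, DSST, KCF, BACF and SAMF_CA do not adapt well to the target's appearance changes. From frame 363 to frame 561, SRDCF and ECO drift and no longer enclose the target accurately. From frame 561 to frame 581, only SAMF_CA, DeepSTRCF and the proposed algorithm track the target accurately; the other mainstream trackers include more background, and SAMF_CA does not adapt well to the appearance changes. Over the whole sequence, under motion blur and fast motion, only the proposed algorithm, ECO and DeepSTRCF track the target accurately; the proposed algorithm is faster than ECO and DeepSTRCF, and ECO cannot adapt promptly to the target's appearance changes during tracking.

5) Low resolution (LR). As Fig. 5(e) shows, in sequence Panda_1 all algorithms track the target accurately before frame 130. From frame 143 to frame 152, because the target turns over and its resolution is low, SRDCF, KCF and DSST include more background and drift to varying degrees. From frame 152 to frame 335 the target keeps moving, and DSST and KCF fail. From frame 335 to frame 978 the target keeps moving and turning; DSST, KCF, BACF, SRDCF, DCF_CA, DeepSTRCF and Staple drift one after another until they fail, and only the proposed algorithm, ECO and SAMF_CA keep tracking the target. SAMF_CA, however, does not adapt well to the target's appearance changes and includes more background than the proposed algorithm and ECO, so it tracks less well than both.

Fig. 5 provides a qualitative comparison on five video sequences covering nine video attributes. These experiments show that the proposed algorithm tracks better than current mainstream algorithms in complex scenes such as fast motion and illumination variation.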

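4.2.2 Quantitative comparison with other algorithms

Figs. 6 and 7 show the precision and success plots of the 10 trackers on OTB50 and OTB100, respectively. Fig. 6 shows that on OTB50 the proposed algorithm has a success rate of 0.720 and a precision of 0.777, both higher than the success rate (0.643) and precision (0.721) of BACF. Fig. 7 shows that on OTB100 its success rate and precision are 0.773 and 0.828, both higher than the success rate (0.719) and precision (0.796) of BACF.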

图 6 10种跟踪算法在OTB50上的成功率和精确率

Fig. 6 Success and precision rates for 10 tracking algorithms on OTB50((a)success rate; (b)precision)

图 7 10种跟踪算法在OTB100上的成功率和精确率

Fig. 7 Success and precision rates for 10 tracking algorithms on OTB100((a)success rate; (b)precision)

为了更好地比较本文算法与各主流算法的跟踪性能，表 1和表 2分别记录了10种算法在数据集OTB100上11种视频属性的精确率和成功率。

表 1 10种跟踪算法在OTB100上的11种属性序列上的精确率得分
Table 1 Precision scores of 10 tracking algorithms on 11 attribute sequences on OTB100

OTB100	Ours	BACF	SRDCF	KCF	DSST	Staple	ECO	DCF_CA	SAMF_CA	DeepSTRCF
IV	0.781	0.810	0.757	0.714	0.720	0.756	0.777	0.737	0.758	0.841
SV	0.803	0.753	0.734	0.632	0.644	0.721	0.761	0.689	0.758	0.867
OCC	0.783	0.715	0.704	0.627	0.602	0.709	0.708	0.645	0.745	0.859
DEF	0.784	0.754	0.704	0.600	0.554	0.725	0.742	0.666	0.734	0.839
MB	0.753	0.736	0.778	0.600	0.580	0.683	0.725	0.698	0.741	0.861
FM	0.841	0.748	0.759	0.622	0.582	0.699	0.780	0.718	0.722	0.800
IPR	0.779	0.767	0.721	0.699	0.709	0.754	0.773	0.735	0.741	0.842
OPR	0.795	0.740	0.713	0.661	0.654	0.715	0.753	0.663	0.764	0.876
OV	0.841	0.649	0.576	0.512	0.494	0.652	0.731	0.559	0.697	0.767
BC	0.780	0.778	0.759	0.684	0.688	0.722	0.803	0.740	0.780	0.835
LR	0.811	0.628	0.663	0.560	0.602	0.610	0.741	0.594	0.703	0.783
Note: bold and underlined values mark the best and second-best results in each row, respectively.

表 2 10种跟踪算法在OTB100上的11种属性序列上的成功率得分
Table 2 Success scores of 10 tracking algorithms on 11 attribute sequences on OTB100

OTB100	Ours	BACF	SRDCF	KCF	DSST	Staple	ECO	DCF_CA	SAMF_CA	DeepSTRCF
IV	0.760	0.705	0.704	0.519	0.643	0.685	0.742	0.539	0.660	0.816
SV	0.734	0.647	0.660	0.421	0.551	0.609	0.705	0.464	0.626	0.816
OCC	0.676	0.666	0.648	0.505	0.546	0.637	0.737	0.504	0.663	0.828
DEF	0.666	0.663	0.626	0.472	0.494	0.625	0.705	0.508	0.610	0.780
MB	0.719	0.691	0.711	0.545	0.561	0.630	0.754	0.614	0.699	0.844
FM	0.818	0.688	0.703	0.521	0.543	0.630	0.749	0.594	0.630	0.757
IPR	0.703	0.689	0.648	0.556	0.629	0.664	0.689	0.608	0.646	0.783
OPR	0.727	0.655	0.641	0.510	0.577	0.631	0.692	0.528	0.685	0.834
OV	0.775	0.552	0.527	0.464	0.455	0.523	0.670	0.492	0.588	0.705
BC	0.747	0.694	0.690	0.569	0.608	0.656	0.779	0.630	0.693	0.789
LR	0.736	0.494	0.625	0.295	0.444	0.472	0.663	0.317	0.535	0.683
Note: bold and underlined values mark the best and second-best results in each row, respectively.

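On OTB100, Table 1 shows that the proposed algorithm has the best precision score on 3 of the 11 attributes and scores higher than BACF on 10 of them. Together with Fig. 6, its overall tracking precision ranks second; compared with the other algorithms, it comes closest to DeepSTRCF in precision, and the same holds for success rate. Table 2 shows that it ranks second in success rate on 4 attributes and first on 3, and that its precision and success scores are in the top 3 on all 11 attributes, so overall it tracks better than the other algorithms. Together with Fig. 7, on OTB100 the proposed algorithm is second to DeepSTRCF overall among the 9 comparison algorithms, but in most cases it tracks the target accurately and robustly, outperforms the other correlation filter algorithms, and is faster than DeepSTRCF at 29.9 frame/s. In addition, although the proposed algorithm is built on BACF, it scores higher than BACF in both success rate and precision: its average precision and average success rate over OTB50 and OTB100 are 0.804 and 0.748, which are 7% and 16% higher than those of BACF, respectively. The precision and success plots for the 11 attributes are given in the appendix.

Because few sequences in OTB50 and OTB100 exceed 1 500 frames, these datasets cannot fully reflect the tracking precision of the proposed algorithm, so the recent LaSOT dataset (Fan et al., 2019) is also used for comparison with current mainstream trackers. The results are shown in Fig. 8.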

图 8 10种算法在LaSOT上的精确率

Fig. 8 Precision rates for 10 tracking algorithms on LaSOT

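Compared with OTB50 and OTB100, LaSOT contains more video sequences, more frames per sequence and a finer division into categories, and is more challenging. On LaSOT the precision of the proposed algorithm is 0.329, lower only than that of DeepSTRCF and higher than that of all the other mainstream trackers. Since the proposed algorithm is an improvement of BACF, the comparison with the precision score of BACF (0.239) further reflects its robustness.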

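4.3 Analysis of the number of background blocks selected by parameter $∂ $

The background blocks selected by the parameter $ ∂$ fall into three groups: $ ∂_1$ is the number of background blocks in the direction opposite to the target's motion, $ ∂_2$ is the number of background blocks in the non-motion directions, and $ ∂_3$ is the number of background blocks in the direction of motion. The blocks selected by $ ∂_1$ at time $ t$ are contained in the blocks selected by $ ∂_3$ at time $ t-1$, so $ ∂_1$ selects the fewest blocks; the proposed algorithm sets $ ∂_1$=1. Part of the blocks selected by $ ∂_2$ at time $ t$ are contained in the blocks selected by $ ∂_3$ at time $ t-1$, so $ ∂_2$ selects slightly fewer blocks than $ ∂_3$; the proposed algorithm sets $ ∂_2$=4 and $ ∂_3$=7. With all other parameters fixed, Table 3 gives the results on OTB100 for different values of $∂ $ (the sum of $ ∂_1$, $ ∂_2$ and $ ∂_3$).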

表 3 参数选取10~14时的综合对比结果
Table 3 Comprehensive comparison for parameter values from 10 to 14

下载CSV

数量	跟踪精确率	跟踪成功率	平均跟踪速度/(帧/s)
10	0.798	0.770	32.4
11	0.811	0.771	31.1
12	0.829	0.775	29.9
13	0.829	0.776	27.2
14	0.833	0.776	25.9
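Note: because of experimental error, the values for 12 background blocks in this comparison differ slightly from those in Figs. 6 and 7; the difference is small and can be neglected.

In Table 3, the tracking precision is the same for 12 and 13 background blocks, and the tracking success rate is the same for 13 and 14 blocks. Going from 12 to 13 blocks, the tracking speed drops sharply, and in general the more background blocks are selected, the slower the tracking. Considering tracking precision, success rate and speed together, 12 blocks gives the best overall result.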

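4.4 Analysis of the tracking speed of each algorithm

Twenty sequences are randomly selected from OTB100 for comparison, and the tracking speed of each algorithm is computed as a weighted average, as shown in Table 4. Together with Tables 1 and 2, this shows that on the experimental sequences the proposed algorithm achieves a high success rate and precision while meeting real-time requirements, so its tracking performance is good.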

表 4 不同算法在部分实验图像序列上的平均跟踪速度
Table 4 Average tracking speed of different algorithms in experimental image sequences

下载CSV

/(frame/s)
	Ours	BACF	SRDCF	KCF	DSST	Staple	DCF_CA	ECO	SAMF_CA	DeepSTRCF
Average tracking speed	29.9	33.9	7.7	315.2	56.4	22.8	31.7	13	48.4	18.3
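The text does not state which weights enter the weighted average of tracking speed. A minimal sketch, assuming each sequence is weighted by its number of frames, is given below; the function name and the three per-sequence values are made up purely for illustration and are not measurements from the paper.

```python
def weighted_average_fps(fps_per_sequence, frames_per_sequence):
    """Frame-count-weighted mean of per-sequence speeds (assumed weighting)."""
    if not fps_per_sequence or len(fps_per_sequence) != len(frames_per_sequence):
        raise ValueError("need two non-empty lists of equal length")
    total_frames = sum(frames_per_sequence)
    weighted_sum = sum(f * n for f, n in zip(fps_per_sequence, frames_per_sequence))
    return weighted_sum / total_frames


# Hypothetical per-sequence speeds (frame/s) and lengths (frames).
example_fps = [30.0, 28.0, 32.0]
example_frames = [100, 300, 100]
average_fps = weighted_average_fps(example_fps, example_frames)
print(average_fps)
```

With this weighting, long sequences dominate the average, so one short, fast sequence cannot inflate the reported speed.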

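5 Conclusion

The proposed algorithm uses Kalman filtering to predict the target's motion direction and motion state, assigns different weights to the background information in different directions when training the filter, and obtains the maximum response value by linear interpolation to determine the target position. ADMM is used to convert the target model problem into two subproblems that each have an optimal solution, and an online adaptive strategy is used in the model update to handle target deformation.

Compared with other mainstream tracking algorithms, the proposed algorithm has the following advantages: 1) it adds Kalman filtering to the BACF framework to predict the target's motion state and motion direction, which effectively separates the target from background information and prevents the tracker from locking onto similar background; 2) it assigns different training weights to different directions relative to the target's motion and extracts useful background blocks from real background information to train the filter, which increases the discriminative ability of the filter; 3) it builds a correlation filter objective function with a closed-form solution and uses the alternating direction method of multipliers (ADMM) to reduce computational complexity and meet the real-time requirement of target tracking.

Extensive comparative experiments were carried out on OTB50 and OTB100. On OTB50, the success rate and precision of the proposed algorithm are 0.720 and 0.777, respectively; on OTB100, they are 0.773 and 0.828, respectively. On LaSOT, the proposed algorithm reaches a precision of 0.329, lower only than that of DeepSTRCF among the compared algorithms. These results indicate that the proposed algorithm has good accuracy and robustness. In background awareness, the sample collection scheme and the weight assignment directly affect tracking performance; future work will focus on building a speed-adaptive sample collection model.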

Appendix

1) Solve for $ {\mathit{\boldsymbol{h}}^*}$ as follows

$ \begin{array}{c} \boldsymbol{h}^{*}=\arg {\min \limits_{h}}\left\{\frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+\hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)+\right. \\ \left.\frac{\mu}{2}\left\|\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)\right\|_{2}^{2}\right\}= \\ \frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+\hat{\boldsymbol{\xi}}^{\mathrm{T}} \hat{\boldsymbol{g}}-\sqrt{T} \hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}+ \\ \frac{\mu}{2}\left(\hat{\boldsymbol{g}}^{\mathrm{T}}-\sqrt{\boldsymbol{T}} \boldsymbol{h}^{\mathrm{T}}\left(\boldsymbol{P F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right)\right)\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)= \\ \frac{\lambda}{2}\|\boldsymbol{h}\|_{2}^{2}+\hat{\boldsymbol{\xi}}^{\mathrm{T}} \hat{\boldsymbol{g}}-\sqrt{T} \hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}+ \\ \frac{\mu}{2}\left(\hat{\boldsymbol{g}}^{\mathrm{T}} \hat{\boldsymbol{g}}-\sqrt{\boldsymbol{T}} \hat{\boldsymbol{g}}^{\mathrm{T}}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}-\right. \\ \sqrt{\boldsymbol{T}} \boldsymbol{h}^{\mathrm{T}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{g}}+ \\ \left.T \boldsymbol{h}^{\mathrm{T}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right)\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right) \end{array} $

(A1)

$ \begin{array}{c} \frac{\partial\left(\boldsymbol{h}^{*}\right)}{\partial \boldsymbol{h}}=\lambda \boldsymbol{h}-\sqrt{T}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{\xi}}-\frac{\mu}{2} \sqrt{T}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{g}}- \\ \frac{\mu}{2} \sqrt{T}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{g}}+ \\ \frac{\mu}{2} T\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right)\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}+ \\ \frac{\mu}{2} T\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right)\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}= \\ \lambda \boldsymbol{h}-\sqrt{T}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{\xi}}- \\ \mu \sqrt{T}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{g}}+ \\ \mu T\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right)\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h} \\ \text { Let } \boldsymbol{g}=\frac{1}{\sqrt{T}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{g}} \\ \boldsymbol{\xi}=\frac{1}{\sqrt{T}}\left(\boldsymbol{P} \boldsymbol{F}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \hat{\boldsymbol{\xi}} \end{array} $

(A2)

$ \frac{\partial\left(\boldsymbol{h}^{*}\right)}{\partial \boldsymbol{h}}=\lambda \boldsymbol{h}-T \boldsymbol{\xi}-\mu T \boldsymbol{g}+\mu T \boldsymbol{h} $

(A3)

Setting $\frac{\partial\left(\boldsymbol{h}^{*}\right)}{\partial \boldsymbol{h}}=0 $ and solving Eq. (A3) gives

$ \boldsymbol{h}=T \frac{(\mu \boldsymbol{g}+\boldsymbol{\xi})}{\lambda+\mu T} $

(A4)
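The closed form (A4) can be checked numerically by substituting it back into the gradient expression (A3). The sketch below uses random vectors in place of $\boldsymbol{g}$ and the multiplier term of (A3); the sizes and the values of $\lambda$ and $\mu$ are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np


def h_closed_form(g, xi, lam, mu, T):
    """Eq. (A4): h = T (mu * g + xi) / (lam + mu * T)."""
    return T * (mu * g + xi) / (lam + mu * T)


def gradient_a3(h, g, xi, lam, mu, T):
    """Eq. (A3): lam * h - T * xi - mu * T * g + mu * T * h."""
    return lam * h - T * xi - mu * T * g + mu * T * h


rng = np.random.default_rng(0)
T, lam, mu = 64, 0.01, 1.0           # illustrative values only
g = rng.standard_normal(T)           # stand-in for the auxiliary variable
xi = rng.standard_normal(T)          # stand-in for the Lagrange multiplier
h = h_closed_form(g, xi, lam, mu, T)
residual = float(np.max(np.abs(gradient_a3(h, g, xi, lam, mu, T))))
print(residual)
```

The residual is zero up to floating-point rounding, since (A3) is linear in $\boldsymbol{h}$ with coefficient $\lambda+\mu T$.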

2) Solve for $ {\mathit{\boldsymbol{g}}^*}$ as follows

$ \begin{array}{c} \boldsymbol{g}^{*}=\arg {\min \limits_{{\mathit{\boldsymbol{g}}}}}\left\{\frac{1}{2}\|\hat{\boldsymbol{y}}-\hat{\boldsymbol{x}} \hat{\boldsymbol{g}}\|_{2}^{2}+\right. \\ \hat{\boldsymbol{\xi}}^{\mathrm{T}}\left(\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right)+ \\ \frac{\mu}{2}\left\|\hat{\boldsymbol{g}}-\sqrt{T}\left(\boldsymbol{F} \boldsymbol{P}^{\mathrm{T}} \otimes \boldsymbol{I}_{k}\right) \boldsymbol{h}\right\|_{2}^{2}+ \\ \left.\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \boldsymbol{\omega}\|_{2}^{2}\right\} \end{array} $

(A5)

Decompose into $T $ subproblems, $t=[1, …, T] $

$ \begin{array}{c} \boldsymbol{g}^{*}(t)=\arg {\min \limits_\boldsymbol{g}}\left\{\frac{1}{2}\left\|\hat{\boldsymbol{y}}(t)-\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{g}}(t)\right\|_{2}^{2}+\right.\\ \hat{\boldsymbol{\xi}}(t)^{\mathrm{T}}(\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t))+\\ \frac{\mu}{2}\|\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t)\|_{2}^{2}+\\ \left.\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2}\right\} \end{array} $

(A6)

The loss function is evaluated in the frequency domain; by Parseval's theorem, the optimization formula becomes

$ \begin{array}{c} \boldsymbol{g}^{*}(t)=\arg {\min \limits_{\boldsymbol{g}}}\left\{\frac{1}{2 T}\left\|\hat{\boldsymbol{y}}(t)-\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{g}}(t)\right\|_{2}^{2}+\right. \\ \hat{\boldsymbol{\xi}}(t)^{\mathrm{T}}(\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t))+ \\ \frac{\mu}{2}\|\hat{\boldsymbol{g}}(t)-\hat{\boldsymbol{h}}(t)\|_{2}^{2}+\\ \left.\frac{\beta}{2}\|\hat{\boldsymbol{g}}(t) \omega\|_{2}^{2}\right\} \end{array} $

(A7)

$ \begin{array}{c} \text { Set } \frac{\partial \boldsymbol{g}^{*}(t)}{\partial \hat{\boldsymbol{g}}(t)}=\frac{1}{T}\left(-\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{y}}(t)+\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{g}}(t)\right)+ \\ \hat{\boldsymbol{\xi}}(t)+\mu \hat{\boldsymbol{g}}(t)-\mu \hat{\boldsymbol{h}}(t)+\beta \boldsymbol{\omega}=0 \end{array} $

(A8)

$ \begin{array}{c} \hat{\boldsymbol{g}}(t)=\left(\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}}+T \mu \boldsymbol{I}_{k}\right)^{-1} \\ (\hat{\boldsymbol{y}}(t) \hat{\boldsymbol{x}}(t)-T \hat{\boldsymbol{\xi}}(t)+T \mu \hat{\boldsymbol{h}}(t)-T \beta \boldsymbol{\omega}) \end{array} $

(A9)

By the Sherman-Morrison formula ${\left({\mathit{\boldsymbol{A}} + \mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{T}}}} \right)^{ - 1}} = {\mathit{\boldsymbol{A}}^{ - 1}} - \frac{{{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{T}}}{\mathit{\boldsymbol{A}}^{ - 1}}}}{{1 + {\mathit{\boldsymbol{v}}^{\rm{T}}}{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}}} $, solving for $ \left(\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}}+T \boldsymbol{\mu} \boldsymbol{I}_{K}\right)^{-1}$ gives

$ \begin{array}{c} \left(\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}}+T \boldsymbol{\mu} \boldsymbol{I}_{K}\right)^{-1}=\frac{1}{T \boldsymbol{\mu}} \boldsymbol{I}_{K}- \\ \frac{1}{T \boldsymbol{\mu}} \times \frac{\hat{\boldsymbol{x}}(t) \hat{\boldsymbol{x}}(t)^{\mathrm{T}}}{T \boldsymbol{\mu}+\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{x}}(t)} \end{array} $

(A10)
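Eq. (A10) can be verified numerically, and it also shows why no explicit matrix inverse is needed in the $\hat{\boldsymbol{g}}(t)$ update. The sketch below is written under a simplifying assumption: a real-valued vector stands in for $\hat{\boldsymbol{x}}(t)$ (in the frequency domain the entries are complex and the transpose would be a conjugate transpose). The function names and sizes are illustrative only.

```python
import numpy as np


def inverse_a10(x, T, mu):
    """Right-hand side of Eq. (A10) for a real vector x."""
    k = x.shape[0]
    return (np.eye(k) - np.outer(x, x) / (T * mu + x @ x)) / (T * mu)


def apply_inverse_a10(x, b, T, mu):
    """Multiply the inverse in Eq. (A10) by b without forming a k-by-k matrix."""
    return (b - x * (x @ b) / (T * mu + x @ x)) / (T * mu)


rng = np.random.default_rng(1)
k, T, mu = 31, 64, 1.0               # illustrative sizes and penalty
x = rng.standard_normal(k)           # real stand-in for x_hat(t)
b = rng.standard_normal(k)           # stand-in for the second factor of Eq. (A9)

direct = np.linalg.inv(np.outer(x, x) + T * mu * np.eye(k))
err_inverse = float(np.max(np.abs(direct - inverse_a10(x, T, mu))))
err_apply = float(np.max(np.abs(direct @ b - apply_inverse_a10(x, b, T, mu))))
print(err_inverse, err_apply)
```

`apply_inverse_a10` needs only inner products and scalings, so each of the $T$ subproblems costs a number of operations linear in the number of channels rather than cubic.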

Substituting Eq. (A10) into Eq. (A9) gives

(A11)
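The result of this substitution can be written out compactly. In the sketch below, $\hat{\boldsymbol{b}}(t)$ is a shorthand introduced here (it is not notation from the original derivation) for the second bracketed factor of Eq. (A9); the expression is obtained by direct multiplication of Eq. (A10) with that factor and may differ in layout from the authors' typeset form.

$ \hat{\boldsymbol{g}}(t)=\frac{1}{T \mu}\left(\hat{\boldsymbol{b}}(t)-\frac{\hat{\boldsymbol{x}}(t)\, \hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{b}}(t)}{T \mu+\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{x}}(t)}\right) $

Because $\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{b}}(t)$ and $\hat{\boldsymbol{x}}(t)^{\mathrm{T}} \hat{\boldsymbol{x}}(t)$ are scalars, each subproblem is solved with vector operations only.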

References

Bertinetto L, Valmadre J, Golodetz S, Miksik O and Torr P H S. 2016. Staple: complementary learners for real-time tracking//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1401-1409[DOI: 10.1109/CVPR.2016.156]

Bolme D S, Beveridge J R, Draper B A and Lui Y M. 2010. Visual object tracking using adaptive correlation filters//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE: 2544-2550[DOI: 10.1109/CVPR.2010.5539960]

Danelljan M, Häger G, Khan F S and Felsberg M. 2014. Accurate scale estimation for robust visual tracking//Proceedings of British Machine Vision Conference. London, UK: BMVA Press: 1-65[DOI: 10.5244/C.28.65]

Danelljan M, Häger G, Khan F S and Felsberg M. 2015. Learning spatially regularized correlation filters for visual tracking//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 4310-4318[DOI: 10.1109/ICCV.2015.490]

Danelljan M, Robinson A, Khan F S and Felsberg M. 2016. Beyond correlation filters: learning continuous convolution operators for visual tracking//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, Netherlands: Springer: 472-488[DOI: 10.1007/978-3-319-46454-1_29]

Fan H, Lin L T, Yang F, Chu P, Deng G, Yu S J, Bai H X, Xu Y, Liao C Y and Ling H B. 2019. LaSOT: a high-quality benchmark for large-scale single object tracking//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: #00552[DOI: 10.1109/CVPR.2019.00552]

Galoogahi H K, Fagg A and Lucey S. 2017. Learning background-aware correlation filters for visual tracking//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1144-1152[DOI: 10.1109/ICCV.2017.129]

Galoogahi H K, Sim T and Lucey S. 2015. Correlation filters with limited boundaries//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 4630-4638[DOI: 10.1109/CVPR.2015.7299094]

Henriques J F, Caseiro R, Martins P and Batista J. 2012. Exploiting the circulant structure of tracking-by-detection with kernels//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 702-715[DOI: 10.1007/978-3-642-33765-9_50]

Henriques J F, Caseiro R, Martins P, Batista J. 2015. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3): 583-596 [DOI:10.1109/TPAMI.2014.2345390]

Li C, Lu C Y, Zhao X, Zhang B M, Wang H Y. 2018. Scale adaptive correlation filtering tracing algorithm based on feature fusion. Acta Optica Sinica, 38(5): #0515001 (李聪, 鹿存跃, 赵珣, 章宝民, 王红雨. 2018. 特征融合的尺度自适应相关滤波跟踪算法. 光学学报, 38(5): #0515001) [DOI:10.3788/AOS201838.0515001]

Li F, Tian C, Zuo W M, Zhang L and Yang M H. 2018. Learning spatial-temporal regularized correlation filters for visual tracking//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4904-4913[DOI: 10.1109/CVPR.2018.00515]

Li Y and Zhu J K. 2014. A scale adaptive kernel correlation filter tracker with feature integration//Agapito L, Bronstein M M and Rother C, eds. Computer Vision-ECCV 2014 Workshops. Zurich, Switzerland: Springer: 254-265[DOI: 10.1007/978-3-319-16181-5_18]

Liu B, Xu T F, Li X M, Shi G K, Huang B. 2019. Adaptive context-aware correlation filter tracking. Chinese Journal of Optics, 12(2): 265-273 (刘波, 许廷发, 李相民, 史国凯, 黄博. 2019. 自适应上下文感知相关滤波跟踪. 中国光学, 12(2): 265-273) [DOI:10.3788/CO.20191202.0265]

Lu H C, Li P X, Wang D. 2018. Visual object tracking: a survey. Pattern Recognition and Artificial Intelligence, 31(1): 61-76 (卢湖川, 李佩霞, 王栋. 2018. 目标跟踪算法综述. 模式识别与人工智能, 31(1): 61-76) [DOI:10.16451/j.cnki.issn1003-6059.201801006]

Meng L, Yang Y. 2019. A survey of object tracking algorithms. Acta Automatica Sinica, 45(7): 1244-1260 (孟琭, 杨旭. 2019. 目标跟踪算法综述. 自动化学报, 45(7): 1244-1260) [DOI:10.16383/j.aas.c180277]

Mueller M, Smith N and Ghanem B. 2017. Context-aware correlation filter tracking//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1387-1395[DOI: 10.1109/CVPR.2017.152]

Pu D G, Jin Z. 2010. New Lagrangian multiplier methods. Journal of Tongji University (Natural Science), 38(9): 1387-1391 (濮定国, 金中. 2010. 新的拉格朗日乘子方法. 同济大学学报(自然科学版), 38(9): 1387-1391) [DOI:10.3969/j.issn.0253-374x.2010.09.026]

Song R C, He X H, Wang Z Y. 2018. Complementary object tracking based on directional reliability. Acta Optica Sinica, 38(10): #1015001 (宋日成, 何小海, 王正勇. 2018. 基于方向可靠性的互补跟踪算法. 光学学报, 38(10): #1015001) [DOI:10.3788/AOS201838.1015001]

Wang N Y and Yeung D Y. 2013. Learning a deep compact image representation for visual tracking//Proceedings of the 26th International Conference on Neural Information Processing System. Red Hook, USA: Curran Associates Inc.: 809-817

Wang Y, Yin W T, Zeng J S. 2019. Global convergence of ADMM in nonconvex nonsmooth optimization. Journal of Scientific Computing, 78(1): 29-63 [DOI:10.1007/s10915-018-0757-z]

Welch G and Bishop G. 2001. An introduction to the kalman filter: SIGGRAPH 2001 course 8//Computer Graphics, Annual Conference on Computer Graphics and Interactive Techniques. Los Angeles, USA: ACM Press, Addison-Wesley Publishing Company

Wu Y, Lim J and Yang M H. 2013. Online object tracking: a benchmark//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 2411-2418[DOI: 10.1109/CVPR.2013.312]

Wu Y, Lim J, Yang M H. 2015. Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9): 1834-1848 [DOI:10.1109/TPAMI.2014.2388226]

Xu Y B, Xu K, Wan J W, Xiong Z D and Li Y Y. 2018. Research on particle filter tracking method based on kalman filter//Proceedings of the 2nd IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC). Xi'an, China: IEEE: 1564-1568[DOI: 10.1109/IMCEC.2018.8469578]

Yin M F, Bo Y M, Zhu J L, Wu P L. 2019. Multi-scale context-aware correlation filter tracking algorithm based on channel reliability. Acta Optica Sinica, 39(5) (尹明锋, 薄煜明, 朱建良, 吴盘龙. 2019. 基于通道可靠性的多尺度背景感知相关滤波跟踪算法. 光学学报, 39(5)) [DOI:10.3788/AOS201939.0515002]

Yun S, Choi J, Yoo Y, Yun K M and Choi J Y. 2017. Action-decision networks for visual tracking with deep reinforcement learning//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1349-1358[DOI: 10.1109/CVPR.2017.148]