
Published: 2019-04-16
DOI: 10.11834/jig.180320
2019 | Volume 24 | Number 4




Image Analysis and Recognition










Learning background-temporal-aware correlation filter for real-time visual tracking
Zhu Jianzhang1, Wang Dong2, Lu Huchuan2
1. School of Mathematics and Information Sciences, Henan University of Economics and Law, Zhengzhou 450046, China;
2. School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China
Supported by: National Natural Science Foundation of China (61502070); Natural Science Foundation of Henan Province, China (18A110013)

Abstract

Objective Visual tracking is a classical computer vision problem with many applications. In generic visual tracking, the task is to estimate the trajectory of a target in an image sequence, given only its initial location. Recently, traditional approaches based on the discriminative correlation filter (DCF) have been successfully applied to tracking. These methods learn a discriminative correlation filter from a set of training samples generated by applying a circular shift operator to the tracked target object (the only accurate positive sample). The shifted patches are implicitly generated through the circulant property of correlation in the frequency domain and are used as negative examples for training the filter. All shifted patches are plagued by circular boundary effects and are not truly representative of negative patches in real-world scenes. Thus, the actual background is never modeled during learning, and the tracker tends to drift when the target object resembles the background. To improve performance, a large number of training samples is usually collected, which increases the computational complexity. Moreover, the online model update strategy, which ignores temporal consistency, easily biases the learned filter toward the background and causes drift. To resolve these problems, we construct a correlation filter-based objective function with an equality constraint on top of the background-aware correlation filter (BACF) tracking algorithm, termed the background-temporal-aware correlation filter (BTCF) visual object tracker. Our algorithm obtains true negative samples of the same size as the target object for the training set by multiplying the filter with a binary mask that suppresses the background region. Moreover, it can learn a strongly discriminative correlation filter-based classifier using only the current frame, without online updating of the model. Method The proposed BTCF model is convex and can be minimized to the globally optimal solution. To further reduce the computational burden, we propose a new equality-constrained discriminative correlation filter objective function. This objective satisfies the Eckstein-Bertsekas condition and can therefore be transformed into an unconstrained augmented Lagrange multiplier formula that converges to the global optimum. Two subproblems with closed-form solutions are then obtained with the alternating direction method of multipliers (ADMM). Each subproblem is smooth and convex and easy to solve, so every ADMM iteration yields the closed-form global optimum of the corresponding subproblem. Because subproblem two involves a convolution, it is difficult to solve directly; according to Parseval's theorem, we therefore transform it into the Fourier domain to reduce the computational complexity. The efficient ADMM-based approach learns our filter on multi-channel features with a computational cost of ${\rm{O}}\left({LKT\lg \left(T \right)} \right)$, where $T$ is the size of the vectorized frame, $K$ is the number of feature channels, and $L$ is the number of ADMM iterations. We apply the Sherman-Morrison lemma in the filter update to cope with changes in target and background appearance at real-time speed.
Our algorithm empirically converges within a few iterations and, with hand-crafted features, runs in real time, achieving notable accuracy improvements over the BACF tracker. Result The one-pass evaluation proposed by OTB2015 is used to compare the trackers under two criteria, namely, center location error and bounding box overlap ratio. The center location error, one of the most widely used evaluation metrics for object tracking, computes the average Euclidean distance between the center locations of the tracked targets and the manually labeled ground truth positions over all frames, and the percentage of frames whose estimated locations lie within a given threshold distance of the ground truth is taken as the precision. Another commonly used evaluation metric is the overlap score: we use the area under the curve (AUC) of each success plot, the average of the success rates over the sampled overlap thresholds, to rank the trackers. Our approach is compared with 10 state-of-the-art visual tracking algorithms on the public OTB2015 database. Results show that our BTCF algorithm is remarkably better than the other discriminative correlation filter-based trackers in center location error and AUC. OTB2015 categorizes its 100 sequences by annotating them with 11 attributes to evaluate and analyze the strengths and weaknesses of tracking approaches; BTCF is remarkably better than BACF in center location error and AUC on all 11 attributes, indicating that our algorithm is both effective and efficient. Using only histogram-of-oriented-gradients (HOG) hand-crafted features, BTCF improves the AUC by 1.3% over BACF on the OTB2015 database. Because color and edge features are complementary, we further introduce color names (CN) into the BTCF formulation, raising the AUC gain over BACF to 4.2%; with hand-crafted features (HOG and CN) alone, the AUC reaches 0.663 at 25.4 frame/s on OTB2015. Conclusion Compared with BACF and other current popular trackers, the proposed BTCF-based visual tracking algorithm copes with many challenging conditions. By introducing a temporal-aware term into the BACF model, a stronger discriminative classifier can be learned to separate the target from the background, especially under illumination variation, motion blur, out-of-plane rotation, and occlusion. The proposed BTCF-based algorithm therefore demonstrates robustness and real-time performance.

Key words

visual tracking; correlation filter; background-aware; temporal-aware; regularization; alternating direction method of multipliers

0 Introduction

Visual tracking is an important and active topic in computer vision. It draws on advanced techniques and core ideas from image processing, pattern recognition, artificial intelligence, and mathematics, and is widely applied in autonomous driving, intelligent surveillance, and human-computer interaction. In generic tracking, only the initial state of the target in the first frame is known; the task is then to accurately estimate the position, shape, or extent of the target in subsequent frames and to determine motion information such as velocity, direction, and trajectory, in support of higher-level tasks. Although steady progress has been made, building a general and robust tracking system remains highly challenging because of partial or full occlusion, scale and shape deformation, background clutter, and other disturbances.

In recent years, visual tracking algorithms based on the discriminative correlation filter (DCF) have attracted wide attention and study because they are accurate while efficient enough for real-time use. In the spatial domain, DCF densely samples a large number of positive and negative training examples by circularly shifting the foreground target, regresses them to soft labels drawn from a Gaussian distribution, and solves for the correlation filter by minimizing a ridge-regression loss. With the fast Fourier transform, the DCF framework elegantly converts spatial correlation into element-wise multiplication in the frequency domain, models the target appearance efficiently, reaches real-time performance, and ultimately learns a strongly discriminative classifier that separates the target from the background. Bolme et al. [1] first introduced correlation filtering into adaptive tracking with the minimum output sum of squared error (MOSSE) filter, whose main advantage is that both classifier training and target localization use the fast Fourier transform, yielding speeds above 600 frame/s. MOSSE trains the filter on samples from multiple frames, uses the peak-to-sidelobe ratio to measure the peak strength of the correlation response, and avoids drift by halting the online update of the appearance model in time. To address the shortage of training samples in MOSSE, Henriques et al. [2] proposed the kernel-based CSK tracker, improving MOSSE's dense circular sampling by approximating window shifts with circular shifts of a central image patch. To further raise performance, Danelljan et al. [3] extended CSK with the color names (CN) tracker, expanding the RGB color space to an 11-channel CN space; accuracy improved, but the multi-channel features increased the computational load. Henriques et al. [4] further developed CSK, on the basis of the kernel trick, into the kernelized correlation filter (KCF) tracker, which uses 31-channel histogram-of-oriented-gradients (HOG) features, incorporates multi-channel features into the correlation filter framework, and kernelizes correlation filtering by combining ridge regression with circulant matrices; the method is robust to motion blur, illumination change, and color variation. Tang et al. [5] introduced multiple kernel learning into KCF, proposing the multi-kernel correlation filter (MKCF) tracker, which uses weighted fusion or adaptive strategies to mitigate the arbitrariness of manually chosen single kernels.

Early correlation filter trackers localized the target at a single scale and could drift when the target size changed. To counter the negative effect of scale variation on tracking performance, Li et al. [6] proposed the scale-adaptive correlation filter (SAMF) tracker, which trains a translation filter over 7 coarse scales, detects on multi-scale image patches, and selects the translation and target scale corresponding to the maximum correlation response. In parallel, Danelljan et al. [7-8] proposed the discriminative scale space tracker (DSST), which trains a translation filter and a scale filter over 33 fine scales to estimate the target position; the scale filter is then applied at the detected location to estimate the target scale.

Boundary effects are a defect implicit in the correlation filter framework that easily causes overfitting. Because the training set is generated by circularly shifting the central target patch, only the single central sample is genuine; when the target moves quickly or the background interferes, the tracker drifts. Most of the algorithms above merely apply a cosine window to the feature map to weaken boundary effects, but the cosine window may make the tracker learn only part of the foreground while masking the background, reducing its discriminative power and causing tracking failure. To tackle boundary effects, Galoogahi et al. [9] used larger detection patches with smaller filters, dynamically reducing the number of synthetic samples in the training set and increasing the proportion of real ones; they formed an augmented Lagrange multiplier (ALM) formulation for the new objective and optimized it with the alternating direction method of multipliers (ADMM). Danelljan et al. [10] then proposed the spatially regularized discriminative correlation filter (SRDCF) tracker, which spatially weights the filter coefficients, with smaller weights near the target center and larger weights away from it, so the learned filter concentrates on the central region. Its objective is optimized with the Gauss-Seidel method; although the tracker's performance improves, SRDCF with hand-crafted HOG features runs at only 9 frame/s, nearly 25 times slower than KCF. The reason is that SRDCF breaks the circulant-shift structure of correlation filtering, making the optimization difficult; it also records all samples from the initial frame to the current one during training, and, with the time spent on feature extraction and the Gauss-Seidel solver, its computational complexity remains high. Building on SRDCF, Danelljan et al. [11] proposed the continuous convolution operator tracker (CCOT), which removes the earlier single-resolution restriction on feature maps, fuses multi-resolution feature maps, and combines traditional hand-crafted features with deep features of different resolutions to improve performance; through implicit interpolation, the discrete correlation filter framework is extended to the continuous domain, where continuous filters are learned and sub-pixel localization is achieved. Because CCOT uses high-dimensional deep features, its appearance model has many parameters (about 800 000), and the high-dimensional parameter space easily overfits and drifts. To remove the redundancy in CCOT without losing accuracy, Danelljan et al. [12] proposed the efficient convolution operators (ECO) tracker to accelerate CCOT, improving it in three ways: factorizing the convolution operation to reduce the model parameters, simplifying the training set with a Gaussian mixture model while preserving sample diversity, and improving the model update strategy.

None of the trackers above models the true background during correlation filter learning, which is another overfitting-prone defect implicit in the framework: the negative samples in the training set are circular shifts of the central target patch (the only accurate positive sample), so the tracker drifts easily when the target closely resembles the background. Discarding real background information throughout learning can weaken the discriminative power of the learned tracker. To mitigate this defect, deep-feature-based methods such as CCOT [11], ECO [12], MDNet [13], CREST [14], CFCF [15], and DeepSRDCF [16] were proposed in succession. These methods either extract deep features or use deep learning frameworks and therefore carry high computational cost; tracking performance improves greatly, but real-time performance declines, which is precisely a major obstacle to practical applications of visual tracking. To overcome these drawbacks, Galoogahi et al. [17] proposed the background-aware correlation filter (BACF) tracker, which densely samples true negative examples from the whole frame and reaches high tracking performance with hand-crafted HOG features alone while satisfying the real-time requirement (33.9 frame/s).

The main contributions of this paper are as follows:

1) Using a masking matrix and dense sampling, BACF obtains true positive and negative samples to model the target appearance and achieves good tracking results. However, BACF ignores the temporal consistency of the filter when learning it: when the target appearance changes abruptly, for example under fast motion or occlusion, the learned correlation filter drifts toward the background. To let the learned filter adapt to appearance changes between consecutive frames, the proposed BTCF introduces a temporal-aware term into the BACF framework and constructs a new equality-constrained correlation filter objective that penalizes the learned filter to prevent overfitting. Using only the current frame, and without an online model update strategy, BTCF learns a more discriminative classifier that separates the target from the background.

2) The proposed BTCF formulates an equality-constrained correlation filter objective. To reduce the computational complexity, the constrained objective is converted into an unconstrained augmented Lagrange multiplier formula and split by the alternating direction method of multipliers (ADMM) into two subproblems with globally optimal solutions, iteratively approaching a local optimum of the objective. Since both subproblems are convex, smooth, and differentiable, each has a closed-form solution that is the global optimum of that subproblem.

3) Tracking results on the public OTB2015 database [18] show that, after adding temporal awareness, BTCF improves the area under the success-plot curve (AUC) by 1.3% over BACF; because color and edge features are complementary, fusing CN color features with HOG raises the gain over BACF to 4.2%. With purely hand-crafted features, the AUC over the 100 OTB2015 videos is 0.663 at 25.4 frame/s, achieving real-time tracking.

1 Related work

This section briefly reviews the basic principle of discriminative correlation filtering and the background-aware correlation filter (BACF) algorithm; see [17-20] for more detailed expositions.

1.1 Discriminative correlation filter (DCF)

Discriminative correlation filtering learns a linear classifier or linear regressor and belongs to supervised learning; as a correlation filter tracking framework it has shown strong performance on several visual tracking databases. The goal of a DCF tracker is to learn, in the spatial domain, a multi-channel correlation filter $\left\{ {{\mathit{\boldsymbol{h}}^k}} \right\}_{k = 1}^K$ from a training set $\left\{ {\left( {\mathit{\boldsymbol{x}}_m^k, {\mathit{\boldsymbol{y}}_m}} \right)} \right\}_{m = 1}^M, k = 1, \cdots , K$, where $K$ is the number of feature channels, $M$ is the number of training samples, $\mathit{\boldsymbol{x}}_m^k \in {{\bf{R}}^D}$ is the vectorized $k$-th channel feature map extracted from the $m$-th training sample, $D$ is the size of the feature map $\mathit{\boldsymbol{x}}_m^k$, $\mathit{\boldsymbol{y}}_m \in {{\bf{R}}^D}$ is the Gaussian label corresponding to $\mathit{\boldsymbol{x}}_m^k$, and ${\mathit{\boldsymbol{h}}^k} \in {{\bf{R}}^D}$ is the correlation filter corresponding to $\mathit{\boldsymbol{x}}_m^k \in {{\bf{R}}^D}$. The correlation response of the multi-channel filter $\left\{ {{\mathit{\boldsymbol{h}}^k}} \right\}_{k = 1}^K$ on the $m$-th training sample ${\mathit{\boldsymbol{x}}_m}$ is

$ {S_h}\left( {{\mathit{\boldsymbol{x}}_m}} \right) = \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{h}}^k} * \mathit{\boldsymbol{x}}_m^k} $ (1)

where $* $ denotes spatial circular convolution.

The correlation filter $\left\{ {{\mathit{\boldsymbol{h}}^k}} \right\}_{k = 1}^K$ is obtained by minimizing the ridge-regression objective in Eq. (2)

$ E\left( \mathit{\boldsymbol{h}} \right) = \sum\limits_{m = 1}^M {{\alpha _m}\left\| {{\mathit{\boldsymbol{y}}_m} - {S_h}\left( {{\mathit{\boldsymbol{x}}_m}} \right)} \right\|_2^2} + \frac{\lambda }{2}\sum\limits_{k = 1}^K {\left\| {{\mathit{\boldsymbol{h}}^k}} \right\|_2^2} $ (2)

where ${\alpha _m} \ge 0$ is the weight of each training sample and $\lambda$ is the regularization factor.

In the detection stage, $\left\{ {{z^k} \in {{\bf{R}}^D}} \right\}_{k = 1}^K$ denotes the vectorized $k$-th channel feature map extracted from the image patch of a new frame. The classifier response ${S_h}\left( \mathit{\boldsymbol{z}} \right)$ at every position is

$ {S_h}\left( \mathit{\boldsymbol{z}} \right) = {F^{ - 1}}\left( {\sum\limits_{k = 1}^K {{{\mathit{\boldsymbol{\hat h}}}^k} \odot {{\mathit{\boldsymbol{\hat z}}}^k}} } \right) $ (3)

where $\odot $ denotes element-wise multiplication, ${{\mathit{\boldsymbol{\hat h}}}^k}$ and ${{\mathit{\boldsymbol{\hat z}}}^k}$ are the discrete Fourier transforms (DFT) of ${\mathit{\boldsymbol{h}}^k}$ and ${\mathit{\boldsymbol{z}}^k}$, and ${F^{ - 1}}\left( \cdot \right)$ denotes the inverse Fourier transform (IFFT). The target is finally located at the position of the maximum response value of Eq. (3).
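To make Eqs. (1)-(3) concrete, the following minimal NumPy sketch solves the single-channel case of the ridge regression in Eq. (2) in the frequency domain and evaluates the detection response of Eq. (3); the function names and the single-channel simplification are ours, not part of the paper's MATLAB implementation.

import numpy as np

def train_dcf(samples, labels, alphas, lam=0.01):
    # Per-frequency closed form of Eq. (2) for one channel:
    # h_hat = sum_m a_m * conj(x_hat_m) * y_hat_m / (sum_m a_m * |x_hat_m|^2 + lambda)
    num, den = 0.0, 0.0
    for x, y, a in zip(samples, labels, alphas):
        xf, yf = np.fft.fft2(x), np.fft.fft2(y)
        num = num + a * np.conj(xf) * yf
        den = den + a * np.conj(xf) * xf
    return num / (den + lam)

def detect(h_hat, z):
    # Eq. (3): response map via element-wise product in the frequency domain;
    # the location of the peak gives the target position
    resp = np.real(np.fft.ifft2(h_hat * np.fft.fft2(z)))
    return np.unravel_index(np.argmax(resp), resp.shape)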

1.2 Background-aware correlation filter (BACF)

The training data of DCF trackers are collected by circular shifting, which relies strongly on an implicit periodic extension of the samples. This assumption lets model training and target localization be carried out efficiently with the fast Fourier transform, but it also brings the negative boundary-effect problem: circular shifting produces inaccurate negative samples, which makes it difficult to represent the true target appearance accurately and weakens the discriminative ability of the learned appearance model. BACF instead uses a masking matrix so that a small correlation filter can act on a larger search sample without introducing excessive background interference into the filter model, and it obtains true negative samples by dense sampling; the method can even search for the target over the entire image. The objective for learning the multi-channel BACF correlation filter is

$ \begin{array}{*{20}{c}} {E\left( \mathit{\boldsymbol{h}} \right) = \frac{1}{2}\sum\limits_{j = 1}^T {\left\| {\mathit{\boldsymbol{y}}\left( j \right) - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{h}}^{{k^{\rm{H}}}}}\mathit{\boldsymbol{P}}{\mathit{\boldsymbol{x}}^k}\left[ {\Delta {\tau _j}} \right]} } \right\|_2^2} + }\\ {\frac{\lambda }{2}\sum\limits_{k = 1}^K {\left\| {{\mathit{\boldsymbol{h}}^k}} \right\|_2^2} } \end{array} $ (4)

where ${\mathit{\boldsymbol{x}}^k} \in {{\bf{R}}^T}$ is the vectorized $k$-th channel feature map extracted from the whole image; $T$ is the size of the feature map ${\mathit{\boldsymbol{x}}^k}$; ${\mathit{\boldsymbol{x}}^k}\left[ {\Delta {\tau _j}} \right]$ is the $j$-step discrete circular shift of the signal ${\mathit{\boldsymbol{x}}^k}$; $\mathit{\boldsymbol{y}}\left( j \right)$ is the $j$-th element of the desired-output Gaussian label $\mathit{\boldsymbol{y}} \in {{\bf{R}}^T}$, corresponding to ${\mathit{\boldsymbol{x}}^k}\left[ {\Delta {\tau _j}} \right]$; ${\mathit{\boldsymbol{h}}^k} \in {{\bf{R}}^D}$ is the correlation filter; $\mathit{\boldsymbol{P}}$ is a $D \times T$ binary matrix that crops $D$ elements out of the full-image feature map ${\mathit{\boldsymbol{x}}^k}$, where $D$ is the number of elements of the vectorized target patch and usually $D \ll T$; and the superscript ${\rm{H}}$ denotes the conjugate transpose of a matrix or vector. BACF optimizes Eq. (4) alternately with an efficient ADMM to obtain the multi-channel correlation filter $\left\{ {{\mathit{\boldsymbol{h}}^k}} \right\}_{k = 1}^K$ at a computational cost of only ${\rm{O}}\left( {LKT\lg \left( T \right)} \right)$, where $L$ is the number of ADMM iterations.
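The effect of the binary matrix $\mathit{\boldsymbol{P}}$ (and of ${\mathit{\boldsymbol{P}}^{\rm{H}}}$, which zero-pads the small filter back to the search-region size) can be illustrated with a small 1-D sketch; the sizes below are arbitrary illustrative values, and the real BACF operates on 2-D multi-channel feature maps.

import numpy as np

T, D = 12, 4                        # search-region size T, filter size D (D << T)
start = (T - D) // 2                # P crops the central D elements

def crop(x):                        # acts like P: R^T -> R^D
    return x[start:start + D]

def zero_pad(h):                    # acts like P^H: R^D -> R^T
    g = np.zeros(T)
    g[start:start + D] = h
    return g

x = np.arange(T, dtype=float)
print(crop(x))                      # the D samples the filter actually sees
print(zero_pad(np.ones(D)))         # the zero-padded filter P^H h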

2 Proposed algorithm

2.1 Background-temporal-aware correlation filter (BTCF) model

The background-aware correlation filter (BACF) uses a masking matrix and dense sampling to obtain true positive and negative samples for modeling the target appearance, and it achieves good tracking results. However, BACF does not consider the temporal consistency of the filter when learning it: when the target undergoes abrupt appearance changes such as fast motion or occlusion, the learned correlation filter drifts toward the background. To let the learned filter adapt to appearance changes between consecutive frames, the proposed BTCF introduces into the BACF framework a temporal-aware term $\sum\limits_{k = 1}^K {\left\| {{\mathit{\boldsymbol{h}}^k} - {\mathit{\boldsymbol{h}}^{k\left( {v - 1} \right)}}} \right\|_2^2} $ that keeps the filters learned in consecutive frames as consistent as possible. The equality-constrained multi-channel correlation filter objective is defined as

$ \begin{array}{*{20}{c}} {E\left( \mathit{\boldsymbol{h}} \right) = \frac{1}{2}\sum\limits_{j = 1}^T {\left\| {\mathit{\boldsymbol{y}}\left( j \right) - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{h}}^{{k^{\rm{H}}}}}\mathit{\boldsymbol{P}}{\mathit{\boldsymbol{x}}^k}\left[ {\Delta {\tau _j}} \right]} } \right\|_2^2} + }\\ {\frac{\lambda }{2}\sum\limits_{k = 1}^K {\left\| {{\mathit{\boldsymbol{h}}^k}} \right\|_2^2} + \frac{\varpi }{2}\sum\limits_{k = 1}^K {\left\| {{\mathit{\boldsymbol{h}}^k} - {\mathit{\boldsymbol{h}}^{k\left( {v - 1} \right)}}} \right\|_2^2} = }\\ {\frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2}\\ {{\rm{s}}.\;{\rm{t}}.\;\;\;\;{\mathit{\boldsymbol{g}}^k} = {\mathit{\boldsymbol{P}}^{\rm{H}}}{\mathit{\boldsymbol{h}}^k}} \end{array} $ (5)

where ${{\mathit{\boldsymbol{h}}^{k\left( {v - 1} \right)}}}$ is the correlation filter learned in the previous frame, the superscript ${v - 1}$ denotes the previous frame, $\varpi $ is the temporal regularization factor, $\mathit{\boldsymbol{h}} = {\left[ {{\mathit{\boldsymbol{h}}^{{1^{\rm{H}}}}}, {\mathit{\boldsymbol{h}}^{{2^{\rm{H}}}}}, \cdots , {\mathit{\boldsymbol{h}}^{{K^{\rm{H}}}}}} \right]^{\rm{H}}}$ is the $KD$-dimensional column vector formed by concatenating the vectorized filters ${\mathit{\boldsymbol{h}}^k}$ of the $K$ channels, $\mathit{\boldsymbol{g}} = \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}$ is the $KT$-dimensional column vector formed by concatenating the vectorized ${\mathit{\boldsymbol{g}}^k}$ of the $K$ channels, $ \otimes $ denotes the Kronecker product, and ${\mathit{\boldsymbol{I}}_K}$ is the $K \times K$ identity matrix.

2.2 Model optimization

Eq. (5) is an equality-constrained optimization problem and can be converted into the unconstrained augmented Lagrange multiplier formula

$ \begin{array}{*{20}{c}} {L\left( {\mathit{\boldsymbol{h}},\mathit{\boldsymbol{g}},\mathit{\boldsymbol{\xi }}} \right) = \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + }\\ {\frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + {\mathit{\boldsymbol{\xi }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + }\\ {\frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2} \end{array} $ (6)

where $\mu$ is the penalty factor and $\mathit{\boldsymbol{\xi }} = {\left[ {{\mathit{\boldsymbol{\xi }}^{{1^{\rm{H}}}}}, {\mathit{\boldsymbol{\xi }}^{{2^{\rm{H}}}}}, \cdots , {\mathit{\boldsymbol{\xi }}^{{K^{\rm{H}}}}}} \right]^{\rm{H}}}$ is the $KT$-dimensional Lagrange multiplier column vector. Minimizing Eq. (6) amounts to finding a saddle point of the augmented Lagrangian. ADMM is used with alternating iterations to approach a local optimum of Eq. (6) by splitting the objective into two easily solved subproblems. Because both subproblems are convex, smooth, and differentiable, each alternating iteration has a closed-form solution that is the global optimum of the corresponding subproblem.

1) Subproblem 1: solve for ${\mathit{\boldsymbol{h}}^ * }$. Treating $\mathit{\boldsymbol{y}}, \mathit{\boldsymbol{P}}, \mathit{\boldsymbol{g}}, \mathit{\boldsymbol{\xi }}, \mu , \varpi , {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}$ in Eq. (6) as known quantities, the optimization problem (6) becomes a convex, smooth, differentiable subproblem with the following closed-form solution (see the Appendix for the detailed derivation)

$ {\mathit{\boldsymbol{h}}^ * } = \frac{1}{{\lambda + \mu }}\left[ {\left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{\xi }} + \mu \left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{g}}} \right] $ (7)
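Because $\left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)$ applied to a stacked vector simply crops every channel, Eq. (7) can be evaluated without forming any Kronecker product. A minimal NumPy sketch with our own illustrative array layout ((K, T) channel stacks in the spatial domain):

import numpy as np

def update_h(g, xi, lam, mu, start, D):
    # Eq. (7): h* = [(I_K kron P) xi + mu * (I_K kron P) g] / (lambda + mu);
    # cropping the central D samples of each channel realizes I_K kron P
    crop = slice(start, start + D)
    return (xi[:, crop] + mu * g[:, crop]) / (lam + mu)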

2) Subproblem 2: solve for ${{\mathit{\boldsymbol{\hat g}}}^ * }$. Treating $\mathit{\boldsymbol{y}}, \mathit{\boldsymbol{P}}, \mathit{\boldsymbol{h}}, \mathit{\boldsymbol{\xi }}, \mu , \varpi , {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}$ in Eq. (6) as known quantities, the optimization problem (6) becomes a convex, smooth, differentiable subproblem with the following closed-form solution (see the Appendix for the detailed derivation)

$ \begin{array}{*{20}{c}} {{\mathit{\boldsymbol{g}}^ * } = \mathop {\arg \min }\limits_\mathit{\boldsymbol{g}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + }\\ {\frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + }\\ {\frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}} + {\mathit{\boldsymbol{\xi }}^ * }} \right\|_2^2} \end{array} $ (8)

where $\mathit{\boldsymbol{\xi }} = \mu {\mathit{\boldsymbol{\xi }}^ * }$. The convolution operator $ * $ in Eq. (8) makes the optimization difficult to solve directly, so by Parseval's theorem we obtain the optimization formula

$ \begin{array}{*{20}{c}} {{{\mathit{\boldsymbol{\hat g}}}^ * } = \mathop {\arg \min }\limits_{\mathit{\boldsymbol{\hat g}}} \frac{1}{2}\left\| {\mathit{\boldsymbol{\hat y}} - \sum\limits_{k = 1}^K {{{\mathit{\boldsymbol{\hat g}}}^k} \odot {{\mathit{\boldsymbol{\hat x}}}^k}} } \right\|_2^2 + }\\ {T\frac{\varpi }{2}\left\| {\mathit{\boldsymbol{\hat g}} - {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}} \right\|_2^2 + }\\ {T\frac{\mu }{2}\left\| {\mathit{\boldsymbol{\hat g}} - \widehat {\left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} + {{\mathit{\boldsymbol{\hat \xi }}}^ * }} \right\|_2^2} \end{array} $ (9)

where each element of ${\mathit{\boldsymbol{\hat y}}}$ is written $\mathit{\boldsymbol{\hat y}}\left( t \right)$, $1 \le t \le T$. $\mathit{\boldsymbol{\hat y}}\left( t \right)$ depends only on the $K$ values in $\mathit{\boldsymbol{\hat x}}\left( t \right) = {\left[ {{{\mathit{\boldsymbol{\hat x}}}_1}\left( t \right), {{\mathit{\boldsymbol{\hat x}}}_2}\left( t \right), \cdots , {{\mathit{\boldsymbol{\hat x}}}_K}\left( t \right)} \right]^{\rm{H}}}$ and $\mathit{\boldsymbol{\hat g}}\left( t \right) = {\left[ {{{\mathit{\boldsymbol{\hat g}}}_1}{{\left( t \right)}^{\rm{H}}}, {{\mathit{\boldsymbol{\hat g}}}_2}{{\left( t \right)}^{\rm{H}}}, \cdots , {{\mathit{\boldsymbol{\hat g}}}_K}{{\left( t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$, so Eq. (9) decomposes into $T$ independent $K \times K$ linear subsystems, namely

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{\hat g}}{{\left( t \right)}^ * } = {{\left( {\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}} + T\left( {\mu + \varpi } \right){\mathit{\boldsymbol{I}}_K}} \right)}^{ - 1}} \times }\\ {\left( {\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right) - T\mathit{\boldsymbol{\hat \xi }}\left( t \right) + T\mu \mathit{\boldsymbol{\hat h}}\left( t \right) + T\varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right)} \end{array} $ (10)

where $\mathit{\boldsymbol{\hat h}}\left( t \right) = {\left[ {{{\mathit{\boldsymbol{\hat h}}}_1}{{\left( t \right)}^{\rm{H}}}, {{\mathit{\boldsymbol{\hat h}}}_2}{{\left( t \right)}^{\rm{H}}}, \cdots , {{\mathit{\boldsymbol{\hat h}}}_K}{{\left( t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$ and ${{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right) = {\left[ {\mathit{\boldsymbol{\hat g}}_1^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}, \mathit{\boldsymbol{\hat g}}_2^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}, \cdots , \mathit{\boldsymbol{\hat g}}_K^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$.

By the Sherman-Morrison formula, ${\left( {\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{H}}} + \mathit{\boldsymbol{A}}} \right)^{ - 1}} = {\mathit{\boldsymbol{A}}^{ - 1}} - \frac{{{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{H}}}{\mathit{\boldsymbol{A}}^{ - 1}}}}{{{\mathit{\boldsymbol{v}}^{\rm{H}}}{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}} + 1}}$, we obtain

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{\hat g}}{{\left( t \right)}^ * } = \frac{1}{{\mu + \varpi }}\left( {\frac{{\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right)}}{T} - \mathit{\boldsymbol{\hat \xi }}\left( t \right) + \mu \mathit{\boldsymbol{\hat h}}\left( t \right) + \varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right) - }\\ {\frac{1}{B}\left[ {\frac{{\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_x}\left( t \right)}}{T} - \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_\xi }\left( t \right) + \mu \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_h}\left( t \right) + \varpi \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_g}\left( t \right)} \right]} \end{array} $ (11)

where ${{\hat S}_x}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat x}}\left( t \right)$, ${{\hat S}_\xi }\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat \xi }}\left( t \right)$, $B = \left( {\mu + \varpi } \right)\left[ {{{\hat S}_x}\left( t \right) + T\left( {\mu + \varpi } \right)} \right]$, ${{\hat S}_h}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat h}}\left( t \right)$, and ${{\hat S}_g}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}{{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)$.

Computing ${\mathit{\boldsymbol{\hat g}}}$ with Eq. (11) has a time complexity of ${\rm{O}}\left( {T \times K} \right)$.
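A vectorized sketch of Eq. (11), updating all $T$ frequency bins at once; here xf, hf, gf_prev, and xif are assumed to be (K, T) complex arrays holding the channel-wise FFTs, yf is the length-T FFT of the label, and all names are ours.

import numpy as np

def update_g(xf, yf, hf, gf_prev, xif, mu, varpi, T):
    sx = np.sum(np.conj(xf) * xf, axis=0).real       # S_x(t) = x(t)^H x(t)
    sxi = np.sum(np.conj(xf) * xif, axis=0)          # S_xi(t)
    sh = np.sum(np.conj(xf) * hf, axis=0)            # S_h(t)
    sg = np.sum(np.conj(xf) * gf_prev, axis=0)       # S_g(t)
    B = (mu + varpi) * (sx + T * (mu + varpi))
    first = (yf * xf / T - xif + mu * hf + varpi * gf_prev) / (mu + varpi)
    second = (yf * xf * sx / T - xf * sxi + mu * xf * sh + varpi * xf * sg) / B
    return first - second                            # g_hat(t)* for every bin t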

3) Subproblem 3: update ${\mathit{\boldsymbol{\hat \xi }}}$. The augmented Lagrange multiplier vector ${\mathit{\boldsymbol{\hat \xi }}}$ is updated as

$ {{\mathit{\boldsymbol{\hat \xi }}}^{\left( {i + 1} \right)}} \leftarrow {{\mathit{\boldsymbol{\hat \xi }}}^{\left( i \right)}} + \mu \left( {{{\mathit{\boldsymbol{\hat g}}}^{\left( {i + 1} \right)}} - {{\mathit{\boldsymbol{\hat h}}}^{\left( {i + 1} \right)}}} \right) $ (12)

where ${{\mathit{\boldsymbol{\hat h}}}^{\left( {i + 1} \right)}}$ and ${{\mathit{\boldsymbol{\hat g}}}^{\left( {i + 1} \right)}}$ are the results of the $({i + 1})$-th ADMM iteration of subproblems 1 and 2. The parameter $\mu$ is usually chosen by the schedule ${{\mu ^{\left( {i + 1} \right)}} = \min \left( {{\mu _{\max }}, \beta {\mu ^{\left( i \right)}}} \right)}$.
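Putting subproblems 1-3 together gives the per-frame training loop sketched below, reusing the update_h and update_g helpers from the sketches above; the initial penalty value mu = 1 is our assumption, since the paper specifies ${{\mu _{\max }}}$ = 100, $\beta $ = 10, and $L$ = 2 but not the starting $\mu$.

import numpy as np

def apply_pad(h, T, start):
    # Apply I_K kron P^H: embed each (K, D) filter row into length T
    K, D = h.shape
    g = np.zeros((K, T))
    g[:, start:start + D] = h
    return g

def train_btcf_frame(xf, yf, gf_prev, T, D, start,
                     lam=0.01, varpi=15.0, mu=1.0, mu_max=100.0, beta=10.0, L=2):
    gf = gf_prev.copy()                  # warm start from the previous frame
    xif = np.zeros_like(gf)              # Fourier-domain Lagrange multiplier
    for _ in range(L):
        g = np.fft.ifft(gf, axis=1).real
        xi = np.fft.ifft(xif, axis=1).real
        h = update_h(g, xi, lam, mu, start, D)                    # Eq. (7)
        hf = np.fft.fft(apply_pad(h, T, start), axis=1)
        gf = update_g(xf, yf, hf, gf_prev, xif, mu, varpi, T)     # Eq. (11)
        xif = xif + mu * (gf - hf)                                # Eq. (12)
        mu = min(mu_max, beta * mu)                               # penalty schedule
    return gf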

2.3 Target localization

In the target localization stage, the ADMM result ${{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}$ from the previous frame is used to compute the response

$ S\left( {\mathit{\boldsymbol{z}}_r^v} \right) = {F^{ - 1}}\left( {\sum\limits_{k = 1}^K {conj\left( {{{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)k}}} \right)} \odot \mathit{\boldsymbol{\hat z}}_r^{vk}} \right) $ (13)

and the target in the current frame $v$ is located at the position of the maximum response value. In Eq. (13), the superscript ${v - 1}$ denotes the previous frame, $K$ is the number of feature channels, and $conj\left( \cdot \right)$ denotes the conjugate operation. Figure 1 shows the flow chart of the proposed background- and temporal-aware correlation filter tracker. As in [17], a multi-resolution search strategy estimates the target scale: centered on the previous-frame target, patches ${\left\{ {\mathit{\boldsymbol{z}}_r^v} \right\}_{r \in \left\{ {\left\lfloor {\frac{{1 - S}}{2}} \right\rfloor , \cdots , \left\lfloor {\frac{{S - 1}}{2}} \right\rfloor } \right\}}}$ are extracted at scales ${\alpha ^r}$, where $S$ is the number of multi-resolution scales and $\alpha$ is the scale increment factor; a two-step search strategy refines the target position, and the highest response value among the $S$ scale response maps gives the final detection of the target position and scale.

Fig. 1 Flow chart of the BTCF-based visual tracking algorithm
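A simplified sketch of this localization step (omitting the two-step refinement): Eq. (13) is evaluated on patches extracted at the $S$ scales, and the globally highest response selects both the translation and the scale. patches_f is assumed to hold the (K, T) channel-wise FFTs of each scaled patch, and all names are ours.

import numpy as np

def respond(gf_prev, zf):
    # Eq. (13): response of the previous-frame filter on one search patch
    return np.fft.ifft(np.sum(np.conj(gf_prev) * zf, axis=0)).real

def locate(gf_prev, patches_f, alpha=1.01, S=5):
    scales = [alpha ** r for r in range(-(S // 2), S // 2 + 1)]
    best_pos, best_scale, best_val = None, None, -np.inf
    for zf, s in zip(patches_f, scales):
        resp = respond(gf_prev, zf)
        if resp.max() > best_val:
            # flat peak index (unravel it for 2-D response maps)
            best_val, best_pos, best_scale = resp.max(), int(np.argmax(resp)), s
    return best_pos, best_scale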

3 Experimental results

3.1 Experimental setup

To evaluate the effectiveness of the proposed BTCF, comparison experiments are run on the 100 video sequences of the OTB2015 database [18] with identical parameters for all sequences. BTCF is implemented in MATLAB 2017a on a platform with an Intel i7-4790 CPU @ 3.6 GHz and 32 GB of memory running Ubuntu 14.04 (64 bit).

3.2 Implementation details

To verify the robustness and effectiveness of BTCF, 10 state-of-the-art correlation filter-based trackers are selected: KCF [4], SAMF [6], SRDCF [10], ECOHC [12], BACF [17], fDSST [8], Staple [21], STRCF [22], CACF [23], and CFAT [24]. They are compared with BTCF on the OTB2015 database both qualitatively and quantitatively.

For fairness, BTCF keeps the parameter settings of BACF; the 10 algorithms above and BTCF all use purely hand-crafted features, and for every sequence only the initial position in the first frame is given as ground truth.

The main parameter settings are: number of multi-resolution scales $S$ = 5, scale increment factor $\alpha$ = 1.01, 31 HOG feature channels, 11 CN feature channels, number of ADMM iterations $L$ = 2, regularization factor $\lambda $ = 0.01, $\varpi $ = 15, ${{\mu _{\max }}}$ = 100, $\beta $ = 10, and the kernel bandwidth for generating the Gaussian regression labels is set to 0.075.
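Collected as a single illustrative configuration (the key names are ours; the values are the settings listed above):

BTCF_PARAMS = {
    "num_scales": 5,         # S, number of multi-resolution scales
    "scale_step": 1.01,      # alpha, scale increment factor
    "hog_channels": 31,
    "cn_channels": 11,
    "admm_iterations": 2,    # L
    "lambda": 0.01,          # regularization factor
    "varpi": 15,             # temporal regularization factor
    "mu_max": 100,
    "beta": 10,
    "label_sigma": 0.075,    # kernel bandwidth of the Gaussian regression labels
}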

3.3 Qualitative comparison with BACF

Figure 2 shows 8 challenging video sequences covering 11 different video attributes. The sequences in Fig. 2 show qualitatively that BTCF localizes the target accurately and clearly outperforms BACF. The effectiveness and robustness of BTCF are analyzed below by video attribute; here BTCFHOG denotes the proposed tracking framework using HOG features only.

Fig. 2 Comparison between the BTCF and BACF algorithms
((a)BlurOwl; (b)BlurCar4;(c)DragonBaby; (d)Skating2-1;(e)Skating1;(f)KiteSurf; (g)Jogging-2;(h)Freeman4)

1) Motion blur (MB). Motion blur arises from fast target motion or camera shake. As Fig. 2(a) shows, in the BlurOwl sequence camera shake blurs the target between frames 154 and 155, and BACF (blue dashed box) drifts; BTCF (red solid box), having added temporal-aware information, learns a stronger discriminative classifier and localizes the target well throughout the video.

2) Fast motion (FM). Fast motion generally means the target moves more than 20 pixels between frames. As Fig. 2(b) shows, in the BlurCar4 sequence BACF can track the target but fails to adjust the scale in time when fast motion is accompanied by scale change; BTCF uses the same scale search strategy as BACF (multi-resolution scales $S$ = 5, scale increment factor $\alpha $ = 1.01) and locks onto the target well.

3) Out-of-plane rotation (OPR) and deformation (DEF). Out-of-plane rotation means the target is occluded by itself or other objects as it rotates; deformation generally refers to non-rigid rotational deformation. As Fig. 2(c) shows, in the DragonBaby sequence the baby's head is self-occluded by rotation from frame 26 to frame 113; when the target reappears, BTCF marks it with high precision, whereas BACF drifts. As Fig. 2(d) shows, in the Skating2-1 sequence, when the female skater reappears after being occluded by the male skater, BTCF, with its added temporal-aware information, models the target appearance well and achieves good tracking.

4) Illumination variation (IV) and background clutter (BC). As Fig. 2(e) shows, in the Skating1 sequence, drastic stage-lighting changes accompanied by background clutter challenge most trackers. At frame 85, the classifier learned by BACF is drawn toward the drastically changing background lighting and drifts, while BTCF tracks well.

5) In-plane rotation (IPR) and scale variation (SV). As Fig. 2(f) shows, in the KiteSurf sequence, after the surfer leaves the sea surface his body rotates under fast motion; BTCF localizes the target well throughout.

6) Occlusion (OCC). Handling occlusion effectively has long been a major difficulty in visual tracking. As Fig. 2(g) shows, in the Jogging-2 sequence, when the woman in white is occluded by the pole, the filter learned by BACF mistakes the pole for the target and drifts. Thanks to its temporal awareness, BTCF effectively constrains the learned correlation filter during short occlusions and thus avoids them.

7) Low resolution (LR). As Fig. 2(h) shows, in the Freeman4 sequence, the initial target is very small and of very low resolution, and it is occasionally occluded while growing from small to large. BACF drifts when occlusion occurs at frame 159, whereas BTCF not only tracks the target well but also searches scales effectively as the target grows.

The above analysis of the 8 video sequences in Fig. 2 qualitatively demonstrates that the proposed BTCF is more robust and effective than BACF.

3.4 Quantitative comparison with 10 state-of-the-art algorithms

Center location error and the area under the success-plot curve (AUC) are used as evaluation metrics; the 100 sequences of the OTB2015 database are run with identical parameters against 10 state-of-the-art correlation filter-based trackers. The center location error is the Euclidean distance between the tracking result and the ground truth in every frame; the overlap rate is the intersection-over-union (${\rm{IoU}}$) between the tracking result and the ground truth, computed for the tracked bounding box as

$ {f_{{\rm{IoU}}}} = \frac{{area\left( {RO{I_{\rm{T}}} \cap RO{I_{\rm{G}}}} \right)}}{{area\left( {RO{I_{\rm{T}}} \cup RO{I_{\rm{G}}}} \right)}} $ (14)

where $RO{I_{\rm{T}}}$ is the tracked bounding box and $RO{I_{\rm{G}}}$ is the ground truth.
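For reference, a direct implementation of Eq. (14) for axis-aligned boxes given as (x, y, w, h); the function name and the box convention are ours.

def iou(box_t, box_g):
    # Intersection-over-union of the tracked box and the ground-truth box
    x1 = max(box_t[0], box_g[0])
    y1 = max(box_t[1], box_g[1])
    x2 = min(box_t[0] + box_t[2], box_g[0] + box_g[2])
    y2 = min(box_t[1] + box_t[3], box_g[1] + box_g[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = box_t[2] * box_t[3] + box_g[2] * box_g[3] - inter
    return inter / union if union > 0 else 0.0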

Figure 3 shows the success plots over the 100 OTB2015 sequences (success plots for the 11 video attributes are given in Fig. 1 of the Appendix). Table 1 quantitatively reports the AUC of the 10 state-of-the-art algorithms and the proposed BTCF on the 11 video attributes. The curves in Fig. 3 and the corresponding AUC values in Table 1 show that the AUC of BTCF is clearly higher than that of BACF on all 11 attributes.

Fig. 3 Success plots of the 100 video sequences on the OTB2015 database
Fig. 1 Success plots of the 11 video attributes on the OTB2015 database

Table 1 Quantitative comparison of the AUC of 10 state-of-the-art algorithms and the proposed BTCF on 11 video attributes of the OTB2015 database

Attribute Videos KCF SAMF Staple CFAT fDSST SRDCF CACF ECOHC STRCF BACF BTCFHOG BTCF
OTB2015 100 0.446 0.446 0.582 0.572 0.55 0.602 0.6 0.645 0.656 0.621 0.634 0.663
IV 35 0.461 0.45 0.586 0.532 0.563 0.609 0.601 0.638 0.656 0.632 0.617 0.659
SV 61 0.388 0.396 0.53 0.52 0.513 0.559 0.545 0.62 0.635 0.579 0.629 0.635
OCC 44 0.427 0.445 0.577 0.546 0.522 0.559 0.586 0.615 0.622 0.586 0.591 0.629
DEF 39 0.443 0.409 0.586 0.492 0.51 0.533 0.594 0.611 0.611 0.594 0.589 0.619
MB 29 0.456 0.28 0.547 0.595 0.537 0.603 0.577 0.652 0.652 0.585 0.649 0.649
FM 37 0.461 0.336 0.552 0.572 0.568 0.616 0.598 0.647 0.633 0.614 0.635 0.642
IPR 51 0.455 0.453 0.553 0.547 0.542 0.545 0.573 0.589 0.601 0.583 0.575 0.609
OPR 59 0.445 0.445 0.557 0.549 0.522 0.551 0.571 0.617 0.632 0.594 0.597 0.638
OV 14 0.349 0.35 0.481 0.502 0.457 0.492 0.509 0.591 0.587 0.552 0.567 0.579
BC 31 0.486 0.45 0.574 0.538 0.585 0.58 0.586 0.648 0.66 0.625 0.59 0.654
LR 9 0.273 0.416 0.396 0.471 0.429 0.521 0.448 0.54 0.577 0.514 0.539 0.583
Speed/(frame/s) ~ 238 24.7 69.4 4.48 93.9 9.99 43 54.8 22.5 33.9 36.3 25.4
Note: bold indicates the best result and italics the second best.

Figure 4 shows the center-location-error (precision) plots over the 100 OTB2015 sequences (plots for the 11 video attributes are given in Fig. 2 of the Appendix). The precision curves in Fig. 4 and the corresponding rates show that BTCF scores clearly higher precision than BACF on all 11 attributes, indicating that the proposed BTCF is robust and effective.

Fig. 4 Precision plots (center location error) of the 100 video sequences on the OTB2015 database
Fig. 2 Center location error plots of the 11 video attributes on the OTB2015 database

The AUC values in Table 1 show that, compared with the other 10 state-of-the-art algorithms, the proposed method ranks best or second best in AUC and clearly outperforms most correlation filter-based trackers.

3.5 Discussion of the temporal-consistency parameter $\varpi $

Over the time sequence, the temporal-consistency term introduced in Eq. (5) acts as a smoothing filter. When the target appearance changes abruptly (for example, short-term occlusion, as in the Jogging-2 and Freeman4 sequences of Fig. 2), the correlation filter learned with temporal-consistency information is smooth over the time sequence and handles such abrupt changes effectively, whereas the BACF model easily lets the learned filter drift toward the background.

Regarding the choice of $\varpi $: empirically, if $\varpi $ is set too small, it provides no temporal smoothing; conversely, if $\varpi $ is set too large, overfitting may occur.

With all other parameters fixed, $\varpi $ is varied to observe the overall tracking performance. 1) $\varpi $ takes integer values from 10 to 20, giving 11 rounds of experiments on OTB2015; the quantitative results in Table 2 show that $\varpi $ = 15 is optimal. 2) To further justify the choice $\varpi $ = 15, $\varpi $ takes values with one decimal place around 15, from 14.1 to 15.9, in a further set of experiments on OTB2015; the quantitative results in Table 3 show that $\varpi $ = 15 remains optimal.

Table 2 Quantitative comparison results when the temporal-consistency parameter $\varpi $ takes integer values from 10 to 20

$\varpi $ Distance precision AUC
10 0.833 0.636
11 0.844 0.645
12 0.849 0.645
13 0.846 0.641
14 0.852 0.646
15 0.872 0.663
16 0.865 0.657
17 0.856 0.648
18 0.859 0.653
19 0.853 0.647
20 0.858 0.651
Note: bold indicates the best result.

Table 3 Quantitative comparison results when the temporal-consistency parameter $\varpi $ takes decimal values from 14.1 to 15.9

$\varpi $ Distance precision AUC
14.1 0.854 0.646
14.2 0.859 0.651
14.3 0.854 0.648
14.4 0.854 0.643
14.5 0.849 0.645
14.6 0.852 0.648
14.7 0.853 0.649
14.8 0.864 0.655
14.9 0.868 0.66
15.0 0.872 0.663
15.1 0.863 0.656
15.2 0.865 0.656
15.3 0.865 0.654
15.4 0.862 0.652
15.5 0.865 0.654
15.6 0.864 0.658
15.7 0.861 0.656
15.8 0.870 0.660
15.9 0.868 0.659
Note: bold indicates the best result.

4 Conclusion

This paper proposes a background- and temporal-aware correlation filter visual tracking algorithm that builds an equality-constrained correlation filter objective on top of the discriminative correlation filter framework. The algorithm not only takes true negative samples as its training set but also learns a strongly discriminative correlation filter from the current frame alone, without a model update strategy. The equality-constrained objective is first converted into an unconstrained augmented Lagrange multiplier formula, and ADMM then turns it into two subproblems with closed-form solutions that are iterated to a local optimum. Experimental results on the public OTB2015 database show that BTCF clearly outperforms BACF, raising the area under the success-plot curve (AUC) by 4.2% over BACF. Compared with 10 state-of-the-art trackers, its AUC ranks best or second best, clearly above most correlation filter-based trackers; with purely hand-crafted features the AUC reaches 0.663 at 25.4 frame/s, showing good robustness and near real-time performance. Future work will add deep features to the BTCF framework to further improve tracking performance.

Appendix

Subproblem 1: solve for ${\mathit{\boldsymbol{h}}^*}$ as follows

$ \begin{array}{*{20}{c}} {{\mathit{\boldsymbol{h}}^ * } = \arg \mathop {\min }\limits_\mathit{\boldsymbol{h}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \frac{\varpi }{2} \times }\\ {\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + {\mathit{\boldsymbol{\xi }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 = }\\ {\arg \mathop {\min }\limits_\mathit{\boldsymbol{h}} \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + {\mathit{\boldsymbol{\xi }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + \frac{\mu }{2} \times }\\ {\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 = \arg \mathop {\min }\limits_\mathit{\boldsymbol{h}} \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \mu {\mathit{\boldsymbol{\eta }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + }\\ {\frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 + \frac{\mu }{2}\left\| \mathit{\boldsymbol{\eta }} \right\|_2^2 = \arg \mathop {\min }\limits_\mathit{\boldsymbol{h}} \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \frac{\mu }{2} \times }\\ {\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}} + \mathit{\boldsymbol{\eta }}} \right\|_2^2} \end{array} $ (1)

Setting $\mathit{\boldsymbol{\xi }} = \mu \mathit{\boldsymbol{\eta }}$ in Eq. (1) and writing $C\left( \mathit{\boldsymbol{h}} \right) = \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}} + \mathit{\boldsymbol{\eta }}} \right\|_2^2$, we set $\frac{{\partial C\left( \mathit{\boldsymbol{h}} \right)}}{{\partial \mathit{\boldsymbol{h}}}} = 0$, that is

$ \begin{array}{*{20}{c}} {\frac{{\partial C\left( \mathit{\boldsymbol{h}} \right)}}{{\partial \mathit{\boldsymbol{h}}}} = \lambda \mathit{\boldsymbol{h}} - \mu {{\left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}} + \mathit{\boldsymbol{\eta }}} \right] = }\\ {\left( {\lambda + \mu } \right)\mathit{\boldsymbol{h}} - \mu \left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{\xi }} = 0} \end{array} $ (2)

Solving Eq. (2) yields

$ {\mathit{\boldsymbol{h}}^ * } = \frac{1}{{\lambda + \mu }}\left[ {\left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{\xi }} + \mu \left( {{\mathit{\boldsymbol{I}}_K} \otimes \mathit{\boldsymbol{P}}} \right)\mathit{\boldsymbol{g}}} \right] $ (3)

Subproblem 2: solve for ${{{\mathit{\boldsymbol{\hat g}}}^ * }}$ as follows

$ \begin{array}{*{20}{c}} {{\mathit{\boldsymbol{g}}^ * } = \arg \mathop {\min }\limits_\mathit{\boldsymbol{g}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\lambda }{2}\left\| \mathit{\boldsymbol{h}} \right\|_2^2 + \frac{\varpi }{2} \times }\\ {\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + {\mathit{\boldsymbol{\xi }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 = }\\ {\arg \mathop {\min }\limits_\mathit{\boldsymbol{g}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + }\\ {{\mathit{\boldsymbol{\xi }}^{\rm{H}}}\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 = }\\ {\arg \mathop {\min }\limits_\mathit{\boldsymbol{g}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + \mu {\mathit{\boldsymbol{\xi }}^{ * {\rm{H}}}} \times }\\ {\left[ {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right] + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} \right\|_2^2 + \frac{\mu }{2}\left\| {{\mathit{\boldsymbol{\xi }}^ * }} \right\|_2^2 = }\\ {\arg \mathop {\min }\limits_\mathit{\boldsymbol{g}} \frac{1}{2}\left\| {\mathit{\boldsymbol{y}} - \sum\limits_{k = 1}^K {{\mathit{\boldsymbol{g}}^k} * {\mathit{\boldsymbol{x}}^k}} } \right\|_2^2 + \frac{\varpi }{2}\left\| {\mathit{\boldsymbol{g}} - {\mathit{\boldsymbol{g}}^{\left( {v - 1} \right)}}} \right\|_2^2 + }\\ {\frac{\mu }{2}\left\| {\mathit{\boldsymbol{g}} - \left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}} + {\mathit{\boldsymbol{\xi }}^ * }} \right\|_2^2} \end{array} $ (4)

Setting $\mathit{\boldsymbol{\xi }} = \mu {\mathit{\boldsymbol{\xi }}^ * }$ in Eq. (4): because the convolution operator $*$ appears in Eq. (4), solving the optimization directly is difficult, so by Parseval's theorem we obtain the following optimization formula

$ \begin{array}{*{20}{c}} {{{\mathit{\boldsymbol{\hat g}}}^ * } = \arg \mathop {\min }\limits_{\mathit{\boldsymbol{\hat g}}} \frac{1}{2}\left\| {\mathit{\boldsymbol{\hat y}} - \sum\limits_{k = 1}^K {{{\mathit{\boldsymbol{\hat g}}}^k} \odot {{\mathit{\boldsymbol{\hat x}}}^k}} } \right\|_2^2 + T\frac{\varpi }{2}\left\| {\mathit{\boldsymbol{\hat g}} - {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}} \right\|_2^2 + }\\ {T\frac{\mu }{2}\left\| {\mathit{\boldsymbol{\hat g}} - \widehat {\left( {{\mathit{\boldsymbol{I}}_K} \otimes {\mathit{\boldsymbol{P}}^{\rm{H}}}} \right)\mathit{\boldsymbol{h}}} + {{\mathit{\boldsymbol{\hat \xi }}}^ * }} \right\|_2^2} \end{array} $ (5)

In Eq. (5), each element of ${\mathit{\boldsymbol{\hat y}}}$ is written $\mathit{\boldsymbol{\hat y}}\left( t \right)$, $1 \le t \le T$; $\mathit{\boldsymbol{\hat y}}\left( t \right)$ depends only on the $K$ values in $\mathit{\boldsymbol{\hat x}}\left( t \right) = {\left[ {{{\mathit{\boldsymbol{\hat x}}}_1}\left( t \right), {{\mathit{\boldsymbol{\hat x}}}_2}\left( t \right), \cdots , {{\mathit{\boldsymbol{\hat x}}}_K}\left( t \right)} \right]^{\rm{H}}}$ and $\mathit{\boldsymbol{\hat g}}\left(t \right) = {\left[{{{\mathit{\boldsymbol{\hat g}}}_1}{{\left(t \right)}^{\rm{H}}}, {{\mathit{\boldsymbol{\hat g}}}_2}{{\left(t \right)}^{\rm{H}}}, \cdots, {{\mathit{\boldsymbol{\hat g}}}_K}{{\left(t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$, so Eq. (5) decomposes into $T$ independent $K \times K$ linear subsystems, namely

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{\hat g}}{{\left( t \right)}^ * } = \arg \mathop {\min }\limits_{\mathit{\boldsymbol{\hat g}}\left( t \right)} \frac{1}{2}\left\| {\mathit{\boldsymbol{\hat y}}\left( t \right) - \mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}}\mathit{\boldsymbol{\hat g}}\left( t \right)} \right\|_2^2 + T\frac{\varpi }{2} \times }\\ {\left\| {\mathit{\boldsymbol{\hat g}}\left( t \right) - {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right\|_2^2 + T\frac{\mu }{2}\left\| {\mathit{\boldsymbol{\hat g}}\left( t \right) - \mathit{\boldsymbol{\hat h}}\left( t \right) + {{\mathit{\boldsymbol{\hat \xi }}}^ * }\left( t \right)} \right\|_2^2} \end{array} $ (6)

$C\left( {\mathit{\boldsymbol{\hat g}}\left( t \right)} \right) = \frac{1}{2}\left\| {\mathit{\boldsymbol{\hat y}}\left( t \right) - \mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}}\mathit{\boldsymbol{\hat g}}\left( t \right)} \right\|_2^2 + T\frac{\varpi }{2} \times $$\left\| {\mathit{\boldsymbol{\hat g}}\left( t \right) - {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right\|_2^2 + T\frac{\mu }{2}\left\| {\mathit{\boldsymbol{\hat g}}\left( t \right) - \mathit{\boldsymbol{\hat h}}\left( t \right) + {{\mathit{\boldsymbol{\hat \xi }}}^ * }\left( t \right)} \right\|_2^2$, 则令$\frac{{\partial C\left( {\mathit{\boldsymbol{\hat g}}\left( t \right)} \right)}}{{\partial \mathit{\boldsymbol{\hat g}}\left( t \right)}} = 0$,即

$ \begin{array}{*{20}{c}} {\frac{{\partial C\left( {\mathit{\boldsymbol{\hat g}}\left( t \right)} \right)}}{{\partial \mathit{\boldsymbol{\hat g}}\left( t \right)}} = \mathit{\boldsymbol{\hat x}}\left( t \right)\left[ {\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}}\mathit{\boldsymbol{\hat g}}\left( t \right) - \mathit{\boldsymbol{\hat y}}\left( t \right)} \right] + }\\ {T\varpi \left[ {\mathit{\boldsymbol{\hat g}}\left( t \right) - {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right] + T\mu \left[ {\mathit{\boldsymbol{\hat g}}\left( t \right) - \mathit{\boldsymbol{\hat h}}\left( t \right) + {{\mathit{\boldsymbol{\hat \xi }}}^ * }\left( t \right)} \right] = }\\ {\left[ {\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}} + T\left( {\mu + \varpi } \right){\mathit{\boldsymbol{I}}_K}} \right]\mathit{\boldsymbol{\hat g}}\left( t \right) - T\varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right) - }\\ {\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right) - T\mu \mathit{\boldsymbol{\hat h}}\left( t \right) + T\mu {{\mathit{\boldsymbol{\hat \xi }}}^ * }\left( t \right) = 0} \end{array} $ (7)

Solving Eq. (7) yields

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{\hat g}}{{\left( t \right)}^ * } = {{\left( {\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}} + T\left( {\mu + \varpi } \right){\mathit{\boldsymbol{I}}_K}} \right)}^{ - 1}}\left( {\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right) - } \right.}\\ {\left. {T\mu {{\mathit{\boldsymbol{\hat \xi }}}^ * }\left( t \right) + T\mu \mathit{\boldsymbol{\hat h}}\left( t \right) + T\varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right) = }\\ {{{\left( {\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}} + T\left( {\mu + \varpi } \right){\mathit{\boldsymbol{I}}_K}} \right)}^{ - 1}}\left( {\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right) - } \right.}\\ {\left. {T\mathit{\boldsymbol{\hat \xi }}\left( t \right) + T\mu \mathit{\boldsymbol{\hat h}}\left( t \right) + T\varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right)} \end{array} $ (8)

In Eq. (8), $\mathit{\boldsymbol{\hat h}}\left( t \right) = {\left[ {{{\mathit{\boldsymbol{\hat h}}}_1}{{\left( t \right)}^{\rm{H}}}, {{\mathit{\boldsymbol{\hat h}}}_2}{{\left( t \right)}^{\rm{H}}}, \cdots , {{\mathit{\boldsymbol{\hat h}}}_K}{{\left( t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$ and ${{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right) = {\left[ {\mathit{\boldsymbol{\hat g}}_1^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}, \mathit{\boldsymbol{\hat g}}_2^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}, \cdots , \mathit{\boldsymbol{\hat g}}_K^{\left( {v - 1} \right)}{{\left( t \right)}^{\rm{H}}}} \right]^{\rm{H}}}$.

By the Sherman-Morrison formula, ${\left( {\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{H}}} + \mathit{\boldsymbol{A}}} \right)^{ - 1}} = {\mathit{\boldsymbol{A}}^{ - 1}} - \frac{{{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}}{\mathit{\boldsymbol{v}}^{\rm{H}}}{\mathit{\boldsymbol{A}}^{ - 1}}}}{{{\mathit{\boldsymbol{v}}^{\rm{H}}}{\mathit{\boldsymbol{A}}^{ - 1}}\mathit{\boldsymbol{u}} + 1}}$, the inverse matrix in Eq. (8) is

$ \begin{array}{*{20}{c}} {{{\left( {\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}} + T\left( {\mu + \varpi } \right){\mathit{\boldsymbol{I}}_K}} \right)}^{ - 1}} = }\\ {\frac{1}{{T\left( {\mu + \varpi } \right)}}{\mathit{\boldsymbol{I}}_K} - \frac{1}{{T\left( {\mu + \varpi } \right)}} \times \frac{{\mathit{\boldsymbol{\hat x}}\left( t \right)\mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}}}}{{T\left( {\mu + \varpi } \right) + \mathit{\boldsymbol{\hat x}}{{\left( t \right)}^{\rm{H}}}\mathit{\boldsymbol{\hat x}}\left( t \right)}}} \end{array} $ (9)

Substituting Eq. (9) into Eq. (8) gives

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{\hat g}}{{\left( t \right)}^ * } = \frac{1}{{\left( {\mu + \varpi } \right)}} \times \left( {\frac{{\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right)}}{T} - \mathit{\boldsymbol{\hat \xi }}\left( t \right) + \mu \mathit{\boldsymbol{\hat h}}\left( t \right) + \varpi {{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)} \right) - }\\ {\frac{1}{B}\left[ {\frac{{\mathit{\boldsymbol{\hat y}}\left( t \right)\mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_x}\left( t \right)}}{T} - \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_\xi }\left( t \right) + \mu \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_h}\left( t \right) + \varpi \mathit{\boldsymbol{\hat x}}\left( t \right){{\hat S}_g}\left( t \right)} \right]} \end{array} $ (10)

where ${{\hat S}_x}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat x}}\left( t \right)$, ${{\hat S}_\xi }\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat \xi }}\left( t \right)$, $B = \left( {\mu + \varpi } \right)\left[ {{{\hat S}_x}\left( t \right) + T\left( {\mu + \varpi } \right)} \right]$, ${{\hat S}_h}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}\mathit{\boldsymbol{\hat h}}\left( t \right)$, and ${{\hat S}_g}\left( t \right) = \mathit{\boldsymbol{\hat x}}{\left( t \right)^{\rm{H}}}{{\mathit{\boldsymbol{\hat g}}}^{\left( {v - 1} \right)}}\left( t \right)$. Computing ${\mathit{\boldsymbol{\hat g}}}$ with Eq. (10) has a time complexity of ${\rm{O}}\left( {TK} \right)$.

References

  • [1] Bolme D S, Beveridge J R, Draper B A, et al. Visual object tracking using adaptive correlation filters[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 2544-2550.[DOI:10.1109/CVPR.2010.5539960]
  • [2] Henriques J F, Caseiro R, Martins P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer, 2012: 702-715.[DOI:10.1007/978-3-642-33765-9_50]
  • [3] Danelljan M, Khan F S, Felsberg M, et al. Adaptive color attributes for real-time visual tracking[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 1090-1097.[DOI:10.1109/CVPR.2014.143]
  • [4] Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583–596. [DOI:10.1109/TPAMI.2014.2345390]
  • [5] Tang M, Feng J Y. Multi-kernel correlation filter for visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 3038-3046.[DOI:10.1109/ICCV.2015.348]
  • [6] Li Y, Zhu J K. A scale adaptive kernel correlation filter tracker with feature integration[C]//Proceedings of European Conference on Computer Vision. Zurich, Switzerland: Springer, 2015: 254-265.[DOI: 10.1007/978-3-319-16181-5_18]
  • [7] Danelljan M, Häger G, Khan F S, et al. Accurate scale estimation for robust visual tracking[C]//Proceedings of British Machine Vision Conference. Nottingham, UK: BMVA Press, 2014: 65.1-65.11.[DOI:10.5244/C.28.65]
  • [8] Danelljan M, Häger G, Khan F S, et al. Discriminative scale space tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(8): 1561–1575. [DOI:10.1109/TPAMI.2016.2609928]
  • [9] Galoogahi H K, Sim T, Lucey S. Correlation filters with limited boundaries[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 4630-4638.[DOI:10.1109/CVPR.2015.7299094]
  • [10] Danelljan M, Häger G, Khan F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 4310-4318.[DOI:10.1109/ICCV.2015.490]
  • [11] Danelljan M, Robinson A, Khan F S, et al. Beyond correlation filters: learning continuous convolution operators for visual tracking[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016: 472-488.[DOI:10.1007/978-3-319-46454-1_29]
  • [12] Danelljan M, Bhat G, Khan F S, et al. ECO: efficient convolution operators for tracking[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 6931-6939.[DOI:10.1109/CVPR.2017.733]
  • [13] Nam H, Han B. Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 4293-4302.[DOI:10.1109/CVPR.2016.465]
  • [14] Song Y B, Ma C, Gong L J, et al. CREST: convolutional residual learning for visual tracking[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2574-2583.[DOI:10.1109/ICCV.2017.279]
  • [15] Gundogdu E, Alatan A A. Good features to correlate for visual tracking[J]. IEEE Transactions on Image Processing, 2018, 27(5): 2526–2540. [DOI:10.1109/TIP.2018.2806280]
  • [16] Danelljan M, Häger G, Khan F S, et al. Convolutional features for correlation filter based visual tracking[C]//Proceedings of 2015 IEEE International Conference on Computer Vision Workshop. Santiago, Chile: IEEE, 2015: 621-629.[DOI:10.1109/ICCVW.2015.84]
  • [17] Galoogahi H K, Fagg A, Lucey S. Learning background-aware correlation filters for visual tracking[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 1144-1152.[DOI:10.1109/ICCV.2017.129]
  • [18] Wu Y, Lim J, Yang M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834–1848. [DOI:10.1109/TPAMI.2014.2388226]
  • [19] Zhang W, Kang B S. Recent advances in correlation filter-based object tracking:a review[J]. Journal of Image and Graphics, 2017, 22(8): 1017–1033. [张微, 康宝生. 相关滤波目标跟踪进展综述[J]. 中国图象图形学报, 2017, 22(8): 1017–1033. ] [DOI:10.11834/jig.170092]
  • [20] Lu H C, Li P X, Wang D. Visual object tracking:a survey[J]. Pattern Recognition and Artificial Intelligence, 2018, 31(1): 61–76. [卢湖川, 李佩霞, 王栋. 目标跟踪算法综述[J]. 模式识别与人工智能, 2018, 31(1): 61–76. ] [DOI:10.16451/j.cnki.issn1003-6059.201801006]
  • [21] Bertinetto L, Valmadre J, Golodetz S, et al. Staple: complementary learners for real-time tracking[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 1401-1409.[DOI:10.1109/CVPR.2016.156]
  • [22] Li F, Tian C, Zuo W M, et al. Learning spatial-temporal regularized correlation filters for visual tracking[C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018.
  • [23] Mueller M, Smith N, Ghanem B. Context-aware correlation filter tracking[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1387-1395.[DOI:10.1109/CVPR.2017.152]
  • [24] Bibi A, Mueller M, Ghanem B. Target response adaptation for correlation filter tracking[C]//Proceedings of 14th European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016: 419-433.[DOI:10.1007/978-3-319-46466-4_25]