Sparse-constrained spatio-temporal regularized correlation filter for UAV visual tracking

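Tian Haodong1, Zhang Jinpu1, Wang Yuehuan1,2 (1. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074; 2. National Key Laboratory of Science and Technology on Multi-spectral Information Processing, Wuhan 430074)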

Abstract
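Objective Correlation filter-based tracking algorithms have shown excellent performance in unmanned aerial vehicle (UAV) visual tracking. Existing correlation filter trackers learn the filter from all features in the sample region; however, some features arising from occlusion or deformation may contaminate the filter and reduce the discriminative ability of the model. To address this problem, a sparse-constrained spatio-temporal regularized correlation filter tracking algorithm is proposed. Method A spatial elastic net constraint is imposed on the correlation filter objective function to adaptively suppress distracting features during tracking, and the temporal regularization term of the spatial-temporal regularized correlation filter (STRCF) is integrated to strengthen the filter's ability to suppress distortion. The alternating direction method of multipliers (ADMM) is used to transform the constrained objective function into two subproblems with closed-form solutions, which are solved iteratively for a locally optimal solution. In addition, an acceleration strategy applicable to correlation filter frameworks in general is proposed: according to the target displacement in the current frame, the feature matrix from the detection and localization stage is cyclically shifted by the same distance and used as the feature matrix for the online learning stage, which saves one feature extraction of the training sample per frame and increases tracking speed. Result The algorithm is compared with 14 mainstream tracking algorithms on three UAV datasets. On the DTB70 (drone tracking benchmark) dataset, the average precision and average success rate are 0.707 and 0.477, ranking first among all compared algorithms and exceeding STRCF by 5.8% and 4%, respectively. On the UAVDT (the unmanned aerial vehicle benchmark: object detection and tracking) dataset, the average precision and average success rate exceed those of STRCF by 8.4% and 3.8%, respectively. On the UAV123_10 fps dataset, they exceed those of STRCF by 4% and 3.3%, respectively. Ablation experiments further show that the acceleration strategy increases tracking speed by about 25% without significantly affecting tracking accuracy (±0.1%), giving a tracking speed of 50 frames/s on a single CPU. Conclusion The proposed algorithm combines the advantages of sparse constraints and spatio-temporal regularization and, compared with the other algorithms, tracks more robustly in complex situations such as occlusion and deformation.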
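The acceleration strategy in the Method paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `shift_features_for_training`, the `(dy, dx)` pixel displacement convention, and the feature cell size of 4 pixels are all assumptions made for illustration. The idea shown is only that a feature map extracted for detection can be circularly shifted so that the newly located target sits at the center again, in place of a second feature extraction.

```python
import numpy as np


def shift_features_for_training(det_features, displacement_px, cell_size=4):
    """Reuse detection-stage features as training features (illustrative only).

    det_features    : array of shape (H, W, C) extracted around the previous
                      target position during detection.
    displacement_px : (dy, dx) target displacement found in this frame, in pixels.
    cell_size       : assumed pixels per feature cell (hypothetical value).
    """
    dy, dx = displacement_px
    # The target moved by (dy, dx), so the map is shifted by the opposite
    # amount to bring the target back to the center of the feature map.
    shift_rows = -int(round(dy / cell_size))
    shift_cols = -int(round(dx / cell_size))
    # np.roll wraps values around the border, i.e. a circular shift.
    return np.roll(det_features, shift=(shift_rows, shift_cols), axis=(0, 1))


# Toy feature map whose value encodes its own (row, col) position.
rows = np.arange(6)[:, None] * 10
cols = np.arange(6)[None, :]
features = (rows + cols).astype(float)[..., None]   # shape (6, 6, 1)

# Target moved 8 px down and 4 px left: 2 cells down, 1 cell left.
train_features = shift_features_for_training(features, (8, -4), cell_size=4)
print(train_features[3, 3, 0])  # cell that was at (5, 2) is now at the center
```

The shifted map only approximates a freshly extracted one, because content that wraps around the border comes from the opposite edge of the old patch; this is consistent with the small accuracy change (±0.1%) reported in the ablation experiment.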
Keywords
Sparse constraint and spatial-temporal regularized correlation filter for UAV tracking

Tian Haodong1, Zhang Jinpu1, Wang Yuehuan1,2 (1. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; 2. National Key Laboratory of Science and Technology on Multi-spectral Information Processing, Wuhan 430074, China)

Abstract
Objective Correlation filter (CF)-based methods have demonstrated their potential in visual object tracking for unmanned aerial vehicle (UAV) applications. Current discriminative CF trackers learn the filter from all features in the sample region. However, features arising from occlusion or deformation may contaminate the filter and degrade the discriminative ability of the model. To mitigate this problem, we develop a novel sparse constraint and spatio-temporal regularized correlation filter that adaptively ignores such distracting features. Method By imposing a spatial (bowl-shaped) elastic net constraint on the objective function of the correlation filter, our algorithm restricts the sparsity of the filter values corresponding to the target region rather than the whole sample region, and adaptively suppresses distorted features during tracking. In addition, the temporal regularization term of the spatial-temporal regularized correlation filter (STRCF) is integrated to enhance the filter's ability to suppress distortion. We treat object tracking as a convex optimization problem and provide an efficient global optimization method based on the alternating direction method of multipliers (ADMM). First, the objective function is required to meet the Eckstein-Bertsekas condition, so that the unconstrained augmented Lagrangian formulation converges to the global optimal solution. Next, ADMM is used to split the augmented Lagrangian formulation into two subproblems with closed-form solutions. To improve computational efficiency, we convert the subproblems into the Fourier domain according to Parseval's theorem. The algorithm converges within a few iterations. Result Evaluation metrics such as center location error and bounding box overlap ratio are used to compare the proposed method with existing methods.
The center location error measures how accurately the tracking algorithm estimates the target location. It is the Euclidean distance between the center of the tracked target and the center of the ground truth, averaged over all frames. However, targets of different sizes are not equally sensitive to this error, because it does not take scale and aspect ratio into account. Another commonly used evaluation metric is the overlap ratio, which is defined as the intersection over union between the predicted target box and the ground truth. We compare our approach with several state-of-the-art algorithms on well-known benchmarks: DTB70 (drone tracking benchmark), UAVDT (unmanned aerial vehicle benchmark: object detection and tracking) and UAV123_10 fps. The experimental results show that our model outperforms all other methods on the DTB70 benchmark, where the average precision and the average success rate are 0.707 and 0.477, which are 5.8% and 4% higher than those of STRCF. On the UAVDT benchmark, the average precision and the average success rate are 0.72 and 0.494, respectively, which are 8.4% and 3.8% higher than those of STRCF. On the UAV123_10 fps benchmark, the average precision and the average success rate are 0.667 and 0.577, respectively, which are 5% and 3.3% higher than those of STRCF. Furthermore, an ablation experiment demonstrates that the proposed acceleration strategy improves the tracking speed by about 25% without affecting the tracking accuracy, and the running speed reaches 50 frames/s on a single CPU. Conclusion Compared with current popular methods, the proposed sparse constraint and spatio-temporal regularized correlation filter achieves leading performance. Owing to the sparse constraint and the spatio-temporal regularization, our algorithm improves tracking performance and is robust in complex scenes involving occlusion and deformation.
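The two evaluation metrics described in the Result paragraph can be written out as a short sketch. This is an illustration of the standard definitions, not the evaluation code used in the paper; the function names and the `(x, y, w, h)` box format, with `(x, y)` the top-left corner, are assumptions.

```python
import math


def center_location_error(pred_box, gt_box):
    """Euclidean distance between box centers; boxes are (x, y, w, h)."""
    px = pred_box[0] + pred_box[2] / 2.0
    py = pred_box[1] + pred_box[3] / 2.0
    gx = gt_box[0] + gt_box[2] / 2.0
    gy = gt_box[1] + gt_box[3] / 2.0
    return math.hypot(px - gx, py - gy)


def overlap_ratio(pred_box, gt_box):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1, y1, w1, h1 = pred_box
    x2, y2, w2, h2 = gt_box
    inter_w = max(0.0, min(x1 + w1, x2 + w2) - max(x1, x2))
    inter_h = max(0.0, min(y1 + h1, y2 + h2) - max(y1, y2))
    inter = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - inter
    return inter / union if union > 0 else 0.0


def mean_center_location_error(pred_boxes, gt_boxes):
    """Center location error averaged over all frames of a sequence."""
    errors = [center_location_error(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(errors) / len(errors)


# Two-frame toy sequence (made-up boxes, for illustration only).
pred = [(0, 0, 10, 10), (5, 0, 10, 10)]
gt = [(0, 0, 10, 10), (0, 0, 10, 10)]
mean_cle = mean_center_location_error(pred, gt)
iou_shifted = overlap_ratio(pred[1], gt[1])

# Same center but wrong scale: zero center error, yet low overlap.
cle_scaled = center_location_error((0, 0, 10, 10), (-5, -5, 20, 20))
iou_scaled = overlap_ratio((0, 0, 10, 10), (-5, -5, 20, 20))
print(mean_cle, iou_shifted, cle_scaled, iou_scaled)
```

The last pair of values shows why both metrics are reported: a prediction with the correct center but the wrong scale has zero center location error, and only the overlap ratio reveals the mismatch.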
Keywords
