L1 object-tracking algorithm with positive example voting

Hu Liangmei; Wang Jian; Zhang Jun; Zhang Xudong

doi:10.11834/jig.20161108

Image Analysis and Recognition

2016, Vol. 21, No. 11, p. 1483
Published online: 2016-11-03
Published in print: 2016

Hu Liangmei, Wang Jian, Zhang Jun, Zhang Xudong. L1 object-tracking algorithm with positive example voting[J]. Journal of Image and Graphics, 2016, 21(11): 1483. DOI: 10.11834/jig.20161108.

Summary

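Traditional L1 sparse-representation tracking expresses every target candidate as a linear combination of dictionary templates. It considers only the global information of the templates and does not analyze the local structure of the target. To address the tracking drift that this approach tends to show under background clutter, we propose an object-tracking algorithm based on positive example voting. The target is represented as a combination of image-patch particles so that its local structure is taken into account. Within a particle filter framework, a confidence function and a similarity function for image-patch particles are constructed to extract positive patches. The optimal target location is then estimated by weighted voting of the positive patches. Tracking experiments were run on 14 public benchmark video sequences. Compared with several state-of-the-art trackers, the proposed algorithm is the most stable when the target is disturbed by background clutter, occlusion, illumination change, and other complex conditions: it reaches an overlap rate of 0.7 and achieves the lowest average tracking error, 5.90, which reflects its reliability and effectiveness. Compared with classical methods, the proposed L1 tracker with positive example voting handles occlusion, illumination change, and fast motion while tracking sequences with background clutter robustly and reliably.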
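The overlap rate and tracking error quoted above are not defined in this summary. The following is a minimal sketch, assuming the definitions commonly used in tracking benchmarks: overlap rate as the intersection-over-union of the tracked and ground-truth boxes, and tracking error as the Euclidean distance between their centers. The function names, the `(x, y, w, h)` box layout, and the sample boxes are illustrative assumptions, not taken from the paper.

```python
import math


def overlap_rate(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes (assumed definition)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle, clamped at zero.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(box_a, box_b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))


# Hypothetical boxes: the tracked box is shifted 10 units right of the truth.
tracked = (10, 10, 40, 40)
truth = (20, 10, 40, 40)
print(overlap_rate(tracked, truth))   # 0.6
print(center_error(tracked, truth))   # 10.0
```

Averaging each quantity over all frames of a sequence would give per-sequence figures of the kind reported above, again assuming the paper follows this convention.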

Abstract

Visual tracking estimates the states of a moving target in a video. It is one of the most important and fundamental topics in computer vision and has many applications, such as surveillance, vehicle tracking, robotics, and human-computer interaction. L1 object tracking based on sparse representation expresses each target candidate as a linear combination of dictionary templates. Such tracking considers global information without analyzing local information. To overcome drifting in background clutter, this paper proposes a tracking method based on positive patch voting.

Given the overcompleteness of the sparse-representation dictionary and its sensitivity to changing local features, we represent the target by a set of image patch particles so that the local structure of the target templates is considered. Extracting image patches is the core of our algorithm and directly affects the tracking result. Specifically, we present a tracking reliability metric to measure how reliably a patch can be tracked. Accordingly, a probability model is proposed to estimate the distribution of positive patches under a sequential Monte Carlo framework. To obtain a preliminary estimate of how reliably a patch can be tracked, we adopt the peak-to-sidelobe ratio as a confidence metric. This ratio is widely used in signal processing to measure the strength of the signal peak in a response map. The confidence function is proportional to the response map of the image patches, and is a distance function between the templates and the patches. Instead of using computationally intensive unsupervised clustering methods to group the image patches, we simply divide the image into two regions with a rectangular box, centered at the target, that is obtained by the L1 method. We then formulate a similarity function to measure the patches that lie inside the bounding box and have a confidence score higher than zero. We label the patches that obtain high scores as positive. Finally, we calculate the weights of all positive patches and vote for the optimal location of the tracked target.

Traditional target tracking based on sparse representation considers only global information without analyzing local information, and L1 object tracking drifts easily in complex situations. Thus, we represent the target by a set of image patches and formulate a new patch reliability metric to extract the positive patches. Qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed algorithm can handle highly diverse and challenging tracking situations.

Sparse representation is applied to visual tracking by modeling the target appearance as a sparse approximation over a template set. However, this approach cannot adapt to complex and dynamic scenes because of factors such as background clutter and illumination change. By formulating the confidence and similarity functions, we extract the positive patches. We finally find the target location by weighted voting of the positive patches. Unlike other classical methods, our method can deal with occlusion, illumination change, and fast movement, and achieves robust tracking in sequences with background clutter.
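The abstract names the peak-to-sidelobe ratio (PSR) as the confidence metric but does not give its formula. The following is a minimal sketch, assuming the definition common in correlation-based tracking: PSR = (peak - mean of sidelobe) / standard deviation of sidelobe, where the sidelobe is the response map with a small window around the peak excluded. The function name, the `exclude` window size, and the toy response maps from the hypothetical `make_map` helper are illustrative assumptions, not the authors' implementation.

```python
import statistics


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D response map given as a list of rows of floats.

    The sidelobe is every cell outside the (2 * exclude + 1)-square window
    centered on the peak. The window size is an assumption of this sketch.
    """
    rows, cols = len(response), len(response[0])
    # Locate the peak value and its position.
    peak, pr, pc = max((response[r][c], r, c)
                       for r in range(rows) for c in range(cols))
    sidelobe = [response[r][c]
                for r in range(rows) for c in range(cols)
                if abs(r - pr) > exclude or abs(c - pc) > exclude]
    if len(sidelobe) < 2:
        raise ValueError("response map too small for the chosen window")
    mean = statistics.mean(sidelobe)
    std = statistics.pstdev(sidelobe)
    if std == 0:
        # Flat sidelobe: the ratio is undefined, so report the limiting value.
        return float("inf") if peak > mean else 0.0
    return (peak - mean) / std


def make_map(peak_value):
    """Hypothetical 5x5 response: checkerboard background plus a center peak."""
    grid = [[0.1 * ((r + c) % 2) for c in range(5)] for r in range(5)]
    grid[2][2] = peak_value
    return grid


print(peak_to_sidelobe_ratio(make_map(1.0)))  # about 19: sharp, reliable peak
print(peak_to_sidelobe_ratio(make_map(0.3)))  # about 5: weak peak
```

A patch whose response has a sharp, isolated peak gets a high PSR, which matches the abstract's use of the ratio as a measure of how confidently a patch can be located.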