Published: 2017-04-16
DOI: 10.11834/jig.20170410
2017 | Volume 22 | Number 4

Image Understanding and Computer Vision

Anti-interference contour tracking under prior model constraint

Liu Daqian¹, Liu Wanjun², Fei Bowen³

1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China;

2. School of Software, Liaoning Technical University, Huludao 125105, China;

3. School of Business and Management, Liaoning Technical University, Huludao 125105, China

Received: 2016-10-10; Revised: 2016-12-22

Supported by: National Natural Science Foundation of China (61172144); Science and Technology Research Program of Liaoning Province (2012216026)

First author: Liu Daqian (born 1992), male, Ph.D. candidate in Mine Spatial Information Engineering at the School of Electronic and Information Engineering, Liaoning Technical University. His research interests include image and visual information computing and moving-target detection and tracking. E-mail: liudaqianlntu@163.com

CLC number: TP301.6

Document code: A

Article ID: 1006-8961(2017)04-0502-14

Abstract

Objective Target tracking plays an important role in computer vision and is widely applied in intelligent transportation, robot vision, and motion capture. In recent years, researchers have proposed numerous target tracking algorithms that cope with illumination changes, target deformation, partial (or even full) occlusion, complex backgrounds, and other factors. How to handle changes in the target contour is one of the active topics in the field. Because a level set handles the topology of a target well, many researchers have adopted the level set method for contour extraction and tracking. Freedman (2004) used the Bhattacharyya distance and Zhang (2005) used the Kullback-Leibler distance to determine the target layout and locate the best candidate region. These researchers then combined foreground/background matching flows and proposed a combination flow method. However, both algorithms depend on the initial target selection: when the initial contour differs from the actual contour of the object, multiple iterations are required to converge. Chiverton proposed an online contour tracking algorithm based on a learned model, which establishes a prior target model from the initial target shape and uses it to constrain the contour tracking process. Ning used the shape information of the initially delineated target to establish the prior model, adopted the level set method to represent the foreground and background regions implicitly, and determined the distribution of the target in the foreground/background with the Bhattacharyya similarity measure to achieve accurate tracking. Rathi adopted the geometric active contour model within a particle filter framework to track deforming, fast-moving targets; the algorithm handles affine transformations and also accurately estimates non-affine deformations of the target. Level set contour extraction is widely applied to tracking moving targets, but traditional methods are easily affected by partial occlusion and complex backgrounds. An anti-interference contour tracking algorithm under a prior model constraint (AC-PMC) is proposed to address these problems.

Method The proposed approach uses a simple model matching algorithm to track the first five frames of the image sequence. The training sample set consists of superpixel blocks obtained via superpixel segmentation. Superpixel blocks with the same color features are grouped into clusters by the mean shift algorithm. The confidence probability of each cluster is then calculated, and a prior model of the target is constructed from the cluster confidences. Subsequently, the target contour is extracted by level set segmentation. A decision-making method is proposed to avoid the influence of partial occlusion and complex backgrounds: it determines whether a shape prior model is required to constrain the level set evolution, which yields more robust tracking results. Lastly, an online appearance model updating algorithm is proposed, which appends appropriate feature compensations to the feature sets to improve the accuracy of the updated model. Because the algorithm uses the evolution results of both shape and color features, it effectively mitigates feature loss and redundant features when the target is occluded.

Result Six common video sequences are used to verify the performance of the proposed algorithm. The sequences cover challenging factors such as illumination change, partial occlusion, target deformation, and complex background. The algorithm is compared with available contour tracking algorithms: density matching and level set (DMLS), learning distribution metric (LDM), and joint registration and active contour segmentation (JRACS). The proposed algorithm achieves the same or higher tracking accuracy than these algorithms. The average center errors on the sequences Fish, Face1, Face2, Shop, Train, and Lemming are 3.46, 7.16, 3.82, 13.42, 14.72, and 12.47, respectively. The tracking overlap ratios on these sequences are 0.92, 0.74, 0.85, 0.77, 0.73, and 0.82, respectively. The average running speeds are 4.27 frame/s, 4.03 frame/s, 3.11 frame/s, 2.94 frame/s, 2.16 frame/s, and 1.71 frame/s, respectively.

Conclusion Experimental results indicate that the prior model constraint and the decision-making step in contour extraction give the algorithm accurate tracking and strong adaptability under partial occlusion, target deformation, target rotation, and complex background. The characteristics of the proposed approach are as follows: 1) a prior model of the target is built by training on the sample set, which removes the interference of non-target information in the image and gives the prior model a more accurate description of the target; 2) a decision-making method is proposed to judge whether a prior model is required, and if the prior model constraint must be introduced, the result in the shape subspace and the evolution result in the color space are fused during level set segmentation; 3) an online appearance model updating algorithm is proposed, which appends appropriate feature compensations to the feature sets and thereby keeps the model accurate.

Key words

prior model; level set; decision-making; feature compensation; contour tracking

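0 Introduction

Target tracking is an important part of computer vision and is widely applied in intelligent transportation^[1], robot vision^[2], and motion capture^[3]. In recent years, researchers have proposed many effective tracking algorithms that cope with illumination changes, target deformation, partial (or even full) occlusion, complex backgrounds, and other factors^[4-7]. How a tracker handles changes in the target contour remains one of the active problems in the field.

Because level sets handle the topology of a target well, many researchers use the level set method for contour extraction and tracking. Freedman and Zhang^[8] used the Bhattacharyya distance and the Kullback-Leibler distance, respectively, to determine the layout of the target and thereby locate the best candidate region. Building on this, the two authors combined foreground/background matching flows and proposed the combination flow method^[9]. Both algorithms, however, depend on the initial target selection: when the initial contour differs substantially from the actual contour of the object, many iterations are needed to converge. Chiverton et al.^[10] proposed an online contour tracking algorithm based on a learned model, which builds a prior target model from the initial target shape and uses it to constrain the contour tracking process. Ning et al.^[11] built a prior model from the shape information of the initially delineated target, represented the foreground/background target information implicitly with the level set method, and determined the target distribution in the foreground/background with the Bhattacharyya similarity measure to achieve fairly accurate tracking. Rathi et al.^[12] used a geometric active contour model within a particle filter framework to track targets undergoing deformation and fast motion; the algorithm captures affine transformations of the target and also estimates its non-affine deformations fairly accurately. When the target is occluded, however, these methods lose target features and accumulate redundant features to varying degrees, which lowers tracking accuracy.

To address these problems, this paper proposes an anti-interference contour tracking algorithm under a prior model constraint (AC-PMC). First, a simple model matching algorithm tracks the target through the first five frames, and the target regions obtained by superpixel segmentation form a training sample set of superpixel blocks. Superpixel blocks with the same color features are grouped into clusters by mean shift clustering, a confidence probability is computed for each cluster, and the prior model is built from the cluster confidences. Next, the target contour is extracted by level set segmentation. To avoid the effects of partial occlusion and complex backgrounds, a decision method determines whether a shape prior model should constrain the level set evolution, which yields more robust tracking results. Finally, an online model update algorithm adds appropriate feature compensation to the feature set so that the model describes the target more accurately. Because the algorithm uses the evolution results of both shape and color features, it effectively mitigates feature loss and redundant features under occlusion.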

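The main characteristics of the algorithm are as follows:

1) The target prior model is built by clustering a training sample set, which removes the interference of non-target information in the image and makes the prior model describe the target more accurately.

2) A decision method is proposed to determine whether the prior model needs to be introduced.

3) If the prior model constraint is needed, the result in the shape subspace and the evolution result in the color space are fused during level set segmentation.

4) An online model update algorithm is proposed that adds appropriate feature compensation to the feature set to keep the model accurate.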

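1 Algorithm overview

The AC-PMC algorithm consists of three stages: building the prior model of the target, anti-interference contour extraction, and model update. The flow of the algorithm is shown in Fig. 1.

Fig. 1 Flow diagram of the AC-PMC algorithm

Building the prior model of the target. The target region is marked manually in the first frame, and a simple model matching algorithm tracks the target through the first five frames to determine the target region in each image. The target region obtained in each frame is segmented into superpixels, which form a training sample set of superpixel blocks. The mean shift clustering algorithm then groups superpixel blocks with the same (color) features into clusters, forming a cluster set. Finally, the confidence probability of each cluster is computed to decide whether it belongs to the target region, and the initial prior model is built from the superpixel features and the confidence probabilities.

Anti-interference contour extraction. First, the Bhattacharyya similarity measure is used to find the region whose distribution resembles the model, which gives an estimate of the target location. A decision method then determines whether the prior model is needed. If it is, the result in the shape subspace and the evolution result in the color space are fused during level set segmentation and used as the initial target region; if not, the color-space measurement obtained during model matching is used as the initial target region. Finally, level set evolution refines this region and extracts the precise target contour.

Model update. The update has two parts. The first part updates the contour model by a weighted fusion of the current-frame result and the prior target model. The second part checks whether the target is severely occluded in order to determine which feature set in the cluster set is replaced. If no severe occlusion occurs, the tracking result of the newest frame replaces the oldest frame in the appearance model. If severe occlusion occurs, the frame closest to the current frame is selected as the replaced frame, and some of its features are merged into the feature set of the current frame as a compensation set; re-clustering the feature set completes one update of the clustered feature set. Recomputing the confidences of this cluster set yields the new target model.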

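2 AC-PMC tracking algorithm

2.1 Building the prior model of the target

To build a target model that can be distinguished from the background, a sample set is needed for training. A simple EMD-based model matching algorithm^[13] tracks the first five frames of the video sequence and determines the target region in each frame. The SLIC algorithm^[14] segments the extended target region in each training frame (the extended region shares the center of the marked target region, and its diagonal is 1.5 times the diagonal of the target region), which gives $N_t$ superpixel blocks in frame $t$. The feature set $\boldsymbol{F} = \{ {f^x}_t|t = 1, \ldots, 5;x = 1, \ldots, N_t\} $ denotes the features of all superpixel blocks in the first five frames.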
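As a small illustration of the extended region described above, the sketch below scales a box about its center. The function name `expand_box`, the `(x, y, w, h)` box layout, and the clipping to the image bounds are assumptions made for this example; the text only specifies the shared center and the 1.5 diagonal ratio.

```python
def expand_box(x, y, w, h, scale=1.5, img_w=None, img_h=None):
    """Scale a box (top-left x, y, width w, height h) about its centre.

    Scaling both sides by `scale` scales the diagonal by the same factor.
    Clipping to the image bounds is an assumption, not stated in the paper.
    """
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * scale, h * scale
    x0, y0 = cx - new_w / 2.0, cy - new_h / 2.0
    x1, y1 = cx + new_w / 2.0, cy + new_h / 2.0
    if img_w is not None:
        x0, x1 = max(0.0, x0), min(float(img_w), x1)
    if img_h is not None:
        y0, y1 = max(0.0, y0), min(float(img_h), y1)
    return x0, y0, x1 - x0, y1 - y0
```

Near the image border the clipped region is smaller than 1.5 times the target, so fewer background superpixels are available on that side.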

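Because segmentation is applied to the extended target region, the resulting superpixel blocks describe both the target and the background. The mean shift (MS) algorithm^[15] is therefore applied to cluster the feature set into $n$ clusters $clst\left( i \right)(i = 1, \ldots, n)$. Each cluster consists of a cluster center ${f_c}\left( i \right)$, a cluster radius ${r_c}\left( i \right)$, and cluster members $\left\{ {{f^x}_t|{f^x}_t \in clst\left( i \right)} \right\}$, where ${f^x}_t$ denotes the feature vector of the $x$-th superpixel in frame $t$. The $n$ clusters are then mapped back to local regions of the training frames.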
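The clustering step can be illustrated with a minimal flat-kernel mean shift. This is a sketch under stated assumptions, not the implementation of Ref. [15]: the `bandwidth` value, the flat kernel, and the mode-merging rule (`bandwidth / 2`) are hypothetical choices, and the toy one-dimensional features stand in for superpixel color features.

```python
import math


def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift on a list of equal-length tuples.

    Returns (centres, labels), where labels[k] is the cluster index of points[k].
    """
    modes = []
    for p in points:
        m = list(p)
        for _ in range(iters):
            # Neighbours of the current estimate inside the kernel window.
            nbrs = [q for q in points if math.dist(m, q) <= bandwidth]
            new = [sum(c) / len(nbrs) for c in zip(*nbrs)]
            shift = math.dist(m, new)
            m = new
            if shift < tol:
                break
        modes.append(m)

    # Merge modes that converged to (almost) the same location.
    centres, labels = [], []
    for m in modes:
        for k, c in enumerate(centres):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(k)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels


if __name__ == "__main__":
    feats = [(0.0,), (0.2,), (0.1,), (5.0,), (5.1,), (4.9,)]
    print(mean_shift(feats, bandwidth=1.0))
```

The number of clusters $n$ is not chosen in advance here; it follows from the bandwidth, which is one reason mean shift suits a feature set whose number of modes is unknown.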

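Clustering yields $n$ clusters, and the region of each cluster contains $S(i)$ superpixel blocks. If all of these blocks lie in the target region, the cluster belongs to the target; if all of them lie in the background, the cluster belongs to the background. In most cases, however, a cluster lies partly in the target region and partly in the background, so a confidence value is needed. Each cluster $clst\left( i \right)(i = 1, \ldots, n)$ is assigned a weight, the cluster confidence ${C^c}_i\left( {{C^c}_i \in [-1,1]} \right)$, defined as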

$ {C^c}_i = \frac{{Y\left( i \right)-E\left( i \right)}}{{S\left( i \right)}}, \forall i = 1, \ldots, n $

(1)

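where $Y(i)$ is the number of superpixel blocks that fall in the target region, $E(i)$ is the number that fall in the background region, and $S(i)$ is the total number of superpixel blocks in the cluster region, that is, $S\left( i \right) = Y\left( i \right) + E(i)$. ${C^c}_i$ indicates whether cluster $clst\left( i \right)(i = 1, \ldots, n)$ belongs to the target region. When ${C^c}_i > 0$, $clst(i)$ belongs to the target region, and a larger ${C^c}_i$ indicates a stronger bias toward the target. That is,

$ {p_i} = \left\{ {\begin{array}{*{20}{c}} {\text{target region}}&{{C^c}_i > 0}\\ {\text{background region}}&{{C^c}_i < 0} \end{array}} \right. $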

(2)

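Equations (1) and (2) determine whether each cluster belongs to the target region, and the prior model of the target is built accordingly; the process is shown in Fig. 2. In Fig. 2(d), a superpixel block with higher confidence ${C^c}_i$ is drawn in deeper red, meaning it is more likely to belong to the target; a block with lower confidence is drawn in deeper blue, meaning it is more likely to belong to the background.

Fig. 2 Building the prior model of the target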

((a) training frame tracking image; (b) extended target area; (c) super-pixel segmentation; (d) target confidence map; (e) prior model of the target; (f) extraction of target contour)
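The confidence of Eq. (1) and the labeling rule of Eq. (2) can be sketched as follows. The function names and the `"undecided"` label for a confidence of exactly 0 are assumptions for this example; Eq. (2) does not cover that case.

```python
def cluster_confidence(n_target, n_background):
    """Eq. (1): (Y - E) / S with S = Y + E, giving a value in [-1, 1]."""
    total = n_target + n_background
    if total == 0:
        raise ValueError("a cluster must contain at least one superpixel block")
    return (n_target - n_background) / total


def label_cluster(confidence):
    """Eq. (2): a positive confidence means target, a negative one background."""
    if confidence > 0:
        return "target"
    if confidence < 0:
        return "background"
    return "undecided"  # assumption: Eq. (2) leaves the zero case open


if __name__ == "__main__":
    c = cluster_confidence(n_target=8, n_background=2)
    print(c, label_cluster(c))
```

A cluster with 8 blocks inside the target region and 2 in the background has confidence (8 - 2) / 10 = 0.6 and is assigned to the target.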

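2.2 Anti-interference contour extraction

2.2.1 Feature representation of the target

Suppose the target feature space contains $m$ features, and let $q$ denote the target feature distribution, that is,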

$ q = \left\{ {{q_u}} \right\}, \sum\limits_{u = 1}^m {{q_u}} = 1, u \in 1, \ldots, m $

(3)

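The target selected by the prior model built in Section 2.1 serves as the initial target, and the best-matching target region is searched for in subsequent frames.

Let the level set function $\varphi $ represent the matched candidate region. The feature distribution $p(\varphi)$ to be matched against $q$ is expressed as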

$ \begin{array}{l} p\left( \varphi \right) = \left\{ {{p_u}\left( \varphi \right)} \right\}\\ \sum\limits_{u = 1}^m {{p_u}} \left( \varphi \right) = 1, u \in 1, \ldots, m \end{array} $

(4)

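Let $\left\{ {pixe{l_i}} \right\}, i \in 1, \ldots, n$ denote the set of pixels in the target region. The feature space function $b(x)$ of Ref. [11] quantizes each pixel of the set $\{ pixe{l_i}\} $, giving $b(pixe{l_i})$. The target feature probability is then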

$ {q_u} = \frac{1}{n}\sum\limits_{i{\rm{ }} = {\rm{ }}1}^n {\delta \left[{b{\rm{ }}\left( {pixe{l_i}} \right)-u} \right]} $

(5)

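where $\delta $ is the Kronecker delta function and $n$ is the normalization constant, equal to the number of pixels in the set. The feature distribution of the candidate region follows as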

$ \begin{array}{l} {p_u}\left( \varphi \right){\rm{ }} = {\rm{ }}\frac{1}{{\sum\limits_{i{\rm{ }} = {\rm{ }}1}^n {H\left( {\varphi \left( {pixe{l_i}} \right)} \right)} }} \times \\ \sum\limits_{i{\rm{ }} = {\rm{ }}1}^n {H\left( {\varphi \left( {pixe{l_i}} \right)} \right)\delta [b{\rm{ }}\left( {pixe{l_i}} \right)-u]} \end{array} $

(6)

where $H\left( x \right) = \frac{1}{2}\left( {1 + \frac{2}{\pi }{\rm{arctan}}\left( {\frac{x}{\varepsilon }} \right)} \right)$ is the smoothed Heaviside function, used to select the candidate region^[16].
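Equations (5) and (6) amount to a plain histogram and a Heaviside-weighted histogram. The sketch below assumes that the quantized values $b(pixel_i)$ are already available as integer bin indices and uses 0-based bins (the text numbers them from 1 to $m$); the function names and the default `eps` are illustrative choices.

```python
import math


def heaviside(x, eps=1.0):
    """Smoothed Heaviside function H(x) defined after Eq. (6)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(x / eps))


def target_distribution(bins, m):
    """Eq. (5): normalised histogram of the bin indices b(pixel_i)."""
    q = [0.0] * m
    for b in bins:
        q[b] += 1.0
    n = len(bins)
    return [v / n for v in q]


def candidate_distribution(bins, phi, m, eps=1.0):
    """Eq. (6): histogram in which pixel i is weighted by H(phi(pixel_i))."""
    weights = [heaviside(v, eps) for v in phi]
    total = sum(weights)
    p = [0.0] * m
    for b, w in zip(bins, weights):
        p[b] += w
    return [v / total for v in p]


if __name__ == "__main__":
    print(target_distribution([0, 0, 1, 2], m=3))
    print(candidate_distribution([0, 1], [100.0, -100.0], m=2))
```

Pixels with a large positive level set value contribute almost their full weight, and pixels with a large negative value contribute almost nothing, so $p(\varphi)$ is effectively the histogram of the region enclosed by the contour.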

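2.2.2 Deformation-based similarity measure

The Bhattacharyya similarity measure is widely used in target tracking. It measures the similarity of two discrete or continuous probability density distributions: the higher the Bhattacharyya coefficient, the more similar the two distributions. The similarity measure is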

$ E\left( \varphi \right) = \sum\limits_{u = 1}^m {\sqrt {{p_u}\left( \varphi \right){q_u}} } $

(7)
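Eq. (7) is short enough to check by hand. The sketch below computes it for two discrete distributions given as equal-length lists; the function name is an assumption for this example.

```python
import math


def bhattacharyya(p, q):
    """Eq. (7): sum over u of sqrt(p_u * q_u)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


if __name__ == "__main__":
    print(bhattacharyya([0.5, 0.5], [0.5, 0.5]))  # identical distributions
    print(bhattacharyya([1.0, 0.0], [0.0, 1.0]))  # disjoint support
```

Identical distributions give a coefficient of 1 and distributions with disjoint support give 0, which is why a higher value indicates a better match.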

To overcome the effects of target scaling and deformation during tracking, a deformation-based measure is adopted. Let ${\varphi _0}$ be the initial target region in the current frame; setting ${\varphi _0} = 0$ gives the initial contour of the target, and Eq. (4) gives its feature distribution $p\left( {{\varphi _0}} \right) = \left\{ {{p_u}\left( {{\varphi _0}} \right)} \right\}, u \in 1, \ldots, m$. Let $\varphi $ be a candidate region near ${\varphi _0}$. The general form (derived in Appendix A) is

$ \begin{array}{l} E\left( \varphi \right) = \frac{1}{2}\sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + \frac{1}{{2\sum\limits_{i = 1}^n {H\left( {\varphi \left( {pixe{l_i}} \right)} \right)} }} \times \\ \sum\limits_{i = 1}^n {\sum\limits_{u = 1}^m {\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } } \delta \left[ {b\left( {pixe{l_i}} \right)-u} \right]H(\varphi (pixe{l_i})) \end{array} $

(8)
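The first-order form derived in Appendix A can be compared numerically with the exact coefficient of Eq. (7). The sketch below writes the linearization in terms of the distributions $p(\varphi)$, $p(\varphi_0)$, and $q$, before Eq. (6) is substituted; the function names and the rule of skipping bins where $p_u(\varphi_0) = 0$ are assumptions for this example.

```python
import math


def bhattacharyya(p, q):
    """Exact coefficient of Eq. (7)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def linearised_similarity(p, p0, q):
    """First-order expansion of Eq. (7) about p0 (Appendix A).

    E ~= 0.5 * sum sqrt(p0_u q_u) + 0.5 * sum p_u sqrt(q_u / p0_u)
    Bins with p0_u == 0 are skipped (assumption) to avoid division by zero.
    """
    const = 0.5 * bhattacharyya(p0, q)
    linear = 0.5 * sum(
        pu * math.sqrt(qu / p0u) for pu, p0u, qu in zip(p, p0, q) if p0u > 0
    )
    return const + linear


if __name__ == "__main__":
    p0 = [0.5, 0.3, 0.2]
    q = [0.4, 0.4, 0.2]
    p = [0.48, 0.32, 0.2]
    print(bhattacharyya(p, q), linearised_similarity(p, p0, q))
```

At $p = p(\varphi_0)$ the two expressions coincide, and for a nearby candidate they differ only by a second-order term, which is what makes the linear form usable for evolving $\varphi$ around the initial region.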

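2.2.3 Decision method

When the target is occluded by an obstacle, the best-matching region found by the Bhattacharyya similarity measure often fails to represent the target accurately. A decision method is therefore proposed to avoid interference from obstacles.

To decide whether occlusion occurs during tracking, an occlusion threshold ${\theta _0}$ (set to 0.4) is compared with the occlusion measure ${\theta _t}$ computed for the current frame; if ${\theta _t} < {\theta _0}$, the target is judged to be occluded. ${\theta _t}$ is the ratio of the target-region confidence $C_t$ in the current frame (computed with Eq. (1)) to the mean target-region confidence over the five most recent tracked frames, that is,

$ {\theta _t} = {C_t}/{\rm{mean}}({C_t}, {C_{t-1}}, \ldots, {C_{t-4}}) $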

(9)
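The occlusion test of Eq. (9) can be sketched as follows. The list `conf_history` (confidences ordered in time, ending with $C_t$) and the function names are assumptions for this example, as is raising an error when the window mean is not positive, a case the text does not discuss.

```python
def occlusion_ratio(conf_history):
    """Eq. (9): theta_t = C_t / mean(C_t, ..., C_{t-4})."""
    window = conf_history[-5:]
    mean = sum(window) / len(window)
    if mean <= 0:
        # Assumption: the ratio is only meaningful for a positive mean.
        raise ValueError("mean confidence must be positive")
    return window[-1] / mean


def is_occluded(conf_history, theta0=0.4):
    """The target is judged occluded when theta_t < theta_0 (0.4 in the paper)."""
    return occlusion_ratio(conf_history) < theta0


if __name__ == "__main__":
    print(is_occluded([0.8, 0.8, 0.8, 0.8, 0.8]))  # steady confidence
    print(is_occluded([0.8, 0.8, 0.8, 0.8, 0.1]))  # sudden drop
```

A steady confidence gives a ratio close to 1, while a sudden drop from 0.8 to 0.1 gives a ratio of about 0.15, below the 0.4 threshold.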

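2.2.4 Target contour extraction

The decision in Section 2.2.3 determines whether the target is occluded. If it is, the shape model of the target is intersected with the target region produced by the similarity measure, which fuses the results of the shape subspace and the color space, and the intersection is used as the initial contour for level set evolution. If it is not, the target region produced by the similarity measure is used directly as the initial contour. Level set evolution then refines the region and extracts the precise target contour. Using the variational method of Ref. [17] and differentiating Eq. (8) with respect to $\varphi (pixe{l_i})$ gives

$ \begin{array}{l} \frac{{\partial E\left( {\varphi \left( {pixe{l_i}} \right)} \right)}}{{\partial \varphi \left( {pixe{l_i}} \right)}} = \frac{1}{2}\delta \left( {\varphi \left( {pixe{l_i}} \right)} \right) \times \\ \left( {\frac{{\sum\limits_{u = 1}^m {\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } \delta \left[ {b\left( {pixe{l_i}} \right)-u} \right]}}{{\sum\limits_{i = 1}^n {H\left( {\varphi \left( {pixe{l_i}} \right)} \right)} }}} \right) \end{array} $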

(10)

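where $\delta \left( {\varphi \left( {pixe{l_i}} \right)} \right) = \varepsilon /(\pi (\varphi {\left( {pixe{l_i}} \right)^2} + {\varepsilon ^2}))$ is the derivative of the Heaviside function; it acts on all level curves so that a global optimum can be computed. The level set is evolved by gradient descent until

$ \partial E\left( {\varphi \left( {pixe{l_i}} \right)} \right)/\partial \varphi \left( {pixe{l_i}} \right) = 0 $

which gives the final extracted target contour.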
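The per-pixel quantity in Eq. (10) can be sketched on a flat list of pixels. This is an illustration under stated assumptions rather than the authors' solver: the step size `step`, the choice to move $\varphi$ in the direction that increases $E$, 0-based bin indices, and the omission of any regularization or re-initialization of $\varphi$ are all choices made for this example.

```python
import math


def heaviside(x, eps=1.0):
    """Smoothed Heaviside function H(x)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(x / eps))


def dirac(x, eps=1.0):
    """Derivative of the smoothed Heaviside: eps / (pi * (x^2 + eps^2))."""
    return eps / (math.pi * (x * x + eps * eps))


def evolve_step(phi, bins, p0, q, step=1.0, eps=1.0):
    """One update of phi using the gradient of Eq. (10).

    phi  : level set values, one per pixel
    bins : quantised feature index b(pixel_i) for each pixel (0-based)
    p0,q : candidate distribution at phi_0 and target distribution
    """
    area = sum(heaviside(v, eps) for v in phi)
    new_phi = []
    for v, b in zip(phi, bins):
        weight = math.sqrt(q[b] / p0[b]) if p0[b] > 0 else 0.0
        grad = 0.5 * dirac(v, eps) * weight / area
        new_phi.append(v + step * grad)  # assumed sign: increase E
    return new_phi


if __name__ == "__main__":
    print(evolve_step([0.0, 0.0], [0, 1], p0=[0.5, 0.5], q=[0.8, 0.2]))
```

Pixels whose feature bin is under-represented in the current region relative to the target model (large $q_u / p_u(\varphi_0)$) receive the larger update, so the contour grows toward them first.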

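2.3 Target model update

The target model used in the algorithm has two parts: the contour model obtained by level set evolution and the target shape model based on superpixel clustering.

2.3.1 Contour model update

In dynamic scenes, the feature distribution of the target changes continuously as the target moves and deforms. Because Section 2.2.2 adopts a deformation-based measure, the contour model is updated by a weighted fusion of the current-frame result and the prior target model, that is,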

$ {q_{{\rm{update}}}} = aq + \left( {1-a} \right){p_t} $

(11)

where $p_t$ is the feature distribution of the target region in frame $t$ and $a$ is the update weight; in the experiments, $a \in [0.8, 0.95]$.
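Eq. (11) is a convex combination of two distributions, as the short sketch below shows. The function name and the default `a=0.9` (a value inside the reported range) are assumptions for this example.

```python
def update_contour_model(q, p_t, a=0.9):
    """Eq. (11): q_update = a * q + (1 - a) * p_t, applied bin by bin."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("update weight a must lie in [0, 1]")
    return [a * qu + (1.0 - a) * pu for qu, pu in zip(q, p_t)]


if __name__ == "__main__":
    print(update_contour_model([1.0, 0.0], [0.0, 1.0], a=0.8))
```

Because both inputs sum to 1 and the weights sum to 1, the updated model is again a valid distribution; a weight close to 1 keeps the model close to the prior and lets the current frame change it only slowly.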

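2.3.2 Target shape model update

The conventional way to update an appearance model is to track for several frames (typically 5 or 10) and then rebuild the model by replacing the oldest frame in the cluster library with the tracking result of the newest frame. Such methods tend to drift or even lose the target when it is severely occluded.

To decide whether severe occlusion occurs during tracking, an occlusion threshold ${\theta _1}$ (set to 0.7) is compared with the occlusion measure ${\theta _t}$ of the current frame computed by Eq. (9); if ${\theta _t} < {\theta _1}$, the target is judged to be severely occluded. Without severe occlusion, the tracking result of the newest frame replaces the oldest frame in the appearance model, and the appearance model is rebuilt (once every 10 frames). With severe occlusion, the frame closest to the current frame is selected as the replaced frame, and $m$ of its cluster features with confidence greater than 0 (features that belong to the target region) are selected as the compensation set for the current frame. In the experiments $m = 10$; if the replaced frame has fewer than 10 features that belong to the target region, all of its features with confidence greater than 0 form the compensation set. The compensation set is merged into the feature set of the current (severely occluded) frame, and the result is added to the clustered feature set as a new feature set, which completes one update. Recomputing the confidences of this cluster set yields the new discriminative appearance model.

The compensation set is essential; its effect is shown in Fig. 3. The two groups of images show that updating the model only by replacing whole frames cannot represent the severely occluded part of the target, so the appearance model does not describe the target region effectively. When the occlusion is removed, such an appearance model cannot accurately recognize the previously occluded part of the target, and the tracking result is inaccurate. With the compensation set, the appearance model recognizes the occluded part of the target, and tracking is more accurate.

Fig. 3 Effect of the compensation set on the discriminative appearance model ((a) with the compensation set; (b) without the compensation set)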
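The selection of the compensation set can be sketched as follows. The text states that up to $m$ features with confidence greater than 0 are taken from the replaced frame but does not say which ones are preferred when more than $m$ qualify; ranking by confidence is an assumption of this example, as are the function name and the string placeholders used for features.

```python
def compensation_set(features, confidences, m=10):
    """Pick at most m features whose confidence is greater than 0.

    Features are ranked by confidence (assumption); if fewer than m
    qualify, all qualifying features are returned.
    """
    positive = [(c, f) for f, c in zip(features, confidences) if c > 0]
    positive.sort(key=lambda cf: cf[0], reverse=True)
    return [f for _, f in positive[:m]]


if __name__ == "__main__":
    feats = ["a", "b", "c", "d"]
    confs = [0.9, -0.2, 0.4, 0.0]
    print(compensation_set(feats, confs, m=10))
    print(compensation_set(feats, confs, m=1))
```

The returned features would then be merged into the feature set of the severely occluded frame before re-clustering.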

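3 Experiments and analysis of results

Test environment: Intel Core i7 CPU with 16 GB of memory. For every video sequence the target region is marked manually in the first frame. To verify the performance of the algorithm, six public video sequences are used (http://www4.comp.polyu.edu.hk/cslzhang/JRACS.htm; http://www.visual-tracking.net). The sequences cover the main challenges in tracking, such as illumination change, partial occlusion, target deformation, and complex background. The algorithm is compared with recent contour tracking algorithms: DMLS (density matching and level set)^[9], LDM (learning distribution metric)^[18], and JRACS (joint registration and active contour segmentation)^[11]. Center error and tracking overlap ratio are used to evaluate AC-PMC. The center error^[19-20] is the Euclidean distance, in pixels, between the true target center and the tracked target center, and it measures the accuracy of the algorithm. For the tracking overlap ratio^[21], the target position is annotated manually in the six sequences and the overlap index $s$ evaluates tracking performance: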

$ s = \frac{{a\left( {{\boldsymbol{R}_{\rm{T}}} \cap {\boldsymbol{R}_{\rm{G}}}} \right)}}{{a({\boldsymbol{R}_{\rm{T}}} \cup {\boldsymbol{R}_{\rm{G}}})}} $

(12)

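where $a$ denotes the area of a region, ${{\boldsymbol{R}_{\rm{T}}}}$ denotes the annotated ground-truth region, and ${{\boldsymbol{R}_{\rm{G}}}}$ denotes the region output by the tracker. A larger $s$ means better localization accuracy, that is, a higher tracking success rate.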
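Both evaluation measures can be computed directly on pixel sets. In the sketch below a region is a Python set of `(row, col)` pixel coordinates, and the region center is taken to be the centroid of its pixels; both choices and the function names are assumptions for this example, since the text does not state how the center is obtained.

```python
import math


def overlap_index(tracked, truth):
    """Eq. (12): area of the intersection divided by area of the union."""
    union = tracked | truth
    if not union:
        raise ValueError("both regions are empty")
    return len(tracked & truth) / len(union)


def centre_error(tracked, truth):
    """Euclidean distance in pixels between the two region centroids."""

    def centroid(pixels):
        n = len(pixels)
        return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)

    return math.dist(centroid(tracked), centroid(truth))


if __name__ == "__main__":
    tracked = {(r, c) for r in range(4) for c in range(4)}
    truth = {(r, c) for r in range(4) for c in range(2, 6)}
    print(overlap_index(tracked, truth), centre_error(tracked, truth))
```

Two 4 x 4 regions shifted by two columns share 8 of 24 pixels, so $s = 1/3$, and their centroids are 2 pixels apart.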

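3.1 Tracking efficiency analysis

Unlike conventional level set contour trackers, AC-PMC builds a prior model of the target from a training sample set, uses a decision method to determine whether the prior model is needed, and applies an online model update algorithm that adds feature compensation to the feature set to keep the model accurate. To evaluate the contribution of each component, AC-PMC is divided into three stages: contour tracking without shape constraint (plain level set contour tracking), contour tracking with shape constraint but without feature compensation, and contour tracking with shape constraint and feature compensation.

3.1.1 Accuracy analysis

The three stages of AC-PMC are compared on the Lemming sequence; the center errors are shown in Fig. 4. Adding the shape constraint clearly improves tracking. When the target is heavily occluded, however, the shape prior model cannot describe the target region effectively, and when the occlusion is removed, the appearance model cannot accurately recognize the previously occluded part of the target, so the tracking result is inaccurate. Feature compensation is therefore added to the model update to keep the model's description of the target accurate and improve tracking.

Fig. 4 Center errors of the three stages of the algorithm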

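3.1.2 Timeliness analysis

Running time is an important indicator of efficiency. For consistency, the three stages are compared on the Lemming sequence. Table 1 lists the number of iterations to convergence and the overall running time each variant needs to reach a given tracking overlap ratio.

Table 1 Timeliness comparison of the three tracking variants

Overlap ratio	No shape constraint		Shape constraint without feature compensation		Shape constraint with feature compensation
Overlap ratio	Iterations	Mean running time/s	Iterations	Mean running time/s	Iterations	Mean running time/s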
0.7	3	0.143	2	0.343	2	0.442
0.8	14	0.779	5	0.493	4	0.621
0.85	－	－	8	0.712	6	0.794
0.9	－	－	－	－	9	0.975
Note: "－" indicates that the variant cannot reach the specified tracking overlap ratio on the image sequence.

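The comparison shows that the conventional contour tracker is fast (about 8 frames/s) but reaches only a low overlap ratio: it needs 14 iterations to reach 0.8 and cannot reach 0.85. The shape-constrained tracker can raise the overlap ratio by adding iterations, and it reaches the same overlap ratio as the unconstrained tracker with fewer iterations (5 versus 14 at an overlap ratio of 0.8), which shortens the running time at that level (0.493 s versus 0.779 s). When the target is heavily occluded, however, the shape prior model cannot describe the target region effectively; adding feature compensation to the constraint model further raises the achievable overlap ratio (up to 0.9 in Table 1).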

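3.2 Comparative experiments

3.2.1 Choice of experimental parameters

In the first few frames of a video sequence the target does not move far, so a target model trained on the target information from these frames is more robust. Because real-time performance is an important criterion, the time spent on superpixel segmentation, clustering, and confidence computation is reduced, without sacrificing the tracking success rate, by using only the first five frames as training samples.

Because the level set method extracts the target contour, the tracking overlap ratio depends on the number of iterations of contour extraction (until convergence). Too many iterations cause over-segmentation, and too few give a low-precision contour; both lower the overlap ratio computed with Eq. (12). To keep the overlap ratio high while meeting the real-time requirement, the number of level set iterations is set to 5.

For the number of target features $m$: if $m$ is too small, the appearance model after feature compensation cannot describe the target effectively under severe occlusion, which causes drift; if $m$ is too large, superpixel clustering and matching take too long and the algorithm becomes slower. Following the experience of Ref. [14], $m = 10$ is chosen to reduce the time for superpixel clustering and confidence computation while keeping feature matching and feature compensation robust.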

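3.2.2 Experimental results

Table 2 lists the video sequences used in the experiments. To analyze tracking accuracy more directly, Tables 3 and 4 give the tracking overlap ratio and the average center error of the algorithms on the six sequences. The two tables show that AC-PMC obtains good tracking results on most sequences.

Table 2 Information on the test image sequences

Sequence	Illumination change	Occlusion	Deformation	Fast motion	Complex background	Rotation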
Fish			√			√
Face1		√		√		√
Face2		√			√	√
Shop		√		√	√
Train		√		√	√
Lemming		√		√	√

Table 3 Tracking overlap ratio of the algorithms (%)

Sequence	DMLS	LDM	JRACS	AC-PMC
Fish	61	84	86	92
Face1	46	74	67	74
Face2	52	84	76	85
Shop	－	54	42	77
Train	－	43	27	73
Lemming	－	63	46	82
Note: "－" indicates that the algorithm lost the target.

Table 4 Average center error of the algorithms (pixels)

Sequence	DMLS	LDM	JRACS	AC-PMC
Fish	15.32	5.27	4.84	3.46
Face1	25.61	7.83	8.24	7.16
Face2	15.47	4.41	7.73	3.82
Shop	－	27.65	31.26	13.42
Train	－	31.39	54.26	14.72
Lemming	－	16.48	28.39	12.47

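3.2.3 Accuracy analysis

The tracking results on the Fish sequence are shown in Fig. 5. This experiment tests tracking when the fish deforms and rotates. On this sequence the overlap ratios of all four algorithms stay above 60% and the center errors are small; AC-PMC performs best. When the fish deforms while rotating, DMLS has a low overlap ratio. LDM and JRACS have higher overlap ratios but occasionally lose local information. AC-PMC builds the target prior model by clustering the training sample set, which removes the interference of non-target information and makes the prior model describe the target more accurately. Combined with the Bhattacharyya similarity measure for model matching, the coarse contour produced by model matching is fairly accurate when the target's appearance is distinctive, and level set curve evolution then yields the precise target contour.

Fig. 5 Tracking results of the four algorithms on the Fish sequence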

((a) DMLS algorithm; (b) LDM algorithm; (c) JRACS algorithm; (d) AC-PMC algorithm)

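The tracking results on Face1 and Face2 are shown in Figs. 6 and 7. These two sequences test tracking when the face rotates and is occluded. On Face1, LDM, JRACS, and AC-PMC all have fairly high overlap ratios. Because the back of the head stays in view for a long time, AC-PMC does not lose the target, but while the face is turned away from the camera the decision method is triggered (the face rotation is treated as occlusion) and compensation features keep being added during model updates. When the face reappears, the shape prior model has to be re-clustered adaptively; because the face was turned away for so long, the compensation features do not identify the target accurately, the subsequent level set evolution extracts a less precise contour, and the overlap ratio of AC-PMC drops. Increasing the number of compensation features $m$ would improve AC-PMC, but to keep the comparison fair this result is not reported. On Face2, LDM and JRACS have high overlap ratios, and AC-PMC has the highest. The face is turned away from the camera for only a few frames; as it gradually reappears, the shape prior model updates the feature set through repeated clustering, and the compensation features represent the missing feature information fairly accurately, which improves the precision of contour extraction and gives AC-PMC the best result.

Fig. 6 Tracking results of the four algorithms on the Face1 sequence

((a) DMLS algorithm; (b) LDM algorithm; (c) JRACS algorithm; (d) AC-PMC algorithm)

Fig. 7 Tracking results of the four algorithms on the Face2 sequence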

((a) DMLS algorithm; (b) LDM algorithm; (c) JRACS algorithm; (d) AC-PMC algorithm)

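The tracking results on Shop and Train are shown in Figs. 8 and 9. These two sequences test tracking in fairly complex environments, which challenges tracking accuracy. The overlap ratios on these sequences are lower and the center errors larger because of the $s$ measure: when the target is partially occluded, $s$ is generally smaller, and in Train, where people cross each other frequently, $s$ becomes smaller still. On Shop, LDM and JRACS track poorly, whereas AC-PMC tracks comparatively well. Because model matching under the prior model constraint delineates the target contour, the coarse contour is fairly accurate under partial occlusion; and because the shape prior model is introduced into the model update, the feature set is updated through repeated clustering and the compensation features fill in the feature information lost to occlusion. This improves contour precision and avoids target loss and inaccurate contour evolution. The Train sequence is used to further demonstrate robustness. In Train the environment is more complex and partial occlusion is frequent: DMLS loses the target from the start, LDM and JRACS track poorly, and AC-PMC is comparatively more robust, with an overlap ratio of 73%.

Fig. 8 Tracking results of the three algorithms on the Shop sequence

((a) LDM algorithm; (b) JRACS algorithm; (c) AC-PMC algorithm)

Fig. 9 Tracking results of the three algorithms on the Train sequence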

((a) LDM algorithm; (b) JRACS algorithm; (c) AC-PMC algorithm)

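The tracking results on the Lemming sequence are shown in Fig. 10. This sequence tests tracking when the lemming is partially occluded against a complex background. The lemming moves continuously, the background keeps changing, and partial occlusion occurs, which makes tracking difficult. LDM, JRACS, and AC-PMC all maintain relatively high overlap ratios, and Table 3 shows that AC-PMC has the highest (82%, versus 63% for LDM and 46% for JRACS). Because AC-PMC introduces a shape constraint, the prior shape model constrains the level set evolution when the target is partially occluded, so the extracted contour is fairly accurate. When the target is heavily occluded, however, the shape prior model cannot describe the target region effectively, and when the occlusion is removed, the appearance model cannot accurately recognize the previously occluded part of the target, so the tracking result is inaccurate. The compensation feature set added in the model update makes up for the target features lost to occlusion, keeps the model's description of the target accurate, and improves tracking.

Fig. 10 Tracking results of the three algorithms on the Lemming sequence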

((a) LDM algorithm; (b) JRACS algorithm; (c) AC-PMC algorithm)

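3.2.4 Comparison of computational efficiency

To compare computational efficiency, the average time per frame is measured for AC-PMC and the other trackers on the same image sequences; the resulting average running speeds are listed in Table 5.

Table 5 Average running speeds of the tracking algorithms (frames/s)

Sequence	DMLS	LDM	JRACS	AC-PMC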
Fish	16.64	7.57	4.92	4.27
Face1	16.16	7.24	4.47	4.03
Face2	15.98	7.22	4.31	3.11
Shop	－	5.93	3.27	2.94
Train	－	5.23	3.13	2.16
Lemming	－	5.29	2.82	1.71
Note: "－" indicates that the algorithm lost the target.

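Because AC-PMC and JRACS both extract the target contour by level set evolution, the number of iterations is set to 5 for both. The comparison on the same videos shows that DMLS is fast, but its overlap ratio is low and it even loses the target. AC-PMC constrains the level set evolution with a feature-compensated shape model and updates the target model through the decision method, so it runs somewhat slower than LDM and JRACS. Compared with JRACS, the most similar algorithm, AC-PMC achieves a higher overlap ratio at a comparable average running speed.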

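4 Conclusion

This paper proposes an anti-interference contour tracking algorithm under a prior model constraint. The algorithm builds the target prior model by clustering a training sample set, which removes the interference of non-target information in the image and makes the prior model describe the target more accurately. A decision method is introduced into contour extraction to determine whether the prior model is needed. If it is, the evolution results in the shape subspace and the color space are fused during level set segmentation. In the model update, the algorithm checks whether the target is severely occluded and applies a new online model update algorithm that adds appropriate feature compensation to the feature set to keep the model accurate.

Comparative experiments on six video sequences show that AC-PMC achieves high tracking accuracy and good robustness under partial occlusion, target deformation, and complex background.

Although the proposed algorithm uses a shape prior model as a constraint and adds a compensation feature set in the model update to avoid the influence of occluders, the target may still be lost when its pose and shape features change. Future work will further study how to distinguish and exploit the features of the target itself and those of the image background.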

Appendix A General derivation of the deformation-based similarity measure

Let $\varphi $ be a candidate region near ${\varphi _0}$. Expanding $E\left( \varphi \right) = \sum\limits_{u = 1}^m {\sqrt {{p_u}\left( \varphi \right){q_u}} } $ in a first-order Taylor series about ${p_u}({\varphi _0})$ gives the similarity measure

$ \begin{array}{l} E\left( \varphi \right) = \sum\limits_{u = 1}^m {\sqrt {{p_u}\left( \varphi \right){q_u}} } \approx \sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + \\ \frac{1}{2}\sum\limits_{u = 1}^m {{p_u}\left( \varphi \right){q_u}\sqrt {\frac{1}{{{p_u}\left( {{\varphi _0}} \right){q_u}}}} } - \frac{1}{2}\sum\limits_{u = 1}^m {{p_u}\left( {{\varphi _0}} \right){q_u}\sqrt {\frac{1}{{{p_u}\left( {{\varphi _0}} \right){q_u}}}} } = \\ \sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + \frac{1}{2}\sum\limits_{u = 1}^m {{p_u}\left( \varphi \right)\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } - \frac{1}{2}\sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } = \\ \frac{1}{2}\left( {\sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + \sum\limits_{u = 1}^m {{p_u}\left( \varphi \right)\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } } \right) \end{array} $

Substituting Eq. (6) into the expression above gives Eq. (8), that is,

$ \begin{array}{c} E\left( \varphi \right){\rm{ }} = {\rm{ }}\frac{1}{2}\left( {\sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + {\rm{ }}\sum\limits_{u = 1}^m {{p_u}\left( \varphi \right)\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } } \right) = \\ \frac{1}{2}\sum\limits_{u = 1}^m {\sqrt {{p_u}\left( {{\varphi _0}} \right){q_u}} } + {\rm{ }}\frac{1}{{2\sum\limits_{i{\rm{ }} = {\rm{ }}1}^n {H\left( {\varphi \left( {pixe{l_i}} \right)} \right)} }} \times \\ \sum\limits_{i{\rm{ }} = {\rm{ }}1}^n {\sum\limits_{u = 1}^m {\sqrt {\frac{{{q_u}}}{{{p_u}\left( {{\varphi _0}} \right)}}} } } H\left( {\varphi \left( {pixe{l_i}} \right)} \right)\delta \left[{b{\rm{ }}\left( {pixe{l_i}} \right)-u} \right] \end{array} $

References

[1] Vatavu A, Danescu R, Nedevschi S. Stereovision-based multiple object tracking in traffic scenarios using free-form obstacle delimiters and particle filters[J]. IEEE Transactions on Intelligent Transportation Systems, 2015, 16(1): 498–511. [DOI:10.1109/TITS.2014.2366248]

[2] Lian F, Han C Z, Liu W F, et al. Tracking partly resolvable group targets using SMC-PHDF[J]. Acta Automatica Sinica, 2010, 36(5): 731–741. [连峰, 韩崇昭, 刘伟峰, 等. 基于SMC-PHDF的部分可分辨的群目标跟踪算法[J]. 自动化学报, 2010, 36(5): 731–741. ] [DOI:10.3724/SP.J.1004.2010.00731]

[3] Khatoonabadi S H, Bajic I V. Video object tracking in the compressed domain using spatio-temporal Markov random fields[J]. IEEE Transactions on Image Processing, 2013, 22(1): 300–313. [DOI:10.1109/TIP.2012.2214049]

[4] Wang M H, Liang Y, Liu F M, et al. Object tracking based on component-level appearance model[J]. Journal of Software, 2015, 26(10): 2733–2747. [王美华, 梁云, 刘福明, 等. 部件级表观模型的目标跟踪方法[J]. 软件学报, 2015, 26(10): 2733–2747. ] [DOI:10.13328/j.cnki.jos.004737]

[5] Ganta R R, Zaheeruddin S, Baddiri N, et al. Segmentation of oil spill images with illumination-reflectance based adaptive level set model[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2012, 5(5): 1394–1402. [DOI:10.1109/JSTARS.2012.2201249]

[6] Li X, Dick A, Shen C H, et al. Incremental learning of 3D-DCT compact representations for robust visual tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(4): 863–881. [DOI:10.1109/TPAMI.2012.166]

[7] Smeulders A W M, Chu D M, Cucchiara R, et al. Visual tracking:an experimental survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1442–1468. [DOI:10.1109/TPAMI.2013.230]

[8] Freedman D, Zhang T. Active contours for tracking distributions[J]. IEEE Transactions on Image Processing, 2004, 13(4): 518–526. [DOI:10.1109/TIP.2003.821445]

[9] Zhang T, Freedman D. Improving performance of distribution tracking through background mismatch[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(2): 282–287. [DOI:10.1109/TPAMI.2005.31]

[10] Chiverton J, Xie X H, Mirmehdi M. Automatic bootstrapping and tracking of object contours[J]. IEEE Transactions on Image Processing, 2012, 21(3): 1231–1245. [DOI:10.1109/TIP.2011.2167343]

[11] Ning J F, Zhang L, Zhang D, et al. Joint registration and active contour segmentation for object tracking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2013, 23(9): 1589–1597. [DOI:10.1109/TCSVT.2013.2254931]

[12] Rathi Y, Vaswani N, Tannenbaum A, et al. Tracking deforming objects using particle filtering for geometric active contours[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(8): 1470–1475. [DOI:10.1109/TPAMI.2007.1081]

[13] Oron S, Bar-Hillel A, Levi D, et al. Locally orderless tracking[C]//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI:IEEE, 2012:1940-1947.[DOI:10.1109/CVPR.2012.6247895]

[14] Wang S, Lu H C, Yang F, et al. Superpixel tracking[C]//Proceedings of 2011 IEEE International Conference on Computer Vision. Barcelona, Spain:IEEE, 2011:1323-1330.[DOI:10.1109/ICCV.2011.6126385]

[15] Comaniciu D, Meer P. Mean shift:a robust approach toward feature space analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(5): 603–619. [DOI:10.1109/34.1000236]

[16] Chan T F, Vese L A. Active contours without edges[J]. IEEE Transactions on Image Processing, 2001, 10(2): 266–277. [DOI:10.1109/83.902291]

[17] Clement J, Novas N, Gazquez J A, et al. An active contour computer algorithm for the classification of cucumbers[J]. Computers and Electronics in Agriculture, 2013, 92: 75–81. [DOI:10.1016/j.compag.2013.01.006]

[18] Ma B, Wu Y W. Learning distribution metric for object contour tracking[C]//Proceedings of 2011 International Conference on Multimedia Technology. Hangzhou:IEEE, 2011:3120-3123.[DOI:10.1109/ICMT.2011.6001851]

[19] Xu Y H, Tian Z H, Zhang Y Q, et al. Adaptively combining color and depth for human body contour tracking[J]. Acta Automatica Sinica, 2014, 40(8): 1623–1634. [徐玉华, 田尊华, 张跃强, 等. 自适应融合颜色和深度信息的人体轮廓跟踪[J]. 自动化学报, 2014, 40(8): 1623–1634. ] [DOI:10.3724/SP.J.1004.2014.01623]

[20] Wang F, Fang S. Visual tracking based on the discriminative dictionary and weighted local features[J]. Journal of Image and Graphics, 2014, 19(9): 1316–1323. [王飞, 房胜. 加权局部特征结合判别式字典的目标跟踪[J]. 中国图象图形学报, 2014, 19(9): 1316–1323. ] [DOI:10.11834/jig.20140908]

[21] Yang B, Lin G Y, Zhang W G, et al. Robust object tracking incorporating residual unscented particle filter and discriminative sparse representation[J]. Journal of Image and Graphics, 2014, 19(5): 730–738. [杨彪, 林国余, 张为公, 等. 融合残差Unscented粒子滤波和区别性稀疏表示的鲁棒目标跟踪[J]. 中国图象图形学报, 2014, 19(5): 730–738. ] [DOI:10.11834/jig.20140511]