# Keywords

Traffic video significance foreground target extraction in complex scenes
Lang Hong, Ding Shuo, Lu Jian, Ma Xiaoli
The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China
Supported by: National Natural Science Foundation of China (71671127)

# Abstract

Objective In urban traffic monitoring, the wide deployment of intelligent video surveillance has driven research interest in using artificial intelligence and computer vision to retrieve and recognize foreground objects in video and to analyze them further, for example through feature extraction and abnormal behavior analysis. In complex environments, however, the discontinuity of the dynamic background causes partial loss of foreground target image information, false detections, and misjudgments. Building an effective, high-performance extractor raises two core issues. The first is detection speed and efficiency: because the volume of video data is large, identifying in advance which frames contain no foreground object and discarding them early, so that only frames with a salient foreground target are processed, greatly improves detection efficiency. The second is object integrity in complex environments: extracting the foreground of the video sequence effectively is key to the reliability of subsequent algorithms.

Method This paper proposes an optimization method based on robust principal component analysis (RPCA). The classical RPCA detection method uses the $l_0$-norm to decide independently whether each pixel belongs to a moving target, which makes it poorly suited to eliminating the unstructured sparse components caused by noise and random background disturbance. The aim here is to preserve the robustness of the algorithm in complex environments while optimizing the image produced by the initial RPCA filtering. To screen and track foreground targets quickly, a fast algorithm for extracting the frame numbers of salient targets is designed on the basis of the frame-difference Euclidean distance method, which restricts detection to the neighborhood of key frames. A sparse low-rank model is then established and solved, and parallel recognition of foreground target seeds is performed on the initially filtered foreground image to remove the dynamic background. Mask images after gray-value inversion show that foreground target pixels have small gray values and strong directionality. The parallel recognition and optimized connection of foreground target seeds is therefore designed in three steps: 1) gray pixel seed recognition, in which the gray values of the source image are inverted and pixels are verified by gray level and symmetry detection and identified as foreground or non-foreground target sub-blocks; 2) optimized connection, in which foreground target seeds are connected according to gray value and directional similarity, then fused and denoised with multiple templates; 3) seed filling, in which foreground targets are filled to enhance connectivity and make the target more complete. The foreground objects in the mask image are also classified as regular or irregular. For irregular targets such as pedestrians and animals, a vertical seed growth algorithm is applied within the target region to repair fault separations. For regular targets such as cars and ships, the foreground seeds in the region are connected vertically and horizontally to remove holes and compensate for missing structural information.

Result The foreground target extraction remains highly robust in complex environments with challenging interference. The test data comprise four groups of classic videos from the database and two videos of the Shanxi Taichang Expressway, whose dynamic backgrounds include flowing water, swaying leaves, slight camera jitter, and changing light and shadow. The experimental results are analyzed from three perspectives: application effect, accuracy of foreground target localization, and completeness of foreground target detection. The salient target extraction algorithm achieves an average precision of 90.1%, an average recall of 88.7%, and an average F value of 89.4%, all higher than those of the compared algorithms. For the Gaussian mixture model and the optical flow algorithm, the complex background introduces substantial noise. The Gaussian mixture model relies on morphological operations to remove noise and fill holes, which leaves more adhesion in the detected foreground target, and its detection quality varies greatly across different shadow areas. The optical flow algorithm is sensitive to illumination and mistakes lighting changes for optical flow, which limits its use in demanding environments.

Conclusion By quickly locating the salient foreground, this paper proposes a parallel seed identification and optimized connection algorithm for the image produced by the initial RPCA screening. Qualitative and quantitative analyses of the experimental data show that the algorithm separates the foreground target from the dynamic background more quickly, reduces adhesion between the foreground object and the background, and retains the structural information of the foreground object in the original image more effectively. Future work will continue to address deficiencies in the overall model and in the algorithm details. Under abnormal lighting, shadow suppression can be combined with the method to make it more robust, and its performance and effectiveness can be improved in more complex settings such as video from moving drones, providing data support for feature extraction and abnormal behavior analysis.

# Key words

intelligent traffic detection; sparse low rank; frame difference Euclidean distance method; parallel identification of foreground seeds; seed growth; region rule filling

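# 0 Introduction

1) Fast extraction and verification of salient foreground targets. Based on the frame-difference Euclidean distance, a salient foreground target verification algorithm is designed to quickly extract the numbers of frames containing salient foreground targets, and the results are compared with manual verification.

4) The results are compared with those of the noncvx-RPCA, MoG-RPCA, and GreGoDec algorithms, the Gaussian mixture background subtraction algorithm (GM), and the LK optical flow algorithm, and are verified from three perspectives: application effect, accuracy of foreground target localization, and completeness of foreground target detection.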

# 1.1 Fast key-frame extraction based on frame-difference Euclidean distance

 $d = \sqrt {\sum\limits_{t = 1}^G {{{\left[ {\left( {{x_{i + 2}} - {x_{i + 1}}} \right) - \left( {{x_{i + 1}} - {x_i}} \right)} \right]}^2}} }$ (1)

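1) A shot containing $N$ frames has $\left( {N - 2} \right)$ frame-difference Euclidean distances in total; compute the frame-difference Euclidean distance frame by frame;

2) Find the extrema of these $\left( {N - 2} \right)$ frame-difference Euclidean distances and the function value at each extremum point;

3) Compute the mean of these function values;

4) Select the extremum points whose function values exceed the mean; the corresponding frames are the key frames to be extracted.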
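The four steps above can be sketched as follows. This is a minimal sketch, assuming the frames are given as a NumPy array of grayscale images, that the sum in Eq. (1) runs over all pixels of a frame, and that an extremum is a strict local maximum or minimum of the distance sequence. The function names and the attribution of distance $i$ to the middle frame $i + 1$ are illustrative choices, not taken from the paper.

```python
import numpy as np


def frame_diff_distances(frames):
    """Eq. (1): Euclidean norm of the second-order frame difference.

    frames: array of shape (N, H, W) with N >= 3; returns N - 2 distances.
    """
    x = np.asarray(frames, dtype=np.float64)
    second_diff = (x[2:] - x[1:-1]) - (x[1:-1] - x[:-2])
    return np.sqrt((second_diff ** 2).reshape(len(x) - 2, -1).sum(axis=1))


def extrema_above_mean(d):
    """Steps 2-4: indices of local extrema whose value exceeds the extrema mean."""
    d = np.asarray(d, dtype=np.float64)
    inner = d[1:-1]
    is_max = (inner > d[:-2]) & (inner > d[2:])
    is_min = (inner < d[:-2]) & (inner < d[2:])
    extrema = np.flatnonzero(is_max | is_min) + 1  # shift back to indices of d
    if extrema.size == 0:
        return []
    mean_value = d[extrema].mean()
    return extrema[d[extrema] > mean_value].tolist()


def key_frames(frames):
    """Assumption: distance i is attributed to the middle frame i + 1."""
    return [i + 1 for i in extrema_above_mean(frame_diff_distances(frames))]
```

Because the mean is taken over maxima and minima together, a sequence with a single extremum selects nothing under the strict "greater than the mean" rule; whether the original implementation handles that case differently is not stated in the text.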

# 1.2 Comparison and verification against manually labeled data

Table 1 Video frame labels with significant foreground targets (manual acquisition)

| Video | Frame numbers with salient targets |
| --- | --- |
| Campus | 85, 200-230, 306-523, 600, 643-683, 693-712, 740-906, 1006-1036, 1264, 1328-1376, 1377-1407 |
| Curtain | 967, 1756-1906, 2126, 2170-2317, 2642, 2767-2933 |
| Escalator | 1-180, 201-2399, 2415, 2539, 2754, 2778-3417 |
| Fountain | 141, 153-213, 259, 335, 408-523 |
| Hall | 0-335, 350-512, 578, 602-749, 818-849, 850-1056, 1138, 1151-1214, 1246, 1277-1544, 1557-3534 |
| Lobby | 154-198, 350-512, 578, 602-749, 818-849, 850-1056, 1138, 1155-1214, 1246, 1277-1544, 1557-3534 |
| Office | 197, 372, 501, 582-2040, 2080 |
| Overpass | 544-665, 968, 1551, 1881, 2098, 2335-2956 |

# 2 Establishment and solution of the sparse low-rank model

 $\begin{array}{*{20}{c}} {\arg \mathop {\min }\limits_L {{\left\| \mathit{\boldsymbol{L}} \right\|}_ * } + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{M}} - \mathit{\boldsymbol{L}} - \mathit{\boldsymbol{S}} + {\mu ^{ - 1}}\mathit{\boldsymbol{Y}}} \right\|_{\rm{F}}^2 = }\\ {\mathit{\Lambda }_\mu ^{ - 1}\left( {\mathit{\boldsymbol{M}} - \mathit{\boldsymbol{S}} + {\mu ^{ - 1}}\mathit{\boldsymbol{Y}}} \right)} \end{array}$
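In augmented-Lagrangian solvers for RPCA, the operator on the right-hand side of this update is usually singular value thresholding, which shrinks every singular value of its argument by $\mu^{-1}$ and clips at zero. The sketch below assumes that reading of $\mathit{\Lambda}_{\mu^{-1}}$; the function names are hypothetical, and the updates of $\mathit{\boldsymbol{S}}$ and $\mathit{\boldsymbol{Y}}$ are not shown.

```python
import numpy as np


def singular_value_threshold(X, tau):
    """Shrink the singular values of X by tau and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt


def update_low_rank(M, S, Y, mu):
    """L-update: minimizer of ||L||_* + (mu / 2) * ||M - L - S + Y / mu||_F^2."""
    return singular_value_threshold(M - S + Y / mu, 1.0 / mu)
```

Singular values below $\mu^{-1}$ are set to zero, which is what lowers the rank of $\mathit{\boldsymbol{L}}$ from one iteration to the next.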

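1) Compute the mean gray values ${{\bar m}_{v1}}$ and ${{\bar m}_{v2}}$ of the $r$ pixels on each side of the center pixel:

 $\left\{ \begin{array}{l} {{\bar m}_{v1}} = \frac{1}{r}\sum\limits_{u = - r}^{ - 1} {I\left( u \right)} \\ {{\bar m}_{v2}} = \frac{1}{r}\sum\limits_{u = 1}^{r} {I\left( u \right)} \end{array} \right.$ (8)

where $v = 1, 2, 3, 4$ denote the four directions 0°, 45°, 90°, and 135°, and $I\left( u \right)$ is the gray value of the $u$-th pixel from the center pixel in the symmetry detection template, with $u \in \left[ { - r, - 1} \right]$ and $u \in \left[ {1, r} \right]$.

2) Compute the gray-level change in each direction: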

 ${d_v} = \min \left\{ {{{\bar m}_{v1}} - {g_v},{{\bar m}_{v2}} - {g_v}} \right\}$ (9)

 ${d_{\max }} = \max \left\{ {{d_v}} \right\},{d_{\min }} = \min \left\{ {{d_v}} \right\}$ (10)

4) Decide whether the gray-level change of the pixel is significant:

 ${d_{\max }} \ge t,{d_{\max }} - {d_{\min }} \ge s$ (11)
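Eqs. (8) to (11) can be combined into a single per-pixel test, sketched below. It assumes that ${g_v}$ is the gray value of the center pixel in every direction, that $t$ and $s$ are user-chosen thresholds, and that the pixel lies at least $r$ pixels from the image border; the function name `is_seed` and the direction table are illustrative only.

```python
import numpy as np

# Unit steps (dy, dx) for v = 1..4, i.e. 0°, 45°, 90°, 135° (image rows grow downward).
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def is_seed(img, y, x, r, t, s):
    """Return True if pixel (y, x) passes the significance test of Eq. (11)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    if not (r <= y < h - r and r <= x < w - r):
        raise ValueError("pixel must lie at least r pixels from the border")
    g = img[y, x]  # assumption: g_v is the center gray value for every v
    d = []
    for dy, dx in DIRECTIONS:
        # Eq. (8): mean gray value of the r pixels on each side of the center
        m1 = np.mean([img[y - k * dy, x - k * dx] for k in range(1, r + 1)])
        m2 = np.mean([img[y + k * dy, x + k * dx] for k in range(1, r + 1)])
        d.append(min(m1 - g, m2 - g))  # Eq. (9)
    d_max, d_min = max(d), min(d)  # Eq. (10)
    return bool(d_max >= t and d_max - d_min >= s)  # Eq. (11)
```

Under these assumptions a pixel on a thin dark line passes the test, because it is darker than both sides in most directions but not along the line itself, while an isolated dark dot fails the second condition since its change is the same in all four directions.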

# 3.2.2 Foreground target seed information fusion and sliding denoising

 ${\mathit{\boldsymbol{S}}_{\left( {0,0} \right)}} = \sum\limits_{i = 1}^5 {{\mathit{\boldsymbol{S}}_i}}$ (13)

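# 3.2.3 Rule-based filling of foreground target regions

1) Criterion for distinguishing regular and irregular foreground targets. Building on the foreground targets in existing video datasets, this paper uses the aspect ratio $R$ of the minimum bounding rectangle, the linear regression coefficient $r$ of the target contour vertices, and the inter-frame displacement $D$ of the target as criterion functions to distinguish regular from irregular foreground targets. For a foreground target region ${\mathit{\boldsymbol{S}}_i}$ in a given frame, the classification rules are as follows:

(1) If $D \ge {D_0}$, the inter-frame displacement of the target is large and ${\mathit{\boldsymbol{S}}_i}$ is a regular foreground target; otherwise, go to step (2);

(2) If $R < {R_0}$ and $|r| < {r_0}$, then ${\mathit{\boldsymbol{S}}_i}$ is a regular foreground target; otherwise, ${\mathit{\boldsymbol{S}}_i}$ is an irregular foreground target.

2) For irregular foreground targets (such as pedestrians and animals):

(1) Search vertically from the foreground target region ${\mathit{\boldsymbol{S}}_i}$ for adjacent foreground target regions. If fault regions ${{\mathit{\boldsymbol{S'}}}_{i1}}, {{\mathit{\boldsymbol{S'}}}_{i2}}, \ldots {{\mathit{\boldsymbol{S'}}}_{in}}$ exist, go to step (2); otherwise, stop.

(2) For the fault regions found, match the heads and tails of the pixel blocks outside the regions and grow them vertically to fill the faults.

3) For regular foreground targets (such as cars and ships):

(1) Starting from the left end of the first row of the foreground target region ${\mathit{\boldsymbol{S}}_r}$, scan along the $x$-axis and connect the foreground target seeds until the right edge of the image is reached;

(2) Move to the next row and scan and connect from left to right as in step (1);

(3) Repeat steps (1) and (2) until every row has been scanned and connected.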
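The classification rule in 1) and the row-by-row connection in 3) can be sketched as below. Two points are assumptions rather than statements from the text: the two conditions in rule (2) are treated as a conjunction, and "connecting" seeds in a row is interpreted as filling gaps of at most `max_gap` background pixels between neighboring seeds, where `max_gap` is a hypothetical parameter. The vertical growth for irregular targets in 2) is not covered.

```python
import numpy as np


def is_regular_target(D, R, r, D0, R0, r0):
    """Rules (1) and (2): True for a regular target, False for an irregular one."""
    if D >= D0:
        return True
    return R < R0 and abs(r) < r0  # assumption: both conditions must hold


def connect_rows(mask, max_gap):
    """Scan each row left to right and fill short gaps between seeds.

    mask: 2-D array, nonzero where a foreground seed is present.
    max_gap: hypothetical limit on the number of background pixels bridged.
    """
    out = np.asarray(mask).astype(bool)  # astype returns a copy
    for row in out:  # each row is a view into out
        cols = np.flatnonzero(row)
        for a, b in zip(cols[:-1], cols[1:]):
            if 1 < b - a <= max_gap + 1:
                row[a:b] = True
    return out
```

Bounding the gap keeps separate targets in the same row from being merged; a variant that fills everything between the first and last seed of a row would match the text equally well.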

# 4.2 Accuracy of foreground target localization

 $\left\{ \begin{array}{l} {F_{{\rm{FNR}}}} = \frac{{FN}}{P} \times 100\% \\ {F_{{\rm{FPR}}}} = \frac{{FP}}{{TP + FP}} \times 100\% \\ {F_{{\rm{PPR}}}} = \frac{{TP + TN}}{{P + N}} \times 100\% \end{array} \right.$ (14)
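The three rates in Eq. (14) follow directly from the confusion counts. The sketch below assumes the usual convention $P = TP + FN$ and $N = TN + FP$, which the text does not state explicitly, and the function name is illustrative. It works with counts of whatever unit the evaluation uses and does not guard against empty denominators.

```python
def localization_rates(tp, tn, fp, fn):
    """Eq. (14): rates in percent from confusion counts."""
    p = tp + fn  # assumption: P is the number of actual positives
    n = tn + fp  # assumption: N is the number of actual negatives
    return {
        "FNR": 100.0 * fn / p,
        "FPR": 100.0 * fp / (tp + fp),
        "PPR": 100.0 * (tp + tn) / (p + n),
    }
```

Note that ${F_{{\rm{FPR}}}}$ as defined here divides by the number of detections $TP + FP$, so it measures the share of detections that are wrong rather than the share of negatives that are misclassified.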

# 4.3 Completeness of foreground target detection

 ${F_{\rm{F}}} = \frac{{2PR}}{{P + R}}$ (15)
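As a quick check of Eq. (15), with $P$ and $R$ read as precision and recall, the average values reported in the abstract for the proposed method (90.1% and 88.7%) can be plugged in. The function name is illustrative.

```python
def f_measure(precision, recall):
    """Eq. (15): harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


print(round(f_measure(90.1, 88.7), 1))  # 89.4
```

The result agrees with the reported average F value of 89.4%.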

Table 2 Target integrity assessment results

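| Algorithm | Precision/% | Recall/% | F value/% |
| --- | --- | --- | --- |
| Proposed | **90.1** | **88.7** | **89.4** |
| noncvx-RPCA | 72.7 | 81.4 | 76.8 |
| MoG-RPCA | 78.8 | 84.2 | 81.4 |
| GreGoDec | 80.9 | 85.4 | 83.1 |
| MG | 79.8 | 86.2 | 82.9 |
| LK | 75.6 | 80.5 | 78 |

Note: bold marks the best result in each column.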

Table 3 Comparison of average running time

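| Algorithm | 160×128 pixels (DB1, DB2), time/s | 320×240 pixels (DB3, DB4), time/s | 960×576 pixels (HB1, HB2), time/s |
| --- | --- | --- | --- |
| Proposed | 0.421 | 1.766 | 2.934 |
| noncvx-RPCA | 0.017 | **0.051** | 3.573 |
| MoG-RPCA | 0.455 | 2.092 | 29.086 |
| GreGoDec | **0.013** | 0.059 | 2.358 |
| MG | 0.282 | 0.311 | 0.891 |
| LK | 0.265 | 0.298 | **0.797** |

Note: bold marks the best result in each column.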

