Published: 2019-01-16 | DOI: 10.11834/jig.180313 | 2019, Volume 24, Number 1 | Image Understanding and Computer Vision

Received: 2018-05-10; revised: 2018-07-30. Supported by: National Natural Science Foundation of China (71671127). First author: Lang Hong, born in 1994, male, Ph.D. candidate; research interests: intelligent transportation systems, traffic image data detection, traffic safety. E-mail: hl_tongji@126.com. Ding Shuo, male, Ph.D. candidate; research interests: intelligent transportation systems, traffic safety. E-mail: shuoding@tongji.edu.cn. Ma Xiaoli, female, M.S. candidate; research interests: risk management, traffic safety. E-mail: 1632403@tongji.edu.cn. CLC number: TP391.41; Document code: A; Article ID: 1006-8961(2019)01-0050-14

# Traffic video significance foreground target extraction in complex scenes

Lang Hong, Ding Shuo, Lu Jian, Ma Xiaoli
The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China
Supported by: National Natural Science Foundation of China (71671127)

# Abstract

Objective In urban traffic detection, the wide application of intelligent video surveillance has drawn research interest in using artificial intelligence and advanced computer vision technology to retrieve and recognize foreground objects in video for further analysis, such as feature extraction and abnormal behavior analysis. In complex environments, however, the discontinuity of the dynamic background causes the loss of part of the foreground target image information, false detection, and misjudgment. Constructing an effective, high-performance extractor raises two core issues. The first is detection speed and efficiency. Because video data are large, if frames that contain no foreground object can be identified and eliminated in advance, so that only images with a salient foreground target are processed, detection efficiency improves greatly. The second is object integrity in complex environments: effectively extracting the foreground part of a video sequence is key to the reliability of subsequent algorithms.

Method This paper proposes a robust principal component analysis (RPCA) optimization method. The classical RPCA detection method uses the l0-norm to determine independently whether each pixel belongs to a moving target, which is not conducive to eliminating the unstructured sparse components caused by noise and random background disturbance. This paper aims to maintain the robustness of the algorithm in complex environments and to optimize the initial RPCA-filtered image. To quickly screen and track the foreground target, a fast extraction algorithm for salient target frame numbers is designed based on the frame-difference Euclidean distance method, which determines the detection range in the key-frame neighborhood. Through the establishment and solution of a sparse low-rank model, and based on the initially filtered foreground target image, foreground target seeds are recognized in parallel to remove the dynamic background from the foreground target image. Moreover, as observed in several mask images after gray-value inversion, foreground target pixels have small gray values and strong directionality. The parallel recognition and optimized connection method for foreground target seeds is therefore designed as follows: 1) gray-pixel seed recognition: the source image is gray-value inverted and verified by gray-scale and symmetry detection, identifying gray pixels as foreground or non-foreground target sub-blocks; 2) optimized connection of gray pixels: foreground target seeds are connected according to gray value and directional similarity, followed by fusion and multi-template denoising; 3) seed filling of foreground targets to enhance connectivity and make the targets more complete. The foreground objects in the mask image are also classified into regular and irregular classes. For the fracture separation of irregular targets such as pedestrians and animals, a vertical seed-growth algorithm is designed within the target region. For regular foreground targets such as cars and ships, the foreground seeds within the region are connected vertically and horizontally to remove holes and the impact of missing structural information.

Result The foreground target extraction remains highly robust in complex environments with challenging interference factors. In four groups of classic videos from the database and two videos of the Shanxi Taichang Expressway, the dynamic background includes flowing water, swaying leaves, slight camera jitter, and changing light and shadow. The experimental results were analyzed from three perspectives: application effect, accuracy of foreground target localization, and completeness of foreground target detection. The salient target extraction algorithm achieves an average precision of 90.1%, an average recall of 88.7%, and an average F-value of 89.4%, all superior to other similar algorithms. By comparison, the complex background causes large noise disturbance for the Gaussian mixture model and the optical flow algorithm. The Gaussian mixture model uses a morphological algorithm to remove noise and fill holes, which makes the detected foreground targets more adhesive, and its detection effect varies greatly across shadow areas. The optical flow algorithm is sensitive to light: changes in illumination are mistaken for optical flow, making it unsuitable under strict environmental requirements.

Conclusion By quickly locating the salient foreground, this paper proposes a parallel seed identification and optimized connection algorithm for the RPCA initially screened image. Qualitative and quantitative analyses of the experimental data show that the algorithm separates the foreground target from the dynamic background more quickly, reduces the adhesion between the foreground object and the background, and retains the structural information of the foreground object in the original image more effectively. In future studies, deficiencies in the overall model and algorithm details will be continuously optimized: abnormal illumination can be handled by combining shadow suppression for greater robustness, and the performance of the algorithm can be improved in more complex settings such as drone mobile video, providing data support for feature extraction and abnormal behavior analysis.

# Key words

intelligent traffic detection; sparse low-rank; frame-difference Euclidean distance method; parallel identification of foreground seeds; seed growth; region rule filling

# 0 Introduction

1) Fast extraction and verification of salient foreground targets. A salient-foreground-target verification algorithm based on the frame-difference Euclidean distance is designed to quickly extract the frame numbers of salient foreground targets, and the results are compared against manual verification.

4) The experimental results are compared and verified against the noncvx-RPCA, MoG-RPCA, and GreGoDec algorithms, the Gaussian mixture background subtraction algorithm (GM), and the LK optical flow algorithm, from three perspectives: application effect, accuracy of foreground target localization, and completeness of foreground target detection.

# 1.1 Fast key-frame extraction based on the frame-difference Euclidean distance

 $d_i = \sqrt {\sum\limits_{t = 1}^G {{{\left[ {\left( {{x_{i + 2,t}} - {x_{i + 1,t}}} \right) - \left( {{x_{i + 1,t}} - {x_{i,t}}} \right)} \right]}^2}} }$ (1)

1) In a shot containing $N$ frames there are $\left( {N - 2} \right)$ frame-difference Euclidean distances in total; compute the frame-difference Euclidean distance of each frame in turn;

2) Find the extrema of these $\left( {N - 2} \right)$ frame-difference Euclidean distances and the function values at each extremum point;

3) Compute the mean of these function values;

4) Take the extremum points whose function values are greater than the mean; the corresponding frames are the key frames to be selected.
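The steps above can be sketched as a minimal NumPy routine, assuming grayscale frames stacked as an `(N, H, W)` array; the function names and the strict greater-than test in step 4) are illustrative choices, not the paper's implementation:

```python
import numpy as np

def frame_diff_distances(frames):
    """Eq. (1): Euclidean norm of the second-order frame difference,
    one value d_i for each of the (N - 2) interior frame triples."""
    f = frames.astype(np.float64)
    second_diff = (f[2:] - f[1:-1]) - (f[1:-1] - f[:-2])
    return np.sqrt((second_diff ** 2).sum(axis=(1, 2)))

def key_frame_indices(frames):
    """Steps 1)-4): keep the local maxima of d whose value
    exceeds the mean taken over all local maxima."""
    d = frame_diff_distances(frames)
    peaks = [i for i in range(1, len(d) - 1) if d[i - 1] < d[i] > d[i + 1]]
    if not peaks:
        return []
    mean_val = np.mean([d[i] for i in peaks])
    return [i for i in peaks if d[i] > mean_val]
```

Here index `i` in the returned list refers to the first frame of the triple `(i, i+1, i+2)` over which $d_i$ is computed.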

# 1.2 Comparison and verification against manually labeled data

Table 1 Video frame labels with significant foreground targets (manual acquisition)

| Video | Salient target frame numbers |
| :--- | :--- |
| Campus | 85, 200–230, 306–523, 600, 643–683, 693–712, 740–906, 1006–1036, 1264, 1328–1376, 1377–1407 |
| Curtain | 967, 1756–1906, 2126, 2170–2317, 2642, 2767–2933 |
| Escalator | 1–180, 201–2399, 2415, 2539, 2754, 2778–3417 |
| Fountain | 141, 153–213, 259, 335, 408–523 |
| Hall | 0–335, 350–512, 578, 602–749, 818–849, 850–1056, 1138, 1151–1214, 1246, 1277–1544, 1557–3534 |
| Lobby | 154–198, 350–512, 578, 602–749, 818–849, 850–1056, 1138, 1155–1214, 1246, 1277–1544, 1557–3534 |
| Office | 197, 372, 501, 582–2040, 2080 |
| Overpass | 544–665, 968, 1551, 1881, 2098, 2335–2956 |

# 2 Establishment and solution of the sparse low-rank model

 $\mathop {\arg \min }\limits_\mathit{\boldsymbol{L}} {\left\| \mathit{\boldsymbol{L}} \right\|_ * } + \frac{\mu }{2}\left\| {\mathit{\boldsymbol{M}} - \mathit{\boldsymbol{L}} - \mathit{\boldsymbol{S}} + {\mu ^{ - 1}}\mathit{\boldsymbol{Y}}} \right\|_{\rm{F}}^2 = \mathit{\Lambda }_{{\mu ^{ - 1}}}\left( {\mathit{\boldsymbol{M}} - \mathit{\boldsymbol{S}} + {\mu ^{ - 1}}\mathit{\boldsymbol{Y}}} \right)$
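In implementations of this update, $\mathit{\Lambda}_{\mu^{-1}}$ is the singular value thresholding operator. A minimal NumPy sketch of that operator and of the resulting closed-form $\boldsymbol{L}$-update (function names are illustrative):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: soft-threshold the singular
    values of X by tau while keeping the singular vectors."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def update_L(M, S, Y, mu):
    """Closed-form minimizer over L of
    ||L||_* + (mu/2) * ||M - L - S + Y/mu||_F^2."""
    return svt(M - S + Y / mu, 1.0 / mu)
```

The threshold equals $\mu^{-1}$ because the nuclear-norm term has unit weight while the quadratic term is scaled by $\mu/2$.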

1) Compute the mean gray values ${{\bar m}_{v1}}$ and ${{\bar m}_{v2}}$ of the $r$ pixels on either side of the center pixel:

 $\left\{ \begin{array}{l} {{\bar m}_{v1}} = \frac{1}{r}\sum\limits_{u = - r}^{ - 1} {I\left( u \right)} \\ {{\bar m}_{v2}} = \frac{1}{r}\sum\limits_{u = 1}^{r} {I\left( u \right)} \end{array} \right.$ (8)

where $v = 1, 2, 3, 4$ denote the 0°, 45°, 90°, and 135° directions respectively, and $I\left( u \right)$ is the gray value of the $u$-th pixel from the center pixel in the symmetry-detection template, $u \in \left[ { - r, - 1} \right]$ or $u \in \left[ {1, r} \right]$.

2) Compute the gray-level change in each direction, i.e.

 ${d_v} = \min \left\{ {{{\bar m}_{v1}} - {g_v},{{\bar m}_{v2}} - {g_v}} \right\}$ (9)

3) Take the maximum and minimum gray-level changes over the four directions, i.e.

 ${d_{\max }} = \max \left\{ {{d_v}} \right\},{d_{\min }} = \min \left\{ {{d_v}} \right\}$ (10)

4) Determine whether the gray-level change of the pixel is significant, i.e.

 ${d_{\max }} \ge t,{d_{\max }} - {d_{\min }} \ge s$ (11)
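A minimal sketch of this gray-pixel seed check (Eqs. (8)–(11)) for a single pixel; the direction offsets and the threshold values `t` and `s` are illustrative assumptions rather than the paper's tuned parameters:

```python
import numpy as np

# unit offsets (dy, dx) for the four template directions
# v = 1..4: 0 deg, 45 deg, 90 deg, 135 deg
DIRS = ((0, 1), (-1, 1), (-1, 0), (-1, -1))

def is_seed_pixel(img, y, x, r=2, t=10.0, s=5.0):
    """Check whether pixel (y, x) is a foreground target seed:
    being darker than both sides in every direction is not enough,
    the change must also be strongly directional (Eq. (11))."""
    g = float(img[y, x])                              # center gray value
    d = []
    for dy, dx in DIRS:
        side1 = [img[y - k * dy, x - k * dx] for k in range(1, r + 1)]
        side2 = [img[y + k * dy, x + k * dx] for k in range(1, r + 1)]
        m1, m2 = np.mean(side1), np.mean(side2)       # Eq. (8)
        d.append(min(m1 - g, m2 - g))                 # Eq. (9)
    d_max, d_min = max(d), min(d)                     # Eq. (10)
    return d_max >= t and (d_max - d_min) >= s        # Eq. (11)
```

For a dark horizontal line the 90° template sees a large change while the 0° template sees none, so the pixel passes; an isotropic dark blob fails the second condition of Eq. (11).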

# 3.2.2 Foreground target seed information fusion and sliding denoising

 ${\mathit{\boldsymbol{S}}_{\left( {0,0} \right)}} = \sum\limits_{i = 1}^5 {{\mathit{\boldsymbol{S}}_i}}$ (13)

# 3.2.3 Rule-based region filling of foreground targets

1) Criterion for distinguishing regular and irregular foreground targets. Based on the foreground targets of the existing video datasets, this paper distinguishes regular and irregular foreground target features using three criterion functions: the aspect ratio $R$ of the minimum bounding rectangle, the linear regression coefficient $r$ of the target contour vertices, and the frame-difference movement distance $D$ of the target. For a foreground target region ${\mathit{\boldsymbol{S}}_i}$ in a given frame, the classification criterion is as follows:

(1) If $D \ge {D_0}$, the target's frame-difference movement distance is large and ${\mathit{\boldsymbol{S}}_i}$ is a regular foreground target; otherwise, go to step (2);

(2) If $R < {R_0}$ and $|r| < {r_0}$, then ${\mathit{\boldsymbol{S}}_i}$ is a regular foreground target; otherwise ${\mathit{\boldsymbol{S}}_i}$ is an irregular foreground target.

2) For irregular foreground targets (e.g., pedestrians, animals):

(1) Search vertically from the foreground target region ${\mathit{\boldsymbol{S}}_i}$ for adjacent foreground target regions. If fracture regions ${{\mathit{\boldsymbol{S'}}}_{i1}}, {{\mathit{\boldsymbol{S'}}}_{i2}}, \ldots {{\mathit{\boldsymbol{S'}}}_{in}}$ exist, go to step (2); otherwise, stop.

(2) For the fracture regions, match the pixel blocks outside the regions head-to-tail and grow them vertically to fill the fractures.

3) For regular foreground targets (e.g., cars, ships):

(1) Scan from the left end of the first row of the foreground target region ${\mathit{\boldsymbol{S}}_r}$, connecting foreground target seeds along the $x$-axis until the right end of the image is reached;

(2) Move to the next row and repeat the left-to-right scan-and-connect of step (1).

(3) Repeat steps (1) and (2) until every row has been scanned and connected.
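The classification criterion in step 1) can be sketched as follows; the thresholds `R0`, `r0`, `D0` are placeholders (the paper's calibrated values are not reproduced here), the bounding rectangle is simplified to an axis-aligned one, and the contour-vertex regression coefficient is computed as a correlation coefficient:

```python
import numpy as np

def classify_target(contour, D, R0=3.0, r0=0.5, D0=8.0):
    """Regular vs. irregular foreground target, from the bounding-box
    aspect ratio R, the contour-vertex linear-regression coefficient r,
    and the frame-difference movement distance D."""
    xs = contour[:, 0].astype(float)
    ys = contour[:, 1].astype(float)
    # (1) a large frame-to-frame movement distance implies a regular target
    if D >= D0:
        return "regular"
    # aspect ratio R of the (axis-aligned) bounding rectangle
    w = xs.max() - xs.min() + 1
    h = ys.max() - ys.min() + 1
    R = max(w, h) / min(w, h)
    # regression (correlation) coefficient r of the contour vertices
    r = np.corrcoef(xs, ys)[0, 1]
    # (2) compact, weakly linear contours are regular targets
    return "regular" if R < R0 and abs(r) < r0 else "irregular"
```

A compact square contour classifies as regular, while an elongated diagonal contour (strongly linear vertices, as for a walking pedestrian's silhouette edge) classifies as irregular unless its movement distance is large.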

# 4.2 Accuracy of foreground target localization

 $\left\{ \begin{array}{l} {F_{{\rm{FNR}}}} = \frac{{FN}}{P} \times 100\% \\ {F_{{\rm{FPR}}}} = \frac{{FP}}{{TP + FP}} \times 100\% \\ {F_{{\rm{PPR}}}} = \frac{{TP + TN}}{{P + N}} \times 100\% \end{array} \right.$ (14)
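Eq. (14) can be transcribed directly, assuming the usual reading that $P = TP + FN$ and $N = FP + TN$ are the ground-truth positive and negative pixel counts (the section's definitions of $P$ and $N$ are not in this excerpt):

```python
def localization_metrics(TP, FP, FN, TN):
    """Eq. (14): miss rate F_FNR, false-positive rate F_FPR
    (relative to all detected positives), and overall pixel
    accuracy F_PPR, all as percentages."""
    P = TP + FN            # ground-truth positive pixels
    N = FP + TN            # ground-truth negative pixels
    F_FNR = FN / P * 100
    F_FPR = FP / (TP + FP) * 100
    F_PPR = (TP + TN) / (P + N) * 100
    return F_FNR, F_FPR, F_PPR
```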

# 4.3 Completeness of foreground target detection

 ${F_{\rm{F}}} = \frac{{2PR}}{{P + R}}$ (15)
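Eq. (15) is the harmonic mean of precision $P$ and recall $R$; as a quick sanity check, the paper's reported average precision and recall reproduce the reported average F-value:

```python
def f_value(precision, recall):
    """Eq. (15): F = 2PR / (P + R), with P and R in percent."""
    return 2 * precision * recall / (precision + recall)

# reported averages P = 90.1 %, R = 88.7 % give F of about 89.4 %
```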

Table 2 Target integrity assessment results

| Algorithm | Precision/% | Recall/% | F-value/% |
| :--- | :---: | :---: | :---: |
| Proposed | **90.1** | **88.7** | **89.4** |
| noncvx-RPCA | 72.7 | 81.4 | 76.8 |
| MoG-RPCA | 78.8 | 84.2 | 81.4 |
| GreGoDec | 80.9 | 85.4 | 83.1 |
| MG | 79.8 | 86.2 | 82.9 |
| LK | 75.6 | 80.5 | 78.0 |

Note: bold indicates the best result in each column.

Table 3 Comparison of average running time

| Algorithm | 160×128 pixels (DB1, DB2)/s | 320×240 pixels (DB3, DB4)/s | 960×576 pixels (HB1, HB2)/s |
| :--- | :---: | :---: | :---: |
| Proposed | 0.421 | 1.766 | 2.934 |
| noncvx-RPCA | 0.017 | **0.051** | 3.573 |
| MoG-RPCA | 0.455 | 2.092 | 29.086 |
| GreGoDec | **0.013** | 0.059 | 2.358 |
| MG | 0.282 | 0.311 | 0.891 |
| LK | 0.265 | 0.298 | **0.797** |

Note: bold indicates the best result in each column.

# References

• [1] Guo D J. Research on foreground extraction and targets detecting and tracking for video surveillance[D]. Hangzhou: Zhejiang University, 2016. [郭达洁.监控视频中的前景提取和目标检测跟踪算法研究[D].杭州: 浙江大学, 2016.] http://cdmd.cnki.com.cn/Article/CDMD-10335-1016073817.htm
• [2] Li Q. Research on anomaly detection algorithm in video surveillance[D]. Hefei: University of Science and Technology of China, 2017. [李强.监控视频异常行为检测算法研究[D].合肥: 中国科学技术大学, 2017.]
• [3] Araki S, Matsuoka T, Yokoya N, et al. Real-time tracking of multiple moving object contours in a moving camera image sequence[J]. IEICE Transactions on Information and Systems, 2000, E83-D(7): 1583–1591.
• [4] Hsieh J W. Fast stitching algorithm for moving object detection and mosaic construction[J]. Image and Vision Computing, 2004, 22(4): 291–306. [DOI:10.1016/j.imavis.2003.09.018]
• [5] Tissainayagam P, Suter D. Object tracking in image sequences using point features[J]. Pattern Recognition, 2005, 38(1): 105–113. [DOI:10.1016/j.patcog.2004.05.011]
• [6] Karmann K P. Moving object recognition using an adaptive background memory[M]//Cappellini V. Time-Varying Image Processing and Moving Object Recognition. Amsterdam, the Netherlands: Elsevier, 1990: 289-307.
• [7] Kilger M. A shadow handler in a video-based real-time traffic monitoring system[C]//Proceedings of IEEE Workshop on Applications of Computer Vision. Palm Springs, CA, USA: IEEE, 1992: 11-18.[DOI: 10.1109/ACV.1992.240332]
• [8] Arras K O, Lau B, Grzonka S, et al. Range-based people detection and tracking for socially enabled service robots[M]//Prassler E, Zöllner M, Bischoff R, et al. Towards Service Robots for Everyday Environments. Berlin, Heidelberg: Springer, 2012, 76: 235-280.[DOI: 10.1007/978-3-642-25116-0_18]
• [9] Li B C, Ding K. An improved algorithm of Gaussian mixture model combined with shadow suppression[J]. Computer Engineering & Science, 2016, 38(3): 556–561. [李博川, 丁轲. 结合阴影抑制的混合高斯模型改进算法[J]. 计算机工程与科学, 2016, 38(3): 556–561. ] [DOI:10.3969/j.issn.1007-130X.2016.03.024]
• [10] Stauffer C, Grimson W E L. Adaptive background mixture models for real-time tracking[C]//Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Fort Collins, CO, USA: IEEE Computer Society, 1999.
• [11] Horn B K P, Schunck B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1-3): 185–203.
• [12] Lucas B D, Kanade T. An iterative image registration technique with an application to stereo vision[C]//Proceedings of the 7th International Joint Conference on Artificial Intelligence. Vancouver, BC, Canada: Morgan Kaufmann Publishers Inc, 1981: 674-679.
• [13] Li T. Research on methods of multiple objects tracking in intelligent visual surveillance[D]. Hefei: University of Science and Technology of China, 2013. [李彤.智能视频监控下的多目标跟踪技术研究[D].合肥: 中国科学技术大学, 2013.]
• [14] Candès E J, Li X D, Ma Y, et al. Robust principal component analysis?[J]. Journal of the ACM, 2011, 58(3): #11. [DOI:10.1145/1970392.1970395]
• [15] Bao B K, Liu G C, Xu C S, et al. Inductive robust principal component analysis[J]. IEEE Transactions on Image Processing, 2012, 21(8): 3794–3800. [DOI:10.1109/TIP.2012.2192742]
• [16] Gao Z, Cheong L F, Wang Y X. Block-sparse RPCA for salient motion detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(10): 1975–1987. [DOI:10.1109/TPAMI.2014.2314663]
• [17] Yang B, Zou L. Robust foreground detection using block-based RPCA[J]. Optik-International Journal for Light and Electron Optics, 2015, 126(23): 4586–4590. [DOI:10.1016/j.ijleo.2015.08.064]
• [18] Zhou W, Sun Y B, Liu Q S, et al. l0 group sparse RPCA model and algorithm for moving object detection[J]. Acta Electronica Sinica, 2016, 44(3): 627–632. [周伟, 孙玉宝, 刘青山, 等. 运动目标检测的l0群稀疏RPCA模型及其算法[J]. 电子学报, 2016, 44(3): 627–632. ] [DOI:10.3969/j.issn.0372-2112.2016.03.020]
• [19] Tao D. Research and implementation of key frame extraction in content based video retrieval system[D]. Changchun: Jilin University, 2004. [陶丹.基于内容的视频检索系统中关键帧提取方法的研究与实现[D].长春: 吉林大学, 2004.] http://cdmd.cnki.com.cn/Article/CDMD-10183-2004101541.htm
• [20] Wen J J. Research on moving object detection and action analysis[D]. Harbin: Harbin Institute of Technology, 2015. [文嘉俊.运动目标检测及其行为分析研究[D].哈尔滨: 哈尔滨工业大学, 2015.] http://cdmd.cnki.com.cn/Article/CDMD-10213-1016739440.htm
• [21] Peng B, Jiang Y S, Chen C, et al. Automatic parallel cracking detection algorithm based on 1 mm resolution 3D pavement images[J]. Journal of Southeast University:Natural Science Edition, 2015, 45(6): 1190–1196. [彭博, 蒋阳升, 陈成, 等. 基于1mm精度路面三维图像的裂缝自动并行识别算法[J]. 东南大学学报:自然科学版, 2015, 45(6): 1190–1196. ] [DOI:10.3969/j.issn.1001-0505.2015.06.030]
• [22] Gavilán M, Balcones D, Marcos O, et al. Adaptive road crack detection system by pavement classification[J]. Sensors, 2011, 11(10): 9628–9657. [DOI:10.3390/s111009628]
• [23] Zhang D J, Li Q Q, Chen Y, et al. An efficient and reliable coarse-to-fine approach for asphalt pavement crack detection[J]. Image and Vision Computing, 2017, 57: 130–146. [DOI:10.1016/j.imavis.2016.11.018]
• [24] Kang Z, Peng C, Cheng Q. Robust PCA via nonconvex rank approximation[C]//Proceedings of 2015 IEEE International Conference on Data Mining. Atlantic City, NJ, USA: IEEE, 2015: 211-220.[DOI: 10.1109/ICDM.2015.15]
• [25] Luo Q, Han Z, Chen X A, et al. Tensor RPCA by bayesian CP factorization with complex noise[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 5029-5038.
• [26] Zhou T, Tao D. Unmixing incoherent structures of big data by randomized or greedy decomposition[EB/OL]. arXiv preprint, 2013.
• [27] Chen L, Zhang R G, Hu J, et al. Improved Gaussian mixture model and shadow elimination method[J]. Journal of Computer Applications, 2013, 33(5): 1394–1397. [DOI:10.3724/SP.J.1087.2013.01394]
• [28] Tang Z, Miao Z J. Fast background subtraction and shadow elimination using improved Gaussian mixture model[C]//Proceedings of 2007 IEEE International Workshop on Haptic, Audio and Visual Environments and Games. Ottawa, Ont., Canada: IEEE, 2007: 38-41.[DOI: 10.1109/HAVE.2007.4371583]
• [29] Matthews I, Baker S. Active appearance models revisited[J]. International Journal of Computer Vision, 2004, 60(2): 135–164. [DOI:10.1023/B:VISI.0000029666.37597.d3]
• [30] Zou Q, Cao Y, Li Q Q, et al. CrackTree:automatic crack detection from pavement images[J]. Pattern Recognition Letters, 2012, 33(3): 227–238. [DOI:10.1016/j.patrec.2011.11.004]
• [31] Li Q Q, Zou Q, Liu X L. Pavement crack classification via spatial distribution features[J]. EURASIP Journal on Advances in Signal Processing, 2011, 2011: #649675.