Published: 2018-06-16 | DOI: 10.11834/jig.170507 | 2018, Volume 23, Number 6 | Image Analysis and Recognition

Received: 2018-03-02; revised: 2018-03-05. Supported by: National Natural Science Foundation of China (61771180); Key Research and Development Program of Anhui Province (1704d0802183). First author: Qi Meibin (1969-), male, professor; received his Ph.D. in signal and information processing from Hefei University of Technology in 2007. His research interests include video coding, intelligent video surveillance, machine vision, and DSP technology. E-mail: qimeibin@163.com. CLC number: TP391. Document code: A. Article ID: 1006-8961(2018)06-0827-10

Person re-identification based on multi-feature fusion and alternating direction method of multipliers
Qi Meibin, Wang Cichun, Jiang Jianguo, Li Ji
School of Computer and Information, Hefei University of Technology, Hefei 230009, China
Supported by: National Natural Science Foundation of China(61771180); Key Research and Development Project of Anhui Province, China (1704d0802183)

# Abstract

Objective Person re-identification is a challenging problem with practical application value. It plays an important role in video surveillance systems because it can reduce the human effort required to search for a target in large volumes of video, and it has gained increasing interest in computer vision. Person re-identification algorithms have been applied in criminal investigation, where eliminating the interference of passers-by helps the police find the final suspects. However, differences in color, illumination, posture, and imaging quality, as well as the low resolution of captured frames, cause large appearance variance across cameras; thus, person re-identification remains a difficult problem. To improve its accuracy, we propose an algorithm based on multi-feature fusion and the alternating direction method of multipliers (ADMM). Method First, the original images are processed by an image enhancement algorithm to reduce the impact of illumination changes; this enhancement aims to produce an image close to human visual characteristics. Then, a non-uniform segmentation method is applied: sub-windows of 10×10 pixels with a 5-pixel overlapping step capture the local information of the pedestrian image, while a specific region mean method divides the pedestrian image into five blocks. Specifically, reflecting the different expressive power of the legs and torso, these parts are divided into three and two blocks, respectively. The second and third blocks are then pooled with the maximum operation, whereas the other blocks use the mean operation, because the second and third blocks are less affected by ambient noise.
We also extract the HSV and LAB color features of the processed images, a scale-invariant local ternary pattern (SILTP) texture feature, and a histogram of oriented gradients (HOG) shape feature. Existing person re-identification algorithms generally consider only the matching between local regions and lose the information in the gaps between blocks; combining global and local methods can effectively solve this problem. The proposed algorithm uses multi-feature fusion to combine global and local information, merging the global and local similarity measurement functions of the corresponding person into a final similarity function. Finally, the optimal distance metric matrix is updated by the alternating direction method of multipliers, and the final similarity between each pair is obtained to conduct re-identification. Result The proposed method is evaluated on four public benchmark datasets, VIPeR, CUHK01, CUHK03, and GRID, each with its own characteristics. The proposed method achieves a rank-1 rate (the rate of exactly matched pairs) of 51.5% on VIPeR, and 48.7% and 21.4% on CUHK01 and GRID, respectively. The rank-5 rate (the expected match rate within the top five) exceeds 80% on VIPeR and 70% on CUHK01. On CUHK03, the method achieves rank-1 rates of 62.40% with labeled bounding boxes and 55.05% with automatically detected bounding boxes, outperforming local maximal occurrence by 10.2% in the labeled setting and 8.8% in the detected setting. The proposed method significantly improves the recognition rate and has practical application value. Conclusion The experimental results show that the proposed method can express the image information of pedestrians effectively.
Furthermore, the effectiveness of our algorithm stems from the non-uniform segmentation and the specific region mean method, which reduce the influence of ambient noise, increase robustness to occlusion, and handle pose variation more flexibly. The updated distance metric matrix expresses the distance information between pedestrians and effectively improves the recognition rate. The method is applicable to person re-identification in most scenarios, especially static image-based re-identification in complex scenes, and maintains high recognition accuracy even under local occlusion, illumination differences, and pose or viewpoint differences.

# Key words

person re-identification; multi-feature fusion; non-uniform segmentation; HOG feature; specific region mean method; alternating direction method of multipliers

# 1.2 Similarity measurement function

$\delta(\boldsymbol{x}_i, \boldsymbol{x}_j) = \ln \frac{P_{\mathrm{S}}(\boldsymbol{x}_i, \boldsymbol{x}_j)}{P_{\mathrm{D}}(\boldsymbol{x}_i, \boldsymbol{x}_j)}$ (1)
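The log-likelihood ratio in Eq. (1) has a closed form when the difference vectors of similar and dissimilar pairs are modeled as zero-mean Gaussians, as in KISSME [4]. The sketch below is an illustration under that Gaussian assumption, not the paper's exact estimator; the toy data and variable names are mine.

```python
import numpy as np

def fit_gaussian_ratio(diff_sim, diff_dis, eps=1e-6):
    """Fit zero-mean Gaussians to similar/dissimilar pair differences.

    diff_sim, diff_dis: (n, d) arrays of differences x_i - x_j.
    Returns delta(xi, xj) = ln P_S / P_D under the fitted model.
    """
    d = diff_sim.shape[1]
    cov_s = diff_sim.T @ diff_sim / len(diff_sim) + eps * np.eye(d)
    cov_d = diff_dis.T @ diff_dis / len(diff_dis) + eps * np.eye(d)
    inv_s, inv_d = np.linalg.inv(cov_s), np.linalg.inv(cov_d)
    # ratio of the two Gaussian normalization constants
    log_z = 0.5 * (np.linalg.slogdet(cov_d)[1] - np.linalg.slogdet(cov_s)[1])

    def delta(xi, xj):
        v = xi - xj
        return 0.5 * (v @ inv_d @ v - v @ inv_s @ v) + log_z

    return delta

rng = np.random.default_rng(0)
# toy data: similar pairs differ little, dissimilar pairs differ a lot
diff_sim = 0.1 * rng.standard_normal((500, 4))
diff_dis = 1.0 * rng.standard_normal((500, 4))
delta = fit_gaussian_ratio(diff_sim, diff_dis)
close = delta(np.zeros(4), 0.05 * np.ones(4))  # a plausible "same person" pair
far = delta(np.zeros(4), np.ones(4))           # a plausible "different person" pair
```

A pair drawn from the similar-pair model scores positive (P_S dominates), while a large difference scores negative, which is the behavior Eq. (1) encodes.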

$L(\boldsymbol{m}) = \sum\limits_{n = 1}^{N} \sum\limits_{\boldsymbol{l}_i \in \boldsymbol{l}_n^+, \boldsymbol{l}_j \in \boldsymbol{l}_n^-} l_{\mathrm{triplet}}(\boldsymbol{l}_n, \boldsymbol{l}_i, \boldsymbol{l}_j)$ (13)

$l_{\mathrm{triplet}}(\boldsymbol{l}_n, \boldsymbol{l}_i, \boldsymbol{l}_j) = \left[ \delta(\boldsymbol{l}_n, \boldsymbol{l}_i) - \delta(\boldsymbol{l}_n, \boldsymbol{l}_j) + \alpha \right]_+$ (14)
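Eqs. (13)-(14) form a standard hinge-based triplet objective. The sketch below is a minimal numpy illustration, reading δ as a distance-like score so the hinge fires when a positive is not closer to the anchor than a negative by at least the margin α; the toy scores are assumptions, not values from the paper.

```python
import numpy as np

def triplet_hinge(delta_pos, delta_neg, alpha=0.2):
    """[delta(l_n, l_i) - delta(l_n, l_j) + alpha]_+  (Eq. 14)."""
    return np.maximum(0.0, delta_pos + alpha - delta_neg)

def triplet_objective(delta_pos, delta_neg, alpha=0.2):
    """Sum the hinge over all (positive, negative) pairs of one anchor (inner sum of Eq. 13)."""
    dp = np.asarray(delta_pos)[:, None]   # shape (P, 1): scores to positives
    dn = np.asarray(delta_neg)[None, :]   # shape (1, Q): scores to negatives
    return triplet_hinge(dp, dn, alpha).sum()

# one anchor with two positives and two negatives (toy distance-like scores)
loss = triplet_objective([0.3, 0.9], [1.0, 0.4], alpha=0.2)
```

Only the triplets that violate the margin contribute; here the pairs (0.3, 0.4), (0.9, 1.0), and (0.9, 0.4) are active, giving a total of 0.1 + 0.1 + 0.7 = 0.9.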

$\boldsymbol{U}_1^{k+1} = \arg\min\limits_{\boldsymbol{U}_1} g_1(\boldsymbol{U}_1) + \frac{\rho}{2} \left\| \boldsymbol{U}_1 - \left( \boldsymbol{U}_3^k - \boldsymbol{\varLambda}_1^k \right) \right\|_{\mathrm{F}}^2$ (17)

$\boldsymbol{U}_2^{k+1} = \arg\min\limits_{\boldsymbol{U}_2} g_2(\boldsymbol{U}_2) + \frac{\rho}{2} \left\| \boldsymbol{U}_2 - \left( \boldsymbol{U}_3^k - \boldsymbol{\varLambda}_2^k \right) \right\|_{\mathrm{F}}^2$ (18)

$\boldsymbol{U}_3^{k+1} = \arg\min\limits_{\boldsymbol{U}_3} g_3(\boldsymbol{U}_3) + \rho \left\| \boldsymbol{U}_3 - \frac{1}{2} \left( \boldsymbol{U}_1^{k+1} + \boldsymbol{U}_2^{k+1} + \boldsymbol{\varLambda}_1^k + \boldsymbol{\varLambda}_2^k \right) \right\|_{\mathrm{F}}^2$ (19)

$\left\{ \begin{array}{l} \boldsymbol{\varLambda}_1^{k+1} = \boldsymbol{\varLambda}_1^k + \boldsymbol{U}_1^{k+1} - \boldsymbol{U}_3^{k+1} \\ \boldsymbol{\varLambda}_2^{k+1} = \boldsymbol{\varLambda}_2^k + \boldsymbol{U}_2^{k+1} - \boldsymbol{U}_3^{k+1} \end{array} \right.$ (20)

end

$\boldsymbol{U} \leftarrow \boldsymbol{U}_3^K$
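The updates in Eqs. (17)-(20) are the scaled-form consensus ADMM iteration. As a self-contained sanity check, the sketch below runs the same update pattern on three toy quadratic objectives whose consensus minimizer is known in closed form; ρ, λ, the matrix sizes, and the objectives g₁, g₂, g₃ are illustrative assumptions, not the paper's metric-learning terms.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
rho, lam, K = 1.0, 0.5, 500

# toy objectives: g1(U) = ||U - A||_F^2, g2(U) = ||U - B||_F^2, g3(U) = lam * ||U||_F^2
U3 = np.zeros((3, 3))
L1 = np.zeros((3, 3))  # scaled dual Lambda_1
L2 = np.zeros((3, 3))  # scaled dual Lambda_2
for _ in range(K):
    # Eqs. (17)/(18): closed-form minimizers of the quadratic proximal subproblems
    U1 = (2 * A + rho * (U3 - L1)) / (2 + rho)
    U2 = (2 * B + rho * (U3 - L2)) / (2 + rho)
    # Eq. (19): closed-form minimizer of g3 plus the consensus penalty
    M = 0.5 * (U1 + U2 + L1 + L2)
    U3 = rho * M / (lam + rho)
    # Eq. (20): scaled dual updates
    L1 = L1 + U1 - U3
    L2 = L2 + U2 - U3

U_star = (A + B) / (2 + lam)  # closed-form minimizer of g1 + g2 + g3
```

At convergence the consensus variables agree and `U3` matches the closed-form minimizer `U_star`, confirming that the iteration in Eqs. (17)-(20) solves the split problem.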

# 1.4 Person re-identification algorithm

1) Input all images in the dataset and preprocess them with the Retinex algorithm.

2) Apply non-uniform segmentation to the processed images and extract the features ${\left( {{\mathit{\boldsymbol{x}}_i}, {\mathit{\boldsymbol{x}}_j}} \right)_c}$, comprising the HSV and LAB color features, the SILTP texture feature, and the HOG feature, where $i, j = 1, 2, \cdots, N$ and $c = 1, 2, 3, 4$; here $i, j$ index the person pair, $c$ indexes the feature space, and $N$ is the total number of images. Normalize the extracted feature vectors with the L2 norm; the normalized features are denoted $\left( {{\mathit{\boldsymbol{l}}_i}, {\mathit{\boldsymbol{l}}_j}} \right)$.

3) Denote the distance metric matrices for the features of each block by $\left\{ {{\mathit{\boldsymbol{M}}^{c, 1}}, {\mathit{\boldsymbol{M}}^{c, 2}}, \cdots, {\mathit{\boldsymbol{M}}^{c, r}}, \cdots, {\mathit{\boldsymbol{M}}^{c, R}}, {\mathit{\boldsymbol{M}}^{c, G}}} \right\}_{c = 1}^C$, where $R = 5$ and $C = 4$.

4) Use the iterative update optimization algorithm to obtain the optimal distance metric matrices, and compute the final similarity between each pair of pedestrians.
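Steps 2)-4) above can be sketched as follows. This is an illustrative skeleton under simplifying assumptions: random vectors stand in for the per-block histogram features, and the metric matrices $M^{c,r}$ are initialized to identity rather than learned; the block layout and dimensions are placeholders.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Step 2): L2-normalize a feature vector."""
    return x / (np.linalg.norm(x) + eps)

def pair_similarity(feats_i, feats_j, metrics):
    """Steps 3)-4): accumulate metric distances over blocks and feature spaces.

    feats_i, feats_j: dict (c, r) -> feature vector of person i / j.
    metrics:          dict (c, r) -> metric matrix M^{c,r}.
    Higher (less negative) score means more similar.
    """
    score = 0.0
    for key, M in metrics.items():
        d = feats_i[key] - feats_j[key]
        score -= d @ M @ d          # squared metric distance for this block
    return score

rng = np.random.default_rng(2)
C, R, dim = 4, 5, 16                # 4 feature spaces, 5 blocks, toy dimension
keys = [(c, r) for c in range(C) for r in range(R)]
metrics = {k: np.eye(dim) for k in keys}   # identity init for illustration only
person_a = {k: l2_normalize(rng.random(dim)) for k in keys}
person_b = {k: l2_normalize(rng.random(dim)) for k in keys}
s_self = pair_similarity(person_a, person_a, metrics)
s_cross = pair_similarity(person_a, person_b, metrics)
```

With identity metrics the score reduces to a negated sum of squared Euclidean distances; the learned matrices from step 4) would reshape each block's feature space before this accumulation.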

# 2.1 Results on the VIPeR dataset

The VIPeR dataset consists of 1,264 images of 632 pedestrian pairs, with the two images of each pair coming from different camera views. VIPeR is the most challenging dataset in person re-identification, and results on it are the most convincing; the comparative experiments of the proposed algorithm are therefore all conducted on VIPeR. In the experiments, the training set and the test set (P) each contain 316 pedestrian pairs.

Table 1 Matching rates of the proposed algorithm with and without the HOG feature

| HOG feature | Rank 1/% | Rank 5/% | Rank 10/% | Rank 20/% |
| --- | --- | --- | --- | --- |
| Without | 48.54 | 78.39 | 89.37 | 98.96 |
| With | 51.49 | 80.79 | 89.40 | 99.27 |

Table 2 Matching rates of the specific region mean method versus the local maximum method

| Region processing | Rank 1/% | Rank 5/% | Rank 10/% | Rank 20/% |
| --- | --- | --- | --- | --- |
| Local maximum | 49.91 | 80.28 | 89.87 | 99.21 |
| Ours (specific region mean) | 51.49 | 80.79 | 89.40 | 99.27 |

Table 3 Matching rates of different methods on VIPeR

| Method | Rank 1/% | Rank 5/% | Rank 10/% | Rank 20/% |
| --- | --- | --- | --- | --- |
| KISSME[4] | 19.2 | 48.8 | 64.9 | 80.2 |
| kLFDA[16] | 32.3 | 65.8 | 79.7 | 90.9 |
| Polymap[9] | 36.8 | 70.4 | 83.7 | 91.8 |
| LOMO[10] | 40.0 | 68.1 | 80.5 | 91.0 |
| Ref. [6] | 40.7 | 72.3 | 83.9 | 92.0 |
| Ref. [7] | 42.7 | 74.5 | 85.4 | 92.8 |
| Ours | 51.5 | 80.8 | 89.4 | 99.3 |

# 2.2 Results on the CUHK datasets

The images in the CUHK01 dataset were captured in a campus environment and cover 971 pedestrians with four images each; the first two and last two images of each pedestrian are consecutive frames of a video. In the experiments, all images are resized to a height of 128 pixels and a width of 48 pixels; the training set contains 485 pedestrian pairs and the test set 486 pairs. Table 4 compares the performance of the proposed algorithm with existing person re-identification algorithms.

Table 4 Matching rates of different methods on CUHK01

| Method | Rank 1/% | Rank 5/% | Rank 10/% | Rank 20/% |
| --- | --- | --- | --- | --- |
| KISSME[4] | 17.9 | 38.1 | 48.0 | 58.8 |
| kLFDA[16] | 29.1 | 55.2 | 66.4 | 77.3 |
| MFA[16] | 29.6 | 55.8 | 66.4 | 77.3 |
| Ref. [7] | 36.1 | 62.6 | 72.6 | 81.9 |
| Ref. [8] | 43.7 | 70.8 | 79.0 | 87.3 |
| Ours | 48.7 | 73.1 | 81.6 | 88.9 |

Table 5 Matching rates of different methods on CUHK03

| Method | Detected/% | Labeled/% |
| --- | --- | --- |
| KISSME[4] | 11.70 | 14.17 |
| DeepReID[2] | 19.89 | 20.65 |
| Ref. [17] | 44.96 | 54.74 |
| LOMO[10] | 46.25 | 52.20 |
| Ours | 55.05 | 62.40 |

# 2.3 Results on the GRID dataset

The GRID dataset consists of images of 1,275 people, including 250 pedestrian image pairs; the two images in each pair show the same person captured from different camera views. In addition, 775 extra images show people who do not belong to any of the 250 pairs. In the experiments, all images are resized to a height of 128 pixels and a width of 48 pixels; the training set contains 125 pedestrian pairs, and the test set contains the remaining 125 pairs plus the 775 unrelated images. The large number of unrelated distractor images lowers the overall recognition rate on GRID, but it also makes the dataset closer to reality: in criminal investigation, many passers-by likewise interfere with the police finding the final suspect.

Table 6 Matching rates of different methods on GRID

| Method | Rank 1/% | Rank 5/% | Rank 10/% | Rank 20/% |
| --- | --- | --- | --- | --- |
| Mrank-PRDC[18] | 11.1 | 26.1 | 35.8 | 46.6 |
| Mrank-RankSVM[18] | 12.2 | 27.8 | 36.3 | 46.6 |
| Polymap[9] | 16.3 | 35.8 | 46.0 | 57.6 |
| LOMO[10] | 16.6 | 33.8 | 41.8 | 52.4 |
| Ours | 21.4 | 40.3 | 49.9 | 59.3 |

# References

• [1] Yi D, Lei Z, Liao S C, et al. Deep metric learning for person re-identification[C]//Proceedings of the 22nd International Conference on Pattern Recognition. Stockholm, Sweden: IEEE, 2014: 34-39. [DOI:10.1109/ICPR.2014.16]
• [2] Li W, Zhao R, Xiao T, et al. DeepReID: deep filter pairing neural network for person re-identification[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 152-159. [DOI:10.1109/CVPR.2014.27]
• [3] Schroff F, Kalenichenko D, Philbin J. FaceNet: a unified embedding for face recognition and clustering[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 815-823. [DOI:10.1109/CVPR.2015.7298682]
• [4] Köstinger M, Hirzer M, Wohlhart P, et al. Large scale metric learning from equivalence constraints[C]//Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE, 2012: 2288-2295. [DOI:10.1109/CVPR.2012.6247939]
• [5] Chen Y, Huo Z H. Person re-identification based on multi-directional saliency metric learning[J]. Journal of Image and Graphics, 2015, 20(12): 1674-1683. [DOI:10.11834/jig.20151212]
• [6] Qi M B, Tan S S, Wang Y X, et al. Multi-feature subspace and kernel learning for person re-identification[J]. Acta Automatica Sinica, 2016, 42(2): 299-308. [DOI:10.16383/j.aas.2016.c150344]
• [7] Qi M B, Hu L F, Jiang J G, et al. Person re-identification based on multi-features fusion and independent metric learning[J]. Journal of Image and Graphics, 2016, 21(11): 1464-1472. [DOI:10.11834/jig.20161106]
• [8] Jobson D J, Rahman Z, Woodell G A. A multiscale retinex for bridging the gap between color images and the human observation of scenes[J]. IEEE Transactions on Image Processing, 1997, 6(7): 965-976. [DOI:10.1109/83.597272]
• [9] Chen D P, Yuan Z J, Hua G, et al. Similarity learning on an explicit polynomial kernel feature map for person re-identification[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 1565-1573. [DOI:10.1109/CVPR.2015.7298764]
• [10] Liao S C, Hu Y, Zhu X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 2197-2206. [DOI:10.1109/CVPR.2015.7298832]
• [11] Gabay D, Mercier B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation[J]. Computers & Mathematics with Applications, 1976, 2(1): 17–40. [DOI:10.1016/0898-1221(76)90003-1]
• [12] Wang H B, Lu C, Zhou J, et al. Orthogonal projection non-negative matrix factorization using alternating direction method of multipliers[J]. Journal of Image and Graphics, 2017, 22(4): 463-471. [DOI:10.11834/jig.20170406]
• [13] Gray D, Tao H. Viewpoint invariant pedestrian recognition with an ensemble of localized features[C]//Proceedings of the 10th European Conference on Computer Vision. Marseille, France: Springer, 2008: 262-275. [DOI:10.1007/978-3-540-88682-2_21]
• [14] Loy C, Xiang T, Gong S G. Multi-camera activity correlation analysis[C]//Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA: IEEE, 2009: 1988-1995. [DOI:10.1109/CVPR.2009.5206827]
• [15] Zhao R, Ouyang W L, Wang X G. Learning mid-level filters for person re-identification[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 144-151. [DOI:10.1109/CVPR.2014.26]
• [16] Xiong F, Gou M R, Camps O, et al. Person re-identification using kernel-based metric learning methods[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 1-16. [DOI:10.1007/978-3-319-10584-0_1]
• [17] Ahmed E, Jones M, Marks T K. An improved deep learning architecture for person re-identification[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 3908-3916. [DOI:10.1109/CVPR.2015.7299016]
• [18] Loy C C, Liu C X, Gong S G. Person re-identification by manifold ranking[C]//Proceedings of the 2013 IEEE International Conference on Image Processing. Melbourne, VIC, Australia: IEEE, 2013: 3567-3571. [DOI:10.1109/ICIP.2013.6738736]