
Person re-identification based on multi-feature fusion and alternating direction method of multipliers
Qi Meibin, Wang Cichun, Jiang Jianguo, Li Ji
School of Computer and Information, Hefei University of Technology, Hefei 230009, China
Supported by: National Natural Science Foundation of China (61771180); Key Research and Development Project of Anhui Province, China (1704d0802183)

# Abstract

Objective Person re-identification is a challenging problem with practical value. It plays an important role in video surveillance systems because it reduces the human effort needed to search for a target in a large volume of video, and it has attracted increasing interest in computer vision. Person re-identification algorithms have been applied in criminal investigation, where filtering out passers-by helps the police identify suspects. However, differences in color, illumination, posture, and imaging quality, together with the low resolution of captured frames, cause large appearance variations across cameras, so person re-identification remains difficult. To improve re-identification accuracy, an algorithm based on multi-feature fusion and the alternating direction method of multipliers is proposed.

Method First, the original images are processed by an image enhancement algorithm to reduce the impact of illumination changes; the enhancement aims to produce images that are close to human visual perception. The images are then segmented non-uniformly. A sub-window of 10×10 pixels is moved with a 5-pixel step, so that adjacent windows overlap, to capture the local information of the pedestrian image. Meanwhile, the specific region mean method divides the pedestrian image into five blocks: because the legs and the torso differ in expressive ability, they are divided into three blocks and two blocks, respectively. The second and third blocks are pooled with the maximum operation, whereas the other blocks are pooled with the mean operation, because the second and third blocks are less affected by ambient noise than the others. From the processed images, HSV and LAB color features, the scale-invariant local ternary pattern (SILTP) texture feature, and the histogram of oriented gradients (HOG) shape feature are extracted. Existing person re-identification algorithms generally consider only the matching between local regions, which discards the information in the gaps between blocks; combining global and local methods addresses this problem. The proposed algorithm uses multi-feature fusion to combine the global and local similarity measurement functions of a pedestrian pair into the final similarity function. Finally, the optimal distance metric matrix is updated by the alternating direction method of multipliers, and the final similarity of each pair is obtained to perform re-identification.

Result The proposed method is evaluated on four public benchmark datasets: VIPeR, CUHK01, CUHK03, and GRID. The rank-1 matching rate (the proportion of probes whose correct match is ranked first) is 51.5% on VIPeR, 48.7% on CUHK01, and 21.4% on GRID. The rank-5 matching rate (the proportion of probes whose correct match is ranked within the top five) exceeds 80% on VIPeR and 70% on CUHK01. On CUHK03, the method achieves rank-1 identification rates of 62.40% with labeled bounding boxes and 55.05% with automatically detected bounding boxes, outperforming local maximal occurrence (LOMO) by 10.2 percentage points in the labeled setting and 8.8 percentage points in the detected setting. The proposed method therefore improves the recognition rate markedly and has practical application value.

Conclusion The experimental results show that the proposed method expresses pedestrian image information effectively. Its effectiveness stems from the non-uniform segmentation and the specific region mean method, which reduce the influence of ambient noise, increase robustness to occlusion, and handle pose variation more flexibly. The updated distance metric matrix captures the distance information between pedestrians and improves the recognition rate. The method is applicable to person re-identification in most scenarios, especially static image-based re-identification in complex scenes, and it maintains high recognition accuracy under local occlusion, illumination differences, and pose or viewpoint differences.

# Key words

person re-identification; multi-feature fusion; non-uniform segmentation; HOG feature; specific region mean method; alternating direction method of multipliers

# 1.2 Similarity measurement function

 $\delta \left( {{\mathit{\boldsymbol{x}}_i},{\mathit{\boldsymbol{x}}_j}} \right) = \ln \frac{{{P_{\rm{S}}}\left( {{\mathit{\boldsymbol{x}}_i},{\mathit{\boldsymbol{x}}_j}} \right)}}{{{P_{\rm{D}}}\left( {{\mathit{\boldsymbol{x}}_i},{\mathit{\boldsymbol{x}}_j}} \right)}}$ (1)
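Eq. (1) defines the score as the log ratio of the probability that a pair shows the same person to the probability that it shows different people. The following is a minimal sketch of that ratio, assuming (this is not stated in the text above) that the feature difference follows a zero-mean Gaussian with diagonal covariance under each hypothesis. The names `var_same` and `var_diff` are hypothetical per-dimension variances.

```python
import math


def log_gaussian_zero_mean(d, var):
    """Log density of a zero-mean Gaussian with diagonal covariance `var` at `d`."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + x * x / v)
               for x, v in zip(d, var))


def delta(x_i, x_j, var_same, var_diff):
    """Eq. (1): ln(P_S / P_D), evaluated on the difference x_i - x_j.

    Assumption: P_S and P_D are zero-mean Gaussians over the pairwise
    difference, with hypothetical diagonal variances var_same and var_diff.
    """
    d = [a - b for a, b in zip(x_i, x_j)]
    return log_gaussian_zero_mean(d, var_same) - log_gaussian_zero_mean(d, var_diff)


# A close pair scores positive, a distant pair negative.
close = delta([1.0, 2.0], [1.0, 2.0], [0.1, 0.1], [1.0, 1.0])
far = delta([0.0, 0.0], [3.0, 3.0], [0.1, 0.1], [1.0, 1.0])
```

Under these assumptions a positive value favors the "same person" hypothesis and a negative value favors "different people".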

 $L\left( \mathit{\boldsymbol{m}} \right) = \sum\limits_{n = 1}^N {\sum\limits_{{\mathit{\boldsymbol{l}}_i} \in \mathit{\boldsymbol{l}}_n^ + ,{\mathit{\boldsymbol{l}}_j} \in \mathit{\boldsymbol{l}}_n^ - } {{l_{{\rm{triplet}}}}\left( {{\mathit{\boldsymbol{l}}_n},{\mathit{\boldsymbol{l}}_i},{\mathit{\boldsymbol{l}}_j}} \right)} }$ (13)

 ${l_{{\rm{triplet}}}}\left( {{\mathit{\boldsymbol{l}}_n},{\mathit{\boldsymbol{l}}_i},{\mathit{\boldsymbol{l}}_j}} \right) = {\left[ {\delta \left( {{\mathit{\boldsymbol{l}}_n},{\mathit{\boldsymbol{l}}_i}} \right) - \delta \left( {{\mathit{\boldsymbol{l}}_n},{\mathit{\boldsymbol{l}}_j}} \right) + \alpha } \right]_ + }$ (14)
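Eqs. (13) and (14) can be sketched as a plain double loop over triplets. This is an illustrative reading, assuming that $\mathit{\boldsymbol{l}}_n^+$ holds samples of the same identity as the anchor $\mathit{\boldsymbol{l}}_n$ and $\mathit{\boldsymbol{l}}_n^-$ holds samples of other identities. Taken literally, Eq. (14) is zero only when $\delta$ of the positive pair is smaller than $\delta$ of the negative pair by at least $\alpha$, so the stand-in score below is a squared Euclidean distance; `squared_distance` is a hypothetical placeholder, not the paper's $\delta$.

```python
def squared_distance(a, b):
    """Hypothetical stand-in for delta: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_term(anchor, positive, negative, score, alpha):
    """Eq. (14): hinge [score(anchor, pos) - score(anchor, neg) + alpha]_+."""
    return max(0.0, score(anchor, positive) - score(anchor, negative) + alpha)


def total_loss(anchors, positive_sets, negative_sets, score=squared_distance, alpha=1.0):
    """Eq. (13): sum the triplet term over every (positive, negative) pair of each anchor."""
    total = 0.0
    for anchor, positives, negatives in zip(anchors, positive_sets, negative_sets):
        for pos in positives:
            for neg in negatives:
                total += triplet_term(anchor, pos, neg, score, alpha)
    return total


loss = total_loss(
    anchors=[[0.0, 0.0]],
    positive_sets=[[[0.0, 1.0]]],
    negative_sets=[[[3.0, 0.0], [1.0, 0.0]]],
)
```

In this toy call only the second negative violates the margin, so it is the only triplet that contributes to the sum.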

 $\mathit{\boldsymbol{U}}_1^{k + 1} = \arg \mathop {\min }\limits_{{\mathit{\boldsymbol{U}}_1}} {g_1}\left( {{\mathit{\boldsymbol{U}}_1}} \right) + \frac{\rho }{2}\left\| {{\mathit{\boldsymbol{U}}_1} - \left( {\mathit{\boldsymbol{U}}_3^k - \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_1^k} \right)} \right\|_{\rm{F}}^2$ (17)

 $\mathit{\boldsymbol{U}}_2^{k + 1} = \arg \mathop {\min }\limits_{{\mathit{\boldsymbol{U}}_2}} {g_2}\left( {{\mathit{\boldsymbol{U}}_2}} \right) + \frac{\rho }{2}\left\| {{\mathit{\boldsymbol{U}}_2} - \left( {\mathit{\boldsymbol{U}}_3^k - \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_2^k} \right)} \right\|_{\rm{F}}^2$ (18)

 $\begin{array}{*{20}{c}} {\mathit{\boldsymbol{U}}_3^{k + 1} = \arg \mathop {\min }\limits_{{\mathit{\boldsymbol{U}}_3}} {g_3}\left( {{\mathit{\boldsymbol{U}}_3}} \right) + }\\ {\rho \left\| {{\mathit{\boldsymbol{U}}_3} - \frac{1}{2}\left( {\mathit{\boldsymbol{U}}_1^{k + 1} + \mathit{\boldsymbol{U}}_2^{k + 1} + \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_1^k + \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_2^k} \right)} \right\|_{\rm{F}}^2} \end{array}$ (19)

 $\left\{ \begin{array}{l} \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_1^{k + 1} = \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_1^k + \mathit{\boldsymbol{U}}_1^{k + 1} - \mathit{\boldsymbol{U}}_3^{k + 1}\\ \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_2^{k + 1} = \mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_2^k + \mathit{\boldsymbol{U}}_2^{k + 1} - \mathit{\boldsymbol{U}}_3^{k + 1} \end{array} \right.$ (20)

The updates in Eqs. (17) to (20) are iterated, and the final iterate is taken as the output:

$\mathit{\boldsymbol{U}} \leftarrow \mathit{\boldsymbol{U}}_3^K$
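The update pattern of Eqs. (17) to (20) is a scaled-form consensus scheme: $\mathit{\boldsymbol{U}}_1$ and $\mathit{\boldsymbol{U}}_2$ are each tied to $\mathit{\boldsymbol{U}}_3$ through a dual variable. The sketch below runs that pattern on a toy problem. The three terms are assumptions chosen only so that each sub-problem has a closed form: $g_1(\mathit{\boldsymbol{U}}) = \frac{1}{2}\|\mathit{\boldsymbol{U}} - \mathit{\boldsymbol{A}}\|_{\rm F}^2$, $g_2$ is the indicator of elementwise nonnegativity, and $g_3 = 0$. The paper's actual $g_1$, $g_2$, $g_3$ are not shown in this section.

```python
def admm_consensus(A, rho=1.0, iters=200):
    """Iterate the update pattern of Eqs. (17)-(20) on a toy problem.

    Toy terms (assumptions, not from the paper):
      g1(U) = 0.5 * ||U - A||_F^2, g2 = indicator of U >= 0, g3 = 0.
    """
    n_rows, n_cols = len(A), len(A[0])

    def zeros():
        return [[0.0] * n_cols for _ in range(n_rows)]

    def combine(fn, *mats):
        # Apply fn elementwise across matrices of equal shape.
        return [[fn(*(m[r][c] for m in mats)) for c in range(n_cols)]
                for r in range(n_rows)]

    U3, L1, L2 = zeros(), zeros(), zeros()
    for _ in range(iters):
        # Eq. (17): closed-form minimizer for the quadratic g1.
        U1 = combine(lambda a, u3, l1: (a + rho * (u3 - l1)) / (1.0 + rho), A, U3, L1)
        # Eq. (18): for an indicator g2 the minimizer is a projection (clip at 0).
        U2 = combine(lambda u3, l2: max(0.0, u3 - l2), U3, L2)
        # Eq. (19): with g3 = 0 the minimizer is the average inside the norm.
        U3 = combine(lambda u1, u2, l1, l2: 0.5 * (u1 + u2 + l1 + l2), U1, U2, L1, L2)
        # Eq. (20): scaled dual updates.
        L1 = combine(lambda l1, u1, u3: l1 + u1 - u3, L1, U1, U3)
        L2 = combine(lambda l2, u2, u3: l2 + u2 - u3, L2, U2, U3)
    return U3  # U <- U_3^K


U = admm_consensus([[1.0, -2.0], [0.5, -0.25]])
```

With these toy terms the problem is to find the nonnegative matrix closest to `A`, so the iterates approach the elementwise nonnegative part of `A`.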

# 1.4 Person re-identification algorithm

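1) Input all images in the dataset and preprocess them with the Retinex algorithm.

2) Segment the processed images non-uniformly and extract the image features ${\left( {{\mathit{\boldsymbol{x}}_i}, {\mathit{\boldsymbol{x}}_j}} \right)_c}$, which comprise the HSV and LAB color features, the SILTP texture feature, and the HOG feature; $i, j = 1, 2, \cdots , N$; $c = 1, 2, 3, 4$. Here $i, j$ index a pedestrian pair, $c$ indexes the feature space, and $N$ is the total number of images. Normalize the extracted feature vectors with the L2 norm; the normalized features are denoted by $\left( {{\mathit{\boldsymbol{l}}_i}, {\mathit{\boldsymbol{l}}_j}} \right)$.

3) Denote the distance metric matrices of the features of each block by $\left\{ {{\mathit{\boldsymbol{M}}^{c, 1}}, {\mathit{\boldsymbol{M}}^{c, 2}}, \cdots , {\mathit{\boldsymbol{M}}^{c, r}}, \cdots , {\mathit{\boldsymbol{M}}^{c, R}}, {\mathit{\boldsymbol{M}}^{c, G}}} \right\}_{c = 1}^C$, where $R = 5$ (the number of blocks) and $C = 4$ (the number of feature types).

4) Use the iterative update optimization algorithm to obtain the optimal distance metric matrices, and compute the final similarity between each pair of pedestrians.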
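Two pieces of step 2 can be sketched directly: the overlapping sub-window grid (10×10 pixels with a 5-pixel step, as given in the abstract) and the L2 normalization of a feature vector. The function names are hypothetical, and the 128×48 image size is the one used for the CUHK01 and GRID experiments. Feature extraction itself (HSV, LAB, SILTP, HOG) is not shown.

```python
import math


def window_origins(height, width, size=10, step=5):
    """Top-left corners of size x size sub-windows moved with the given step."""
    rows = range(0, height - size + 1, step)
    cols = range(0, width - size + 1, step)
    return [(r, c) for r in rows for c in cols]


def l2_normalize(vector, eps=1e-12):
    """Scale a feature vector to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


origins = window_origins(128, 48)
unit = l2_normalize([3.0, 4.0])
```

For a 128×48 image this grid has 24 window rows and 8 window columns, 192 windows in total. Windows that would extend past the image border are simply not generated in this sketch.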

# 2.1 Results on the VIPeR dataset

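The VIPeR dataset consists of 1 264 images of 632 pedestrian pairs, and the two images of each pair come from different camera scenes. VIPeR is the most challenging dataset in person re-identification, and results on it are the most convincing; therefore, the comparison experiments of the proposed algorithm are all conducted on VIPeR. In the experiments, the training set and the test set (P) each contain 316 pedestrian image pairs.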
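The Rank1, Rank5, Rank10, and Rank20 columns in the tables of this section are rank-k matching rates. The sketch below shows one common way to compute them from a probe-by-gallery score matrix. It assumes that a higher score means a better match and that the true match of probe `i` is gallery item `i`; the function name `matching_rates` is hypothetical, and this is not the paper's evaluation code.

```python
def matching_rates(scores, ranks=(1, 5, 10, 20)):
    """Percentage of probes whose true match is ranked within the top k.

    scores[i][j] is the similarity of probe i to gallery item j.
    Assumption: gallery item i is the true match of probe i.
    """
    hits = {k: 0 for k in ranks}
    for i, row in enumerate(scores):
        # Rank of the true match: 1 + number of gallery items scoring strictly higher.
        rank = 1 + sum(1 for s in row if s > row[i])
        for k in ranks:
            if rank <= k:
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(scores) for k in ranks}


toy_scores = [
    [0.9, 0.1, 0.2],  # true match ranked 1st
    [0.8, 0.3, 0.5],  # true match ranked 3rd
    [0.1, 0.2, 0.7],  # true match ranked 1st
]
rates = matching_rates(toy_scores, ranks=(1, 2, 3))
```

With a 316-pair test split, the score matrix would be 316×316 under the same convention.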

Table 1 Matching rates (%) of the proposed algorithm with and without the HOG feature

| HOG feature | Rank1 | Rank5 | Rank10 | Rank20 |
| --- | --- | --- | --- | --- |
| Without | 48.54 | 78.39 | 89.37 | 98.96 |
| With | 51.49 | 80.79 | 89.40 | 99.27 |

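Table 2 Matching rates (%) of the proposed algorithm with the specific region mean method and with the local maximum method

| Region processing method | Rank1 | Rank5 | Rank10 | Rank20 |
| --- | --- | --- | --- | --- |
| Maximum method | 49.91 | 80.28 | 89.87 | 99.21 |
| Proposed | 51.49 | 80.79 | 89.40 | 99.27 |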

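Table 3 Matching rates (%) of different methods on VIPeR

| Method | Rank1 | Rank5 | Rank10 | Rank20 |
| --- | --- | --- | --- | --- |
| KISSME[4] | 19.2 | 48.8 | 64.9 | 80.2 |
| kLFDA[16] | 32.3 | 65.8 | 79.7 | 90.9 |
| Polymap[9] | 36.8 | 70.4 | 83.7 | 91.8 |
| LOMO[10] | 40.0 | 68.1 | 80.5 | 91.0 |
| Ref. [6] | 40.7 | 72.3 | 83.9 | 92.0 |
| Ref. [7] | 42.7 | 74.5 | 85.4 | 92.8 |
| Proposed | 51.5 | 80.8 | 89.4 | 99.3 |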

# 2.2 Results on the CUHK datasets

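The images in the CUHK01 dataset were captured in a campus environment. The dataset contains 971 pedestrians with four images each; for each pedestrian, the first two images and the last two images are earlier and later frames of the video. In the experiments, all images are resized to 128 pixels high and 48 pixels wide. The training set contains 485 pedestrian image pairs, and the test set contains 486 pedestrian image pairs. Table 4 compares the performance of the proposed algorithm with that of existing person re-identification algorithms.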

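Table 4 Matching rates (%) of different methods on CUHK01

| Method | Rank1 | Rank5 | Rank10 | Rank20 |
| --- | --- | --- | --- | --- |
| KISSME[5] | 17.9 | 38.1 | 48.0 | 58.8 |
| kLFDA[16] | 29.1 | 55.2 | 66.4 | 77.3 |
| MFA[16] | 29.6 | 55.8 | 66.4 | 77.3 |
| Ref. [7] | 36.1 | 62.6 | 72.6 | 81.9 |
| Ref. [8] | 43.7 | 70.8 | 79.0 | 87.3 |
| Proposed | 48.7 | 73.1 | 81.6 | 88.9 |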

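Table 5 Rank-1 matching rates (%) of different methods on CUHK03

| Method | Detected (by detector) | Labeled (manually cropped) |
| --- | --- | --- |
| KISSME[4] | 11.70 | 14.17 |
| DeepReID[2] | 19.89 | 20.65 |
| Ref. [17] | 44.96 | 54.74 |
| LOMO[10] | 46.25 | 52.20 |
| Proposed | 55.05 | 62.40 |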

# 2.3 Results on the GRID dataset

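The GRID dataset consists of 1 275 person images. It contains 250 pedestrian image pairs; the two images in each pair show the same person but were captured from different camera views. In addition, there are 775 extra person images that do not belong to any of the 250 people. In the experiments, all images are resized to 128 pixels high and 48 pixels wide. The training set contains 125 pedestrian image pairs, and the test set contains 125 pedestrian image pairs plus the 775 unrelated person images. The large number of unrelated images acts as interference and lowers the overall recognition rate on GRID, but it also makes GRID closer to real conditions, because in criminal investigation many passers-by likewise interfere with the police in finding the final suspect.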

Table 6 Matching rates (%) of different methods on GRID

| Method | Rank1 | Rank5 | Rank10 | Rank20 |
| --- | --- | --- | --- | --- |
| Mrank-PRDC[18] | 11.1 | 26.1 | 35.8 | 46.6 |
| Mrank-RankSVM[18] | 12.2 | 27.8 | 36.3 | 46.6 |
| Polymap[19] | 16.3 | 35.8 | 46.0 | 57.6 |
| LOMO[10] | 16.6 | 33.8 | 41.8 | 52.4 |
| Proposed | 21.4 | 40.3 | 49.9 | 59.3 |

