Chen Ying, Huo Zhonghua. Person re-identification based on multi-directional saliency metric learning[J]. Journal of Image and Graphics, 2015, 20(12): 1674-1683. DOI: 10.11834/jig.20151212.
Person re-identification is important in video surveillance systems because it reduces the human effort required to search for a target among a large number of video sequences. However, this task is difficult because of variations in lighting conditions, background clutter, changes in viewpoint, and differences in pose. To tackle this problem, most studies have concentrated on designing a feature representation, a metric learning method, or a discriminative learning method. Visual saliency has recently been exploited in discriminative learning methods because salient regions help humans distinguish targets efficiently. Given the problem of inconsistent salience between matched patches in person re-identification,
this study proposes a multi-directional salience similarity evaluation approach for person re-identification based on metric learning. The proposed method is robust to viewpoint and background variations. First, the salience of image patches is obtained by fusing inter-salience and intra-salience, both of which are estimated by manifold ranking. The visual similarity between matched patches is then established by the multi-directional weighted fusion of salience, according to the distribution of the four saliency types of the matched patches. The weight of saliency in each direction is obtained by metric learning based on structural SVM ranking. Finally, a comprehensive similarity measure for image pairs is formed. The proposed method is evaluated on two public benchmark datasets (VIPeR and ETHZ),
and experimental results show that it achieves excellent re-identification rates with the comprehensive similarity measure compared with similar algorithms. Moreover, the proposed method is invariant to the effects of background variations. The re-identification results on the VIPeR dataset, with half of the dataset sampled for training, are quantitatively analyzed: the proposed method achieves a matching rate of 30% at rank 1 (i.e., the correct match is ranked first) and 72% at rank 15 (the expected matching rate within the top 15 candidates), outperforming existing learning-based methods. The proposed method still achieves state-of-the-art performance even when the number of training pairs is small. For generalization verification, experiments are also conducted on the ETHZ dataset; the results show that the proposed method outperforms existing feature-design-based and supervised-learning-based methods on all three sequences. Thus,
the proposed method shows practical significance. The multi-directional weighted fusion of salience yields a comprehensive description of the saliency distribution of image pairs and thus a comprehensive similarity measure. The proposed method can realize person re-identification across large-scale, non-overlapping, multi-camera views. Furthermore, it improves the discriminative power and accuracy of re-identification and is strongly robust to background changes.
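The multi-directional weighted fusion described above can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, so everything below is a hypothetical reading: `patch_similarity` is an invented helper, the four "directions" are modeled as the four joint salience configurations of a matched patch pair (both salient, one side salient, neither), and the weight vector `w` stands in for the weights that the paper learns via structural SVM ranking.

```python
import numpy as np

def patch_similarity(feat_p, feat_q, sal_p, sal_q, w):
    """Hypothetical multi-directional salience-weighted patch similarity.

    feat_p, feat_q : feature vectors of two matched patches
    sal_p, sal_q   : fused salience scores in [0, 1]
                     (inter-salience + intra-salience, per the abstract)
    w              : length-4 weight vector for the four salience
                     configurations -- an assumed stand-in for the
                     weights learned by structural SVM ranking
    """
    # Appearance similarity: Gaussian kernel on feature distance.
    d = np.linalg.norm(np.asarray(feat_p) - np.asarray(feat_q))
    appearance = np.exp(-d ** 2 / 2.0)

    # Four salience "directions" for the patch pair:
    # both salient, p-only salient, q-only salient, neither salient.
    directions = np.array([
        sal_p * sal_q,
        sal_p * (1.0 - sal_q),
        (1.0 - sal_p) * sal_q,
        (1.0 - sal_p) * (1.0 - sal_q),
    ])

    # Weighted fusion of the salience configurations modulates the
    # appearance similarity of the matched patches.
    return float(np.dot(w, directions) * appearance)
```

An image-pair similarity would then sum this score over all matched patch pairs; with uniform weights the expression degenerates to plain appearance similarity, which is why learning non-uniform direction weights is the point of the method.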