Global-local metric learning for person re-identification

Zhang Jing; Zhao Xu

doi:10.11834/jig.20170407

Vol. 22, Issue 4, Pages: 472-481(2017)
Published Online: 07 April 2017
Published: 2017

Zhang Jing, Zhao Xu. Global-local metric learning for person re-identification[J]. Journal of Image and Graphics, 2017, 22(4): 472-481. DOI: 10.11834/jig.20170407.

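Abstract (Chinese version)

The task of person re-identification is to match people captured by different cameras at different times and places. Because of illumination, background, occlusion, viewpoint, and pose, the appearance of the same person differs considerably across cameras. Current research focuses mainly on feature representation and metric learning. Many metric learning methods perform well on person re-identification, but for diverse data sets a single global metric can hardly adapt to heterogeneous features. Researchers have therefore proposed local metric learning, but these methods usually require solving complicated convex optimization problems and are computationally cumbersome. Drawing on the idea of local metric learning and combining it with recently proposed global metric learning methods such as XQDA (cross-view quadratic discriminant analysis) and MLAPG (metric learning by accelerated proximal gradient), we propose a framework that integrates global and local metric learning. A Gaussian mixture model is used to cluster the training samples, and local metric learning is performed within each cluster; at the same time, global metric learning is performed on the entire training set. For test samples, the local and global metric matrices are combined with weights based on each sample's posterior probabilities under the components of the Gaussian mixture model, and the result serves as the basis for measuring similarity. In particular, for the MLAPG algorithm, the posterior probabilities of the samples under each Gaussian component are used to modify the loss weights of different samples in the objective loss function, which further improves the method's performance. Experimental results on the VIPeR, PRID 450S, and QMUL GRID data sets verify the effectiveness of the proposed integrated global-local metric learning method. Compared with global methods such as XQDA and MLAPG, matching accuracy on the VIPeR data set improves by about 2.0%, and performance on the other data sets also improves to varying degrees. In addition, the proposed method was validated with different feature representations; compared with the global methods, matching accuracy improves by about 1.3% to 3.4%. The framework effectively integrates global and local metric learning: it improves the performance of several global metric learning algorithms while avoiding the complicated computation of local metric learning algorithms. The experimental results show that, whichever feature representation is used, the proposed integrated global-local metric learning framework improves on global metric learning methods.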

Abstract

The task in person re-identification is to match snapshots of people from non-overlapping camera views at different times and places. Intra-class images from different cameras vary in appearance because of differences in illumination, background, occlusion, viewpoint, and pose. Feature representation and metric learning are two major research directions in person re-identification. On the one hand, some studies focus on feature descriptors that are discriminative across classes and robust against intra-class variations. On the other hand, numerous metric learning algorithms have achieved good performance in person re-identification. However, comparing all samples with a single global metric is ill-suited to heterogeneous data. Several researchers have therefore proposed local metric learning, but these methods generally require complicated computations to solve convex optimization problems. To improve the performance of metric learning algorithms while avoiding complex computation, this study applies the concept of local metric learning to global metric learning algorithms such as cross-view quadratic discriminant analysis (XQDA) and metric learning by accelerated proximal gradient (MLAPG). In the training stage, all the samples are softly partitioned into several clusters using a Gaussian mixture model (GMM). Local metrics are learned on each cluster using metric learning methods such as XQDA and MLAPG. Meanwhile, a global metric is learned on the entire training set. In the testing stage, the posterior probabilities of each testing sample under the GMM components are computed. For each pair of samples, the local metrics, weighted by the posterior probabilities of the corresponding GMM components, and the global metric, weighted by a cross-validated parameter, are integrated into the final metric for similarity evaluation. In this manner, different metrics are used to measure different pairs of samples, which is more suitable for heterogeneous data sets. In particular, we also propose an effective local metric learning strategy for MLAPG: the weights of the sample-pair losses in the loss function are modified using the posterior probabilities of the samples under each GMM component. We conduct experiments on three challenging person re-identification data sets (VIPeR, PRID 450S, and QMUL GRID). Experimental results show that the proposed approach outperforms traditional global metric learning methods. The improvement is notable on the VIPeR data set, which presents more complex variations in background and clothing than the other data sets: matching accuracy improves by approximately 2.0%. In addition, we conduct experiments with different types of feature representation to verify that the proposed method generalizes. Matching accuracy improves by approximately 1.3% to 3.4% with different feature descriptors, which shows that the proposed approach improves performance regardless of the feature descriptor used. We propose a novel framework that integrates global and local metric learning and takes advantage of both. Numerous recent global metric learning approaches can be integrated into the proposed framework to obtain improved performance in person re-identification. Compared with existing local metric learning approaches, the proposed framework integrates global metric learning methods flexibly and effectively, and it does not require complicated computation. Moreover, the proposed metric learning framework can be applied to many feature representation approaches.
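
The testing-stage combination described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the rule used here is an assumption: the final metric for a pair is `lam * M_global + sum_k p(k|a) * p(k|b) * M_k`, where `p(k|x)` is the GMM posterior of sample `x` for component `k`. The function names (`gmm_posteriors`, `combined_metric`, `metric_distance`), the parameter `lam`, the diagonal-covariance GMM, and the toy matrices are all hypothetical; in the framework itself the matrices would come from a learner such as XQDA or MLAPG and `lam` would be chosen by cross-validation.

```python
import math


def gmm_posteriors(x, weights, means, variances):
    """Posterior p(k | x) under a GMM with diagonal covariances (assumed form)."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of a diagonal Gaussian, summed over feature dimensions.
        log_pdf = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        log_terms.append(math.log(w) + log_pdf)
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def combined_metric(post_a, post_b, local_metrics, global_metric, lam):
    """lam * M_global + sum_k p(k|a) * p(k|b) * M_k (assumed weighting rule)."""
    d = len(global_metric)
    combined = [[lam * global_metric[r][c] for c in range(d)] for r in range(d)]
    for pa, pb, m_k in zip(post_a, post_b, local_metrics):
        w = pa * pb  # pair weight for component k
        for r in range(d):
            for c in range(d):
                combined[r][c] += w * m_k[r][c]
    return combined


def metric_distance(a, b, metric):
    """Quadratic-form distance (a - b)^T M (a - b)."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    n = len(diff)
    return sum(diff[r] * metric[r][c] * diff[c] for r in range(n) for c in range(n))


# Toy setup: two GMM components in a 2-D feature space (illustrative values only).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
local_metrics = [
    [[2.0, 0.0], [0.0, 1.0]],  # stand-in for a metric learned on cluster 0
    [[1.0, 0.0], [0.0, 3.0]],  # stand-in for a metric learned on cluster 1
]
global_metric = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for the global metric
lam = 0.5  # hypothetical global weight

probe, gallery = [0.2, -0.1], [0.5, 0.3]
post_probe = gmm_posteriors(probe, weights, means, variances)
post_gallery = gmm_posteriors(gallery, weights, means, variances)
final_metric = combined_metric(post_probe, post_gallery, local_metrics, global_metric, lam)
print(metric_distance(probe, gallery, final_metric))
```

Because both toy samples lie close to the first component, the combined matrix is dominated by the global matrix and the first local matrix. A pair whose samples fall in different clusters gets small weights on every local matrix under this rule, so its distance is driven mostly by the global metric.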