

Abstract
Recent progress in person re-ID

Zhang Yongfei, Yang Hangyuan, Zhang Yujia, Dou Zhaopeng, Liao Shengcai, Zheng Wei-Shi, Zhang Shiliang, Ye Mang, Yan Yichao, Li Junjie, Wang Shengjin

Person re-identification (person re-ID) aims to solve the problem of associating and matching target person images across multiple cameras within the camera network of a surveillance system, especially when face, iris, and other biometric recognition fail in non-cooperative application scenarios. It has become a key component and supporting technique of intelligent video surveillance systems in numerous applications such as intelligent public security and smart cities. Recently, person re-ID has attracted increasing attention from both academia and industry and has made rapid progress. Facing the technical challenges and application needs of person re-ID in practical scenarios, this paper first gives a brief introduction to the development history, major datasets, and evaluation metrics of person re-ID, and then summarizes and analyzes the cutting-edge progress in hot research topics of person re-ID, including occluded person re-ID, unsupervised person re-ID, virtual data generation for person re-ID, domain generalization person re-ID, cloth-changing person re-ID, cross-modal person re-ID, and person search. More specifically, to address the impact of possible occlusions on the performance of person re-ID, recent progress in occluded person re-ID is first reviewed in Section 2.1, with a brief introduction to the major datasets for occluded person re-ID in 2.1.1 and a review of the two major categories of occluded person re-ID models in 2.1.2. Facing the challenges of low-efficiency, high-cost data annotation and the great impact of training data on the performance of person re-ID, unsupervised person re-ID and virtual data generation for person re-ID emerge as two hot topics in person re-ID.
Section 2.2 elaborates on recent advances in unsupervised person re-ID, which are classified into three major categories, namely pseudo-label-generation-based models, domain-transfer-based models, and other models that take into consideration extra information besides person images, such as time stamps and camera labels. Section 2.3 surveys state-of-the-art work on virtual data generation for person re-ID, with a detailed introduction to, as well as performance comparisons of, major virtual datasets, including SOMAset, UnrealPerson, ClonePerson, and WePerson. Despite the impressive progress in the supervised and unsupervised person re-ID settings, where new models need to be trained for a new domain by accessing its data, most existing models suffer a drastic performance decline on unseen domains, i.e., new scenes in real-world applications, where data might be unavailable and re-training might be impossible. Thus, domain generalization person re-ID emerges, aiming to learn a model that generalizes well to unseen domains. Recent advances in domain generalization person re-ID are reviewed in Section 2.4 and classified into five categories, namely batch/instance normalization models in 2.4.1, domain-invariant feature learning models in 2.4.2, deep-learning-based explicit image matching models in 2.4.3, models based on mixture of experts in 2.4.4, and meta-learning-based models in 2.4.5. Since most current person re-ID models largely depend on the color appearance of persons' clothes, which might not be reliable when a person changes clothes, cloth-changing person re-ID becomes a challenging setting in which person images with clothes changed exhibit large intra-class variation and small inter-class variation.
Typical cloth-changing person re-ID datasets are introduced in 2.5.1 and recent progress is reviewed in 2.5.2, in which models in the first category explicitly introduce extra cloth-appearance-independent features such as contour and face, while those in the second category try to decouple clothing features from person ID features. To compensate for the drawbacks of conventional person re-ID on visible-light/RGB images in complex real-world scenes, such as poor lighting conditions at night, cross-modal person re-ID aims to explore the problem through heterogeneous data beyond visible RGB images. Section 2.6 surveys state-of-the-art cross-modal person re-ID, with commonly used cross-modal person re-ID datasets in 2.6.1 and four sub-categories of models in 2.6.2 according to the modalities employed, namely RGB-infrared image person re-ID, RGB image-text person re-ID, RGB image-sketch person re-ID, and RGB-depth image person re-ID, respectively. Since existing person re-ID benchmarks and methods mainly focus on matching cropped person images between queries and candidates, which differs from practical scenarios where bounding box annotations of persons are usually unavailable and the target person needs to be searched from a gallery of whole scene images, person search, which jointly considers person detection and person re-ID in a single framework, has become a new hot research topic. Typical datasets and recent progress on person search are presented in 2.7.1 and 2.7.2, respectively. Finally, the existing challenges and development trends of person re-ID techniques are discussed. It is hoped that this summary and analysis can provide a reference for relevant researchers to carry out research on person re-ID and promote the progress of person re-ID techniques and applications.