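Zhu Fuqing, Kong Xiangwei, Fu Haiyan, Tian Qi (School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024; Department of Computer Science, University of Texas at San Antonio, San Antonio TX 78249, USA)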
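Objective Person re-identification studies how to accurately identify, within massive surveillance data, a person who has previously appeared in a particular scene. It has become a new and challenging research topic in public security. The challenge is that pedestrians in images vary widely in pose, viewpoint, and illumination, and these complex variations seriously degrade re-identification performance. In recent years, deep learning methods represented by convolutional neural networks (CNNs) have achieved great success in computer vision and have driven related research in person re-identification. CNNs cope effectively with pedestrian variations and achieve high accuracy. However, because the number of annotated pedestrians in person re-identification datasets is small, training with the existing one-stream CNN model is insufficient, which limits the discriminative ability of the deep model. To address this problem, the network structure is improved and a two-stream complementary symmetrical CNN structure is proposed for person re-identification.
Method The proposed method takes two streams of samples as input at the same time, and the samples in the two streams are complementary to each other. With a limited number of training samples, the input combinations become more diverse and the training of the CNN model becomes richer.
Result The proposed method is evaluated on two large-scale public datasets (Market-1501 and DukeMTMC-reID). It improves consistently over the baseline and is competitive with several existing methods. On the Market-1501 dataset, the rank-1 accuracy and mean average precision (mAP) reach 73.25% and 48.44%, respectively. On the DukeMTMC-reID dataset, the rank-1 accuracy and mAP reach 63.02% and 41.15%, respectively.
Conclusion The proposed two-stream complementary symmetrical CNN structure for person re-identification trains the CNN model more thoroughly with the limited training samples currently available, yields a deep model with stronger discriminative ability, and thereby effectively improves person re-identification performance.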
Two-stream complementary symmetrical CNN architecture for person re-identification
Zhu Fuqing, Kong Xiangwei, Fu Haiyan, Tian Qi (School of Information and Communication Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China; Department of Computer Science, University of Texas at San Antonio, San Antonio TX 78249, USA)
Objective Person re-identification aims to identify persons of interest, who have appeared in particular scenarios, from mass surveillance data. Performing this identification accurately is critical, and person re-identification has therefore become a novel and challenging research topic in the public security community. The main challenge is the variation of pedestrians across images. First, pedestrian poses vary in complex ways because of different human activities. Second, camera viewpoints differ because cameras are installed at different locations. Third, illumination changes over the course of the day. These variations compromise the performance of person re-identification. Recently, CNN-based deep learning methods have achieved great success in computer vision applications and have also driven research on person re-identification, as several related works demonstrate. Deep models, which can cope effectively with these complex pedestrian variations, achieve better accuracy than traditional person re-identification methods. However, the number of annotated pedestrian images in existing person re-identification datasets is relatively small because pedestrian annotation is difficult in practice. With such a limited training set, the existing one-stream architecture does not train the CNN model sufficiently, and the discriminative ability of the learned deep model is compromised. To address these problems, we propose a two-stream complementary symmetrical CNN model, which has an improved network structure, for person re-identification.
Method The newly designed architecture takes two streams of samples as input simultaneously. The streams have complementary characteristics due to the concatenation of the fully connected layers. As a result, the input combinations are more diverse under the limited training set, and the training process of the CNN model is richer.
Result We evaluate the proposed method and the baseline on two large-scale public person re-identification datasets, namely, Market-1501 and DukeMTMC-reID. On the Market-1501 dataset, the rank-1 accuracy and mAP are 73.25% and 48.44%, respectively. On the DukeMTMC-reID dataset, the rank-1 accuracy and mAP are 63.02% and 41.15%, respectively. The proposed method is competitive with several existing person re-identification methods and shows a stable improvement over the baseline.
Conclusion In this work, we propose a novel two-stream complementary symmetrical CNN architecture for person re-identification. With the newly designed architecture, the CNN model can be trained more thoroughly even under a limited training set. Therefore, the learned CNN model obtains a highly discriminative representation of different pedestrians, and the performance of person re-identification is improved effectively.
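For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how rank-1 accuracy and mean average precision (mAP) are commonly computed from ranked retrieval lists. It is an illustration only, not the authors' evaluation code: the function names and the toy data are hypothetical, and the sketch assumes each query's gallery list is already sorted by similarity and reduced to match/non-match flags. It ignores dataset-specific protocol details such as excluding same-camera or junk gallery images.

```python
from typing import Sequence


def rank1_accuracy(ranked_matches: Sequence[Sequence[bool]]) -> float:
    """Fraction of queries whose top-ranked gallery image has the correct identity."""
    hits = sum(1 for matches in ranked_matches if matches and matches[0])
    return hits / len(ranked_matches)


def average_precision(matches: Sequence[bool]) -> float:
    """Average precision of one query's ranked gallery list.

    Precision is sampled at every rank where a correct match occurs and then
    averaged over the number of correct matches in the list.
    """
    num_hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            num_hits += 1
            precision_sum += num_hits / rank
    return precision_sum / num_hits if num_hits else 0.0


def mean_average_precision(ranked_matches: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(m) for m in ranked_matches) / len(ranked_matches)


# Hypothetical toy data: one list per query, True where the gallery image at
# that rank shares the query's identity.
toy_ranked_matches = [
    [True, False, True],   # correct match at ranks 1 and 3
    [False, True, False],  # correct match at rank 2 only
]

if __name__ == "__main__":
    print("rank-1:", rank1_accuracy(toy_ranked_matches))
    print("mAP:", mean_average_precision(toy_ranked_matches))
```

On the toy data, only the first query has a correct match at rank 1, so rank-1 accuracy is 0.5, while mAP also credits the second query for retrieving its match at rank 2. This is why the two metrics can differ substantially on the same system, as they do in the results reported above.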