Two-stream complementary symmetrical CNN architecture for person re-identification
- 2018, Vol. 23, No. 7, pp. 1052-1060
Received: 2017-10-26,
Revised: 2018-01-29,
Published in print: 2018-07-16
DOI: 10.11834/jig.170557
Objective
Person re-identification aims to identify persons of interest who have appeared in a particular scene from massive surveillance data. Because this must be done accurately, person re-identification has become a novel and challenging research topic in public security. The main challenge is the variation of pedestrians across images. First, pedestrian poses vary widely with different human activities. Second, camera viewpoints differ because cameras are installed at different locations. Third, illumination changes from one time period to another. These variations degrade the performance of person re-identification. Recently, deep learning methods based on convolutional neural networks (CNNs) have achieved great success in computer vision, and they have also driven research on person re-identification, as several related works demonstrate. Deep models cope with these complex pedestrian variations effectively and achieve higher accuracy than traditional person re-identification methods. However, existing person re-identification datasets contain relatively few annotated pedestrian images because pedestrian annotation is difficult in practice. With such a limited training set, a CNN model built on the existing one-stream architecture is not trained sufficiently, which compromises the discriminative ability of the learned deep model. To address this problem, we propose a two-stream complementary symmetrical CNN model with an improved network structure for person re-identification.
Method
The newly designed architecture takes two streams of samples as input simultaneously. The two streams are complementary because their fully connected layers are concatenated. As a result, the input combinations become more diverse even with a limited training set, and the training process of the CNN model is richer.
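The claim that pairing inputs diversifies training can be illustrated with a small counting sketch. This is not the authors' implementation: the abstract does not say how the two streams are paired, so the sketch assumes, purely for illustration, that any ordered pair of distinct training images may be fed to the two streams. The function names and the toy image identifiers are hypothetical.

```python
from itertools import permutations


def count_one_stream_inputs(image_ids):
    """A one-stream network receives each training image as a separate input."""
    return len(image_ids)


def count_two_stream_inputs(image_ids):
    """Count ordered pairs of distinct images.

    Assumption (not stated in the abstract): every ordered pair of
    distinct training images is a valid two-stream input.
    """
    return sum(1 for _ in permutations(image_ids, 2))


# Toy training set of five image identifiers (made-up values).
toy_ids = ["img_a", "img_b", "img_c", "img_d", "img_e"]
one_stream = count_one_stream_inputs(toy_ids)
two_stream = count_two_stream_inputs(toy_ids)
print(one_stream, two_stream)  # prints "5 20"
```

Under this assumed pairing rule, n training images yield n(n-1) distinct two-stream inputs instead of n, which is one way to read the statement that the input combinations are diversified without adding annotated data.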
Result
We evaluate the proposed method and the baseline on two large-scale public person re-identification datasets, namely, Market-1501 and DukeMTMC-reID. On the Market-1501 dataset, the rank-1 accuracy and mean average precision (mAP) are 73.25% and 48.44%, respectively. On the DukeMTMC-reID dataset, the rank-1 accuracy and mAP are 63.02% and 41.15%, respectively. The proposed method is competitive with several existing person re-identification methods and shows a stable improvement over the baseline.
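For readers unfamiliar with the two reported metrics, the following minimal sketch shows how rank-1 accuracy and mAP are commonly computed from ranked retrieval lists. It is a generic illustration, not the evaluation code used in the paper: it assumes each query already has a gallery ranking expressed as a list of booleans (True where the gallery image shares the query identity), and it omits dataset-specific rules such as discarding same-camera or junk images. The function names and toy data are hypothetical.

```python
def average_precision(ranked_is_match):
    """Average precision for one query.

    ranked_is_match[i] is True if the gallery image at rank i + 1
    has the same identity as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_is_match, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def rank1_and_map(all_ranked_is_match):
    """Return (rank-1 accuracy, mean average precision) over all queries."""
    if not all_ranked_is_match:
        raise ValueError("at least one query is required")
    n = len(all_ranked_is_match)
    # Rank-1: fraction of queries whose top-ranked gallery image is a match.
    rank1 = sum(1 for r in all_ranked_is_match if r and r[0]) / n
    mean_ap = sum(average_precision(r) for r in all_ranked_is_match) / n
    return rank1, mean_ap


# Two toy queries with made-up rankings.
toy_results = [
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True, False],  # AP = 1/2
]
rank1, mean_ap = rank1_and_map(toy_results)
print(f"rank-1 = {rank1:.2%}, mAP = {mean_ap:.2%}")
```

Rank-1 only checks the top result, whereas mAP rewards placing all matching gallery images early in the ranking, which is why the two numbers reported for each dataset can differ substantially.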
Conclusion
In this work, we propose a novel two-stream complementary symmetrical CNN architecture for person re-identification. With the newly designed architecture, the CNN model can be trained more sufficiently even under a limited training set. Therefore, the learned model obtains a more discriminative representation of different pedestrians, and the performance of person re-identification is improved effectively.