Multishape part network architecture for person re-identification
2019, Vol. 24, No. 11, Pages 1932-1941
Received: 2019-02-26; Revised: 2019-04-10; Accepted: 2019-04-17; Published in print: 2019-11-16
DOI: 10.11834/jig.190042
Objective
Combining the global and local features of pedestrian images has become a standard approach in person re-identification. Existing local-feature-based methods mostly focus on locating regions with specific semantics, which increases the learning difficulty and is not robust to scenes with large variations. To address these problems, we improve the network structure and propose a multishape part network (MSPN). It has multiple branches, takes horizontal and vertical strip features as local features, and can be trained end to end.
Method
The multiple branches of the network learn multi-granularity and multi-shape local features simultaneously: one branch learns global features, two branches learn horizontal-strip local features at different granularities, and the last branch learns vertical-strip local features. Instead of learning to locate regions with specific semantics, the network slices the feature maps extracted from an image into several horizontal or vertical strips and treats them as local features. The shape and number of strips differ across branches, so local feature information of different granularities and shapes is obtained. Because the strips are cut along different directions, the multi-granularity, multi-shape local features alleviate the misalignment of pedestrians across different images.
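As a rough illustration of the strip-based partition described above, the following sketch splits a convolutional feature map into horizontal or vertical strips and pools each strip into a local feature vector. It is a minimal sketch only: PyTorch, average pooling, the 2048-channel 24x8 feature-map size, and the specific strip counts are assumptions for illustration and are not specified in this abstract.

```python
import torch
import torch.nn.functional as F

def strip_features(feat_map, num_strips, direction="horizontal"):
    """Split a backbone feature map into strips and pool each strip.

    feat_map   : tensor of shape (N, C, H, W), e.g. the output of a CNN backbone.
    num_strips : number of equal strips to cut the map into.
    direction  : "horizontal" cuts along the height axis (strips stacked top to
                 bottom); "vertical" cuts along the width axis (left to right).
    Returns a tensor of shape (N, num_strips, C): one pooled vector per strip.
    """
    if direction == "horizontal":
        # Each strip spans the full width and 1/num_strips of the height.
        pooled = F.adaptive_avg_pool2d(feat_map, (num_strips, 1))   # (N, C, S, 1)
        return pooled.squeeze(-1).permute(0, 2, 1)                  # (N, S, C)
    # Each strip spans the full height and 1/num_strips of the width.
    pooled = F.adaptive_avg_pool2d(feat_map, (1, num_strips))       # (N, C, 1, S)
    return pooled.squeeze(-2).permute(0, 2, 1)                      # (N, S, C)

# Toy usage with a hypothetical 2048-channel, 24x8 backbone feature map.
feats = torch.randn(4, 2048, 24, 8)
h2 = strip_features(feats, 2, "horizontal")  # coarse horizontal parts -> (4, 2, 2048)
h3 = strip_features(feats, 3, "horizontal")  # finer horizontal parts  -> (4, 3, 2048)
v3 = strip_features(feats, 3, "vertical")    # vertical parts          -> (4, 3, 2048)
g  = strip_features(feats, 1, "horizontal")  # one strip == global average pooling
```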
Result
Comprehensive experiments on mainstream evaluation datasets, including Market-1501, DukeMTMC-ReID, and CUHK03, show that the multishape part network outperforms existing mainstream methods. On Market-1501 it achieves 84.57% mean average precision (mAP) and 94.51% rank-1 accuracy.
Conclusion
The multishape part network learns a deep model with stronger discriminative ability and thereby effectively improves the accuracy of person re-identification.
Objective
Person re-identification (ReID) aims to associate the same pedestrian across multiple cameras. It has attracted rapidly increasing attention in the computer vision community because of its importance for many potential applications, such as video surveillance analysis and content-based image/video retrieval. Person ReID is a challenging task. First, when a single person is captured by different cameras, the illumination conditions, background clutter, occlusion, observable human body parts, and perceived posture of the person can be dramatically different. Second, even within a single camera, the aforementioned conditions can vary through time as the person moves and engages in different actions (e.g., suddenly taking something out of a bag while walking). Third, a gallery itself usually consists of diverse images of a single person from multiple cameras, which, given the above factors, generate high intraclass variation that impedes the generalization of learned representations. Fourth, compared with images in problems such as object recognition or detection, images in person ReID benchmarks are usually of lower resolution, making it difficult to extract distinctive attributes to distinguish one identity from another. The success of deep convolutional networks has introduced powerful representations with high discrimination and robustness for pedestrian images and enhanced the performance of ReID. The combination of global and local features has been an essential solution for improving discriminative performance in person ReID tasks. Previous methods based on local features focused on locating regions with specific predefined semantics, which increased the learning difficulty and lacked robustness to different scenarios. In this study, a multishape part network (MSPN) that takes horizontal and vertical strip features as local features is designed. The network can be trained end to end.
Method
We carefully design the MSPN, a multibranch deep network architecture consisting of one branch for global feature representations and three branches for local feature representations. MSPN no longer learns to locate regions with specific semantics. Instead, the features extracted from images are divided into horizontal and vertical strips. The shape and number of partitions differ across branches, so local feature information with different granularities is finally obtained. Because of the different partition directions, our network tolerates the horizontal and vertical misalignment between different images of the same pedestrian.
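To make the branch layout concrete, a minimal sketch of a multibranch head is given below. It assumes a PyTorch backbone producing an (N, C, H, W) feature map and uses hypothetical settings (one global branch, horizontal strips at two granularities, three vertical strips, 256-dimensional embeddings); the actual MSPN branch configuration, pooling, and loss functions are not detailed in this abstract, and the identity classifiers are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiShapePartHead(nn.Module):
    """Illustrative multibranch head: global + horizontal-strip + vertical-strip features."""

    def __init__(self, in_channels=2048, embed_dim=256,
                 horizontal_parts=(2, 3), vertical_parts=3):
        super().__init__()
        self.horizontal_parts = horizontal_parts
        self.vertical_parts = vertical_parts
        # One linear reduction per pooled vector (global + every strip).
        num_locals = sum(horizontal_parts) + vertical_parts + 1
        self.reducers = nn.ModuleList(
            nn.Linear(in_channels, embed_dim) for _ in range(num_locals)
        )

    def forward(self, feat_map):                                   # (N, C, H, W)
        pooled = [F.adaptive_avg_pool2d(feat_map, 1).flatten(1)]   # global vector
        for p in self.horizontal_parts:                            # horizontal strips
            strips = F.adaptive_avg_pool2d(feat_map, (p, 1)).flatten(2)   # (N, C, p)
            pooled += [strips[:, :, i] for i in range(p)]
        strips = F.adaptive_avg_pool2d(feat_map, (1, self.vertical_parts)).flatten(2)
        pooled += [strips[:, :, i] for i in range(self.vertical_parts)]   # vertical strips
        # Reduce each pooled vector; the per-part embeddings are concatenated.
        embeds = [reduce(x) for reduce, x in zip(self.reducers, pooled)]
        return torch.cat(embeds, dim=1)                            # (N, num_locals * embed_dim)

# Toy usage with a hypothetical backbone output.
head = MultiShapePartHead()
out = head(torch.randn(4, 2048, 24, 8))   # -> shape (4, 9 * 256)
```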
Result
Comprehensive experiments on mainstream evaluation datasets, including Market-1501, DukeMTMC-ReID, and CUHK03, indicate that our method robustly achieves state-of-the-art performance. On Market-1501, it reaches 84.57% mean average precision (mAP) and 94.51% rank-1 accuracy.
Conclusion
A person re-identification method based on MSPN, which obtains highly discriminative representations of different pedestrians, is proposed in this study. The performance of person ReID is improved effectively.