Neighbor-optimized cross-domain unsupervised pedestrian re-identification algorithm
Cross-domain unsupervised Re-ID algorithm based on neighbor adversarial and consistency loss
2023, Vol. 28, No. 11, pp. 3471-3484
Print publication date: 2023-11-16
DOI: 10.11834/jig.220838
Zhu Jinlei, Li Yanfeng, Chen Houjin, Sun Jia, Pan Pan. 2023. Cross-domain unsupervised Re-ID algorithm based on neighbor adversarial and consistency loss. Journal of Image and Graphics, 28(11): 3471-3484
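Objective
Unsupervised pedestrian re-identification alleviates the high dataset annotation cost of supervised methods, and unsupervised cross-domain adaptation is its most common form. Existing unsupervised domain adaptive (UDA) pedestrian re-identification methods tend to introduce pseudo-label noise during clustering and discriminate poorly between similar-looking people.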
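Method
To address these problems, and based on the observation that features exhibit intra-class convergence, intra-class continuity, and inter-class divergence, a cross-domain unsupervised pedestrian re-identification method based on neighbor optimization is proposed. A pre-trained model is first obtained on the source domain by supervised learning and is then trained without supervision on the target domain. To strengthen the model's ability to distinguish highly similar pedestrians, a neighborhood adversarial loss function is designed: each sample forms sample pairs with the other samples, and the group of pairs with the strongest class certainty is set against the group with the strongest uncertainty. To make intra-class sample features converge in the same direction, a feature continuity loss function is designed: the feature distance curve is center-normalized, which narrows the feature distances of each sample's k nearest neighbors while preserving the inherent differences of the feature curve.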
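The center-normalization step can be illustrated with a minimal Python sketch. It is one possible reading of the description above, not the authors' implementation: the function names `knn_distance_curve` and `center_normalize`, the Euclidean metric, and the use of mean subtraction as the normalization are all assumptions made for illustration. The sketch only shows that centering a k-nearest distance curve removes its offset while leaving the gaps between neighboring points (the "inherent differences") unchanged.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_distance_curve(anchor, others, k):
    """Ascending distances from the anchor to its k nearest samples."""
    return sorted(euclidean(anchor, o) for o in others)[:k]


def center_normalize(curve):
    """Subtract the mean (assumed normalization): the curve is centered at
    zero, and the differences between its points are unchanged."""
    mean = sum(curve) / len(curve)
    return [d - mean for d in curve]


# Toy 2-D features (hypothetical values, for illustration only).
anchor = [0.0, 0.0]
others = [[3.0, 4.0], [0.0, 1.0], [6.0, 8.0], [0.0, 2.0]]

curve = knn_distance_curve(anchor, others, k=3)   # distances to the 3 nearest samples
centered = center_normalize(curve)                # same shape, zero mean

# Gaps between consecutive points are the same before and after centering.
gaps_before = [b - a for a, b in zip(curve, curve[1:])]
gaps_after = [b - a for a, b in zip(centered, centered[1:])]
```

A loss that pulls the k-nearest distances together would then operate on such curves; its exact form is not given in the abstract, so it is left out here.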
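Result
Ablation experiments demonstrate the effectiveness of each component of the loss function, and comparative experiments show that the proposed method outperforms existing methods. On the Market-1501 and DukeMTMC-reID (multi-target multi-camera person re-identification dataset from Duke University) datasets, the Rank-1/mean average precision (mAP) scores reach 92.8%/84.1% and 83.9%/71.1%, respectively.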
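Conclusion
The proposed method introduces a neighborhood adversarial loss and a neighborhood continuity loss, which strengthen the model's ability to distinguish similar people and thereby effectively improve pedestrian re-identification performance.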
Objective
Pedestrian re-identification aims to determine whether people appearing in different camera scenes are the same person. It can be regarded as a sub-problem of image retrieval and is widely used in intelligent video surveillance, criminal investigation, production safety, and other fields. Most pedestrian re-identification algorithms are supervised and rely on known labels, but labeled data are expensive and sometimes impossible to obtain. Unsupervised pedestrian re-identification therefore has greater application prospects than its supervised counterpart. However, most existing unsupervised methods are built on loss functions such as triplet loss and distinguish similar identities poorly. Although pedestrian images are affected by shooting angle, lighting, camera parameters, clothing, and other factors, pedestrian features also show strong regularity, such as intra-class feature convergence, inter-class feature divergence, and intra-class feature consistency. Different scenes have different data distributions, and large domain gaps are observed in real applications. Because the source and target domain data differ greatly in image acquisition conditions and application scenarios, applying a model trained on the source domain directly to the target domain performs poorly. Unsupervised domain adaptive (UDA) person re-identification aims to adapt a model trained on a labeled source domain to an unlabeled target domain. For pseudo-label-based UDA methods, pseudo-label noise is the main cause of model degradation, and the cross-camera problem is one of the main sources of this noise.
Method
To address the poor discrimination of similar pedestrians caused by pseudo-label noise, this paper proposes a cross-domain unsupervised pedestrian re-identification method based on neighbor optimization. To address the incorrect selection of the hardest positive and negative samples in triplet loss caused by the cross-camera problem, a camera-pseudo-label-based triplet loss is designed. Triplet-based loss depends heavily on the pseudo labels and does not fully exploit the sample similarities within the target domain. To enhance the ability to identify highly similar pedestrians, a neighborhood adversarial loss (NAL) function is designed: sample pairs are constructed between each sample and the other samples, and the pairs with the strongest certainty are set against the pairs with the strongest uncertainty. To make the intra-class features converge in the same direction, a neighborhood consistency loss (NCL) function is designed: the feature distance curve is center-normalized, and the feature distances of the k-nearest samples are narrowed while the inherent differences of the feature curve are maintained. Unlike the transfer mechanism of ordinary semi-supervised learning methods, the proposed algorithm focuses on the structure and loss function of the unsupervised learning model in the target domain. First, the input target domain samples are clustered using features from the pre-trained model, and pseudo labels are assigned according to the clustering results. Second, hard triplet loss is used to control the convergence of intra-class features and the divergence of inter-class features. To enhance the ability to distinguish similar identities, an adversarial loss function is designed in which the group of samples at a closer feature distance within a class is set against the group at a longer feature distance. Furthermore, to keep the convergence direction of class features consistent, a feature consistency loss function is designed to measure the continuity of the sample features within a batch. Finally, the three loss functions are weighted and summed to form the final loss function.
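The triplet term and the final weighted sum can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's code: the triplet term follows the common batch-hard formulation (farthest same-label sample versus closest different-label sample), the margin of 0.3 and the weights `w_nal` and `w_ncl` are hypothetical placeholders, and the NAL and NCL values are passed in as plain numbers because the abstract does not give their formulas.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(features, pseudo_labels, margin=0.3):
    """Batch-hard triplet loss over one batch.

    For each anchor, the hardest positive is its farthest sample with the
    same pseudo label, and the hardest negative is its closest sample with
    a different pseudo label. The margin value is an assumed default.
    """
    losses = []
    n = len(features)
    for i in range(n):
        pos = [euclidean(features[i], features[j])
               for j in range(n)
               if j != i and pseudo_labels[j] == pseudo_labels[i]]
        neg = [euclidean(features[i], features[j])
               for j in range(n)
               if pseudo_labels[j] != pseudo_labels[i]]
        if not pos or not neg:
            continue  # this anchor cannot form a triplet in the batch
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


def total_loss(l_triplet, l_nal, l_ncl, w_nal=1.0, w_ncl=1.0):
    """Weighted sum of the three loss terms (weights are hypothetical)."""
    return l_triplet + w_nal * l_nal + w_ncl * l_ncl


# Two well-separated toy clusters (hypothetical 2-D features).
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]

clean = batch_hard_triplet_loss(features, [0, 0, 1, 1])   # labels match the clusters
noisy = batch_hard_triplet_loss(features, [0, 1, 0, 1])   # labels cut across the clusters
```

With labels that match the clusters the triplet term is zero, while labels that cut across the clusters produce a large loss (about 4.3 here), which illustrates why the triplet term is sensitive to pseudo-label noise.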
Result
Experimental results on the Market-1501 and DukeMTMC-reID datasets show that the proposed method has advantages over state-of-the-art methods. Ablation experiments confirm the effectiveness of each part of the loss function and show that the three loss functions complement one another in clustering: once the intra-class convergence and inter-class divergence of features are accounted for, additionally enforcing a consistent convergence direction further improves pedestrian re-identification performance. Comparative experiments show that the proposed method outperforms existing methods, and the parameter experiments show how different hyperparameter values affect recognition performance. Rank-1/mean average precision (mAP) reaches 92.8%/84.1% and 83.9%/71.1% on the Market-1501 and DukeMTMC-reID datasets, respectively. The experiments further show that similar people are prone to receiving noisy pseudo labels during clustering and that the proposed method can control this label noise through the NAL loss function. Complementing NAL, the NCL loss function controls the consistency of the features of the k-nearest samples. Under the combined action of the NAL and NCL loss functions, the noise is effectively controlled, and unsupervised learning on the target domain is improved.
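For readers unfamiliar with the two reported metrics, the following Python sketch shows how Rank-1 and mAP are commonly computed from ranked retrieval results. It is a simplified illustration rather than the evaluation code used in the paper: the function names are hypothetical, the input is assumed to be, for each query, a list of booleans marking which ranked gallery items share the query's identity, and the same-camera filtering used by standard benchmark protocols is omitted.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches[i] is True when the gallery item at rank i + 1
    has the same identity as the query.
    """
    hits = 0
    precisions = []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)  # precision at this correct hit
    return sum(precisions) / len(precisions) if precisions else 0.0


def rank1_and_map(all_ranked_matches):
    """Rank-1 accuracy and mean average precision over all queries.

    Queries with no correct gallery item are skipped.
    """
    valid = [m for m in all_ranked_matches if any(m)]
    if not valid:
        return 0.0, 0.0
    rank1 = sum(1 for m in valid if m[0]) / len(valid)
    mean_ap = sum(average_precision(m) for m in valid) / len(valid)
    return rank1, mean_ap


# Two toy queries (hypothetical rankings): the first has its top result
# correct, the second finds its only match at rank 2.
queries = [
    [True, False, True],
    [False, True, False],
]
rank1, mean_ap = rank1_and_map(queries)
```

In this toy case the per-query average precisions are 5/6 and 1/2, so Rank-1 is 0.5 and mAP is 2/3. Rank-1 rewards only the top result, whereas mAP also reflects how early all correct matches are retrieved.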
Conclusion
The proposed method improves the adaptability of the network model through unsupervised training in the target domain. With the neighbor adversarial loss and neighbor consistency loss functions, the method distinguishes similar people more reliably, thus effectively improving the performance and robustness of pedestrian re-identification. Ablation and comparative experiments on public datasets show that the algorithm performs significantly better than existing methods.
Keywords: pedestrian re-identification (Re-ID); unsupervised learning; cross-domain transfer learning; neighbor adversarial loss (NAL); neighbor consistency loss (NCL)