融合全局与空间多尺度上下文信息的车辆重识别
Global and spatial multi-scale contexts fusion for vehicle re-identification
2023, Vol. 28, No. 2: 471-482
Published in print: 2023-02-16
Accepted: 2022-01-17
DOI: 10.11834/jig.210849
王振学, 许喆铭, 雪洋洋, 郎丛妍, 李尊, 魏莉莉. 融合全局与空间多尺度上下文信息的车辆重识别[J]. 中国图象图形学报, 2023,28(2):471-482.
Zhenxue Wang, Zheming Xu, Yangyang Xue, Congyan Lang, Zun Li, Lili Wei. Global and spatial multi-scale contexts fusion for vehicle re-identification[J]. Journal of Image and Graphics, 2023,28(2):471-482.
目的
车辆重识别指判断不同摄像设备拍摄的车辆图像是否属于同一辆车的检索问题。现有车辆重识别算法使用车辆的全局特征或额外的标注信息,忽略了对多尺度上下文信息的有效抽取。对此,本文提出了一种融合全局与空间多尺度上下文信息的车辆重识别模型。
方法
首先,设计一个全局上下文特征选择模块,提取车辆的细粒度判别信息,并且进一步设计了一个多尺度空间上下文特征选择模块,利用多尺度下采样的方式,从全局上下文特征选择模块输出的判别特征中获得其对应的多尺度特征。然后,选择性地集成来自多级特征的空间上下文信息,生成车辆图像的前景特征响应图,以此提升模型对于车辆空间位置特征的感知能力。最后,模型组合了标签平滑的交叉熵损失函数和三元组损失函数,以提升模型对强判别车辆特征的整体学习能力。
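上述标签平滑交叉熵与三元组损失的组合,可用如下 NumPy 最小示意说明。其中平滑目标分布的具体写法与两项损失等权相加均为常见做法,属本示意的假设,并非原文的实现细节:

```python
import numpy as np

def log_softmax(z):
    # 数值稳定的 log-softmax
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def label_smooth_ce(logits, label, eps=0.1):
    # 标签平滑交叉熵(单样本):真实类别分得 1-eps,其余类别均分 eps(常见写法,属假设)
    n = logits.shape[0]
    target = np.full(n, eps / n)
    target[label] += 1.0 - eps
    return -(target * log_softmax(logits)).sum()

def triplet_loss(anchor, pos, neg, margin=0.3):
    # 三元组损失:拉近同一车辆(anchor-pos),推远不同车辆(anchor-neg)
    d_ap = np.linalg.norm(anchor - pos)
    d_an = np.linalg.norm(anchor - neg)
    return max(0.0, d_ap - d_an + margin)

def total_loss(logits, label, anchor, pos, neg):
    # 组合损失:两项等权相加(权重为本示意的假设)
    return label_smooth_ce(logits, label) + triplet_loss(anchor, pos, neg)
```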
结果
在VeRi-776(vehicle re-identification-776)数据集上,与模型PNVR(part-regularized near-duplicate vehicle re-identification)相比,本文模型的mAP(mean average precision)和rank-1(cumulative matching curve at rank 1)评价指标分别提升了2.3%和2.0%。在该数据集上的消融实验验证了各模块的有效性。在Vehicle ID数据集的大规模测试子集上,本文模型的rank-1和rank-5(cumulative matching curve at rank 5)比PNVR分别提升了0.8%和4.5%。
结论
本文算法利用全局上下文特征和多尺度空间特征,提升了拍摄视角变化、遮挡等情况下车辆重识别的准确率,实验结果充分表明了所提模型的有效性与可行性。
Objective
Vehicle re-identification aims to determine whether vehicle images captured by multiple cameras with non-overlapping views belong to the same vehicle. It has broad applications in computer vision, such as intelligent transportation systems and public traffic security. Early sensor-based methods use hardware detectors as the information source, but they struggle to capture effective vehicle attributes such as color, length, and shape. Hand-crafted methods instead extract features such as edges, colors, and corners, yet such features are difficult to match reliably under camera-view variation, low resolution, and object occlusion in the captured images. With the emergence of deep learning, vehicle re-identification has advanced dramatically. Recent methods fall into two categories: 1) feature learning and 2) metric learning. However, most of them rely on vehicle appearance from the captured views or on additional annotations such as vehicle attributes, spatio-temporal information, and vehicle orientation; they therefore suffer from the loss of multi-scale contextual information and lack the ability to select discriminative features. To address these issues, we develop a novel global and spatial multi-scale contexts fusion method for vehicle re-identification (GSMC).
Method
Our method exploits global contextual information and multi-scale spatial information for the vehicle re-identification task. Specifically, GSMC consists of two main modules: 1) a global contextual selection module and 2) a multi-scale spatial contextual selection module. A residual network is used as the backbone to extract the global feature as the original feature. The global contextual selection module divides the original feature map into several parts along the spatial dimension, and a convolution layer with 1×1 kernels reduces the channel dimension. A softmax layer then produces a weight for each part, which represents the contribution of that part to the re-identification task. To extract more discriminative vehicle information, the optimized feature is fused back into the original feature. In addition, to obtain a more discriminative representation, the output feature is divided into multiple horizontal local features, and these local features replace the global feature in classification learning. To alleviate feature loss at part boundaries, adjacent local features overlap by a length of 1. Furthermore, the multi-scale spatial contextual selection module obtains multi-scale spatial features via down-sampling at different rates and selectively integrates them to generate a foreground feature response map of the vehicle image, which enhances the model's perception of the vehicle's spatial location. To strengthen the foreground, a larger weight is adaptively assigned to the vehicle region, while a smaller weight is assigned to the background to suppress its interference and select more robust spatial contextual information. Finally, the features from the two modules are fused as the final representation of the vehicle. To obtain a fine-grained feature space, GSMC is trained jointly with the label-smoothed cross-entropy loss and the triplet loss, which improves its overall ability to learn strongly discriminative vehicle features. During training, a warm-up learning-rate strategy is applied in the first 5 epochs to keep the model stable and accelerate convergence.
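The part-wise weighting described above can be illustrated with the following NumPy sketch: it splits a feature map along its height, scores each part with a stand-in projection (a placeholder for the paper's learned 1×1 convolution), turns the scores into softmax weights, and fuses the re-weighted parts back into the original feature. All names, shapes, and the residual fusion are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def global_context_select(feat, num_parts=4, w_reduce=None, rng=None):
    """Part-wise re-weighting of a (C, H, W) feature map: split along the
    height, score each part, softmax the scores into part weights, and fuse
    the re-weighted parts back into the original feature (residual)."""
    C, H, W = feat.shape
    if rng is None:
        rng = np.random.default_rng(0)
    if w_reduce is None:
        # stand-in for the learned 1x1-conv dimension reduction (assumption)
        w_reduce = rng.standard_normal(C) / np.sqrt(C)
    parts = np.array_split(feat, num_parts, axis=1)      # split along height
    # pool each part over space, then project to a scalar score
    scores = np.array([w_reduce @ p.mean(axis=(1, 2)) for p in parts])
    weights = softmax(scores)          # contribution of each part (sums to 1)
    weighted = [p * w for p, w in zip(parts, weights)]
    return feat + np.concatenate(weighted, axis=1)       # residual fusion
```

The residual form keeps the original feature intact while letting the softmax weights emphasize the most discriminative spatial parts.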
Result
To validate the effectiveness of the proposed approach on the vehicle re-identification task, we compare our model with state-of-the-art methods on two public benchmarks, the VehicleID and VeRi-776 (vehicle re-identification-776) datasets. The evaluation metrics are mean average precision (mAP) and the cumulative matching curve (CMC), which represents the probability that the probe identity appears in the retrieved list. We carry out comparative analyses against methods that use additional non-visual information as well as multi-view learning methods, and our model surpasses PNVR (part-regularized near-duplicate vehicle re-identification) by a large margin. On the VehicleID dataset, we improve rank-1 by 5.1%, 4.1%, and 0.8%, and rank-5 by 4.4%, 5.7%, and 4.5% on the three test subsets of different sizes. On the VeRi-776 dataset, GSMC gains 2.3% and 2.0% over PNVR in terms of mAP and rank-1, respectively. The accuracy at the lower CMC ranks shows that our method promotes the ranking of vehicle images captured from difficult viewpoints. Furthermore, when a re-ranking strategy is applied as a post-processing step on the VeRi-776 dataset, the mAP, rank-1, and rank-5 scores improve significantly. To verify the necessity of each module, we design ablation experiments that examine whether a single branch can extract discriminative features and confirm the effectiveness of fusing the two modules. When the modules are added sequentially, the combination achieves large improvements in mAP, rank-1, and rank-5. Through the comparative results, the attention heat-map visualizations, and the foreground feature response maps, we conclude that the proposed modules are effective and can pull images of the same vehicle identity closer while pushing different vehicles apart.
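The mAP and CMC metrics used above can be computed from a query-gallery distance matrix as in the following sketch (identity labels only; the camera-id filtering used in full re-identification protocols is omitted for brevity):

```python
import numpy as np

def evaluate(dist, query_ids, gallery_ids):
    """Compute mAP and the CMC curve from a (Q, G) distance matrix,
    where smaller distance means more similar. query_ids and gallery_ids
    are NumPy integer arrays of identity labels."""
    Q, G = dist.shape
    cmc = np.zeros(G)
    aps = []
    for q in range(Q):
        order = np.argsort(dist[q])                  # gallery, nearest first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue
        first_hit = np.argmax(matches)               # rank of first true match
        cmc[first_hit:] += 1                         # CMC counts hits by rank k
        # average precision: mean of precision at each true-match rank
        hit_ranks = np.flatnonzero(matches)
        precisions = (np.arange(len(hit_ranks)) + 1) / (hit_ranks + 1)
        aps.append(precisions.mean())
    cmc /= Q
    return float(np.mean(aps)), cmc                  # mAP, cmc[k-1] = rank-k
```

For example, if every query's true match is retrieved first, both mAP and rank-1 equal 1.0.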
Conclusion
To address the vehicle re-identification problem, we develop a model built on a global contextual selection module and a multi-scale spatial contextual selection module. Extensive experiments on the two popular public datasets demonstrate the effectiveness of the proposed model.
车辆重识别;深度学习;局部可区分性特征;特征选择;多尺度空间特征
vehicle re-identification; deep learning; local discriminative features; feature selection; multi-scale spatial features
Bai Y, Lou Y H, Gao F, Wang S Q, Wu Y W and Duan L Y. 2018. Group-sensitive triplet embedding for vehicle reidentification. IEEE Transactions on Multimedia, 20(9): 2385-2399 [DOI: 10.1109/TMM.2018.2796240]
Chen H, Lagadec B and Bremond F. 2019. Partition and reunion: a two-branch neural network for vehicle re-identification//Proceedings of 2019 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Long Beach, USA: IEEE: 184-192
Chen T S, Liu C T, Wu C W and Chien S Y. 2020. Orientation-aware vehicle re-identification with semantics-guided part attention network//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 330-346 [DOI: 10.1007/978-3-030-58536-5_20]
Chu R H, Sun Y F, Li Y D, Liu Z, Zhang C and Wei Y C. 2019. Vehicle re-identification with viewpoint-aware metric learning//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8281-8290 [DOI: 10.1109/ICCV.2019.00837]
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848]
He B, Li J, Zhao Y F and Tian Y H. 2019a. Part-regularized near-duplicate vehicle re-identification//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3992-4000 [DOI: 10.1109/CVPR.2019.00412]
He L X, Wang Y G, Liu W, Zhao H, Sun Z and Feng J S. 2019b. Foreground-aware pyramid reconstruction for alignment-free occluded person re-identification//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8449-8458 [DOI: 10.1109/ICCV.2019.00854]
Hermans A, Beyer L and Leibe B. 2017. In defense of the triplet loss for person re-identification [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1703.07737.pdf
Jeng S T C and Chu L Y. 2013. Vehicle reidentification with the inductive loop signature technology. Journal of the Eastern Asia Society for Transportation Studies, 10: 1896-1915 [DOI: 10.11175/easts.10.1896]
Jiang N, Xu Y, Zhou Z and Wu W. 2018. Multi-attribute driven vehicle re-identification with spatial-temporal re-ranking//Proceedings of the 25th IEEE International Conference on Image Processing (ICIP). Athens, Greece: IEEE: 858-862 [DOI: 10.1109/ICIP.2018.8451776]
Khorramshahi P, Kumar A, Peri N, Rambhatla S S, Chen J C and Chellappa R. 2019. A dual-path model with adaptive attention for vehicle re-identification//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6131-6140 [DOI: 10.1109/ICCV.2019.00623]
Li Y Q, Li Y H, Yan H F and Liu J Y. 2017. Deep joint discriminative learning for vehicle re-identification and retrieval//Proceedings of 2017 IEEE International Conference on Image Processing (ICIP). Beijing, China: IEEE: 395-399 [DOI: 10.1109/ICIP.2017.8296310]
Lin W P, Li Y D, Yang X L, Peng P X and Xing J L. 2019. Multi-view learning for vehicle re-identification//Proceedings of 2019 IEEE International Conference on Multimedia and Expo (ICME). Shanghai, China: IEEE: 832-837 [DOI: 10.1109/ICME.2019.00148]
Liu H Y, Tian Y H, Yang Y W, Pang L and Huang T J. 2016a. Deep relative distance learning: tell the difference between similar vehicles//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2167-2175 [DOI: 10.1109/CVPR.2016.238]
Liu X B, Zhang S L, Huang Q M and Gao W. 2018a. RAM: a region-aware deep model for vehicle re-identification//Proceedings of 2018 IEEE International Conference on Multimedia and Expo (ICME). San Diego, USA: IEEE: 1-6 [DOI: 10.1109/ICME.2018.8486589]
Liu X C, Liu W, Ma H D and Fu H Y. 2016b. Large-scale vehicle reidentification in urban surveillance videos//Proceedings of 2016 IEEE International Conference on Multimedia and Expo (ICME). Seattle, USA: IEEE: 1-6 [DOI: 10.1109/ICME.2016.7553002]
Liu X C, Liu W, Mei T and Ma H D. 2016c. A deep learning-based approach to progressive vehicle re-identification for urban surveillance//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 869-884 [DOI: 10.1007/978-3-319-46475-6_53]
Liu X C, Liu W, Mei T and Ma H D. 2018b. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance. IEEE Transactions on Multimedia, 20(3): 645-658 [DOI: 10.1109/TMM.2017.2751966]
Liu X C, Liu W, Zheng J K, Yan C G and Mei T. 2020. Beyond the parts: learning multi-view cross-part correlation for vehicle re-identification//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM: 907-915 [DOI: 10.1145/3394171.3413578]
Müller R, Kornblith S and Hinton G. 2020. When does label smoothing help? [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1906.02629.pdf
Pan H P, Wang Y T and Ma M. 2021. Vehicle re-identification methods based on attention mechanism and multi-scale fusion learning. Journal of Zhejiang Sci-Tech University (Natural Sciences Edition), 45(5): 657-665
潘海鹏, 王云涛, 马淼. 2021. 基于注意力机制与多尺度融合学习的车辆重识别方法. 浙江理工大学学报(自然科学版), 45(5): 657-665 [DOI: 10.3969/j.issn.1673-3851(n).2021.05.011]
Pan X G, Luo P, Shi J P and Tang X O. 2018. Two at once: enhancing learning and generalization capacities via IBN-net//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 484-500 [DOI: 10.1007/978-3-030-01225-0_29]
Qian J J, Jiang W, Luo H and Yu H Y. 2019. Stripe-based and attribute-aware network: a two-branch deep model for vehicle re-identification [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1910.05549.pdf
Qiu M K and Li X Y. 2021. Detail-aware discriminative feature learning model for vehicle re-identification. Acta Scientiarum Naturalium Universitatis Sunyatseni, 60(4): 111-120
邱铭凯, 李熙莹. 2021. 用于车辆重识别的基于细节感知的判别特征学习模型. 中山大学学报(自然科学版), 60(4): 111-120 [DOI: 10.13471/j.cnki.acta.snus.2020.03.16.2020B023]
Shen Y T, Xiao T, Li H S, Yi S and Wang X G. 2017. Learning deep neural networks for vehicle Re-ID with visual-spatio-temporal path proposals//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1918-1927 [DOI: 10.1109/ICCV.2017.210]
Wang Z D, Tang L M, Liu X H, Yao Z L, Yi S, Shao J, Yan J J, Wang S J, Li H S and Wang X G. 2017. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 379-387 [DOI: 10.1109/ICCV.2017.49]
Xu Z M, Wei L L, Lang C Y, Feng S H, Wang T and Bors A G. 2020. HSS-GCN: a hierarchical spatial structural graph convolutional network for vehicle re-identification//Pattern Recognition. ICPR International Workshops and Challenges. Milan, Italy: Springer: 356-364 [DOI: 10.1007/978-3-030-68821-9_32]
Zhong Z, Zheng L, Cao D L and Li S Z. 2017. Re-ranking person reidentification with k-reciprocal encoding//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 3652-3661 [DOI: 10.1109/CVPR.2017.389]
Zhou Y and Shao L. 2017. Cross-view GAN based vehicle generation for re-identification//Proceedings of 2017 British Machine Vision Conference (BMVC). [s.l.]: BMVA Press: 186.1-186.12 [DOI: 10.5244/c.31.186]
Zhou Y and Shao L. 2018. Viewpoint-aware attentive multi-view inference for vehicle re-identification//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE: 6489-6498 [DOI: 10.1109/CVPR.2018.00679]