融合全局与空间多尺度上下文信息的车辆重识别
Global and spatial multi-scale contexts fusion for vehicle re-identification
2023, Vol. 28, No. 2: 471-482
Published in print: 2023-02-16
Accepted: 2022-01-17
DOI: 10.11834/jig.210849
王振学, 许喆铭, 雪洋洋, 郎丛妍, 李尊, 魏莉莉. 融合全局与空间多尺度上下文信息的车辆重识别[J]. 中国图象图形学报, 2023,28(2):471-482.
Zhenxue Wang, Zheming Xu, Yangyang Xue, Congyan Lang, Zun Li, Lili Wei. Global and spatial multi-scale contexts fusion for vehicle re-identification[J]. Journal of Image and Graphics, 2023,28(2):471-482.
目的
车辆重识别指判断不同摄像设备拍摄的车辆图像是否属于同一辆车的检索问题。现有车辆重识别算法使用车辆的全局特征或额外的标注信息,忽略了对多尺度上下文信息的有效抽取。对此,本文提出了一种融合全局与空间多尺度上下文信息的车辆重识别模型。
方法
首先,设计一个全局上下文特征选择模块,提取车辆的细粒度判别信息,并且进一步设计了一个多尺度空间上下文特征选择模块,利用多尺度下采样的方式,从全局上下文特征选择模块输出的判别特征中获得其对应的多尺度特征。然后,选择性地集成来自多级特征的空间上下文信息,生成车辆图像的前景特征响应图,以此提升模型对于车辆空间位置特征的感知能力。最后,模型组合了标签平滑的交叉熵损失函数和三元组损失函数,以提升模型对强判别车辆特征的整体学习能力。
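上述标签平滑交叉熵与三元组损失的组合,可用如下 NumPy 最小示意说明。其中平滑目标分布的具体写法与两项损失等权相加均为常见做法,属本示意的假设,并非原文的实现细节:

```python
import numpy as np

def log_softmax(z):
    # 数值稳定的 log-softmax
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def label_smooth_ce(logits, label, eps=0.1):
    # 标签平滑交叉熵(单样本):真实类别分得 1-eps,其余类别均分 eps(常见写法,属假设)
    n = logits.shape[0]
    target = np.full(n, eps / n)
    target[label] += 1.0 - eps
    return -(target * log_softmax(logits)).sum()

def triplet_loss(anchor, pos, neg, margin=0.3):
    # 三元组损失:拉近同一车辆(anchor-pos),推远不同车辆(anchor-neg)
    d_ap = np.linalg.norm(anchor - pos)
    d_an = np.linalg.norm(anchor - neg)
    return max(0.0, d_ap - d_an + margin)

def total_loss(logits, label, anchor, pos, neg):
    # 组合损失:两项等权相加(权重为本示意的假设)
    return label_smooth_ce(logits, label) + triplet_loss(anchor, pos, neg)
```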
结果
在VeRi-776(vehicle re-identification-776)数据集上,与模型PNVR(part-regularized near-duplicate vehicle re-identification)相比,本文模型的mAP(mean average precision)和rank-1(cumulative matching curve at rank 1)评价指标分别提升了2.3%和2.0%。在该数据集上的消融实验验证了各模块的有效性。在Vehicle ID数据集的大规模测试子集上,本文模型的rank-1和rank-5(cumulative matching curve at rank 5)比PNVR分别提升了0.8%和4.5%。
结论
本文算法利用全局上下文特征和多尺度空间特征,提升了拍摄视角变化、遮挡等情况下车辆重识别的准确率,实验结果充分表明了所提模型的有效性与可行性。
Objective
Vehicle re-identification aims to determine whether vehicle images captured by multiple cameras with non-overlapping views belong to the same vehicle. It has broad applications in computer vision, such as intelligent transportation systems and public traffic security. Early sensor-based methods use hardware detectors as the information source, but they struggle to capture effective vehicle attributes such as color, length, and shape. Hand-crafted methods instead extract features such as edges, colors, and corners, yet such features are difficult to match reliably under camera-view variation, low resolution, and object occlusion in the captured images. With the emergence of deep learning, vehicle re-identification has advanced dramatically. Recent methods fall into two categories: 1) feature learning and 2) metric learning. However, most of them rely on vehicle appearance from the captured views or on additional annotations such as vehicle attributes, spatio-temporal information, and vehicle orientation; they therefore suffer from the loss of multi-scale contextual information and lack the ability to select discriminative features. To address these issues, we develop a novel global and spatial multi-scale contexts fusion method for vehicle re-identification (GSMC).
Method
Our method exploits global contextual information and multi-scale spatial information for the vehicle re-identification task. Specifically, GSMC consists of two main modules: 1) a global contextual selection module and 2) a multi-scale spatial contextual selection module. A residual network is used as the backbone to extract the global feature as the original feature. The global contextual selection module divides the original feature map into several parts along the spatial dimension, and a convolution layer with 1×1 kernels reduces the channel dimension. A softmax layer then produces a weight for each part, which represents the contribution of that part to the re-identification task. To extract more discriminative vehicle information, the optimized feature is fused back into the original feature. In addition, to obtain a more discriminative representation, the output feature is divided into multiple horizontal local features, and these local features replace the global feature in classification learning. To alleviate feature loss at part boundaries, adjacent local features overlap by a length of 1. Furthermore, the multi-scale spatial contextual selection module obtains multi-scale spatial features via down-sampling at different rates and selectively integrates them to generate a foreground feature response map of the vehicle image, which enhances the model's perception of the vehicle's spatial location. To strengthen the foreground, a larger weight is adaptively assigned to the vehicle region, while a smaller weight is assigned to the background to suppress its interference and select more robust spatial contextual information. Finally, the features from the two modules are fused as the final representation of the vehicle. To obtain a fine-grained feature space, GSMC is trained jointly with the label-smoothed cross-entropy loss and the triplet loss, which improves its overall ability to learn strongly discriminative vehicle features. During training, a warm-up learning-rate strategy is applied in the first 5 epochs to keep the model stable and accelerate convergence.
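The part-wise weighting described above can be illustrated with the following NumPy sketch: it splits a feature map along its height, scores each part with a stand-in projection (a placeholder for the paper's learned 1×1 convolution), turns the scores into softmax weights, and fuses the re-weighted parts back into the original feature. All names, shapes, and the residual fusion are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def global_context_select(feat, num_parts=4, w_reduce=None, rng=None):
    """Part-wise re-weighting of a (C, H, W) feature map: split along the
    height, score each part, softmax the scores into part weights, and fuse
    the re-weighted parts back into the original feature (residual)."""
    C, H, W = feat.shape
    if rng is None:
        rng = np.random.default_rng(0)
    if w_reduce is None:
        # stand-in for the learned 1x1-conv dimension reduction (assumption)
        w_reduce = rng.standard_normal(C) / np.sqrt(C)
    parts = np.array_split(feat, num_parts, axis=1)      # split along height
    # pool each part over space, then project to a scalar score
    scores = np.array([w_reduce @ p.mean(axis=(1, 2)) for p in parts])
    weights = softmax(scores)          # contribution of each part (sums to 1)
    weighted = [p * w for p, w in zip(parts, weights)]
    return feat + np.concatenate(weighted, axis=1)       # residual fusion
```

The residual form keeps the original feature intact while letting the softmax weights emphasize the most discriminative spatial parts.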
Result
To validate the effectiveness of the proposed approach on the vehicle re-identification task, we compare our model with state-of-the-art methods on two public benchmarks, the VehicleID and VeRi-776 (vehicle re-identification-776) datasets. The evaluation metrics are mean average precision (mAP) and the cumulative matching curve (CMC), which represents the probability that the probe identity appears in the retrieved list. We carry out comparative analyses against methods that use additional non-visual information as well as multi-view learning methods, and our model surpasses PNVR (part-regularized near-duplicate vehicle re-identification) by a large margin. On the VehicleID dataset, we improve rank-1 by 5.1%, 4.1%, and 0.8%, and rank-5 by 4.4%, 5.7%, and 4.5% on the three test subsets of different sizes. On the VeRi-776 dataset, GSMC gains 2.3% and 2.0% over PNVR in terms of mAP and rank-1, respectively. The accuracy at the lower CMC ranks shows that our method promotes the ranking of vehicle images captured from difficult viewpoints. Furthermore, when a re-ranking strategy is applied as a post-processing step on the VeRi-776 dataset, the mAP, rank-1, and rank-5 scores improve significantly. To verify the necessity of each module, we design ablation experiments that examine whether a single branch can extract discriminative features and confirm the effectiveness of fusing the two modules. When the modules are added sequentially, the combination achieves large improvements in mAP, rank-1, and rank-5. Through the comparative results, the attention heat-map visualizations, and the foreground feature response maps, we conclude that the proposed modules are effective and can pull images of the same vehicle identity closer while pushing different vehicles apart.
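The mAP and CMC metrics used above can be computed from a query-gallery distance matrix as in the following sketch (identity labels only; the camera-id filtering used in full re-identification protocols is omitted for brevity):

```python
import numpy as np

def evaluate(dist, query_ids, gallery_ids):
    """Compute mAP and the CMC curve from a (Q, G) distance matrix,
    where smaller distance means more similar. query_ids and gallery_ids
    are NumPy integer arrays of identity labels."""
    Q, G = dist.shape
    cmc = np.zeros(G)
    aps = []
    for q in range(Q):
        order = np.argsort(dist[q])                  # gallery, nearest first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue
        first_hit = np.argmax(matches)               # rank of first true match
        cmc[first_hit:] += 1                         # CMC counts hits by rank k
        # average precision: mean of precision at each true-match rank
        hit_ranks = np.flatnonzero(matches)
        precisions = (np.arange(len(hit_ranks)) + 1) / (hit_ranks + 1)
        aps.append(precisions.mean())
    cmc /= Q
    return float(np.mean(aps)), cmc                  # mAP, cmc[k-1] = rank-k
```

For example, if every query's true match is retrieved first, both mAP and rank-1 equal 1.0.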
Conclusion
To address the vehicle re-identification problem, we develop a model built on a global contextual selection module and a multi-scale spatial contextual selection module. Extensive experiments on the two popular public datasets demonstrate the effectiveness of the proposed model.
车辆重识别;深度学习;局部可区分性特征;特征选择;多尺度空间特征
vehicle re-identification; deep learning; local discriminative features; feature selection; multi-scale spatial features
Bai Y, Lou Y H, Gao F, Wang S Q, Wu Y W and Duan L Y. 2018. Group-sensitive triplet embedding for vehicle reidentification. IEEE Transactions on Multimedia, 20(9): 2385-2399 [DOI: 10.1109/TMM.2018.2796240]
Chen H, Lagadec B and Bremond F. 2019. Partition and reunion: a two-branch neural network for vehicle re-identification//Proceedings of 2019 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Long Beach, USA: IEEE: 184-192
Chen T S, Liu C T, Wu C W and Chien S Y. 2020. Orientation-aware vehicle re-identification with semantics-guided part attention network//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 330-346 [DOI: 10.1007/978-3-030-58536-5_20]
Chu R H, Sun Y F, Li Y D, Liu Z, Zhang C and Wei Y C. 2019. Vehicle re-identification with viewpoint-aware metric learning//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8281-8290 [DOI: 10.1109/ICCV.2019.00837]
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848]
He B, Li J, Zhao Y F and Tian Y H. 2019a. Part-regularized near-duplicate vehicle re-identification//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3992-4000 [DOI: 10.1109/CVPR.2019.00412]
He L X, Wang Y G, Liu W, Zhao H, Sun Z and Feng J S. 2019b. Foreground-aware pyramid reconstruction for alignment-free occluded person re-identification//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8449-8458 [DOI: 10.1109/ICCV.2019.00854]
Hermans A, Beyer L and Leibe B. 2017. In defense of the triplet loss for person re-identification [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1703.07737.pdf
Jeng S T C and Chu L Y. 2013. Vehicle reidentification with the inductive loop signature technology. Journal of the Eastern Asia Society for Transportation Studies, 10: 1896-1915 [DOI: 10.11175/easts.10.1896]
Jiang N, Xu Y, Zhou Z and Wu W. 2018. Multi-attribute driven vehicle re-identification with spatial-temporal re-ranking//Proceedings of the 25th IEEE International Conference on Image Processing (ICIP). Athens, Greece: IEEE: 858-862 [DOI: 10.1109/ICIP.2018.8451776]
Khorramshahi P, Kumar A, Peri N, Rambhatla S S, Chen J C and Chellappa R. 2019. A dual-path model with adaptive attention for vehicle re-identification//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6131-6140 [DOI: 10.1109/ICCV.2019.00623]
Li Y Q, Li Y H, Yan H F and Liu J Y. 2017. Deep joint discriminative learning for vehicle re-identification and retrieval//Proceedings of 2017 IEEE International Conference on Image Processing (ICIP). Beijing, China: IEEE: 395-399 [DOI: 10.1109/ICIP.2017.8296310]
Lin W P, Li Y D, Yang X L, Peng P X and Xing J L. 2019. Multi-view learning for vehicle re-identification//Proceedings of 2019 IEEE International Conference on Multimedia and Expo (ICME). Shanghai, China: IEEE: 832-837 [DOI: 10.1109/ICME.2019.00148]
Liu H Y, Tian Y H, Yang Y W, Pang L and Huang T J. 2016a. Deep relative distance learning: tell the difference between similar vehicles//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2167-2175 [DOI: 10.1109/CVPR.2016.238]
Liu X B, Zhang S L, Huang Q M and Gao W. 2018a. RAM: a region-aware deep model for vehicle re-identification//Proceedings of 2018 IEEE International Conference on Multimedia and Expo (ICME). San Diego, USA: IEEE: 1-6 [DOI: 10.1109/ICME.2018.8486589]
Liu X C, Liu W, Ma H D and Fu H Y. 2016b. Large-scale vehicle reidentification in urban surveillance videos//Proceedings of 2016 IEEE International Conference on Multimedia and Expo (ICME). Seattle, USA: IEEE: 1-6 [DOI: 10.1109/ICME.2016.7553002]
Liu X C, Liu W, Mei T and Ma H D. 2016c. A deep learning-based approach to progressive vehicle re-identification for urban surveillance//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 869-884 [DOI: 10.1007/978-3-319-46475-6_53]
Liu X C, Liu W, Mei T and Ma H D. 2018b. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance. IEEE Transactions on Multimedia, 20(3): 645-658 [DOI: 10.1109/TMM.2017.2751966]
Liu X C, Liu W, Zheng J K, Yan C G and Mei T. 2020. Beyond the parts: learning multi-view cross-part correlation for vehicle re-identification//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM: 907-915 [DOI: 10.1145/3394171.3413578]
Müller R, Kornblith S and Hinton G. 2020. When does label smoothing help? [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1906.02629.pdf
Pan H P, Wang Y T and Ma M. 2021. Vehicle re-identification methods based on attention mechanism and multi-scale fusion learning. Journal of Zhejiang Sci-Tech University (Natural Sciences Edition), 45(5): 657-665
潘海鹏, 王云涛, 马淼. 2021. 基于注意力机制与多尺度融合学习的车辆重识别方法. 浙江理工大学学报(自然科学版), 45(5): 657-665 [DOI: 10.3969/j.issn.1673-3851(n).2021.05.011]
Pan X G, Luo P, Shi J P and Tang X O. 2018. Two at once: enhancing learning and generalization capacities via IBN-net//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 484-500 [DOI: 10.1007/978-3-030-01225-0_29]
Qian J J, Jiang W, Luo H and Yu H Y. 2019. Stripe-based and attribute-aware network: a two-branch deep model for vehicle re-identification [EB/OL]. [2021-06-03]. https://arxiv.org/pdf/1910.05549.pdf
Qiu M K and Li X Y. 2021. Detail-aware discriminative feature learning model for vehicle re-identification. Acta Scientiarum Naturalium Universitatis Sunyatseni, 60(4): 111-120
邱铭凯, 李熙莹. 2021. 用于车辆重识别的基于细节感知的判别特征学习模型. 中山大学学报(自然科学版), 60(4): 111-120 [DOI: 10.13471/j.cnki.acta.snus.2020.03.16.2020B023]
Shen Y T, Xiao T, Li H S, Yi S and Wang X G. 2017. Learning deep neural networks for vehicle Re-ID with visual-spatio-temporal path proposals//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1918-1927 [DOI: 10.1109/ICCV.2017.210]
Wang Z D, Tang L M, Liu X H, Yao Z L, Yi S, Shao J, Yan J J, Wang S J, Li H S and Wang X G. 2017. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 379-387 [DOI: 10.1109/ICCV.2017.49]
Xu Z M, Wei L L, Lang C Y, Feng S H, Wang T and Bors A G. 2020. HSS-GCN: a hierarchical spatial structural graph convolutional network for vehicle re-identification//Pattern Recognition. ICPR International Workshops and Challenges. Milan, Italy: Springer: 356-364 [DOI: 10.1007/978-3-030-68821-9_32]
Zhong Z, Zheng L, Cao D L and Li S Z. 2017. Re-ranking person reidentification with k-reciprocal encoding//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 3652-3661 [DOI: 10.1109/CVPR.2017.389]
Zhou Y and Shao L. 2017. Cross-view GAN based vehicle generation for re-identification//Proceedings of 2017 British Machine Vision Conference (BMVC). [s.l.]: BMVA Press: 186.1-186.12 [DOI: 10.5244/c.31.186]
Zhou Y and Shao L. 2018. Viewpoint-aware attentive multi-view inference for vehicle re-identification//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE: 6489-6498 [DOI: 10.1109/CVPR.2018.00679]