Vehicle retrieval method based on multi-task segmented compact features
2018, Vol. 23, No. 12, pp. 1801-1812
Received: 2018-05-10; Revised: 2018-06-14; Published in print: 2018-12-16
DOI: 10.11834/jig.180312

Objective
With the growth of large-scale image surveillance and video data in the public-safety field and the development of intelligent transportation, vehicle retrieval has extremely important application value. Existing vehicle retrieval suffers from a low level of automation and intelligence, and accurate retrieval results are difficult to obtain. To address these problems, this paper proposes a vehicle retrieval method based on multi-task segmented compact features, which effectively exploits the diversity and correlation of basic vehicle information to achieve real-time retrieval.
Method
First, exploiting the connections between related tasks to improve retrieval accuracy and refine image features, a multi-task deep convolutional network is constructed to learn, in segments, hash codes for different vehicle attributes; image semantics are combined with image representation, and minimized image coding makes the learned attribute features of vehicles more robust. Then, a feature pyramid network is selected to extract instance features of vehicle images, and a locality-sensitive hashing re-ranking method is used to retrieve the extracted features. Finally, for the special case in which no query image of the target vehicle can be obtained, a cross-modal assisted retrieval method is adopted.
Result
The proposed retrieval method outperforms current mainstream retrieval methods on three public datasets; retrieval precision reaches 0.966 on the CompCars dataset and rises to 0.862 on the VehicleID dataset.
Conclusion
The proposed multi-task segmented compact feature vehicle retrieval method obtains both minimized image codes and instance-level image features, and it can also perform cross-modal retrieval when no target query image is available. Comparative experiments verify the effectiveness of the method.
Objective
Large-scale image surveillance and video data have grown continuously in the field of public safety, and intelligent transportation has evolved rapidly, so vehicle retrieval has extremely important application value. Existing vehicle retrieval techniques have a low level of automation and intelligence, accurate search results are difficult to obtain, and these techniques consume a large amount of storage space. To solve these problems, this study proposes a multi-task segmented compact feature vehicle retrieval method. The method effectively uses the correlation between detection and identification tasks and, to achieve real-time retrieval, fully exploits the diversity of vehicle attribute information. Vehicle retrieval based on appearance features can overcome the limitations of traditional license-plate recognition and has broad application prospects in traffic-violation inspection and in the search and seizure of suspected criminal vehicles.
Method
This study constructs a multi-task deep convolutional network to learn hash codes. The learning scheme combines image semantics with image representation and uses the connections between related tasks to improve retrieval accuracy and refine image features; minimized image coding ensures the robustness of the learned vehicle features. A feature pyramid network is then used to extract instance-level characteristics of the vehicle image, and during retrieval the extracted features are ranked with a locality-sensitive hashing re-ranking method. In several vehicle-search scenarios no query image can be obtained, for example, when a camera's night-vision footage is blurred; for such cases this study proposes cross-modal assisted retrieval to meet the actual requirements of different environments.
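The hashing-and-re-ranking step can be illustrated with a minimal sketch: sign-based locality-sensitive hashing buckets gallery codes by a short prefix, and candidates in the query's bucket are re-ranked by full Hamming distance. This is a toy stand-in for the paper's learned hash codes; all sizes, names, and the random-projection hashing itself are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(features, planes):
    """Binarize real-valued features with random hyperplanes (sign LSH)."""
    return (features @ planes.T > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

# Toy gallery: 1 000 vehicle images represented by 128-D deep features.
dim, bits, prefix = 128, 48, 12
gallery = rng.standard_normal((1000, dim)).astype(np.float32)
planes = rng.standard_normal((bits, dim)).astype(np.float32)
codes = lsh_hash(gallery, planes)

# Coarse retrieval: bucket gallery items by the first `prefix` bits.
buckets = {}
for i, code in enumerate(codes):
    buckets.setdefault(code[:prefix].tobytes(), []).append(i)

# Query with a gallery image itself (deterministic for this sketch), then
# re-rank the candidates in its bucket by full 48-bit Hamming distance.
q_code = lsh_hash(gallery[42:43], planes)[0]
candidates = buckets.get(q_code[:prefix].tobytes(), range(len(gallery)))
ranked = sorted(candidates, key=lambda i: hamming(q_code, codes[i]))
print(ranked[0])  # → 42
```

The prefix bucketing keeps the candidate set small so the exact Hamming re-ranking stays cheap, which is the usual motivation for combining coarse hashing with re-ranking.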
Result
Two datasets are used to verify the recognition performance of the multi-task network; both contain large-scale images of different vehicles. The BIT-Vehicle database, commonly used for vehicle identification, contains 9 850 checkpoint vehicle images divided into 12 categories, organized into two tasks, namely, color and model. To verify the accuracy of fine-grained vehicle classification and multi-task network recognition, we use the CompCars dataset, which is more finely subdivided than BIT-Vehicle. CompCars contains two parts, namely, web-collected images and checkpoint-captured images. We select and organize the checkpoint part, comprising 30 000 frontal checkpoint images annotated with 11 body colors, 69 vehicle brands, 281 vehicle models, and 3 vehicle types; this dataset is therefore well suited to verifying the recognition performance of a multi-task convolutional neural network. In addition, to verify the general adaptability of the proposed retrieval method, retrieval experiments are conducted on the VehicleID dataset, which contains approximately 200 000 images of 26 000 vehicles captured by surveillance cameras in real-world scenes under different conditions, covering 250 models and 7 colors. The proposed method outperforms current mainstream search methods on all three public datasets: retrieval precision reaches 0.966 on CompCars and rises to 0.862 on VehicleID, a remarkable improvement over existing methods.
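Precision figures of this kind can in principle be computed with a simple top-k evaluation over ranked retrieval lists. The sketch below uses fabricated toy rankings purely for illustration; the function name and data are not from the paper.

```python
import numpy as np

def precision_at_k(ranked_ids, true_id, k=1):
    """Fraction of the top-k retrieved gallery IDs matching the query's identity."""
    return sum(1 for i in ranked_ids[:k] if i == true_id) / k

# Toy results: (ranked gallery IDs for one query, ground-truth vehicle ID).
results = [
    ([7, 3, 8], 7),  # correct vehicle ranked first
    ([2, 5, 9], 5),  # correct vehicle ranked second
    ([4, 6, 1], 4),  # correct vehicle ranked first
]
p_at_1 = float(np.mean([precision_at_k(r, t, k=1) for r, t in results]))
print(round(p_at_1, 3))  # → 0.667
```

Averaging precision@1 over all queries gives a single retrieval-precision score comparable across datasets and methods.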
Conclusion
This study focuses on real public-safety scenarios and on improving retrieval accuracy over massive video data. We design a multi-task neural-network learning method suitable for both identification and retrieval, which unifies multiple feature extractors in the same model and is trained end to end. The proposed multi-task segmented compact feature vehicle retrieval method obtains both minimized image codes and instance-level image features, and it can also perform cross-modal retrieval when the target query image cannot be obtained. Comparative experiments verify the effectiveness of the method.