Data augmentation based on AT-PGGAN for fine-grained recognition of vehicle models
- 2020, Vol. 25, No. 3, pp. 593-604
Received: 2019-06-21; Revised: 2019-07-29; Accepted: 2019-08-05; Published in print: 2020-03-16
DOI: 10.11834/jig.190282
Objective
Vehicle model recognition has important application prospects in intelligent transportation, intelligent security, autonomous driving, and related fields. The amount of labeled vehicle model data is a key factor affecting recognition performance. Taking data augmentation as its core idea, this paper combines PGGAN (progressive growing of GANs) with an attention mechanism and proposes AT-PGGAN (attention-progressive growing of GANs), a network model that generates data with an adversarial network and then classifies it. The model is used to increase the number of labeled vehicle model images and thereby improve recognition accuracy.
Method
The model consists of a generation network and a classification network. The generation network is used to augment and expand the training data. An attention mechanism and a label re-embedding method are used to optimize the generation network so that the details of the generated images become more complete. A label recalibration method is proposed to re-determine the label of each generated image and to filter the generated images accordingly. The augmented images, together with the images of the original dataset, are used as input to train the classification network.
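The augment-then-filter procedure described above can be sketched in a few lines of plain Python. Here `generator` and `classifier` are hypothetical stand-ins for the trained conditional generator and a pretrained classifier; this is a minimal sketch of the idea, not the paper's actual interfaces:

```python
def augment_with_gan(original, generator, classifier, per_class=2, threshold=0.9):
    """Expand a labeled dataset [(image, label), ...] with generated samples,
    keeping only those whose intended label survives recalibration by the
    classifier (label recalibration + filtering)."""
    augmented = list(original)
    for label in sorted({lbl for _, lbl in original}):
        for _ in range(per_class):
            image = generator(label)            # conditional generation
            probs = classifier(image)           # maps label -> confidence
            predicted = max(probs, key=probs.get)
            # Discard samples the classifier does not confidently
            # assign to the intended class.
            if predicted == label and probs[predicted] >= threshold:
                augmented.append((image, predicted))
    return augmented
```

The resulting `augmented` list is then fed to the classification network exactly as the original data would be.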
Result
The proposed model augments existing vehicle images well. On the public StanfordCars dataset, recognition accuracy improves by more than 1% over classification networks trained without AT-PGGAN data augmentation. Compared with other networks on CompCars, the proposed method reaches a top accuracy of 96.6% under the same conditions, higher than the compared methods. The experimental results show that the method effectively improves the accuracy of fine-grained vehicle recognition.
Conclusion
A generative adversarial network is used to augment and enhance the data. The generated images simulate the original image data well and have a regularizing effect on it. The generated image data can yield a measurable improvement in fine-grained recognition accuracy, so the approach has considerable application prospects.
Objective
Comprehensive perception for traffic management through computer vision is particularly important in intelligent transportation systems. Vehicle recognition is an important part of such systems, and fine-grained car model recognition is currently its most challenging subproblem. Traditional methods demand extensive prior information about the data, while deep learning methods require large-scale data and fit poorly when the amount of data is small. Manually labelling a large number of vehicle images is time-consuming, and the strong similarity between vehicle categories introduces deviations in manual labelling. To obtain more abundant vehicle image features from the data, we propose the attention-progressive growing of generative adversarial networks (AT-PGGAN) model.
Method
The AT-PGGAN model consists of a generation network and a classification network. The generation network is used to augment the training data. Most current work on fine-grained vehicle recognition relies on high-resolution images, but existing generation networks produce unsatisfactory high-resolution output, so generated high-resolution images cannot be used directly for fine-grained recognition. In this study, an attention mechanism and a label re-embedding method are used to optimize the generation network so that high-resolution image details are well formed and therefore conducive to the network extracting true features from the images. This paper also proposes a label recalibration method, which recalibrates the label data of the generated images, filters them accordingly, and removes generated images that do not meet the requirements; this addresses the problem of poor image quality from another angle. The relabeled generated images and the original images are jointly used as input to the classification network. Because there is no direct connection between the generation and classification networks, the classification part can use any of several classification networks, which improves the universality of the model.
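To illustrate the kind of attention used to refine the generator's feature maps, the following is a minimal squeeze-and-excitation style channel-attention computation in plain Python. The fixed sigmoid gate stands in for the small learned gating network, so this is an illustrative sketch under that assumption, not the paper's exact module:

```python
import math

def channel_attention(feature_maps):
    """SE-style channel attention on a list of 2D channel maps:
    global-average 'squeeze', sigmoid gate, per-channel rescale.
    (Illustrative only; the real module learns the gating weights.)"""
    # Squeeze: one scalar per channel via global average pooling.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excitation: a fixed sigmoid stands in for the learned gating MLP.
    gates = [1.0 / (1.0 + math.exp(-s)) for s in squeezed]
    # Rescale: weight each channel by its gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]
```

Channels with higher average activation receive gates closer to 1 and are passed through nearly unchanged, while weakly activated channels are suppressed.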
Result
With the proposed model, existing vehicle model images can be well augmented and used for fine-grained vehicle model recognition. On the public StanfordCars dataset, a 1% improvement is observed over a classification network that does not use AT-PGGAN for data augmentation. Compared with other networks on CompCars, the top-1 and top-5 accuracy rates of this method are higher than those of existing methods under the same conditions. Comparing several semi-supervised image label calibration methods, we find that the proposed method gives the best results across different sample sizes. The number of generated images also influences recognition accuracy: accuracy is highest when the number of generated images matches the number of original samples, and it decreases as generated images continue to be added. In the comparative experiments, the progressive growing strategy provides a basic improvement to the generation algorithm, and because a large number of substandard images are screened out during label recalibration, their influence on feature extraction is removed. The experimental results show that labels strongly affect the results; relabeling is the major improvement of the algorithm.
Conclusion
A generative adversarial network is used for data augmentation and enhancement, and the generated images can effectively simulate the original image data. In the classification task, the generated images have a regularizing effect on the original image data, and the generated image data can improve fine-grained recognition accuracy. Generating clear high-resolution images is thus the key to the problem. Different label calibration methods have a great influence on the results, so effective calibration of generated image labels is another way to solve the problem effectively.
References
Chabot F, Chaouch M, Rabarisoa J, Teulière C and Chateau T. 2017. Deep MANTA: a coarse-to-fine many-task network for joint 2D and 3D vehicle analysis from monocular image//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 1827-1836 [DOI: 10.1109/CVPR.2017.198]
Denton E, Chintala S, Szlam A and Fergus R. 2015. Deep generative image models using a Laplacian pyramid of adversarial networks [EB/OL]. 2015-06-18 [2019-05-10]. https://arxiv.org/pdf/1506.05751.pdf
Elkerdawy S, Ray N and Zhang H. 2018. Fine-grained vehicle classification with unsupervised parts co-occurrence learning//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 644-670 [DOI: 10.1007/978-3-030-11018-5_54]
Em Y, Gag F, Lou Y H, Wang S Q, Huang T J and Duan L Y. 2017. Incorporating intra-class variance to fine-grained visual recognition//Proceedings of 2017 IEEE International Conference on Multimedia and Expo. Hong Kong, China: IEEE: 1452-1457 [DOI: 10.1109/ICME.2017.8019371]
Fukui H, Hirakawa T, Yamashita T and Fujiyoshi H. 2019. Attention branch network: learning of attention mechanism for visual explanation [EB/OL]. 2019-04-10 [2019-05-01]. https://arxiv.org/pdf/1812.10025.pdf
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial nets//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press: 2672-2680
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE: 7132-7141 [DOI: 10.1109/CVPR.2018.00745]
Karras T, Aila T, Laine S and Lehtinen J. 2018. Progressive growing of GANs for improved quality, stability, and variation [EB/OL]. 2018-02-26 [2019-05-10]. https://arxiv.org/pdf/1710.10196.pdf
Krause J, Stark M, Deng J and Li F F. 2013. 3D object representations for fine-grained categorization//Proceedings of 2013 IEEE International Conference on Computer Vision Workshops. Sydney, NSW, Australia: IEEE: 554-561 [DOI: 10.1109/ICCVW.2013.77]
Lee D H. 2013. Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks [EB/OL]. 2013-06-20 [2019-05-10]. https://www.kaggle.com/blobs/download/forum-message-attachment-files/746/pseudo_label_final.pdf
Lee H J, Ullah I, Wan W G, Gao Y B and Fang Z J. 2019. Real-time vehicle make and model recognition with the residual SqueezeNet architecture. Sensors, 19(5): E982 [DOI: 10.3390/s19050982]
Li X J, Yang C, Chen S L, Zhu C and Yin X C. 2019. Semantic bilinear pooling for fine-grained recognition [EB/OL]. 2019-04-03 [2019-05-01]. https://arxiv.org/pdf/1904.01893.pdf
Lin Y L, Morariu V I, Hsu W and Davis L S. 2014. Jointly optimizing 3D model fitting and fine-grained classification//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 466-480 [DOI: 10.1007/978-3-319-10593-2_31]
Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A and Chen X. 2016. Improved techniques for training GANs [EB/OL]. 2016-04-06 [2019-05-10]. https://arxiv.org/pdf/1606.03498
Sochor J, Herout A and Havel J. 2016. BoxCars: 3D boxes as CNN input for improved fine-grained vehicle recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE: 3006-3015 [DOI: 10.1109/CVPR.2016.328]
Tang X L, Du Y M, Liu Y H, Li J X and Ma Y W. 2018. Image recognition with conditional deep convolutional generative adversarial networks. Acta Automatica Sinica, 44(5): 855-864 [DOI: 10.16383/j.aas.2018.c170470]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L and Polosukhin I. 2017. Attention is all you need [EB/OL]. 2017-12-06 [2019-03-10]. https://arxiv.org/pdf/1706.03762.pdf
Yang Y, Zhang H Y, Zhu Y and Zhang Y N. 2018. Class-information generative adversarial network for single image super-resolution. Journal of Image and Graphics, 23(12): 1777-1788 [DOI: 10.11834/jig.180331]
Yu Y, Fu Y X, Yang C D and Lu Q. 2019. Fine-grained car model recognition based on FR-ResNet [J/OL]. Acta Automatica Sinica, 1-12 [2019-06-01]. https://doi.org/10.16383/j.aas.c180539
Zhang H, Goodfellow I, Metaxas D and Odena A. 2018. Self-attention generative adversarial networks [EB/OL]. 2018-03-21 [2019-03-10]. https://arxiv.org/pdf/1805.08318.pdf
Zhang H, Xu T, Li H S, Zhang S T, Huang X G, Wang X L and Metaxas D. 2017. StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 5908-5916 [DOI: 10.1109/ICCV.2017.629]
Zhang X F, Zhou F, Lin Y Q and Zhang S T. 2016. Embedding label structures for fine-grained feature representation//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE: 430-444 [DOI: 10.1109/CVPR.2016.126]
Zheng Z D, Zheng L and Yang Y. 2017. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 3774-3782 [DOI: 10.1109/ICCV.2017.405]