Improved capsule network method for engineering vehicles detection and recognition in aerial images
2022, Vol. 27, No. 8, pp. 2380-2390
Print publication date: 2022-08-16
Accepted: 2021-06-01
DOI: 10.11834/jig.210164
Yingchun Zhong, Haiyang Zheng, Wenxiang Zhang, Bo Wang, Zhiyong Luo. Improved capsule network method for engineering vehicles detection and recognition in aerial images[J]. Journal of Image and Graphics, 2022, 27(8): 2380-2390.
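Objective
Recognizing engineering vehicles in aerial images captured during unmanned aerial vehicle (UAV) inspection is important for reducing power safety accidents. Classical pattern recognition methods based on hand-crafted features, and deep learning algorithms such as YOLOv5 (you only look once v5), suffer from low recognition accuracy and large model parameter scale when recognizing engineering vehicles in aerial images from UAV power-line inspection. To address these problems, an improved capsule network is proposed for recognizing engineering vehicles in aerial images.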
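Method
The original capsule network structure is improved with a multi-layer densely connected design to extract more features of engineering vehicles from the image. The dynamic routing method of the capsule network is improved to strengthen its robustness to interference. The influence of the number of network layers and of key parameters in the dynamic routing algorithm on recognition accuracy is explored to find the parameters that give the highest recognition accuracy.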
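As a rough illustration of what "densely connected" means here, the sketch below feeds each layer the concatenation of the block input and all earlier layer outputs. It is not the authors' architecture: the abstract does not specify the layer types, so plain linear maps with ReLU stand in for convolutional or capsule layers, and the names `dense_block`, `make_weights`, and `growth` are hypothetical.

```python
import numpy as np

def dense_block(x, weights):
    """Apply layers in sequence; each layer sees the concatenation of the
    block input and every earlier layer output (dense connectivity)."""
    features = [x]
    for w in weights:
        inp = np.concatenate(features, axis=-1)  # reuse all earlier features
        out = np.maximum(inp @ w, 0.0)           # linear map + ReLU stands in for a real layer
        features.append(out)
    return np.concatenate(features, axis=-1)

def make_weights(in_dim, growth, num_layers, seed=0):
    """Random weights whose input width grows by `growth` per layer (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    weights, dim = [], in_dim
    for _ in range(num_layers):
        weights.append(rng.standard_normal((dim, growth)) * 0.1)
        dim += growth
    return weights

x = np.ones((2, 8))
y = dense_block(x, make_weights(in_dim=8, growth=4, num_layers=3))
print(y.shape)
```

With these assumed sizes, the output width is the input width plus `growth` times the number of layers, which is why adding layers widens the feature vector that later stages can draw on.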
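Result
The experimental results show the following. 1) Among the models compared, the proposed method reaches a mean average precision (mAP) of 94.56%, clearly higher than the other methods. 2) The number of network layers strongly affects recognition accuracy, but the relationship between the two is neither monotonic nor linear. In this application scenario, the 5-layer capsule network achieves the highest recognition accuracy; moreover, whether or not the dynamic routing algorithm is improved does not change how recognition accuracy varies with the number of layers. 3) Increasing the number of capsule network layers lowers recognition efficiency but does not noticeably increase the parameter scale, and the parameter scale shows no clear correlation with mAP.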
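Conclusion
The proposed method achieves high recognition accuracy with a small parameter scale, laying a foundation for recognizing targets onboard the UAV.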
Objective
Power line construction, especially of high-voltage power lines, plays an important role in urban development. Engineering vehicles, which include excavators and wheeled cranes, are used on construction sites. If an engineering vehicle works near a high-voltage power line, its bucket or boom may enter the high-voltage breakdown range when lifted, which can easily cause accidents such as short-circuit breakdowns. It is therefore necessary to detect engineering vehicles working near high-voltage power lines. Multi-rotor unmanned aerial vehicles (UAVs) are widely used to acquire large numbers of aerial images for power line inspection, and engineering vehicles are commonly identified in these images manually. Classical pattern recognition methods and some deep learning models such as you only look once version 5 (YOLOv5) face problems such as inefficiency and inaccuracy when recognizing engineering vehicles in the acquired aerial images. Classical pattern recognition methods require manual feature extraction. Some deep learning models have a large parameter scale and a complex network structure, and are not accurate enough when the training set is small. To solve these problems, we propose an improved capsule network model to recognize engineering vehicles in aerial images. The capsule network is improved in two respects: the network structure of the capsule network model, and the dynamic routing algorithm of the capsule network.
Method
First, we built an image dataset of 1 890 aerial images in total and split it into a training set and a testing set at a ratio of 4:1. Second, we improved the network structure of the capsule network with a multi-layer densely connected method to extract more features of the engineering vehicle from the image; we call this improved model No.1. The multi-layer densely connected capsule network has 3, 5 or 7 layers. Third, we improved the dynamic routing method of the capsule network by replacing the softmax function with the leaky-softmax function to improve the anti-interference performance of the capsule network; we call this improved model No.2. The model that combines the multi-layer densely connected network with the leaky-softmax function is called improved model No.3. Fourth, we examined several key parameters of those models: the number of layers in the capsule network, and the routing coefficient and squeeze coefficient in the dynamic routing algorithm.
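The routing change described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it uses one common formulation of leaky softmax (an extra zero logit that absorbs part of the probability mass and is then discarded), and the parameters `num_iters` and `squeeze` are hypothetical knobs. The abstract does not define its "routing coefficient" or "squeeze coefficient", so any correspondence with these parameters is an assumption.

```python
import numpy as np

def squash(s, squeeze=1.0, eps=1e-9):
    """Shrink each vector to a length in [0, 1) while keeping its direction.
    `squeeze` is a hypothetical constant in the denominator."""
    sq_norm = np.sum(s * s, axis=-1, keepdims=True)
    return (sq_norm / (squeeze + sq_norm)) * s / np.sqrt(sq_norm + eps)

def leaky_softmax(logits):
    """Softmax over output capsules with one extra zero logit (the leak).
    The leak column is dropped, so each row sums to less than 1."""
    leak = np.zeros(logits.shape[:-1] + (1,))
    z = np.concatenate([leak, logits], axis=-1)
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return (e / e.sum(axis=-1, keepdims=True))[..., 1:]

def route(u_hat, num_iters=3, squeeze=1.0):
    """Routing by agreement. u_hat has shape (num_in, num_out, dim)."""
    b = np.zeros(u_hat.shape[:2])                   # routing logits
    for _ in range(num_iters):
        c = leaky_softmax(b)                        # coupling coefficients
        s = np.einsum("ij,ijd->jd", c, u_hat)       # weighted sum per output capsule
        v = squash(s, squeeze)                      # output capsule vectors
        b = b + np.einsum("ijd,jd->ij", u_hat, v)   # reward predictions that agree with v
    return v, c

rng = np.random.default_rng(0)
v, c = route(rng.standard_normal((6, 3, 4)))
print(v.shape, c.sum(axis=1))
```

Because the leak term takes a share of each input capsule's probability mass, an input capsule whose predictions agree with no output capsule is not forced to distribute its full weight among them, which is one plausible reading of the anti-interference motivation.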
Result
The first group of experiments validates whether the two improvements are effective. We compared the mean average precision (mAP) of the original capsule network model with that of improved models No.1, No.2 and No.3. All models use the 3-layer densely connected capsule network. The experimental results show that the mAP of improved model No.1 is 91.70% and the mAP of improved model No.2 is 90.01%, which are 2.21% and 0.54% higher, respectively, than that of the original capsule network. Improved model No.3 further improves recognition accuracy, with an mAP of 92.10%. The second group of experiments clarifies how the number of network layers influences the mAP of those models. The results show that the number of network layers influences the mAP greatly. When the number of layers is small, the mAP increases as the number of layers increases. After the mAP peaks, it generally decreases as the number of layers increases further. Their relationship is therefore non-monotonic and nonlinear. In this application, a 5-layer capsule network has the best mAP. Additionally, the trend of mAP with the number of layers is not affected by the improvement of the dynamic routing algorithm. Furthermore, the efficiency of all improved models decreases dramatically as the number of capsule network layers increases, whereas their parameter volume does not vary noticeably, which suggests that parameter volume is unrelated to recognition precision. The third group of experiments searches for the optimal model with appropriate routing and squeeze coefficients. The results show that the mAP of the 5-layer densely connected capsule network reaches 94.56% when the routing coefficient is 5 and the squeeze coefficient is l, an increase of 5.07% over the original capsule network. Meanwhile, the parameter volume of this optimal model is close to that of the original model. This optimal model therefore combines a high mAP with a small parameter volume. The fourth group of experiments compares the optimal model with other models. The results show that our optimal model achieves a higher mAP than the classical pattern recognition model and the YOLOv5x model, with a smaller parameter volume.
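The abstract reports gains over the original capsule network but not the original network's mAP itself. The short calculation below backs it out from the figures quoted above, assuming each gain is an absolute difference in percentage points rather than a relative change; that reading is an assumption, not something the abstract states.

```python
# Reported mAP (percent) and reported gain over the original capsule network.
reported = {
    "improved model No.1 (3 layers)": (91.70, 2.21),
    "improved model No.2 (3 layers)": (90.01, 0.54),
    "optimal 5-layer model": (94.56, 5.07),
}

# Assumption: gains are absolute percentage points, so baseline = mAP - gain.
implied_baseline = {name: round(m - g, 2) for name, (m, g) in reported.items()}

for name, base in implied_baseline.items():
    print(f"{name}: implied original mAP = {base:.2f}%")

spread = max(implied_baseline.values()) - min(implied_baseline.values())
print(f"spread across implied baselines: {spread:.2f} points")
```

Under this reading, the three figures point to an original mAP of roughly 89.5% and agree with one another to within a few hundredths of a point, which is consistent with rounding in the reported values.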
Conclusion
We used two approaches to improve the capsule network model for recognizing engineering vehicles in UAV aerial images. Our experiments show that the improved model has a small parameter volume and good recognition precision, which is significant for recognizing targets onboard the UAV.
Keywords: aerial image of unmanned aerial vehicle (UAV); recognition of engineering vehicle; capsule network; dynamic routing algorithm; densely connected network