Power line recognition method via fully convolutional network
2020, Vol. 25, No. 5, pp. 956-966
Received: 2019-07-03
Revised: 2019-10-24
Accepted: 2019-10-31
Published in print: 2020-05-16
DOI: 10.11834/jig.190316
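Objective
Power line warning is critical to the low-altitude flight safety of helicopters and unmanned aerial vehicles, and recognizing power lines in visible and infrared images is an effective approach. Traditional recognition methods rely on hand-designed filters to extract local features of power lines and then use methods such as the Hough transform to find straight lines, whereas machine learning methods such as support vector machines and random forests only report whether an image contains a power line. This paper proposes a power line recognition method based on a fully convolutional network that learns the feature extractor automatically and at the same time obtains information such as the exact location of the power lines.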
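Method
First, a large amount of paired synthetic data, consisting of power line images and pixel labels, is generated from complex backgrounds. The U-Net architecture is then modified to suit the power line recognition task, and the network is trained on the synthetic data. Because power lines occupy very few pixels in an image, a focal loss function is adopted to balance the influence of the large number of negative samples.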
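To illustrate why focal loss helps when wire pixels are rare, the following is a minimal pure-Python sketch of the binary focal loss of Lin et al. (2017), with plain cross-entropy for comparison. The values alpha = 0.25 and gamma = 2.0 are the commonly used defaults from that paper and are an assumption here; the source does not state which settings were used, and the function names are illustrative.

```python
import math


def binary_focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over a flat list of pixels.

    probs  : predicted wire probabilities in [0, 1]
    labels : ground-truth labels (1 = wire pixel, 0 = background)
    alpha, gamma : assumed hyperparameters, not taken from the source
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability of the true class
        a_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
        # (1 - p_t) ** gamma down-weights pixels that are already easy
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy, used here only as a baseline."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(probs)


# 99 easy background pixels and one hard wire pixel
probs = [0.05] * 99 + [0.3]
labels = [0] * 99 + [1]
print(binary_cross_entropy(probs, labels), binary_focal_loss(probs, labels))
```

In this toy case the 99 easy background pixels dominate the cross-entropy, whereas they contribute only a small share of the focal loss, so the single wire pixel drives the gradient.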
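Result
On a power line inspection data set containing 4 000 infrared images and 4 000 visible images, and compared with random forest methods using various features such as VGG (visual geometry group) 16, the proposed method achieves a power line recognition rate above 99% with a false alarm rate below 2%. Moreover, in the pixel segmentation output of the proposed method, essentially all power lines are recognized.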
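Conclusion
The proposed fully convolutional network method for power line recognition can extract the optical image features of power lines and, compared with traditional machine learning methods, can accurately extract the power lines from the scene, which gives the recognition result a firmer basis for judgment.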
Objective
Tens of helicopter accidents occur every year owing to collisions with trees, wires, poles, and man-made buildings at low altitude. From 2014 to 2016 alone, 96 crashes worldwide were caused by hitting power lines. Wire warning and avoidance are therefore important for the low-altitude flight safety of helicopters and unmanned aerial vehicles. According to relevant studies, using optical images is an effective way to identify wires. Traditional methods use hand-designed filters to extract features of power lines and then use the Hough transform to detect the lines. Machine learning methods such as VGG (visual geometry group) 16 and random forest (RF) yield only a classification result for a whole picture, which makes the result difficult to verify. The fully connected layer of a traditional convolutional neural network (CNN) is effective for classification tasks. However, it cannot carry out pixel segmentation tasks because location information is lost. By contrast, the fully convolutional network has no fully connected layer and therefore retains location information. One kind of fully convolutional network, U-Net, was proposed to solve problems such as cell segmentation and retina segmentation. U-Net works well with a small number of samples and small image patches. A three-channel image is input into the network. After passing through the encoder and decoder, it becomes a one-channel feature map via a convolution with a 1×1 kernel. To obtain a final value between 0 and 1, a sigmoid activation function is applied to the output of the last convolution layer. In this study, a CNN recognition method based on U-Net is proposed to detect power lines.
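The output stage described above, a 1×1 convolution that reduces the feature map to one channel followed by a sigmoid, can be sketched in pure Python as follows. This is an illustration only: it assumes the sigmoid is applied to the output of the 1×1 convolution, and the function name, weights, and bias are hypothetical rather than taken from the paper.

```python
import math


def conv1x1_sigmoid(feature_map, weights, bias=0.0):
    """Map a [C][H][W] feature map to an [H][W] map of wire probabilities.

    A 1x1 convolution is a per-pixel weighted sum over the C channels;
    the sigmoid then squashes each sum into the interval (0, 1).
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            z = bias + sum(weights[c] * feature_map[c][y][x]
                           for c in range(channels))
            row.append(1.0 / (1.0 + math.exp(-z)))
        out.append(row)
    return out


# Hypothetical 2-channel, 2x2 feature map
features = [
    [[0.0, 1.0], [2.0, -1.0]],
    [[0.5, -0.5], [1.0, 3.0]],
]
prob_map = conv1x1_sigmoid(features, weights=[1.0, -1.0])
print(prob_map)
```

Because no fully connected layer is involved, each output value still corresponds to one pixel position of the input, which is what makes pixel-level segmentation possible.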
Method
First, we obtain a power line data set containing 8 000 images, that is, 4 000 pairs of visible and infrared images. The image size is 128×128 pixels, and each image has a pixel-level ground truth label. The network receptive field calculation formula is used to determine the depth of our network. Next, adjustments are made to this base network to choose the best model. The base network is named the U-Net-0 model. The U-Net-1 model removes the downsampling (pooling) layer of the U-Net-0 model and changes the stride of the convolution layer before it to 2. It also removes the upsampling layer and replaces the convolution layer after it with a transposed convolution layer with a stride of 2. Compared with U-Net-0, the U-Net-2 model eliminates the upsampling and downsampling layers and the convolution layer in the middle, thereby reducing the network depth. In the U-Net-3 model, decoding is expected to be a dimensionality reduction process. Therefore, the number of convolution kernels in the decoding part is limited so that the feature map output by each layer is no larger than that of the previous layer. Pictures with complex backgrounds are also used to generate a large amount of paired synthetic data, consisting of power line images and their pixel labels. The generated synthetic data are then used for network training. In each image, the power line occupies only a small number of pixels. Thus, focal loss is used to balance the influence of the large number of negative samples. The four models use the same optimizer, Adam, which builds on stochastic gradient descent (SGD) and adapts the learning rate automatically. Training of each model is accelerated with an NVIDIA GTX 1080 Ti device and takes approximately 18 hours for 6 000 iterations with a batch size of 64. Loss, F1 score, and intersection-over-union (IoU) are the three evaluation criteria for the trained models. The best model usually has low loss and high F1 score and IoU. Each model is applied to both visible and infrared images, and the two results are combined to make a judgment: in the combined result, a power line is considered detected if it is found in either image of a pair.
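Two of the evaluation criteria named above, IoU and F1 score, can be computed from a predicted mask and its ground truth label as in the sketch below. The masks are flat lists of 0/1 values, the function name is illustrative, and returning 1.0 when both masks are empty is a convention assumed here, not something the source specifies.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two equal-length binary masks.

    pred, truth : flat lists of 0/1 pixel labels (1 = wire pixel)
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives

    union = tp + fp + fn
    iou = tp / union if union else 1.0      # assumed convention for empty masks

    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0   # same convention as above
    return iou, f1


pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
print(iou_and_f1(pred, truth))  # IoU = 2/4, F1 = 4/6
```

For thin structures such as wires, a prediction that is offset by a single pixel already lowers IoU sharply, which is worth keeping in mind when reading IoU values for this task.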
Result
After the four models are tested on the data set, the number of correctly identified pixels and the IoU on each image are counted. According to the statistics, the IoU of most recognition results exceeds 0.2, and a threshold of 30 pixels works relatively well for classifying the results: if more than 30 pixels are identified in an image, the image is considered to contain a power line. By this standard, the proposed method achieves a recognition rate over 99%, while the false alarm rate is less than 2%. By comparison, VGG16, which is trained on 3 800 pairs of images and tested on 200 pairs of images, obtains a recognition rate of only 95% and a false alarm rate of 37%. RF is affected by the feature extraction method, so its recognition rate and false alarm rate fluctuate greatly. For example, RF with local binary patterns has a recognition rate of 63.5% and a false alarm rate of 36.3% on infrared images, whereas RF with discrete cosine transform obtains a recognition rate of 92.95% and a false alarm rate of 13.95% on infrared images. Although U-Net-3 has more learnable parameters than U-Net-2, its performance is substantially worse.
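The decision rule above (more than 30 identified pixels means a power line is present, with a visible/infrared pair counted as positive if either image is positive) can be sketched as follows. The definitions of recognition rate and false alarm rate used here (detected positives over all positives, and flagged negatives over all negatives) are assumptions, since the source does not spell them out, and `make_mask` is a hypothetical helper for building toy masks.

```python
PIXEL_THRESHOLD = 30  # from the text: more than 30 identified pixels


def has_power_line(mask, threshold=PIXEL_THRESHOLD):
    """mask: 2D list of 0/1 values output by the segmentation network."""
    return sum(sum(row) for row in mask) > threshold


def pair_has_power_line(visible_mask, infrared_mask, threshold=PIXEL_THRESHOLD):
    """A pair is positive if either the visible or the infrared mask is."""
    return (has_power_line(visible_mask, threshold)
            or has_power_line(infrared_mask, threshold))


def recognition_and_false_alarm_rates(decisions, truths):
    """decisions, truths: lists of booleans, one entry per image or pair.

    Assumed definitions:
      recognition rate = detected positives / all positives
      false alarm rate = flagged negatives / all negatives
    """
    positives = sum(1 for t in truths if t)
    negatives = len(truths) - positives
    detected = sum(1 for d, t in zip(decisions, truths) if d and t)
    false_alarms = sum(1 for d, t in zip(decisions, truths) if d and not t)
    recognition = detected / positives if positives else 0.0
    false_alarm = false_alarms / negatives if negatives else 0.0
    return recognition, false_alarm


def make_mask(n_wire_pixels, size=128):
    """Hypothetical helper: a size x size mask with n_wire_pixels ones."""
    mask = [[0] * size for _ in range(size)]
    for i in range(n_wire_pixels):
        mask[i // size][i % size] = 1
    return mask


print(pair_has_power_line(make_mask(0), make_mask(40)))
```

Combining the two modalities with a logical OR raises the chance of catching a wire that is visible in only one image, at the cost of also passing through false alarms from either image.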
Conclusion
Our models achieve higher recognition rates and lower false alarm rates than traditional methods on the same data set. The results show that our models are more effective than the other methods and can even clearly extract power lines from the background. Our models are trained on synthetic data and tested on real data, which indicates good generalization performance. The comparison of the four models also shows that the number of parameters alone does not determine the performance of a network and that a reasonable structure is important. However, our current models have a small receptive field and cannot be used for power line recognition in high-resolution images. In the future, the models will be further studied to enlarge their receptive field so that they adapt to larger images without greatly increasing the number of parameters.
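The receptive-field limitation can be made concrete with the standard receptive field arithmetic for stacked convolution and pooling layers. The sketch below is illustrative only: the layer stack is a hypothetical encoder, not the architecture of U-Net-0 to U-Net-3, whose exact layers are not given here.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv/pool layers.

    layers : list of (kernel_size, stride) tuples, ordered input to output.
    Recurrence: r_out = r_in + (k - 1) * j_in and j_out = j_in * s,
    where j is the spacing of adjacent output positions in input pixels.
    """
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r


# Hypothetical encoder: two 3x3 convolutions per stage, 2x2 pooling between stages
encoder = [
    (3, 1), (3, 1), (2, 2),
    (3, 1), (3, 1), (2, 2),
    (3, 1), (3, 1),
]
print(receptive_field(encoder))  # 32
```

In this hypothetical stack, one unit at the deepest layer sees a 32×32 input window, far smaller than a high-resolution image. Strided or pooled stages multiply the contribution of every later layer, which is one way to enlarge the receptive field without adding many parameters.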