Research on lightweight neural network of aerial powerline image segmentation
2021, Vol. 26, No. 11: 2605-2618
Published in print: 2021-11-16
Accepted: 2021-04-20
DOI: 10.11834/jig.200690
Gang Xu, Guo Li. Research on lightweight neural network of aerial powerline image segmentation[J]. Journal of Image and Graphics, 2021,26(11):2605-2618.
Objective
Extracting power lines from aerial images is an important topic in intelligent inspection, and deep-learning-based semantic segmentation models have already achieved good results in this field. However, two problems remain to be solved: the small size of available image training sets and the excessive computational cost of the pre-trained models.
Method
First, the dataset was augmented with a generative adversarial network combined with conic curves and hue perturbation, and U-Net models trained with three different loss functions in two color spaces were compared to determine the best combination. Then, a saliency metric joining the first-order Taylor expansion and the output-channel 2-norm was proposed; an improved channel-wise parameter regularization method based on it was used to sparsify the weights of the full model, which was then pruned and retrained to reduce its computational cost. Finally, an adaptive decision threshold replaced the fixed value to improve robustness to luminance changes.
Result
Experiments show that the proposed grayscale-input lightweight model reaches an IoU (intersection-over-union) of 0.459 while using only 0.03% of the parameters and 3.05% of the computation of the full visible-light model, whose IoU is 0.573; within a suitable range of illumination change, the adaptive threshold method achieves results similar to those of the optimal threshold under the same conditions.
Conclusion
The effects of different dataset augmentation methods, loss functions, and input color spaces on convergence, training speed, and overfitting were verified, and the best combination in each color space was identified. Network pruning greatly reduces the parameter count and computation of the power-line segmentation network, which benefits its practical deployment.
Objective
Powerline semantic segmentation of aerial images, an important part of intelligent powerline inspection, has received widespread attention. Recently, several deep-learning-based methods have been proposed in this field and have achieved high accuracy. However, two major problems still need to be solved before such models can be applied in practice. First, the sample size of publicly available datasets is small. Unlike target objects in other semantic segmentation tasks (e.g., cars and buildings), powerlines have few texture and structural features, which makes them easy to misidentify, especially in scenes not covered by the training set. Therefore, constructing a training set that contains many different background samples is crucial for improving the generalization capability of the model. The second problem is the conflict between the model's computational cost and the limited computing resources of inference terminals. Previous work has demonstrated that an improved U-Net model can segment powerlines from aerial images with satisfactory accuracy; however, the model is computationally expensive for many resource-constrained inference terminals (e.g., unmanned aerial vehicles (UAVs)).
Method
In this study, the background images in the training set were learned using a generative adversarial network (GAN) to generate a series of pseudo-backgrounds, and curved powerlines were drawn on the generated images using conic curves. In detail, a multi-scale automatic growth model, progressive growing of GANs (PGGAN), was adopted to learn the mapping from a random noise vector to the background images in the training set, and its generator was then used to produce series of background images. These background images and the curved powerlines generated by the conic curves were fused in the alpha channel. We created three training sets: the first consisted of only 2 000 real background pictures, the second was a mixture of 10 000 real and generated background images, and the third was composed of 200 generated backgrounds and was used to evaluate the similarity between the generated and original images. At the input of the segmentation network, random hue perturbation was applied to the images to enhance the generalization of the model across seasons. Then,
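As a concrete sketch of the fusion step, the snippet below rasterizes a conic-curve "powerline" and alpha-blends it onto a background; the parabola coefficients, the line color, and the image size are illustrative assumptions rather than values from the paper, and a constant image stands in for a PGGAN-generated sample:

```python
import numpy as np

def draw_parabola_mask(h, w, a, b, c, thickness=2):
    """Rasterize a sagging 'powerline' y = a*x^2 + b*x + c into a binary mask.

    A parabola stands in for the paper's conic curves; any conic section
    could be sampled column by column in the same way.
    """
    mask = np.zeros((h, w), dtype=np.float32)
    xs = np.arange(w)
    ys = (a * xs ** 2 + b * xs + c).round().astype(int)
    for x, y in zip(xs, ys):
        lo, hi = max(0, y - thickness), min(h, y + thickness + 1)
        if lo < hi:
            mask[lo:hi, x] = 1.0
    return mask

def alpha_composite(background, line_color, alpha_mask):
    """Fuse a solid-colored line over the background through the alpha channel."""
    a = alpha_mask[..., None]  # (h, w, 1), broadcasts over the RGB channels
    return (1.0 - a) * background + a * np.asarray(line_color, dtype=np.float32)

# Hypothetical example: a dark wire sagging across a flat 64x64 pseudo-background.
bg = np.full((64, 64, 3), 120.0, dtype=np.float32)
mask = draw_parabola_mask(64, 64, a=0.02, b=-1.2, c=50)
img = alpha_composite(bg, (40, 40, 40), mask)
```

The same mask doubles as the ground-truth label for the composite, and hue perturbation would then be applied to such composites at the network input.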
the convergence accuracy of U-Net networks with three different loss functions was compared in the RGB and grayscale color spaces to determine the best combination. Specifically, we trained U-Net with the focal, soft-IoU, and Dice loss functions in the RGB and gray spaces and compared the convergence accuracy, convergence speed, and degree of overfitting of the six resulting models. Afterward,
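The three loss functions compared here have standard per-pixel forms; the NumPy sketch below is a generic formulation on soft predictions in [0, 1], not the paper's exact implementation:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Dice loss: 1 - 2|P∩T| / (|P| + |T|), on soft predictions."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def soft_iou_loss(pred, target, eps=1e-6):
    """Soft-IoU loss: 1 - |P∩T| / |P∪T|, with the union relaxed to sums."""
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return 1.0 - (inter + eps) / (union + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-6):
    """Focal loss: cross-entropy down-weighted on easy pixels by (1 - p_t)^gamma."""
    p = np.clip(pred, eps, 1.0 - eps)
    pt = np.where(target == 1, p, 1.0 - p)
    return (-((1.0 - pt) ** gamma) * np.log(pt)).mean()

# Sanity check: a confident correct prediction should score lower (better)
# than a confident wrong one under all three losses.
target = np.array([0.0, 1.0, 1.0, 0.0])
good = np.array([0.01, 0.99, 0.99, 0.01])
bad = np.array([0.90, 0.10, 0.10, 0.90])
```

Dice and soft-IoU directly optimize overlap and therefore tolerate the extreme foreground/background imbalance of thin powerlines, while focal loss attacks the same imbalance by re-weighting pixels.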
sparse regularization was applied to the pre-trained full model, and structured network pruning was performed to reduce the computational load of network inference. A saliency metric that combines the first-order Taylor expansion with the output-channel 2-norm was proposed to guide the regularization and pruning process; it provides a higher compression rate than the 2-norm used in previous pruning algorithms. Conventional saliency metrics based on the first-order expansion can change by orders of magnitude during regularization, which makes threshold selection during the iterative process difficult. Compared with these metrics, the proposed one has a more stable range of values, which enables iteration-based regularization; we adopted a 0-norm-based regularization method to widen the saliency gap between important and unimportant neurons. To select the decision threshold, we used an adaptive approach, which is more robust to luminance changes than the fixed-threshold method used in previous work.
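A minimal sketch of such a joint channel saliency, and of the structured pruning it guides, is given below. How the Taylor term and the 2-norm are fused is not stated in the abstract, so the min-max normalization and averaging used here are illustrative assumptions, and the 0-norm regularization loop is omitted:

```python
import numpy as np

def channel_saliency(weights, activations, grads, eps=1e-12):
    """Per-output-channel saliency combining a first-order Taylor term
    |sum(a * dL/da)| with the channel's weight 2-norm.

    weights:     (C_out, C_in, k, k) conv kernel
    activations: (C_out, H, W) feature maps for one sample
    grads:       (C_out, H, W) gradients of the loss w.r.t. the activations

    The min-max normalization and equal-weight average below are only an
    illustrative fusion rule; the paper may combine the terms differently.
    """
    c = len(weights)
    taylor = np.abs((activations * grads).reshape(c, -1).sum(axis=1))
    l2 = np.sqrt((weights.reshape(c, -1) ** 2).sum(axis=1))

    def norm(v):
        return (v - v.min()) / (v.max() - v.min() + eps)

    return 0.5 * (norm(taylor) + norm(l2))

def prune_channels(weights, saliency, keep_ratio=0.5):
    """Structured pruning: keep only the most salient output channels."""
    k = max(1, int(len(weights) * keep_ratio))
    keep = np.sort(np.argsort(saliency)[-k:])
    return weights[keep]

# Toy layer: 8 output channels, prune down to 2 before retraining.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4, 3, 3))
act = rng.normal(size=(8, 16, 16))
grad = rng.normal(size=(8, 16, 16))
s = channel_saliency(w, act, grad)
w_pruned = prune_channels(w, s, keep_ratio=0.25)
```

Because whole output channels are removed, the pruned layer stays a dense convolution and needs no sparse-tensor support at inference time, which is what makes the FLOPs reduction real on UAV-class hardware.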
Result
Experimental results showed that the convergence accuracy on the curved-powerline dataset was higher than that on the straight-powerline dataset. In RGB space, the hybrid dataset built with the GAN yielded higher convergence accuracy than the dataset of real images alone, but no significant improvement was observed in gray space, possibly because of model collapse. We confirmed that hue disturbance can effectively improve the performance of the model across seasons. The experiments on the different loss functions revealed that the convergence intersection-over-union (IoU) of the RGB and gray spaces under their respective optimal loss functions was 0.578 and 0.586, respectively. Dice and soft-IoU differed negligibly in convergence speed and achieved the best accuracy in gray and RGB space, respectively; focal loss converged the slowest in both spaces and achieved the optimal accuracy in neither. At the pruning stage, using the conventional 2-norm saliency metric, the proposed gray-space lightweight model (IoU of 0.459) reduced the number of floating-point operations (FLOPs) and parameters to 3.05% and 0.03%, respectively, of the full RGB-space model (IoU of 0.573). With the proposed joint saliency metric, the FLOPs and parameters further decreased to 0.947% and 0.015% of the complete model, respectively, while maintaining an IoU of 0.42. The experiments also showed that the Otsu threshold method works stably within an appropriate range of illumination changes, differing negligibly from the optimal threshold.
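The Otsu method used here picks the decision threshold that maximizes the between-class variance of the per-pixel score histogram; a standard NumPy implementation sketch (the bimodal test scores are synthetic, not from the paper's data):

```python
import numpy as np

def otsu_threshold(scores, bins=256):
    """Otsu's method on network scores in [0, 1]: return the cut that
    maximizes between-class variance, instead of a fixed decision value."""
    hist, edges = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    p = hist.astype(np.float64) / max(hist.sum(), 1)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)            # class-0 (background) probability mass
    mu = np.cumsum(p * centers)  # cumulative mean
    mu_t = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(between)]

# Synthetic bimodal scores: background pixels near 0.1, powerline pixels near 0.9.
scores = np.concatenate([np.full(900, 0.1), np.full(100, 0.9)])
t = otsu_threshold(scores)
```

Because the threshold is recomputed from each image's own histogram, a global luminance shift moves it along with the score distribution, unlike a fixed decision value.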
Conclusion
Improvements in the dataset and the loss function independently enhanced the performance of the baseline model. Sparse regularization and network pruning reduced the network's parameters and computational load, which facilitates deploying the model on resource-constrained inference terminals such as UAVs. The proposed saliency measure exhibited better compression capability than the conventional 2-norm metric, and the adaptive threshold method improved the robustness of the model under luminance changes.
smart inspection; image semantic segmentation; sparse regularization; network pruning; generative adversarial network (GAN)
Arjovsky M and Bottou L. 2017. Towards principled methods for training generative adversarial networks//Proceedings of the 5th International Conference on Learning Representations. Toulon, France: [s. n.]
Arjovsky M, Chintala S and Bottou L. 2017. Wasserstein GAN[EB/OL]. [2020-10-27]. https://arxiv.org/pdf/1701.07875.pdf
Baker L, Mills S, Langlotz T and Rathbone C. 2016. Power line detection using Hough transform and line tracing techniques//Proceedings of 2016 International Conference on Image and Vision Computing New Zealand (IVCNZ). Palmerston North, New Zealand: IEEE: 1-6 [DOI: 10.1109/IVCNZ.2016.7804438]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial nets//Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 2. Montreal, Canada: MIT Press: 2672-2680 [DOI: 10.5555/2969033.2969125]
Han S, Mao H and Dally W J. 2016. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding//Proceedings of the 4th International Conference on Learning Representations Conference Track Proceedings. San Juan, Puerto Rico: [s. n.]
Karras T, Aila T, Laine S and Lehtinen J. 2018. Progressive growing of GANs for improved quality, stability, and variation//Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: [s. n.]
Le Cun Y, Denker J S and Solla S A. 1989. Optimal brain damage//Proceedings of the 2nd International Conference on Neural Information Processing Systems. Denver, USA: MIT Press: 598-605 [DOI: 10.5555/2969830.2969903]
Li B L, Wu B W, Su J and Wang G R. 2020. EagleEye: fast sub-net evaluation for efficient neural network pruning//Proceedings of 2020 European Conference on Computer Vision (ECCV). Glasgow, UK: Springer: 639-654 [DOI: 10.1007/978-3-030-58536-5_38]
Liu J W, Li Y X, Gong Z, Liu X G and Zhou Y J. 2020. Power line recognition method via fully convolutional network. Journal of Image and Graphics, 25(5): 956-966 [DOI: 10.11834/jig.190316]
Liu Z, Sun M J, Zhou T H, Huang G and Darrell T. 2019. Rethinking the value of network pruning//Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: [s. n.]
Madaan R, Maturana D and Scherer S. 2017. Wire detection using synthetic data and dilated convolutional networks for unmanned aerial vehicles//Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Vancouver, Canada: IEEE: 3487-3494 [DOI: 10.1109/IROS.2017.8206190]
Molchanov P, Tyree S, Karras T, Aila T and Kautz J. 2017. Pruning convolutional neural networks for resource efficient inference//Proceedings of the 5th International Conference on Learning Representations. Toulon, France: [s. n.]
Molchanov P, Mallya A, Tyree S, Frosio I and Kautz J. 2019. Importance estimation for neural network pruning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 11256-11264 [DOI: 10.1109/CVPR.2019.01152]
Paszke A, Chaurasia A, Kim S and Culurciello E. 2017. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2021-01-06]. https://arxiv.org/pdf/1606.02147.pdf
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Song B Q and Li X L. 2014. Power line detection from optical images. Neurocomputing, 129: 350-361[DOI: 10.1016/j.neucom.2013.09.023]
Wang X W. 2019. Research on Semantic Segmentation of Power Line Based on Image. Hangzhou: Zhejiang University
Yetgin Ö E and Gerek Ö N. 2019a. Powerline Image Dataset (Infrared-IR and Visible Light-VL)[DB/OL]. [2020-10-18]. https://data.mendeley.com/datasets/n6wrv4ry6v/8 [DOI: 10.17632/n6wrv4ry6v.8]
Yetgin Ö E and Gerek Ö N. 2019b. Ground Truth of Powerline Dataset (Infrared-IR and Visible Light-VL)[DB/OL]. [2020-10-18]. https://data.mendeley.com/datasets/twxp8xccsw/9 [DOI: 10.17632/twxp8xccsw.9]
Yu C Q, Wang J B, Peng C, Gao C X, Yu G and Sang N. 2018. BiSeNet: bilateral segmentation network for real-time semantic segmentation//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 334-349 [DOI: 10.1007/978-3-030-01261-8_20]
Zhang H, Yang W, Yu H, Zhang H J and Xia G S. 2019. Detecting power lines in UAV images with convolutional features and structured constraints. Remote Sensing, 11(11): #1342 [DOI: 10.3390/rs11111342]
Zhang J J, Liu L, Wang B H, Chen X G, Wang Q and Zheng T R. 2012. High speed automatic power line detection and tracking for a UAV-based inspection//Proceedings of 2012 International Conference on Industrial Control and Electronics Engineering. Xi'an, China: IEEE: 266-269 [DOI: 10.1109/ICICEE.2012.77]
Zhao H S, Qi X J, Shen X Y, Shi J P and Jia J Y. 2018. ICNet for real-time semantic segmentation on high-resolution images//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 418-434 [DOI: 10.1007/978-3-030-01219-9_25]
Zhao L, Wang X P, Yao H T and Tian M. 2021. Survey of power line extraction methods based on visible light aerial image. Power System Technology, 45(4): 1536-1546 [DOI: 10.13335/j.1000-3673.pst.2020.0300a]