Attention mechanism improves CNN remote sensing image object detection
2019, Vol. 24, No. 8, pp. 1400-1408
Received: 2018-12-10,
Revised: 2019-01-10,
Published in print: 2019-08-16
DOI: 10.11834/jig.180649
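Objective
Object detection is one of the core problems in remote sensing image processing; it aims to locate and identify objects of interest in remote sensing images. To address the low accuracy of remote sensing object detection, experiments are conducted on the public NWPU_VHR-10 dataset, and the low-quality images in the dataset are reconstructed with the enhanced deep super-resolution (EDSR) network to provide a high-quality dataset for training convolutional neural networks (CNNs).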
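Method
The original Faster-RCNN (region convolutional neural network) is improved. An attention module is added to the feature extraction network to capture more information about the objects of interest and suppress irrelevant information, which addresses the complex backgrounds and small objects that result from the wide field of view of remote sensing images. A softened non-maximum suppression is used to cope with object rotation in remote sensing images. In addition, the cross-correlation between object distributions is used to further screen redundant candidate boxes, which lowers the false alarm rate and further improves detector performance.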
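The abstract does not describe the internal design of the attention module. As an illustration only, the following minimal sketch assumes a channel-attention block in the style of the convolutional block attention module (CBAM): the feature map is pooled over its spatial dimensions, passed through a small shared two-layer network, and the resulting per-channel weights rescale the features. The function name, the reduction ratio `r`, and the random weights are hypothetical; in a real detector the weights are learned and the block sits inside the feature extraction network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Rescale the channels of a feature map with shape (C, H, W).

    w1 has shape (C // r, C) and w2 has shape (C, C // r). They are the
    weights of a shared two-layer MLP (hypothetical values here; a real
    network would learn them during training).
    """
    avg_desc = feat.mean(axis=(1, 2))   # average-pooled descriptor, shape (C,)
    max_desc = feat.max(axis=(1, 2))    # max-pooled descriptor, shape (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)   # linear, ReLU, linear

    # One weight in (0, 1) per channel: informative channels are kept,
    # less useful ones are attenuated.
    weights = sigmoid(mlp(avg_desc) + mlp(max_desc))
    return feat * weights[:, None, None], weights

# Toy example with random features and random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out, weights = channel_attention(feat, w1, w2)
print(out.shape, weights.round(3))
```

The block adds only two small matrix products per feature map, which is consistent with the abstract's description of the attention module as lightweight.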
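Result
Two sets of comparative experiments were conducted to verify the effectiveness of the proposed method. The first is an ablation study of the proposed modules; the improved algorithm scores 12.2% higher than the original Faster-RCNN, which confirms the effectiveness of each module. The second compares the proposed method with other existing methods on the NWPU_VHR-10 dataset; the proposed algorithm reaches an average detection accuracy of 79.1%, higher than the other methods compared.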
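Conclusion
EDSR is used to super-resolve the images, and Faster-RCNN is improved to better handle complex backgrounds, small objects, and object rotation in remote sensing object detection. The experimental results show that the average detection accuracy of the proposed algorithm is improved.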
Objective
Remote sensing image object detection, which aims to locate and identify objects of interest in remote sensing images, is one of the core problems in remote sensing image processing. Object detection in optical remote sensing images is a fundamental and challenging problem in aerial and satellite image analysis and an important part of the automated extraction of remote sensing information. It has broad application value in national defense security, urban construction planning, and disaster monitoring, and it has received great attention in recent years. As the range of applications for remote sensing images expands, fast and effective remote sensing object detection methods have broad prospects. With the rapid development of platform and sensor technology, the spatial resolution of remote sensing images continues to increase, and their visual difference from natural images is decreasing. An increasing number of computer vision methods can therefore be applied to object recognition in high-spatial-resolution remote sensing images, but low detection accuracy and low efficiency remain problems to be addressed.
Method
This paper proposes an improved convolutional neural network (CNN) detection method based on an attention mechanism and tests it on the NWPU_VHR-10 dataset, a 10-class geospatial object detection dataset. Some of the images have low resolution, which affects the experimental results. Therefore, some low-quality images in the dataset were reconstructed with the enhanced deep super-resolution (EDSR) network to provide a high-quality dataset for training CNNs. This paper studies how to adapt the Faster-RCNN model for multi-class object recognition to the characteristics that distinguish remote sensing images from natural images. The original Faster-RCNN network was improved as follows. An attention module was added to the feature extraction network to capture more information about the objects of interest and suppress other useless information, which addresses the complex backgrounds and small objects caused by the wide field of view of remote sensing images. A softened non-maximum suppression is used to adapt to object rotation in remote sensing images. To further improve detector performance, the cross-correlation between object distributions is used to screen redundant candidate boxes and reduce the false alarm rate.
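The abstract does not give the exact form of the softened suppression step. The sketch below assumes it follows the Gaussian variant of Soft-NMS: instead of discarding every box that overlaps a higher-scoring box, the score of each overlapping box is decayed according to its overlap, so closely packed or rotated objects are less likely to be suppressed outright. The function names, `sigma`, and `score_thresh` are illustrative choices, not values taken from the paper.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes.

    Returns the indices of the kept boxes in selection order, together
    with their (possibly decayed) scores.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep, kept_scores = [], []
    while remaining.size > 0:
        best = np.argmax(scores[remaining])      # highest remaining score
        idx = remaining[best]
        keep.append(int(idx))
        kept_scores.append(float(scores[idx]))
        remaining = np.delete(remaining, best)
        if remaining.size == 0:
            break
        overlaps = iou(boxes[idx], boxes[remaining])
        # The larger the overlap with the selected box, the stronger the decay.
        scores[remaining] *= np.exp(-(overlaps ** 2) / sigma)
        # Drop boxes whose score has decayed below the threshold.
        remaining = remaining[scores[remaining] > score_thresh]
    return keep, kept_scores

# Two heavily overlapping boxes and one isolated box (toy values).
demo_boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
demo_scores = [0.9, 0.8, 0.7]
keep, kept_scores = soft_nms(demo_boxes, demo_scores)
print(keep, [round(s, 3) for s in kept_scores])
```

In this toy case the second box is kept with a reduced score rather than removed, whereas hard non-maximum suppression with a typical IoU threshold of 0.5 would discard it.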
Result
Two sets of comparative experiments were conducted to verify the effectiveness of the method. The first is an ablation study of the four modules proposed in this paper: the attention module, the softened non-maximum suppression, the cross-correlation filtering mechanism, and super-resolution processing of low-quality images. The results show that the improved attention CNN achieves higher detection accuracy than the original Faster-RCNN across the 10 categories, and the average detection accuracy improves by 12.2%. All the proposed modules effectively improve object detection in aerial remote sensing images. Moreover, the added attention module is lightweight and hardly increases the computational cost of the network model, so it does not reduce the efficiency of the network. The second set compares the improved attention CNN with existing traditional and deep learning methods on the public NWPU_VHR-10 dataset. The proposed algorithm reaches an average detection accuracy of 79.1%, which is higher than that of the other algorithms.
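The abstract does not define how the average detection accuracy is computed. Assuming it is the mean over the 10 classes of the per-class average precision (AP), as is common in object detection benchmarks, the following sketch shows one way to compute AP from ranked detections and then average it. The function names and the toy detections are hypothetical and are not results from the paper.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scores: confidence of each detection.
    is_true_positive: 1 if the detection matches a ground-truth object, else 0.
    num_ground_truth: number of ground-truth objects of this class.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))   # rank by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)

    # Pad the curve and make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])

    # Sum the rectangle areas wherever recall changes.
    step = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[step + 1] - mrec[step]) * mpre[step + 1]))

def mean_average_precision(per_class_ap):
    """Average the per-class AP values (dict: class name -> AP)."""
    return float(np.mean(list(per_class_ap.values())))

# Toy example: three detections for one class with two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1], num_ground_truth=2)
print(round(ap, 4))
```

Under this assumption, a figure such as 79.1% would be obtained by applying `mean_average_precision` to the 10 per-class AP values.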
Conclusion
CNNs have great application potential in remote sensing image object detection and are a research hotspot now and for the future. How to better apply CNNs to object detection in aerial remote sensing images has important theoretical significance. In this study, the EDSR network is used to super-resolve some low-resolution images in the dataset. An attention mechanism is introduced to improve Faster-RCNN so that the algorithm focuses on the target regions of interest in the image, that is, on the extracted features that are more valuable for the current detection task. This improves the adaptability of the algorithm to complex backgrounds, small objects caused by a wide field of view, and object rotation caused by the viewing angle of aerial photography. Experimental results show that the proposed algorithm improves the average detection accuracy.