双向特征融合的数据自适应SAR图像舰船目标检测模型
Data-adaptive single-shot ship detector with a bidirectional feature fusion module for SAR images
2020, Vol. 25, No. 9, pp. 1943-1952
Received: 2019-11-14,
Revised: 2020-03-21,
Accepted: 2020-03-28,
Published in print: 2020-09-16
DOI: 10.11834/jig.190558
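Objective
Ship detection in synthetic aperture radar (SAR) images is an important means of maritime surveillance. Deep learning-based object detection models have achieved great success on natural images, but because natural images and SAR images differ, these models cannot be transferred directly to SAR target detection. To meet the speed and accuracy requirements of practical SAR target detection, this paper builds on the classic single shot detector (SSD) framework and proposes a lightweight SAR ship detection network based on feature optimization.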
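Method
The model is improved and the network structure is slimmed down. A data-driven target distribution clustering algorithm is proposed to learn the distributions of target scale and aspect ratio in the SAR dataset, and its outputs are used to set the network parameters. The features extracted by the convolutional neural network (CNN) are optimized through a bidirectional fusion mechanism between high-level and low-level features. The semantic information of the high-level features is added to the low-level features through a semantic aggregation module. A feature average map is extracted from the low-level features and, after processing, serves as an attention weight map that weights the high-level features pixel by pixel, so that the rich spatial information of the low-level features is merged into the high-level features.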
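The abstract does not say which clustering procedure is used, so the following is only a minimal sketch of the general idea, assuming plain k-means on the logarithm of ground-truth box widths and heights. The function name `cluster_boxes`, the deterministic initialisation, and the sample box sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def cluster_boxes(wh, k, iters=50):
    """Plain k-means on (log w, log h); returns k (w, h) centres sorted by area.

    Assumption: this stands in for the paper's clustering step, whose exact
    form is not given in the abstract.
    """
    pts = np.log(np.asarray(wh, dtype=float))
    # Deterministic initialisation: evenly spaced picks after sorting by log-area.
    order = np.argsort(pts.sum(axis=1))
    picks = np.linspace(0, len(pts) - 1, k).round().astype(int)
    centres = pts[order[picks]]
    for _ in range(iters):
        # Assign every box to its nearest centre.
        dist = np.linalg.norm(pts[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centres; an empty cluster keeps its previous centre.
        new = np.array([
            pts[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new, centres):
            break
        centres = new
    wh_centres = np.exp(centres)
    return wh_centres[np.argsort(wh_centres.prod(axis=1))]

# Hypothetical ground-truth box sizes (width, height) in pixels.
boxes = [(12, 10), (14, 11), (13, 9),
         (60, 20), (64, 22), (58, 19),
         (30, 90), (28, 95), (33, 88)]

centres = cluster_boxes(boxes, k=3)
scales = np.sqrt(centres.prod(axis=1))   # candidate default-box scales
ratios = centres[:, 0] / centres[:, 1]   # candidate aspect ratios (w / h)
print(np.round(scales, 1), np.round(ratios, 2))
```

In an SSD-style detector, cluster centres of this kind could replace the hand-picked default-box scales and aspect ratios, which is the sense in which the settings become data-adaptive.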
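Results
Experiments are conducted on the public SAR ship detection dataset (SSDD). Compared with the original SSD model, the lightweight structure reduces the per-sample test time to 65% of that of SSD without loss of detection accuracy. The bidirectional feature fusion mechanism raises the average precision (AP) from 77.93% to 80.13%, with training and testing times of 64.1% and 72.6% of those of SSD, respectively. Compared with published deep learning-based SAR ship detection methods, the proposed method achieves the best performance in both speed and accuracy: its AP is 1.23% higher than that of the second most accurate model, and its training and testing times are shorter than those of that model by 559.34 ms and 175.35 ms, respectively.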
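The 65% figure can be cross-checked against the per-image times reported in the extended English abstract for the lightweight design (SSD: 20.79 ms training and 14.02 ms testing; lightweight model: 12.74 ms and 9.17 ms). The short calculation below only restates those reported numbers as ratios; the variable names are illustrative.

```python
# Per-image times in milliseconds, as reported in the extended abstract.
ssd_train_ms, ssd_test_ms = 20.79, 14.02
light_train_ms, light_test_ms = 12.74, 9.17

train_ratio = light_train_ms / ssd_train_ms
test_ratio = light_test_ms / ssd_test_ms

print(f"training time: {train_ratio:.1%} of SSD")  # 61.3% of SSD
print(f"testing time:  {test_ratio:.1%} of SSD")   # 65.4% of SSD
```

These ratios describe the lightweight design alone; the 64.1% and 72.6% figures refer to the model with the fusion modules added.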
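Conclusion
The experiments validate the effectiveness of the proposed model, which combines advantages in detection speed and accuracy and is highly practical.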
Objective
Ship detection plays an important role in civil and military fields, including marine object identification, maritime transportation, rescue operations, marine security, and disaster relief. As a basic means of marine monitoring, ship detection in synthetic aperture radar (SAR) images has been studied for years. With the development of sensor and platform technologies, large volumes of SAR data have become available, making automatic data-driven detection algorithms feasible. Deep learning-based detection models have proven highly successful in common object detection tasks on natural scene images, and they outperform many traditional methods based on handcrafted features. However, when they are transferred directly to SAR ship detection, many challenges emerge and the results are unsatisfactory, because natural and SAR images differ in several respects. Because of the coherent imaging mechanism, ships in SAR images usually appear as bright regions and lack the detail found in natural images. SAR remote sensing images cover a large swath, and targets may be distributed densely or sparsely; thus, processing SAR images is usually more complex than processing natural ones. In addition, the size and shape of ship targets vary, ranging from several pixels to hundreds of pixels. All these factors complicate ship detection in SAR images. To address these challenges and meet practical demands, this study proposes a lightweight data-adaptive detector with a feature-optimizing mechanism, built on the well-known single-shot detector (SSD), to improve detection precision and speed.
Method
In this study, the original SSD is modified by halving the number of channels and removing the last two convolution blocks. The network parameters are set from the outputs of the proposed data-driven target distribution clustering algorithm, which learns the distributions of targets in the SAR dataset, including ship size and aspect ratio. The algorithm does not rely on human experience and lets the detector adapt to the SAR dataset. A truncated visual geometry group 16-layer net (VGG16) is used to extract features from the input SAR images. The features extracted by convolutional neural networks are hierarchical: low-level features with high spatial resolution usually contain more local and spatial detail, whereas high-level features with low resolution carry more semantic and global information. For object detection tasks, both spatial and semantic information are important. Thus, information must be aggregated through a fusion strategy. A new bidirectional feature fusion mechanism, which contains a semantic aggregation module and a novel attention guidance module, is proposed. In feature pyramid networks, the higher features are added to the lower features after an upsampling operation. On this basis, the upsampled higher features in our model are concatenated with the lower features along the channel dimension, and the number of channels is adjusted through a 1×1 convolution. Instead of simply adding lower features to higher features, a reverse, bottom-up fusion with an attention mechanism is applied. A spatial attention map is generated for each convolution block, and the attention map that contains the most spatial information is selected as the weighted map. In the weighted map, target pixels usually have higher values and are therefore more prominent, whereas the values of background pixels are suppressed. After the weighted map is downsampled, element-wise multiplication is performed between the weighted map and the higher features. The features of the targets are strengthened; thus, spatial information is passed to the higher-level features. The optimized features are then fed into the detector heads to predict the locations and types of targets; the low-level features mainly detect small ships, whereas the high-level features are responsible for large ones. The entire network is trained with a weighted sum of the location and classification losses. During inference, non-maximum suppression is used to remove duplicate bounding boxes.
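The two fusion directions described above can be sketched with NumPy on a pair of feature maps in `(channels, height, width)` layout. This is an illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, the 1×1 convolution weights are random rather than learned, the attention map is taken as the channel-wise mean of the low-level features, and min-max normalisation stands in for whatever processing the paper applies to the map.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def avg_pool2x(a):
    """2x2 average pooling of an (H, W) map; H and W must be even."""
    h, w = a.shape
    return a.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def semantic_aggregation(low, high, weight):
    """Top-down path: upsample high, concatenate on channels, apply a 1x1 conv."""
    stacked = np.concatenate([low, upsample2x(high)], axis=0)
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", weight, stacked)

def attention_guidance(low, high, eps=1e-8):
    """Bottom-up path: a map derived from low features reweights high features."""
    avg_map = low.mean(axis=0)  # channel-wise average map, shape (H_low, W_low)
    # Assumed processing step: min-max normalisation to [0, 1).
    att = (avg_map - avg_map.min()) / (avg_map.max() - avg_map.min() + eps)
    # Downsample to the high-level resolution, then weight every channel.
    return high * avg_pool2x(att)[None, :, :]

rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))    # low-level: 4 channels, 8x8
high = rng.standard_normal((6, 4, 4))   # high-level: 6 channels, 4x4
w = rng.standard_normal((4, 10)) * 0.1  # 1x1 conv: 10 input -> 4 output channels

fused_low = semantic_aggregation(low, high, w)
fused_high = attention_guidance(low, high)
print(fused_low.shape, fused_high.shape)  # (4, 8, 8) (6, 4, 4)
```

Concatenation followed by a 1×1 convolution lets the network learn how to mix the two sources per channel, whereas the element-wise addition used in feature pyramid networks fixes that mix in advance.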
Result
The public SAR ship detection dataset, which is widely used in the SAR ship detection literature, is adopted in the experiments. All experiments are implemented in Python under the TensorFlow framework on a 64-bit computer running Ubuntu 16.06 with an Intel(R) Core(TM) i7-6770K CPU @ 4.00 GHz × 8 and an NVIDIA GTX 1080Ti GPU, with CUDA 9.0 and cuDNN 7.0 for acceleration. The number of training iterations, initial learning rate, and batch size are set to 120k, 0.0001, and 24, respectively. A momentum optimizer is used, with weight decay, gamma, and momentum values of 0.0005, 0.1, and 0.9, respectively. An ablation study is conducted to verify the effectiveness of each proposed module, and the model is compared with five published state-of-the-art methods. Precision rate, recall rate, average precision (AP), and the average training and testing time on a single image are taken as evaluation indicators. With the original SSD architecture, setting the parameters from the proposed data-driven target distribution clustering algorithm improves AP by 1.08% over the original parameter settings. The lightweight design of the network significantly improves detection speed; compared with those of SSD, the training and testing times of the proposed model decrease from 20.79 ms to 12.74 ms and from 14.02 ms to 9.17 ms, respectively. The semantic aggregation and attention fusion modules each improve detection precision, and the best precision is achieved when the two modules are used together. The AP increases from 77.93% to 80.13%, and the precision and recall rates increase from 89.54% to 96.68% and from 88.60% to 89.60%, respectively. Speed is not considerably affected, and the model still runs faster than SSD. The proposed model outperforms the other models in terms of precision and speed; it improves AP by 6.9%, 1.23%, 9.09%, and 2.9% in comparison with four other methods.
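The abstract does not state which AP protocol is used. As a hedged illustration of how an AP value is obtained from ranked detections, the sketch below assumes the all-point interpolation of the precision-recall curve used in later PASCAL VOC evaluations; the function name and the toy detections are hypothetical and are not results from the paper.

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """Area under the interpolated precision-recall curve (all-point variant).

    scores: detection confidences; is_tp: 1 if the detection matched a
    ground-truth ship, else 0; num_gt: number of ground-truth ships.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # Pad the curve, then make precision non-increasing from left to right.
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]
    # Sum rectangle areas at the points where recall changes.
    idx = np.nonzero(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# Toy example: four detections, three correct, four ground-truth ships.
ap = average_precision(scores=[0.9, 0.8, 0.7, 0.6], is_tp=[1, 0, 1, 1], num_gt=4)
print(round(ap, 4))  # 0.625
```

Precision and recall at a single operating point follow from the same cumulative counts, which is why the three indicators are usually reported together.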
Conclusion
In this study, we proposed a lightweight data-adaptive single-shot detector with a feature-optimizing mechanism. Experimental results show that our model has remarkable advantages over other published state-of-the-art detection approaches in terms of precision and speed.