Improved one-stage fully convolutional network for oblique object detection in remote sensing imagery
2022, Vol. 27, No. 8, Pages 2537-2548
Print publication date: 2022-08-16
Accepted: 2021-11-04
DOI: 10.11834/jig.210157
Yuan Zhou, Qingqing Yang, Qiang Ma, Bowei Xue, Xiangnan Kong. Improved one-stage fully convolutional network for oblique object detection in remote sensing imagery[J]. Journal of Image and Graphics, 2022,27(8):2537-2548.
Objective
Mainstream deep-learning object detectors for natural images depend heavily on well-chosen anchor settings and represent object locations with axis-aligned horizontal boxes. Ground objects in remote sensing imagery, however, vary greatly in size, are densely distributed, and have extreme aspect ratios and arbitrary orientations, so their locations are better represented by oriented boxes aligned with the objects. This paper combines anchor-free and oriented-box detection techniques to achieve high-accuracy object recognition in remote sensing imagery.
Method
Oriented-box annotations fit object boundaries more tightly and effectively reduce interfering factors in recognition. Building on the single-stage anchor-free detector FCOS (fully convolutional one-stage object detector), this paper introduces a gliding-vertex structure to achieve efficient and accurate oriented object detection in remote sensing imagery. Unlike FCOS, the improved detector adds two branches for oriented-box detection: it regresses gliding-vertex ratios on two adjacent sides of the horizontal box to produce the oriented box, and it predicts the area ratio between the oriented and horizontal boxes to reduce detection error in extreme cases.
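The area ratio mentioned above can be computed with the shoelace formula. The sketch below is a hypothetical illustration of how such a training target might be derived from an oriented quadrilateral annotation, not the paper's code:

```python
def shoelace_area(quad):
    """Polygon area via the shoelace formula; quad is [(x, y), ...]."""
    n = len(quad)
    s = 0.0
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def obliquity_factor(quad):
    """Area ratio between an oriented quadrilateral and its axis-aligned
    bounding box: close to 1 for nearly horizontal objects, small for
    strongly tilted ones."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    hbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return shoelace_area(quad) / hbox_area
```

A 45° "diamond" inscribed in a 10×10 box, for example, covers exactly half of that box, so its factor is 0.5, while an axis-aligned rectangle yields 1.0.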
Result
The method is evaluated on DOTA (object detection in aerial images), currently the largest and most complex oriented remote sensing object detection dataset. With ResNet50 as the backbone, it achieves a mean average precision (mAP) of 74.84%, improving accuracy by 33.02% over the original horizontal-box FCOS, efficiency by 38.82% over YOLOv3 (you only look once), and accuracy by 1.53% over the oriented detector R3Det (refined rotation RetinaNet).
Conclusion
Experimental results show that the improved FCOS algorithm adapts well to oriented object recognition in high-resolution remote sensing scenes.
Objective
Most object detection techniques identify potential regions through well-designed anchors, and recognition accuracy depends heavily on how the anchors are set. Without fine-tuning, this usually leads to sub-optimal results when applied to unseen scenarios due to domain gaps. The use of anchors thus constrains the generalization ability of object detection techniques on aerial imagery and increases the cost of model training and parameter tuning. Moreover, object detection approaches designed for natural scenes represent objects using axis-aligned rectangles (horizontal boxes), which are inadequate for aerial images since objects may have arbitrary orientations when observed from overhead. A horizontal bounding box commonly encloses multiple object instances and redundant background information, which can confuse the learning algorithm and reduce recognition accuracy. A better option is to use oblique rectangles (oriented boxes) in aerial images. Oriented boxes are more compact than horizontal boxes, as they share the objects' direction and closely adhere to their boundaries. We propose a novel anchor-free object detection approach that generates oriented bounding boxes via gliding vertices of horizontal ones. Our algorithm is built on the anchor-free detector FCOS (fully convolutional one-stage object detector). FCOS achieves accuracy comparable to anchor-based methods while totally eliminating the need to calibrate anchors and the complex pre- and post-processing associated with them. It also requires less memory and can leverage more positive samples than its anchor-based counterparts. FCOS was originally designed for object detection in natural scenes; we adopt it as our baseline and extend it for oblique object detection in aerial images. Our contributions are as follows: 1) extending FCOS for oblique object detection; 2) mitigating the shape-distortion issue of the gliding-vertex representation of oriented boxes; and 3) benchmarking the extended FCOS on the DOTA dataset (object detection in aerial images).
Method
Our method integrates FCOS with the gliding-vertex approach to realize anchor-free oblique object detection. The following describes the method in three aspects: the network architecture, the parameterization of oriented boxes, and the experiments conducted to evaluate the proposed network. The network consists of a backbone for feature extraction, a feature pyramid network for feature fusion, and multiple detection heads for object recognition. Instead of using an orientation angle to represent box direction, we adopt the gliding-vertex representation for simplicity and robustness. We use ResNets as the backbone, as FCOS does. The feature pyramid network fuses multi-level features from the backbone convolutional neural network (CNN) to detect objects of various scales. Specifically, the C3, C4 and C5 feature maps are taken to produce P3, P4 and P5 by 1×1 convolution and lateral connection.
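The lateral connections and top-down pathway can be sketched in plain NumPy. This is a minimal illustration with hypothetical shapes and identity-like weights, not the paper's implementation (the real network uses learned convolutions, and this paper additionally replaces the standard additive fusion with channel concatenation followed by 1×1 convolution and batch normalization):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing matmul: x is (C_in, H, W),
    w is (C_out, C_in)."""
    c, h, wdt = x.shape
    return (w @ x.reshape(c, -1)).reshape(w.shape[0], h, wdt)

def upsample2x(x):
    """Nearest-neighbour 2x upsampling along the spatial axes."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(c3, c4, c5, laterals):
    """Build P3-P5: project each C-level with a 1x1 lateral conv, then add
    the 2x-upsampled coarser level (standard FPN top-down pathway)."""
    p5 = conv1x1(c5, laterals[2])
    p4 = conv1x1(c4, laterals[1]) + upsample2x(p5)
    p3 = conv1x1(c3, laterals[0]) + upsample2x(p4)
    return p3, p4, p5
```

Each output level keeps the spatial resolution of its C-level input, which is what lets the detection heads assign objects of different scales to different pyramid levels.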
P5 is fed into two subsequent convolutional layers with stride 2 to obtain P6 and P7. Unlike FCOS, we concatenate feature maps along the channel dimension, followed by a 1×1 convolution and batch normalization for feature fusion. For each location on P3, P4, P5, P6 and P7,
the network predicts whether an object exists at that location as well as the object category. For oriented box regression, we parameterize a box using a 7D real vector (l, t, r, b, α1, α2, k). Here l, t, r and b are the distances from the location to the four sides of the object's horizontal box; together these four parameters determine the size and location of the horizontal bounding box. (α1, α2) denote the gliding offsets on the top and left sides of the horizontal bounding box, from which the coordinates of the first and second vertices of the oriented object are derived. The obliquity factor k represents the area ratio between an oriented object and its horizontal bounding box; it describes the tilt degree of an object and guides the network to approximate nearly horizontal objects with horizontal boxes. With this design, we can generate horizontal and oriented bounding boxes simultaneously with minimal increase in computing time and complexity. It is worth noting that we predict gliding distances on only two sides of the horizontal bounding box rather than four, under the assumption that the predicted boxes are parallelograms rather than arbitrary quadrilaterals. Consistent with FCOS, we use fully convolutional sub-networks for category classification and location regression. The detection heads are implemented with four convolutional layers and take the feature maps produced by the feature pyramid network as input. The network outputs are decoded to obtain classification scores as well as box locations.
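As a concrete illustration, the following sketch decodes one predicted 7D vector into a quadrilateral. It is an assumption-laden reconstruction rather than the paper's code: the exact edges along which the two offsets glide and the threshold at which k triggers a horizontal fallback are hypothetical choices.

```python
def decode_oriented_box(x, y, l, t, r, b, a1, a2, k, k_thresh=0.9):
    """Decode one 7D prediction (l, t, r, b, a1, a2, k) at location (x, y)
    into a quadrilateral.  l/t/r/b give the horizontal box; a1 slides the
    first vertex along its top edge, a2 slides the fourth vertex along its
    left edge, and the parallelogram assumption fixes the remaining two
    vertices by point reflection through the box center.  When the
    obliquity factor k is close to 1, the object is nearly horizontal and
    the horizontal box is returned directly."""
    x1, y1 = x - l, y - t                    # top-left corner
    x2, y2 = x + r, y + b                    # bottom-right corner
    if k >= k_thresh:                        # nearly horizontal object
        return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    w, h = x2 - x1, y2 - y1
    v1 = (x1 + a1 * w, y1)                   # glides along the top edge
    v4 = (x1, y2 - a2 * h)                   # glides along the left edge
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2    # box center
    v3 = (2 * cx - v1[0], 2 * cy - v1[1])    # point-reflect v1
    v2 = (2 * cx - v4[0], 2 * cy - v4[1])    # point-reflect v4
    return [v1, v2, v3, v4]
```

With this construction, zero offsets degenerate gracefully to the horizontal box itself, which is consistent with using k to approximate nearly horizontal objects.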
Result
To illustrate the effectiveness of the proposed approach, we evaluated the extended FCOS on the challenging oriented object detection dataset DOTA with various backbones and inference strategies. Without bells and whistles, the proposed network outperforms the horizontal detection baseline by 33.02% in mean average precision (mAP). Compared to YOLOv3 (you only look once), it achieves a 38.82% performance boost in frames per second (FPS). Compared to R3Det (refined rotation RetinaNet), it improves detection accuracy by 1.53% in terms of mAP. We achieve an mAP of 74.84% on DOTA using ResNet50, which is higher than most one-stage and two-stage detectors.
Conclusion
The proposed method shows potential for improving both single-stage and two-stage detectors in terms of recognition accuracy and time efficiency.
deep learning; remote sensing image; anchor free; feature extraction; multi-scale feature fusion; oblique object detection
Azimi S M, Vig E, Bahmanyar R, Körner M and Reinartz P. 2019. Towards multi-class object detection in unconstrained remote sensing imagery//Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: Springer: 150-165
Chen H J, Wu D, Hou X Y and Wei Y T. 2021. A remote sensing image rotating ship target detection method based on dense sub-region cutting. China, CN201910816272.1
Ding P. 2019. Research on Object Detection Technology in Optical Remote Sensing based on Deep Convolutional Neural Networks. Changchun: University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences)
Feng W D, Sun X and Wang H Q. 2013. Spatial semantic model based geo-objects detection method for high resolution remote sensing images. Journal of Electronics and Information Technology, 35(10): 2518-2523 [DOI: 10.3724/SP.J.1146.2013.00033]
Fu K, Xu G L, Sun X, Sun H, Zheng X W, Yan M L and Diao W H. 2019. A method for automatically recognizing large-scale remote sensing image targets based on deep learning. China, CN201511026790.1
Gao X, Li H, Zhang Y, Yan M L, Zhang Z S, Sun X, Sun H and Yu H F. 2018. Vehicle detection in remote sensing images of dense areas based on deformable convolution neural network. Journal of Electronics and Information Technology, 40(12): 2812-2819 [DOI: 10.11999/JEIT180209]
Girshick R. 2015. Fast R-CNN//Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE: 1440-1448
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 770-778
Hou B, Zhou Y R, Jiao L C, Ma W P, Ma J J and Yang S J. 2020. Remote sensing image aircraft target detection method based on bounding box correction algorithm. China, CN201911017055.2
Huang L C, Yang Y, Deng Y F and Yu Y. 2015. DenseBox: unifying landmark localization with end to end object detection. [2021-03-03].
Law H and Deng J. 2018. CornerNet: detecting objects as paired keypoints//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 765-781
Li X J, Wang C L, Li Y and Sun H. 2016. Optical remote sensing object detection based on fused feature contrast of subwindows. Optics and Precision Engineering, 24(8): 2067-2077 [DOI: 10.3788/OPE.20162408.2067]
Lin T Y, Dollár P, Girshick R, He K M, Hariharan B and Belongie S. 2016. Feature pyramid networks for object detection. [EB/OL]. [2021-03-18].
Lin T Y, Goyal P, Girshick R, He K M and Dollár P. 2017. Focal loss for dense object detection//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE: 2999-3007
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y and Berg A C. 2016. SSD: single shot MultiBox detector//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 21-37
Liu Z K, Hu J G, Weng L B and Yang Y P. 2017. Rotated region based CNN for ship detection//Proceedings of 2017 IEEE International Conference on Image Processing. Beijing, China: IEEE: 900-904
Ma J Q, Shao W Y, Ye H, Wang L, Wang H, Zheng Y B and Xue X Y. 2018. Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 20(11): 3111-3122 [DOI: 10.1109/TMM.2018.2818020]
Pan X J, Ren Y Q, Sheng K K, Dong W M, Yuan H L, Guo X W, Ma C Y and Xu C S. 2020. Dynamic refinement network for oriented and densely packed object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE: 11204-11213
Redmon J, Divvala S, Girshick R and Farhadi A. 2016. You only look once: unified, real-time object detection//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 779-788
Redmon J and Farhadi A. 2018. YOLOv3: an incremental improvement. [2021-03-03].
Ren S Q, He K M, Girshick R and Sun J. 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149 [DOI: 10.1109/TPAMI.2016.2577031]
Shi W X, Tan D L and Bao S L. 2020. Feature enhancement SSD algorithm and its application in remote sensing images target detection. Acta Photonica Sinica, 49(1): #0128002 [DOI: 10.3788/gzxb20204901.0128002]
Tang W, Zhao B J and Long T. 2019. Aircraft detection in remote sensing image based on lightweight network. Journal of Signal Processing, 35(5): 768-774 [DOI: 10.16798/j.issn.1003-0530.2019.05.005]
Tian Z, Shen C H, Chen H and He T. 2019. FCOS: fully convolutional one-stage object detection//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South): IEEE: 9626-9635
Tian Z, Shen C H, Chen H and He T. 2020. FCOS: a simple and strong anchor-free object detector. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4): 1922-1933 [DOI: 10.1109/TPAMI.2020.3032166]
Wang S Y, Gao X, Sun H, Zheng X W and Sun X. 2017. An aircraft detection method based on convolutional neural networks in high-resolution SAR images. Journal of Radars, 6(2): 195-203
Wang Y Q, Ma L and Tian Y. 2011. State-of-the-art of ship detection and recognition in optical remotely sensed imagery. Acta Automatica Sinica, 37(9): 1029-1039 [DOI: 10.3724/SP.J.1004.2011.01029]
Wu Y J. 2015. Research on Detection of Aircraft in High-Resolution Optical Remote Sensing Images. Changsha: National University of Defense Technology
Xia G S, Bai X, Ding J, Zhu Z, Belongie S, Luo J B, Datcu M, Pelillo M and Zhang L P. 2018. DOTA: a large-scale dataset for object detection in aerial images//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3974-3983
Xu Y C, Fu M T, Wang Q M, Wang Y K, Chen K, Xia G S and Bai X. 2021. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(4): 1452-1459 [DOI: 10.1109/TPAMI.2020.2974745]
Yang J X and Liu G Z. 2020. Small object detection with enhanced features. Journal of Shanxi Datong University (Natural Science), 36(6): 16-19 [DOI: 10.3969/j.issn.1674-0874.2020.06.006]
Yang X, Sun H, Fu K, Yang J R, Sun X, Yan M L and Guo Z. 2018. Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sensing, 10(1): #132 [DOI: 10.3390/rs10010132]
Yang X, Yang J R, Yan J C, Zhang Y, Zhang T F, Guo Z, Sun X and Fu K. 2019. SCRDet: towards more robust detection for small, cluttered and rotated objects//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8231-8240
Zhang X H, Yao L B, Lyu Y F, Han P and Li J W. 2020. Center based model for arbitrary-oriented ship detection in remote sensing images. Acta Photonica Sinica, 49(4): #0410005 [DOI: 10.3788/gzxb20204904.0410005]
Zhang Y, Sun X, Xu G L and Fu K. 2018. Multiscale ship detection from SAR images based on densely connected neural networks//Proceedings of the 5th Annual Conference on High-Resolution Earth Observation. Xi'an, China: 162-179
Zhao Y Q, Rao Y, Dong S P and Zhang J Y. 2020. Survey on deep learning object detection. Journal of Image and Graphics, 25(4): 629-654 [DOI: 10.11834/jig.190307]