Object detection techniques based on deep learning for aerial remote sensing images: a survey

Shi Zhenghao; Wu Chenwei; Li Chengjian; You Zhenzhen; Wang Quan; Ma Chengcheng

doi:10.11834/jig.221085

Intelligent object detection in complex scene images


2023, Vol. 28, No. 9, pp. 2616-2643
Print publication date: 2023-09-16


Shi Zhenghao, Wu Chenwei, Li Chengjian, You Zhenzhen, Wang Quan, Ma Chengcheng. 2023. Object detection techniques based on deep learning for aerial remote sensing images: a survey. Journal of Image and Graphics, 28(09): 2616-2643. DOI: 10.11834/jig.221085

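Summary

Object detection in aerial remote sensing images aims to locate and identify objects of interest in remote sensing images. It is a key technology for the intelligent interpretation of aerial remote sensing images and has important applications in intelligence reconnaissance, disaster rescue, and resource exploration. However, aerial remote sensing images are large, their targets are small, densely packed, arbitrarily oriented, easily occluded, and imbalanced across categories, and their backgrounds are complex, so object detection in such images remains highly challenging. Detection methods based on deep convolutional neural networks have attracted growing attention because of their high accuracy and fast processing speed. To advance deep-learning-based object detection for aerial remote sensing images, this paper systematically reviews and summarizes current mainstream detection methods for remote sensing images, especially those proposed between 2020 and 2022. It first traces the development and evolution of deep-learning-based object detection methods. It then analyzes and summarizes representative algorithms among detectors based on convolutional neural networks and on Transformers, categorizes the improvement strategies devised for different remote sensing application scenarios, and analyzes the ideas and characteristics of typical algorithms. Next, it introduces the publicly available datasets for aerial remote sensing image object detection and presents experimental comparisons of typical algorithms. Finally, it identifies the problems in current research on aerial remote sensing image object detection and discusses future research directions and development trends.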

Abstract

With the rapid development of aerospace technology, high-resolution remote sensing images have come into everyday use in research. Earlier low-resolution images limited how much information researchers could interpret from them. By comparison, today's high-resolution remote sensing images contain rich geographic and object-level detail, as well as rich spatial structure and semantic information, and can therefore greatly advance research in this field. Object detection in aerial remote sensing images aims to provide the category and location of targets of interest and to supply evidence for further interpretation and reasoning. This technology is crucial for aerial remote sensing image interpretation and has important applications in intelligence reconnaissance, target surveillance, and disaster rescue. Early remote sensing object detection relied mainly on manual interpretation, whose results were strongly affected by subjective factors such as the interpreter's experience and energy, and whose timeliness was low. As machine learning progressed, various machine-learning-based detection methods for remote sensing images were proposed. Traditional machine-learning-based object detection techniques generally generate sliding windows and then use manually designed models to extract features of remote sensing images such as spectrum, gray value, texture, and shape. They then feed the extracted features into classifiers, such as the support vector machine (SVM) and adaptive boosting (AdaBoost), to detect objects. These methods design a dedicated feature extraction model for each specific target and are highly interpretable, but they suffer from weak feature expression capability, poor generalization, time-consuming computation, and low accuracy.
These limitations make it difficult to meet the needs of accurate and efficient object detection in complex and variable application scenarios. In recent years, deep learning techniques such as deep convolutional neural networks and generative adversarial networks have been widely applied to object detection, classification, and recognition in natural images and have performed excellently in large-scale natural scene object detection. As a result, the application of deep learning to remote sensing image processing has received considerable attention and become a research hotspot, and many excellent works have emerged. Object detection in aerial remote sensing images mainly faces the following challenges: large, high-resolution images, interference from complex backgrounds, diverse target orientations, dense targets, dramatic scale changes, and small targets. Corresponding model improvements now exist for each of these challenges. For large, high-resolution aerial remote sensing images, target scales are widely distributed, and the detail of small targets must be preserved. The most common approach is therefore to split the image during data preprocessing; that is, the large image is cut into tiles of a regular size, which are sent to the object detection algorithm in turn. In postprocessing, the detection results of all tiles are stitched together and mapped back to the original image to complete detection for the whole image. Moreover, ultrahigh-resolution aerial remote sensing images have complex backgrounds. The targets to be detected are easily confused with various similar objects, and targets of the same category can present different characteristics, so false detections occur easily.
Therefore, the usual methods for handling complex background interference fall into two types: extracting contextual information from the image and improving the attention mechanism. Because aerial remote sensing images are all taken from a top-down view, the targets to be detected appear in many orientations. Moreover, their aspect ratios vary more widely than those of targets in natural images, and densely arranged targets interfere seriously with one another, which degrades the accuracy of final localization and classification. At present, three practical improvement ideas address the directional diversity and dense arrangement of targets: image rotation augmentation, the design of rotation-invariant modules, and the design of accurate position regression methods. To meet the challenge of drastic changes in target scale, the model needs good scale invariance; that is, it must retain high recognition ability even when multiple targets vary drastically across multiple scales. The common improvement scheme is therefore multiscale feature fusion. For small target detection in aerial remote sensing images, current algorithms are improved mainly through feature enhancement, multilevel feature map detection, and the design of precise positioning strategies. In summary, the challenges and difficulties of object detection in aerial remote sensing imagery do not exist independently. For example, the large size and high resolution of aerial remote sensing images inevitably lead to complex backgrounds and a sharp increase in the categories and number of small targets to be detected. Moreover, most small targets are susceptible to strong interference from the complex background.
This interference reduces localization and classification accuracy. In addition, improvements aimed at one challenge also help with others; for example, improvements in multiscale target feature enhancement benefit almost all challenges. Therefore, the problems in the field must be analyzed and addressed from a global perspective. Building on a thorough study of the latest reviews and related research, this study systematically compares and summarizes deep learning object detection algorithms for aerial remote sensing images, particularly the methods proposed at home and abroad in the past three years. It aims to provide a reference for research on object detection in aerial remote sensing images and to help scholars comprehensively understand the latest progress in this area. First, the study introduces deep-learning-based image object detection models. Then, it systematically reviews deep-learning-based detection methods for aerial remote sensing images, introduces the publicly available datasets for aerial remote sensing image object detection, and compares the performance of typical methods through experiments. Finally, it presents the problems in current research on aerial remote sensing image object detection and discusses future research and development trends.
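
The tile-and-stitch preprocessing mentioned in the abstract can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the method of any surveyed paper: the names `tile_origins`, `nms`, `detect_large_image`, and the `detect_tile` callback are hypothetical, boxes are axis-aligned for simplicity, and overlapping tiles with greedy non-maximum suppression are one possible way to merge duplicates (the abstract only says that results are stitched together).

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score); axis-aligned for simplicity (an assumption).
Box = Tuple[float, float, float, float, float]


def tile_origins(size: int, tile: int, overlap: int) -> List[int]:
    """Start offsets along one axis so that tiles cover [0, size)."""
    assert 0 <= overlap < tile
    if size <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile sits flush with the border
    return starts


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], thresh: float) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= thresh for k in kept):
            kept.append(box)
    return kept


def detect_large_image(
    width: int,
    height: int,
    detect_tile: Callable[[int, int, int, int], List[Box]],
    tile: int = 1024,
    overlap: int = 128,
    iou_thresh: float = 0.5,
) -> List[Box]:
    """Run a per-tile detector and merge results in full-image coordinates.

    `detect_tile(x0, y0, w, h)` is a hypothetical callback that detects
    objects inside the given window and returns tile-local boxes.
    """
    merged: List[Box] = []
    for y0 in tile_origins(height, tile, overlap):
        for x0 in tile_origins(width, tile, overlap):
            w, h = min(tile, width - x0), min(tile, height - y0)
            for x1, y1, x2, y2, score in detect_tile(x0, y0, w, h):
                # Shift tile-local coordinates back into the full image.
                merged.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score))
    return nms(merged, iou_thresh)
```

The overlap is there so that an object cut by one tile border is likely to appear whole in a neighbouring tile; the final suppression step then removes the duplicate detections this creates.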

Keywords

aerial remote sensing images; object detection; feature fusion; deep learning; convolutional neural network (CNN); Transformer; attention mechanism

references

Amit R A and Mohan C K. 2021. A robust airport runway detection network based on R-CNN using remote sensing images. IEEE Aerospace and Electronic Systems Magazine， 36（11）： 4-20 ［DOI： 10.1109/MAES.2021.3088477http://dx.doi.org/10.1109/MAES.2021.3088477］

Bochkovskiy A， Wang C Y and Liao H Y M. 2020. YOLOv4： optimal speed and accuracy of object detection ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2004.10934.pdfhttp://arxiv.org/pdf/2004.10934.pdf ［DOI： 10.48550/arXiv.2004.10934http://dx.doi.org/10.48550/arXiv.2004.10934］

Boroughani M， Pourhashemi S， Hashemi H， Salehi M， Amirahmadi A， Asadi M A Z and Berndtsson R. 2020. Application of remote sensing techniques and machine learning algorithms in dust source detection and dust source susceptibility mapping. Ecological Informatics， 56： #101059 ［DOI： 10.1016/j.ecoinf.2020.101059http://dx.doi.org/10.1016/j.ecoinf.2020.101059］

Cai Z W and Vasconcelos N. 2018. Cascade R-CNN： delving into high quality object detection//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 6154-6162 ［DOI： 10.1109/cvpr.2018.00644http://dx.doi.org/10.1109/cvpr.2018.00644］

Carion N， Massa F， Synnaeve G， Usunier N， Kirillov A and Zagoruyko S. 2020. End-to-end object detection with Transformers//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： 213-229 ［DOI： 10.1007/978-3-030-58452-8_13http://dx.doi.org/10.1007/978-3-030-58452-8_13］

Chalavadi V， Jeripothula P， Datla R， Ch S B C K M. 2022. mSODANet： a network for multi-scale object detection in aerial images using hierarchical dilated convolutions. Pattern Recognition， 126： #108548 ［DOI： 10.1016/j.patcog.2022.108548http://dx.doi.org/10.1016/j.patcog.2022.108548］

Chen Q， Wang Y M， Yang T， Zhang X Y， Cheng J and Sun J. 2021. You only look one-level feature//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 13034-13043 ［DOI： 10.1109/cvpr46437.2021.01284http://dx.doi.org/10.1109/cvpr46437.2021.01284］

Cheng G， Wang J B， Li K， Xie X X， Lang C B， Yao Y Q and Han J W. 2022. Anchor-free oriented proposal generator for object detection. IEEE Transactions on Geoscience and Remote Sensing， 60： #5625411 ［DOI： 10.1109/TGRS.2022.3183022http://dx.doi.org/10.1109/TGRS.2022.3183022］

Cheng G， Zhou P C and Han J W. 2016. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Transactions on Geoscience and Remote Sensing， 54（12）： 7405-7415 ［DOI： 10.1109/TGRS.2016.2601622http://dx.doi.org/10.1109/TGRS.2016.2601622］

Cooner A J， Shao Y and Campbell J B. 2016. Detection of urban damage using remote sensing and machine learning algorithms： revisiting the 2010 Haiti earthquake. Remote Sensing， 8（10）： #868 ［DOI： 10.3390/rs8100868http://dx.doi.org/10.3390/rs8100868］

Cortes C and Vapnik V. 1995. Support-vector networks. Machine Learning， 20（3）： 273-297 ［DOI： 10.1007/BF00994018http://dx.doi.org/10.1007/BF00994018］

Dai J F， Qi H Z， Xiong Y W， Li Y， Zhang G D， Hu H and Wei Y C. 2017. Deformable convolutional networks//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice， Italy： IEEE： 764-773 ［DOI： 10.1109/ICCV.2017.89http://dx.doi.org/10.1109/ICCV.2017.89］

Dai K， Xu L B， Huang S Y and Li Y L. 2022. Single stage object detection algorithm based on fusing strategy optimization selection and dual attention mechanism. Journal of Image and Graphics， 27（8）： 2430-2443

戴坤，许立波，黄世旸，李鋆铃. 2022. 融合策略优选和双注意力的单阶段目标检测. 中国图象图形学报， 27（8）： 2430-2443 ［DOI： 10.11834/jig.210204http://dx.doi.org/10.11834/jig.210204］

Dai L H， Liu H， Tang H， Wu Z W and Song P H. 2022a. AO2-DETR： arbitrary-oriented object detection Transformer ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2205.12785.pdfhttp://arxiv.org/pdf/2205.12785.pdf

Dai P W， Yao S Y， Li Z K， Zhang S Y and Cao X C. 2022b. ACE： anchor-free corner evolution for real-time arbitrarily-oriented object detection. IEEE Transactions on Image Processing， 31： 4076-4089 ［DOI： 10.1109/TIP.2022.3167919http://dx.doi.org/10.1109/TIP.2022.3167919］

Dai Y N， Yu J Y， Zhang D A， Hu T H and Zheng X T. 2022c. RODFormer： high-precision design for rotating object detection with Transformers. Sensors， 22（7）： #2633 ［DOI： 10.3390/s22072633http://dx.doi.org/10.3390/s22072633］

Dai Z G， Cai B L， Lin Y G and Chen J Y. 2021. UP-DETR： unsupervised pre-training for object detection with Transformers//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 1601-1610 ［DOI： 10.1109/cvpr46437.2021.00165http://dx.doi.org/10.1109/cvpr46437.2021.00165］

Dalal N and Triggs B. 2005. Histograms of oriented gradients for human detection//Proceedings of 2005 IEEE Conference on Computer Vision and Pattern Recognition. San Diego， USA： IEEE： 886-893 ［DOI： 10.1109/CVPR.2005.177http://dx.doi.org/10.1109/CVPR.2005.177］

Deng S T， Li S， Xie K， Song W F， Liao X， Hao A M and Qin H. 2021. A global-local self-adaptive network for drone-view object detection. IEEE Transactions on Image Processing， 30： 1556-1569 ［DOI： 10.1109/TIP.2020.3045636http://dx.doi.org/10.1109/TIP.2020.3045636］

Ding J， Xue N， Long Y， Xia G S and Lu Q K. 2019. Learning RoI Transformer for oriented object detection in aerial images//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 2844-2853 ［DOI： 10.1109/cvpr.2019.00296http://dx.doi.org/10.1109/cvpr.2019.00296］

Ding X H， Zhang X Y， Han JG and Ding G G. 2022. Scaling up your kernels to 31 × 31： revisiting large kernel design in CNNs//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 11953-11965 ［DOI： 10.1109/cvpr52688.2022.01166http://dx.doi.org/10.1109/cvpr52688.2022.01166］

Ding X H， Zhang X Y， Ma N N， Han J G， Ding G G and Sun J. 2021. RepVGG： making VGG-style ConvNets great again//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 13728-13737 ［DOI： 10.1109/CVPR46437.2021.01352http://dx.doi.org/10.1109/CVPR46437.2021.01352］

Dong X H， Qin Y， Fu R G， Gao Y H， Liu S L， Ye Y X and Li B. 2022. Multiscale deformable attention and multilevel features aggregation for remote sensing object detection. IEEE Geoscience and Remote Sensing Letters， 19： #6510405 ［DOI： 10.1109/LGRS.2022.3178479http://dx.doi.org/10.1109/LGRS.2022.3178479］

Dong Z P， Wang M， Wang Y L， Zhu Y and Zhang Z Q. 2020. Object detection in high resolution remote sensing imagery based on convolutional neural networks with suitable object scale features. IEEE Transactions on Geoscience and Remote Sensing， 58（3）： 2104-2114 ［DOI： 10.1109/TGRS.2019.2953119http://dx.doi.org/10.1109/TGRS.2019.2953119］

Du D W， Qi Y K， Yu H Y， Yang Y F， Duan K W， Li G R， Zhang W G， Huang Q M and Tian Q. 2018. The unmanned aerial vehicle benchmark： object detection and tracking//Proceedings of the 15th European Conference on Computer Vision. Munich， Germany： Springer： 375-391 ［DOI： 10.1007/978-3-030-01249-6_23http://dx.doi.org/10.1007/978-3-030-01249-6_23］

Du D W， Zhu P F， Wen L Y， Bian X， Lin H B， Hu Q H， Peng T， Zheng J Y， Wang X Y， Zhang Y， Bo L F， Shi H L， Zhu R， Kumar A， Li A J， Zinollayev A， Askergaliyev A， Schumann A， Mao B J， Lee B， Liu C， Chen C R， Pan C H， Huo C L， Yu D， Cong D C， Zeng D N， Pailla D R， Li D， Wang D， Cho D， Zhang D Y， Bai F R， Jose G， Gao G Y， Liu G Z， Xiong H T， Qi H， Wang H R， Qiu H Q， Li H L， Lu H C， Kim I， Kim J， Shen J， Lee J， Ge J， Xu J J， Zhou J K， Meier J， Choi J W， Hu J H， Zhang J Y， Huang J Y， Huang K Q， Wang K Y， Sommer L， Jin L， Zhang L， Huang L H， Sun L， Steinmann L， Jia M X， Xu N， Zhang P Y， Chen Q， Lyu Q X， Liu Q， Cheng Q S， Chennamsetty S S， Chen S H， Wei S， Kruthiventi S S S， Hong S， Kang S， Wu T， Feng T， Kollerathu V A， Li W Q， Dai W， Qin W D， Wang W Y， Wang X R， Chen X Y， Chen X， Sun X， Zhang X， Zhao X， Zhang X D， Zhang X Y， Chen X K， Wei X D， Zhang X Z， Li Y C， Chen Y F， Toh Y H， Zhang Y， Zhu Y， Zhong Y X， Wang Z X， Wang Z K， Song Z C and Liu Z M. 2019. VisDrone-DET2019： the vision meets drone object detection in image challenge results//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul， Korea （South）： IEEE： 213-223 ［DOI： 10.1109/iccvw.2019.00030http://dx.doi.org/10.1109/iccvw.2019.00030］

Duan C Z， Wei Z W， Zhang C， Qu S Y and Wang H P. 2021. Coarse-grained density map guided object detection in aerial images//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal， Canada： IEEE： 2789-2798 ［DOI： 10.1109/iccvw54120.2021.00313http://dx.doi.org/10.1109/iccvw54120.2021.00313］

Fang Y， Liao B， Wang X， Fang J， Qi J， Wu R， Niu J and Liu W. 2021. You only look at one sequence： rethinking Transformer in vision through object detection//Advances in Neural Information Processing Systems， 34， 26183-26197 ［DOI： 10.48550/arXiv.2106.00666http://dx.doi.org/10.48550/arXiv.2106.00666］

Felzenszwalb P F， Girshick R B， McAllester D and Ramanan D. 2010. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence， 32（9）： 1627-1645 ［DOI： 10.1109/TPAMI.2009.167http://dx.doi.org/10.1109/TPAMI.2009.167］

Fu H， Fan X T， Yan Z Z and Du X P. 2022. Progress of object detection in remote sensing images based on deep learning. Remote Sensing Technology and Application， 37（2）： 290-305

付涵，范湘涛，严珍珍，杜小平. 2022. 基于深度学习的遥感图像目标检测技术研究进展. 遥感技术与应用， 37（2）： 290-305 ［DOI： 10.11873/j.issn.1004-0323.2022.2.0290http://dx.doi.org/10.11873/j.issn.1004-0323.2022.2.0290］

Fu J， Liu J， Tian H J， Li Y， Bao Y J， Fang Z W and Lu H Q. 2019. Dual attention network for scene segmentation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 3141-3149 ［DOI： 10.1109/cvpr.2019.00326http://dx.doi.org/10.1109/cvpr.2019.00326］

Fu J M， Sun X， Wang Z R and Fu K. 2021. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images. IEEE Transactions on Geoscience and Remote Sensing， 59（2）： 1331-1344 ［DOI： 10.1109/TGRS.2020.3005151http://dx.doi.org/10.1109/TGRS.2020.3005151］

Ge Z， Liu S T， Wang F， Li Z M and Sun J. 2021. YOLOX： exceeding YOLO series in 2021 ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2107.08430.pdfhttp://arxiv.org/pdf/2107.08430.pdf

Gevorgyan Z. 2022. SIoU loss： more powerful learning for bounding box regression ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2205.12740.pdfhttp://arxiv.org/pdf/2205.12740.pdf

Ghaffarian S， Valente J， Van Der Voort M and Tekinerdogan B. 2021. Effect of attention mechanism in deep learning-based remote sensing image processing： a systematic literature review. Remote Sensing， 13（15）： #2965 ［DOI： 10.3390/rs13152965http://dx.doi.org/10.3390/rs13152965］

Ghasemian N and Akhoondzadeh M. 2018. Introducing two Random Forest based methods for cloud detection in remote sensing images. Advances in Space Research， 62（2）： 288-303 ［DOI： 10.1016/j.asr.2018.04.030http://dx.doi.org/10.1016/j.asr.2018.04.030］

Girshick R. 2015. Fast R-CNN//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago， Chile： IEEE： 1440-1448 ［DOI： 10.1109/iccv.2015.169http://dx.doi.org/10.1109/iccv.2015.169］

Girshick R， Donahue J， Darrell T and Malik J. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus， USA： IEEE： 580-587 ［DOI： 10.1109/cvpr.2014.81http://dx.doi.org/10.1109/cvpr.2014.81］

Han J M， Ding J， Li J and Xia G S. 2022. Align deep features for oriented object detection. IEEE Transactions on Geoscience and Remote Sensing， 60： #5602511 ［DOI： 10.1109/TGRS.2021.3062048http://dx.doi.org/10.1109/TGRS.2021.3062048］

Han J M， Ding J， Xue N and Xia G S. 2021. ReDet： a rotation-equivariant detector for aerial object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 2785-2794 ［DOI： 10.1109/cvpr46437.2021.00281http://dx.doi.org/10.1109/cvpr46437.2021.00281］

He K M， Gkioxari G， Doll􀆦r P and Girshick R. 2017. Mask R-CNN//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice， Italy： IEEE： 2980-2988 ［DOI： 10.1109/iccv.2017.322http://dx.doi.org/10.1109/iccv.2017.322］

He Y Q， Sun X， Gao L R and Zhang B. 2018. Ship detection without sea-land segmentation for large-scale high-resolution optical satellite images//IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia， Spain： IEEE： 717-720 ［DOI： 10.1109/IGARSS.2018.8519391http://dx.doi.org/10.1109/IGARSS.2018.8519391］

Hou B， Ren Z L， Zhao W， Wu Q and Jiao L C. 2020. Object detection in high-resolution panchromatic images using deep models and spatial template matching. IEEE Transactions on Geoscience and Remote Sensing， 58（2）： 956-970 ［DOI： 10.1109/TGRS.2019.2942103http://dx.doi.org/10.1109/TGRS.2019.2942103］

Hu J， Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 7132-7141 ［DOI： 10.1109/cvpr.2018.00745http://dx.doi.org/10.1109/cvpr.2018.00745］

Hua X， Wang X Q， Rui T， Zhang H T and Wang D. 2020. A fast self-attention cascaded network for object detection in large scene remote sensing images. Applied Soft Computing， 94： #106495 ［DOI： 10.1016/j.asoc.2020.106495http://dx.doi.org/10.1016/j.asoc.2020.106495］

Hussain M， Chen D M， Cheng A， Wei H and Stanley D. 2013. Change detection from remotely sensed images： from pixel-based to object-based approaches. ISPRS Journal of Photogrammetry and Remote Sensing， 80： 91-106 ［DOI： 10.1016/j.isprsjprs.2013.03.006http://dx.doi.org/10.1016/j.isprsjprs.2013.03.006］

Inglada J. 2007. Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features. ISPRS Journal of Photogrammetry and Remote Sensing， 62（3）： 236-248 ［DOI： 10.1016/j.isprsjprs.2007.05.011http://dx.doi.org/10.1016/j.isprsjprs.2007.05.011］

Jaderberg M， Simonyan K， Zisserman A and Kavukcuoglu K. 2015. Spatial Transformer networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal， Canada： MIT Press： 2017-2025

Jia K X， Ma Z H， Zhu R and Li Y G. 2022. Attention-mechanism-based light single shot multiBox detector modelling improvement for small object detection on the sea surface. Journal of Image and Graphics， 27（4）： 1161-1175

贾可心，马正华，朱蓉，李永刚. 2022. 注意力机制改进轻量SSD模型的海面小目标检测. 中国图象图形学报， 27（4）： 1161-1175 ［DOI： 10.11834/jig.200517http://dx.doi.org/10.11834/jig.200517］

Jiang H， Zhang Y T， Guo J Y， Zhao X， Li F F， Huang L J， Hu Y X， Lei B and Ding C B. 2021. Accurate localization and parameter extraction of oil tank in remote sensing images. Journal of Image and Graphics， 26（12）： 2953-2963

江晗，张月婷，郭嘉逸，赵鑫，李芳芳，黄丽佳，胡玉新，雷斌，丁赤飚. 2021. 遥感图像中油罐目标精确定位与参数提取. 中国图象图形学报， 26（12）： 2953-2963 ［DOI： 10.11834/jig.200604http://dx.doi.org/10.11834/jig.200604］

Jocher Glenn. 2020. YOLOv5 release v6.2 ［EB/OL］. ［2023-01-19］. https://github.com/ultralytics/yolov5/releases/tag/v6.1https://github.com/ultralytics/yolov5/releases/tag/v6.1

Kattenborn T， Leitloff J， Schiefer F and Hinz S. 2021. Review on convolutional neural networks （CNN） in vegetation remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing， 173： 24-49 ［DOI： 10.1016/j.isprsjprs.2020.12.010http://dx.doi.org/10.1016/j.isprsjprs.2020.12.010］

Li C Y， Li L L， Jiang H L， Weng K H， Geng Y F， Li L， Ke Z D， Li Q Y， Cheng M， Nie W Q， Li Y D， Zhang B， Liang Y F， Zhou L Y， Xu X M， Chu X X， Wei X M and Wei X L. 2022a. YOLOv6： a single-stage object detection framework for industrial applications ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2209.02976.pdfhttp://arxiv.org/pdf/2209.02976.pdf

Li C L， Yang T J N， Zhu S J， Chen C and Guan S Y. 2020b. Density map guided object detection in aerial images//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle， USA： IEEE： 737-746 ［DOI： 10.1109/cvprw50498.2020.00103http://dx.doi.org/10.1109/cvprw50498.2020.00103］

Li F， Zhang H， Liu S L， Guo J， Ni L M and Zhang L. 2022b. DN-DETR： accelerate DETR training by introducing query denoising//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 13609-13617 ［DOI： 10.1109/cvpr52688.2022.01325http://dx.doi.org/10.1109/cvpr52688.2022.01325］

Li J X， Tian Y， Xu Y P and Zhang Z L. 2022c. Oriented object detection in remote sensing images with anchor-free oriented region proposal network. Remote Sensing， 14（5）： #1246 ［DOI： 10.3390/rs14051246http://dx.doi.org/10.3390/rs14051246］

Li K， Wan G， Cheng G， Meng L Q and Han J W. 2020a. Object detection in optical remote sensing images： a survey and a new benchmark. ISPRS Journal of Photogrammetry and Remote Sensing， 159： 296-307 ［DOI： 10.1016/j.isprsjprs.2019.11.023http://dx.doi.org/10.1016/j.isprsjprs.2019.11.023］

Li M J， Guo W W， Zhang Z H， Yu W X and Zhang T. 2018a. Rotated region based fully convolutional network for ship detection//IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia， Spain： IEEE： 673-676 ［DOI： 10.1109/IGARSS.2018.8519094http://dx.doi.org/10.1109/IGARSS.2018.8519094］

Li Q Y， Chen Y S and Zeng Y. 2022d. Transformer with transfer CNN for remote-sensing-image object detection. Remote Sensing， 14（4）： #984 ［DOI： 10.3390/rs14040984http://dx.doi.org/10.3390/rs14040984］

Li Q P， Mou L C， Liu Q J， Wang Y H and Zhu X X. 2018b. HSF-Net： multiscale deep feature embedding for ship detection in optical remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing， 56（12）： 7147-7161 ［DOI： 10.1109/TGRS.2018.2848901http://dx.doi.org/10.1109/TGRS.2018.2848901］

Li W T， Chen Y J， Hu K X and Zhu J K. 2022e. Oriented RepPoints for aerial object detection//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 1819-1828 ［DOI： 10.1109/cvpr52688.2022.00187http://dx.doi.org/10.1109/cvpr52688.2022.00187］

Li W J， Dong R M， Fu H H and Yu L. 2019. Large-scale oil palm tree detection from high-resolution satellite images using two-stage convolutional neural networks. Remote Sensing， 11（1）： #11 ［DOI： 10.3390/rs11010011http://dx.doi.org/10.3390/rs11010011］

Li Y S， Zhang Y J， Huang X and Yuille A L. 2018c. Deep networks under scene-level supervision for multi-class geospatial object detection from remote sensing images. ISPRS Journal of Photogrammetry and Remote Sensing， 146： 182-196 ［DOI： 10.1016/j.isprsjprs.2018.09.014http://dx.doi.org/10.1016/j.isprsjprs.2018.09.014］

Li Y Y， Huang Q， Pei X， Chen Y Q， Jiao L C and Shang R H. 2021. Cross-layer attention network for small object detection in remote sensing imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 14： 2148-2161 ［DOI： 10.1109/JSTARS.2020.3046482http://dx.doi.org/10.1109/JSTARS.2020.3046482］

Li Y Y， Huang Q， Pei X， Jiao L C and Shang R H. 2020c. RADet： refine feature pyramid network and multi-layer attention network for arbitrary-oriented object detection of remote sensing images. Remote Sensing， 12（3）： #389 ［DOI： 10.3390/rs12030389http://dx.doi.org/10.3390/rs12030389］

Liao J J， Piao Y， Su J H， Cai G R， Huang X W， Chen L， Huang Z H and Wu Y D. 2021. Unsupervised cluster guided object detection in aerial images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 14： 11204-11216 ［DOI： 10.1109/JSTARS.2021.3122152http://dx.doi.org/10.1109/JSTARS.2021.3122152］

Liao Y R， Wang H N， Lin C B， Li Y， Fang Y Q and Ni S Y. 2022. Research progress of deep learning-based object detection of optical remote sensing image. Journal on Communications， 43（5）： 190-203

廖育荣，王海宁，林存宝，李阳，方宇强，倪淑燕. 2022. 基于深度学习的光学遥感图像目标检测研究进展. 通信学报， 43（5）： 190-203 ［DOI： 10.11959/j.issn.1000-436x.2022071http://dx.doi.org/10.11959/j.issn.1000-436x.2022071］

Lin T Y， Goyal P， Girshick R， He K M and Doll􀆦r P. 2017. Focal loss for dense object detection//Proceedings of 2017 IEEE/CVF International Conference on Computer Vision. Venice， Italy： IEEE： 2999-3007 ［DOI： 10.1109/iccv.2017.324http://dx.doi.org/10.1109/iccv.2017.324］

Liu G， Zhang Y S， Zheng X W， Sun X， Fu K and Wang H Q. 2014. A new method on inshore ship detection in high-resolution satellite images using shape and context information. IEEE Geoscience and Remote Sensing Letters， 11（3）： 617-621 ［DOI： 10.1109/LGRS.2013.2272492http://dx.doi.org/10.1109/LGRS.2013.2272492］

Liu J H， Yang D H and Hu F. 2022a. Multiscale object detection in remote sensing images combined with multi-receptive-field features and relation-connected attention. Remote Sensing， 14（2）： #427 ［DOI： 10.3390/rs14020427http://dx.doi.org/10.3390/rs14020427］

Liu K and Mattyus G. 2015. Fast multiclass vehicle detection on aerial images. IEEE Geoscience and Remote Sensing Letters， 12（9）： 1938-1942 ［DOI： 10.1109/LGRS.2015.2439517http://dx.doi.org/10.1109/LGRS.2015.2439517］

Liu S， Zhang L， Lu H C and He Y. 2022b. Center-boundary dual attention for oriented object detection in remote sensing images. IEEE Transactions on Geoscience and Remote Sensing， 60： #5603914 ［DOI： 10.1109/TGRS.2021.3069056http://dx.doi.org/10.1109/TGRS.2021.3069056］

Liu T L， Luo R H， Xu L Q， Feng D C， Cao L， Liu S Y and Guo J J. 2022c. Spatial channel attention for deep convolutional neural networks. Mathematics， 10（10）： #1750 ［DOI： 10.3390/math10101750http://dx.doi.org/10.3390/math10101750］

Liu W， Anguelov D， Erhan D， Szegedy C， Reed S， Fu C Y and Berg A C. 2016. SSD： single shot MultiBox detector//Proceedings of the 14th European Conference on Computer Vision. Amsterdam， the Netherlands： Springer： 21-37 ［DOI： 10.1007/978-3-319-46448-0_2http://dx.doi.org/10.1007/978-3-319-46448-0_2］

Liu X L， Ma S P， He L Y， Wang C and Chen Z. 2022d. Hybrid network model： TransConvNet for oriented object detection in remote sensing images. Remote Sensing， 14（9）： #2090 ［DOI： 10.3390/rs14092090http://dx.doi.org/10.3390/rs14092090］

Liu Y， Li H F， Hu C， Luo S， Luo Y and Chen C W. 2022e. Learning to aggregate multi-scale context for instance segmentation in remote sensing images ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2111.11057.pdfhttp://arxiv.org/pdf/2111.11057.pdf

Liu Y， Zhang Y， Wang Y X， Hou F， Yuan J， Tian J， Zhang Y， Shi Z C， Fan J P and He Z Q. 2022f. A survey of visual Transformers ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2111.06091.pdfhttp://arxiv.org/pdf/2111.06091.pdf

Liu Z K， Hu J G， Weng L B and Yang Y P. 2017a. Rotated region based CNN for ship detection//Proceedings of 2021 IEEE International Conference on Image Processing. Beijing， China： IEEE： 900-904 ［DOI： 10.1109/ICIP.2017.8296411http://dx.doi.org/10.1109/ICIP.2017.8296411］

Liu Z K， Yuan L， Weng L B and Yang Y P. 2017b. A high resolution optical satellite image dataset for ship recognition and some new baselines//Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods. Porto， Portugal： SciTePress： 324-331 ［DOI： 10.5220/0006120603240331］

Luo C， Feng S S， Yang X F， Ye Y M， Li X T， Zhang B Q， Chen Z H and Quan Y L. 2022. LWCDnet： a lightweight network for efficient cloud detection in remote sensing images. IEEE Transactions on Geoscience and Remote Sensing， 60： #5409816 ［DOI： 10.1109/TGRS.2022.3173661］

Ma T， Mao M Y， Zheng H H， Gao P， Wang X D， Han S M， Ding E R， Zhang B C and Doermann D. 2021. Oriented object detection with Transformer ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2106.03146.pdf

Mirhajianmoghadam H and Haghighi B B. 2022. EYNet： extended YOLO for airport detection in remote sensing images ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2203.14007.pdf

Nie G T and Huang H. 2021. A survey of object detection in optical remote sensing images. Acta Automatica Sinica， 47（8）： 1749-1768 （in Chinese） ［DOI： 10.16383/j.aas.c200596］

Ojala T， Pietikainen M and Maenpaa T. 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence， 24（7）： 971-987 ［DOI： 10.1109/TPAMI.2002.1017623］

Olson D and Anderson J. 2021. Review on unmanned aerial vehicles， remote sensors， imagery processing， and their applications in agriculture. Agronomy Journal， 113（2）： 971-992 ［DOI： 10.1002/agj2.20595］

Qin Z Q， Zhang P Y， Wu F and Li X. 2021. FcaNet： frequency channel attention networks//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 783-792 ［DOI： 10.1109/iccv48922.2021.00082］

Ran Q， Wang Q， Zhao B Y， Wu Y F， Pu S L and Li Z J. 2021. Lightweight oriented object detection using multiscale context and enhanced channel attention in remote sensing images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 14： 5786-5795 ［DOI： 10.1109/JSTARS.2021.3079968］

Redmon J， Divvala S， Girshick R and Farhadi A. 2016. You only look once： unified， real-time object detection//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas， USA： IEEE： 779-788 ［DOI： 10.1109/cvpr.2016.91］

Redmon J and Farhadi A. 2017. YOLO9000： better， faster， stronger//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu， USA： IEEE： 6517-6525 ［DOI： 10.1109/cvpr.2017.690］

Redmon J and Farhadi A. 2018. YOLOv3： an incremental improvement ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/1804.02767.pdf

Ren S Q， He K M， Girshick R and Sun J. 2015. Faster R-CNN： towards real-time object detection with region proposal networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal， Canada： MIT Press： 91-99

Rodríguez J J and Maudes J. 2008. Boosting recombined weak classifiers. Pattern Recognition Letters， 29（8）： 1049-1059 ［DOI： 10.1016/j.patrec.2007.06.019］

Roh B， Shin J， Shin W and Kim S. 2022. Sparse DETR： efficient end-to-end object detection with learnable sparsity ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2111.14330.pdf

Shafique A， Cao G， Khan Z， Asad M and Aslam M. 2022. Deep learning-based change detection in remote sensing images： a review. Remote Sensing， 14（4）： #871 ［DOI： 10.3390/rs14040871］

Singh I and Munjal G. 2022. Improved Yolov5 for small target detection in aerial images. （SSRN Scholarly Paper No #4049533）［DOI： 10.2139/ssrn.4049533］

Song Z N， Sui H and Hua L. 2021. A hierarchical object detection method in large-scale optical remote sensing satellite imagery using saliency detection and CNN. International Journal of Remote Sensing， 42（8）： 2827-2847 ［DOI： 10.1080/01431161.2020.1826059］

Song Z N， Sui H G and Li Y C. 2021. A survey on ship detection technology in high-resolution optical remote sensing images. Geomatics and Information Science of Wuhan University， 46（11）： 1703-1715 （in Chinese） ［DOI： 10.13203/j.whugis20200481］

Sun X， Wang P J， Wang C， Liu Y F and Fu K. 2021. PBNet： part-based convolutional neural network for complex composite object detection in remote sensing imagery. ISPRS Journal of Photogrammetry and Remote Sensing， 173： 50-65 ［DOI： 10.1016/j.isprsjprs.2020.12.015］

Sun X， Wang P J， Yan Z Y， Xu F， Wang R P， Diao W H， Chen J， Li J H， Feng Y C， Xu T， Weinmann M， Hinz S， Wang C and Fu K. 2022. FAIR1M： a benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery. ISPRS Journal of Photogrammetry and Remote Sensing， 184： 116-130 ［DOI： 10.1016/j.isprsjprs.2021.12.004］

Van Etten A. 2018. You only look twice： rapid multi-scale object detection in satellite imagery ［EB/OL］. ［2023-01-19］. https://arxiv.org/pdf/1805.09512.pdf

Vaswani A， Shazeer N， Parmar N， Uszkoreit J， Jones L， Gomez A N， Kaiser Ł and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach， USA： Curran Associates Inc.： 6000-6010

Viola P and Jones M. 2001. Rapid object detection using a boosted cascade of simple features//Proceedings of 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Kauai， USA： IEEE： I-511-I-518 ［DOI： 10.1109/CVPR.2001.990517］

Wang C， Bai X， Wang S， Zhou J and Ren P. 2019. Multiscale visual attention networks for object detection in VHR remote sensing images. IEEE Geoscience and Remote Sensing Letters， 16（2）： 310-314 ［DOI： 10.1109/LGRS.2018.2872355］

Wang C Y， Bochkovskiy A and Liao H Y M. 2022a. YOLOv7： trainable bag-of-freebies sets new state-of-the-art for real-time object detectors ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2207.02696.pdf

Wang C Y， Yeh I H and Liao H Y M. 2021a. You only learn one representation： unified network for multiple tasks ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2105.04206.pdf

Wang J W， Xu C， Yang W and Yu L. 2022b. A normalized Gaussian Wasserstein distance for tiny object detection ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2110.13389.pdf

Wang J W， Yang W， Guo H W， Zhang R X and Xia G S. 2021b. Tiny object detection in aerial images//Proceedings of the 25th International Conference on Pattern Recognition. Milan， Italy： 3791-3798 ［DOI： 10.1109/ICPR48806.2021.9413340］

Wang P J， Sun X， Diao W H and Fu K. 2020a. FMSSD： feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing， 58（5）： 3377-3390 ［DOI： 10.1109/TGRS.2019.2954328］

Wang T， Yuan L， Chen Y P， Feng J S and Yan S C. 2021c. PnP-DETR： towards efficient visual analysis with Transformers//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 4641-4650 ［DOI： 10.1109/iccv48922.2021.00462］

Wang X L， Girshick R， Gupta A and He K M. 2018. Non-local neural networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 7794-7803 ［DOI： 10.1109/cvpr.2018.00813］

Wang Y， Bashir S M A， Khan M， Ullah Q， Wang R， Song Y L， Guo Z and Niu Y L. 2022c. Remote sensing image super-resolution and object detection： benchmark and state of the art. Expert Systems with Applications， 197： #116793 ［DOI： 10.1016/j.eswa.2022.116793］

Wang Y， Xu C F， Liu C W and Li Z K. 2022d. Context information refinement for few-shot object detection in remote sensing images. Remote Sensing， 14（14）： #3255 ［DOI： 10.3390/rs14143255］

Wang Y， Yang Y L and Zhao X. 2020b. Object detection using clustering algorithm adaptive searching regions in aerial images//Proceedings of 2020 European Conference on Computer Vision. Glasgow， UK： Springer： 651-664 ［DOI： 10.1007/978-3-030-66823-5_39］

Woo S， Park J， Lee J Y and Kweon I S. 2018. CBAM： convolutional block attention module//Proceedings of the 15th European Conference on Computer Vision. Munich， Germany： Springer： 3-19 ［DOI： 10.1007/978-3-030-01234-2_1］

Wu Z Z， Xu J， Wang Y， Sun F， Tan M and Weise T. 2022. Hierarchical fusion and divergent activation based weakly supervised learning for object detection from remote sensing images. Information Fusion， 80： 23-43 ［DOI： 10.1016/j.inffus.2021.10.010］

Xia G S， Bai X， Ding J， Zhu Z， Belongie S， Luo J B， Datcu M， Pelillo M and Zhang L P. 2018. DOTA： a large-scale dataset for object detection in aerial images//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 3974-3983 ［DOI： 10.1109/cvpr.2018.00418］

Xie X X， Cheng G， Wang J B， Yao X W and Han J W. 2021. Oriented R-CNN for object detection//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 3500-3509 ［DOI： 10.1109/iccv48922.2021.00350］

Xu C， Wang J W， Yang W and Yu L. 2021a. Dot distance for tiny object detection in aerial images//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Nashville， USA： IEEE： 1192-1201 ［DOI： 10.1109/cvprw53098.2021.00130］

Xu J T， Li Y L and Wang S J. 2022a. AdaZoom： towards scale-aware large scene object detection. IEEE Transactions on Multimedia， 1-1 ［DOI： 10.1109/TMM.2022.3178871］

Xu S L， Wang X X， Lyu W Y， Chang Q Y， Cui C， Deng K P， Wang G Z， Dang Q Q， Wei S Y， Du Y N and Lai B H. 2022b. PP-YOLOE： an evolved version of YOLO ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2203.16250.pdf

Xu X K， Feng Z J， Cao C Q， Li M Y， Wu J， Wu Z Y， Shang Y J and Ye S B. 2021b. An improved Swin Transformer-based model for remote sensing object detection and instance segmentation. Remote Sensing， 13（23）： #4779 ［DOI： 10.3390/rs13234779］

Yan J Q， Zhao L J， Diao W H， Wang H Q and Sun X. 2021. AF-EMS detector： improve the multi-scale detection performance of the anchor-free detector. Remote Sensing， 13（2）： #160 ［DOI： 10.3390/rs13020160］

Yan Z G， Song X， Zhong H Y and Zhu X Z. 2018. Object detection in optical remote sensing images based on transfer learning convolutional neural networks//Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems. Nanjing， China： IEEE： 935-942 ［DOI： 10.1109/CCIS.2018.8691238］

Yang F， Fan H， Chu P， Blasch E and Ling H B. 2019a. Clustered object detection in aerial images//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 8310-8319 ［DOI： 10.1109/iccv.2019.00840］

Yang X， Hou L P， Zhou Y， Wang W T and Yan J C. 2021a. Dense label encoding for boundary discontinuity free rotation detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 15814-15824 ［DOI： 10.1109/cvpr46437.2021.01556］

Yang X， Sun H， Sun X， Yan M L， Guo Z and Fu K. 2018. Position detection and direction prediction for arbitrary-oriented ships via multitask rotation region convolutional neural network. IEEE Access， 6： 50839-50849 ［DOI： 10.1109/ACCESS.2018.2869884］

Yang X， Yan J C， Feng Z M and He T. 2021b. R3Det： refined single-stage detector with feature refinement for rotating object. Proceedings of the AAAI Conference on Artificial Intelligence， 35（4）： 3163-3171 ［DOI： 10.1609/aaai.v35i4.16426］

Yang X， Yan J C， Liao W L， Yang X K， Tang J and He T. 2023. SCRDet++： detecting small， cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing. IEEE Transactions on Pattern Analysis and Machine Intelligence， 45（2）： 2384-2399 ［DOI： 10.1109/tpami.2022.3166956］

Yang X， Yan J C， Ming Q， Wang W T， Zhang X P and Tian Q. 2021c. Rethinking rotated object detection with Gaussian Wasserstein distance loss//Proceedings of the 38th International Conference on Machine Learning. Virtual： ICML： 11830-11841 ［DOI： 10.48550/arXiv.2101.11952］

Yang X， Yang J R， Yan J C， Zhang Y， Zhang T F， Guo Z， Sun X and Fu K. 2019b. SCRDet： towards more robust detection for small， cluttered and rotated objects//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 8231-8240 ［DOI： 10.1109/iccv.2019.00832］

Yang X， Yang X J， Yang J R， Ming Q， Wang W T， Tian Q and Yan J C. 2021d. Learning high-precision bounding box for rotated object detection via Kullback-Leibler divergence//Advances in Neural Information Processing Systems， 34， 18381-18394 ［DOI： 10.48550/arXiv.2106.01883］

Yang Z， Liu S H， Hu H， Wang L W and Lin S. 2019c. RepPoints： point set representation for object detection//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 9656-9665 ［DOI： 10.1109/ICCV.2019.00975］

Yao Z Y， Ai J B， Li B X and Zhang C. 2021. Efficient DETR： improving end-to-end object detector with dense prior ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2104.01318.pdf

Yi J R， Wu P X， Liu B， Huang Q Y， Qu H and Metaxas D. 2021. Oriented object detection in aerial images with box boundary-aware vectors//Proceedings of 2021 IEEE Winter Conference on Applications of Computer Vision. Waikoloa， USA： IEEE： 2149-2158 ［DOI： 10.1109/wacv48630.2021.00220］

Yu D W and Ji S P. 2022. A new spatial-oriented object detection framework for remote sensing images. IEEE Transactions on Geoscience and Remote Sensing， 60： #4407416 ［DOI： 10.1109/TGRS.2021.3127232］

Zhang G J， Lu S J and Zhang W. 2019. CAD-Net： a context-aware detection network for objects in remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing， 57（12）： 10015-10024 ［DOI： 10.1109/TGRS.2019.2930982］

Zhang H， Li F， Liu S L， Zhang L， Su H， Zhu J， Ni L M and Shum H Y. 2022a. DINO： DETR with improved DeNoising anchor boxes for end-to-end object detection ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2203.03605.pdf

Zhang J， Xie C M， Xu X， Shi Z W and Pan B. 2020a. A contextual bidirectional enhancement method for remote sensing image object detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 13： 4518-4531 ［DOI： 10.1109/JSTARS.2020.3015049］

Zhang K， Wu Y L， Wang J Y and Wang Q. 2022b. A hierarchical context embedding network for object detection in remote sensing images. IEEE Geoscience and Remote Sensing Letters， 19： #6508105 ［DOI： 10.1109/LGRS.2022.3161938］

Zhang Y， Liu X， Wa S， Chen S Y and Ma Q. 2022c. GANsformer： a detection network for aerial images with high performance combining convolutional network and Transformer. Remote Sensing， 14（4）： #923 ［DOI： 10.3390/rs14040923］

Zhang Y J， Sheng W G， Jiang J F， Jing N F， Wang Q and Mao Z G. 2020c. Priority branches for ship detection in optical remote sensing images. Remote Sensing， 12（7）： #1196 ［DOI： 10.3390/rs12071196］

Zhang Y L， Guo L H， Wang Z F， Yu Y， Liu X W and Xu F. 2020b. Intelligent ship detection in remote sensing images based on multi-layer convolutional feature fusion. Remote Sensing， 12（20）： #3316 ［DOI： 10.3390/rs12203316］

Zhang Z C， Boubin J， Stewart C and Khanal S. 2020d. Whole-field reinforcement learning： a fully autonomous aerial scouting method for precision agriculture. Sensors， 20（22）： #6585 ［DOI： 10.3390/s20226585］

Zhao W Q， Kong Z X， Zhou Z D and Zhao Z B. 2021. Target detection algorithm of aerial remote sensing based on feature enhancement technology. Journal of Image and Graphics， 26（3）： 644-653 （in Chinese） ［DOI： 10.11834/jig.190612］

Zheng M H， Gao P， Zhang R R， Li K C， Wang X G， Li H S and Dong H. 2021a. End-to-end object detection with adaptive clustering Transformer ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2011.09315.pdf

Zheng Y B， Sun P， Zhou Z T， Xu W Y and Ren Q. 2021b. ADT-Det： adaptive dynamic refined single-stage Transformer detector for arbitrary-oriented object detection in satellite optical imagery. Remote Sensing， 13（13）： #2623 ［DOI： 10.3390/rs13132623］

Zheng Z， Zhong Y F， Ma A L， Han X B， Zhao J， Liu Y F and Zhang L P. 2020. HyNet： hyper-scale object detection network framework for multiple spatial resolution remote sensing imagery. ISPRS Journal of Photogrammetry and Remote Sensing， 166： 1-14 ［DOI： 10.1016/j.isprsjprs.2020.04.019］

Zhou X Y， Wang D Q and Krähenbühl P. 2019. Objects as points ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/1904.07850.pdf

Zhou Z， Huang J F， Wang J， Zhang K Y， Kuang Z M， Zhong S Q and Song X D. 2015. Object-oriented classification of sugarcane using time-series middle-resolution remote sensing data based on AdaBoost. PLoS ONE， 10（11）： #e0142069 ［DOI： 10.1371/journal.pone.0142069］

Zhu X K， Lyu S， Wang X and Zhao Q. 2021a. TPH-YOLOv5： improved YOLOv5 based on Transformer prediction head for object detection on drone-captured scenarios//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal， Canada： IEEE： 2778-2788 ［DOI： 10.1109/iccvw54120.2021.00312］

Zhu X Z， Hu H， Lin S and Dai J F. 2019. Deformable ConvNets V2： more deformable， better results//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 9300-9308 ［DOI： 10.1109/cvpr.2019.00953］

Zhu X Z， Su W J， Lu L W， Li B， Wang X G and Dai J F. 2021b. Deformable DETR： deformable Transformers for end-to-end object detection ［EB/OL］. ［2023-01-19］. http://arxiv.org/pdf/2010.04159.pdf
