Image-level labeled weakly supervised object detection: a survey

Chen Zhenyuan; Wang Zhendong; Gong Chen

doi:10.11834/jig.220854

Intelligent object detection in complex scene images

2023, Vol. 28, No. 9, pp. 2644-2660
Print publication date: 2023-09-16

Chen Zhenyuan, Wang Zhendong, Gong Chen. 2023. Image-level labeled weakly supervised object detection: a survey. Journal of Image and Graphics, 28(09): 2644-2660 [DOI: 10.11834/jig.220854]

Summary

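Object detection is one of the fundamental tasks in computer vision. Depending on the label information available, it can be divided into fully supervised, semi-supervised, and weakly supervised object detection, among others. Weakly supervised object detection (WSOD) aims to train a detector using only image-level category labels and then to localize and classify all target objects in test images. Because it significantly reduces the cost of data annotation, WSOD has attracted growing attention and has made remarkable progress. This paper starts from the research significance of WSOD. It first introduces the label setting and problem definition of WSOD, the basic framework based on multiple instance learning, and the three major difficulties the field faces: local dominance, instance ambiguity, and computational cost. It then groups the typical algorithms in the field into three categories according to their core network architecture: algorithms based on optimized candidate box generation, algorithms combined with image segmentation, and algorithms based on self-training, and describes the core contributions of each category. Furthermore, the detection performance of various WSOD algorithms is compared experimentally under several evaluation metrics. On the VOC2007 (visual object classes 2007) dataset, the method with the highest mean average precision (mAP) is MIST (multiple instance self-training) at 54.9%, and the method with the highest correct localization (CorLoc) is SLV (spatial likelihood voting) at 71.1%. On the VOC2012 dataset, the method with the highest mAP is NDI-WSOD (negative deterministic information weakly supervised object detection) at 53.9%, and the method with the highest CorLoc is P-MIDN (pyramidal multiple instance detection network) at 73.3%. On the MS COCO (Microsoft common objects in context) dataset, the method with the highest average precision on the validation set at an intersection over union (IoU) threshold of 50% (ValAP) is P-MIDN at 27.4%. Finally, future research directions for WSOD are discussed. The WSOD algorithm framework summarized in this paper offers a useful reference for later researchers in network design, model exploration, and the choice of optimization directions.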
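The IoU threshold and the CorLoc metric quoted above can be illustrated with a small sketch. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates and uses the definition of CorLoc that is common in the WSOD literature: for one class, the fraction of images containing that class whose top-scoring predicted box overlaps some ground-truth box with IoU of at least 0.5. The names `iou`, `corloc`, `top_boxes`, and `gt_boxes` are hypothetical and are not taken from the survey or from any benchmark toolkit.

```python
from typing import Dict, List, Tuple

# Assumed box convention: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(
    top_boxes: Dict[str, Box],
    gt_boxes: Dict[str, List[Box]],
    thresh: float = 0.5,
) -> float:
    """CorLoc for a single class.

    top_boxes: image id -> top-scoring predicted box for the class.
    gt_boxes:  image id -> ground-truth boxes of the class (only images
               that actually contain the class are listed).
    """
    if not gt_boxes:
        return 0.0
    hits = 0
    for image_id, gts in gt_boxes.items():
        pred = top_boxes.get(image_id)
        # An image counts as correctly localized if the prediction
        # overlaps at least one ground-truth box above the threshold.
        if pred is not None and any(iou(pred, g) >= thresh for g in gts):
            hits += 1
    return hits / len(gt_boxes)
```

A dataset-level CorLoc would typically average this per-class value over all classes; that averaging step is left out to keep the sketch short.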

Abstract

Object detection is a fundamental problem in computer vision and image processing. From the perspective of supervision, it can be divided into fully supervised, semi-supervised, and weakly supervised detection. In recent years, object detection has played an important role in many areas and shown great application value. Precise object detection depends on accurate region-level or instance-level image labels during detector training. However, the complexity of backgrounds and the diversity of objects in real scenes make accurate labeling extremely time-consuming and laborious. In particular, traditional fully supervised object detection algorithms require the position and category of every object in an image to be marked manually with a minimum bounding rectangle, which raises the cost of acquiring training labels. By contrast, weakly supervised object detection (WSOD) algorithms require only the category labels of the whole image for training, so a large number of training samples can be obtained easily by searching for category labels on image websites. Because it markedly reduces the labor cost of labeling, WSOD has received increasing attention and achieved encouraging progress, and researchers have focused on WSOD algorithms based on coarse image-level labels, which depend only weakly on supervision. Compared with other supervised object detection tasks, WSOD aims to localize and classify objects in an image using only image-level category annotations. This study starts with the research significance of WSOD. First, the definition, basic framework, and main challenges of WSOD are introduced: 1) In both the training and test phases, WSOD is carried out with standard detectors. The WSOD problem as a whole can be understood as learning a mapping from the candidate boxes contained in an image to the image-level category labels.
2) The problem setup of WSOD is consistent with that of multiple instance learning in weakly supervised learning. WSOD can therefore be treated as a multiple instance learning problem in which each candidate box is an instance and the image containing all the candidate boxes is a bag. For each category, if the image contains at least one target object of that category, the image is a positive bag; otherwise, it is a negative bag. Detector parameters can thus be learned from the candidate boxes in images. If an image is predicted to be a positive bag for a certain class, then it contains a target of that class, and the target can be localized with a rectangular candidate box. 3) WSOD faces three major problems: local dominance, instance ambiguity, and heavy memory consumption. Next, advanced WSOD algorithms are classified into three categories according to network architecture: algorithms based on optimized candidate box generation, segmentation-based algorithms, and self-training-based algorithms. The core of the first category is an improved candidate box generator in the basic framework, whereas the core of the segmentation-based and self-training-based algorithms is an improved detector. The difference between the latter two is that segmentation-based algorithms add a segmentation branch and guide detection through segmentation, whereas self-training-based algorithms optimize the detection network itself. Furthermore, the detection results of various WSOD algorithms are compared under several evaluation metrics through extensive experiments. This study selects and compares current mainstream WSOD algorithms on the PASCAL visual object classes 2007 (VOC2007) and VOC2012 datasets.
All algorithms use the 16-layer Visual Geometry Group network (VGG16) pretrained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset as the backbone for feature extraction to ensure a fair comparison. Moreover, only the performance of the model itself is evaluated, without considering the effect of fully supervised models such as Fast R-CNN. In the mean average precision (mAP) comparison on the VOC2007 dataset, multiple instance self-training (MIST) performs best, with a single model obtaining 54.9% mAP. The mAP of existing advanced WSOD algorithms lies between 50% and 60%. Compared with the online instance classifier refinement (OICR) algorithm, which is often used as the baseline, MIST improves mAP by less than 15%, which indicates that the field still has considerable room for improvement. On the VOC2012 dataset, negative deterministic information weakly supervised object detection (NDI-WSOD) achieves the best mAP, reaching 53.9%, which is 16% higher than OICR. The best algorithm in terms of correct localization (CorLoc) is the pyramidal multiple instance detection network (P-MIDN), which reaches 73.3%, 11.2% higher than OICR. In addition, various algorithms are compared on the Microsoft common objects in context (MS COCO) dataset, where the algorithm with the highest ValAP is again P-MIDN, at 27.4%. MIST combines optimized pseudo-label generation, a regularization technique, and bounding box regression in the self-training process, so it continues to perform well against its competitors on different datasets. Research on WSOD based on image-level labels has made great breakthroughs thanks to the vigorous development of deep learning. However, WSOD still faces many challenges, and a gap remains between it and fully supervised object detection. Finally, some valuable future research directions in this field are discussed: 1) generating fewer candidate boxes of higher quality, 2) designing a reasonable and efficient cooperative framework for detection and segmentation, 3) designing a reasonable strategy, or letting the network itself mine more and better positive samples, and 4) designing lightweight network models that can be deployed on mobile devices.
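The multiple instance learning view described in the abstract (each candidate box is an instance, the image is a bag, and only the bag label is known) can be sketched as follows. This is a minimal illustration, not the survey's implementation: it assumes a two-stream aggregation in the style of weakly supervised deep detection networks, where one softmax runs over classes, another runs over candidate boxes, and their product is summed over boxes to give an image-level score per class. The function names and the plain-list input format are hypothetical.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def bag_scores(
    cls_logits: List[List[float]], det_logits: List[List[float]]
) -> List[float]:
    """Aggregate per-box logits into per-class image (bag) scores.

    Both inputs have shape [num_boxes][num_classes] (assumed layout).
    """
    num_boxes = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes for each box.
    cls_prob = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over boxes for each class.
    det_prob = [
        softmax([det_logits[r][c] for r in range(num_boxes)])
        for c in range(num_classes)
    ]
    # Bag score for class c: sum over boxes of the two-stream product.
    return [
        sum(cls_prob[r][c] * det_prob[c][r] for r in range(num_boxes))
        for c in range(num_classes)
    ]


def bag_loss(scores: List[float], labels: List[int], eps: float = 1e-7) -> float:
    """Binary cross-entropy between bag scores and 0/1 image-level labels."""
    loss = 0.0
    for s, y in zip(scores, labels):
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return loss
```

Because the detection stream normalizes over boxes, each bag score stays in [0, 1] and can be trained directly against the image-level label. At test time, the per-box products can be read as detection scores for the candidate boxes.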

Keywords

weakly supervised object detection; weakly supervised semantic segmentation; proposal generator; self-training

