3D object detection for autonomous driving from image: a survey - benchmarks, constraints and error analysis
2023, Vol. 28, No. 6, pp. 1709-1740
Print publication date: 2023-06-16
DOI: 10.11834/jig.230036
Li Xiying, Ye Zhihui, Wei Shikui, Chen Ze, Chen Xiaotong, Tian Yonghong, Dang Jianwu, Fu Shujun, Zhao Yao. 2023. 3D object detection for autonomous driving from image: a survey - benchmarks, constraints and error analysis. Journal of Image and Graphics, 28(06): 1709-1740
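Obtaining accurate 3D position and size information for surrounding objects from high-resolution images is the basis of control and behavior decision-making in autonomous driving, so image-based 3D object detection is a research hotspot in the field. Existing surveys review the methodology and results of this field in considerable detail, but they do not analyze in depth or systematically the constraints that keep the detection accuracy of current methods unsatisfactory. Because autonomous driving places high demands on engineering application and current methods are mainly data-driven, this paper systematically reviews academic and industrial research results and industry applications in 3D object detection from the perspectives of common datasets and evaluation benchmarks, data impact, methodological constraints, and errors. First, it gives an overview from the perspectives of academic research results and applications in the autonomous driving industry. Then, it analyzes and summarizes four general-purpose datasets, including KITTI, in terms of data acquisition equipment, data accuracy, and annotation information, and compares the main evaluation metrics proposed with these datasets. Next, it analyzes, from the data and methodology sides, the main factors constraining algorithm performance and the errors they cause. On the data side, the main constraints are data accuracy, sample differences, the amount of annotated data, and annotation standards; on the methodology side, they mainly include prior geometric relationships, depth prediction error, and data modality. Finally, it summarizes the state of research in China and abroad and proposes future research directions that deserve attention in datasets, evaluation metrics, and object depth prediction.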
Accurate perception and measurement of the three-dimensional (3D) position and scale of surrounding objects is the basis for control and decision-making in autonomous driving. Autonomous vehicles are equipped with high-resolution cameras, light detection and ranging (LiDAR), radar, global positioning system (GPS)/inertial measurement unit (IMU), and other sensors. 3D object detection algorithms based on LiDAR or multi-modal data are difficult to deploy and apply because of the shortcomings of LiDAR sensors, such as high price, limited sensing range, and sparse point cloud data. In contrast, high-resolution cameras are widely used, cost less, and capture high-resolution spatial information with richer shape and appearance details. Image-based 3D object detection has therefore attracted increasing attention. However, the constraints on the detection accuracy of existing methods have not yet been analyzed thoroughly and systematically. We summarize research results and industrial applications from the perspectives of 1) commonly used datasets and evaluation criteria, 2) data impact, and 3) methodological constraints and prediction errors. First, we give a brief introduction from the perspectives of academic research and applications in the autonomous driving industry. We briefly review the latest progress at Baidu Apollo, Google Waymo, Tesla, and other autonomous driving companies, and trace the development of 3D object detection methods for autonomous driving. Second, we analyze and summarize four popular datasets (KITTI, nuScenes, the Waymo Open Dataset, and DAIR-V2X) in terms of 1) data acquisition sensors, data accuracy, and label information; 2) the key evaluation metrics proposed with these datasets; and 3) the pros, cons, and applicability of these evaluation metrics.
Third, we analyze the main constraints on image-based 3D object detection algorithms, and the resulting errors, from two sides: data and methodology. The main data constraints are data accuracy, sample differences, data volume, and data annotation. Data accuracy is mainly limited by equipment performance. Sample differences arise mainly from differences in object distance and viewing angle, occlusion, and truncation. Data volume is limited by the variety of 3D data types and the high difficulty of labeling, so 3D object detection datasets are much smaller than 2D object detection datasets. Data annotation concerns 3D bounding box labeling, labeling detail, and dataset quality, especially for the image annotations used in image-based 3D object detection. For non-rigid objects such as pedestrians, the annotation error is larger, and there is room to improve the labeling method. The general framework of image-based 3D object detection can be divided into one-stage and two-stage methods, and its limitations consist of 1) prior geometric relationships, 2) depth prediction accuracy, and 3) data modality. Prior geometric relationships cover the 2D-3D geometric constraints of 3D objects projected onto 2D images and the positional relationships between objects. Methods that rely on these priors have difficulty with the 2D-3D geometric constraints themselves and with occluded and truncated objects. Predicting depth from 2D images is an ill-posed problem: projecting the scene onto the image plane collapses one dimension, so depth information is lost in the image, which causes depth prediction errors. On the one hand, depth prediction is often inaccurate because of the projection relationship. On the other hand, continuous depth prediction often performs poorly where depth changes abruptly in the image (such as at object edges).
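The effect of the projection relationship on depth accuracy can be illustrated with a minimal sketch. It assumes the standard pinhole relation z = f * H / h (focal length f in pixels, physical object height H in meters, 2D box height h in pixels), which is a textbook simplification and not a method taken from the survey; the function names, the focal length, and the object height are hypothetical values chosen for illustration.

```python
def depth_from_height(focal_px: float, height_m: float, box_height_px: float) -> float:
    """Pinhole estimate of depth in meters: z = f * H / h."""
    if box_height_px <= 0:
        raise ValueError("box_height_px must be positive")
    return focal_px * height_m / box_height_px


def depth_error_for_pixel_error(focal_px: float, height_m: float,
                                box_height_px: float, pixel_error: float = 1.0) -> float:
    """Absolute depth change when the 2D box height is underestimated by pixel_error."""
    true_z = depth_from_height(focal_px, height_m, box_height_px)
    noisy_z = depth_from_height(focal_px, height_m, box_height_px - pixel_error)
    return abs(noisy_z - true_z)


FOCAL_PX = 720.0     # hypothetical focal length in pixels
CAR_HEIGHT_M = 1.5   # hypothetical physical height of a car

# A box 108 px tall corresponds to a car about 10 m away; 18 px to about 60 m.
near_error = depth_error_for_pixel_error(FOCAL_PX, CAR_HEIGHT_M, box_height_px=108.0)
far_error = depth_error_for_pixel_error(FOCAL_PX, CAR_HEIGHT_M, box_height_px=18.0)

print(f"1 px error at ~10 m shifts depth by {near_error:.2f} m")
print(f"1 px error at ~60 m shifts depth by {far_error:.2f} m")
```

Under these assumed values, the same one-pixel error in the 2D box moves the estimated depth far more for the distant object than for the near one, which is one way the projection relationship makes image-based depth estimates fragile.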
When the predicted depth is discretized, the depth classes are relatively coarse, and the classification granularity cannot be refined arbitrarily. The limitation of a single-image data modality shows up mainly as large depth prediction error. Detection performance can be improved by 1) simulating stereo signals or LiDAR point clouds, 2) using stereo images as auxiliary input, or 3) using point cloud data with accurate 3D information as a supervision signal. In addition, video data can improve detection accuracy to a certain extent. Fourth, we summarize and compare the current state of research in academia and industry. Finally, we propose future research directions in terms of datasets, evaluation metrics, and depth prediction.
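The coarseness of discretized depth can be illustrated with a minimal sketch. It assumes uniform depth bins whose centers serve as the predicted depth, which is only one possible discretization and is not necessarily the scheme used by any method in the survey; the depth range, bin counts, and function names are hypothetical.

```python
def bin_width(d_min: float, d_max: float, num_bins: int) -> float:
    """Width in meters of each uniform depth bin."""
    return (d_max - d_min) / num_bins


def depth_to_bin(depth: float, d_min: float, d_max: float, num_bins: int) -> int:
    """Index of the uniform bin containing depth, clamped to the valid range."""
    idx = int((depth - d_min) / bin_width(d_min, d_max, num_bins))
    return min(max(idx, 0), num_bins - 1)


def bin_to_depth(idx: int, d_min: float, d_max: float, num_bins: int) -> float:
    """Depth at the center of bin idx."""
    return d_min + (idx + 0.5) * bin_width(d_min, d_max, num_bins)


def quantization_error(depth: float, d_min: float, d_max: float, num_bins: int) -> float:
    """Error that remains even if the classifier picks the correct bin."""
    idx = depth_to_bin(depth, d_min, d_max, num_bins)
    return abs(bin_to_depth(idx, d_min, d_max, num_bins) - depth)


D_MIN, D_MAX = 2.0, 62.0   # hypothetical depth range in meters
TRUE_DEPTH = 23.9          # hypothetical ground-truth depth

coarse = quantization_error(TRUE_DEPTH, D_MIN, D_MAX, num_bins=30)   # 2.0 m bins
fine = quantization_error(TRUE_DEPTH, D_MIN, D_MAX, num_bins=120)    # 0.5 m bins

print(f"30 bins:  residual error {coarse:.2f} m")
print(f"120 bins: residual error {fine:.2f} m")
```

Even with a perfect classifier, the residual error is bounded only by half the bin width. Adding bins shrinks that bound, but it also leaves fewer training samples per class, which is one reading of why the granularity cannot be refined arbitrarily.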
Keywords: 3D object detection; benchmark; constraint; error analysis; autonomous driving; image processing; computer vision
Aghdam H H, Heravi E J, Demilew S S and Laganiere R. 2021. RAD: realtime and accurate 3D object detection on embedded systems//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Nashville, USA: IEEE: 2869-2877 [DOI: 10.1109/CVPRW53098.2021.00322http://dx.doi.org/10.1109/CVPRW53098.2021.00322]
AutoX. 2021. AutoX releases the fifth generation Gen5 fully driverless system. Automobile Parts, (7): #7
AutoX. 2021. AutoX发布第五代Gen5全无人驾驶系统. 汽车零部件, (7): #7
Badki A, Troccoli A, Kim K, Kautz J, Sen P and Gallo O. 2020. Bi3D: stereo depth estimation via binary classifications//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1597-1605 [DOI: 10.1109/CVPR42600.2020.00167http://dx.doi.org/10.1109/CVPR42600.2020.00167]
Baidu. 2022. ApolloAuto/apollo [EB/OL]. [2022-10-26]. https://github.com/ApolloAuto/apollohttps://github.com/ApolloAuto/apollo
Brazil G and Liu X M. 2019. M3D-RPN: monocular 3D region proposal network for object detection//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 9286-9295 [DOI: 10.1109/ICCV.2019.00938http://dx.doi.org/10.1109/ICCV.2019.00938]
Brazil G, Pons-Moll G, Liu X M and Schiele B. 2020. Kinematic 3D object detection in monocular video//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 135-152 [DOI: 10.1007/978-3-030-58592-1_9http://dx.doi.org/10.1007/978-3-030-58592-1_9]
Caesar H, Bankiti V, Lang A H, Vora S, Liong V E, Xu Q, Krishnan A, Pan Y, Baldam G and Beijbom O. 2020. nuScenes: a multimodal dataset for autonomous driving//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11621-11631 [DOI: 10.1109/CVPR42600.2020.01164http://dx.doi.org/10.1109/CVPR42600.2020.01164]
Cao J C and Tao C B. 2021. An anchor-guided 3D target detection algorithm based on stereo RCNN. Chinese Journal of Scientific Instrument, 42(12): 191-201
曹杰程, 陶重犇. 2021. 基于Stereo RCNN的锚引导3D目标检测算法. 仪器仪表学报, 42(12): 191-201 [DOI: 10.19650/j.cnki.cjsi.J2107801http://dx.doi.org/10.19650/j.cnki.cjsi.J2107801]
Chabot F, Chaouch M, Rabarisoa J, Teulière C and Chateau T. 2017. Deep MANTA: a coarse-to-fine many-task network for joint 2D and 3D vehicle analysis from monocular image//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1827-1836 [DOI: 10.1109/CVPR.2017.198http://dx.doi.org/10.1109/CVPR.2017.198]
Chang M F, Lambert J, Sangkloy P, Singh J, Bak S, Hartnett A, Wang D, Carr P, Lucey S, Ramanan D and Hays J. 2019. Argoverse: 3D tracking and forecasting with rich maps//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8740-8749 [DOI: 10.1109/CVPR.2019.00895http://dx.doi.org/10.1109/CVPR.2019.00895]
Chen F, Wu F, Huang Q H, Feng Y J, Ge Q, Ji Y M, Hu C H and Jing X Y. 2020a. Semantic frustum based VoxelNet for 3D object detection//Proceedings of 2020 Chinese Automation Congress (CAC). Shanghai, China: IEEE: 7629-7634 [DOI: 10.1109/CAC51589.2020.9327549http://dx.doi.org/10.1109/CAC51589.2020.9327549]
Chen N H. 2020. Challenging Tesla FSD, Baidu Apollo launches pilot assisted driving ANP. Business Observer, (12): 66-67
陈念航. 2020. 挑战特斯拉FSD, 百度Apollo推出领航辅助驾驶ANP. 企业观察家, (12): 66-67
Chen X Z, Kundu K, Zhu Y K, Berneshawi A, Ma H M, Fidler S and Urtasun R. 2015. 3D object proposals for accurate object class detection//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press: 424-432
Chen X Z, Kundu K, Zhu Y K, Ma H M, Fidler S and Urtasun R. 2018. 3D object proposals using stereo imagery for accurate object class detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(5): 1259-1272 [DOI: 10.1109/TPAMI.2017.2706685http://dx.doi.org/10.1109/TPAMI.2017.2706685]
Chen X Z, Ma H M, Wan J, Li B and Xia T. 2017. Multi-view 3D object detection network for autonomous driving//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6526-6534 [DOI: 10.1109/CVPR.2017.691http://dx.doi.org/10.1109/CVPR.2017.691]
Chen Y J, Tai L, Sun K and Li M Y. 2020b. MonoPair: monocular 3D object detection using pairwise spatial relationships//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 12090-12099 [DOI: 10.1109/CVPR42600.2020.01211http://dx.doi.org/10.1109/CVPR42600.2020.01211]
Chen Y L, Huang S J, Liu S, Yu B and Jia J Y. 2023. DSGN++: exploiting visual-spatial relation for stereo-based 3D detectors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4): 4416-4429 [DOI: 10.1109/TPAMI.2022.3197236http://dx.doi.org/10.1109/TPAMI.2022.3197236]
Chen Y L, Liu S, Shen X Y and Jia J Y. 2020c. DSGN: deep stereo geometry network for 3D object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 12533-12542 [DOI: 10.1109/CVPR42600.2020.01255http://dx.doi.org/10.1109/CVPR42600.2020.01255]
Chen Y N, Dai H and Ding Y. 2022. Pseudo-stereo for monocular 3D object detection in autonomous driving//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 877-887 [DOI: 10.1109/CVPR52688.2022.00096http://dx.doi.org/10.1109/CVPR52688.2022.00096]
Cheng Z M. 2022. Analysis of Tesla autopilot software system. For Repair and Maintenance, (1): 33-35
程增木. 2022. 特斯拉自动驾驶软件系统解析. 汽车维修与保养, (1): 33-35 [DOI: 10.3969/j.issn.1008-3170.2022.01.010http://dx.doi.org/10.3969/j.issn.1008-3170.2022.01.010]
Chi J. 2021. Analysis of Google’s patent technology for unmanned driving. Popular Standardization, (4): 162-164
池娟. 2021. 关于Google无人驾驶的专利技术分析. 大众标准化, (4): 162-164 [DOI: 10.3969/j.issn.1007-1350.2021.04.053http://dx.doi.org/10.3969/j.issn.1007-1350.2021.04.053]
Chi X R, Pei W, Zhu Y Y, Wang C L, Shi L Y and Li J F. 2022. Fast Stereo-RCNN 3D target detection algorithm. Journal of Chinese Mini-Micro Computer Systems, 43(10): 2157-2161
迟旭然, 裴伟, 朱永英, 王春立, 史良宇, 李锦峰. 2022. Fast Stereo-RCNN三维目标检测算法. 小型微型计算机系统, 43(10): 2157-2161 [DOI: 10.20009/j.cnki.21-1106/TP.2021-0167http://dx.doi.org/10.20009/j.cnki.21-1106/TP.2021-0167]
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S and Schiele B. 2016. The cityscapes dataset for semantic urban scene understanding//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 3213-3223 [DOI: 10.1109/CVPR.2016.350http://dx.doi.org/10.1109/CVPR.2016.350]
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848http://dx.doi.org/10.1109/CVPR.2009.5206848]
Ding M Y, Huo Y Q, Yi H W, Wang Z, Shi J P, Lu Z W and Luo P. 2020. Learning depth-guided convolutions for monocular 3D object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11669-11678 [DOI: 10.1109/CVPR42600.2020.01169http://dx.doi.org/10.1109/CVPR42600.2020.01169]
Dong W B and Isler V. 2020. Ellipse regression with predicted uncertainties for accurate multi-view 3D object estimation. [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2101.05212.pdfhttps://arxiv.org/pdf/2101.05212.pdf
Dou J, Xue J R and Fang J W. 2019. SEG-VoxelNet for 3D vehicle detection from RGB and LiDAR data//Proceedings of 2019 International Conference on Robotics and Automation (ICRA). Montreal, Canada: IEEE: 4362-4368 [DOI: 10.1109/ICRA.2019.8793492http://dx.doi.org/10.1109/ICRA.2019.8793492]
Eigen D, Puhrsch C and Fergus R. 2014. Depth map prediction from a single image using a multi-scale deep network [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/1406.2283.pdfhttps://arxiv.org/pdf/1406.2283.pdf
Everingham M, van Gool L, Williams C K I, Winn J and Zisserman A. 2010. The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2): 303-338 [DOI: 10.1007/s11263-009-0275-4http://dx.doi.org/10.1007/s11263-009-0275-4]
Feng Z Y, Jing L D, Yin P, Tian Y L and Li B. 2021. Advancing self-supervised monocular depth learning with sparse LiDAR [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2109.09628.pdfhttps://arxiv.org/pdf/2109.09628.pdf
Gao T Z, Pan H H and Gao H J. 2022. Monocular 3D object detection with sequential feature association and depth hint augmentation. IEEE Transactions on Intelligent Vehicles, 7(2): 240-250 [DOI: 10.1109/TIV.2022.3143954http://dx.doi.org/10.1109/TIV.2022.3143954]
Garg D, Wang Y, Hariharan B, Campbell M, Weinberger K Q and Chao W L. 2021. Wasserstein distances for stereo disparity estimation [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2007.03085.pdfhttps://arxiv.org/pdf/2007.03085.pdf
Geiger A, Lenz P, Stiller C and Urtasun R. 2022. The KITTI vision benchmark suite [EB/OL]. [2022-10-26]. https://www.cvlibs.net/datasets/kitti/index.phphttps://www.cvlibs.net/datasets/kitti/index.php
Geiger A, Lenz P and Urtasun R. 2012. Are we ready for autonomous driving? The KITTI vision benchmark suite//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE: 3354-3361 [DOI: 10.1109/CVPR.2012.6248074http://dx.doi.org/10.1109/CVPR.2012.6248074]
Girshick R, Donahue J, Darrell T and Malik J. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA: IEEE: 580-587[DOI: 10.1109/CVPR.2014.81http://dx.doi.org/10.1109/CVPR.2014.81]
Girshick R. 2015. Fast R-CNN//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 1440-1448 [DOI: 10.1109/ICCV.2015.169http://dx.doi.org/10.1109/ICCV.2015.169]
Guo W J. 2021. From manned testing tono safety officers, unmanned taxis are gradually approaching. Intelligent Connected Vehicles, (1): 21-25
郭文佳. 2021. 从载人测试到取消安全员无人驾驶出租车渐行渐近. 智能网联汽车, (1): 21-25
Guo X Y, Shi S S, Wang X G and Li H S. 2021. LIGA-Stereo: learning LiDAR geometry aware representations for stereo-based 3D detector//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 3133-3143 [DOI: 10.1109/ICCV48922.2021.00314http://dx.doi.org/10.1109/ICCV48922.2021.00314]
Hong Y, Dai H and Ding Y. 2022. Cross-modality knowledge distillation network for monocular 3D object detection//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 87-104 [DOI: 10.1007/978-3-031-20080-9_6http://dx.doi.org/10.1007/978-3-031-20080-9_6]
Houston J, Zuidhof G, Bergamini L, Ye Y W, Chen L, Jain A, Omari S, Lglovikov V and Ondruska P. 2020. One thousand and one hours: self-driving motion prediction dataset [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2006.14480.pdfhttps://arxiv.org/pdf/2006.14480.pdf
Ku J, Mozifian M, Lee J, Harakeh A and Waslander S L. 2018. Joint 3D proposal generation and object detection from view aggregation//Proceedings of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid, Spain: IEEE: 1-8 [DOI: 10.1109/IROS.2018.8594049http://dx.doi.org/10.1109/IROS.2018.8594049]
Ku J, Pon A D, Walsh S and Waslander S L. 2019a. Improving 3D object detection for pedestrians with virtual multi-view synthesis orientation estimation//Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Macau, China: IEEE: 3459-3466 [DOI: 10.1109/IROS40897.2019.8968242http://dx.doi.org/10.1109/IROS40897.2019.8968242]
Ku J, Pon A D and Waslander S L. 2019b. Monocular 3D object detection leveraging accurate proposals and shape reconstruction//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11859-11868 [DOI: 10.1109/CVPR.2019.01214http://dx.doi.org/10.1109/CVPR.2019.01214]
Lang A H, Vora S, Caesar H, Zhou L B, Yang J and Beijbom O. 2019. PointPillars: fast encoders for object detection from point clouds//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 12689-12697 [DOI: 10.1109/CVPR.2019.01298http://dx.doi.org/10.1109/CVPR.2019.01298]
Li B Y, Ouyang W L, Sheng L, Zeng X Y and Wang X G. 2019a. GS3D: an efficient 3D object detection framework for autonomous driving//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 1019-1028 [DOI: 10.1109/CVPR.2019.00111http://dx.doi.org/10.1109/CVPR.2019.00111]
Li H R, Duan Z C, Ma M J, Chen Y R, Li J Q and Zhao D B. 2021a. MVM3Det: a novel method for multi-view monocular 3D detection. [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2109.10473.pdfhttps://arxiv.org/pdf/2109.10473.pdf
Li L. 2022. Toyota “defected” to Tesla. Automotive Observer, (4): 14-15
李琳. 2022. 丰田“投靠”特斯拉. 汽车观察, (4): 14-15 [DOI: 10.3969/j.issn.1673-145X.2022.04.005http://dx.doi.org/10.3969/j.issn.1673-145X.2022.04.005]
Li P L, Chen X Z and Shen S J. 2019b. Stereo R-CNN based 3D object detection for autonomous driving//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7636-7644 [DOI: 10.1109/CVPR.2019.00783http://dx.doi.org/10.1109/CVPR.2019.00783]
Li P X, Su S and Zhao H C. 2021b. RTS3D: real-time stereo 3D detection from 4D feature-consistency embedding space for autonomous driving. Proceedings of 2021 AAAI Conference on Artificial Intelligence, 35(3): 1930-1939 [DOI: 10.1609/aaai.v35i3.16288http://dx.doi.org/10.1609/aaai.v35i3.16288]
Li P X, Zhao H C, Liu P F and Cao F D. 2020. RTM3D: real-time monocular 3D detection from object keypoints for autonomous driving//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 644-660 [DOI: 10.1007/978-3-030-58580-8_38http://dx.doi.org/10.1007/978-3-030-58580-8_38]
Liao Y Y, Xie J and Geiger A. 2023. KITTI-360: a novel dataset and benchmarks for urban scene understanding in 2D and 3D. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3): 3292-3310 [DOI: 10.1109/TPAMI.2022.3179507http://dx.doi.org/10.1109/TPAMI.2022.3179507]
Liu A Z. 2022. Ideal ONE: enhanced by intelligent driving. Intelligent and Connected Vehicles, (1): 91-93
刘岸泽. 2022. 理想ONE: 智能驾驶加持. 智能网联汽车, (1): 91-93
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y and Berg A C. 2016. SSD: single shot multibox detector//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 21-37 [DOI: 10.1007/978-3-319-46448-0_2http://dx.doi.org/10.1007/978-3-319-46448-0_2]
Liu X P, Xue N and Wu T F. 2022. Learning auxiliary monocular contexts helps monocular 3D object detection. Proceedings of 2022 AAAI Conference on Artificial Intelligence, 36(2): 1810-1818 [DOI: 10.1609/aaai.v36i2.20074http://dx.doi.org/10.1609/aaai.v36i2.20074]
Liu Y X, Wang L J and Liu M. 2021a. YOLOStereo3D: a step back to 2D for efficient stereo 3D detection//Proceedings of 2021 IEEE International Conference on Robotics and Automation (ICRA). Xi′an, China: IEEE: 13018-13024 [DOI: 10.1109/ICRA48506.2021.9561423http://dx.doi.org/10.1109/ICRA48506.2021.9561423]
Liu Y X, Yuan Y X and Liu M. 2021b. Ground-aware monocular 3D object detection for autonomous driving. IEEE Robotics and Automation Letters, 6(2): 919-926 [DOI: 10.1109/LRA.2021.3052442http://dx.doi.org/10.1109/LRA.2021.3052442]
Liu Z C, Wu Z Z and Tóth R. 2020. SMOKE: single-stage monocular 3D object detection via keypoint estimation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA: IEEE: 4289-4298 [DOI: 10.1109/CVPRW50498.2020.00506http://dx.doi.org/10.1109/CVPRW50498.2020.00506]
Lu H H, Chen X S, Zhang G Y, Zhou Q H, Ma Y B and Zhao Y. 2019. SCANet: spatial-channel attention network for 3D object detection//Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, UK: IEEE: 1992-1996 [DOI: 10.1109/ICASSP.2019.8682746http://dx.doi.org/10.1109/ICASSP.2019.8682746]
Luo S J, Dai H, Shao L and Ding Y. 2021. M3DSSD: monocular 3D single stage object detector//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 6141-6150 [DOI: 10.1109/CVPR46437.2021.00608http://dx.doi.org/10.1109/CVPR46437.2021.00608]
Ma X Z, Wang Z H, Li H J, Zhang P B, Ouyang W L and Fan X. 2019. Accurate monocular 3D object detection via color-embedded 3D reconstruction for autonomous driving//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6850-6859 [DOI: 10.1109/ICCV.2019.00695http://dx.doi.org/10.1109/ICCV.2019.00695]
Ma X Z, Zhang Y M, Xu D, Zhou D Z, Yi S, Li H J and Ouyang W L. 2021. Delving into localization errors for monocular 3D object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 4719-4728 [DOI: 10.1109/CVPR46437.2021.00469http://dx.doi.org/10.1109/CVPR46437.2021.00469]
Mao J G, Shi S S, Wang X G and Li H S. 2022. 3D object detection for autonomous driving: a review and new outlooks[EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2206.09474.pdfhttps://arxiv.org/pdf/2206.09474.pdf
Mao J G, Xue Y J, Niu M Z, Bai H Y, Feng J S, Liang X D, Xu H and Xu C J. 2021. Voxel transformer for 3D object detection//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 3144-3153 [DOI: 10.1109/ICCV48922.2021.00315http://dx.doi.org/10.1109/ICCV48922.2021.00315]
Meng X. 2021. Meng Xing: self-evolution of Didi autonomous driving. Intelligent and Connected Vehicles, (3): 42-44
孟醒. 2021. 孟醒: 滴滴自动驾驶的自我进化. 智能网联汽车, (3): 42-44
Mousavian A, Anguelov D, Flynn J and Košeck J. 2017. 3D bounding box estimation using deep learning and geometry//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5632-5640 [DOI: 10.1109/CVPR.2017.597http://dx.doi.org/10.1109/CVPR.2017.597]
Nabati R and Qi H R. 2021. CenterFusion: center-based radar and camera fusion for 3D object detection//Proceedings of 2021 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE: 1526-1535 [DOI: 10.1109/WACV48630.2021.00157http://dx.doi.org/10.1109/WACV48630.2021.00157]
Novk L. 2017. Vehicle Detection and Pose Estimation for Autonomous Driving. Prague: Czech Technical University in Prague
Paperswithcode. 2023a. 3D object detection from stereo images on KITTI cars moderate [EB/OL]. [2023-01-30]. https://paperswithcode.com/sota/3d-object-detection-from-stereo-images-on-1https://paperswithcode.com/sota/3d-object-detection-from-stereo-images-on-1
Paperswithcode. 2023b. Monocular 3D object detection on KITTI cars Moderate [EB/OL]. [2023-01-30]. https://paperswithcode.com/sota/monocular-3d-object-detection-on-kitti-carshttps://paperswithcode.com/sota/monocular-3d-object-detection-on-kitti-cars
Park D, Ambruş R, Guizilini V, Li J and Gaidon A. 2021. Is pseudo-lidar needed for monocular 3D object detection?//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 3122-3132 [DOI: 10.1109/ICCV48922.2021.00313http://dx.doi.org/10.1109/ICCV48922.2021.00313]
Patil A, Malla S, Gang H and Chen Y T. 2019. The H3D dataset for full-surround 3D multi-object detection and tracking in crowded urban scenes//Proceedings of 2019 International Conference on Robotics and Automation (ICRA). Montreal, Canada: IEEE: 9552-9557 [DOI: 10.1109/ICRA.2019.8793925http://dx.doi.org/10.1109/ICRA.2019.8793925]
Peng W L, Pan H, Liu H and Sun Y. 2020. IDA-3D: instance-depth-aware 3D object detection from stereo vision for autonomous driving//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 13012-13021 [DOI: 10.1109/CVPR42600.2020.01303http://dx.doi.org/10.1109/CVPR42600.2020.01303]
Pony.ai. 2022. Technology [EB/OL]. [2022-10-26]. https://www.pony.ai/tech?lang=en (https://www.pony.ai/tech?lang=en(
小马智行. 2022. 核心技术)[EB/OL]. [2022-10-26]. https://www.pony.ai/tech?lang=zhhttps://www.pony.ai/tech?lang=zh
Qi C R, Su H, Mo K and Guibas L J. 2017a. PointNet: deep learning on point sets for 3D classification and segmentation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 77-85 [DOI: 10.1109/CVPR.2017.16http://dx.doi.org/10.1109/CVPR.2017.16]
Qi C R, Yi L, Su H and Guibas L J. 2017b. PointNet++: deep hierarchical feature learning on point sets in a metric space//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc.: 5100-5109
Qian R, Garg D, Wang Y, You Y, Belongie S, Hariharan B, Campbell M, Weinberger K Q and Chao W L. 2020. End-to-end pseudo-LiDAR for image-based 3D object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 5880-5889 [DOI: 10.1109/CVPR42600.2020.00592http://dx.doi.org/10.1109/CVPR42600.2020.00592]
Qin C, Wang Y F, Zhang Y C and Yin C L. 2022. 3D object detection based on extremely sparse laser point cloud and RGB images. Laser and Optoelectronics Progress, 59(18): 447-458
秦超, 王亚飞, 张宇超, 殷承良. 2022. 基于极端稀疏激光点云和RGB图像的3D目标检测. 激光与光电子学进展, 59(18): 447-458 [DOI: 10.3788/LOP202259.1828004http://dx.doi.org/10.3788/LOP202259.1828004]
Qin Z Y, Wang J L and Lu Y. 2019a. MonoGRNet: a geometric reasoning network for monocular 3D object localization. Proceedings of the AAAI Conference on Artificial Intelligence, 33(1): 8851-8858 [DOI: 10.1609/aaai.v33i01.33018851http://dx.doi.org/10.1609/aaai.v33i01.33018851]
Qin Z Y, Wang J L and Lu Y. 2019b. Triangulation learning network: from monocular to stereo 3D object detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7607-7615 [DOI: 10.1109/CVPR.2019.00780http://dx.doi.org/10.1109/CVPR.2019.00780]
Reading C, Harakeh A, Chae J and Waslander S L. 2021. Categorical depth distribution network for monocular 3D object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 8551-8560 [DOI: 10.1109/CVPR46437.2021.00845http://dx.doi.org/10.1109/CVPR46437.2021.00845]
Redmon J, Divvala S, Girshick R and Farhadi A. 2016. You only look once: unified, real-time object detection//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 779-788 [DOI: 10.1109/CVPR.2016.91http://dx.doi.org/10.1109/CVPR.2016.91]
Ren S Q, He K M, Girshick R and Sun J. 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149 [DOI: 10.1109/TPAMI.2016.2577031http://dx.doi.org/10.1109/TPAMI.2016.2577031]
Roddick T, Kendall A and Cipolla R. 2018. Orthographic feature transform for monocular 3D object detection [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/1811.08188.pdfhttps://arxiv.org/pdf/1811.08188.pdf
Ruan X G, Yan W J, Huang J and Guo P Y. 2022. Monocular depth estimation method based on dual-discriminator generative adversarial networks. Journal of Beijing University of Technology, 48(9): 928-934
阮晓钢, 颜文静, 黄静, 郭佩远. 2022. 基于双鉴别器生成对抗网络的单目深度估计方法. 北京工业大学学报, 48(9): 928-934 [DOI: 10.11936/bjutxb2021050001http://dx.doi.org/10.11936/bjutxb2021050001]
Simonelli A, Buló S R, Porzi L, Ricci E and Kontschieder P. 2020. Towards generalization across depth for monocular 3D object detection//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 767-782 [DOI: 10.1007/978-3-030-58542-6_46http://dx.doi.org/10.1007/978-3-030-58542-6_46]
Song X B, Wang P, Zhou D F, Zhu R, Guan C Y, Dai Y C, Su H, Li H D and Yang R G. 2019. ApolloCar3D: a large 3D car instance understanding benchmark for autonomous driving//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5447-5457 [DOI: 10.1109/CVPR.2019.00560http://dx.doi.org/10.1109/CVPR.2019.00560]
Su K Q, Yan W Q and Xu J D. 2022. 3D object detection based on multi-path feature pyramid network for stereo images. Journal of Beijing University of Aeronautics and Astronautics, 48(8): 1487-1494
苏凯祺, 阎维青, 徐金东. 2022. 基于立体图像的多路径特征金字塔网络3D目标检测. 北京航空航天大学学报, 48(8): 1487-1494 [DOI: 10.13700/j.bh.1001-5965.2021.0525http://dx.doi.org/10.13700/j.bh.1001-5965.2021.0525]
Sun P, Kretzschmar H, Dotiwalla X, Chouard A, Patnaik V, Tsui P, Guo J, Zhou Y, Chai Y N, Caine B, Vasudevan V, Han W, Ngiam J, Zhao H, Timofeev A, Ettinger S, Krivokon M, Gao A, Joshi A, Zhang Y, Shlens J, Chen Z F and Anguelov D. 2020. Scalability in perception for autonomous driving: waymo open dataset//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2443-2451 [DOI: 10.1109/CVPR42600.2020.00252http://dx.doi.org/10.1109/CVPR42600.2020.00252]
Talpes E, Sarma D D, Venkataramanan G, Bannon P, McGee B, Floering B, Jalote A, Hsiong C, Arora S, Gorti A and Sachdev G S. 2020. Compute solution for Tesla’s full self-driving computer. IEEE Micro, 40(2): 25-35 [DOI: 10.1109/mm.2020.2975764http://dx.doi.org/10.1109/mm.2020.2975764]
Wang K R, Tan J G, Du Q, Chen L L, Li J M and Zhang X L. 2020. 3D object detection based on iterative self-training. Acta Optica Sinica, 40(9): 133-145
王康如, 谭锦钢, 杜量, 陈利利, 李嘉茂, 张晓林. 2020. 基于迭代式自主学习的三维目标检测. 光学学报, 40(9): 133-145 [DOI: 10.3788/AOS202040.0915005http://dx.doi.org/10.3788/AOS202040.0915005]
Wang L, Du L, Ye X Q, Fu Y W, Guo G D, Xue X Y, Feng J F and Zhang L. 2021a. Depth-conditioned dynamic message propagation for monocular 3D object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 454-463 [DOI: 10.1109/CVPR46437.2021.00052http://dx.doi.org/10.1109/CVPR46437.2021.00052]
Wang Q C, Shuai H and Liu Q S. 2022. Monocular accumulated depth estimation with recursive feature fusion. Journal of Computer-Aided Design and Computer Graphics, 34(10): 1533-1541
王秋晨, 帅惠, 刘青山. 2022. 递归特征融合的单目深度累积估计. 计算机辅助设计与图形学学报, 34(10): 1533-1541 [DOI: 10.3724/SP.J.1089.2022.19728]
Wang Q D, Wang Q K, Cheng K and Liu Z H. 2022. Monocular depth estimation with enhanced edge. Journal of Huazhong University of Science and Technology (Natural Science Edition), 50(3): 36-42
王泉德, 王奇坤, 程凯, 刘子航. 2022. 强化边缘的单目图像深度估计. 华中科技大学学报(自然科学版), 50(3): 36-42 [DOI: 10.13245/j.hust.220307]
Wang T, Zhu X G, Pang J M and Lin D H. 2021b. FCOS3D: fully convolutional one-stage monocular 3D object detection//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal, Canada: IEEE: 913-922 [DOI: 10.1109/ICCVW54120.2021.00107]
Wang Y, Chao W L, Garg D, Hariharan B, Campbell M and Weinberger K Q. 2019. Pseudo-LiDAR from visual depth estimation: bridging the gap in 3D object detection for autonomous driving//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8437-8445 [DOI: 10.1109/CVPR.2019.00864]
Wang Y, Yang B, Hu R, Liang M and Urtasun R. 2021c. PLUMENet: efficient 3D object detection from stereo images//Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Prague, Czech Republic: IEEE: 3383-3390 [DOI: 10.1109/IROS51168.2021.9635875]
Wang Y Q and Tao Y. 2022. Research on 3D object detection algorithm based on binocular vision. Microelectronics and Computer, 39(2): 19-25
王一强, 陶洋. 2022. 基于双目视觉的三维目标检测算法研究. 微电子学与计算机, 39(2): 19-25 [DOI: 10.19304/j.issn1000-7180.2021.0730]
Weng X S and Kitani K. 2019. Monocular 3D object detection with pseudo-LiDAR point cloud//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul, Korea (South): IEEE: 857-866 [DOI: 10.1109/ICCVW.2019.00114]
WorldAuto. 2022. 2022 WIDC officially ends-the new HI edition of Polar Fox Alpha S won two gold awards. WorldAuto, (7): 60-63
WorldAuto. 2022. 2022 WIDC正式落幕 极狐阿尔法S全新HI版荣获两大项金奖. 世界汽车, (7): 60-63 [DOI: 10.3969/j.issn.1005-9008.2022.07.010]
Xiang Y, Choi W, Lin Y Q and Savarese S. 2017. Subcategory-aware convolutional neural networks for object proposals and detection//Proceedings of 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). Santa Rosa, USA: IEEE: 924-933 [DOI: 10.1109/WACV.2017.108]
Xu Q G, Zhong Y Q and Neumann U. 2022. Behind the curtain: learning occluded shapes for 3D object detection. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3): 2893-2901 [DOI: 10.1609/aaai.v36i3.20194]
Xu Q G, Zhou Y, Wang W Y, Qi C R and Anguelov D. 2021. SPG: unsupervised domain adaptation for 3D object detection via semantic point generation//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 15426-15436 [DOI: 10.1109/ICCV48922.2021.01516]
Xu Z B, Zhang W, Ye X Q, Tan X, Yang W, Wen S L, Ding E R, Meng A J and Huang L S. 2020. ZoomNet: part-aware adaptive zooming neural network for 3D object detection. Proceedings of the AAAI Conference on Artificial Intelligence, 34(7): 12557-12564 [DOI: 10.1609/aaai.v34i07.6945]
Yan J, Fang Z J and Gao Y B. 2020. 3D object detection based on domain attention and dilated convolution. Journal of Image and Graphics, 25(6): 1221-1234
严娟, 方志军, 高永彬. 2020. 结合混合域注意力与空洞卷积的3维目标检测. 中国图象图形学报, 25(6): 1221-1234 [DOI: 10.11834/jig.190378]
Yang B Y, Du X P, Fang Y Q, Li P Y and Wang Y. 2021. Review of rigid object pose estimation from a single image. Journal of Image and Graphics, 26(2): 334-354
杨步一, 杜小平, 方宇强, 李佩阳, 王阳. 2021. 单幅图像刚体目标姿态估计方法综述. 中国图象图形学报, 26(2): 334-354 [DOI: 10.11834/jig.200037]
Yang H T, Lei L and Lin Y C. 2022. Binocular depth estimation algorithm based on multi-scale attention feature fusion. Laser and Optoelectronics Progress, 59(18): 259-267
杨蕙同, 雷亮, 林永春. 2022. 基于多尺度注意力特征融合的双目深度估计算法. 激光与光电子学进展, 59(18): 259-267 [DOI: 10.3788/LOP202259.1815005]
You Y R, Wang Y, Chao W L, Garg D, Pleiss G, Hariharan B, Campbell M and Weinberger K Q. 2020. Pseudo-LiDAR++: accurate depth for 3D object detection in autonomous driving[EB/OL]. [2023-01-01]. https://arxiv.org/pdf/1906.06310.pdf
Yu H B, Luo Y Z, Shu M, Huo Y Y, Yang Z B, Shi Y F, Guo Z L, Li H Y, Hu X, Yuan J R and Nie Z Q. 2022. DAIR-V2X: a large-scale dataset for vehicle-infrastructure cooperative 3D object detection//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 21329-21338 [DOI: 10.1109/CVPR52688.2022.02067]
Yu J X, Zhang M Q and Su Y T. 2021. Three-dimensional vehicle detection algorithm based on binocular vision. Laser and Optoelectronics Progress, 58(2): 301-306
于洁潇, 张美琪, 苏育挺. 2021. 基于双目视觉的三维车辆检测算法. 激光与光电子学进展, 58(2): 301-306 [DOI: 10.3788/LOP202158.0215004]
Zhang C, Ma Y X, Wan J W, Xu K and Xu G Q. 2022. Multi-scale monocular depth estimation network based on channel attention. Journal of Signal Processing, 38(11): 2332-2341
张聪, 马燕新, 万建伟, 许可, 徐国权. 2022. 基于通道注意力机制的单目深度估计. 信号处理, 38(11): 2332-2341 [DOI: 10.16798/j.issn.1003-0530.2022.11.010]
Zhang J L, Wei M and Wen W. 2022. Monocular depth estimation based on DSPP. Application Research of Computers, 39(12): 3837-3840
张竞澜, 魏敏, 文武. 2022. 基于DSPP的单目图像深度估计. 计算机应用研究, 39(12): 3837-3840 [DOI: 10.19734/j.issn.1001-3695.2022.05.0212]
Zhang J N, Su Q X, Liu P Y, Gu H Q and Wang W. 2020. A monocular 3D target detection network with perspective projection. Robot, 42(3): 278-288
张峻宁, 苏群星, 刘鹏远, 谷宏强, 王威. 2020. 一种基于透视投影的单目3D目标检测网络. 机器人, 42(3): 278-288 [DOI: 10.13973/j.cnki.robot.190221]
Zhang Y F, Li Y X, Zhao M B, Yu X Y, Zhan Y L and Lin W Y. 2021. Object distance estimation based on stereo regional disparity regression. Journal of Image and Graphics, 26(7): 1604-1613
张羽丰, 李昱希, 赵明璧, 喻晓源, 占云龙, 林巍峣. 2021. 局部双目视差回归的目标距离估计. 中国图象图形学报, 26(7): 1604-1613 [DOI: 10.11834/jig.200511]
Zhang Y F, Zhang Q J, Zhu Z Y, Hou J H and Yuan Y X. 2022. GLENet: boosting 3D object detectors with generative label uncertainty estimation [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/2207.02466.pdf
Zhang Y P, Lu J W and Zhou J. 2021. Objects are different: flexible monocular 3D object detection//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 3288-3297 [DOI: 10.1109/CVPR46437.2021.00330]
Zhao H Q, Fang Z J and Gao Y B. 2019. Prior direction angle estimation in 3D object detection. Transducer and Microsystem Technologies, 38(6): 35-38
赵华卿, 方志军, 高永彬. 2019. 三维目标检测中的先验方向角估计. 传感器与微系统, 38(6): 35-38 [DOI: 10.13873/J.1000-9787(2019)06-0035-04]
Zhao X, Liang H R and Liang R H. 2019. Combining object detection and binocular vision for 3D car pose estimation. Journal of Computer-Aided Design and Computer Graphics, 31(9): 1518-1527
赵邢, 梁浩然, 梁荣华. 2019. 结合目标检测与双目视觉的三维车辆姿态检测. 计算机辅助设计与图形学学报, 31(9): 1518-1527 [DOI: 10.3724/SP.J.1089.2019.17625]
Zheng W, Tang W L, Jiang L and Fu C W. 2021. SE-SSD: self-ensembling single-stage object detector from point cloud//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 14489-14498 [DOI: 10.1109/CVPR46437.2021.01426]
Zhou D K, Tian J and Yang X. 2021. Unsupervised monocular image depth estimation based on the prediction of local plane parameters. Journal of Image and Graphics, 26(1): 165-175
周大可, 田径, 杨欣. 2021. 结合局部平面参数预测的无监督单目图像深度估计. 中国图象图形学报, 26(1): 165-175 [DOI: 10.11834/jig.200364]
Zhou X Y, Wang D Q and Krähenbühl P. 2019. Objects as points [EB/OL]. [2023-01-01]. https://arxiv.org/pdf/1904.07850.pdf
Zhou Y and Tuzel O. 2018. VoxelNet: end-to-end learning for point cloud based 3D object detection//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4490-4499 [DOI: 10.1109/CVPR.2018.00472]
Zhou Z Y, Du L, Ye X Q, Zou Z K, Tan X, Zhang L, Xue X Y and Feng J F. 2022. SGM3D: stereo guided monocular 3D object detection. IEEE Robotics and Automation Letters, 7(4): 10478-10485 [DOI: 10.1109/LRA.2022.3191849]