融合点云与图像的环境目标检测研究进展
Survey on the fusion of point clouds and images for environmental object detection
2024, Vol. 29, No. 6, pp. 1765-1784
Print publication date: 2024-06-16
DOI: 10.11834/jig.240030
贾明达, 杨金明, 孟维亮, 郭建伟, 张吉光, 张晓鹏. 2024. 融合点云与图像的环境目标检测研究进展. 中国图象图形学报, 29(06):1765-1784
Jia Mingda, Yang Jinming, Meng Weiliang, Guo Jianwei, Zhang Jiguang, Zhang Xiaopeng. 2024. Survey on the fusion of point clouds and images for environmental object detection. Journal of Image and Graphics, 29(06):1765-1784
In the field of digital simulation applications, and especially in the development of autonomous driving, object detection is a crucial link: it concerns the perception of objects in the surrounding environment and provides key information for the decision-making and planning of intelligent equipment. In recent years, with advances in sensor technology, images and point clouds have become the two main sources of perception data, and each offers unique advantages for deep-learning-based object detection research. To study existing point-cloud- and image-based object detection methods more comprehensively, this paper systematically reviews and summarizes three classes of object detection algorithms, based on images, on point clouds, and on the combination of the two, explores how these two data sources can be fused to improve the accuracy, stability, and robustness of object detection, and offers an outlook on future directions for environmental object detection that fuses point clouds and images.
In the field of digital simulation technology applications, especially in the development of autonomous driving, object detection is a crucial component. It involves the perception of objects in the surrounding environment, which provides essential information for the decision-making and planning of intelligent systems. Traditional object detection methods typically involve steps such as feature extraction, object classification, and position regression on images. However, these methods are limited by manually designed features and the performance of classifiers, which restricts their effectiveness in complex scenes and for objects with significant variations. The advent of deep learning has led to the widespread adoption of object detection methods based on deep neural networks. Notably, the convolutional neural network (CNN) has emerged as one of the most prominent approaches in this field. By stacking multiple layers of convolution and pooling operations, CNNs automatically extract meaningful feature representations from image data. In addition to image data, light detection and ranging (LiDAR) data play a crucial role in object detection tasks, particularly for 3D object detection. LiDAR data represent objects through a set of unordered and discrete points sampled on their surfaces. Accurately detecting the point cloud clusters that correspond to objects and estimating their poses from these unordered points is a challenging task. With their unique characteristics, LiDAR data offer high-precision obstacle detection and distance measurement, which supports the perception of surrounding roadways, vehicles, and pedestrian targets. In real-world autonomous driving and related environmental perception scenarios, relying on a single modality presents numerous challenges. For instance, while image data provide a wide variety of high-resolution visual information such as color, texture, and shape, they are susceptible to lighting conditions. In addition, models may struggle to handle occlusions caused by objects obstructing the view, owing to the inherent limitations of camera perspectives. LiDAR, by contrast, performs well under challenging lighting conditions and excels at accurately locating objects in space, even in diverse and harsh weather scenarios. However, it has its own limitations. Specifically, the low resolution of LiDAR data results in sparse point clouds when detecting distant targets, and extracting semantic information from LiDAR data is more difficult than from image data. Thus, an increasing number of researchers are emphasizing multimodal environmental object detection. A robust multimodal perception algorithm can offer richer feature information, enhanced adaptability to diverse environments, and improved detection accuracy. Such capabilities empower the perception system to deliver reliable results across various environmental conditions. That said, multimodal object detection algorithms also face limitations and pressing challenges that require attention. One challenge is the difficulty of data annotation. Annotating point cloud and image data is relatively complex and time-consuming, particularly for large-scale datasets. Moreover, accurately labeling point cloud data is challenging due to their sparsity and the presence of noisy points. Addressing these issues is crucial for further advances in multimodal object detection.
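To make the handling of unordered point sets concrete, the following minimal sketch (an illustration only, not any specific detector from the surveyed literature) shows a PointNet-style encoder in PyTorch: a shared per-point MLP followed by a symmetric max-pooling step, so the extracted feature does not depend on the ordering of the LiDAR points.

```python
# Minimal sketch of a permutation-invariant point encoder (illustrative only).
import torch
import torch.nn as nn

class TinyPointEncoder(nn.Module):
    def __init__(self, in_dim=4, feat_dim=128):
        super().__init__()
        # Shared per-point MLP: the same weights are applied to every point.
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, points):            # points: (B, N, 4) = x, y, z, intensity
        per_point = self.mlp(points)      # (B, N, feat_dim)
        # Max pooling over the point dimension is a symmetric operation,
        # so the output is invariant to the arbitrary ordering of the points.
        global_feat, _ = per_point.max(dim=1)
        return global_feat                # (B, feat_dim)

cloud = torch.rand(2, 1024, 4)            # two toy clouds of 1024 points each
print(TinyPointEncoder()(cloud).shape)    # torch.Size([2, 128])
```

The same permutation-invariance idea underlies most point-based detection backbones, whereas image features are extracted with ordinary CNNs over the regular pixel grid.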
Moreover, point cloud and image data, as two distinct perception modalities, differ significantly in data structure and feature representation. The current research focus lies in effectively integrating the information from the two modalities and extracting accurate, comprehensive features that can be exploited effectively. Furthermore, processing large-scale point cloud data is equally challenging: point clouds typically contain a substantial number of 3D coordinates, which places greater demands on computing resources and algorithmic efficiency than pure image data. This study summarizes and organizes existing approaches to help researchers gain a deeper and more efficient understanding of object detection algorithms that integrate images and point clouds. It classifies object detection algorithms into those based on images, those based on point clouds, and those based on the multimodal fusion of both. Furthermore, we analyze the strengths and weaknesses of the various methods and discuss potential solutions. We also provide a comprehensive review of the development of object detection algorithms that fuse point clouds and images, considering aspects such as data collection, data representation, and model design. Finally, we offer a perspective on future directions for environmental object detection, with the goal of enhancing the overall capabilities of autonomous systems.
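As a concrete illustration of the geometric alignment that point-image fusion relies on, the following sketch projects LiDAR points into a camera image plane with a pinhole model; the calibration values are hypothetical placeholders, and a real system would use the intrinsic and extrinsic parameters supplied with the dataset. Once each 3D point has a pixel coordinate, per-pixel image information (e.g., semantic scores or appearance features) can be attached to it.

```python
# Minimal sketch of LiDAR-to-image projection (hypothetical calibration values).
import numpy as np

def project_lidar_to_image(points_xyz, T_cam_lidar, K):
    """points_xyz: (N, 3) LiDAR coordinates; returns (N, 2) pixel coords and a validity mask."""
    ones = np.ones((points_xyz.shape[0], 1))
    pts_cam = (T_cam_lidar @ np.hstack([points_xyz, ones]).T).T  # LiDAR frame -> camera frame
    in_front = pts_cam[:, 2] > 0                                 # keep points in front of the camera
    uvw = (K @ pts_cam.T).T                                      # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]                                # normalize by depth
    return uv, in_front

# Placeholder intrinsics and extrinsics for illustration only.
K = np.array([[721.5,   0.0, 609.6],
              [  0.0, 721.5, 172.8],
              [  0.0,   0.0,   1.0]])
T_cam_lidar = np.eye(3, 4)                                       # identity extrinsics as a stand-in
uv, mask = project_lidar_to_image(np.random.rand(100, 3) * 20.0, T_cam_lidar, K)
print(uv[mask].shape)                                            # pixel coordinates of valid points
```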
Keywords: point cloud; autonomous driving; multimodal; object detection; fusion