三维视觉前沿进展
Recent progress in 3D vision
2021, Vol. 26, No. 6, Pages 1389-1428
Print publication date: 2021-06-16
Accepted: 2021-02-09
DOI: 10.11834/jig.210043
龙霄潇, 程新景, 朱昊, 张朋举, 刘浩敏, 李俊, 郑林涛, 胡庆拥, 刘浩, 曹汛, 杨睿刚, 吴毅红, 章国锋, 刘烨斌, 徐凯, 郭裕兰, 陈宝权. 三维视觉前沿进展[J]. 中国图象图形学报, 2021,26(6):1389-1428.
Xiaoxiao Long, Xinjing Cheng, Hao Zhu, Pengju Zhang, Haomin Liu, Jun Li, Lintao Zheng, Qingyong Hu, Hao Liu, Xun Cao, Ruigang Yang, Yihong Wu, Guofeng Zhang, Yebin Liu, Kai Xu, Yulan Guo, Baoquan Chen. Recent progress in 3D vision[J]. Journal of Image and Graphics, 2021,26(6):1389-1428.
在自动驾驶、机器人、数字城市以及虚拟/混合现实等应用的驱动下,三维视觉得到了广泛的关注。三维视觉研究主要围绕深度图像获取、视觉定位与制图、三维建模及三维理解等任务而展开。本文围绕上述三维视觉任务,对国内外研究进展进行了综合评述和对比分析。首先,针对深度图像获取任务,从非端到端立体匹配、端到端立体匹配及无监督立体匹配3个方面对立体匹配研究进展进行了回顾,从深度回归网络和深度补全网络两个方面对单目深度估计研究进展进行了回顾。其次,针对视觉定位与制图任务,从端到端视觉定位和非端到端视觉定位两个方面对大场景下的视觉定位研究进展进行了回顾,并从视觉同步定位与地图构建和融合其他传感器的同步定位与地图构建两个方面对同步定位与地图构建的研究进展进行了回顾。再次,针对三维建模任务,从深度三维表征学习、深度三维生成模型、结构化表征学习与生成模型以及基于深度学习的三维重建等4个方面对三维几何建模研究进展进行了回顾,并从多视RGB重建、单深度相机和多深度相机方法以及单视图RGB方法等3个方面对人体动态建模研究进展进行了回顾。最后,针对三维理解任务,从点云语义分割和点云实例分割两个方面对点云语义理解研究进展进行了回顾。在此基础上,给出了三维视觉研究的未来发展趋势,旨在为相关研究者提供参考。
3D vision has numerous applications in various areas, such as autonomous vehicles, robotics, digital cities, virtual/mixed reality, human-machine interaction, entertainment, and sports. It covers a broad variety of research topics, ranging from 3D data acquisition, 3D modeling, shape analysis, and rendering to interaction. With the rapid development of 3D acquisition sensors (such as low-cost LiDARs, depth cameras, and 3D scanners), 3D data have become even more accessible and available. Moreover, advances in deep learning techniques have further boosted the development of 3D vision, with a large number of algorithms proposed recently. We provide a comprehensive review of the progress of 3D vision algorithms in recent years, mostly in the last year. This survey covers seven topics: stereo matching, monocular depth estimation, visual localization in large-scale scenes, simultaneous localization and mapping (SLAM), 3D geometric modeling, dynamic human modeling, and point cloud understanding. Although several surveys are already available in the area of 3D vision, this survey differs in a few aspects. First, it covers a wide range of topics in 3D vision and can therefore benefit a broad research community; in contrast, most existing works focus on a specific topic, such as depth estimation or point cloud learning. Second, this study mainly focuses on the progress of very recent years and can therefore provide readers with up-to-date information. Third, this paper presents a direct comparison between the progress in China and abroad.
The recent progress in depth image acquisition, including stereo matching and monocular depth estimation, is reviewed first. Stereo matching algorithms are divided into non-end-to-end, end-to-end, and unsupervised stereo matching algorithms. Monocular depth estimation algorithms are categorized into depth regression networks and depth completion networks; the depth regression networks are further divided into encoder-decoder networks and composite networks. Then, the recent progress in visual localization, including visual localization in large-scale scenes and SLAM, is reviewed. Visual localization algorithms for large-scale scenes are divided into end-to-end and non-end-to-end algorithms, and the non-end-to-end algorithms are further categorized into deep learning-based feature description algorithms, 2D image retrieval-based visual localization algorithms, 2D-3D matching-based visual localization algorithms, and visual localization algorithms based on the fusion of 2D image retrieval and 2D-3D matching. SLAM algorithms are divided into visual SLAM algorithms and multi-sensor fusion-based SLAM algorithms. The recent progress in 3D modeling and understanding, including 3D geometric modeling, dynamic human modeling, and point cloud understanding, is then reviewed. 3D geometric modeling covers deep 3D representation learning, deep 3D generative models, structured representation learning and generative models, and deep learning-based 3D reconstruction. Dynamic human modeling algorithms are divided into multi-view RGB methods, single- and multiple-depth-camera methods, and single-view RGB methods. Point cloud understanding algorithms are further categorized into semantic segmentation and instance segmentation methods for point clouds.
The paper is organized as follows. In Section 1, we present the progress of 3D vision outside China. In Section 2, we introduce the progress of 3D vision in China. In Section 3, the 3D vision techniques developed in China and abroad are compared and analyzed. In Section 4, we point out several future research directions in the area.
立体匹配; 单目深度估计; 视觉定位; 同步定位与地图构建(SLAM); 三维几何建模; 人体动态重建; 点云语义理解
stereo matching; monocular depth estimation; visual localization; simultaneous localization and mapping (SLAM); 3D geometry modeling; dynamic human reconstruction; point cloud understanding
Agamennoni G, Fontana S,Siegwart R Y and Sorrenti D G. 2016. Point clouds registration with probabilistic data association//Proceedings of 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon, Korea (South): IEEE: 4092-4098[DOI: 10.1109/IROS.2016.7759602http://dx.doi.org/10.1109/IROS.2016.7759602]
Aleotti F, Tosi F, Zhang L, Poggi M and Mattoccia S. 2020. Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation[EB/OL]. [2020-10-10].https://arxiv.org/pdf/2008.07130.pdfhttps://arxiv.org/pdf/2008.07130.pdf
Almalioglu Y, Saputra M R U, de Gusmão P P B, Markham A and Trigoni N. 2019. GANVO: unsupervised deep monocular visual odometry and depth estimation with generative adversarial networks//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 5474-5480[DOI: 10.1109/ICRA.2019.8793512http://dx.doi.org/10.1109/ICRA.2019.8793512]
Arandjelovic R, Gronat P, Torii A, Pajdla T and Sivic J. 2016. NetVLAD: CNN architecture for weakly supervised place recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5297-5307[DOI: 10.1109/CVPR.2016.572http://dx.doi.org/10.1109/CVPR.2016.572]
Averkiou M, Kim V G, Zheng Y Y and Mitra N J. 2014. ShapeSynth: parameterizing model collections for coupled shape exploration and synthesis. Computer Graphics Forum, 33(2): 125-134[DOI:10.1111/cgf.12310]
Badki A, Troccoli A, Kim K, Kautz J, Sen P and Gallo O. 2020. Bi3D: stereo depth estimation via binary classifications//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1597-1605[DOI: 10.1109/CVPR42600.2020.00167http://dx.doi.org/10.1109/CVPR42600.2020.00167]
Balntas V, Lenc K, Vedaldi A and Mikolajczyk K. 2017. HPatches: a benchmark and evaluation of handcrafted and learned local descriptors//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 3852-3861[DOI: 10.1109/CVPR.2017.410http://dx.doi.org/10.1109/CVPR.2017.410]
Bao W, Wang W, Xu Y H, Guo Y L, Hong S Y and Zhang X H. 2020. InStereo2K: a large real dataset for stereo matching in indoor scenes. Science China Information Sciences, 63(11): #212101[DOI:10.1007/s11432-019-2803-x]
Barron J T and Poole B. 2016. The fast bilateral solver//Proceedings of 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 617-632[DOI: 10.1007/978-3-319-46487-9_38http://dx.doi.org/10.1007/978-3-319-46487-9_38]
Bhowmik A, Gumhold S, Rother C and Brachmann E. 2020. Reinforced feature points: optimizing feature detection and description for a high-level task//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 4947-4956[DOI: 10.1109/CVPR42600.2020.00500http://dx.doi.org/10.1109/CVPR42600.2020.00500]
Bloesch M, Czarnowski J, Clark R, Leutenegger S and Davison A J. 2018. CodeSLAM-learning a compact, optimisable representation for dense visual SLAM//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2560-2568[DOI: 10.1109/CVPR.2018.00271http://dx.doi.org/10.1109/CVPR.2018.00271]
Brachmann E, Krull A, Nowozin S, Shotton J, Michel F, Gumhold S and Rother C. 2017. DSAC-differentiable RANSAC for camera localization//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2492-2500[DOI: 10.1109/CVPR.2017.267http://dx.doi.org/10.1109/CVPR.2017.267]
Brachmann E and Rother C. 2018. Learning less is more-6D camera localization via 3D surface regression//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4654-4662[DOI: 10.1109/CVPR.2018.00489http://dx.doi.org/10.1109/CVPR.2018.00489]
Brachmann E and Rother C. 2019. Expert sample consensus applied to camera re-localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 7524-7533[DOI: 10.1109/ICCV.2019.00762http://dx.doi.org/10.1109/ICCV.2019.00762]
Brahmbhatt S, Gu J W, Kim K, Hays J and Kautz J. 2018. Geometry-aware learning of maps for camera localization//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2616-2625[DOI: 10.1109/CVPR.2018.00277http://dx.doi.org/10.1109/CVPR.2018.00277]
Brandao P, Mazomenos E and Stoyanov D. 2019. Widening siamese architectures for stereo matching. Pattern Recognition Letters, 120: 75-81[DOI:10.1016/j.patrec.2018.12.002]
Cadena C, Carlone L, Carrillo H, Latif Y, Scaramuzza D, Neira J, Reid I and Leonard J J. 2016. Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Transactions on Robotics, 32(6): 1309-1332[DOI:10.1109/TRO.2016.2624754]
Campos C, Elvira R, Rodríguez J J G, Montiel J M M and Tardós J D. 2020. ORB-SLAM3: an accurate open-source library for visual, visual-inertial and multi-map SLAM[EB/OL]. [2020-07-23].https://arxiv.org/pdf/2007.11898.pdfhttps://arxiv.org/pdf/2007.11898.pdf
Caselitz T, Steder B, Ruhnke M and Burgard W. 2016. Monocular camera localization in 3D LiDAR maps//Proceedings of 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon, Korea (South): IEEE: 1926-1931[DOI: 10.1109/IROS.2016.7759304http://dx.doi.org/10.1109/IROS.2016.7759304]
Chakrabarti A, Shao J and Shakhnarovich G. 2016. Depth from a single image by harmonizing overcomplete local network predictions//Advances in Neural Information Processing Systems. Barcelona, Spain: Curran Associates, Inc. : 2658-2666
Chan S H, Wu P T and Fu L C. 2018. Robust 2D indoor localization through laser SLAM and visual SLAM fusion//Proceedings of 2018 IEEE International Conference on Systems, Man, and Cybernetics. Miyazaki, Japan: IEEE, 2018: 1263-1268[DOI:10.1109/SMC.2018.00221]
Chang J R and Chen Y S. 2018. Pyramid stereo matching network//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 5410-5418[DOI: 10.1109/CVPR.2018.00567http://dx.doi.org/10.1109/CVPR.2018.00567]
Chen C H, Rosa S, Miao Y S, Lu C X, Wu W, Markham A and Trigoni N. 2019. Selective sensor fusion for neural visual-inertial odometry//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 10534-10543[DOI: 10.1109/CVPR.2019.01079http://dx.doi.org/10.1109/CVPR.2019.01079]
Chen H X, Li K H, Fu Z H, Liu M Y, Chen Z H and Guo Y L. 2021. Distortion-aware monocular depth estimation for omnidirectional images. IEEE Signal Processing Letters, 28: 334-338[DOI:10.1109/LSP.2021.3050712]
Chen M X, Yang S W, Yi X D and Wu D. 2017. Real-time 3D mapping using a 2D laser scanner and IMU-aided visual SLAM//Proceedings of 2017 IEEE International Conference on Real-time Computing and Robotics. Okinawa, Japan: IEEE: 297-302[DOI: 10.1109/RCAR.2017.8311877http://dx.doi.org/10.1109/RCAR.2017.8311877]
Chen Z, Badrinarayanan V, Drozdov G and Rabinovich A. 2018. Estimating depth from rgb and sparse sensing//Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 167-182[DOI: 10.1007/978-3-030-01225-0_11http://dx.doi.org/10.1007/978-3-030-01225-0_11]
Chen Z Y, Sun X, Wang L, Yu Y and Huang C. 2015. A deep visual correspondence embedding model for stereo matching costs//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 972-980[DOI: 10.1109/ICCV.2015.117http://dx.doi.org/10.1109/ICCV.2015.117]
Cheng X J, Wang P, Guan C Y and Yang R G. 2020a. CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion. Proceedings of the AAAI Conference on Artificial Intelligence, 34(7): 10615-10622[DOI:10.1609/aaai.v34i07.6635]
Cheng X J, Wang P and Yang R G. 2018. Depth estimation via affinity learned with convolutional spatial propagation network//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 108-125[DOI: 10.1007/978-3-030-01270-0_7http://dx.doi.org/10.1007/978-3-030-01270-0_7]
Cheng X J, Wang P and Yang R G. 2020b. Learning depth with convolutional spatial propagation network. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): 2361-2379[DOI:10.1109/TPAMI.2019.2947374]
Chodosh N, Wang C Y and Lucey S. 2019. Deep convolutional compressed sensing for LiDAR depth completion//Proceedings of Asian Conference on Computer Vision. Cham: Springer: 499-513[DOI: 10.1007/978-3-030-20887-5_31http://dx.doi.org/10.1007/978-3-030-20887-5_31]
Choy C, Gwak J and Savarese S. 2019. 4D spatio-temporal convnets: minkowski convolutional neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3070-3079[DOI: 10.1109/CVPR.2019.00319http://dx.doi.org/10.1109/CVPR.2019.00319]
Clark R, Wang S, Wen H K, Markham A and Trigoni N. 2017. VINet: visual-inertial odometry as a sequence-to-sequence learning problem[EB/OL]. [2021-01-21].https://arxiv.org/pdf/1701.08376.pdfhttps://arxiv.org/pdf/1701.08376.pdf
Dai A and Nieβner M. 2018. 3DMV: joint 3D-multi-view prediction for 3D semantic scene segmentation//Proceedings of European Conference on Computer Vision. Munich, Germany: Springer: 458-474[DOI: 10.1007/978-3-030-01249-6_28http://dx.doi.org/10.1007/978-3-030-01249-6_28]
Davison A J. 2003. Real-time simultaneous localisation and mapping with a single camera//Proceedings of the 9th IEEE International Conference on Computer Vision. Nice, France: IEEE, 1403-1410[DOI: 10.1109/ICCV.2003.1238654http://dx.doi.org/10.1109/ICCV.2003.1238654]
DeTone D, Malisiewicz T and Rabinovich A. 2018. SuperPoint: self-supervised interest point detection and description//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City, USA: IEEE: 337-349[DOI: 10.1109/CVPRW.2018.00060http://dx.doi.org/10.1109/CVPRW.2018.00060]
Doria D and Radke R J. 2012. Filling large holes in LiDAR data by inpainting depth gradients//Proceedings of 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Providence, USA: IEEE: 65-72[DOI: 10.1109/CVPRW.2012.6238916http://dx.doi.org/10.1109/CVPRW.2012.6238916]
Dosovitskiy A, Fischer P, Ilg E, Häusser P, Hazirbas C, Golkov V, van der Smagt P, Cremers D and Brox T. 2015. FlowNet: learning optical flow with convolutional networks//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2758-2766[DOI: 10.1109/ICCV.2015.316http://dx.doi.org/10.1109/ICCV.2015.316]
Dou M S, Khamis S, Degtyarev Y, Davidson P, Fanello S R, Kowdle A, Escolano S O, Rhemann C, Kim D, Taylor J, Kohli P, Tankovich V and Izadi S. 2016. Fusion4D: real-time performance capture of challenging scenes. ACM Transactions on Graphics, 35(4): #114[DOI:10.1145/2897824.2925969]
Du H, Wang W, Xu C W, Xiao R and Sun C Y. 2020. Real-time onboard 3D state estimation of an unmanned aerial vehicle in multi-environments using multi-sensor data fusion. Sensors, 20(3): #919[DOI:10.3390/s20030919]
Dusmanu M, Rocco I, Pajdla T, Pollefeys M, Sivic J, Torii A and Sattler T. 2019. D2-Net: a trainable CNN for joint description and detection of local features//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8084-8093[DOI: 10.1109/CVPR.2019.00828http://dx.doi.org/10.1109/CVPR.2019.00828]
Eigen D and Fergus R. 2015. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2650-2658[DOI: 10.1109/ICCV.2015.304http://dx.doi.org/10.1109/ICCV.2015.304]
Eigen D, Puhrsch C and Fergus R. 2014. Depth map prediction from a single image using a multi-scale deep network//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press: 2366-2374
Engel J, Schöps T and Cremers D. 2014. LSD-SLAM: large-scale direct monocular SLAM//Proceedings of European Conference on Computer Vision. Zurich, Switzerland: Springer: 834-849[DOI: 10.1007/978-3-319-10605-2_54http://dx.doi.org/10.1007/978-3-319-10605-2_54]
Engel J, Stückler J and Cremers D. 2015. Large-scale direct SLAM with stereo cameras//Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany: IEEE: 1935-1942[DOI: 10.1109/IROS.2015.7353631http://dx.doi.org/10.1109/IROS.2015.7353631]
Engel J, Koltun V and Cremers D. 2017. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(3): 611-625
Engelmann F, Bokeloh M, Fathi A,Leibe B and Nieβner M. 2020. 3D-MPA: multi-proposal aggregation for 3D semantic instance segmentation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 9028-9037[DOI: 10.1109/CVPR42600.2020.00905http://dx.doi.org/10.1109/CVPR42600.2020.00905]
Engelmann F, Kontogianni T, Hermans A and Leibe B. 2017. Exploring spatial context for 3D semantic segmentation of point clouds//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 716-724[DOI: 10.1109/ICCVW.2017.90http://dx.doi.org/10.1109/ICCVW.2017.90]
Fan H Q, Su H and Guibas L. 2017. A point set generation network for 3D object reconstruction from a single image//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2463-2471[DOI: 10.1109/CVPR.2017.264http://dx.doi.org/10.1109/CVPR.2017.264]
Feng Y J, Fan L Xand Wu Y H. 2016. Fast localization in large-scale environments using supervised indexing of binary features. IEEE Transactions on Image Processing, 25(1): 343-358[DOI:10.1109/TIP.2015.2500030]
Ferstl D, Reinbacher C, Ranftl R, Rather M and Bischof H. 2013. Image guided depth upsampling using anisotropic total generalized variation//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, Australia: IEEE: 993-1000[DOI: 10.1109/ICCV.2013.127http://dx.doi.org/10.1109/ICCV.2013.127]
Flynn J, Neulander I, Philbin J and Snavely N. 2016. Deep stereo: learning to predict new views from the world's imagery//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5515-5524[DOI: 10.1109/CVPR.2016.595http://dx.doi.org/10.1109/CVPR.2016.595]
Forster C, Pizzoli M, Scaramuzza D. SVO: Fast semi-direct monocular visual odometry[C]//2014 IEEE international conference on robotics and automation (ICRA). IEEE, 2014: 15-22
Fu H, Gong M M, Wang C H, Batmanghelich K and Tao D C. 2018. Deep ordinal regression network for monocular depth estimation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2002-2011[DOI: 10.1109/CVPR.2018.00214http://dx.doi.org/10.1109/CVPR.2018.00214]
Gálvez-López D and Tardos J D. 2012. Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 28(5): 1188-1197[DOI:10.1109/TRO.2012.2197158]
Gan Y K, Xu X Y, Sun W X and Lin L. 2018. Monocular depth estimation with affinity, vertical pooling, and label enhancement//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 232-247[DOI: 10.1007/978-3-030-01219-9_14http://dx.doi.org/10.1007/978-3-030-01219-9_14]
Gao X, Wang R, Demmel N and Cremers D. 2018. LDSO: direct sparse odometry with loop closure//Proceeding of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid, Spain: IEEE: 2198-2204[DOI: 10.1109/IROS.2018.8593376http://dx.doi.org/10.1109/IROS.2018.8593376]
Garg R, Bg V K, Carneiro G and Reid I. 2016. Unsupervised cnn for single view depth estimation: geometry to the rescue//Proceedings of 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 740-756[DOI: 10.1007/978-3-319-46484-8_45http://dx.doi.org/10.1007/978-3-319-46484-8_45]
Gawel A, Cieslewski T, DubéR, Bosse M, Siegwart R and Nieto J. 2016. Structure-based vision-laser matching//Proceedings of 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon, Korea (South): IEEE: 182-188[DOI: 10.1109/IROS.2016.7759053http://dx.doi.org/10.1109/IROS.2016.7759053]
Ge Y X, Wang H B, Zhu F, Zhao R and Li H S. 2020. Self-supervising fine-grained region similarities for large-scale image localization[EB/OL]. [2021-01-21].https://arxiv.org/pdf/2006.03926.pdfhttps://arxiv.org/pdf/2006.03926.pdf
Genova K, Cole F, Sud A, Sarna A and Funkhouser T. 2019. Deep structured implicit functions[EB/OL]. [2021-01-21].https://arxiv.org/pdf/1912.06126.pdfhttps://arxiv.org/pdf/1912.06126.pdf
Gidaris S and Komodakis N. 2017. Detect, replace, refine: deep structured prediction for pixel wise labeling//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 7187-7196[DOI: 10.1109/CVPR.2017.760http://dx.doi.org/10.1109/CVPR.2017.760]
Girdhar R, Fouhey D F, Rodriguez M and Gupta A. 2016. Learning a predictable and generative vector representation for objects//Proceedings of European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 484-499[DOI: 10.1007/978-3-319-46466-4_29http://dx.doi.org/10.1007/978-3-319-46466-4_29]
Godard C, Mac Aodha O and Brostow G J. 2017. Unsupervised monocular depth estimation with left-right consistency//Proceedings of 2017 IEEE Conference on Computer Vision andPattern Recognition. Honolulu, USA: IEEE: 6602-6611[DOI: 10.1109/CVPR.2017.699http://dx.doi.org/10.1109/CVPR.2017.699]
Graham B, Engelcke M and van der Maaten L. 2018. 3D semantic segmentation with submanifold sparse convolutional networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 9224-9232[DOI: 10.1109/CVPR.2018.00961http://dx.doi.org/10.1109/CVPR.2018.00961]
Groueix T, Fisher M, Kim V G, Russell B C and Aubry M. 2018. A papier-mache approach to learning 3D surface generation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 216-224[DOI: 10.1109/CVPR.2018.00030http://dx.doi.org/10.1109/CVPR.2018.00030]
Gu X D, Fan Z W, Zhu S Y, Dai Z Z, Tan F T and Tan P. 2020. Cascade cost volume for high-resolution multi-view stereo and stereo matching//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2492-2501[DOI: 10.1109/CVPR42600.2020.00257http://dx.doi.org/10.1109/CVPR42600.2020.00257]
Güney F and Geiger A. 2015. Displets: resolving stereo ambiguities using object knowledge//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 4165-4175[DOI: 10.1109/CVPR.2015.7299044http://dx.doi.org/10.1109/CVPR.2015.7299044]
Guo K W, Xu F, Wang Y G, Liu Y B and Dai Q H. 2015. Robust non-rigid motion tracking and surface reconstruction using L0regularization//Proceeding of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 3083-3091[DOI: 10.1109/ICCV.2015.353http://dx.doi.org/10.1109/ICCV.2015.353]
Guo K W, Xu F, Yu T, Liu X Y, Dai Q H and Liu Y B. 2017. Real-time geometry, albedo, and motion reconstruction using a single RGB-D camera. ACM Transactions on Graphics, 36(3): #32[DOI:10.1145/3083722]
Guo X Y, Yang K, Yang W K, Wang X G and Li H S. 2019. Group-wise correlation stereo network//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3268-3277[DOI: 10.1109/CVPR.2019.00339http://dx.doi.org/10.1109/CVPR.2019.00339]
Guo Y L, Wang H Y, Hu Q Y, Liu H, Liu L and Bennamoun M. 2020. Deep learning for 3D point clouds: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence: #3005434[DOI: 10.1109/TPAMI.2020.3005434http://dx.doi.org/10.1109/TPAMI.2020.3005434]
Hambarde P and Murala S. 2020. S2DNet: depth estimation from single image and sparse samples. IEEE Transactions on Computational Imaging, 6: 806-817[DOI:10.1109/TCI.2020.2981761]
Han L, Zheng T, Xu L and Fang L. 2020. OccuSeg: occupancy-aware 3D instance segmentation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2937-2946[DOI: 10.1109/CVPR42600.2020.00301http://dx.doi.org/10.1109/CVPR42600.2020.00301]
Han X G, Gao C and Yu Y Z. 2017. DeepSketch2Face: a deep learning based sketching system for 3D face and caricature modeling. ACM Transactions on Graphics, 36(4): #126[DOI:10.1145/3072959.3073629]
Han X F, Leung T, Jia Y Q, Sukthankar R and Berg A C. 2015. MatchNet: unifying feature and metric learning for patch-based matching//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3279-3286[DOI: 10.1109/CVPR.2015.7298948http://dx.doi.org/10.1109/CVPR.2015.7298948]
He K, Lu Y and Sclaroff S. 2018. Local descriptors optimized for average precision//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 596-605[DOI: 10.1109/CVPR.2018.00069http://dx.doi.org/10.1109/CVPR.2018.00069]
Helmer S and Lowe D. 2010. Using stereo for object recognition//Proceedings of 2010 IEEE International Conference on Robotics and Automation. Anchorage, USA: IEEE: 3121-3127[DOI: 10.1109/ROBOT.2010.5509826http://dx.doi.org/10.1109/ROBOT.2010.5509826]
Hou J, Dai A and Nieβner M. 2019. 3D-SIS: 3D semantic instance segmentation of RGB-D scans//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4416-4425[DOI: 10.1109/CVPR.2019.00455http://dx.doi.org/10.1109/CVPR.2019.00455]
Houseago C, Bloesch M and Leutenegger S. 2019. KO-fusion: dense visual SLAM with tightly-coupled kinematic and odometric tracking//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 4054-4060[DOI: 10.1109/ICRA.2019.8793471http://dx.doi.org/10.1109/ICRA.2019.8793471]
Hu Q Y, Yang B, Xie L H, Rosa S, Guo Y L, Wang Z H, Trigoni N and Markham A. 2020. RandLA-Net: efficient semantic segmentation of large-scale point clouds//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Location: Seattle, USA: IEEE: 11105-11114[DOI: 10.1109/CVPR42600.2020.01112http://dx.doi.org/10.1109/CVPR42600.2020.01112]
Huang Z X, Fan J M, Cheng S G, Yi S, Wang X G and Li H S. 2019a. HMS-Net: hierarchical multi-scale sparsity-invariant network for sparse depth completion. IEEE Transactions on Image Processing, 29: 3429-3441[DOI:10.1109/TIP.2019.2960589]
Huang Z Y, Xu Y, Shi J P, Zhou X W, Bao H J and Zhang G F. 2019b. Prior guided dropout for robust visual localization in dynamic environments//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2791-2800[DOI: 10.1109/ICCV.2019.00288http://dx.doi.org/10.1109/ICCV.2019.00288]
Innmann M, Zollhöfer M, Nieβner M, Theobalt C and Stamminger M. 2020. VolumeDeform: real-time volumetric non-rigid reconstruction//Proceedings of European Conference on Computer Vision. Glasgow, United Kingdom: Springer: 362-379[DOI: 10.1007/978-3-319-46484-8_22http://dx.doi.org/10.1007/978-3-319-46484-8_22]
Jain A, Thormählen T, Ritschel T and Seidel H P. 2012. Exploring shape variations by 3D-model decomposition and part-based recombination. Computer Graphics Forum, 31: 631-640[DOI:10.1111/j.1467-8659.2012.03042.x]
Jaritz M, De Charette R, Wirbel E, Perrotton X and Nashashibi F. 2018. Sparse and dense data with CNNs: depth completion and semantic segmentation//Proceedings of 2018 International Conference on 3D Vision. Verona, Italy: IEEE: 52-60[DOI: 10.1109/3DV.2018.00017http://dx.doi.org/10.1109/3DV.2018.00017]
Jaritz M, Gu J Y and Su H. 2019. Multi-view PointNet for 3D scene understanding//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul, Korea (South): IEEE: 3995-4003[DOI: 10.1109/ICCVW.2019.00494http://dx.doi.org/10.1109/ICCVW.2019.00494]
Jiang L, Zhao H S, Liu S, Shen X Y, Fu C W and Jia J Y. 2019. Hierarchical point-edge interaction network for point cloud semantic segmentation//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 10432-10440[DOI: 10.1109/ICCV.2019.01053http://dx.doi.org/10.1109/ICCV.2019.01053]
Jiang L, Zhao H S, Shi S S, Liu S, Fu C W and Jia J Y. 2020. PointGroup: dual-set point grouping for 3D instance segmentation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE 4866-4875[DOI: 10.1109/CVPR42600.2020.00492http://dx.doi.org/10.1109/CVPR42600.2020.00492]
Jiang M Y, Wu Y R, Zhao T Q, Zhao Z L and Lu C W. 2018. PointSIFT: a SIFT-like network module for 3D point cloud semantic segmentation[EB/OL]. [2021-01-21].https://arxiv.org/pdf/1807.00652.pdfhttps://arxiv.org/pdf/1807.00652.pdf
Jie Z Q, Wang P F, Ling Y G, Zhao B, Wei Y C, Feng J S and Liu W. 2018. Left-right comparative recurrent model for stereo matching//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3838-3846[DOI: 10.1109/CVPR.2018.00404http://dx.doi.org/10.1109/CVPR.2018.00404]
Kanazawa A, Zhang Y J, Felsen P and Malik J. 2019. Learning 3D human dynamics from video//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5607-5616[DOI: 10.1109/CVPR.2019.00576http://dx.doi.org/10.1109/CVPR.2019.00576]
Kanhere O and Rappaport T S. 2019. Position locationing for millimeter wave systems//Proceedings of 2018 IEEE Global Communications Conference. Abu Dhabi, United Arab Emirates: IEEE: 206-212[DOI: 10.1109/GLOCOM.2018.8647983http://dx.doi.org/10.1109/GLOCOM.2018.8647983]
Kar A, Häne C and Malik J. 2017. Learning a multi-view stereo machine//Advances in Neural Information Processing Systems. Long Beach, USA: [s. n.]: 364-375
Kendall A, Grimes M and Cipolla R. 2015. PoseNet: a convolutional network for real-time 6-DOF camera relocalization//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2938-2946[DOI: 10.1109/ICCV.2015.336]
Kendall A, Martirosyan H, Dasgupta S, Henry P, Kennedy R, Bachrach A and Bry A. 2017. End-to-end learning of geometry and context for deep stereo regression//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 66-75[DOI: 10.1109/ICCV.2017.17]
Khattak S, Papachristos C and Alexis K. 2019. Keyframe-based direct thermal-inertial odometry//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 3563-3569[DOI: 10.1109/ICRA.2019.8793927]
Kiechle M, Hawe S and Kleinsteuber M. 2013. A joint intensity and depth co-sparse analysis model for depth map super-resolution//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, Australia: IEEE: 1545-1552[DOI: 10.1109/ICCV.2013.195]
Kim K R and Kim C S. 2016. Adaptive smoothness constraints for efficient stereo matching using texture and edge information//Proceedings of 2016 IEEE International Conference on Image Processing. Phoenix, USA: IEEE: 3429-3433[DOI: 10.1109/ICIP.2016.7532996]
Kim Y, Jeong J and Kim A. 2018. Stereo camera localization in 3D LiDAR maps//Proceedings of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid, Spain: IEEE: 1-9[DOI: 10.1109/IROS.2018.8594362]
Klein G and Murray D. 2007. Parallel tracking and mapping for small AR workspaces//Proceedings of the 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan: IEEE: 225-234[DOI: 10.1109/ISMAR.2007.4538852]
Knöbelreiter P, Reinbacher C, Shekhovtsov A and Pock T. 2017. End-to-end training of hybrid CNN-CRF models for stereo//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1456-1465[DOI: 10.1109/CVPR.2017.159]
Kusupati U, Cheng S, Chen R and Su H. 2020. Normal assisted stereo depth estimation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2186-2196[DOI: 10.1109/CVPR42600.2020.00226]
Lahoud J, Ghanem B, Oswald M R and Pollefeys M. 2019. 3D instance segmentation via multi-task metric learning//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 9255-9265[DOI: 10.1109/ICCV.2019.00935]
Laina I, Rupprecht C, Belagiannis V, Tombari F and Navab N. 2016. Deeper depth prediction with fully convolutional residual networks//Proceedings of the 4th International Conference on 3D Vision. Stanford, USA: IEEE: 239-248[DOI: 10.1109/3DV.2016.32]
Landrieu L and Boussaha M. 2019. Point cloud oversegmentation with graph-structured deep metric learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7432-7441[DOI: 10.1109/CVPR.2019.00762]
Landrieu L and Simonovsky M. 2018. Large-scale point cloud semantic segmentation with superpoint graphs//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4558-4567[DOI: 10.1109/CVPR.2018.00479]
Lawin F J, Danelljan M, Tosteberg P, Bhat G, Khan F S and Felsberg M. 2017. Deep projective 3D semantic segmentation//Proceedings of International Conference on Computer Analysis of Images and Patterns. Ystad, Sweden: Springer: 95-107[DOI: 10.1007/978-3-319-64689-3_8]
Lee S H and Civera J. 2019. Loosely-coupled semi-direct monocular SLAM. IEEE Robotics and Automation Letters, 4(2): 399-406[DOI:10.1109/LRA.2018.2889156]
Lee W, Eckenhoff K, Geneva P and Huang G Q. 2020. Intermittent GPS-aided VIO: online initialization and calibration//Proceedings of 2020 IEEE International Conference on Robotics and Automation. Paris, France: IEEE: 5724-5731[DOI: 10.1109/ICRA40945.2020.9197029]
Leutenegger S, Lynen S, Bosse M. 2015. Keyframe-based visual-inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 34(3): 314-334
Li B, Shen C H, Dai Y C, Van Den Hengel A and He M Y. 2015. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1119-1127[DOI: 10.1109/CVPR.2015.7298715]
Li B Y, Zou D P, Sartori D, Pei L and Yu W X. 2020a. TextSLAM: visual SLAM with planar text features//Proceedings of 2020 IEEE International Conference on Robotics and Automation. Paris, France: IEEE: 2102-2108[DOI: 10.1109/ICRA40945.2020.9197233]
Li C J, Pan H, Liu Y, Tong X, Sheffer A and Wang W P. 2018a. Robust flow-guided neural prediction for sketch-based freeform surface modeling. ACM Transactions on Graphics, 37(6): #238[DOI:10.1145/3272127.3275051]
Li J Y, Bao H J and Zhang G. 2019a. Rapid and robust monocular visual-inertial initialization with gravity estimation via vertical edges//Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE: 6230-6236[DOI: 10.1109/IROS40897.2019.8968456]
Li J, Niu C J and Xu K. 2020c. Learning part generation and assembly for structure-aware shape synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 34(7): 11362-11369[DOI:10.1609/aaai.v34i07.6798]
Li J Q, Pei L, Zou D P, Xia S P C, Wu Q, Li T, Sun Z and Yu W X. 2020b. Attention-SLAM: a visual monocular SLAM learning from human gaze[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2009.06886.pdf
Li J, Xu K, Chaudhuri S, Yumer E, Zhang H and Guibas L. 2017. GRASS: generative recursive autoencoders for shape structures. ACM Transactions on Graphics, 36(4): #52[DOI:10.1145/3072959.3073637]
Li J Y, Yang B B, Huang K, Zhang G F and Bao H J. 2019b. Robust and efficient visual-inertial odometry with multi-plane priors//Proceedings of Chinese Conference on Pattern Recognition and Computer Vision. Xi'an, China: Springer: 283-295[DOI: 10.1007/978-3-030-31726-3_24]
Li M Y, Patil A G, Xu K, Chaudhuri S, Khan O, Shamir A, Tu C H, Chen B Q, Cohen-Or D and Zhang H. 2019c. GRAINS: generative recursive autoencoders for indoor scenes. ACM Transactions on Graphics, 38(2): #12[DOI:10.1145/3303766]
Li R H, Wang S, Long Z Q and Gu D B. 2018b. UnDeepVO: monocular visual odometry through unsupervised deep learning//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 7286-7291[DOI: 10.1109/ICRA.2018.8461251]
Li S K, Xue F, Wang X, Yan Z K and Zha H B. 2019d. Sequential adversarial learning for self-supervised deep visual odometry//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2851-2860[DOI: 10.1109/ICCV.2019.00294]
Li Y Y, Bu R, Sun M C, Wu W, Di X H and Chen B Q. 2018c. PointCNN: convolution on X-transformed points//Advances in Neural Information Processing Systems. Montréal, Canada: [s. n.]: 820-830
Li Y, Ushiku Y and Harada T. 2019e. Pose graph optimization for unsupervised monocular visual odometry//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 5439-5445[DOI: 10.1109/ICRA.2019.8793706]
Liang M, Guo X, Li H, Wang X and Song Y. 2019. Unsupervised cross-spectral stereo matching by learning to synthesize//Proceedings of the AAAI Conference on Artificial Intelligence, 33: 8706-8713[DOI: 10.1609/aaai.v33i01.33018706]
Liang Z F, Feng Y L, Guo Y L, Liu H Z, Chen W, Qiao L B, Zhou L and Zhang J F. 2018. Learning for disparity estimation through feature constancy//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2811-2820[DOI: 10.1109/CVPR.2018.00297]
Liang Z F, Guo Y L, Feng Y L, Chen W, Qiao L B, Zhou L, Zhang J F and Liu H Z. 2021. Stereo matching using multi-level cost volume and multi-scale feature constancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1): 300-315[DOI:10.1109/TPAMI.2019.2928550]
Liao M, Lu F X, Zhou D F, Zhang S B, Li W and Yang R G. 2020. DVI: depth guided video inpainting for autonomous driving[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2007.08854.pdf
Liao Y Y, Huang L C, Wang Y, Kodagoda S, Yu Y and Liu Y. 2017. Parse geometry from a line: monocular depth estimation with partial laser observation//Proceedings of 2017 IEEE International Conference on Robotics and Automation. Singapore: IEEE: 5059-5066[DOI: 10.1109/ICRA.2017.7989590]
Liao Z W. 2016. Research on Autonomous Mapping and Navigation Technology in Indoor Environment Based on LiDAR and MEMS Inertial Components. Nanjing: Nanjing University of Aeronautics and Astronautics
Liebel L and Körner M. 2019. MultiDepth: single-image depth estimation via multi-task regression and classification//Proceedings of 2019 IEEE Intelligent Transportation Systems Conference. Auckland, New Zealand: IEEE: 1440-1447[DOI: 10.1109/ITSC.2019.8917177]
Liu F Y, Li S P, Zhang L Q, Zhou C H, Ye R T, Wang Y B and Lu J W. 2017a. 3DCNN-DQN-RNN: a deep reinforcement learning framework for semantic parsing of large-scale 3D point clouds//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 5679-5688[DOI: 10.1109/ICCV.2017.605]
Liu H M, Chen M Y, Zhang G F and Bao H J. 2018. ICE-BA: incremental, consistent and efficient bundle adjustment for visual-inertial SLAM//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1974-1982
Liu F Y, Shen C H, Lin G S and Reid I. 2016. Learning depth from single monocular images using deep convolutional neural fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(10): 2024-2039[DOI:10.1109/TPAMI.2015.2505283]
Liu H, Guo Y L, Ma Y N, Lei Y J and Wen G J. 2020a. Semantic context encoding for accurate 3D point cloud segmentation. IEEE Transactions on Multimedia[DOI: 10.1109/TMM.2020.3007331]
Liu L, Li H D and Dai Y C. 2017b. Efficient global 2D-3D matching for camera localization in a large-scale 3D map//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2391-2400[DOI: 10.1109/ICCV.2017.260]
Liu L, Li H D and Dai Y C. 2019a. Stochastic attraction-repulsion embedding for large scale image localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2570-2579[DOI: 10.1109/ICCV.2019.00266]
Liu P P, King I, Lyu M R and Xu J. 2020b. Flow2Stereo: effective self-supervised learning of optical flow and stereo matching//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 6648-6657[DOI: 10.1109/CVPR42600.2020.00668]
Liu Y C, Fan B, Xiang S M and Pan C H. 2019b. Relation-shape convolutional neural network for point cloud analysis//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8887-8896[DOI: 10.1109/CVPR.2019.00910]
Liu Y, Shen Z H, Lin Z X, Peng S D, Bao H J and Zhou X W. 2019c. GIFT: learning transformation-invariant dense visual descriptors via group CNNs//Advances in Neural Information Processing Systems. Vancouver, Canada: [s. n.]: 6992-7003
Liu Y B, Ye G Z, Wang Y G, Dai Q H and Theobalt C. 2014. Human performance capture using multiple handheld kinects//Computer Vision and Machine Learning with RGB-D Sensors. Switzerland: Springer: 91-108[DOI: 10.1007/978-3-319-08651-4_5]
Liu Z J, Tang H T, Lin Y J and Han S. 2019d. Point-voxel CNN for efficient 3D deep learning//Advances in Neural Information Processing Systems. Vancouver, Canada: [s. n.]
Lu C H, Uchiyama H, Thomas D, Shimada A and Taniguchi R I. 2018. Sparse cost volume for efficient stereo matching. Remote Sensing, 10(11): #1844[DOI:10.3390/rs10111844]
Lu G Y, Yan Y, Ren L, Song J K, Sebe N and Kambhamettu C. 2015. Localize me anywhere, anytime: a multi-task point-retrieval approach//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2434-2442[DOI: 10.1109/ICCV.2015.280]
Lu K Y, Barnes N, Anwar S and Zheng L. 2020. From depth what can you see? Depth completion via auxiliary image reconstruction//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11303-11312[DOI: 10.1109/CVPR42600.2020.01132]
Luo W J, Schwing A G and Urtasun R. 2016. Efficient deep learning for stereo matching//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, Nevada, USA: IEEE: 5695-5703[DOI: 10.1109/CVPR.2016.614]
Luo Y, Ren J, Lin M D, Pang J H, Sun W X, Li H S and Lin L. 2018a. Single view stereo matching//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 155-163[DOI: 10.1109/CVPR.2018.00024]
Luo Z X, Shen T W, Zhou L, Zhu S Y, Zhang R, Yao Y, Fang T and Quan L. 2018b. GeoDesc: learning local descriptors by integrating geometry constraints//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 170-185[DOI: 10.1007/978-3-030-01240-3_11]
Luo Z X, Zhou L, Bai X Y, Chen H K, Zhang J H, Yao Y, Li S W, Fang T and Quan L. 2020. ASLFeat: learning local features of accurate shape and localization//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 6588-6597[DOI: 10.1109/CVPR42600.2020.00662]
Lynen S, Achtelik M W, Weiss S, Chli M and Siegwart R. 2013. A robust and modular multi-sensor fusion approach applied to MAV navigation//Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE: 3923-3929[DOI: 10.1109/IROS.2013.6696917]
Lynen S, Sattler T, Bosse M, Hesch J, Pollefeys M and Siegwart R. 2015. Get out of my lab: large-scale, real-time visual-inertial localization//Proceedings of Robotics: Science and Systems. Rome, Italy: [s. n.][DOI: 10.15607/RSS.2015.XI.037]
Ma F C and Karaman S. 2018. Sparse-to-dense: depth prediction from sparse depth samples and a single image//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 4796-4803[DOI: 10.1109/ICRA.2018.8460184]
Ma F C, Cavalheiro G V and Karaman S. 2019. Self-supervised sparse-to-dense: self-supervised depth completion from LiDAR and monocular camera//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 3288-3295[DOI: 10.1109/ICRA.2019.8793637]
Ma Y N, Guo Y L, Liu H, Lei Y J and Wen G J. 2020. Global context reasoning for semantic segmentation of 3D point clouds//Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Snowmass, USA: IEEE: 2920-2929[DOI: 10.1109/WACV45572.2020.9093411]
Mac Aodha O, Campbell N D F, Nair A and Brostow G J. 2012. Patch based synthesis for single depth image super-resolution//Proceedings of European Conference on Computer Vision. Florence, Italy: Springer: 71-84[DOI: 10.1007/978-3-642-33712-3_6]
Maddern W, Stewart A D and Newman P. 2014. LAPS-II: 6-DOF day and night visual localisation with prior 3D structure for autonomous road vehicles//Proceedings of 2014 IEEE Intelligent Vehicles Symposium Proceedings. Dearborn, USA: IEEE: 330-337[DOI: 10.1109/IVS.2014.6856471]
Mao J G, Wang X G and Li H S. 2019. Interpolated convolutional networks for 3D point cloud understanding//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1578-1587[DOI: 10.1109/ICCV.2019.00166]
Mascaro R, Teixeira L, Hinzmann T, Siegwart R and Chli M. 2018. GOMSF: graph-optimization based multi-sensor fusion for robust UAV pose estimation//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 1421-1428[DOI: 10.1109/ICRA.2018.8460193]
Matsuo K and Aoki Y. 2015. Depth image enhancement using local tangent plane approximations//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3574-3583[DOI: 10.1109/CVPR.2015.7298980]
Matusik W, Buehler C, Raskar R, Gortler S J and McMillan L. 2000. Image-based visual hulls//Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. New York, United States: ACM: 369-374[DOI: 10.1145/344779.344951]
Mayer N, Ilg E, Häusser P, Fischer P, Cremers D, Dosovitskiy A and Brox T. 2016. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 4040-4048[DOI: 10.1109/CVPR.2016.438]
Menze M and Geiger A. 2015. Object scene flow for autonomous vehicles//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3061-3070[DOI: 10.1109/CVPR.2015.7298925]
Mishchuk A, Mishkin D, Radenović F and Matas J. 2017. Working hard to know your neighbor's margins: local descriptor learning loss//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, United States: Curran Associates Inc. : 4829-4840
Mitra N J, Wand M, Zhang H, Cohen-Or D, Kim V and Huang Q X. 2014. Structure-aware shape processing//ACM SIGGRAPH 2014 Courses. Vancouver, Canada: ACM: 1-21[DOI: 10.1145/2614028.2615401]
Mo K C, Guerrero P, Li Y, Su H, Wonka P, Mitra N and Guibas L J. 2019a. StructEdit: learning structural shape variations[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1911.11098.pdf
Mo K C, Guerrero P, Li Y, Su H, Wonka P, Mitra N and Guibas L J. 2019b. StructureNet: hierarchical graph networks for 3D shape generation[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1908.00575.pdf
Mourikis A I and Roumeliotis S I. 2007. A multi-state constraint Kalman filter for vision-aided inertial navigation//Proceedings of 2007 IEEE International Conference on Robotics and Automation. Rome, Italy: IEEE: 3565-3572[DOI: 10.1109/ROBOT.2007.364024]
Mur-Artal R, Montiel J M M and Tardós J D. 2015. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5): 1147-1163[DOI:10.1109/TRO.2015.2463671]
Mur-Artal R and Tardós J D. 2017. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 33(5): 1255-1262[DOI:10.1109/TRO.2017.2705103]
Neubert P, Schubert S and Protzel P. 2017. Sampling-based methods for visual navigation in 3D maps by synthesizing depth images//Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems. Vancouver BC, Canada: IEEE: 2492-2498[DOI: 10.1109/IROS.2017.8206067]
Newcombe R A, Fox D and Seitz S M. 2015. DynamicFusion: reconstruction and tracking of non-rigid scenes in real-time//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 343-352[DOI: 10.1109/CVPR.2015.7298631]
Ng T, Balntas V, Tian Y and Mikolajczyk K. 2020. SOLAR: second-order loss and attention for image retrieval[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2001.08972.pdf
Niu C J, Li J and Xu K. 2018. Im2Struct: recovering 3D shape structure from a single RGB image//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4521-4529[DOI: 10.1109/CVPR.2018.00475]
Pang J H, Sun W X, Ren J S J, Yang C X and Yan Q. 2017. Cascade residual learning: a two-stage convolutional neural network for stereo matching//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 878-886[DOI: 10.1109/ICCVW.2017.108]
Park C, Kim S, Moghadam P, Guo J D, Sridharan S and Fookes C. 2019a. Robust photogeometric localization over time for map-centric loop closure. IEEE Robotics and Automation Letters, 4(2): 1768-1775[DOI:10.1109/LRA.2019.2895262]
Park J J, Florence P, Straub J, Newcombe R and Lovegrove S. 2019b. DeepSDF: learning continuous signed distance functions for shape representation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 165-174[DOI: 10.1109/CVPR.2019.00025]
Park J, Joo K, Hu Z, Liu C K and Kweon I S. 2020. Non-local spatial propagation network for depth completion//Proceedings of the European Conference on Computer Vision. Glasgow, United Kingdom: Springer: 120-136[DOI: 10.1007/978-3-030-58601-0_8]
Pascoe G, Maddern W, Stewart A D and Newman P. 2015. FARLAP: fast robust localisation using appearance priors//Proceedings of 2015 IEEE International Conference on Robotics and Automation. Seattle, USA: IEEE: 6366-6373[DOI: 10.1109/ICRA.2015.7140093]
Patil V, Van Gansbeke W, Dai D X and Van Gool L. 2020. Don't forget the past: recurrent depth estimation from monocular video. IEEE Robotics and Automation Letters, 5(4): 6813-6820[DOI:10.1109/LRA.2020.3017478]
Pham Q H, Nguyen T, Hua B S, Roig G and Yeung S K. 2019. JSIS3D: joint semantic-instance segmentation of 3D point clouds with multi-task pointwise networks and multi-value conditional random fields//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8819-8828[DOI: 10.1109/CVPR.2019.00903]
Poggi M and Mattoccia S. 2017. Learning to predict stereo reliability enforcing local consistency of confidence maps//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 4541-4550[DOI: 10.1109/CVPR.2017.483]
Qi C R, Su H, Mo K C and Guibas L J. 2017a. PointNet: deep learning on point sets for 3D classification and segmentation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 77-85[DOI: 10.1109/CVPR.2017.16]
Qi C R, Yi L, Su H and Guibas L J. 2017b. PointNet++: deep hierarchical feature learning on point sets in a metric space//Advances in Neural Information Processing Systems. Long Beach, USA: [s. n.]
Qin T, Li P L and Shen S J. 2018. VINS-mono: a robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4): 1004-1020[DOI:10.1109/TRO.2018.2853729]
Qin T, Pan J, Cao S Z and Shen S J. 2019. A general optimization-based framework for local odometry estimation with multiple sensors[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1901.03638.pdf
Radenović F, Tolias G and Chum O. 2019. Fine-tuning CNN image retrieval with no human annotation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(7): 1655-1668[DOI:10.1109/TPAMI.2018.2846566]
Rappaport T S, Xing Y C, Kanhere O, Ju S H, Madanayake A, Mandal S, Alkhateeb A and Trichopoulos G C. 2019. Wireless communications and applications above 100 GHz: opportunities and challenges for 6G and beyond. IEEE Access, 7: 78729-78757[DOI:10.1109/ACCESS.2019.2921522]
Revaud J, Weinzaepfel P, De Souza C, Pion N, Csurka G, Cabon Y and Humenberger M. 2019. R2D2: repeatable and reliable detector and descriptor[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1906.06195.pdf
Rosu R A, Schutt P, Quenzel J and Behnke S. 2019. LatticeNet: fast point cloud segmentation using permutohedral lattices[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1912.05905.pdf
Roy A and Todorovic S. 2016. Monocular depth estimation using neural regression forest//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5506-5514[DOI: 10.1109/CVPR.2016.594]
Saito S, Huang Z, Natsume R, Morishima S, Li H and Kanazawa A. 2019. PIFu: pixel-aligned implicit function for high-resolution clothed human digitization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2304-2314[DOI: 10.1109/ICCV.2019.00239]
Saito S, Simon T, Saragih J and Joo H. 2020. PIFuHD: multi-level pixel-aligned implicit function for high-resolution 3D human digitization//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 81-90[DOI: 10.1109/CVPR42600.2020.00016]
Saputra M R U, de Gusmao P P B, Wang S, Markham A and Trigoni N. 2019. Learning monocular visual odometry through geometry-aware curriculum learning//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 3549-3555[DOI: 10.1109/ICRA.2019.8793581]
Sarlin P E, Cadena C, Siegwart R and Dymczyk M. 2019. From coarse to fine: robust hierarchical localization at large scale//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 12708-12717[DOI: 10.1109/CVPR.2019.01300]
Sarlin P E, Debraine F, Dymczyk M, Siegwart R and Cadena C. 2018. Leveraging deep visual descriptors for hierarchical efficient localization[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1809.01019.pdf
Sarlin P E, DeTone D, Malisiewicz T and Rabinovich A. 2020. SuperGlue: learning feature matching with graph neural networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 4937-4946[DOI: 10.1109/CVPR42600.2020.00499]
Sattler T, Leibe B and Kobbelt L. 2017. Efficient and effective prioritized matching for large-scale image-based localization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9): 1744-1756[DOI:10.1109/TPAMI.2016.2611662]
Sattler T, Maddern W, Toft C, Torii A, Hammarstrand L, Stenborg E, Safari D, Okutomi M, Pollefeys M, Sivic J and Kahl F. 2018. Benchmarking 6DOF outdoor visual localization in changing conditions//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8601-8610[DOI: 10.1109/CVPR.2018.00897]
Sattler T, Zhou Q J, Pollefeys M and Leal-Taixé L. 2019. Understanding the limitations of CNN-based absolute camera pose regression//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3297-3307[DOI: 10.1109/CVPR.2019.00342]
Saxena A, Sun M and Ng A Y. 2009. Make3D: learning 3D scene structure from a single still image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5): 824-840[DOI:10.1109/TPAMI.2008.132]
Saxena A, Sung C H and Ng A Y. 2005. Learning depth from single monocular images//Proceedings of the 18th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press: 1161-1168
Schmid K, Tomic T, Ruess F, Hirschmüller H and Suppa M. 2013. Stereo vision based indoor/outdoor navigation for flying robots//Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE: 3955-3962[DOI: 10.1109/IROS.2013.6696922]
Seki A and Pollefeys M. 2017. SGM-Nets: semi-global matching with neural networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6640-6649[DOI: 10.1109/CVPR.2017.703]
Shamwell E J, Lindgren K, Leung S and Nothwang W D. 2020. Unsupervised deep visual-inertial odometry with online error correction for RGB-D imagery. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): 2478-2493[DOI:10.1109/TPAMI.2019.2909895]
Shao W Z, Vijayarangan S, Li C and Kantor G. 2019. Stereo visual inertial LiDAR simultaneous localization and mapping//Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE: 370-377[DOI: 10.1109/IROS40897.2019.8968012]
Shean D E, Alexandrov O, Moratto Z M, Smith B E, Joughin I R, Porter C and Morin P. 2016. An automated, open-source pipeline for mass production of digital elevation models (DEMs) from very-high-resolution commercial stereo satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 116: 101-117[DOI:10.1016/j.isprsjprs.2016.03.012]
Sheng L, Xu D, Ouyang W L and Wang X G. 2019. Unsupervised collaborative learning of keyframe detection and visual odometry towards monocular deep SLAM//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 4301-4310[DOI: 10.1109/ICCV.2019.00440]
Shi T X, Cui H N, Song Z and Shen S H. 2020. Dense semantic 3D map based long-term visual localization with hybrid features[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2005.10766.pdf
Shi T X, Shen S H, Gao X and Zhu L J. 2019. Visual localization using sparse semantic 3D map//Proceedings of 2019 IEEE International Conference on Image Processing. Taipei, China: IEEE: 315-319[DOI: 10.1109/ICIP.2019.8802957]
Shivakumar S S, Nguyen T, Miller I D, Chen S W, Kumar V and Taylor C J. 2019. DFuseNet: deep fusion of RGB and sparse depth information for image guided dense depth completion//Proceedings of 2019 IEEE Intelligent Transportation Systems Conference. Auckland, New Zealand: IEEE: 13-20[DOI: 10.1109/ITSC.2019.8917294]
Simo-Serra E, Trulls E, Ferraz L, Kokkinos I, Fua P and Moreno-Noguer F. 2015. Discriminative learning of deep convolutional feature point descriptors//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 118-126[DOI: 10.1109/ICCV.2015.22]
Sinha A, Bai J and Ramani K. 2016. Deep learning 3D shape surfaces using geometry images//Proceedings of European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 223-240
Song X, Zhao X, Hu H W and Fang L J. 2018. EdgeStereo: a context integrated residual pyramid network for stereo matching//Proceedings of 2018 Asian Conference on Computer Vision. Perth, Australia: Springer: 20-35[DOI: 10.1007/978-3-030-20873-8_2]
Stewart A D and Newman P. 2012. LAPS-localisation using appearance of prior structure: 6-DoF monocular camera localisation using prior pointclouds//Proceedings of 2012 IEEE International Conference on Robotics and Automation. Saint Paul, USA: IEEE: 2625-2632[DOI: 10.1109/ICRA.2012.6224750]
Strasdat H, Davison A J, Montiel J M M and Konolige K. 2011. Double window optimisation for constant time visual SLAM//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE: 2352-2359[DOI: 10.1109/ICCV.2011.6126517]
Strasdat H, Montiel J M M and Davison A J. 2010. Scale drift-aware large scale monocular SLAM//Robotics: Science and Systems VI. Zaragoza, Spain: [s. n.]
Su H, Jampani V, Sun D Q, Maji S, Kalogerakis E, Yang M H and Kautz J. 2018. SPLATNet: sparse lattice networks for point cloud processing//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2530-2539[DOI: 10.1109/CVPR.2018.00268]
Su H, Maji S, Kalogerakis E and Learned-Miller E. 2015. Multi-view convolutional neural networks for 3D shape recognition//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 945-953[DOI: 10.1109/ICCV.2015.114]
Su Z, Xu L, Zheng Z R, Yu T, Liu Y B and Fang L. 2020. RobustFusion: human volumetric capture with data-driven visual cues using a RGBD camera//Proceedings of European Conference on Computer Vision. Glasgow, United Kingdom: Springer: 246-264[DOI: 10.1007/978-3-030-58548-8_15]
Sun X, Xie Y F, Luo P and Wang L. 2017. A dataset for benchmarking image-based localization//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5641-5649[DOI: 10.1109/CVPR.2017.598]
Svärm L, Enqvist O, Kahl F and Oskarsson M. 2017. City-scale localization for cameras with known vertical direction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(7): 1455-1461[DOI:10.1109/TPAMI.2016.2598331]
Tan F T, Zhu H, Cui Z P, Zhu S Y, Pollefeys M and Tan P. 2020. Self-supervised human depth estimation from monocular videos//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 647-656[DOI: 10.1109/CVPR42600.2020.00073]
Tang F L, Li H P and Wu Y H. 2019. FMD stereo SLAM: fusing MVG and direct formulation towards accurate and fast stereo SLAM//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 133-139[DOI: 10.1109/ICRA.2019.8793664]
Taniai T, Matsushita Y, Sato Y and Naemura T. 2018. Continuous 3D label stereo matching using local expansion moves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(11): 2725-2739[DOI:10.1109/TPAMI.2017.2766072]
Tatarchenko M, Park J, Koltun V and Zhou Q Y. 2018. Tangent convolutions for dense prediction in 3D//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3887-3896[DOI: 10.1109/CVPR.2018.00409]
Tateno K, Tombari F, Laina I and Navab N. 2017. CNN-SLAM: real-time dense monocular SLAM with learned depth prediction//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6565-6574[DOI: 10.1109/CVPR.2017.695]
Tchapmi L, Choy C, Armeni I, Gwak J and Savarese S. 2017. SEGCloud: semantic segmentation of 3D point clouds//Proceedings of 2017 International Conference on 3D Vision. Qingdao, China: IEEE: 537-547[DOI: 10.1109/3DV.2017.00067]
Thomas H, Qi C R, Deschaud J E, Marcotegui B, Goulette F and Guibas L. 2019. KPConv: flexible and deformable convolution for point clouds//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6410-6419[DOI: 10.1109/ICCV.2019.00651]
Tian Y R, Fan B and Wu F C. 2017. L2-net: deep learning of discriminative patch descriptor in Euclidean space//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6128-6136[DOI: 10.1109/CVPR.2017.649]
Tian Y R, Yu X, Fan B, Wu F C, Heijnen H and Balntas V. 2019. SOSNet: second order similarity regularization for local descriptor learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11008-11017[DOI: 10.1109/CVPR.2019.01127]
Torii A, Arandjelović R, Sivic J, Okutomi M and Pajdla T. 2015. 24/7 place recognition by view synthesis//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1808-1817[DOI: 10.1109/CVPR.2015.7298790]
Uhrig J, Schneider N, Schneider L, Franke U, Brox T and Geiger A. 2017. Sparsity invariant CNNs//Proceedings of 2017 International Conference on 3D Vision. Qingdao, China: IEEE: 11-20[DOI: 10.1109/3DV.2017.00012]
Ummenhofer B, Zhou H Z, Uhrig J, Mayer N, Ilg E, Dosovitskiy A and Brox T. 2017. DeMoN: depth and motion network for learning monocular stereo//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5622-5631[DOI: 10.1109/CVPR.2017.596]
Vlasic D, Peers P, Baran I, Debevec P, Popović J, Rusinkiewicz S and Matusik W. 2009. Dynamic shape capture using multi-view photometric stereo. ACM Transactions on Graphics, 28(5): #174[DOI:10.1145/1618452.1618520]
Wang B, Chen C H, Lu C X, Zhao P J, Trigoni N and Markham A. 2020a. AtLoc: attention guided camera localization. Proceedings of the AAAI Conference on Artificial Intelligence, 34(6): 10393-10401[DOI:10.1609/aaai.v34i06.6608]
Wang H C, Liu Q, Yue X Y, Lasenby J and Kusner M J. 2020b. Pre-training by completing point clouds[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2005.10766.pdf
Wang H, Schor N, Hu R Z, Huang H B, Cohen-Or D and Huang H. 2020c. Global-to-local generative model for 3D shapes. ACM Transactions on Graphics, 37(6): #214[DOI:10.1145/3272127.3275025]
Wang L G, Guo Y L, Wang Y Q, Liang Z F, Lin Z P, Yang J G and An W. 2020d. Parallax attention for unsupervised stereo correspondence learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020: #3026899[DOI:10.1109/TPAMI.2020.3026899]
Wang L, Huang Y C, Hou Y L, Zhang S M and Shan J. 2019a. Graph attention convolution for point cloud semantic segmentation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 10288-10297[DOI: 10.1109/CVPR.2019.01054]
Wang P S, Liu Y, Guo Y X, Sun C Y and Tong X. 2017a. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics, 36(4): #72[DOI:10.1145/3072959.3073608]
Wang P S, Sun C Y, Liu Y and Tong X. 2018a. Adaptive O-CNN: a patch-based deep representation of 3D shapes. ACM Transactions on Graphics, 37(6): #217[DOI:10.1145/3272127.3275050]
Wang Q Y, Yan Z K, Wang J Q, Xue F, Ma W and Zha H B. 2020e. Line flow based SLAM[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2009.09972.pdf
Wang Q Q, Zhou X W, Hariharan B and Snavely N. 2020f. Learning feature descriptors using camera pose supervision[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2004.13324.pdf
Wang S, Clark R, Wen H K and Trigoni N. 2017b. DeepVO: towards end-to-end visual odometry with deep recurrent convolutional neural networks//Proceedings of 2017 IEEE International Conference on Robotics and Automation. Singapore: IEEE: 2043-2050[DOI: 10.1109/ICRA.2017.7989236]
Wang S L, Fidler S and Urtasun R. 2015. Lost shopping! Monocular localization in large indoor spaces//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2695-2703[DOI: 10.1109/ICCV.2015.309]
Wang S L, Suo S M, Ma W C, Pokrovsky A and Urtasun R. 2018b. Deep parametric continuous convolutional neural networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2589-2597[DOI: 10.1109/CVPR.2018.00274]
Wang W Y, Yu R, Huang Q G and Neumann U. 2018c. SGPN: similarity group proposal network for 3D point cloud instance segmentation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2569-2578[DOI: 10.1109/CVPR.2018.00272]
Wang X L, Liu S, Shen X Y, Shen C H and Jia J Y. 2019b. Associatively segmenting instances and semantics in point clouds//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4091-4100[DOI: 10.1109/CVPR.2019.00422]
Wang Y R, Huang Z H, Zhu H, Li W, Cao X and Yang R G. 2020g. Interactive free-viewpoint video generation. Virtual Reality and Intelligent Hardware, 2(3): 247-260[DOI:10.1016/j.vrih.2020.04.004]
Wang Y, Wang P, Yang Z H, Luo C X, Yang Y and Xu W. 2019c. UnOS: unified unsupervised optical-flow and stereo-depth estimation by watching videos//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8063-8073[DOI: 10.1109/CVPR.2019.00826]
Wang Y G, Liu Y B, Tong X, Dai Q H and Tan P. 2018d. Outdoor markerless motion capture with sparse handheld video cameras. IEEE Transactions on Visualization and Computer Graphics, 24(5): 1856-1866[DOI:10.1109/TVCG.2017.2693151]
Wang Z H, Liang D T, Liang D, Zhang J C and Liu H J. 2018. A SLAM method based on inertial/magnetic sensors and monocular vision fusion. Robot, 40(6): 933-941[DOI:10.13973/j.cnki.robot.170683]
Wei J C, Lin G S, Yap K H, Hung T Y and Xie L H. 2020. Multi-path region mining for weakly supervised 3D semantic segmentation on point clouds//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 4383-4392[DOI: 10.1109/CVPR42600.2020.00444]
Weiss S, Achtelik M W, Lynen S, Chli M and Siegwart R. 2012. Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments//Proceedings of 2012 IEEE International Conference on Robotics and Automation. Saint Paul, USA: IEEE: 957-964[DOI: 10.1109/ICRA.2012.6225147]
Wolcott R W and Eustice R M. 2014. Visual localization within LiDAR maps for automated urban driving//Proceedings of 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. Chicago, USA: IEEE: 176-183
Wong D, Kawanishi Y, Deguchi D, Ide I and Murase H. 2017. Monocular localization within sparse voxel maps//Proceedings of 2017 IEEE Intelligent Vehicles Symposium (IV). Los Angeles, USA: IEEE: 499-504[DOI: 10.1109/IVS.2017.7995767]
Wu B C, Wan A, Yue X Y and Keutzer K. 2018a. SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 1887-1893[DOI: 10.1109/ICRA.2018.8462926]
Wu B C, Zhou X Y, Zhao S C, Yue X Y and Keutzer K. 2019a. SqueezeSegV2: improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud//Proceedings of 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE: 4376-4382[DOI: 10.1109/ICRA.2019.8793495]
Wu J J, Zhang C K, Xue T F, Freeman B T and Tenenbaum J B. 2016. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling//Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook, United States: Curran Associates Inc.: 82-90
Wu L and Wu Y H. 2019. Similarity hierarchy based place recognition by deep supervised hashing for SLAM. IROS
Wu R D, Zhuang Y X, Xu K, Zhang H and Chen B Q. 2019b. PQ-NET: a generative part Seq2Seq network for 3D shapes[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/1911.10949.pdf
Wu Y H and Hu Z Y. 2006. PnP problem revisited. Journal of Mathematical Imaging and Vision, 24(1): 131-141[DOI:10.1007/s10851-005-3617-z]
Wu Y H, Tang F L and Li H P. 2018b. Image-based camera localization: an overview. Visual Computing for Industry, Biomedicine, and Art, 1: #8[DOI:10.1186/s42492-018-0008-z]
Wu Z R, Song S R, Khosla A, Yu F, Zhang L G, Tang X O and Xiao J X. 2015. 3D ShapeNets: a deep representation for volumetric shapes//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1912-1920[DOI: 10.1109/CVPR.2015.7298801]
Wu Z J, Wang X, Lin D, Lischinski D, Cohen-Or D and Huang H. 2019c. SAGNet: structure-aware generative network for 3D-shape modeling. ACM Transactions on Graphics, 38(4): #91[DOI:10.1145/3306346.3322956]
Xiao L H, Wang J, Qiu X S, Rong Z and Zou X D. 2019. Dynamic-SLAM: semantic monocular visual localization and mapping based on deep learning in dynamic environment. Robotics and Autonomous Systems, 117: 1-16[DOI:10.1016/j.robot.2019.03.012]
Xie J Y, Girshick R and Farhadi A. 2016. Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks//Proceedings of 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 842-857[DOI: 10.1007/978-3-319-46493-0_51]
Xie S N, Gu J T, Guo D M, Qi C R, Guibas L and Litany O. 2020. PointContrast: unsupervised pre-training for 3D point cloud understanding//Proceedings of European Conference on Computer Vision. Glasgow, United Kingdom: Springer: 574-591[DOI: 10.1007/978-3-030-58580-8_34]
Xu C, Wu B, Wang Z and Tomizuka M. 2020. SqueezeSegV3: spatially-adaptive convolution for efficient point-cloud segmentation//Proceedings of European Conference on Computer Vision. Glasgow, United Kingdom: Springer: 1-19
Xu K, Zhang H, Cohen-Or D and Chen B Q. 2012. Fit and diverse: set evolution for inspiring 3D shape galleries. ACM Transactions on Graphics, 31(4): #57[DOI:10.1145/2185520.2185553]
Xu L, Cheng W, Guo K W, Han L, Liu Y B and Fang L. 2021. FlyFusion: realtime dynamic scene reconstruction using a flying depth camera. IEEE Transactions on Visualization and Computer Graphics, 27(1): 68-82[DOI:10.1109/TVCG.2019.2930691]
Xu L, Liu Y B, Cheng W, Guo K W, Zhou G Y, Dai Q H and Fang L. 2018a. FlyCap: markerless motion capture using multiple autonomous flying cameras. IEEE Transactions on Visualization and Computer Graphics, 24(8): 2284-2297[DOI:10.1109/TVCG.2017.2728660]
Xu L, Su Z, Han L, Yu T, Liu Y B and Fang L. 2020. UnstructuredFusion: realtime 4D geometry and texture reconstruction using commercial RGBD cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): 2508-2522[DOI:10.1109/TPAMI.2019.2915229]
Xu Y F, Fan T Q, Xu M Y, Zeng L and Qiao Y. 2018b. SpiderCNN: deep learning on point sets with parameterized convolutional filters//Proceedings of European Conference on Computer Vision. Munich, Germany: Springer: 90-105[DOI: 10.1007/978-3-030-01237-3_6]
Xu Y, Zhu X, Shi J P, Zhang G F, Bao H J and Li H S. 2019. Depth completion from sparse LiDAR data with depth-normal constraints//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2811-2820[DOI: 10.1109/ICCV.2019.00290]
Xue F, Wang X, Li S K, Wang Q Y, Wang J Q and Zha H B. 2019. Beyond tracking: selecting memory and refining poses for deep visual odometry//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8567-8575[DOI: 10.1109/CVPR.2019.00877]
Yan M, Wang J Z, Li J and Zhang C. 2017. Loose coupling visual-lidar odometry by combining VISO2 and LOAM//Proceedings of the 36th Chinese Control Conference. Dalian, China: IEEE: 6841-6846[DOI: 10.23919/ChiCC.2017.8028435]
Yang B, Wang J, Clark R, Hu Q Y, Wang S, Markham A and Trigoni N. 2019. Learning object bounding boxes for 3D instance segmentation on point clouds//Advances in Neural Information Processing Systems. Vancouver, Canada: [s. n.]: 6737-6746
Yang G, Zhao H, Shi J, Deng Z and Jia J. 2018a. SegStereo: exploiting semantic information for disparity estimation//Proceedings of European Conference on Computer Vision. Munich, Germany: Springer: 660-676[DOI: 10.1007/978-3-030-01234-2_39]
Yang N, von Stumberg L, Wang R and Cremers D. 2020. D3VO: deep depth, deep pose and deep uncertainty for monocular visual odometry//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1278-1289[DOI: 10.1109/CVPR42600.2020.00136]
Yang N, Wang R, Stückler J and Cremers D. 2018b. Deep virtual stereo odometry: leveraging deep depth prediction for monocular direct sparse odometry//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 835-852[DOI: 10.1007/978-3-030-01237-3_50]
Ye H Y, Huang H Y and Liu M. 2020a. Monocular direct sparse localization in a prior 3D surfel map[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2002.09923.pdf
Ye W L, Zheng R J, Zhang F Q, Ouyang Z Z and Liu Y. 2019. Robust and efficient vehicles motion estimation with low-cost multi-camera and odometer-gyroscope//Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE: 4490-4496[DOI: 10.1109/IROS40897.2019.8968048]
Ye X C, Chen S D and Xu R. 2020b. DPNet: detail-preserving network for high quality monocular depth estimation. Pattern Recognition, 109: #107578[DOI:10.1016/j.patcog.2020.107578]
Ye X Q, Li J M, Huang H X, Du L and Zhang X L. 2018. 3D recurrent neural networks with context fusion for point cloud semantic segmentation//Proceedings of European Conference on Computer Vision. Munich, Germany: Springer: 415-430[DOI: 10.1007/978-3-030-01234-2_25]
Yi L, Zhao W, Wang H, Sung M and Guibas L J. 2019. GSPN: generative shape proposal network for 3D instance segmentation in point cloud//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3942-3951[DOI: 10.1109/CVPR.2019.00407]
Yin Z C and Shi J P. 2018. GeoNet: unsupervised learning of dense depth, optical flow and camera pose//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1983-1992[DOI: 10.1109/CVPR.2018.00212]
Yu H L, Ye W C, Feng Y J, Bao H J and Zhang G F. 2020. Learning bipartite graph matching for robust visual localization//Proceedings of 2020 IEEE International Symposium on Mixed and Augmented Reality. Porto de Galinhas, Brazil: IEEE: 146-155[DOI: 10.1109/ISMAR50242.2020.00036]
Yu T, Guo K W, Xu F, Dong Y, Su Z Q, Zhao J H, Li J G, Dai Q H and Liu Y B. 2017. BodyFusion: real-time capture of human motion and surface geometry using a single depth camera//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 910-919[DOI: 10.1109/ICCV.2017.104]
Yu T, Zhao J H, Huang Y H, Li Y P and Liu Y B. 2019a. Towards robust and accurate single-view fast human motion capture. IEEE Access, 7: 85548-85559[DOI:10.1109/ACCESS.2019.2920633]
Yu T, Zheng Z R, Guo K W, Zhao J H, Dai Q H, Li H, Pons-Moll G and Liu Y B. 2018. DoubleFusion: real-time capture of human performances with inner body shapes from a single depth sensor//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7287-7296[DOI: 10.1109/CVPR.2018.00761]
Yu T, Zheng Z R, Zhong Y, Zhao J H, Dai Q H, Pons-Moll G and Liu Y B. 2019b. SimulCap: single-view human performance capture with cloth simulation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5499-5509[DOI: 10.1109/CVPR.2019.00565]
Zagoruyko S and Komodakis N. 2015. Learning to compare image patches via convolutional neural networks//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 4353-4361[DOI: 10.1109/CVPR.2015.7299064]
Žbontar J and LeCun Y. 2015. Computing the stereo matching cost with a convolutional neural network//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1592-1599[DOI: 10.1109/CVPR.2015.7298767]
Zeisl B, Sattler T and Pollefeys M. 2015. Camera pose voting for large-scale image-based localization//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2704-2712[DOI: 10.1109/ICCV.2015.310]
Zhan H Y, Garg R, Weerasekera C S, Li K J, Agarwal H and Reid I M. 2018. Unsupervised learning of monocular depth estimation and visual odometry with deep feature reconstruction//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 340-349[DOI: 10.1109/CVPR.2018.00043]
Zhan H Y, Weerasekera C S, Bian J W and Reid I. 2020. Visual odometry revisited: what should be learnt?//Proceedings of 2020 IEEE International Conference on Robotics and Automation. Paris, France: IEEE: 4203-4210[DOI: 10.1109/ICRA40945.2020.9197374]
Zhang J and Singh S. 2018. Laser-visual-inertial odometry and mapping with high robustness and low drift. Journal of Field Robotics, 35(8): 1242-1264[DOI:10.1002/rob.21809]
Zhang L, Chen W H, Hu C, Wu X M and Li Z G. 2019a. S&CNet: monocular depth completion for autonomous systems and 3D reconstruction[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/1907.06071.pdf
Zhang P J, Wu Y H and Liu B X. 2020a. Leveraging local and global descriptors in parallel to search correspondences for visual localization[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2009.10891.pdf
Zhang Y G and Li Q. 2018. Multi-frame fusion method for point cloud of LiDAR based on IMU. Journal of System Simulation, 30(11): 4334-4339
Zhang Y, Zhou Z X, David P, Yue X Y, Xi Z R, Gong B Q and Foroosh H. 2020b. PolarNet: an improved grid representation for online LiDAR point clouds semantic segmentation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 9598-9607[DOI: 10.1109/CVPR42600.2020.00962]
Zhang Z Y, Hua B S and Yeung S K. 2019b. ShellNet: efficient point cloud convolutional neural networks using concentric shells statistics//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1607-1616[DOI: 10.1109/ICCV.2019.00169]
Zhao C, Sun L, Purkait P, Duckett T and Stolkin R. 2018. Learning monocular visual odometry with dense 3D mapping from dense 3D flow//Proceedings of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid, Spain: IEEE: 6864-6871[DOI: 10.1109/IROS.2018.8594151]
Zhao H S, Jiang L, Fu C W and Jia J Y. 2019. PointWeb: enhancing local neighborhood features for point cloud processing//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5560-5568[DOI: 10.1109/CVPR.2019.00571]
Zhao H S, Shi J P, Qi X J, Wang X G and Jia J Y. 2017. Pyramid scene parsing network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6230-6239[DOI: 10.1109/CVPR.2017.660]
Zheng B and Zhang Z X. 2019. An improved EKF-SLAM for Mars surface exploration. International Journal of Aerospace Engineering, 2019: #7637469[DOI:10.1155/2019/7637469]
Zheng Z R, Yu T, Li H, Guo K W, Dai Q H, Fang L and Liu Y B. 2018. HybridFusion: real-time performance capture using a single depth sensor and sparse IMUs//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 389-406[DOI: 10.1007/978-3-030-01240-3_24]
Zhi S F, Bloesch M, Leutenegger S and Davison A J. 2019. SceneCode: monocular dense semantic reconstruction using learned encoded scene representations//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11768-11777[DOI: 10.1109/CVPR.2019.01205]
Zhong Y R, Li H D and Dai Y C. 2018. Open-world stereo video matching with deep RNN//Proceedings of 2018 European Conference on Computer Vision. Munich, Germany: Springer: 104-119[DOI: 10.1007/978-3-030-01216-8_7]
Zhou C, Zhang H, Shen X Y and Jia J Y. 2017a. Unsupervised learning of stereo matching//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1576-1584[DOI: 10.1109/ICCV.2017.174]
Zhou H, Zhu X, Song X, Ma Y C, Wang Z, Li H S and Lin D H. 2020a. Cylinder3D: an effective 3D framework for driving-scene LiDAR semantic segmentation[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2008.01550.pdf
Zhou T H, Brown M, Snavely N and Lowe D G. 2017b. Unsupervised learning of depth and ego-motion from video//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6612-6619[DOI: 10.1109/CVPR.2017.700]
Zhou Y, Wan G W, Hou S H, Yu L, Wang G, Rui X F and Song S Y. 2020b. DA4AD: end-to-end deep attention-based visual localization for autonomous driving[EB/OL]. [2021-02-03]. https://arxiv.org/pdf/2003.03026.pdf
Zhu C, Giorgi G, Lee Y H and Günther C. 2018a. Enhancing accuracy in visual SLAM by tightly coupling sparse ranging measurements between two rovers//Proceedings of 2018 IEEE/ION Position, Location and Navigation Symposium. Monterey, USA: IEEE: 440-446[DOI: 10.1109/PLANS.2018.8373412]
Zhu C Y, Xu K, Chaudhuri S, Yi R J and Zhang H. 2018b. SCORES: shape composition with recursive substructure priors. ACM Transactions on Graphics, 37(6): #211[DOI:10.1145/3272127.3275008]
Zhu H, Su H, Wang P, Cao X and Yang R G. 2018c. View extrapolation of human body from a single image//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4450-4459[DOI: 10.1109/CVPR.2018.00468]
Zhu X, Zhou H, Wang T and Hong F Z. 2020. Cylindrical and asymmetrical 3D convolution networks for LiDAR segmentation[EB/OL]. [2021-01-21]. https://arxiv.org/pdf/2011.10033.pdf
Zhu Z L, Yang S W, Dai H D and Li F. 2018d. Loop detection and correction of 3D laser-based SLAM with visual information//Proceedings of the 31st International Conference on Computer Animation and Social Agents. Beijing, China: ACM: 53-58[DOI: 10.1145/3205326.3205357]
Zoph B, Vasudevan V, Shlens J and Le Q V. 2018. Learning transferable architectures for scalable image recognition//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8697-8710[DOI: 10.1109/CVPR.2018.00907]
Zubizarreta J, Aguinaga I and Montiel J M M. 2020. Direct sparse mapping. IEEE Transactions on Robotics, 36(4): 1363-1370[DOI:10.1109/TRO.2020.2991614]
Zuo X X, Geneva P, Yang Y L, Ye W L, Liu Y and Huang G Q. 2019. Visual-inertial localization with prior LiDAR map constraints. IEEE Robotics and Automation Letters, 4(4): 3394-3401[DOI:10.1109/LRA.2019.2927123]