深度学习单目深度估计研究进展
Review of monocular depth estimation based on deep learning
2022, Vol. 27, No. 2, pp. 390-403
Received: 2020-11-13; Revised: 2021-03-09; Accepted: 2021-03-16; Published in print: 2022-02-16
DOI: 10.11834/jig.200618
单目深度估计是从单幅图像中获取场景深度信息的重要技术,在智能汽车和机器人定位等领域应用广泛,具有重要的研究价值。随着深度学习技术的发展,涌现出许多基于深度学习的单目深度估计研究,单目深度估计性能也取得了很大进展。本文按照单目深度估计模型采用的训练数据的类型,从3个方面综述了近年来基于深度学习的单目深度估计方法:基于单图像训练的模型、基于多图像训练的模型和基于辅助信息优化训练的单目深度估计模型。同时,本文在综述了单目深度估计研究常用数据集和性能指标基础上,对经典的单目深度估计模型进行了性能比较分析。以单幅图像作为训练数据的模型具有网络结构简单的特点,但泛化性能较差。采用多图像训练的深度估计网络有更强的泛化性,但网络的参数量大、网络收敛速度慢、训练耗时长。引入辅助信息的深度估计网络的深度估计精度得到了进一步提升,但辅助信息的引入会造成网络结构复杂、收敛速度慢等问题。单目深度估计研究还存在许多的难题和挑战。利用多图像输入中包含的潜在信息和特定领域的约束信息,来提高单目深度估计的性能,逐渐成为了单目深度估计研究的趋势。
Advances in computing have driven rapid progress in computer vision, and a growing number of researchers now focus on 3D vision, of which monocular depth estimation is a fundamental task. Depth estimation from a single image is a critical technology for obtaining scene depth information, with important research value and potential applications in intelligent vehicles, robot positioning, and other fields. Compared with traditional depth acquisition methods, monocular depth estimation based on deep learning offers low cost and simple operation. With the development of deep learning, many studies on deep-learning-based monocular depth estimation have emerged in recent years, and its performance has improved substantially.

Monocular depth estimation models require a large amount of training data. The commonly used training data types are RGB and depth (RGB-D) image pairs, stereo image pairs, and image sequences. Models trained on RGB-D image pairs first extract image features with a convolutional neural network and then predict the depth map by regressing continuous depth values; several models further refine the predicted depth map with conditional random fields or other post-processing. Unsupervised learning is commonly used when the training data are stereo image pairs or image sequences. Models trained on stereo image pairs first predict a disparity map and then recover depth from it. When an image sequence is used, the model first predicts the depth map of one image in the sequence, and the network is then optimized against images reconstructed from that depth map and the other images in the sequence.

To improve the accuracy of depth estimation, several researchers exploit auxiliary information such as semantic labels and the depth range. Some datasets support multiple computer vision tasks, such as depth estimation and semantic segmentation, and because the two tasks are strongly correlated, jointly learning depth estimation and semantic segmentation can improve depth accuracy. When a depth estimation dataset is built, a depth camera or a light detection and ranging (LiDAR) sensor is used to acquire scene depth. Both rely on the principle that light or another propagation medium reflects when it encounters objects; because the medium dissipates during transmission, the measurable depth range is bounded, and depth cannot be measured once the medium's energy becomes too small. Several models therefore divide the depth range into intervals, take the median of each interval as its representative depth value, and predict the depth map by multi-class classification.

Different training data types not only lead to different network architectures but also affect depth estimation accuracy. In this review, current deep-learning-based monocular depth estimation methods are surveyed from the perspective of the training data type used by the model: single-image training models, multi-image training models, and models trained with auxiliary information are discussed separately. Furthermore, the latest research status of monocular depth estimation is systematically analyzed, the advantages and disadvantages of the various methods are discussed, and future research trends are summarized.
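The stereo-trained pipeline described above ends with a fixed geometric step: once a disparity map is predicted, depth follows from d = f · B / disparity. A minimal sketch of that conversion; the focal length and baseline below are illustrative assumptions (loosely KITTI-like), not values taken from the survey:

```python
# Sketch of the disparity-to-depth step used by stereo-trained models.
# Focal length and baseline are illustrative assumptions.

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Recover metric depth from disparity via d = f * B / disparity."""
    return focal_px * baseline_m / disparity_px

# Example: a 10-pixel disparity with a ~721 px focal length and
# ~0.54 m baseline maps to roughly 38.9 m of depth.
depth_m = disparity_to_depth(disparity_px=10.0, focal_px=721.0, baseline_m=0.54)
```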
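For sequence-trained models, the optimization target is a reconstruction error: a frame is re-synthesized from the predicted depth and a neighboring view, then compared with the actual target frame. A toy sketch of the photometric comparison only; the warping step, which needs camera intrinsics and relative pose, is omitted, and the pixel values are illustrative:

```python
# Toy sketch of the photometric objective used to train on image
# sequences. Only the image comparison is shown; view warping from
# depth and pose is omitted. Pixel values are illustrative.

def photometric_l1(target, reconstructed):
    """Mean absolute per-pixel difference between two images (flat lists)."""
    assert len(target) == len(reconstructed)
    return sum(abs(t - r) for t, r in zip(target, reconstructed)) / len(target)

target = [0.2, 0.5, 0.9, 0.4]          # target frame intensities
reconstructed = [0.25, 0.5, 0.8, 0.4]  # frame synthesized from depth + pose
loss = photometric_l1(target, reconstructed)
```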
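The classification-style formulation mentioned above (dividing the depth range into intervals whose median serves as the representative depth) can be sketched as follows; the range, number of bins, and uniform spacing are illustrative assumptions, since specific models differ in how they discretize:

```python
# Sketch of depth-range discretization for classification-style depth
# prediction. Uniform bins are assumed for illustration; for a uniform
# bin the median coincides with the midpoint.

def interval_medians(d_min, d_max, k):
    """Split [d_min, d_max] into k uniform intervals; return each
    interval's median as its representative depth value."""
    width = (d_max - d_min) / k
    return [d_min + (i + 0.5) * width for i in range(k)]

def depth_from_class(class_index, medians):
    # A per-pixel classifier predicts an interval index; the depth map
    # is recovered by looking up that interval's representative value.
    return medians[class_index]

medians = interval_medians(0.0, 80.0, 8)  # [5.0, 15.0, ..., 75.0]
depth = depth_from_class(3, medians)      # 4th interval -> 35.0
```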