Review of end-to-end motion planning for autonomous driving with visual perception
Journal of Image and Graphics, 2021, Vol. 26, No. 1, pp. 49-66
Print publication date: 2021-01-16
Accepted: 2020-10-21
DOI: 10.11834/jig.200276
Yifei Liu, Xuemin Hu, Guowen Chen, Shihao Liu, Long Chen. Review of end-to-end motion planning for autonomous driving with visual perception[J]. Journal of Image and Graphics, 2021,26(1):49-66.
A visual perception module uses cameras and other visual sensors to obtain rich image and video information and to detect vehicles, pedestrians, traffic signs, and other objects in the field of view of an autonomous vehicle; it is one of the most effective and lowest-cost forms of perception for autonomous driving. Motion planning provides an autonomous vehicle with a series of motion parameters and driving actions from its initial state to a target state, and end-to-end models, which obtain motion parameters directly from perceived data, have therefore attracted wide attention. To comprehensively reflect the research progress of end-to-end motion planning for autonomous driving with visual perception, this paper reviews representative and cutting-edge papers published at home and abroad. It first analyzes applications of end-to-end methods and the roles of visual perception and motion planning in end-to-end autonomous driving; taking the vehicle's learning paradigm as the classification criterion, it then divides implementation methods into imitation learning and reinforcement learning and summarizes the algorithms within each class. Because current end-to-end research faces the task of transferring from virtual to real environments, transfer-learning-based methods are also reviewed. Finally, datasets and simulation platforms related to autonomous driving are listed, open problems and challenges are summarized, and future trends are discussed. End-to-end motion planning models with visual perception are broadly applicable and structurally simple, giving this class of methods wide application prospects and research value; however, they remain difficult to interpret and cannot guarantee absolute safety, so further research is needed to overcome these limitations.
A visual perception module uses cameras to obtain rich image features for detecting peripheral information, such as vehicles, pedestrians, and traffic signs, in the field of view of a self-driving vehicle; it is an effective and low-cost perception method for autonomous driving. Motion planning provides a self-driving vehicle with a series of motion parameters and driving actions from its initial state to a target state, subjecting the vehicle to collision-avoidance and dynamic constraints from the external environment and to spatio-temporal constraints from the internal system throughout the journey. Traditional autonomous driving approaches decompose the pipeline from sensor inputs to actuator outputs into a number of independent submodules, such as perception, planning, decision making, and control. However, these modular approaches require feature design and selection, camera calibration, and manual parameter tuning, so systems built on them do not achieve complete autonomy. With the rapid development of big data, computing performance, and deep learning algorithms, a growing number of researchers have applied deep learning to autonomous driving. An end-to-end model based on deep learning obtains vehicle motion parameters directly from perceived data and can fully embody the autonomy of autonomous driving; thus, it has been widely investigated in recent years.
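To make the contrast with the modular pipeline concrete, the following is a minimal PyTorch-style sketch of such an end-to-end model: a single network that maps a raw camera frame directly to continuous motion parameters. The architecture and layer sizes are illustrative assumptions, not the configuration of any particular system surveyed here.

```python
# Minimal sketch: an end-to-end network that maps a front-facing camera
# image directly to continuous motion parameters (steering, throttle).
# All layer sizes are illustrative assumptions, not a published design.
import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    def __init__(self):
        super().__init__()
        # Visual perception: convolutional feature extractor.
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Motion planning head: regress motion parameters from visual features.
        self.planner = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 100), nn.ReLU(),
            nn.Linear(100, 2),  # [steering angle, throttle]
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.planner(self.features(image))

# One forward pass on a dummy 3x128x128 camera frame.
actions = EndToEndDriver()(torch.randn(1, 3, 128, 128))
```

In a modular pipeline, the feature extractor and the planner would be separate, hand-engineered components; here both are learned jointly from data.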
This paper summarizes representative and cutting-edge papers published at home and abroad to fully review the research progress of end-to-end motion planning for autonomous driving with visual perception. Applications of the end-to-end model in computer vision tasks and games are first introduced; in some of these fields, the tasks solved end to end are even more complex than autonomous driving, suggesting that end-to-end approaches can also succeed in commercial autonomous driving. The important roles of visual perception and motion planning in end-to-end autonomous driving are then analyzed by comparing the advantages and disadvantages of different input and output modes. On the basis of the learning paradigm of the autonomous vehicle, implementation methods of end-to-end motion planning with visual perception are divided into imitation learning and reinforcement learning. Imitation learning comprises two mainstream algorithms, namely behavior cloning and dataset aggregation (DAgger); two recently proposed variants, observational imitation learning and conditional imitation learning, are also analyzed.
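As a concrete illustration of the simplest of these, behavior cloning reduces driving to supervised regression on logged expert demonstrations. The sketch below assumes a hypothetical `expert_loader` yielding (image, expert action) pairs. Dataset aggregation would extend this loop by rolling out the current policy, having the expert relabel the visited states, and retraining on the aggregated data; conditional imitation learning would additionally condition the policy on a navigation command.

```python
# Behavior cloning sketch: supervised regression of expert actions.
# `expert_loader` is a hypothetical DataLoader of (image, expert_action) pairs.
import torch
import torch.nn as nn

def behavior_cloning(policy: nn.Module, expert_loader, epochs: int = 10):
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for images, expert_actions in expert_loader:
            pred = policy(images)                 # policy's action for this state
            loss = loss_fn(pred, expert_actions)  # imitate the expert label
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return policy
```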
In reinforcement learning, value-based and policy-based methods are mainly introduced, and advanced variants such as inverse reinforcement learning and hierarchical reinforcement learning are also presented.
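For the value-based family, a minimal sketch of one temporal-difference update in the style of deep Q-learning is given below; the discrete action set and hyperparameters are illustrative assumptions. Policy-based methods would instead update a parameterized policy directly along the gradient of expected return.

```python
# One temporal-difference update in the style of deep Q-learning, the
# canonical value-based method. A discrete driving action set (e.g. steer
# left / straight / right) is an illustrative assumption.
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions actually taken.
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target: r + gamma * max_a' Q_target(s', a').
        target = rewards + gamma * target_net(next_states).max(1).values * (1 - dones)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```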
At this stage, research on end-to-end models for autonomous driving faces the transition from virtual scenarios to real scenarios, so transfer learning methods are reviewed from three aspects: image conversion, domain adaptation, and domain randomization. On this basis, the basic idea and network structure of each method are described.
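Domain randomization, the simplest of the three to sketch, perturbs the simulator's appearance during training so that the real world looks to the policy like just one more rendering variation. A minimal torchvision-based sketch follows; all perturbation ranges are arbitrary assumptions.

```python
# Domain randomization sketch: perturb simulated camera frames so that the
# policy never overfits to one rendering style. All ranges are arbitrary.
import torch
from torchvision import transforms

randomize = transforms.Compose([
    transforms.ColorJitter(brightness=0.5, contrast=0.5,
                           saturation=0.5, hue=0.1),          # lighting/colors
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),  # sensor blur
])

def randomized_batch(sim_frames: torch.Tensor) -> torch.Tensor:
    # Apply a fresh random appearance to every simulated training frame.
    return torch.stack([randomize(f) for f in sim_frames])
```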
Autonomous driving models are usually evaluated in a simulation environment by means of public datasets and simulation platforms. The datasets and simulation platforms related to autonomous driving are listed and analyzed in terms of publication time, configuration, and applicable conditions.
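As an example of such a platform, the sketch below shows how an agent typically connects to the CARLA simulator through its Python API and issues a control command; it assumes a CARLA server already running on the default port, and the wiring from the end-to-end model to the control signal is omitted.

```python
# Connecting an agent to the CARLA simulator (assumes a CARLA server is
# already running on localhost:2000; the vehicle blueprint is arbitrary).
import carla

client = carla.Client('localhost', 2000)
client.set_timeout(10.0)
world = client.get_world()

blueprints = world.get_blueprint_library()
vehicle_bp = blueprints.filter('vehicle.*')[0]
spawn = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(vehicle_bp, spawn)

# In a real evaluation loop, steering/throttle would come from the
# end-to-end model instead of these constants.
vehicle.apply_control(carla.VehicleControl(throttle=0.3, steer=0.0))
```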
Existing problems, challenges, reflections, and outlooks are finally summarized. End-to-end motion planning for autonomous driving with visual perception offers strong generality and a simple structure, but its decisions are hard to interpret and its absolute safety is difficult to guarantee; generating accountable intermediate representations is a promising route to interpretability. End-to-end motion planning methods with visual perception therefore have broad application prospects and research value, yet further studies are needed to overcome the limitations of current models.
Keywords: visual perception; motion planning; end-to-end; autonomous driving; imitation learning; reinforcement learning