A survey of simulation-to-reality decision-making technologies for autonomous driving

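Hu Xuemin1, Huang Tingyu1, Yu Yalan1, Ren Jiajia1, Xie Wei2, Chen Long3 (1. Hubei University; 2. Linyi University; 3. Institute of Automation, Chinese Academy of Sciences)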

Abstract
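Autonomous vehicles are an important direction for the future of transportation, and decision-making technology is the key to driving them safely and efficiently. For reasons of cost and safety, the latest autonomous driving decision-making techniques are usually studied in simulation first and then applied in the real world, so simulation-to-reality methods help autonomous driving systems learn, train, and validate more effectively. However, the gap between the simulated and real environments poses a challenge when these models and techniques are transferred to real vehicles. This problem, known as the simulation-to-reality domain gap, has prompted researchers to explore ways of solving it and to propose a variety of effective methods. This paper groups these methods into two broad categories: sim2real transfer and parallel intelligence. The former addresses the domain gap by using different methods to transfer vehicle decision-making trained in simulation to the real environment. The latter builds a virtual artificial system alongside the real physical system and lets the two interact, compare, learn, and experiment, thereby solving the problem of adapting autonomous driving decision-making to the real environment. This paper first gives a detailed review of the principles of sim2real transfer and parallel intelligence and of their applications in autonomous driving decision-making; it is also the first to consider the simulation-to-reality problem in autonomous driving decision-making from the perspective of parallel intelligence. It then summarizes the autonomous driving simulators commonly used to build simulation environments, and finally outlines the challenges and future trends of simulation-to-reality autonomous driving, both offering technical solutions for applying and promoting autonomous driving in real-world scenarios and providing researchers with new ideas and directions.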
Keywords
Decision technologies of simulation to reality for autonomous driving: A survey

(Institute of Automation, Chinese Academy of Sciences)

Abstract
Since the mid-1980s, many research institutions have been developing autonomous driving technologies. The main idea of autonomous driving is to perceive the states of the ego-vehicle and its surroundings in real time through sensors, use an intelligent system for decision-making and planning, and finally execute driving operations through the control system. The decision-making module, an important component of autonomous driving systems, bridges environment perception and vehicle control; it is mainly responsible for finding optimal paths or correct and reliable behaviors so that the ego-vehicle can drive effectively on the road. Research on autonomous driving decision-making has very strict safety requirements, and training directly in the real world not only increases cost significantly but also misses some marginal (edge-case) driving scenarios. For this reason, many studies are first conducted in simulation before new autonomous driving models are applied in the real world. However, simulation provides only an approximate model of vehicle dynamics and of the vehicle's interaction with its surroundings, so a vehicle agent trained only in simulation does not generalize directly to the real world. The remaining difference between reality and simulation, called the reality gap (RG), poses a challenge for transferring autonomous driving models from simulated vehicles to real vehicles. Researchers have proposed many approaches to address the reality gap. This paper presents the principles and state-of-the-art methods of transferring knowledge from simulation to reality (sim2real) and of parallel intelligence (PI), as well as their applications in decision-making for autonomous driving. Sim2real approaches reduce the RG by transferring the learned models from the simulation environment to the real environment.
In autonomous driving, the basic idea of sim2real is to train the vehicle agent in the simulation environment and then transfer it to the real environment using various methods, which greatly reduces the number of interactions between the vehicle agent and the real environment. It can also improve the effectiveness and performance of decision-making algorithms for autonomous driving. At present, the main sim2real methods include robust reinforcement learning (RL), meta-learning, curriculum learning, knowledge distillation, and transfer learning, together with other helpful techniques such as domain randomization and system identification, each of which reduces the reality gap in its own way. For example, transfer learning bridges the reality gap by directly addressing the differences between domains. Because vehicle agents in the real world may encounter problems that do not exist in simulation, some researchers use meta-learning to bridge the gap. Sim2real methods handle the RG problem to some extent, but computational cost remains a challenge, especially in complex and dynamic environments, and this limits their range of application. To address this limitation, parallel intelligence has been proposed, which tackles the RG problem by running the simulation environment in parallel with the real environment. It is a new paradigm based on the ACP method (artificial society, computational experiment, and parallel execution), which deeply integrates simulated and real scenarios. The main process of parallel intelligence is to form a complete system through repeated interactions between the artificial and physical systems, and to reduce the RG through parallel execution and computational experiments. The computational experiments are divided into descriptive learning, predictive learning, and prescriptive learning, which transition gradually from the simulation environment to the real world.
Parallel intelligence and sim2real technologies extend the physical space into the virtual space and model the real world through virtual-real interaction, so that the vehicle agent can learn knowledge and experience from both the simulation environment and the real environment. The core of PI is to make decisions through the interaction between the real driving system and the artificial driving system, and to manage and control the driving system through comparison, learning, and experiments across the two systems. Compared with sim2real methods, parallel intelligence handles the relationship between simulated and real scenarios at a higher technical level, solves complex modeling problems, and greatly reduces the difference between simulated and real scenarios. In the field of autonomous driving, parallel intelligence has developed several branches, mainly the parallel system, parallel learning, parallel driving, and parallel planning. Moreover, its theoretical framework has been developed continuously and has achieved remarkable results in many fields, such as transportation, medical treatment, manufacturing, and control. This paper then presents several autonomous driving simulators, such as AirSim and CARLA. Simulators for autonomous driving are generally intended to minimize the mismatch between real and simulated setups by providing training data and experience, thus enabling the deployment of vehicle agents in the real world. Finally, existing challenges and future perspectives for sim2real and PI methods are summarized. With the continued development of simulation-to-reality technologies, more breakthroughs and progress in autonomous driving can be expected in the future.
Keywords
