Published: 2021-01-16
DOI: 10.11834/jig.200396
2021 | Volume 26 | Number 1

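Scene Perception and Simulation for Autonomous Driving

Deep pure pursuit: a human-like steering control model for autonomous driving

Shan Yunxiao¹, Huang Runhui², He Ze¹, Gong Zhihao¹, Jing Min³, Zou Xuesong³

1. School of Computer Science, Sun Yat-sen University, Guangzhou 510006, China;

2. School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510006, China;

3. Yunzhou Intelligence Technology Co. Ltd., Zhuhai 519080, China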

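Received: 2020-07-17; Revised: 2020-11-03; Preprint: 2020-11-10

Funding: Key-Area Research and Development Program of Guangdong Province (2019B090919003, 2018B0108004); Fundamental Research Funds for the Central Universities (19lgpy229)

Author biographies: Shan Yunxiao (first author), born in 1985, male, associate research fellow; research interest: learning-based robotic systems. E-mail: shanyx@mail.sysu.edu.cn;
Huang Runhui, male, undergraduate student; research interest: computer vision. E-mail: huangrh9@mail2.sysu.edu.cn;
He Ze, male, undergraduate student; research interests: computer science and supercomputing. E-mail: heze@mail2.sysu.edu.cn;
Jing Min, male, engineer; research interest: unmanned surface systems. E-mail: min.jing@yunzhou-tech.com;
Zou Xuesong, male, engineer; research interest: unmanned surface systems. E-mail: zxs11@tsinghua.org.cn.

Corresponding author: Gong Zhihao, male, engineer; research interest: embedded robotic systems. E-mail: gongzhh3@mail2.sysu.edu.cn.

CLC number: TP242.6

Document code: A

Article ID: 1006-8961(2021)01-0176-10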

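Abstract

Objective In autonomous driving, steering a vehicle to track a given path is a key capability. Many methods based on classical control track a path accurately, but producing human-like steering while tracking remains a challenge. Existing classical steering models do not take human driving behavior into account and therefore cannot reproduce it. In addition, most neural-network steering models take only video frames as input and lack robustness and interpretability. This paper therefore proposes deep pure pursuit (deep PP), a steering model that fuses a neural network with a classical controller. Method In deep PP, a convolutional neural network (CNN) extracts visual features of the driving environment, while a classical pure pursuit (PP) controller combines the vehicle motion model with the vehicle's position to compute the steering commands needed to track a given globally planned path. The PP steering results are concatenated with the visual feature vector into a fused feature vector, and a model mapping this fused vector to human steering behavior is built to predict the steering angle of the autonomous vehicle. Result Experiments are conducted on a dataset collected in the CARLA (Car Learning to Act) simulator and on a real-world dataset, and deep PP is compared with a CNN model from the Udacity challenge and with the classical controller. Across the 14 weather conditions of the simulation dataset, deep PP follows the autopilot's steering commands more closely than the CNN model and the classical steering controller. Measured by root mean square error (RMSE), deep PP reduces the error by 50.28% relative to the CNN model and by 35.39% relative to the classical controller. The real-world experiments confirm that the model is applicable to real scenes. Conclusion The proposed human-like steering model combines camera images, position information, and the vehicle motion model, which brings the steering behavior of the autonomous vehicle closer to that of a human driver and keeps it robust under a variety of complex driving conditions.

Keywords

autonomous driving; end-to-end; steering model; path tracking; deep learning; pure pursuit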

Human-like steering model for autonomous driving based on deep pure pursuit method

Shan Yunxiao¹, Huang Runhui², He Ze¹, Gong Zhihao¹, Jing Min³, Zou Xuesong³

1. School of Computer Science, Sun Yat-sen University, Guangzhou 510006, China;

2. School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510006, China;

3. Zhuhai Yunzhou Intelligence Technology Co. Ltd., Zhuhai 519080, China

Supported by: Key-Area Research and Development Program of Guangdong Province (2019B090919003, 2018B0108004); Fundamental Research Funds for the Central Universities (19lgpy229)

Abstract

Objective Path tracking is not a new topic, being part of the various components of an autonomous vehicle that aim to steer the vehicle to track a defined path. The traditional steering controller, which uses location information and path information to steer autonomous vehicles, cannot achieve human-like driving behaviors according to real-life driving scenes or environments. When human-like steering behavior is considered a feature in the steering model, the steering problem of autonomous vehicles becomes challenging. The traditional steering controller tracks the defined path by predicting the steering angle of the front wheel according to the current location of the vehicles and the path information, but it is only a purely mechanical driving behavior rather than a human-like driving behavior. Thus, researchers employ a neural network as a steering model, training the neural network by using the images captured from the front-facing camera mounted on the vehicle along with the associated steering angles either from the perspective of human beings or simulators; this network is also known as end-to-end neural network. Nevertheless, most of the existing neural networks consider only the visual camera frames as input, ignoring other available information such as location, motion, and model of vehicle. The training dataset of the end-to-end neural network is supposed to cover all kinds of driving weather or scenes, such as rainy day, snow day, overexposure, and underexposure, so that the network can learn as much as possible the relationship between the image frames and driving behaviors, and enhance the universality of the neural network. The end-to-end neural network also relies on large-scale training datasets to enhance the robustness of the network. Overdependence on cameras results in the steering performance being greatly affected by the environment. 
Therefore, combining the traditional steering controller with an end-to-end neural network lets each compensate for the other's weaknesses. Trained only on small-scale datasets that cover few driving scenes, the combined network can produce control behavior that is human-like, robust, and applicable to multiple driving scenes. In this paper, we propose a fusion neural network framework called deep pure pursuit (deep PP), which incorporates a convolutional neural network (CNN) with a traditional steering controller to build a robust steering model. Method In this study, a human-like steering model that fuses a visual geometry group network (VGG)-type CNN and a traditional steering controller is built. The VGG-type CNN consists of 8 layers: three convolutional layers, three pooling layers, and two fully connected layers. The convolutional layers use 3×3 kernels with stride 1 and contain 32, 64, and 128 kernels, respectively. Each convolutional layer is followed by a 2×2 max-pooling layer with stride 2 to reduce the number of parameters. The fully connected layers function as a controller for steering. While the CNN extracts visual features from video frames, pure pursuit (PP) is employed to exploit the location information and the motion model. Fifty target points on the defined path ahead of the vehicle are selected, and PP calculates a predicted front-wheel steering angle for each. The minimum and maximum look-ahead distances of PP are set to 1.5 m and 20 m ahead of the vehicle, respectively. After the visual features from the CNN and the 50 steering angles from PP are extracted, they are integrated into a combined feature vector. This vector is fed to the fully connected layers to build the mapping to the steering angle. For data augmentation, the images are flipped and rotated to improve the ability to recover from a poor location or orientation. 
In each image, the bottom 30 pixels and the top 40 pixels are cropped to remove the front of the car and most of the sky above the horizon, and the processed images are then resized to a lower resolution to accelerate training and testing. Our model is implemented in Google's TensorFlow, and the experiments are conducted on a Titan X GPU. The maximum number of epochs is set to 10, each epoch contains 10,000 frames, and the batch size is 32. The Adam optimizer with a learning rate of 1E-4 is used to train the model, and the activation function is ReLU. Root mean square error (RMSE) is used to evaluate the performance of the different models. Result To train and validate the proposed solution, we collect datasets with the CARLA (Car Learning to Act) simulator and a real-life autonomous vehicle. On the simulation dataset, the models are trained under the ClearNoon weather setting and evaluated under 14 weather conditions. In the real-life dataset, 13,080 frames are collected for training and 2,770 for testing. To verify the effectiveness of deep PP, we compare it with a CNN model from the Udacity challenge and with a traditional steering controller, PP. The experimental results show that, under the 14 driving conditions, our steering model tracks the steering commands of the CARLA autopilot more closely than the CNN and PP do, reducing the RMSE by 50.28% and 35.39%, respectively. The proposed model is also tested on the real-life dataset to demonstrate its applicability. An analysis of different look-ahead distances shows that the PP controller is sensitive to the look-ahead distance; its maximum deviation from the human driver's steering commands reaches 0.2452 rad. An analysis of location noise on the PP controller and deep PP shows that deep PP is more robust to location drift. 
Conclusion In this study, we propose a fusion neural network framework that combines visual features from the camera with additional location information and motion model information. The experimental results show that our model tracks the steering commands of the autopilot or human driver more closely than the CNN model of the Udacity challenge and PP do, and that it maintains high robustness under the 14 driving conditions.

Key words

autonomous driving; end-to-end; steering model; path tracking; deep learning; pure pursuit

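0 Introduction

Path tracking plays a very important role in autonomous driving (Shan et al., 2018). Snider (2009) defines it as follows: path tracking is the process by which an autonomous vehicle steers appropriately so that it travels along a path. Building an advanced steering model is essential for path tracking; a good steering model should largely reproduce human driving behavior and steer the vehicle smoothly. Classical steering models are widely used for tracking tasks on many mobile robot platforms, for example the pure pursuit (PP) controller (Chen et al., 2018), model predictive control (MPC) (Ma et al., 2017), and proportional-integral-derivative (PID) control (Zhao et al., 2012). Although these tracking methods are practical, their tracking behavior still differs considerably from human driving, and human-like steering is an important goal on the way to autonomous driving.

Path tracking for driving has been studied extensively. Kuwata et al. (2009) ported the PP algorithm to the Massachusetts Institute of Technology autonomous driving platform and used it for steering control; in their experiments, a mechanism that adjusts the look-ahead distance adapts to changes in speed. Shan et al. (2015) showed that the curvature of the tracked path matters more than speed when choosing a suitable look-ahead distance and proposed a new steering method, a curvature-adaptive pursuit algorithm. Unlike the pure-pursuit-based methods, Zhao et al. (2012) proposed an adaptive PID controller that lets the vehicle adjust its steering according to the tracking state. Riofrio et al. (2017) further improved tracking performance with a linear quadratic regulator based on vehicle dynamics. Classical steering models perform well in most autonomous driving systems, but their overly mechanical steering differs markedly from human driving habits.

Steering methods based on deep learning have attracted much attention because they can learn human steering behavior. Bojarski et al. (2016) trained a convolutional neural network that maps raw camera pixels directly to steering commands. This end-to-end approach was validated in real-world experiments, but its robustness and accuracy leave room for improvement. Fernando et al. (2017) analyzed why end-to-end methods are inaccurate and proposed a method based on neural memory networks that incorporates the steering-wheel trajectory. Eraqi et al. (2017) addressed the original end-to-end method's neglect of the temporal relationship between frames and proposed a convolutional long short-term memory recurrent neural network that learns both visual and dynamic temporal dependencies in driving. To evaluate existing methods further, Tian et al. (2018) designed DeepTest, a systematic testing tool, and used it to test the three best-performing models of the Udacity self-driving challenge. According to the test results, steering failures occur occasionally, and fatal steering errors can appear under extreme driving conditions such as blurred vision, rain, or fog. Building a human-like, environmentally robust deep steering model therefore remains a challenging task in autonomous driving.

Classical tracking control methods are stable, reliable, and little affected by the environment, but their mechanical steering is hard to improve through self-optimization. Deep-learning-based steering methods imitate human steering well, but relying on vision alone makes them sensitive to the environment, and in extreme cases they output wrong steering commands that endanger safety. How to fuse classical control with deep steering to obtain robust, human-like driving is therefore a question worth studying in depth.

This paper therefore proposes the deep pure pursuit model (deep PP), which fuses the classical steering controller PP with a convolutional neural network (CNN) to build a more robust deep steering model. Through PP, the model complements the vision sensor with the global trajectory provided by a high-definition map, the vehicle's own dynamic model, and its position, which improves robustness to the environment. At the same time, a CNN extracts features from images of the environment, and a joint feature vector combines these features with the steering values computed by PP to build a mapping to human steering behavior.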

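1 Background

The pure pursuit controller relates the current position to a target point on the target trajectory and predicts the front-wheel steering angle of the vehicle. As shown in Fig. 1, according to the Ackermann principle, the vehicle moves along an arc centered at the intersection of the front-axle and rear-axle lines. Fig. 1 models the steering motion with a simple two-degree-of-freedom model, where $P_{\text{center}}$ is the center of rotation of the vehicle. The geometry in Fig. 1 gives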

Fig. 1 Pure pursuit algorithm

$ \tan \beta = \frac{L_w}{R} $

(1)

where $L_w$ is the wheelbase between the front and rear axles, $R$ is the turning radius, and $\beta$ is the front-wheel steering angle. The geometry in Fig. 1 further gives

$ \frac{L_d}{\sin 2\alpha} = \frac{R}{\sin \left( \frac{\pi}{2} - \alpha \right)} $

(2)

$ \frac{L_d}{2\sin \alpha \cos \alpha} = \frac{R}{\cos \alpha} $

(3)

$ \frac{L_d}{\sin \alpha} = 2R $

(4)

where $L_d$ is the look-ahead distance and $\alpha$ is determined by the current position $P_{\rm now}$ and the target point $P_{\rm next}$.

Combining Eqs. (1) to (4), the front-wheel steering angle $\beta$ is

$ \beta = \arctan \left({\frac{{2\sin \alpha {L_w}}}{{{L_d}}}} \right) $

(5)

For a given vehicle, the performance of the pure pursuit algorithm is determined by the look-ahead distance $L_d$.
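As a rough illustration of Eq. (5), the following Python sketch computes the PP steering angle for a single target point. The function name, the pose convention (heading in radians measured counter-clockwise, positive angles meaning a left turn), and the 2.7 m wheelbase in the usage line are assumptions made for illustration; they are not taken from the paper.

```python
import math


def pure_pursuit_angle(pose, target, wheelbase):
    """Front-wheel steering angle (rad) toward one target point, following Eq. (5).

    pose = (x, y, heading) of the vehicle, heading in radians (assumed convention);
    target = (x, y) of the target point; wheelbase is L_w in metres.
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    look_ahead = math.hypot(dx, dy)  # L_d: distance to the target point
    if look_ahead == 0.0:
        raise ValueError("target coincides with the vehicle position")
    alpha = math.atan2(dy, dx) - heading  # angle between the heading and the line to the target
    # beta = arctan(2 * L_w * sin(alpha) / L_d); atan2 is equivalent because L_d > 0
    return math.atan2(2.0 * wheelbase * math.sin(alpha), look_ahead)


# Usage: a target 10 m ahead and 1 m to the left gives a small positive (left) angle.
beta = pure_pursuit_angle((0.0, 0.0, 0.0), (10.0, 1.0), wheelbase=2.7)
```

Because the look-ahead distance appears in the denominator, the same lateral offset produces a much larger angle for a near target than for a far one, which is why the choice of $L_d$ dominates the controller's behavior.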

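2 Method

2.1 Data collection

Because the proposed method requires a global trajectory and real-time position information, existing open datasets for learning human driving behavior, such as KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago), Cityscapes, and Oxford RobotCar, mostly lack target-path information and synchronized position and steering information, and so cannot meet the data needs of deep PP. In addition, to test whether the proposed scheme adapts to extreme driving environments, the dataset should contain extreme conditions such as rain and snow wherever possible. Virtual simulation has advanced in recent years to the point of reproducing real scenes realistically. We therefore use the CARLA (Car Learning to Act) simulator to create a dataset covering a variety of driving situations. Data collected with CARLA include the essential first-person driving images and position information, and the surrounding buildings, pedestrians, vehicles, and weather can be customized. During data collection, the CARLA autopilot controls the data-collection vehicle, which records color driving images, position information, and ground-truth steering commands. In the test phase, the trained steering model is deployed on the test vehicle, which feeds the driving-scene images from its camera together with its current position into the steering model to obtain steering commands.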

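2.2 Data processing

To increase data diversity, flipping and rotation are used for augmentation, which also improves the steering model's ability to recover from an unfavorable position or orientation. To reduce interference, the bottom 30 pixel rows and the top 40 pixel rows of each image are removed, which discards irrelevant content such as the hood of the car and the sky. Finally, the images are downsampled to 128×128 pixels to speed up training and testing.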
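The cropping, downsampling, and flipping steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the frame is represented as a list of pixel rows, nearest-neighbour sampling stands in for the unspecified resizing method, and negating the steering label when an image is mirrored is a common convention that the paper does not spell out. The function names are hypothetical.

```python
def preprocess(frame, out_size=128, top=40, bottom=30):
    """Crop `top` sky rows and `bottom` hood rows, then downsample to out_size x out_size.

    `frame` is a list of pixel rows; nearest-neighbour sampling is an assumption.
    """
    cropped = frame[top:len(frame) - bottom]
    height, width = len(cropped), len(cropped[0])
    rows = [r * height // out_size for r in range(out_size)]  # source row for each output row
    cols = [c * width // out_size for c in range(out_size)]   # source column for each output column
    return [[cropped[r][c] for c in cols] for r in rows]


def flip_sample(image, steering):
    """Mirror the image left-right and negate the steering label (assumed convention)."""
    return [row[::-1] for row in image], -steering
```

In a real pipeline an array library would normally do this work; the list-based version is used here only to keep the sketch free of dependencies.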

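The other input of deep PP is the set of angles predicted by the PP controller. The controller selects 50 target points $\left({{p_1}, {p_2}, \cdots, {p_{50}}} \right)$ on the target path ahead of the vehicle and computes the corresponding steering angles $\left({{s_1}, {s_2}, \cdots, {s_{50}}} \right)$. Fig. 2 illustrates how these data are generated, where $(x, y)$ is the current position of the vehicle and $h$ is its heading angle. Adjacent points on the target path are about 0.4 m apart. To avoid zero or invalid outputs, the minimum look-ahead distance is set to 1.5 m ahead of the vehicle. Based on experiments and previous work, the maximum look-ahead distance is limited to 20 m, which yields 50 candidate steering angles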

$ \left({{s_1}, {s_2}, \cdots, {s_{50}}} \right) = PP\left({{p_1}, {p_2}, \cdots, {p_{50}}} \right) $

(6)

Fig. 2 Illustration of the calculation of the 50 target points

where $PP\left(\cdot \right)$ denotes the pure pursuit algorithm.
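A minimal sketch of Eq. (6) is given below: it picks 50 consecutive path points starting at the first point that is at least 1.5 m away and applies Eq. (5) to each. It assumes the path is a list of (x, y) points ordered in the direction of travel and sampled about every 0.4 m; the helper names and the way the first target is chosen are illustrative assumptions, not the authors' implementation.

```python
import math


def pp_angle(x, y, heading, px, py, wheelbase):
    """Eq. (5): steering angle (rad) toward the target point (px, py)."""
    look_ahead = math.hypot(px - x, py - y)
    alpha = math.atan2(py - y, px - x) - heading
    return math.atan2(2.0 * wheelbase * math.sin(alpha), look_ahead)


def pp_feature(x, y, heading, path, wheelbase, n_points=50, min_look_ahead=1.5):
    """Return the steering angles (s_1, ..., s_n) for n consecutive target points."""
    dists = [math.hypot(px - x, py - y) for px, py in path]
    nearest = dists.index(min(dists))  # path point closest to the vehicle
    # First point at or beyond the minimum look-ahead distance, searching forward along the path.
    start = next((i for i in range(nearest, len(path)) if dists[i] >= min_look_ahead), None)
    if start is None or start + n_points > len(path):
        raise ValueError("not enough path points ahead of the vehicle")
    return [pp_angle(x, y, heading, px, py, wheelbase)
            for px, py in path[start:start + n_points]]
```

With 0.4 m spacing, 50 points starting near 1.5 m span roughly 20 m, which matches the look-ahead range stated above.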

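2.3 Network architecture

Fig. 3 shows the proposed VGG (visual geometry group network)-type neural network architecture. The network has 8 layers in total: 3 convolutional layers (conv1 to conv3), 3 max-pooling layers (pool), and 2 fully connected layers (FC1 and FC2). The convolutional layers extract deep features; all of them use 3×3 kernels with stride 1, and they contain 32, 64, and 128 kernels, respectively. Each convolutional layer is followed by a 2×2 pooling layer with stride 2, which compresses the feature maps and reduces the number of model parameters. The final two fully connected layers output the steering angle.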

Fig. 3 Fusion convolutional neural network (deep PP)

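To make the network more robust to the environment, deep PP fuses the steering angles predicted by the PP algorithm into the network. The basic idea is to tie the dynamic position information and the static vehicle motion model directly to the steering behavior. Although a CNN that learns from images can itself extract features of the vehicle's position and motion (the behavior between adjacent frames), learning from readily available position and motion features makes it easier for the network to capture the key information.

This additional information is therefore incorporated through the PP predictions. To reduce the influence of the look-ahead distance on the steering behavior, several look-ahead distances are used to obtain several predictions, which are then merged with the image features. Two fully connected layers map this merged feature to the steering angle.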
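To make the fusion step concrete, the sketch below traces the feature-map size through the three conv/pool stages and reports the length of the vector that would reach FC1 after concatenation with the 50 PP angles. It assumes that the output of the last pooling layer is flattened before concatenation and, by default, that the convolutions use 'same' padding; neither detail is stated in the text, so the numbers are indicative only.

```python
def fused_feature_size(image_size=128, filters=(32, 64, 128), n_pp_angles=50,
                       same_padding=True):
    """Return (visual_length, fused_length) for the assumed flatten-and-concatenate design."""
    side = image_size
    for _ in filters:
        if not same_padding:
            side -= 2   # a 3x3 'valid' convolution trims one pixel from each border
        side //= 2      # 2x2 max-pooling with stride 2 halves the side length
    visual = side * side * filters[-1]   # flattened output of the last pooling layer
    return visual, visual + n_pp_angles  # concatenation appends the PP angles


visual_len, fused_len = fused_feature_size()
```

Under these assumptions the visual part has 16 × 16 × 128 entries, so the 50 PP angles are a small fraction of the fused vector by length; how much weight the fully connected layers give them is learned during training.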

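3 Experiments

3.1 Experimental setup

The datasets were collected by a vehicle in the CARLA simulator (Dosovitskiy et al., 2017) and by a real autonomous vehicle. A camera mounted at the front windshield of the simulated vehicle captures image data. While the simulated vehicle drives under 14 weather conditions, the driving-behavior records, timestamps, images, and position information are collected synchronously. Of these 14 conditions, only clear noon is used to train deep PP, whereas all weather scenarios are included in the test set. The real autonomous vehicle, equipped with a front-facing camera, collected the same kinds of data as in the CARLA simulator, except that it was driven by an experienced human driver.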

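3.2 Implementation details

The model runs on a Titan X GPU. In the simulation experiments, the training set consists of 10,000 driving-scene images captured in clear noon weather. The test set comprises 14 subsets, one per weather condition, each with 3,000 images. In addition to the simulation dataset, the real-world driving dataset contains 13,080 images for training and 2,770 for testing. The code is implemented in Google's TensorFlow deep learning framework, and Table 1 lists the training parameters.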

Table 1 Parameters for network training

Parameter	Value
Maximum number of epochs	10
Samples per epoch	10,000
Batch size	32
Optimizer	Adam, learning rate 1E-4
Activation function	ReLU

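To evaluate the steering model, deep PP is compared with the best-performing CNN model of the Udacity challenge and with the original PP controller. The root mean square error (RMSE) measures the average prediction error of the network. In the experiments, RMSE is used both to evaluate the different methods and as the loss function for training the network: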

$ RMSE = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^n {{{\left({{g_i} - {s_i}} \right)}^2}} } $

(7)

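where ${{g_i}}$ is the steering angle predicted by the model at time step $i$, ${{s_i}}$ is the actual steering angle at time step $i$, and $n$ is the number of frames.

During training, optimizing the network with RMSE minimizes the difference between the predicted and the actual steering angles. The actual steering angle can come from a human driver or from the autopilot.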
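Eq. (7) can be written directly as a small function. The sketch below is a plain-Python version for offline evaluation of a sequence of predictions; the function and argument names are illustrative.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted (g_i) and actual (s_i) steering angles."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((g - s) ** 2 for g, s in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Usage with made-up steering angles in radians.
error = rmse([0.10, -0.05, 0.00], [0.12, -0.02, 0.01])
```

Because the errors are squared before averaging, a few large deviations, such as a wrong turn at an intersection, raise the RMSE more than many small ones.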

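3.3 Results

3.3.1 Simulation results

Tables 2 and 3 compare the methods by RMSE under the 14 weather scenarios. Taken together, they show that deep PP reduces the average RMSE by 50.28% relative to the CNN and by 35.39% relative to the PP controller. Because deep PP fuses image features with spatial information, the network can overcome the image noise produced in complex driving environments. Table 2 first shows that the tracking performance of the CNN is strongly affected by the driving environment. In complex scenes such as rain, sunset, and cloud (see Fig. 4), reflections, blocked views, underexposure, or overexposure prevent the CNN from extracting enough useful features from the noisy images to predict the correct steering angle. By contrast, when the test weather differs from the training weather, the PP controller keeps a smaller and more stable RMSE than the CNN. The main reason is that the PP prediction depends only on the current position and the look-ahead distance, neither of which is sensitive to weather.

Fig. 4 gives a visual comparison. In extremely complex scenes, the CNN predictions deviate completely from the ground truth. The PP controller still performs well in most scenes but oversteers in turns. Only deep PP, although trained on a simple driving scene alone, keeps its predictions close to the ground truth at test time. Fig. 5 shows the steering of each model in detail under the hard rain noon condition. Whereas the CNN is easily disturbed by weather, deep PP is not affected by the complex weather, does not output noisy values, and steers smoothly. Position information, which does not depend on the environment, keeps deep PP robust across driving conditions. Zooming in on the two turns also shows that in complex scenes deep PP adjusts its predicted steering angle and follows the changes in the ground-truth commands more closely than the CNN. Even in good driving conditions, producing a reasonable steering command at an intersection is challenging for the CNN, whereas the additional target-path and position information helps deep PP steer correctly in such difficult scenes. By using a joint feature that fuses noisy image features with accurate position, target-path, and motion-model information, deep PP achieves a more reliable and human-like steering process.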

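Table 2 RMSE between ground truth and predictions of different methods (rad)

Scenario category	Scenario	CNN	PP	deep PP
Noon	Clear noon	0.0549	0.0811	0.0427
Noon	Cloudy noon	0.0905	0.0660	0.0379
Noon	Wet noon	0.1090	0.0744	0.0502
Noon	Wet cloudy noon	0.0890	0.0658	0.0399
Rainy noon	Mid rain noon	0.0944	0.0718	0.0420
Rainy noon	Hard rain noon	0.0855	0.0616	0.0385
Rainy noon	Soft rain noon	0.0965	0.0730	0.0492
Sunset	Clear sunset	0.0709	0.0595	0.0391
Sunset	Cloudy sunset	0.0759	0.0657	0.0297
Sunset	Wet sunset	0.1165	0.0834	0.0544
Sunset	Wet cloudy sunset	0.0848	0.0614	0.0421
Rainy sunset	Mid rain sunset	0.1106	0.0705	0.0688
Rainy sunset	Hard rain sunset	0.0983	0.0730	0.0552
Rainy sunset	Soft rain sunset	0.0933	0.0701	0.0403
Note: bold type in the original marks the best result in each scenario; the lowest RMSE in every row is that of deep PP.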

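Table 3 Statistical analysis of RMSE

Method	CNN	PP	deep PP
Mean	0.0907	0.0698	0.0451
Standard deviation	0.0163	0.0071	0.0097
Note: bold type in the original marks the best results; the lowest mean is that of deep PP and the lowest standard deviation is that of PP.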
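The reported improvements of 50.28% and 35.39% follow from the mean RMSE values in Table 3. The short check below reproduces them; the function name is illustrative.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers the error of `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Mean RMSE values taken from Table 3.
mean_rmse = {"CNN": 0.0907, "PP": 0.0698, "deep PP": 0.0451}
vs_cnn = relative_reduction(mean_rmse["CNN"], mean_rmse["deep PP"])
vs_pp = relative_reduction(mean_rmse["PP"], mean_rmse["deep PP"])
print(f"{vs_cnn:.2f}% {vs_pp:.2f}%")  # prints: 50.28% 35.39%
```

Both percentages are therefore relative reductions in mean RMSE, not differences in percentage points.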

Fig. 4 Comparison of steering commands under different difficult conditions ((a) wet cloudy noon; (b) hard rain noon; (c) cloudy sunset; (d) wet cloudy sunset)

Fig. 5 Detailed steering angles under the hard rain noon condition

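3.3.2 Real-world results

Although the simulation results show that deep PP adapts very well to the environment, they do not establish its practical applicability. The three methods are therefore also evaluated on a real-world dataset collected on an autonomous driving platform. Fig. 6 shows driving scenes from the training and test sets, and Fig. 7 compares the steering results in the real scenes. In the real world, the position information from sensors cannot match the accuracy available in the simulator, and image quality is hard to control. The results show that deep PP outperforms both the CNN and the PP controller, particularly in turns. Comparing Fig. 7 with Fig. 5 also shows that the PP controller is less smooth in the real scenes than in simulation, and its maximum deviation from the human driver's steering reaches 0.2452 rad. Fig. 8 further shows how the choice of look-ahead distance increases the control error of PP, especially on roads with many curves. Localization error affects the CNN less than it affects the PP controller, which is sensitive to it. Moreover, because the weather in the test set matches that of the training set, the CNN jitters less in the real-world tests than in the simulator tests.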

Fig. 6 Real-life scenarios for training and testing

Fig. 7 Detailed steering process in real-life driving conditions

Fig. 8 Effects of different look-ahead distances on PP

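Finally, we examine how localization noise affects steering performance. The performance of the PP controller depends not only on the look-ahead distance but also on sufficiently accurate localization, so localization errors degrade it severely. In real environments, the GPS accuracy of an autonomous vehicle often drops because of occlusion and similar causes. Fig. 9 shows the behavior of the PP controller and deep PP when the localization data are noisy; PP 0.2 and deep PP 0.2 denote the prediction curves of the PP controller and deep PP, respectively, on localization data with added random noise. Comparing the PP predictions with and without the added noise shows that the PP controller is sensitive to localization noise: its prediction error reaches up to 0.3 rad, and its output is unstable and fluctuates strongly. Because deep PP also uses image features, which do not depend on localization, it suppresses this output noise well. This further confirms that deep PP is robust to both image noise and localization noise.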
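The sensitivity of PP to localization noise can be illustrated with a small experiment on Eq. (5). The sketch below perturbs the vehicle position and records the largest change in the PP angle. Interpreting the "0.2" label as uniform noise of up to ±0.2 m on each coordinate is an assumption, as are the function names, the wheelbase, and the target positions; the sketch does not attempt to reproduce the 0.3 rad figure reported above.

```python
import math
import random


def pp_angle(x, y, heading, px, py, wheelbase):
    """Eq. (5): steering angle (rad) toward the target point (px, py)."""
    look_ahead = math.hypot(px - x, py - y)
    alpha = math.atan2(py - y, px - x) - heading
    return math.atan2(2.0 * wheelbase * math.sin(alpha), look_ahead)


def max_deviation(pose, target, wheelbase, noise=0.2, trials=1000, seed=0):
    """Largest change in the PP angle when x and y are perturbed by up to +/- noise (assumed model)."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    x, y, heading = pose
    clean = pp_angle(x, y, heading, target[0], target[1], wheelbase)
    worst = 0.0
    for _ in range(trials):
        nx = x + rng.uniform(-noise, noise)
        ny = y + rng.uniform(-noise, noise)
        noisy = pp_angle(nx, ny, heading, target[0], target[1], wheelbase)
        worst = max(worst, abs(noisy - clean))
    return worst


near = max_deviation((0.0, 0.0, 0.0), (2.0, 0.0), wheelbase=2.7)   # short look-ahead
far = max_deviation((0.0, 0.0, 0.0), (20.0, 0.0), wheelbase=2.7)   # long look-ahead
```

In this toy setting the same position error disturbs the angle far more for the near target than for the far one, which is consistent with the observation that a pure PP output fluctuates strongly under localization noise.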

Fig. 9 Effects of location noise on PP and deep PP

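4 Conclusion

This paper proposes a steering model based on a fusion neural network that achieves stable, human-like tracking: 1) the model is built on a neural network architecture and fuses a classical steering model with the behavior-reflex approach; 2) a joint feature vector integrates the steering results of the PP controller with image features; 3) training and test datasets are built from simulation and from real-vehicle, real-scene trials; 4) the experiments show that deep PP has clear advantages over a CNN alone and over the classical control method in tracking performance, environmental adaptability, and noise handling.

Deep PP nevertheless leaves room for improvement: 1) it uses only the current steering information and ignores the influence of historical spatio-temporal information, such as past steering and vehicle speed, on the current steering; 2) only a typical VGG-type network and a typical pure pursuit controller are fused to verify performance and robustness, and other network architectures and classical controllers have not been fused and tested; 3) deep PP is validated only on data collected on our autonomous driving experimental platforms, because no public dataset meeting its requirements could be found; 4) it is tested only on offline datasets, not in closed loop in real driving. Future work will therefore build new fusion networks to improve deep PP further, focus on real-time control with deep PP in real driving scenes to test its real-time performance and robustness, and fuse several end-to-end neural networks with classical steering controllers to verify the soundness of the fusion framework.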

References

Bojarski M, Testa D D, Dworakowski D, Firner B, Flepp B, Goyal P, Jackel L D, Monfort M, Muller U, Zhang J K, Zhang X, Zhao J and Zieba K. 2016. End to end learning for self-driving cars[EB/OL].[2020-07-02]. https://arxiv.org/pdf/1604.07316.pdf

Chen Y P, Shan Y X, Chen L, Huang K and Cao D P. 2018. Optimization of pure pursuit controller based on PID controller and low-pass filter//Proceedings of the 21st International Conference on Intelligent Transportation Systems. Maui, USA: IEEE: 3294-3299[DOI:10.1109/ITSC.2018.8569416]

Dosovitskiy A, Ros G, Codevilla F, López A and Koltun V. 2017. CARLA: an open urban driving simulator[EB/OL].[2020-07-02]. https://arxiv.org/pdf/1711.03938.pdf

Eraqi H M, Moustafa M N and Honer J. 2017. End-to-end deep learning for steering autonomous vehicles considering temporal dependencies[EB/OL].[2020-07-02]. https://arxiv.org/pdf/1710.03804.pdf

Fernando T, Denman S, Sridharan S and Fookes C. 2017. Going deeper: autonomous steering with neural memory networks//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 214-221[DOI:10.1109/ICCVW.2017.34]

Kuwata Y, Teo J, Fiore G, Karaman S, Frazzoli E and How J P. 2009. Real-time motion planning with applications to autonomous urban driving. IEEE Transactions on Control Systems Technology, 17(5): 1105-1118 [DOI:10.1109/TCST.2008.2012116]

Ma C J, Li F, Liao C L and Wang L F. 2017. Path following based on model predictive control for automatic parking system//SAE Technical Paper. Kunshan: SAE[DOI:10.4271/2017-01-1952]

Riofrio A, Sanz S, Boada M J L and Boada B L. 2017. A LQR-based controller with estimation of road bank for improving vehicle lateral and rollover stability via active suspension. Sensors, 17(10): #2318 [DOI:10.3390/s17102318]

Shan Y X, Guo X M, Long J Y, Cai B B and Li B J. 2018. Asymptotically sampling-based algorithm with applications to autonomous urban driving on structured road. China Journal of Highway and Transport, 31(4): 192-201 (in Chinese)

Shan Y X, Yang W, Chen C, Zhou J, Zheng L and Li B J. 2015. CF-pursuit: a pursuit method with a clothoid fitting and a fuzzy controller for autonomous vehicles. International Journal of Advanced Robotic Systems, 12(9): 1-13 [DOI:10.5772/61391]

Snider J M. 2009. Automatic Steering Methods for Autonomous Automobile Path Tracking[EB/OL].[2020-07-02]. https://www.ri.cmu.edu/publications/automatic-steering-methods-for-autonomous-automobile-path-tracking/

Tian Y C, Pei K X, Jana S and Ray B. 2018. DeepTest: automated testing of deep-neural-network-driven autonomous cars//Proceedings of the 40th International Conference on Software Engineering. New York, USA: Association for Computing Machinery: 303-314[DOI:10.1145/3180155.3180220]

Zhao P, Chen J J, Song Y, Tao X, Xu T J and Mei T. 2012. Design of a control system for an autonomous vehicle based on adaptive-PID. International Journal of Advanced Robotic Systems, 9(2): #44 [DOI:10.5772/51314]