Multi-scenario lane line detection with auxiliary loss
2020, Vol. 25, No. 9, Pages: 1882-1893
Received: 2020-01-03; Revised: 2020-03-09; Accepted: 2020-03-16; Published in print: 2020-09-16
DOI: 10.11834/jig.190646
Objective
To address the poor real-time performance and accuracy of lane line detection caused by multi-scenario environmental factors, such as object occlusion, illumination changes, and shadow interference, during real-time vehicle driving, a lane line detection model that introduces an auxiliary loss is proposed.
Method
The model improves the efficient residual factorized network (ERFNet) by adding a lane prediction branch and an auxiliary training branch after the ERFNet encoder, so that the decoding stage runs in parallel with the lane prediction and auxiliary training branches. After the convolution layer of the auxiliary training branch, bilinear interpolation is used to match the resolution of the input image, and the four lane lines and the image background are classified. The auxiliary loss is computed and back-propagated with a certain weight together with the semantic segmentation loss and the lane prediction loss, which effectively mitigates the vanishing gradient problem. Semantic segmentation yields a probability distribution map for each lane line; on each map, the coordinates of the maximum point whose probability exceeds a specific threshold are found row by row, and the corresponding coordinate points are selected according to certain rules to form the fitted lane line.
Result
In tests on the CULane public dataset, the F1 score of the model in the normal scenario is 91.85%, a 1.25% improvement over the spatial convolutional neural network (SCNN) model, with improvements of 1%~7% in the other scenarios. The average F1 score over the nine scenarios is 73.76%, 1.96% higher than that of the currently best model, ResNet-101-self attention distillation (R-101-SAD) (71.80%). Tested on a single GPU, the average running time per image is reduced to 1/13 of the original, and the number of model parameters is reduced to 1/10 of the original. Compared with ENet-self attention distillation (ENet-SAD), the lane line detection model with the shortest average running time, the average running time per image is reduced by 2.3 ms.
Conclusion
In complex multi-scenario conditions such as object occlusion, illumination changes, and shadow interference, the proposed model offers high accuracy and good real-time performance for real-time vehicle driving.
Objective
In a real-time driving process, the vehicle must be positioned to complete the basic tasks of horizontal and vertical control. The premise of vehicle positioning is understanding road information. Road information includes all kinds of traffic signs, among which the lane line is an important piece of pavement information in the road scene. Such information is crucial for lane keeping, departure warning, and path planning; it is also important in research on advanced driving assistance systems. Therefore, lane line detection has become an important topic in real-time vehicle driving. Road scene images can be obtained using a vehicle camera, lidar, and other equipment, thus making it easy to obtain lane line images. However, lane line detection suffers from some difficulties. Traditional lane line detection methods usually design features manually. Starting from low-level features such as color, brightness, shape, and gray level, these methods process images via denoising, binarization, and graying. Then, the lane line features are extracted by combining edge detection, Hough transform, color threshold setting, perspective transform, and other methods. Afterward, the lane lines are fitted by straight-line or curve models. These methods are simple and easy to implement, but the accuracy of lane line detection is poor under the influence of multi-scene environmental conditions, such as object occlusion, light change, and shadow interference; moreover, the manual design of features is time consuming and thus cannot meet the real-time requirements of vehicle driving. To solve these problems, this study proposes a lane detection model named efficient residual factorized network-auxiliary loss (ERFNet-AL), which embeds an auxiliary loss.
Method
The model improves the ERFNet semantic segmentation network. After the encoder of ERFNet, a lane prediction branch and an auxiliary training branch are added so that the decoding phase runs in parallel with the lane prediction and auxiliary training branches. After the convolution layer of the auxiliary training branch, bilinear interpolation is used to match the resolution of the input images and to classify the four lane lines and the image background. The training set images are fed to the lane line detection model after preprocessing, such as clipping, rotating, scaling, and normalization. Features are extracted through the semantic segmentation network ERFNet, yielding the probability distribution of each lane line. The auxiliary training branch uses convolution operations to extract features, and bilinear interpolation replaces the deconvolution layer after the convolution layer to match the resolution of the input images and classify the four lane lines and the background. After convolution, batch normalization, dropout, and other operations, the lane prediction branch predicts the existence of real or virtual lane lines and outputs the probability of each lane line class; an output probability greater than 0.5 indicates that the corresponding lane line exists.
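For concreteness, a minimal PyTorch-style sketch of the two added branches is given below. The channel counts, dropout rate, and layer arrangement are illustrative assumptions rather than the paper's exact configuration; only the use of a convolution followed by bilinear interpolation (auxiliary branch), the 4-lanes-plus-background classification, and the 0.5 existence threshold come from the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxiliaryBranch(nn.Module):
    """Auxiliary training branch: a convolution followed by bilinear
    upsampling to the input resolution, classifying 4 lanes + background.
    in_channels and the 1x1 kernel are assumptions for illustration."""
    def __init__(self, in_channels=128, num_classes=5):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, feats, input_size):
        logits = self.conv(feats)
        # Bilinear interpolation in place of a deconvolution layer.
        return F.interpolate(logits, size=input_size,
                             mode='bilinear', align_corners=False)

class LanePredictionBranch(nn.Module):
    """Lane existence branch: conv + batch norm + dropout, then a small
    classifier that outputs one existence probability per lane."""
    def __init__(self, in_channels=128, num_lanes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Dropout2d(0.1),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, num_lanes)

    def forward(self, feats):
        x = self.features(feats).flatten(1)
        return torch.sigmoid(self.fc(x))  # > 0.5 means the lane exists
```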
If at least one lane line exists, the probability distribution map of each existing lane line is examined: the coordinates of the maximum point with a probability greater than a specific threshold are identified row by row, and the corresponding coordinate points are selected in accordance with the point-selection rules of the SCNN (spatial convolutional neural network) model. If more than two points are found, these points are connected to form a fitted lane line.
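As a small illustration of this row-wise point selection, the sketch below is a simplified stand-in for the SCNN selection rules; the threshold value and the number of sampled rows are assumptions.

```python
import numpy as np

def fit_lane_points(prob_map, threshold=0.5, num_rows=18):
    """Row-wise point extraction from one lane's probability map (H x W):
    in each sampled row, take the max-probability point and keep it only
    if its probability exceeds the threshold."""
    h, w = prob_map.shape
    points = []
    for y in np.linspace(0, h - 1, num_rows).astype(int):
        row = prob_map[y]
        x = int(row.argmax())      # max-probability point in this row
        if row[x] > threshold:     # keep it only if confident enough
            points.append((x, y))
    # Connect the points into a lane only if more than two were found.
    return points if len(points) > 2 else []
```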
During training, the cross-entropy loss between the predictions of the auxiliary training branch and the ground-truth labels is used as the auxiliary loss; each of the four lane lines has a weight of 1, and the background has a weight of 0.4. The auxiliary, semantic segmentation, and lane prediction losses are summed with certain weights, and the network parameters are updated via backpropagation, so the total loss consists of the main (segmentation), auxiliary, and lane prediction losses. The weights of ERFNet pretrained on the Cityscapes dataset are used to initialize training, and the model with the largest mean intersection over union is kept as the best model.
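A hedged sketch of how the three losses might be combined is shown below. Only the class weights (1 per lane, 0.4 for the background) come from the paper; the class ordering and the branch weights aux_weight and exist_weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Class weights: background first (assumed ordering), then the 4 lanes.
class_weights = torch.tensor([0.4, 1.0, 1.0, 1.0, 1.0])
seg_criterion = nn.CrossEntropyLoss(weight=class_weights)   # main loss
aux_criterion = nn.CrossEntropyLoss(weight=class_weights)   # auxiliary loss
exist_criterion = nn.BCELoss()                              # lane prediction loss

def total_loss(seg_logits, aux_logits, exist_probs,
               seg_target, exist_target,
               aux_weight=0.4, exist_weight=0.1):
    """Weighted sum of the three losses; aux_weight and exist_weight
    are illustrative values, not the paper's exact settings."""
    loss = seg_criterion(seg_logits, seg_target)
    loss = loss + aux_weight * aux_criterion(aux_logits, seg_target)
    loss = loss + exist_weight * exist_criterion(exist_probs, exist_target)
    return loss
```

Because the auxiliary branch is supervised at full resolution alongside the decoder, its gradient flows directly into the encoder, which is what mitigates the vanishing gradient problem described above.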
Result
After testing on the nine scenarios of the CULane public dataset, the F1 score of the model in the normal scenario is 91.85%, a 1.25% increase compared with that of the SCNN model (90.6%). Moreover, the F1 scores in seven scenarios, including crowded, night, no line, shadow, arrow, dazzle light, and curve, increase by 1%~7%. The overall average F1 score over the nine scenarios is 73.76%, which is 1.96% higher than that of the best ResNet-101-self attention distillation (R-101-SAD) model. The average running time per image is 11.1 ms, 11 times shorter than that of the SCNN model when tested on a single GeForce GTX 1080 GPU, and the model has only 2.49 MB of parameters, 7.3 times fewer than the SCNN model. On the CULane dataset, ENet with SAD (ENet-SAD) is the lane line detection model with the shortest average single-image running time, 13.4 ms, whereas our model takes 11.1 ms; compared with ENet-SAD, the average running time is thus reduced by 2.3 ms. When detecting lane lines in the crossroad scenario, the number of false positives is large, which may be because crossroads contain many lane lines, whereas only four lane lines are detected in our experiment.
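For reference, the F1 score reported here is the harmonic mean of precision and recall over matched lane predictions; a minimal sketch follows, assuming the IoU-based matching of predicted and ground-truth lanes that produces the counts happens upstream and is not shown.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```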
Conclusion
In various complex scenarios, such as object occlusion, lighting changes, and shadow interference, the model is minimally affected by the environment for real-time driving vehicles, and its accuracy and real-time performance are improved. Future work will aim to increase the number of detected lane lines, optimize the model, and improve its detection performance at crossroads.
References
Badrinarayanan V, Kendall A and Cipolla R. 2017. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12): 2481-2495 [DOI: 10.1109/TPAMI.2016.2644615]
Chen C Y. 2018. Research of highway rainy weather detection based on deep semantic segmentation. Chengdu: Southwest Jiaotong University
Chen H S, Yao M H, Chen Z H and Yang Z. 2018. Efficient method of lane detection based on multi-frame blending and windows searching. Computer Science, 45(10): 255-260
Chen Z and Chen Z J. 2017. RBNet: a deep neural network for unified road and road boundary detection//Proceedings of the 24th International Conference on Neural Information Processing. Guangzhou: Springer: 677-687 [DOI: 10.1007/978-3-319-70087-8_70]
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S and Schiele B. 2016. The Cityscapes dataset for semantic urban scene understanding//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE: 3213-3223 [DOI: 10.1109/CVPR.2016.350]
Csurka G and Perronnin F. 2011. An efficient approach to semantic segmentation. International Journal of Computer Vision, 95(2): 198-212 [DOI: 10.1007/s11263-010-0344-8]
Gao F, Mei K C, Gao Y, Lu S F and Xiao G. 2016. Algorithm of intersection background extraction and driveway calibration. Journal of Image and Graphics, 21(6): 734-744 [DOI: 10.11834/jig.20160606]
Garcia-Garcia A, Orts-Escolano S, Oprea S, Villena-Martinez V and Garcia-Rodriguez J. 2017. A review on deep learning techniques applied to semantic segmentation [EB/OL]. [2020-01-03]. https://arxiv.org/pdf/1704.06857.pdf
Ghafoorian M, Nugteren C, Baka N, Booij O and Hofmann M. 2018. EL-GAN: embedding loss driven generative adversarial networks for lane detection//Proceedings of 2018 European Conference on Computer Vision. Munich: Springer: 256-272 [DOI: 10.1007/978-3-030-11009-3_15]
Hou Y N, Ma Z, Liu C X and Loy C C. 2019. Learning lightweight lane detection CNNs by self attention distillation//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE: 1013-1021 [DOI: 10.1109/ICCV.2019.00110]
Jiang L B and Tai Q L. 2019. The lane line detection in complex scene based on instance segmentation. Machine Design and Manufacturing Engineering, 48(5): 113-118 [DOI: 10.3969/j.issn.2095-509X.2019.05.027]
Ling S Y, Ma Y, Huang C R and Zhai W L. 2017. Research on improved urban environment road detection algorithm based on Hough transform. Machine Design and Manufacturing Engineering, 46(12): 71-75 [DOI: 10.3969/j.issn.2095-509X.2017.12.017]
Pan X G, Shi J P, Luo P, Wang X G and Tang X O. 2018. Spatial as deep: spatial CNN for traffic scene understanding [EB/OL]. [2020-01-03]. https://arxiv.org/pdf/1712.06080.pdf
Romera E, Álvarez J M, Bergasa L M and Arroyo R. 2017. Efficient ConvNet for real-time semantic segmentation [EB/OL]. [2020-01-03]. https://arxiv.org/pdf/1712.06080.pdf
Romera E, Álvarez J M, Bergasa L M and Arroyo R. 2018. ERFNet: efficient residual factorized ConvNet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 19(1): 263-272 [DOI: 10.1109/TITS.2017.2750080]
Shi Y T. 2018. Research on semantic segmentation of road scene based on deep neural networks. Chengdu: Southwest Jiaotong University
Tian X, Wang L and Ding Q. 2019. Review of image semantic segmentation based on deep learning. Journal of Software, 30(2): 440-468 [DOI: 10.13328/j.cnki.jos.005659]
Trinh T H, Dai A M, Luong M T and Le Q V. 2018. Learning longer-term dependencies in RNNs with auxiliary losses [EB/OL]. [2020-01-03]. https://arxiv.org/pdf/1803.00144.pdf
Xing Y, Lv C, Chen L, Wang H J, Wang H, Cao D P, Velenis E and Wang F Y. 2018. Advances in vision-based lane detection: algorithms, integration, assessment, and perspectives on ACP-based parallel vision. IEEE/CAA Journal of Automatica Sinica, 5(3): 645-661 [DOI: 10.1109/JAS.2018.7511063]
Zhao W M and Zhang H W. 2017. A research on lane maintenance assist system based on machine vision. Digital Technology and Application, (11): 63-64 [DOI: 10.19695/j.cnki.cn12-1369.2017.11.035]
Zhu W, Qu J Y and Wu R B. 2017. Straight convolutional neural networks algorithm based on batch normalization for image classification. Journal of Computer-Aided Design and Computer Graphics, 29(9): 1650-1657 [DOI: 10.3969/j.issn.1003-9775.2017.09.008]