A double-view point cloud feature fusion method for semantic segmentation of large-scale scenes

Sun Liujie, Zeng Tengfei, Fan Jingxing, Wang Wenju (College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai 200093, China)

Abstract
Objective Point cloud semantic segmentation is of great importance in fields such as autonomous driving and urban scene modeling. To improve the efficiency of point cloud feature extraction in large-scale scenes, a double-view feature fusion network for LiDAR semantic segmentation (DVFNet) is proposed. Method The method consists of two parts: a double-view point cloud feature fusion module and a point cloud feature integration module based on asymmetric convolution. The double-view feature fusion module combines cylindrical voxel features with the global features of key points, reducing the feature loss caused by downsampling. The feature integration module processes the double-view point cloud features with asymmetric convolution, then applies multi-dimensional convolution and multi-scale feature integration to optimize local features. Result The proposed method achieves 63.9% accuracy (mIoU) on the SemanticKITTI large-scale point cloud dataset, leading the open-source segmentation methods in segmentation precision. Conclusion The proposed double-view point cloud feature fusion method enables high-precision semantic segmentation of point cloud data in large-scale scenes.
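The feature-splicing step at the heart of the double-view fusion module can be sketched in a few lines: per-point features from the two views are concatenated along the channel axis and then reduced by a fusion block. The shapes, the random placeholder weights, and the single linear + ReLU fusion block below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N points, C1-dim cylindrical-voxel features and
# C2-dim key-point global features per point (names are illustrative).
N, C1, C2, C_OUT = 1024, 64, 32, 64

voxel_feat = rng.standard_normal((N, C1))   # view 1: cylindrical voxel features
global_feat = rng.standard_normal((N, C2))  # view 2: key-point global features

# Feature splicing: concatenate the two views along the channel axis.
fused = np.concatenate([voxel_feat, global_feat], axis=1)  # shape (N, C1 + C2)

# Feature fusion block, stood in for here by one learned linear projection
# (random placeholder weights) that reduces the spliced features to C_OUT
# channels, followed by a ReLU.
W = rng.standard_normal((C1 + C2, C_OUT)) * 0.01
fused_reduced = np.maximum(fused @ W, 0.0)  # shape (N, C_OUT)

print(fused.shape, fused_reduced.shape)
```

In a real network the fusion block would be a trained layer (or small MLP) rather than random weights; the sketch only shows how the two views meet and how the channel dimension is brought back down.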
Keywords
Double-view feature fusion network for LiDAR semantic segmentation

Sun Liujie, Zeng Tengfei, Fan Jingxing, Wang Wenju (College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai 200093, China)

Abstract
Objective Point cloud semantic segmentation, as the technology underlying 3D point cloud target detection, point cloud classification, and other tasks, is an important part of current 3D computer vision. Point cloud segmentation is also key to scene understanding by computers and has been widely used in fields such as autonomous driving, robotics, and augmented reality. Point cloud semantic segmentation refers to the point-by-point classification of a point cloud scene, that is, judging the category of each point and then segmenting and integrating the cloud accordingly. In general, point cloud semantic segmentation can be divided into two categories according to the application scenario: small-scale and large-scale point cloud semantic segmentation. Small-scale point cloud semantic segmentation operates only on indoor or other small point cloud scenes, whereas large-scale point cloud semantic segmentation moves the algorithm's deployment environment to outdoor large-scale point cloud data, typically classifying and integrating points in driving or urban scenes. Compared with the semantic segmentation of small scenes, large-scale point cloud semantic segmentation has a wider range of applications and is used extensively in driving scene understanding, urban scene reconstruction, and other fields. However, owing to the large amount and the complexity of the data, semantic segmentation of point clouds in large scenes is more difficult. To improve the quality of point feature extraction in large-scale point clouds, a semantic segmentation method based on a double-view feature fusion network for LiDAR semantic segmentation is proposed.
Method Our method is composed of two parts: a double-view feature fusion module and a feature integration module based on asymmetric convolution. In the downsampling stage, a double-view feature fusion module, which includes a double-view point cloud feature extraction module and a feature fusion block, is proposed. The double-view feature fusion module combines cylindrical voxel features with the global features of key points to reduce the feature loss caused by downsampling. The features from the different views of the point cloud are combined by feature splicing in this module. Finally, the combined point cloud features are passed to the feature fusion block for feature dimensionality reduction and fusion. In the feature integration stage, a point cloud feature integration module based on asymmetric convolution is proposed, including asymmetric convolution and multi-scale dimension-decomposition context modeling; it enhances and reconstructs point cloud features through asymmetric feature processing and multi-scale context feature integration. The integration module processes the double-view features with asymmetric convolution and then uses multi-dimensional convolution and multi-scale feature integration for feature optimization. Result In our experimental environment, our algorithm achieves the second-highest frequency-weighted intersection over union and the highest mean intersection over union (mIoU) among recent algorithms. Our work focuses on improving segmentation accuracy and achieves the highest accuracy in multiple categories. In vehicle categories such as cars and trucks, our method achieves high segmentation accuracy. In categories with small, complexly shaped instances, such as bicycles, motorcycles, and pedestrians, our method performs better than other methods.
In buildings, railings, vegetation, and other categories that lie at the edge of the point cloud scene, where the point cloud distribution is relatively sparse, the double-view feature fusion module in our method not only retains the geometric structure of the point cloud but also extracts the global features of the data, thereby achieving high-precision segmentation of these categories. Our method achieves 63.9% mIoU on the SemanticKITTI dataset and leads the open-source segmentation methods. Compared with CA3DCNet, our method also achieves better segmentation results on the nuScenes dataset, improving mIoU accuracy by 0.7%. Conclusion Our method achieves high-precision semantic segmentation for large-scale point clouds. A double-view feature fusion network is proposed for LiDAR semantic segmentation that is suitable for segmenting large-scale point clouds. Experimental results show that the double-view feature fusion module can reduce the loss of edge information in a large-scale point cloud, thereby improving the segmentation accuracy of edge objects in the scene. Experiments also prove that the feature integration module based on asymmetric convolution can effectively segment small objects such as pedestrians, bicycles, and motorcycles. Our method is compared with a variety of semantic segmentation methods for large-scale point clouds; in terms of accuracy, it performs better, achieving an average segmentation accuracy of 63.9% on the SemanticKITTI dataset.
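As a rough illustration of the asymmetric convolution idea used in the feature integration stage, the sketch below (plain NumPy, single channel, kernel size assumed to be 3) shows one common decomposition: a vertical 3×1 strip followed by a horizontal 1×3 strip reproduces a full 3×3 convolution with a separable kernel at lower parameter cost. This is a generic sketch of kernel decomposition, not the network's actual layer:

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2D cross-correlation on a single-channel map."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(1)
feat = rng.standard_normal((16, 16))  # one channel of a projected feature map

# Asymmetric decomposition: a 3x1 and a 1x3 strip kernel (6 weights)
# in place of a full 3x3 kernel (9 weights).
k_v = rng.standard_normal((3, 1))
k_h = rng.standard_normal((1, 3))

# Applying the two strip convolutions in sequence equals a single 3x3
# convolution with the separable (outer-product) kernel k_v @ k_h.
out_asym = conv2d(conv2d(feat, k_v), k_h)
out_full = conv2d(feat, k_v @ k_h)
print(np.allclose(out_asym, out_full))  # prints True
```

Trained asymmetric-convolution blocks are not restricted to separable kernels (strip branches are often run in parallel with a square kernel and summed); the sketch only demonstrates why strip kernels can stand in for square ones cheaply.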
Keywords
