Sparse voxel pyramid neighborhood construction and classification of LiDAR point cloud

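Tao Shuaibing1,2,3, Liang Chong1,2,3, Jiang Tengping4, Yang Yujiao1,2,3, Wang Yongjun1,2,3 (1. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023; 2. Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023; 3. State Key Laboratory Cultivation Base of Geographical Environment Evolution, Nanjing Normal University, Nanjing 210023; 4. State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072)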

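Abstract
Objective Among the stages of point cloud classification, the key is to describe the local neighborhood structure of the point cloud accurately and to extract a feature set with strong expressive power. To overcome the limited expressive power of single-scale features in traditional neighborhood structures and the computational complexity of multiscale features, this paper proposes a sparse voxel pyramid neighborhood structure for LiDAR point cloud classification, together with a corresponding classification method. Method A sparse voxel pyramid is built by downsampling the original data at different scales, multiscale features are extracted from the pyramid, and a random forest classifier performs the initial classification. An undirected graph is then constructed, the weights of the edges connecting neighboring points are computed with the histogram intersection kernel, and the classification result is optimized with a multilabel graph cut algorithm. As the receptive field of the voxel pyramid grows, the density of neighboring points decreases with their distance from the center point, which effectively reduces the amount of computation. Result Experiments were carried out on the ground-based Semantic3D dataset, mobile (vehicle-borne) point cloud data, and airborne point cloud data. The results show that, while reducing computational complexity, the classification precision, accuracy, and robustness of the proposed method rank among the best of comparable algorithms, which verifies the effectiveness of the framework as a basic framework for point cloud classification. Conclusion Compared with similar methods, the multiscale features extracted by the proposed method preserve the local structural information of points while better accounting for larger-scale structural features of the point cloud, thereby improving classification accuracy.
Keywords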
Sparse voxel pyramid neighborhood construction and classification of LiDAR point cloud

Tao Shuaibing1,2,3, Liang Chong1,2,3, Jiang Tengping4, Yang Yujiao1,2,3, Wang Yongjun1,2,3(1.Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China;2.Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China;3.State Key Laboratory Cultivation Base of Geographical Environment Evolution, Nanjing Normal University, Nanjing 210023, China;4.State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China)

Abstract
Objective Point cloud classification is one of the hotspots of computer vision research. Among the various processing stages, accurately describing the local neighborhood structure of the point cloud and extracting point cloud feature sets with strong expressive ability are the key to point cloud classification. Traditionally, two methods can be used for modeling the neighborhood structure of point clouds: single-scale description and multiscale description. The former has limited expressive ability, whereas the latter has strong descriptive ability but comes with high computational complexity. To solve these problems, this paper proposes a sparse voxel pyramid structure to express the local neighborhood structure of the point cloud and provides the corresponding point cloud classification and optimization method. Method First, after a comparative analysis of related point cloud classification methods, the paper describes in detail the structure of the proposed sparse voxel pyramid, analyzes its advantages in expressing the neighborhood structure of the point cloud, and provides the method to express the local neighborhood of the point cloud with this structure. When calculating point features, the influence of candidate points on the local feature calculation results gradually decreases as their distance from the center point increases. Thus, a fixed number of neighbors is used to construct each layer of the sparse voxel pyramid. For each voxel, a sparse voxel pyramid of N layers is constructed, and the voxel radius of the 0th layer is set to R. The value of N can be set according to the available computing power. R is the smallest voxel radius in the entire voxel pyramid, and its size can be set according to the point cloud density and range of the scene. The voxel radius of each subsequent layer of the pyramid is twice that of the previous layer.
The voxel radius of the Nth layer is therefore $2^N R$, and each layer contains the same number K of voxels. The original point cloud is downsampled at these voxel sizes to form the sparse voxel pyramid, and for each point of the original point cloud a spatial K-nearest-neighbor index is built in the voxelized point cloud of every scale. In this way, the multiscale neighborhood of the center point is determined by a single fixed K value. Near the center point, the neighborhood keeps the original point density; as the distance from the center point increases, the neighboring points become sparser. Based on the sparse voxel pyramid structure, the local neighborhoods of each point at the different scales are used to extract eigenvalue-based features, neighborhood geometric features, projection features, and fast point feature histogram (FPFH) features. The single-point features are then aggregated, a random forest is used for supervised classification, and the multilabel graph cut method is used to optimize the classification results. After the FPFH feature of each point is calculated, the histogram intersection kernel is used to calculate the edge potential between neighboring points. Result Three public datasets are selected to verify the effectiveness of the method: the ground-based Semantic3D dataset, airborne LiDAR data acquired by the ALTM 2050 airborne laser scanning system in different regions, and a mobile LiDAR point cloud of the main entrance of the Technical University of Munich campus on Arcisstrasse. The evaluation indicators used in the experiments are accuracy, recall, and F1 value. Because the datasets differ in point cloud density and coverage, sparse voxel pyramids of different scales are used for feature extraction and feature vector aggregation.
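The pyramid construction described in the Method part can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the cubic voxel edge length stands in for the voxel radius, each occupied voxel is represented by the centroid of its points, and the K-nearest-neighbor search is brute force rather than an index structure.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same cubic voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel coordinates of the point (assumption: axis-aligned grid at the origin).
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]


def build_pyramid(points, base_size, num_layers):
    """Layer l holds the cloud downsampled with voxel size base_size * 2**l, l = 0..num_layers."""
    return [voxel_downsample(points, base_size * 2 ** layer)
            for layer in range(num_layers + 1)]


def pyramid_neighborhood(center, pyramid, k):
    """Return the k voxel centroids closest to `center` in every layer (brute force)."""
    return [sorted(layer, key=lambda q: math.dist(center, q))[:k]
            for layer in pyramid]


if __name__ == "__main__":
    # Toy cloud: an 8 x 8 planar grid with unit spacing.
    cloud = [(i + 0.5, j + 0.5, 0.0) for i in range(8) for j in range(8)]
    pyramid = build_pyramid(cloud, base_size=1.0, num_layers=2)
    center = (3.5, 3.5, 0.0)
    for layer, nbrs in enumerate(pyramid_neighborhood(center, pyramid, k=4)):
        reach = max(math.dist(center, q) for q in nbrs)
        print(f"layer {layer}: {len(pyramid[layer])} voxels, neighborhood reach {reach:.2f}")
```

With the same K in every layer, the coarser layers cover a wider area around the center point with sparser samples, which is the property the abstract relies on to limit the cost of multiscale features.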
With the proposed method, the overall classification accuracy reaches 89% on the ground-based Semantic3D dataset, 96% on the airborne LiDAR dataset, and 89% on the mobile LiDAR dataset. The experimental results show that, compared with the other methods tested, the proposed multiscale features based on sparse voxel pyramids express the local structure of point clouds more effectively and robustly. As the receptive field of the voxel pyramid increases, the density of neighboring points decreases with the distance from the center point, which effectively reduces the amount of calculation. In addition, the histogram intersection kernel is applied to the fast point feature histogram features of adjacent points to measure the difference between them, and the result is used as the edge weight in the graph model to improve the optimization. This is more accurate than the traditional approach of using Euclidean distance as the weight. The multilabel graph cut method further refines the single-point classification results and corrects errors such as vegetation misclassified as buildings and vice versa. In areas with strongly undulating natural terrain, the classification accuracy of terrain and low vegetation is strongly affected by the terrain relief, and misclassification occurs easily. By contrast, the classification of taller objects such as tall vegetation, buildings, and pedestrians is less affected by the terrain and is more accurate.
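The edge weighting mentioned above can be sketched as follows. The histogram intersection kernel itself is standard (the sum of bin-wise minima), but the helper names, the L1 normalization, and the toy 4-bin histograms are assumptions made for illustration; real fast point feature histograms have more bins, and the abstract does not specify how the weight enters the graph cut energy.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (left unchanged if it is empty)."""
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else list(hist)


def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima.

    For L1-normalized histograms the value lies in [0, 1]; 1 means identical.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def edge_weights(histograms, edges):
    """Weight each undirected edge (i, j) by the similarity of its endpoint histograms."""
    return {(i, j): histogram_intersection(histograms[i], histograms[j])
            for i, j in edges}


if __name__ == "__main__":
    # Toy descriptors for three points; points 0 and 1 are similar, point 2 differs.
    hists = [normalize(h) for h in ([2, 2, 0, 0], [2, 1, 1, 0], [0, 0, 1, 3])]
    for edge, weight in edge_weights(hists, [(0, 1), (1, 2), (0, 2)]).items():
        print(edge, weight)
```

Unlike a Euclidean distance between point positions, this weight depends on how similar the local geometry around the two endpoints is, so an edge between two nearby points with different surface shapes can still receive a low weight.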
Conclusion Overall, compared with other similar and more advanced methods, the multiscale features extracted by the proposed method maintain the local structure information while considering a larger range of point cloud structure information, thereby improving the point cloud classification accuracy.
Keywords
