Published: 2021-11-16  DOI: 10.11834/jig.200262  2021 | Volume 26 | Number 11  Image Understanding and Computer Vision

Received: 2020-06-09; Revised: 2020-09-27; Preprint: 2020-10-04. About the authors: Tao Shuaibing, born in 1994, male, master's student; research interests: LiDAR point cloud data processing and machine learning (E-mail: shuaibingtao@163.com). Liang Chong, male, master's student; research interest: deep-learning-based LiDAR point cloud classification (E-mail: 947192871@qq.com). Jiang Tengping, male, Ph.D. student; research interests: LiDAR point cloud classification and feature extraction (E-mail: 331217972@qq.com). Yang Yujiao, female, master's student; research interest: 3D spatial data models (E-mail: 1264640726@qq.com). Wang Yongjun, male, professor; research interests: LiDAR point cloud data processing and multidimensional data integration modeling. *Corresponding author: Wang Yongjun, wangyongjun@njnu.edu.cn. CLC number: P23. Document code: A. Article ID: 1006-8961(2021)11-2703-10


Sparse voxel pyramid neighborhood construction and classification of LiDAR point cloud
Tao Shuaibing1,2,3, Liang Chong1,2,3, Jiang Tengping4, Yang Yujiao1,2,3, Wang Yongjun1,2,3
1. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China;
2. Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China;
3. State Key Laboratory Cultivation Base of Geographical Environment Evolution, Nanjing Normal University, Nanjing 210023, China;
4. State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430072, China
Supported by: National Natural Science Foundation of China (41771439); National Key Research and Development Program of China (2016YFB0502304); Key Laboratory Project of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources (KF-2018-03-070)

# Abstract

Objective Point cloud classification is one of the hotspots of computer vision research. Among the various processing stages, accurately describing the local neighborhood structure of the point cloud and extracting point cloud feature sets with strong expressive ability is key to point cloud classification. Traditionally, two approaches are used to model the neighborhood structure of point clouds: single-scale description and multiscale description. The former has limited expressive ability, whereas the latter describes the structure well but comes with high computational complexity. To solve these problems, this paper proposes a sparse voxel pyramid structure to express the local neighborhood structure of the point cloud and provides the corresponding point cloud classification and optimization method. Method First, after a comparative analysis of related point cloud classification methods, the paper describes in detail the structure of the proposed sparse voxel pyramid, analyzes its advantages in expressing the neighborhood structure of the point cloud, and presents the method for expressing the local neighborhood of the point cloud with this structure. When calculating point features, the influence of candidate points on the local feature calculation results gradually decreases as their distance increases. Thus, a fixed number of neighbors is used to construct each layer of the sparse voxel pyramid. For each point, a sparse voxel pyramid of N layers is constructed, and the voxel radius of the 0th layer is set to R. The value of N can be set according to the computing power of the hardware, and R, the smallest voxel radius in the entire pyramid, can be set according to the point cloud density and the extent of the scene. The voxel radius of each subsequent layer of the pyramid is twice that of the previous layer.
The voxel radius of the Nth layer is therefore 2^N·R, and each layer contains the same number K of voxels. The original point cloud is downsampled at these successive scales, and a spatial K-nearest-neighbor index is built for each point in the voxelized clouds of the different scales, forming the sparse voxel pyramid. This method can determine the multiscale neighborhood of a center point with only a single fixed K value. Near the center point, the point cloud keeps its original density; as the distance from the center point increases, the neighboring points become sparser. Based on the sparse voxel pyramid structure, the local neighborhoods constructed at different scales are used to extract eigenvalue-based features, neighborhood geometric features, projection features, and fast point feature histogram (FPFH) features of the corresponding points. The single-point features are then aggregated, the random forest method is used for supervised classification, and the multilabel graph cut method is applied to optimize the classification results. After the FPFH feature of each point is computed, the histogram intersection kernel is used to calculate the edge potential between neighboring points. Result Three public datasets are selected for the experiments to verify the effectiveness of the method: the terrestrial Semantic3D dataset, airborne LiDAR scanning data acquired by the ALTM 2050 airborne laser scanning system in different regions, and a mobile LiDAR point cloud dataset of the main entrance of the Technical University of Munich campus on Arcisstrasse. The evaluation indicators are precision, recall, and F1-score. Owing to differences in point cloud density and coverage, sparse voxel pyramids of different scales are used for feature extraction and feature vector aggregation.
With the proposed method, the overall classification accuracy reaches 89% on the terrestrial Semantic3D dataset, 96% on the airborne LiDAR dataset, and 89% on the mobile LiDAR dataset. Experimental results show that, compared with the other methods, the proposed multiscale features based on sparse voxel pyramids express the local structure of point clouds more effectively and robustly. As the receptive field of the voxel pyramid grows, the density of neighboring points decreases with distance from the center point, which effectively reduces the amount of computation. In addition, the fast point feature histogram is used to calculate the difference between adjacent points through the histogram intersection kernel, which serves as the edge weight in the graph model and improves the optimization effect; this is more accurate than the traditional approach of using Euclidean distance as the weight. The multilabel graph cut method further optimizes the single-point classification results and corrects errors such as vegetation misclassified as buildings and vice versa. In areas with large natural terrain undulations, the classification accuracy of terrain and low vegetation is strongly affected by the undulations and misclassification easily occurs, whereas taller objects such as high vegetation, buildings, and pedestrians are less affected by the terrain and are classified more accurately.
Conclusion Overall, compared with other similar and more advanced methods, the multiscale features extracted by the proposed method maintain the local structure information while considering a larger range of point cloud structure information, thereby improving the point cloud classification accuracy.

# Key words

point cloud classification; sparse voxel pyramid; multi-scale feature; multi-label graph cut; histogram intersection kernel

# 1.1 Multiscale point feature extraction based on the sparse voxel pyramid

Unlike the multiscale features proposed by Yang and Kang (2018), our method determines the multiscale neighborhood of a center point using only a single fixed $K$ value. Fig. 2 compares the multiscale spatial neighborhood constructed by the method of Yang and Kang (2018) with the neighborhood constructed by the proposed sparse voxel pyramid structure.
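As a concrete illustration of this construction, the sketch below builds the pyramid by voxel downsampling with a voxel size that doubles per layer, then gathers a fixed-$K$ neighborhood at every layer. This is a minimal NumPy sketch under our own simplifications, not the authors' implementation: the function names, the one-centroid-per-voxel representative, and the brute-force nearest-neighbor search are assumptions (the paper builds a proper spatial K-nearest-neighbor index).

```python
import numpy as np

def build_sparse_voxel_pyramid(points, n_layers=3, base_voxel=0.1):
    """Voxel-downsample the cloud once per layer; the voxel size doubles
    at every layer, so higher layers are progressively sparser."""
    layers = []
    for level in range(n_layers):
        size = base_voxel * (2 ** level)            # layer i uses voxel size 2^i * R
        keys = np.floor(points / size).astype(np.int64)
        _, inverse = np.unique(keys, axis=0, return_inverse=True)
        inverse = inverse.ravel()
        counts = np.bincount(inverse)
        centroids = np.empty((counts.size, 3))
        for dim in range(3):                        # one representative (centroid) per voxel
            centroids[:, dim] = np.bincount(inverse, weights=points[:, dim]) / counts
        layers.append(centroids)
    return layers

def multiscale_neighborhood(query, layers, k=8):
    """Collect the k nearest voxel representatives at every layer:
    dense near the query point, sparser with growing distance."""
    neighborhoods = []
    for layer in layers:
        dist = np.linalg.norm(layer - query, axis=1)
        neighborhoods.append(layer[np.argsort(dist)[:k]])  # brute-force KNN for the sketch
    return neighborhoods
```

Because each layer keeps one representative per occupied voxel, the same fixed $k$ covers a receptive field whose radius roughly doubles per layer while the per-layer search cost stays constant.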

# 2.1 Random forest classification results and analysis based on multiscale features

Table 1 Comparison of classification results at different scales on different datasets by our method

| Dataset | Precision (single-scale) | Precision (multiscale) | Recall (single-scale) | Recall (multiscale) | F1-score (single-scale) | F1-score (multiscale) |
| --- | --- | --- | --- | --- | --- | --- |
| A | 0.75 | 0.86 | 0.76 | 0.86 | 0.714 | 0.85 |
| B | 0.93 | 0.96 | 0.94 | 0.96 | 0.94 | 0.96 |
| C | 0.82 | 0.89 | 0.81 | 0.88 | 0.81 | 0.88 |
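The single-point classification stage compared in Table 1, concatenating per-scale features into one vector and feeding it to a random forest, can be sketched as follows. This is an illustrative sketch, not the paper's code: only three eigenvalue-based features (linearity, planarity, scattering) are computed per scale, whereas the paper also uses geometric, projection, and FPFH features, and all function names here are our own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def eigen_features(neighborhood):
    """Eigenvalue-based features of one local neighborhood, a (k, 3) array."""
    ev = np.linalg.eigvalsh(np.cov(neighborhood.T))[::-1]  # eigenvalues, descending
    ev = np.maximum(ev, 0.0)
    l1 = ev[0] + 1e-12
    return np.array([(ev[0] - ev[1]) / l1,   # linearity
                     (ev[1] - ev[2]) / l1,   # planarity
                     ev[2] / l1])            # scattering

def multiscale_vector(neighborhoods):
    """Aggregate single-scale features from every pyramid layer by concatenation."""
    return np.concatenate([eigen_features(nb) for nb in neighborhoods])

# Supervised classification of the aggregated vectors with a random forest:
# clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
```

The concatenated vector keeps the fine-scale local structure in its first entries while the later entries summarize progressively larger receptive fields.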

# 2.2 Optimization results and analysis of the multi-label graph cut

Table 2 The optimization results by multi-label graph-cut on Semantic3D dataset

| Label | Precision | Recall | F1-score |
| --- | --- | --- | --- |
| Man-made terrain | 0.81 | 0.90 | 0.86 |
| Natural terrain | 0.45 | 0.16 | 0.24 |
| High vegetation | 0.99 | 0.80 | 0.88 |
| Low vegetation | 0.68 | 0.91 | 0.78 |
| Buildings | 0.92 | 0.99 | 0.95 |
| Hardscape | 0.79 | 0.42 | 0.55 |
| Pedestrians | 0.98 | 0.59 | 0.73 |
| Vehicles | 0.90 | 0.90 | 0.90 |
| Average | 0.88 | 0.88 | 0.87 |
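The edge weights used in the multi-label graph cut above come from comparing the fast point feature histograms of neighboring points with the histogram intersection kernel. A minimal sketch of that kernel follows; the function names are ours and the FPFH computation itself is omitted.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima.
    Identical normalized histograms score 1; disjoint ones score 0."""
    return float(np.minimum(h1, h2).sum())

def edge_weight(hist_i, hist_j):
    """Similarity of two neighboring points' feature histograms, used as the
    pairwise edge weight in the graph model (higher = more similar)."""
    hi = hist_i / hist_i.sum()                 # normalize each histogram to sum to 1
    hj = hist_j / hist_j.sum()
    return histogram_intersection(hi, hj)
```

Unlike a Euclidean distance between coordinates, this weight reflects how similar the local surface descriptions of the two points are, so edges within one object stay strong while edges across object boundaries weaken.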

Table 3 Comparison of classification results on Semantic3D dataset among different methods

| Method | Accuracy |
| --- | --- |
| Neighborhood optimization (Weinmann et al., 2015b) | 0.742 |
| DNNSP (Wang et al., 2018b) | 0.893 |
| PointNet++ (Qi et al., 2017) | 0.857 |
| Ours | 0.880 |

# References

• Ali W, Abdelkarim S, Zahran M, Zidan M and Sallab A E. 2018. YOLO3D: end-to-end real-time 3D oriented object bounding box detection from LiDAR point cloud//Proceedings of 2018 European Conference on Computer Vision. Munich, Germany: Springer: #11131[DOI: 10.1007/978-3-030-11015-4_54]
• Bassier M, Van Genechten B, Vergauwen M. 2019. Classification of sensor independent point cloud data of building objects using random forests. Journal of Building Engineering, 21: 468-477 [DOI:10.1016/j.jobe.2018.04.027]
• Boykov Y, Veksler O, Zabih R. 2001. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11): 1222-1239 [DOI:10.1109/34.969114]
• Breiman L. 2001. Random forests. Machine Learning, 45(1): 5-32 [DOI:10.1023/A:1010933404324]
• Charles R Q, Su H, Kaichun M and Guibas L J. 2017. PointNet: deep learning on point sets for 3D classification and segmentation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 77-85[DOI: 10.1109/CVPR.2017.16]
• Chen D, Zhang L Q, Mathiopoulos T, Huang X F. 2014. A methodology for automated segmentation and reconstruction of urban 3-D buildings from ALS point clouds. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(10): 4199-4217 [DOI:10.1109/JSTARS.2014.2349003]
• Cheng M, Zhang H C, Wang C, Li J. 2017. Extraction and classification of road markings using mobile laser scanning point clouds. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(3): 1182-1196 [DOI:10.1109/JSTARS.2016.2606507]
• Demantké J, Mallet C, David N and Vallet B. 2011. Dimensionality based scale selection in 3D LiDAR point clouds//Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Calgary, Canada: ISPRS: 97-102[DOI: 10.5194/isprsarchives-XXXVIII-5-W12-97-2011]
• Dittrich A, Weinmann M, Hinz S. 2017. Analytical and numerical investigations on the accuracy and robustness of geometric features extracted from 3D point cloud data. ISPRS Journal of Photogrammetry and Remote Sensing, 126: 195-208 [DOI:10.1016/j.isprsjprs.2017.02.012]
• Filin S, Pfeifer N. 2005. Neighborhood systems for airborne laser data. Photogrammetric Engineering and Remote Sensing, 71(6): 743-755 [DOI:10.14358/PERS.71.6.743]
• Grauman K and Darrell T. 2005. The pyramid match kernel: discriminative classification with sets of image features//Proceedings of the 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE: 1458-1465[DOI: 10.1109/ICCV.2005.239]
• Grilli E, Menna F and Remondino F. 2017. A review of point clouds segmentation and classification algorithms//Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Nafplio, Greece: ISPRS: 339-344[DOI: 10.5194/isprs-archives-XLII-2-W3-339-2017]
• Guo B, Huang X F, Zhang F, Sohn G. 2015. Classification of airborne laser scanning data using JointBoost. ISPRS Journal of Photogrammetry and Remote Sensing, 100: 71-83 [DOI:10.1016/j.isprsjprs.2014.04.015]
• Guo Y L, Sohel F A, Bennamoun M, Wan J W and Lu M. 2013. RoPS: a local feature descriptor for 3D rigid objects based on rotational projection statistics//Proceedings of the 1st International Conference on Communications, Signal Processing, and Their Applications. Sharjah, United Arab Emirates: IEEE: 1-6[DOI: 10.1109/ICCSPA.2013.6487310]
• Hackel T, Savinov N, Ladicky L, Wegner J D, Schindler K and Pollefeys M. 2017. Semantic3D.net: a new large-scale point cloud classification benchmark//Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Hannover, Germany: ISPRS: 91-98[DOI: 10.5194/isprs-annals-IV-1-W1-91-2017]
• Kang Z Z, Yang J T, Zhong R F. 2017. A bayesian-network-based classification method integrating airborne LiDAR data with optical images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(4): 1651-1661 [DOI:10.1109/JSTARS.2016.2628775]
• Landrieu L and Simonovsky M. 2017. Large-scale point cloud semantic segmentation with superpoint graphs//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4558-4567[DOI: 10.1109/CVPR.2018.00479]
• Laube P, Franz M O and Umlauf G. 2017. Evaluation of features for SVM-based classification of geometric primitives in point clouds//Proceedings of the 15th IAPR International Conference on Machine Vision Applications (MVA). Nagoya, Japan: IEEE: 59-62[DOI: 10.23919/MVA.2017.7986776]
• Lee I and Schenk T. 2002. Perceptual organization of 3D surface points//International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 34(3/A): 193-198
• Li Y Y, Bu R, Sun M C, Wu W, Di X H and Chen B Q. 2018. PointCNN: convolution on X-transformed points//Proceedings of Advances in Neural Information Processing Systems. Montréal, Canada: 820-830
• Lim E H and Suter D. 2007. Conditional random field for 3D point clouds with adaptive data reduction//Proceedings of 2007 International Conference on Cyberworlds. Hannover, Germany: IEEE: 404-408[DOI: 10.1109/CW.2007.30]
• Lim E H, Suter D. 2009. 3D terrestrial LIDAR classifications with super-voxels and multi-scale Conditional Random Fields. Computer-Aided Design, 41(10): 701-710 [DOI:10.1016/j.cad.2009.02.010]
• Linsen L and Prautzsch H. 2001. Local versus global triangulations//Chalmers A and Rhyne T M, eds. Eurographics 2001. Oxford: Eurographics Association
• Liu Z Q, Li P C, Chen X W, Zhang B M, Guo H T. 2016. Classification of airborne LiDAR point cloud data based on information vector machine. Optics and Precision Engineering, 24(1): 210-219 (刘志青, 李鹏程, 陈小卫, 张保明, 郭海涛. 2016. 基于信息向量机的机载激光雷达点云数据分类. 光学精密工程, 24(1): 210-219) [DOI:10.3788/OPE.20162401.0210]
• Luo H, Wang C, Wen C L, Chen Z Y, Zai D W, Yu Y T, Li J. 2018. Semantic labeling of mobile LiDAR point clouds via active learning and higher order MRF. IEEE Transactions on Geoscience and Remote Sensing, 56(7): 3631-3644 [DOI:10.1109/TGRS.2018.2802935]
• Mitra N J, Nguyen A, Guibas L. 2004. Estimating surface normals in noisy point cloud data. International Journal of Computational Geometry and Applications, 14(4/5): 261-276 [DOI:10.1142/S0218195904001470]
• Najafi M, Namin S T, Salzmann M and Petersson L. 2014. Non-associative higher-order Markov networks for point cloud classification//Proceedings of 2014 European Conference on Computer Vision. Zurich, Switzerland: Springer: 500-515[DOI: 10.1007/978-3-319-10602-1_33]
• Ni H, Lin X G, Zhang J X. 2017. Classification of ALS point cloud with improved point cloud segmentation and random forests. Remote Sensing, 9(3): 288 [DOI:10.3390/rs9030288]
• Niemeyer J, Rottensteiner F and Soergel U. 2012. Conditional random fields for LIDAR point cloud classification in complex urban areas//Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Melbourne, Australia: ISPRS: 263-268[DOI: 10.5194/isprsannals-I-3-263-2012]
• Qi C R, Yi L, Su H and Guibas L J. 2017. PointNet++: deep hierarchical feature learning on point sets in a metric space//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc: 5105-5114
• Rusu R B, Marton Z C, Blodow N and Beetz M. 2008. Persistent point feature histograms for 3D point clouds//Burgard W, et al., eds. Intelligent Autonomous Systems 10. IOS Press[DOI: 10.3233/978-1-58603-887-8-119]
• Rusu R B, Blodow N and Beetz M. 2009. Fast Point Feature Histograms (FPFH) for 3D registration//Proceedings of 2009 IEEE International Conference on Robotics and Automation. Kobe, Japan: IEEE: 3212-3217[DOI: 10.1109/ROBOT.2009.5152473]
• Shapovalov R and Velizhev A. 2011. Cutting-plane training of non-associative Markov network for 3D point cloud segmentation//Proceedings of 2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission. Hangzhou, China: IEEE: 1-8[DOI: 10.1109/3DIMPVT.2011.10]
• Tombari F, Salti S and Di Stefano L. 2010. Unique signatures of histograms for local surface description//Proceedings of the 11th European Conference on Computer Vision. Heraklion, Greece: Springer: 356-369[DOI: 10.1007/978-3-642-15558-1_26]
• Wang C, Hou S W, Wen C L, Gong Z, Li Q, Sun X T, Li J. 2018a. Semantic line framework-based indoor building modeling using backpacked laser scanning point cloud. ISPRS Journal of Photogrammetry and Remote Sensing, 143: 150-166 [DOI:10.1016/j.isprsjprs.2018.03.025]
• Wang Z, Zhang L Q, Zhang L, Li R J, Zheng Y B, Zhu Z D. 2018b. A deep neural network with spatial pooling (DNNSP) for 3-D point cloud classification. IEEE Transactions on Geoscience and Remote Sensing, 56(8): 4594-4604 [DOI:10.1109/TGRS.2018.2829625]
• Weinmann M, Jutzi B and Mallet C. 2014. Semantic 3D scene interpretation: a framework combining optimal neighborhood size selection with relevant features//Proceedings of the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Zurich, Switzerland: ISPRS: 181-188[DOI: 10.5194/isprsannals-II-3-181-2014]
• Weinmann M, Jutzi B, Hinz S, Mallet C. 2015b. Semantic point cloud interpretation based on optimal neighborhoods, relevant features and efficient classifiers. ISPRS Journal of Photogrammetry and Remote Sensing, 105: 286-304 [DOI:10.1016/j.isprsjprs.2015.01.016]
• Weinmann Ma, Jutzi B, Mallet C and Weinmann Mi. 2017. Geometric features and their relevance for 3D point cloud classification. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, IV-1/W1: 157-164 [DOI:10.5194/isprs-annals-IV-1-W1-157-2017]
• Weinmann M, Urban S, Hinz S, Jutzi B, Mallet C. 2015a. Distinctive 2D and 3D features for automated large-scale scene analysis in urban areas. Computers and Graphics, 49: 47-57 [DOI:10.1016/j.cag.2015.01.006]
• West K F, Webb B N, Lersch J R, Pothier S, Triscari J M and Iverson A E. 2004. Context-driven automated target detection in 3D data//Proceedings of SPIE 5426, Automatic Target Recognition XIV. Orlando, USA: SPIE: 133-143[DOI: 10.1117/12.542536]
• Yang B S, Dong Z. 2013. A shape-based segmentation method for mobile laser scanning point clouds. ISPRS Journal of Photogrammetry and Remote Sensing, 81: 19-30 [DOI:10.1016/j.isprsjprs.2013.04.002]
• Yang B S, Dong Z, Zhao G, Dai W X. 2015. Hierarchical extraction of urban objects from mobile laser scanning data. ISPRS Journal of Photogrammetry and Remote Sensing, 99: 45-57 [DOI:10.1016/j.isprsjprs.2014.10.005]
• Yang J T, Kang Z Z. 2018. Multi-scale feature and Markov random field model for power line scene point cloud classification. Acta Geodaetica et Cartographica Sinica, (2): 188-197 (杨俊涛, 康志忠. 2018. 多尺度特征和马尔可夫随机场模型的电力线场景点云分类法. 测绘学报, 2: 188-197) [DOI:10.11947/j.AGCS.2018.20170556]
• Zhi S F, Liu Y X, Li X and Guo Y L. 2017. LightNet: a lightweight 3D convolutional neural network for real-time 3D object recognition//Pratikakis I, Dupont F and Ovsjanikov M, eds. Eurographics Workshop on 3D Object Retrieval. Lyon: the Eurographics Association: 9-16[DOI:10.2312/3dor.20171046]