3D point cloud semantic segmentation and instance segmentation based on a recurrent slice network

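Liu Suyi1, Chi Jianning1, Wu Chengdong1, Xu Fang2 (1. Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110167; 2. Central Research Institute, Shenyang Siasun Robot Automation Co., Ltd., Shenyang 110180)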

Abstract
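Objective Semantic and instance segmentation of 3D point clouds suffers from low accuracy in point-feature extraction, from instance segmentation accuracy that depends heavily on semantic segmentation performance, and from misclassified semantic categories and blurred instance boundaries in dense scenes or when segmenting small units. To address these problems, a semantic and instance segmentation network for 3D point clouds based on a recurrent slice network is proposed. Method The network slices the input point cloud and maps the unordered points onto an ordered sequence. A bidirectional long short-term memory (BiLSTM) network then produces an encoded feature matrix carrying both local and global features. The encoded feature matrix is decoded into two parallel branches, in which multi-scale feature fusion is performed. Finally, semantic and instance features are fused, yielding a parallel semantic and instance segmentation network. Result The method is compared experimentally with recent point cloud segmentation methods on the Stanford large-scale 3D indoor spaces dataset (S3DIS) and on the ShapeNet dataset. On S3DIS, the proposed algorithm reaches a mean intersection over union of 73% for semantic segmentation, 7.4% higher than the dynamic-kernel convolution method position adaptive convolution (PAConv), and achieves the best result in 8 of the 13 categories; for instance segmentation, the mean instance coverage is 67.7%. On ShapeNet, the mean intersection over union for semantic segmentation is 89.2%, which is 4.6% higher than PAConv and 1.6% higher than the fast and robust joint semantic-instance segmentation method (3DCFS). Conclusion The proposed fused semantic and instance segmentation network combines the strengths of semantic segmentation and instance segmentation and effectively improves the accuracy of both.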
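The slicing step in the Method part can be illustrated with a minimal sketch: points are bucketed into equal-width slices along one axis, and the per-point features in each slice are max-pooled, so an unordered point set becomes an ordered sequence of slice features that a recurrent network such as a BiLSTM could consume. The function name `slice_pool`, the equal-width slicing, and the zero-filling of empty slices are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Optional, Sequence


def slice_pool(
    coords: Sequence[float],
    features: Sequence[Sequence[float]],
    num_slices: int,
) -> List[List[float]]:
    """Max-pool per-point features into ordered slices along one axis.

    coords   -- one coordinate per point (e.g. its x value), in any order
    features -- one feature vector per point, all of the same length
    Returns num_slices feature vectors ordered by increasing coordinate.
    Empty slices are filled with zeros (an assumption of this sketch).
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    if len(coords) != len(features) or not coords:
        raise ValueError("coords and features must be non-empty and aligned")

    dim = len(features[0])
    lo, hi = min(coords), max(coords)
    # Fall back to width 1.0 when all points share the same coordinate.
    width = (hi - lo) / num_slices or 1.0

    pooled: List[Optional[List[float]]] = [None] * num_slices
    for c, f in zip(coords, features):
        # Clamp so the point at the upper bound lands in the last slice.
        idx = min(int((c - lo) / width), num_slices - 1)
        if pooled[idx] is None:
            pooled[idx] = list(f)
        else:
            pooled[idx] = [max(a, b) for a, b in zip(pooled[idx], f)]

    return [p if p is not None else [0.0] * dim for p in pooled]
```

Because max pooling ignores the order of points inside a slice, shuffling the input leaves the output unchanged; calling the function once per axis (x, y, z) would give three ordered sequences.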
Keywords
Recurrent slice networks-based 3D point cloud-relevant integrated segmentation of semantic and instances

Liu Suyi1, Chi Jianning1, Wu Chengdong1, Xu Fang2 (1. Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110167, China; 2. Central Research Institute, Shenyang Siasun Robot Automation Co., Ltd., Shenyang 110180, China)

Abstract
Objective Growth in GPU computing power has benefited 3D computer vision, and 3D point cloud segmentation now supports applications such as robots and manipulators. Point cloud segmentation is mainly divided into semantic segmentation and instance segmentation, both of which aim to detect the regions of a scene that carry specific information, represented as sets of minimum units. A sampled scene point cloud is parsed into groups of points, and each group is recognized as a separate object instance or object class. Although each task can be optimized on its own, integrating the two so that they benefit each other remains challenging. Existing methods still extract 3D point cloud features with low accuracy. In addition, the accuracy of instance segmentation depends heavily on the performance of semantic segmentation, and incorrect instance predictions in turn degrade semantic segmentation and classification, leading to semantic classification errors, blurred instance edges, and related problems. We develop a recurrent slice network for integrated instance and semantic segmentation of 3D point clouds. Method The backbone consists of two parts: an improved recurrent slice feature-extraction network and a feature-fusion network. First, the slice-pooling layer of the feature-extraction network slices the input point cloud along each of the three spatial directions, and max pooling within each slice maps the unordered points onto an ordered sequence. Second, an ordinary bidirectional recurrent neural network (RNN) cannot update previously input information, which leads to insufficient learning ability, vanishing gradients, and related problems.
To obtain an encoded feature matrix carrying both local and global features, a bidirectional long short-term memory (BiLSTM) network is used instead to exchange local information among slices. Third, the extracted features are decoded into two parallel branches, one for semantic segmentation and one for instance segmentation. Within each branch, features from multiple receptive fields are fused before the semantic and instance features are fused with each other. To obtain the semantic-aware instance segmentation model, semantic-aware information is extracted from the high-dimensional semantic features and combined with the instance features. To obtain the instance-embedded semantic segmentation model, a k-nearest neighbor (KNN) search in the instance embedding space selects a fixed number of neighboring points for each point. Points of the same class are thereby drawn together, while points of different classes are kept apart. Hyperparameters filter out outliers to preserve the generalization performance of the model. Result To verify segmentation performance, comparative experiments are conducted on two public datasets, the Stanford large-scale 3D indoor spaces dataset (S3DIS) and ShapeNet. The model is compared with state-of-the-art methods for semantic segmentation, instance segmentation, and joint semantic-instance segmentation. In 6-fold cross-validation on S3DIS, the proposed algorithm reaches 73% mean intersection over union (mIoU), 82.3% mean accuracy (mAcc), and 89.3% overall accuracy (oAcc) for semantic segmentation, which are 4.4%, 10.2%, and 1.9% higher than the position adaptive convolution (PAConv) algorithm. For instance segmentation, the mean instance coverage (m-Cov) and mean instance weighted coverage (mw-Cov) reach 64.1% and 65.3% when tested on Area 5, which are 0.6% and 0.7% higher than the PointGroup algorithm. Furthermore, in the semantic segmentation experiment on S3DIS, the algorithm achieves the best result in 8 of the 13 categories. On ShapeNet, the proposed algorithm reaches 89.2% mIoU for semantic segmentation, 4.6% higher than PAConv. Conclusion This work addresses semantic and instance segmentation of 3D point clouds and develops a fusion algorithm for the two tasks based on a feature slice network. Instance features are fused into the semantic branch to obtain instance-embedded semantic segmentation, and semantic features are passed to the instance segmentation branch. Compared with other point cloud segmentation algorithms on the S3DIS and ShapeNet datasets, the proposed algorithm shows that integrating semantic and instance segmentation effectively improves the accuracy of both.
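The KNN step in the instance embedding space can be sketched as follows, as a rough illustration rather than the authors' implementation. For each point, the sketch finds its k nearest neighbors by Euclidean distance between instance embeddings, drops neighbors farther than a threshold (standing in for the outlier-filtering hyperparameter), and fuses the semantic features of the remaining neighbors. The function name `knn_instance_fusion`, the parameter `max_dist`, and the use of an element-wise maximum for fusion are all assumptions; the abstract does not specify them.

```python
import math
from typing import List, Sequence


def knn_instance_fusion(
    inst_emb: Sequence[Sequence[float]],
    sem_feat: Sequence[Sequence[float]],
    k: int,
    max_dist: float,
) -> List[List[float]]:
    """Fuse semantic features over k nearest neighbours in embedding space.

    inst_emb -- one instance-embedding vector per point
    sem_feat -- one semantic feature vector per point
    k        -- fixed number of neighbours considered per point (k >= 1)
    max_dist -- neighbours farther than this are treated as outliers
    Brute-force O(n^2) search, adequate for a small illustration.
    """
    if k < 1 or max_dist < 0:
        raise ValueError("k must be >= 1 and max_dist must be >= 0")
    dim = len(sem_feat[0])
    fused = []
    for e_i in inst_emb:
        # Sort all points by distance to the current point; the nearest
        # one is always at distance 0 (the point itself or a duplicate).
        dists = sorted(
            (math.dist(e_i, e_j), j) for j, e_j in enumerate(inst_emb)
        )
        keep = [j for d, j in dists[:k] if d <= max_dist]
        # Element-wise max over the kept neighbours (assumed fusion rule).
        fused.append([max(sem_feat[j][c] for j in keep) for c in range(dim)])
    return fused
```

With well-separated embeddings, points that belong to the same instance end up sharing fused semantic features, while an isolated point keeps its own features because all of its neighbors exceed `max_dist`.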
Keywords
