Technology of intelligent interpretation of three-dimensional mesh models for complex urban scenes
2025, pages 1-19
Received: 2024-12-27,
Revised: 2025-02-19,
Accepted: 2025-02-25,
Published online: 2025-02-26
DOI: 10.11834/jig.240778
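Interpreting urban 3D Mesh models is an important step in building city-level 3D real scenes. It supports the digitization and the intelligent, fine-grained management of urban facilities such as buildings and transportation infrastructure, and plays a positive role in urban activities such as urban renewal, environmental remediation, and urban simulation. At present, semantic labeling and per-object separation of urban 3D Mesh models still rely mainly on manually outlining entity contours: each individual object is cut out of the urban 3D Mesh model along its entity boundary and assigned semantic information. However, urban 3D Mesh models are usually represented as tiles, and cutting across tiles readily causes fragmentation, seams, and discontinuities in the model. For this reason, researchers have begun to study the intelligent interpretation of urban 3D Mesh models based on deep neural networks. This interpretation nevertheless faces major challenges: for example, urban 3D Mesh models are irregular and non-watertight, so traditional convolutional networks are difficult to apply directly, and multi-scale features of urban 3D Mesh models are difficult to obtain. Although deep neural networks were applied to urban 3D Mesh model interpretation relatively late, research in this field has developed rapidly. Taking intelligent interpretation of urban 3D Mesh models as the main thread, this paper therefore systematically reviews and summarizes existing deep neural network methods for interpreting urban 3D Mesh models. According to how the urban 3D Mesh model is represented, these methods are divided into three categories: methods based on multi-view representation, methods based on centroid point cloud representation, and methods based on 3D Mesh model elements. The three categories are compared in detail and their current challenges are summarized. Next, six benchmark datasets commonly used for intelligent interpretation of urban 3D Mesh models are reviewed, and the performance of various methods on the semantic segmentation task is compared on these datasets. Finally, future directions and potential applications of urban 3D Mesh model interpretation are analyzed and discussed in depth.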
The 3D real scene serves as the spatial foundation of digital China, providing a unified framework for spatial positioning and analysis. By content and hierarchy, 3D real scenes are categorized into terrain-level, city-level, and component-level. The city-level 3D real scene consists mainly of 3D Mesh models derived from oblique photography, LiDAR point clouds, and texture images, which are semantically processed and integrated with real-time perception data. Urban 3D Mesh models are composed primarily of vertices, edges, triangular faces, and texture images. Compared with point clouds, 3D Mesh models not only display more detailed information about objects but also allow the level of detail to be controlled easily by adjusting model parameters. The 3D real scene focuses on the digital mapping of production and living spaces, which supports fine-grained city management and intelligent urban planning and construction. Interpreting 3D Mesh models of urban scenes, for example through semantic segmentation and instance segmentation, is a crucial step in constructing the city-level 3D real scene. Currently, semantic and instance segmentation of urban 3D Mesh models is done mainly by hand: an operator draws the outline of each object, cuts the object out of the 3D Mesh model along its boundary, and then assigns semantic information. However, urban 3D Mesh models are typically stored as tiles, and cutting across tiles can easily cause fragmentation, seams, and discontinuities in the model. In recent years, deep learning has developed rapidly. Because deep neural networks can learn discriminative high-level semantic features from given datasets, they have been widely applied to the interpretation of image data and three-dimensional data (such as point clouds and 3D Meshes).
Additionally, with continuous improvement in the performance of graphics processing units (GPUs) and the expansion of annotated datasets, the accuracy of deep neural networks in interpreting 2D images and 3D data has improved significantly. Although the interpretation of 3D Mesh models has progressed significantly, most studies have focused on small-scale, toy, and simulated 3D Mesh models. Research on deep neural networks for interpreting complex urban 3D Mesh models is still in its early stages and faces challenges in three main respects: 1) Urban 3D Mesh models are often irregular and may contain holes or be non-watertight, so traditional deep neural networks are difficult to apply directly to extract highly discriminative features. 2) Multi-scale feature extraction is inefficient. Traditional 3D Mesh simplification methods (such as quadric error metrics), which are used to generate hierarchical 3D Meshes, rely on greedy strategies that are hard to parallelize, so they inevitably increase the computational burden when processing large-scale urban 3D Mesh models. 3) Compared with benchmark datasets for images and point clouds, publicly available benchmark datasets for urban 3D Mesh models are scarce. The structures of buildings, roads, and vegetation in urban scenes are complex and varied, so annotating urban 3D Mesh models requires specialized knowledge and consumes considerable time and labor. Compared with the intelligent interpretation of images and point clouds, deep neural networks were applied to urban 3D Mesh models later, yet the field has developed rapidly. However, few review articles systematically explore and summarize how different deep neural network architectures interpret urban 3D Mesh models.
Therefore, this paper aims to systematically review and summarize existing deep neural network methods for interpreting urban 3D Mesh models and to highlight the open challenges that researchers currently face, providing a reference for future research. For this purpose, we first survey the literature extensively and categorize the intelligent interpretation methods for urban 3D Mesh models into three classes, according to the representation used in processing urban 3D Mesh models: 1) Methods based on multi-view images project 3D Mesh models into 2D images from multiple viewpoints and use well-established 2D image deep learning methods to learn discriminative semantic features from the projected images. The semantic features learned from the projected images are then mapped back to the 3D Mesh models. 2) Methods based on center of gravity (COG) point cloud representation convert each face of the urban 3D Mesh model into its COG point, thereby abstracting the entire 3D Mesh model into a COG point cloud. Intelligent interpretation algorithms designed for point clouds are then used to process the COG point cloud. Unlike traditional point clouds, COG point clouds can inherit rich texture and geometric information from the urban 3D Mesh model. 3) Methods based on 3D Mesh elements define learnable operations (such as convolution and pooling) directly on the 3D Mesh elements (vertices, edges, triangular faces). This approach learns and extracts rich high-level semantic features directly from the urban 3D Mesh model, thereby avoiding the information loss that can result from preprocessing steps such as multi-view image projection and COG point cloud abstraction. Subsequently, we compare the three categories of methods in detail and summarize their current challenges.
Furthermore, we summarize commonly used datasets for intelligent interpretation of urban 3D Mesh models and compare the interpretation performance of different methods on these datasets. Finally, based on the systematic survey and comprehensive performance comparison, we discuss some promising future research directions in terms of dataset creation, 3D large model construction, and application scenarios.
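To make the COG point cloud representation described in this abstract concrete, the following is a minimal sketch, assuming a triangle mesh given as a NumPy vertex array and a face index array. The function name `cog_point_cloud` and the toy mesh are hypothetical and are not taken from any surveyed method; per-face texture features, which the abstract says a COG point cloud can also inherit, would need texture sampling and are not shown.

```python
import numpy as np


def cog_point_cloud(vertices, faces):
    """Abstract a triangle mesh into a centre-of-gravity (COG) point cloud.

    vertices: (V, 3) array of vertex coordinates.
    faces:    (F, 3) array of vertex indices, one row per triangle.
    Returns the COG point of every face together with two geometric
    attributes inherited from the mesh: unit face normal and face area.
    """
    vertices = np.asarray(vertices, dtype=np.float64)
    faces = np.asarray(faces, dtype=np.int64)

    tri = vertices[faces]                  # (F, 3, 3): corners of each face
    cog = tri.mean(axis=1)                 # (F, 3): mean of the three corners

    # The cross product of two edge vectors gives the face normal direction;
    # its length is twice the triangle area.
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    length = np.linalg.norm(cross, axis=1)
    areas = 0.5 * length

    # Guard against degenerate (zero-area) faces to avoid division by zero.
    safe = np.where(length > 0, length, 1.0)
    normals = cross / safe[:, None]
    return cog, normals, areas


# Hypothetical toy mesh: a unit square in the z = 0 plane, split in two.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
cog, normals, areas = cog_point_cloud(vertices, faces)
print(cog)
print(normals)
print(areas)
```

Each row of `cog` can then be fed to a network designed for point clouds, with `normals` and `areas` concatenated as extra per-point features. Which attributes are attached in practice is a design choice of the individual method.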