A survey on 3D clothed human body reconstruction: from traditional methods to high-fidelity models

Chen Honghu; Tao Yunfan; Zhang Juyong

doi:10.11834/jig.230646

Digital human modeling, generation, and rendering technology


2024, Vol. 29, No. 9, pp. 2566-2595
Print publication date: 2024-09-16

Chen Honghu, Tao Yunfan, Zhang Juyong. 2024. A survey on 3D clothed human body reconstruction: from traditional methods to high-fidelity models. Journal of Image and Graphics, 29(09):2566-2595. DOI: 10.11834/jig.230646.

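Abstract (translated from the Chinese)

Three-dimensional clothed human body reconstruction occupies an important place in computer graphics and 3D vision and is widely applied in many areas. The diversity of clothing and the complexity of human motion make high-fidelity reconstruction of clothed humans extremely difficult. Deep learning has improved key stages such as data feature extraction, implicit geometric representation, and neural rendering, and has driven revolutionary progress in high-fidelity clothed human reconstruction. This paper reviews the basic pipeline and component modules of human body reconstruction, including the various types of input data, representations of human geometry and motion, parametric models, and 3D-to-2D rendering techniques. It also introduces publicly available clothed human datasets and briefly reviews the rapid development of human reconstruction algorithms over the past 10 years. The paper then discusses several major families of reconstruction methods in detail: dense-view reconstruction, non-rigid structure from motion (NRSFM), pixel-aligned implicit geometry reconstruction, and generative-model methods. In particular, dense-view reconstruction can produce high-quality human geometry, whereas NRSFM methods reduce the need for multiple views. Pixel-aligned methods reconstruct human geometry with rich detail, while generative-model methods exploit multimodal input for reconstruction. Finally, the paper summarizes the progress of existing methods and points to likely future research directions, including low-cost high-fidelity reconstruction, faster reconstruction, more editable reconstruction results, and reconstruction in natural (in-the-wild) environments.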
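As a rough illustration of what "pixel-aligned" means for the implicit geometry methods named in the abstract, the sketch below projects a 3D query point into the image, bilinearly samples a feature at that pixel, and passes the feature together with the point's depth to an implicit function. It is a minimal sketch under stated assumptions, not any particular published implementation: the pinhole camera, the single-channel `feature_map`, and the function names `project_pinhole`, `bilinear_sample`, `pixel_aligned_query`, and `toy_implicit_fn` are all hypothetical. In practice the feature map would come from an image encoder and the implicit function would be a learned network.

```python
def project_pinhole(point, focal, cx, cy):
    """Project a camera-space 3D point to continuous pixel coordinates (assumed pinhole model)."""
    x, y, z = point
    return focal * x / z + cx, focal * y / z + cy


def bilinear_sample(feature_map, u, v):
    """Bilinearly interpolate a 2D grid (rows indexed by v, columns by u)."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp so that the 2x2 neighbourhood stays inside the grid.
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    top = (1 - du) * feature_map[v0][u0] + du * feature_map[v0][u1]
    bottom = (1 - du) * feature_map[v1][u0] + du * feature_map[v1][u1]
    return (1 - dv) * top + dv * bottom


def pixel_aligned_query(point, feature_map, focal, cx, cy, implicit_fn):
    """Evaluate an implicit function on the image feature aligned with a 3D point."""
    u, v = project_pinhole(point, focal, cx, cy)
    feature = bilinear_sample(feature_map, u, v)
    return implicit_fn(feature, point[2])  # feature at the pixel, plus depth


def toy_implicit_fn(feature, depth):
    """Hypothetical stand-in for a learned network: any scalar function works here."""
    return feature - depth


# Toy data: a 2x2 single-channel "feature map" and one query point.
toy_feature_map = [[0.0, 1.0], [2.0, 3.0]]
query_point = (0.0, 0.0, 2.0)
pixel = project_pinhole(query_point, focal=1.0, cx=0.5, cy=0.5)
sampled = bilinear_sample(toy_feature_map, *pixel)
value = pixel_aligned_query(query_point, toy_feature_map, 1.0, 0.5, 0.5, toy_implicit_fn)
```

The design point is that the feature depends on where the 3D point lands in the image, so local image detail can influence the predicted geometry at that location.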

Abstract

Three-dimensional human body reconstruction is a fundamental task in computer graphics and computer vision, with wide-ranging applications in virtual reality, human-computer interaction, motion analysis, and many other fields. The task is to accurately recover a three-dimensional model of the human body from given input data for further analysis and application. However, high-fidelity reconstruction of clothed human bodies remains difficult because of the diversity of body shapes, variations in clothing, and the complexity of human motion. The rapid development of deep learning has brought considerable progress to the field. Deep learning techniques use multilayer neural networks to extract features from input data and learn discriminative representations. In human body reconstruction, they have substantially advanced data feature extraction, implicit geometric representation, and neural rendering. This article aims to provide a comprehensive and accessible overview of three-dimensional human body reconstruction and to explain the methodologies, techniques, and algorithms it relies on. The article first introduces the classical framework of human body reconstruction, which comprises several key modules that together form the reconstruction pipeline. These modules start from the various types of input data, including images, videos, and three-dimensional scans, that serve as the building blocks of the reconstruction process. The representation of human body geometry is a vital aspect of reconstruction, because capturing the nuanced contours and shapes that define the human form is challenging. The article explores various techniques for geometric representation, from mesh-based approaches to implicit representations and voxel grids.
These techniques capture intricate details of the human body while keeping body shapes and poses realistic. The article also examines the challenges of reconstructing clothed human bodies and how well parametric models capture the complexity of clothing deformation. The representation of human body motion is another crucial component, since realistic reconstruction requires accurate modeling and capture of the dynamics of human movement. The article explores various approaches to modeling human body motion, both articulated and non-rigid. Techniques such as skeletal animation, motion capture, and spatiotemporal analysis are discussed as means of representing human motion accurately and realistically. Parametric models also contribute to human body reconstruction because they provide a concise and expressive representation of the complete human body. The article further examines optimization-based methods, regression-based approaches, and popular parametric models, such as the skinned multi-person linear (SMPL) model and SMPL plus offsets. These models capture realistic body shapes, poses, and clothing deformations. The article also discusses the advantages and limitations of these models and their applications in various domains. Deep learning techniques have had a transformative influence on three-dimensional human body reconstruction. The article explores the application of deep learning to data feature extraction, implicit geometric representation, and neural rendering, and highlights the advances achieved with convolutional neural networks, recurrent neural networks, and generative adversarial networks in different stages of the reconstruction pipeline. These techniques considerably improve the accuracy and realism of reconstructed human bodies.
Furthermore, the article introduces publicly available datasets curated specifically for clothed human body reconstruction. These datasets serve as resources for benchmarking and evaluating reconstruction algorithms and enable researchers to compare and analyze the effectiveness of different techniques. The article then surveys the rapid progress of human body reconstruction algorithms over the past decade, highlighting breakthroughs in dense-view reconstruction, non-rigid structure from motion (NRSFM) methods, pixel-aligned implicit geometry reconstruction, generative models, and parametric models. The discussion covers the strengths, limitations, and potential applications of each approach to give readers a holistic view of the current state of the art. In conclusion, this article offers an in-depth and accessible exploration of three-dimensional human body reconstruction and covers a wide range of topics, such as data acquisition, geometry representation, motion modeling, and rendering modules. The article not only summarizes existing methods but also outlines future research directions, such as high-fidelity reconstruction at reduced cost, faster reconstruction, editable reconstruction results, and reconstruction of human bodies in natural environments. Progress in these directions would increase the accuracy, realism, and practicality of three-dimensional human body reconstruction systems and open new possibilities for applications in academia and industry.
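The abstract refers to skeletal animation and to skinned parametric models such as SMPL. As a hedged illustration of the skinning step alone, the sketch below implements plain linear blend skinning, in which each posed vertex is a weighted blend of the rest-pose vertex moved by each bone's rigid transform. It is a toy sketch, not the SMPL model itself: SMPL additionally applies shape and pose blend shapes before skinning, and the names `rotation_z`, `apply_rigid`, `linear_blend_skinning`, and the two-bone toy data are assumptions made for this example.

```python
import math


def rotation_z(angle_rad):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid(rotation, translation, point):
    """Apply x -> R x + t to a 3D point."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def linear_blend_skinning(vertices, weights, transforms):
    """Pose rest-pose vertices as a weighted blend of per-bone rigid transforms.

    vertices:   list of [x, y, z] rest-pose positions
    weights:    weights[v][k] is the influence of bone k on vertex v (rows sum to 1)
    transforms: list of (R, t) pairs, one per bone
    """
    posed = []
    for vertex, vertex_weights in zip(vertices, weights):
        blended = [0.0, 0.0, 0.0]
        for w, (rotation, translation) in zip(vertex_weights, transforms):
            moved = apply_rigid(rotation, translation, vertex)
            for i in range(3):
                blended[i] += w * moved[i]
        posed.append(blended)
    return posed


# Toy setup: bone 0 stays fixed, bone 1 rotates 90 degrees about z.
identity_bone = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
bent_bone = (rotation_z(math.pi / 2), [0.0, 0.0, 0.0])
rest_vertices = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
skin_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
posed_vertices = linear_blend_skinning(rest_vertices, skin_weights, [identity_bone, bent_bone])
```

In this toy setup the third vertex, weighted equally between the two bones, lands at roughly (0.5, 0.5, 0), closer to the joint than either fully bound vertex. This shrinkage near bent joints is a well-known limitation of linear blending and one reason skinned body models add corrective terms.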

Keywords

three-dimensional human body reconstruction; deep learning; parametric model; implicit geometric representation; non-rigid structure from motion (NRSFM) method; generative model

references

Aliev K A， Sevastopolsky A， Kolos M， Ulyanov D and Lempitsky V. 2020. Neural point-based graphics//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 696-712 ［DOI： 10.1007/978-3-030-58542-6_42http://dx.doi.org/10.1007/978-3-030-58542-6_42］

Allain B， Franco J S and Boyer E. 2015. An efficient volumetric framework for shape tracking//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston， USA： IEEE： 268-276 ［DOI： 10.1109/CVPR.2015.7298623http://dx.doi.org/10.1109/CVPR.2015.7298623］

Alldieck T， Magnor M， Bhatnagar B L， Theobalt C and Pons-Moll G. 2019a. Learning to reconstruct people in clothing from a single RGB camera//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 1175-1186 ［DOI： 10.1109/CVPR.2019.00127http://dx.doi.org/10.1109/CVPR.2019.00127］

Alldieck T， Magnor M， Xu W P， Theobalt C and Pons-Moll G. 2018. Video based reconstruction of 3D people models//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 8387-8397 ［DOI： 10.1109/CVPR.2018.00875http://dx.doi.org/10.1109/CVPR.2018.00875］

Alldieck T， Pons-Moll G， Theobalt C and Magnor M. 2019b. Tex2Shape： detailed full human body geometry from a single image//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision （ICCV）. Seoul， Korea （South）： IEEE： 2293-2303 ［DOI： 10.1109/ICCV.2019.00238http://dx.doi.org/10.1109/ICCV.2019.00238］

Alldieck T， Xu H Y and Sminchisescu C. 2021. Imghum： implicit generative models of 3D human shape and articulated pose//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 5441-5450 ［DOI： 10.1109/ICCV48922.2021.00541http://dx.doi.org/10.1109/ICCV48922.2021.00541］

Anguelov D， Srinivasan P， Koller D， Thrun S， Rodgers J and Davis J. 2005. SCAPE： shape completion and animation of people. ACM Transactions on Graphics， 24（3）： 408-416 ［DOI： 10.1145/1073204.1073207http://dx.doi.org/10.1145/1073204.1073207］

Athanasiou N， Petrovich M， Black M J and Varol G. 2022. TEACH： temporal action composition for 3D humans//Proceedings of 2022 International Conference on 3D Vision （3DV）. Prague， Czech Republic： IEEE： 414-423 ［DOI： 10.1109/3DV57658.2022.00053http://dx.doi.org/10.1109/3DV57658.2022.00053］

Bérard P， Bradley D， Gross M and Beeler T. 2016. Lightweight eye capture using a parametric model. ACM Transactions on Graphics， 35（4）： #117 ［DOI： 10.1145/2897824.2925962http://dx.doi.org/10.1145/2897824.2925962］

Bérard P， Bradley D， Nitti M， Beeler T and Gross M. 2014. High-quality capture of eyes. ACM Transactions on Graphics， 33（6）： #223 ［DOI： 10.1145/2661229.2661285http://dx.doi.org/10.1145/2661229.2661285］

Bertiche H， Madadi M and Escalera S. 2020. CLOTH3D： clothed 3D humans//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 344-359 ［DOI： 10.1007/978-3-030-58565-5_21http://dx.doi.org/10.1007/978-3-030-58565-5_21］

Bertiche H， Madadi M and Escalera S. 2021. PBNS： physically based neural simulation for unsupervised garment pose space deformation. ACM Transactions on Graphics， 40（6）： #198 ［DOI： 10.1145/3478513.3480479http://dx.doi.org/10.1145/3478513.3480479］

Bhatnagar B， Tiwari G， Theobalt C and Pons-Moll G. 2019. Multi-garment net： learning to dress 3D people from images//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision （ICCV）. Seoul， Korea （South）： IEEE： 5419-5429 ［DOI： 10.1109/ICCV.2019.00552http://dx.doi.org/10.1109/ICCV.2019.00552］

Bhatnagar B L， Sminchisescu C， Theobalt C and Pons-Moll G. 2020. LoopReg： self-supervised learning of implicit surface correspondences， pose and shape for 3D human mesh registration//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver， Canada： NIPS： 12909-12922

Black M J， Patel P， Tesch J and Yang J L. 2023. BEDLAM： a synthetic dataset of bodies exhibiting detailed lifelike animated motion//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver， Canada： IEEE： 8726-8737 ［DOI： 10.1109/CVPR52729.2023.00843http://dx.doi.org/10.1109/CVPR52729.2023.00843］

Blanz V and Vetter T. 1999. A morphable model for the synthesis of 3D faces//Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. Los Angeles， USA： ACM： 187-194 ［DOI： 10.1145/311535.311556http://dx.doi.org/10.1145/311535.311556］

Bogo F， Kanazawa A， Lassner C， Gehler P， Romero J and Black M J. 2016. Keep it SMPL： automatic estimation of 3D human pose and shape from a single image//Proceedings of the 14th European Conference on Computer Vision. Amsterdam， The Netherlands： Springer： 561-578 ［DOI： 10.1007/978-3-319-46454-1_34http://dx.doi.org/10.1007/978-3-319-46454-1_34］

Božič A， Palafox P， Zollhöfer M， Thies J， Dai A and Nießner M. 2021. Neural deformation graphs for globally-consistent non-rigid reconstruction//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 1450-1459 ［DOI： 10.1109/CVPR46437.2021.00150http://dx.doi.org/10.1109/CVPR46437.2021.00150］

Brand M. 2005. A direct method for 3D factorization of nonrigid motion observed in 2D//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego， USA： IEEE： 122-128 ［DOI： 10.1109/CVPR.2005.23http://dx.doi.org/10.1109/CVPR.2005.23］

Bregler C， Hertzmann A and Biermann H. 2000. Recovering non-rigid 3D shape from image streams//Proceedings of 2000 IEEE Conference on Computer Vision and Pattern Recognition. Hilton Head： IEEE： 690-696 ［DOI： 10.1109/CVPR.2000.854941http://dx.doi.org/10.1109/CVPR.2000.854941］

Cagniart C， Boyer E and Ilic S. 2010. Free-form mesh tracking： a patch-based approach//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco， USA： IEEE： 1339-1346 ［DOI： 10.1109/CVPR.2010.5539814http://dx.doi.org/10.1109/CVPR.2010.5539814］

Cai H R， Feng W Q， Feng X T， Wang Y and Zhang J Y. 2022a. Neural surface reconstruction of dynamic scenes with monocular RGB-D camera//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans， USA： NeurIPS

Cai Z G， Ren D X， Zeng A L， Lin Z Y， Yu T， Wang W J， Fan X Y， Gao Y， Yu Y F， Pan L， Hong F Z， Zhang M Y， Loy C C， Yang L and Liu Z W. 2022b. HuMMan： multi-modal 4D human dataset for versatile sensing and modeling//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv， Israel： Springer： 557-577 ［DOI： 10.1007/978-3-031-20071-7_33http://dx.doi.org/10.1007/978-3-031-20071-7_33］

Cao Y K， Chen G Y， Han K， Yang W Q and Wong K Y K. 2022. JIFF： jointly-aligned implicit face function for high quality single view clothed human reconstruction//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 2719-2729 ［DOI： 10.1109/CVPR52688.2022.00275http://dx.doi.org/10.1109/CVPR52688.2022.00275］

Cao Z， Hidalgo G， Simon T， Wei S E and Sheikh Y. 2021. OpenPose： realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence， 43（1）： 172-186 ［DOI： 10.1109/TPAMI.2019.2929257http://dx.doi.org/10.1109/TPAMI.2019.2929257］

Chan E R， Lin C Z， Chan M A， Nagano K， Pan B X， De Mello S， Gallo O， Guibas L， Tremblay J， Khamis S， Karras T and Wetzstein G. 2022. Efficient geometry-aware 3D generative adversarial networks//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 16102-16112 ［DOI： 10.1109/CVPR52688.2022.01565http://dx.doi.org/10.1109/CVPR52688.2022.01565］

Charles R Q， Su H， Mo K C and Guibas L J. 2017. PointNet： deep learning on point sets for 3D classification and segmentation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu， USA： IEEE： 77-85 ［DOI： 10.1109/CVPR.2017.16http://dx.doi.org/10.1109/CVPR.2017.16］

Chen L， Peng S D and Zhou X W. 2021a. Towards efficient and photorealistic 3D human reconstruction： a brief survey. Visual Informatics， 5（4）： 11-19 ［DOI： 10.1016/j.visinf.2021.10.003http://dx.doi.org/10.1016/j.visinf.2021.10.003］

Chen X， Jiang T J， Song J， Yang J L， Black M J， Geiger A and Hilliges O. 2022a. gDNA： towards generative detailed neural avatars//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 20395-20405 ［DOI： 10.1109/CVPR52688.2022.01978http://dx.doi.org/10.1109/CVPR52688.2022.01978］

Chen X， Zheng Y F， Black M J， Hilliges O and Geiger A. 2021b. SNARF： differentiable forward skinning for animating non-rigid neural implicit shapes//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision （ICCV）. Montreal， Canada： IEEE： 11574-11584 ［DOI： 10.1109/ICCV48922.2021.01139http://dx.doi.org/10.1109/ICCV48922.2021.01139］

Cheng W， Chen R X， Fan S M， Yin W Q， Chen K Y， Cai Z G， Wang J B， Gao Y， Yu Z M， Lin Z Y， Ren D X， Yang L， Liu Z W， Loy C C， Qian C， Wu W， Lin D H， Dai B and Lin K Y. 2023. DNA-rendering： a diverse neural actor repository for high-fidelity human-centric rendering//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris， France： IEEE： 19925-19936 ［DOI： 10.1109/ICCV51070.2023.01829http://dx.doi.org/10.1109/ICCV51070.2023.01829］

Chibane J， Mir A and Pons-Moll G. 2020. Neural unsigned distance fields for implicit function learning//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver， Canada： NIPS： #1816 ［DOI： 10.5555/3495724.3497540http://dx.doi.org/10.5555/3495724.3497540］

Choi H， Moon G， Chang J Y and Lee K M. 2021. Beyond static features for temporally consistent 3D human pose and shape from a video//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 1964-1973 ［DOI： 10.1109/CVPR46437.2021.00200http://dx.doi.org/10.1109/CVPR46437.2021.00200］

Choutas V， Pavlakos G， Bolkart T， Tzionas D and Black M J. 2020. Monocular expressive body regression through body-driven attention//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 20-40 ［DOI： 10.1007/978-3-030-58607-2_2http://dx.doi.org/10.1007/978-3-030-58607-2_2］

Collet A， Chuang M， Sweeney P， Gillett D， Evseev D， Calabrese D， Hoppe H， Kirk A and Sullivan S. 2015. High-quality streamable free-viewpoint video. ACM Transactions on Graphics， 34（4）： #69 ［DOI： 10.1145/2766945http://dx.doi.org/10.1145/2766945］

Corona E， Pumarola A， Alenyà G， Pons-Moll G and Moreno-Noguer F. 2021. SMPLicit： topology-aware generative model for clothed people//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 11870-11880 ［DOI： 10.1109/CVPR46437.2021.01170http://dx.doi.org/10.1109/CVPR46437.2021.01170］

Dai Y C， Li H D and He M Y. 2012. A simple prior-free method for non-rigid structure-from-motion factorization//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence， USA： IEEE： 2018-2025 ［DOI： 10.1109/CVPR.2012.6247905http://dx.doi.org/10.1109/CVPR.2012.6247905］

De Luigi L， Li R， Guillard B， Salzmann M and Fua P. 2023. DrapeNet： garment generation and self-supervised draping//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver， Canada： IEEE： 1451-1460 ［DOI： 10.1109/CVPR52729.2023.00146http://dx.doi.org/10.1109/CVPR52729.2023.00146］

Deng B Y， Lewis J P， Jeruzalski T， Pons-Moll G， Hinton G， Norouzi M and Tagliasacchi A. 2020. NASA neural articulated shape approximation//Proceedings of the 16th European Conference on Computer Vision （ECCV）. Glasgow， UK： Springer： 612-628 ［DOI： 10.1007/978-3-030-58571-6_36http://dx.doi.org/10.1007/978-3-030-58571-6_36］

Dong Z J， Chen X， Yang J L， Black M J， Hilliges O and Geiger A. 2023. AG3D： learning to generate 3D avatars from 2D image collections//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris， France： IEEE： 14870-14881 ［DOI： 10.1109/ICCV51070.2023.01370http://dx.doi.org/10.1109/ICCV51070.2023.01370］

Dong Z J， Guo C， Song J， Chen X， Geiger A and Hilliges O. 2022. PINA： learning a personalized implicit neural avatar from a single RGB-D video sequence//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 20438-20448 ［DOI： 10.1109/CVPR52688.2022.01982http://dx.doi.org/10.1109/CVPR52688.2022.01982］

Drebin R A， Carpenter L and Hanrahan P. 1988. Volume rendering. ACM SIGGRAPH Computer Graphics， 22（4）： 65-74 ［DOI： 10.1145/378456.378484http://dx.doi.org/10.1145/378456.378484］

Feng Y， Choutas V， Bolkart T， Tzionas D and Black M J. 2021. Collaborative regression of expressive bodies using moderation//Proceedings of 2021 International Conference on 3D Vision （3DV）. London， UK： IEEE： 792-804. ［DOI： 10.1109/3DV53792.2021.00088http://dx.doi.org/10.1109/3DV53792.2021.00088］

Feng Y， Yang J L， Pollefeys M， Black M J and Bolkart T. 2022. Capturing and animation of body and clothing from monocular video//Proceedings of 2022 SIGGRAPH Asia Conference Papers. Daegu， Republic of Korea： ACM： #45 ［DOI： 10.1145/3550469.3555423http://dx.doi.org/10.1145/3550469.3555423］

Fleishman S， Drori I and Cohen-Or D. 2003. Bilateral mesh denoising. ACM Transactions on Graphics， 22（3）： 950-953 ［DOI： 10.1145/882262.882368http://dx.doi.org/10.1145/882262.882368］

Foix S， Alenya G and Torras C. 2011. Lock-in time-of-flight （ToF） cameras： a survey. IEEE Sensors Journal， 11（9）： 1917-1926 ［DOI： 10.1109/JSEN.2010.2101060http://dx.doi.org/10.1109/JSEN.2010.2101060］

Furukawa Y and Hern􀅡ndez C. 2015. Multi-View Stereo： A Tutorial. Now Foundations and Trends， 1-148 ［DOI： 10.1561/0600000052http://dx.doi.org/10.1561/0600000052］

Gafni G， Thies J， Zollhöfer M and Nießner M. 2021. Dynamic neural radiance fields for monocular 4D facial avatar reconstruction//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 8645-8654 ［DOI： 10.1109/CVPR46437.2021.00854http://dx.doi.org/10.1109/CVPR46437.2021.00854］

Gall J， Stoll C， De Aguiar E， Theobalt C， Rosenhahn B and Seidel H P. 2009. Motion capture using joint skeleton tracking and surface estimation//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami， USA： IEEE： 1746-1753 ［DOI： 10.1109/CVPR.2009.5206755http://dx.doi.org/10.1109/CVPR.2009.5206755］

Garg R， Roussos A and Agapito L. 2013a. Dense variational reconstruction of non-rigid surfaces from monocular video//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland， USA： IEEE： 1272-1279 ［DOI： 10.1109/CVPR.2013.168http://dx.doi.org/10.1109/CVPR.2013.168］

Garg R， Roussos A and Agapito L. 2013b. A variational approach to video registration with subspace constraints. International Journal of Computer Vision， 104（3）： 286-314 ［DOI： 10.1007/s11263-012-0607-7http://dx.doi.org/10.1007/s11263-012-0607-7］

Garland M and Heckbert P S. 1997. Surface simplification using quadric error metrics//Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques. Los Angeles， USA： ACM： 209-216 ［DOI： 10.1145/258734.258849http://dx.doi.org/10.1145/258734.258849］

Geng J. 2011. Structured-light 3D surface imaging： a tutorial. Advances in Optics and Photonics， 3（2）： 128-160 ［DOI： 10.1364/AOP.3.000128http://dx.doi.org/10.1364/AOP.3.000128］

Georgakis G， Li R， Karanam S， Chen T， Košeck􀅡 J and Wu Z Y. 2020. Hierarchical kinematic human mesh recovery//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 768-784 ［DOI： 10.1007/978-3-030-58520-4_45http://dx.doi.org/10.1007/978-3-030-58520-4_45］

Gotardo P F U and Martinez A M. 2011. Kernel non-rigid structure from motion//Proceedings of 2011 International Conference on Computer Vision. Barcelona， Spain： IEEE： 802-809 ［DOI： 10.1109/ICCV.2011.6126319http://dx.doi.org/10.1109/ICCV.2011.6126319］

Guan P， Weiss A， Bãlan A O and Black M J. 2009. Estimating human shape and pose from a single image//Proceedings of the 12th International Conference on Computer Vision. Kyoto， Japan： IEEE： 1381-1388 ［DOI： 10.1109/ICCV.2009.5459300http://dx.doi.org/10.1109/ICCV.2009.5459300］

Guillard B， Stella F and Fua P. 2022. MeshUDF： fast and differentiable meshing of unsigned distance field networks//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv， Israel： Springer： 576-592 ［DOI： 10.1007/978-3-031-20062-5_33http://dx.doi.org/10.1007/978-3-031-20062-5_33］

Güler R A and Kokkinos I. 2019. HoloPose： holistic 3D human reconstruction in-the-wild//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 10876-10886 ［DOI： 10.1109/CVPR.2019.01114http://dx.doi.org/10.1109/CVPR.2019.01114］

Guo C， Jiang T J， Chen X， Song J and Hilliges O. 2023. Vid2Avatar： 3D avatar reconstruction from videos in the wild via self-supervised scene decomposition//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver， Canada： IEEE： 12858-12868 ［DOI： 10.1109/CVPR52729.2023.01236http://dx.doi.org/10.1109/CVPR52729.2023.01236］

Guo C， Zou S H， Zuo X X， Wang S， Ji W， Li X Y and Cheng L. 2022. Generating diverse and natural 3d human motions from text//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 5152-5161 ［DOI： 10.1109/CVPR52688.2022.00509http://dx.doi.org/10.1109/CVPR52688.2022.00509］

Guo K W， Lincoln P， Davidson P， Busch J， Yu X M， Whalen M， Harvey G， Orts-Escolano S， Pandey R， Dourgarian J， Tang D H， Tkach A， Kowdle A， Cooper E， Dou M S， Fanello S， Fyffe G， Rhemann C， Taylor J， Debevec P and Izadi S. 2019. The relightables： volumetric performance capture of humans with realistic relighting. ACM Transactions on Graphics， 38（6）： #217 ［DOI： 10.1145/3355089.3356571http://dx.doi.org/10.1145/3355089.3356571］

Guo K W， Xu F， Yu T， Liu X Y， Dai Q H and Liu Y B. 2017. Real-time geometry， albedo， and motion reconstruction using a single RGB-D camera. ACM Transactions on Graphics， 36（3）： #32 ［DOI： 10.1145/3083722http://dx.doi.org/10.1145/3083722］

Guo Y D， Chen K Y， Liang S， Liu Y J， Bao H J and Zhang J Y. 2021. AD-NeRF： audio driven neural radiance fields for talking head synthesis//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 5764-5774 ［DOI： 10.1109/ICCV48922.2021.00573http://dx.doi.org/10.1109/ICCV48922.2021.00573］

Habermann M， Liu L J， Xu W P， Zollhoefer M， Pons-Moll G and Theobalt C. 2021. Real-time deep dynamic characters. ACM Transactions on Graphics， 40（4）： #94 ［DOI： 10.1145/3450626.3459749http://dx.doi.org/10.1145/3450626.3459749］

Habermann M， Xu W P， Zollhöfer M， Pons-Moll G and Theobalt C. 2019. LiveCap： real-time human performance capture from monocular video. ACM Transactions on Graphics， 38（2）： #14 ［DOI： 10.1145/3311970http://dx.doi.org/10.1145/3311970］

Habermann M， Xu W P， Zollhöefer M， Pons-Moll G and Theobalt C. 2020. DeepCap： monocular human performance capture using weak supervision//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 5051-5062 ［DOI： 10.1109/CVPR42600.2020.00510http://dx.doi.org/10.1109/CVPR42600.2020.00510］

Hassan M， Choutas V， Tzionas D and Black M J. 2019. Resolving 3D human pose ambiguities with 3D scene constraints//Proceedings of 2019 International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 2282-2292 ［DOI： 10.1109/ICCV.2019.00237http://dx.doi.org/10.1109/ICCV.2019.00237］

He T， Collomosse J， Jin H L and Soatto S. 2020. Geo-PIFu： geometry and pixel aligned implicit functions for single-view human reconstruction//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver， Canada： NIPS： #778 ［DOI： 10.5555/3495724.3496502http://dx.doi.org/10.5555/3495724.3496502］

He T， Xu Y L， Saito S， Soatto S and Tung T. 2021. ARCH++： animation-ready clothed human reconstruction revisited// Proceedings of 2021 International Conference on Computer Vision. Montreal， Canada： IEEE： 11026-11036 ［DOI： 10.1109/ICCV48922.2021.01086http://dx.doi.org/10.1109/ICCV48922.2021.01086］

Hejl J. 2004. Hardware skinning with quaternions//Game Programming Gems 4. ［s.l.］： Charles River Media： 487-495

Hesse N， Pujades S， Black M J， Arens M， Hofmann U G and Schroeder A S. 2020. Learning and tracking the 3D body shape of freely moving infants from RGB-D sequences. IEEE Transactions on Pattern Analysis and Machine Intelligence， 42（10）： 2540-2551. ［DOI： 10.1109/TPAMI.2019.2917908http://dx.doi.org/10.1109/TPAMI.2019.2917908］

Hong F Z， Zhang M Y， Pan L， Cai Z G， Yang L and Liu Z W. 2022a. AvatarCLIP： zero-shot text-driven generation and animation of 3D avatars. ACM Transactions on Graphics， 41（4）： #161 ［DOI： 10.1145/3528223.3530094http://dx.doi.org/10.1145/3528223.3530094］

Hong Y. 2022. Representation and Reconstruction of High-Fidelity Virtual Digital Human. Heifei： University of Science and Technology of China

洪阳. 2022. 高保真虚拟数字人的表示与重建. 合肥：中国科学技术大学［DOI： 10.27517/d.cnki.gzkju.2022.000779http://dx.doi.org/10.27517/d.cnki.gzkju.2022.000779］

Hong Y， Zhang J Y， Jiang B Y， Guo Y D， Liu L G and Bao H J. 2021. StereoPIFu： depth aware clothed human digitization via stereo vision//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 535-545 ［DOI： 10.1109/CVPR46437.2021.00060http://dx.doi.org/10.1109/CVPR46437.2021.00060］

Horaud R， Hansard M， Evangelidis G and Ménier C. 2016. An overview of depth cameras and range scanners based on time-of-flight technologies. Machine Vision and Applications， 27（7）： 1005-1020 ［DOI： 10.1007/s00138-016-0784-4http://dx.doi.org/10.1007/s00138-016-0784-4］

Hu L W， Bradley D， Li H and Beeler T. 2017. Simulation-ready hair capture. Computer Graphics Forum， 36（2）： 281-294 ［DOI： 10.1111/cgf.13126http://dx.doi.org/10.1111/cgf.13126］

Hu T， Yu T， Zheng Z R， Zhang H， Liu Y B and Zwicker M. 2021. HVTR： hybrid volumetric-textural rendering for human avatars//Proceedings of 2021 International Conference on 3D Vision. Prague， Czech Republic： IEEE： 197-208 ［DOI： 10.1109/3DV57658.2022.00032http://dx.doi.org/10.1109/3DV57658.2022.00032］

Huang Y H， Kaufmann M， Aksan E， Black M J， Hilliges O and Pons-Moll G. 2018a. Deep inertial poser： learning to reconstruct human pose from sparse inertial measurements in real time. ACM Transactions on Graphics， 37（6）： #185 ［DOI： 10.1145/3272127.3275108http://dx.doi.org/10.1145/3272127.3275108］

Huang Y Y， Yi H W， Xiu Y L， Liao T T， Tang J X， Cai D and Thies J. 2023. TeCH： text-guided reconstruction of lifelike clothed humans. ［EB/OL］. ［2023-08-19］. http://arxiv.org/pdf/2308.08545.pdfhttp://arxiv.org/pdf/2308.08545.pdf

Huang Z， Li T Y， Chen W K， Zhao Y J， Xing J， LeGendre C， Luo L J， Ma C Y and Li H. 2018b. Deep volumetric video from very sparse multi-view performance capture//Proceedings of the 15th European Conference on Computer Vision （ECCV）. Munich， Germany： Springer： 351-369 ［DOI： 10.1007/978-3-030-01270-0_21http://dx.doi.org/10.1007/978-3-030-01270-0_21］

Huang Z， Xu Y L， Lassner C， Li H and Tung T. 2020. ARCH： animatable reconstruction of clothed humans//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 3090-3099 ［DOI： 10.1109/CVPR42600.2020.00316http://dx.doi.org/10.1109/CVPR42600.2020.00316］

Işık M， Rünz M， Georgopoulos M， Khakhulin T， Starck J， Agapito L and Nießner M. 2023. HumanRF： high-fidelity neural radiance fields for humans in motion. ACM Transactions on Graphics， 42（4）： #160 ［DOI： 10.1145/3592415http://dx.doi.org/10.1145/3592415］

Jackson A S， Manafas C and Tzimiropoulos G. 2019. 3D human body reconstruction from a single image via volumetric regression//Proceedings of 2019 European Conference on Computer Vision （ECCV） Workshops. Munich， Germany： Springer： 64-77 ［DOI： 10.1007/978-3-030-11018-5_6http://dx.doi.org/10.1007/978-3-030-11018-5_6］

Jacobson A， Baran I， Popovic J and Sorkine O. 2011. Bounded biharmonic weights for real-time deformation//ACM SIGGRAPH 2011 Papers. Vancouver， Canada： ACM： #78 ［DOI： 10.1145/1964921.1964973http://dx.doi.org/10.1145/1964921.1964973］

Jiang B Y， Hong Y， Bao H J and Zhang J Y. 2022a. SelfRecon： self reconstruction your digital avatar from monocular video//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 5595-5605 ［DOI： 10.1109/CVPR52688.2022.00552］

Jiang B Y， Zhang J Y， Cai J F and Zheng J M. 2020a. Disentangled human body embedding based on deep hierarchical neural network. IEEE Transactions on Visualization and Computer Graphics， 26（8）： 2560-2575 ［DOI： 10.1109/TVCG.2020.2988476］

Jiang B Y， Zhang J Y， Hong Y， Luo J H， Liu L G and Bao H J. 2020b. BCNet： learning body and cloth shape from a single image//Proceedings of the 16th European Conference on Computer Vision （ECCV）. Glasgow， UK： Springer： 18-35 ［DOI： 10.1007/978-3-030-58565-5_2］

Jiang W， Yi K M， Samei G， Tuzel O and Ranjan A. 2022b. NeuMan： neural human radiance field from a single video//Proceedings of the 17th European Conference on Computer Vision （ECCV）. Tel Aviv， Israel： Springer： 402-418 ［DOI： 10.1007/978-3-031-19824-3_24］

Jiang Y H， Jiang S Y， Sun G X， Su Z， Guo K W， Wu M Y， Yu J Y and Xu L. 2022c. NeuralHOFusion： neural volumetric rendering under human-object interactions//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 6145-6155 ［DOI： 10.1109/CVPR52688.2022.00606］

Joo H， Simon T and Sheikh Y. 2018. Total capture： a 3D deformation model for tracking faces， hands， and bodies//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 8320-8329 ［DOI： 10.1109/CVPR.2018.00868］

Joshi P， Meyer M， DeRose T， Green B and Sanocki T. 2007. Harmonic coordinates for character articulation. ACM Transactions on Graphics， 26（3）： #71-es ［DOI： 10.1145/1276377.1276466］

Ju T， Schaefer S and Warren J. 2005. Mean value coordinates for closed triangular meshes. ACM Transactions on Graphics， 24（3）： 561-566 ［DOI： 10.1145/1073204.1073229］

Kanazawa A， Black M J， Jacobs D W and Malik J. 2018. End-to-end recovery of human shape and pose//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 7122-7131 ［DOI： 10.1109/CVPR.2018.00744］

Karras T， Laine S， Aittala M， Hellsten J， Lehtinen J and Aila T. 2020. Analyzing and improving the image quality of StyleGAN//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 8107-8116 ［DOI： 10.1109/CVPR42600.2020.00813］

Kato H， Ushiku Y and Harada T. 2018. Neural 3D mesh renderer//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 3907-3916 ［DOI： 10.1109/CVPR.2018.00411］

Kavan L， Collins S， Žára J and O’Sullivan C. 2007. Skinning with dual quaternions//2007 Symposium on Interactive 3D Graphics and Games. Seattle， USA： ACM： 39-46 ［DOI： 10.1145/1230100.1230107］

Kavan L and Sorkine O. 2012. Elasticity-inspired deformers for character articulation. ACM Transactions on Graphics， 31（6）： #196 ［DOI： 10.1145/2366145.2366215］

Kavan L and Žára J. 2005. Spherical blend skinning： a real-time deformation of articulated models//2005 Symposium on Interactive 3D Graphics and Games. Washington， USA： ACM： 9-16 ［DOI： 10.1145/1053427.1053429］

Kazhdan M， Bolitho M and Hoppe H. 2006. Poisson surface reconstruction//Proceedings of the 4th Eurographics Symposium on Geometry Processing. Cagliari， Italy： ACM： 61-70 ［DOI： 10.5555/1281957.1281965］

Kirillov A， Mintun E， Ravi N， Mao H， Rolland C， Gustafson L， Xiao T T， Whitehead S， Berg A C， Lo W Y， Dollár P and Girshick R. 2023. Segment anything//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris， France： IEEE： 3992-4003 ［DOI： 10.1109/ICCV51070.2023.00371］

Kocabas M， Athanasiou N and Black M J. 2020. VIBE： video inference for human body pose and shape estimation//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 5253-5262 ［DOI： 10.1109/CVPR42600.2020.00530］

Kocabas M， Huang C H P， Hilliges O and Black M J. 2021a. PARE： part attention regressor for 3D human body estimation//Proceedings of 2021 International Conference on Computer Vision. Montreal， Canada： IEEE： 11107-11117 ［DOI： 10.1109/ICCV48922.2021.01094］

Kocabas M， Huang C H P， Tesch J， Müller L， Hilliges O and Black M J. 2021b. SPEC： seeing people in the wild with an estimated camera//Proceedings of 2021 International Conference on Computer Vision. Montreal， Canada： IEEE： 11015-11025 ［DOI： 10.1109/ICCV48922.2021.01085］

Kolotouros N， Alldieck T， Zanfir A， Bazavan E G， Fieraru M and Sminchisescu C. 2023. DreamHuman： animatable 3D avatars from text//Proceedings of the 37th International Conference on Neural Information Processing Systems. New Orleans， USA： NeurIPS： 10516-10529

Kolotouros N， Pavlakos G， Black M and Daniilidis K. 2019. Learning to reconstruct 3D human pose and shape via model-fitting in the loop//Proceedings of 2019 International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 2252-2261 ［DOI： 10.1109/ICCV.2019.00234］

Kong C and Lucey S. 2016. Prior-less compressible structure from motion//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition （CVPR）. Las Vegas， USA： IEEE： 4123-4131 ［DOI： 10.1109/CVPR.2016.447］

Kry P G， James D L and Pai D K. 2002. EigenSkin： real time large deformation character skinning in hardware//2002 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. San Antonio， USA： ACM： 153-159 ［DOI： 10.1145/545261.545286］

Kumar S， Cherian A， Dai Y C and Li H D. 2018. Scalable dense non-rigid structure-from-motion： a grassmannian perspective//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 254-263 ［DOI： 10.1109/CVPR.2018.00034］

Lassner C， Romero J， Kiefel M， Bogo F， Black M J and Gehler P V. 2017. Unite the people： closing the loop between 3D and 2D human representations//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition （CVPR）. Honolulu， USA： IEEE： 4704-4713 ［DOI： 10.1109/CVPR.2017.500］

Le B H and Hodgins J K. 2016. Real-time skeletal skinning with optimized centers of rotation. ACM Transactions on Graphics， 35（4）： #37 ［DOI： 10.1145/2897824.2925959］

Lewis J P， Cordner M and Fong N. 2000. Pose space deformation： a unified approach to shape interpolation and skeleton-driven deformation//Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. New Orleans， USA： ACM： 165-172 ［DOI： 10.1145/344779.344862］

Li J F， Xu C， Chen Z C， Bian S Y， Yang L X and Lu C W. 2021a. HybrIK： a hybrid analytical-neural inverse kinematics solution for 3D human pose and shape estimation//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 3382-3392 ［DOI： 10.1109/CVPR46437.2021.00339］

Li P K， Xu Y Q， Wei Y C and Yang Y. 2020a. Self-correction for human parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence， 44（6）： 3260-3271 ［DOI： 10.1109/TPAMI.2020.3048039］

Li P Z， Aberman K， Hanocka R， Liu L B， Sorkine-Hornung O and Chen B Q. 2021b. Learning skeletal articulations with neural blend shapes. ACM Transactions on Graphics， 40（4）： #130 ［DOI： 10.1145/3450626.3459852］

Li R L， Xiu Y L， Saito S， Huang Z， Olszewski K and Li H. 2020b. Monocular real-time volumetric performance capture//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 49-67 ［DOI： 10.1007/978-3-030-58592-1_4］

Li Z， Yu T， Pan C Y， Zheng Z R and Liu Y B. 2020c. Robust 3D self-portraits in seconds//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 1341-1350 ［DOI： 10.1109/CVPR42600.2020.00142］

Li Z H， Liu J Z， Zhang Z S， Xu S C and Yan Y L. 2022. CLIFF： carrying location information in full frames into human pose and shape estimation//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv， Israel： Springer： 590-606 ［DOI： 10.1007/978-3-031-20065-6_34］

Lin S Y， Zhang H W， Zheng Z R， Shao R Z and Liu Y B. 2022. Learning implicit templates for point-based clothed human modeling//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv， Israel： Springer： 210-228 ［DOI： 10.1007/978-3-031-20062-5_13］

Liu F and Zhou Y F. 2023. Implicit reconstruction of 3D human body from single view based on parametric-model and normal inference. Journal of Nanjing University of Posts and Telecommunications （Natural Science Edition）， 43（3）： 1-10 ［DOI： 10.14132/j.cnki.1673-5439.2023.03.001］

Liu L J， Habermann M， Rudnev V， Sarkar K， Gu J T and Theobalt C. 2021. Neural actor： neural free-view synthesis of human actors with pose control. ACM Transactions on Graphics， 40（6）： #219 ［DOI： 10.1145/3478513.3480528］

Liu S C， Chen W K， Li T Y and Li H. 2019. Soft rasterizer： a differentiable renderer for image-based 3D reasoning//Proceedings of 2019 International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 7707-7716 ［DOI： 10.1109/ICCV.2019.00780］

Lombardi S， Simon T， Saragih J， Schwartz G， Lehrmann A and Sheikh Y. 2019. Neural volumes： learning dynamic renderable volumes from images. ACM Transactions on Graphics， 38（4）： #65 ［DOI： 10.1145/3306346.3323020］

Loper M， Mahmood N and Black M J. 2014. MoSh： motion and shape capture from sparse markers. ACM Transactions on Graphics， 33（6）： #220 ［DOI： 10.1145/2661229.2661273］

Loper M， Mahmood N， Romero J， Pons-Moll G and Black M J. 2015. SMPL： a skinned multi-person linear model. ACM Transactions on Graphics， 34（6）： #248 ［DOI： 10.1145/2816795.2818013］

Loper M M and Black M J. 2014. OpenDR： an approximate differentiable renderer//Proceedings of the 13th European Conference on Computer Vision. Zurich， Switzerland： Springer： 154-169 ［DOI： 10.1007/978-3-319-10584-0_11］

Lorensen W E and Cline H E. 1987. Marching cubes： a high resolution 3D surface construction algorithm. ACM SIGGRAPH Computer Graphics， 21（4）： 163-169 ［DOI： 10.1145/37402.37422］

Luo Z Y， Golestaneh S A and Kitani K M. 2021. 3D human motion estimation via motion compression and refinement//Proceedings of the 15th Asian Conference on Computer Vision. Kyoto， Japan： Springer： 324-340 ［DOI： 10.1007/978-3-030-69541-5_20］

Ma Q L， Saito S， Yang J L， Tang S Y and Black M J. 2021a. SCALE： modeling clothed humans with a surface codec of articulated local elements//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 16077-16088 ［DOI： 10.1109/CVPR46437.2021.01582］

Ma Q L， Yang J L， Black M J and Tang S Y. 2022. Neural point-based shape modeling of humans in challenging clothing//Proceedings of 2022 International Conference on 3D Vision （3DV）. Prague， Czech Republic： IEEE： 679-689 ［DOI： 10.1109/3DV57658.2022.00078］

Ma Q L， Yang J L， Ranjan A， Pujades S， Pons-Moll G， Tang S Y and Black M J. 2020. Learning to dress 3D people in generative clothing//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 6468-6477 ［DOI： 10.1109/CVPR42600.2020.00650］

Ma Q L， Yang J L， Tang S Y and Black M J. 2021b. The power of points for modeling humans in clothing//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision （ICCV）. Montreal， Canada： IEEE： 10954-10964 ［DOI： 10.1109/ICCV48922.2021.01079］

Magnenat-Thalmann N， Laperrière R and Thalmann D. 1988. Joint-dependent local deformations for hand animation and object grasping//Proceedings on Graphics Interface’88. Edmonton， Canada： ACM： 26-33 ［DOI： 10.5555/102313.102317］

Mahmood N， Ghorbani N， Troje N F， Pons-Moll G and Black M. 2019. AMASS： archive of motion capture as surface shapes//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 5441-5450 ［DOI： 10.1109/ICCV.2019.00554］

Martel J N P， Lindell D B， Lin C Z， Chan E R， Monteiro M and Wetzstein G. 2021. Acorn： adaptive coordinate networks for neural scene representation. ACM Transactions on Graphics， 40（4）： #58 ［DOI： 10.1145/3450626.3459785］

Merry B， Marais P and Gain J. 2006. Animation space： a truly linear framework for character animation. ACM Transactions on Graphics， 25（4）： 1400-1423 ［DOI： 10.1145/1183287.1183294］

Mescheder L， Oechsle M， Niemeyer M， Nowozin S and Geiger A. 2019. Occupancy networks： learning 3D reconstruction in function space//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 4455-4465 ［DOI： 10.1109/CVPR.2019.00459］

Mihajlovic M， Zhang Y， Black M J and Tang S Y. 2021. LEAP： learning articulated occupancy of people//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 10456-10466 ［DOI： 10.1109/CVPR46437.2021.01032］

Mildenhall B， Srinivasan P P， Tancik M， Barron J T， Ramamoorthi R and Ng R. 2020. NeRF： representing scenes as neural radiance fields for view synthesis//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 405-421 ［DOI： 10.1007/978-3-030-58452-8_24］

Mukai T and Kuriyama S. 2016. Efficient dynamic skinning with low-rank helper bone controllers. ACM Transactions on Graphics， 35（4）： #36 ［DOI： 10.1145/2897824.2925905］

Müller T， Evans A， Schied C and Keller A. 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics， 41（4）： #102 ［DOI： 10.1145/3528223.3530127］

Newcombe R A， Fox D and Seitz S M. 2015. DynamicFusion： reconstruction and tracking of non-rigid scenes in real-time//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston， USA： IEEE： 343-352 ［DOI： 10.1109/CVPR.2015.7298631］

Niemeyer M， Mescheder L， Oechsle M and Geiger A. 2020. Differentiable volumetric rendering： learning implicit 3D representations without 3D supervision//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 3501-3512 ［DOI： 10.1109/CVPR42600.2020.00356］

Omran M， Lassner C， Pons-Moll G， Gehler P and Schiele B. 2018. Neural body fitting： unifying deep learning and model based human pose and shape estimation//Proceedings of 2018 International Conference on 3D Vision. Verona， Italy： IEEE： 484-494 ［DOI： 10.1109/3DV.2018.00062］

Onizuka H， Hayirci Z， Thomas D， Sugimoto A， Uchiyama H and Taniguchi R I. 2020. TetraTSDF： 3D human reconstruction from a single image with a tetrahedral outer shell//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 6010-6019 ［DOI： 10.1109/CVPR42600.2020.00605］

Orts-Escolano S， Rhemann C， Fanello S， Chang W， Kowdle A， Degtyarev Y， Kim D， Davidson P L， Khamis S， Dou M S， Tankovich V， Loop C， Cai Q， Chou P A， Mennicken S， Valentin J， Pradeep V， Wang S L， Kang S B， Kohli P， Lutchyn Y， Keskin C and Izadi S. 2016. Holoportation： virtual 3D teleportation in real-time//Proceedings of the 29th Annual Symposium on User Interface Software and Technology. Tokyo， Japan： ACM： 741-754 ［DOI： 10.1145/2984511.2984517］

Osman A A， Bolkart T and Black M J. 2020. STAR： sparse trained articulated human body regressor//Proceedings of the 16th European Conference on Computer Vision （ECCV）. Glasgow， UK： Springer： 598-613 ［DOI： 10.1007/978-3-030-58539-6_36］

Paladini M， Del Bue A， Xavier J， Agapito L， Stošić M and Dodig M. 2012. Optimal metric projections for deformable and articulated structure-from-motion. International Journal of Computer Vision， 96（2）： 252-276 ［DOI： 10.1007/s11263-011-0468-5］

Palafox P， Božič A， Thies J， Nießner M and Dai A. 2021. NPMs： neural parametric models for 3D deformable shapes//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 12675-12685 ［DOI： 10.1109/ICCV48922.2021.01246］

Park J J， Florence P， Straub J， Newcombe R and Lovegrove S. 2019. DeepSDF： learning continuous signed distance functions for shape representation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 165-174 ［DOI： 10.1109/CVPR.2019.00025］

Park K， Sinha U， Barron J T， Bouaziz S， Goldman D B， Seitz S M and Martin-Brualla R. 2020. Deformable neural radiance fields//Proceedings of 2020 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 5845-5854 ［DOI： 10.1109/ICCV48922.2021.00581］

Pavlakos G， Choutas V， Ghorbani N， Bolkart T， Osman A A， Tzionas D and Black M J. 2019. Expressive body capture： 3D hands， face， and body from a single image//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 10967-10977 ［DOI： 10.1109/CVPR.2019.01123］

Peng S D， Dong J T， Wang Q Q， Zhang S Z， Shuai Q， Zhou X W and Bao H J. 2021a. Animatable neural radiance fields for modeling dynamic human bodies//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 14294-14303 ［DOI： 10.1109/ICCV48922.2021.01405］

Peng S D， Zhang Y Q， Xu Y H， Wang Q Q， Shuai Q， Bao H J and Zhou X W. 2021b. Neural body： implicit neural representations with structured latent codes for novel view synthesis of dynamic humans//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 9050-9059 ［DOI： 10.1109/CVPR46437.2021.00894］

Pons-Moll G， Pujades S， Hu S and Black M J. 2017. ClothCap： seamless 4D clothing capture and retargeting. ACM Transactions on Graphics， 36（4）： #73 ［DOI： 10.1145/3072959.3073711］

Poole B， Jain A， Barron J T and Mildenhall B. 2022. DreamFusion： text-to-3D using 2D diffusion//Proceedings of the 11th International Conference on Learning Representations. Kigali， Rwanda： ICLR

Raj T， Hashim F H， Huddin A B， Ibrahim M F and Hussain A. 2020. A survey on LiDAR scanning mechanisms. Electronics， 9（5）： #741 ［DOI： 10.3390/electronics9050741］

Ren S， He K， Girshick R and Sun J. 2017. Faster R-CNN： towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence， 39（6）： 1137-1149 ［DOI： 10.1109/TPAMI.2016.2577031］

Rockwell C and Fouhey D F. 2020. Full-body awareness from partial observations//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 522-539 ［DOI： 10.1007/978-3-030-58520-4_31］

Romero J， Tzionas D and Black M J. 2017. Embodied hands： modeling and capturing hands and bodies together. ACM Transactions on Graphics， 36（6）： #245 ［DOI： 10.1145/3130800.3130883］

Roriz R， Cabral J and Gomes T. 2022. Automotive LiDAR technology： a survey. IEEE Transactions on Intelligent Transportation Systems， 23（7）： 6282-6297 ［DOI： 10.1109/TITS.2021.3086804］

Rueegg N， Lassner C， Black M and Schindler K. 2020. Chained representation cycling： learning to estimate 3D human pose and shape by cycling between representations//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York， USA： AAAI： 5561-5569 ［DOI： 10.1609/aaai.v34i04.6008］

Saito S， Huang Z， Natsume R， Morishima S， Li H and Kanazawa A. 2019. PIFu： pixel-aligned implicit function for high-resolution clothed human digitization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision （ICCV）. Seoul， Korea （South）： IEEE： 2304-2314 ［DOI： 10.1109/ICCV.2019.00239］

Saito S， Simon T， Saragih J and Joo H. 2020. PIFuHD： multi-level pixel-aligned implicit function for high-resolution 3D human digitization//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 81-90 ［DOI： 10.1109/CVPR42600.2020.00016］

Saito S， Yang J L， Ma Q L and Black M J. 2021. SCANimate： weakly supervised learning of skinned clothed avatar networks//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 2885-2896 ［DOI： 10.1109/CVPR46437.2021.00291］

Santesteban I， Garces E， Otaduy M A and Casas D. 2020. SoftSMPL： data-driven modeling of nonlinear soft-tissue dynamics for parametric humans. Computer Graphics Forum， 39（2）： 65-75 ［DOI： 10.1111/cgf.13912］

Santesteban I， Otaduy M A and Casas D. 2022. SNUG： self-supervised neural dynamic garments//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 8130-8140 ［DOI： 10.1109/CVPR52688.2022.00797］

Sengupta A， Cipolla R and Budvytis I. 2020. Synthetic training for accurate 3D human pose and shape estimation in the wild//Proceedings of the 31st British Machine Vision Conference. ［s.l.］： BMVC

Sengupta S， Gu J W， Kim K， Liu G L， Jacobs D and Kautz J. 2019. Neural inverse rendering of an indoor scene from a single image//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision （ICCV）. Seoul， Korea （South）： IEEE： 8597-8606 ［DOI： 10.1109/ICCV.2019.00869］

Seo J， Irving G， Lewis J P and Noh J. 2011. Compression and direct manipulation of complex blendshape models//Proceedings of 2011 SIGGRAPH Asia Conference. Hong Kong， China： Association for Computing Machinery： #164 ［DOI： 10.1145/2024156.2024198］

Sitzmann V， Thies J， Heide F， Nießner M， Wetzstein G and Zollhöfer M. 2019a. DeepVoxels： learning persistent 3D feature embeddings//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 2432-2441 ［DOI： 10.1109/CVPR.2019.00254］

Sitzmann V， Zollhöfer M and Wetzstein G. 2019b. Scene representation networks： continuous 3D-structure-aware neural scene representations//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver， Canada： NeurIPS： #101 ［DOI： 10.5555/3454287.3454388］

Sorkine O， Cohen-Or D， Lipman Y， Alexa M， Rössl C and Seidel H P. 2004. Laplacian surface editing//2004 Eurographics/ACM SIGGRAPH Symposium on Geometry Processing. Nice， France： ACM： 175-184 ［DOI： 10.1145/1057432.1057456］

Su Z， Xu L， Zheng Z R， Yu T， Liu Y B and Fang L. 2020. RobustFusion： human volumetric capture with data-driven visual cues using a RGBD camera//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 246-264 ［DOI： 10.1007/978-3-030-58548-8_15］

Su Z Q， Yu T， Wang Y G and Liu Y B. 2023. DeepCloth： neural garment representation for shape and style editing. IEEE Transactions on Pattern Analysis and Machine Intelligence， 45（2）： 1581-1593 ［DOI： 10.1109/TPAMI.2022.3168569］

Sun G X， Chen X， Chen Y Z， Pang A Q， Lin P， Jiang Y H， Xu L， Yu J Y and Wang J Y. 2021. Neural free-viewpoint performance rendering under complex human-object interactions//Proceedings of the 29th ACM International Conference on Multimedia. ［s.l.］： ACM： 4651-4660 ［DOI： 10.1145/3474085.3475442］

Tan J， Budvytis I and Cipolla R. 2017. Indirect deep structured learning for 3D human body shape and pose prediction//Proceedings of 2017 British Machine Vision Conference. London， UK： BMVC

Tevet G， Gordon B， Hertz A， Bermano A H and Cohen-Or D. 2022. MotionCLIP： exposing human motion generation to CLIP space//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv， Israel： Springer： 358-374 ［DOI： 10.1007/978-3-031-20047-2_21］

Tewari A， Zollhöfer M， Kim H， Garrido P， Bernard F， Pérez P and Theobalt C. 2017. MoFA： model-based deep convolutional face autoencoder for unsupervised monocular reconstruction//Proceedings of 2017 International Conference on Computer Vision. Venice， Italy： IEEE： 3735-3744 ［DOI： 10.1109/ICCV.2017.401］

Tian Y T， Zhang H W， Liu Y B and Wang L M. 2023. Recovering 3D human mesh from monocular images： a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence， 45（12）： 15406-15425 ［DOI： 10.1109/TPAMI.2023.3298850］

Tiwari G， Sarafianos N， Tung T and Pons-Moll G. 2021. Neural-GIF： neural generalized implicit functions for animating people in clothing//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision （ICCV）. Montreal， Canada： IEEE： 11688-11698 ［DOI： 10.1109/ICCV48922.2021.01150］

Tong L J and Zhang C Z. 2023. A 3D human body reconstruction method based on involution convolution. Journal of Hubei Minzu University （Natural Science Edition）， 41（2）： 184-190 ［DOI： 10.13501/j.cnki.42-1908/n.2023.06.007］

Torresani L， Hertzmann A and Bregler C. 2008. Nonrigid structure-from-motion： estimating shape and motion with hierarchical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence， 30（5）： 878-892 ［DOI： 10.1109/TPAMI.2007.70752］

Tretschk E， Kairanda N， R M B， Dabral R， Kortylewski A， Egger B， Habermann M， Fua P， Theobalt C and Golyanik V. 2023. State of the art in dense monocular non-rigid 3D reconstruction. Computer Graphics Forum， 42（2）： 485-520 ［DOI： 10.1111/cgf.14774］

Varol G， Ceylan D， Russell B， Yang J M， Yumer E， Laptev I and Schmid C. 2018. BodyNet： volumetric inference of 3D human body shapes//Proceedings of the 15th European Conference on Computer Vision. Munich， Germany： Springer： 20-38 ［DOI： 10.1007/978-3-030-01234-2_2］

Vaswani A， Shazeer N， Parmar N， Uszkoreit J， Jones L， Gomez A N， Kaiser Ł and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach， USA： ACM： 6000-6010 ［DOI： 10.5555/3295222.3295349］

Vlasic D， Baran I， Matusik W and Popović J. 2008. Articulated mesh animation from multi-view silhouettes. ACM Transactions on Graphics， 27（3）： 1-9 ［DOI： 10.1145/1360612.1360696］

von Marcard T， Henschel R， Black M J， Rosenhahn B and Pons-Moll G. 2018. Recovering accurate 3D human pose in the wild using IMUs and a moving camera//Proceedings of the 15th European Conference on Computer Vision （ECCV）. Munich， Germany： Springer： 614-631 ［DOI： 10.1007/978-3-030-01249-6_37］

Wang C， Chai M L， He M M， Chen D D and Liao J. 2021a. CLIP-NeRF： text-and-image driven manipulation of neural radiance fields//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 3825-3834 ［DOI： 10.1109/CVPR52688.2022.00381］

Wang L L， Yan Q， Yao J M and Lin Z X. 2022. Research on 3D reconstruction algorithm of human body based on SMPL model. Transducer and Microsystem Technologies， 41（10）： 59-63 ［DOI： 10.13873/J.1000-9787(2022)10-0059-05］

Wang P， Liu L J， Liu Y， Theobalt C， Komura T and Wang W P. 2021b. NeuS： learning neural implicit surfaces by volume rendering for multi-view reconstruction//Proceedings of the 35th International Conference on Neural Information Processing Systems. ［s.l.］： NeurIPS： 27171-27183

Wang Q Q， Wang Z C， Genova K， Srinivasan P， Zhou H， Barron J T， Martin-Brualla R， Snavely N and Funkhouser T. 2021c. IBRNet： learning multi-view image-based rendering//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 4688-4697 ［DOI： 10.1109/CVPR46437.2021.00466］

Wang X C and Phillips C. 2002. Multi-weight enveloping： least-squares approximation techniques for skin animation//2002 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. San Antonio， USA： ACM： 129-138 ［DOI： 10.1145/545261.545283http://dx.doi.org/10.1145/545261.545283］

Wang Y F， Aigerman N， Kim V G， Chaudhuri S and Sorkine-Hornung O. 2020. Neural cages for detail-preserving 3D deformations//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 72-80 ［DOI： 10.1109/CVPR42600.2020.00015http://dx.doi.org/10.1109/CVPR42600.2020.00015］

Weber O， Sorkine O， Lipman Y and Gotsman C. 2007. Context-aware skeletal shape deformation. Computer Graphics Forum， 26（3）： 265-274 ［DOI： 10.1111/j.1467-8659.2007.01048.xhttp://dx.doi.org/10.1111/j.1467-8659.2007.01048.x］

Wu C L， Bradley D， Garrido P， Zollhöfer M， Theobalt C， Gross M and Beeler T. 2016. Model-based teeth reconstruction. ACM Transactions on Graphics， 35（6）： #220 ［DOI： 10.1145/2980179.2980233http://dx.doi.org/10.1145/2980179.2980233］

Wu W X， Qi Z A and Li F X. 2019. PointConv： deep convolutional networks on 3D point clouds//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach， USA： IEEE： 9613-9622 ［DOI： 10.1109/CVPR.2019.00985］

Xiong Z Y， Kang D， Jin D R， Chen W K， Bao L C， Cui S G and Han X G. 2023. Get3DHuman： lifting StyleGAN-human into a 3D generative model using pixel-aligned reconstruction priors//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris， France： IEEE： 9253-9263 ［DOI： 10.1109/ICCV51070.2023.00852］

Xiu Y L， Yang J L， Cao X， Tzionas D and Black M J. 2023. ECON： explicit clothed humans optimized via normal integration//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Vancouver， Canada： IEEE： 512-523 ［DOI： 10.1109/CVPR52729.2023.00057］

Xiu Y L， Yang J L， Tzionas D and Black M J. 2022. ICON： implicit clothed humans obtained from normals//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 13286-13296 ［DOI： 10.1109/CVPR52688.2022.01294］

Xu F Y， Wang P， Li B W and Yu H R. 2022. Virtual fitting of personalized cheongsam based on 3D reconstruction of Kinect's human point cloud. Microcomputer Applications， 38（10）： 108-112， 116 ［DOI： 10.3969/j.issn.1007-757X.2022.10.031］

Xu H Y， Bazavan E G， Zanfir A， Freeman W T， Sukthankar R and Sminchisescu C. 2020a. GHUM & GHUML： generative 3D human shape and articulated pose models//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 6183-6192 ［DOI： 10.1109/CVPR42600.2020.00622］

Xu W P， Chatterjee A， Zollhöfer M， Rhodin H， Mehta D， Seidel H P and Theobalt C. 2018. MonoPerfCap： human performance capture from monocular video. ACM Transactions on Graphics， 37（2）： #27 ［DOI： 10.1145/3181973］

Xu X Y， Chen H， Moreno-Noguer F， Jeni L A and De La Torre F. 2020b. 3D human shape and pose from a single low-resolution image with self-supervised learning//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 284-300 ［DOI： 10.1007/978-3-030-58545-7_17］

Xu Y F， Zhang J， Zhang Q M and Tao D C. 2022. ViTPose： simple vision transformer baselines for human pose estimation//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans， USA： ACM： #2795 ［DOI： 10.5555/3600270.3603065］

Xu Y L， Zhu S C and Tung T. 2019. DenseRaC： joint 3D pose and shape estimation by dense render-and-compare//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 7759-7769 ［DOI： 10.1109/ICCV.2019.00785］

Yang Z， Wang S L， Manivasagam S， Huang Z， Ma W C， Yan X C， Yumer E and Urtasun R. 2021. S3： neural shape， skeleton， and skinning fields for 3D human modeling//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 13279-13288 ［DOI： 10.1109/CVPR46437.2021.01308］

Yao L， Zhang Y A， Zhang M X and Wan Y. 2021. The impact of joint axis angle prior on the results of 3D human body reconstruction. Journal of Image and Graphics， 26（12）： 2918-2930 ［DOI： 10.11834/jig.200348］

Yariv L， Gu J T， Kasten Y and Lipman Y. 2021. Volume rendering of neural implicit surfaces//Proceedings of the 35th International Conference on Neural Information Processing Systems. ［s.l.］： NeurIPS： 4805-4815

Yi X Y， Zhou Y X， Habermann M， Shimada S， Golyanik V， Theobalt C and Xu F. 2022. Physical inertial poser （PIP）： physics-aware real-time human motion tracking from sparse inertial sensors//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 13157-13168 ［DOI： 10.1109/CVPR52688.2022.01282］

Yi X Y， Zhou Y X and Xu F. 2021. TransPose： real-time 3D human translation and pose estimation with six inertial sensors. ACM Transactions on Graphics， 40（4）： #86 ［DOI： 10.1145/3450626.3459786］

Yu T， Guo K W， Xu F， Dong Y， Su Z Q， Zhao J H， Li J G， Dai Q H and Liu Y B. 2017. BodyFusion： real-time capture of human motion and surface geometry using a single depth camera//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice， Italy： IEEE： 910-919 ［DOI： 10.1109/ICCV.2017.104］

Yu T， Zheng Z R， Guo K W， Liu P P， Dai Q H and Liu Y B. 2021a. Function4D： real-time human volumetric capture from very sparse consumer RGBD sensors//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville， USA： IEEE： 5742-5752 ［DOI： 10.1109/CVPR46437.2021.00569］

Yu T， Zheng Z R， Guo K W， Zhao J H， Dai Q H， Li H， Pons-Moll G and Liu Y B. 2018. DoubleFusion： real-time capture of human performances with inner body shapes from a single depth sensor//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 7287-7296 ［DOI： 10.1109/CVPR.2018.00761］

Yu Z B， Wang J J， Xu J W， Ni B B， Zhao C L， Wang M S and Zhang W J. 2021b. Skeleton2Mesh： kinematics prior injected unsupervised human mesh recovery//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 8599-8609 ［DOI： 10.1109/ICCV48922.2021.00850］

Yu Z X， Yoon J S， Lee I K， Venkatesh P， Park J， Yu J H and Park H S. 2020. HUMBI： a large multiview dataset of human body expressions//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle， USA： IEEE： 2987-2997 ［DOI： 10.1109/CVPR42600.2020.00306］

Zanfir A， Bazavan E G， Xu H Y， Freeman W T， Sukthankar R and Sminchisescu C. 2020. Weakly supervised 3D human pose and shape reconstruction with normalizing flows//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 465-481 ［DOI： 10.1007/978-3-030-58539-6_28］

Zanfir A， Marinoiu E and Sminchisescu C. 2018. Monocular 3D pose and shape estimation of multiple people in natural scenes： the importance of multiple scene constraints//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City， USA： IEEE： 2148-2157 ［DOI： 10.1109/CVPR.2018.00229］

Zanfir M， Zanfir A， Bazavan E G， Freeman W T， Sukthankar R and Sminchisescu C. 2021. THUNDR： Transformer-based 3D HUmaN reconstruction with markers//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 12951-12960 ［DOI： 10.1109/ICCV48922.2021.01273］

Zhang C， Pujades S， Black M and Pons-Moll G. 2017. Detailed， accurate， human shape estimation from clothed 3D scan sequences//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu， USA： IEEE： 5484-5493 ［DOI： 10.1109/CVPR.2017.582］

Zhang H W， Cao J， Lu G， Ouyang W L and Sun Z A. 2019a. DaNet： decompose-and-aggregate network for 3D human shape and pose estimation//Proceedings of the 27th ACM International Conference on Multimedia. Nice， France： ACM： 935-944 ［DOI： 10.1145/3343031.3351057］

Zhang H W， Lin S Y， Shao R Z， Zhang Y X， Zheng Z R， Huang H， Guo Y D and Liu Y B. 2023. CloSET： modeling clothed humans on continuous surface with explicit template decomposition//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver， Canada： IEEE： 501-511 ［DOI： 10.1109/CVPR52729.2023.00056］

Zhang H W， Tian Y T， Zhou X C， Ouyang W L， Liu Y B， Wang L M and Sun Z A. 2021. PyMAF： 3D human pose and shape regression with pyramidal mesh alignment feedback loop//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal， Canada： IEEE： 11426-11436 ［DOI： 10.1109/ICCV48922.2021.01125］

Zhang J F， Jiang Z H， Yang D D， Xu H Y， Shi Y C， Song G X， Xu Z C， Wang X C and Feng J S. 2022b. AvatarGen： a 3D generative model for animatable human avatars//Proceedings of 2022 European Conference on Computer Vision. Tel Aviv， Israel： Springer： 668-685 ［DOI： 10.1007/978-3-031-25066-8_39］

Zhang J Y， Deng B L， Hong Y， Peng Y， Qin W J and Liu L G. 2019b. Static/dynamic filtering for mesh geometry. IEEE Transactions on Visualization and Computer Graphics， 25（4）： 1774-1787 ［DOI： 10.1109/TVCG.2018.2816926］

Zhang J Y， Pepose S， Joo H， Ramanan D， Malik J and Kanazawa A. 2020a. Perceiving 3D human-object spatial arrangements from a single image in the wild//Proceedings of the 16th European Conference on Computer Vision. Glasgow， UK： Springer： 34-51 ［DOI： 10.1007/978-3-030-58610-2_3］

Zhang W Y， Deng B L， Zhang J Y， Bouaziz S and Liu L G. 2015. Guided mesh normal filtering. Computer Graphics Forum， 34（7）： 23-34 ［DOI： 10.1111/cgf.12742］

Zhang Y， Hassan M， Neumann H， Black M J and Tang S Y. 2020b. Generating 3D people in scenes without people//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Seattle， USA： IEEE： 6193-6203 ［DOI： 10.1109/CVPR42600.2020.00623］

Zhao F Q， Jiang Y H， Yao K X， Zhang J K， Wang L， Dai H Z， Zhong Y H， Zhang Y L， Wu M Y， Xu L and Yu J Y. 2022a. Human performance modeling and rendering via neural animated mesh. ACM Transactions on Graphics， 41（6）： #235 ［DOI： 10.1145/3550454.3555451］

Zhao F Q， Yang W， Zhang J K， Lin P， Zhang Y L， Yu J Y and Xu L. 2022b. HumanNeRF： efficiently generated human radiance field from sparse inputs//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 7733-7743 ［DOI： 10.1109/CVPR52688.2022.00759］

Zheng C X， Yao J M， Yan Q and Lin Z X. 2022. 3D human reconstruction based on sequence frames. Transducer and Microsystem Technologies， 41（12）： 33-37 ［DOI： 10.13873/J.1000-9787(2022)12-0033-05］

Zheng Y， Shao R Z， Zhang Y X， Yu T， Zheng Z R， Dai Q H and Liu Y B. 2021. DeepMultiCap： performance capture of multiple characters using sparse multiview cameras//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision （ICCV）. Montreal， Canada： IEEE： 6219-6229 ［DOI： 10.1109/ICCV48922.2021.00618］

Zheng Y F， Abrevaya V F， Bühler M C， Chen X， Black M J and Hilliges O. 2022a. I M Avatar： implicit morphable head avatars from videos//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. New Orleans， USA： IEEE： 13535-13545 ［DOI： 10.1109/CVPR52688.2022.01318］

Zheng Z R， Huang H， Yu T， Zhang H W， Guo Y D and Liu Y B. 2022b. Structured local radiance fields for human avatar modeling//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans， USA： IEEE： 15872-15882 ［DOI： 10.1109/CVPR52688.2022.01543］

Zheng Z R， Yu T， Liu Y B and Dai Q H. 2022c. PaMIR： parametric model-conditioned implicit representation for image-based human reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence， 44（6）： 3170-3184 ［DOI： 10.1109/TPAMI.2021.3050505］

Zheng Z R， Yu T， Wei Y X， Dai Q H and Liu Y B. 2019. DeepHuman： 3D human reconstruction from a single image//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul， Korea （South）： IEEE： 7738-7748 ［DOI： 10.1109/ICCV.2019.00783］

Zhu X Y， Liao T T， Zhang X M， Lyu J J， Chen Z W， Wang Y F， Guo K， Cao Q， Li S Z and Lei Z. 2023. MVP-human dataset for 3D clothed human avatar reconstruction from multiple frames. IEEE Transactions on Biometrics， Behavior， and Identity Science， 5（4）： 464-475 ［DOI： 10.1109/TBIOM.2023.3276901］

Zhou Y， Barnes C， Lu J W， Yang J M and Li H. 2019. On the continuity of rotation representations in neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Long Beach， USA： IEEE： 5738-5746 ［DOI： 10.1109/CVPR.2019.00589］

Zhou Y X， Habermann M， Habibie I， Tewari A， Theobalt C and Xu F. 2021. Monocular real-time full body capture with inter-part correlations//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Nashville， USA： IEEE： 4809-4820 ［DOI： 10.1109/CVPR46437.2021.00478］

Zimmermann C and Brox T. 2017. Learning to estimate 3D hand pose from single RGB images//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice， Italy： IEEE： 4913-4921 ［DOI： 10.1109/ICCV.2017.525］

Zou X X， Han X T and Wong W. 2023. CLOTH4D： a dataset for clothed human reconstruction//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver， Canada： IEEE： 12847-12857 ［DOI： 10.1109/CVPR52729.2023.01235］
