The impact of joint axis angle prior on the results of 3D human body reconstruction
2021, Vol. 26, No. 12, Pages 2918-2930
Print publication date: 2021-12-16
Accepted: 2020-09-29
DOI: 10.11834/jig.200348
Li Yao, Youan Zhang, Mengxue Zhang, Yan Wan. The impact of joint axis angle prior on the results of 3D human body reconstruction[J]. Journal of Image and Graphics, 2021,26(12):2918-2930.
Objective
The goal of 3D human body reconstruction is to build realistic and reliable 3D human body models. However, in current experiments that reconstruct the 3D human body with the SMPL (skinned multi-person linear model) model, as well as in some public datasets, the predicted posture angles frequently violate the rules of real human joint angles. To address this problem, this paper proposes setting value ranges for the joint rotation angles, so that the reconstructed results are more realistic and better conform to the mechanical structure of human joints.
Method
The movement of each joint is divided according to the connection structure of the human joints. Based on this division, the degrees of freedom of joint motion are calculated, and joint rotation value ranges for the SMPL model are proposed in combination with the actual situation. A simple reconstruction method is then proposed to verify the correctness of the range analysis.
Result
Experiments are conducted on the 3D human body dataset UP-3D and compared with previous results that generate the reconstruction model directly from the learned parameters. When the axis angle is used as a loss parameter, the reconstruction accuracy improves significantly and the average error is reduced by 15.1%. After all loss functions are applied, the average error is 7.0% lower than that of the two-stage reconstruction method that generates the reconstruction model directly from the predicted values. Compared with the UP-3D dataset for authenticity, the reconstruction results show a clear joint linkage effect.
Conclusion
The proposed setting of joint rotation angle value ranges plays an important role in the regression of joint rotation angles for SMPL-based 3D human body reconstruction, and the reconstructed model better conforms to the linkage of human joint motion.
Objective
The goal of 3D human body reconstruction is to build a credible and reliable 3D human body model. In reconstruction based on the skinned multi-person linear model (SMPL), the predicted posture angles often do not match real human joint movement, so the credibility of the reconstructed results cannot be fully guaranteed. A prior on the value range of the joint rotation angles makes the reconstruction result fit the mechanical structure of human joints. The rationality and effectiveness of the range setting are demonstrated with a simple reconstruction method and compared against plain SMPL-based 3D reconstruction, and the reconstruction accuracy of the model is significantly improved.
Method
A two-stage 3D human body reconstruction method is used to verify the range setting and to improve the credibility and level of detail of the reconstructed 3D human body model. In the first stage, the input image is preprocessed to reduce noise, extract key information, and remove the background. In the second stage, the reconstruction is completed by learning the posture and shape parameters of the human body in the image with a residual network and feeding them into the SMPL model. The UP-3D dataset is used. The reconstruction algorithm proceeds as follows. First, the human silhouette image is extracted from the render-light image provided by the UP-3D dataset, taking advantage of the human body segmentation available in the dataset. An RGB image of the human body with a uniform size is then obtained by removing the background of the original image according to the silhouette.
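A minimal sketch of this preprocessing step, assuming the silhouette is a binary mask aligned with the original image; the file handling and target size are illustrative, not taken from the paper:

import numpy as np
from PIL import Image

def remove_background(rgb_path, silhouette_path, size=(224, 224)):
    """Mask out the background with the silhouette and resize to a uniform size."""
    rgb = np.asarray(Image.open(rgb_path).convert("RGB"), dtype=np.uint8)
    mask = np.asarray(Image.open(silhouette_path).convert("L"), dtype=np.uint8) > 127
    masked = rgb * mask[..., None]               # zero out background pixels
    return Image.fromarray(masked).resize(size)  # uniform-size RGB input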
Next, the resized RGB image and the render-light image are concatenated and used as the input of the pose regression, while the silhouette image is simultaneously fed into the body shape regression. A residual network is used to learn the human body posture parameters and body shape parameters of the parametric statistical body shape model.
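A minimal sketch of the two regression branches, assuming a torchvision ResNet backbone. The output dimensions (72 pose values for 24 joints in axis-angle form, 10 shape coefficients) follow the standard SMPL parameterization; the backbone choice and input channel counts are illustrative assumptions:

import torch.nn as nn
from torchvision.models import resnet50

def make_backbone(in_channels, out_dim):
    """ResNet-50 with a widened first convolution and a regression head."""
    net = resnet50(weights=None)
    net.conv1 = nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
    net.fc = nn.Linear(net.fc.in_features, out_dim)
    return net

class TwoBranchRegressor(nn.Module):
    def __init__(self, pose_in_channels=4, shape_in_channels=1):
        super().__init__()
        # pose branch: concatenated RGB + render-light image -> 72 axis-angle values
        self.pose_net = make_backbone(pose_in_channels, 72)
        # shape branch: silhouette -> 10 SMPL shape coefficients
        self.shape_net = make_backbone(shape_in_channels, 10)

    def forward(self, rgb_and_light, silhouette):
        return self.pose_net(rgb_and_light), self.shape_net(silhouette)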
Finally, the value range of the pose parameters predicted by the residual network is supervised, and the supervised pose parameters together with the predicted shape parameters are input into the SMPL model to generate a 3D human body model with the same posture and shape as the original RGB image.
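For this final step, a minimal sketch of generating the mesh from predicted parameters, assuming the open-source smplx Python package and a locally downloaded SMPL model file (the "models" path is a placeholder):

import torch
import smplx

# a downloaded SMPL model file is assumed to be available under "models"
smpl = smplx.create("models", model_type="smpl", gender="neutral")

betas = torch.zeros(1, 10)         # predicted shape parameters
body_pose = torch.zeros(1, 69)     # predicted axis-angle pose of the 23 body joints
global_orient = torch.zeros(1, 3)  # root orientation

output = smpl(betas=betas, body_pose=body_pose, global_orient=global_orient)
vertices = output.vertices         # (1, 6890, 3) mesh of the reconstructed body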
The loss function consists of three parts: predicted parameter loss, projection loss, and vertex loss. Comparing the effect of adding the vertex loss to the global loss on the prediction results shows that the vertex loss effectively controls the regression direction and suppresses "unequal cost" regression during training. The learned posture parameter values are further supervised: the value ranges used for supervision are set according to ergonomics and the mechanical structure of human joints, and this supervision outperforms prediction with a single loss function.
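A minimal sketch of a three-part loss of this form, assuming mean-squared error for the parameter and projection terms and an L1 vertex term; the weights and exact distance functions are illustrative, not the paper's:

import torch.nn.functional as F

def total_loss(pred_params, gt_params, pred_joints_2d, gt_joints_2d,
               pred_vertices, gt_vertices, w_param=1.0, w_proj=1.0, w_vert=1.0):
    """Combine predicted-parameter, projection, and vertex losses."""
    param_loss = F.mse_loss(pred_params, gt_params)       # pose/shape parameter loss
    proj_loss = F.mse_loss(pred_joints_2d, gt_joints_2d)  # 2D projection (keypoint) loss
    vert_loss = F.l1_loss(pred_vertices, gt_vertices)     # per-vertex loss on the SMPL mesh
    return w_param * param_loss + w_proj * proj_loss + w_vert * vert_loss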
By combining intermediate supervision with supervision of the predicted values, the posture parameters are constrained to the admissible regression range, so that the learned values conform to the linkage of human motion joints and the reconstructed model is more realistic. The value ranges are set as follows. First, the movement of each joint is divided according to the connection structure of the human joints. Then, the degrees of freedom of each movement are calculated, and the joint rotation ranges for the SMPL model are derived in combination with the actual situation. Finally, the range analysis, combined with the simple reconstruction method, is used to verify its credibility against previous experiments.
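A minimal sketch of how such per-joint value ranges could be imposed on the regressed axis-angle values; the joint indices and numeric limits below are placeholders, not the ranges derived in the paper:

import torch

# Hypothetical axis-angle limits (radians) for a few SMPL body joints; the joint
# indices and numeric ranges are illustrative only.
JOINT_RANGES = {
    3: (torch.tensor([0.0, -0.3, -0.3]), torch.tensor([2.5, 0.3, 0.3])),
    4: (torch.tensor([0.0, -0.3, -0.3]), torch.tensor([2.5, 0.3, 0.3])),
    17: (torch.tensor([-2.5, -0.3, -0.3]), torch.tensor([0.0, 0.3, 0.3])),
}

def range_penalty(body_pose):
    """Penalize predicted axis-angle values that fall outside their admissible range.

    body_pose: tensor of shape (B, 23, 3), axis-angle rotations of the SMPL body joints.
    """
    penalty = body_pose.new_zeros(())
    for j, (lo, hi) in JOINT_RANGES.items():
        rot = body_pose[:, j]          # (B, 3) rotation of joint j
        lo, hi = lo.to(rot), hi.to(rot)
        penalty = penalty + torch.relu(lo - rot).sum() + torch.relu(rot - hi).sum()
    return penalty

At inference time, the same ranges could instead be used to clamp the predicted values so that the generated pose always stays within the admissible region.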
Result
The experiments are conducted on the UP-3D dataset and compared with models generated directly from the learning results, without the axis-angle prior to limit the predictions. When the axis angle is used as a loss parameter, the reconstruction accuracy improves significantly and the average error is reduced by 15.1%. With all loss functions applied, the average error is 7.0% lower than that of the two-stage reconstruction method that generates the reconstruction model directly from the predicted values. Compared with the UP-3D dataset for authenticity, the reconstruction results show a significant joint linkage effect.
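The abstract does not state the exact error metric; a common choice in SMPL-based reconstruction is the mean per-vertex distance between predicted and ground-truth meshes, sketched here only for reference:

import numpy as np

def mean_per_vertex_error(pred_vertices, gt_vertices):
    """Mean Euclidean distance between corresponding predicted and ground-truth vertices."""
    return float(np.linalg.norm(pred_vertices - gt_vertices, axis=-1).mean())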
Conclusion
The proposed setting of joint rotation angle value ranges plays an essential role in the regression of pose parameters for SMPL-based 3D human reconstruction, and the reconstructed model better conforms to the linkage of human joint motion.
SMPL (skinned multi-person linear model) model; 3D human reconstruction; joint angle; reconstruction authenticity; joint linkage
Bogo F, Kanazawa A, Lassner C, Gehler P, Romero J and Black M J. 2016. Keep it SMPL: automatic estimation of 3D human pose and shape from a single image//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 561-578[DOI: 10.1007/978-3-319-46454-1_34]
Güler R A, Neverova N and Kokkinos I. 2018. DensePose: dense human pose estimation in the wild//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7297-7306[DOI: 10.1109/CVPR.2018.00762]
Han K, Pang Z Q, Wang L and Yue D. 2015. High identification 3D human body model reconstruction method based on the depth scanner. Journal of Graphics, 36(4): 503-510[DOI: 10.3969/j.issn.2095-302X.2015.04.002]
Kanazawa A, Black M J, Jacobs D W and Malik J. 2018. End-to-end recovery of human shape and pose//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7122-7131[DOI: 10.1109/CVPR.2018.00744]
Kanazawa A, Zhang J Y, Felsen P and Malik J. 2019. Learning 3D human dynamics from video//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5607-5616[DOI: 10.1109/CVPR.2019.00576]
Kocabas M, Karagoz S and Akbas E. 2019. Self-supervised learning of 3D human pose using multi-view geometry//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 1077-1086[DOI: 10.1109/CVPR.2019.00117]
Kolotouros N, Pavlakos G and Daniilidis K. 2019. Convolutional mesh regression for single-image human shape reconstruction//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4496-4505[DOI: 10.1109/CVPR.2019.00463]
Lassner C, Romero J, Kiefel M, Bogo F, Black M J and Gehler P V. 2017. Unite the people: closing the loop between 3D and 2D human representations//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 4704-4713[DOI: 10.1109/CVPR.2017.500]
Loper M, Mahmood N, Romero J, Pons-Moll G and Black M J. 2015. SMPL: a skinned multi-person linear model. ACM Transactions on Graphics, 34(6): #248[DOI: 10.1145/2816795.2818013]
Pavlakos G, Zhu L Y, Zhou X W and Daniilidis K. 2018. Learning to estimate 3D human pose and shape from a single color image//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 459-468[DOI: 10.1109/CVPR.2018.00055]
Rajchl M, Lee M C H, Oktay O, Kamnitsas K, Passerat-Palmbach J, Bai W J, Damodaram M, Rutherford M A, Hajnal J V, Kainz B and Rueckert D. 2017. DeepCut: object segmentation from bounding box annotations using convolutional neural networks. IEEE Transactions on Medical Imaging, 36(2): 674-683[DOI: 10.1109/tmi.2016.2621185]
Yang J L, Franco J S, Hétroy-Wheeler F and Wuhrer S. 2016. Estimation of human body shape in motion with wide clothing//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 439-454[DOI: 10.1007/978-3-319-46493-0_27]
Yao P F, Fang Z, Wu F, Feng Y and Li J W. 2019. DenseBody: directly regressing dense 3D human pose and shape from a single color image[DB/OL]. [2020-06-19]. https://arxiv.org/pdf/1903.10153v3.pdf
Yoshiyasu Y, Sagawa R, Ayusawa K and Murai A. 2018. Skeleton transformer networks: 3D human pose and skinned mesh from single RGB image//Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: Springer: 485-500[DOI: 10.1007/978-3-030-20870-7_30]
Yu T, Guo K W, Xu F, Dong Y, Su Z Q, Zhao J H, Li J G, Dai Q H and Liu Y B. 2017. BodyFusion: real-time capture of human motion and surface geometry using a single depth camera//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 910-919[DOI: 10.1109/ICCV.2017.104]
Yu T, Zhao J H, Zheng Z R, Guo K W, Dai Q H, Li H, Pons-Moll G and Liu Y B. 2020. DoubleFusion: real-time capture of human performances with inner body shapes from a single depth sensor. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): 2523-2539[DOI: 10.1109/tpami.2019.2928296]
Zheng W W and Wu K J. 1997. Mechanical Theory. 7th ed. Beijing: Higher Education Press
Zheng Z R, Yu T, Wei Y X, Dai Q H and Liu Y B. 2019. DeepHuman: 3D human reconstruction from a single image//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 7738-7748[DOI: 10.1109/ICCV.2019.00783]
Zhou Z H. 2016. Machine Learning. Beijing: Tsinghua University Press