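MASR-PSN: a low-resolution photometric stereo images-relevant deep learning model for high-resolution surface normal reconstruction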

Ju Yakun; Jian Muwei; Rao Yuan; Zhang Shu; Gao Feng; Dong Junyu

doi:10.11834/jig.220050

Image Understanding and Computer Vision


2023, Vol. 28, No. 7, pp. 2120-2134
Print publication date: 2023-07-16

Ju Yakun, Jian Muwei, Rao Yuan, Zhang Shu, Gao Feng, Dong Junyu. 2023. MASR-PSN: a low-resolution photometric stereo images-relevant deep learning model for high-resolution surface normal reconstruction. Journal of Image and Graphics, 28(07): 2120-2134. DOI: 10.11834/jig.220050.

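Abstract

Objective

Photometric stereo is a dense 3D reconstruction method that works from a single viewpoint: it recovers a per-pixel surface normal from a series of images taken from the same viewpoint under different lighting directions. The high-resolution linear-response cameras used to capture photometric stereo images are expensive and hard to obtain, so it is difficult to acquire ultra-high-resolution images directly from the sensor and recover a high-resolution surface normal from them. We therefore propose a photometric stereo super-resolution algorithm based on a deep neural network that recovers accurate high-resolution surface normals from low-resolution photometric stereo images.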
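As background for the task described above, the classical Lambertian least-squares formulation of photometric stereo can be sketched in a few lines. This is the textbook baseline, not the MASR-PSN network; the function name `lambertian_normals` and the array layout are assumptions made for illustration, and the sketch ignores shadows and specular highlights.

```python
import numpy as np

def lambertian_normals(images, light_dirs):
    """Per-pixel least-squares photometric stereo under a Lambertian assumption.

    images:     (m, h, w) grayscale intensities, one image per light
    light_dirs: (m, 3) unit light directions
    returns:    unit normals (h, w, 3) and albedo (h, w)
    """
    m, h, w = images.shape
    intensities = images.reshape(m, -1)                      # (m, h*w)
    # Solve light_dirs @ G = intensities, where G = albedo * normal per pixel.
    G, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-12)                  # avoid division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)
```

At least three non-coplanar light directions are needed for the system to be well posed; learning-based methods such as the one summarized here aim to go beyond the Lambertian assumption this baseline relies on.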

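Method

First, the original low-resolution photometric stereo images are normalized in a preprocessing step, which removes the influence of strongly varying surface reflectance and reduces the influence of oversaturated specular reflection. We then propose the multi-level aggregation super-resolution photometric stereo network (MASR-PSN). MASR-PSN comprises a novel max-pooling aggregation framework that fuses deep and shallow layers, a weight-shared feature regressor, and a parallel regressor structure with convolution kernels of different sizes. Together these preserve multi-scale information while enhancing the feature representation, keep the network from collapsing onto unimportant features tied to one fixed scale, and prevent the excessive spatial smoothing caused by 3×3 convolution kernels.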
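The abstract does not spell out the normalization formula. One common choice, assumed here purely for illustration, is to divide each pixel by the L2 norm of its intensities across all lighting directions. Under a Lambertian model the intensity is the albedo times a shading term, so the albedo cancels out. The function name `normalize_observations` and the `eps` parameter are hypothetical.

```python
import numpy as np

def normalize_observations(images, eps=1e-8):
    """Divide each pixel by the L2 norm of its intensities over all lights.

    images: (m, h, w) or (m, h, w, c) stack, one image per light direction.
    eps keeps fully dark pixels from causing a division by zero.
    """
    norm = np.sqrt(np.sum(images ** 2, axis=0, keepdims=True))
    return images / (norm + eps)
```

After this step two pixels with the same normal but different albedo carry nearly the same values, which is why such a normalization reduces the effect of strongly varying reflectance.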

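Result

Extensive ablation experiments confirm the effectiveness of the proposed deep-and-shallow aggregation layer and the parallel weight-shared regressor, which clearly reduce the mean angular error (MAE) of the generated surface normal. The method needs photometric stereo images at only half the resolution used by other methods, yet it accurately recovers the structure of complex surfaces. Quantitative experiments on the DiLiGenT benchmark dataset and qualitative experiments on the Light Stage Data Gallery and Gourd datasets show that MASR-PSN clearly improves the accuracy of the predicted surface normal. On the DiLiGenT benchmark, using photometric stereo images at only half the resolution of other methods, the method achieves a mean angular error of 7.31° with 96 input images, 0.08° better than the best competing method, and 9.00° with 10 input images, 0.43° better than the best competing method.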
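The mean angular error reported above is the average angle, in degrees, between predicted and ground-truth normals. A minimal sketch of the metric follows; the function name `mean_angular_error` and the optional object `mask` argument are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np

def mean_angular_error(pred, gt, mask=None):
    """Mean angle in degrees between two normal maps of shape (h, w, 3).

    mask: optional boolean (h, w) array selecting the pixels to evaluate.
    """
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    # Clip guards against dot products slightly outside [-1, 1] from rounding.
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    return angles[mask].mean() if mask is not None else angles.mean()
```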

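Conclusion

The proposed MASR-PSN improves the accuracy of surface normal reconstruction in the photometric stereo task and, even with low-resolution input images, recovers a super-resolution surface normal with clear detail.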

Abstract

Objective

Three-dimensional (3D) reconstruction is an active topic in computer vision. Photometric stereo recovers fine surface detail and dense geometry by estimating a per-pixel surface normal from images of a fixed scene captured under varying illumination, using shading cues. Unlike binocular and multi-view stereo, which triangulate sparse 3D points, it recovers dense per-pixel normals and copes better with weakly textured objects. Photometric stereo is therefore used in high-precision 3D reconstruction applications such as cultural relic reconstruction and industrial defect detection. High-resolution surface normals carry richer 3D information, which helps represent complex structures and reduces blur in the reconstructed normal map. However, high-resolution linear-response cameras are expensive and hard to obtain, so recovering high-resolution surface normals directly from sensor images remains difficult. A method that reconstructs high-resolution surface normals from low-resolution photometric stereo images is therefore needed.

Method

We propose a deep-learning-based super-resolution photometric stereo algorithm that recovers accurate high-resolution surface normals from low-resolution photometric stereo images. First, a normalization operation is applied to the pixels at the same position across the low-resolution photometric stereo images, which alleviates the effects of strongly varying surface reflectance and of oversaturated specular reflection. This preprocessing allows a network trained on surfaces with homogeneous reflectance to handle objects with steep color changes. Next, we develop a multi-level aggregation super-resolution photometric stereo network (MASR-PSN) built around a novel deep-and-shallow fusion max-pooling aggregation framework. Because deep and shallow features have different receptive fields, fusing them enhances the feature representation and preserves multi-scale information. To keep the network from learning unimportant features tied to one fixed scale, we also develop a weight-shared feature regressor that learns to reconstruct the surface normal from features at multiple scales. The regressor takes the features at each scale as input and outputs 4 × 4 super-resolution features, which are fused in the following step. Within the regressor, convolution kernels of different sizes are arranged in parallel. A 3 × 3 convolution kernel preserves smooth spatial transitions, but it also over-smooths in the spatial domain, which causes loss of detail and blur. To preserve the details of the super-resolution surface normal, the parallel design therefore combines 3 × 3 convolution layers with 1 × 1 layers.
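The core idea of max-pooling aggregation can be sketched as follows. Features extracted from each input image are reduced with an element-wise maximum over the image axis, so the result does not depend on the number or order of the input images. The tensor shapes, the function names, and the choice to concatenate shallow and deep features along the channel axis are assumptions for illustration; the abstract does not give the actual layer configuration.

```python
import numpy as np

def max_pool_aggregate(per_image_features):
    """Element-wise max over the image axis: (m, c, h, w) -> (c, h, w)."""
    return per_image_features.max(axis=0)

def fuse_shallow_and_deep(shallow, deep):
    """Aggregate each feature level over the images, then stack along channels.

    shallow: (m, c1, h, w), deep: (m, c2, h, w); assumed to share h and w.
    returns: (c1 + c2, h, w)
    """
    return np.concatenate(
        [max_pool_aggregate(shallow), max_pool_aggregate(deep)], axis=0
    )
```

Because the maximum is taken independently at every channel and position, the same fused shape comes out whether 10 or 96 images are supplied.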
Additionally, a joint loss function optimizes MASR-PSN under constraints on both the normal angle and the normal gradient. The normal angle constraint measures only the average error of the predicted normal, so on its own it sacrifices surface detail and produces blur. The normal gradient constraint is therefore introduced to focus on the changes between adjacent pixels, which preserves more detail and keeps the recovered super-resolution surface normal sharp.
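A joint loss of this kind can be sketched with NumPy as below. The angle term is written as one minus the cosine similarity, and the gradient term as the mean absolute difference of finite-difference gradients. Both forms, the weight `lam`, and all function names are assumptions for illustration; the abstract does not state the exact formulas or weighting.

```python
import numpy as np

def cosine_loss(pred, gt):
    """Mean of 1 - cos(angle) between unit normal maps of shape (h, w, 3)."""
    return float(np.mean(1.0 - np.sum(pred * gt, axis=-1)))

def gradient_loss(pred, gt):
    """Mean absolute difference between finite-difference gradients."""
    dx = np.diff(pred, axis=1) - np.diff(gt, axis=1)   # horizontal neighbors
    dy = np.diff(pred, axis=0) - np.diff(gt, axis=0)   # vertical neighbors
    return float(np.mean(np.abs(dx)) + np.mean(np.abs(dy)))

def joint_loss(pred, gt, lam=0.1):
    """Angle term plus a weighted gradient term (lam is a hypothetical weight)."""
    return cosine_loss(pred, gt) + lam * gradient_loss(pred, gt)
```

A prediction that smooths away a sharp edge can still score a small angle term on average, but it is penalized by the gradient term because its neighbor-to-neighbor changes differ from those of the ground truth.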

Result

Extensive ablation experiments demonstrate the effectiveness of the proposed deep-and-shallow aggregation layer and the parallel weight-shared regressor, which significantly reduce the mean angular error (MAE) of the generated surface normal. The method needs input photometric stereo images at only half the resolution used by other methods, yet it accurately reconstructs the structure of complex surfaces in a high-resolution normal map. Comparative experiments are carried out quantitatively on the DiLiGenT benchmark dataset and qualitatively on the Light Stage Data Gallery and Gourd datasets. On the DiLiGenT benchmark (using only half-resolution photometric stereo images compared with other methods), MASR-PSN achieves a mean angular error of 7.31 degrees with 96 dense input images, an improvement of 0.12 degrees, and a mean angular error of 9.00 degrees with 10 sparse input images, an improvement of 0.43 degrees. Further qualitative experiments on the Light Stage Data Gallery and Gourd datasets show the robustness and effectiveness of MASR-PSN.

Conclusion

MASR-PSN improves the accuracy of surface normal reconstruction for the photometric stereo task and can recover a super-resolution surface normal with clear detail from low-resolution input photometric stereo images.

Keywords

3D reconstruction; photometric stereo; surface normal recovery; deep learning; super resolution

