Face pose correction based on morphable model and image inpainting
Vol. 26, Issue 4, Pages 828-836 (2021)
Published: 16 April 2021
Accepted: 11 June 2020
DOI: 10.11834/jig.200011
Congzhong Wu, Rongsheng Zheng, Huaijuan Zang, Mingwei Liu, Jiajia Xu, Shu Zhan. Face pose correction based on morphable model and image inpainting [J]. Journal of Image and Graphics, 26(4): 828-836 (2021)
Objective
Face pose deflection is an important factor affecting face recognition accuracy. Using the 3D morphable model commonly employed in 3D face reconstruction together with deep convolutional neural networks, this paper proposes a face pose correction algorithm for multi-pose face recognition, which improves recognition accuracy under large poses to a certain extent.
Method
The traditional 3D morphable model fitting method is improved: the 3D morphable model is built from facial shape parameters and expression parameters, and the landmarks in different facial regions are assigned different weights, so that the weighted fitting handles face images with varying poses and expressions better. The 3D face model is then pose-corrected, and deep learning is used to inpaint the face image, filling the irregular facial hole regions; the recent partial convolution technique is adopted and the convolutional neural network is retrained on a new dataset so that the network parameters reach their optimum.
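As a hedged illustration of the shape-and-expression parameterization and the weighted landmark fitting described above (generic 3DMM notation, not necessarily the paper's exact formulation), the fitted face shape and the fitting energy can be written as
S = \bar{S} + A_{id}\,\alpha_{id} + A_{exp}\,\alpha_{exp}
E(\alpha_{id}, \alpha_{exp}, R, t, s) = \sum_{k=1}^{68} w_k \,\big\| l_k - s\,\Pi\!\left(R\,S_k + t\right) \big\|_2^2
where \bar{S} is the mean shape, A_{id} and A_{exp} are the identity and expression bases, l_k is the k-th detected 2D landmark, S_k its corresponding model vertex, \Pi an orthographic projection, (R, t, s) the head pose, and w_k the region-dependent landmark weight.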
Result
The proposed algorithm is compared with other methods on the LFW (labeled faces in the wild) face database and the StirlingESRC (Economic and Social Research Council) 3D face database, and the experimental results show that its face recognition accuracy improves to a certain extent. On the LFW database, after pose correction and inpainting of face images with arbitrary poses, the proposed method reaches a face recognition accuracy of 96.57%. On the StirlingESRC database, the face recognition accuracy increases by 5.195% and 2.265% at face poses of ±22°, and by 5.875% and 11.095% at face poses of ±45°; the average face recognition rate increases by 5.53% and 7.13%, respectively. The comparative experiments show that the proposed face pose correction algorithm effectively improves face recognition accuracy.
Conclusion
The proposed face pose correction algorithm combines the advantages of the 3D morphable model and deep learning models, and improves face recognition accuracy to a certain extent at every face pose angle.
Objective
Face recognition has been a widely studied topic in the field of computer vision for a long time. In the past few decades, great progress in face recognition has been achieved due to the capacity and wide application of convolutional neural networks. However, pose variations still remain a great challenge and warrant further study. To the best of our knowledge, the existing methods that address this problem can be generally categorized into two classes: feature-based methods and deep learning-based methods. Feature-based methods attempt to obtain pose-invariant representations directly from non-frontal faces or to design handcrafted local feature descriptors that are robust to face poses. However, it is often too difficult to obtain a robust representation of the face pose using these handcrafted local feature descriptors. Thus, these methods cannot produce satisfactory results, especially when the face pose is too large. In recent years, convolutional neural networks have been introduced into face recognition due to their outstanding performance in image classification tasks. Different from traditional methods, convolutional neural networks do not require the manual extraction of local feature descriptors. They attempt to directly rotate a face image of arbitrary pose and illumination into the target pose while preserving the face identity features. In addition, owing to their powerful image generation ability, generative adversarial networks are also used in frontal face image synthesis and have achieved great progress. Compared with traditional methods, deep learning-based methods can obtain a higher face recognition rate. However, their disadvantage is that face images synthesized from large face poses have low credibility, which leads to poor face recognition accuracy. To deal with the limitations of these two kinds of methods, we present a face pose correction algorithm based on a 3D morphable model (3DMM) and image inpainting.
Method
In this study, we propose a face frontalization method that combines a deep learning model with a 3DMM and can generate a photorealistic frontal view of the face image. In detail, we first detect facial landmarks using a well-known facial landmark detector that is robust to large pose variations; a total of 68 facial landmarks are detected to fit the face image more accurately. Then, we perform accurate 3DMM fitting of the face image with facial landmark weighting. Next, we estimate the depth information of the face image and rotate the 3D face model into the frontal view using a 3D transformation.
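A minimal sketch of the weighted-landmark fitting and pose-correction steps above (not the authors' implementation; the basis matrices, weight values, and helper names are stand-ins generated at random, and a weak-perspective camera is assumed):
import numpy as np
from scipy.optimize import least_squares

n_lm, n_id, n_exp = 68, 40, 10
rng = np.random.default_rng(0)
mean_lm = rng.normal(size=(n_lm, 3))           # mean 3D positions of the 68 landmark vertices
basis_id = rng.normal(size=(n_lm, 3, n_id))    # identity (shape) basis at the landmark vertices
basis_exp = rng.normal(size=(n_lm, 3, n_exp))  # expression basis at the landmark vertices
weights = np.ones(n_lm)
weights[17:] = 2.0                             # e.g. brows/eyes/nose/mouth weighted higher than the jaw contour

def rot(angles):                               # Euler angles -> rotation matrix
    x, y, z = angles
    Rx = np.array([[1, 0, 0], [0, np.cos(x), -np.sin(x)], [0, np.sin(x), np.cos(x)]])
    Ry = np.array([[np.cos(y), 0, np.sin(y)], [0, 1, 0], [-np.sin(y), 0, np.cos(y)]])
    Rz = np.array([[np.cos(z), -np.sin(z), 0], [np.sin(z), np.cos(z), 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def residuals(params, lm2d):
    angles, t, s = params[:3], params[3:5], params[5]
    a_id, a_exp = params[6:6 + n_id], params[6 + n_id:]
    shape3d = mean_lm + basis_id @ a_id + basis_exp @ a_exp     # (68, 3) fitted landmark vertices
    proj = s * (shape3d @ rot(angles).T)[:, :2] + t             # weak-perspective projection to 2D
    return (np.sqrt(weights)[:, None] * (proj - lm2d)).ravel()  # weighted 2D landmark error

lm2d = rng.normal(size=(n_lm, 2))              # detected 68 landmarks (stand-in data)
x0 = np.zeros(6 + n_id + n_exp)
x0[5] = 1.0                                    # initial scale
fit = least_squares(residuals, x0, args=(lm2d,))

# Pose correction: re-pose the fitted vertices to the frontal view (zero rotation),
# keeping the recovered identity and expression coefficients.
a_id, a_exp = fit.x[6:6 + n_id], fit.x[6 + n_id:]
frontal = mean_lm + basis_id @ a_id + basis_exp @ a_exp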
Finally, we employ image inpainting, based on a deep learning model, to fill the irregular invisible facial regions caused by self-occlusion, and we fine-tune a pre-trained model to train our image inpainting network. In the training process, all of the convolutional layers are replaced with partial convolutional layers. Our training set consists of 13 223 face images selected from the labeled faces in the wild (LFW) dataset. Our image inpainting network is implemented in Keras. The batch size is set to 4, the learning rate to 10^-4, and the weight decay to 0.0005. Network training is accelerated on NVIDIA GTX 1080 Ti GPU devices and takes approximately 10 days in total.
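The following is a minimal, self-contained sketch (assumptions, not the authors' released code) of this training setup in Keras: a toy encoder in which ordinary convolutions are replaced by a partial convolution layer in the spirit of Liu et al. (2018), i.e. a convolution that is renormalized by, and updates, a validity mask. The architecture, layer names, loss, and image size are placeholders, and the reported weight decay of 0.0005 could be applied via kernel regularization or an optimizer that supports it.
import tensorflow as tf
from tensorflow.keras import layers, Model, optimizers

class PartialConv2D(layers.Layer):
    """Convolution over valid (non-hole) pixels only, with mask update."""
    def __init__(self, filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        self.ksize = kernel_size
        self.conv = layers.Conv2D(filters, kernel_size, padding="same", use_bias=False)
        # Fixed all-ones kernel that counts the valid pixels inside each window.
        self.mask_conv = layers.Conv2D(1, kernel_size, padding="same", use_bias=False,
                                       kernel_initializer="ones", trainable=False)

    def call(self, img, mask):
        masked_out = self.conv(img * mask)            # convolve only the visible pixels
        valid = self.mask_conv(mask)                  # valid-pixel count per window
        scale = tf.math.divide_no_nan(float(self.ksize * self.ksize), valid)
        new_mask = tf.cast(valid > 0, img.dtype)      # windows touching any valid pixel become valid
        return masked_out * scale * new_mask, new_mask

def build_inpainting_net(img_shape=(256, 256, 3)):
    img_in = layers.Input(img_shape)                  # face image with the occluded region zeroed out
    mask_in = layers.Input(img_shape[:2] + (1,))      # 1 = visible pixel, 0 = hole
    x, m = PartialConv2D(32, 7)(img_in, mask_in)
    x = layers.ReLU()(x)
    x, m = PartialConv2D(64, 5)(x, m)
    x = layers.ReLU()(x)
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    return Model([img_in, mask_in], out)

model = build_inpainting_net()
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4), loss="mae")
# model.fit([images, masks], ground_truth, batch_size=4, ...) on the 13 223 LFW face crops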
Result
We compare our method with state-of-the-art methods, including traditional and deep learning-based methods, on two public face datasets, namely, the LFW dataset and the StirlingESRC 3D face dataset. The quantitative evaluation metric is the face recognition rate under different face poses, and we also show several frontal face images synthesized by our method. The synthesized frontal face images show that our method produces more photorealistic results than other methods on the LFW dataset. We achieve 96.57% face recognition accuracy on the LFW face dataset. In addition, the quantitative results show that our method outperforms all other methods on the StirlingESRC 3D face dataset, and that its face recognition accuracy is improved under different face poses. Compared with the other two methods on the StirlingESRC 3D face dataset, the face recognition accuracy increases by 5.195% and 2.265% under the face pose of 22° and by 5.875% and 11.095% under the face pose of 45°, respectively. Moreover, the average face recognition rate increases by 5.53% and 7.13%, respectively. These results show that the proposed multi-pose face recognition algorithm improves the accuracy of face recognition.
Conclusion
In this study, we propose a face pose correction algorithm for multi-pose face recognition that combines a 3DMM with a deep learning model. The qualitative and quantitative experimental results show that our method synthesizes more photorealistic frontal face images than other methods and improves the accuracy of multi-pose face recognition.
Keywords: multi-pose face recognition; 3D morphable model (3DMM); convolutional neural network (CNN); image inpainting; deep learning