面向大姿态人脸识别的正面化形变场学习
Large pose face recognition with morphing field learning
2022, Vol. 27, No. 7: 2171-2184
Received: 2021-01-18; Revised: 2021-04-22; Accepted: 2021-04-29; Published in print: 2022-07-16
DOI: 10.11834/jig.210011
Objective
Face recognition has been widely deployed, but recognition under large pose variations remains an open problem. Existing methods either extract pose-robust features or frontalize non-frontal faces. Mainstream frontalization methods fall into two categories: 2D regression-based generation and 3D model-based morphing. The former generates relatively natural and realistic faces but introduces extra noise that distorts image information; the latter preserves the original facial structure, but its generation process is driven by a physical model and is therefore not flexible or natural enough. To combine the strengths of the 2D and 3D approaches, this paper proposes a face frontalization method based on a coarse-to-fine morphing field.
Method
The morphing field is learned by a deep network via 2D regression and encodes semantic-level correspondences between pixels of face images under different views, so it can frontalize a non-frontal face image in a 3D-like manner. The method thus combines the flexibility of 2D frontalization with the fidelity of 3D frontalization. Following a step-by-step, progressive strategy, this paper further proposes a coarse-to-fine morphing field learning framework to obtain a more accurate and robust morphing field.
Result
Large pose face recognition experiments verify the effectiveness of the proposed method. On four datasets, MultiPIE (multi pose, illumination, expressions), LFW (labeled faces in the wild), CFP (celebrities in frontal-profile in the wild) and IJB-A (intelligence advanced research projects activity Janus benchmark-A), the method achieves higher face recognition accuracy than existing methods.
Conclusion
The proposed face frontalization method based on coarse-to-fine morphing field learning combines the advantages of 2D and 3D face frontalization, makes the learning of frontalized faces more flexible and accurate, and preserves more of the identity information that benefits recognition.
Objective
Face recognition remains challenging under large variations in pose, expression, aging, lighting and occlusion. Among these factors, pose variation induces large non-planar face transformations. To address pose variation, previous methods mainly attempt to extract pose-invariant features or to frontalize non-frontal faces. Frontalization methods ease discriminative feature learning by eliminating pose variations. They fall into two main categories: 2D and 3D frontalization. 2D methods can generate more natural frontal faces but may lose facial structural information, which is a key factor for identity discrimination. 3D methods preserve facial structural information well but are less flexible. In summary, both 2D and 3D methods suffer information loss in the frontalized faces, especially under large pose variations, such as invisible pixels in 3D morphable models or pixel aberrations in 2D generative methods.
Method
We propose a novel coarse-to-fine morphing field network (CFMF-Net) that combines 2D and 3D face transformation to frontalize a non-frontal face image via a coarse-to-fine optimized morphing field that shifts each pixel. Benefiting from the flexibility of 2D learning-based methods and the structure preservation of 3D morphable model-based methods, the proposed morphing field learning makes the learning process easier and reduces the risk of over-fitting. First, a coarse morphing field is learned to capture the major structural variation of a single face image. Then, a residual-module-based facial information extraction branch refines the coarse morphing field: its output is concatenated with the coarse field to generate the final fine morphing field for the input face image. The overall framework regresses pixel correspondences rather than pixel values, ensuring that every pixel in the frontalized face image is taken from the input non-frontal image and thus largely reducing information distortion. As a result, the identity information of the input non-frontal face is well preserved with favorable visual quality, which facilitates the subsequent face recognition task. The coarse-to-fine design, together with the residual complementing branch, further assures the robustness and accuracy of the learned morphing field.
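The core operation described above, shifting each output pixel to a location in the input image and sampling there, can be illustrated with a minimal backward-warping sketch. This is not the authors' implementation; the function name and the dense-field convention (per-pixel (dy, dx) offsets, bilinear sampling) are illustrative assumptions:

```python
import numpy as np

def warp_with_morphing_field(image, field):
    """Backward-warp `image` with a dense per-pixel morphing field.

    image: (H, W) grayscale array.
    field: (H, W, 2) array; field[y, x] = (dy, dx) means output pixel
           (y, x) is sampled from input location (y + dy, x + dx).
    Bilinear sampling keeps every output value a combination of input
    pixels only, so no new pixel values are invented.
    """
    H, W = image.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    src_y = np.clip(ys + field[..., 0], 0, H - 1)
    src_x = np.clip(xs + field[..., 1], 0, W - 1)

    # Integer corners and fractional weights for bilinear interpolation.
    y0 = np.floor(src_y).astype(int)
    x0 = np.floor(src_x).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = src_y - y0
    wx = src_x - x0

    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A zero field reproduces the input unchanged, and a constant field translates it, which is why a learned field can express view changes while guaranteeing that all output pixels originate from the input image.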
Result
To verify the effectiveness of the proposed method, extensive experiments are carried out on the multi pose, illumination, expressions (MultiPIE), labeled faces in the wild (LFW), celebrities in frontal-profile in the wild (CFP) and intelligence advanced research projects activity Janus benchmark-A (IJB-A) datasets, and the results are compared with other face transformation methods. Among these test sets, MultiPIE, CFP and IJB-A all contain full pose variation; IJB-A additionally contains complicated variations such as low resolution and occlusion. The experiments follow the same training and testing protocol as previous works, i.e., training with both the original and the frontalized face images. For fair comparison, the commonly used LightCNN-29 is adopted as the recognition model. Our method outperforms related works on the large pose testing protocols of MultiPIE and CFP and achieves comparable performance on LFW and IJB-A. Additionally, the visualization results show that our method preserves identity information well, and the ablation study demonstrates the contribution of the coarse-to-fine framework in CFMF-Net. In short, the recognition accuracies and visualization results demonstrate that the proposed CFMF-Net can generate frontalized faces with identity information preserved and achieve higher large pose face recognition accuracy.
Conclusion
The proposed coarse-to-fine morphing field learning framework frontalizes face images by shifting pixels, which ensures both flexible learnability and identity information preservation. The flexible learnability lets the network optimize the face frontalization objective without predefined 3D transformation rules. Moreover, because the learned per-pixel morphing field makes the output frontal face shifted from the input image only, information loss is reduced. Finally, the coarse-to-fine and residual architecture further ensures more robust and accurate results.
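The coarse-to-fine composition with a residual branch can be sketched as a data-flow skeleton. The sub-network stand-ins below are hypothetical placeholders (fixed arithmetic instead of learned convolutions); only the structure, a coarse field refined by a residual prediction conditioned on the concatenation of image and coarse field, reflects the described framework:

```python
import numpy as np

def coarse_net(image):
    # Placeholder for the learned coarse branch: predicts a (H, W, 2)
    # field capturing the major structural variation.
    H, W = image.shape
    return np.zeros((H, W, 2))

def residual_net(image, coarse_field):
    # Placeholder for the residual branch: conditioned on the input
    # image concatenated with the coarse field along the channel axis.
    feat = np.concatenate([image[..., None], coarse_field], axis=-1)
    return 0.1 * feat[..., :2]  # arbitrary stand-in for a learned map

def coarse_to_fine_field(image):
    """Fine field = coarse field + residual refinement."""
    coarse = coarse_net(image)
    fine = coarse + residual_net(image, coarse)
    return fine
```

The residual formulation means the refinement branch only has to learn a correction to the coarse estimate, which is consistent with the paper's claim that the coarse-to-fine design eases learning and improves robustness.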
References
Asthana A, Marks T K, Jones M J, Tieu K H and Rohith M V. 2011. Fully automatic pose-invariant face recognition via 3D pose normalization//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE: 937-944 [DOI: 10.1109/ICCV.2011.6126336]
Bookstein F L. 1989. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6): 567-585 [DOI: 10.1109/34.24792]
Cao J, Hu Y B, Zhang H W, He R and Sun Z N. 2018. Learning a high fidelity pose invariant model for high-resolution face frontalization//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc.: 2872-2882
Chang F J, Tran A T, Hassner T, Masi I, Nevatia R and Medioni G. 2017. FacePoseNet: making a case for landmark-free face alignment//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 1599-1608 [DOI: 10.1109/ICCVW.2017.188]
Crosswhite N, Byrne J, Stauffer C, Parkhi O, Cao Q and Zisserman A. 2017. Template adaptation for face verification and identification//Proceedings of the 12th IEEE International Conference on Automatic Face and Gesture Recognition. Washington, USA: IEEE: 1-8 [DOI: 10.1109/FG.2017.11]
Deng J K, Guo J, Xue N N and Zafeiriou S. 2019. ArcFace: additive angular margin loss for deep face recognition//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4685-4694 [DOI: 10.1109/CVPR.2019.00482]
Ding C X, Xu C and Tao D C. 2015. Multi-task pose-invariant face recognition. IEEE Transactions on Image Processing, 24(3): 980-993 [DOI: 10.1109/TIP.2015.2390959]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial nets//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press: 2672-2680
Hu L Q, Kan M N, Shan S G, Song X G and Chen X L. 2017. LDF-Net: learning a displacement field network for face recognition across pose//Proceedings of the 12th IEEE International Conference on Automatic Face and Gesture Recognition. Washington, USA: IEEE: 9-16 [DOI: 10.1109/FG.2017.12]
Huang G B and Learned-Miller E. 2014. Labeled Faces in the Wild: updates and new reporting procedures. Amherst Technical Report UM-CS-2014-003. University of Massachusetts
Huang R, Zhang S, Li T Y and He R. 2017. Beyond face rotation: global and local perception GAN for photorealistic and identity preserving frontal view synthesis//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2458-2467 [DOI: 10.1109/ICCV.2017.267]
Kan M N, Shan S G, Chang H and Chen X L. 2014. Stacked progressive auto-encoders (SPAE) for face recognition across poses//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 1883-1890 [DOI: 10.1109/CVPR.2014.243]
Kan M N, Shan S G, Zhang H H, Lao S H and Chen X L. 2016. Multi-view discriminant analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(1): 188-194 [DOI: 10.1109/TPAMI.2015.2435740]
Klare B F, Klein B, Taborsky E, Blanton A, Cheney J, Allen K, Grother P, Mah A, Burge M and Jain A K. 2015. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus benchmark A//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1931-1939 [DOI: 10.1109/CVPR.2015.7298803]
Li A N, Shan S G, Chen X L and Gao W. 2009. Maximizing intra-individual correlations for face recognition across pose differences//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 605-611 [DOI: 10.1109/CVPR.2009.5206659]
Li S X, Liu X, Chai X J, Zhang H H, Lao S H and Shan S G. 2012. Morphable displacement field based image matching for face recognition across pose//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 102-115 [DOI: 10.1007/978-3-642-33718-5_8]
Luan T, Yin X and Liu X M. 2017. Disentangled representation learning GAN for pose-invariant face recognition//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1283-1292 [DOI: 10.1109/CVPR.2017.141]
Luan X, Geng H M, Liu L H, Li W S, Zhao Y Y and Ren M. 2020. Geometry structure preserving based GAN for multi-pose face frontalization and recognition. IEEE Access, 8: 104676-104687 [DOI: 10.1109/ACCESS.2020.2996637]
Masi I, Hassner T, Tràn A T and Medioni G. 2017. Rapid synthesis of massive face sets for improved face recognition//Proceedings of the 12th IEEE International Conference on Automatic Face and Gesture Recognition. Washington, USA: IEEE: 604-611 [DOI: 10.1109/FG.2017.76]
Peng X, Yu X, Sohn K, Metaxas D N and Chandraker M. 2017. Reconstruction-based disentanglement for pose-invariant face recognition//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 1632-1641 [DOI: 10.1109/ICCV.2017.180]
Prabhu U, Heo J and Savvides M. 2011. Unconstrained pose-invariant face recognition using 3D generic elastic models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10): 1952-1961 [DOI: 10.1109/TPAMI.2011.123]
Rong C L, Zhang X M and Lin Y B. 2020. Feature-improving generative adversarial network for face frontalization. IEEE Access, 8: 68842-68851 [DOI: 10.1109/ACCESS.2020.2986079]
Sagonas C, Antonakos E, Tzimiropoulos G, Zafeiriou S and Pantic M. 2016. 300 faces in-the-wild challenge: database and results. Image and Vision Computing, 47: 3-18 [DOI: 10.1016/j.imavis.2016.01.002]
Sengupta S, Chen J C, Castillo C, Patel V M, Chellappa R and Jacobs D W. 2016. Frontal to profile face verification in the wild//Proceedings of 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Placid, USA: IEEE: 1-9 [DOI: 10.1109/WACV.2016.7477558]
Sharma A and Jacobs D W. 2011. Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch//Proceedings of 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE: 593-600 [DOI: 10.1109/CVPR.2011.5995350]
Sim T, Baker S and Bsat M. 2003. The CMU pose, illumination, and expression database. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12): 1615-1618 [DOI: 10.1109/TPAMI.2003.1251154]
Wu X, He R, Sun Z N and Tan T N. 2018. A light CNN for deep face representation with noisy labels. IEEE Transactions on Information Forensics and Security, 13(11): 2884-2896 [DOI: 10.1109/TIFS.2018.2833032]
Yang J L, Ren P R, Zhang D Q, Chen D, Wen F, Li H D and Hua G. 2017. Neural aggregation network for video face recognition//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5216-5225 [DOI: 10.1109/CVPR.2017.554]
Yi D, Lei Z, Liao S C and Li S Z. 2014. Learning face representation from scratch [EB/OL]. [2021-01-18]. https://arxiv.org/pdf/1411.7923.pdf
Yin X and Liu X M. 2018. Multi-task convolutional neural network for pose-invariant face recognition. IEEE Transactions on Image Processing, 27(2): 964-975 [DOI: 10.1109/TIP.2017.2765830]
Yin X, Xiang Y, Sohn K, Liu X M and Chandraker M. 2017. Towards large-pose face frontalization in the wild//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 4010-4019 [DOI: 10.1109/ICCV.2017.430]
Zhang S F, Miao Q H, Huang M, Zhu X Y, Chen Y Y, Lei Z and Wang J Q. 2019. Pose-weighted GAN for photorealistic face frontalization//Proceedings of 2019 IEEE International Conference on Image Processing. Taipei, China: IEEE: 2384-2388 [DOI: 10.1109/ICIP.2019.8803362]
Zhang Y Z, Shao M, Wong E K and Fu Y. 2013. Random faces guided sparse many-to-one encoder for pose-invariant face recognition//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, Australia: IEEE: 2416-2423 [DOI: 10.1109/ICCV.2013.300]
Zhao J, Cheng Y, Xu Y, Xiong L, Li J S, Zhao F, Jayashree K, Pranata S, Shen S M, Xing J L, Yan S C and Feng J S. 2018a. Towards pose invariant face recognition in the wild//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2207-2216 [DOI: 10.1109/CVPR.2018.00235]
Zhao J, Xiong L, Cheng Y, Cheng Y, Li J S, Zhou L, Xu Y, Karlekar J, Pranata S, Shen S M, Xing J L, Yan S C and Feng J S. 2018b. 3D-aided deep pose-invariant face recognition//Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden: IJCAI: 1184-1190 [DOI: 10.24963/ijcai.2018/165]
Zhao J, Xiong L, Jayashree K, Li J S, Zhao F, Wang Z C, Pranata S, Shen S M, Yan S C and Feng J S. 2017. Dual-agent GANs for photorealistic and identity preserving profile face synthesis//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc.: 65-75
Zhu X Y, Lei Z, Liu X M, Shi H L and Li S Z. 2016. Face alignment across large poses: a 3D solution//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 146-155 [DOI: 10.1109/CVPR.2016.23]
Zhu X Y, Lei Z, Yan J J, Yi D and Li S Z. 2015. High-fidelity pose and expression normalization for face recognition in the wild//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 787-796 [DOI: 10.1109/CVPR.2015.7298679]
Zhu Z Y, Luo P, Wang X G and Tang X O. 2013. Deep learning identity-preserving face space//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, Australia: IEEE: 113-120 [DOI: 10.1109/ICCV.2013.21]