Soft threshold denoising and video data fusion-relevant low-quality 3D face recognition
Vol. 28, Issue 5, Pages: 1434-1444 (2023)
Published: 16 May 2023
DOI: 10.11834/jig.220695
桑高丽, 肖述笛, 赵启军. 2023. 联合软阈值去噪和视频数据融合的低质量3维人脸识别. 中国图象图形学报, 28(05):1434-1444
Sang Gaoli, Xiao Shudi, Zhao Qijun. 2023. Soft threshold denoising and video data fusion-relevant low-quality 3D face recognition. Journal of Image and Graphics, 28(05):1434-1444
Objective
Low-quality 3D face recognition has become a hot topic in pattern recognition in recent years. Unlike traditional high-quality 3D face recognition, its main challenges are low data quality and high noise. To address the heavy noise in low-quality 3D face data and the difficulty of extracting effective features from a single depth map of limited quality, this paper proposes a low-quality 3D face recognition method that combines soft threshold denoising and video data fusion.
Method
First, to deal with the noise in low-quality 3D faces, a plug-and-play soft threshold denoising module is proposed, which denoises the features while the network extracts them. Second, to make the extracted features more discriminative, a joint gradient loss function combining softmax and ArcFace (additive angular margin loss for deep face recognition) is proposed. Finally, to better exploit multiple frames of low-quality video data for improving face data quality, a video data fusion module based on gated recurrent units is proposed; it effectively fuses the complementary information among video frames and further improves the accuracy of low-quality 3D face recognition.
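For illustration, below is a minimal PyTorch sketch of what such a plug-and-play soft-threshold denoising block could look like: a small sub-network predicts channel-wise thresholds from the feature magnitudes, and soft thresholding shrinks low-magnitude (noise-dominated) responses toward zero. The module name, the threshold sub-network, and all hyperparameters are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn


class SoftThresholdDenoise(nn.Module):
    """Plug-and-play soft-threshold denoising block (illustrative sketch).

    A small sub-network predicts a non-negative, channel-wise threshold tau
    from the feature map itself; soft thresholding then shrinks low-magnitude
    (noise-dominated) responses toward zero while keeping their sign.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.threshold_net = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # scaling factor in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature map from the backbone
        abs_mean = x.abs().mean(dim=(2, 3))            # (N, C) magnitude statistics
        tau = abs_mean * self.threshold_net(abs_mean)  # learned channel-wise thresholds
        tau = tau.unsqueeze(-1).unsqueeze(-1)          # (N, C, 1, 1)
        # Soft thresholding: sign(x) * max(|x| - tau, 0)
        return torch.sign(x) * torch.relu(x.abs() - tau)


if __name__ == "__main__":
    block = SoftThresholdDenoise(channels=64)
    features = torch.randn(2, 64, 28, 28)
    print(block(features).shape)  # torch.Size([2, 64, 28, 28])
```

Because the block keeps the input and output shapes identical, a module of this form can be inserted after any convolutional stage of a feature extraction backbone.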
Result
The proposed method is compared with recent methods on two public datasets. Under the open-set and closed-set evaluation protocols of Lock3DFace (low-cost Kinect 3D faces), the average recognition rate is improved by 0.28% and 3.13%, respectively, over the second-best method; under the open-set evaluation protocol of Extended-Multi-Dim, the average recognition rate is improved by 1.03% over the second-best method.
Conclusion
The proposed low-quality 3D face recognition method not only effectively alleviates the influence of noise in low-quality data, but also effectively fuses the complementary information of multiple video frames, substantially improving the accuracy of low-quality 3D face recognition.
Objective
Portable 3D sensors have been developed to make the acquisition of 3D facial data user-friendly, and low-quality 3D face recognition has accordingly received increasing attention in pattern recognition in recent years. Low-quality 3D face recognition is challenged by low data quality and high noise. To suppress the high noise in low-quality 3D face data and to alleviate the difficulty of extracting effective features from limited single-frame depth data, we develop a novel low-quality 3D face recognition method based on soft threshold denoising and video data fusion.
Method
First, a trainable soft threshold denoising module is developed to denoise the features during feature extraction. Instead of setting the thresholds manually, the module learns them with a neural network, so it can be plugged directly into the feature extraction network. Second, to make the extracted features more distinctive, a joint gradient loss function combining softmax and ArcFace (additive angular margin loss for deep face recognition) is used to supervise feature learning. Finally, to make use of multiple frames of low-quality video data, a gated recurrent unit (GRU) based video data fusion module is proposed to improve the quality of the face representation by fusing the complementary information among video frames.
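As one reading of the joint gradient loss idea, the sketch below blends plain softmax cross-entropy with an ArcFace-style additive angular margin loss and gradually shifts the weight from the former to the latter during training; the blending schedule, scale s, and margin m are assumptions for illustration, not the paper's reported settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointSoftmaxArcFaceLoss(nn.Module):
    """Illustrative joint loss: (1 - alpha) * softmax CE + alpha * ArcFace CE."""

    def __init__(self, feat_dim: int, num_classes: int, s: float = 64.0, m: float = 0.5):
        super().__init__()
        # Class-center weights shared by both branches
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.xavier_uniform_(self.weight)
        self.s, self.m = s, m

    def forward(self, feats: torch.Tensor, labels: torch.Tensor, alpha: float) -> torch.Tensor:
        # Cosine similarity between L2-normalized features and class centers
        cosine = F.linear(F.normalize(feats), F.normalize(self.weight))  # (N, num_classes)

        # Softmax branch: scaled cosine logits with standard cross-entropy
        loss_softmax = F.cross_entropy(self.s * cosine, labels)

        # ArcFace branch: add the angular margin m to the target-class angle
        theta = torch.acos(cosine.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        target = F.one_hot(labels, num_classes=cosine.size(1)).bool()
        cosine_margin = torch.where(target, torch.cos(theta + self.m), cosine)
        loss_arcface = F.cross_entropy(self.s * cosine_margin, labels)

        # alpha in [0, 1] is ramped up over training (softmax -> ArcFace)
        return (1.0 - alpha) * loss_softmax + alpha * loss_arcface
```

In training, alpha could follow a simple linear ramp such as alpha = min(1.0, epoch / warmup_epochs), so the easier softmax objective dominates early epochs and the margin-based objective takes over later; the exact schedule used by the authors is not specified here.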
Result
To verify its effectiveness, the proposed method is compared with recent methods on two popular low-quality 3D face datasets, Lock3DFace (low-cost Kinect 3D faces) and Extended-Multi-Dim, following their established training and testing protocols. On each of the three protocols below, the comparison is against the method with the second-highest performance. On the Lock3DFace closed-set protocol, the average recognition rate is improved by 3.13%; on the Lock3DFace open-set protocol, it is improved by 0.28%; and on the Extended-Multi-Dim open-set protocol, it is improved by 1.03%. Furthermore, the ablation study demonstrates the effectiveness and feasibility of both soft threshold denoising and video data fusion.
Conclusion
A trainable soft threshold denoising module is developed to denoise low-quality 3D faces. A joint gradient loss function built on softmax and ArcFace is used to extract more distinctive features. Furthermore, a video data fusion module is used to fuse the complementary information between video frames, which further improves the accuracy of low-quality 3D face recognition. The proposed method alleviates the influence of noise and integrates the complementary information of multiple video frames, and thus shows strong potential for low-quality 3D face recognition.
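To make the video data fusion idea concrete, here is a minimal sketch of GRU-based fusion of per-frame features: each depth frame is first encoded into a feature vector by the backbone, the resulting sequence is passed through a GRU, and the final hidden state is taken as the fused face representation. The dimensions and the choice of the last hidden state are illustrative assumptions, not the authors' exact fusion design.

```python
import torch
import torch.nn as nn


class GRUVideoFusion(nn.Module):
    """Illustrative GRU-based fusion of per-frame face features."""

    def __init__(self, feat_dim: int = 512, hidden_dim: int = 512):
        super().__init__()
        self.gru = nn.GRU(input_size=feat_dim, hidden_size=hidden_dim, batch_first=True)

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (N, T, feat_dim) features of T video frames per subject
        _, h_n = self.gru(frame_feats)  # h_n: (1, N, hidden_dim)
        return h_n.squeeze(0)           # (N, hidden_dim) fused representation


if __name__ == "__main__":
    fusion = GRUVideoFusion()
    feats = torch.randn(4, 8, 512)      # 4 clips, 8 frames each
    print(fusion(feats).shape)           # torch.Size([4, 512])
```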
Keywords: 3D face recognition; low-quality 3D face; soft threshold denoising; joint gradient loss function; video data fusion
Cui J Y, Zhang H, Han H, Shan S G and Chen X L. 2018. Improving 2D face recognition via discriminative face depth estimation//Proceedings of 2018 International Conference on Biometrics. Gold Coast, Australia: IEEE: 140-147 [DOI: 10.1109/ICB2018.2018.00031]
Deng J, Guo J, Xue N and Zafeiriou S. 2019. ArcFace: additive angular margin loss for deep face recognition//Proceedings of 2019 IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4690-4699 [DOI: 10.1109/CVPR.2019.00482]
Dey R and Salem F M. 2017. Gate-variants of gated recurrent unit (GRU) neural networks//Proceedings of the 60th International Midwest Symposium on Circuits and Systems (MWSCAS). Boston, USA: IEEE: 1597-1600 [DOI: 10.1109/MWSCAS.2017.8053243]
Gong X and Zhou Y. 2021. 3D face recognition for low quality data. Journal of University of Electronic Science and Technology of China, 50(1): 43-51
龚勋, 周炀. 2021. 面向低质量数据的3D人脸识别. 电子科技大学学报, 50(1): 43-51 [DOI: 10.12178/1001-0548.2020321]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hsu G S J, Liu Y L, Peng H C and Wu P X. 2014. RGB-D-based face reconstruction and recognition. IEEE Transactions on Information Forensics and Security, 9(12): 2110-2118 [DOI: 10.1109/TIFS.2014.2361028]
Hu Z G, Gui P H, Feng Z Q, Zhao Q J, Fu K R, Liu F and Liu Z X. 2019. Boosting depth-based face recognition from a quality perspective. Sensors, 19(19): #4124 [DOI: 10.3390/s19194124]
Ioffe S and Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift//Proceedings of the 32nd International Conference on International Conference on Machine Learning. Lille, France: JMLR.org: 448-456
Li B Y L, Mian A S, Liu W Q and Krishna A. 2013. Using kinect for face recognition under varying poses, expressions, illumination and disguise//Proceedings of 2013 IEEE Workshop on Applications of Computer Vision (WACV). Clearwater Beach, USA: IEEE: 186-192 [DOI: 10.1109/WACV.2013.6475017]
Liu Z X, Hu H, Bai J Q, Li S H and Lian S G. 2019. Feature aggregation network for video face recognition//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul, Korea (South): IEEE: 990-998 [DOI: 10.1109/ICCVW.2019.00128]
Mu G D, Huang D, Hu G S, Sun J and Wang Y H. 2019. Led3D: a lightweight and efficient deep approach to recognizing low-quality 3D faces//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5766-5775 [DOI: 10.1109/CVPR.2019.00592]
Sandler M, Howard A, Zhu M L, Zhmoginov A and Chen L C. 2018. MobileNetV2: inverted residuals and linear bottlenecks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4510-4520 [DOI: 10.1109/CVPR.2018.00474]
Savran A, Alyüz N, Dibeklioğlu H, Çeliktutan O, Gökberk B, Sankur B and Akarun L. 2008. Bosphorus database for 3D face analysis//Proceedings of the 1st European Workshop on Biometrics and Identity Management. Roskilde, Denmark: Springer: 47-56 [DOI: 10.1007/978-3-540-89991-4_6]
Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR: #1556 [DOI: 10.48550/arXiv.1409.1556]
Xu C H, Wang Y H and Tan T N. 2004. Overview of research on 3D face modeling. Journal of Image and Graphics, 9(8): 893-903
徐成华, 王蕴红, 谭铁牛. 2004. 3维人脸建模与应用. 中国图象图形学报, 9(8): 893-903 [DOI: 10.3969/j.issn.1006-8961.2004.08.001]
Yang J L, Ren P R, Zhang D Q, Chen D, Wen F, Li H D and Hua G. 2017. Neural aggregation network for video face recognition//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5216-5225 [DOI: 10.1109/CVPR.2017.554]
Yang X D, Huang D, Wang Y H and Chen L M. 2015. Automatic 3D facial expression recognition using geometric scattering representation//Proceedings of the 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG). Ljubljana, Slovenia: IEEE: #7163090 [DOI: 10.1109/FG.2015.7163090]
Zhang J J, Huang D, Wang Y H and Sun J. 2016. Lock3DFace: a large-scale database of low-cost kinect 3D faces//Proceedings of 2016 International Conference on Biometrics. Halmstad, Sweden: IEEE: #7550062 [DOI: 10.1109/ICB.2016.7550062]
Zhang Z H, Yu C C, Xu S and Li H B. 2021. Learning flexibly distributional representation for low-quality 3d face recognition//Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI: 3465-3473 [DOI: 10.1609/aaai.v35i4.16460]