Self-correcting noise labels for facial beauty prediction
2022, Vol. 27, No. 8, Pages 2487-2495
Received: 2021-03-15; Revised: 2021-05-21; Accepted: 2021-05-28; Published in print: 2022-08-16
DOI: 10.11834/jig.210125
目的 (Objective)
Facial beauty prediction studies how to endow computers with a human-like ability to judge or predict facial beauty. However, deep neural networks used for facial beauty prediction tend to overfit samples with noisy labels, which degrades their generalization ability. This paper therefore proposes a self-correcting noisy-label method for facial beauty prediction.
方法 (Method)
The method consists of a self-training teacher-model mechanism and a re-labeling retraining mechanism. The self-training teacher-model mechanism obtains a teacher model through self-training; the teacher helps the student model select clean samples for training until the student's generalization ability surpasses the teacher's, at which point the student becomes the new teacher, and this process repeats. The re-labeling retraining mechanism corrects noisy labels by comparing the maximum predicted probability with the predicted probability of the current label. The self-training teacher-model mechanism is then executed repeatedly on the corrected data.
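The label-correction rule, which compares the teacher's maximum predicted probability with the probability it assigns to the current label, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the threshold value and array shapes are assumptions.

```python
import numpy as np

def relabel(probs, labels, threshold=2.0):
    """Correct suspected noisy labels (sketch).

    probs:  (N, C) array of teacher-model predicted class probabilities.
    labels: (N,) array of current, possibly noisy, class labels.
    threshold: hypothetical factor; a label is corrected when the
        teacher's maximum class probability exceeds the probability
        it assigns to the current label by this factor.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels).copy()
    max_prob = probs.max(axis=1)                         # most confident class
    label_prob = probs[np.arange(len(labels)), labels]   # prob of current label
    suspect = max_prob > threshold * label_prob          # confident disagreement
    labels[suspect] = probs[suspect].argmax(axis=1)      # re-label those samples
    return labels
```

For example, with `threshold=2.0`, a sample labeled class 0 but predicted `[0.1, 0.8, 0.1]` is re-labeled to class 1, while a sample predicted `[0.4, 0.3, 0.3]` keeps its label.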
结果 (Result)
Experiments are conducted on the large scale facial beauty database (LSFBD) and the SCUT-FBP5500 database. The results show that the proposed method reduces the negative impact of noisy labels under synthetic noisy-label conditions, and achieves accuracies of 60.8% on the original LSFBD database and 75.5% on the original SCUT-FBP5500 database, higher than conventional methods.
结论 (Conclusion)
Experiments on LSFBD and SCUT-FBP5500 under synthetic noisy-label conditions, and on the original LSFBD and SCUT-FBP5500 databases, show that the proposed self-correcting noisy-label method selects clean samples for learning while making full use of all the data. It reduces the negative impact of noisy labels on facial beauty prediction to a certain extent and improves prediction accuracy.
Objective
Facial beauty prediction is the research on how to make computers judge or predict the beauty of faces the way humans do. However, deep-neural-network-based facial beauty prediction faces the problem of noisy label samples. Noisy labels are mislabeled samples in the database; they disturb the training of deep neural network models and thus reduce the models' generalizability. To reduce the negative impact of noisy labels on deep neural networks in facial beauty prediction, a self-correcting noisy-label method is proposed, which selects clean samples for learning while making full use of all the data.
Method
Our method is composed of a self-training teacher-model mechanism and a re-labeling retraining mechanism. First, two deep convolutional neural networks (CNNs) with the same structure are initialized simultaneously; one network is designated the teacher model, assumed to have stronger generalization ability, while the other serves as the student model. The teacher model can be specified arbitrarily at initialization. Second, small batches of training data are fed to the teacher and student models together at the input side. The student model receives the sample indices, retrieves the corresponding samples and labels, and performs back-propagation training until its generalization ability exceeds that of the teacher model. The student model then shares its optimal parameters with the teacher model, i.e., the original student model becomes the new teacher model; this is called the self-training teacher-model mechanism. After several iterations of training, small batches of data are fed into the teacher model with the strongest generalization ability among all previous training epochs, and its predicted probability for each category is calculated. If the maximum output probability predicted by the teacher model for a sample exceeds the output probability of its current label by a certain threshold, the sample's label is corrected. The self-training teacher-model mechanism is then executed iteratively on the corrected data; this process is called the re-labeling retraining mechanism. Finally, the teacher model is output as the final model.
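The self-training teacher-model loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it omits the clean-sample selection step, and `train_step` and `evaluate` are assumed user-supplied callables (one training epoch, and a validation-accuracy proxy for generalization ability).

```python
import copy

def self_training_teacher(teacher, student, train_step, evaluate, epochs):
    """Self-training teacher-model mechanism (sketch).

    teacher, student: model objects that support copy.deepcopy.
    train_step(model): performs one epoch of training on the model.
    evaluate(model): returns validation accuracy of the model.
    """
    for _ in range(epochs):
        train_step(student)
        # When the student generalizes better than the teacher,
        # its parameters are shared and it becomes the new teacher.
        if evaluate(student) > evaluate(teacher):
            teacher = copy.deepcopy(student)
    return teacher  # the strongest teacher seen so far
```

In a deep learning framework, the `deepcopy` would typically be replaced by copying the student's parameter state into the teacher model.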
Result
The ResNet-18 model pre-trained on the ImageNet database is used as the backbone deep neural network and, trained with the cross-entropy loss, serves as the baseline method. The experiments on the large scale facial beauty database (LSFBD) and the SCUT-FBP5500 database are divided into two parts. 1) The first part is performed under synthetic noisy-label conditions: 10%, 20%, and 30% of the training data are selected from each class of facial beauty data on the two databases, and their labels are randomly changed. The accuracy of our method exceeds the baseline by 5.8%, 4.1%, and 3.7% on the LSFBD database at noise rates of 30%, 20%, and 10%, respectively, and by 3.1%, 2.8%, and 2.5% on the SCUT-FBP5500 database, respectively. This demonstrates that our method reduces the negative impact of noisy labels under synthetic noisy-label conditions. 2) The second part is carried out on the original LSFBD and SCUT-FBP5500 databases, where our method exceeds the prediction accuracy of the baseline by 2.7% and 1.2%, respectively. Therefore, our method also reduces the negative impact of noisy labels under original-data conditions.
Conclusion
The proposed self-correcting noisy-label method reduces the negative impact of noisy labels on facial beauty prediction to some extent and improves prediction accuracy, both on the LSFBD and SCUT-FBP5500 databases under synthetic noisy-label conditions and on the original LSFBD and SCUT-FBP5500 facial beauty databases.