Deep fusion neural network for facial age estimation
End-to-end trainable deep fusion network for facial age estimation
2018, Vol. 23, No. 1, pp. 133-143
Received: 2017-06-27
Revised: 2017-08-25
Published in print: 2018-01-16
DOI: 10.11834/jig.170305
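Objective
To improve the accuracy of age estimation from face images, an end-to-end trainable deep neural network model is proposed for facial age estimation.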
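Method
The network model is built by stacking multiple convolutional neural networks (CNNs) and one deep belief network (DBN), and is called the deep fusion network (DFN). First, multiple parallel CNNs extract appearance features from multiple regions of the face image; the resulting features are concatenated and fed into a DBN for nonlinear fusion. To enable end-to-end training of the whole DFN, an iterative net-wise training (INWT) mechanism is proposed. To reduce overfitting, the CNNs that correspond to local face patches are trained for the age estimation task through multiple iterations of transfer learning. After all CNNs and the DBN in the DFN have been pre-trained, the whole network is fine-tuned end to end.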
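The training order described above (pre-train each CNN, pre-train the DBN, then fine-tune the whole network) can be sketched as a simple stage list. This is only an illustrative sketch: the function name `inwt_schedule`, the stage labels, the fixed number of transfer-learning rounds per CNN, and the choice to finish one CNN before starting the next are all assumptions, not details given in the abstract.

```python
def inwt_schedule(num_cnns, transfer_rounds):
    """Return an ordered list of training stages for a DFN-like model.

    Hypothetical helper: the stage names and the uniform number of
    transfer-learning rounds per CNN are assumptions for illustration.
    """
    if num_cnns < 1 or transfer_rounds < 1:
        raise ValueError("num_cnns and transfer_rounds must be positive")

    stages = []
    # Net-wise pre-training: every CNN is trained separately, each over
    # several rounds of transfer learning (round count is assumed).
    for cnn_index in range(num_cnns):
        for round_index in range(transfer_rounds):
            stages.append(("pretrain_cnn", cnn_index, round_index))
    # The DBN is pre-trained after the CNNs (assumed to use their
    # concatenated features as input).
    stages.append(("pretrain_dbn", None, None))
    # Last stage: end-to-end fine-tuning of the whole network.
    stages.append(("finetune_dfn", None, None))
    return stages


schedule = inwt_schedule(num_cnns=3, transfer_rounds=2)
for stage in schedule:
    print(stage)
```

With 3 CNNs and 2 rounds, the list holds six CNN pre-training stages followed by the DBN stage and the global fine-tuning stage.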
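Result
The proposed method is tested on two facial age image databases, MORPH Ⅱ and FG-NET. The experimental results show that the DFN-based method achieves mean absolute errors (MAE) of 3.42 and 4.14 on the two databases, respectively. Compared with current mainstream age estimation algorithms, such as the shallow-learning-based CA-SVR method (MAE of 5.88 and 4.75 on the two databases, respectively), the deep-learning-based DeepRank+ method (MAE of 3.49 on MORPH Ⅱ), and the Deep-CS-LBMFL method (MAE of 4.22 on FG-NET), the estimation accuracy is clearly improved.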
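Conclusion
The proposed facial age estimation method based on the deep fusion network has a clear advantage over most current mainstream algorithms based on deep neural networks.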
Objective
In this study, we propose a facial age estimation (FAE) method based on an end-to-end trainable deep neural network called the deep fusion network (DFN), which stacks multiple convolutional neural networks (CNNs) and a deep belief network (DBN) to extract and fuse facial features for age estimation.
Method
The DFN-based method for FAE comprises image preprocessing, feature extraction, feature fusion, and age estimation. In image preprocessing, faces are cropped from images by a face detector. Face alignment then warps each face image to a fixed size and position based on landmark points, which reduces the adverse effects of varying face poses and noise on subsequent processing. Several multiscale local patches are cropped from the aligned face image based on facial landmark points. We employ a CNN as the feature extraction module (FEM), which extracts deep features from the local face patches obtained by image preprocessing. The number of FEMs is 37, the same as the number of face patches, so each FEM corresponds to one local face patch. The 37 parallel FEMs simultaneously extract global and local facial features from the face patches. After feature extraction, we obtain 37 CNN features, each of size 160. These 37 deep features are concatenated to form a single feature vector, and a DBN model is used to fuse them. Two challenges exist in DFN training. The first is implementing the end-to-end training of the DFN, which comprises multiple parallel CNNs and one stacked DBN. The second is training large-scale deep neural networks on limited local face patches. To address these issues, a scheme of iterative net-wise training (INWT) is proposed to train the DFN. The term "net-wise" means that all neural networks in the DFN, including the multiple CNNs and the one DBN, are pre-trained network by network, after which the entire DFN undergoes global end-to-end fine-tuning. The term "iterative" means that multiple rounds of transfer learning are used to train the FEM networks on limited local face patches. CNNs corresponding to patches that contain only a small portion of the face are gradually fine-tuned through multiple rounds of transfer learning to reduce overfitting. After all CNNs and the DBN are pre-trained, the DFN is globally fine-tuned to perform regression for facial age estimation.
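As a rough illustration of the fusion input described above, the following sketch concatenates 37 per-patch features of size 160 into one vector of length 37 × 160 = 5920. The function name `concatenate_patch_features` and the placeholder feature values are hypothetical; the sketch shows only the concatenation step, not the CNNs or the DBN.

```python
from itertools import chain

NUM_PATCHES = 37   # one FEM (CNN) per local face patch, as stated above
FEATURE_DIM = 160  # size of each CNN feature, as stated above


def concatenate_patch_features(patch_features):
    """Join per-patch feature lists into one flat fusion input vector.

    Hypothetical helper for illustration only.
    """
    if len(patch_features) != NUM_PATCHES:
        raise ValueError(f"expected {NUM_PATCHES} features, got {len(patch_features)}")
    for feature in patch_features:
        if len(feature) != FEATURE_DIM:
            raise ValueError(f"expected feature size {FEATURE_DIM}, got {len(feature)}")
    # Patch order is preserved, so each patch's 160 values stay contiguous.
    return list(chain.from_iterable(patch_features))


# Placeholder values stand in for real FEM outputs (hypothetical data).
dummy_features = [[float(patch)] * FEATURE_DIM for patch in range(NUM_PATCHES)]
fused_input = concatenate_patch_features(dummy_features)
print(len(fused_input))  # 5920
```

Keeping the patch order fixed matters in practice, because the fusion network learns weights tied to specific positions in the concatenated vector.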
Result
We conduct extensive experiments to evaluate the proposed FAE method. The experiments are performed on two well-known benchmarks, namely, the FG-NET and MORPH Ⅱ databases. First, we evaluate the proposed method with different numbers of transfer learning iterations. Results show that the proposed multiple iterative transfer learning can significantly improve the accuracy of age estimation. Second, we evaluate the proposed method with different patch combinations. Results show that local patches at various scales provide complementary information for FAE and that they all contribute to a lower mean absolute error (MAE). Third, we evaluate the proposed method with four fusion methods. In comparison with LR, SVR, and RA, the DBN-based method achieves the best MAE in all experiments. Finally, the proposed method is compared with state-of-the-art methods. Experimental results on the two databases show that the proposed DFN-based method is an effective deep architecture for FAE and achieves competitive performance (MAE of 3.42 on MORPH Ⅱ and 4.14 on FG-NET) compared with state-of-the-art methods.
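For readers unfamiliar with the metric, MAE is the average of the absolute differences between estimated and true ages, so lower is better. A minimal sketch follows; the function name and the sample ages are made up for illustration and are not data from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages (in years)."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("both lists must have the same length")
    if not true_ages:
        raise ValueError("at least one sample is required")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical sample: absolute errors are 3, 3, 2, and 4 years.
true_ages = [25, 40, 18, 60]
predicted_ages = [28, 37, 20, 56]
print(mean_absolute_error(true_ages, predicted_ages))  # 3.0
```

Under this definition, an MAE of 3.42 means the estimated age is off by about 3.4 years on average.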
Conclusion
We propose a deep neural network called DFN for FAE. Multiple CNNs are trained to extract deep facial age features, and one DBN is stacked for feature fusion, which makes the DFN a globally trainable end-to-end deep learning model that enlarges the scale of the neural network for better age estimation performance. An INWT scheme is then developed to train the DFN on limited multiscale local face patches. Experimental results on the MORPH Ⅱ and FG-NET databases show that the DFN is an effective deep learning model for FAE and achieves competitive results compared with state-of-the-art methods.