Expression recognition algorithm for parallel convolutional neural networks
2019, Vol. 24, No. 2, pp. 227-236
Received: 2018-06-04
Revised: 2018-08-16
Published in print: 2019-02-16
DOI: 10.11834/jig.180346
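Objective
Facial expression recognition has broad application prospects in commerce, security, medicine, and other fields, so recognizing facial expressions quickly and accurately is important for both research and application. Traditional machine learning methods require manual feature extraction, and their accuracy is difficult to guarantee. In recent years, convolutional neural networks (CNNs) have been widely used because of their good self-learning and generalization abilities, but problems remain, such as difficulty in extracting expression features and excessively long training times. To address these problems, an expression recognition method based on a parallel convolutional neural network is proposed.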
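Method
Facial expression images are first preprocessed by face localization, gray-level normalization, and angle adjustment, which removes the influence of complex backgrounds, illumination, and pose and yields an accurate face region. A CNN with two parallel convolution-pooling units is then designed for expression images so that subtle expression details can be extracted. The parallel structure has three different channels, which extract different image features that are then fused and finally sent to the SoftMax layer for classification.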
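The fusion and classification step can be illustrated with a minimal sketch. It assumes that the features of the three channels are fused by simple concatenation and that classification is a single linear layer followed by softmax; the abstract does not state either detail, and the names `fuse_and_classify`, `weights`, and `biases` are hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score first for numerical stability.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(branch_features, weights, biases):
    """Concatenate per-channel feature vectors and classify them.

    Concatenation and the single linear layer are assumptions made for
    illustration; the source only says the features are fused and sent
    to a SoftMax layer.
    """
    # Fusion: flatten the feature vectors of all channels into one vector.
    fused = [x for features in branch_features for x in features]
    for row in weights:
        if len(row) != len(fused):
            raise ValueError("each weight row must match the fused feature length")
    # One linear score per class, then softmax over the classes.
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)
```

In a real network the three channels would each be a stack of convolution, pooling, and ReLU layers producing feature maps rather than short lists, but the data flow (separate extraction, fusion, softmax) is the same.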
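Result
The proposed parallel CNN was evaluated with 10-fold cross-validation on two expression datasets, CK+ and FER2013, and the final result is the average over the 10 validation runs. The accuracy reached 94.03% on CK+ and 65.6% on FER2013, with a time per iteration of 0.185 s and 0.101 s, respectively.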
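Conclusion
This work offers a new approach to CNN design: the breadth of the network can be extended while its depth is controlled, so that more expression features are extracted. The experimental results show that, on expression datasets that differ considerably in number of images, resolution, and size, the network model achieves a high recognition rate and shortens training time.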
Objective
Facial expression recognition is widely applied in commerce, security, and medicine, and rapid, accurate identification of facial expressions is of great significance for research and application in these fields. Several traditional machine learning methods, such as support vector machine (SVM), principal component analysis (PCA), and local binary pattern (LBP), are used to identify facial expressions. However, these traditional algorithms require manual feature extraction. In this process, some features are hidden or deliberately enlarged because of human intervention, which affects accuracy. In recent years, convolutional neural networks (CNNs) have been used extensively in image recognition because of their good self-learning and generalization capabilities. However, several problems remain, such as difficulty in extracting facial expression features and long training times. This study presents an expression recognition method based on a parallel CNN to address these problems.
Method
First, a series of preprocessing operations is performed on the facial expression images. The face is detected in the original image with an AdaBoost cascade classifier, which removes the complex background and yields the face region. Illumination compensation is then applied to the face image: histogram equalization stretches the image nonlinearly and redistributes its pixel values. Finally, an affine transformation is used to align the face. Together, these steps remove complex background effects, compensate for lighting, and adjust the angle, giving a more accurate face region than the original image. Next, a CNN with two parallel convolution and pooling units, which can extract subtle expression features, is designed for facial expression images. The parallel unit is the core of the CNN and comprises convolutional layers, pooling layers, and the ReLU activation function. It has three different channels, each with a different number of convolutional layers, pooling layers, and ReLU activations, so that the channels extract different image features, which are then fused. The second parallel unit performs convolution and pooling on the features extracted by the first, which reduces their dimensionality and shortens the training time of the CNN. Finally, the fused features are sent to the SoftMax layer for expression classification.
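The histogram equalization used for illumination compensation can be sketched in a few lines. This is a generic textbook version, not necessarily the authors' exact procedure; it assumes an 8-bit grayscale image stored as a list of rows, and the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image.

    `image` is a list of rows of integers in [0, levels - 1]; this input
    format is an assumption made to keep the sketch dependency-free.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Count how many pixels fall on each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so the darkest level present maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if cdf_min == total:
        # Constant image: there is nothing to stretch.
        return [row[:] for row in image]

    # Nonlinear stretch: map each level through the normalized CDF.
    denom = total - cdf_min
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / denom) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A low-contrast 2x2 image is stretched over the full 0..255 range.
print(equalize_histogram([[10, 20], [30, 40]]))  # [[0, 85], [170, 255]]
```

The mapping is monotonic, so the ordering of gray levels is preserved while their spacing is redistributed, which is what reduces the effect of uneven lighting before the image reaches the network.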
Result
The CK+ and FER2013 expression datasets, after preprocessing and data augmentation, are each divided into 10 equal parts. Training and testing are then performed over the 10 parts, and the final accuracy is the average of the 10 results. Experimental results show that accuracy increases and time decreases remarkably compared with traditional machine learning methods, such as SVM, PCA, and LBP or their combinations, and with other classical CNNs, such as AlexNet and GoogLeNet. The proposed network achieves 94.03% accuracy on CK+ and 65.6% on FER2013, with iteration times of 0.185 s and 0.101 s, respectively.
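The evaluation protocol described above (10 parts, each used once for testing, accuracies averaged) can be sketched as follows. The shuffling, the seed, and the callback `train_and_score` are assumptions for illustration; the abstract does not say how the parts were drawn or how the model was trained on each split.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k folds whose sizes differ by at most one."""
    indices = list(range(n_samples))
    # Shuffling with a fixed seed is an assumption, made for reproducibility.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Average the score returned by `train_and_score` over k folds.

    `train_and_score(train_idx, test_idx)` is a hypothetical callback that
    trains a model on the training indices and returns its test accuracy.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)
```

Each sample is tested exactly once, so the averaged accuracy uses the whole dataset without ever scoring a model on images it was trained on.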
Conclusion
This study presents a new parallel CNN structure that extracts the features of facial expressions by using three different convolution and pooling paths. The three paths have different combinations of convolutional and pooling layers and can therefore extract different image features. The extracted features are combined and sent to the next layer for processing. This study provides a new concept for the design of CNNs, in which the breadth of the network is extended while its depth is controlled. The proposed CNN can extract expression features that are otherwise ignored or difficult to extract. The CK+ and FER2013 expression datasets differ greatly in quantity, size, and resolution. The experiments on both datasets show that the model can extract precise and subtle features of facial expression images in a relatively short time while maintaining the recognition rate.