Published: 2018-07-16 | DOI: 10.11834/jig.170474 | 2018 | Volume 23 | Number 7 | Image Analysis and Recognition

Received: 2017-08-20; Revised: 2018-01-15. Funding: National Natural Science Foundation of China (61501154); Zhejiang Provincial College Students' Science and Technology Innovation Program (2017R407074). First author: Wang Qiang, born in 1992, male, master's student in control science and engineering, School of Automation, Hangzhou Dianzi University; his research interests are visual neural computation and image processing. E-mail: dg_wangqiang@163.com. Wu Wei, female, lecturer; her research interests are medical image processing, neural information coding and brain-computer interaction, and information-flow-based modeling of functional neural system activity. E-mail: ww@hdu.edu.cn. Zhu Yaping, female, professor; her research interests are integrated automation of production processes, automation system integration, and detection technology and information processing. E-mail: zhuyp@hdu.edu.cn. CLC number: TP391.41; Document code: A; Article ID: 1006-8961(2018)07-1042-10


Mixed recognition of upright and inverted faces
Wang Qiang, Fan Yingle, Wu Wei, Zhu Yaping
School of Automation, Hangzhou DianZi University, Hangzhou 310018, China
Supported by: National Natural Science Foundation of China(61501154)

Abstract

Objective With high-resolution imaging and parallel-computing hardware, face recognition based on massive visual data has become a research focus in pattern recognition and artificial intelligence. To a certain extent, traditional face recognition algorithms also consider principles of biological perception, such as using massive training data to dynamically adjust the structure and parameters of neural networks and realize optimal decisions. However, these methods use only a few basic characteristics of biological perception and simulate them as a black box overall. The abundant visual mechanisms in biological perception systems are the basis of visual comprehension and recognition, and the mechanism of recognizing inverted faces on the basis of the different information flows of visual neural systems has been demonstrated. A new face recognition method is proposed to solve mixed upright and inverted face recognition using global and local visual neural information flows. Method The recognition of upright faces may depend on a component-architecture mode, in which the holistic information is greater than the sum of the local features. The identification of an inverted face does not depend significantly on this holistic representation; instead, local features such as the eyes, mouth, and nose serve as the main information sources. Two visual cortical pipelines reflect the global and local features in face recognition mechanisms. However, most methods treat the two pipelines as operating independently, without exchanging information, and therefore adopt a divide-and-conquer strategy in practice. This work argues that upright and inverted faces represent more than a simple inversion of visual information.
The two visual pathways that convey holistic and local features play a decisive role in upright and inverted face recognition and are not independent of each other; the face information carried by the two pipelines should be complementary. When global contour information is used for face identification, the contribution of the facial features to recognition performance cannot be dismissed. In this work, we constructed a new face recognition system based on the global and local information conveyed by the two pipelines in the visual cortical pathways. Our study modeled the visual cortical pathway on the basis of left-right hemisphere coordination mechanisms. First, the underlying neural network was constructed, and redundancy reduction and preprocessing of upright and inverted face images were realized through texture-sensitive, symmetric convolution kernels. Second, a pooling neural network layer based on local-region extraction is proposed, and a multi-local-feature fusion network structure was constructed to realize the compression, extraction, and fusion of local information. Finally, a prediction function was defined according to the left-right hemisphere collaboration in the advanced visual cortex to integrate the global and local information. Result Visual tests and quantitative calculations showed that the method had enhanced feature capability in face recognition and could identify upright and inverted faces better than the traditional methods LDA, PCA, and DeepID. The model was built within the Caffe neural network framework, and its parameters were trained via batch gradient descent. With the AT&T database as an example, the multi-local-feature fusion network structure was added to a classical convolutional neural network (CNN) model.
The recognition accuracy improved from 98% to 100%, indicating that local information can improve the recognition of upright faces. In the experiment, the difference calculation showed that the underlying convolution kernels were symmetric and responded identically to the texture features of faces. An appropriate training dataset was used to adjust the relationship between global and local information during fusion. The recognition rates of the model were 98% and 94% for upright and inverted faces, respectively; thus, both upright and inverted face recognition showed good characteristics. With the pre-trained face recognition model, the two-pipeline face system exhibited satisfactory performance on a test dataset that mixed upright and inverted faces, so our method can address the mixed face recognition problem. Conclusion In this work, a local-feature-based pooling neural layer was designed on the basis of the texture sensitivity of a CNN to the input image features, realizing a multi-local-feature fusion network structure. Meanwhile, considering the biological mechanism of local participation in recognition, the relationship between the left and right hemispheres in the advanced visual cortex was introduced, and a prediction function integrating global and local information was proposed. The correlation between training data factors and local or holistic characteristics was emphasized. The proposed face recognition method contributes to the understanding of optic nerve mechanisms. For example, the traditional neural network, once fused with multi-local features, exhibited enhanced face recognition features and thus increased the effectiveness of the information. Compared with a training dataset of inverted faces alone, the mixed training dataset of upright and inverted faces had a larger impact on inverted face recognition.
The results showed the importance of inconsistency in the selection of local features and the crucial role of internal differences among local features in face recognition. The hybrid recognition method for upright and inverted faces proposed in this work provides a novel research idea for face recognition technology and discusses the role of multi-visual-pathway fusion in image understanding and visual cognition in the advanced visual cortex.

Key words

face recognition; inverted faces; multiple local feature fusion; visual pathway; visual mechanism; convolutional neural network

1.3.1 Local pooling layer

The MLPB layer mainly performs feature extraction and fusion on three local facial regions. Notably, to preserve the difference between the regions that represent upright and inverted faces, the main feature regions selected for upright faces are the upper-left region containing the left eye, the upper-right region containing the right eye, and the middle region containing the mouth and nose, while the feature regions for inverted faces are mainly the lower-left, lower-right, and middle regions. To transform local features effectively, a pooling neural network layer based on local-region extraction is proposed. Its basic idea is to cut a candidate region of size $L \times L$ into small blocks with a stride of $N \times N$, apply max pooling to each block, and output an $M \times M$ feature matrix, where $M$ is given by

 $M = \left( {L + 2p - k} \right)/N + 1$ (1)

 ${f^k}\left( {i, j} \right) = \mathop {\max }\limits_{0 \le m, n < N} \left\{ {{x^k}\left( {{a_{{\rm{roi}}}} + i \cdot N + m, {b_{{\rm{roi}}}} + j \cdot N + n} \right)} \right\}$ (2)
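As a sketch, the local-region max pooling of Eqs. (1) and (2) can be written in Python with NumPy. The function and variable names here are illustrative, not from the paper's code; padding is taken as $p = 0$ and the window equals the stride $N$ (non-overlapping blocks), so Eq. (1) reduces to $M = L/N$:

```python
import numpy as np

def local_region_max_pool(x, a_roi, b_roi, L, N):
    """Max-pool an L x L candidate region of feature map x (Eq. (2)).

    The region starts at (a_roi, b_roi) and is cut into non-overlapping
    N x N blocks; each block is reduced to its maximum, producing an
    M x M output with M = (L + 2p - k)/N + 1 (Eq. (1)); here p = 0 and
    k = N, so M = L // N.
    """
    M = L // N  # Eq. (1) with p = 0, k = N
    f = np.empty((M, M), dtype=x.dtype)
    for i in range(M):
        for j in range(M):
            block = x[a_roi + i * N : a_roi + (i + 1) * N,
                      b_roi + j * N : b_roi + (j + 1) * N]
            f[i, j] = block.max()  # max over 0 <= m, n < N
    return f
```

Under this reading, the layer would be applied once per candidate region (upper-left, upper-right, and middle for upright faces; lower-left, lower-right, and middle for inverted faces), and the pooled outputs fused downstream.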

1.4 Prediction function

 $Class(\mathit{\boldsymbol{F}}) = {\rm{arg}}\;\mathop {{\rm{max}}}\limits_i \left( {\mathit{\boldsymbol{W}}_i^1 \cdot {\mathit{\boldsymbol{F}}_{{\rm{all}}}} + \lambda \cdot \mathit{\boldsymbol{W}}_i^2 \cdot {\mathit{\boldsymbol{F}}_{{\rm{Local}}}} + {\mathit{\boldsymbol{b}}_i}} \right)$ (4)
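A minimal Python/NumPy sketch of this prediction function, assuming (as an illustration, not from the paper's code) that `W1` and `W2` hold one row of weights per class and `lam` is the fusion weight $\lambda$:

```python
import numpy as np

def predict_class(F_all, F_local, W1, W2, b, lam):
    """Eq. (4): fuse the global feature vector F_all and the fused local
    feature vector F_local with per-class weight matrices
    W1 (classes x dim_all) and W2 (classes x dim_local), add the per-class
    bias b, and return the index of the highest-scoring class."""
    scores = W1 @ F_all + lam * (W2 @ F_local) + b
    return int(np.argmax(scores))
```

In this reading, sweeping `lam` from 0.1 to 1.0 reweights the local pathway against the global one: a larger value lets the local features dominate the class scores.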

2.1 Recognition performance with a single training set

Table 1 Comparison of average recognition rates and test times of the algorithms

| Model | Average recognition rate /% | Test time per face /ms |
| --- | --- | --- |
| PCA | 88 | 345 |
| LDA | 92 | 905 |
| DeepID | 98 | 1 440 |
| Mixed recognition model | 100 | 1 560 |

Table 2 Comparison of algorithm recognition rates in a single training set

(unit: %)
| Model | Upright training, upright test | Upright training, inverted test | Inverted training, upright test | Inverted training, inverted test |
| --- | --- | --- | --- | --- |
| PCA | 87 | 7 | 8 | 87 |
| LDA | 92 | 6 | 10 | 92 |
| DeepID | 98 | 7 | 14 | 98 |
| Mixed recognition model | 100 | 9 | 14 | 100 |

2.2 Parameter optimization in the mixed recognition model

Table 3 Influence of parameter settings on the face recognition rate

(unit: %)
| Face | $\lambda$=0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Upright | 100 | 100 | 100 | 100 | 99 | 98 | 97 | 95 | 94 | 93 |
| Inverted | 21 | 23 | 30 | 33 | 39 | 46 | 57 | 65 | 75 | 80 |

2.3 Influence of the sample set on recognition performance

Table 4 Influence of different training methods on the face recognition rate

(unit: %)
| Face | Direct training on mixed samples | Pretraining on upright faces, retraining on inverted faces | Pretraining on upright faces, retraining on mixed faces |
| --- | --- | --- | --- |
| Upright | 15 | 98 | 100 |
| Inverted | 15 | 46 | 93 |

References

• [1] Perlibakas V. Distance measures for PCA-based face recognition[J]. Pattern Recognition Letters, 2004, 25(6): 711–724. [DOI:10.1016/j.patrec.2004.01.011]
• [2] Turk M, Pentland A. Eigenfaces for recognition[J]. Journal of Cognitive Neuroscience, 1991, 3(1): 71–86. [DOI:10.1162/jocn.1991.3.1.71]
• [3] Sun Y, Wang X G, Tang X O. Deep learning face representation from predicting 10000 classes[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 1891-1898. [DOI:10.1109/CVPR.2014.244]
• [4] Sun Y, Chen Y H, Wang X G, et al. Deep learning face representation by joint identification-verification[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal: ACM, 2014: 1988-1996. http://arxiv.org/abs/1406.4773
• [5] Sun Y, Wang X G, Tang X O. Deeply learned face representations are sparse, selective, and robust[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 2892-2900. [DOI:10.1109/CVPR.2015.7298907]
• [6] Schroff F, Kalenichenko D, Philbin J. FaceNet: a unified embedding for face recognition and clustering[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 815-823. [DOI:10.1109/CVPR.2015.7298682]
• [7] Wen Y D, Zhang K P, Li Z F, et al. A discriminative feature learning approach for deep face recognition[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam: Springer, 2016: 499-515. [DOI:10.1007/978-3-319-46478-7_31]
• [8] Liu W Y, Wen Y D, Yu Z D, et al. Large-margin softmax loss for convolutional neural networks[C]//Proceedings of the 33rd International Conference on International Conference on Machine Learning. New York: ACM, 2016: 507-516. http://dl.acm.org/citation.cfm?id=3045445
• [9] Leder H, Goller J, Forster M, et al. Face inversion increases attractiveness[J]. Acta Psychologica, 2017, 178: 25–31. [DOI:10.1016/j.actpsy.2017.05.005]
• [10] Hills P J, Mileva M, Thompson C, et al. Carryover of scanning behaviour affects upright face recognition differently to inverted face recognition[J]. Visual Cognition, 2016, 24(9-10): 459–472. [DOI:10.1080/13506285.2017.1314399]
• [11] Itier R J, Taylor M J. Face recognition memory and configural processing: a developmental ERP study using upright, inverted, and contrast-reversed faces[J]. Journal of Cognitive Neuroscience, 2004, 16(3): 487–502. [DOI:10.1162/089892904322926818]
• [12] Schwartz N Z. Reconsidering face specialization and face inversion[D]. California: University of Southern California, 2007. http://digitallibrary.usc.edu/cdm/compoundobject/collection/p15799coll127/id/556663/rec/2
• [13] Leder H, Bruce V. Feature processing from upright and inverted faces[M]//Wechsler H, Phillips P J, Bruce V, et al. Face Recognition. Berlin, Heidelberg: Springer, 1998: 547-555. [DOI:10.1007/978-3-642-72201-1_34]
• [14] DeHeering A, Rossion B, Maurer D. Revisiting upright and inverted face recognition in 6 to 12-year-old children and adults[J]. Journal of Vision, 2010, 10(7): 581. [DOI:10.1167/10.7.581]
• [15] Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//Proceedings of the 13th European Conference on Computer Vision-ECCV 2014. Switzerland: Springer, 2014: 818-833. [DOI:10.1007/978-3-319-10590-1_53]
• [16] Sermanet P, Eigen D, Zhang X, et al. OverFeat: integrated recognition, localization and detection using convolutional networks[J]. arXiv:1312.6229, 2013. http://www.researchgate.net/publication/259441043_OverFeat_Integrated_Recognition_Localization_and_Detection_using_Convolutional_Networks
• [17] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. [DOI:10.1109/TPAMI.2016.2577031]
• [18] Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1440-1448. [DOI:10.1109/ICCV.2015.169]
• [19] He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904–1916. [DOI:10.1109/TPAMI.2015.2389824]
• [20] Leder H, Bruce V. When inverted faces are recognized: the role of configural information in face recognition[J]. The Quarterly Journal of Experimental Psychology A, 2000, 53(2): 513–536. [DOI:10.1080/713755889]
• [21] Bartlett J C, Searcy J. Inversion and configuration of faces[J]. Cognitive Psychology, 1993, 25(3): 281–316. [DOI:10.1006/cogp.1993.1007]