Mixed recognition of upright and inverted faces
2018, Vol. 23, No. 7, Pages: 1042-1051
Received: 2017-08-20; Revised: 2018-01-15; Published in print: 2018-07-16
DOI: 10.11834/jig.170474
Objective
To challenge the view that upright and inverted faces are related only by a simple inversion, this work studies mixed recognition of upright and inverted faces based on the holistic and local information flows of the visual neural system.
Method
Simulating the transfer and processing of visual information flow along the visual pathway, we first construct an underlying neural network with mechanisms for texture-sensitive features and symmetric convolution kernels, which removes redundancy from and preprocesses upright and inverted face images. We then propose a pooling layer based on local-region extraction and build a network structure that fuses multiple local features, realizing the compressed extraction and fusion of local information. Finally, drawing on the cooperation between the left and right hemispheres in the higher visual cortex, we propose a prediction function that fuses holistic and local information.
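As a concrete illustration of this three-stage design, the following is a minimal PyTorch sketch; the paper itself was implemented in Caffe, and the layer sizes, local-region boxes, and class count below are assumptions for 112×92 AT&T-style inputs, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class GlobalLocalFaceNet(nn.Module):
    """Illustrative three-stage pipeline: convolutional stem, local-region
    pooling branches, and a classifier over the fused holistic + local
    descriptor. Layer sizes and region boxes are assumptions."""

    def __init__(self, num_classes=40):           # 40 subjects in the AT&T database
        super().__init__()
        # Stage 1: convolutional stem (stands in for the underlying network
        # with texture-sensitive, symmetric kernels).
        self.stem = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),                        # 112x92 -> 56x46
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 56x46 -> 28x23
        )
        # Stage 2: hypothetical local regions (rows, cols) on the 28x23 map,
        # roughly covering the eye and mouth areas.
        self.regions = [(slice(4, 14), slice(2, 12)),
                        (slice(4, 14), slice(11, 21)),
                        (slice(16, 26), slice(6, 18))]
        self.local_pool = nn.AdaptiveAvgPool2d(4)   # pool each region to 4x4
        self.global_pool = nn.AdaptiveAvgPool2d(6)  # holistic branch
        # Stage 3: classifier over the fused holistic + local descriptor.
        fused_dim = 32 * 6 * 6 + len(self.regions) * 32 * 4 * 4
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, x):
        f = self.stem(x)                                      # N x 32 x 28 x 23
        g = self.global_pool(f).flatten(1)                    # holistic features
        locals_ = [self.local_pool(f[:, :, r, c]).flatten(1)  # local features
                   for (r, c) in self.regions]
        fused = torch.cat([g] + locals_, dim=1)               # multi-local fusion
        return self.classifier(fused)

model = GlobalLocalFaceNet()
scores = model(torch.randn(2, 1, 112, 92))  # two grayscale AT&T-sized images
print(scores.shape)                         # torch.Size([2, 40])
```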
Result
Taking the AT&T database as an example, the proposed method adds a multi-local-feature fusion structure to a classical convolutional neural network model and raises recognition accuracy from 98% to 100%, which shows that local information improves the recognition of upright faces. Furthermore, with a suitable training dataset, an adjusted ratio of holistic to local information during fusion, and an appropriate training scheme, the model recognizes upright and inverted faces at 100% and 93%, respectively, demonstrating good behavior on both.
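One straightforward way to obtain these two rates separately is to evaluate a single trained model on the upright test images and on vertically flipped copies of the same images. The sketch below assumes a trained PyTorch model and an AT&T-style test loader (hypothetical names); it is not the authors' evaluation code.

```python
import torch

@torch.no_grad()
def accuracy_upright_and_inverted(model, test_loader, device="cpu"):
    """Report accuracy on upright images and on their vertically flipped
    (inverted-face) copies. `model` and `test_loader` are assumed to exist."""
    model.eval().to(device)
    correct_up = correct_inv = total = 0
    for images, labels in test_loader:            # images: N x C x H x W
        images, labels = images.to(device), labels.to(device)
        pred_up = model(images).argmax(dim=1)
        pred_inv = model(torch.flip(images, dims=[2])).argmax(dim=1)  # flip rows
        correct_up += (pred_up == labels).sum().item()
        correct_inv += (pred_inv == labels).sum().item()
        total += labels.numel()
    return correct_up / total, correct_inv / total
```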
Conclusion
The proposed method shows that although the two visual pathways carrying holistic and local features play decisive roles in upright and inverted face recognition, respectively, they do not exist in isolation: the face information characterized by the two pathways should be complementary. The method not only provides a new idea for face recognition but also contributes to a further understanding of visual neural mechanisms.
Objective
With high-resolution imaging and hardware capable of parallel computing, face recognition based on massive visual data has become a research focus in pattern recognition and artificial intelligence. To a certain extent, traditional face recognition algorithms also consider the principles of biological perception, for example, by using massive training samples to dynamically modify the structure and parameters of neural networks and to reach optimal decisions. However, these methods use only a few basic characteristics of biological perception and simulate them as a black box. The abundant visual mechanisms of biological perception systems are the basis of visual comprehension and recognition, and the recognition of inverted faces has been shown to depend on distinct information flows in the visual neural system. A new face recognition method is therefore proposed to solve mixed upright and inverted face recognition using global and local visual neural information flows.
Method
The recognition of upright faces may depend on a configural mode of processing in which the holistic information amounts to more than the sum of the local features, whereas the recognition of an inverted face does not depend strongly on such holistic characterization; the eyes, mouth, and nose then act as local sources of information. Two visual cortical pipelines convey the global and local features involved in face recognition. However, most methods treat the two pipelines or systems as operating independently without exchanging information, and a divide-and-conquer strategy is therefore adopted in practice. This work argues instead that upright and inverted faces are not related by a mere inversion of visual information. The two visual pathways that convey holistic and local features play decisive roles in upright and inverted face recognition, respectively, yet they are not independent of each other; the face information portrayed by the two pipelines should be complementary. Even when global contour information is used for identification, the contribution of the facial features to recognition performance cannot be dismissed.
In this work, we constructed a new face recognition system based on the global and local information carried by the two pipelines of the visual cortical pathways, taking into account the coordination between the left and right hemispheres. First, the underlying neural network was constructed, and redundancy removal and preprocessing of upright and inverted face images were realized through texture-sensitive features and symmetric convolution kernels.
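The abstract does not spell out how the symmetric kernels are obtained; one plausible reading, shown below purely as an assumption, is to constrain each kernel to equal its vertically mirrored copy, so that a texture patch and its inverted copy produce mirrored but otherwise identical responses.

```python
import torch
import torch.nn as nn

def symmetrize_vertically(conv: nn.Conv2d) -> None:
    """Force each kernel to equal its vertical mirror (an assumed reading of
    the paper's 'symmetric convolution kernels')."""
    with torch.no_grad():
        w = conv.weight
        conv.weight.copy_(0.5 * (w + torch.flip(w, dims=[2])))  # flip kernel rows

conv = nn.Conv2d(1, 8, kernel_size=5, padding=2, bias=False)
symmetrize_vertically(conv)

patch = torch.randn(1, 1, 32, 32)
up = conv(patch)
inv = conv(torch.flip(patch, dims=[2]))          # vertically inverted patch
# With symmetric kernels, the inverted patch yields the vertically flipped response.
print(torch.allclose(torch.flip(up, dims=[2]), inv, atol=1e-5))  # True
```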
Second, a pooling layer based on local-region extraction was proposed, and a network structure for multi-local feature fusion was constructed to realize the compressed extraction and fusion of local information.
Finally, a prediction function was defined according to the collaboration between the left and right hemispheres in the advanced visual cortex to integrate the global and local information.
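The exact form of this prediction function is not given in the abstract; a hedged, score-level reading is a weighted combination of the class probabilities produced by the holistic and local branches, with the weight playing the role of the global-to-local ratio tuned in the experiments.

```python
import torch
import torch.nn.functional as F

def fused_prediction(global_logits, local_logits, lam=0.6):
    """Illustrative fusion of holistic and local predictions; `lam` stands in
    for the global-to-local ratio (its value here is an assumption)."""
    p_global = F.softmax(global_logits, dim=1)
    p_local = F.softmax(local_logits, dim=1)
    return lam * p_global + (1.0 - lam) * p_local   # fused class probabilities

logits_g = torch.randn(4, 40)   # e.g. 40 AT&T identities
logits_l = torch.randn(4, 40)
print(fused_prediction(logits_g, logits_l).argmax(dim=1))
```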
Result
Visual tests and quantitative calculations showed that the proposed method has an enhanced feature capability for face recognition and identifies upright and inverted faces better than the traditional methods LDA, PCA, and DeepID. The experimental model was built on the Caffe neural network framework, and its parameters were trained via batch gradient descent. With the AT&T database as an example, the multi-local-feature fusion network structure was added to a classical convolutional neural network (CNN) model, and recognition accuracy improved from 98% to 100%; this improvement indicates that local information improves the recognition of upright faces. In the experiments, a difference calculation showed that the underlying convolution kernels are symmetric and give the same response to the texture features of faces. A suitable training dataset was used, and the ratio of global to local information during fusion was adjusted; the recognition rates of the model were 98% and 94% for upright and inverted faces, respectively, so recognition of both upright and inverted faces shows good characteristics. With the pretrained face recognition model, the two-pipeline face system also performed satisfactorily on a test dataset that mixed upright and inverted faces; thus, our method can address the problem of mixed face recognition.
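The model was trained in Caffe with batch gradient descent; the loop below is only a framework-agnostic PyTorch sketch of that procedure, with placeholder hyperparameters rather than the paper's settings.

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs=50, lr=0.01, momentum=0.9, device="cpu"):
    """Plain mini-batch gradient descent with cross-entropy loss; the
    hyperparameters are placeholders, not the paper's configuration."""
    model.train().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    for epoch in range(epochs):
        running_loss = 0.0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()                  # backpropagate the batch loss
            optimizer.step()                 # gradient-descent update
            running_loss += loss.item()
        print(f"epoch {epoch + 1}: loss = {running_loss / len(train_loader):.4f}")
```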
Conclusion
In this work, a pooling layer based on local features was designed according to the texture sensitivity of the CNN input features, realizing a network structure for multi-local feature fusion. With consideration for the biological mechanism by which local information participates in recognition, the relationship between the left and right hemispheres in the advanced visual cortex was introduced, and a prediction function integrating global and local information was proposed. The correlation between the training data and the local or holistic characteristics was also emphasized. The proposed face recognition method contributes to the understanding of visual neural mechanisms. For example, a traditional neural network augmented with multi-local feature fusion gains enhanced face recognition features and thus more effective information. Compared with a training dataset of inverted faces alone, a mixed training dataset of upright and inverted faces has a larger impact on inverted face recognition. The results show the importance of the selection of local features and the crucial role of the internal differences among local features in face recognition. The hybrid upright and inverted face recognition method proposed in this work provides a novel research direction for face recognition technology and sheds light on the role of multi-pathway fusion in image understanding and in the visual cognition of the advanced visual cortex.
Perlibakas V. Distance measures for PCA-based face recognition[J]. Pattern Recognition Letters, 2004, 25(6):711-724.[DOI:10.1016/j.patrec.2004.01.011]
Turk M, Pentland A. Eigenfaces for recognition[J]. Journal of Cognitive Neuroscience, 1991, 3(1):71-86.[DOI:10.1162/jocn.1991.3.1.71]
Sun Y, Wang X G, Tang X O. Deep learning face representation from predicting 10000 classes[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 1891-1898. [DOI: 10.1109/CVPR.2014.244]
Sun Y, Chen Y H, Wang X G, et al. Deep learning face representation by joint identification-verification[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal: ACM, 2014: 1988-1996. http://arxiv.org/abs/1406.4773
Sun Y, Wang X G, Tang X O. Deeply learned face representations are sparse, selective, and robust[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 2892-2900. [DOI: 10.1109/CVPR.2015.7298907]
Schroff F, Kalenichenko D, Philbin J. FaceNet: a unified embedding for face recognition and clustering[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 815-823. [DOI: 10.1109/CVPR.2015.7298682]
Wen Y D, Zhang K P, Li Z F, et al. A discriminative feature learning approach for deep face recognition[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam: Springer, 2016: 499-515. [DOI: 10.1007/978-3-319-46478-7_31]
Liu W Y, Wen Y D, Yu Z D, et al. Large-margin softmax loss for convolutional neural networks[C]//Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 507-516. http://dl.acm.org/citation.cfm?id=3045445
Leder H, Goller J, Forster M, et al. Face inversion increases attractiveness[J]. Acta Psychologica, 2017, 178:25-31.[DOI:10.1016/j.actpsy.2017.05.005]
Hills P J, Mileva M, Thompson C, et al. Carryover of scanning behaviour affects upright face recognition differently to inverted face recognition[J]. Visual Cognition, 2016, 24(9-10):459-472.[DOI:10.1080/13506285.2017.1314399]
Itier R J, Taylor M J. Face recognition memory and configural processing:a developmental ERP study using upright, inverted, and contrast-reversed faces[J]. Journal of Cognitive Neuroscience, 2004, 16(3):487-502.[DOI:10.1162/089892904322926818]
Schwartz N Z. Reconsidering face specialization and faceinversion[D]. California: University of Southern California, 2007. http://digitallibrary.usc.edu/cdm/compoundobject/collection/p15799coll127/id/556663/rec/2 .
Leder H, Bruce V. Feature processing from upright and inverted faces[M]//Wechsler H, Phillips P J, Bruce V, et al. Face Recognition. Berlin, Heidelberg: Springer, 1998: 547-555. [DOI: 10.1007/978-3-642-72201-1_34]
DeHeering A, Rossion B, Maurer D. Revisiting upright and inverted face recognition in 6 to 12-year-old children and adults[J]. Journal of Vision, 2010, 10(7):581.[DOI:10.1167/10.7.581]
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//Proceedings of the 13th European Conference on Computer Vision. Zurich: Springer, 2014: 818-833. [DOI: 10.1007/978-3-319-10590-1_53]
Sermanet P, Eigen D, Zhang X, et al. OverFeat: integrated recognition, localization and detection using convolutional networks[J]. arXiv preprint arXiv: 1312.6229, 2013.
Ren S Q, He K M, Girshick R, et al. Faster R-CNN:towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149.[DOI:10.1109/TPAMI.2016.2577031]
Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1440-1448. [DOI: 10.1109/ICCV.2015.169]
He K M, Zhang X Y, Ren S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. [DOI: 10.1109/TPAMI.2015.2389824]
Leder H, Bruce V. When inverted faces are recognized: the role of configural information in face recognition[J]. The Quarterly Journal of Experimental Psychology Section A, 2000, 53(2): 513-536. [DOI: 10.1080/713755889]
Bartlett J C, Searcy J. Inversion and configuration of faces[J]. Cognitive Psychology, 1993, 25(3):281-316.[DOI:10.1006/cogp.1993.1007]