王强,范影乐,武薇,朱亚萍(杭州电子科技大学自动化学院, 杭州 310018)
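Objective To challenge the view that upright and inverted faces are related by a simple inversion, this work studies the mixed recognition of upright and inverted faces based on the global and local information flows of the visual nervous system. Method The transmission and processing of visual information flow in the visual pathway are simulated. First, an underlying neural network is constructed with a texture-sensitive mechanism and symmetric convolution kernels to remove redundancy from and preprocess upright and inverted face images. Next, a pooling neural network layer based on local region extraction is proposed, and a multi-local-feature fusion network structure is built to compress, extract, and fuse local information. Finally, a prediction function that fuses global and local information is proposed according to the cooperation of the left and right hemispheres in the higher visual cortex. Result Taking the AT&T database as an example, adding the multi-local-feature fusion network structure to a classical convolutional neural network model raised the recognition accuracy from 98% to 100%, indicating that local information improves the recognition of upright faces. With a suitable training dataset, an adjusted ratio between global and local information in the fusion, and a suitable model training scheme, the model achieved recognition rates of 100% and 93% for upright and inverted faces, respectively, indicating good performance on both. Conclusion The results indicate that although the two visual pathways for global and local features play decisive roles in upright and inverted face recognition, respectively, they do not exist in isolation; the face information that the two pathways describe should be complementary. The method offers a new approach to face recognition and should aid further understanding of visual neural mechanisms.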
Mixed recognition of upright and inverted faces
Wang Qiang, Fan Yingle, Wu Wei, Zhu Yaping (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
Objective With advances in high-resolution imaging and parallel computing hardware, face recognition based on massive visual data has become a research focus in pattern recognition and artificial intelligence. To a certain extent, traditional face recognition algorithms also draw on principles of biological perception, such as using massive training data to dynamically modify the structure and parameters of neural networks and reach optimal decisions. However, these methods use only a few basic characteristics of biological perception and simulate the system as a black box. The abundant visual mechanisms in biological perception systems are the basis of visual comprehension and recognition. The mechanism of recognizing inverted faces on the basis of the different information flows of visual neural systems has been demonstrated. A new face recognition method is proposed to solve mixed upright and inverted face recognition using global and local visual neural information flow. Method The recognition of upright faces may depend on a holistic mode of processing, in which the overall information is greater than the sum of the local features. The identification of an inverted face does not significantly depend on this overall information. Local features such as the eyes, mouth, and nose also serve as information sources. Two sensing pathways in the visual cortex reflect the global and local features in face recognition mechanisms. However, most methods treat the two pathways as operating independently, without exchanging information with each other, and therefore adopt a divide-and-conquer strategy in practice. In contrast, this work argues that upright and inverted faces are related by more than a simple inversion of visual information.
The two visual pathways that convey holistic and local features play a decisive role in upright and inverted face recognition and are not independent of each other. The face information portrayed by the two pathways should be complementary. When global contour information is used for face identification, the contribution of facial features to recognition performance cannot be dismissed. In this work, we constructed a new face recognition system based on the global and local information conveyed by the two pathways of the visual cortex. Our study considered the processing in the visual cortical pathway that is based on the coordination mechanisms of the left and right hemispheres. First, the underlying neural network was constructed, and the redundancy reduction and preprocessing of upright and inverted face images were realized through texture-sensitive and symmetric convolution kernels. Second, a pooling neural network layer based on local region extraction was proposed, and a multi-local-feature fusion network structure was constructed to compress, extract, and fuse local information. Finally, a prediction function was defined according to the collaboration of the left and right hemispheres in the advanced visual cortex to integrate the global and local information. Result Visual inspection and quantitative results showed that the method had an enhanced feature representation capability in face recognition and could better identify upright and inverted faces than the traditional methods LDA, PCA, and DeepID. The experimental model was built with the Caffe neural network framework, and its parameters were trained via batch gradient descent. With the AT&T database as an example, the multi-local-feature fusion network structure was added to the classical convolutional neural network (CNN) model.
The recognition accuracy improved from 98% to 100%, which indicated that the local information could improve the recognition of upright faces. In the experiment, the difference calculation showed that the underlying convolution kernels were symmetric and responded identically to the texture features of the faces. An appropriate training dataset was used, and the ratio between the global and local information was adjusted during fusion. The recognition rates of the model were 98% and 94% for the upright and inverted faces, respectively. Therefore, the method performed well on both upright and inverted faces. With the pretrained face recognition model, the two-pathway system exhibited satisfactory performance on the test dataset, which mixed upright and inverted faces. Thus, our method can address mixed upright and inverted face recognition. Conclusion In this work, a pooling neural layer based on local features was designed on the basis of the CNN's sensitivity to texture features of the input image to realize the network structure of multi-local-feature fusion. Meanwhile, with consideration for the biological mechanism of local participation in recognition, the relationship between the left and right hemispheres in the advanced visual cortex was introduced, and a prediction function integrating global and local information was proposed. The correlation between training data factors and local or overall characteristics was emphasized. The proposed face recognition method contributes to the understanding of visual neural mechanisms. For example, fusing multi-local features into the traditional neural network enhanced the face recognition features and thus increased the effectiveness of the information. Compared with a training dataset of inverted faces only, the mixed training dataset of upright and inverted faces had a larger impact on inverted face recognition.
The results showed the importance of inconsistencies in the selection of local features and the crucial role that internal differences among local features play in face recognition. The hybrid recognition method for upright and inverted faces proposed in this work provides a novel research idea for face recognition technology and discusses the role of fusing multiple visual pathways in image understanding and visual cognition in the advanced visual cortex.
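The abstract does not give the exact form of the local-region pooling layer or of the prediction function, so the following Python sketch is only an illustration under stated assumptions. It assumes that a local region (for example, around an eye) is a rectangle on a 2D feature map that is max-pooled to a small fixed grid, and that the prediction function is a weighted sum of softmax scores from a global branch and a local branch. The names `region_max_pool`, `fused_prediction`, and the parameter `ratio` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical region layout: (top, left, height, width) on the feature map.
Region = Tuple[int, int, int, int]


def region_max_pool(feature_map: Sequence[Sequence[float]],
                    region: Region, grid: int = 2) -> List[float]:
    """Max-pool one rectangular region into grid * grid values (row-major)."""
    top, left, height, width = region
    if height < grid or width < grid:
        raise ValueError("region must be at least grid x grid")
    pooled = []
    for gy in range(grid):
        y0 = top + gy * height // grid
        y1 = top + (gy + 1) * height // grid
        for gx in range(grid):
            x0 = left + gx * width // grid
            x1 = left + (gx + 1) * width // grid
            pooled.append(max(feature_map[y][x]
                              for y in range(y0, y1)
                              for x in range(x0, x1)))
    return pooled


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fused_prediction(global_scores: Sequence[float],
                     local_scores: Sequence[float],
                     ratio: float) -> Tuple[int, List[float]]:
    """Fuse two branches; `ratio` (assumed in [0, 1]) weights the global branch."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    p_global = softmax(global_scores)
    p_local = softmax(local_scores)
    fused = [ratio * g + (1.0 - ratio) * l
             for g, l in zip(p_global, p_local)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(region_max_pool(fmap, (0, 0, 4, 4)))
    # The two branches disagree; the ratio decides which one dominates.
    print(fused_prediction([2.0, 0.0], [0.0, 2.0], ratio=0.8)[0])
    print(fused_prediction([2.0, 0.0], [0.0, 2.0], ratio=0.2)[0])
```

In this sketch, a larger `ratio` favors the global branch and a smaller one favors the local branch, which is one simple way to express the adjustable relationship between global and local information that the abstract describes.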