石亮,那天,宋晓宁,朱玉全(江苏大学计算机科学与通信工程学院, 镇江 212013;江苏科技大学计算机学院, 镇江 212000;江南大学物联网工程学院, 无锡 214122)

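Abstract
Objective Traditional sparse representation classification methods use high-dimensional data to improve sparse classification capability and have long attracted wide attention. However, they ignore the information redundancy between test samples and training samples, which leads to uncertain classification decisions. To address this, this paper proposes a sparse representation classification method based on a convolutional neural network and a PCA-constrained optimization model (EPCNN-SRC). Method First, a deep convolutional neural network is applied, and the corresponding feature maps are extracted at the output layer to characterize robust face features of the original samples. Then, on the basis of these features, a PCA (principal component analysis) constrained optimization model is constructed to represent the test sample linearly, and the corresponding PCA coefficients are computed. Finally, a sparse representation classification algorithm reconstructs the PCA coefficients of the test sample and of each class of training samples to complete the classification. Result Compared with some typical sparse classification methods, the designed classification model achieves better classification performance. Experiments on the AR, FERET, FRGC, and LFW face databases show that, with only one training sample per class, the recognition rates of EPCNN-SRC reach 96.92%, 96.15%, 86.94%, and 42.44%, respectively, all higher than those of traditional representation classification methods, which verifies the effectiveness of the proposed algorithm. In addition, the method not only improves the robustness of the sparse representation of test samples but also reduces the time complexity while maintaining the recognition rate; its running time on the FERET database is 4.92 s, which is lower than that of some traditional methods. Conclusion The sparse representation classification method based on a convolutional neural network and a PCA-constrained optimization model combines deep learning features with PCA. It achieves good recognition accuracy and is robust for sparse classification, with a particularly clear advantage on the small sample size problem.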
Jointly using convolutional neural network and PCA-constrained optimization model for sparse representation-based classification

Shi Liang,Na Tian,Song Xiaoning,Zhu Yuquan(School of Computer and Communication Engineering, Jiangsu University, Zhenjiang 212013, China;School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212000, China;School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)

Objective Traditional sparse representation classification methods have drawn extensive attention because they use high-dimensional data to improve sparse classification capacity. However, they ignore the information redundancy between the gallery and query sets, which leads to uncertainty in the final recognition results. To address this issue, we propose a novel method that jointly uses a convolutional neural network (CNN) and a PCA (principal component analysis)-constrained optimization model to perform sparse representation-based classification (EPCNN-SRC). Method We present a new sparse learning strategy based on a CNN and a PCA-constrained optimization model. The two critical contributions of this work are as follows. First, we utilize LDA (linear discriminant analysis) to further enhance the discriminative capacity of collaborative representation classification. Second, we obtain robust face features by using a deep CNN. Specifically, to design the classification method, we reconstruct the PCA coefficients of the training samples, which are obtained via PCA-constrained optimization. The objective of the proposed classification method is to use a hybrid PCA-plus-LDA constrained model to enhance the discriminative capacity of SRC. In the first phase, the proposed method seeks a compressive linear representation of the test samples. Our design achieves an accurate reconstruction of the test sample using both the sample space and the principal coefficient space. The second phase further improves the discriminative capability of the PCA coefficients in representing a test sample, thereby yielding a competitive optimization model for face classification. Assume that a dataset with multiple images per subject is given and that the samples of each subject are stacked as vectors. 
Hence, an interclass variant dictionary can be constructed by subtracting the natural image of each class from the other images of the same class, which augments the training data. Researchers have proposed many different approaches to constructing a variation dictionary, but the underlying idea is the same: to augment the training set. The constructed interclass dictionary contains all types of important difference information, such as illumination, expression, and other differences that the error term cannot represent. To improve the optimization efficiency, we project the training samples into the PCA space, in which a new sparse representation model with PCA-constrained optimization is designed. The formula is described in this study. The strength of the proposed strategy lies in solving the model by quadratic optimization in a downsized coefficient subspace, thereby enhancing the collaborative and discriminative capacity of the dictionary to reconstruct the input images. Most existing sparse or collaborative representation methods focus on augmenting the training data for effective optimization to alleviate the adverse effect of the small sample size problem. However, the original dictionary is commonly built on a high-dimensional subspace. A typical example is found in some well-known collaborative representation-based methods, such as ESRC (extended sparse representation-based classifier). The abundant hybrid training atoms with high dimensionality can make the computation time-consuming and introduce uncertainty into the dataset. In our method, the original and within-class variations of one subject can be approximated by a collaborative linear combination of the other subjects, thereby integrating the dimensionality reduction of the training samples with the hybrid optimization process. 
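The dictionary construction and PCA projection described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `build_variation_dictionary` and `pca_project` are hypothetical, and treating the first sample of each class as its natural image is an assumption made only for this example.

```python
import numpy as np

def build_variation_dictionary(samples, labels):
    """Stack, per class, the differences between each image and a reference image.

    Assumption: the first sample of every class is its "natural" image.
    """
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    variations = []
    for c in np.unique(labels):
        class_samples = samples[labels == c]
        reference = class_samples[0]
        # Rows are the other images of the class minus the reference image.
        variations.append(class_samples[1:] - reference)
    return np.vstack(variations)

def pca_project(train, n_components):
    """Project row-vector samples onto the leading principal axes."""
    train = np.asarray(train, dtype=float)
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are orthonormal principal axes, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    coefficients = centered @ components.T
    return mean, components, coefficients

# Toy usage: 2 subjects, 3 vectorized images each, 5 "pixels" per image.
rng = np.random.default_rng(0)
images = rng.normal(size=(6, 5))
subject_ids = np.array([0, 0, 0, 1, 1, 1])
variation_dict = build_variation_dictionary(images, subject_ids)
mean_face, axes, pca_coeffs = pca_project(images, n_components=2)
print(variation_dict.shape, pca_coeffs.shape)
```

Working with the low-dimensional coefficients instead of the raw vectors is what keeps the later optimization small; the number of retained components is a free parameter in this sketch.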
With the PCA-constrained model, ESRC decomposes the original face structure of the training set into orthogonal components known as eigenfaces, and the transformed axes form a set of bases that represent the variations among different subjects. Thus, our method can remarkably reduce the computational complexity. Meanwhile, CNNs have been successfully used in a wide range of computer vision and pattern recognition applications and have become the mainstream approach in face biometrics. A CNN trained on a large number of face images can extract robust textural features for face recognition across a variety of appearance variations, such as pose, expression, illumination, and occlusion. To further improve the accuracy of the proposed system, we apply state-of-the-art deep CNN features to our model. The proposed classification method operates on CNN features and differs from the widely used nearest neighbor classifiers with cosine and Euclidean distances. In the data processing phase, we use the pretrained VGG16 model for feature extraction. Features extracted from the original input face image yield better performance than the traditional sparse representation method, which uses raw pixel intensities for classification. Robust sample features therefore play a crucial role in the classification process. Result The designed method achieves better performance than the traditional sparse representation methods. We design experiments on four different face datasets to evaluate the robustness of the proposed method; the datasets differ in the types and numbers of expression and pose variations they contain. We repeat each experiment 20 times on every dataset and report the average as the final recognition rate. 
Results are compared with those of some traditional related classification methods, such as ESRC, NN_CNN (nearest neighbor convolutional neural networks), CIRLRC (conventional and inverse representation-based linear regression classification), TPTSR (two-phase test sample sparse representation), SRICE (sparse representation using iterative class elimination), SRC (sparse representation-based classifier), CRC (collaborative representation-based classification), and LRC (linear representation classification). All methods are run under the same experimental conditions. To confirm that the method alleviates the adverse effect of the small sample size problem, some experiments are performed with a single training sample per subject. The results on the AR, FERET, FRGC, and LFW datasets show that when each subject has only one training sample, the proposed EPCNN-SRC achieves recognition rates of 96.92%, 96.15%, 86.94%, and 42.44%, respectively, which are higher than those of the other traditional methods. These results demonstrate the effectiveness of the proposed method. In addition, when the test environment contains complex changes, the algorithm still recognizes faces well, and its time cost is considerably lower than that of the traditional representation classification algorithms, which meets our expectations. Conclusion In this study, we propose EPCNN-SRC. Experiments on multiple datasets show that this algorithm, which applies an iterative optimization strategy in the feature space, not only effectively extracts robust features from the original samples but also combines l1-norm and l2-norm minimization to reduce the time cost of the representation classification algorithm. The key innovation of the proposed work is to accomplish face recognition using a novel dimensionality-reduction optimization model, which results in robust SRC under appearance variations. 
The strength of the technique lies in successfully constructing a quadratic optimization in a downsized coefficient solution subspace, which enhances the discriminative capacity of the dictionary to reconstruct input signals effectively. We believe that these promising results can encourage future work on synthesizing additional informative optimization structures and on improving this study toward better SRC solutions.
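
The general idea of classifying by class-wise reconstruction in a reduced coefficient space can be illustrated with a short sketch. It is not the EPCNN-SRC model, whose formulation is not given in this abstract. As a stand-in, it uses a ridge-regularized (l2) collaborative representation with a closed-form quadratic solution, rather than an l1 sparse solver. The function name `classify_by_residual` and the parameter `lam` are hypothetical.

```python
import numpy as np

def classify_by_residual(train_coeffs, train_labels, test_coeff, lam=0.01):
    """Assign the class whose training atoms best reconstruct the test vector.

    train_coeffs: (n_samples, n_dims) low-dimensional training vectors.
    test_coeff:   (n_dims,) low-dimensional test vector.
    lam:          ridge weight (assumed value, for illustration only).
    """
    D = np.asarray(train_coeffs, dtype=float).T  # columns are training atoms
    y = np.asarray(test_coeff, dtype=float)
    labels = np.asarray(train_labels)
    # Closed-form minimizer of ||y - D a||^2 + lam * ||a||^2.
    gram = D.T @ D + lam * np.eye(D.shape[1])
    a = np.linalg.solve(gram, D.T @ y)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct y using only the atoms and coefficients of class c.
        residuals[int(c)] = float(np.linalg.norm(y - D[:, mask] @ a[mask]))
    predicted = min(residuals, key=residuals.get)
    return predicted, residuals

# Toy usage: two classes with two training vectors each, in 3 dimensions.
train = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.9, 0.1]])
labels = np.array([0, 0, 1, 1])
query = np.array([0.0, 1.0, 0.02])
predicted_class, class_residuals = classify_by_residual(train, labels, query)
print(predicted_class, class_residuals)
```

Because the coding step is a single linear solve over a small Gram matrix, its cost depends on the number of training atoms rather than on the original image dimensionality, which is the kind of saving that a downsized coefficient subspace offers.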