Image classification with non-negative and local Laplacian sparse coding and context information

Wan Yuan, Shi Ying, Chen Xiaoli (School of Science, Wuhan University of Technology, Wuhan 430070)

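Abstract
Objective Sparse coding is an effective method for image feature representation, but its coding is unstable: similar features may be encoded into different codewords. Moreover, in existing image classification methods, image feature representation and image classification are independent processes, so the extracted image features do not effectively preserve the semantic relations among image features. To address these two problems, an image classification algorithm based on non-negative local Laplacian sparse coding and context information is proposed.
Method Image feature representation consists of two stages. In the first stage, local features are encoded with the non-negative local Laplacian sparse coding method, and the original image representation is obtained by max pooling, which effectively reduces coding instability. In the second stage, a subset of images is randomly selected from all image feature representations to generate joint spaces based on context information, and images are mapped into these spaces by a classifier. The mapped feature representations serve as the final image representations, so that more of the context information among image features is preserved.
Results Simulation experiments are conducted on four public image datasets, Corel-10, Scene-15, Caltech-101, and Caltech-256, and the method is compared with current algorithms related to sparse coding. Classification accuracy improves by about 3% to 18%.
Conclusion The proposed image classification algorithm based on non-negative local Laplacian sparse coding and context information reduces coding instability and preserves the mutual dependency among features. Experimental results show that it classifies better than existing algorithms. The method is also applicable to other computer vision applications such as image segmentation, annotation, and retrieval.
Keywords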
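The pooling step of the first stage can be illustrated with a small sketch. It assumes that each local feature already has a non-negative code vector and a normalized image position, and it max-pools the codes over a spatial pyramid. The function name, the argument names, and the default pyramid levels are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def spatial_pyramid_max_pool(
    codes: Sequence[Sequence[float]],
    positions: Sequence[Tuple[float, float]],
    levels: Sequence[int] = (1, 2, 4),
) -> List[float]:
    """Max-pool non-negative codes over a spatial pyramid.

    codes[i] is the code vector of local feature i, and positions[i] is its
    (x, y) location normalized to [0, 1]. Each entry of `levels` is a grid
    size (1 means 1x1, 2 means 2x2, and so on). All names are hypothetical.
    """
    if len(codes) != len(positions):
        raise ValueError("codes and positions must have the same length")
    dim = len(codes[0]) if codes else 0
    pooled: List[float] = []
    for grid in levels:
        # One running-max vector per cell; 0.0 is a safe starting value
        # because the codes are assumed to be non-negative.
        cells = [[0.0] * dim for _ in range(grid * grid)]
        for code, (x, y) in zip(codes, positions):
            col = min(int(x * grid), grid - 1)
            row = min(int(y * grid), grid - 1)
            cell = cells[row * grid + col]
            for k, value in enumerate(code):
                if value > cell[k]:
                    cell[k] = value
        # Concatenate the cell vectors of this level in row-major order.
        for cell in cells:
            pooled.extend(cell)
    return pooled
```

With a codebook of size K and levels (1, 2, 4), the result has 21 × K entries, one K-dimensional maximum per pyramid cell.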
Image classification with non-negative and local Laplacian sparse coding and context information

Wan Yuan, Shi Ying, Chen Xiaoli(School of Science, Wuhan University of Technology, Wuhan 430070, China)

Abstract
Objective Image classification is an important and actively studied problem in computer vision. The traditional sparse coding (SC) method is effective for image representation and has achieved good results in image classification. However, SC has two drawbacks. First, it ignores the local relationships between image features and thus loses local information. Second, because the optimization in SC involves both addition and subtraction, the subtraction operation may cause features to cancel each other out. These two drawbacks result in coding instability, meaning that similar features may be encoded into different codes. Meanwhile, representation and classification are usually independent of each other during image classification, so the semantic relations between image features are not well preserved. In other words, the image representation is not task-driven and may not serve the final classification task well. Furthermore, the local feature quantization method disregards the underlying semantic information of the local region, which degrades classification performance. To address these problems, this study proposes a two-stage image classification method with non-negative and local Laplacian SC and context information (NLLSC-CI). NLLSC-CI aims to improve the efficiency of image representation and the accuracy of image classification.
Method The representation of an image involves two stages. In the first stage, non-negative and locality-constrained Laplacian SC (NLLSC) is used to encode the local features of the image and overcome coding instability. First, non-negativity is introduced into Laplacian SC (LSC) through non-negative matrix factorization (NMF), which constrains the codebook and the code coefficients to be non-negative and thus avoids cancellation between features.
Second, bases that are near the local features are selected to constrain the codes because locality is more important than sparseness; thus, the local information between features is preserved. The original image representation is then obtained by spatial pyramid division (SPD) followed by max pooling (MP) in the pooling step. In the second stage, several original image representations are selected and connected to generate joint context spaces. All images are then mapped into these spaces by an SVM classifier, and the mapped features in these joint context spaces are regarded as the final image representations. In this manner, image representation and classification are considered jointly to improve performance. This two-stage representation method preserves the context relationship between image features to a certain extent.
Results To validate the proposed method, experiments are conducted on four public image datasets: Corel-10, Scene-15, Caltech-101, and Caltech-256. The results suggest that the classification accuracy of NLLSC-CI is about 3% to 18% higher than that of state-of-the-art SC algorithms. Accuracy increases by 3% to 12% on the Corel-10 dataset, by 4% to 15% on the Scene-15 dataset, and by 3% to 14% and 4% to 18% on the Caltech-101 and Caltech-256 datasets, respectively. These findings show that the proposed method classifies more accurately than state-of-the-art SC algorithms on all four benchmark datasets. In addition, Tables 2 to 5 show that classification accuracy is lowest on the Caltech-256 dataset. The likely reason is the size of this dataset: it contains many categories and images, and the differences between and within classes are large. As a result, the category of an image is harder to identify correctly during classification.
Thus, the accuracy of the proposed method is relatively low for datasets with many images and many classes. In general, however, NLLSC-CI demonstrates improved classification accuracy.
Conclusion This study proposes an algorithm called NLLSC-CI to address coding instability and the independence between image representation and classification. The proposed method overcomes coding instability and preserves the mutual context dependency between the local features of images. Specifically, by incorporating non-negativity, locality, and graph Laplacian regularization, the method improves the consistency of sparse codes and their mutual dependency, thus preserving more features and more of the local information between them and making the local features more discriminative. The new optimization problem in NLLSC-CI is solved by defining a diagonal matrix to obtain the analytical solution. Furthermore, the consistency of sparse codes is maintained by introducing a Laplacian matrix. This two-stage method of image representation jointly considers two otherwise independent tasks: image representation and classification. The construction of a joint space based on context information preserves the context between image features, and the image representation obtained from context information and the image classification are mutually dependent. Therefore, NLLSC-CI can model images adequately and represent the original images through the mutual dependency and context information among features, thus improving classification accuracy. Experiments on several benchmark image datasets show that the proposed algorithm performs better than previous algorithms. In addition, the method can be applied to other computer vision problems, such as image segmentation, image annotation, and image retrieval.
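For orientation, one generic way to write an objective that combines the three ingredients named in the conclusion (non-negativity, locality, and graph Laplacian regularization) is given below. The exact form and all symbols are assumptions made for illustration and are not taken from the paper: x_i are the N local features, B is the codebook, S = [s_1, ..., s_N] collects the codes, d_i is a hypothetical locality weight vector that grows with the distance between x_i and each base, W is a symmetric similarity matrix between features, L = diag(W1) - W is its graph Laplacian, and λ and β are trade-off weights.

```latex
\min_{B \ge 0,\; S \ge 0} \;
\sum_{i=1}^{N} \lVert x_i - B s_i \rVert_2^2
+ \lambda \sum_{i=1}^{N} \lVert d_i \odot s_i \rVert_2^2
+ \beta \, \operatorname{tr}\!\left( S L S^{\top} \right),
\qquad
\operatorname{tr}\!\left( S L S^{\top} \right)
= \tfrac{1}{2} \sum_{i,j} W_{ij} \, \lVert s_i - s_j \rVert_2^2 .
```

In this form, the locality term is a quadratic form s_i^T diag(d_i)^2 s_i with a diagonal matrix, and the Laplacian term grows when two similar features (large W_ij) receive different codes, which is the sense in which a Laplacian regularizer encourages consistent codes.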
Meanwhile, the experimental image data used in this study come from several standard image datasets, so the method still needs to be evaluated on more extensive image data. Moreover, although the context information in this method can effectively convey the information expressed by images, it cannot fully reflect the way humans think. Therefore, other methods and models of image semantic content that are closer to human perception and thinking need to be investigated.
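The second stage described in the Method part can be sketched very loosely as follows. This is an illustration under strong assumptions, not the authors' procedure: each context space is built from a random subset of labeled image representations, and every image is mapped to its per-class scores in each space. A nearest-centroid scorer stands in for the SVM classifier named in the abstract so that the example needs only the standard library. All function and parameter names are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def class_centroids(
    features: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, List[float]]:
    """Return the mean feature vector of each class."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def map_to_context_spaces(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    n_spaces: int,
    subset_size: int,
    seed: int = 0,
) -> List[List[float]]:
    """Map every image to concatenated class scores over random subsets.

    Assumption: a score is the negative squared distance to a class
    centroid, used here as a simple stand-in for an SVM decision value.
    """
    rng = random.Random(seed)
    spaces: List[List[List[float]]] = []
    for _ in range(n_spaces):
        # Randomly select the images that define this context space.
        chosen = rng.sample(range(len(features)), subset_size)
        centroids = class_centroids(
            [features[i] for i in chosen], [labels[i] for i in chosen]
        )
        # Keep centroids in sorted label order so coordinates are stable.
        spaces.append([centroids[c] for c in sorted(centroids)])
    mapped: List[List[float]] = []
    for x in features:
        rep: List[float] = []
        for space in spaces:
            for centroid in space:
                rep.append(-sum((a - b) ** 2 for a, b in zip(x, centroid)))
        mapped.append(rep)
    return mapped
```

The mapped vectors, rather than the original pooled representations, would then be passed to the final classifier. Note that a subset that misses a class contributes fewer coordinates for that space, so in practice the subsets would need to cover every class.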
Keywords
