Image classification with non-negative and local Laplacian sparse coding and context information

Wan Yuan; Shi Ying; Chen Xiaoli

doi:10.11834/jig.160583

Image Processing and Coding


2017, Vol. 22, No. 6, pp. 731-740
Published online: 2017-06-08
Published in print: 2017

Wan Yuan, Shi Ying, Chen Xiaoli. Image classification with non-negative and local Laplacian sparse coding and context information[J]. Journal of Image and Graphics, 2017, 22(6): 731-740. DOI: 10.11834/jig.160583.

Summary

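Sparse coding is an effective method for image feature representation, but its codes are unstable: similar features may be encoded as different codewords. Moreover, in existing image classification methods, image feature representation and image classification are independent processes, so the extracted features do not effectively preserve the semantic relations between image features. To address these two problems, an image classification algorithm based on non-negative local Laplacian sparse coding and context information is proposed. Image feature representation comprises two stages. In the first stage, local features are encoded with non-negative local Laplacian sparse coding, and the original image representation is obtained by max pooling, which effectively reduces coding instability. In the second stage, a subset of images is randomly selected from all image representations to generate joint spaces based on context information, the images are mapped into these spaces by classifiers, and the mapped features serve as the final image representation, so that more of the context information between image features is preserved. Simulation experiments on four public image datasets, Corel-10, Scene-15, Caltech-101, and Caltech-256, show that classification accuracy improves by about 3% to 18% over current sparse-coding-related algorithms. The proposed algorithm reduces coding instability and preserves the interdependence between features. The experimental results show that it classifies better than existing algorithms. The method is also applicable to other computer vision tasks such as image segmentation, annotation, and retrieval.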
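To make the first stage more concrete, the following is a minimal sketch of non-negative, locality-constrained coding followed by max pooling over a spatial pyramid. It is not the authors' implementation: the graph Laplacian regularization term is omitted, the non-negative codes are obtained with a standard multiplicative update borrowed from NMF (an assumption made here for illustration), and the function names `encode_local_nonneg` and `spatial_pyramid_max_pool`, the parameter `k`, and the pyramid levels are all hypothetical. Descriptors and codebook are assumed to be non-negative, and descriptor positions are assumed to be normalized to [0, 1).

```python
import numpy as np

def encode_local_nonneg(x, codebook, k=3, n_iter=200, eps=1e-12):
    """Code one non-negative descriptor on its k nearest bases with non-negative weights.

    Illustrative only: solves min ||x - B^T s||^2 subject to s >= 0 on the
    k nearest bases; the Laplacian term described in the text is not included.
    """
    # Locality: keep only the k bases closest to the descriptor.
    dists = np.linalg.norm(codebook - x, axis=1)
    nearest = np.argsort(dists)[:k]
    B = codebook[nearest]                      # shape (k, dim)
    s = np.full(k, 1.0 / k)                    # strictly positive start
    Bx = B @ x
    BBt = B @ B.T
    for _ in range(n_iter):
        # Multiplicative update: never changes the sign of s, so s stays >= 0.
        s *= Bx / (BBt @ s + eps)
    code = np.zeros(codebook.shape[0])
    code[nearest] = s
    return code

def spatial_pyramid_max_pool(codes, positions, levels=(1, 2)):
    """Max-pool codes inside each cell of a g x g grid for every pyramid level g."""
    pooled = []
    for g in levels:
        # Grid cell index of every descriptor at this level.
        cell = np.minimum((positions * g).astype(int), g - 1)
        for r in range(g):
            for c in range(g):
                mask = (cell[:, 0] == r) & (cell[:, 1] == c)
                if mask.any():
                    pooled.append(codes[mask].max(axis=0))
                else:
                    pooled.append(np.zeros(codes.shape[1]))
    return np.concatenate(pooled)
```

With the default levels `(1, 2)`, the pooled vector has 5 times as many entries as the codebook has bases (one whole-image cell plus four quadrant cells). Because every code is non-negative, max pooling cannot be affected by positive and negative coefficients cancelling each other, which is the motivation given for the non-negativity constraint.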

Abstract

Image classification is an important problem in computer vision and an active research topic. The traditional sparse coding (SC) method is effective for image representation and has achieved good results in image classification. However, the SC method has two drawbacks. First, it ignores the local relationships between image features and thus loses local information. Second, because the optimization problem in SC involves both addition and subtraction, the subtraction operations may cause features to cancel each other out. These two drawbacks result in coding instability, which means that similar features may be encoded into different codes. Meanwhile, representation and classification are usually independent of each other during image classification, so the semantic relations between image features are not well preserved. In other words, the image representation is not task-driven and may not serve the final classification task well. Furthermore, the local feature quantization method disregards the underlying semantic information of the local region, which degrades classification performance.

To deal with these problems, a two-stage method of image classification with non-negative and local Laplacian SC and context information (NLLSC-CI) is proposed in this study. NLLSC-CI aims to improve the efficiency of image representation and the accuracy of image classification. The representation of an image involves two stages. In the first stage, non-negative and locality-constrained Laplacian SC (NLLSC) is used to encode the local features of the image to overcome coding instability. First, non-negativity is introduced into Laplacian SC (LSC) through non-negative matrix factorization (NMF), which constrains the codebook and the code coefficients to be non-negative and thereby avoids cancellation between features. Second, bases that are near the local features are selected to constrain the codes because locality is more important than sparseness; thus, the local information between features is preserved. Then, the original image representation is obtained by using spatial pyramid division (SPD) and max pooling (MP) in the pooling step. In the second stage, several original image representations are selected and connected to generate joint context spaces. All images are then mapped into these spaces by the SVM classifier. The mapped features in these joint context spaces are regarded as the final representations of the images. In this manner, image representation and classification are considered jointly to achieve improved performance. This two-stage representation method preserves the context relationship between the features of images to a certain extent.

To validate the performance of the proposed method, experiments are conducted on four public image datasets, namely Corel-10, Scene-15, Caltech-101, and Caltech-256. Results suggest that the classification accuracy of NLLSC-CI increases by about 3% to 18% compared with that of state-of-the-art SC algorithms. The accuracy of NLLSC-CI increases by 3% to 12% on the Corel-10 dataset. For the Scene-15 dataset, classification accuracy increases by 4% to 15%. The classification accuracy on the Caltech-101 and Caltech-256 datasets increases by 3% to 14% and 4% to 18%, respectively. These findings show that the classification accuracy of the proposed method is better than that of state-of-the-art SC algorithms on the four benchmark image datasets. In addition, Tables 2 to 5 show that classification accuracy is lowest on the Caltech-256 dataset. The reason could be the size of this dataset: it contains many categories and images, and the variation between and within classes is large. As a result, the category of some images cannot be identified correctly during classification. Thus, the accuracy of the proposed method is relatively low for datasets with large numbers of images and classes. In general, however, NLLSC-CI demonstrates improved classification accuracy.

This study proposes an algorithm called NLLSC-CI to address coding instability and the independence between image representation and classification. The proposed method overcomes coding instability and preserves the mutual context dependency between the local features of images. Specifically, due to the incorporation of non-negativity, locality, and graph Laplacian regularization, this new method improves the consistency of sparse codes and their mutual dependency, thus preserving more features and the local information between them and making the local features more discriminative. The new optimization problem in NLLSC-CI is solved by defining a diagonal matrix to obtain the analytical solution. Furthermore, the consistency of sparse codes is maintained by introducing a Laplacian matrix. This two-stage method of image representation jointly considers two otherwise independent tasks: image representation and classification. The construction of a joint space based on context information preserves the context between image features, and the image representation obtained from context information and image classification are mutually dependent. Therefore, NLLSC-CI can model images adequately and represent the original images through mutual dependency and context information among features, thus improving classification accuracy. Several benchmark image datasets are studied, and the final experimental results show that the proposed algorithm performs better than previous algorithms. In addition, this method can be applied to other computer vision problems, such as image segmentation, image annotation, and image retrieval. Meanwhile, because the experimental image data used in this study come from several standard image datasets, more extensive image data still need to be exploited. Moreover, although the context information of this method can effectively convey the information expressed by images, it cannot fully reflect the way humans think. Therefore, other methods and models of image semantic content that are closer to human perception and thinking need to be investigated.
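The second stage, in which images are mapped into joint context spaces by a classifier, can be illustrated with a rough sketch. The abstract does not specify how the spaces are built or how the mapping is computed, so everything below is an assumption: each "context space" is taken to be a random subset of labeled image representations, a one-vs-rest linear SVM is trained on each subset, and the concatenated decision scores are used as the final representation. The tiny subgradient-descent SVM stands in for whatever SVM solver the authors used, and the names `train_linear_svm` and `build_context_features` and all hyperparameters are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, n_classes, lam=0.01, lr=0.1, epochs=200):
    """One-vs-rest linear SVM trained by subgradient descent on the hinge loss."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    b = np.zeros(n_classes)
    # Y[i, c] is +1 if sample i belongs to class c, else -1.
    Y = np.where(y[:, None] == np.arange(n_classes)[None, :], 1.0, -1.0)
    for _ in range(epochs):
        margins = Y * (X @ W.T + b)
        active = (margins < 1.0) * Y           # signed margin violators
        W -= lr * (lam * W - active.T @ X / n)
        b += lr * active.mean(axis=0)
    return W, b

def build_context_features(X, y, n_classes, n_spaces=3, subset_size=8, seed=0):
    """Map every image to the classifier scores of several random context subsets."""
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(n_spaces):
        # Assumed construction: one context space per random subset of images.
        idx = rng.choice(len(X), size=subset_size, replace=False)
        W, b = train_linear_svm(X[idx], y[idx], n_classes)
        blocks.append(X @ W.T + b)             # scores of all images in this space
    return np.hstack(blocks)
```

Under these assumptions the final representation has `n_spaces * n_classes` dimensions, and a second classifier would then be trained on it. The point of the sketch is only that the final features depend on classifier outputs, which is how representation and classification become coupled in the description above.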

