Liu Fei, Liu Xueliang. Novel multi-modality information cross-retrieval based on sparse coding[J]. Journal of Image and Graphics, 2015, 20(9): 1170-1176. DOI: 10.11834/jig.20150904.
The fundamental issue in multi-modality information cross-retrieval is the feature representation of multi-modality data. Sparse coding is an effective representation method for feature modeling. However, when the query terms and the retrieval terms come from different modalities, traditional sparse coding may be unsuitable: because of the distribution difference between modalities, similar features can be encoded into significantly different sparse representations. Therefore, in this paper, we present a multi-modality information cross-retrieval algorithm based on sparse coding. In the proposed method, maximum mean discrepancy (MMD) and graph Laplacian regularization are used to formulate the sparse coding objective function, so that the multimodal information is thoroughly exploited during coding.
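The abstract does not state the objective explicitly; the following is a plausible sketch of how such an objective could be assembled from the terms it names (the shared dictionary $D$, the modality-specific sparse codes $A^{(1)}, A^{(2)}$, the trade-off weights $\lambda, \alpha, \beta$, and the graph Laplacian $L$ are all notational assumptions, not the paper's symbols):

$$
\min_{D,\,A^{(1)},\,A^{(2)}}\;\sum_{m=1}^{2}\Bigl(\bigl\|X^{(m)} - D A^{(m)}\bigr\|_F^2 + \lambda\bigl\|A^{(m)}\bigr\|_1\Bigr) \;+\; \alpha\,\mathrm{MMD}^2\!\bigl(A^{(1)},A^{(2)}\bigr) \;+\; \beta\,\mathrm{tr}\!\bigl(A L A^{\top}\bigr),
$$

where $X^{(m)}$ holds the features of modality $m$ and $A = [A^{(1)}, A^{(2)}]$. Under this reading, the MMD term pulls the code distributions of the two modalities together, while the Laplacian term keeps the codes of similar samples close.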
The objective function is then optimized with the feature-sign search and discrete line search algorithms. We performed cross-retrieval experiments on a Wikipedia text-image dataset and compared the proposed method with traditional sparse coding methods. The experimental results show that the proposed method increases the average mean average precision (MAP) of cross-retrieval by 18.7%. The proposed algorithm improves the robustness of sparse coding and the accuracy of multimodal cross-retrieval, making it better suited to extracting features from multimodal data for further processing.
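To make the two regularizers concrete, here is a minimal, self-contained sketch of how they can be evaluated (it assumes a linear-kernel MMD and an unnormalized graph Laplacian; the function names and data shapes are illustrative, not the paper's code):

```python
# Sketch (not the paper's implementation): evaluating the two
# regularizers named in the abstract -- MMD between the sparse codes
# of two modalities, and a graph Laplacian smoothness term.
import numpy as np

def mmd_squared(A1, A2):
    """Squared MMD with a linear kernel: distance between code means.

    A1: (k, n1) sparse codes of modality 1; A2: (k, n2) of modality 2.
    """
    diff = A1.mean(axis=1) - A2.mean(axis=1)
    return float(diff @ diff)

def laplacian_term(A, W):
    """tr(A L A^T) with L = Dg - W, the unnormalized graph Laplacian.

    A: (k, n) codes of all n samples; W: (n, n) similarity weights.
    Equals 0.5 * sum_ij W_ij * ||a_i - a_j||^2, i.e. it is small when
    similar samples (large W_ij) receive similar codes.
    """
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(A @ L @ A.T))

# Toy usage with random data: k = 8 dictionary atoms, 5 + 4 samples.
rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((8, 5)), rng.standard_normal((8, 4))
A = np.hstack([A1, A2])
W = rng.random((9, 9)); W = 0.5 * (W + W.T)  # symmetric similarity graph
print(mmd_squared(A1, A2), laplacian_term(A, W))
```

In a full algorithm these values would be weighted and added to the reconstruction and sparsity terms of the objective, which is then minimized alternately over the codes and the dictionary.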