Hybrid generative/discriminative model for automatic image annotation

Li Zhixin; Shi Zhiping; Zhang Canlong; Wang Jinyan

doi:10.11834/jig.20150511

Image analysis and recognition

2015, Vol. 20, No. 5, pp. 687-699
Published online: 2015-05-07

Published in print: 2015
Li Zhixin, Shi Zhiping, Zhang Canlong, Wang Jinyan. Hybrid generative/discriminative model for automatic image annotation[J]. Journal of Image and Graphics, 2015, 20(5): 687-699. DOI: 10.11834/jig.20150511.

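Abstract (translated from Chinese)

Because of the "semantic gap" between low-level features and high-level semantics in image retrieval, automatic image annotation has become a key problem. To narrow this gap, a hybrid generative/discriminative method for automatic image annotation is proposed. In the generative learning stage, images are modeled with a continuous probabilistic latent semantic analysis (PLSA) model, which yields the model parameters and the topic distribution of each image. Taking this topic distribution as the intermediate representation vector of each image turns automatic image annotation into a multi-label classification problem. In the discriminative learning stage, ensembles of classifier chains are trained on these intermediate representation vectors; building the chains also integrates the contextual information among annotation keywords, which leads to higher annotation precision and better retrieval results. Experiments on two benchmark datasets show that the method reaches an average precision and average recall of 0.28 and 0.32 on Corel5k, and 0.29 and 0.18 on IAPR-TC12, outperforming most current state-of-the-art automatic image annotation methods. Its precision-recall curves are also better than those of several representative annotation methods. The proposed hybrid learning strategy combines the respective strengths of generative and discriminative models and proves effective and robust in semantic image retrieval. Beyond image retrieval and recognition, the method and its techniques could, with suitable modification, also play an important role in cross-media retrieval and data mining.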
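
The discriminative stage can be illustrated with a small Python sketch of an ensemble of classifier chains operating on per-image topic distributions. The abstract does not name the base classifier, the number of chains, or the voting rule, so the logistic-regression base learner, the five randomly ordered chains, the majority vote, the toy data, and every function name below are assumptions made for illustration only, not the authors' implementation.

```python
import math
import random


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by stochastic gradient descent.
    Returns the weight list; the last entry is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            grad = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * grad * xj
            w[-1] -= lr * grad
    return w


def predict_logreg(w, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + w[-1] > 0 else 0


def train_chain(X, Y, order):
    """Train one chain: the classifier for each label sees the topic
    features plus the true values of the labels earlier in `order`."""
    models = []
    for pos, label in enumerate(order):
        feats = [x + [Y[i][prev] for prev in order[:pos]] for i, x in enumerate(X)]
        models.append(train_logreg(feats, [row[label] for row in Y]))
    return models


def predict_chain(models, order, x):
    """Predict labels in chain order, feeding each prediction forward."""
    feats, pred = list(x), {}
    for w, label in zip(models, order):
        pred[label] = predict_logreg(w, feats)
        feats.append(pred[label])
    return pred


def train_ensemble(X, Y, n_chains=5, seed=0):
    """Train several chains, each with its own random label order."""
    rng = random.Random(seed)
    chains = []
    for _ in range(n_chains):
        order = list(range(len(Y[0])))
        rng.shuffle(order)
        chains.append((order, train_chain(X, Y, order)))
    return chains


def predict_ensemble(chains, x, threshold=0.5):
    """Average the chain votes per label and threshold them."""
    votes = [0] * len(chains[0][0])
    for order, models in chains:
        for label, value in predict_chain(models, order, x).items():
            votes[label] += value
    return [1 if v / len(chains) >= threshold else 0 for v in votes]


# Hypothetical topic distributions (3 topics) and labels (sky, cloud, grass).
X = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
     [0.2, 0.7, 0.1], [0.1, 0.1, 0.8], [0.1, 0.2, 0.7]]
Y = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 0], [0, 0, 0]]

chains = train_ensemble(X, Y)
print(predict_ensemble(chains, [0.9, 0.05, 0.05]))
```

Because each classifier in a chain receives the earlier labels as extra inputs, co-occurring keywords (here the hypothetical "sky" and "cloud") can reinforce each other, which is one plausible reading of how the chains carry contextual information among annotation keywords.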

Abstract

Given the notorious semantic gap between low-level features and high-level concepts in image retrieval, automatic image annotation has become a crucial issue. To bridge the semantic gap, this paper proposes a hybrid generative/discriminative approach to annotate images automatically. In the generative learning stage, images are modeled with a continuous probabilistic latent semantic analysis (PLSA) model, which yields the model parameters and the topic distribution of each image. Taking this topic distribution as an intermediate representation of each image transforms the image auto-annotation problem into a multi-label classification problem. In the discriminative learning stage, we construct ensembles of classifier chains by learning from these intermediate representations, which at the same time integrates the contextual information of the annotation words into the classifier chains. The approach can therefore achieve higher annotation accuracy and better retrieval performance. Experiments on two benchmark datasets show that the average precision and recall of our approach reach 0.28 and 0.32, respectively, on the Corel5k dataset, and 0.29 and 0.18, respectively, on the IAPR-TC12 dataset. These results indicate that our approach performs better than most state-of-the-art approaches on many evaluation measures. Furthermore, the precision-recall curves show the superior performance of our approach over several representative approaches. In summary, this paper presents an image auto-annotation approach based on a hybrid learning strategy that integrates the advantages of generative and discriminative models, and the approach proves effective and robust in semantic image retrieval. The methods and techniques of this paper are usable not only in image retrieval and recognition but, after appropriate adaptation, can also play an important role in cross-media retrieval and data mining.
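
To make the reported "average precision and recall" figures concrete, the following Python sketch computes per-keyword precision and recall and averages them over the vocabulary. The abstract does not spell out the exact protocol, so this per-keyword convention, the function name `per_word_precision_recall`, and the toy annotations are assumptions for illustration.

```python
def per_word_precision_recall(truth, predicted, vocabulary):
    """truth and predicted are lists of keyword sets, one set per test image.
    Returns (mean precision, mean recall) averaged over the vocabulary."""
    precisions, recalls = [], []
    for word in vocabulary:
        n_truth = sum(1 for t in truth if word in t)
        n_pred = sum(1 for p in predicted if word in p)
        n_correct = sum(1 for t, p in zip(truth, predicted)
                        if word in t and word in p)
        # A keyword that is never predicted (or never present) scores 0.
        precisions.append(n_correct / n_pred if n_pred else 0.0)
        recalls.append(n_correct / n_truth if n_truth else 0.0)
    return sum(precisions) / len(vocabulary), sum(recalls) / len(vocabulary)


# Hypothetical ground truth and predicted annotations for three images.
truth = [{"sky", "sea"}, {"sky", "grass"}, {"tiger", "grass"}]
predicted = [{"sky", "sea"}, {"sky", "tiger"}, {"grass", "sea"}]
vocabulary = ["sky", "sea", "grass", "tiger"]

mean_precision, mean_recall = per_word_precision_recall(truth, predicted, vocabulary)
print(mean_precision, mean_recall)  # 0.625 0.625
```

Averaging over keywords rather than over images gives rare keywords the same weight as frequent ones, so a method cannot score well by predicting only the most common words.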