Review of bag-of-visual-words models in image scene classification
2014, Vol. 19, No. 3, pp. 333-343
Published online: 2014-03-03
Published in print: 2014
DOI: 10.11834/jig.20140301
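Reviews of bag-of-visual-words models in image scene classification have rarely appeared in journals in China or abroad. To give researchers a reasonably comprehensive understanding of these methods, this paper systematically summarizes the existing work. Drawing on a large body of domestic and international literature, it summarizes and compares the bag-of-visual-words methods used in image scene classification (mainly the classification of single-image scenes) in terms of low-level feature selection and the generation of local image patch features, visual vocabulary construction, histogram representation of bag-of-visual-words features, and visual word optimization, among other aspects. The paper reviews the development history of the bag-of-visual-words model, categorizes the many existing models, compares the advantages and disadvantages of commonly used methods, summarizes performance evaluation methods for the model, and collects the commonly used standard scene databases together with the highest accuracy reached on each. As a rising research topic in computer vision, the study of bag-of-visual-words methods in image scene classification has made considerable progress in China and abroad. Research is no longer limited to directly applying the model to describe image content and increasingly takes the differences between images and text into account. Although many problems in applying the bag-of-visual-words model to image scene classification still urgently need solving, this in no way diminishes the significance of the research.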
With the rapid development of computer multimedia, database, and network technologies, there are more and more images to classify and label. Replacing the traditional manual approach with computer-aided automatic image scene classification has become an active research field. Among the many image scene classification methods, the bag-of-visual-words (BOVW) model has been widely adopted because, as a mid-level feature, it can narrow the gap between low-level visual features and high-level semantic features. However, reviews of the BOVW model in image scene classification are rarely seen in journals in China or abroad. To give researchers in this field a comprehensive understanding of the method, this paper systematically summarizes these studies.

Based on numerous references on the BOVW model in image scene classification from roughly the past ten years, we divide the development of the BOVW model into five stages: direct application of the early bag-of-words model to images; study of latent semantic information in the BOVW model; study of spatial layout or structure information in the BOVW model; study of context information in the BOVW model; and optimization of visual word semantics together with the introduction of new methods into the BOVW model. Furthermore, we summarize and compare the existing BOVW models in image scene classification in terms of local feature selection, feature generation for local image patches, visual vocabulary construction, histogram representation of the bag-of-visual-words feature, optimization of visual words, and so on. The development history of the BOVW model and the research status of BOVW-based image scene classification are reviewed, which gives a clear trail of the model's development; the numerous existing BOVW models are categorized according to their working mechanisms; the advantages and disadvantages of commonly used methods are compared; the performance evaluation methods for the BOVW model are described; and the commonly used standard scene databases are collected, with the best classification accuracy given for each.

As a rising research topic, the study of BOVW methods in image scene classification has made considerable progress. Research in computer vision is no longer limited to directly applying the original BOVW model to describe image content, and the differences between images and text are increasingly taken into account. The urgent problems to be solved are as follows: the performance of the BOVW model degrades considerably when a bag of visual words is applied to samples that differ greatly from the training samples, while training a new bag of visual words on new training samples is very time-consuming and labor-intensive; there is still no theoretical guidance for determining the size of the visual vocabulary; the relationship between visual words and semantics is still not fully exploited; and the application of the BOVW model in special fields, such as high-resolution remote sensing land-use scene classification, is far from satisfactory. These problems suggest several research directions, for example: constructing a universal, self-adaptive bag of visual words for different sample sets; automatically selecting the optimal vocabulary size; adding more spatial layout and context information to the BOVW model and exploring latent semantic information in visual words; studying image visual grammars for image understanding; studying scene classification in images from special fields, such as high-resolution remote sensing images; and investigating new, well-characterized low-level feature extraction algorithms for constructing high-level bags of visual words. To conclude, although a number of urgent problems remain in BOVW-based image scene classification, they do not diminish the significance of research on the BOVW model.
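The vocabulary construction and histogram representation steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than a method taken from the paper: the function names are hypothetical, the toy 2-D tuples stand in for real local patch descriptors, k-means is assumed as the clustering step (the abstract does not name a specific algorithm), and a deterministic farthest-point initialization replaces the usual random one so that the result is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))


def build_vocabulary(descriptors, k, iterations=10):
    """Cluster descriptors into k visual words with plain k-means.

    Assumption: centers are initialized by farthest-point selection
    (deterministic), not by random sampling.
    """
    centers = [list(descriptors[0])]
    while len(centers) < k:
        # Pick the descriptor farthest from all centers chosen so far.
        farthest = max(
            descriptors,
            key=lambda d: min(squared_distance(d, c) for c in centers))
        centers.append(list(farthest))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centers)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members)
                              for col in zip(*members)]
    return centers


def bovw_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word occurs in an image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy data: each "image" is a list of 2-D patch descriptors.
image_a = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1)]
image_b = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (0.0, 0.0)]

vocabulary = build_vocabulary(image_a + image_b, k=2)
hist_a = bovw_histogram(image_a, vocabulary)
hist_b = bovw_histogram(image_b, vocabulary)
```

In a full system the resulting histograms would be passed to a classifier. The vocabulary size `k` is fixed by hand here, which reflects the open problem noted above that there is no theoretical guidance for choosing it.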