Feature selection and image stitching based on dictionary reconstruction and spatial distribution
2018, Vol. 23, No. 5, pp. 698-707
Received: 2017-08-23; Revised: 2017-12-08; Published in print: 2018-05-16
DOI: 10.11834/jig.170461
Objective
To address the redundancy and high dimensionality of SIFT feature descriptors in complex images in large-scale image retrieval, we propose a feature selection method based on dictionary reconstruction and spatial-distribution constraints. The method removes redundant features while retaining the most expressive SIFT descriptors, preserving the original spatial structure.
Method
First, our experiments revealed an inherent connection between feature selection and dictionary learning in terms of sparse representation, which allows the feature selection problem to be recast as a dictionary reconstruction task. Second, to guarantee the robustness of the features in feature space, we designed a new dictionary learning model for SIFT feature selection and solved it iteratively with a simulated annealing algorithm. Finally, during dictionary learning we introduced entropy theory to constrain the spatial distribution of the features, so that the learned descriptors preserve the spatial topology of the original SIFT feature space as much as possible.
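The reformulation described above can be sketched in code: treat the selected descriptors as fixed dictionary atoms, score a candidate subset by how well it reconstructs the full descriptor set, and search over subsets with simulated annealing. This is only an illustrative sketch, not the authors' implementation: the plain least-squares energy (the paper additionally requires non-negative codes plus an entropy term), the swap-one neighborhood, and the geometric cooling schedule are all our own assumptions.

```python
import numpy as np

def energy(X, idx):
    """Least-squares error of reconstructing all descriptors X (n x d) from
    the sub-dictionary formed by the rows in idx (the selected descriptors).
    The paper's model also constrains the codes to be non-negative; plain
    least squares keeps this sketch short."""
    D = X[list(idx)]                                # fixed atoms: selected rows
    At, *_ = np.linalg.lstsq(D.T, X.T, rcond=None)  # codes, with atoms held fixed
    return np.linalg.norm(X - (D.T @ At).T)

def anneal_select(X, k, steps=300, t0=1.0, cool=0.97, seed=0):
    """Simulated annealing over k-element index subsets: swap one selected
    descriptor per step, accept worse moves with Boltzmann probability."""
    rng = np.random.default_rng(seed)
    n = len(X)
    cur = set(rng.choice(n, size=k, replace=False).tolist())
    e_cur, t = energy(X, cur), t0
    best, e_best = set(cur), e_cur
    for _ in range(steps):
        swap_out = rng.choice(sorted(cur))
        swap_in = rng.choice(sorted(set(range(n)) - cur))
        nxt = (cur - {swap_out}) | {swap_in}
        e_nxt = energy(X, nxt)
        if e_nxt < e_cur or rng.random() < np.exp((e_cur - e_nxt) / t):
            cur, e_cur = nxt, e_nxt
            if e_cur < e_best:
                best, e_best = set(cur), e_cur
        t *= cool                                   # geometric cooling schedule
    return sorted(best), e_best

X = np.random.default_rng(1).random((40, 8))        # 40 toy 8-D "descriptors"
idx, err = anneal_select(X, k=6)
```

Because the dictionary atoms are drawn from the descriptor set itself and stay fixed, the optimization is a combinatorial subset search rather than standard dictionary updating, which is why an annealing-style solver fits here.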
Result
On the public Holiday large-scale scene image retrieval dataset, compared with internationally recognized feature selection methods, the proposed method saves memory and improves time efficiency by 30% to 50% while raising the retrieval accuracy of the selected descriptors by 8% to 14.1% over comparable features. On the standard large-scale scene image stitching dataset IPM, we verified the effectiveness of our method for feature extraction and feature matching in image stitching; experiments show that it saves 50% to 70% of the stitching time.
Conclusion
Compared with existing methods, our feature selection method neither depends on a training dataset nor discards important spatial structure and texture information. In large-scale image retrieval, image stitching, and 3D retrieval, it can prune features and improve the efficiency and accuracy of feature matching.
Objective
In large-scale 2D image retrieval, 3D model retrieval, and image stitching, the large number of redundant feature descriptors per image and the high dimensionality of these descriptors make feature matching and retrieval intractable for large-scale image databases and limit the scalability of the descriptors. To address these problems, we present a feature descriptor selection algorithm based on dictionary learning and an entropy spatial constraint that reduces, and even removes, redundant feature descriptors to the greatest possible extent. Our algorithm aims to preserve only the most representative subset of feature descriptors while ensuring that the selected descriptors keep a spatial distribution consistent with the original descriptor set.
Method
First, during our experiments we observed an inner connection between feature descriptor selection and dictionary learning in terms of sparse representation: based on the concept of sparse coding in dictionary learning, the two problems are mutually transferable, so we recast feature descriptor selection as a dictionary reconstruction task. Feature descriptor selection requires pruning repeated descriptor points while keeping a small set of the most representative descriptors; after this transformation, we only need to identify the key descriptors that best reconstruct the original descriptor set under sparsity and representativeness conditions, which we call dictionary reconstruction. Feature descriptor selection thereby becomes a dictionary optimization problem.
Second, after transforming the selection problem into a dictionary learning task, we design a new dictionary learning model to preserve the robustness of the selected descriptors. We take the entire original feature descriptor set as the dictionary and the most representative descriptors as its keywords, and we derive the objective function of our dictionary reconstruction model. Our model differs from standard dictionary learning in that the bases of the dictionary must remain unchanged and the coefficients of each base must be non-negative. Under these constraints, we employ a simulated annealing algorithm to optimize the objective function, and the resulting optimal solution gives the selected feature descriptors.
Finally, during dictionary learning we add an entropy spatial constraint to preserve the spatial distribution of the original feature descriptor points to the greatest extent; that is, we use entropy theory to regularize the learning. If the distribution of the selected descriptor points is consistent with that of the original points, the entropy value is low; otherwise it is high. The model is thereby driven to preserve representative descriptor points with low entropy, i.e., points whose spatial distribution accords with the original set, so we finally obtain a small set of representative feature descriptors with good spatial coverage.
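The entropy constraint can be illustrated with a simple histogram model: grid the image plane, compare the cell-occupancy distribution of the selected keypoints against that of the full set, and penalize divergence. The 4x4 grid and the KL-style gap below are our own illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def spatial_gap(points, selected, bins=4):
    """Bin 2-D keypoint coordinates (assumed in [0,1]^2) on a bins x bins
    grid and return KL(full || selected): near zero when the selected subset
    mirrors the original spatial distribution, large when it drifts."""
    def occupancy(p):
        h, *_ = np.histogram2d(p[:, 0], p[:, 1], bins=bins, range=[[0, 1], [0, 1]])
        h = h.ravel() + 1e-9                     # smooth empty cells
        return h / h.sum()
    p, q = occupancy(points), occupancy(points[selected])
    return float(np.sum(p * np.log(p / q)))

pts = np.random.default_rng(2).random((200, 2))             # toy keypoint locations
even_gap = spatial_gap(pts, np.arange(0, 200, 2))           # spread-out subset
corner_gap = spatial_gap(pts, np.argsort(pts.sum(axis=1))[:100])  # clustered subset
```

A spread-out subset yields a small gap while a spatially clustered one yields a large gap, which is the behavior the entropy term exploits to keep the selected descriptors faithful to the original layout.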
Result
We tested the selected feature descriptors in two research fields to verify our algorithm. On the one hand, we ran experiments on a large-scale image retrieval dataset, the Holiday image retrieval dataset, comparing our algorithm with existing feature descriptor selection methods. The experiments showed that our algorithm considerably saves memory space, increases the time efficiency of feature matching and image retrieval by 30% to 50%, and improves retrieval accuracy by approximately 8% to 14.1%. On the other hand, we tested our algorithm on the standard IPM image stitching dataset, verifying it for feature extraction and feature matching in image stitching. These experiments showed that our algorithm achieves the best time saving (50% to 70%) with only a small loss of accuracy.
Conclusion
Compared with existing methods, our feature descriptor selection algorithm neither relies on a training database nor loses important spatial structure and texture information; it performs stably and scales well across different situations, datasets, and tasks that require feature extraction, feature selection, and feature matching operations, such as video retrieval, image search, and image matching. The experimental results indicate that our model adapts stably to different datasets and scenes, and the image retrieval and image stitching experiments show that it achieves good performance that surpasses other feature selection approaches. Moreover, with the advent of big data, the demand for the most valuable feature descriptors in large image datasets is urgent, and our feature selection approach can be further adopted to reduce redundant descriptors. Our algorithm removes 50% to 70% of noisy feature descriptors, and its main advantage lies in improving the efficiency and accuracy of feature matching in mainstream tasks such as large-scale image retrieval, image stitching, and 3D model retrieval.