Application of dynamic match kernel in image retrieval
2018, Vol. 23, No. 12, pp. 1874-1885
Received: 2018-03-21; Revised: 2018-07-09; Published in print: 2018-12-16
DOI: 10.11834/jig.180137

Objective
In the traditional bag-of-words image retrieval problem, much work has focused on improving the discriminative power of local features. The images retrieved in this way are similar to the query in local detail, but they sometimes differ greatly at the semantic level. Retrieval based on global features, in turn, loses much of the detail information, so images with a similar layout but unrelated content are treated as relevant. To solve this problem, this paper uses deep convolutional features to construct a dynamic match kernel.
Method
With this dynamic match kernel, matches between relevant images are encouraged while the number of matches between irrelevant images is suppressed. The kernel takes the features of the last fully connected layer of a deep convolutional neural network as its input. For relevant images, both the number and the quality of local feature matches are relatively strengthened; conversely, for irrelevant images, the dynamic match kernel reduces the number of local feature matches and lowers their matching scores.
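As a rough illustration of the step above, the following sketch extracts a global descriptor from the last fully connected layer of a pretrained CNN. The abstract does not name the backbone, so a torchvision AlexNet is assumed here purely for illustration; the layer choice and L2 normalization are assumptions, not the authors' exact implementation.

```python
# Minimal sketch: global descriptor from the last fully connected layer
# of a pretrained CNN (AlexNet assumed here for illustration only).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.alexnet(pretrained=True)  # classifier ends with a 1000-d fc layer
model.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

def global_descriptor(image_path: str) -> torch.Tensor:
    """Return the L2-normalized output of the network's last fc layer."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        feat = model(img).squeeze(0)   # last fc layer output (1000-d for AlexNet)
    return feat / feat.norm()          # normalize so distances are comparable
```

The distance between two such descriptors is what the dynamic match kernel uses to decide how strictly local features should be matched.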
Result
The proposed dynamic match kernel was evaluated both quantitatively and qualitatively, and two indicators were proposed to quantify the performance of a match kernel. Based on these two indicators, the intermediate results were analyzed, confirming the superiority of the dynamic match kernel over the static match kernel. Finally, extensive experiments were conducted on five public datasets; the average accuracy obtained on the retrieval tasks ranges from 85.11% to 98.08%, which is higher than that of comparable work in this field.
Conclusion
The experimental results show that the proposed method is effective and outperforms comparable work in this field. The method also has an advantage over various deep-learning feature extraction approaches: because the features are used to construct a dynamic match kernel rather than being coarsely encoded for similarity matching, better performance is obtained on all datasets.
Objective
For the traditional image retrieval problem based on the bag-of-words model, the enhancement of the discriminative capability of local image features has attracted considerable attention. Although the images obtained by such retrieval are similar to the query image in their details, they may differ from a semantic perspective. In addition, when an image retrieval method based on global descriptors is used, it focuses on global appearance but loses much valuable information in the details; thus, images that have a similar layout but are not related are considered relevant. To solve this problem, this study uses deep convolutional features to construct a dynamic match kernel.
Method
The proposed dynamic match kernel encourages feature matches between near-duplicate images and filters out matches between irrelevant images. This study extracts the features of the last fully connected layer of a convolutional neural network as the input to the dynamic match kernel; an adaptive threshold is then constructed for matching local features. For relevant images, the threshold should be large so that positive matches are included; conversely, for irrelevant images, the dynamic match kernel reduces both the number of local feature matches and their matching scores.
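To make the adaptive-threshold idea concrete, here is a minimal sketch of a dynamic match kernel over binary local signatures (Hamming-embedding style). The linear mapping from the global CNN distance to the Hamming threshold, the Gaussian weighting, and all parameter values are illustrative assumptions, not the formulation used in the paper.

```python
# Sketch: the Hamming threshold for accepting local matches adapts to the
# global CNN distance between two images, so relevant pairs keep more
# (and stronger) matches than irrelevant pairs.
import numpy as np

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Hamming distance between two binary signatures (0/1 vectors)."""
    return int(np.count_nonzero(a != b))

def dynamic_threshold(global_dist: float, t_max: float = 40.0,
                      t_min: float = 10.0, d_max: float = 2.0) -> float:
    """Small global distance (likely relevant pair) -> large threshold;
    large global distance (likely irrelevant pair) -> small threshold."""
    ratio = min(global_dist / d_max, 1.0)
    return t_max - (t_max - t_min) * ratio

def match_score(query_sigs, db_sigs, global_dist: float,
                sigma: float = 16.0) -> float:
    """Sum Gaussian-weighted scores over local matches whose Hamming
    distance falls under the adaptive threshold (signatures are assumed
    to belong to the same visual word; index lookup is omitted here)."""
    tau = dynamic_threshold(global_dist)
    score = 0.0
    for q in query_sigs:
        for d in db_sigs:
            h = hamming(q, d)
            if h <= tau:                              # dynamic acceptance test
                score += np.exp(-(h ** 2) / (sigma ** 2))
    return score
```

With L2-normalized global descriptors, the Euclidean distance lies in [0, 2], which is why d_max defaults to 2.0 in this sketch.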
Result
In this study, we first proposed two criteria to evaluate the effect of the dynamic match kernel and two indicators to quantify its performance. On the basis of these two indicators, we analyzed the intermediate results and verified the superiority of the dynamic match kernel over the static match kernel. Finally, we conducted extensive experiments on five public datasets (Holidays, UKBench, Paris6K, Oxford5K, and DupImages), comparing the dynamic match kernel combined with other methods against those methods alone and against mainstream deep learning approaches. The average accuracy of our method ranges from 85.11% to 98.08%, which is higher than that of comparable work in this field.
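For context, mean average precision (mAP) is the retrieval measure most commonly reported on Holidays, Oxford5K, and Paris6K; whether the quoted 85.11%-98.08% figures correspond exactly to mAP is an assumption here. The sketch below only illustrates how such a score is computed.

```python
# Minimal sketch of mean average precision (mAP) for retrieval evaluation.
from typing import List, Set

def average_precision(ranked_ids: List[str], relevant: Set[str]) -> float:
    """AP for one query: mean of the precision values at each relevant hit."""
    hits, precisions = 0, []
    for rank, img_id in enumerate(ranked_ids, start=1):
        if img_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(all_rankings, all_relevant) -> float:
    """mAP over all queries; rankings and ground-truth sets are parallel lists."""
    aps = [average_precision(r, g) for r, g in zip(all_rankings, all_relevant)]
    return sum(aps) / len(aps)
```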
Conclusion
The dynamic match kernel can be used as a portable component of image retrieval systems to improve retrieval results. Compared with methods based directly on deep learning features, the combination of our dynamic matching method with other techniques achieves better results, which indicates that the proposed method is effective and that its performance is better than that of current work in this field.