面向视觉搜索的空间局部敏感哈希方法
Locality-sensitive hashing approach based on semantic space for visual retrieval
- 2021年26卷第7期 页码:1568-1582
纸质出版日期: 2021-07-16 ,
录用日期: 2021-02-28
DOI: 10.11834/jig.200534
移动端阅览
浏览全部资源
扫码关注微信
纸质出版日期: 2021-07-16 ,
录用日期: 2021-02-28
移动端阅览
黄小燕, 孙彬, 杨展源, 朱映映, 田奇. 面向视觉搜索的空间局部敏感哈希方法[J]. 中国图象图形学报, 2021,26(7):1568-1582.
Xiaoyan Huang, Bin Sun, Zhanyuan Yang, Yingying Zhu, Qi Tian. Locality-sensitive hashing approach based on semantic space for visual retrieval[J]. Journal of Image and Graphics, 2021,26(7):1568-1582.
目的
2
视觉检索需要准确、高效地从大型图像或者视频数据集中检索出最相关的视觉内容,但是由于数据集中图像数据量大、特征维度高的特点,现有方法很难同时保证快速的检索速度和较好的检索效果。
方法
2
对于面向图像视频数据的高维数据视觉检索任务,提出加权语义局部敏感哈希算法(weighted semantic locality-sensitive hashing,WSLSH)。该算法利用两层视觉词典对参考特征空间进行二次空间划分,在每个子空间里使用加权语义局部敏感哈希对特征进行精确索引。其次,设计动态变长哈希码,在保证检索性能的基础上减少哈希表数量。此外,针对局部敏感哈希(locality sensitive hashing,LSH)的随机不稳定性,在LSH函数中加入反映参考特征空间语义的统计性数据,设计了一个简单投影语义哈希函数以确保算法检索性能的稳定性。
结果
2
在Holidays、Oxford5k和DataSetB数据集上的实验表明,WSLSH在DataSetB上取得最短平均检索时间0.034 25 s;在编码长度为64位的情况下,WSLSH算法在3个数据集上的平均精确度均值(mean average precision,mAP)分别提高了1.2%32.6%、1.7%19.1%和2.6%28.6%,与几种较新的无监督哈希方法相比有一定的优势。
结论
2
通过进行二次空间划分、对参考特征的哈希索引次数进行加权、动态使用变长哈希码以及提出简单投影语义哈希函数来对LSH算法进行改进。由此提出的加权语义局部敏感哈希(WSLSH)算法相比现有工作有更快的检索速度,同时,在长编码的情况下,取得了更为优异的性能。
Objective
2
Visual retrieval methods need to accurately and efficiently retrieve the most relevant visual content from large-scale images or video datasets. However
due to a large amount of image data and high feature dimensionality in the dataset
existing methods face difficulty in ensuring fast retrieval speed and good retrieval results. Hashing is a widely studied solution for approximate nearest neighbor search
which aims to convert high-dimensional data items into a low-dimensional representation or a hash code consisting of a set of bit sequences. Locality-sensitive hashing (LSH) is a data-independent
unsupervised hashing algorithm that provides asymptotic theoretical properties
thereby ensuring performance. LSH is considered as one of the most common methods for fast nearest-neighbor search in high-dimensional space. Nevertheless
if the number of hash functions
k
is set too small
it leads to too many data items falling into each hash bucket
thus increasing the query response time. By contrast
if
k
is set too large
the number of data items in each hash bucket is reduced. Moreover
to achieve the desired search accuracy
LSH usually needs to use long hash codes
thereby reducing the recall rate. Although the use of multiple hash tables can alleviate this problem
it significantly increases memory cost and query time. Besides
due to the semantic gap between the visual semantic space and metric space
LSH may not obtain good search performance.
Method
2
For visual retrieval of high-dimensional data
we first propose a hash algorithm called weighted semantic locality-sensitive hashing (WSLSH)
which is based on feature space partitioning
to address the aforementioned drawbacks of LSH. While building the indices
WSLSH considers the distance relationship between reference and query features
divides the reference feature space into two subspaces by a two-layer visual dictionary
and employs weighted-semantic locality sensitive hashing in each subspace to index
thereby forming a hierarchical index structure. The proposed algorithm can rapidly converge the target to a small range in the process of large-scale retrieval and make accurate queries
which greatly improves the retrieval speed. Then
dynamic variable-length hashing codes are applied in a hashing table to retrieve multiple hashing buckets
which can reduce the number of hashing tables and improve the retrieval speed based on guaranteeing the retrieval performance. Through these two improvements
the retrieval speed can be greatly improved. In addition
to solve the random instability of LSH
statistical information reflecting the semantics of reference feature space is introduced into the LSH function
and a simple projection semantic-hashing function is designed to ensure the stability of the retrieval performance.
Result
2
Experiment results on Holidays
Oxford5k
and DataSetB datasets show that the retrieval accuracy and retrieval speed are effectively improved in comparison with the representative unsupervised hash methods. WSLSH achieves the shortest average retrieval time (0.034 25 s) on DataSetB. When the encoding length is 64 bit
the mean average precision (mAP) of the WSLSH algorithm is improved by 1.2%32.6%
1.7%19.1%
and 2.6%28.6%. WSLSH is not highly sensitive to the size change of the reference feature subset involved
so the retrieval time does not change significantly
which reflects the retrieval advantage of WSLSH for large-scale datasets. With the increase of encoding length
the performance of the WSLSH algorithm is improved gradually. When the encoding length is 64 bit
the WSLSH algorithm obtains the highest precision and recall on the three datasets
which is superior to other compared methods.
Conclusion
2
The LSH algorithm is improved by performing feature space division twice
weighting the number of hash indexes of reference features
dynamically using variable-length hash codes
and introducing a simple-projection semantic-hash function. Thus
the proposed WSLSH algorithm has faster retrieval speed. In the case of long encoding length
WSLSH achieves better performance than the compared works and shows high application value for large-scale image datasets.
特征空间划分局部敏感哈希(LSH)动态变长哈希码视觉搜索最近邻搜索
feature space partitioninglocality-sensitive hashing(LSH)dynamic variable-length hashing codevisual retrievalnearest neighbor search
Broder A Z, Glassman S C, Manasse M S and Zweig G. 1997. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13): 1157-1166[DOI:10.1016/S0169-7552(97)00031-7]
Broder A Z. 1997. On the resemblance and containment of documents//Proceedings of Compression and Complexity of SEQUENCES 1997. Salerno, Italy: 21-29[DOI: 10.1109/SEQUEN.1997.666900http://dx.doi.org/10.1109/SEQUEN.1997.666900]
Charikar M S. 2002. Similarity estimation techniques from rounding algorithms//The 34th Annual ACM Symposium on Theory of Computing. Montréal, Canada: ACM: 380-388[DOI: 10.1145/509907.509965http://dx.doi.org/10.1145/509907.509965]
Chum O and Matas J. 2010. Large-scale discovery of spatially related images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2): 371-377[DOI:10.1109/TPAMI.2009.166]
Dasgupta A, Kumar R and Sarlós T. 2011. Fast locality-sensitive hashing//Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, USA: ACM: 1073-1081[DOI: 10.1145/2020408.2020578http://dx.doi.org/10.1145/2020408.2020578]
Datar M, Immorlica N, Indyk P and Mirrokni V S. 2004. Locality-sensitive hashing scheme based on p-stable distributions//The 20th Annual Symposium on Computational Geometry. Brooklyn, USA: ACM: 253-262[DOI: 10.1145/997817.997857http://dx.doi.org/10.1145/997817.997857]
Dean T, Ruzon M A, Segal M, Shlens J, Vijayanarasimhan S and Yagnik J. 2013. Fast, accurate detection of 100 000 object classes on a single machine//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 1814-1821[DOI: 10.1109/CVPR.2013.237http://dx.doi.org/10.1109/CVPR.2013.237]
Do T T, Le Tan D K, Pham T T and Cheung N M. 2017. Simultaneous feature aggregating and hashing for large-scale image search//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6618-6627[DOI: 10.1109/CVPR.2017.449http://dx.doi.org/10.1109/CVPR.2017.449]
Fei L K, Qin J Y, Teng S H, Zhang W, Liu D N and Hou Y. 2020. Hashing for approximate nearest neighbor search on big data: a survey. Journal of Guangdong University of Technology, 37(3): 23-35
费伦科, 秦建阳, 滕少华, 张巍, 刘冬宁, 侯艳. 2020. 近似最近邻大数据检索哈希散列方法综述. 广东工业大学学报, 37(3): 23-35 [DOI:10.12052/gdutxb.190123]
Gionis A, Indyk P and Motwani R. 1999. Similarity search in high dimensions via hashing//Proceedings of the 25th International Conference on Very Large Data Bases. Edinburgh, Scotland: Morgan Kaufmann: 518-529
Gong Y C, Lazebnik S, Gordo A and Perronnin F. 2013. Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12): 2916-2929[DOI:10.1109/TPAMI.2012.193]
He K M, Wen F and Sun J. 2013. K-means hashing: an affinity-preserving quantization method for learning binary compact codes//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 2938-2945[DOI: 10.1109/CVPR.2013.378http://dx.doi.org/10.1109/CVPR.2013.378]
Indyk P and Motwani R. 1998. Approximate nearest neighbors: towards removing the curse of dimensionality//Proceedings of the 30th Annual ACM Symposium on Theory of Computing. Dallas, USA: ACM: 604-613[DOI: 10.1145/276698.276876http://dx.doi.org/10.1145/276698.276876]
Jégou H, Douze M, Schmid C and Pérez P. 2010. Aggregating local descriptors into a compact image representation//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE: 3304-3311[DOI: 10.1109/CVPR.2010.5540039http://dx.doi.org/10.1109/CVPR.2010.5540039]
Ji J Q, Li J M, Yan S C, Zhang B and Tian Q. 2012. Super-bit locality-sensitive hashing//Proceedings of the 26th Annual Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS: 108-116
Jin Z M, Li C, Lin Y and Cai D. 2014. Density sensitive hashing. IEEE Transactions on Cybernetics, 44(8): 1362-1371[DOI:10.1109/TCYB.2013.2283497]
Kulis B, Jain P and Grauman K. 2009. Fast similarity search for learned metrics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12): 2143-2157[DOI:10.1109/TPAMI.2009.151]
Lai H J, Pan Y, Liu Y and Yan S C. 2015. Simultaneous feature learning and hash coding with deep neural networks//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3270-3278[DOI: 10.1109/CVPR.2015.7298947http://dx.doi.org/10.1109/CVPR.2015.7298947]
Li P, Church K W and Hastie T J. 2007. Conditional random sampling: a sketch-based sampling technique for sparse data//Proceedings of the 19th International Conference on Neural Information Processing Systems. Vancouver, British Columbia, Canada: MIT Press: 873-880[DOI: 10.5555/2976456.2976566http://dx.doi.org/10.5555/2976456.2976566]
Li P, Hastie T J and Church K W. 2006. Very sparse random projections//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Philadelphia, USA: ACM: 287-296[DOI: 10.1145/1150402.1150436http://dx.doi.org/10.1145/1150402.1150436]
Li P and König C. 2010.b-Bit minwise hashing//Proceedings of the 19th International Conference on World Wide Web. Raleigh, USA: ACM: 671-680[DOI: 10.1145/1772690.1772759http://dx.doi.org/10.1145/1772690.1772759]
Li P, König A C and Gui W H. 2010. b-Bit minwise hashing for estimating three-way similarities//Proceedings of the 23rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates, Inc. : 1387-1395[DOI: 10.5555/2997046.2997051http://dx.doi.org/10.5555/2997046.2997051]
Li P, Owen A B and Zhang C H. 2012. One permutation hashing//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS: 3113-3121[DOI: 10.5555/2999325.2999482http://dx.doi.org/10.5555/2999325.2999482]
Lin K, Lu J W, Chen C S and Zhou J. 2016. Learning compact binary descriptors with unsupervised deep neural networks//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1183-1192[DOI: 10.1109/CVPR.2016.133http://dx.doi.org/10.1109/CVPR.2016.133]
Liu Y, Cheng M, Wang F P, Li D X, Liu W and Fan J L. 2020. Deep hashing image retrieval methods. Journal of Image and Graphics, 25(7): 1296-1317
刘颖, 程美, 王富平, 李大湘, 刘伟, 范九伦. 2020. 深度哈希图像检索方法综述. 中国图象图形学报, 25(7): 1296-1317 [DOI:10.11834/jig.190518]
Lyu Q, Josephson W, Wang Z, Charikar M and Li K. 2007. Multi-probe LSH: efficient indexing for high-dimensional similarity search//Proceedings of the 33rd International Conference on Very Large Data Bases. Vienna, Austria: ACM: 950-961[DOI: 10.5555/1325851.1325958http://dx.doi.org/10.5555/1325851.1325958]
Mao X J and Yang Y B. 2014. Semantic hashing with image subspace learning. Journal of Software, 25(8): 1781-1793
毛晓蛟, 杨育彬. 2014. 一种基于子空间学习的图像语义哈希索引方法. 软件学报, 25(8): 1781-1793 [DOI:10.13328/j.cnki.jos.004488]
Motwani R, Naor A and Panigrahi R. 2006. Lower bounds on locality sensitive hashing//The 22nd Annual Symposium on Computational Geometry. Sedona, USA: ACM: 154-157[DOI: 10.1145/1137856.1137881http://dx.doi.org/10.1145/1137856.1137881]
Norouzi M and Fleet D J. 2011. Minimal loss hashing for compact binary codes//Proceedings of the 28th International Conference on International Conference on Machine Learning. Bellevue, USA: Omnipress: 353-360[DOI: 10.5555/3104482.3104527http://dx.doi.org/10.5555/3104482.3104527]
O'Donnell R, Wu Y and Zhou Y. 2014. Optimal lower bounds for locality-sensitive hashing (except when q is tiny). ACM Transactions on Computation Theory, 6(1): #5[DOI:10.1145/2578221]
Oliva A and Torralba A. 2001. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3): 145-175[DOI:10.1023/A:1011139631724]
Panigrahy R. 2006. Entropy based nearest neighbor search in high dimensions//The 17th Annual ACM-SIAM Symposium on Discrete Algorithm. Miami, USA: ACM: 1186-1195[DOI: 10.5555/1109557.1109688http://dx.doi.org/10.5555/1109557.1109688]
Perronnin F and Dance C. 2007. Fisher kernels on visual vocabularies for image categorization//Proceedings of 2007 IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, USA: IEEE: 1-8[DOI: 10.1109/CVPR.2007.383266http://dx.doi.org/10.1109/CVPR.2007.383266]
Razavian A S, Azizpour H, Sullivan J and Carlsson S. 2014. CNN features off-the-shelf: an astounding baseline for recognition//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Columbus, USA: IEEE: 512-519[DOI: 10.1109/CVPRW.2014.131http://dx.doi.org/10.1109/CVPRW.2014.131]
Shan Q, Wu C C, CurlessB, Furukawa Y, Hernandez C and Seitz S M. 2014. Accurate geo-registration by ground-to-aerial image matching//Proceedings of the 2nd International Conference on 3D Vision. Tokyo, Japan: IEEE: 525-532[DOI: 10.1109/3DV.2014.69http://dx.doi.org/10.1109/3DV.2014.69]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR: 1-14
Smeulders A W M, Worring M, Santini S, Gupta A and Jain R. 2000. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12): 1349-1380[DOI:10.1109/34.895972]
Wang J, Kumar S and Chang S F. 2012. Semi-supervised hashing for large-scale search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(12): 2393-2406[DOI:10.1109/TPAMI.2012.48]
Wang J, Liu W, Kumar S and Chang S F. 2016. Learning to hash for indexing big data-A survey. Proceedings of the IEEE, 104(1): 34-57[DOI:10.1109/JPROC.2015.2487976]
Wang J D, Shen H T, Song J K and Ji J Q. 2014. Hashing for similarity search: a survey[EB/OL]. [2020-08-30].https://arxiv.org/pdf/1408.2927.pdfhttps://arxiv.org/pdf/1408.2927.pdf
Wang J D, Zhang T, Sebe N and Shen H T. 2018. A survey on learning to hash. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4): 769-790[DOI:10.1109/TPAMI.2017.2699960]
Wang X J, Zhang L, Jing F and Ma W Y. 2006. Annosearch: image auto-annotation by search//Proceedings of 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE: 1483-1490[DOI: 10.1109/CVPR.2006.58http://dx.doi.org/10.1109/CVPR.2006.58]
Weiss Y, Torralba A and Fergus R. 2008. Spectral hashing//Proceedings of the 21st International Conference on Neural Information Processing Systems. Vancouver, British Columbia, Canada: Curran Associates, Inc. : 1753-1760[DOI: 10.5555/2981780.2981999http://dx.doi.org/10.5555/2981780.2981999]
Xia R K, Pan Y, Lai H J, Liu C and Yan S C. 2014. Supervised hashing for image retrieval via image representation learning//Proceedings of the 28th AAAI Conference on Artificial Intelligence. Québec City, Canada: AAAI Press: 2156-2162
Yuan Y. 2014. HABIR: hashing baseline for image retrieval[CP/OL]. [2020-07-30].https://github.com/willard-yuan/hashing-baseline-for-image-retrievalhttps://github.com/willard-yuan/hashing-baseline-for-image-retrieval
Zhao W L, Wu X and Ngo C W. 2010. On the annotation of web videos by efficient near-duplicate search. IEEE Transactions on Multimedia, 12(5): 448-461[DOI:10.1109/tmm.2010.2050651]
相关文章
相关作者
相关机构