Hashing method for image retrieval based on product quantization with Huffman coding
2019, Vol. 24, No. 3, pp. 389-399
Received: 2018-04-17; Revised: 2018-09-07; Published in print: 2019-03-16
DOI: 10.11834/jig.180264
目的 (Objective)
Hash-coding-based retrieval is a classic approach in image retrieval. Its principle is that similar images in the original space, after projection by a hash function and quantization, obtain similar hash codes in Hamming space. Such methods generally involve two stages: projection and quantization. In the projection stage, most methods apply principal component analysis (PCA) to reduce the dimensionality of the raw data, but quantization strategies differ considerably across methods. For data with unbalanced information quantities, traditional image hashing methods quantize every subspace with the same fixed number of code bits, which leads to low coding efficiency and low quantization accuracy. To address this, this paper proposes a product quantization method based on Huffman coding.
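The retrieval principle described above, where similar images map to binary codes with small Hamming distance, can be sketched minimally as follows. The 8-bit codes are illustrative placeholders, not the output of any real hash function:

```python
# Minimal sketch of Hamming-distance retrieval over binary hash codes.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two equal-length binary codes."""
    return bin(a ^ b).count("1")

def nearest(query: int, database: list) -> int:
    """Index of the database code closest to the query in Hamming space."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))

codes = [0b10110010, 0b01001101, 0b10110011]  # hypothetical 8-bit hash codes
print(nearest(0b10110110, codes))  # code 0 differs in only 1 bit -> prints 0
```

Because the codes are short bit strings, this lookup needs only XOR and popcount operations, which is what makes hashing attractive for large-scale retrieval.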
方法 (Method)
First, product quantization is applied to the dimensionality-reduced data so that the distribution of the data in the original space is well preserved. Then, the variance of each subspace is used as the measure of its information quantity and as the basis for allocating code bits. Finally, with the help of a Huffman tree, subspaces with larger variance are allocated more code bits.
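The product quantization step splits each vector into subvectors and encodes each subvector by the index of its nearest codebook centroid. A minimal sketch follows; the codebooks here are tiny hand-picked examples rather than centroids trained by k-means as in the actual method:

```python
# Minimal product-quantization sketch: encode each subvector by the index
# of its nearest centroid in that subspace's codebook.

def quantize(vec, codebooks):
    """Encode vec as one centroid index per subspace."""
    m = len(codebooks)            # number of subspaces
    d = len(vec) // m             # dimension of each subvector
    code = []
    for i, book in enumerate(codebooks):
        sub = vec[i * d:(i + 1) * d]
        # nearest centroid by squared Euclidean distance
        dists = [sum((s - c) ** 2 for s, c in zip(sub, cent)) for cent in book]
        code.append(dists.index(min(dists)))
    return code

# Two subspaces of dimension 2, two centroids each (illustrative values).
books = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)]]
print(quantize([0.9, 1.1, 0.1, 0.9], books))  # -> [1, 0]
```

Quantizing subspaces independently keeps the codebooks small while the Cartesian product of the per-subspace centroids still covers the full space, which is why product quantization preserves the original data distribution well.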
结果 (Result)
Experiments on the widely used public datasets MNIST, NUS-WIDE, and 22K LabelMe show that, compared with the original product quantization method, the proposed method reduces quantization distortion by 49% on average and improves mean average precision by 19%. On MNIST, compared with the related transform coding (TC) method over code lengths from 32 bits to 256 bits, the proposed method shortens training time by 22.5 s on average.
结论 (Conclusion)
This paper proposes a hashing method based on multi-bit product quantization that improves coding efficiency and quantization accuracy. It outperforms comparable algorithms in mean average precision, recall, and other metrics, and can be effectively applied to image retrieval.
Objective
Hashing is one of the most popular approaches to content-based image retrieval. The main idea is to learn a fixed-length binary code for each image and then use the Hamming distance to measure image similarity. An effective hashing method should have at least three properties. First, the learned codes should be short so that large numbers of images can be stored in a small amount of memory. Second, the learned codes should map images that are perceptually or semantically similar to binary strings with a small Hamming distance. Third, the method should be efficient at learning the parameters of the binary code and at encoding a new test image. Most hashing approaches achieve binary coding in two important steps, namely, projection and quantization. For projection, most approaches perform principal component analysis (PCA) to reduce the dimensionality of the raw data. For quantization, different approaches adopt different strategies. In the quantization stage, most traditional hashing methods allocate the same number of bits to each data subspace. However, the information quantities of the subspaces differ, so uniform quantization can produce inefficient codes and high quantization distortion, especially when the data have unbalanced information quantities. To address this problem, this study proposes an effective coding method based on product quantization with Huffman coding.
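The PCA projection step can be sketched in miniature with a pure-Python power iteration on the covariance matrix; real systems would call a library routine, and the 2-D points below are illustrative only:

```python
# Minimal PCA-projection sketch: find the leading principal direction of
# a set of 2-D points via power iteration on the covariance matrix.

def pca_top_component(data, iters=200):
    """Return the unit-length leading principal direction of 2-D points."""
    n = len(data)
    mean = [sum(p[i] for p in data) / n for i in range(2)]
    centered = [[p[0] - mean[0], p[1] - mean[1]] for p in data]
    # 2x2 covariance matrix of the centered data
    cov = [[sum(a[i] * a[j] for a in centered) / n for j in range(2)]
           for i in range(2)]
    v = [1.0, 1.0]
    for _ in range(iters):  # power iteration converges to the top eigenvector
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]
    return v

points = [(0.0, 0.0), (1.0, 1.1), (2.0, 1.9), (3.0, 3.05)]
direction = pca_top_component(points)  # points lie near y = x, so ~(0.71, 0.71)
```

Projecting each image descriptor onto the top principal directions is what reduces the raw data to the low-dimensional space that the quantization stage then operates on.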
Method
Similar to most hashing approaches, the proposed method uses PCA to reduce the dimensionality of the raw data in the projection stage. A vector quantization scheme is then carefully designed for the quantization stage. The approach first applies product quantization to the dimensionality-reduced data to preserve the data distribution of the original space. For each subspace, the variance is computed directly as the measure of its information quantity. For effectiveness, subspaces with high information quantity should be allocated more bits for binary coding, and vice versa. To achieve this, the reciprocals of the variance proportions are used to build a Huffman tree, which is then applied to generate Huffman codes. Accordingly, a different number of bits and different binary code values can be assigned to each subspace. In other words, many bits are allocated to encode subspaces with large variance and few to subspaces with small variance. Because the variance is easy to calculate, the proposed approach is simple and efficient for binary coding. Experimental results show that the Huffman coding method is effective for image retrieval.
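The bit-allocation idea can be sketched with a standard Huffman construction over the reciprocals of the subspace variances: since Huffman coding gives low-weight symbols longer codes, weighting by 1/variance hands high-variance subspaces more bits. The variance values below are hypothetical:

```python
import heapq
import itertools

# Sketch of variance-driven bit allocation: build a Huffman tree over the
# reciprocals of the subspace variances, so that high-variance subspaces
# (low Huffman weight) end up with longer codes, i.e. more bits.

def huffman_code_lengths(weights):
    """Code length (in bits) per symbol for the given Huffman weights."""
    counter = itertools.count()  # tie-breaker so heapq never compares lists
    heap = [(w, next(counter), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:  # every merge adds one bit to all member symbols
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next(counter), s1 + s2))
    return lengths

variances = [8.0, 4.0, 2.0, 1.0]  # hypothetical subspace variances
bits = huffman_code_lengths([1.0 / v for v in variances])
print(bits)  # -> [3, 3, 2, 1]: the highest-variance subspace gets the most bits
```

Note how the allocation is data-driven: the subspace with variance 8.0 receives three bits while the one with variance 1.0 receives a single bit, in contrast to the uniform allocation of traditional methods.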
Result
The proposed approach is tested on three public datasets, namely, MNIST, NUS-WIDE, and 22K LabelMe. For each image, a 512-D GIST descriptor is extracted as the input of the hashing approach. To verify its performance, the proposed approach is compared with four related approaches: the original product quantization method, a PCA-based product quantization method, the iterative quantization method, and the transform coding (TC) method. The experimental results are reported in terms of quantization distortion, mean average precision, recall, and training time. The results show that the average quantization distortion of the proposed approach decreases by approximately 49%, and the mean average precision of the retrieval results increases by approximately 19%, compared with the existing product quantization method. The training time of the proposed approach is also compared with that of TC for code lengths from 32 bits to 256 bits on MNIST; the proposed approach reduces training time by 22.5 s on average.
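The mean average precision (mAP) metric reported above averages, over all queries, the precision at each rank where a relevant item appears. A minimal sketch with illustrative relevance lists:

```python
# Sketch of the mean-average-precision (mAP) retrieval metric.
# A relevance list marks with 1 each rank at which a relevant item appears.

def average_precision(relevance):
    """Average of the precision values at each relevant rank."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(rankings):
    """mAP over a list of per-query relevance lists."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

queries = [[1, 0, 1, 0], [0, 1, 1, 0]]  # hypothetical per-query relevance
print(round(mean_average_precision(queries), 3))  # -> 0.708
```

Because mAP rewards placing relevant images early in the ranked list, it reflects retrieval quality more faithfully than raw accuracy at a fixed cutoff.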
Conclusion
This study proposes Huffman coding for image retrieval in the product quantization stage. According to the information quantities, the Huffman-based product quantization scheme allocates a different number of bits to each data subspace, which effectively increases coding efficiency and quantization accuracy. The proposed approach is tested on three public datasets and compared with four related approaches. Experimental results demonstrate that it is superior to some state-of-the-art algorithms for image retrieval in terms of mean average precision and recall. The proposed approach is not a precise coding method; thus, our future work will focus on precise hashing methods for effective image retrieval.
Shen F M, Zhou X, Yang Y, et al. A fast optimization method for general binary code learning[J]. IEEE Transactions on Image Processing, 2016, 25(12):5610-5621.[DOI:10.1109/TIP.2016.2612883]
Gui J, Liu T L, Sun Z N, et al. Fast supervised discrete hashing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(2):490-496.[DOI:10.1109/TPAMI.2017.2678475]
Li W J, Zhou Z H. Learning to hash for big data: current status and future trends[J]. Chinese Science Bulletin, 2015, 60(5-6): 485-490. [DOI:10.1360/N972014-00841]
Gionis A, Indyk P, Motwani R. Similarity search in high dimensions via hashing[C]//Proceedings of the 25th International Conference on Very Large Data Bases. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1999: 518-529.
Kulis B, Grauman K. Kernelized locality-sensitive hashing[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(6):1092-1104.[DOI:10.1109/TPAMI.2011.219]
Zhang X Y, Wang M L, Cui J T. Efficient indexing of binary LSH for high dimensional nearest neighbor[J]. Neurocomputing, 2016, 213:24-33.[DOI:10.1016/j.neucom.2016.05.095]
Datar M, Immorlica N, Indyk P, et al. Locality-sensitive hashing scheme based on p-stable distributions[C]//Proceedings of the 20th Annual Symposium on Computational Geometry. Brooklyn, NY, USA: ACM, 2004: 253-262. [DOI:10.1145/997817.997857]
Liu W, Wang J, Kumar S, et al. Hashing with graphs[C]//Proceedings of the 28th International Conference on International Conference on Machine Learning. Bellevue, Washington, USA: Omnipress, 2011: 1-8.
Gong Y C, Lazebnik S, Gordo A, et al. Iterative quantization:a procrustean approach to learning binary codes for large-scale image retrieval[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(12):2916-2929.[DOI:10.1109/TPAMI.2012.193]
Shen F M, Shen C H, Liu W, et al. Supervised discrete hashing[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 37-45. [DOI:10.1109/CVPR.2015.7298598]
Wang J, Kumar S, Chang S F. Semi-supervised hashing for scalable image retrieval[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 3424-3431. [DOI:10.1109/CVPR.2010.5539994]
Kong W H, Li W J. Double-bit quantization for hashing[C]//Proceedings of the 26th AAAI Conference on Artificial Intelligence. Toronto, Ontario, Canada: AAAI, 2012: 634-640.
Kong W H, Li W J, Guo M Y. Manhattan hashing for large-scale image retrieval[C]//Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval. Portland, Oregon, USA: ACM, 2012: 45-54. [DOI:10.1145/2348283.2348293]
Jegou H, Douze M, Schmid C. Product quantization for nearest neighbor search[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(1):117-128.[DOI:10.1109/TPAMI.2010.57]
Ge T Z, He K M, Ke Q F, et al. Optimized product quantization for approximate nearest neighbor search[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE, 2013: 2946-2953. [DOI:10.1109/CVPR.2013.379]
Norouzi M, Fleet D J. Cartesian k-means[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE, 2013: 3017-3024. [DOI:10.1109/CVPR.2013.388]
Brandt J. Transform coding for fast approximate nearest neighbor search in high dimensions[C]//Proceedings of 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 1815-1822. [DOI:10.1109/CVPR.2010.5539852]
Kong W H, Li W J. Isotropic hashing[C]//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: ACM, 2012: 1646-1654.
Huffman D A. A method for the construction of minimum-redundancy codes[J]. Proceedings of the IRE, 1952, 40(9):1098-1101.[DOI:10.1109/JRPROC.1952.273898]