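Wan Fang, Qiang Haopeng, Lei Guangbo (School of Computing, Hubei University of Technology, Wuhan 430068; College of Science, Wuhan University of Technology, Wuhan 430070)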
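Objective Image hashing retrieval based on deep learning is an active research topic in image retrieval. Existing deep hashing methods ignore the guiding role of deep image features in training the deep hash function, and because they rely on relaxed optimization, they cannot effectively handle the suboptimal hash codes that result from large binary quantization errors. To address this, a self-supervised deep discrete hashing method (SSDDH) is proposed. Method Binary hash codes are computed from the deep feature matrix extracted by a convolutional neural network and the image label matrix, and are used as self-supervised information to guide the training of the deep hash function. A pairwise loss function is constructed that preserves both the similarity among continuous hash codes and the similarity between continuous hash codes and binary hash codes, and a discrete optimization algorithm is used to solve for the hash codes, which effectively reduces the binary quantization error. Result The method is tested on three public datasets and compared experimentally with other hashing algorithms. On the CIFAR-10, NUS-WIDE (web image dataset from National University of Singapore), and Flickr datasets, the proposed method achieves the highest retrieval precision, exceeding the second-best algorithm DPSH (deep pairwise-supervised hashing) by 3%, 3%, and 1%, respectively. Conclusion The proposed image retrieval method based on self-supervised deep discrete hashing effectively exploits deep feature information and image label information to guide the training of the deep hash function, and effectively reduces the binary quantization error. Experimental results show that SSDDH outperforms other comparable algorithms in mean average precision and can effectively accomplish image retrieval tasks.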
Self-supervised deep discrete hashing for image retrieval
Wan Fang, Qiang Haopeng, Lei Guangbo (School of Computing, Hubei University of Technology, Wuhan 430068, China; College of Science, Wuhan University of Technology, Wuhan 430070, China)
Objective Hashing techniques have attracted much attention and are widely applied to nearest neighbor search for image retrieval on large-scale datasets because of their low storage cost and fast retrieval speed. With the rapid development of deep learning, deep neural networks have been widely incorporated into image hashing retrieval, and existing deep learning-based hashing methods demonstrate the effectiveness of end-to-end deep architectures for hash learning. However, these methods have two problems. First, they ignore the guiding role of deep image feature information in training deep hash functions. Second, most of them first solve a relaxed problem to simplify the optimization involved in binary code learning and then quantize the continuous solution to obtain an approximate binary solution. This optimization strategy leads to a large binary quantization error and therefore to suboptimal hash codes. To solve these two problems, a self-supervised deep discrete hashing method (SSDDH) is proposed in this study.
Method The proposed SSDDH consists of two steps. First, using matrix decomposition, binary hash codes are obtained by solving a self-supervised loss function composed of the deep feature matrix extracted by the convolutional neural network and the image label matrix. The obtained binary hash codes are used as supervision information to guide the training of the deep hash function. Second, a pair-wise loss function is constructed to maintain the similarity among the hash codes generated by the deep hash function while also maintaining the similarity between these hash codes and the binary hash codes. A discrete optimization algorithm is used to solve the objective function, which effectively reduces the binary quantization error.
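The second step of the method can be illustrated with a small sketch. The abstract does not state the loss function, so the form below is an assumption: a DPSH-style negative log-likelihood over pairwise similarities plus a squared quantization penalty that ties the continuous codes to the binary codes. The names `pairwise_hash_loss`, `U`, `B`, `S`, and `eta` are hypothetical and do not come from the paper.

```python
import math


def pairwise_hash_loss(U, B, S, eta=0.1):
    """Assumed pairwise similarity loss plus a quantization penalty.

    U   : list of n real-valued code vectors (network outputs), each of length r
    B   : list of n binary code vectors with entries in {-1, +1}
    S   : n x n matrix, S[i][j] = 1 if images i and j are similar, else 0
    eta : weight of the quantization term (hypothetical hyperparameter)
    """
    n = len(U)
    similarity_term = 0.0
    for i in range(n):
        for j in range(n):
            # Half the inner product of the two continuous codes.
            theta = 0.5 * sum(a * b for a, b in zip(U[i], U[j]))
            # Numerically stable log(1 + exp(theta)).
            softplus = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
            # Negative log-likelihood of the observed similarity label.
            similarity_term += softplus - S[i][j] * theta
    # Squared distance between continuous codes and binary codes.
    quantization_term = sum(
        (b - u) ** 2 for Bi, Ui in zip(B, U) for b, u in zip(Bi, Ui)
    )
    return similarity_term + eta * quantization_term


# Toy example: images 0 and 1 are similar, image 2 is different.
S = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
U_aligned = [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
U_misaligned = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0]]
loss_aligned = pairwise_hash_loss(U_aligned, U_aligned, S)
loss_misaligned = pairwise_hash_loss(U_misaligned, U_misaligned, S)
```

In this toy example the codes that agree with the similarity matrix give the lower loss, and the quantization term vanishes because the continuous codes already equal the binary codes.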
Result Several experiments are conducted on three public datasets to validate the performance of the proposed algorithm. The first experiment compares the mean average precision (mAP) of existing hashing methods at different hash code lengths, including unsupervised methods, supervised shallow methods, and supervised deep methods. The results show that SSDDH achieves the best mAP at every code length tested. On the CIFAR-10 and NUS-WIDE (web image dataset from National University of Singapore) datasets, the mAP of SSDDH is 3% higher than that of the next-best method, DPSH (deep pairwise-supervised hashing). On the Flickr dataset, SSDDH is 1% higher than DPSH, again the next-best method. The second experiment is conducted on the CIFAR-10 dataset. The precision-recall (PR) curves of DPSH and SSDDH with 48-bit hash codes are plotted, and SSDDH remarkably outperforms its competitor. SSDDH and DPSH are also compared in terms of the accuracy of the top 20 returned images when the hash code length is 48 bits, and the results are visualized for easy observation. Here too, the retrieval performance of SSDDH is considerably higher than that of DPSH. The third experiment is a parameter sensitivity analysis of SSDDH, in which one parameter is varied while the others are fixed. The method is insensitive to its parameters, which supports the robustness and effectiveness of the proposed method. The fourth experiment is conducted on CIFAR-10 with 48-bit hash codes to explore the difference in training time cost between DPSH and SSDDH. In the later stage of model training, SSDDH outperforms DPSH for the same time consumption.
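The evaluation measures mentioned above, mAP and the precision of the top returned images, can be sketched as follows. This is a minimal illustration that assumes Hamming ranking and single-label relevance (a database image is relevant when its label equals the query label, as on CIFAR-10); for multi-label data such as NUS-WIDE, relevance is usually defined as sharing at least one label. All function names are hypothetical, and the paper's exact evaluation protocol is not given in the abstract.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, db_codes):
    """Database indices sorted by increasing Hamming distance to the query."""
    # sorted() is stable, so ties keep their database order.
    return sorted(
        range(len(db_codes)),
        key=lambda i: hamming_distance(query_code, db_codes[i]),
    )


def average_precision(relevance):
    """AP of one ranked list; relevance[k] is 1 if the k-th result is relevant."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def precision_at_k(relevance, k):
    """Fraction of relevant items among the first k results."""
    top = relevance[:k]
    return sum(top) / len(top) if top else 0.0


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Mean of the per-query AP values under Hamming ranking."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        order = rank_by_hamming(code, db_codes)
        relevance = [int(db_labels[i] == label) for i in order]
        aps.append(average_precision(relevance))
    return sum(aps) / len(aps)
```

With 48-bit codes, `precision_at_k(relevance, 20)` would correspond to the top-20 comparison described above.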
Conclusion Existing deep hashing methods ignore the guiding role of deep image feature information in training the deep hash function and suffer from a large binary quantization error. To address these issues, this study proposes a self-supervised deep discrete hashing method named SSDDH. The deep feature matrix extracted by the convolutional neural network and the image label matrix are used to obtain binary hash codes, which serve as supervision information to guide the training of the deep hash function. The similarity among the hash codes generated by the deep hash function and the similarity between these hash codes and the binary hash codes are maintained by constructing a pair-wise loss function. The binary quantization error is effectively reduced using discrete cyclic coordinate descent. Comparison with several existing methods on three commonly used public datasets shows that this method is more effective than existing hash retrieval methods. Future work lies in three aspects. First, the focus will be on learning fine-grained representations more effectively. Second, semi-supervised regularization will be applied to the framework to make full use of unlabeled data. Both are expected to further boost image retrieval accuracy. Third, the current approach will be extended to cross-modal retrieval, for example, retrieving all semantically relevant images from the database given a text query.
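The conclusion names discrete cyclic coordinate descent but the abstract does not give its update rule. As an illustration of the general idea only, the sketch below assumes a per-sample objective of the form ||y - W^T b||^2 + nu * ||b - f||^2 with b restricted to {-1, +1}, which is common in discrete hashing formulations but is not taken from this paper. It updates one bit at a time by brute-force comparison rather than by the closed-form update a real implementation would use. The names `objective`, `cyclic_bit_descent`, `y`, `W`, `f`, and `nu` are hypothetical.

```python
def objective(b, y, W, f, nu):
    """Assumed objective ||y - W^T b||^2 + nu * ||b - f||^2 for one sample.

    b : binary code of length r with entries in {-1, +1}
    y : label vector of length c
    W : r x c projection matrix (list of rows)
    f : continuous network output of length r
    """
    r, c = len(W), len(y)
    fit = 0.0
    for j in range(c):
        pred = sum(W[k][j] * b[k] for k in range(r))
        fit += (y[j] - pred) ** 2
    quant = sum((bk - fk) ** 2 for bk, fk in zip(b, f))
    return fit + nu * quant


def cyclic_bit_descent(y, W, f, nu=1.0, max_sweeps=10):
    """Optimize the bits one at a time, cycling until no flip helps."""
    b = [1 if fk >= 0 else -1 for fk in f]  # start from sign(f)
    for _ in range(max_sweeps):
        changed = False
        for k in range(len(b)):
            current = objective(b, y, W, f, nu)
            b[k] = -b[k]  # try the other value for bit k
            if objective(b, y, W, f, nu) < current:
                changed = True  # keep the flip, it lowers the objective
            else:
                b[k] = -b[k]  # revert
        if not changed:
            break
    return b


# Toy problem: 3-bit code, 2-dimensional label vector.
y = [1.0, 0.0]
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
f = [0.2, -0.1, 0.3]
b_opt = cyclic_bit_descent(y, W, f, nu=0.1)
```

Because every accepted flip strictly lowers the objective and the code stays binary throughout, no separate quantization step is needed afterwards, which is the motivation for discrete optimization described above.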