Deep semantic hashing retrieval of remote sensing images
2019, Vol. 24, No. 4, pp. 655-663
Received: 2018-07-04,
Revised: 2018-08-10,
Published in print: 2019-04-24
DOI: 10.11834/jig.180420
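Objective
Hashing retrieval maps high-dimensional data in a massive data space to compact binary hash codes and rapidly computes the Hamming distance between any two codes with bitwise and XOR operations, which enables efficient similarity-preserving retrieval over big data. Remote sensing images, however, carry rich semantic information in addition to image features. Traditional hashing methods, which extract image features and generate hash codes from them, cannot make effective use of this semantic information, and this limits the accuracy of remote sensing image retrieval. To exploit the semantic information in remote sensing images, a retrieval method based on deep semantic hashing is proposed.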
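Method
First, on a training set of remote sensing images with multiple semantic labels, two deep convolutional networks with different configurations extract the image features and the semantic features of the images, respectively. The back-propagation algorithm is then used to learn the parameters of the deep networks from the two types of extracted features and to generate binary hash codes for the images. The generated binary hash codes effectively preserve the similarity of the original high-dimensional remote sensing images.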
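Result
Experiments are conducted on a GF-2 and Google Earth remote sensing image dataset, the CIFAR-10 dataset, and the FLICKR-25K dataset, and the method is compared with several other methods. With a code length of 64 bits, the mAP (mean average precision) improves over DPSH (deep supervised Hashing with pairwise labels) by approximately 2%, 6% to 7%, and 0.6% on the GF-2 and Google Earth dataset, the CIFAR-10 dataset, and the FLICKR-25K dataset, respectively.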
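Conclusion
The proposed end-to-end deep learning framework uses semantic features to effectively improve retrieval performance on datasets of remote sensing images that carry one or more semantic labels.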
Objective
Hashing methods, which map high-dimensional data to compact binary hash codes in Hamming space and rapidly calculate Hamming distances with bitwise and XOR operations, enable efficient similarity-preserving search and retrieval over big data. However, a massive number of remote sensing (RS) images are associated with semantic information. Traditional methods that extract image features and generate hash codes cannot effectively use this semantic information, which limits the accuracy of RS image retrieval. This study proposes an image retrieval method based on deep semantic hashing (DSH) that mines the semantic information of RS images with tags or other semantic annotations. Its contribution includes introducing hashing methods for RS images that encode high-dimensional image feature vectors into binary bits by using a limited number of labeled (annotated) images. Furthermore, DSH learns the discrete hash codes directly, without the relaxation that would otherwise deteriorate the accuracy of the learned codes. Hence, DSH provides efficient (in terms of storage and speed) and accurate search capability within huge data archives.
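The XOR-based Hamming distance mentioned above can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the function names `pack_codes` and `hamming_distances`, the byte-wise lookup table, and the toy 8-bit codes are assumptions chosen for illustration.

```python
import numpy as np

def pack_codes(bits: np.ndarray) -> np.ndarray:
    """Pack an (n, k) array of 0/1 bits into (n, ceil(k / 8)) uint8 bytes."""
    return np.packbits(bits.astype(np.uint8), axis=1)

# Lookup table: number of set bits in every possible byte value.
POPCOUNT = np.array([bin(v).count("1") for v in range(256)], dtype=np.uint8)

def hamming_distances(query: np.ndarray, database: np.ndarray) -> np.ndarray:
    """Hamming distance from one packed query code to each packed database code."""
    xor = np.bitwise_xor(database, query)   # bits that differ become 1
    return POPCOUNT[xor].sum(axis=1)        # count differing bits per code

# Toy example with 8-bit codes (hypothetical data).
db_bits = np.array([[0, 0, 0, 0, 0, 0, 0, 0],
                    [1, 1, 1, 1, 1, 1, 1, 1],
                    [1, 0, 1, 0, 1, 0, 1, 0]])
query_bits = np.array([[1, 0, 1, 0, 0, 0, 0, 0]])

distances = hamming_distances(pack_codes(query_bits), pack_codes(db_bits))
# distances are 2, 6, 2
```

Packing eight bits per byte is also what keeps storage small: a 64-bit code occupies 8 bytes per image.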
Method
The DSH model performs feature learning and hash code learning simultaneously in an end-to-end framework, which is organized into two main parts, namely feature learning and hashing learning. In feature learning, we use two deep neural networks, one for images and one for semantic annotations. The network for images is a convolutional neural network (CNN) adapted from vgg_net. In particular, its first seven layers are those of the vgg_16 network pretrained on ImageNet, and we replace the eighth layer with a fully-connected layer whose output is the learned image feature. The first seven layers use the rectified linear unit (ReLU) as the activation function, and the eighth layer uses the identity function. For semantic annotations, we feed semantic vectors into a deep neural network with two fully-connected layers, which use ReLU and the identity function, respectively, as activation functions. In hashing learning, let $f(x_i; \theta_x)$ denote the learned feature for image $x_i$, which corresponds to the output of the CNN for images, and let $g(y_j; \theta_y)$ denote the learned feature for semantic vector $y_j$, which corresponds to the output of the deep neural network for semantic vectors. Here, $\theta_x$ denotes the network parameters of the CNN for images, and $\theta_y$ denotes the network parameters of the deep neural network for semantic vectors. For the binary codes $B = \{b_i\}_{i=1}^{n}$, we then define the similarities through a likelihood function and an optimization objective, and learn the parameters through an alternating learning strategy, which updates one set of parameters while fixing the others.
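The abstract does not state the likelihood or the objective explicitly. As a hedged illustration only, the sketch below assumes the pairwise logistic likelihood commonly used by DPSH-style methods, with $\Theta_{ij} = \frac{1}{2} f(x_i; \theta_x)^{\top} g(y_j; \theta_y)$, and binarizes real-valued features by their sign. The names `pairwise_nll` and `binarize`, the similarity matrix `S`, and the toy features are assumptions, not the paper's definitions.

```python
import numpy as np

def pairwise_nll(F: np.ndarray, G: np.ndarray, S: np.ndarray) -> float:
    """Negative log-likelihood of pairwise similarities (assumed form).

    F: (n, k) image features, row i plays the role of f(x_i; theta_x)
    G: (n, k) semantic features, row j plays the role of g(y_j; theta_y)
    S: (n, n) 0/1 matrix, S[i, j] = 1 if image i and annotation j are similar
    """
    theta = 0.5 * F @ G.T
    # log(1 + exp(theta)), computed in a numerically stable way
    log_term = np.logaddexp(0.0, theta)
    return float(np.sum(log_term - S * theta))

def binarize(U: np.ndarray) -> np.ndarray:
    """Map real-valued features to hash codes in {-1, +1} by their sign."""
    return np.where(U >= 0, 1, -1)

# Toy features (hypothetical): two items with 2-dimensional features.
F = np.array([[2.0, 0.0], [0.0, 2.0]])
S = np.eye(2)

loss_aligned = pairwise_nll(F, F, S)    # semantic features agree with image features
loss_flipped = pairwise_nll(F, -F, S)   # semantic features contradict them
codes = binarize(F - 1.0)               # rows: [1, -1] and [-1, 1]
```

Under this assumed objective, an alternating scheme would update $\theta_x$ with $\theta_y$ and $B$ fixed, then $\theta_y$, then $B$; the loss is lower when similar pairs have features with a large inner product.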
Result
We conducted experiments on three archives. The first archive consists of 2 000 images acquired from the GF-2 satellite and Google Earth. Each image in the archive is a section of 224×224 pixels and is associated with several textual tags. In our experiments, we treat several similar tags as one semantic annotation. The second archive is the CIFAR-10 dataset, a single-label dataset consisting of 60 000 color images of 32×32 pixels, each of which belongs to one of ten classes. The third archive is the FLICKR-25K dataset, which consists of 25 000 images associated with several textual tags. As with the first archive, we treat several similar tags as one semantic annotation, and each image is a section of 224×224 pixels. With 64-bit hash codes, the proposed method improves the mean average precision (mAP) over DPSH (deep supervised Hashing with pairwise labels) by approximately 2% on the GF-2 and Google Earth remote sensing image dataset, by 6% to 7% on the CIFAR-10 dataset, and by approximately 0.6% on the FLICKR-25K dataset.
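The mAP figures above come from ranking database images by Hamming distance to each query. The sketch below shows one common way to compute mAP for hash codes in {-1, +1}. It assumes that two images count as relevant when they share at least one label, which the abstract does not specify, and the function names and toy data are illustrative only.

```python
import numpy as np

def average_precision(relevant) -> float:
    """AP for one query; `relevant` is a 0/1 sequence in ranked order."""
    relevant = np.asarray(relevant, dtype=float)
    n_rel = relevant.sum()
    if n_rel == 0:
        return 0.0
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = np.cumsum(relevant) / ranks
    return float((precision_at_k * relevant).sum() / n_rel)

def mean_average_precision(query_codes, db_codes, query_labels, db_labels) -> float:
    """mAP over all queries for {-1, +1} codes and multi-hot label vectors."""
    k = query_codes.shape[1]
    aps = []
    for q, q_label in zip(query_codes, query_labels):
        dist = 0.5 * (k - db_codes @ q)           # Hamming distance for ±1 codes
        order = np.argsort(dist, kind="stable")   # nearest codes first
        # Assumption: relevant means sharing at least one label with the query.
        relevant = (db_labels @ q_label > 0).astype(float)[order]
        aps.append(average_precision(relevant))
    return float(np.mean(aps))

# Toy example (hypothetical 4-bit codes and two labels).
query_codes = np.array([[1, 1, -1, -1]])
query_labels = np.array([[1, 0]])
db_codes = np.array([[1, 1, -1, -1], [-1, -1, 1, 1], [1, 1, 1, -1]])
db_labels = np.array([[1, 0], [0, 1], [1, 0]])

score = mean_average_precision(query_codes, db_codes, query_labels, db_labels)
# Both relevant images are ranked ahead of the irrelevant one, so score is 1.0
```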
Conclusion
In this study, we propose an end-to-end deep learning framework that combines visual and semantic image features and uses the semantic information to learn the hashing functions that generate the hash codes, thereby providing high accuracy for RS image retrieval. Experimental results show that the proposed method improves image retrieval accuracy. Notably, the archives used in the experiments are benchmarks composed of a moderate number of images, whereas in many actual applications, the search is expected to be applied to considerably larger archives.