Doubleconvpool-structured 3D-CNN for hyperspectral remote sensing image classification
2019, Vol. 24, No. 4, pp. 639-654
Received: 2018-07-04
Revised: 2018-10-07
Published in print: 2019-04-16
DOI: 10.11834/jig.180422
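Objective
Hyperspectral remote sensing image data contain rich spatial and spectral information. However, the high dimensionality of the signal, information redundancy, multiple sources of uncertainty, and the phenomena of the same land-cover object exhibiting different spectra and different objects exhibiting the same spectrum make the structure of hyperspectral data highly nonlinear. A 3D-CNN (3D convolutional neural network) can exploit the data-cube nature of hyperspectral remote sensing imagery to fuse spectral and spatial information and extract the discriminative features that matter for image classification. To this end, a 3D-CNN hyperspectral remote sensing image classification method based on a doubleconvpool (double convolution pooling) structure is proposed.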
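Method
The doubleconvpool structure consists of two convolution layers, two BN (batch normalization) layers, and one pooling layer. It takes into account both the scarcity of labeled data for hyperspectral remote sensing images and the balance between the high dimensionality of hyperspectral images and model depth. The model makes full use of the semantic information provided jointly by the spatial and spectral domains, which helps extract features from hyperspectral images characterized by small samples and high dimensionality. The 3D-CNN based on the doubleconvpool structure takes the 3D remote sensing image, without any feature processing, as input, and the resulting deep learning classifier is trained end to end without complex preprocessing. In addition, the model uses regularization strategies such as BN and Dropout to avoid overfitting.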
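Result
The experiments compare the model with SVM (support vector machine), SAE (stacked autoencoder), and current mainstream CNN methods. The model achieves overall classification accuracies of up to 99.65% and 99.82% on the Indian Pines and Pavia University datasets, respectively, effectively improving the land-cover classification accuracy of hyperspectral remote sensing images.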
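Conclusion
The influence of six factors on the experimental results is discussed: the number of doubleconvpool structures, the regularization strategy, the spectral sampling stride of the first convolution layer, the convolution kernel size, the size of the neighboring pixel block, and the learning rate. The proposed doubleconvpool structure can be combined and reused according to the characteristics of the dataset, and compared with other deep learning models it requires fewer parameters and is more computationally efficient.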
Objective
Hyperspectral remote sensing image data are rich in spatial and spectral information. Continuous spectral band information enhances the capability to distinguish between ground objects, and it has been widely used in image classification, target detection, agricultural monitoring, and environmental management. However, the data structure of hyperspectral remote sensing images is highly nonlinear owing to the high dimensionality of the signal, information redundancy, and multiple uncertainties. Classification models based on statistical patterns have difficulty classifying and recognizing the original hyperspectral data directly. Training samples for supervised learning are extremely limited, and with few training samples the Hughes phenomenon occurs, that is, the classification accuracy decreases as the feature dimension increases. Traditional pixel-level hyperspectral remote sensing image classification methods mostly adopt a framework of feature extraction followed by a classifier. For feature extraction, a series of spectral dimensionality reduction methods have been proposed to handle the high spectral dimensionality of hyperspectral data. However, these methods cannot solve the nonlinearity problem of hyperspectral data. Some methods use only spectral information and thus neglect the rich spatial structural information of high-resolution images. Their classification results often contain many isolated points, and the classification accuracy is greatly reduced. Therefore, introducing spatial information is necessary. In recent years, image classification based on deep learning has become a research hotspot. In comparison with traditional hand-crafted features, deep learning can automatically extract abstract features from low-level to high-level semantics and convert images into easily recognizable high-level features. At present, mainstream methods feed the image into a 2D-CNN (2D convolutional neural network) after PCA (principal component analysis) dimensionality reduction and fuse the result with spectral information at a later stage to extract spatial-spectral information. However, these methods extract spatial and spectral information separately, do not take advantage of the joint spatial-spectral information, and require complex preprocessing. Alternatively, a 3D-CNN can extract spatial-spectral information simultaneously. The 3D-CNN utilizes the data-cube characteristics of hyperspectral remote sensing images to fully fuse spectral and spatial information. It extracts the discriminative features that are important for classification and effectively addresses the problem of spatial homogeneity and heterogeneity. Therefore, the use of 3D-CNNs for spatial and spectral information extraction from hyperspectral remote sensing images has become a development trend in image classification. However, such methods simply stack CNN layers, do not fully exploit the strengths of the 3D-CNN, and have low model scalability. This study proposes a 3D-CNN model based on a doubleconvpool structure.
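To make the idea of joint spatial-spectral feature extraction concrete, the following is a minimal sketch of a single-channel 3D convolution written in plain Python. It is an illustration only, not the network described in this paper: the function name `conv3d_valid`, the toy cube, the averaging kernel, and the absence of padding are all assumptions chosen for readability.

```python
def conv3d_valid(cube, kernel, spectral_stride=1):
    """Naive single-channel 3D convolution (cross-correlation, no padding).

    cube and kernel are nested lists indexed [band][row][col].
    spectral_stride is the step of the kernel along the band axis.
    """
    bands, height, width = len(cube), len(cube[0]), len(cube[0][0])
    kb, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for b in range(0, bands - kb + 1, spectral_stride):
        plane = []
        for r in range(height - kh + 1):
            row = []
            for c in range(width - kw + 1):
                acc = 0.0
                # One output value mixes neighbouring bands AND neighbouring pixels.
                for db in range(kb):
                    for dr in range(kh):
                        for dc in range(kw):
                            acc += cube[b + db][r + dr][c + dc] * kernel[db][dr][dc]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy cube: 5 bands of 4x4 pixels, every pixel equal to its band index.
cube = [[[float(b)] * 4 for _ in range(4)] for b in range(5)]
# Hypothetical 3x3x3 averaging kernel.
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]

features = conv3d_valid(cube, kernel)
print(len(features), len(features[0]), len(features[0][0]))
```

Because the kernel extends along the band axis as well as the two spatial axes, each output value depends on a small spectral-spatial neighbourhood, which is the property a 2D-CNN applied band by band does not have.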
Method
The doubleconvpool structure includes two convolution layers, two BN (batch normalization) layers, and one pooling layer. It considers not only the lack of labeled data for hyperspectral images but also the balance between the high dimensionality of hyperspectral images and model depth. In contrast to approaches that use only spatial or only spectral information, the model fully uses the semantic information provided jointly by the spatial and spectral domains, thereby facilitating feature extraction from hyperspectral images with small samples and high dimensionality. In a 3D-CNN based on the doubleconvpool structure, the 3D remote sensing image is used as input without feature engineering, and the deep learning model is trained end to end without complicated preprocessing. Moreover, the model uses regularization strategies, such as BN and Dropout, to avoid overfitting. We use the doubleconvpool structure as a standard component of the network, and the number of such structures is an important hyperparameter of the network. For images with different data characteristics, acceptable classification results are achieved by choosing the number of doubleconvpool structures appropriately. When applied to different datasets, the proposed method avoids having to reset the network hyperparameters and substantially modify the network parameters. It also enhances the scalability of the network.
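The role of the number of doubleconvpool structures as a hyperparameter can be sketched by tracing tensor shapes and parameter counts through a stack of blocks. The abstract does not give layer orderings, kernel sizes, channel widths, or padding, so everything below is an assumption made for illustration: a conv-BN-conv-BN-pool ordering, 3x3x3 kernels with "same" padding, 2x2x2 pooling, channel widths that double per block, and the hypothetical helper names `conv_out`, `doubleconvpool_trace`, and `stack`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def doubleconvpool_trace(in_shape, in_ch, out_ch, kernel=3, pool=2):
    """Trace one assumed conv-BN-conv-BN-pool block.

    in_shape is (bands, height, width). Returns (out_shape, out_ch, params).
    """
    pad = kernel // 2  # assumed 'same' padding: only pooling shrinks the cube
    shape, ch, params = in_shape, in_ch, 0
    for _ in range(2):
        shape = tuple(conv_out(s, kernel, 1, pad) for s in shape)
        params += kernel ** 3 * ch * out_ch + out_ch  # conv weights + biases
        params += 2 * out_ch                          # BN scale and shift
        ch = out_ch
    shape = tuple(s // pool for s in shape)           # non-overlapping pooling
    return shape, ch, params


def stack(in_shape, n_blocks, base_ch=8):
    """Stack n_blocks doubleconvpool blocks, doubling channels each block."""
    shape, ch, total = in_shape, 1, 0
    for i in range(n_blocks):
        shape, ch, params = doubleconvpool_trace(shape, ch, base_ch * 2 ** i)
        total += params
    return shape, ch, total


# Hypothetical input cube: 32 bands, 8x8 neighbouring pixel block.
for n in (1, 2):
    print(n, stack((32, 8, 8), n))
```

Under these assumptions, each added block halves every axis of the cube, so the input size bounds how many blocks can be stacked; this is one way to see why the block count has to be chosen per dataset.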
Result
The experiments compare the model with SVM (support vector machine), SAE (stacked autoencoder), and current mainstream CNN methods. The model achieves overall classification accuracies of 99.65% and 99.82% on the Indian Pines and Pavia University datasets, respectively, which effectively improves the classification accuracy of hyperspectral remote sensing images. We analyze and discuss the number of doubleconvpool structures, the regularization strategy, the spectral sampling stride of the first-layer convolution, the size of the convolution kernel, the size of neighboring pixel blocks, and the learning rate to provide a reasonable model under different constraints, such as training time and computational cost.
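For reference, overall classification accuracy is conventionally the fraction of test samples assigned to their true class. The sketch below computes it from a confusion matrix; the function name `overall_accuracy` and the 3-class matrix are hypothetical and are not results from this paper.

```python
def overall_accuracy(confusion):
    """Overall accuracy = correctly classified samples / all samples.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class confusion matrix, for illustration only.
cm = [
    [50, 0, 0],
    [1, 47, 2],
    [0, 1, 49],
]
print(f"OA = {overall_accuracy(cm):.2%}")
```

Overall accuracy weights every sample equally, so on datasets with strongly imbalanced classes it is dominated by the large classes.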
Conclusion
The doubleconvpool structure can be combined and reused according to the characteristics of the dataset. In comparison with other deep learning models, it requires fewer parameters and has higher computational efficiency. The results further illustrate the application potential of deep learning, particularly 3D-CNNs, for hyperspectral images.