RMFS-CNN: A New Deep Learning Framework for Remote Sensing Image Classification
Vol. 26, No. 2, 2021, pp. 297-304
Received: 2020-07-19; Revised: 2020-09-15; Accepted: 2020-09-22; Published in print: 2021-02-16
DOI: 10.11834/jig.200397
Existing convolutional neural networks (CNNs) stack convolutional layers and activation functions to construct complex nonlinear functions that fit the mapping from input data to output labels. This end-to-end learning paradigm severely hinders the fusion of CNN feature maps with prior knowledge, makes CNNs sensitive to the quantity and quality of training samples, and increases the difficulty of interpreting CNN feature maps. Starting from the modeling perspective of deep learning, and taking remote sensing image feature representation and its interpretability as the entry point, this paper builds a bridge between traditional remote sensing prior knowledge and CNNs, and analyzes how the Riemannian manifold feature space (RMFS) promotes CNN interpretability and reveals feature evolution rules. A new remote sensing image classification framework, RMFS-CNN, is proposed by fusing CNNs with RMFS. With RMFS as a transitional feature platform, the linear distribution of features in RMFS reduces the difficulty for CNNs of learning traditional image features on the one hand; on the other hand, expression paradigms that highlight image prior knowledge are defined to improve the ability of CNNs to learn interpretable features, so that the excellent capability of RMFS for expressing prior knowledge (features) can be exploited to improve the efficiency with which CNNs use features for remote sensing image classification. On the basis of the RMFS feature expression paradigm, loss functions that control the feature-learning preference of CNNs are defined, leading to CNN classification models with good feature interpretability and controllable training methods. Finally, the feasibility of constructing the RMFS-CNN classification framework is discussed, together with its theoretical contributions and application value for remote sensing image classification and the development of deep learning theory.
Traditional convolutional neural networks (CNNs) use convolutional layers and activation functions to achieve a nonlinear transformation from input images to output labels. This end-to-end training method is convenient, but it seriously hinders the introduction of prior knowledge about remote sensing images, leading to a high dependency on the quality and quantity of training samples. The trained parameters of a CNN are used to extract features from input images; however, these features cannot be interpreted. That is, the learning process and the learned features are uninterpretable, which further increases the dependency on training samples. Restricted by the end-to-end training method, traditional CNNs can only learn general features from the training set, and these general features are difficult to transfer to another training set. At present, CNNs can be used for multiple tasks if the model is retrained on a target training set; however, improving training accuracy on a finite training set is extremely difficult. Traditional CNNs cannot correlate the features contained in the input data with the requirements of specific applications. In addition, the loss functions available for specific applications are limited, and some of them can only describe the difference between the predicted results and the corresponding labels. In such cases, the network sacrifices the disadvantaged classes to ensure a global optimum, resulting in the loss of detailed information.

CNNs construct a complex nonlinear function to transform input images into output labels. The features learned by CNNs cannot be understood and are also difficult to merge with other features in an explainable manner. By contrast, artificial features can reflect particular aspects of the information in an image, and the information they contain is meaningful, i.e., it applies to most images. Artificial features can be considered prior knowledge that describes an empirical understanding of images, but they cannot fully express the information contained in an image. Consequently, combining the advantages of CNNs and prior knowledge is an efficient way to learn essential features from images. The Riemannian manifold feature space (RMFS) exhibits a powerful feature expression capability, through which the spectral and spatial features of an image can be unified.

To benefit from both CNNs and RMFS, this study analyzes the contribution of RMFS to the interpretability of CNNs and to the corresponding evolution of image features from the perspectives of CNN modeling and remote sensing image feature representation. Then, an RMFS-CNN classification framework is proposed to bridge the gap between CNNs and prior knowledge of remote sensing images. First, this study proposes using CNNs instead of traditional mathematical transformations to map the original remote sensing image onto points in RMFS. Mapping via CNNs can overcome the effects of neighborhood sizes and modeling methods, improving the feature expression capability of RMFS. Second, the features learned via RMFS-CNN can be customized in RMFS to highlight specific information that can benefit particular applications. Furthermore, the customized features can also be used to design a rule-driven data perceptron on the basis of their interpretability and evolution. Finally, new RMFS-CNN models based on the rule-driven data perceptron can be proposed. Considering the feature expression capability of RMFS, the proposed RMFS-CNN models are expected to outperform traditional models in terms of learning capability and the stability of the learned features. New loss functions, which can control the training process of RMFS-CNN models, can be developed by combining the customized features in RMFS.

In general, the proposed RMFS-CNN framework can bridge the gap between remote sensing prior knowledge and CNN models. Its advantages are as follows. 1) Points in RMFS are interpretable owing to the excellent feature expression capability of RMFS and the one-to-one correspondence between points in RMFS and pixels in the image domain. Therefore, RMFS can connect remote sensing prior knowledge with the learning capability of CNNs. Using CNNs to learn specific information from remote sensing prior knowledge is efficient on the one hand and ensures the stability of the learned features on the other; consequently, the dependency of CNNs on the quality and quantity of training samples can be reduced. 2) Points in RMFS contain the spectral features of the corresponding pixels and the spatial connections in their neighborhood system. Pixels representing the same object in the image domain follow a linear distribution when mapped onto RMFS. On the basis of these characteristics, RMFS can provide a platform for the interpretable features of remote sensing images. Under the premise of knowing the physical meaning and corresponding distribution of remote sensing images in RMFS, data-driven convolution can be converted into a rule-driven data perceptron to improve the learning capability of RMFS-CNN models, and the learning process and the learned features can be interpreted through the rule-driven data perceptron. 3) RMFS exhibits another interesting distribution characteristic: data points that represent the main body of an object form a linear distribution, whereas data points that represent the edge of the object are randomly distributed in areas far from this linear distribution. This characteristic enables RMFS to express different features of an object separately. Accordingly, features conducive to particular applications can be customized in RMFS and then abstracted by the rule-driven data perceptron. With this feature customization capability, RMFS-CNN models can be refined in accordance with their input data and applications. 4) The RMFS-CNN framework can express the interpretable features of remote sensing images, and these features can be customized to adapt to the input data and the corresponding applications. The customized features contain information useful for a given application, which can be used to define a constraint on the loss function that controls the training process of RMFS-CNN models. Given that the constraint forces the network to learn features beneficial for the target application, two advantages follow: learning favorable features improves the training accuracy of the network on the one hand, and the interpretability of the learned features is maintained on the other. Consequently, the trained network is easier to transfer than a traditional CNN.
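As a concrete illustration of mapping pixels onto points in a manifold feature space, the sketch below uses region covariance matrices, a standard Riemannian (symmetric positive-definite, SPD) representation from the wider manifold-learning literature. This is only an illustrative assumption, not the paper's actual RMFS construction: each pixel's neighborhood is summarized by the covariance of its band values, which is a point on the SPD manifold, and the matrix logarithm flattens that manifold into a tangent space where linear-distribution checks become ordinary algebra.

```python
import numpy as np

def pixel_to_spd_point(image, row, col, radius=1, eps=1e-6):
    """Map one pixel onto an SPD point by taking the covariance of the
    spectral vectors in its neighborhood. `image` has shape
    (height, width, bands)."""
    h, w, bands = image.shape
    r0, r1 = max(0, row - radius), min(h, row + radius + 1)
    c0, c1 = max(0, col - radius), min(w, col + radius + 1)
    patch = image[r0:r1, c0:c1].reshape(-1, bands)
    cov = np.cov(patch, rowvar=False)
    # Regularize so the matrix is strictly positive-definite.
    return cov + eps * np.eye(bands)

def log_map(spd):
    """Matrix logarithm via eigendecomposition: sends an SPD point to
    the tangent space, where Euclidean operations are meaningful."""
    vals, vecs = np.linalg.eigh(spd)
    return vecs @ np.diag(np.log(vals)) @ vecs.T
```

In this tangent-space view, the claim that same-object pixels follow a linear distribution can be tested with ordinary least squares or PCA on the flattened points.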
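The constrained loss described in advantage 4) can be sketched as follows. This is a minimal illustration, not the framework's actual definition: `rmfs_constraint` is a hypothetical regularizer that penalizes deviation of a batch of feature points from their best-fit line, standing in for the linear-distribution property that same-object pixels exhibit in RMFS.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Standard categorical cross-entropy over a batch of
    predicted class probabilities and integer labels."""
    n = probs.shape[0]
    return -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

def rmfs_constraint(features):
    """Hypothetical constraint: mean squared distance of each feature
    vector to the principal line through the batch, encouraging the
    linear distribution expected of same-object points in RMFS."""
    centered = features - features.mean(axis=0)
    # Principal direction via SVD; residuals measure distance to the line.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    residual = centered - centered @ vt[0][:, None] @ vt[0][None, :]
    return np.mean(np.sum(residual ** 2, axis=1))

def rmfs_loss(probs, labels, features, lam=0.1):
    """Composite loss: data term plus weighted RMFS feature constraint."""
    return cross_entropy(probs, labels) + lam * rmfs_constraint(features)
```

The weight `lam` controls the learning preference: larger values push the network toward features that respect the assumed RMFS distribution, smaller values recover the ordinary label-fitting loss.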