Remote sensing image classification based on deep learning and conditional random fields

Xia Meng; Cao Guo; Wang Guangya; Shang Yanfeng

doi:10.11834/jig.170122

Remote Sensing Image Processing

2017, Vol. 22, No. 9, pp. 1289-1301
Published online: 2017-08-25

Published in print: 2017

Xia Meng, Cao Guo, Wang Guangya, Shang Yanfeng. Remote sensing image classification based on deep learning and conditional random fields[J]. Journal of Image and Graphics, 2017, 22(9): 1289-1301. DOI: 10.11834/jig.170122.

Summary

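To further improve the classification accuracy of remote sensing imagery, a new classification method is proposed that combines two models, the convolutional neural network (CNN) and the conditional random field (CRF). First, a CNN preclassifies the remote sensing image, and its class membership probabilities are defined as the unary potential of the CRF model. Next, the pairwise potential of the CRF model is defined as a linear combination of Gaussian kernels, with a fully connected neighborhood structure replacing the common 4-neighborhood or 8-neighborhood. A regional constraint is then added: superpixels are obtained with the Mean-shift segmentation method, and the classification result of each pixel is corrected using the mean posterior probability of its superpixel, which encourages consistent results within connected regions. Finally, the mean field approximation algorithm is used for inference over the whole model. Three groups of high-resolution remote sensing images are used for land cover classification experiments. The proposed method not only suppresses more classification noise but also alleviates oversmoothing and preserves the edge information of each land cover type. Four indicators, namely class accuracy, overall classification accuracy (OA), average classification accuracy (AA), and the Kappa coefficient, are used for quantitative analysis. Compared with support vector machine (SVM), CNN, and fully connected CRF, all final accuracies are significantly improved: AA increases by 3.28 percentage points, OA by 3.22 percentage points, and Kappa by 5.07 percentage points. Fusing the CNN and CRF models yields essential pixel features while also taking the spatial contextual information of the image into account, which makes classification more accurate, and the subsequently added constraint further preserves the local information of land cover objects. The proposed method is suitable for remote sensing image classification and is an accurate and effective classification method.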
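The inference and regional-constraint steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single Gaussian kernel instead of a linear combination, a Potts label compatibility, a naive O(N^2) update that is only practical for tiny inputs, and a regional constraint applied once after inference. The function names `mean_field_dense_crf` and `regional_restriction` and the parameters `w`, `theta`, and `n_iter` are hypothetical, and the CNN probabilities and superpixel labels are taken as given inputs.

```python
import numpy as np

def mean_field_dense_crf(unary_prob, feats, w=1.0, theta=1.0, n_iter=5):
    """Naive mean-field inference for a fully connected CRF (illustrative only).

    unary_prob: (N, K) class membership probabilities, e.g. CNN softmax output.
    feats:      (N, D) per-pixel features (e.g. position, color) for the kernel.
    Assumption: one Gaussian kernel and a Potts compatibility function.
    """
    unary = -np.log(np.clip(unary_prob, 1e-12, None))        # unary potential
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    kernel = w * np.exp(-d2 / (2.0 * theta ** 2))             # all pixel pairs
    np.fill_diagonal(kernel, 0.0)                             # no self-interaction
    q = unary_prob / unary_prob.sum(axis=1, keepdims=True)    # initial marginals
    for _ in range(n_iter):
        msg = kernel @ q                                      # (N, K) label support
        # Potts model: label l is penalized by the support for every other label.
        pairwise = msg.sum(axis=1, keepdims=True) - msg
        logit = -unary - pairwise
        logit -= logit.max(axis=1, keepdims=True)             # numerical stability
        q = np.exp(logit)
        q /= q.sum(axis=1, keepdims=True)
    return q

def regional_restriction(q, segments):
    """Replace each pixel's posterior with the mean posterior of its superpixel.

    q:        (N, K) per-pixel posterior probabilities.
    segments: (N,) superpixel label per pixel (e.g. from mean shift segmentation).
    """
    out = np.empty_like(q)
    for s in np.unique(segments):
        mask = segments == s
        out[mask] = q[mask].mean(axis=0)
    return out
```

For an image, the inputs would be flattened to one row per pixel, and the final label map would be `regional_restriction(q, segments).argmax(axis=1)` reshaped to the image size. Practical fully connected CRF implementations replace the dense kernel matrix with fast high-dimensional filtering, which this sketch omits.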

Abstract

Remote sensing image classification refers to the use of computers to analyze the spectral and spatial information of various land cover objects in remote sensing images, divide the feature space into non-overlapping subspaces, and place each pixel into a specific subspace. In computer vision, this procedure aims to assign a predefined semantic label to each pixel in an image and is also called "semantic segmentation." The rapid development of computer application technology, aerospace, and sensor technology in recent years has resulted in numerous methods for acquiring different types of remote sensing image data. As an important aspect of remote sensing technology, the classification of high-resolution remote sensing imagery has gained considerable attention. A novel image classification method is proposed in this study. This method is based on a fully connected conditional random field (CRF) model, which is combined with a convolutional neural network (CNN). These two models are merged to utilize their respective advantages and further improve classification accuracy for remote sensing images.

On the one hand, most traditional classification methods rely on human experience to extract the characteristics of training samples. After learning, a single-layer feature without a hierarchical structure is obtained. These methods generally have shallow structures, and the features they produce are relatively simple. By contrast, as a new research direction in the field of machine learning, deep learning can transform the feature representation of training samples from the original space into a new feature space layer by layer and learn to automatically yield a hierarchical feature representation, which is conducive to classification and feature visualization. In recent years, deep learning has achieved significant breakthroughs in computer vision applications, such as visual recognition challenges, image classification, and object detection. As one of its representatives, CNN has been widely used in pattern recognition to avoid the complex preprocessing of images. We use CNN in this study in place of traditional classification methods to obtain essential features of the input image.

On the other hand, traditional classification methods are based on the spectral statistical characteristics of pixels. These methods are also known as pixel-wise classification methods. They analyze the spectral information of each pixel individually by using a statistical learning algorithm, such as support vector machine (SVM), maximum likelihood classification, minimum distance method, decision tree, and k-means clustering. These methods typically produce high classification errors and results with low accuracies because they do not consider the rich spatial contextual information of images. We draw support from the probabilistic graphical model, which is one of the research hot spots in machine learning and pattern recognition, to solve this problem. With this model, researchers can use not only Bayesian probability theory to solve the problem but also mature graph theory to deal with contextual information. As an excellent representative of probabilistic graphical models, the CRF model for 1D sequence data processing was proposed by Lafferty in 2001. This model can incorporate spatial contextual information in both the labels and the observed data. Its distinguishing feature is that it models the posterior distribution directly and flexibly. The early CRF model was mainly used in natural language processing and speech recognition, and it was then successfully applied to image processing by Kumar and Hebert in 2003. Although considerable research has been conducted on CRF models, the conventional CRF still exhibits oversmoothing problems. Therefore, we add a regional restriction (RR) to enhance the consistency of the classification results in connected areas and protect the edge structure of land cover objects.

In summary, the steps of our proposed method are as follows. We preclassify the entire remote sensing image into land cover types via CNN and use the resulting class membership probabilities as the unary potential in the CRF model. The pairwise potential of the CRF is defined by a linear combination of Gaussian kernels, which forms a fully connected neighbor structure instead of the common four-neighbor or eight-neighbor structure. RR is also incorporated into the framework to promote the consistency of connected areas. We use the mean shift algorithm to obtain superpixels and correct the classification results by calculating their average posterior probabilities. A highly efficient approximate inference algorithm, namely, mean field inference, is used for the final model.

Our experimental results, which are based on three different remote sensing images, demonstrate that the proposed classification framework exhibits competitive quantitative and qualitative performance: it effectively alleviates salt-and-pepper classification noise, reduces the oversmoothing phenomenon, and protects the edge structure of land cover objects. The experiments use class accuracy, overall classification accuracy (OA), average classification accuracy (AA), and the kappa coefficient for quantitative analysis. Compared with those of SVM, CNN, and fully connected CRF, the final accuracies of our method are significantly improved. AA is increased by 3.28 percentage points, OA is increased by 3.22 percentage points, and the kappa coefficient is increased by 5.07 percentage points.

Traditional classification methods have two shortcomings. The first is insufficient feature extraction, which leads to inaccurate classification results. The second is that pixel-based methods only consider the information of single points and disregard the mutual influence of surrounding points. The combination of CNN and CRF not only obtains the essential characteristics of pixels but also considers the contextual information of an image. Therefore, our method can achieve accurate classification results. Moreover, the integration of RR can protect the edge structure of land cover objects to yield a satisfactory classification performance. The proposed method is accurate and effective, and it can be used in remote sensing image classification.
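The four indicators used in the quantitative analysis can be computed from a confusion matrix, as in the following minimal sketch. It assumes that "class accuracy" means the per-class fraction of correctly labeled reference samples, which the abstract does not specify, and that rows index the true class and columns the predicted class. The function name `classification_metrics` is hypothetical.

```python
import numpy as np

def classification_metrics(conf):
    """Return (class accuracy, OA, AA, kappa) from a confusion matrix.

    conf[i, j] = number of samples of true class i predicted as class j.
    Assumption: class accuracy is the per-class recall (diagonal / row sum).
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    class_acc = np.diag(conf) / conf.sum(axis=1)    # per-class accuracy
    oa = np.trace(conf) / total                     # overall accuracy
    aa = class_acc.mean()                           # average of class accuracies
    # Agreement expected by chance, from the row and column marginals.
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    return class_acc, oa, aa, kappa
```

Because OA weights every sample equally while AA weights every class equally, the two can diverge when class sizes are unbalanced. The kappa coefficient additionally discounts agreement expected by chance.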