Remote sensing image classification based on deep learning and conditional random fields
2017, Vol. 22, No. 9, pp. 1289-1301
Published online: 2017-08-25
Published in print: 2017
DOI: 10.11834/jig.170122
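To further improve the classification accuracy of remote sensing imagery, a new classification method is proposed that combines two models, the convolutional neural network (CNN) and the conditional random field (CRF). First, a CNN preclassifies the remote sensing image, and its class membership probabilities are defined as the unary potential of the CRF model. Next, the pairwise potential of the CRF model is defined as a linear combination of Gaussian kernels, with a fully connected neighborhood structure replacing the common 4-neighborhood or 8-neighborhood. A regional constraint is then added: superpixels are obtained with Mean-shift segmentation, and the classification result of each pixel is corrected using the mean posterior probability of its superpixel, which encourages consistent results within connected regions. Finally, the mean field approximation algorithm is used for inference over the whole model. Land cover classification experiments are conducted on three groups of high-resolution remote sensing images. The proposed method suppresses more classification noise while also reducing oversmoothing and preserving the edge information of land cover objects. Four metrics are used for quantitative analysis: class accuracy, overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. Compared with support vector machine (SVM), CNN, and fully connected CRF, all final accuracies improve significantly: AA by 3.28 percentage points, OA by 3.22 percentage points, and Kappa by 5.07 percentage points. Fusing the CNN and CRF models yields essential pixel features while also accounting for the spatial contextual information of the image, which makes the classification more accurate; the constraint added afterwards further preserves the local information of land cover objects. The proposed method is suited to remote sensing image classification and is accurate and effective.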
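The method steps above (CNN class probabilities as the unary potential, Gaussian kernels as the pairwise potential, superpixel averaging, and mean field inference) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Potts label compatibility, the kernel features (position and intensity), the bandwidths `theta`, the kernel weights, and the choice to apply the regional restriction once after inference are all assumptions, and the superpixel map is supplied by hand rather than computed with mean shift. The dense N x N kernel is only practical for toy inputs.

```python
import numpy as np

def gaussian_kernel(features, theta):
    """Dense N x N Gaussian affinity between per-pixel feature vectors."""
    diff = features[:, None, :] - features[None, :, :]
    k = np.exp(-0.5 * np.sum((diff / theta) ** 2, axis=-1))
    np.fill_diagonal(k, 0.0)  # a pixel sends no message to itself
    return k

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_field(probs, kernels, weights, n_iters=5):
    """Approximate marginals Q (N, C) from class probabilities probs (N, C)."""
    unary = -np.log(np.clip(probs, 1e-8, 1.0))  # unary potential = -log P
    q = probs.copy()                            # initialise Q from the unaries
    for _ in range(n_iters):
        # Message passing: weighted sum of kernel-filtered marginals.
        msg = sum(w * (k @ q) for w, k in zip(weights, kernels))
        # Potts compatibility (assumed): label l is penalised by the mass
        # that the other pixels place on every label except l.
        pairwise = msg.sum(axis=1, keepdims=True) - msg
        q = softmax(-unary - pairwise)
    return q

def regional_restriction(q, segments):
    """Replace each pixel's marginal by the mean marginal of its superpixel."""
    out = np.empty_like(q)
    for s in np.unique(segments):
        mask = segments == s
        out[mask] = q[mask].mean(axis=0)
    return out

# Toy 1D "image": six pixels, two classes, one noisy prediction at pixel 1.
pos = np.arange(6, dtype=float)
intensity = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.9, 0.1],
                  [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]])  # stand-in for CNN output

appearance = gaussian_kernel(np.stack([pos, intensity], axis=1),
                             theta=np.array([3.0, 0.2]))
smoothness = gaussian_kernel(pos[:, None], theta=1.0)

q = mean_field(probs, [appearance, smoothness], weights=[2.0, 1.0])
segments = np.array([0, 0, 0, 1, 1, 1])  # stand-in for mean shift superpixels
q_rr = regional_restriction(q, segments)

print("CNN labels:     ", probs.argmax(axis=1))
print("CRF + RR labels:", q_rr.argmax(axis=1))
```

In this toy case the isolated misclassified pixel is pulled back to the label of its homogeneous surroundings, while the boundary between the two intensity regions stays in place because the appearance kernel passes almost no message across it.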
Remote sensing image classification refers to the use of computers to analyze the spectral and spatial information of the various land cover objects in remote sensing images, divide the feature space into non-overlapping subspaces, and assign each pixel to a specific subspace. In computer vision, this procedure aims to assign a predefined semantic label to each pixel in an image and is also called "semantic segmentation." The rapid development of computer application technology, aerospace, and sensor technology in recent years has produced numerous methods for acquiring different types of remote sensing image data. As an important aspect of remote sensing technology, the classification of high-resolution remote sensing imagery has gained considerable attention. This study proposes a novel image classification method based on a fully connected conditional random field (CRF) model combined with a convolutional neural network (CNN). The two models are merged to exploit their respective advantages and further improve the classification accuracy for remote sensing images.

On the one hand, most traditional classification methods rely on human experience to extract the characteristics of training samples, and learning yields a single-layer feature without a hierarchical structure. These methods generally have shallow structures, and the features they produce are relatively simple. By contrast, deep learning, a new research direction in machine learning, transforms the feature representation of training samples from the original space into a new feature space layer by layer and automatically learns a hierarchical feature representation, which is conducive to classification and feature visualization. In recent years, deep learning has achieved significant breakthroughs in computer vision applications such as visual recognition challenges, image classification, and object detection. As one of its representatives, CNN has been widely used in pattern recognition because it avoids complex image preprocessing. In this study, we use CNN in place of traditional classification methods to obtain the essential features of the input image.

On the other hand, traditional classification methods are based on the spectral statistical characteristics of pixels and are also known as pixel-wise classification methods. They analyze the spectral information of each pixel individually by using a statistical learning algorithm, such as support vector machine (SVM), maximum likelihood classification, the minimum distance method, decision trees, and k-means clustering. These methods typically produce high classification errors and low accuracies because they do not consider the rich spatial contextual information of images. To solve this problem, we draw on the probabilistic graphical model, which is a research hot spot in machine learning and pattern recognition. With this model, researchers can use not only Bayesian probability theory to solve the problem but also mature graph theory to handle contextual information. As a prominent representative of probabilistic graphical models, the CRF model for 1D sequence data processing was proposed by Lafferty et al. in 2001. This model can incorporate spatial contextual information from both the labels and the observed data, and its distinguishing feature is that it models the posterior distribution directly. The early CRF model was mainly used in natural language processing and speech recognition, and it was successfully applied to image processing by Kumar and Hebert in 2003. Although considerable research has been conducted on CRF models, the conventional CRF still exhibits oversmoothing problems. Therefore, we add a regional restriction (RR) that enhances the consistency of the classification results in connected areas and protects the edge structure of land cover objects.

In summary, the steps of our proposed method are as follows. First, we preclassify the entire remote sensing image into land cover types via CNN and use the resulting class membership probabilities as the unary potential in the CRF model. Second, the pairwise potential of the CRF is defined by a linear combination of Gaussian kernels, which forms a fully connected neighborhood structure instead of the common four-neighbor or eight-neighbor structure. Third, RR is incorporated into the framework to promote the consistency of connected areas: we use the mean shift algorithm to obtain superpixels and correct the classification results by calculating their average posterior probabilities. Finally, a highly efficient approximate inference algorithm, namely mean field inference, is used for the final model.

Our experimental results, which are based on three different remote sensing images, demonstrate that the proposed classification framework exhibits competitive quantitative and qualitative performance: it effectively alleviates salt-and-pepper classification noise, reduces oversmoothing, and protects the edge structure of land cover objects. The quantitative analysis uses class accuracy, overall classification accuracy (OA), average classification accuracy (AA), and the kappa coefficient. Compared with those of SVM, CNN, and fully connected CRF, the final accuracies of our method are significantly improved: AA increases by 3.28 percentage points, OA by 3.22 percentage points, and the kappa coefficient by 5.07 percentage points.

Traditional classification methods have two shortcomings. The first is insufficient feature extraction, which leads to inaccurate classification results. The second is that pixel-based methods consider only the information of single points and disregard the mutual influence of surrounding points. The combination of CNN and CRF not only obtains the essential characteristics of pixels but also considers the contextual information of an image. Therefore, our method can achieve accurate classification results. Moreover, the integration of RR protects the edge structure of land cover objects and yields a satisfactory classification performance. The proposed method is accurate and effective, and it can be used in remote sensing image classification.
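The four metrics used in the quantitative analysis (class accuracy, OA, AA, and the kappa coefficient) can all be derived from a confusion matrix. The sketch below shows one common way to compute them; it assumes that "class accuracy" means the fraction of reference pixels of each class that are labeled correctly, and the function names and the ten-pixel example are hypothetical, not data from the study.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the reference class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def classification_scores(cm):
    """Return (class_accuracy, OA, AA, kappa) for a confusion matrix.

    Assumes every class has at least one reference sample; otherwise the
    per-class division is undefined.
    """
    cm = cm.astype(float)
    total = cm.sum()
    class_acc = np.diag(cm) / cm.sum(axis=1)  # correct / reference count per class
    oa = np.trace(cm) / total                 # overall accuracy
    aa = class_acc.mean()                     # average of the class accuracies
    # Agreement expected by chance, from the row and column marginals.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    return class_acc, oa, aa, kappa

# Hypothetical labels for ten pixels and three land cover classes.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
class_acc, oa, aa, kappa = classification_scores(cm)
print(cm)
print(class_acc, oa, aa, kappa)
```

OA weights every pixel equally, so it is dominated by large classes, whereas AA weights every class equally. Kappa discounts the agreement expected by chance, which is why it can move by a different amount than OA and AA for the same change in predictions.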