Parallel implementation of content-guided deep convolutional network for hyperspectral image classification

Liu Qichao, Xiao Liang, Yang Jinxiang (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)

Abstract
Objective Constrained by the fixed shape of convolution kernels, traditional convolutional neural network (CNN) methods struggle to classify cross-class edge regions in hyperspectral images (HSI) accurately, producing blurred land-cover boundaries. The content-guided CNN (CGCNN) can adaptively adjust its kernel shape according to the morphology of land covers and therefore preserves edges during classification. However, because content-guided convolution is not a fixed-template structure, it cannot be parallelized directly with existing deep learning acceleration libraries. To address this problem, this paper designs a parallel computing method for content-guided convolution and evaluates its acceleration and classification performance. Method Based on the equivalence of content-guided convolution to the combination of anisotropic kernel weighting and standard convolution, we use low-level library functions such as tile, stack, grid, and sampling to construct an index matrix that defines the resampling scheme, thereby decomposing content-guided convolution into pixel-level computations independent of spatial location that can be executed in parallel on a graphics processing unit (GPU). Result In our tests, the parallelized content-guided convolution runs on average nearly 700 times faster than the serial implementation. In classification tests, the parallelized CGCNN shows excellent detail-preserving ability on a synthetic data set, with an overall accuracy 7.10% higher on average than the compared methods; it also achieves the best results on two real data sets, exceeding the compared methods by 7.21% and 2.70%, respectively. Conclusion By decomposing content-guided convolution step by step, it can be converted into a series of parallel computation processes that execute efficiently on the GPU. Comprehensive tests on multiple data sets, covering classification accuracy, parameter sensitivity, and small-sample learning, further show that the parallelized CGCNN combines strong classification performance with edge-preserving ability for different land covers, yielding finer classification maps.
Parallel implementation of content-guided deep convolutional network for hyperspectral image classification

Liu Qichao, Xiao Liang, Yang Jinxiang(School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)

Abstract
Objective Deep learning-based hyperspectral image (HSI) classification has become an important research topic due to its ability to learn adaptive and robust features from training data automatically. However, issues such as the high dimensionality of spectra, imbalanced and limited labeled samples, and the diversity of land covers persist. Hence, improving the robustness of deep networks under limited samples has become a key research hotspot. Evaluation indices of deep learning methods, such as reproducibility, credibility, efficiency, and accuracy, have also received research attention. The content-guided convolutional neural network (CGCNN) was proposed for HSI classification. Compared with the shape-fixed kernels in traditional CNNs, the CGCNN can adaptively adjust its kernel shape according to the spatial distribution of land covers; as a result, it can preserve the boundaries and details of land covers in classification maps. In particular, each kernel in content-guided convolution is composed of two kernels: an isotropic kernel that extracts spatial features from the HSI and an anisotropic kernel that controls the spatial shape of the isotropic kernel. The isotropic kernel is shared across sampling regions, as in regular convolution, whereas the anisotropic kernels are computed independently from the local distribution of land covers at each spatial location. Because of these unshared anisotropic kernels, content-guided convolution cannot be parallelized straightforwardly by deep learning acceleration libraries (e.g., TensorFlow and PyTorch). To tackle this issue, we propose a parallel computing method for content-guided convolution that allows the CGCNN to be computed efficiently on a graphics processing unit (GPU), and we evaluate its performance in HSI classification.
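The kernel composition described above can be sketched serially as follows. This is a minimal single-channel illustration: the Gaussian form of the anisotropic weights, the 3×3 window, and the function name are our assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def content_guided_conv(x, iso_kernel, sigma=1.0):
    """Serial content-guided convolution (single channel, 'valid' padding).

    For each window, an anisotropic weight is computed from how similar
    each neighbor is to the window's center pixel (here, a Gaussian of the
    squared difference -- an illustrative choice), and the shared isotropic
    kernel is modulated by these per-location weights.
    """
    k = iso_kernel.shape[0]            # assume a square kernel, e.g. 3
    r = k // 2
    H, W = x.shape
    out = np.zeros((H - 2 * r, W - 2 * r))
    for i in range(r, H - r):          # anisotropic weights differ per pixel,
        for j in range(r, W - r):      # hence this loop resists naive batching
            patch = x[i - r:i + r + 1, j - r:j + r + 1]
            aniso = np.exp(-(patch - x[i, j]) ** 2 / (2 * sigma ** 2))
            out[i - r, j - r] = np.sum(patch * aniso * iso_kernel)
    return out
```

The per-pixel anisotropic weights are what make this loop, unlike a regular convolution, impossible to hand directly to a standard acceleration library.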
Method To make content-guided convolution parallelizable, we break it down into two steps: weighting the input feature map with the anisotropic kernels, and conducting a standard convolution with the isotropic kernels on the weighted feature map. Although the anisotropic kernels differ across sampling locations, their sampling grids (receptive fields) are restricted to small windows of fixed size (e.g., 3×3). If we resample the input feature map into small cubes, each containing only the pixels needed to compute the anisotropic kernel weights for its center pixel, then content-guided convolution becomes a pixel-level computation process independent of spatial location and can be executed on the GPU in parallel. To this end, we construct an index matrix that defines the resampling manner and then resample the input feature map, separating the overlapping windows into non-overlapping cubes, using the underlying interfaces of the deep learning acceleration library. After the anisotropic weights are computed and applied elementwise to the resampled cubes, a standard convolution with the isotropic kernels is conducted on the weighted feature map, which the deep learning acceleration library can perform directly. The resampling index matrix is shared by the whole CGCNN and thus needs to be computed only once; the resampling itself is merely a data-copy operation involving no numerical computation, so it also executes efficiently on the GPU. Result Compared with serial computing, the parallel-accelerated content-guided convolution achieves an average speedup of nearly 700 times under different conditions. The speedup also increases steadily with the input scale. As a result, the parallelizable CGCNN can classify the benchmark data sets in seconds.
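The gather-weight-reduce pattern described in Method can be sketched in vectorized NumPy; on a GPU the same pattern maps onto parallel tensor ops. The index construction, the Gaussian anisotropic weights, and all function names are our illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def build_index_matrix(H, W, k=3):
    """Precompute resampling indices (computed once, shared by all layers).
    Each row lists the flat indices of one k x k window ('valid' positions)."""
    r = k // 2
    ys, xs = np.meshgrid(np.arange(r, H - r), np.arange(r, W - r), indexing="ij")
    centers = (ys * W + xs).ravel()                      # (N,) window centers
    dy, dx = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    offsets = (dy * W + dx).ravel()                      # (k*k,) flat offsets
    return centers[:, None] + offsets[None, :]           # (N, k*k) index matrix

def content_guided_conv_parallel(x, iso_kernel, idx, sigma=1.0):
    """Vectorized content-guided convolution: gather -> weight -> reduce.
    Every row of `idx` is processed independently of spatial location."""
    k = iso_kernel.shape[0]
    r = k // 2
    patches = x.ravel()[idx]                             # resample into cubes (data copy only)
    center = patches[:, (k * k) // 2][:, None]           # each window's center pixel
    aniso = np.exp(-(patches - center) ** 2 / (2 * sigma ** 2))
    out = (patches * aniso) @ iso_kernel.ravel()         # shared isotropic kernel, applied per cube
    H, W = x.shape
    return out.reshape(H - 2 * r, W - 2 * r)
```

Note that the gather step is pure data movement, matching the abstract's observation that the resampling involves no numerical computation, while the weighting and reduction are pixel-independent and hence trivially parallel.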
Additionally, in the comparison of classification accuracies, the parallelizable CGCNN shows a strong ability to preserve details and boundaries compared with other state-of-the-art deep learning-based methods. On the synthetic data set, the classification map produced by the CGCNN preserves more edges and small objects, whereas the compared methods yield smoother classification maps; consequently, the overall accuracy of the CGCNN is 7.10% higher than that of the other methods on average. On the two real data sets (Indian Pines and Loukia), the CGCNN also achieves the best classification results, with accuracies 7.21% and 2.70% higher on average than the compared methods, respectively. Furthermore, the sensitivity parameter analysis shows the effect of the sensitivity value: a smaller sensitivity preserves more details and boundaries but also retains more outliers and noise, whereas a larger sensitivity suppresses outliers and noise but smooths the classification map and loses more details. The automatic learning mechanism of the sensitivity value achieves a trade-off between detail and smoothness, giving a suitable sensitivity for each HSI according to its spatial content and training samples. In the evaluation of small-sample learning capacity, the performance advantage of the CGCNN over the other methods grows gradually as the number of training samples decreases, indicating its strong small-sample learning ability. Conclusion By breaking down content-guided convolution into a series of pixel-level computation processes independent of spatial location, we propose a parallel computing method that runs the CGCNN in parallel on the GPU.
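The detail-versus-smoothness trade-off governed by the sensitivity value can be illustrated numerically, assuming for illustration that the anisotropic weight is a Gaussian of the feature difference (the CGCNN paper defines the actual weighting function; `aniso_weight` is our name).

```python
import numpy as np

def aniso_weight(diff, sigma):
    """Illustrative anisotropic weight given to a neighbor whose feature
    differs from the window center by `diff`, for sensitivity `sigma`."""
    return np.exp(-diff ** 2 / (2 * sigma ** 2))

# Across a class boundary (large feature difference), a small sensitivity
# suppresses the neighbor, preserving the edge but also isolating noisy
# pixels; a large sensitivity keeps it, smoothing the classification map.
print(aniso_weight(1.0, 0.5))   # strongly suppressed neighbor
print(aniso_weight(1.0, 5.0))   # nearly full-weight neighbor
```

This is why learning the sensitivity automatically, rather than fixing it by hand, lets the network balance edge preservation against noise suppression per data set.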
Diverse experiments, including comparisons of classification accuracy, sensitivity parameter analysis, and an evaluation of small-sample learning capacity, demonstrate that the CGCNN preserves edges and small objects in classification maps well and outperforms the compared methods in both quantitative and qualitative performance indices.
