Published: 2021-08-16
DOI: 10.11834/jig.200411
2021 | Volume 26 | Number 8

Hyperspectral Image Classification

面向高光谱图像分类的内容引导卷积深度网络并行实现

刘启超, 肖亮, 杨劲翔

南京理工大学计算机科学与工程学院, 南京 210094

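Received: 2020-07-27; Revised: 2020-10-27; Preprint: 2020-11-03

Funding: National Natural Science Foundation of China (61871226, 62001226); Key Research and Development Program of Jiangsu Province (BE2018727); Fundamental Research Funds for the Central Universities (30920021134)

About the authors: Liu Qichao, born in 1992, male, Ph.D. candidate; research interests: deep learning and remote sensing image processing. E-mail: qc.l@njust.edu.cn
Xiao Liang, corresponding author, male, professor; research interests: computer vision, machine learning and remote sensing image processing. E-mail: xiaoliang@mail.njust.edu.cn
Yang Jinxiang, male, lecturer; research interests: machine learning, deep learning and remote sensing image processing. E-mail: yang123jx@njust.edu.cn
*Corresponding author: Xiao Liang xiaoliang@mail.njust.edu.cn

CLC number: TP751

Document code: A

Article ID: 1006-8961(2021)08-1926-14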

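Abstract

Objective Because their kernel shape is fixed, conventional convolutional neural network (CNN) methods struggle to classify the cross-class edge regions of hyperspectral images (HSIs) accurately, which blurs land-cover boundaries. The content-guided CNN (CGCNN) adapts its kernel shape to the morphology of the land cover and therefore preserves edges during classification. However, content-guided convolution has a non-fixed template structure, so it cannot be parallelized by calling existing deep learning acceleration libraries directly. To address this problem, this paper designs a parallel computing method for content-guided convolution and verifies its acceleration and classification performance.

Method Content-guided convolution is equivalent to anisotropic kernel weighting combined with a standard convolution. Based on this structure, low-level functions of deep learning libraries (tile, repeat, meshgrid and gather) are used to construct an index matrix that defines the resampling pattern. This decomposes content-guided convolution into pixel-level computations that are independent of spatial position and can be executed in parallel on a graphics processing unit (GPU).

Result In our tests, the parallelized content-guided convolution runs nearly 700 times faster on average than the serial implementation. In the classification tests, the parallelized CGCNN shows excellent detail-preserving classification on the synthetic dataset, with an overall accuracy 7.10% higher on average than the compared methods. It also achieves the best results on two real datasets, exceeding the compared methods by 7.21% and 2.70% on average, respectively.

Conclusion By decomposing content-guided convolution step by step, it can be converted into a series of parallel computations that execute efficiently on the GPU. Comprehensive tests on several datasets, covering classification accuracy, parameter sensitivity and small-sample learning, further show that the parallelized CGCNN combines strong classification performance with edge preservation for different land covers and yields finer classification results.

Key words

content-guided convolution; deep learning; hyperspectral image (HSI) classification; parallel acceleration; edge-preserving classification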

Parallel implementation of content-guided deep convolutional network for hyperspectral image classification

Liu Qichao, Xiao Liang, Yang Jinxiang

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Supported by: National Natural Science Foundation of China (61871226, 62001226); Jiangsu Provincial Social Developing Project (BE2018727); Fundamental Research Funds for the Central Universities (30920021134)

Abstract

Objective Deep learning-based hyperspectral image (HSI) classification has become an important research topic because it can automatically learn adaptive and robust features from training data. However, issues such as the high dimensionality of spectra, imbalanced and limited labeled samples, and the diversity of land covers persist. Hence, improving the robustness of deep networks under limited samples has become a key research focus. Evaluation indices of deep learning methods, such as reproducibility, credibility, efficiency, and accuracy, have also received research attention. The content-guided convolutional neural network (CGCNN) was proposed for HSI classification. Compared with the shape-fixed kernels in traditional CNNs, the CGCNN can adaptively adjust its kernel shape according to the spatial distribution of land covers; as a result, it can preserve the boundaries and details of land covers in classification maps. In particular, each kernel in content-guided convolution is composed of two kernels, namely, an isotropic kernel used to extract spatial features from the HSI and an anisotropic kernel used to control the spatial shape of the isotropic kernel. The isotropic kernel is shared across sampling regions, as in regular convolution. However, the anisotropic kernels are computed independently according to the local distribution of land covers at each spatial location. Because the anisotropic kernels are not shared, content-guided convolution cannot be computed in parallel in a straightforward way by deep learning acceleration libraries (e.g., TensorFlow and PyTorch). To tackle this issue, we propose a parallel computing method for content-guided convolution that allows the CGCNN to be computed efficiently on a graphics processing unit (GPU), and we evaluate its performance in hyperspectral image classification.

Method To make content-guided convolution parallelizable, we break it down into two steps: weighting the input feature map with the anisotropic kernels, and conducting a standard convolution with the isotropic kernels on the weighted feature map. Although the anisotropic kernels differ at different sampling locations, their sampling grids (receptive fields) are restricted to small windows of fixed size (e.g., 3×3). If we resample the input feature map into small cubes, each of which contains only the pixels needed to compute the anisotropic kernel weights for its center pixel, then content-guided convolution can be converted into a pixel-level computation that is independent of spatial location and can be executed on the GPU in parallel. To this end, we construct an index matrix that defines the resampling pattern and then resample the input feature map, separating the overlapping pixels into non-overlapping pixels by means of the low-level interfaces of the deep learning acceleration library. Finally, a standard convolution is conducted on the weighted feature map to complete the weighting by the isotropic kernels, which the deep learning acceleration library performs directly. The resampling index matrix is shared across the whole CGCNN and thus needs to be computed only once. The resampling can also be executed efficiently on the GPU because it is just a data copy operation and involves no numerical computation.

Result Compared with serial computing, the parallel-accelerated content-guided convolution achieves an average speedup of nearly 700 times under different conditions. The acceleration ratio also increases steadily with increasing input scale. As a result, the parallelizable CGCNN can classify the benchmark data sets in seconds. Additionally, in the comparison of classification accuracies, the parallelizable CGCNN shows a strong ability to preserve details and boundaries compared with other state-of-the-art deep learning-based methods. On the synthetic data set, the classification map acquired by the CGCNN preserves more edges and small objects, whereas the compared methods produce smoother classification maps. As a result, the overall accuracy of the CGCNN is 7.10% higher than that of the other methods on average. On the two real data sets (i.e., Indian Pines and Loukia), the CGCNN still achieves the best classification results, with classification accuracies that are 7.21% and 2.70% higher on average than the compared methods, respectively. Furthermore, the sensitivity parameter analysis shows the effect of the sensitivity value. A smaller sensitivity value preserves more details and boundaries but also admits more outliers and noise; in contrast, a larger sensitivity value suppresses more outliers and noise, but the classification map may be smoother and lose more details. The automatic learning mechanism for the sensitivity value achieves a trade-off between detail and smoothness and gives a suitable sensitivity for different HSIs according to their spatial contents and training samples. In the evaluation of small-sample learning capacity, the performance advantage of the CGCNN over the other methods grows gradually as the number of training samples decreases, indicating the strong small-sample learning ability of the CGCNN.

Conclusion By breaking down content-guided convolution into a series of pixel-level computations that are independent of spatial location, we propose a parallel computing method that runs the CGCNN in parallel on the GPU. Diverse experiments, such as the comparison of classification accuracies, the sensitivity parameter analysis, and the evaluation of small-sample learning capacity, demonstrate that the CGCNN preserves the edges and small objects in classification maps well and outperforms the compared methods in quantitative and qualitative performance indices.

Key words

content-guided convolution; deep learning; hyperspectral image (HSI) classification; parallel acceleration; edge-preserving classification

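0 Introduction

A hyperspectral image (HSI) is a three-dimensional data cube that unifies image and spectrum; its rich spatial and spectral information makes it possible to distinguish materials at the pixel level. Accurately classifying every pixel of an HSI is an important prerequisite for applications such as environmental monitoring and geological exploration (Li et al., 2016; Zhang et al., 2018). HSI classification faces several problems: high-dimensional, redundant spectra; the diversity caused by different objects sharing similar spectra and the same object showing different spectra; a shortage of high-quality labeled samples; and class imbalance. Traditional methods classify HSIs mainly through dimensionality reduction, hand-crafted feature extraction and classifiers designed for small samples; representative methods include band selection (Wang and Wei, 2013), sparse representation (Chen et al., 2011) and low-rank decomposition (Chen et al., 2013). However, the classification performance of traditional methods is often limited by factors such as data scale and feature dimensionality. Deep neural networks, such as feedforward neural networks (Fernandez-Redondo et al., 2004), recurrent neural networks (Mou et al., 2017) and stacked autoencoders (Chen et al., 2014), have shown outstanding performance in HSI analysis thanks to their strong adaptive feature learning ability, and have gradually become a research focus in HSI classification.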

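Supervised HSI classification based on deep learning currently follows two basic lines of research: spectral-based methods and spectral-spatial methods. In spectral-based methods, a deep network learns a high-dimensional mapping of the spectra from labeled spectral training data in a supervised manner, producing task-specific adaptive features without hand-crafted feature extraction. However, these methods are easily disturbed by noise, so the results tend to contain misclassifications or outliers; they also have many trainable parameters and are prone to overfitting. Spectral-spatial methods, in contrast, capture and learn contextual semantic features from both the spatial and spectral domains, which effectively reduces classification noise and markedly improves HSI classification accuracy, and they have attracted wide attention from researchers.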

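Building on the convolutional neural network (CNN), researchers have explored HSI classification methods with a variety of deep learning architectures, including 1D-CNN (Hu et al., 2015), 2D-CNN (Makantasis et al., 2015), 3D-CNN (Li et al., 2017) and CNNs with composite structures. For example, Yang et al. (2017) used a two-branch network composed of a 1D-CNN and a 2D-CNN to extract spectral and spatial features, respectively. Chen et al. (2018) added a 3D-CNN branch to strengthen feature extraction further. Roy et al. (2020) instead alternated 1D, 2D and 3D convolutions in a single branch to extract spectral-spatial features jointly. Compared with single-structure CNNs, composite CNNs usually improve HSI classification accuracy. To improve performance further, some researchers introduced residual and dense connections into composite CNNs (Zhong et al., 2018; Wang et al., 2018; Liu et al., 2020). By strengthening gradient flow and error propagation, deep networks can still be trained well with limited samples, which improves the performance of HSI classification networks. Inspired by traditional band selection, Fang et al. (2019) proposed a band (spectral) attention deep learning method that adaptively selects band subsets through an attention mechanism, reducing the interference of redundant bands and improving the generalization and robustness of deep networks. On this basis, spatial attention was combined with the multi-channel nature of HSIs to form spectral-spatial attention deep models (Mei et al., 2019; Li et al., 2020), which raised classification accuracy further. Recently, graph neural networks have also attracted attention (Qin et al., 2019; Wan et al., 2020): by representing an HSI as a graph through methods such as superpixel segmentation, HSI classification becomes a representation learning problem on graphs, and graph embedding and dynamic graph learning can be used to capture contextual semantic structure and improve spectral-spatial feature learning. However, graph neural networks mainly target non-Euclidean data, and there is a representation gap between regularly gridded image data and irregular graph data: the content that graph nodes can express is limited, and the image must be converted into a graph representation appropriately. CNNs and similar deep learning methods can process images directly, so CNN-based methods remain the mainstream of deep learning for HSIs.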

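However, constrained by the fixed-shape sampling of conventional convolution, CNN-based HSI classification methods often fail to classify pixels in cross-class edge regions correctly, which causes small objects to be lost and classification maps to be over-smoothed. To address this problem, Liu et al. (2020) proposed the content-guided CNN (CGCNN), which adapts the shape of the convolution kernel at each sampling position to the spatial distribution of land covers. It effectively reduces misclassification in cross-class edge regions and preserves the edges of different land covers and small-object regions well. The core of the method is the content-guided convolution algorithm, whose non-fixed template structure cannot be parallelized by calling existing deep learning acceleration libraries (such as TensorFlow and PyTorch) directly. To address this problem, this paper presents a parallel computing method for content-guided convolution. Specifically, the shape-adaptive kernel in content-guided convolution is formed by weighting an isotropic kernel with an anisotropic kernel, and the anisotropic kernel differs at every sampling position, so the operation cannot be computed with a shared kernel as in regular convolution. This paper decomposes content-guided convolution into two steps: first, compute the anisotropic kernel for each spatial position and weight the input feature map with it; second, apply a strided standard convolution to the weighted feature map. The anisotropic weighting requires overlapping sampling of the feature map, and this overlap is the main obstacle to parallel computation. The overlapping sampling is therefore converted into non-overlapping sampling, after which pixel-level parallel acceleration is applied. Finally, the data sampling interfaces provided by deep learning acceleration libraries, together with low-level operations such as tile and repeat, are used to implement the conversion from overlapping to non-overlapping sampling and thereby achieve parallel acceleration. Experiments show that the parallel version of content-guided convolution achieves a speedup of nearly 700 times on average under different conditions, which largely meets the runtime efficiency required by deep learning and gives the CGCNN built on it fast computation. In addition, from the perspectives of reproducibility, credibility and classification performance, performance evaluations and ablation experiments were designed for the algorithm, including classification accuracy comparisons on multiple datasets, parameter sensitivity analysis and small-sample learning tests, which verify its overall classification performance and effectiveness.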

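1 Principle of the CGCNN hyperspectral image classification algorithm

The CGCNN is an end-to-end HSI classification framework whose core is content-guided convolution. It is combined with common network structures such as regular convolution layers, batch normalization (BN) layers and dense connections (Huang et al., 2017) to build the hyperspectral image classification network. The CGCNN framework built on content-guided convolution is shown in Fig. 1.

Fig. 1 Architecture of the CGCNN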

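1.1 Content-guided convolution

Content-guided convolution has two inputs: a feature map and a guidance map. By computing the local land-cover shape in the guidance map, it dynamically changes the kernel shape at every position, which makes the kernel content-adaptive. A CGCNN built on content-guided convolution retains more small objects in hyperspectral image classification and therefore produces finer classification results.

Let the input feature map and the guidance map be $\mathit{\boldsymbol{X}} \in {{\bf{R}}^{H \times W \times B}}$ and $\mathit{\boldsymbol{M}} \in {{\bf{R}}^{H \times W \times C}}$, respectively. Content-guided convolution is then computed as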

$ Y_{i, j}^{l}=f\left(\sum\limits_{r=0}^{B-1} \sum\limits_{u=0}^{h-1} \sum\limits_{v=0}^{w-1} A_{u, v}^{i, j} K_{u, v, r}^{l} X_{(i+u), (j+v), r}+b^{l}\right) $

(1)

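where ${\mathit{\boldsymbol{K}}^l}$, ${b^l}$ and ${\mathit{\boldsymbol{Y}}^l}$ denote the $l$-th isotropic kernel, bias and output feature band, respectively (unless otherwise stated, subscripts denote the value of a variable at that position; the same applies below), ${\mathit{\boldsymbol{A}}^{i, j}}$ is the anisotropic kernel at spatial coordinate $\left({i, j} \right)$, $\left({h, w} \right)$ is the spatial size of the kernel, and $f$ is the activation function. The anisotropic kernel is computed as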

$ A_{u, v}^{i, j}=\exp \left(-\frac{1}{\sigma^{2}}\left\|M_{(i+u), (j+v)}-M_{(i+\lfloor h / 2\rfloor), (j+\lfloor w / 2\rfloor)}\right\|_{2}^{2}\right) $

(2)

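where $\sigma $ is the adaptive sensitivity parameter and ${\left\| \cdot \right\|_2}$ denotes the ${\rm{L}}2$ norm. Unlike conventional convolution, the kernel in content-guided convolution consists of two kernels: the isotropic kernel extracts spatial features from the HSI and is shared during convolution, whereas the anisotropic kernel controls the spatial shape of the isotropic kernel and differs at every spatial position. If $\mathit{\boldsymbol{M}}$ reflects the spatial distribution and shape of the land covers in the input feature map, the kernel adapts to $\mathit{\boldsymbol{M}}$, which reduces misclassification in cross-class edge regions and better preserves small objects and land-cover edges in the HSI.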
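To make the role of the anisotropic kernel concrete, the following minimal sketch evaluates Eq. (2) for one window of a toy guidance map. The function name `anisotropic_kernel`, the nested-list layout and the toy values are illustrative assumptions, not part of the original implementation; the window center is taken at offsets ⌊h/2⌋ and ⌊w/2⌋.

```python
import math

def anisotropic_kernel(M, i, j, h, w, sigma):
    """Eq. (2): weights A[u][v] for the h x w window whose top-left corner is (i, j).

    M is a guidance map stored as nested lists M[row][col][channel] (assumed layout).
    """
    centre = M[i + h // 2][j + w // 2]           # centre pixel of the window
    A = []
    for u in range(h):
        row = []
        for v in range(w):
            pixel = M[i + u][j + v]
            dist2 = sum((p - c) ** 2 for p, c in zip(pixel, centre))
            row.append(math.exp(-dist2 / sigma ** 2))
        A.append(row)
    return A

# Toy 3 x 3 single-channel guidance map with a vertical edge before the last column.
M = [[[0.0], [0.0], [1.0]],
     [[0.0], [0.0], [1.0]],
     [[0.0], [0.0], [1.0]]]
A = anisotropic_kernel(M, 0, 0, 3, 3, sigma=0.5)
for row in A:
    print([round(a, 4) for a in row])
```

Pixels on the same side of the edge as the center receive weight 1, while pixels across the edge receive exp(-4), so the isotropic kernel is effectively masked to the center pixel's region.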

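To reduce the number of kernel parameters and lower the risk of overfitting, and inspired by depthwise separable convolution (Howard et al., 2017), Eq. (1) can be further separated into a combination of pixel-wise convolution and band-wise convolution: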

$ Y_{i, j}^{l}=f\left(\sum\limits_{u=0}^{h-1} \sum\limits_{v=0}^{w-1} A_{u, v}^{i, j} K_{u, v}^{l} \tilde{X}_{(i+u), (j+v)}^{l}+b^{l}\right) $

(3)

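where ${{\mathit{\boldsymbol{\tilde X}}}^l}$ is the $l$-th band of the spectral feature map obtained by applying a pixel-wise convolution (1×1 convolution) to the input feature map $\mathit{\boldsymbol{X}}$. This decomposition markedly reduces the number of trainable kernel parameters, which lowers the risk of overfitting, improves robustness and in turn improves classification performance with limited samples.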
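The parameter saving can be checked with simple arithmetic. The sketch below counts isotropic kernel weights (biases ignored) for Eq. (1) and for the separated form of Eq. (3); the chosen sizes (5×5 kernel, 128 input bands, 32 output bands) are only an illustrative assumption.

```python
def full_kernel_params(h, w, in_bands, out_bands):
    # Eq. (1): one h x w x B isotropic kernel per output band.
    return h * w * in_bands * out_bands

def separable_kernel_params(h, w, in_bands, out_bands):
    # Eq. (3): a 1x1 convolution (B x L weights) plus one h x w kernel per output band.
    return in_bands * out_bands + h * w * out_bands

full = full_kernel_params(5, 5, 128, 32)
separable = separable_kernel_params(5, 5, 128, 32)
print(full, separable, round(full / separable, 1))
```

Under these assumed sizes the separated form needs roughly one twentieth of the weights of the full form.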

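1.2 Network architecture

As shown in Fig. 1, the CGCNN is a stack of guided feature extraction units (GFEUs). Each GFEU consists of a BN layer, a 1×1 convolution layer and a band-wise content-guided convolution layer, and has two inputs: a feature map and a guidance map.

Specifically, the guidance map $\mathit{\boldsymbol{M}}$ in the CGCNN is learned by a sub-network composed simply of a BN layer and a 1×1 convolution layer, and the learned guidance map is shared by all GFEUs. The input feature map of each GFEU is the concatenation of the outputs of all preceding GFEUs; that is, dense connections are introduced to strengthen feature propagation and alleviate training difficulties caused by vanishing gradients. Let ${\mathit{\boldsymbol{I}}^{\left(l \right)}}$ and ${\mathit{\boldsymbol{O}}^{\left(l \right)}}$ denote the input and output feature maps of the $l$-th GFEU, respectively (${\mathit{\boldsymbol{I}}^{\left(1 \right)}}$ is the input HSI). Then ${\mathit{\boldsymbol{I}}^{\left(l \right)}}$ can be written as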

$ \boldsymbol{I}^{(l)}=\left[\boldsymbol{O}^{(1)}, \boldsymbol{O}^{(2)}, \cdots, \boldsymbol{O}^{(l-1)}\right] $

(4)

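where […] denotes concatenation of feature maps along the spectral dimension. If the network consists of $S\left({S \ge 1} \right)$ GFEUs, the hierarchical feature map $\mathit{\boldsymbol{I}}$ finally fed to the classifier is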

$ \boldsymbol{I}=\left[\boldsymbol{O}^{(1)}, \boldsymbol{O}^{(2)}, \cdots, \boldsymbol{O}^{(S)}\right] $

(5)

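Finally, the CGCNN uses a Softmax classifier to compute the class probabilities of each pixel. Let $\mathit{\boldsymbol{P}}$ denote the output class-probability map; it is computed as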

$ P_{i, j, k}=\frac{\exp \left(\sum\limits_{m} K_{m}^{k} I_{i, j, m}+b^{k}\right)}{\sum\limits_{c} \exp \left(\sum\limits_{m} K_{m}^{c} I_{i, j, m}+b^{c}\right)} $

(6)

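where ${\mathit{\boldsymbol{K}}^k}$ and ${b^k}$ are the $k$-th one-dimensional kernel and its bias, and the output ${P_{i, j, k}}$ is the probability that the pixel at position $\left({i, j} \right)$ of the HSI belongs to class $k$. In addition, to alleviate class imbalance in HSIs, the CGCNN uses a class-weighted cross entropy as its loss function: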

$ \mathcal{L}=-\sum\limits_{\boldsymbol{L}_{i, j} \in \boldsymbol{D}} \sum\limits_{k=0}^{C-1} N_{k}^{-1} L_{i, j, k} \log \left(P_{i, j, k}\right) $

(7)

where $N_k$ is the number of supervised samples of class $k$, and $\mathit{\boldsymbol{D}} = \left\{ {{\mathit{\boldsymbol{L}}_{i, j}}} \right\}$ is the label set of all supervised pixels.
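As a hedged illustration of Eqs. (6) and (7), the sketch below applies a softmax to per-pixel class scores and evaluates the class-weighted cross entropy for one-hot labels. The function names and the toy scores are assumptions made for this example; the scores stand in for the linear term inside the exponentials of Eq. (6).

```python
import math

def softmax(scores):
    m = max(scores)                                # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_weighted_cross_entropy(probs, labels, num_classes):
    """Eq. (7) with one-hot labels: probs[n] is the probability vector of labelled
    pixel n and labels[n] its class index; each term is scaled by 1 / N_k."""
    counts = [0] * num_classes
    for k in labels:
        counts[k] += 1                             # N_k: supervised samples per class
    return -sum(math.log(p[k]) / counts[k] for p, k in zip(probs, labels))

# Three labelled pixels, two classes; class 0 has two samples and class 1 has one.
scores = [[2.0, 0.0], [1.5, 0.5], [0.0, 1.0]]
labels = [0, 0, 1]
probs = [softmax(s) for s in scores]
loss = class_weighted_cross_entropy(probs, labels, num_classes=2)
print(round(loss, 4))
```

Because each class contributes the mean of its own log-losses, a class with few labelled pixels carries the same total weight as a class with many.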

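1.3 Parameter settings

As shown in Fig. 1, the CGCNN contains one guidance-map learning module and five GFEUs. The number of output channels of the guidance-map learning module is set to 3. The first GFEU outputs a feature map with 128 channels, and each remaining GFEU outputs 32 channels. The kernel size is set to 5×5 in all GFEUs. The initial value of $\sigma $ is set to 1. The CGCNN is trained with the Adam optimizer, with a learning rate of 0.005 for $\sigma $ and 0.0005 for all other parameters, for 800 training iterations. Experimental environment: i7-8700K CPU, GTX-1080Ti GPU, TensorFlow 1.12.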
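The channel bookkeeping implied by Eqs. (4) and (5) under this configuration can be sketched as follows. The helper name is hypothetical, and the first GFEU is left out of the list because it receives the input HSI itself, whose band count depends on the dataset.

```python
def gfeu_input_channels(out_channels):
    """Input channels of GFEU 2..S under Eq. (4): the sum of all earlier outputs."""
    return [sum(out_channels[:l]) for l in range(1, len(out_channels))]

out_channels = [128, 32, 32, 32, 32]          # first GFEU 128 channels, the rest 32
inputs_from_second = gfeu_input_channels(out_channels)
classifier_channels = sum(out_channels)       # Eq. (5): all outputs concatenated
print(inputs_from_second, classifier_channels)
```

With five GFEUs the concatenated feature map fed to the classifier therefore has 256 channels.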

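2 Parallel computing method for content-guided convolution

Because content-guided convolution has a non-fixed template structure, it cannot be parallelized directly with the standard convolution modules of deep learning libraries. This paper designs a parallel acceleration method for content-guided convolution based on element resampling and low-level acceleration interfaces built into deep learning libraries, such as tile and repeat.

An analysis of the structure of content-guided convolution shows that the main computational problem is that the anisotropic kernel in Eq. (1) is not shared during convolution: the kernel parameters differ at every position, so the special structure of this convolution must be decomposed further. The convolution defined by Eq. (1) can be decomposed into the following two steps. 1) Compute the anisotropic kernel for each spatial position and weight the input feature map with it: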

$ G_{(i \times h+u), (j \times w+v), r}=A_{u, v}^{i, j} X_{(i+u), (j+v), r} $

(8)

where $0 \le u \le h - 1, 0 \le v \le w - 1$, and $\mathit{\boldsymbol{G}}$ is the feature map weighted by the anisotropic kernels.

2) Apply a standard convolution with stride $\left({h, w} \right)$ to the weighted feature map:

$ Y_{i, j}=\sum\limits_{r=0}^{B-1} \sum\limits_{u=0}^{h-1} \sum\limits_{v=0}^{w-1} K_{u, v, r} G_{(i \times h+u), (j \times w+v), r} $

(9)

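The convolution in Eq. (9) can be accelerated directly by deep learning libraries. Eq. (8) can be implemented with the low-level interfaces these libraries provide and can be executed in parallel.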
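The equivalence between Eq. (1) and the two-step form of Eqs. (8) and (9) can be checked numerically. The NumPy sketch below is an illustration under stated assumptions: bias and activation are omitted, a single output band is computed, the anisotropic kernels `A` are random placeholders rather than values from Eq. (2), and the input is assumed to be large enough that every window index `i+u`, `j+v` is valid.

```python
import numpy as np

def direct_conv(X, A, K):
    """Eq. (1) without bias/activation. X: (H+h-1, W+w-1, B), A: (H, W, h, w), K: (h, w, B)."""
    H, W, h, w = A.shape
    Y = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = X[i:i + h, j:j + w, :]                     # (h, w, B)
            Y[i, j] = np.sum(A[i, j][:, :, None] * K * patch)
    return Y

def two_step_conv(X, A, K):
    H, W, h, w = A.shape
    B = X.shape[2]
    # Step 1, Eq. (8): write each weighted window into its own h x w tile of G.
    G = np.zeros((H * h, W * w, B))
    for i in range(H):
        for j in range(W):
            G[i * h:(i + 1) * h, j * w:(j + 1) * w, :] = (
                A[i, j][:, :, None] * X[i:i + h, j:j + w, :])
    # Step 2, Eq. (9): standard convolution with stride (h, w). The tiles do not
    # overlap, so it reduces to one weighted sum per tile.
    tiles = G.reshape(H, h, W, w, B)
    return np.einsum('iujvr,uvr->ij', tiles, K)

rng = np.random.default_rng(0)
H, W, h, w, B = 4, 5, 3, 3, 2
X = rng.standard_normal((H + h - 1, W + w - 1, B))
A = rng.random((H, W, h, w))
K = rng.standard_normal((h, w, B))
same = np.allclose(direct_conv(X, A, K), two_step_conv(X, A, K))
print(same)
```

The loop in step 1 is kept only for readability; each tile of `G` depends on a single output position, which is what allows that step to be executed independently per pixel.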

2.1 Parallel function interfaces of deep learning libraries

Mainstream deep learning libraries all provide fairly low-level acceleration function interfaces that let developers implement more complex parallel algorithms. The function interfaces required by the proposed acceleration algorithm are therefore introduced briefly first. Because function names may differ between libraries, simplified interface names are used here; each function has a counterpart in mainstream deep learning libraries.

1) $\mathit{\boldsymbol{Y}} = tile\left({\mathit{\boldsymbol{X}}, \mathit{N}, \mathit{dim}} \right)$: tile function, which repeats each element of the input matrix $\mathit{\boldsymbol{X}}$ $N$ times along dimension $dim$. For example, if $\mathit{\boldsymbol{X}} = \left[ {1\;\;2} \right]$, then $tile\left({\mathit{\boldsymbol{X}}, {\rm{2}}, {\rm{2}}} \right) = \left[ {1\;\;1\;\;2\;\;2} \right]$.

2) $\mathit{\boldsymbol{Y}} = repeat\left({\mathit{\boldsymbol{X}}, \mathit{N}, \mathit{dim}} \right)$: stack function, which repeats the input matrix $\mathit{\boldsymbol{X}}$ as a whole $N$ times along dimension $dim$. For example, if $\mathit{\boldsymbol{X}} = \left[ {1\;\;2} \right]$, then $repeat\left({\mathit{\boldsymbol{X}}, {\rm{2}}, {\rm{2}}} \right) = \left[ {1\;\;2\;\;1\;\;2} \right]$.

3) $\mathit{\boldsymbol{I}}{\rm{ = }}\mathit{meshgrid}\left({\mathit{\boldsymbol{X}}, \mathit{\boldsymbol{Y}}} \right)$: grid function, which takes $\mathit{\boldsymbol{X}}$ and $\mathit{\boldsymbol{Y}}$ as column and row indices, respectively, and generates the grid of sampling points. For example, if $\mathit{\boldsymbol{X}} = \left[ {1\;\;2} \right]$ and $\mathit{\boldsymbol{Y}} = \left[ {3\;\;4} \right]$, then $\mathit{meshgrid}\left({\mathit{\boldsymbol{X}}, \mathit{\boldsymbol{Y}}} \right) = \left[ {\begin{array}{*{20}{c}} {\left[ {1, 3} \right]}&{\left[ {2, 3} \right]}\\ {\left[ {1, 4} \right]}&{\left[ {2, 4} \right]} \end{array}} \right]$.

4) $\mathit{\boldsymbol{Y}} = gather\left({\mathit{\boldsymbol{X}}, \mathit{\boldsymbol{I}}} \right)$: sampling function, which samples $\mathit{\boldsymbol{X}}$ at the coordinates given in the index $\mathit{\boldsymbol{I}}$. For example, if $\mathit{\boldsymbol{X}} = \left[ {\begin{array}{*{20}{c}} 1&2\\ 3&4 \end{array}} \right], \mathit{\boldsymbol{I = }}\left[ {\left[ {1, 2} \right]\;\;\left[ {1, 1} \right]\;\;\left[ {2, 2} \right]} \right]$, then $gather\left({\mathit{\boldsymbol{X}}, \mathit{\boldsymbol{I}}} \right) = \left[ {2\;\;1\;\;4} \right]$.

5) $\mathit{\boldsymbol{Y}} = range\left(N \right)$: sequence generation function, which generates an arithmetic sequence from 1 to $N$ with step 1. For example, $\mathit{\boldsymbol{Y}} = range\left(3 \right) = \left[ {1\;\;2\;\;3} \right]$.
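For readers who want to try the five interfaces, the sketch below maps them onto NumPy as one possible counterpart. The wrapper names and the choice of NumPy are assumptions made for this example. Note that the simplified names are swapped relative to NumPy: the element-wise tile above corresponds to `np.repeat`, and the whole-array repeat above corresponds to `np.tile`. The `axis` argument is NumPy's 0-based axis, so dimension 2 in the text is `axis=1`, and the meshgrid example takes Y = [3 4], which the printed result implies.

```python
import numpy as np

def tile(x, n, axis):          # text's "tile": repeat each element n times
    return np.repeat(x, n, axis=axis)

def repeat(x, n, axis):        # text's "repeat": repeat the whole array n times
    reps = [1] * x.ndim
    reps[axis] = n
    return np.tile(x, reps)

def meshgrid(x, y):            # grid of (x, y) pairs, x varying along columns
    xx, yy = np.meshgrid(x, y)
    return np.stack([xx, yy], axis=-1)

def gather(x, idx):            # idx holds 1-based (row, column) pairs, as in the text
    idx = np.asarray(idx) - 1
    return x[idx[:, 0], idx[:, 1]]

def range1(n):                 # text's "range": 1, 2, ..., n
    return np.arange(1, n + 1)

X = np.array([[1, 2]])
print(tile(X, 2, 1), repeat(X, 2, 1))
print(meshgrid([1, 2], [3, 4]).tolist())
print(gather(np.array([[1, 2], [3, 4]]), [[1, 2], [1, 1], [2, 2]]), range1(3))
```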

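2.2 Parallel computing procedure of content-guided convolution

With these acceleration interfaces, the parallel acceleration of content-guided convolution is designed as follows. First, a shared sampling index is computed and used to resample the original data, which separates the otherwise inseparable overlapping data, and Eq. (8) is executed in parallel at the pixel level. Then the strided standard convolution in Eq. (9) computes the remaining part in parallel. The key to the implementation is to decompose content-guided convolution into several independent computations and to convert the overlapping computation into pixel-level parallel computation that is independent of spatial position. The block diagram of the algorithm is shown in Fig. 2, and the computation is described in Algorithm 1, where ${\mathit{\boldsymbol{\hat X}}}$ and ${\mathit{\boldsymbol{\hat M}}}$ denote the resampled feature map $\mathit{\boldsymbol{X}}$ and guidance map $\mathit{\boldsymbol{M}}$, respectively, ${\mathit{\boldsymbol{I}}^{xy}}$ denotes the sampling index, $\mathit{\boldsymbol{C}}$ denotes $\mathit{\boldsymbol{M}}$ with its center pixels tiled, and $\left({h, w} \right)$ is the kernel size.

Fig. 2 Parallel computing procedure of the content-guided convolution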

Algorithm 1: Parallel computation of content-guided convolution

Input: feature map $\mathit{\boldsymbol{X}}$, guidance map $\mathit{\boldsymbol{M}}$, kernel size $\left({h, w} \right)$.

Output: feature map $\mathit{\boldsymbol{Y}}$.

1) Generate the sampling index ${\mathit{\boldsymbol{I}}^{x}}$ along the spatial $x$ axis (first dimension) as follows:

(1) Initialize the increasing sequences: $\mathit{\boldsymbol{I}}_H^x\mathit{\boldsymbol{ = }}range\left(H \right), \mathit{\boldsymbol{I}}_h^x\mathit{\boldsymbol{ = }}range\left(h \right)$;

(2) Tile $\mathit{\boldsymbol{I}}_H^x$ horizontally $h$ times: $\mathit{\boldsymbol{I}}_{{\rm{global}}}^x = tile\left({\mathit{\boldsymbol{I}}_H^x, h, 2} \right)$;

(3) Stack $\mathit{\boldsymbol{I}}_h^x$ horizontally $H$ times: $\mathit{\boldsymbol{I}}_{{\rm{local}}}^x = repeat\left({\mathit{\boldsymbol{I}}_h^x, H, 2} \right)$;

(4) Compute the $x$-axis sampling index: ${\mathit{\boldsymbol{I}}^x} = \mathit{\boldsymbol{I}}_{{\rm{global}}}^x + \mathit{\boldsymbol{I}}_{{\rm{local}}}^x - ceil\left({h/2} \right)$, where $ceil\left({} \right)$ rounds up;

(5) Clip the elements of ${\mathit{\boldsymbol{I}}^x}$ to the range $\left[ {1, H} \right]$ to prevent out-of-bounds access;

2) Replace $\left({H, h} \right)$ with $\left({W, w} \right)$ and execute step 1) to obtain the sampling index ${\mathit{\boldsymbol{I}}^y}$ along the $y$ axis (second dimension);

3) Generate the global spatial sampling index: ${\mathit{\boldsymbol{I}}^{xy}}{\rm{ = }}\mathit{meshgrid}\left({{\mathit{\boldsymbol{I}}^x}, {\mathit{\boldsymbol{I}}^y}} \right)$;

4) Expand the center pixel of each sampling region of $\mathit{\boldsymbol{M}}$ along the horizontal and vertical axes:

$ \boldsymbol{C}={tile}({tile}(\boldsymbol{M}, h, 1), w, 2) $

5) Resample $\mathit{\boldsymbol{M}}$ and $\mathit{\boldsymbol{X}}$ according to ${\mathit{\boldsymbol{I}}^{xy}}$:

$ \hat{\boldsymbol{M}}={gather}\left(\boldsymbol{M}, \boldsymbol{I}^{x y}\right), \hat{\boldsymbol{X}}={gather}\left(\boldsymbol{X}, \boldsymbol{I}^{x y}\right) $

6) Execute the following operation in parallel at the pixel level:

$ \boldsymbol{G}_{i, j}=\exp \left(-\left\|\hat{\boldsymbol{M}}_{i, j}-\boldsymbol{C}_{i, j}\right\|_{2}^{2} / \sigma^{2}\right) \hat{\boldsymbol{X}}_{i, j} $

7) Apply a standard convolution with stride $\left({h, w} \right)$ to $\mathit{\boldsymbol{G}}$.

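This paper constructs an index matrix to define the resampling pattern, decomposes content-guided convolution into a pixel-wise computation, and executes it in parallel through the acceleration interfaces. Furthermore, because the CGCNN is built on content-guided convolution and the guidance map $\mathit{\boldsymbol{M}}$ is shared, steps 1) to 5) of Algorithm 1 (excluding the resampling of $\mathit{\boldsymbol{X}}$ in step 5)) need to be computed only once per forward pass of the network, which further reduces the computation required.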
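The steps of Algorithm 1 can be sketched in NumPy as shown below. This is a minimal illustration, not the released TensorFlow implementation: indices are 0-based (so the offset ceil(h/2) becomes `k // 2`), borders are handled by clipping as in step 1(5), bias and activation are omitted, only one output band is produced, and all function names are assumptions. A serial loop is included only to check the vectorized result.

```python
import numpy as np

def axis_index(size, k):
    """Step 1: sampling index along one axis (0-based)."""
    i_global = np.repeat(np.arange(size), k)      # text's tile: 0,0,0,1,1,1,...
    i_local = np.tile(np.arange(k), size)         # text's repeat: 0,1,2,0,1,2,...
    return np.clip(i_global + i_local - k // 2, 0, size - 1)

def cg_conv_parallel(X, M, K, sigma):
    """X: (H, W, B) features, M: (H, W, C) guidance, K: (h, w, B) isotropic kernel."""
    H, W, B = X.shape
    h, w, _ = K.shape
    ix, iy = axis_index(H, h), axis_index(W, w)                   # steps 1-2
    rows, cols = ix[:, None], iy[None, :]                         # step 3 (broadcast grid)
    C = np.repeat(np.repeat(M, h, axis=0), w, axis=1)             # step 4
    M_hat, X_hat = M[rows, cols], X[rows, cols]                   # step 5
    A = np.exp(-np.sum((M_hat - C) ** 2, axis=-1) / sigma ** 2)   # step 6
    G = A[..., None] * X_hat
    tiles = G.reshape(H, h, W, w, B)                              # step 7: stride (h, w)
    return np.einsum('iujvr,uvr->ij', tiles, K)

def cg_conv_serial(X, M, K, sigma):
    """Reference: visit every sampling position in turn."""
    H, W, _ = X.shape
    h, w, _ = K.shape
    Y = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            for u in range(h):
                for v in range(w):
                    ii = min(max(i + u - h // 2, 0), H - 1)
                    jj = min(max(j + v - w // 2, 0), W - 1)
                    a = np.exp(-np.sum((M[ii, jj] - M[i, j]) ** 2) / sigma ** 2)
                    Y[i, j] += a * np.dot(K[u, v], X[ii, jj])
    return Y

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 7, 4))
M = rng.standard_normal((6, 7, 3))
K = rng.standard_normal((3, 5, 4))
match = np.allclose(cg_conv_parallel(X, M, K, 0.8), cg_conv_serial(X, M, K, 0.8))
print(match)
```

In this sketch `ix`, `iy`, `C` and `M_hat` depend only on the guidance map and the kernel size, so a network that shares one guidance map could cache them across layers. The resampled arrays are h×w times larger than the inputs, which is the memory cost of trading overlapping windows for independent pixels.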

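3 Comprehensive performance experiments and comparative analysis

The source code of the parallelized CGCNN is available online (https://github.com/qichaoliu/Content_Guided_CNN), and three datasets are used to verify its effectiveness and reproducibility:

1) Synthetic dataset. This dataset is taken from Cao et al. (2018). It contains many small land-cover objects with complex and varied edges, so it reflects how well an algorithm preserves land-cover details during classification. The dataset is available online (https://github.com/qichaoliu/CNN-MRF-v1).

2) Indian Pines dataset. The dataset is available online (http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes).

3) Loukia dataset. The dataset is available online (http://www2.isprs.org/commissions/comm3/wg4/HyRANK.html).

For the synthetic dataset, 0.5% and 0.1% of the samples of each class are randomly selected as the training and validation sets, respectively, and the remaining samples form the test set. For Indian Pines and Loukia, 5% and 1% of the samples of each class are randomly selected as the training and validation sets, respectively, and the remaining samples form the test set.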

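Six deep learning-based HSI classification methods are used for comparison:

1) 2D-CNN. A deep network with 2D convolutions (Makantasis et al., 2015);

2) Two-channel CNN (TC-CNN). A two-channel deep network that combines a 1D-CNN and a 2D-CNN (Yang et al., 2017);

3) 3D-CNN. A deep network with 3D convolutions (Li et al., 2017);

4) Multi-channel CNN (MC-CNN). A multi-channel deep network that combines a 1D-CNN, a 2D-CNN and a 3D-CNN (Chen et al., 2018);

5) Spectral-spatial residual network (SSRN). A deep spectral-spatial residual network that alternates 1D and 3D convolutions (Zhong et al., 2018);

6) Hybrid spectral network (HybridSN). A deep network that alternates 1D, 2D and 3D convolutions (Roy et al., 2020).

Overall accuracy (OA), average accuracy (AA) and the Kappa statistic are used to evaluate classification performance. Each experiment is repeated five times, and the mean and standard deviation are reported.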
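For reference, the three metrics can be computed from a confusion matrix with their standard definitions, as in the sketch below. The function name and the toy matrix are assumptions for illustration and are not taken from the experiments.

```python
def classification_metrics(conf):
    """conf[t][p] = number of test pixels of true class t predicted as class p."""
    k = len(conf)
    n = sum(sum(row) for row in conf)
    diag = [conf[c][c] for c in range(k)]
    row_tot = [sum(conf[c]) for c in range(k)]
    col_tot = [sum(conf[r][c] for r in range(k)) for c in range(k)]
    oa = sum(diag) / n                                    # overall accuracy
    aa = sum(d / r for d, r in zip(diag, row_tot)) / k    # mean per-class accuracy
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

conf = [[45, 5],
        [10, 40]]
oa, aa, kappa = classification_metrics(conf)
print(round(oa, 4), round(aa, 4), round(kappa, 4))
```

OA weights every pixel equally, whereas AA weights every class equally, which is why AA is the more sensitive of the two to errors on small classes.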

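3.1 Classification accuracy analysis

As Table 1 shows, the CGCNN clearly outperforms the compared methods on the synthetic dataset. 2D-CNN, TC-CNN, SSRN and HybridSN suffer from severe misclassification of pixels near cross-class edges, so their classification maps are over-smoothed and their accuracy is lower. This result indicates that 2D-CNN can model flat regions well but has difficulty describing other structures such as isolated points and cross-class edges, whereas the CGCNN adapts to the spatial content of different land covers and captures the structural patterns of different regions, and thus achieves satisfactory classification performance in cross-class regions. The classification maps of the different methods are shown in Fig. 3.

Table 1 Classification accuracies of different methods on the synthetic dataset (%)
Class	2D-CNN	TC-CNN	3D-CNN	MC-CNN	SSRN	HybridSN	CGCNN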
1	90.92±1.13	87.88±1.29	98.24±0.34	94.55±2.72	89.76±2.74	85.58±3.31	99.43±0.53
2	93.15±3.64	88.56±3.79	97.67±0.78	93.27±3.14	91.27±2.93	84.80±3.35	99.54±0.61
3	89.43±5.03	83.54±9.34	93.42±2.26	93.29±4.45	87.08±5.02	80.92±6.67	95.13±1.83
4	92.24±1.85	89.90±3.11	97.55±0.43	94.77±1.34	91.03±2.33	86.18±2.88	98.08±0.84
5	93.56±1.61	90.99±1.84	99.44±0.17	96.28±1.68	92.47±1.40	87.68±2.38	99.54±0.41
OA/%	92.33±0.55	89.07±0.62	98.01±0.34	94.70±1.36	90.88±0.59	85.68±1.23	98.88±0.35
AA/%	91.86±0.74	88.17±1.87	97.27±0.56	94.43±1.25	90.32±1.06	85.03±1.55	98.34±0.26
Kappa (×100)	90.01±0.72	85.74±0.82	97.41±0.44	93.09±1.78	88.13±0.76	81.36±1.61	98.55±0.45
Note: bold indicates the best value in each row.

Fig. 3 Classification maps of different methods on the synthetic dataset

((a) false-color image; (b) ground truth; (c) 2D-CNN; (d) TC-CNN; (e) 3D-CNN; (f) MC-CNN; (g) SSRN; (h) HybridSN; (i) CGCNN)

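For the Indian Pines dataset, the CGCNN again achieves the best results among the compared methods; the classification results are shown in Fig. 4 and Table 2. The noise in Indian Pines leads to large intra-class distances, so more neighborhood information is needed to suppress its influence. In this experiment, the sensitivity parameter $\sigma $ of the CGCNN is automatically updated to about 0.5, larger than its value on the synthetic dataset (about 0.2). This result indicates that the CGCNN favors piecewise-smooth classification results, which keeps the results highly consistent with the ground truth. In addition, owing to the class-weighted loss function, the AA of the CGCNN is higher than that of all compared methods.

Fig. 4 Classification maps of different methods on Indian Pines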

((a) false-color image; (b) ground truth; (c) 2D-CNN; (d) TC-CNN; (e) 3D-CNN; (f) MC-CNN; (g) SSRN; (h) HybridSN; (i) CGCNN)

Table 2 Classification accuracies of different methods on Indian Pines (%)
Class	2D-CNN	TC-CNN	3D-CNN	MC-CNN	SSRN	HybridSN	CGCNN
1	100±0	94±12	97.44±3.14	88.56±15.69	93.15±13.68	98.70±1.71	91.19±8.54
2	84.48±2.43	88.79±1.56	84.29±3.65	83.77±2.43	92.49±5.46	94.07±2.72	96.95±0.96
3	77.28±2.32	88.52±4.13	79.66±5.60	83.92±2.98	95.47±3.13	93.97±3.18	97.47±2.21
4	90.57±5.39	90.30±6.61	89.65±3.16	88.43±3.84	93.09±6.03	94.49±3.35	95.35±3.51
5	96.08±1.42	94.23±3.00	94.09±1.63	95.36±0.86	96.91±1.57	94.35±3.19	96.05±3.10
6	94.43±3.60	94.35±1.66	93.76±2.14	93.30±2.84	97.28±1.33	96.25±1.54	99.60±0.38
7	100±0	90.90±2.62	98.00±4.00	96.91±3.94	60±48.98	88.73±7.48	96.66±3.11
8	93.81±1.98	97.18±1.79	93.43±0.20	94.03±1.79	95.97±1.51	99.14±0.72	99.73±0.42
9	60±48.98	100±0	100±0	98.18±3.63	0±0	71.95±23.22	100±0
10	87.15±1.55	86.72±2.54	83.55±5.37	85.19±3.07	88.81±5.62	96.63±1.87	96.80±1.56
11	85.84±1.53	91.67±1.94	84.60±2.13	86.73±2.81	94.93±3.94	95.27±1.71	97.44±1.11
12	79.52±10.14	85.54±4.81	76.28±3.37	80.88±1.78	87.70±7.43	91.62±3.42	97.89±1.42
13	97.84±1.28	97.59±2.17	96.74±1.50	97.55±0.99	99.11±1.32	96.96±3.04	99.49±0.31
14	94.39±2.42	95.92±2.32	94.26±2.75	95.12±1.56	97.34±1.54	97.98±1.81	99.26±0.27
15	86.95±4.31	90.11±4.11	90.17±1.44	86.70±4.64	98.84±1.08	94.13±3.69	98.27±3.45
16	91.98±4.05	84.49±4.84	93.46±1.67	92.05±5.85	93.95±4.63	88.97±10.87	99.09±0.84
OA/%	87.70±0.72	91.20±0.66	87.02±2.03	88.21±0.94	94.07±1.53	95.20±0.79	97.78±0.19
AA/%	88.77±3.95	91.89±1.59	90.59±1.51	90.42±1.30	86.56±4.55	93.33±1.76	97.58±0.73
Kappa (×100)	85.93±0.82	89.96±0.76	85.15±2.33	86.53±1.09	93.25±1.72	94.53±0.91	97.47±0.22
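Note: bold indicates the best value in each row.

For the Loukia dataset, the CGCNN also achieves satisfactory classification accuracy, as shown in Table 3. Because this dataset is large and contains few small-object regions, the CGCNN automatically updates the sensitivity parameter $\sigma $ to about 0.6, slightly larger than on the two datasets above. This indicates that the algorithm uses more neighborhood information during classification and preserves less of the fine local structure of land covers, in order to obtain a piecewise-smooth classification result. The classification maps of the different methods are shown in Fig. 5.

Table 3 Classification accuracies of different methods on Loukia (%)
Class	2D-CNN	TC-CNN	3D-CNN	MC-CNN	SSRN	HybridSN	CGCNN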
1	79.63±8.63	71.2±2.69	73.14±4.41	77.47±4.54	77.18±10.68	80.35±5.42	79.20±3.88
2	87.76±9.70	97.56±2.77	92.85±7.02	95.33±2.48	96.10±3.45	96.82±2.85	95.55±8.88
3	79.11±3.71	82.25±2.43	77.79±5.51	82.43±3.78	89.26±9.36	84.48±1.39	94.11±0.53
4	89.09±21.81	80.04±17.45	69.77±19.01	50.47±15.01	74.08±18.06	65.26±10.06	52.62±12.83
5	84.77±3.14	84.95±2.15	86.89±0.91	84.03±1.51	89.60±3.62	81.85±2.09	91.45±4.94
6	74.69±4.07	57.82±10.05	70.56±6.79	62.10±4.50	79.09±7.73	70.44±15.16	71.47±7.14
7	84.61±4.61	80.27±5.57	82.2±4.70	79.43±4.06	87.11±9.67	83.99±6.58	75.40±5.21
8	68.74±5.63	72.56±2.88	74.75±3.78	70.39±2.42	78.55±4.93	79.95±2.03	83.64±2.56
9	78.03±1.13	78.31±1.51	79.22±1.38	77.51±1.50	79.84±4.09	78.91±2.37	80.55±5.32
10	83.80±2.07	82.95±1.78	81.69±0.63	83.79±1.32	84.91±2.45	80.58±2.51	84.32±2.41
11	79.20±5.66	84.37±4.93	84.91±5.15	84.75±7.31	91.05±3.86	94.39±2.95	89.07±8.04
12	89.49±3.19	92.87±2.20	89.66±4.69	90.76±2.03	92.64±4.22	94.94±1.70	94.34±1.90
13	99.98±0.03	99.93±0.12	99.92±0.09	99.96±0.06	99.87±0.14	99.93±0.08	100±0
14	99.81±0.17	99.16±0.86	99.62±0.63	99.67±0.24	98.54±2.68	98.31±1.77	100±0
OA/%	82.89±1.39	82.97±0.62	83.33±0.77	82.68±0.49	85.49±0.47	83.64±0.98	86.20±2.07
AA/%	84.19±2.98	83.16±0.94	83.07±0.94	81.29±1.34	86.99±2.48	85.02±2.04	85.13±1.74
Kappa (×100)	79.52±1.65	80.07±0.95	80.08±0.96	79.31±0.62	82.64±0.61	80.41±1.21	83.71±2.37
Note: bold indicates the best value in each row.

Fig. 5 Classification maps of different methods on Loukia

((a) false-color image; (b) ground truth; (c) 2D-CNN; (d) TC-CNN; (e) 3D-CNN; (f) MC-CNN; (g) SSRN; (h) HybridSN; (i) CGCNN)

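3.2 Effect of the sensitivity parameter on classification results

In the CGCNN, the sensitivity parameter $\sigma $ is learned automatically from the samples and affects how fine the classification results are. Here $\sigma $ is set to different fixed values and its effect on the classification results is analyzed. Specifically, a larger sensitivity parameter leads to smoother classification results, whereas a smaller one leads to finer results. For each dataset, $\sigma $ is fixed to 0.1 and to 10, and the results are compared with those of the self-learned $\sigma $. To show the effect of different $\sigma $ values, the number of training samples in this section is twice that in Section 3.1. The classification maps are shown in Fig. 6 and the OA values are summarized in Table 4.

Fig. 6 Classification maps of CGCNN with different $\sigma $

Table 4 Accuracies (OA) of CGCNN with different $\sigma $ (%)
Dataset	$\sigma $=0.1	Self-learned $\sigma $	$\sigma $=10
Synthetic	99.76	99.29	95.28
Indian Pines	98.21	99.17	99.02
Note: bold indicates the best value in each row.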

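As the second row of Fig. 6 shows, on the Indian Pines dataset the classification details become richer as $\sigma $ decreases. However, overly rich details introduce some noise or outliers, which lowers accuracy slightly. An overly large $\sigma $ makes the Indian Pines classification map smoother and blurs land-cover edges. On the synthetic dataset, as the first row of Fig. 6 shows, a smaller $\sigma $ yields higher accuracy, because this dataset contains little noise and many small objects, so finer classification brings higher accuracy. When $\sigma $ is large, the classification map is clearly over-smoothed and many details are lost. The self-learned $\sigma $ balances noise against detail and thus obtains satisfactory results.

3.3 Small-sample learning performance

In practical remote sensing applications, labeling the land-cover class of every pixel is expensive, so training samples are usually limited. To study the classification performance of the CGCNN with small samples, this section tests the OA of different methods with training sets of different sizes. Specifically, 5, 10, 15 and 20 samples are randomly drawn from each class as the training set and 1 sample as the validation set. If a class does not have enough samples, the maximum available number is used for training. The experiment is repeated five times and the mean OA is used as the evaluation metric. The results are shown in Fig. 7.

Fig. 7 Classification accuracies of different methods under different scales of training samples

((a) synthetic dataset; (b) Indian Pines dataset)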

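The CGCNN shows satisfactory classification performance for every number of training samples and outperforms all compared methods. On the synthetic dataset in particular, the CGCNN reaches a classification accuracy of nearly 95% with only 5 samples per class, which indicates good small-sample learning ability. The results also indicate that the advantage is more pronounced when the HSI contains many small objects.

3.4 Parallel performance analysis

To evaluate the speedup of the proposed parallel algorithm for content-guided convolution over a serial implementation, their running times are compared for inputs of different sizes. The serial version computes and outputs each sampling position in turn on the CPU; the parallel version is the proposed parallel procedure running on the GPU.

In the experiments, the numbers of channels of the input feature map, guidance map and output feature map are fixed to 128, 3 and 1, respectively. To test the method under different conditions, two groups of experiments are designed: a comparison of speedups for different input sizes and a comparison for different kernel sizes. In the first group, the spatial size of the input is set to 100×100, 200×200, 300×300, 400×400 and 500×500 pixels in turn, with the kernel size fixed to 3×3. In the second group, the kernel size is set to 3×3, 5×5, 7×7, 9×9 and 11×11 in turn, with the spatial size of the input fixed to 100×100 pixels. Each experiment is run 10 times, and the mean running time and speedup ratio are used as evaluation criteria. The results are summarized in Tables 5 and 6.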

Table 5 Running time and acceleration ratios of serial and parallel versions of content-guided convolution with different input sizes
Metric	100×100	200×200	300×300	400×400	500×500
CPU time/s	10.096	39.899	94.578	165.850	279.505
GPU time/s	0.066	0.069	0.156	0.222	0.310
Speedup ratio	152.97	578.24	606.27	747.07	901.63

Table 6 Running time and acceleration ratios of serial and parallel versions of content-guided convolution with different kernel sizes
Metric	3×3	5×5	7×7	9×9	11×11
CPU time/s	10.096	31.680	60.449	98.865	143.125
GPU time/s	0.066	0.066	0.070	0.086	0.108
Speedup ratio	152.97	480.00	863.56	1 149.59	1 325.23

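The results show that the parallel version of content-guided convolution is substantially faster than its serial version. Specifically, the mean speedup ratio is 597 across input sizes and 794 across kernel sizes, giving an overall speedup ratio of 695, and the speedup ratio rises steadily as the scale of computation grows. This shows that the proposed parallel algorithm makes effective use of the GPU and demonstrates the effectiveness of the method.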
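The reported averages can be reproduced from the running times in Tables 5 and 6. The short sketch below recomputes them; the variable names are illustrative, and the overall figure is taken here as the mean of the two group means, which is an assumption about how the summary value was formed.

```python
cpu_by_size = [10.096, 39.899, 94.578, 165.850, 279.505]     # Table 5, CPU time/s
gpu_by_size = [0.066, 0.069, 0.156, 0.222, 0.310]            # Table 5, GPU time/s
cpu_by_kernel = [10.096, 31.680, 60.449, 98.865, 143.125]    # Table 6, CPU time/s
gpu_by_kernel = [0.066, 0.066, 0.070, 0.086, 0.108]          # Table 6, GPU time/s

def speedups(cpu, gpu):
    return [c / g for c, g in zip(cpu, gpu)]

def mean(values):
    return sum(values) / len(values)

mean_size = mean(speedups(cpu_by_size, gpu_by_size))
mean_kernel = mean(speedups(cpu_by_kernel, gpu_by_kernel))
overall = (mean_size + mean_kernel) / 2
print(round(mean_size, 2), round(mean_kernel, 2), round(overall, 2))
```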

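4 Conclusion

Thanks to the content-guided convolution algorithm, the CGCNN can classify HSIs more finely. However, the non-template structure of content-guided convolution is not directly supported by existing deep learning acceleration libraries, which makes the CGCNN built on it difficult to train and apply. By analyzing the special structure of content-guided convolution, this paper splits it into two steps and implements a parallel version through the low-level acceleration interfaces provided by deep learning acceleration libraries. Experimental results show that the proposed acceleration algorithm delivers a large speedup over the serial implementation, so that the CGCNN meets the runtime efficiency required in practical applications. In addition, comprehensive tests on multiple datasets, covering classification accuracy, parameter sensitivity and small-sample learning, demonstrate the effectiveness of the CGCNN, and the accuracy of the algorithm is stable and reproducible. However, because the proposed parallel algorithm resamples the overlapping windows of the feature map, its video memory usage increases. Future work may use the compute unified device architecture (CUDA) to design and write lower-level acceleration algorithms and connect them to upper-level applications through the CUDA interfaces provided by deep learning acceleration libraries, in order to reduce video memory usage further and improve computational efficiency.

References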

Cao X Y, Zhou F, Xu L, Meng D Y, Xu Z B, Paisley J. 2018. Hyperspectral image classification with Markov random fields and a convolutional neural network. IEEE Transactions on Image Processing, 27(5): 2354-2367 [DOI:10.1109/TIP.2018.2799324]

Chen C, Zhang J J, Zheng C H, Yan Q and Xun L N. 2018. Classification of hyperspectral data using a multi-channel convolutional neural network//Proceedings of International Conference on Intelligent Computing. Wuhan, China: Springer: 81-92[DOI: 10.1007/978-3-319-95957-3_10]

Chen Y, Nasrabadi N M, Tran T D. 2011. Hyperspectral image classification using dictionary-based sparse representation. IEEE Transactions on Geoscience and Remote Sensing, 49(10): 3973-3985 [DOI:10.1109/TGRS.2011.2129595]

Chen Y S, Lin Z H, Zhao X, Wang G, Gu Y F. 2014. Deep learning-based classification of hyperspectral data. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(6): 2094-2107 [DOI:10.1109/JSTARS.2014.2329330]

Chen Z, Wang B, Zhang L M. 2013. Dimensionality reduction and classification based on lower rank tensor analysis for hyperspectral imagery. Journal of Infrared and Millimeter Waves, 32(6): 569-575 (陈昭, 王斌, 张立明. 2013. 基于低秩张量分析的高光谱图像降维与分类. 红外与毫米波学报, 32(6): 569-575) [DOI:10.3724/SP.J.1010.2013.00569]

Fang B, Li Y, Zhang H K, Chan J C W. 2019. Hyperspectral images classification based on dense convolutional networks with spectral-wise attention mechanism. Remote Sensing, 11(2): #159 [DOI:10.3390/rs11020159]

Fernandez-Redondo M, Hernandez-Espinosa C and Torres-Sospedra J. 2004. Hyperspectral image classification by ensembles of multilayer feedforward networks//Proceedings of 2004 IEEE International Joint Conference on Neural Networks. Budapest, Hungary: IEEE: 1145-1149[DOI: 10.1109/IJCNN.2004.1380097]

Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, Andreetto M and Adam H. 2017. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. https://arxiv.org/pdf/1704.04861.pdf

Hu W, Huang Y Y, Li W, Zhang F, Li H C. 2015. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors, 2015: #258619 [DOI:10.1155/2015/258619]

Huang G, Liu Z, van der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 2261-2269[DOI: 10.1109/CVPR.2017.243]

Li R, Zheng S Y, Duan C X, Yang Y, Wang X Q. 2020. Classification of hyperspectral image based on double-branch dual-attention mechanism network. Remote Sensing, 12(3): #582 [DOI:10.3390/rs12030582]

Li T, Sun J G, Zhang X J, Wang X. 2016. Spectral-spatial joint classification method of hyperspectral remote sensing image. Chinese Journal of Scientific Instrument, 37(6): 1379-1389 (李铁, 孙劲光, 张新君, 王星. 2016. 高光谱遥感图像空谱联合分类方法研究. 仪器仪表学报, 37(6): 1379-1389) [DOI:10.3969/j.issn.0254-3087.2016.06.023]

Li Y, Zhang H K, Shen Q. 2017. Spectral-spatial classification of hyperspectral imagery with 3D convolutional neural network. Remote Sensing, 9(1): #67 [DOI:10.3390/rs9010067]

Liu Q C, Xiao L, Liu F, Xu J H. 2020. SSCDenseNet: a spectral-spatial convolutional dense network for hyperspectral image classification. Acta Electronica Sinica, 48(4): 751-762 (刘启超, 肖亮, 刘芳, 徐金环. 2020. SSCDenseNet: 一种空-谱卷积稠密网络的高光谱图像分类算法. 电子学报, 48(4): 751-762) [DOI:10.3969/j.issn.0372-2112.2020.04.017]

Liu Q C, Xiao L, Yang J X, Chan J C W. 2020. Content-guided convolutional neural network for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 58(9): 6124-6137 [DOI:10.1109/TGRS.2020.2974134]

Makantasis K, Karantzalos K, Doulamis A and Doulamis N. 2015. Deep supervised learning for hyperspectral data classification through convolutional neural networks//2015 IEEE International Geoscience and Remote Sensing Symposium. Milan, Italy: IEEE: 4959-4962[DOI: 10.1109/IGARSS.2015.7326945]

Mei X G, Pan E T, Ma Y, Dai X B, Huang J, Fan F, Du Q L, Zheng H, Ma J Y. 2019. Spectral-spatial attention networks for hyperspectral image classification. Remote Sensing, 11(8): #963 [DOI:10.3390/rs11080963]

Mou L C, Ghamisi P, Zhu X X. 2017. Deep recurrent neural networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 55(7): 3639-3655 [DOI:10.1109/TGRS.2016.2636241]

Qin A Y, Shang Z W, Tian J Y, Wang Y L, Zhang T P, Tang Y Y. 2019. Spectral-spatial graph convolutional networks for semisupervised hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters, 16(2): 241-245 [DOI:10.1109/LGRS.2018.2869563]

Roy S K, Krishna G, Dubey S R, Chaudhuri B B. 2020. HybridSN: exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters, 17(2): 277-281 [DOI:10.1109/LGRS.2019.2918719]

Wan S, Gong C, Zhong P, Du B, Zhang L F, Yang J. 2020. Multiscale dynamic graph convolutional network for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 58(5): 3162-3177 [DOI:10.1109/TGRS.2019.2949180]

Wang L G, Wei F J. 2013. Band selection for hyperspectral imagery based on combination of genetic algorithm and ant colony algorithm. Journal of Image and Graphics, 18(2): 235-242 (王立国, 魏芳洁. 2013. 结合遗传算法和蚁群算法的高光谱图像波段选择. 中国图象图形学报, 18(2): 235-242) [DOI:10.11834/jig.20130216]

Wang W J, Dou S G, Jiang Z M, Sun L J. 2018. A fast dense spectral-spatial convolution network framework for hyperspectral images classification. Remote Sensing, 10(7): #1068 [DOI:10.3390/rs10071068]

Yang J X, Zhao Y Q, Chan J C W. 2017. Learning and transferring deep joint spectral-spatial features for hyperspectral classification. IEEE Transactions on Geoscience and Remote Sensing, 55(8): 4729-4742 [DOI:10.1109/TGRS.2017.2698503]

Zhang H K, Li Y, Jiang Y N. 2018. Deep learning for hyperspectral imagery classification: the state of the art and prospects. Acta Automatica Sinica, 44(6): 961-977 (张号逵, 李映, 姜晔楠. 2018. 深度学习在高光谱图像分类领域的研究现状与展望. 自动化学报, 44(6): 961-977) [DOI:10.16383/j.aas.2018.c170190]

Zhong Z L, Li J, Luo Z M, Chapman M. 2018. Spectral-spatial residual network for hyperspectral image classification: a 3-D deep learning framework. IEEE Transactions on Geoscience and Remote Sensing, 56(2): 847-858 [DOI:10.1109/TGRS.2017.2755542]