Review of spatial-spectral feature extraction for hyperspectral image

Ye Zhen1, Bai Lin1, He Mingyi2 (1. School of Electronics and Control Engineering, Chang'an University, Xi'an 710064, China; 2. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, China)

Abstract

Hyperspectral imaging spectrometers simultaneously collect radiation from the ground in many adjacent, overlapping narrow spectral bands. A hyperspectral image (HSI) usually has hundreds of bands, each recording the reflected light within a specific range of the electromagnetic spectrum. Thus, an HSI contains a wealth of spectral and radiometric information. Advances in remote sensing imaging technology have also increased the spatial resolution of the HSI data acquired by these spectrometers. Therefore, HSI can be used to classify ground objects accurately in various fields, such as geological exploration, precision agriculture, ecological environment, scientific remote sensing, and military target detection. However, classification faces many challenges because HSI data are large, have many bands, and exhibit strong correlation between bands. Specifically, the dimensionality of HSI data often exceeds the number of available training samples. The lack of training samples and the high computational cost are unavoidable obstacles in practical classification applications. Dimensionality reduction methods are often used to project HSI data into low-dimensional feature spaces to avoid the Hughes phenomenon. Spatial information can help create a more accurate classification map because adjacent pixels are highly likely to belong to the same category. In recent years, an increasing number of studies have combined spatial and spectral information to further improve classification accuracy. According to the characteristics of the spatial information, spectral information, and classifiers and the way they are combined, spatial-spectral feature extraction methods for HSI can be divided into three types: spatial texture and morphological feature extraction, spatial neighborhood information acquisition, and spatial information post-processing.
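
As an illustration of the dimensionality reduction step mentioned above, the following minimal sketch projects a hyperspectral cube onto a few principal components. Principal component analysis is used here only as one common choice; the abstract does not name a specific method. The function name `pca_reduce`, the synthetic cube, and all sizes are assumptions made for the example.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project an (H, W, B) hyperspectral cube onto its top principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)   # one row per pixel spectrum
    centered = pixels - pixels.mean(axis=0)            # remove the per-band mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    return reduced.reshape(h, w, n_components)

# Synthetic stand-in for a real scene (assumption): 20 x 20 pixels and
# 100 strongly correlated bands generated from 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 20, 3))
mixing = rng.normal(size=(3, 100))
cube = latent @ mixing + 0.01 * rng.normal(size=(20, 20, 100))

reduced = pca_reduce(cube, n_components=5)
print(reduced.shape)  # (20, 20, 5)
```

Because the synthetic bands are strongly correlated, a handful of components retains almost all of the variance, which mirrors the band redundancy that motivates dimensionality reduction for HSI.
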
In the first type, spatial texture or morphological features (e.g., Gabor, local binary pattern, and morphological attribute features) are extracted in advance to preprocess the spatial information of pixels. In other words, spatial features are extracted through predefined structures and rules, and the resulting features are then sent to the classifier. The second type incorporates the relationship between a pixel and its spatial neighbors directly into the classifier. Spatial-spectral information is built into the classification model through mathematical expressions (e.g., sparse or collaborative representation with joint spatial information, and kernel-based spatial information extraction and classification). As a result, feature extraction and classification are completed simultaneously. In the third type, the spectral features are classified first, and the classification results are then corrected through spatial information post-processing (e.g., random fields, bilateral filtering, and graph segmentation) to further improve classification accuracy. Traditional spatial-spectral feature extraction methods for HSI have low computational cost, a sound and interpretable mathematical foundation, and strong robustness to noise. However, most of them rely on manually designed shallow feature extraction schemes, which require considerable expert experience and parameter tuning and thus limit their capacity for feature representation and learning. In HSI data, scattering from other objects can distort the spectral properties of the object of interest. In addition, varying atmospheric scattering conditions and intra-class variability make it difficult for traditional methods to extract spatial-spectral features.
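
The third type described above (classify spectrally, then correct the result with spatial information) can be sketched with a simple majority-vote filter over the label map. This filter is a deliberately simpler stand-in for the random field, bilateral filtering, and graph segmentation methods named in the text, not an implementation of any of them. The function name `majority_filter` and the toy label map are assumptions.

```python
import numpy as np

def majority_filter(labels, n_classes, radius=1):
    """Replace each pixel's label with the most frequent label in its window."""
    h, w = labels.shape
    padded = np.pad(labels, radius, mode="edge")       # repeat border labels
    votes = np.zeros((n_classes, h, w), dtype=np.int64)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            window = padded[dy:dy + h, dx:dx + w]      # label map shifted by (dy, dx)
            for c in range(n_classes):
                votes[c] += (window == c)              # count votes for class c
    return votes.argmax(axis=0)

# Hypothetical pixel-wise classification result: two regions, each
# containing one isolated misclassified pixel.
labels = np.zeros((6, 6), dtype=np.int64)
labels[:, 3:] = 1
labels[1, 1] = 1   # isolated error inside the class-0 region
labels[4, 4] = 0   # isolated error inside the class-1 region

smoothed = majority_filter(labels, n_classes=2)
print(smoothed)
```

In this toy case both isolated errors are voted away while the boundary between the two regions is preserved, which is the effect spatial post-processing aims for when adjacent pixels tend to share a class.
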
Deep neural networks have many advantages: they learn representative and discriminative features, improve information representation through their deep structure, and extract and represent features automatically. Thus, a well-designed deep network structure can achieve higher HSI classification accuracy. In this study, deep-learning-based spatial-spectral feature extraction is reviewed and analyzed from the perspectives of convolutional neural networks (CNNs), graph neural networks (GNNs), and multi-source, cross-scene models. A CNN shares weights and uses local connections to extract spatial information effectively. However, because it convolves regular square regions of fixed size with fixed weights, a CNN generally cannot adapt to local regions with varied object distributions and geometric appearances. A GNN, by contrast, can represent many potential relationships among data as graphs and can therefore be applied to spatial-spectral feature extraction and classification for HSI. In some scenes (e.g., complex urban scenes), different ground objects composed of the same material need to be distinguished by shape, elevation, texture, and other information. Light detection and ranging (LiDAR) data describe the elevation of the scene and the height of objects and provide spatial context and structural information that is unaffected by time and weather. Therefore, multi-sensor data can be used to build a joint feature space for more accurate classification in such scenes. In recent years, spatial-spectral feature extraction techniques for HSI have progressed greatly and achieved satisfactory results. However, the following problems remain to be addressed. 1) Traditional and deep spatial-spectral feature extraction methods can be combined to fully exploit their respective advantages.
2) The small-sample-size and over-fitting problems of deep neural networks need to be overcome by designing semi-supervised, active, or self-supervised learning models. 3) Training a GNN on HSI data leads to high computational cost and large memory usage. Thus, models that reduce computational complexity should be studied. 4) Combining multi-source data from different sensors requires a reasonable way to unify the features of the different sources and express them in a complementary manner. 5) Using multi-temporal, hyperspectral, and multi-perspective information to simultaneously mine joint spatial-spectral-temporal features of complex dynamic targets has become a new frontier. 6) With the progress of space remote sensing technologies in China, domestic hyperspectral data will receive increasing research attention. 7) In line with the development of big data and machine intelligence, spatial-spectral feature extraction and classification of hyperspectral images that combine application domain knowledge with hyperspectral data will become a hot research topic. This study systematically organizes and comprehensively summarizes the state of research on both traditional and deep spatial-spectral feature extraction. Existing problems are analyzed, and future development trends are discussed.
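
To make problem 3) above concrete, the following sketch builds a dense k-nearest-neighbor graph over pixel spectra and applies one graph-convolution step. The symmetric normalization with self-loops is one widely used formulation and is an assumption here, as the abstract does not specify a GNN variant. The function names, the random features, and all sizes are likewise hypothetical.

```python
import numpy as np

def knn_adjacency(features, k):
    """Dense symmetric k-nearest-neighbor adjacency over N pixel feature vectors."""
    sq = (features ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T  # N x N squared distances
    np.fill_diagonal(dist, np.inf)                                   # exclude self-matches
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros_like(dist)
    rows = np.repeat(np.arange(len(features)), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)                                    # make edges undirected

def gcn_layer(adj, x, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])                # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

# Random stand-in data (assumption): 200 pixels with 30 spectral features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 30))
weight = rng.normal(size=(30, 16))

adj = knn_adjacency(x, k=5)
out = gcn_layer(adj, x, weight)
print(out.shape, adj.nbytes)  # (200, 16) 320000
```

The dense adjacency matrix grows quadratically with the number of pixels: 200 pixels need 320,000 bytes in float64, whereas a 145 x 145 scene (N = 21,025) would need about 3.5 GB for the same matrix. This is one reason sparse graphs, superpixel nodes, or mini-batch training are natural directions for reducing the cost.
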