Hyperspectral image denoising method based on Transformer and channel-mixing parallel convolution (ChinaMM2023 recommended paper)
Hu Shuai, Gao Feng, Gong Zhuoran, Tao Shengen, Shangguan Xinyu, Dong Junyu (Ocean University of China)
Abstract
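Objective Because of the imaging equipment itself and external environmental factors, hyperspectral images are easily contaminated by noise, which reduces image visibility and analysis accuracy. Hyperspectral image denoising has therefore become a research focus in remote sensing image processing, both in China and internationally. Current hyperspectral image denoising methods face two main difficulties: 1) insufficient use of the global information in features, because methods based on convolutional neural networks (CNNs) are limited by the convolution kernel size and struggle to capture global feature information; 2) structural differences between CNNs and Transformers, which make the two difficult to fuse. Some Transformer-based hyperspectral image restoration methods have appeared recently, but because of these structural differences a reasonable feature interaction scheme is needed to balance local and global feature extraction. Method To address these problems, this paper proposes a hyperspectral image denoising model based on Transformer and channel-mixing parallel convolution. It consists of three modules: a channel-mixing feature extraction module, a block-downsampling-based global enhancement module, and an adaptive bidirectional feature fusion module. Through the interaction of these three modules, the model fully combines global and local feature information, handles noise and texture differences across regions, and effectively improves the recovery of spatial detail. Result The method was compared with five mainstream methods on two datasets. On the Pavia dataset under Gaussian noise of different intensities, PSNR improved by up to 0.4 dB over the second-best model; on the ICVL dataset under various mixed-noise settings, PSNR improved by up to 2.18 dB over the second-best model. Visualizations of the denoised images also demonstrate the strong performance of the proposed model. Conclusion The proposed method denoises well under various noise conditions, significantly outperforms current mainstream methods, and effectively removes noise from hyperspectral images while retaining rich texture information.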
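To put the PSNR gains quoted in the abstract above into perspective, the following is a minimal sketch of the standard PSNR definition and of how a gain in dB translates into a reduction in mean squared error. The function names `psnr` and `mse_ratio_for_gain` are illustrative, and the assumption that pixel values are scaled to a peak of 1.0 is ours; the abstract does not state the authors' evaluation code or value range. Pure Python is used to keep the sketch dependency-free; an array library would be the usual choice in practice.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized sequences.

    `peak` is the maximum possible pixel value (assumed 1.0 here).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """MSE of the better method divided by MSE of the worse one, for a PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)

# Toy example: every sample is off by 0.1, so the MSE is about 0.01.
clean = [0.2, 0.4, 0.6, 0.8]
noisy = [0.3, 0.3, 0.7, 0.7]
print(round(psnr(clean, noisy), 2))        # about 20 dB
print(round(mse_ratio_for_gain(0.4), 3))   # MSE ratio for a 0.4 dB gain
print(round(mse_ratio_for_gain(2.18), 3))  # MSE ratio for a 2.18 dB gain
```

Under this definition, and with the same peak value for both methods, a 0.4 dB gain corresponds to roughly 9% lower mean squared error, and a 2.18 dB gain to roughly 39% lower.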
Keywords
Hyperspectral image denoising method based on Transformer and channel-mixing parallel convolution
Hu Shuai, Gao Feng, Gong Zhuoran, Tao Shengen, Shangguan Xinyu, Dong Junyu (Ocean University of China)
Abstract
Objective With the increasing availability and advancement of hyperspectral imaging technology, hyperspectral images have become an invaluable resource in fields such as agriculture, environmental monitoring, and remote sensing. However, these images are prone to noise contamination, which can significantly degrade their quality and hinder accurate analysis and interpretation. As a result, hyperspectral image denoising has become a crucial task in remote sensing image processing and has attracted significant attention from researchers worldwide. The challenges are multifaceted. First, the inherent characteristics of hyperspectral data, such as high dimensionality and complex spectral information, pose significant difficulties for traditional denoising approaches. Noise can obscure valuable information embedded within the spectral bands, so advanced denoising techniques are needed that restore the original signal while preserving rich texture and spatial details. Second, deep learning techniques, particularly convolutional neural networks (CNNs), have revolutionized image processing, including denoising, and CNN-based approaches have shown promising results on various types of images. On hyperspectral data, however, conventional CNN architectures are limited in capturing the global contextual information necessary for accurate denoising. Their fixed-size receptive fields restrict their ability to exploit the spatial and spectral correlations present in hyperspectral images, which reduces overall denoising performance.
To overcome these limitations, recent research has explored bringing Transformers, originally designed for natural language processing, into computer vision tasks, including hyperspectral image denoising. Transformers can capture long-range dependencies and global contextual information, making them an attractive alternative to CNNs for denoising. However, directly applying Transformer-based models to hyperspectral data requires careful consideration of the unique characteristics of hyperspectral images. Method In this study, we propose a novel denoising model for hyperspectral images that combines the strengths of Transformers and parallel convolution operations. Our model comprises three key modules: a channel shuffling module, a block-downsampling global enhancement module, and an adaptive bidirectional feature fusion module. These modules work together to address the challenges of hyperspectral image denoising. The channel shuffling module exploits the inter-channel relationships within hyperspectral data by incorporating channel-mixing operations. By fusing information across different spectral channels, the module enhances the representation power of the network and enables more comprehensive feature extraction. This approach addresses the limitation of traditional CNN-based methods in fully utilizing the global information available in hyperspectral images, ultimately improving the model's denoising performance. In the block-downsampling global enhancement module, we use a block-downsampling strategy to capture global contextual information. By reducing the spatial resolution of the input hyperspectral image, the module enlarges the receptive field, allowing the model to incorporate larger-scale information during denoising.
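The two operations described above, mixing information across channels and downsampling by blocks, can be illustrated with a small sketch. The abstract does not specify the exact operations used in the paper, so both functions below are assumptions chosen for illustration: `channel_shuffle` follows the ShuffleNet-style reshape-and-transpose shuffle, and `block_average_downsample` averages non-overlapping patches. Both names are hypothetical, and pure Python lists stand in for tensors.

```python
def channel_shuffle(channels, groups):
    """ShuffleNet-style shuffle (assumed, not taken from the paper).

    Views the channel list as (groups, n // groups), transposes it, and
    flattens it, so that neighbouring output channels come from different groups.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

def block_average_downsample(band, block):
    """Average non-overlapping block x block patches of a 2-D band (list of rows)."""
    height, width = len(band), len(band[0])
    if height % block or width % block:
        raise ValueError("band size must be divisible by block")
    out = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            patch = [band[top + dy][left + dx]
                     for dy in range(block) for dx in range(block)]
            row.append(sum(patch) / len(patch))
        out.append(row)
    return out

# Six spectral channels in two groups: [0, 1, 2] and [3, 4, 5].
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]

# A 4x4 band reduced to 2x2; each output value summarises a 2x2 patch.
band = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(block_average_downsample(band, block=2))  # [[3.5, 5.5], [11.5, 13.5]]
```

After downsampling by a factor of 2, a 3x3 window on the reduced band covers a 6x6 area of the original, which is the sense in which block downsampling enlarges the receptive field.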
This block-downsampling mechanism enhances the model's understanding of the overall structure of the image, facilitating more effective noise suppression and accurate restoration of spatial details. The adaptive bidirectional feature fusion module is designed to strike a balance between local and global feature extraction, leveraging the complementary strengths of convolutional neural networks and Transformers. This module introduces a mechanism for adaptively fusing features from both local and global contexts, enabling the model to combine local details with global information. By considering the intricate relationship between spatial and spectral features, our proposed approach improves the denoising performance and preserves the rich texture information inherent in hyperspectral images. Result To evaluate the effectiveness of our proposed model, extensive experiments were conducted on publicly available hyperspectral image datasets, including ICVL and Pavia. The experimental results demonstrated the superior denoising performance of our approach compared to current state-of-the-art methods. Our model consistently outperformed existing techniques in various noise scenarios, effectively removing noise while preserving the fine spatial details and rich texture information of hyperspectral images. The experimental evaluation involved quantitative metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and spectral angle mapper (SAM). Our proposed model achieved significantly higher PSNR values and SSIM scores than the baseline methods, indicating improved denoising accuracy and visual quality of the restored images. Additionally, the SAM values obtained using our model were consistently lower, indicating higher spectral similarity to the reference images. Moreover, we conducted a comprehensive analysis of the computational efficiency of our model.
With the increasing volume and complexity of hyperspectral data, it is crucial to develop denoising methods that are computationally efficient without sacrificing performance. Our proposed model demonstrated competitive computational efficiency, making it practical for real-world applications that involve large-scale hyperspectral image processing. Conclusion The success of our denoising model can be attributed to the combination of the Transformer-based architecture and the channel-mixing parallel convolution operations. The Transformer module enables effective capture of global contextual information, facilitating better understanding of the relationships between spectral bands and spatial features. By incorporating channel-mixing operations, our model exploits the inter-channel correlations and enhances the discriminative power of feature extraction, resulting in improved denoising performance. Furthermore, our model's ability to handle diverse noise scenarios and maintain image quality can be attributed to the adaptive bidirectional feature fusion module. This module combines local and global features, enabling effective noise suppression while preserving the fine details and texture information specific to different regions of the hyperspectral images. The adaptability of the feature fusion mechanism supports robust denoising performance across various noise levels and image characteristics. In conclusion, this study presents a novel denoising model for hyperspectral images based on the integration of Transformers and channel-mixing parallel convolution. The proposed model addresses the limitations of traditional approaches in utilizing global information and captures the complex spatial-spectral correlations inherent in hyperspectral data. Experimental results demonstrate its superior denoising performance compared to state-of-the-art methods, with improved accuracy and preservation of fine details and texture information.
The model's computational efficiency further enhances its practicality for real-world applications. Future research directions may include exploring additional mechanisms for adaptive feature fusion and investigating the model's performance on other hyperspectral image processing tasks such as classification and segmentation.
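As a rough illustration of what adaptive fusion of local and global features can look like, the sketch below blends two feature vectors with a per-element sigmoid gate. This is a generic gating scheme chosen for illustration, not the paper's adaptive bidirectional feature fusion module, whose details the abstract does not give. The names `gated_fusion` and `gate_logits` are hypothetical, and in a trained network the gate logits would be produced by learned layers rather than supplied by hand.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(local_feat, global_feat, gate_logits):
    """Blend two feature vectors element-wise (illustrative assumption).

    A gate near 1 keeps the local (convolutional) feature; a gate near 0
    keeps the global (Transformer) feature.
    """
    if not (len(local_feat) == len(global_feat) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for loc, glob, logit in zip(local_feat, global_feat, gate_logits):
        weight = sigmoid(logit)
        fused.append(weight * loc + (1.0 - weight) * glob)
    return fused

local_feat = [1.0, 2.0]
global_feat = [3.0, 4.0]

# A zero logit gives a gate of 0.5, i.e. a plain average of the two branches.
print(gated_fusion(local_feat, global_feat, [0.0, 0.0]))  # [2.0, 3.0]

# A large positive logit leans on the local branch, a large negative one on the global branch.
print(gated_fusion(local_feat, global_feat, [50.0, -50.0]))
```

Because the gate is computed per element, different regions or channels can lean on different branches, which is one way to handle noise and texture that vary across an image.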
Keywords