Deep network-interpreted multispectral image fusion in remote sensing
2023, Vol. 28, Issue 1, Pages: 290-304
Print publication date: 2023-01-16
Accepted: 2022-10-05
DOI: 10.11834/jig.220575
Dian Yu, Kun Li, Wei Zhang, Duidui Li, Xin Tian, Hao Jiang. Deep network-interpreted multispectral image fusion in remote sensing[J]. Journal of Image and Graphics, 2023, 28(1): 290-304.
Objective
Multispectral image fusion is an important research problem in remote sensing, and variational model methods and deep learning methods are the current research hotspots. However, variational model methods usually build the fusion model with linear priors, which can hardly describe the complex nonlinear relationships of natural scenes, resulting in low imaging-model accuracy, and they also suffer from manual parameter tuning. Mainstream deep learning methods, in turn, treat the fusion process as a black box and ignore the real physical imaging mechanism. The performance of existing fusion methods therefore still leaves room for improvement. To address these problems, a multispectral image fusion method based on an interpretable deep network is proposed.
Method
First, a deep learning prior is constructed to describe the relationship between the fusion image and the panchromatic image. A data fidelity term is then built on the premise that the multispectral image is a down-sampled version of the fusion image. Combining the deep learning prior and the data fidelity term, a new multispectral image fusion model is established to improve model accuracy. The fusion model is solved by the proximal gradient descent method, and the solution steps are further mapped into an interpretable deep network architecture with a clear physical imaging mechanism.
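The proximal-gradient solution described above can be sketched as follows. This is a minimal single-band NumPy illustration, not the paper's code: `downsample`/`upsample` stand in for the true blur-and-decimation imaging operator, and `prox` is a placeholder for the learned deep prior module (all function names here are hypothetical).

```python
import numpy as np

def downsample(x, r):
    """Average-pool by factor r (stand-in for the blur + decimation operator)."""
    h, w = x.shape
    return x[:h - h % r, :w - w % r].reshape(h // r, r, w // r, r).mean(axis=(1, 3))

def upsample(x, r):
    """Adjoint-like operator: nearest-neighbour expansion scaled by 1/r^2."""
    return np.kron(x, np.ones((r, r))) / (r * r)

def pgd_fusion(msi, pan, prox, r=4, step=1.0, iters=10):
    """Proximal gradient descent for min_X ||downsample(X) - MSI||^2 + prior(X).

    `prox` plays the role of the deep prior; in the paper this step is
    replaced by a trained network module that also sees the PAN image.
    """
    x = np.kron(msi, np.ones((r, r)))  # initialise with up-sampled MSI
    for _ in range(iters):
        grad = upsample(downsample(x, r) - msi, r)  # gradient of data fidelity
        x = prox(x - step * grad, pan)              # learned proximal step
    return x
```

In the unrolled network, each loop iteration becomes one network stage, so `step` and the other hyper-parameters are learned from data rather than tuned by hand.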
Result
Subjective and objective comparison experiments were conducted on simulated datasets from the Gaofen-2 and GeoEye-1 remote sensing satellites and on a real dataset from the QuickBird satellite. Compared with classical methods, the proposed method achieves markedly better subjective visual quality. On the simulated Gaofen-2 and GeoEye-1 datasets, the objective metric relative dimensionless global error in synthesis (ERGAS) of the proposed method is reduced by 7.58% and 4.61%, respectively, relative to the second-best method.
Conclusion
The proposed interpretable deep network combines the advantages of variational model methods and deep learning methods, effectively preserving spectral information while enhancing the spatial details of the fused image.
Objective
Multispectral image fusion is one of the key tasks in the field of remote sensing (RS). Variational model-based and deep learning-based techniques are both developing intensively. However, traditional variational model-based approaches rely on linear priors, which struggle to describe the complicated nonlinear relationships of natural scenes, so the accuracy of the fusion model is limited and its parameters must be selected by hand. Mainstream deep learning methods, in turn, treat the fusion process as a black box and ignore the real physical imaging mechanism. To resolve these problems, our research focuses on an interpretable deep network for fusing multispectral and panchromatic images.
Method
First, we explore a deep prior to describe the relationship between the fusion image and the panchromatic image. A data fidelity term is then constructed based on the assumption that the multispectral image is a down-sampled version of the fusion result. A new fusion model is proposed by integrating this deep prior and data fidelity term. To obtain an accurate fusion result, we first solve the proposed fusion model with the proximal gradient descent method, which introduces intermediate variables to convert the original optimization problem into several iterative steps. We then simplify the iteration function by assuming that the residual of each iteration follows a Gaussian distribution. Next, we unroll the above optimization steps into a deep learning network composed of several sub-modules, so the optimization of the network parameters is driven both by the training data and by the proposed physical fusion model, yielding an interpretable deep fusion network with a clear physical basis. Moreover, the handcrafted hyper-parameters of the fusion model are also learned from the training data, which effectively resolves the manual parameter design problem of traditional variational model methods. Specifically, to build an interpretable end-to-end fusion network, we implement the optimization steps of each iteration with different network modules. To handle the diversity of sensor spectral characteristics across satellites, we represent the spectral transform matrix with two consecutive 3 × 3 convolution layers separated by a ReLU activation layer. The update of the introduced intermediate variable is treated as a denoising problem and solved with SwinResUnet. Thanks to its ability to extract local features and attend to global information, SwinResUnet incorporates convolutional neural network (CNN) and Swin Transformer layers into its architecture. A U-Net is adopted as the backbone of SwinResUnet in the deep denoiser, containing three groups of encoders and decoders at different feature scales. In addition, short connections are established in each encoder-decoder group to enhance feature transmission and avoid gradient explosion. Finally, the $L_1$ norm between the reference image and the fusion image is used as the cost function.
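The spectral-transform module described above, two 3 × 3 convolutions separated by a ReLU, can be sketched in NumPy for a single channel. This is an illustrative stand-in, not the paper's implementation; the kernels `k1` and `k2` would be learned during training.

```python
import numpy as np

def conv3x3(x, k):
    """'Same' 3x3 convolution with zero padding (single channel, correlation form)."""
    xp = np.pad(x, 1)
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + h, j:j + w]
    return out

def spectral_transform(x, k1, k2):
    """Two consecutive 3x3 convolutions separated by a ReLU, modelling the
    spectral response between the PAN image and the MSI bands."""
    return conv3x3(np.maximum(conv3x3(x, k1), 0.0), k2)
```

In the actual network each band would have its own learned kernels, and the denoising sub-module (SwinResUnet) would follow this transform inside every unrolled stage.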
Result
The experiments comprise three parts: 1) a simulation experiment, 2) a real-data experiment, and 3) an ablation analysis. The simulation experiment follows Wald's protocol, fusing a down-sampled multispectral image (MSI) and panchromatic image (PAN); the real experiment fuses the original MSI and PAN. The comparison methods include: a) polynomial interpolation; b) Gram-Schmidt adaptive (GSA) and c) partial replacement-based adaptive component substitution (PRACS) (component substitution methods); d) Indusion and e) additive wavelet luminance proportional (AWLP) (multi-resolution analysis methods); f) simultaneous registration and fusion (SIRF) and g) local gradient constraints (LGC) (variational model optimization methods); h) pansharpening by a convolutional neural network (PNN), i) a deep network architecture for pansharpening (PanNet), and j) an interpretable deep network for variational pansharpening (VPNet) (deep learning methods). We demonstrate the superiority of our method in terms of visual effect and quantitative analysis on the simulated Gaofen-2 and GeoEye-1 satellite datasets and the real QuickBird satellite dataset. The quantitative evaluation metrics include: 1) relative dimensionless global error in synthesis (ERGAS), 2) spectral angle mapping, 3) the global score $Q^{2n}$, 4) the structural similarity index, 5) root mean square error, 6) relative average spectral error, 7) the universal image quality index, and 8) peak signal-to-noise ratio. As there is no reference image for the real experiment, we also employ non-reference metrics such as quality with no reference (QNR), $D_{\rm s}$, and $D_\lambda$. Visual comparison: the proposed method shows a clear improvement over other state-of-the-art methods. Quantitative evaluation: compared with the second-best method, ERGAS is reduced by 7.58% and 4.61% on the simulated Gaofen-2 and GeoEye-1 satellite datasets, respectively.
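For reference, the headline metric ERGAS can be computed as follows; this is a standard textbook formulation in NumPy (lower is better), not code from the paper.

```python
import numpy as np

def ergas(fused, reference, ratio=0.25):
    """Relative dimensionless global error in synthesis.

    `ratio` is the MSI-to-PAN resolution ratio (e.g. 1/4 for 4x
    pansharpening); `fused` and `reference` are (H, W, bands) arrays.
    """
    terms = []
    for b in range(reference.shape[2]):
        rmse = np.sqrt(np.mean((fused[..., b] - reference[..., b]) ** 2))
        terms.append((rmse / reference[..., b].mean()) ** 2)
    return 100.0 * ratio * np.sqrt(np.mean(terms))
```

A perfect fusion result yields ERGAS = 0, which is why the reported 7.58% and 4.61% reductions indicate an improvement.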
Conclusion
Our interpretable deep network combines the advantages of variational model-based and deep learning-based approaches, thus achieving a good balance between spatial and spectral qualities.
remote sensing (RS); multispectral image (MSI); image fusion; deep learning (DL); interpretable network; proximal gradient descent (PGD)
Aiazzi B, Alparone L, Baronti S, Garzelli A and Selva M. 2006. MTF-tailored multiscale fusion of high-resolution MS and Pan imagery. Photogrammetric Engineering and Remote Sensing, 72(5): 591-596[DOI: 10.14358/PERS.72.5.591]
Aiazzi B, Baronti S and Selva M. 2007. Improving component substitution pansharpening through multivariate regression of MS +Pan data. IEEE Transactions on Geoscience and Remote Sensing, 45(10): 3230-3239[DOI: 10.1109/TGRS.2007.901007]
Ballester C, Caselles V, Igual L, Verdera J and Rougé B. 2006. A variational model for P+XS image fusion. International Journal of Computer Vision, 69(1): 43-58[DOI: 10.1007/s11263-006-6852-x]
Chen C, Li Y Q, Liu W and Huang J Z. 2014. Image fusion with local spectral consistency and dynamic gradient sparsity//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 2760-2765[DOI: 10.1109/CVPR.2014.347]
Chen C, Li Y Q, Liu W and Huang J Z. 2015. SIRF: simultaneous satellite image registration and fusion in a unified framework. IEEE Transactions on Image Processing, 24(11): 4213-4224[DOI: 10.1109/TIP.2015.2456415]
Choi J, Yu K and Kim Y. 2011. A new adaptive component-substitution-based satellite image fusion by using partial replacement. IEEE Transactions on Geoscience and Remote Sensing, 49(1): 295-309[DOI: 10.1109/TGRS.2010.2051674]
da Cunha A L, Zhou J and Do M N. 2006. The nonsubsampled contourlet transform: theory, design, and applications. IEEE Transactions on Image Processing, 15(10): 3089-3101[DOI: 10.1109/TIP.2006.877507]
Do M N and Vetterli M. 2002. Contourlets: a directional multiresolution image representation//Proceedings of International Conference on Image Processing. Rochester, USA: IEEE: 357-360[DOI: 10.1109/ICIP.2002.1038034]
Fang F M, Li F, Shen C M and Zhang G X. 2013. A variational approach for pan-sharpening. IEEE Transactions on Image Processing, 22(7): 2822-2834[DOI: 10.1109/TIP.2013.2258355]
Fu X Y, Lin Z H, Huang Y and Ding X H. 2019. A variational pan-sharpening with local gradient constraints//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 10257-10266[DOI: 10.1109/CVPR.2019.01051]
Garzelli A, Aiazzi B, Alparone L, Lolli S and Vivone G. 2018. Multispectral pansharpening with radiative transfer-based detail-injection modeling for preserving changes in vegetation cover. Remote Sensing, 10(8): #1308[DOI: 10.3390/rs10081308]
Ghahremani M and Ghassemian H. 2016. Nonlinear IHS: a promising method for pan-sharpening. IEEE Geoscience and Remote Sensing Letters, 13(11): 1606-1610[DOI: 10.1109/LGRS.2016.2597271]
He L, Rao Y Z, Li J, Chanussot J, Plaza A, Zhu J W and Li B. 2019. Pansharpening via detail injection based convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(4): 1188-1204[DOI: 10.1109/JSTARS.2019.2898574]
Hu X. 2021. Research on Multispectral Remote Sensing Image Fusion Algorithm. Harbin: Harbin Institute of Technology [DOI: 10.27061/d.cnki.ghgdu.2021.003491]
Huang W, Xiao L, Wei Z, Liu H Y and Tang S Z. 2015. A new pan-sharpening method with deep neural networks. IEEE Geoscience and Remote Sensing Letters, 12(5): 1037-1041[DOI: 10.1109/LGRS.2014.2376034]
Jiao J and Wu L D. 2019. Fusion of multispectral and panchromatic images via morphological filter and improved PCNN in NSST domain. Journal of Image and Graphics, 24(3): 435-446[DOI: 10.11834/jig.180399]
Khan M M, Chanussot J, Condat L and Montanvert A. 2008. Indusion: fusion of multispectral and panchromatic images using the induction scaling technique. IEEE Geoscience and Remote Sensing Letters, 5(1): 98-102[DOI: 10.1109/LGRS.2007.909934]
Li W S, Hu X, Du J and Xiao B. 2017. Adaptive remote-sensing image fusion based on dynamic gradient sparse and average gradient difference. International Journal of Remote Sensing, 38(23): 7316-7332[DOI: 10.1080/01431161.2017.1371863]
Licciardi G A, Khan M M, Chanussot J, Montanvert A, Condat L and Jutten C. 2012. Fusion of hyperspectral and panchromatic images using multiresolution analysis and nonlinear PCA band reduction. EURASIP Journal on Advances in Signal Processing, 2012(1): #207[DOI: 10.1186/1687-6180-2012-207]
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin transformer: hierarchical vision transformer using shifted windows//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 9992-10002[DOI: 10.1109/ICCV48922.2021.00986]
Masi G, Cozzolino D, Verdoliva L and Scarpa G. 2016. Pansharpening by convolutional neural networks. Remote Sensing, 8(7): #594[DOI: 10.3390/rs8070594]
Möller M, Wittman T, Bertozzi A L and Burger M. 2012. A variational approach for sharpening high dimensional images. SIAM Journal on Imaging Sciences, 5(1): 150-178[DOI: 10.1137/100810356]
Pohl C and van Genderen J L. 1998. Review article multisensor image fusion in remote sensing: concepts, methods and applications. International Journal of Remote Sensing, 19(5): 823-854[DOI: 10.1080/014311698215748]
Restaino R, Vivone G, Addesso P and Chanussot J. 2020. A pansharpening approach based on multiple linear regression estimation of injection coefficients. IEEE Geoscience and Remote Sensing Letters, 17(1): 102-106[DOI: 10.1109/LGRS.2019.2914093]
Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241[DOI: 10.1007/978-3-319-24574-4_28]
Shen H F, Jiang M H, Li J, Yuan Q Q, Wei Y C and Zhang L P. 2019. Spatial-spectral fusion by combining deep learning and variational model. IEEE Transactions on Geoscience and Remote Sensing, 57(8): 6169-6181[DOI: 10.1109/TGRS.2019.2904659]
Tian X, Chen Y R, Yang C C, Gao X and Ma J Y. 2020. A variational pansharpening method based on gradient sparse representation. IEEE Signal Processing Letters, 27: 1180-1184[DOI: 10.1109/LSP.2020.3007325]
Tian X, Li K, Wang Z Y and Ma J Y. 2022. VP-Net: an interpretable deep network for variational pansharpening. IEEE Transactions on Geoscience and Remote Sensing, 60: #5402716[DOI: 10.1109/TGRS.2021.3089868]
Tu T M, Su S C, Shyu H C and Huang P S. 2001. A new look at IHS-like image fusion methods. Information Fusion, 2(3): 177-186[DOI: 10.1016/S1566-2535(01)00036-7]
Vivone G, Alparone L, Chanussot J, Mura M D, Garzelli A, Licciardi G A, Restaino R and Wald L. 2015. A critical comparison among pansharpening algorithms. IEEE Transactions on Geoscience and Remote Sensing, 53(5): 2565-2586[DOI: 10.1109/TGRS.2014.2361734]
Wang H R, Guo Q and Li A. 2021. Spatial-spectral fusion based on band-adaptive detail injection for GF-5 and Sentinel-2 remote sensing images. Journal of Image and Graphics, 26(8): 1896-1909[DOI: 10.11834/jig.200755]
Yang J F, Fu X Y, Hu Y W, Huang Y, Ding X H and Paisley J. 2017. PanNet: a deep network architecture for pan-sharpening//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE: 1753-1761[DOI: 10.1109/ICCV.2017.193]
Yuan Q Q, Wei Y C, Meng X C, Shen H F and Zhang L P. 2018. A multiscale and multidepth convolutional neural network for remote sensing imagery pan-sharpening. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(3): 978-989[DOI: 10.1109/JSTARS.2018.2794888]
Zhang J and Ghanem B. 2018. ISTA-Net: interpretable optimization-inspired deep network for image compressive sensing//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1828-1837[DOI: 10.1109/CVPR.2018.00196]
Zhang K, Li Y W, Liang J Y, Cao J Z, Zhang Y L, Tang H, Timofte R and Van Gool L. 2022. Practical blind denoising via swin-conv-UNet and data synthesis[EB/OL]. [2022-03-24]. https://arxiv.org/pdf/2203.13278.pdf
Zhang J, Zhao D B and Gao W. 2014. Group-based sparse representation for image restoration. IEEE Transactions on Image Processing, 23(8): 3336-3351[DOI:10.1109/TIP.2014.2323127]