Deep network-interpreted multispectral image fusion in remote sensing
2023, Vol. 28, Issue 1, Pages: 290-304
Print publication date: 2023-01-16
Accepted: 2022-10-05
DOI: 10.11834/jig.220575
Dian Yu, Kun Li, Wei Zhang, Duidui Li, Xin Tian, Hao Jiang. Deep network-interpreted multispectral image fusion in remote sensing[J]. Journal of Image and Graphics, 2023, 28(1): 290-304.
Objective
Multispectral image fusion is an important research problem in remote sensing, and variational model methods and deep learning methods are the current research hotspots. However, variational model methods usually build the fusion model with linear priors, which can hardly describe the complex nonlinear relationships of natural scenes, resulting in low imaging-model accuracy, and they also suffer from manual parameter tuning. Mainstream deep learning methods, in turn, treat the fusion process as a black box and ignore the real physical imaging mechanism. The performance of existing fusion methods therefore still leaves room for improvement. To address these problems, a multispectral image fusion method based on an interpretable deep network is proposed.
Method
First, a deep learning prior is constructed to describe the relationship between the fusion image and the panchromatic image. A data fidelity term is then built on the premise that the multispectral image is a down-sampled version of the fusion image. Combining the deep learning prior and the data fidelity term, a new multispectral image fusion model is established to improve model accuracy. The fusion model is solved by the proximal gradient descent method, and the solution steps are further mapped into an interpretable deep network architecture with a clear physical imaging mechanism.
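The proximal-gradient solution described above can be sketched as follows. This is a minimal single-band NumPy illustration, not the paper's code: `downsample`/`upsample` stand in for the true blur-and-decimation imaging operator, and `prox` is a placeholder for the learned deep prior module (all function names here are hypothetical).

```python
import numpy as np

def downsample(x, r):
    """Average-pool by factor r (stand-in for the blur + decimation operator)."""
    h, w = x.shape
    return x[:h - h % r, :w - w % r].reshape(h // r, r, w // r, r).mean(axis=(1, 3))

def upsample(x, r):
    """Adjoint-like operator: nearest-neighbour expansion scaled by 1/r^2."""
    return np.kron(x, np.ones((r, r))) / (r * r)

def pgd_fusion(msi, pan, prox, r=4, step=1.0, iters=10):
    """Proximal gradient descent for min_X ||downsample(X) - MSI||^2 + prior(X).

    `prox` plays the role of the deep prior; in the paper this step is
    replaced by a trained network module that also sees the PAN image.
    """
    x = np.kron(msi, np.ones((r, r)))  # initialise with up-sampled MSI
    for _ in range(iters):
        grad = upsample(downsample(x, r) - msi, r)  # gradient of data fidelity
        x = prox(x - step * grad, pan)              # learned proximal step
    return x
```

In the unrolled network, each loop iteration becomes one network stage, so `step` and the other hyper-parameters are learned from data rather than tuned by hand.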
Result
Subjective and objective comparison experiments were conducted on simulated datasets from the Gaofen-2 and GeoEye-1 remote sensing satellites and on a real dataset from the QuickBird satellite. Compared with classical methods, the proposed method achieves markedly better subjective visual quality. On the simulated Gaofen-2 and GeoEye-1 datasets, the objective metric relative dimensionless global error in synthesis (ERGAS) of the proposed method is reduced by 7.58% and 4.61%, respectively, relative to the second-best method.
Conclusion
The proposed interpretable deep network combines the advantages of variational model methods and deep learning methods, effectively preserving spectral information while enhancing the spatial details of the fused image.
Objective
Multispectral image fusion is one of the key tasks in the field of remote sensing (RS). Variational model-based and deep learning-based techniques are both developing intensively. However, traditional variational model-based approaches rely on linear priors, which struggle to describe the complicated nonlinear relationships of natural scenes, so the accuracy of the fusion model is limited and its parameters must be selected by hand. Mainstream deep learning methods, in turn, treat the fusion process as a black box and ignore the real physical imaging mechanism. To resolve these problems, our research focuses on an interpretable deep network for fusing multispectral and panchromatic images.
Method
First, we explore a deep prior to describe the relationship between the fusion image and the panchromatic image. A data fidelity term is then constructed based on the assumption that the multispectral image is a down-sampled version of the fusion result. A new fusion model is proposed by integrating this deep prior and data fidelity term. To obtain an accurate fusion result, we first solve the proposed fusion model with the proximal gradient descent method, which introduces intermediate variables to convert the original optimization problem into several iterative steps. We then simplify the iteration function by assuming that the residual of each iteration follows a Gaussian distribution. Next, we unroll the above optimization steps into a deep learning network composed of several sub-modules, so the optimization of the network parameters is driven both by the training data and by the proposed physical fusion model, yielding an interpretable deep fusion network with a clear physical basis. Moreover, the handcrafted hyper-parameters of the fusion model are also learned from the training data, which effectively resolves the manual parameter design problem of traditional variational model methods. Specifically, to build an interpretable end-to-end fusion network, we implement the optimization steps of each iteration with different network modules. To handle the diversity of sensor spectral characteristics across satellites, we represent the spectral transform matrix with two consecutive 3 × 3 convolution layers separated by a ReLU activation layer. The update of the introduced intermediate variable is treated as a denoising problem and solved with SwinResUnet. Thanks to its ability to extract local features and attend to global information, SwinResUnet incorporates convolutional neural network (CNN) and Swin Transformer layers into its architecture. A U-Net is adopted as the backbone of SwinResUnet in the deep denoiser, containing three groups of encoders and decoders at different feature scales. In addition, short connections are established in each encoder-decoder group to enhance feature transmission and avoid gradient explosion. Finally, the $L_1$ norm between the reference image and the fusion image is used as the cost function.
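The spectral-transform module described above, two 3 × 3 convolutions separated by a ReLU, can be sketched in NumPy for a single channel. This is an illustrative stand-in, not the paper's implementation; the kernels `k1` and `k2` would be learned during training.

```python
import numpy as np

def conv3x3(x, k):
    """'Same' 3x3 convolution with zero padding (single channel, correlation form)."""
    xp = np.pad(x, 1)
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + h, j:j + w]
    return out

def spectral_transform(x, k1, k2):
    """Two consecutive 3x3 convolutions separated by a ReLU, modelling the
    spectral response between the PAN image and the MSI bands."""
    return conv3x3(np.maximum(conv3x3(x, k1), 0.0), k2)
```

In the actual network each band would have its own learned kernels, and the denoising sub-module (SwinResUnet) would follow this transform inside every unrolled stage.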
Result
The experiments comprise three parts: 1) a simulation experiment, 2) a real-data experiment, and 3) an ablation analysis. The simulation experiment follows Wald's protocol, fusing a down-sampled multispectral image (MSI) and panchromatic image (PAN); the real experiment fuses the original MSI and PAN. The comparison methods include: a) polynomial interpolation; b) Gram-Schmidt adaptive (GSA) and c) partial replacement-based adaptive component substitution (PRACS) (component substitution methods); d) Indusion and e) additive wavelet luminance proportional (AWLP) (multi-resolution analysis methods); f) simultaneous registration and fusion (SIRF) and g) local gradient constraints (LGC) (variational model optimization methods); h) pansharpening by a convolutional neural network (PNN), i) a deep network architecture for pansharpening (PanNet), and j) an interpretable deep network for variational pansharpening (VPNet) (deep learning methods). We demonstrate the superiority of our method in terms of visual effect and quantitative analysis on the simulated Gaofen-2 and GeoEye-1 satellite datasets and the real QuickBird satellite dataset. The quantitative evaluation metrics include: 1) relative dimensionless global error in synthesis (ERGAS), 2) spectral angle mapping, 3) the global score $Q^{2n}$, 4) the structural similarity index, 5) root mean square error, 6) relative average spectral error, 7) the universal image quality index, and 8) peak signal-to-noise ratio. As there is no reference image for the real experiment, we also employ non-reference metrics such as quality with no reference (QNR), $D_{\rm s}$, and $D_\lambda$. Visual comparison: the proposed method shows a clear improvement over other state-of-the-art methods. Quantitative evaluation: compared with the second-best method, ERGAS is reduced by 7.58% and 4.61% on the simulated Gaofen-2 and GeoEye-1 satellite datasets, respectively.
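For reference, the headline metric ERGAS can be computed as follows; this is a standard textbook formulation in NumPy (lower is better), not code from the paper.

```python
import numpy as np

def ergas(fused, reference, ratio=0.25):
    """Relative dimensionless global error in synthesis.

    `ratio` is the MSI-to-PAN resolution ratio (e.g. 1/4 for 4x
    pansharpening); `fused` and `reference` are (H, W, bands) arrays.
    """
    terms = []
    for b in range(reference.shape[2]):
        rmse = np.sqrt(np.mean((fused[..., b] - reference[..., b]) ** 2))
        terms.append((rmse / reference[..., b].mean()) ** 2)
    return 100.0 * ratio * np.sqrt(np.mean(terms))
```

A perfect fusion result yields ERGAS = 0, which is why the reported 7.58% and 4.61% reductions indicate an improvement.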
Conclusion
Our interpretable deep network combines the advantages of variational model-based and deep learning-based approaches, thus achieving a good balance between spatial and spectral qualities.
remote sensing (RS); multispectral image (MSI); image fusion; deep learning (DL); interpretable network; proximal gradient descent (PGD)
Aiazzi B, Alparone L, Baronti S, Garzelli A and Selva M. 2006. MTF-tailored multiscale fusion of high-resolution MS and Pan imagery. Photogrammetric Engineering and Remote Sensing, 72(5): 591-596[DOI: 10.14358/PERS.72.5.591]
Aiazzi B, Baronti S and Selva M. 2007. Improving component substitution pansharpening through multivariate regression of MS +Pan data. IEEE Transactions on Geoscience and Remote Sensing, 45(10): 3230-3239[DOI: 10.1109/TGRS.2007.901007]
Ballester C, Caselles V, Igual L, Verdera J and Rougé B. 2006. A variational model for P+XS image fusion. International Journal of Computer Vision, 69(1): 43-58[DOI: 10.1007/s11263-006-6852-x]
Chen C, Li Y Q, Liu W and Huang J Z. 2014. Image fusion with local spectral consistency and dynamic gradient sparsity//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 2760-2765[DOI: 10.1109/CVPR.2014.347]
Chen C, Li Y Q, Liu W and Huang J Z. 2015. SIRF: simultaneous satellite image registration and fusion in a unified framework. IEEE Transactions on Image Processing, 24(11): 4213-4224[DOI: 10.1109/TIP.2015.2456415]
Choi J, Yu K and Kim Y. 2011. A new adaptive component-substitution-based satellite image fusion by using partial replacement. IEEE Transactions on Geoscience and Remote Sensing, 49(1): 295-309[DOI: 10.1109/TGRS.2010.2051674]
da Cunha A L, Zhou J and Do M N. 2006. The nonsubsampled contourlet transform: theory, design, and applications. IEEE Transactions on Image Processing, 15(10): 3089-3101[DOI: 10.1109/TIP.2006.877507]
Do M N and Vetterli M. 2002. Contourlets: a directional multiresolution image representation//Proceedings of International Conference on Image Processing. Rochester, USA: IEEE: 357-360[DOI: 10.1109/ICIP.2002.1038034]
Fang F M, Li F, Shen C M and Zhang G X. 2013. A variational approach for pan-sharpening. IEEE Transactions on Image Processing, 22(7): 2822-2834[DOI: 10.1109/TIP.2013.2258355]
Fu X Y, Lin Z H, Huang Y and Ding X H. 2019. A variational pan-sharpening with local gradient constraints//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 10257-10266[DOI: 10.1109/CVPR.2019.01051]
Garzelli A, Aiazzi B, Alparone L, Lolli S and Vivone G. 2018. Multispectral pansharpening with radiative transfer-based detail-injection modeling for preserving changes in vegetation cover. Remote Sensing, 10(8): #1308[DOI: 10.3390/rs10081308]
Ghahremani M and Ghassemian H. 2016. Nonlinear IHS: a promising method for pan-sharpening. IEEE Geoscience and Remote Sensing Letters, 13(11): 1606-1610[DOI: 10.1109/LGRS.2016.2597271]
He L, Rao Y Z, Li J, Chanussot J, Plaza A, Zhu J W and Li B. 2019. Pansharpening via detail injection based convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(4): 1188-1204[DOI: 10.1109/JSTARS.2019.2898574]
Hu X. 2021. Research on Multispectral Remote Sensing Image Fusion Algorithm. Harbin: Harbin Institute of Technology [DOI: 10.27061/d.cnki.ghgdu.2021.003491]
Huang W, Xiao L, Wei Z, Liu H Y and Tang S Z. 2015. A new pan-sharpening method with deep neural networks. IEEE Geoscience and Remote Sensing Letters, 12(5): 1037-1041[DOI: 10.1109/LGRS.2014.2376034]
Jiao J and Wu L D. 2019. Fusion of multispectral and panchromatic images via morphological filter and improved PCNN in NSST domain. Journal of Image and Graphics, 24(3): 435-446[DOI: 10.11834/jig.180399]
Khan M M, Chanussot J, Condat L and Montanvert A. 2008. Indusion: fusion of multispectral and panchromatic images using the induction scaling technique. IEEE Geoscience and Remote Sensing Letters, 5(1): 98-102[DOI: 10.1109/LGRS.2007.909934]
Li W S, Hu X, Du J and Xiao B. 2017. Adaptive remote-sensing image fusion based on dynamic gradient sparse and average gradient difference. International Journal of Remote Sensing, 38(23): 7316-7332[DOI: 10.1080/01431161.2017.1371863]
Licciardi G A, Khan M M, Chanussot J, Montanvert A, Condat L and Jutten C. 2012. Fusion of hyperspectral and panchromatic images using multiresolution analysis and nonlinear PCA band reduction. EURASIP Journal on Advances in Signal Processing, 2012(1): #207[DOI: 10.1186/1687-6180-2012-207]
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin transformer: hierarchical vision transformer using shifted windows//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 9992-10002[DOI: 10.1109/ICCV48922.2021.00986]
Masi G, Cozzolino D, Verdoliva L and Scarpa G. 2016. Pansharpening by convolutional neural networks. Remote Sensing, 8(7): #594[DOI: 10.3390/rs8070594]
Möller M, Wittman T, Bertozzi A L and Burger M. 2012. A variational approach for sharpening high dimensional images. SIAM Journal on Imaging Sciences, 5(1): 150-178[DOI: 10.1137/100810356]
Pohl C and van Genderen J L. 1998. Review article multisensor image fusion in remote sensing: concepts, methods and applications. International Journal of Remote Sensing, 19(5): 823-854[DOI: 10.1080/014311698215748]
Restaino R, Vivone G, Addesso P and Chanussot J. 2020. A pansharpening approach based on multiple linear regression estimation of injection coefficients. IEEE Geoscience and Remote Sensing Letters, 17(1): 102-106[DOI: 10.1109/LGRS.2019.2914093]
Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241[DOI: 10.1007/978-3-319-24574-4_28]
Shen H F, Jiang M H, Li J, Yuan Q Q, Wei Y C and Zhang L P. 2019. Spatial-spectral fusion by combining deep learning and variational model. IEEE Transactions on Geoscience and Remote Sensing, 57(8): 6169-6181[DOI: 10.1109/TGRS.2019.2904659]
Tian X, Chen Y R, Yang C C, Gao X and Ma J Y. 2020. A variational pansharpening method based on gradient sparse representation. IEEE Signal Processing Letters, 27: 1180-1184[DOI: 10.1109/LSP.2020.3007325]
Tian X, Li K, Wang Z Y and Ma J Y. 2022. VP-Net: an interpretable deep network for variational pansharpening. IEEE Transactions on Geoscience and Remote Sensing, 60: #5402716[DOI: 10.1109/TGRS.2021.3089868]
Tu T M, Su S C, Shyu H C and Huang P S. 2001. A new look at IHS-like image fusion methods. Information Fusion, 2(3): 177-186[DOI: 10.1016/S1566-2535(01)00036-7]
Vivone G, Alparone L, Chanussot J, Mura M D, Garzelli A, Licciardi G A, Restaino R and Wald L. 2015. A critical comparison among pansharpening algorithms. IEEE Transactions on Geoscience and Remote Sensing, 53(5): 2565-2586[DOI: 10.1109/TGRS.2014.2361734]
Wang H R, Guo Q and Li A. 2021. Spatial-spectral fusion based on band-adaptive detail injection for GF-5 and Sentinel-2 remote sensing images. Journal of Image and Graphics, 26(8): 1896-1909[DOI: 10.11834/jig.200755]
Yang J F, Fu X Y, Hu Y W, Huang Y, Ding X H and Paisley J. 2017. PanNet: a deep network architecture for pan-sharpening//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE: 1753-1761[DOI: 10.1109/ICCV.2017.193]
Yuan Q Q, Wei Y C, Meng X C, Shen H F and Zhang L P. 2018. A multiscale and multidepth convolutional neural network for remote sensing imagery pan-sharpening. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(3): 978-989[DOI: 10.1109/JSTARS.2018.2794888]
Zhang J and Ghanem B. 2018. ISTA-Net: interpretable optimization-inspired deep network for image compressive sensing//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1828-1837[DOI: 10.1109/CVPR.2018.00196]
Zhang K, Li Y W, Liang J Y, Cao J Z, Zhang Y L, Tang H, Timofte R and Van Gool L. 2022. Practical blind denoising via swin-conv-UNet and data synthesis[EB/OL]. [2022-03-24]. https://arxiv.org/pdf/2203.13278.pdf
Zhang J, Zhao D B and Gao W. 2014. Group-based sparse representation for image restoration. IEEE Transactions on Image Processing, 23(8): 3336-3351[DOI:10.1109/TIP.2014.2323127]