Infrared and visible image fusion based on multi-level guided network
2023, Vol. 28, No. 1, Pages: 207-220
Received: 2022-06-15
Revised: 2022-10-10
Accepted: 2022-10-17
Published in print: 2023-01-16
DOI: 10.11834/jig.220638
Objective
Deep learning based on convolutional neural networks has shown superior performance in image fusion. Among the various image fusion tasks, the fusion of infrared and visible images is widely applied: the two modalities have highly distinctive characteristics, and the fused image obtained by exchanging and combining their information is of considerable value. To improve the quality of infrared and visible image fusion, this paper proposes a fusion framework based on a multi-level feature guided network.
Method
In the proposed framework, an encoder extracts features from the source images, and the multi-level features are guided into the decoder to reconstruct the fusion result. To train the network effectively, a hybrid loss function is designed: a weighted fidelity term constrains the pixel-level similarity between the fusion result and the source images, while a structure tensor loss encourages the fused image to inherit more structural features from the sources. To enable multi-scale information interaction, and unlike an ordinary encoder-decoder structure, the proposed method performs feature guidance at every part of every encoder layer. Pooling reduces the spatial size in the encoding stage and upsampling restores it in the decoding stage, realizing multi-scale fusion and reconstruction and effectively compensating for the information lost as convolutional layers are stacked during training. Features in the encoding stage are guided out at the appropriate points and fused into the corresponding decoding layers in time. After the network structure is constructed, a fusion loss algorithm is proposed that starts from the respective characteristics of the infrared and visible images: a 2-norm loss weighted by visual-saliency-based estimation and an F-norm loss based on the structure tensor.
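For concreteness, the two loss terms described above can be written in a generic form. The following is a minimal sketch under common conventions rather than the paper's exact formulation: the saliency weight map \(w\), the trade-off parameter \(\lambda\), and the choice of combining the source structure tensors by summation are assumptions.
\[
\mathcal{L}_{\mathrm{fid}} = \left\| w \odot (I_f - I_{ir}) \right\|_2^2 + \left\| (1 - w) \odot (I_f - I_{vis}) \right\|_2^2
\]
\[
\mathcal{L}_{\mathrm{st}} = \left\| S(I_f) - \big( S(I_{ir}) + S(I_{vis}) \big) \right\|_F, \qquad S(I) = \nabla I \, \nabla I^{\top}
\]
\[
\mathcal{L} = \mathcal{L}_{\mathrm{fid}} + \lambda\, \mathcal{L}_{\mathrm{st}}
\]
Here \(I_f\), \(I_{ir}\), and \(I_{vis}\) denote the fused, infrared, and visible images, \(w \in [0,1]\) is a visual-saliency weight map, \(\odot\) is element-wise multiplication, and \(S(\cdot)\) is the structure tensor formed from image gradients.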
Result
To demonstrate the feasibility of the fusion method, experiments are conducted on the TNO and RoadScene datasets, with visual and objective comparisons against traditional and deep learning fusion methods. Satisfactory results are achieved on key image quality metrics, including the information fidelity criterion, the gradient-based edge-information preservation measure of fusion performance, nonlinear correlation information entropy, and a structural-similarity-based image quality measure. In addition, ablation experiments on the proposed network structure and loss function verify their effectiveness and ensure the completeness of the proposed model.
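As an illustration of one of the metric families mentioned above, the snippet below computes a structural-similarity-based fusion quality score. The exact formulation used in the paper is not specified here; averaging SSIM against both source images is only one common convention and is an assumption of this sketch.
```python
# A minimal sketch of an SSIM-based fusion quality score (not necessarily the
# paper's exact metric): average the SSIM of the fused image against each
# source image.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def ssim_fusion_score(fused: np.ndarray, ir: np.ndarray, vis: np.ndarray) -> float:
    """Average SSIM between the fused image and each source image.

    All images are expected as 2-D grayscale arrays of the same shape and
    the same value range (here assumed to be float in [0, 1]).
    """
    s_ir = ssim(fused, ir, data_range=1.0)
    s_vis = ssim(fused, vis, data_range=1.0)
    return 0.5 * (s_ir + s_vis)

# Example usage with random placeholder data; a real evaluation would load
# fused results and the corresponding TNO / RoadScene source pairs.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.random((256, 256))
    vis = rng.random((256, 256))
    fused = 0.5 * (ir + vis)
    print(ssim_fusion_score(fused, ir, vis))
```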
Conclusion
The proposed fusion model combines the advantages of traditional models and deep learning models, produces high-quality fused images, and achieves a good fusion effect.
Objective
Multi-source image fusion focuses on extracting and integrating effective information from multiple images. It helps overcome the limited information carried by a single image and improves the efficiency of processing multi-source data. Infrared and visible images are widely used in image processing, and the two modalities complement each other in terms of information. To obtain a clear and accurate description of the scene, their fusion can combine the detailed texture information of the visible image with the salient target information of the infrared image. We therefore develop a fusion algorithm based on deep learning.
Method
First, a dense convolutional network is improved, and an end-to-end convolutional fusion model with an encoding-decoding structure is trained. The encoder extracts features from the source images and guides the multi-level features into the decoder to reconstruct the fusion result. Unlike a regular encoder-decoder structure, every layer of the encoder provides feature guidance. Pooling reduces the spatial size in the encoding part, and upsampling enlarges it in the decoding part, realizing multi-scale fusion and reconstruction; this compensates for the information loss caused by stacking convolutional layers during training. To train the network effectively, a hybrid loss function is designed: a weighted fidelity term constrains the pixel similarity between the fusion result and the source images, while a structure tensor loss encourages the fused image to extract more structural features from the sources. The encoding part is divided into three layers, separated by pooling layers, to ensure effective feature extraction in depth. The number of convolution blocks per layer decreases gradually from three to one with network depth, so that more extraction is performed in the shallow layers and less in the deep layers. To realize multi-scale information interaction, features are guided out of the encoding part and fused into the corresponding decoding layers in time. In the decoder, the first layer consists of five convolution blocks, and the fusion result is produced by the output of the fifth block; the second layer consists of three convolution blocks, and the third layer of a single convolution block, with upsampling connecting the layers. After the network structure is constructed, a loss fusion algorithm is proposed that includes an L2-norm constraint based on saliency detection and an F-norm constraint based on the structure tensor, computed for the infrared and visible images respectively. The fusion result benefits from both sources under the joint control of the network structure and the loss algorithm.
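To make the encoder-decoder layout described above concrete, the following PyTorch-style sketch instantiates it under several assumptions (channel widths, channel-wise concatenation of the two source images at the input, and concatenation as the way guided encoder features enter the decoder). It illustrates the described structure and is not the authors' released implementation.
```python
# Sketch of the multi-level guided encoder-decoder: a three-layer encoder whose
# per-layer convolution blocks decrease from three to one, pooling between
# encoder layers, upsampling between decoder layers, and encoder features
# guided into the decoder at matching scales.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """3x3 convolution + ReLU, the basic block assumed throughout this sketch."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class MultiLevelGuidedFusionNet(nn.Module):
    def __init__(self, ch: int = 32):
        super().__init__()
        # Encoder: 3 / 2 / 1 convolution blocks per layer, pooling in between.
        self.enc1 = nn.Sequential(conv_block(2, ch), conv_block(ch, ch), conv_block(ch, ch))
        self.enc2 = nn.Sequential(conv_block(ch, ch), conv_block(ch, ch))
        self.enc3 = conv_block(ch, ch)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        # Decoder: 1 / 3 / 5 convolution blocks from the deepest layer upward.
        self.dec3 = conv_block(ch, ch)
        self.dec2 = nn.Sequential(conv_block(2 * ch, ch), conv_block(ch, ch), conv_block(ch, ch))
        self.dec1 = nn.Sequential(
            conv_block(2 * ch, ch), conv_block(ch, ch), conv_block(ch, ch),
            conv_block(ch, ch), nn.Conv2d(ch, 1, 3, padding=1),  # 5th block outputs the fused image
        )

    def forward(self, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        x = torch.cat([ir, vis], dim=1)            # assumed input: channel-wise concatenation
        f1 = self.enc1(x)                          # full-resolution features (guided to dec1)
        f2 = self.enc2(self.pool(f1))              # 1/2-resolution features (guided to dec2)
        f3 = self.enc3(self.pool(f2))              # 1/4-resolution features
        d3 = self.dec3(f3)
        d2 = self.dec2(torch.cat([self.up(d3), f2], dim=1))   # guidance from encoder layer 2
        d1 = self.dec1(torch.cat([self.up(d2), f1], dim=1))   # guidance from encoder layer 1
        return torch.sigmoid(d1)                   # fused image in [0, 1] (assumed output range)

# Example usage with dummy single-channel inputs.
if __name__ == "__main__":
    net = MultiLevelGuidedFusionNet()
    ir = torch.rand(1, 1, 128, 128)
    vis = torch.rand(1, 1, 128, 128)
    print(net(ir, vis).shape)   # torch.Size([1, 1, 128, 128])
```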
Result
To demonstrate the feasibility of the fusion method, experiments are carried out on the TNO and RoadScene datasets, and the proposed method is compared with traditional and deep learning fusion methods, achieving favorable scores on a series of evaluation indicators. Furthermore, to validate the effectiveness of the proposed network structure and loss function, both are ablated following the principle of controlled variables.
Conclusion
The proposed fusion model combines the advantages of traditional and deep learning models, obtains high-quality fused images, and achieves a good fusion effect.