Spatial aware channel attention guided high dynamic image reconstruction
Vol. 27, Issue 12, Pages: 3581-3595 (2022)
Published: 16 December 2022
Accepted: 03 January 2022
DOI: 10.11834/jig.211039
Lingfeng Tang, Huan Huang, Yafei Zhang, Fan Li. Spatial aware channel attention guided high dynamic image reconstruction [J]. Journal of Image and Graphics, 27(12): 3581-3595 (2022)
Objective
By fusing a set of low dynamic range (LDR) images with different exposure levels, a high dynamic range (HDR) image can be reconstructed effectively. However, background shifts and object motion between the LDR images introduce ghosting artifacts into the reconstructed HDR image. Attention-based HDR reconstruction methods are effective to some extent, but because they do not fully exploit the interrelationship between the spatial and channel dimensions of features, they perform well only when objects move slightly. When objects in the scene move substantially, these methods still leave room for improvement. We therefore propose a multiscale HDR image reconstruction network guided by spatial-aware channel attention to suppress ghosting and recover details.
Method
We propose a novel spatial-aware channel attention mechanism (SACAM). While mining contextual relationships between channels, it extracts the global information and the salient information of the feature channel dimension to further reinforce the spatial relationships of features. This helps highlight the importance of useful information in both the spatial and channel dimensions, suppressing ghosting and enhancing the effective information in features. In addition, we design a multiscale information reconstruction module (MIM). The module enlarges the receptive field of the network, strengthens salient information in the spatial dimension of features, and fully exploits the contextual semantic information of features at different scales to reconstruct the final HDR image.
Result
On the Kalantari test set, the PSNR-L (peak signal-to-noise ratio, linear domain) and SSIM-L (structural similarity, linear domain) of our method are 41.101 3 and 0.986 5, respectively; the PSNR-μ (peak signal-to-noise ratio, tonemapped domain) and SSIM-μ (structural similarity, tonemapped domain) are 43.413 6 and 0.990 2, respectively. On the Sen and Tursun datasets, our method reconstructs the scene structure faithfully, recovers image details clearly, and effectively avoids ghosting.
Conclusion
The proposed spatial-aware channel attention guided multiscale HDR image reconstruction network effectively mines the information in features that benefits image reconstruction and improves the network's ability to recover details, achieving favorable HDR reconstruction results on multiple datasets.
Objective
High dynamic range (HDR) imaging technology is widely used in modern imaging terminals. Limited by the performance of imaging sensors, a single photograph can capture information only within a limited dynamic range. An HDR image can be reconstructed effectively by fusing a group of low dynamic range (LDR) images with multiple exposure levels. Because shooting in real scenes is accompanied by camera shake and object motion, the LDR images taken at different exposures are not rigidly aligned in space, and the fused HDR result is prone to artifacts that greatly reduce image quality. Although attention-based HDR reconstruction methods improve image quality to some extent, they achieve good results only when objects move slightly, because they do not fully mine the interrelationship between the spatial and channel dimensions. When large foreground motion occurs in the scene, there is still considerable room for improvement. It is therefore important to improve the ability of the network to eliminate artifacts and restore details in saturated regions. We develop a multi-scale HDR image reconstruction network guided by spatial-aware channel attention.
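To make the ghosting problem concrete, the sketch below shows a classical weighted multi-exposure merge (Debevec-style), not the paper's learned network. The gamma-based inverse camera response, the hat weighting, and the synthetic exposure times are illustrative assumptions: any spatial misalignment between the input frames leaks directly into the weighted sum, which is exactly the artifact the proposed network is designed to suppress.

```python
import numpy as np

def merge_exposures(ldr_images, exposure_times, gamma=2.2):
    """Classical weighted HDR merge: linearize each LDR frame with an
    assumed gamma response, divide by exposure time to estimate radiance,
    and blend with a hat weight that favors well-exposed pixels."""
    num = 0.0
    den = 0.0
    for img, t in zip(ldr_images, exposure_times):
        lin = img ** gamma                  # approximate inverse camera response
        w = 1.0 - np.abs(2.0 * img - 1.0)   # hat weight: low near 0 and 1
        num += w * (lin / t)
        den += w
    return num / np.maximum(den, 1e-8)

# three synthetic, perfectly aligned frames of a constant scene
times = [0.25, 1.0, 4.0]
radiance = 0.3
frames = [np.clip((radiance * t) ** (1 / 2.2), 0, 1) * np.ones((4, 4))
          for t in times]
hdr = merge_exposures(frames, times)   # recovers the constant radiance
```

With perfectly aligned frames the merge recovers the scene radiance; if one frame were shifted, the moving object would appear at two positions in the weighted sum, producing a ghost.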
Method
The medium-exposure LDR image is used as the reference image, and the remaining images are used as non-reference images. During HDR reconstruction it is therefore necessary to make full use of the effective complementary information in the non-reference images to enhance the dynamic range of the fused image, while suppressing their invalid information to prevent the introduction of artifacts and saturation. To improve the ability of the network to eliminate artifacts and restore details in saturated areas, we propose a spatial-aware channel attention mechanism (SACAM) and a multi-scale information reconstruction module (MIM). While mining channel context, SACAM further strengthens the spatial relationships of features by extracting the global information and the salient information of the feature channel dimension. Our design focuses on highlighting the importance of useful information in the spatial and channel dimensions, suppressing ghosts, and enhancing the effective information in features. The MIM enlarges the network's receptive field, strengthens the salient information in the spatial dimension of features, and makes full use of the contextual semantic information of features at different scales to reconstruct the final HDR image.
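The channel-attention idea described above can be sketched as follows. This is a hedged illustration, not the paper's exact SACAM architecture: the shared two-layer bottleneck, the reduction ratio `r`, and the use of average pooling for "global information" and max pooling for "salient information" are assumptions in the style of common channel-attention designs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Sketch of a channel-attention block: channel weights are derived
    from global information (spatial average pooling) and salient
    information (spatial max pooling), passed through a shared two-layer
    bottleneck, and used to rescale the feature map per channel."""
    avg_desc = feat.mean(axis=(1, 2))   # global descriptor, shape (c,)
    max_desc = feat.max(axis=(1, 2))    # salient descriptor, shape (c,)
    # shared bottleneck MLP: (c,) -> (c//r,) -> (c,), summed then gated
    gate = sigmoid(np.maximum(avg_desc @ w1, 0) @ w2 +
                   np.maximum(max_desc @ w1, 0) @ w2)
    return feat * gate[:, None, None]   # channel-wise rescaling

rng = np.random.default_rng(0)
c, r = 8, 2
feat = rng.standard_normal((c, 16, 16))
w1 = rng.standard_normal((c, c // r)) * 0.1   # illustrative random weights
w2 = rng.standard_normal((c // r, c)) * 0.1
out = channel_attention(feat, w1, w2)
```

Because the sigmoid gate lies in (0, 1), each channel is attenuated in proportion to how useful its global and salient descriptors appear, which is the "effective information enhancement, invalid information suppression" behavior the method targets.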
Result
Our experiments are carried out on three public HDR datasets: the Kalantari, Sen, and Tursun datasets. Our method obtains better visual quality and higher objective evaluation scores. Specifically: 1) On the Kalantari dataset, our PSNR-L and SSIM-L are 41.101 3 and 0.986 5, respectively; PSNR-μ and SSIM-μ are 43.413 6 and 0.990 2, respectively; and HDR-VDP-2 is 64.985 3. To verify the generalization performance of each method, we also compare results on the unlabeled Sen and Tursun datasets. 2) On the Sen dataset, our method not only suppresses ghosts effectively but also restores clearer image details. 3) On the Tursun dataset, we reconstruct the scene structure more realistically and avoid artifacts effectively. In addition, an ablation study verifies the effectiveness of the proposed components.
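The linear-domain and tonemapped-domain metrics reported above can be illustrated with a short sketch. PSNR-μ is conventionally computed after μ-law compression of the linear HDR values; the compression parameter μ = 5000 and the synthetic test images here are assumptions for illustration.

```python
import numpy as np

MU = 5000.0  # mu-law compression parameter conventionally used for PSNR-mu

def mu_law_tonemap(hdr):
    """Mu-law tonemapping: compresses linear HDR values in [0, 1] so that
    metrics reflect perception in the tonemapped domain."""
    return np.log(1.0 + MU * hdr) / np.log(1.0 + MU)

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# illustrative synthetic ground truth and noisy prediction
rng = np.random.default_rng(0)
gt = rng.random((32, 32))
pred = np.clip(gt + rng.normal(0.0, 0.01, gt.shape), 0.0, 1.0)

psnr_l = psnr(pred, gt)                                    # linear domain
psnr_mu = psnr(mu_law_tonemap(pred), mu_law_tonemap(gt))   # tonemapped domain
```

The μ-law curve expands differences in dark regions, so the two PSNR values generally differ for the same image pair, which is why both domains are reported.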
Conclusion
We propose a spatial-aware channel attention guided multi-scale HDR reconstruction network (SCAMNet). The spatial-aware channel attention mechanism and the multi-scale information reconstruction module are integrated into one framework, which effectively removes the artifacts caused by object motion and recovers details in saturated regions. To enhance the information in the features that is useful for the reconstructed image, the spatial-aware channel attention mechanism establishes relationships between features in the spatial and channel dimensions. The multi-scale information reconstruction module makes full use of the contextual semantic relationships of features at different scales to further mine useful information in the input images and reconstruct the HDR image. The potential of our method is evaluated and verified both qualitatively and quantitatively.
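The multiscale idea summarized above can be sketched with a simple feature pyramid. This is a hedged illustration only: the paper's MIM is a learned module, whereas this sketch uses fixed average-pool downsampling and nearest-neighbour upsampling to show why coarse levels enlarge the effective receptive field.

```python
import numpy as np

def downsample2(x):
    """2x average pooling (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_to(x, shape):
    """Nearest-neighbour upsampling until the target shape is reached."""
    while x.shape[0] < shape[0]:
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    return x[:shape[0], :shape[1]]

def multiscale_fuse(feat, levels=3):
    """Illustrative multiscale fusion: build a pyramid, then upsample all
    levels back to full resolution and average them, so each output pixel
    aggregates context from progressively larger neighbourhoods."""
    scales = [feat]
    for _ in range(levels - 1):
        scales.append(downsample2(scales[-1]))
    fused = np.zeros_like(feat)
    for s in scales:
        fused += upsample_to(s, feat.shape)
    return fused / levels

fused = multiscale_fuse(np.ones((8, 8)))
```

A constant input passes through unchanged, while structured inputs are smoothed by the coarse levels, mimicking how larger-scale context regularizes the reconstruction.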
Keywords: multi-exposure image fusion; high dynamic range (HDR); attention; multiscale; ghost suppression
Eilertsen G, Kronander J, Denes G, Mantiuk R K and Unger J. 2017. HDR image reconstruction from a single exposure using deep CNNs. ACM Transactions on Graphics, 36(6): #178 [DOI: 10.1145/3130800.3130816]
Fan K and Zhou X B. 2014. The optimization of image fusion and real-time application for HDR scenarios. Journal of Image and Graphics, 19(6): 940-945 [DOI: 10.11834/jig.20140615]
Fotiadou K, Tsagkatakis G and Tsakalides P. 2020. Snapshot high dynamic range imaging via sparse representations and feature learning. IEEE Transactions on Multimedia, 22(3): 688-703 [DOI: 10.1109/TMM.2019.2933333]
Gallo O, Gelfandz N, Chen W C, Tico M and Pulli K. 2009. Artifact-free high dynamic range imaging//Proceedings of 2009 IEEE International Conference on Computational Photography (ICCP). San Francisco, USA: IEEE: 1-7 [DOI: 10.1109/ICCPHOT.2009.5559003]
Jinno T and Okuda M. 2008. Motion blur free HDR image acquisition using multiple exposures//Proceedings of the 15th IEEE International Conference on Image Processing. San Diego, USA: IEEE: 1304-1307 [DOI: 10.1109/ICIP.2008.4712002]
Kalantari N K and Ramamoorthi R. 2017. Deep high dynamic range imaging of dynamic scenes. ACM Transactions on Graphics, 36(4): #144 [DOI: 10.1145/3072959.3073609]
Kang S B, Uyttendaele M, Winder S and Szeliski R. 2003. High dynamic range video. ACM Transactions on Graphics, 22(3): 319-325 [DOI: 10.1145/882262.882270]
Li Z C, Sun Y P, Zhang L Y and Tang J H. 2021. CTNet: Context-based tandem network for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence: #3132068 [DOI: 10.1109/TPAMI.2021.3132068]
Liu Y L, Lai W S, Chen Y S, Kao Y L, Yang M H, Chuang Y Y and Huang J B. 2020. Single-image HDR reconstruction by learning to reverse the camera pipeline//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1648-1657 [DOI: 10.1109/CVPR42600.2020.00172]
Mantiuk R, Kim K J, Rempel A G and Heidrich W. 2011. HDR-VDP-2: a calibrated visual metric for visibility and quality predictions in all luminance conditions. ACM Transactions on Graphics, 30(4): #40 [DOI: 10.1145/2010324.1964935]
Raman S and Chaudhuri S. 2011. Reconstruction of high contrast images for dynamic scenes. The Visual Computer, 27(12): 1099-1114 [DOI: 10.1007/s00371-011-0653-0]
Sen P, Kalantari N K, Yaesoubi M, Darabi S, Goldman D B and Shechtman E. 2012. Robust patch-based HDR reconstruction of dynamic scenes. ACM Transactions on Graphics, 31(6): #203 [DOI: 10.1145/2366145.2366222]
Tursun O T, Akyüz A O, Erdem A and Erdem E. 2016. An objective deghosting quality metric for HDR images. Computer Graphics Forum, 35(2): 139-152 [DOI: 10.1111/cgf.12818]
Ward G. 2003. Fast, robust image registration for compositing high dynamic range photographs from hand-held exposures. Journal of Graphics Tools, 8(2): 17-30 [DOI: 10.1080/10867651.2003.10487583]
Wu S Z, Xu J R, Tai Y W and Tang C K. 2018. Deep high dynamic range imaging with large foreground motions//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 120-135 [DOI: 10.1007/978-3-030-01216-8_8]
Xu L, Jia J Y and Matsushita Y. 2010. Motion detail preserving optical flow estimation//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE: 1293-1300 [DOI: 10.1109/CVPR.2010.5539820]
Yan Q S, Gong D, Shi Q F, Van Den Hengel A, Shen C H, Reid I and Zhang Y N. 2019a. Attention-guided network for ghost-free high dynamic range imaging//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 1751-1760 [DOI: 10.1109/CVPR.2019.00185]
Yan Q S, Gong D, Zhang P P, Shi Q F, Sun J Q, Reid I and Zhang Y N. 2019b. Multi-scale dense networks for deep high dynamic range imaging//Proceedings of 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE: 41-50 [DOI: 10.1109/WACV.2019.00012]
Yan Q S, Zhang L, Liu Y, Zhu Y, Sun J Q, Shi Q F and Zhang Y N. 2020. Deep HDR imaging via a non-local network. IEEE Transactions on Image Processing, 29: 4308-4322 [DOI: 10.1109/TIP.2020.2971346]
Zheng J H, Li Z G, Zhu Z J, Wu S Q and Rahardja S. 2013. Hybrid patching for a sequence of differently exposed images with moving objects. IEEE Transactions on Image Processing, 22(12): 5190-5201 [DOI: 10.1109/TIP.2013.2283401]
Zhu X Y, Lu X M, Li Z W, Wu W F, Tan H Z and Chen Q. 2018. High dynamic range image fusion with low rank matrix recovery. Journal of Image and Graphics, 23(11): 1652-1665 [DOI: 10.11834/jig.180059]
Zimmer H, Bruhn A and Weickert J. 2011. Freehand HDR imaging of moving scenes with simultaneous resolution enhancement. Computer Graphics Forum, 30(2): 405-414 [DOI: 10.1111/j.1467-8659.2011.01870.x]