Attention mechanism embedded multi-scale restoration method for blurred image
2022, Vol. 27, No. 5, pp. 1682-1696
Published in print: 2022-05-16
Accepted: 2021-10-26
DOI: 10.11834/jig.210249
Zining Chen, Hongyi Zhang, Nianyin Zeng, Han Li. Attention mechanism embedded multi-scale restoration method for blurred image[J]. Journal of Image and Graphics, 2022, 27(5): 1682-1696.
Objective
Deblurring tasks generally struggle to learn image texture details: the restored images lack fine detail, their edges are not sharp enough, and restoration is time-consuming. By analyzing existing image deblurring methods and combining deep learning with adversarial learning, this paper proposes a novel multi-scale restoration method for blurred images based on the generative adversarial network (GAN).
Method
A multi-scale cascaded network restores the blurred image with a coarse-to-fine strategy, enhancing the texture details of the deblurred image. An improved residual convolution structure incorporates a parallel dilated convolution module, which enlarges the receptive field and captures features over a wider range without adding computation. A channel attention module models the correlations between channels to strengthen the weights of effective features and suppress ineffective ones. For the loss function, perceptual loss is combined with mean squared error (MSE) loss to keep the content of the generated image consistent with the sharp image.
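The combined loss described above can be sketched as follows. The paper pairs a pixel-wise MSE term with a perceptual term computed on deep feature maps; in this minimal pure-Python sketch a 1-D moving-average filter stands in for the real feature extractor, and the weight `lam` is an illustrative placeholder, not a value from the paper.

```python
def mse_loss(pred, target):
    """Pixel-wise mean squared error between restored and sharp images."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def toy_features(img):
    """Hypothetical stand-in feature extractor: 3-tap moving average.
    The actual method uses feature maps from a pretrained network."""
    return [(img[i] + img[i + 1] + img[i + 2]) / 3 for i in range(len(img) - 2)]

def perceptual_loss(pred, target):
    """MSE computed in feature space rather than pixel space."""
    return mse_loss(toy_features(pred), toy_features(target))

def total_loss(pred, target, lam=0.01):
    """Weighted sum of pixel MSE and perceptual loss (lam is illustrative)."""
    return mse_loss(pred, target) + lam * perceptual_loss(pred, target)
```

The pixel term keeps the restored image close to the ground truth, while the feature-space term penalizes semantic and texture mismatches that per-pixel error misses.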
Result
The methods are evaluated with the full-reference image quality metrics peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), together with restoration time. Compared with other methods, the deblurred images generated by the proposed method improve PSNR by at least 3.8%, and the edges of the restored images are sharper. When the deblurred images are fed to the YOLO-v4 (you only look once) object detection network, smaller objects can be detected, more objects are recognized, and the confidence of the recognized objects also increases.
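PSNR, the full-reference metric used above, is defined from the MSE between the restored and reference images. A minimal pure-Python sketch (real evaluations run on 2-D images; here images are flat pixel lists):

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    e = mse(a, b)
    if e == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / e)
```

For example, a restoration that is off by one or two gray levels per pixel already scores above 44 dB, which is why small PSNR percentage gains correspond to visibly sharper results.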
Conclusion
The blurred image is restored with a coarse-to-fine strategy; injecting a channel attention module and a parallel dilated convolution module into the residual network improves its performance, and further simplifying the network structure effectively increases restoration speed. Meanwhile, the restored images have sharper edges and richer detail.
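The dilated convolution idea above can be illustrated in one dimension: with dilation d and kernel size k, the receptive field spans d·(k-1)+1 input positions at the same cost as an ordinary convolution. The parallel combination below (summing branches at several rates) is a simplified hypothetical sketch of such a module, not the paper's exact design.

```python
def dilated_conv1d(x, kernel, dilation):
    """Valid-mode 1-D convolution with a dilation factor.
    The receptive field is dilation * (len(kernel) - 1) + 1 taps wide,
    so larger dilation sees wider context with the same number of weights."""
    k = len(kernel)
    span = dilation * (k - 1) + 1
    return [sum(kernel[j] * x[i + j * dilation] for j in range(k))
            for i in range(len(x) - span + 1)]

def parallel_dilated(x, kernel, rates=(1, 2, 4)):
    """Hypothetical parallel module: run several dilation rates on the same
    input and sum the branch outputs (truncated to the shortest branch)."""
    branches = [dilated_conv1d(x, kernel, d) for d in rates]
    n = min(len(b) for b in branches)
    return [sum(b[i] for b in branches) for i in range(n)]
```

With a difference kernel [1, 0, -1] on a ramp input, each doubling of the dilation doubles the measured difference, showing how the wider branches capture larger-scale structure.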
Objective
Image deblurring aims to recover a high-quality image from a low-quality blurred one. Traditional blur-kernel-based deblurring techniques struggle to find an ideal blur kernel for each pixel, and many restoration methods rely on hand-crafted prior knowledge of the images, which limits their generalization capability. With the rise of deep learning, convolutional neural network (CNN) models have become a dominant tool for image deblurring in computer vision. Nevertheless, a poorly designed CNN suffers from issues such as over-fitting and is tightly constrained by its parameters and topology. A central challenge in image deblurring is capturing detailed texture: restorations that miss fine features show inadequate detail and indistinct edges. Generative adversarial networks (GANs) have proved effective at preserving texture details in image in-painting and super-resolution. Building on GAN-based image-to-image translation, we present an end-to-end adversarial framework for image deblurring that also speeds up the multi-stage restoration process.
Method
First, a series of modified residual blocks is cascaded to build a multi-scale architecture that extracts features from coarse to fine, so that more texture details in a blurred image can be restored. Second, a parallel dilated convolution module enlarges the receptive field without extra computational burden. Third, a channel attention mechanism strengthens the weights of useful features and suppresses invalid ones by modeling inter-channel interdependencies. Finally, a network-based perceptual loss is combined with the conventional mean squared error (MSE) to form the total loss function, maintaining the fidelity of the image content. Consequently, the quality of the restored images is guaranteed in terms of both semantics and fine texture details, and the pixel-wise MSE term also gives the generated deblurred image smoother edge information.
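The channel attention step can be sketched as a squeeze-and-excitation style gate: global average pooling per channel (squeeze), a learned mapping to per-channel gates in (0, 1) (excitation), and channel-wise rescaling. In this pure-Python sketch the usual two-layer excitation MLP is collapsed to a single scalar weight per channel for brevity; that simplification is ours, not the paper's.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def channel_attention(feature_maps, weights):
    """Squeeze-and-excitation style channel attention.

    feature_maps: list of channels, each a flat list of activations.
    weights: one scalar per channel standing in for the excitation MLP.
    """
    # Squeeze: global average pooling per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: map pooled statistics to per-channel gates in (0, 1).
    gates = [sigmoid(w * s) for w, s in zip(weights, squeezed)]
    # Scale: reweight each channel by its gate.
    return [[gate * v for v in ch] for gate, ch in zip(gates, feature_maps)]
```

Channels whose pooled statistics drive the gate toward 1 pass through nearly unchanged, while uninformative channels are attenuated, which is how the module emphasizes effective features.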
Result
The GoPro dataset, containing 3 214 pairs of samples, is adopted for training and testing: 2 103 pairs form the training set and the remaining 1 111 pairs serve as test data. To enhance the generalization capability of the network, data augmentation such as flipping and random-angle rotation is applied. Each training sample is randomly cropped to 256×256 pixels, and the pixel values of both sharp and blurred images are normalized to the range [-1, 1]. To evaluate the method comprehensively, peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and restoration time are used as indicators. The experimental results demonstrate that the overall performance of the proposed method is satisfactory: it effectively removes blurred regions from images. Compared with existing works, our method increases PSNR by at least 3.8% in less running time, indicating the feasibility and superiority of the proposal. The restored images have sharper edges, and the method restores blurry images with different sizes of blur kernels to a certain extent. When the restored images are applied to the YOLO-v4 (you only look once) object detection task, both identification accuracy and confidence improve significantly, reflecting the effectiveness of the designed strategies.
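The preprocessing described above (random cropping, flipping, and normalizing pixel values to [-1, 1]) can be sketched as follows; the 1-D crop is a simplification of the paper's 256×256 2-D patch cropping.

```python
import random

def normalize(pixels):
    """Map 8-bit pixel values [0, 255] to [-1, 1], as done before training."""
    return [v / 127.5 - 1.0 for v in pixels]

def random_crop(img, size, rng=random):
    """Random crop of a pixel row; the paper crops 256x256 patches."""
    start = rng.randrange(len(img) - size + 1)
    return img[start:start + size]

def random_flip(img, rng=random):
    """Flip with probability 0.5, one of the augmentations mentioned."""
    return img[::-1] if rng.random() < 0.5 else img
```

Note that the same crop and flip must be applied to a blurred image and its sharp counterpart so the training pair stays pixel-aligned.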
Conclusion
Our deblurring method extracts features from blurred images sufficiently, from coarse to fine. Specifically, multi-scale improved residual blocks are cascaded to learn subtle texture details, and a parallel dilated convolution module and a channel attention mechanism are embedded to improve the capability of the residual blocks. Moreover, the training loss is modified by adding a perceptual term to the traditional mean squared error. Consequently, the restored images show sharper edges and more abundant detail, and the experimental analysis demonstrates both the quality and the efficiency of the approach.
Keywords: attention mechanism; image restoration; deep learning; generative adversarial network (GAN); multi-scale
Arjovsky M, Chintala S and Bottou L. 2017. Wasserstein GAN[EB/OL]. [2021-03-22]. https://arxiv.org/pdf/1701.07875.pdf
Bochkovskiy A, Wang C Y and Liao H Y M. 2020. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. [2021-03-22]. https://arxiv.org/pdf/2004.10934.pdf
Deng J, Dong W, Socher R, Li L, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial networks[EB/OL]. [2021-03-22]. https://arxiv.org/pdf/1406.2661.pdf
Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V and Courville A. 2017. Improved training of Wasserstein GANs//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: ACM: 5769-5779
Hu J, Shen L, Albanie S, Sun G and Wu E H. 2020. Squeeze-and-excitation networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(8): 2011-2023[DOI: 10.1109/TPAMI.2019.2913372]
Isola P, Zhu J Y, Zhou T H and Efros A A. 2017. Image-to-image translation with conditional adversarial networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 5967-5976 [DOI: 10.1109/CVPR.2017.632]
Kim T H and Lee K M. 2014. Segmentation-free dynamic scene deblurring//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 2766-2773 [DOI: 10.1109/CVPR.2014.348]
Köhler R, Hirsch M, Mohler B, Schölkopf B and Harmeling S. 2012. Recording and playback of camera shake: benchmarking blind deconvolution with a real-world database//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 27-40 [DOI: 10.1007/978-3-642-33786-4_3]
Kupyn O, Budzan V, Mykhailych M, Mishkin D and Matas J. 2018. DeblurGAN: blind motion deblurring using conditional adversarial networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8183-8192 [DOI: 10.1109/CVPR.2018.00854]
Kupyn O, Martyniuk T, Wu J R and Wang Z Y. 2019. DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 8877-8886 [DOI: 10.1109/ICCV.2019.00897]
Nah S, Kim T H and Lee K M. 2017. Deep multi-scale convolutional neural network for dynamic scene deblurring//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 257-265 [DOI: 10.1109/CVPR.2017.35]
Noroozi M, Chandramouli P and Favaro P. 2017. Motion deblurring in the wild//Proceedings of the 39th German Conference on Pattern Recognition. Basel, Switzerland: Springer: 65-77 [DOI: 10.1007/978-3-319-66709-6_6]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: [s. n.]
Su S C, Delbracio M, Wang J, Sapiro G, Heidrich W and Wang O. 2017. Deep video deblurring for hand-held cameras//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 237-246 [DOI: 10.1109/CVPR.2017.33]
Sun J, Cao W F, Xu Z B and Ponce J. 2015. Learning a convolutional neural network for non-uniform motion blur removal//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 769-777 [DOI: 10.1109/CVPR.2015.7298677]
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2015. Going deeper with convolutions//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1-9 [DOI: 10.1109/CVPR.2015.7298594]
Tao X, Gao H Y, Shen X Y, Wang J and Jia J Y. 2018. Scale-recurrent network for deep image deblurring//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8174-8182 [DOI: 10.1109/CVPR.2018.00853]
Wu D, Zhao H T and Zheng S B. 2020. Densely connected convolutional network image deblurring. Journal of Image and Graphics, 25(5): 890-899 (in Chinese)
Xu L, Zheng S C and Jia J Y. 2013. Unnatural L0 sparse representation for natural image deblurring//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 1107-1114 [DOI: 10.1109/CVPR.2013.147]