3D attention and Transformer based single image deraining network
- Vol. 27, Issue 5, Pages: 1509-1521(2022)
Received:14 September 2021,
Revised:21 February 2022,
Accepted:28 February 2022,
Published:16 May 2022
DOI: 10.11834/jig.210794
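Objective
Because rain streaks in rainy images differ in direction, density, and size, single image deraining remains a challenging research problem. Existing algorithms still over-remove or under-remove rain on some complex images: high-frequency edge information is erased during deraining, or rain components remain in the result. To address these problems, this paper proposes the three-dimension attention and Transformer deraining network (TDATDN).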
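Method
The three-dimension attention mechanism is combined with the residual dense block structure to address the fusion of high-dimensional channel features in residual dense blocks. A Transformer is used to compute the global correlation of features. To counter the loss of high-frequency information and the erasure of structural information during deraining, the multi-scale structural similarity loss is combined with commonly used image deraining loss functions to train the network.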
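The abstract does not state how the three-dimension attention weights are computed. As one possible reading, the sketch below assumes a SimAM-style, parameter-free formulation in which every channel/height/width position gets its own weight from an energy term. The function names, the `e_lambda` default, and the toy input are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(channel, e_lambda=1e-4):
    """Return one weight per spatial position of a single H x W channel."""
    values = [v for row in channel for v in row]
    count = len(values)
    if count < 2:
        raise ValueError("need at least two positions per channel")
    mean = sum(values) / count
    # Squared deviation of every position from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(sum(row) for row in sq_dev) / (count - 1)
    # Positions that stand out from the rest of the channel get a larger weight.
    return [
        [sigmoid(d / (4.0 * (variance + e_lambda)) + 0.5) for d in row]
        for row in sq_dev
    ]


def attention_3d(feature, e_lambda=1e-4):
    """Reweight a C x H x W feature map (nested lists) position by position."""
    refined = []
    for channel in feature:
        weights = attention_weights(channel, e_lambda)
        refined.append(
            [[v * w for v, w in zip(row, wrow)] for row, wrow in zip(channel, weights)]
        )
    return refined


# Toy 2 x 2 x 2 feature map (hypothetical values).
feature = [
    [[0.0, 0.0], [0.0, 4.0]],  # channel with one outlier position
    [[1.0, 1.0], [1.0, 1.0]],  # flat channel
]
refined = attention_3d(feature)
```

Because the weights are derived from the feature statistics alone, a module of this kind adds no learnable parameters; whether TDATDN uses exactly this form is not confirmed by the abstract.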
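Result
The proposed TDATDN was evaluated on the Rain12000 rain streak dataset, where it reached a peak signal to noise ratio (PSNR) of 33.01 dB and a structural similarity (SSIM) of 0.927 8. The experimental results show that, compared with earlier deep-learning-based deraining networks, the proposed algorithm significantly improves single image deraining.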
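For readers unfamiliar with the PSNR metric reported here, the following is a minimal sketch of how it is commonly computed between a ground-truth image and a derained output. It assumes 8-bit grayscale images stored as lists of rows; the function name and the pixel values are hypothetical and unrelated to the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    ref = [v for row in reference for v in row]
    out = [v for row in restored for v in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2 x 2 images (hypothetical pixel values).
clean = [[52, 55], [61, 59]]
derained = [[54, 55], [60, 57]]
score = psnr(clean, derained)
```

A higher PSNR means the derained image is closer to the ground truth in a mean-squared-error sense.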
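Conclusion
The proposed TDATDN image deraining network combines the advantages of the 3D attention mechanism, the Transformer, and the encoder-decoder architecture, and performs single image deraining well.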
Objective
Vision-based computer systems are often required to process and analyze images and videos captured in adverse weather such as rain, snow, sleet, or fog. The degradation caused by severe weather significantly distorts the visual quality of images and reduces the performance of computer vision systems. Hence, it is important to develop automatic image deraining algorithms. Our research focuses on removing rain streaks from a single image. Traditional image rain removal models rely mainly on prior information: they regard the rainy image as a combination of a rain layer and a background layer, and define the image deraining task as the separation of the two layers. Because rain streaks in rainy images differ in direction, density, and size, single image deraining remains a challenging computer vision task. Deep learning has benefited image deraining, but existing models still over-remove or under-remove rain in complicated scenes: the high-frequency edge information of some complex images is erased during rain removal, or rain components remain in the derained image. We propose the three-dimension attention and Transformer de-raining network (TDATDN) for single image rain removal, which improves on the encoder-decoder deraining architecture and combines the strengths of 3D attention, the Transformer, and the encoder-decoder structure to improve the deraining result. Our training dataset consists of 12 000 pairs of training images (covering three types of rain images with different rain densities), and 1 200 test images are used to evaluate the rain removal effect. Input images are scaled to 256×256 for training and testing. The Adam optimizer is used for training, with an initial learning rate of 1×10⁻⁴ and 100 training epochs. The learning rate is multiplied by 0.5 when training reaches 15 epochs.
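The training settings above can be illustrated with a small learning-rate schedule. The abstract does not say whether the rate is halved once at epoch 15 or repeatedly; the sketch below assumes a step decay that halves the rate every 15 epochs, which is only one reading. The function name and parameter names are hypothetical.

```python
def learning_rate(epoch, base_lr=1e-4, decay=0.5, step=15):
    """Learning rate for a zero-based epoch under the assumed step decay."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Assumption: the rate is multiplied by `decay` once every `step` epochs.
    return base_lr * decay ** (epoch // step)


TOTAL_EPOCHS = 100  # number of epochs stated in the abstract
schedule = [learning_rate(epoch) for epoch in range(TOTAL_EPOCHS)]
```

In PyTorch, this assumed schedule would correspond to an Adam optimizer paired with `torch.optim.lr_scheduler.StepLR` using `step_size=15` and `gamma=0.5`.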
Method
Our method integrates the three-dimension attention mechanism into the residual dense block structure to resolve the problem of fusing high-dimensional channel features in residual dense blocks. We then use the resulting three-dimension attention residual dense block as the backbone to build an encoder-decoder image de-raining network, and use a Transformer to compute the global contextual relevance of the network's deep semantic information. The self-attention feature encoding produced by the Transformer is up-sampled along the image restoration path of the decoder. To obtain a derained result with richer high-frequency details, each up-sampled feature map is concatenated along the channel dimension with the corresponding encoder feature map. To counter the loss of high-frequency information and the erasure of structural information during rain removal, we combine the multi-scale structural similarity loss with commonly used image de-raining loss functions to train the network.
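To make the idea of global contextual relevance concrete, here is a minimal sketch of scaled dot-product self-attention, the core operation of a Transformer. It assumes the deepest encoder feature map has already been flattened into a sequence of tokens, and it uses identity query/key/value projections instead of learned ones to stay short. The names and toy tokens are illustrative, not the authors' implementation.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of N feature vectors, e.g. one per spatial position of a
    feature map after flattening H x W into a sequence.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    output = []
    for query in tokens:
        # Similarity of this position to every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Each output token is a weighted mix of all input tokens.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        output.append(mixed)
    return output


# Three toy tokens of dimension 2 (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens)
```

Every output token depends on all input positions, which is what lets the network relate distant image regions; a full Transformer layer would add learned projections, multiple heads, and a feed-forward sublayer.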
Result
Our TDATDN network is evaluated on the Rain12000 rain streak dataset, where the peak signal to noise ratio (PSNR) reaches 33.01 dB and the structural similarity (SSIM) reaches 0.927 8. Comparative experiments were carried out to verify the results of the combined algorithm, and they show that it improves single image rain removal over the compared methods.
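As a rough illustration of the SSIM metric reported here, the sketch below computes a simplified, single-window SSIM from global image statistics. Standard SSIM averages the same expression over local windows, and the multi-scale variant repeats it at several resolutions, so this is a simplification under stated assumptions. The function name and pixel values are hypothetical.

```python
def global_ssim(image_x, image_y, max_value=255.0):
    """Single-window SSIM from global statistics of two grayscale images."""
    x = [v for row in image_x for v in row]
    y = [v for row in image_y for v in row]
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("images must have equal size and at least two pixels")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Stabilizing constants with the commonly used factors 0.01 and 0.03.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 2 x 2 images (hypothetical pixel values).
clean = [[52, 55], [61, 59]]
derained = [[54, 55], [60, 57]]
similarity = global_ssim(clean, derained)
```

A value of 1 indicates identical images; lower values indicate differences in luminance, contrast, or structure.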
Conclusion
Our image de-raining network combines the advantages of the 3D attention mechanism, the Transformer, and the encoder-decoder architecture.