Image denoising via residual network based on perceptual loss
2018, Vol. 23, No. 10, pp. 1483-1491
Received: 2018-02-27; Revised: 2018-06-01; Published in print: 2018-10-16
DOI: 10.11834/jig.180069
Objective
Many existing denoising algorithms over-smooth edge information and tend to produce color artifacts while removing noise. To address these shortcomings, this paper proposes a deep residual denoising network based on a joint perceptual loss.
Method
First, a low-pass filter decomposes the noisy image into a high-frequency layer and a low-frequency layer. The high-frequency layer, which contains the noise and the edge information, is fed into the designed residual network, which learns an end-to-end residual mapping with a conventional per-pixel loss to predict the noise residual image. A global skip connection running directly from input to output then yields an initial, relatively blurry denoised result. Finally, a pretrained semantic segmentation network is cascaded to define a perceptual loss, which guides the preceding denoising model to learn more semantic feature information and enhance the blurred edge details, producing a sharper and more realistic denoised result.
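As a rough illustration only (not the authors' exact 26-layer architecture), the decomposition-plus-residual pipeline described above might look as follows in PyTorch; the Gaussian low-pass filter, block count, and channel width are assumptions made for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.5):
    # Illustrative 2D Gaussian low-pass kernel, one copy per RGB channel.
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).expand(3, 1, size, size)

def decompose(noisy, kernel):
    # Low-frequency layer: smoothed background. High-frequency layer:
    # everything else, i.e., the noise plus the edge information.
    low = F.conv2d(noisy, kernel, padding=kernel.shape[-1] // 2, groups=3)
    return noisy - low, low

class PreActBlock(nn.Module):
    # Residual block with an identity mapping, in the spirit of ResNet.
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return x + self.conv2(F.relu(self.conv1(x)))

class ResidualDenoiser(nn.Module):
    # Simplified body; no pooling or batch normalization, as in the paper.
    def __init__(self, ch=64, blocks=8):
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)
        self.body = nn.Sequential(*[PreActBlock(ch) for _ in range(blocks)])
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, high_freq):
        # The network predicts the noise residual, not the clean image.
        return self.tail(self.body(self.head(high_freq)))

def denoise(noisy, net, kernel):
    high, _ = decompose(noisy, kernel)
    noise_map = net(high)
    return noisy - noise_map  # global skip connection from input to output
```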
Result
Comparative experiments are conducted both qualitatively and quantitatively. With the peak signal-to-noise ratio (PSNR) as the quantitative metric of algorithm performance, the results show that, when trained with the same per-pixel loss as the compared methods, the proposed network produces the best scores: 30.51 dB, 30.60 dB, and 29.38 dB on the Set5, Set14, and BSD100 test sets at noise level 25, respectively. In the qualitative visual comparison, the proposed perceptual loss model clearly achieves sharper denoised results, preserving more edge information and texture details where other methods produce blurred regions. A blind denoising experiment is also conducted on an image containing several noise levels; the results show that the trained model can handle multiple unknown noise levels in a single pass and produce satisfactory denoised outputs without extra artifacts.
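For reference, the PSNR figures above follow the standard definition, 10·log10(MAX²/MSE), which can be computed with a minimal sketch like this:

```python
import numpy as np

def psnr(clean, denoised, max_val=255.0):
    # Peak signal-to-noise ratio in dB between two images of the same shape.
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```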
Conclusion
The proposed image denoising algorithm, a residual network with an edge-enhancing perceptual loss, preserves more of the easily blurred edge details while removing noise, alleviates the over-smoothing of denoised results, and improves the visual quality of images.
Objective
Image denoising is a classical image reconstruction problem in low-level computer vision: it estimates the latent clean image from a noisy observation. Digital images are often corrupted by noise from imaging equipment and the external environment during digitization and transmission. Although several methods have achieved reasonable results in recent years, they rarely address over-smoothing effects and the loss of edge details. Thus, a novel image denoising method via residual learning based on edge enhancement is proposed.
Method
Recently, owing to its powerful learning ability, the very deep convolutional neural network has been widely used for image restoration. Inspired by ResNet, and unlike other direct denoising networks, identity mappings are introduced to allow our residual network to grow in depth, and the architecture is then slightly modified to better suit the denoising task. Pooling layers and batch normalization are removed to preserve details; instead, high-frequency layer decomposition and a global skip connection are used to prevent over-fitting, as they change the input and output of the network and thereby reduce the solution space. To speed up the training process, we select the rectified linear unit (ReLU) as the activation function and remove the ReLU placed before each convolution layer. Traditionally, image restoration work has used the per-pixel loss between the ground truth and the restored image as the optimization target to obtain excellent quantitative scores. However, recent research has shown that minimizing only pixel-wise errors over low-level pixels is prone to losing details and over-smoothing the results. Meanwhile, the perceptual loss function has been shown to generate high-quality images with better visual performance by capturing the difference between high-level feature representations, although it sometimes fails to preserve color and local spatial information. To combine both benefits, we propose a new joint loss function that consists of a normal pixel-to-pixel loss and a perceptual loss with appropriate weights. In summary, the flow of our method is as follows. First, the high-frequency layer of the noisy image is used as the input, with the background information removed. Then, a residual mapping is trained to predict the difference between the clean and noisy images as output instead of the final denoised image. The denoised result is improved further by a joint loss function defined as the weighted sum of the pixel-to-pixel Euclidean loss and the perceptual loss. A well-trained convolutional neural network is connected to extract the semantic information measured by our perceptual loss. This setup encourages the training process to learn similar feature representations rather than to match each low-level pixel, which guides the front denoising network in reconstructing more edges and details. Unlike normal denoising models built for a single specific noise level, our single model can deal with noise of unknown levels (i.e., blind denoising). We employ CBSD400 as the training set and evaluate quality on Set5, Set14, and CBSD100 with noise levels of 15, 25, and 50. To train the network for a specific noise level, we generate noisy images by adding Gaussian noise with standard deviations of σ = 15, 25, and 50; alternatively, we train a single blind network for the unknown noise range [1, 50].
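A minimal sketch of such a joint loss is given below, assuming a frozen VGG16 slice as a stand-in for the pretrained segmentation network used in the paper; the weight `lam` is an illustrative value, not the paper's.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class JointLoss(nn.Module):
    """Weighted sum of a pixel-wise Euclidean loss and a perceptual loss.

    The paper defines the perceptual term on features of a pretrained
    semantic segmentation network; a frozen VGG16 feature extractor is
    substituted here purely for illustration.
    """
    def __init__(self, lam=0.1):
        super().__init__()
        features = vgg16(weights="IMAGENET1K_V1").features[:16]  # up to relu3_3
        for p in features.parameters():
            p.requires_grad = False  # the feature extractor stays fixed
        self.features = features.eval()
        self.mse = nn.MSELoss()
        self.lam = lam

    def forward(self, denoised, clean):
        pixel_loss = self.mse(denoised, clean)
        perceptual_loss = self.mse(self.features(denoised),
                                   self.features(clean))
        return pixel_loss + self.lam * perceptual_loss
```

Training with `pixel_loss` alone corresponds to the MSE-only baseline; the perceptual term pushes the network to match high-level feature representations rather than individual pixels.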
Result
To verify the effectiveness of the proposed network, we compare the quantitative and qualitative results of our method with those of state-of-the-art methods, including BM3D, TNRD, and DnCNN. Performance is evaluated with the peak signal-to-noise ratio (PSNR) as the quantitative indicator. The results show that, among the compared methods, the proposed network trained with the MSE loss alone produces the best scores: the proposed algorithm (MSE-S) outperforms BM3D, TNRD, and DnCNN by 0.63 dB, 0.55 dB, and 0.17 dB, respectively. In the qualitative visual comparison, the perceptual loss model proposed in this paper achieves clearly sharper denoised results; compared with the fuzzy regions generated by other methods, it preserves more edge information and texture details. We perform another experiment to show the ability of blind denoising: the input is composed of noisy parts at three levels, 10, 30, and 50. The results indicate that our blind model can generate a satisfactory restored output without artifacts even when different parts of the input are corrupted by different levels of noise.
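The noisy training data described in the Method section can be synthesized roughly as follows; the uniform sampling of σ for the blind model follows the stated range [1, 50], while the remaining details are assumptions of this sketch.

```python
import numpy as np

def add_gaussian_noise(clean, sigma, rng=None):
    # Corrupt a clean image (float array in [0, 255]) with additive
    # white Gaussian noise of standard deviation sigma.
    rng = rng or np.random.default_rng()
    noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
    return np.clip(noisy, 0.0, 255.0)

# Fixed-level training: one model per sigma in {15, 25, 50}.
# Blind training: draw sigma uniformly from [1, 50] per sample, so a
# single model sees many noise levels during training.
def sample_blind(clean, rng=None):
    rng = rng or np.random.default_rng()
    sigma = rng.uniform(1.0, 50.0)
    return add_gaussian_noise(clean, sigma, rng), sigma
```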
Conclusion
In this paper, we describe a deep residual denoising network of 26 weight layers in which a perceptual loss is adopted to enhance detail information. Residual learning and high-frequency layer decomposition are used to reduce the solution space and speed up the training process without pooling layers or batch normalization. Unlike normal denoising models built for a single specific noise level, our model can deal with blind denoising problems involving different unknown noise levels. The experiments show that the proposed network achieves superior quantitative and qualitative performance and recovers the majority of missing details from low-quality observations. In the future, we will explore how to handle other kinds of noise, especially complex real-world noise, and consider a single comprehensive network for more image restoration tasks. We will also likely focus on more visually perceptible indicators in addition to PSNR.
Buades A, Coll B, Morel J M. A non-local algorithm for image denoising[C]//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE, 2005: 60-65.[DOI:10.1109/CVPR.2005.38]
Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries[J]. IEEE Transactions on Image Processing, 2006, 15(12):3736-3745.[DOI:10.1109/TIP.2006.881969]
Dabov K, Foi A, Katkovnik V, et al. Image denoising by sparse 3-D transform-domain collaborative filtering[J]. IEEE Transactions on Image Processing, 2007, 16(8):2080-2095.[DOI:10.1109/TIP.2007.901238]
Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 1646-1654.[DOI:10.1109/CVPR.2016.182]
Nah S, Kim T H, Lee K M. Deep multi-scale convolutional neural network for dynamic scene deblurring[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 257-265.[DOI:10.1109/CVPR.2017.35]
Jain V, Seung H S. Natural image denoising with convolutional networks[C]//Proceedings of the 21st International Conference on Neural Information Processing Systems. Vancouver, British Columbia, Canada: Curran Associates Inc., 2008: 769-776.
Vincent P, Larochelle H, Bengio Y, et al. Extracting and composing robust features with denoising autoencoders[C]//Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland: ACM, 2008: 1096-1103.[DOI:10.1145/1390156.1390294]
Xie J Y, Xu L L, Chen E H. Image denoising and inpainting with deep neural networks[C]//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada: Curran Associates Inc., 2012: 341-349.
Chen Y, Pock T. Trainable nonlinear reaction diffusion: a flexible framework for fast and effective image restoration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1256-1272.[DOI:10.1109/TPAMI.2016.2596743]
Mao X J, Shen C H, Yang Y B. Image restoration using convolutional auto-encoders with symmetric skip connections[J]. arXiv preprint arXiv: 1606.08921, 2016.
Zhang K, Zuo W M, Chen Y J, et al. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising[J]. IEEE Transactions on Image Processing, 2017, 26(7):3142-3155.[DOI:10.1109/TIP.2017.2662206]
Johnson J, Alahi A, Li F F. Perceptual losses for real-time style transfer and super-resolution[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 694-711.[DOI:10.1007/978-3-319-46475-6_43]
Badrinarayanan V, Kendall A, Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12):2481-2495.[DOI:10.1109/TPAMI.2016.2644615]
He K M, Zhang X Y, Ren S Q, et al. Identity mappings in deep residual networks[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 630-645.[DOI:10.1007/978-3-319-46493-0_38]
He K M, Sun J, Tang X O. Guided image filtering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(6):1397-1409.[DOI:10.1109/TPAMI.2012.213]
Ledig C, Theis L, Huszár F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 105-114.[DOI:10.1109/CVPR.2017.19]
Jia Y Q, Shelhamer E, Donahue J, et al. Caffe: convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM International Conference on Multimedia. Orlando, Florida, USA: ACM, 2014: 675-678.[DOI:10.1145/2647868.2654889]