Mutual-detail convolution model for image super-resolution reconstruction
2018, Vol. 23, No. 4, pp. 572-582
Received: 2017-07-10; Revised: 2017-09-28; Published in print: 2018-04-16
DOI: 10.11834/jig.170361
Objective
Existing super-resolution convolutional neural networks require increasingly deep architectures and extensive training to reconstruct high-quality high-resolution images. They therefore depend heavily on large numbers of training samples, are difficult to train because of their many parameters, require many training iterations, and demand substantial hardware. To address these problems, this paper proposes an improved super-resolution reconstruction network model.
Method
Unlike the traditional single-input model, this study adopts a dual-input network model in which the two inputs supply complementary details. In addition to the feature extraction and mapping network of the original single-input SRCNN model, a new input is added. Exploiting the local self-similarity of images, a detail-supplement network is constructed to complement the image features, and one convolutional layer fuses the features obtained by the detail-supplement network with those extracted by the feature extraction network to reconstruct the high-resolution image.
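The dual-input structure described above can be sketched as follows. This is a minimal, single-channel NumPy sketch for illustration only: the number of filters per layer, the fusion scheme (element-wise addition followed by one convolution), and the assumption that both inputs are already at the target size are our simplifications, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    """'Same'-padded single-channel 2-D convolution (correlation) with zero padding."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def mutual_detail_forward(interp_lr, detail_in):
    # F1: feature extraction and mapping (one 9x9 layer followed by 3x3 layers)
    f1 = relu(conv2d(interp_lr, rng.normal(0.0, 0.01, (9, 9))))
    for _ in range(3):
        f1 = relu(conv2d(f1, rng.normal(0.0, 0.01, (3, 3))))
    # F2: detail-supplement network (11x11 then 5x5)
    f2 = relu(conv2d(detail_in, rng.normal(0.0, 0.01, (11, 11))))
    f2 = relu(conv2d(f2, rng.normal(0.0, 0.01, (5, 5))))
    # F3: fuse the two feature maps and reconstruct (fusion by addition is an assumption)
    return conv2d(f1 + f2, rng.normal(0.0, 0.01, (5, 5)))

def mse_loss(pred, target):
    # Mean squared error, the training loss named in the abstract
    return np.mean((pred - target) ** 2)
```

Because every convolution is 'same'-padded, the output retains the input's spatial size, so the fused features can be compared directly against the high-resolution target.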
Result
The proposed method is compared with other mainstream methods both quantitatively and subjectively. At a network depth similar to that of SRCNN, the PSNR of the proposed method at an upscaling factor of 3 is 0.17 dB and 0.08 dB higher than that of SRCNN on the Set5 and Set14 datasets, respectively. Subjectively, the proposed method restores image edges and texture details well.
Conclusion
Experiments show that the proposed mutual-detail network model can reconstruct images effectively and preserve more image details with less training and a relatively shallow network.
Objective
Single-image super-resolution (SR) is a classical problem in computer vision. In visual information processing, high-resolution images are desired for the considerable useful information they carry in fields such as medical imaging, remote sensing, video surveillance, and entertainment. However, in some scenarios, such as long-distance shooting, only low-resolution images of specific objects can be obtained because of the limitations of physical devices. SR has attracted considerable attention from the computer vision community over the past decades. We address the problem of generating a high-resolution image from a given low-resolution image, which is commonly referred to as single-image SR. Early methods include bicubic interpolation, Lanczos resampling, statistical priors, neighbor embedding, and sparse coding. In recent years, a series of convolutional neural network (CNN) models has been proposed for single-image SR. Deep learning attempts to learn layered, hierarchical representations of high-dimensional data. However, the classical CNN for SR is a single-input model, which limits its performance. These CNNs require deep networks, considerable training, and a large number of sample images to obtain images with good details. These requirements lead to numerous parameters to train, many training iterations, and large hardware demands. In view of these problems, an improved super-resolution reconstruction network model is proposed.
Method
Unlike the traditional single-input model, we adopt a mutual-detail convolution model with two inputs. The combination of paths at different scales enables the model to synthesize a wide range of receptive fields, and the features of image blocks of different sizes complement one another across scales. Low-dimensional and high-dimensional features are combined to supplement the details of the restored images and thereby improve the quality and detail of the reconstructed images. In this way, traditional self-similarity-based methods can be combined with neural networks. The entire convolution model can be divided into three parts: the F1, F2, and F3 networks. F1 is the feature extraction and nonlinear mapping network with four layers, which uses filters with spatial sizes of 9×9 and 3×3. F2 is the detail network used to complement the features of F1; it consists of two layers with filters of spatial sizes 11×11 and 5×5. F3 is the reconstruction network. We use the mean squared error as the loss function, which is minimized by stochastic gradient descent (SGD) with standard backpropagation. The network takes an original low-resolution image and a low-resolution image interpolated to the desired size as inputs and predicts the image details; the new input supplements the high-frequency information that is lost during reconstruction. As shown in the literature, deep learning generally benefits from big-data training. We use a training dataset of 500 images from BSD500 and also consider flipped and rotated versions of the training images: the original images are rotated by 90° and 270°. Considering training time and storage complexity, the training images are split into 33×33 and 39×39 sub-images with a stride of 14. We set the mini-batch size of SGD to 64 and the momentum parameter to 0.9.
Result
We use Set5 and Set14 as the validation sets. Following previous experiments, we adopt the conventional approach to super-resolving color images: the color images are transformed into the YCbCr space, the SR algorithm is applied only to the Y channel, and the Cb and Cr channels are upscaled by bicubic interpolation. We show the quantitative and qualitative results of our method in comparison with those of state-of-the-art methods. Compared with traditional methods and SRCNN, our method obtains better peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values on the Set5 and Set14 datasets. For an upscaling factor of 3, the PSNR values achieved by our method are 0.17 dB and 0.08 dB higher than those of the next best approach, SRCNN, on the two datasets. A similar trend is observed when SSIM is used as the performance metric. Compared with the training of SRCNN, the number of iterations of our approach is reduced by two orders of magnitude. With a lightweight structure, our method achieves performance superior to that of state-of-the-art methods.
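The evaluation protocol above (luma-only SR plus PSNR) can be sketched as follows. The BT.601 luma weights used here are the standard full-range approximation and may differ slightly from the paper's exact YCbCr conversion.

```python
import numpy as np

def rgb_to_y(rgb):
    # ITU-R BT.601 luma approximation; SR is applied to this channel only
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in dB between a reference and a test image
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

For intuition, an image whose pixels all differ from the reference by 5 gray levels has MSE 25 and therefore a PSNR of 10·log10(255²/25) ≈ 34.15 dB; a 0.17 dB gain corresponds to a small but consistent reduction in that error.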
Conclusion
The experiments show that the proposed method can effectively reconstruct images with considerable detail using minimal training and a relatively shallow network. However, compared with the results of very deep neural networks, the results of our method are not sufficiently precise, and the network structure is relatively simple. In future work, we will consider using deeper layers to acquire richer image features at different levels and extending our model to other image tasks.
Van Ouwerkerk J D. Image super-resolution survey[J]. Image and Vision Computing, 2006, 24(10):1039-1052.[DOI:10.1016/j.imavis.2006.02.026]
Irani M, Peleg S. Improving resolution by image registration[J]. CVGIP:Graphical Models and Image Processing, 1991, 53(3):231-239.[DOI:10.1016/1049-9652(91)90045-L]
Tian J, Ma K K. A survey on super-resolution imaging[J]. Signal, Image and Video Processing, 2011, 5(3):329-342.[DOI:10.1007/s11760-010-0204-6]
Li X, Orchard M T. New edge-directed interpolation[J]. IEEE Transactions on Image Processing, 2001, 10(10):1521-1527.[DOI:10.1109/83.951537]
Leu J G. Image enlargement based on a step edge model[J]. Pattern Recognition, 2000, 33(12):2055-2073.[DOI:10.1016/S0031-3203(99)00184-3]
Cha Y, Kim S. Edge-forming methods for color image zooming[J]. IEEE Transactions on Image Processing, 2006, 15(8):2315-2323.[DOI:10.1109/TIP.2006.875182]
Tai Y W, Liu S C, Brown M S, et al. Super resolution using edge prior and single image detail synthesis[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE, 2010: 2400-2407.[DOI:10.1109/CVPR.2010.5539933]
Zhang K B, Gao X B, Tao D C, et al. Single image super-resolution with multiscale similarity learning[J]. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(10):1648-1659.[DOI:10.1109/TNNLS.2013.2262001]
Sun J, Xu Z B, Shum H Y. Image super-resolution using gradient profile prior[C]//Proceedings of 2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, AK, USA: IEEE, 2008: 1-8.[DOI:10.1109/CVPR.2008.4587659]
Dong C, Loy C C, He K M, et al. Image super-resolution using deep convolutional networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2):295-307.[DOI:10.1109/TPAMI.2015.2439281]
Ledig C, Theis L, Huszár F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 4681-4690.[DOI:10.1109/CVPR.2017.19]
Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 1646-1654.[DOI:10.1109/CVPR.2016.182]
He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.[DOI:10.1109/CVPR.2016.90]
Freeman W T, Jones T R, Pasztor E C. Example-based super-resolution[J]. IEEE Computer Graphics and Applications, 2002, 22(2):56-65.[DOI:10.1109/38.988747]
Wang Z W, Liu D, Yang J C, et al. Deep networks for image super-resolution with sparse prior[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 370-378.[DOI:10.1109/ICCV.2015.50]
Wang Q, Tang X O, Shum H. Patch based blind image super resolution[C]//Proceedings of the 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2005: 709-716.[DOI:10.1109/ICCV.2005.186]
Lin Z C, Shum H Y. Fundamental limits of reconstruction-based super-resolution algorithms under local translation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(1):83-97.[DOI:10.1109/TPAMI.2004.1261081]
Chang H, Yeung D Y, Xiong Y M. Super-resolution through neighbor embedding[C]//Proceedings of 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington DC, USA: IEEE, 2004: I-275-I-282.[DOI:10.1109/CVPR.2004.1315043]
Yang J C, Wright J, Huang T S, et al. Image super-resolution via sparse representation[J]. IEEE Transactions on Image Processing, 2010, 19(11):2861-2873.[DOI:10.1109/TIP.2010.2050625]
Dong W S, Zhang L, Shi G M, et al. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization[J]. IEEE Transactions on Image Processing, 2011, 20(7):1838-1857.[DOI:10.1109/TIP.2011.2108306]
Timofte R, De Smet V, Van Gool L. Anchored neighborhood regression for fast example-based super-resolution[C]//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, NSW, Australia: IEEE, 2013: 1920-1927.[DOI:10.1109/ICCV.2013.241]
Timofte R, De Smet V, Van Gool L. A+: adjusted anchored neighborhood regression for fast super-resolution[C]//Computer Vision-ACCV 2014. Cham: Springer, 2015: 111-126.[DOI:10.1007/978-3-319-16817-3_8]
Glasner D, Bagon S, Irani M. Super-resolution from a single image[C]//Proceedings of the IEEE 12th International Conference on Computer Vision. Kyoto, Japan: IEEE, 2009: 349-356.[DOI:10.1109/ICCV.2009.5459271]
Freedman G, Fattal R. Image and video upscaling from local self-examples[J]. ACM Transactions on Graphics, 2011, 30(2):12.[DOI:10.1145/1944846.1944852]
Huang J B, Singh A, Ahuja N. Single image super-resolution from transformed self-exemplars[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 5197-5206.[DOI:10.1109/CVPR.2015.7299156]
Kim J, Lee J K, Lee K M. Deeply-recursive convolutional network for image super-resolution[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 1637-1645.[DOI:10.1109/CVPR.2016.181]
Keys R. Cubic convolution interpolation for digital image processing[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981, 29(6):1153-1160.[DOI:10.1109/TASSP.1981.1163711]
Bertasius G, Shi J B, Torresani L. DeepEdge: a multi-scale bifurcated deep network for top-down contour detection[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 4380-4389.[DOI:10.1109/CVPR.2015.7299067]
Nair V, Hinton G E. Rectified linear units improve restricted Boltzmann machines[C]//Proceedings of the 27th International Conference on Machine Learning. Haifa, Israel: ACM, 2010: 807-814.
Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks[C]//Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. Fort Lauderdale, USA: [s. n.], 2011: 315-323.
Suetake N, Sakano M, Uchino E. Image super-resolution based on local self-similarity[J]. Optical Review, 2008, 15(1):26-30.[DOI:10.1007/s10043-008-0005-0]
Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database[C]//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA: IEEE, 2009: 248-255.[DOI:10.1109/CVPR.2009.5206848]
Sugano Y, Matsushita Y, Sato Y, et al. Graph-based joint clustering of fixations and visual entities[J]. ACM Transactions on Applied Perception, 2013, 10(2):10.
Bevilacqua M, Roumy A, Guillemot C, et al. Low-complexity single-image super-resolution based on nonnegative neighbor embedding[C]//Proceedings of British Machine Vision Conference. Surrey, UK: BMVC, 2012.
Zeyde R, Elad M, Protter M. On single image scale-up using sparse-representations[C]//Proceedings of the 7th International Conference on Curves and Surfaces. Avignon, France: Springer-Verlag, 2010: 711-730.[DOI:10.1007/978-3-642-27413-8_47]
Schulter S, Leistner C, Bischof H. Fast and accurate image upscaling with super-resolution forests[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 3791-3799.[DOI:10.1109/CVPR.2015.7299003]