Image super-resolution reconstruction from multi-channel recursive residual network

Cheng Deqiang1, Guo Xin1, Chen Liangliang1, Kou Qiqi2, Zhao Kai1, Gao Rui1 (1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; 2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China)

Abstract
Objective Neural-network-based image super-resolution reconstruction mainly relies on a single network that learns, through nonlinear mapping, the feature-information relationship between high and low resolution. In this process, image feature information is easily lost in shallower networks, while deepening the network increases training time and training difficulty. To address the long training time and the blurred detail in the reconstruction results, a multi-channel recursive residual learning mechanism is proposed to improve network training efficiency and image reconstruction quality. Method A multi-channel recursive residual network model is designed. First, the model reuses residual network blocks recursively to form a 32-layer recursive network, which reduces network parameters while increasing network depth, thereby accelerating convergence and capturing richer feature information. Then, feature information collected under different convolution kernels is fed into the recursive residual network of the corresponding channel and subsequently into a shared reconstruction network, improving the ability to reconstruct detail information. Finally, a cross-learning mechanism is introduced that connects channels 1, 2, and 3 in pairwise combinations, further accelerating the fusion of feature information from different channels, promoting parameter transfer, and improving reconstruction performance. Result The proposed model is trained on the DIV2K (DIVerse 2K) dataset, tested on the Set5, Set14, BSD100, and Urban100 datasets, and compared with Bicubic, SRCNN (super-resolution convolutional neural network), VDSR (super-resolution using very deep convolutional network), LapSRN (deep Laplacian pyramid networks for fast and accurate super-resolution), and EDSR_baseline (enhanced deep residual networks for single image super-resolution, baseline). The results show that the proposed model captures detail features better and produces images with clearer and richer detail. In terms of objective metrics, the proposed algorithm improves clearly; on the detail-rich Urban100 dataset in particular, PSNR (peak signal-to-noise ratio) increases on average by 3.87 dB, 1.93 dB, 1.00 dB, 1.12 dB, and 0.48 dB over these methods, respectively, and network training efficiency improves by 30% compared with the non-recursive residual network. Conclusion The proposed model achieves better visual results and objective quality scores, its training takes less time than that of the non-recursive residual network, and it can be used for super-resolution reconstruction of images in complex scenes.
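The recursive reuse of a residual block described above is the core efficiency idea of the model. A minimal PyTorch sketch of that idea is given below, assuming a standard conv-ReLU-conv block whose weights are shared across every recursion; the channel width, kernel size, and the reading of the 32-layer figure as 16 passes through a two-convolution block are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class RecursiveResidualBranch(nn.Module):
    """One residual block defined once and reused recursively, so the effective
    depth grows without adding parameters. 16 recursions of a two-convolution
    block is one possible reading of the 32 recursive layers quoted above."""

    def __init__(self, channels=64, kernel_size=3, recursions=16):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
        self.relu = nn.ReLU(inplace=True)
        self.recursions = recursions

    def forward(self, x):
        out = x
        for _ in range(self.recursions):
            # The same convolution weights are applied on every pass.
            out = out + self.conv2(self.relu(self.conv1(out)))
        return out + x  # long skip connection over the whole branch


if __name__ == "__main__":
    x = torch.randn(1, 64, 48, 48)
    print(RecursiveResidualBranch()(x).shape)  # torch.Size([1, 64, 48, 48])
```

Because only one block's parameters exist, unrolling more recursions deepens the effective network without enlarging the model, which is what allows deeper feature extraction with fewer parameters and faster convergence.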
Keywords
Image super-resolution reconstruction from multi-channel recursive residual network

Cheng Deqiang1, Guo Xin1, Chen Liangliang1, Kou Qiqi2, Zhao Kai1, Gao Rui1 (1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; 2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China)

Abstract
Objective The limitations of the external environment, hardware conditions, and network resources often mean that the images obtained in daily life are low-resolution, which degrades their usefulness in downstream applications. Super-resolution reconstruction has therefore become an important research topic: high-resolution images are recovered by exploiting the relationship between high- and low-resolution images, and obtaining this correspondence is the key to the technique. The basic neural-network approach to image super-resolution uses a single-channel network to learn the feature-information relationship between high and low resolution. However, feature information is easily lost in shallow layers; this low utilization of feature information leads to unsatisfactory reconstruction at large magnification factors and poor restoration of image detail. Simply deepening the network increases training time and difficulty and wastes a large amount of hardware resources and time. A multi-channel recursive residual network model is proposed to solve these problems. The model improves training efficiency by recursively reusing residual network blocks and enhances detail reconstruction through multi-channel and cross-learning mechanisms. Method A multi-channel recursive cross-residual network model is designed. The large number of convolutional layers in such models is the main reason training is time-consuming, while using fewer convolutional layers degrades reconstruction performance; recursive reuse of residual network blocks is therefore adopted to deepen the network while speeding up training. First, the model recursively multiplexes residual network blocks to form a 32-layer recursive network, reducing the number of parameters while increasing depth, which accelerates training and yields richer information. Second, the amount of feature information obtainable by deepening the network alone, which strongly influences reconstruction performance, is limited, and feature information is easily lost within the network. Multi-channel networks are therefore used to obtain richer feature information, increase the number of information paths, and reduce the rate of information loss, improving the network's ability to reconstruct image detail. Finally, a cross-learning mechanism between the channels is introduced to increase the degree of information fusion: it accelerates the fusion of feature information from different channels, promotes parameter transfer, and effectively improves training efficiency. Result The performance of the algorithm is measured using peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and network training time. Bicubic, A+, super-resolution convolutional neural network (SRCNN), super-resolution using very deep convolutional network (VDSR), deep Laplacian pyramid networks for fast and accurate super-resolution (LapSRN), and enhanced deep residual networks for single image super-resolution, baseline version (EDSR_baseline) are used for comparison on open datasets. Training is performed on the DIV2K (DIVerse 2K) dataset, with 800 images used for training and 100 for validation. Tests are then performed on the Set5, Set14, BSD100, and Urban100 datasets, comprising 219 test images in total. Three reconstruction models, for ×2, ×3, and ×4 upscaling, are built to facilitate comparison with common algorithms, and both the experimental data and the reconstructed images are analyzed in detail. Compared with a conventional serial network, the recursive network improves efficiency and reduces computation time. On the detail-rich Urban100 dataset in particular, the experiments show that average PSNR increases by 3.87 dB, 1.93 dB, 1.00 dB, 1.12 dB, and 0.48 dB over Bicubic, SRCNN, VDSR, LapSRN, and EDSR_baseline, respectively, and the visual results are clearer than those of the previous algorithms. Compared with its non-recursive counterpart, network training efficiency is improved by 30%. Conclusion The proposed network overcomes the shortcomings of single-channel deep networks and accelerates convergence and information fusion through recursive residual blocks and the cross-learning mechanism. The recursive residual structure also helps alleviate problems such as vanishing gradients during training. Experimental results show that, compared with existing reconstruction methods, the proposed method obtains higher PSNR and SSIM and achieves substantial improvements on images rich in detail. It therefore offers shorter training time, lower information redundancy, and better reconstruction quality. In future work, we will continue to optimize the scale of the recursive network and the cross-learning mechanism.
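To show how the pieces described in the abstract fit together, the sketch below lays out one possible multi-channel arrangement in PyTorch: three channels extract features with different kernel sizes, every pair of channels (1-2, 1-3, 2-3) is fused by a cross connection, and a shared head reconstructs the high-resolution image. The kernel sizes 3/5/7, the 1×1 fusion convolutions, the PixelShuffle upsampler, the recursion count, and the class names Branch and MultiChannelSR are all assumptions for illustration, not the paper's actual configuration.

```python
import torch
import torch.nn as nn


class Branch(nn.Module):
    """One channel: kernel-specific feature extraction followed by a
    weight-shared (recursive) residual body."""

    def __init__(self, channels=64, kernel_size=3, recursions=8):
        super().__init__()
        pad = kernel_size // 2
        self.extract = nn.Conv2d(3, channels, kernel_size, padding=pad)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.recursions = recursions

    def forward(self, x):
        out = self.extract(x)
        for _ in range(self.recursions):      # same weights reused every pass
            out = out + self.body(out)
        return out


class MultiChannelSR(nn.Module):
    def __init__(self, channels=64, scale=2):
        super().__init__()
        # Three channels with different (assumed) kernel sizes.
        self.branches = nn.ModuleList([Branch(channels, k) for k in (3, 5, 7)])
        # Cross-learning: fuse each channel pair (1-2, 1-3, 2-3) with a 1x1 conv.
        self.cross = nn.ModuleList(
            [nn.Conv2d(2 * channels, channels, 1) for _ in range(3)]
        )
        # Shared reconstruction network: fuse all channels, then upsample.
        self.reconstruct = nn.Sequential(
            nn.Conv2d(3 * channels, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        f1, f2, f3 = [branch(x) for branch in self.branches]
        pairs = [(f1, f2), (f1, f3), (f2, f3)]
        fused = [conv(torch.cat(p, dim=1)) for conv, p in zip(self.cross, pairs)]
        return self.reconstruct(torch.cat(fused, dim=1))


if __name__ == "__main__":
    # Quick shape check: a x2 model maps a 48x48 patch to 96x96.
    y = MultiChannelSR(scale=2)(torch.randn(1, 3, 48, 48))
    print(y.shape)  # torch.Size([1, 3, 96, 96])
```

The shape check above upscales by ×2; the ×3 and ×4 models mentioned in the abstract would only change the scale argument of the shared reconstruction head, while the pairwise fusion ahead of that head mirrors the description of all channels feeding one common reconstruction network.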
Keywords
