Lightweight attention feature selection recursive network for super-resolution
2021, Vol. 26, No. 12, pp. 2826-2835
Received: 2020-09-21
Revised: 2021-01-06
Accepted: 2021-01-13
Published in print: 2021-12-16
DOI: 10.11834/jig.200555

Objective
Deep convolutional networks deliver excellent performance in image super-resolution, and an increasing number of methods favor deeper and wider designs. However, such complex architectures demand ever more computing resources. With the popularity of intelligent edge devices such as smartphones, efficient super-resolution algorithms have enormous practical application value. This paper therefore proposes an extremely lightweight and efficient super-resolution network that, through a recursive feature selection unit and a parameter sharing mechanism, greatly reduces both the parameter count and the number of floating point operations (FLOPs) while achieving excellent reconstruction performance.
Method
The network consists of three parts: shallow feature extraction, deep feature extraction, and upsampling reconstruction. The shallow feature extraction module contains one convolutional layer; its output features pass recursively through a feature selection unit equipped with an efficient channel attention block, which performs the nonlinear mapping that extracts deep features. The feature selection unit contains a feature enhancement block of several convolutional layers that retains part of the features at each layer and fuses them at the end of the block to enhance hierarchical information; the efficient channel attention block then recalibrates the features of each channel. The recursive mechanism (six recursions) effectively improves performance while sharply reducing the parameter count. Finally, a parameter-shared upsampling module magnifies and fuses the shallow and deep features to produce the high-resolution image.
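The sub-pixel layer inside the shared upsampling module is the standard pixel-shuffle rearrangement. As an illustration only (the paper publishes no code, and the channel layout below is the conventional one, not confirmed by the source), it can be sketched in plain NumPy:

```python
import numpy as np

def pixel_shuffle(x, scale):
    """Sub-pixel layer: rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r)."""
    c_r2, h, w = x.shape
    c = c_r2 // (scale * scale)
    x = x.reshape(c, scale, scale, h, w)   # split the channel dim into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)         # interleave: (C, H, r, W, r)
    return x.reshape(c, h * scale, w * scale)

# 4 channels at 2x2 become 1 channel at 4x4 for a x2 upscale
feat = np.arange(16, dtype=float).reshape(4, 2, 2)
out = pixel_shuffle(feat, 2)
print(out.shape)  # (1, 4, 4)
```

Because the convolution before this layer only has to produce 3·r² channels, the same (parameter-shared) block can magnify both the shallow and the deep feature maps before they are fused.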
Result
Compared with state-of-the-art lightweight networks, the proposed network greatly reduces the number of parameters and FLOPs. In quantitative evaluation on the Set5, Set14, B100, Urban100, and Manga109 benchmark datasets, it also achieves better results on the image quality metrics peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
Conclusion
The recursive feature selection unit effectively mines the high-frequency information of the image, and the parameter sharing mechanism greatly reduces the parameter count, achieving lightweight, high-quality super-resolution reconstruction.
Objective
Deep convolutional neural networks have shown strong reconstruction ability in the image super-resolution (SR) task. With the popularity of intelligent edge devices such as mobile phones, efficient super-resolution has great practical application value. This paper proposes a very lightweight and efficient super-resolution network. Based on a recursive feature selection module and a parameter sharing mechanism, the proposed method greatly reduces the number of parameters and floating point operations (FLOPs) while achieving excellent reconstruction performance.
Method
The proposed lightweight attention feature selection recursive network (AFSNet) involves three key components: low-level feature extraction, high-level feature extraction, and upsample reconstruction. In the low-level feature extraction part, the input low-resolution image passes through a 3×3 convolutional layer to extract low-level features. In the high-level feature extraction part, a recursive feature selection module (FSM) is designed to capture high-level features. At the end of the network, a shared upsample block super-resolves the low-level and high-level features to obtain the final high-resolution image. Specifically, the FSM contains a feature enhancement block and an efficient channel attention block. The feature enhancement block has four convolutional layers. Unlike other cascades of convolutional layers, this block retains part of the features at each convolutional layer and fuses them at the end of the module. Features extracted by different convolutional layers carry different levels of hierarchical information, so the network preserves part of them step by step and aggregates them at the end of the module. An efficient channel attention (ECA) block follows the feature enhancement block. Unlike the channel attention (CA) in the residual channel attention network (RCAN), which uses two 1×1 convolutional layers for nonlinear mapping and cross-channel interaction, the ECA block avoids the dimensionality reduction operation: it implements a local cross-channel interaction strategy via a one-dimensional (1D) convolution. Furthermore, the ECA block adaptively selects the kernel size of the 1D convolution, which determines the coverage of the local cross-channel interaction, and improves reconstruction performance without increasing the number of parameters. The network also employs a recursive mechanism to share parameters across the feature enhancement block, reducing the number of parameters dramatically. At the end of the high-level feature extraction part, the network concatenates and fuses the outputs of all FSM recursions; this multi-stage feature fusion (MSFF) mechanism captures valuable contextual information. In the upsample reconstruction part, a shared upsample block, consisting of a convolutional layer and a sub-pixel layer, reconstructs the low-level and high-level features into a high-resolution image that fuses low- and high-frequency information together without increasing the number of parameters.
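The two mechanisms above — step-wise channel retention in the feature enhancement block and ECA's reduction-free cross-channel attention — can be sketched in NumPy. This is a minimal illustration of the described data flow, not the paper's implementation: the stand-in `conv` (a random channel mix in place of a learned 3×3 convolution), the fixed kernel size `k=3` (the paper chooses it adaptively), and the exact split arithmetic are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv(x, out_ch=64):
    # Stand-in for a learned 3x3 convolution: a random per-pixel channel mix.
    w = rng.standard_normal((out_ch, x.shape[0])) * 0.1
    return np.einsum('oc,chw->ohw', w, x)

def eca(x, k=3):
    """Efficient channel attention: a 1D convolution over the pooled channel
    descriptor gives local cross-channel interaction with no dimensionality
    reduction (cf. the two 1x1 reduction layers of CA in RCAN)."""
    desc = x.mean(axis=(1, 2))                          # global average pooling -> (C,)
    w = np.convolve(desc, np.ones(k) / k, mode='same')  # k neighbors interact
    w = 1.0 / (1.0 + np.exp(-np.clip(w, -30.0, 30.0)))  # sigmoid gate
    return x * w[:, None, None]                         # recalibrate each channel

def feature_selection_module(x, keep=16):
    """Feature enhancement: each of the first three conv layers retains `keep`
    channels and forwards the rest; the retained slices are fused
    (concatenated) at the end, then recalibrated by ECA."""
    retained = []
    for _ in range(3):
        x = conv(x, out_ch=64)
        retained.append(x[:keep])      # preserve part of the features
        x = x[keep:]                   # the remaining 48 channels continue
    retained.append(conv(x, out_ch=keep))
    return eca(np.concatenate(retained, axis=0))

feats = rng.standard_normal((64, 12, 12))
for _ in range(6):                     # six recursions (shared parameters in the paper)
    feats = feature_selection_module(feats)
print(feats.shape)  # (64, 12, 12)
```

The 16 + 16 + 16 + 16 = 64 retained channels keep the module's output width constant, which is what allows the same parameters to be reused across all six recursions.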
Result
The DF2K dataset, which combines 800 images from the DIV2K dataset and 2 650 images from the Flickr2K dataset, is adopted for training. Data augmentation is performed by random horizontal flipping and 90-degree rotation. The corresponding low-resolution images are obtained by bicubic downsampling of the high-resolution images (downscaling factors ×2, ×3, and ×4). Evaluation uses five benchmark datasets: Set5, Set14, B100, Urban100, and Manga109. Peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) serve as the evaluation metrics for reconstruction performance. Following the evaluation protocol of the residual dense network (RDN), borders are cropped and the metrics are calculated on the luminance channel of the transformed YCbCr space. During training, 16 low-resolution patches of size 48×48 and their corresponding high-resolution patches are randomly cropped. In the high-level feature extraction stage, six recursive feature selection modules are used, and the number of channels in each convolutional layer of the FSM is set to C = 64. In each channel split operation, the features of 16 channels are preserved and the remaining 48 channels continue to the next convolution. The network parameters are optimized with the Adam optimizer under the L1 loss function. The initial learning rate is set to 2E-4 and halved every 200 epochs. The network is implemented in the PyTorch framework with an NVIDIA 2080 Ti GPU for acceleration. AFSNet is compared with several state-of-the-art lightweight convolutional neural network (CNN)-based SISR methods. It achieves the best PSNR and SSIM among all compared methods on almost all benchmark datasets, the only exception being the ×2 results on Set5, while requiring far fewer parameters and far smaller FLOPs. For ×4 SR on the Set14 test dataset, the PSNR results increase by 0.4 dB, 0.6 dB, and 0.43 dB compared with SRFBN-S, IDN, and CARN-M, respectively, while the parameter numbers of AFSNet decrease by 47%, 53%, and 38%. Meanwhile, the 24.5 G FLOPs of AFSNet compare favorably with the roughly 30 G FLOPs typical of these methods. In addition, an ablation study on the ×4 Set5 test dataset examines the effectiveness of the ECA module and the MSFF mechanism: dropping the ECA module and the MSFF mechanism decreases the PSNR by 0.09 dB and 0.11 dB, respectively, which shows the effectiveness of both components.
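The PSNR protocol described above (luminance channel of YCbCr, borders cropped) can be reproduced in a few lines. The BT.601 luminance coefficients below are the conventional ones used in SR evaluation, and the default border width of 4 pixels (matching the ×4 scale) is an assumption for illustration:

```python
import numpy as np

def rgb_to_y(img):
    """8-bit RGB -> BT.601 luminance, as commonly used for SR evaluation."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0

def psnr(hr_y, sr_y, border=4):
    """PSNR on the luminance channel with `border` pixels cropped on each side."""
    hr_y = hr_y[border:-border, border:-border]
    sr_y = sr_y[border:-border, border:-border]
    mse = np.mean((hr_y - sr_y) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

hr = np.full((32, 32), 120.0)
sr = hr + 1.0                    # every pixel off by one grey level
print(round(psnr(hr, sr), 2))    # 48.13
```

Because PSNR is a log of the inverse MSE, the sub-0.5 dB gaps reported against SRFBN-S, IDN, and CARN-M correspond to small but consistent pixel-wise error reductions.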
Conclusion
This paper presents a lightweight attention feature selection recursive network for super-resolution that improves reconstruction performance without large numbers of parameters or FLOPs. The network employs a 3×3 convolutional layer in the low-level feature extraction part to extract low-level features from the low-resolution (LR) image; six recursive feature selection modules then learn the nonlinear mapping and exploit high-level features. The FSM preserves hierarchical features step by step and aggregates them according to the importance of candidate features as evaluated by the proposed efficient channel attention module. Meanwhile, multi-stage feature fusion, which concatenates the outputs of all FSM recursions, effectively captures contextual information from different stages. Finally, the extracted low-level and high-level features are upsampled by a parameter-shared upsample block.
Ahn N, Kang B and Sohn K A. 2018. Fast, accurate, and lightweight super-resolution with cascading residual network//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 256-272 [DOI: 10.1007/978-3-030-01249-6_16]
Bevilacqua M, Roumy A, Guillemot C and Morel M L A. 2012. Low-complexity single-image super-resolution based on nonnegative neighbor embedding//Proceedings of 2012 British Machine Vision Conference. Guildford, UK: BMVC: 135 [DOI: 10.5244/C.26.135]
Dong C, Loy C C, He K M and Tang X O. 2014. Learning a deep convolutional network for image super-resolution//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 184-199 [DOI: 10.1007/978-3-319-10593-2_13]
Dong C, Loy C C, He K M and Tang X O. 2016. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2): 295-307 [DOI: 10.1109/TPAMI.2015.2439281]
Gu J J, Lu H N, Zuo W M and Dong C. 2019. Blind super-resolution with iterative kernel correction//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 1604-1613 [DOI: 10.1109/CVPR.2019.00170]
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141 [DOI: 10.1109/CVPR.2018.00745]
Huang J B, Singh A and Ahuja N. 2015. Single image super-resolution from transformed self-exemplars//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE: 5197-5206 [DOI: 10.1109/CVPR.2015.7299156]
Hui Z, Gao X B, Yang Y C and Wang X M. 2019. Lightweight image super-resolution with information multi-distillation network//Proceedings of the 27th ACM International Conference on Multimedia. Nice, France: ACM: 2024-2032 [DOI: 10.1145/3343031.3351084]
Hui Z, Wang X M and Gao X B. 2018. Fast and accurate single image super-resolution via information distillation network//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 723-731 [DOI: 10.1109/CVPR.2018.00082]
Kim J, Lee J K and Lee K M. 2016a. Accurate image super-resolution using very deep convolutional networks//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 1646-1654 [DOI: 10.1109/CVPR.2016.182]
Kim J, Lee J K and Lee K M. 2016b. Deeply-recursive convolutional network for image super-resolution//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 1637-1645 [DOI: 10.1109/CVPR.2016.181]
Kingma D P and Ba J L. 2014. Adam: a method for stochastic optimization [EB/OL]. [2020-08-20]. https://arxiv.org/pdf/1412.6980v8.pdf
Lai W S, Huang J B, Ahuja N and Yang M H. 2017. Deep Laplacian pyramid networks for fast and accurate super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 5835-5843 [DOI: 10.1109/CVPR.2017.618]
Li J C, Fang F M, Mei K F and Zhang G X. 2018. Multi-scale residual network for image super-resolution//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 527-542 [DOI: 10.1007/978-3-030-01237-3_32]
Li X G, Sun Y M, Yang Y L and Miao C Y. 2018. Image super-resolution reconstruction based on intermediate supervision convolutional neural networks. Journal of Image and Graphics, 23(7): 984-993 [DOI: 10.11834/jig.170538]
Li Y X, Deng H P, Xiang S, Wu J and Zhu L. 2018. Depth map super-resolution reconstruction based on the texture edge-guided approach. Journal of Image and Graphics, 23(10): 1508-1517 [DOI: 10.11834/jig.180127]
Li Z, Yang J L, Liu Z, Yang X M, Jeon G and Wu W. 2019. Feedback network for image super-resolution//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 3862-3871 [DOI: 10.1109/CVPR.2019.00399]
Lim B, Son S, Kim H, Nah S and Lee K M. 2017. Enhanced deep residual networks for single image super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, USA: IEEE: 1132-1140 [DOI: 10.1109/CVPRW.2017.151]
Martin D, Fowlkes C, Tal D and Malik J. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics//Proceedings of the 8th IEEE International Conference on Computer Vision. Vancouver, Canada: IEEE: 416-423 [DOI: 10.1109/ICCV.2001.937655]
Matsui Y, Ito K, Aramaki Y, Fujimoto A, Ogawa T, Yamasaki T and Aizawa K. 2017. Sketch-based manga retrieval using Manga109 dataset. Multimedia Tools and Applications, 76(20): 21811-21838 [DOI: 10.1007/s11042-016-4020-z]
Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, Lin Z M, Desmaison A, Antiga L and Lerer A. 2017. Automatic differentiation in PyTorch [EB/OL]. [2020-08-21]. https://openreview.net/pdf?id=BJJsrmfCZ
Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2020-08-20]. https://arxiv.org/pdf/1409.1556.pdf
Shen M Y, Yu P F, Wang R G, Yang J and Xue L X. 2019. Image super-resolution reconstruction via deep network based on multi-staged fusion. Journal of Image and Graphics, 24(8): 1258-1269 [DOI: 10.11834/jig.180619]
Shocher A, Cohen N and Irani M. 2018. Zero-shot super-resolution using deep internal learning//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3118-3126 [DOI: 10.1109/CVPR.2018.00329]
Soh J W, Cho S and Cho N I. 2020. Meta-transfer learning for zero-shot super-resolution [EB/OL]. [2020-02-27]. https://arxiv.org/pdf/2002.12213.pdf
Tai Y, Yang J and Liu X M. 2017a. Image super-resolution via deep recursive residual network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 2790-2798 [DOI: 10.1109/CVPR.2017.298]
Tai Y, Yang J, Liu X M and Xu C Y. 2017b. MemNet: a persistent memory network for image restoration//Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE: 4549-4557 [DOI: 10.1109/ICCV.2017.486]
Timofte R, Agustsson E, van Gool L, Yang M H, Zhang L, Lim B, Son S, Kim H, Nah S, Lee K M, Wang X T, Tian Y P, Yu K, Zhang Y L, Wu S X, Dong C, Lin L, Qiao Y, Loy C C, Bae W, Yoo J, Han Y, Ye J C, Choi J S, Kim M, Fan Y C, Yu J H, Han W, Liu D, Yu H C, Wang Z Y, Shi H H, Wang X C, Huang T S, Chen Y J, Zhang K, Zuo W M, Tang Z M, Luo L K, Li S H, Fu M, Cao L, Heng W, Bui G, Le T, Duan Y, Tao D C, Wang R X, Lin X, Pang J X, Xu J C, Zhao Y, Xu X Y, Pan J S, Sun D Q, Zhang Y J, Song X B, Dai Y C, Qin X Y, Huynh X P, Guo T T, Mousavi H S, Vu T H, Monga V, Cruz C, Egiazarian K, Katkovnik V, Mehta R, Jain A K, Agarwalla A, Praveen C V S, Zhou R F, Wen H D, Zhu C, Xia Z Q, Wang Z T and Guo Q. 2017. NTIRE 2017 challenge on single image super-resolution: methods and results//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, USA: IEEE: 1110-1121 [DOI: 10.1109/CVPRW.2017.149]
Timofte R, Rothe R and van Gool L. 2016. Seven ways to improve example-based single image super resolution//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 1865-1873 [DOI: 10.1109/CVPR.2016.206]
Tong J C, Fei J L, Chen J S, Li H and Ding D D. 2019. Multi-level feature fusion image super-resolution algorithm with recursive neural network. Journal of Image and Graphics, 24(2): 302-312 [DOI: 10.11834/jig.180410]
Wang C F, Li Z and Shi J. 2019. Lightweight image super-resolution with adaptive weighted learning network [EB/OL]. [2020-08-20]. https://arxiv.org/pdf/1904.02358.pdf
Wang Q L, Wu B G, Zhu P F, Li P H, Zuo W M and Hu Q H. 2020. ECA-Net: efficient channel attention for deep convolutional neural networks [EB/OL]. [2020-08-20]. https://arxiv.org/pdf/1910.03151.pdf
Wang Z, Bovik A C, Sheikh H R and Simoncelli E P. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612 [DOI: 10.1109/TIP.2003.819861]
Yang J C, Wright J, Huang T S and Ma Y. 2010. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11): 2861-2873 [DOI: 10.1109/TIP.2010.2050625]
Zhang K, van Gool L and Timofte R. 2020. Deep unfolding network for image super-resolution [EB/OL]. [2020-08-20]. https://arxiv.org/pdf/2003.10428.pdf
Zhang Y L, Li K P, Li K, Wang L C, Zhong B N and Fu Y. 2018a. Image super-resolution using very deep residual channel attention networks//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 294-310 [DOI: 10.1007/978-3-030-01234-2_18]
Zhang Y L, Tian Y P, Kong Y, Zhong B N and Fu Y. 2018b. Residual dense network for image super-resolution//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2472-2481 [DOI: 10.1109/CVPR.2018.00262]