Cross-scale coupling network for continuous-scale image super-resolution
2022, Vol. 27, No. 5, Pages: 1604-1615
Print publication date: 2022-05-16
Accepted: 2022-01-18
DOI: 10.11834/jig.210815
Hanlin Wu, Wanyu Li, Libao Zhang. Cross-scale coupling network for continuous-scale image super-resolution[J]. Journal of Image and Graphics, 2022,27(5):1604-1615.
Objective
Although deep learning techniques have substantially improved the performance of image super-resolution, most existing methods only consider specific integer scale factors and cannot flexibly achieve super-resolution at continuous scale factors. Existing methods usually train one model per scale factor, which costs long training time and excessive model storage space. To address these problems, this paper proposes a continuous-scale super-resolution method based on a cross-scale coupling network.
Method
First, we propose a cross-scale coupled upsampling module that replaces the traditional upsampling layer and achieves upsampling at continuous scale factors. Second, we propose a cross-scale convolutional layer that extracts features at multiple scales in parallel; by dynamically activating and aggregating features of different scales, it exploits cross-scale contextual information and effectively improves the performance of continuous-scale super-resolution.
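The cross-scale convolution idea above (parallel feature extraction at several scales with dynamic activation and aggregation) can be sketched in PyTorch. This is an illustrative reading, not the authors' layer; the class name `CrossScaleConvSketch`, the pyramid depths, and the pooled-feature gating are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleConvSketch(nn.Module):
    """Parallel multi-scale feature extraction with dynamic fusion.

    Illustrative sketch of the cross-scale convolution idea; the gating
    design and scale pyramid are assumptions, not the paper's exact layer.
    """
    def __init__(self, channels: int = 64, num_scales: int = 3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_scales)
        )
        # Predict one gate per scale from globally pooled features.
        self.gate = nn.Linear(channels, num_scales)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = []
        for i, conv in enumerate(self.convs):
            # Convolve on a 1/2^i downsampled copy, then upsample back.
            y = x if i == 0 else F.interpolate(
                x, scale_factor=1 / 2 ** i, mode="bilinear", align_corners=False)
            y = conv(y)
            if i > 0:
                y = F.interpolate(y, size=(h, w), mode="bilinear",
                                  align_corners=False)
            feats.append(y)
        # Dynamically weight each scale's features and aggregate them.
        g = torch.softmax(self.gate(x.mean(dim=(-2, -1))), dim=-1)  # (B, S)
        out = sum(g[:, i, None, None, None] * f for i, f in enumerate(feats))
        return x + out  # residual connection keeps the layer drop-in

cs = CrossScaleConvSketch()
x = torch.randn(2, 64, 32, 32)
print(cs(x).shape)  # torch.Size([2, 64, 32, 32])
```

The softmax gates let the layer emphasize different scales per input, which is one plausible way to realize "dynamically activating and aggregating features of different scales".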
Result
We compare with state-of-the-art super-resolution methods on three datasets. On the continuous-scale task, compared with the second-best method Meta-SR (meta super-resolution), the peak signal-to-noise ratio (PSNR) improves by up to 0.13 dB while the number of parameters is reduced by 73%. On the integer-scale task, compared with the lightweight SRFBN (super-resolution feedback network), which has a similar number of parameters, the PSNR improves by up to 0.24 dB. Meanwhile, the proposed algorithm produces visually more realistic results with clearer textures. Ablation experiments verify the effectiveness of each module of the proposed algorithm.
Conclusion
The proposed continuous-scale super-resolution model needs to be trained only once to obtain excellent super-resolution results at arbitrary scale factors. In addition, the cross-scale coupled upsampling module can replace the commonly used sub-pixel or deconvolutional layer, achieving continuous-scale upsampling while maintaining model performance.
Objective
Single image super-resolution (SISR) aims to restore a high-resolution (HR) image by adding high-frequency details to its corresponding low-resolution (LR) input. Application scenarios such as medical imaging and remote sensing need to super-resolve LR images to multiple scales to meet different accuracy requirements. Moreover, these scales should not be restricted to integers but may be arbitrary positive numbers. However, training a separate super-resolution (SR) model for every scale incurs high computational and storage costs. Hence, it is of great significance to construct a single SR model that can process arbitrary scale factors. Deep learning technology has greatly improved the performance of SISR,
but most methods are designed for specific integer scale factors. Early pre-upsampling methods such as the super-resolution convolutional neural network (SRCNN) can achieve continuous-scale upsampling but have low computational efficiency. Post-upsampling methods use a deconvolutional or sub-pixel layer at the final step of the network to perform upsampling. However, the structures of the sub-pixel and deconvolutional layers are tied to the scale factor, so the SR network can be trained for only a single scale factor at a time. The multi-scale deep super-resolution network (MDSR) uses multiple upsampling branches to process different scale factors, but it can only super-resolve the integer scales it was trained on. Meta super-resolution (Meta-SR) is the first scale-arbitrary SR network; it builds a meta-upsampling module that uses a fully connected network to dynamically predict the weights of the feature mapping between the LR space and the HR space. However, the Meta-SR upsampling module has high computational complexity, and the number of parameters in its feature-extraction part is extremely large.
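The scale dependence of the sub-pixel layer mentioned above can be seen concretely: for an upscaling factor r, the preceding convolution must output r²·C channels, so the trained weights fit only that one r. A minimal PyTorch illustration (not from the paper; `subpixel_upsampler` is a hypothetical helper):

```python
import torch
import torch.nn as nn

# Sub-pixel upsampling for a fixed integer scale r: the conv must emit
# r*r*C channels, which PixelShuffle rearranges into an r-times-larger map.
def subpixel_upsampler(channels: int, r: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(channels, channels * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),  # (B, C*r^2, H, W) -> (B, C, H*r, W*r)
    )

x = torch.randn(1, 64, 32, 32)   # LR feature map
up2 = subpixel_upsampler(64, 2)
up3 = subpixel_upsampler(64, 3)
print(up2(x).shape)  # torch.Size([1, 64, 64, 64])
print(up3(x).shape)  # torch.Size([1, 64, 96, 96])
# The conv weights differ in shape (256 vs. 576 output channels), so a
# model trained for r=2 cannot be reused for r=3, let alone a
# non-integer scale such as 2.5.
```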
Method
We propose a cross-scale coupling network (CSCN) for continuous-scale image SR. First, we devise a fully convolutional cross-scale coupled upsampling (CSC-Up) module to achieve efficient, end-to-end upsampling at arbitrary decimal scales. Our strategy is to construct a continuous-scale upsampling module by coupling features of multiple scales. The CSC-Up module first maps LR features to multiple HR spaces through multiple upsampling branches with different scales. Then, the features of the multiple HR spaces are adaptively fused to obtain the SR image of the target scale. The CSC-Up module can be easily plugged into existing SR networks: we only need to replace the original upsampling module with our CSC-Up module to obtain a continuous-scale SR network. Second, since multi-scale feature extraction is beneficial to SR tasks, we design a novel cross-scale convolutional (CS-Conv) layer, which can adaptively extract and couple features from multiple scales and exploit cross-scale contextual information. In addition, we adopt a feedback mechanism in the cross-scale feature learning part, using high-level features to refine low-level ones. Such a recurrent structure increases the capacity of the network without increasing the number of parameters. We train our model on the 800 images of the DIVerse 2K resolution (DIV2K) dataset. In the training step, we randomly crop LR patches of size 32×32 pixels as inputs. The input LR patches are generated with the bicubic downsampling model and randomly rotated or flipped for data augmentation. Our model is trained for 400 epochs with a mini-batch size of 16,
and each epoch contains 1 000 iterations. The initial learning rate is 1×10⁻³ and is halved at the 1.5×10⁵, 2.5×10⁵, and 3.5×10⁵ iterations. Our method is implemented in the PyTorch framework and trained on one Tesla V100 GPU.
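The CSC-Up design described above can be sketched as follows. This is our reading of the module, not the authors' code: the class name `CSCUpSketch`, the branch scales (2, 3, 4), the bicubic resampling to the target size, and the 1×1-conv fusion are all assumptions standing in for the paper's adaptive fusion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSCUpSketch(nn.Module):
    """Continuous-scale upsampling by coupling fixed-scale branches.

    Illustrative sketch: each branch maps LR features to a fixed HR space;
    all branch outputs are resampled to the continuous target size and
    fused. Branch scales and the fusion scheme are assumptions.
    """
    def __init__(self, channels: int = 64, branch_scales=(2, 3, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels * r * r, 3, padding=1),
                nn.PixelShuffle(r),  # fixed integer-scale upsampling
            )
            for r in branch_scales
        )
        # Fuse the concatenated branch outputs into a 3-channel SR image.
        self.fuse = nn.Conv2d(channels * len(branch_scales), 3, kernel_size=1)

    def forward(self, feat: torch.Tensor, scale: float) -> torch.Tensor:
        h, w = feat.shape[-2:]
        out_size = (round(h * scale), round(w * scale))
        # Map LR features to several fixed HR spaces, then resample each
        # branch to the continuous target size before fusing.
        outs = [
            F.interpolate(b(feat), size=out_size, mode="bicubic",
                          align_corners=False)
            for b in self.branches
        ]
        return self.fuse(torch.cat(outs, dim=1))

up = CSCUpSketch()
feat = torch.randn(1, 64, 24, 24)
sr = up(feat, scale=2.5)  # a non-integer scale factor
print(sr.shape)  # torch.Size([1, 3, 60, 60])
```

Because the module is fully convolutional and the target size only enters through `F.interpolate`, a single set of weights serves every positive scale, which is the property that lets such a module replace a scale-specific sub-pixel layer.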
Result
Our method is compared with two state-of-the-art (SotA) continuous-scale SR methods, Meta-EDSR (meta enhanced deep super-resolution) and Meta-RDN (meta residual dense network). Meanwhile, we define a new baseline, named Bi-RDN, by bicubically resampling the output of a residual dense network (RDN) to the target size. To verify the generality of our CSC-Up module, we replace the original upsampling layer of the RDN with a CSC-Up module and construct a continuous-scale RDN (CS-RDN). We also use the self-ensemble method to further improve the performance of CSCN, denoted CSCN+. The quantitative evaluation metrics are the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). Our CSCN obtains results comparable with Meta-RDN, and CSCN+ obtains the best results on all scale factors. CS-RDN also obtains satisfactory results, demonstrating that the proposed CSC-Up module can be well adapted to existing SR methods and obtain satisfactory non-integer SR results. We also compare our CSCN with six SotA methods on integer scale factors,
including the super-resolution convolutional neural network (SRCNN), the coarse-to-fine SRCNN (CFSRCNN), RDN, the super-resolution feedback network (SRFBN), the iterative super-resolution network (ISRN), and Meta-RDN. Comparing the results of RDN and CS-RDN, we find that our CSC-Up module achieves results comparable to or better than those of a single-scale upsampling module. Meanwhile, our proposed CSCN and CS-RDN need to be trained only once and release a single model. Our proposed CSCN uses a simpler and more efficient continuous-scale upsampling module and obtains results comparable with Meta-SR. CSCN+ achieves the best performance on all datasets and scales. Moreover, the number of parameters in our model is 6 M, which is only 27% of that of Meta-RDN (22 M). Benefiting from the feedback structure, our method achieves a good balance between the number of network parameters and model performance. Thus, our proposed CSCN and CSCN+ are superior to the compared SotA methods.
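PSNR, the metric quoted throughout, can be computed as below. This is the standard definition; the paper's exact evaluation protocol (e.g. Y-channel-only measurement or border cropping, both common in SR benchmarks) may differ:

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Perturb with small uniform noise in [-5, 5]; PSNR lands roughly near 38 dB.
noisy = np.clip(gt.astype(np.int16) + rng.integers(-5, 6, size=gt.shape), 0, 255)
print(round(psnr(gt, noisy), 2))
```

Under this definition, a 0.13 dB PSNR gain corresponds to about a 3% reduction in mean squared error, which is a meaningful margin at the scale of SR benchmarks.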
Conclusion
We propose a novel CSC-Up module that can be easily plugged into existing SR networks to enable continuous-scale SR. We also introduce a CS-Conv layer to learn scale-robust features and adopt feedback connections to design a lightweight CSCN. Compared with previous single-scale SR networks, the proposed CSCN saves training time and model storage space.
Keywords: deep learning; single image super-resolution (SISR); continuous-scale; cross-scale coupling; cross-scale convolution
Cai Z W, Fan Q F, Feris R S and Vasconcelos N. 2016. A unified multi-scale deep convolutional neural network for fast object detection//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 354-370 [DOI: 10.1007/978-3-319-46493-0_22]
Dai T, Cai J R, Zhang Y B, Xia S T and Zhang L. 2019. Second-order attention network for single image super-resolution//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11057-11066 [DOI: 10.1109/CVPR.2019.01132]
Dong C, Loy C C, He K M and Tang X O. 2016a. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2): 295-307 [DOI: 10.1109/TPAMI.2015.2439281]
Dong C, Loy C C and Tang X O. 2016b. Accelerating the super-resolution convolutional neural network//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 391-407 [DOI: 10.1007/978-3-319-46475-6_25]
Dong X Y, Wang L G, Sun X, Jia X P, Gao L R and Zhang B. 2021. Remote sensing image super-resolution using second-order multi-scale networks. IEEE Transactions on Geoscience and Remote Sensing, 59(4): 3473-3485 [DOI: 10.1109/TGRS.2020.3019660]
Fan Y C, Yu J H, Liu D and Huang T S. 2020. Scale-wise convolution for image restoration//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI: 10770-10777 [DOI: 10.1609/aaai.v34i07.6706]
Han W, Chang S Y, Liu D, Yu M, Witbrock M and Huang T S. 2018. Image super-resolution via dual-state recurrent networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1654-1663 [DOI: 10.1109/CVPR.2018.00178]
Haris M, Shakhnarovich G and Ukita N. 2018. Deep back-projection networks for super-resolution//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1664-1673 [DOI: 10.1109/CVPR.2018.00179]
He K M, Zhang X Y, Ren S Q and Sun J. 2015. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 1026-1034 [DOI: 10.1109/ICCV.2015.123]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hu X C, Mu H Y, Zhang X Y, Wang Z L, Tan T N and Sun J. 2019. Meta-SR: a magnification-arbitrary network for super-resolution//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 1575-1584 [DOI: 10.1109/CVPR.2019.00167]
Huang J B, Singh A and Ahuja N. 2015. Single image super-resolution from transformed self-exemplars//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 5197-5206 [DOI: 10.1109/CVPR.2015.7299156]
Huang Y W, Shao L and Frangi A F. 2017. Simultaneous super-resolution and cross-modality synthesis of 3D medical images using weakly-supervised joint convolutional sparse coding//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5787-5796 [DOI: 10.1109/CVPR.2017.613]
Kim J, Lee J K and Lee K M. 2016. Accurate image super-resolution using very deep convolutional networks//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1646-1654 [DOI: 10.1109/CVPR.2016.182]
Lai W S, Huang J B, Ahuja N and Yang M H. 2017. Deep Laplacian pyramid networks for fast and accurate super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5835-5843 [DOI: 10.1109/CVPR.2017.618]
Li J C, Fang F M, Mei K F and Zhang G X. 2018. Multi-scale residual network for image super-resolution//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 527-542 [DOI: 10.1007/978-3-030-01237-3_32]
Li Z, Yang J L, Liu Z, Yang X M, Jeon G and Wu W. 2019. Feedback network for image super-resolution//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3862-3871 [DOI: 10.1109/CVPR.2019.00399]
Lim B, Son S, Kim H, Nah S and Lee K M. 2017. Enhanced deep residual networks for single image super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu, USA: IEEE: 1132-1140 [DOI: 10.1109/CVPRW.2017.151]
Liu Y Q, Wang S Q, Zhang J, Wang S S, Ma S W and Gao W. 2021. Iterative network for image super-resolution. IEEE Transactions on Multimedia [J/OL]. [2021-08-25]. https://ieeexplore.ieee.org/document/9427200 [DOI: 10.1109/TMM.2021.3078615]
Mahapatra D, Bozorgtabar B and Garnavi R. 2019. Image super-resolution using progressive generative adversarial networks for medical image analysis. Computerized Medical Imaging and Graphics, 71: 30-39 [DOI: 10.1016/j.compmedimag.2018.10.005]
Martin D, Fowlkes C, Tal D and Malik J. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics//Proceedings of the 8th IEEE International Conference on Computer Vision. Vancouver, Canada: IEEE: 416-423 [DOI: 10.1109/ICCV.2001.937655]
Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Shi W Z, Caballero J, Huszár F, Totz J, Aitken A P, Bishop R, Rueckert D and Wang Z H. 2016. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1874-1883 [DOI: 10.1109/CVPR.2016.207]
Soh J W, Cho S and Cho N I. 2020. Meta-transfer learning for zero-shot super-resolution//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 3513-3522 [DOI: 10.1109/CVPR42600.2020.00357]
Sun C W and Chen X. 2021. Multiscale feature fusion back-projection network for image super-resolution. Acta Automatica Sinica, 47(7): 1689-1700 [DOI: 10.16383/j.aas.c200714]
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2015. Going deeper with convolutions//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1-9 [DOI: 10.1109/CVPR.2015.7298594]
Tian C W, Xu Y, Zuo W M, Zhang B, Fei L K and Lin C W. 2021. Coarse-to-fine CNN for image super-resolution. IEEE Transactions on Multimedia, 23: 1489-1502 [DOI: 10.1109/TMM.2020.2999182]
Tong T, Li G, Liu X J and Gao Q Q. 2017. Image super-resolution using dense skip connections//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 4809-4817 [DOI: 10.1109/ICCV.2017.514]
Wu H L, Zhang L B and Ma J. 2022. Remote sensing image super-resolution via saliency-guided feedback GANs. IEEE Transactions on Geoscience and Remote Sensing, 60: #5600316 [DOI: 10.1109/TGRS.2020.3042515]
Zamir A R, Wu T L, Sun L, Shen W B, Shi B E, Malik J and Savarese S. 2017. Feedback networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1808-1817 [DOI: 10.1109/CVPR.2017.196]
Zeiler M D and Fergus R. 2014. Visualizing and understanding convolutional networks//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 818-833 [DOI: 10.1007/978-3-319-10590-1_53]
Zeiler M D, Krishnan D, Taylor G W and Fergus R. 2010. Deconvolutional networks//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE: 2528-2535 [DOI: 10.1109/CVPR.2010.5539957]
Zeyde R, Elad M and Protter M. 2010. On single image scale-up using sparse-representations//Proceedings of the 7th International Conference on Curves and Surfaces. Avignon, France: Springer: 711-730 [DOI: 10.1007/978-3-642-27413-8_47]
Zhang Y L, Li K P, Li K, Wang L C, Zhong B N and Fu Y. 2018b. Image super-resolution using very deep residual channel attention networks//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 294-310 [DOI: 10.1007/978-3-030-01234-2_18]
Zhang Y L, Tian Y P, Kong Y, Zhong B N and Fu Y. 2018a. Residual dense network for image super-resolution//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2472-2481 [DOI: 10.1109/CVPR.2018.00262]
Zhou B, Li C H and Chen W. 2021. Region-level channel attention for single image super-resolution combining high frequency loss. Journal of Image and Graphics, 26(12): 2836-2847 [DOI: 10.11834/jig.200582]