Class-information generative adversarial network for single image super-resolution
2018, Vol. 23, No. 12, Pages: 1777-1788
Received: 2018-05-29; Revised: 2018-06-25; Published in print: 2018-12-16
DOI: 10.11834/jig.180331
Objective
The super-resolution model based on generative adversarial networks (SRGAN) takes a perceptual loss function as its optimization objective, effectively solving the problem that the traditional loss function based on mean squared error (MSE) leads to blurry reconstructed images. However, the perceptual loss function of SRGAN contains no indicator information that explicitly instructs the model to generate the corresponding features, so the model cannot accurately associate the specific dimensions of the data with their semantic features. Limited by this, the model represents the feature information of generated images insufficiently, leaving the features of the reconstruction results indistinct and making subsequent recognition and processing difficult. To address these problems, a super-resolution model based on a class-information generative adversarial network (class-info SRGAN) is proposed on the basis of the SRGAN method.
Method
A class classifier is added to the SRGAN model, and a class-loss item is incorporated into the generative network loss; back-propagation training is then used to update the network parameter weights, thereby providing feature class-information to the model and ultimately generating reconstructed images with recognizable features. The innovation and advantage lie in introducing feature class-information into the loss function, which improves the optimization objective of the super-resolution model and makes the feature representation of the reconstruction results more prominent.
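Expressed as a formula (a reconstruction from the description above, not an equation given in the abstract; the weighting coefficient $\lambda$ is an assumed hyperparameter), the modified generator objective can be written as

$$L_G = L_{\mathrm{perceptual}} + \lambda\, L_{\mathrm{class}}, \qquad L_{\mathrm{class}} = -\sum_{i} y_i \log C\big(G(x_{\mathrm{LR}})\big)_i$$

where $G$ is the generator, $C$ is the added class classifier, $x_{\mathrm{LR}}$ is the low-resolution input, and $y$ is the one-hot class label; both terms are minimized through back-propagation over the generator weights.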
Result
Tests on the CelebA dataset show that the gender recognition rate of images generated by class-info SRGAN with a gender classifier is higher overall (58% to 97%), and that the glasses frames in images generated by class-info SRGAN with a glasses classifier are clearer. Moreover, results on the Fashion-mnist and Cifar-10 datasets likewise show that the model achieves better reconstruction quality than SRGAN.
Conclusion
The experimental results verify the superiority and effectiveness of the proposed method in super-resolution reconstruction tasks. They also show that although class-info SRGAN is better suited to images with simple, concrete attribute features, it is on the whole a markedly effective super-resolution model.
Objective
Image super-resolution reconstruction technology uses a set of low-quality, low-resolution images (or motion sequences) to produce the corresponding high-quality, high-resolution ones. This technology has a wide range of applications in many fields, such as military, medicine, public safety, and computer vision. In the field of computer vision, image super-resolution reconstruction enables an image to advance from the detection level to the recognition level, and even to the identification level. In other words, image super-resolution reconstruction can enhance image recognition capability and identification accuracy. In addition, image super-resolution reconstruction supports dedicated analysis of a target: a comparatively high spatial resolution image of the region of interest is obtained instead of directly computing the configuration of a high spatial resolution image from large amounts of data. Conventional approaches to super-resolution reconstruction include example-based models, bi-cubic interpolation, and sparse coding methods, among others. Deep learning has been applied to many related subjects amid the recent resurgence of artificial intelligence, and substantial research achievements have been realized alongside the research on super-resolution reconstruction. Convolutional neural networks (CNNs) and generative adversarial networks (GANs) have produced numerous breakthroughs in image super-resolution reconstruction, including super-resolution with a convolutional neural network (SRCNN), super-resolution with very deep convolutional networks (VDSR), and super-resolution with a generative adversarial network (SRGAN). SRGAN in particular has achieved remarkable progress in single-image super-resolution by taking a perceptual loss function, instead of the traditional loss function based on the mean squared error (MSE), as its optimization goal; this loss effectively solves the common problems of MSE-based training, resolving the fuzziness of reconstruction results while still obtaining a relatively high peak signal-to-noise ratio (PSNR). However, even though super-resolution reconstruction can remarkably improve image quality, a remaining problem is how to comprehensively highlight the feature representation of reconstructed images and thereby improve the reconstruction quality of generated images. Super-resolution reconstruction is inherently an ill-posed problem: images lose a certain amount of information during down-sampling, so the reconstructed high-resolution image must recover parts or characteristics missing from the corresponding low-resolution image, which inevitably introduces generative deviation. In addition, because SRGAN adds no auxiliary marker information to its loss function (i.e., information that explicitly instructs the model to generate the corresponding features), the model may fail to accurately match the specific dimensions of the data with their semantic features. This lack of controllability constrains the model from sufficiently representing the feature information of generated images, limits the quality of reconstructed images, and poses difficulties for subsequent identification and processing. To solve these problems, a super-resolution model based on the class-information generative adversarial network (class-info SRGAN) is proposed on the basis of the SRGAN method. Class-info SRGAN uses additional information variables to restrict the solution space of super-resolution reconstruction and to help the model accurately fulfil the reconstruction task, particularly with respect to the semantic features of the data.
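For concreteness, the following is a minimal PyTorch-style sketch of the perceptual loss idea described above: an MSE computed in VGG feature space plus a weighted adversarial term. This is an illustrative reconstruction, not the authors' released code; the chosen VGG layer cut-off and the 1e-3 adversarial weight are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

# Illustrative sketch of an SRGAN-style perceptual loss. The VGG19 layer
# cut-off and the adversarial weight are assumptions, not paper values.
class PerceptualLoss(nn.Module):
    def __init__(self, feature_layer=35, adv_weight=1e-3):
        super().__init__()
        # Frozen VGG19 feature extractor supplies the content part of the loss.
        vgg = vgg19(weights="IMAGENET1K_V1").features[:feature_layer].eval()
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg
        self.mse = nn.MSELoss()
        self.bce = nn.BCEWithLogitsLoss()
        self.adv_weight = adv_weight

    def forward(self, sr, hr, disc_logits_on_sr):
        # Content loss: distance in feature space, penalizing perceptual
        # differences instead of producing the over-smoothed results of
        # plain pixel-wise MSE.
        content = self.mse(self.vgg(sr), self.vgg(hr))
        # Adversarial loss: reward reconstructions the discriminator
        # classifies as real.
        adversarial = self.bce(disc_logits_on_sr,
                               torch.ones_like(disc_logits_on_sr))
        return content + self.adv_weight * adversarial
```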
Method
A class classifier is added to the original SRGAN model, and the class-loss item is integrated into the generative network loss. Then, back-propagation is employed during training to update the parameter weights of the network and provide feature class-information for the model. Finally, reconstructed images possessing the corresponding features are produced. In contrast to the original objective function, the proposed model is innovative in that it introduces feature class-information and improves the optimization objective of the super-resolution model; it thereby optimizes the network training process and renders the feature representation of the reconstruction results more prominent.
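As a concrete illustration of this procedure, here is a minimal sketch of a single generator update in a PyTorch setting; `generator`, `discriminator`, `classifier`, and the loss weight `lambda_cls` are assumed names and values, not the paper's exact implementation.

```python
import torch.nn.functional as F

# Hypothetical single training step for the class-info SRGAN generator:
# the class-loss item is added to the usual SRGAN generator loss, and
# back-propagation updates the generator weights.
def generator_step(generator, discriminator, classifier, perceptual_loss,
                   optimizer, lr_imgs, hr_imgs, labels, lambda_cls=1.0):
    sr_imgs = generator(lr_imgs)
    # Standard SRGAN part: perceptual (content + adversarial) loss.
    g_loss = perceptual_loss(sr_imgs, hr_imgs, discriminator(sr_imgs))
    # Class-loss item: the classifier must recognize the intended attribute
    # (e.g., gender or glasses) in the reconstruction, supplying explicit
    # feature class-information to the model.
    cls_loss = F.cross_entropy(classifier(sr_imgs), labels)
    total = g_loss + lambda_cls * cls_loss
    optimizer.zero_grad()
    total.backward()      # back-propagation through the generator
    optimizer.step()
    return total.item()
```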
Result
According to the CelebA experiments, the class-loss item enables the SRGAN model to make minor changes that improve the output. In a comparison of the SRGAN model with the model carrying gender-class information, some differences were inconclusive, i.e., it is hard to conclude whether the model has a significant effect even though improvements were achieved to some extent. Nevertheless, the overall gender recognition rate of the images generated by the class-info SRGAN model ranges from 58% to 97%, which is higher than that of the images from SRGAN (8% to 98%). With glasses-class information, the capability of the model to form better-shaped glasses increased. The results on the Fashion-mnist and Cifar-10 datasets also show that the model has a significant effect, even though the final results on Cifar-10 were not as prominent as those of the previous experiments. In summary, the outcomes show that the reconstruction quality of the images generated by the class-info SRGAN model is better than that of the original SRGAN model.
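To illustrate how a recognition rate such as the one reported above can be measured, the following hedged sketch applies a pretrained attribute classifier to the reconstructed images and takes its accuracy against the ground-truth attribute labels as the recognition rate; all names here are assumptions rather than the paper's evaluation code.

```python
import torch

@torch.no_grad()
def recognition_rate(generator, attribute_classifier, loader, device="cuda"):
    """Fraction of reconstructed images whose attribute (e.g., gender) is
    correctly recognized by a pretrained classifier."""
    correct, total = 0, 0
    for lr_imgs, labels in loader:   # assumed (low-res image, label) batches
        sr_imgs = generator(lr_imgs.to(device))
        preds = attribute_classifier(sr_imgs).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total
```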
Conclusion
Class-information operates well in cases where the attributes are clear and the model has learned them as thoroughly as possible. The experimental results verify the superiority and effectiveness of the proposed model in the super-resolution reconstruction task. For concrete and simple feature attributes, class-info SRGAN is likely to become a promising super-resolution model. However, to advance its application, several goals must be pursued: how to develop a general class-info SRGAN that can be used for various super-resolution reconstruction tasks, how to apply class-info SRGAN to multiple attributes simultaneously, and how to integrate auxiliary class-information into the architecture of class-info SRGAN efficiently and conveniently. These directions can provide references and conditions for achieving better-performing super-resolution reconstruction in the future.