Class-information generative adversarial network for single image super-resolution
2018, Vol. 23, No. 12, Pages: 1777-1788
Received: 2018-05-29; Revised: 2018-06-25; Published in print: 2018-12-16
DOI: 10.11834/jig.180331
Objective
The super-resolution model based on generative adversarial networks (SRGAN) takes a perceptual loss function as its optimization objective, effectively solving the problem that the traditional loss function based on mean squared error (MSE) leads to blurry reconstructed images. However, the perceptual loss function of SRGAN contains no indicator information that explicitly instructs the model to generate the corresponding features, so the model cannot accurately associate the specific dimensions of the data with their semantic features. Limited by this, the model represents the feature information of generated images insufficiently, leaving the features of the reconstruction results indistinct and making subsequent recognition and processing difficult. To address these problems, a super-resolution model based on a class-information generative adversarial network (class-info SRGAN) is proposed on the basis of the SRGAN method.
Method
A class classifier is added to the SRGAN model, and a class-loss item is incorporated into the generative network loss; back-propagation training is then used to update the network parameter weights, thereby providing feature class-information to the model and ultimately generating reconstructed images with recognizable features. The innovation and advantage lie in introducing feature class-information into the loss function, which improves the optimization objective of the super-resolution model and makes the feature representation of the reconstruction results more prominent.
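Expressed as a formula (a reconstruction from the description above, not an equation given in the abstract; the weighting coefficient $\lambda$ is an assumed hyperparameter), the modified generator objective can be written as

$$L_G = L_{\mathrm{perceptual}} + \lambda\, L_{\mathrm{class}}, \qquad L_{\mathrm{class}} = -\sum_{i} y_i \log C\big(G(x_{\mathrm{LR}})\big)_i$$

where $G$ is the generator, $C$ is the added class classifier, $x_{\mathrm{LR}}$ is the low-resolution input, and $y$ is the one-hot class label; both terms are minimized through back-propagation over the generator weights.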
Result
Tests on the CelebA dataset show that the gender recognition rate of images generated by class-info SRGAN with a gender classifier is higher overall (58% to 97%), and that the glasses frames in images generated by class-info SRGAN with a glasses classifier are clearer. Moreover, results on the Fashion-mnist and Cifar-10 datasets likewise show that the model achieves better reconstruction quality than SRGAN.
Conclusion
The experimental results verify the superiority and effectiveness of the proposed method in super-resolution reconstruction tasks. They also show that although class-info SRGAN is better suited to images with simple, concrete attribute features, it is on the whole a markedly effective super-resolution model.
Objective
Image super-resolution reconstruction technology uses a set of low-quality, low-resolution images (or motion sequences) to produce the corresponding high-quality, high-resolution ones. This technology has a wide range of applications in many fields, such as military, medicine, public safety, and computer vision. In the field of computer vision, image super-resolution reconstruction enables an image to advance from the detection level to the recognition level, and even to the identification level. In other words, image super-resolution reconstruction can enhance image recognition capability and identification accuracy. In addition, image super-resolution reconstruction supports dedicated analysis of a target: a comparatively high spatial resolution image of the region of interest is obtained instead of directly computing the configuration of a high spatial resolution image from large amounts of data. Conventional approaches to super-resolution reconstruction include example-based models, bi-cubic interpolation, and sparse coding methods, among others. Deep learning has been applied to many related subjects amid the recent resurgence of artificial intelligence, and substantial research achievements have been realized alongside the research on super-resolution reconstruction. Convolutional neural networks (CNNs) and generative adversarial networks (GANs) have produced numerous breakthroughs in image super-resolution reconstruction, including super-resolution with a convolutional neural network (SRCNN), super-resolution with very deep convolutional networks (VDSR), and super-resolution with a generative adversarial network (SRGAN). SRGAN in particular has achieved remarkable progress in single-image super-resolution by taking a perceptual loss function, instead of the traditional loss function based on the mean squared error (MSE), as its optimization goal; this loss effectively solves the common problems of MSE-based training, resolving the fuzziness of reconstruction results while still obtaining a relatively high peak signal-to-noise ratio (PSNR). However, even though super-resolution reconstruction can remarkably improve image quality, a remaining problem is how to comprehensively highlight the feature representation of reconstructed images and thereby improve the reconstruction quality of generated images. Super-resolution reconstruction is inherently an ill-posed problem: images lose a certain amount of information during down-sampling, so the reconstructed high-resolution image must recover parts or characteristics missing from the corresponding low-resolution image, which inevitably introduces generative deviation. In addition, because SRGAN adds no auxiliary marker information to its loss function (i.e., information that explicitly instructs the model to generate the corresponding features), the model may fail to accurately match the specific dimensions of the data with their semantic features. This lack of controllability constrains the model from sufficiently representing the feature information of generated images, limits the quality of reconstructed images, and poses difficulties for subsequent identification and processing. To solve these problems, a super-resolution model based on the class-information generative adversarial network (class-info SRGAN) is proposed on the basis of the SRGAN method. Class-info SRGAN uses additional information variables to restrict the solution space of super-resolution reconstruction and to help the model accurately fulfil the reconstruction task, particularly with respect to the semantic features of the data.
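For concreteness, the following is a minimal PyTorch-style sketch of the perceptual loss idea described above: an MSE computed in VGG feature space plus a weighted adversarial term. This is an illustrative reconstruction, not the authors' released code; the chosen VGG layer cut-off and the 1e-3 adversarial weight are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

# Illustrative sketch of an SRGAN-style perceptual loss. The VGG19 layer
# cut-off and the adversarial weight are assumptions, not paper values.
class PerceptualLoss(nn.Module):
    def __init__(self, feature_layer=35, adv_weight=1e-3):
        super().__init__()
        # Frozen VGG19 feature extractor supplies the content part of the loss.
        vgg = vgg19(weights="IMAGENET1K_V1").features[:feature_layer].eval()
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg
        self.mse = nn.MSELoss()
        self.bce = nn.BCEWithLogitsLoss()
        self.adv_weight = adv_weight

    def forward(self, sr, hr, disc_logits_on_sr):
        # Content loss: distance in feature space, penalizing perceptual
        # differences instead of producing the over-smoothed results of
        # plain pixel-wise MSE.
        content = self.mse(self.vgg(sr), self.vgg(hr))
        # Adversarial loss: reward reconstructions the discriminator
        # classifies as real.
        adversarial = self.bce(disc_logits_on_sr,
                               torch.ones_like(disc_logits_on_sr))
        return content + self.adv_weight * adversarial
```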
Method
A class classifier is added to the original SRGAN model, and the class-loss item is integrated into the generative network loss. Then, back-propagation is employed during training to update the parameter weights of the network and provide feature class-information for the model. Finally, reconstructed images possessing the corresponding features are produced. In contrast to the original objective function, the proposed model is innovative in that it introduces feature class-information and improves the optimization objective of the super-resolution model; it thereby optimizes the network training process and renders the feature representation of the reconstruction results more prominent.
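As a concrete illustration of this procedure, here is a minimal sketch of a single generator update in a PyTorch setting; `generator`, `discriminator`, `classifier`, and the loss weight `lambda_cls` are assumed names and values, not the paper's exact implementation.

```python
import torch.nn.functional as F

# Hypothetical single training step for the class-info SRGAN generator:
# the class-loss item is added to the usual SRGAN generator loss, and
# back-propagation updates the generator weights.
def generator_step(generator, discriminator, classifier, perceptual_loss,
                   optimizer, lr_imgs, hr_imgs, labels, lambda_cls=1.0):
    sr_imgs = generator(lr_imgs)
    # Standard SRGAN part: perceptual (content + adversarial) loss.
    g_loss = perceptual_loss(sr_imgs, hr_imgs, discriminator(sr_imgs))
    # Class-loss item: the classifier must recognize the intended attribute
    # (e.g., gender or glasses) in the reconstruction, supplying explicit
    # feature class-information to the model.
    cls_loss = F.cross_entropy(classifier(sr_imgs), labels)
    total = g_loss + lambda_cls * cls_loss
    optimizer.zero_grad()
    total.backward()      # back-propagation through the generator
    optimizer.step()
    return total.item()
```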
Result
According to the CelebA experiments, the class-loss item enables the SRGAN model to make minor changes that improve the output. In a comparison of the SRGAN model with the model carrying gender-class information, some differences were inconclusive, i.e., it is hard to conclude whether the model has a significant effect even though improvements were achieved to some extent. Nevertheless, the overall gender recognition rate of the images generated by the class-info SRGAN model ranges from 58% to 97%, which is higher than that of the images from SRGAN (8% to 98%). With glasses-class information, the capability of the model to form better-shaped glasses increased. The results on the Fashion-mnist and Cifar-10 datasets also show that the model has a significant effect, even though the final results on Cifar-10 were not as prominent as those of the previous experiments. In summary, the outcomes show that the reconstruction quality of the images generated by the class-info SRGAN model is better than that of the original SRGAN model.
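To illustrate how a recognition rate such as the one reported above can be measured, the following hedged sketch applies a pretrained attribute classifier to the reconstructed images and takes its accuracy against the ground-truth attribute labels as the recognition rate; all names here are assumptions rather than the paper's evaluation code.

```python
import torch

@torch.no_grad()
def recognition_rate(generator, attribute_classifier, loader, device="cuda"):
    """Fraction of reconstructed images whose attribute (e.g., gender) is
    correctly recognized by a pretrained classifier."""
    correct, total = 0, 0
    for lr_imgs, labels in loader:   # assumed (low-res image, label) batches
        sr_imgs = generator(lr_imgs.to(device))
        preds = attribute_classifier(sr_imgs).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total
```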
Conclusion
Class-information operates well in cases where the attributes are clear and the model has learned them as thoroughly as possible. The experimental results verify the superiority and effectiveness of the proposed model in the super-resolution reconstruction task. For concrete and simple feature attributes, class-info SRGAN is likely to become a promising super-resolution model. However, to advance its application, several goals must be pursued: how to develop a general class-info SRGAN that can be used for various super-resolution reconstruction tasks, how to apply class-info SRGAN to multiple attributes simultaneously, and how to integrate auxiliary class-information into the architecture of class-info SRGAN efficiently and conveniently. These directions can provide references and conditions for achieving better-performing super-resolution reconstruction in the future.