Correlation alignment total variation model and algorithm for style transfer
2020, Vol. 25, No. 2, Pages 241-254
Received: 2019-05-10; Revised: 2019-09-07; Accepted: 2019-09-14; Published in print: 2020-02-16
DOI: 10.11834/jig.190199
Objective
Image style transfer has been one of the research hotspots in machine vision in recent years. The result images produced by traditional convolutional neural network (CNN)-based style transfer methods suffer from uneven style texture, amplified noise, and long iteration time. To address these problems, this paper proposes a new total variation style transfer model based on correlation alignment within the CNN framework.
Method
Building on a detailed analysis of traditional style transfer methods, the new model introduces a style texture extraction method based on correlation alignment; by minimizing the loss function, the style information is distributed more uniformly in the resulting image. By analyzing and comparing the reconstruction results of different convolutional layers after the CNN decomposes an image, a new convolutional layer selection strategy is proposed to effectively improve the efficiency of the style transfer model. The new model also introduces classical total variation regularization to effectively suppress the noise generated during style transfer, giving the resulting image a better visual effect.
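For reference, the classical total variation regularizer mentioned here is commonly written in its isotropic discrete form as follows (a standard definition; the exact discretization used by the paper is not given in this abstract):
$$\mathrm{TV}(\mathit{\boldsymbol{x}})=\sum\limits_{i, j} \sqrt{\left(x_{i+1, j}-x_{i, j}\right)^{2}+\left(x_{i, j+1}-x_{i, j}\right)^{2}}$$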
Result
Simulation results show that, compared with traditional methods, the result images obtained by the proposed method perform better in both style texture and content information; that is, the style texture is more uniform and fine-grained while the information of the content image is well preserved. In addition, the new model effectively suppresses the noise generated during style transfer and runs more efficiently, reducing the iteration time by approximately 30% relative to the traditional model.
Conclusion
Compared with traditional methods, the result images obtained by the proposed method have better visual effects, and the model's efficiency is clearly superior to that of traditional style transfer models.
Objective
The style transfer of images has been a research hotspot in computer vision and image processing in recent years. Image style transfer technology transfers the style of a style image to a content image, so that the resulting image combines the main content structure of the content image with the style information of the style image, thereby satisfying people's artistic requirements for the image. The development of image style transfer can be divided into two phases. In the first phase, non-photorealistic rendering methods were commonly used to add artistic style to design works. These methods use only the low-level features of the image for style transfer, and most of them suffer from problems such as poor visual effects and low operational efficiency. In the second phase, researchers have performed considerable meaningful work by introducing the achievements of deep learning to style transfer. Within the framework of convolutional neural networks (CNNs), Gatys et al. proposed a classical image style transfer method, which uses a CNN to extract high-level features of the style and content images and obtains the stylized result image by minimizing a loss function. Compared with traditional non-photorealistic rendering methods, the CNN-based method requires no user intervention during the style transfer process, is applicable to any type of style image, and has good universality. However, the resulting image suffers from uneven texture expression and increased noise, and the iteration time is long. To address these problems, we propose a new total variation style transfer model based on correlation alignment, starting from a detailed analysis of the traditional style transfer method.
Method
In this study, we design a style texture extraction method based on correlation alignment so that the style information is evenly distributed over the resulting image. In addition, total variation regularization is introduced to effectively suppress the noise generated during style transfer, and a more efficient convolutional layer selection strategy for the result image is adopted to improve the overall efficiency of the new model. The new model consists of three VGG-19 (visual geometry group) networks. Only the conv4_3 convolutional layer of the VGG style network is used to provide style information, and only the conv4_2 convolutional layer of the VGG content network is used to provide content information.
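As a rough illustration of the correlation-alignment idea, in the spirit of the CORAL loss of Sun et al. (2015) on which the model builds (the authors' exact formulation may differ), the style loss for one convolutional layer can be sketched as the distance between the channel covariance matrices of the result and style feature maps. The function names and shapes below are illustrative assumptions:

import tensorflow as tf

def channel_covariance(feats):
    """Covariance matrix of the channels of one feature map.
    feats: float tensor of shape (H, W, C); static shapes assumed."""
    h, w, c = feats.shape
    f = tf.reshape(feats, (h * w, c))                  # one row per spatial position
    f = f - tf.reduce_mean(f, axis=0, keepdims=True)   # center each channel
    return tf.matmul(f, f, transpose_a=True) / float(h * w - 1)

def coral_style_loss(result_feats, style_feats):
    """CORAL-style loss: squared Frobenius distance between the
    second-order statistics of the result and style features."""
    diff = channel_covariance(result_feats) - channel_covariance(style_feats)
    c = float(result_feats.shape[-1])
    return tf.reduce_sum(tf.square(diff)) / (4.0 * c * c)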
For a given content image $$\mathit{\boldsymbol{c}}$$ and style image $$\mathit{\boldsymbol{s}}$$, suppose the resulting image of the style transfer is $$\mathit{\boldsymbol{x}}$$ (using the content image with added random noise as its initial value). Content image $$\mathit{\boldsymbol{c}}$$ and style image $$\mathit{\boldsymbol{s}}$$ are input into the VGG content network on the left side and the VGG style network on the right side of the new model, respectively, and the feature maps corresponding to each convolutional layer are obtained. The initial value of the resulting image $$\mathit{\boldsymbol{x}}$$ is input into the intermediate VGG result network, and the initial feature maps of its convolutional layers are obtained. The Adam algorithm is used to minimize the total loss function, whose optimal value is obtained by iteratively updating the resulting image $$\mathit{\boldsymbol{x}}$$ through the VGG result network.
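A minimal sketch of this optimization loop, assuming TensorFlow 2 with a pretrained Keras VGG-19 (whose layers block4_conv2 and block4_conv3 correspond to conv4_2 and conv4_3), the coral_style_loss sketch above, and an assumed learning rate; this illustrates the procedure and is not the authors' code:

import tensorflow as tf

# Truncated VGG-19 feature extractors (conv4_2 for content, conv4_3 for style).
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False
content_net = tf.keras.Model(vgg.input, vgg.get_layer('block4_conv2').output)
style_net = tf.keras.Model(vgg.input, vgg.get_layer('block4_conv3').output)

def content_loss(result_feats, content_feats):
    # Squared-error content loss on the conv4_2 activations.
    return 0.5 * tf.reduce_sum(tf.square(result_feats - content_feats))

def transfer(content_img, style_img, alpha=1.0, beta=5.0, gamma=500.0,
             steps=5000, lr=2.0):
    """Iterative transfer; the 1 : 5 : 500 content/style/TV weights and
    5 000 iterations follow the settings reported in the paper."""
    c_feats = content_net(content_img)
    s_feats = style_net(style_img)[0]
    # Result image initialized as the content image plus random noise.
    x = tf.Variable(content_img + 0.1 * tf.random.normal(tf.shape(content_img)))
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = (alpha * content_loss(content_net(x), c_feats)
                    + beta * coral_style_loss(style_net(x)[0], s_feats)
                    + gamma * tf.reduce_sum(tf.image.total_variation(x)))
        opt.apply_gradients([(tape.gradient(loss, x), x)])
    return x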
The proposed style transfer model has three parameters, namely, the content loss weight, the style loss weight, and the total variation regularization parameter, which are set to 1, 5, and 500, respectively. All programs are coded in Python with the TensorFlow deep learning framework, and the experiments are performed on an Alibaba Cloud GN5 cloud server with an Intel Xeon E5-2682 V4 (Broadwell) CPU clocked at 2.5 GHz and an NVIDIA P100 GPU with 12 GB of video memory. The proposed and traditional models use the same parameters; that is, the weight ratio of the content loss to the style loss is 1:5, and the number of iterations is 5 000.
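For completeness, a hypothetical input pipeline for the sketch above; the file paths and the 512 × 512 working size are assumptions, and VGG-19 requires its own input preprocessing:

def load_image(path, size=(512, 512)):
    """Decode an RGB image, resize, add a batch axis, apply VGG preprocessing."""
    img = tf.io.read_file(path)
    img = tf.image.decode_image(img, channels=3, expand_animations=False)
    img = tf.image.resize(tf.cast(img, tf.float32), size)
    return tf.keras.applications.vgg19.preprocess_input(img)[tf.newaxis, ...]

result = transfer(load_image('content.jpg'), load_image('style.jpg'))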
Result
We compare our model with the classical style transfer model of Gatys et al. Experiments show that the resulting image of the proposed model has a style texture close to that of the style image, and its content structure is close to that of the content image. Furthermore, the resulting image from the new model contains considerably fewer impurities than that from the Gatys model. The iteration time of the new model is approximately 31 s shorter, and its running efficiency is approximately 30% higher, than those of the classical Gatys model; thus, the efficiency of the proposed model is substantially improved compared with the traditional style transfer model. Moreover, a series of comparative experiments is conducted to illustrate the universality of the proposed model.
Conclusion
In this paper, a new total variation style transfer model based on correlation alignment is proposed. The model introduces a style texture extraction method based on correlation alignment together with classical total variation regularization. Thus, the style information is distributed more uniformly in the resulting image, and the noise generated during the style transfer process is effectively reduced. A new convolutional layer selection strategy is proposed by analyzing and comparing the reconstruction results of different convolutional layers after CNN decomposition of images, which improves the efficiency of the style transfer model. Several experimental results show that the proposed model is superior to the classical style transfer model in terms of both the visual effect of the resulting image and the operational efficiency of the algorithm.
Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z F and Citro C. 2015. TensorFlow: large-scale machine learning on heterogeneous distributed systems[EB/OL]. 2015-11-09[2019-05-01]. https://arxiv.org/pdf/1603.04467v1.pdf
Chen T D. 2006. The synthesis of non-photorealistic motion effects for cartoon//Proceedings of the 6th International Conference on Intelligent Systems Design and Applications. Jinan, China: IEEE: 811-818[DOI:10.1109/ISDA.2006.253717]
d'Angelo E, Jacques L, Alahi A and Vandergheynst P. 2014. From bits to images: inversion of local binary descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(5): 874-887[DOI:10.1109/TPAMI.2013.228]
Fortune S. 1987. A sweepline algorithm for Voronoi diagrams. Algorithmica, 2(1/4): 153-250[DOI:10.1007/bf01840357]
Gatys L A, Ecker A S and Bethge M. 2015. A neural algorithm of artistic style[EB/OL]. 2015-08-26[2019-05-01]. https://arxiv.org/pdf/1508.06576.pdf
He K M, Zhang X Y, Ren S Q and Sun J. 2015. Deep residual learning for image recognition[EB/OL]. 2015-12-10[2019-05-01]. https://arxiv.org/pdf/1512.03385.pdf
Hertzmann A, Jacobs C E, Oliver N, Curless B and Salesin D H. 2001. Image analogies//Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM: 327-340[DOI:10.1145/383259.383295]
Hoff III K E, Culver T, Keyser J, Lin M and Manocha D. 2000. Fast computation of generalized Voronoi diagrams using graphics hardware//Proceedings of the 26th Annual Symposium on Computational Geometry. Clear Water Bay, Kowloon, Hong Kong, China: ACM: 375-376[DOI:10.1145/336154.336226]
Jing Y C, Liu Y, Yang Y Z, Feng Z L, Yu Y Z, Tao D C and Song M L. 2018. Stroke controllable fast style transfer with adaptive receptive fields[EB/OL]. 2018-02-20[2019-05-01]. https://arxiv.org/pdf/1802.07101.pdf
Johnson J, Alahi A and Li F F. 2016. Perceptual losses for real-time style transfer and super-resolution//Proceedings of the 14th European Conference on Computer Vision (ECCV 2016). Amsterdam, The Netherlands: Springer International Publishing[DOI:10.1007/978-3-319-46475-6_43]
Kingma D P and Ba J L. 2014. Adam: a method for stochastic optimization[EB/OL]. 2014-12-22[2019-05-01]. https://arxiv.org/pdf/1412.6980v8.pdf
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada: ACM: 1097-1105
Li C and Wand M. 2016. Combining Markov random fields and convolutional neural networks for image synthesis[EB/OL]. 2016-01-18[2019-05-01]. https://arxiv.org/pdf/1601.04589.pdf
Li S H, Xu X X, Nie L Q and Chua T S. 2017. Laplacian-steered neural style transfer[EB/OL]. 2017-07-05[2019-05-01]. https://arxiv.org/pdf/1707.01253.pdf
Luan F J, Paris S, Shechtman E and Bala K. 2017. Deep photo style transfer[EB/OL]. 2017-03-22[2019-05-01]. https://arxiv.org/pdf/1703.07511.pdf
Mahendran A and Vedaldi A. 2014. Understanding deep image representations by inverting them[EB/OL]. 2014-11-26[2019-05-01]. https://arxiv.org/pdf/1412.0035.pdf
Reed S, Akata Z, Mohan S, Tenka S, Schiele B and Lee H. 2016. Learning what and where to draw[EB/OL]. 2016-10-08[2019-05-01]. https://arxiv.org/pdf/1610.02454.pdf
Risser E, Wilmot P and Barnes C. 2017. Stable and controllable neural texture synthesis and style transfer using histogram losses[EB/OL]. 2017-01-31[2019-05-01]. https://arxiv.org/pdf/1701.08893.pdf
Sainath T N, Kingsbury B, Saon G, Soltau H, Mohamed A R, Dahl G and Ramabhadran B. 2015. Deep convolutional neural networks for large-scale speech tasks. Neural Networks, 64: 39-48[DOI:10.1016/j.neunet.2014.08.005]
Secord A. 2002. Weighted Voronoi stippling//Proceedings of the 2nd International Symposium on Non-Photorealistic Animation and Rendering. Annecy, France: ACM: 37-43[DOI:10.1145/508530.508537]
Sun B C, Feng J S and Saenko K. 2015. Return of frustratingly easy domain adaptation[EB/OL]. 2015-11-17[2019-05-01]. https://arxiv.org/pdf/1511.05547.pdf
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2014. Going deeper with convolutions[EB/OL]. 2014-09-17[2019-05-01]. https://arxiv.org/pdf/1409.4842.pdf
Winnemöller H, Olsen S C and Gooch B. 2006. Real-time video abstraction. ACM Transactions on Graphics, 25(3): 1221-1226[DOI:10.1145/1179352.1142018]
Ye F M, Su Y F, Xiao H, Zhao X Q and Min W D. 2018. Remote sensing image registration using convolutional neural network features. IEEE Geoscience and Remote Sensing Letters, 15(2): 232-236[DOI:10.1109/LGRS.2017.2781741]
Yu L H, Feng Y Q and Chen W F. 2009. Adaptive regularization method based total variational de-noising algorithm. Journal of Image and Graphics, 14(10): 1950-1954[DOI:10.11834/jig.20091004]
Zhou X C, Wu T, Shi L F and Chen M. 2018. A kind of wavelet transform image denoising method based on curvature variation regularization. Acta Electronica Sinica, 46(3): 621-628[DOI:10.3969/j.issn.0372-2112.2018.03.016]