Image extraction of cartoon line art based on cycle-consistent adversarial networks
2021, Vol. 26, No. 5, pp. 1117-1127
Received: 2020-08-20; Revised: 2020-09-29; Accepted: 2020-10-06; Published in print: 2021-05-16
DOI: 10.11834/jig.200465
Objective
Line art drawing and coloring are time-consuming and labor-intensive steps of animation production, and much research has therefore been devoted to automating the production process. Data-driven automation research is advancing rapidly, yet no public line art dataset is available. To address the difficulty of acquiring real line art images and the distortion produced by existing line art extraction methods, this paper proposes an automatic line art extraction model based on cycle-consistent adversarial networks.
Method
The model is built on a cycle-consistent adversarial network structure so that it can be trained on unpaired data. Input images at several scales, together with their boundary maps, are fed into mask-guided convolution units that adaptively select the intermediate features of the network. To further improve the extracted line art, a boundary consistency loss function is proposed to keep the gradient changes of the generated result consistent with those of the input image.
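As a concrete illustration of the inputs described above, the following is a minimal sketch of how the multi-scale input images and their boundary maps might be prepared. Using the Canny operator for the boundary maps and these particular scales are assumptions for illustration, not the paper's exact procedure.

```python
import cv2

def build_pyramid_with_boundaries(color_path, scales=(1.0, 0.5, 0.25)):
    """Prepare multi-scale inputs and boundary maps (Canny is assumed here)."""
    img = cv2.imread(color_path)                       # BGR color image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    pyramid = []
    for s in scales:
        w, h = int(img.shape[1] * s), int(img.shape[0] * s)
        scaled = cv2.resize(img, (w, h), interpolation=cv2.INTER_AREA)
        # Boundary map: edge information extracted from the color image
        edges = cv2.Canny(cv2.resize(gray, (w, h)), 100, 200)
        pyramid.append((scaled, edges))
    return pyramid
```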
Result
On the public animation color image dataset Danbooru2018, the line art images extracted by the proposed model contain less noise and clearer lines than those of existing extraction methods, and are closer to line art drawn by real cartoonists. In the experiments, 30 users aged 20 to 25 were invited to score the line art images extracted by the proposed method and four other methods. Across the 30 groups of test samples, the line art extracted by the proposed method was judged the best in 84% of the samples.
Conclusion
By introducing mask-guided units into the cycle-consistent adversarial network, the model extracts line art from color images more faithfully, and user scoring against existing methods shows that it outperforms the compared methods on animation line art extraction. Moreover, the model does not require a large amount of real line art training data; only about 1 000 real line art images were collected for the experiments. The model not only provides data support for subsequent research on animation drawing and coloring, but also offers a new solution for image edge extraction.
Objective
With the continuous development of digital media, the demand for animation works keeps increasing. Excellent two-dimensional animation usually requires a great deal of time and effort. In the production process, the key-frame line art images are drawn by the lead artist, the in-between frames are then drawn by several ordinary animators, and finally all line art images are colored by the coloring staff. To improve the efficiency of two-dimensional animation production, researchers have been committed to automating this process. Data-driven deep learning technology is developing rapidly and offers a new way to improve the production efficiency of animation works. Although many data-driven automated methods have been proposed, training datasets remain very difficult to obtain, and no public dataset pairs color images with line art images. For this reason, automatically extracting line art images from color animation images provides data support for animation production research.
Method
Early image edge extraction methods depend on manually set parameter values, and fixed parameters cannot be applied to all images, while data-driven edge extraction methods are limited by the collection and size of their datasets. Researchers therefore usually resort to data augmentation techniques or to images similar to line art, such as boundary images (edge information extracted from color images). This study proposes an automatic line art extraction model based on cycle-consistent adversarial networks to address both the difficulty of obtaining real line art images and the distortion of existing line art extraction methods.
First, the study adopts a cycle-consistent adversarial network structure to cope with the absence of a dataset of color images paired with corresponding line art images; the model parameters are learned from only a small number of collected real line art images and a large number of color images.
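For reference, the cycle-consistency objective of CycleGAN (Zhu et al., 2017) that underlies this unpaired training can be sketched as follows; the generator names and the weight lam are illustrative, not the paper's exact configuration.

```python
import torch.nn.functional as F

def cycle_consistency_loss(g_c2l, g_l2c, color_batch, line_batch, lam=10.0):
    """L1 cycle loss (Zhu et al., 2017): translating color -> line -> color
    (and line -> color -> line) should reconstruct the original input."""
    rec_color = g_l2c(g_c2l(color_batch))  # color -> fake line -> color
    rec_line = g_c2l(g_l2c(line_batch))    # line -> fake color -> line
    return lam * (F.l1_loss(rec_color, color_batch) +
                  F.l1_loss(rec_line, line_batch))
```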
Second, a mask-guided convolution unit and a mask-guided residual unit are proposed to better select the intermediate output features of the network. Specifically, input images at different scales and their corresponding boundary images are fed into the mask-guided convolution unit to learn mask parameters for the intermediate feature layers, where the boundary map determines the line regions of the line art image and the input image provides prior information. To ensure that no information is lost during encoding, the network avoids operations such as pooling that discard information; instead, the image resolution is reduced by controlling the convolution kernel size and stride.
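The abstract does not give the exact wiring of the mask-guided convolution unit, so the following PyTorch sketch shows one plausible reading under stated assumptions: the scaled input image and its boundary map predict a soft mask that gates the intermediate features, and downsampling is done with a stride-2 convolution rather than pooling. The class name and gating design are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MaskGuidedConv(nn.Module):
    """Sketch of a mask-guided convolution unit (wiring assumed): the input
    image and its boundary map produce a soft mask over the features."""
    def __init__(self, feat_channels):
        super().__init__()
        # 3 image channels + 1 boundary channel -> per-channel soft mask
        self.mask_conv = nn.Sequential(
            nn.Conv2d(4, feat_channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Stride-2 convolution halves the resolution in place of pooling,
        # avoiding pooling-induced information loss.
        self.down = nn.Conv2d(feat_channels, feat_channels,
                              kernel_size=4, stride=2, padding=1)

    def forward(self, feats, image, boundary):
        mask = self.mask_conv(torch.cat([image, boundary], dim=1))
        return self.down(feats * mask)  # adaptively selected features
```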
Finally, a boundary consistency loss function is proposed. Because no supervision corresponding to the input image is available, the loss is designed to measure the difference between the gradient information of the input image and that of the output image, and a regularization term is added, so that the gradients of the generated result stay consistent with those of the input image.
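A minimal sketch of such a gradient-matching term is given below, assuming a single-channel generated line art image and finite-difference gradients; the Gaussian regular term described in the Conclusion is not reproduced here, so this is an illustration of the idea rather than the paper's exact loss.

```python
import torch.nn.functional as F

def image_gradients(x):
    """Finite-difference gradients along height and width."""
    dy = x[:, :, 1:, :] - x[:, :, :-1, :]
    dx = x[:, :, :, 1:] - x[:, :, :, :-1]
    return dy, dx

def boundary_consistency_loss(generated, source, weight=1.0):
    """Penalize the difference between the gradients of the generated line
    art and the input image, so strong edges stay aligned. Simplified:
    the paper's Gaussian regular term is omitted."""
    gy, gx = image_gradients(generated)                       # (B,1,...)
    sy, sx = image_gradients(source.mean(dim=1, keepdim=True))  # grayscale
    return weight * (F.l1_loss(gy, sy) + F.l1_loss(gx, sx))
```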
Result
On the public animation color image dataset Danbooru2018, the line art extraction results of the proposed method are compared with those of the Canny edge detection operator, cycle-consistent adversarial networks (CycleGAN), holistically-nested edge detection (HED), and SketchKeras. The Canny operator extracts only the position information of image gradients. The lines extracted by CycleGAN are blurred and incomplete, and the lines in some regions cannot be extracted correctly. The line art extracted by HED has clear outer contours but seriously lacks internal details. The line art extracted by SketchKeras is closer to an edge information image and contains rich gradient change information, which leaves the lines unclear and noisy. The results of the proposed model are not only clear with little noise but also closer to the effect drawn by human animators. To assess the practical performance of the proposed method, 30 users between 20 and 25 years old were invited to score the cartoon line art images extracted by the five methods on 30 groups of test samples. Each user selected the best line art image in each group according to whether the extracted lines are clear, whether noise is present, and whether the result is close to a real cartoonist's line art. The statistics show that the line art extracted by the proposed method is superior to that of the other methods in image quality and authenticity. Moreover, the proposed method can extract line art not only from color animation images but also from real color images: in the experiments, the model extracted line art from real-world color photographs and obtained results similar to animation line art. The model is also better at extracting black border lines, which may be because the borders of the color animation images in the training set are black.
Conclusion
This study proposes a model for extracting line art images from color animation images. It trains the network parameters on asymmetric data and does not require a large number of real cartoon line art images. The proposed mask-guided convolution unit and mask-guided residual unit constrain the output features of the intermediate network through the input image and the corresponding boundary image, yielding clearer lines. The proposed boundary consistency loss function introduces a Gaussian regular term that makes the boundaries of regions with severe gradient changes more obvious and regions with weak gradient changes smoother, reducing the noise in the generated line art. Finally, the method extracts line art images from the public animation color dataset Danbooru2018, providing data support for subsequent line art drawing and coloring research, and it can also extract results similar to an animator's sketch from real color images.
Arjovsky M, Chintala S and Bottou L. 2017. Wasserstein generative adversarial networks[EB/OL]. [2020-08-10]. http://proceedings.mlr.press/v70/arjovsky17a.html
Branwen G. 2021. Danbooru2020: a large-scale crowdsourced and tagged anime illustration dataset[EB/OL]. [2020-08-10]. https://www.gwern.net/Danbooru2018
Canny J. 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6): 679-698 [DOI:10.1109/TPAMI.1986.4767851]
Chen W L and Hays J. 2018. SketchyGAN: towards diverse and realistic sketch to image synthesis//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 9416-9425 [DOI:10.1109/CVPR.2018.00981]
Fu H B, Zhou S Z, Liu L G and Mitra N J. 2011. Animated construction of line drawings. ACM Transactions on Graphics, 30(6): #133 [DOI:10.1145/2070781.2024167]
Furusawa C, Hiroshiba K, Ogaki K and Odagiri Y. 2017. Comicolorization: semi-automatic manga colorization//SIGGRAPH Asia 2017 Technical Briefs. Bangkok, Thailand: ACM: 1-4 [DOI:10.1145/3145749.3149430]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial networks[EB/OL]. [2021-03-05]. https://arxiv.org/pdf/1406.2661.pdf
Hensman P and Aizawa K. 2017. cGAN-based manga colorization using a single training image//Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Kyoto, Japan: IEEE: 72-77 [DOI:10.1109/ICDAR.2017.295]
Huang J L and Zheng X M. 2008. Improved image edge detection algorithm based on Canny operator. Computer Engineering and Applications, 44(25): 170-172 [DOI:10.3778/j.issn.1002-8331.2008.25.051]
Isola P, Zhu J Y, Zhou T H and Efros A A. 2017. Image-to-image translation with conditional adversarial networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5967-5976 [DOI:10.1109/CVPR.2017.632]
Ji H, Sun J X, Shao X F and Mao L. 2004. The algorithm for image edge detection and prospect. Computer Engineering and Applications, 40(14): 70-73 [DOI:10.3321/j.issn:1002-8331.2004.14.023]
Levin A, Lischinski D and Weiss Y. 2004. Colorization using optimization. ACM Transactions on Graphics, 23(3): 689-694 [DOI:10.1145/1015706.1015780]
Lllyasviel. 2019. SketchKeras[EB/OL]. [2020-08-10]. https://github.com/lllyasviel/sketchKeras
Mirza M and Osindero S. 2014. Conditional generative adversarial nets[EB/OL]. [2020-08-10]. https://arxiv.org/pdf/1411.1784.pdf
Miyato T, Kataoka T, Koyama M and Yoshida Y. 2018. Spectral normalization for generative adversarial networks[EB/OL]. [2020-08-10]. https://arxiv.org/pdf/1802.05957.pdf
Odena A, Dumoulin V and Olah C. 2016. Deconvolution and checkerboard artifacts[EB/OL]. [2020-10-06]. https://distill.pub/2016/deconv-checkerboard/
Qu Y G, Wong T T and Heng P A. 2006. Manga colorization. ACM Transactions on Graphics, 25(3): 1214-1220 [DOI:10.1145/1179352.1142017]
Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI:10.1007/978-3-319-24574-4_28]
Sýkora D, Dingliana J and Collins S. 2009. LazyBrush: flexible painting tool for hand-drawn cartoons. Computer Graphics Forum, 28(2): 599-608 [DOI:10.1111/j.1467-8659.2009.01400.x]
Ulyanov D, Vedaldi A and Lempitsky V. 2016. Instance normalization: the missing ingredient for fast stylization[EB/OL]. [2020-08-10]. https://arxiv.org/pdf/1607.08022.pdf
Wang T C, Liu M Y, Zhu J Y, Tao A, Kautz J and Catanzaro B. 2018. High-resolution image synthesis and semantic manipulation with conditional GANs//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8798-8807 [DOI:10.1109/CVPR.2018.00917]
Xie S N and Tu Z W. 2015. Holistically-nested edge detection//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 1395-1403 [DOI:10.1109/ICCV.2015.164]
Zhang L M, Ji Y, Lin X and Liu C P. 2017. Style transfer for anime sketches with enhanced residual U-net and auxiliary classifier GAN//Proceedings of the 4th IAPR Asian Conference on Pattern Recognition. Nanjing, China: IEEE: 506-511 [DOI:10.1109/ACPR.2017.61]
Zhang L M, Li C Z, Wong T T, Ji Y and Liu C P. 2018. Two-stage sketch colorization. ACM Transactions on Graphics, 37(6): #261 [DOI:10.1145/3272127.3275090]
Zhu J Y, Park T, Isola P and Efros A A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2242-2251 [DOI:10.1109/ICCV.2017.244]