
Chen Yuan1, Zhao Yang2, Zhang Xiaojuan3, Liu Xiaoping2 (1. Anhui University; 2. Hefei University of Technology; 3. Qinghai Normal University)

Abstract
Objective Sketch colorization is one of the key steps in fields such as cartoon animation production and artistic painting: a black-and-white sketch composed of lines is filled with colors to become a color image. Fully automatic sketch colorization can relieve the tedious and time-consuming manual coloring workload in the drawing process; however, automatically understanding the sparse lines in a sketch and selecting appropriate colors remains difficult. Method Based on the prior that, in real scenarios, a particular painting type usually has a fixed color-style preference, this paper focuses on automatic sketch colorization in a finite color space. Constraining the color space not only reduces the difficulty of semantic understanding but also avoids unreasonable color choices. Specifically, a two-stage automatic sketch colorization method is proposed. In the first stage, a grayscale-image generator is designed to supplement the lines and details of the input sparse sketch and produce a dense-pixel grayscale image. In the second stage, a color reasoning module first infers a color subspace suitable for the sketch from the input color prior, and then a multi-scale generative network that progressively fuses color information is proposed to gradually generate a high-quality color image. Result Experiments compare the proposed method with four automatic sketch colorization methods on three datasets. In the objective-quality comparison of colorization results, the proposed method achieves higher PSNR and SSIM values and a lower mean squared error; in the color-metric comparison, it achieves the highest colorfulness score; and in the subjective evaluation and user study, it also obtains results more consistent with human aesthetic perception. In addition, ablation results show that the proposed model structure and the color-space constraint both benefit colorization performance. Conclusion Experimental results demonstrate that the proposed automatic sketch colorization method under a finite color space can effectively colorize multiple types of sketches, and more diverse color images can be obtained simply by adjusting the color prior.
Sketch colorization with finite color space prior


Objective In the art field, an exquisite painting usually takes great effort, from sketch drawing in the early stage to coloring and polishing. With the rise of the cartoon, painting, graphic design, and other related industries, sketch colorization has become one of the most tedious and repetitive processes. Although some computer-aided design tools have appeared in the past decades, they still require humans to perform the colorization operations, and it is difficult for ordinary users to draw an exquisite painting. Meanwhile, automatic sketch colorization remains a difficult problem. Therefore, both academia and industry are in urgent need of convenient and efficient sketch colorization methods. With the rapid development of deep neural networks (DNNs), DNN-based colorization methods have achieved promising performance in recent years. However, most studies focus on grayscale colorization of natural images, which is quite different from sketch colorization. At present, only a few studies address automatic sketch colorization, and they usually require user guidance or are designed for a certain type of content such as anime characters. Unfortunately, automatically understanding sparse lines and selecting appropriate colors is still a very difficult and ill-posed problem. Disharmonious colors thus often appear in recent automatic sketch colorization results, e.g., red grass and a black sun. Therefore, most sketch colorization methods reduce the difficulty by means of user guidance, which can be roughly divided into three types: reference-image-based, color-hint-based, and text-expression-based. Although these semi-automatic methods can provide more reasonable results, they still require inefficient user interaction. Method In practice, we observe that the colors used in paintings of a particular style are usually fixed and finite rather than arbitrary.
Therefore, this paper focuses on automatic sketch colorization with a finite color space prior, which effectively reduces the difficulty of understanding semantics and avoids undesired colors. More specifically, a two-stage multi-scale colorization network is designed. In the first stage, a subnetwork is proposed to generate a dense grayscale image from the input sparse sketch. It adopts the commonly used U-Net structure, whose large receptive field helps to capture high-level semantics. In the second stage, a multi-scale generative adversarial network is designed to colorize the grayscale image according to the input color palette. In this paper, we adopt three scales. At the minimum scale, the input color palette prior, which contains several specified dominant colors, is used to guide the color reasoning process. The sketch content features extracted by the content encoder are converted into a spatial attention map, which is multiplied by the color features extracted by the color encoder to perform preliminary color reasoning. The structure of the other two scales is similar to that of the minimum scale: color guidance information is gradually fused to generate higher-quality and higher-resolution results. Adversarial learning is adopted; hence, corresponding to the generators, there are discriminators at the three scales plus a discriminator for grayscale generation. An intermediate loss is designed to guide the generation of the grayscale image in the first stage, while pixel-wise, adversarial, and total variation (TV) losses are used in the second stage. Considering the application scenarios, we constructed three datasets: a Regong Tibetan painting dataset, a Thangka (Tibetan scroll painting) elements dataset, and a specific cartoon dataset. Each dataset contains images, color palettes, and corresponding sketches. Result Our model is implemented on the TensorFlow platform.
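To make the second-stage fusion step concrete, the sketch below illustrates in plain NumPy (the actual model is a TensorFlow network) how a spatial attention map derived from content features can gate the color features, together with a standard TV loss. Shapes, the channel-mean stand-in for the learned projection, and the function names are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(content_feat, color_feat):
    """Gate color features with a spatial attention map computed from
    content features. Both inputs have shape (H, W, C).
    The channel mean stands in for a learned 1x1 convolution."""
    attn = sigmoid(content_feat.mean(axis=-1, keepdims=True))  # (H, W, 1)
    return attn * color_feat  # broadcast over channels

def tv_loss(img):
    """Total variation loss: sum of absolute differences between
    neighboring pixels, encouraging spatially smooth output."""
    dh = np.abs(img[1:, :] - img[:-1, :]).sum()
    dw = np.abs(img[:, 1:] - img[:, :-1]).sum()
    return dh + dw

rng = np.random.default_rng(0)
content = rng.standard_normal((8, 8, 16))       # hypothetical content features
palette_feat = rng.standard_normal((8, 8, 16))  # hypothetical color features
fused = attention_fuse(content, palette_feat)
print(fused.shape)                  # (8, 8, 16)
print(tv_loss(np.ones((8, 8, 3))))  # constant image -> zero TV loss
```

Because the sigmoid attention lies in (0, 1), the fusion only modulates the color features spatially; the color reasoning itself stays decoupled, which is what later allows swapping the palette prior without retraining.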
For the three datasets, all images are of size 256×256 and are split into training and testing sets with a ratio of 9:1. The proposed method is compared with the baseline model Pix2Pix and several automatic sketch colorization methods: AdvSegLoss, PaintsChainer, and Style2Paints V4.5. Note that PaintsChainer and Style2Paints are colorization tools that can be used directly, and they require users to input sketches manually; hence, we only retrained Pix2Pix and AdvSegLoss with their official implementations on the same datasets for a fair comparison. For quantitative evaluation, PSNR, SSIM, and MSE are used to measure the objective quality of the output colorized images, and the colorfulness score is used to measure their color vividness. The proposed method achieves nearly the highest PSNR and SSIM values and the lowest MSE values on the three datasets, and obtains the best colorfulness scores on all datasets. Although the quantitative evaluation demonstrates the effectiveness of the proposed method, it is more meaningful to compare colorization results through human subjective perception. Hence, subjective results and user studies are provided to evaluate the performance of the proposed method. In the qualitative evaluation, the proposed method reproduces harmonious colors and achieves better visual quality than the comparison methods. Furthermore, owing to the decoupled color reasoning and fusion modules in our model, the color of the output image can be changed flexibly with different input color priors. In the user study, most users prefer our results in terms of both color selection and overall image quality. Moreover, we conducted ablation experiments, including training a single-stage model, i.e., colorizing an image without first generating a grayscale image, and using only a single scale in the second stage.
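Colorfulness scores of the kind used above are commonly computed with the Hasler and Süsstrunk (2003) metric on red-green and yellow-blue opponent channels; assuming that definition and 8-bit RGB input, a minimal version is:

```python
import numpy as np

def colorfulness(img):
    """Hasler-Suesstrunk colorfulness on an (H, W, 3) RGB image with
    values in [0, 255]; higher scores indicate more vivid colors."""
    r = img[..., 0].astype(np.float64)
    g = img[..., 1].astype(np.float64)
    b = img[..., 2].astype(np.float64)
    rg = r - g               # red-green opponent channel
    yb = 0.5 * (r + g) - b   # yellow-blue opponent channel
    std_rgyb = np.hypot(rg.std(), yb.std())
    mean_rgyb = np.hypot(rg.mean(), yb.mean())
    return std_rgyb + 0.3 * mean_rgyb

gray = np.full((64, 64, 3), 128, dtype=np.uint8)  # achromatic image
half = gray.copy()
half[:, :32] = (255, 0, 0)                        # paint half pure red
print(colorfulness(gray))                         # 0.0
print(colorfulness(half) > colorfulness(gray))    # True
```

An achromatic image scores zero because both opponent channels vanish, so the metric rewards exactly the color vividness that PSNR, SSIM, and MSE, which only measure fidelity to the ground truth, do not capture.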
The single-stage model performs the worst, which shows that learning automatic colorization directly is a very difficult task, and the single-scale model performs worse than the multi-scale model, which verifies the effectiveness of the multi-scale strategy. Conclusion Experimental results on three datasets show that the proposed automatic sketch colorization method based on a finite color space prior can colorize sketches effectively and can easily generate diverse colorful results by changing the color prior.