Cross-age sketch-to-photo translation with double dual generative adversarial networks

Wu Liuwei1,2, Sun Rui1,2, Kan Junsong1,2, Gao Jun1 (1. School of Computer and Information, Hefei University of Technology, Hefei 230009, China; 2. Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei 230009, China)

Abstract
Objective Cross-age sketch-to-photo translation aims to synthesize, from a facial sketch image, facial photo images of the same person at different age stages. The task has broad application value in fields such as public safety and digital entertainment, yet it has received little research attention because paired samples are difficult to collect and the mechanism of face aging is complex. To address this, we propose a cross-age sketch-to-photo translation method based on double dual generative adversarial networks (D-DualGANs). Method With four generators and four discriminators trained adversarially, the network learns the forward and inverse mappings from sketch to photo and from the source age group to the target age group. By coupling the generation of sketch images with that of photo images, and the generation of aged images with that of rejuvenated images, it achieves duality in both the style attribute and the age attribute of images. A reconstructed identity loss and a full reconstruction loss are added to constrain image generation. Input sketch and photo images from different age groups are ultimately converted into photos and sketches of the other age group. Result Age labels were produced for the images of the Chinese University of Hong Kong (CUHK) face sketch database (CUFS) and the CUHK face sketch face recognition technology database (CUFSF), and the images were divided into three age groups accordingly; six D-DualGAN models were trained to realize pairwise conversion among the three age groups. Compared with a non-end-to-end method, the images generated by our method exhibit less distortion and noise and a lower age mean absolute error (MAE); a vote on similarity to the original images shows that the 11–30 sketch to 31–50 photo conversion performs best. Conclusion D-DualGANs convert the age and style attributes of an input image simultaneously, and the generated images effectively retain the identity features of the original image, effectively solving the problem of cross-style and cross-age image translation.
Keywords
Double dual generative adversarial networks for cross-age sketch-to-photo translation

Wu Liuwei1,2, Sun Rui1,2, Kan Junsong1,2, Gao Jun1(1.School of Computer and Information, Hefei University of Technology, Hefei 230009, China;2.Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei 230009, China)

Abstract
Objective Sketch-to-photo translation has a wide range of applications in public safety and digital entertainment. For example, it can help the police find fugitives and missing children, or generate an avatar for a social media account. Existing sketch-to-photo translation algorithms can only translate sketches into photos within the same age group; they do not solve the problem of cross-age sketch-to-photo translation. Cross-age sketch-to-photo translation also has a wide range of applications. For example, when a sketch in the hands of the police has become outdated over a long period, the task can generate an aged photo from the outdated sketch to help the police find the suspect. Because paired cross-age sketch and photo images are difficult to obtain, no such datasets are available. To solve this problem, this study combines dual generative adversarial networks (DualGANs) and identity-preserved conditional generative adversarial networks (IPCGANs) to propose double dual generative adversarial networks (D-DualGANs). Method DualGANs have the advantage of two-way conversion without the need for paired samples. However, they can only achieve a two-way conversion of a single attribute and cannot convert two attributes at the same time. IPCGANs can age or rejuvenate a face while retaining its personalized features, but they cannot perform two-way conversion between different age groups. This article treats the span of age as a domain-conversion problem and regards the cross-age sketch-to-photo translation task as a joint style and age conversion problem. Combining the characteristics of the above networks, we build D-DualGANs with four generators and four discriminators trained adversarially.
The method learns not only the mappings between the sketch domain and the photo domain but also the mappings between the source age group and the target age group. In D-DualGANs, the original sketch or photo image passes successively through the four generators, traversing the four domains to obtain a cross-age photo or sketch image and a reconstructed same-age sketch or photo image. The generators are optimized by a full reconstruction loss that measures the distance between the original image and its same-age reconstruction. An identity-preservation module introduces a reconstructed identity loss to maintain the personalized features of the face. Eventually, input sketch and photo images from different age groups are converted into photos and sketches of the other age group. This method does not require paired samples, overcoming the current lack of paired cross-age sketch and photo samples. Result The experiments combine the images of the CUFS (Chinese University of Hong Kong (CUHK) face sketch database) and CUFSF (CUHK face sketch face recognition technology database) sketch-photo datasets and produce a corresponding age label for each image based on the results of age-estimation software. According to the age labels, the sketch and photo images in the datasets are divided into three groups, 11–30, 31–50, and 50+, with images evenly distributed across the groups. Six D-DualGAN models were trained to realize pairwise conversion between the sketches and photo images of the three age groups, namely, 11–30 sketch and 31–50 photo, 11–30 sketch and 50+ photo, 31–50 sketch and 11–30 photo, 31–50 sketch and 50+ photo, 50+ sketch and 31–50 photo, and 50+ sketch and 11–30 photo.
Because there is little research on cross-age sketch-to-photo translation, to illustrate the effectiveness of the method, the images generated by our method are compared with those obtained by applying DualGANs followed by IPCGANs. Our images are of better quality, with less distortion and noise. Using an age-estimation CNN to judge the age accuracy of the generated images, the mean absolute error (MAE) of our method is lower than that of the direct cascade of DualGANs and IPCGANs. To evaluate the similarity between the generated images and the original images, we invited volunteers unrelated to this study to judge whether a generated image shows the same person as the original image. The results show that the generated aged images are judged similar to the originals, whereas the generated rejuvenated images fare worse. Among them, the 31–50 photos generated from 11–30 sketches were most often judged to show the same person as the original image. Conclusion The D-DualGANs proposed in this study learn the mapping and inverse mapping between the sketch domain and the photo domain as well as between different age groups, converting both the age and style attributes of the input image. Photo images of different ages can be generated from a given sketch image. Through the introduced reconstructed identity loss and full reconstruction loss, the generated images effectively retain the identity features of the original image. Thus, the problem of cross-style and cross-age image translation is solved effectively. D-DualGANs can serve as a general framework for other computer vision tasks that require two attribute conversions at the same time. However, the method still has some shortcomings. For example, conversion between different age-group pairs requires training different models: to achieve both 11–30 sketches to 31–50 photos and 11–30 sketches to 50+ photos, two D-DualGAN models must be trained separately. This is cumbersome in practical applications; a future improvement would be a single network model that achieves conversion among all age groups.
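The age-accuracy metric mentioned in the results, mean absolute error (MAE), can be illustrated with a short computation. The predicted ages and the target value below are made-up numbers for exposition, not results from the paper; a real evaluation would use ages predicted by an age-estimation CNN on the generated images.

```python
# MAE between ages predicted by an (assumed) age-estimation CNN and the
# target age for a generated batch. All numbers here are hypothetical.

def mae(predicted, target):
    """Mean absolute error between two equal-length sequences of ages."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)

# Hypothetical CNN predictions for images generated toward the 31-50 group,
# scored against that group's center age (40.5):
predicted_ages = [38.0, 44.0, 41.0, 36.0]
targets = [40.5] * len(predicted_ages)
print(mae(predicted_ages, targets))  # 2.75
```

A lower MAE means the generated faces land closer to the intended age group, which is how the comparison against the DualGANs + IPCGANs cascade is scored.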
Keywords
