Double dual generative adversarial networks for cross-age sketch-to-photo translation
2020, Vol. 25, No. 4, pp. 732-744
Received: 2019-07-15
Revised: 2019-09-29
Accepted: 2019-10-06
Published in print: 2020-04-16
DOI: 10.11834/jig.190329
Objective
Cross-age sketch-to-photo translation aims to synthesize, from a facial sketch image, facial photo images of the same person at different ages. The task has broad application value in fields such as public safety and digital entertainment, but it has received little study because paired samples are difficult to collect and the mechanism of facial aging is complex. To address this, a cross-age sketch-to-photo translation method based on double dual generative adversarial networks (D-DualGANs) is proposed.
Method
The network sets up four generators and four discriminators and, through adversarial training, learns the forward and inverse mappings from sketch to photo and from the source age group to the target age group. The generation of sketch images is coupled with that of photo images, and the generation of aged images with that of rejuvenated images, realizing duality in the style attribute and the age attribute, respectively. A reconstructed identity loss and a full reconstruction loss are added to constrain image generation. Ultimately, input sketch and photo images from different age groups are translated into photos and sketches of each other's age group.
Result
Age labels were produced for the images of the Chinese University of Hong Kong (CUHK) face sketch database (CUFS) and the CUHK face sketch face recognition technology database (CUFSF), and the images were divided into three age groups according to these labels. Six D-DualGANs models were trained in total to realize pairwise translation among the three age groups. Compared with a non-end-to-end method, the images generated by our method show less distortion and noise and a lower mean absolute error (MAE) of age. A similarity vote against the original images shows that translation between 11~30 sketches and 31~50 photos works best.
Conclusion
Double dual generative adversarial networks can convert the age and style attributes of an input image at the same time, and the generated images effectively preserve the identity features of the original images, effectively solving the problem of cross-style, cross-age image translation.
Objective
Sketch-to-photo translation has a wide range of applications in public safety and digital entertainment. For example, it can help the police find fugitives and missing children, or generate an avatar for a social media account. Existing sketch-to-photo translation algorithms can only translate sketches into photos within the same age group; they do not solve the problem of cross-age sketch-to-photo translation, which also has a wide range of applications. For example, when the sketch image at the police's disposal has become outdated after a long time, cross-age translation can generate an aged photo from the outdated sketch to help the police find the suspect. Because paired cross-age sketch and photo images are difficult to obtain, no such datasets are available. To solve this problem, this study combines dual generative adversarial networks (DualGANs) and identity-preserved conditional generative adversarial networks (IPCGANs) to propose double dual generative adversarial networks (D-DualGANs).
Method
DualGANs have the advantage of two-way conversion without the need for paired samples, but they can only achieve two-way conversion of a single attribute, not of two attributes at the same time. IPCGANs can age or rejuvenate a face while retaining its personalized features, but they cannot perform two-way conversion between different age groups. This article treats the age span as a domain conversion problem and the cross-age sketch-to-photo translation task as a joint style and age conversion problem. We combine the characteristics of the above networks to build D-DualGANs, with four generators and four discriminators trained adversarially. The method learns not only the mapping from the sketch domain to the photo domain and its inverse but also the mapping from the source age group to the target age group and its inverse. In D-DualGANs, the original sketch or photo image passes successively through the four generators to complete the four-domain conversion, yielding a cross-age photo or sketch image together with a reconstructed same-age sketch or photo image. The generators are optimized by a full reconstruction loss that measures the distance between the generated cross-age image and the reconstructed same-age image. We also use an identity-preserving module to introduce a reconstructed identity loss that maintains the personalized features of the face. Eventually, input sketch and photo images from different age groups are converted into photos and sketches of the other age group. The method does not require paired samples, overcoming the current lack of paired cross-age sketch and photo data.
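The four-generator chain described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the generators are random linear maps standing in for trained networks, the names (`G_sp`, `G_age`, `G_ps`, `G_deage`) are our own, and the L1 reconstruction term is one common choice whose exact formulation may differ from the paper's full reconstruction loss.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16 * 16  # flattened toy "image" size

# Four hypothetical generator stand-ins (random linear maps, not trained nets):
# G_sp: sketch -> photo (source age)     G_age: source-age photo -> target-age photo
# G_ps: photo -> sketch (target age)     G_deage: target-age sketch -> source-age sketch
G_sp, G_age, G_ps, G_deage = (0.05 * rng.standard_normal((D, D)) for _ in range(4))

def four_domain_chain(sketch):
    """Pass a source-age sketch through all four generators in turn."""
    photo_src = np.tanh(G_sp @ sketch)          # style: sketch -> photo
    photo_tgt = np.tanh(G_age @ photo_src)      # age: source group -> target group
    sketch_tgt = np.tanh(G_ps @ photo_tgt)      # style: photo -> sketch
    sketch_rec = np.tanh(G_deage @ sketch_tgt)  # age: back to source group
    return photo_tgt, sketch_rec

x = np.tanh(rng.standard_normal(D))             # toy input sketch
cross_age_photo, reconstruction = four_domain_chain(x)

# A plausible L1 "full reconstruction" term comparing the input sketch
# with its round-trip reconstruction.
full_rec_loss = float(np.abs(x - reconstruction).mean())
```

The round trip through all four generators is what couples the style duality (sketch↔photo) with the age duality (source↔target group) in a single training signal.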
Result
The experiments combine the images of the CUFS (CUHK (Chinese University of Hong Kong) face sketch database) and CUFSF (CUHK face sketch face recognition technology database) sketch-photo datasets and produce a corresponding age label for each image based on the results of age estimation software. According to the age labels, the sketch and photo images in the datasets are divided into three groups, 11~30, 31~50, and 50+, with each age group evenly distributed. Six D-DualGANs models were trained to realize pairwise conversion between sketches and photo images of the three age groups, namely: the 11~30 sketch and the 31~50 photo, the 11~30 sketch and the 50+ photo, the 31~50 sketch and the 11~30 photo, the 31~50 sketch and the 50+ photo, the 50+ sketch and the 31~50 photo, and the 50+ sketch and the 11~30 photo. Because there is little research on cross-age sketch-to-photo translation, to illustrate the effectiveness of the method, the images generated by our method are compared with those obtained by applying DualGANs and then IPCGANs. Our images are of good quality, with less distortion and noise. Using an age-estimation CNN to judge the age accuracy of the generated images, the mean absolute error (MAE) of our method is lower than that of the direct combination of DualGANs and IPCGANs. To evaluate the similarity between the generated and original images, we invited volunteers unrelated to this study to judge whether a generated image shows the same person as the original. The results show that the generated aging images are similar to the originals, while the generated rejuvenated images are poorer. Among them, the 31~50 photos generated from 11~30 sketches are judged most similar to the original images.
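The age MAE metric used above is simply the mean absolute error between the ages predicted by an age-estimation CNN and the target age labels. A small sketch, with made-up ages that are not results from the paper:

```python
# Hypothetical target ages and ages predicted for the generated images.
true_ages = [25, 40, 55, 33, 62]
pred_ages = [28, 38, 60, 30, 58]

# MAE: average absolute difference between prediction and target.
mae = sum(abs(p - t) for p, t in zip(pred_ages, true_ages)) / len(true_ages)
print(mae)  # (3 + 2 + 5 + 3 + 4) / 5 = 3.4
```

A lower MAE means the generated faces land closer to the intended target age group.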
Conclusion
The D-DualGANs proposed in this study learn the mapping and inverse mapping between the sketch domain and the photo domain, as well as between different age groups. They convert both the age and the style attributes of the input image, so photo images of different ages can be generated from a given sketch image. Through the introduced reconstructed identity loss and full reconstruction loss, the generated image effectively retains the identity features of the original image. Thus, the problem of cross-style and cross-age image translation is solved effectively. D-DualGANs can serve as a general framework for other computer vision tasks that require converting two attributes at the same time. However, the method still has shortcomings. For example, conversion between different age groups requires training different models: to achieve both 11~30 sketches to 31~50 photos and 11~30 sketches to 50+ photos, two D-DualGANs models must be trained separately. This is cumbersome in practical applications, and a future direction for improvement is a single network model that achieves conversion between all age groups.
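The per-pair training cost noted above can be made concrete with a small enumeration. The group labels follow the paper; the pairing logic itself is our illustration:

```python
from itertools import permutations

# The three age groups used in the experiments. Each ordered
# (sketch group -> photo group) pair with distinct groups needs
# its own trained D-DualGANs model: 3 * 2 = 6 models.
age_groups = ["11~30", "31~50", "50+"]
model_pairs = list(permutations(age_groups, 2))
print(len(model_pairs))  # 6 separate models
```

A single conditional model covering all group pairs would collapse this to one training run, which is the improvement direction the conclusion points to.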
Chen B C, Chen C S and Hsu W H. 2014. Cross-age reference coding for age-invariant face recognition and retrieval//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 768-783 [DOI: 10.1007/978-3-319-10599-4_49]
Chen S X, Zhang C J, Dong M, Le J L and Rao M K. 2017. Using ranking-CNN for age estimation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 742-751 [DOI: 10.1109/CVPR.2017.86]
Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y. 2014. Generative adversarial networks [EB/OL]. [2019-06-30]. https://arxiv.org/pdf/1406.2661.pdf
Isola P, Zhu J Y, Zhou T H and Efros A A. 2017. Image-to-image translation with conditional adversarial networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 5967-5976 [DOI: 10.1109/CVPR.2017.632]
Kazemi V and Sullivan J. 2014. One millisecond face alignment with an ensemble of regression trees//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE: 1867-1874 [DOI: 10.1109/CVPR.2014.241]
Kemelmacher-Shlizerman I, Suwajanakorn S and Seitz S M. 2014. Illumination-aware age progression//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE: 3334-3341 [DOI: 10.1109/CVPR.2014.426]
Kim T, Cha M, Kim H, Lee J K and Kim J. 2017. Learning to discover cross-domain relations with generative adversarial networks [EB/OL]. [2019-06-30]. https://arxiv.org/pdf/1703.05192.pdf
Klum S J, Han H, Klare B F and Jain A K. 2014. The FaceSketchID system: matching facial composites to mugshots. IEEE Transactions on Information Forensics and Security, 9(12): 2248-2263 [DOI: 10.1109/TIFS.2014.2360825]
Krizhevsky A, Sutskever I and Hinton G E. 2017. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90 [DOI: 10.1145/3065386]
Li C and Wand M. 2016. Precomputed real-time texture synthesis with Markovian generative adversarial networks//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 702-716 [DOI: 10.1007/978-3-319-46487-9_43]
Mirza M and Osindero S. 2014. Conditional generative adversarial nets [EB/OL]. [2019-06-30]. https://arxiv.org/pdf/1411.1784.pdf
Ricanek K and Tesafaye T. 2006. MORPH: a longitudinal image database of normal adult age-progression//Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition. Southampton, UK: IEEE: 341-345 [DOI: 10.1109/FGR.2006.78]
Suo J L, Zhu S C, Shan S G and Chen X L. 2010. A compositional and dynamic model for face aging. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3): 385-401 [DOI: 10.1109/TPAMI.2009.39]
Tang X, Wang Z W, Luo W X and Gao S H. 2018. Face aging with identity-preserved conditional generative adversarial networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE: 7939-7947 [DOI: 10.1109/CVPR.2018.00828]
Tang X O and Wang X G. 2002. Face photo recognition using sketch//Proceedings of International Conference on Image Processing. Rochester, NY, USA: IEEE: I-257-260 [DOI: 10.1109/ICIP.2002.1038008]
Tang X O and Wang X G. 2003. Face sketch synthesis and recognition//Proceedings of the 9th IEEE International Conference on Computer Vision. Nice, France: IEEE: 687-694 [DOI: 10.1109/ICCV.2003.1238414]
Tazoe Y, Gohara H, Maejima A and Morishima S. 2012. Facial aging simulator considering geometry and patch-tiled texture//Proceedings of ACM SIGGRAPH 2012 Posters. Los Angeles, California: ACM: 90 [DOI: 10.1145/2342896.2343002]
Wang N N, Li J, Tao D C, Li X L and Gao X B. 2013. Heterogeneous image transformation. Pattern Recognition Letters, 34(1): 77-84 [DOI: 10.1016/j.patrec.2012.04.005]
Wang N N, Zha W J, Li J and Gao X B. 2018. Back projection: an effective postprocessing method for GAN-based face sketch synthesis. Pattern Recognition Letters, 107: 59-65 [DOI: 10.1016/j.patrec.2017.06.012]
Wang X G and Tang X O. 2009. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11): 1955-1967 [DOI: 10.1109/TPAMI.2008.222]
Wang Z Y, Cao M X, Li L and Peng Q S. 2009. Individual prototyping based facial aging image synthesis. Acta Electronica Sinica, 37(S1): 118-124 [DOI: 10.3321/j.issn:0372-2112.2009.z1.021]
Xia Y C, He D, Qin T, Wang L W, Yu N H, Liu T Y and Ma W Y. 2016. Dual learning for machine translation [EB/OL]. [2019-06-30]. https://arxiv.org/pdf/1611.00179.pdf
Yi R, Liu Y J, Lai Y K and Rosin P L. 2019. APDrawingGAN: generating artistic portrait drawings from face photos with hierarchical GANs [EB/OL]. [2019-08-13]. http://orca.cf.ac.uk/121531/1/APDrawingGAN_CVPR2019.pdf
Yi Z L, Zhang H, Tan P and Gong M L. 2017. DualGAN: unsupervised dual learning for image-to-image translation//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2868-2876 [DOI: 10.1109/ICCV.2017.310]
Zhang L L, Lin L, Wu X, Ding S Y and Zhang L. 2015. End-to-end photo-sketch generation via fully convolutional representation learning//Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. Shanghai, China: ACM: 627-634 [DOI: 10.1145/2671188.2749321]
Zhang W, Wang X G and Tang X O. 2011. Coupled information-theoretic encoding for face photo-sketch recognition//Proceedings of CVPR 2011. Providence, RI, USA: IEEE: 513-520 [DOI: 10.1109/CVPR.2011.5995324]
Zhang Z F, Song Y and Qi H R. 2017. Age progression/regression by conditional adversarial autoencoder//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 4352-4360 [DOI: 10.1109/CVPR.2017.463]
Zhao J J, Fang Q, Liang Z C, Hu C S, Yang F M and Zhan S. 2016. Sketch face recognition based on super-resolution reconstruction. Journal of Image and Graphics, 21(2): 218-224 [DOI: 10.11834/jig.20160211]
Zhou H, Kuang Z H and Wong K Y K. 2012. Markov weight fields for face sketch synthesis//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE: 1091-1097 [DOI: 10.1109/CVPR.2012.6247788]
Zhu J Y, Park T, Isola P and Efros A A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2242-2251 [DOI: 10.1109/ICCV.2017.244]