Old photo restoration fusing reference priors and generative priors

Liu Jixin, Chen Rui, An Shipeng (School of Microelectronics, Tianjin University, Tianjin Key Laboratory of Imaging and Sensing Microelectronic Technology, Tianjin 300072, China)

Abstract
Objective Restoring old photos has significant practical value, but old photos contain multiple unknown and compound degradations. Traditional restoration methods combine different digital image processing techniques and usually produce incoherent or unnatural results. Deep-learning-based restoration methods have been proposed, but most focus on a single degradation or a limited set of degradations. To address these problems, this paper proposes a generative adversarial network that fuses reference priors and generative priors to restore old photos. Method Shallow features extracted from the old photo and the reference image are encoded to obtain deep semantic features and latent codes; the latent codes are then fused into a deep semantic code. The deep semantic code produces generative prior features through a generative prior network, and it also guides a conditional spatial multi-feature transformation condition attention block that spatially fuses and transforms the reference semantic features, the generative prior features, and the features to be restored. Finally, a decoding network reconstructs the restored image. Result The experiments compare the proposed method quantitatively and qualitatively with six image restoration methods on four evaluation metrics. The proposed algorithm outperforms all compared algorithms on every metric: PSNR (peak signal-to-noise ratio) of 23.69 dB, SSIM (structural similarity index) of 0.828 3, FID (Fréchet inception distance) of 71.53, and LPIPS (learned perceptual image patch similarity) of 0.309, improvements of 0.75 dB, 0.019 7, 13.69%, and 19.86%, respectively, over the second-ranked algorithm. Qualitatively, the proposed algorithm handles compound degradations better and restores richer details. In addition, it is lighter and faster than the compared algorithms: with 43.44 M parameters, inference on a 256×256-pixel image takes only 248 ms. Conclusion This paper proposes an old photo restoration method that fuses reference priors and generative priors, fully exploiting the semantic information of the reference prior and the portrait prior encapsulated in a generative model, and achieves state-of-the-art restoration performance both subjectively and objectively.
Keywords
Distorted old photo restoration with fused reference priors and generative priors

Liu Jixin, Chen Rui, An Shipeng (School of Microelectronics, Tianjin University, Tianjin Key Laboratory of Imaging and Sensing Microelectronic Technology, Tianjin 300072, China)

Abstract
Objective Restoring distorted old photos is a challenging practical problem. Photos erode severely in harsh environments, leaving the content unclear or even permanently damaged by scratches, noise, blur, and color fading. One option is to digitize the distorted photo and restore it manually at the pixel level with image processing software such as Adobe Photoshop; however, manual restoration is time-consuming, and restoring photos in batches this way is even harder. Traditional methods combine multiple restoration algorithms (such as digital filtering, edge detection, and image inpainting), but they tend to produce incoherent or unclear results. Many deep learning methods have since been proposed; however, most target a single degradation or a few combined degradations, and their generalization ability suffers because synthetic training data cannot represent the real degradation process and data distribution. Within a generative adversarial network framework, our method restores distorted old photos by introducing reference priors and generative priors, which improve both restoration quality and generalization. Method Reference image selection is a key factor in our method. A high-quality reference image has the following properties: 1) Structure similarity: the reference image and the distorted old photo should have similar image structure. 2) Feature similarity: old photo restoration focuses chiefly on portraits, since early cameras generally had low resolution and the portrait is the core of the photo; the portrait in the reference image should therefore resemble the portrait in the target photo as closely as possible in gender, age, posture, etc.
Theoretically, the closer the two images are, the better the coupling between their features and the more effective the prior information that can be obtained. Our method selects candidate reference images from two portrait datasets, the CelebFaces Attributes dataset (CelebA) and Flickr-Faces-HQ (FFHQ), using structural similarity as the indicator: an image whose structural similarity to the distorted old photo is greater than 0.9 is taken as an appropriate reference, and it is further aligned with the old photo through feature point detection. The method first extracts shallow features, using a 3×3 convolution for the reference image and convolutions with three kernel sizes (7×7, 5×5, 3×3) for the target photo. The shallow features of the reference image and the target photo are each encoded to obtain multi-scale deep semantic features and latent semantic codes. The two latent semantic codes are fused in latent space through a series of fully connected layers to obtain a deep semantic code. The deep semantic code drives a compressed pre-trained generative model to produce generative prior features and guides a spatial multi-feature (SMF) transformation condition attention block that fuses the reference semantic features, the generative prior features, and the distorted old photo features. Specifically, the distorted photo features are split into two sections: one keeps an identity connection to preserve restoration fidelity while a copy of it is fused with the generative prior features; the other undergoes an affine transformation projected from the compressed reference semantic features. Finally, the two sections are concatenated and fused under attention derived from the deep semantic code.
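The SSIM-based candidate screening described above can be sketched as follows. This is a minimal illustration, not the authors' code: `global_ssim` computes a simplified single-window SSIM over whole images in [0, 1] (the paper presumably uses a standard windowed SSIM), and `pick_reference` is a hypothetical helper name for thresholding candidates at 0.9.

```python
import numpy as np

def global_ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Simplified single-window SSIM for images with values in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def pick_reference(old_photo, candidates, threshold=0.9):
    """Indices of candidates whose similarity to the degraded photo
    exceeds the threshold, best match first, plus the raw scores."""
    scores = [global_ssim(old_photo, c) for c in candidates]
    keep = [int(i) for i in np.argsort(scores)[::-1] if scores[i] > threshold]
    return keep, scores
```

An identical image scores exactly 1.0, while an inverted copy (negative covariance) scores below zero and is rejected, which matches the intended role of the 0.9 cut-off.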
The fused features are passed to the decoder through skip and residual connections, and a final 3×3 convolution reconstructs the restored photo. We also build a distorted old photo dataset that excludes synthetic data. Result We quantitatively compare our method with six state-of-the-art methods on four evaluation metrics: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), learned perceptual image patch similarity (LPIPS), and Fréchet inception distance (FID), which together account for average pixel error, structural similarity, perceptual quality, and data distribution. Our method outperforms all compared methods on every metric: PSNR of 23.69 dB, SSIM of 0.828 3, FID of 71.53, and LPIPS of 0.309, improvements of 0.75 dB, 0.019 7, 13.69%, and 19.86%, respectively. Qualitatively, our method restores structured defects such as missing regions and scratches significantly better than the other methods, with more consistent and natural results, and achieves comparable or better results on unstructured defects. It also has fewer parameters (43.44 M) and a faster inference time (248 ms on average for a 256×256-pixel distorted old photo). Conclusion Our method restores distorted old photos with reference priors and generative priors, exploiting the semantic information of the reference prior and the portrait prior encapsulated in a compressed generative model, and achieves strong qualitative and quantitative restoration performance.
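The split/affine/attention flow of the SMF transformation condition attention block described in the Method section might be sketched, in heavily simplified form, as below. This is an assumption-laden illustration: `smf_block` is a hypothetical name, the affine parameters are taken directly from reference-feature channels rather than from a learned projection, and the attention is reduced to a sigmoid channel gate driven by the deep semantic code.

```python
import numpy as np

def smf_block(photo_feat, ref_feat, gen_feat, sem_code):
    """Simplified spatial multi-feature (SMF) fusion sketch.
    photo_feat, ref_feat, gen_feat: (C, H, W) feature maps.
    sem_code: (C,) deep semantic code driving channel attention.
    """
    c = photo_feat.shape[0] // 2
    identity, modulated = photo_feat[:c], photo_feat[c:]

    # Branch 1: identity path, with a copy fused with generative prior
    # features to keep restoration fidelity while injecting the prior.
    fused = identity + gen_feat[:c]

    # Branch 2: affine transform (scale, shift) of the remaining photo
    # features; here the parameters stand in for a learned projection
    # of the compressed reference semantic features.
    scale, shift = ref_feat[:c], ref_feat[c:2 * c]
    transformed = modulated * (1 + scale) + shift

    # Concatenate both branches, then weight channels with an attention
    # gate derived from the deep semantic code (sigmoid).
    cat = np.concatenate([fused, transformed], axis=0)
    attn = 1.0 / (1.0 + np.exp(-sem_code))
    return cat * attn[:, None, None]
```

The output keeps the input spatial size and channel count, so the block can slot between encoder and decoder stages as the abstract describes.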
Keywords
