Survey on deep learning image inpainting methods
2019, Vol. 24, No. 3, Pages 447-463
Received: 2018-07-04; Revised: 2018-09-29; Published in print: 2019-03-16
DOI: 10.11834/jig.180408

Objective
Image inpainting is an important research topic in computer vision. Its goal is to automatically restore missing content in an image from the image's known content, and it has broad application value in image editing, film and television special effects production, virtual reality, and digital cultural heritage protection. In recent years, with the extensive study of deep learning in academia and industry, its advantages in image semantic extraction, feature representation, and image generation have become increasingly prominent, making deep-learning-based image inpainting a research hotspot at home and abroad that is attracting more and more attention. To help more researchers explore the theory and development of deep-learning-based image inpainting, this paper surveys the current research status of the field.
Method
This paper first analyzes the theoretical basis on which deep-learning-based image inpainting methods are built and then studies the key technologies involved. It summarizes the main deep-learning-based image inpainting methods of recent years and classifies the existing methods according to the structure of the inpainting network into three categories: image inpainting methods based on convolutional autoencoder architectures, image inpainting methods based on generative adversarial network architectures, and image inpainting methods based on recurrent neural network architectures.
Result
In deep-learning-based image inpainting, the design of the deep network and the choice of the loss function used during training are of central importance. Each class of methods has its own advantages, disadvantages, and range of applicability, and improving the semantic plausibility of inpainting results and the correctness of their structure and details has long been the direction of researchers' efforts. To this end, this paper experimentally analyzes and summarizes the main characteristics of each class of methods, their existing problems, their requirements on training samples, their main application fields, and available reference code.
Conclusion
Research on deep-learning-based image inpainting has made notable progress, but the application of deep learning to image inpainting is still in its infancy, and current work mainly exploits only the content information of the image to be repaired. Deep-learning-based image inpainting therefore remains a highly challenging topic. How to design inpainting networks with broad applicability and improve the accuracy of inpainting results requires further in-depth study.
Objective
Inpainting is the process of reconstructing lost or deteriorated parts of images and videos. This reconstruction process is an important research area in computer vision, and its purpose is to automatically repair lost content according to the known content of the images and videos. Inpainting has extensive application value in the fields of image editing, film and television special effects production, virtual reality, and digital cultural heritage protection. Deep learning has been widely studied in academia and industry in recent years, and its advantages in image semantic extraction, feature representation, and image generation have become increasingly prominent, drawing growing attention to research on image inpainting based on deep learning. This study reviews the current research status of image inpainting based on deep learning to enable researchers to explore its theory and development.
Method
This paper first discusses the image inpainting problem and summarizes the advantages and disadvantages of the commonly used methods by comparing their results when restoring large image regions. The theoretical basis of image inpainting based on deep learning is then analyzed, and the key technologies involved, which include the generation network based on the autoencoder, the general training methods of deep networks, and the training methods based on the convolutional autoencoder network, are studied. This paper also summarizes the image inpainting methods based on deep learning that have been proposed in recent years and classifies them into three categories according to the architecture of their inpainting network: image inpainting methods based on the deep convolutional autoencoder architecture, image inpainting methods based on the generative adversarial network (GAN) architecture, and image inpainting methods based on the recurrent neural network (RNN) architecture. The basic structure of the generation network based on the autoencoder is described, many improved networks and their loss functions are analyzed, and experimental results based on different loss functions are provided. For the image inpainting methods based on GANs, the basic structure and processing pipeline of the GAN are described, and experimental results of some classical methods are presented. For the image inpainting methods based on RNNs, the RNN model is analyzed, especially the methods based on the PixelRNN model, and experimental results on the MNIST and CIFAR-10 datasets are provided.
Result
The design of the deep network and the selection of the training loss function are important in image inpainting based on deep learning. Each method has its merits, demerits, and range of applicability. The main direction of research is how to improve the semantic plausibility and the structural and detail correctness of the repaired image. To this end, this paper summarizes and analyzes, through experiments, the characteristics, existing issues, requirements for training samples, main application fields, and reference code of these methods.
Conclusion
Although remarkable progress has been made in the field of image inpainting based on deep learning, the application of deep learning to image inpainting remains in its infancy: current research mainly uses only the content information of the image to be repaired, and each method still has demerits and its own range of adaptation. Consequently, image inpainting based on deep learning remains a challenging subject, and how to improve the adaptability of the inpainting network and the correctness of inpainting results still requires further study. This paper indicates the development prospects from the following aspects: 1) Further research may focus on how to design an adaptive network based on both semantic and texture networks. 2) The quality of inpainted images must be improved by studying the loss function of the inpainting network; the study of distance measurements suited to different application purposes is especially critical. 3) Further research may focus on image inpainting methods for specific image types, such as improving the generalization capability of the methods on small datasets by designing targeted training network structures and applying procedures such as fine-tuning. 4) As processing power, such as that of GPUs, increases, inpainting methods that train directly on high-resolution images are also worth studying. 5) For some complex scenes, human-computer interaction strategies for repairing images are still worth studying to promote practical application and enrich digital image restoration technologies.
References
Bertalmio M, Sapiro G, Caselles V, et al. Image inpainting[C]//Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. New York, USA: ACM, 2000: 417-424. [DOI: 10.1145/344779.344972]
Shen J H, Chan T F. Mathematical models for local nontexture inpaintings[J]. SIAM Journal on Applied Mathematics, 2002, 62(3): 1019-1043. [DOI: 10.1137/S0036139900368844]
Shen J H, Kang S H, Chan T F. Euler's elastica and curvature-based inpainting[J]. SIAM Journal on Applied Mathematics, 2003, 63(2): 564-592. [DOI: 10.1137/S0036139901390088]
Tsai A, Yezzi A, Willsky A S. Curve evolution implementation of the Mumford-Shah functional for image segmentation, denoising, interpolation, and magnification[J]. IEEE Transactions on Image Processing, 2001, 10(8): 1169-1186. [DOI: 10.1109/83.935033]
Bertalmio M, Bertozzi A L, Sapiro G. Navier-Stokes, fluid dynamics, and image and video inpainting[C]//Proceedings of 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Kauai, HI, USA: IEEE, 2001: 355-362. [DOI: 10.1109/CVPR.2001.990497]
Barnes C, Shechtman E, Finkelstein A, et al. PatchMatch: a randomized correspondence algorithm for structural image editing[J]. ACM Transactions on Graphics, 2009, 28(3): #24. [DOI: 10.1145/1531326.1531330]
Criminisi A, Perez P, Toyama K. Object removal by exemplar-based inpainting[C]//Proceedings of 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Madison, WI, USA: IEEE, 2003: 721-728. [DOI: 10.1109/CVPR.2003.1211538]
Komodakis N. Image completion using global optimization[C]//Proceedings of 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2006: 442-452. [DOI: 10.1109/CVPR.2006.141]
Komodakis N, Tziritas G. Image completion using efficient belief propagation via priority scheduling and dynamic pruning[J]. IEEE Transactions on Image Processing, 2007, 16(11): 2649-2661. [DOI: 10.1109/TIP.2007.906269]
Qiang Z P, He L B, Xu D. Exemplar-based pixel by pixel inpainting based on patch shift[C]//Proceedings of the 2nd CCF Chinese Conference on Computer Vision. Tianjin, China: Springer, 2017: 370-382. [DOI: 10.1007/978-981-10-7302-1_31]
Liu H M, Bi X H, Ye Z F, et al. Arc promoting image inpainting using exemplar searching and priority filling[J]. Journal of Image and Graphics, 2016, 21(8): 993-1003. [DOI: 10.11834/jig.20160803]
Zeng J X, Wang C. Image completion based on redefined priority and image division[J]. Journal of Image and Graphics, 2017, 22(9): 1183-1193. [DOI: 10.11834/jig.170054]
Hays J, Efros A A. Scene completion using millions of photographs[J]. ACM Transactions on Graphics, 2007, 26(3): #4. [DOI: 10.1145/1276377.1276382]
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. [DOI: 10.1109/5.726791]
Vinyals O, Toshev A, Bengio S, et al. Show and tell: a neural image caption generator[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 3156-3164. [DOI: 10.1109/CVPR.2015.7298935]
Gatys L A, Ecker A S, Bethge M. Texture synthesis using convolutional neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: ACM, 2015: 262-270.
Gatys L A, Ecker A S, Bethge M. Image style transfer using convolutional neural networks[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2414-2423. [DOI: 10.1109/CVPR.2016.265]
Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montréal, Canada: ACM, 2014: 2672-2680.
Mirza M, Osindero S. Conditional generative adversarial nets[EB/OL]. 2014-11-06[2018-06-19]. https://arxiv.org/pdf/1411.1784.pdf.
Arjovsky M, Bottou L. Towards principled methods for training generative adversarial networks[EB/OL]. 2017-01-17[2018-06-19]. https://arxiv.org/pdf/1701.04862.pdf.
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[EB/OL]. 2015-11-19[2018-06-19]. https://arxiv.org/pdf/1511.06434.pdf.
Berthelot D, Schumm T, Metz L. BEGAN: boundary equilibrium generative adversarial networks[EB/OL]. 2017-03-31[2018-06-19]. https://arxiv.org/pdf/1703.10717.pdf.
Arjovsky M, Chintala S, Bottou L. Wasserstein GAN[EB/OL]. 2017-01-26[2018-06-19]. https://arxiv.org/pdf/1701.07875.pdf.
Yang C, Lu X, Lin Z, et al. High-resolution image inpainting using multi-scale neural patch synthesis[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 4076-4084. [DOI: 10.1109/CVPR.2017.434]
Rumelhart D E, Hinton G E, Williams R J. Learning internal representations by error propagation[M]//Anderson J A, Rosenfeld E. Neurocomputing: Foundations of Research. Cambridge: MIT Press, 1988: 318-362.
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507. [DOI: 10.1126/science.1127647]
Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks[C]//Proceedings of the 19th International Conference on Neural Information Processing Systems. Canada: ACM, 2006: 153-160.
Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554. [DOI: 10.1162/neco.2006.18.7.1527]
Masci J, Meier U, Cireşan D, et al. Stacked convolutional auto-encoders for hierarchical feature extraction[C]//Proceedings of the 21st International Conference on Artificial Neural Networks. Espoo, Finland: Springer, 2011: 52-59. [DOI: 10.1007/978-3-642-21735-7_7]
Zeiler M D, Taylor G W, Fergus R. Adaptive deconvolutional networks for mid and high level feature learning[C]//Proceedings of 2011 International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011: 2018-2025. [DOI: 10.1109/ICCV.2011.6126474]
Pathak D, Krähenbühl P, Donahue J, et al. Context encoders: feature learning by inpainting[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2536-2544. [DOI: 10.1109/CVPR.2016.278]
Demir U, Ünal G. Inpainting by deep autoencoders using an advisor network[C]//Proceedings of the 25th Signal Processing and Communications Applications Conference. Antalya, Turkey: IEEE, 2017: 1-4. [DOI: 10.1109/SIU.2017.7960317]
Li Y J, Liu S F, Yang J M, et al. Generative face completion[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 5892-5900. [DOI: 10.1109/CVPR.2017.624]
Iizuka S, Simo-Serra E, Ishikawa H. Globally and locally consistent image completion[J]. ACM Transactions on Graphics, 2017, 36(4): #107. [DOI: 10.1145/3072959.3073659]
Yu J H, Lin Z, Yang J M, et al. Generative image inpainting with contextual attention[C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 5505-5514.
Laube P, Grunwald M, Franz M O, et al. Image inpainting for high-resolution textures using CNN texture synthesis[EB/OL]. 2017-12-08[2018-06-19]. https://arxiv.org/pdf/1712.03111.pdf.
Yan Z Y, Li X M, Li M, et al. Shift-Net: image inpainting via deep feature rearrangement[EB/OL]. 2018-01-29[2018-06-19]. https://arxiv.org/pdf/1801.09392.pdf.
Li H F, Li G B, Lin L, et al. Context-aware semantic inpainting[EB/OL]. 2017-12-21[2018-06-19]. https://arxiv.org/pdf/1712.07778.pdf.
Yang C, Song Y H, Liu X F, et al. Image inpainting using block-wise procedural training with annealed adversarial counterpart[EB/OL]. 2018-03-23[2018-06-19]. https://arxiv.org/pdf/1803.08943.pdf.
Song Y H, Yang C, Shen Y J, et al. SPG-Net: segmentation prediction and guidance network for image inpainting[EB/OL]. 2018-05-09[2018-06-19]. https://arxiv.org/pdf/1805.03356.pdf.
Liu G L, Reda F A, Shih K J, et al. Image inpainting for irregular holes using partial convolutions[EB/OL]. 2018-04-20[2018-06-19]. https://arxiv.org/pdf/1804.07723.pdf.
Song Y H, Yang C, Lin Z, et al. Contextual-based image inpainting: infer, match, and translate[EB/OL]. [2018-06-19]. https://arxiv.org/pdf/1711.08590.pdf.
Demir U, Ünal G. Patch-based image inpainting with generative adversarial networks[EB/OL]. 2018-03-20[2018-06-19]. https://arxiv.org/pdf/1803.07422.pdf.
Kolouri S, Pope P E, Martin C E, et al. Sliced-Wasserstein autoencoder: an embarrassingly simple generative model[EB/OL]. 2018-04-05[2018-06-19]. https://arxiv.org/pdf/1804.01947.pdf.
Yeh R A, Chen C, Lim T Y, et al. Semantic image inpainting with deep generative models[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 6882-6890. [DOI: 10.1109/CVPR.2017.728]
Dolhansky B, Ferrer C C. Eye in-painting with exemplar generative adversarial networks[EB/OL]. 2017-12-11[2018-06-19]. https://arxiv.org/pdf/1712.03999.pdf.
Elad A, Kerzhner Y, Romano Y. Image inpainting using pre-trained classification CNN[R]. Haifa, Israel: Israel Institute of Technology, 2018. [DOI: 10.13140/RG.2.2.33013.68327]
Altinel F, Ozay M, Okatani T. Deep structured energy-based image inpainting[EB/OL]. 2018-01-24[2018-06-19]. https://arxiv.org/pdf/1801.07939.pdf.
Lahiri A, Jain A, Biswas P K, et al. Improving consistency and correctness of sequence inpainting using semantically guided generative adversarial network[EB/OL]. 2017-11-16[2018-06-19]. https://arxiv.org/pdf/1711.06106.pdf.
Oord A V D, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks[C]//Proceedings of the 33rd International Conference on Machine Learning. New York, USA: JMLR, 2016: 1747-1756.
Burlin C, Le Calonnec Y, Duperier L. Deep image inpainting[EB/OL]. 2017-07-02[2018-06-19]. http://cs231n.stanford.edu/reports/2017/pdfs/328.pdf.
Sogancioglu E, Hu S, Belli D, et al. Chest X-ray inpainting with deep generative models[EB/OL]. 2018-04-12[2018-06-19]. https://openreview.net/forum?id=HJzbN-2oz.
Zhao Y N, Price B, Cohen S, et al. Guided image inpainting: replacing an image region by pulling content from another image[EB/OL]. 2018-03-22[2018-06-19]. https://arxiv.org/pdf/1803.08435.pdf.
Zhang Q, Yuan Q Q, Zeng C, et al. Missing data reconstruction in remote sensing image with a unified spatial-temporal-spectral deep convolutional neural network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(8): 4274-4288. [DOI: 10.1109/TGRS.2018.2810208]
Bengio Y. Learning deep architectures for AI[J]. Foundations and Trends® in Machine Learning, 2009, 2(1): 1-127. [DOI: 10.1561/2200000006]
Johnson J, Alahi A, Li F F. Perceptual losses for real-time style transfer and super-resolution[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 694-711. [DOI: 10.1007/978-3-319-46475-6_43]
Wang X L, Gupta A. Generative image modeling using style and structure adversarial networks[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 318-335. [DOI: 10.1007/978-3-319-46493-0_20]
Yoo D, Kim N, Park S, et al. Pixel-level domain transfer[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 517-532. [DOI: 10.1007/978-3-319-46484-8_31]
Zhang R, Isola P, Efros A A. Colorful image colorization[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 649-666. [DOI: 10.1007/978-3-319-46487-9_40]
Badrinarayanan V, Kendall A, Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. [DOI: 10.1109/TPAMI.2016.2644615]
Zhou Y P, Berg T L. Learning temporal transformations from time-lapse videos[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 262-277. [DOI: 10.1007/978-3-319-46484-8_16]
Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation[C]//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer, 2015: 234-241. [DOI: 10.1007/978-3-319-24574-4_28]
Dahl R, Norouzi M, Shlens J. Pixel recursive super resolution[EB/OL]. 2017-02-02[2018-06-19]. https://arxiv.org/pdf/1702.00783.pdf.
Luo W J, Li Y J, Urtasun R, et al. Understanding the effective receptive field in deep convolutional neural networks[C]//Proceedings of the 29th Conference on Neural Information Processing Systems. Barcelona, Spain: NIPS, 2016: 4898-4906.
Doersch C, Singh S, Gupta A, et al. What makes Paris look like Paris?[J]. ACM Transactions on Graphics, 2012, 31(4): #101. [DOI: 10.1145/2185520.2185597]
Liu Z W, Luo P, Wang X G, et al. Deep learning face attributes in the wild[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 3730-3738. [DOI: 10.1109/ICCV.2015.425]
Liu S F, Pan J S, Yang M H. Learning recursive filters for low-level vision via a hybrid neural network[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016: 560-576. [DOI: 10.1007/978-3-319-46493-0_34]
Salimans T, Karpathy A, Chen X, et al. PixelCNN++: improving the PixelCNN with discretized logistic mixture likelihood and other modifications[EB/OL]. 2017-01-19[2018-06-19]. https://arxiv.org/pdf/1701.05517.pdf.