Deep learning image inpainting combining semantic segmentation reconstruction and edge reconstruction
2022, Vol. 27, No. 12, Pages 3553-3565
Print publication date: 2022-12-16
Accepted: 2021-11-12
DOI: 10.11834/jig.210702
Hongju Yang, Liqin Li, Ding Wang. Deep learning image inpainting combining semantic segmentation reconstruction and edge reconstruction[J]. Journal of Image and Graphics, 2022,27(12):3553-3565.
Objective
Traditional image inpainting methods lack an understanding of high-level image semantics and can only cope with small damaged areas of simple structure and texture. Existing end-to-end deep learning inpainting methods overcome this limitation with the support of large numbers of training images, but because they attempt to restore the entire target under insufficient constraints, the repaired images often suffer from blurred boundaries and distorted structures. To address this, this paper proposes a deep learning image inpainting method jointly guided by semantic segmentation structure and edge structure.
Method
The method decomposes the image inpainting task into three stages: semantic segmentation reconstruction, edge reconstruction, and content completion. It first reconstructs the semantic segmentation structure of the missing region, then uses the reconstructed segmentation structure to guide the reconstruction of the edge structure of the missing region, and finally uses the reconstructed segmentation and edge structures jointly to guide the completion of the image content in the missing region, as sketched below.
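The cascade can be pictured with the following minimal PyTorch sketch. The module names, input channel layouts, and mask convention (1 inside the hole) are illustrative assumptions on our part, not the authors' released implementation.

```python
# Hypothetical sketch of the three-stage cascade described above.
import torch
import torch.nn as nn

class ThreeStageInpainter(nn.Module):
    def __init__(self, seg_net: nn.Module, edge_net: nn.Module,
                 content_net: nn.Module):
        super().__init__()
        self.seg_net = seg_net          # stage 1: segmentation reconstruction
        self.edge_net = edge_net        # stage 2: edge reconstruction
        self.content_net = content_net  # stage 3: content completion

    def forward(self, masked_img, mask, masked_seg, masked_edge):
        # Stage 1: complete the segmentation map inside the hole (mask = 1).
        seg = self.seg_net(torch.cat([masked_seg, mask], dim=1))
        # Stage 2: complete the edge map, conditioned on the completed
        # segmentation from stage 1.
        edge = self.edge_net(torch.cat([masked_edge, seg, mask], dim=1))
        # Stage 3: fill the image content under joint guidance of the
        # completed segmentation and edge structures.
        out = self.content_net(torch.cat([masked_img, seg, edge, mask], dim=1))
        # Known pixels are kept from the input; only the hole is synthesized.
        return masked_img * (1 - mask) + out * mask
```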
Result
The proposed method is compared with other state-of-the-art inpainting methods on the CelebAMask-HQ (CelebFaces attributes mask high quality) face dataset and the Cityscapes urban-scene dataset. With mask ratios of 50%~60%, compared with the second-best method, our method reduces the mean absolute error by 4.5% and improves the peak signal-to-noise ratio by 1.6% and the structural similarity by 1.7% on CelebAMask-HQ; on Cityscapes, it reduces the mean absolute error by 4.2% and improves the peak signal-to-noise ratio by 1.5% and the structural similarity by 1.9%. The results show that our method outperforms the compared methods on all three metrics, and the generated images have clear boundaries and are visually more plausible.
Conclusion
Under the joint guidance of semantic segmentation structure and edge structure, the proposed three-stage inpainting method effectively reduces structural reconstruction errors. When the repair involves large missing areas, it achieves higher inpainting quality than existing methods.
Objective
Image inpainting reconstructs the missing regions of damaged images. The technique is widely used in scenarios such as image editing, image denoising, and cultural relics preservation. Conventional inpainting methods fill missing pixels with patches sampled from the known region, or propagate pixels into the missing region via a diffusion mechanism. These methods perform well on regular textures and small defects. However, because they lack a semantic understanding of the image, the generated results often exhibit semantically inconsistent, non-photorealistic structures when large holes must be filled. Deep learning-based inpainting methods can learn high-level semantic information from large amounts of data. Although these methods have made significant progress in image inpainting, they often fail to reconstruct plausible structures: they attempt to restore the entire target without sufficient constraints, so the generated images frequently suffer from blurred boundaries and distorted structures.
Method
Our research develops a deep image inpainting method jointly guided by semantic segmentation and edges. It divides the inpainting task into three steps: 1) semantic segmentation reconstruction, 2) edge reconstruction, and 3) content restoration. First, the semantic segmentation reconstruction module reconstructs the segmentation structure of the missing area. Then, the reconstructed segmentation structure guides the reconstruction of the edge structure of the missing area. Finally, the reconstructed segmentation and edge structures jointly guide the restoration of the missing content. Semantic segmentation represents the global structural information of the image well: 1) reconstructing the segmentation structure improves the accuracy of the subsequent edge reconstruction; 2) edges carry rich structural information, so reconstructing the edge structure helps generate finer details inside objects; 3) under the joint guidance of the reconstructed segmentation and edge structures, content restoration produces images with clear boundaries, more reasonable structures, and more realistic textures.
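Training such a pipeline requires ground-truth edge maps as targets. The abstract does not name the edge detector, so the snippet below is only one plausible preprocessing choice (Canny, as commonly used in edge-guided inpainting); the thresholds are arbitrary assumptions.

```python
# Hypothetical preprocessing: derive a binary edge target from the
# ground-truth image with Canny (detector and thresholds are our
# assumptions, not confirmed details of the paper).
import cv2
import numpy as np

def edge_target(img_bgr: np.ndarray, lo: int = 100, hi: int = 200) -> np.ndarray:
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, lo, hi)          # uint8 map, 255 on edges
    return (edges > 0).astype(np.float32)    # binary {0, 1} edge target
```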
The network is based on a generative adversarial network (GAN), comprising a generator and a discriminator. The generator uses an encoder-decoder structure, and the discriminator uses a 70 × 70 PatchGAN structure.
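A 70 × 70 PatchGAN scores overlapping patches rather than the whole image, so each output logit has a 70 × 70 receptive field. Below is a minimal pix2pix-style sketch; the channel widths and normalization choice are assumptions, not confirmed details of this paper.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """70 x 70 PatchGAN: outputs a grid of real/fake logits, each with a
    70 x 70 receptive field (illustrative pix2pix-style sketch)."""
    def __init__(self, in_ch: int = 3, base: int = 64):
        super().__init__()
        def block(cin, cout, stride):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=4, stride=stride, padding=1),
                nn.InstanceNorm2d(cout),
                nn.LeakyReLU(0.2, inplace=True))
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1),   # no norm on the first layer
            nn.LeakyReLU(0.2, inplace=True),
            block(base, base * 2, 2),
            block(base * 2, base * 4, 2),
            block(base * 4, base * 8, 1),
            nn.Conv2d(base * 8, 1, 4, 1, 1))   # per-patch real/fake logits

    def forward(self, x):
        return self.net(x)
```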
A joint loss is adopted in each of the three steps, driving the intermediate result of every step toward the ground truth. The semantic segmentation and edge reconstruction modules use an adversarial loss and a feature matching loss; our feature matching loss also includes an L1 term. Feature matching loss is similar to perceptual loss and relaxes the requirement for a single exact ground-truth segmentation or edge structure. The content restoration module additionally uses a perceptual loss and a style loss, where the style loss reduces the "checkerboard" artifacts caused by transposed convolution layers.
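The feature matching and style losses can be sketched as follows. Here `feats_real`/`feats_fake` are lists of intermediate discriminator (or VGG) activations; the layer choices and weighting are assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def feature_matching_loss(feats_real, feats_fake):
    # L1 distance between intermediate discriminator activations for the
    # real and generated images, averaged over layers.
    losses = [F.l1_loss(f, r.detach()) for r, f in zip(feats_real, feats_fake)]
    return sum(losses) / len(losses)

def gram_matrix(feat):
    # Channel-wise feature correlations, normalized by the feature size.
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(vgg_feats_real, vgg_feats_fake):
    # Match Gram matrices of (e.g. VGG) features; the standard style loss
    # used to suppress checkerboard artifacts from transposed convolutions.
    return sum(F.l1_loss(gram_matrix(f), gram_matrix(r))
               for r, f in zip(vgg_feats_real, vgg_feats_fake))
```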
Result
First, we analyze the performance of the semantic segmentation reconstruction module quantitatively and qualitatively. The results show that the module reconstructs plausible segmentation structures: with small masks the pixel accuracy reaches 99.16%, and even with large masks it still reaches 92.64%. Next, we compare the edge reconstruction results quantitatively; the precision and recall of the reconstructed edge structure improve further under the guidance of the segmentation structure.
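The quantities reported here can be computed with a straightforward NumPy sketch, assuming integer label maps for segmentation and binary edge maps.

```python
import numpy as np

def pixel_accuracy(pred_labels: np.ndarray, gt_labels: np.ndarray) -> float:
    # Fraction of pixels whose predicted class equals the ground truth.
    return float((pred_labels == gt_labels).mean())

def edge_precision_recall(pred_edge: np.ndarray, gt_edge: np.ndarray):
    # pred_edge, gt_edge: binary maps with 1 on edge pixels.
    tp = float(np.logical_and(pred_edge == 1, gt_edge == 1).sum())
    precision = tp / max(float((pred_edge == 1).sum()), 1.0)
    recall = tp / max(float((gt_edge == 1).sum()), 1.0)
    return precision, recall
```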
Finally, the proposed method is compared with four popular inpainting methods on the CelebAMask-HQ (CelebFaces attributes mask high quality) and Cityscapes datasets. When the mask ratio is 50%~60%, compared with the second-best method, the mean absolute error (MAE) on CelebAMask-HQ is reduced by 4.5%, the peak signal-to-noise ratio (PSNR) is increased by 1.6%, and the structural similarity index measure (SSIM) is increased by 1.7%; on Cityscapes, the MAE is reduced by 4.2%, the PSNR is increased by 1.5%, and the SSIM is increased by 1.9%. Our method performs best on all three metrics (MAE, PSNR, and SSIM), and the generated images have clearer boundaries and better visual quality.
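The three metrics can be reproduced with scikit-image; the snippet below assumes uint8 RGB images and scikit-image ≥ 0.19 (for the `channel_axis` argument).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(pred: np.ndarray, gt: np.ndarray):
    # pred, gt: uint8 RGB images of identical shape.
    mae = np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64)))
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    return mae, psnr, ssim
```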
Conclusion
Our three-step inpainting method introduces the guidance of semantic segmentation structure, which significantly improves the accuracy of edge reconstruction. In addition, the joint guidance of segmentation and edge structures effectively reduces structural reconstruction errors. The method achieves higher inpainting quality than existing approaches on inpainting tasks with large missing areas.
image inpainting; generative adversarial network (GAN); semantic segmentation; edge detection; deep learning