郭冬升, 顾肇瑞, 郑冰, 董军宇, 郑海永(中国海洋大学)
A unified framework with iterative prediction for both image inpainting and image outpainting
GUO DONGSHENG, GU ZHAORUI, ZHENG BING, DONG JUNYU, ZHENG HAIYONG(Ocean University of China)
Objective: Image inpainting and outpainting can both be regarded as the problem of painting unknown regions from known regions, and both are research hotspots in computer vision. Recently, deep learning has become the mainstream approach to image inpainting and outpainting. However, most current solutions treat inpainting and outpainting separately, so a model built for one task is difficult to adapt to the other. Besides, the backbone of these methods is mainly the Convolutional Neural Network (CNN), whose locality makes it difficult to paint long-range content. In this paper, we propose a unified framework that tackles both image inpainting and image outpainting, with a model composed of a CNN and a Transformer based on a divide-and-conquer strategy.
Method: We divide the problem-solving process into three stages: representation, prediction, and synthesis. Representation and synthesis are handled by CNNs, which respectively map images to features and reconstruct images from features, leveraging the locality of CNNs. Prediction is handled by a Transformer, which takes full advantage of its powerful ability to model global context. Furthermore, a mask growth strategy is devised to predict features iteratively, reducing the difficulty of predicting the features of large unknown regions in parallel. Finally, adversarial learning is introduced to improve the fidelity of synthesized images.
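The mask growth strategy can be illustrated with a minimal, framework-agnostic sketch: unknown positions are predicted iteratively, one ring at a time, so that each step only fills positions adjacent to already-known content. The names (`grow_mask`, `schedule`) and the 4-neighbour growth rule are illustrative assumptions, not details from the paper.

```python
def grow_mask(height, width, known):
    """Return a prediction schedule for iterative mask growth: a list of
    position sets, where the i-th set contains the unknown positions exposed
    at iteration i (4-neighbours of the current known region)."""
    known = set(known)
    unknown = {(r, c) for r in range(height) for c in range(width)} - known
    schedule = []
    while unknown:
        # Frontier: unknown positions touching the known region.
        frontier = {
            (r, c) for (r, c) in unknown
            if {(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)} & known
        }
        if not frontier:  # Unknown region disconnected from known content.
            break
        schedule.append(frontier)
        known |= frontier    # Newly predicted positions become known context
        unknown -= frontier  # for the next iteration.
    return schedule
```

Under this view, outpainting starts from a known interior and grows outward, while inpainting starts from a known border and grows inward; either way, each Transformer prediction step conditions on the enlarged known region from the previous step instead of predicting all unknown features in parallel.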
Result: We conduct comprehensive experiments on different datasets covering both objects and scenes for both image inpainting and image outpainting, and the results demonstrate that our method outperforms state-of-the-art methods in terms of various metrics. An ablation study also validates the efficacy of each component of our method, including the framework structure and the mask growth strategy. Moreover, the impact of the number of Transformer layers and heads on performance is studied empirically.
Conclusion: This paper proposes a unified framework with iterative prediction to address the problems of image inpainting and outpainting. The proposed method outperforms both commonly used and state-of-the-art methods, and each part of the design contributes to the performance improvement. This demonstrates the application value and potential of the unified iterative-prediction framework for image inpainting and outpainting.