A review of research on deep learning image composition

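Ye Guosheng1, Wang Jianming1,2, Yang Zizhong2, Zhang Yuhang1, Cui Rongkai1, Xuan Shuai1 (1. School of Mathematics and Computer Science, Dali University, Dali 671003; 2. Yunnan Provincial Key Laboratory of Entomological Biopharmaceutical R&D (Dali University), Dali 671000)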

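Abstract
Image composition has long been a research hotspot in image processing and has broad application prospects. Its basic goal is to accurately extract the foreground object from an original image and composite it with a new background so that the result is as close to a real image as possible. To promote research on and development of deep-learning-based image composition, this paper discusses the main problems faced in current image composition tasks: 1) the foreground object adaptation problem, which includes geometric consistency problems, such as the size, position, and geometric angle of the foreground object relative to the background image, and appearance consistency problems, such as mutual occlusion between foreground and background and blurred edge details of the foreground object; 2) the visual harmonization problem, which includes tone consistency problems, where the color, contrast, saturation, and similar properties of the foreground and background are not unified, and the missing shadow problem, where the foreground object lacks its corresponding shadow; 3) the habitat adaptation problem, which concerns the logical plausibility of the foreground object with respect to the background image. The paper summarizes the main deep learning methods currently used to address each problem, reviews how the quality of composite images is evaluated for each problem together with the corresponding evaluation metrics, introduces the public datasets used for each problem, compares the deep learning methods, and describes the main application scenarios of image composition. Finally, it analyzes the remaining shortcomings of deep-learning-based image composition, offers feasible research suggestions, and gives an outlook on future development directions.
Keywords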
Survey of image composition based on deep learning

Ye Guosheng1, Wang Jianming1,2, Yang Zizhong2, Zhang Yuhang1, Cui Rongkai1, Xuan Shuai1(1.School of Mathematics and Computer Science, Dali University, Dali 671003, China;2.Yunnan Provincial Key Laboratory of Entomological Biopharmaceutical R&D, Dali University, Dali 671000, China)

Abstract
Image composition has long been a research hotspot in image processing and has a wide range of application prospects. The process involves accurately extracting the foreground objects from an image and compositing them with a new background image. Traditional image composition methods, however, are often time consuming and labor intensive: users must manually extract the foreground objects accurately and place them reasonably, and must also adjust the lighting, saturation, edge details, shadows, and other properties of the foreground objects so that the result looks close to a real image. With the development of deep learning in recent years, image composition technology has been applied increasingly widely and has demonstrated its efficiency. To promote research on and development of deep-learning-based image composition, this paper expounds four main problems faced in current image composition tasks. First, the foreground object adaptation problem mainly involves adjusting the size of foreground objects, placing them spatially, handling their blurred edge details, and resolving unreasonable mutual occlusion between foreground and background. Current deep learning methods for this problem include predicting an appropriate bounding box for the foreground object in the background image, spatial transformer networks, foreground object location distribution prediction and adversarial training, image fusion, and guided placement based on domain information. Second, the foreground object harmonization problem mainly concerns the mismatch in visual properties, such as illumination, color, saturation, and contrast, between the foreground and background after composition.
Current deep learning methods for this problem include attention-based guidance mechanisms, verification and discrimination methods based on domain information, encoder-decoder architectures, the ability of Transformers to model contextual dependencies, auxiliary high dynamic range (HDR) input, and methods borrowed from related tasks such as style transfer. Third, the foreground object shadow harmonization problem mainly involves generating the missing shadows of foreground objects in composite images. Current deep learning methods for this problem include image-rendering-based methods, shadow generation with generative adversarial networks, methods that rely on the ambient lighting of the background as an auxiliary cue, and attention-based mechanisms. Fourth, the habitat adaptation problem between the foreground object and the background mainly concerns the matching of biological information, which should be considered when compositing foreground objects with background images. Whether foreground objects such as animals and plants can plausibly appear in a given background is the first question to ask in an image composition task. The background chosen for an object cannot deviate from that object's habitat. For instance, seagulls do not appear in the desert, and flowers do not grow from ice and snow. The foreground object adaptation problem can be regarded as the key problem in image composition: as long as the foreground objects are composited correctly and reasonably, the subsequent optimization of the composite image can be performed efficiently. Effectively solving the visual harmonization problem of foreground objects further improves the realism of composite images as perceived by users. The most important problem to consider, however, is the adaptation of the foreground to the background habitat.
Objects and background images cannot be chosen arbitrarily; they need to satisfy the logical relationships of reality, that is, habitat adaptation, which can be regarded as the primary task in image composition. If the habitat information does not fit, the foreground object and the background scene lose their logical authenticity, and no subsequent step can make the composite image realistic. This paper summarizes the current deep learning methods, publicly available datasets, and evaluation metrics for each of the above problems, compares the different deep learning methods, and introduces the applications of image composition technology. Composite images not only reduce the cost of acquiring real data but also improve the generalization ability of models. The shortcomings of deep-learning-based image composition are also analyzed, feasible research suggestions are put forward, and future development directions of image composition technology are discussed.
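As a minimal illustration of the basic operation behind the problems listed above, the following sketch pastes a foreground onto a background with a soft mask and then applies a naive mean-color shift. It is not one of the deep learning methods surveyed in the paper; the function names `composite` and `match_mean_color`, the assumption that images are floating-point arrays in [0, 1], and the mean-shift heuristic are all illustrative assumptions.

```python
import numpy as np

def composite(foreground, alpha, background, top, left):
    """Paste `foreground` onto `background` at (top, left) using a soft mask.

    foreground: (h, w, c) float array in [0, 1] (assumed format)
    alpha:      (h, w) float mask in [0, 1], 1 = foreground pixel
    background: (H, W, c) float array in [0, 1]
    """
    out = background.astype(np.float64).copy()
    h, w = alpha.shape
    region = out[top:top + h, left:left + w]
    a = alpha[..., None]  # broadcast the mask over the color channels
    out[top:top + h, left:left + w] = a * foreground + (1.0 - a) * region
    return out

def match_mean_color(foreground, alpha, background):
    """Naive tone adjustment (illustrative heuristic, not a learned method):
    shift the foreground so its masked mean color equals the background mean."""
    a = alpha[..., None]
    fg_mean = (a * foreground).sum(axis=(0, 1)) / max(alpha.sum(), 1e-8)
    bg_mean = background.reshape(-1, background.shape[-1]).mean(axis=0)
    return np.clip(foreground + (bg_mean - fg_mean), 0.0, 1.0)
```

In this sketch, `top` and `left` stand in for the placement decision (geometric consistency), the soft `alpha` mask stands in for edge handling, and the color shift stands in for harmonization; the surveyed methods replace each of these hand-set choices with a learned model.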
Keywords
