A survey of research methods for active geometric modeling of objects

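Kong Yanzi1,2,3,4,5, Zhu Feng1,2,4,5, Hao Yingming1,2,4,5, Wu Qingxiao1,2,4,5, Lu Rongrong1,2,3,4,5 (1. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016; 2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016; 3. University of Chinese Academy of Sciences, Beijing 100049; 4. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016; 5. Key Laboratory of Image Understanding and Computer Vision, Liaoning Province, Shenyang 110016)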

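Abstract
Objective Object modeling is one of the main research directions in machine vision. Active object modeling is an intelligent perception method that, while guaranteeing the completeness of the model, achieves object modeling with fewer viewpoints and shorter motion paths by adjusting the camera pose parameters in a planned manner. To reflect the research status and latest progress of active object modeling, the relevant literature since 2004 is reviewed and analyzed, and research methods at home and abroad are summarized. Method Taking the type of reconstructed model and the information used for viewpoint planning as the classification criteria, model-free active object modeling methods are divided into three categories: surface-based methods, search-based methods, and methods that combine the two. The survey focuses on the first two categories. The basic idea of each category is explained first, and the problems it involves are summarized. The main research methods for these problems are then reviewed and analyzed. Finally, the solutions to the individual problems are combined appropriately to form different active object modeling methods, and the advantages and limitations of each category are summarized. Result Active object modeling algorithms differ in their range of applicable scenarios, computational complexity, and other aspects, but compared with traditional passive object modeling methods, current active algorithms can already greatly improve the quality of the modeling task and reduce its cost. Conclusion Surface-based active object modeling methods are conceptually simple but are suitable only for objects with simple surfaces. Search-based methods can evaluate every candidate viewpoint quantitatively and are widely applicable; the problems they involve have a larger solution space than those of surface-based methods, so they have produced more research results. Combining the different research methods for the problems involved in the two categories yields different subclasses of active object modeling methods.
Keywords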
Active geometric reconstruction methods for objects: a survey

Kong Yanzi1,2,3,4,5, Zhu Feng1,2,4,5, Hao Yingming1,2,4,5, Wu Qingxiao1,2,4,5, Lu Rongrong1,2,3,4,5(1.Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China;2.Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, China;3.University of Chinese Academy of Sciences, Beijing 100049, China;4.Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China;5.Key Laboratory of Image Understanding and Computer Vision, Shenyang 110016, China)

Abstract
Objective Target modeling is one of the main research directions in machine vision and is widely applied in many fields. When the geometry of an object is modeled, the data obtained from a single viewpoint are often incomplete, and large regions may be missing. Obtaining information about the target from different viewpoints and fusing it is therefore necessary to achieve complete geometric modeling of the target. Active object reconstruction is an intelligent perception method that, while ensuring model completeness, achieves target modeling with fewer viewpoints and shorter motion paths by systematically adjusting the pose parameters of the camera. To reflect the research status and latest developments in active object reconstruction, relevant studies published since 2004 are reviewed and analyzed, and domestic and international research methods are summarized. Method At present, active object reconstruction mainly addresses two task types: model-based and model-free active object reconstruction. Model-based methods pre-plan a series of viewpoints before modeling and can achieve high-quality full coverage of the target. Model-free methods have no prior information about the target, and view planning is performed in real time during modeling. In practical applications, the second type occurs more frequently and is more difficult; thus, this study surveys only model-free methods. On the basis of the type of reconstructed model and the information used during view planning, model-free active object reconstruction methods are divided into three categories: surface-based, search-based, and combined methods. The basic idea of each category is explained, and the problems involved are summarized. Surface-based methods use point cloud and triangular patch models. They extract shape information from the local model obtained so far and classify the shape of the unknown region to determine the next viewpoint. Search-based methods use voxel models.
Candidate viewpoints are first determined, and each is then scored by an evaluation function. The candidate viewpoint with the highest score is used as the next best view. Combined methods use both surface and voxel models and merge the advantages of the two approaches to provide effective information for view planning. However, combined methods have received little attention recently, and research has focused mainly on the first two categories. Surface-based methods involve the problems of detection direction determination, unknown surface prediction, and next-best-view determination. Search-based methods involve the problems of model type selection, search space determination, undetected area prediction, and design of the evaluation function used to rank candidate viewpoints. The main research methods for these problems are summarized and analyzed, and the solutions to each problem are combined to form different active object reconstruction methods. Result In surface-based active object reconstruction methods, the way the detection direction is determined and the unknown area is predicted has an important impact on the effectiveness of view planning. When an edge point is selected to determine the detection direction, expressing the unknown region with a quantitative indicator is more reliable than expressing it by spatial position, but its computational complexity is higher. In addition, predicting an unknown surface with an indirect method may be simpler than with a direct method, but it results in larger fitting errors. In general, surface-based methods are relatively simple, and each view planning step takes little time. However, because the unknown region is predicted from the trend of its adjacent surface, these methods are suitable only for reconstructing objects with regular shapes.
Search-based active object reconstruction methods quantitatively evaluate each candidate view. With regard to model type, the OctoMap model is more efficient than other probabilistic voxel models. Selecting candidate viewpoints in a dynamic search space has higher computational complexity than selecting them in a fixed search space, but dynamic methods place no limitation on the target size, and their range of application scenarios is wider. The information contained in an unknown voxel can be predicted from its positional relationship to known voxels; next-view planning that uses this prediction exploits the known information more fully than planning that leaves unknown voxels un-updated. When the evaluation function is designed, an information gain term may be included, and terms that optimize the overlap ratio of adjacent frames, the distance between neighboring viewpoints, and the quality of the reconstructed surface may be added as needed. The information gain of a viewpoint is obtained by accumulating the voxel gain within its field of view, so differences in how voxel gain is calculated and accumulated directly affect the information gain value of the viewpoint. With search-based methods, next-view planning works well, but the process is time consuming. Moreover, the problems involved in such methods have a larger solution space than those involved in surface-based methods, so search-based active object reconstruction has produced more research results. However, such methods are relatively computationally intensive, and in most cases the viewpoints do not vary continuously within the search space, and point cloud registration is not considered. Conclusion Research on active object reconstruction has made some progress, but the accuracy and efficiency of active reconstruction can still be improved.
Feasible research directions are proposed at the end as a reference for future work, such as introducing prior information into view planning, combining surface- and search-based methods, and building perceptual intelligence systems that are suited to different tasks.
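The search-based scheme described in the abstract (score each candidate viewpoint with an evaluation function, then take the highest-scoring one as the next best view) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the surveyed methods: the voxel gain is taken to be the Shannon entropy of an occupancy probability, visibility is a simple viewing cone that ignores occlusion, the only optional term is a travel-distance penalty, and the names `entropy`, `in_view`, `information_gain`, `next_best_view`, and `travel_weight` are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (bits) of an occupancy probability; 0.5 means unknown."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def in_view(viewpoint, target, voxel, half_fov_rad, max_range):
    """True if the voxel centre lies inside the viewing cone.

    Assumption: the camera looks from `viewpoint` toward `target`,
    and occlusion between voxels is ignored.
    """
    to_voxel = [v - c for v, c in zip(voxel, viewpoint)]
    axis = [t - c for t, c in zip(target, viewpoint)]
    dist = math.hypot(*to_voxel)
    axis_len = math.hypot(*axis)
    if dist == 0.0 or axis_len == 0.0 or dist > max_range:
        return False
    cos_angle = sum(a * b for a, b in zip(to_voxel, axis)) / (dist * axis_len)
    return cos_angle >= math.cos(half_fov_rad)


def information_gain(viewpoint, target, voxels, half_fov_rad, max_range):
    """Accumulate the voxel gain (entropy) over all voxels in the field of view."""
    return sum(
        entropy(p)
        for centre, p in voxels.items()
        if in_view(viewpoint, target, centre, half_fov_rad, max_range)
    )


def next_best_view(current, candidates, target, voxels,
                   half_fov_rad=math.radians(30), max_range=3.0,
                   travel_weight=0.1):
    """Return the candidate with the highest score.

    Score = information gain minus a penalty on the distance travelled
    from the current viewpoint (one of the optional terms).
    """
    def score(candidate):
        gain = information_gain(candidate, target, voxels,
                                half_fov_rad, max_range)
        return gain - travel_weight * math.dist(current, candidate)

    return max(candidates, key=score)


# Toy scene: one nearly certain voxel on the +x side, two unknown on the -x side.
voxels = {(1.0, 0.0, 0.0): 0.97, (-1.0, 0.0, 0.0): 0.5, (-1.0, 0.2, 0.0): 0.5}
candidates = [(3.0, 0.0, 0.0), (-3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
best = next_best_view((3.0, 0.0, 0.0), candidates, (0.0, 0.0, 0.0), voxels)
print(best)  # (-3.0, 0.0, 0.0): the view facing the unknown voxels
```

In this toy scene the candidate facing the two unknown voxels wins even after the travel penalty, because its accumulated voxel gain is much larger. A practical system would replace the cone test with ray casting through the voxel model so that occluded voxels are handled, and would choose the voxel gain and the optional terms to suit the task.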
Keywords
