
Yang Haozhong, Shu Wentong, Wang Miao (Beihang University)

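Abstract
With the development of information technology, mixed reality (MR) technology has been applied in many fields, including healthcare, education, and assisted guidance. MR scenes contain rich semantic information, and MR techniques based on scene context can improve users' perception of the scene, optimize their interactions, and raise the accuracy of interaction models, so they have quickly attracted wide attention. However, no survey in this field has yet investigated contextual information, and the area lacks systematic organization and classification. This paper studies MR technologies and systems that use contextual information. Through a literature survey of the MR field, we pose three research questions and analyze 33 international empirical research papers from the past 20 years. We outline the latest developments in MR technology that uses contextual information, conduct a taxonomic study along three dimensions, and propose classification criteria for each: the types of contextual information, the construction methods of contextual knowledge bases, and the application domains. Contextual information is divided into six types: scene semantics, object semantics, spatial relationships, group relationships, dependence relationships, and motion information. Knowledge base construction is classified by user intervention and by base technology type, and application domains are classified by scene type and generative characteristics. By classifying the research objects along these dimensions, we answer the research questions and summarize current shortcomings and possible future research directions. This review can help researchers in different fields design, select, and evaluate contextual information, thereby promoting the development of future MR application technologies and systems.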
A literature review of contextual information construction and applications in mixed reality

Yang Haozhong, Shu Wentong, Wang Miao (Beihang University)

With the development of information technology, Mixed Reality (MR) technology has been applied in various fields such as healthcare, education, and assisted guidance. MR scenes contain rich semantic information, and MR technology based on scene context information can improve users' perception of the scene, optimize user interaction, and enhance the accuracy of interaction models. As a result, it has quickly gained widespread attention. However, no literature review has yet specifically investigated context information in this field, and the topic lacks organization and classification. This paper focuses on MR technology and systems that utilize context information. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. First, search keywords were determined based on three factors: research domain, study subjects, and research scenarios. Searches were then performed in two influential databases in the field of MR: ACM Digital Library and IEEE Xplore. A preliminary screening by journal and conference type eliminated irrelevant and unpublished literature. The titles and abstracts of the remaining articles were then reviewed in turn to remove duplicates and irrelevant results. Finally, a total of 210 articles were individually screened, and 29 papers were selected for the review. Four more articles were included based on expertise, resulting in a total of 33 articles for the review. Through this literature review of MR databases, three research questions were formulated, and a dataset of research articles was established. The three research questions addressed in this paper are as follows: 1) What are the different types of scene context? 2) How is scene context organized in various MR technologies and systems? 3) What are the application areas of empirical research?
Based on the evolution of scene context and the refinement of MR technologies and systems, we analyze empirical research papers spanning nearly 20 years. This involves summarizing previous research and providing an overview of the latest developments in systems that leverage scene context. We also propose potential classification criteria, such as types of scene context, construction methods of knowledge bases for contextual information, fundamental technologies, and application domains. We categorize scene context into six classes: scene semantics, object semantics, spatial relationships, group relationships, dependence relationships, and motion information. Scene semantics is the semantic information carried by the various elements of the scene environment, including objects, characters, and texture information. Object semantics covers information about an individual object itself, such as user information, type, attributes, and special content. Spatial relationships refer to numerical information such as the relative position, angle, or arrangement of objects in the scene. We analyze spatial relationships in three ways: base spatial relationships, micro-scene spatial information, and real-scene spatial information. We treat a certain number of closely neighboring objects of the same category as a group. Group relationships focus on information from the overall perspective, such as intergroup relations and the number of groups. Dependence relationships concern the dependencies and affiliations that may exist between different objects in the scene at the functional and physical levels. Motion information is a new type of scene context that describes the dynamic information of scene objects and includes basic motion information and special motion information.
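As a rough illustration, the six categories above could be encoded as a small data structure. The following Python sketch is not taken from any of the reviewed systems: the names `ContextType`, `SceneObject`, and `relative_position_and_angle` are hypothetical, and the spatial-relationship computation assumes simple 2D positions, whereas real MR systems would typically work with 3D poses.

```python
import math
from dataclasses import dataclass
from enum import Enum


class ContextType(Enum):
    """The six scene-context categories named in this review."""
    SCENE_SEMANTICS = "scene semantics"
    OBJECT_SEMANTICS = "object semantics"
    SPATIAL_RELATIONSHIP = "spatial relationship"
    GROUP_RELATIONSHIP = "group relationship"
    DEPENDENCE_RELATIONSHIP = "dependence relationship"
    MOTION_INFORMATION = "motion information"


@dataclass
class SceneObject:
    """Hypothetical scene object: a category label and a 2D position."""
    category: str
    x: float
    y: float


def relative_position_and_angle(a: SceneObject, b: SceneObject):
    """Return (distance, bearing in degrees) from object a to object b.

    An assumed, minimal form of the 'relative position and angle'
    spatial relationship described in the text.
    """
    dx, dy = b.x - a.x, b.y - a.y
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))
```

For example, calling `relative_position_and_angle` on a chair at the origin and a table at (3, 4) yields a distance of 5 together with the bearing from the chair to the table.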
Through an analysis of how the various types of scene context are used, we establish the relationship between research objectives and contextual information, providing guidance on the selection of contextual information. The construction of knowledge bases is examined from the perspectives of user intervention and fundamental technology type. Knowledge bases established with user intervention typically rely on researchers' abstract analysis of scene objects rather than pre-existing databases. Conversely, knowledge bases built without user intervention rely on existing information, such as low-level raw data in databases or predefined scenarios. The underlying technologies are categorized into Virtual Reality (VR) and Augmented Reality (AR). Classifying along the dual perspectives of user intervention and fundamental technology facilitates a deeper understanding of how contextual information is organized in various MR systems. Application areas are investigated based on the type of scenario and on whether a generative process is involved. Application scenarios are categorized into six types: auxiliary guidance, AR annotation, scene reconstruction, medical treatment, object manipulation, and general purpose. Generative models can automatically generate target information, such as AR annotated shadows based on the scene, whereas non-generative models mainly focus on specific operations. Analysis from these two perspectives makes it possible to explore the advantages and disadvantages of MR systems and technologies in different application scenarios. Drawing upon the research in these three dimensions, we investigate the challenges associated with selecting, acquiring, and applying contextual information in MR scenarios. By classifying the research objects along different dimensions, we address the research questions and identify current shortcomings and future research directions.
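The classification dimensions described above can be sketched as a record type, one instance per reviewed paper. This is only an illustrative assumption about how such a dataset might be organized, not the review's actual tooling: `PaperClassification`, `BaseTechnology`, `Scenario`, and `count_by_scenario` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class BaseTechnology(Enum):
    VR = "virtual reality"
    AR = "augmented reality"


class Scenario(Enum):
    """The six application scenario types listed in the text."""
    AUXILIARY_GUIDANCE = "auxiliary guidance"
    AR_ANNOTATION = "AR annotation"
    SCENE_RECONSTRUCTION = "scene reconstruction"
    MEDICAL_TREATMENT = "medical treatment"
    OBJECT_MANIPULATION = "object manipulation"
    GENERAL_PURPOSE = "general purpose"


@dataclass(frozen=True)
class PaperClassification:
    """One reviewed paper placed along the classification dimensions."""
    title: str
    context_types: frozenset   # e.g. frozenset({"spatial relationship"})
    user_intervention: bool    # knowledge base built with user intervention?
    technology: BaseTechnology
    scenario: Scenario
    generative: bool           # does the system generate target information?


def count_by_scenario(papers):
    """Tally papers per application scenario (zero for unused scenarios)."""
    counts = {s: 0 for s in Scenario}
    for p in papers:
        counts[p.scenario] += 1
    return counts
```

Tallying records with `count_by_scenario` gives a per-scenario count, which is one simple way to compare how often each application area appears in a set of classified papers.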
The aim of this review is to support researchers across diverse fields in designing, selecting, and evaluating scene context, ultimately fostering the advancement of future MR application technologies and systems.