Virtual manufacturing environments require complex and accurate three-dimensional (3D) human-computer interaction (HCI). The main problem of current virtual environments (VEs) is the heavy burden placed on the user in both cognitive and motor operations,
together with the need to improve HCI efficiency. This problem is addressed by promoting the cognitive capability of the machine. This study investigates how user intents are analyzed and abstracted,
and constructs multimodal intent understanding algorithms. Intent-based VE construction is practiced in a virtual assembly system. Experiments on typical intents are conducted and analyzed. A comprehensive evaluation of the usability and reliability of multimodal intent understanding is presented,
and the intent-based VE system is demonstrated to be a real-time system. The experiment focuses on the intent of object picking in the VE. When the distance between the 3D cursor and the object is 5 000 mm,
the operation takes 5.344 7 s on average in traditional systems,
whereas it takes 2.326 6 s on average in intent-based systems. The intent-based system significantly reduces operation time and manipulation complexity. Intent-driven scenario transition can significantly enhance the naturalness and efficiency of HCI,
as well as effectively reduce the complexity of human-centered VE system analysis and development. Application of intent understanding demonstrates that multimodal intent models and algorithms can efficiently promote the naturalness and efficiency of HCI. This system construction method can be used in any VE system.
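
To put the reported picking times in proportion, the following minimal sketch computes the absolute and relative time saving implied by the two averages quoted above. Only the two mean times come from the text; the function name `time_saving` and the constant names are illustrative assumptions, and the calculation says nothing about variance or statistical significance, which the abstract does not report.

```python
# Mean object-picking times quoted in the abstract
# (3D cursor to object distance of 5 000 mm).
TRADITIONAL_S = 5.3447   # traditional system, seconds
INTENT_BASED_S = 2.3266  # intent-based system, seconds


def time_saving(baseline_s, improved_s):
    """Return (absolute saving in s, relative saving as a fraction, speedup factor).

    Hypothetical helper for illustration; not part of the described system.
    """
    saved = baseline_s - improved_s
    return saved, saved / baseline_s, baseline_s / improved_s


saved, fraction, speedup = time_saving(TRADITIONAL_S, INTENT_BASED_S)
print(f"saved {saved:.4f} s ({fraction:.1%}), speedup {speedup:.2f}x")
```

Under these figures the intent-based system saves about 3.02 s per picking operation, roughly a 56% reduction in mean operation time.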