A high-order graph convolutional network for motion retargeting between homeomorphic, heterogeneous skeletons

Jia Wei (School of Computer Science and Information Engineering, Hefei University of Technology)

Abstract
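Objective Skeletal motion retargeting takes the skeletal motion data of a source character, modifies it, and applies it to a target character with a different skeleton structure, so that the target character performs the same action as the source character. Because skeletal motion data is tightly coupled to skeleton structure, a retargeting algorithm must separate from the motion data the features that are independent of skeleton structure and represent only the type of action. When the source and target characters have different skeleton structures and their motion patterns (for example, ranges of joint-angle variation) differ considerably, feature separation becomes harder and the retargeting network becomes harder to train. To address this problem, this paper proposes a feature separation method and a high-order skeletal convolution operator. Method In the data processing stage, a portion of the features that are independent of skeleton structure is first separated from the motion data, which lowers the training difficulty of the retargeting network and yields better retargeting results. In addition, drawing on graph convolutional networks, this paper proposes a high-order skeletal convolution operator for human skeleton structures. With this operator, the network model can capture more information about skeleton structure during feature separation, improving the accuracy and visual quality of the retargeting results. Result In heterogeneous retargeting on the synthetic animation dataset Mixamo, the proposed method improves retargeting accuracy by 38.6% over the latest method. The method also applies to homogeneous retargeting, where accuracy improves by 74.8% over the latest method. In heterogeneous retargeting from motion data captured from real people to virtual animated characters, the proposed method clearly reduces retargeting errors compared with the latest method, and the retargeting results have higher visual quality. Conclusion Compared with the latest method, the proposed method lowers the difficulty of feature separation and exploits the structural information of the skeleton more fully, so that the retargeting results have lower error and the motion is more natural and plausible.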
Keywords
A high order graph convolutional network for homomorphic and heterogeneous skeletal motion retargeting

Jia Wei

Abstract
Objective Skeletal motion retargeting adapts the skeletal motion data of a source character, after suitable modification, to a target character with a different skeleton structure, so that the target character performs the same actions as the source. The technique matters in animation production and game development because it increases the reuse of existing motion data and reduces the need to create new motion data from scratch. Skeletal motion data is tightly coupled to a character's skeleton structure, so the core challenge in retargeting is to extract from the motion data the features that are independent of the source skeleton and express only the pattern of the action. The difficulty increases markedly when the source and target characters come from distinct datasets, for example when motion capture data from real human subjects is transferred to virtual animated characters with heterogeneous skeletal structures. Differences between such datasets extend beyond the skeletons themselves and may include inconsistent capture equipment, physiological variation among individuals, and different environments in which the actions were performed. Together, these factors produce significant discrepancies between source and target characters in global movement range, joint angle variation range, and other motion attributes, which makes retargeting considerably harder. This paper therefore addresses data heterogeneity, with the aim of precise motion retargeting from real human motion data to virtual animated characters whose skeletons are heterogeneous yet topologically equivalent. To this end, we propose a feature separation strategy and a high-order skeletal convolution operator.
Method During data preprocessing, feature separation is first performed on the motion data to isolate the components that are independent of the skeleton structure. This reduces the complexity of the data, lowers the difficulty of the heterogeneous retargeting task, and leads to better retargeting results. Moreover, because motion retargeting is highly sensitive to local features, this paper examines the distance information between joints and, drawing on higher-order graph convolution theory, improves conventional skeletal convolution to obtain a new high-order skeletal convolution operator. In high-order graph convolution, the higher powers of the adjacency matrix carry richer information: they encode not only the basic structure, i.e., direct adjacency between nodes, but also multi-level distance relationships among nodes. The new operator uses the adjacency relationships and distance information contained in these higher-order adjacency matrices so that convolution extracts the intrinsic structural features of the skeleton more thoroughly, improving both the accuracy and the visual quality of the retargeting results.
Result In heterogeneous motion retargeting on the synthetic animation dataset Mixamo, the proposed algorithm improves retargeting accuracy by 38.6% over the current state-of-the-art methods. To examine how precisely the model handles root joint position, we also analyze root joint errors. Relative to recent research, our method reduces root joint position error by 35.5%, which indicates that it copes well with retargeting tasks in which root joint position varies over a larger range. Our method also applies to homogeneous motion retargeting, where it achieves a 74.8% increase in accuracy over the latest methods. When real-world motion data captured from humans is retargeted to virtual animated characters in a heterogeneous setting, our method reproduces specific actions more faithfully and significantly reduces retargeting errors.
Conclusion This paper presents a framework that handles the more challenging task of motion retargeting between skeletons that are heterogeneous yet topologically equivalent. When the training data come from two significantly different datasets, the proposed data preprocessing method and high-order skeletal convolution operator enable neural network models to extract motion features from the source data more effectively and combine them with the target skeleton, thereby generating skeletal motion data for the target character. Because the features of the motion data that are independent of the skeleton structure are separated out in advance, the model can focus on structure-relevant information, effectively decoupling motion information from structural details. In addition, by assigning different weights to joints at different distances, the high-order skeletal convolution operator gathers more skeletal structural information, which improves network performance.
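The idea behind the high-order skeletal convolution, using powers of the adjacency matrix to recover joint-to-joint distances and then weighting joints by distance, can be illustrated with a minimal NumPy sketch. This is not the paper's operator: the five-joint skeleton, the function names `hop_distance_masks` and `high_order_aggregate`, and the fixed scalar weights in `hop_weights` are all assumptions made for illustration, and a trained network would presumably learn its weights rather than fix them.

```python
import numpy as np

def hop_distance_masks(edges, num_joints, max_hop):
    """Return a list where masks[k][i, j] == 1 iff joints i and j are exactly k hops apart.

    Hypothetical helper: distances are derived from powers of (A + I),
    whose nonzero entries mark joints reachable within k hops.
    """
    adj = np.zeros((num_joints, num_joints), dtype=int)
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1
    a_hat = adj + np.eye(num_joints, dtype=int)

    masks = []
    reach_prev = np.zeros_like(adj)
    for k in range(max_hop + 1):
        # Joints reachable in at most k hops.
        reach = (np.linalg.matrix_power(a_hat, k) > 0).astype(int)
        # Subtracting the previous reach set leaves exactly-k-hop pairs.
        masks.append(reach - reach_prev)
        reach_prev = reach
    return masks

def high_order_aggregate(features, masks, hop_weights):
    """Sum neighbor features, scaling each hop distance by its own weight."""
    out = np.zeros_like(features, dtype=float)
    for mask, weight in zip(masks, hop_weights):
        out += weight * (mask @ features)
    return out

# Assumed toy skeleton: 0 = root, 1 = spine, 2 and 3 = arms, 4 = leg.
edges = [(0, 1), (1, 2), (1, 3), (0, 4)]
masks = hop_distance_masks(edges, num_joints=5, max_hop=2)

# One scalar feature per joint, and illustrative weights that decay with distance.
features = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
hop_weights = [1.0, 0.5, 0.25]
out = high_order_aggregate(features, masks, hop_weights)
print(out.ravel())
```

In this sketch, joint 2 (an arm) receives a contribution from joint 3 (the other arm) through the 2-hop mask, even though the two are not directly connected. A first-order convolution over direct adjacency alone would miss that relationship.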
Keywords
