An explainability-oriented skeleton representation method for object topological structure

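Wei Hui1,2, Yu Liping1,2 (1. Laboratory of Cognitive Modeling and Algorithms, School of Computer Science and Technology, Fudan University, Shanghai 201203; 2. Shanghai Key Laboratory of Data Science, Shanghai 201203)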

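Abstract
Objective In pattern recognition, classifiers are usually trained with large amounts of labeled data and effective machine learning algorithms to cope with uncertainty. However, this process lacks knowledge representation and explainability. Research in cognitive and experimental psychology shows that humans tend not to use such a costly mechanism; instead, they rely on representation, induction, reasoning, explanation, and constraint propagation, means similar to symbolic artificial intelligence methods, to cope with uncertainty in object recognition and to provide explainability. This paper therefore starts from traditional symbolic computation and uses a skeleton-based representation of topological structure to offer an approach to explainability. Method A skeleton tree is used as the basic means of forming a formal representation of the topological and geometric features of an object, and, based on a generalization framework, knowledge is extracted from a small number of representations of the same class to form an explicit, generalized representation of knowledge about the object category. Result In the experiment on forming generalized representations of object categories, path reconstruction intuitively shows the geometric and physical meaning of the most general representation obtained from objects of the same category. In the explainability verification experiment, applying the topology across datasets shows the specific differences of new test samples relative to the generalized representation, indicating that the representation has good explainability. Finally, in the uncertainty reasoning experiment on shape completion, the method yields not only a recognition conclusion but also a clear account of the basis for the judgment behind it, further verifying the explainability of the representation. Conclusion Experiments show that the generalized formal representation can cope with uncertainty in size, color, and shape. The proposed method avoids the uncertainty introduced by texture-based features, is applicable to any primitive-based representation, and offers better robustness, generality, and explainability at a lower computational cost.
Keywords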
Skeleton characterization of object topology toward explainability

Wei Hui1,2, Yu Liping1,2(1.Laboratory of Cognitive Modeling and Algorithms, School of Computer Science and Technology, Fudan University, Shanghai 201203, China;2.Shanghai Key Laboratory of Data Science, Shanghai 201203, China)

Abstract
Objective Understanding the shape and structure of objects is extremely important in object recognition. The most commonly used approach to pattern recognition is machine learning, which often requires a large amount of training data. However, this data-driven learning approach lacks a priori knowledge, relies on large amounts of training data and complex computation, and cannot yield explicit knowledge after learning (i.e., "knowing how without knowing why"). Object recognition tasks involve great uncertainty because of changes in size, color, illumination, position, and environmental background. To deal with such uncertainty, a classifier is typically trained on a large number of samples with a powerful machine learning algorithm. Although these models achieve favorable recognition accuracy on some standard datasets, they lack explainability, and recent studies have shown that purely data-driven models are vulnerable. They also often ignore knowledge representation or even regard it as redundant. However, research in cognitive and experimental psychology suggests that humans do not adopt such a costly mechanism. Instead, humans use means similar to symbolic artificial intelligence methods, such as representation, induction, reasoning, interpretation, and constraint propagation, to deal with the uncertainties in object recognition and to provide explainability. In vision tasks, improving explainability is considered more important than improving accuracy; this is the goal of interpretable artificial intelligence. Accordingly, this paper starts from traditional symbolic computing and adopts a skeleton-based topological representation to offer an approach to explainability. Method Psychological research reveals that humans show strong topological and geometric preferences when engaged in visual tasks.
To explicitly characterize geometric and topological features, the proposed method adopts skeleton descriptors, which have excellent topological and geometric characteristics. First, an object is decomposed into several connected components based on a skeleton graph, and each component is represented by a skeleton branch. Second, statistical parameters of these skeleton branches are obtained, including their area ratio, path length, and skeletal mass ratio distribution. Third, the skeleton radius path is used to describe the contour of the object. Fourth, to form a robust spatial topology constraint, the spine-like axis (SPA) is used to describe the spatial distribution of the shape components. Finally, a skeleton tree is used as the representation of the topological structure (RTS) of the object. A similarity measure based on RTS is also proposed, and optimal subsequence bijection (OSB) is used for the elastic matching of object shapes. A multi-level generalization framework is then built to extract knowledge from a small number of similar representations and to form a generalized explicit representation (GRTS) of each object category. Uncertainty reasoning and explainability are verified on the basis of certainty factor theory. Result The process of generating GRTS is illustrated on the Kimia99 and Tari56 datasets, and the physical meanings of the most general representations obtained from objects of the same category are presented. The skeletal paths of an object of the same category are used for reconstruction to show clearly which part of the object each part of the GRTS corresponds to. In the explainability verification experiment, the GRTSs of several categories obtained from the Tari56 dataset are applied to samples of the closest categories in Kimia99 to reveal the specific differences of each new test sample relative to the GRTS. The results show that the representation has good explainability.
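As a rough illustration of two of the branch statistics named in the Method (path length and area ratio), the following Python sketch assumes that each skeleton branch is already available as a polyline of skeleton points with a maximal-disc radius at each point. The data layout, the function names, and the trapezoidal approximation of branch area are assumptions made here for illustration; the abstract does not specify how the paper computes these quantities.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive skeleton points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def branch_area(points, radii):
    """Approximate the area covered by a branch (assumed formula).

    Each segment contributes its length times the mean local width,
    where the local width is twice the skeleton radius.
    """
    area = 0.0
    for p, q, r0, r1 in zip(points, points[1:], radii, radii[1:]):
        area += math.dist(p, q) * (r0 + r1)  # (r0 + r1) == mean of 2*r0 and 2*r1
    return area


def branch_statistics(branches):
    """Return path length and area ratio for every branch of one object."""
    areas = [branch_area(b["points"], b["radii"]) for b in branches]
    total = sum(areas)
    return [
        {
            "path_length": path_length(b["points"]),
            "area_ratio": a / total if total else 0.0,
        }
        for b, a in zip(branches, areas)
    ]


if __name__ == "__main__":
    # Hypothetical object with a thick "torso" branch and a thin "limb" branch.
    example = [
        {"points": [(0, 0), (0, 4)], "radii": [1.0, 1.0]},
        {"points": [(0, 4), (3, 8)], "radii": [0.5, 0.5]},
    ]
    for stats in branch_statistics(example):
        print(stats)
```

Because the area ratio is normalized by the total area of the object, it does not change when the shape is uniformly scaled, which is the kind of size invariance the abstract attributes to the representation.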
Meanwhile, in the shape completion experiments, RTS is first extracted from incomplete shapes to gather evidence, and uncertainty reasoning is validated with the rule set (Tari56) established according to GRTS. The proposed model not only provides a recognition conclusion but also shows the specific basis for its judgment, thereby further verifying the explainability of the representation. Conclusion A skeleton tree is used as the basic means of generating a formal representation of the topological and geometric features of an object. Based on the generalization framework, knowledge extracted from a small number of similar representations is used to form a generalized explicit representation of the knowledge about an object category. The resulting knowledge representation is then used to conduct uncertainty reasoning experiments. This work presents a new perspective on explainability and helps build trust between models and people. Experimental results show that the generalized formal representation can cope with uncertainties in size, color, and shape. The representation is also robust and general, avoids the uncertainties introduced by texture features, and is suitable for any primitive-based representation. The proposed approach significantly outperforms mainstream machine learning methods in terms of explainability and computational cost.
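To make the idea of rule-based uncertainty reasoning with a visible judgment basis more concrete, the following Python sketch applies the classic MYCIN-style certainty factor combination rule. It is only a sketch under stated assumptions: the rule set, the feature names, and the numeric certainty values are hypothetical and are not taken from the paper, and the paper's actual rules derived from GRTS may be structured differently.

```python
from functools import reduce


def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1] that bear on the same hypothesis."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)
    if a < 0 and b < 0:
        return a + b * (1 + a)
    denom = 1 - min(abs(a), abs(b))
    # Total conflict (+1 against -1) is undefined in the classic formula;
    # returning 0.0 here is an assumption of this sketch.
    return 0.0 if denom == 0 else (a + b) / denom


def apply_rule(rule_cf, evidence_cf):
    """Attenuate a rule's strength by the certainty of its evidence."""
    return rule_cf * max(0.0, evidence_cf)


def infer(rules, evidence):
    """Return, per category, the combined belief and each rule's contribution."""
    results = {}
    for category, category_rules in rules.items():
        contributions = [
            (feature, apply_rule(rule_cf, evidence.get(feature, 0.0)))
            for feature, rule_cf in category_rules
        ]
        belief = reduce(combine_cf, (cf for _, cf in contributions), 0.0)
        results[category] = (belief, contributions)
    return results


if __name__ == "__main__":
    # Hypothetical rules: category -> list of (feature, rule certainty factor).
    rules = {
        "horse": [("four_limb_branches", 0.6), ("elongated_torso_branch", 0.5)],
        "bird": [("two_wing_branches", 0.7)],
    }
    # Hypothetical evidence gathered from an incomplete shape.
    evidence = {"four_limb_branches": 1.0, "elongated_torso_branch": 0.8}
    for category, (belief, contributions) in infer(rules, evidence).items():
        print(category, round(belief, 3), contributions)
```

The per-rule contributions returned alongside each belief value play the role of the judgment basis: they show which observed features supported a category and how strongly.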
Keywords
