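Hou Saihui1,2, Fu Yang1, Li Aoqi1, Liu Xu2, Cao Chunshui2, Huang Yongzhen1,2 (1. School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China; 2. Watrix Technology Limited Co., Ltd., Beijing 100088, China)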

Abstract
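Objective Silhouette-based gait recognition methods have improved substantially, and the mechanism of slicing the backbone output horizontally to learn multi-part features has played an important role. However, these methods extract the features of different body parts relatively independently, and the lack of interaction between parts hinders further gains in recognition accuracy. To address this problem, this paper proposes a new module that enhances multi-part feature learning in gait recognition. Method An "independent-shared" mechanism is introduced into multi-part feature learning for gait recognition. The independent mechanism lets each part learn its own specific features and is implemented mainly through regional pooling and fully connected layers with independent weights. The shared mechanism lets the features of different parts interact and consists of feature normalization and feature remapping. Within the shared mechanism, feature normalization contains no parameters; its purpose is to give the features of different parts similar statistics so that weights can be shared. Feature remapping is implemented by a fully connected layer or an element-wise product, with weights shared across parts. Result Experiments are conducted on CASIA-B (Institute of Automation, Chinese Academy of Sciences) and OUMVLP, the most widely used datasets in gait recognition, with GaitSet and GaitPart as the baseline methods. The results show that the proposed module brings consistent performance improvements. Under the carrying-bag condition of CASIA-B, the module improves rank-1 accuracy over GaitSet and GaitPart by 1.62% and 1.17%, respectively. Conclusion This paper designs a new module that enhances multi-part feature learning for gait recognition and brings consistent performance improvements without significantly increasing the computational cost.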
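As an illustration only, the independent-shared mechanism summarized above can be sketched in plain Python. The function names, the max-plus-mean regional pooling, the per-vector normalization, the bias-free layers, and the choice of an element-wise scale for feature remapping are assumptions made for this sketch. The abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def regional_pool(feature_map, num_parts):
    """Slice a C x H x W map into horizontal strips and pool each strip to a C-vector."""
    height = len(feature_map[0])
    strip = height // num_parts
    parts = []
    for p in range(num_parts):
        vec = []
        for channel in feature_map:
            values = [v for row in channel[p * strip:(p + 1) * strip] for v in row]
            # Assumption: max + mean pooling over the strip.
            vec.append(max(values) + sum(values) / len(values))
        parts.append(vec)
    return parts


def linear(vec, weight):
    """Bias-free fully connected layer; weight has shape out_dim x in_dim."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def normalize(vec, eps=1e-5):
    """Parameter-free normalization to zero mean and (roughly) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def independent_shared(feature_map, part_weights, shared_scale):
    """Independent step: one weight matrix per part. Shared step: one scale for all parts."""
    parts = regional_pool(feature_map, len(part_weights))
    outputs = []
    for vec, weight in zip(parts, part_weights):
        part_feature = linear(vec, weight)      # independent weights per part
        normed = normalize(part_feature)        # no learnable parameters
        # Remapping by element-wise product, same scale vector for every part.
        outputs.append([s * x for s, x in zip(shared_scale, normed)])
    return outputs


random.seed(0)
C, H, W, P, D = 4, 8, 3, 4, 5  # channels, height, width, parts, output dimension
fmap = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
part_weights = [[[random.gauss(0, 1) for _ in range(C)] for _ in range(D)] for _ in range(P)]
shared_scale = [1.0] * D
features = independent_shared(fmap, part_weights, shared_scale)
print(len(features), len(features[0]))  # 4 5
```

Because the normalization step gives every part's feature vector the same mean and a similar variance, a single `shared_scale` (or a single shared fully connected layer in its place) can reasonably be applied to all parts.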
Multifaceted-features enhancement-relevant gait recognition method

Hou Saihui1,2, Fu Yang1, Li Aoqi1, Liu Xu2, Cao Chunshui2, Huang Yongzhen1,2(1.School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China;2.Watrix Technology Limited Co., Ltd., Beijing 100088, China)

Objective Gait recognition identifies pedestrians by their walking style. Because it works at long distance and does not require the subject's cooperation, it has great application potential in crime prevention, forensic identification, and public security. However, gait recognition is challenged by many factors, such as camera views, carrying conditions, and clothing changes. Current gait recognition methods can be divided into two categories: model-based and appearance-based. Model-based methods extract the human body structure for gait analysis. They typically rely on pose estimation to model the walking process and then extract gait features from the pose sequences, using hand-crafted features, conventional deep learning, or graph convolutional networks (GCN). Model-based methods are theoretically robust to carrying and clothing variations, but accurate human pose estimation is often difficult at low resolution. Appearance-based methods learn gait features without explicitly modeling the human body structure. They mostly take silhouettes as input and can be further divided into three sub-categories: template-based, sequence-based, and set-based. Template-based methods fuse the silhouettes of a gait cycle into a template, which inevitably sacrifices temporal information. Sequence-based methods treat the silhouettes of a gait sequence as a video to extract spatio-temporal features. Set-based methods treat the silhouettes of a gait sequence as an unordered set and are therefore invariant to permutations of the input order. Other data modalities used for gait recognition, including RGB frames, gray images, and optical flow, also fall under the appearance-based methods. 
Compared with these modalities and with the pose sequences used by model-based methods, silhouettes are easy to use and better suited to low-resolution scenarios. Notably, recent silhouette-based methods learn multi-part features by slicing the output of the backbone horizontally. However, the features of different parts are extracted independently and interaction between parts is lacking, which is likely to limit recognition accuracy. To resolve this problem, we design a new module that enhances multi-part feature learning for gait recognition. Method A silhouette-based gait recognition model consists of two parts: the backbone and multi-part feature learning. First, we build the backbone following the network structures of GaitSet and GaitPart, two popular methods for silhouette-based gait recognition. In the backbone, features are first extracted from each silhouette (by regular 2D convolution and max pooling along the spatial dimensions), and set pooling then aggregates the silhouette-level features as an unordered set (implemented by max pooling along the temporal dimension). Second, we design a new module for multi-part feature learning that aims to learn more robust and discriminative features for each body part. An independent-shared mechanism is introduced. The independent mechanism learns part-specific features and is implemented by regional pooling and separate fully connected layers. The shared mechanism strengthens the interaction across parts and consists of feature normalization and feature remapping. Feature normalization is parameter-free and gives the features of different parts similar statistics so that weights can be shared. Feature remapping is implemented by a fully connected layer or element-wise multiplication whose weights are shared across parts. Result The experiments are carried out on CASIA-B (Institute of Automation, Chinese Academy of Sciences) and OUMVLP, with GaitSet and GaitPart as the baselines. 
CASIA-B contains 124 subjects and, for each subject, collects sequences of regular walking, walking with bags, and walking in different clothes. OUMVLP contains 10 307 subjects and collects sequences of regular walking for each subject. The sequences in CASIA-B and OUMVLP are recorded from 11 and 14 camera views, respectively. GaitSet and GaitPart are two commonly used methods that take silhouettes as input for gait recognition. GaitSet treats the silhouettes of a gait sequence as an unordered set and slices the features horizontally to learn multi-part features. GaitPart restricts the receptive field of the convolutional layers and models micro-motion features to learn more fine-grained features. Rank-1 accuracy with identical-view cases excluded is taken as the main metric for performance comparison. For example, the rank-1 accuracy for walking with bags on CASIA-B is improved by 1.62% and 1.17% over GaitSet and GaitPart, respectively. Conclusion A new module is proposed to enhance multi-part feature learning for gait recognition; it is cost-effective and improves accuracy consistently. In summary: 1) we identify the lack of interaction between parts as a factor that hinders recognition accuracy; 2) the independent-shared mechanism is introduced into multi-part feature learning for gait recognition, and a plug-and-play module is designed to learn more discriminative features for multiple body parts; 3) built on GaitSet and GaitPart, the module brings consistent improvements over the baselines under all walking conditions.
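The main metric, rank-1 accuracy with identical-view cases excluded, can be illustrated with a small sketch. The tuple layout, the function name, the Euclidean nearest-neighbor matching, the averaging over view pairs, and the toy data are assumptions made for illustration and are not taken from the paper's evaluation code.

```python
import math


def rank1_cross_view(probe, gallery):
    """Mean rank-1 accuracy over (probe view, gallery view) pairs with different views.

    probe and gallery are lists of (feature, subject_id, view) tuples.
    """
    probe_views = sorted({view for _, _, view in probe})
    gallery_views = sorted({view for _, _, view in gallery})
    accuracies = []
    for pv in probe_views:
        for gv in gallery_views:
            if pv == gv:
                continue  # identical-view cases are excluded
            queries = [(f, s) for f, s, v in probe if v == pv]
            candidates = [(f, s) for f, s, v in gallery if v == gv]
            if not queries or not candidates:
                continue
            hits = 0
            for feat, subject in queries:
                # Rank-1: only the single nearest gallery sample counts.
                nearest = min(candidates, key=lambda c: math.dist(feat, c[0]))
                hits += nearest[1] == subject
            accuracies.append(hits / len(queries))
    return sum(accuracies) / len(accuracies)


# Toy data: two subjects (A, B) seen from two views (0 and 90 degrees).
gallery = [
    ([0.0, 0.0], "A", 0), ([1.0, 1.0], "B", 0),
    ([0.1, 0.0], "A", 90), ([1.0, 0.9], "B", 90),
]
probe = [
    ([0.05, 0.0], "A", 0),
    ([0.2, 0.1], "A", 90), ([0.9, 1.0], "B", 90), ([0.6, 0.6], "A", 90),
]
score = rank1_cross_view(probe, gallery)
print(round(score, 3))  # 0.833
```

In the toy data, the view-0 probe is matched correctly against the view-90 gallery, while one of the three view-90 probes is mismatched against the view-0 gallery, so the mean over the two cross-view pairs is (1 + 2/3) / 2.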