多部位特征增强的步态识别算法
Multifaceted-features enhancement-relevant gait recognition method
2023年28卷第5期 页码:1477-1486
纸质出版日期: 2023-05-16
DOI: 10.11834/jig.220641
侯赛辉, 付杨, 李奥奇, 刘旭, 曹春水, 黄永祯. 2023. 多部位特征增强的步态识别算法. 中国图象图形学报, 28(05):1477-1486
Hou Saihui, Fu Yang, Li Aoqi, Liu Xu, Cao Chunshui, Huang Yongzhen. 2023. Multifaceted-features enhancement-relevant gait recognition method. Journal of Image and Graphics, 28(05):1477-1486
目的
基于步态剪影的方法取得了很大的性能提升,其中通过水平划分骨干网络的输出从而学习多身体部位特征的机制起到了重要作用。然而这些方法对不同部位的特征都是以相对独立的方式进行提取,不同部位之间缺乏交互,有碍于识别准确率的进一步提高。针对这一问题,本文提出了一个新模块用于增强步态识别中的多部位特征学习。
方法
本文将“分离—共享”机制引入到步态识别的多部位特征学习过程中。分离机制允许每个部位学习自身独有的特征,主要通过区域池化和独立权重的全连接层进行实现。共享机制允许不同部位的特征进行交互,由特征归一化和特征重映射两部分组成。在共享机制中,特征归一化不包含任何参数,目的是使不同部位的特征具有相似的统计特性以便进行权值共享;特征重映射则是通过全连接层或逐项乘积进行实现,并且在不同部位之间共享权重。
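上述共享机制可以用如下 numpy 草图示意。其中特征维度、权重初始化与归一化的具体形式均为示意性假设,并非论文的原始实现:

```python
import numpy as np

rng = np.random.default_rng(0)
P, D = 4, 64                          # 部位数与特征维度(示意值)
feats = rng.standard_normal((P, D))   # 分离机制输出的各部位特征

# 特征归一化:不含任何参数,使各部位特征具有相似的统计特性
mu = feats.mean(axis=1, keepdims=True)
sigma = feats.std(axis=1, keepdims=True)
normed = (feats - mu) / (sigma + 1e-6)

# 特征重映射(方式一):所有部位共享同一个全连接层权重
W = rng.standard_normal((D, D)) * 0.01
remap_fc = normed @ W                 # 形状仍为 (P, D)

# 特征重映射(方式二):所有部位共享同一组逐项乘积系数
scale = rng.standard_normal(D)
remap_ew = normed * scale             # 形状仍为 (P, D)
```

归一化步骤使不同部位的特征分布对齐,从而让同一组重映射权重可以安全地在所有部位之间共享。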
结果
实验在步态识别领域应用最广泛的数据集CASIA-B(Institute of Automation, Chinese Academy of Sciences)和OUMVLP上进行,分别以GaitSet和GaitPart作为基线方法。实验结果表明,本文设计的模块能够带来稳定的性能提升。在CASIA-B背包条件下,本文提出的模块相对于GaitSet和GaitPart分别将rank-1的识别准确率提升了1.62%和1.17%。
结论
本文设计了一个新的模块用于增强步态识别的多部位特征学习过程,能够在不显著增加计算代价的条件下带来稳定的性能提升。
Objective
Gait recognition aims to identify pedestrians by their walking style. Because it works at a distance and requires no cooperation from the subject, it has application potential in domains such as crime prevention, forensic identification, and public security. However, gait recognition is challenged by many factors, including camera views, carrying conditions, and clothing changes. Existing methods can be divided into two categories: model-based and appearance-based. Model-based methods extract human body structures for gait analysis: they typically estimate poses with deep learning or graph convolutional networks (GCNs) and extract gait features from the pose sequences, or model the walking process with hand-crafted features. Although model-based methods are theoretically robust to carrying and clothing variations, precise pose estimation remains difficult at low resolution. Appearance-based methods learn gait features without explicitly modeling human body structures. They mostly take silhouettes as input and can be further divided into three sub-categories: template-based, sequence-based, and set-based. Template-based methods fuse the silhouettes of a gait cycle into a single template, which inevitably sacrifices temporal information. Sequence-based methods treat the silhouettes of a gait sequence as a video and extract spatio-temporal features. Set-based methods regard the silhouettes of a gait sequence as an unordered set, making the model invariant to the permutation of the input order. Besides silhouettes, other data modalities such as RGB frames, gray images, and optical flow have also been used by appearance-based methods.
Compared with these data modalities and with the pose sequences used by model-based methods, silhouettes are easy to obtain and more suitable for low-resolution scenarios. Notably, recent silhouette-based methods learn multi-part features by slicing the output of the backbone horizontally. However, the features of different parts are extracted independently, and the lack of interaction between parts is likely to hinder recognition accuracy. To resolve this problem, we design a new module to enhance multi-part feature learning for gait recognition.
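The horizontal-slicing idea mentioned above can be sketched as follows in numpy. The feature-map dimensions and the max-plus-mean pooling over each strip are illustrative assumptions, not the exact head used by any particular method:

```python
import numpy as np

def horizontal_parts(feature_map, num_parts=4):
    """Slice a (C, H, W) feature map into horizontal strips and
    pool each strip into one part-level feature vector of length C."""
    strips = np.array_split(feature_map, num_parts, axis=1)  # split along height
    # pool each strip over its spatial positions (max + mean, an illustrative choice)
    return np.stack([s.max(axis=(1, 2)) + s.mean(axis=(1, 2)) for s in strips])

fmap = np.random.rand(128, 16, 11)           # hypothetical backbone output
parts = horizontal_parts(fmap, num_parts=4)  # one feature vector per body part
```

Each row of `parts` then corresponds to one horizontal body region (roughly head, torso, and legs), which is what the proposed module takes as input.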
Method
A silhouette-based gait recognition model consists of two parts: the backbone and multi-part feature learning. First, we design the backbone following the network structures of GaitSet and GaitPart, two popular methods for silhouette-based gait recognition. The backbone first extracts features for each silhouette (regular 2D convolutions and max pooling along the spatial dimensions), then aggregates the silhouette-level features of the unordered set by set pooling (implemented as max pooling along the temporal dimension). Second, we design a new module for multi-part feature learning that aims to learn more robust and discriminative features for each part. A separate-shared mechanism is introduced: the separate mechanism learns part-specific features and is implemented by regional pooling and fully connected layers with independent weights per part, while the shared mechanism strengthens the interaction across parts and consists of feature normalization and feature remapping. Feature normalization is parameter-free and makes the features of different parts share similar statistics so that weights can be shared. Feature remapping is implemented by a fully connected layer or element-wise multiplication, with weights shared across all parts.
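A minimal numpy sketch of the separate-shared forward pass described above. The dimensions, the per-part standardization used as the parameter-free normalization, and the residual connection are illustrative assumptions rather than the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
P, C, D = 4, 128, 64                 # parts, pooled channels, output dim (illustrative)
W_sep = rng.standard_normal((P, C, D)) * 0.01  # separate: independent weights per part
W_shr = rng.standard_normal((D, D)) * 0.01     # shared: one weight matrix for all parts

def separate_shared(parts):
    """parts: (P, C) region-pooled features, one row per body part."""
    # separate mechanism: part-specific fully connected projection
    x = np.einsum('pc,pcd->pd', parts, W_sep)
    # shared mechanism, step 1: parameter-free normalization per part
    x = (x - x.mean(axis=1, keepdims=True)) / (x.std(axis=1, keepdims=True) + 1e-6)
    # shared mechanism, step 2: remapping with weights shared across parts
    return x + x @ W_shr

out = separate_shared(rng.standard_normal((P, C)))
```

Because the remapping weights are a single `(D, D)` matrix reused by all `P` parts, the module adds far fewer parameters than giving each part its own second projection, which is why the computational cost stays small.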
Result
The experiments are carried out on CASIA-B (Institute of Automation, Chinese Academy of Sciences) and OUMVLP, with GaitSet and GaitPart as the baselines. CASIA-B contains 124 subjects and, for each subject, collects sequences of normal walking, walking with bags, and walking in different clothes. OUMVLP contains 10 307 subjects and collects sequences of normal walking for each subject. Each sequence in CASIA-B and OUMVLP is recorded by 11 and 14 cameras, respectively. GaitSet and GaitPart are two widely used methods that take silhouettes as input for gait recognition: GaitSet regards a gait sequence as an unordered set and slices the features horizontally to learn multi-part features, while GaitPart restricts the receptive field of the convolutional layers to learn more part-specific features and models micro-motion features. The rank-1 accuracy excluding the identical-view cases is taken as the main metric for performance comparison. The results show that the proposed module brings consistent improvement: for example, the rank-1 accuracy for walking with bags on CASIA-B is improved by 1.62% and 1.17% over GaitSet and GaitPart, respectively.
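The evaluation metric used above can be sketched as follows. This is a generic illustration of the cross-view rank-1 protocol (nearest-neighbor matching with same-view gallery entries excluded), with toy features and Euclidean distance as assumptions:

```python
import numpy as np

def rank1_excluding_identical_view(gallery, g_ids, g_views, probe, p_ids, p_views):
    """Rank-1 accuracy where gallery entries sharing the probe's camera view
    are excluded, as in the standard cross-view protocol on CASIA-B / OUMVLP."""
    correct = 0
    for feat, pid, pview in zip(probe, p_ids, p_views):
        keep = g_views != pview                        # drop same-view gallery entries
        dists = np.linalg.norm(gallery[keep] - feat, axis=1)
        correct += int(g_ids[keep][np.argmin(dists)] == pid)
    return correct / len(probe)

# toy example: two subjects, each enrolled from views 0 and 90 degrees
gallery = np.array([[0., 0.], [10., 10.], [1., 1.], [9., 9.]])
g_ids = np.array([0, 1, 0, 1])
g_views = np.array([0, 0, 90, 90])
acc = rank1_excluding_identical_view(
    gallery, g_ids, g_views,
    probe=np.array([[0.5, 0.5]]), p_ids=np.array([0]), p_views=np.array([0]))
```

Excluding identical-view pairs forces every match to be made across a view change, which is exactly the condition gait recognition must handle in practice.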
Conclusion
A new module is designed to enhance multi-part feature learning for gait recognition; it improves accuracy consistently without a significant increase in computational cost. In summary: 1) we identify that the lack of interaction between parts hinders recognition accuracy; 2) we introduce the separate-shared mechanism into multi-part feature learning for gait recognition and design a plug-and-play module to learn more discriminative features for multiple parts; 3) experiments based on GaitSet and GaitPart show consistent improvement over the baselines under all walking conditions.
步态识别；剪影序列；多部位特征；分离机制；共享机制
gait recognition; silhouette sequences; multi-part features; separate mechanism; shared mechanism
Bashir K, Xiang T and Gong S G. 2009. Gait representation using flow fields//Proceedings of 2009 British Machine Vision Conference. London, UK: BMVA Press: #113 [DOI: 10.5244/C.23.113]
Ben X Y, Xu S and Wang K J. 2012. Review on pedestrian gait feature expression and recognition. Pattern Recognition and Artificial Intelligence, 25(1): 71-81
贲晛烨, 徐森, 王科俊. 2012. 行人步态的特征表达及识别综述. 模式识别与人工智能, 25(1): 71-81 [DOI: 10.3969/j.issn.1003-6059.2012.01.010]
Castro F M, Marín-Jiménez M J, Guil N and Pérez de la Blanca N. 2020. Multimodal feature fusion for CNN-based gait recognition: an empirical comparison. Neural Computing and Applications, 32(17): 14173-14193 [DOI: 10.1007/s00521-020-04811-z]
Chao H Q, He Y W, Zhang J P and Feng J F. 2019. GaitSet: regarding gait as a set for cross-view gait recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 33(1): 8126-8133 [DOI: 10.1609/aaai.v33i01.33018126]
Fan C, Peng Y J, Cao C S, Liu X, Hou S H, Chi J N, Huang Y Z, Li Q and He Z Q. 2020. GaitPart: temporal part-based model for gait recognition//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 14213-14221 [DOI: 10.1109/CVPR42600.2020.01423]
Han J and Bhanu B. 2006. Individual recognition using gait energy image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(2): 316-322 [DOI: 10.1109/TPAMI.2006.38]
Hermans A, Beyer L and Leibe B. 2017. In defense of the triplet loss for person re-identification [EB/OL]. [2022-06-12]. https://arxiv.org/pdf/1703.07737.pdf
Hou S H, Liu X, Cao C S and Huang Y Z. 2021. Set residual network for silhouette-based gait recognition. IEEE Transactions on Biometrics, Behavior, and Identity Science, 3(3): 384-393 [DOI: 10.1109/TBIOM.2021.3074963]
Huang T H, Ben X Y, Gong C, Zhang B C, Yan R and Wu Q. 2022. Enhanced spatial-temporal salience for cross-view gait recognition. IEEE Transactions on Circuits and Systems for Video Technology, 32(10): 6967-6980 [DOI: 10.1109/TCSVT.2022.3175959]
Li N and Zhao X B. 2022. A strong and robust skeleton-based gait recognition method with gait periodicity priors. IEEE Transactions on Multimedia: #3154609 [DOI: 10.1109/TMM.2022.3154609]
Li X, Makihara Y, Xu C and Yagi Y. 2021. End-to-end model-based gait recognition using synchronized multi-view pose constraint//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal, Canada: IEEE: 4089-4098 [DOI: 10.1109/ICCVW54120.2021.00456]
Liao R J, Cao C S, Garcia E B, Yu S Q and Huang Y Z. 2017. Pose-based temporal-spatial network (PTSN) for gait recognition with carrying and clothing variations//Proceedings of the 12th Chinese Conference on Biometric Recognition. Shenzhen, China: Springer: 474-483 [DOI: 10.1007/978-3-319-69923-3_51]
Lin B B, Zhang S L and Yu X. 2021. Gait recognition via effective global-local feature representation and local temporal aggregation//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 14628-14636 [DOI: 10.1109/ICCV48922.2021.01438]
Takemura N, Makihara Y, Muramatsu D, Echigo T and Yagi Y. 2018. Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition. IPSJ Transactions on Computer Vision and Applications, 10: #4 [DOI: 10.1186/s41074-018-0039-6]
Teepe T, Khan A, Gilg J, Herzog F, Hörmann S and Rigoll G. 2021. GaitGraph: graph convolutional network for skeleton-based gait recognition//Proceedings of 2021 IEEE International Conference on Image Processing. Anchorage, USA: IEEE: 2314-2318 [DOI: 10.1109/ICIP42928.2021.9506717]
Wang X N, Hu D D, Zhang T and Bai G X. 2021. Gait recognition using pose features and 2D Fourier transform. Journal of Image and Graphics, 26(4): 796-814
王新年, 胡丹丹, 张涛, 白桂欣. 2021. 姿态特征结合2维傅里叶变换的步态识别. 中国图象图形学报, 26(4): 796-814 [DOI: 10.11834/jig.200061]
Xu S, Zheng F, Tang J and Bao W X. 2022. Dual branch feature fusion network based gait recognition algorithm. Journal of Image and Graphics, 27(7): 2263-2273
徐硕, 郑锋, 唐俊, 鲍文霞. 2022. 双分支特征融合网络的步态识别算法. 中国图象图形学报, 27(7): 2263-2273 [DOI: 10.11834/jig.200730]
Yu S Q, Tan D L and Tan T N. 2006. A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition//Proceedings of the 18th International Conference on Pattern Recognition. Hong Kong, China: IEEE: 441-444 [DOI: 10.1109/ICPR.2006.67]
Zhang H Y and Bao W J. 2022. The cross-view gait recognition analysis based on generative adversarial networks derived of self-attention mechanism. Journal of Image and Graphics, 27(4): 1097-1109
张红颖, 包雯静. 2022. 融合自注意力机制的生成对抗网络跨视角步态识别. 中国图象图形学报, 27(4): 1097-1109 [DOI: 10.11834/jig.200482]