Parametric 3D body modeling and view-invariant abnormal gait recognition
2020, Vol. 25, No. 8, Pages: 1539-1550
Received: 2019-09-29; Revised: 2020-02-19; Accepted: 2020-02-26; Published in print: 2020-08-16
DOI: 10.11834/jig.190497
Objective
Research on gait using vision and machine learning methods has become a hot topic, but most work concentrates on identity recognition. This paper studies gait from a different angle and explores a method for 3D human body modeling and view-invariant recognition of abnormal gait based on point cloud data and a semantic human body feature model.
Method
A parametric 3D human body model based on shape and pose semantic features is constructed by using non-rigid deformation and skinning methods. Taking the abnormal gait point cloud data captured by an infrared structured light sensor as the observation target, a 3D human body model matching its shape and pose features is built. A ConvGRU (convolutional gated recurrent unit) recurrent neural network is then used to extract the spatio-temporal features of the projected depth images, and the samples are divided into positive, negative, and anchor triplets to train the abnormal gait classifier, improving its ability to discriminate subtle differences. To address the difficulty of acquiring abnormal gait data and the scarcity of training views, a training sample augmentation method based on shape, pose, and view transformation is proposed to improve the generalization ability of the model under view changes.
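The gated recurrence of the ConvGRU unit mentioned above can be sketched for a single-channel feature map. This is a minimal NumPy illustration under assumed conventions (3×3 kernels, "same" padding, all function and kernel names invented for exposition), not the authors' implementation:

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2D convolution of a single-channel map x with kernel k."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convgru_step(x, h, kz, kr, kc, uz, ur, uc):
    """One ConvGRU step. k* convolve the input x, u* convolve the state h."""
    z = sigmoid(conv2d(x, kz) + conv2d(h, uz))             # update gate
    r = sigmoid(conv2d(x, kr) + conv2d(h, ur))             # reset gate
    h_tilde = np.tanh(conv2d(x, kc) + conv2d(r * h, uc))   # candidate state
    return (1 - z) * h + z * h_tilde                       # blend old and new
```

Feeding the projected depth frames of a gait cycle through `convgru_step` one by one leaves a hidden map that mixes spatial (convolution) and temporal (gating) information, which is the property the Method relies on.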
Result
Experiments were conducted on the CSU (Central South University) 3D abnormal gait database and the DHA (depth-included human action video) database, and the effects of different abnormal gait or action recognition methods were compared. On the CSU abnormal gait database, the proposed method achieves an overall detection and recognition rate of 96.6% for abnormal gait at the 0°, 45°, and 90° views; in particular, in the 90° to 0° cross-view and view-transformation experiments, it outperforms gait action features such as DMHI (difference motion history image) and DMM-CNN (depth motion map-convolutional neural network) by more than 25%. On the DHA depth human action database, the recognition rate of the proposed method is close to 98%, which is 2% to 3% higher than that of related algorithms such as DMM.
Conclusion
The proposed 3D abnormal gait recognition method combines the prior knowledge of the 3D human body, the spatio-temporal properties of recurrent convolutional networks, and the advantages of virtual-view sample synthesis. It not only improves the recognition accuracy of abnormal gait under view changes but also provides a new approach for 3D abnormal gait detection and recognition.
Objective
Gait has become a popular research topic that is currently investigated by using visual and machine learning methods. However, most of these studies are concentrated in the field of human identification and use 2D RGB images. In contrast to these studies, this paper investigates abnormal gait recognition by using 3D data. A method based on 3D point cloud data and the semantic body model is then proposed for view-invariant abnormal gait recognition. Compared with traditional 2D abnormal gait recognition approaches, the proposed 3D-based method can easily deal with many obstacles in abnormal gait modelling and recognition processes, including view-invariant problems and interference from external items.
Method
The point cloud data of human gait are obtained by using an infrared structured light sensor, which is a 3D depth camera that uses a structured-light projector and a reflected-light receiver to gain the depth information of an object and calculate its point cloud data. Although the point cloud data of the human body are also in 3D, they are generally unstructured, thereby influencing the 3D representation of the human body and posture. To deal with this problem, a 3D parametric human body learned from a 3D body dataset by using a statistical method is introduced in this paper. The parameterized human body model refers to the description and construction of the corresponding visual human body mesh through abstract high-order semantic features, such as height, weight, age, gender, and skeletal joints. The parameters are determined by using statistical learning methods. The human body is embedded into the model, and the 3D parametric model can be deformed in both shape and pose. Unlike traditional methods that directly model the 3D body from point cloud data via the point cloud reduction algorithm and triangle mesh grid method, the related 3D parameterized body model is deformed to fit the point cloud data in both shape and posture. The standard 3D human model proposed in this paper is constructed based on body shape PCA (principal component analysis) and the skinning method. An observation function that measures the similarity of the deformed 3D model with the raw point cloud data of the human body is also introduced. An accurate deformation of the 3D body is ensured by iteratively minimizing the observation function. After the 3D model estimation process, the features of the raw point cloud data of the human body are converted into a high-level structured representation of the human body. This process not only abstracts the unstructured data to a high-order semantic description but also effectively reduces the dimensionality of the original data. After 3D modelling and structured feature representation,
a convolution gated recurrent unit (ConvGRU) recurrent neural network is applied to extract the temporal-spatial features of the projected depth gait images. ConvGRU has the advantages of both convolutional and recurrent neural networks, the latter of which is based on the gate structure. The two gates (i.e., reset and update gates) help the model memorize useful information and forget useless data. In the final classification process, the samples are divided into positive, negative, and anchor samples. The anchor sample is the sample itself, the positive samples are same-category samples that belong to different objects, and the negative samples are those that belong to opposite categories. Training the classifier by using this triplet strategy can improve its ability to discriminate small feature differences between categories. At the same time,
a virtual 3D sample synthesizing method based on body, pose, and view deformation is proposed to deal with the data shortage problem of abnormal gait. Compared with normal gait datasets, abnormal gait data, especially 3D abnormal datasets, are rare and difficult to obtain. Moreover, given the limited amount of ground truth data, most of the abnormal data are imitated by the experimental participants. As a result, the virtual synthesizing method can help extend the training data and improve the generalization ability of the abnormal gait classification model.
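The view-transformation part of this sample synthesis can be sketched as rotating a captured point cloud about the vertical axis and re-projecting it to a depth image. The sketch below rests on assumed conventions (y-up axis, orthographic projection, a fixed 64×64 resolution, and helper names invented for illustration) and is not the authors' pipeline:

```python
import numpy as np

def rotate_y(points, angle_deg):
    """Rotate an N x 3 point cloud about the vertical (y) axis."""
    a = np.deg2rad(angle_deg)
    r = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0,       1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    return points @ r.T

def project_depth(points, res=64):
    """Orthographically project the cloud to a res x res depth image.
    Each pixel keeps the depth of the nearest point (smallest z)."""
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    scale = (res - 1) / np.maximum(hi - lo, 1e-9)
    px = ((xy - lo) * scale).astype(int)
    depth = np.full((res, res), np.inf)
    for (u, v), z in zip(px, points[:, 2]):
        depth[v, u] = min(depth[v, u], z)   # keep nearest surface
    depth[np.isinf(depth)] = 0.0            # empty pixels as background
    return depth

def synthesize_views(points, angles=(0, 45, 90)):
    """Generate virtual training views from one captured cloud."""
    return {a: project_depth(rotate_y(points, a)) for a in angles}
```

One captured gait frame can thus yield several virtual viewing angles, which is how the augmentation step enlarges the training set without new recordings.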
Result
Experiments were performed by using the CSU (Central South University) abnormal 3D gait database and the depth-included human action video (DHA) dataset, and different abnormal gait or action recognition methods were compared with the proposed approach. In the CSU abnormal gait database, the rank-1 mean detection and recognition rate of abnormal gait is 96.6% at the 0°, 45°, and 90° views. In the 90°-0° cross-view recognition experiment, the proposed method outperforms the other approaches that use DMHI (difference motion history image) or DMM-CNN (depth motion map-convolutional neural network) as feature representations by at least 25%. Meanwhile, in the DHA dataset, the proposed method achieves a rank-1 mean detection and recognition rate of nearly 98%, which is 2% to 3% higher than that of novel approaches, including DMM-based methods.
Conclusion
Based on the feature extraction method of the 3D parameterized human body model, abnormal gait image data can be abstracted into high-order descriptions, which effectively completes the feature extraction and dimensionality reduction of the original data. ConvGRU can extract the spatial and temporal features of the abnormal gait data well. The virtual sample synthesis and triplet classification methods can be combined to classify and recognize abnormal gait data from different views. The proposed method not only improves the recognition accuracy of abnormal gait under various view angles but also provides a new approach for the detection and recognition of abnormal gait.
Bauckhage C, Tsotsos J K and Bunn F E. 2005. Detecting abnormal gait//Proceedings of the 2nd Canadian Conference on Computer and Robot Vision. Victoria: IEEE: 1-7[DOI:10.1109/CRV.2005.32]
Chen C, Liu M Y, Zhang B C, Han J G, Jiang J J and Liu H. 2016. 3D action recognition using multi-temporal depth motion maps and Fisher vector//Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York: AAAI Press: 3331-3337
Cho K, Van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H and Bengio Y. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL].[2019-09-22] https://arxiv.org/pdf/1406.1078.pdf
Elmadany N E D, He Y F and Guan L. 2018. Information fusion for human action recognition via biset/multiset globality locality preserving canonical correlation analysis. IEEE Transactions on Image Processing, 27(11):5275-5287[DOI:10.1109/TIP.2018.2855438]
Gao Z, Zhang H, Liu A and Xue Y B. 2014. Human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning. KSII Transactions on Internet and Information Systems, 8(2):483-503[DOI:10.3837/tiis.2014.02.009]
Gao Z, Zhang H, Xu G P and Xue Y B. 2015. Multi-perspective and multi-modality joint representation and recognition model for 3D action recognition. Neurocomputing, 151:554-564[DOI:10.1016/j.neucom.2014.06.085]
Han J and Bhanu B. 2006. Individual recognition using gait energy image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(2):316-322[DOI:10.1109/TPAMI.2006.38]
Lam T H W, Cheung K H and Liu J N K. 2011. Gait flow image:a silhouette-based gait representation for human identification. Pattern Recognition, 44(4):973-987[DOI:10.1016/j.patcog.2010.10.011]
Li G Y, Liu T and Yi J G. 2018. Wearable sensor system for detecting gait parameters of abnormal gaits:a feasibility study. IEEE Sensors Journal, 18(10):4234-4241[DOI:10.1109/JSEN.2018.2814994]
Lin Y C, Hu M C, Cheng W H, Hsieh Y H and Chen H M. 2012. Human action recognition and retrieval using sole depth information//Proceedings of the 20th ACM International Conference on Multimedia. Nara: ACM: 1-4[DOI:10.1145/2393347.2396381]
Liu A A, Nie W Z, Su Y T, Ma L, Hao T and Yang Z X. 2015. Coupled hidden conditional random fields for RGB-D human action recognition. Signal Processing, 112:74-82[DOI:10.1016/j.sigpro.2014.08.038]
Liu L, Su Z, Fu X D, Liu L J, Wang R M and Luo X N. 2017. A data-driven editing framework for automatic 3D garment modeling. Multimedia Tools and Applications, 76(10):12597-12626[DOI:10.1007/s11042-016-3688-4]
Luo J, Tang J, Tjahjadi T and Xiao X M. 2016. Robust arbitrary view gait recognition based on parametric 3D human body reconstruction and virtual posture synthesis. Pattern Recognition, 60:361-377[DOI:10.1016/j.patcog.2016.05.030]
Luo J, Tang J, Zhao P, Mao F and Wang P. 2016. Abnormal behavior detection for elderly based on 3D structure light sensor. Optical Technique, 42(2):146-151[DOI:10.13741/j.cnki.11-1879/o4.2016.02.011]
Ngo T T, Makihara Y, Nagahara H, Mukaigawa Y and Yagi Y. 2015. Similar gait action recognition using an inertial sensor. Pattern Recognition, 48(4):1289-1301[DOI:10.1016/j.patcog.2014.10.012]
Pogorelc B, Bosnić Z and Gams M. 2012. Automatic recognition of gait-related health problems in the elderly using machine learning. Multimedia Tools and Applications, 58(2):333-354[DOI:10.1007/s11042-011-0786-1]
Shotton J, Sharp T, Kipman A, Fitzgibbon A, Finocchio M, Blake A, Cook M and Moore R. 2013. Real-time human pose recognition in parts from single depth images. Communications of the ACM, 56(1):116-124[DOI:10.1145/2398356.2398381]
Smisek J, Jancosek M and Pajdla T. 2011. 3D with Kinect//Proceedings of 2011 IEEE International Conference on Computer Vision Workshops. Barcelona: IEEE: 1154-1160[DOI:10.1109/ICCVW.2011.6130380]
Sun P, Xia F, Zhang H, Peng D G, Ma X and Luo Z J. 2017. Research of human fall detection algorithm based on improved Gaussian mixture model. Computer Engineering and Applications, 53(20):173-179[DOI:10.3778/j.issn.1002-8331.1604-0423]
Wang L. 2006. Abnormal walking gait analysis using silhouette-masked flow histograms//Proceedings of the 18th International Conference on Pattern Recognition. Hong Kong, China: IEEE: 1-4[DOI:10.1109/ICPR.2006.199]
Wang L, Jiang W J, Sun P and Xia F. 2017. Application of improved D-S evidence theory in human fall detection of transformer substation. Journal of Electronic Measurement and Instrumentation, 31(7):1090-1098[DOI:10.13382/j.jemi.2017.07.015]
Xia L and Aggarwal J K. 2013. Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland: IEEE: 2834-2841[DOI:10.1109/CVPR.2013.365]
Yang X D and Tian Y L. 2017. Super normal vector for human activity recognition with depth cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(5):1028-1039[DOI:10.1109/TPAMI.2016.2565479]