Spatio-temporal restoration network for gait images under occlusion and its application

Yang Qiang, Luo Jian, Huang Yuchen (School of Information Science and Engineering, Hunan Normal University, Changsha 410000)

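Abstract
Objective Most current vision-based gait recognition methods rely on complete gait sequence images. In real-world scenes, however, pedestrians are inevitably occluded, so the captured gait images are incomplete, which strongly affects recognition results. Handling large-area occlusion is therefore a challenging and important problem in gait recognition. To address it, a gait spatio-temporal reconstruction network (GSTRNet) is proposed to repair occluded gait sequence images. Method GSTRNet, built on a 3D convolutional neural network and Transformers, repairs the gait sequence, restoring the spatial information of each gait frame while preserving the spatio-temporal coherence between frames. GSTRNet introduces the YOLOv5 (you only look once) network to detect the locally occluded regions of the gait images and uses them as prior knowledge to assign a higher repair weight to the occluded regions, achieving local repair of those regions; the locally repaired gait image is then fused with the original occluded image to produce a complete repaired gait image. In addition, a joint loss function composed of a triplet feature loss and a reconstruction loss is introduced in GSTRNet to optimize the repair network and improve the repair quality. Finally, the fully repaired gait sequence images are used as features for identity recognition. Result Occluded gait sequences were synthesized from the large-scale gait dataset OU_MVLP (the OU-ISIR gait database, multi-view large population dataset) for the repair experiments. The results show that, under large-area occlusion of the gait silhouette, the method improves recognition accuracy over existing gait repair and occluded-gait recognition methods: with an unknown occlusion mode it is up to 6.7% higher than the triplet video generative adversarial network (sequence video Wasserstein generative adversarial network based on triplet hinge loss, sVideoWGAN-hinge), and under non-single-mode occlusion its recognition rate is about 40% higher than that of methods such as Gaitset. Conclusion The proposed GSTRNet repairs gait image sequences well under various occlusion modes, and using the repaired images for gait recognition effectively improves the recognition rate.
Keywords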
Gait image spatio-temporal restoration network and its application under occlusion conditions

Yang Qiang, Luo Jian, Huang Yuchen (School of Information Science and Engineering, Hunan Normal University, Changsha 410000, China)

Abstract
Objective Gait recognition is an identity recognition method based on human walking patterns and has been widely used in video surveillance and public security. Compared with the face, fingerprint, and other biometric features, gait can be recognized at long distance without the participant's cooperation and is difficult to camouflage or hide. At present, gait recognition algorithms are based on vision and deep learning, and most of them use unoccluded gait sequences to form gait features for recognition. In reality, however, people monitored in public places are inevitably blocked, so the gait sequences obtained are usually occluded. Occlusion has a large effect on gait recognition: accurate gait periods cannot be obtained from the sequence, and the loss of gait spatio-temporal information is more severe, which leads to a substantial reduction in recognition performance. Existing algorithms for occluded gait fall into two kinds. The first extracts occlusion-robust features directly from the occluded sequence for identification; it often needs to know the gait period in advance, which is difficult to obtain from an occluded gait sequence. The second performs identification by reconstructing the gait silhouette or repairing the gait features, but existing algorithms of this kind often perform poorly when the occluded area is large or the entire sequence is occluded. Method A gait spatio-temporal reconstruction network (GSTRNet) based on prior knowledge is proposed to repair occluded gait sequences. It consists of the occlusion detection network you only look once (YOLO), the spatio-temporal codec network, and the feature extraction network (Gaitset). GSTRNet uses YOLOv5 to detect the occlusion region in the sequence (assigning 1 to the occluded area and 0 to the nonoccluded area) and uses this mask as prior knowledge to assign a higher weight to the loss of the occluded area.
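The mask-weighted loss described above can be illustrated with a minimal sketch. The abstract does not give the exact loss formula, so the L1 error, the normalization by the sum of weights, and the function name `weighted_l1_loss` with its `occ_weight` parameter are assumptions made here for illustration only.

```python
def weighted_l1_loss(pred, target, mask, occ_weight=5.0):
    """Weighted mean absolute error between a repaired frame and its ground truth.

    pred, target, mask: 2D lists of equal shape; mask holds 1 for occluded
    pixels and 0 elsewhere, as in the abstract.
    occ_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            # Occluded pixels contribute occ_weight times as much as the rest.
            w = occ_weight if m == 1 else 1.0
            total += w * abs(p - t)
            weight_sum += w
    return total / weight_sum


pred = [[0.0, 1.0], [1.0, 0.0]]
target = [[0.0, 1.0], [1.0, 1.0]]
mask = [[0, 0], [0, 1]]  # only the bottom-right pixel is occluded
loss = weighted_l1_loss(pred, target, mask)
```

With `occ_weight=1.0` the function reduces to a plain mean absolute error, so the single wrong pixel in the occluded region is penalized more heavily in the weighted case than in the unweighted one.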
The spatio-temporal codec network consists of a 3D convolutional neural network (3DCNN) and Transformers. The 3DCNN can repair the spatial information of each gait image while maintaining the temporal coherence between frames. The encoder uses 3D convolutions with a stride of 2 to reduce the dimension of the data so that each element can participate in the current sampling and more detailed information can be retained. The decoder uses skip connections to concatenate features from the encoder, further reducing the loss of detail caused by encoder downsampling. To ensure the temporal and spatial consistency of the entire repaired sequence, multiple Transformers composed of multiscale self-attention modules are added between the encoder and decoder, extracting useful information at global and local scope to repair the gait sequence. Because the 3DCNN performs a global repair, the data in the nonoccluded regions of the repaired gait sequence also change; GSTRNet therefore uses the prior knowledge to take only the repair result for the occluded region from the decoder output and adds it to the original sequence as the output of the network. The Gaitset network is also introduced to extract features from three sequences for a triplet loss that maintains feature consistency between the repaired sequence and the original sequence: occlusion sequences, genuine sequences (other nonoccluded sequences with the same identity as the occlusion sequences), and imposter sequences (sequences whose identities differ from those of the occlusion sequences). On the OU-ISIR gait database, multi-view large population dataset (OU_MVLP), 24 occlusion gait sequences are synthesized as experimental data by simulating various occlusion types in real life, and our algorithm is evaluated using three sets of experiments: the occlusion mode is known and the gallery and probe occlusion modes are consistent; the occlusion mode is known but the gallery and probe occlusion modes are inconsistent; and the occlusion mode is unknown.
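Two of the steps above, fusing the decoder output with the original frame and the triplet loss on extracted features, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the fusion is written as a per-pixel selection by the occlusion mask, the repaired occlusion sequence is assumed to act as the triplet anchor, the hinge form and the `margin` value are hypothetical, and plain lists stand in for Gaitset features.

```python
import math


def composite(original, repaired, mask):
    """Keep original pixels where mask == 0 and repaired pixels where mask == 1."""
    return [
        [r if m == 1 else o for o, r, m in zip(o_row, r_row, m_row)]
        for o_row, r_row, m_row in zip(original, repaired, mask)
    ]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, genuine, imposter, margin=0.2):
    """Hinge triplet loss: pull the anchor toward genuine, push it from imposter.

    margin is a hypothetical value, not taken from the paper.
    """
    return max(0.0, euclidean(anchor, genuine) - euclidean(anchor, imposter) + margin)


original = [[1.0, 0.0], [0.0, 0.0]]   # right column is occluded
repaired = [[0.5, 0.9], [0.2, 0.8]]   # decoder output also alters visible pixels
mask = [[0, 1], [0, 1]]
fused = composite(original, repaired, mask)  # visible pixels stay untouched

bad = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])   # genuine is farther away
good = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # genuine is closer
```

The selection in `composite` equals adding the masked decoder output to the original frame whenever the occluded pixels of the original are zero, which is the reading of "adds it to the original sequence" assumed here.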
Result The results show that the proposed algorithm performs better than existing occlusion sequence repair algorithms. Compared with other repair algorithms, the rank-1 recognition rates of our algorithm in single occlusion mode and non-single occlusion mode when the occlusion mode is known are 4.1% and 4.1% higher, respectively, and the maximum improvement in recognition accuracy is 6.7% under large-area occlusion of the gait silhouette when the occlusion mode is unknown. Compared with gait recognition algorithms such as 3D local convolutional neural networks, the recognition rate in non-single occlusion mode increases by up to about 50%. Conclusion The proposed GSTRNet model repairs gait sequences effectively, to varying degrees, in various occlusion modes and is highly feasible in practice.
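For readers unfamiliar with the rank-1 recognition rate reported above, a minimal sketch of the usual gallery/probe computation follows. The abstract does not specify the distance measure or the evaluation protocol, so the Euclidean distance, the function name `rank1_rate`, and the toy features are assumptions.

```python
import math


def rank1_rate(gallery, probes):
    """Fraction of probes whose nearest gallery feature has the same identity.

    gallery, probes: lists of (identity, feature_vector) pairs.
    """
    hits = 0
    for probe_id, probe_feat in probes:
        # Nearest gallery entry under Euclidean distance (an assumption).
        nearest_id, _ = min(gallery, key=lambda item: math.dist(item[1], probe_feat))
        if nearest_id == probe_id:
            hits += 1
    return hits / len(probes)


gallery = [("A", [0.0, 0.0]), ("B", [10.0, 10.0])]
probes = [("A", [1.0, 1.0]), ("B", [9.0, 9.0]), ("B", [2.0, 2.0])]
rate = rank1_rate(gallery, probes)  # the third probe is matched to "A" by mistake
```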
Keywords
