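Hierarchical association multi-object tracking based on adaptive online discriminative appearance learning

Fang Lan, Yu Fengqin (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122)

Abstract
Objective In complex scenes, frequent and long-term occlusion of targets and identity switches caused by similar target appearance pose many challenges for multi-object tracking. To address the identity switches and trajectory fragmentation that long-term occlusion causes in complex scenes, a hierarchical association multi-object tracking algorithm based on adaptive online discriminative appearance learning is proposed. Method Track confidence is used to divide multi-object tracking into two levels, local association and global association. In local association, reliable tracks with high confidence are associated with the detections of the current frame using appearance and position-size similarity. In global association, unreliable tracks with low confidence are further linked to fragmented tracks by introducing a motion model and a valid association range. When target appearance features are extracted, an incremental linear discriminant analysis method is introduced to address identity switches, and the target appearance model is updated adaptively according to the difference in appearance features between a new sample and the mean of the target samples. Result The method was compared with 10 current multi-object tracking algorithms on three sequences of the public 2D MOT2015 dataset: PETS09-S2L1, TUD-Stadtmitte, and Town-Center. It reduced identity switches and trajectory fragmentation on every sequence. On Town-Center, identity switches decreased by 60, track fragments decreased by 84, and tracking accuracy improved by more than 5.2%. Conclusion The proposed method tracks multiple objects stably and effectively in complex scenes and reduces trajectory fragmentation, and the online linear discriminant appearance learning it introduces handles occlusion-induced identity switches well.
Keywords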
Multi-object tracking based on adaptive online discriminative appearance learning and hierarchical association

Fang Lan, Yu Fengqin (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)

Abstract
Objective Multi-object tracking is an important research topic in computer vision. Although previous studies have addressed a variety of specific problems in multi-object tracking, many challenges remain, such as object detection errors, missed detections, frequent and long-term occlusion of objects in complex scenes, and identity switches between tracked objects with similar appearance, all of which can easily lead to trajectory drift or tracking interruption. As object detection has improved, detection-based tracking methods have shown good performance. The key to a tracking-by-detection algorithm is the data association between detection points, which takes two main forms: frame-by-frame association and multi-frame association. Frame-by-frame data association links detection points in two consecutive frames according to their properties, such as appearance, location, and size. Because it uses information from only two frames, tracking drift or failure is likely when an object is occluded, misdetected, or similar in appearance to another object. Multi-frame data association instead builds a relational model from the detection information of multiple frames, which effectively reduces erroneous associations and helps handle occlusion. However, if the occlusion lasts longer than the time segment used for multi-frame data association, the detection points before and after the occlusion still cannot be associated, and tracking is interrupted. Moreover, this method needs all detection information before tracking begins, so it cannot meet real-time requirements.
To address the ID switches and trajectory fragmentation caused by long-term occlusion, an online multi-object tracking algorithm based on adaptive online discriminative appearance learning and hierarchical association is proposed for complex scenes. The algorithm combines the low-level appearance and position-size characteristics used in local association with the high-level motion model established in global association, and it can meet real-time tracking requirements. Method In this study, multi-object tracking is divided into two stages according to track confidence: local association and global association. A robust object appearance model is the key to both stages. To address identity switches, an online incremental linear discriminant analysis (ILDA) method is introduced to discriminate between object appearances, and the object appearance models are updated adaptively according to the difference between a new sample and the mean of the object samples. In the local association stage, reliable tracklets with high confidence are associated with the detections of the current frame using low-level properties of the detection points, namely appearance and position-size similarity, which allows reliable trajectories to grow continuously. In the global association stage, unreliable tracklets with low confidence, which result from long-term occlusion, are associated further. In this stage, there are two kinds of candidate objects: one is detection points that were not associated in local association, and the other is continuous trajectories with high confidence that meet the time condition, namely that the end time of the trajectory is before the current time. When detection points that reappear after long-term occlusion are associated, only appearance similarity within a validation range is used, without the position-size property, because the motion dynamics of unreliable objects are themselves unreliable.
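The two-stage split described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the confidence threshold of 0.5, the exponential form of the position-size similarity, and the product used to combine the two cues are all assumptions chosen for illustration, and the names `Box`, `split_by_confidence`, `position_size_similarity`, and `local_affinity` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    x: float  # centre x
    y: float  # centre y
    w: float  # width
    h: float  # height


def split_by_confidence(confidences, threshold=0.5):
    """Route track ids to local (reliable) or global (unreliable) association.

    The threshold value is an assumption, not taken from the paper.
    """
    reliable = [tid for tid, c in confidences.items() if c >= threshold]
    unreliable = [tid for tid, c in confidences.items() if c < threshold]
    return reliable, unreliable


def position_size_similarity(track, det):
    """Assumed similarity in (0, 1]: 1 for identical boxes, decaying with
    the normalised centre offset and the relative change in size."""
    dx = (track.x - det.x) / track.w
    dy = (track.y - det.y) / track.h
    dw = (track.w - det.w) / (track.w + det.w)
    dh = (track.h - det.h) / (track.h + det.h)
    return math.exp(-(dx * dx + dy * dy)) * math.exp(-(dw * dw + dh * dh))


def local_affinity(appearance_sim, pos_size_sim):
    """Combine the two low-level cues; the product is an assumed choice."""
    return appearance_sim * pos_size_sim
```

For example, `split_by_confidence({1: 0.9, 2: 0.2})` sends track 1 to local association and track 2 to global association. In the global stage only the appearance term would be used for reappearing detections, as the text describes.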
In addition, a valid association range tied to the trajectory confidence is introduced. When the track confidence falls, the valid association range is enlarged, because the distance between a drifting track and its object can grow large if the drift persists. This allows drifting tracks to be reassigned to detections of reappearing objects even when those detections are far from the corresponding tracks. When two track fragments are associated, a motion model is introduced to determine whether the two trajectories belong to the same object. If the angle between the average velocity vectors of the two track fragments is larger than a threshold, the pair may include an unreliable track, so only the appearance similarity between the pair is considered. Otherwise, appearance, position-size, and motion similarity are combined to associate the pair. If two track fragments are associated successfully, linear interpolation is used to fill the interval in which the object was lost, so that the two fragments are connected effectively. Result We compared our method with 10 state-of-the-art multi-object tracking algorithms, comprising five offline and five online tracking methods, on three public datasets: PETS09-S2L1, TUD-Stadtmitte, and Town-Center. The quantitative evaluation metrics were multi-object tracking accuracy (MOTA), multi-object tracking precision (MOTP), the number of identity switches (IDS), the ratio of mostly tracked trajectories (MT), the ratio of mostly lost trajectories (ML), and the number of track fragmentations (Frag). The experimental results show that our method outperforms the selected online multi-object tracking methods, which include two approaches based on hierarchical association, in MOTA and MOTP. In addition, the proposed approach performs almost as well as, or even better than, the offline tracking methods.
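The fragment-linking step described above (a confidence-dependent association range, a velocity-angle check, and linear interpolation of the lost interval) can be sketched in Python as follows. This is only an illustration under stated assumptions: the linear growth rule in `valid_range` is hypothetical, the abstract does not give the angle threshold, and all function names are invented for this example.

```python
import math


def valid_range(base_radius, confidence):
    """Assumed rule: the range grows linearly as confidence (in [0, 1]) drops."""
    return base_radius * (2.0 - confidence)


def velocity_angle(v1, v2):
    """Angle in radians between two 2D average velocity vectors."""
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # a stationary fragment carries no direction information
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.acos(max(-1.0, min(1.0, cos)))


def use_motion_cue(v1, v2, angle_threshold):
    """Combine motion with the other cues only when the directions agree."""
    return velocity_angle(v1, v2) <= angle_threshold


def interpolate_gap(end_frame, end_pos, start_frame, start_pos):
    """Linearly fill positions for the frames strictly between two fragments."""
    span = start_frame - end_frame
    filled = []
    for frame in range(end_frame + 1, start_frame):
        t = (frame - end_frame) / span
        x = end_pos[0] + t * (start_pos[0] - end_pos[0])
        y = end_pos[1] + t * (start_pos[1] - end_pos[1])
        filled.append((frame, x, y))
    return filled
```

For instance, `interpolate_gap(10, (0.0, 0.0), 14, (4.0, 8.0))` yields positions for frames 11 to 13 on the straight line between the two endpoints.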
On the PETS09-S2L1 dataset, the proposed approach is superior to the compared methods in MOTP, IDS, and Frag: MOTP increased by 6.1%, IDS decreased by 5, and Frag decreased by 21. On the TUD-Stadtmitte dataset, IDS decreased by 4, and compared with the online tracking approaches, MOTP and MOTA increased by 36.3% and 11.1%, respectively. On the Town-Center dataset, MOTA and MT increased by 5.2% and 16.9%, respectively, IDS and Frag decreased by 60 and 84, respectively, and ML decreased by 1.5%. Conclusion Following the idea of hierarchical data association, this study proposes a multi-object tracking method based on adaptive online discriminative appearance learning and hierarchical association. The experimental results indicate that the method effectively addresses the ID switches and trajectory fragmentation caused by long-term occlusion in complex scenes.
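The abstract reports MOTA without giving its formula. As a reminder, the standard CLEAR MOT definition subtracts the summed error rate (missed objects, false positives, and identity switches over all frames) from 1. The sketch below assumes that standard definition applies here; the function name and the example counts are illustrative only and are not results from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multi-object tracking accuracy under the standard CLEAR MOT definition.

    All arguments are counts summed over every frame of a sequence.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth
```

Under this definition, every identity switch removed from a sequence raises MOTA by one over the number of ground-truth objects.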
Keywords
