Player-aware tracking algorithm in soccer video

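Feng Sijia1, Song Zikai1, Yu Junqing1,2, He Yunfeng1, Guan Tao1 (1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074; 2. Center of Network and Computation, Huazhong University of Science and Technology, Wuhan 430074)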

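Abstract
Objective Player tracking in soccer match video provides basic data support for soccer match analysis. Player tracking in soccer matches is highly challenging, however: when players attack, defend, and contest possession, the target player may move rapidly, be severely occluded, and be surrounded by several distracting players, and no existing algorithm fully solves the problem. How to handle these difficulties of the soccer scene and improve tracking accuracy has therefore become an active research topic. Method Based on an analysis of the characteristics of player targets in soccer match video, this paper proposes and designs a player-aware tracking algorithm that fuses a distractor-aware color model with a target-aware deep model. The distractor-aware color model extracts color histograms of the target, the background, and the distractors, and uses Bayes' formula to obtain the likelihood that each pixel in the search region belongs to the target. The target-aware deep model uses a Siamese network to compute the similarity between the search region and the target. To address tracking drift, a global tracker and a local tracker track the whole target and its upper body, respectively; when their results differ substantially, the validity of each tracker is analyzed and the location is corrected. Result The proposed algorithm is compared with 10 other tracking algorithms on a public soccer dataset, and an ablation experiment on the local tracker is conducted for the proposed algorithm. The experimental results show that the average valid overlap rate of the player-aware tracking algorithm reaches 0.560 3. Under interference from same-team and opposing-team players, its valid overlap rate is 3.7% and 6.6% higher, respectively, than that of the second-ranked algorithm, clearly outperforming the other algorithms. However, the distractor-aware color model, the target-aware deep model, and the local tracker increase the time complexity of the algorithm, so its tracking speed is relatively slow. Conclusion This paper summarizes the overall pipeline of the tracking algorithm and analyzes the experimental results. It concludes that the three strategies of distractor awareness, target awareness, and local tracking play an important role in player tracking in soccer scenes, providing a reference for further research on soccer player tracking.
Keywords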
Players-aware tracking algorithm in soccer video

Feng Sijia1, Song Zikai1, Yu Junqing1,2, He Yunfeng1, Guan Tao1(1.School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;2.Center of Network and Computation, Huazhong University of Science and Technology, Wuhan 430074, China)

Abstract
Objective Object tracking is an important task in computer vision. Player-tracking algorithms for broadcast soccer videos provide basic data support for the analysis of soccer matches. Soccer player tracking faces several challenges, including rapid movement of the target player, occlusion, and interference from similar players when players attack, defend, and scramble for the ball, and no tracking algorithm yet fully solves the problem for soccer video. The following challenges remain in player tracking in broadcast soccer videos: 1) The target player occupies only a small patch of the video frame, which hinders feature extraction. 2) Similar players often interfere with the target player. 3) The target player is often occluded by other players, requiring the algorithm to distinguish intra-class targets. 4) Relocating the target after tracking drift is difficult. Thus, how to handle the challenges of the soccer scene and improve the accuracy of player tracking is a prevalent topic in current research. Method Based on an in-depth analysis of the characteristics of soccer players, we propose and design a player-aware tracking algorithm that fuses a distractor-aware color model and a target-aware deep model. In the color model, the color histograms of the target player, background, and distractors are extracted. The color model, based on a Bayesian classifier, aims to separate the foreground target from the background by using color information in the search region. The three primary color components of the RGB color space are divided into 16 color intervals by uniform quantization. The color histogram of a region is obtained by counting the number of pixels in each color interval. Distractors are non-target candidate regions whose similarity scores in the response map exceed a certain threshold.
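The histogram-based Bayesian color model described above can be illustrated with a minimal sketch. It assumes 16 uniform intervals per RGB channel and equal-weight pixel counts; the function names (`bin_index`, `histogram`, `target_likelihood`) and the neutral value 0.5 for unseen colors are illustrative assumptions, not details given in the abstract. With histograms H_t (target) and H_o (background or distractor), Bayes' formula with priors proportional to region size reduces to H_t(b) / (H_t(b) + H_o(b)) for color bin b.

```python
from collections import Counter

BINS = 16  # assumption: 16 uniform intervals per RGB channel


def bin_index(pixel, bins=BINS):
    """Map an 8-bit (r, g, b) pixel to a single histogram bin index."""
    step = 256 // bins
    r, g, b = (c // step for c in pixel)
    return (r * bins + g) * bins + b


def histogram(pixels, bins=BINS):
    """Count how many pixels of a region fall into each color bin."""
    return Counter(bin_index(p, bins) for p in pixels)


def target_likelihood(pixel, target_hist, other_hist, bins=BINS):
    """Likelihood that a pixel belongs to the target rather than the other region.

    Uses H_t(b) / (H_t(b) + H_o(b)); colors seen in neither region get a
    neutral 0.5 (an assumption made for this sketch).
    """
    b = bin_index(pixel, bins)
    t, o = target_hist[b], other_hist[b]
    return t / (t + o) if t + o > 0 else 0.5


# Toy regions: a mostly red target on a green background.
target_pixels = [(200, 10, 10)] * 3 + [(10, 200, 10)]
background_pixels = [(10, 200, 10)] * 3
t_hist = histogram(target_pixels)
b_hist = histogram(background_pixels)
red_score = target_likelihood((205, 12, 8), t_hist, b_hist)
green_score = target_likelihood((10, 200, 10), t_hist, b_hist)
```

The same function can be reused for the target-distractor case by passing the distractor histogram as `other_hist`.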
As with the foreground-background color model, the color histograms of the target and the distractors are computed, and the likelihood that a pixel belongs to the target under the target-distractor model is obtained. In the deep model, Siamese networks are adopted to calculate the similarity between the search and target regions. The target-aware deep model embeds deep features into the Siamese network and calculates the similarity between the outputs of the template branch and the detection branch to obtain a response map of the search region. The well-known Visual Geometry Group (VGG) feature extraction network is adopted as the backbone network. In feature space, each channel has a different representation capability, and specific combinations of channels can recognize specific categories. The response of one category is concentrated in specific deep-feature channels rather than in all feature channels. For the player currently being tracked, we design a small regression network to select the feature channels related to that player from the VGG deep features. The small regression network consists of one convolution layer with a single convolution kernel, whose size is the same as that of the target feature. The regression network aims to fit the features of the target sample to a Gaussian distribution. In addition, to solve the problem of tracking drift, a global-local tracking strategy is designed to track the entire target and the upper part of the target. The global and local trackers have the same network architecture, including a distractor-aware color model branch and a target-aware deep model branch. When the tracking results of the global and local trackers differ greatly, the effectiveness of each tracker is analyzed and the location is revised. In online tracking, the global and local trackers track the whole target and its upper part, respectively.
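The Siamese similarity step with channel selection can be sketched as a plain cross-correlation restricted to a subset of feature channels. This is only an illustration under stated assumptions: the per-channel importance scores are taken as given (the abstract derives the selection from a small regression network, which is not reproduced here), features are nested lists indexed as `[channel][row][col]`, and the helper names `top_k_channels`, `cross_correlate`, and `argmax_2d` are hypothetical.

```python
def top_k_channels(importance, k):
    """Indices of the k channels with the largest importance scores."""
    ranked = sorted(range(len(importance)), key=lambda c: importance[c], reverse=True)
    return sorted(ranked[:k])


def cross_correlate(search, template, channels):
    """Slide the template over the search features.

    At each offset, sum the element-wise products over the chosen channels.
    """
    th, tw = len(template[0]), len(template[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    response = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            score = 0.0
            for c in channels:
                for dy in range(th):
                    for dx in range(tw):
                        score += search[c][y + dy][x + dx] * template[c][dy][dx]
            row.append(score)
        response.append(row)
    return response


def argmax_2d(response):
    """(row, col) of the peak of the response map."""
    best = max((v, y, x) for y, row in enumerate(response) for x, v in enumerate(row))
    return best[1], best[2]


# Channel 0 responds to the target; channel 1 responds to a distractor.
template = [[[1, 1], [1, 1]], [[5, 5], [5, 5]]]
search = [
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],  # target sits at the bottom-right
    [[1, 1, 0], [1, 1, 0], [0, 0, 0]],  # distractor sits at the top-left
]
selected = top_k_channels([0.9, 0.1], k=1)
peak_selected = argmax_2d(cross_correlate(search, template, selected))
peak_all = argmax_2d(cross_correlate(search, template, [0, 1]))
```

In this toy case the peak lands on the target when only the selected channel is used and on the distractor when all channels are used, which is the motivation the text gives for keeping only target-related channels.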
When one tracker drifts, the other is used to revise the target position. According to the intersection over union of the target boxes of the global and local trackers, the tracking results are classified into stable and unstable states. In a stable state, the intersection over union of the target boxes of the local and global trackers is greater than a certain threshold; in an unstable state, it is less than that threshold. In the unstable state, the following factors are considered simultaneously to analyze each tracker: the main color similarity of the target between the current and initial frames, the maximum value of the response map, and the distance moved by the box center from the previous frame to the current frame. The lower the main color similarity, the more likely the tracker has drifted to a non-target player. The smaller the maximum value of the response map, the lower the reliability of the tracker. A moving distance of the tracker box greater than a certain threshold indicates that the tracker has likely drifted suddenly in the current frame. Result We select 10 state-of-the-art tracking algorithms and compare them with the proposed algorithm on a public soccer dataset. An ablation experiment on the global-local tracking strategy is also conducted. Experimental results show that the average valid overlap rate of the proposed tracking algorithm is 0.560 3. When the target player is occluded by players of the same team and of different teams, the average valid overlap rate of the proposed algorithm is 3.7% and 6.6% higher, respectively, than that of the second-ranked algorithm. The evaluation results demonstrate that the player-aware tracking algorithm is more effective than the other algorithms in handling interference from similar players. However, the tracking speed is slow because the color model, deep model, and global-local tracking strategy increase the computational complexity.
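The stable/unstable decision and the three reliability cues can be sketched as follows. The abstract does not give threshold values or say how the cues are combined, so every threshold below and the simple warning count are assumptions made for illustration; boxes are assumed to be `(x, y, width, height)` tuples.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_stable(global_box, local_box, iou_threshold=0.3):
    """Stable when the global and local boxes overlap more than the threshold.

    The value 0.3 is a placeholder, not a value from the paper.
    """
    return iou(global_box, local_box) > iou_threshold


def warning_count(color_similarity, peak_response, move_distance,
                  min_similarity=0.5, min_peak=0.3, max_move=20.0):
    """Number of cues suggesting that a tracker has drifted.

    Cues: low main-color similarity to the initial frame, low peak of the
    response map, and a large jump of the box center between frames.
    All three thresholds are hypothetical.
    """
    return sum([
        color_similarity < min_similarity,
        peak_response < min_peak,
        move_distance > max_move,
    ])


# Global box covers the whole player; local box covers the upper half.
whole_body = (0, 0, 10, 20)
upper_body = (0, 0, 10, 10)
drifted_upper_body = (30, 0, 10, 10)
overlap = iou(whole_body, upper_body)
```

In an unstable frame, one plausible use is to compute `warning_count` for each tracker and let the tracker with fewer warnings correct the other; the paper's actual decision rule may differ.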
Conclusion We summarize the entire process of the proposed tracking algorithm and analyze the experimental results. Three strategies, namely, the distractor-aware color model, the target-aware deep model, and the global-local tracking strategy, are shown to play a crucial role in player tracking. In the color model, the color histograms of the target player, background, and distractors are extracted, and the likelihood that each pixel in the search region belongs to the target is calculated by using Bayes' formula. In the deep model, a small regression network selects the feature channels related to the target object from the deep features, and the Siamese network calculates the similarity between the search region and the target object. To alleviate tracking drift, we use the global-local strategy to track the whole target and its upper body so that a failed localization can be revised. This study provides a basic reference for further research on player tracking in broadcast soccer videos.
Keywords
