Players-aware tracking algorithm in soccer video

Sijia Feng; Zikai Song; Junqing Yu; Yunfeng He; Tao Guan

doi:10.11834/jig.200507

Image Understanding and Computer Vision

Players-aware tracking algorithm in soccer video
2021, Vol. 26, No. 7, pp. 1668-1680
Print publication date: 2021-07-16

Accepted: 2021-01-06
DOI: 10.11834/jig.200507

Sijia Feng, Zikai Song, Junqing Yu, Yunfeng He, Tao Guan. Players-aware tracking algorithm in soccer video[J]. Journal of Image and Graphics, 2021, 26(7): 1668-1680. DOI: 10.11834/jig.200507.

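Abstract (translated from the Chinese)

Objective

Player tracking algorithms for soccer match video provide basic data support for soccer match analysis. However, tracking players in soccer matches is highly challenging: when players attack, defend, and contest possession, the target player may move rapidly, be severely occluded, and be surrounded by several distracting players, and no existing algorithm fully solves the player tracking problem in soccer matches. How to handle the difficulties of the soccer scene and improve the accuracy of player tracking has therefore become an active research topic.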

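Method

Based on an analysis of the characteristics of player targets in soccer match video, this paper proposes and designs a player-aware tracking algorithm that fuses a distractor-aware color model with a target-aware deep model. The distractor-aware color model extracts color histograms of the target, the background, and the distractors, and uses Bayes' formula to obtain, for each pixel in the search region, the likelihood that it belongs to the target. The target-aware deep model uses a Siamese network to compute the similarity between the search region and the target. To address tracking drift, a global tracker and a local tracker track the whole target and its upper body, respectively; when the results of the two trackers differ substantially, the validity of each tracker is analyzed and the location is corrected.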
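The distractor-aware color model can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Only the use of separate color histograms for target, background, and distractors, the uniform quantization into 16 intervals, and Bayes' formula come from the abstract. The joint 16x16x16 RGB histogram, the neutral value 0.5 for colors seen in neither class, the helper names (`bin_index`, `color_histogram`, `target_likelihood`, `fused_likelihood`), and the weight `lam` that blends the two likelihoods are assumptions made for illustration.

```python
import numpy as np

BINS = 16  # uniform quantization intervals per RGB channel


def bin_index(pixels: np.ndarray, bins: int = BINS) -> np.ndarray:
    """Map (N, 3) uint8 RGB pixels to a flat index into a joint histogram."""
    q = pixels.astype(np.int64) * bins // 256  # per-channel bin in [0, bins)
    return (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]


def color_histogram(pixels: np.ndarray, bins: int = BINS) -> np.ndarray:
    """Count how many pixels fall into each quantized color bin."""
    return np.bincount(bin_index(pixels, bins), minlength=bins ** 3).astype(np.float64)


def target_likelihood(search_pixels, hist_target, hist_other, bins: int = BINS):
    """Bayes' rule per color bin: P(target | bin) = H_t / (H_t + H_o).

    Bins observed in neither histogram get the neutral value 0.5 (assumption).
    """
    idx = bin_index(search_pixels, bins)
    t, o = hist_target[idx], hist_other[idx]
    total = t + o
    return np.where(total > 0, t / np.maximum(total, 1e-12), 0.5)


def fused_likelihood(search_pixels, hist_target, hist_background, hist_distractor, lam=0.5):
    """Blend target-vs-distractor and target-vs-background likelihoods.

    The linear blend with weight `lam` is an assumption, not taken from the paper.
    """
    p_background = target_likelihood(search_pixels, hist_target, hist_background)
    p_distractor = target_likelihood(search_pixels, hist_target, hist_distractor)
    return lam * p_distractor + (1.0 - lam) * p_background
```

In this sketch, a pixel whose color is shared by the target and a distractor receives a lower score than a pixel whose color appears only on the target, which is the intended effect of making the color model distractor-aware.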

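Result

The proposed algorithm is compared with 10 other tracking algorithms on a public soccer dataset, and an ablation experiment on the local tracker is also conducted. Experimental results show that the player-aware tracking algorithm reaches an average valid overlap rate of 0.5603. Under interference from same-team players and from opposing-team players, its valid overlap rate is 3.7% and 6.6% higher, respectively, than that of the second-ranked algorithm, clearly outperforming the other algorithms. However, the distractor-aware color model, the target-aware deep model, and the local tracker add to the time complexity of the algorithm, so its tracking speed is relatively slow.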

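Conclusion

This paper summarizes the overall pipeline of the tracking algorithm and analyzes the experimental results. It concludes that three strategies, distractor awareness, target awareness, and local tracking, play an important role in player tracking in soccer scenes, and it provides a reference for further research on soccer player tracking.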

Abstract

Objective

Target object tracking is important in computer vision. Player-tracking algorithms in broadcast soccer videos provide basic data support for the analysis of soccer matches. Several challenges occur in soccer player tracking, including rapid movement of the target player, occlusion, and disturbance by similar players when they attack, defend, and scramble for the ball. However, no perfect tracking algorithm specifically for soccer video is available. The following challenges remain in the player tracking of broadcast soccer videos: 1) The target player occupies only a small patch of the video frame, which is not conducive to feature extraction. 2) Similar players often interfere with the target player. 3) Occlusion of the target player by other players often occurs, requiring the algorithm to distinguish intra-class targets. 4) Relocating the target after tracking drift is difficult. Thus, a prevalent topic in current research is how to handle the challenges in the soccer scene and improve the accuracy of player tracking.

Method

Based on an in-depth analysis of the characteristics of soccer players, we propose and design a player-aware tracking algorithm by fusing a distractor-aware color model and a target-aware deep model. In the color model, the color histograms of the target player, background, and distractors are extracted. The color model, based on the Bayesian classifier, aims to identify the foreground target from the background by color information in the search region. Three primary color components in the RGB color space are divided into 16 color intervals by uniform quantization. The color histogram of the corresponding region can be obtained by counting the number of pixels in each color interval. Distractors are non-target candidate regions whose similarity scores are larger than a certain threshold in the response map. As with the foreground-background color model, the color histograms of the target and distractors are computed, and the likelihood that a pixel belongs to the target in the target-distractor term is obtained.

In the deep model, Siamese networks are adopted to calculate the similarity between the search and target regions. The target-aware deep model embeds deep features into the Siamese network and calculates the similarity between the outputs of the template branch and the detection branch to obtain a response map of the search region. The well-known Visual Geometry Group (VGG) feature extraction network is adopted as the backbone network. In feature space, each feature channel has a different representation capability, and specific combinations of features can recognize specific categories. The response of one category focuses only on specific deep-feature channels rather than on all feature channels. For the player currently being tracked, we design a small regression network to select the feature channels related to that player from the VGG deep features. The small regression network consists of one convolution layer with one convolution kernel, whose size is the same as that of the target feature. The regression network aims to fit the features of the target sample to a Gaussian distribution.

In addition, to solve the problem of tracking drift, a global-local tracking strategy is designed to track the entire target and the upper part of the target. The global and local trackers have the same network architecture, including a distractor-aware color model branch and a target-aware deep model branch. When the tracking results of the global and local trackers differ greatly, the effectiveness of each tracker is analyzed and location revision is performed. In online tracking, the global and local trackers track the whole target and its upper part, respectively. When one tracker drifts, the other is used to revise the target position. According to the intersection over union of the target boxes of the global and local trackers, the tracking results are classified into stable and unstable states. A stable state is one in which the intersection over union of the target boxes of the local and global trackers is greater than a certain threshold, while an unstable state is one in which it is less than that threshold. In the unstable state, the following factors are considered simultaneously to analyze each tracker: the main color similarity of the target in the current and initial frames, the maximum response value of the response map, and the moving distance from the box center in the previous frame to that in the current frame. The lower the main color similarity, the more likely the tracker has drifted to a non-target player. The smaller the maximum response value of the response map, the lower the reliability of the tracker. A moving distance of the tracker box greater than a certain threshold indicates that the tracker is likely to have undergone a sudden tracking drift in the current frame.
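The stable/unstable decision of the global-local strategy can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code. The abstract only says that the state is decided by thresholding the intersection over union (IoU) of the two boxes and that color similarity, maximum response, and moving distance are considered together in the unstable state. The threshold values, the `TrackerOutput` container, and the way the three cues are combined in `reliability` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class TrackerOutput:
    box: Box                           # box predicted in the current frame
    color_similarity: float            # main-color similarity to the initial frame, in [0, 1]
    max_response: float                # peak value of the response map
    prev_center: Tuple[float, float]   # box center in the previous frame


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def center(box: Box) -> Tuple[float, float]:
    return (box[0] + box[2] / 2.0, box[1] + box[3] / 2.0)


def reliability(out: TrackerOutput, max_move: float = 20.0) -> float:
    """Hypothetical score: a sudden jump is treated as drift, otherwise
    the color similarity and the peak response are multiplied."""
    moved = math.dist(center(out.box), out.prev_center)
    if moved > max_move:
        return 0.0
    return out.color_similarity * out.max_response


def select_tracker(global_out: TrackerOutput, local_out: TrackerOutput,
                   iou_thresh: float = 0.3) -> str:
    """Return 'stable', or the name of the tracker to trust when unstable."""
    if iou(global_out.box, local_out.box) > iou_thresh:
        return "stable"
    if reliability(global_out) >= reliability(local_out):
        return "global"
    return "local"
```

In the unstable case, the tracker judged less reliable would be repositioned from the other tracker's box; how that revision is carried out is not specified in the abstract and is left out of the sketch.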

Result

We select 10 state-of-the-art tracking algorithms and compare them with the proposed algorithm on the public soccer dataset. An ablation experiment on the global-local tracking strategy is also conducted. Experimental results show that the average valid overlap rate of the proposed tracking algorithm is 0.5603, and when the target player is occluded by players in the same team and different teams, the average valid overlap rate of the proposed algorithm is 3.7% and 6.6% higher than that of the second-ranked algorithm, respectively. The evaluation results demonstrate that the player-aware tracking algorithm is more effective than other algorithms in addressing the disturbance by other similar players. However, the tracking speed is slow because introducing the color model, deep model, and global-local tracking strategy increases the computational complexity.

Conclusion

We summarize the entire process of the proposed tracking algorithm and analyze the experimental results. Three strategies, namely, the distractor-aware color model, the target-aware deep model, and the global-local tracking strategy, are demonstrated to play a crucial role in player tracking. In the color model, the color histograms of the target player, background, and distractors are extracted, and the likelihood that each pixel in the search region belongs to the target is calculated by using the Bayesian formula. In the deep model, a small regression network is adopted to select feature channels related to the target object from the deep features, and the Siamese network is used to calculate the similarity between the search region and the target object. To alleviate tracking drift, we use the global-local strategy to track the whole target and the upper body of the target so that a failed localization can be revised. This study provides a basic reference for further research on player tracking in broadcast soccer videos.

Keywords

computer vision; image processing; object tracking; player tracking; distractor aware; target aware; global-local tracking strategy
