Survey of visual object tracking algorithms based on deep learning
2019, Vol. 24, No. 12, pp. 2057-2080
Received: 2019-07-15,
Revised: 2019-08-19,
Accepted: 2019-08-26,
Print publication date: 2019-12-16
DOI: 10.11834/jig.190372
Object tracking is a fundamental problem in computer vision. It uses the context information in a video or image sequence to model the appearance and motion of a target, and from that model predicts the target's motion state and locates its position. It has both theoretical significance and practical value, and is widely used in intelligent video surveillance systems, intelligent human-computer interaction, intelligent transportation, visual navigation systems, and many other areas. The advent of the big data era and the emergence of deep learning methods have opened new opportunities for object tracking research and substantially improved tracking performance. In this paper, we first introduce the basic research framework of object tracking and review the history of object tracking from the perspective of the observation model, indicating that deep learning makes it possible to obtain a more robust observation model. We then review the deep learning methods that are suitable for object tracking, covering deep discriminative models and deep generative models. We classify the existing deep object tracking methods from the perspectives of network structure, network function, and network training, and analyze them in depth. In addition, we introduce several other deep object tracking methods, including those based on the fusion of classification and regression, on reinforcement learning, on ensemble learning, and on meta-learning. We then present the main databases currently used for deep object tracking and their evaluation methods. We also analyze and summarize the latest specific applications of object tracking, including mobile tracking systems and systems based on detection and tracking. Finally, we analyze the open problems of deep learning in object tracking, including insufficient training data, real-time tracking, and long-term tracking, and outline future research directions for deep object tracking.