the state of the target in every video frame is linearly represented using several online learned templates. Because of the complex interference factors caused by the target itself or by the scene, the modeling ability of the tracker depends heavily on the generalizability of the template data and on the precision of its error estimation. Many existing algorithms represent the samples in vector form and thereby artificially alter the original data structure, so that the natural spatial relationships among the pixels of a sample are severely damaged. In addition,
such a data representation mechanism may enlarge the data dimensionality, which significantly increases the computational complexity and wastes resources. This paper investigates the data representation and observation modeling mechanisms of the video tracking framework and provides a more compact and effective solution based on multilinear analysis. In our framework,
the candidate samples and their reconstructed signals are expressed in tensor form to maintain the original structure of the data.
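As a rough illustration of this representation choice (the patch size, candidate count, and variable names below are illustrative assumptions, not values from the paper), keeping the candidates in tensor form amounts to stacking the 2-D patches along a third mode instead of flattening each patch into a column vector:

```python
import numpy as np

# Illustrative sketch only: sizes and names are assumptions, not the paper's.
patches = [np.random.rand(32, 32) for _ in range(50)]   # 50 candidate patches

# Vector form: each 32x32 patch becomes a 1024-dimensional column,
# destroying pixel adjacency within the patch.
X_vec = np.stack([p.ravel() for p in patches], axis=1)   # shape (1024, 50)

# Tensor form: rows x columns x candidates, pixel neighborhoods kept intact.
X_tensor = np.stack(patches, axis=2)                      # shape (32, 32, 50)
```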
When the tracker outputs the candidate appearance models, the modeling tasks of the tracking system are organized by exploiting the multilinear properties of the tensor structures. The objective function is regularized with the tensor nuclear norm and the L norm in order to fully exploit the independence and interdependence of the observation models under a multitask state learning assumption. The structured tensor form used for the data prototypes and observation models effectively addresses the data representation problems and the computational complexity of the tracking system, and it also provides a simpler and more effective solution for the multitask joint learning of the candidate appearance models.
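The abstract does not give the explicit formulation; a generic sketch of such a regularized multitask objective, with the unspecified L norm left abstract, could take the following form (our assumption, not the paper's exact model):

\[
\min_{\mathcal{Z},\,\mathcal{E}} \;\tfrac{1}{2}\,\bigl\|\mathcal{X}-\mathcal{Z}-\mathcal{E}\bigr\|_F^2 \;+\; \lambda_1\,\|\mathcal{Z}\|_{*} \;+\; \lambda_2\,\|\mathcal{E}\|_{L},
\]

where \(\mathcal{X}\) stacks the candidate observations in tensor form, \(\mathcal{Z}\) is the jointly learned low-rank reconstruction shared across the candidate tasks, \(\mathcal{E}\) collects the estimation errors, \(\|\cdot\|_{*}\) denotes the tensor nuclear norm, \(\|\cdot\|_{L}\) stands for the L norm mentioned above, and \(\lambda_1,\lambda_2\) balance the two regularizers.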
When the tracker encounters destructive noise interference, the tensor-nuclear-norm-constrained error estimation mechanism in the multitask joint learning framework fully exploits the comprehensive information of the target,
thereby allowing the tracker to adapt to various changes in visual information caused by intrinsic or extrinsic factors. Extensive experiments on several challenging image sequences demonstrate that the proposed method achieves more robust object model representation: averaged over all image sequences, it reaches a center location error of 4.2 and an overlap rate of 0.82, better than several state-of-the-art tracking algorithms. The tensor nuclear norm regression model and the error estimation mechanism of our algorithm obtain candidate states that closely match the actual object states in real time, and the tracker strictly identifies the true state of each candidate in the multitask learning framework,
thereby providing a better solution to the model degradation and drifting problems.
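As a concrete reference for the tensor nuclear norm used above, one widely used surrogate is the sum of the nuclear norms of the mode-n unfoldings; the following sketch assumes this surrogate and illustrative tensor sizes, neither of which is confirmed by the abstract:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move the given mode to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def sum_of_nuclear_norms(tensor):
    """Sum-of-nuclear-norms surrogate for the tensor nuclear norm
    (an assumed choice, not necessarily the paper's): the matrix nuclear
    norm of each mode unfolding, summed over all modes."""
    return sum(
        np.linalg.norm(unfold(tensor, mode), ord="nuc")
        for mode in range(tensor.ndim)
    )

# Example: the regularizer favors candidates whose stacked reconstructions
# form a low-rank, mutually consistent tensor.
X_tensor = np.random.rand(32, 32, 50)
print(sum_of_nuclear_norms(X_tensor))
```

Because permuting the columns of an unfolding does not change its singular values, the value returned is independent of the particular unfolding convention.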