To address the segmentation errors that existing video segmentation algorithms produce in complex and dynamic scenes,
the proposed method extracts spatial-temporal attention features using saliency maps
and adopts a hierarchical conditional random field for video segmentation. First,
the algorithm uses information theory to construct a weighted combination model based on spatial-temporal features. Then,
it uses this model to compute the probability distribution of the saliency maps,
which effectively locates the region of the moving object. Finally,
a Gaussian mixture model is adopted to construct energy functions from the above probability distribution,
and the hierarchical conditional random field is used to constrain these feature energy functions and refine the final segmentation. The experimental results show that the algorithm avoids the segmentation errors induced by camera movement, so it is robust for videos of complex and dynamic scenes.
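
As an illustration of the weighted combination step, the following is a minimal sketch and not the formulation used in this work. It assumes that each saliency map is weighted by the inverse of its Shannon entropy, so that a more concentrated map contributes more, and that the fused map is normalised into a probability distribution over pixels. The function names `map_entropy` and `fuse_saliency`, the bin count, and the inverse-entropy weighting rule are all hypothetical choices made for this example.

```python
import numpy as np


def map_entropy(saliency, bins=16):
    """Shannon entropy (in bits) of the value histogram of a map in [0, 1]."""
    hist, _ = np.histogram(saliency, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing to the entropy
    return float(-(p * np.log2(p)).sum())


def fuse_saliency(spatial, temporal, bins=16, eps=1e-12):
    """Fuse a spatial and a temporal saliency map.

    Assumption (not taken from the paper): each map is weighted by the
    inverse of its entropy, and the weights are normalised to sum to one.
    Returns the per-pixel probability map and the two weights.
    """
    entropies = np.array([map_entropy(spatial, bins),
                          map_entropy(temporal, bins)])
    inverse = 1.0 / (entropies + eps)
    weights = inverse / inverse.sum()
    fused = weights[0] * spatial + weights[1] * temporal
    prob = fused / (fused.sum() + eps)  # normalise to a distribution
    return prob, weights
```

In this sketch, a noisy spatial map paired with a temporal map that is non-zero only over a small moving region gives the temporal map the larger weight, and the peak of the resulting probability map falls inside that region.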