Multi-motion visual odometry based on split-merged motion segmentation

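Wang Chenjie, Zhang Yun, Zhao Qing, Wang Wei, Yin Lu, Luo Bin, Zhang Liangpei (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079)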

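Abstract
Objective Visual odometry (VO) needs only an ordinary camera to achieve autonomous localization with considerable accuracy, and it has become a research hotspot in computer vision and robotics. However, most current research and applications assume a static scene, that is, a scene in which the camera motion is the only motion model, and therefore cannot handle multiple motion models. This paper proposes a multi-motion visual odometry method based on split-merged motion segmentation that obtains the motion states of multiple moving targets in the scene in addition to the camera motion. Method Building on the traditional visual odometry framework, a multi-model fitting method is introduced to segment the multiple motion models in a dynamic scene, and random sample consensus (RANSAC) is used to estimate the motion parameter instances of these models. The camera motion and the motion of each moving target are then converted into a unified coordinate system, which yields the visual odometry result for the camera and the pose of each moving target at each moment. Finally, local-window bundle adjustment directly corrects the camera pose and the computed poses of the camera relative to each moving target; the inliers of the camera motion model and the camera-to-target motion parameters obtained at each moment are used to optimize the trajectories of the multiple motion models. Result The continuous-frame motion segmentation method constructed in this paper achieves good segmentation results and is robust. The segmentation accuracy over continuous frames reaches nearly 100%, which safeguards the accuracy of the subsequent parameter estimation for each motion model. The method effectively estimates not only the camera pose but also the poses of the salient moving targets in the scene; on every path segment, the average position error of both camera self-localization and moving-target localization is below 6%. Conclusion The proposed method simultaneously segments the camera's own motion model and the motion models of differently moving dynamic objects in a dynamic scene, and then simultaneously estimates the absolute motion trajectories of the camera and of each dynamic object, forming a multi-motion visual odometry process.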
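The conversion of camera motion and camera-relative target motion into a unified coordinate system can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: poses are 4x4 homogeneous matrices, `T_a_b` is assumed to map points from frame `b` into frame `a`, the world frame is assumed to coincide with the first camera frame, and the function names `make_pose`, `accumulate_camera_poses`, and `object_pose_in_world` are hypothetical.

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous pose from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def accumulate_camera_poses(step_poses):
    """Chain frame-to-frame camera motions into world-frame camera poses.

    Assumption: each step is T_prev_curr (current camera frame expressed in
    the previous camera frame), and the world frame is the first camera frame.
    """
    T_world_cam = np.eye(4)
    trajectory = [T_world_cam.copy()]
    for T_prev_curr in step_poses:
        T_world_cam = T_world_cam @ T_prev_curr
        trajectory.append(T_world_cam.copy())
    return trajectory

def object_pose_in_world(T_world_cam, T_cam_obj):
    """Express a target pose, known relative to the camera, in the world frame."""
    return T_world_cam @ T_cam_obj
```

With this convention, the target's absolute trajectory is obtained by applying `object_pose_in_world` at every time step with that step's camera pose and camera-relative target pose.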
Keywords
Multi-motion visual odometry based on split-merged motion segmentation

Wang Chenjie, Zhang Yun, Zhao Qing, Wang Wei, Yin Lu, Luo Bin, Zhang Liangpei(State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China)

Abstract
Objective With the continuing development and spread of robotics and autonomous driving, the demand for high-precision localization and navigation in dynamic scenes keeps growing. Visual localization needs only an ordinary camera to achieve localization with considerable accuracy, and the captured video can also be used for 3D scene reconstruction, scene analysis, target recognition, target tracking, and other tasks. Visual odometry (VO) has become a hotspot in autonomous localization research and is widely applied to the localization and navigation of robots and unmanned vehicles. VO estimates the camera pose relative to a static background. However, most current VO research and applications assume a static scene in which most objects are stationary. When multiple moving targets are present in a scene, the estimated camera ego-motion can contain large errors. Therefore, eliminating the interference of moving targets in a scene (even when they occupy most of the field of view), accurately calculating the camera pose, and estimating the motion model of each moving object are practical problems that must be solved for moving-target trajectory estimation and modeling analysis. Method This paper proposes a multi-motion VO method based on split-merged motion segmentation that builds on the general approach to motion model parameter estimation in traditional VO. A multi-model fitting method is applied to the data in the motion estimation process so that multiple motion model parameter instances are fitted. The multiple motion models are then associated over time to complete a continuous-frame motion segmentation, and the absolute pose of each moving target at the current time is obtained. Local bundle adjustment is then applied to directly correct the camera pose and the absolute pose of each moving target, which completes the multi-motion VO process.
The main contents and innovations of this article are summarized as follows: 1) The motion segmentation method based on multi-model fitting is applied to the traditional VO framework, and a multi-motion VO framework based on multi-model fitting motion segmentation is proposed. In a dynamic scene with multiple moving rigid-body targets, the trajectories of the moving objects and the ego-motion of the camera are estimated simultaneously. 2) Multi-model fitting is combined with VO. The preference analysis method of quantized residuals is combined with alternating sampling and clustering strategies to improve the performance of the existing multi-model fitting method in segmenting the camera motion model and the dynamic-object motion models in dynamic scenes. 3) The motion segmentation strategy is optimized to achieve a continuous-frame motion segmentation and to obtain multi-motion model parameters. Furthermore, the absolute poses of the multiple moving targets (including the camera) in the same coordinate system can be obtained to realize a complete multi-motion VO. First, the oriented features from accelerated segment test (FAST) and rotated binary robust independent elementary features (BRIEF) method, known as ORB, was used to extract feature points from the stereo images of the current and previous frames; the left and right images of these frames were then stereo-matched through their feature points to obtain the associated 3D information in the current and previous frames. Second, the preference analysis method of quantized residuals was combined with alternating sampling and clustering strategies to improve the existing multi-model fitting method, and the inliers of the multiple motion models in the current frame were segmented to achieve a single-step motion segmentation.
Third, a continuous-frame motion segmentation was performed based on the multi-motion segmentation results from the previous moment. Fourth, based on the ego-motion estimation results for the camera at each moment and the estimation results for the other moving targets in the scene, the inliers of the multiple target motion models at each moment were obtained as a time series. Fifth, using the inliers of each motion model obtained via motion segmentation, random sample consensus (RANSAC) was used to robustly estimate the motion parameters of each model, and the motion of the camera relative to each moving target was estimated. Sixth, the estimation results were converted into a unified global coordinate system to determine the absolute pose of each moving target at the current time. Finally, local bundle adjustment was used to directly correct the camera pose and the absolute pose of each moving target at each moment. The inliers of the camera motion model and the motion parameters of the multiple motion models at each moment were used to optimize the trajectories of the moving targets. Result Compared with existing methods, the proposed continuous-frame motion segmentation method achieves better segmentation results and higher robustness. Its segmentation accuracy over continuous frames reaches nearly 100%, which supports an accurate estimation of the parameters of each motion model. The proposed multi-motion VO method effectively estimates not only the pose of the camera but also the poses of salient moving targets in the scene. On every path segment, the average position error of both camera self-localization and moving-target localization is below 6%. Conclusion The proposed multi-motion VO method based on split-merged motion segmentation can simultaneously segment the motion model of the camera and the motion models of the moving objects in dynamic scenes.
The absolute motion trajectories of the camera and various moving objects can also be estimated simultaneously to build a multi-motion VO process.
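As a rough illustration of the robust per-model motion estimation step, the sketch below fits a rigid motion to matched 3D points with RANSAC. It is a minimal sketch under assumptions, not the authors' implementation: `P` and `Q` are assumed to be (N, 3) arrays of matched 3D points from the previous and current frames for one motion model, the minimal-sample solver is a standard SVD-based (Kabsch) least-squares fit, and the names `fit_rigid` and `ransac_rigid` and the parameters `iters` and `thresh` are hypothetical.

```python
import numpy as np

def fit_rigid(P, Q):
    """Least-squares rotation R and translation t such that Q ~ R @ P + t."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t

def ransac_rigid(P, Q, iters=200, thresh=0.05, seed=0):
    """RANSAC over 3-point samples; returns R, t, and a boolean inlier mask."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)
        R, t = fit_rigid(P[idx], Q[idx])
        err = np.linalg.norm(Q - (P @ R.T + t), axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("RANSAC found too few inliers to fit a rigid motion")
    # Refit on all inliers of the best hypothesis.
    R, t = fit_rigid(P[best], Q[best])
    return R, t, best
```

In a multi-motion setting, a routine like this would be run once per segmented motion model, using only the points assigned to that model by the motion segmentation.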
Keywords
