Recovery of lost map for monocular simultaneous localization and mapping
2018, Vol. 23, No. 3, pp. 372-383
Received: 2017-09-15; Revised: 2017-11-14; Published in print: 2018-03-16
DOI: 10.11834/jig.170505

Objective
After tracking failure, traditional monocular visual SLAM (simultaneous localization and mapping) requires the camera to return to the place where tracking was lost before it can relocalize and resume mapping, which greatly limits the application scenarios of monocular SLAM. To solve this problem, a map recovery and fusion algorithm based on visual-inertial sensor fusion is proposed.
Method
When tracking fails, the camera pose is provided solely by the inertial sensor. The system is re-initialized, and the map built before the loss is fused into the current map using the camera poses that the inertial sensor supplies for the lost interval. To handle the camera pose error accumulated from inertial measurements while visual tracking is lost, a jumping matching search strategy based on the covisibility relationship between keyframes is proposed to obtain matching map points quickly; the motion between the matched points is then estimated by nonlinear optimization and used to compensate the error, yielding more accurate camera poses, and the duplicated point cloud after fusion is pruned. Finally, relationships between the keyframes and between the map points of the two maps are established and used to jointly optimize camera poses and map point positions in subsequent tracking and mapping.
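The inertial dead reckoning that supplies camera poses while vision is lost can be illustrated with a minimal sketch. This is not the paper's implementation (visual-inertial systems typically use on-manifold IMU preintegration, cf. Forster et al. in the references); it assumes bias-free, gravity-aligned IMU samples and simple Euler integration, and all names are illustrative.

```python
import numpy as np
from scipy.spatial.transform import Rotation

GRAVITY = np.array([0.0, 0.0, -9.81])  # world-frame gravity (assumed known)

def propagate_pose(R, p, v, gyro, accel, dt):
    """One Euler step of IMU dead reckoning (bias-free sketch).

    R: 3x3 world-from-body rotation; p, v: world position and velocity;
    gyro, accel: body-frame angular rate (rad/s) and specific force (m/s^2).
    """
    a_world = R @ accel + GRAVITY              # remove gravity from specific force
    p = p + v * dt + 0.5 * a_world * dt ** 2   # integrate position
    v = v + a_world * dt                       # integrate velocity
    R = R @ Rotation.from_rotvec(gyro * dt).as_matrix()  # integrate attitude
    return R, p, v
```

Chaining this step over all IMU samples between the frame where tracking failed and the re-initialization frame yields the relative pose that bridges the two maps.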
Result
Map accuracy and map completeness experiments are conducted on the EuRoC dataset and other data. For accuracy, the trajectory obtained by the proposed algorithm is compared with the ground truth and with the trajectory obtained without tracking loss. The results show that when SLAM tracking fails, the method effectively solves the problem that the system cannot continue tracking and mapping, with centimeter-level accuracy: the error is only 9 cm in a 30 m² indoor environment and only 7 cm in a 300 m² factory environment. For completeness, when the camera moves violently, the completeness of the recovered map is better than that of the ORB_SLAM relocalization algorithm, and the map obtained by the proposed algorithm contains 30% more keyframes than ORB_SLAM.
Conclusion
After a monocular visual SLAM system loses tracking, the proposed algorithm can still continue tracking and mapping without losing the camera trajectory. Moreover, the camera does not need to return to the scene seen before the loss; observing only part of that scene is enough to recover and fuse all maps. The algorithm guarantees both the accuracy of the recovered map and the completeness of mapping. Compared with traditional relocalization methods, it performs better when tracking fails while only a small map has been built.
Objective
The traditional SLAM system requires the camera to return to the position where tracking was lost to restart localization and mapping, thereby greatly limiting the application scenarios of monocular SLAM. Thus, a new and highly adaptive recovery strategy is required. In a traditional keyframe-based SLAM system, relocalization is achieved by finding the keyframes (a reference camera pose and image) that match the current frame to estimate the camera pose; localization is therefore possible only in previously visited places where keyframes were added to the map. In practical applications, relocalization is not a user-friendly solution because users prefer to walk into previously unseen areas where no associated keyframes exist. Moreover, finding matches between the two frames is difficult when the scene changes considerably, and resuming tracking can be difficult even when the camera returns to a place it has visited. The motion and map information during the lost time cannot be recovered, which leads to an incomplete map. Thus, this study proposes a strategy that combines monocular vision with an inertial measurement unit (IMU) to recover the lost map through map fusion.
Method
The ORB_SLAM system incorporates three threads that run in parallel: tracking, local mapping, and loop closing. Our algorithm builds on ORB_SLAM. When tracking fails under motion blur or occlusion, our system immediately starts a new map; the map created before the loss is saved, and relocalization is achieved by fusing the two maps. The new map is created by re-initialization, which, unlike relocalization, is always achievable. However, the scale of the camera trajectory computed by monocular ORB_SLAM is arbitrary, so the inconsistent scale must be resolved before the two maps can be fused. The absolute metric scale is recovered by combining monocular vision with the IMU, which is not subject to the limitations of vision. The camera position is computed from IMU data until the re-initialization is completed, and this pose information during the lost time serves as a bridge to fuse the two maps once the new map is initialized. The process of map fusion consists of three steps. 1) Coordinate transformation. The coordinate system of the new map changes after re-initialization. The transformation between the new map and the lost map can be computed from the IMU measurements during the lost time and applied to the lost map to move it onto the new map, as sketched below.
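As a minimal sketch of step 1, suppose the IMU integration over the lost interval yields a 4x4 rigid transform T_new_old from the lost map's coordinate frame to the new map's frame; the pose convention (world-from-camera) and the container names below are assumptions, not the paper's data structures.

```python
import numpy as np

def move_lost_map(T_new_old, keyframe_poses, map_points):
    """Re-express the lost map in the coordinate frame of the new map.

    T_new_old: 4x4 SE(3) transform obtained from IMU during the lost time.
    keyframe_poses: list of 4x4 world-from-camera keyframe poses.
    map_points: (N, 3) array of 3-D map point positions.
    """
    moved_poses = [T_new_old @ T for T in keyframe_poses]
    homog = np.hstack([map_points, np.ones((map_points.shape[0], 1))])
    moved_points = (T_new_old @ homog.T).T[:, :3]
    return moved_poses, moved_points
```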
2) Matching strategy. The IMU measurements unavoidably contain errors that affect the result of the map fusion, so visual information is used to eliminate them. A jumping matching search strategy based on the covisibility relationship between keyframes is proposed to reduce the matching time: instead of matching the keyframes of the two maps one by one, only keyframes selected according to a covisibility criterion are matched, which saves a considerable amount of matching time; one plausible form of this search is sketched below.
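The abstract does not fully specify the jumping search, so the sketch below shows one plausible reading: once a keyframe of the old map is successfully matched against the new map, its strongly covisible neighbours are skipped, because they largely observe the same map points. `covisible` and `try_match` are hypothetical callbacks, not ORB_SLAM APIs.

```python
def jumping_match_search(old_keyframes, covisible, try_match):
    """Jumping matching search over the covisibility graph (sketch).

    old_keyframes: keyframes of the lost map, in traversal order.
    covisible(kf): neighbours of kf that share many observed map points.
    try_match(kf): matched (old_point, new_point) pairs, or [] on failure.
    """
    matches, skipped = [], set()
    for kf in old_keyframes:
        if kf in skipped:
            continue                        # jump over redundant keyframes
        pairs = try_match(kf)
        if pairs:
            matches.extend(pairs)
            skipped.update(covisible(kf))   # neighbours see the same points
    return matches
```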
3) Error estimation between the two maps. After a set of matching map points is obtained, the motion between the matched points is estimated by nonlinear optimization and applied to the two maps to reduce the error. Finally, the new map points among the matches are erased as duplicates, and the two maps are merged. The relationships between the keyframes and between the map points of the old and new maps are established and used to jointly optimize the camera poses and map point positions in subsequent tracking and mapping.
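Step 3's motion estimate can be illustrated by a small nonlinear least-squares alignment of the matched point sets; the 6-DoF rotation-vector parameterization and the scipy solver below are our assumptions, since the abstract does not give the exact cost function.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def estimate_correction(pts_old, pts_new):
    """SE(3) correction aligning matched map points (sketch).

    pts_old, pts_new: (N, 3) matched 3-D points from the (IMU-moved)
    lost map and from the new map, respectively.
    """
    def residual(x):
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        return ((pts_old @ R.T + x[3:]) - pts_new).ravel()

    sol = least_squares(residual, np.zeros(6))   # nonlinear least squares
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]
```

Applying the estimated correction to the lost map compensates the drift of the IMU-derived transform before the duplicate points are pruned.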
Result
The algorithm is tested on an open dataset (visual-inertial datasets collected onboard a micro aerial vehicle). The results show that our method effectively solves the system's inability to keep the map and location when SLAM tracking fails. The experiments cover map accuracy and map completeness. Regarding accuracy, the trajectory of the fused map achieves a typical precision of 8 cm in a 30 m² room environment and in a 300 m² industrial environment compared with the ground truth; compared with the trajectory obtained without loss, the precision reaches 4 cm. When the camera moves violently, the map recovered by our algorithm is more complete than that recovered by the ORB_SLAM relocalization algorithm, containing 30% more keyframes.
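The accuracy figures correspond to an absolute-trajectory-style comparison against ground truth. A minimal sketch of such a metric, assuming the two trajectories are already time-associated and expressed in a common frame (e.g. after a similarity alignment), is:

```python
import numpy as np

def trajectory_rmse(estimated, ground_truth):
    """Root-mean-square position error of time-associated trajectories.

    estimated, ground_truth: (N, 3) camera positions in a common frame.
    """
    err = estimated - ground_truth
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))
```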
Conclusion
The proposed algorithm can resume tracking by merging maps when tracking is lost and can recover the trajectory during the lost time. Compared with the traditional SLAM system, our relocalization strategy does not require the camera to return to the previously visited place; it only requires the camera to observe part of the scene seen before the loss. Our algorithm is also more adaptive than the ORB_SLAM algorithm when tracking fails soon after initialization.
References
Dong Z L, Hao Y M, Zhu F. A vision-based localization system for mobile robots[J]. Journal of Image and Graphics, 2000, 5(8): 688-692. [DOI:10.11834/jig.20000813]
Davison A J, Reid I D, Molton N D, et al. MonoSLAM: real-time single camera SLAM[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1052-1067. [DOI:10.1109/TPAMI.2007.1049]
Mourikis A I, Roumeliotis S I. A multi-state constraint Kalman filter for vision-aided inertial navigation[C]//Proceedings of 2007 IEEE International Conference on Robotics and Automation. Roma, Italy: IEEE, 2007: 3565-3572. [DOI:10.1109/ROBOT.2007.364024]
Bloesch M, Omari S, Hutter M, et al. Robust visual inertial odometry using a direct EKF-based approach[C]//Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany: IEEE, 2015: 298-304. [DOI:10.1109/IROS.2015.7353389]
Leutenegger S, Lynen S, Bosse M, et al. Keyframe-based visual-inertial odometry using nonlinear optimization[J]. The International Journal of Robotics Research, 2015, 34(3): 314-334. [DOI:10.1177/0278364914554813]
Williams B, Cummins M, Neira J, et al. A comparison of loop closing techniques in monocular SLAM[J]. Robotics and Autonomous Systems, 2009, 57(12): 1188-1197. [DOI:10.1016/j.robot.2009.06.010]
Eade E D, Drummond T W. Unified loop closing and recovery for real time monocular SLAM[C]//Proceedings of the British Machine Vision Conference. Leeds, UK: British Machine Vision Association, 2008: 136. [DOI:10.5244/C.22.6]
Sivic J, Zisserman A. Video Google: a text retrieval approach to object matching in videos[C]//Proceedings of the 9th IEEE International Conference on Computer Vision. Nice, France: IEEE, 2003: 1470-1477. [DOI:10.1109/ICCV.2003.1238663]
Cummins M, Newman P. FAB-MAP: probabilistic localization and mapping in the space of appearance[J]. The International Journal of Robotics Research, 2008, 27(6): 647-665. [DOI:10.1177/0278364908090961]
Klein G, Murray D. Parallel tracking and mapping for small AR workspaces[C]//Proceedings of the 6th IEEE and ACM International Symposium on Mixed and Augmented Reality. Nara, Japan: IEEE, 2007: 225-234. [DOI:10.1109/ISMAR.2007.4538852]
Mur-Artal R, Montiel J M M, Tardós J D. ORB-SLAM: a versatile and accurate monocular SLAM system[J]. IEEE Transactions on Robotics, 2015, 31(5): 1147-1163. [DOI:10.1109/TRO.2015.2463671]
Straub J, Hilsenbeck S, Schroth G, et al. Fast relocalization for visual odometry using binary features[C]//Proceedings of 2013 IEEE International Conference on Image Processing. Melbourne, VIC, Australia: IEEE, 2013: 2548-2552. [DOI:10.1109/ICIP.2013.6738525]
Moteki A, Yamaguchi N, Karasudani A, et al. Fast and accurate relocalization for keyframe-based SLAM using geometric model selection[C]//Proceedings of 2016 IEEE Virtual Reality. Greenville, SC, USA: IEEE, 2016: 235-236. [DOI:10.1109/VR.2016.7504740]
Mur-Artal R, Tardós J D. Visual-inertial monocular SLAM with map reuse[J]. IEEE Robotics and Automation Letters, 2017, 2(2): 796-803. [DOI:10.1109/LRA.2017.2653359]
Forster C, Carlone L, Dellaert F, et al. IMU preintegration on manifold for efficient visual-inertial maximum-a-posteriori estimation[C]//Proceedings of Robotics: Science and Systems. Rome, Italy: MIT Press, 2015. [DOI:10.15607/RSS.2015.XI.006]
Gálvez-López D, Tardós J D. Bags of binary words for fast place recognition in image sequences[J]. IEEE Transactions on Robotics, 2012, 28(5): 1188-1197. [DOI:10.1109/TRO.2012.2197158]
Burri M, Nikolic J, Gohl P, et al. The EuRoC micro aerial vehicle datasets[J]. The International Journal of Robotics Research, 2016, 35(10): 1157-1163. [DOI:10.1177/0278364915620033]
Furgale P, Rehder J, Siegwart R. Unified temporal and spatial calibration for multi-sensor systems[C]//Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE, 2013: 1280-1286. [DOI:10.1109/IROS.2013.6696514]
Sturm J, Engelhard N, Endres F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]//Proceedings of 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Vilamoura, Portugal: IEEE, 2012: 573-580. [DOI:10.1109/IROS.2012.6385773]