Visual simultaneous localization and mapping (SLAM) [1,2] is a fundamental building block of emerging technologies such as autonomous driving, unmanned aerial vehicles, virtual reality, and augmented reality. A visual SLAM system typically consists of five parts: the camera sensor, the front end, the back end, loop closure detection, and mapping. The camera sensor captures the image sequence used for mapping, whose frames are adjacent in time. The front end computes inter-frame poses and the depths of landmark points from this image sequence. Because these estimates are inaccurate, the back end optimizes the front end's output to obtain more accurate pose and scene-structure information. The mapping module then builds a point-cloud map from the optimized poses and structure. Owing to memory and computational constraints, visual SLAM systems perform these tasks in a sliding-window fashion: pose computation and optimization are restricted to a window of a few adjacent frames. Every computation and optimization step introduces an error that, however small, cannot be ignored. Over time these errors accumulate, making subsequent pose and structure estimates increasingly inaccurate and preventing the construction of a globally consistent map.
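The drift described above can be illustrated with a toy 2-D odometry chain (a sketch for illustration only; the motion model, step length, and per-frame bias are made-up values, not taken from any SLAM system):

```python
import math

# Compose noisy frame-to-frame 2-D poses (x, y, theta) and watch the
# endpoint drift away from ground truth, motivating loop closure.
def compose(pose, delta):
    """Apply a relative motion `delta` expressed in the frame of `pose`."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

true_pose = (0.0, 0.0, 0.0)
est_pose = (0.0, 0.0, 0.0)
for _ in range(1000):  # 1000 frames moving straight ahead, 0.1 m per frame
    true_pose = compose(true_pose, (0.1, 0.0, 0.0))
    est_pose = compose(est_pose, (0.1, 0.0, 0.0005))  # tiny per-frame bias

drift = math.hypot(est_pose[0] - true_pose[0], est_pose[1] - true_pose[1])
print(f"drift after 1000 frames: {drift:.2f} m")  # small errors, large drift
```

A per-frame rotational error of only 0.0005 rad bends the estimated trajectory into an arc, leaving the endpoint tens of meters from the true position; this is exactly the accumulation that loop closure detection is meant to correct.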
In mainstream visual SLAM frameworks, the loop closure detection module is mostly based on image matching: the current frame is compared with historical frames, their similarity is computed, and a loop closure is declared if the similarity exceeds a preset threshold. Loop closure detection thus reduces to feature extraction and matching between images, whose core lies in keypoint extraction, descriptor design and selection, and similarity computation. Angeli et al. [3] applied the bag-of-words (BoW) model from text retrieval to loop closure detection, describing images with BoW and matching them accordingly. Cummins et al. [4,5] proposed the fast appearance-based mapping (FAB-MAP) algorithm and applied it to large-scale scenes, achieving excellent loop detection performance. Building on FAB-MAP, Liang et al. [6] replaced the offline-trained dictionary with one built online, further improving real-time performance. Ref. [7] proposed an incremental BoW model that refines the loop detection model with perspective-invariant binary descriptors, obtaining good loop detection results. Oliva et al. [8] proposed the Gist descriptor, which uses Gabor filters to extract image information in both the frequency and orientation dimensions, making Gist a global image descriptor. Based on the Gist descriptor, Ref. [9] proposed a global image feature for describing scene information and used it to improve the BoW model.
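The threshold-based matching step described above can be sketched as follows. This is an illustrative toy, assuming cosine similarity over BoW word-count histograms; the vocabulary size, histograms, and threshold are invented, and real systems such as FAB-MAP use probabilistic scoring instead:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two BoW histograms (word-count vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def detect_loop(current_hist, history, threshold=0.8):
    """Return indices of past frames similar enough to declare a loop."""
    return [i for i, h in enumerate(history)
            if cosine_similarity(current_hist, h) >= threshold]

# Toy 4-word vocabulary; each histogram counts visual-word occurrences.
history = [[5, 0, 2, 1], [0, 4, 0, 3], [4, 1, 2, 1]]
current = [5, 1, 2, 1]
print(detect_loop(current, history))  # → [0, 2]
```

Frames 0 and 2 share nearly the same word distribution as the current frame and exceed the threshold, while frame 1 does not; in a full system, a declared loop would then be geometrically verified before correcting the map.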
The BoW models used for feature extraction in visual SLAM systems are mostly built on the ORB (oriented FAST and rotated BRIEF) feature [15]. ORB extraction takes FAST corners in the image as keypoints, so it cannot extract a sufficient number of features in regions of the scene with little variation. This paper proposes a uniform-ORB feature extraction method that overcomes the classic ORB feature's insufficient description of image content. Furthermore, a loop closure detection module is introduced into the direct sparse odometry (DSO) framework, which significantly improves the mapping performance of DSO.
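The general idea behind uniformly distributed features can be sketched with grid-based keypoint selection (a common strategy, e.g. in ORB-SLAM's octree distribution; the function below is a hypothetical illustration, not the paper's exact method): the image is divided into cells and a few strongest keypoints are kept per cell, so weakly textured regions still contribute features instead of all keypoints clustering in high-texture areas.

```python
def select_uniform(keypoints, img_w, img_h, grid=4, per_cell=2):
    """keypoints: list of (x, y, response). Keep top `per_cell` per grid cell."""
    cells = {}
    for x, y, resp in keypoints:
        cx = min(int(x * grid / img_w), grid - 1)   # cell column index
        cy = min(int(y * grid / img_h), grid - 1)   # cell row index
        cells.setdefault((cx, cy), []).append((x, y, resp))
    selected = []
    for pts in cells.values():
        pts.sort(key=lambda p: p[2], reverse=True)  # strongest response first
        selected.extend(pts[:per_cell])
    return selected

# Toy example on a 100x100 image: nine strong corners crowded into one
# corner plus a single weak corner elsewhere. Uniform selection caps the
# crowded cell at two keypoints but still keeps the lone weak corner.
kps = [(i, j, 50.0) for i in (2, 5, 8) for j in (2, 5, 8)] + [(80, 80, 3.0)]
print(len(select_uniform(kps, 100, 100)))  # → 3
```

A plain top-N-by-response selection would discard the weak corner entirely; the per-cell cap is what yields spatial uniformity.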
[1] Durrant-Whyte H, Bailey T. Simultaneous localization and mapping: Part I[J]. IEEE Robotics & Automation Magazine, 2006, 13(2): 99-108.
[2] Bailey T, Durrant-Whyte H. Simultaneous localization and mapping (SLAM): Part II[J]. IEEE Robotics & Automation Magazine, 2006, 13(3): 108-117.
[3] Angeli A, Filliat D, Doncieux S, et al. Fast and incremental method for loop-closure detection using bags of visual words[J]. IEEE Transactions on Robotics, 2008, 24(5): 1027-1037.
[4] Cummins M, Newman P. FAB-MAP: Probabilistic localization and mapping in the space of appearance[J]. International Journal of Robotics Research, 2008, 27(6): 647-665.
[5] Cummins M, Newman P. Appearance-only SLAM at large scale with FAB-MAP 2.0[J]. International Journal of Robotics Research, 2011, 30(9): 1100-1123.
[6] Liang Zhi-Wei, Chen Yan-Yan, Zhu Song-Hao, et al. Loop closure detection algorithm based on monocular vision using visual dictionary[J]. Pattern Recognition and Artificial Intelligence, 2013, 26(6): 561-570.
[7] Zhang H, Liu Y L, Tan J D. Loop closing detection in RGB-D SLAM combining appearance and geometric constraints[J]. Sensors, 2015, 15(6): 14639-14660.
[8] Oliva A, Torralba A. Modeling the shape of the scene: A holistic representation of the spatial envelope[J]. International Journal of Computer Vision, 2001, 42(3): 145-175.
[9] Yang L, Hong Z. Visual loop closure detection with a compact image descriptor[C]//2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal, 2012: No. 13195481.
[10] Yang Z, Pan Y, Deng L, et al. PLSAV: Parallel loop searching and verifying for loop closure detection[J]. IET Intelligent Transport Systems, 2021, 15(5): 683-698.
[11] Bai D D, Wang C Q, Zhang B, et al. CNN feature boosted SeqSLAM for real-time loop closure detection[J]. Chinese Journal of Electronics, 2018, 27(3): 488-499.
[12] Wang Z, Peng Z, Guan Y, et al. Two-stage vSLAM loop closure detection based on sequence node matching and semi-semantic autoencoder[J]. Journal of Intelligent & Robotic Systems, 2021, 101(2): 1-21.
[13] Mukherjee A, Chakraborty S, Saha S K. Detection of loop closure in SLAM: A DeconvNet based approach[J]. Applied Soft Computing Journal, 2019, 80: 650-656.
[14] Liu Qiang, Duan Fu-hai. Loop closure detection using CNN words[J]. Intelligent Service Robotics, 2019, 12(4): 303-318.
[15] Rublee E, Rabaud V, Konolige K, et al. ORB: An efficient alternative to SIFT or SURF[C]//Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain, 2011: 2564-2571.
[16] Rosten E, Drummond T. Machine learning for high-speed corner detection[C]//Computer Vision-ECCV 2006: Part I, Berlin, Germany, 2006: 430-443.
[17] 于录录. 视觉SLAM中回环检测算法的研究[D]. 长春: 吉林大学通信工程学院, 2021.
Yu Lu-lu. Research on loop detection in visual SLAM[D]. Changchun: College of Communication Engineering, Jilin University, 2021.
[18] Quigley M, Conley K, Gerkey B, et al. ROS: An open-source robot operating system[J/OL]. [2021-03-22].
[19] Engel J, Koltun V, Cremers D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611-625.
[20] Mur-Artal R, Montiel J M M, Tardos J D. ORB-SLAM: A versatile and accurate monocular SLAM system[J]. IEEE Transactions on Robotics, 2015, 31(5): 1147-1163.
[21] Arun K S, Huang T S, Blostein S D. Least-squares fitting of two 3-D point sets[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, 9(5): 698-700.
[22] Horn B K P. Closed-form solution of absolute orientation using unit quaternions[J]. Journal of the Optical Society of America A, 1987, 4(4): 629-642.
[23] Engel J, Usenko V, Cremers D. A photometrically calibrated benchmark for monocular visual odometry[J/OL]. [2021-03-25].