Journal of Guangdong University of Technology ›› 2023, Vol. 40, Issue (01): 56-60, 76. DOI: 10.12052/gdutxb.210028


Visual Inertial Odometry Based on Deep Features

Xu Wei-feng, Cai Shu-ting, Xiong Xiao-ming

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2021-02-18  Online: 2023-01-25  Published: 2023-01-12
  • Corresponding author: Cai Shu-ting (b. 1979), male, professor, Ph.D.; his main research interests include FPGA acceleration and deep learning. E-mail: shutingcai@126.com
  • About the first author: Xu Wei-feng (b. 1996), male, master's student; his main research interests include computer vision and deep learning.
  • Funding: Applied Science and Technology Research and Development Special Project of Guangdong Province (2017B090909004)


Abstract: Visual odometry is a cornerstone of SLAM (Simultaneous Localization and Mapping). Monocular visual odometry plays an important role because of its low cost and the limited camera calibration it requires, but it suffers from scale ambiguity, scale drift, and poor robustness. To address these problems, we propose a monocular visual-inertial odometry system based on deep features, built on ORB-SLAM3 and referred to as DF-VIO (Visual Inertial Odometry Based on Deep Features). It replaces traditional hand-crafted point features with deep features extracted by a deep learning network and additionally fuses hand-crafted line features, which strengthens the robustness of the system in complex real-world scenes. The system also provides several pose-tracking strategies: in addition to tracking with a constant-velocity model and tracking against a reference keyframe, it offers a pose-tracking method based on a repeatability map predicted by the deep network, which further improves pose-tracking accuracy. Comparative experiments on the public EuRoC dataset show that the average trajectory error is reduced by 25.9% in purely visual mode and by 8.6% in visual-inertial mode, demonstrating that the proposed system is more robust in complex scenes.
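To make the front-end described above concrete, the sketch below shows how SuperPoint-style deep keypoints, descriptors and a predicted repeatability map could replace hand-crafted point features in a single monocular tracking step. This is an illustrative sketch only, not the DF-VIO implementation: DeepFeatureNet is a hypothetical stand-in for the feature network, the 0.8 ratio-test and 0.5 repeatability thresholds are assumed values, and the IMU preintegration and line features used by the actual system are omitted.

    import numpy as np
    import cv2


    class DeepFeatureNet:
        """Hypothetical feature network: detect() returns keypoint coordinates (N x 2),
        descriptors (N x D) and a dense repeatability map (H x W) for a grayscale image."""
        def detect(self, gray):
            raise NotImplementedError  # a SuperPoint-like model would be wrapped here


    def track_frame(net, prev_gray, curr_gray, K):
        """Estimate the up-to-scale relative pose between two frames from deep features."""
        kp0, desc0, _ = net.detect(prev_gray)
        kp1, desc1, rep1 = net.detect(curr_gray)

        # Nearest-neighbour descriptor matching with Lowe's ratio test.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        matches = matcher.knnMatch(desc0.astype(np.float32), desc1.astype(np.float32), k=2)
        good = [m for m, n in matches if m.distance < 0.8 * n.distance]

        pts0 = np.float32([kp0[m.queryIdx] for m in good])
        pts1 = np.float32([kp1[m.trainIdx] for m in good])

        # Keep only matches whose current-frame keypoints lie on high-repeatability
        # pixels, mimicking the repeatability-guided tracking described in the abstract.
        score = np.array([rep1[int(y), int(x)] for x, y in pts1])
        pts0, pts1 = pts0[score > 0.5], pts1[score > 0.5]

        # Up-to-scale relative pose from the essential matrix (monocular case);
        # in DF-VIO the metric scale would come from the fused IMU measurements.
        E, inliers = cv2.findEssentialMat(pts0, pts1, K, method=cv2.RANSAC,
                                          prob=0.999, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K, mask=inliers)
        return R, t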

Key words: visual inertial odometry, deep learning, inertial measurement unit, line features

CLC number: TP301
[1] CADENA C, CARLONE L, CARRILLO H, et al. Past, present, and future of simultaneous localization and mapping: toward the robust-perception age [J]. IEEE Transactions on Robotics, 2016, 32(6): 1309-1332.
[2] PARK S, SCHÖPS T, POLLEFEYS M. Illumination change robustness in direct visual SLAM[C]//2017 IEEE International Conference on Robotics and Automation (ICRA). Singapore: IEEE, 2017: 4523-4530.
[3] CHI P K, SU C Y. A research on monocular visual odometry for mobile robots [J]. Journal of Guangdong University of Technology, 2017, 34(5): 40-44. (in Chinese)
[4] RU S N, HE Y L, YE X Y. Visual odometry based on sparse direct method loop-closure detection [J]. Journal of Guangdong University of Technology, 2021, 38(3): 48-54. (in Chinese)
[5] STRASDAT H, MONTIEL J, DAVISON A J. Scale drift-aware large scale monocular SLAM[C]// Robotics: Science and Systems VI. Zaragoza: MIT Press, 2010.
[6] MUR-ARTAL R, TARDÓS J D. Visual-inertial monocular SLAM with map reuse [J]. IEEE Robotics and Automation Letters, 2017, 2(2): 796-803.
[7] FORSTER C, CARLONE L, DELLAERT F, et al. On-manifold preintegration for real-time visual-inertial odometry [J]. IEEE Transactions on Robotics, 2017, 33(1): 1-21.
[8] QIN T, SHEN S. Online temporal calibration for monocular visual-inertial systems[C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid: IEEE, 2018: 3662-3669.
[9] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. arXiv: 2004.10934 (2020-04-23). https://arxiv.org/abs/2004.10934.
[10] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3431-3440.
[11] EIGEN D, FERGUS R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture[C]//Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 2650-2658.
[12] TORRES-CAMARA J M, ESCALONA F, GOMEZ-DONOSO F, et al. Map slammer: densifying scattered KSLAM 3D maps with estimated depth[C]//Iberian Robotics Conference. Porto: Springer, Cham, 2019: 563-574.
[13] YANG S, SCHERER S. CubeSLAM: monocular 3-D object SLAM [J]. IEEE Transactions on Robotics, 2019, 35(4): 925-938.
[14] GRINVALD M, FURRER F, NOVKOVIC T, et al. Volumetric instance-aware semantic mapping and 3D object discovery [J]. IEEE Robotics and Automation Letters, 2019, 4(3): 3037-3044.
[15] DETONE D, MALISIEWICZ T, RABINOVICH A. SuperPoint: self-supervised interest point detection and description[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City: IEEE, 2018: 224-236.
[16] GROMPONE VON GIOI R, JAKUBOWICZ J, MOREL J M, et al. LSD: a fast line segment detector with a false detection control [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(4): 722-732.
[17] HUANG H, YE H, SUN Y, et al. Monocular visual odometry using learned repeatability and description[C]//2020 IEEE International Conference on Robotics and Automation (ICRA) . Paris: IEEE, 2020: 8913-8919.
[18] CAMPOS C, ELVIRA R, RODRÍGUEZ J J G, et al. ORB-SLAM3: an accurate open-source library for visual, visual-inertial and multi-map SLAM[EB/OL]. arXiv: 2007.11898 (2020-07-23). https://arxiv.org/abs/2007.11898.
[19] BURRI M, NIKOLIC J, GOHL P, et al. The EuRoC micro aerial vehicle datasets [J]. The International Journal of Robotics Research, 2016, 35(10): 1157-1163.