Journal of Guangdong University of Technology, 2018, Vol. 35, Issue (05): 31-37. doi: 10.12052/gdutxb.180068

• Comprehensive Studies •

Loop Closure Detection for Visual SLAM Using Convolutional Neural Networks

Yang Meng-jun, Su Cheng-yue, Chen Jing, Zhang Jie-xin

  1. School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2018-03-20 Online: 2018-07-10 Published: 2018-07-10
  • Corresponding author: Su Cheng-yue (1961-), male, professor, master's supervisor; main research interests: applied physics and machine vision. E-mail: scy.gdut@163.com
  • About the author: Yang Meng-jun (1988-), male, master's student; main research interests: robot control technology and machine vision.
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (61305069); Guangdong Province Information Industry Development Special Fund, Modern Information Services Project (2150510)

Abstract: Loop closure detection is an important part of visual SLAM: successfully detecting a loop closure reduces the accumulated odometry drift produced by the localization algorithm. Given the strong performance of deep convolutional neural networks on classification problems, this paper applies the VGG16-Places365 convolutional neural network model, originally used for image classification, to loop closure detection in visual SLAM for the first time. The registered image data are fed into the trained network, and the output of each hidden layer serves as an image feature representation. Experiments then compare the layers, and an intermediate layer with higher matching accuracy is selected for scene feature extraction; loop closure regions are obtained by computing the similarity between scene features. Finally, the method is tested on loop closure detection data. The results show that, compared with traditional loop closure detection methods, the VGG16-Places365 model achieves about 3% higher precision at the same recall; feature extraction is about 5 to 10 times faster on a CPU, and on a GPU it is nearly 100 times faster than loop closure detection based on traditional hand-crafted features.

Key words: visual simultaneous localization and mapping (vSLAM), loop closure detection, convolutional neural network, feature extraction, deep learning, similarity
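
The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the function names, the `threshold` value, the `min_gap` parameter, and the toy 3-D descriptors (standing in for intermediate-layer CNN activations) are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero descriptor is treated as matching nothing
    return dot / (norm_a * norm_b)


def detect_loop_closures(features, threshold=0.9, min_gap=2):
    """Return (i, j, score) for frame pairs whose descriptors are similar.

    features:  list of per-frame descriptors (hypothetical stand-ins for
               intermediate-layer CNN activations)
    threshold: assumed similarity cutoff, not a value from the paper
    min_gap:   frames closer than this in time are skipped, since
               neighbouring frames look alike without being a revisit
    """
    loops = []
    for i in range(len(features)):
        for j in range(i + min_gap, len(features)):
            score = cosine_similarity(features[i], features[j])
            if score >= threshold:
                loops.append((i, j, score))
    return loops


# Toy descriptors: frame 3 revisits the place seen in frame 0.
frames = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.9, 0.1, 0.0],
]
loops = detect_loop_closures(frames)
print([(i, j) for i, j, _ in loops])
```

Raising `threshold` trades recall for precision, which is the trade-off behind the abstract's comparison at equal recall. The exhaustive pairwise loop is quadratic in the number of frames and is only meant to show the idea.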

CLC number: TP242